Re: [l10n-dev] gettext again

Eike Rathke Wed, 07 Sep 2005 10:03:20 -0700

Hi Danilo,

On Mon, Sep 05, 2005 at 18:47:06 +0200, Danilo Šegan wrote:


> >> The big question is: how much effort would be needed to port OOo to
> >> use gettext?
> >
> > An even bigger question is: would gettext be able to handle it? How does
> > it scale? Would it be worth the effort?
> 
> It's used for Gnome (30+k translatable messages for the core, over
> 80k with a bunch of apps like Gimp, Gnumeric,...), KDE (I think it's
> around the same for the core, grows again to over 80k with apps),
> XFCE and many other applications and environments.

Gnome+apps or KDE+apps are all separate executables, each with its own
message catalogue, not one single executable like OOo.
I don't know the internals of the current gettext implementation, but AFAIR
the message catalogues are per executable. I may be totally wrong on
this though, and modularizing to some per-library catalogue may be possible.

> FWIW, Gnumeric (5-6k messages) is a hell of a lot faster and
> responsive (personal feeling, not a real measure) than OO Calc on my
> celeron2.3ghz, 636mb ram system, so I can be positive that gettext is
> not going to be a bottleneck.

Not for this version ;-)  but it may add to the overall sluggishness.

> I can also describe some of the implementation details, like MO files
> being mmap()-able, containing alphabetically sorted strings (allowing
> O(log N) search),

This indeed sounds good, and the O(1) hash map implementation detail you
mentioned for GNU systems sounds even better.
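
To make the O(log N) claim concrete, here is a minimal sketch of a binary
search over the sorted original-string table of a little-endian MO file.
It is an illustration only, not the libintl implementation: build_mo()
and lookup() are hypothetical helper names, the catalogue is built in
memory rather than produced by msgfmt, and the optional hash table is
left out.

```python
import struct

MO_MAGIC = 0x950412DE  # magic number of a little-endian GNU MO file


def build_mo(messages):
    """Serialize {msgid: msgstr} into a minimal MO image without a hash table."""
    # MO files keep the original strings sorted, which enables binary search.
    ids = sorted(messages, key=lambda s: s.encode("utf-8"))
    n = len(ids)
    orig_off = 28                  # the header is seven 32-bit words
    trans_off = orig_off + 8 * n   # each table entry is (length, offset)
    data_off = trans_off + 8 * n
    blob = b""
    tables = b""
    for text in ids + [messages[i] for i in ids]:
        raw = text.encode("utf-8")
        tables += struct.pack("<2I", len(raw), data_off + len(blob))
        blob += raw + b"\0"        # strings are NUL-terminated
    header = struct.pack("<7I", MO_MAGIC, 0, n, orig_off, trans_off, 0, data_off)
    return header + tables + blob


def lookup(mo, msgid):
    """Binary-search the sorted msgid table; fall back to msgid if absent."""
    magic, _rev, n, orig_off, trans_off = struct.unpack_from("<5I", mo, 0)
    if magic != MO_MAGIC:
        raise ValueError("not a little-endian MO image")
    key = msgid.encode("utf-8")
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        length, off = struct.unpack_from("<2I", mo, orig_off + 8 * mid)
        candidate = bytes(mo[off:off + length])
        if candidate == key:
            tlen, toff = struct.unpack_from("<2I", mo, trans_off + 8 * mid)
            return bytes(mo[toff:toff + tlen]).decode("utf-8")
        if candidate < key:
            lo = mid + 1
        else:
            hi = mid
    return msgid


mo = build_mo({"Open": "Öffnen", "Save": "Speichern", "Close": "Schließen"})
print(lookup(mo, "Save"))
```

Because lookup() only reads fixed-size table entries and slices of the
buffer, the same approach should also work on an mmap()ed file instead
of a bytes object.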


> Yes, in my opinion, it would be worth the effort.

Be careful with this statement. Changing to gettext or anything else
would involve a HUGE amount of work. Would we gain more than it would
cost? Or would adapting our current resource system to something easier
to handle, not requiring source tree builds and resulting in neater
language pack creation, be more effective? I wouldn't dare to follow
your statement without thorough investigation. And one question would
still remain unanswered: who would do the change?


> Yes, original strings are in the source code.  But if you wish, you
> can treat them still as simply "keys" in cases where you don't want to
> change a string in the source code (such as string freeze periods):
> you can provide "en_US" translation or something like that to
> introduce typo fixes, etc.

Ok, at least it seems to be possible to change a string without touching
the source code.

> Also, I don't really see the value of this argument, since with
> current OOo system, AFAIU, both original and translated strings end up
> in the "source code".

No, they are in independent resource source files, which you see as *.src
and their corresponding localize.sdf files per directory. The C++ code
does not contain any UI strings. Just that currently you have to build
the binary resource files from those *.src plus localize.sdf, and for
that to work you need a source tree, but in case you have an already
compiled output available, no C++ source has to be compiled. IMHO
this build step is the real barrier to simple localizations that only
need translation, and if we could eliminate it we would already gain
a lot.


> PO files and gettext performance allow us to set up statistics systems
> updated every couple of hours DIRECTLY FROM CVS/SVN such as:
> 
>   http://l10n-status.gnome.org/
>   http://i18n.kde.org/stats/gui/stable/index.php
> 
> Why don't we get those for OOo?

I'm sure one could write some statistics generating code also for the
OOo resource system. And yes, it would be more difficult than doing it
for PO/POT.
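
For comparison, this is roughly how little code the PO side needs. The
sketch counts translated, fuzzy and untranslated entries in PO text. It
is deliberately simplified and makes assumptions: entries are separated
by blank lines, plural forms and obsolete entries are ignored, and
po_stats() and the sample catalogue are invented for the example.

```python
def _unquote(text):
    """Strip the surrounding double quotes of a PO string literal."""
    return text.strip()[1:-1]


def po_stats(po_text):
    """Count translated, fuzzy and untranslated entries in PO text."""
    stats = {"translated": 0, "fuzzy": 0, "untranslated": 0}
    for block in po_text.strip().split("\n\n"):
        msgid, msgstr, fuzzy, field = "", "", False, None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith("#,") and "fuzzy" in line:
                fuzzy = True
            elif line.startswith("msgid "):
                field, msgid = "id", _unquote(line[6:])
            elif line.startswith("msgstr "):
                field, msgstr = "str", _unquote(line[7:])
            elif line.startswith('"'):
                # Continuation line of a multi-line string.
                if field == "id":
                    msgid += _unquote(line)
                elif field == "str":
                    msgstr += _unquote(line)
        if not msgid:
            continue  # header entry or comment-only block
        if not msgstr:
            stats["untranslated"] += 1
        elif fuzzy:
            stats["fuzzy"] += 1
        else:
            stats["translated"] += 1
    return stats


SAMPLE_PO = r'''
msgid ""
msgstr ""
"Content-Type: text/plain; charset=UTF-8\n"

msgid "Open"
msgstr "Öffnen"

#, fuzzy
msgid "Save as"
msgstr "Speichern"

msgid "Close"
msgstr ""
'''

print(po_stats(SAMPLE_PO))
```

A comparable report for the OOo resource system would first have to
pair each *.src entry with its localize.sdf lines, which is where the
extra difficulty lies.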


> > In the past we already had quite some discussion about gettext and came
> > to no conclusion whether it was feasible to switch to gettext, did
> > anything change in that system?
> 
> Depending on what you consider feasible.

Doable in a timely manner with a predictable outcome that clearly is
superior to the current system.

> I understand that there is
> some value to having all strings in resource files, but it inevitably
> leads to many problems for both programmers and translators (did it
> ever happen for a programmer to display a wrong string? or translator
> to match a wrong translation with it since original has been updated?)

Sure it did, and yes, it's a PITA.

> Of course, to be serious, I am definitely biased toward the format (I
> have my own implementations of MO file parsers for PHP, C#, even Perl
> for intltool), but only because I find it so natural as both a
> programmer and translator!

Of course. But with the oo2po tool you can still have much of this
without having to dig up the entire resource mechanism. As I said, the
real disadvantage OOo currently has in this regard, as I see it, is the
dependency on the build step. If we could eliminate that we would be
more than halfway there.


> > People tend to only see the mere translation phase and simplicity of
> > language packs regarding text only, and in these of course using gettext
> > is much easier. For handling localized icons and such you'd still need
> > another system, or did that change? 
> 
> No, gettext is a text handling locale system.  There is nothing in it
> to support other kind of data, but the problem it solves, it solves so
> well, that I don't see this as a counter argument.

It wasn't meant as such, just a question. What we'd need here is
a marriage of resource systems.

> I mean, it doesn't either handle localised sounds or videos, but does
> OOo currently handle that?

No.

> Or localised document templates *without* another system?

I didn't get that. Which other system? Templates are created within an
OOo application.


> > And, at least years back when I took
> > a look at it, gettext to me left an impression of a glued-together bunch
> > of something working only under specific circumstances. This may have
> > changed of course.
> 
> Ugh, do you want to hear my impression of the OOo system?

Even worse ;-)

> just a personal opinion, and it may be only because I never tried a
> better "way".  Indeed, even if I am not reluctant to learn a new way,
> I have found OOo localisation much harder.

It is. I never said it is better than other systems, but we have to be
realistic, and just assuming that switching over to gettext for the
string part and giving it a try would be better is not realistic as long
as nobody actually does it and proves that gettext and the remaining OOo
resource system can coexist without bigger disadvantages.

> After all, Solaris also has a full gettext implementation, allowing
> such experiments as those Tim Foster is doing using LD_PRELOAD
> mechanism (available on all GNU systems as well).

Bear in mind that we are not talking about Solaris or GNU systems only here.
We need this to work on any platform.

> And I think Tim would probably recommend XLIFF over PO file for doing
> translations,

Sure he does,
http://blogs.sun.com/roller/page/timf?anchor=open_source_translation_tools

> but he still makes great use of gettext library!  Just
> check out his blog about it:
> 
>   
> http://blogs.sun.com/roller/page/timf/Weblog?catname=%2FTranslation%2C+language+and+tools

Well, besides the experiment with intercepting gettext calls and a .po
filter for his TM system, I don't find any references to gettext in his
blog.

  Eike

-- 
 OOo/SO Calc core developer. Number formatter bedevilled I18N transpositionizer.
 GnuPG key 0x293C05FD:  997A 4C60 CE41 0149 0DB3  9E96 2F1A D073 293C 05FD

