Hi Eike,

Today at 16:37, Eike Rathke wrote:
> Also OOo string resources already may have comment info: the language
> "x-comment" is reserved for adding comments to string resource entries.
> However, in first place it has to be used by the developers who
> add/modify the resource strings, and second be supported by tools that
> extract the strings and/or convert them to .po or other translation
> systems.

I am really used to the level of support translators get in Gnome: when we have an unclear message, we report it as a bug, and it gets reworded, commented, or both, by the module maintainer. It's nice to know that OOo already supports this in one way or another, of course.

>> The big question is: how much effort would be needed to port OOo to
>> use gettext?
>
> An even bigger question is: would gettext be able to handle it? How does
> it scale? Would it be worth the effort?

It is used by Gnome (30k+ translatable messages in the core, over 80k with a bunch of apps like Gimp, Gnumeric, ...), KDE (I think around the same for the core, again growing to over 80k with apps), XFCE, and many other applications and environments.

FWIW, Gnumeric (5-6k messages) feels a lot faster and more responsive than OO Calc on my Celeron 2.3 GHz, 636 MB RAM system (a personal impression, not a real measurement), so I am confident that gettext is not going to be a bottleneck. And it also combines translatable messages from other sources, such as Gtk+ stock items (menus and buttons), libgnomeprint dialogs, etc.

I can also describe some of the implementation details: MO files are mmap()-able and contain alphabetically sorted strings (allowing O(log N) search), and they also carry a hash table (allowing O(1) string matching; alas, this is an undocumented implementation detail on GNU systems, and I don't know about Solaris or other gettext implementations).

Yes, in my opinion, it would be worth the effort.
> If it still applies, the first obstacle that always came to my mind with
> any gettext implementation: the original string is in the source code.
> If that needs to be changed you need to recompile and link. _And_ you
> have to change the string in the corresponding gettext resource.

Yes, the original strings are in the source code. But if you wish, you can still treat them simply as "keys" in cases where you don't want to change a string in the source code (such as string freeze periods): you can provide an "en_US" translation or something like that to introduce typo fixes, etc.

The GNU gettext tools are excellent at handling the "corresponding gettext resource" (i.e. MO and PO files): you get automatic fuzzy matching and translation reuse, all done automatically without programmers having to care too much.

Also, I don't really see the value of this argument, since with the current OOo system, AFAIU, both original and translated strings end up in the "source code". Having only the original ones there is clearly better, no?

PO files and gettext performance allow us to set up statistics pages regenerated every couple of hours DIRECTLY FROM CVS/SVN, such as:

  http://l10n-status.gnome.org/
  http://i18n.kde.org/stats/gui/stable/index.php

Why don't we get those for OOo? These are among the most valuable things translators have when working with PO files: it's not really about the PO files themselves, but about how simple it is to create such web pages working almost in real time. (Indeed, we are currently testing another version for Gnome where updates will happen on CVS commits instead!)

Note that both of these are larger code bases than OOo, that they are regenerated with the simple fuzzy matching algorithm from GNU msgmerge (so it's not the fastest available method), and that it takes two or three hours for the complete Gnome pages to be regenerated for ~100 languages.
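As an illustration of the "keys" idea, this is what such an en_US typo-fix entry could look like in a PO file (the message, the extracted `#.` comment, and the source reference are all invented for the example). The msgid stays the untouched source string, so nothing needs to be recompiled:

```po
#. Shown when the document cannot be written to disk.
#: sw/source/ui/app/error.src:42
msgid "Cuold not save document"
msgstr "Could not save document"
```

The same `#.` comment mechanism, by the way, is the gettext counterpart of the "x-comment" language Eike mentions: comments written by developers travel with the string all the way to the translator.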
> In the past we already had quite some discussion about gettext and came
> to no conclusion where it was feasible to switch to gettext, did
> anything change in that system?

That depends on what you consider feasible. I understand that there is some value in having all strings in resource files, but it inevitably leads to many problems for both programmers and translators. (Did it ever happen that a programmer displayed the wrong string? Or that a translator matched a wrong translation to a string because the original had been updated?)

Of course, to be serious, I am definitely biased toward the format (I have my own implementations of MO file parsers in PHP, C#, and even Perl for intltool), but only because I find it so natural as both a programmer and a translator!

> Nowadays where tools like oo2po exist I wouldn't say that switching to
> a gettext based approach would be necessary from this view. However,
> there are some features of gettext that are worth taking a look at, like
> language dependant plurals, but much more important for lowering the
> barrier would it be to separate all localization effort from the source
> tree, and not having to run a build in the entire source tree just to
> (re)assemble some strings and bitmaps.. this _could_ be done using
> gettext, but for text only, and at what cost?

As I said, I can't judge the cost, because I don't know what it would take for OOo to switch. I only want to add that the only cost will be the cost of porting the code over; there won't be any penalty in performance or value, IMHO.

> People tend to only see the mere translation phase and simplicity of
> language packs regarding text only, and in these of course using gettext
> is much easier. For handling localized icons and such you'd still need
> another system, or did that change?

No, gettext is a text-only localisation system. There is nothing in it to support other kinds of data, but the problem it does solve, it solves so well that I don't see this as a counter-argument.
Granted, gettext doesn't handle localised sounds or videos either, but does OOo currently handle those? Or localised document templates, *without* another system?

> And, at least years back when I took
> a look at it, gettext to me left an impression of a glued-together bunch
> of something working only under specific circumstances. This may have
> changed of course.

Ugh, do you want to hear my impression of the OOo system? I.e. a "make it harder for everybody" system, which even then "hardly" works? The number of assumptions in the OOo translation system is much higher than in gettext: with gettext you assume (i.e. set at compile time) the MO file base path and the "domain" name, two hard-coded options in total, and you can override both with the documented LD_PRELOAD mechanism! Everything else is there for programmers and translators to play with.

I have translated north of 50k "messages" (ranging from single-word entries to complete paragraphs), I have written hundreds of KLOC in tens of languages, and I have found the gettext format to be the nicest to me as a programmer, and the nicest to me as a translator. Of course, this is just a personal opinion, and it may be only because I never tried a better "way". Indeed, even though I am not reluctant to learn a new way, I have found OOo localisation much harder.

After all, Solaris also has a full gettext implementation, allowing such experiments as those Tim Foster is doing using the LD_PRELOAD mechanism (available on all GNU systems as well). I think Tim would probably recommend XLIFF over PO files for doing translations, but he still makes great use of the gettext library! Just check out his blog about it:

  http://blogs.sun.com/roller/page/timf/Weblog?catname=%2FTranslation%2C+language+and+tools

Now, I am not saying gettext is perfect: far from it. It's only better :)

Cheers,
Danilo

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]