Re: automake po / pot file integration: when to merge the PO files?
At Tue, 7 Sep 2010 22:50:32 +0200, Ralf Wildenhues wrote:
> * Yavor Doganov wrote on Tue, Sep 07, 2010 at 10:17:07AM CEST:
> > On Mon, 06 Sep 2010 11:32:54 +0100, Roger Leigh wrote:
> > > I'm already doing this in some of my projects by adding --no-location to XGETTEXT_OPTIONS in po/Makevars. It's a massive improvement.
> >
> > It's a massive disaster for translators.
> >
> > > A question I have is what purpose does having the line number and source file serve?
> >
> > Automatically jumping to the source. That's invaluable.
>
> Well, automatically jumping to a specific place in source code can be achieved with a more stable indexing method than file:line: annotations. An example is how tags/TAGS files work: they encode a file name and a search regex that looks like a sed /^pattern/.

That's a nice idea, yes. I don't know if it's easy to implement, and I'm not sure why gettext's way is the way it is, given that *tags predates it (IIRC, not sure).

> Now, I'm not sure reusing tags functionality exactly is possible; translator tools will need some adjustment to work with this.

PO editors definitely would have to be modified if this approach is implemented in gettext proper, but I don't think that's a major problem to be concerned about. (They already track and adapt to changes in gettext, naturally.)

The solution for Roger's problem (preventing .po/.pot VCS noise) is very simple: just don't keep the .pot in the VCS; generate it at `make dist' time. If you keep .po files in VCS (not compulsory either; some projects do not do this, but fetch the .po's at dist time), then commit only those changes that are genuine translators' modifications. Any diligent translator knows how to generate a .pot, and how to refresh the .po from it.
Re: automake po / pot file integration: when to merge the PO files?
On Mon, 06 Sep 2010 11:32:54 +0100, Roger Leigh wrote:
> I'd like to suggest that the best way to tackle the problem is to simply stop generating the source file/line number comments by default; I'm already doing this in some of my projects by adding --no-location to XGETTEXT_OPTIONS in po/Makevars. It's a massive improvement.

Improvement for you as a maintainer, probably. It's a massive disaster for translators.

> A question I have is what purpose does having the line number and source file serve?

Automatically jumping to the source. That's invaluable. The only case where locations do not work (and do not make sense) is for XML gibberish like .glade/.ui files. The filename is still useful, though -- I'm using it to open the file with Glade, and click around to find the message I'm interested in.

> And if the original source file(s) for a string need to be found, grep(1) is pretty fast.

Try this with a non-trivial (or even trivial) package, you'd be annoyed pretty quickly.

> At least for me, the translators get mailed the po file, and never look at the source, so it's not of *any practical benefit* to anyone

Yes, unfortunately these days many translators take the black box approach -- not only do they never look at the source (which leads to amusing translations with horrible quality), they don't even compile and run the program. That's not a valid reason to cripple the .po files, though, making life significantly harder for those who take care to test their translations properly. Source code availability *is* a prerequisite for a decent translation, and the locations of the messages are a major convenience (with a capable PO editor, of course).
Re: automake po / pot file integration: when to merge the PO files?
Hello,

* Yavor Doganov wrote on Tue, Sep 07, 2010 at 10:17:07AM CEST:
> On Mon, 06 Sep 2010 11:32:54 +0100, Roger Leigh wrote:
> > I'd like to suggest that the best way to tackle the problem is to simply stop generating the source file/line number comments by default; I'm already doing this in some of my projects by adding --no-location to XGETTEXT_OPTIONS in po/Makevars. It's a massive improvement.
>
> Improvement for you as a maintainer, probably. It's a massive disaster for translators.
>
> > A question I have is what purpose does having the line number and source file serve?
>
> Automatically jumping to the source. That's invaluable.

Well, automatically jumping to a specific place in source code can be achieved with a more stable indexing method than file:line: annotations. An example is how tags/TAGS files work: they encode a file name and a search regex that looks like a sed /^pattern/.

Now, I'm not sure reusing tags functionality exactly is possible; translator tools will need some adjustment to work with this. But that adjustment would be fairly straightforward, and the resulting translation files could have much more stable contents.

Thoughts?

Thanks,
Ralf
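A minimal sketch of the tags-style idea above, using only standard shell tools. It is not how gettext or any PO editor works today. The helper name `locate_line` and the sample source file are made up for illustration, and the lookup uses a fixed-string match rather than the anchored regex that real tags files store.

```shell
#!/bin/sh
# Hypothetical helper: print the current line number of the first line in
# FILE ($1) that contains TEXT ($2). Fixed-string match, a simplification of
# the /^pattern/ search that tags files encode.
locate_line() {
    grep -n -F -- "$2" "$1" | head -n 1 | cut -d: -f1
}

demo_dir=$(mktemp -d)
src="$demo_dir/hello.c"

# Original source: the translatable string sits on line 3.
cat > "$src" <<'EOF'
#include <stdio.h>
int main(void) {
    puts(_("Hello, world"));
    return 0;
}
EOF
before=$(locate_line "$src" '_("Hello, world")')

# Unrelated edit: two lines are added above the string.
cat > "$src" <<'EOF'
#include <stdio.h>
#include <libintl.h>
#define _(s) gettext(s)
int main(void) {
    puts(_("Hello, world"));
    return 0;
}
EOF
after=$(locate_line "$src" '_("Hello, world")')

echo "line number moved from $before to $after; the search text is unchanged"
rm -rf "$demo_dir"
```

A file:line reference stored before the edit would now be stale, while a reference stored as file plus search text still resolves. The cost is that the reference breaks when the line itself is rewritten, and that duplicate matches need some tie-breaking rule.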
automake po / pot file integration: when to merge the PO files?
Hi,

One issue still needs discussion within the planned po / pot file integration [1]: When should the PO files that are distributed be merged with the POT file?

The problem
-----------

PO files (translations) are produced by translators and integrated into the project either by a maintainer (who receives them by mail from the translators directly or through the TP robot) or by a translator herself (who commits it into the version control repository).

When a new release is made, or shortly before a new release is made, the maintainer circulates a tarball, and the translators are supposed to pick the PO files from this tarball and improve them by translating new untranslated messages.

A PO file for a translator is produced by running 'msgmerge', basically

  $ msgmerge last-translation.po new-messages-list.pot -o new-translation.po

If the PO files are being put in a VCS, then each time an 'msgmerge' is done, the PO file changes (new line numbers, new messages, dropped messages, etc.). Maintainers don't like this because
  - If they commit the modified PO files regularly, they bloat the history of their VCS,
  - If they don't commit them regularly, the risk of conflicts increases.
Either way, it causes regular hassles.

If the PO files are not being put in a VCS, then
  1. the VCS contents are not the complete source,
  2. the workflow where translators commit their translations directly is impossible.

The classical approach
----------------------

In the approach designed in 1995, there is one PO file per language. Logically, the POT file depends on all source files, each PO file depends on the POT file, and each MO file depends on its corresponding PO file.

So, it would be right to implement Makefile dependencies in such a way that each time a source file changes and the maintainer does a 'make', the POT file is being updated (via an 'xgettext' invocation), then the PO files are being updated (via N 'msgmerge' invocations), then the MO files are being updated (via N 'msgfmt' invocations).
But this is too often:
  - It takes too much time to rebuild _all_ these files after every little change.
  - The maintainer most often does not care about whether the translations are up-to-date, because even if he runs 'make install', he is not going to start translation work.

So, the approach implemented in po/Makefile.in.in is that 'make' does not update all PO files, only 'make dist' (which produces a tarball) does. There is also a 'make update-po' target which updates all PO files but does not create a tarball. If there is a VCS, the maintainer is supposed to commit the updated PO files when he makes and releases the tarball.

This was fine for cathedral style development, and until Automake came along. In bazaar style development, there are more frequent releases, and committing the updated PO files started to bloat the VCS history. Worse, as Automake's 'make distcheck' became more popular, maintainers started to create tarballs that were not really meant for use by translators. But the PO files were being updated and increased the potential of VCS conflicts.

The minimalistic approach
-------------------------

It would be possible to never update the PO files, and instead produce the .mo files by running 'msgmerge' on the fly, directly before 'msgfmt':

  $ msgmerge xx.po domain.pot | msgfmt -c -o xx.mo -

So:
  - The POT file would be updated at 'make dist',
  - The PO files would only be changed when the translator submits a new one,
  - The MO files would be updated at 'make dist'.

The VCS would only contain the PO files; and there would be no VCS conflicts.

The drawback with this approach is that translators cannot work with a PO file that they take from a tarball; they would need to run 'msgmerge' by themselves (if there is no TP robot that does it for them). This would be a major hassle for the translators. Or they would need to rely on a web service to deliver them the merged PO files - then the translators have a methodology problem.
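The on-the-fly merge of the minimalistic approach can be tried in isolation. The following is only a sketch under stated assumptions: the catalog contents, the `demo 1.0` project name, and the file names are invented for the demonstration, and it reports `skipped` when the gettext tools are not installed. It uses `msgmerge --quiet` to write the merged catalog to standard output and `msgfmt -c -o FILE -` to compile from standard input.

```shell
#!/bin/sh
# Sketch only: all catalog contents below are made up for illustration.
status=skipped
if command -v msgmerge >/dev/null 2>&1 && command -v msgfmt >/dev/null 2>&1; then
    demo_dir=$(mktemp -d)

    # xx.po: the translator's last submission; it knows only "Hello".
    cat > "$demo_dir/xx.po" <<'EOF'
msgid ""
msgstr ""
"Project-Id-Version: demo 1.0\n"
"Report-Msgid-Bugs-To: \n"
"PO-Revision-Date: 2010-09-06 12:00+0200\n"
"Last-Translator: Nobody <nobody@example.org>\n"
"Language-Team: none\n"
"Language: de\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Hello"
msgstr "Hallo"
EOF

    # domain.pot: the current template; "Goodbye" was added to the sources.
    cat > "$demo_dir/domain.pot" <<'EOF'
msgid ""
msgstr ""
"Project-Id-Version: demo 1.0\n"
"POT-Creation-Date: 2010-09-07 10:00+0200\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=UTF-8\n"
"Content-Transfer-Encoding: 8bit\n"

msgid "Hello"
msgstr ""

msgid "Goodbye"
msgstr ""
EOF

    # Merge to stdout and compile from stdin; xx.po itself is never rewritten.
    if msgmerge --quiet "$demo_dir/xx.po" "$demo_dir/domain.pot" \
        | msgfmt -c -o "$demo_dir/xx.mo" - && [ -s "$demo_dir/xx.mo" ]; then
        status=built
    else
        status=failed
    fi
    rm -rf "$demo_dir"
fi
echo "on-the-fly merge: $status"
```

Because the merged catalog only ever exists in the pipe, nothing under version control changes. That is the property this approach relies on, and also the reason translators do not find a merged PO file in the tarball.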
The inconsistent approach
-------------------------

This is a variation of the minimalistic approach: In the development tree, never update the PO files. But implement the 'make dist' target in such a way that it puts updated PO files into the tarball.

Translators would be satisfied with this approach.

The drawback is that once a maintainer unpacks a tarball right after producing it, its contents are different from what he has in his development tree. This is not only surprising, it can also lead to bugs that appear only with the release tarball and not earlier.

A radically different approach
------------------------------

It would be possible to store two PO files per language in a development tree:
  - xx.po, the last translation received from the translator,
  - xx.merged.po, the updated PO file, in sync with the latest POT file.

The VCS would only contain the xx.po files, not the xx.merged.po files. But both sets of PO files would be present in the development
Re: automake po / pot file integration: when to merge the PO files?
On Mon, Sep 06, 2010 at 11:25:44AM +0200, Bruno Haible wrote:
> Hi,
>
> One issue still needs discussion within the planned po / pot file integration [1]: When should the PO files that are distributed be merged with the POT file?

Just a few comments from a long-time gettext+automake user which I hope might be useful:

The number one problem for me (as you identified) is the huge churn in po file content as you make source changes. I'd like to suggest that the best way to tackle the problem is to simply stop generating the source file/line number comments by default; I'm already doing this in some of my projects by adding --no-location to XGETTEXT_OPTIONS in po/Makevars. It's a massive improvement.

Making this small change has a huge impact. po file changes are now sensible: they match source string changes only, not massive line renumbering because I added/removed some unrelated code. This makes merging between branches sensible because I don't have an entire po file full of line number conflicts I can't hope to merge manually.

A question I have is what purpose does having the line number and source file serve? Do those benefits outweigh the massive disadvantages? And if the original source file(s) for a string need to be found, grep(1) is pretty fast. At least for me, the translators get mailed the po file, and never look at the source, so it's not of *any practical benefit* to anyone AFAICS; I've certainly had no complaints since I turned them off.

With this change made, I would be fully in favour of having update-po run by default so that the po files are always kept up-to-date. In this situation, it makes sense--the po file changes are *entirely related* to the source changes, and can be committed together.

Updating the po files by default also makes releases easier: if I tag a release and then make dist and find all the po files were updated, modifying the repository, I need to detag, commit the changes and retag.
Updating by default means the repository is always in a releasable state whereby any revision can be tagged without doing additional sanity checks.

Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
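The churn described in this message can be reproduced with two tiny catalogs. The sketch below does not run xgettext; it only simulates the effect of `--no-location` by filtering out the `#:` reference comments. The helper name `strip_locations` and the sample catalog entries are invented for illustration.

```shell
#!/bin/sh
# Hypothetical helper: drop the "#:" source-reference comments from a PO file.
strip_locations() {
    grep -v '^#:' "$1"
}

demo_dir=$(mktemp -d)

# The same message before and after an unrelated source edit moved it.
cat > "$demo_dir/old.po" <<'EOF'
#: src/main.c:42
msgid "File not found"
msgstr "Datei nicht gefunden"
EOF
cat > "$demo_dir/new.po" <<'EOF'
#: src/main.c:57
msgid "File not found"
msgstr "Datei nicht gefunden"
EOF

# With references, the two versions differ although no string changed.
if cmp -s "$demo_dir/old.po" "$demo_dir/new.po"; then
    with_refs=same
else
    with_refs=differ
fi

# Without references, the two versions are identical.
strip_locations "$demo_dir/old.po" > "$demo_dir/old.stripped"
strip_locations "$demo_dir/new.po" > "$demo_dir/new.stripped"
if cmp -s "$demo_dir/old.stripped" "$demo_dir/new.stripped"; then
    without_refs=same
else
    without_refs=differ
fi

echo "with references: $with_refs; without references: $without_refs"
rm -rf "$demo_dir"
```

This shows only the version-control side of the trade-off. What it cannot show is the cost raised elsewhere in the thread: without the references, a PO editor can no longer jump to the source location of a message.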
Re: [Translation-i18n] automake po / pot file integration: when to merge the PO files?
> [: Bruno Haible :]
> The minimalistic approach
> [...] The drawback with this approach is that translators cannot work with a PO file that they take from a tarball; [...]

I'm likely missing something, but... Why not have a per-language PO update target, e.g.

  $ make update-po-LANG

This would require msgmerge on the translator's system, but Gettext tools are anyway the bare minimum that translators should have in a PO-based translation workflow (for msgfmt -c if for nothing else).

This is what is conceptually done in the Gnome Translation Project, only using a specialized tool (Intltool) instead of going directly through the build system's interface, see http://live.gnome.org/TranslationProject/SvnHowTo#Committing_interface_translation

(Repository update-commit actions in this instruction can be substituted with unpack tarball, ..., send PO by email. But this brings up another advantage, and that is that translators can work both with a tarball and with a VCS in the same way.)

> [...] Or they would need to rely on a web service to deliver them the merged PO files - then the translators have a methodology problem.

This has nothing to do with the issue at hand, but I'm curious, what exactly do you mean by a methodology problem?

-- 
Chusslove Illich (Часлав Илић)