Re: automake po / pot file integration: when to merge the PO files?

2010-09-13 Thread Yavor Doganov
At Tue, 7 Sep 2010 22:50:32 +0200,
Ralf Wildenhues wrote:
 * Yavor Doganov wrote on Tue, Sep 07, 2010 at 10:17:07AM CEST:
  On Mon, 06 Sep 2010 11:32:54 +0100, Roger Leigh wrote:
   I'm already doing this in some of my projects by adding
   --no-location to XGETTEXT_OPTIONS in po/Makevars.  It's a
   massive improvement.
  
  It's a massive disaster for translators.
  
   A question I have is what purpose does having the line number
   and source file serve?
  
  Automatically jumping to the source.  That's invaluable.
 
 Well, automatically jumping to a specific place in source code can
 be achieved with a more stable indexing method than file:line:
 annotations.  An example is how tags/TAGS files work: they encode a
 file name and a search regex that looks like a sed /^pattern/.

That's a nice idea, yes.  I don't know if it's easy to implement, and
I'm not sure why gettext's way is the way it is, given that *tags
predates it (IIRC, not sure).

 Now, I'm not sure reusing tags functionality exactly is possible;
 translator tools will need some adjustment to work with this.

PO editors definitely would have to be modified if this approach is
implemented in gettext proper, but I don't think that's a major
problem to be concerned about.  (They already track and adapt to
changes in gettext, naturally.)

The solution for Roger's problem (preventing .po/.pot VCS noise) is
very simple: just don't keep the .pot in the VCS; generate it at `make
dist' time.  If you keep .po files in VCS (not compulsory either; some
projects do not do this, but fetch the .po's at dist time), then
commit only those changes that are genuine translators' modifications.

Any diligent translator knows how to generate a .pot, and how to
refresh the .po from it.



Re: automake po / pot file integration: when to merge the PO files?

2010-09-07 Thread Yavor Doganov
On Mon, 06 Sep 2010 11:32:54 +0100, Roger Leigh wrote:

 I'd like to suggest that the best way to tackle the problem is to
 simply stop generating the source file/line number comments by
 default; I'm already doing this in some of my projects by adding
 --no-location to XGETTEXT_OPTIONS in po/Makevars.  It's a massive
 improvement.

Improvement for you as a maintainer, probably.  It's a massive disaster
for translators.

 A question I have is what purpose does having the line number and
 source file serve?

Automatically jumping to the source.  That's invaluable.

The only case where locations do not work (and do not make sense) is
for XML gibberish like .glade/.ui files.  The filename is still
useful, though -- I'm using it to open the file with Glade, and click
around to find the message I'm interested in.

 And if the original source file(s) for a string need to be found,
 grep(1) is pretty fast.

Try this with a non-trivial (or even trivial) package, you'd be
annoyed pretty quickly.

 At least for me, the translators get mailed the po file, and never
 look at the source, so it's not of *any practical benefit* to anyone

Yes, unfortunately these days many translators take the black box
approach -- not only do they never look at the source (which leads to
amusing translations of horrible quality), they don't even compile
and run the program.  That's not a valid reason to cripple the .po
files, though, making life significantly harder for those who take
care to test their translations properly.

Source code availability *is* a prerequisite for a decent translation,
and the locations of the messages are a major convenience (with a
capable PO editor, of course).




Re: automake po / pot file integration: when to merge the PO files?

2010-09-07 Thread Ralf Wildenhues
Hello,

* Yavor Doganov wrote on Tue, Sep 07, 2010 at 10:17:07AM CEST:
 On Mon, 06 Sep 2010 11:32:54 +0100, Roger Leigh wrote:
 
  I'd like to suggest that the best way to tackle the problem is to
  simply stop generating the source file/line number comments by
  default; I'm already doing this in some of my projects by adding
  --no-location to XGETTEXT_OPTIONS in po/Makevars.  It's a massive
  improvement.
 
 Improvement for you as a maintainer, probably.  It's a massive disaster
 for translators.
 
  A question I have is what purpose does having the line number and
  source file serve?
 
 Automatically jumping to the source.  That's invaluable.

Well, automatically jumping to a specific place in source code can be
achieved with a more stable indexing method than file:line: annotations.
An example is how tags/TAGS files work: they encode a file name and a
search regex that looks like a sed /^pattern/.

Now, I'm not sure reusing tags functionality exactly is possible;
translator tools will need some adjustment to work with this.  But that
adjustment would be fairly straightforward, and the resulting
translation files could have much more stable contents.
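
A rough sketch of the idea in shell, assuming a hypothetical reference
format of file name plus a fixed search string (this is not an existing
gettext feature, and the file name and message are made up).  The
file:line reference goes stale as soon as a line is added above the
message, while the search string still finds it:

```shell
#!/bin/sh
# Hypothetical illustration: compare a file:line reference with a
# tags-style file + search-pattern reference after the source changes.
dir=$(mktemp -d)
src="$dir/hello.c"

printf '%s\n' '#include <stdio.h>' \
              'int main (void) {' \
              '  puts (_("Hello, world"));' \
              '}' > "$src"

# Reference recorded at extraction time: line number and search string.
pattern='_("Hello, world")'
old_line=$(grep -nF -- "$pattern" "$src" | cut -d: -f1)

# An unrelated edit: one new line at the top of the file.
{ echo '#include <libintl.h>'; cat "$src"; } > "$src.new" && mv "$src.new" "$src"

# The stored line number now points at a different line,
# but searching for the string still locates the message.
new_line=$(grep -nF -- "$pattern" "$src" | cut -d: -f1)
echo "file:line reference said $old_line, message is now on line $new_line"

rm -rf "$dir"
```

A fixed string is used here instead of a regex to keep the sketch
short; a real format would also need a rule for messages that occur
more than once in a file.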

Thoughts?

Thanks,
Ralf



automake po / pot file integration: when to merge the PO files?

2010-09-06 Thread Bruno Haible
Hi,

One issue still needs discussion within the planned po / pot file
integration [1]:
When should the PO files that are distributed be merged with the POT file?

The problem
-----------

PO files (translations) are produced by translators and integrated into the
project either by a maintainer (who receives them by mail from the translators
directly or through the TP robot) or by a translator herself (who commits
it into the version control repository).

When a new release is made, or shortly before a new release is made, the
maintainer circulates a tarball, and the translators are supposed to pick
the PO files from this tarball and improve them by translating new
untranslated messages.

A PO file for a translator is produced by running 'msgmerge', basically
  $ msgmerge last-translation.po new-messages-list.pot > new-translation.po

If the PO files are being put in a VCS, then each time an 'msgmerge' is done,
the PO file changes (new line numbers, new messages, dropped messages, etc.).
Maintainers don't like this because
  - If they commit the modified PO files regularly, they bloat the history
of their VCS,
  - If they don't commit them regularly, the risk of conflicts increases.
Either way, it causes regular hassles.

If the PO files are not being put in a VCS, then
  1. the VCS contents are not the complete source,
  2. the workflow where translators commit their translations directly is
 impossible.

The classical approach
----------------------

In the approach designed in 1995, there is one PO file per language.

Logically, the POT file depends on all source files, each PO file depends on
the POT file, and each MO file depends on its corresponding PO file.

So, it would be right to implement Makefile dependencies in such a way that
each time a source file changes and the maintainer does a make, the POT file
is being updated (via an 'xgettext' invocation), then the PO files are being
updated (via N 'msgmerge' invocations), then the MO files are being updated
(via N 'msgfmt' invocations). But this would happen too often:
  - It takes too much time to rebuild _all_ these files after every little
change.
  - The maintainer most often does not care about whether the translations are
up-to-date, because even if he runs make install, he is not going to
start translation work.
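
The dependency chain above can be sketched as plain timestamp checks,
which is roughly what make does internally.  This is only an
illustration: the file names are placeholders, the files are empty
stand-ins, and regenerating a file is simulated with touch instead of
running xgettext or msgmerge.  The -nt test is widely supported by sh
implementations but is not strictly POSIX.

```shell
#!/bin/sh
# Sketch of the chain source -> POT -> PO -> MO as timestamp checks.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Create the chain with increasing timestamps so everything is up to date.
touch -t 202001010000 hello.c
touch -t 202001010001 domain.pot
touch -t 202001010002 xx.po
touch -t 202001010003 xx.mo

# is_stale TARGET PREREQ: succeeds if PREREQ is newer than TARGET.
is_stale () { [ "$2" -nt "$1" ]; }

# A small source edit makes the POT file stale ...
touch -t 202001010004 hello.c
is_stale domain.pot hello.c && pot_state=stale || pot_state=fresh

# ... and regenerating the POT file (simulated by touch) in turn makes
# the PO file stale.  The MO file follows once the PO file is merged.
touch -t 202001010005 domain.pot
is_stale xx.po domain.pot && po_state=stale || po_state=fresh
is_stale xx.mo xx.po && mo_state=stale || mo_state=fresh

echo "pot: $pot_state, po: $po_state, mo: $mo_state"

cd / && rm -rf "$dir"
```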

So, the approach implemented in po/Makefile.in.in is that 'make' does not
update all PO files, only 'make dist' (which produces a tarball) does.
There is also a 'make update-po' target which updates all PO files but does
not create a tarball. If there is a VCS, the maintainer is supposed to commit
the updated PO files when he makes and releases the tarball.

This was fine for cathedral style development, and until Automake came along.
In bazaar style development, there are more frequent releases, and committing
the updated PO files started to bloat the VCS history. Worse, with Automake's
'make distcheck' becoming more popular, maintainers started to create tarballs
that were not really meant for use by translators. But the PO files were
being updated and increased the potential of VCS conflicts.

The minimalistic approach
-------------------------

It would be possible to never update the PO files, and instead produce the .mo
files by running 'msgmerge' on the fly, directly before 'msgfmt':
  $ msgmerge xx.po domain.pot | msgfmt -c -o xx.mo -
So:
  - The POT file would be updated at make dist,
  - The PO files would only be changed when the translator submits a new one,
  - The MO files would be updated at make dist.
The VCS would only contain the PO files; and there would be no VCS conflicts.

The drawback with this approach is that translators cannot work with a PO
file that they take from a tarball; they would need to run 'msgmerge' by
themselves (if there is no TP robot that does it for them). This would be
a major hassle for the translators. Or they would need to rely on a web
service to deliver them the merged PO files - then the translators have a
methodology problem.

The inconsistent approach
-------------------------

This is a variation of the minimalistic approach: In the development tree,
never update the PO files. But implement the make dist target in such a
way that it puts updated PO files into the tarball.

Translators would be satisfied with this approach.

The drawback is that once a maintainer unpacks a tarball right after producing
it, its contents are different from what he has in his development tree. This
is not only surprising, it can also lead to bugs that appear only with the
release tarball and not earlier.

A radically different approach
------------------------------

It would be possible to store two PO files per language in a development tree:
  - xx.po, the last translation received from the translator,
  - xx.merged.po, the updated PO file, in sync with the latest POT file.
The VCS would only contain the xx.po files, not the xx.merged.po files. But
both sets of PO files would be present in the development 

Re: automake po / pot file integration: when to merge the PO files?

2010-09-06 Thread Roger Leigh
On Mon, Sep 06, 2010 at 11:25:44AM +0200, Bruno Haible wrote:
 Hi,
 
 One issue still needs discussion within the planned po / pot file
 integration [1]:
 When should the PO files that are distributed be merged with the POT file?

Just a few comments from a long-time gettext+automake user which
I hope might be useful:

The number one problem for me (as you identified) is the huge churn
in po file content as you make source changes.  I'd like to suggest
that the best way to tackle the problem is to simply stop generating
the source file/line number comments by default; I'm already doing this
in some of my projects by adding --no-location to XGETTEXT_OPTIONS
in po/Makevars.  It's a massive improvement.

Making this small change has a huge impact.  po file changes are now
sensible: they match source string changes only, not massive line
renumbering because I added/removed some unrelated code.  This makes
merging between branches sensible because I don't have an entire po
file full of line number conflicts I can't hope to merge manually.
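
To illustrate the kind of churn meant here, a small sketch with two
made-up PO fragments that differ only in their location comments (the
file name, line numbers and translation are invented).  Dropping the
"#:" lines with grep only simulates the effect of --no-location; it is
not what xgettext does internally.

```shell
#!/bin/sh
dir=$(mktemp -d)

# PO fragment before an unrelated source edit.
cat > "$dir/old.po" <<'EOF'
#: src/main.c:42
msgid "File not found"
msgstr "Datei nicht gefunden"
EOF

# Same fragment afterwards: only the line number moved.
cat > "$dir/new.po" <<'EOF'
#: src/main.c:57
msgid "File not found"
msgstr "Datei nicht gefunden"
EOF

# With location comments, the two files differ.
if cmp -s "$dir/old.po" "$dir/new.po"; then with_loc=same; else with_loc=differ; fi

# Without them (simulated by dropping "#:" lines), they are identical.
grep -v '^#:' "$dir/old.po" > "$dir/old.noloc"
grep -v '^#:' "$dir/new.po" > "$dir/new.noloc"
if cmp -s "$dir/old.noloc" "$dir/new.noloc"; then without_loc=same; else without_loc=differ; fi

echo "with locations: $with_loc; without: $without_loc"
rm -rf "$dir"
```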

A question I have is what purpose does having the line number and
source file serve?  Do those benefits outweigh the massive
disadvantages?  And if the original source file(s) for a string
need to be found, grep(1) is pretty fast.  At least for me, the
translators get mailed the po file, and never look at the source,
so it's not of *any practical benefit* to anyone AFAICS; I've
certainly had no complaints since I turned them off.


With this change made, I would be fully in favour of having
update-po run by default so that the po files are always kept
up-to-date.  In this situation, it makes sense--the po file changes
are *entirely related* to the source changes, and can be committed
together.

Updating the po files by default also makes releases easier:
if I tag a release and then make dist and find all the po
files were updated, modifying the repository, I need to
detag, commit the changes and retag.  Updating by default means
the repository is always in a releasable state whereby any
revision can be tagged without doing additional sanity checks.


Regards,
Roger 

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?   http://gutenprint.sourceforge.net/
   `-GPG Public Key: 0x25BFB848   Please GPG sign your mail.




Re: [Translation-i18n] automake po / pot file integration: when to merge the PO files?

2010-09-06 Thread Chusslove Illich
 [: Bruno Haible :]
 The minimalistic approach
 -
 [...]
 The drawback with this approach is that translators cannot work with a PO
 file that they take from a tarball; [...]

I'm likely missing something, but...

Why not have a per-language PO update target, e.g.

  $ make update-po-LANG

This would require msgmerge on the translator's system, but Gettext tools are
anyway the bare minimum that translators should have in a PO-based
translation workflow (for msgfmt -c if for nothing else).

This is what is conceptually done in the GNOME Translation Project, only using a
specialized tool (Intltool) instead of going directly through the build
system's interface, see

http://live.gnome.org/TranslationProject/SvnHowTo#Committing_interface_translation

(Repository update-commit actions in this instruction can be substituted
with unpack tarball, ..., send PO by email. But this brings up another
advantage, and that is that translators can work both with a tarball and
with a VCS in the same way.)

 [...] Or they would need to rely on a web service to deliver them the
 merged PO files - then the translators have a methodology problem.

This has nothing to do with the issue at hand, but I'm curious, what exactly
do you mean by a methodology problem?

-- 
Chusslove Illich (Часлав Илић)

