Hello Martin,
thanks for your speedy reply.

Especially when asking things related to sgt-puzzles, please keep Ben
in CC:.

On Sat, Aug 06, 2022 at 12:23:48AM +0200, Martin Quinson wrote:
> the short answer is that po4a-gettextize is not intended to be used on a
> regular basis. It's only intended for the first run when you want to convert
> an existing translation to the po-based workflow. Once it's done, you're
> supposed to use po4a-updatepo to create an empty PO file. Even better, you
> should use po4a directly instead of the deprecated atomic commands.

Ok, so this would be incorrect usage in sgt-puzzles? It did work for
the past ~ 13 years. Then it might be helpful to add a note that
certain use cases are not working anymore.

Should this bug be cloned to sgt-puzzles for updating its
infrastructure?

> The extra spaces that you see are intended to help the gettextization process,
> as explained in the po4a-gettextize manpage.

At least I don't fully understand this text, even though I translated
it. (See below)

> I'm not sure of how I can help you here. What piece of documentation should be
> updated?

       •   In some cases, po4a adds a space at the end of either the original
           or the translated strings. This is because every string must be
           deduplicated during the gettextize process. Imagine a string that
           appears several times unmodified in the original but is translated
           in differing ways, or different paragraphs that are translated in
           the exact same way.

           Without deduplication, such a case would break the gettextization
           algorithm, as it is a simple one-to-one pairing between the msgids
           of both the master and the localized files. Since one of the PO
           files would miss an entry (that would be reported as duplicate,
           with two references), the pairing would fail.

What is missing here is how and when these strings are merged back, i.e.
what the translator or package maintainer should do to get to the desired
situation (i.e. each string only appearing once).
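
To make the desired end state concrete, here is a minimal Python sketch of
what I would like to be able to do. This is not a po4a feature; the function
name group_padded_msgids is purely hypothetical, and the sketch assumes the
msgids have already been extracted from de.po as plain strings. It only
finds the entries that differ by nothing but trailing spaces, so that they
could be reviewed and merged.

```python
from collections import defaultdict


def group_padded_msgids(msgids):
    """Group msgids that are identical once trailing spaces are stripped.

    Hypothetical helper, not part of po4a or gettext.
    """
    groups = defaultdict(list)
    for msgid in msgids:
        # Only spaces are stripped, since only spaces appear to be added.
        groups[msgid.rstrip(" ")].append(msgid)
    # Keep only the groups that really contain several variants.
    return {base: variants for base, variants in groups.items()
            if len(variants) > 1}


if __name__ == "__main__":
    # Toy data taken from the "dog" example in my original report.
    sample = ["dog", "dog ", "dog  ", "cat"]
    for base, variants in group_padded_msgids(sample).items():
        print(repr(base), "appears as", [repr(v) for v in variants])
```

Each reported group is a candidate for a single entry with one translation.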

           Since po4a uses the entry type ("title" or "plain paragraph", etc)
           to detect whether the parsing streams got desynchronized, similar
           issues could occur if two identical entries (same content but
           differing type) of the master file are translated in the exact
           same way in the localized file. po4a would detect a fake
           desynchronization in such a case.

           In most cases, the extra space added by po4a to deduplicate the
           strings has no impact on the formatting. Strings are fuzzied
           anyway, and msgmerge will probably match the strings accordingly
           afterward.

Could you add an example here, e.g. like the one I gave below? In your
e-mail above you say that one should use po4a-updatepo or po4a, whereas
the manpage text mentions msgmerge. Clarifying this would probably help
as well.
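
To show the kind of example I mean, here is how I currently read the
manpage text, written as a small Python toy model. This is only my
understanding, not po4a's actual code; the function names dedup and pair
are made up for illustration, and the strings come from my "dog" example.

```python
def dedup(strings):
    """Append trailing spaces until every string is unique, keeping order."""
    seen = set()
    result = []
    for s in strings:
        while s in seen:
            s += " "
        seen.add(s)
        result.append(s)
    return result


def pair(master, localized):
    """One-to-one pairing of master and localized entries by position."""
    if len(master) != len(localized):
        raise ValueError("entry counts differ, pairing fails")
    return list(zip(master, localized))


master = ["dog", "dog", "dog"]
localized = ["Hund", "Rüde", "Hund"]

# Without deduplication, each side collapses to its unique strings,
# so the two sides no longer have the same number of entries.
unique_master = list(dict.fromkeys(master))
unique_localized = list(dict.fromkeys(localized))
try:
    pair(unique_master, unique_localized)
except ValueError as err:
    print("without deduplication:", err)

# With deduplication, both sides keep three entries and can be paired.
pairs = pair(dedup(master), dedup(localized))
for msgid, msgstr in pairs:
    print(repr(msgid), "->", repr(msgstr))
```

If this reading is right, the padded entries are what keeps both sides the
same length, but the text still does not say how they are collapsed again
afterwards.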

> Thanks for using po4a,

Sure, for translators/translations it's a great piece of software.

> Mt
> 
> Le vendredi 05 août 2022 à 16:02 +0200, Helge Kreutzmann a écrit :
> > Package: po4a
> > Version: 0.67-2
> > Severity: normal
> > Tags: upstream
> > X-Debbugs-Cc: Ben Hutchings <b...@decadent.org.uk>
> > 
> > 
> > I'm the translator of the German translation for the documentation of
> > sgt-puzzles. It is a Debian-only patch at the moment for the halibut
> > based sources.
> > 
> > A few days ago Ben (the Debian maintainer) updated the package and
> > requested me to update the German translation. While doing so he
> > noticed a strange change in po4a behaviour:
> > 
> > (Some) strings, which are repeated (because the same text appears in
> > multiple places in the documentation resp. many man pages) are
> > inserted several times into de.po, except that an increasing number of
> > spaces is added, i.e.
> > 
> > "dog" would become
> > "dog"
> > "dog "
> > "dog  "
> > "dog   "
> > and so on.
> > 
> > While updating the German translation of po4a I remember translating 
> > something along these lines, though I did not fully understand its 
> > meaning.
> > 
> > This behaviour defeats part of the idea of the po format. Unless the
> > original author indicates this, identical strings in the original text
> > should be translated identically as well.
> > 
> > Now for some reason po4a makes identical strings artificially different. 
> > 
> > In the toy example above, this could become:
> > "Hund"
> > "Rüde "
> > "Gerüstklammer  "
> > "Schlepphaken   " 
> > …
> > 
> > So now the same string is translated differently *and* the
> > translation also receives (varying) additional trailing spaces. (As a
> > translator, you usually reproduce spaces at the beginning and end).
> > 
> > In this toy example this might be noticed easily, but usually po4a is
> > used for (longer) paragraphs - and translators might not realize they
> > already translated them and would retranslate them - additional work
> > and, as stated above, potentially inconsistent translations.
> > 
> > Thus please revert to the previous behaviour of po4a *or* ensure that
> > identical text is shown only once in the *.po(t) files.
> > 
> > In case you want to investigate yourself, do the following in
> > unstable:
> > 
> > apt-get source sgt-puzzles
> > cd sgt-puzzles-20191231.79a5378/
> > make -f debian/rules build
> > make -f Makefile.doc update-po


-- 
      Dr. Helge Kreutzmann                     deb...@helgefjell.de
           Dipl.-Phys.                   http://www.helgefjell.de/debian.php
        64bit GNU powered                     gpg signed mail preferred
           Help keep free software "libre": http://www.ffii.de/
