Re: [hugin-ptx] on the way to 2010.4.0 RC1 (sorting out the translations)

2010-12-23 Thread Yuval Levy
On December 22, 2010 11:25:53 pm Yuval Levy wrote:
 The right way to update translations that already exist in the codebase is
 `msgmerge -o merged_translation.po existing_translations.po
 new_contributed_translations.po`
 
 First question:  the above is what I understand when reading msgmerge's
 manpage.  It is the opposite of what is described in the translation guide.
 Should the translation guide be corrected?

Thanks to Alexandre Prokoudine [0] and Lu Fang [1] for the Russian and Chinese 
translations in the bug tracker.

I made a quick experiment:
  msgmerge -o ru.oldfirst.po ru.po new.ru.po
  msgmerge -o ru.newfirst.po new.ru.po ru.po
  msgfmt -c --statistics ru.newfirst.po
  msgfmt -c --statistics ru.oldfirst.po

Same with the Chinese translation.

The resulting files differ in size and cannot be compared meaningfully with 
traditional code inspection tools such as diff.  And indeed their statistics 
are quite different:

ru.newfirst: 1069 translated messages, 47 fuzzy translations, 46 untranslated 
messages.
ru.oldfirst: 982 translated messages, 108 fuzzy translations, 68 untranslated 
messages.

Similar results with the Chinese translation.  This indicates that the new 
files should be the first argument of the msgmerge command, which is also 
what the Hugin Translation Guide says.  I must have misinterpreted the 
msgmerge man page.
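
A minimal sketch of the merge step with the argument order the experiment 
favours.  The function name merge_contribution and its parameter names are 
made up for illustration; the file names in the usage comment are the ones 
from the experiment.  For reference, msgmerge's synopsis is 
`msgmerge def.po ref.pot`: translations are taken from the first file and 
the set of messages from the second.

```shell
#!/bin/sh
# Sketch only: merge a contributed .po file into the one already in the
# tree, passing the contributed file FIRST.
# merge_contribution is a hypothetical helper, not part of Hugin or gettext.
merge_contribution() {
    contributed=$1   # e.g. new.ru.po from the bug tracker
    existing=$2      # e.g. ru.po from the codebase
    merged=$3        # output file

    # Translations come from the first file, the message set from the second.
    msgmerge -o "$merged" "$contributed" "$existing" || return 1

    # Check the result and print translated/fuzzy/untranslated counts.
    # -o /dev/null discards the compiled catalog msgfmt would otherwise write.
    msgfmt -c --statistics -o /dev/null "$merged"
}

# Usage:
#   merge_contribution new.ru.po ru.po ru.merged.po
```

Writing to a separate output file keeps the original ru.po untouched until 
the statistics have been checked.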

Yuv (one step closer to issuing the release candidate)


[0] https://bugs.launchpad.net/hugin/+bug/693304
[1] https://bugs.launchpad.net/hugin/+bug/693735 




[hugin-ptx] on the way to 2010.4.0 RC1 (sorting out the translations)

2010-12-22 Thread Yuval Levy
The current release cycle has highlighted some weaknesses in how we deal with 
translations.  Below are my findings.  Comments, verifications, sanity checks 
welcome.

Most translations arrive as incremental additions/improvements/corrections to 
existing po files.

The right way to update translations that already exist in the codebase is 
`msgmerge -o merged_translation.po existing_translations.po 
new_contributed_translations.po`

First question:  the above is what I understand when reading msgmerge's 
manpage.  It is the opposite of what is described in the translation guide 
[0].  Should the translation guide be corrected?

Anything other than `msgmerge`, including the process to keep code branches 
in sync described at [1], works only in special cases and is better avoided 
for translations.

The terms to be translated are in English.  Not all developers are native 
English speakers.  The inevitable language errors have been discovered too 
late in the process.  Executing extract-messages.sh early and often will help 
prevent last-minute fixes.


KEY LEARNINGS

1. Development:  Execute extract-messages.sh early in the integration process. 
Don't wait for release branching.  This is particularly important when the new 
strings were added by non-native English speakers and need to be polished.

2. Translation:  The best way to contribute a translation is to publish your 
updated po file in the issue tracker [2].  If you have repository access, keep 
translating in the default branch.  Apply your translation to the release 
branch with msgmerge (or ask a developer to do this for you).

3. Bug-fixing:  While bug-fixing during the release cycle, do not mix code 
changes with translation changes in the same commits.  This is generally good 
practice, and luckily all bug fixes in 2010.4 have been done this way.

4. Release-management:  From the moment the strings in the default and release 
branch diverge, do not transplant changes to the translations.  Use msgmerge 
instead.  If there is a need to run extract-messages.sh again, do it 
separately in default and release branch, do not transplant.  I will update 
the wiki documentation [1] accordingly.


NEXT STEPS

For the specific situation of 2010.4, I went back over the whole history of 
the translations.

Because the divergence between 2010.4 and default is negligibly small (details 
below), I decided to
* just fix the errors in the English that have been identified on the mailing 
list;
* run extract-messages.sh again to have clean .po files in both default and 
2010.4;
* keep the language files as they are;
* work through the backlog of contributed language files (thank you, 
Alexandre);
* publish 2010.4.0 RC1 in the coming hours.

If you are a translator listed in the details below, after 2010.4.0 RC1 is 
published, please check your language.  If strings that you have translated 
have disappeared, let me know and we'll work together to trace the changes 
back.  If you posted them on either the tracker, the mailing list, or the 
mercurial repository, they are still available and fixing the issue boils down 
to grabbing your contribution from the archive and running msgmerge on it 
against the current po file.
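
For a contribution that lives in the mercurial repository, the recovery 
could look roughly like the sketch below.  The helper name 
recover_translation, the src/translations/LANG.po file naming and the 
.recovered suffix are assumptions for illustration; the revision has to be 
looked up in the history first.  The archived file goes first, in line with 
the follow-up experiment.

```shell
#!/bin/sh
# Sketch only: fetch an older version of a translation from mercurial and
# merge it against the current po file.
# recover_translation is a hypothetical helper.
recover_translation() {
    rev=$1                         # changeset that still has your strings
    lang=$2                        # language code, e.g. ru
    po=src/translations/$lang.po   # assumed file naming

    old=$(mktemp) || return 1

    # hg cat prints the file as it was at the given revision.
    hg cat -r "$rev" "$po" > "$old" || { rm -f "$old"; return 1; }

    # Archived contribution first, current file second.  The result goes to
    # a separate file so it can be inspected before replacing anything.
    msgmerge -o "$po.recovered" "$old" "$po"
    status=$?

    rm -f "$old"
    return $status
}

# Usage (REV is a placeholder for the changeset you found):
#   recover_translation REV ru
```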

For the future, it has been suggested to adopt Launchpad's Translations.  We 
still need to determine if it would make a difference (from reading its 
description [3] my educated guess is that it will).  What we already know now:  
a mandatory requirement for Launchpad's Translations (as noted by Lukas) is 
that they be BSD-Licensed.  This is necessary for the translations to 
propagate/share well.  For Hugin it would mean that we either get the 
translators to agree and relicense their work under the BSD license, or we 
regress on the translations that can't be relicensed.  To be discussed after 
2010.4.0 is released.


DETAILS

Traditional code inspection tools such as `diff` don't work well on po-files 
that have been edited with poedit or msgmerge.  Moreover, not being fluent in 
all concerned languages, it is difficult to conclusively say if the 
translations are OK or not.
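
One way to make `diff` more usable on po files, sketched here as an 
untested suggestion rather than something used for this release, is to 
normalize both files with msgcat first so that ordering, line wrapping and 
source-location comments no longer differ.  The helper name po_diff is 
hypothetical.  It does not solve the language problem: someone fluent still 
has to judge the translations themselves.

```shell
#!/bin/sh
# Sketch only: diff two po files after normalizing them with msgcat.
# po_diff is a hypothetical helper.
# Returns 0 if the normalized files match, 1 if they differ, 2 on error.
po_diff() {
    a=$(mktemp) || return 2
    b=$(mktemp) || { rm -f "$a"; return 2; }
    status=2

    # Same sort order, no line wrapping, no "#: file:line" comments.
    if msgcat --sort-output --no-wrap --no-location -o "$a" "$1" &&
       msgcat --sort-output --no-wrap --no-location -o "$b" "$2"; then
        diff -u "$a" "$b"
        status=$?
    fi

    rm -f "$a" "$b"
    return $status
}

# Usage:
#   po_diff ru.oldfirst.po ru.newfirst.po
```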

2010.4 branched out Nov 24 2010 at revision 4597, and we executed 
extract-messages.sh twice since.

To obtain a list of the significant changesets, I did the following:
* added the following line to the [alias] section of ~/.hgrc:
   ulog = log --template '{rev}\t{author|person}\t{desc|firstline}\n'
* listed the changes to default and 2010.4 branches into two files with
   hg ulog -b default -r 4597: src/translations/* > default.changes.txt
   hg ulog -b 2010.4 -r 4597: src/translations/* > release.changes.txt
* determined affected translations
- Czech (Vaclav Cerny)
- Venezuelan / Latin American Spanish (Ernesto Enrique Alvarado Viloria)
- Spanish (Uwe Koch Kronberg)
- Italian (Cristian Marchi)
- German (Joachim Schneider, Carl von Einem, Thomas Modes)
- Dutch (Harry van der Wolf)
- French (Jean-Luc Coulon)
- Hungarian (Lajos Höss)
* with the exception