The current release cycle has highlighted some weaknesses in how we deal with
translations. Below are my findings. Comments, verifications, sanity checks
welcome.
Most translations arrive as incremental additions/improvements/corrections to
existing po files.
The right way to update translations that already exist in the codebase is
`msgmerge -o merged_translation.po existing_translations.po
new_contributed_translations.po`
First question: the above is what I understand when reading msgmerge's
manpage. It is the opposite of what is described in the translation guide
[0]. Should the translation guide be corrected?
Anything other than `msgmerge`, including the process to keep code branches
in sync described at [1], works only in special cases and is better avoided
for translations.
The terms to be translated are in English. Not all developers are native
English speakers. The inevitable language errors have been discovered too
late in the process. Executing extract-messages.sh early and often will help
prevent last-minute fixes.
KEY LEARNINGS
1. Development: Execute extract-messages.sh early in the integration process.
Don't wait for release branching. This is particularly important when the new
strings were added by non-native English speakers and need to be polished.
2. Translation: The best way to contribute a translation is to publish your
updated po file in the issue tracker [2]. If you have repository access, keep
translating in the default branch. Apply your translation to the release
branch with msgmerge (or ask a developer to do this for you).
3. Bug-fixing: While bug-fixing during the release cycle, do not mix code
changes with translation changes in the same commits. This is generally good
practice, and luckily all bug fixes in 2010.4 have been done this way.
4. Release-management: From the moment the strings in the default and release
branches diverge, do not transplant changes to the translations. Use msgmerge
instead. If there is a need to run extract-messages.sh again, do it
separately in the default and release branches; do not transplant. I will
update the wiki documentation [1] accordingly.
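
To illustrate learning 3, here is a minimal sketch of a check for commits
that mix code and translation changes. It assumes the translations live under
src/translations/ (the path used in the hg commands in the DETAILS below).
The function name classify_changes is made up for this example; it is not
part of Hugin or Mercurial.

```shell
# Hypothetical helper: classify a list of changed paths, one per line on stdin.
# Prints "mixed", "translations", "code", or "empty".
classify_changes() {
    awk '
        /^src\/translations\// { po = 1; next }   # a translation file
        NF                     { code = 1 }       # any other non-blank path
        END {
            if (po && code) print "mixed"
            else if (po)    print "translations"
            else if (code)  print "code"
            else            print "empty"
        }
    '
}
```

The list of paths could be fed from something like
`hg status --change REV -n | classify_changes`, which should print the files
touched by a single changeset. A result of "mixed" points at a commit worth
splitting.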
NEXT STEPS
For the specific situation of 2010.4, I went back over the whole history of
the translations.
Because the divergence between 2010.4 and default is negligibly small (details
below), I decided to
* just fix the errors in the English that have been identified on the mailing
list;
* run extract-messages.sh again to have clean .po files in both default and
2010.4;
* keep the language files as they are;
* work through the backlog of contributed language files (thank you,
Alexandre);
* publish 2010.4.0 RC1 in the coming hours.
If you are a translator listed in the details below, after 2010.4.0 RC1 is
published, please check your language. If strings that you have translated
have disappeared, let me know and we'll work together to trace the changes
back. If you posted them on either the tracker, the mailing list, or the
Mercurial repository, they are still available and fixing the issue boils down
to grabbing your contribution from the archive and merging it against the
current po file with msgmerge.
For the future, it has been suggested to adopt Launchpad's Translations. We
still need to determine if it would make a difference (from reading its
description [3] my educated guess is that it will). What we already know now:
a mandatory requirement for Launchpad's Translations (as noted by Lukas) is
that the translations be BSD-licensed. This is necessary for them to
propagate/share well. For Hugin it would mean that we either get the
translators to agree and relicense their work under the BSD license, or we
regress on the translations that can't be relicensed. To be discussed after
2010.4.0 is released.
DETAILS
Traditional code inspection tools such as `diff` don't work well on po-files
that have been edited with poedit or msgmerge. Moreover, since I am not
fluent in all the languages concerned, it is difficult to say conclusively
whether the translations are OK.
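
Since `diff` is too noisy here, one rough, language-independent check is to
compare the number of untranslated entries in a po file before and after a
merge. Below is a minimal sketch of such a counter. count_untranslated is a
hypothetical helper, not part of gettext or Hugin, and it assumes simple po
files: it ignores plural forms and fuzzy flags, and treats an entry as
untranslated when `msgstr ""` is not followed by a continuation line.

```shell
# Hypothetical helper: count entries with an empty msgstr in a .po file.
# Rough sketch only: no handling of plural forms or fuzzy entries.
count_untranslated() {
    awk '
        /^msgstr ""$/   { pending = 1; next }   # translation may be empty
        pending && /^"/ { pending = 0; next }   # continuation line: not empty
        pending         { n++; pending = 0 }    # anything else: it was empty
        END             { if (pending) n++; print n + 0 }
    ' "$1"
}
```

If the count for a language goes up after a merge, translations have probably
been lost. For authoritative numbers, `msgfmt --statistics -o /dev/null
file.po` reports translated, fuzzy and untranslated messages.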
2010.4 branched out Nov 24 2010 at revision 4597, and we executed
extract-messages.sh twice since.
To obtain a list of the significant changesets, I did the following:
* added the following line to the [alias] section of ~/.hgrc:
    ulog = log --template '{rev}\t{author|person}\t{desc|firstline}\n'
* listed the changes to the default and 2010.4 branches into two files with:
    hg ulog -b default -r 4597: src/translations/* > default.changes.txt
    hg ulog -b 2010.4 -r 4597: src/translations/* > release.changes.txt
* determined affected translations
- Czech (Vaclav Cerny)
- Venezuelan / Latin American Spanish (Ernesto Enrique Alvarado Viloria)
- Spanish (Uwe Koch Kronberg)
- Italian (Cristian Marchi)
- German (Joachim Schneider, Carl von Einem, Thomas Modes)
- Dutch (Harry van der Wolf)
- French (Jean-Luc Coulon)
- Hungarian (Lajos Höss)
* with the exception