Hi Jean-Christophe,

Thank you once again for sharing your thoughts and experience.

I am trying to reproduce what you describe below and to clarify it with the other engineers.

However, from what I understand here, the issue you see is not necessarily Pootle itself but the format Pootle delivers, which is .po. As already said, Pootle will be able to deliver the content in XLIFF format in the near future. Would you still see a problem with this?

Regards,
Rafaella

Jean-Christophe Helary wrote:

I have no idea where the UI files come from and how they _must_ be processed before reaching the state of l10n source files.

So, let me give a very simplified view of the Help files preparation for l10n, as seen from a "pure" TMX + TMX-supporting tool point of view. Since I don't know what the internal processes really are, I can only guess, and I may be mistaken.

• The original Help files are English HTML file sets.
• Each localization has a set of files that corresponds to the English HTML sets.
• The English and localized versions are kept in sync.

To create TMX files:

Use a process that aligns each block-level tag in the English set to the corresponding block-level tag in the localized set. That is called paragraph (or block) segmentation, and that is what Sun does for NetBeans: no intermediary file format, no .sdf, no .po, nothing at all between the Help sets and the TMX sets.
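
To make the idea concrete, here is a minimal sketch of such a paragraph-level alignment, assuming the English and localized sets are structurally in sync (same blocks, in the same order). The file names, the block-tag list and the language codes are purely illustrative, and inline tag handling (TMX level 2) is left out:

import re
from xml.sax.saxutils import escape

# Block-level elements to align; an illustrative list, not an exhaustive one.
BLOCK_RE = re.compile(r"<(p|h[1-6]|li|td)\b[^>]*>(.*?)</\1\s*>", re.S | re.I)

def blocks(html_text):
    # Inner content of each block-level element, in document order.
    return [m.group(2).strip() for m in BLOCK_RE.finditer(html_text)]

def align_to_tmx(en_html, loc_html, srclang="en-US", tgtlang="fr-FR"):
    # Pair English and localized blocks one-to-one and emit a simplified TMX.
    en, loc = blocks(en_html), blocks(loc_html)
    if len(en) != len(loc):
        raise ValueError("sets are out of sync, alignment needs manual review")
    tus = []
    for e, t in zip(en, loc):
        tus.append('  <tu>\n'
                   '    <tuv xml:lang="%s"><seg>%s</seg></tuv>\n'
                   '    <tuv xml:lang="%s"><seg>%s</seg></tuv>\n'
                   '  </tu>' % (srclang, escape(e), tgtlang, escape(t)))
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<tmx version="1.4"><header srclang="%s" datatype="html" '
            'segtype="paragraph"/><body>\n%s\n</body></tmx>\n'
            % (srclang, "\n".join(tus)))

if __name__ == "__main__":
    en_html = open("help/en/sample.html", encoding="utf-8").read()
    fr_html = open("help/fr/sample.html", encoding="utf-8").read()
    open("sample_fr.tmx", "w", encoding="utf-8").write(align_to_tmx(en_html, fr_html))

Again, this is only a sketch of the principle: one pass from the synced HTML sets straight to TMX, nothing in between.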

The newly updated English Help files come as sets of files, all HTML.

The process to translate, after the original TMX conversion above (only _ONE_ conversion in the whole process), is the following:

Load the source file sets and the TMX sets in the tool.

The HTML tags are automatically handled by the tool.
The already translated segments are automatically translated by the tool, so the translator only needs to focus on what has been updated, using the whole translation memory as a reference.
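
As a rough illustration of that pre-translation step (nothing more than an exact-match lookup here; a real CAT tool also does fuzzy matching and protects the inline tags, and the segment strings below are invented):

def pretranslate(new_blocks, memory):
    # memory: dict mapping legacy source segments to their translations,
    # e.g. built from the TMX above. Returns (segment, translation or None).
    return [(seg, memory.get(seg)) for seg in new_blocks]

memory = {"Click <b>OK</b> to close the dialog.":
          "Cliquez sur <b>OK</b> pour fermer la boîte de dialogue."}
updated = ["Click <b>OK</b> to close the dialog.",      # unchanged: reused automatically
           "This paragraph is new in this release."]    # new: left for the translator
for seg, hit in pretranslate(updated, memory):
    print("REUSED" if hit else "TO DO ", "|", hit or seg)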

Once the translation is done, the translator delivers the full set, which is integrated into the release after proofreading, etc.

What is required on the source files provider's side? Creating TMX from the HTML paragraph sets.

What is required from the translator? No conversion whatsoever: just work with the files and automatically update the translation with the legacy data.



Now, what do we have currently?

The source files provider creates a differential of the new vs. the old HTML set.
It converts the result to an intermediate format (.sdf).
It converts that result to yet another intermediate format for the translator (either .po or XLIFF).
It matches the diffed strings to the corresponding old localized strings, thus removing the real context of the old strings.
It creates a false TMX based on an already intermediate format, without hiding the internal codes (no TMX level 2: all the tag info is handled as text data...).
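
For readers not familiar with the distinction, this is roughly what it means for a segment containing an inline <b> tag (simplified, and the segment text is invented; real level 2 markup carries a few more attributes):

Tags stored as text data (what the generated TMX does):
  <seg>Click &lt;b&gt;OK&lt;/b&gt; to continue.</seg>

TMX level 2, tags marked up as protected inline codes:
  <seg>Click <bpt i="1">&lt;b&gt;</bpt>OK<ept i="1">&lt;/b&gt;</ept> to continue.</seg>

In the first case the tool (and the translator) sees the codes as translatable text; in the second case the tool can protect them and still reuse the segment.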

The translator is left to work with intermediate files that have been converted twice, losing most of their relation to the original format and increasing the probability of problems with the back conversion.

The translator also has to work with a false TMX that has none of the original context, thus producing false matches that have to be second-guessed, and that displays internal codes as text data.


Do you see where the overhead is?



It is very possible that the UI files do require some sort of intermediate conversion to provide the translators with a manageable set of files, but as far as the Help files are concerned (and as far as I understand the process at hand), there is absolutely no need whatsoever to use an intermediate conversion, to remove the original context, or to force the translator to use error-prone source files.


It is important to find ways to simplify the system so that more people can contribute and so that the source files provider has fewer tasks to handle, but using a .po-based process to translate HTML files clearly goes in the opposite direction. And translators are (sadly, without being conscious of it) suffering from that, which results in less time spent checking one's own translation and a general overhead for checkers and converters.

Don't get me wrong, I am not ranting; I _am_ really trying to convince people here that things could (and should) be drastically simplified. For those who have some time, I encourage you to look at how NetBeans manages its localization process, because we are losing a _huge_ amount of human resources with the current process.

Cheers,

Jean-Christophe Helary (fr team)