Re: [l10n-dev] Re: [documentation-dev] Re: [l10n-dev] Massive English help stampedo

Martin Srebotnjak Mon, 07 Sep 2009 02:42:01 -0700

2009/9/7 Frank Peters <f...@sun.com>:
>> Specifically, getting rid of the 6-blanks strings in the CWS should be
>> easy, assuming that there are no legitimate instances of such a thing.
>
> Any such validation must be based on rules, we need to define so we do
> not by accident "fix" bugs that are no bugs. Some come to mind:
>
> - No trailing spaces in a <paragraph>
> - No empty elements except for <br/>
> - Adjacent elements need to be merged:
>  <emph>One</emph><emph> Two</emph> -> <emph>One Two</emph>
>
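
As a rough illustration of the three rules quoted above, here is a
minimal report-only sketch. It only lists suspicious spots and changes
nothing, so it cannot "fix" something that is not a bug. The function
name check_fragment and the regular expressions are assumptions made
for this example, not part of any existing OOo tool.

```python
import re

# Hypothetical patterns, one per quoted rule.
TRAILING_SPACE = re.compile(r"[ \t]+</paragraph>")
# An element opened and closed with nothing (or only whitespace) inside.
# Self-closing elements such as <br/> do not match this pattern.
EMPTY_ELEMENT = re.compile(r"<(\w+)(?:\s[^<>]*)?>\s*</\1>")
# A closing tag directly followed by an attribute-less opening tag of
# the same name.
ADJACENT = re.compile(r"</(\w+)><\1>")

def check_fragment(fragment):
    """Return a list of rule violations found in a help XML fragment."""
    problems = []
    if TRAILING_SPACE.search(fragment):
        problems.append("trailing space in <paragraph>")
    if EMPTY_ELEMENT.search(fragment):
        problems.append("empty element")
    if ADJACENT.search(fragment):
        problems.append("adjacent elements that could be merged")
    return problems
```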


AD *blanks*
Well, the previous m51 added 6-blank runs. This time (in m57) it seems
that only some of those were removed, most were not, and 9-blank runs
and a smaller number of 6-blank runs were added. These changes often
occur in the index entry strings, where we usually get:
"<bookmark_value>numbering; lists, while typing</bookmark_value>
<bookmark_value>bullet lists;creating while typing</bookmark_value>
  <bookmark_value>lists;automatic numbering</bookmark_value>
<bookmark_value>numbers;lists</bookmark_value>
<bookmark_value>automatic bullets/numbers; AutoCorrect
function</bookmark_value>      <bookmark_value>bullets; using
automatically</bookmark_value>      <bookmark_value>paragraphs;
automatic numbering</bookmark_value>"
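
A minimal sketch of how such runs of blanks could be stripped, assuming
that no whitespace at all is wanted between two index entries (that
assumption would have to be confirmed first). The function name and the
sep parameter are made up for this example.

```python
import re

# Any run of whitespace sitting between a closing and the next opening
# <bookmark_value> tag. Whitespace inside an entry is never touched.
BETWEEN_ENTRIES = re.compile(r"(</bookmark_value>)\s+(<bookmark_value>)")

def strip_blanks_between_entries(text, sep=""):
    """Replace whitespace runs between index entries with sep."""
    return BETWEEN_ENTRIES.sub(lambda m: m.group(1) + sep + m.group(2), text)
```

Because the pattern is anchored on the two tags, a string like
"numbering; lists, while typing" inside an entry keeps its spaces.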

AD *trailing spaces*
Trailing spaces, well, are another big problem, and removing them
automatically may eliminate the spaces between sentences in the same
paragraph. A lot of strings in the help end with a single trailing
space (at least in the po files I am localizing - maybe these are not
split in your original OOo help content files, but they do get split
on the way to po files). Some are stand-alone paragraphs and don't
need their trailing space, but the majority need it to separate them
from the string that follows in the same paragraph. So should one go
through the help manually and see which of those trailing spaces can
or should be removed? Or should all strings of a paragraph first be
merged into single strings, so that trailing spaces can then be
removed? If this splitting of paragraphs is done by one of your macros
or algorithms, what would eliminating the splitting bring to the
translators - a lot of untranslated strings? I mean, would the gettext
tools be wise enough to fuzzy match two strings together into a new
translation?

I have also "met" some strings in the help that have a space at the
beginning, probably because they continue a paragraph started in a
previous help string (not many of those, but there are some). Or
maybe they are pure mistakes.

This is why I am very exact about leaving a trailing space when I see
it in the original strings.
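
This kind of carefulness can also be checked mechanically. A minimal
sketch, assuming the source string and its translation are available as
plain Python strings (the function names are invented for this
example):

```python
def edge_spaces(s):
    """Return the (leading, trailing) runs of spaces of a string."""
    leading = s[:len(s) - len(s.lstrip(" "))]
    trailing = s[len(s.rstrip(" ")):]
    return leading, trailing

def edge_spaces_differ(msgid, msgstr):
    """True if a translated string lost or gained leading/trailing spaces."""
    if not msgstr:
        # Untranslated entry: nothing to compare yet.
        return False
    return edge_spaces(msgid) != edge_spaces(msgstr)
```

For example, edge_spaces_differ("Click OK. ", "Translated text.") is
true because the trailing space was dropped in the translation.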

But what worries me most (I guess the blanks can and will be taken
care of easily) are massive tag changes like:
<emph> -> <item type=\"menuitem\">
I could get a heart attack just thinking that what has now happened in
helpcontent2/source/text/swriter/guide.po (around 1000 changed strings
out of 2100, mostly with these tags changed and some 6- or 9-blanks
added) might happen to other help files. There would be tens of
thousands of strings that all translators would need to edit manually.

Just a small digression from a translator's perspective. When you
stumble upon such a fuzzy string with changed tags and/or unwanted
blanks added, you do not and cannot know what else changed in the
string. So you are careful and recheck the translation, then repair
the tags and remove the spaces. Sometimes you check the old po file to
see what was changed since the previous mXX. So in the end one changed
string like that takes a translator more time than translating it from
scratch. Please keep this in mind when you decide what to do with the
changes in m51 and m57 and about further steps in changing tags in
these files. Changing 2000 help strings, some intentionally by just
replacing tags and some by mistakenly adding blanks, means the same as
2000 new untranslated strings.

So if such tag changes are planned, a special CWS could be created
where *all* such tags would be replaced. All translation teams would
then deliver their current SDF files to Hamburg, where a special
super-trouper command line tool would replicate all the tag changes in
the translations as well (this would most probably be an error-prone
procedure, but at least it would apply most of the changes correctly;
I guess 95% success would be more than enough, if the strings likely
to contain errors got marked as fuzzy). Then the CWS would be
published in an mXY and the teams would get their SDF->po files back
and continue their business as usual. If that is possible.
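
To make the idea concrete, here is a minimal sketch of the core of
such a tool. It is only an illustration under several assumptions: it
handles just the <emph> -> <item type="menuitem"> case, it works on
unescaped strings (in a po file the quotes appear as \"), and the name
migrate_tags is invented. No such tool exists as far as this mail
knows.

```python
import re

# Assumption: every <emph>...</emph> pair becomes a menuitem element.
EMPH = re.compile(r"<emph>(.*?)</emph>")
MENUITEM = r'<item type="menuitem">\1</item>'

def migrate_tags(old_source, new_source, translation):
    """Replay the tag change on a translation.

    Returns (migrated_translation, fuzzy). The entry is marked fuzzy
    unless the source change is exactly the tag swap and the
    translation has the same number of <emph> pairs as the old source.
    """
    migrated = EMPH.sub(MENUITEM, translation)
    pure_tag_swap = EMPH.sub(MENUITEM, old_source) == new_source
    same_count = len(EMPH.findall(old_source)) == len(EMPH.findall(translation))
    return migrated, not (pure_tag_swap and same_count)
```

With a made-up entry, migrate_tags("Choose <emph>File - Open</emph>.",
'Choose <item type="menuitem">File - Open</item>.',
"Xyz <emph>Abc - Def</emph>.") returns the translation with the new
tag and fuzzy set to False. If the new source also gained extra blanks
or other edits, the same call returns fuzzy set to True, so a
translator still reviews it.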

Thanks for listening,
m.
