2009/9/7 Frank Peters <f...@sun.com>:
>> Specifically, getting rid of the 6-blanks strings in the CWS should be
>> easy, assuming that there are no legitimate instances of such a thing.
>
> Any such validation must be based on rules, we need to define so we do
> not by accident "fix" bugs that are no bugs. Some come to mind:
>
> - No trailing spaces in a <paragraph>
> - No empty elements except for <br/>
> - Adjacent elements need to be merged:
>   <emph>One</emph><emph> Two</emph> -> <emph>One Two</emph>
>
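For illustration only, the three rules quoted above could be checked
roughly like this. It is a sketch in Python that assumes the help strings
can be treated as plain text with regular expressions rather than parsed
as XML. The names check_paragraph and merge_adjacent are made up here and
are not part of any existing tool. The checker only reports violations,
since deciding which of them are real bugs is exactly what the rules are
for.

```python
import re

# Hypothetical rule checks; the regexes are illustrative, not a full XML parser.
TRAILING_SPACE = re.compile(r"[ \t]+</paragraph>")
EMPTY_ELEMENT = re.compile(r"<(?!br\b)(\w+)[^<>]*>\s*</\1>")
ADJACENT_SAME = re.compile(r"</(\w+)><\1>")


def check_paragraph(xml: str) -> list[str]:
    """Return the names of the rules that the given help snippet violates."""
    problems = []
    if TRAILING_SPACE.search(xml):
        problems.append("trailing-space")
    if EMPTY_ELEMENT.search(xml):
        problems.append("empty-element")
    if ADJACENT_SAME.search(xml):
        problems.append("adjacent-elements")
    return problems


def merge_adjacent(xml: str) -> str:
    """Merge directly adjacent, attribute-free elements of the same name."""
    return ADJACENT_SAME.sub("", xml)
```

Applied to the quoted example, merge_adjacent turns
<emph>One</emph><emph> Two</emph> into <emph>One Two</emph>. Elements
that carry attributes are deliberately left alone in this sketch.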
AD *blanks*

Well, in the previous m51, 6-blanks were added. This time (in m57) it
seems that only some of those were removed, most were not, and 9-blanks
and, less often, 6-blanks were added. These changes often occur in the
index entry strings, where we usually get:

"<bookmark_value>numbering; lists, while typing</bookmark_value>
<bookmark_value>bullet lists;creating while typing</bookmark_value>
<bookmark_value>lists;automatic numbering</bookmark_value>
<bookmark_value>numbers;lists</bookmark_value>
<bookmark_value>automatic bullets/numbers; AutoCorrect function</bookmark_value>
<bookmark_value>bullets; using automatically</bookmark_value>
<bookmark_value>paragraphs; automatic numbering</bookmark_value>"

AD *trailing spaces*

Trailing spaces are another big problem, and removing them automatically
may eliminate the spaces between sentences in the same paragraph. A lot
of strings in the help end with a single trailing space (at least in the
po files I am localizing; maybe these are not split in your original OOo
help content files, but they do get split on the way to the po files).
Some are stand-alone paragraphs and don't need their trailing space, but
the majority need the space to separate them from the string that
follows them in the paragraph.

So should one go through the help manually and see which of those
trailing spaces can or should be removed? Or should all strings of a
paragraph first be merged into single strings, so that the trailing
spaces can then be removed? If this splitting of paragraphs is done by
one of your macros or algorithms, what would eliminating the splitting
bring to the translators: a lot of untranslated strings? I mean, would
the gettext tools be wise enough to fuzzy match two strings together
into a new translation?

I have also "met" some strings in the help that have a space at the
beginning, probably because they continue a paragraph from a previous
help string (not many of those, but there are some).
Or maybe they are pure mistakes. This is why I am very exact about
leaving a trailing space when I see it in the original strings.

But what worries me most (I guess the blanks can and will be taken care
of easily) are massive changes of tags like:

<emph> -> <item type=\"menuitem\">

I could get a heart attack just thinking about what would happen if what
just happened in helpcontent2/source/text/swriter/guide.po (around 1000
changed strings out of 2100, mostly with these tags changed and some 6-
or 9-blanks added) happened to the other help files. There would be tens
of thousands of strings that all translators would need to edit
manually. Just a small regression from a translator's perspective.

When you stumble upon such a fuzzy string with changed tags and/or
unwanted blanks added, you do not and cannot know what other change
happened in the string. So you are careful and recheck the translation,
then repair the tags and remove the spaces. Sometimes you check the old
po file to see what was changed from the previous mXX. So in the end,
one changed string like that takes a translator more time than
translating it from scratch.

Please keep this in mind when you decide what to do with the changes in
m51 and m57 and about further steps in changing tags in these files.
Changing 2000 help strings, some intentionally by just replacing tags
and some by mistakenly adding blanks, means the same as 2000 new
untranslated strings.

So if such tag changes are planned, a special CWS could be created where
*all* such tags would be replaced. Then all translation teams would
deliver their current SDF files to Hamburg, where a special
super-trouper command line tool would replicate all the tag changes in
the translations as well (this would most probably be an error-prone
procedure, but at least it would carry over most of the changes
correctly; I guess 95% success would be more than enough, if the
error-prone strings got marked as fuzzy).
Then the CWS would be published in an mXY and the teams would get their
SDF->po files back and continue business as usual. If that is possible.

Thanks for listening,
m.
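PS: To make the idea of such a tool a bit more concrete, here is a very
rough Python sketch. It assumes that the only planned rewrite is
<emph>...</emph> to <item type="menuitem">...</item>, that it applies to
every <emph> in a string, and that the strings are already unescaped.
The names retag and migrate are hypothetical; nothing like this exists
as far as I know. A string is marked fuzzy whenever the source changed
in any way other than the tag rewrite, or when the translation does not
have the same number of <emph> elements as the old source.

```python
import re

# Assumed rewrite: <emph>...</emph> becomes <item type="menuitem">...</item>.
EMPH = re.compile(r"<emph>(.*?)</emph>")
ITEM = r'<item type="menuitem">\1</item>'


def retag(text: str) -> str:
    """Apply the assumed tag rewrite to one string."""
    return EMPH.sub(ITEM, text)


def migrate(old_src: str, new_src: str, old_trans: str) -> tuple[str, bool]:
    """Return (translation, fuzzy) for one string after the tag change."""
    if old_src == new_src:
        # Source untouched: keep the translation exactly as it is.
        return old_trans, False
    new_trans = retag(old_trans)
    same_tag_count = len(EMPH.findall(old_src)) == len(EMPH.findall(old_trans))
    only_tags_changed = retag(old_src) == new_src
    # Anything that is not a pure, matching tag rewrite needs a human look.
    return new_trans, not (only_tags_changed and same_tag_count)
```

A string whose source also gained stray blanks would come back with the
new tags but flagged fuzzy, so the translator sees only the cases that
really need checking.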