Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-27 Thread Jean-Christophe Helary

Javier,

I am glad we at last managed to agree on the most important:

if I generate TMX from PO, I should use it with POs, and if I  
generate from XLIFF, I should use it for XLIFF...


Yes, and if we generate TMX from SDF (like SUN's TMX) then it is  
supposed to work with SDF, which is why I proposed a way to  
work with SDF directly.



and then it works.


It does indeed. If communities want to work with the TMX that SUN  
provides then they can use the workflow I proposed and they'll see  
wonders.


I am afraid that at this point we do not have such a thing as  
correct/universal TMX files.


Agreed, a TMX depends on the original contents. And so it should  
match the format in which the original contents are expressed.


... and that there is no truth on this, just opinions and systems  
that work.


100% with you.


Jean-Christophe Helary


http://mac4translators.blogspot.com/

-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Jean-Christophe Helary

Thank you for the reality check, Alessandro.

Any other community willing to share experiences? I would really like  
to know what the commonly accepted best practices are for the current  
PO-based workflow.


I'd really like to know myself how people translate with the current  
workflow as I feel we're missing something.


In the Italian community we're currently translating most of our  
files directly on Pootle which may be considered a good translation  
workflow management system but a very poor translation editor.

So far we've tried different solutions:
- we downloaded the PO files and tried to translate them with OmegaT  
but we had problems with the TMX matching and with the reconversion  
to SDF (gsicheck errors);


That is correct. The PO and the TMX do not match, so translators  
must be extra careful when re-using content from the TMX; basically,  
that means manually adding all the extra \ escapes that PO has added.
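The escaping mismatch can be illustrated with a short sketch (the sample segment is invented, not actual OOo data): PO msgid/msgstr syntax backslash-escapes double quotes and backslashes, so a segment copied verbatim from the TMX is not valid PO text.

```python
# PO string syntax backslash-escapes double quotes and backslashes,
# so the same segment is written differently in a TMX file and a PO file.

def po_escape(s: str) -> str:
    """Escape a raw segment the way PO string syntax requires."""
    return s.replace("\\", "\\\\").replace('"', '\\"')

def po_unescape(s: str) -> str:
    """Recover the raw segment from its PO-escaped form."""
    out, i = [], 0
    while i < len(s):
        if s[i] == "\\" and i + 1 < len(s):
            out.append({"n": "\n", "t": "\t"}.get(s[i + 1], s[i + 1]))
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)

raw = 'Path: C:\\Program Files\\OpenOffice'  # as it would appear in the TMX
print(po_escape(raw))  # Path: C:\\Program Files\\OpenOffice
```

Going the other way (pasting a TMX match into a PO target) is exactly the manual re-escaping mentioned above.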


- we extracted XLIFF files from Pootle and tried to translate them  
with the OLT Editor but the tool didn't even open them as it  
considered the XLIFF files not well formed;


No comment here.

- we converted the PO files using the OLT Filters; it worked, but  
then it proved so slow in handling the TM that we had to give up on  
that;


Here, the idea would be to have the OLT filters handle the SDF  
format directly, but I fear that would not change much for the  
overall performance, unless the TMX files were trimmed down a little  
bit, maybe by having separate TMX files per module (which I suppose  
would shrink them to ~k segments each, instead of the 50k+20k chunks  
that we have now).


- we translated some of our content with poEdit but that editor is  
as poor as Pootle from this point of view (no TM and no glossary).


That is correct.

I tried to install KBabel on OS X yesterday and I see that it has  
limited TMX support, but had no time to check further. Plus, since  
the TMX contents and the PO contents do not match, we would have  
problems similar to those we had with OmegaT, I suppose.


So far I find the method for translating SDF files proposed by  
Jean-Christophe the best way to work on the translation, but it  
seems not to be compatible with Pootle, which we are using as well.  
What we, as translators, really need is a method to translate  
effectively using TM and glossaries, just like we do in the  
professional world. OmegaT would have it all: a glossary extracted  
from SunGloss can easily be converted for the tool and the OmegaT TM  
engine works very well... but then, obviously, we need a TM that  
matches the content to be translated.


Which is why the solution I proposed based on SDF is the best in my  
opinion.


Regarding Pootle, is it possible to upload the result after the  
translation is completed? If yes, you could translate based on SDF,  
convert the result with oo2po and upload that to Pootle to ensure your  
data is properly managed there.






Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Alessandro Cattelan

Jean-Christophe Helary wrote:


Regarding Pootle, is it possible to upload the result after the 
translation is completed? If yes, you could translate based on SDF, 
convert the result with oo2po and upload that to Pootle to ensure your 
data is properly managed there.


This could work but I haven't tried it yet. However, I'm afraid that the 
conversion could introduce some errors in the resulting files. 
Moreover, we'd have to ask Sun to deliver the files in two formats: SDF 
through Issuezilla and PO in Pootle. Anyway, we could try that for the 
next round of translation if no other suggestion comes up.

Ale.








--
Alessandro Cattelan
Freelance translator (EN, ES -- IT)
http://www.proz.com/profile/76355
Tel.: (+39) 338 1823554
Skype: acattelan
Yahoo! IM: alessandro.cattelan




Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Jean-Christophe Helary


On 26 Dec. 07, at 17:08, Yury Tarasievich wrote:

On Wed, 26 Dec 2007 09:56:57 +0200, Alessandro Cattelan [EMAIL PROTECTED] wrote:

...
translators, really need is a method to translate effectively using  
TM and glossaries just like we do in the professional world. OmegaT  
would have it all: a glossary extracted from SunGloss can easily be  
converted for the tool and the OmegaT TM engine works very well...  
but then, obviously, we need a TM that matches the content to be  
translated.


Maybe I'm missing something, but how can Sun's glossary/TMX or  
whatever be helpful without meta-information? No amount of toolchain  
change is going to address this by itself.


I think you are indeed missing something.

As Ale wrote, such meta-information can be added to the glossaries (in  
OmegaT's case, in the third column), to TMX files, or to XLIFF files.


TMX files can use the <note> element.
XLIFF files can use the <context> element.
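For instance, a TMX translation unit can carry meta-information in a <note> element. A minimal sketch with Python's standard library (the segment and note text are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Build a minimal TMX <tu> whose meta-information travels in a <note>.
tu = ET.Element("tu")
ET.SubElement(tu, "note").text = "UI: menu item, keep short"
tuv = ET.SubElement(tu, "tuv", {"xml:lang": "en-US"})
ET.SubElement(tuv, "seg").text = "~Open..."

print(ET.tostring(tu, encoding="unicode"))
```

An XLIFF trans-unit could carry the same kind of information in its <context-group>/<context> elements.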

Besides, glossary or TMX information in OmegaT (or anywhere else) is  
at best a suggestion for the translator, and the context can be  
provided by other means.


Other means include, but are not limited to, meta-information.  
Besides, for the meta-information to have any practical use, it needs  
to be directly available to, and processable by, the translator.


The focus on meta-information is valid as long as the data is  
automatically available to the processes. Currently that is not the  
case, or is it?


Since there are no tools that can automatically process the SDF meta- 
information in its current form, focusing on meta-information seems  
counterproductive to me.


Another way to support the translator is to provide external context  
for the strings. That can come from the translator's own experience  
(knowing the data set, having experience in the field, etc.), or from  
providing the data in external viewers: OOo's help viewer, screenshots,  
etc.



Maybe *I'm* not making myself intelligible? I'm talking about having  
things assigned to the strings, like a term variant, a type of use  
(menu/option/...), "keep short", etc. Currently such info often has  
to be deduced from the string ID, from a lucky probe in the UI, or  
even from digging in the sources.


Yes. That is correct. But in most cases the translator has  
enough common sense and external resources (the l10n community,  
experienced users, external context, etc.) to make up for the lack of  
meta-information, or for the lack of automatic access to it.


I fully understand that you want to provide the least error-prone  
workflow possible by using such meta-information, but in most cases  
this meta-information will not be available to the translators in a  
practical way. Last but not least, such meta-information is mostly  
useful for identifying UI items; for the whole rest of the  
translation process (terminology management, style management, etc.)  
it is simply useless.






Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Jean-Christophe Helary


On 26 Dec. 07, at 17:45, Yury Tarasievich wrote:

"Could" being the operative word here. See, I don't understand where  
you expect this info to actually come *from*. Somebody has to  
type in those thousands of meta-descriptors into the carrier file,  
after all.


Yury, your original question was:

Maybe I'm missing something, but how can Sun's glossary/TMX or  
whatever be helpful without meta-information? No amount of toolchain  
change is going to address this by itself.


The answer is simple.

In the case of SUN GLOSS, and for an OmegaT-centered process, you can  
leave the meta-information that SUN provides in its data as comments  
in the glossary file that OmegaT uses. When I write "you can", I mean  
it is trivial and can be done in a Calc sheet, for example.
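As a sketch, keeping that meta-information as the comment column of an OmegaT glossary could look like the following. The entries are invented, but the output format is OmegaT's actual glossary format: plain text, one entry per line, tab-separated source, target, and optional comment.

```python
def to_omegat_glossary(rows):
    """Render (source, target, comment) rows in OmegaT's glossary format:
    plain text, one entry per line, fields separated by tabs.
    OmegaT displays the third column as a comment next to the match."""
    return "".join(f"{src}\t{tgt}\t{note}\n" for src, tgt, note in rows)

# Hypothetical entries as they might come out of a SUN GLOSS export;
# the meta-information ends up in the comment column.
rows = [("spreadsheet", "foglio elettronico", "UI term, Calc"),
        ("macro", "macro", "do not translate")]
with open("glossary.txt", "w", encoding="utf-8") as f:
    f.write(to_omegat_glossary(rows))
```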


In the case of TMX/XLIFF, it can be done by properly using the  
relevant tags in the respective files. And that can be done with a  
script in the language of your choice. But for that, there first  
needs to be the _will_ to have a direct filter for the SDF format.


It might as easily be done with the extended SDF/FDS/whatever as  
with XLIFF, but resources ought to be dedicated beforehand. And so,  
in the case of a hypothetical format switch, resources ought to be  
dedicated twice. That's why I strongly doubt a format switch at  
this juncture would facilitate the filling of the meta-info slots.


As Javier put it, SDF is _not_ a localization format. That is what  
you seem not to understand in what I wrote.


We need a localization format (PO, XLIFF, key=value, anything) that  
matches the localization data SUN provides us with (TMX). This has  
nothing to do with developing, or not developing, the SDF format.






Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Jean-Christophe Helary


On 26 Dec. 07, at 17:51, Alessandro Cattelan wrote:


Jean-Christophe Helary wrote:

What are the practical benefits related to using Pootle ?


Basically, I see two main benefits:
- it lets you assign files to translators, so that you know who's  
translating a given file;
- it provides some statistics, so that you know at a glance how many  
files or words need to be translated.


OK, so the problem is that the current PO files, as provided by SUN  
using the oo2po conversion, do not match the TMX contents, so you  
can't work properly with them, right?


So, if we could have PO files that match the TMX contents, we could  
use Pootle for the file management and a different tool for the  
translation itself.


Is that correct ?





Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Alessandro Cattelan

Jean-Christophe Helary wrote:


On 26 Dec. 07, at 17:51, Alessandro Cattelan wrote:


Jean-Christophe Helary wrote:

What are the practical benefits related to using Pootle ?


Basically, I see two main benefits:
- it lets you assign files to translators, so that you know who's 
translating a given file;
- it provides some statistics, so that you know at a glance how many 
files or words need to be translated.


OK, so the problem is that the current PO files, as provided by SUN 
using the oo2po conversion, do not match the TMX contents, so you can't 
work properly with them, right?


So, if we could have PO files that match the TMX contents, we could 
use Pootle for the file management and a different tool for the 
translation itself.


Is that correct ?


Yes, it is.
Assuming that the different tool is OmegaT (I can't see any other OSS 
or free alternative out there), we should also try to improve PO 
translation in OmegaT... but that's another matter, and we'll discuss 
it next time we're translating PO files with it! I didn't take part 
in the latest translation round, but I've been reading on the Italian OOo 
L10N mailing list about various issues with OmegaT not handling the PO 
files correctly - I'm sorry, but I can't be more precise on that as I 
didn't follow the issue too closely.

A.










--
Alessandro Cattelan
Freelance translator (EN, ES -- IT)
http://www.proz.com/profile/76355
Tel.: (+39) 338 1823554
Skype: acattelan
Yahoo! IM: alessandro.cattelan




Re: [l10n-dev] PO and TMX contents do not match, help !

2007-12-26 Thread Javier SOLA

Jean-Christophe Helary wrote:

We need a localization format (PO, XLIFF, key=value, anything) that 
matches the localization data SUN provides us with (TMX).

I have read this 20 times in 2 days, and I disagree. We need to use TM, 
but we can perfectly well generate our own TM with the format that we have.


If we are using PO, we need TM that comes from PO. The format of the 
data is the format in which it will be used in PO. We can generate this 
with any tool (KBabel, poEdit), and use it with the same tool.
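Generating such an internal TM from translated PO files can be sketched as follows. The toy parser below only handles simple single-line entries (real PO files need a full parser, or the Translate Toolkit's po2tmx); the point is that the TMX segments keep PO's own escaping, so matches drop back cleanly into new PO files.

```python
import re
import xml.etree.ElementTree as ET

def po_pairs(po_text):
    """Yield (msgid, msgstr) pairs from simple single-line PO entries,
    skipping the header and untranslated entries (empty strings)."""
    for m in re.finditer(r'msgid "(.+)"\nmsgstr "(.+)"', po_text):
        yield m.group(1), m.group(2)

def po_to_tmx(po_text, srclang="en-US", tgtlang="it"):
    """Build a minimal TMX document from PO entries, keeping PO's escaping."""
    tmx = ET.Element("tmx", version="1.4")
    body = ET.SubElement(tmx, "body")
    for src, tgt in po_pairs(po_text):
        tu = ET.SubElement(body, "tu")
        for lang, text in ((srclang, src), (tgtlang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")

sample = 'msgid "~Open..."\nmsgstr "~Apri..."\n'
print(po_to_tmx(sample))
```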


We can do the same thing with XLIFF. We create internal TM from XLIFF 
files, and we use it with new XLIFF files. The format might be different 
from the one in PO, and different from the TM provided by SUN, but it 
can be used perfectly well, because it goes back to the same place that 
it was taken from.


We have been doing this for years with PO, and it worked... and it also 
works with XLIFF.


When we upgrade to new versions of OOo, we do not necessarily need to 
use TM; we can do a direct transfer of information from the old version 
of the file to the new version, based on ID matching, as we have been 
doing for years. TM helps with strings that are new (but equal to an 
old string) and with strings whose ID has changed, but these are a very 
small minority.
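The ID-based transfer described above amounts to a dictionary merge: translations keyed by string ID are carried from the old version into the new one, and only new or changed strings are left for the translator. A minimal sketch (the IDs and strings are invented):

```python
def transfer_by_id(old, new):
    """old: {string_id: (source, translation)} from the previous version.
    new: {string_id: source} from the new version.
    Returns (carried_over, ids_still_to_translate)."""
    carried, pending = {}, []
    for sid, source in new.items():
        if sid in old and old[sid][0] == source:
            carried[sid] = old[sid][1]  # same ID, unchanged source: keep it
        else:
            pending.append(sid)         # new string, or source changed
    return carried, pending

old = {"menu.file.open": ("~Open...", "~Apri..."),
       "menu.file.close": ("~Close", "~Chiudi")}
new = {"menu.file.open": "~Open...",        # unchanged
       "menu.file.close": "Close ~Window",  # source changed
       "menu.file.export": "~Export"}       # brand new
carried, pending = transfer_by_id(old, new)
```

TM then only needs to cover the `pending` strings, which matches the point that they are a small minority.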


Having the TM from SUN is very nice, but it is not necessary, as we 
have survived for many years without it, using internal TM. Maybe I am 
wrong, but as far as I understand, there is no technical advantage to 
using it.


At this point I believe that the best PO editor is KBabel, but it runs 
only on Linux. It does quite good TM matching, enough for the needs of 
OOo translation.


Javier
