OmegaT handles PO files pretty much as plain text files and thus does not
care about "\"; for OmegaT, "\" is just another character. Hence,
nothing in the screenshot I showed is generated by OmegaT. The files
are displayed exactly as they are.

Friedel,

I am not arguing for or against a certain way to display the data; I am just saying that OmegaT does not do anything to the data, and treats the "\" of the PO escapes as an ordinary character.

Unfortunately a PO file isn't just a text file. It is a file format that
presents data in a specific way. Escaping the backslash
(\) and the quotes (") is part of the format that we try to conform to.
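As an illustration (not from the thread): a toy sketch of the generic PO string escaping described above, covering only the backslash and double-quote escapes. A real PO parser also handles \n, \t and friends; the helper names are mine.

```python
import re

def po_escape(text):
    # Inside the double-quoted strings of a PO entry, "\" and '"'
    # must themselves be escaped with a backslash.
    return text.replace('\\', '\\\\').replace('"', '\\"')

def po_unescape(text):
    # Reverse pass: every "\X" collapses to "X" (ignores \n, \t, etc.).
    return re.sub(r'\\(.)', r'\1', text)

raw = r'a "\<" literal'
stored = po_escape(raw)            # what the PO file would contain
assert stored == r'a \"\\<\" literal'
assert po_unescape(stored) == raw  # round-trips back to the raw text
```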

Which is very good and OmegaT does not interfere with that.

<big_snip>

So, you see, the TMX does not exactly match the original .po file.
It does match the .sdf, but that is irrelevant here.

When I created the TMX using XLFEdit from Heartsome, I first took
the converted po, converted it to XLIFF and then exported it as TMX,
and the TMX contained the same number of escapes as the po.

I would consider this behaviour by the Heartsome tool to be a bug, to be
honest. Do they convert '<' to '&lt;'? Then they should also convert
the rest. I would say this is part of the rules of data conversion
between these formats.

I believe our conversion conforms to the XLIFF representation guide for
PO files:
http://xliff-tools.freedesktop.org/snapshots/po-repr-guide/wd-xliff-profile-po.html#s.general_considerations.escapechars

I think it follows logically that the same rules should apply for
converting to TMX.
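A minimal sketch of what the representation guide's approach implies, as I read it: generic PO escapes are resolved to the real characters, and the result is then XML-escaped for embedding in XLIFF or TMX. The function name is mine, not from the Translate Toolkit.

```python
import re
from xml.sax.saxutils import escape

def po_text_to_xml_text(po_string):
    # Resolve generic PO escapes: "\X" becomes "X" (a real converter
    # would also map \n, \t to the corresponding control characters).
    unescaped = re.sub(r'\\(.)', r'\1', po_string)
    # XML-escape the real characters for the XLIFF/TMX element content.
    return escape(unescaped)

print(po_text_to_xml_text(r'\\\<tag\\\>'))   # -> \&lt;tag\&gt;
```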

I have no idea who is right and who is wrong. What I can say is that Heartsome is _very_ strong when it comes to respecting standards. Besides, the document you quote has contributions from Rodolfo Raya, who is also a developer at Heartsome and who himself is extremely picky when it comes to standards compliance.

In "3.4. Handling of Escape Sequences in Software Messages", the text says, regarding a fragment that includes escape sequences like the ones we have here: "This fragment could be presented in XLIFF by preserving the escape sequences:"

etc. Of course it proposes rules for handling special escape sequences as opposed to generic escape sequences, but seemingly there is nothing wrong with keeping all the escape sequences.

What matters in the end is _not_ whether the PO has been through an XLIFF conversion process.

What matters is that:

1) I have a source po with \\\<this kind of things\\\>
2) my reference TMX should match that with \\\<that kind of things\\\> because it is created from a similar po file
3) but for some reason it provides only \\<this other kind of things\\>
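To make the mismatch concrete (a sketch, not the actual converter code): if a tool resolves PO escaping before writing the TMX, backslashes disappear and exact matching against the raw PO text fails. The exact escaping the real tools produce may differ in detail, but the effect is the same.

```python
import re

def resolve_po_escapes(s):
    # Hypothetical one-pass unescape: every "\X" becomes "X".
    return re.sub(r'\\(.)', r'\1', s)

po_text = r'\\\<this kind of things\\\>'
tmx_text = resolve_po_escapes(po_text)
assert tmx_text == r'\<this kind of things\>'
assert tmx_text != po_text   # so a TM match against the raw PO line fails
```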

Let me repeat myself. I have no issue with your processes and with your level of compliance with the proposed standards.

The only problem is that somewhere, the TMX conversion process loses data, and that impairs my ability to get leverage from it.

A somewhat separate issue for me is that the \< in the SDF file is also
an escape of that format. In reality it refers to just a left angle
bracket. The SDF format is, however, a bit strange in the way these are
used, and we might not want to change the way we handle the SDF escaping
while Pavel's POT files have a semi-official status. If we can agree on
how we interpret the escaping in the SDF file and coordinate the change,
we can probably make the lives of translators far easier by eliminating
much of the escaping.
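A sketch of what resolving the SDF escapes could look like, assuming (as stated above) that \< and \> in SDF stand for literal angle brackets. The sample string and the helper name are mine, purely for illustration.

```python
import re

def sdf_unescape(s):
    # Assumption: SDF uses simple backslash escapes for <, > and \ itself.
    return re.sub(r'\\([<>\\])', r'\1', s)

sample = r'\<ahelp hid=".uno:Open"\>Opens a file.\</ahelp\>'
print(sdf_unescape(sample))
# -> <ahelp hid=".uno:Open">Opens a file.</ahelp>
```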

I don't think the problem is in the oo2po process. Whatever the result, we are all starting from po anyway.

What is at stake here is that if I take a po created from an .sdf file and run po2tmx on it, the data in the TMX is different from the data in the po.

JC
