On Mon, Feb 28, 2005 at 12:55:11AM +0100, Boris Yakobowski wrote:
> On Sun, Feb 27, 2005 at 11:26:06PM +0100, Marcin Owsiany wrote:
> > As far as I know, potool knows nothing about encodings, so it should be
> > completely transparent to them, and just pass text from the po file in
> > whatever encoding it is, unchanged, to the temp file, and back. But I
> > may be wrong.
>
> Yes

Ah, so the issue is not that poedit performs some inappropriate recoding,
but that $EDITOR decides to interpret a file containing just US-ASCII text
as iso-8859-15, and not as UTF-8. But then after you input some non-US-ASCII
characters (which emacs encodes as iso-8859-15), poedit merges a UTF-8 and
an iso-8859-15 file.

> but I find the current behavior unsatisfactory because it is the
> responsibility of the user to set the correct encoding for the temporary
> file. Otherwise it is appended as is, in an incorrect way; in my case emacs
> saw the temporary file as an iso-8859-15 file (which was technically
> correct), and then poedit merged it as is with the original utf8 po file.

The problem is that the temporary file which poedit creates does not carry
any metadata that would indicate its encoding. Therefore emacs is free to
choose whatever encoding it feels is appropriate. And since on creation the
file contains pure US-ASCII, emacs chooses iso-8859-15.

> So
> there are two ways to correct this in my opinion :
> - the temporary file is created with the correct encoding

What do you mean by "with the correct encoding"? The problem is exactly
that for pure US-ASCII input, its iso-8859-15 and UTF-8 representations are
_exactly_ the same. So technically speaking, it _does_ have the correct
encoding.

> - the temporary file is converted after it has been saved, before being
> merged.

Since automagic detection of encoding (based just on the data) seems a very
risky business, in order to perform a conversion, two things would be
needed:
 - a specification of the target encoding (could be easily retrieved from
   the original po file Content-Type: header)
 - a specification of the source encoding, i.e. "what encoding $EDITOR
   chose to save your input in". I can't see how that could be done for
   any editor in general.

However, I can see a third possibility, namely to have poedit prepend a
Content-Type header, which would hopefully force $EDITOR into using the
correct (i.e. matching the initial po file) encoding for the following
input.

> I think just about anything would work ; besides it is highly locales and
> emacs/whatever dependent unfortunately...

By the way, doesn't something like:

  LC_CTYPE=fr_FR.UTF-8 poedit blah.po

provide a workaround? I guess that should force the editor into using UTF-8
as the tempfile encoding.

Marcin
-- 
Marcin Owsiany <[EMAIL PROTECTED]>              http://marcin.owsiany.pl/
GnuPG: 1024D/60F41216  FE67 DA2D 0ACA FC5E 3F75  D6F6 3A0D 8AA0 60F4 1216
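
P.S. To make the US-ASCII point concrete, here is a minimal sketch in
Python. It is only an illustration: the recode() helper and the sample
msgstr strings are made up for this example and are not part of potool or
poedit. It shows that pure US-ASCII text has identical bytes in iso-8859-15
and UTF-8, that the bytes diverge as soon as an accented character appears,
and that a conversion is only possible when both the source and the target
encoding are known.

```python
# Hypothetical helper, not part of potool: converts bytes between two
# encodings that the caller must already know.
def recode(data: bytes, source: str, target: str) -> bytes:
    """Decode data from the source encoding and re-encode it in the target."""
    return data.decode(source).encode(target)


# A temp file holding only US-ASCII: both encodings give the same bytes,
# so nothing in the data tells the editor which one was intended.
ascii_text = 'msgstr "hello"\n'
same = ascii_text.encode("iso-8859-15") == ascii_text.encode("utf-8")

# Once an accented character is typed, the two encodings differ.
accented = 'msgstr "\u00e9t\u00e9"\n'  # e-acute, t, e-acute
latin9_bytes = accented.encode("iso-8859-15")
utf8_bytes = accented.encode("utf-8")
differs = latin9_bytes != utf8_bytes

# Merging the iso-8859-15 bytes unchanged into a UTF-8 po file yields
# data that is not valid UTF-8.
try:
    latin9_bytes.decode("utf-8")
    valid_as_utf8 = True
except UnicodeDecodeError:
    valid_as_utf8 = False

# With both encodings known, the conversion itself is straightforward.
fixed = recode(latin9_bytes, "iso-8859-15", "utf-8")

print(same, differs, valid_as_utf8, fixed == utf8_bytes)
```

The hard part is not the recode() call but knowing what to pass as its
source argument, which is exactly the piece of information the editor does
not hand back.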