>>>>> "frank h." <[EMAIL PROTECTED]> (FH) wrote:

>FH> Hello,
>FH> I am using Mac Python 2.4.1 on Mac OS X 10.4 and I cannot seem to be able
>FH> to read from a latin-1 file and then write to a UTF8 file correctly

>FH> Using Textwrangler on OS X, I create a latin-1 file with some special
>FH> characters in it and save it as "test.txt"

>FH> I am reading the textfile as such:

>FH>    f = codecs.open('test.txt', 'r', 'latin-1')
>FH>    content = f.read()
>FH>    f.close()

>FH>    type(content)
>FH>    <type 'unicode'>

>FH> all good. I can even

>FH>    print content.encode('utf8')
>FH>    äöåäöäööåäöäöå

>FH> (having set the default encoding to 'utf8' with sys.setdefaultencoding
>FH> in sitecustomize.py).
>FH> Now I want to create a new utf8 file and write "content" into it. I do the
>FH> following:

>FH>    f=codecs.open('newtest.txt','w','utf-8')
>FH>    f.write(content)
>FH>    f.close()

>FH> my problem is, that when I open "newtest.txt" in Textwrangler again,
>FH> Textwrangler recognizes the file as "MacRoman" encoded and the content is
>FH> garbled.

Then that is Textwrangler's fault. Interpreting a utf-8 file as MacRoman
will indeed give garbage. Maybe you can configure Textwrangler to recognize
utf-8 files. Otherwise use an editor that does this well. This is not a
Python problem, as the file should be (and probably is) generated in utf-8.
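
If you want to check this without relying on the editor, something like
the following rough sketch may help. The function names are made up for
illustration. It reads the raw bytes to see whether they decode as utf-8,
shows what utf-8 bytes look like when misread as MacRoman, and writes the
file with a leading utf-8 byte order mark (BOM). Many editors use the BOM
as a hint for utf-8; whether Textwrangler does is an assumption you would
have to verify yourself.

```python
import codecs


def read_bytes(path):
    """Return the raw, undecoded bytes of the file."""
    f = open(path, 'rb')
    try:
        return f.read()
    finally:
        f.close()


def is_valid_utf8(path):
    """Return True if the file's bytes decode cleanly as utf-8."""
    try:
        read_bytes(path).decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


def misread_as_macroman(text):
    """Show how utf-8 encoded text looks to a reader that assumes MacRoman."""
    return text.encode('utf-8').decode('mac_roman')


def write_utf8_with_bom(path, text):
    """Write text as utf-8, preceded by the utf-8 byte order mark."""
    f = open(path, 'wb')
    try:
        f.write(codecs.BOM_UTF8 + text.encode('utf-8'))
    finally:
        f.close()
```

If is_valid_utf8('newtest.txt') returns True, the file Python wrote is
fine and only the editor's guess is wrong. The BOM variant is just one
possible workaround; some tools dislike a BOM at the start of a file, so
use it only if the editor needs the hint.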
-- 
Piet van Oostrum <[EMAIL PROTECTED]>
URL: http://www.cs.uu.nl/~piet [PGP 8DAE142BE17999C4]
Private email: [EMAIL PROTECTED]
_______________________________________________
Pythonmac-SIG maillist  -  Pythonmac-SIG@python.org
http://mail.python.org/mailman/listinfo/pythonmac-sig
