Re: [Tutor] appending to a utf-16 encoded text file

Mark Tolonen Tue, 21 Oct 2008 19:06:14 -0700

"Tim Golden" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]

Tim Brown wrote:

Hi,
I'm trying to create and append unicode strings to a utf-16 text file.
The best I could come up with was to use codecs.open() with an encoding of 'utf-16', but when I do an append I get another UTF-16 BOM put into the file, which other programs do not expect to see :-(
Is there some way to stop codecs from doing this or is there a better
way to create and add data to a utf-16 text file?



Well, there's nothing to stop you opening it "raw", as it were,
and just appending unicode encoded as utf16.

<code>
s = u"The cat sat on the mat"
f = open ("utf16.txt", "wb")
for word in s.split ():
 f.write (word.encode ("utf16") + " ")

f.close ()

</code>

TJG


Result: The＠揾愀琀 sat＠濾渀 the＠淾愀琀 

word.encode('utf16') adds a BOM every time, and the space wasn't encoded.
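
A quick way to see both problems, as a rough sketch (the variable names here are only for illustration): count the BOMs produced by encoding each word separately, and compare the size of a raw byte space with an encoded one.

```python
import codecs

s = u"The cat sat on the mat"

# Encode each word on its own, as the loop above does.
chunks = [word.encode("utf-16") for word in s.split()]

# Every chunk starts with a BOM (LE or BE, depending on the platform).
boms = (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
bom_count = sum(1 for chunk in chunks if chunk[:2] in boms)
print(bom_count)  # one BOM per word: 6

# A plain byte-string space is one byte; a UTF-16 space is two,
# so appending the raw space shifts the following text by one byte.
raw_space = b" "
encoded_space = u" ".encode("utf-16-le")
print(len(raw_space), len(encoded_space))  # 1 2
```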

utf-16-le and utf-16-be don't add the BOM.  This works:

import codecs
s = u"The cat sat on the mat"
f = codecs.open("utf16.txt", "wb", "utf-16-le")
f.write(u'\ufeff')  # if you want the BOM
for word in s.split():
    f.write(word + u' ')
f.close()
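
For the append case in the original question, the same idea can be sketched roughly as follows. This is only an illustration: append_utf16 is a made-up helper name, little-endian byte order is assumed, and it uses io.open rather than codecs.open. The BOM is written only when the file is new or empty, so later appends do not add another one.

```python
import io
import os


def append_utf16(path, text):
    """Append text to a UTF-16-LE file, writing the BOM only once.

    Hypothetical helper for illustration; assumes little-endian order.
    """
    # A BOM is only needed if the file does not exist yet or is empty.
    needs_bom = not os.path.exists(path) or os.path.getsize(path) == 0
    # 'utf-16-le' never adds a BOM by itself, so appends stay clean.
    with io.open(path, "a", encoding="utf-16-le") as f:
        if needs_bom:
            f.write(u"\ufeff")
        f.write(text)
```

Calling append_utf16("utf16.txt", u"The cat ") and then append_utf16("utf16.txt", u"sat on the mat") should leave a file with a single BOM at the start that decodes with the plain 'utf-16' codec.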

-Mark


_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor
