On Mon, Apr 6, 2009 at 6:48 PM, Pirritano, Matthew <[email protected]> wrote:
> Hello python people,
>
> I am a total newbie. I have a very large file > 4GB that I need to
> convert from Unicode to plain text. I used to just use dos when the file
> was < 4GB but it no longer seems to work. Can anyone point me to some
> python code that might perform this function?

What is the encoding of the Unicode file?
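
If you are not sure, the first few bytes of the file often tell you. The following is only a rough sketch: the function name guess_encoding_from_bom is made up for illustration, and it assumes the file starts with a byte order mark (BOM), which not every Unicode file has. A return value of None means no BOM was found, not that the file is plain ASCII.

```python
import codecs

# Longer marks are checked first: the UTF-32 LE BOM begins with the same
# two bytes as the UTF-16 LE BOM. The codec names returned here
# ('utf-32', 'utf-16', 'utf-8-sig') consume the BOM when decoding.
_BOMS = [
    (codecs.BOM_UTF32_LE, 'utf-32'),
    (codecs.BOM_UTF32_BE, 'utf-32'),
    (codecs.BOM_UTF8, 'utf-8-sig'),
    (codecs.BOM_UTF16_LE, 'utf-16'),
    (codecs.BOM_UTF16_BE, 'utf-16'),
]


def guess_encoding_from_bom(path):
    """Return a codec name based on the file's BOM, or None if there is none."""
    # Only the first four bytes are read, so file size does not matter.
    with open(path, 'rb') as f:
        head = f.read(4)
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

Files saved as "Unicode" by Windows tools are commonly UTF-16 with a BOM, but treat the result as a hint and check a sample of the decoded text.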

Assuming that each line of the file fits in memory, you can use the
codecs module to decode the Unicode. Something like this:

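import codecs

# Decode the input as UTF-16LE.
inp = codecs.open('Unicode_file.txt', 'r', 'utf-16le')
# Open the output for writing ('w'). The utf-8 target is an assumption;
# use 'ascii' or another codec if that is what "plain text" means for you.
outp = codecs.open('new_text_file.txt', 'w', 'utf-8')
# Iterating over inp yields one decoded line at a time, so the whole
# file is never held in memory.
outp.writelines(inp)
inp.close()
outp.close()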

The above code assumes that the input file is encoded as UTF-16LE; if
it is not, change the codec name to the correct one. A list of
supported encodings is here:
http://docs.python.org/library/codecs.html#id3
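
If the lines might not fit in memory (for example, if the file has very few line breaks), the same conversion can be done in fixed-size chunks. This is a minimal sketch rather than a tested recipe for your data: the function name convert_in_chunks, its parameters, the 1 MB default chunk size and the utf-8 target are all assumptions made for illustration.

```python
import codecs


def convert_in_chunks(src, dst, src_encoding='utf-16-le',
                      dst_encoding='utf-8', chunk_size=1024 * 1024):
    """Re-encode src into dst, reading at most chunk_size bytes at a time."""
    # Incremental codecs keep state between calls, so a chunk boundary
    # that falls in the middle of a character is handled correctly.
    decoder = codecs.getincrementaldecoder(src_encoding)()
    encoder = codecs.getincrementalencoder(dst_encoding)()
    with open(src, 'rb') as inp, open(dst, 'wb') as outp:
        while True:
            chunk = inp.read(chunk_size)
            if not chunk:
                break
            outp.write(encoder.encode(decoder.decode(chunk)))
        # final=True makes the decoder raise an error if the input
        # ended in the middle of a character.
        tail = decoder.decode(b'', final=True)
        outp.write(encoder.encode(tail, final=True))
```

It would be called as convert_in_chunks('Unicode_file.txt', 'new_text_file.txt'). Both files are opened in binary mode, so line endings pass through unchanged.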

Kent
_______________________________________________
Tutor maillist  -  [email protected]
http://mail.python.org/mailman/listinfo/tutor
