I did make a mistake; it should have been 'wU'. The starting data is ASCII.
What I'm doing is data processing on files that have newline and tab characters inside quoted fields. The idea is to convert all the newline and tab characters to 0x85 and 0x88 respectively, then process the files. Finally, right before importing them into a database, convert them back to newlines and tabs, thus preserving the field values. Will Python not handle the control characters correctly?

"Serge Orlov" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> On 6/27/06, Mike Currie <[EMAIL PROTECTED]> wrote:
>> I'm trying to write out files that have utf-8 characters 0x85 and 0x08 in
>> them. Every configuration I try I get a UnicodeError: ascii codec can't
>> decode byte 0x85 in position 255: ordinal not in range(128)
>>
>> I've tried using codecs.open('foo.txt', 'rU', 'utf-8', errors='strict')
>> and that doesn't work, and I've also tried wrapping the file in a
>> utf8_writer using codecs.lookup('utf8')
>>
>> Any clues?
>
> Use unicode strings for non-ascii characters. The following program
> "works":
>
> import codecs
>
> c1 = unichr(0x85)
> f = codecs.open('foo.txt', 'wU', 'utf-8')
> f.write(c1)
> f.close()
>
> But unichr(0x85) is a control character, are you sure you want it?
> What is the encoding of your data?

--
http://mail.python.org/mailman/listinfo/python-list
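For what it's worth, a minimal sketch of the round-trip described above (Python 3 syntax; the helper names `encode_field`/`decode_field` are illustrative, not from the thread):

```python
# Placeholder round-trip: swap embedded newlines/tabs for control
# characters before processing, restore them before the DB import.
NL_MARK = '\x85'   # stand-in for embedded newlines (U+0085)
TAB_MARK = '\x88'  # stand-in for embedded tabs (U+0088)

def encode_field(text):
    # Replace literal newlines/tabs with the placeholder characters
    return text.replace('\n', NL_MARK).replace('\t', TAB_MARK)

def decode_field(text):
    # Restore the original newlines/tabs just before the database import
    return text.replace(NL_MARK, '\n').replace(TAB_MARK, '\t')

field = 'line one\nstill field one\tend'
encoded = encode_field(field)
assert '\n' not in encoded and '\t' not in encoded
assert decode_field(encoded) == field

# Per Serge's point: keep the data as text (unicode) and let an
# encoding-aware file object do the UTF-8 encoding on the way out.
with open('foo.txt', 'w', encoding='utf-8') as f:
    f.write(encoded)
```

The key detail is that the replacement happens on text strings, not bytes; the encoding to UTF-8 is left to the file object, which is what avoids the "ascii codec can't decode" error.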