On 06/18/10 14:21, Rick Pasotto wrote:
>> Remember, even if your terminal display is restricted to ASCII, you can
>> still use Beautiful Soup to parse, process, and write documents in UTF-8
>> and other encodings. You just can't print certain strings with print.
>
> I can print the string fine. It's f.write(string_with_unicode) that fails
> with:
>
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 31-32:
> ordinal not in range(128)
>
> Shouldn't I be able to f.write() *any* 8bit byte(s)?
>
> repr() gives: u"Realtors\xc2\xae"
>
> BTW, I'm running python 2.5.5 on debian linux.
>

The FAQ explains half of it; in your case, just read "file object" wherever
it says "terminal". Python plays it safe and does not guess an encoding when
you write a unicode string to a file: it falls back to the default ASCII
codec, which fails on any non-ASCII character.

If you have a unicode string and want to .write() it to a file, you need to
.encode() it first:

    string_with_unicode = u"Realtors\xc2\xae"
    f.write(string_with_unicode.encode('utf-8'))

Otherwise, you can use the codecs module to wrap the file object:

    import codecs
    f = codecs.open('filename.txt', 'w', encoding="utf-8")
    f.write(string_with_unicode)  # now you can send unicode strings to f

_______________________________________________
Tutor maillist  -  Tutor@python.org
To unsubscribe or change subscription options:
http://mail.python.org/mailman/listinfo/tutor
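
If it helps to see the whole thing end to end, here is a minimal sketch of the explicit .encode() route. The sample string and the temporary file are my own placeholders, not anything from your script, and I have used try/finally rather than "with" so that it should also run on an interpreter as old as 2.5.

```python
import os
import tempfile

# Sample value (an assumption, not the original data): "Realtors" + U+00AE.
text = u"Realtors\xae"

# Writing a unicode string to a plain file on Python 2 goes through the
# default ASCII codec; encoding explicitly with ASCII shows the same failure.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# Encoding explicitly produces ordinary bytes that a binary file accepts.
encoded = text.encode("utf-8")

# Hypothetical scratch file, used only for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
try:
    f = open(path, "wb")
    try:
        f.write(encoded)
    finally:
        f.close()

    f = open(path, "rb")
    try:
        raw = f.read()
    finally:
        f.close()
finally:
    os.remove(path)

# Decoding with the same codec recovers the original unicode string.
round_trip = raw.decode("utf-8")
```

The point of the round trip is that the file only ever holds bytes. Whichever codec you pick when writing is the one you have to name again when reading the file back.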