> OK, so newline is unicode, outfile.write() wants a plain string. What
> encoding do you want outfile to be in? Try something like
>     outfile.write(newline.encode('utf-8'))
> or use the codecs module to create an output that knows how to encode.

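For anyone finding this in the archives, here is a minimal sketch of the two suggestions quoted above, side by side. The sample string and the temporary file names are made up for illustration; they are not from my actual script.

```python
import codecs
import os
import tempfile

# Hypothetical sample line containing a non-ASCII character.
newline = u'caf\u00e9\n'

# Hypothetical scratch files, only for this demonstration.
tmpdir = tempfile.mkdtemp()
path_manual = os.path.join(tmpdir, 'manual.txt')
path_codecs = os.path.join(tmpdir, 'codecs.txt')

# Option 1: encode by hand, then write the resulting bytes
# to a file opened in binary mode.
with open(path_manual, 'wb') as outfile:
    outfile.write(newline.encode('utf-8'))

# Option 2: open the file through codecs so it encodes
# unicode text to UTF-8 on every write.
with codecs.open(path_codecs, 'wb', 'utf-8') as outfile:
    outfile.write(newline)

# Read both files back as raw bytes to compare them.
with open(path_manual, 'rb') as f:
    manual_bytes = f.read()
with open(path_codecs, 'rb') as f:
    codecs_bytes = f.read()
```

Both routes should leave the same bytes on disk. The difference is only where the encoding step lives: in every write call, or once when the file is opened.
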
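Aha!! The second of the two options above did the trick! It appears I needed to open my "outfile" with utf-8 encoding, so that it knows how to encode the unicode strings I hand to write(). After that, I was able to write out cleaned lines without any hitches. Below is the working code. And of course, many thanks for the help!!

    import codecs

    # strip_html() and translate_code() are my own helpers, defined elsewhere (not shown).
    infile = open('test.txt', 'rb')
    #infile = codecs.open('test.txt', 'rb', 'utf-8')
    # Opening the output through codecs makes write() encode unicode as utf-8.
    outfile = codecs.open('test_cleaned.txt', 'wb', 'utf-8')

    for line in infile:
        cleanline = strip_html(translate_code(line)).strip()
        # Skip lines that are empty after cleaning.
        if cleanline:
            outline = cleanline + '\n'
            outfile.write(outline)

    # Close both files so the output is flushed to disk.
    infile.close()
    outfile.close()

_______________________________________________
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor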