> For the reason BK explained, the important difference is that I ran in
> the IDLE shell, which handles screen printing of unicode better ;-)

Something still does not seem right to me here. In the example above, the bytes were decoded as UTF-8 with the replace option, so any bytes that were not valid UTF-8 were replaced, and the resulting string is '\ufffdabc', as BK explained. I understand that the replace worked.

Now consider this:

Python 3.1.1 (r311:74483, Aug 17 2009, 16:45:59) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> s = '\ufffdabc'
>>> print(s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "p:\SW64\Python.3.1.1\lib\encodings\cp437.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_map)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 0: character maps to <undefined>
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'

This fails for the exact same reason, and it doesn't involve decode at all. In the original example I decoded as UTF-8, and in this example the default encoding is UTF-8, so why is cp437 being used?

Thanks in advance for your assistance!
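
For anyone who wants to reproduce this, here is a minimal sketch that may help narrow it down. It relies on print() writing through sys.stdout, which encodes text using sys.stdout.encoding (the console code page in a Windows command prompt), while sys.getdefaultencoding() reports a separate setting. The helper can_encode is a hypothetical name introduced only for this illustration; it is not part of the original session.

```python
import sys

s = '\ufffdabc'

def can_encode(text, encoding):
    """Hypothetical helper: report whether text can be encoded with the given codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# These are two separate settings; on a Windows console they often differ.
print("default encoding:", sys.getdefaultencoding())
print("stdout encoding: ", sys.stdout.encoding)

# U+FFFD exists in UTF-8 but has no mapping in code page 437.
print("utf-8 ok:", can_encode(s, 'utf-8'))
print("cp437 ok:", can_encode(s, 'cp437'))

# Encoding explicitly with an error handler avoids the exception,
# at the cost of substituting '?' for the unencodable character.
print(s.encode('cp437', errors='replace'))
```

If the stdout encoding printed on the machine above is cp437, that would account for the cp437.py frame in the traceback even though the default encoding is UTF-8.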