On Sat, Oct 24, 2009 at 8:47 PM, Joe <joesalm...@hotmail.com> wrote:
>> For the reason BK explained, the important difference is that I ran in
>> the IDLE shell, which handles screen printing of unicode better ;-)
>
> Something still does not seem right here to me.
>
> In the example above the bytes were decoded to 'UTF-8' with the
> replace option so any characters that were not UTF-8 were replaced and
> the resulting string is '\ufffdabc' as BK explained. I understand
> that the replace worked.
>
> Now consider this:
>
> Python 3.1.1 (r311:74483, Aug 17 2009, 16:45:59) [MSC v.1500 64 bit (AMD64)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = '\ufffdabc'
> >>> print(s)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "p:\SW64\Python.3.1.1\lib\encodings\cp437.py", line 19, in encode
>     return codecs.charmap_encode(input,self.errors,encoding_map)[0]
> UnicodeEncodeError: 'charmap' codec can't encode character '\ufffd' in position 0: character maps to <undefined>
> >>> import sys
> >>> sys.getdefaultencoding()
> 'utf-8'
>
> This too fails for the exact same reason (and doesn't involve decode).
>
> In the original example I decoded to UTF-8 and in this example the
> default encoding is UTF-8 so why is cp437 being used?
>
> Thanks in advance for your assistance!
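
The traceback above comes from print() encoding the string for the console, which is a separate setting from sys.getdefaultencoding(). Here is a minimal sketch that shows the two values side by side and prints without raising. The helper name console_safe is my own invention for illustration, not anything in the standard library, and I am assuming that replacing unencodable characters with '?' is acceptable for your output:

```python
import sys

def console_safe(text, encoding=None):
    """Return text with characters the target encoding cannot represent replaced by '?'."""
    # Hypothetical helper: fall back to ASCII if stdout reports no encoding.
    encoding = encoding or getattr(sys.stdout, "encoding", None) or "ascii"
    # Round-trip through the target encoding; errors="replace" substitutes '?'
    # for anything the codec cannot encode instead of raising.
    return text.encode(encoding, errors="replace").decode(encoding)

# Python's default for str/bytes conversions; 'utf-8' on Python 3.
print("default encoding:", sys.getdefaultencoding())
# What print() actually encodes to; on a Windows console this follows the code page.
print("stdout encoding: ", getattr(sys.stdout, "encoding", None))

s = '\ufffdabc'
print(console_safe(s))           # does not raise UnicodeEncodeError
print(console_safe(s, "cp437"))  # cp437 has no U+FFFD, so this prints ?abc
```

If the two printed encodings differ, that explains why a string that is perfectly valid in Python still fails at print() time.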

Try checking sys.stdout.encoding. Then run the command chcp (at the
cmd.exe prompt, not in the Python interpreter). You'll probably get 437
from both of those. Just because Python's default encoding is utf-8
doesn't mean the console's is. Hardly anybody uses cp437 anymore; for most
Windows programs it was replaced years ago by cp1252, but Microsoft is
scared to do anything to cmd.exe because it might break somebody's
20-year-old DOS script.

-- 
http://mail.python.org/mailman/listinfo/python-list