I am new to Unicode, so please bear with my stupidity. I am doing the following in a Python IDE called Wing with Python 2.3:

>>> s = "äöü"
>>> print s
äöü
>>> print s
äöü
>>> s
'\xc3\xa4\xc3\xb6\xc3\xbc'
>>> s.decode('utf-8')
u'\xe4\xf6\xfc'
>>> u = s.decode('utf-8')
>>> u
u'\xe4\xf6\xfc'
>>> print u.encode('utf-8')
äöü
>>> print u.encode('latin1')
äöü

Why can't I get äöü printed from utf-8 when I can from latin1? How can I use utf-8 exclusively and still be able to print the characters?

I also did the same thing on the same machine in a command window:

ActivePython 2.3.2 Build 230 (ActiveState Corp.) based on
Python 2.3.2 (#49, Oct 24 2003, 13:37:57) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> s = "äöü"
>>> print s
äöü
>>> s
'\x84\x94\x81'
>>> s.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x84 in position 0: unexpected code byte
>>> u = s.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeDecodeError: 'utf8' codec can't decode byte 0x84 in position 0: unexpected code byte
>>>

Why is there such a difference between the IDE and the command window, both in what they can do and in the internal representation of the Unicode characters?

Thanks,
Shel