Ron Garret <rnospa...@flownet.com> writes:
> Put this another way: I would have thought that when the Python parser
> parses "u'\xb5'" it would produce the same result as calling
> unicode('\xb5'), but it doesn't. Instead it seems to produce the same
> result as calling unicode('\xb5', 'latin-1'). But my default encoding
> is not latin-1, it's ascii. So where is the Python parser getting its
> encoding from? Why does parsing "u'\xb5'" not produce the same error
> as calling unicode('\xb5')?

There is no encoding involved other than ascii, only the processing of a
backslash escape. Inside a unicode literal, the escape '\xb5' is converted
directly to the Unicode character whose ordinal number is B5 hex (U+00B5);
no byte string is ever decoded. This gives the same result as
"\xb5".decode("latin-1") only because Unicode numbers its first 256
characters exactly as latin-1 does.

-M-

-- 
http://mail.python.org/mailman/listinfo/python-list
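
The point above can be checked with a small sketch. It is written for Python 3 as an assumption on my part (the thread is about Python 2, where the failing call would be unicode('\xb5') and the byte string would be a plain '\xb5'). The names raw, lit, via_latin1, all_match and decodes_as_ascii are made up for the illustration.

```python
# Python 3 sketch (assumption: the original discussion concerns Python 2,
# where the equivalent failing call is unicode('\xb5')).

raw = b'\xb5'   # one byte with value 0xB5; decoding it needs a codec
lit = u'\xb5'   # escape handled by the parser: the character U+00B5


def decodes_as_ascii(data):
    """Return True if the bytes decode as ASCII, False otherwise."""
    try:
        data.decode('ascii')
        return True
    except UnicodeDecodeError:
        return False


# The literal is simply the character whose ordinal is 0xB5.
print(ord(lit) == 0xB5)

# Decoding the byte as latin-1 happens to give the same character ...
via_latin1 = raw.decode('latin-1')
print(lit == via_latin1)

# ... while decoding it as ASCII fails, since 0xB5 is outside 0..127.
print(decodes_as_ascii(raw))

# latin-1 maps every byte value to the code point with the same number,
# which is why the two results coincide.
all_match = all(bytes([i]).decode('latin-1') == chr(i) for i in range(256))
print(all_match)
```

So the agreement with latin-1 comes from the numbering, not from the parser consulting any encoding.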