On Thu, Aug 17, 2017 at 6:53 PM, Chris Angelico <ros...@gmail.com> wrote:
> That doesn't work for everything, though. The 0x81 0x81 and 0x9d ones
> are still a puzzle.

I'm fairly sure that b'M\x81\x81\xfcnster' is 'Münster': with the \x81 bytes removed, it decodes to exactly that in Latin-1. The question then is what those extra bytes are doing there. I suspect that they, like 0x9d, are just non-printing control bytes from the C1 set (0x80-0x9f) that got inserted into the byte stream somehow.
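
A minimal sketch of that check, assuming every byte in the C1 range (0x80-0x9f) is junk in this data. The names strip_c1 and C1_BYTES are made up for illustration. The assumption would be wrong for genuine Windows-1252 text, where most of that range holds printable characters such as curly quotes.

```python
raw = b'M\x81\x81\xfcnster'

# Assumption: bytes 0x80-0x9f (the C1 control range in Latin-1) carry no
# real text in this data, so they can simply be dropped.
C1_BYTES = bytes(range(0x80, 0xA0))


def strip_c1(data: bytes) -> bytes:
    """Return data with every byte in the C1 range removed."""
    # bytes.translate(None, delete) deletes the listed bytes and leaves
    # everything else untouched.
    return data.translate(None, C1_BYTES)


cleaned = strip_c1(raw)
text = cleaned.decode('latin-1')
print(text)
```

Decoding the raw bytes as Latin-1 without stripping does not fail. It just leaves the invisible characters U+0081 in the resulting string, which is why the junk is easy to miss.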