Re: What extended ASCII character set uses 0x9D?

Ian Kelly Thu, 17 Aug 2017 19:28:48 -0700

On Thu, Aug 17, 2017 at 8:15 PM, MRAB <[email protected]> wrote:
> On 2017-08-18 01:53, Chris Angelico wrote:
>> So here's an insane theory: something attempted to lower-case the byte
>> stream as if it were ASCII. If you ignore the high bit, 0xC5 looks
>> like 0x45 or "E", which lower-cases by having 32 added to it, yielding
>> 0xE5. Reversing this transformation yields sane data for several of
>> your strings - they then decode as UTF-8:
>>
>> miguel Ángel santos
>
>
> I think that's:
>
> miguel ángel santos


It would be if it had been lower-cased correctly. The UTF-8 for á is
\xc3\xa1, not \xe3\x81 (ironically the add-32 method still works in
this particular case; it was just added to the wrong byte).
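
To make the byte arithmetic concrete, here is a minimal sketch in Python 3. The function naive_ascii_lower is a hypothetical reconstruction of the theorized bug (lower-casing every byte as if the high bit did not exist); nothing in the thread confirms that the real software worked this way.

```python
def naive_ascii_lower(data: bytes) -> bytes:
    """Hypothetical buggy lower-casing: ignore the high bit of each byte
    and add 32 to anything that then looks like an ASCII capital letter."""
    out = bytearray()
    for b in data:
        if 0x41 <= (b & 0x7F) <= 0x5A:  # looks like 'A'..'Z' without the high bit
            b += 0x20
        out.append(b)
    return bytes(out)


upper = "Á".encode("utf-8")         # lead byte 0xC3, continuation byte 0x81
mangled = naive_ascii_lower(upper)  # 0xC3 masks to 0x43 ("C"), so it gets +32
correct = "á".encode("utf-8")       # what a proper lower-casing should produce

print(upper.hex(), mangled.hex(), correct.hex())
# The correct result differs from the original only in the second byte,
# and that difference is also exactly 32.
print(hex(correct[1] - upper[1]))
```

The mangled form has 32 added to the lead byte, whereas the correct lower-case form has 32 added to the continuation byte.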
-- 
https://mail.python.org/mailman/listinfo/python-list
