On Fri, Aug 18, 2017 at 5:11 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
> Chris Angelico <ros...@gmail.com>:
>
>> On Fri, Aug 18, 2017 at 4:57 PM, Marko Rauhamaa <ma...@pacujo.net> wrote:
>>> Chris Angelico <ros...@gmail.com>:
>>>
>>>> On Fri, Aug 18, 2017 at 4:38 PM, Paul Rubin <no.email@nospam.invalid>
>>>> wrote:
>>>>> John Nagle <na...@animats.com> writes:
>>>>>> Since, as someone pointed out, there was UTF-8 which had been
>>>>>> run through an ASCII-type lower casing algorithm
>>>>>
>>>>> I spent a few minutes figuring out if some of the mysterious 0x81's
>>>>> could be from ASCII-lower-casing some Unicode combining characters,
>>>>> but the numbers didn't seem to work out. Might still be worth looking
>>>>> for in some other cases.
>>>>
>>>> They can't be from anything like that. Lower-casing in ASCII consists
>>>> of adding 32 (or setting the fifth bit) on certain byte/character
>>>> values.
>>>
>>> How about lower-casing?
>>
>> Huh?
>
> s/lower/upper/

Ohh. We have no evidence that uppercasing is going on here, and a
naive ASCII upper-casing wouldn't produce 0x81 either - if it did, it
would also convert 0x21 ("!") into 0x01 (SOH, a control character). So
this one's still a mystery.

ChrisA
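
PS. For anyone who wants to check the arithmetic, here is a rough sketch
in Python. The function names are made up for illustration, and
blind_upper is a hypothetical broken upper-caser (it clears the 0x20 bit
on every byte, not just on letters); nothing in this thread suggests the
data actually went through code like this.

```python
def naive_ascii_lower(data: bytes) -> bytes:
    """Set the 0x20 bit, but only on bytes in the range 'A'..'Z'."""
    return bytes(b | 0x20 if 0x41 <= b <= 0x5A else b for b in data)


def naive_ascii_upper(data: bytes) -> bytes:
    """Clear the 0x20 bit, but only on bytes in the range 'a'..'z'."""
    return bytes(b & 0xDF if 0x61 <= b <= 0x7A else b for b in data)


def blind_upper(data: bytes) -> bytes:
    """Hypothetical broken upper-caser: clear 0x20 on every byte."""
    return bytes(b & 0xDF for b in data)


# Range-checked case conversion leaves non-ASCII bytes alone, so the
# UTF-8 encoding of U+00C9 (0xC3 0x89) passes through unchanged.
print(naive_ascii_lower(b"ABC\xc3\x89!"))
print(naive_ascii_upper(b"abc\xc3\x89!"))

# Setting the 0x20 bit can never yield 0x81, because 0x81 has that
# bit clear; so even a blind lower-caser is ruled out.
print(any((b | 0x20) == 0x81 for b in range(256)))

# A blind upper-caser could turn 0xA1 into 0x81, but it would also
# turn "!" (0x21) into SOH (0x01).
print(blind_upper(b"\xa1!"))
```

The range checks are what make the first two functions "naive but
sane": they only ever touch the 52 ASCII letters, so they cannot
manufacture a 0x81 byte from anything.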