On 30/10/2013 15:34, Frédéric Grosshans wrote:
On 29/10/2013 17:15, "Jörg Knappen" wrote:
After running this script, a few more problems remained: non-normalised accents and some really strange encodings whose meaning I could not explain, only guess, such as
s/Ãœ/Ü/g
s/Ã‰/É/g
s/AÌ€/À/g
s/aÌ€/à/g
s/EÌ€/È/g
s/eÌ€/è/g
s/â€ž/„/g
s/â€œ/“/g
s/ÃŸ/ß/g
s/â€™/’/g
s/Ã„/Æ/g

This was probably not UTF-8 read as Latin-1 and re-encoded as UTF-8, but UTF-8 read as Windows-1252 ( http://en.wikipedia.org/wiki/Windows-1252 ) and re-encoded as UTF-8. Each of the combinations above contains a character absent from Latin-1 (œ‰€žŸ™„), and some of these characters exist only in Windows-1252 (‰™„) and not in ISO 8859-15 (Latin-9), the other plausible culprit.
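For anyone who wants to reproduce this, here is a minimal Python sketch of the hypothesis. The helper names (make_mojibake, fix_mojibake, encodable) are mine, not from Jörg's script, and it relies on Python's own "cp1252", "latin-1" and "iso8859-15" codecs as stand-ins for the three code pages.

```python
def make_mojibake(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as Windows-1252."""
    # Note: Python's cp1252 codec leaves 0x81, 0x8D, 0x8F, 0x90 and 0x9D
    # undefined, so characters whose UTF-8 form contains them raise here.
    return text.encode("utf-8").decode("cp1252")


def fix_mojibake(text: str) -> str:
    """Undo the misreading: back to bytes via Windows-1252, then UTF-8."""
    return text.encode("cp1252").decode("utf-8")


def encodable(ch: str, codec: str) -> bool:
    """True if the character exists in the given single-byte code page."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


# The characters that show up inside the garbled sequences: œ ‰ € ž Ÿ ™ „
SUSPECTS = "\u0153\u2030\u20ac\u017e\u0178\u2122\u201e"

not_in_latin1 = [c for c in SUSPECTS if not encodable(c, "latin-1")]
only_in_cp1252 = [c for c in SUSPECTS if not encodable(c, "iso8859-15")]

if __name__ == "__main__":
    for ch in ["\u00dc", "\u00c9", "\u00df", "\u201e", "\u2019", "\u00c4"]:
        garbled = make_mojibake(ch)
        print(ch, "->", garbled, "->", fix_mojibake(garbled))
    print("absent from Latin-1:   ", " ".join(not_in_latin1))
    print("absent from ISO 8859-15:", " ".join(only_in_cp1252))
```

Since the round trip is a plain encode/decode pair, fix_mojibake could in principle replace the whole list of sed rules, as long as the input really went through Windows-1252.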

I've checked that this is consistent with Ãœ, Ã‰ and ÃŸ, but not with your Æ. This double encoding would give Ä: Ã„ = Win1252(C3 84) = 110.00011 10.000100 = UTF8(00011 000100) = Unicode 00C4 = Ä (and not Æ).
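The bit arithmetic above can be replayed by hand. This is only a sketch: decode_two_byte_utf8 is a hypothetical helper that handles the two-byte UTF-8 form (110xxxxx 10yyyyyy) and nothing else.

```python
def decode_two_byte_utf8(b1: int, b2: int) -> int:
    """Return the code point of a two-byte UTF-8 sequence 110xxxxx 10yyyyyy."""
    if b1 >> 5 != 0b110 or b2 >> 6 != 0b10:
        raise ValueError("not a two-byte UTF-8 sequence")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation.
    return ((b1 & 0b00011111) << 6) | (b2 & 0b00111111)


# The garbled pair A-tilde + low double quote, taken back to Windows-1252 bytes.
raw = "\u00c3\u201e".encode("cp1252")
code_point = decode_two_byte_utf8(raw[0], raw[1])

# For comparison: the pair that a real AE ligature would have turned into.
ae_garbled = "\u00c6".encode("utf-8").decode("cp1252")

if __name__ == "__main__":
    print(" ".join(f"{b:02X}" for b in raw), "-> U+%04X" % code_point, chr(code_point))
    print("AE would appear as:", ae_garbled)
```

The second part shows that a genuine Æ would have been garbled into Ã followed by a dagger (†), not a low double quote („), which is why the last rule looks wrong.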

I've also checked the other combinations, including Ì€ = U+0300 COMBINING GRAVE ACCENT, and everything is consistent with Windows-1252, except your Æ, which should be Ä.
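The combining accent case can be checked the same way. The sketch below assumes the same Windows-1252 round trip and then applies NFC normalisation from Python's standard unicodedata module, which would also take care of the non-normalised accents; the variable names are only illustrative.

```python
import unicodedata

# "A" followed by the garbled pair I-grave + euro sign, as in the sed rules.
garbled = "A\u00cc\u20ac"

# Undo the Windows-1252 misreading: the bytes CC 80 are UTF-8 for U+0300.
repaired = garbled.encode("cp1252").decode("utf-8")
names = [unicodedata.name(c) for c in repaired]

# NFC merges the base letter and the combining accent into one code point.
composed = unicodedata.normalize("NFC", repaired)

if __name__ == "__main__":
    print(names)
    print(repr(composed), unicodedata.name(composed))
```

Doing the repair first and the normalisation second means a single pair of generic steps covers the À, à, È and è rules.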

    Frédéric

