On 30/10/2013 15:34, Frédéric Grosshans wrote:
On 29/10/2013 17:15, "Jörg Knappen" wrote:
After running this script, a few more problems remained:
non-normalised accents and some really strange
encodings whose meanings I could not really explain, only guess,
such as
s/Ãœ/Ü/g
s/Ã‰/É/g
s/AÌ€/À/g
s/aÌ€/à/g
s/EÌ€/È/g
s/eÌ€/è/g
s/â€ž/„/g
s/â€œ/“/g
s/ÃŸ/ß/g
s/â€™/’/g
s/Ã„/Æ/g
It was probably not UTF-8 read as Latin-1 and re-encoded as UTF-8, but
UTF-8 read as Windows-1252 (
http://en.wikipedia.org/wiki/Windows-1252 ) and re-encoded as UTF-8.
Each of the combinations above contains a character absent from Latin-1
(œ‰€žŸ™„), and some of them are only present in Windows-1252 (‰™„) and
not in Latin-9 (ISO 8859-15), the other possible mistake.
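The double encoding described above can be reproduced in a few lines. This is only a minimal sketch, assuming Python 3 and its built-in "cp1252" codec; the names garble and repair are made up for illustration and are not part of the original script.

```python
def garble(text: str) -> str:
    """Encode as UTF-8, then misread the bytes as Windows-1252."""
    return text.encode("utf-8").decode("cp1252")


def repair(text: str) -> str:
    """Undo garble(): recover the UTF-8 bytes and decode them properly."""
    return text.encode("cp1252").decode("utf-8")


def in_latin1(ch: str) -> bool:
    """True if the character exists in ISO 8859-1 (Latin-1)."""
    try:
        ch.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# The right-hand sides of the substitutions; "A\u0300" is A followed by
# U+0300 COMBINING GRAVE ACCENT (the non-normalised form of À).
samples = ["Ü", "É", "A\u0300", "„", "“", "ß", "’", "Ä", "Æ"]

for s in samples:
    g = garble(s)
    # Characters of the garbled form that cannot come from Latin-1.
    outside = [c for c in g if not in_latin1(c)]
    print(repr(s), "->", repr(g), "non-Latin-1:", outside)
```

Note that Python's cp1252 codec rejects the five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D), so repair() is only expected to work on text like the samples here, where none of those bytes occur.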
I've checked that this is consistent with Ü, É and ß, but not with your
Æ. This double encoding would give Ä:
Ã„ = Win1252(C3 84) = 110.00011 10.000100 = UTF8(00011 000100) = Unicode
00C4 = Ä (and not Æ)
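The bit arithmetic above can be checked mechanically. The following is a small sketch in Python 3, covering only two-byte UTF-8 sequences; decode_two_byte is a hypothetical helper written for this example, not a library function.

```python
def decode_two_byte(lead: int, cont: int) -> int:
    """Decode a two-byte UTF-8 sequence (110xxxxx 10yyyyyy) by hand."""
    if lead >> 5 != 0b110:
        raise ValueError("lead byte is not of the form 110xxxxx")
    if cont >> 6 != 0b10:
        raise ValueError("continuation byte is not of the form 10yyyyyy")
    # Keep the 5 payload bits of the lead byte and the 6 of the continuation.
    return ((lead & 0b11111) << 6) | (cont & 0b111111)


# C3 84 are the Windows-1252 byte values of the two garbled characters.
code_point = decode_two_byte(0xC3, 0x84)
print(f"U+{code_point:04X}", chr(code_point))

# Cross-check against Python's own UTF-8 decoder.
assert chr(code_point) == bytes([0xC3, 0x84]).decode("utf-8")
```

For comparison, Æ is U+00C6, which would need the bytes C3 86 rather than C3 84.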
I've also checked the other combinations, including ̀ = U+0300
COMBINING GRAVE ACCENT, and everything is consistent with Windows-1252,
except your Æ, which should be Ä.
Frédéric