On 7 December 2011 12:13, Zhi Yong Wu <zwu.ker...@gmail.com> wrote:
> Can you let me know how you see that it is ISO-8859-1 coding, not
> UTF-8? They look same to me.

This gets a bit confusing because mail clients and
web browsers tend to try to fix up what they think are
wrongly labelled encodings, so for example in my mail
client the diff looks like it is not changing anything.
However, if you look at the file in the current git repo
with a hex editor:
00000000  23 20 32 30 30 34 2d 30  33 2d 31 36 20 48 61 6c  |# 2004-03-16 Hal|
00000010  6c 64 f3 72 20 47 75 f0  6d 75 6e 64 73 73 6f 6e  |ld.r Gu.mundsson|

you can see that there is a 0xf3 at offset 0x12, which
is LATIN SMALL LETTER O WITH ACUTE in ISO-8859-1. ISO-8859-1
is a one-byte-per-character encoding, which is why it has a
raw 0xf3 here. However, although the character is at Unicode
codepoint 0xf3 as well, the encoding of this in UTF-8 is
the two-byte sequence 0xc3 0xb3. Similarly the 0xf0 at offset
0x17, LATIN SMALL LETTER ETH, has to be encoded in UTF-8 as
0xc3 0xb0.
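
If you'd rather check this without a hex editor, here is a
minimal Python sketch. It assumes only the start of the comment
line as shown in the hexdump above; the variable names are
mine, not anything from the repo.

```python
# Start of the comment line, written with explicit code points
# so the sketch does not depend on this file's own encoding.
line = "# 2004-03-16 Halld\u00f3r Gu\u00f0mundsson"

latin1 = line.encode("iso-8859-1")  # one byte per character
utf8 = line.encode("utf-8")         # two bytes each for U+00F3 and U+00F0

# Offset 0x12 is where the o-acute sits in both encodings.
print(latin1[0x12:0x13].hex())      # f3
print(utf8[0x12:0x14].hex(" "))     # c3 b3

# The UTF-8 form is two bytes longer: one extra byte per non-ASCII letter.
print(len(utf8) - len(latin1))      # 2
```

The ISO-8859-1 bytes are also not valid UTF-8 (0xf3 would start
a multi-byte sequence that never completes), which is one way a
tool can tell the two apart.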

If you look at the raw text of Stefan's email it reads:
-# 2004-03-16 Halld=F3r Gu=F0mundsson and Morten Lange
+# 2004-03-16 Halld=C3=B3r Gu=C3=B0mundsson and Morten Lange

which is the quoted-printable encoding of this change.
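
As a rough sketch of what that means, you can decode the two
quoted-printable fragments with Python's standard quopri module
and compare the resulting bytes. The fragments below are cut
down from the diff lines above; the names are just for
illustration.

```python
import quopri

# Quoted-printable writes each non-ASCII byte as "=XX" in hex.
old_raw = quopri.decodestring(b"Halld=F3r Gu=F0mundsson")
new_raw = quopri.decodestring(b"Halld=C3=B3r Gu=C3=B0mundsson")

print(old_raw)  # b'Halld\xf3r Gu\xf0mundsson'
print(new_raw)  # b'Halld\xc3\xb3r Gu\xc3\xb0mundsson'

# Different bytes, same text once each is decoded with the right charset,
# which is why the diff can look like a no-op in a mail client.
same_text = old_raw.decode("iso-8859-1") == new_raw.decode("utf-8")
print(same_text)  # True
```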

-- PMM
