On Sun, 1 Oct 2017, Kerim Aydin wrote:

Could someone look at the index for cases:
https://faculty.washington.edu/kerim/nomic/cases/

And tell me why Ørjan's name displays correctly in CFJ 3565,
but not in CFJ 3470?

The first case involves a quoted message including 天火狐's CJK nick, and so probably got sent as UTF-8.

The second case has no special characters other than my own name, and for such messages my mailer (terminal Alpine) seems to use ISO-8859-15 (a western European charset, the revision of ISO-8859-1 with the euro sign, iirc) for sending.

Further, if you click through to the pure text version, the
situation is reversed.

ISO-8859-15 is mostly compatible with the Windows Codepage 1252 and ISO-8859-1 encodings (Ø in particular is the same byte, 0xD8, in all three), so if your pure text is served as either of those or something similar, the ISO-8859-15 message is likely to be shown correctly in a browser.
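To make the byte-level difference concrete, here is a small Python sketch. It assumes only the standard codecs; the variable names are illustrative. It encodes the same name in the charsets mentioned above and shows what happens when UTF-8 bytes are read as a single-byte charset, or the reverse.

```python
# Encode the same name in each charset discussed above.
name = "Ørjan"
encodings = ["iso8859-15", "iso8859-1", "cp1252", "utf-8"]
encoded = {enc: name.encode(enc) for enc in encodings}

for enc, raw in encoded.items():
    print(f"{enc:>10}: {raw.hex()}")

# The three single-byte charsets agree on one byte (0xD8) for "Ø";
# UTF-8 uses two bytes (0xC3 0x98) instead.
single_byte = encoded["iso8859-15"]
utf8_bytes = encoded["utf-8"]

# UTF-8 bytes read as cp1252: each byte becomes its own character.
mojibake = utf8_bytes.decode("cp1252")

# Single-byte text read as UTF-8: 0xD8 starts a two-byte sequence that
# "r" cannot complete, so a strict decode fails; with errors="replace"
# the bad byte becomes U+FFFD, the replacement character.
replaced = single_byte.decode("utf-8", errors="replace")

print(mojibake)
print(replaced)
```

This matches the reversed behaviour described in the question: whichever charset the page is served as, only one of the two messages decodes cleanly.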

For the other one, I can see it correctly as pure text by forcing UTF-8.

I'm guessing one uses a UTF-8 Ø, but the other uses some form
of extended ASCII?  So one displays in html not text, and the
other is vice versa?  And given that I get these from cutting
and pasting from email, is there any convenient way to tell
other than seeing the mistake or always having a hex editor
open?

Finding out my sending charset was surprisingly awkward because it's only given in the header of a multipart part (for some reason it uses multipart format even though there is only one part), not in a header of the email itself. In the raw mailbox file it looks like:

Content-Type: TEXT/PLAIN; charset=ISO-8859-15; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
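
Rather than reading the raw mailbox by eye, the part headers can be inspected programmatically. The following is a rough sketch using Python's standard `email` package. The two header lines are the ones quoted above; the body text in `raw_part` is a made-up stand-in, not the actual message, and only the single part is parsed rather than the whole multipart email.

```python
from email import message_from_string

# Header lines as quoted above; the body is a hypothetical example
# showing how "Ø" appears in quoted-printable ISO-8859-15 (=D8).
raw_part = (
    "Content-Type: TEXT/PLAIN; charset=ISO-8859-15; format=flowed\n"
    "Content-Transfer-Encoding: QUOTED-PRINTABLE\n"
    "\n"
    "Greetings,\n"
    "=D8rjan.\n"
)

part = message_from_string(raw_part)

# get_content_charset() returns the charset parameter, lowercased.
charset = part.get_content_charset()

# decode=True undoes the quoted-printable transfer encoding and yields
# raw bytes; decoding those with the declared charset gives real text.
body = part.get_payload(decode=True).decode(charset)

print(charset)
print(body)
```

For a full multipart message, `msg.walk()` would visit each part so the same two calls can be applied per part.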

*Sigh*, I think these days, cutting and pasting ought to convert via Unicode to work properly.
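
Short of that, pasted text can be normalized by hand before it goes into the archive. The sketch below is one possible heuristic, not what any mailer or browser actually does: try a strict UTF-8 decode first, and if that fails assume a single-byte charset. The function name `to_utf8` and the default fallback of ISO-8859-15 are assumptions chosen for this example.

```python
def to_utf8(raw: bytes, fallback: str = "iso8859-15") -> bytes:
    """Return raw re-encoded as UTF-8, guessing its original charset.

    Heuristic: non-ASCII text in a single-byte western charset is very
    rarely valid UTF-8 by accident, so a successful strict UTF-8 decode
    is taken as evidence that the input already is UTF-8.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume the fallback single-byte charset.
        text = raw.decode(fallback)
    return text.encode("utf-8")


# A single-byte "Ø" (0xD8) is converted; UTF-8 input passes unchanged.
from_latin = to_utf8(b"\xd8rjan")
from_utf8 = to_utf8("Ørjan".encode("utf-8"))
print(from_latin == from_utf8)
```

The guess can still be wrong between similar single-byte charsets (ISO-8859-1, ISO-8859-15, Codepage 1252), which differ in only a few positions, so the declared charset from the mail headers is the better source when it is available.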

Greetings,
Ørjan.
