Ken Moffat wrote on 07/08/12  at 19:35:09 +0100:
> On Sat, Jul 07, 2012 at 02:41:41PM -0600, Jack M wrote:
> > I sometimes send messages that contain the lowercase 'o' with an umlaut
> > over it, i.e., ö, unicode char 246.  I compose my messages in vim, with
> > the encodings all set to utf-8.
> > 
> > Occasionally I can see that a message that I sent (e.g., to myself) has
> > been unexpectedly encoded with quoted-printable, when I have the 'ö' in
> > the message body.  Such messages also declare the charset as Latin1
> > (which, I presume, was done by mutt, using $send_charset).
> > 
> > Somewhere along the line, the quoted-printable translation of 'ö' gets
> > messed up.  Apparently, the raw text of such a message uses =C3=B6 to
> > encode 'ö'. (In case this gets garbled, that's "equalsign, C3,
> > equalsign, B6").  Mutt transparently decodes this (I guess) and shows me
> > an umlauted 'o' in the pager.  What's funny is that =C3=B6 is not the
> > correct QP-encoding for the umlauted 'o'; hence if I view the raw
> > message text in vim (either by pressing 'e' from the pager or by saving
> > to disk first), and then un-encode from QP, I get not one but two
> > unicode characters, and both are incorrect.  (Namely, a capital A with a
> > tilde on top, and some strange other thing).
> > 
> > As far as I can tell, the umlauted, lowercase 'o' is char 246 in both
> > UTF8 and Latin1.  And as far as I can tell, the correct QP-translation
> > of character 246 ought to be =F6 (equalsign, F6).  But the raw message
> > text has *two* QP characters, =C3=B6, neither of which is correct.
> > 
>  Well, yes in unicode it is still decimal 246, but displaying that
> in UTF-8 takes two bytes which happen to be 0xC3 0xB6.  I used to
> know the (tedious) mechanics of how to decode UTF-8, but nowadays I
> just google for the hex codes, "unicode C3B6" in this case.  The
> details of the encoding are in the UTF-8 page at wikipedia.

Ah, now I'm getting somewhere.  (Thanks Ken!).  This at least suggests that
the answer to the question "why does mutt display it correctly but it gets
garbled when saved locally?" is: "because my QP-decoding scheme in vim is
decoding =C3 and =B6 separately [as 195 and 182] instead of seeing them as a
multibyte representation of 246".  
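
To convince myself, here is a minimal sketch in Python (my choice purely for
illustration; it is not what vim or mutt actually run internally).  It uses
the standard-library quopri module, and the variable names are my own.
QP-decoding only gives back bytes; which characters you see depends on the
charset used to interpret those bytes afterwards.

```python
import quopri

raw = b"=C3=B6"                     # what the raw message body contains
octets = quopri.decodestring(raw)   # QP decoding yields two bytes: 0xC3 0xB6

# Interpreted as UTF-8, the two bytes form a single character, U+00F6.
as_utf8 = octets.decode("utf-8")

# Interpreted as Latin-1, each byte becomes its own character (195 and 182).
as_latin1 = octets.decode("latin-1")

# Going the other way: the QP text depends on the charset used before encoding.
qp_from_utf8 = quopri.encodestring("\u00f6".encode("utf-8")).strip()
qp_from_latin1 = quopri.encodestring("\u00f6".encode("latin-1")).strip()

print(len(as_utf8), [ord(c) for c in as_utf8])
print(len(as_latin1), [ord(c) for c in as_latin1])
print(qp_from_utf8, qp_from_latin1)
```

So =F6 would only be the right QP form if the body really were Latin-1;
=C3=B6 is the right QP form for a UTF-8 body.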

>  I know nothing about the details of quoted printable (apart from
> what I've just read on wikipedia).  Certainly, that message isn't
> latin1, it's UTF-8.  I suspect that the key is to find out *why*
> that message has been sent as quoted printable latin1.  Certainly,
> your post here is text/plain utf-8 and reads fine.

Yes, this is the other half of the mystery.  Now I need a list of possible
suspects for who is the mystery QP-er.  I don't know enough about how mail
works to make a complete list, but surely mutt itself and my SMTP server are
possible suspects?

From the manual, as far as I can tell, the only way mutt would QP-encode my
message for me would be if I have $encode_from set, which I do not.

Is QP-encoding something that SMTP servers might ever do?  I'm totally in the
dark on this.

--Jack

