Mark Sapiro writes:
> >> Content-Disposition: inline
> >> Content-Type: text/plain; charset="us-ascii"
> >> Content-Transfer-Encoding: 7bit
> >>
> >> And in the mbox file below (of the same message), I see:
> >>
> >> Content-Disposition: inline
> >> Content-Type: text/plain
> >> Content-Transfer-Encoding: quoted-printable
>
> OK, that's the issue. The message in the .mbox is after various list
> manipulations, but before scrubbing for the pipermail archive, and
> somehow the '; charset="us-ascii"' has been lost from the Content-Type:
> header, which is why Scrubber scrubs it.
Does Scrubber really do that? Per RFC 2045, the two Content-Type fields
have exactly the same semantics: "it is plain text, encoded as ASCII."
I would hope instead that it's non-ASCII content that triggered
something (are we sure it's Mailman? could be an MTA somewhere along
the line) to qp-encode.
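
To illustrate the "same semantics" point, here is a minimal sketch using Python's standard email package. The helper name effective_charset and the two sample messages are made up for illustration; this is not what Scrubber itself does, only how a reader following the RFC 2045 default would treat the two headers.

```python
from email import message_from_string

# RFC 2045 says a text/plain part with no charset parameter is us-ascii.
RFC2045_DEFAULT_CHARSET = "us-ascii"


def effective_charset(raw_message: str) -> str:
    """Return the charset a MIME reader should assume for this message."""
    msg = message_from_string(raw_message)
    # get_content_charset() returns failobj when there is no charset parameter.
    return msg.get_content_charset(failobj=RFC2045_DEFAULT_CHARSET)


# Hypothetical sample messages mirroring the two headers under discussion.
with_param = 'Content-Type: text/plain; charset="us-ascii"\n\nhello\n'
without_param = "Content-Type: text/plain\n\nhello\n"

same = effective_charset(with_param) == effective_charset(without_param)
print(same)
```

Code that treats a missing charset parameter as "unknown" rather than applying the default would see these two messages differently, which may be what is happening here.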
For example, the original mail may have included directed quotes or
similar hard-to-distinguish "fancy punctuation", but the composing MUA
didn't notice them and just randomly set charset=us-ascii. Is there
quoted-printable (easily recognized by the "=" + 2 hex digits syntax)
in that MIME body in the mbox? Another possibility is that the body
contained a very long line and was qp-encoded (soft line breaks,
"=" CRLF, inserted) to stay within the line-length limit (998
characters per line in RFC 5322).
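
A rough way to check the mbox body for both signs is sketched below. The function name qp_markers and the sample inputs are invented for illustration, and the regexes are only a heuristic: a literal "=" followed by two hex digits in ordinary text would also match.

```python
import quopri
import re

# "=" followed by two hex digits: an encoded octet.
QP_ESCAPE = re.compile(rb"=[0-9A-Fa-f]{2}")
# "=" at end of line: a soft line break inserted by the encoder.
QP_SOFT_BREAK = re.compile(rb"=\r?\n")


def qp_markers(body: bytes) -> dict:
    """Report which quoted-printable markers appear in a raw MIME body."""
    return {
        "escapes": bool(QP_ESCAPE.search(body)),
        "soft_breaks": bool(QP_SOFT_BREAK.search(body)),
    }


# Hypothetical bodies: directed quotes, one very long line, and plain ASCII.
fancy = quopri.encodestring("\u201cquoted\u201d\n".encode("utf-8"))
long_line = quopri.encodestring(b"a" * 200 + b"\n")
plain = b"just plain ASCII\n"

for name, body in [("fancy", fancy), ("long_line", long_line), ("plain", plain)]:
    print(name, qp_markers(body))
```

Escapes without soft breaks point toward non-ASCII content; soft breaks without escapes point toward an overlong line.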
> > FWIW, using Thunderbird I posted the contents of the original email to a
> > test list (on the same server, with the same lists configs) and as
> > expected the archived message displays correctly as a plaintext
> > email.
How did the contents get into the message in Thunderbird? Copy-paste?
Yank from mailbox? Forward? If it's anything but the last,
Thunderbird almost surely massaged it on the way in.
> > The headers this time show as:
> >
> > === FROM EMAIL HEADER
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset="us-ascii"
> > Content-Transfer-Encoding: 7bit
> >
> > === FROM MBOX
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=utf-8
> > Content-Transfer-Encoding: 7bit
>
> This is what should happen. I'm not sure which handler changed the
> charset to utf-8, but I don't think that's significant.
This is RFC-ly bizarre. Why would anything change the charset, unless
there were non-ASCII octets? If something does make that change
despite the body being pure ASCII, I would argue that's a bug (very
old MUAs might refuse to display the message), although it's probably
irrelevant nowadays.
But if there *are* non-ASCII octets, the Content-Transfer-Encoding is
a lie. I think the original message was probably broken as sent.
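
To test whether a "Content-Transfer-Encoding: 7bit" declaration is honest, something like the following sketch could be run over the raw body octets. The name breaks_7bit is made up here, and the checks follow my reading of the RFC 2045 definition of 7bit data (no octets above 127, no NUL, lines of at most 998 octets); it does not check for bare CR or LF.

```python
def breaks_7bit(body: bytes, max_line: int = 998) -> list:
    """Return the reasons (if any) why body is not valid 7bit data."""
    problems = []
    if any(octet > 127 for octet in body):
        problems.append("non-ASCII octet")
    if b"\x00" in body:
        problems.append("NUL octet")
    # bytes.splitlines() strips the line terminators, so len() counts
    # only the content of each line.
    if any(len(line) > max_line for line in body.splitlines()):
        problems.append(f"line longer than {max_line} octets")
    return problems


# Hypothetical bodies for illustration.
print(breaks_7bit(b"hello\r\n"))
print(breaks_7bit("\u201chi\u201d\r\n".encode("utf-8")))
print(breaks_7bit(b"a" * 1000 + b"\r\n"))
```

An empty list for the original message would mean the 7bit claim holds and the problem lies elsewhere.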
Steve