Mark Sapiro writes:

> >> Content-Disposition: inline
> >> Content-Type: text/plain; charset="us-ascii"
> >> Content-Transfer-Encoding: 7bit
> >>
> >> And in the mbox file below (of the same message), I see:
> >>
> >> Content-Disposition: inline
> >> Content-Type: text/plain
> >> Content-Transfer-Encoding: quoted-printable
>
> OK, that's the issue. The message in the .mbox is after various list
> manipulations, but before scrubbing for the pipermail archive, and
> somehow the '; charset="us-ascii"' has been lost from the Content-Type:
> header, which is why Scrubber scrubs it.

Does Scrubber really do that? Per RFC 2045, the two Content-Type fields
have exactly the same semantics: "it is plain text, encoded as ASCII."
I would hope instead that it's non-ASCII content that triggered
something (are we sure it's Mailman? it could be an MTA somewhere along
the line) to qp-encode the body. For example, the original mail may
have included directed quotes or similar hard-to-distinguish "fancy
punctuation", but the composing MUA didn't notice them and just blindly
set charset=us-ascii. Is there quoted-printable (easily recognized by
the "=" + 2 hex digits syntax) in that MIME body in the mbox?

Another possibility is that the body contained a very long line, which
was qp-encoded ("=" CRLF inserted after a space) to conform to RFC 822.

> > FWIW, using Thunderbird I posted the contents of the original email to a
> > test list (on the same server, with the same lists configs) and as
> > expected the archived message displays correctly as a plaintext
> > email.

How did the contents get into the message in Thunderbird? Copy-paste?
Yank from mailbox? Forward? If it's anything but the last, Thunderbird
almost surely massaged it on the way in.

> > The headers this time show as:
> >
> > === FROM EMAIL HEADER
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset="us-ascii"
> > Content-Transfer-Encoding: 7bit
> >
> > === FROM MBOX
> > MIME-Version: 1.0
> > Content-Type: text/plain; charset=utf-8
> > Content-Transfer-Encoding: 7bit
> >
> This is what should happen. I'm not sure which handler changed the
> charset to utf-8, but I don't think that's significant.

This is RFC-ly bizarre. Why would anything change the charset unless
there were non-ASCII octets? If something does make that change even
though the body is pure ASCII, I would argue that's a bug (very old
MUAs might refuse to display the message), although it's probably
irrelevant nowadays. But if there *are* non-ASCII octets, the
Content-Transfer-Encoding is a lie. I think the original message was
probably broken as sent.

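
In case it helps with checking the mbox copy, here is a rough sketch of
the tests I have in mind, not anything Mailman itself does. The
function name inspect_body and the keys of the dict it returns are made
up for illustration; it only uses Python's standard re and quopri
modules. It takes the raw octets of the MIME body plus the declared
Content-Transfer-Encoding, and reports whether qp syntax is present,
whether non-ASCII octets appear after decoding, whether the declared
encoding is consistent with the raw octets, and the longest line.

```python
import quopri
import re

# "=" + two hex digits, or "=" at end of line (a qp soft line break).
# Heuristic only: ordinary text such as "x=42" also matches.
QP_SYNTAX = re.compile(rb"=[0-9A-Fa-f]{2}|=\r?\n")


def inspect_body(raw_body: bytes, cte: str = "7bit") -> dict:
    """Summarize the octets of a MIME body (hypothetical helper)."""
    cte = cte.lower()

    # Undo quoted-printable only if the header claims it was applied.
    if cte == "quoted-printable":
        decoded = quopri.decodestring(raw_body)
    else:
        decoded = raw_body

    # 7bit and quoted-printable bodies must be pure ASCII on the wire;
    # 8bit and binary are allowed to carry octets above 0x7F.
    raw_is_ascii = all(octet < 0x80 for octet in raw_body)

    return {
        "has_qp_syntax": bool(QP_SYNTAX.search(raw_body)),
        "non_ascii_after_decoding": any(octet > 0x7F for octet in decoded),
        "cte_matches_content": raw_is_ascii or cte in ("8bit", "binary"),
        "longest_line": max(
            (len(line) for line in raw_body.splitlines()), default=0
        ),
    }


if __name__ == "__main__":
    # A qp-encoded body containing UTF-8 curly quotes.
    print(inspect_body(b"He said =E2=80=9Chello=E2=80=9D\r\n",
                       "quoted-printable"))
    # Raw UTF-8 curly quotes labelled 7bit: the header is a lie.
    print(inspect_body("\u201chi\u201d\r\n".encode("utf-8"), "7bit"))
```

If non_ascii_after_decoding comes back true for the mbox copy, the
charset="us-ascii" label on the original was wrong. If it is false but
has_qp_syntax is true, a large longest_line value would point at the
long-line explanation instead.
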
Steve