Re: is the mailing list mangling email character-encoding
On 9/23/19 3:39 PM, Stephen J. Turnbull wrote:
> Ed sent his message in Unicode with NFC normalization (the é is
> pre-composed) and UTF-8. Tim's message contains two ?, indicating a
> pair of unknown characters. One possibility is that Tim's MUA
> (Evolution) converts that to NFD normalization and something in
> between chokes on that, and produces the doubled ?? instead of a
> single é.

FWIW, I sent a message on Sun, 22 Sep 2019 23:22:44 +0800 using Evolution 3.34.0 (3.34.0-1.fc31), as I only have an F31 Beta GNOME VM. In that message Frédéric survived.

--
If simple questions can be answered with a simple google query then why are there so many of them?
___
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
Re: is the mailing list mangling email character-encoding
Tom Horsley writes:

> Some of the messages show the international characters,

In this thread, those are Ed's.

> and some do not.

Tim's.

> It would be interesting to examine the detailed headers of the
> message where the characters disappeared,

I looked in my local folder, and there's nothing interesting in the headers. Not surprising; the mail header is not very expressive. Archiving happens *after* message transformations, so I'm seeing the same headers you would in HyperKitty. We'd probably need to see the pre-Mailman header to identify any problems with Mailman via the header, but Mailman has never kept those. Perhaps if Tim sent a mail both to the list and to himself, comparing those headers might tell us something, but his messages are simple text/plain; charset=UTF-8, so I doubt it.

I'm pretty sure it has something to do with the Unicode encoding itself. Ed sent his message in Unicode with NFC normalization (the é is pre-composed) and UTF-8. Tim's message contains two ?, indicating a pair of unknown characters. One possibility is that Tim's MUA (Evolution) converts that to NFD normalization and something in between chokes on that, and produces the doubled ?? instead of a single é. Renormalization is perfectly conformant to both Unicode and mail standards, but lots of software has issues with NFD. Mailman should not, since it doesn't need to interpret anything other than ASCII text, and passes anything else along (or deletes/quarantines whole MIME parts), and I've not heard of such problems (but Mailman 3 is a completely new code base, so it's possible a new issue has been introduced).

It's also possible that Tim's MUA double-UTF-8-encodes the é, which results in an illegal code point sequence which might also be represented as ??. Of course double-encoding is a bug, and if Mailman receives such email, it's quite likely that it would replace the broken text with ??.
This seems highly unlikely, as Tim would be seeing issues all over the place, including in mail directly to himself, which he has tried without problem.

So most likely something between Tim's MUA and Mailman (the list manager, not HyperKitty) is mishandling the text, in one of the ways described above. I can't exclude either endpoint, but in both cases somebody should be seeing a lot of similar mojibake. Tim reports some, but not in mail sent directly to himself, and I've never seen Mailman cause anything like this. More objectively, the fact that the ??s are in HyperKitty rather than some 8-bit mojibake strongly suggests that even if Mailman is directly responsible for the ??s, it was replacing existing mojibake with ??.

> but that seems to be impossible with hyperkitty.

I believe there's work being done on more detailed archiving at GNU Mailman that might help in diagnosing this kind of issue (not done yet though). I don't know if lists.fedoraproject is tracking us, though.

--
Associate Professor              Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/     Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp                   University of Tsukuba
Tel: 029-853-5175                 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN
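The two hypotheses in the message above can be made concrete with a minimal Python sketch. It is not taken from Evolution, Mailman, or any MTA; it only shows what the byte sequences look like. The Latin-1 misreading used for the double-encoding case is an assumption chosen for illustration, since the thread does not say how a double encoding would have happened.

```python
import unicodedata

# NFC form: each é is the single pre-composed code point U+00E9.
name = "Fr\u00e9d\u00e9ric"

nfc = unicodedata.normalize("NFC", name)
nfd = unicodedata.normalize("NFD", name)

# NFC keeps é as one code point; NFD splits it into 'e' + U+0301
# (combining acute accent), so the string gets two code points longer.
print(len(nfc), nfc.encode("utf-8"))
print(len(nfd), nfd.encode("utf-8"))

# Hypothetical double encoding (assumption): the UTF-8 bytes are misread
# as Latin-1 text and then encoded to UTF-8 a second time.
double = nfc.encode("utf-8").decode("latin-1").encode("utf-8")
print(double)
```

Comparing these byte strings against the raw body of a mangled message (before any ?? substitution) would be one way to tell the cases apart.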
Re: is the mailing list mangling email character-encoding
Ed Greshko writes:

> Yes. For some reason I can't fathom it seems the mailing list
> software turns every Content-Transfer-Encoding to base64

It's just "old fashioned stupidity" far beyond the ken of today's users inexperienced with the details of the global mail system. Let me explain.

Mailman needs to decode text/* parts of the message body to check for administrivia directed to the list, to decorate it (mostly to add a footer), and to manipulate MIME structure (quarantining .exes and the like). When it forwards to the MTA, it normally picks an RFC-mandated Content-Transfer-Encoding: 7bit if the text is pure ASCII, and BASE64 otherwise. I am not familiar with the rules used by Mailman 3, offhand.

Mailman is not an MTA, so it does not negotiate SMTP options like 8BITMIME and SMTPUTF8, and it doesn't know what any of the MTAs are, not even its own. It needs to be compatible with them all, and there may be thousands of instances of dozens of programs involved. So Mailman uses the least-common-denominator standard that can handle all messages, which is RFC 5322 and RFCs 2045-2049. (Fortunately it doesn't mess with addresses at all, so it doesn't need to know about IDNA and friends.) If the MTA wants to be smarter than that, fine by us, but we have to handle everything the world spews out, so we reduce to the most compatible protocols we know. There remains mail software in active use, such as fetchmail, which is still not RFC 6531-conforming.

> Weird. If it were something on the mailing list side you'd think
> that everyone would see the same issue.

Agreed 100%. If messages were personalized (e.g., a personal link to Postorius in the footer), I'd only be 99% confident (it's possible there are weird interactions between the charset of personalized material and that of the main text). But users@ doesn't personalize, and the footer is pure ASCII AFAICS, so that isn't it.
Another possibility would be if the mail had text/html content type: Mailman delegates conversion to plain text to external software such as Lynx. But all the example messages were text/plain, so that's not it.

I'm 99% sure that this is not a Mailman issue, but rather a problem with MUAs that don't format the message correctly or don't interpret it correctly, or MTAs that translate inaccurately when converting (this might include spam or virus checkers, especially the latter, which frequently modify message bodies). Nevertheless, "when the impossible is eliminated, what's left, however improbable, must be the truth." Feel free to contact mailman-develop...@mail.python.org if it begins to seem more likely that Mailman is causing the problem somehow; maybe somebody's heard of something like this.

Steve
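As a rough illustration of why a base64 Content-Transfer-Encoding should be harmless on its own, here is a small sketch using Python's standard email package. It does not reproduce Mailman's actual code path; it only assumes a UTF-8 text/plain body that is forced to base64, as described above, and checks that the text survives the round trip.

```python
from email.message import EmailMessage

text = "Frédéric and 楊秀茵"

msg = EmailMessage()
# Force base64 for the body, as a list manager targeting 7-bit-safe
# transport might do for non-ASCII text.
msg.set_content(text, charset="utf-8", cte="base64")

cte = msg["Content-Transfer-Encoding"]
wire = msg.as_bytes()                      # what would go over SMTP
decoded = msg.get_content().rstrip("\n")   # what a correct MUA would show

print(cte)
print(wire.isascii())
print(decoded == text)
```

If the serialized message is pure ASCII and decodes back to the same text, then re-encoding to base64 alone does not explain the ??, which is consistent with Ed's messages arriving intact through the same list.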
Re: is the mailing list mangling email character-encoding
On 9/23/19 1:36 PM, Tim via users wrote:
> On Mon, 2019-09-23 at 07:51 +0800, Ed Greshko wrote:
>> For some reason I can't fathom it seems the mailing list software
>> turns every Content-Transfer-Encoding to base64
>
> Old fashioned stupidity? Thinking 7-bit email is still necessary?
>
> It should still work as base64, but the encoding is doing it wrong. I
> can't force an encoding scheme when viewing the message, it's mangled
> to begin with, and just gets worse.
>
>>> It's only messages going through the list that I notice getting
>>> mangled.
>> Weird. If it were something on the mailing list side you'd think
>> that everyone would see the same issue.
>
> Chances are that most people just ignore things like it.

Well, that doesn't explain why I can write 楊秀茵 or Frédéric and it will show up correctly and you cannot.

I wonder what would happen if you included 楊秀茵 in one of your messages.

--
If simple questions can be answered with a simple google query then why are there so many of them?
Re: is the mailing list mangling email character-encoding
On Mon, 2019-09-23 at 07:51 +0800, Ed Greshko wrote:
> For some reason I can't fathom it seems the mailing list software
> turns every Content-Transfer-Encoding to base64

Old fashioned stupidity? Thinking 7-bit email is still necessary?

It should still work as base64, but the encoding is doing it wrong. I can't force an encoding scheme when viewing the message; it's mangled to begin with, and just gets worse.

>> It's only messages going through the list that I notice getting
>> mangled.

> Weird. If it were something on the mailing list side you'd think
> that everyone would see the same issue.

Chances are that most people just ignore things like it.

--
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64

Boilerplate: All unexpected mail to my mailbox is automatically deleted. I will only get to see the messages that are posted to the mailing list.
Re: is the mailing list mangling email character-encoding
On Sun, 22 Sep 2019 20:16:17 -0400 Fred Smith wrote:

> FWIW, I find these headers in your posting on my system:

For what it is worth, I glanced at this thread over on the fedora mailing list archives (which isn't an easy thread to find over there because someone changed the subject line of an existing thread :-).

Some of the messages show the international characters, and some do not. It would be interesting to examine the detailed headers of the message where the characters disappeared, but that seems to be impossible with hyperkitty.
Re: is the mailing list mangling email character-encoding
On 9/23/19 8:16 AM, Fred Smith wrote:
> FWIW, I find these headers in your posting on my system:
>
> Content-Type: text/plain; charset="utf-8"; format="flowed"
> Content-Transfer-Encoding: 8bit
> X-MIME-Autoconverted: from base64 to 8bit by fcshome.stoneham.ma.us id x8MNrkId016512

Your side "autoconverted" it back to what I find more reasonable.

I suppose there may still be a few non-8-bit-capable MTAs out in the world. I've not encountered one recently.

--
If simple questions can be answered with a simple google query then why are there so many of them?
Re: is the mailing list mangling email character-encoding
On Mon, Sep 23, 2019 at 07:51:45AM +0800, Ed Greshko wrote:
> On 9/23/19 12:04 AM, Tim via users wrote:
> >On Sun, 2019-09-22 at 22:52 +0800, Ed Greshko wrote:
> >>And it did not.
> >I noticed the same. On the messages that get corrupted, I've received
> >them as base64 encoded (they weren't sent that way), though your
> >messages with Frederic correctly accented were also base64 encoded.
> >Most other messages coming from the list are just text. Do you get
> >those messages as text or base64?
>
> Yes. For some reason I can't fathom it seems the mailing list software
> turns every Content-Transfer-Encoding to base64
>
> The initial message is sent from here as
>
> Content-Type: text/plain; charset=utf-8; format=flowed
> Content-Transfer-Encoding: 8bit
>
> >>I think you've said a yahoo to yahoo message is fine. Do you have a
> >>non-yahoo account you can send to directly to check?
> >Yes, that works fine. It's only messages going through the list that I
> >notice getting mangled.
>
> Weird. If it were something on the mailing list side you'd think that
> everyone would see the same issue.
>

FWIW, I find these headers in your posting on my system:

Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by fcshome.stoneham.ma.us id x8MNrkId016512

--
Fred Smith -- fre...@fcshome.stoneham.ma.us
But God demonstrates his own love for us in this: While we were still sinners, Christ died for us.
--- Romans 5:8 (niv)
Re: is the mailing list mangling email character-encoding
On 9/23/19 12:04 AM, Tim via users wrote:
> On Sun, 2019-09-22 at 22:52 +0800, Ed Greshko wrote:
>> And it did not.
> I noticed the same. On the messages that get corrupted, I've received
> them as base64 encoded (they weren't sent that way), though your
> messages with Frederic correctly accented were also base64 encoded.
> Most other messages coming from the list are just text. Do you get
> those messages as text or base64?

Yes. For some reason I can't fathom it seems the mailing list software turns every Content-Transfer-Encoding to base64.

The initial message is sent from here as

Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

>> I think you've said a yahoo to yahoo message is fine. Do you have a
>> non-yahoo account you can send to directly to check?
> Yes, that works fine. It's only messages going through the list that I
> notice getting mangled.

Weird. If it were something on the mailing list side you'd think that everyone would see the same issue.

--
If simple questions can be answered with a simple google query then why are there so many of them?
Re: is the mailing list mangling email character-encoding
On Sun, 2019-09-22 at 22:52 +0800, Ed Greshko wrote:
> And it did not.

I noticed the same. On the messages that get corrupted, I've received them as base64 encoded (they weren't sent that way), though your messages with Frederic correctly accented were also base64 encoded. Most other messages coming from the list are just text. Do you get those messages as text or base64?

> I think you've said a yahoo to yahoo message is fine. Do you have a
> non-yahoo account you can send to directly to check?

Yes, that works fine. It's only messages going through the list that I notice getting mangled.

--
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64

Boilerplate: All unexpected mail to my mailbox is automatically deleted. I will only get to see the messages that are posted to the mailing list.
Re: is the mailing list mangling email character-encoding
On Sun, 2019-09-22 at 22:09 +0800, Ed Greshko wrote:
> On 9/22/19 10:00 PM, Tim via users wrote:
> > On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
> > > So, I would blame yahoo. :-) :-)
> > Ordinarily, I'd agree. But if I post directly through Yahoo (using
> > their SMTP servers), and receive from a Yahoo address, international
> > characters come through unmangled.
> >
> > I've seen this kind of thing before, when I've posted here using
> > Thunderbird on a Mac.
> >
>
> I see. I'm curious though. If you respond to this and keep the
> contents will Frédéric survive?
>

Just because I want to eliminate Evolution as the culprit.
Re: is the mailing list mangling email character-encoding
On 9/22/19 10:41 PM, Tim via users wrote:
>> I see. I'm curious though. If you respond to this and keep the
>> contents will Fr??d??ric survive?
> Very punny. ;-)
>
> Here's the reply.

And it did not.

I think you've said a yahoo to yahoo message is fine. Do you have a non-yahoo account you can send to directly to check?

--
If simple questions can be answered with a simple google query then why are there so many of them?
Re: is the mailing list mangling email character-encoding
On Sun, 2019-09-22 at 22:09 +0800, Ed Greshko wrote:
> On 9/22/19 10:00 PM, Tim via users wrote:
> > On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
> > > So, I would blame yahoo. :-) :-)
> >
> > Ordinarily, I'd agree. But if I post directly through Yahoo (using
> > their SMTP servers), and receive from a Yahoo address, international
> > characters come through unmangled.
> >
> > I've seen this kind of thing before, when I've posted here using
> > Thunderbird on a Mac.
> >
>
> I see. I'm curious though. If you respond to this and keep the
> contents will Fr??d??ric survive?

Very punny. ;-)

Here's the reply.

--
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64

Boilerplate: All unexpected mail to my mailbox is automatically deleted. I will only get to see the messages that are posted to the mailing list.
Re: is the mailing list mangling email character-encoding
On 9/22/19 10:00 PM, Tim via users wrote:
> On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
>> So, I would blame yahoo. :-) :-)
> Ordinarily, I'd agree. But if I post directly through Yahoo (using
> their SMTP servers), and receive from a Yahoo address, international
> characters come through unmangled.
>
> I've seen this kind of thing before, when I've posted here using
> Thunderbird on a Mac.

I see. I'm curious though. If you respond to this and keep the contents will Frédéric survive?

--
If simple questions can be answered with a simple google query then why are there so many of them?
Re: is the mailing list mangling email character-encoding
On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
> So, I would blame yahoo. :-) :-)

Ordinarily, I'd agree. But if I post directly through Yahoo (using their SMTP servers), and receive from a Yahoo address, international characters come through unmangled.

I've seen this kind of thing before, when I've posted here using Thunderbird on a Mac.

--
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64

Boilerplate: All unexpected mail to my mailbox is automatically deleted. I will only get to see the messages that are posted to the mailing list.
Re: is the mailing list mangling email character-encoding
On 9/22/19 7:17 PM, Tim via users wrote:
> Below I've just pasted a bit of an email:
>
> On Sun, 2019-09-22 at 06:13 +0200, Fr??d??ric wrote:
>
> It's from my reply to a message. That mangled name is supposed to be
> "Frederic" with acute accents above both letter "e"s. It was when I
> wrote it, and my mail client *CORRECTLY* sent it as UTF-8, but
> somewhere between then and my receiving it back, the message content
> has been scrambled by some broken email software that stuffed the
> character encoding up.

My reply showed up in the archives with Frédéric correctly shown.

So, I would blame yahoo. :-) :-)

--
If simple questions can be answered with a simple google query then why are there so many of them?
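One way to get exactly two ? per é can be sketched in Python, under the assumption that some step reads the UTF-8 bytes as 7-bit ASCII and substitutes ? for every byte it cannot represent. The helper name seven_bit_view is hypothetical, and this illustrates the symptom only; it says nothing about which program along the path might behave this way.

```python
import unicodedata


def seven_bit_view(text: str) -> str:
    """Hypothetical lossy step: UTF-8 bytes read as ASCII, bad bytes become '?'."""
    raw = text.encode("utf-8")
    return raw.decode("ascii", errors="replace").replace("\ufffd", "?")


nfc_name = "Fr\u00e9d\u00e9ric"                     # pre-composed é (U+00E9)
nfd_name = unicodedata.normalize("NFD", nfc_name)  # 'e' + combining accent

print(seven_bit_view(nfc_name))  # Fr??d??ric
print(seven_bit_view(nfd_name))  # Fre??de??ric
```

Under this assumption the pre-composed é, which is two bytes in UTF-8, turns into ?? with no visible "e", matching the mangled name quoted above, while a decomposed é would be expected to leave a plain "e" in front of the question marks.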