Re: is the mailing list mangling email character-encoding

2019-09-23 Thread Ed Greshko

On 9/23/19 3:39 PM, Stephen J. Turnbull wrote:

> Ed sent his message in Unicode with NFC normalization (the é is
> pre-composed) and UTF-8.  Tim's message contains two ?, indicating a
> pair of unknown characters.  One possibility is that Tim's MUA
> (Evolution) converts that to NFD normalization and something in
> between chokes on that, and produces the doubled ?? instead of a
> single é.


FWIW, I sent a message on Sun, 22 Sep 2019 23:22:44 +0800 using
Evolution 3.34.0 (3.34.0-1.fc31), as I only have an F31 Beta GNOME VM.

In that message Frédéric survived.


--
If simple questions can be answered with a simple google query then why are 
there so many of them?
___
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org


Re: is the mailing list mangling email character-encoding

2019-09-23 Thread Stephen J. Turnbull
Tom Horsley writes:

 > Some of the messages show the international characters,

In this thread, those are Ed's.

 > and some do not.

Tim's.

 > It would be interesting to examine the detailed headers of the
 > message where the characters disappeared,

I looked in my local folder, and there's nothing interesting in the
headers.  Not surprising; the mail header is not very expressive.
Archiving happens *after* message transformations, so I'm seeing the
same headers you would in HyperKitty.  We'd probably need to see the
pre-Mailman header to identify any problems with Mailman via the
header, but Mailman has never kept those.  Perhaps if Tim sent a mail
both to the list and to himself, comparing those headers might tell us
something, but his messages are simple text/plain; charset=UTF-8, so I
doubt it.  I'm pretty sure it has something to do with the Unicode
encoding itself.

Ed sent his message in Unicode with NFC normalization (the é is
pre-composed) and UTF-8.  Tim's message contains two ?, indicating a
pair of unknown characters.  One possibility is that Tim's MUA
(Evolution) converts that to NFD normalization and something in
between chokes on that, and produces the doubled ?? instead of a
single é.  Renormalization is perfectly conformant to both Unicode and
mail standards, but lots of software has issues with NFD.  Mailman
should not, since it doesn't need to interpret anything other than
ASCII text, and passes anything else along (or deletes/quarantines
whole MIME parts), and I've not heard of such problems (but Mailman 3
is a completely new code base, so it's possible a new issue has been
introduced).
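
To make the NFC/NFD distinction concrete, here is a minimal Python
sketch using only the standard library.  It shows that the same name
has a different number of code points and different UTF-8 bytes in the
two normalization forms.  It illustrates the forms themselves; it does
not claim to show what Evolution or any other software in this thread
actually does.

```python
import unicodedata

# Escapes are used so the example does not depend on how this file
# itself happens to be normalized.
name = "Fr\u00e9d\u00e9ric"

nfc = unicodedata.normalize("NFC", name)  # precomposed: U+00E9
nfd = unicodedata.normalize("NFD", name)  # decomposed: 'e' + U+0301

print(len(nfc), len(nfd))        # 8 10
print(nfc.encode("utf-8"))       # b'Fr\xc3\xa9d\xc3\xa9ric'
print(nfd.encode("utf-8"))       # b'Fre\xcc\x81de\xcc\x81ric'

# The two strings render the same but do not compare equal until one
# of them is renormalized.
print(nfc == nfd)                                  # False
print(unicodedata.normalize("NFC", nfd) == nfc)    # True
```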

It's also possible that Tim's MUA double-UTF-8-encodes the é, which
results in an illegal code point sequence which might also be
represented as ??.  Of course double-encoding is a bug, and if Mailman
receives such email, it's quite likely that it would replace the
broken text with ??.  This seems highly unlikely, as Tim would be
seeing issues all over the place, including in mail directly to
himself, which he has tried without problem.
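
As a rough illustration of how these failure modes differ, the Python
sketch below double-encodes the name (by misreading its UTF-8 bytes as
Latin-1, which is an assumption about how double encoding typically
arises) and, separately, decodes the UTF-8 bytes as pure ASCII with
replacement.  In this sketch the double-encoded text is still
decodable and shows up as mojibake, while ASCII-only handling yields
one ? per byte, that is, two per é.  Which of these, if either,
happened to Tim's messages is not established by this example.

```python
name = "Fr\u00e9d\u00e9ric"
utf8 = name.encode("utf-8")

# Double encoding: the UTF-8 bytes are misread as Latin-1 text and
# then encoded as UTF-8 a second time.
double = utf8.decode("latin-1").encode("utf-8")
print(double)                    # b'Fr\xc3\x83\xc2\xa9d\xc3\x83\xc2\xa9ric'
print(double.decode("utf-8"))    # FrÃ©dÃ©ric

# ASCII-only handling: every non-ASCII byte is replaced on its own,
# so each two-byte é turns into two replacement characters.
as_ascii = utf8.decode("ascii", errors="replace").replace("\ufffd", "?")
print(as_ascii)                  # Fr??d??ric
```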

So most likely something between Tim's MUA and Mailman (the list
manager, not HyperKitty) is mishandling the text, in one of the ways
described above.  I can't exclude either endpoint, but in both cases
somebody should be seeing a lot of similar mojibake.  Tim reports
some, but not in direct to self, and I've never seen Mailman cause
anything like this.  More objectively, the fact that the ??s are in
HyperKitty rather than some 8-bit mojibake strongly suggests that even
if Mailman is directly responsible for the ??s, it was replacing
existing mojibake with ??.

 > but that seems to be impossible with hyperkitty.

I believe there's work being done on more detailed archiving at GNU
Mailman that might help diagnosing this kind of issue (not done yet
though).  I don't know if lists.fedoraproject is tracking us, though.


-- 
Associate Professor  Division of Policy and Planning Science
http://turnbull.sk.tsukuba.ac.jp/ Faculty of Systems and Information
Email: turnb...@sk.tsukuba.ac.jp   University of Tsukuba
Tel: 029-853-5175 Tennodai 1-1-1, Tsukuba 305-8573 JAPAN


Re: is the mailing list mangling email character-encoding

2019-09-23 Thread Stephen J. Turnbull
Ed Greshko writes:

 > Yes.  For some reason I can't fathom it seems the mailing list
 > software turns every Content-Transfer-Encoding to base64

It's just "old fashioned stupidity" far beyond the ken of today's
users inexperienced with the details of the global mail system.  Let
me explain.

Mailman needs to decode text/* parts of the message body to check for
administrivia directed to the list, to decorate it (mostly to add a
footer), and to manipulate MIME structure (quarantining .exes and the
like).  When it forwards to the MTA, it normally chooses the
Content-Transfer-Encoding as follows: 7bit for pure ASCII text, which
is what the RFCs mandate, and BASE64 otherwise.  I am not familiar with
the rules used by Mailman 3, offhand.
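
A small Python sketch may help show why re-encoding as base64 should
be harmless in itself.  It uses the standard library's email package
(not Mailman's own code, which is not reproduced here) to build the
same UTF-8 body once as 8bit and once as base64, then checks that both
decode to identical text.  The helper name build() is made up for this
example.

```python
from email.message import EmailMessage


def build(cte):
    """Return a text/plain; charset=utf-8 message using the given CTE."""
    msg = EmailMessage()
    msg.set_content("Fr\u00e9d\u00e9ric\n", charset="utf-8", cte=cte)
    return msg


as_8bit = build("8bit")
as_base64 = build("base64")

print(as_8bit["Content-Transfer-Encoding"])     # 8bit
print(as_base64["Content-Transfer-Encoding"])   # base64

# The base64 payload on the wire is pure ASCII ...
print(as_base64.get_payload().strip())

# ... yet both messages decode to exactly the same text.
print(as_8bit.get_content() == as_base64.get_content())   # True
```

If the transfer encoding alone were at fault, a correctly decoding
reader would still see identical text in both cases.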

Mailman is not an MTA, so it does not negotiate SMTP extensions like
8BITMIME and SMTPUTF8, and it doesn't know what any of the MTAs are,
not even its own.  It needs
to be compatible with them all, and there may be thousands of
instances of dozens of programs involved.  So Mailman uses the least-
common-denominator standard that can handle all messages, which is RFC
5322 and RFCs 2045-2049.  (Fortunately it doesn't mess with addresses
at all, so it doesn't need to know about IDNA and friends.)  If the
MTA wants to be smarter than that, that's fine by us, but we have to handle
everything the world spews out, so we reduce to the most compatible
protocols we know.

There remains mail software in active use, such as fetchmail, which is
still not RFC 6531-conforming.

 > Weird.  If it were something on the mailing list side you'd think
 > that everyone would see the same issue.

Agreed 100%.  If messages were personalized (eg, a personal link to
Postorius in the footer), I'd only be 99% confident (it's possible
there are weird interactions between the charset of personalized
material and that of the main text).  But users@ doesn't personalize,
and the footer is pure ASCII AFAICS, so that isn't it.  Another
possibility would be if the mail had text/html content type: Mailman
delegates conversion to plain text to external software such as Lynx.
But all the example messages were text/plain, so that's not it.

I'm 99% sure that this is not a Mailman issue, but rather a problem
with MUAs that don't format the message correctly or don't interpret
it correctly, or MTAs that translate inaccurately when converting
(this might include spam or virus checkers, especially the latter
which frequently modify message bodies).  Nevertheless, "when the
impossible is eliminated, what's left, however improbable, must be the
truth."  Feel free to contact mailman-develop...@mail.python.org if
it begins to seem more likely that Mailman is causing the problem
somehow; maybe somebody's heard of something like this.

Steve


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko

On 9/23/19 1:36 PM, Tim via users wrote:

> On Mon, 2019-09-23 at 07:51 +0800, Ed Greshko wrote:
>> For some reason I can't fathom it seems the mailing list software
>> turns every Content-Transfer-Encoding to base64
>
> Old fashioned stupidity?  Thinking 7-bit email is still necessary?
>
> It should still work as base64, but the encoding is doing it wrong.  I
> can't force an encoding scheme when viewing the message, it's mangled
> to begin with, and just gets worse.
>
>>> It's only messages going through the list that I notice getting
>>> mangled.
>
>> Weird.  If it were something on the mailing list side you'd think
>> that everyone would see the same issue.
>
> Chances are that most people just ignore things like it.
  


Well, that doesn't explain why I can write 楊秀茵 or Frédéric and have it
show up correctly while you cannot.

I wonder what would happen if you included 楊秀茵 in one of your messages.

--
If simple questions can be answered with a simple google query then why are 
there so many of them?


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Tim via users
On Mon, 2019-09-23 at 07:51 +0800, Ed Greshko wrote:
> For some reason I can't fathom it seems the mailing list software
> turns every Content-Transfer-Encoding to base64

Old fashioned stupidity?  Thinking 7-bit email is still necessary?

It should still work as base64, but the encoding is doing it wrong.  I
can't force an encoding scheme when viewing the message, it's mangled
to begin with, and just gets worse.

>> It's only messages going through the list that I notice getting
>> mangled.

> Weird.  If it were something on the mailing list side you'd think
> that everyone would see the same issue.

Chances are that most people just ignore things like it.
 
-- 
 
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64
 
Boilerplate:  All unexpected mail to my mailbox is automatically deleted.
I will only get to see the messages that are posted to the mailing list.
 


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Tom Horsley
On Sun, 22 Sep 2019 20:16:17 -0400
Fred Smith wrote:

> FWIW, I find these headers in your posting on my system:

For what it is worth, I glanced at this thread over on the fedora
mailing list archives (which isn't an easy thread to find over
there because someone changed the subject line of an
existing thread :-).

Some of the messages show the international characters,
and some do not. It would be interesting to examine
the detailed headers of the message where the characters
disappeared, but that seems to be impossible with
hyperkitty.


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko

On 9/23/19 8:16 AM, Fred Smith wrote:

> FWIW, I find these headers in your posting on my system:
>
> Content-Type: text/plain; charset="utf-8"; format="flowed"
> Content-Transfer-Encoding: 8bit
> X-MIME-Autoconverted: from base64 to 8bit by fcshome.stoneham.ma.us id
>  x8MNrkId016512


Your side "autoconverted" it back to what I find more reasonable.

I suppose there may still be a few non-8-bit-capable MTAs out in the world.
I've not encountered one recently.

--
If simple questions can be answered with a simple google query then why are 
there so many of them?


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Fred Smith
On Mon, Sep 23, 2019 at 07:51:45AM +0800, Ed Greshko wrote:
> On 9/23/19 12:04 AM, Tim via users wrote:
> >On Sun, 2019-09-22 at 22:52 +0800, Ed Greshko wrote:
> >>And it did not.
> >I noticed the same.  On the messages that get corrupted, I've received
> >them as base64 encoded (they weren't sent that way), though your
> >messages with Frederic correctly accented were also base64 encoded.
> >Most other messages coming from the list are just text.  Do you get
> >those messages as text or base64?
> 
> Yes.  For some reason I can't fathom it seems the mailing list software
> turns every Content-Transfer-Encoding to base64
> 
> The initial message is sent from here as
> 
> Content-Type: text/plain; charset=utf-8; format=flowed
> Content-Transfer-Encoding: 8bit
> 
> >>I think you've said a yahoo to yahoo message is fine.  Do you have a
> >>non-yahoo account you can send to directly to check?
> >Yes, that works fine.  It's only messages going through the list that I
> >notice getting mangled.
> 
> Weird.  If it were something on the mailing list side you'd think that 
> everyone would see the same issue.
> 

FWIW, I find these headers in your posting on my system:

Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from base64 to 8bit by fcshome.stoneham.ma.us id
x8MNrkId016512
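
For anyone who wants to compare such headers across copies of a
message, here is a minimal Python sketch using the standard library's
email parser.  The header values are the ones quoted above; the body
line is a placeholder invented for the example, and the continuation
line is given the leading space that folded headers require.

```python
from email import message_from_string
from email.policy import default

raw = (
    'Content-Type: text/plain; charset="utf-8"; format="flowed"\n'
    "Content-Transfer-Encoding: 8bit\n"
    "X-MIME-Autoconverted: from base64 to 8bit by fcshome.stoneham.ma.us id\n"
    " x8MNrkId016512\n"
    "\n"
    "placeholder body\n"   # hypothetical body, not from the real message
)

msg = message_from_string(raw, policy=default)

print(msg.get_content_type())              # text/plain
print(msg.get_content_charset())           # utf-8
print(msg.get_param("format"))             # flowed
print(msg["Content-Transfer-Encoding"])    # 8bit
# A relay that changed the transfer encoding recorded the fact here:
print(msg["X-MIME-Autoconverted"])
```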

-- 
 Fred Smith -- fre...@fcshome.stoneham.ma.us -
   But God demonstrates his own love for us in this: 
 While we were still sinners, 
  Christ died for us.
--- Romans 5:8 (niv) --


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko

On 9/23/19 12:04 AM, Tim via users wrote:

> On Sun, 2019-09-22 at 22:52 +0800, Ed Greshko wrote:
>> And it did not.
>
> I noticed the same.  On the messages that get corrupted, I've received
> them as base64 encoded (they weren't sent that way), though your
> messages with Frederic correctly accented were also base64 encoded.
> Most other messages coming from the list are just text.  Do you get
> those messages as text or base64?

Yes.  For some reason I can't fathom it seems the mailing list software
turns every Content-Transfer-Encoding to base64

The initial message is sent from here as

Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit

>> I think you've said a yahoo to yahoo message is fine.  Do you have a
>> non-yahoo account you can send to directly to check?
>
> Yes, that works fine.  It's only messages going through the list that I
> notice getting mangled.

Weird.  If it were something on the mailing list side you'd think that
everyone would see the same issue.

--
If simple questions can be answered with a simple google query then why are 
there so many of them?


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Tim via users
On Sun, 2019-09-22 at 22:52 +0800, Ed Greshko wrote:
> And it did not.

I noticed the same.  On the messages that get corrupted, I've received
them as base64 encoded (they weren't sent that way), though your
messages with Frederic correctly accented were also base64 encoded.
Most other messages coming from the list are just text.  Do you get
those messages as text or base64?

> I think you've said a yahoo to yahoo message is fine.  Do you have a
> non-yahoo account you can send to directly to check?

Yes, that works fine.  It's only messages going through the list that I
notice getting mangled.
 
-- 
 
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64
 
Boilerplate:  All unexpected mail to my mailbox is automatically deleted.
I will only get to see the messages that are posted to the mailing list.
 


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko
On Sun, 2019-09-22 at 22:09 +0800, Ed Greshko wrote:
> On 9/22/19 10:00 PM, Tim via users wrote:
> > On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
> > > So, I would blame yahoo.  :-) :-)
> > > Ordinarily, I'd agree.  But if I post directly through Yahoo (using
> > > their SMTP servers), and receive from a Yahoo address, international
> > > characters come through unmangled.
> > 
> > I've seen this kind of thing before, when I've posted here using
> > Thunderbird on a Mac.
> >   
> 
> I see.  I'm curious though.  If you respond to this and keep the
> contents will Frédéric survive?
> 

Just because I want to eliminate Evolution as the culprit.


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko

On 9/22/19 10:41 PM, Tim via users wrote:

>> I see.  I'm curious though.  If you respond to this and keep the
>> contents will Fr??d??ric survive?
>
> Very punny.  ;-)  Here's the reply.

And it did not.

I think you've said a yahoo to yahoo message is fine.  Do you have a
non-yahoo account you can send to directly to check?

--
If simple questions can be answered with a simple google query then why are 
there so many of them?


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Tim via users
On Sun, 2019-09-22 at 22:09 +0800, Ed Greshko wrote:
> On 9/22/19 10:00 PM, Tim via users wrote:
> > On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
> > > So, I would blame yahoo.  :-) :-)
> > 
> > > Ordinarily, I'd agree.  But if I post directly through Yahoo (using
> > > their SMTP servers), and receive from a Yahoo address, international
> > > characters come through unmangled.
> > 
> > I've seen this kind of thing before, when I've posted here using
> > Thunderbird on a Mac.
> >   
> 
> I see.  I'm curious though.  If you respond to this and keep the
> contents will Fr??d??ric survive?

Very punny.  ;-)  Here's the reply.
 
-- 
 
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64
 
Boilerplate:  All unexpected mail to my mailbox is automatically deleted.
I will only get to see the messages that are posted to the mailing list.
 


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko

On 9/22/19 10:00 PM, Tim via users wrote:

> On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
>> So, I would blame yahoo.  :-) :-)
>
> Ordinarily, I'd agree.  But if I post directly through Yahoo (using
> their SMTP servers), and receive from a Yahoo address, international
> characters come through unmangled.
>
> I've seen this kind of thing before, when I've posted here using
> Thunderbird on a Mac.

I see.  I'm curious though.  If you respond to this and keep the
contents will Frédéric survive?

--
If simple questions can be answered with a simple google query then why are 
there so many of them?


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Tim via users
On Sun, 2019-09-22 at 19:29 +0800, Ed Greshko wrote:
> So, I would blame yahoo.  :-) :-)

Ordinarily, I'd agree.  But if I post directly through Yahoo (using
their SMTP servers), and receive from a Yahoo address, international
characters come through unmangled.

I've seen this kind of thing before, when I've posted here using
Thunderbird on a Mac.
 
-- 
 
uname -rsvp
Linux 3.10.0-1062.1.1.el7.x86_64 #1 SMP Fri Sep 13 22:55:44 UTC 2019 x86_64
 
Boilerplate:  All unexpected mail to my mailbox is automatically deleted.
I will only get to see the messages that are posted to the mailing list.
 


Re: is the mailing list mangling email character-encoding

2019-09-22 Thread Ed Greshko

On 9/22/19 7:17 PM, Tim via users wrote:

> Below I've just pasted a bit of an email:
>
> On Sun, 2019-09-22 at 06:13 +0200, Fr??d??ric wrote:
>
> It's from my reply to a message.  That mangled name is supposed to be
> "Frederic" with acute accents above both letter "e"s.  It was when I
> wrote it, and my mail client *CORRECTLY* sent it as UTF-8, but
> somewhere between then and my receiving it back, the message content
> has been scrambled by some broken email software that stuffed the
> character encoding up.



My reply showed up in the archives with Frédéric correctly shown.

So, I would blame yahoo.  :-) :-)

--
If simple questions can be answered with a simple google query then why are 
there so many of them?