On Tue, 12 Feb 2019 at 02:41, Craig Russell <[email protected]> wrote:
>
> When this bug is fixed, perhaps you can verify that the message received
> today has the same problem
That has a different cause: the message-id extraction failed because
it expected the wrapped line to start with a space, but it starts with a tab.
This caused the parsing to fail entirely (a related bug), so the
message was not added to the database.

I'll fix that shortly.

S.

> From: Zhongming Hua 华 钟明 <[email protected]>
> No subject
> Timestamp: 5:51 PM PST
>
>
> On Feb 11, 2019, at 5:33 PM, Sam Ruby <[email protected]> wrote:
> >
> > On Mon, Feb 11, 2019 at 5:12 PM sebb <[email protected]> wrote:
> >>
> >> On Mon, 11 Feb 2019 at 21:44, Sam Ruby <[email protected]> wrote:
> >>>
> >>> That is indeed a bug in the mail gem. I contributed a patch that is a
> >>> partial fix.
> >>>
> >>> What is essential is that you only convert headers not the binary
> >>> attachment as that can corrupt images.
> >>
> >> AFAICT headers are already handled correctly even if they are
> >> LF-terminated.
> >> I assume this is because parsing looks for CR and/or LF - there's
> >> certainly some code that does this.
> >>
> >> This does not seem to be the case for body parsing.
> >> From what little I have seen, it looks as though the code tries to
> >> convert to CRLF before continuing to parse.
> >>
> >> If the body is not ascii_only then the method Mail::Utilities.to_crlf
> >> does not convert anything.
> >> I'm pretty sure that is what is happening with this email.
> >>
> >> Surely binary images have to be base64-encoded or similar before they
> >> can be sent in an email?
> >>
> >> AFAIK, CR and LF are not allowed in mail data - or have I got that wrong?
> >
> > For starters, if everybody correctly created emails according to the
> > spec, there would not be a need to provide any fixups. This also
> > explains why Craig resending the mail (with a proper email client that
> > respects the relevant RFCs) works around the problem.
> >
> > Older versions of the mail gem used to do a global conversion of LF
> > which was not preceded by a CR to CRLF (CRLF is required by the spec,
> > but again, that part of the spec is routinely ignored). In response
> > to a bug report where that code was corrupting attachments, they
> > "fixed" the code to no longer do that except for the headers, but only
> > if the headers were pure ASCII. I provided a patch which expanded it
> > to headers that are correctly encoded, and that patch was accepted.
> >
> > Clearly this is not enough.
> >
> > What probably should be done is to split the file at the first
> > \r?\n\r?\n (a blank line separates the headers from the body). Then
> > do a global change of LFs which are not preceded by a CR to a CRLF -
> > but only in the headers. Then reattach the header and body and parse
> > that.
> >
> >> S.
> >
> > - Sam Ruby
> >
> >>> On Mon, Feb 11, 2019, 4:11 PM sebb <[email protected] wrote:
> >>>
> >>>> It looks as though the mail gem does try to convert the input to CRLF
> >>>> before parsing.
> >>>> However, it only does this if the input is ASCII-only (*).
> >>>>
> >>>> Whilst this is the best approach, I think we can ignore that check and
> >>>> just convert to CRLF.
> >>>> [It may cause a very occasional glitch, but any such should be obvious.]
> >>>>
> >>>> It explains why the original email failed to parse: there are several
> >>>> non-ASCII chars in it.
> >>>> These have been converted to quoted-printable by the version you
> >>>> forwarded, thus allowing the parse to complete OK.
> >>>>
> >>>> I'll commit a fix shortly.
> >>>>
> >>>> S.
> >>>> (*) This applies to binary data, which is all that we know about the
> >>>> input at this point.
> >>>>
> >>>> On Mon, 11 Feb 2019 at 18:55, sebb <[email protected]> wrote:
> >>>>>
> >>>>> Seems to be an issue with the parser not handling certain messages
> >>>>> which have LF-only line terminators.
> >>>>>
> >>>>> I tried converting the mail file to CRLF and reparsing, and the
> >>>>> attachment appeared.
> >>>>>
> >>>>> AFAICT all messages are being stored as LF-only, and I don't know why
> >>>>> this only affects some mails.
> >>>>> The one in question is quite complicated with various quoted mails
> >>>>> before the attachment; that might have affected parsing.
> >>>>>
> >>>>> The copy of the mail sent to my GMail account appears to have CRLF
> >>>>> line terminators, yet the copies on minotaur and mbox-vm are LF-only.
> >>>>> Perhaps GMail auto-converts the mail?
> >>>>> Not sure what is happening here.
> >>>>>
> >>>>> On Mon, 11 Feb 2019 at 17:20, sebb <[email protected]> wrote:
> >>>>>>
> >>>>>> The immediate reason is that the parsed message does not have any
> >>>>>> attachments listed in the summary yml file, so does not show up in the
> >>>>>> workbench.
> >>>>>>
> >>>>>> I will look further into why the parsing did not see the attachment.
> >>>>>>
> >>>>>> S.
> >>>>>> On Mon, 11 Feb 2019 at 17:11, sebb <[email protected]> wrote:
> >>>>>>>
> >>>>>>> I'll take a look
> >>>>>>>
> >>>>>>> On Mon, 11 Feb 2019 at 16:27, Craig Russell <[email protected]> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> This seems to be happening more frequently: an ICLA shows up in
> >>>>>>>> secretary mail inbox but does not appear in workbench.
> >>>>>>>>
> >>>>>>>> If I forward the message to secretary, it then appears in
> >>>>>>>> workbench.
> >>>>>>>>
> >>>>>>>> Without violating PII, I cannot forward the message directly. But
> >>>>>>>> this is the message that was received this AM and never made it to
> >>>>>>>> workbench:
> >>>>>>>>
> >>>>>>>> From: Olivier Coutu
> >>>>>>>> Date: 6:10 AM PST
> >>>>>>>> Subject: Fwd: Re: Patch to reduce lint time
> >>>>>>>>
> >>>>>>>> Any ideas?
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Craig
> >>>>>>>>
> >>>>>>>> Craig L Russell
> >>>>>>>> Secretary, Apache Software Foundation
> >>>>>>>> [email protected] http://db.apache.org/jdo
> >>>>>>>>
> >>>>
>
> Craig L Russell
> Secretary, Apache Software Foundation
> [email protected] <mailto:[email protected]> http://db.apache.org/jdo
> <http://db.apache.org/jdo>
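
For illustration, the space-versus-tab problem mentioned at the top of this
message can be sketched in Ruby as below. This is only a sketch and not the
code that was actually changed: extract_message_id is a hypothetical helper,
and it assumes the raw header block is already available as a string. A
folded header line may continue with either a space or a tab, so the
unfolding step accepts both.

```ruby
# Hypothetical helper (not part of the mail gem): pull the Message-ID out
# of a raw header block, accepting continuation lines that start with
# either a space or a tab.
def extract_message_id(raw_headers)
  # Unfold: a line break followed by a space or tab continues the
  # previous header line, so drop the line break and keep the whitespace.
  unfolded = raw_headers.gsub(/\r?\n(?=[ \t])/, "")
  # Header names are case-insensitive; capture the <...> value.
  match = unfolded.match(/^Message-ID:[ \t]*(<[^>\r\n]+>)/i)
  match && match[1]
end

# Usage: the id sits on a tab-indented continuation line.
headers = "From: a@example.org\nMessage-ID:\n\t<abc@example.org>\nSubject: x\n"
puts extract_message_id(headers)
```

Returning nil when no Message-ID is found leaves it to the caller to decide
whether to skip the message or fall back to some other identifier.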

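Sam's suggestion quoted above (split at the first blank line, convert bare
LFs to CRLF in the headers only, then reattach the body) could look roughly
like the following Ruby sketch. It is an illustration under assumptions, not
the mail gem's implementation: crlf_headers_only is a hypothetical name, and
the input is treated as binary so that attachment bytes are never touched.

```ruby
# Hypothetical helper (not part of the mail gem): normalise line endings
# in the header block only, leaving the body bytes untouched.
def crlf_headers_only(raw)
  raw = raw.dup.force_encoding(Encoding::BINARY)
  # The first blank line separates the headers from the body.
  headers, separator, body = raw.partition(/\r?\n\r?\n/)
  # No blank line found: return the input unchanged rather than guess.
  return raw if separator.empty?
  # Convert every LF that is not already preceded by a CR into CRLF.
  headers = headers.gsub(/(?<!\r)\n/, "\r\n")
  headers + "\r\n\r\n" + body
end

# Usage: the headers gain CRLF endings, the body keeps its bare LF.
fixed = crlf_headers_only("Subject: hi\nTo: a@example.com\n\nline1\nline2")
p fixed
```

The result can then be handed to the parser as usual. Whether the body also
needs fixing up is a separate question that this sketch deliberately leaves
alone.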