On Thu, May 3, 2018 at 8:46 AM, sebb <[email protected]> wrote:
> I've been looking at the mail to secretary@ that caused a problem recently.
>
> I think the issue may be that the headers are not all ASCII, there are
> a couple of o-umlauts.
> If these are replaced with plain 'o's then the headers are parsed OK.
>
> This causes message.rb to crash in the rescue block
>
> @335: from = mail[:from].value.sub(/\s+<.*?>$/)
>
> because mail[:from] is nil.
>
> That could be avoided by using .to_s rather than .value, but that
> causes the headers to be saved as a single blob with the key of the
> first header.
>
> A possible solution might be to trap all errors and set some dummy
> values for the Yaml file
> This would at least allow the user to be alerted to the issue.
>
> But ideally the parser should be persuaded to handle non-ASCII header values.
> They are not allowed, but they seem to be quite common.

It is slightly more complicated than that. The mail gem actually will
correctly parse headers with non-ASCII characters if the headers are
separated by \r\n. A correctly formed email message uses \r\n as the
separator between headers.

Once upon a time, the mail gem would essentially do the equivalent of
s/\n/\r\n/ on such emails, but that would occasionally corrupt
attachments. In essence, that code was removed with the intention of
bringing it back for cases where the headers were otherwise correctly
encoded. I authored a patch (which was accepted, but I can't find it right
now) which did that if the headers were pure ASCII.

If that patch can be found, that thread included instructions on restoring
the old behavior using a method call with the word "unsafe" or somesuch in
it. Of course, that could corrupt attachments, which for our use case can
be bad.

The safest way to do this is to separate the headers from the body, fix
the headers, and then reattach the two before parsing:

https://github.com/apache/whimsy/blob/6830b808866e140bd0f436c2cd02f9c66527fcc8/www/secretary/workbench/models/message.rb#L318

Perhaps this code could be put in lib/whimsy/asf someplace?

- Sam Ruby
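
For illustration, the split/fix/reattach approach described above might look roughly like the following. This is only a sketch, not the code at the linked URL: the method name normalize_header_line_endings is hypothetical, and it assumes the raw message is held in a single string whose headers end at the first blank line.

```ruby
# Hypothetical helper (name is an assumption, not part of whimsy or the mail gem).
# Converts bare LF line endings to CRLF in the header block only, leaving the
# body (and therefore any attachments) byte-for-byte unchanged.
def normalize_header_line_endings(raw)
  # Work on a binary copy so non-ASCII header bytes are never transcoded.
  raw = raw.b

  # Headers end at the first blank line, whichever line ending is in use.
  headers, separator, body = raw.partition(/\r?\n\r?\n/)

  # Fix the headers: every LF or CRLF becomes CRLF.
  headers = headers.gsub(/\r?\n/, "\r\n")

  # No blank line found: the message is headers only.
  return headers if separator.empty?

  # Reattach the untouched body after a CRLF CRLF separator.
  headers + "\r\n\r\n" + body
end
```

The result could then be handed to the parser, for example Mail.read_from_string in the mail gem. Because only the header block is rewritten, this avoids the attachment corruption that a global s/\n/\r\n/ can cause.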
