Matt Sergeant wrote:

> Wasn't there Unicode normalisation in the original email parser that I submitted to the project (which Theo turned into the current parser)?
>
> Certainly it would make sense to use that if you could. It works very well on a very large set of test data.

That code only deals with MIME-labeled charsets. It has no provision for charset detection.
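To make the distinction concrete, here is a minimal sketch in Python (not the SpamAssassin Perl code) of normalization that relies only on the MIME charset label. The function name normalize_labeled is hypothetical. When the label is missing, unknown, or wrong, it returns None, which is exactly the case where detection would be needed.

```python
import codecs


def normalize_labeled(raw, charset):
    """Decode raw bytes using the MIME charset label only.

    Hypothetical helper for illustration. Returns the decoded text, or
    None when the label is absent, unknown, or does not match the bytes.
    No charset detection is attempted.
    """
    if not charset:
        return None  # unlabeled part: nothing to go on without detection
    try:
        codecs.lookup(charset)  # is this a charset name we know?
    except LookupError:
        return None
    try:
        return raw.decode(charset)
    except UnicodeDecodeError:
        return None  # label did not match the actual bytes
```

A correctly labeled ISO-8859-1 part decodes as expected, while an unlabeled or mislabeled part falls through untouched.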

The code puts charset normalization inside Mail::SpamAssassin::Message::Node::decode(). I don't think charset normalization is appropriate for the decode call that is used when parsing message/rfc822 objects.
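Roughly what I have in mind, as a Python sketch rather than a patch against the Perl module: keep the transfer decoding step returning raw bytes, and apply charset normalization in a separate step used only for text that goes to the rules. Both function names are made up for illustration, and the sketch assumes the charset label is one the runtime knows.

```python
import base64
import quopri


def decode_transfer(body, encoding):
    """Undo the Content-Transfer-Encoding only; always returns bytes.

    This is the step an embedded message/rfc822 part would need before
    being handed back to the parser. No charset handling happens here.
    """
    enc = (encoding or "7bit").strip().lower()
    if enc == "base64":
        return base64.b64decode(body)
    if enc == "quoted-printable":
        return quopri.decodestring(body)
    return body  # 7bit, 8bit, binary: already raw


def text_for_rules(body, encoding, charset):
    """Transfer-decode, then normalize to Unicode text as a separate step.

    Assumes `charset` is a known label; falls back to us-ascii when absent.
    """
    raw = decode_transfer(body, encoding)
    return raw.decode(charset or "us-ascii", errors="replace")
```

With this split, the parser path calls only decode_transfer(), and the bytes of a nested message are never rewritten before its own headers have been read.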

