On 29 Nov 2018, at 17:32, Amir Caspi wrote:
B) Do you think that normalize_charset could evolve to handle HTML entities?
That would be a mess. The normalize_charset option acts on the decoded text of text/* MIME parts before that text is parsed into meaningful tokens.
I have no issue with adding a new rule type that acts on the output of a partial but well-defined HTML parse, something in between the 'rawbody' and 'body' types. But overloading normalize_charset to do that, and thereby affecting every existing rule of every body-oriented rule type, would be bad design.
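To make the distinction concrete, here is a minimal sketch in Python. It is not SpamAssassin code. The function names, the sample text and the idea that the intermediate view only resolves character references are all assumptions made for illustration. It shows a separate, opt-in view for new rules, while the view that existing rules see stays untouched.

```python
import html
import re


def rawbody_view(decoded_text: str) -> str:
    """Stand-in for the existing view: decoded text, markup and entities untouched."""
    return decoded_text


def entity_decoded_view(decoded_text: str) -> str:
    """Hypothetical intermediate view: character references resolved, tags kept."""
    return html.unescape(decoded_text)


def rule_hits(pattern: str, view: str) -> bool:
    """Apply one regex 'rule' to one view of the message text."""
    return re.search(pattern, view, re.IGNORECASE) is not None


# Made-up sample: the decoded text of a text/html part that hides a word
# behind numeric character references.
part = "<p>Cheap v&#105;agra, &amp; n&#111; prescription</p>"

if __name__ == "__main__":
    # An existing rule keeps seeing exactly what it saw before.
    print(rule_hits(r"viagra", rawbody_view(part)))
    # Only a rule written against the new view sees the resolved entities.
    print(rule_hits(r"viagra", entity_decoded_view(part)))
```

The point of the sketch is that entity handling lives in its own view, so a rule that deliberately matches the literal "&#105;" in the untouched text still works. Folding the same step into charset normalization would change the input of every such rule at once.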
-- 
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole