On 29 Nov 2018, at 17:32, Amir Caspi wrote:

B) Do you think that normalize_charset could evolve to handle HTML entities?

That would be a mess. The normalize_charset option acts on the decoded text of text/* MIME parts before that text is parsed into meaningful tokens.

I have no issue with adding a new rule type that acts on the output of a partial, well-defined HTML parse, something in between the 'rawbody' and 'body' types. But overloading normalize_charset with that, and so affecting every existing rule of all body-oriented rule types, would be bad design.
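
As a rough illustration of that separation, here is a minimal Python sketch. It is not SpamAssassin code: the function names, the regex-based tag splitting, and the sample input are all assumptions made up for this example. It only shows charset normalization and entity decoding as two distinct stages, so that rules written against the first stage's output are unaffected by the second.

```python
import html
import re

# Hypothetical stage 1: charset normalization only.
# Turns the bytes of a text/* part into Unicode text; markup and
# HTML entities are left exactly as they were.
def normalize_charset(raw: bytes, declared_charset: str) -> str:
    return raw.decode(declared_charset, errors="replace")

# Hypothetical stage 2: a partial HTML pass for a separate rule type.
# Resolves entities in the text between tags and leaves the tags alone.
# The regex is a crude stand-in for a real, well-defined HTML parser.
def decode_entities_outside_tags(text: str) -> str:
    parts = re.split(r"(<[^>]*>)", text)
    # With one capture group, odd indices hold the matched tags.
    return "".join(
        part if index % 2 == 1 else html.unescape(part)
        for index, part in enumerate(parts)
    )

raw_part = "<b>V&#105;agra</b> caf\u00e9".encode("latin-1")

normalized = normalize_charset(raw_part, "latin-1")
entity_decoded = decode_entities_outside_tags(normalized)
```

In this sketch, `normalized` still contains `V&#105;agra`, which is what existing body-oriented rules would keep seeing, while only a rule of the hypothetical new type would be matched against `entity_decoded`.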



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Available For Hire: https://linkedin.com/in/billcole