-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


John Gardiner Myers writes:
> Justin Mason wrote:
>
> >+1 on the idea of charset normalization to a UTF-8 form, at least for the
> >headers and one rendering of the message body. I think that should be the
> >"body" rendering. I also think that the "rawbody" and "full" renderings
> >should remain un-normalized, as should the "Header:raw" pseudo-header
> >selector.
> >
> >
> I agree with "full" and ":raw"--charset normalization makes no sense if
> you aren't removing transfer encodings. I was thinking it was
> appropriate for rawbody since that more correctly represents the HTML form.
>
> One could argue either way for the correct "rawbody" semantic.
> Whichever one we choose, I can see SpamAssassin adding a new rule class
> for the other semantic.

if necessary. I'm not sure it'll be necessary, and I'd prefer to avoid
increasing the number of renderings held in RAM. In my opinion, the
correct "rawbody" rendering is un-normalized.

> >+1 on fixing Mail::SpamAssassin::HTML as described.
> >
> >
> Which description? There are three choices for when to skip the pack calls:
>
> 1) When charset normalization is enabled
> 2) When the installed HTML::Parser is 3.43 or later
> 3) Always (upping the minimum requirement for HTML::Parser to 3.43)

I think conditioning on version checks (perl < 5.8 || HTML::Parser < 3.43)
is likely to be the least painful option. It may be worthwhile making
charset normalization depend on that version check, too, if it's not going
to work in HTML mail without that HTML::Parser fix.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.5 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFDC4XRMJF5cimLx9ARAltjAKCnQT6ImK3NTnoEvdebsw5d2xbtpQCgvHIp
aEXjkLMOaWbBWQ3p38Hm/8Y=
=qlzk
-----END PGP SIGNATURE-----
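
For illustration, the version check proposed above might look roughly like the sketch below. It rests on one reading of the message: the pack calls stay in place when perl is older than 5.8 or HTML::Parser is older than 3.43, and charset normalization is only offered when neither is true. The helper name needs_pack_workaround and the variable names are hypothetical and are not taken from the SpamAssassin source.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of SpamAssassin: returns 1 when the
# pack() workaround should still be applied, i.e. when either perl or
# HTML::Parser is older than the versions named in the message.
sub needs_pack_workaround {
    my ($perl_version, $parser_version) = @_;
    return 1 if $perl_version < 5.008;
    # Treat a missing HTML::Parser the same as one that is too old.
    return 1 if !defined $parser_version || $parser_version < 3.43;
    return 0;
}

# HTML::Parser may not be installed where this sketch runs, so load it
# inside eval and fall back to undef if that fails.
my $parser_version = eval { require HTML::Parser; HTML::Parser->VERSION };

# $] holds the running perl version as a number, e.g. 5.008008.
my $use_workaround = needs_pack_workaround($], $parser_version);

# Assumption: normalization is only allowed when the workaround is unneeded.
my $allow_normalization = !$use_workaround;

printf "pack workaround: %s, charset normalization allowed: %s\n",
    $use_workaround      ? "yes" : "no",
    $allow_normalization ? "yes" : "no";
```

Keeping the decision in one function would let both the HTML code path and the normalization option consult the same check, which is the coupling the message suggests may be worthwhile.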
