On 07/13/2016 04:38 AM, Yasuhito FUTATSUKI wrote:
>
>> I think it is better to hold string attributes of mm_cfg and mlist class
>> as Unicode than site_language code or list's preferred language code
>> encoded (but I know it is so trouble to do so).
> And then on pattern matching on message pipeline is done with Unicode
> rather than list's prefered language.

I have been working on a change to do exactly that, i.e., collect the headers for matching against header_filter_rules as unicode and match the patterns as unicode.

This is very difficult to do on a list whose preferred_language character set doesn't support the characters in the header_filter_rules patterns, e.g., trying to match Chinese characters in Subject: headers on an English language list.

The major issue is that when the character set of the form is, say, us-ascii and one enters non-ascii characters in a pattern, it is up to the browser to decide how to encode them. In at least one case with Firefox, characters which are in the windows-1252 character set are encoded as windows-1252 and the others are sent as XML numeric character references. I can easily convert the XML numeric references to unicode, but I don't know in what charset the other characters are encoded.

<sigh> Yes, I know that everything should be unicode and utf-8 encoded regardless of language, but that isn't going to happen in MM 2.1.

-- 
Mark Sapiro <m...@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
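
For illustration only, here is a minimal Python 3 sketch of the idea described above: turn a submitted pattern into unicode by expanding XML numeric character references, decode a header value to unicode, and match the two. It is not the Mailman 2.1 code (which is Python 2), and all function names here are hypothetical. The `guessed_charset` default of windows-1252 is an assumption based only on the one Firefox case mentioned above; as noted, the real charset of the non-reference bytes is not known in general.

```python
import re
from email.header import decode_header, make_header

# Matches decimal (&#27979;) and hexadecimal (&#x6d4b;) numeric references.
_NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));")


def decode_numeric_refs(text):
    """Replace XML numeric character references with the characters they name."""
    def _sub(match):
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        try:
            return chr(codepoint)
        except (ValueError, OverflowError):
            # Not a valid code point: leave the reference untouched.
            return match.group(0)
    return _NUMERIC_REF.sub(_sub, text)


def pattern_to_unicode(raw, guessed_charset="windows-1252"):
    """Decode a submitted pattern (bytes) to unicode.

    ASSUMPTION: bytes that are not part of a numeric reference are in
    guessed_charset. This is a guess, not something the browser tells us.
    """
    text = raw.decode(guessed_charset, errors="replace")
    return decode_numeric_refs(text)


def header_to_unicode(value):
    """Decode an RFC 2047 encoded header value to a unicode string."""
    return str(make_header(decode_header(value)))


def header_matches(pattern, header_value):
    """Match a unicode pattern against the unicode form of a header value."""
    return re.search(pattern, header_to_unicode(header_value),
                     re.IGNORECASE) is not None
```

With these helpers, a pattern submitted as `b"&#x6d4b;&#x8bd5;"` would be compared against an encoded Subject: value such as `=?utf-8?b?5rWL6K+V?=` entirely in unicode, independent of the list's preferred_language charset. The unresolved part remains the charset guess in `pattern_to_unicode`.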