From: "Mark Sapiro" <[EMAIL PROTECTED]>
Mark Sapiro wrote:
Helmut Schneider wrote:
Interesting, with "^subject:.*Declined.*"

Subject: Declined: [Somelist] Invitation to workshop on 13rd Dec. 2008

matches while

Subject: [Somelist] Declined:  Invitation to workshop on 13rd Dec. 2008

does not. Huh?!


It turns out that RFC 2047 encoded headers are not decoded before
matching against the regexps. Is that the issue here? What do the raw
headers look like?
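
To illustrate the point above, here is a minimal Python 3 sketch (Mailman 2.1 itself runs on Python 2). The subject text and the base64 encoded-word are made up for the demonstration, and the IGNORECASE and MULTILINE flags are an assumption about how the rule is applied. It only shows that a regexp written against the readable subject cannot match the raw RFC 2047 form, but does match once the header is decoded.

```python
import base64
import re
from email.header import decode_header, make_header

# The rule from the original report; the flags are an assumption.
pattern = re.compile(r'^subject:.*Declined.*', re.IGNORECASE | re.MULTILINE)

# Hypothetical subject, as a mail client might send it: one base64 encoded-word.
subject = '[Somelist] Declined: Invitation to workshop'
encoded_word = '=?utf-8?b?%s?=' % base64.b64encode(subject.encode('utf-8')).decode('ascii')

# What the regexp sees when the header is not decoded first.
raw_line = 'Subject: ' + encoded_word

# What it sees after RFC 2047 decoding.
decoded_line = 'Subject: ' + str(make_header(decode_header(encoded_word)))

raw_match = pattern.search(raw_line) is not None
decoded_match = pattern.search(decoded_line) is not None
print(raw_line)
print(decoded_line)
print(raw_match, decoded_match)
```

Whether this explains the original report depends on what the raw headers actually contain, which is why they are worth checking.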

I think that the headers should be decoded, but I wonder if people are
currently working around this with regexps that match encoded headers
and wouldn't match decoded headers.


I have developed a patch for SpamDetect.py which will decode RFC 2047
encoded headers. This is somewhat problematic because the decoded
headers will presumably contain non-ascii characters, and while the
character sets of the headers are known (and there can be different
headers or even different parts of a single header encoded in different
character sets), the character set of the regexps in header_filter_rules
is not known.

The patch creates a unicode object containing all the headers unfolded
and RFC 2047 decoded with one complete header per line and then encodes
it into the character set of the list's preferred_language, and this
result is what the regexps will search. As long as the regexps contain
only ascii and the raw headers contain no non-ascii characters, this
should give the expected results. If the regexps contain non-ascii
characters, or the headers contain non-ascii characters that are not
RFC 2047 encoded, results may be unexpected.
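
The approach described above can be sketched roughly as follows. This is not the actual SpamDetect.py patch; it is a Python 3 approximation. The function name headers_for_matching, the list_charset parameter, and the choice to replace unencodable characters with '?' are all assumptions made for illustration.

```python
import base64
import re
from email import message_from_string
from email.header import decode_header, make_header

def headers_for_matching(msg, list_charset='us-ascii'):
    """Return all headers, unfolded and RFC 2047 decoded, one per line."""
    lines = []
    for name, value in msg.items():
        # Unfold: drop line breaks that are followed by whitespace.
        unfolded = re.sub(r'\r?\n(?=[ \t])', '', str(value))
        decoded = str(make_header(decode_header(unfolded)))
        lines.append('%s: %s' % (name, decoded))
    text = '\n'.join(lines)
    # Assumption: characters the list charset cannot represent become '?'.
    return text.encode(list_charset, 'replace').decode(list_charset)

# Hypothetical message with a folded, partly encoded Subject header.
encoded_word = '=?utf-8?b?%s?=' % base64.b64encode(b'[Somelist] Declined:').decode('ascii')
raw = ('From: someone@example.com\n'
       'Subject: ' + encoded_word + '\n'
       ' Invitation to workshop\n'
       '\n'
       'body\n')

msg = message_from_string(raw)
text = headers_for_matching(msg)
matched = re.search(r'^subject:.*Declined.*', text,
                    re.IGNORECASE | re.MULTILINE) is not None
print(text)
print(matched)
```

Round-tripping through the list charset keeps the searched text in one known encoding, at the cost of losing any characters that charset cannot represent.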

If, in fact, the original issue is due to RFC 2047 encoded headers, try
the patch and let us know how it works.

As far as I can see, this patch works great. As a positive side effect, is it possible that this patch also affects uncaught bounces? I now receive lots of uncaught bounces where a SPAM-filter was required before the patch.

Thanks a lot, Helmut
------------------------------------------------------
Mailman-Users mailing list
Mailman-Users@python.org
http://mail.python.org/mailman/listinfo/mailman-users
Mailman FAQ: http://wiki.list.org/x/AgA3
Searchable Archives: http://www.mail-archive.com/mailman-users%40python.org/
Unsubscribe: 
http://mail.python.org/mailman/options/mailman-users/archive%40jab.org

Security Policy: http://wiki.list.org/x/QIA9
