On Thu, 2006-01-05 at 13:48 +0000, Tony Finch wrote:
> MailScanner has gained an interesting optimization recently, that may be
> worth adding to Exim. It takes a checksum of the message body, and if that
> matches the checksum of a message that has already been received it skips
> the full SpamAssassin and anti-virus scan and instead pulls the results
> out of a cache. This seems to be more effective than you might expect,
> even without any kind of fuzzy-matching in the checksum.

I haven't seen as many of the short random padding strings in the stuff
I see recently. A couple of years back, many messages appeared to have a
short random string at the beginning or end of the body, and usually in
the subject as well.

I guess the thing to do here is checksum only the *body*, i.e. no
headers, so that there are no Received: and tracking headers polluting
things.
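
As a rough illustration of the body-only idea (a sketch, not anything
that exists in Exim or MailScanner): the names split_body and
body_signature are made up here, and SHA-256 is just my assumption for
the checksum. The headers are taken to end at the first empty line.

```python
import hashlib
from typing import Tuple


def split_body(raw: bytes) -> bytes:
    """Return everything after the first empty line (the header/body break)."""
    candidates = []
    # Accept both CRLF and bare-LF line endings; use whichever break comes first.
    for sep in (b"\r\n\r\n", b"\n\n"):
        idx = raw.find(sep)
        if idx != -1:
            candidates.append((idx, len(sep)))
    if not candidates:
        return b""  # no blank line found: treat the message as headers only
    idx, sep_len = min(candidates)
    return raw[idx + sep_len:]


def body_signature(raw: bytes) -> Tuple[int, str]:
    """Hypothetical signature: (body size in bytes, SHA-256 hex digest of the body)."""
    body = split_body(raw)
    return len(body), hashlib.sha256(body).hexdigest()
```

Two copies of the same body that arrived by different routes give the
same signature, because the differing headers never reach the hash.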

However, it would mean that the content scanning code also has to keep a
persistent database, with appropriate aging of old information.  We would
also need to be careful of things like empty bodies (all have the same
size and checksum).  Maybe this wants to be a couple of functions: one to
get a message body signature (used as a key, maybe the size and
checksum), and the ability to store/retrieve data, ideally with a
lifetime, which could be used in ACLs.
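
To make the store/retrieve-with-lifetime idea concrete, here is a
minimal sketch. Everything in it is hypothetical: ScanCache and its
methods are invented names, and an in-memory dict stands in for the
persistent database that a real implementation would need. The key is
assumed to be a (size, checksum) pair, and zero-size bodies are never
cached because they would all share one key.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Signature = Tuple[int, str]  # (body size, checksum), as suggested above


class ScanCache:
    """Hypothetical cache of scan results keyed by body signature."""

    def __init__(self, lifetime: float,
                 clock: Callable[[], float] = time.time) -> None:
        self._lifetime = lifetime  # seconds an entry stays valid
        self._clock = clock        # injectable so aging can be tested
        self._entries: Dict[Signature, Tuple[float, str]] = {}

    def store(self, sig: Signature, result: str) -> None:
        if sig[0] == 0:
            return  # empty bodies all look alike; do not cache them
        self._entries[sig] = (self._clock() + self._lifetime, result)

    def retrieve(self, sig: Signature) -> Optional[str]:
        entry = self._entries.get(sig)
        if entry is None:
            return None
        expires, result = entry
        if self._clock() >= expires:
            del self._entries[sig]  # age out stale information on lookup
            return None
        return result
```

A caller would try retrieve() first and only run the full scan, followed
by store(), when it returns None.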

        Nigel.
-- 
[ Nigel Metheringham           [EMAIL PROTECTED] ]
[ - Comments in this message are my own and not ITO opinion/policy - ]



-- 
## List details at http://www.exim.org/mailman/listinfo/exim-dev Exim details 
at http://www.exim.org/ ##
