Re: [Clamav-users] PhishingScanURLs is dreadfully slow/CPU-intensive

John Rudd Tue, 30 Oct 2007 07:45:50 -0800

Daniel T. Staal wrote:
> On Tue, October 30, 2007 10:15 am, David F. Skoll said:
> 
>> (Our customers, in fact, always run ClamAV in conjunction with an
>> anti-spam scanner, so it's no benefit to them to have Clam try to do
>> anti-spam.)
> 
> I usually find it a detriment: ClamAV is nowhere _near_ as good at
> distinguishing spam/phish emails as most spam filters, and is much more
> prone to false-positives in particular.  So a 'spam' match from ClamAV
> means 'examine this file manually' whereas a spam match from a spam filter
> goes in the spambucket where it can safely be ignored/deleted unless
> there is a reason to check it.


In general, the difference between AV and AS systems tends to be that AV
systems are signature based, whereas SpamAssassin-like AS systems are
heuristic based.  Signature based AS systems tend to be rather accurate
in terms of not having false positives, but they also tend to have the
vulnerability that they never catch new spam (they have to be trained
for each variant of a given spam form).  Signature based software, IME,
also tends to be faster than heuristic based software.  Heuristic
software tends to be slower, but is able to detect "new" spam strains
more effectively.

IMO: it's good to have both approaches to AS in your inventory.  First 
do signature based checks, because they're faster and should only 
eliminate known spam.  If the signature system flags the message, then 
don't submit it to the heuristic system ... thus lightening the CPU 
overhead and average message latency imposed by the heuristic system.
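
A minimal sketch of that ordering, in Python.  The names scan_message,
signature_scan and heuristic_scan are hypothetical stand-ins for whatever
scanners you actually run; this is not ClamAV or SpamAssassin API, just
the control flow:

```python
from typing import Callable, Optional

# Hypothetical scanner interface (an assumption for this sketch): a scanner
# takes the raw message and returns the name of the rule or signature that
# matched, or None when nothing matched.
Scanner = Callable[[bytes], Optional[str]]


def scan_message(
    message: bytes,
    signature_scan: Scanner,
    heuristic_scan: Scanner,
) -> Optional[str]:
    """Run the cheap signature check first, then the heuristic check."""
    hit = signature_scan(message)
    if hit is not None:
        # Already flagged as known spam: skip the expensive heuristic pass.
        return f"signature:{hit}"

    hit = heuristic_scan(message)
    if hit is not None:
        return f"heuristic:{hit}"

    # Neither stage flagged the message.
    return None
```

The heuristic scanner only ever sees messages the signature stage did not
flag, which is where the CPU and latency savings would come from.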

The problem I see with the Anti-Spam material in ClamAV is that it's not 
purely signature based.  It tries to be a little more speculative, and 
in the process gives up some of the advantages of a signature based 
scanning method: it loses some speed (in some cases, a lot of speed), 
and some accuracy.  But I don't see a problem with the approach in 
general ... I just think that if you're going to do AS work in ClamAV, 
you should limit it to signature based AS work, and not attempt to be 
heuristic about it.



As for what I do when ClamAV finds spam... I have code in my ClamAV 
module that does this:

1) If it's a spam or phishing signature, accept the message and add a header
identifying the message as "spam" and indicating which signature was 
triggered.  Better consistency of naming schemes for spam/phishing rules 
among the different signature sources could have made that easier, though.

2) If it's any other signature, reject the message during SMTP with a 
message indicating which signature was triggered.
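
Roughly, the two rules above look like the sketch below.  This is not my
actual module, just an illustration in Python.  The substring test for
"spam"/"phish", the X-Spam-Signature header name and the 550 5.7.1 reply
text are all assumptions made for the example; real signature names vary
by source, which is exactly the naming problem mentioned in #1.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumption: spam/phishing signatures can be recognised by these
# substrings in the signature name. Real naming schemes are less uniform.
SPAM_MARKERS = ("spam", "phish")


@dataclass
class Disposition:
    action: str                                   # "accept" or "reject"
    headers: Dict[str, str] = field(default_factory=dict)
    smtp_reply: Optional[str] = None              # set only on reject


def is_spam_signature(name: str) -> bool:
    """Guess whether a signature name denotes spam or phishing."""
    lowered = name.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def dispose(signature: Optional[str]) -> Disposition:
    """Map a scanner result to an SMTP-time decision."""
    if signature is None:
        # Clean message: accept untouched.
        return Disposition("accept")

    if is_spam_signature(signature):
        # Rule 1: accept, but tag the message with the triggering signature.
        # The header name is hypothetical.
        return Disposition(
            "accept",
            headers={"X-Spam-Signature": f"spam; {signature}"},
        )

    # Rule 2: any other signature is rejected during SMTP, naming the
    # signature in the reply. The reply code is an illustrative choice.
    return Disposition(
        "reject",
        smtp_reply=f"550 5.7.1 Message rejected: matched signature {signature}",
    )
```

A later spam filter or delivery rule could then key off the added header
instead of the message being refused outright.
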


In practice, I haven't found the need to actually turn on #1, but the 
code is there.  We did have one recent false positive, though (out of 
millions of messages scanned, even using the third party signature 
sources).  We're discussing whether or not to turn that feature on.

_______________________________________________
Help us build a comprehensive ClamAV guide: visit http://wiki.clamav.net
http://lurker.clamav.net/list/clamav-users.html
