Re: [AMaViS-user] amavisd-new + dspam...

Gary V Fri, 07 Oct 2005 10:11:20 -0700

Nathanael wrote:

> Can anyone direct me to reasonably detailed information on how
> amavisd-new works together with dspam?  Can I take advantage of all of
> the dspam featureset if using it within amavisd?  I have looked and
> can't find any real documentation on the way they interoperate and how
> much of the full power of dspam is available when used with amavisd.


> Thanks,

I have not used dspam outside amavisd-new, but I would have to assume
that dspam is somewhat crippled when run from within amavisd-new.

I have recently set it up on my system, and I am still in the process
of letting SpamAssassin train dspam (trying to get to 2500 messages).

I gather that in a typical system, you must manually train dspam for
it to work with any accuracy. Because dspam has not been trained,
everything dspam sees initially is considered innocent. Hard coded in
amavisd-new are a ham level (< 0.5) and a spam level (> 7.0) that are
used to retrain dspam. The message is fed to dspam, dspam returns
'Innocent' or 'Spam', then the SpamAssassin score is used to determine if
dspam needs to be retrained. If it does, the message is fed back
through dspam to retrain it. This bit of code shows some of this:

if (defined $dspam && $dspam ne '' && defined $spam_level) {  # auto-learn
  my($eat,@options);
  @options = (qw(--stdout --mode=tum --user), $daemon_user);  # --mode=teft
  if (   $spam_level >  7.0 && $dspam_result eq 'Innocent') {
    $eat = 'SPAM'; push(@options, qw(--class=spam --source=error));
  }
  elsif ($spam_level <  0.5 && $dspam_result eq 'Spam') {
    $eat = 'HAM'; push(@options, qw(--class=innocent --source=error));
  }
  # ... (excerpt ends here; the rest of the block is not shown)

I actually changed the numbers to 0.2 and 8.0 on my system.
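
As a rough illustration of the decision made in that excerpt, here is a
minimal Python sketch of the same two branches. It is not amavisd-new
code: the function name retrain_action and its parameter names are made
up for this example, and the defaults are the stock 0.5 and 7.0 levels.

```python
from typing import Optional


def retrain_action(spam_level: float, dspam_result: str,
                   ham_max: float = 0.5, spam_min: float = 7.0) -> Optional[str]:
    """Return 'SPAM' or 'HAM' if dspam should be retrained, else None.

    Illustrative only: mirrors the two branches of the Perl excerpt.
    """
    if spam_level > spam_min and dspam_result == "Innocent":
        return "SPAM"   # dspam missed a message SpamAssassin scored high
    if spam_level < ham_max and dspam_result == "Spam":
        return "HAM"    # dspam flagged a message SpamAssassin scored low
    return None         # anything in between never triggers retraining


print(retrain_action(9.3, "Innocent"))    # high score, dspam said Innocent
print(retrain_action(5.136, "Innocent"))  # middle score, no retraining
print(retrain_action(0.1, "Spam"))        # low score, dspam said Spam
```

With the 0.2 and 8.0 values mentioned above, a call such as
retrain_action(7.5, "Innocent", ham_max=0.2, spam_min=8.0) would return
None, so fewer borderline messages are fed back to dspam.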

Since dspam assumes everything is innocent at first, it is nearly always
retrained on spam (which I believe is the way you train dspam). Here are
my current stats:

                TS Total Spam:                124
                TI Total Innocent:            880
                SM Spam Misclassified:        118
                IM Innocent Misclassified:      0
                SC Spam Corpusfed:              0
                IC Innocent Corpusfed:          0
                TL Training Left:            1620
                SR Spam Catch Rate:        51.24%
                IR Innocent Catch Rate:   100.00%
                OR Overall Rate/Accuracy:  89.48%
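
The percentages above can be reproduced from the raw counters. The
formulas in this short Python sketch are inferred from the numbers, not
taken from the dspam source, and the 2500 figure is the training target
mentioned earlier; the variable names are only illustrative.

```python
# Counters copied from the stats above.
ts, ti = 124, 880   # Total Spam, Total Innocent
sm, im = 118, 0     # Spam Misclassified, Innocent Misclassified

# Assumed formulas (inferred from the figures, not from dspam itself).
spam_catch = 100 * ts / (ts + sm)
innocent_catch = 100 * ti / (ti + im)
overall = 100 * (ts + ti) / (ts + ti + sm + im)
training_left = 2500 - ti   # assumes the 2500 target counts innocent mail

print(f"SR {spam_catch:.2f}%  IR {innocent_catch:.2f}%  "
      f"OR {overall:.2f}%  TL {training_left}")
```

Under those assumptions the output agrees with the SR, IR, OR and TL
lines shown above.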

At the moment, in local.cf, I have placed:

header DSPAM_SPAM X-DSPAM-Result =~ /^Spam$/
describe DSPAM_SPAM DSPAM claims it is spam
score DSPAM_SPAM 0.5

header DSPAM_HAM X-DSPAM-Result =~ /^Innocent$/
describe DSPAM_HAM DSPAM claims it is ham
score DSPAM_HAM -0.1

So as you can see, the dspam result is used by SpamAssassin, and
initially we are very conservative with the numbers. Once dspam is
fully trained, its accuracy will improve, and at that point dspam can
be relied upon enough that the DSPAM_SPAM (and DSPAM_HAM) rules can be
given scores that make dspam more effective (I'm thinking 2.0 - 3.5).

Here is a sample header of a message that was passed to a user. Note
that it scored below 8.0, so amavisd-new did not use this message to
retrain dspam. Note the Bayes score as well (BAYES_50: SpamAssassin
could not decide if it was spam or ham).

X-DSPAM-Result: Innocent
X-DSPAM-Confidence: 0.9997
X-DSPAM-Probability: 0.0000
X-DSPAM-Signature: 4343db5590271237217540
X-DSPAM-Factors: 27,
X-Virus-Scanned: amavisd-new at example.com
X-Spam-Status: Yes, score=5.136 required=5 tests=[BAYES_50=0.001,
 DCC_CHECK=2.169, DIGEST_MULTIPLE=0.098, DSPAM_HAM=-0.1,
 MARKETING_PARTNERS=1.401, RAZOR2_CF_RANGE_51_100=0.056, RAZOR2_CHECK=1.511]
X-Spam-Score: 5.136
X-Spam-Level: *****
X-Spam-Flag: YES

A side note:
I first used BDB as the storage method. Any message that had over 15K
of text in the message body caused the child process to time out. I
switched to MySQL 4.1, and the problem was solved. It was
not trivial (for me) to set this up, and for some reason I cannot get
it to work on one of my machines, but two others work great.

After installing all the required MySQL libraries, on my Debian
machine I compiled dspam with:

./configure --with-storage-driver=mysql_drv --with-mysql-libraries=/usr/lib \
 --with-mysql-includes=/usr/include/mysql --enable-virtual-users \
 --with-dspam-home=/var/lib/amavis/dspam --enable-signature-headers \
 --without-delivery-agent --without-quarantine-agent --enable-debug

Paths will vary. I would be interested in how others may have
compiled it for use with MySQL 4.1, as this was the part I was least
certain about.

Gary V


