[Dspam-user] spam_train does not work

Jehan Pagès Thu, 05 Feb 2009 02:36:50 -0800

Hi,

I have installed dspam for a postfix/dovecot setup (I don't think this
matters, but just in case), version 3.8.0-r15, installed from the Gentoo
package manager and built with clamav (not yet activated), daemon, mysql,
virtual-users and debug. I am currently trying to train dspam with the
whole spamassassin corpus. The command I run (from the man page):


 dspam_train [email protected] spam/ hard_ham/

I am not watching the training output all the time, but it looks like it
always fails on spam and always succeeds on ham. At least, every time I
looked at the output of dspam_train I saw this behaviour, even after
having trained on thousands of spam and ham messages. An extract:

[...]
[test: nonspam] 00246.fdaacadac7143848978ea0af07 result: PASS
[test: spam   ] 00491.28cb63173ed4740180e45e6248 result: FAIL (Innocent)
        [fn] Subject: Finally! Sexy DVDs for FREE. Christmas is Great! NMV
[test: spam   ] 00492.73db79fb9ad03aff1e08deb73b result: FAIL (Innocent)
        [fn] Subject: Today's Special: Amazing Penetrations No. 17 29264
[test: nonspam] 00247.42534d5df0700cb2adf240556c result: PASS
[test: spam   ] 00493.1c5f59825f7a246187c137614f result: FAIL (Innocent)
        [fn] Subject: GOV'T GUARANTEED HOME BUSINESS
[test: spam   ] 00494.fd2efa67e63247ee89cdcf3a6f result: FAIL (Innocent)
        [fn] Subject: [ILUG] MANUEL OKO
[...]

Then, if I look at the stats of the user I am training:

# dspam_stats -H [email protected]
[email protected]:
                TP True Positives:              0
                TN True Negatives:           5619
                FP False Positives:             0
                FN False Negatives:          4745
                SC Spam Corpusfed:          14920
                NC Nonspam Corpusfed:           0
                TL Training Left:               0
                SHR Spam Hit Rate           0.00%
                HSR Ham Strike Rate:        0.00%
                OCA Overall Accuracy:      54.22%
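
As a sanity check on these numbers, here is a minimal sketch that recomputes
the rates from the four counters. The formulas are my assumption (share of
correct verdicts among classified messages), not something taken from the
dspam source, and the variable names are only illustrative:

```python
# Counters copied from the dspam_stats output above.
tp, tn, fp, fn = 0, 5619, 0, 4745

# Assumed formulas (not taken from the dspam source):
# accuracy is the share of correct verdicts among all classified messages,
# spam hit rate is the share of spam messages that were caught.
classified = tp + tn + fp + fn
accuracy = 100.0 * (tp + tn) / classified
spam_hit_rate = 100.0 * tp / (tp + fn)

print(f"classified messages: {classified}")
print(f"overall accuracy:    {accuracy:.2f}%")
print(f"spam hit rate:       {spam_hit_rate:.2f}%")
```

Under that assumption the 54.22% is just the ham share of the classified
messages: every ham passed and every spam was missed.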

First, note that it fails around half the time here, but also that the
nonspam corpusfed count is zero! It is as though I never fed the trainer
any ham (even though I did, as you can see in the extract above). I
searched the web for this issue and found some similar problems, but no
clear answer (or I searched with the wrong terms). Could you help me
understand this and make my training succeed? Currently, in my test
mailbox, all received spam is considered "innocent" (in fact I have never
had a single email flagged as spam, even after all this training). Is it a
configuration issue? A bug?
Thanks all.

Jehan

