After some tests, I have some more data to share:
Using DSPAM 3.6.8 with hash_drv: After retraining a message 5 times, dspam_dump
returns:
[EMAIL PROTECTED] dspam]# dspam_dump [EMAIL PROTECTED] Anfang
6792127458520707072 S: 00005 I: 00000 P: 0.9900
Using DSPAM 3.4.9 with libdb4_drv, after retraining the same message 58 times:
[EMAIL PROTECTED] dspam]# dspam_dump [EMAIL PROTECTED] Anfang
6792127458520707072 S: 00058 I: 00000 P: 0.4000
3.4.9 does not classify the mail as spam even after retraining exactly the
same message 58 times (I think this happens with mysql_drv, too).
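
To compare the two dump lines side by side, something like the following
Python sketch could be used. It assumes the dspam_dump output looks exactly
like the two lines pasted above (token, S: spam hits, I: innocent hits,
P: probability); the name parse_dump_line is made up for illustration and is
not part of DSPAM.

```python
import re

# Assumed layout of one dspam_dump line, taken from the output pasted above:
#   <token> S: <spam hits> I: <innocent hits> P: <probability>
LINE_RE = re.compile(r"^(\d+)\s+S:\s*(\d+)\s+I:\s*(\d+)\s+P:\s*([0-9.]+)$")


def parse_dump_line(line):
    """Split one dump line into its four fields (hypothetical helper)."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("unexpected dspam_dump line: %r" % line)
    token, spam_hits, innocent_hits, probability = match.groups()
    return {
        "token": int(token),
        "spam_hits": int(spam_hits),
        "innocent_hits": int(innocent_hits),
        "probability": float(probability),
    }


# The two lines from the 3.6.8 (hash_drv) and 3.4.9 (libdb4_drv) machines.
new_run = parse_dump_line("6792127458520707072 S: 00005 I: 00000 P: 0.9900")
old_run = parse_dump_line("6792127458520707072 S: 00058 I: 00000 P: 0.4000")

# Same token on both machines, so the counters can be compared directly.
for name, run in (("3.6.8", new_run), ("3.4.9", old_run)):
    print(name, run["spam_hits"], run["innocent_hits"], run["probability"])
```

This only makes the comparison easier to read: same token, more spam hits on
3.4.9, yet a much lower P. It does not explain why P stays at 0.4000.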
What puzzles me is that dspam.debug (3.4.9) shows the following lines when I
retrain a mail:
5782: [3/10/2007 23:59:28] processing signature. length: 4192
5782: [3/10/2007 23:59:28] reversing 262 tokens
5782: [3/10/2007 23:59:28] reclassifying iteration 1 result: 0
5782: [3/10/2007 23:59:28] libdspam returned probability of 1.000000
5782: [3/10/2007 23:59:28] message result: SPAM
5782: [3/10/2007 23:59:28] appending header X-DSPAM-Reclassified: Spam
So it looks to me as if the retraining was applied, but unfortunately DSPAM
still does not recognize the message as spam...
I can even do some corpus training. The returned email then contains these
headers:
X-DSPAM-Result: Spam
X-DSPAM-Confidence: 0.9998
X-DSPAM-Probability: 1.0000
But if I run classify on exactly the same mail afterwards, I only get
X-DSPAM-Result: [EMAIL PROTECTED]; result="Innocent"; probability=0.0023;
confidence=1.00
The only difference between the two machines is that 3.6.8 starts catching the
spam after 5 runs while 3.4.9 does not. However, I don't want to switch to
DSPAM 3.6 right now (this should be done *after* the new servers are working).
fs