Re: [dspam-users] Train dspam

Steve Tue, 05 Aug 2008 14:28:55 -0700

-------- Original Message --------
> Date: Tue, 5 Aug 2008 11:14:44 +0300
> From: s91066 <[EMAIL PROTECTED]>
> To: [email protected]
> CC: [EMAIL PROTECTED]
> Subject: [dspam-users] Train dspam


> I am trying to understand how dspam is trained. My setup is IMAP + Maildir,
> so I have created a Junk and a NoSpam directory, to which users move spam
> and false positives respectively.
> I run a script every hour to collect this data. The script trains dspam
> with:
> dspam --user $USER --class=spam --source=error < $j
> where $USER is the username (not the mail address) and $j is a file that
> is spam but was classified as Innocent.
> 
> Now, what I cannot understand is this:
> I have a lot of emails with the same subject and almost identical bodies.
> I have trained dspam on those emails as errors. However, I still receive
> them!
> Since I have the emails, I ran dspam from the command line to see the
> classification result:
> dspam --mode=notrain --user username --classify --stdout < mail_file
> The result was:
> X-DSPAM-Result: username; result="Innocent"; class="Innocent"; 
> probability=0.0000; confidence=1.00; signature=489787f2131472612618147
> 
> So, why? The message was fed to dspam just a couple of minutes ago, with
> the same command as above (source=error).
> 
It is quite simple. You called DSPAM with "--mode=notrain", which means the
command will NOT train DSPAM. You also passed "--classify" and "--stdout",
which means the command will CLASSIFY the message and print the whole output
to the screen. Do you see that the result contains a signature? Now ask
yourself how that signature got there when you told DSPAM NOT to train and
only to CLASSIFY. A signature is only created when the message gets tagged,
and tagging should not happen with "--mode=notrain --classify". The
explanation: the message you fed to DSPAM ALREADY HAS A SIGNATURE. That is
the problem. Could you try this and tell me what the outcome is:

sed \
  -e "/^\(X\-Quarantine\-ID:\|X\-OSBF\-Lua\-Score:\|X\-CRM114\-[a-zA-Z]*:\|X\-\(DKIM\|SenderID\):\|X\-Virus\-Scanned:\|X\-Greylist:\|X\-DCC\-.*\-Metrics:\|X\-\(Virus\|Pyzor\|Razor\)\-Status:\|X\-Delivery\-Agent:\|Received\-SPF:\|X\-policyd\-weight:\|X\-Spam\-[^:]*:\) .*$/d" \
  -e "/^X\-Amavis\-OS\-Fingerprint:/,+1d" \
  -e "/^X\-DSPAM\-Result\:/,/^X\-DSPAM\-Signature: [0-9a-f,]*$/d" \
  -e "s/^Subject: \(\(ADV\|UNS\):[\t ]\{1,99\}\)\{0,1\}\(\[[+-]\{1,2\}\][\t ]\{1,99\}\)\{0,1\}\(\[SPAM\][\t ]\{1,99\}\)\{0,1\}/Subject: /" \
  -e "s/\!DSPAM:[0-9]\{0,9\},\{0,1\}[0-9a-f]\{1,32\}\!//g" \
  mail_file | dspam --user username --classify --stdout --mode=notrain --deliver=summary


Do you still get DSPAM reporting that message as Innocent? Probably not. Right?
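
If you only want to check the DSPAM-specific part, a smaller sketch along
these lines may be enough. The function names has_dspam_marker and
strip_dspam_markers are made up for this example and are not part of DSPAM.
Unlike the sed command above, it is assumed here that dropping every
X-DSPAM-* header and every inline !DSPAM:...! tag is acceptable, and the
headers added by other filters are left alone.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with DSPAM) for inspecting a stored mail.

# Succeeds if the file carries an X-DSPAM-Signature header or an inline
# !DSPAM:...! tag, i.e. if DSPAM has already tagged this message.
has_dspam_marker() {
    grep -E -q '^X-DSPAM-Signature:|!DSPAM:[0-9]*,?[0-9a-f]+!' "$1"
}

# Prints the message without X-DSPAM-* headers (including folded
# continuation lines) and without inline !DSPAM:...! tags.
strip_dspam_markers() {
    awk '
        !body && /^$/                    { body = 1 }         # first blank line ends the headers
        !body && /^X-DSPAM-[A-Za-z-]*:/  { skip = 1; next }   # drop a DSPAM header line
        !body && skip && /^[ \t]/        { next }             # drop its folded continuation
        { skip = 0; gsub(/!DSPAM:[0-9]*,?[0-9a-f]+!/, ""); print }
    ' "$1"
}
```

With these defined, "has_dspam_marker mail_file && echo tagged" tells you
whether the file was already tagged, and
"strip_dspam_markers mail_file | dspam --user username --classify --stdout --mode=notrain"
classifies the cleaned copy.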


> Shouldn't dspam report the file as spam?
>
No. See above.


> Thank you 
> Peter
> 
Steve