https://issues.apache.org/SpamAssassin/show_bug.cgi?id=5861


Justin Mason <[EMAIL PROTECTED]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|Undefined                   |3.3.0




--- Comment #7 from Justin Mason <[EMAIL PROTECTED]>  2008-04-10 01:48:34 PST ---
(In reply to comment #5)
> So is there something that can help with these short messages, that don't
> create many tokens? When there aren't enough body tokens, by default all those
> hammy header tokens are sure to prevent correct scoring. It forces me to
> ignore such headers.

Training on error should help -- train mostly on FPs and FNs from now on.
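A minimal sketch of what "train on error" means, in Python rather than SpamAssassin's own Perl.  The classify and learn callables are hypothetical stand-ins for whatever scanner and learner a site uses; only messages the classifier gets wrong (FPs and FNs) are fed back for training.

```python
def train_on_error(messages, classify, learn):
    """Learn only from misclassified messages.

    messages: iterable of (text, is_spam) pairs with known-correct labels.
    classify: hypothetical callable, text -> bool (True means "spam").
    learn:    hypothetical callable, (text, is_spam) -> None.
    Returns the number of messages that were trained on.
    """
    trained = 0
    for text, is_spam in messages:
        if classify(text) != is_spam:  # false positive or false negative
            learn(text, is_spam)
            trained += 1
    return trained


# Toy stand-ins, purely for illustration.
def toy_classify(text):
    return "viagra" in text.lower()


learned = []
corpus = [
    ("cheap viagra", True),            # correctly flagged, skipped
    ("cheap pills", True),             # false negative, trained
    ("lunch today", False),            # correctly passed, skipped
    ("viagra study results", False),   # false positive, trained
]
count = train_on_error(corpus, toy_classify, lambda t, s: learned.append((t, s)))
print(count, learned)
```

Messages the classifier already gets right add no tokens this way, which keeps the db smaller.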

> Also whats the deal with saving those X-Spam-Relays-Internal tokens? I ignored
> it since I can't figure out any purpose to bloat my db.

Consider a site with 2 MXes -- a primary and a secondary MX.  Both are listed as
IPs in internal_networks.  For some reason, spammers tend to like sending spam
via the secondary.  The presence of that MX's IP in the
'X-Spam-Relays-Internal' hdr therefore becomes a spam sign for that site.
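To make the secondary-MX case concrete, here is a rough Python sketch.  The estimate below is a naive relative-frequency ratio, not the exact formula SpamAssassin uses, and the counts are invented for illustration.

```python
def naive_spam_prob(spam_hits, total_spam, ham_hits, total_ham):
    """Naive per-token spam probability from relative frequencies.

    This is a simplification for illustration only; it is not the
    computation performed in Bayes.pm.
    """
    spam_rate = spam_hits / total_spam
    ham_rate = ham_hits / total_ham
    if spam_rate + ham_rate == 0:
        return 0.5  # token never seen: no evidence either way
    return spam_rate / (spam_rate + ham_rate)


# Hypothetical counts: the secondary MX's IP shows up mostly in spam,
# while the primary MX's IP shows up equally in ham and spam.
secondary_mx = naive_spam_prob(spam_hits=90, total_spam=100, ham_hits=5, total_ham=100)
primary_mx = naive_spam_prob(spam_hits=50, total_spam=100, ham_hits=50, total_ham=100)
print(f"secondary MX token: {secondary_mx:.3f}")
print(f"primary MX token:   {primary_mx:.3f}")
```

With these made-up counts the secondary MX token lands well above 0.5, while the primary MX token sits at exactly 0.5.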

If, on the other hand, a token appears equally in both ham and spam:

  - its P value will tend towards the middle ground: 0.5
  - this means that it will fail the $MIN_PROB_STRENGTH test:

    # Should we ignore tokens with probs very close to the middle ground (.5)?
    # tokens need to be outside the [ .5-MPS, .5+MPS ] range to be used.
    our $MIN_PROB_STRENGTH = 0.346;

  - tokens inside that range are unused

  - unused tokens don't have their access times updated, and therefore
    are expired from the Bayes db.
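The chain above (weak token, unused, stale access time, expired) can be sketched in Python.  Only the 0.346 constant comes from the excerpt; the function names, the integer timestamps, the cutoff, and the treatment of values exactly on the boundary are assumptions made for illustration, not the Bayes.pm implementation.

```python
MIN_PROB_STRENGTH = 0.346  # value quoted from the excerpt above


def is_used(prob, mps=MIN_PROB_STRENGTH):
    """True if prob lies outside the [.5-MPS, .5+MPS] range.

    Boundary handling here is an assumption.
    """
    return abs(prob - 0.5) >= mps


def scan(token_probs, atimes, now):
    """Use the strong tokens of one message and refresh only their atimes."""
    used = [tok for tok, prob in token_probs.items() if is_used(prob)]
    for tok in used:
        atimes[tok] = now
    return used


def expire(atimes, cutoff):
    """Drop tokens whose last access time is older than cutoff."""
    stale = [tok for tok, atime in atimes.items() if atime < cutoff]
    for tok in stale:
        del atimes[tok]
    return stale


# Hypothetical tokens and timestamps.
probs = {"secondary-mx-ip": 0.95, "primary-mx-ip": 0.5}
atimes = {"secondary-mx-ip": 100, "primary-mx-ip": 100}
used = scan(probs, atimes, now=200)
expired = expire(atimes, cutoff=150)
print(used, expired, atimes)
```

The token that sits at 0.5 is never used, so its access time stays old and it is the one dropped at expiry.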

Thanks for the patch -- I'll apply it.  We should probably be running a
10-fold cross-validation to confirm, but I'm a bit busy, and on a hunch I
think it's a good idea. ;)

: jm 573...; svn commit -m "bug 5861: add DKIM-Signature and
DomainKey-Signature to the set of headers whose contents are ignored for Bayes;
their presence is marked, however.  thanks to Henrik Krohns"
lib/Mail/SpamAssassin/Plugin/Bayes.pm
Sending        lib/Mail/SpamAssassin/Plugin/Bayes.pm
Transmitting file data .
Committed revision 646688.
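A rough Python sketch of the behaviour described in the commit message: for the two signature headers, the contents are dropped but a single presence marker is still emitted.  The token format and function name are hypothetical and do not reflect how Bayes.pm actually names its tokens.

```python
# Headers whose contents are ignored; only their presence is recorded.
IGNORED_CONTENT_HEADERS = {"dkim-signature", "domainkey-signature"}


def header_tokens(headers):
    """Turn (name, value) header pairs into illustrative tokens."""
    tokens = []
    for name, value in headers:
        key = name.lower()
        if key in IGNORED_CONTENT_HEADERS:
            # Contents skipped: one marker token, whatever the value is.
            tokens.append(f"{key}:<present>")
        else:
            tokens.extend(f"{key}:{word}" for word in value.split())
    return tokens


sample = [
    ("DKIM-Signature", "v=1; a=rsa-sha256; d=example.com"),
    ("Subject", "hello world"),
]
tokens = header_tokens(sample)
print(tokens)
```

The point of the marker is that the signature's ever-changing contents add no tokens to the db, while "this mail was signed" remains usable as a signal.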

