On Fri, 2002-03-29 at 17:10, Matthew Cline wrote:
> On Friday 29 March 2002 04:40 pm, Marsha Hanchrow wrote:
> 
> > Why in the world is "body Contains at least 3 dollar signs in a row
> > CASHCASHCASH -0.839" now scored as a negative?
> 
> Now SA uses a Genetic Algorithm (GA) to set the scores.  If any spam that $$$ 
> appears in would have been marked as spam even without the CASHCASHCASH rule, 
> then the score for CASHCASHCASH will randomly wander up and down.  I think 
> Craig is working on a fix for this.

"randomly" and "fix" are perhaps inaccurate.  The score for $$$
probably drifted below 0 because there are in fact non-spams which
contain $$$ in the body (say, as a separator or something).  In the
corpus, there are 1445 spams and 266 nonspams which match this rule. 
Since the spams probably matched lots of other rules too, they get
flagged regardless, so a slightly negative score for this rule can help
claw back points for those of the 266 nonspams which otherwise might be
false positives.
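
To make the argument concrete, here is a rough sketch in Python.  It is
not SA's GA or its real data: the three-message corpus, the RULE_A and
RULE_B names and their scores are all made up for illustration, and the
5.0 threshold is assumed.  It only shows that a small negative score on
CASHCASHCASH can remove a false positive without losing any spam.

```python
THRESHOLD = 5.0  # assumed spam threshold for this sketch


def total_score(hits, scores):
    """Sum the scores of every rule a message matched."""
    return sum(scores[rule] for rule in hits)


def count_errors(corpus, scores, threshold=THRESHOLD):
    """Return (false_positives, false_negatives) for a set of scores."""
    false_pos = false_neg = 0
    for hits, is_spam in corpus:
        flagged = total_score(hits, scores) >= threshold
        if flagged and not is_spam:
            false_pos += 1
        elif not flagged and is_spam:
            false_neg += 1
    return false_pos, false_neg


# Hypothetical toy corpus: (rules matched, is it really spam?)
corpus = [
    ({"CASHCASHCASH", "RULE_A", "RULE_B"}, True),  # spam hits many rules
    ({"CASHCASHCASH", "RULE_A", "RULE_B"}, True),
    ({"CASHCASHCASH", "RULE_A"}, False),  # nonspam using $$$ as a separator
]

# Hypothetical scores for the other rules.
base = {"RULE_A": 4.5, "RULE_B": 3.0}
positive = dict(base, CASHCASHCASH=0.8)
negative = dict(base, CASHCASHCASH=-0.839)

print("positive score:", count_errors(corpus, positive))
print("negative score:", count_errors(corpus, negative))
```

With the positive score the nonspam is pushed over the threshold; with
the negative score it is not, and both spams are still caught because
the other rules carry them.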

C

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
