I think the Genetic Algorithm (GA) assigns all the scores now. GAs are very
powerful optimization tools, and if the GA lowered those scores, it likely
raised other scores to compensate, ones tied to more common spam signatures.

The GA is only as good as the population of data it is run on.  Craig posted
an e-mail a few days (weeks?) ago stating he was looking for some sample
non-spam e-mail.  I'd be willing to contribute some business-oriented
e-mails, as I think the corpus as it stands leans slightly towards
tech e-mails.
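
To make the compensation point concrete, here is a toy sketch in Python. It
is not SpamAssassin's actual GA: the corpus is made up, OTHER_SPAM_SIGN is a
hypothetical rule name, and the 5.0 threshold and the 0.01 to 5.0 score range
are assumptions for illustration only. It just shows how scores evolved
against a corpus trade off against each other.

```python
import random

THRESHOLD = 5.0            # assumed spam cutoff for this sketch
LOW, HIGH = 0.01, 5.0      # assumed score bounds (0.01 mirrors the quoted list)
# OTHER_SPAM_SIGN is a hypothetical rule, not one from 50_scores.cf.
RULES = ["CLICK_BELOW", "SUBJ_REMOVE", "MASS_EMAIL", "OTHER_SPAM_SIGN"]

# Made-up corpus: (set of rules the message hit, is it really spam?)
CORPUS = [
    ({"CLICK_BELOW", "OTHER_SPAM_SIGN"}, True),
    ({"SUBJ_REMOVE", "OTHER_SPAM_SIGN"}, True),
    ({"MASS_EMAIL", "OTHER_SPAM_SIGN"}, True),
    ({"CLICK_BELOW", "SUBJ_REMOVE"}, False),   # e.g. a legitimate newsletter
    ({"CLICK_BELOW", "MASS_EMAIL"}, False),
    ({"SUBJ_REMOVE", "MASS_EMAIL"}, False),
    (set(), False),
]

def total(scores, hits):
    """Sum the scores of every rule a message hit."""
    return sum(scores[r] for r in hits)

def errors(scores, corpus):
    """Count messages the score set misclassifies (lower is fitter)."""
    return sum((total(scores, hits) >= THRESHOLD) != is_spam
               for hits, is_spam in corpus)

def evolve(corpus, generations=200, pop_size=30, seed=0):
    """Evolve one score per rule; the fitter half survives each generation."""
    rng = random.Random(seed)
    pop = [{r: rng.uniform(LOW, HIGH) for r in RULES} for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda s: errors(s, corpus))
        parents = pop[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Crossover: take each rule's score from one parent or the other.
            child = {r: rng.choice((a[r], b[r])) for r in RULES}
            # Mutation: nudge one score, clamped to the allowed range.
            r = rng.choice(RULES)
            child[r] = min(HIGH, max(LOW, child[r] + rng.gauss(0, 0.5)))
            children.append(child)
        pop = parents + children
    return min(pop, key=lambda s: errors(s, corpus))

if __name__ == "__main__":
    best = evolve(CORPUS)
    for rule in RULES:
        print(f"score {rule:<20} {best[rule]:.2f}")
    print("misclassified:", errors(best, CORPUS))
```

In this toy corpus the three "removal"-style rules also show up in pairs on
non-spam, so no pair of them can add up to the threshold, and the GA has to
lean on the remaining rule to catch the spam. Change the corpus and the
scores shift with it, which is why the mix of sample mail matters.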

Gene

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of
> CertaintyTech - Ed Henderson
> Sent: Thursday, February 14, 2002 11:28 AM
> To: [EMAIL PROTECTED]
> Subject: [SAtalk] SA 2.01 low scores
>
>
> I have been seeing a lot more spam get through (false negatives) in v2.01 than
> with v1.5.  I have been comparing the scores of 1.5 with 2.01 to see why.
> Here is an interesting discovery:  there are several scores in the
> 50_scores.cf file that are 0.01 in value:
>
> 50_scores.cf:score A_HREF_TO_UNSUB                0.01
> 50_scores.cf:score BULK_EMAIL                     0.01
> 50_scores.cf:score CLICK_BELOW                    0.01
> 50_scores.cf:score EXCUSE_12                      0.01
> 50_scores.cf:score EXCUSE_13                      0.01
> 50_scores.cf:score EXCUSE_14                      0.01
> 50_scores.cf:score EXCUSE_15                      0.01
> 50_scores.cf:score EXCUSE_7                       0.01
> 50_scores.cf:score FORM_W_MAILTO_ACTION           0.01
> 50_scores.cf:score HUNZA_DIET_BREAD               0.01
> 50_scores.cf:score JAVASCRIPT                     0.01
> 50_scores.cf:score MAILTO_WITH_SUBJ_REMOVE        0.01
> 50_scores.cf:score MASS_EMAIL                     0.01
> 50_scores.cf:score ONE_TIME_MAILING               0.01
> 50_scores.cf:score PORN_7                         0.01
> 50_scores.cf:score REMOVE_IN_QUOTES               0.01
> 50_scores.cf:score REMOVE_SCRIPT                  0.01
> 50_scores.cf:score SEXY_PICS                      0.01
> 50_scores.cf:score SUBJ_REMOVE                    0.01
> 50_scores.cf:score TO_INVESTORS                   0.01
>
> Most of these, if they existed in 1.5, had significantly higher scores, like
> CLICK_BELOW, EXCUSE_*, *SUBJ_REMOVE, etc.  Why have they been changed in
> v2.01? Is it a mistake?  I believe these scores should be reviewed.  If they
> were a little higher, most of my false negatives would have been caught.
>
> Thanks,
> Ed.
>
>
> _______________________________________________
> Spamassassin-talk mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
>
