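> I know there are theoretical reasons why this might make sense, but I don't
> see any benefit in the real world for scores like these. The high scores
> increase the chance of a random false positive - regardless of the size of
> the existing corpus - and if the negative ones indicate that the rules are
> useless, they should just be removed.

To me, negatively scored tests can also be used to "offset" spammy-looking
content in clean email. I have several of my own creation:

body CORRECT_FOR_EXCHANGE /This message is in MIME format/
describe CORRECT_FOR_EXCHANGE Correct for MIME 'null block'

body GROUPS_YAHOO /http:\/\/groups\.yahoo\.com\/group\//
describe GROUPS_YAHOO Yahoo Groups message list

header FWD_MSG Subject =~ /\[?Fwd?:?\s*/
describe FWD_MSG Forwarded email

header GROUPS_MSN Message-Id =~ /.*\@groups\.msn\.com/
describe GROUPS_MSN MSN Groups Message List

body MAILBITS_EMAIL /This is a free service provided by MailBits\.com\./
body HOTMAIL_FOOTER1 /Send and receive Hotmail on your mobile device: /
body HOTMAIL_FOOTER2 /Get your FREE download of MSN Explorer at /
body HOTMAIL_FOOTER3 /Get Your Private, Free E-mail from MSN Hotmail at http:\/\/www\.hotmail\.com\./
body HOTMAIL_FOOTER4 /Join the world.s largest e-mail service with MSN Hotmail\./
body HOTMAIL_FOOTER5 /Chat with friends online, try MSN Messenger:/
body MSN_FOOTER1 /MSN Photos is the easiest way to share and print your photos: /
body MSN_FOOTER2 /Remove my e-mail address from Gaming Zone /

These are all assigned mid-size negative scores (-1 to -2.4) to try to
counteract some of the positively scored tests that these emails trigger.

IMO a negative score doesn't show that the test is bad, but rather that it
is a test for NON-spam email.

> Anyway, I still have a sneaking suspicion that there are a few thousand
> messages from the spamassassin-talk mailing list (talking about spam, and
> sometimes quoting it) in the non-spam corpus.

Very likely. I am maintaining a folder of mis-detected email (non-spam
detected as spam) so I can feed these into the GA and help out with the
"hairy-assed edge" of spam and nonspam. :-)

Regards,
Andrew

_______________________________________________
Spamassassin-talk mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk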
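
A rough illustration of the offsetting idea in the message above, written as a minimal Python sketch rather than in SpamAssassin's own configuration syntax. It is not how SpamAssassin is implemented. The two positive rule names, every score value, and the threshold of 5.0 are assumptions made up for the example; only HOTMAIL_FOOTER1 and GROUPS_YAHOO are names taken from the message, and their scores here are simply picked from the stated -1 to -2.4 range.

```python
# Hypothetical sketch: a message's total is the sum of the scores of the
# rules that fired on it, and it is flagged once that total reaches a threshold.
SPAM_THRESHOLD = 5.0  # assumed value for this example

# Assumed scores. The HYPOTHETICAL_* rules stand in for positively scored
# tests that a clean Hotmail or Yahoo Groups message might trigger.
SCORES = {
    "HYPOTHETICAL_FREE_OFFER": 2.8,
    "HYPOTHETICAL_CLICK_HERE": 2.5,
    "HOTMAIL_FOOTER1": -1.5,  # illustrative, within the -1 to -2.4 range
    "GROUPS_YAHOO": -2.0,     # illustrative, within the -1 to -2.4 range
}


def total_score(hits, scores=SCORES):
    """Sum the scores of every rule that fired on a message."""
    return sum(scores[name] for name in hits)


def is_spam(hits, scores=SCORES, threshold=SPAM_THRESHOLD):
    """Flag the message when its total score reaches the threshold."""
    return total_score(hits, scores) >= threshold


# A clean message whose footer text trips two positive tests.
hits_without_offset = ["HYPOTHETICAL_FREE_OFFER", "HYPOTHETICAL_CLICK_HERE"]
# The same message when the negative footer rule is also defined and fires.
hits_with_offset = hits_without_offset + ["HOTMAIL_FOOTER1"]

if __name__ == "__main__":
    for label, hits in (("without offset", hits_without_offset),
                        ("with offset", hits_with_offset)):
        print(label, round(total_score(hits), 2), is_spam(hits))
```

With these made-up numbers the message crosses the threshold when only the positive tests fire, and falls back under it once the negative footer rule is counted, which is the "test for NON-spam email" effect described above.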
To me, -ve scores on tests can also be used to "offset" spammy messages in clean email. I have several of these of my own creation: body CORRECT_FOR_EXCHANGE /This message is in MIME format/ describe CORRECT_FOR_EXCHANGE Correct for MIME 'null block' body GROUPS_YAHOO /http:\/\/groups\.yahoo\.com\/group\// describe GROUPS_YAHOO Yahoo Groups message list header FWD_MSG Subject =~ /\[?Fwd?:?\s*/ describe FWD_MSG Forwarded email header GROUPS_MSN Message-Id =~ /.*\@groups\.msn\.com/ describe GROUPS_MSN MSN Groups Message List body MAILBITS_EMAIL /This is a free service provided by MailBits\.com\./ body HOTMAIL_FOOTER1 /Send and receive Hotmail on your mobile device: / body HOTMAIL_FOOTER2 /Get your FREE download of MSN Explorer at / body HOTMAIL_FOOTER3 /Get Your Private, Free E-mail from MSN Hotmail at http:\/\/www\.hotmail\.com\./ body HOTMAIL_FOOTER4 /Join the world.s largest e-mail service with MSN Hotmail\./ body HOTMAIL_FOOTER5 /Chat with friends online, try MSN Messenger:/ body MSN_FOOTER1 /MSN Photos is the easiest way to share and print your photos: / body MSN_FOOTER2 /Remove my e-mail address from Gaming Zone / These are all assigned mid-size (-1 to -2.4) negative scores to try and counteract some of the +ve scored tests that these emails receive. IMO -ve scored tests don't show the test is bad, but rather that it is a test for NON-spam email. > Anyway, I still have a sneaking suspicion that there are a few thousand > messages from the spamassassin-talk mailing list (talking about spam, and > sometimes quoting it) in the non-spam corpus. Very likely. I am maintaining a folder of mis-detected email (non-spam detected as spam) so I can run these into the GA and help out with the "hairy-assed edge" of spam and nonspam. :-) Regards, Andrew _______________________________________________ Spamassassin-talk mailing list [EMAIL PROTECTED] https://lists.sourceforge.net/lists/listinfo/spamassassin-talk