On Jun 2, 2005, at 8:27 PM, Matt Kettler wrote:
If one's wrong, they are ALL wrong.

SA's rule scores are evolved based on a real-world test of a hand-sorted corpus of fresh spam and ham. The whole scoreset is evolved simultaneously to optimize the placement pattern.

Of course, one thing that can affect accuracy is spam accidentally misplaced into the ham pile, which can bias the scores heavily. A little of this is unavoidable, since human mistakes happen, but a lot of it will deflate the scores and cause a lot of FNs.

The rule scores are optimized for the spam that was being sent when that version of SA was released (actually, when the rule scoreset was calculated). Since then, the static SA rules have become less useful because spammers now write their messages to avoid them. The only rules that spammers cannot easily avoid are the dynamic ones: Bayes and the network checks (RBLs, URIBLs, Razor, etc.).

On my systems, I raise the scores for the dynamic tests since they are the only ones which hit a lot of today's spam.
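
As a rough sketch of what such an override could look like in local.cf: the rule names below are ones I believe ship in the stock 3.0.x ruleset, so check that they exist in your version, and the numbers are made up purely for illustration. They are not the scores used on the systems described above.

```
# local.cf: illustrative values only, not recommendations.
# A single number after the rule name applies to all four scoresets.

# Bayes
score BAYES_99        4.5

# Network checks (assumed rule names; verify against your ruleset)
score RAZOR2_CHECK    2.5
score URIBL_SBL       3.0
score URIBL_SC_SURBL  4.0
score URIBL_WS_SURBL  3.0
```

With the default required_score of 5.0, keeping each individual score below 5 means no single dynamic test can flag a message on its own. Running "spamassassin --lint" after editing should catch syntax errors.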

