At 07:56 PM 6/2/2005, Jason Haar wrote:
DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST,FROM_HAS_MIXED_NUMS,RCVD_IN_NJABL_DUL,RCVD_IN_SORBS_DUL
scantime=4.4,size=1435,mid=<[EMAIL PROTECTED]>,autolearn=disabled
This had a Subject line of "russian XXXXX unusably in action fervid" - so
I'm guessing it was spam (;-) - even though it only got a score of 3/5.
Obviously the default values are set that way to imply a level of
"confidence" in each rule, but I wonder if they need updating? I guess
I'm referring to the scores in 50_scores.cf.
e.g. RCVD_IN_NJABL_PROXY has a value of 1.0 - and yet the FAQ on the NJABL
web site (of course) tells you to set "score NJABL_PROXY 3.0" :-)
But the wonderful authors of SA know far more than I do - so are the
current levels still deemed to be correct?
If one's wrong, they are ALL wrong.
SA's rule scores are evolved based on a real-world test of a hand-sorted
corpus of fresh spam and ham. The whole scoreset is evolved simultaneously
to optimize the placement pattern, i.e. which side of the threshold each
message in the corpus ends up on.
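
To make the "if one's wrong, they are ALL wrong" point concrete, here is a toy sketch. It is not SA's actual rescoring tool; the rule names, scores, and three-message corpus are all made up for illustration, and it assumes a message is flagged once its total reaches the threshold. It only shows that bumping a single score in isolation can trade one error for another, while adjusting the set together can clear both.

```python
THRESHOLD = 5.0  # assumed cutoff: flag a message once its total reaches this


def count_errors(scores, corpus, threshold=THRESHOLD):
    """Return (false_positives, false_negatives) for a whole scoreset."""
    fp = fn = 0
    for hits, is_spam in corpus:
        flagged = sum(scores[rule] for rule in hits) >= threshold
        if flagged and not is_spam:
            fp += 1
        elif not flagged and is_spam:
            fn += 1
    return fp, fn


# Hypothetical hand-sorted corpus: (rules the message hit, is it really spam?)
CORPUS = [
    (["RULE_A", "RULE_B"], True),
    (["RULE_A", "RULE_C"], False),
    (["RULE_B", "RULE_C"], True),
]

baseline = {"RULE_A": 1.0, "RULE_B": 2.0, "RULE_C": 2.5}
raise_a_only = dict(baseline, RULE_A=3.0)  # tweak one score in isolation
joint = {"RULE_A": 1.0, "RULE_B": 4.0, "RULE_C": 1.0}  # adjust the set together

print(count_errors(baseline, CORPUS))      # (0, 2)
print(count_errors(raise_a_only, CORPUS))  # (1, 1)
print(count_errors(joint, CORPUS))         # (0, 0)
```

Raising RULE_A alone catches the first spam but pushes the ham over the threshold, because the ham hits RULE_A too. That overlap between rules is why a single score can't be judged outside the scoreset it was evolved with.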
Of course, one thing that can hurt accuracy is spam accidentally
misplaced into the ham pile, which can bias the scores heavily. A little
bit of this is unavoidable, as human mistakes happen, but a lot of it
will cause deflated scores and a lot of false negatives (FNs).
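
The deflation effect can be sketched with another toy example. Again, this is not how SA actually computes scores: the single rule, the candidate score grid, and the 10x penalty on false positives are assumptions chosen only to show the mechanism. A fitter that works hard to avoid flagging "ham" will push a rule's score down as soon as a misfiled spam hitting that rule shows up in the ham pile.

```python
THRESHOLD = 5.0
FP_WEIGHT = 10  # assumption: a false positive costs far more than a false negative


def cost(score, corpus):
    """Weighted error count when one rule carries the given score."""
    total = 0
    for hits_rule, labeled_spam in corpus:
        flagged = (score if hits_rule else 0.0) >= THRESHOLD
        if flagged and not labeled_spam:
            total += FP_WEIGHT
        elif not flagged and labeled_spam:
            total += 1
    return total


def best_score(corpus, candidates=(0.0, 1.0, 2.0, 3.0, 4.0, 5.0)):
    """Pick the candidate score with the lowest cost (first one wins ties)."""
    return min(candidates, key=lambda s: cost(s, corpus))


# Hypothetical corpus entries: (did the rule hit?, label given by the sorter)
clean = [(True, True), (True, True), (True, True), (False, False), (False, False)]
# Same messages, but one spam was misfiled into the ham pile.
dirty = [(True, True), (True, True), (True, False), (False, False), (False, False)]

print(best_score(clean))  # 5.0
print(best_score(dirty))  # 0.0
```

With the clean corpus the rule earns a score high enough to catch all three spams. With one misfiled message, every score that would catch spam now also produces a heavily penalized "false positive", so the fitter settles on a deflated score and all of the real spam slips through.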