howard chen wrote:

I am new to SA, I have read through some of the faq and wiki, so far
can't find the average spam rate % detected by SA. I know it is not
the same for everyone, but I want to get the feel of general
statistics (If you don't mind to share)

1. How many Spam detection rate if I am using default 3.2
configuration you would expect?

That depends on your settings, i.e. whether you are using Bayes and the network tests. However, anywhere from 92% to 98% of spam should be detected out of the box.

See also: STATISTICS-set*.txt in the rules directory of the tarball for your release.

2. If fine tuned according to the wiki, e.g. running sa-update, more
rules set, how many % you would expect then?

Well, that depends on how you tune. You can easily make SA have a 100% detection rate for spam, but your false-positive (FP) rate will also be 100% :)
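
To make that trade-off concrete, here is a rough Python sketch. The scores in SAMPLE are invented for illustration and are not real SA output, rates() is a hypothetical helper rather than anything shipped with SA, and I'm assuming "FP rate" means the fraction of ham that gets flagged. The only SA fact it relies on is that a message is tagged as spam when its score reaches the threshold (required_score, 5.0 by default).

```python
# Hypothetical (score, is_spam) pairs; made-up numbers, not real SA output.
SAMPLE = [
    (12.4, True), (7.9, True), (5.3, True), (3.1, True),
    (0.2, False), (1.8, False), (-1.5, False), (4.2, False),
]


def rates(messages, threshold):
    """Return (detection_rate, fp_rate) for a given score threshold.

    detection_rate: fraction of spam scoring at or above the threshold.
    fp_rate: fraction of ham scoring at or above the threshold
             (an assumed definition, see the note above).
    """
    spam = [score for score, is_spam in messages if is_spam]
    ham = [score for score, is_spam in messages if not is_spam]
    detected = sum(1 for score in spam if score >= threshold)
    false_pos = sum(1 for score in ham if score >= threshold)
    return detected / len(spam), false_pos / len(ham)


if __name__ == "__main__":
    for threshold in (5.0, 3.0, -100.0):
        detection, fp = rates(SAMPLE, threshold)
        print(f"threshold {threshold:>7}: detection {detection:.0%}, FP {fp:.0%}")
```

With these made-up scores, dropping the threshold from 5.0 to 3.0 catches the last spam but also flags one legitimate message, and an absurdly low threshold flags everything, which is the 100%/100% case.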

That said, a well-tuned, well-trained, well-maintained SA should be able to detect 99% of spam with an FP rate below 0.1%.

3. Is the % vary from SA version? e.g. 3.0, 3.1 and 3.2?

Certainly. Using an older release of SA against recent spam will result in significantly lower detection rates. The code really does matter quite a lot to the detection rate. Things like tweaks to the HTML parser that deal with spammer obfuscation and improve accuracy are made in the code, not the rules. If you're using an older SA, you're missing out on these tweaks.

Also, generally speaking, sa-updates aren't made for older release families. There's usually a period of overlap when a new family comes out where both the current and previous versions get updates pushed, but that generally comes to a stop once development shifts full-bore to the next release.

At this point anyone using 3.1.x is stuck with rules from October 2007. The 3.2.x rules (at the time of this writing) were last updated on June 16, 2008.


