Re: A different approach to scoring spamassassin hits

Tom Allison Sat, 30 Jun 2007 03:22:27 -0700


On Jun 30, 2007, at 1:20 AM, Marc Perkel wrote:

Tom Allison wrote:
For some years now there has been a lot of effective spam filtering using statistical approaches with variations on Bayesian theory. Some of these are inverse chi-square modifications to Naive Bayes, and even CRM114 and other "languages" have been developed to improve the scoring of statistical analysis of spam. For all statistical processes the spamicity is always between 0 and 1.
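
To illustrate the point that statistical spamicity always lands between 0 and 1, here is a minimal sketch of inverse chi-square combining in the style popularized by Bayesian filters. The function names `chi2q` and `combine` are made up for this example, and the per-token probabilities are assumed to already lie strictly between 0 and 1; it is not the code of any particular filter.

```python
import math

def chi2q(x2, dof):
    """Probability that a chi-square variable with an even number of
    degrees of freedom (dof) is at least x2."""
    m = x2 / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def combine(probs):
    """Combine per-token spam probabilities into one spamicity in [0, 1].
    Each probability must be strictly between 0 and 1."""
    n = len(probs)
    spam_stat = -2.0 * sum(math.log(1.0 - p) for p in probs)
    ham_stat = -2.0 * sum(math.log(p) for p in probs)
    s = 1.0 - chi2q(spam_stat, 2 * n)  # evidence for spam
    h = 1.0 - chi2q(ham_stat, 2 * n)   # evidence for ham
    return (1.0 + s - h) / 2.0
```

Tokens that all lean the same way push the result toward 0 or 1, while uninformative tokens (all 0.5) leave it at 0.5.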
<snip>
Many thanks to those of you who have read this far for your patience and consideration.
Tom, I suggested something similar to that years ago and I'd still like to see it tried out. I wonder what would happen if you stripped out the body and ran Bayes just on the headers and the rules and let Bayes figure it out. You do have to have some points to start with to get Bayes pointed in the right direction. But you could use blacklists and whitelists to do Bayes training. It also needs more rules to identify ham and not just rules to identify spam.
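
The idea above, running Bayes over rule hits rather than body text and seeding the training from blacklist and whitelist verdicts, could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `RuleBayes` and the rule names are hypothetical, the tokens stand in for whatever header tests and rules fired on a message, and none of it is SpamAssassin's actual Bayes code.

```python
import math
from collections import Counter

class RuleBayes:
    """Naive Bayes over rule/header tokens instead of body words."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.totals = {"spam": 0, "ham": 0}

    def train(self, tokens, label):
        # label would come from a blacklist ("spam") or whitelist ("ham") hit
        self.counts[label].update(set(tokens))
        self.totals[label] += 1

    def spamicity(self, tokens):
        # Work in log-odds, with add-one smoothing so unseen tokens are safe
        log_odds = math.log((self.totals["spam"] + 1) / (self.totals["ham"] + 1))
        for t in set(tokens):
            p_spam = (self.counts["spam"][t] + 1) / (self.totals["spam"] + 2)
            p_ham = (self.counts["ham"][t] + 1) / (self.totals["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical rule names, used purely as example tokens
clf = RuleBayes()
for _ in range(2):
    clf.train(["RULE_BLACKLISTED_RELAY", "RULE_FORGED_HELO"], "spam")
    clf.train(["RULE_WHITELISTED_SENDER", "RULE_HAS_LIST_ID"], "ham")

spam_score = clf.spamicity(["RULE_FORGED_HELO"])
ham_score = clf.spamicity(["RULE_HAS_LIST_ID"])
```

Note that the ham-side tokens matter as much as the spam-side ones here, which is why rules that identify ham are needed and not only rules that identify spam.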

I was under the impression that there were ham-centric tests that would result in negative point scores.

Ham doesn't try to be evasive. It's pretty easy to identify. Without SA tagging, much of it falls to <<0.5, and whitelisting would capture many of the exceptions.

As for headers-only testing -- the first five lines of stock spam are very telling...

My question about SA concerns PerMsgStatus (I think). Is this the place to retrieve all the rules information? I know today you can get a list of all the rules that HIT, but is this where you would look to find all the rules that were attempted? Or is there a better place for it?
