At 07:37 PM 4/5/2004, Mark wrote:
I like the new 20_drugs.cf file and I'm wondering if we should create other classifications of rules like this. One for porn - one for finance/mortgage/credit cards - etc.

Also - have the ability to declare a default score for everything in that file, so that before a rule is scored you can say "give it a 3" instead of the default 1.

I've also thought that rule classifications could be scored in a way that they had independent totals - the drug score - the sex score - the credit card scam score - etc - with the idea of maybe being able to apply a scaling factor to each classification. A church might want to scale up the porn score. A realty company might want to scale down the financial scores.
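
To make the idea concrete, here is a rough sketch in Python of per-classification totals with a site-specific scaling factor and a per-classification default score. This is only an illustration, not how SpamAssassin or Message Sniffer actually works: the rule names, the scores, and the functions category_totals and scaled_score are all hypothetical.

```python
from collections import defaultdict

# Hypothetical rule table: rule name -> (classification, score).
# A score of None means "fall back to the classification's default score".
RULES = {
    "DRUG_RULE_A": ("drugs", None),
    "DRUG_RULE_B": ("drugs", 2.5),
    "PORN_RULE_A": ("porn", None),
    "DEBT_RULE_A": ("finance", 1.5),
}

# Hypothetical per-classification defaults (the "give it a 3" idea).
CATEGORY_DEFAULT = {"drugs": 3.0, "porn": 3.0, "finance": 1.0}


def category_totals(hits, rules=RULES, defaults=CATEGORY_DEFAULT):
    """Sum the scores of the rules that hit, kept separate per classification."""
    totals = defaultdict(float)
    for name in hits:
        category, score = rules[name]
        totals[category] += defaults[category] if score is None else score
    return dict(totals)


def scaled_score(totals, scale):
    """Combine the per-classification totals into one score.

    `scale` maps a classification to a site-chosen factor;
    classifications that are not listed keep a factor of 1.0.
    """
    return sum(total * scale.get(cat, 1.0) for cat, total in totals.items())


# A site that wants porn rules to count double:
totals = category_totals(["DRUG_RULE_A", "DRUG_RULE_B", "PORN_RULE_A"])
print(totals)
print(scaled_score(totals, {"porn": 2.0}))
```

A site that wanted to scale down financial rules would pass something like {"finance": 0.5} instead. Note that a message hitting rules in several classifications gets its scaling applied per classification, so heavy overlap between classifications would blur the effect of any single factor.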

We do precisely this with our Message Sniffer product, with mixed results. It turns out that rules that score highly for drugs (snakeoil) frequently also match the credit card (debt) and even porn (adult) classifications. In practice there is little distinction, except perhaps for porn/adult. Spammers tend to reuse domains and other header and obfuscation patterns across these three categories in particular.


Most of the time, if a customer ranks one of the groups higher, it is not because they have a particular filtering classification in mind, but rather because that classification tends to have higher accuracy in general. Due to the way we source our rules, the porn/adult group tends to be slightly more accurate than some general rules, but about the same as drugs. Debt can sometimes be less accurate, but not often. Frequently the slight distinction is amplified in the mind of the end user more than the statistics really support.

I suspect that similar classifications implemented directly in SA would have similar statistics.

$0.02
_M

Ref Classifications:
http://www.sortmonster.com/MessageSniffer/Help/ResultCodesHelp.html

Ref SA Plugin:
http://www.sortmonster.com/MessageSniffer/Installation/SpamAssassin.html



