Re: A New Approach: Find the Ham

Nigel Frankcom Sat, 10 Feb 2007 12:08:20 -0800

On Sat, 10 Feb 2007 20:52:17 +0100, "Giampaolo Tomassoni"
<[EMAIL PROTECTED]> wrote:


>From: Dan [mailto:[EMAIL PROTECTED]
>> 
>> I've developed a new approach to scoring that I want to 1) share with  
>> everyone and 2) make into a working system that's as accurate as what  
>> I've already built, but easier to use.  First, the theory:
>> 
>> 
>> 
>> SITUATION
>> In the beginning, all email was ham.  When spam came along, we left  
>> the ham alone and targeted the annoyance (spam).
>> 
>> ASSUMPTION
>> All messages are ham unless x,y,z score says they're spam.
>> 
>> APPROACH
>> Block nothing, then create rules to catch what you don't want.  i.e.,  
>> build tests that target the spam, then score the millions of ways  
>> spam can occur.
>> 
>> RESULT
>> Huge time spent tuning and retuning weights, catching everything in  
>> sight (including much ham).
>> 
>> 
>> 
>> NEW SITUATION
>> Ham is now the tiniest minority of all email.
>> 
>> NEW ASSUMPTION
>> All messages are spam unless x,y,z score says they're ham.
>> 
>> NEW APPROACH
>> Block everything, then create rules to not catch what you do want.   
>> i.e., build tests that target the spam (keeping all the tests you've  
>> already built), then score the thousands of ways ham triggers on  
>> those tests.
>> 
>> NEW RESULT
>> Spend less time and energy while catching more of what you do want  
>> and less of what you don't.
>> 
>> 
>> 
>> CHALLENGE
>> All filtering software is written to score for results that equal  
>> spam -> catch the bad
>> 
>> SOLUTION
>> Make filtering software score for results that equal ham -> uncatch  
>> the good.
>> 
>> 
>> Your thoughts?
>
>How can this method "spend less time and energy"? Aren't you going to build a 
>"mirrored" method with respect to the current one? Wouldn't your rules be like 
>the current ones, but negated?
>
>Giampaolo
>
>> 
>> Dan
>> 
>> 
>> BTW, is there a better forum for this level of question?
>> 

Dan has a good point, on the surface at least. Spam now accounts for
80%+ of all mail, so why are we concentrating on that?

At least the point is worth debate (IMHO).

Can it be done? Even I can see that it can, given the right impetus.
Though perhaps too many companies are making a good $/£/Y off
anti-spam systems based on, around or directly using SA.
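
To make the two approaches concrete, here is a minimal sketch of both
side by side. Everything in it is an assumption for illustration: the
rule names, weights, threshold and message fields are hypothetical and
are not SA rules or scores. It only shows the inversion Dan describes:
the first classifier treats mail as ham until spam evidence
accumulates, the second treats mail as spam until ham evidence
accumulates.

```python
# Hypothetical rules and weights, for illustration only.
# Each rule is a (test, weight) pair; a test takes a message dict.
SPAM_RULES = {
    "mentions_pills": (lambda m: "pills" in m["body"].lower(), 3.0),
    "all_caps_subject": (lambda m: m["subject"].isupper(), 2.5),
}
HAM_RULES = {
    "known_sender": (lambda m: m["sender"] in {"alice@example.com"}, 4.0),
    "is_a_reply": (lambda m: m["subject"].lower().startswith("re:"), 2.0),
}


def score(msg, rules):
    """Sum the weights of every rule whose test matches the message."""
    return sum(weight for test, weight in rules.values() if test(msg))


def classify_default_ham(msg, threshold=5.0):
    """Current approach: ham unless the spam score reaches the threshold."""
    return "spam" if score(msg, SPAM_RULES) >= threshold else "ham"


def classify_default_spam(msg, threshold=5.0):
    """Proposed approach: spam unless the ham score reaches the threshold."""
    return "ham" if score(msg, HAM_RULES) >= threshold else "spam"


friend = {"sender": "alice@example.com", "subject": "Re: lunch", "body": "See you at noon"}
stranger = {"sender": "bob@example.org", "subject": "hello", "body": "Long time no see"}
junk = {"sender": "x@example.net", "subject": "BUY NOW", "body": "Cheap pills here"}

for name, msg in [("friend", friend), ("stranger", stranger), ("junk", junk)]:
    print(name, classify_default_ham(msg), classify_default_spam(msg))
```

The two classifiers only disagree on the message that matches no rule
at all (the stranger): the first lets it through, the second blocks
it. That is where the choice of default matters, and it is also where
Giampaolo's question bites, since the ham rules still have to be
written and tuned.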

Be interesting to see where this thread goes.

Kind regards

Nigel
