The next phase of Message Sniffer development includes a compound
Bayesian hinting algorithm to help modulate the black/white rule set.
Since Message Sniffer works with Declude, that's one way this technology
will find its way into the mix.

Scott's got a good point though - Bayesian filtering (as it has been
implemented) tends to work well at very specific tasks. That is, you
might get it to learn your specific email preferences accurately - but
once you get to the server level, where there are many people involved,
the accuracy drops significantly due to the diversity of the message
content and the difficulty of obtaining training data. This is why
we will be implementing a structured differentiation approach.

One direct application that might work for Declude: if you can solve
the training problem, you might use a Naive Bayesian chain rule to
combine the results of the Declude tests. Specifically, Declude could
maintain a table of rule firings (including white & black lists, white &
black word lists, etc.) and collect a statistical product on the
combinations of rules that fire.
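
To make the chain-rule idea concrete, here is a minimal sketch in Python.
It is only an illustration, not anything Declude or Message Sniffer
actually does: the class name, the test names (BLACKLIST, BADWORDS,
WHITELIST) and the add-one smoothing are all assumptions. It keeps a
per-test count of firings for spam and non-spam, then multiplies the
per-test likelihoods together (in log space) as if the tests were
independent.

```python
import math
from collections import defaultdict


class NaiveBayesCombiner:
    """Hypothetical sketch: combine test firings with the naive Bayes chain rule."""

    def __init__(self, tests):
        self.tests = list(tests)              # every test the system knows about
        self.msgs = {"spam": 0, "ham": 0}     # messages seen per class
        self.fired = {"spam": defaultdict(int), "ham": defaultdict(int)}

    def train(self, fired_tests, is_spam):
        """Record which tests fired on one message of known class."""
        label = "spam" if is_spam else "ham"
        self.msgs[label] += 1
        for test in set(fired_tests):
            self.fired[label][test] += 1

    def _log_joint(self, fired_tests, label):
        """log P(class) + sum over tests of log P(test outcome | class)."""
        n = self.msgs[label]
        n_total = self.msgs["spam"] + self.msgs["ham"]
        total = math.log((n + 1) / (n_total + 2))        # smoothed class prior
        for test in self.tests:
            p = (self.fired[label][test] + 1) / (n + 2)  # add-one smoothing (assumption)
            total += math.log(p if test in fired_tests else 1.0 - p)
        return total

    def spam_probability(self, fired_tests):
        """P(spam | observed firings), under the independence assumption."""
        fired_tests = set(fired_tests)
        log_spam = self._log_joint(fired_tests, "spam")
        log_ham = self._log_joint(fired_tests, "ham")
        top = max(log_spam, log_ham)                     # avoid underflow
        e_spam = math.exp(log_spam - top)
        e_ham = math.exp(log_ham - top)
        return e_spam / (e_spam + e_ham)


if __name__ == "__main__":
    nb = NaiveBayesCombiner(["BLACKLIST", "BADWORDS", "WHITELIST"])
    for _ in range(10):
        nb.train(["BLACKLIST", "BADWORDS"], is_spam=True)
        nb.train(["WHITELIST"], is_spam=False)
    print(nb.spam_probability(["BLACKLIST", "BADWORDS"]))
    print(nb.spam_probability(["WHITELIST"]))
```

The independence assumption is exactly the kind of assumption Scott is
warning about, so treat the output as a hint rather than a verdict. The
training calls also presuppose that somebody has already labelled the
messages, which is the training problem mentioned above.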

Then it could interpret that data as a new test which adds or subtracts
a weight given the Bayesian probability of that combination of tests
being spam.

For example, the Bayesian Product test would "learn" that a specific
combination of rule firings has a high probability of being spam on a
given system, while another combination of test firings has a low
probability and so earns a lower or negative weight (given some
threshold).
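
A rough sketch of such a "Bayesian Product" test in Python follows. The
names here (CombinationTable, max_weight, min_samples) and the mapping
from probability to weight are assumptions made up for illustration;
nothing in Declude defines them. Each distinct combination of firing
tests is a key in a table, the table counts how often that combination
turned out to be spam or not, and the resulting probability is turned
into a positive or negative weight once enough samples have been seen.

```python
from collections import Counter


class CombinationTable:
    """Hypothetical sketch: learn a weight per combination of test firings."""

    def __init__(self, max_weight=10, min_samples=5):
        self.max_weight = max_weight    # largest weight to add or subtract (assumption)
        self.min_samples = min_samples  # threshold before the test has an opinion
        self.spam = Counter()           # combination -> spam messages seen
        self.ham = Counter()            # combination -> non-spam messages seen

    def record(self, fired_tests, is_spam):
        """Count one message of known class for this combination."""
        key = frozenset(fired_tests)
        if is_spam:
            self.spam[key] += 1
        else:
            self.ham[key] += 1

    def probability(self, fired_tests):
        """Smoothed estimate of P(spam | this exact combination fired)."""
        key = frozenset(fired_tests)
        s, h = self.spam[key], self.ham[key]
        return (s + 1) / (s + h + 2)

    def weight(self, fired_tests):
        """Map the probability onto [-max_weight, +max_weight]; 0 if too little data."""
        key = frozenset(fired_tests)
        if self.spam[key] + self.ham[key] < self.min_samples:
            return 0
        p = self.probability(fired_tests)
        return round((2.0 * p - 1.0) * self.max_weight)


if __name__ == "__main__":
    table = CombinationTable()
    for _ in range(20):
        table.record(["BLACKLIST", "BADWORDS"], is_spam=True)
        table.record(["WHITELIST", "GOODWORDS"], is_spam=False)
    print(table.weight(["BLACKLIST", "BADWORDS"]))   # positive weight
    print(table.weight(["WHITELIST", "GOODWORDS"]))  # negative weight
    print(table.weight(["NEVERSEEN"]))               # 0: below the sample threshold
```

Keying on the exact combination keeps the sketch simple, but the table
grows quickly as tests are added, so a real implementation would need
some way to prune or group rare combinations.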

Additional "hinting" can be provided by using the external list tests
to match for patterns that may be specific to that system - or shared
between the group.

As Declude integrates a greater number of tests, its simple weighting
scheme will become less effective and more difficult to tune - a Bayesian
approach to combining the test results might bridge the gap.

-- just a thought,
_M

| -----Original Message-----
| From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
| R. Scott Perry
| Sent: Thursday, January 23, 2003 3:29 PM
| To: [EMAIL PROTECTED]
| Subject: Re: [Declude.JunkMail] [Declude.Virus] Mozilla email client
| 
| 
| 
| >I read about this Bayesian filtering/scanning at some other forum as
| >well. Is this something that Declude Junkmail does right now or will
| >do in the (near) future? Would be nice if it were a feature of the
| >scanner on the server in stead of changing all mail client software? ;-)
| 
| There was a very similar feature (the "heuristics" test), but it
| proved to be too unreliable when it came to mailing list E-mail.
|
| Although in theory the Bayes Theory should work very well in detecting
| spam, it does not in reality (for very technical reasons).  Using the
| Bayes Theory for spam testing relies on a number of assumptions that
| don't hold true -- it's kind of like saying if Sports Team X wins 2 of
| the first 3 games they play, they have a 66% chance of winning the next
| game.  With the right assumptions, this could be accurate or close to
| it, but otherwise it just isn't accurate.
|                                     -Scott
| 
| 
| ---
| [This E-mail was scanned for viruses by Declude Virus
| (http://www.declude.com)]

---
This E-mail came from the Declude.JunkMail mailing list.  To
unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type
"unsubscribe Declude.JunkMail".  The archives can be found at
http://www.mail-archive.com.

