The next phase of Message Sniffer development includes a compound Bayesian hinting algorithm to help modulate the black/white rule set. Since Message Sniffer works with Declude, that's one way this technology will find its way into the mix.

Scott's got a good point though - Bayesian filtering (as it has been implemented) tends to work well at very specific tasks. That is, you might get it to learn your specific email preferences accurately, but once you get to the server level, where there are many people involved, the accuracy drops significantly due to the diversity of the message content and the difficulties in obtaining training data. This is why we will be implementing a structured differentiation approach.

One direct application that might work for Declude: if you can solve the training problem, you might use a Naive Bayesian chain rule to combine the results of the Declude tests. Specifically, Declude could maintain a table of rule firings (including white & black lists, white & black word lists, etc.) and collect a statistical product on the combinations of rules that fire. Then it could interpret that data as a new test which adds or subtracts a weight given the Bayesian probability of that combination of tests being spam.

For example, the Bayesian Product test would "learn" that a specific combination of rule firings has a high probability of being spam on a given system, while another combination of test firings has a lower probability and so earns a reduced or negative weight (given some threshold). Additional "hinting" can be provided by using the external list tests to match for patterns that may be specific to that system - or shared between the group.

As Declude integrates a greater number of tests, its simple weighting scheme will become less effective and more difficult to tune - a Bayesian approach to combining the test results might bridge the gap.

-- just a thought,
_M

| -----Original Message-----
| From: [EMAIL PROTECTED]
| [mailto:[EMAIL PROTECTED]] On Behalf Of R. Scott Perry
| Sent: Thursday, January 23, 2003 3:29 PM
| To: [EMAIL PROTECTED]
| Subject: Re: [Declude.JunkMail] [Declude.Virus] Mozilla email client
|
| >I read about this Bayesian filtering/scanning at some other forum as
| >well. Is this something that Declude Junkmail does right now or will do
| >in the (near) future? Would be nice if it were a feature of the scanner
| >on the server instead of changing all mail client software? ;-)
|
| There was a very similar feature (the "heuristics" test), but it proved
| to be too unreliable when it came to mailing list E-mail.
|
| Although in theory the Bayes Theory should work very well in detecting
| spam, it does not in reality (for very technical reasons). Using the
| Bayes Theory for spam testing relies on a number of assumptions that
| don't hold true -- it's kind of like saying if Sports Team X wins 2 of
| the first 3 games they play, they have a 66% chance of winning the next
| game. With the right assumptions, this could be accurate or close to it,
| but otherwise it just isn't accurate.
|
| -Scott
|
| ---
| [This E-mail was scanned for viruses by Declude Virus (http://www.declude.com)]

---
This E-mail came from the Declude.JunkMail mailing list. To unsubscribe, just send an E-mail to [EMAIL PROTECTED], and type "unsubscribe Declude.JunkMail". The archives can be found at http://www.mail-archive.com.
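
P.S. A rough sketch of the Bayesian Product test idea from earlier in this message, in Python. This is only an illustration under stated assumptions, not Declude or Message Sniffer code: the class name `BayesianProductTest`, the test names, the add-one smoothing, the 0.5 threshold, and the way the probability is scaled into a weight are all hypothetical choices. It also assumes the training problem is already solved, i.e. that each training message arrives labelled spam or not.

```python
import math
from collections import defaultdict


class BayesianProductTest:
    """Hypothetical sketch: combine rule firings with a naive Bayes product."""

    def __init__(self, test_names, threshold=0.5, max_weight=10):
        self.test_names = list(test_names)   # every test that can fire
        self.threshold = threshold           # probability mapped to weight 0
        self.max_weight = max_weight         # assumed scale for the weight
        self.msg_count = {True: 0, False: 0}
        self.fire_count = {True: defaultdict(int), False: defaultdict(int)}

    def train(self, fired, is_spam):
        """Record which tests fired on one labelled message."""
        self.msg_count[is_spam] += 1
        for name in set(fired):
            self.fire_count[is_spam][name] += 1

    def _p_fire(self, name, is_spam):
        # Add-one smoothing so an unseen firing never yields 0 or 1.
        return (self.fire_count[is_spam][name] + 1) / (self.msg_count[is_spam] + 2)

    def spam_probability(self, fired):
        """Naive Bayes estimate of P(spam | this combination of firings)."""
        fired = set(fired)
        # Start from the smoothed prior odds of spam on this system.
        log_odds = math.log((self.msg_count[True] + 1) / (self.msg_count[False] + 1))
        for name in self.test_names:
            p_spam = self._p_fire(name, True)
            p_ham = self._p_fire(name, False)
            if name in fired:
                log_odds += math.log(p_spam / p_ham)
            else:
                log_odds += math.log((1 - p_spam) / (1 - p_ham))
        # Clamp to keep exp() well inside floating point range.
        log_odds = max(-50.0, min(50.0, log_odds))
        return 1.0 / (1.0 + math.exp(-log_odds))

    def weight(self, fired):
        """Positive weight above the threshold, negative weight below it."""
        p = self.spam_probability(fired)
        return round((p - self.threshold) * 2 * self.max_weight)


if __name__ == "__main__":
    # Made-up test names and made-up training data, for illustration only.
    bpt = BayesianProductTest(["IPBLACKLIST", "BADWORDS", "WHITELIST"])
    for _ in range(10):
        bpt.train({"IPBLACKLIST", "BADWORDS"}, is_spam=True)
    for _ in range(10):
        bpt.train({"WHITELIST"}, is_spam=False)
    for _ in range(2):
        bpt.train({"BADWORDS"}, is_spam=False)
    print(bpt.weight({"IPBLACKLIST", "BADWORDS"}))
    print(bpt.weight({"WHITELIST"}))
```

The naive part is the assumption that the tests fire independently of each other given spam or not-spam, which is exactly the kind of assumption Scott warns about, so the resulting weight is probably best treated as one more test result rather than as a final verdict.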