RE: [IMail Forum] Teaching the bayes engine with RBL mails

R. Scott Perry Sat, 06 Sep 2003 04:49:15 -0700

In particular, I have seen Imail tag mailings from a
commercial list I subscribe to over and over again as spam, when it clearly
is not. In fact, it gives it a 1.00 probability!

That's one of the reasons why the algorithm is referred to as "Naive Bayes." With a full application of Bayes' Theorem, you get a real probability (meaning that if it says there is a 1.00 probability, it *must* be spam). With naive Bayes, the results are often much higher or lower than they should be (you'll often see .99999 or 1.00, or .0000001 or 0.00). That's because a full application of Bayes' Theorem has to account for how the multiple tests depend on each other, whereas naive Bayes simply assumes they are independent, so correlated tests get counted more than once. For example, if you have a spam that just says "Order Viagra!", a real Bayes' Theorem implementation (there aren't any for spam, and almost certainly won't be for many years if ever, but if there were) might treat "Order Viagra" as one term, not two (since a spam with "Viagra" in it will probably have the word "Order" in it as well).
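As a rough illustration of why correlated words push the score toward 1.00, here is a minimal sketch. It assumes one common naive combining rule (product of the per-word spam probabilities against the product of their complements, with equal priors); nothing here says that this is the formula IMail uses. The function name `naive_combine` and the 0.90 per-word figures are made up for the example.

```python
from math import prod

def naive_combine(token_probs):
    """Combine per-word spam probabilities as if the words were independent.

    Assumed rule (equal priors): P = prod(p) / (prod(p) + prod(1 - p)).
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)

# Hypothetical per-word spam probabilities, invented for illustration.
one_term = [0.90]          # "Order Viagra" counted as a single piece of evidence
two_terms = [0.90, 0.90]   # "Order" and "Viagra" counted separately
five_terms = [0.90] * 5    # five words that all tend to appear together

print(naive_combine(one_term))    # stays at the single-term value
print(naive_combine(two_terms))   # higher, although no new information was added
print(naive_combine(five_terms))  # very close to 1.00
```

Each extra correlated word is treated as fresh evidence, so the combined figure climbs toward 1.00 even though the message tells the filter nothing new.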

As you say, though, it works very well for individuals, mainly because it gets trained to their legitimate E-mail. So if you have two people who receive E-mail from a company that never got their permission to send them mail, the one who wants it will get it while the one who does not want it will not get it.
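A small sketch of that per-user effect, assuming each user has their own word counts and the same naive combining rule as a stand-in. The class `UserFilter`, the add-one smoothing, and the sample message are all hypothetical and are not taken from IMail.

```python
from collections import Counter
from math import prod

class UserFilter:
    """Toy per-user filter: each user trains their own word counts."""

    def __init__(self):
        self.spam = Counter()   # number of spam messages containing each word
        self.ham = Counter()    # number of legitimate messages containing each word
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        words = set(text.lower().split())
        if is_spam:
            self.spam.update(words)
            self.n_spam += 1
        else:
            self.ham.update(words)
            self.n_ham += 1

    def score(self, text):
        probs = []
        for word in set(text.lower().split()):
            # Add-one smoothing so unseen words do not force 0.00 or 1.00.
            s = (self.spam[word] + 1) / (self.n_spam + 2)
            h = (self.ham[word] + 1) / (self.n_ham + 2)
            probs.append(s / (s + h))
        spam = prod(probs)
        ham = prod(1.0 - p for p in probs)
        return spam / (spam + ham)

message = "acme weekly deals"   # hypothetical commercial mailing

wants_it = UserFilter()
wants_it.train(message, is_spam=False)   # this user marks it as legitimate

does_not_want_it = UserFilter()
does_not_want_it.train(message, is_spam=True)   # this user marks it as spam

print(wants_it.score(message))           # below 0.5 for this user
print(does_not_want_it.score(message))   # above 0.5 for this user
```

The same message gets opposite verdicts because each filter only knows what its own user has told it.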

-Scott
