Hi,

I'm Matej and I'm writing my degree thesis on Bayesian spam 
filtering methods. I would like to use your spam program for my research, but I 
have a small problem. I need to find out which classification technique 
SpamBayes uses, but I couldn't find any notes about that.

In the literature 
(http://digital.cs.usu.edu/~erbacher/publications/Bayes-Vikas2.pdf), I have 
found four significant techniques that most spam filters use:
- all tokens filter: Use of all tokens from a new email for classification.
- fixed number of tokens filter: Use of a fixed number of tokens for 
classification. These tokens are assumed to be the most effective in the given 
e-mail.
- standard deviation threshold filter: This technique emphasizes the spam 
probability of tokens rather than the number of tokens.
- relative number of tokens filter: Use of a relative number of tokens for 
classification; for example, 30%.
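
As a rough illustration, the four selection rules could be sketched in Python 
as below. This is only a sketch under stated assumptions: the function names, 
the neutral spam probability of 0.5, the default parameters, and the reading of 
the standard deviation threshold filter as "keep tokens whose spam probability 
lies far enough from neutral" are all hypothetical and are not taken from the 
paper or from SpamBayes.

```python
from math import ceil

NEUTRAL = 0.5  # assumed spam probability of a token that carries no evidence


def strength(p):
    """Distance of a token's spam probability from the assumed neutral point."""
    return abs(p - NEUTRAL)


def all_tokens(probs):
    """All tokens filter: every token in the message is used."""
    return dict(probs)


def fixed_number(probs, n=15):
    """Fixed number filter: keep the n tokens farthest from neutral."""
    ranked = sorted(probs, key=lambda t: strength(probs[t]), reverse=True)
    return {t: probs[t] for t in ranked[:n]}


def deviation_threshold(probs, threshold=0.4):
    """Threshold filter (assumed reading): keep tokens at least
    `threshold` away from neutral, however many there are."""
    return {t: p for t, p in probs.items() if strength(p) >= threshold}


def relative_number(probs, fraction=0.3):
    """Relative number filter: keep the strongest `fraction` of the tokens."""
    n = ceil(len(probs) * fraction)
    return fixed_number(probs, n)


# Hypothetical per-token spam probabilities for one message.
probs = {"viagra": 0.99, "meeting": 0.05, "the": 0.5, "click": 0.8, "hello": 0.4}
print(sorted(fixed_number(probs, n=2)))
print(sorted(deviation_threshold(probs, threshold=0.35)))
print(sorted(relative_number(probs, fraction=0.3)))
```

Each function only selects which tokens take part in classification; how the 
selected probabilities are combined into a final score is a separate step.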

Can you please tell me which technique SpamBayes uses, or which is closest to the 
technique that SpamBayes uses, so that I know how to start my research? Also, 
is it possible to somehow switch between these four techniques?

Thank you for your help!

Yours sincerely,
Matej                                     
