Hi, I'm Matej and I'm writing my degree thesis on Bayesian spam filtering methods. I would like to use your spam filter for my research, but I have a small problem. I need to find out which classification technique SpamBayes uses, but I couldn't find any notes about that.
In the literature (http://digital.cs.usu.edu/~erbacher/publications/Bayes-Vikas2.pdf), I have found 4 significant techniques that most spam filters use:

- all tokens filter: Uses all tokens from a new email for classification.
- fixed number of tokens filter: Uses a fixed number of tokens for classification. These tokens are assumed to be the most effective ones in the given e-mail.
- standard deviation threshold filter: This technique emphasizes the spam probability of tokens rather than the number of tokens.
- relative number of tokens filter: Uses a relative number of tokens for classification; for example 30%.

Can you please tell me which technique SpamBayes uses, or which one is closest to the technique that SpamBayes uses, so that I will know how to start my research? Also, is it possible to somehow switch between these four techniques?

Thank you for your help!

Yours sincerely,
Matej
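
P.S. To make the question concrete, here is a rough Python sketch of how I understand the four techniques listed above. It is only my reading of the paper, not SpamBayes code: the function names, the default values, and the idea of ranking tokens by their distance from the neutral probability 0.5 are all my own assumptions. In particular, I assume the "standard deviation threshold" filter keeps the tokens whose spam probability is far enough from 0.5.

```python
# Illustrative sketch only; none of these names come from SpamBayes.
# `probs` maps each token of one email to its estimated spam probability.

def strength(p):
    """Assumed measure of how informative a token is: distance from 0.5."""
    return abs(p - 0.5)

def select_all(probs):
    """All tokens filter: keep every token of the email."""
    return dict(probs)

def select_fixed(probs, n=15):
    """Fixed number of tokens filter: keep the n strongest tokens."""
    ranked = sorted(probs, key=lambda t: strength(probs[t]), reverse=True)
    return {t: probs[t] for t in ranked[:n]}

def select_by_deviation(probs, threshold=0.4):
    """Threshold filter (my interpretation): keep tokens whose
    probability lies at least `threshold` away from 0.5."""
    return {t: p for t, p in probs.items() if strength(p) >= threshold}

def select_relative(probs, fraction=0.3):
    """Relative number of tokens filter: keep the strongest `fraction`
    of the email's tokens, but always at least one."""
    n = max(1, round(len(probs) * fraction))
    return select_fixed(probs, n)
```

Each function returns the subset of tokens that would then be passed to the probability combining step, so switching technique would only mean switching the selection function.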
_______________________________________________
SpamBayes@python.org
http://mail.python.org/mailman/listinfo/spambayes
Info/Unsubscribe: http://mail.python.org/mailman/listinfo/spambayes
Check the FAQ before asking: http://spambayes.sf.net/faq.html