Thanks! I will discuss it here and find out which one is better.

What are the weights of the Bayes scores once the database is well
trained? Any ideas about it?

[]'s
f.rique

On Thu, Mar 4, 2010 at 2:41 PM, Bowie Bailey <bowie_bai...@buc.com> wrote:

> (Please send replies to the list)
>
> Henrique Fernandes wrote:
> >
> > On Thu, Mar 4, 2010 at 2:22 PM, Bowie Bailey <bowie_bai...@buc.com>
> > wrote:
> >
> > > Henrique Fernandes wrote:
> > > > Nope, I want that after I have trained it, the same email should
> > > > get a higher score, because SpamAssassin was trained that it is
> > > > spam. So when it comes in again, it should look in the database
> > > > and add some extra points to the score, right?
> > >
> > > That is a fairly common misconception. When you learn an email as
> > > spam, the Bayes system breaks it into tokens (words/character
> > > strings) and then makes a note that each of those tokens was seen
> > > in a spam. When an email comes in, it breaks up the new email into
> > > tokens and then checks to see how frequently each of those tokens
> > > was previously seen in spam or ham. Based on what it finds, it
> > > ranks the email from BAYES_00 (very unlikely to be spam) to
> > > BAYES_99 (almost certainly spam).
> > >
> > > Since learning from a single email only adds one data point to
> > > each token, it is unlikely to make a major difference on its own.
> > > The value comes in learning from lots of spam and ham. This is why
> > > the Bayes rules will not run until you have learned from at least
> > > 200 ham and 200 spam.
> >
> > Hmm.
> >
> > Thanks. So each individual user has to have learned lots of emails,
> > and only after that will they start to see a difference in the score?
>
> Yes. Each individual user will need to learn at least 200 ham and 200
> spam (manually or via auto-learn) before Bayes will start scoring. The
> more they learn, the better the accuracy.
>
> > So is it better to just train one database for all users instead of
> > one database for each user?
> >
> > With just one database I am afraid of getting too many false
> > positives, because sometimes "Viagra" is not spam for someone who
> > researches it, but if it is in the same database, it will be marked
> > as spam...
>
> Depends on your users. Unless they are wildly different, a single
> database should work fairly well. Individual databases can be more
> accurate in some instances, but a single well-trained database will
> probably work better than a bunch of individual databases that are not
> trained consistently.
>
> --
> Bowie
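
For illustration only, the token counting Bowie describes in the quoted text can be sketched roughly like this in Python. This is a toy model and not SpamAssassin's actual code: the class name `ToyBayes`, the tokenizer regex, and the plain averaging of per-token ratios are all assumptions made for the example. Only the idea of counting tokens seen in spam and ham, and the 200/200 minimum, come from the thread.

```python
import re
from collections import Counter


class ToyBayes:
    """Toy token-frequency classifier (hypothetical, NOT SpamAssassin's algorithm)."""

    def __init__(self, min_ham=200, min_spam=200):
        # Minimum number of learned messages before any score is given,
        # mirroring the "at least 200 ham and 200 spam" rule from the thread.
        self.min_ham = min_ham
        self.min_spam = min_spam
        self.nham = 0
        self.nspam = 0
        self.ham_tokens = Counter()
        self.spam_tokens = Counter()

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase words/character strings, one vote per message.
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def learn(self, text, is_spam):
        # Learning only records that each token was seen in one more spam or ham.
        tokens = self.tokenize(text)
        if is_spam:
            self.nspam += 1
            self.spam_tokens.update(tokens)
        else:
            self.nham += 1
            self.ham_tokens.update(tokens)

    def spam_probability(self, text):
        # Refuse to score until enough ham and spam have been learned.
        if self.nham < self.min_ham or self.nspam < self.min_spam:
            return None
        ratios = []
        for tok in self.tokenize(text):
            s = self.spam_tokens[tok] / self.nspam  # share of spam containing tok
            h = self.ham_tokens[tok] / self.nham    # share of ham containing tok
            if s + h == 0:
                continue  # token never seen before, so it carries no evidence
            ratios.append(s / (s + h))
        if not ratios:
            return 0.5  # nothing known about this message
        # Simplification: plain average instead of a real Bayesian combination.
        return sum(ratios) / len(ratios)
```

With the default thresholds, `spam_probability()` returns `None` until 200 ham and 200 spam have been learned. After that, learning one more message changes each token count by only one, which is why re-learning a single email barely moves the result.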