> I'd agree that you can use autolearning to pick up some of the 200/200
> messages, but I'd do at least *some* hand training.
>
Of course. I didn't say DON'T do any manual training; I simply suggested that
it isn't necessary and that ultimately it's "up to you" whether or not you do so.
True. I merely wished to point out that I'd strongly advise against starting a bayes DB on autolearning alone. If you've got the option of doing even a small amount of manual training when you first set up your bayes DB, it can save you from headaches later.
(of course, the more manual training you can do, the better)
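
For what it's worth, here is a rough sketch of what that initial hand training might look like with sa-learn. It assumes you already have two hand-sorted mbox files, one of known ham and one of known spam; the function name and the paths in the usage line are made up for illustration.

```shell
#!/bin/sh
# seed_bayes HAM_MBOX SPAM_MBOX
# Hypothetical helper: hand-train a new bayes DB from two hand-sorted mboxes.
seed_bayes() {
    ham_mbox=$1
    spam_mbox=$2

    # Learn the hand-verified ham first, then the hand-verified spam.
    sa-learn --ham --mbox "$ham_mbox" || return 1
    sa-learn --spam --mbox "$spam_mbox" || return 1

    # Dump the DB counters; the nspam and nham lines show how many
    # messages of each kind have been learned so far.
    sa-learn --dump magic
}
```

Usage would be something like `seed_bayes ~/mail/known-ham.mbox ~/mail/known-spam.mbox` (example paths only). With default settings bayes stays inactive until it has learned 200 spam and 200 ham, so whatever you feed it by hand here counts toward those totals before the autolearner has had much say.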
> The autolearner isn't perfect, and an autolearn-only bayes database has a
> noticeable chance of ending up poisoned. It doesn't happen every time, but
> there's a distinct chance of it.
>
Anything can happen, and of course it doesn't happen every time. Again, I'd say
about 95% of the time my system is triggering bayes_99 or bayes_00. The so-so
rules (middle bayes where it isn't really sure) hardly ever get used, and my
spams are usually above 15 while the hams are usually below -4. I'd say these
are pretty good results.
Well, yours did not make the "wrong turn" at the start, so it's likely to remain OK indefinitely.
The big risk isn't in letting an established bayes DB autolearn only, but in starting one that way. Since the autolearner has no past training to rely on, it can easily misclassify a low-scoring spam and autolearn that as ham. From there on, that training will have some limited influence on future spam learning, which can cause it to skip learning some spams as spam.
It's all about getting the DB off on the right foot to start with. From there, the no-contradictions rule will prevent most cases of it going awry.
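
If you want to check whether a young DB has already taken that wrong turn, one rough approach is to look through mail you have confirmed by hand as spam and see which of it was autolearned as ham. The sketch below assumes your setup adds the usual X-Spam-Status header with its `autolearn=` field, and that the confirmed spam sits one message per file under a directory (Maildir style). The function name is hypothetical, and `grep -r` is not strictly POSIX, though GNU and BSD grep both have it.

```shell
#!/bin/sh
# find_mislearned_spam DIR
# Hypothetical helper: list message files under DIR (hand-confirmed spam)
# whose text contains "autolearn=ham", i.e. spam that was autolearned as ham.
find_mislearned_spam() {
    grep -rl 'autolearn=ham' "$1"
}
```

Anything it lists can then be relearned by hand with `sa-learn --spam` on that file. Note that this is a plain text match, so a message that merely quotes such a header in its body would also show up.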
