On Wed, 28 Sep 2011 14:30:32 +0200, Lars Jørgensen wrote:
On 28-09-2011 13:20, Benny Pedersen wrote:
I train Bayes manually on the borderline cases, but also have
auto-learning enabled. Is that really a bad idea? Should I disable it,
delete the Bayes databases and start over with manual-only learning?
no training is always good
Are you missing a comma? Do you mean "no, training is always good" or
"no training is always good"?
No, just my Boolean algebra and English are bad :)
What scores are you auto-learning on? The defaults are 0.1 and 12.0; I have
changed them here to -4 and 14.
Can't find any settings to that effect, so I guess I am using
defaults. I have entered your settings in my config now.
perldoc Mail::SpamAssassin::Plugin::AutoLearnThreshold
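For reference, a minimal local.cf sketch of the threshold change described above. The option names are the ones documented for the AutoLearnThreshold plugin; the values are simply those quoted in this thread, not a general recommendation.

```
# local.cf (sketch): auto-learn thresholds as quoted in this thread
# Auto-learn as ham only when the message scores below -4 (documented default: 0.1)
bayes_auto_learn_threshold_nonspam -4
# Auto-learn as spam only when the message scores above 14 (documented default: 12.0)
bayes_auto_learn_threshold_spam 14
```

Running "spamassassin --lint" afterwards is a quick way to catch a mistyped option name.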
Looking at
http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.html#learning_options
I see an option called "bayes_use_hapaxes" that promises
significantly better hit-rates, but also increases database size by a
factor of 8 to 10. What is the recommendation on this?
Don't know for sure what is best there; I am using the default here.
perldoc Mail::SpamAssassin::Plugin::Bayes
perldoc Mail::SpamAssassin::Conf
For 3.3.1 and above I add this to local.cf:
bayes_auto_learn_on_error 1
This reduces Bayes poisoning and load.
If throughput is a factor in this decision, we are scanning about 60,000 to
90,000 mails a day.
More than my server handles now.
What plugins have you enabled?
DCC
pyzor/razor
SpamCop
AutoLearnThreshold
TextCat
MIMEHeader
ReplaceTags
DKIM
Check
HTTPSMismatch
URIDetail
Bayes
All the EvalTest plugins
VBounce
ImageInfo
FreeMail
3rd-party rules, or just the default SA 3.3.2 rules?
Default and Sought Rules.
That should be safe enough not to cause any problems for Bayes.
Tip: if you would like to restart Bayes learning, you can do it like this:
sa-learn --dump magic
bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
and set these to 200 more than the counts listed in dump magic; this ensures
that Bayes goes back into learning mode.
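As a worked example of the tip above, assume (hypothetically) that the dump magic output reports nspam = 1500 and nham = 2300. Adding 200 to each would give something like this in local.cf:

```
# local.cf (sketch): the counts are made up for illustration, not from this thread
# dump magic showed nspam = 1500  ->  1500 + 200
bayes_min_spam_num 1700
# dump magic showed nham = 2300   ->  2300 + 200
bayes_min_ham_num 2500
```

As far as I understand the documentation, Bayes results are not used for scoring until both counts are reached, so this pauses the Bayes rules while learning continues; it does not remove tokens already in the database.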