On Thu, 22 May 2014 15:54:42 +0100 RW <rwmailli...@googlemail.com> wrote:
Ian> I don't understand this setting, and reading the documentation
Ian> doesn't help.

Ian> It seems it should make Bayes learn spam whenever the total score
Ian> surpasses the value of bayes_auto_learn_threshold_spam, and not
Ian> require 3 points from header and body each; that would make it a
Ian> global setting similar in purpose to
Ian> bayes_auto_learn_threshold_spam.

Ian> But in fact this is a per-test setting, a subcategory of tflags.
Ian> Do I have to specify it separately for every test? Why?

RW> The point is to set it for a small number of rules that are
RW> sufficiently strong as to guarantee there will be no mislearning in
RW> combination with the autolearn as spam threshold.

RW> It's probably best to create a single metarule for this - something
RW> that eliminates the possibility of mistraining through a lot of
RW> overlapping rules. I do something similar to get more spam into my
RW> high-scoring folder. I assign a lot of the near-certain spam rules
RW> to different classes: BAYES, RBLs, URIBLs, relaycountry etc and then
RW> count the number of classes.

The problem I am trying to solve is that nearly all of my spam is
flagged due to body rules. The header rules seem to be close to useless
with the latest campaigns - spammers seem to have learned enough to
avoid sending obvious stinking pieces of turd. (The one exception is
patterns in the Message-ID, but I am afraid that will be short lived
too, and is insufficient by itself even now).

Thus, even if I set bayes_auto_learn_threshold_spam low, very few of my
spams are autolearned because of the 3/3 requirement.

The damn 3/3 is my problem - how can I work around it? If I have to
spend an hour a day manually training the classifier the spammers have
won :-(

By the way, how are meta rules counted for this purpose? The
documentation says nothing about that.

-- 
Please *no* private copies of mailing list or newsgroup messages.