On Thu, 22 May 2014 15:54:42 +0100
RW <rwmailli...@googlemail.com> wrote:

Ian> I don't understand this setting, and reading the documentation
Ian> doesn't help.

Ian> It seems it should make Bayes learn spam whenever the total score
Ian> surpasses the value of bayes_auto_learn_threshold_spam, and not
Ian> require 3 points from header and body each; that would make it a
Ian> global setting similar in purpose to
Ian> bayes_auto_learn_threshold_spam.

Ian> But in fact this is a per-test setting, a subcategory of tflags.
Ian> Do I have to specify it separately for every test?  Why?

RW> The point is to set it for a small number of rules that are
RW> sufficiently strong as to guarantee there will be no mislearning in
RW> combination with the autolearn as spam threshold.

RW> It's probably best to create a single metarule for this - something
RW> that eliminates the possibility of mistraining through a lot of
RW> overlapping rules. I do something similar to get more spam into my
RW> high-scoring folder. I assign a lot of the near-certain spam rules
RW> to different classes: BAYES, RBLs, URIBLs, relaycountry etc and then
RW> count the number of classes.
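
For illustration, the class-counting idea RW describes might look roughly
like this in Python. It is a sketch of the logic only, not SpamAssassin
configuration: the rule-to-class mapping, the local relaycountry rule
name, and the minimum of three classes are all assumptions made up for
the example.

```python
# Hypothetical mapping from near-certain spam rules to evidence classes.
# The class assignments are illustrative; LOCAL_RELAYCOUNTRY_BAD is an
# invented name standing in for a site-specific relaycountry rule.
RULE_CLASSES = {
    "BAYES_99": "bayes",
    "RCVD_IN_XBL": "rbl",
    "RCVD_IN_PBL": "rbl",
    "URIBL_BLACK": "uribl",
    "LOCAL_RELAYCOUNTRY_BAD": "relaycountry",
}


def count_classes(hits):
    """Count distinct evidence classes among the rules that hit."""
    return len({RULE_CLASSES[rule] for rule in hits if rule in RULE_CLASSES})


def near_certain_spam(hits, min_classes=3):
    """True when enough independent classes agree (min_classes is assumed)."""
    return count_classes(hits) >= min_classes
```

Counting classes rather than rules means two overlapping RBL hits count
as one piece of evidence, which is the point of guarding against
mistraining through overlapping rules.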

The problem I am trying to solve is that nearly all of my spam is
flagged by body rules.  The header rules seem to be close to useless
against the latest campaigns - spammers seem to have learned enough to
avoid sending obvious stinking pieces of turd.  (The one exception is
patterns in the Message-ID, but I am afraid that will be short-lived
too, and it is insufficient by itself even now.)

Thus, even if I set bayes_auto_learn_threshold_spam low, very few of my
spams are autolearned because of the 3/3 requirement (at least 3 points
each from header and body rules).  The damn 3/3 is my problem - how can
I work around it?  If I have to spend an hour a day manually training
the classifier, the spammers have won :-(
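
To make the 3/3 problem concrete, here is a rough Python sketch of the
autolearn-as-spam decision as it is described in this thread. It is a
simplification and not the plugin's actual code: the function and
parameter names are hypothetical, the real autolearn point calculation
may differ from the displayed score, and `forced` stands for a hit on a
rule carrying the per-test tflag under discussion (autolearn_force,
assuming that is the setting meant).

```python
def would_autolearn_spam(head_points, body_points, threshold=12.0, forced=False):
    """Simplified model of the autolearn-as-spam check.

    threshold stands in for bayes_auto_learn_threshold_spam (12.0 is the
    documented default).  forced models a hit on a rule with the
    per-test tflag, which is assumed here to lift the separate
    header/body minimums.
    """
    total = head_points + body_points
    if total < threshold:
        return False
    if forced:
        # Assumption: the source of the points no longer matters.
        return True
    # The 3/3 requirement: at least 3 points from each side.
    return head_points >= 3.0 and body_points >= 3.0


# A body-heavy spam: well over the threshold, but almost no header points.
print(would_autolearn_spam(0.5, 14.0))               # False
print(would_autolearn_spam(0.5, 14.0, forced=True))  # True
```

Under this model, lowering the threshold alone never helps a message
whose points come almost entirely from body rules.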

By the way, how are meta rules counted for this purpose?  The
documentation says nothing about that.

-- 
Please *no* private copies of mailing list or newsgroup messages.
