From: "Bowie Bailey" <[EMAIL PROTECTED]>

jdow wrote:
From: "Bart Schaefer" <[EMAIL PROTECTED]>
> On 4/29/06, Matt Kettler <[EMAIL PROTECTED]> wrote:
> > In SA 3.1.0 they did force-fix the scores of the bayes rules,
> > particularly the high-end. The perceptron assigned BAYES_99 a
> > score of 1.89 in the 3.1.0 mass-check run. The devs jacked it up
> > to 3.50.
> >
> > That does make me wonder if:
> >     1) When BAYES_9x FPs, it FPs in conjunction with lots of
> > other rules due to the ham corpus being polluted with spam.
>
> My recollection is that there was speculation that the BAYES_9x
> rules were scored "too low" not because they FP'd in conjunction
> with other rules, but because against the corpus they TRUE P'd in
> conjunction with lots of other rules, and that it therefore wasn't
> necessary for the perceptron to assign a high score to BAYES_9x in
> order to push the total over the 5.0 threshold.
>
> The trouble with that is that users expect training on their
> personal spam flow to have a more significant effect on the
> scoring.  I want to train bayes to compensate for the LACK of
> other rules matching, not just to give a final nudge when a bunch
> of others already hit.
>
> I filed a bugzilla some while ago suggesting that the bayes
> percentage ought to be used to select a rule set, not to adjust
> the score as a component of a rule set.
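
To make the arithmetic in the quoted discussion concrete, here is a minimal sketch using only the figures mentioned above (the 5.0 threshold, the perceptron's 1.89, and the hand-set 3.50). The helper name points_still_needed is hypothetical and is not part of SpamAssassin; it only shows how many points other rules would have to contribute once BAYES_99 has hit.

```python
# Figures quoted in this thread; the helper below is a hypothetical illustration.
THRESHOLD = 5.0              # spam threshold cited in the thread
BAYES_99_PERCEPTRON = 1.89   # score the perceptron assigned in the 3.1.0 run
BAYES_99_SHIPPED = 3.50      # score the devs set by hand


def points_still_needed(bayes_score, threshold=THRESHOLD):
    """Points other rules must add before a BAYES_99 hit crosses the threshold."""
    return round(max(threshold - bayes_score, 0.0), 2)


for label, score in [("perceptron", BAYES_99_PERCEPTRON),
                     ("shipped", BAYES_99_SHIPPED)]:
    needed = points_still_needed(score)
    print(f"{label}: BAYES_99={score:.2f}, other rules must add {needed:.2f}")
```

At either score a BAYES_99 hit cannot reach the threshold by itself, which is the complaint above: Bayes training only nudges the total when other rules already match.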

There is one other gotcha. I bet vastly different scores are
warranted for Bayes when run with per-user training and rules as
compared to global training and rules.

Ack!  I missed the subject change on this thread prior to my last
reply.  Sorry about the duplication.

I think it is also a matter of manual training vs autolearning.  A
Bayes database that is consistently trained manually will be more
accurate and can support higher scores.

That may be a factor, too, Bowie. But, as igor is experiencing, the
site Bayes faces a singular problem in that one person's ham is another
person's extreme spam. When no two people can agree on what spam is
and what ham is, a global Bayes becomes (relatively) ineffective very
quickly. This is why I included that afterthought, which probably should
have been highlighted up front.

{^_^}
