Re: Annoying auto_whitelist

RW Fri, 10 Jul 2009 04:42:44 -0700

On Fri, 10 Jul 2009 12:33:51 +0200
Matus UHLAR - fantomas <uh...@fantomas.sk> wrote:


> > > On Sat, 04 Jul 2009 08:56:35 -0400
> > > Matt Kettler <mkettler...@verizon.net> wrote:
> > > > Please be aware the AWL is NOT a whitelist, or a blacklist, and
> > > > the scores don't really quite work the way they look. The AWL is
> > > > essentially an averager, and as such, it's sometimes going to
> > > > assign negative scores to spam.
> 
> > > And it works from its own version of the score that ignores
> > > whitelisting and bayes scores. So if learning a spam leads to the
> > > next spam from the same address getting a higher bayes score,
> > > that benefit isn't washed-out by AWL. 
> 
> On 04.07.09 22:42, RW wrote:
> > I take that back, I thought the BAYES_XX rules were ignored by
> > AWL, but they aren't.
> > 
> > Personally I think BAYES should be ignored by AWL, emails from the
> > same "from address" and ip address will have a lot of tokens in
> > common.  They should train quickly, and there shouldn't be any need
> > to "damp-out" that learning.
> 
> I don't think so. Teaching BAYES is a good way to hint to AWL which
> way it should push scores. By ignoring bayes, you could move much spam
> the ham-way since much spam isn't caught by any scores other than
> BAYES, and vice versa.
> 
Right, but that's only a benefit if the BAYES score drops. Remember,
it's an averaging system. Personally, I have only a single spam in my
spam corpus that has an AWL hit and doesn't hit BAYES_99, and that one
hits BAYES_95. Sending multiple spams from the same From address and IP
address is a gift to Bayesian filters.

The much more common scenario is that the first spam hits BAYES_50 and
subsequent BAYES_99 hits are countered by a negative AWL score.
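
To make that scenario concrete, here is a rough sketch of the averaging,
not the actual SpamAssassin code. It assumes the AWL adjustment is
(mean of earlier scores - current score) * factor, with a factor of 0.5,
and that the history stores the pre-AWL score. The point values and the
include_bayes switch are made up for illustration; the switch models
the "ignore BAYES in AWL" idea and is not an existing option.

```python
def awl_delta(history, score, factor=0.5):
    """Adjustment that pulls `score` towards the mean of earlier scores."""
    if not history:
        return 0.0  # first message from this sender: nothing to average
    mean = sum(history) / len(history)
    return (mean - score) * factor


def run(messages, include_bayes):
    """Return final scores for a sequence of (other_points, bayes_points)."""
    history = []
    finals = []
    for other, bayes in messages:
        # Score that AWL sees and remembers (hypothetical switch).
        awl_input = other + bayes if include_bayes else other
        delta = awl_delta(history, awl_input)
        finals.append(other + bayes + delta)
        history.append(awl_input)
    return finals


# Illustrative points only: the first spam gets a low BAYES score,
# later ones from the same sender get a high one after training.
messages = [(3.0, 1.0), (3.0, 3.5), (3.0, 3.5)]

with_bayes = run(messages, include_bayes=True)
without_bayes = run(messages, include_bayes=False)

print(with_bayes)     # [4.0, 5.25, 5.875]
print(without_bayes)  # [4.0, 6.5, 6.5]
```

With BAYES included in the average, the second and third spams are
dragged back towards the first, low score. With BAYES left out of the
AWL input, the gain from training is kept in full.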
