On Fri, 10 Jul 2009 12:33:51 +0200 Matus UHLAR - fantomas <uh...@fantomas.sk> wrote:
> > > On Sat, 04 Jul 2009 08:56:35 -0400
> > > Matt Kettler <mkettler...@verizon.net> wrote:
> > > > Please be aware the AWL is NOT a whitelist, or a blacklist, and
> > > > the scores don't really quite work the way they look. The AWL is
> > > > essentially an averager, and as such, it's sometimes going to
> > > > assign negative scores to spam.
>
> > > And it works from its own version of the score that ignores
> > > whitelisting and bayes scores. So if learning a spam leads to the
> > > next spam from the same address getting a higher bayes score,
> > > that benefit isn't washed-out by AWL.
>
> On 04.07.09 22:42, RW wrote:
> > I take that back, I thought the BAYES_XX rules were ignored by
> > AWL, but they aren't.
> >
> > Personally I think BAYES should be ignored by AWL, emails from the
> > same "from address" and ip address will have a lot of tokens in
> > common. They should train quickly, and there shouldn't be any need
> > to "damp-out" that learning.
>
> I don't think so. Teaching BAYES is a good way to hint AWL which way
> it should push scores. By ignoring bayes, you could move much spam
> the ham-way since much spam isn't caught by scores other than
> BAYES, and vice versa.

Right, but that's only a benefit if the BAYES score drops - remember
it's an averaging system. Personally I only have a single spam in my
spam corpus that has an AWL hit and doesn't hit BAYES_99, and that one
hits BAYES_95.

Sending multiple spams from the same from address and IP address is a
gift to Bayesian filters. The much more common scenario is that the
first spam hits BAYES_50 and subsequent BAYES_99 hits are countered by
a negative AWL score.
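
To put rough numbers on that last scenario, here is a minimal sketch of
the averaging. It is not the actual plugin code. It assumes the AWL
adjustment is (sender's mean so far - current score) * factor, with the
factor at 0.5 (the documented default for auto_whitelist_factor), and
that the recorded score is the pre-AWL one. The SenderHistory class and
all the rule scores are made up for illustration.

```python
# Illustrative sketch only: AWL modelled as a per-sender running average.
# FACTOR mirrors the documented default of auto_whitelist_factor (0.5).
# The rule scores below are hypothetical, not real SpamAssassin scores.

FACTOR = 0.5


class SenderHistory:
    """Running total of pre-AWL scores for one (from address, IP) pair."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def adjust(self, score):
        """Return the AWL delta for this message, then record its score."""
        if self.count == 0:
            delta = 0.0  # no history yet, nothing to average against
        else:
            mean = self.total / self.count
            delta = (mean - score) * FACTOR
        self.total += score
        self.count += 1
        return delta


OTHER_RULES = 2.0  # hypothetical non-Bayes hits, same for every message
BAYES_50 = 0.8     # hypothetical score
BAYES_99 = 3.5     # hypothetical score

history = SenderHistory()
deltas = []
# First spam is untrained (BAYES_50); after learning, the next two hit BAYES_99.
for bayes in (BAYES_50, BAYES_99, BAYES_99):
    score = OTHER_RULES + bayes
    delta = history.adjust(score)
    deltas.append(delta)
    print(f"pre-AWL {score:.2f}  AWL {delta:+.3f}  final {score + delta:.3f}")
```

With these made-up numbers the first message gets no adjustment, and
the two BAYES_99 messages get AWL deltas of about -1.35 and -0.675, so
part of what Bayes learned is averaged away. The delta only turns
positive if a later message scores below the sender's mean, which is
the "BAYES score drops" case.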