RW wrote:
> On Fri, 10 Jul 2009 12:33:51 +0200
> Matus UHLAR - fantomas <uh...@fantomas.sk> wrote:
>
>   
>>>> On Sat, 04 Jul 2009 08:56:35 -0400
>>>> Matt Kettler <mkettler...@verizon.net> wrote:
>>>>         
>>>>> Please be aware the AWL is NOT a whitelist or a blacklist, and
>>>>> the scores don't really work the way they look. The AWL is
>>>>> essentially an averager, and as such, it's sometimes going to
>>>>> assign negative scores to spam.
>>>>>           
>>>> And it works from its own version of the score that ignores
>>>> whitelisting and bayes scores. So if learning a spam leads to the
>>>> next spam from the same address getting a higher bayes score,
>>>> that benefit isn't washed-out by AWL. 
>>>>         
>> On 04.07.09 22:42, RW wrote:
>>     
>>> I take that back, I thought the BAYES_XX rules were ignored by
>>> AWL, but they aren't.
>>>
>>> Personally I think BAYES should be ignored by AWL, emails from the
>>> same "from address" and ip address will have a lot of tokens in
>>> common.  They should train quickly, and there shouldn't be any need
>>> to "damp-out" that learning.
>>>       
>> I don't think so. Teaching BAYES is a good way to hint to AWL which
>> way it should push scores. By ignoring bayes, you could move much spam
>> the ham-way since much of spam isn't caught by other scores than
>> BAYES, and vice versa.
>>
>>     
> Right, but that's only a benefit if the BAYES score drops - remember
> it's an averaging system. Personally I only have a single spam in my
> spam corpus that has an AWL hit and doesn't hit BAYES_99, and that hits
> BAYES_95. Sending multiple spams from the same from address and IP
> address is a gift to Bayesian filters.
>
> The much more common scenario is that the first spam hits BAYES_50 and
> subsequent BAYES_99 hits are countered by a negative AWL score.
>   
Technically, this only counters half the score. It also gets "paid back"
later, because the higher score raises the stored average that will apply
to subsequent messages.
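
To make the arithmetic concrete, here is a rough sketch of the averaging,
not SpamAssassin's actual code. It assumes the default
auto_whitelist_factor of 0.5; the AwlEntry class and the scores (one low
first message, then two higher ones) are made up for illustration.

```python
# Simplified sketch of AWL-style averaging; names and scores are hypothetical.
FACTOR = 0.5  # assumed to mirror the default auto_whitelist_factor


class AwlEntry:
    """Running history for one (sender address, IP) pair."""

    def __init__(self):
        self.total = 0.0  # sum of pre-AWL scores seen so far
        self.count = 0    # number of messages seen so far

    def adjust(self, score):
        """Return (awl_delta, final_score), then record the pre-AWL score."""
        if self.count == 0:
            delta = 0.0  # no history yet, so nothing to average against
        else:
            mean = self.total / self.count
            # Pull the score part of the way toward the stored mean.
            delta = (mean - score) * FACTOR
        # The history stores the score *before* the AWL adjustment.
        self.total += score
        self.count += 1
        return delta, score + delta


entry = AwlEntry()
history = [entry.adjust(s) for s in (1.0, 5.0, 5.0)]
for delta, final in history:
    print(f"AWL {delta:+.1f} -> final {final:.1f}")
```

With these numbers the second message rises 4 points over the first and
the AWL takes back 2 of them. By the third message the stored mean has
moved from 1.0 to 3.0, so the adjustment shrinks to -1.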

I'd also argue it's a rather rare case. Most of my spam hits BAYES_99
the first time around, and most of it has varying sender addresses and
IPs. A run of messages with an increasing score and the same sender
address/IP seems extraordinarily unlikely to me.

Besides, the real problem there isn't the AWL, but the fact that the
first message scored low.

Are you really seeing cases where this is causing false negatives, or
are you just pontificating about what's possible?

