Re: AWL vs. mailing lists

Bob George 26 Feb 2004 02:15:59 -0000

Gary Funck wrote:
>> From: Rich Puhek [mailto:[EMAIL PROTECTED]
> [...]
>> Seems like AWL is indeed intended to let people I trust occasionally
>> send me email that _SA thinks is spam_ (slight difference from your
>> question).


And vice-versa I suppose.

>> So the idea is that the AWL system figures out that your
>> sailor buddy's messages tend to score high, and AWL compensates for
>> it automatically (no training needed).
>
> Although I agree with your overall description of AWL, it doesn't in
> fact let my friend's messages through because his messages often
> score higher. If they often score high, his AWL score will be high
> (more spammy), and as it creeps up will actually increase the chances
> that his message goes above the spam threshold.

I think a manual whitelist entry is probably more appropriate for
those spammy friends... and this certainly raises questions about
the wisdom of auto-bayes training off ALL ham.

But yes, my understanding is that it brings messages more towards
the overall norm for the sender in terms of scoring.
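
A rough sketch of that "pull toward the sender's norm" idea, in Python.
It assumes the formula given in the SpamAssassin documentation for
auto_whitelist_factor (final = score + (mean - score) * factor, with a
default factor of 0.5). The function name, argument names and sample
scores are made up for illustration; this is not SA's actual code.

```python
def awl_adjust(score, history, factor=0.5):
    """Return (final_score, adjustment) for one message.

    score   -- score from all the other rules for this message
    history -- earlier scores recorded for this sender (may be empty)
    factor  -- how far to regress toward the sender's mean (0..1);
               0.5 is assumed here as the documented default
    """
    if not history:
        # No history for this sender: nothing to pull toward.
        return score, 0.0
    mean = sum(history) / len(history)
    adjustment = (mean - score) * factor
    return score + adjustment, adjustment


# Trusted sender (low historical mean) sends one spammy-looking message.
trusted = awl_adjust(6.5, [0.0, 1.0, 0.5])

# "Spammy friend" (high historical mean) sends a milder message.
spammy_friend = awl_adjust(4.0, [6.0, 7.0, 8.0])

# First-time poster: no history, so no adjustment at all.
first_post = awl_adjust(8.0, [])
```

With these made-up numbers the trusted sender's message is pulled down
by 3.0 points, while the spammy friend's message is pushed up by 1.5,
which is Gary's point: a high history makes things worse, not better.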

>> The idea does break a bit with mailing lists and with AWL poisoning
>> attempts, of course. For that, your procmail example looks perfect.
>
> The idea partially breaks with mailing lists, because often the
> spammers post their very first message to the list and it is spam.

Hmm. In the case of a one-off poster, does AWL come into play?
Hopefully, any frequent spammer will be moderated or banned from a
list. It varies by list, but most I'm on have the From: field set
to the original sender, and THIS is the address considered in AWL.
So messages to the list FROM FREQUENT POSTERS are affected by AWL,
not all.

I've yet to see an AWL adjustment of more than -2, and certainly
nothing like the -25 described by Michael.

> Same would
> go for [EMAIL PROTECTED], if the first message she sent you was
> spam. Maybe AWL should start off at 1/5 of the spam threshold, thus
> 1.0 on a 5.0 scale, slightly biasing in favor of spam upon first
> arrivals?

Yes (based on what I know -- admittedly only as a lowly end-user).

> I don't think it happens that often that someone who has posted to a
> mailing list over a long period of time suddenly turns around and
> sends a spam (though it does happen, and some spammers are even using
> this trick).

I suppose forged messages would be a problem. Still, other SA
methods would, I think, generally override a +/-2 offset.

> What makes mailing lists different is that there are
> many more senders, and on average you don't see very many messages
> from a given sender.

So AWL really won't come into play, will it? On the busiest lists
I'm on, AWL adjustments are generally in the -1 to 0 range. Minor
"nudges". I'd only really expect AWL to matter if a VERY trusted,
frequent sender sends (or forwards) the odd spam.

> Again, maybe AWL should begin with a bias in
> favor of judging initial messages as spam?

It starts with none at all, which seems prudent to me.

- Bob

>> The other nice thing about the AWL approach vs relying on Bayes...
>> suppose your friend sends you enough risqué messages that traditional
>> porn spam indicators are no longer reliable spam tokens?
>
> My approach to this is to outright whitelist [EMAIL PROTECTED]. Bayes
> ignores whitelisting, as I recall, in deciding whether to autolearn or
> not.
>
> Same would go for [EMAIL PROTECTED] - I just add him to my explicit
> white list.
