Gary Funck wrote:
>> From: Rich Puhek [mailto:[EMAIL PROTECTED]
> [...]
>> Seems like AWL is indeed intended to let people I trust occasionally
>> send me email that _SA thinks is spam_ (slight difference from your
>> question).

And vice-versa I suppose.

>> So the idea is that the AWL system figures out that your
>> sailor buddy's messages tend to score high, and AWL compensates for
>> it automatically (no training needed).
>
> Although I agree with your overall description of AWL, it doesn't in
> fact let my friend's messages through because his messages often
> score higher. If they often score high, his AWL score will be high
> (more spammy), and as it creeps up will actually increase the chances
> that his message goes above the spam threshold.

I think a manual whitelist entry is probably more appropriate for those spammy friends... and this certainly raises questions about the wisdom of auto-bayes training off ALL ham. But yes, my understanding is that it brings messages more towards the overall norm for the sender in terms of scoring.

>> The idea does break a bit with mailing lists and with AWL poisoning
>> attempts, of course. For that, your procmail example looks perfect.
>
> The idea partially breaks with mailing lists, because often the
> spammers post their very first message to the list and it is spam.

Hmm. In the case of a one-off poster, does AWL come into play? Hopefully, any frequent spammer will be moderated or banned from a list. It varies by list, but most I'm on have the From: field set to the original sender, and THIS is the address considered in AWL. So only messages to the list FROM FREQUENT POSTERS are affected by AWL, not all of them. I've yet to see an AWL adjustment of more than -2, and certainly nothing like the -25 described by Michael.

> Same would
> go for [EMAIL PROTECTED], if the first message she sent you was
> spam. Maybe AWL should start off at 1/5 of the spam threshold, thus
> 1.0 on a 5.0 scale, slightly biasing in favor of spam upon first
> arrivals?

Yes (based on what I know -- admittedly only as a lowly end-user).

> I don't think it happens that often that someone who has posted to a
> mailing list over a long period of time suddenly turns around and
> sends a spam (though it does happen, and some spammers are even using
> this trick).

I suppose forged messages would be a problem. Still, other SA methods would, I think, generally override a +/-2 offset.

> What makes mailing lists different is that there are
> many more senders, and on average you don't see very many messages
> from a given sender.

So AWL really won't come into play, will it? On the busiest lists I'm on, AWL adjustments are generally in the -1 to 0 range. Minor "nudges". I'd only really expect AWL to matter if a VERY trusted, frequent sender sends (or forwards) the odd spam.

> Again, maybe AWL should begin with a bias in
> favor of judging initial messages as spam?

It starts with none at all, which seems to me prudent.

- Bob

>> The other nice thing about the AWL approach vs relying on Bayes...
>> suppose your friend sends you enough risqué messages that traditional
>> porn spam indicators are no longer reliable spam tokens?
>
> My approach to this is to outright whitelist [EMAIL PROTECTED]. Bayes
> ignores whitelisting as I recall, in deciding whether to autolearn or
> not.
>
> Same would go for [EMAIL PROTECTED] - I just add him to my explicit
> white list.
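
For anyone following the arithmetic in this thread, here is a rough Python sketch of the "pull toward the sender's norm" behaviour discussed above. It assumes the adjustment is final = score + (mean - score) * factor, with a factor of 0.5, which is how I understand the `auto_whitelist_factor` setting to work; treat that as an assumption, not a statement about the actual plugin code. The names `SenderHistory` and `awl_adjust` are made up for illustration, and the sketch keys on the From: address only.

```python
from collections import defaultdict

AWL_FACTOR = 0.5       # assumed default for auto_whitelist_factor
SPAM_THRESHOLD = 5.0   # the 5.0 scale mentioned in the thread


class SenderHistory:
    """Running total and count of pre-adjustment scores for one sender."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def mean(self):
        # None means "never seen before": no history, so no adjustment.
        return self.total / self.count if self.count else None


def awl_adjust(history, sender, score, factor=AWL_FACTOR):
    """Return the score moved part of the way toward the sender's mean."""
    entry = history[sender]
    mean = entry.mean()
    delta = 0.0 if mean is None else (mean - score) * factor
    # The history records the unadjusted score of this message.
    entry.total += score
    entry.count += 1
    return score + delta


history = defaultdict(SenderHistory)

# First message from a sender: nothing to regress toward, score unchanged.
first = awl_adjust(history, "sailor@example.com", 4.0)        # 4.0

# The "spammy friend": after a run of 4.0s, a clean 1.0 is pushed UP.
for _ in range(3):
    awl_adjust(history, "sailor@example.com", 4.0)
low = awl_adjust(history, "sailor@example.com", 1.0)          # 2.5

# A trusted, frequent sender (mean 1.0) forwards the odd spam scoring 8.0.
for _ in range(4):
    awl_adjust(history, "regular@example.com", 1.0)
rescued = awl_adjust(history, "regular@example.com", 8.0)     # 4.5

print(first, low, rescued, rescued < SPAM_THRESHOLD)
```

Under these assumptions the sketch reproduces both points made above: a friend whose mail usually scores high gets nudged toward spam rather than away from it, and a first-time sender gets no adjustment at all. The large swing in the last case comes from the extreme numbers chosen; the -1 to 0 nudges reported on busy lists correspond to scores that sit close to the sender's mean.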