On 30-Apr-2009, at 11:50, Charles Gregory wrote:
On Thu, 30 Apr 2009, LuKreme wrote:
First off, I suppose that if you get real mail from someone who has only ever been seen as a spam sender, then yes, the first mail would be penalized. But is this ever the case?

(nod) Any time someone's address has been used as a spoofed sender before that legitimate sender makes first contact with a new correspondent. But as I understand your logic, there is no 'rule' to distinguish the 'first' AWL entry as 'special' from all the rest... just that 'others' exist...

Right.

Let's lay out the logic here:
2 AWL is positive or does not exist
a Check for other AWL entries using same address but different hosts.
i If there is an AWL with a negative score, then multiply by -0.2 and
  add to score

So any AWL with a negative score still helps the new mail be negative?
The sender's legit mail helps new spam?

No, the sender's AWL HURTS new spam. If the score is -2 from the AWL then -2 * -0.2 = 0.4

ii If there is an AWL with a positive score, under 5.0, then multiply by
  0.1 and add
iii If there is an AWL with a positive score over 5.0, then multiply it
  by 0.4 and add

So in the unlikely event that spam (from a different server) precedes legitimate mail, the legit sender gets a positive adjustment before they have a chance to score negative...

As I understand it the AWL is added after all others, but yes, the FIRST legitimate mail will be penalized.

Note that this logic will also be problematic when sender has multiple mail servers. Many senders get a few points positive...

This will only be an issue if those multiple servers have positive AWL scores.

c if total amount added is over some threshold, normalize on that threshold
(3 points? 5? 8?)

Now let's presume that the sender is spoofed by spammers on ten different IPs, producing ten different AWL entries. How will you distinguish the legit sender's IP (except by hoping they have scored negative?)... You will simply add up ALL the IP AWLs and score *any* mail from the sender with a significant positive adjustment....

As far as I can tell, though it's not easy to be sure, legitimate senders have negative AWL scores.

3 AWL is negative
{ crickets }
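
The outline above (cases 2 and 3, rules i to iii, and the threshold in c) can be sketched in a few lines of Python. This is only an illustration of the proposal as written in this thread, not anything SpamAssassin implements. The function name, the 5.0 cap (the thread leaves it open at "3 points? 5? 8?"), the choice to treat an entry of exactly 5.0 under rule iii, and the reading of "normalize" as "cap" are all assumptions.

```python
def cross_host_adjustment(current_awl, other_host_awls, cap=5.0):
    """Extra points from AWL entries for the same address on other hosts.

    current_awl: AWL score for this address+host, or None if no entry exists.
    other_host_awls: AWL scores for the same address seen from other hosts.
    cap: hypothetical threshold for rule c (the thread does not settle on one).
    """
    # Case 3: the entry for this host is negative, so nothing is added.
    if current_awl is not None and current_awl < 0:
        return 0.0

    # Case 2: the entry is positive (or zero) or does not exist.
    total = 0.0
    for score in other_host_awls:
        if score < 0:
            total += score * -0.2  # rule i: a negative entry still adds points
        elif score < 5.0:
            total += score * 0.1   # rule ii: small positive entry
        else:
            total += score * 0.4   # rule iii: 5.0 and over (boundary assumed)

    # Rule c: never add more than the threshold.
    return min(total, cap)


# The example from the thread: one other entry at -2 adds 0.4 points.
print(cross_host_adjustment(None, [-2.0]))
# Ten spoofed hosts averaging 7.0 each hit the cap.
print(cross_host_adjustment(None, [7.0] * 10))
```

Written this way, the objection about a sender spoofed from ten IPs is easy to see: every entry for the address contributes, and only case 3 protects the legitimate host.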

But how often does that really happen? As I said, most people get a *few* points on legit mail.

But it's not the points on the mail, it is only the AWL listing that we're looking at.

The idea being that an average score of 0.8 will 'average' with a fluke spammy mail and keep the score lower.... But your way is adding those small scores to essentially ALL mail unless the lucky sender never mentioned viag.... ooops. There goes *my* score.... LOL

OK, how do we parse out the AWL numbers then so we can see what sorts of AWL numbers exist for legit senders? As I understand it, if an email comes in from a known sender whose average was 0.8 and this email scores 3.0, a negative AWL will be applied to normalize the email closer to 0.8, right? The AWL score is not 0.8, but 3.0 - (AWL value)?

Maybe it makes sense to only do this check if the message has at least scored positive?

Again, a significant proportion of ham gets a few points.

So yes, if b...@example.com has never emailed me except for a bunch of spam, then yeah, the message is going to get bumped up in its score, but how often does that happen? Does that ever happen?

Happens for me all the time. I get dictionary spam with a random client's address as sender, and then I get an inquiry from the client about all these 'bounces' they are receiving. Naturally, they quote the bounce, which includes some spam sign, and the client is off to a good start with a moderately spammy mail to me. (smile)

But bob could also e-mail you three or four times, getting a small positive score, then you get spammed "from Bob" with high scores from a botnet (and I usually get several copies of a spam like that), and the next time bob e-mails, he gets logic 2.a.ii applied above for each and every AWL for his address. Could be hefty....

Er.. ok. Perhaps I am misunderstanding the AWL. As I understand it, if a bunch of spam comes in from a server with average scores of 7.0 and a new message comes in with a score of 4, it will have a POSITIVE AWL applied to normalize at 7.0. If a message comes from a known sender with an average score of 2, and this email scores 4, it will get a NEGATIVE AWL score to normalize closer to 2.0, right? Since this is a negative AWL 2.a.ii would not apply because the AWL is negative, so section 2 is skipped entirely and we are at 3. AWL is negative => {crickets}.
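
The two cases in the paragraph above can be checked with a small sketch. It assumes the AWL adjustment is a fixed fraction of the gap between the sender's historical mean and the current message score. The function name is made up for illustration, and the 0.5 factor is an assumption (SpamAssassin exposes an auto_whitelist_factor setting for this; check your own configuration rather than trusting the value here).

```python
def awl_adjustment(sender_mean, message_score, factor=0.5):
    """Points the AWL adds to (or removes from) a message.

    The message is pulled part of the way toward the sender's mean:
    factor=1.0 would land exactly on the mean, factor=0.0 disables the AWL.
    """
    return (sender_mean - message_score) * factor


# Spammy history (mean 7.0), milder message (4.0): positive adjustment.
print(awl_adjustment(7.0, 4.0))
# Known sender (mean 2.0), unusually spammy message (4.0): negative adjustment.
print(awl_adjustment(2.0, 4.0))
```

Under this assumption the sign works as described: a message below the sender's mean is pushed up, one above it is pulled down, and with a factor below 1.0 it only moves part of the way rather than all the way to the mean.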

Also, let's say b...@example.com sends a message after a bunch of spams have been sent, and say that message scores -1.0, plus an AWL adjustment of 5.0 based on the above.

I'm sure there are some people who *would* 'fit your model' and have negative scores on their legit mail and not be hurt by the proposed rule.

I think we are talking at cross purposes, and that's likely my fault. I am talking about the AWL adjustment being either positive or negative. Mail that is more spammy than usual will get penalized up. Mail that is less spammy than usual will not be affected.

Which, for any yahoo mailing list will be a different server many times. And so if your yahoo list scores slightly positive, all those different yahoo servers will all add to the score. Ditto hotmail, gmail, etc.

OK, if the value is 0.1 then it would take up to 50 outbound servers with even distribution to add 5.0 points.
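
The "50 servers" figure can be reproduced with a little arithmetic, as long as one thing is assumed that the message does not state: each server's AWL entry averages about 1.0, so each contributes 0.1 * 1.0 = 0.1 points. The function and parameter names below are made up for this sketch.

```python
import math


def servers_needed(target_points, multiplier, per_server_awl):
    """Number of distinct servers needed to add at least target_points."""
    per_server = multiplier * per_server_awl
    # Round first so float noise does not push the ceiling up by one.
    return math.ceil(round(target_points / per_server, 9))


# 0.1 multiplier, entries averaging 1.0 (assumed): 50 servers to reach 5.0.
print(servers_needed(5.0, 0.1, 1.0))
# With the 0.8 average mentioned earlier in the thread it takes more.
print(servers_needed(5.0, 0.1, 0.8))
```

So the lower the typical positive AWL score for a list's servers, the more distinct servers it takes before the added points become significant.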

I can see what you *want* to do. I just don't see a practical way to do it.

That's quite possible. As I said initially, it's just an idea I had to make the AWL penalize botnets much more. If it can't be done, that's fine. I think there's some promise here though.

I'm not married to this idea, I just think there's something here that might be worth trying.

--
These budget numbers are not just estimates, these are the actual
        results for the fiscal year that ended February the 30th.
        - GWB
