Re: DNS Blacklist Policy Design

2006-06-06 Thread Marc Perkel
Everything on my list comes from honeypot accounts, people 
impersonating my domains, or other trickery. So spammers can't get around it 
by sending good messages, because they get listed based on who they sent the 
messages to. None of those listed are processed by SpamAssassin. So if 
there are any good messages from an IP, then it doesn't get listed at all.


My list might not be a big list but I intend it to be Spamhaus quality. 
If you're on my list then it can be bounced without being looked at. But 
I'm not yet there. I'm still testing.


RE: DNS Blacklist Policy Design

2006-06-05 Thread List Mail User
>...
>Paul,
>
>I've always thought of you as "chief scientist" among everyone on the spam
>assassin list... I've seen you dissect the inner mysterious workings of a
>spam like no other... uncovering the spammer's tracks like a superhero FBI
>agent meticulously piecing together data from the forensics lab.
>
>However, this time, I do think you've taken this DNS blacklist thing way too
>far. You have to consider the consumers of the DNS list as well.
>Overcomplicate this and few will ever get it to work effectively.
>
>:)
>
>Rob McEwen
>PowerView Systems
>[EMAIL PROTECTED]
>

Rob,

You have an excellent point.  But I think if the rules or a plugin
can be written so that the typical user need only install it, the "hidden"
complexity won't matter.  What I am afraid of is that the list can be made
useless by simple actions on the part of spammers (then everybody will have
wasted time, and maybe even opened up a hole for spam to get in - like
white-listing for twelve hours after an innocent looking message is sent).
To me, the data being offered seems too valuable not to try and take some
advantage of.

From the original discussion, this is intended to be an automatically
self-cleaning list, and that issue does greatly complicate things (though
it greatly reduces the work required of its operator).  It is important
that a self-cleaning list can't be caused to ignore spam sources easily.

No doubt, I do often make things more complex than they appear to
be *and* I haven't had enough sleep recently, which I don't think has hurt
my logic (yet), but does interfere with my ability to explain things:).

Paul Shupak
[EMAIL PROTECTED]


Re: DNS Blacklist Policy Design

2006-06-05 Thread List Mail User
>...
>From: "List Mail User" <[EMAIL PROTECTED]>
>
>> >...
>>>From: "List Mail User" <[EMAIL PROTECTED]>
>>>
>>>> All of this would use up 6 bits and still leave 17 for any other
>>>> purposes you have in mind (assuming codes from 127.0.0.2 to 127.0.0.126).
>>>
>>>Uses up 6 of the 7 bits in that range, Paul. Did you mean 127.0.0.2
>>>through 127.255.255.254?
>>>
>>>{o.o}
>>>
>> 
>> No I meant 127.0.0.2 to 127.0.0.126;  The bitmask '6' would check
>> the "bad" bits;  '24' the "good" bits; '32' for "well-known";  And '64'
>> for a recent offender.  The bottom bit can't be safely used if it can
>> be set alone (i.e. result in 127.0.0.1) and the top bit isn't needed.
>> Using the #1 bit (value 2) for any purpose is just redundant and not
>> needed.  (Using bit numbering starting at zero, and drawing little
>> endian for all of the programmers brought up on Intel  documentation.)
>> 
>> So I really did mean the 6 bits as below (warning ASCII art)
>> 
>> 128      64       32            16     8      4     2     1
>> ------------------------------------------------------------------
>> unused   recent   well-known   (good bits)   (bad bits)   unusable
>
>OK you meant 2 to 126 was used not that the ultimately usable bits
>extends over that range, which is what I had read your statement to
>mean. I took the parenthetical expression to be referring to the
>"17 for any other purposes" as opposed to the "6 bits" used up.
>
>{^_^}
>
>
Back on-list:-)

Actually, on reflection, since the "well-known" bit should never
occur alone without other bits, it could use bit 0 (value '1') and the BL
could then have 18 bits to spare (i.e. move "recent" to bit 5, value '32')
and use the range from 127.0.0.2 up to 127.0.0.63 - a total of 62 cases,
most of which wouldn't need to be returned by the DNS server, but
could be useful with meta rules for "good guy"/"negative SA scoring"
(e.g. a domain with only "good bits", anti-fraud measures and a "good"
reputation value could be given a small negative score - negative SA
scores are very valuable because they are hard to construct in a way that
cannot be gamed/defrauded - for example, "good bits" only, SPF or DK/DKIM
and HASHCASH, BSP, HABEAS, IATB or maybe even SIQ or the commercial version
of DCC's reputation value).

Paul Shupak
[EMAIL PROTECTED]


RE: DNS Blacklist Policy Design

2006-06-05 Thread Rob McEwen
Paul,

I've always thought of you as "chief scientist" among everyone on the spam
assassin list... I've seen you dissect the inner mysterious workings of a
spam like no other... uncovering the spammer's tracks like a superhero FBI
agent meticulously piecing together data from the forensics lab.

However, this time, I do think you've taken this DNS blacklist thing way too
far. You have to consider the consumers of the DNS list as well.
Overcomplicate this and few will ever get it to work effectively.

:)

Rob McEwen
PowerView Systems
[EMAIL PROTECTED]






Re: DNS Blacklist Policy Design

2006-06-05 Thread jdow

From: "List Mail User" <[EMAIL PROTECTED]>


> >...
>> From: "List Mail User" <[EMAIL PROTECTED]>
>>
>>> All of this would use up 6 bits and still leave 17 for any other
>>> purposes you have in mind (assuming codes from 127.0.0.2 to 127.0.0.126).
>>
>> Uses up 6 of the 7 bits in that range, Paul. Did you mean 127.0.0.2
>> through 127.255.255.254?
>>
>> {o.o}
>
> No I meant 127.0.0.2 to 127.0.0.126;  The bitmask '6' would check
> the "bad" bits;  '24' the "good" bits; '32' for "well-known";  And '64'
> for a recent offender.  The bottom bit can't be safely used if it can
> be set alone (i.e. result in 127.0.0.1) and the top bit isn't needed.
> Using the #1 bit (value 2) for any purpose is just redundant and not
> needed.  (Using bit numbering starting at zero, and drawing little
> endian for all of the programmers brought up on Intel  documentation.)
>
> So I really did mean the 6 bits as below (warning ASCII art)
>
> 128      64       32            16     8      4     2     1
> ------------------------------------------------------------------
> unused   recent   well-known   (good bits)   (bad bits)   unusable

OK you meant 2 to 126 was used not that the ultimately usable bits
extends over that range, which is what I had read your statement to
mean. I took the parenthetical expression to be referring to the
"17 for any other purposes" as opposed to the "6 bits" used up.

{^_^}



Re: DNS Blacklist Policy Design

2006-06-05 Thread List Mail User
>...
>From: "List Mail User" <[EMAIL PROTECTED]>
>
>> All of this would use up 6 bits and still leave 17 for any other
>> purposes you have in mind (assuming codes from 127.0.0.2 to 127.0.0.126).
>
>Uses up 6 of the 7 bits in that range, Paul. Did you mean 127.0.0.2
>through 127.255.255.254?
>
>{o.o}
>

No I meant 127.0.0.2 to 127.0.0.126;  The bitmask '6' would check
the "bad" bits;  '24' the "good" bits; '32' for "well-known";  And '64'
for a recent offender.  The bottom bit can't be safely used if it can
be set alone (i.e. result in 127.0.0.1) and the top bit isn't needed.
Using the #1 bit (value 2) for any purpose is just redundant and not
needed.  (Using bit numbering starting at zero, and drawing little
endian for all of the programmers brought up on Intel  documentation.)

So I really did mean the 6 bits as below (warning ASCII art)

  128      64       32            16     8      4     2     1
  ------------------------------------------------------------------
  unused   recent   well-known   (good bits)   (bad bits)   unusable

with all of the possible values:

  2  (one bad msg)
  4  (two bad msgs)
  6  (three bad msgs)
  8  (one good msg)
 10  (one good and one bad msg)
 12  (one good and two bad msgs)
 14  (one good and three bad msgs)
 16  (two good msgs)
 18  (two good msgs and one bad msg)
 20  (two good msgs and two bad msgs)
 22  (two good msgs and three bad msgs)
 24  (three good msgs)
 26  (three good msgs and one bad msg)
 28  (three good msgs and two bad msgs)
 30  (three good msgs and three bad msgs)
 32  (well-known - shouldn't occur alone)
 34  (well-known with one bad msg)
 36  (well-known with two bad msgs)
 38  (well-known with three bad msgs)
 40  (well-known with one good msg)
 42  (well-known with one good msg and one bad msg)
 44  (well-known with one good msg and two bad msgs)
 46  (well-known with one good msg and three bad msgs)
 48  (well-known with two good msgs)
 50  (well-known with two good msgs and one bad msg)
 52  (well-known with two good msgs and two bad msgs)
 54  (well-known with two good msgs and three bad msgs)
 56  (well-known with three good msgs)
 58  (well-known with three good msgs and one bad msg)
 60  (well-known with three good msgs and two bad msgs)
 62  (well-known with three good msgs and three bad msgs)
 64  (recent offender)
...
then repeat with the recent offender bit set

Since the case of value '32' shouldn't occur, I guess there are only 63
different cases ('96' - a well-known but recent offender would occur).
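
The layout above can be read back on the consumer side with a few shifts and
masks. What follows is only a minimal sketch in Python: the function name and
the field names are made up for illustration, and it assumes the reply is the
usual 127.0.0.x address with the code in the last octet.

```python
# Hypothetical helper; the field names are illustrative, not part of any
# existing DNSBL client or of the proposal itself.
def decode_reply(address):
    """Split the last octet of a 127.0.0.x reply into the proposed fields."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or octets[:3] != [127, 0, 0]:
        raise ValueError("not a 127.0.0.x reply: " + address)
    code = octets[3]
    return {
        "bad": (code >> 1) & 0b11,      # bits 1-2 (mask 6): 0 to 3 bad msgs
        "good": (code >> 3) & 0b11,     # bits 3-4 (mask 24): 0 to 3 good msgs
        "well_known": bool(code & 32),  # bit 5
        "recent": bool(code & 64),      # bit 6: recent offender
    }


print(decode_reply("127.0.0.18"))
```

A caller would then branch on the individual fields instead of comparing
against each of the listed values.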

Any spam arriving with bit 6 set (value '64') would set both
bits 1 and 2 (value '6').  Any spam arriving with either bit 3 or 4
set (values '8' and '16') would decrement the "good bits" field by one
(so a "testing" ham before spam would be at least partially "erased")
*and* increment the value of the "bad bits".  Otherwise the bits would
simply increment for each ham or spam received and decrement for each
time counter expiration (with the spam counter needing to be a longer
time than the ham counter - or the "good" field would have to be smaller
than the "bad field" so that the total possible "good" time period is
less than the maximum possible "bad" time period).
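
The update rules in the paragraph above might be sketched roughly as follows.
This is an illustration only, in Python, with made-up function names. It
leaves out the timer-driven decrements and does not model when the "recent"
bit gets set or cleared, since the description here does not pin those down.

```python
BAD_MASK = 6     # bits 1-2
GOOD_MASK = 24   # bits 3-4
RECENT = 64      # bit 6


def on_spam(code):
    """Return the new code after a spam arrives (timers not modelled)."""
    bad = (code & BAD_MASK) >> 1
    good = (code & GOOD_MASK) >> 3
    if code & RECENT:
        bad = 3                  # recent offender: straight to the maximum
    else:
        bad = min(3, bad + 1)    # otherwise just count up, saturating at 3
    if good > 0:
        good -= 1                # partially erase any "testing" ham
    return (code & ~(BAD_MASK | GOOD_MASK)) | (bad << 1) | (good << 3)


def on_ham(code):
    """Return the new code after a ham arrives (timers not modelled)."""
    good = min(3, ((code & GOOD_MASK) >> 3) + 1)
    return (code & ~GOOD_MASK) | (good << 3)


code = 0
for _ in range(3):
    code = on_ham(code)
print(on_spam(code))
```

Starting from zero, three hams followed by one spam leave two good and one
bad, and a single ham followed by a spam leaves only one bad.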

Sorry for the complexity, but I spent too much time years ago on
military and other hard real-time systems - most of the constraints I've
expressed are needed to avoid deadlock/live-lock or trivial circumvention
of the system.

In this "system", the best a spammer could do would be to send
three ham and one spam every ten hours, resulting in a value of '18'
when the spam is processed and increasing his connection rate 4 fold.
As long as that value (two good and one bad) still has a slightly "evil"
score (small positive number in SA or an MTA 45x response), there would be
no way to "game" the system.  The currently common case of a direct spam
or a single ham with a following spam would result in a value of '2' which
should be effectively "evil" - guess at a score of ~2 for SA or even a
MTA 5xx or 45x response.  Downgrade any SA scores for well-known senders
and never use a 5xx code at the MTA, just 45x for the worst cases (i.e.
long term grey-listing - effectively 10 hours).

Paul Shupak
[EMAIL PROTECTED]


Re: DNS Blacklist Policy Design

2006-06-05 Thread jdow

From: "List Mail User" <[EMAIL PROTECTED]>


> All of this would use up 6 bits and still leave 17 for any other
> purposes you have in mind (assuming codes from 127.0.0.2 to 127.0.0.126).


Uses up 6 of the 7 bits in that range, Paul. Did you mean 127.0.0.2
through 127.255.255.254?

{o.o}


Re: DNS Blacklist Policy Design

2006-06-05 Thread List Mail User
>...
>Here's what I'm trying. I'm using MyDNS but added a few fields. 
>Basically I'm creating a white list and a black list. The white list 
>merely prevents an IP from getting on the black list. An IP gets on the 
>whitelist for 12 hours and on the blacklist for 4 hours. The idea being 
>to prevent any source that sends any good email from accidentally being 
>blacklisted.
>...

Marc,

If you use a bitmasked value, you have (in the common case) 23 bits
to work with.  You can use the bottom two bits for three values of "bad"
behavior - please don't use a 4 hour time-out, spam runs last *much* longer
than that;  I'd suggest incrementing the bitfield from zero to three for
every spam received and decrementing it approximately every *ten* hours when
no spam shows up (i.e. bad behavior lasts 10 to 30 hours after the last spam).
You can use another two bits for three values of good behavior - if a ham
is received and no spam has been for some time, increment the "good" field
and if it is non-zero decrement the bad field;  Here you would use a shorter
time period - I'd suggest 8 hours - Thus "good" behavior would be rewarded
for up to 24 hours.  Finally a fifth bit could be used to denote known ISPs
and service providers who may have temporary problems (e.g. Yahoo!, Hotmail,
Gmail, etc.).

Your original concept of a 12 hour "good" period and 4 hour "bad"
period is doomed to failure because many spam runs send innocuous messages
at the beginning to test for acceptance by the MX - you would be rewarding
this action by preventing any such spam run from ever causing a "bad" mark
(i.e. spammers would quickly use this tactic if you were successful in getting
your list used, and I for one like the criteria you have proposed for listing;
I just have problems with the timing period suggestions you have proposed).

Further, one more bit with a much longer timeout (3 days to a week)
could be used to immediately escalate repeat offenders back to the maximum
"bad" value.

A scheme like this would allow MTA level choices between a 5xx response
or a 4xx response by comparing the bit mask - e.g. If only "bad" bits are set
send a 5xx, if both bad and good or "well-known" send a 4xx (effectively a
long term greylisting) or let the message be accepted and just score in SA.
Also, scoring in SA for accepted messages (if no MTA blocks are used or occur)
could assign different values for the 32 (or 64) possible combinations.
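
The MTA-level choice described above could look something like this. It is a
rough sketch in Python with a hypothetical function name, and the "accept"
branch for senders with no bad marks is an assumption added for completeness.

```python
def mta_action(bad, good, well_known):
    """Pick a response class from the decoded fields (illustrative only)."""
    if bad == 0:
        return "accept"   # assumption: nothing bad on record, leave it to SA
    if good > 0 or well_known:
        return "4xx"      # mixed record: temporary failure, long greylisting
    return "5xx"          # only "bad" bits set: reject outright


print(mta_action(bad=2, good=0, well_known=False))
print(mta_action(bad=1, good=2, well_known=False))
```

A site that prefers never to block at the MTA can ignore this and score the
same fields in SA instead.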

All of this that I have described would mean keeping track of three time
counters: The time since the last spam, the time since the last ham and the
time since the spam counter expired to zero.
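
The time-based decrement can be computed lazily from the elapsed time rather
than with a running timer. The following is only a sketch in Python, using the
ten hour and eight hour periods suggested earlier in this message; the
function name is hypothetical and the third counter is not modelled.

```python
def decayed(level, hours_idle, period_hours):
    """Level (0-3) after dropping one step per full idle period."""
    return max(0, level - int(hours_idle // period_hours))


# "bad" field: one step every 10 hours without spam
print(decayed(3, 25, 10))
# "good" field: one step every 8 hours
print(decayed(3, 24, 8))
```

With these periods a maximum "bad" value is fully cleared after 30 idle hours
and a maximum "good" value after 24.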

All of this would use up 6 bits and still leave 17 for any other
purposes you have in mind (assuming codes from 127.0.0.2 to 127.0.0.126).

Paul Shupak
[EMAIL PROTECTED]


Re: DNS Blacklist Policy Design

2006-06-05 Thread Marc Perkel



jdow wrote:
> From: "Marc Perkel" <[EMAIL PROTECTED]>
>
>> Rob McEwen (PowerView Systems) wrote:
>>> Marc,
>>>
>>> First, you should make a design decision up front... Are you going 
>>> to allow IP addresses of valid Hotmail and Yahoo DNS servers (for 
>>> example) which spew out a very high percentage of spams (especially 
>>> Nigeria scams) on your list, or not?
>>
>> The only IPs I intend to list are going to be 100% spammers. So no 
>> Yahoo servers will be on it.
>
> Gedanken Experiment
>
> Something I have noticed is that the various RBLs encode some properties
> of the hits in the address returned. You might consider experimenting
> with a beige list that pegs large ISPs like Yahoo, GMail, and AOL
> who seem to be somewhat indiscriminate about signups. The return would
> indicate whether there is current spam activity. Then the account name
> could be looked up as a secondary request, name.reversedip.dnsbl. At
> least these sites make it awkward to perform mass signups. And this
> would tend to stifle the names right away.
>
> (Might there be some way to use the additional information part of a
> DNS return to cite active spam accounts directly?)
>
> {^_^}

Here's what I'm trying. I'm using MyDNS but added a few fields. 
Basically I'm creating a white list and a black list. The white list 
merely prevents an IP from getting on the black list. An IP gets on the 
whitelist for 12 hours and on the blacklist for 4 hours. The idea being 
to prevent any source that sends any good email from accidentally being 
blacklisted.
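
The two-list scheme described above can be sketched in a few lines. This is an
illustration in Python only, not the MyDNS setup itself: the class and method
names are hypothetical, and times are plain seconds supplied by the caller.

```python
WHITELIST_TTL = 12 * 3600   # seconds an IP stays on the white list
BLACKLIST_TTL = 4 * 3600    # seconds an IP stays on the black list


class SenderLists:
    """In-memory stand-in for the two lists, keyed by IP with expiry times."""

    def __init__(self):
        self.white = {}
        self.black = {}

    def record_ham(self, ip, now):
        self.white[ip] = now + WHITELIST_TTL

    def record_spam(self, ip, now):
        """Blacklist the IP unless it is currently whitelisted."""
        if self.white.get(ip, 0) > now:
            return False    # the white list only blocks listing, nothing more
        self.black[ip] = now + BLACKLIST_TTL
        return True

    def is_blacklisted(self, ip, now):
        return self.black.get(ip, 0) > now


lists = SenderLists()
lists.record_ham("192.0.2.1", now=0)
print(lists.record_spam("192.0.2.1", now=3600))
```

In this sketch a single good message shields an IP from listing for the next
12 hours, whatever it sends afterwards.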


Re: DNS Blacklist Policy Design

2006-06-05 Thread jdow

From: "Marc Perkel" <[EMAIL PROTECTED]>


> Rob McEwen (PowerView Systems) wrote:
>> Marc,
>>
>> First, you should make a design decision up front... Are you going to allow IP 
>> addresses of valid Hotmail and Yahoo DNS servers (for example) which spew out a very 
>> high percentage of spams (especially Nigeria scams) on your list, or not?
>
> The only IPs I intend to list are going to be 100% spammers. So no Yahoo servers will be 
> on it.


Gedanken Experiment

Something I have noticed is that the various RBLs encode some properties
of the hits in the address returned. You might consider experimenting
with a beige list that pegs large ISPs like Yahoo, GMail, and AOL
who seem to be somewhat indiscriminate about signups. The return would
indicate whether there is current spam activity. Then the account name
could be looked up as a secondary request, name.reversedip.dnsbl. At
least these sites make it awkward to perform mass signups. And this
would tend to stifle the names right away.

(Might there be some way to use the additional information part of a
DNS return to cite active spam accounts directly?)

{^_^} 
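
The secondary lookup name suggested above could be built like this. It is a
sketch in Python under the usual DNSBL convention of reversing the IPv4 octets
and appending the list's zone; the zone "dnsbl.example" and the function name
are placeholders, and no existing list is known to answer such account
queries.

```python
def dnsbl_query_name(ip, zone, account=None):
    """Build reversedip.zone, or account.reversedip.zone when a name is given."""
    reversed_ip = ".".join(reversed(ip.split(".")))
    labels = [reversed_ip, zone]
    if account is not None:
        labels.insert(0, account)   # the proposed per-account secondary request
    return ".".join(labels)


print(dnsbl_query_name("192.0.2.10", "dnsbl.example"))
print(dnsbl_query_name("192.0.2.10", "dnsbl.example", account="someuser"))
```

The first name is the ordinary per-IP query and the second is the per-account
form, which would only be tried after the per-IP answer reports current spam
activity.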



Re: DNS Blacklist Policy Design

2006-06-05 Thread Marc Perkel



Rob McEwen (PowerView Systems) wrote:
> Marc,
>
> First, you should make a design decision up front... Are you going to allow IP 
> addresses of valid Hotmail and Yahoo DNS servers (for example) which spew out a 
> very high percentage of spams (especially Nigeria scams) on your list, or not?

The only IPs I intend to list are going to be 100% spammers. So no Yahoo 
servers will be on it.

> Personally, I think that it is better to NOT try to catch these via RBLs even 
> if only a tiny percentage of mail from some of those IPs is legit.

Agreed.

> Therefore, IMHO, a good RBL will try to whitelist frequently used valid SMTP 
> servers up front to prevent such collateral damage.
>
> I thank God that many RBLs do NOT do this and many ISPs use such RBLs... this 
> causes collateral damage which then puts pressure on these ISPs to clean up 
> their acts... but I just don't want that collateral damage on MY server.
>
> Finally, one really great resource for getting info on valid DNS servers is:
>
> http://www.senderbase.org/
>
> For example, if you enter the IP address of a valid SMTP server, it usually 
> returns this IP and OTHER IP addresses from that same organization.
>
> Keep in mind that being listed on senderbase.org alone doesn't mean that the IP 
> isn't a spammer's IP... but if senderbase reports the IP as belonging to a 
> legit organization and as being frequently used, that might be a good IP 
> address (or IP address range) for whitelisting to prevent it from ever showing 
> up on your RBL.
>
> Rob McEwen
> PowerView Systems
> [EMAIL PROTECTED]


  


Re: DNS Blacklist Policy Design

2006-06-05 Thread Rob McEwen (PowerView Systems)
Marc,

First, you should make a design decision up front... Are you going to allow IP 
addresses of valid Hotmail and Yahoo DNS servers (for example) which spew out a 
very high percentage of spams (especially Nigeria scams) on your list, or not?

Personally, I think that it is better to NOT try to catch these via RBLs even 
if only a tiny percentage of mail from some of those IPs is legit.

Therefore, IMHO, a good RBL will try to whitelist frequently used valid SMTP 
servers up front to prevent such collateral damage.

I thank God that many RBLs do NOT do this and many ISPs use such RBLs... this 
causes collateral damage which then puts pressure on these ISPs to clean up 
their acts... but I just don't want that collateral damage on MY server.

Finally, one really great resource for getting info on valid DNS servers is:

http://www.senderbase.org/

For example, if you enter the IP address of a valid SMTP server, it usually 
returns this IP and OTHER IP addresses from that same organization.

Keep in mind that being listed on senderbase.org alone doesn't mean that the IP 
isn't a spammer's IP... but if senderbase reports the IP as belonging to a 
legit organization and as being frequently used, that might be a good IP 
address (or IP address range) for whitelisting to prevent it from ever showing 
up on your RBL.

Rob McEwen
PowerView Systems
[EMAIL PROTECTED]