On Fri, Feb 22, 2008 at 7:55 AM, Marc Perkel <[EMAIL PROTECTED]> wrote:
>
>
>
>  Aaron Wolfe wrote:
>
>  On Thu, Feb 21, 2008 at 11:47 PM, Marc Perkel <[EMAIL PROTECTED]> wrote:
>
>
>  Steve Radich wrote:
>  > Sorry; apparently I was unclear.
>  >
>  > MX records I'm saying as follows:
>  > 100 - Real
>  > 200 - Real perhaps, as many "real" as you want
>  > 300 - Bogus - one that blocks port 25 with tcp reset for example
>  > 400 - accept port, logs ip -> blacklist (not to be scored
>  > aggressively at all) with a 421/retry.
>  >
>  > If a whole bunch of places are seeing the same smtp server hitting this
>  > 400 level MX then I'm saying that seems like a useful thing to be
>  > included in a blacklist using a low score in sa.
>  >
>  > The point was to offer the 400 level mx as a free service to log the ips
>  > quickly for those that don't want to set up the server themselves.
>  >
>  > In theory the 400 level MX wouldn't be used by "real" smtp very often,
>  > hence it's likely a spammer and therefore the IP could be auto
>  > blacklisted. Realize I'm NOT proposing we block on this, just score
>  > based on this list.
>  >
>  > Steve Radich - http://www.aspdeveloper.net /
>  > http://www.virtualserverfaq.com
>  > BitShop, Inc. - Development, Training, Hosting, Troubleshooting -
>  > http://www.bitshop.com
>  >
>  >
>
>  I'm actually doing something like that. What I do is track hits on the
>  highest MX that has not hit the lowest numbered MX, then because I use
>  Exim I can track which IP addresses don't send the QUIT command to close
>
>  I am thinking about playing around with the same type of thing here..
> Is this any different from looking for "lost connection after DATA" or
> "lost connection after RCPT" errors in a postfix server's logs? Not
> sure why you can detect this because you run Exim specifically. Or
> am I missing something?
>
>  Exim has ACLs that let you do things when the QUIT is received or not
> received. Exim probably has 100x the commands that Postfix does and you can
> do a lot of tricky stuff in Exim that no other MTA has.
>
>
>
>
>  the connection. This combination creates a highly reliable blacklist and
>  I'm currently tracking about 1.1 million virus infected spambots that
>  have tried to spam me in the last 4 days.
>
>  It's my hostkarma list.
>
>
>
>  Sounds interesting.. do you block based on this list or just use it
> for scoring in SA or something like that? What is the false positive
> rate?
>
>
>
>  Yes, I do block based on this list. There are some false positives but it's
> rare. I have a way for people to remove themselves from the list. There are
> other criteria that we blacklist on as well that makes for a few FP. But
> it's extremely low. I've put a lot of effort into getting it right.
>
>


Ok...  I have 24 hours of data to play with.  At first the results
seemed promising: I found over 300,000 hosts that had connected only to
my highest MX and did not issue a QUIT.  But of that group:

96.0% are listed on Spamhaus (zen; I did not break it down into the
individual lists)
2.3% are *not* listed on Spamhaus but are listed on Rob McEwen's
ivmSIP list (note that this is over 50% of the remaining 4% of hosts,
about 10% higher than this list's hit rate with my normal mail flow).
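
For anyone who wants to check the arithmetic, here is a minimal Python
sketch. It assumes both percentages are fractions of the whole candidate
group; the function name `breakdown` is just something I made up for
illustration.

```python
def breakdown(spamhaus_pct, ivmsip_pct):
    """Split a candidate group given two percentages of the WHOLE group.

    spamhaus_pct: hosts listed on Spamhaus zen
    ivmsip_pct:   hosts not on Spamhaus but listed on ivmSIP
    Returns (not_on_spamhaus, ivmsip_share_of_remainder, unlisted),
    all expressed as percentages.
    """
    not_on_spamhaus = 100.0 - spamhaus_pct
    # Share of the non-Spamhaus remainder that ivmSIP catches.
    ivmsip_share = ivmsip_pct / not_on_spamhaus * 100.0
    # Hosts on neither list.
    unlisted = not_on_spamhaus - ivmsip_pct
    return not_on_spamhaus, ivmsip_share, unlisted


remainder, share, unlisted = breakdown(96.0, 2.3)
print(f"not on Spamhaus: {remainder:.1f}%")
print(f"ivmSIP share of that remainder: {share:.1f}%")
print(f"on neither list: {unlisted:.1f}%")
```

With the figures above, ivmSIP catches a bit more than half of what
Spamhaus misses, and 1.7% of the group is on neither list.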

I don't have the zone files for any other RBLs and I didn't want to
send out 300k queries via DNS.  But I think the picture is fairly
clear..  a vast majority of the hosts hitting the fake high MX will be
hosts already listed in major RBLs.
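
In case it helps anyone repeat the test, this is roughly the selection
and lookup logic as a Python sketch, not my actual scripts. It assumes
the MTA logs have already been reduced to (ip, mx, sent_quit) tuples and
that the RBL data is available locally as a set of IP strings; the
function names and the sample addresses are hypothetical. I treat a host
as "did not issue a QUIT" if it never sent one on any connection, which
is one possible reading of the criterion.

```python
from collections import defaultdict


def high_mx_only_no_quit(connections, high_mx):
    """Return IPs that connected only to `high_mx` and never sent QUIT.

    connections: iterable of (ip, mx_name, sent_quit) tuples
                 (hypothetical format, pre-parsed from the MTA logs).
    """
    mx_seen = defaultdict(set)   # ip -> set of MX hosts it connected to
    quit_seen = set()            # ips that sent QUIT at least once
    for ip, mx, sent_quit in connections:
        mx_seen[ip].add(mx)
        if sent_quit:
            quit_seen.add(ip)
    return {
        ip for ip, mxs in mx_seen.items()
        if mxs == {high_mx} and ip not in quit_seen
    }


def split_by_listing(candidates, listed):
    """Split candidates into (already listed, not listed) using local data."""
    return candidates & listed, candidates - listed


# Made-up sample data using documentation addresses (192.0.2.0/24).
sample = [
    ("192.0.2.1", "mx4", False),  # high MX only, no QUIT -> candidate
    ("192.0.2.2", "mx4", True),   # sent QUIT -> excluded
    ("192.0.2.3", "mx1", False),  # also hit the real MX -> excluded
    ("192.0.2.3", "mx4", False),
    ("192.0.2.4", "mx4", False),  # high MX only, no QUIT -> candidate
]
candidates = high_mx_only_no_quit(sample, "mx4")
hits, misses = split_by_listing(candidates, {"192.0.2.1"})
print(sorted(hits), sorted(misses))
```

Working from local zone data like this avoids sending one DNS query per
host, which is why I only checked the lists I have files for.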

I'm sure my quick test is not perfect.  The remaining 1.7% of hosts
may include some amount of non spam sources (very small if any I would
guess).  Also, I ran the RBL checks all at once at the end of the
cycle, so some of the hits were 24 hours old.  Some amount of the
remainder were probably on the RBLs at the time they hit my server and
were since removed.

I will continue to look into this to see if today was a typical day.
Based on these numbers, though... is this a promising way to reduce
server load/blacklist more hosts... or is this pointless?  I'm
interested in what people think since the data is so easy to gather
and use, if it makes any sense to use it.

-Aaron
