Okay, I whacked together a Perl script to do some very rudimentary parsing
of the SMTP relay logging I hacked into amavisd-new.

I filtered out any host that delivered fewer than 2 emails, or had an average
spam level of less than 5. This is what I ended up with for this afternoon's
mail (since I got logging working a few hours ago). First column is # of
mails, second is average spam level, third is IP, and fourth is reverse DNS
on that IP.

3       9       66.150.179.24   mailout24.specialtysquare.com
3       14      69.60.15.54     host.gifts-items.com
3       23      24.188.126.239  ool-18bc7eef.dyn.optonline.net
3       10      65.218.108.66   epart18.emailpartners.com
3       36      200.164.252.86  PE252086.user.veloxzone.com.br
3       9       66.55.165.48    fg5.lvesfrdt.com
3       11      66.28.247.52    host52.designated-ns.com
3       10      66.28.247.57    host57.designated-ns.com
3       10      209.216.99.120  smtp201.maildeliverer.com
3       10      209.216.99.121  smtp202.maildeliverer.com
3       20      66.124.209.4    adsl-66-124-209-4.dsl.lsan03.pacbell.net
3       18      216.109.87.26   rb26ex.com
3       44      24.132.138.225  node18ae1.a2000.nl
3       17      146.82.38.127   mx17.sparklingbeavers.com
3       12      64.253.204.140  mail.bargaintribune.com
3       13      64.253.204.225  mail9.sendmedeals.com
3       10      128.105.6.22    norm.cs.wisc.edu
3       10      63.251.59.227   ina227.etracks.com
3       23      62.163.35.100   a35100.upc-a.chello.nl
3       14      221.124.31.57   221.124.31.57
3       10      66.55.165.3     ya2.lifsrs.com
3       9       64.62.204.196   smf196.showmefreestuff.com
3       13      68.171.163.242  ny-glennsfallscadent1-bdg11ad-242.bur.adelphia.net
3       6       64.157.40.132   epsilon.levelogic.com
3       24      168.226.62.143  168-226-62-143.speedy.com.ar
3       14      69.60.15.52     host.gifts-items.com
4       9       66.55.165.46    hg3.lifesrvtdt.com
4       17      63.212.169.107  mail.meta-deals.com
4       17      63.212.169.106  mail.meta-deals.com
4       27      200.204.87.48   200-204-87-48.dsl.telesp.net.br
5       17      69.56.32.32     iron2.bigdls.com
5       8       66.111.233.106  craftg.crispcraft.com
5       10      207.134.164.84  hos3.hos12m.com
5       10      209.216.99.125  smtp206.maildeliverer.com
5       16      63.212.169.108  mail.meta-deals.com
5       17      63.212.169.110  mail.meta-deals.com
5       21      63.212.169.91   mail.outstandingvalues.com
6       26      194.217.109.12  194.217.109.12
6       14      66.111.231.76   compulsive.compulsivebuys.com
6       20      63.212.169.92   mail.outstandingvalues.com
6       25      218.38.246.41   218.38.246.41
7       11      66.55.167.159   o5.dyforyifo.com
7       20      63.212.169.93   mail.outstandingvalues.com
7       6       12.148.246.14   mailhost2.artesyn.com
7       9       66.55.167.160   o6.dlyfryri.com
8       17      69.56.32.31     iron1.bigdls.com
10      29      67.92.242.140   petaybee.dal.net
11      12      206.162.135.158 sp13.simonpublishing.com
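
For anyone who wants to reproduce a table like the one above, here is a rough
sketch of the count/average filter. It assumes the log has already been boiled
down to one "IP score" pair per message, which is a made-up format (the real
amavisd-new lines will need their own parsing), and summarize is just a
hypothetical name. The addresses are documentation-range placeholders, and the
reverse DNS column is left out.

```perl
use strict;
use warnings;

# Hypothetical input: one "IP score" pair per delivered message.
# Returns [count, average, ip] rows for hosts that pass both thresholds.
sub summarize {
    my ($min_mails, $min_avg, @lines) = @_;
    my (%count, %total);
    for my $line (@lines) {
        my ($ip, $score) = split ' ', $line;
        next unless defined $score;      # skip malformed lines
        $count{$ip}++;
        $total{$ip} += $score;
    }
    my @rows;
    # Sort by number of mails, then by IP string, like the table above.
    for my $ip (sort { $count{$a} <=> $count{$b} || $a cmp $b } keys %count) {
        my $avg = $total{$ip} / $count{$ip};
        next if $count{$ip} < $min_mails or $avg < $min_avg;
        push @rows, [ $count{$ip}, $avg, $ip ];
    }
    return @rows;
}

# Placeholder sample data, not taken from any real log.
my @rows = summarize(2, 5,
    '192.0.2.1 17', '192.0.2.1 19',     # 2 mails, average 18: kept
    '192.0.2.2 30',                     # only 1 mail: dropped
    '192.0.2.3 1',  '192.0.2.3 3');     # average 2: dropped
printf "%d\t%d\t%s\n", @$_ for @rows;
```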

The only one that stands out as a false positive is mailhost2.artesyn.com.
That's our parent company forwarding mail to us for a few users. The
forwarding is not really used and mostly just gathers spam, so I'm going to
see about turning it off. Failing that, I can add the host to my trusted
networks list within SA and it should then stop showing up.

Looks somewhat promising... at first glance, nearly all of these hosts
look to be likely spammers.

(Actually, looking at it again, norm.cs.wisc.edu is a legitimate host -
I'm getting spam through there due to an ancient account I have. Again, I'd
probably have to add it to the trusted hosts list. Kind of a bother.)

The next question is what algorithm to use to decide when a spam IP should
be shut off, and when it should be allowed to come back.

I was thinking of keeping a two-dimensional hash in Perl. The first key
would be the IP, and the second would be a date (probably UNIX-style
seconds since 1970 or whatever). The value of the hash would be the spam
level for that particular IP at that time.

So, if iron1.bigdls.com sent 3 emails, one at time 1111 of spam level -5,
one at time 1115 of spam level 27, and one at time 1136 of spam level 10,
the hash would be:

$spamhash{'69.56.32.31'}{1111} = -5;
$spamhash{'69.56.32.31'}{1115} = 27;
$spamhash{'69.56.32.31'}{1136} = 10;
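
To make that concrete, here is a minimal sketch of filling the hash and
averaging one IP's levels. record_score and average_score are hypothetical
names for illustration. Note the quoted IP keys: an unquoted 69.56.32.31 is
parsed by Perl as a v-string rather than as the dotted address.

```perl
use strict;
use warnings;

my %spamhash;   # $spamhash{$ip}{$epoch_seconds} = $spam_level

# Store one message's spam level under its IP and arrival time.
sub record_score {
    my ($ip, $time, $level) = @_;
    $spamhash{$ip}{$time} = $level;
}

# Average spam level for an IP, or undef if nothing is recorded for it.
sub average_score {
    my ($ip) = @_;
    my @levels = values %{ $spamhash{$ip} || {} };   # avoid autovivifying
    return unless @levels;
    my $sum = 0;
    $sum += $_ for @levels;
    return $sum / @levels;
}

record_score('69.56.32.31', 1111, -5);
record_score('69.56.32.31', 1115, 27);
record_score('69.56.32.31', 1136, 10);

printf "%.2f\n", average_score('69.56.32.31');   # (-5 + 27 + 10) / 3
```

One thing this layout implies: two messages from the same IP in the same
second would share a key, so the later one overwrites the earlier. If that
matters, the value could hold a list of levels instead of a single one.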

Probably the simplest thing to do is just have the Perl daemon continually
tail the messages file. It would look for these log entries, and populate
this hash as necessary. Then, every X (100?) log entries scanned, provided
at least Y (5 minutes?) had passed since the last run, it would run two
subroutines:

- The expire routine would expire any entries in the hashes older than a
certain age.

- The "action" routine would run through the hashes and compute the average
spam levels for each IP, then take action as necessary (trigger the ban via
a filter on the external router, or possibly update the banned IPs on the
external mail gateway box).
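
A rough sketch of those two routines plus the "every X entries, at least Y
seconds" check might look like the following. All the thresholds are guesses,
the sub names are hypothetical, and the minimum-mail count is an extra
assumption borrowed from the filter used for the table. The actual ban step
(router filter or gateway update) is left out; this only returns the
candidate IPs. The sample addresses are placeholders.

```perl
use strict;
use warnings;

my %spamhash;                                          # {$ip}{$epoch} = $level
my ($MAX_AGE, $BAN_AVG, $MIN_MAILS) = (3600, 10, 3);   # assumed values
my ($EVERY_N, $MIN_GAP) = (100, 300);                  # X entries, Y seconds

# Drop entries older than $MAX_AGE seconds; forget IPs with nothing left.
sub expire_entries {
    my ($now) = @_;
    for my $ip (keys %spamhash) {
        for my $t (keys %{ $spamhash{$ip} }) {
            delete $spamhash{$ip}{$t} if $now - $t > $MAX_AGE;
        }
        delete $spamhash{$ip} unless %{ $spamhash{$ip} };
    }
}

# Return the IPs whose average level over the remaining entries is too high.
sub ips_to_ban {
    my @ban;
    for my $ip (sort keys %spamhash) {
        my @levels = values %{ $spamhash{$ip} };
        next if @levels < $MIN_MAILS;        # too few mails to judge
        my $sum = 0;
        $sum += $_ for @levels;
        push @ban, $ip if $sum / @levels >= $BAN_AVG;
    }
    return @ban;
}

# True once enough entries have been seen and enough time has passed.
sub maintenance_due {
    my ($entries_seen, $last_run, $now) = @_;
    return $entries_seen >= $EVERY_N && $now - $last_run >= $MIN_GAP;
}

# Placeholder data: one recent offender, one host with only stale entries.
$spamhash{'192.0.2.1'} = { 1000 => 12, 1010 => 30, 1020 => 15 };
$spamhash{'192.0.2.2'} = { 10 => 40, 20 => 40, 30 => 40 };
expire_entries(4000);
print "ban: $_\n" for ips_to_ban();
```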

I guess I need to sort out what good criteria would be for action. Would
average spam level be an adequate way to determine a "bad" IP? Obviously a
level of 5 would be too likely to get false positives, but maybe 10?

Then, how long before the IP is allowed again? Should bans just "naturally"
expire out of the banned list by letting their hash entries expire?

I'm sort of thinking out loud...

johnS

