Re: new plugin: naughty & nice

Jared Johnson Thu, 24 May 2012 15:07:01 -0700

We do something not exactly similar to this, but which might be somewhat
instructive.  We modified the banner delay plugin so that the amount of
delay we apply to a server depends on its previous behavior with us.  The
default delay is 15 seconds; if you send a message that we reject as spam
or virus, we increase the delay by 5 seconds.  If you send us a message
that we deliver as clean (whether that's because of spam scanners,
whitelists, or whatever), we reduce the delay by 5 seconds.  We attached
other values to some other rejections; for instance, if you try to send
mail to an invalid user, the delay is increased by 1 or 2 seconds.  We do
not increase the delay to more than around 300 seconds, so that the most
RFC-compliant MTAs still have a chance to deliver, even if they manage to
get themselves horribly delayed.
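
To make the arithmetic concrete, here is a minimal sketch in Python of the
adjustments described above.  It is illustrative only, not the actual plugin
code: the event names, the ADJUSTMENTS table, and next_delay() are made up for
this example, the invalid-recipient value of 2 is just one of the "1 or 2"
mentioned, and the floor of zero is an assumption.

```python
DEFAULT_DELAY = 15   # seconds, for clients with no history
MAX_DELAY = 300      # approximate ceiling from the description above
MIN_DELAY = 0        # assumed floor: a delay of zero is the best score

# Hypothetical event names mapped to delay changes in seconds.
ADJUSTMENTS = {
    "spam_or_virus": +5,
    "clean_delivery": -5,
    "invalid_recipient": +2,  # the text says "1 or 2"; 2 is picked arbitrarily
}

def next_delay(current, event):
    """Return the banner delay after applying one event, clamped to the limits.

    Pass current=None for a client we have no record of.
    """
    if current is None:
        current = DEFAULT_DELAY
    return max(MIN_DELAY, min(MAX_DELAY, current + ADJUSTMENTS[event]))
```

With these numbers, a new client that sends one rejected message goes from 15
to 20 seconds, and three clean deliveries in a row bring a new client down to
zero.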


What we've found is that the main strength of the banner delay for us is
not that it exposes early talkers (it exposes some), but that it
encourages early disconnectors :)  Some spambots give up after just 5
seconds of delay; the ones that wait out our 15 second delay will often try
to send a few messages and get rejected, then once we make our way up to
20 or 30 seconds they start giving up on us altogether.

So after we implemented this we noticed that what we had on our hands was
a rough score card for all connecting IPs indicating how well we know
them and how "naughty" they have been.  From this we've been able to
further reduce the number of transactions for which scanning is required;
for instance, if a client gets the default delay (i.e. for all intents and
purposes we haven't heard from them), we only let them have a single
concurrent connection with us.  We can also be more lenient with clients
that have won and maintained a banner delay of zero by sending us some
good mail and not screwing it up with bad mail.  If your 'naughty' plugin
is pretty successful, perhaps we could benefit from adding a 'penalty box'
like this based on the delay -- clients that have earned, say, a 30 or 60
second delay by trying to send us too much spam and not enough ham get
sent to the penalty box and are therefore no longer able to increase our
concurrency.  I had an idea similar to this a while ago (we haven't got
around to implementing it), where we *randomly* disconnect at the outset a
client that's earned an increased banner delay, and the frequency of
disconnection depends on how high a banner delay they've earned.  This
would allow us to throw away connections while, again, still giving good
clients a chance to eventually connect to us and send legit mail, thereby
decreasing their delay.
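
Since the random-disconnect idea has not been implemented, the following is
only one way it might look, sketched in Python.  The linear scaling, the
max_probability cap of 0.9, and both function names are assumptions chosen for
illustration; the cap stays below 1 so that even the worst-scored client
occasionally gets through and can earn its delay back down.

```python
import random

DEFAULT_DELAY = 15   # seconds; clients at or below this are never dropped
MAX_DELAY = 300      # approximate ceiling on the banner delay

def disconnect_probability(delay, max_probability=0.9):
    """Scale linearly from 0 at the default delay to max_probability at the cap."""
    if delay <= DEFAULT_DELAY:
        return 0.0
    span = MAX_DELAY - DEFAULT_DELAY
    fraction = (min(delay, MAX_DELAY) - DEFAULT_DELAY) / span
    return fraction * max_probability

def should_disconnect(delay, rng=random):
    """Decide at connect time whether to drop this client immediately."""
    return rng.random() < disconnect_probability(delay)
```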

Others may or may not find it worthwhile to implement this sort of mucking
with banner delays (I'll list a couple of problems with it below); but
simply scoring clients might be an answer to the problem of legit clients
being rejected.  Rather than putting a client in the penalty box at the
first violation, give them a few chances, and let clean mail put them back
into good graces.

For storage of these scores we have been using a Postgres table with three
fields: IP, delay (aka score), and timestamp, with entries expired after a
client has not talked to us at all for 24 hours.  We've talked about using
Cache::FastMmap for this because there is not really any benefit to
keeping it in the database (we don't even try to synchronize it across
clusters; if your score is effectively reset because you started talking
to a new node then so be it).  But honestly, we haven't seen any big
bottlenecks with the DB, so we haven't bothered doing the coding and
testing work to move it to obviously more efficient mechanisms.
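
For illustration, the same three-field record (IP, delay, timestamp) with a
24-hour expiry can be sketched in Python with a plain in-process dictionary.
This stands in for neither Postgres nor a shared-memory cache; the ScoreStore
class and its methods are hypothetical, and refreshing the timestamp on every
lookup is an assumption about what "has not talked to us" means.

```python
import time

EXPIRY_SECONDS = 24 * 60 * 60   # forget clients that have been silent for 24 hours
DEFAULT_DELAY = 15              # seconds, for unknown or expired clients

class ScoreStore:
    """Maps an IP to (delay, last_seen), expiring idle entries on access."""

    def __init__(self):
        self._rows = {}

    def get(self, ip, now=None):
        """Return the stored delay for ip, or the default if unknown or expired."""
        now = time.time() if now is None else now
        row = self._rows.get(ip)
        if row is None or now - row[1] > EXPIRY_SECONDS:
            self._rows.pop(ip, None)
            return DEFAULT_DELAY
        # A lookup means the client just connected, so refresh its timestamp.
        self._rows[ip] = (row[0], now)
        return row[0]

    def put(self, ip, delay, now=None):
        """Record a new delay for ip and mark it as seen now."""
        now = time.time() if now is None else now
        self._rows[ip] = (delay, now)
```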

For anyone considering the delay method itself, one big weakness with this
method is that since it does not "get rid of naughty connections as
quickly as possible", unless you're using the async daemon (we aren't), it
can wind up taking up a lot of resources.  Obviously it will still save
plenty of CPU on account of spammers giving up instead of giving us
something to scan; but without async, the sharp increase in concurrency
will hurt memory usage.  One thing we recently added which I think will
really help in this area is that we do our RBL lookup before starting the
delay, and if a server is RBL'd we skip the delay and go right to
disconnecting them.  Since so many servers are RBL'd this should help
concurrency a lot.  If the server is not RBL'd, we subtract the amount of
time it took for us to do the RBL lookup (and anything else we might have
done first) from their banner delay -- after all, we've already been
delaying our banner for that amount of time.
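
The ordering described above boils down to a small calculation, sketched here
in Python.  The function name and its parameters are hypothetical; the caller
is assumed to have already done the RBL lookup and measured how long the
connection has been open.

```python
def remaining_delay(earned_delay, elapsed, rbl_listed):
    """Return seconds still to wait before the banner, or None to disconnect now.

    earned_delay: the client's current banner delay in seconds
    elapsed:      seconds already spent on the RBL lookup and other early checks
    rbl_listed:   True if the RBL lookup found the client listed
    """
    if rbl_listed:
        return None  # skip the delay entirely and drop the connection
    # Time spent on lookups already counts as delay from the client's side.
    return max(0.0, earned_delay - elapsed)
```

So a client with the default 15 second delay whose RBL lookup took 2.5 seconds
waits another 12.5 seconds, and one whose lookups took longer than its delay
gets the banner right away.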

Aside from that, simply doing a banner delay has caused problems for a few
legit servers which give up before they even get to our default delay. 
We've pretty much completely gotten around this by excluding all private
IPs, and clients that are given relay permission by our IP-based rules,
from any banner delay whatsoever.
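
A sketch of that exemption check in Python, using the standard ipaddress
module.  The function name and the relay_allowed flag are hypothetical (the
flag stands for whatever the IP-based relay rules decided), and note that
Python's is_private covers more than the RFC 1918 ranges: loopback,
link-local, and a few other reserved blocks also count.

```python
import ipaddress

def exempt_from_delay(ip, relay_allowed=False):
    """Return True if this client should get no banner delay at all."""
    if relay_allowed:
        return True
    # is_private is broader than RFC 1918; it also matches loopback,
    # link-local, and some other reserved ranges.
    return ipaddress.ip_address(ip).is_private
```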

HTH!

-Jared

> I have written a plugin that is currently named naughty.
>
> The POD has a good description of what it does and how it works. You can
> read the POD here:
>
> https://www.tnpi.net/internet/mail/naughty.html
>
> The plugin is very effective at blocking spam and it has reduced my CPU
> load enough to be measurable on munin graphs (which aren't exactly
> granular). Read the POD at the URL above to see how and why.
>
> The plugin has one teensy tiny problem. Because it penalizes servers that
> send spam, it occasionally penalizes a "mostly good" server.  In a week of
> running, it has done this exactly twice, and the two servers it nabbed are
> mx.develooper.com and mx2.freebsd.org.  Both are servers that send lots of
> ham, and occasionally, some spam.
>
> I see a couple options.
>
> a)  Whitelist the "mostly good" servers that get penalized. While that
> would be easy, it requires manual effort on the part of the sysadmin, and
> users will likely lose valid mail.
>
> b) Keep track of senders who send us ham. Then we have lots of options. We
> can allow mail from "mostly good" senders, randomly defer their
> connections with a "We like you, but you're sending spam" error, or any
> number of other ways to deal with them.
>
> c) Other?
>
> The data store for naughty is currently a database of key/value pairs. The
> key is the remote IP (in integer form) and the value is a timestamp.
> That's all that's needed for naughty to function.
>
> To track naughty and nice senders, I'm imagining the DB would need
> counters to track the number of ham and spam messages received from
> that server. Or even a single signed integer, where anything above zero is
> bad and below zero is good.
>
> The only other thing needed is a reliable way to detect that incoming mail
> is ham.  For my server, I'll use 'dspam reject agree', so that if SA &
> dspam agree that it's ham, then the naughty plugin increments the nice
> counter.  When new connections arrive, they would be blocked based on their
> karma (the weighting of naughty -vs- nice emails they send).
>
> I'm fishing for other ideas, better ideas, or experience (we tried that,
> and these are the results) you can share.
>
> Thanks,
> Matt
>
> `````````````````````````````````````````````````````````````````````````
>   Matt Simerson                   http://matt.simerson.net/
>   Systems Engineer            http://www.tnpi.net/
>
>   Mail::Toaster  - http://mail-toaster.org/
>   NicTool          - http://www.nictool.com/
> `````````````````````````````````````````````````````````````````````````
