Joachim Schipper wrote:
Your worries about losing proxies are correct; it looks like you have
that problem mostly covered. I'm not sure it would help much with
bandwidth hogs, though - I don't have any numbers on what programs are
most often used, but something like wget certainly does respect
robots.txt.
Actually, it does. There are many attacks going on right now, as you
know, but if you put them into categories, you have tons of variations
on the user/pass value sanity-check attack, as you can now see on
SecurityFocus. Over a dozen have been released in the last three days,
and even more by now, I am sure. If you look into the archive, I saw it
start a few weeks ago, but that's irrelevant anyway.

The other is a virus that spreads the same way, or a similar one. In
that case they actually request big content page(s) on your site. By big
content I don't mean images, etc., but text. The reason is that their
virus does not process the content, and would need to be bigger to do
so; this way it stays small, and the web server sees the request as
legitimate and replies. But if you have pages with, say, 0.5 MB of text
generated from a database back end, then they hope to bring your server
down, bring your SQL back end down, and, failing that, make you waste as
much bandwidth as possible.

I noticed it first from the HUGE increase in GB transferred each day.
Just to give you a picture of the effect: so far I have logged over
300,000 sources of this virus running this type of attack on my servers,
and they pull a series of pages that are pretty big in text content,
between 150 KB minimum and 750 KB or so, excluding any other content.
Each offending source pulls that content many times a day. Just think
about it.
So, just for fun, take a conservative example: one request per hour from
each source, fetching an average page of 500 KB. The wasted transfer for
that single day is:
24 hours * 500,000 bytes * 300,000 sources =
3,600,000,000,000 bytes of wasted bandwidth per day.
Now, if you assume the load is perfectly constant, with no peaks, you
need to push this amount of data in 24 hours, so you would need:
3,600,000,000,000 bytes * 8 bits/byte = 28,800,000,000,000 bits
28,800,000,000,000 / (60 seconds * 60 minutes * 24 hours) = 333,333,333
bits/sec of capacity, just for this wasted traffic!
And this is based on only one query per hour! You get the picture of the
size of the problem. (:>
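The arithmetic above is easy to double-check with a few lines of Python;
the 300,000 sources, 500 KB page, and one-request-per-hour rate are the
figures from this thread, everything else is plain unit conversion:

```python
# Back-of-the-envelope cost of the attack described above.
SOURCES = 300_000        # distinct infected hosts observed in the logs
PAGE_BYTES = 500_000     # average size of a targeted text-heavy page
REQUESTS_PER_DAY = 24    # one request per hour per source

wasted_bytes_per_day = SOURCES * PAGE_BYTES * REQUESTS_PER_DAY
print(f"{wasted_bytes_per_day:,} bytes/day")   # 3,600,000,000,000

seconds_per_day = 24 * 60 * 60
bits_per_second = wasted_bytes_per_day * 8 / seconds_per_day
print(f"{bits_per_second:,.0f} bits/sec")      # 333,333,333
```

So even at this very low request rate, the attack alone consumes a
sustained ~333 Mbit/s link.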
So, what I put into place doesn't stop the attack, since you can't stop
the sources from coming in, but you need to separate the good from the
bad, and my reply to a bad one happens to be only 5 bytes, plus the log
entry anyway.
All this ignores protocol overhead, etc.
So, yes it's a BIG help for "bandwidth hogs"!
And don't forget that's per destination under attack! (:>
So, yes, it can become totally unmanageable if not stopped from the
start, and at a big scale.
3. DDoS GET attacks & Bandwidth suckers defense. Multiple approaches.
3.1 Good users supply data check.
So far, most/all of the variations of attacks on web sites involve
scripts trying to inject themselves into your servers. Well, you need to
do sanity checks in your code. Nothing can really protect you from that
if you don't check what you expect to receive from user input. So, I
have nothing for that; no idea how, anyway, other than maybe limiting
the size of the arguments a GET can send, but even that is a bad idea, I
think.
This is not applicable to DDoS, really - though you are otherwise right,
of course.
What I provided is a very simple way, not to remove the problem, but at
a minimum to stop servers from getting infected by all of the latest
series of SecurityFocus variations, and it also has the benefit of
pointing you to any possible infected source your server itself might
have installed on it as well. Very simple, really.
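As a rough sketch of the kind of input sanity check meant here (the
length limit and character whitelist below are my own illustrative
choices, not values from this thread, and as noted above a size limit
alone is debatable):

```python
import re

MAX_QUERY_LEN = 256                        # assumed limit; tune per application
SAFE_QUERY = re.compile(r"^[\w.+=&%-]*$")  # conservative character whitelist

def query_looks_sane(query_string: str) -> bool:
    """Reject oversized or oddly-encoded GET arguments before the app sees them."""
    if len(query_string) > MAX_QUERY_LEN:
        return False
    return bool(SAFE_QUERY.match(query_string))
```

For example, `query_looks_sane("user=bob&page=3")` passes, while a query
string carrying `<script>` markup or thousands of characters is
rejected. This is only a coarse first filter; real validation still has
to happen per-parameter in the application.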
3.2 Graylisting idea via 302 temporary return code.
This could be effective, indeed - though I am not sure it would block
many attackers.
It works like a charm in real life so far; see the numbers above for
results. It's been used successfully for a few weeks now with no bad
side effects yet, just HUGE benefits! And the servers still don't break
a sweat!
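The graylisting idea can be sketched in a few lines with Python's
standard-library HTTP server (a toy model, not the actual production
setup from this thread; the in-memory `seen` set and the plain-text
response are my own illustrative choices):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

seen = set()  # IPs that have already been answered with one redirect

class GraylistHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        ip = self.client_address[0]
        if ip not in seen:
            # First contact: answer with a temporary redirect back to the
            # same URL. Real browsers follow it; most attack scripts do
            # not, so they never cost more than this tiny reply.
            seen.add(ip)
            self.send_response(302)
            self.send_header("Location", self.path)
            self.end_headers()
            return
        # The client followed the redirect: treat it as legitimate.
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"welcome back\n")

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet
```

A browser retrying through the 302 never notices the detour, while a
dumb GET flooder only ever receives the near-empty redirect response.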
3.4 What about the compromised user computer itself, or proxy server.
Faking those headers is easily done, though; ideally, you'd want to
cross-check p0f and the headers. I'm not entirely sure it would hurt an
attacker more than it hurts you, though, and privileged code is always
scary, and doubly so when close to essentially untrusted web apps.
True, for sure. But you still need a way to tell the difference between
good and bad traffic passing through a proxy, or you lose too much.
Here, obviously, I rely on the fact that so far, yes, these headers are
fake, and trivial to fake as well, but none of the attacks so far
generate random headers. In which case it would be useless, obviously.
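Since the attack tools send the same fixed headers every time, an
exact-match fingerprint blocklist is enough; a minimal sketch (the
header values below are placeholders, not real attack signatures):

```python
# Known-bad header fingerprints (illustrative values only).
# The observation from the thread: current attack tools send the *same*
# fake headers on every request, so an exact match is enough to catch
# them; randomized headers would defeat this check.
BAD_FINGERPRINTS = {
    ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)", "", "en"),
}

def is_known_bad(headers: dict) -> bool:
    """Compare a request's header tuple against logged attack fingerprints."""
    fingerprint = (
        headers.get("User-Agent", ""),
        headers.get("Accept-Encoding", ""),
        headers.get("Accept-Language", ""),
    )
    return fingerprint in BAD_FINGERPRINTS
```

Cross-checking against p0f, as suggested above, would catch the case
where the claimed User-Agent and the TCP fingerprint disagree, but that
is a separate, more privileged layer.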
4. What about more intelligent attacks.
You *should* consider some unconventional browsers before going too far
down this lane, though. Notably, your 1x1 image will show up quite
readably in text-mode browsers; be sure to, at least, add a 'don't
click' alt attribute.
I know about the text-mode ones and tested with Lynx to see, but I did
forget that I should add the "Do NOT CLICK HERE... Bot trap WARNING"
text, so I will do that.
Also, neither text-based browsers nor most legitimate bots will request
images.
And that was the point: allow legitimate bots, if you so choose,
obviously, and ban the bad ones!
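The trap logic described above can be sketched as follows (the
`/bot-trap` path, the ban set, and the handler shape are all my own
illustrative choices; the real setup would also list the trap path under
`Disallow:` in robots.txt so that well-behaved bots never fetch it):

```python
banned = set()  # IPs caught fetching the trap

# Hidden 1x1-pixel link: browsers and robots.txt-respecting bots never
# fetch it, text browsers show the alt text as an explicit warning, and
# bad bots that blindly follow links walk straight into it.
TRAP_URL = "/bot-trap"  # illustrative path, also disallowed in robots.txt
TRAP_HTML = (
    f'<a href="{TRAP_URL}"><img src="{TRAP_URL}.gif" width="1" height="1" '
    f'alt="Do NOT CLICK HERE... Bot trap WARNING"></a>'
)

def handle_request(ip: str, path: str) -> int:
    """Return an HTTP status code for a request, banning trap visitors."""
    if path.startswith(TRAP_URL):
        banned.add(ip)  # anything fetching the trap is treated as a bad bot
        return 403
    if ip in banned:
        return 403
    return 200
```

Once a source trips the trap, every later request from it can be
answered with the tiny refusal instead of the big database-backed pages
the attack is after.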
Best,
Daniel