Tom Evans wrote:
On Fri, May 3, 2013 at 10:54 AM, André Warnier <a...@ice-sa.com> wrote:
So here is a challenge for the Apache devs: describe how a bot-writer could update his software to avoid the consequences of the scheme that I am advocating, without reducing the effectiveness of their URL-scanning.

This has been explained several times. The bot makes requests
asynchronously with a short select() timeout. If it doesn't have a
response from one of its current requests due to artificial delays, it
makes an additional request, not necessarily to the same server.

The fact that a single response takes longer to arrive is not
relevant; the bot can overall process roughly as many requests in the
same period as without a delay. The amount of concurrency required
would be proportional to the artificial delay and the network RTT.

There is a little overhead due to the extra concurrency, but not much:
you are not processing any more requests in a given time period,
nor using more network traffic than without concurrency; the only real
cost is more simultaneous network connections, most of which sit idle
waiting for the artificial delay to expire.

I would not be surprised if bots already behave like this, as it is a
useful way of increasing scanning rate if you have servers that are
slow to respond already, or have high network RTT.
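
To make that pattern concrete, here is a minimal Python sketch of the approach described above: non-blocking sockets, a fixed pool of in-flight requests, and a short select() timeout so that a delayed response never blocks the scanner from issuing new requests elsewhere. The addresses, paths, pool size and timeout below are placeholders, not anything from this thread.

    # Minimal sketch of an asynchronous request pool.  All hosts, paths and
    # constants are placeholders; this only illustrates the select()-driven
    # pattern described above.
    import selectors
    import socket

    TARGETS = [("192.0.2.%d" % i, 80) for i in range(1, 101)]   # placeholder IPs
    PATHS = ["/example-a/", "/example-b/"]                       # placeholder URLs
    MAX_IN_FLIGHT = 100
    SELECT_TIMEOUT = 0.05   # short timeout: a delayed server never blocks the loop

    def start_request(sel, host, port, path):
        # Open a non-blocking connection and register it for writability.
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        sock.setblocking(False)
        sock.connect_ex((host, port))            # returns immediately
        req = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
        sel.register(sock, selectors.EVENT_WRITE,
                     {"host": host, "path": path, "request": req.encode(), "buf": b""})

    def scan():
        sel = selectors.DefaultSelector()
        pending = [(h, p, path) for (h, p) in TARGETS for path in PATHS]
        hits = []
        while pending or len(sel.get_map()):
            # Keep the pool full: refill free slots immediately instead of
            # waiting for any single (possibly delayed) response.
            while pending and len(sel.get_map()) < MAX_IN_FLIGHT:
                host, port, path = pending.pop()
                start_request(sel, host, port, path)
            for key, events in sel.select(timeout=SELECT_TIMEOUT):
                sock, data = key.fileobj, key.data
                if events & selectors.EVENT_WRITE:
                    if sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR):
                        sel.unregister(sock)     # connect failed, drop it
                        sock.close()
                    else:
                        sock.send(data["request"])
                        sel.modify(sock, selectors.EVENT_READ, data)
                elif events & selectors.EVENT_READ:
                    chunk = sock.recv(4096)
                    if chunk:
                        data["buf"] += chunk
                    else:                        # peer closed: response complete
                        status_line = data["buf"].split(b"\r\n", 1)[0]
                        if data["buf"] and b" 404 " not in status_line:
                            hits.append((data["host"], data["path"], status_line))
                        sel.unregister(sock)
                        sock.close()
        return hits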


Ok, maybe I am understanding this wrongly. But I am open to being proven wrong.

Suppose a bot is scanning 10000 IP's, 100 at a time concurrently (*), for 20 potentially vulnerable URLs per server. That is thus 200,000 HTTP requests to make. And let's suppose that the bot cannot tell, from the delay experienced when receiving any particular response, whether this is a server that is artificially delaying responses, or whether this is a normal delay due to whatever condition (**).
And let's also suppose that, of the total of 200,000 requests, only 1% (2,000) will be "hits" (where the URL actually responds with something other than a 404). That leaves 99% of requests (198,000) responding with a 404. And let's suppose that the bot is extra-smart, and always keeps its "pool" of 100 outgoing connections busy, in the sense that as soon as a response is received on one connection, that connection is closed and immediately re-opened for another HTTP request.

If no webserver implements the scheme, we assume 10 ms per 404 response.

So the bot launches the first batch of 100 requests (taking 10 ms to do so), then goes back to check its first connection and finds a response. If the response is not a 404, it's a "hit" and gets added to the table of vulnerable IP's (and to gain some extra time, any remaining URLs to scan on the same server could now be cancelled, although this could be disputed). If the response is a 404, it's a "miss". But that doesn't mean that there are no other vulnerable URLs on that server, so it still needs to scan the others. All in all, if the bot can keep issuing requests and processing responses at the rate of 100 per 10 ms on average, it will take it a total of 200,000 / 100 * 10 ms = 20,000 ms (20 seconds) to perform the scan of the 200,000 URLs, and it will have collected 2,000 hits after doing so.
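
As a quick sanity check on that arithmetic (just a sketch restating the numbers assumed above):

    # Baseline scan time with no artificial delays, using the assumptions above.
    total_requests = 10000 * 20          # 10000 IPs x 20 URLs = 200,000 requests
    pool_size = 100                      # concurrent connections
    per_404_ms = 10                      # assumed 404 turnaround time

    batches = total_requests / pool_size         # 2,000 passes of 100 requests
    total_ms = batches * per_404_ms              # 2,000 * 10 ms = 20,000 ms
    hits = int(total_requests * 0.01)            # 1% hit rate = 2,000 hits
    print(total_ms, hits)                        # -> 20000.0 2000 (20 seconds, 2000 hits)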

Now let's suppose that out of these 10000 servers, 10% of them implement the scheme, and delay their 404 responses by an average of 1000 ms. So now the bot launches the first 100 requests in 10 ms, then goes back to check the status of the first one. With a probability of 0.1, this could be one of the delayed ones. In that case, no response will be there yet, and the bot skips to the next connection.
At the end of this pass, the bot will thus have received 90 responses (10 are still delayed), and re-issued 90 new requests. Then on the next pass, the same 10 delayed responses will still be delayed (on average), and among the 90 new ones, 9 will also be. So now it can only issue 81 new requests, and when it comes back to check, 10 + 9 + 8 = 27 connections will be waiting. Basically, after a few cycles like this, all of its 100 pool connections will be waiting for a response, and it will have no choice but to either wait, or start killing the connections that have been waiting more than a certain amount of time.
Or it can increase its number of connections and become more conspicuous (***).
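
A toy model of that saturation argument (a sketch only, under the simplifying assumption that a delayed connection stays tied up across the fast passes):

    # Each pass, 10% of newly issued requests land on a delaying server and
    # their connections stay tied up (simplifying assumption of the argument).
    pool_size = 100
    delay_fraction = 0.10

    blocked = 0.0
    for scan_pass in range(1, 11):
        free_slots = pool_size - blocked            # slots still turning over
        blocked += free_slots * delay_fraction      # new requests that get delayed
        print("pass %2d: ~%3.0f waiting, ~%3.0f free"
              % (scan_pass, blocked, pool_size - blocked))
    # pass 1: ~10 waiting, pass 2: ~19, pass 3: ~27 ... the pool drifts toward
    # full saturation as long as delayed connections outlive several passes.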

If it chooses to wait, then its time to complete the scan of the 10000 IP's will have increased by 200,000 * 10% * 1000 ms = 20,000,000 ms.
If it chooses not to wait, then it will never know whether those URLs were vulnerable or not.
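
Putting the two numbers side by side (a sketch, under the assumption made above that each delayed 404 adds its full delay to the total scan time):

    # Comparison assuming every delayed 404 adds its full 1000 ms to the
    # overall scan time (i.e. the delays are not absorbed by extra concurrency).
    baseline_ms = 200000 / 100 * 10      # 20,000 ms without any delaying servers
    delayed_requests = 200000 * 0.10     # 10% of servers implement the scheme
    added_ms = delayed_requests * 1000   # 1000 ms average delay per delayed 404
    print(baseline_ms, added_ms)         # -> 20000.0 20000000.0 (20 s vs roughly 5.5 hours extra)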

Is there a flaw in this reasoning?

If not, then the avoidance scheme based on becoming more parallel would be quite ineffective, no?



(*) I pick 100 at a time, imagining that as the number of established outgoing connections increases, a bot becomes more and more visible on the host it is running on. So I imagine that there is a reasonable limit to how many of them it can open at a time.

(**) this being because the server varies the individual 404 delay randomly between two reasonable values (e.g. 100 ms and 2000 ms), which can happen on any normal server.

(***) I would say that a bot which keeps 100 outgoing connections open in parallel on average would already be *very* conspicuous.
