This might be a bit off topic.... Given that you can set POST_REQUEST_SIZE in a production PHP application, how likely is it really that an app will encounter a HashDos attack?
>From what I gather this will require MBs to GBs of data in order to cause a DoS. >From the web side, I think there are enough tools to prevent HashDos from happening... Would the issue then affect only CLI users? Sorry, if this seems like a derail, I am pretty new to the internals list. Cheers, Glenn On Wed, Sep 21, 2016 at 10:23 AM, Nikita Popov <nikita....@gmail.com> wrote: > On Wed, Sep 21, 2016 at 4:05 PM, Tom Worster <f...@thefsb.org> wrote: > > > On 9/21/16 8:37 AM, Rowan Collins wrote: > > > >> On 21 September 2016 13:02:20 BST, Glenn Eggleton <geggl...@gmail.com> > >> wrote: > >> > >>> What if we had some sort of configuration limit on collision length? > >>> > >> > >> Previous discussions have come to the conclusion that the difference > >> between normal collision frequency and sufficient for a DoS is so large > >> that the only meaningful settings would be on or off. e.g. the proposed > >> limit is 1000, and randomly inserting millions of rows produces about > 12. > >> > >> The problem with long running applications is not that they need to > raise > >> the limit, it's that they need to handle the error gracefully if they > are > >> in fact under attack. Because hash tables are so ubiquitous in the > engine, > >> there's no guarantee that that's possible, so an attacker would have the > >> ability to crash the process with the limit turned on, or hang the CPU > with > >> the limit turned off. > >> > > > > Right. It seems like count-and-limit pushes the problem onto the user who > > then has to discriminate normal from malicious causes for rising counters > > and find appropriate actions for each. > > > > Even a sophisticated user who understands hash collision counters may not > > welcome this since it adds complexity that's hard to test and involves > > questionable heuristics. > > > > Tom > > > > Quoting a relevant part of the previous discussion: > > > Lets [try] to quantify the probability of reaching the collision limit C > with a hashtable of size N and assuming a random hash distribution. The > upper bound for this should be (N choose C) * (1/N)^(C-1), with (1/N)^(C-1) > being the probability that C elements hash to one value and (N over C) the > number of C element subsets of an N element set. For large N and N >> C we > approximate (N choose C) to (e*N/C)^C / sqrt(2pi*C). As such our upper > bound becomes N * (e/C)^C / sqrt(2pi*C). Choosing N = 2^31 (largest > possible hashtable) we get for C = 20 probability 8.9E-10 and for C = 100 > probability 2.3E-149. The patch uses C = 1000. > > In other words, it is extremely unlikely that you hit the collision limit > by accident, with non-malicious data. So no, the user does not have to > discriminate normal from malicious causes. If the limit triggers, it's > malicious. > > Nikita >