On Thu, Jan 05, 2012 at 01:35:50PM -0500, Jeff Trawick wrote: > We don't want to say "go fish" to APR applications if our hash plus > their vector results in an exploit.
I'll bite. 1) Chained hash tables have quadratic worst case insertion time, O(n**2). The APR hash table (apr_hash_*) implementation shares that characteristic. 2) APR tables (apr_table_*) provide a data structure with constant time worst case insertion, O(1). 3) If applications are using apr_hash_* where they should be doing apr_table_*, the apps should be fixed to use the correct API. 4) Mixing some per-hash randomness into the hash is mostly futile. The attack is effective if you can generate "n" keys which hit the same hash bucket. The hash bucket chosen is (hash(key) % ht->max). The proposal is to change the hash lookup function to, say: (hash(key) * ht->pure_random_number) % ht->max where ht->max is 15 by default. So you "merely" have to increase the size of the input by 15 to produce at least the same overhead; the attacker must generate 15n keys to ensure they hit all the buckets. The attack is still quadratic time. Reducing the impact by a constant factor might make a difference to some app, but it's likely that app needs to switch to use apr_table_* or otherwise limiting untrusted input more tightly anyway. OK, now flame me to death ;) Joe