Re: [PERFORM] HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?

Scott Carey Wed, 13 Apr 2011 10:08:21 -0700
Sorry for resurrecting this thread, but this has been in my outbox for months 
and I think it is important: On Oct 27, 2010, at 12:56 PM, Tom Lane wrote: > 
Scott Carey writes: > > Why does hashjoin behave poorly when the inner relation 
is not > > uniformly distributed and the outer is? > Because a poorly 
distributed inner relation leads to long hash chains. > In the very worst case, 
all the keys are on the same hash chain and it > degenerates to a nested-loop 
join. (There is an assumption in the > costing code that the longer hash chains 
also tend to get searched more > often, which maybe doesn't apply if the outer 
rel is flat, but it's not > obvious that it's safe to not assume that.) I 
disagree. Either 1: The estimator is wrong or 2: The hash data structure is 
flawed. A pathological skew case (all relations with the same key), should be 
_cheaper_ to probe. There should be only _one_ entry in the hash (for the one 
key), and that entry will be a list of all relations matching the key. 
Therefore, hash probes will either instantly fail to match on an empty bucket, 
fail to match the one key with one compare, or match the one key and join on 
the matching list. In particular for anti-join, high skew should be the best 
case scenario. A hash structure that allows multiple entries per key is 
inappropriate for skewed data, because it is not O(n). One that has one entry 
per key remains O(n) for all skew. Furthermore, the hash buckets and # of 
entries is proportional to n_distinct in this case, and smaller and more cache 
and memory friendly to probe. > Not really. It's still searching a long hash 
chain; maybe it will find > an exact match early in the chain, or maybe not. 
It's certainly not > *better* than antijoin with a well-distributed inner rel. 
There shouldn't be long hash chains. A good hash function + proper bucket count 
+ one entry per key = no long chains. > Although the > point is moot, anyway, 
since if it's an antijoin there is only one > candidate for which rel to put on 
the outside. You can put either relation on the outside with an anti-join, but 
would need a different algorithm and cost estimator if done the other way 
around. Construct a hash on the join key, that keeps a list of relations per 
key, iterate over the other relation, and remove the key and corresponding list 
from the hash when there is a match, when complete the remaining items in the 
hash are the result of the join (also already grouped by the key). It could be 
terminated early if all entries are removed. This would be useful if the hash 
was small, the other side of the hash too large to fit in memory, and 
alternative was a massive sort on the other relation. Does the hash cost 
estimator bias towards smaller hashes due to hash probe cost increasing with 
hash size due to processor caching effects? Its not quite O(n) due to caching 
effects. > regards, tom lane
Re: [PERFORM] HashJoin order, hash the large or small table? Postgres likes to hash the big one, why?

Reply via email to