All,

Attached is an initial patch I've been playing with that uses Bloom filters to reduce unnecessary processing of outer tuples in hash joins.

In short, it works by creating a Bloom filter, adding all relevant tuples from the inner relation, and querying the filter (for existence) when retrieving tuples from the outer relation. This avoids unnecessary tuple movement and bucket searches for matches we already know can't exist.

Currently it works only for JOIN_INNER, but it could be modified to optimize anti/semi joins as well. I also created a GUC, named bloom_pruning, to enable pruning.
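To make the build/probe idea concrete, here is a minimal standalone sketch. It is not the code in the attached patch: the type BloomSketch and the functions bloom_create, bloom_add, bloom_may_contain, and bloom_free are hypothetical names invented for illustration, and the sketch assumes a power-of-two filter size and one bit per tuple, picked from a 32-bit hash value that the join has already computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical illustration only; none of these names come from the patch. */
typedef struct BloomSketch
{
    uint32_t nbits;   /* number of bits in the filter; must be a power of two */
    uint8_t *bits;    /* bit array, nbits / 8 bytes, zero-initialized */
} BloomSketch;

/* Allocate an empty filter; returns NULL if memory allocation fails. */
static BloomSketch *
bloom_create(uint32_t nbits)
{
    BloomSketch *f;

    /* The sketch assumes a power-of-two size so that masking can replace modulo. */
    assert(nbits >= 8 && (nbits & (nbits - 1)) == 0);

    f = malloc(sizeof(BloomSketch));
    if (f == NULL)
        return NULL;
    f->nbits = nbits;
    f->bits = calloc(nbits / 8, 1);
    if (f->bits == NULL)
    {
        free(f);
        return NULL;
    }
    return f;
}

/* Build side: record an inner tuple by setting the one bit its hash selects. */
static void
bloom_add(BloomSketch *f, uint32_t hashvalue)
{
    uint32_t bit = hashvalue & (f->nbits - 1);

    f->bits[bit >> 3] |= (uint8_t) (1u << (bit & 7));
}

/*
 * Probe side: false means no inner tuple has this hash value, so the outer
 * tuple can be skipped. True means "maybe", and the normal bucket search
 * still has to run.
 */
static bool
bloom_may_contain(const BloomSketch *f, uint32_t hashvalue)
{
    uint32_t bit = hashvalue & (f->nbits - 1);

    return (f->bits[bit >> 3] & (1u << (bit & 7))) != 0;
}

static void
bloom_free(BloomSketch *f)
{
    free(f->bits);
    free(f);
}
```

In this sketch the build phase would call bloom_add once per inner tuple, and the probe phase would call bloom_may_contain before touching the hash table. Two hash values that map to the same bit produce a false positive, which only costs the bucket search that would have happened anyway; a false negative cannot occur.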
Rather than computing k hash functions, this implementation simply sets a bit based on the already-computed hash value.

I wanted to send this around for review and comments before working on it further. As this isn't overly intrusive, if someone can commit to reviewing and providing input, I'll commit to having this ready for 8.4.

-- 
Jonah H. Harris, Senior DBA
myYearbook.com
bloompruning_v1.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)