Hi All, I have been researching bloom filters and discussed it on IRC with RhodiumToad and David Fetter, and they pointed me to the various places that could potentially have bloom filters, apart from the places that already have them currently.
I have been reading the current implementation of hash joins, and in ExecScanHashBucket, which I understand is the actual lookup function, we could potentially look at a bloom filter per bucket. Instead of actually looking up each hash value for the outer relation, we could just check the corresponding bloom filter for that bucket, and if we get a positive, then lookup the actual values i.e. continue with our current behaviour (since we could be looking at a false positive). This doesn't involve a lot of new logic. We need to add a bloom filter in HashJoinState and set it when the hash table is being made. Also, one thing we need to look at is the set of hash functions being used for the bloom filters. This can be a point of further discussion. The major potential benefit we could gain is that bloom filters never give false negatives. So, we could save a lot of lookup where the corresponding bloom filter gives a negative. This approach can also be particularly useful for hash anti joins, where we look for negatives. Since bloom filters can easily be stored and processed, by straight bit operations, we could be looking at a lot of saving here. I am not sure if we already do something like this. Please direct me to the relevant parts in the code if we already do. Regards, Atri -- Regards, Atri l'apprenant -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers