> We saw a case recently where a hash join was using much more memory than
> it was supposed to, causing failure when the server ran out of memory.
Yes. I had the same problem a few months ago:

http://archives.postgresql.org/pgsql-general/2004-09/msg00410.php

It turned out that the cost estimates were so far off, no matter which tunables I modified, that I was never able to execute the query to completion. I analyzed the code and devised a solution similar to what you proposed, though I didn't consider HashAggregates at the time. Unfortunately, I lost all that work in a hard drive failure and was never able to get back to it, so I can't refer to my old notes. For what it's worth, your solution looks very reasonable to me.

This also brings up a line of thought I had a while ago on a related topic. Something like a "HashDistinct" might be useful, if it had no startup cost. It would basically be a plan node in the executor that dynamically builds a hashtable so that it can pull rows from its child node (discarding them if they already appear in the hashtable) until it can pass on a novel row (see the sketch in the P.S. below).

I have some reservations about it, though. At best, in queries with minimal startup cost to begin with, it would seem to be a tradeoff favoring latency over throughput (assuming the HashDistinct would be a slower operation overall than separate aggregation and distinct operations). Then there is the issue of really big hash tables...

I was hoping to get some time in the upcoming months to hash out these issues and see whether it's worth it, and whether it would be generally useful at all.

-Aaron
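P.S. To make the HashDistinct idea concrete, here is a toy sketch in C. This is not actual executor code; the row representation (plain ints), the fixed-size hashtable, and all the function names (next_from_child, hashdistinct_next, and so on) are made up for illustration. It just shows the pull-based shape: the node emits each novel row as soon as it pulls one from the child, so there is no startup cost.

/*
 * Toy sketch of the "HashDistinct" idea: a pull-based node that builds
 * its hashtable incrementally, emitting each novel row as soon as it
 * arrives from the child.  NOT the PostgreSQL executor API; rows are
 * just ints and the hashtable is a fixed-size toy.
 */
#include <stdio.h>
#include <stdbool.h>
#include <string.h>

#define TABLE_SIZE 1024         /* toy fixed size; sidesteps the
                                 * "really big hash tables" problem */

typedef struct
{
    int  slots[TABLE_SIZE];
    bool used[TABLE_SIZE];
} HashSet;

/* Insert key; return true if it was already present. */
static bool
hashset_test_and_insert(HashSet *set, int key)
{
    unsigned h = (unsigned) key % TABLE_SIZE;

    while (set->used[h])
    {
        if (set->slots[h] == key)
            return true;        /* duplicate: caller should discard */
        h = (h + 1) % TABLE_SIZE;   /* linear probing */
    }
    set->used[h] = true;
    set->slots[h] = key;
    return false;
}

/* Stand-in for the child plan node: yields rows one at a time. */
static const int child_rows[] = {3, 1, 3, 2, 1, 5, 2, 4};
static int child_pos = 0;

static bool
next_from_child(int *row)
{
    if (child_pos >= (int) (sizeof(child_rows) / sizeof(child_rows[0])))
        return false;           /* child exhausted */
    *row = child_rows[child_pos++];
    return true;
}

/*
 * The HashDistinct node proper: pull rows from the child, discarding
 * ones already seen, until a novel row can be passed up.
 */
static bool
hashdistinct_next(HashSet *seen, int *row)
{
    while (next_from_child(row))
    {
        if (!hashset_test_and_insert(seen, *row))
            return true;        /* novel row: pass it up immediately */
        /* duplicate: keep pulling */
    }
    return false;
}

int
main(void)
{
    HashSet seen;
    int     row;

    memset(&seen, 0, sizeof(seen));
    while (hashdistinct_next(&seen, &row))
        printf("%d\n", row);    /* prints 3, 1, 2, 5, 4 */
    return 0;
}

The real version would of course need to spill or give up gracefully once the table no longer fits in memory, which is exactly the reservation above.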