Re: [HACKERS] pretty bad n_distinct estimate, causing HashAgg OOM on TPC-H

Tomas Vondra Sat, 20 Jun 2015 08:51:07 -0700

Hi,

On 06/20/2015 05:29 PM, Feng Tian wrote:


I have not read Jeff's patch, but here is how I think hash agg should work:

Hash agg scans the lineitem table and performs aggregation in memory. Once
work_mem is exhausted, it writes intermediate state to disk, bucket by
bucket. When the scan of lineitem is finished, it reads all tuples from one
bucket back, combines the intermediate states, and finalizes the aggregation.
I saw a quite extensive discussion on combining aggregation on the
dev list, so I assume it will be added.
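
The scheme described above can be sketched roughly as follows. This is only a toy illustration of that description, not Jeff's patch and not PostgreSQL code. The function name spilling_hash_agg and the parameters max_groups (standing in for the work_mem limit) and num_buckets are hypothetical, the aggregate is assumed to be a plain SUM, and pickle is used only as a stand-in for a serialization format for the intermediate state.

```python
import os
import pickle
import tempfile


def spilling_hash_agg(rows, max_groups, num_buckets=4):
    """SUM(value) GROUP BY key, keeping at most max_groups states in memory.

    rows is an iterable of (key, value) pairs. max_groups is a hypothetical
    stand-in for a memory limit such as work_mem.
    """
    table = {}  # key -> partial sum (the intermediate aggregate state)
    with tempfile.TemporaryDirectory() as spill_dir:
        spill_files = {}  # bucket number -> open spill file

        def spill():
            # Write every in-memory partial state to the file of its bucket.
            for key, state in table.items():
                bucket = hash(key) % num_buckets
                f = spill_files.get(bucket)
                if f is None:
                    path = os.path.join(spill_dir, "bucket%d" % bucket)
                    f = spill_files[bucket] = open(path, "w+b")
                pickle.dump((key, state), f)
            table.clear()

        # Phase 1: scan the input, aggregating in memory and spilling when full.
        for key, value in rows:
            if key not in table and len(table) >= max_groups:
                spill()
            table[key] = table.get(key, 0) + value

        if not spill_files:
            return table  # everything fit in memory, nothing to combine

        spill()  # flush the remaining states so all data goes through phase 2

        # Phase 2: read one bucket at a time and combine its partial states.
        result = {}
        for f in spill_files.values():
            f.seek(0)
            combined = {}
            while True:
                try:
                    key, state = pickle.load(f)
                except EOFError:
                    break
                combined[key] = combined.get(key, 0) + state
            f.close()
            result.update(combined)
        return result
```

In this sketch a group that keeps reappearing after a spill is written to disk once per spill, and the combine phase assumes the distinct groups of a single bucket fit in memory.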

That's not really how the proposed patch works, partly due to the fact that we don't have a good way to serialize/deserialize the aggregate state etc. There are also various corner cases where you can end up writing much more data than you assumed, but let's discuss that in the thread about the patch, not here.


regards

--
Tomas Vondra                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


