On Mon, Mar 26, 2012 at 5:43 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Hm.  This illustrates that it's not too prudent to rely on a default
> numdistinct estimate to decide that a hash aggregation is safe :-(.
> We had probably better tweak the cost estimation rules to not trust
> that.  Maybe, if we have a default estimate, we should take the worst
> case estimate that the column might be unique?  That could still burn
> us if the rowcount estimate was horribly wrong, but those are not nearly
> as shaky as numdistinct estimates ...
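
To make that concrete, here's a rough standalone sketch of the idea
(purely illustrative -- not the actual planner code, and all the names
are made up): when the ndistinct figure is only the default estimate,
cost the hash aggregate as if every input row were a distinct group.

#include <math.h>
#include <stdbool.h>

/* Illustrative only, not planner code.  If ndistinct is just the
 * hard-wired default, assume the worst case: the column might be
 * unique, so there could be as many groups as input rows. */
static double
groups_for_costing(double input_rows, double ndistinct,
                   bool ndistinct_is_default)
{
    if (ndistinct_is_default)
        return input_rows;
    return fmin(ndistinct, input_rows);
}

/* Treat hash aggregation as safe only if that pessimistic group
 * count still fits in work_mem. */
static bool
hash_agg_fits(double input_rows, double ndistinct,
              bool ndistinct_is_default,
              double bytes_per_group, double work_mem_bytes)
{
    double groups = groups_for_costing(input_rows, ndistinct,
                                       ndistinct_is_default);

    return groups * bytes_per_group <= work_mem_bytes;
}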

Perhaps we should have two work_mem settings -- one as the target to
aim for, and one as a hard(er) limit that the worst case must stay
under?
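
Something like this, as a first cut (hash_mem_target and
hash_mem_hard_limit are invented names, not existing GUCs): plan a
hash aggregate only if the expected memory use fits under the soft
target *and* the worst case -- every row its own group -- stays under
the harder limit.

#include <stdbool.h>

/* Illustrative only; both limits are hypothetical settings. */
static bool
hash_agg_allowed(double input_rows, double est_groups,
                 double bytes_per_group,
                 double hash_mem_target, double hash_mem_hard_limit)
{
    double expected   = est_groups * bytes_per_group;
    double worst_case = input_rows * bytes_per_group;

    return expected <= hash_mem_target &&
           worst_case <= hash_mem_hard_limit;
}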

I have a sketch in my head for how to handle spilling hash aggregates
to disk. I'm not sure it's worth the amount of complexity it would
require, but I'll poke around a bit and see if it works out well.
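
The textbook shape of such a scheme, just to illustrate the idea (not
necessarily the sketch mentioned above), is: aggregate as many groups
as fit under the memory limit, divert rows for any other group into
temp-file partitions chosen by hash value, emit the in-memory groups,
then recurse on each partition.  A toy version, with arrays standing
in for temp files:

#include <stdio.h>
#include <stdlib.h>

#define MAX_GROUPS  4    /* stand-in for the work_mem limit */
#define NPARTS      2

typedef struct { int key; long count; } Group;

static void
agg_with_spill(const int *rows, int nrows, int depth)
{
    Group groups[MAX_GROUPS];
    int   ngroups = 0;
    int  *part[NPARTS];
    int   part_n[NPARTS] = {0, 0};
    int   i, p, g;

    for (p = 0; p < NPARTS; p++)
        part[p] = malloc(sizeof(int) * nrows);

    for (i = 0; i < nrows; i++)
    {
        int key = rows[i];

        /* linear scan stands in for a real hash table lookup */
        for (g = 0; g < ngroups; g++)
            if (groups[g].key == key)
                break;

        if (g < ngroups)
            groups[g].count++;              /* group already in memory */
        else if (ngroups < MAX_GROUPS)
        {
            groups[ngroups].key = key;      /* still room: new group */
            groups[ngroups].count = 1;
            ngroups++;
        }
        else
        {
            /* no room: defer the row to a partition for a later pass */
            p = (key + depth) % NPARTS;
            part[p][part_n[p]++] = key;
        }
    }

    for (g = 0; g < ngroups; g++)
        printf("key %d => %ld\n", groups[g].key, groups[g].count);

    /* each partition holds strictly fewer distinct keys than its
     * input did, so the recursion terminates */
    for (p = 0; p < NPARTS; p++)
    {
        if (part_n[p] > 0)
            agg_with_spill(part[p], part_n[p], depth + 1);
        free(part[p]);
    }
}

int
main(void)
{
    int rows[] = {1, 2, 3, 4, 5, 6, 1, 2, 5, 6, 7, 7, 8, 3};

    agg_with_spill(rows, (int) (sizeof(rows) / sizeof(rows[0])), 0);
    return 0;
}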

-- 
greg
