On 07/20/2014 07:17 PM, Tomas Vondra wrote:
On 19.7.2014 20:24, Tomas Vondra wrote:
On 13.7.2014 21:32, Tomas Vondra wrote:
The current patch only implemnents this for tuples in the main
hash table, not for skew buckets. I plan to do that, but it will
require separate chunks for each skew bucket (so we can remove it
without messing with all of them). The chunks for skew buckets
should be smaller (I'm thinking about ~4kB), because there'll be
more of them, but OTOH those buckets are for frequent values so the
chunks should not remain empty.

I've looked into extending the dense allocation to the skew buckets,
and I think we shouldn't do that. I got about 50% of the changes and
then just threw it out because it turned out quite pointless.

The amount of memory for skew buckets is limited to 2% of work mem,
so even with 100% overhead it'll use ~4% of work mem. So there's
pretty much nothing to gain here. So the additional complexity
introduced by the dense allocation seems pretty pointless.

I'm not entirely happy with the code allocating some memory densely
and some using traditional palloc, but it certainly seems cleaner
than the code I had.

So I think the patch is mostly OK as is.

Attached is v4 of the patch, mostly with minor improvements and cleanups
(removed MemoryContextStats, code related to skew buckets).

Can you remind me what kind of queries this is supposed to help with? Could you do some performance testing to demonstrate the effect? And also a worst-case scenario.

At a quick glance, I think you're missing a fairly obvious trick in ExecHashIncreaseNumBatches: if a chunk contains only a single tuple, you can avoid copying it to a new chunk and just link the old chunk to the new list instead. Not sure if ExecHashIncreaseNumBatches is called often enough for that to be noticeable, but at least it should help in extreme cases where you push around huge > 100MB tuples.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to