> Hello KaiGai-san,
>
> On 08/21/2015 02:28 AM, Kouhei Kaigai wrote:
> ...
> >>
> >> But what is the impact on queries that actually need more than 1GB
> >> of buckets? I assume we'd only limit the initial allocation and
> >> still allow the resize based on the actual data (i.e. the 9.5
> >> improvement), so the queries would start with 1GB and then resize
> >> once finding out the optimal size (as done in 9.5). The resize is
> >> not very expensive, but it's not free either, and with so many
> >> tuples (requiring more than 1GB of buckets, i.e. ~130M tuples) it's
> >> probably just noise in the total query runtime. But it'd be nice
> >> to see some proof of that ...
> >>
> > The problem here is that we cannot know the exact size until the
> > Hash node has read the entire inner relation. All we can do is rely
> > on the planner's estimation; however, it often computes a crazy
> > number of rows. I think resizing of hash buckets is a reasonable
> > compromise.
>
> I understand the estimation problem. The question I think we need to
> answer is how to balance the behavior for well- and poorly-estimated
> cases. It'd be unfortunate if we lower the memory consumption in the
> over-estimated case while significantly slowing down the well-estimated
> ones.
>
> I don't think we have a clear answer at this point - maybe it's not a
> problem at all and it'll be a win no matter what threshold we choose.
> But it's a separate problem from the bugfix.
>
I agree that this is a separate (and maybe not easy) problem.
If anybody knows of prior academic research on this, please share it with us.

> >> I believe the patch proposed by KaiGai-san is the right one to fix
> >> the bug discussed in this thread. My understanding is KaiGai-san
> >> withdrew the patch as he wants to extend it to address the
> >> over-estimation issue.
> >>
> >> I don't think we should do that - IMHO that's an unrelated
> >> improvement and should be addressed in a separate patch.
> >>
> > OK, it might not be a problem we should conclude within a few days,
> > just before the beta release.
>
> I don't quite see a reason to wait for the over-estimation patch. We
> probably should backpatch the bugfix anyway (although it's much less
> likely to run into that before 9.5), and we can't really backpatch the
> behavior change there (as there's no hash resize).
>
I no longer object to this bugfix.

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei <kai...@ak.jp.nec.com>

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers