Thanks Tomas. I will follow this problem on your thread; this one can be closed.
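For the archives, here is a minimal, self-contained sketch of the penalty idea. It follows the shape of the existing check in src/backend/executor/nodeHash.c (where the flag is spelled growEnabled); the toy struct, the 2x penalty factor, and the exact placement are assumptions for illustration only, not what a final patch has to use:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    /* Toy stand-in for the few HashJoinTableData fields involved. */
    typedef struct
    {
        size_t  spaceAllowed;   /* grow threshold, like hashtable->spaceAllowed */
        bool    growEnabled;    /* current behaviour: a one-way off switch */
    } ToyHashTable;

    /*
     * Called after ExecHashIncreaseNumBatches() moved either no tuples
     * (nfreed == 0) or all of them (nfreed == ninmemory), i.e. the
     * split did not help.
     */
    static void
    split_failed(ToyHashTable *ht, long nfreed, long ninmemory)
    {
        if (nfreed == 0 || nfreed == ninmemory)
        {
            /* Current code: ht->growEnabled = false;  growth gone forever. */

            /* Proposal: keep growth enabled, raise the threshold instead. */
            ht->spaceAllowed *= 2;      /* hypothetical penalty factor */
        }
    }

    int
    main(void)
    {
        ToyHashTable ht = { 4 * 1024 * 1024, true };

        split_failed(&ht, 0, 1000);     /* a useless split */
        printf("spaceAllowed is now %zu bytes, growth still enabled: %d\n",
               ht.spaceAllowed, (int) ht.growEnabled);
        return 0;
    }

If a later batch turns out to be more diverse, the next split attempt still happens, just at a higher threshold.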
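And to make the BufFile concern below concrete, a back-of-the-envelope calculation as a standalone C program, assuming the 8kB per-file buffer mentioned in the mail and one temp file per batch on each of the inner and outer sides:

    #include <stdio.h>

    /* Each open BufFile keeps an 8kB buffer; a hash join needs one
     * temp file per batch for each side (inner and outer). */
    #define BUFFILE_BUFFER 8192

    int
    main(void)
    {
        int     nbatch;

        for (nbatch = 256; nbatch <= 16384; nbatch *= 2)
        {
            size_t  buffile_mem = (size_t) nbatch * 2 * BUFFILE_BUFFER;

            printf("nbatch = %5d -> ~%zu MB just for BufFile buffers\n",
                   nbatch, buffile_mem / (1024 * 1024));
        }
        return 0;
    }

At 16384 batches that is already 256MB for the buffers alone, before counting work_mem, which is why doubling nbatch without first checking that it helps is a bad trade.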
On Thu, May 16, 2019 at 3:58 AM Tomas Vondra <tomas.von...@2ndquadrant.com>
wrote:

> On Wed, May 15, 2019 at 06:19:38PM +0800, Hubert Zhang wrote:
> >Hi all,
> >
> >When we build the hash table for a hash join node etc., we split tuples
> >into different hash buckets, since the tuples cannot all be held in
> >memory. Postgres splits each bucket into batches; only the current batch
> >is kept in memory while the other batches are written to disk.
> >
> >During ExecHashTableInsert(), if the memory cost exceeds the operator's
> >allowed limit (hashtable->spaceAllowed), batches are split on the fly by
> >calling ExecHashIncreaseNumBatches().
> >
> >In the past, if data was distributed unevenly, the batch split could fail
> >(all the tuples fall into one split batch and the other batch stays
> >empty). Postgres then sets hashtable->growEnable to false and never
> >expands the batch number again.
> >
> >If tuples become more diverse later, splitting batches is still valuable
> >and could keep the current batch from growing too big and finally causing
> >an OOM.
> >
> >To fix this, we introduce a penalty on hashtable->spaceAllowed, which is
> >the threshold used to determine whether to increase the batch number.
> >If a batch split fails, we increase the penalty instead of just turning
> >off the growEnable flag.
> >
> >Any comments?
> >
>
> There's already another thread discussing various issues with how hashjoin
> increases the number of batches, including various issues with how/when we
> disable adding more batches.
>
> https://commitfest.postgresql.org/23/2109/
>
> In general I think you're right that something like this is necessary, but
> I think we may need to rethink growEnable a bit more.
>
> For example, the way you implemented it, after reaching the increased
> limit, we just increase the number of batches just like today, and then
> decide whether it actually helped. But that means we double the number of
> BufFile entries, which uses more and more memory (because each is 8kB and
> we need 1 per batch). I think in this case (after increasing the limit) we
> should check whether increasing batches makes sense or not, and only do it
> if it helps. Otherwise we'll double the amount of memory for BufFile(s)
> and also the work_mem. That's not a good idea.
>
> But as I said, there are other issues discussed on the other thread. For
> example we only disable the growth when all rows fall into the same batch.
> But that's overly strict.
>
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

--
Thanks
Hubert Zhang