Hi, all
> 	if (!snapshot->suboverflowed)
> 	{
> 		/* we have full data, so search subxip */
> -		int32		j;
> -
> -		for (j = 0; j < snapshot->subxcnt; j++)
> -		{
> -			if (TransactionIdEquals(xid, snapshot->subxip[j]))
> -				return true;
> -		}
> +		if (XidInXip(xid, snapshot->subxip, snapshot->subxcnt,
> +					 &snapshot->subxiph))
> +			return true;
>
> 		/* not there, fall through to search xip[] */
> 	}

If snapshot->suboverflowed is false, then subxcnt must be less than
PGPROC_MAX_CACHED_SUBXIDS, which is currently 64.

And we won't use the hash table if xcnt is less than XIP_HASH_MIN_ELEMENTS,
which is 128 in the current discussion.

So the subxid hash table will never be used, right?

Regards,
Zhang Mingli

> On Jul 14, 2022, at 01:09, Nathan Bossart <nathandboss...@gmail.com> wrote:
>
> Hi hackers,
>
> A few years ago, there was a proposal to create hash tables for long
> [sub]xip arrays in snapshots [0], but the thread seems to have fizzled out.
> I was curious whether this idea still showed measurable benefits, so I
> revamped the patch and ran the same test as before [1]. Here are the
> results for 60-second runs on an r5d.24xlarge with the data directory on
> the local NVMe storage:
>
>     writers  HEAD  patch  diff
>     ----------------------------
>     16       659   664    +1%
>     32       645   663    +3%
>     64       659   692    +5%
>     128      641   716    +12%
>     256      619   610    -1%
>     512      530   702    +32%
>     768      469   582    +24%
>     1000     367   577    +57%
>
> As before, the hash table approach seems to provide a decent benefit at
> higher client counts, so I felt it was worth reviving the idea.
>
> The attached patch has some key differences from the previous proposal.
> For example, the new patch uses simplehash instead of open-coding a new
> hash table. Also, I've bumped up the threshold for creating hash tables to
> 128 based on the results of my testing. The attached patch waits until a
> lookup of [sub]xip before generating the hash table, so we only need to
> allocate enough space for the current elements in the [sub]xip array, and
> we avoid allocating extra memory for workloads that do not need the hash
> tables. I'm slightly worried about increasing the number of memory
> allocations in this code path, but the results above seemed encouraging on
> that front.
>
> Thoughts?
>
> [0] https://postgr.es/m/35960b8af917e9268881cd8df3f88320%40postgrespro.ru
> [1] https://postgr.es/m/057a9a95-19d2-05f0-17e2-f46ff20e9b3e%402ndquadrant.com
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com
> <v1-0001-Optimize-lookups-in-snapshot-transactions-in-prog.patch>
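
P.S. To make the arithmetic behind my question concrete, here is a minimal
standalone sketch. It is not code from the patch: would_use_hash() is a
hypothetical helper, the two constants are simply the values quoted in this
thread, and it takes my premise (subxcnt is bounded by
PGPROC_MAX_CACHED_SUBXIDS whenever suboverflowed is false) at face value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values quoted in this thread; either may change as the patch evolves. */
#define PGPROC_MAX_CACHED_SUBXIDS 64
#define XIP_HASH_MIN_ELEMENTS 128

/*
 * The relation the question rests on.  It only matters if subxcnt really is
 * capped at PGPROC_MAX_CACHED_SUBXIDS for a non-overflowed snapshot, which is
 * an assumption here; if subxip[] can hold more entries than that, the
 * conclusion below does not follow.
 */
static_assert(PGPROC_MAX_CACHED_SUBXIDS < XIP_HASH_MIN_ELEMENTS,
			  "assumed subxid cap is below the hash threshold");

/*
 * Hypothetical helper (not in the patch): models "we won't use the hash
 * table if the count is less than XIP_HASH_MIN_ELEMENTS".
 */
static inline bool
would_use_hash(uint32_t count)
{
	return count >= XIP_HASH_MIN_ELEMENTS;
}
```

Under that premise, would_use_hash() is false for every count up to 64, so
the subxip hash path would be unreachable; it only becomes true from 128
elements onward.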