Re: Division in dynahash.c due to HASH_FFACTOR
On Sat, Sep 19, 2020 at 1:30 PM Tom Lane wrote:
> I wrote:
> > ISTM that getting rid of the division obviates the concern that the
> > nentries condition is too expensive,

True, that comment needed to go.

> Also, we could make it slightly cheaper yet, by changing the condition
> to
>
>     hctl->freeList[0].nentries > (long) (hctl->max_bucket)
>
> ie drop the +1.  I'd argue that this is actually a more faithful
> rendition of the previous condition, since what we had before was
> basically
>
>     hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1)

Agreed, and done.  Thanks!
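To make the off-by-one concrete, here is a worked example (editorial, assuming a table created with nelem = 1024, so nbuckets = 1024 and max_bucket = 1023, and noting that dynahash runs this check after the new entry has been counted):

    /* after the 1024th insertion: nentries == 1024, max_bucket == 1023 */
    nentries > (long) (hctl->max_bucket + 1)    /* 1024 > 1024: false -- no split */
    nentries > (long) (hctl->max_bucket)        /* 1024 > 1023: true -- split, matching the old >= test */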
Re: Division in dynahash.c due to HASH_FFACTOR
I wrote:
> ISTM that getting rid of the division obviates the concern that the
> nentries condition is too expensive,

Also, we could make it slightly cheaper yet, by changing the condition to

    hctl->freeList[0].nentries > (long) (hctl->max_bucket)

ie drop the +1.  I'd argue that this is actually a more faithful
rendition of the previous condition, since what we had before was
basically

    hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1)

			regards, tom lane
Re: Division in dynahash.c due to HASH_FFACTOR
Thomas Munro writes:
> Pushed.  Thanks Jakub, everyone.

BTW, looking over this patch, I wonder about

    /*
     * Can't split if running in partitioned mode, nor if frozen, nor if
     * table is the subject of any active hash_seq_search scans.  Strange
     * order of these tests is to try to check cheaper conditions first.
     */
    if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
        hctl->freeList[0].nentries > (long) (hctl->max_bucket + 1) &&
        !has_seq_scans(hashp))
        (void) expand_table(hashp);

ISTM that getting rid of the division obviates the concern that the
nentries condition is too expensive, and therefore we should revert to
checking it first, on the grounds that that condition is most likely
to fail.

			regards, tom lane
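A sketch of the reordering Tom suggests (editorial, not the committed hunk): with the division gone, the nentries comparison is cheap and is the test most likely to be false, so putting it first lets the && chain short-circuit after a single comparison on most insertions:

    if (hctl->freeList[0].nentries > (long) (hctl->max_bucket + 1) &&
        !IS_PARTITIONED(hctl) && !hashp->frozen &&
        !has_seq_scans(hashp))
        (void) expand_table(hashp);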
Re: Division in dynahash.c due to HASH_FFACTOR
On Tue, Sep 15, 2020 at 9:56 AM Thomas Munro wrote:
> > I looked over the patch and the only thing I saw was that we might
> > also want to remove the following line:
> >
> > #define DEF_FFACTOR 1 /* default fill factor */
>
> Right, thanks.  Fixed in the attached.

Pushed.  Thanks Jakub, everyone.

Separately, we really should tidy up the int/long/uint32/size_t
confusion in this module.  I know we have K&R C-era long-itude in
numerous other modules, but this one is a little more egregious in its
data type mixing.
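An editorial illustration of the kind of mash-up Thomas means (not code from the tree): dynahash counts entries in a long, numbers buckets with uint32, and sizes keys with Size (size_t), types whose widths and signedness differ across platforms:

    long	nentries;	/* 32 bits on LLP64 (64-bit Windows), 64 bits on LP64 */
    uint32	max_bucket;	/* always 32 bits, unsigned */
    Size	keysize;	/* pointer-sized, unsigned */

    /* On an LLP64 platform, a max_bucket above LONG_MAX (2^31 - 1) would
     * turn negative under the (long) cast used in the expansion check --
     * three different widths/signednesses meeting in one comparison. */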
Re: Division in dynahash.c due to HASH_FFACTOR
On Mon, Sep 14, 2020 at 11:35 PM David Rowley wrote:
> I just did some benchmarking with this patch using the same recovery
> benchmark that I used in [1] and also the two patches that I posted in
> [2]. Additionally, I added a PANIC at the end of recovery so that I
> could repeat the recovery over and over again with the same WAL.
>
> [data]

    N        Min        Max     Median        Avg     Stddev
x  10      62.15      67.06      64.86     64.132  1.6188528
+  10       59.6      63.81      63.13     62.233  1.4983031
Difference at 95.0% confidence
	-1.899 +/- 1.46553
	-2.96108% +/- 2.28517%
	(Student's t, pooled s = 1.55974)

Thanks!  Hmm, small but apparently significant and in line with Jakub's
report, and I suppose the effect will be greater with other nearby
recovery performance patches applied that halve the times.  Annoyingly,
I can't reproduce this speedup on my local i9-9900; maybe it requires a
different CPU...

> I looked over the patch and the only thing I saw was that we might
> also want to remove the following line:
>
> #define DEF_FFACTOR 1 /* default fill factor */

Right, thanks.  Fixed in the attached.

> The 2nd most costly call to hash_search_with_hash_value() came in via
> hash_search() via smgropen().  That does use HASH_ENTER, which could
> have triggered the divide code.  The main caller of smgropen() was
> XLogReadBufferExtended().
>
> So, it looks unlikely that any gains we are seeing are from improved
> buffer lookups.  It's more likely they're coming from more optimal
> XLogReadBufferExtended()

I think we call smgropen() twice for every buffer referenced in the
WAL: XLogReadBufferExtended() and again in ReadBufferWithoutRelcache().
We could reduce it to once with some refactoring, but I am looking into
whether I can reduce it to zero as a side-effect of another change,
more soon...

From efecf68b159a3c65517e91076009cb4e5cc6f157 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Thu, 10 Sep 2020 12:27:25 +1200
Subject: [PATCH v2] Remove custom fill factor support from dynahash.c.

Since ancient times we have had support for a fill factor (maximum
load factor) to be set for a dynahash hash table, but:

1.  It had to be an integer value >= 1, whereas for in memory hash
    tables interesting load factor targets are probably somewhere near
    the 0.75-1.0 range.

2.  It was implemented in a way that performed an expensive division
    operation that regularly showed up in profiles.

3.  We are not aware of anyone ever having used a non-default value.

Therefore, remove support, making fill factor 1 as the implicit value.

Author: Jakub Wartak
Reviewed-by: Alvaro Herrera
Reviewed-by: Tomas Vondra
Reviewed-by: Thomas Munro
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
---
 src/backend/utils/hash/dynahash.c | 20 ++++++++--------------
 src/include/utils/hsearch.h       |  2 --
 2 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index f4fbccdd7e..1122e2e5e5 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -122,7 +122,6 @@
 #define DEF_SEGSIZE			   256
 #define DEF_SEGSIZE_SHIFT	   8	/* must be log2(DEF_SEGSIZE) */
 #define DEF_DIRSIZE			   256
-#define DEF_FFACTOR			   1	/* default fill factor */
 
 /* Number of freelists to be used for a partitioned hash table. */
 #define NUM_FREELISTS 32
@@ -191,7 +190,6 @@ struct HASHHDR
 	Size		keysize;		/* hash key length in bytes */
 	Size		entrysize;		/* total user element size in bytes */
 	long		num_partitions; /* # partitions (must be power of 2), or 0 */
-	long		ffactor;		/* target fill factor */
 	long		max_dsize;		/* 'dsize' limit if directory is fixed size */
 	long		ssize;			/* segment size --- must be power of 2 */
 	int			sshift;			/* segment shift = log2(ssize) */
@@ -497,8 +495,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
 		/* ssize had better be a power of 2 */
 		Assert(hctl->ssize == (1L << hctl->sshift));
 	}
-	if (flags & HASH_FFACTOR)
-		hctl->ffactor = info->ffactor;
 
 	/*
 	 * SHM hash tables have fixed directory size passed by the caller.
@@ -603,8 +599,6 @@ hdefault(HTAB *hashp)
 
 	hctl->num_partitions = 0;	/* not partitioned */
 
-	hctl->ffactor = DEF_FFACTOR;
-
 	/* table has no fixed maximum size */
 	hctl->max_dsize = NO_MAX_DSIZE;
 
@@ -670,11 +664,10 @@ init_htab(HTAB *hashp, long nelem)
 			SpinLockInit(&(hctl->freeList[i].mutex));
 
 	/*
-	 * Divide number of elements by the fill factor to determine a desired
-	 * number of buckets.  Allocate space for the next greater power of two
-	 * number of buckets
+	 * Allocate space for the next greater power of two number of buckets,
+	 * assuming a desired maximum load factor of 1.
 	 */
-	nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
+	nbuckets = next_pow2_int(nelem);
Re: Division in dynahash.c due to HASH_FFACTOR
On Thu, 10 Sep 2020 at 14:55, Thomas Munro wrote:
> I wrote a draft commit message for Jakub's proposed change (attached),
> and did a little bit of testing, but I haven't seen a case where it
> wins yet; I need to work on something else now but thought I'd share
> this much anyway.  One observation is that the rule in init_htab had
> better agree with the expansion rule in hash_search_with_hash_value,
> otherwise you can size your hash table perfectly for n elements and
> then it still has to expand when you insert the nth element, which is
> why I changed >= to >.  Does this make sense?  Oh man, I don't like
> the mash-up of int, long, uint32, Size dynahash uses in various places
> and that are brought into relief by that comparison...

I just did some benchmarking with this patch using the same recovery
benchmark that I used in [1] and also the two patches that I posted in
[2]. Additionally, I added a PANIC at the end of recovery so that I
could repeat the recovery over and over again with the same WAL.

Without 0001-Remove-custom-fill-factor-support-from-dynahash.c.patch,
the time in seconds reported for recovery was as follows:

Run 1 65.2
Run 2 65.41
Run 3 65.1
Run 4 64.86
Run 5 62.29
Run 6 67.06
Run 7 63.35
Run 8 63.1
Run 9 62.8
Run 10 62.15

Avg 64.132
Med 64.105

After patching I got:

Run 1 60.69
Run 2 63.13
Run 3 63.24
Run 4 62.13
Run 5 63.81
Run 6 60.27
Run 7 62.85
Run 8 63.45
Run 9 59.6
Run 10 63.16

Avg 62.233
Med 62.99

I was quite happy to see 59.6 seconds in there on run 9 as I'd been
trying hard to get that benchmark to go below 1 minute on my machine.

perf appears to indicate that less time was spent in
hash_search_with_hash_value().

Unpatched:
  26.96%  postgres  postgres      [.] PageRepairFragmentation
  12.07%  postgres  libc-2.31.so  [.] __memmove_avx_unaligned_erms
  10.13%  postgres  postgres      [.] hash_search_with_hash_value
   6.70%  postgres  postgres      [.] XLogReadBufferExtended

Patched:
  27.70%  postgres  postgres      [.] PageRepairFragmentation
  13.50%  postgres  libc-2.31.so  [.] __memmove_avx_unaligned_erms
   9.63%  postgres  postgres      [.] hash_search_with_hash_value
   6.44%  postgres  postgres      [.] XLogReadBufferExtended

I looked over the patch and the only thing I saw was that we might
also want to remove the following line:

#define DEF_FFACTOR 1 /* default fill factor */

Also, a couple of things I noticed when looking at performance
profiles in detail:

1) The most expensive user of hash_search_with_hash_value() comes from
BufTableLookup(), which passes HASH_FIND as the HASHACTION.
2) The code that was doing the divide in the unpatched version only
triggered when HASHACTION was HASH_ENTER or HASH_ENTER_NULL.

The 2nd most costly call to hash_search_with_hash_value() came in via
hash_search() via smgropen().  That does use HASH_ENTER, which could
have triggered the divide code.  The main caller of smgropen() was
XLogReadBufferExtended().

So, it looks unlikely that any gains we are seeing are from improved
buffer lookups.  It's more likely they're coming from more optimal
XLogReadBufferExtended()

David

[1] https://www.postgresql.org/message-id/caaphdvokwqazhiuxet8jsqupjkdph8dnuzdfusx9p7dxrjd...@mail.gmail.com
[2] https://www.postgresql.org/message-id/CAApHDvo9n-nOb3b4PYFx%2Bw9mqd2SSUHm_oAs039eZnZLqFGcbQ%40mail.gmail.com
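To visualize David's points 1) and 2) (an editorial paraphrase of the pre-patch hash_search_with_hash_value() control flow, not a verbatim excerpt): the expansion test containing the division sits in the insert arm of the action switch, so HASH_FIND callers such as BufTableLookup() never reached it, while HASH_ENTER callers such as smgropen() did:

    switch (action)
    {
        case HASH_FIND:		/* e.g. BufTableLookup(): no divide on this path */
            /* ... return the existing entry, or NULL ... */
            break;

        case HASH_ENTER_NULL:
        case HASH_ENTER:	/* e.g. smgropen(): may create a new entry */
            /* ... after linking in the new entry, check whether to split;
             * this is where the idiv lived before the patch: */
            if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
                hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor &&
                !has_seq_scans(hashp))
                (void) expand_table(hashp);
            break;
    }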
Re: Division in dynahash.c due to HASH_FFACTOR
On Tue, Sep 8, 2020 at 11:17 PM Jakub Wartak wrote:
> I agree with both, I just thought it might be an interesting finding
> as this idiv might be (?) present in other common paths like
> ReadBuffer*() / PinBuffer() (some recent discussions, maybe on NUMA
> boxes), not just WAL recovery, as it seems relatively easy to improve.

I wrote a draft commit message for Jakub's proposed change (attached),
and did a little bit of testing, but I haven't seen a case where it
wins yet; I need to work on something else now but thought I'd share
this much anyway.  One observation is that the rule in init_htab had
better agree with the expansion rule in hash_search_with_hash_value,
otherwise you can size your hash table perfectly for n elements and
then it still has to expand when you insert the nth element, which is
why I changed >= to >.  Does this make sense?  Oh man, I don't like
the mash-up of int, long, uint32, Size dynahash uses in various places
and that are brought into relief by that comparison...

From e5d890bc1178861dc1a5b1f430289a5ee2b4a2fa Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Thu, 10 Sep 2020 12:27:25 +1200
Subject: [PATCH] Remove custom fill factor support from dynahash.c.

Since ancient times we have had support for a fill factor (maximum
load factor) to be set for a dynahash hash table, but:

1.  It had to be an integer value >= 1, whereas for in memory hash
    tables interesting load factor targets are probably somewhere near
    the 0.75-1.0 range.

2.  It was implemented in a way that performed an expensive division
    operation that regularly showed up in profiles.

3.  We are not aware of anyone ever having used a non-default value.

Therefore, remove support, making fill factor 1 as the implicit value.

Author: Jakub Wartak
Reviewed-by: Alvaro Herrera
Reviewed-by: Tomas Vondra
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
---
 src/backend/utils/hash/dynahash.c | 15 ++++-----------
 src/include/utils/hsearch.h       |  2 --
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index f4fbccdd7e..4aed0114fb 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -191,7 +191,6 @@ struct HASHHDR
 	Size		keysize;		/* hash key length in bytes */
 	Size		entrysize;		/* total user element size in bytes */
 	long		num_partitions; /* # partitions (must be power of 2), or 0 */
-	long		ffactor;		/* target fill factor */
 	long		max_dsize;		/* 'dsize' limit if directory is fixed size */
 	long		ssize;			/* segment size --- must be power of 2 */
 	int			sshift;			/* segment shift = log2(ssize) */
@@ -497,8 +496,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
 		/* ssize had better be a power of 2 */
 		Assert(hctl->ssize == (1L << hctl->sshift));
 	}
-	if (flags & HASH_FFACTOR)
-		hctl->ffactor = info->ffactor;
 
 	/*
 	 * SHM hash tables have fixed directory size passed by the caller.
@@ -603,8 +600,6 @@ hdefault(HTAB *hashp)
 
 	hctl->num_partitions = 0;	/* not partitioned */
 
-	hctl->ffactor = DEF_FFACTOR;
-
 	/* table has no fixed maximum size */
 	hctl->max_dsize = NO_MAX_DSIZE;
 
@@ -670,11 +665,10 @@ init_htab(HTAB *hashp, long nelem)
 			SpinLockInit(&(hctl->freeList[i].mutex));
 
 	/*
-	 * Divide number of elements by the fill factor to determine a desired
-	 * number of buckets.  Allocate space for the next greater power of two
-	 * number of buckets
+	 * Allocate space for the next greater power of two number of buckets,
+	 * assuming a desired maximum load factor of 1.
 	 */
-	nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
+	nbuckets = next_pow2_int(nelem);
 
 	/*
 	 * In a partitioned table, nbuckets must be at least equal to
@@ -733,7 +727,6 @@ init_htab(HTAB *hashp, long nelem)
 		   "DIRECTORY SIZE  ", hctl->dsize,
 		   "SEGMENT SIZE    ", hctl->ssize,
 		   "SEGMENT SHIFT   ", hctl->sshift,
-		   "FILL FACTOR     ", hctl->ffactor,
 		   "MAX BUCKET      ", hctl->max_bucket,
 		   "HIGH MASK       ", hctl->high_mask,
 		   "LOW MASK        ", hctl->low_mask,
@@ -975,7 +968,7 @@ hash_search_with_hash_value(HTAB *hashp,
 	 * order of these tests is to try to check cheaper conditions first.
 	 */
 	if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
-		hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor &&
+		hctl->freeList[0].nentries > (long) (hctl->max_bucket + 1) &&
 		!has_seq_scans(hashp))
 		(void) expand_table(hashp);
 }
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index f1deb9beab..bebf89b3c4 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -68,7 +68,6 @@ typedef struct HASHCTL
 	long		ssize;			/* segment size */
 	long		dsize;			/* (initial) directory size */
 	long		max_dsize;		/* limit to dsize if dir size is limited */
-	long		ffactor;		/* fill factor */
 	Size
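A quick sanity check on that init_htab hunk (editorial): with the only ffactor value anyone ever used, the old and new sizing formulas agree, so table geometry is unchanged by the patch:

    /* old, with ffactor == 1:  (nelem - 1) / 1 + 1 == nelem */
    nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
    /* new: identical result, e.g. nelem = 1000 -> 1024 */
    nbuckets = next_pow2_int(nelem);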
Re: Division in dynahash.c due to HASH_FFACTOR
Hi Thomas, Tomas :), Alvaro, hackers,

>> > After removing HASH_FFACTOR PostgreSQL still compiles... Would
>> > removing it break some external API/extensions ?
>>
>> FWIW, HASH_FFACTOR has *never* been used in Postgres core code. [..]
>> I think if we know that there's a 4% performance increase by removing
>> the division that comes with an ability to use a custom fillfactor that
>> nobody uses or has ever used, it makes sense to remove that ability.

Thomas wrote:
> I think Tomas is right, and the correct fix for this would be to
> compute it up front every time the hash table size changes, but since
> apparently no one ever used it, +1 for just removing it and leaving it
> hard-coded. [..]
> For what it's worth, for dshash.c I just hard coded it to double the
> hash table size whenever the number of elements in a partition
> exceeded 75%.  Popular hash tables in standard libraries seem to use
> either .75 or 1.0 as a default or only setting.  There's probably room
> for discussion about numbers in those ranges, (..)

Tomas also asked the same question ("what if you lower the fill factor,
e.g. to 0.8?  Does that improve performance or not?") and provided some
interesting links on avoiding benchmarking bias.

I couldn't find any other simple way to benchmark the real impact of
this other than checking DB crash-recovery (if anyone has some
artificial workload generator with PinBuffer->hash_search() please let
me know).  So I've been testing it using WAL crash-recovery with
Thomas's approach [1]: always the same workload to replay, always on
the same machine, same env: same 13GB WAL to recover, 8GB db, 24GB
shared buffers on top of 128GB RAM, NVMe with fsync off to put dynahash
as the primary bottleneck (BufTableLookup() + smgropen()), append-only
workload, gcc -O3.

Results are <1st time is the WAL recovery timing, 2nd time is the
checkpoint before opening the DB>.  There were four measurements per
scenario; the first one is cut off to avoid noise:

1. DEF_FFACTOR=1 idiv on
103.971, 3.719
103.937, 3.683
104.934, 3.691

2. DEF_FFACTOR=1 idiv off
100.565, 3.71
101.595, 3.807
100.794, 3.691

3. DEF_FFACTOR=1.2 idiv on // some proof that nobody uses it
1599552729.041 postmaster 94510 FATAL: failed to initialize hash table "SERIALIZABLEXID hash"

4. DEF_FFACTOR=0.8 idiv on // another proof?
The reason for the below is that ffactor is long and as such cannot
handle floats; it would have to be float:

Program terminated with signal 8, Arithmetic exception.
#0  0x009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
677		nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
Missing separate debuginfos, use: debuginfo-install glibc-2.17-292.180.amzn1.x86_64
#0  0x009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
#1  hash_create (tabname=tabname@entry=0xb77f9a "Timezones", nelem=nelem@entry=4,
    info=info@entry=0x7ffd3055cf30, flags=flags@entry=16) at dynahash.c:529
#2  0x009ecddf in init_timezone_hashtable () at pgtz.c:211
#3  pg_tzset (name=name@entry=0xb49fd5 "GMT") at pgtz.c:248
#4  0x009ecfee in pg_timezone_initialize () at pgtz.c:372

5. DEF_FFACTOR=1 idiv on // just-in-case reproducer to validate measurement #1
104.333, 3.804
104.313, 3.772
103.951, 3.687

6. DEF_FFACTOR=2 idiv on
105.406, 3.783
105.33, 3.721
105.403, 3.777

7. DEF_FFACTOR=4 idiv on
105.803, 3.722
105.73, 3.728
106.104, 3.756

Some notes:
- SMgrRelationHash is initialized from the start at 400, while the DB
  here is small (8GB) and has only a couple of tables -> no expansion
  needed in the above test.
- The local backend-private overflow hash table for buffers,
  PrivateRefCountHash, is initialized at 100 and maybe it grows during
- Googling for DEF_FFACTOR I've found only one entry with a value <> 1,
  from 1993 (see [2]; what's interesting is the same DEF_SEGSIZE value
  after all those years).

Alvaro wrote:
>> ... however, if we're really tense about improving some hash table's
>> performance, it might be worth trying to use simplehash.h instead of
>> dynahash.c for it.

Thomas wrote:
> Sure, but removing the dynahash IDIV also seems like a slam dunk
> removal of a useless expensive feature.

I agree with both, I just thought it might be an interesting finding as
this idiv might be (?) present in other common paths like ReadBuffer*()
/ PinBuffer() (some recent discussions, maybe on NUMA boxes), not just
WAL recovery, as it seems relatively easy to improve.

-J.

[1] - https://github.com/macdice/redo-bench
[2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74
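Spelling out why case 4 dies (an editorial, standalone demo; the long type and the sizing expression mirror dynahash.c): initializing a long from 0.8 truncates it to 0, so the bucket-sizing expression performs an integer division by zero, which on x86 (compiled without optimization) raises SIGFPE -- the "signal 8, Arithmetic exception" in the backtrace above:

    #include <stdio.h>

    int
    main(void)
    {
        long	ffactor = 0.8;	/* implicit conversion truncates to 0 */
        long	nelem = 4;

        printf("ffactor = %ld\n", ffactor);	/* prints 0 */
        return (int) ((nelem - 1) / ffactor + 1);	/* integer divide by zero: SIGFPE */
    }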
Re: Division in dynahash.c due to HASH_FFACTOR
On Sat, Sep 5, 2020 at 2:34 AM Alvaro Herrera wrote:
> On 2020-Sep-04, Jakub Wartak wrote:
> > After removing HASH_FFACTOR PostgreSQL still compiles... Would
> > removing it break some external API/extensions ?
>
> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
>
> https://postgr.es/m/20160418180711.55ac82c0@fujitsu already reported
> that this flag was unused a few years ago.  And a search through
> codesearch.debian.net shows that no software packaged by Debian uses
> that flag either.
>
> *If* any third-party extension is using HASH_FFACTOR, it can easily be
> unbroken by removing the flag or #defining it to zero; the removal would
> only be a problem if anyone showed that there is a performance loss by
> using the default fillfactor for some dynahash table instead of their
> custom value.
>
> I think if we know that there's a 4% performance increase by removing
> the division that comes with an ability to use a custom fillfactor that
> nobody uses or has ever used, it makes sense to remove that ability.

I think Tomas is right, and the correct fix for this would be to
compute it up front every time the hash table size changes, but since
apparently no one ever used it, +1 for just removing it and leaving it
hard-coded.

For what it's worth, for dshash.c I just hard coded it to double the
hash table size whenever the number of elements in a partition
exceeded 75%.  Popular hash tables in standard libraries seem to use
either .75 or 1.0 as a default or only setting.  There's probably room
for discussion about numbers in those ranges, but I think the concept
of customisable load factors with a range much wider than that may be
more relevant for disk-based hash tables (like our hash indexes), which
have very different economics.  I think the name HASH_FFACTOR may be a
clue that this was possibly interference from disk-based hash tables
from the same Berkeley people: rather than "load factor", the more
usual term (?) for nentries/nbuckets in memory-based hash tables as a
way to model the number of expected collisions, they called it "fill
factor".  I guess that's because in disk-based designs you actually
want a certain rate of collisions: you're working not with elements in
an array, trying to avoid collisions without wasting space, but with
fixed-size buckets that you want to fill up, yet not overflow.

> ... however, if we're really tense about improving some hash table's
> performance, it might be worth trying to use simplehash.h instead of
> dynahash.c for it.

Sure, but removing the dynahash IDIV also seems like a slam dunk
removal of a useless expensive feature.
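For the record (an editorial sketch, not from dshash.c or any patch in this thread): a fractional load factor target like 0.75 needs neither division nor floating point on the hot path, since the threshold test can cross-multiply, and with power-of-two bucket counts even the multiplies can become shifts:

    /* grow when nentries / nbuckets > 3/4, i.e. when 4 * nentries > 3 * nbuckets */
    if (nentries * 4 > nbuckets * 3)
        (void) expand_table(hashp);

    /* or, since nbuckets is a power of two: nbuckets/2 + nbuckets/4 == nbuckets * 0.75 */
    if (nentries > (nbuckets >> 1) + (nbuckets >> 2))
        (void) expand_table(hashp);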
Re: Division in dynahash.c due to HASH_FFACTOR
On 2020-Sep-04, Jakub Wartak wrote:
> After removing HASH_FFACTOR PostgreSQL still compiles... Would
> removing it break some external API/extensions ?

FWIW, HASH_FFACTOR has *never* been used in Postgres core code.

https://postgr.es/m/20160418180711.55ac82c0@fujitsu already reported
that this flag was unused a few years ago.  And a search through
codesearch.debian.net shows that no software packaged by Debian uses
that flag either.

*If* any third-party extension is using HASH_FFACTOR, it can easily be
unbroken by removing the flag or #defining it to zero; the removal would
only be a problem if anyone showed that there is a performance loss by
using the default fillfactor for some dynahash table instead of their
custom value.

I think if we know that there's a 4% performance increase by removing
the division that comes with an ability to use a custom fillfactor that
nobody uses or has ever used, it makes sense to remove that ability.

... however, if we're really tense about improving some hash table's
performance, it might be worth trying to use simplehash.h instead of
dynahash.c for it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
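To make the unbreaking recipe concrete (hypothetical extension code; it relies only on HASH_FFACTOR having been a plain #define flag in utils/hsearch.h, so its absence is detectable at compile time): a third-party extension could stay compatible with both old and new headers like this, though any assignment to the removed ffactor field would need the same #ifdef treatment:

    #include "utils/hsearch.h"

    #ifndef HASH_FFACTOR
    #define HASH_FFACTOR 0		/* removed from core: OR-ing in 0 is a no-op */
    #endif

    htab = hash_create("my extension table", 1024, &info,
                       HASH_ELEM | HASH_BLOBS | HASH_FFACTOR);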
Re: Division in dynahash.c due to HASH_FFACTOR
On Fri, Sep 04, 2020 at 07:01:41AM +0000, Jakub Wartak wrote:
> Greetings hackers,
>
> I have mixed feelings whether this is a welcome contribution, as the
> potential gain is relatively small in my tests, but still I would like
> to point out that the HASH_FFACTOR functionality from dynahash.c could
> be removed or optimized (the default fill factor is always 1; there's
> not a single place that uses a custom fill factor other than
> DEF_FFACTOR=1 inside the PostgreSQL repository).  Because the
> functionality is present, there seems to be a division for every
> buffer access [BufTableLookup()] / or every smgropen() call (every
> call to hash_search() is affected, provided it's not
> ShmemInitHash/HASH_PARTITION).  This division is especially visible
> via perf on the single-process StartupXLOG WAL recovery process on a
> standby in heavy-duty 100% CPU conditions, as the top1 is inside
> hash_search:
>
> 0x00888751 <+449>: idiv   r8
> 0x00888754 <+452>: cmp    rax,QWORD PTR [r15+0x338]  <<-- in perf
>   annotate shows as 30-40%, even on default -O2, probably CPU
>   pipelining for the idiv above
>
> I've made a PoC test to skip that division assuming ffactor would be
> gone:
>
>  if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
> -     hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor &&
> +     hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1) &&
>
> For a stream of 3.7GB of WAL I'm getting a consistent improvement of
> ~4% (yes, I know it's small; that's why I'm having mixed feelings):
> gcc -O3: 104->100s
> gcc -O2: 108->104s
> pgbench -S -c 16 -j 4 -T 30 -M prepared: stays more or less the same
> (-s 100), so no positive impact there

Hmm, so it's the division that makes the difference?  In that case,
wouldn't it make sense to just compute a threshold every time the hash
table is resized, and then do just the comparison?  That is, compute

    nentries_threshold = (long) (hctl->max_bucket + 1) * hctl->ffactor;

and then do just

    hctl->freeList[0].nentries >= nentries_threshold

Of course, this assumes the threshold is calculated only rarely; maybe
that's not the case.

Also, what if you lower the fill factor, e.g. to 0.8?  Does that
improve performance or not?  I can't find any discussion about this in
the archives, but the dynamic hashing paper [1] seems to suggest it
does make a difference (see the tables at the end, but I haven't
re-read the paper recently so maybe it's irrelevant).  Anyway, it'd be
foolish to remove the ffactor parameter to save on division only to
lose the ability to lower the factor and save more than that ...

As for the 4% - it's not much, but it's also not negligible.  I'm sure
we've done micro-optimizations for smaller gains.  The big question is
whether this is statistically significant, or whether it's just due to
random effects.  It could easily be caused by changes in the layout of
the binary, for example - that can happen quite easily.  See [2] and
[3].  The talk actually mentions a tool meant to eliminate this bias by
randomization, but I've never tried using it on PostgreSQL so I'm not
sure how compatible it is :-(

[1] https://www.csd.uoc.gr/~hy460/pdf/Dynamic%20Hash%20Tables.pdf
[2] https://www.cis.upenn.edu/~cis501/previous/fall2016/papers/producing-wrong-data.pdf
[3] https://www.youtube.com/watch?v=r-TLSBdHe1A

> After removing HASH_FFACTOR PostgreSQL still compiles... Would
> removing it break some external API/extensions ?  I saw several
> optimizations for the "idiv", e.g. see
> https://github.com/ridiculousfish/libdivide.  Or maybe there is some
> other idea to expose bottlenecks of BufTableLookup() ?  I also saw the
> codepath PinBuffer() -> GetPrivateRefCountEntry() -> dynahash that
> could be called pretty often.  I have no idea what kind of pgbench
> stresstest could be used to demonstrate the gain (or lack of it).
>
> -Jakub Wartak.

I don't think this would break a lot of stuff, but I'm kinda dubious
it's a measurable improvement for common workloads ...

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
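A minimal sketch of Tomas's precomputed-threshold idea (editorial; the field name nentries_threshold comes from his suggestion above, while the placement of the recomputation is an assumption): the division runs only when the bucket count changes, leaving a plain comparison on every insert:

    /* in HASHHDR (sketch): kept alongside ffactor */
    long	nentries_threshold;	/* == (max_bucket + 1) * ffactor */

    /* recomputed only where max_bucket changes, e.g. in init_htab()
     * and expand_table() -- rare, amortized events */
    hctl->nentries_threshold = (long) (hctl->max_bucket + 1) * hctl->ffactor;

    /* hot path in hash_search_with_hash_value(): comparison only */
    if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
        hctl->freeList[0].nentries >= hctl->nentries_threshold &&
        !has_seq_scans(hashp))
        (void) expand_table(hashp);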