On Tue, Sep 8, 2020 at 11:17 PM Jakub Wartak <jakub.war...@tomtom.com> wrote: > I agree with both, I just thought it might be interesting finding as this > idiv might be (?) present in other common paths like ReadBuffer*() / > PinBuffer() (some recent discussions, maybe on NUMA boxes), not just WAL > recovery as it seems relatively easy to improve.
I wrote a draft commit message for Jakub's proposed change (attached), and did a little bit of testing, but I haven't seen a case where it wins yet; I need to work on something else now but thought I'd share this much anyway. One observation is that the rule in init_htab had better agree with the expansion rule in hash_search_with_hash_value, otherwise you can size your hash table perfectly for n elements and then it still has to expand when you insert the nth element, which is why I changed >= to >. Does this make sense? Oh man, I don't like the mash-up of int, long, uint32, Size dynahash uses in various places and that are brought into relief by that comparison...
From e5d890bc1178861dc1a5b1f430289a5ee2b4a2fa Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Thu, 10 Sep 2020 12:27:25 +1200 Subject: [PATCH] Remove custom fill factor support from dynahash.c. Since ancient times we have had support for a fill factor (maximum load factor) to be set for a dynahash hash table, but: 1. It had to be an integer value >= 1, whereas for in memory hash tables interesting load factor targets are probably somewhere near the 0.75-1.0 range. 2. It was implemented in a way that performed an expensive division operation that regularly showed up in profiles. 3. We are not aware of anyone ever having used a non-default value. Therefore, remove support, making fill factor 1 as the implicit value. Author: Jakub Wartak <jakub.war...@tomtom.com> Reviewed-by: Alvaro Herrera <alvhe...@2ndquadrant.com> Reviewed-by: Tomas Vondra <tomas.von...@2ndquadrant.com> Reviewed-by: Thomas Munro <thomas.mu...@gmail.com> Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com --- src/backend/utils/hash/dynahash.c | 15 ++++----------- src/include/utils/hsearch.h | 2 -- 2 files changed, 4 insertions(+), 13 deletions(-) diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c index f4fbccdd7e..4aed0114fb 100644 --- a/src/backend/utils/hash/dynahash.c +++ b/src/backend/utils/hash/dynahash.c @@ -191,7 +191,6 @@ struct HASHHDR Size keysize; /* hash key length in bytes */ Size entrysize; /* total user element size in bytes */ long num_partitions; /* # partitions (must be power of 2), or 0 */ - long ffactor; /* target fill factor */ long max_dsize; /* 'dsize' limit if directory is fixed size */ long ssize; /* segment size --- must be power of 2 */ int sshift; /* segment shift = log2(ssize) */ @@ -497,8 +496,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags) /* ssize had better be a power of 2 */ Assert(hctl->ssize == (1L << hctl->sshift)); } - if (flags & HASH_FFACTOR) - hctl->ffactor = info->ffactor; /* * SHM hash tables have fixed directory size passed by the caller. @@ -603,8 +600,6 @@ hdefault(HTAB *hashp) hctl->num_partitions = 0; /* not partitioned */ - hctl->ffactor = DEF_FFACTOR; - /* table has no fixed maximum size */ hctl->max_dsize = NO_MAX_DSIZE; @@ -670,11 +665,10 @@ init_htab(HTAB *hashp, long nelem) SpinLockInit(&(hctl->freeList[i].mutex)); /* - * Divide number of elements by the fill factor to determine a desired - * number of buckets. Allocate space for the next greater power of two - * number of buckets + * Allocate space for the next greater power of two number of buckets, + * assuming a desired maximum load factor of 1. */ - nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1); + nbuckets = next_pow2_int(nelem); /* * In a partitioned table, nbuckets must be at least equal to @@ -733,7 +727,6 @@ init_htab(HTAB *hashp, long nelem) "DIRECTORY SIZE ", hctl->dsize, "SEGMENT SIZE ", hctl->ssize, "SEGMENT SHIFT ", hctl->sshift, - "FILL FACTOR ", hctl->ffactor, "MAX BUCKET ", hctl->max_bucket, "HIGH MASK ", hctl->high_mask, "LOW MASK ", hctl->low_mask, @@ -975,7 +968,7 @@ hash_search_with_hash_value(HTAB *hashp, * order of these tests is to try to check cheaper conditions first. */ if (!IS_PARTITIONED(hctl) && !hashp->frozen && - hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor && + hctl->freeList[0].nentries > (long) (hctl->max_bucket + 1) && !has_seq_scans(hashp)) (void) expand_table(hashp); } diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h index f1deb9beab..bebf89b3c4 100644 --- a/src/include/utils/hsearch.h +++ b/src/include/utils/hsearch.h @@ -68,7 +68,6 @@ typedef struct HASHCTL long ssize; /* segment size */ long dsize; /* (initial) directory size */ long max_dsize; /* limit to dsize if dir size is limited */ - long ffactor; /* fill factor */ Size keysize; /* hash key length in bytes */ Size entrysize; /* total user element size in bytes */ HashValueFunc hash; /* hash function */ @@ -83,7 +82,6 @@ typedef struct HASHCTL #define HASH_PARTITION 0x0001 /* Hashtable is used w/partitioned locking */ #define HASH_SEGMENT 0x0002 /* Set segment size */ #define HASH_DIRSIZE 0x0004 /* Set directory size (initial and max) */ -#define HASH_FFACTOR 0x0008 /* Set fill factor */ #define HASH_ELEM 0x0010 /* Set keysize and entrysize */ #define HASH_BLOBS 0x0020 /* Select support functions for binary keys */ #define HASH_FUNCTION 0x0040 /* Set user defined hash function */ -- 2.20.1