Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-18 Thread Thomas Munro
On Sat, Sep 19, 2020 at 1:30 PM Tom Lane  wrote:
> I wrote:
> > ISTM that getting rid of the division obviates the concern that the
> > nentries condition is too expensive,

True, that comment needed to go.

> Also, we could make it slightly cheaper yet, by changing the condition
> to
>
> hctl->freeList[0].nentries > (long) (hctl->max_bucket)
>
> ie drop the +1.  I'd argue that this is actually a more faithful
> rendition of the previous condition, since what we had before was
> basically
>
> hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1)

Agreed, and done.  Thanks!




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-18 Thread Tom Lane
I wrote:
> ISTM that getting rid of the division obviates the concern that the
> nentries condition is too expensive,

Also, we could make it slightly cheaper yet, by changing the condition
to

hctl->freeList[0].nentries > (long) (hctl->max_bucket)

ie drop the +1.  I'd argue that this is actually a more faithful
rendition of the previous condition, since what we had before was
basically

hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1)
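
A quick standalone check of that equivalence (a throwaway test harness,
not part of any patch; ffactor is hard-coded to 1, as in the committed
change):

#include <assert.h>

int
main(void)
{
	long		nentries;
	long		max_bucket;

	/* exhaustively compare the two forms for small non-negative values */
	for (nentries = 0; nentries < 4096; nentries++)
		for (max_bucket = 0; max_bucket < 256; max_bucket++)
		{
			/* old: nentries / (max_bucket + 1) >= ffactor, with ffactor == 1 */
			int			old_form = nentries / (max_bucket + 1) >= 1;

			/* new: no division and no +1 */
			int			new_form = nentries > max_bucket;

			assert(old_form == new_form);
		}
	return 0;
}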

regards, tom lane




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-18 Thread Tom Lane
Thomas Munro  writes:
> Pushed.  Thanks Jakub, everyone.

BTW, looking over this patch, I wonder about

/*
 * Can't split if running in partitioned mode, nor if frozen, nor if
 * table is the subject of any active hash_seq_search scans.  Strange
 * order of these tests is to try to check cheaper conditions first.
 */
if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
hctl->freeList[0].nentries > (long) (hctl->max_bucket + 1) &&
!has_seq_scans(hashp))
(void) expand_table(hashp);

ISTM that getting rid of the division obviates the concern that the
nentries condition is too expensive, and therefore we should revert
to checking it first, on the grounds that that condition is most
likely to fail.

regards, tom lane




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-18 Thread Thomas Munro
On Tue, Sep 15, 2020 at 9:56 AM Thomas Munro  wrote:
> > I looked over the patch and the only thing I saw was that we might
> > also want to remove the following line:
> >
> > #define DEF_FFACTOR 1 /* default fill factor */
>
> Right, thanks.  Fixed in the attached.

Pushed.  Thanks Jakub, everyone.

Separately, we really should tidy up the int/long/uint32/size_t
confusion in this module.  I know we have K&R C-era long-itude in
numerous other modules, but this one is a little more egregious in its
data type mixing.




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-14 Thread Thomas Munro
On Mon, Sep 14, 2020 at 11:35 PM David Rowley  wrote:
> I just did some benchmarking with this patch using the same recovery
> benchmark that I used in [1] and also the two patches that I posted in
> [2]. Additionally, I added a PANIC at the end of recovery so that I
> could repeat the recovery over and over again with the same WAL.
>
> [data]

    N           Min           Max        Median           Avg        Stddev
x  10         62.15         67.06         64.86        64.132     1.6188528
+  10          59.6         63.81         63.13        62.233     1.4983031
Difference at 95.0% confidence
        -1.899 +/- 1.46553
        -2.96108% +/- 2.28517%
        (Student's t, pooled s = 1.55974)

Thanks!  Hmm, small but apparently significant and in line with
Jakub's report, and I suppose the effect will be greater with other
nearby recovery performance patches applied that halve the times.
Annoyingly, I can't reproduce this speedup on my local i9-9900; maybe
it requires a different CPU...

> I looked over the patch and the only thing I saw was that we might
> also want to remove the following line:
>
> #define DEF_FFACTOR 1 /* default fill factor */

Right, thanks.  Fixed in the attached.

> The 2nd most costly call to hash_search_with_hash_value() came in via
> hash_search() via smgropen(). That does use HASH_ENTER, which could
> have triggered the divide code. The main caller of smgropen() was
> XLogReadBufferExtended().
>
> So, it looks unlikely that any gains we are seeing are from improved
> buffer lookups. It's more likely they're coming from a more optimal
> XLogReadBufferExtended().

I think we call smgropen() twice for every buffer referenced in the
WAL: XLogReadBufferExtended() and again in
ReadBufferWithoutRelcache().  We could reduce it to once with some
refactoring, but I am looking into whether I can reduce it to zero as
a side-effect of another change, more soon...
From efecf68b159a3c65517e91076009cb4e5cc6f157 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 10 Sep 2020 12:27:25 +1200
Subject: [PATCH v2] Remove custom fill factor support from dynahash.c.

Since ancient times we have had support for a fill factor (maximum load
factor) to be set for a dynahash hash table, but:

1.  It had to be an integer value >= 1, whereas for in-memory hash
tables interesting load factor targets are probably somewhere near the
0.75-1.0 range.

2.  It was implemented in a way that performed an expensive division
operation that regularly showed up in profiles.

3.  We are not aware of anyone ever having used a non-default value.

Therefore, remove support, making fill factor 1 the implicit value.

Author: Jakub Wartak 
Reviewed-by: Alvaro Herrera 
Reviewed-by: Tomas Vondra 
Reviewed-by: Thomas Munro 
Reviewed-by: David Rowley 
Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
---
 src/backend/utils/hash/dynahash.c | 20 ++--
 src/include/utils/hsearch.h   |  2 --
 2 files changed, 6 insertions(+), 16 deletions(-)

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index f4fbccdd7e..1122e2e5e5 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -122,7 +122,6 @@
 #define DEF_SEGSIZE			   256
 #define DEF_SEGSIZE_SHIFT	   8	/* must be log2(DEF_SEGSIZE) */
 #define DEF_DIRSIZE			   256
-#define DEF_FFACTOR			   1	/* default fill factor */
 
 /* Number of freelists to be used for a partitioned hash table. */
 #define NUM_FREELISTS			32
@@ -191,7 +190,6 @@ struct HASHHDR
 	Size		keysize;		/* hash key length in bytes */
 	Size		entrysize;		/* total user element size in bytes */
 	long		num_partitions; /* # partitions (must be power of 2), or 0 */
-	long		ffactor;		/* target fill factor */
 	long		max_dsize;		/* 'dsize' limit if directory is fixed size */
 	long		ssize;			/* segment size --- must be power of 2 */
 	int			sshift;			/* segment shift = log2(ssize) */
@@ -497,8 +495,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
 		/* ssize had better be a power of 2 */
 		Assert(hctl->ssize == (1L << hctl->sshift));
 	}
-	if (flags & HASH_FFACTOR)
-		hctl->ffactor = info->ffactor;
 
 	/*
 	 * SHM hash tables have fixed directory size passed by the caller.
@@ -603,8 +599,6 @@ hdefault(HTAB *hashp)
 
 	hctl->num_partitions = 0;	/* not partitioned */
 
-	hctl->ffactor = DEF_FFACTOR;
-
 	/* table has no fixed maximum size */
 	hctl->max_dsize = NO_MAX_DSIZE;
 
@@ -670,11 +664,10 @@ init_htab(HTAB *hashp, long nelem)
 			SpinLockInit(&(hctl->freeList[i].mutex));
 
 	/*
-	 * Divide number of elements by the fill factor to determine a desired
-	 * number of buckets.  Allocate space for the next greater power of two
-	 * number of buckets
+	 * Allocate space for the next greater power of two number of buckets,
+	 * assuming a desired maximum load factor of 1.
 	 */
-	nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
+	nbuckets = next_pow2_int(nelem);

Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-14 Thread David Rowley
On Thu, 10 Sep 2020 at 14:55, Thomas Munro  wrote:
>
> I wrote a draft commit message for Jakub's proposed change (attached),
> and did a little bit of testing, but I haven't seen a case where it
> wins yet; I need to work on something else now but thought I'd share
> this much anyway.  One observation is that the rule in init_htab had
> better agree with the expansion rule in hash_search_with_hash_value,
> otherwise you can size your hash table perfectly for n elements and
> then it still has to expand when you insert the nth element, which is
> why I changed >= to >.  Does this make sense?  Oh man, I don't like
> the mash-up of int, long, uint32 and Size that dynahash uses in various
> places, which is brought into relief by that comparison...

I just did some benchmarking with this patch using the same recovery
benchmark that I used in [1] and also the two patches that I posted in
[2]. Additionally, I added a PANIC at the end of recovery so that I
could repeat the recovery over and over again with the same WAL.

Without 0001-Remove-custom-fill-factor-support-from-dynahash.c.patch,
the time in seconds reported for recovery was as follows:

Run 1 65.2
Run 2 65.41
Run 3 65.1
Run 4 64.86
Run 5 62.29
Run 6 67.06
Run 7 63.35
Run 8 63.1
Run 9 62.8
Run 10 62.15

Avg 64.132
Med 64.105

After patching I got:

Run 1 60.69
Run 2 63.13
Run 3 63.24
Run 4 62.13
Run 5 63.81
Run 6 60.27
Run 7 62.85
Run 8 63.45
Run 9 59.6
Run 10 63.16

Avg 62.233
Med 62.99

I was quite happy to see 59.6 seconds in there on run 9 as I'd been
trying hard to get that benchmark to go below 1 minute on my machine.

perf appears to indicate that less time was spent in
hash_search_with_hash_value()

Unpatched:
  26.96%  postgres  postgres[.] PageRepairFragmentation
  12.07%  postgres  libc-2.31.so[.] __memmove_avx_unaligned_erms
  10.13%  postgres  postgres[.] hash_search_with_hash_value
   6.70%  postgres  postgres[.] XLogReadBufferExtended

Patched:
  27.70%  postgres  postgres[.] PageRepairFragmentation
  13.50%  postgres  libc-2.31.so[.] __memmove_avx_unaligned_erms
   9.63%  postgres  postgres[.] hash_search_with_hash_value
   6.44%  postgres  postgres[.] XLogReadBufferExtended

I looked over the patch and the only thing I saw was that we might
also want to remove the following line:

#define DEF_FFACTOR 1 /* default fill factor */

Also, a couple of things I noticed when looking at performance
profiles in detail.

1) The most expensive user of hash_search_with_hash_value() comes from
BufTableLookup() which passes HASH_FIND as the HASHACTION.
2) The code that was doing the divide in the unpatched version only
triggered when HASHACTION was HASH_ENTER or HASH_ENTER_NULL.

The 2nd most costly call to hash_search_with_hash_value() came in via
hash_search() via smgropen(). That does use HASH_ENTER, which could
have triggered the divide code. The main caller of smgropen() was
XLogReadBufferExtended().

So, it looks unlikely that any gains we are seeing are from improved
buffer lookups. It's more likely they're coming from a more optimal
XLogReadBufferExtended().

David

[1] 
https://www.postgresql.org/message-id/caaphdvokwqazhiuxet8jsqupjkdph8dnuzdfusx9p7dxrjd...@mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAApHDvo9n-nOb3b4PYFx%2Bw9mqd2SSUHm_oAs039eZnZLqFGcbQ%40mail.gmail.com




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-09 Thread Thomas Munro
On Tue, Sep 8, 2020 at 11:17 PM Jakub Wartak  wrote:
> I agree with both; I just thought it might be an interesting finding, as this
> idiv might be (?) present in other common paths like ReadBuffer*() /
> PinBuffer() (some recent discussions, maybe on NUMA boxes), not just WAL
> recovery, and it seems relatively easy to improve.

I wrote a draft commit message for Jakub's proposed change (attached),
and did a little bit of testing, but I haven't seen a case where it
wins yet; I need to work on something else now but thought I'd share
this much anyway.  One observation is that the rule in init_htab had
better agree with the expansion rule in hash_search_with_hash_value,
otherwise you can size your hash table perfectly for n elements and
then it still has to expand when you insert the nth element, which is
why I changed >= to >.  Does this make sense?  Oh man, I don't like
the mash-up of int, long, uint32 and Size that dynahash uses in various
places, which is brought into relief by that comparison...
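
Concretely (made-up numbers, for illustration only): with
hash_create(..., nelem = 1024, ...), init_htab picks nbuckets =
next_pow2_int(1024) = 1024, so max_bucket = 1023.  Inserting the 1024th
element brings nentries to 1024, and then

    nentries >= max_bucket + 1:  1024 >= 1024 is true, expanding a table
                                 sized exactly for 1024 elements
    nentries >  max_bucket + 1:  1024 >  1024 is false, no expansion

which is the off-by-one that switching to ">" avoids.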
From e5d890bc1178861dc1a5b1f430289a5ee2b4a2fa Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 10 Sep 2020 12:27:25 +1200
Subject: [PATCH] Remove custom fill factor support from dynahash.c.

Since ancient times we have had support for a fill factor (maximum load
factor) to be set for a dynahash hash table, but:

1.  It had to be an integer value >= 1, whereas for in-memory hash
tables interesting load factor targets are probably somewhere near the
0.75-1.0 range.

2.  It was implemented in a way that performed an expensive division
operation that regularly showed up in profiles.

3.  We are not aware of anyone ever having used a non-default value.

Therefore, remove support, making fill factor 1 the implicit value.

Author: Jakub Wartak 
Reviewed-by: Alvaro Herrera 
Reviewed-by: Tomas Vondra 
Reviewed-by: Thomas Munro 
Discussion: https://postgr.es/m/VI1PR0701MB696044FC35013A96FECC7AC8F62D0%40VI1PR0701MB6960.eurprd07.prod.outlook.com
---
 src/backend/utils/hash/dynahash.c | 15 ---
 src/include/utils/hsearch.h   |  2 --
 2 files changed, 4 insertions(+), 13 deletions(-)

diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index f4fbccdd7e..4aed0114fb 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -191,7 +191,6 @@ struct HASHHDR
 	Size		keysize;		/* hash key length in bytes */
 	Size		entrysize;		/* total user element size in bytes */
 	long		num_partitions; /* # partitions (must be power of 2), or 0 */
-	long		ffactor;		/* target fill factor */
 	long		max_dsize;		/* 'dsize' limit if directory is fixed size */
 	long		ssize;			/* segment size --- must be power of 2 */
 	int			sshift;			/* segment shift = log2(ssize) */
@@ -497,8 +496,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
 		/* ssize had better be a power of 2 */
 		Assert(hctl->ssize == (1L << hctl->sshift));
 	}
-	if (flags & HASH_FFACTOR)
-		hctl->ffactor = info->ffactor;
 
 	/*
 	 * SHM hash tables have fixed directory size passed by the caller.
@@ -603,8 +600,6 @@ hdefault(HTAB *hashp)
 
 	hctl->num_partitions = 0;	/* not partitioned */
 
-	hctl->ffactor = DEF_FFACTOR;
-
 	/* table has no fixed maximum size */
 	hctl->max_dsize = NO_MAX_DSIZE;
 
@@ -670,11 +665,10 @@ init_htab(HTAB *hashp, long nelem)
 			SpinLockInit(&(hctl->freeList[i].mutex));
 
 	/*
-	 * Divide number of elements by the fill factor to determine a desired
-	 * number of buckets.  Allocate space for the next greater power of two
-	 * number of buckets
+	 * Allocate space for the next greater power of two number of buckets,
+	 * assuming a desired maximum load factor of 1.
 	 */
-	nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
+	nbuckets = next_pow2_int(nelem);
 
 	/*
 	 * In a partitioned table, nbuckets must be at least equal to
@@ -733,7 +727,6 @@ init_htab(HTAB *hashp, long nelem)
 			"DIRECTORY SIZE  ", hctl->dsize,
 			"SEGMENT SIZE", hctl->ssize,
 			"SEGMENT SHIFT   ", hctl->sshift,
-			"FILL FACTOR ", hctl->ffactor,
 			"MAX BUCKET  ", hctl->max_bucket,
 			"HIGH MASK   ", hctl->high_mask,
 			"LOW  MASK   ", hctl->low_mask,
@@ -975,7 +968,7 @@ hash_search_with_hash_value(HTAB *hashp,
 		 * order of these tests is to try to check cheaper conditions first.
 		 */
 		if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
-			hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor &&
+			hctl->freeList[0].nentries > (long) (hctl->max_bucket + 1) &&
 			!has_seq_scans(hashp))
 			(void) expand_table(hashp);
 	}
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index f1deb9beab..bebf89b3c4 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -68,7 +68,6 @@ typedef struct HASHCTL
 	long		ssize;			/* segment size */
 	long		dsize;			/* (initial) directory size */
 	long		max_dsize;		/* limit to dsize if dir size is limited */
-	long		ffactor;		/* fill factor */
 	Size		

Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-08 Thread Jakub Wartak
Hi Thomas, Tomas :), Alvaro, hackers,

>> > After removing HASH_FFACTOR PostgreSQL still compiles...  Would
>> > removing it break some external API/extensions ?
>>
>> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
[..]
>> I think if we know that there's a 4% performance increase by removing
>> the division that comes with an ability to use a custom fillfactor that
>> nobody uses or has ever used, it makes sense to remove that ability.

Thomas wrote:
>I think Tomas is right, and the correct fix for this would be to
>compute it up front every time the hash table size changes, but since
>apparently no one ever used it, +1 for just removing it and leaving it
>hard-coded.
[..]
>For what it's worth, for dshash.c I just hard coded it to double the
>hash table size whenever the number of elements in a partition
>exceeded 75%.  Popular hash tables in standard libraries seem to use
>either .75 or 1.0 as a default or only setting.  There's probably room
>for discussion about numbers in those ranges, (..)

Tomas also asked "what if you lower the fill factor, e.g. to 0.8? Does that
improve performance or not?" and shared some interesting links about avoiding
benchmarking bias. I couldn't find any other simple way to benchmark its real
impact than checking DB crash-recovery (if anyone has an artificial workload
generator that stresses PinBuffer()->hash_search(), please let me know).

So I've been testing it using WAL crash-recovery, following Thomas's approach
[1]: always the same workload to replay, always on the same machine, same
environment: the same 13GB of WAL to recover, an 8GB database, 24GB shared
buffers on top of 128GB RAM, NVMe with fsync off to make dynahash the primary
bottleneck (BufTableLookup() + smgropen()), append-only workload, gcc -O3.
In the results below, the first time is the WAL recovery timing and the second
is the checkpoint before opening the DB. There were four measurements per
scenario; the first one is discarded to avoid noise:

1. DEF_FFACTOR=1 idiv on
103.971, 3.719
103.937, 3.683
104.934, 3.691

2. DEF_FFACTOR=1 idiv off
100.565, 3.71
101.595, 3.807
100.794, 3.691

3. DEF_FFACTOR=1.2 idiv on // some proof that nobody uses it
1599552729.041 postmaster 94510 FATAL:  failed to initialize hash table 
"SERIALIZABLEXID hash"

4. DEF_FFACTOR=0.8 idiv on // another proof? The crash below happens because
ffactor is a long and as such cannot hold fractional values: 0.8 truncates to
0, and init_htab then divides by zero. It would only work if ffactor were a
float.

Program terminated with signal 8, Arithmetic exception.
#0  0x009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
677 nbuckets = next_pow2_int((nelem - 1) / hctl->ffactor + 1);
Missing separate debuginfos, use: debuginfo-install 
glibc-2.17-292.180.amzn1.x86_64

#0  0x009a4119 in init_htab (nelem=4, hashp=0x17cd238) at dynahash.c:677
#1  hash_create (tabname=tabname@entry=0xb77f9a "Timezones", 
nelem=nelem@entry=4, info=info@entry=0x7ffd3055cf30, flags=flags@entry=16) at 
dynahash.c:529
#2  0x009ecddf in init_timezone_hashtable () at pgtz.c:211
#3  pg_tzset (name=name@entry=0xb49fd5 "GMT") at pgtz.c:248
#4  0x009ecfee in pg_timezone_initialize () at pgtz.c:372

5. DEF_FFACTOR=1 idiv on // just in case reproducer to validate measurement #1
104.333, 3.804
104.313, 3.772
103.951, 3.687

6. DEF_FFACTOR=2 idiv on
105.406, 3.783
105.33, 3.721
105.403, 3.777

7. DEF_FFACTOR=4 idiv on
105.803, 3.722
105.73, 3.728
106.104, 3.756

Some notes:
- SMgrRelationHash is initialized from the start at 400 entries, while the DB
here is a small 8GB one with only a couple of tables -> no expansion needed in
the above test.
- the local backend-private overflow hash table for buffers,
PrivateRefCountHash, is initialized at 100 and may grow during the run
- googling for DEF_FFACTOR, I've found only one entry with a value <> 1, from
1993 (see [2]; what's interesting is that DEF_SEGSIZE still has the same value
after all those years)

Alvaro wrote:
>> ... however, if we're really tense about improving some hash table's
>> performance, it might be worth trying to use simplehash.h instead of
>> dynahash.c for it.

Thomas wrote:
> Sure, but removing the dynahash IDIV also seems like a slam dunk removal of a 
> useless expensive feature.

I agree with both; I just thought it might be an interesting finding, as this
idiv might be (?) present in other common paths like ReadBuffer*() / PinBuffer()
(some recent discussions, maybe on NUMA boxes), not just WAL recovery, and it
seems relatively easy to improve.

-J.

[1] - https://github.com/macdice/redo-bench
[2] - https://fuhrwerks.com/csrg/info/93c40a660b6cdf74


Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-07 Thread Thomas Munro
On Sat, Sep 5, 2020 at 2:34 AM Alvaro Herrera  wrote:
> On 2020-Sep-04, Jakub Wartak wrote:
> > After removing HASH_FFACTOR PostgreSQL still compiles...  Would
> > removing it break some external API/extensions ?
>
> FWIW, HASH_FFACTOR has *never* been used in Postgres core code.
>
> https://postgr.es/m/20160418180711.55ac82c0@fujitsu already reported
> that this flag was unused a few years ago.  And a search through
> codesearch.debian.net shows that no software packaged by Debian uses
> that flag either.
>
> *If* any third-party extension is using HASH_FFACTOR, it can easily be
> unbroken by removing the flag or #defining it to zero; the removal would
> only be a problem if anyone showed that there is a performance loss by
> using the default fillfactor for some dynahash table instead of their
> custom value.
>
> I think if we know that there's a 4% performance increase by removing
> the division that comes with an ability to use a custom fillfactor that
> nobody uses or has ever used, it makes sense to remove that ability.

I think Tomas is right, and the correct fix for this would be to
compute it up front every time the hash table size changes, but since
apparently no one ever used it, +1 for just removing it and leaving it
hard-coded.

For what it's worth, for dshash.c I just hard coded it to double the
hash table size whenever the number of elements in a partition
exceeded 75%.  Popular hash tables in standard libraries seem to use
either .75 or 1.0 as a default or only setting.  There's probably room
for discussion about numbers in those ranges, but I think the concept
of customisable load factors with a range much wider than that may be
more relevant for disk-based hash tables (like our hash indexes),
which have very different economics.  I think the name HASH_FFACTOR
may be a clue that this was possibly interference from disk-based hash
tables from the same Berkeley people: rather than "load factor", the
more usual term (?) for nentries/nbuckets in memory-based hash tables
as a way to model the number of expected collisions, they called it
"fill factor".  I guess that's because in disk-based designs you
actually want a certain rate of collisions: you're working not with
elements in an array, trying to avoid collisions while not wasting
space, but with fixed-size buckets that you want to fill up, but not
overflow.
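
Incidentally, a fractional target like 0.75 needn't cost a division either
when the bucket count is a power of two.  A sketch, not dshash.c's actual
code:

#include <stdbool.h>

/*
 * Grow when nentries exceeds 0.75 * nbuckets, using only a shift and a
 * subtraction; exact whenever nbuckets is a power of two >= 4.
 */
static inline bool
crowded(long nentries, long nbuckets)
{
	return nentries > nbuckets - (nbuckets >> 2);
}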

> ... however, if we're really tense about improving some hash table's
> performance, it might be worth trying to use simplehash.h instead of
> dynahash.c for it.

Sure, but removing the dynahash IDIV also seems like a slam dunk
removal of a useless expensive feature.




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-04 Thread Alvaro Herrera
On 2020-Sep-04, Jakub Wartak wrote:

> After removing HASH_FFACTOR PostgreSQL still compiles...  Would
> removing it break some external API/extensions ?

FWIW, HASH_FFACTOR has *never* been used in Postgres core code.

https://postgr.es/m/20160418180711.55ac82c0@fujitsu already reported
that this flag was unused a few years ago.  And a search through
codesearch.debian.net shows that no software packaged by Debian uses
that flag either.

*If* any third-party extension is using HASH_FFACTOR, it can easily be
unbroken by removing the flag or #defining it to zero; the removal would
only be a problem if anyone showed that there is a performance loss by
using the default fillfactor for some dynahash table instead of their
custom value.

I think if we know that there's a 4% performance increase by removing
the division that comes with an ability to use a custom fillfactor that
nobody uses or has ever used, it makes sense to remove that ability.

... however, if we're really tense about improving some hash table's
performance, it might be worth trying to use simplehash.h instead of
dynahash.c for it.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Division in dynahash.c due to HASH_FFACTOR

2020-09-04 Thread Tomas Vondra

On Fri, Sep 04, 2020 at 07:01:41AM +0000, Jakub Wartak wrote:


Greetings hackers,

I have mixed feelings about whether this is a welcome contribution, as the
potential gain is relatively small in my tests, but still I would like to
point out that the HASH_FFACTOR functionality in dynahash.c could be removed
or optimized (the default fill factor is always 1; there's not a single place
inside the PostgreSQL repository that uses a custom fill factor other than
DEF_FFACTOR=1). Because the functionality is present, there seems to be a
division for every buffer access [BufTableLookup()] / every smgropen() call
(every call to hash_search() is affected, provided it's not
ShmemInitHash/HASH_PARTITION). This division is especially visible via perf
on the single-process StartupXLOG WAL recovery process on a standby under
heavy-duty 100% CPU conditions, as the top-1 entry is inside hash_search:

   0x00888751 <+449>:   idiv   r8
   0x00888754 <+452>:   cmp    rax,QWORD PTR [r15+0x338]   <<-- perf annotate
shows this as 30-40%, even on default -O2, probably due to CPU pipelining
around the idiv above

I've made a PoC test to skip that division, assuming ffactor would be gone:

        if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
-               hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor &&
+               hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1) &&

For a 3.7GB stream of WAL I'm getting a consistent improvement of ~4% (yes, I
know it's small; that's why I'm having mixed feelings):
gcc -O3: 104->100s
gcc -O2: 108->104s
pgbench -S -c 16 -j 4 -T 30 -M prepared: stays more or less the same (-s 100), 
so no positive impact there



Hmm, so it's the division that makes the difference? In that case,
wouldn't it make sense to just compute a threshold every time the hash
table is resized, and then do just the comparison? That is, compute

   nentries_threshold = (long) (hctl->max_bucket + 1) * hctl->ffactor;

and then do just

   hctl->freeList[0].nentries >= nentries_threshold

Of course, this assumes the threshold is calculated only rarely, maybe
that's not the case.
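
A sketch of that idea (the field name and its placement are hypothetical,
and it assumes the recomputation happens wherever max_bucket changes, i.e.
in init_htab() and expand_table()):

	/* hypothetical addition to struct HASHHDR */
	long		nentries_threshold; /* grow once nentries reaches this */

	/* recomputed on each resize, so the hot path never divides */
	hctl->nentries_threshold = (long) (hctl->max_bucket + 1) * hctl->ffactor;

	/* ...and the test in hash_search_with_hash_value() would become... */
	if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
		hctl->freeList[0].nentries >= hctl->nentries_threshold &&
		!has_seq_scans(hashp))
		(void) expand_table(hashp);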

Also, what if you lower the fill factor, e.g. to 0.8? Does that improve
performance or not? I can't find any discussion about this in the
archives, but the dynamic hashing paper [1] seems to suggest it does
make a difference (see the tables at the end, but I haven't re-read the
paper recently so maybe it's irrelevant). Anyway, it'd be foolish to
remove the ffactor parameter to save on division only to lose the
ability to lower the factor and save more than that ...


As for the 4% - it's not much, but it's also not negligible. I'm sure
we've done micro-optimizations for smaller gains. The big question is
whether this is statistically significant, or whether it's just due to
random effects. It could be caused by changes in the layout of the
binary, for example - that happens quite easily. See [2] and [3].

The talk actually mentions a tool meant to eliminate this bias by
randomization, but I've never tried using it on PostgreSQL so I'm not
sure how compatible it is :-(


[1] https://www.csd.uoc.gr/~hy460/pdf/Dynamic%20Hash%20Tables.pdf
[2] 
https://www.cis.upenn.edu/~cis501/previous/fall2016/papers/producing-wrong-data.pdf
[3] https://www.youtube.com/watch?v=r-TLSBdHe1A


After removing HASH_FFACTOR, PostgreSQL still compiles...  Would
removing it break some external API/extensions?  I saw several
optimizations for the idiv itself, e.g. see
https://github.com/ridiculousfish/libdivide.  Or maybe there is some
other idea for exposing the bottlenecks of BufTableLookup()?  I also saw
the codepath PinBuffer() -> GetPrivateRefCountEntry() -> dynahash, which
could be called pretty often, but I have no idea what kind of pgbench
stress test could be used to demonstrate the gain (or lack of it).

-Jakub Wartak.



I don't think this would break a lot of stuff, but I'm kinda dubious
it's a measurable improvement for common workloads ...


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services