Greetings hackers,

I have mixed feelings about whether this would be a welcome contribution, as 
the potential gain is relatively small in my tests, but I would still like to 
point out that the HASH_FFACTOR functionality in dynahash.c could be removed 
or optimized (the fill factor is always 1; there is not a single place in the 
PostgreSQL repository that uses a custom fill factor other than DEF_FFACTOR=1). 
Because the functionality is present, there seems to be a division for every 
buffer access [BufTableLookup()] and every smgropen() call (every call to 
hash_search() is affected, provided it's not ShmemInitHash/HASH_PARTITION). 
This division is especially visible via perf in the single-process StartupXLOG 
WAL recovery on a standby under heavy-duty, 100% CPU conditions, where the top 
entry is inside hash_search:
   0x0000000000888751 <+449>:   idiv   r8
   0x0000000000888754 <+452>:   cmp    rax,QWORD PTR [r15+0x338]   <<-- 30-40%

The marked cmp line shows as 30-40% in perf annotate, even with the default 
-O2, probably due to CPU pipelining of the idiv above.

I've made a PoC test to skip that division, assuming ffactor would be gone:
               if (!IS_PARTITIONED(hctl) && !hashp->frozen &&
-                       hctl->freeList[0].nentries / (long) (hctl->max_bucket + 1) >= hctl->ffactor &&
+                       hctl->freeList[0].nentries >= (long) (hctl->max_bucket + 1) &&
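
To make the reasoning behind the PoC explicit: with a fill factor of 1, the 
quotient nentries / (max_bucket + 1) is >= 1 exactly when 
nentries >= max_bucket + 1, so the division can be dropped. The following is 
only a minimal standalone sketch of that equivalence. The function names 
needs_expand_div and needs_expand_nodiv are hypothetical and not part of 
dynahash.c, and plain long arguments stand in for the HASHHDR fields.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical stand-in for the original check: one integer division
 * per call, compared against the fill factor.
 */
static bool
needs_expand_div(long nentries, long max_bucket, long ffactor)
{
    assert(nentries >= 0 && max_bucket >= 0 && ffactor > 0);
    return nentries / (max_bucket + 1) >= ffactor;
}

/*
 * Hypothetical stand-in for the PoC check: no division.
 * Only equivalent to the function above when ffactor == 1.
 */
static bool
needs_expand_nodiv(long nentries, long max_bucket)
{
    assert(nentries >= 0 && max_bucket >= 0);
    return nentries >= max_bucket + 1;
}
```

For non-negative inputs, needs_expand_div(n, b, 1) and 
needs_expand_nodiv(n, b) should agree for every n and b. For any other fill 
factor they differ, which is why the PoC assumes ffactor is removed.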

For a 3.7GB stream of WAL I'm getting a consistent improvement of ~4% (yes, I 
know it's small, that's why I'm having mixed feelings):
gcc -O3: 104->100s
gcc -O2: 108->104s
pgbench -S -c 16 -j 4 -T 30 -M prepared: stays more or less the same (-s 100), 
so no positive impact there

After removing HASH_FFACTOR PostgreSQL still compiles... Would removing it 
break some external API/extensions? I saw several optimizations for "idiv" 
that could be applied, e.g. see 
https://github.com/ridiculousfish/libdivide  Or maybe there is some other idea 
to expose bottlenecks of BufTableLookup()? I also saw the codepath 
PinBuffer()->GetPrivateRefCountEntry() -> dynahash, which could be called 
pretty often. I have no idea what kind of pgbench stress test could be used to 
demonstrate the gain (or lack of it).
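
If the fill factor had to stay configurable, one alternative to a 
libdivide-style approach would be to rearrange the comparison so that it 
multiplies instead of divides: for non-negative nentries and a positive 
ffactor, nentries / (max_bucket + 1) >= ffactor holds exactly when 
nentries >= ffactor * (max_bucket + 1). This is only a sketch under those 
assumptions, not a tested patch. The names fill_check_div and fill_check_mul 
are hypothetical, and the multiplication could overflow for very large 
values, which real code would need to rule out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical division-based check, mirroring the existing condition. */
static bool
fill_check_div(long nentries, long max_bucket, long ffactor)
{
    assert(nentries >= 0 && max_bucket >= 0 && ffactor > 0);
    return nentries / (max_bucket + 1) >= ffactor;
}

/*
 * Hypothetical multiplication-based check.  Equivalent to the division
 * form for non-negative nentries and positive ffactor, as long as
 * ffactor * (max_bucket + 1) does not overflow a long.
 */
static bool
fill_check_mul(long nentries, long max_bucket, long ffactor)
{
    assert(nentries >= 0 && max_bucket >= 0 && ffactor > 0);
    return nentries >= ffactor * (max_bucket + 1);
}
```

Whether a multiply is measurably cheaper than the idiv in this code path 
would still have to be confirmed with the same WAL replay test.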

-Jakub Wartak.
