On Thu, Oct 27, 2022 at 10:29 PM Andres Freund <and...@anarazel.de> wrote:
>
> But I think we can solve that fairly reasonably nonetheless. We can change
> PGPROC->lwWaiting to not just be a boolean, but have three states:
> 0: not waiting
> 1: waiting in waitlist
> 2: waiting to be woken up
>
> which we then can use in LWLockDequeueSelf() to only remove ourselves from the
> list if we're on it. As removal from that list is protected by the wait list
> lock, there's no race to worry about.
>
> client  patched    HEAD
>      1    60109   60174
>      2   112694  116169
>      4   214287  208119
>      8   377459  373685
>     16   524132  515247
>     32   565772  554726
>     64   587716  497508
>    128   581297  415097
>    256   550296  334923
>    512   486207  243679
>    768   449673  192959
>   1024   410836  157734
>   2048   326224   82904
>   4096   250252   32007
>
> Not perfect with the patch, but not awful either.
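
To make the three-state idea quoted above concrete, here is a minimal, single-threaded sketch in C. It is not the attached patch and not PostgreSQL code: the enum values mirror the 0/1/2 states from the mail, but every type and function name (WaitState, Waiter, WaitList, waitlist_*) is hypothetical, and the wait list lock that would protect these operations in the real code is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; the quoted mail only gives the numeric states. */
typedef enum WaitState
{
    WS_NOT_WAITING = 0,     /* not waiting */
    WS_IN_WAITLIST = 1,     /* waiting in waitlist */
    WS_PENDING_WAKEUP = 2   /* taken off the list, waiting to be woken up */
} WaitState;

typedef struct Waiter
{
    WaitState      state;
    struct Waiter *prev;
    struct Waiter *next;
} Waiter;

typedef struct WaitList
{
    Waiter *head;
    Waiter *tail;
    int     length;
} WaitList;

/* Unlink w from the list. Caller must know that w is on the list. */
static void
waitlist_unlink(WaitList *list, Waiter *w)
{
    if (w->prev)
        w->prev->next = w->next;
    else
        list->head = w->next;
    if (w->next)
        w->next->prev = w->prev;
    else
        list->tail = w->prev;
    w->prev = w->next = NULL;
    list->length--;
}

/* Waiter side: append ourselves and record that we are on the list. */
static void
waitlist_enqueue(WaitList *list, Waiter *w)
{
    assert(w->state == WS_NOT_WAITING);
    w->prev = list->tail;
    w->next = NULL;
    if (list->tail)
        list->tail->next = w;
    else
        list->head = w;
    list->tail = w;
    list->length++;
    w->state = WS_IN_WAITLIST;
}

/* Waker side: take the first waiter off the list and mark it pending. */
static Waiter *
waitlist_wake_one(WaitList *list)
{
    Waiter *w = list->head;

    if (w == NULL)
        return NULL;
    waitlist_unlink(list, w);
    w->state = WS_PENDING_WAKEUP;
    return w;
}

/*
 * Waiter side: remove ourselves only if we are still on the list.
 * Returns false if a waker already removed us (or we never queued).
 */
static bool
waitlist_dequeue_self(WaitList *list, Waiter *w)
{
    if (w->state != WS_IN_WAITLIST)
        return false;
    waitlist_unlink(list, w);
    w->state = WS_NOT_WAITING;
    return true;
}
```

The point of the sketch is the check in waitlist_dequeue_self(): once a waker has moved a waiter to the "waiting to be woken up" state, the waiter no longer touches the list. In the real code both paths would run with the wait list lock held, which is what rules out the race.
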

Here are results from my testing [1]. The results look impressive with the
patch at higher client counts. For instance, with 1024 clients TPS is 103587
on HEAD, whereas it is 248702 with the patch (run 1).

HEAD, run 1:
     1   34534
     2   72088
     4  135249
     8  213045
    16  243507
    32  304108
    64  375148
   128  390658
   256  345503
   512  284510
   768  146417
  1024  103587
  2048   34702
  4096   12450

HEAD, run 2:
     1   34110
     2   72403
     4  134421
     8  211263
    16  241606
    32  295198
    64  353580
   128  385147
   256  341672
   512  295001
   768  142341
  1024   97721
  2048   30229
  4096   13179

PATCHED, run 1:
     1   34412
     2   71733
     4  139141
     8  211526
    16  241692
    32  308198
    64  406198
   128  385643
   256  338464
   512  295559
   768  272639
  1024  248702
  2048  191402
  4096  112074

PATCHED, run 2:
     1   34087
     2   73567
     4  135624
     8  211901
    16  242819
    32  310534
    64  352663
   128  381780
   256  342483
   512  301968
   768  272596
  1024  251014
  2048  184939
  4096  108186

> I've attached my quick-and-dirty patch. Obviously it'd need a few defines etc,
> but I wanted to get this out to discuss before spending further time.

Just for the record, here are some review comments posted in the other thread:
https://www.postgresql.org/message-id/CALj2ACXktNbG%3DK8Xi7PSqbofTZozavhaxjatVc14iYaLu4Maag%40mail.gmail.com

BTW, I've seen a sporadic crash (SEGV) in the background writer with the patch,
using the same setup [1]. I'm not sure if it's really because of the patch. I'm
unable to reproduce it now, and unfortunately I didn't capture further details
when it occurred.

[1]
./configure --prefix=$PWD/inst/ --enable-tap-tests CFLAGS="-O3" > install.log && make -j 8 install > install.log 2>&1 &

shared_buffers = 8GB
max_wal_size = 32GB
max_connections = 4096
checkpoint_timeout = 10min

ubuntu: cat << EOF >> txid.sql
SELECT txid_current();
EOF
ubuntu: for c in 1 2 4 8 16 32 64 128 256 512 768 1024 2048 4096; do echo -n "$c ";./pgbench -n -M prepared -U ubuntu postgres -f txid.sql -c$c -j$c -T5 2>&1|grep '^tps'|awk '{print $3}';done

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com