I have been working on analyzing different ways to reduce the contention around ProcArrayLock. I have evaluated mainly two ideas: the first is to partition ProcArrayLock (the basic idea is to allow multiple clients (equal to the number of ProcArrayLock partitions) to perform ProcArrayEndTransaction and then wait for all of them at GetSnapshotData time), and the second is to have a mechanism to group-clear the XIDs during ProcArrayEndTransaction(). The second idea clearly stands out in my tests, so I have prepared a patch for it to discuss further here.

The idea behind the second approach (group XID clearing) is: first try to acquire ProcArrayLock conditionally in ProcArrayEndTransaction(); if we get the lock, clear the advertised XID as we do today, else set a flag (which indicates that the advertised XID needs to be cleared for this proc) and push this proc onto pendingClearXidList. All procs except one then wait for their XID to be cleared. The one proc that is allowed to proceed attempts the lock acquisition; after acquiring the lock, it pops all of the requests off the list using compare-and-swap, servicing each one before moving to the next proc and clearing their XIDs. After servicing all the requests on pendingClearXidList, it releases the lock and once again goes through the saved pendingClearXidList to wake all the processes waiting for their XID to be cleared. To set the appropriate value for ShmemVariableCache->latestCompletedXid, a proc needs to advertise its latestXid in case it has to be pushed onto pendingClearXidList. Attached patch implements the above idea.
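To make the flow concrete, here is a minimal C sketch of the protocol. It is illustrative only: the names used below (a pendingClearXidFirst list head in PROC_HDR, clearXidGroupMember / clearXidGroupNext / clearXidGroupLatestXid fields in PGPROC, and a helper ProcArrayEndTransactionInternal() that does the actual XID clearing and latestCompletedXid advancement while holding the lock) are placeholders I am using for explanation, not necessarily what the attached patch calls them; please refer to the patch for the real implementation.

#include "postgres.h"
#include "port/atomics.h"
#include "storage/lwlock.h"
#include "storage/pg_sema.h"
#include "storage/proc.h"

/*
 * Sketch: called when ProcArrayEndTransaction() failed to get
 * ProcArrayLock conditionally.  One proc becomes the leader and
 * clears the advertised XIDs for the whole group.
 */
static void
ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid)
{
	uint32		nextidx;
	uint32		wakeidx;

	/* Advertise latestXid so the leader can set latestCompletedXid. */
	proc->clearXidGroupMember = true;
	proc->clearXidGroupLatestXid = latestXid;

	/* Push this proc onto pendingClearXidList with compare-and-swap. */
	while (true)
	{
		nextidx = pg_atomic_read_u32(&ProcGlobal->pendingClearXidFirst);
		pg_atomic_write_u32(&proc->clearXidGroupNext, nextidx);
		if (pg_atomic_compare_exchange_u32(&ProcGlobal->pendingClearXidFirst,
										   &nextidx,
										   (uint32) proc->pgprocno))
			break;
	}

	/* If the list was non-empty, another proc leads; just wait. */
	if (nextidx != INVALID_PGPROCNO)
	{
		/* Spurious wakeups are possible; re-check the flag each time. */
		while (proc->clearXidGroupMember)
			PGSemaphoreLock(&proc->sem, false);
		return;
	}

	/* This proc is the leader: take the lock for the whole group. */
	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);

	/* Pop all requests off the list; later arrivals form a new group. */
	while (true)
	{
		nextidx = pg_atomic_read_u32(&ProcGlobal->pendingClearXidFirst);
		if (pg_atomic_compare_exchange_u32(&ProcGlobal->pendingClearXidFirst,
										   &nextidx,
										   INVALID_PGPROCNO))
			break;
	}

	/* Service each request before moving to the next proc. */
	wakeidx = nextidx;
	while (nextidx != INVALID_PGPROCNO)
	{
		PGPROC	   *nproc = &ProcGlobal->allProcs[nextidx];

		ProcArrayEndTransactionInternal(nproc,
										nproc->clearXidGroupLatestXid);
		nextidx = pg_atomic_read_u32(&nproc->clearXidGroupNext);
	}

	LWLockRelease(ProcArrayLock);

	/* Walk the saved list again and wake everyone whose XID we cleared. */
	while (wakeidx != INVALID_PGPROCNO)
	{
		PGPROC	   *nproc = &ProcGlobal->allProcs[wakeidx];

		wakeidx = pg_atomic_read_u32(&nproc->clearXidGroupNext);
		nproc->clearXidGroupMember = false;
		if (nproc != proc)
			PGSemaphoreUnlock(&nproc->sem);
	}
}

Note that popping the whole list in one compare-and-swap means procs arriving while the leader is working simply form the next group, so each group costs exactly one ProcArrayLock acquisition.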
Performance Data
-----------------------------
RAM - 500GB
8 sockets, 64 cores (hyperthreaded, 128 threads in total)

Non-default parameters
------------------------------------
max_connections = 150
shared_buffers = 8GB
min_wal_size = 10GB
max_wal_size = 15GB
checkpoint_timeout = 35min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
wal_buffers = 256MB

pgbench setup
------------------------
scale factor - 300
Data is on magnetic disk and WAL on SSD.
pgbench -M prepared tpc-b

HEAD    : commit 51d0fe5d
Patch-1 : group_xid_clearing_at_trans_end_rel_v1

Client Count/TPS      1      8     16     32     64    128
HEAD                814   6092  10899  19926  23636  17812
Patch-1            1086   6483  11093  19908  31220  28237

The graph for the data is attached.

Points about performance data
---------------------------------------------
1. The patch gives a good performance improvement at or above 64 clients and a somewhat moderate improvement at lower client counts. The reason is that the contention around ProcArrayLock is mainly seen at higher client counts. I have checked that at higher client counts the patch starts behaving as if lockless (which means performance with the patch is equivalent to what we get if we just comment out ProcArrayLock in ProcArrayEndTransaction()).
2. There is some noise in this data (at 1 client, I don't expect much difference).
3. I have done similar tests on a POWER-8 m/c and found similar gains.
4. The gains are visible when the data fits in shared_buffers, as for other workloads I/O starts dominating.
5. I have seen that the effect of the patch is much more visible if we keep autovacuum = off (doing a manual vacuum after each run) and keep wal_writer_delay at a lower value (say 20ms). I have not included that data here, but if somebody is interested, I can do detailed tests against HEAD with those settings and share the results.

Here are the steps used to take the data (these are repeated for each reading)
--------------------------------------------------------------------------------------------------------
1. Start server
2. dropdb postgres
3. createdb postgres
4. pgbench -i -s 300 postgres
5. pgbench -c $threads -j $threads -T 1800 -M prepared postgres
6. checkpoint
7. Stop server

Thanks to Robert Haas for discussing the idea (offlist) and suggesting improvements to it, and also to Andres Freund for discussing and sharing thoughts about this idea at PGCon.

Suggestions?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
group_xid_clearing_at_trans_end_v1.patch