Hi,

On Mon, Mar 30, 2026 at 06:11:17PM +0200, Tomas Vondra wrote:
> Hi,
>
> Isn't pgstat_lock_flush_cb a bit broken with nowait=true? It'll skip
> flushing stats for that particular lock type, but then it'll happily
> reset the pending stats anyway, forgetting the stats.
>
> AFAIK it should keep the pending stats, and flush them sometime later,
> when the lock is not contended. That's what the other flush callbacks
> do, at least. This probably means it needs to reset the entries one by
> one, not the whole struct at once.
Oh right, it's currently misbehaving, thanks for the warning!

> TBH I'm rather skeptical about having one lock per entry. Sure, it
> allows two backends to write different entries concurrently. But is it
> actually worth it? With nowait=true it might even be cheaper to try with
> a single lock, and AFAICS that's the case where it matters.
>
> I wouldn't be surprised if this behaved quite poorly with contended
> cases, because the backends will be accessing the locks in exactly the
> same order and synchronize. So if one lock is contended, won't that
> "synchronize" the backends, making the other locks contended too?
>
> Has anyone tested it actually improves the behavior? I only quickly
> skimmed the thread, I might have missed it ...

I just did a few tests, with a fixed per entry lock version (to avoid the
bug mentioned above) and with a single lock. The test is this one:

Setup (with deadlock_timeout set to 1ms):

CREATE TABLE t1(id int primary key, v int);
CREATE TABLE t2(id int primary key, v int);
INSERT INTO t1 SELECT i, 0 FROM generate_series(1,100) i;
INSERT INTO t2 SELECT i, 0 FROM generate_series(1,100) i;

test.sql:

\set id1 random(1, 100)
\set id2 random(1, 100)
BEGIN;
SELECT pg_advisory_xact_lock(:id1);
UPDATE t1 SET v=v+1 WHERE id=:id1;
UPDATE t2 SET v=v+1 WHERE id=:id2;
END;

Launched this way:

pgbench -c 32 -j 32 -T60 -f test.sql

One run produces something like:

postgres=# select locktype, waits, wait_time from pg_stat_lock where waits > 0;
   locktype    | waits  | wait_time
---------------+--------+-----------
 tuple         |   5058 |      5092
 transactionid |  78287 |     79269
 advisory      | 105005 |    177253
(3 rows)

With one lock per entry, the average tps over 5 runs is 12099. With one
single lock, the average tps over 5 runs is 12118. The difference looks
like noise, so one lock per entry does not show any performance improvement.

Both kinds of tests also produce this perf profile:

    0.00%  0.00%  postgres  postgres  [.] pgstat_lock_flush_cb

So, I'm tempted to say that one lock per entry adds complexity without
observable performance benefit. A single lock also matches the intent of
the nowait path more naturally, and I would not be surprised if one lock
per entry behaves worse in some cases.

So, PFA a patch to move to a single lock instead.

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 12eb6bb593fd660ff1283474a694b674b30d1e99 Mon Sep 17 00:00:00 2001
From: Bertrand Drouvot <[email protected]>
Date: Mon, 30 Mar 2026 17:16:08 +0000
Subject: [PATCH v1] lock statistics: replace per lock type locks with a single
 lock

Currently there is one LWLock per lock type entry: this adds complexity
without observable performance benefit.

Plus there is a bug with nowait=true: a backend iterating over all entries
could successfully flush some entries while skipping others due to
contention, then unconditionally reset PendingLockStats and clear
have_lockstats, losing the skipped entries permanently.

This commit replaces the per entry locks with a single lock protecting all
entries and the reset timestamp. This makes the nowait path correct: either
the lock is acquired and everything is flushed, or it is not acquired and
nothing is modified. This also better matches the intent of the nowait path.

As a result, the pgstat_lock_init_shmem_cb(), pgstat_lock_reset_all_cb()
and pgstat_lock_snapshot_cb() callbacks are also simplified.
Author: Bertrand Drouvot <[email protected]>
Reported-by: Tomas Vondra <[email protected]>
Discussion: https://postgr.es/m/1af63e6d-16d5-4d5b-9b03-11472ef1adf9%40vondra.me
---
 src/backend/utils/activity/pgstat_lock.c | 86 ++++++++----------------
 src/include/utils/pgstat_internal.h      |  6 +-
 2 files changed, 32 insertions(+), 60 deletions(-)
 92.0% src/backend/utils/activity/
 7.9% src/include/utils/

diff --git a/src/backend/utils/activity/pgstat_lock.c b/src/backend/utils/activity/pgstat_lock.c
index 041f521ce73..aec64f8fb4b 100644
--- a/src/backend/utils/activity/pgstat_lock.c
+++ b/src/backend/utils/activity/pgstat_lock.c
@@ -50,99 +50,71 @@ pgstat_lock_flush(bool nowait)
 bool
 pgstat_lock_flush_cb(bool nowait)
 {
-	LWLock	   *lcktype_lock;
-	PgStat_LockEntry *lck_shstats;
-	bool		lock_not_acquired = false;
+	LWLock	   *lckstat_lock;
+	PgStatShared_Lock *shstats;
 
 	if (!have_lockstats)
 		return false;
 
+	shstats = &pgStatLocal.shmem->lock;
+	lckstat_lock = &shstats->lock;
+
+	if (!nowait)
+		LWLockAcquire(lckstat_lock, LW_EXCLUSIVE);
+	else if (!LWLockConditionalAcquire(lckstat_lock, LW_EXCLUSIVE))
+		return true;
+
 	for (int i = 0; i <= LOCKTAG_LAST_TYPE; i++)
 	{
-		lcktype_lock = &pgStatLocal.shmem->lock.locks[i];
-		lck_shstats =
-			&pgStatLocal.shmem->lock.stats.stats[i];
-
-		if (!nowait)
-			LWLockAcquire(lcktype_lock, LW_EXCLUSIVE);
-		else if (!LWLockConditionalAcquire(lcktype_lock, LW_EXCLUSIVE))
-		{
-			lock_not_acquired = true;
-			continue;
-		}
-
 #define LOCKSTAT_ACC(fld) \
-	(lck_shstats->fld += PendingLockStats.stats[i].fld)
+	(shstats->stats.stats[i].fld += PendingLockStats.stats[i].fld)
 		LOCKSTAT_ACC(waits);
 		LOCKSTAT_ACC(wait_time);
 		LOCKSTAT_ACC(fastpath_exceeded);
 #undef LOCKSTAT_ACC
-
-		LWLockRelease(lcktype_lock);
 	}
 
-	memset(&PendingLockStats, 0, sizeof(PendingLockStats));
+	LWLockRelease(lckstat_lock);
+
+	memset(&PendingLockStats, 0, sizeof(PendingLockStats));
 	have_lockstats = false;
 
-	return lock_not_acquired;
+	return false;
 }
 
-
 void
 pgstat_lock_init_shmem_cb(void *stats)
 {
 	PgStatShared_Lock *stat_shmem = (PgStatShared_Lock *) stats;
 
-	for (int i = 0; i <= LOCKTAG_LAST_TYPE; i++)
-		LWLockInitialize(&stat_shmem->locks[i], LWTRANCHE_PGSTATS_DATA);
+	LWLockInitialize(&stat_shmem->lock, LWTRANCHE_PGSTATS_DATA);
 }
 
 void
 pgstat_lock_reset_all_cb(TimestampTz ts)
 {
-	for (int i = 0; i <= LOCKTAG_LAST_TYPE; i++)
-	{
-		LWLock	   *lcktype_lock = &pgStatLocal.shmem->lock.locks[i];
-		PgStat_LockEntry *lck_shstats = &pgStatLocal.shmem->lock.stats.stats[i];
+	LWLock	   *lckstat_lock = &pgStatLocal.shmem->lock.lock;
 
-		LWLockAcquire(lcktype_lock, LW_EXCLUSIVE);
+	LWLockAcquire(lckstat_lock, LW_EXCLUSIVE);
 
-		/*
-		 * Use the lock in the first lock type PgStat_LockEntry to protect the
-		 * reset timestamp as well.
-		 */
-		if (i == 0)
-			pgStatLocal.shmem->lock.stats.stat_reset_timestamp = ts;
+	pgStatLocal.shmem->lock.stats.stat_reset_timestamp = ts;
 
-		memset(lck_shstats, 0, sizeof(*lck_shstats));
-		LWLockRelease(lcktype_lock);
-	}
+	memset(pgStatLocal.shmem->lock.stats.stats, 0,
+		   sizeof(pgStatLocal.shmem->lock.stats.stats));
+
+	LWLockRelease(lckstat_lock);
 }
 
 void
 pgstat_lock_snapshot_cb(void)
 {
-	for (int i = 0; i <= LOCKTAG_LAST_TYPE; i++)
-	{
-		LWLock	   *lcktype_lock = &pgStatLocal.shmem->lock.locks[i];
-		PgStat_LockEntry *lck_shstats = &pgStatLocal.shmem->lock.stats.stats[i];
-		PgStat_LockEntry *lck_snap = &pgStatLocal.snapshot.lock.stats[i];
-
-		LWLockAcquire(lcktype_lock, LW_SHARED);
-
-		/*
-		 * Use the lock in the first lock type PgStat_LockEntry to protect the
-		 * reset timestamp as well.
-		 */
-		if (i == 0)
-			pgStatLocal.snapshot.lock.stat_reset_timestamp =
-				pgStatLocal.shmem->lock.stats.stat_reset_timestamp;
-
-		/* using struct assignment due to better type safety */
-		*lck_snap = *lck_shstats;
-		LWLockRelease(lcktype_lock);
-	}
+	LWLock	   *lckstat_lock = &pgStatLocal.shmem->lock.lock;
+
+	LWLockAcquire(lckstat_lock, LW_SHARED);
+
+	pgStatLocal.snapshot.lock = pgStatLocal.shmem->lock.stats;
+
+	LWLockRelease(lckstat_lock);
 }
 
 /*
diff --git a/src/include/utils/pgstat_internal.h b/src/include/utils/pgstat_internal.h
index 97704421a92..8ae3fbcba46 100644
--- a/src/include/utils/pgstat_internal.h
+++ b/src/include/utils/pgstat_internal.h
@@ -467,10 +467,10 @@ typedef struct PgStatShared_IO
 typedef struct PgStatShared_Lock
 {
 	/*
-	 * locks[i] protects stats.stats[i]. locks[0] also protects
-	 * stats.stat_reset_timestamp.
+	 * single lock protecting all entries as well as
+	 * stats->stat_reset_timestamp.
 	 */
-	LWLock		locks[LOCKTAG_LAST_TYPE + 1];
+	LWLock		lock;
 
 	PgStat_Lock stats;
 } PgStatShared_Lock;
-- 
2.34.1
