On Wed Jun 3, 2026 at 1:59 PM UTC, Bertrand Drouvot wrote: > Hi hackers, > > Now that we have global lock statistics since 4019f725f5d, it could be useful > to have the same kind of information on a per-backend basis. > > Indeed, pg_stat_lock gives us cluster-wide aggregates: total waits, total wait > time, total fast-path exceeded across all backends since last reset. > > When we see high numbers, we can't answer: > > - Which backend is affected the most? > - Is it one backend affected or many? > - Is a specific application or connection pool suffering? > - After a specific workload/application is improved, did its lock behavior > improve? > > With per-backend lock stats, we could: > > 1/ Isolate problematic sessions. We can correlate locks behavior with specific > PIDs visible in pg_stat_activity: identify the exact application_name or user > experiencing lock waits. > > 2/ Debug live contention. During an incident, we could pinpoint which backends > are experiencing fast-path exhaustion or lock waits without having to reset > global stats and lose history. > > 3/ Define workload characterization. Different backend types may have very > different lock profiles. Per-backend stats would let us see this directly. > > 4/ Compare before/after per session. We could measure a single backend's lock > behavior across a specific operation, which is impossible with global counters > that include metrics from all other backends. > > IO and WAL stats already have per-backend counterparts > (pg_stat_get_backend_io(), > pg_stat_get_backend_wal()). Lock stats are the same class of operational data: > having them only at the global level is an inconsistency that limits > observability.
The motivation makes sense to me. > As far the technical implementation: > > This data can be retrieved with a new system function called > pg_stat_get_backend_lock(), that returns one tuple per lock type based on the > PID > provided in input. > > pgstat_flush_backend() gains a new flag value, able to control the flush of > the > lock stats. > > This patch relies mostly on the infrastructure provided by 9aea73fc61d4, that > has introduced backend statistics. > > The overhead (2 functions calls and counters increments) on the hot path > (normal > lock acquisition) is zero: counters are only incremented on paths that are > already > "slow" (post deadlock timeout waits, fast-path slot exhaustion) and does not > add > that much memory per-backend: PgStat_PendingLock is 288 bytes. > > The patch is made of 2 sub-patches: > > 0001: Refactor pg_stat_get_lock() to use a helper function > +static void > +pg_stat_lock_build_tuples(ReturnSetInfo *rsinfo, > + PgStat_LockEntry *lock_stats, > + TimestampTz stat_reset_timestamp) I think that the alignment of the second and third arguments could be off by one. They should line up with the capital R in ReturnSetInfo. > - values[i] = TimestampTzGetDatum(lock_stats->stat_reset_timestamp); > + if (stat_reset_timestamp != 0) > + values[i] = TimestampTzGetDatum(stat_reset_timestamp); > + else > + nulls[i] = true; It's not super clear to me why this changed in the first patch. Perhaps it is meant to be in the second patch? I see in the second patch that we use the stat_reset_timestamp from the backend stats instead of the lock stats in pg_stat_get_backend_lock(). The motivation makes sense. It might be cleaner to move the change into patch 2. > 0002: Add per-backend lock statistics > + Returns lock statistics about the backend with the specified > + process ID. The output fields are exactly the same as the ones in the > + <structname>pg_stat_lock</structname> view. It probably makes sense to link to pg_stat_lock here. Other than the few comments I had, this patchset looks good. It follows patterns that were already established with the per-backend IO and WAL stats. -- Tristan Partin PostgreSQL Contributors Team AWS (https://aws.amazon.com)
