We're observing lockmanager wait spikes on several heavily loaded
systems.

The cases differ: the queries are different, and the Postgres versions
are 12, 13, and 14.

But in all cases, the servers are quite beefy (96-128 vCPUs, ~600-800 GiB
of RAM) and handle a lot of TPS (tens of thousands). Most queries that
suffer from wait_event=lockmanager involve a substantial number of
tables/indexes, often with partitioning.

FP_LOCK_SLOTS_PER_BACKEND is currently hard-coded to 16 in storage/proc.h,
and it is very easy to hit this threshold in a loaded system, especially
if, for example, a table with a dozen indexes has been partitioned. It
seems any system with good growth hits it sooner or later.
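
To make the arithmetic concrete, here is a rough back-of-the-envelope
sketch in Python. It is a deliberately simplified model, not how the lock
manager actually counts: it assumes one relation lock per touched table
plus one per index on that table, and it ignores partition pruning, lock
modes, and locks already held by the backend. The function names are
made up for illustration; only the value 16 comes from the text above.

```python
# Value of FP_LOCK_SLOTS_PER_BACKEND as cited in this message.
FP_LOCK_SLOTS_PER_BACKEND = 16


def relation_locks_needed(tables: int, indexes_per_table: int) -> int:
    """Estimate relation locks for one query (simplified assumption:
    one lock per table plus one lock per index on each table)."""
    return tables * (1 + indexes_per_table)


def fast_path_overflow(locks_needed: int,
                       slots: int = FP_LOCK_SLOTS_PER_BACKEND) -> int:
    """Number of locks that would not fit into the fast-path slots."""
    return max(0, locks_needed - slots)


if __name__ == "__main__":
    # One unpartitioned table with a dozen indexes.
    single = relation_locks_needed(tables=1, indexes_per_table=12)
    # The same table split into 4 partitions, all of them touched.
    partitioned = relation_locks_needed(tables=4, indexes_per_table=12)
    for label, n in (("single table", single), ("4 partitions", partitioned)):
        print(f"{label}: {n} locks, {fast_path_overflow(n)} beyond fast path")
```

Under these assumptions the unpartitioned table still fits within 16
slots, while even a small number of partitions pushes most of the locks
out of the fast path.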

I wonder, would it make sense to:
1) either consider increasing this hard-coded value, taking into account
that 16 seems to be very low for modern workloads, schemas, and hardware –
say, it could be 64,
2) or even make it configurable – a new GUC.

Thanks,
Nikolay Samokhvalov
Founder, Postgres.ai
