On 28/10/2022 06:56, Thomas Munro wrote:
> One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE
> t; ... and then N other sessions wait in SELECT * FROM t;, and then
> you run ... COMMIT;, you'll see the first session wake all the others
> while it still holds the partition lock itself.  They'll all wake up
> and begin to re-acquire the same partition lock in exclusive mode,
> immediately go back to sleep on *that* wait list, and then wake each
> other up one at a time in a chain.  We could avoid the first
> double-bounce by not setting the latches until after we've released
> the partition lock.  We could avoid the rest of them by not
> re-acquiring the partition lock at all, which ... if I'm reading right
> ... shouldn't actually be necessary in modern PostgreSQL?  Or if there
> is another reason to re-acquire then maybe the comment should be
> updated.

ISTM that the change to not re-acquire the lock in ProcSleep() is independent of the other changes. Let's split that off into a separate patch.

I agree it should be safe. Acquiring a lock just to hold off interrupts is overkill anyway; HOLD_INTERRUPTS() would be enough. LockErrorCleanup() uses HOLD_INTERRUPTS() already.

There are no CHECK_FOR_INTERRUPTS() calls in GrantAwaitedLock(), so cancel/die interrupts can't happen here. But could we add HOLD_INTERRUPTS(), just pro forma, to document the assumption? It's a little awkward: you really should hold interrupts until the caller has done "awaitedLock = NULL;". So it's not quite enough to add a HOLD_INTERRUPTS()/RESUME_INTERRUPTS() pair at the end of ProcSleep(). You'd need to do the HOLD_INTERRUPTS() in ProcSleep() and require the caller to do the RESUME_INTERRUPTS(). In a sense, ProcSleep() downgrades the lock on the partition to just holding off interrupts.
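
To make the hand-off concrete, here is a rough standalone sketch of what I mean. It is not PostgreSQL source: the macros and ProcessInterrupts() are simplified stand-ins modeled on miscadmin.h, and mock_ProcSleep(), mock_WaitOnLock(), interrupts_serviced and serviced_while_awaiting are hypothetical names invented for the illustration. The only point is that the callee returns with interrupts held and the caller resumes them after clearing awaitedLock.

```c
/* Standalone mock of the hold-off hand-off; NOT PostgreSQL source. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins modeled on the miscadmin.h globals. */
static int  InterruptHoldoffCount = 0;
static bool InterruptPending = false;

/* Stand-in for the static awaitedLock pointer kept by the lock manager. */
static const void *awaitedLock = NULL;

/* Hypothetical bookkeeping, only here so the behavior can be observed. */
static int  interrupts_serviced = 0;
static bool serviced_while_awaiting = false;

#define HOLD_INTERRUPTS()   (InterruptHoldoffCount++)
#define RESUME_INTERRUPTS() \
    do { assert(InterruptHoldoffCount > 0); InterruptHoldoffCount--; } while (0)

static void
ProcessInterrupts(void)
{
    if (InterruptHoldoffCount != 0)
        return;                 /* held off: the interrupt stays pending */
    if (awaitedLock != NULL)
        serviced_while_awaiting = true;
    InterruptPending = false;
    interrupts_serviced++;
}

#define CHECK_FOR_INTERRUPTS() \
    do { if (InterruptPending) ProcessInterrupts(); } while (0)

/* Hypothetical callee: returns with interrupts still held off. */
static void
mock_ProcSleep(void)
{
    /* ... the real function would wait on the latch here ... */
    HOLD_INTERRUPTS();          /* replaces re-acquiring the partition lock */
}

/* Hypothetical caller: owns the matching RESUME_INTERRUPTS(). */
static void
mock_WaitOnLock(const void *lock)
{
    awaitedLock = lock;
    mock_ProcSleep();

    InterruptPending = true;    /* simulate a cancel arriving in the window */
    CHECK_FOR_INTERRUPTS();     /* does nothing while interrupts are held */

    awaitedLock = NULL;
    RESUME_INTERRUPTS();        /* safe now: awaitedLock is already cleared */
}
```

In this mock, the simulated cancel stays pending through the window and is only serviced by the first CHECK_FOR_INTERRUPTS() after the caller's RESUME_INTERRUPTS(), by which time awaitedLock is NULL.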

Overall +1 on this change to not re-acquire the partition lock.

--
Heikki Linnakangas
Neon (https://neon.tech)


