On Wed, May 10, 2023 at 5:34 PM Michael Paquier <[email protected]> wrote:
>
> + * NB: LWLockConflictsWithVar (which is called from
> + * LWLockWaitForVar) relies on the spinlock used above in this
> + * function and doesn't use a memory barrier.
>
> This patch adds the following comment in WaitXLogInsertionsToFinish()
> because lwlock.c on HEAD mentions that:
> /*
> * Test first to see if it the slot is free right now.
> *
> * XXX: the caller uses a spinlock before this, so we don't need a memory
> * barrier here as far as the current usage is concerned. But that might
> * not be safe in general.
> */
>
> Should it be something where we'd better be noisy about at the top of
> LWLockWaitForVar()? We don't want to add a memory barrier at the
> beginning of LWLockConflictsWithVar(), still it strikes me that
> somebody that aims at using LWLockWaitForVar() may miss this point
> because LWLockWaitForVar() is the routine published in lwlock.h, not
> LWLockConflictsWithVar(). This does not need to be really
> complicated, say a note at the top of LWLockWaitForVar() among the
> lines of (?):
> "Be careful that LWLockConflictsWithVar() does not include a memory
> barrier, hence the caller of this function may want to rely on an
> explicit barrier or a spinlock to avoid memory ordering issues."
+1. Now we have comments in 3 places warning that
LWLockConflictsWithVar doesn't use a memory barrier - one in
WaitXLogInsertionsToFinish, one in LWLockWaitForVar, and another
(existing) one in LWLockConflictsWithVar specifying where exactly a
memory barrier is needed if the caller doesn't use a spinlock. Looks
fine to me.
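As a rough standalone illustration (using C11 atomics rather than the
pg_atomic API, with hypothetical names), the conflict check boils down to a
plain atomic read whose ordering must come from a barrier or spinlock taken
by the caller beforehand:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the lock's insertingAt variable. */
static _Atomic uint64_t inserting_at;

/*
 * Sketch of the LWLockConflictsWithVar() fast path: read the variable
 * without any barrier of our own.  A relaxed load deliberately mirrors
 * pg_atomic_read_u64(), which has no barrier semantics; any required
 * memory ordering must be supplied by the caller (e.g. the spinlock
 * taken earlier in WaitXLogInsertionsToFinish()).
 */
static bool
conflicts_with_var(uint64_t oldval, uint64_t *newval)
{
    uint64_t value = atomic_load_explicit(&inserting_at,
                                          memory_order_relaxed);

    if (value != oldval)
    {
        *newval = value;
        return false;           /* value moved on; no need to wait */
    }
    return true;                /* still at oldval; caller must wait */
}
```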
> + * NB: pg_atomic_exchange_u64 is used here as opposed to just
> + * pg_atomic_write_u64 to update the variable. Since
> pg_atomic_exchange_u64
> + * is a full barrier, we're guaranteed that all loads and stores issued
> + * prior to setting the variable are completed before any loads or stores
> + * issued after setting the variable.
>
> This is the same explanation as LWLockUpdateVar(), except that we
> lose the details explaining why we are doing the update before
> releasing the lock.
I think what I have so far is more verbose than needed in explaining
what a barrier does and all that. I honestly think we don't need to be
that verbose, thanks to README.barrier.
I simplified those 2 comments to the following:
* NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
* the variable is updated before releasing the lock.
* NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
* the variable is updated before waking up waiters.
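For what it's worth, the ordering requirement behind those comments can be
sketched standalone with C11 atomics (hypothetical names, not the actual
pg_atomic API): a seq_cst exchange both stores the new value and acts as a
full barrier, so the waiter-flag check that follows cannot be reordered
ahead of the store:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

static _Atomic uint64_t var;            /* stand-in for insertingAt */
static _Atomic uint32_t lock_state;     /* stand-in for lock->state */

#define HAS_WAITERS (1u << 30)          /* hypothetical flag bit */

/*
 * Sketch of the LWLockUpdateVar() fast path: update the variable with a
 * full-barrier exchange first, then bail out early if nobody is waiting.
 * The full barrier guarantees the new value is visible before we inspect
 * (and possibly wake) the wait list.  Returns true when the fast path
 * was taken.
 */
static bool
update_var_fastpath(uint64_t val)
{
    /* seq_cst exchange = store + full memory barrier */
    atomic_exchange(&var, val);

    if ((atomic_load(&lock_state) & HAS_WAITERS) == 0)
        return true;    /* fast path: no waiters, skip wait-list lock */

    /* slow path: take the wait list lock and wake waiters (omitted) */
    return false;
}
```

A plain relaxed store in place of the exchange would leave the store free
to be reordered past the flag check on weakly ordered hardware, which is
exactly why the patch uses pg_atomic_exchange_u64 rather than
pg_atomic_write_u64.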
Please find the attached v7 patch.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 8e4eeccadc106381bc2898c1887109f96c3db869 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <[email protected]>
Date: Thu, 18 May 2023 05:17:05 +0000
Subject: [PATCH v7] Optimize WAL insertion lock acquisition and release
This commit optimizes WAL insertion lock acquisition and release
in the following way:
1. WAL insertion lock's variable insertingAt is currently read and
written under lwlock's wait list lock to avoid torn
reads/writes. This wait list lock can become a point of
contention on highly concurrent write workloads. Therefore, make
insertingAt a 64-bit atomic, which inherently provides torn-free
reads/writes.
2. LWLockUpdateVar currently acquires lwlock's wait list lock even
when there are no waiters at all. Add a fastpath exit to
LWLockUpdateVar when there are no waiters to avoid unnecessary
locking.
Note that an atomic exchange operation (which is a full barrier) is
used where necessary, instead of an atomic write, to ensure that
memory ordering is preserved.
It also adds notes on why LWLockConflictsWithVar doesn't need a
memory barrier as far as its current usage is concerned.
Suggested-by: Andres Freund
Author: Bharath Rupireddy
Reviewed-by: Nathan Bossart
Reviewed-by: Michael Paquier
Discussion: https://www.postgresql.org/message-id/20221124184619.xit4sfi52bcz2tva%40awork3.anarazel.de
---
src/backend/access/transam/xlog.c | 8 +++-
src/backend/storage/lmgr/lwlock.c | 64 +++++++++++++++++++------------
src/include/storage/lwlock.h | 6 +--
3 files changed, 48 insertions(+), 30 deletions(-)
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index bc5a8e0569..92b0b87d1e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -376,7 +376,7 @@ typedef struct XLogwrtResult
typedef struct
{
LWLock lock;
- XLogRecPtr insertingAt;
+ pg_atomic_uint64 insertingAt;
XLogRecPtr lastImportantAt;
} WALInsertLock;
@@ -1495,6 +1495,10 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto)
* calling LWLockUpdateVar. But if it has to sleep, it will
* advertise the insertion point with LWLockUpdateVar before
* sleeping.
+ *
+ * NB: LWLockConflictsWithVar (which is called from
+ * LWLockWaitForVar) relies on the spinlock used above in this
+ * function and doesn't use a memory barrier.
*/
if (LWLockWaitForVar(&WALInsertLocks[i].l.lock,
&WALInsertLocks[i].l.insertingAt,
@@ -4611,7 +4615,7 @@ XLOGShmemInit(void)
for (i = 0; i < NUM_XLOGINSERT_LOCKS; i++)
{
LWLockInitialize(&WALInsertLocks[i].l.lock, LWTRANCHE_WAL_INSERT);
- WALInsertLocks[i].l.insertingAt = InvalidXLogRecPtr;
+ pg_atomic_init_u64(&WALInsertLocks[i].l.insertingAt, InvalidXLogRecPtr);
WALInsertLocks[i].l.lastImportantAt = InvalidXLogRecPtr;
}
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 59347ab951..65a7f64314 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -1547,9 +1547,8 @@ LWLockAcquireOrWait(LWLock *lock, LWLockMode mode)
* *result is set to true if the lock was free, and false otherwise.
*/
static bool
-LWLockConflictsWithVar(LWLock *lock,
- uint64 *valptr, uint64 oldval, uint64 *newval,
- bool *result)
+LWLockConflictsWithVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval,
+ uint64 *newval, bool *result)
{
bool mustwait;
uint64 value;
@@ -1572,13 +1571,11 @@ LWLockConflictsWithVar(LWLock *lock,
*result = false;
/*
- * Read value using the lwlock's wait list lock, as we can't generally
- * rely on atomic 64 bit reads/stores. TODO: On platforms with a way to
- * do atomic 64 bit reads/writes the spinlock should be optimized away.
+ * Reading the value atomically ensures that we don't need any explicit
+ * locking here. Note that on platforms without native 64-bit atomics,
+ * postgres's 64-bit atomic API falls back to spinlock-based emulation.
*/
- LWLockWaitListLock(lock);
- value = *valptr;
- LWLockWaitListUnlock(lock);
+ value = pg_atomic_read_u64(valptr);
if (value != oldval)
{
@@ -1605,9 +1602,14 @@ LWLockConflictsWithVar(LWLock *lock,
*
* Note: this function ignores shared lock holders; if the lock is held
* in shared mode, returns 'true'.
+ *
+ * Be careful that LWLockConflictsWithVar() does not include a memory barrier,
+ * hence the caller of this function may want to rely on an explicit barrier or
+ * a spinlock to avoid memory ordering issues.
*/
bool
-LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
+LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval,
+ uint64 *newval)
{
PGPROC *proc = MyProc;
int extraWaits = 0;
@@ -1735,29 +1737,43 @@ LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval)
* LWLockUpdateVar - Update a variable and wake up waiters atomically
*
* Sets *valptr to 'val', and wakes up all processes waiting for us with
- * LWLockWaitForVar(). Setting the value and waking up the processes happen
- * atomically so that any process calling LWLockWaitForVar() on the same lock
- * is guaranteed to see the new value, and act accordingly.
+ * LWLockWaitForVar(). It first sets the value atomically and then wakes up
+ * the waiting processes so that any process calling LWLockWaitForVar() on the
+ * same lock is guaranteed to see the new value, and act accordingly.
*
* The caller must be holding the lock in exclusive mode.
*/
void
-LWLockUpdateVar(LWLock *lock, uint64 *valptr, uint64 val)
+LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
{
proclist_head wakeup;
proclist_mutable_iter iter;
PRINT_LWDEBUG("LWLockUpdateVar", lock, LW_EXCLUSIVE);
+ /*
+ * Update the variable atomically first, without acquiring the wait
+ * list lock, so that anyone looking for the lock has a chance to grab
+ * it a bit more quickly.
+ *
+ * NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
+ * the variable is updated before waking up waiters.
+ */
+ pg_atomic_exchange_u64(valptr, val);
+
+ /*
+ * Quick exit when there are no waiters. This avoids unnecessary
+ * acquisition and release of the lwlock's wait list lock.
+ */
+ if ((pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS) == 0)
+ return;
+
proclist_init(&wakeup);
LWLockWaitListLock(lock);
Assert(pg_atomic_read_u32(&lock->state) & LW_VAL_EXCLUSIVE);
- /* Update the lock's value */
- *valptr = val;
-
/*
* See if there are any LW_WAIT_UNTIL_FREE waiters that need to be woken
* up. They are always in the front of the queue.
@@ -1873,17 +1889,15 @@ LWLockRelease(LWLock *lock)
* LWLockReleaseClearVar - release a previously acquired lock, reset variable
*/
void
-LWLockReleaseClearVar(LWLock *lock, uint64 *valptr, uint64 val)
+LWLockReleaseClearVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
{
- LWLockWaitListLock(lock);
-
/*
- * Set the variable's value before releasing the lock, that prevents race
- * a race condition wherein a new locker acquires the lock, but hasn't yet
- * set the variables value.
+ * Update the variable atomically first.
+ *
+ * NB: pg_atomic_exchange_u64, having full barrier semantics, will ensure
+ * the variable is updated before releasing the lock.
*/
- *valptr = val;
- LWLockWaitListUnlock(lock);
+ pg_atomic_exchange_u64(valptr, val);
LWLockRelease(lock);
}
diff --git a/src/include/storage/lwlock.h b/src/include/storage/lwlock.h
index d2c7afb8f4..f19bc49193 100644
--- a/src/include/storage/lwlock.h
+++ b/src/include/storage/lwlock.h
@@ -128,14 +128,14 @@ extern bool LWLockAcquire(LWLock *lock, LWLockMode mode);
extern bool LWLockConditionalAcquire(LWLock *lock, LWLockMode mode);
extern bool LWLockAcquireOrWait(LWLock *lock, LWLockMode mode);
extern void LWLockRelease(LWLock *lock);
-extern void LWLockReleaseClearVar(LWLock *lock, uint64 *valptr, uint64 val);
+extern void LWLockReleaseClearVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
extern void LWLockReleaseAll(void);
extern bool LWLockHeldByMe(LWLock *lock);
extern bool LWLockAnyHeldByMe(LWLock *lock, int nlocks, size_t stride);
extern bool LWLockHeldByMeInMode(LWLock *lock, LWLockMode mode);
-extern bool LWLockWaitForVar(LWLock *lock, uint64 *valptr, uint64 oldval, uint64 *newval);
-extern void LWLockUpdateVar(LWLock *lock, uint64 *valptr, uint64 val);
+extern bool LWLockWaitForVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 oldval, uint64 *newval);
+extern void LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val);
extern Size LWLockShmemSize(void);
extern void CreateLWLocks(void);
--
2.34.1