Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2023-01-24 Thread Bharath Rupireddy
On Tue, Dec 6, 2022 at 12:00 AM Andres Freund  wrote:
>
> Hi

Thanks for reviewing.

> FWIW, I don't see an advantage in 0003. If it allows us to make something else
> simpler / faster, cool, but on its own it doesn't seem worthwhile.

I've discarded this change.

> On 2022-12-02 16:31:58 -0800, Nathan Bossart wrote:
> > On Fri, Dec 02, 2022 at 04:32:38PM +0530, Bharath Rupireddy wrote:
> > > On Fri, Dec 2, 2022 at 6:10 AM Andres Freund  wrote:
> >> I'm not sure this is quite right - don't we need a memory barrier? But I
> >> don't see a reason to not just leave this code as-is. I think this should
> >> be optimized entirely in lwlock.c.
> > >
> > > Actually, we don't need that at all as LWLockWaitForVar() will return
> > > immediately if the lock is free. So, I removed it.
> >
> > I briefly looked at the latest patch set, and I'm curious how this change
> > avoids introducing memory ordering bugs.  Perhaps I am missing something
> > obvious.
>
> I'm a bit confused too - the comment above talks about LWLockWaitForVar(), but
> the patches seem to optimize LWLockUpdateVar().
>
> I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
>
>  * Test first to see if it the slot is free right now.
>  *
>  * XXX: the caller uses a spinlock before this, so we don't need a memory
>  * barrier here as far as the current usage is concerned.  But that might
>  * not be safe in general.
>
> which happens to be true in the single, indirect, caller:
>
> /* Read the current insert position */
> SpinLockAcquire(&Insert->insertpos_lck);
> bytepos = Insert->CurrBytePos;
> SpinLockRelease(&Insert->insertpos_lck);
> reservedUpto = XLogBytePosToEndRecPtr(bytepos);
>
> I think at the very least we ought to have a comment in
> WaitXLogInsertionsToFinish() highlighting this.

Done.
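
(In short, the new comment records the point above: the spinlock
acquired just before reading CurrBytePos is what provides the memory
barrier that the subsequent reads of the insertingAt variables rely
on.)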

> It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
> is safe. I think at the very least we could end up missing waiters that we
> should have woken up.
>
> I think it ought to be safe to do something like
>
> pg_atomic_exchange_u64()..
> if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
>   return;
>
> because the pg_atomic_exchange_u64() will provide the necessary memory
> barrier.

Done.

I'm attaching the v3 patch with the above review comments addressed;
hopefully there are no memory ordering issues now. FWIW, I've added it
to the CF: https://commitfest.postgresql.org/42/4141/.
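
The fastpath now has roughly the following shape (a minimal sketch of
the idea, not the patch verbatim; the wakeup logic after the early
return is unchanged):

void
LWLockUpdateVar(LWLock *lock, pg_atomic_uint64 *valptr, uint64 val)
{
	/*
	 * Update the variable with a full-barrier atomic exchange, so the
	 * new value is published before we check for waiters.
	 */
	pg_atomic_exchange_u64(valptr, val);

	/*
	 * If nobody is waiting on this lock, there is no one to wake up,
	 * so we can skip taking the wait list lock entirely.
	 */
	if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
		return;

	/* ... existing slow path: LWLockWaitListLock(), wake up waiters ... */
}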

Test results with the v3 patch and the insert workload match those of
the earlier run - TPS continues to scale beyond 512 clients as
expected, and peaks at about 2x HEAD with 2048 and 4096 clients.

HEAD (clients, tps):
1 1380.411086
2 1358.378988
4 2701.974332
8 5925.380744
16 10956.501237
32 20877.513953
64 40838.046774
128 70251.744161
256 108114.321299
512 120478.988268
768 99140.425209
1024 93645.984364
2048 70111.159909
4096 55541.804826

v3 PATCHED (clients, tps):
1 1493.800209
2 1569.414953
4 3154.186605
8 5965.578904
16 11912.587645
32 22720.964908
64 42001.094528
128 78361.158983
256 110457.926232
512 148941.378393
768 167256.590308
1024 155510.675372
2048 147499.376882
4096 119375.457779

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v3-0001-Optimize-WAL-insertion-lock-acquisition-and-relea.patch
Description: Binary data


Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2023-01-09 Thread Andres Freund
Hi,

On 2022-12-08 12:29:54 +0530, Bharath Rupireddy wrote:
> On Tue, Dec 6, 2022 at 12:00 AM Andres Freund  wrote:
> > I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> > pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
> >
> >  * Test first to see if it the slot is free right now.
> >  *
> >  * XXX: the caller uses a spinlock before this, so we don't need a memory
> >  * barrier here as far as the current usage is concerned.  But that might
> >  * not be safe in general.
> >
> > which happens to be true in the single, indirect, caller:
> >
> > /* Read the current insert position */
> > SpinLockAcquire(&Insert->insertpos_lck);
> > bytepos = Insert->CurrBytePos;
> > SpinLockRelease(&Insert->insertpos_lck);
> > reservedUpto = XLogBytePosToEndRecPtr(bytepos);
> >
> > I think at the very least we ought to have a comment in
> > WaitXLogInsertionsToFinish() highlighting this.
> 
> So, using a spinlock ensures no memory ordering occurs while reading
> lock->state in LWLockConflictsWithVar()?

No, a spinlock *does* imply ordering. But your patch does remove several of
the spinlock acquisitions (via LWLockWaitListLock()), and it moved the
assignment in LWLockUpdateVar() out from under the spinlock.

If you remove spinlock operations (or other barrier primitives), you need to
make sure that such modifications don't break the required memory ordering.


> How does spinlock that gets acquired and released in the caller
> WaitXLogInsertionsToFinish() itself and the memory barrier in the called
> function LWLockConflictsWithVar() relate here? Can you please help me
> understand this a bit?

The caller's barrier means that we'll see values that are at least as "up to
date" as at the time of the barrier (it's a bit more complicated than that, a
barrier always needs to be paired with another barrier).
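
To make the pairing concrete for this case (a rough sketch built from
the calls discussed in this thread, not exact code):

/* reader side, WaitXLogInsertionsToFinish(): the spinlock acts as the
 * barrier on this side */
SpinLockAcquire(&Insert->insertpos_lck);
bytepos = Insert->CurrBytePos;
SpinLockRelease(&Insert->insertpos_lck);

/* writer side, e.g. an LWLockUpdateVar() fastpath: the full-barrier
 * exchange is what it pairs with */
pg_atomic_exchange_u64(valptr, val);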


> > It's not at all clear to me that the proposed fast-path for 
> > LWLockUpdateVar()
> > is safe. I think at the very least we could end up missing waiters that we
> > should have woken up.
> >
> > I think it ought to be safe to do something like
> >
> > pg_atomic_exchange_u64()..
> > if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
> >   return;
> 
> pg_atomic_exchange_u64(&lock->state, exchange_with_what_)? Exchange will
> change the value, no?

Not lock->state, but the atomic passed to LWLockUpdateVar(), which we do want
to update. A pg_atomic_exchange_u64() includes a memory barrier.


> > because the pg_atomic_exchange_u64() will provide the necessary memory
> > barrier.
> 
> I'm reading some comments [1], are these also true for 64-bit atomic
> CAS?

Yes. See
/* ----------
 * The 64 bit operations have the same semantics as their 32bit counterparts
 * if they are available. Check the corresponding 32bit function for
 * documentation.
 * ----------
 */


> Does it mean that an atomic CAS operation inherently provides a
> memory barrier?

Yes.
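
For example (a sketch, assuming 'var' is a pg_atomic_uint64 and
'oldval'/'newval' are uint64 values; the barrier applies whether or not
the swap succeeds):

uint64		expected = oldval;

/*
 * pg_atomic_compare_exchange_u64() has full barrier semantics, like its
 * 32bit counterpart: loads/stores before it cannot be reordered past it,
 * and loads/stores after it cannot be reordered before it.
 */
if (!pg_atomic_compare_exchange_u64(&var, &expected, newval))
{
	/* not swapped: 'expected' now holds the current value of 'var' */
}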


> Can you please point me if it's described better somewhere else?

I'm not sure what you'd like to have described more extensively, tbh.

Greetings,

Andres Freund




Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2022-12-07 Thread Bharath Rupireddy
On Tue, Dec 6, 2022 at 12:00 AM Andres Freund  wrote:
>
> FWIW, I don't see an advantage in 0003. If it allows us to make something else
> simpler / faster, cool, but on its own it doesn't seem worthwhile.

Thanks. I will discard it.

> I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
> pre-existing, quite crufty, code. LWLockConflictsWithVar() says:
>
>  * Test first to see if it the slot is free right now.
>  *
>  * XXX: the caller uses a spinlock before this, so we don't need a memory
>  * barrier here as far as the current usage is concerned.  But that might
>  * not be safe in general.
>
> which happens to be true in the single, indirect, caller:
>
> /* Read the current insert position */
> SpinLockAcquire(&Insert->insertpos_lck);
> bytepos = Insert->CurrBytePos;
> SpinLockRelease(&Insert->insertpos_lck);
> reservedUpto = XLogBytePosToEndRecPtr(bytepos);
>
> I think at the very least we ought to have a comment in
> WaitXLogInsertionsToFinish() highlighting this.

So, using a spinlock ensures no memory ordering occurs while reading
lock->state in LWLockConflictsWithVar()? How does spinlock that gets
acquired and released in the caller WaitXLogInsertionsToFinish()
itself and the memory barrier in the called function
LWLockConflictsWithVar() relate here? Can you please help me
understand this a bit?

> It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
> is safe. I think at the very least we could end up missing waiters that we
> should have woken up.
>
> I think it ought to be safe to do something like
>
> pg_atomic_exchange_u64()..
> if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
>   return;

pg_atomic_exchange_u64(&lock->state, exchange_with_what_)? Exchange
will change the value, no?

> because the pg_atomic_exchange_u64() will provide the necessary memory
> barrier.

I'm reading some comments [1], are these also true for 64-bit atomic
CAS? Does it mean that an atomic CAS operation inherently provides a
memory barrier? Can you please point me if it's described better
somewhere else?

[1]
 * Full barrier semantics.
 */
static inline uint32
pg_atomic_exchange_u32(volatile pg_atomic_uint32 *ptr, uint32 newval)

/*
 * Get and clear the flags that are set for this backend. Note that
 * pg_atomic_exchange_u32 is a full barrier, so we're guaranteed that the
 * read of the barrier generation above happens before we atomically
 * extract the flags, and that any subsequent state changes happen
 * afterward.
 */

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2022-12-05 Thread Andres Freund
Hi,

FWIW, I don't see an advantage in 0003. If it allows us to make something else
simpler / faster, cool, but on its own it doesn't seem worthwhile.




On 2022-12-02 16:31:58 -0800, Nathan Bossart wrote:
> On Fri, Dec 02, 2022 at 04:32:38PM +0530, Bharath Rupireddy wrote:
> > On Fri, Dec 2, 2022 at 6:10 AM Andres Freund  wrote:
> >> I'm not sure this is quite right - don't we need a memory barrier? But I
> >> don't see a reason to not just leave this code as-is. I think this should
> >> be optimized entirely in lwlock.c.
> > 
> > Actually, we don't need that at all as LWLockWaitForVar() will return
> > immediately if the lock is free. So, I removed it.
> 
> I briefly looked at the latest patch set, and I'm curious how this change
> avoids introducing memory ordering bugs.  Perhaps I am missing something
> obvious.

I'm a bit confused too - the comment above talks about LWLockWaitForVar(), but
the patches seem to optimize LWLockUpdateVar().


I think it'd be safe to optimize LWLockConflictsWithVar(), due to some
pre-existing, quite crufty, code. LWLockConflictsWithVar() says:

 * Test first to see if it the slot is free right now.
 *
 * XXX: the caller uses a spinlock before this, so we don't need a memory
 * barrier here as far as the current usage is concerned.  But that might
 * not be safe in general.

which happens to be true in the single, indirect, caller:

/* Read the current insert position */
SpinLockAcquire(&Insert->insertpos_lck);
bytepos = Insert->CurrBytePos;
SpinLockRelease(&Insert->insertpos_lck);
reservedUpto = XLogBytePosToEndRecPtr(bytepos);

I think at the very least we ought to have a comment in
WaitXLogInsertionsToFinish() highlighting this.



It's not at all clear to me that the proposed fast-path for LWLockUpdateVar()
is safe. I think at the very least we could end up missing waiters that we
should have woken up.

I think it ought to be safe to do something like

pg_atomic_exchange_u64()..
if (!(pg_atomic_read_u32(&lock->state) & LW_FLAG_HAS_WAITERS))
  return;

because the pg_atomic_exchange_u64() will provide the necessary memory
barrier.
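
(The ordering matters because a waiter queues itself - setting
LW_FLAG_HAS_WAITERS - and then re-checks the variable before it goes to
sleep. With the exchange publishing the new value and acting as a full
barrier before the flag is read, either the updater sees the flag and
wakes the waiter, or the waiter sees the new value and doesn't sleep.)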

Greetings,

Andres Freund




Re: WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2022-12-02 Thread Nathan Bossart
On Fri, Dec 02, 2022 at 04:32:38PM +0530, Bharath Rupireddy wrote:
> On Fri, Dec 2, 2022 at 6:10 AM Andres Freund  wrote:
>> I'm not sure this is quite right - don't we need a memory barrier? But I
>> don't see a reason to not just leave this code as-is. I think this should
>> be optimized entirely in lwlock.c.
> 
> Actually, we don't need that at all as LWLockWaitForVar() will return
> immediately if the lock is free. So, I removed it.

I briefly looked at the latest patch set, and I'm curious how this change
avoids introducing memory ordering bugs.  Perhaps I am missing something
obvious.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




WAL Insertion Lock Improvements (was: Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish())

2022-12-02 Thread Bharath Rupireddy
On Fri, Dec 2, 2022 at 6:10 AM Andres Freund  wrote:
>
> On 2022-11-25 16:54:19 +0530, Bharath Rupireddy wrote:
> > On Fri, Nov 25, 2022 at 12:16 AM Andres Freund  wrote:
> > > I think we could improve this code more significantly by avoiding the
> > > call to LWLockWaitForVar() for all locks that aren't acquired or don't
> > > have a conflicting insertingAt, that'd require just a bit more work to
> > > handle systems without tear-free 64bit writes/reads.
> > >
> > > The easiest way would probably be to just make insertingAt a 64bit atomic,
> > > that transparently does the required work to make even non-atomic
> > > read/writes tear free. Then we could trivially avoid the spinlock in
> > > LWLockConflictsWithVar(), LWLockReleaseClearVar() and with just a bit more
> > > work add a fastpath to LWLockUpdateVar(). We don't need to acquire the
> > > wait list lock if there aren't any waiters.
> > >
> > > I'd bet that'd start to have visible effects in a workload with many small
> > > records.
> >
> > Thanks Andres! I quickly came up with the attached patch. I also ran
> > an insert test [1], below are the results. I also attached the results
> > graph. The cirrus-ci is happy with the patch -
> > https://github.com/BRupireddy/postgres/tree/wal_insertion_lock_improvements_v1_2.
> > Please let me know if the direction of the patch seems right.
> > clients    HEAD      PATCHED
> > 1          1354      1499
> > 2          1451      1464
> > 4          3069      3073
> > 8          5712      5797
> > 16         11331     11157
> > 32         22020     22074
> > 64         41742     42213
> > 128        71300     76638
> > 256        103652    118944
> > 512        111250    161582
> > 768        99544     161987
> > 1024       96743     164161
> > 2048       72711     156686
> > 4096       54158     135713
>
> Nice.

Thanks for taking a look at it.

> > From 293e789f9c1a63748147acd613c556961f1dc5c4 Mon Sep 17 00:00:00 2001
> > From: Bharath Rupireddy 
> > Date: Fri, 25 Nov 2022 10:53:56 +
> > Subject: [PATCH v1] WAL Insertion Lock Improvements
> >
> > ---
> >  src/backend/access/transam/xlog.c |  8 +++--
> >  src/backend/storage/lmgr/lwlock.c | 56 +--
> >  src/include/storage/lwlock.h  |  7 ++--
> >  3 files changed, 41 insertions(+), 30 deletions(-)
> >
> > diff --git a/src/backend/access/transam/xlog.c 
> > b/src/backend/access/transam/xlog.c
> > index a31fbbff78..b3f758abb3 100644
> > --- a/src/backend/access/transam/xlog.c
> > +++ b/src/backend/access/transam/xlog.c
> > @@ -376,7 +376,7 @@ typedef struct XLogwrtResult
> >  typedef struct
> >  {
> >   LWLock  lock;
> > - XLogRecPtr  insertingAt;
> > + pg_atomic_uint64    insertingAt;
> >   XLogRecPtr  lastImportantAt;
> >  } WALInsertLock;
> >
> > @@ -1482,6 +1482,10 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto)
> >   {
> >   XLogRecPtr  insertingat = InvalidXLogRecPtr;
> >
> > + /* Quickly check and continue if no one holds the lock. */
> > + if (!IsLWLockHeld(&WALInsertLocks[i].l.lock))
> > + continue;
>
> I'm not sure this is quite right - don't we need a memory barrier? But I don't
> see a reason to not just leave this code as-is. I think this should be
> optimized entirely in lwlock.c.

Actually, we don't need that at all as LWLockWaitForVar() will return
immediately if the lock is free. So, I removed it.

> I'd probably split the change to an atomic from other changes either way.

Done. I've added commit messages to each of the patches.

I've also brought the patch from [1] here as 0003.

Thoughts?

[1] 
https://www.postgresql.org/message-id/CALj2ACXtQdrGXtb%3DrbUOXddm1wU1vD9z6q_39FQyX0166dq%3D%3DA%40mail.gmail.com

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v2-0001-Make-insertingAt-64-bit-atomic.patch
Description: Binary data


v2-0002-Add-fastpath-to-LWLockUpdateVar.patch
Description: Binary data


v2-0003-Make-lastImportantAt-64-bit-atomic.patch
Description: Binary data


Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish()

2022-12-01 Thread Andres Freund
On 2022-11-25 16:54:19 +0530, Bharath Rupireddy wrote:
> On Fri, Nov 25, 2022 at 12:16 AM Andres Freund  wrote:
> > I think we could improve this code more significantly by avoiding the
> > call to LWLockWaitForVar() for all locks that aren't acquired or don't
> > have a conflicting insertingAt, that'd require just a bit more work to
> > handle systems without tear-free 64bit writes/reads.
> >
> > The easiest way would probably be to just make insertingAt a 64bit atomic,
> > that transparently does the required work to make even non-atomic
> > read/writes tear free. Then we could trivially avoid the spinlock in
> > LWLockConflictsWithVar(), LWLockReleaseClearVar() and with just a bit more
> > work add a fastpath to LWLockUpdateVar(). We don't need to acquire the
> > wait list lock if there aren't any waiters.
> >
> > I'd bet that'd start to have visible effects in a workload with many small
> > records.
> 
> Thanks Andres! I quickly came up with the attached patch. I also ran
> an insert test [1], below are the results. I also attached the results
> graph. The cirrus-ci is happy with the patch -
> https://github.com/BRupireddy/postgres/tree/wal_insertion_lock_improvements_v1_2.
> Please let me know if the direction of the patch seems right.
> clients    HEAD      PATCHED
> 1          1354      1499
> 2          1451      1464
> 4          3069      3073
> 8          5712      5797
> 16         11331     11157
> 32         22020     22074
> 64         41742     42213
> 128        71300     76638
> 256        103652    118944
> 512        111250    161582
> 768        99544     161987
> 1024       96743     164161
> 2048       72711     156686
> 4096       54158     135713

Nice.


> From 293e789f9c1a63748147acd613c556961f1dc5c4 Mon Sep 17 00:00:00 2001
> From: Bharath Rupireddy 
> Date: Fri, 25 Nov 2022 10:53:56 +
> Subject: [PATCH v1] WAL Insertion Lock Improvements
> 
> ---
>  src/backend/access/transam/xlog.c |  8 +++--
>  src/backend/storage/lmgr/lwlock.c | 56 +--
>  src/include/storage/lwlock.h  |  7 ++--
>  3 files changed, 41 insertions(+), 30 deletions(-)
> 
> diff --git a/src/backend/access/transam/xlog.c 
> b/src/backend/access/transam/xlog.c
> index a31fbbff78..b3f758abb3 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -376,7 +376,7 @@ typedef struct XLogwrtResult
>  typedef struct
>  {
>   LWLock  lock;
> - XLogRecPtr  insertingAt;
> + pg_atomic_uint64    insertingAt;
>   XLogRecPtr  lastImportantAt;
>  } WALInsertLock;
>  
> @@ -1482,6 +1482,10 @@ WaitXLogInsertionsToFinish(XLogRecPtr upto)
>   {
>   XLogRecPtr  insertingat = InvalidXLogRecPtr;
>  
> + /* Quickly check and continue if no one holds the lock. */
> + if (!IsLWLockHeld(&WALInsertLocks[i].l.lock))
> + continue;

I'm not sure this is quite right - don't we need a memory barrier? But I don't
see a reason to not just leave this code as-is. I think this should be
optimized entirely in lwlock.c.


I'd probably split the change to an atomic from other changes either way.

Greetings,

Andres Freund




Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish()

2022-11-25 Thread Bharath Rupireddy
On Fri, Nov 25, 2022 at 12:16 AM Andres Freund  wrote:
>
> Hi,
>
> On 2022-11-24 18:13:10 +0530, Bharath Rupireddy wrote:
> > With that said, here's a small improvement I can think of, that is, to
> > avoid calling LWLockWaitForVar() for the WAL insertion lock the caller
> > of WaitXLogInsertionsToFinish() currently holds. Since
> > LWLockWaitForVar() does a bunch of things - holds interrupts, does
> > atomic reads, acquires and releases wait list lock and so on, avoiding
> > it may be a good idea IMO.
>
> That doesn't seem like a big win. We're still going to call LWLockWaitForVar()
> for all the other locks.
>
> I think we could improve this code more significantly by avoiding the call to
> LWLockWaitForVar() for all locks that aren't acquired or don't have a
> conflicting insertingAt, that'd require just a bit more work to handle systems
> without tear-free 64bit writes/reads.
>
> The easiest way would probably be to just make insertingAt a 64bit atomic,
> that transparently does the required work to make even non-atomic read/writes
> tear free. Then we could trivially avoid the spinlock in
> LWLockConflictsWithVar(), LWLockReleaseClearVar() and with just a bit more
> work add a fastpath to LWLockUpdateVar(). We don't need to acquire the wait
> list lock if there aren't any waiters.
>
> I'd bet that'd start to have visible effects in a workload with many small
> records.

Thanks Andres! I quickly came up with the attached patch. I also ran
an insert test [1], below are the results. I also attached the results
graph. The cirrus-ci is happy with the patch -
https://github.com/BRupireddy/postgres/tree/wal_insertion_lock_improvements_v1_2.
Please let me know if the direction of the patch seems right.
clients    HEAD      PATCHED
1          1354      1499
2          1451      1464
4          3069      3073
8          5712      5797
16         11331     11157
32         22020     22074
64         41742     42213
128        71300     76638
256        103652    118944
512        111250    161582
768        99544     161987
1024       96743     164161
2048       72711     156686
4096       54158     135713

[1]
./configure --prefix=$PWD/inst/ CFLAGS="-O3" > install.log && make -j 8 install > install.log 2>&1 &
cd inst/bin
./pg_ctl -D data -l logfile stop
rm -rf data logfile
free -m
sudo su -c 'sync; echo 3 > /proc/sys/vm/drop_caches'
free -m
./initdb -D data
./pg_ctl -D data -l logfile start
./psql -d postgres -c 'ALTER SYSTEM SET shared_buffers = "8GB";'
./psql -d postgres -c 'ALTER SYSTEM SET max_wal_size = "16GB";'
./psql -d postgres -c 'ALTER SYSTEM SET max_connections = "4096";'
./pg_ctl -D data -l logfile restart
./pgbench -i -s 1 -d postgres
./psql -d postgres -c "ALTER TABLE pgbench_accounts DROP CONSTRAINT
pgbench_accounts_pkey;"
cat << EOF >> insert.sql
\set aid random(1, 10 * :scale)
\set delta random(1, 10 * :scale)
INSERT INTO pgbench_accounts (aid, bid, abalance) VALUES (:aid, :aid, :delta);
EOF
ulimit -S -n 5000
for c in 1 2 4 8 16 32 64 128 256 512 768 1024 2048 4096; do
  echo -n "$c "
  ./pgbench -n -M prepared -U ubuntu postgres -f insert.sql -c$c -j$c -T5 2>&1 | grep '^tps' | awk '{print $3}'
done

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-WAL-Insertion-Lock-Improvements.patch
Description: Binary data


Re: Avoid LWLockWaitForVar() for currently held WAL insertion lock in WaitXLogInsertionsToFinish()

2022-11-24 Thread Andres Freund
Hi,

On 2022-11-24 18:13:10 +0530, Bharath Rupireddy wrote:
> With that said, here's a small improvement I can think of, that is, to
> avoid calling LWLockWaitForVar() for the WAL insertion lock the caller
> of WaitXLogInsertionsToFinish() currently holds. Since
> LWLockWaitForVar() does a bunch of things - holds interrupts, does
> atomic reads, acquires and releases wait list lock and so on, avoiding
> it may be a good idea IMO.

That doesn't seem like a big win. We're still going to call LWLockWaitForVar()
for all the other locks.

I think we could improve this code more significantly by avoiding the call to
LWLockWaitForVar() for all locks that aren't acquired or don't have a
conflicting insertingAt, that'd require just a bit more work to handle systems
without tear-free 64bit writes/reads.

The easiest way would probably be to just make insertingAt a 64bit atomic,
that transparently does the required work to make even non-atomic read/writes
tear free. Then we could trivially avoid the spinlock in
LWLockConflictsWithVar(), LWLockReleaseClearVar() and with just a bit more
work add a fastpath to LWLockUpdateVar(). We don't need to acquire the wait
list lock if there aren't any waiters.
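
With insertingAt as a 64bit atomic, the conflict check could then look
roughly like this (a sketch under that assumption, not a worked-out
patch):

static bool
LWLockConflictsWithVar(LWLock *lock, pg_atomic_uint64 *valptr,
					   uint64 oldval, uint64 *newval, bool *result)
{
	bool		mustwait;
	uint64		value;

	/* Test first to see if the lock is free right now. */
	mustwait = (pg_atomic_read_u32(&lock->state) & LW_VAL_EXCLUSIVE) != 0;

	if (!mustwait)
	{
		*result = true;			/* lock is free, nothing to wait for */
		return false;
	}

	*result = false;

	/*
	 * A plain atomic read is tear-free, so the wait list lock (or a
	 * spinlock) is no longer needed just to read the variable.
	 */
	value = pg_atomic_read_u64(valptr);

	if (value != oldval)
	{
		mustwait = false;
		*newval = value;		/* holder has advanced past oldval */
	}
	else
		mustwait = true;

	return mustwait;
}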

I'd bet that'd start to have visible effects in a workload with many small
records.

Greetings,

Andres Freund