Re: Latches vs lwlock contention
On 28/09/2023 12:58, Heikki Linnakangas wrote:
> On 28/10/2022 06:56, Thomas Munro wrote:
>> One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE t; ... and then N other sessions wait in SELECT * FROM t;, and then you run ... COMMIT;, you'll see the first session wake all the others while it still holds the partition lock itself.  They'll all wake up and begin to re-acquire the same partition lock in exclusive mode, immediately go back to sleep on *that* wait list, and then wake each other up one at a time in a chain.  We could avoid the first double-bounce by not setting the latches until after we've released the partition lock.  We could avoid the rest of them by not re-acquiring the partition lock at all, which ... if I'm reading right ... shouldn't actually be necessary in modern PostgreSQL?  Or if there is another reason to re-acquire then maybe the comment should be updated.
>
> ISTM that the change to not re-acquire the lock in ProcSleep is independent from the other changes. Let's split that off to a separate patch.

I spent some time on splitting that off. I had to start from scratch, because commit 2346df6fc373df9c5ab944eebecf7d3036d727de conflicted heavily with your patch.

I split ProcSleep() into two functions: JoinWaitQueue does the first part of ProcSleep(), adding the process to the wait queue and checking for the dontWait and early-deadlock cases. What remains in ProcSleep() does just the sleeping part. JoinWaitQueue is called with the partition lock held, and ProcSleep() is called without it. This way, the partition lock is acquired and released in the same function (LockAcquireExtended), avoiding awkward "lock is held on enter, but might be released on exit depending on the outcome" logic.

This is actually a set of 8 patches. The first 7 are independent tiny fixes and refactorings in these functions. See individual commit messages.

> I agree it should be safe.
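The control-flow change described above can be sketched in miniature. This is a standalone model, not the real PostgreSQL code: JoinWaitQueue, ProcSleep, and the partition lock are reduced to stubs, and all the names below are illustrative analogues. The point is only the shape: the "lock" is acquired and released in one function, and sleeping happens without it.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified model of the JoinWaitQueue/ProcSleep split. */
typedef enum { JOIN_OK, JOIN_EARLY_DEADLOCK, JOIN_WOULD_WAIT } JoinResult;

static bool partition_lock_held = false;

/* First half of the old ProcSleep(): enqueue under the partition lock,
 * handling the dontWait and early-deadlock cases. */
static JoinResult
join_wait_queue(bool deadlock, bool dont_wait)
{
    assert(partition_lock_held);        /* caller must hold the lock */
    if (deadlock)
        return JOIN_EARLY_DEADLOCK;
    if (dont_wait)
        return JOIN_WOULD_WAIT;
    return JOIN_OK;                     /* enqueued; ready to sleep */
}

/* Second half: just the sleeping part, with the lock no longer held. */
static bool
proc_sleep(void)
{
    assert(!partition_lock_held);       /* sleeping without the lock */
    return true;                        /* pretend the lock was granted */
}

/* Analogue of LockAcquireExtended(): the lock is taken and released in
 * the same function, with no "might be released on exit" ambiguity. */
static bool
lock_acquire(bool deadlock, bool dont_wait)
{
    partition_lock_held = true;         /* LWLockAcquire(partitionLock) */
    JoinResult r = join_wait_queue(deadlock, dont_wait);
    partition_lock_held = false;        /* LWLockRelease(partitionLock) */
    if (r != JOIN_OK)
        return false;
    return proc_sleep();
}
```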
> Acquiring a lock just to hold off interrupts is overkill anyway, HOLD_INTERRUPTS() would be enough. LockErrorCleanup() uses HOLD_INTERRUPTS() already.
>
> There are no CHECK_FOR_INTERRUPTS() in GrantAwaitedLock(), so cancel/die interrupts can't happen here. But could we add HOLD_INTERRUPTS(), just pro forma, to document the assumption? It's a little awkward: you really should hold interrupts until the caller has done "awaitedLock = NULL;". So it's not quite enough to add a pair of HOLD_ and RESUME_INTERRUPTS() at the end of ProcSleep(). You'd need to do the HOLD_INTERRUPTS() in ProcSleep() and require the caller to do RESUME_INTERRUPTS(). In a sense, ProcSleep downgrades the lock on the partition to just holding off interrupts.

I didn't use HOLD/RESUME_INTERRUPTS() after all. Like you did, I'm just relying on the fact that there are no CHECK_FOR_INTERRUPTS() calls in places where they might cause trouble. Those sections are short, so I think it's fine.

-- 
Heikki Linnakangas
Neon (https://neon.tech)

From 989069eaca788cfd61010e5f88c0feb359b40180 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas
Date: Mon, 22 Jul 2024 01:21:20 +0300
Subject: [PATCH 1/8] Remove LOCK_PRINT() call that could point to garbage

If a deadlock is detected in ProcSleep, it removes the process from the
wait queue and the shared LOCK and PROCLOCK entries.  As soon as it does
that, the other process holding the lock might release it, and remove
the LOCK entry altogether.  So on return from ProcSleep(), it's possible
that the LOCK entry is no longer valid.
---
 src/backend/storage/lmgr/lock.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 0400a507779..05b3a243b5c 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -1861,8 +1861,6 @@ WaitOnLock(LOCALLOCK *locallock, ResourceOwner owner, bool dontWait)
 	 * now.
 	 */
 	awaitedLock = NULL;
-	LOCK_PRINT("WaitOnLock: aborting on lock",
-			   locallock->lock, locallock->tag.mode);
 	LWLockRelease(LockHashPartitionLock(locallock->hashcode));

 	/*
-- 
2.39.2

From 22bc7a3b7826dfbb991b5d63ad1fba52bb297923 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas
Date: Mon, 22 Jul 2024 00:43:13 +0300
Subject: [PATCH 2/8] Fix comment in LockReleaseAll() on when
 locallock->nLock can be zero

We reach this case also e.g. when a deadlock is detected, not only when
we run out of memory.
---
 src/backend/storage/lmgr/lock.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 05b3a243b5c..7eb5c2e5226 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -2208,9 +2208,8 @@ LockReleaseAll(LOCKMETHODID lockmethodid, bool allLocks)
 	while ((locallock = (LOCALLOCK *) hash_seq_search()) != NULL)
 	{
 		/*
-		 * If the LOCALLOCK entry is unused, we must've run out of shared
-		 * memory while trying to set up this lock.  Just forget the local
-		 * entry.
+
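The HOLD_INTERRUPTS()/RESUME_INTERRUPTS() convention discussed in this message amounts to a holdoff counter that defers interrupt servicing. Here is a minimal standalone model of that convention, with lowercase stand-ins for PostgreSQL's real macros; the `awaited_lock` flag models the "must hold interrupts until the caller has done awaitedLock = NULL" requirement:

```c
#include <assert.h>
#include <stdbool.h>

/* Standalone stand-ins for PostgreSQL's interrupt holdoff machinery. */
static volatile int interrupt_holdoff_count = 0;
static volatile bool interrupt_pending = false;
static int interrupts_serviced = 0;

#define HOLD_INTERRUPTS()   (interrupt_holdoff_count++)
#define RESUME_INTERRUPTS() (assert(interrupt_holdoff_count > 0), \
                             interrupt_holdoff_count--)

/* Service a pending cancel/die only when interrupts are not held off. */
static void
check_for_interrupts(void)
{
    if (interrupt_pending && interrupt_holdoff_count == 0)
    {
        interrupt_pending = false;
        interrupts_serviced++;
    }
}

static bool awaited_lock = false;

/* The shape discussed above: the sleeping function holds off interrupts
 * and leaves them held when it returns... */
static void
sleep_then_grant(void)
{
    HOLD_INTERRUPTS();          /* downgrade: no lock, but no interrupts */
    awaited_lock = true;        /* ...GrantAwaitedLock() would run here */
}

/* ...so the caller must clear its awaited-lock state before resuming. */
static void
caller_finishes(void)
{
    awaited_lock = false;       /* must happen before interrupts resume */
    RESUME_INTERRUPTS();
    check_for_interrupts();
}
```

This illustrates why the pair is awkward to add locally: the hold and the resume necessarily live in different functions.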
Re: Latches vs lwlock contention
On 28/10/2022 06:56, Thomas Munro wrote:
> One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE t; ... and then N other sessions wait in SELECT * FROM t;, and then you run ... COMMIT;, you'll see the first session wake all the others while it still holds the partition lock itself.  They'll all wake up and begin to re-acquire the same partition lock in exclusive mode, immediately go back to sleep on *that* wait list, and then wake each other up one at a time in a chain.  We could avoid the first double-bounce by not setting the latches until after we've released the partition lock.  We could avoid the rest of them by not re-acquiring the partition lock at all, which ... if I'm reading right ... shouldn't actually be necessary in modern PostgreSQL?  Or if there is another reason to re-acquire then maybe the comment should be updated.

ISTM that the change to not re-acquire the lock in ProcSleep is independent from the other changes. Let's split that off to a separate patch. I agree it should be safe.

Acquiring a lock just to hold off interrupts is overkill anyway, HOLD_INTERRUPTS() would be enough. LockErrorCleanup() uses HOLD_INTERRUPTS() already.

There are no CHECK_FOR_INTERRUPTS() in GrantAwaitedLock(), so cancel/die interrupts can't happen here. But could we add HOLD_INTERRUPTS(), just pro forma, to document the assumption? It's a little awkward: you really should hold interrupts until the caller has done "awaitedLock = NULL;". So it's not quite enough to add a pair of HOLD_ and RESUME_INTERRUPTS() at the end of ProcSleep(). You'd need to do the HOLD_INTERRUPTS() in ProcSleep() and require the caller to do RESUME_INTERRUPTS(). In a sense, ProcSleep downgrades the lock on the partition to just holding off interrupts.

Overall +1 on this change to not re-acquire the partition lock.

-- 
Heikki Linnakangas
Neon (https://neon.tech)
Re: Latches vs lwlock contention
Hi,

> Maybe this is all ok, but it would be good to make the assumptions more
> explicit.

Here are my two cents.

```
static void
SetLatchV(Latch **latches, int nlatches)
{
	/* Flush any other changes out to main memory just once. */
	pg_memory_barrier();

	/* Keep only latches that are not already set, and set them. */
	for (int i = 0; i < nlatches; ++i)
	{
		Latch	   *latch = latches[i];

		if (!latch->is_set)
			latch->is_set = true;
		else
			latches[i] = NULL;
	}

	pg_memory_barrier();
[...]
void
SetLatches(LatchGroup *group)
{
	if (group->size > 0)
	{
		SetLatchV(group->latches, group->size);
[...]
```

I suspect this API may be error-prone without some additional comments. The caller (which may be an extension author too) may rely on the implementation details of SetLatches() / SetLatchV() and use the returned group->latches[] values, e.g. to figure out whether they attempted to change the state of the given latch. Even worse, one can mistakenly assume that the result says exactly whether the caller was the one who changed the state of the latch. This being said, I see why this particular implementation was chosen.

I added corresponding comments to SetLatchV() and SetLatches(). Also the patchset needed a rebase. PFA v4. It passes `make installcheck-world` on 3 machines of mine: MacOS x64, Linux x64 and Linux RISC-V.

-- 
Best regards,
Aleksander Alekseev

From 006499ee28230e8c1b5c8d2f047276c5661f50ad Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Wed, 26 Oct 2022 15:51:45 +1300
Subject: [PATCH v4 2/6] Provide SetLatches() for batched deferred latches.
If we have a way to buffer a set of wakeup targets and process them at a
later time, we can:

* move SetLatch() system calls out from under LWLocks, so that locks can
  be released faster; this is especially interesting in cases where the
  target backends will immediately try to acquire the same lock, or
  generally when the lock is heavily contended

* possibly gain some micro-optimization from issuing only two memory
  barriers for the whole batch of latches, not two for each latch to be
  set

* prepare for future IPC mechanisms which might allow us to wake multiple
  backends in a single system call

Individual users of this facility will follow in separate patches.

Discussion: https://postgr.es/m/CA%2BhUKGKmO7ze0Z6WXKdrLxmvYa%3DzVGGXOO30MMktufofVwEm1A%40mail.gmail.com
---
 src/backend/storage/ipc/latch.c  | 221 ---
 src/include/storage/latch.h      |  13 ++
 src/tools/pgindent/typedefs.list |   1 +
 3 files changed, 161 insertions(+), 74 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index db889385b7..5b7113dac8 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -589,6 +589,88 @@ WaitLatchOrSocket(Latch *latch, int wakeEvents, pgsocket sock,
 	return ret;
 }

+/*
+ * Sets multiple 'latches' at the same time.
+ *
+ * Modifies the input array.  The nature of this modification is
+ * an implementation detail of SetLatchV().  The caller shouldn't try
+ * interpreting it and/or using the input array after the call.
+ */
+static void
+SetLatchV(Latch **latches, int nlatches)
+{
+	/* Flush any other changes out to main memory just once. */
+	pg_memory_barrier();
+
+	/* Keep only latches that are not already set, and set them. */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch	   *latch = latches[i];
+
+		if (!latch->is_set)
+			latch->is_set = true;
+		else
+			latches[i] = NULL;
+	}
+
+	pg_memory_barrier();
+
+	/* Wake the remaining latches that might be sleeping.
+	 */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch	   *latch = latches[i];
+
+		/*
+		 * See if anyone's waiting for the latch.  It can be the current
+		 * process if we're in a signal handler.  We use the self-pipe or
+		 * SIGURG to ourselves to wake up WaitEventSetWaitBlock() without
+		 * races in that case.  If it's another process, send a signal.
+		 *
+		 * Fetch owner_pid only once, in case the latch is concurrently
+		 * getting owned or disowned.  XXX: This assumes that pid_t is
+		 * atomic, which isn't guaranteed to be true!  In practice, the
+		 * effective range of pid_t fits in a 32 bit integer, and so should
+		 * be atomic.  In the worst case, we might end up signaling the
+		 * wrong process.  Even then, you're very unlucky if a process with
+		 * that bogus pid exists and belongs to Postgres; and PG database
+		 * processes should handle excess SIGURG interrupts without a
+		 * problem anyhow.
+		 *
+		 * Another sort of race condition that's possible here is for a new
+		 * process to own the latch immediately after we look, so we don't
+		 * signal it.  This is okay so long as all callers of
+		 * ResetLatch/WaitLatch follow the standard coding convention of
+		 * waiting at the bottom of their loops, not the top, so that
+		 * they'll correctly process latch-setting events that happen
+		 * before they
Re: Latches vs lwlock contention
On 04.03.23 20:50, Thomas Munro wrote:
> Subject: [PATCH v3 1/6] Allow palloc_extended(NO_OOM) in critical sections.
>
> Commit 4a170ee9e0e banned palloc() and similar in critical sections,
> because an allocation failure would produce a panic.  Make an exception
> for allocation with NULL on failure, for code that has a backup plan.

I suppose this assumes that out of memory is the only possible error condition that we are concerned about for this? For example, we sometimes see "invalid memory alloc request size" either because of corrupted data or because code does things we didn't expect. This would then possibly panic?

Also, the realloc code paths potentially do more work with possibly more error conditions, and/or they error out right away because it's not supported by the context type.

Maybe this is all ok, but it would be good to make the assumptions more explicit.
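The "backup plan" idea the patch relies on is simply: request memory with a no-throw flag, and degrade gracefully on NULL. A hedged standalone sketch, using malloc as a stand-in for palloc_extended(..., MCXT_ALLOC_NO_OOM) and an invented `choose_buffer()` helper to illustrate the fallback:

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for palloc_extended(size, MCXT_ALLOC_NO_OOM): returns NULL
 * on failure instead of throwing, so a critical section can cope. */
static void *
alloc_no_oom(size_t size)
{
    return malloc(size);            /* NULL on failure, never aborts */
}

/* Example backup plan: try to allocate a large wakeup buffer; if that
 * fails, fall back to a small fixed-size buffer and flush eagerly.
 * Returns 1 if the large buffer was obtained, 0 on fallback. */
static int
choose_buffer(size_t want, size_t *got)
{
    void *p = alloc_no_oom(want);

    if (p != NULL)
    {
        free(p);                    /* a real caller would keep it */
        *got = want;
        return 1;                   /* big buffer: batch everything */
    }
    *got = 16;                      /* tiny fixed buffer, flush often */
    return 0;
}
```

Note Peter's caveat above still applies to the real code: NULL-on-failure only covers out-of-memory, not other error conditions like "invalid memory alloc request size".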
Re: Latches vs lwlock contention
On Sat, Jan 28, 2023 at 3:39 AM vignesh C wrote:
> On Tue, 1 Nov 2022 at 16:40, Thomas Munro wrote:
>> Here's an attempt at that.  There aren't actually any cases of uses of
>> this stuff in critical sections here, so perhaps I shouldn't bother
>> with that part.  The part I'd most like some feedback on is the
>> heavyweight lock bits.  I'll add this to the commitfest.
>
> The patch does not apply on top of HEAD as in [1], please post a rebased
> patch:

Rebased.  I dropped the CV patch for now.

From d678e740c61ed5408ad254d2ca5f8e2fff7d2cd1 Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Tue, 1 Nov 2022 23:29:38 +1300
Subject: [PATCH v3 1/6] Allow palloc_extended(NO_OOM) in critical sections.

Commit 4a170ee9e0e banned palloc() and similar in critical sections,
because an allocation failure would produce a panic.  Make an exception
for allocation with NULL on failure, for code that has a backup plan.

Discussion: https://postgr.es/m/CA%2BhUKGJHudZMG_vh6GiPB61pE%2BGgiBk5jxzd7inijqx5nEZLCw%40mail.gmail.com
---
 src/backend/utils/mmgr/mcxt.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index 0b00802df7..eb18b89368 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ -1123,7 +1123,8 @@ MemoryContextAllocExtended(MemoryContext context, Size size, int flags)
 	void	   *ret;

 	Assert(MemoryContextIsValid(context));
-	AssertNotInCriticalSection(context);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		AssertNotInCriticalSection(context);

 	if (!((flags & MCXT_ALLOC_HUGE) != 0 ?
 		  AllocHugeSizeIsValid(size) :
 		  AllocSizeIsValid(size)))
@@ -1278,7 +1279,8 @@ palloc_extended(Size size, int flags)
 	MemoryContext context = CurrentMemoryContext;

 	Assert(MemoryContextIsValid(context));
-	AssertNotInCriticalSection(context);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		AssertNotInCriticalSection(context);

 	if (!((flags & MCXT_ALLOC_HUGE) != 0 ?
 		  AllocHugeSizeIsValid(size) :
 		  AllocSizeIsValid(size)))
@@ -1509,7 +1511,8 @@ repalloc_extended(void *pointer, Size size, int flags)
 		  AllocSizeIsValid(size)))
 		elog(ERROR, "invalid memory alloc request size %zu", size);

-	AssertNotInCriticalSection(context);
+	if ((flags & MCXT_ALLOC_NO_OOM) == 0)
+		AssertNotInCriticalSection(context);

 	/* isReset must be false already */
 	Assert(!context->isReset);
-- 
2.39.2

From a3917505d7bd7cec7e98010d05bc016ddcd0dbac Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Wed, 26 Oct 2022 15:51:45 +1300
Subject: [PATCH v3 2/6] Provide SetLatches() for batched deferred latches.

If we have a way to buffer a set of wakeup targets and process them at a
later time, we can:

* move SetLatch() system calls out from under LWLocks, so that locks can
  be released faster; this is especially interesting in cases where the
  target backends will immediately try to acquire the same lock, or
  generally when the lock is heavily contended

* possibly gain some micro-optimization from issuing only two memory
  barriers for the whole batch of latches, not two for each latch to be
  set

* prepare for future IPC mechanisms which might allow us to wake multiple
  backends in a single system call

Individual users of this facility will follow in separate patches.

Discussion: https://postgr.es/m/CA%2BhUKGKmO7ze0Z6WXKdrLxmvYa%3DzVGGXOO30MMktufofVwEm1A%40mail.gmail.com
---
 src/backend/storage/ipc/latch.c  | 214 ---
 src/include/storage/latch.h      |  13 ++
 src/tools/pgindent/typedefs.list |   1 +
 3 files changed, 154 insertions(+), 74 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index f4123e7de7..96dd47575a 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -591,6 +591,85 @@ WaitLatchOrSocket(Latch *latch, int wakeEvents, pgsocket sock,
 	return ret;
 }

+/*
+ * Set multiple latches at the same time.
+ * Note: modifies input array.
+ */
+static void
+SetLatchV(Latch **latches, int nlatches)
+{
+	/* Flush any other changes out to main memory just once. */
+	pg_memory_barrier();
+
+	/* Keep only latches that are not already set, and set them. */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch	   *latch = latches[i];
+
+		if (!latch->is_set)
+			latch->is_set = true;
+		else
+			latches[i] = NULL;
+	}
+
+	pg_memory_barrier();
+
+	/* Wake the remaining latches that might be sleeping. */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch	   *latch = latches[i];
+
+		/*
+		 * See if anyone's waiting for the latch.  It can be the current
+		 * process if we're in a signal handler.  We use the self-pipe or
+		 * SIGURG to ourselves to wake up WaitEventSetWaitBlock() without
+		 * races in that case.  If it's another process, send a signal.
+		 *
+		 * Fetch owner_pid only once, in case the latch is concurrently
+		 * getting owned or disowned.  XXX: This assumes that pid_t is
+		 * atomic, which isn't guaranteed
Re: Latches vs lwlock contention
On Tue, 1 Nov 2022 at 16:40, Thomas Munro wrote:
> On Fri, Oct 28, 2022 at 4:56 PM Thomas Munro wrote:
>> See attached sketch patches.  I guess the main thing that may not be
>> good enough is the use of a fixed sized latch buffer.  Memory
>> allocation in don't-throw-here environments like the guts of lock code
>> might be an issue, which is why it just gives up and flushes when
>> full; maybe it should try to allocate and fall back to flushing only
>> if that fails.
>
> Here's an attempt at that.  There aren't actually any cases of uses of
> this stuff in critical sections here, so perhaps I shouldn't bother
> with that part.  The part I'd most like some feedback on is the
> heavyweight lock bits.  I'll add this to the commitfest.

The patch does not apply on top of HEAD as in [1], please post a rebased patch:

=== Applying patches on top of PostgreSQL commit ID 456fa635a909ee36f73ca84d340521bd730f265f ===
=== applying patch ./v2-0003-Use-SetLatches-for-condition-variables.patch
patching file src/backend/storage/lmgr/condition_variable.c
patching file src/backend/storage/lmgr/lwlock.c
Hunk #1 FAILED at 183.
1 out of 1 hunk FAILED -- saving rejects to file src/backend/storage/lmgr/lwlock.c.rej
patching file src/include/storage/condition_variable.h
patching file src/include/storage/lwlock.h
Hunk #1 FAILED at 193.
1 out of 1 hunk FAILED -- saving rejects to file src/include/storage/lwlock.h.rej

[1] - http://cfbot.cputube.org/patch_41_3998.log

Regards,
Vignesh
Re: Latches vs lwlock contention
On Fri, Oct 28, 2022 at 4:56 PM Thomas Munro wrote:
> See attached sketch patches.  I guess the main thing that may not be
> good enough is the use of a fixed sized latch buffer.  Memory
> allocation in don't-throw-here environments like the guts of lock code
> might be an issue, which is why it just gives up and flushes when
> full; maybe it should try to allocate and fall back to flushing only
> if that fails.

Here's an attempt at that.  There aren't actually any cases of uses of this stuff in critical sections here, so perhaps I shouldn't bother with that part.  The part I'd most like some feedback on is the heavyweight lock bits.  I'll add this to the commitfest.

v2-0001-Allow-palloc_extended-NO_OOM-in-critical-sections.patch
Description: Binary data

v2-0002-Provide-SetLatches-for-batched-deferred-latches.patch
Description: Binary data

v2-0003-Use-SetLatches-for-condition-variables.patch
Description: Binary data

v2-0004-Use-SetLatches-for-heavyweight-locks.patch
Description: Binary data

v2-0005-Don-t-re-acquire-LockManager-partition-lock-after.patch
Description: Binary data

v2-0006-Use-SetLatches-for-SERIALIZABLE-DEFERRABLE-wakeup.patch
Description: Binary data

v2-0007-Use-SetLatches-for-synchronous-replication-wakeup.patch
Description: Binary data
Latches vs lwlock contention
Hi,

We usually want to release lwlocks, and definitely spinlocks, before calling SetLatch(), to avoid putting a system call into the locked region so that we minimise the time held.  There are a few places where we don't do that, possibly because it's not just a simple latch to hold a pointer to but rather a set of them that needs to be collected from some data structure and we don't have infrastructure to help with that.  There are also cases where we semi-reliably create lock contention, because the backends that wake up immediately try to acquire the very same lock.

One example is heavyweight lock wakeups.  If you run BEGIN; LOCK TABLE t; ... and then N other sessions wait in SELECT * FROM t;, and then you run ... COMMIT;, you'll see the first session wake all the others while it still holds the partition lock itself.  They'll all wake up and begin to re-acquire the same partition lock in exclusive mode, immediately go back to sleep on *that* wait list, and then wake each other up one at a time in a chain.  We could avoid the first double-bounce by not setting the latches until after we've released the partition lock.  We could avoid the rest of them by not re-acquiring the partition lock at all, which ... if I'm reading right ... shouldn't actually be necessary in modern PostgreSQL?  Or if there is another reason to re-acquire then maybe the comment should be updated.  Presumably no one really does that repeatedly while there is a long queue of non-conflicting waiters, so I'm not claiming it's a major improvement, but it's at least a micro-optimisation.

There are some other simpler mechanical changes including synchronous replication, SERIALIZABLE DEFERRABLE and condition variables (this one inspired by Yura Sokolov's patches[1]).  Actually I'm not at all sure about the CV implementation, I feel like a more ambitious change is needed to make our CVs perform.

See attached sketch patches.
I guess the main thing that may not be good enough is the use of a fixed sized latch buffer.  Memory allocation in don't-throw-here environments like the guts of lock code might be an issue, which is why it just gives up and flushes when full; maybe it should try to allocate and fall back to flushing only if that fails.

These sketch patches aren't proposals, just observations in need of more study.

[1] https://postgr.es/m/1edbb61981fe1d99c3f20e3d56d6c88999f4227c.camel%40postgrespro.ru

From b64d5782e2c3a2e34274a3bf9df4449afaee94dc Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Wed, 26 Oct 2022 15:51:45 +1300
Subject: [PATCH 1/8] Provide SetLatches() for batched deferred latches.

If we have a way to buffer a set of wakeup targets and process them at a
later time, we can:

* move SetLatch() system calls out from under LWLocks, so that locks can
  be released faster; this is especially interesting in cases where the
  target backends will immediately try to acquire the same lock, or
  generally when the lock is heavily contended

* possibly gain some micro-optimization from issuing only two memory
  barriers for the whole batch of latches, not two for each latch to be
  set

* provide the opportunity for potential future latch implementation
  mechanisms to deliver wakeups in a single system call

Individual users of this facility will follow in separate patches.
---
 src/backend/storage/ipc/latch.c  | 187 ++-
 src/include/storage/latch.h      |  13 +++
 src/tools/pgindent/typedefs.list |   1 +
 3 files changed, 123 insertions(+), 78 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index eb3a569aae..71fdc388c8 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -576,105 +576,136 @@ WaitLatchOrSocket(Latch *latch, int wakeEvents, pgsocket sock,
 }

 /*
- * Sets a latch and wakes up anyone waiting on it.
- *
- * This is cheap if the latch is already set, otherwise not so much.
- *
- * NB: when calling this in a signal handler, be sure to save and restore
- * errno around it.  (That's standard practice in most signal handlers, of
- * course, but we used to omit it in handlers that only set a flag.)
- *
- * NB: this function is called from critical sections and signal handlers so
- * throwing an error is not a good idea.
+ * Set multiple latches at the same time.
+ * Note: modifies input array.
  */
-void
-SetLatch(Latch *latch)
+static void
+SetLatchV(Latch **latches, int nlatches)
 {
-#ifndef WIN32
-	pid_t		owner_pid;
-#else
-	HANDLE		handle;
-#endif
-
-	/*
-	 * The memory barrier has to be placed here to ensure that any flag
-	 * variables possibly changed by this process have been flushed to main
-	 * memory, before we check/set is_set.
-	 */
+	/* Flush any other changes out to main memory just once. */
 	pg_memory_barrier();

-	/* Quick exit if already set */
-	if (latch->is_set)
-		return;
+	/* Keep only latches that are not already set, and set them. */
+	for (int i = 0; i < nlatches; ++i)
+	{
+		Latch
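The fixed-size-buffer behaviour described earlier in the thread ("it just gives up and flushes when full") can be sketched standalone. Everything below is illustrative rather than the patch's actual code: BUF_CAP, the type names, and the `flushes` counter (instrumentation for the example) are all invented, and the flush step just sets flags where the real code would call SetLatchV():

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { bool is_set; } Latch;

/* Fixed-size buffer: when full, flush immediately rather than allocate,
 * so this stays safe in no-throw contexts like the lock manager guts. */
#define BUF_CAP 4

typedef struct
{
    Latch  *buf[BUF_CAP];
    int     n;
    int     flushes;            /* instrumentation for the example */
} WakeupBuffer;

/* Deliver all buffered wakeups; real code would call SetLatchV() here. */
static void
flush_wakeups(WakeupBuffer *wb)
{
    for (int i = 0; i < wb->n; i++)
        wb->buf[i]->is_set = true;
    wb->n = 0;
    wb->flushes++;
}

/* Buffer one wakeup target, giving up and flushing if the buffer is
 * full instead of trying to grow it. */
static void
defer_wakeup(WakeupBuffer *wb, Latch *l)
{
    if (wb->n == BUF_CAP)
        flush_wakeups(wb);
    wb->buf[wb->n++] = l;
}
```

A possible refinement, as suggested above, would be to attempt a NULL-on-failure allocation of a larger buffer before falling back to this eager flush.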