On Thu, Jan 12, 2023 at 7:57 PM Thomas Munro <thomas.mu...@gmail.com> wrote:
> On Thu, Jan 12, 2023 at 7:27 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> > skink seems to have found a problem:
> >
> > ==2011873== VALGRINDERROR-BEGIN
> > ==2011873== Syscall param epoll_wait(events) points to unaddressable byte(s)
> > ==2011873==    at 0x4D8DC73: epoll_wait (epoll_wait.c:30)
> > ==2011873==    by 0x55CA49: WaitEventSetWaitBlock (latch.c:1527)
> > ==2011873==    by 0x55D591: WaitEventSetWait (latch.c:1473)
> > ==2011873==    by 0x4F2B28: ServerLoop (postmaster.c:1729)
> > ==2011873==    by 0x4F3E85: PostmasterMain (postmaster.c:1452)
> > ==2011873==    by 0x42643C: main (main.c:200)
> > ==2011873==  Address 0x7b1e620 is 1,360 bytes inside a recently 
> > re-allocated block of size 8,192 alloc'd
> > ==2011873==    at 0x48407B4: malloc (vg_replace_malloc.c:381)
> > ==2011873==    by 0x6D9D30: AllocSetContextCreateInternal (aset.c:446)
> > ==2011873==    by 0x4F2D9B: PostmasterMain (postmaster.c:614)
> > ==2011873==    by 0x42643C: main (main.c:200)
> > ==2011873==
> > ==2011873== VALGRINDERROR-END
>
> Repro'd here on Valgrind.  Oh, that's interesting.  WaitEventSetWait()
> wants to use an internal buffer of the size given to the constructor
> function, but passes the size of the caller's output buffer to
> epoll_wait() and friends.  Perhaps it should use Min(nevents,
> set->nevents_space).  I mean, I should have noticed that, but I think
> that's arguably a pre-existing bug in the WES code, or at least an
> unhelpful interface.  Thinking...

Yeah.  This stops valgrind complaining here.
From f3d201a2509affc8248a930f18a58cdb7ed3220f Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Thu, 12 Jan 2023 20:05:38 +1300
Subject: [PATCH] Fix WaitEventSetWait() buffer overrun.

The WAIT_USE_EPOLL and WAIT_USE_KQUEUE implementations of
WaitEventSetWaitBlock() confused the size of their internal buffer with
the size of the caller's output buffer, and could ask the kernel for too
many events.  In fact the set of events retrieved from the kernel needs
to be able to fit in both buffers, so take the minimum of the two.

The WAIT_USE_POLL and WAIT_USE WIN32 implementations didn't have this
confusion.

This probably didn't come up before because we always used the same
number in both places, but commit 7389aad6 calculates a dynamic size at
construction time, while using MAXLISTEN for its output event buffer on
the stack.  That seems like a reasonable thing to want to do, so
consider this to be a pre-existing bug worth fixing.

As reported by skink, valgrind and Tom Lane.

Discussion: https://postgr.es/m/901504.1673504836%40sss.pgh.pa.us
---
 src/backend/storage/ipc/latch.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index a238c5827c..d79d71a851 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -1525,7 +1525,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 
 	/* Sleep */
 	rc = epoll_wait(set->epoll_fd, set->epoll_ret_events,
-					nevents, cur_timeout);
+					Min(nevents, set->nevents_space), cur_timeout);
 
 	/* Check return code */
 	if (rc < 0)
@@ -1685,7 +1685,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 
 	/* Sleep */
 	rc = kevent(set->kqueue_fd, NULL, 0,
-				set->kqueue_ret_events, nevents,
+				set->kqueue_ret_events,
+				Min(nevents, set->nevents_space),
 				timeout_p);
 
 	/* Check return code */
-- 
2.35.1

Reply via email to