On 2022-Jul-05, Andres Freund wrote: > I think we'd be better off disabling at least some signals during > dsm_impl_posix_resize(). I'm afraid we'll otherwise just find another > variation of these problems. I haven't checked the source of ftruncate, but > what Thomas dug up for fallocate makes it pretty clear that our current > approach of just retrying again and again isn't good enough. It's a bit more > obvious that it's a problem for fallocate, but I don't think it's worth having > different solutions for the two.
So what if we move the retry loop one level up? As in the attached. Here, if we get EINTR then we retry both syscalls. -- Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/ "No hay hombre que no aspire a la plenitud, es decir, la suma de experiencias de que un hombre es capaz"
>From 67d4447bc8075b9eb249e7543afb136b2cfaad9b Mon Sep 17 00:00:00 2001 From: Alvaro Herrera <alvhe...@alvh.no-ip.org> Date: Fri, 1 Jul 2022 17:16:33 +0200 Subject: [PATCH v4] rework retry loop for dsm_impl_op --- src/backend/storage/ipc/dsm_impl.c | 101 +++++++++++++++++------------ 1 file changed, 58 insertions(+), 43 deletions(-) diff --git a/src/backend/storage/ipc/dsm_impl.c b/src/backend/storage/ipc/dsm_impl.c index 323862a3d2..4f3284b66b 100644 --- a/src/backend/storage/ipc/dsm_impl.c +++ b/src/backend/storage/ipc/dsm_impl.c @@ -295,30 +295,57 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, } request_size = st.st_size; } - else if (dsm_impl_posix_resize(fd, request_size) != 0) + else { - int save_errno; + int rc; - /* Back out what's already been done. */ - save_errno = errno; - close(fd); - ReleaseExternalFD(); - shm_unlink(name); - errno = save_errno; + for (;;) + { + rc = dsm_impl_posix_resize(fd, request_size); + if (rc == 0) + break; /* all good */ - /* - * If we received a query cancel or termination signal, we will have - * EINTR set here. If the caller said that errors are OK here, check - * for interrupts immediately. - */ - if (errno == EINTR && elevel >= ERROR) - CHECK_FOR_INTERRUPTS(); + /* + * If there's a signal that we need to handle, bail out here to + * clean up and handle it. + */ + if (QueryCancelPending || ProcDiePending) + break; - ereport(elevel, - (errcode_for_dynamic_shared_memory(), - errmsg("could not resize shared memory segment \"%s\" to %zu bytes: %m", - name, request_size))); - return false; + /* + * If we were interrupted but it wasn't a Postgres-level signal, + * try again. + */ + if (rc == EINTR) + continue; + break; + } + + if (rc != 0) + { + int save_errno; + + /* Back out what's already been done. */ + save_errno = errno; + close(fd); + ReleaseExternalFD(); + shm_unlink(name); + errno = save_errno; + + /* + * If we received a query cancel or termination signal, we will have + * EINTR set here. If the caller said that errors are OK here, check + * for interrupts immediately. + */ + if (errno == EINTR && elevel >= ERROR) + CHECK_FOR_INTERRUPTS(); + + ereport(elevel, + (errcode_for_dynamic_shared_memory(), + errmsg("could not resize shared memory segment \"%s\" to %zu bytes: %m", + name, request_size))); + return false; + } } /* Map it. */ @@ -355,15 +382,20 @@ dsm_impl_posix(dsm_op op, dsm_handle handle, Size request_size, * If necessary, also ensure that virtual memory is actually allocated by the * operating system, to avoid nasty surprises later. * - * Returns non-zero if either truncation or allocation fails, and sets errno. + * Returns non-zero if either truncation or allocation fails, and errno is set. */ static int dsm_impl_posix_resize(int fd, off_t size) { int rc; - /* Truncate (or extend) the file to the requested size. */ + /* + * Truncate (or extend) the file to the requested size. + */ + errno = 0; rc = ftruncate(fd, size); + if (rc != 0) + return errno; /* * On Linux, a shm_open fd is backed by a tmpfs file. After resizing with @@ -374,27 +406,10 @@ dsm_impl_posix_resize(int fd, off_t size) * SIGBUS later. */ #if defined(HAVE_POSIX_FALLOCATE) && defined(__linux__) - if (rc == 0) - { - /* - * We may get interrupted. If so, just retry unless there is an - * interrupt pending. This avoids the possibility of looping forever - * if another backend is repeatedly trying to interrupt us. - */ - pgstat_report_wait_start(WAIT_EVENT_DSM_FILL_ZERO_WRITE); - do - { - rc = posix_fallocate(fd, 0, size); - } while (rc == EINTR && !(ProcDiePending || QueryCancelPending)); - pgstat_report_wait_end(); - - /* - * The caller expects errno to be set, but posix_fallocate() doesn't - * set it. Instead it returns error numbers directly. So set errno, - * even though we'll also return rc to indicate success or failure. - */ - errno = rc; - } + pgstat_report_wait_start(WAIT_EVENT_DSM_FILL_ZERO_WRITE); + rc = posix_fallocate(fd, 0, size); + pgstat_report_wait_end(); + errno = rc; #endif /* HAVE_POSIX_FALLOCATE && __linux__ */ return rc; -- 2.30.2