Alvaro Herrera <[email protected]> wrote:

> On 2026-May-17, Baji Shaik wrote:
> 
> > v3 uses PG_ENSURE_ERROR_CLEANUP which:
> > - Handles both ERROR and FATAL exits
> > - Automatically cancels the callback on normal completion
> >   (no slot leak)
> > - Runs before memory contexts are destroyed (no use-after-free)
> 
> Yeah, looks good.  I have pushed it, with some comment wordsmithing and
> other cosmetic changes.
> 
> While looking at it, I realized that I didn't like the way
> stop_repack_decoding_worker() works, mainly because if there's no
> handle, we leak everything else -- and the way we initialize things
> means we leak the shared memory segment.  This is maybe a rare case and
> just a small memory leak, but it seems better to do it nicely.  So
> here's a followup patch that reworks that code.  This also forced me to
> understand more clearly what is going on, so I rewrote the comments.

The call of shm_mq_detach() got lost, or do you rely on dsm_detach() to call
shm_mq_detach_callback() ? The latter does not free ->mqh_buffer. Since each
REPACK runs in a separate transaction, I wouldn't consider that a leak, but I
still think that explicit call of shm_mq_detach() makes the code a bit easier
to read (i.e. no need for the developer to check if the detaching happens
automatically).


        /*
-        * If we could not cancel the current sleep due to ERROR, do that before
-        * we detach from the shared memory the condition variable is located 
in.
-        * If we did not, the bgworker ERROR handling code would try and fail
-        * badly.
+        * Now detach from our shared memory segment.  In error cases there 
might
+        * still be messages from the worker in the queue, which 
ProcessInterrupts
+        * would try to read; this is pointless (and causes an assertion 
failure),
+        * so set the global pointer to NULL to have ProcessRepackMessages 
ignore
+        * them.
         */
-       ConditionVariableCancelSleep();
-
-       dsm_detach(decoding_worker->seg);
+       dsmseg = decoding_worker->seg;
        pfree(decoding_worker);
        decoding_worker = NULL;
+
+       /* We must also cancel the current sleep, if one is still set up */
+       ConditionVariableCancelSleep();
+
+       if (dsmseg != NULL)
+               dsm_detach(dsmseg);


I suppose the reason for the assertion failure was reading from the queue
after the backend had detached from it? Thanks for fixing that.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com


Reply via email to