Hi Alvaro, Sami,

Thank you both for the feedback.

Sami, thanks for identifying the use-after-free with on_proc_exit
and providing the v2 patch with before_shmem_exit. That helped
narrow down the right approach.

On Sun, May 17, 2026 at 3:46 PM Alvaro Herrera <[email protected]> wrote:

> I think a better fix for this is to use PG_ENSURE_ERROR_CLEANUP().  That
> way we avoid leaving callbacks in place, which would not be great if the
> same backend does a lot of REPACKs: after a dozen or so, it dies with
>
> FATAL:  out of before_shmem_exit slots
>
>
Alvaro, you're right that PG_ENSURE_ERROR_CLEANUP is the better
approach here. Attached is v3 which uses it instead of
before_shmem_exit or on_proc_exit.

Summary of the issues with v1 and v2:

- v1 (on_proc_exit): Crashes with assertions enabled because
  memory contexts are already destroyed by the time on_proc_exit
  callbacks run. The worker handle is clobbered with 0x7f
  (CLOBBER_FREED_MEMORY).

- v2 (before_shmem_exit): Works correctly for a single REPACK,
  but leaks a callback slot on each successful REPACK
  (CONCURRENTLY). After ~15 REPACKs in the same session:
  FATAL: out of before_shmem_exit slots.

v3 uses PG_ENSURE_ERROR_CLEANUP which:
- Handles both ERROR and FATAL exits
- Automatically cancels the callback on normal completion
  (no slot leak)
- Runs before memory contexts are destroyed (no use-after-free)

The existing PG_TRY/PG_FINALLY block is replaced with
PG_ENSURE_ERROR_CLEANUP for the concurrent path, and a plain
rebuild_relation() call for the non-concurrent path.

Tested with --enable-cassert --enable-debug --enable-injection-points:
- All 245 regression tests pass (including cluster)
- All 8 injection point tests pass (including repack and repack_toast)
- pg_terminate_backend cleans up worker and slot without crash
- 20 REPACK (CONCURRENTLY) in same session completes without
  slot exhaustion

I have not added a dedicated regression test for the
pg_terminate_backend scenario yet, but I can write one using
injection points if needed.

Thanks,
Baji Shaik

Attachment: v3-0001-Fix-REPACK-decoding-worker-not-cleaned-up-on-FATAL-exit.patch
Description: Binary data

Reply via email to