Hi, Currently, around $subject, we have a few weaknesses:
1) For small-ish power-of-two allocations, aset.c does not put a sentinel at the end of the allocation. Given how common power-of-two allocations are, that seems decidedly suboptimal. 2) The sentinel we use is just one byte. This means that basic variable alignment rules will often lead to access-beyond-end-of-allocation to be further out, preventing us from detecting that. 3) If we do detect memory context issues, we just issue WARNINGs, we do not crash. I think there are a few different problems with that approach: often there will eventually be a crashes, due to the memory corruption that already occurred, but it'll be further away from where we detected the problem, rather than failing closer to the issue. Another big issue is that this means that in contexts where we do not check for WARNINGs, e.g. tap tests or in background processes, such WARNINGs won't trigger test failures. Which seems rather bad to me. A third issue is that the warnings are useless for detecting problems during automated testing like fuzzing. It's very normal for some WARNINGs to be issued when e.g. datatype input functions, something like memory corruption really needs to result in a crash to be properly fuzzable. 4) We support using address sanitizer, but due to address sanitizer only learning about AllocBlock (et al) level allocations, many read accesses beyond the end of an allocation will never be detected. 5) Normally address sanitizers quarantines individual allocation "addresses" for a while, to make it more likely to catch use-after-free issues. Without something like that, many types of issues won't be detected. I think these aren't actually *that* hard to solve: 1) The reason we don't always use the sentinel for power-of-two allocations is that naively doing so would interfere with the size classing for the freelist. But that's relatively easy to address - we can "just" add the space for the size classes *after* determining the size class. IIRC we used to have a similar issue with the per-allocation chunk header and solved it this way too. 2) We should make make the sentinel size configurable and default to either 8 or 16 bytes. 3) I think we should make the memory context failures crash. Perhaps by emitting WARNINGs and then crashing if any of them occurred. As we trigger memory context checking during commit/abort handling, just using PANIC won't always reliably work, E.g. the AllocSetCheck() in AllocSetReset(), can trigger recursive PANICs due to the MemoryContextReset() in errstart(), if the corruption in in ErrorContext. Whether that's a real issue worth worrying about, IDK. 4) For this I prototyped making the valgrind annotations more generic and using the address sanitizer interface to mark memory as poisoned / unpoisoned. That doesn't provide quite all the checking that valgrind can do (it doesn't track uninitialized memory), but it's considerably better than our memory context checking, and *much* *much* faster than valgrind. 5) I think we should add a mode to all our allocators that just turns every allocation into a large allocation (perhaps with something slightly different for slab). In that mode asan, valgrind, et al would be more precise. Of course that'll have rather deleterious performance effects, but I think it'd still be quite useful for debugging. I have prototype patches for 1-4, but wanted to hear about whether such changes sound sane to others, before spending time to polish all of this up. Greetings, Andres Freund
