On Mon, Mar 13, 2023 at 6:41 PM Peter Geoghegan <p...@bowt.ie> wrote: > There are several different things that seem important to me > personally. These are in tension with each other, to a degree. These > are: > > 1. Like Andres, I'd really like to have some way of inspecting things > like heapam PRUNE, VACUUM, and FREEZE_PAGE records in significant > detail. These record types happen to be very important in general, and > the ability to see detailed information about the WAL record would > definitely help with some debugging scenarios. I've really missed > stuff like this while debugging serious issues under time pressure.
One problem that I often run into when performing analysis of VACUUM using pg_walinspect is the issue of *who* pruned which heap page, for any given PRUNE record. Was it VACUUM/autovacuum, or was it opportunistic pruning? There is no way of knowing for sure right now. You *cannot* rely on an xid of 0 as an indicator of a given PRUNE record coming from VACUUM; it could just have been an opportunistic prune operation that happened to take place when a SELECT query ran, before any XID was ever allocated. I think that we should do something like the attached, to completely avoid this ambiguity. This patch adds a new XLOG_HEAP2 bit that's similar to XLOG_HEAP_INIT_PAGE -- XLOG_HEAP2_BYVACUUM. This allows all XLOG_HEAP2 record types to indicate that they took place during VACUUM, by XOR'ing the flag with the record type/info when XLogInsert() is called. For now this is only used by PRUNE records. Tools like pg_walinspect will report a separate "Heap2/PRUNE+BYVACUUM" record_type, as well as the unadorned Heap2/PRUNE record_type, which we'll now know must have been opportunistic pruning. The approach of using a bit in the style of the heapam init bit makes sense to me, because the bit is available, and works in a way that is minimally invasive. Also, one can imagine needing to resolve a similar ambiguity in the future, when (say) opportunistic freezing is added. I think that it makes sense to treat this within the scope of Melanie's ongoing work to improve the instrumentation of these records -- meaning that it's in scope for Postgres 16. Admittedly this is a slightly creative interpretation, so if others disagree then I won't argue. This is quite a small patch, though, which makes debugging significantly easier. I think that there could be a great deal of utility in being able to easily "pair up" corresponding "Heap2/PRUNE+BYVACUUM" and "Heap2/VACUUM" records in debugging scenarios. I can imagine linking these to "Heap2/FREEZE_PAGE" and "Heap2/VISIBLE" records, too, since they're all closely related record types. -- Peter Geoghegan
v1-0001-Record-which-PRUNE-records-are-from-VACUUM.patch
Description: Binary data