Hi,

On 2024-01-05 08:59:41 -0500, Robert Haas wrote:
> On Thu, Jan 4, 2024 at 6:03 PM Melanie Plageman
> <melanieplage...@gmail.com> wrote:
> > When a single page is being processed, page pruning happens in
> > heap_page_prune(). Freezing, dead items recording, and visibility
> > checks happen in lazy_scan_prune(). Visibility map updates and
> > freespace map updates happen back in lazy_scan_heap(). Except, if the
> > table has no indexes, in which case, lazy_scan_heap() also invokes
> > lazy_vacuum_heap_page() to set dead line pointers unused and do
> > another separate visibility check and VM update. I maintain that all
> > page-level processing should be done in the page-level processing
> > functions (like lazy_scan_prune()). And lazy_scan_heap() shouldn't be
> > directly responsible for special case page-level processing.
>
> But you can just as easily turn this argument on its head, can't you?
> In general, except for HOT tuples, line pointers are marked dead by
> pruning and unused by vacuum. Here you want to turn it on its head and
> make pruning do what would normally be vacuum's responsibility.
OTOH, the pruning logic, including its WAL record, already supports
marking items unused; all we need to do is to tell it to do so in a few
more cases. If we didn't already need to have support for this, I'd have
a much harder time arguing for doing this.

One important part of the larger project is to combine the WAL records
for pruning, freezing and setting the all-visible/all-frozen bit into
one WAL record. We can't set all-frozen before we have removed the dead
items. So either we need to combine pruning and setting items unused for
no-index tables or we end up considerably less efficient in the
no-indexes case.

An aside:

As I think we chatted about before, I eventually would like the option
to remove index entries for a tuple during on-access pruning, for OLTP
workloads. I.e. before removing the tuple, construct the corresponding
index tuple and use it to look up the index entries pointing to the
tuple. If all the index entries were found (they might not be, if they
already were marked dead during a lookup, or if an expression wasn't
actually immutable), we can prune without the full index scan.
Obviously this would only be suitable for some workloads, but it could
be quite beneficial when you have huge indexes. The reason I mention
this is that then we'd have another source of marking items unused
during pruning.

Greetings,

Andres Freund
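To illustrate the ordering constraint above (all-frozen cannot be set
while dead items remain on the page), here is a toy model in C. It is
only a sketch and not PostgreSQL code: ToyItem, ToyPage, toy_prune() and
toy_try_set_all_frozen() are hypothetical names made up for this
example, and the real line pointer and visibility map handling is far
more involved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy line pointer states, loosely inspired by LP_UNUSED/LP_NORMAL/LP_DEAD. */
typedef enum { ITEM_UNUSED, ITEM_NORMAL, ITEM_DEAD } ItemState;

typedef struct {
    ItemState state;
    bool removable;   /* assumption: tuple is dead to every transaction */
    bool frozen;      /* assumption: tuple has already been frozen */
} ToyItem;

typedef struct {
    ToyItem items[8];
    size_t  nitems;
    bool    all_frozen;
} ToyPage;

/*
 * Prune one page. If mark_unused_now is true (the no-indexes case), a
 * removable item goes straight to unused. Otherwise it becomes dead and
 * has to wait for index vacuuming. Returns the number of dead items
 * still on the page afterwards.
 */
size_t toy_prune(ToyPage *page, bool mark_unused_now)
{
    size_t ndead = 0;

    for (size_t i = 0; i < page->nitems; i++) {
        ToyItem *item = &page->items[i];

        if (item->state == ITEM_NORMAL && item->removable)
            item->state = mark_unused_now ? ITEM_UNUSED : ITEM_DEAD;
        if (item->state == ITEM_DEAD)
            ndead++;
    }
    return ndead;
}

/*
 * Set all_frozen only if no dead items remain and every normal item is
 * frozen. Returns whether the flag was set.
 */
bool toy_try_set_all_frozen(ToyPage *page)
{
    for (size_t i = 0; i < page->nitems; i++) {
        const ToyItem *item = &page->items[i];

        if (item->state == ITEM_DEAD)
            return false;
        if (item->state == ITEM_NORMAL && !item->frozen)
            return false;
    }
    page->all_frozen = true;
    return true;
}
```

In this model, a page with one frozen live tuple and one removable tuple
can be marked all-frozen in the same step only when toy_prune() is
called with mark_unused_now = true. With false, a dead item is left
behind and the flag has to wait for a later pass, which is the
inefficiency in the no-indexes case described above.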