On Mon, Apr 18, 2022 at 12:41 PM Tom Lane <t...@sss.pgh.pa.us> wrote:
> If the planner looks at index reltuples at all, it's doing so
> for cost estimation purposes, where the count including dead
> entries is probably the right thing to use.

Then why does heapam_index_build_range_scan do it the other way
around? I think that it probably doesn't matter that much in
practice. The inconsistency should be noted in
update_relstats_all_indexes, though.

> If you want to make this cleaner, maybe there's a case for
> splitting reltuples into two columns. But then index AMs
> would be on the hook to determine how many of their entries
> are live, which is not really an index's concern.

The main concern behind this is that we're using
vacrel->new_rel_tuples for the IndexVacuumInfo.num_heap_tuples value
in amvacuumcleanup (but not in ambulkdelete), which is calculated
towards the end of lazy_scan_heap, like so:

    /*
     * Also compute the total number of surviving heap entries.  In the
     * (unlikely) scenario that new_live_tuples is -1, take it as zero.
     */
    vacrel->new_rel_tuples =
        Max(vacrel->new_live_tuples, 0) + vacrel->recently_dead_tuples +
        vacrel->missed_dead_tuples;

I think that this doesn't really belong here; new_rel_tuples should
only be used for VACUUM VERBOSE/server log output, once we return to
heap_vacuum_rel from lazy_scan_heap.

We should use vacrel->new_live_tuples as our
IndexVacuumInfo.num_heap_tuples value in the amvacuumcleanup path
(instead of new_rel_tuples). That way the rule about
IndexVacuumInfo.num_heap_tuples is simple: it's always taken from
pg_class.reltuples (for the heap rel). Either the existing value, or
the new value.

--
Peter Geoghegan