> > If the table is dropped, there are no stats to update. right? > > Ops, right. I focused too much on "all warnings should be visible in > the statistic, so the sum of warnings and statistics should match", > but of course that's not the case. > > > However, I'm not entirely sure whether this behavior is always guaranteed. > > Could anyone clarify this? > > There's another different corner-case if I move the injection point > inside pgstat_report_skipped_vacuum_analyze, after releasing the > syscache. > > If the drop happens at that point, we insert an orphaned record into > the statistics, and it will be visible with the internal functions > (e.g. pg_stat_get_skipped_autoanalyze_count). > > It is still invisible in the pg_stat_all_tables view, but now that > I've looked more at the code, I think internally it will stay there > permanently, even surviving pg_stat_reset?
Yeah, you’re right. I just realized that pgstat_get_entry_ref_locked() creates an entry if one does not already exist, which means we can leak entries if the table is dropped concurrently. After releasing the syscache entry, we still have the relid, but the table may be dropped before the stats update runs. In that case, the stats update recreates the relation entry, and that entry is then leaked permanently. The tricky part of this feature is that we cannot hold a lock while updating the entry, because the whole point of the stat is to track cases where we failed to acquire a lock. So, it seems pgstat_init_function_usage() deals with the same pattern and they deal with it by doing a second syscache lookup after creating the entry, and if the relation is gone, drop the entry. We can do the same thing here. See v8-0001. I also attached a v8-0002 building on Zsolt's injection point test and his earlier suggestion to "move the injection point inside pgstat_report_skipped_vacuum_analyze, after releasing the syscache". These are isolation tests that rely on injection points. The tests are: 1. Table Drop + rollback. Lock is skipped and table is dropped but rolled back. The stats are updated. 2. Table Drop + commit. Lock is skipped and table is dropped and committed. No table entry to record a stat for. The tests use pg_stat_get_skipped_vacuum_count to verify that no orphaned entry is left behind. -- Sami
v8-0002-Add-injection-point-test-for-vacuum-skip_locked-s.patch
Description: Binary data
v8-0001-Track-skipped-vacuum-and-analyze-activity-per-rel.patch
Description: Binary data
