After much code reading, testing, and using the extremely handy pageinspect contrib to look at pages, here's what I believe is happening. I am not attempting to describe every possible scenario, only this one test path. Following my short test scenario above...
- Inserted rows get line pointers with status 1 (LP_NORMAL) - When I do the 100 row delete, those rows stay status 1, but get their t_xmax set, indicating they were deleted. - When I do the next 200 row delete, those rows also get their t_xmax set. - As a side-effect to the deletion, the "hot prune" feature kicks in. The 2 pages in question were processed by a call to heap_page_prune, which set the line pointers of the previous 100 deleted rows to 3 (LP_DEAD). Now I have 100 LP_DEAD and 200 LP_NORMAL with xmax set. - When I do the vacuum, all 300 are "vacuumed" -- line pointer set to status 0 (LP_UNUSED). - Because my table has no indexes, lazy_scan_heap calls lazy_vacuum_page directly for each block, and reports the variable tups_vacuumed ("removed 200 row versions in 2 pages"). However, tups_vacuumed is computed without counting the 100 LP_DEAD tuples, because per the code comment, it thinks to do so would be double-counting. Perhaps the output should say something like: removed 300 row versions (200 were recently deleted and 100 were previously deleted). Whatever the phrasing, I don't know why 200 is the most significant number here, and 300 isn't mentioned at all. - If my table did have indexes, as in the original test case of this thread, then the "removed xxx row version in yyy pages" message comes from lazy_vacuum_heap instead. However, instead of using tups_vacuumed, this code reports the actual number of tuples actually set to status 0. I would like to respectfully suggest that the vacuum output line "removed xx row versions in yy pages" should show the same counts regardless of whether or not there's an index on the table. I would suggest that the value reported by lazy_vacuum_heap is correct, and is what I would expect to see. I think it would be fine if it also reports the breakdown of LP_DEAD vs LP_NORMAL tuples vacuumed, if that is deemed useful. Regarding the output of pgstattuple, via the call to HeapTupleSatisfiesVisibility, it appears that this simply returns true for these hot-pruned LP_DEAD tuples, skewing the counts accordingly. Does that make sense? -- View this message in context: http://postgresql.1045698.n5.nabble.com/Surprising-dead-tuple-count-from-pgstattuple-tp2266955p2471232.html Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers