On Tue, Jul 25, 2017 at 9:33 PM, Tomas Vondra <tomas.von...@2ndquadrant.com> wrote:
> On 7/25/17 12:55 AM, Tom Lane wrote:
>> Tomas Vondra <tomas.von...@2ndquadrant.com> writes:
>>> It seems to me that VACUUM and ANALYZE somewhat disagree on what
>>> exactly reltuples means. VACUUM seems to be thinking that reltuples
>>> = live + dead while ANALYZE apparently believes that reltuples =
>>> live
>>>
>>> The question is - which of the reltuples definitions is the right
>>> one? I've always assumed that "reltuples = live + dead" but perhaps
>>> not?
>>
>> I think the planner basically assumes that reltuples is the live
>> tuple count, so maybe we'd better change VACUUM to get in step.
>
> Attached is a patch that (I think) does just that. The disagreement was
> caused by VACUUM treating recently dead tuples as live, while ANALYZE
> treats both of those as dead.
>
> At first I was worried that this will negatively affect plans in the
> long-running transaction, as it will get underestimates (due to reltuples
> not including rows it can see). But that's a problem we already have
> anyway, you just need to run ANALYZE in the other session.

Thanks for the patch.

From the mail, I understand that this patch improves how the VACUUM command
updates the reltuples value in the catalog table, so that it counts only
the visible (live) tuples, the same way the ANALYZE command does:

-                        num_tuples);
+                        num_tuples - nkeep);

With the above correction, there is a problem in the number of live tuples
reported to the stats collector:

postgres=# select reltuples, n_live_tup, n_dead_tup
             from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
    899818 |     799636 |     100182
(1 row)

The live tuple count is decremented by the dead tuple count a second time
before it is sent to the stats collector in lazy_vacuum_rel():

    /* report results to the stats collector, too */
    new_live_tuples = new_rel_tuples - vacrelstats->new_dead_tuples;

So the fix needs a correction here as well. Alternatively, the adjustment
could be made in lazy_vacuum_rel() itself, before the catalog table is
updated, the same way it is already done for the stats.

While testing this patch, I found another problem that is not related to
it: when the VACUUM command is executed multiple times on a table with no
dead rows, the reltuples value keeps slowly decreasing.

postgres=# select reltuples, n_live_tup, n_dead_tup
             from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
    899674 |     899674 |          0
(1 row)

postgres=# vacuum t;
VACUUM
postgres=# select reltuples, n_live_tup, n_dead_tup
             from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
    899622 |     899622 |          0
(1 row)

postgres=# vacuum t;
VACUUM
postgres=# select reltuples, n_live_tup, n_dead_tup
             from pg_stat_user_tables join pg_class using (relname)
             where relname = 't';
 reltuples | n_live_tup | n_dead_tup
-----------+------------+------------
    899570 |     899570 |          0
(1 row)

In lazy_scan_heap(), we force a scan of the last page of the relation, to
avoid taking the access exclusive lock in lazy_truncate_heap() when there
are tuples on the last page. Because of this, scanned_pages is never 0, so
vac_estimate_reltuples() extrapolates from the number of tuples found on
that last page, and the resulting estimate keeps lowering reltuples on
every run.

I am wondering whether this problem occurs in real-world scenarios often
enough to warrant a fix?
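
For reference, a rough sketch of how I think the first problem can be
reproduced (this assumes the proposed patch is applied; the table name,
row counts and the second-session trick are only illustrative - the point
is that the deleted rows must still be visible to some open snapshot so
that VACUUM counts them in nkeep):

-- Session 1: build a test table (names and sizes are arbitrary).
DROP TABLE IF EXISTS t;
CREATE TABLE t (id int);
INSERT INTO t SELECT generate_series(1, 1000000);
VACUUM ANALYZE t;

-- Session 2: hold an old snapshot open so the rows deleted below stay
-- "recently dead" (and therefore end up in nkeep) during the VACUUM.
BEGIN ISOLATION LEVEL REPEATABLE READ;
SELECT count(*) FROM t;

-- Session 1 again:
DELETE FROM t WHERE id % 10 = 0;   -- roughly 100k recently dead tuples
VACUUM t;
SELECT reltuples, n_live_tup, n_dead_tup
  FROM pg_stat_user_tables JOIN pg_class USING (relname)
 WHERE relname = 't';
-- With the patch, reltuples already excludes the recently dead tuples,
-- but n_live_tup has them subtracted once more in lazy_vacuum_rel().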
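
The second problem (reltuples shrinking on every run) can be reproduced on
an unpatched server along these lines; the table definition and row count
below are just the ones I happened to use, so the exact values will differ:

DROP TABLE IF EXISTS t;
CREATE TABLE t (id int, val text);
INSERT INTO t SELECT i, md5(i::text) FROM generate_series(1, 900000) i;
VACUUM t;                                     -- initial estimate
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

VACUUM t;                                     -- no dead rows at all
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';

VACUUM t;                                     -- ... and again
SELECT relpages, reltuples FROM pg_class WHERE relname = 't';
-- Each run lowers reltuples by a small, roughly constant amount.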
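
To illustrate why the estimate keeps dropping, here is a back-of-the-envelope
version of the density blending that vac_estimate_reltuples() does when only
the (partly full) last page was scanned; all numbers below are invented for
illustration:

-- old_density is the previous reltuples / relpages; scanned_pages is 1
-- (only the last page), and that page holds fewer tuples than an average
-- page, so the blended density comes out slightly lower than before.
SELECT round(updated_density * total_pages) AS new_reltuples
FROM (
    SELECT old_density
           + (last_page_tuples / scanned_pages - old_density)
             * (scanned_pages / total_pages) AS updated_density,
           total_pages
    FROM (VALUES (899674.0 / 4808, 4808.0, 1.0, 150.0))
         AS v(old_density, total_pages, scanned_pages, last_page_tuples)
) s;
-- Returns a value slightly below 899674; every further VACUUM repeats the
-- same small reduction.

In the output above, the drop is about 52 tuples per run, which would be
consistent with the last page holding that many tuples fewer than the
average page.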
Regards,
Hari Babu
Fujitsu Australia