Alexander Kuzmenkov <a.kuzmen...@postgrespro.ru> writes: > On 01.03.2018 06:23, David Gould wrote: >> In theory the sample pages analyze uses should represent the whole table >> fairly well. We rely on this to generate pg_statistic and it is a key >> input to the planner. Why should we not believe in it as much only for >> reltuples? If the analyze sampling does not work, the fix would be to improve >> that, not to disregard it piecemeal.
> Well, that sounds reasonable. But the problem with the moving average > calculation remains. Suppose you run vacuum and not analyze. If the > updates are random enough, vacuum won't be able to reclaim all the > pages, so the number of pages will grow. Again, we'll have the same > thing where the number of pages grows, the real number of live tuples > stays constant, and the estimated reltuples grows after each vacuum run. You claimed that before, with no more evidence than this time, and I still don't follow your argument. The number of pages may indeed bloat but the number of live tuples per page will fall. Ideally, at least, the estimate would remain on-target. If it doesn't, there's some other effect that you haven't explained. It doesn't seem to me that the use of a moving average would prevent that from happening. What it *would* do is smooth out errors from the inevitable sampling bias in any one vacuum or analyze run, and that seems like a good thing. > I did some more calculations on paper to try to understand this. If we > average reltuples directly, instead of averaging tuple density, it > converges like it should. The error with this density calculation seems > to be that we're effectively multiplying the old density by the new > number of pages. I'm not sure why we even work with tuple density. We > could just estimate the number of tuples based on analyze/vacuum, and > then apply moving average to it. The calculations would be shorter, too. > What do you think? I think you're reinventing the way we used to do it. Perhaps consulting the git history in the vicinity of this code would be enlightening. regards, tom lane