On Sun, Feb 10, 2013 at 12:10 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> On Tue, Feb 5, 2013 at 2:00 PM, Kevin Grittner <kgri...@ymail.com> wrote:
> >
> > OK, the estimate was 13 million and there were actually 13.8
> > million, but it is a random sample used to generate estimates.
> > That seems worse than average, but close enough to be useful.
> > The 158.7 million total rows includes dead rows, which must be
> > visited to determine visibility, but will not be counted because
> > they are not visible to the counting transaction.
>
> To clarify here, the 158.7 million estimate does not *intentionally*
> include dead rows.  As you say, the ANALYZE did get a very good
> instantaneous estimate of the number of live rows.  However, ANALYZE
> doesn't over-write the old estimate, it averages its estimate into
> the old one.  After the table shape changes dramatically, the ANALYZE
> needs to be run repeatedly before the estimate will converge to the
> new reality.  (Of course a cluster or vacuum full will blow away the
> old statistics, so the next analyze after that will solely determine
> the new statistics.)
>

I was incredibly wrong here.  CLUSTER and VACUUM do not blow away the
stats so that the next ANALYZE gets to solely determine them.  Rather,
they impose their own idea of the number of live tuples, and then
ANALYZE can only update that incrementally as it averages itself into
the old value.

Worse, the two methods have very different ideas of what constitutes a
live tuple.  ANALYZE thinks tuples that are visible to a current/recent
snapshot are live, while CLUSTER and VACUUM think tuples that are
possibly visible to anyone are live.

I would say that this is a bug, or at least approaching being one.  It
is not obvious whether reltuples and n_live_tuples should count the
"recently dead", but it should either be one way or the other, not an
unholy mixture of the two.  As it is now, a cluster or simple vacuum
will snap n_live_tuples so that it counts the recently dead, then
ANALYZE will slowly converge it toward excluding the recently dead, and
then the next vacuum will snap it back again.

Of course, all of this only comes into play in the presence of very
long-lived transactions that prevent tuples from going away.  Otherwise
the number of recently dead tuples is small enough not to matter.

To reproduce:

create table foo as select (random()*1000000)::integer as val
  from generate_series(1,50000000);

In a different session, open a transaction and leave it open:

begin; create temp table adlfkj (x serial);

Back in the main session:

delete from foo where val > 100;

Run this repeatedly and watch the rows estimate slowly decay:

ANALYZE verbose foo; explain select count(*) from foo;

Then run this and watch it instantly spring back:

VACUUM VERBOSE foo; explain select count(*) from foo;

Cheers,

Jeff
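
P.S. If you want to watch the counters themselves rather than the
planner's estimate, something roughly along these lines (just a sketch,
using the standard pg_class and pg_stat_user_tables catalogs and the
'foo' table from the example above) can be run between the ANALYZE and
VACUUM steps:

-- reltuples is what feeds the planner's row estimate;
-- n_live_tup / n_dead_tup are the stats collector's counters
select c.reltuples, s.n_live_tup, s.n_dead_tup
  from pg_class c
  join pg_stat_user_tables s on s.relid = c.oid
 where c.relname = 'foo';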