There's been some previous discussion of getting rid of the pg_class columns relpages and reltuples, in favor of having the planner check the current relation block count directly (RelationGetNumberOfBlocks) and extrapolate the current tuple count based on the most recently measured tuples-per-page density. A couple of past threads are http://archives.postgresql.org/pgsql-performance/2004-10/msg00367.php http://archives.postgresql.org/pgsql-general/2004-08/msg01422.php and the point came up again today: http://archives.postgresql.org/pgsql-performance/2004-11/msg00401.php where we were again reminded of the problems incurred by obsolete estimates.
It occurs to me that we could get most of the bang for the buck without making any incompatible changes: just leave the existing fields in place but make the planner use reltuples-divided-by-relpages as the density estimate. That is, in place of where we have rel->pages = relation->rd_rel->relpages; rel->tuples = relation->rd_rel->reltuples; in plancat.c, just do rel->pages = RelationGetNumberOfBlocks(relation); if (relation->rd_rel->relpages > 0) density = relation->rd_rel->reltuples / relation->rd_rel->relpages; else density = some_default_estimate; rel->tuples = round(rel->pages * density); In addition to this we'd perhaps want to hack VACUUM so that when the table is empty, it doesn't simply zero out relpages/reltuples, but somehow preserves the previous density value so we don't have to fall back to the default density estimate. (This of course assumes that we will refill the table with a density roughly similar to the last meaured density; which might be wrong but it's still better than just using a default, I think.) One way to do that is to set relpages = zero (truthfully) but set reltuples to the previously estimated density (we can do this because it's already a float field). It might look a little funny to have nonzero reltuples when relpages is zero, but I think it wouldn't break anything. Then the above logic becomes rel->pages = RelationGetNumberOfBlocks(relation); if (relation->rd_rel->relpages > 0) density = relation->rd_rel->reltuples / relation->rd_rel->relpages; else if (relation->rd_rel->reltuples > 0) /* already a density */ density = relation->rd_rel->reltuples; else density = some_default_estimate; rel->tuples = round(rel->pages * density); A variant of this is to set reltuples = density, relpages = 1 instead of 0, which makes the relpages value a lie but would be even less likely to confuse client-side code. Comments? Does this seem like something reasonable to do for 8.0? regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster