Hi,
estimate_rel_size() explains:
/*
* HACK: if the relation has never yet been vacuumed,
use a
* minimum size estimate of 10 pages. The idea here is
to avoid
* assuming a newly-created table is really small, even
if it
* currently is, because that may not be true once some
data gets
* loaded into it. Once a vacuum or analyze cycle has
been done
* on it, it's more reasonable to believe the size is
somewhat
* stable.
*
* (Note that this is only an issue if the plan gets
cached and
* used again after the table has been filled. What
we're trying
* to avoid is using a nestloop-type plan on a table
that has
* grown substantially since the plan was made.
Normally,
* autovacuum/autoanalyze will occur once enough
inserts have
* happened and cause cached-plan invalidation; but
that doesn't
* happen instantaneously, and it won't happen at all
for cases
* such as temporary tables.)
*
* We approximate "never vacuumed" by "has relpages =
0", which
* means this will also fire on genuinely empty
relations. Not
* great, but fortunately that's a seldom-seen case in
the real
* world, and it shouldn't degrade the quality of the
plan too
* much anyway to err in this direction.
*
* There are two exceptions wherein we don't apply this
heuristic.
* One is if the table has inheritance children.
Totally empty
* parent tables are quite common, so we should be
willing to
* believe that they are empty. Also, we don't apply
the 10-page
* minimum to indexes.
*/
I don't quite get why we don't instead just represent "never vacuumed"
by storing a more meaningful value in relpages? We could go for
InvalidBlockNumber, or even NULL (although the latter would be a bit
annoying due to not being mappable to a struct anymore).
I've seen numerous cases where relpages = 0 -> never vacuumed has caused
worse plans, and it just doesn't seem necessary?
Greetings,
Andres Freund