>>>>> "Tom" == Tom Lane <t...@sss.pgh.pa.us> writes:

 Tom> Once you're down to an estimate of one row retrieved, adding
 Tom> additional index conditions simply increases the cost (not by
 Tom> much, but it increases) without delivering any visible benefit.

OK, but this is a serious problem because "estimate of one row" is a
very common estimation failure mode, and isn't always solvable in the
sense of arranging for better estimates (in the absence of hints, ugh).

 Tom> I believe what probably happened in this case is that the planner
 Tom> considered both forms of the indexscan path and concluded that
 Tom> they were fuzzily the same cost and rowcount, yet the path using
 Tom> only t2.a and t3.b clearly dominated by requiring strictly fewer
 Tom> outer relations for parameters.  So it threw away the path that
 Tom> also had the c = t4.c comparison before it ever got to the join
 Tom> stage.  Even had it kept that path, the join cost estimate
 Tom> wouldn't have looked any better than the one for the join it did
 Tom> pick, so there would have been no certainty of picking the
 Tom> "correct" plan.

 Tom> The real problem in your example is thus the incorrect rowcount
 Tom> estimate; with better rowcount estimates the two cases wouldn't
 Tom> have appeared to have the same output rowcount.

 Tom> For the toy data in your example, this can probably be blamed on
 Tom> the fact that eqjoinsel_inner doesn't have any smarts for the case
 Tom> of having an MCV list for only one side (though as noted in the
 Tom> comments, it's not obvious what it should do instead).  However,
 Tom> it's not very clear what was happening in the real-world case.

In the real-world case, t1 was something like an "overrides" table for
data otherwise obtained from the other tables, i.e. special-case
exceptions for general rules. As such it is highly skew, with many
possible (a,b) values having no row at all, but others having hundreds
of matches on (a,b) (but only one at most on (a,b,c) since this was the
pkey in the real data as well as the testcase).

Accordingly, there was no way that we could identify of getting any kind
of better estimate of rowcount.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to