On Sun, May 31, 2015 at 12:49 PM, Tom Lane <t...@sss.pgh.pa.us> wrote: > Andrew Gierth <and...@tao11.riddles.org.uk> writes: >> Obviously it makes little sense to use an (a,b,c) index to look up just >> (a,b) and then filter on c; the question is, what is the planner doing >> that leads it to get this so wrong? > > It's not so astonishing as all that; compare > > regression=# explain select * from t1 where a=3 and b=4; > QUERY PLAN > ------------------------------------------------------------------------ > Index Only Scan using t1_pkey on t1 (cost=0.28..8.29 rows=1 width=12) > Index Cond: ((a = 3) AND (b = 4)) > (2 rows) > > regression=# explain select * from t1 where a=3 and b=4 and c=5; > QUERY PLAN > ------------------------------------------------------------------------ > Index Only Scan using t1_pkey on t1 (cost=0.28..8.30 rows=1 width=12) > Index Cond: ((a = 3) AND (b = 4) AND (c = 5)) > (2 rows) > > Once you're down to an estimate of one row retrieved, adding additional > index conditions simply increases the cost (not by much, but it increases) > without delivering any visible benefit.
But Andrew's example is equivalent to planning the second query by putting the quals on a and b into the index qual and treating c=5 as a post-filter condition even though the index we're using is on (a, b, c), which I don't believe we'd ever do. If we did, I'd find that astonishing, too. > I believe what probably happened in this case is that the planner > considered both forms of the indexscan path and concluded that they were > fuzzily the same cost and rowcount, yet the path using only t2.a and t3.b > clearly dominated by requiring strictly fewer outer relations for > parameters. So it threw away the path that also had the c = t4.c > comparison before it ever got to the join stage. Even had it kept that > path, the join cost estimate wouldn't have looked any better than the one > for the join it did pick, so there would have been no certainty of picking > the "correct" plan. It's just hard to believe that it's ever better to treat something as a join filter than as an index condition. Yes, checking the index condition isn't free, either. But it doesn't seem like it should be particularly more expensive than checking the same thing as a join filter. And if there a lot of rows involved, it's going to be a whole lot LESS expensive. I guess it's hard for me to credit the idea that a parameterized index path constraining a superset of the columns present in some other parameterized path on the same index has insufficient additional selectivity to justify its existence. Even in a world where planner estimates are never wrong, I doubt treating the qual as a join filter is ever meaningfully better. The absolute best case - or so it seems to me - is a tie. If we underestimate the row counts, it's a loss. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers