Andres Freund <and...@anarazel.de> writes: > On 2015-07-11 14:31:25 +0200, Tomas Vondra wrote: >> While working on the "IOS with partial indexes" patch, I've noticed a bit >> strange plan. It's unrelated to that particular patch (reproducible on >> master), so I'm starting a new thread for it.
> It's indeed interesting. Running > ANALYZE t;EXPLAIN SELECT a FROM t WHERE b < 100; > repeatedly switches back and forth between the plans. The issue basically is that ANALYZE is putting quasi-random numbers into the reltuples estimates for the indexes. Things seem to be consistently sane after a VACUUM: regression=# vacuum t; VACUUM regression=# select relname,relpages,reltuples from pg_class where relname in ( 't', 'idx001', 'idx002', 'idx003'); relname | relpages | reltuples ---------+----------+------------- t | 44248 | 9.99998e+06 idx001 | 2 | 99 idx002 | 2 | 199 idx003 | 2 | 299 (4 rows) but not so much after ANALYZE: regression=# analyze t; ANALYZE regression=# select relname,relpages,reltuples from pg_class where relname in ( 't', 'idx001', 'idx002', 'idx003'); relname | relpages | reltuples ---------+----------+----------- t | 44248 | 1e+07 idx001 | 2 | 0 idx002 | 2 | 0 idx003 | 2 | 0 (4 rows) I've also seen states like relname | relpages | reltuples ---------+----------+------------- t | 44248 | 9.99998e+06 idx001 | 2 | 0 idx002 | 2 | 334 idx003 | 2 | 334 (4 rows) Presumably, this is happening because the numbers of rows actually satisfying the index predicates are so small that it's a matter of luck whether any of them are included in ANALYZE's sample. Given this bad data for the index sizes, it's not totally surprising that choose_bitmap_and() does something wacko. I'm not sure whether we should try to make it smarter, or write this off as "garbage in, garbage out". Another idea is to not trust any individual ANALYZE's estimate of the index rowcount so completely. (I'd thought that the moving-average logic would get applied to that, but it doesn't seem to be kicking in for some reason.) We could probably make this smarter if we were willing to apply the predicate-proof machinery in more situations; in this example, once we know that idx001 is applicable, we really should disregard idx002 and idx003 altogether because their predicates are implied by idx001's. I've always been hesitant to do that because the cost of checking seemed likely to greatly outweigh the benefits. But since Tomas is nosing around in this territory already, maybe he'd like to investigate that further. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers