Hi,
As previously discussed, enable_indexscan=false does not merely
disable index scans, but also index-only scans; and
enable_indexonlyscan=false does not really disable index-only scans,
but turns them into index-scans.[1]. We've now got some mention of
this in the documentation,[2] which is good, but I still think it's
confusing. However, today, I discovered (or possibly rediscovered)
something else about the behavior of enable_indexonlyscan that I find
quite odd: it can sometimes disable bitmap heap scans. Here's a test
from our regression tests:
-- Now check the results from bitmap indexscan
SET enable_seqscan = OFF;
SET enable_indexscan = OFF;
SET enable_bitmapscan = ON;
[...]
EXPLAIN (COSTS OFF)
SELECT count(*) FROM quad_point_tbl;
QUERY PLAN
----------------------------------------------
Aggregate
-> Bitmap Heap Scan on quad_point_tbl
-> Bitmap Index Scan on sp_quad_ind
(3 rows)
Now, suppose we repeat this test, but we also add SET
enable_indexonlyscan = OFF. Then we get this output:
QUERY PLAN
----------------------------------
Aggregate
-> Seq Scan on quad_point_tbl
Disabled: true
Maybe other people are not going to agree, but I feel like this is
clearly wrong behavior. The reason it happens is because
build_index_scan() entirely skips generating index paths when it
believes them to be categorically useless, and the same index paths
that it generates are used both for building index scan paths and
bitmap heap scan paths. For it to think that index paths are useful,
it's got to be the case either that (1) there are relevant restriction
clauses, meaning that the index could filter out some tuples or (2)
the index can return the data in sorted order, meaning that we might
be able to avoid a later sort or (3) the index has a useful predicate,
meaning that a full index scan will scan fewer tuples than a
sequential scan, or (4) an index-only scan is possible, meaning that
we can potentially skip a bunch of heap tuples fetches altogether. In
other words, this code thinks that we should not even consider a full
index scan unless we have some specific idea of what it's going to
save as compared with a sequential scan. However, (2) and (4) are only
real advantages if we actually do an index-only scan. If we just build
an index-only scan path and then use it to build a bitmap heap scan,
we are (as it seems to me) doing exactly what this code is trying to
avoid: performing a full scan of an index when there is supposedly no
compelling reason not to just do a Seq Scan.
By that argument, the current regression test output is wrong: the
planner should be rejecting the attempt to force a bitmap heap scan on
the theory that a filterless Bitmap Heap Scan on a non-partial index
can never be the winning plan. However, I'm not sure I really believe
that. Isn't bloat another possible reason why an index scan or
index-only scan or bitmap heap scan could be faster than a sequential
scan? Whoever wrote this regression test wasn't dismayed to see a
Bitmap Heap Scan with no filter condition, and I wouldn't be, either,
exactly because I think this actually could be the winning plan. I
think there's a pretty good argument that we ought to consider
index-based approaches to table scans in all cases, whether there
seems to be a meaningful restriction clause etc. or not. But I haven't
investigated, so maybe that would cost too much planning in too many
cases where it doesn't end up working out. Still, it seems difficult
to me to defend the current behavior. Either a Bitmap Heap Scan is
worth considering here, or alternatively the planner should reject it
outright, but I doubt that which of those two things we should do
ought to depend on the value of enable_indexonlyscan.
Thoughts?
--
Robert Haas
EDB: http://www.enterprisedb.com
[1]
https://www.postgresql.org/message-id/CA%2BTgmoZEg1tyW31t3jxhvDwff29K%3D2C9r6722SuFb%3D3XVKWkow%40mail.gmail.com
[2]
http://git.postgresql.org/pg/commitdiff/dda781609f977315c9ecdefc1e29f9499393aaa4