On 20/01/2024 12:14 am, Tomas Vondra wrote:
Looks like I was not true, even if it is not index-only scan but index
condition involves only index attributes, then heap is not accessed
until we find tuple satisfying search condition.
Inclusive index case described above
(https://commitfest.postgresql.org/46/4352/) is interesting but IMHO
exotic case. If keys are actually used in search, then why not to create
normal compound index instead?
Not sure I follow ...
Firstly, I'm not convinced the example addressed by that other patch is
that exotic. IMHO it's quite possible it's actually quite common, but
the users do no realize the possible gains.
Also, there are reasons to not want very wide indexes - it has overhead
associated with maintenance, disk space, etc. I think it's perfectly
rational to design indexes in a way eliminates most heap fetches
necessary to evaluate conditions, but does not guarantee IOS (so the
last heap fetch is still needed).
We are comparing compound index (a,b) and covering (inclusive) index (a)
include (b)
This indexes have exactly the same width and size and almost the same
maintenance overhead.
First index has more expensive comparison function (involving two
columns) but I do not think that it can significantly affect
performance and maintenance cost. Also if selectivity of "a" is good
enough, then there is no need to compare "b"
Why we can prefer covering index to compound index? I see only two good
reasons:
1. Extra columns type do not have comparison function need for AM.
2. The extra columns are never used in query predicate.
If you are going to use this columns in query predicates I do not see
much sense in creating inclusive index rather than compound index.
Do you?
What do you mean by "create normal compound index"? The patch addresses
a limitation that not every condition can be translated into a proper
scan key. Even if we improve this, there will always be such conditions.
The the IOS can evaluate them on index tuple, the regular index scan
can't do that (currently).
Can you share an example demonstrating the alternative approach?
May be I missed something.
This is the example from
https://www.postgresql.org/message-id/flat/N1xaIrU29uk5YxLyW55MGk5fz9s6V2FNtj54JRaVlFbPixD5z8sJ07Ite5CvbWwik8ZvDG07oSTN-usENLVMq2UAcizVTEd5b-o16ZGDIIU=@yamlcoder.me
:
```
And here is the plan with index on (a,b).
Limit (cost=0.42..4447.90 rows=1 width=12) (actual time=6.883..6.884
rows=0 loops=1) Output: a, b, d Buffers: shared hit=613 ->
Index Scan using t_a_b_idx on public.t (cost=0.42..4447.90 rows=1
width=12) (actual time=6.880..6.881 rows=0 loops=1) Output: a,
b, d Index Cond: ((t.a > 1000000) AND (t.b = 4))
Buffers: shared hit=613 Planning: Buffers: shared hit=41 Planning
Time: 0.314 ms Execution Time: 6.910 ms ```
Isn't it an optimal plan for this query?
And cite from self reproducible example https://dbfiddle.uk/iehtq44L :
```
create unique index t_a_include_b on t(a) include (b);
-- I'd expecd index above to behave the same as index below for this query
--create unique index on t(a,b);
```
I agree that it is natural to expect the same result for both indexes.
So this PR definitely makes sense.
My point is only that compound index (a,b) in this case is more natural
and preferable.