[HACKERS] Poor cost estimate with interaction between table correlation and partial indexes

Michael Malis Sat, 26 Aug 2017 16:21:07 -0700

Hi.

I'm looking to get started contributing code to Postgres. A small
issue I'm aware of that I think would make a good first contribution
is a poor cost estimate made due to an interaction between table
correlation and partial indexes. Currently the planner assumes that
when an index is perfectly correlated with a table and a range scan is
performed on the index, all of the table page reads performed by the
index scan except for the first one will be sequential reads. While
this assumption is correct for regular indexes, it is not true for
partial indexes.


The assumption holds for regular indexes because the rows
corresponding to two entries in a regular index that is perfectly
correlated with the table are guaranteed to be next to each other in
the table. On the other hand with a partial index perfectly correlated
with a table, there may be rows in the table in between the two rows
corresponding to two adjacent entries in the index that are not
included in the index because they do not satisfy the partial index
predicate.

To make the cost calculation for this case more accurate, I want to
apply the same estimate as the one currently used to estimate the cost
of a bitmap heap scan. The bitmap heap scan cost estimate applies in
this case because both cases involve reading pages from disk ordered
by the location in the table, but where the pages may not be
consecutive.

For the relevant functions, see cost_index and cost_index_heap_scan in
costsize.c.

Thanks,
Michael


-- 
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Poor cost estimate with interaction between table correlation and partial indexes

Reply via email to