Hi. I'm looking to get started contributing code to Postgres. A small issue I'm aware of that I think would make a good first contribution is a poor cost estimate made due to an interaction between table correlation and partial indexes. Currently the planner assumes that when an index is perfectly correlated with a table and a range scan is performed on the index, all of the table page reads performed by the index scan except for the first one will be sequential reads. While this assumption is correct for regular indexes, it is not true for partial indexes.
The assumption holds for regular indexes because the rows corresponding to two entries in a regular index that is perfectly correlated with the table are guaranteed to be next to each other in the table. On the other hand with a partial index perfectly correlated with a table, there may be rows in the table in between the two rows corresponding to two adjacent entries in the index that are not included in the index because they do not satisfy the partial index predicate. To make the cost calculation for this case more accurate, I want to apply the same estimate as the one currently used to estimate the cost of a bitmap heap scan. The bitmap heap scan cost estimate applies in this case because both cases involve reading pages from disk ordered by the location in the table, but where the pages may not be consecutive. For the relevant functions, see cost_index and cost_index_heap_scan in costsize.c. Thanks, Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers