Hi,
As we know, index vacuum could be sped up significantly if it didn't have
to lock every page in left to right direction because of the integrity issue
described in nbtree/README. We could then scan the index in physical
order, and AFAICS combine the btbulkdelete and btvacuumcleanup logic to
just one scan.
Several approaches have been proposed before, see prior discussion:
http://archives.postgresql.org/pgsql-hackers/2004-04/msg00829.php
http://archives.postgresql.org/pgsql-hackers/2005-12/msg00927.php
The proposals this far have tried to solve the problem by changing the
split and/or vacuum algorithms to make sure that a tuple isn't deleted if
an index scan is stopped on it. That has proven to be hard, so I'm
proposing two alternatives that change the index scan instead:
1. Instead of stopping on the first matching tuple, scan the whole index
page for all matching entries at once. Then move one page to the
right, and logically stop before the first (not including hikey) item on
the page. Since the scan is not stopped on any specific tuple, it cannot
be deleted or moved. When the scan continues, it can just start scanning
from the beginning of the page.
This relies on the fact that items are never moved, except on page split
when they're always moved to the right and to a new page. Page deletion is
already done in two phases ensuring that the page doesn't get deleted
while a scan is stopped on it.
2. Alternatively, the index scan could store the location of the last
known non-deletable tuple it has encountered, in addition to the tuple it
stops on. When a stopped scan continues, it checks if the tuple it was
stopped on is still on the same page. If it's not, instead of moving
right to find it, relocate the last known non-deletable tuple and
continue the scan from there. There can't be any visible tuples between
the tuple it stopped on and the last known non-deletable tuple, because
we would have encountered it before, and would know by now that it's
non-deletable.
What do these ideas sound like? Anything I've missed?
- Heikki
---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match