On Tue, Sep 13, 2016 at 9:31 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:

> =======
>
> +Vacuum acquires cleanup lock on bucket to remove the dead tuples and or tuples
> +that are moved due to split.  The need for cleanup lock to remove dead tuples
> +is to ensure that scans' returns correct results.  Scan that returns multiple
> +tuples from the same bucket page always restart the scan from the previous
> +offset number from which it has returned last tuple.
>
> Perhaps it would be better to teach scans to restart anywhere on the page,
> than to force more cleanup locks to be taken?
>

Commenting on one of my own questions:

This won't work when vacuum removes the tuple that an existing scan is
currently examining. The scan uses that tuple to re-find its position when
it realizes the tuple is not visible and so takes up the scan again.

The index tuples in a page are stored sorted just by hash value, not by the
combination of (hash value, tid).  If they were sorted by both, we could
re-find our position even if the tuple had been removed, because we would
know to start at the slot adjacent to where the missing tuple would be were
it not removed. But unless we are willing to break pg_upgrade, there is no
feasible way to change that now.
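
To illustrate the point about sorting by both keys, here is a minimal sketch
in Python. It is not PostgreSQL code: the page is modelled as a plain sorted
list, and the names resume_index, page and last are hypothetical. It only
shows that with a total order on (hash value, tid), a scan can find where to
resume even after the entry it was positioned on has been removed.

```python
from bisect import bisect_right


def resume_index(entries, last_returned):
    """Return the index of the first entry that sorts strictly after
    last_returned. Works whether or not last_returned is still present,
    because entries are assumed to be sorted by (hash value, tid)."""
    return bisect_right(entries, last_returned)


# Hypothetical page contents: (hash value, (block, offset)) pairs,
# sorted by the combination of both, which is NOT how hash index
# pages are actually laid out today.
page = [(7, (1, 3)), (7, (2, 1)), (7, (4, 5)), (9, (1, 1))]

# The scan last returned this entry and will resume after it.
last = (7, (2, 1))

# Vacuum removes the very entry the scan was positioned on.
page.remove(last)

# The scan can still find the adjacent slot and continue from there.
pos = resume_index(page, last)
next_entry = page[pos]
```

With entries ordered by hash value alone, the three entries with hash 7 could
appear in any order, so there is no comparable way to tell which of them the
scan has already returned once the marker tuple is gone.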

Cheers,

Jeff
