On Mon, Apr 16, 2012 at 4:26 AM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
> Can we have a "soft" hot standby conflict that doesn't kill the query, but
> disables index-only-scans?

Yeah, something like that seems possible.

For example, suppose the master includes, in each
mark-heap-page-all-visible record, the newest XID on the page.  On the
standby, we keep track of the most recent such XID ever seen in shared
memory.  After noting that a page is all-visible, the standby
cross-checks its snapshot against this XID and does the heap fetch
anyway if it's too new.  Conceivably this can be done with memory
barriers but not without locks: we must update the XID in shared
memory *before* marking the page all-visible, and we must check the
visibility map or page-level bit *before* fetching the XID from shared
memory - but the actual reads and writes of the XID are atomic.

Now, if an index-only scan suffers a very high number of these "soft
conflicts" and consequently does a lot more heap fetches than
expected, performance might suck.  But that should be rare, and could
be mitigated by turning on hot_standby_feedback.  Also, there might be
some hit for repeatedly reading that XID from memory.

Alternatively, we could try to forbid index-only scan plans from being
created in the first place *only when* the underlying snapshot is too
old.  But then what happens if a conflict happens later, after the
plan has been created but before it's fully executed?  At that point
it's too late to switch gears, so we'd still need something like the
above.  And the above might be adequate all by itself, since in
practice index-only scans are mostly going to be useful when the data
isn't changing all that fast, so most of the queries that would be
cancelled by "hard" conflicts wouldn't have actually had a problem
anyway.

> In the long run, perhaps we need to store XIDs in the visibility map instead
> of just a bit. If you we only stored one xid per 100 pages or something like
> that, the storage overhead would not be much higher than what we have at the
> moment. But that's obviously not going to happen for 9.2...

Well, that would also have the undesirable side effect of greatly
reducing the granularity of the map.  As it is, updating a tiny
fraction of the tuples in the table could result in the entire table
ceasing to be not-all-visible if you happen to update exactly one
tuple per page through the entire heap.  Or more generally, updating
X% of the rows in the table can cause Y% of the rows in the table to
no longer be visible for index-only-scan purposes, where Y >> X.  What
you're proposing would make that much worse.

I think we're actually going to find that the current system isn't
tight enough; my suspicion is that users are going to complain that
we're not aggressive enough about marking pages all-visible, because
autovac won't kick in until updates+deletes>20%, but (1) index-only
scans also care about inserts and (2) by the time we've got 20% dead
tuples index-only scans may well be worthless.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to