On Mon, Apr 16, 2012 at 4:26 AM, Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> wrote:
> Can we have a "soft" hot standby conflict that doesn't kill the query, but
> disables index-only-scans?

Yeah, something like that seems possible. For example, suppose the master includes, in each mark-heap-page-all-visible record, the newest XID on the page. On the standby, we keep track, in shared memory, of the most recent such XID ever seen. After noting that a page is all-visible, the standby cross-checks its snapshot against this XID and does the heap fetch anyway if it's too new. Conceivably this can be done with memory barriers but without locks: we must update the XID in shared memory *before* marking the page all-visible, and we must check the visibility map or page-level bit *before* fetching the XID from shared memory - but the actual reads and writes of the XID are atomic.

Now, if an index-only scan suffers a very high number of these "soft conflicts" and consequently does a lot more heap fetches than expected, performance might suck. But that should be rare, and could be mitigated by turning on hot_standby_feedback. Also, there might be some performance hit for repeatedly reading that XID from memory.

Alternatively, we could try to forbid index-only scan plans from being created in the first place *only when* the underlying snapshot is too old. But then what happens if a conflict happens later, after the plan has been created but before it's fully executed? At that point it's too late to switch gears, so we'd still need something like the above. And the above might be adequate all by itself, since in practice index-only scans are mostly going to be useful when the data isn't changing all that fast, so most of the queries that would be cancelled by "hard" conflicts wouldn't have actually had a problem anyway.

> In the long run, perhaps we need to store XIDs in the visibility map instead
> of just a bit. If we only stored one xid per 100 pages or something like
> that, the storage overhead would not be much higher than what we have at the
> moment. But that's obviously not going to happen for 9.2...

Well, that would also have the undesirable side effect of greatly reducing the granularity of the map. As it is, updating a tiny fraction of the tuples in the table could result in the entire table becoming not-all-visible if you happen to update exactly one tuple per page through the entire heap. Or more generally, updating X% of the rows in the table can cause Y% of the rows in the table to no longer be visible for index-only-scan purposes, where Y >> X. What you're proposing would make that much worse.

I think we're actually going to find that the current system isn't tight enough; my suspicion is that users are going to complain that we're not aggressive enough about marking pages all-visible, because autovac won't kick in until updates+deletes>20%, but (1) index-only scans also care about inserts and (2) by the time we've got 20% dead tuples index-only scans may well be worthless.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
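
To put rough numbers on the Y >> X point above, here is a small worked example in Python. It is only a sketch under stated assumptions: the function name, the table size, and the figure of 100 tuples per page are all made up for illustration, every page is assumed to hold the same number of tuples, and the updates are assumed to be spread in the worst possible way (one per map entry).

```python
from math import ceil


def worst_case_loss(total_pages, tuples_per_page, pages_per_entry, updated_tuples):
    """Return (x, y): fraction of rows updated, fraction of rows that lose
    index-only-scan eligibility, with updates spread one per map entry.

    Hypothetical model: one map entry covers `pages_per_entry` heap pages,
    and a single update anywhere in that range clears the whole entry.
    """
    entries = ceil(total_pages / pages_per_entry)
    cleared_entries = min(updated_tuples, entries)
    pages_lost = min(cleared_entries * pages_per_entry, total_pages)
    x = updated_tuples / (total_pages * tuples_per_page)
    y = pages_lost / total_pages  # equal tuples per page, so pages ~ rows
    return x, y


# Assumed table: 10,000 pages of 100 tuples each, 100 tuples updated.
x_bit, y_bit = worst_case_loss(10_000, 100, pages_per_entry=1, updated_tuples=100)
x_xid, y_xid = worst_case_loss(10_000, 100, pages_per_entry=100, updated_tuples=100)

print(f"one bit per page:      X = {x_bit:.2%}, Y = {y_bit:.2%}")
print(f"one XID per 100 pages: X = {x_xid:.2%}, Y = {y_xid:.2%}")
```

In this made-up case the same 0.01% of updated rows takes out 1% of the table with a per-page bit, and the whole table with one entry per 100 pages.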
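
For the "soft conflict" idea from the first part of this message, the ordering rules can be sketched as a toy model in Python. This is not PostgreSQL code: `StandbyState`, `replay_mark_all_visible`, and `needs_heap_fetch` are hypothetical names, XIDs are plain integers with no wraparound handling, and "too new" is modeled (as an assumption) as the tracked XID being at or above the snapshot's xmin. Python cannot show real memory barriers, so the comments only mark where they would go in C.

```python
from dataclasses import dataclass, field


@dataclass
class StandbyState:
    # Newest XID carried by any mark-all-visible record replayed so far
    # (stands in for the single value kept in shared memory).
    latest_all_visible_xid: int = 0
    # Heap pages currently marked all-visible (stands in for the map bits).
    all_visible_pages: set = field(default_factory=set)


def replay_mark_all_visible(state, page, newest_xid_on_page):
    """Replay side: publish the XID *before* setting the all-visible bit."""
    if newest_xid_on_page > state.latest_all_visible_xid:
        state.latest_all_visible_xid = newest_xid_on_page
    # In C, a write barrier would sit here so the XID store is seen first.
    state.all_visible_pages.add(page)


def needs_heap_fetch(state, page, snapshot_xmin):
    """Scan side: check the bit *before* reading the shared XID."""
    if page not in state.all_visible_pages:
        return True  # ordinary case: page is not all-visible
    # In C, a read barrier would sit here so the bit is read first.
    xid = state.latest_all_visible_xid
    # Assumption: the snapshot may not see this XID as committed when
    # xid >= snapshot_xmin, so fall back to the heap (a "soft conflict").
    return xid >= snapshot_xmin


state = StandbyState()
replay_mark_all_visible(state, page=7, newest_xid_on_page=100)
old_snapshot_fetches = needs_heap_fetch(state, 7, snapshot_xmin=90)
new_snapshot_fetches = needs_heap_fetch(state, 7, snapshot_xmin=150)
```

Because the model tracks a single XID for the whole standby, one recently replayed record forces heap fetches on every page for an old snapshot, which is the case where performance could suffer.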