Hi,

Since my attempts to find a simple solution for the read-only query locking problems (one that doesn't need full WAL logging of lock requests) haven't been successful yet, I've decided to turn to the problem of tracking a snapshot on the slaves for now. First, such a snapshot is needed for any kind of concurrent recovery anyway; second, any non-simplistic solution to the locking problems will quite likely benefit from such a snapshot.

The idea is to create a special kind of snapshot that works basically like an MVCC snapshot, but with the meaning of the xip array inverted. Usually, if an xid is *not* in the xip array of a snapshot, and is greater than the xmin of that snapshot, the clog state of the xid determines tuple visibility. This is not well suited for queries running during replay, because the effects of an xlog record with a (to the slave) previously unknown xid shouldn't be visible to concurrently running queries.

Therefore, a flag xip_inverted will be added to SnapshotData. It causes HeapTupleSatisfiesMVCC to assume that any xid >= xmin that is *not* in the xip array is still in progress.

This allows the following to work:

.) Store RecentXmin with every xlog record, in a new field xl_xmin. (It wouldn't be needed in *every* record, but for now keeping it directly inside XLogRecord makes things easier, and it's just 4 bytes.)

.) Maintain a global snapshot template in shmem during replay, with its xmin being the highest xmin seen so far in any xlog record. That template is copied whenever a read-only query needs to obtain a snapshot.

.) Upon replaying a COMMIT or COMMIT_PREPARED record, the xid of the to-be-committed transaction is added to the xip array of the global snapshot, making the commit visible to all further copies of that snapshot.

If you can shoot this down, you're welcome to do so ;-)

greetings, Florian Pflug
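
For illustration, the scheme described above can be sketched in a few lines of standalone C. This is only a rough sketch under stated assumptions, not PostgreSQL code: SketchSnapshot and every sketch_* function are hypothetical names made up for this example, only xip_inverted and xl_xmin come from the proposal itself, and xid wraparound, shared-memory locking, the clog lookup, and pruning of old xip entries are all left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

#define SKETCH_MAX_XIP 64	/* hypothetical fixed capacity, for the sketch only */

/* Hypothetical, heavily simplified stand-in for SnapshotData. */
typedef struct SketchSnapshot
{
	TransactionId xmin;		/* xids below this are decided by clog alone */
	TransactionId xmax;		/* only consulted by ordinary snapshots here */
	bool		xip_inverted;	/* the proposed flag */
	uint32_t	xcnt;		/* number of valid entries in xip[] */
	TransactionId xip[SKETCH_MAX_XIP];
} SketchSnapshot;

static bool
sketch_xid_in_xip(const SketchSnapshot *snap, TransactionId xid)
{
	for (uint32_t i = 0; i < snap->xcnt; i++)
		if (snap->xip[i] == xid)
			return true;
	return false;
}

/*
 * Returns true if the snapshot must treat xid as still in progress.
 * If it returns false, the clog state of xid would decide visibility
 * (that lookup is not modelled here). Wraparound is ignored.
 */
static bool
sketch_xid_in_progress(const SketchSnapshot *snap, TransactionId xid)
{
	if (xid < snap->xmin)
		return false;
	if (!snap->xip_inverted && xid >= snap->xmax)
		return true;

	bool		listed = sketch_xid_in_xip(snap, xid);

	/* Ordinary: listed means running. Inverted: listed means finished. */
	return snap->xip_inverted ? !listed : listed;
}

/* Replay of any record: keep the highest xl_xmin seen so far. */
static void
sketch_replay_record(SketchSnapshot *tmpl, TransactionId xl_xmin)
{
	if (xl_xmin > tmpl->xmin)
		tmpl->xmin = xl_xmin;
}

/* Replay of COMMIT / COMMIT_PREPARED: publish the committed xid. */
static void
sketch_replay_commit(SketchSnapshot *tmpl, TransactionId xid)
{
	assert(tmpl->xcnt < SKETCH_MAX_XIP);
	tmpl->xip[tmpl->xcnt++] = xid;
}

/* A read-only query takes its snapshot by copying the template. */
static SketchSnapshot
sketch_get_snapshot(const SketchSnapshot *tmpl)
{
	return *tmpl;
}
```

In this sketch, a snapshot copied before sketch_replay_commit() runs for some xid keeps treating that xid as in progress, while any copy taken afterwards hands the decision over to clog, which is the behaviour the third step asks for.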