On Wed, Nov 16, 2016 at 9:26 AM, Thomas Munro
<thomas.mu...@enterprisedb.com> wrote:
> On Tue, Nov 8, 2016 at 5:56 PM, Thomas Munro
> <thomas.mu...@enterprisedb.com> wrote:
> [..] Another solution
> could be to have recovery on the standby detect tokens (CSNs
> incremented by PreCommit_CheckForSerializationFailure) arriving out of
> order, but I don't know what exactly it should do about that when it
> is detected: you shouldn't respect an out-of-order claim of safety,
> but then what should you wait for? Perhaps if the last replayed
> commit record before that was marked SNAPSHOT_SAFE then it's OK to
> leave it that way, and if it was marked SNAPSHOT_SAFETY_UNKNOWN then
> you have to wait for that one to be resolved by a follow-up snapshot
> safety message and then rinse-and-repeat (take a new snapshot etc). I
> think that might work, but it seems strange to allow random races on
> the primary to create extra delays on the standby. Perhaps there is
> some much simpler way to do all this that I'm missing.
>
> Another detail is that standbys that start up from a checkpoint and
> don't see any SSI transactions commit don't yet have any snapshot
> safety information, but defaulting to assuming that this point is safe
> doesn't seem right, so I suspect it needs to be in checkpoints.
>
> Attached is a tidied up version which doesn't try to address the above
> problems yet. When time permits I'll come back to this.

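To make the quoted idea about out-of-order tokens a bit more concrete,
here is a minimal standalone sketch in C. It is not taken from the
patch: StandbyState, CommitRecord, replay_commit() and
must_wait_for_safety_message() are hypothetical names, and only
SNAPSHOT_SAFE and SNAPSHOT_SAFETY_UNKNOWN come from the text above. It
also assumes that a standby with no safety information yet has to wait
rather than assume safety.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustration only: every name here is hypothetical, not PostgreSQL code. */
typedef enum
{
    SNAPSHOT_SAFE,
    SNAPSHOT_SAFETY_UNKNOWN
} SnapshotSafety;

/* A replayed commit record, reduced to the two fields the rule needs. */
typedef struct
{
    uint64_t        token;   /* CSN-like counter stamped on the primary */
    SnapshotSafety  safety;  /* safety claim carried by the record */
} CommitRecord;

/* What the standby remembers between replayed commit records. */
typedef struct
{
    uint64_t        highest_token;  /* highest token accepted so far */
    SnapshotSafety  safety;         /* state left by the last accepted record */
    bool            have_record;    /* false until a record has been replayed */
} StandbyState;

/*
 * Replay one commit record.  Returns true if the record's token is lower
 * than one already replayed; in that case its safety claim is ignored and
 * the state left by the previous record is kept.
 */
bool
replay_commit(StandbyState *st, const CommitRecord *rec)
{
    assert(st != NULL && rec != NULL);

    if (st->have_record && rec->token < st->highest_token)
        return true;            /* out of order: do not trust its claim */

    st->highest_token = rec->token;
    st->safety = rec->safety;
    st->have_record = true;
    return false;
}

/*
 * A reader has to wait for a follow-up snapshot safety message when the
 * current state is unknown, or (an assumption of this sketch) when no
 * safety information has been seen at all.
 */
bool
must_wait_for_safety_message(const StandbyState *st)
{
    assert(st != NULL);
    return !st->have_record || st->safety == SNAPSHOT_SAFETY_UNKNOWN;
}
```

For example, replaying tokens 1 (safe), 3 (unknown) and then 2 (safe)
leaves the standby in the unknown state: the late claim carried by
token 2 is ignored, so a reader would still have to wait for token 3 to
be resolved and then take a new snapshot.
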
I haven't looked at this again yet, but a nearby thread reminded me of
another problem with it that I want to restate explicitly here in the
context of this patch. Even without replication in the picture, there
is a race to reach ProcArrayEndTransaction() after
RecordTransactionCommit() runs, which means that the DO history (normal
primary server) and the REDO history (recovery) don't always agree on
the order in which transactions become visible. With this patch, that
kind of divergence between DO and REDO could allow undetectable
read-only serialization anomalies.

I think that ProcArrayEndTransaction() and RecordTransactionCommit()
need to be made atomic in the simple case so that DO and REDO agree.
Synchronous replication can make such divergence more likely, and it
seems like some other approach is probably needed to delay visibility
of not-yet-durable transactions while keeping the order in which
transactions become visible the same on all nodes.

Aside from the problems I mentioned in my earlier message (the race
between the snapshot safety decision and logging order, and the lack of
checkpointing of snapshot safety information), it seems like the two DO
vs REDO problems (the race to ProcArrayEndTransaction(), and
deliberately delayed visibility in syncrep) also need to be addressed
before SERIALIZABLE DEFERRABLE on standbys could make a watertight
guarantee.

--
Thomas Munro
http://www.enterprisedb.com