Hi, Michael Paquier <mich...@paquier.xyz> wrote:
> On Wed, Feb 20, 2019 at 11:50:42AM +0100, Oleksii Kliukin wrote: >> RecoverPreparedTransactions calls ProcessRecords with >> twophase_recover_callbacks right after releasing the TwoPhaseStateLock, so I >> think lock_held should be false here (matching the similar call of >> TwoPhaseGetDummyProc at lock_twophase_recover). > > Indeed. That would be a bad idea. We don't actually stress this > routine in 009_twophase.pl or the assertion I added would have blown > up immediately. So I propose to close the gap, and the updated patch > attached adds dedicated tests causing the problem you spotted to be > triggered (soft and hard restarts with 2PC transactions including > DDLs). The previous version of the patch and those tests cause the > assertion to blow up, failing the tests. Thank you for updating the patch and sorry for the delay, it looks good to me, the tests pass on my system. >> Since the patch touches TwoPhaseGetDummyBackendId, let’s fix the comment >> header to mention it instead of TwoPhaseGetDummyProc > > Not sure I understand the issue you are pointing out here. The patch > adds an extra argument to both TwoPhaseGetDummyProc() and > TwoPhaseGetDummyBackendId(), and the headers of both functions > document the new argument. Just this: @@ -844,17 +851,18 @@ TwoPhaseGetGXact(TransactionId xid) } /* - * TwoPhaseGetDummyProc + * TwoPhaseGetDummyBackendId * Get the dummy backend ID for prepared transaction specified by XID > > One extra trick I have been using for testing this patch is the > following: > --- a/src/backend/access/transam/twophase.c > +++ b/src/backend/access/transam/twophase.c > @@ -816,13 +816,6 @@ TwoPhaseGetGXact(TransactionId xid, bool lock_held) > > Assert(!lock_held || LWLockHeldByMe(TwoPhaseStateLock)); > > - /* > - * During a recovery, COMMIT PREPARED, or ABORT PREPARED, we'll be called > - * repeatedly for the same XID. We can save work with a simple cache. > - */ > - if (xid == cached_xid) > - return cached_gxact; > > This has the advantage to check more aggressively for lock conflicts, > causing the tests to deadlock if the flag is not correctly set in the > worst case scenario. Nice, thank you. I ran all my tests with it and found no further issues. Regards, Oleksii Kliukin