Hi all,

2PC files are currently created using RecreateTwoPhaseFile() in two places:
- At replay, on an XLOG_XACT_PREPARE record.
- At checkpoint, with CheckPointTwoPhase().
Now, RecreateTwoPhaseFile() is careful to call pg_fsync() to be sure that the contents of the 2PC files make it to disk. But one piece is missing: the parent directory, pg_twophase, is not fsync'd, so the directory entries for newly created files are not guaranteed to be durable.

This is more sensitive at replay, when a PREPARE record is followed by a checkpoint record. If there is a power failure after the checkpoint completes, there is a risk of losing 2PC state files, because the redo pointer has already moved past the PREPARE record and it will not be replayed again.

It seems to me that we really should have CheckPointTwoPhase() call fsync() on pg_twophase to be sure that no files are lost here. There is no point in doing this in RecreateTwoPhaseFile(): if there are many 2PC transactions to replay, performance would be impacted, and we don't care about the durability of those files until a checkpoint moves the redo pointer.

I have drafted the attached patch to address this issue, and I am adding it to the next CF for consideration.

Thoughts?
-- Michael
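For readers less familiar with the issue, here is a minimal standalone POSIX sketch of the pattern described above: each file's contents are fsync'd when the file is written, and the parent directory is fsync'd once after all files are in place. This is not the attached patch and does not use PostgreSQL's internal file APIs; fsync_dir(), write_state_file() and checkpoint_state_files() are hypothetical names, and the 8-hex-digit file naming is only illustrative.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a directory so that entries created in it
 * survive a crash. Some platforms reject fsync() on a directory; this
 * sketch simply reports any failure to the caller.
 */
static int
fsync_dir(const char *dirpath)
{
	int			fd;
	int			rc;

	assert(dirpath != NULL);

	fd = open(dirpath, O_RDONLY);
	if (fd < 0)
		return -1;
	rc = fsync(fd);
	close(fd);
	return rc;
}

/*
 * Hypothetical helper: write one state file and fsync its contents.
 * This makes the data durable, but not the directory entry.
 */
static int
write_state_file(const char *path, const void *buf, size_t len)
{
	int			fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0600);

	if (fd < 0)
		return -1;
	if (write(fd, buf, len) != (ssize_t) len || fsync(fd) != 0)
	{
		close(fd);
		return -1;
	}
	return close(fd);
}

/*
 * Checkpoint-style routine: write all files, then fsync the directory
 * once, rather than once per file.
 */
static int
checkpoint_state_files(const char *dir, int nfiles)
{
	char		path[1024];
	int			i;

	for (i = 0; i < nfiles; i++)
	{
		snprintf(path, sizeof(path), "%s/%08X", dir, (unsigned int) (i + 1));
		if (write_state_file(path, "demo", 4) != 0)
			return -1;
	}
	return fsync_dir(dir);
}
```

Calling checkpoint_state_files("/some/dir", n) pays for a single directory fsync regardless of n, which is the reason for doing it at checkpoint time rather than per file.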
2pc-loss.patch
Description: application/download