On Sat, Sep 15, 2018 at 12:29 AM, Alvaro Herrera
<[email protected]> wrote:
> On 2018-Sep-15, Masahiko Sawada wrote:
>
>> On Fri, Sep 14, 2018 at 4:27 PM, 李海龙 <[email protected]> wrote:
>
>> > When I enable the parameter track_commit_timestamp in postgresql.conf of a
>> > Base Backup (making a Base Backup from a standby and the
>> > track_commit_timestamp is off on it),
>>
>> In addition to the above operation, I've reproduced this issue by
>> replaying a commit WAL record that sets the timestamp to a new page
>> during the crash recovery (or from restart).
>>
>> It seems to me that the cause of this is that we could not extend
>> commitTs page since the COMMIT_TS_ZEROPAGE WAL wasn't generated at the
>> standby server whose track_commit_timestamp is off. So during
>> replaying the commit WAL record the startup process fails since the
>> corresponding commitTs page doesn't exist.
>
> Hmm, wow. I wonder if it's possible to detect the config difference
> early enough that the zeropage WAL records are emitted, instead. But
> even this might not work, since some transactions need to have their
> commitTS in pages that will not have been zeroed anyway, because the
> page threshold was crossed in the old primary.
>
>> To fix that maybe we can disable commitTs if
>> controlFile->track_commit_timestamp == false and the
>> track_commit_timestamp == true even in crash recovery.
>
> Hmm, so keep it off while crash recovery runs, and once it's out of that
> then enable it automatically?
Yes. The attached patch changes it to check
controlFile->track_commit_timestamp even the crash recover case. If
track_commit_timestamp is set to true in the config file, it's enabled
at end of the recovery.
> That might work -- by definition we don't
> care about the commit TSs of the transaction replayed during crash
> recovery, since they were executed in the primary that didn't have
> commitTS enable anyway.
>
> It seems like the first thing we need is TAP cases that reproduce these
> two crash scenarios.
I attached TAP test that reproduces this issue. We can reproduce it
even with single server; making postgres replay a commit WAL in the
crash recovery after consumed transactions and enabled
track_commit_timestamp.
Regards,
--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 493f1db..805bbe1 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6809,11 +6809,12 @@ StartupXLOG(void)
StartupMultiXact();
/*
- * Ditto commit timestamps. In a standby, we do it if setting is enabled
- * in ControlFile; in a master we base the decision on the GUC itself.
+ * Ditto commit timestamps. In both a standby and a master, we do it if
+ * setting is enabled in ControlFile since we don't care about the commit
+ * timestamps of the transaction that were executed when the commit
+ * timestamp is disabled.
*/
- if (ArchiveRecoveryRequested ?
- ControlFile->track_commit_timestamp : track_commit_timestamp)
+ if (ControlFile->track_commit_timestamp)
StartupCommitTs();
/*
diff --git a/src/test/modules/commit_ts/t/005_recovery.pl b/src/test/modules/commit_ts/t/005_recovery.pl
new file mode 100644
index 0000000..cdc1cf8
--- /dev/null
+++ b/src/test/modules/commit_ts/t/005_recovery.pl
@@ -0,0 +1,53 @@
+# Recovery test
+
+use strict;
+use warnings;
+use PostgresNode;
+use TestLib;
+use Test::More tests => 1;
+
+my $node = get_new_node('test');
+$node->init;
+$node->append_conf('postgresql.conf',
+ "track_commit_timestamp = off");
+$node->start;
+
+# When we start firstly from the initdb the PARAMETER_CHANGES
+# is emitted at end of the recovery, which disables the
+# track_commit_timestamp if the crash recovery replay that
+# WAL. Therefore we restart the server so that we can recovery
+# from the point where doesn't contain that WAL.
+$node->restart;
+
+# Consume 2000 XIDs to beyond the commitTS page boundary.
+$node->safe_psql(
+ 'postgres',
+ qq(
+CREATE PROCEDURE comsume_xid(cnt int)
+AS \$\$
+DECLARE
+ i int;
+BEGIN
+ FOR i in 1..cnt LOOP
+ EXECUTE 'SELECT txid_current()';
+ COMMIT;
+ END LOOP;
+END;
+\$\$
+LANGUAGE plpgsql;
+));
+$node->safe_psql('postgres', 'CALL comsume_xid(2000)');
+
+$node->teardown_node;
+
+# Enable track_commit_tiemstamp
+$node->append_conf('postgresql.conf',
+ "track_commit_timestamp = on");
+
+# During the crash recovery we replay the commit WAL that sets
+# the commit timestamp to a new page.
+$node->start;
+
+# Check if the server launched.
+is($node->psql('postgres', qq(SELECT 1)), 0,
+ 'started from the crash recovery');