On Fri, Apr 8, 2022 at 7:28 PM Robert Haas <robertmh...@gmail.com> wrote:
>
> On Fri, Apr 8, 2022 at 9:31 AM Bharath Rupireddy
> <bharath.rupireddyforpostg...@gmail.com> wrote:
> > Fundamental question - should the pg_walfile_{name, name_offset} check
> > whether the file with the computed WAL file name exists on the server
> > right now or ever existed earlier? Right now, they don't do that, see
> > [1].
>
> I don't think that checking whether the file exists is the right
> approach. However, I do think that it's important to be precise about
> which TLI is going to be used. I think it would be reasonable to
> redefine this function (on both the primary and the standby) so that
> the TLI that is used is the one that was in effect at the time record
> at the given LSN was either written or replayed. Then, you could
> potentially use this function to figure out whether you still have the
> WAL files that are needed to replay up to some previous point in the
> WAL stream. However, what about the segments where we switched from
> one TLI to the next in the middle of the segment? There, you probably
> need both the old and the new segments, or maybe if you're trying to
> stream them you only need the new one because we have some weird
> special case that will send the segment from the new timeline when the
> segment from the old timeline is requested. So you couldn't just call
> this function on one LSN per segment and call it good, and it wouldn't
> necessarily be the case that the filenames you got back were exactly
> the ones you needed.
>
> So I'm not entirely sure this proposal is good enough, but it at least
> would have the advantage of meaning that the filename you get back is
> one that existed at some point in time and somebody used it for
> something.

Using the insert TLI when not in recovery, and the TLI of the last
replayed WAL record during crash/archive/standby recovery, seems a
reasonable choice to me. I've also added a note in the docs.
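
For illustration only (this is not part of the patch), here is a small
standalone sketch of how the chosen TLI ends up in the file name. It
assumes the same arithmetic as XLByteToPrevSeg() and XLogFileName();
sketch_walfile_name and WAL_FNAME_LEN are hypothetical names I made up
for this example, and a nonzero LSN is assumed.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define WAL_FNAME_LEN 25		/* 24 hex digits plus terminating NUL */

/*
 * Hypothetical standalone helper, not PostgreSQL code: builds a WAL file
 * name from a timeline ID, an LSN and the WAL segment size, following the
 * arithmetic assumed from XLByteToPrevSeg() and XLogFileName().
 */
static void
sketch_walfile_name(char *buf, size_t buflen, uint32_t tli,
					uint64_t lsn, uint64_t seg_size)
{
	uint64_t	segs_per_xlogid;
	uint64_t	segno;

	/* Sketch-only preconditions: nonzero LSN and segment size. */
	assert(lsn > 0 && seg_size > 0);

	/* Number of segments that fit in one 4GB "xlogid" unit. */
	segs_per_xlogid = UINT64_C(0x100000000) / seg_size;

	/* Segment containing the byte just before the LSN ("prev seg"). */
	segno = (lsn - 1) / seg_size;

	/* Name is TLI, high part and low part, each as 8 hex digits. */
	snprintf(buf, buflen, "%08" PRIX32 "%08" PRIX32 "%08" PRIX32,
			 tli,
			 (uint32_t) (segno / segs_per_xlogid),
			 (uint32_t) (segno % segs_per_xlogid));
}
```

With 16MB segments, LSN 0/3000060 maps to segment 3 either way, so the
sketch yields 000000010000000000000003 for TLI 1 and
000000020000000000000003 for TLI 2. Only the leading eight hex digits
depend on which TLI the functions pick.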

Attaching v2 with the above change. Please review it further.

Regards,
Bharath Rupireddy.
From 37a7587abb11b6ebcb82d0fbf3cff7505355a679 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Date: Sat, 9 Apr 2022 13:14:03 +0000
Subject: [PATCH v2] Allow pg_walfile_{name, name_offset} to run in recovery

Right now, pg_walfile_name and pg_walfile_name_offset don't run
while the server is in recovery (standby, PITR/archive, crash).
This reduces their usability if the server opens up for
read-only connections in recovery.

This patch enables them to compute the WAL file name even in recovery,
using the timeline ID of the last successfully replayed WAL
record. When not in recovery, they continue to use the timeline ID
with which new WAL records are being inserted and flushed.
---
 doc/src/sgml/func.sgml                 |  8 +++++
 src/backend/access/transam/xlogfuncs.c | 48 +++++++++++++++++---------
 2 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5047e090db..0a08e34813 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -28442,6 +28442,14 @@ postgres=# SELECT * FROM pg_walfile_name_offset((pg_backup_stop()).lsn);
     needs to be archived.
    </para>
 
+   <para>
+    Both <function>pg_walfile_name_offset</function> and <function>pg_walfile_name</function>
+    use a timeline ID internally to compute the write-ahead log file name. When
+    not in recovery, they use the timeline ID with which new write-ahead log
+    records are being inserted and flushed. When in recovery, they use the
+    timeline ID of the last successfully replayed write-ahead log record.
+   </para>
+
   </sect2>
 
   <sect2 id="functions-recovery-control">
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index b61ae6c0b4..795fd8d26b 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -44,6 +44,8 @@
 static StringInfo label_file;
 static StringInfo tblspc_map_file;
 
+static TimeLineID GetTLIForWALFileNameComputation(void);
+
 /*
  * pg_backup_start: set up for taking an on-line backup dump
  *
@@ -316,6 +318,30 @@ pg_last_wal_replay_lsn(PG_FUNCTION_ARGS)
 	PG_RETURN_LSN(recptr);
 }
 
+/*
+ * Get the timeline ID for computing a WAL file name.
+ *
+ * When not in recovery, it returns the timeline into which new WAL is being
+ * inserted and flushed.
+ *
+ * When in crash/archive/standby recovery, it returns the timeline of the last
+ * WAL record that is successfully replayed.
+ */
+static TimeLineID
+GetTLIForWALFileNameComputation(void)
+{
+	TimeLineID	tli;
+
+	if (RecoveryInProgress())
+		(void) GetXLogReplayRecPtr(&tli);
+	else
+		tli = GetWALInsertionTimeLine();
+
+	Assert(tli > 0);
+
+	return tli;
+}
+
 /*
  * Compute an xlog file name and decimal byte offset given a WAL location,
  * such as is returned by pg_backup_stop() or pg_switch_wal().
@@ -336,13 +362,9 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDesc	resultTupleDesc;
 	HeapTuple	resultHeapTuple;
 	Datum		result;
+	TimeLineID	tli;
 
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name_offset()")));
+	tli = GetTLIForWALFileNameComputation();
 
 	/*
 	 * Construct a tuple descriptor for the result row.  This must match this
@@ -360,8 +382,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	 * xlogfilename
 	 */
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno,
-				 wal_segment_size);
+	XLogFileName(xlogfilename, tli, xlogsegno, wal_segment_size);
 
 	values[0] = CStringGetTextDatum(xlogfilename);
 	isnull[0] = false;
@@ -394,17 +415,12 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 	XLogSegNo	xlogsegno;
 	XLogRecPtr	locationpoint = PG_GETARG_LSN(0);
 	char		xlogfilename[MAXFNAMELEN];
+	TimeLineID	tli;
 
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name()")));
+	tli = GetTLIForWALFileNameComputation();
 
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, GetWALInsertionTimeLine(), xlogsegno,
-				 wal_segment_size);
+	XLogFileName(xlogfilename, tli, xlogsegno, wal_segment_size);
 
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
-- 
2.25.1
