Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-23 Thread Magnus Hagander
On Fri, Dec 23, 2022 at 2:06 AM Michael Paquier  wrote:

> On Thu, Dec 22, 2022 at 05:03:35PM +0530, Bharath Rupireddy wrote:
> > On Thu, Dec 22, 2022 at 4:57 PM Michael Paquier 
> wrote:
> >> As in using "sequence number" removing "file" from the docs and
> >> changing the OUT parameter name to segment_number rather than segno?
> >> Fine by me.
> >
> > +1.
>
> Okay, done this way.
>
>
Thanks!

//Magnus


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-22 Thread Michael Paquier
On Thu, Dec 22, 2022 at 05:03:35PM +0530, Bharath Rupireddy wrote:
> On Thu, Dec 22, 2022 at 4:57 PM Michael Paquier  wrote:
>> As in using "sequence number" removing "file" from the docs and
>> changing the OUT parameter name to segment_number rather than segno?
>> Fine by me.
> 
> +1.

Okay, done this way.
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-22 Thread Bharath Rupireddy
On Thu, Dec 22, 2022 at 4:57 PM Michael Paquier  wrote:
>
> On Thu, Dec 22, 2022 at 05:19:24PM +0900, Kyotaro Horiguchi wrote:
> > In the first place "file sequence number" and "segno" can hardly be
> > associated by appearance by readers, I think.  (Yeah, we can identify
> > that since the another parameter is identifiable alone.) Why don't we
> > spell out the parameter simply as "segment number"?
>
> As in using "sequence number" removing "file" from the docs and
> changing the OUT parameter name to segment_number rather than segno?
> Fine by me.

+1.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-22 Thread Michael Paquier
On Thu, Dec 22, 2022 at 05:19:24PM +0900, Kyotaro Horiguchi wrote:
> In the first place "file sequence number" and "segno" can hardly be
> associated by appearance by readers, I think.  (Yeah, we can identify
> that since the another parameter is identifiable alone.) Why don't we
> spell out the parameter simply as "segment number"?

As in using "sequence number" removing "file" from the docs and
changing the OUT parameter name to segment_number rather than segno?
Fine by me.
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-22 Thread Kyotaro Horiguchi
At Thu, 22 Dec 2022 10:09:23 +0530, Bharath Rupireddy 
 wrote in 
> On Thu, Dec 22, 2022 at 7:57 AM Michael Paquier  wrote:
> >
> > On Wed, Dec 21, 2022 at 10:22:02PM +0100, Magnus Hagander wrote:
> > > Basically, we take one thing and turn it into 3. That very naturally rings
> > > with "split" to me.
> > >
> > > Parse might work as well, certainly better than dissect. I'd still prefer
> > > split though.
> >
> > Honestly, I don't have any counter-arguments, so I am fine to switch
> > the name as you are suggesting.  And pg_split_walfile_name() it is?
> 
> +1. FWIW, a simple patch is here
> https://www.postgresql.org/message-id/CALj2ACXdZ7WGRD-_jPPeZugvWLN%2Bgxo3QtV-eZPRicUwjesM%3Dg%40mail.gmail.com.

By the way the function is documented as the follows.

>  Extracts the file sequence number and timeline ID from a WAL file name.

I didn't find the definition for the workd "file sequence number" in
the doc. Instead I find "segment number" (a bit doubtful, though..).

In the first place "file sequence number" and "segno" can hardly be
associated by appearance by readers, I think.  (Yeah, we can identify
that since the another parameter is identifiable alone.) Why don't we
spell out the parameter simply as "segment number"?

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-21 Thread Bharath Rupireddy
On Thu, Dec 22, 2022 at 7:57 AM Michael Paquier  wrote:
>
> On Wed, Dec 21, 2022 at 10:22:02PM +0100, Magnus Hagander wrote:
> > Basically, we take one thing and turn it into 3. That very naturally rings
> > with "split" to me.
> >
> > Parse might work as well, certainly better than dissect. I'd still prefer
> > split though.
>
> Honestly, I don't have any counter-arguments, so I am fine to switch
> the name as you are suggesting.  And pg_split_walfile_name() it is?

+1. FWIW, a simple patch is here
https://www.postgresql.org/message-id/CALj2ACXdZ7WGRD-_jPPeZugvWLN%2Bgxo3QtV-eZPRicUwjesM%3Dg%40mail.gmail.com.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-21 Thread Michael Paquier
On Wed, Dec 21, 2022 at 10:22:02PM +0100, Magnus Hagander wrote:
> Basically, we take one thing and turn it into 3. That very naturally rings
> with "split" to me.
> 
> Parse might work as well, certainly better than dissect. I'd still prefer
> split though.

Honestly, I don't have any counter-arguments, so I am fine to switch
the name as you are suggesting.  And pg_split_walfile_name() it is?
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-21 Thread Magnus Hagander
On Wed, Dec 21, 2022 at 1:09 AM Michael Paquier  wrote:

> On Tue, Dec 20, 2022 at 06:04:40PM +0530, Bharath Rupireddy wrote:
> > On Tue, Dec 20, 2022 at 1:27 PM Magnus Hagander 
> wrote:
> >> Caught this thread late. To me, pg_dissect_walfile_name() is a
> >> really strange name for a function. Grepping our I code I see the
> >> term dissect s used somewhere inside the regex code and exactly
> >> zero instances elsewhere. Which is why I definitely didn't
> >> recognize the term...
> >>
> >> Wouldn't something like pg_split_walfile_name() be a lot more
> >> consistent with the rest of our names?
>
> Fine by me to change that if there is little support for the current
> naming, though the current one does not sound that bad to me either.
>
> > Hm. FWIW, here's the patch.
>
> "split" is used a lot for the picksplit functions, but not in any of
> the existing functions as a name.  Some extra options: parse, read,
> extract, calculate, deduce, get.  "parse" would be something I would
> be OK with.
>


Not sure what you mean? We certainly have a lot of functions called split
that are not the picksplit ones. split_part(). regexp_split_to_array(),
regexp_split_to_table()... And ther'es things like tuiple_data_split() in
pageinspect.

There are many other examples outside of postgres as well, e.g. python has
a split() of pathnames, "almost every language" has a split() on strings
etc. I don't think I've ever seen dissect in a place like that either
(though Im sure it exists somewhere, it's hardly common)

Basically, we take one thing and turn it into 3. That very naturally rings
with "split" to me.

Parse might work as well, certainly better than dissect. I'd still prefer
split though.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-20 Thread Bharath Rupireddy
On Wed, Dec 21, 2022 at 5:39 AM Michael Paquier  wrote:
>
> On Tue, Dec 20, 2022 at 06:04:40PM +0530, Bharath Rupireddy wrote:
> > On Tue, Dec 20, 2022 at 1:27 PM Magnus Hagander  wrote:
> >> Caught this thread late. To me, pg_dissect_walfile_name() is a
> >> really strange name for a function. Grepping our I code I see the
> >> term dissect s used somewhere inside the regex code and exactly
> >> zero instances elsewhere. Which is why I definitely didn't
> >> recognize the term...
> >>
> >> Wouldn't something like pg_split_walfile_name() be a lot more
> >> consistent with the rest of our names?
>
> Fine by me to change that if there is little support for the current
> naming, though the current one does not sound that bad to me either.
>
> > Hm. FWIW, here's the patch.
>
> "split" is used a lot for the picksplit functions, but not in any of
> the existing functions as a name.  Some extra options: parse, read,
> extract, calculate, deduce, get.  "parse" would be something I would
> be OK with.

"dissect", "split" and "parse" -  I'm okay with either of these.

Read somewhere - a saying that goes this way "the hardest part of
coding is to name variables and functions" :).

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-20 Thread Michael Paquier
On Tue, Dec 20, 2022 at 06:04:40PM +0530, Bharath Rupireddy wrote:
> On Tue, Dec 20, 2022 at 1:27 PM Magnus Hagander  wrote:
>> Caught this thread late. To me, pg_dissect_walfile_name() is a
>> really strange name for a function. Grepping our I code I see the
>> term dissect s used somewhere inside the regex code and exactly
>> zero instances elsewhere. Which is why I definitely didn't
>> recognize the term... 
>>
>> Wouldn't something like pg_split_walfile_name() be a lot more
>> consistent with the rest of our names? 

Fine by me to change that if there is little support for the current
naming, though the current one does not sound that bad to me either.

> Hm. FWIW, here's the patch.

"split" is used a lot for the picksplit functions, but not in any of
the existing functions as a name.  Some extra options: parse, read,
extract, calculate, deduce, get.  "parse" would be something I would
be OK with.
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-20 Thread Ian Lawrence Barwick
2022年12月20日(火) 21:35 Bharath Rupireddy :
>
> On Tue, Dec 20, 2022 at 1:27 PM Magnus Hagander  wrote:
> >
> > On Tue, Dec 20, 2022 at 5:40 AM Michael Paquier  wrote:
> >>
> >> On Tue, Dec 20, 2022 at 09:01:02AM +0900, Michael Paquier wrote:
> >> > Yeah, my mind was considering as well yesterday the addition of a note
> >> > in the docs about something among these lines, so fine by me.
> >>
> >> And applied that, after tweaking a few tiny things on a last lookup
> >> with a catversion bump.  Note that the example has been moved at the
> >> bottom of the table for these functions, which is more consistent with
> >> the surroundings.
> >>
> >
> > Hi!
> >
> > Caught this thread late. To me, pg_dissect_walfile_name() is a really 
> > strange name for a function. Grepping our I code I see the term dissect s 
> > used somewhere inside the regex code and exactly zero instances elsewhere. 
> > Which is why I definitely didn't recognize the term...

Late to the party too, but I did wonder about the name when I saw it.

> > Wouldn't something like pg_split_walfile_name() be a lot more consistent 
> > with the rest of our names?
>
> Hm. FWIW, here's the patch.

Hmm, "pg_split_walfile_name()" sounds like it would return three 8
character strings.

Maybe something like "pg_walfile_name_elements()" ?

Regards

Ian Barwick




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-20 Thread Bharath Rupireddy
On Tue, Dec 20, 2022 at 1:27 PM Magnus Hagander  wrote:
>
> On Tue, Dec 20, 2022 at 5:40 AM Michael Paquier  wrote:
>>
>> On Tue, Dec 20, 2022 at 09:01:02AM +0900, Michael Paquier wrote:
>> > Yeah, my mind was considering as well yesterday the addition of a note
>> > in the docs about something among these lines, so fine by me.
>>
>> And applied that, after tweaking a few tiny things on a last lookup
>> with a catversion bump.  Note that the example has been moved at the
>> bottom of the table for these functions, which is more consistent with
>> the surroundings.
>>
>
> Hi!
>
> Caught this thread late. To me, pg_dissect_walfile_name() is a really strange 
> name for a function. Grepping our I code I see the term dissect s used 
> somewhere inside the regex code and exactly zero instances elsewhere. Which 
> is why I definitely didn't recognize the term...
>
> Wouldn't something like pg_split_walfile_name() be a lot more consistent with 
> the rest of our names?

Hm. FWIW, here's the patch.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com


v1-0001-Rename-pg_dissect_walfile_name-to-pg_split_walfil.patch
Description: Binary data


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-19 Thread Magnus Hagander
On Tue, Dec 20, 2022 at 5:40 AM Michael Paquier  wrote:

> On Tue, Dec 20, 2022 at 09:01:02AM +0900, Michael Paquier wrote:
> > Yeah, my mind was considering as well yesterday the addition of a note
> > in the docs about something among these lines, so fine by me.
>
> And applied that, after tweaking a few tiny things on a last lookup
> with a catversion bump.  Note that the example has been moved at the
> bottom of the table for these functions, which is more consistent with
> the surroundings.
>
>
Hi!

Caught this thread late. To me, pg_dissect_walfile_name() is a really
strange name for a function. Grepping our I code I see the term dissect s
used somewhere inside the regex code and exactly zero instances elsewhere.
Which is why I definitely didn't recognize the term...

Wouldn't something like pg_split_walfile_name() be a lot more consistent
with the rest of our names?

//Magnus


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-19 Thread Michael Paquier
On Tue, Dec 20, 2022 at 09:01:02AM +0900, Michael Paquier wrote:
> Yeah, my mind was considering as well yesterday the addition of a note
> in the docs about something among these lines, so fine by me.

And applied that, after tweaking a few tiny things on a last lookup
with a catversion bump.  Note that the example has been moved at the
bottom of the table for these functions, which is more consistent with
the surroundings. 
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-19 Thread Michael Paquier
On Mon, Dec 19, 2022 at 05:51:19PM +0530, Bharath Rupireddy wrote:
> A nitpick - can we also specify a use case for the function
> pg_dissect_walfile_name(), that is, computing LSN from offset and WAL
> file name, something like [1]?

Yeah, my mind was considering as well yesterday the addition of a note
in the docs about something among these lines, so fine by me.
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-19 Thread Bharath Rupireddy
On Mon, Dec 19, 2022 at 5:22 PM Bharath Rupireddy
 wrote:
>
> On Mon, Dec 19, 2022 at 1:37 PM Michael Paquier  wrote:
> >
> > On Tue, Dec 13, 2022 at 09:32:19PM +0530, Bharath Rupireddy wrote:
> > > Okay, here's the v5 patch that I could come up with. It basically adds
> > > functions for dissecting WAL file names and computing offset from lsn.
> > > Thoughts?
> >
> > I had a second look at that, and I still have mixed feelings about the
> > addition of the SQL function, no real objection about
> > pg_dissect_walfile_name().
> >
> > I don't really think that we need a specific handling with a new
> > macro from xlog_internal.h that does its own parsing of the segment
> > number while XLogFromFileName() can do that based on the user input,
> > so I have simplified that.
> >
> > A second thing is the TLI that had better be returned as int8 and not
> > int4 so as we don't have a negative number for a TLI higher than 2B in
> > a WAL segment name.
>
> Thanks. The v6 patch LGTM.

A nitpick - can we also specify a use case for the function
pg_dissect_walfile_name(), that is, computing LSN from offset and WAL
file name, something like [1]?

[1]
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2f05b06f14..c36fcb83c8 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26110,7 +26110,17 @@ LOG:  Grand total: 1651920 bytes in 201
blocks; 622360 free (88 chunks); 1029560


 Extract the file sequence number and timeline ID from a WAL file
-name.
+name. This function is useful to compute LSN from a given offset
+and WAL file name, for example:
+
+postgres=# \set file_name '0001000100C000AB'
+postgres=# \set offset 256
+postgres=# SELECT '0/0'::pg_lsn + pd.segno * ps.setting::int +
:offset AS lsn FROM pg_dissect_walfile_name(:'file_name') pd,
pg_show_all_settings() ps WHERE ps.name = 'wal_segment_size';
+  lsn
+---
+ C001/AB000100
+(1 row)
+

   

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-19 Thread Bharath Rupireddy
On Mon, Dec 19, 2022 at 1:37 PM Michael Paquier  wrote:
>
> On Tue, Dec 13, 2022 at 09:32:19PM +0530, Bharath Rupireddy wrote:
> > Okay, here's the v5 patch that I could come up with. It basically adds
> > functions for dissecting WAL file names and computing offset from lsn.
> > Thoughts?
>
> I had a second look at that, and I still have mixed feelings about the
> addition of the SQL function, no real objection about
> pg_dissect_walfile_name().
>
> I don't really think that we need a specific handling with a new
> macro from xlog_internal.h that does its own parsing of the segment
> number while XLogFromFileName() can do that based on the user input,
> so I have simplified that.
>
> A second thing is the TLI that had better be returned as int8 and not
> int4 so as we don't have a negative number for a TLI higher than 2B in
> a WAL segment name.

Thanks. The v6 patch LGTM.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-19 Thread Michael Paquier
On Tue, Dec 13, 2022 at 09:32:19PM +0530, Bharath Rupireddy wrote:
> Okay, here's the v5 patch that I could come up with. It basically adds
> functions for dissecting WAL file names and computing offset from lsn.
> Thoughts?

I had a second look at that, and I still have mixed feelings about the
addition of the SQL function, no real objection about
pg_dissect_walfile_name().

I don't really think that we need a specific handling with a new
macro from xlog_internal.h that does its own parsing of the segment
number while XLogFromFileName() can do that based on the user input,
so I have simplified that.

A second thing is the TLI that had better be returned as int8 and not
int4 so as we don't have a negative number for a TLI higher than 2B in
a WAL segment name.
--
Michael
From 25fb0848cf3220a5fc9638c2cfd116e5390be887 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Mon, 19 Dec 2022 17:02:16 +0900
Subject: [PATCH v6] Add utility functions to disset WAL file name and compute
 offset

---
 src/include/catalog/pg_proc.dat  |  7 +++
 src/backend/access/transam/xlogfuncs.c   | 53 
 src/test/regress/expected/misc_functions.out | 18 +++
 src/test/regress/sql/misc_functions.sql  |  8 +++
 doc/src/sgml/func.sgml   | 16 ++
 5 files changed, 102 insertions(+)

diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 719599649a..79cfd1378d 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6365,6 +6365,13 @@
 { oid => '2851', descr => 'wal filename, given a wal location',
   proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn',
   prosrc => 'pg_walfile_name' },
+{ oid => '8205',
+  descr => 'sequence number and timeline ID given a wal filename',
+  proname => 'pg_dissect_walfile_name', prorettype => 'record',
+  proargtypes => 'text', proallargtypes => '{text,numeric,int8}',
+  provolatile => 's', proargmodes => '{i,o,o}',
+  proargnames => '{file_name,segno,timeline_id}',
+  prosrc => 'pg_dissect_walfile_name' },
 
 { oid => '3165', descr => 'difference in bytes, given two wal locations',
   proname => 'pg_wal_lsn_diff', prorettype => 'numeric',
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 487d5d9cac..0a31837ef1 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -432,6 +432,59 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
 
+/*
+ * Extract the sequence number and the timeline ID from given a WAL file
+ * name.
+ */
+Datum
+pg_dissect_walfile_name(PG_FUNCTION_ARGS)
+{
+#define PG_DISSECT_WALFILE_NAME_COLS 2
+	char	   *fname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	char	   *fname_upper;
+	char	   *p;
+	TimeLineID	tli;
+	XLogSegNo	segno;
+	Datum		values[PG_DISSECT_WALFILE_NAME_COLS] = {0};
+	bool		isnull[PG_DISSECT_WALFILE_NAME_COLS] = {0};
+	TupleDesc	tupdesc;
+	HeapTuple	tuple;
+	char		buf[256];
+	Datum		result;
+
+	fname_upper = pstrdup(fname);
+
+	/* Capitalize WAL file name. */
+	for (p = fname_upper; *p; p++)
+		*p = pg_toupper((unsigned char) *p);
+
+	if (!IsXLogFileName(fname_upper))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL file name \"%s\"", fname)));
+
+	XLogFromFileName(fname_upper, , , wal_segment_size);
+
+	if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	/* Convert to numeric. */
+	snprintf(buf, sizeof buf, UINT64_FORMAT, segno);
+	values[0] = DirectFunctionCall3(numeric_in,
+	CStringGetDatum(buf),
+	ObjectIdGetDatum(0),
+	Int32GetDatum(-1));
+
+	values[1] = Int64GetDatum(tli);
+
+	tuple = heap_form_tuple(tupdesc, values, isnull);
+	result = HeapTupleGetDatum(tuple);
+
+	PG_RETURN_DATUM(result);
+
+#undef PG_DISSECT_WALFILE_NAME_COLS
+}
+
 /*
  * pg_wal_replay_pause - Request to pause recovery
  *
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 88bb696ded..fb676e9971 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -619,3 +619,21 @@ SELECT count(*) > 0 AS ok FROM pg_control_system();
  t
 (1 row)
 
+-- pg_dissect_walfile_name
+SELECT segno > 0 AS ok_segno, timeline_id
+  FROM pg_dissect_walfile_name('invalid');
+ERROR:  invalid WAL file name "invalid"
+SELECT segno > 0 AS ok_segno, timeline_id
+  FROM pg_dissect_walfile_name('00010001');
+ ok_segno | timeline_id 
+--+-
+ t|   1
+(1 row)
+
+SELECT segno > 0 AS ok_segno, timeline_id
+  FROM pg_dissect_walfile_name('000100af');
+ ok_segno | timeline_id 
+--+-
+ t|  4294967295
+(1 row)
+
diff --git a/src/test/regress/sql/misc_functions.sql b/src/test/regress/sql/misc_functions.sql
index 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-18 Thread Michael Paquier
On Tue, Dec 06, 2022 at 04:27:50PM +0900, Kyotaro Horiguchi wrote:
> At Mon, 5 Dec 2022 10:03:55 +0900, Michael Paquier  
> wrote in 
>> Hence I would tend to let XLogFromFileName do the job, while having a
>> SQL function that is just a thin wrapper around it that returns the
>> segment TLI and its number, leaving the offset out of the equation as
>> well as this new XLogIdFromFileName().
> 
> I don't think it could be problematic that the SQL-callable function
> returns a bogus result for a wrong WAL filename in the correct
> format. Specifically, I think that the function may return (0/0,0) for
> "" since that behavior is completely
> harmless. If we don't check logid, XLogFromFileName fits instead.

Yeah, I really don't think that it is a big deal either:
XLogIdFromFileName() just translates what it receives from the
caller for the TLI and the segment number.

> (If we assume that the file names are typed in letter-by-letter, I
>  rather prefer to allow lower-case letters:p)

Yep, makes sense to enforce a compatible WAL segment name if we can.
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-13 Thread Bharath Rupireddy
On Tue, Dec 6, 2022 at 4:46 PM Bharath Rupireddy
 wrote:
>
> That said, I think we can have a single function
> pg_dissect_walfile_name(wal_file_name, offset optional) returning
> segno (XLogSegNo - physical log file sequence number), tli, lsn (if
> offset is given). This way there is no need for another SQL-callable
> function using pg_settings. Thoughts?
>
> > (If we assume that the file names are typed in letter-by-letter, I
> >  rather prefer to allow lower-case letters:p)
>
> It's easily doable if we convert the entered WAL file name to
> uppercase using pg_toupper() and then pass it to IsXLogFileName().

Okay, here's the v5 patch that I could come up with. It basically adds
functions for dissecting WAL file names and computing offset from lsn.
Thoughts?

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From e5f102911dd7670c96d9826b2e5dd377235f1717 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 13 Dec 2022 12:05:11 +
Subject: [PATCH v5] Add utility functions to disset WAL file name and compute
 offset

---
 doc/src/sgml/func.sgml   | 33 +++
 src/backend/access/transam/xlogfuncs.c   | 62 
 src/backend/catalog/system_functions.sql | 10 
 src/include/access/xlog_internal.h   |  8 +++
 src/include/catalog/pg_proc.dat  | 12 
 src/test/regress/expected/misc_functions.out | 53 +
 src/test/regress/sql/misc_functions.sql  | 17 ++
 7 files changed, 195 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index ad31fdb737..894f36bcb6 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -26099,6 +26099,39 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560

   
 
+  
+   
+
+ pg_dissect_walfile_name
+
+pg_dissect_walfile_name ( file_name text )
+record
+( segno numeric,
+timeline_id integer )
+   
+   
+Computes physical write-ahead log (WAL) file sequence number and
+timeline ID from a given WAL file name.
+   
+  
+
+  
+   
+
+ pg_walfile_offset_lsn
+
+pg_walfile_offset_lsn ( file_name text, file_offset integer )
+pg_lsn
+   
+   
+Computes write-ahead log (WAL) location from a given WAL file name
+and byte offset within that file. When
+file_offset is negative or greater than
+equal to wal_segment_size, this function returns
+NULL.
+   
+  
+
   

 
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 487d5d9cac..32cf431aca 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -432,6 +432,68 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
 
+/*
+ * Compute segno (physical WAL file sequence number) and timeline ID given a
+ * WAL file name.
+ */
+Datum
+pg_dissect_walfile_name(PG_FUNCTION_ARGS)
+{
+#define PG_DISSECT_WALFILE_NAME_COLS 2
+	char	   *fname = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	char	   *fname_upper;
+	char	   *p;
+	TimeLineID	tli;
+	XLogSegNo	segno;
+	uint32		log;
+	uint32		seg;
+	Datum		values[PG_DISSECT_WALFILE_NAME_COLS] = {0};
+	bool		isnull[PG_DISSECT_WALFILE_NAME_COLS] = {0};
+	TupleDesc	tupdesc;
+	HeapTuple	tuple;
+	char		buf[256];
+	Datum		result;
+
+	fname_upper = pstrdup(fname);
+
+	/* Capitalize WAL file name. */
+	for (p = fname_upper; *p; p++)
+		*p = pg_toupper((unsigned char) *p);
+
+	if (!IsXLogFileName(fname_upper))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL file name \"%s\"", fname)));
+
+	XLogIdFromFileName(fname_upper, , , , , wal_segment_size);
+
+	if (seg >= XLogSegmentsPerXLogId(wal_segment_size) ||
+		segno == 0 ||
+		tli == 0)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL file name \"%s\"", fname)));
+
+	if (get_call_result_type(fcinfo, NULL, ) != TYPEFUNC_COMPOSITE)
+		elog(ERROR, "return type must be a row type");
+
+	/* Convert to numeric. */
+	snprintf(buf, sizeof buf, UINT64_FORMAT, segno);
+	values[0] = DirectFunctionCall3(numeric_in,
+	CStringGetDatum(buf),
+	ObjectIdGetDatum(0),
+	Int32GetDatum(-1));
+
+	values[1] = UInt32GetDatum(tli);
+
+	tuple = heap_form_tuple(tupdesc, values, isnull);
+	result = HeapTupleGetDatum(tuple);
+
+	PG_RETURN_DATUM(result);
+
+#undef PG_DISSECT_WALFILE_NAME_COLS
+}
+
 /*
  * pg_wal_replay_pause - Request to pause recovery
  *
diff --git a/src/backend/catalog/system_functions.sql b/src/backend/catalog/system_functions.sql
index 52517a6531..5dde5007ce 100644
--- a/src/backend/catalog/system_functions.sql
+++ b/src/backend/catalog/system_functions.sql
@@ -397,6 +397,16 @@ CREATE OR REPLACE 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-06 Thread Bharath Rupireddy
On Tue, Dec 6, 2022 at 12:57 PM Kyotaro Horiguchi
 wrote:
>
> At Mon, 5 Dec 2022 13:13:23 +0900, Michael Paquier  
> wrote in
> > On Mon, Dec 05, 2022 at 08:48:25AM +0530, Bharath Rupireddy wrote:
> > > So, a SQL function pg_dissect_walfile_name (or some other better name)
> > > given a WAL file name returns the tli and seg number. Then the
> > > pg_walfile_offset_lsn can just be a SQL-defined function (in
> > > system_functions.sql) using this new function and pg_settings. If this
> > > understanding is correct, it looks good to me at this point.
> >
> > I would do without the SQL function that looks at pg_settings, FWIW.
>
> If that function may be called at a high frequency, SQL-defined one is
> not suitable, but I don't think this function is used that way.  With
> that premise, I would prefer SQL-defined as it is far simpler on its
> face.
>
> At Mon, 5 Dec 2022 10:03:55 +0900, Michael Paquier  
> wrote in
> > Hence I would tend to let XLogFromFileName do the job, while having a
> > SQL function that is just a thin wrapper around it that returns the
> > segment TLI and its number, leaving the offset out of the equation as
> > well as this new XLogIdFromFileName().
>
> I don't think it could be problematic that the SQL-callable function
> returns a bogus result for a wrong WAL filename in the correct
> format. Specifically, I think that the function may return (0/0,0) for
> "" since that behavior is completely
> harmless. If we don't check logid, XLogFromFileName fits instead.

If we were to provide correctness and input invalidations
(specifically, the validations around 'seg', see below) of the WAL
file name typed in by the user, the function pg_walfile_offset_lsn()
wins the race.

+XLogIdFromFileName(fname, , , , , wal_segment_size);
+
+if (seg >= XLogSegmentsPerXLogId(wal_segment_size) ||
+(log == 0 && seg == 0) ||
+segno == 0 ||
+tli == 0)
+ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL file name \"%s\"", fname)));

+SELECT * FROM pg_walfile_offset_lsn('0001', 15);
+ERROR:  invalid WAL file name "0001"

That said, I think we can have a single function
pg_dissect_walfile_name(wal_file_name, offset optional) returning
segno (XLogSegNo - physical log file sequence number), tli, lsn (if
offset is given). This way there is no need for another SQL-callable
function using pg_settings. Thoughts?

> (If we assume that the file names are typed in letter-by-letter, I
>  rather prefer to allow lower-case letters:p)

It's easily doable if we convert the entered WAL file name to
uppercase using pg_toupper() and then pass it to IsXLogFileName().

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-05 Thread Kyotaro Horiguchi
At Mon, 5 Dec 2022 13:13:23 +0900, Michael Paquier  wrote 
in 
> On Mon, Dec 05, 2022 at 08:48:25AM +0530, Bharath Rupireddy wrote:
> > So, a SQL function pg_dissect_walfile_name (or some other better name)
> > given a WAL file name returns the tli and seg number. Then the
> > pg_walfile_offset_lsn can just be a SQL-defined function (in
> > system_functions.sql) using this new function and pg_settings. If this
> > understanding is correct, it looks good to me at this point.
> 
> I would do without the SQL function that looks at pg_settings, FWIW.

If that function may be called at a high frequency, SQL-defined one is
not suitable, but I don't think this function is used that way.  With
that premise, I would prefer SQL-defined as it is far simpler on its
face.

At Mon, 5 Dec 2022 10:03:55 +0900, Michael Paquier  wrote 
in 
> Hence I would tend to let XLogFromFileName do the job, while having a
> SQL function that is just a thin wrapper around it that returns the
> segment TLI and its number, leaving the offset out of the equation as
> well as this new XLogIdFromFileName().

I don't think it could be problematic that the SQL-callable function
returns a bogus result for a wrong WAL filename in the correct
format. Specifically, I think that the function may return (0/0,0) for
"" since that behavior is completely
harmless. If we don't check logid, XLogFromFileName fits instead.

(If we assume that the file names are typed in letter-by-letter, I
 rather prefer to allow lower-case letters:p)

> > That said, let's also hear from others.
> 
> Sure.  Perhaps my set of suggestions will not get the majority,
> though..


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-04 Thread Michael Paquier
On Mon, Dec 05, 2022 at 08:48:25AM +0530, Bharath Rupireddy wrote:
> So, a SQL function pg_dissect_walfile_name (or some other better name)
> given a WAL file name returns the tli and seg number. Then the
> pg_walfile_offset_lsn can just be a SQL-defined function (in
> system_functions.sql) using this new function and pg_settings. If this
> understanding is correct, it looks good to me at this point.

I would do without the SQL function that looks at pg_settings, FWIW.

> That said, let's also hear from others.

Sure.  Perhaps my set of suggestions will not get the majority,
though..
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-04 Thread Bharath Rupireddy
On Mon, Dec 5, 2022 at 6:34 AM Michael Paquier  wrote:
>
> Regarding pg_walfile_offset_lsn(), I am not sure that this is the best
> move we can do as it is possible to compile a LSN from 0/0 with just a
> segment number, say:
> select '0/0'::pg_lsn + :segno * setting::int + :offset
>   from pg_settings where name = 'wal_segment_size';

Nice.

> +   resultTupleDesc = CreateTemplateTupleDesc(2);
> +   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 1, "lsn",
> +  PG_LSNOID, -1, 0);
> +   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "timeline_id",
> +  INT4OID, -1, 0);
> Let's use get_call_result_type() to get the TupleDesc and to avoid a
> duplication between pg_proc.dat and this code.
>
> Hence I would tend to let XLogFromFileName do the job, while having a
> SQL function that is just a thin wrapper around it that returns the
> segment TLI and its number, leaving the offset out of the equation as
> well as this new XLogIdFromFileName().

So, a SQL function pg_dissect_walfile_name (or some other better name)
given a WAL file name returns the tli and seg number. Then the
pg_walfile_offset_lsn can just be a SQL-defined function (in
system_functions.sql) using this new function and pg_settings. If this
understanding is correct, it looks good to me at this point.

That said, let's also hear from others.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-04 Thread Michael Paquier
On Sat, Dec 03, 2022 at 09:07:38AM +0530, Bharath Rupireddy wrote:
> Yes, I removed those changes. Even if someone sees an offset of a
> record within a WAL file elsewhere, they have the new utility function
> (0002) pg_walfile_offset_lsn().
> 
> I'm attaching the v4 patch set for further review.

+ * Compute an LSN and timline ID given a WAL file name and decimal byte offset.
s/timline/timeline/, exactly two times

+   Datum   values[2] = {0};
+   boolisnull[2] = {0};
I would not hardcode the number of attributes of the record returned.

Regarding pg_walfile_offset_lsn(), I am not sure that this is the best
move we can do as it is possible to compile a LSN from 0/0 with just a
segment number, say:
select '0/0'::pg_lsn + :segno * setting::int + :offset
  from pg_settings where name = 'wal_segment_size';

+   resultTupleDesc = CreateTemplateTupleDesc(2);
+   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 1, "lsn",
+  PG_LSNOID, -1, 0);
+   TupleDescInitEntry(resultTupleDesc, (AttrNumber) 2, "timeline_id",
+  INT4OID, -1, 0);
Let's use get_call_result_type() to get the TupleDesc and to avoid a
duplication between pg_proc.dat and this code.

Hence I would tend to let XLogFromFileName do the job, while having a
SQL function that is just a thin wrapper around it that returns the
segment TLI and its number, leaving the offset out of the equation as
well as this new XLogIdFromFileName().
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-02 Thread Bharath Rupireddy
On Fri, Dec 2, 2022 at 12:50 PM Michael Paquier  wrote:
>
> On Thu, Nov 17, 2022 at 11:53:23AM +0530, Bharath Rupireddy wrote:
> > Please do, if you feel so. Thanks for your review.
>
> I don't really mind the addition of the LSN when operating on a given
> record where we are reading a location, like in the five error paths
> for the header validation or the three ones in ReadRecord()

Thanks for reviewing.

> Now this one looks confusing:
> +   XLogSegNoOffsetToRecPtr(openLogSegNo, startoffset,
> +   wal_segment_size, lsn);
> ereport(PANIC,
> (errcode_for_file_access(),
>  errmsg("could not write to log file %s "
> -   "at offset %u, length %zu: %m",
> -   xlogfname, startoffset, nleft)));
> +   "at offset %u, LSN %X/%X, length %zu: %m",
> +   xlogfname, startoffset,
> +   LSN_FORMAT_ARGS(lsn), nleft)));
>
> This does not always refer to an exact LSN of a record as we may be in
> the middle of a write, so I would leave it as-is.
>
> Similarly the addition of wre_lsn would be confusing?  The offset
> looks kind of enough to me when referring to the middle of a page in
> WALReadRaiseError().

Yes, I removed those changes. Even if someone sees an offset of a
record within a WAL file elsewhere, they have the new utility function
(0002) pg_walfile_offset_lsn().

I'm attaching the v4 patch set for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 452f6adc4617230f8843a9339331170cb6f85723 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Sat, 3 Dec 2022 03:13:57 +
Subject: [PATCH v4] Emit record LSN in WAL read and validate error messages

The error messages reported during any failures in WAL record read
or validation currently contains offset within the WAL file which
is a bit hard to decode it to WAL LSN while debugging. Reporting
the WAL record LSN at which the WAL read or validation failure
occurrs saves time and makes life easier here.

Author: Bharath Rupireddy
Reviewed-by: Maxim Orlov, Michael Paquier
Reviewed-by: Kyotaro Horiguchi
Discussion: https://www.postgresql.org/message-id/CALj2ACWV=FCddsxcGbVOA=cvPyMr75YCFbSQT6g4KDj=gcJK4g@mail.gmail.com
---
 src/backend/access/transam/xlogreader.c   | 15 ++-
 src/backend/access/transam/xlogrecovery.c | 13 -
 2 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b254..a38a80e049 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1226,9 +1226,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid magic number %04X in WAL segment %s, offset %u",
+			  "invalid magic number %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_magic,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1240,9 +1241,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1281,9 +1283,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 
 		/* hmm, first page of file doesn't have a long header? */
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1300,9 +1303,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "unexpected pageaddr %X/%X in WAL segment %s, offset %u",
+			  "unexpected pageaddr %X/%X in WAL segment %s, LSN %X/%X, offset %u",
 			  LSN_FORMAT_ARGS(hdr->xlp_pageaddr),
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1325,10 +1329,11 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 			XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 			report_invalid_record(state,
-  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",
+  "out-of-sequence timeline ID %u (after %u) in WAL segment 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-12-01 Thread Michael Paquier
On Thu, Nov 17, 2022 at 11:53:23AM +0530, Bharath Rupireddy wrote:
> Please do, if you feel so. Thanks for your review.

I don't really mind the addition of the LSN when operating on a given
record where we are reading a location, like in the five error paths
for the header validation or the three ones in ReadRecord()

Now this one looks confusing:
+   XLogSegNoOffsetToRecPtr(openLogSegNo, startoffset,
+   wal_segment_size, lsn);
ereport(PANIC,
(errcode_for_file_access(),
 errmsg("could not write to log file %s "
-   "at offset %u, length %zu: %m",
-   xlogfname, startoffset, nleft)));
+   "at offset %u, LSN %X/%X, length %zu: %m",
+   xlogfname, startoffset,
+   LSN_FORMAT_ARGS(lsn), nleft)));

This does not always refer to an exact LSN of a record as we may be in
the middle of a write, so I would leave it as-is.

Similarly the addition of wre_lsn would be confusing?  The offset
looks kind of enough to me when referring to the middle of a page in
WALReadRaiseError().
--
Michael


signature.asc
Description: PGP signature


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-11-16 Thread Bharath Rupireddy
On Wed, Nov 16, 2022 at 6:55 PM Maxim Orlov  wrote:
>
>> These functions are marked as 'STRICT', meaning a null is returned,
>> without even calling the function, if any of the input parameters is
>> null. I think we can keep the behaviour the same as its friends.
>
> Thanks for the explanations. I think you are right.
>
> Confirm. And a timeline_id is added.
>>
>> While on this, I noticed that the pg_walfile_name_offset() isn't
>> covered in tests. I took an opportunity and added a simple test case
>> along with pg_walfile_offset_lsn().
>>
>> I'm attaching the v3 patch set for further review.
>
> Great job! We should mark this patch as RFC, shall we?

Please do, if you feel so. Thanks for your review.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-11-16 Thread Maxim Orlov
On Tue, 15 Nov 2022 at 13:02, Bharath Rupireddy <
bharath.rupireddyforpostg...@gmail.com> wrote:

> On Fri, Nov 11, 2022 at 5:52 PM Maxim Orlov  wrote:
>
> > But, in my view, some improvements may be proposed. We should be more,
> let's say, cautious (or a paranoid if you wish),
> > in pg_walfile_offset_lsn while dealing with user input values. What I
> mean by that is:
> >  - to init input vars of sscanf, i.e. tli, log andseg;
> >  - check the return value of sscanf call;
> >  - check filename max length.
>
> IsXLogFileName() will take care of this. Also, I've added a new inline
> function XLogIdFromFileName() that parses file name and returns log
> and seg along with XLogSegNo and timeline id. This new function avoids
> an extra sscanf() call as well.
>
> > Another question arises for me: is this function can be called during
> recovery? If not, we should ereport about this, should we?
>
> It's just a utility function and doesn't depend on any of the server's
> current state (unlike pg_walfile_name()), hence it doesn't matter if
> this function is called during recovery.
>
> > And one last note here: pg_walfile_offset_lsn is accepting NULL values
> and return NULL in this case. From a theoretical
> > point of view, this is perfectly fine. Actually, I think this is exactly
> how it supposed to be, but I'm not sure if there are no other opinions here.
>
> These functions are marked as 'STRICT', meaning a null is returned,
> without even calling the function, if any of the input parameters is
> null. I think we can keep the behaviour the same as its friends.
>

Thanks for the explanations. I think you are right.


> >errmsg("offset must not be negative or greater than or equal to the
> WAL segment size")));
>
> Changed.
>

Confirm. And a timeline_id is added.


> While on this, I noticed that the pg_walfile_name_offset() isn't
> covered in tests. I took an opportunity and added a simple test case
> along with pg_walfile_offset_lsn().
>
> I'm attaching the v3 patch set for further review.
>

Great job! We should mark this patch as RFC, shall we?

-- 
Best regards,
Maxim Orlov.


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-11-15 Thread Bharath Rupireddy
On Fri, Nov 11, 2022 at 5:52 PM Maxim Orlov  wrote:
>
> Hi!
>
> I've watched over the patch and consider it useful. Applies without 
> conflicts. The functionality of the patch itself is
> meets declared functionality.

Thanks for reviewing.

> But, in my view, some improvements may be proposed. We should be more, let's 
> say, cautious (or a paranoid if you wish),
> in pg_walfile_offset_lsn while dealing with user input values. What I mean by 
> that is:
>  - to init input vars of sscanf, i.e. tli, log andseg;
>  - check the return value of sscanf call;
>  - check filename max length.

IsXLogFileName() will take care of this. Also, I've added a new inline
function XLogIdFromFileName() that parses file name and returns log
and seg along with XLogSegNo and timeline id. This new function avoids
an extra sscanf() call as well.

> Another question arises for me: is this function can be called during 
> recovery? If not, we should ereport about this, should we?

It's just a utility function and doesn't depend on any of the server's
current state (unlike pg_walfile_name()), hence it doesn't matter if
this function is called during recovery.

> And one last note here: pg_walfile_offset_lsn is accepting NULL values and 
> return NULL in this case. From a theoretical
> point of view, this is perfectly fine. Actually, I think this is exactly how 
> it supposed to be, but I'm not sure if there are no other opinions here.

These functions are marked as 'STRICT', meaning a null is returned,
without even calling the function, if any of the input parameters is
null. I think we can keep the behaviour the same as its friends.

On Fri, Nov 11, 2022 at 11:05 PM Alvaro Herrera  wrote:
>

Thanks for reviewing.

> Hmm, in 0002, why not return the timeline from the LSN too?  It seems a
> waste not to have it.

Yeah, that actually makes sense. We might be tempted to return segno
too, but it's not something that we emit to the user elsewhere,
whereas we emit timeline id.

> +   ereport(ERROR,
> +   (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
> +errmsg("\"offset\" must not be negative or 
> greater than or "
> +   "equal to WAL segment 
> size")));
>
> I don't think the word offset should be in quotes; and please don't cut
> the line.  So I propose
>
>errmsg("offset must not be negative or greater than or equal to the WAL 
> segment size")));

Changed.

While on this, I noticed that the pg_walfile_name_offset() isn't
covered in tests. I took an opportunity and added a simple test case
along with pg_walfile_offset_lsn().

I'm attaching the v3 patch set for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 1b83f37b789ecf1f6ad1fa3dd71673c369584b6d Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 15 Nov 2022 06:04:57 +
Subject: [PATCH v3] Add LSN along with offset to error messages reported for
 WAL file read/write/validate header failures

Author: Bharath Rupireddy
---
 src/backend/access/transam/xlog.c |  8 ++--
 src/backend/access/transam/xlogreader.c   | 16 +++-
 src/backend/access/transam/xlogrecovery.c | 13 -
 src/backend/access/transam/xlogutils.c| 10 ++
 src/include/access/xlogreader.h   |  1 +
 5 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index a31fbbff78..42ee51f380 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2214,6 +2214,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 {
 	char		xlogfname[MAXFNAMELEN];
 	int			save_errno;
+	XLogRecPtr	lsn;
 
 	if (errno == EINTR)
 		continue;
@@ -,11 +2223,14 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 	XLogFileName(xlogfname, tli, openLogSegNo,
  wal_segment_size);
 	errno = save_errno;
+	XLogSegNoOffsetToRecPtr(openLogSegNo, startoffset,
+			wal_segment_size, lsn);
 	ereport(PANIC,
 			(errcode_for_file_access(),
 			 errmsg("could not write to log file %s "
-	"at offset %u, length %zu: %m",
-	xlogfname, startoffset, nleft)));
+	"at offset %u, LSN %X/%X, length %zu: %m",
+	xlogfname, startoffset,
+	LSN_FORMAT_ARGS(lsn), nleft)));
 }
 nleft -= written;
 from += written;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b254..a321c55d8f 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1226,9 +1226,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-11-11 Thread Alvaro Herrera
Hmm, in 0002, why not return the timeline from the LSN too?  It seems a
waste not to have it.

+   ereport(ERROR,
+   (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+errmsg("\"offset\" must not be negative or 
greater than or "
+   "equal to WAL segment size")));

I don't think the word offset should be in quotes; and please don't cut
the line.  So I propose

   errmsg("offset must not be negative or greater than or equal to the WAL 
segment size")));

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-11-11 Thread Maxim Orlov
Hi!

I've watched over the patch and consider it useful. Applies without
conflicts. The functionality of the patch itself is
meets declared functionality.

But, in my view, some improvements may be proposed. We should be more,
let's say, cautious (or a paranoid if you wish),
in pg_walfile_offset_lsn while dealing with user input values. What I mean
by that is:
 - to init input vars of sscanf, i.e. tli, log andseg;
 - check the return value of sscanf call;
 - check filename max length.

Another question arises for me: is this function can be called during
recovery? If not, we should ereport about this, should we?

And one last note here: pg_walfile_offset_lsn is accepting NULL values and
return NULL in this case. From a theoretical
point of view, this is perfectly fine. Actually, I think this is exactly
how it supposed to be, but I'm not sure if there are no other opinions here.
-- 
Best regards,
Maxim Orlov.


Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-10-27 Thread Bharath Rupireddy
On Tue, Oct 4, 2022 at 2:58 PM Bharath Rupireddy
 wrote:
>
> On Thu, Sep 29, 2022 at 7:43 PM Bharath Rupireddy
>  wrote:
> >
> > Please see the attached v1 patch.
>
> FWIW, I'm attaching Nathan's patch that introduced the new function
> pg_walfile_offset_lsn as 0002 in the v1 patch set. Here's the CF entry
> - https://commitfest.postgresql.org/40/3909/.
>
> Please consider this for further review.

I'm attaching the v2 patch set after rebasing on to the latest HEAD.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From 87da2d0ae8fb0f92d2d88d8638aee1ca58332522 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 27 Oct 2022 09:06:27 +
Subject: [PATCH v2] Add LSN along with offset to error messages reported for
 WAL file read/write/validate header failures

---
 src/backend/access/transam/xlog.c |  8 ++--
 src/backend/access/transam/xlogreader.c   | 16 +++-
 src/backend/access/transam/xlogrecovery.c | 13 -
 src/backend/access/transam/xlogutils.c| 10 ++
 src/include/access/xlogreader.h   |  1 +
 5 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8f10effe3a..1467001b4f 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2214,6 +2214,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 {
 	char		xlogfname[MAXFNAMELEN];
 	int			save_errno;
+	XLogRecPtr	lsn;
 
 	if (errno == EINTR)
 		continue;
@@ -,11 +2223,14 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 	XLogFileName(xlogfname, tli, openLogSegNo,
  wal_segment_size);
 	errno = save_errno;
+	XLogSegNoOffsetToRecPtr(openLogSegNo, startoffset,
+			wal_segment_size, lsn);
 	ereport(PANIC,
 			(errcode_for_file_access(),
 			 errmsg("could not write to log file %s "
-	"at offset %u, length %zu: %m",
-	xlogfname, startoffset, nleft)));
+	"at offset %u, LSN %X/%X, length %zu: %m",
+	xlogfname, startoffset,
+	LSN_FORMAT_ARGS(lsn), nleft)));
 }
 nleft -= written;
 from += written;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 93f667b254..a321c55d8f 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1226,9 +1226,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid magic number %04X in WAL segment %s, offset %u",
+			  "invalid magic number %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_magic,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1240,9 +1241,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1281,9 +1283,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 
 		/* hmm, first page of file doesn't have a long header? */
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1300,9 +1303,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "unexpected pageaddr %X/%X in WAL segment %s, offset %u",
+			  "unexpected pageaddr %X/%X in WAL segment %s, LSN %X/%X, offset %u",
 			  LSN_FORMAT_ARGS(hdr->xlp_pageaddr),
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1325,10 +1329,11 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 			XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 			report_invalid_record(state,
-  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",
+  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, LSN %X/%X, offset %u",
   hdr->xlp_tli,
   state->latestPageTLI,
   fname,
+  LSN_FORMAT_ARGS(recptr),
   offset);
 			return false;
 		}
@@ -1553,6 +1558,7 @@ WALRead(XLogReaderState *state,
 			errinfo->wre_req = 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-10-04 Thread Bharath Rupireddy
On Thu, Sep 29, 2022 at 7:43 PM Bharath Rupireddy
 wrote:
>
> Please see the attached v1 patch.

FWIW, I'm attaching Nathan's patch that introduced the new function
pg_walfile_offset_lsn as 0002 in the v1 patch set. Here's the CF entry
- https://commitfest.postgresql.org/40/3909/.

Please consider this for further review.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From b19aa25a0d1f2ce85abe0c2081c0e7ede256e329 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 29 Sep 2022 13:29:44 +
Subject: [PATCH v1] Add LSN along with offset to error messages reported for
 WAL file read/write/validate header failures

---
 src/backend/access/transam/xlog.c |  8 ++--
 src/backend/access/transam/xlogreader.c   | 16 +++-
 src/backend/access/transam/xlogrecovery.c | 13 -
 src/backend/access/transam/xlogutils.c| 10 ++
 src/include/access/xlogreader.h   |  1 +
 5 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8e15256db8..a495bbac85 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2218,6 +2218,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 {
 	char		xlogfname[MAXFNAMELEN];
 	int			save_errno;
+	XLogRecPtr	lsn;
 
 	if (errno == EINTR)
 		continue;
@@ -2226,11 +2227,14 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 	XLogFileName(xlogfname, tli, openLogSegNo,
  wal_segment_size);
 	errno = save_errno;
+	XLogSegNoOffsetToRecPtr(openLogSegNo, startoffset,
+			wal_segment_size, lsn);
 	ereport(PANIC,
 			(errcode_for_file_access(),
 			 errmsg("could not write to log file %s "
-	"at offset %u, length %zu: %m",
-	xlogfname, startoffset, nleft)));
+	"at offset %u, LSN %X/%X, length %zu: %m",
+	xlogfname, startoffset,
+	LSN_FORMAT_ARGS(lsn), nleft)));
 }
 nleft -= written;
 from += written;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 5a8fe81f82..c3befc44ba 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1229,9 +1229,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid magic number %04X in WAL segment %s, offset %u",
+			  "invalid magic number %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_magic,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1243,9 +1244,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1284,9 +1286,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 
 		/* hmm, first page of file doesn't have a long header? */
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1303,9 +1306,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "unexpected pageaddr %X/%X in WAL segment %s, offset %u",
+			  "unexpected pageaddr %X/%X in WAL segment %s, LSN %X/%X, offset %u",
 			  LSN_FORMAT_ARGS(hdr->xlp_pageaddr),
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1328,10 +1332,11 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 			XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 			report_invalid_record(state,
-  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",
+  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, LSN %X/%X, offset %u",
   hdr->xlp_tli,
   state->latestPageTLI,
   fname,
+  LSN_FORMAT_ARGS(recptr),
   offset);
 			return false;
 		}
@@ -1556,6 +1561,7 @@ WALRead(XLogReaderState *state,
 			errinfo->wre_req = segbytes;
 			errinfo->wre_read = readbytes;
 			errinfo->wre_off = startoff;
+			errinfo->wre_lsn = recptr;
 			errinfo->wre_seg = state->seg;
 			

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-29 Thread Bharath Rupireddy
On Tue, Sep 27, 2022 at 8:31 AM Kyotaro Horiguchi
 wrote:
>
> If all error-emitting site knows the LSN, we don't need the context
> message. But *I* would like that the additional message looks like
> "while reading record at LSN %X/%X" or slightly shorter version of
> it. Because the targetRecPtr is the beginning of the current reading
> record, not the LSN for the segment and offset. It may point to past
> segments.

I think we could just say "LSN %X/%X, offset %u", because the overall
context whether it's being read or written is implicit with the other
part of the message.

Please see the attached v1 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From b19aa25a0d1f2ce85abe0c2081c0e7ede256e329 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Thu, 29 Sep 2022 13:29:44 +
Subject: [PATCH v1] Add LSN along with offset to error messages reported for
 WAL file read/write/validate header failures

---
 src/backend/access/transam/xlog.c |  8 ++--
 src/backend/access/transam/xlogreader.c   | 16 +++-
 src/backend/access/transam/xlogrecovery.c | 13 -
 src/backend/access/transam/xlogutils.c| 10 ++
 src/include/access/xlogreader.h   |  1 +
 5 files changed, 32 insertions(+), 16 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 8e15256db8..a495bbac85 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -2218,6 +2218,7 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 {
 	char		xlogfname[MAXFNAMELEN];
 	int			save_errno;
+	XLogRecPtr	lsn;
 
 	if (errno == EINTR)
 		continue;
@@ -2226,11 +2227,14 @@ XLogWrite(XLogwrtRqst WriteRqst, TimeLineID tli, bool flexible)
 	XLogFileName(xlogfname, tli, openLogSegNo,
  wal_segment_size);
 	errno = save_errno;
+	XLogSegNoOffsetToRecPtr(openLogSegNo, startoffset,
+			wal_segment_size, lsn);
 	ereport(PANIC,
 			(errcode_for_file_access(),
 			 errmsg("could not write to log file %s "
-	"at offset %u, length %zu: %m",
-	xlogfname, startoffset, nleft)));
+	"at offset %u, LSN %X/%X, length %zu: %m",
+	xlogfname, startoffset,
+	LSN_FORMAT_ARGS(lsn), nleft)));
 }
 nleft -= written;
 from += written;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 5a8fe81f82..c3befc44ba 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -1229,9 +1229,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid magic number %04X in WAL segment %s, offset %u",
+			  "invalid magic number %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_magic,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1243,9 +1244,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1284,9 +1286,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 
 		/* hmm, first page of file doesn't have a long header? */
 		report_invalid_record(state,
-			  "invalid info bits %04X in WAL segment %s, offset %u",
+			  "invalid info bits %04X in WAL segment %s, LSN %X/%X, offset %u",
 			  hdr->xlp_info,
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1303,9 +1306,10 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 		XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 		report_invalid_record(state,
-			  "unexpected pageaddr %X/%X in WAL segment %s, offset %u",
+			  "unexpected pageaddr %X/%X in WAL segment %s, LSN %X/%X, offset %u",
 			  LSN_FORMAT_ARGS(hdr->xlp_pageaddr),
 			  fname,
+			  LSN_FORMAT_ARGS(recptr),
 			  offset);
 		return false;
 	}
@@ -1328,10 +1332,11 @@ XLogReaderValidatePageHeader(XLogReaderState *state, XLogRecPtr recptr,
 			XLogFileName(fname, state->seg.ws_tli, segno, state->segcxt.ws_segsize);
 
 			report_invalid_record(state,
-  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",
+  "out-of-sequence timeline ID %u (after %u) in WAL segment %s, LSN %X/%X, offset %u",
   hdr->xlp_tli,
   state->latestPageTLI,
   fname,
+  

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-26 Thread Kyotaro Horiguchi
At Tue, 20 Sep 2022 17:40:36 +0530, Bharath Rupireddy 
 wrote in 
> On Tue, Sep 20, 2022 at 12:57 PM Alvaro Herrera  
> wrote:
> >
> > On 2022-Sep-19, Bharath Rupireddy wrote:
> >
> > > We have a bunch of messages [1] that have an offset, but not LSN in
> > > the error message. Firstly, is there an easiest way to figure out LSN
> > > from offset reported in the error messages? If not, is adding LSN to
> > > these messages along with offset a good idea? Of course, we can't just
> > > convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
> > > something meaningful like reporting the LSN of the page that we are
> > > reading-in or writing-out etc.
> >
> > Maybe add errcontext() somewhere that reports the LSN would be
> > appropriate.  For example, the page_read() callbacks have the LSN
> > readily available, so the ones in backend could install the errcontext
> > callback; or perhaps ReadPageInternal can do it #ifndef FRONTEND.  Not
> > sure what is best of those options, but either of those sounds better
> > than sticking the LSN in a lower-level routine that doesn't necessarily
> > have the info already.
> 
> All of the error messages [1] have the LSN from which offset was
> calculated, I think we can just append that to the error messages
> (something like " offset %u, LSN %X/%X: %m") and not complicate
> it. Thoughts?

If all error-emitting site knows the LSN, we don't need the context
message. But *I* would like that the additional message looks like
"while reading record at LSN %X/%X" or slightly shorter version of
it. Because the targetRecPtr is the beginning of the current reading
record, not the LSN for the segment and offset. It may point to past
segments.


> [1]
> errmsg("could not read from WAL segment %s, offset %u: %m",
> errmsg("could not read from WAL segment %s, offset %u: %m",
> errmsg("could not write to log file %s "
>"at offset %u, length %zu: %m",
> errmsg("unexpected timeline ID %u in WAL segment %s, offset %u",
> errmsg("could not read from WAL segment %s, offset %u: read %d of %zu",
> pg_log_error("received write-ahead log record for offset %u with no file 
> open",
> "invalid magic number %04X in WAL segment %s, offset %u",
> "invalid info bits %04X in WAL segment %s, offset %u",
> "invalid info bits %04X in WAL segment %s, offset %u",
> "unexpected pageaddr %X/%X in WAL segment %s, offset %u",
> "out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",


regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-20 Thread Bharath Rupireddy
On Tue, Sep 20, 2022 at 12:57 PM Alvaro Herrera  wrote:
>
> On 2022-Sep-19, Bharath Rupireddy wrote:
>
> > We have a bunch of messages [1] that have an offset, but not LSN in
> > the error message. Firstly, is there an easiest way to figure out LSN
> > from offset reported in the error messages? If not, is adding LSN to
> > these messages along with offset a good idea? Of course, we can't just
> > convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
> > something meaningful like reporting the LSN of the page that we are
> > reading-in or writing-out etc.
>
> Maybe add errcontext() somewhere that reports the LSN would be
> appropriate.  For example, the page_read() callbacks have the LSN
> readily available, so the ones in backend could install the errcontext
> callback; or perhaps ReadPageInternal can do it #ifndef FRONTEND.  Not
> sure what is best of those options, but either of those sounds better
> than sticking the LSN in a lower-level routine that doesn't necessarily
> have the info already.

All of the error messages [1] have the LSN from which offset was
calculated, I think we can just append that to the error messages
(something like " offset %u, LSN %X/%X: %m") and not complicate
it. Thoughts?

[1]
errmsg("could not read from WAL segment %s, offset %u: %m",
errmsg("could not read from WAL segment %s, offset %u: %m",
errmsg("could not write to log file %s "
   "at offset %u, length %zu: %m",
errmsg("unexpected timeline ID %u in WAL segment %s, offset %u",
errmsg("could not read from WAL segment %s, offset %u: read %d of %zu",
pg_log_error("received write-ahead log record for offset %u with no file open",
"invalid magic number %04X in WAL segment %s, offset %u",
"invalid info bits %04X in WAL segment %s, offset %u",
"invalid info bits %04X in WAL segment %s, offset %u",
"unexpected pageaddr %X/%X in WAL segment %s, offset %u",
"out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-20 Thread Bharath Rupireddy
On Tue, Sep 20, 2022 at 9:02 AM Nathan Bossart  wrote:
>
> On Mon, Sep 19, 2022 at 03:16:42PM -0700, Nathan Bossart wrote:
> > It seems like you want the opposite of pg_walfile_name_offset().  Perhaps
> > we could add a function like pg_walfile_offset_lsn() that accepts a WAL
> > file name and byte offset and returns the LSN.
>
> Like so...

Yeah, something like this will be handy for sure, but I'm not sure if
we want this to be in core. Let's hear from others.

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-20 Thread Alvaro Herrera
On 2022-Sep-19, Bharath Rupireddy wrote:

> We have a bunch of messages [1] that have an offset, but not LSN in
> the error message. Firstly, is there an easiest way to figure out LSN
> from offset reported in the error messages? If not, is adding LSN to
> these messages along with offset a good idea? Of course, we can't just
> convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
> something meaningful like reporting the LSN of the page that we are
> reading-in or writing-out etc.

Maybe add errcontext() somewhere that reports the LSN would be
appropriate.  For example, the page_read() callbacks have the LSN
readily available, so the ones in backend could install the errcontext
callback; or perhaps ReadPageInternal can do it #ifndef FRONTEND.  Not
sure what is best of those options, but either of those sounds better
than sticking the LSN in a lower-level routine that doesn't necessarily
have the info already.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-19 Thread Nathan Bossart
On Mon, Sep 19, 2022 at 03:16:42PM -0700, Nathan Bossart wrote:
> It seems like you want the opposite of pg_walfile_name_offset().  Perhaps
> we could add a function like pg_walfile_offset_lsn() that accepts a WAL
> file name and byte offset and returns the LSN.

Like so...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f8ed45d9fa59690d8b3375d658c1baedb8510195 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 19 Sep 2022 20:24:10 -0700
Subject: [PATCH v1 1/1] introduce pg_walfile_offset_lsn()

---
 doc/src/sgml/func.sgml   | 14 +++
 src/backend/access/transam/xlogfuncs.c   | 39 
 src/include/catalog/pg_proc.dat  |  5 +++
 src/test/regress/expected/misc_functions.out | 19 ++
 src/test/regress/sql/misc_functions.sql  |  9 +
 5 files changed, 86 insertions(+)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 67eb380632..318ac22769 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25891,6 +25891,20 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560

   
 
+  
+   
+
+ pg_walfile_offset_lsn
+
+pg_walfile_offset_lsn ( file_name text, file_offset integer )
+pg_lsn
+   
+   
+Converts a WAL file name and byte offset within that file to a
+write-ahead log location.
+   
+  
+
   

 
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 27aeb6e281..75f58cb118 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -313,6 +313,45 @@ pg_last_wal_replay_lsn(PG_FUNCTION_ARGS)
 	PG_RETURN_LSN(recptr);
 }
 
+/*
+ * Compute an LSN given a WAL file name and decimal byte offset.
+ */
+Datum
+pg_walfile_offset_lsn(PG_FUNCTION_ARGS)
+{
+	char	   *filename = text_to_cstring(PG_GETARG_TEXT_PP(0));
+	int			offset = PG_GETARG_INT32(1);
+	TimeLineID	tli;
+	XLogSegNo	segno;
+	XLogRecPtr	result;
+	uint32		log;
+	uint32		seg;
+
+	if (!IsXLogFileName(filename))
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL file name \"%s\"", filename)));
+
+	sscanf(filename, "%08X%08X%08X", , , );
+	if (seg >= XLogSegmentsPerXLogId(wal_segment_size) ||
+		(log == 0 && seg == 0) ||
+		tli == 0)
+		ereport(ERROR,
+(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("invalid WAL file name \"%s\"", filename)));
+
+	if (offset < 0 || offset >= wal_segment_size)
+		ereport(ERROR,
+(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
+ errmsg("\"offset\" must not be negative or greater than or "
+		"equal to WAL segment size")));
+
+	XLogFromFileName(filename, , , wal_segment_size);
+	XLogSegNoOffsetToRecPtr(segno, offset, wal_segment_size, result);
+
+	PG_RETURN_LSN(result);
+}
+
 /*
  * Compute an xlog file name and decimal byte offset given a WAL location,
  * such as is returned by pg_backup_stop() or pg_switch_wal().
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index a07e737a33..224fe590ac 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -6328,6 +6328,11 @@
   proargtypes => 'pg_lsn', proallargtypes => '{pg_lsn,text,int4}',
   proargmodes => '{i,o,o}', proargnames => '{lsn,file_name,file_offset}',
   prosrc => 'pg_walfile_name_offset' },
+{ oid => '8205',
+  descr => 'wal location, given a wal filename and byte offset',
+  proname => 'pg_walfile_offset_lsn', prorettype => 'pg_lsn',
+  proargtypes => 'text int4', proargnames => '{file_name,file_offset}',
+  prosrc => 'pg_walfile_offset_lsn' },
 { oid => '2851', descr => 'wal filename, given a wal location',
   proname => 'pg_walfile_name', prorettype => 'text', proargtypes => 'pg_lsn',
   prosrc => 'pg_walfile_name' },
diff --git a/src/test/regress/expected/misc_functions.out b/src/test/regress/expected/misc_functions.out
index 9f106c2a10..b2d6f23ac1 100644
--- a/src/test/regress/expected/misc_functions.out
+++ b/src/test/regress/expected/misc_functions.out
@@ -594,3 +594,22 @@ SELECT * FROM tenk1 a JOIN my_gen_series(1,10) g ON a.unique1 = g;
  Index Cond: (unique1 = g.g)
 (4 rows)
 
+-- pg_walfile_offset_lsn
+SELECT pg_walfile_offset_lsn('invalid', 15);
+ERROR:  invalid WAL file name "invalid"
+SELECT pg_walfile_offset_lsn('0001', 15);
+ERROR:  invalid WAL file name "0001"
+SELECT pg_walfile_offset_lsn('0001', 15);
+ERROR:  invalid WAL file name "0001"
+SELECT pg_walfile_offset_lsn('0001', 15);
+ERROR:  invalid WAL file name "0001"
+SELECT pg_walfile_offset_lsn('00010001', -1);
+ERROR:  "offset" must not be negative or greater than or equal to WAL segment size
+SELECT pg_walfile_offset_lsn('00010001', 20);
+ERROR:  "offset" must not be 

Re: Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-19 Thread Nathan Bossart
On Mon, Sep 19, 2022 at 07:26:57PM +0530, Bharath Rupireddy wrote:
> We have a bunch of messages [1] that have an offset, but not LSN in
> the error message. Firstly, is there an easiest way to figure out LSN
> from offset reported in the error messages? If not, is adding LSN to
> these messages along with offset a good idea? Of course, we can't just
> convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
> something meaningful like reporting the LSN of the page that we are
> reading-in or writing-out etc.

It seems like you want the opposite of pg_walfile_name_offset().  Perhaps
we could add a function like pg_walfile_offset_lsn() that accepts a WAL
file name and byte offset and returns the LSN.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Add LSN along with offset to error messages reported for WAL file read/write/validate header failures

2022-09-19 Thread Bharath Rupireddy
Hi,

I was recently asked about converting an offset reported in WAL read
error messages[1] to an LSN with which pg_waldump can be used to
verify the records or WAL file around that LSN (basically one can
filter out the output based on LSN). AFAICS, there's no function that
takes offset as an input and produces an LSN and I ended up figuring
out LSN manually. And, for some of my work too, I was hitting errors
in XLogReaderValidatePageHeader() and adding recptr to those error
messages helped me debug issues faster.

We have a bunch of messages [1] that have an offset, but not LSN in
the error message. Firstly, is there an easiest way to figure out LSN
from offset reported in the error messages? If not, is adding LSN to
these messages along with offset a good idea? Of course, we can't just
convert offset to LSN using XLogSegNoOffsetToRecPtr() and report, but
something meaningful like reporting the LSN of the page that we are
reading-in or writing-out etc.

Thoughts?

[1]
errmsg("could not read from WAL segment %s, offset %u: %m",
errmsg("could not read from WAL segment %s, offset %u: %m",
errmsg("could not write to log file %s "
   "at offset %u, length %zu: %m",
errmsg("unexpected timeline ID %u in WAL segment %s, offset %u",
errmsg("could not read from WAL segment %s, offset %u: read %d of %zu",
pg_log_error("received write-ahead log record for offset %u with no file open",
"invalid magic number %04X in WAL segment %s, offset %u",
"invalid info bits %04X in WAL segment %s, offset %u",
"invalid info bits %04X in WAL segment %s, offset %u",
"unexpected pageaddr %X/%X in WAL segment %s, offset %u",
"out-of-sequence timeline ID %u (after %u) in WAL segment %s, offset %u",

-- 
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com