Re: pg_dump, pg_basebackup don't error out with wrong option for "--format"

2021-11-25 Thread Michael Paquier
On Thu, Nov 25, 2021 at 11:22:11AM +0530, Bharath Rupireddy wrote:
> On Thu, Nov 25, 2021 at 11:02 AM Dilip Kumar  wrote:
>> I think for parsing we use getopt_long(); as per that, if you use a
>> prefix of an option string and it does not conflict with any other
>> option, then it is allowed.  So --fo and --for are both accepted, --f
>> will not be accepted because --file and --format would conflict, and
>> --foo is also not allowed because it is not a valid prefix of any
>> valid option string.
> 
> Yeah, that makes sense. Thanks.

It is worth noting that getopt_long() is a GNU extension and not
something directly defined in POSIX, but it is so popular that it has
spread everywhere.  This option-name handling is quite common across
everything that uses getopt_long(), actually, and erroring out on
non-exact option names would break a bunch of existing use cases, as it
is possible to save some characters when getopt_long() is sure of the
uniqueness of the option found.
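
For illustration, a minimal standalone sketch of that prefix matching
(not from any patch; the option names are just examples):

#include <getopt.h>
#include <stdio.h>

int
main(int argc, char **argv)
{
	static struct option long_options[] = {
		{"file", required_argument, NULL, 'f'},
		{"format", required_argument, NULL, 'F'},
		{NULL, 0, NULL, 0}
	};
	int			c;

	/* "--fo" and "--for" resolve to --format; "--f" is ambiguous */
	while ((c = getopt_long(argc, argv, "f:F:", long_options, NULL)) != -1)
	{
		switch (c)
		{
			case 'f':
				printf("--file = %s\n", optarg);
				break;
			case 'F':
				printf("--format = %s\n", optarg);
				break;
			default:
				return 1;	/* getopt_long() has already reported the error */
		}
	}
	return 0;
}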
--
Michael




Re: Deduplicate code updating ControlFile's DBState.

2021-11-25 Thread Michael Paquier
On Thu, Nov 25, 2021 at 04:04:23PM +0900, Michael Paquier wrote:
> I have not checked the performance implications of that with a micro
> benchmark or the like, but I can get behind 0001 on consistency
> grounds between the backend and the frontend.

/* Now create pg_control */
InitControlFile(sysidentifier);
-   ControlFile->time = checkPoint.time;
ControlFile->checkPoint = checkPoint.redo;
ControlFile->checkPointCopy = checkPoint;
0001 has a mistake here, no?  The initial control file creation goes
through WriteControlFile(), and not update_controlfile(), so this
change means that we would miss setting up this timestamp for the
first time.

@@ -714,7 +714,6 @@ GuessControlValues(void)
ControlFile.checkPointCopy.oldestActiveXid = InvalidTransactionId;

ControlFile.state = DB_SHUTDOWNED;
-   ControlFile.time = (pg_time_t) time(NULL);
This one had better not be removed, either, as we require pg_resetwal
to guess a set of control file values.  Removing the one in
RewriteControlFile() is fine, by contrast.
--
Michael




Re: row filtering for logical replication

2021-11-25 Thread Amit Kapila
On Fri, Nov 26, 2021 at 12:01 PM Peter Smith  wrote:
>
> On Fri, Nov 26, 2021 at 4:18 PM Peter Smith  wrote:
> >
> > > > Do you mean to say that we should give an error on Update/Delete
> > > > if any of the publications contain a table row filter that has
> > > > columns that are not part of the primary key or replica identity?
> > > > I think this is what Hou-san has implemented in his top-up patch
> > > > and I also think this is the right behavior.
> > >
> > > Yes, the top-up patch will give an error if the columns in the row
> > > filter are not part of the replica identity for UPDATE and DELETE.
> > >
> > > But the point I want to confirm is that:
> > >

Okay, I see your point now.

> > > ---
> > > create publication A for table tbl1 where (b<2) with(publish='insert');
> > > create publication B for table tbl1 where (a>1) with(publish='update');
> > > ---
> > >
> > > When UPDATE is run on the table 'tbl1', is it correct to combine
> > > and execute both of the row filters in A (b<2) and B (a>1)?  (That
> > > is the current behavior.)
> > >
> > > Because the filter in A has an unlogged column (b) and publication
> > > A only publishes "insert", should we, for UPDATE, skip the row
> > > filter in A and only execute the row filter in B?
> > >
> >
> > But since the filters are OR'ed together does it even matter?
> >

Even if it is OR'ed, if the value is not logged (as it was not part of
the replica identity or primary key), as in Hou-San's example, how will
we evaluate such a filter?

> > Now that your top-up patch prevents invalid updates/deletes, this
> > other point is only really a question about the cache performance,
> > isn't it?
> >
>
> Irrespective of replica identity I think there is still a functional
> behaviour question, right?
>
> e.g.
> create publication p1 for table census where (country = 'Aust') with
> (publish='update')
> create publication p2 for table census where (country = 'NZ') with
> (publish='insert')
>
> Should it be possible to UPDATE for country 'NZ' or not?
> Is this the same as your question Hou-san?
>

I am not sure if it is the same, because in Hou-San's example the
publications refer to different columns, where one of the columns was
part of the PK and the other was not, whereas in your example both refer
to the same column. I think in your example the error will happen at the
time of update/delete, whereas in Hou-San's example it won't happen at
the time of update/delete.

With Regards,
Amit Kapila.




Re: [PATCH] ALTER tab completion

2021-11-25 Thread Michael Paquier
On Fri, Nov 26, 2021 at 01:55:46PM +0900, Ken Kato wrote:
> I noticed that there are some tab completions missing for the following
> commands:
> -ALTER DEFAULT PRIVILEGES: missing FOR USER

FOR ROLE is an equivalent.  That does not seem mandatory to me.

> -ALTER FOREIGN DATA WRAPPER: missing NO HANDLER, NO VALIDATOR

Okay for this one.

> -ALTER VIEW: no completion after ALTER COLUMN column_name

+   /* ALTER VIEW xxx ALTER yyy */
+   else if (Matches("ALTER", "VIEW", MatchAny, "ALTER", MatchAny))
+   COMPLETE_WITH("SET DEFAULT", "DROP DEFAULT");
It may be cleaner to group this one with "ALTER VIEW xxx ALTER yyy"
two blocks above.

> -ALTER TRANSFORM: no doc for ALTER TRANSFORM, so I excluded TRANSFORM from
> ALTER tab completion

Right.

> -ALTER SEQUENCE: missing AS
+   /* ALTER SEQUENCE  AS */
+   else if (TailMatches("ALTER", "SEQUENCE", MatchAny, "AS"))
+   COMPLETE_WITH("smallint", "integer", "bigint");
Re-quoting Horiguchi-san, that should be COMPLETE_WITH_CS() to keep
these completions in lower case.
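
In other words, something like this (an untested sketch of the
suggested change):

	/* ALTER SEQUENCE <name> AS */
	else if (TailMatches("ALTER", "SEQUENCE", MatchAny, "AS"))
		COMPLETE_WITH_CS("smallint", "integer", "bigint");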
--
Michael




Re: row filtering for logical replication

2021-11-25 Thread Peter Smith
On Fri, Nov 26, 2021 at 4:18 PM Peter Smith  wrote:
>
> On Fri, Nov 26, 2021 at 4:05 PM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Fri, Nov 26, 2021 11:32 AM Amit Kapila  wrote:
> > > On Thu, Nov 25, 2021 at 7:39 PM Euler Taveira  wrote:
> > > >
> > > > On Thu, Nov 25, 2021, at 10:39 AM, houzj.f...@fujitsu.com wrote:
> > > >
> > > > When researching and writing a top-up patch about this,
> > > > I found a possible issue which I'd like to confirm first.
> > > >
> > > > It's possible the table is published in two publications A and B;
> > > > publication A only publishes "insert", publication B publishes
> > > > "update". On UPDATE, the row filters in both A and B will be
> > > > executed. Is this behavior expected?
> > > >
> > > > Good question. No. The code should check the action before combining
> > > > the multiple row filters.
> > > >
> > >
> > > Do you mean to say that we should give an error on Update/Delete
> > > if any of the publications contain a table row filter that has
> > > columns that are not part of the primary key or replica identity?
> > > I think this is what Hou-san has implemented in his top-up patch
> > > and I also think this is the right behavior.
> >
> > Yes, the top-up patch will give an error if the columns in the row
> > filter are not part of the replica identity for UPDATE and DELETE.
> >
> > But the point I want to confirm is that:
> >
> > ---
> > create publication A for table tbl1 where (b<2) with(publish='insert');
> > create publication B for table tbl1 where (a>1) with(publish='update');
> > ---
> >
> > When UPDATE is run on the table 'tbl1', is it correct to combine and
> > execute both of the row filters in A (b<2) and B (a>1)?  (That is
> > the current behavior.)
> >
> > Because the filter in A has an unlogged column (b) and publication A
> > only publishes "insert", should we, for UPDATE, skip the row filter
> > in A and only execute the row filter in B?
> >
>
> But since the filters are OR'ed together does it even matter?
>
> Now that your top-up patch prevents invalid updates/deletes, this
> other point is only really a question about the cache performance,
> isn't it?
>

Irrespective of replica identity I think there is still a functional
behaviour question, right?

e.g.
> create publication p1 for table census where (country = 'Aust') with
> (publish='update')
create publication p2 for table census where (country = 'NZ') with
(publish='insert')

Should it be possible to UPDATE for country 'NZ' or not?
Is this the same as your question Hou-san?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: pg_waldump gets stuck with options --follow or -f and --stats or -z

2021-11-25 Thread Michael Paquier
On Sat, Nov 20, 2021 at 11:46:35AM +0530, Bharath Rupireddy wrote:
> Thanks. Here's the v3 patch, a much simpler one. Please review it.

+   pqsignal(SIGINT, SignalHandlerForTermination);
+   pqsignal(SIGTERM, SignalHandlerForTermination);
+   pqsignal(SIGQUIT, SignalHandlerForTermination);
FWIW, I think that we should do this stuff only on SIGINT.  I would
imagine that this behavior becomes handy mainly when one wishes to
Ctrl+C the terminal running pg_waldump but still get some
information.
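
That is, a minimal sketch, reusing the handler name from the quoted v3
patch: the handler only sets a flag, which the main read loop would
check before printing the summary.

static volatile sig_atomic_t time_to_stop = false;

static void
SignalHandlerForTermination(SIGNAL_ARGS)
{
	time_to_stop = true;
}

	/* in main(): trap only Ctrl+C */
	pqsignal(SIGINT, SignalHandlerForTermination);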

 XLogDumpCountRecord(, , xlogreader_state);
+stats.endptr = xlogreader_state->currRecPtr;
Isn't what you are looking for here EndRecPtr rather than currRecPtr,
to track the end of the last record read?

+When --follow is used with --stats and
+the pg_waldump is terminated or interrupted
+with signal SIGINT or SIGTERM or SIGQUIT,
+the summary statistics computed as of the
+termination will be displayed.
This description is not completely correct, as the set of stats would
show up only by using --stats, in non-quiet mode.  Rather than
describing this behavior at the end of the docs, I think that it would
be better to add a new paragraph in the description of --stats.
--
Michael




Inconsistent results from seqscan and gist indexscan

2021-11-25 Thread Richard Guo
Here is how it can be reproduced.

create table point_tbl (f1 point);

insert into point_tbl(f1) values ('(5.1, 34.5)');
insert into point_tbl(f1) values (' ( Nan , NaN ) ');
analyze;

create index gpointind on point_tbl using gist (f1);

set enable_seqscan to on;
set enable_indexscan to off;
# select * from point_tbl where f1 <@ polygon
'(0,0),(0,100),(100,100),(100,0),(0,0)';
     f1
------------
 (5.1,34.5)
 (NaN,NaN)
(2 rows)


set enable_seqscan to off;
set enable_indexscan to on;
# select * from point_tbl where f1 <@ polygon
'(0,0),(0,100),(100,100),(100,0),(0,0)';
     f1
------------
 (5.1,34.5)
(1 row)

Seems point_inside() does not handle NaN properly.
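
A sketch of the kind of guard that might be needed at the top of
point_inside(), assuming the intended semantics are that a point with
NaN coordinates is never inside a polygon:

	/* a NaN point cannot be inside any polygon; bail out early */
	if (isnan(p->x) || isnan(p->y))
		return 0;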

BTW, I'm using 15devel, but this issue can be seen in at least version
12 as well.

Thanks
Richard


Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-11-25 Thread Michael Paquier
On Thu, Nov 25, 2021 at 06:19:03PM -0800, SATYANARAYANA NARLAPURAM wrote:
> Is there a plan in place to remove the exclusive backup option from the
> core in PG 15/16?

This was discussed, but removing it could also harm users relying on
it.  Perhaps it could be revisited, but I am not sure if this is worth
worrying about either.

> If we are keeping it then why not make it better?

Well, non-exclusive backups are better by design in many aspects, so I
don't quite see the point in spending time on something that has more
limitations than what's already in place.
--
Michael




Re: row filtering for logical replication

2021-11-25 Thread Peter Smith
On Fri, Nov 26, 2021 at 4:05 PM houzj.f...@fujitsu.com
 wrote:
>
> On Fri, Nov 26, 2021 11:32 AM Amit Kapila  wrote:
> > On Thu, Nov 25, 2021 at 7:39 PM Euler Taveira  wrote:
> > >
> > > On Thu, Nov 25, 2021, at 10:39 AM, houzj.f...@fujitsu.com wrote:
> > >
> > > When researching and writing a top-up patch about this,
> > > I found a possible issue which I'd like to confirm first.
> > >
> > > It's possible the table is published in two publications A and B;
> > > publication A only publishes "insert", publication B publishes
> > > "update". On UPDATE, the row filters in both A and B will be
> > > executed. Is this behavior expected?
> > >
> > > Good question. No. The code should check the action before combining
> > > the multiple row filters.
> > >
> >
> > Do you mean to say that we should give an error on Update/Delete if
> > any of the publications contain a table row filter that has columns
> > that are not part of the primary key or replica identity? I think
> > this is what Hou-san has implemented in his top-up patch and I also
> > think this is the right behavior.
>
> Yes, the top-up patch will give an error if the columns in the row
> filter are not part of the replica identity for UPDATE and DELETE.
>
> But the point I want to confirm is that:
>
> ---
> create publication A for table tbl1 where (b<2) with(publish='insert');
> create publication B for table tbl1 where (a>1) with(publish='update');
> ---
>
> When UPDATE is run on the table 'tbl1', is it correct to combine and
> execute both of the row filters in A (b<2) and B (a>1)?  (That is the
> current behavior.)
>
> Because the filter in A has an unlogged column (b) and publication A
> only publishes "insert", should we, for UPDATE, skip the row filter in
> A and only execute the row filter in B?
>

But since the filters are OR'ed together does it even matter?

Now that your top-up patch prevents invalid updates/deletes, this
other point is only really a question about the cache performance,
isn't it?

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




RE: row filtering for logical replication

2021-11-25 Thread houzj.f...@fujitsu.com
On Fri, Nov 26, 2021 11:32 AM Amit Kapila  wrote:
> On Thu, Nov 25, 2021 at 7:39 PM Euler Taveira  wrote:
> >
> > On Thu, Nov 25, 2021, at 10:39 AM, houzj.f...@fujitsu.com wrote:
> >
> > When researching and writing a top-up patch about this,
> > I found a possible issue which I'd like to confirm first.
> >
> > It's possible the table is published in two publications A and B;
> > publication A only publishes "insert", publication B publishes
> > "update". On UPDATE, the row filters in both A and B will be
> > executed. Is this behavior expected?
> >
> > Good question. No. The code should check the action before combining
> > the multiple row filters.
> >
> 
> Do you mean to say that we should give an error on Update/Delete if
> any of the publications contain a table row filter that has columns
> that are not part of the primary key or replica identity? I think this
> is what Hou-san has implemented in his top-up patch and I also think
> this is the right behavior.

Yes, the top-up patch will give an error if the columns in the row
filter are not part of the replica identity for UPDATE and DELETE.

But the point I want to confirm is that:

---
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');
---

When UPDATE is run on the table 'tbl1', is it correct to combine and
execute both of the row filters in A (b<2) and B (a>1)?  (That is the
current behavior.)

Because the filter in A has an unlogged column (b) and publication A
only publishes "insert", should we, for UPDATE, skip the row filter in
A and only execute the row filter in B?

Best regards,
Hou zj


[PATCH] ALTER tab completion

2021-11-25 Thread Ken Kato

Hi hackers,

I noticed that there are some tab completions missing for the following 
commands:

-ALTER DEFAULT PRIVILEGES: missing FOR USER
-ALTER FOREIGN DATA WRAPPER: missing NO HANDLER, NO VALIDATOR
-ALTER SEQUENCE: missing AS
-ALTER VIEW: no completion after ALTER COLUMN column_name
-ALTER TRANSFORM: no doc for ALTER TRANSFORM, so I excluded TRANSFORM 
from ALTER tab completion


I made a patch for this, so please have a look.

Best wishes,

--
Ken Kato
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4f724e4428..60bbd18ade 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1088,7 +1088,7 @@ static const pgsql_thing_t words_after_create[] = {
 	{"TEMPORARY", NULL, NULL, NULL, THING_NO_DROP | THING_NO_ALTER},	/* for CREATE TEMPORARY
 		 * TABLE ... */
 	{"TEXT SEARCH", NULL, NULL, NULL},
-	{"TRANSFORM", NULL, NULL, NULL},
+	{"TRANSFORM", NULL, NULL, NULL, THING_NO_ALTER},
 	{"TRIGGER", "SELECT pg_catalog.quote_ident(tgname) FROM pg_catalog.pg_trigger WHERE substring(pg_catalog.quote_ident(tgname),1,%d)='%s' AND NOT tgisinternal"},
 	{"TYPE", NULL, NULL, _for_list_of_datatypes},
 	{"UNIQUE", NULL, NULL, NULL, THING_NO_DROP | THING_NO_ALTER},	/* for CREATE UNIQUE
@@ -1754,7 +1754,10 @@ psql_completion(const char *text, int start, int end)
 
 	/* ALTER FOREIGN DATA WRAPPER  */
 	else if (Matches("ALTER", "FOREIGN", "DATA", "WRAPPER", MatchAny))
-		COMPLETE_WITH("HANDLER", "VALIDATOR", "OPTIONS", "OWNER TO", "RENAME TO");
+		COMPLETE_WITH("HANDLER", "VALIDATOR", "NO",
+	  "OPTIONS", "OWNER TO", "RENAME TO");
+	else if (Matches("ALTER", "FOREIGN", "DATA", "WRAPPER", MatchAny, "NO"))
+		COMPLETE_WITH("HANDLER", "VALIDATOR");
 
 	/* ALTER FOREIGN TABLE  */
 	else if (Matches("ALTER", "FOREIGN", "TABLE", MatchAny))
@@ -1856,10 +1859,10 @@ psql_completion(const char *text, int start, int end)
 
 	/* ALTER DEFAULT PRIVILEGES */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES"))
-		COMPLETE_WITH("FOR ROLE", "IN SCHEMA");
+		COMPLETE_WITH("FOR ROLE", "FOR USER", "IN SCHEMA");
 	/* ALTER DEFAULT PRIVILEGES FOR */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "FOR"))
-		COMPLETE_WITH("ROLE");
+		COMPLETE_WITH("ROLE", "USER");
 	/* ALTER DEFAULT PRIVILEGES IN */
 	else if (Matches("ALTER", "DEFAULT", "PRIVILEGES", "IN"))
 		COMPLETE_WITH("SCHEMA");
@@ -1907,9 +1910,12 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH("DEFAULT", "NOT NULL", "SCHEMA");
 	/* ALTER SEQUENCE  */
 	else if (Matches("ALTER", "SEQUENCE", MatchAny))
-		COMPLETE_WITH("INCREMENT", "MINVALUE", "MAXVALUE", "RESTART", "NO",
-	  "CACHE", "CYCLE", "SET SCHEMA", "OWNED BY", "OWNER TO",
-	  "RENAME TO");
+		COMPLETE_WITH("AS", "INCREMENT", "MINVALUE", "MAXVALUE", "RESTART",
+	  "NO", "CACHE", "CYCLE", "SET SCHEMA", "OWNED BY",
+	  "OWNER TO", "RENAME TO");
+	/* ALTER SEQUENCE  AS */
+	else if (TailMatches("ALTER", "SEQUENCE", MatchAny, "AS"))
+		COMPLETE_WITH("smallint", "integer", "bigint");
 	/* ALTER SEQUENCE  NO */
 	else if (Matches("ALTER", "SEQUENCE", MatchAny, "NO"))
 		COMPLETE_WITH("MINVALUE", "MAXVALUE", "CYCLE");
@@ -1938,9 +1944,15 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER VIEW xxx RENAME yyy */
 	else if (Matches("ALTER", "VIEW", MatchAny, "RENAME", MatchAnyExcept("TO")))
 		COMPLETE_WITH("TO");
+	/* ALTER VIEW xxx ALTER yyy */
+	else if (Matches("ALTER", "VIEW", MatchAny, "ALTER", MatchAny))
+		COMPLETE_WITH("SET DEFAULT", "DROP DEFAULT");
 	/* ALTER VIEW xxx RENAME COLUMN yyy */
 	else if (Matches("ALTER", "VIEW", MatchAny, "RENAME", "COLUMN", MatchAnyExcept("TO")))
 		COMPLETE_WITH("TO");
+	/* ALTER VIEW xxx ALTER COLUMN yyy */
+	else if (Matches("ALTER", "VIEW", MatchAny, "ALTER", "COLUMN", MatchAny))
+		COMPLETE_WITH("SET DEFAULT", "DROP DEFAULT");
 
 	/* ALTER MATERIALIZED VIEW  */
 	else if (Matches("ALTER", "MATERIALIZED", "VIEW", MatchAny))


Re: Windows build warnings

2021-11-25 Thread Tom Lane
Greg Nancarrow  writes:
> AFAICS, the fundamental difference here seems to be that the GCC
> compiler still regards a variable as "unused" if it is never read,
> whereas if the variable is set (but not necessarily read) that's
> enough for the Windows C compiler to regard it as "used".

It depends.  Older gcc versions don't complain about set-but-not-read
variables, but clang has done so for awhile (with a specific warning
message about the case), and I think recent gcc follows suit.

> Personally I'm not really in favour of outright disabling the C4101
> warning on Windows, because I think it is a useful warning for
> Postgres developers on Windows for cases unrelated to the use of
> PG_USED_FOR_ASSERTS_ONLY.

IMO we should either do that or do whatever's necessary to make the
macro work properly on MSVC.  I'm not very much in favor of jumping
through hoops to satisfy a compiler that has a randomly-different-
but-still-demonstrably-inadequate version of this warning.

regards, tom lane




Re: Windows build warnings

2021-11-25 Thread Greg Nancarrow
On Thu, Nov 25, 2021 at 11:03 PM Daniel Gustafsson  wrote:
>
> To silence the warnings in the meantime (if the rework at all happens) we
> should either apply the patch from Greg or add C4101 to disablewarnings in
> src/tools/msvc/Project.pm as mentioned above.  On top of that, we should apply
> the patch I proposed downthread to remove PG_USED_FOR_ASSERTS_ONLY where it's
> no longer applicable.  Personally I'm fine with either, and am happy to make 
> it
> happen, once we agree on what it should be.
>

AFAICS, the fundamental difference here seems to be that the GCC
compiler still regards a variable as "unused" if it is never read,
whereas if the variable is set (but not necessarily read) that's
enough for the Windows C compiler to regard it as "used".
This is why, at least in the majority of cases, we're not seeing
the C4101 warnings on Windows where PG_USED_FOR_ASSERTS_ONLY has been
used in the Postgres source: in those cases the variable has
been set prior to its use in an Assert or "#ifdef USE_ASSERT_CHECKING"
block.
IMO, for the case in point, it's best to fix it by either setting the
variables to NULL prior to their use in the "#ifdef
USE_ASSERT_CHECKING" block, or by applying my patch.
Of course, this doesn't address fixing the PG_USED_FOR_ASSERTS_ONLY
macro to work on Windows, but I don't see an easy way forward on that
if it's to remain in its "variable attribute" form, and in any case
the Windows C compiler doesn't seem to support any annotation to mark
a variable as potentially unused.
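
For context, a minimal sketch of the pattern in question (do_work() and
check_result() are hypothetical): GCC/Clang honor the unused attribute
behind PG_USED_FOR_ASSERTS_ONLY, while MSVC has no equivalent, so when
assertions are compiled out the variable is never referenced and C4101
fires.

static void
example(void)
{
	bool		ok PG_USED_FOR_ASSERTS_ONLY;	/* attribute is a no-op on MSVC */

	do_work();

#ifdef USE_ASSERT_CHECKING
	ok = check_result();		/* only assigned in assert-enabled builds */
	Assert(ok);
#endif
}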
Personally I'm not really in favour of outright disabling the C4101
warning on Windows, because I think it is a useful warning for
Postgres developers on Windows for cases unrelated to the use of
PG_USED_FOR_ASSERTS_ONLY.
I agree with your proposal to apply your patch to remove
PG_USED_FOR_ASSERTS_ONLY where it's no longer applicable.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: prevent immature WAL streaming

2021-11-25 Thread Amul Sul
On Fri, Nov 26, 2021 at 1:42 AM Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > On 2021-Nov-25, Tom Lane wrote:
> >> Really?  AFAICS the WAL record contains the correct value, or at least
> >> we should define that one as being correct, for precisely this reason.
>
> > I don't know what is the correct value for a record that comes exactly
> > after the page header.  But here's a patch that fixes the problem; and
> > if a standby replays WAL written by an unpatched primary, it will be
> > able to read past instead of dying of FATAL.
>
> Meh ... but given the simplicity of the write-side fix, maybe changing
> it is appropriate.
>
> However, this seems too forgiving:
>
> +if (xlrec->overwritten_lsn != state->overwrittenRecPtr &&
> +xlrec->overwritten_lsn - SizeOfXLogShortPHD != state->overwrittenRecPtr &&
> +xlrec->overwritten_lsn - SizeOfXLogLongPHD != state->overwrittenRecPtr)
>

Unless I am missing something, I am not sure why we need this
adjustment if we are going to use the state->currRecPtr value, which
doesn't seem to change at all. AFAICU, state->currRecPtr will hold the
same value whether we are going to set overwrittenRecPtr or
abortedRecPtr. Do the primary and standby see state->currRecPtr
differently? I guess not, never?

Regards,
Amul




Re: row filtering for logical replication

2021-11-25 Thread Greg Nancarrow
On Fri, Nov 26, 2021 at 1:16 PM houzj.f...@fujitsu.com
 wrote:
>
> Based on this direction, I tried to write a top up POC patch(0005) which I'd 
> like to share.
>

I noticed a minor issue.
In the top-up patch, the following error message detail:

+ errdetail("Not all row filter columns are not part of the REPLICA
IDENTITY")));

should be:

+ errdetail("Not all row filter columns are part of the REPLICA IDENTITY")));


Regards,
Greg Nancarrow
Fujitsu Australia




Re: RFC: Logging plan of the running query

2021-11-25 Thread torikoshia

On 2021-11-17 22:44, Ekaterina Sokolova wrote:

Hi!

You forgot my last fix to build correctly on Mac. I have added it.


Thanks for the notification!
Since the patch no longer applied to HEAD, I also updated it.




About our discussion of pg_query_state:

torikoshia wrote on 2021-11-04 15:49:

I doubt that it was the right link.

Sorry for making you confused; here is the link.
https://www.postgresql.org/message-id/CA%2BTgmobkpFV0UB67kzXuD36--OFHwz1bs%3DL_6PZbD4nxKqUQMw%40mail.gmail.com


Thank you. I'll see it soon.


I imagined the following procedure.
Does it cause a deadlock in pg_query_state?

- session1
BEGIN; TRUNCATE t;

- session2
BEGIN; TRUNCATE t; -- wait

- session1
SELECT * FROM pg_query_state(); -- wait and deadlocked?


As far as I know, pg_query_state uses non-blocking reads and writes. I
have written a few tests trying to deadlock it (on version 14), but all
of them finished correctly.

Have a nice day. Please feel free to contact me if you need any
further information.


Thanks for your information and help!

--
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From b8367e22d7a9898e4b85627ba8c203be273fc22f Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Fri, 26 Nov 2021 10:31:00 +0900
Subject: [PATCH v14] Add function to log the untruncated query string and its
 plan for the query currently running on the backend with the specified
 process ID.

Currently, we have to wait for the query execution to finish
to check its plan. This is not so convenient when
investigating long-running queries on production environments
where we cannot use debuggers.
To improve this situation, this patch adds
pg_log_query_plan() function that requests to log the
plan of the specified backend process.

By default, only superusers are allowed to request to log the
plans because allowing any users to issue this request at an
unbounded rate would cause lots of log messages, which can
lead to denial of service.

On receipt of the request, at the next CHECK_FOR_INTERRUPTS(),
the target backend logs its plan at LOG_SERVER_ONLY level, so
that these plans will appear in the server log but not be sent
to the client.

Since some codes, tests and comments of
pg_log_query_plan() are the same with
pg_log_backend_memory_contexts(), this patch also refactors
them to make them common.

Reviewed-by: Bharath Rupireddy, Fujii Masao, Dilip Kumar, Masahiro Ikeda, Ekaterina Sokolova, Justin Pryzby

---
 doc/src/sgml/func.sgml   |  45 +++
 src/backend/catalog/system_functions.sql |   2 +
 src/backend/commands/explain.c   | 117 ++-
 src/backend/executor/execMain.c  |  10 ++
 src/backend/storage/ipc/procsignal.c |   4 +
 src/backend/storage/ipc/signalfuncs.c|  55 +
 src/backend/storage/lmgr/lock.c  |   9 +-
 src/backend/tcop/postgres.c  |   7 ++
 src/backend/utils/adt/mcxtfuncs.c|  36 +-
 src/backend/utils/init/globals.c |   1 +
 src/include/catalog/pg_proc.dat  |   6 +
 src/include/commands/explain.h   |   3 +
 src/include/miscadmin.h  |   1 +
 src/include/storage/lock.h   |   2 -
 src/include/storage/procsignal.h |   1 +
 src/include/storage/signalfuncs.h|  22 
 src/include/tcop/pquery.h|   1 +
 src/test/regress/expected/misc_functions.out |  54 +++--
 src/test/regress/sql/misc_functions.sql  |  42 +--
 19 files changed, 355 insertions(+), 63 deletions(-)
 create mode 100644 src/include/storage/signalfuncs.h

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 0a725a6711..b84ead4341 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -25358,6 +25358,26 @@ SELECT collation for ('foo' COLLATE "de_DE");

   
 
+  
+   
+
+ pg_log_query_plan
+
+pg_log_query_plan ( pid integer )
+boolean
+   
+   
+Requests to log the plan of the query currently running on the
+backend with specified process ID along with the untruncated
+query string.
+They will be logged at LOG message level and
+will appear in the server log based on the log
+configuration set (See 
+for more information), but will not be sent to the client
+regardless of .
+   
+  
+
   

 
@@ -25471,6 +25491,31 @@ LOG:  Grand total: 1651920 bytes in 201 blocks; 622360 free (88 chunks); 1029560
 because it may generate a large number of log messages.

 
+   
+pg_log_query_plan can be used
+to log the plan of a backend process. For example:
+
+postgres=# SELECT pg_log_query_plan(201116);
+ pg_log_query_plan
+---
+ t
+(1 row)
+
+The format of the query plan is the same as when VERBOSE,
+COSTS, SETTINGS and
+FORMAT TEXT are used in the EXPLAIN
+command. For example:
+
+LOG:  plan of the query 

Re: row filtering for logical replication

2021-11-25 Thread Amit Kapila
On Thu, Nov 25, 2021 at 7:39 PM Euler Taveira  wrote:
>
> On Thu, Nov 25, 2021, at 10:39 AM, houzj.f...@fujitsu.com wrote:
>
> When researching and writing a top-up patch about this,
> I found a possible issue which I'd like to confirm first.
>
> It's possible the table is published in two publications A and B;
> publication A only publishes "insert", publication B publishes
> "update". On UPDATE, the row filters in both A and B will be executed.
> Is this behavior expected?
>
> Good question. No. The code should check the action before combining the
> multiple row filters.
>

Do you mean to say that we should give an error on Update/Delete if
any of the publications contain a table row filter that has columns
that are not part of the primary key or replica identity? I think this
is what Hou-san has implemented in his top-up patch and I also think
this is the right behavior.

-- 
With Regards,
Amit Kapila.




Re: pg_get_publication_tables() output duplicate relid

2021-11-25 Thread Amit Kapila
On Fri, Nov 26, 2021 at 7:10 AM houzj.f...@fujitsu.com
 wrote:
>
> On Thursday, November 25, 2021 4:57 PM Amit Kapila  
> wrote:
> > On Thu, Nov 25, 2021 at 1:30 PM Amit Langote 
> > >
> > > I agree with backpatching the doc fix.  I've attached a diff against
> > > master, though it also appears to apply to 13 and 14 branches.
> > >
> >
> > I think we can  for publish_via_partition_root, other than that
> > the patch looks good to me.
> >
> > Hou-San, others, do you have any opinion about this patch and whether to
> > backpatch it or not?
>
> I think it makes sense to backpatch the doc fix, and the patch looks good to 
> me.
>

Thanks, I'll push this sometime early next week unless there are any
objections to it.


-- 
With Regards,
Amit Kapila.




Re: pg_upgrade and publication/subscription problem

2021-11-25 Thread Amit Kapila
On Thu, Nov 25, 2021 at 8:00 PM Marcos Pegoraro  wrote:
>>
>> Yes, the way you are doing I think it is bound to happen. There is
>> some discussion about why this is happening in email [2]. AFAIK, it is
>> not documented and if so, I think it will be a good idea to document
>>
> And my problem remains the same: how do I solve it? All records in
> pg_subscription_rel are initialized with a null srsubstate. How can I
> replay only the updates since yesterday? This replica is an auditing
> database, so I cannot lose anything that happened since that
> pg_upgrade. [1] shows how to upgrade, but since I did it the wrong
> way, how do I recover?
>

AFAIU the main problem in your case is that you didn't block the write
traffic on the publisher side. Let me try to understand the situation.
After the upgrade finished, there are some new tables with data on
the publisher; did the old tables receive any additional data as well?

Are the contents in pg_replication_origin intact after the upgrade?

So, in short, I think what we need to solve is to get the data from
new tables and newly performed writes on old tables. I could think of
the following two approaches:

Approach-1:
1. Drop the subscription and truncate all tables corresponding to the subscription.
2. Create a new subscription for the publication.

I think this will be quite neat and there would be no risk of data
loss but it could be time-consuming since all the data from previous
tables needs to be synced again.
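
A minimal SQL sketch of Approach-1 (the subscription, publication,
connection, and table names here are placeholders):

DROP SUBSCRIPTION old_sub;
TRUNCATE t1, t2;   -- every table covered by the subscription
CREATE SUBSCRIPTION old_sub
    CONNECTION 'host=publisher dbname=audit'
    PUBLICATION pub_all;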

Approach-2:
Here, I am assuming pg_replication_origin is intact.
1. Block new writes on the publisher-side.
2. Disable the existing subscription (say the name of the subscription
is old_sub).
3. Drop the existing all tables publication.
4. Create two new publications, one for old tables (old_pub), and one
for new tables (new_pub).
5. Create a new subscription corresponding to new_pub.
6. Remove the existing publication from old_sub and add the old_pub.
7. Enable the subscription.
8. Now, perform a refresh on old_sub.

The benefit of Approach-1 is that you don't need to change anything on
the publisher side and it has very few steps. OTOH, with Approach-2 we
can save the effort/time of re-syncing the initial data for the old
tables, but as there are a lot of things to take care of, there is
always a chance of a mistake, and if that happens you might lose some
data.

In any case, before following any of these, I suggest creating a dummy
setup that mimics your original setup, perform the above steps and
ensure everything is fine, then only try the same steps in your main
setup.

-- 
With Regards,
Amit Kapila.




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-11-25 Thread SATYANARAYANA NARLAPURAM
Thanks Michael!

This is a known issue with exclusive backups, which is a reason why
> non-exclusive backups have been implemented.  pg_basebackup does that,
> and using "false" as the third argument of pg_start_backup() would
> have the same effect.  So I would recommend to switch to that.
>

Is there a plan in place to remove the exclusive backup option from the
core in PG 15/16? If we are keeping it then why not make it better?


RE: Skipping logical replication transactions on subscriber side

2021-11-25 Thread tanghy.f...@fujitsu.com
On Friday, November 26, 2021 9:30 AM Masahiko Sawada  
wrote:
> 
> Indeed. Attached an updated patch. Thanks!

Thanks for your patch. A small comment:

+   OID of the relation that the worker is synchronizing; null for the
+   main apply worker

Should we modify it to "OID of the relation that the worker was synchronizing ..."?

The rest of the patch LGTM.

Regards
Tang


RE: pg_get_publication_tables() output duplicate relid

2021-11-25 Thread houzj.f...@fujitsu.com
On Thursday, November 25, 2021 4:57 PM Amit Kapila  
wrote:
> On Thu, Nov 25, 2021 at 1:30 PM Amit Langote 
> >
> > I agree with backpatching the doc fix.  I've attached a diff against
> > master, though it also appears to apply to 13 and 14 branches.
> >
> 
> I think we can  for publish_via_partition_root, other than that
> the patch looks good to me.
> 
> Hou-San, others, do you have any opinion about this patch and whether to
> backpatch it or not?

I think it makes sense to backpatch the doc fix, and the patch looks good to me.

Best regards,
Hou zj


Re: Skipping logical replication transactions on subscriber side

2021-11-25 Thread Masahiko Sawada
On Thu, Nov 25, 2021 at 10:06 PM houzj.f...@fujitsu.com
 wrote:
>
> On Thur, Nov 25, 2021 8:29 PM Masahiko Sawada  wrote:
> > On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila  wrote:
> > >
> > > On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada
> >  wrote:
> > > >
> > > > Changed. I've removed first_error_time as per discussion on the
> > > > thread for adding xact stats.
> > > >
> > >
> > > We also agreed to change the column names to start with last_error_*
> > > [1]. Is there a reason to not make those changes? Do you think that we
> > > can change it just before committing that patch? I thought it might be
> > > better to do it that way now itself.
> >
> > Oh, I thought that you meant we would change the column names when
> > adding xact stats to the view. But these names also make sense even
> > without the xact stats. I've attached an updated patch. It also
> > incorporates comments from Vignesh and Greg.
> >
> Hi,
>
> I only noticed some minor things in the testcases
>
> 1)
> +$node_publisher->append_conf('postgresql.conf',
> +qq[
> +logical_decoding_work_mem = 64kB
> +]);
>
> It seems we don't need to set logical_decoding_work_mem since we don't test streaming?
>
> 2)
> +$node_publisher->safe_psql('postgres',
> +  q[
> +CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
> +]);
>
> There are a few places where only one command exists in the 'q[' or
> 'qq[' block, like the above code. To be consistent, I think it might
> be better to remove the wrapping here; maybe we can write it like:
> $node_publisher->safe_psql('postgres',
> 'CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;');
>

Indeed. Attached an updated patch. Thanks!

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v26-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch
Description: Binary data


Re: WIP: WAL prefetch (another approach)

2021-11-25 Thread Tomas Vondra
Hi,

It's great you posted a new version of this patch, so I took a brief
look at it. The code seems to be in pretty good shape; I haven't found
any real issues - just two minor comments:

This seems a bit strange:

#define DEFAULT_DECODE_BUFFER_SIZE 0x1

Why not define this as a simple decimal value? Is there something
special about this particular value, or is it arbitrary? I guess it's
simply the minimum for the wal_decode_buffer_size GUC, but why not use
the GUC in all places decoding WAL?

FWIW I don't think we include updates to typedefs.list in patches.


I also repeated the benchmarks I did at the beginning of the year [1].
Attached is a chart with four different configurations:

1) master (f79962d826)

2) patched (with prefetching disabled)

3) patched (with default configuration)

4) patched (with I/O concurrency 256 and 2MB decode buffer)

For all configs the shared buffers were set to 64GB, checkpoints every
20 minutes, etc.

The results are pretty good / similar to previous results. Replaying
1h worth of work on a smaller machine takes ~5:30h without prefetching
(master or with prefetching disabled). With prefetching enabled this
drops to ~2h (default config) and ~1h (with tuning).
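
For reference, configuration (4) corresponds to roughly the following
settings (a sketch only; GUC names follow the patch series and may
differ in the final version):

recovery_prefetch = on
maintenance_io_concurrency = 256
wal_decode_buffer_size = 2MB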

regards


[1]
https://www.postgresql.org/message-id/c5d52837-6256-0556-ac8c-d6d3d558820a%40enterprisedb.com

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: prevent immature WAL streaming

2021-11-25 Thread Tom Lane
I wrote:
> However, this seems too forgiving:

... also, I don't know if you intended this already, but the
VerifyOverwriteContrecord change should only be applied in
back branches.  There's no need for it in HEAD.

regards, tom lane




Re: prevent immature WAL streaming

2021-11-25 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Nov-25, Tom Lane wrote:
>> Really?  AFAICS the WAL record contains the correct value, or at least
>> we should define that one as being correct, for precisely this reason.

> I don't know what is the correct value for a record that comes exactly
> after the page header.  But here's a patch that fixes the problem; and
> if a standby replays WAL written by an unpatched primary, it will be
> able to read past instead of dying of FATAL.

Meh ... but given the simplicity of the write-side fix, maybe changing
it is appropriate.

However, this seems too forgiving:

+if (xlrec->overwritten_lsn != state->overwrittenRecPtr &&
+xlrec->overwritten_lsn - SizeOfXLogShortPHD != state->overwrittenRecPtr &&
+xlrec->overwritten_lsn - SizeOfXLogLongPHD != state->overwrittenRecPtr)

The latter two cases should only be accepted if overwrittenRecPtr is
exactly at a page boundary.

regards, tom lane




Re: Non-superuser subscription owners

2021-11-25 Thread Jeff Davis
On Thu, 2021-11-25 at 09:51 +0530, Amit Kapila wrote:
> Won't it be better to just check if the current user is superuser
> before applying each change as a matter of this first patch? Sorry, I
> was under impression that first, we want to close the current gap
> where we allow to proceed with replication if the user's superuser
> privileges were revoked during replication.

That could be a first step, and I don't oppose it. But it seems like a
very small first step that would be made obsolete when v3-0001 is
ready, which I think will be very soon.

> To allow non-superuser owners, I thought it might be better to first
> try to detect the change of ownership

In the case of revoked superuser privileges, there's no change in
ownership, just a change of privileges (SUPERUSER -> NOSUPERUSER). And
if we're detecting a change of privileges, why not just do it in
something closer to the right way, which is what v3-0001 is attempting
to do.

>  as soon as possible instead of at the transaction
> boundary.

I don't understand why it's important to detect a loss of privileges
faster than a transaction boundary. Can you elaborate?

Regards,
Jeff Davis






Re: prevent immature WAL streaming

2021-11-25 Thread Alvaro Herrera
On 2021-Nov-25, Tom Lane wrote:

> Alvaro Herrera  writes:
> 
> > The problem is that the bug occurs while writing the WAL record.  Fixed
> > servers won't produce such records, but if you run an unpatched server
> > and it happens to write one, without a mitigation you cannot get away
> > from FATAL during replay.
> 
> Really?  AFAICS the WAL record contains the correct value, or at least
> we should define that one as being correct, for precisely this reason.

I don't know what is the correct value for a record that comes exactly
after the page header.  But here's a patch that fixes the problem; and
if a standby replays WAL written by an unpatched primary, it will be
able to read past instead of dying of FATAL.

I originally wrote this to have a WARNING in VerifyOverwriteContrecord
(in the cases that are new), with the idea that it'd prompt people to
upgrade, but that's probably a waste of time.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"El hombre nunca sabe de lo que es capaz hasta que lo intenta" (C. Dickens)
From c28d96dfeb0884dff58b989fc0b869ee0f7806e5 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Thu, 25 Nov 2021 16:19:42 -0300
Subject: [PATCH] Fix LSN in OVERWRITTEN_CONTRECORD

If the record that was broken starts exactly at a page boundary, we
write the byte position at which the record data appears (skipping the
WAL page header), which isn't the actual LSN of the record.  Repair.

To avoid problems on production systems which might use the bogus code,
change the test so that LSNs at page boundaries are also considered
equal to the byte position for the record data.  This means that if an
unpatched server produces a bogus OVERWRITTEN_CONTRECORD, WAL replay can
continue after upgrading to a server containing this patch.
---
 src/backend/access/transam/xlog.c   | 9 -
 src/backend/access/transam/xlogreader.c | 2 +-
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 5a8851a602..1f1ae23526 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -10706,7 +10706,14 @@ xlog_redo(XLogReaderState *record)
 static void
 VerifyOverwriteContrecord(xl_overwrite_contrecord *xlrec, XLogReaderState *state)
 {
-	if (xlrec->overwritten_lsn != state->overwrittenRecPtr)
+	/*
+	 * 14.1 and sibling server versions wrote XLOG_OVERWRITE_CONTRECORD with a
+	 * bogus recptr. See https://postgr.es/m/45597.1637694...@sss.pgh.pa.us for
+	 * details.
+	 */
+	if (xlrec->overwritten_lsn != state->overwrittenRecPtr &&
+		xlrec->overwritten_lsn - SizeOfXLogShortPHD != state->overwrittenRecPtr &&
+		xlrec->overwritten_lsn - SizeOfXLogLongPHD != state->overwrittenRecPtr)
 		elog(FATAL, "mismatching overwritten LSN %X/%X -> %X/%X",
 			 LSN_FORMAT_ARGS(xlrec->overwritten_lsn),
 			 LSN_FORMAT_ARGS(state->overwrittenRecPtr));
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index f39f8044a9..4772cd1664 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -586,7 +586,7 @@ err:
 		 * in turn signal downstream WAL consumers that the broken WAL record
 		 * is to be ignored.
 		 */
-		state->abortedRecPtr = RecPtr;
+		state->abortedRecPtr = state->currRecPtr;
 		state->missingContrecPtr = targetPagePtr;
 	}
 
-- 
2.30.2



Re: prevent immature WAL streaming

2021-11-25 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Nov-25, Tom Lane wrote:
>> Uh, why?  The fix should remove the problem, and if it doesn't, we're
>> still looking at inconsistent WAL aren't we?

> The problem is that the bug occurs while writing the WAL record.  Fixed
> servers won't produce such records, but if you run an unpatched server
> and it happens to write one, without a mitigation you cannot get away
> from FATAL during replay.

Really?  AFAICS the WAL record contains the correct value, or at least
we should define that one as being correct, for precisely this reason.

regards, tom lane




Re: prevent immature WAL streaming

2021-11-25 Thread Alvaro Herrera
On 2021-Nov-25, Tom Lane wrote:

> Alvaro Herrera  writes:
> > Oh, but also I think I should push a mitigation in case a production
> > system hits this problem: maybe reduce the message from FATAL to WARNING
> > if the registered LSN is at a page boundary.
> 
> Uh, why?  The fix should remove the problem, and if it doesn't, we're
> still looking at inconsistent WAL aren't we?

The problem is that the bug occurs while writing the WAL record.  Fixed
servers won't produce such records, but if you run an unpatched server
and it happens to write one, without a mitigation you cannot get away
from FATAL during replay.

Since this bug exists in released minors, we should allow people to
upgrade to a newer version if they hit it.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Nadie está tan esclavizado como el que se cree libre no siéndolo" (Goethe)




Re: prevent immature WAL streaming

2021-11-25 Thread Tom Lane
Alvaro Herrera  writes:
> Oh, but also I think I should push a mitigation in case a production
> system hits this problem: maybe reduce the message from FATAL to WARNING
> if the registered LSN is at a page boundary.

Uh, why?  The fix should remove the problem, and if it doesn't, we're
still looking at inconsistent WAL aren't we?

regards, tom lane




Re: prevent immature WAL streaming

2021-11-25 Thread Alvaro Herrera
Oh, but also I think I should push a mitigation in case a production
system hits this problem: maybe reduce the message from FATAL to WARNING
if the registered LSN is at a page boundary.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"




Re: prevent immature WAL streaming

2021-11-25 Thread Alvaro Herrera
On 2021-Nov-25, Amul Sul wrote:

> In XLogReadRecord(), both of the variables being compared have an
> inconsistency in their assignment -- one gets assigned from
> state->currRecPtr while the other is from RecPtr.
> 
> .
> state->overwrittenRecPtr = state->currRecPtr;
> .
> state->abortedRecPtr = RecPtr;
> .
> 
> Before the place where the assembled flag is set, there is a bunch of
> code that adjusts RecPtr. I think instead of RecPtr, the latter
> assignment should use state->currRecPtr as well.

You're exactly right.  I managed to reproduce the problem shown by
buildfarm members, and indeed this fixes it.  And it makes sense: the
adjustment you refer to, is precisely to skip the page header when the
LSN is the start of the page, which is exactly the problem we're seeing
in the buildfarm ... except that on lapwing branch REL_11_STABLE, we're
seeing the LSN is off by 0x14 instead of 0x18.  That seems very strange.
I think the reason for this is that lapwing has MAXALIGN 4, so
MAXALIGN(sizeof(XLogPageHeaderData)) is 20, not 24 as is the case in the
other failing members.

... checks buildfarm ...

Yeah, all the others in Tom's list are x86-64.

I'm pushing the fix in a minute.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo"  (Mafalda)




Re: Non-decimal integer literals

2021-11-25 Thread John Naylor
Hi Peter,

0001

-/* we no longer allow unary minus in numbers.
- * instead we pass it separately to parser. there it gets
- * coerced via doNegate() -- Leon aug 20 1999
+/*
+ * Numbers
+ *
+ * Unary minus is not part of a number here.  Instead we pass it separately to
+ * parser, and there it gets coerced via doNegate().

If we're going to change the comment anyway, "the parser" sounds more
natural. Aside from that, 0001 and 0002 can probably be pushed now, if you
like. I don't have any good ideas about 0003 at the moment.

0005

--- a/src/interfaces/ecpg/preproc/pgc.l
+++ b/src/interfaces/ecpg/preproc/pgc.l
@@ -365,6 +365,10 @@ real ({integer}|{decimal})[Ee][-+]?{digit}+
 realfail1 ({integer}|{decimal})[Ee]
 realfail2 ({integer}|{decimal})[Ee][-+]

+integer_junk {integer}{ident_start}
+decimal_junk {decimal}{ident_start}
+real_junk {real}{ident_start}

A comment might be good here to explain these are only in ECPG for
consistency with the other scanners. Not really important, though.

0006

+{hexfail} {
+ yyerror("invalid hexadecimal integer");
+ }
+{octfail} {
+ yyerror("invalid octal integer");
  }
-{decimal} {
+{binfail} {
+ yyerror("invalid binary integer");
+ }

It seems these could use SET_YYLLOC(), since the error cursor doesn't match
other failure states:

+SELECT 0b;
+ERROR:  invalid binary integer at or near "SELECT 0b"
+LINE 1: SELECT 0b;
+^
+SELECT 1b;
+ERROR:  trailing junk after numeric literal at or near "1b"
+LINE 1: SELECT 1b;
+   ^
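
That is, a sketch of the adjusted rule:

{binfail}	{
				SET_YYLLOC();
				yyerror("invalid binary integer");
			}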

We might consider some tests for ECPG since lack of coverage has been a
problem.

Also, I'm curious: how does the spec work as far as deciding the year of
release, or feature-freezing of new items?
--
John Naylor
EDB: http://www.enterprisedb.com


Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-25 Thread Mark Dilger



> On Nov 24, 2021, at 12:53 PM, Bossart, Nathan  wrote:
> 
> Another option we might consider is only checking for the
> HEAP_XMAX_LOCK_ONLY bit instead of everything in
> HEAP_XMAX_IS_LOCKED_ONLY.  IIUC everything else is only expected to
> happen for upgrades from v9.2 or earlier, so it might be pretty rare
> at this point.  Otherwise, I'll extract the exact bit pattern for the
> error message as you suggest.

I would prefer to detect and report any "can't happen" bit patterns without 
regard for how likely the pattern may be.  The difficulty is in proving that a 
bit pattern is disallowed.  Just because you can't find a code path in the 
current code base that would create a pattern doesn't mean it won't have 
legitimately been created by some past release or upgrade path.  As such, any 
prohibitions explicitly in the backend, such as Asserts around a condition, are 
really valuable.  You can know that the pattern is disallowed, since the server 
would Assert on it if encountered.
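
For reference, a sketch contrasting the two checks under discussion;
the macro here is the one defined in src/include/access/htup_details.h:

#include "access/htup_details.h"

static bool
xmax_lock_only_bit_set(uint16 infomask)
{
	/* the narrow check: only the explicit flag */
	return (infomask & HEAP_XMAX_LOCK_ONLY) != 0;
}

static bool
xmax_locked_only_full(uint16 infomask)
{
	/*
	 * HEAP_XMAX_IS_LOCKED_ONLY() additionally treats the pre-9.3 bit
	 * pattern (HEAP_XMAX_EXCL_LOCK without HEAP_XMAX_IS_MULTI) as
	 * lock-only, which matters only for clusters pg_upgrade'd from 9.2
	 * or earlier.
	 */
	return HEAP_XMAX_IS_LOCKED_ONLY(infomask);
}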

Aside from that, I don't really buy the argument that databases upgraded from 
v9.2 or earlier are rare.  Even if servers *running* v9.2 or earlier are (or 
become) rare, servers initialized that far back which have been upgraded one or 
more times since then may be common.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: Reduce function call costs on ELF platforms

2021-11-25 Thread Andrew Dunstan


On 11/24/21 22:57, Andres Freund wrote:
>
>> Which things does it break exactly?
> -Bsymbolic causes symbols that are defined and referenced within one shared
> library to use that definition. E.g. if a shared lib has a function
> "do_something()" and some of its code calls do_something(), you cannot use
> LD_PRELOAD (or a definition in the main binary) to redirect the call to
> do_something() inside the shared library to something else.
>
> I.e. if a shared library calls a function that's *not* defined within that
> shared library, -Bsymbolic doesn't have an effect for that symbol.
>
>
>> I have a case where a library that
>> is LD_PRELOADed calls PQsetSSLKeyPassHook_OpenSSL() in its constructor
>> function. I'd be very unhappy if that stopped working (and so would our
>> client).
> Bsymbolic shouldn't affect that at all.
>

Thanks for the explanation.
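
A tiny illustration of the interposition that -Bsymbolic disables (file
and function names are hypothetical):

/* libfoo.c, built with: gcc -shared -fPIC -Wl,-Bsymbolic -o libfoo.so libfoo.c */
#include <stdio.h>

void
do_something(void)
{
	puts("libfoo's do_something");
}

void
call_it(void)
{
	/*
	 * Without -Bsymbolic this call binds through the PLT, so an
	 * LD_PRELOADed library (or the main binary) can interpose its own
	 * do_something(); with -Bsymbolic it binds to the local definition
	 * at link time.
	 */
	do_something();
}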


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Non-decimal integer literals

2021-11-25 Thread Zhihong Yu
On Thu, Nov 25, 2021 at 5:18 AM Peter Eisentraut <peter.eisentr...@enterprisedb.com> wrote:

> On 01.11.21 07:09, Peter Eisentraut wrote:
> > Here is an updated patch for this.  It's the previous patch polished a
> > bit more, and it contains changes so that numeric literals reject
> > trailing identifier parts without whitespace in between, as discussed.
> > Maybe I should split that into incremental patches, but for now I only
> > have the one.  I don't have a patch for the underscores in numeric
> > literals yet.  It's in progress, but not ready.
>
> Here is a progressed version of this work, split into more incremental
> patches.  The first three patches are harmless code cleanups.  Patch 3
> has an interesting naming conflict, noted in the commit message; ideas
> welcome.  Patches 4 and 5 handle the rejection of trailing junk after
> numeric literals, as discussed.  I have expanded that compared to the v4
> patch to also cover non-integer literals.  It also comes with more tests
> now.  Patch 6 is the titular introduction of non-decimal integer
> literals, unchanged from before.

Hi,
For patch 3,

+int64
+pg_strtoint64(const char *s)

How about naming the above function pg_scanint64()?
pg_strtoint64xx() can be named pg_strtoint64() - this would align with
existing function:

pg_strtouint64(const char *str, char **endptr, int base)

Cheers


Re: TOAST - why separate visibility map

2021-11-25 Thread Virender Singla
"Given the size of toasted data, the overhead is unlikely to be a
significant overhead. It's much more an issue for the main table, where
narrow rows are common."

Completely agree, row size should not be a big concern for toast tables.

However, write amplification will happen with vacuum freeze, where
transaction IDs need to be frozen in the wider toast table tuples as
well. I have not explored whether TOAST has separate hint-bit info as
well. If it does, that means a normal vacuum (or a SELECT after a
write) has to completely rewrite the big toast table tuples along with
the small main table tuples to set the hint bits (commit/rollback).

I believe a B-tree index does not contain any separate visibility info,
so the only work VACUUM does on indexes is cleaning up dead tuples.

By maintaining one set of visibility info, the above operations could
be way faster. However, the main table and TOAST vacuuming processes
would then be glued together, where an optimization could be imagined
along the lines of two synchronized threads working together on the
main and TOAST tables to do the cleanup job. Agreed that HOT updates
are gone in TOAST if there is a common VM.

Overall this looks complex.

On Sat, Nov 20, 2021 at 9:46 PM Tom Lane  wrote:

> Andres Freund  writes:
> > On November 19, 2021 12:31:00 PM PST, Tom Lane 
> wrote:
> >> It might be feasible to drop the visibility map for toast tables,
> though.
>
> > I think it'd be a bad idea - the VM is used by vacuum to avoid rereading
> > already vacuumed ranges. Losing that for large toast tables would be bad.
>
> Ah, right.  I was thinking vacuuming depended on the other map fork,
> but of course it needs this one.
>
> In short, there are indeed good reasons why it works like this.
>
> regards, tom lane
>


Re: pg_upgrade and publication/subscription problem

2021-11-25 Thread Marcos Pegoraro
>
> The reason is after an upgrade, there won't be any data in
> pg_subscription_rel, and only when you tried to refresh it is trying
> to sync again which leads to the "duplicate key value ..." problem you
> are seeing.
>
> So, is pg_upgrade populating pg_subscription and not pg_subscription_rel ?
It is doing 50% of his job ?

>
> Don't you want to eventually upgrade the publisher node as well? You
> can refer to blog [1] for the detailed steps.
>
> It is possible but I don´t think changing publisher will solve anything,
will ?

>
> Yes, the way you are doing I think it is bound to happen. There is
> some discussion about why this is happening in email [2]. AFAIK, it is
> not documented and if so, I think it will be a good idea to document
>
And my problem remains the same: how to solve it ? All records in
pg_subscription_rel are in state 'i' (initialize) with srsublsn null. How
can I replay only the updates since yesterday ? This replication feeds an
auditing database, so I cannot lose everything that happened since that
pg_upgrade. [1] shows me how to upgrade, but if I did it the wrong way, how
do I solve that ?
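
One knob that may be worth evaluating here (untested against this exact
scenario, and it only helps if the publisher's replication slot has retained
the changes since the upgrade) is refreshing without the initial copy; with a
subscription named "sub", roughly:

ALTER SUBSCRIPTION sub ENABLE;
ALTER SUBSCRIPTION sub REFRESH PUBLICATION WITH (copy_data = false);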


Re: row filtering for logical replication

2021-11-25 Thread Euler Taveira
On Thu, Nov 25, 2021, at 10:39 AM, houzj.f...@fujitsu.com wrote:
> While researching and writing a top-up patch for this, I found a possible
> issue which I'd like to confirm first.
> 
> It's possible that a table is published in two publications A and B, where
> publication A only publishes "insert" and publication B only publishes
> "update". On UPDATE, the row filters of both A and B will be executed. Is
> this behavior expected?
Good question. No. The code should check the action before combining the
multiple row filters.
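
A rough sketch of such a check, in the spirit of pgoutput's publication loop
(a fragment, not the actual patch code; "rfnodes" and the "rowfilter" field
are illustrative names):

    ListCell   *lc;

    foreach(lc, data->publications)
    {
        Publication *pub = (Publication *) lfirst(lc);

        /* Skip row filters of publications that don't publish this action. */
        if (action == REORDER_BUFFER_CHANGE_INSERT && !pub->pubactions.pubinsert)
            continue;
        if (action == REORDER_BUFFER_CHANGE_UPDATE && !pub->pubactions.pubupdate)
            continue;
        if (action == REORDER_BUFFER_CHANGE_DELETE && !pub->pubactions.pubdelete)
            continue;

        rfnodes = lappend(rfnodes, pub->rowfilter); /* illustrative field */
    }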


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: pg_upgrade and publication/subscription problem

2021-11-25 Thread Amit Kapila
On Thu, Nov 25, 2021 at 5:13 PM Marcos Pegoraro  wrote:
>
> A publication for all tables was running fine, Master is a PostgreSQL 11.11. 
> Replica was running version 13 (don´t remember minor version).
>
> Then we tried to update only subscriber server, nothing was done on master 
> side.
>
> Then we did ...
> - installed postgresql-14.
> - configured postgresql.conf to be similar to previous.
> - on version 13 disabled subscription - alter subscription disable.
> - changed both port to run pg_upgrade.
> - stop services for both 13 e 14.
> - /usr/lib/postgresql/14/bin/pg_upgrade -b /usr/lib/postgresql/13/bin -B 
> /usr/lib/postgresql/14/bin -d /etc/postgresql/13/main/ -D 
> /etc/postgresql/14/main/ -j 2 --link -p  -P 9998 -U postgres -v
> - when finished upgrade process, we removed version 13 and ran vacuumdb -p 
> 9998 -U postgres --all --analyze-in-stages
> - last step was to enable that subscription.
> - just wait for the subscriber to get the data changed, pg_upgrade ran for 15 
> minutes, this should be synced in a few seconds ...
> - few seconds later we remembered that some other tables were created on 
> publication server, so we did a refresh publication.
>
> Then, some minutes later we got lots of log entries "duplicate key value 
> violates unique constraint pk..." because it was trying to COPY that table 
> from master.
>
> We disable subscription again until we solve, as remains.
>
> Selecting from pg_subscription_rel all old tables are with srsubstate i for 
> initialize, not s for synchronized or r for ready, as they should. And all 
> srsublsn of these records were null, so it lost synchronization coordination 
> for all tables which existed before this upgrade process.
>

The reason is that after an upgrade there won't be any data in
pg_subscription_rel, and only when you try to refresh does it try
to sync again, which leads to the "duplicate key value ..." problem you
are seeing.

> So, my first question is, as our publication server continues running, lots 
> of updates were processed, so how can I synchronize both sides without 
> recreating that publication ?
>

Don't you want to eventually upgrade the publisher node as well? You
can refer to blog [1] for the detailed steps.

> And my second question is, is this problem documented ? Is this problem 
> expected to happen ?
>

Yes, the way you are doing it, I think it is bound to happen. There is
some discussion about why this happens in email [2]. AFAIK, it is not
documented, and I think it would be a good idea to document it.

[1] - https://elephanttamer.net/?p=58
[2] - 
https://www.postgresql.org/message-id/CALDaNm2-SRGHK0rqJQu7rGiS4hDAb7Nib5HbojEN5ubaXGs2CA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




RE: row filtering for logical replication

2021-11-25 Thread houzj.f...@fujitsu.com
On Wed, Nov 24, 2021 1:46 PM Amit Kapila  wrote:
> On Wed, Nov 24, 2021 at 6:51 AM houzj.f...@fujitsu.com
>  wrote:
> >
> > On Tues, Nov 23, 2021 6:16 PM Amit Kapila  wrote:
> > > On Tue, Nov 23, 2021 at 1:29 PM houzj.f...@fujitsu.com
> > >  wrote:
> > > >
> > > > On Tues, Nov 23, 2021 2:27 PM vignesh C  wrote:
> > > > > On Thu, Nov 18, 2021 at 7:04 AM Peter Smith
> > > > > 
> > > > > wrote:
> > > > > >
> > > > > > PSA new set of v40* patches.
> > > > >
> > > > > Few comments:
> > > > > 1) When a table is added to the publication, replica identity is
> > > > > checked. But while modifying the publish action to include
> > > > > delete/update, replica identity is not checked for the existing
> > > > > tables. I felt it should be checked for the existing tables too.
> > > >
> > > > In addition to this, I think we might also need some check to
> > > > prevent user from changing the REPLICA IDENTITY index which is used in
> > > > the filter expression.
> > > >
> > > > I was thinking is it possible do the check related to REPLICA
> > > > IDENTITY in function CheckCmdReplicaIdentity() or In
> > > > GetRelationPublicationActions(). If we move the REPLICA IDENTITY
> > > > check to this function, it would be consistent with the existing
> > > > behavior about the check related to REPLICA IDENTITY(see the
> > > > comments in CheckCmdReplicaIdentity) and seems can cover all the cases
> > > > mentioned above.
> > >
> > > Yeah, adding the replica identity check in CheckCmdReplicaIdentity()
> > > would cover all the above cases but I think that would put a premium
> > > on each update/delete operation. I think traversing the expression
> > > tree (it could be multiple traversals if the relation is part of
> > > multiple publications) during each update/delete would be costly.
> > > Don't you think so?
> >
> > Yes, I agreed that traversing the expression every time would be costly.
> >
> > I thought maybe we can cache the columns used in the row filter, or
> > cache only a flag (can_update|delete) in the relcache. I think every
> > operation that affects the row filter or replica identity will invalidate
> > the relcache, and the cost of the check seems acceptable with the cache.
> >
> 
> I think if we can cache this information especially as a bool flag then that 
> should
> probably be better.

While researching and writing a top-up patch for this, I found a possible
issue which I'd like to confirm first.

It's possible that a table is published in two publications A and B, where
publication A only publishes "insert" and publication B only publishes
"update". On UPDATE, the row filters of both A and B will be executed. Is
this behavior expected?

For example:
 Publication
create table tbl1 (a int primary key, b int);
create publication A for table tbl1 where (b<2) with(publish='insert');
create publication B for table tbl1 where (a>1) with(publish='update');

 Subscription
create table tbl1 (a int primary key);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres host=localhost
port=1' PUBLICATION A,B;

 Publication
update tbl1 set a = 2;

The publications can be created, and on UPDATE the row filter in A (b<2)
will also be executed, even though the column in it is not part of the
replica identity. (I am not against this behavior, just confirming.)

Best regards,
Hou zj 


Re: Support for NSS as a libpq TLS backend

2021-11-25 Thread Joshua Brindle
On Wed, Nov 24, 2021 at 8:49 AM Joshua Brindle
 wrote:
>
> On Wed, Nov 24, 2021 at 8:46 AM Joshua Brindle
>  wrote:
> >
> > On Wed, Nov 24, 2021 at 6:59 AM Daniel Gustafsson  wrote:
> > >
> > > > On 23 Nov 2021, at 23:39, Joshua Brindle 
> > > >  wrote:
> > >
> > > > It no longer happens with v49, since it was a null deref of the pr_fd
> > > > which no longer happens.
> > > >
> > > > I'll continue testing now, so far it's looking better.
> > >
> > > Great, thanks for confirming.  I'm still keen on knowing how you 
> > > triggered the
> > > segfault so I can ensure there are no further bugs around there.
> > >
> >
> > It happened when I ran psql with hostssl on the server but before I'd
> > initialized my client certificate store.
>
> I don't know enough about NSS to know if this is problematic or not
> but if I try verify-full without having the root CA in the certificate
> store I get:
>
> $ /usr/pgsql-15/bin/psql "host=localhost sslmode=verify-full user=postgres"
> psql: error: SSL error: Issuer certificate is invalid.
> unable to shut down NSS context: NSS could not shutdown. Objects are
> still in use.

Something is strange with SSL downgrading and a bad ssldatabase:
[postgres@11cdfa30f763 ~]$ /usr/pgsql-15/bin/psql "ssldatabase=oops
sslcert=client_cert host=localhost"
Password for user postgres:



On the server side:
2021-11-25 01:52:01.984 UTC [269] LOG:  unable to handshake:
Encountered end of file (PR_END_OF_FILE_ERROR)

Other than that (and I still haven't tested --with-llvm), I've gotten
everything working, including with an OpenSSL client. Attached is a
Dockerfile that gets to the point where a client can connect with
clientcert=verify-full. I've removed some of the old cruft and
debugging from the previous versions.

Thank you.
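
As a side note, for the verify-full case above: the root CA can be imported
into the client's NSS database with certutil, roughly like this (the database
path and nickname are illustrative):

certutil -d sql:$HOME/.postgresql/nssdb -A -n "root-ca" -t "CT,," -i root.crt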


Dockerfile
Description: Binary data


Re: Yet another fast GiST build

2021-11-25 Thread Andrey Borodin



> On 17 Nov 2021, at 16:33, Daniel Gustafsson wrote:
> 
>> On 5 Jul 2021, at 08:27, Emre Hasegeli  wrote:
> 
>> ...
>> 
>> I couldn't understand patch number 2 "Remove DEBUG1 verification".  It
>> seems like something rather useful.

Emre, thanks for the review! And sorry for this delay. Properly answering 
questions is still in my queue.

> 
> These questions have gone unanswered since July, and the patch fails to apply
> anymore.  Is there an updated version on the way?

Yes. In future versions I also want to address the IOS vs. pinned buffers
issue [0]. And, probably, sort items on leaf pages. And, maybe, split pages
more intelligently.
I hope to get to this in December.

I'll post a rebased version ASAP.

Best regards, Andrey Borodin.

[0] 
https://www.postgresql.org/message-id/CAH2-Wz=PqOziyRSrnN5jAtfXWXY7-BJcHz9S355LH8Dt=5q...@mail.gmail.com



Re: Non-decimal integer literals

2021-11-25 Thread Peter Eisentraut

On 01.11.21 07:09, Peter Eisentraut wrote:
Here is an updated patch for this.  It's the previous patch polished a 
bit more, and it contains changes so that numeric literals reject 
trailing identifier parts without whitespace in between, as discussed. 
Maybe I should split that into incremental patches, but for now I only 
have the one.  I don't have a patch for the underscores in numeric 
literals yet.  It's in progress, but not ready.


Here is a progressed version of this work, split into more incremental 
patches.  The first three patches are harmless code cleanups.  Patch 3 
has an interesting naming conflict, noted in the commit message; ideas 
welcome.  Patches 4 and 5 handle the rejection of trailing junk after 
numeric literals, as discussed.  I have expanded that compared to the v4 
patch to also cover non-integer literals.  It also comes with more tests 
now.  Patch 6 is the titular introduction of non-decimal integer 
literals, unchanged from before.

From 39aed9c0516fcf0a6b3372361ecfcf4874614578 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 24 Nov 2021 09:10:32 +0100
Subject: [PATCH v5 1/6] Improve some comments in scanner files

---
 src/backend/parser/scan.l | 14 --
 src/fe_utils/psqlscan.l   | 14 --
 src/interfaces/ecpg/preproc/pgc.l | 16 +---
 3 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/src/backend/parser/scan.l b/src/backend/parser/scan.l
index 6e6824faeb..4e02815803 100644
--- a/src/backend/parser/scan.l
+++ b/src/backend/parser/scan.l
extern void core_yyset_column(int column_no, yyscan_t yyscanner);
  *   bit string literal
  *   extended C-style comments
  *   delimited identifiers (double-quoted identifiers)
- *   hexadecimal numeric string
+ *   hexadecimal byte string
  *   standard quoted strings
  *   quote stop (detect continued strings)
  *   extended quoted strings (support backslash escape sequences)
@@ -262,7 +262,7 @@ quotecontinuefail   {whitespace}*"-"?
 xbstart[bB]{quote}
 xbinside   [^']*
 
-/* Hexadecimal number */
+/* Hexadecimal byte string */
 xhstart[xX]{quote}
 xhinside   [^']*
 
@@ -341,7 +341,6 @@ xcstart \/\*{op_chars}*
 xcstop \*+\/
 xcinside   [^*/]+
 
-digit  [0-9]
 ident_start[A-Za-z\200-\377_]
 ident_cont [A-Za-z\200-\377_0-9\$]
 
@@ -380,15 +379,18 @@ self  [,()\[\].;\:\+\-\*\/\%\^\<\>\=]
 op_chars   [\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
 operator   {op_chars}+
 
-/* we no longer allow unary minus in numbers.
- * instead we pass it separately to parser. there it gets
- * coerced via doNegate() -- Leon aug 20 1999
+/*
+ * Numbers
+ *
+ * Unary minus is not part of a number here.  Instead we pass it separately to
+ * parser, and there it gets coerced via doNegate().
  *
 * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.
  *
  * {realfail1} and {realfail2} are added to prevent the need for scanner
  * backup when the {real} rule fails to match completely.
  */
+digit  [0-9]
 
 integer{digit}+
 decimal(({digit}*\.{digit}+)|({digit}+\.{digit}*))
diff --git a/src/fe_utils/psqlscan.l b/src/fe_utils/psqlscan.l
index 0fab48a382..9aac166aa0 100644
--- a/src/fe_utils/psqlscan.l
+++ b/src/fe_utils/psqlscan.l
extern void psql_yyset_column(int column_no, yyscan_t yyscanner);
  *   bit string literal
  *   extended C-style comments
  *   delimited identifiers (double-quoted identifiers)
- *   hexadecimal numeric string
+ *   hexadecimal byte string
  *   standard quoted strings
  *   quote stop (detect continued strings)
  *   extended quoted strings (support backslash escape sequences)
@@ -200,7 +200,7 @@ quotecontinuefail   {whitespace}*"-"?
 xbstart[bB]{quote}
 xbinside   [^']*
 
-/* Hexadecimal number */
+/* Hexadecimal byte string */
 xhstart[xX]{quote}
 xhinside   [^']*
 
@@ -279,7 +279,6 @@ xcstart \/\*{op_chars}*
 xcstop \*+\/
 xcinside   [^*/]+
 
-digit  [0-9]
 ident_start[A-Za-z\200-\377_]
 ident_cont [A-Za-z\200-\377_0-9\$]
 
@@ -318,15 +317,18 @@ self  [,()\[\].;\:\+\-\*\/\%\^\<\>\=]
 op_chars   [\~\!\@\#\^\&\|\`\?\+\-\*\/\%\<\>\=]
 operator   {op_chars}+
 
-/* we no longer allow unary minus in numbers.
- * instead we pass it separately to parser. there it gets
- * coerced via doNegate() -- Leon aug 20 1999
+/*
+ * Numbers
+ *
+ * Unary minus is not part of a number here.  Instead we pass it separately to
+ * parser, and there it gets coerced via doNegate().
  *
 * {decimalfail} is used because we would like "1..10" to lex as 1, dot_dot, 10.

RE: Skipping logical replication transactions on subscriber side

2021-11-25 Thread houzj.f...@fujitsu.com
On Thu, Nov 25, 2021 8:29 PM Masahiko Sawada  wrote:
> On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila  wrote:
> >
> > On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada
>  wrote:
> > >
> > > Changed. I've removed first_error_time as per discussion on the
> > > thread for adding xact stats.
> > >
> >
> > We also agreed to change the column names to start with last_error_*
> > [1]. Is there a reason to not make those changes? Do you think that we
> > can change it just before committing that patch? I thought it might be
> > better to do it that way now itself.
> 
> Oh, I thought that you think that we change the column names when adding xact
> stats to the view. But these names also make sense even without the xact 
> stats.
> I've attached an updated patch. It also incorporated comments from Vignesh
> and Greg.
> 
Hi,

I only noticed some minor things in the testcases

1)
+$node_publisher->append_conf('postgresql.conf',
+qq[
+logical_decoding_work_mem = 64kB
+]);

It seems we don't need to set logical_decoding_work_mem since we don't test
streaming?

2)
+$node_publisher->safe_psql('postgres',
+  q[
+CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;
+]);

There are a few places where only one command exists in the 'q[' or 'qq[',
like the above code. To be consistent, I think it might be better to remove
the wrapping here; maybe we can write it like:
$node_publisher->safe_psql('postgres',
' CREATE PUBLICATION tap_pub FOR TABLE test_tab1, test_tab2;');

The others LGTM.

Best regards,
Hou zj






Re: Skipping logical replication transactions on subscriber side

2021-11-25 Thread Masahiko Sawada
On Thu, Nov 25, 2021 at 9:08 PM Greg Nancarrow  wrote:
>
> On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada  
> wrote:
> >
> > I've attached an updated version patch. Unless I miss something, all
> > comments I got so far have been incorporated into this patch. Please
> > review it.
> >
>
> Only a couple of minor points:
>
> src/backend/postmaster/pgstat.c
> (1) pgstat_get_subworker_entry
>
> In the following comment, it should say "returns an entry ...":
>
> + * apply worker otherwise returns entry of the table sync worker associated
>
> src/include/pgstat.h
> (2) typedef struct PgStat_StatDBEntry
>
> "subworker" should be "subworkers" in the following comment, to match
> the struct member name:
>
> * subworker is the hash table of PgStat_StatSubWorkerEntry which stores
>
> Otherwise, the patch LGTM.

Thank you for the comments! These are incorporated into v25 patch I
just submitted.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Skipping logical replication transactions on subscriber side

2021-11-25 Thread Masahiko Sawada
On Thu, Nov 25, 2021 at 7:36 PM vignesh C  wrote:
>
> On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada  wrote:
> >
> > On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila  wrote:
> > >
> > > On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > Right. I've fixed this issue and attached an updated patch.
>
> One very minor comment:
> conflict can be moved to next line to keep it within 80 chars boundary
> wherever possible
> +# Initial table setup on both publisher and subscriber. On subscriber we 
> create
> +# the same tables but with primary keys. Also, insert some data that
> will conflict
> +# with the data replicated from publisher later.
> +$node_publisher->safe_psql(
>
> Similarly in the below:
> +# Insert more data to test_tab1, raising an error on the subscriber
> due to violation
> +# of the unique constraint on test_tab1.
> +my $xid = $node_publisher->safe_psql(
>
> The rest of the patch looks good.

Thank you for the comments! These are incorporated into v25 patch I
just submitted.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Skipping logical replication transactions on subscriber side

2021-11-25 Thread Masahiko Sawada
On Thu, Nov 25, 2021 at 1:57 PM Amit Kapila  wrote:
>
> On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada  wrote:
> >
> > Changed. I've removed first_error_time as per discussion on the thread
> > for adding xact stats.
> >
>
> We also agreed to change the column names to start with last_error_*
> [1]. Is there a reason to not make those changes? Do you think that we
> can change it just before committing that patch? I thought it might be
> better to do it that way now itself.

Oh, I thought you meant that we would change the column names when adding
the xact stats to the view. But these names also make sense even without
the xact stats. I've attached an updated patch. It also incorporates
comments from Vignesh and Greg.
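
For reference, a sketch of querying the view with the renamed columns
(assuming the view name pg_stat_subscription_workers from the patch title;
the exact column set is still under review in this thread):

SELECT subname, last_error_xid, last_error_count,
       last_error_message, last_error_time
FROM pg_stat_subscription_workers;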

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


v25-0001-Add-a-subscription-worker-statistics-view-pg_sta.patch
Description: Binary data


Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2021-11-25 Thread Bharath Rupireddy
On Thu, Nov 25, 2021 at 3:49 PM Bharath Rupireddy
 wrote:
> >
> > Thanks all. Here's the v1 patch set for the new extension pg_walinspect. 
> > Note that I didn't include the documentation part now, I will be doing it a 
> > bit later.
> >
> > Please feel free to review and provide your thoughts.
>
> The v1 patch set was failing to compile on Windows. Here's the v2
> patch set fixing that.

I forgot to mention this: the v1 patch set was failing to compile on
Windows with the errors shown at [1]. Thanks to Julien Rouhaud, who
suggested using PGDLLIMPORT in an off-list discussion.

[1] (Link target) ->
  pg_walinspect.obj : error LNK2001: unresolved external symbol
forkNames [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
  pg_walinspect.obj : error LNK2001: unresolved external symbol
pg_comp_crc32c [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
  pg_walinspect.obj : error LNK2001: unresolved external symbol
wal_segment_size [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
  pg_walinspect.obj : error LNK2001: unresolved external symbol
RmgrTable [C:\Users\bhara\postgres\pg_walinspect.vcxproj]
  .\Release\pg_walinspect\pg_walinspect.dll : fatal error LNK1120: 4
unresolved externals [C:\Users\bhara\postgres\pg_walinspect.vcxproj]

5 Error(s)
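
The fix amounts to marking the referenced backend globals with PGDLLIMPORT
in their declarations, so the extension DLL can link against them on
Windows; a sketch:

/* e.g. in the corresponding backend headers */
extern PGDLLIMPORT int wal_segment_size;
extern PGDLLIMPORT const RmgrData RmgrTable[RM_MAX_ID + 1];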

Regards,
Bharath Rupireddy.




Re: pg_get_publication_tables() output duplicate relid

2021-11-25 Thread Amit Langote
On Thu, Nov 25, 2021 at 5:57 PM Amit Kapila  wrote:
> On Thu, Nov 25, 2021 at 1:30 PM Amit Langote  wrote:
> > I agree with backpatching the doc fix.  I've attached a diff against
> > master, though it also appears to apply to 13 and 14 branches.
>
> I think we can use <literal> for publish_via_partition_root,
> other than that the patch looks good to me.

Oh right, fixed.  I also added the tag for "true" in the same sentence.

-- 
Amit Langote
EDB: http://www.enterprisedb.com


document-ATTACH-limitation_v2.diff
Description: Binary data


Re: Skipping logical replication transactions on subscriber side

2021-11-25 Thread Greg Nancarrow
On Wed, Nov 24, 2021 at 10:44 PM Masahiko Sawada  wrote:
>
> I've attached an updated version patch. Unless I miss something, all
> comments I got so far have been incorporated into this patch. Please
> review it.
>

Only a couple of minor points:

src/backend/postmaster/pgstat.c
(1) pgstat_get_subworker_entry

In the following comment, it should say "returns an entry ...":

+ * apply worker otherwise returns entry of the table sync worker associated

src/include/pgstat.h
(2) typedef struct PgStat_StatDBEntry

"subworker" should be "subworkers" in the following comment, to match
the struct member name:

* subworker is the hash table of PgStat_StatSubWorkerEntry which stores

Otherwise, the patch LGTM.

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Windows build warnings

2021-11-25 Thread Daniel Gustafsson
> On 22 Nov 2021, at 16:06, Tom Lane  wrote:
> 
> Alvaro Herrera  writes:
>> .. but see
>> https://postgr.es/m/cah2-wznwwu+9on9nzcnztk7ua238mctgpxyr1ty7u_msn5z...@mail.gmail.com
>> where this was already discussed.  I think if we're going to workaround
>> PG_USED_FOR_ASSERTS_ONLY not actually working, we may as well get rid of
>> it entirely.  My preference would be to fix it so that it works on more
>> platforms (at least Windows in addition to GCC).
> 
> Yeah, I do not think there is a reason to change the code if it's using
> PG_USED_FOR_ASSERTS_ONLY properly.  We should either make that macro
> work on $compiler, or ignore such warnings from $compiler.

So, to reach some conclusion on this thread: it seems the code is using
PG_USED_FOR_ASSERTS_ONLY - as it's currently implemented - properly, but it
doesn't work on MSVC and isn't likely to in its current shape. Reworking it
to support a wider range of compilers will also likely mean net new
code.
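
For readers unfamiliar with the macro, the pattern at issue looks roughly
like this (a simplified sketch; the real definition in c.h goes through
pg_attribute_unused(), and PostgreSQL's Assert() behaves like assert() here):

#include <assert.h>
#include <stdbool.h>

#ifdef __GNUC__
#define PG_USED_FOR_ASSERTS_ONLY __attribute__((unused))
#else
#define PG_USED_FOR_ASSERTS_ONLY   /* MSVC: no equivalent, hence the warnings */
#endif

static void
check_positive(int value)
{
    bool    ok PG_USED_FOR_ASSERTS_ONLY = (value > 0);

    /* Only consumer of "ok"; compiled out when asserts are disabled. */
    assert(ok);
}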

To silence the warnings in the meantime (if the rework happens at all), we
should either apply the patch from Greg or add C4101 to disablewarnings in
src/tools/msvc/Project.pm as mentioned above.  On top of that, we should apply
the patch I proposed downthread to remove PG_USED_FOR_ASSERTS_ONLY where it's
no longer applicable.  Personally I'm fine with either, and am happy to make it
happen, once we agree on what it should be.

Thoughts?

--
Daniel Gustafsson   https://vmware.com/





pg_upgrade and publication/subscription problem

2021-11-25 Thread Marcos Pegoraro
A publication for all tables was running fine; the master is PostgreSQL
11.11. The replica was running version 13 (don't remember the minor
version).

Then we tried to upgrade only the subscriber server; nothing was done on
the master side.

Then we did ...
- installed postgresql-14.
- configured postgresql.conf to be similar to previous.
- on version 13 disabled subscription - alter subscription disable.
- changed both ports to run pg_upgrade.
- stopped services for both 13 and 14.
- /usr/lib/postgresql/14/bin/pg_upgrade -b /usr/lib/postgresql/13/bin -B
/usr/lib/postgresql/14/bin -d /etc/postgresql/13/main/ -D
/etc/postgresql/14/main/ -j 2 --link -p  -P 9998 -U postgres -v
- when the upgrade process finished, we removed version 13 and ran vacuumdb
-p 9998 -U postgres --all --analyze-in-stages
- last step was to enable that subscription.
- just waited for the subscriber to get the changed data; pg_upgrade ran for
15 minutes, so this should have synced in a few seconds ...
- a few seconds later we remembered that some other tables had been created
on the publication server, so we did a refresh publication.

Then, some minutes later we got lots of log entries "duplicate key value
violates unique constraint pk..." because it was trying to COPY that table
from master.

We disabled the subscription again until we find a solution, and so it
remains.

Selecting from pg_subscription_rel, all old tables have srsubstate 'i' for
initialize, not 's' for synchronized or 'r' for ready, as they should. And
the srsublsn of all these records is null, so synchronization tracking was
lost for all tables that existed before this upgrade process.
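
For reference, the per-table state can be checked with a query along these
lines (srsubstate: 'i' = initialize, 'd' = data copy, 's' = synchronized,
'r' = ready):

SELECT srrelid::regclass AS tbl, srsubstate, srsublsn
FROM pg_subscription_rel
ORDER BY 1;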

So, my first question is: as our publication server kept running and lots
of updates were processed, how can I synchronize both sides without
recreating that publication ?
And my second question is, is this problem documented ? Is this problem
expected to happen ?

regards,
Marcos


Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-11-25 Thread Dilip Kumar
On Thu, Nov 25, 2021 at 1:07 PM Greg Nancarrow  wrote:
>
> On Tue, Oct 5, 2021 at 7:07 PM Dilip Kumar  wrote:
> >
> > Patch details:
> > 0001 to 0006 implements an approach1
> > 0007 removes the code of pg_class scanning and adds the directory scan.
> >
>
> I had a scan through the patches, though have not yet actually run any
> tests to try to better gauge their benefit.
> I do have some initial review comments though:
>
> 0003
>
> src/backend/commands/tablecmds.c
> (1) RelationCopyAllFork()
> In the following comment:
>
> +/*
> + * Copy source smgr all fork's data to the destination smgr.
> + */
>
> Shouldn't it say "smgr relation"?
> Also, you could additionally say ", using a specified fork data
> copying function." or something like that, to account for the
> additional argument.
>
>
> 0006
>
> src/backend/commands/dbcommands.c
> (1) function prototype location
>
> The following prototype is currently located in the "non-export
> function prototypes" section of the source file, but it's not static -
> shouldn't it be in dbcommands.h?
>
> +void RelationCopyStorageUsingBuffer(SMgrRelation src, SMgrRelation dst,
> +ForkNumber forkNum, char relpersistence);
>
> (2) CreateDirAndVersionFile()
> Shouldn't the following code:
>
> + fd = OpenTransientFile(versionfile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
> + if (fd < 0 && errno == EEXIST && isRedo)
> +   fd = OpenTransientFile(versionfile, O_RDWR | PG_BINARY);
>
> actually be:
>
> + fd = OpenTransientFile(versionfile, O_WRONLY | O_CREAT | O_EXCL | 
> PG_BINARY);
> + if (fd < 0 && errno == EEXIST && isRedo)
> +   fd = OpenTransientFile(versionfile, O_WRONLY | O_TRUNC | PG_BINARY);
>
> since we're only writing to that file descriptor and we want to
> truncate the file if it already exists.
>
> The current comment says "... open it in the write mode.", but should
> say "... open it in write mode."
>
> Also, shouldn't you be writing a newline (\n) after the
> PG_MAJORVERSION ? (compare with code in initdb.c)
>
> (3) GetDatabaseRelationList()
> Shouldn't:
>
> +  if (PageIsNew(page) || PageIsEmpty(page))
> +continue;
>
> be:
>
> +  if (PageIsNew(page) || PageIsEmpty(page))
> +  {
> +UnlockReleaseBuffer(buf);
> +continue;
> +  }
>
> ?
>
> Also, in the following code:
>
> +  if (rnodelist == NULL)
> +rnodelist = list_make1(relinfo);
> +  else
> +rnodelist = lappend(rnodelist, relinfo);
>
> it should really be "== NIL" rather than "== NULL".
> But in any case, that code can just be:
>
>rnodelist = lappend(rnodelist, relinfo);
>
> because lappend() will create a list if the first arg is NIL.
>
> (4) RelationCopyStorageUsingBuffer()
>
> In the function comments, IMO it is better to use "APIs" instead of "apis".
>
> Also, better to use "get" instead of "got" in the following comment:
>
> +  /* If we got a cancel signal during the copy of the data, quit */

Thanks for the review and the many valuable comments. I have fixed all of
them except this one (/* If we got a cancel signal during the copy of the
data, quit */), because it looks fine to me. I have dropped 0007 from the
patchset for now. I have also included fixes for the comments given by
John.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v6-0005-New-interface-to-lock-relation-id.patch
Description: Binary data


v6-0003-Refactor-index_copy_data.patch
Description: Binary data


v6-0002-Extend-relmap-interfaces.patch
Description: Binary data


v6-0004-Extend-bufmgr-interfaces.patch
Description: Binary data


v6-0001-Refactor-relmap-load-and-relmap-write-functions.patch
Description: Binary data


v6-0006-WAL-logged-CREATE-DATABASE.patch
Description: Binary data


Re: Skipping logical replication transactions on subscriber side

2021-11-25 Thread vignesh C
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada  wrote:
>
> On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila  wrote:
> >
> > On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada  
> > wrote:
> > >
> > > Right. I've fixed this issue and attached an updated patch.

One very minor comment:
"conflict" can be moved to the next line to keep lines within the
80-character boundary wherever possible:
+# Initial table setup on both publisher and subscriber. On subscriber we create
+# the same tables but with primary keys. Also, insert some data that
will conflict
+# with the data replicated from publisher later.
+$node_publisher->safe_psql(

Similarly in the below:
+# Insert more data to test_tab1, raising an error on the subscriber
due to violation
+# of the unique constraint on test_tab1.
+my $xid = $node_publisher->safe_psql(

The rest of the patch looks good.

Regards,
Vignesh




Re: pg_walinspect - a new extension to get raw WAL data and WAL stats

2021-11-25 Thread Bharath Rupireddy
On Thu, Nov 18, 2021 at 6:43 PM Bharath Rupireddy
 wrote:
>
> On Thu, Oct 7, 2021 at 10:43 AM Bharath Rupireddy 
>  wrote:
> > > Looking at the proposed API from the initial email, I like that there's
> > > both stats functionality and WAL record inspection functionality
> > > (similar to pg_waldump). I like that there's the ability to pull the
> > > records as raw bytea data, however I think we're also going to want an
> > > ability in v1 of the patch to decode it (similar to pageinspect
> > > heap_page_item_attrs, etc).
> >
> > I'm yet to start working on the patch. I will be doing it soon.
>
> Thanks all. Here's the v1 patch set for the new extension pg_walinspect. Note 
> that I didn't include the documentation part now, I will be doing it a bit 
> later.
>
> Please feel free to review and provide your thoughts.

The v1 patch set was failing to compile on Windows. Here's the v2
patch set fixing that.

Regards,
Bharath Rupireddy.


v2-0001-pg_walinspect.patch
Description: Binary data


v2-0002-pg_walinspect-tests.patch
Description: Binary data


Re: pg_get_publication_tables() output duplicate relid

2021-11-25 Thread Amit Kapila
On Thu, Nov 25, 2021 at 1:30 PM Amit Langote  wrote:
>
> On Wed, Nov 24, 2021 at 5:44 PM Amit Kapila  wrote:
> > On Wed, Nov 24, 2021 at 12:02 PM Amit Langote  
> > wrote:
> > > So yeah, documenting the ATTACH issue as a limitation sounds like the
> > > best course for now.  I might word it as follows and add it under
> > > Notes at 
> > > https://www.postgresql.org/docs/current/sql-createpublication.html:
> > >
> > > "ATTACHing a table into a partition tree whose root is published using
> > > a publication with publish_via_partition_root set to true does not
> > > result in the table's existing contents to be replicated."
> >
> > Instead of "to be", shall we use "being"?
>
> That reads better.
>
> > > I'm not sure there's a clean enough workaround to this that we can add
> > > to the paragraph.
> > >
>
> I agree with backpatching the doc fix.  I've attached a diff against
> master, though it also appears to apply to 13 and 14 branches.
>

I think we can use <literal> for publish_via_partition_root,
other than that the patch looks good to me.

Hou-San, others, do you have any opinion about this patch and whether
to backpatch it or not?

-- 
With Regards,
Amit Kapila.




Re: pg_get_publication_tables() output duplicate relid

2021-11-25 Thread Amit Langote
On Wed, Nov 24, 2021 at 5:44 PM Amit Kapila  wrote:
> On Wed, Nov 24, 2021 at 12:02 PM Amit Langote  wrote:
> > So yeah, documenting the ATTACH issue as a limitation sounds like the
> > best course for now.  I might word it as follows and add it under
> > Notes at https://www.postgresql.org/docs/current/sql-createpublication.html:
> >
> > "ATTACHing a table into a partition tree whose root is published using
> > a publication with publish_via_partition_root set to true does not
> > result in the table's existing contents to be replicated."
>
> Instead of "to be", shall we use "being"?

That reads better.

> > I'm not sure there's a clean enough workaround to this that we can add
> > to the paragraph.
> >
> > Does that make sense?
> >
>
> Yeah, that sounds reasonable. I think the case with
> "publish_via_partition_root = false" should work but please test it
> once.

Yeah, I did test that case to compare and didn't see a problem with it.

> The other thing which we need to consider about all these fixes
> is whether to back patch them or not. I think it is on a case-by-case
> basis. I feel this limitation should be documented and backpatched,
> what do you think? Feel free to submit patches accordingly.

I agree with backpatching the doc fix.  I've attached a diff against
master, though it also appears to apply to 13 and 14 branches.
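
To illustrate the limitation being documented (a sketch; object names are
invented, and the publication pre-exists the ATTACH):

CREATE TABLE parent (a int) PARTITION BY LIST (a);
CREATE PUBLICATION pub FOR TABLE parent
    WITH (publish_via_partition_root = true);

CREATE TABLE child (a int);
INSERT INTO child VALUES (1);   -- pre-existing contents ...
ALTER TABLE parent ATTACH PARTITION child FOR VALUES IN (1);
-- ... are not replicated, even after the subscriber runs
-- ALTER SUBSCRIPTION ... REFRESH PUBLICATION.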

--
Amit Langote
EDB: http://www.enterprisedb.com


document-ATTACH-limitation.diff
Description: Binary data