Re: Implementing Incremental View Maintenance

2021-11-24 Thread Yugo NAGATA
On Wed, 24 Nov 2021 04:31:25 +
"r.takahash...@fujitsu.com"  wrote:

> > > ivm=# create table t (c1 int, c2 int);
> > > CREATE TABLE
> > > ivm=# create incremental materialized view ivm_t as select distinct c1 
> > > from t;
> > > NOTICE:  created index "ivm_t_index" on materialized view "ivm_t"
> > > SELECT 0
> > >
> > > Then I executed pg_dump.
> > >
> > > In the dump, the following SQLs appear.
> > >
> > > CREATE INCREMENTAL MATERIALIZED VIEW public.ivm_t AS
> > >  SELECT DISTINCT t.c1
> > >FROM public.t
> > >   WITH NO DATA;
> > >
> > > ALTER TABLE ONLY public.ivm_t
> > > ADD CONSTRAINT ivm_t_index UNIQUE (c1);
> > >
> > > If I execute psql with the result of pg_dump, following error occurs.
> > >
> > > ERROR:  ALTER action ADD CONSTRAINT cannot be performed on relation
> > "ivm_t"
> > > DETAIL:  This operation is not supported for materialized views.
> > 
> > Good catch! It was my mistake creating unique constraints on IMMV in spite 
> > of
> > we cannot defined them via SQL. I'll fix it to use unique indexes instead of
> > constraints.
> 
> I checked the same procedure on v24 patch.
> But following error occurs instead of the original error.
> 
> ERROR:  relation "ivm_t_index" already exists

Thank you for pointing out it!

Hmmm, an index is created when IMMV is defined, so CREAE INDEX called
after this would fail... Maybe, we should not create any index automatically
if IMMV is created WITH NO DATA.

I'll fix it after some investigation.

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

2021-11-24 Thread Greg Nancarrow
On Tue, Oct 5, 2021 at 7:07 PM Dilip Kumar  wrote:
>
> Patch details:
> 0001 to 0006 implements an approach1
> 0007 removes the code of pg_class scanning and adds the directory scan.
>

I had a scan through the patches, though have not yet actually run any
tests to try to better gauge their benefit.
I do have some initial review comments though:

0003

src/backend/commands/tablecmds.c
(1) RelationCopyAllFork()
In the following comment:

+/*
+ * Copy source smgr all fork's data to the destination smgr.
+ */

Shouldn't it say "smgr relation"?
Also, you could additionally say ", using a specified fork data
copying function." or something like that, to account for the
additional argument.


0006

src/backend/commands/dbcommands.c
(1) function prototype location

The following prototype is currently located in the "non-export
function prototypes" section of the source file, but it's not static -
shouldn't it be in dbcommands.h?

+void RelationCopyStorageUsingBuffer(SMgrRelation src, SMgrRelation dst,
+ForkNumber forkNum, char relpersistence);

(2) CreateDirAndVersionFile()
Shouldn't the following code:

+ fd = OpenTransientFile(versionfile, O_RDWR | O_CREAT | O_EXCL | PG_BINARY);
+ if (fd < 0 && errno == EEXIST && isRedo)
+   fd = OpenTransientFile(versionfile, O_RDWR | PG_BINARY);

actually be:

+ fd = OpenTransientFile(versionfile, O_WRONLY | O_CREAT | O_EXCL | PG_BINARY);
+ if (fd < 0 && errno == EEXIST && isRedo)
+   fd = OpenTransientFile(versionfile, O_WRONLY | O_TRUNC | PG_BINARY);

since we're only writing to that file descriptor and we want to
truncate the file if it already exists.

The current comment says "... open it in the write mode.", but should
say "... open it in write mode."

Also, shouldn't you be writing a newline (\n) after the
PG_MAJORVERSION ? (compare with code in initdb.c)

(3) GetDatabaseRelationList()
Shouldn't:

+  if (PageIsNew(page) || PageIsEmpty(page))
+continue;

be:

+  if (PageIsNew(page) || PageIsEmpty(page))
+  {
+UnlockReleaseBuffer(buf);
+continue;
+  }

?

Also, in the following code:

+  if (rnodelist == NULL)
+rnodelist = list_make1(relinfo);
+  else
+rnodelist = lappend(rnodelist, relinfo);

it should really be "== NIL" rather than "== NULL".
But in any case, that code can just be:

   rnodelist = lappend(rnodelist, relinfo);

because lappend() will create a list if the first arg is NIL.

(4) RelationCopyStorageUsingBuffer()

In the function comments, IMO it is better to use "APIs" instead of "apis".

Also, better to use "get" instead of "got" in the following comment:

+  /* If we got a cancel signal during the copy of the data, quit */


0007

(I think I prefer the first approach rather than this 2nd approach)

src/backend/commands/dbcommands.c
(1) createdb()
pfree(srcpath) seems to be missing, in the case that CopyDatabase() gets called.

(2) GetRelfileNodeFromFileName()
%s in sscanf() allows an unbounded read and is considered potentially
dangerous (allows buffer overflow), especially here where
FORKNAMECHARS is so small.

+  nmatch = sscanf(filename, "%u_%s", , forkname);

how about using the following instead in this case:

+  nmatch = sscanf(filename, "%u_%4s", , forkname);

?

(even if there were > 4 chars after the underscore, it would still
match and InvalidOid would be returned because nmatch==2)

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Deduplicate code updating ControleFile's DBState.

2021-11-24 Thread Michael Paquier
On Thu, Nov 25, 2021 at 10:21:40AM +0530, Amul Sul wrote:
> Thanks for the inputs --  moved timestamp setting inside update_controlfile().

I have not check the performance implication of that with a micro
benchmark or the like, but I can get behind 0001 on consistency
grounds between the backend and the frontend.  0002 does not seem
worth the trouble, though, as it is changing only two code paths.
--
Michael


signature.asc
Description: PGP signature


Re: Implementing Incremental View Maintenance

2021-11-24 Thread Yugo NAGATA
Hello Takahashi-san,

On Wed, 24 Nov 2021 04:27:13 +
"r.takahash...@fujitsu.com"  wrote:

> Hi Nagata-san,
> 
> 
> Sorry for late reply.
> 
> 
> > However, even if we create triggers recursively on the parents or children, 
> > we would still
> > need more consideration. This is because we will have to convert the format 
> > of tuple of
> > modified table to the format of the table specified in the view for cases 
> > that the parent
> > and some children have different format.
> > 
> > I think supporting partitioned tables can be left for the next release.
> 
> OK. I understand.
> In the v24-patch, creating IVM on partions or partition table is prohibited.
> It is OK but it should be documented.
> 
> Perhaps, the following statement describe this.
> If so, I think the definition of "simple base table" is ambiguous for some 
> users.
> 
> + IMMVs must be based on simple base tables. It's not supported to
> + create them on top of views or materialized views.

Oh, I forgot to fix the documentation. I'll fix it.

Ragards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: [BUG]Missing REPLICA IDENTITY check when DROP NOT NULL

2021-11-24 Thread Michael Paquier
On Thu, Nov 25, 2021 at 10:44:53AM +0530, Dilip Kumar wrote:
> Good catch, your patch looks fine to me and solves the reported problem.

And applied down to 10.  A couple of comments in the same area did not
get the call, though.
--
Michael


signature.asc
Description: PGP signature


Re: prevent immature WAL streaming

2021-11-24 Thread Amul Sul
On Wed, Nov 24, 2021 at 2:10 AM Alvaro Herrera  wrote:
>
> On 2021-Nov-23, Tom Lane wrote:
>
> > We're *still* not out of the woods with 026_overwrite_contrecord.pl,
> > as we are continuing to see occasional "mismatching overwritten LSN"
> > failures, further down in the test where it tries to start up the
> > standby:
>
> Augh.
>
> > Looking at adjacent successful runs, it seems that the exact point
> > where the "missing contrecord" starts varies substantially, even after
> > our previous fix to disable autovacuum in this test.  How could that be?
>
> Well, there is intentionally some variability.  Maybe not as much as one
> would wish, but I expect that that should explain why that point is not
> always the same.
>
> > It's probably for the best though, because I think this is exposing
> > an actual bug that we would not have seen if the start point were
> > completely consistent.  I have not dug into the code, but it looks to
> > me like if the "consistent recovery state" is reached exactly at a
> > page boundary (0/1FFE000 in all these cases), then the standby expects
> > that to be what the OVERWRITE_CONTRECORD record will point at.  But
> > actually it points to the first WAL record on that page, resulting
> > in a bogus failure.
>
> So what is happening is that we set state->overwrittenRecPtr to the LSN
> of page start, ignoring the page header.  Is that the LSN of the first
> record in a page?  I'll see if I can reproduce the problem.
>

In XLogReadRecord(), both the variables being compared have
inconsistency in the assignment -- one gets assigned from
state->currRecPtr where other is from RecPtr.

.
state->overwrittenRecPtr = state->currRecPtr;
.
state->abortedRecPtr = RecPtr;
.

Before the place where assembled flag sets, there is a bunch of code
that adjusts RecPtr. I think instead of  RecPtr, the latter assignment
should use state->currRecPtr as well.

Regards,
Amul




Re: pg_dump, pg_basebackup don't error out with wrong option for "--format"

2021-11-24 Thread Bharath Rupireddy
On Thu, Nov 25, 2021 at 11:02 AM Dilip Kumar  wrote:
>
> On Thu, Nov 25, 2021 at 10:44 AM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > I noticed that the pg_dump and pg_basebackup are not erroring out when
> > "--fo"/"--for"/"--form"/"--forma"/" are specified(use cases 1 - 4, 7 -
> > 9) whereas it fails if a pattern that doesn't match with "format" is
> > specified (use case 5, 10). This seems to be true only for "--format"
> > option, for other options it just errors out (use case 6). Why is the
> > behaviour for "--format" isn't consistent?
> >
> > Is it something expected? Is the option parsing logic here buggy?
>
> I think for parsing we use getopt_long(), as per that if you use the
> prefix of the string and that is not conflicting with any other option
> then that is allowed.  So --fo, --for all are accepted, --f will not
> be accepted because --file and --format will conflict, --foo is also
> not allowed because it is not a valid prefix string of any valid
> option string.

Yeah, that makes sense. Thanks.

Regards,
Bharath Rupireddy.




Re: pg_dump, pg_basebackup don't error out with wrong option for "--format"

2021-11-24 Thread Dilip Kumar
On Thu, Nov 25, 2021 at 10:44 AM Bharath Rupireddy
 wrote:
>
> Hi,
>
> I noticed that the pg_dump and pg_basebackup are not erroring out when
> "--fo"/"--for"/"--form"/"--forma"/" are specified(use cases 1 - 4, 7 -
> 9) whereas it fails if a pattern that doesn't match with "format" is
> specified (use case 5, 10). This seems to be true only for "--format"
> option, for other options it just errors out (use case 6). Why is the
> behaviour for "--format" isn't consistent?
>
> Is it something expected? Is the option parsing logic here buggy?

I think for parsing we use getopt_long(), as per that if you use the
prefix of the string and that is not conflicting with any other option
then that is allowed.  So --fo, --for all are accepted, --f will not
be accepted because --file and --format will conflict, --foo is also
not allowed because it is not a valid prefix string of any valid
option string.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [BUG]Missing REPLICA IDENTITY check when DROP NOT NULL

2021-11-24 Thread Dilip Kumar
On Thu, Nov 25, 2021 at 8:21 AM tanghy.f...@fujitsu.com
 wrote:
>
> On Wed, Nov 24, 2021 7:29 PM, Michael Paquier  wrote:

>
> Thanks for your comment. I agree with you.
> I have fixed it and attached v2 patch.

Good catch, your patch looks fine to me and solves the reported problem.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




pg_dump, pg_basebackup don't error out with wrong option for "--format"

2021-11-24 Thread Bharath Rupireddy
Hi,

I noticed that the pg_dump and pg_basebackup are not erroring out when
"--fo"/"--for"/"--form"/"--forma"/" are specified(use cases 1 - 4, 7 -
9) whereas it fails if a pattern that doesn't match with "format" is
specified (use case 5, 10). This seems to be true only for "--format"
option, for other options it just errors out (use case 6). Why is the
behaviour for "--format" isn't consistent?

Is it something expected? Is the option parsing logic here buggy?

Thoughts?

Use cases:
1) ./pg_dump --dbname=postgres --host=localhost --port=5432
--username=bharath --no-password --fo=plain --file=mydump.sql
2) ./pg_dump --dbname=postgres --host=localhost --port=5432
--username=bharath --no-password --for=plain --file=mydump.sql
3) ./pg_dump --dbname=postgres --host=localhost --port=5432
--username=bharath --no-password --form=plain --file=mydump.sql
4) ./pg_dump --dbname=postgres --host=localhost --port=5432
--username=bharath --no-password --forma=plain --file=mydump.sql
5) ./pg_dump --dbname=postgres --host=localhost --port=5432
--username=bharath --no-password --foo=plain --file=mydump.sql
6) ./pg_dump --dbname=postgres --host=localhost --port=5432
--username=bharath --no-password --format=plain --fi=mydump.sql
7) ./pg_basebackup -D bkupdata --fo=plain
8) ./pg_basebackup -D bkupdata --for=plain
9) ./pg_basebackup -D bkupdata --forma=plain
10) ./pg_basebackup -D bkupdata --foo=plain

Regards,
Bharath Rupireddy.




Re: Skipping logical replication transactions on subscriber side

2021-11-24 Thread Amit Kapila
On Wed, Nov 24, 2021 at 5:14 PM Masahiko Sawada  wrote:
>
> Changed. I've removed first_error_time as per discussion on the thread
> for adding xact stats.
>

We also agreed to change the column names to start with last_error_*
[1]. Is there a reason to not make those changes? Do you think that we
can change it just before committing that patch? I thought it might be
better to do it that way now itself.

[1] - 
https://www.postgresql.org/message-id/CAD21AoCQ8z5goy3BCqfk2gn5p8NVH5B-uxO3Xc-dXN-MXVfnKg%40mail.gmail.com

-- 
With Regards,
Amit Kapila.




Re: pg_replslotdata - a tool for displaying replication slot information

2021-11-24 Thread Bharath Rupireddy
On Wed, Nov 24, 2021 at 9:40 PM Japin Li  wrote:
>
>
> On Wed, 24 Nov 2021 at 23:59, Bharath Rupireddy 
>  wrote:
> > On Wed, Nov 24, 2021 at 9:09 PM Bharath Rupireddy
> >> > Thoughts?
> >>
> >> Attaching the rebased v2 patch.
> >
> > On windows the previous patches were failing, fixed that in the v3
> > patch. I'm really sorry for the noise.
> >
>
> Cool!  When I try to use it, there is an error for -v, --verbose option.
>
> px@ubuntu:~/Codes/postgres/Debug$ pg_replslotdata -v
> pg_replslotdata: invalid option -- 'v'
> Try "pg_replslotdata --help" for more information.
>
> This is because the getopt_long() missing 'v' in the third parameter.
>
> while ((c = getopt_long(argc, argv, "D:v", long_options, NULL)) != -1)

Thanks for taking a look at the patch, attaching v4.

There are many things that I could do in the patch, for instance, more
comments, documentation, code improvements etc. I would like to first
know what hackers think about this tool, and then start spending more
time on it.

Regards,
Bharath Rupireddy.


v4-0001-pg_replslotdata.patch
Description: Binary data


Re: Deduplicate code updating ControleFile's DBState.

2021-11-24 Thread Amul Sul
On Thu, Nov 11, 2021 at 1:30 AM Bossart, Nathan  wrote:
>
> On 10/1/21, 10:40 PM, "Michael Paquier"  wrote:
> > On Fri, Oct 01, 2021 at 05:47:45PM +, Bossart, Nathan wrote:
> >> I'm inclined to agree that anything that calls update_controlfile()
> >> should update the timestamp.
> >
> > pg_control.h tells that:
> > pg_time_t   time;   /* time stamp of last pg_control update */
> > So, yes, that would be more consistent.
> >
> >> However, I wonder if the additional
> >> calls to time() would have a noticeable impact.
> >
> > I would not take that lightly either.  Now, I don't think that any of
> > the code paths where UpdateControlFile() or update_controlfile() is
> > called are hot enough to worry about that.
> >
> >  UpdateControlFile(void)
> >  {
> > +   ControlFile->time = (pg_time_t) time(NULL);
> > update_controlfile(DataDir, ControlFile, true);
> >  }
> > I have to admit that it is a bit strange to do that in the backend but
> > not the frontend, so there is a good argument for doing that directly
> > in update_controlfile().  pg_resetwal does an update of the time, but
> > pg_rewind does not.
>

Thanks for the inputs --  moved timestamp setting inside update_controlfile().

> I don't see any recent updates to this thread from Amul, so I'm
> marking this one as waiting-for-author.
>

Sorry for the delay, please have a look at the attached version --
changing status to Needs review, thanks.

Regards,
Amul


v5-0001-Do-the-ControlFile-timestamp-setting-in-update_co.patch
Description: Binary data


v5-0002-Deduplicate-code-updating-ControleFile-s-DBState.patch
Description: Binary data


Re: Non-superuser subscription owners

2021-11-24 Thread Amit Kapila
On Thu, Nov 25, 2021 at 6:00 AM Jeff Davis  wrote:
>
> On Fri, 2021-11-19 at 16:45 -0800, Mark Dilger wrote:
> > Renamed as 0001 in version 3, as it is the only remaining patch.  For
> > anyone who reviewed the older patch set, please note that I made some
> > changes to the src/test/subscription/t/026_nosuperuser.pl test case
> > relative to the prior version.
>
> We need to do permission checking for WITH CHECK OPTION and RLS. The
> patch right now allows the subscription to write data that an RLS
> policy forbids.
>

Won't it be better to just check if the current user is superuser
before applying each change as a matter of this first patch? Sorry, I
was under impression that first, we want to close the current gap
where we allow to proceed with replication if the user's superuser
privileges were revoked during replication. To allow non-superusers
owners, I thought it might be better to first try to detect the change
of ownership as soon as possible instead of at the transaction
boundary.

-- 
With Regards,
Amit Kapila.




Re: Reduce function call costs on ELF platforms

2021-11-24 Thread Andres Freund
Hi,

On 2021-11-24 17:55:03 -0500, Andrew Dunstan wrote:
> On 11/24/21 13:55, Andres Freund wrote:
> > On 2021-11-23 17:28:08 +0100, Peter Eisentraut wrote:
> >> On 22.11.21 23:32, Tom Lane wrote:
>  The easier approach for this class of issues is to use the linker option
>  -Bsymbolic.
> >>> I don't recall details, but we've previously rejected the idea of
> >>> trying to use -Bsymbolic widely; apparently it has undesirable
> >>> side-effects on some platforms.  See commit message for e3d77ea6b
> >>> (hopefully there's some detail in the email thread [1]).  It sounds
> >>> like you're not actually proposing that, but I thought it would be
> >>> a good idea to note the hazard here.
> >> Also, IIRC, -Bsymbolic was once frowned upon by packaging policies, since 
> >> it
> >> prevents use of LD_PRELOAD.  I'm not sure what the current thinking there
> >> is, however.
> > It doesn't break some (most?) of the uses of LD_PRELOAD. In particular, it
> > doesn't break things like replacing the malloc implementation. When do you
> > have a symbol that you want to override *inside* your library (executables
> > already bind to their own symbols at compile time)? I've seen that for
> > replacing buggy functions in closed source things, but that's about it?
> >
> 
> Which things does it break exactly?

-Bsymbolic causes symbols that are defined and referenced within one shared
library to use that definition. E.g. if a shared lib has a function
"do_something()" and some of its code calls do_something(), you cannot use
LD_PRELOAD (or a definition in the main binary) to redirect the call to
do_something() inside the shared library to something else.

I.e. if a shared library calls a function that's *not* defined within that
shared library, -Bsymbolic doesn't have an effect for that symbol.


> I have a case where a library that
> is LD_PRELOADed calls PQsetSSLKeyPassHook_OpenSSL() in its constructor
> function. I'd be very unhappy if that stopped working (and so would our
> client).

Bsymbolic shouldn't affect that at all.

Greetings,

Andres Freund




Re: row filtering for logical replication

2021-11-24 Thread Peter Smith
On Tue, Nov 23, 2021 at 10:52 PM Amit Kapila  wrote:
>
> On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
> >
> > Attaching a new patchset v41 which includes changes by both Peter and 
> > myself.
> >
>
...
> I suggest at this stage we can combine 0001, 0003, and 0004. Then move
> pg_dump and psql (describe.c) related changes to 0002 and make 0002 as
> the last patch in the series. This will help review backend changes
> first and then we can look at client-side changes.
>

The patch combining and reordering was as suggested.

BEFORE:
v41-0001 Euler's main patch
v41-0002 Tab-complete
v41-0003 ExprState cache
v41-0004 OR/AND
v41-0005 Validation walker
v41-0006 new/old tuple updates

AFTER:
v42-0001 main patch <== v41-0001 + v41-0003 + v41-0004
v42-0002 validation walker <== v41-0005
v42-0003 new/old tuple updates <== v41-0006
v42-0004 tab-complete and pgdump <== v41-0002 (plus pgdump code from v41-0001)

~

Please note,  I did not remove the describe.c changes from the
v42-0001 patch at this time. I left this as-is because I felt the
ability for psql \d+ or \dRp+ etc to display the current row-filter is
*essential* functionality to be able to test and debug the 0001 patch
properly.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-11-24 Thread Peter Smith
On Thu, Nov 25, 2021 at 12:03 AM Amit Kapila  wrote:
>
> On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
> >
> > Attaching a new patchset v41 which includes changes by both Peter and 
> > myself.
> >
...
> Few more comments:
> =
...
> 0005
> 2.
> +typedef struct {
> + Relation rel;
> + bool check_replident;
> + Bitmapset  *bms_replident;
> +}
> +rf_context;
>
> Add rf_context in the same line where } ends.
>

Fixed in v42* [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPsGZHvafa3K_RAJ0Agm28W2owjNN%2BqU0EUsSjBNbuXFsQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-11-24 Thread Peter Smith
On Wed, Nov 24, 2021 at 8:52 PM vignesh C  wrote:
>
> On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
> >
> > Attaching a new patchset v41 which includes changes by both Peter and 
> > myself.
> >
> > Patches v40-0005 and v40-0006 have been merged to create patch
> > v41-0005 which reduces the patches to 6 again.
>
> Few comments:
...
> 2) Should '"delete" or "delete"' be '"delete" or "update"'
> --- a/src/backend/catalog/pg_publication.c
> +++ b/src/backend/catalog/pg_publication.c
> @@ -340,7 +340,7 @@ rowfilter_walker(Node *node, rf_context *context)
>   * 1. Only certain simple node types are permitted in the expression. See
>   * function rowfilter_walker for details.
>   *
> - * 2. If the publish operation contains "delete" then only columns that
> + * 2. If the publish operation contains "delete" or "delete" then
> only columns that
>   * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
>   * row-filter WHERE clause.
>   */
> @@ -352,12 +352,10 @@ rowfilter_expr_checker(Publication *pub,
> Relation rel, Node *rfnode)
> context.rel = rel;
>
> /*
> -* For "delete", check that filter cols are also valid replica 
> identity
> +* For "delete" or "update", check that filter cols are also
> valid replica identity
>  * cols.

Fixed in v42* [1]


> 4) This should be included in typedefs.list, also we could add some
> comments for this structure
> +typedef struct {
> +   Relationrel;
> +   Bitmapset  *bms_replident;
> +}
> +rf_context;
>

Fixed in v42* [1]


> 6) typo "filte" should be "filter":
> +/*
> + * The row filte walker checks that the row filter expression is legal.
> + *
> + * Rules: Node-type validation
> + * ---
> + * Allow only simple or compound expressions like:
> + * - "(Var Op Const)" or
> + * - "(Var Op Const) Bool (Var Op Const)"

Fixed in v42* [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPsGZHvafa3K_RAJ0Agm28W2owjNN%2BqU0EUsSjBNbuXFsQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-11-24 Thread Peter Smith
On Tue, Nov 23, 2021 at 4:40 PM tanghy.f...@fujitsu.com
 wrote:
>
> On Thursday, November 18, 2021 9:34 AM Peter Smith  
> wrote:
> >
> > PSA new set of v40* patches.
> >
>

> Besides, a small comment on 0004 patch:
>
> +* Multiple row-filter expressions for the same publication 
> will later be
> +* combined by the COPY using OR, but this means if any of 
> the filters is
>
> Should we change it to:
> Multiple row-filter expressions for the same table ...

Fixed in v42* [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPsGZHvafa3K_RAJ0Agm28W2owjNN%2BqU0EUsSjBNbuXFsQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: row filtering for logical replication

2021-11-24 Thread Peter Smith
On Tue, Nov 23, 2021 at 8:02 PM Amit Kapila  wrote:
>
> On Thu, Nov 18, 2021 at 11:02 AM Peter Smith  wrote:
> >
> > On Mon, Nov 15, 2021 at 9:31 PM Amit Kapila  wrote:
> >
> > >
> > > 4. I think we should add some comments in pgoutput_row_filter() as to
> > > why we are caching the row_filter here instead of
> > > get_rel_sync_entry()? That has been discussed multiple times so it is
> > > better to capture that in comments.
> >
> > Added comment in v40 [1]
> >
>
> I think apart from truncate and error cases, it can also happen for
> other operations because we decide whether to publish a change
> (operation) after calling get_rel_sync_entry() in pgoutput_change. I
> think we can reflect that as well in the comment.

Fixed in v42* [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPsGZHvafa3K_RAJ0Agm28W2owjNN%2BqU0EUsSjBNbuXFsQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-11-24 Thread Amit Kapila
On Wed, Nov 24, 2021 at 7:46 PM John Naylor
 wrote:
>
> On Wed, Nov 17, 2021 at 7:28 AM Amit Kapila  wrote:
> >
> > > The patch looks good to me. But I can't come up with a stable test for
> > > this. It seems to be hard without stopping and resuming parallel
> > > vacuum workers. Do you have any good idea?
> > >
> >
> > No, let's wait for a day or so to see if anybody else has any ideas to
> > write a test for this case, otherwise, I'll check these once again and
> > push.
>
> I set this "committed" in the CF app.
>

Thanks!

-- 
With Regards,
Amit Kapila.




Re: row filtering for logical replication

2021-11-24 Thread Peter Smith
On Thu, Nov 18, 2021 at 9:35 PM Greg Nancarrow  wrote:
>
> On Thu, Nov 18, 2021 at 12:33 PM Peter Smith  wrote:
> >
> > PSA new set of v40* patches.
> >
>
> Thanks for the patch updates.
>
> A couple of comments so far:
>
> (1) compilation warning
> WIth the patches applied, there's a single compilation warning when
> Postgres is built:
>
> pgoutput.c: In function ‘pgoutput_row_filter_init’:
> pgoutput.c:854:8: warning: unused variable ‘relid’ [-Wunused-variable]
>   Oid   relid = RelationGetRelid(relation);
> ^
>

Fixed in v42* [1]

--
[1] 
https://www.postgresql.org/message-id/CAHut%2BPsGZHvafa3K_RAJ0Agm28W2owjNN%2BqU0EUsSjBNbuXFsQ%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




RE: [BUG]Missing REPLICA IDENTITY check when DROP NOT NULL

2021-11-24 Thread tanghy.f...@fujitsu.com
On Wed, Nov 24, 2021 7:29 PM, Michael Paquier  wrote:
> 
> On Wed, Nov 24, 2021 at 07:04:51AM +, tanghy.f...@fujitsu.com wrote:
> > create table tbl (a int not null unique);
> > alter table tbl replica identity using INDEX tbl_a_key;
> > alter table tbl alter column a drop not null;
> > insert into tbl values (null);
> 
> Oops.  Yes, that's obviously not good.
> 
> > To solve the above problem, I think it's better to add a check when
> > executing  ALTER COLUMN DROP NOT NULL,
> > and report an error if this column is part of replica identity.
> 
> I'd say that you are right to block the operation.  I'll try to play a
> bit with this stuff tomorrow.
> 
> > Attaching a patch that disallow DROP NOT NULL on a column if it's in
> > a REPLICA IDENTITY index. Also added a test in it.
> 
>if (indexStruct->indkey.values[i] == attnum)
>ereport(ERROR,
>(errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> -   errmsg("column \"%s\" is in a primary key",
> +   errmsg(ngettext("column \"%s\" is in a primary key",
> +   "column \"%s\" is in a REPLICA IDENTITY 
> index",
> +   indexStruct->indisprimary),
> colName)));
> Using ngettext() looks incorrect to me here as it is used to get the
> plural form of a string, so you'd better make these completely
> separated instead.

Thanks for your comment. I agree with you.
I have fixed it and attached v2 patch.

Regards,
Tang


v2-0001-Disallow-DROP-NOT-NULL-on-a-column-in-the-REPLICA.patch
Description:  v2-0001-Disallow-DROP-NOT-NULL-on-a-column-in-the-REPLICA.patch


RE: pg_get_publication_tables() output duplicate relid

2021-11-24 Thread houzj.f...@fujitsu.com
On Wed, Nov 24, 2021 4:48 PM Amit Kapila  wrote:
> On Mon, Nov 22, 2021 at 12:55 PM Amit Langote 
> wrote:
> >
> > On Fri, Nov 19, 2021 at 2:28 PM Amit Kapila 
> wrote:
> > > On Fri, Nov 19, 2021 at 7:19 AM Amit Langote 
> wrote:
> > > >  As in,
> > > > do we know of any replication (initial/streaming) misbehavior
> > > > caused by the duplicate partition OIDs in this case or is the only
> > > > problem that pg_publication_tables output looks odd?
> > >
> > > The latter one but I think either we should document this or change
> > > it as we can't assume users will follow what subscriber-side code does.
> >
> > On second thought, I agree that de-duplicating partitions from this
> > view is an improvement.
> >
> 
> Fair enough. Hou-San, Can you please submit the updated patch after fixing
> any pending comments including the test case?

OK, I will submit the updated patch soon.

Best regards,
Hou zj


Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-11-24 Thread Justin Pryzby
Thanks for working on this.  I was just trying to find something like
"pg_stat_checkpointer".

You wrote beentry++ at the start of two loops, but I think that's wrong; it
should be at the end, as in the rest of the file (or as a loop increment).
BackendStatusArray[0] is actually used (even though its backend has
backendId==1, not 0).  "MyBEEntry = [MyBackendId - 1];"

You could put *_NUM_TYPES as the last value in these enums, like
NUM_AUXPROCTYPES, NUM_PMSIGNALS, and NUM_PROCSIGNALS:

+#define IOOP_NUM_TYPES (IOOP_WRITE + 1)
+#define IOPATH_NUM_TYPES (IOPATH_STRATEGY + 1)
+#define BACKEND_NUM_TYPES (B_LOGGER + 1)

There's extraneous blank lines in these functions:

+pgstat_sum_io_path_ops
+pgstat_report_live_backend_io_path_ops
+pgstat_recv_resetsharedcounter
+GetIOPathDesc
+StrategyRejectBuffer

This function is doubly-indented:

+pgstat_send_buffers_reset

As support for C99 is now required by postgres, variables can be declared as
part of various loops.

+   int io_path;
+   for (io_path = 0; io_path < IOPATH_NUM_TYPES; io_path++)

Rather than memset(), you could initialize msg like this.
PgStat_MsgIOPathOps msg = {0};

+pgstat_send_buffers(void)
+{
+   PgStat_MsgIOPathOps msg;
+
+   PgBackendStatus *beentry = MyBEEntry;
+
+   if (!beentry)
+   return;
+
+   memset(, 0, sizeof(msg));

-- 
Justin




Re: Non-superuser subscription owners

2021-11-24 Thread Jeff Davis
On Fri, 2021-11-19 at 16:45 -0800, Mark Dilger wrote:
> Renamed as 0001 in version 3, as it is the only remaining patch.  For
> anyone who reviewed the older patch set, please note that I made some
> changes to the src/test/subscription/t/026_nosuperuser.pl test case
> relative to the prior version.

We need to do permission checking for WITH CHECK OPTION and RLS. The
patch right now allows the subscription to write data that an RLS
policy forbids.

A couple other points:

 * We shouldn't refer to the behavior of previous versions in the docs
unless there's a compelling reason
 * Do we need to be smarter about partitioned tables, where an insert
can turn into an update?
 * Should we refactor to borrow logic from ExecInsert so that it's less
likely that we miss something in the future?

Regards,
Jeff Davis






Re: Reduce function call costs on ELF platforms

2021-11-24 Thread Andrew Dunstan


On 11/24/21 13:55, Andres Freund wrote:
> Hi,
>
> On 2021-11-23 17:28:08 +0100, Peter Eisentraut wrote:
>> On 22.11.21 23:32, Tom Lane wrote:
 The easier approach for this class of issues is to use the linker option
 -Bsymbolic.
>>> I don't recall details, but we've previously rejected the idea of
>>> trying to use -Bsymbolic widely; apparently it has undesirable
>>> side-effects on some platforms.  See commit message for e3d77ea6b
>>> (hopefully there's some detail in the email thread [1]).  It sounds
>>> like you're not actually proposing that, but I thought it would be
>>> a good idea to note the hazard here.
>> Also, IIRC, -Bsymbolic was once frowned upon by packaging policies, since it
>> prevents use of LD_PRELOAD.  I'm not sure what the current thinking there
>> is, however.
> It doesn't break some (most?) of the uses of LD_PRELOAD. In particular, it
> doesn't break things like replacing the malloc implementation. When do you
> have a symbol that you want to override *inside* your library (executables
> already bind to their own symbols at compile time)? I've seen that for
> replacing buggy functions in closed source things, but that's about it?
>

Which things does it break exactly? I have a case where a library that
is LD_PRELOADed calls PQsetSSLKeyPassHook_OpenSSL() in its constructor
function. I'd be very unhappy if that stopped working (and so would our
client).


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-11-24 Thread Michael Paquier
On Wed, Nov 24, 2021 at 02:12:19PM -0800, SATYANARAYANA NARLAPURAM wrote:
> While an exclusive backup is in progress if Postgres restarts, postgres
> runs the recovery from the checkpoint identified by the label file instead
> of the control file. This can cause long recovery or even sometimes fail to
> recover as the WAL records corresponding to that checkpoint location are
> removed. I can write a layer in my control plane to remove the backup_label
> file when I know the server is not in restore from the base backup but I
> don't see a reason why everyone has to repeat this step. Am I missing
> something?

This is a known issue with exclusive backups, which is a reason why 
non-exclusive backups have been implemented.  pg_basebackup does that,
and using "false" as the third argument of pg_start_backup() would
have the same effect.  So I would recommend to switch to that.
--
Michael


signature.asc
Description: PGP signature


Re: internal subtransactions, memory contexts, resource owners

2021-11-24 Thread Tom Lane
Chapman Flack  writes:
> By inspection of plperl and plpython, it looks like the canonical pattern
> for a PL using internal subtransactions is:

> save CurrentMemoryContext
> save CurrentResourceOwner
> BeginInternalSubTransaction
> reimpose the saved memory context
> // but not the saved resource owner

> ...
> (RollbackAnd)?ReleaseCurrentSubTransaction
> reimpose the saved memory context
> and the saved resource owner

> Therefore, during the subtransaction, its newly-established memory context
> is accessible as CurTransactionMemoryContext, but the caller can still use
> CurrentMemoryContext to refer to the same context it already expected.

> By contrast, the newly established resource owner is both the
> CurTransactionResourceOwner and the CurrentResourceOwner within the scope
> of the subtransaction.

> Is there more explanation of this pattern written somewhere than I have
> managed to find, and in particular of the motivation for treating the memory
> context and the resource owner in these nearly-but-not-quite matching
> ways?

You normally want a separate resource owner for a subtransaction, since
the main point of a subtransaction is to be able to clean up after errors
and release resources.  What to do with CurrentMemoryContext is a lot more
specific to a particular PL.  I don't want to speak to plperl and plpython
in particular, but plpgsql does it this way because it uses the same
function parsetree data structure and the same variable values throughout
execution of a function.  You would not, say, want the function's local
variables to revert to previous values upon failure of a BEGIN block;
so they have to be kept in the same memory context throughout.

regards, tom lane




Postgres restart in the middle of exclusive backup and the presence of backup_label file

2021-11-24 Thread SATYANARAYANA NARLAPURAM
Hi Hackers,

While an exclusive backup is in progress if Postgres restarts, postgres
runs the recovery from the checkpoint identified by the label file instead
of the control file. This can cause long recovery or even sometimes fail to
recover as the WAL records corresponding to that checkpoint location are
removed. I can write a layer in my control plane to remove the backup_label
file when I know the server is not in restore from the base backup but I
don't see a reason why everyone has to repeat this step. Am I missing
something?

If there are no standby.signal or recovery.signal, what is the use case of
honoring backup_label file? Even when they exist, for a long running
recovery, should we honor the backup_label file as the majority of the WAL
already applied? It does slow down the recovery on restart right as it has
to start all the way from the beginning?

Thanks,
Satya


Re: pg_stat_bgwriter.buffers_backend is pretty meaningless (and more?)

2021-11-24 Thread Melanie Plageman
On Fri, Nov 19, 2021 at 11:49 AM Andres Freund  wrote:
> > +/*
> > + * On modern systems this is really just *counter++.  On some older systems
> > + * there might be more to it, due to inability to read and write 64 bit 
> > values
> > + * atomically.
> > + */
> > +static inline void inc_counter(pg_atomic_uint64 *counter)
> > +{
> > + pg_atomic_write_u64(counter, pg_atomic_read_u64(counter) + 1);
> > +}
> > +
> >  #undef INSIDE_ATOMICS_H
>
> Why is this using a completely different naming scheme from the rest of the
> file?

It was what Thomas originally named it. Also, I noticed all the other
pg_atomic* in this file were wrappers around the same impl function, so
I thought maybe naming it this way would be confusing. I renamed it to
pg_atomic_inc_counter(), though maybe pg_atomic_readonly_write() would
be better?

>
> >  doc/src/sgml/monitoring.sgml| 116 +-
> >  src/backend/catalog/system_views.sql|  11 ++
> >  src/backend/postmaster/checkpointer.c   |   3 +-
> >  src/backend/postmaster/pgstat.c | 161 +++-
> >  src/backend/storage/buffer/bufmgr.c |  46 --
> >  src/backend/storage/buffer/freelist.c   |  23 ++-
> >  src/backend/storage/buffer/localbuf.c   |   3 +
> >  src/backend/storage/sync/sync.c |   1 +
> >  src/backend/utils/activity/backend_status.c |  60 +++-
> >  src/backend/utils/adt/pgstatfuncs.c | 152 ++
> >  src/include/catalog/pg_proc.dat |   9 ++
> >  src/include/miscadmin.h |   2 +
> >  src/include/pgstat.h|  53 +++
> >  src/include/storage/buf_internals.h |   4 +-
> >  src/include/utils/backend_status.h  |  80 ++
> >  src/test/regress/expected/rules.out |   8 +
> >  16 files changed, 701 insertions(+), 31 deletions(-)
>
> This is a pretty large change, I wonder if there's a way to make it a bit more
> granular.
>

I have done this. See v15 patch set attached.

- Melanie


v15-0007-small-comment-correction.patch
Description: Binary data


v15-0004-Add-buffers-to-pgstat_reset_shared_counters.patch
Description: Binary data


v15-0006-Remove-superfluous-bgwriter-stats.patch
Description: Binary data


v15-0003-Send-IO-operations-to-stats-collector.patch
Description: Binary data


v15-0005-Add-system-view-tracking-IO-ops-per-backend-type.patch
Description: Binary data


v15-0002-Add-IO-operation-counters-to-PgBackendStatus.patch
Description: Binary data


v15-0001-Read-only-atomic-backend-write-function.patch
Description: Binary data


Re: XMAX_LOCK_ONLY and XMAX_COMMITTED (fk/multixact code)

2021-11-24 Thread Bossart, Nathan
On 11/23/21, 4:59 PM, "Mark Dilger"  wrote:
>> On Nov 23, 2021, at 4:51 PM, Bossart, Nathan  wrote:
>>
>> This is a good point.  Right now, you'd have to manually inspect the
>> infomask field to determine the exact form of the invalid combination.
>> My only worry with this is that we'd want to make sure these checks
>> stayed consistent with the definition of HEAP_XMAX_IS_LOCKED_ONLY in
>> htup_details.h.  I'm guessing HEAP_XMAX_IS_LOCKED_ONLY is unlikely to
>> change all that often, though.
>
> I expect that your patch will contain some addition to the amcheck (or 
> pg_amcheck) tests, so if we ever allow some currently disallowed bit 
> combination, we'd be reminded of the need to update amcheck.  So I'm not too 
> worried about that aspect of this.
>
> I prefer not to have to get a page (or full file) from a customer when the 
> check reports corruption.  Even assuming they are comfortable giving that, 
> which they may not be, it is an extra round trip to the customer asking for 
> stuff.

Another option we might consider is only checking for the
HEAP_XMAX_LOCK_ONLY bit instead of everything in
HEAP_XMAX_IS_LOCKED_ONLY.  IIUC everything else is only expected to
happen for upgrades from v9.2 or earlier, so it might be pretty rare
at this point.  Otherwise, I'll extract the exact bit pattern for the
error message as you suggest.

Nathan



internal subtransactions, memory contexts, resource owners

2021-11-24 Thread Chapman Flack
Hi,

By inspection of plperl and plpython, it looks like the canonical pattern
for a PL using internal subtransactions is:

save CurrentMemoryContext
save CurrentResourceOwner
BeginInternalSubTransaction
reimpose the saved memory context
// but not the saved resource owner

...
(RollbackAnd)?ReleaseCurrentSubTransaction
reimpose the saved memory context
and the saved resource owner


Therefore, during the subtransaction, its newly-established memory context
is accessible as CurTransactionMemoryContext, but the caller can still use
CurrentMemoryContext to refer to the same context it already expected.

By contrast, the newly established resource owner is both the
CurTransactionResourceOwner and the CurrentResourceOwner within the scope
of the subtransaction.

Is there more explanation of this pattern written somewhere than I have
managed to find, and in particular of the motivation for treating the memory
context and the resource owner in these nearly-but-not-quite matching
ways?

Regards,
-Chap




Re: Post-CVE Wishlist

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 3:14 PM Tom Lane  wrote:
> Robert Haas  writes:
> > On Wed, Nov 24, 2021 at 2:53 PM Tom Lane  wrote:
> >> One other point to be made here is that it seems like a stretch to call
> >> these particular bugs "high-severity".
>
> > Well, I was referring to the CVSS score, which was in the "high" range.
>
> The server CVE is; the client CVE, not so much, precisely because we
> couldn't find exciting exploits.

Right, I understand that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Post-CVE Wishlist

2021-11-24 Thread Tom Lane
Robert Haas  writes:
> On Wed, Nov 24, 2021 at 2:53 PM Tom Lane  wrote:
>> One other point to be made here is that it seems like a stretch to call
>> these particular bugs "high-severity".

> Well, I was referring to the CVSS score, which was in the "high" range.

The server CVE is; the client CVE, not so much, precisely because we
couldn't find exciting exploits.

regards, tom lane




Re: Post-CVE Wishlist

2021-11-24 Thread Jacob Champion
On Wed, 2021-11-24 at 14:01 -0500, Robert Haas wrote:
> The bar for actually ripping it out
> on an expedited time scale would be proving not only that it's broken
> in multiple ways, but that it's so badly broken that it can't be fixed
> with any reasonable amount of effort. And I just don't see one bug
> that had a pretty localized fix is really moving the needle as far as
> that burden of proof is concerned.

I'm not suggesting that we rip it out today, for the record. I agree
with you on where the bar is for that.

--Jacob


Re: Post-CVE Wishlist

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 2:53 PM Tom Lane  wrote:
> One other point to be made here is that it seems like a stretch to call
> these particular bugs "high-severity".

Well, I was referring to the CVSS score, which was in the "high" range.

> Given what we learned about
> the difficulty of exploiting the libpq bug, and the certainty that any
> other clients sharing the issue would have their own idiosyncrasies
> necessitating a custom-designed attack, I rather doubt that we're going
> to hear of anybody trying to exploit the issue in the field.

I don't know. The main thing that I find consoling is the fact that
most people probably have the libpq connection behind a firewall where
nasty people can't even connect to the port. But there are probably
exceptions.

> (By no means do I suggest that these bugs aren't worth fixing when we
> find them.  But so far they seem very easy to fix.  So moving mountains
> to design out just this one type of bug doesn't seem like a great use
> of our finite earth-moving capacity.)

I have enough trouble just moving the couch.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Post-CVE Wishlist

2021-11-24 Thread Jacob Champion
On Wed, 2021-11-24 at 14:03 -0500, Tom Lane wrote:
> > I don't buy the idea that, because we have fixed that particular
> > vulnerability, we've rendered this entire class of bugs "hypothetical".
> > There will be more code and more clients. There will always be bugs.
> > I'd rather the bugs that people write be in places that are less
> > security-critical.
> 
> Unless we actively remove the existing way of starting SSL encryption
> --- and GSS encryption, and anything else somebody proposes in future ---
> we are not going to be able to design out this class of bugs.

_We_ can't. I get that. But if this feature is introduced, new clients
will begin to have the option of designing it out of their code. And
DBAs will have the option of locking down their servers so that any new
bugs we introduce in the TLS-upgrade codepath will simply not affect
them.

The ecosystem has the option of transitioning faster than we can. And
then, some number of releases later, an entirely new conversation might
happen. (Or it might not.)

> Maybe
> we could start the process now in the hopes of making such a breaking
> change ten years down the road; but whether anyone will remember to
> pull the trigger then is doubtful, and even if we do remember, you can
> be dead certain it will still break some people's clients.

I am familiar with the "we didn't plant a tree 20 years ago, so we
shouldn't plant one now" line of argument. :D I hope it's not as
persuasive as it used to be.

> So I don't
> put much stock in the argument that this will make things more secure.
> (Ten years from now, SSL may be dead and replaced by something more
> secure against quantum computers.)

That would be great! But I suspect that if that happens, the new
argument will be "we can't upgrade our server to XQuantum-only! Look at
all these legacy SSL clients."

--Jacob


Re: Post-CVE Wishlist

2021-11-24 Thread Tom Lane
Robert Haas  writes:
> I think it would take an overwhelming amount of evidence to convince
> the project to remove support for the current method. One or even two
> or three high-severity bugs will probably not convince the project to
> do more than spend more studying that code and trying to tighten
> things up in a systematic way.

One other point to be made here is that it seems like a stretch to call
these particular bugs "high-severity".  Given what we learned about
the difficulty of exploiting the libpq bug, and the certainty that any
other clients sharing the issue would have their own idiosyncrasies
necessitating a custom-designed attack, I rather doubt that we're going
to hear of anybody trying to exploit the issue in the field.

(By no means do I suggest that these bugs aren't worth fixing when we
find them.  But so far they seem very easy to fix.  So moving mountains
to design out just this one type of bug doesn't seem like a great use
of our finite earth-moving capacity.)

regards, tom lane




Re: Split xlog.c

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 12:16 PM Heikki Linnakangas  wrote:
> And here's another rebase, now that Robert got rid of ReadRecPtr and
> EndRecPtr.

In general, I think 0001 is a good idea, but the comment that says
"Set the XLP_FIRST_IS_OVERWRITE_CONTRECORD flag on the page header"
seems to me to be telling the reader about what's already obvious
instead of explaining to them the thing they might have missed.
GetXLogBuffer() says that it's only safe to use if you hold a WAL
insertion lock and don't go backwards, and here you don't hold a WAL
insertion lock and I guess you're not going backwards only because
you're staying in exactly the same place? It seems to me that the only
reason this is safe is because, at the time this is called, only the
startup process is able to write WAL, and therefore the race condition
that would otherwise exist does not. Even then, I wonder what keeps
the buffer from being flushed after we return from XLogInsert() and
before we set the bit, and if the answer is that nothing prevents
that, whether that's OK. It might be good to talk about these issues
too.

Just to be clear, I'm not saying that I think the code is broken. But
I am concerned about someone using this as precedent for code that
runs in some other place, which would be highly likely to be broken,
and the way to avoid that is for the comment to explain the tricky
points.

Also, you've named the parameter to this new function so that it's
exactly the same as the global variable. I do approve of trying to
pass the value as a parameter instead of relying on a global variable,
and I wonder if you could find a way to remove the global variable
entirely. But if not, I think the function parameter and the global
variable should have different names, because otherwise it's easy for
anyone reading the code to get confused about which one is being
referenced in any particular spot, and it's also hard to grep.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Should rename "startup process" to something else?

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 1:54 PM Alvaro Herrera  wrote:
> I don't object to an underscore, but it looks a bit uglier to me.
> AFAIK the main problem with uuid-ossp was that it is used as an identifier, so
> it required quoting, which won't be the case with this process name.

I agree, and for that reason I would prefer no separator, or a space.
That's what we do with other processes, and I think it's fine.

It's worth thinking too about the fact that we may want to rename
functions, adjust comments, etc. Each of those things has their own
conventions. For example consider StartupProcessMain(). If we decide
to call this the WAL replay process, I suppose that is going to become
WALReplayMain(). For sure it's not going to be come WAL-Replay-Main().
But what displays in the 'ps' status should look like what we do in
other cases.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Should rename "startup process" to something else?

2021-11-24 Thread Tom Lane
Alvaro Herrera  writes:
> On 2021-Nov-23, Tom Lane wrote:
>> Bad memories of uuid-ossp float up ... can we use underscore?

> I don't object to an underscore, but it looks a bit uglier to me.
> AFAIK the main problem with uuid-ossp was that it is used as an identifier, so
> it required quoting, which won't be the case with this process name.

Well, we can call the process whatever we want, but we will need to
derive identifiers from that name, so you can't just ignore the issue.

regards, tom lane




Re: Post-CVE Wishlist

2021-11-24 Thread Tom Lane
Jacob Champion  writes:
> What I'm trying to convince you of is that this pattern of beginning a
> TLS conversation is known to be particularly error-prone, across
> multiple protocols and implementations. I think this is supported by
> the fact that at least three independent client libraries made changes
> in response to this Postgres CVE, a decade after the first writeup of
> this exact vulnerability.

Well, it's not clear that they didn't just copy libpq's buggy logic,
so I'm not sure how "independent" these bugs are.  I actually took
considerable comfort from the number of clients that *weren't*
vulnerable.

> I don't buy the idea that, because we have fixed that particular
> vulnerability, we've rendered this entire class of bugs "hypothetical".
> There will be more code and more clients. There will always be bugs.
> I'd rather the bugs that people write be in places that are less
> security-critical.

Unless we actively remove the existing way of starting SSL encryption
--- and GSS encryption, and anything else somebody proposes in future ---
we are not going to be able to design out this class of bugs.  Maybe
we could start the process now in the hopes of making such a breaking
change ten years down the road; but whether anyone will remember to
pull the trigger then is doubtful, and even if we do remember, you can
be dead certain it will still break some people's clients.  So I don't
put much stock in the argument that this will make things more secure.
(Ten years from now, SSL may be dead and replaced by something more
secure against quantum computers.)

The suggestion that we could remove one network roundtrip is worth
something.  And perhaps the argument about improving compatibility
with tools that know SSL but not the PG wire protocol (although
that one seems pretty unbacked by concrete facts).  Whether these
things make it worth the effort is dubious in my mind, but others
may evaluate that differently.  Note, however, that neither argument
impels us to break compatibility with existing clients.  That's a
far heavier price to pay, and basically I don't believe that we
are willing to pay it.

regards, tom lane




Re: Post-CVE Wishlist

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 1:29 PM Jacob Champion  wrote:
> What I'm trying to convince you of is that this pattern of beginning a
> TLS conversation is known to be particularly error-prone, across
> multiple protocols and implementations. I think this is supported by
> the fact that at least three independent client libraries made changes
> in response to this Postgres CVE, a decade after the first writeup of
> this exact vulnerability.
>
> - https://github.com/postgres/postgres/commit/160c0258802
> - https://github.com/pgbouncer/pgbouncer/commit/e4453c9151a
> - https://github.com/yandex/odyssey/commit/4e00bf797a

Sure, that's certainly true, but there are many programming patterns
that have well-known gotchas, and people still write programs that do
those things, debug them, and are satisfied with the results.
Beginning programming classes often cover the abuse of recursion using
fact() and fib() as examples. It's extremely easy to write a program
that concates a large number of strings with O(n^2) runtime, and tons
of people must have made that mistake. I've personally written fork()
bombs multiple times, sometimes unintentionally. None of that proves
that computing factorials or fibonacci numbers, concatenating strings,
or calling fork() are things that you just should not do. However, if
you do those things and make the classic errors, somebody's probably
going to think that you're kind of stupid. So here.

I think it would take an overwhelming amount of evidence to convince
the project to remove support for the current method. One or even two
or three high-severity bugs will probably not convince the project to
do more than spend more studying that code and trying to tighten
things up in a systematic way. Even if we did agree to move away from
it, we would mostly likely add support for the replacement method now
with a view to deprecating the old way in 6-10 years when existing
releases are out of support, which means we'd still need to fix all of
the bugs in the existing implementation, or at least all of the ones
discovered between now and then. The bar for actually ripping it out
on an expedited time scale would be proving not only that it's broken
in multiple ways, but that it's so badly broken that it can't be fixed
with any reasonable amount of effort. And I just don't see one bug
that had a pretty localized fix is really moving the needle as far as
that burden of proof is concerned.

And if the existing method is not going away, then adding a new method
just means that we have two things that can have bugs instead of one.
That might or might not be an advancement in usability or convenience,
but it certainly can't be less buggy.

--
Robert Haas
EDB: http://www.enterprisedb.com




Re: [PATCH] Partial foreign key updates in referential integrity triggers

2021-11-24 Thread Peter Eisentraut

On 05.01.21 22:40, Paul Martinez wrote:

CREATE TABLE tenants (id serial PRIMARY KEY);
CREATE TABLE users (
   tenant_id int REFERENCES tenants ON DELETE CASCADE,
   id serial,
   PRIMARY KEY (tenant_id, id),
);
CREATE TABLE posts (
 tenant_id int REFERENCES tenants ON DELETE CASCADE,
 id serial,
 author_id int,
 PRIMARY KEY (tenant_id, id),
 FOREIGN KEY (tenant_id, author_id) REFERENCES users ON DELETE SET NULL
);

INSERT INTO tenants VALUES (1);
INSERT INTO users VALUES (1, 101);
INSERT INTO posts VALUES (1, 201, 101);
DELETE FROM users WHERE id = 101;
ERROR:  null value in column "tenant_id" violates not-null constraint
DETAIL:  Failing row contains (null, 201, null).


I was looking through this example to see if it could be adapted for the 
documentation.


The way the users table is defined, it appears that "id" is actually 
unique and the primary key ought to be just (id).  The DELETE command 
you show also just uses the id column to find the user, which would be 
bad if the user id is not unique across tenants.  If the id were unique, 
then the foreign key from posts to users would just use the user id 
column and the whole problem of the ON DELETE SET NULL action would go 
away.  If the primary key of users is indeed supposed to be (tenant_id, 
id), then maybe the definition of the users.id column should not use 
serial, and the DELETE command should also look at the tenant_id column. 
 (The same question applies to posts.id.)


Also, you initially wrote that this is a denormalized schema.  I think 
if we keep the keys the way you show, then this isn't denormalized.  But 
if we considered users.id globally unique, then there would be 
normalization concerns.


What do you think?





Re: pgsql: xlog.c: Remove global variables ReadRecPtr and EndRecPtr.

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 1:51 PM Alvaro Herrera  wrote:
> > There is a buildfarm failure on lapwing which looks likely to be
> > attributable to this commit, but I don't understand what has gone
> > wrong exactly. The error message that is reported is:
>
> No, this is an unrelated problem; Tom reported this yesterday also:
> https://postgr.es/m/45597.1637694...@sss.pgh.pa.us

Oh, OK. I couldn't see exactly what that could have to do with my
commit, but it happened right afterwards so I was hesitant to call it
a coincidence.

Thanks,

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Reduce function call costs on ELF platforms

2021-11-24 Thread Andres Freund
Hi,

On 2021-11-23 17:28:08 +0100, Peter Eisentraut wrote:
> On 22.11.21 23:32, Tom Lane wrote:
> > > The easier approach for this class of issues is to use the linker option
> > > -Bsymbolic.
> > I don't recall details, but we've previously rejected the idea of
> > trying to use -Bsymbolic widely; apparently it has undesirable
> > side-effects on some platforms.  See commit message for e3d77ea6b
> > (hopefully there's some detail in the email thread [1]).  It sounds
> > like you're not actually proposing that, but I thought it would be
> > a good idea to note the hazard here.
> 
> Also, IIRC, -Bsymbolic was once frowned upon by packaging policies, since it
> prevents use of LD_PRELOAD.  I'm not sure what the current thinking there
> is, however.

It doesn't break some (most?) of the uses of LD_PRELOAD. In particular, it
doesn't break things like replacing the malloc implementation. When do you
have a symbol that you want to override *inside* your library (executables
already bind to their own symbols at compile time)? I've seen that for
replacing buggy functions in closed source things, but that's about it?

Greetings,

Andres Freund




Re: Should rename "startup process" to something else?

2021-11-24 Thread Alvaro Herrera
On 2021-Nov-23, Tom Lane wrote:

> Alvaro Herrera  writes:
> > On 2021-Nov-20, Andrew Dunstan wrote:
> >> Maybe something along those lines but using a dash/hyphen would work:
> >> e.g. wal-replayer
> 
> > Yeah, the idea of a dash occurred to me too.
> 
> Bad memories of uuid-ossp float up ... can we use underscore?

I don't object to an underscore, but it looks a bit uglier to me.
AFAIK the main problem with uuid-ossp was that it is used as an identifier, so
it required quoting, which won't be the case with this process name.


Anyway, as a second and separate point, should we rename the other WAL-related
process names to use the same character?

walreceiver -> wal_receiver / wal-receiver
walsender -> wal_sender / wal-sender
walwriter -> wal_writer / wal-writer.

I assume we would *not* rename other existing processes that already
have spaces in their names, such as "autovacuum launcher" to
"autovacuum_launcher", as that would be a net loss IMO.

Currently, the process list in an idle primary server looks like 

USER PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
alvherre 1091970  0.0  0.0 199608 22252 pts/0S+   12:22   0:00 
/pgsql/install/master/bin/postmaster -p 55432
alvherre 1091983  0.0  0.0 199608  3644 ?Ss   12:22   0:00  \_ 
postgres: checkpointer 
alvherre 1091984  0.0  0.0 199608  5432 ?Ss   12:22   0:00  \_ 
postgres: background writer 
alvherre 1091986  0.0  0.0 199608  8956 ?Ss   12:22   0:00  \_ 
postgres: walwriter 
alvherre 1091987  0.0  0.0 200148  7988 ?Ss   12:22   0:00  \_ 
postgres: autovacuum launcher 
alvherre 1091988  0.0  0.0  54432  4156 ?Ss   12:22   0:00  \_ 
postgres: stats collector 
alvherre 1091989  0.0  0.0 200036  6416 ?Ss   12:22   0:00  \_ 
postgres: logical replication launcher 

I think this looks better:

USER PID %CPU %MEMVSZ   RSS TTY  STAT START   TIME COMMAND
alvherre 1091970  0.0  0.0 199608 22252 pts/0S+   12:22   0:00 
/pgsql/install/master/bin/postmaster -p 55432
alvherre 1091983  0.0  0.0 199608  3644 ?Ss   12:22   0:00  \_ 
postgres: checkpointer 
alvherre 1091984  0.0  0.0 199608  5432 ?Ss   12:22   0:00  \_ 
postgres: background writer 
alvherre 1091986  0.0  0.0 199608  8956 ?Ss   12:22   0:00  \_ 
postgres: wal-writer 
alvherre 1091987  0.0  0.0 200148  7988 ?Ss   12:22   0:00  \_ 
postgres: autovacuum launcher 
alvherre 1091988  0.0  0.0  54432  4156 ?Ss   12:22   0:00  \_ 
postgres: stats collector 
alvherre 1091989  0.0  0.0 200036  6416 ?Ss   12:22   0:00  \_ 
postgres: logical replication launcher 

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: pgsql: xlog.c: Remove global variables ReadRecPtr and EndRecPtr.

2021-11-24 Thread Alvaro Herrera
On 2021-Nov-24, Robert Haas wrote:

> On Wed, Nov 24, 2021 at 11:28 AM Robert Haas  wrote:
> > xlog.c: Remove global variables ReadRecPtr and EndRecPtr.
> > https://git.postgresql.org/pg/commitdiff/d2ddfa681db27a138acb63c8defa8cc6fa588922
> 
> There is a buildfarm failure on lapwing which looks likely to be
> attributable to this commit, but I don't understand what has gone
> wrong exactly. The error message that is reported is:

No, this is an unrelated problem; Tom reported this yesterday also:
https://postgr.es/m/45597.1637694...@sss.pgh.pa.us


-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"La espina, desde que nace, ya pincha" (Proverbio africano)




Re: pgsql: xlog.c: Remove global variables ReadRecPtr and EndRecPtr.

2021-11-24 Thread Tom Lane
Robert Haas  writes:
> 2021-11-24 17:07:41.428 UTC [1449:6] FATAL:  mismatching overwritten
> LSN 0/1FFE014 -> 0/1FFE000
> 2021-11-24 17:07:41.428 UTC [1449:7] CONTEXT:  WAL redo at 0/224
> for XLOG/OVERWRITE_CONTRECORD: lsn 0/1FFE014; time 2021-11-24
> 17:07:41.127049+00

> Beyond the fact that something bad has happened, I'm not sure what
> that is trying to tell me.

Pre-existing issue:

https://www.postgresql.org/message-id/45597.1637694259%40sss.pgh.pa.us

regards, tom lane




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Peter Geoghegan
On Wed, Nov 24, 2021 at 9:53 AM Alvaro Herrera  wrote:
> OK, this makes a lot more sense.  I wasn't aware of ae7291ac (and I
> wasn't aware of the significance of 8523492d either, but that's not
> really relevant here.)

Thanks for hearing me out about the significance of 8523492d.

Having the right formalisms seems to really matter here, because they
enable decoupling, which is generally very useful. This makes it easy
to understand (just for example) that index vacuuming and heap
vacuuming are just additive, optional steps (in principle) -- an idea
that will become even more important once we get Robert's pending TID
conveyor belt design. I believe that that design goes one step further
than what we have today, by making index vacuuming and heap vacuuming
occur in a distinct operation to VACUUM proper (VACUUM would only need
to set up the LP_DEAD item list for index vacuuming and heap
vacuuming, which may or may not happen immediately after).

An interesting question (at least to me) is: within a non-aggressive
VACUUM, what remaining steps are *not* technically optional?

I am pretty sure that they're all optional in principle (or will be
soon), because soon we will be able to independently advance
relfrozenxid without freezing all tuples with XIDs before our original
FreezeLimit (FreezeLimit should only be used to decide which tuples to
freeze, not to decide on a new relfrozenxid). Soon almost everything
will be decoupled, without changing the basic invariants that we've
had for many years. This flexibility seems really important to me.

That just leaves avoiding pruning without necessarily avoiding
ostensibly related processing for indexes. We can already
independently prune without doing index/heap vacuuming (the bypass
indexes optimization). We will also be able to do the opposite thing,
with my new patch: we can perform index/heap vacuuming *without*
pruning ourselves. This makes sense in the case where we cannot
acquire a cleanup lock on a heap page with preexisting LP_DEAD items.

> I think we could say "LP_DEAD line pointer" and that would be perfectly
> clear.  Given how nuanced we have to be if we want to be clear about
> this, I would rather not use "LP_DEAD item"; that seems slightly
> contradictory, since the item is the storage and such a line pointer
> does not have storage.  Perhaps change that define in progress.h to
> PROGRESS_VACUUM_NUM_DEAD_LPS, and, in the first comment in vacuumlazy.c,
> use wording such as

I agree with all that, I think. But it's still not clear what the
variable dead_tuples should be renamed to within the structure that
you lay out (I imagine that you agree with me that dead_tuples is now
a bad name). This one detail affects more individual lines of code
than the restructuring of comments.

-- 
Peter Geoghegan




Re: Sequence's value can be rollback after a crashed recovery.

2021-11-24 Thread Tom Lane
Laurenz Albe  writes:
> On Tue, 2021-11-23 at 16:41 -0500, Tom Lane wrote:
>> A bit of polishing later, maybe like the attached.

> That looks good to me.

Pushed, thanks.

regards, tom lane




Re: BUFFERS enabled by default in EXPLAIN (ANALYZE)

2021-11-24 Thread Michael Christofides
I think it *should* be enabled for planning, since that makes the default
> easier to understand and document, and it makes a user's use of "explain"
> easier.


I’d be keen to see BUFFERS off by default with EXPLAIN, and on by default
with EXPLAIN ANALYZE.

The SUMMARY flag was implemented that way, which I think has been easy
enough for folks to understand and document. In fact, I think the only
BUFFERS information goes in the “summary” section for EXPLAIN (BUFFERS), so
maybe it makes perfect sense? If it would be great if that made
implementation easier, too.

In any case, thank you all, I’m so glad that this is being discussed again.
It’d be so good to start seeing buffers in more plans.

—
Michael


Re: Post-CVE Wishlist

2021-11-24 Thread Jacob Champion
On Wed, 2021-11-24 at 09:40 -0500, Robert Haas wrote:
> I am not really persuaded by Jacob's argument that, had this only
> worked the other way from the start, this bug wouldn't have occurred.
> That's just a tautology, because we can only have bugs in the code we
> write, not the code we didn't write. So perhaps we would have just had
> some other bug, which might have been more or less serious than the
> one we actually had. It's hard to say, really, because the situation
> is hypothetical.

I'm not trying to convince you that there wouldn't have been bugs.
There will always be bugs.

What I'm trying to convince you of is that this pattern of beginning a
TLS conversation is known to be particularly error-prone, across
multiple protocols and implementations. I think this is supported by
the fact that at least three independent client libraries made changes
in response to this Postgres CVE, a decade after the first writeup of
this exact vulnerability.

- https://github.com/postgres/postgres/commit/160c0258802
- https://github.com/pgbouncer/pgbouncer/commit/e4453c9151a
- https://github.com/yandex/odyssey/commit/4e00bf797a

I don't buy the idea that, because we have fixed that particular
vulnerability, we've rendered this entire class of bugs "hypothetical".
There will be more code and more clients. There will always be bugs.
I'd rather the bugs that people write be in places that are less
security-critical.

> This argument doesn't answer the question of whether speaking pure SSL
> on a separate port is better or worse than having a single port that
> does either. If I had to guess, the latter is more convenient for
> users but less convenient to code. I don't even see a compelling
> reason why we can't support multiple models here, supposing someone is
> willing to do the work and fix the bugs that result.

I only have experience in the area of HTTP(S), which supports three
models of plaintext-only, plaintext-upgrade-to-TLS (which is rare in
practice), and implicit-TLS. I'm not aware of mainstream efforts to mix
plaintext and implicit-TLS traffic on the same HTTP port -- but there
are projects that fill that niche [1] -- so I don't know what security
issues might arise from that approach.

--Jacob

[1] https://github.com/mscdex/httpolyglot


Re: pgsql: xlog.c: Remove global variables ReadRecPtr and EndRecPtr.

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 11:28 AM Robert Haas  wrote:
> xlog.c: Remove global variables ReadRecPtr and EndRecPtr.
> https://git.postgresql.org/pg/commitdiff/d2ddfa681db27a138acb63c8defa8cc6fa588922

There is a buildfarm failure on lapwing which looks likely to be
attributable to this commit, but I don't understand what has gone
wrong exactly. The error message that is reported is:

[17:07:19] t/025_stuck_on_old_timeline.pl ... ok 6048 ms
# poll_query_until timed out executing this query:
# SELECT '0/201E8E0'::pg_lsn <= pg_last_wal_replay_lsn()
# expecting this output:
# t
# last actual query output:
#
# with stderr:
# psql: error: connection to server on socket
"/tmp/SrZekgRh7K/.s.PGSQL.57959" failed: No such file or directory
# Is the server running locally and accepting connections on that socket?

This output confused me for a while because I thought the failure was
in test 025, but I should have realized that the problem must be in
026, since 025 finished with "ok". Anyway, in
026_overwrite_contrecord_standby.log there's this:

2021-11-24 17:07:41.388 UTC [1473:1] LOG:  started streaming WAL from
primary at 0/200 on timeline 1
2021-11-24 17:07:41.428 UTC [1449:6] FATAL:  mismatching overwritten
LSN 0/1FFE014 -> 0/1FFE000
2021-11-24 17:07:41.428 UTC [1449:7] CONTEXT:  WAL redo at 0/224
for XLOG/OVERWRITE_CONTRECORD: lsn 0/1FFE014; time 2021-11-24
17:07:41.127049+00

Beyond the fact that something bad has happened, I'm not sure what
that is trying to tell me.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Alvaro Herrera
On 2021-Nov-24, Peter Geoghegan wrote:

> TIDs (ItemPointerData structs) are of course not the same thing as
> line pointers (ItemIdData structs). There is a tendency to refer to
> the latter as "item pointers" all the same, which was confusing. I
> personally corrected/normalized this in commit ae7291ac in 2019. I
> think that it's worth being careful about precisely because they're
> closely related (but distinct) concepts. And so FWIW "LP_DEAD item
> pointer" is not a thing. I agree that an LP_DEAD item pointer has no
> tuple storage, and so you could say that it points to nothing (though
> only in heapam). I probably would just say that it has no tuple
> storage, though.

OK, this makes a lot more sense.  I wasn't aware of ae7291ac (and I
wasn't aware of the significance of 8523492d either, but that's not
really relevant here.)

> I agree with others that the term "item" is vague, but I don't think
> that that's necessarily a bad thing here -- I deliberately changed the
> comments to say either "TIDs" or "LP_DEAD items", emphasizing whatever
> the important aspect seemed to be in each context (they're LP_DEAD
> items to the heap structure, TIDs to index structures).

I think we could say "LP_DEAD line pointer" and that would be perfectly
clear.  Given how nuanced we have to be if we want to be clear about
this, I would rather not use "LP_DEAD item"; that seems slightly
contradictory, since the item is the storage and such a line pointer
does not have storage.  Perhaps change that define in progress.h to
PROGRESS_VACUUM_NUM_DEAD_LPS, and, in the first comment in vacuumlazy.c,
use wording such as

+ * The major space usage for LAZY VACUUM is storage for the array of TIDs
+ * of dead line pointers that are to be removed from indexes.

or

+ * The major space usage for LAZY VACUUM is storage for the array of TIDs
+ * of LP_DEAD line pointers that are to be removed from indexes.

(The point being that TIDs are not dead themselves, only the line
pointers that they refer to.)

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Most hackers will be perfectly comfortable conceptualizing users as entropy
 sources, so let's move on."   (Nathaniel Smith)




Re: Python 3.11 vs. Postgres

2021-11-24 Thread Tom Lane
Peter Eisentraut  writes:
> On 24.11.21 04:07, Tom Lane wrote:
>> According to [1], we need to stop including Python's .

> See attached patch.  The minimum Python version for this change is 2.4, 
> which is the oldest version supported by PG10, so we can backpatch this 
> to all live branches.

LGTM.  Tested with v10 and prairiedog's Python 2.4.1.

regards, tom lane




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Peter Geoghegan
On Wed, Nov 24, 2021 at 7:16 AM Alvaro Herrera  wrote:
> Sorry to reply to myself, but I realized that I forgot to return to the
> main point of this thread.  If we agree that "an LP_DEAD item pointer
> does not point to any item" (an assertion that gives a precise meaning
> to both those terms), then a patch that renames "tuples" to "items" is
> not doing anything useful IMO, because those two terms are synonyms.

TIDs (ItemPointerData structs) are of course not the same thing as
line pointers (ItemIdData structs). There is a tendency to refer to
the latter as "item pointers" all the same, which was confusing. I
personally corrected/normalized this in commit ae7291ac in 2019. I
think that it's worth being careful about precisely because they're
closely related (but distinct) concepts. And so FWIW "LP_DEAD item
pointer" is not a thing. I agree that an LP_DEAD item pointer has no
tuple storage, and so you could say that it points to nothing (though
only in heapam). I probably would just say that it has no tuple
storage, though.

> Now maybe Peter doesn't agree with the definitions I suggest, in which
> case I would like to know what his definitions are.

I agree with others that the term "item" is vague, but I don't think
that that's necessarily a bad thing here -- I deliberately changed the
comments to say either "TIDs" or "LP_DEAD items", emphasizing whatever
the important aspect seemed to be in each context (they're LP_DEAD
items to the heap structure, TIDs to index structures).

I'm not attached to the term "item". To me the truly important point
is what these items are *not*: they're not tuples. The renaming is
intended to enforce the concepts that I went into at the end of the
commit message for commit 8523492d. Now the pruning steps in
lazy_scan_prune always avoiding keeping around a DEAD tuple with tuple
storage on return to lazy_scan_heap (only LP_DEAD items can remain),
since (as of that commit) lazy_scan_prune alone is responsible for
things involving the "logical database".

This means that index vacuuming and heap vacuuming can now be thought
of as removing garbage items from physical data structures (they're
purely "physical database" concepts), and nothing else. They don't
need recovery conflicts. How could they? Where are you supposed to get
the XIDs for that from, when you've only got LP_DEAD items?

This is also related to the idea that pruning by VACUUM isn't
necessarily all that special compared to earlier pruning or concurrent
opportunistic pruning. As I go into on the other recent thread on
removing special cases in vacuumlazy.c, ISTM that we ought to do
everything except pruning itself (and freezing tuples, which
effectively depends on pruning) without even acquiring a cleanup lock.
Which is actually quite a lot of things.

-- 
Peter Geoghegan




Re: Minor documentation fix - missing blank space

2021-11-24 Thread Heikki Linnakangas

On 24/11/2021 16:49, Japin Li wrote:

When I read the documentation about Overview of PostgreSQL Internals - Executor 
[1],
I find there is a missing space in the documentation.

diff --git a/doc/src/sgml/arch-dev.sgml b/doc/src/sgml/arch-dev.sgml
index 7aff059e82..c2be28fac8 100644
--- a/doc/src/sgml/arch-dev.sgml
+++ b/doc/src/sgml/arch-dev.sgml
@@ -559,7 +559,7 @@
  A simple INSERT ... VALUES command creates a
  trivial plan tree consisting of a single Result
  node, which computes just one result row, feeding that up
-toModifyTable to perform the insertion.
+to ModifyTable to perform the insertion.
 
  



Applied, thanks!

- Heikki





Re: xlog.c: removing ReadRecPtr and EndRecPtr

2021-11-24 Thread Robert Haas
On Tue, Nov 23, 2021 at 1:36 PM Robert Haas  wrote:
> Attached please find the test case not for commit as
> v2-0001-dubious-test-case.patch; the fix, for commit and now with a
> proper commit message as
> v2-0002-Fix-corner-case-failure-to-detect-improper-timeli.patch; and
> the back-ported test case for v14 as v14-0001-dubious-test-case.patch
> in case anyone wants to play around with that. (with apologies for
> using the idea of a version number in two different and conflicting
> senses)

OK, I have committed this patch, rebased the original combined patch
over top of that, and committed that too. By my count that means I've
now removed a total of 3 global variables from this file and Amul got
rid of one as well so that makes 4 in total. If we continue at this
brisk pace the code may become understandable and maintainable
sometime prior to the heat death of the universe, which I think would
be rather nice.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Synchronizing slots from primary to standby

2021-11-24 Thread Dimitri Fontaine
Hi all,

Peter Eisentraut  writes:
> I want to reactivate $subject.  I took Petr Jelinek's patch from [0],
> rebased it, added a bit of testing.  It basically works, but as 
> mentioned in [0], there are various issues to work out.

Thanks for working on that topic, I believe it's an important part of
Postgres' HA story.

> The idea is that the standby runs a background worker to periodically fetch
> replication slot information from the primary.  On failover, a logical
> subscriber would then ideally find up-to-date replication slots on the new
> publisher and can just continue normally.

Is there a case to be made about doing the same thing for physical
replication slots too?

That's what pg_auto_failover [1] does by default: it creates replication
slots on every node for every other node, in a way that a standby
Postgres instance now maintains a replication slot for the primary. This
ensures that after a promotion, the standby knows to retain any and all
WAL segments that the primary might need when rejoining, at pg_rewind
time.

> The previous thread didn't have a lot of discussion, but I have gathered
> from off-line conversations that there is a wider agreement on this 
> approach.  So the next steps would be to make it more robust and
> configurable and documented.

I suppose part of the configuration would then include taking care of
physical slots. Some people might want to turn that off and use the
Postgres 13+ ability to use the remote primary restore_command to fetch
missing WAL files, instead. Well, people who have setup an archiving
system, anyway.

> As I said, I added a small test case to 
> show that it works at all, but I think a lot more tests should be added.   I
> have also found that this breaks some seemingly unrelated tests in the
> recovery test suite.  I have disabled these here.  I'm not sure if the patch
> actually breaks anything or if these are just differences in timing or
> implementation dependencies.  This patch adds a LIST_SLOTS replication
> command, but I think this could be replaced with just a SELECT FROM
> pg_replication_slots query now.  (This patch is originally older than when
> you could run SELECT queries over the replication protocol.)

Given the admitted state of the patch, I didn't focus on tests. I could
successfully apply the patch on-top of current master's branch, and
cleanly compile and `make check`.

Then I also updated pg_auto_failover to support Postgres 15devel [2] so
that I could then `make NODES=3 cluster` there and play with the new
replication command:

  $ psql -d "port=5501 replication=1" -c "LIST_SLOTS;"
  psql:/Users/dim/.psqlrc:24: ERROR:  XX000: cannot execute SQL commands in WAL 
sender for physical replication
  LOCATION:  exec_replication_command, walsender.c:1830
  ...

I'm not too sure about this idea of running SQL in a replication
protocol connection that you're mentioning, but I suppose that's just me
needing to brush up on the topic.

> So, again, this isn't anywhere near ready, but there is already a lot here
> to gather feedback about how it works, how it should work, how to configure
> it, and how it fits into an overall replication and HA architecture.

Maybe the first question about configuration would be about selecting
which slots a standby should maintain from the primary. Is it all of the
slots that exists on both the nodes, or a sublist of that?

Is it possible to have a slot with the same name on a primary and a
standby node, in a way that the standby's slot would be a completely
separate entity from the primary's slot? If yes (I just don't know at
the moment), well then, should we continue to allow that?

Other aspects of the configuration might include a list of databases in
which to make the new background worker active, and the polling delay,
etc.

Also, do we want to even consider having the slot management on a
primary node depend on the ability to sync the advancing on one or more
standby nodes? I'm not sure to see that one as a good idea, but maybe we
want to kill it publically very early then ;-)

Regards,
-- 
dim

Author of “The Art of PostgreSQL”, see https://theartofpostgresql.com

[1]: https://github.com/citusdata/pg_auto_failover
[2]: https://github.com/citusdata/pg_auto_failover/pull/838




Re: psql - add SHOW_ALL_RESULTS option

2021-11-24 Thread Fabien COELHO



Hello Daniel,


This patch still has a few of the problems reported earlier this year.


The patch fails to apply and the thread seems to have taken a nap.


I'm not napping:-) I just do not have enough time available this month. I 
intend to work on the patch in the next CF (January). AFAICR there is one 
necessary rebase and one bug to fix.


--
Fabien.




Re: pg_replslotdata - a tool for displaying replication slot information

2021-11-24 Thread Japin Li


On Wed, 24 Nov 2021 at 23:59, Bharath Rupireddy 
 wrote:
> On Wed, Nov 24, 2021 at 9:09 PM Bharath Rupireddy
>> > Thoughts?
>>
>> Attaching the rebased v2 patch.
>
> On windows the previous patches were failing, fixed that in the v3
> patch. I'm really sorry for the noise.
>

Cool!  When I try to use it, there is an error for -v, --verbose option.

px@ubuntu:~/Codes/postgres/Debug$ pg_replslotdata -v
pg_replslotdata: invalid option -- 'v'
Try "pg_replslotdata --help" for more information.

This is because the getopt_long() missing 'v' in the third parameter.

while ((c = getopt_long(argc, argv, "D:v", long_options, NULL)) != -1)

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: pg_replslotdata - a tool for displaying replication slot information

2021-11-24 Thread Bharath Rupireddy
On Wed, Nov 24, 2021 at 9:09 PM Bharath Rupireddy
 wrote:
>
> On Tue, Nov 23, 2021 at 10:39 AM Bharath Rupireddy
>  wrote:
> >
> > Hi,
> >
> > The replication slots data is stored in binary format on the disk under the 
> > pg_replslot/<> directory which isn't human readable. If the 
> > server is crashed/down (for whatever reasons) and unable to come up, 
> > currently there's no way for the user/admin/developer to know what were all 
> > the replication slots available at the time of server crash/down to figure 
> > out what's the restart lsn, xid, two phase info or types of slots etc.
> >
> > pg_replslotdata is a tool that interprets the replication slots information 
> > and displays it onto the stdout even if the server is crashed/down. The 
> > design of this tool is similar to other tools available in the core today 
> > i.e. pg_controldata, pg_waldump.
> >
> > Attaching initial patch herewith. I will improve it with documentation and 
> > other stuff a bit later.
> >
> > Please see the attached picture for the sample output.
> >
> > Thoughts?
>
> Attaching the rebased v2 patch.

On windows the previous patches were failing, fixed that in the v3
patch. I'm really sorry for the noise.

Regards,
Bharath Rupireddy.


v3-0001-pg_replslotdata.patch
Description: Binary data


Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 9:51 AM Alvaro Herrera  wrote:
> On 2021-Nov-24, Robert Haas wrote:
> > Hmm. I think in my model an item and an item pointer and a line
> > pointer are all the same thing, but a TID is different. When I talk
> > about a TID, I mean the location of an item pointer, not its contents.
> > So a TID is what tells me that I want block 5 and the 4th slot in the
> > item pointer array. The item pointer tells me that the associate tuple
> > is at a certain position in the page and has a certain length.
>
> OK, but you can have item pointers that don't have any item.
> LP_REDIRECT, LP_DEAD, LP_UNUSED item pointers don't have items.

I guess so. I said before that I thought an item and an item pointer
were the same, but on reflection, that doesn't entirely make sense.
But I don't know that I like making item and tuple synonymous either.
I think perhaps the term "item" by itself is not very clear.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: pg_replslotdata - a tool for displaying replication slot information

2021-11-24 Thread Bharath Rupireddy
On Tue, Nov 23, 2021 at 10:39 AM Bharath Rupireddy
 wrote:
>
> Hi,
>
> The replication slots data is stored in binary format on the disk under the 
> pg_replslot/<> directory which isn't human readable. If the server 
> is crashed/down (for whatever reasons) and unable to come up, currently 
> there's no way for the user/admin/developer to know what were all the 
> replication slots available at the time of server crash/down to figure out 
> what's the restart lsn, xid, two phase info or types of slots etc.
>
> pg_replslotdata is a tool that interprets the replication slots information 
> and displays it onto the stdout even if the server is crashed/down. The 
> design of this tool is similar to other tools available in the core today 
> i.e. pg_controldata, pg_waldump.
>
> Attaching initial patch herewith. I will improve it with documentation and 
> other stuff a bit later.
>
> Please see the attached picture for the sample output.
>
> Thoughts?

Attaching the rebased v2 patch.

Regards,
Bharath Rupireddy.


v2-0001-pg_replslotdata.patch
Description: Binary data


Re: Windows build warnings

2021-11-24 Thread Dagfinn Ilmari Mannsåker
Tom Lane  writes:

> =?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
>> Should we change the compiler checks for attributes in c.h to include
>> `|| __has_attribute(…)`, so that we automatically get them on compilers
>> that support that (particularly clang)?
>
> clang already #defines GCC, no?


__GNUC__, but yes, I didn't realise that.  Clang 11 seems to claim to be
GCC 4.2 by default, but that can be overridden usng the -fgnuc-version
(and turned off by setting it to zero).

Do any other compilers support __has_attribute()?

>   regards, tom lane

- ilmari




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Alvaro Herrera
On 2021-Nov-24, Alvaro Herrera wrote:

> On 2021-Nov-24, Robert Haas wrote:
> 
> > Hmm. I think in my model an item and an item pointer and a line
> > pointer are all the same thing, but a TID is different. When I talk
> > about a TID, I mean the location of an item pointer, not its contents.
> > So a TID is what tells me that I want block 5 and the 4th slot in the
> > item pointer array. The item pointer tells me that the associate tuple
> > is at a certain position in the page and has a certain length.
> 
> OK, but you can have item pointers that don't have any item.
> LP_REDIRECT, LP_DEAD, LP_UNUSED item pointers don't have items.

Sorry to reply to myself, but I realized that I forgot to return to the
main point of this thread.  If we agree that "an LP_DEAD item pointer
does not point to any item" (an assertion that gives a precise meaning
to both those terms), then a patch that renames "tuples" to "items" is
not doing anything useful IMO, because those two terms are synonyms.

Now maybe Peter doesn't agree with the definitions I suggest, in which 
case I would like to know what his definitions are.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"How strange it is to find the words "Perl" and "saner" in such close
proximity, with no apparent sense of irony. I doubt that Larry himself
could have managed it." (ncm, http://lwn.net/Articles/174769/)




Re: Windows build warnings

2021-11-24 Thread Tom Lane
=?utf-8?Q?Dagfinn_Ilmari_Manns=C3=A5ker?=  writes:
> Should we change the compiler checks for attributes in c.h to include
> `|| __has_attribute(…)`, so that we automatically get them on compilers
> that support that (particularly clang)?

clang already #defines GCC, no?

regards, tom lane




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Alvaro Herrera
On 2021-Nov-24, Robert Haas wrote:

> Hmm. I think in my model an item and an item pointer and a line
> pointer are all the same thing, but a TID is different. When I talk
> about a TID, I mean the location of an item pointer, not its contents.
> So a TID is what tells me that I want block 5 and the 4th slot in the
> item pointer array. The item pointer tells me that the associate tuple
> is at a certain position in the page and has a certain length.

OK, but you can have item pointers that don't have any item.
LP_REDIRECT, LP_DEAD, LP_UNUSED item pointers don't have items.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
"Nunca confiaré en un traidor.  Ni siquiera si el traidor lo he creado yo"
(Barón Vladimir Harkonnen)




Minor documentation fix - missing blank space

2021-11-24 Thread Japin Li


Hi,

When I read the documentation about Overview of PostgreSQL Internals - Executor 
[1],
I find there is a missing space in the documentation.

diff --git a/doc/src/sgml/arch-dev.sgml b/doc/src/sgml/arch-dev.sgml
index 7aff059e82..c2be28fac8 100644
--- a/doc/src/sgml/arch-dev.sgml
+++ b/doc/src/sgml/arch-dev.sgml
@@ -559,7 +559,7 @@
 A simple INSERT ... VALUES command creates a
 trivial plan tree consisting of a single Result
 node, which computes just one result row, feeding that up
-toModifyTable to perform the insertion.
+to ModifyTable to perform the insertion.

 
   


[1] https://www.postgresql.org/docs/14/executor.html

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 9:37 AM Alvaro Herrera  wrote:
> My mental model is that "tuple" (in the narrow context of heap vacuum)
> is the variable-size on-disk representation of a row in a page; "line
> pointer" is the fixed-size struct at the bottom of each page that
> contains location, size and flags of a tuple: struct ItemIdData.  The
> TID is the address of a line pointer -- an ItemPointerData.
>
> What is an item?  Is an item the same as a line pointer?  That seems
> confusing.  I think "item" means the tuple as a whole.  In that light,
> using the term TID for some of the things that the patch renames to
> "item" seems more appropriate.

Hmm. I think in my model an item and an item pointer and a line
pointer are all the same thing, but a TID is different. When I talk
about a TID, I mean the location of an item pointer, not its contents.
So a TID is what tells me that I want block 5 and the 4th slot in the
item pointer array. The item pointer tells me that the associate tuple
is at a certain position in the page and has a certain length.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Post-CVE Wishlist

2021-11-24 Thread Robert Haas
On Tue, Nov 23, 2021 at 5:41 PM Heikki Linnakangas  wrote:
> All that said, I'm not sure how serious I am about this. I think it
> would work, and it wouldn't even be very complicated, but it feels
> hacky, and that's not a good thing with anything security related. And
> the starttls-style negotiation isn't that bad, really. I'm inclined to
> do nothing I guess. Thoughts?

I am not really persuaded by Jacob's argument that, had this only
worked the other way from the start, this bug wouldn't have occurred.
That's just a tautology, because we can only have bugs in the code we
write, not the code we didn't write. So perhaps we would have just had
some other bug, which might have been more or less serious than the
one we actually had. It's hard to say, really, because the situation
is hypothetical.

But on reflection, one thing that isn't very nice about the current
approach is that it won't work with anything that doesn't support the
PostgreSQL wire protocol specifically. Imagine that you have a driver
for PostgreSQL that for some reason does not support SSL, but you want
to use SSL to talk to the server. You cannot stick a generic proxy
that speaks plaintext on one side and SSL on the other side between
that driver and the server and have it work. You will need something
that knows how to proxy the PostgreSQL protocol specifically, and that
will probably end up being higher-overhead than a generic proxy. There
are all sorts of other variants of this scenario, and one of them is
probably the motivation behind the request for proxy protocol support.
I don't use these kinds of software myself, but I think a lot of
people do, and it wouldn't be a bad thing if we could be
"plug-compatible" with things that people on the Internet want to do,
without needing a PostgreSQL-specific adapter. SSL is certainly one of
those things.

This argument doesn't answer the question of whether speaking pure SSL
on a separate port is better or worse than having a single port that
does either. If I had to guess, the latter is more convenient for
users but less convenient to code. I don't even see a compelling
reason why we can't support multiple models here, supposing someone is
willing to do the work and fix the bugs that result.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Alvaro Herrera
On 2021-Nov-24, Robert Haas wrote:

> On Wed, Nov 24, 2021 at 7:48 AM Masahiko Sawada  wrote:
> > I think it's more consistent if we change it to one side. I prefer
> > "dead items".
> 
> I feel like "items" is quite a generic word, so I think I would prefer
> TIDs. But it's probably not a big deal.

Is there clarity on what each term means? 

Since this patch only changes things that are specific to heap
vacuuming, it seems OK to rely the convention that "item" means "heap
item" (not just any generic item).  However, I'm not sure that we fully
agree exactly what a heap item is.  Maybe if we agree to a single non
ambiguous definition for each of those terms we can agree what
terminology to use.

It seems to me we have the following terms:

- tuple
- line pointer
- [heap] item
- TID

My mental model is that "tuple" (in the narrow context of heap vacuum)
is the variable-size on-disk representation of a row in a page; "line
pointer" is the fixed-size struct at the bottom of each page that
contains location, size and flags of a tuple: struct ItemIdData.  The
TID is the address of a line pointer -- an ItemPointerData.

What is an item?  Is an item the same as a line pointer?  That seems
confusing.  I think "item" means the tuple as a whole.  In that light,
using the term TID for some of the things that the patch renames to
"item" seems more appropriate.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Parallel vacuum workers prevent the oldest xmin from advancing

2021-11-24 Thread John Naylor
On Wed, Nov 17, 2021 at 7:28 AM Amit Kapila  wrote:
>
> > The patch looks good to me. But I can't come up with a stable test for
> > this. It seems to be hard without stopping and resuming parallel
> > vacuum workers. Do you have any good idea?
> >
>
> No, let's wait for a day or so to see if anybody else has any ideas to
> write a test for this case, otherwise, I'll check these once again and
> push.

I set this "committed" in the CF app.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Robert Haas
On Wed, Nov 24, 2021 at 7:48 AM Masahiko Sawada  wrote:
> I think it's more consistent if we change it to one side. I prefer "dead 
> items".

I feel like "items" is quite a generic word, so I think I would prefer
TIDs. But it's probably not a big deal.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Support for NSS as a libpq TLS backend

2021-11-24 Thread Joshua Brindle
On Wed, Nov 24, 2021 at 8:46 AM Joshua Brindle
 wrote:
>
> On Wed, Nov 24, 2021 at 6:59 AM Daniel Gustafsson  wrote:
> >
> > > On 23 Nov 2021, at 23:39, Joshua Brindle  
> > > wrote:
> >
> > > It no longer happens with v49, since it was a null deref of the pr_fd
> > > which no longer happens.
> > >
> > > I'll continue testing now, so far it's looking better.
> >
> > Great, thanks for confirming.  I'm still keen on knowing how you triggered 
> > the
> > segfault so I can ensure there are no further bugs around there.
> >
>
> It happened when I ran psql with hostssl on the server but before I'd
> initialized my client certificate store.

I don't know enough about NSS to know if this is problematic or not
but if I try verify-full without having the root CA in the certificate
store I get:

$ /usr/pgsql-15/bin/psql "host=localhost sslmode=verify-full user=postgres"
psql: error: SSL error: Issuer certificate is invalid.
unable to shut down NSS context: NSS could not shutdown. Objects are
still in use.




Re: Support for NSS as a libpq TLS backend

2021-11-24 Thread Joshua Brindle
On Wed, Nov 24, 2021 at 6:59 AM Daniel Gustafsson  wrote:
>
> > On 23 Nov 2021, at 23:39, Joshua Brindle  
> > wrote:
>
> > It no longer happens with v49, since it was a null deref of the pr_fd
> > which no longer happens.
> >
> > I'll continue testing now, so far it's looking better.
>
> Great, thanks for confirming.  I'm still keen on knowing how you triggered the
> segfault so I can ensure there are no further bugs around there.
>

It happened when I ran psql with hostssl on the server but before I'd
initialized my client certificate store.




Re: row filtering for logical replication

2021-11-24 Thread Amit Kapila
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
>
> Attaching a new patchset v41 which includes changes by both Peter and myself.
>
> Patches v40-0005 and v40-0006 have been merged to create patch
> v41-0005 which reduces the patches to 6 again.
> This patch-set contains changes addressing the following review comments:
>
> On Mon, Nov 15, 2021 at 5:48 PM Amit Kapila  wrote:
> >
> > What I meant was that with this new code we have regressed the old
> > behavior. Basically, imagine a case where no filter was given for any
> > of the tables. Then after the patch, we will remove all the old tables
> > whereas before the patch it will remove the oldrels only when they are
> > not specified as part of new rels. If you agree with this, then we can
> > retain the old behavior and for the new tables, we can always override
> > the where clause for a SET variant of command.
>
> Fixed and modified the behaviour to match with what the schema patch
> implemented.
>

+
+ /*
+ * If the new relation or the old relation has a where clause,
+ * we need to remove it so that it can be added afresh later.
+ */
+ if (RelationGetRelid(newpubrel->relation) == oldrelid &&
+ newpubrel->whereClause == NULL && rfisnull)

Can't we use _equalPublicationTable() here? It compares the whereClause as well.

Few more comments:
=
0001
1.
@@ -1039,10 +1081,11 @@ PublicationAddTables(Oid pubid, List *rels,
bool if_not_exists,
  {
  PublicationRelInfo *pub_rel = (PublicationRelInfo *) lfirst(lc);
  Relation rel = pub_rel->relation;
+ Oid relid = RelationGetRelid(rel);
  ObjectAddress obj;

  /* Must be owner of the table or superuser. */
- if (!pg_class_ownercheck(RelationGetRelid(rel), GetUserId()))
+ if (!pg_class_ownercheck(relid, GetUserId()))

Here, you can directly use RelationGetRelid as was used in the
previous code without using an additional variable.

0005
2.
+typedef struct {
+ Relation rel;
+ bool check_replident;
+ Bitmapset  *bms_replident;
+}
+rf_context;

Add rf_context in the same line where } ends.

3. In the function header comment of rowfilter_walker, you mentioned
the simple expressions allowed but we should write why we are doing
so. It has been discussed in detail in various emails in this thread.
AFAIR, below are the reasons:
A. We don't want to allow user-defined functions or operators because
(a) if the user drops such a function/operator or if there is any
other error via that function, the walsender won't be able to recover
from such an error even if we fix the function's problem because it
uses a historic snapshot to access row-filter; (b) any other table
could be accessed via a function which won't work because of historic
snapshots in logical decoding environment.

B. We don't allow anything other immutable built-in functions as those
can access database and would lead to the problem (b) mentioned in the
previous paragraph.

Don't we need to check for user-defined types similar to user-defined
functions and operators? If not why?

4.
+ * Rules: Node-type validation
+ * ---
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or

It seems Var Op Var is allowed. I tried below and it works:
create publication pub for table t1 where (c1 < c2) WITH (publish = 'insert');

I think it should be okay to allow it provided we ensure that we never
access some other table/view etc. as part of the expression. Also, we
should document the behavior correctly.

-- 
With Regards,
Amit Kapila.




Re: Rename dead_tuples to dead_items in vacuumlazy.c

2021-11-24 Thread Masahiko Sawada
On Wed, Nov 24, 2021 at 2:46 PM Peter Geoghegan  wrote:
>
> Attached patch performs polishing within vacuumlazy.c, as follow-up
> work to the refactoring work in Postgres 14. This mainly consists of
> changing references of dead tuples to dead items, which reflects the
> fact that VACUUM no longer deals with TIDs that might point to
> remaining heap tuples with storage -- the TIDs in the array must now
> strictly point to LP_DEAD stub line pointers that remain in the heap,
> following pruning.

+1

> If there are no objections, I'll move on this soon. It's mostly just
> mechanical changes.

The patch renames dead tuples to dead items at some places and  to
dead TIDs at some places. For instance, it renames dead tuples to dead
TIDs here:

- * Return the maximum number of dead tuples we can record.
+ * Computes the number of dead TIDs that VACUUM will have to store in the
+ * worst case, where all line pointers are allocated, and all are LP_DEAD

whereas renames to dead items here:

- * extra cost of bsearch(), especially if dead tuples on the heap are
+ * extra cost of bsearch(), especially if dead items on the heap are

I think it's more consistent if we change it to one side. I prefer "dead items".

---
There is one more place where we can rename "dead tuples":

/*
 * Allocate the space for dead tuples.  Note that this handles parallel
 * VACUUM initialization as part of allocating shared memory space used
 * for dead_items.
 */

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: VS2022: Support Visual Studio 2022 on Windows

2021-11-24 Thread Dave Page
Hi

On Wed, Nov 24, 2021 at 11:36 AM Michael Paquier 
wrote:

> On Wed, Nov 24, 2021 at 10:00:19AM +, Dave Page wrote:
> > It's extremely unlikely that we'd shift to such a new version for PG15.
> We
> > build many components aside from PostgreSQL, and need to use the same
> > toolchain for all of them (we've had very painful experiences with mix n
> > match CRT versions in the past) so it's not just PG that needs to support
> > VS2022 as far as we're concerned
>
> Yes, I can understand that upgrading the base version of VS used is a
> very difficult exercise.  I have been through that, on Windows for
> Postgres..  As well as for the compilation of all its dependencies.
>
> > - Perl, Python, TCL, MIT Kerberos,
> > OpenSSL, libxml2, libxslt etc. are all built with the same toolchain for
> > consistency.
>
> Dave, do you include LZ4 in 14?  Just asking, as a matter of
> curiosity.
>

Yes we do :-)

C:\Program Files\PostgreSQL\14\bin>pg_config
BINDIR = C:/PROGRA~1/POSTGR~1/14/bin
DOCDIR = C:/PROGRA~1/POSTGR~1/14/doc
HTMLDIR = C:/PROGRA~1/POSTGR~1/14/doc
INCLUDEDIR = C:/PROGRA~1/POSTGR~1/14/include
PKGINCLUDEDIR = C:/PROGRA~1/POSTGR~1/14/include
INCLUDEDIR-SERVER = C:/PROGRA~1/POSTGR~1/14/include/server
LIBDIR = C:/Program Files/PostgreSQL/14/lib
PKGLIBDIR = C:/Program Files/PostgreSQL/14/lib
LOCALEDIR = C:/PROGRA~1/POSTGR~1/14/share/locale
MANDIR = C:/Program Files/PostgreSQL/14/man
SHAREDIR = C:/PROGRA~1/POSTGR~1/14/share
SYSCONFDIR = C:/Program Files/PostgreSQL/14/etc
PGXS = C:/Program Files/PostgreSQL/14/lib/pgxs/src/makefiles/pgxs.mk
CONFIGURE = --enable-thread-safety --enable-nls --with-ldap
--with-ssl=openssl --with-uuid --with-libxml --with-libxslt --with-lz4
--with-icu --with-tcl --with-perl --with-python
CC = not recorded
CPPFLAGS = not recorded
CFLAGS = not recorded
CFLAGS_SL = not recorded
LDFLAGS = not recorded
LDFLAGS_EX = not recorded
LDFLAGS_SL = not recorded
LIBS = not recorded
VERSION = PostgreSQL 14.1

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


Re: Support for NSS as a libpq TLS backend

2021-11-24 Thread Daniel Gustafsson
> On 23 Nov 2021, at 23:39, Joshua Brindle  
> wrote:

> It no longer happens with v49, since it was a null deref of the pr_fd
> which no longer happens.
> 
> I'll continue testing now, so far it's looking better.

Great, thanks for confirming.  I'm still keen on knowing how you triggered the
segfault so I can ensure there are no further bugs around there.

--
Daniel Gustafsson   https://vmware.com/





Re: Teach pg_receivewal to use lz4 compression

2021-11-24 Thread Jeevan Ladhe
On Wed, Nov 24, 2021 at 10:55 AM Michael Paquier 
wrote:

> On Mon, Nov 22, 2021 at 09:02:47AM -0500, Robert Haas wrote:
> > On Mon, Nov 22, 2021 at 12:46 AM Jeevan Ladhe
> >  wrote:
> >> Fair enough. But, still I have a doubt in mind what benefit would that
> >> really bring to us here, because we are immediately also freeing the
> >> lz4buf without using it anywhere.
> >
> > Yeah, I'm also doubtful about that. If we're freeng the compression
> > context, we shouldn't need to guarantee that it's in any particular
> > state before doing so. Why would any critical cleanup be part of
> > LZ4F_compressEnd() rather than LZ4F_freeCompressionContext()? The
> > point of LZ4F_compressEnd() is to make sure all of the output bytes
> > get written, and it would be stupid to force people to write the
> > output bytes even when they've decided that they no longer care about
> > them due to some error.
>
> Hmm.  I have double-checked all that, and I agree that we could just
> skip LZ4F_compressEnd() in this error code path.  From what I can see
> in the upstream code, what we have now is not broken either, but the
> compressEnd() call does some work that's not needed here.


Yes I agree that we are not broken, but as you said we are doing some
an extra bit of work here.

Regards,
Jeevan Ladhe


Re: Skipping logical replication transactions on subscriber side

2021-11-24 Thread Masahiko Sawada
On Wed, Nov 17, 2021 at 8:14 PM Amit Kapila  wrote:
>
> On Tue, Nov 16, 2021 at 12:01 PM Masahiko Sawada  
> wrote:
> >
> > Right. I've fixed this issue and attached an updated patch.
> >
>
> Few comments/questions:
> =
> 1.
> +  
> +   The pg_stat_subscription_workers view will 
> contain
> +   one row per subscription error reported by workers applying logical
> +   replication changes and workers handling the initial data copy of the
> +   subscribed tables.  The statistics entry is removed when the subscription
> +   the worker is running on is removed.
> +  
>
> The last line of this paragraph is not clear to me. First "the" before
> "worker" in the following part of the sentence seems unnecessary
> "..when the subscription the worker..". Then the part "running on is
> removed" is unclear because it could also mean that we remove the
> entry when a subscription is disabled. Can we rephrase it to: "The
> statistics entry is removed when the corresponding subscription is
> dropped"?

Agreed. Fixed.

>
> 2.
> Between v20 and v23 versions of patch the size of hash table
> PGSTAT_SUBWORKER_HASH_SIZE is increased from 32 to 256. I might have
> missed the comment which lead to this change, can you point me to the
> same or if you changed it for some other reason, can you let me know
> the same?

I'd missed reverting this change. I considered increasing this value
since the lifetime of subscription is long. But when it comes to
unshared hashtable can be expanded on-the-fly, it's better to start
with a small value. Reverted.

>
> 3.
> +
> + /*
> + * Repeat for subscription workers.  Similarly, we needn't bother
> + * in the common case where no function stats are being collected.
> + */
>
> /function/subscription workers'

Fixed.

>
> 4.
> +  
> +   Name of command being applied when the error occurred.  This field
> +   is always NULL if the error was reported during the initial data
> +   copy.
> +  
> + 
> +
> + 
> +  
> +   xid xid
> +  
> +  
> +   Transaction ID of the publisher node being applied when the error
> +   occurred.  This field is always NULL if the error was reported
> +   during the initial data copy.
> +  
>
> Is it important to stress on 'always' in the above two descriptions?

No, removed.

>
> 5.
> The current description of first/last_error_time seems sliglthy
> misleading as one can interpret that these are about different errors.
> Let's slightly change the description of first/last_error_time as
> follows or something on those lines:
>
> 
> +  
> +   Time at which the first error occurred
> +  
> + 
>
> First time at which this error occurred
>
> last_error_time timestamp with time 
> zone
> +  
> +  
> +   Time at which the last error occurred
>
> Last time at which this error occurred. This will be the same as
> first_error_time except when the same error occurred more than once
> consecutively.

Changed. I've removed first_error_time as per discussion on the thread
for adding xact stats.

>
> 6.
> +
> +pg_stat_reset_subscription_worker (
> subid oid, 
> relid oid  )
> +void
> +   
> +   
> +Resets the statistics of a single subscription worker running on the
> +subscription with subid shown in the
> +pg_stat_subscription_worker view.  If the
> +argument relid is not NULL,
> +resets statistics of the subscription worker handling the initial 
> data
> +copy of the relation with relid.  Otherwise,
> +resets the subscription worker statistics of the main apply worker.
> +If the argument relid is omitted, resets the
> +statistics of all subscription workers running on the subscription
> +with subid.
> +   
>
> The first line of this description seems to indicate that we can only
> reset the stats of a single worker but the later part indicates that
> we can reset stats of all subscription workers. Can we change the
> first line as: "Resets the statistics of subscription workers running
> on the subscription with subid shown in the
> pg_stat_subscription_worker view.".
>

Changed.

> 7.
> pgstat_vacuum_stat()
> {
> ..
> + pgstat_setheader(_hdr, PGSTAT_MTYPE_SUBSCRIPTIONPURGE);
> + spmsg.m_databaseid = MyDatabaseId;
> + spmsg.m_nentries = 0;
> ..
> }
>
> Do we really need to set the header here? It seems to be getting set
> in pgstat_send_subscription_purge() while sending this message.

Removed.

>
> 8.
> pgstat_vacuum_stat()
> {
> ..
> +
> + if (hash_search(htab, (void *) &(subwentry->key.subid), HASH_FIND, NULL)
> + != NULL)
> + continue;
> +
> + /* This subscription is dead, add the subid to the message */
> + spmsg.m_subids[spmsg.m_nentries++] = subwentry->key.subid;
> ..
> }
>
> I think it is better to use a separate variable here for subid as we
> are using for funcid and tableid. That will make this part of the code
> easier to follow and look consistent.


Re: VS2022: Support Visual Studio 2022 on Windows

2021-11-24 Thread Michael Paquier
On Wed, Nov 24, 2021 at 10:00:19AM +, Dave Page wrote:
> It's extremely unlikely that we'd shift to such a new version for PG15. We
> build many components aside from PostgreSQL, and need to use the same
> toolchain for all of them (we've had very painful experiences with mix n
> match CRT versions in the past) so it's not just PG that needs to support
> VS2022 as far as we're concerned

Yes, I can understand that upgrading the base version of VS used is a
very difficult exercise.  I have been through that, on Windows for
Postgres..  As well as for the compilation of all its dependencies.

> - Perl, Python, TCL, MIT Kerberos,
> OpenSSL, libxml2, libxslt etc. are all built with the same toolchain for
> consistency.

Dave, do you include LZ4 in 14?  Just asking, as a matter of
curiosity.
--
Michael


signature.asc
Description: PGP signature


Re: [BUG]Missing REPLICA IDENTITY check when DROP NOT NULL

2021-11-24 Thread Michael Paquier
On Wed, Nov 24, 2021 at 07:04:51AM +, tanghy.f...@fujitsu.com wrote:
> create table tbl (a int not null unique);
> alter table tbl replica identity using INDEX tbl_a_key;
> alter table tbl alter column a drop not null;
> insert into tbl values (null);

Oops.  Yes, that's obviously not good.

> To solve the above problem, I think it's better to add a check when
> executing  ALTER COLUMN DROP NOT NULL,
> and report an error if this column is part of replica identity.

I'd say that you are right to block the operation.  I'll try to play a
bit with this stuff tomorrow.

> Attaching a patch that disallow DROP NOT NULL on a column if it's in
> a REPLICA IDENTITY index. Also added a test in it. 

   if (indexStruct->indkey.values[i] == attnum)
   ereport(ERROR,
   (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
-   errmsg("column \"%s\" is in a primary key",
+   errmsg(ngettext("column \"%s\" is in a primary key",
+   "column \"%s\" is in a REPLICA IDENTITY 
index",
+   indexStruct->indisprimary),
colName)));
Using ngettext() looks incorrect to me here as it is used to get the
plural form of a string, so you'd better make these completely
separated instead.
--
Michael


signature.asc
Description: PGP signature


Re: Windows build warnings

2021-11-24 Thread Dagfinn Ilmari Mannsåker
Daniel Gustafsson  writes:

> On 22 Nov 2021, at 16:40, Tom Lane  wrote:
>
>> I can't find anything that is providing a non-empty definition of
>> PG_USED_FOR_ASSERTS_ONLY (a/k/a pg_attribute_unused) for anything
>> except GCC.  
>
> It's supported in clang as well per the documentation [0] in at least some
> configurations or distributions:
>
>   "The [[maybe_unused]] (or __attribute__((unused))) attribute can be
>   used to silence such diagnostics when the entity cannot be removed.
>   For instance, a local variable may exist solely for use in an assert()
>   statement, which makes the local variable unused when NDEBUG is
>   defined."

Should we change the compiler checks for attributes in c.h to include
`|| __has_attribute(…)`, so that we automatically get them on compilers
that support that (particularly clang)?

- ilmari




Re: VS2022: Support Visual Studio 2022 on Windows

2021-11-24 Thread Dave Page
Hi

On Wed, Nov 24, 2021 at 9:12 AM Hans Buschmann  wrote:

> Hello Michael,
>
> thanks for your hard work and quick response!
> It is very convenient to only use VS2022 for Windows from now on...
>
> >Diff unrelated to your patch.
>
> Sorry for the copysoft problem from the first version.
>
> >Glad to see that we should have nothing to do about locales this
> >time.  I have not tested, but I think that you covering all the areas
> >that need a refresh here.  Nice work.
>
> I think it is almost impossible to overestimate the value of such support
> from experienced hackers to others starting their journey right now...
>
> I hope I can motivate you (and other experienced hackers) to give me some
> more support on my real project arriving anytime soon. It addresses
> hex_encoding (and more) targetting mostly pg_dump, but requires also some
> deeper knowledge of general infrastructure and building (also on Windows).
> Stay tuned!
>
> PS: Does anybody have good relations to EDB suggesting them to target
> VS2022 as the build environment for the upcoming PG15 release?
>

That would be me...


>
> postgres=# select version ();
>   version
> 
>  PostgreSQL 14.1, compiled by Visual C++ build 1931, 64-bit
> (1 row)
>

It's extremely unlikely that we'd shift to such a new version for PG15. We
build many components aside from PostgreSQL, and need to use the same
toolchain for all of them (we've had very painful experiences with mix n
match CRT versions in the past) so it's not just PG that needs to support
VS2022 as far as we're concerned - Perl, Python, TCL, MIT Kerberos,
OpenSSL, libxml2, libxslt etc. are all built with the same toolchain for
consistency.

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


Re: row filtering for logical replication

2021-11-24 Thread vignesh C
On Tue, Nov 23, 2021 at 4:58 PM Ajin Cherian  wrote:
>
> Attaching a new patchset v41 which includes changes by both Peter and myself.
>
> Patches v40-0005 and v40-0006 have been merged to create patch
> v41-0005 which reduces the patches to 6 again.

Few comments:
1) I'm not sure if we will be able to throw a better error message in
this case "ERROR:  missing FROM-clause entry for table "t4"", if
possible you could change it.

+   if (pri->whereClause != NULL)
+   {
+   /* Set up a pstate to parse with */
+   pstate = make_parsestate(NULL);
+   pstate->p_sourcetext = nodeToString(pri->whereClause);
+
+   nsitem = addRangeTableEntryForRelation(pstate, targetrel,
+
AccessShareLock,
+
NULL, false, false);
+   addNSItemToQuery(pstate, nsitem, false, true, true);
+
+   whereclause = transformWhereClause(pstate,
+
copyObject(pri->whereClause),
+
EXPR_KIND_PUBLICATION_WHERE,
+
"PUBLICATION");
+
+   /* Fix up collation information */
+   assign_expr_collations(pstate, whereclause);
+   }

alter publication pub1 add table t5 where ( t4.c1 = 10);
ERROR:  missing FROM-clause entry for table "t4"
LINE 1: alter publication pub1 add table t5 where ( t4.c1 = 10);
^
pstate->p_expr_kind is stored as EXPR_KIND_PUBLICATION_WHERE, we could
differentiate using expr_kind.

2) Should '"delete" or "delete"' be '"delete" or "update"'
--- a/src/backend/catalog/pg_publication.c
+++ b/src/backend/catalog/pg_publication.c
@@ -340,7 +340,7 @@ rowfilter_walker(Node *node, rf_context *context)
  * 1. Only certain simple node types are permitted in the expression. See
  * function rowfilter_walker for details.
  *
- * 2. If the publish operation contains "delete" then only columns that
+ * 2. If the publish operation contains "delete" or "delete" then
only columns that
  * are allowed by the REPLICA IDENTITY rules are permitted to be used in the
  * row-filter WHERE clause.
  */
@@ -352,12 +352,10 @@ rowfilter_expr_checker(Publication *pub,
Relation rel, Node *rfnode)
context.rel = rel;

/*
-* For "delete", check that filter cols are also valid replica identity
+* For "delete" or "update", check that filter cols are also
valid replica identity
 * cols.

3) Should we include row filter condition in pg_publication_tables
view like in describe publication(\dRp+) , since the prqual is not
easily readable in pg_publication_rel table:
select * from pg_publication_tables ;
 pubname | schemaname | tablename
-++---
 pub1| public | t1
(1 row)

 select * from pg_publication_rel ;
  oid  | prpubid | prrelid |
  prqual
---+-+-+---
-
 16389 |   16388 |   16384 | {OPEXPR :opno 518 :opfuncid 144
:opresulttype 16 :opretset false :opcollid 0 :inputcollid 0 :args
({VAR :varno 1 :varattno 1 :vartype 23 :vartypmod -1 :varcollid 0
:varlevelsup 0 :va
rnosyn 1 :varattnosyn 1 :location 45} {CONST :consttype 23
:consttypmod -1 :constcollid 0 :constlen 4 :constbyval true
:constisnull false :location 51 :constvalue 4 [ 0 0 0 0 0 0 0 0 ]})
:location 48}
(1 row)

4) This should be included in typedefs.list, also we could add some
comments for this structure
+typedef struct {
+   Relationrel;
+   Bitmapset  *bms_replident;
+}
+rf_context;

5) Few includes are not required. #include "miscadmin.h" not required
in pg_publication.c, #include "executor/executor.h" not required in
proto.c, #include "access/xact.h", #include "executor/executor.h" and
#include "replication/logicalrelation.h" not required in pgoutput.c

6) typo "filte" should be "filter":
+/*
+ * The row filte walker checks that the row filter expression is legal.
+ *
+ * Rules: Node-type validation
+ * ---
+ * Allow only simple or compound expressions like:
+ * - "(Var Op Const)" or
+ * - "(Var Op Const) Bool (Var Op Const)"

Regards,
Vignesh




AW: VS2022: Support Visual Studio 2022 on Windows

2021-11-24 Thread Hans Buschmann
Hello Michael,

thanks for your hard work and quick response!
It is very convenient to only use VS2022 for Windows from now on...

>Diff unrelated to your patch.  

Sorry for the copysoft problem from the first version.

>Glad to see that we should have nothing to do about locales this
>time.  I have not tested, but I think that you covering all the areas
>that need a refresh here.  Nice work.

I think it is almost impossible to overestimate the value of such support from 
experienced hackers to others starting their journey right now...

I hope I can motivate you (and other experienced hackers) to give me some more 
support on my real project arriving anytime soon. It addresses hex_encoding 
(and more) targetting mostly pg_dump, but requires also some deeper knowledge 
of general infrastructure and building (also on Windows). Stay tuned!

PS: Does anybody have good relations to EDB suggesting them to target VS2022 as 
the build environment for the upcoming PG15 release?

postgres=# select version ();
  version

 PostgreSQL 14.1, compiled by Visual C++ build 1931, 64-bit
(1 row)

Thanks!

Hans Buschmann




Re: rename SnapBuild* macros in slot.c

2021-11-24 Thread Amit Kapila
On Tue, Nov 23, 2021 at 8:12 AM Amit Kapila  wrote:
>
> On Mon, Nov 22, 2021 at 7:47 PM Alvaro Herrera  
> wrote:
> >
> > On 2021-Nov-22, Amit Kapila wrote:
> >
> > > +1 for this change. This seems to be introduced by commit ec5896aed3
> > > [1] and I think it is just a typo to name these macros starting with
> > > SnapBuildOnDisk* unless I am missing something.
> >
> > Yeah, it looks pretty weird.  +1 for the change on consistency grounds.
> >
>
> Okay, I will push this tomorrow unless I see any objections.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: parallel vacuum comments

2021-11-24 Thread Amit Kapila
On Wed, Nov 24, 2021 at 12:16 PM Masahiko Sawada  wrote:
>
> On Wed, Nov 24, 2021 at 1:34 PM Amit Kapila  wrote:
> >
> > On Wed, Nov 24, 2021 at 7:43 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Mon, Nov 22, 2021 at 1:48 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Fri, Nov 19, 2021 at 7:55 AM houzj.f...@fujitsu.com
> > > >  wrote:
> > > >
> > > > >
> > > > > 4)
> > > > >
> > > > > Just a personal suggestion for the parallel related function name. 
> > > > > Since Andres
> > > > > wanted a uniform naming pattern. Mabe we can rename the following 
> > > > > functions:
> > > > >
> > > > > end|begin_parallel_vacuum => parallel_vacuum_end|begin
> > > > > perform_parallel_index_bulkdel|cleanup => 
> > > > > parallel_vacuum_index_bulkdel|cleanup
> > > > >
> > > > > So that all the parallel related functions' name is like 
> > > > > parallel_vacuum_xxx.
> > > > >
> > > >
> > > > BTW, do we really need functions
> > > > perform_parallel_index_bulkdel|cleanup? Both do some minimal
> > > > assignments and then call parallel_vacuum_all_indexes() and there is
> > > > just one caller of each. Isn't it better to just do those assignments
> > > > in the caller and directly call parallel_vacuum_all_indexes()?
> > >
> > > The reason why I declare these two functions are: (1) the fields of
> > > ParallelVacuumState are not exposed and (2) bulk-deletion and cleanup
> > > require different arguments (estimated_count is required only by
> > > cleanup).  So if we expose the fields of ParallelVacuumState, the
> > > caller can do those assignments and directly call
> > > parallel_vacuum_all_indexes(). But I'm not sure it's good if those
> > > assignments are the caller's responsibility.
> > >
> >
> > Okay, that makes sense. However, I am still not very comfortable with
> > the function naming suggested by Hou-San, do you have any thoughts on
> > that?
>
> I personally don't disagree with the names starting with
> "parallel_vacuum_*".
>

I don't have any strong opinion here but I prefer the name which makes
more sense in the context it is being used. OTOH, I see there is an
argument that it will be easier to follow and might appear consistent
if we use parallel_vacuum_*.

-- 
With Regards,
Amit Kapila.




Re: Skipping logical replication transactions on subscriber side

2021-11-24 Thread Amit Kapila
On Wed, Nov 24, 2021 at 1:50 PM Masahiko Sawada  wrote:
>
> On Wed, Nov 24, 2021 at 12:13 PM Amit Kapila  wrote:
> >
> > On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Thu, Nov 18, 2021 at 12:52 PM houzj.f...@fujitsu.com
> > >  wrote:
> > > >
> > > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada 
> > > >  wrote:
> > > > > Right. I've fixed this issue and attached an updated patch.
> > > > >
> > > > Hi,
> > > >
> > > > I have few comments for the testcases.
> > > >
> > > > 1)
> > > >
> > > > +my $appname = 'tap_sub';
> > > > +$node_subscriber->safe_psql(
> > > > +'postgres',
> > > > +"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr 
> > > > application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, 
> > > > two_phase = on);");
> > > > +my $appname_streaming = 'tap_sub_streaming';
> > > > +$node_subscriber->safe_psql(
> > > > +'postgres',
> > > > +"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION 
> > > > '$publisher_connstr application_name=$appname_streaming' PUBLICATION 
> > > > tap_pub_streaming WITH (streaming = on, two_phase = on);");
> > > > +
> > > >
> > > > I think we can remove the 'application_name=$appname', so that the 
> > > > command
> > > > could be shorter.
> > >
> > > But we wait for the subscription to catch up by using
> > > wait_for_catchup() with application_name, no?
> > >
> >
> > Yeah, but you can directly use the subscription name in
> > wait_for_catchup because we internally use that as
> > fallback_application_name. If application_name is not specified in the
> > connection string as suggested by Hou-San then
> > fallback_application_name will be considered. Both ways are okay and I
> > see we use both ways in the tests but it seems there are more places
> > where we use the method Hou-San is suggesting in subscription tests.
>
> Okay, thanks! I referred to tests that set application_name. ISTM it's
> better to unite them so as not to confuse them in future tests.
>

Agreed, but let's do this clean-up as a separate patch. Feel free to
submit the patch for the same in a separate thread.

-- 
With Regards,
Amit Kapila.




Re: pg_get_publication_tables() output duplicate relid

2021-11-24 Thread Amit Kapila
On Mon, Nov 22, 2021 at 12:55 PM Amit Langote  wrote:
>
> On Fri, Nov 19, 2021 at 2:28 PM Amit Kapila  wrote:
> > On Fri, Nov 19, 2021 at 7:19 AM Amit Langote  
> > wrote:
> > >  As in,
> > > do we know of any replication (initial/streaming) misbehavior caused
> > > by the duplicate partition OIDs in this case or is the only problem
> > > that pg_publication_tables output looks odd?
> >
> > The latter one but I think either we should document this or change it
> > as we can't assume users will follow what subscriber-side code does.
>
> On second thought, I agree that de-duplicating partitions from this
> view is an improvement.
>

Fair enough. Hou-San, Can you please submit the updated patch after
fixing any pending comments including the test case?

-- 
With Regards,
Amit Kapila.




Re: pg_get_publication_tables() output duplicate relid

2021-11-24 Thread Amit Kapila
On Wed, Nov 24, 2021 at 12:02 PM Amit Langote  wrote:
>
> On Tue, Nov 23, 2021 at 12:21 PM Amit Kapila  wrote:
> > Isn't it better to document this case and explain what
> > users can expect instead of trying to design a solution around this?
>
> I thought about the problems you've described and it looks like I
> hadn't considered many of the details which complicate implementing a
> solution for this.
>
> So yeah, documenting the ATTACH issue as a limitation sounds like the
> best course for now.  I might word it as follows and add it under
> Notes at https://www.postgresql.org/docs/current/sql-createpublication.html:
>
> "ATTACHing a table into a partition tree whose root is published using
> a publication with publish_via_partition_root set to true does not
> result in the table's existing contents to be replicated."
>

Instead of "to be", shall we use "being"?

> I'm not sure there's a clean enough workaround to this that we can add
> to the paragraph.
>
> Does that make sense?
>

Yeah, that sounds reasonable. I think the case with
"publish_via_partition_root = false" should work but please test it
once. The other thing which we need to consider about all these fixes
is whether to back patch them or not. I think it is on case to case
basis. I feel this limitation should be documented and backpatched,
what do you think? Feel free to submit patches accordingly.

> > Even if we do so the streaming after attach partition problem as
> > discussed above should be fixed.
>
> I agree.  I have reproduced the problem though haven't managed to pin
> down the cause yet.
>

No problem, do let us know whatever be your findings at the end. I
think now we know how to proceed with this attach issue, maybe we
should also try to commit the fixes for other issues.

-- 
With Regards,
Amit Kapila.




Re: Post-CVE Wishlist

2021-11-24 Thread Heikki Linnakangas

On 24/11/2021 07:09, Michael Paquier wrote:

On Tue, Nov 23, 2021 at 02:18:30PM -0500, Tom Lane wrote:

Jacob Champion  writes:

= Client-Side Auth Selection =
The second request is for the client to stop fully trusting the server
during the authentication phase. If I tell libpq to use a client
certificate, for example, I don't think the server should be allowed to
extract a plaintext password from my environment (at least not without
my explicit opt-in).


Yeah.  I don't recall whether it's been discussed in public or not,
but it certainly seems like libpq should be able to be configured so
that (for example) it will never send a cleartext password.  It's not
clear to me what extent of configurability would be useful, and I
don't want to overdesign it --- but that much at least would be a
good thing.


I recall this part being discussed in public, but I cannot put my
finger on the exact thread.  I think that this was around when we
discussed the open items of 10 or 11 for things around channel binding
and how libpq was sensitive to downgrade attacks, which would mean
around 2016 or 2017.  I also recall reading (writing?) a patch that
introduced a new connection parameter that takes in input a
comma-separated list of keywords to allow the user to choose a set of
auth methods accepted, failing if the server is willing to use a
method that does not match with what the user has put in his list.
Perhaps this last part has never reached -hackers though :)

Anyway, the closest thing I can put my finger on now is that:
https://www.postgresql.org/message-id/c5cb08f4cce46ff661ad287fadaa1...@postgrespro.ru


Here's a thread:

https://www.postgresql.org/message-id/227015d8417f2b4fef03f8966dbfa5cbcc4f44da.camel%40j-davis.com

The result of that thread was that we added the 
channel_binding=require/prefer/disable option.


- Heikki




Re: Skipping logical replication transactions on subscriber side

2021-11-24 Thread Masahiko Sawada
On Wed, Nov 24, 2021 at 12:13 PM Amit Kapila  wrote:
>
> On Wed, Nov 24, 2021 at 7:51 AM Masahiko Sawada  wrote:
> >
> > On Thu, Nov 18, 2021 at 12:52 PM houzj.f...@fujitsu.com
> >  wrote:
> > >
> > > On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada 
> > >  wrote:
> > > > Right. I've fixed this issue and attached an updated patch.
> > > >
> > > Hi,
> > >
> > > I have few comments for the testcases.
> > >
> > > 1)
> > >
> > > +my $appname = 'tap_sub';
> > > +$node_subscriber->safe_psql(
> > > +'postgres',
> > > +"CREATE SUBSCRIPTION tap_sub CONNECTION '$publisher_connstr 
> > > application_name=$appname' PUBLICATION tap_pub WITH (streaming = off, 
> > > two_phase = on);");
> > > +my $appname_streaming = 'tap_sub_streaming';
> > > +$node_subscriber->safe_psql(
> > > +'postgres',
> > > +"CREATE SUBSCRIPTION tap_sub_streaming CONNECTION 
> > > '$publisher_connstr application_name=$appname_streaming' PUBLICATION 
> > > tap_pub_streaming WITH (streaming = on, two_phase = on);");
> > > +
> > >
> > > I think we can remove the 'application_name=$appname', so that the command
> > > could be shorter.
> >
> > But we wait for the subscription to catch up by using
> > wait_for_catchup() with application_name, no?
> >
>
> Yeah, but you can directly use the subscription name in
> wait_for_catchup because we internally use that as
> fallback_application_name. If application_name is not specified in the
> connection string as suggested by Hou-San then
> fallback_application_name will be considered. Both ways are okay and I
> see we use both ways in the tests but it seems there are more places
> where we use the method Hou-San is suggesting in subscription tests.

Okay, thanks! I referred to tests that set application_name. ISTM it's
better to unite them so as not to confuse them in future tests.

Anyway, I'll remove it in the next version patch that I'll submit soon.

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/