Re: Data is copied twice when specifying both child and parent table in publication

2021-11-03 Thread Greg Nancarrow
On Thu, Nov 4, 2021 at 3:13 PM Amit Kapila  wrote:
>
> On further thinking about this, I think we should define the behavior
> of replication among partitioned (on the publisher) and
> non-partitioned (on the subscriber) tables a bit more clearly.
>
> - If the "publish_via_partition_root" is set for a publication then we
> can always replicate to the table with the same name as the root table
> in publisher.
> - If the "publish_via_partition_root" is *not* set for a publication
> then we can always replicate to the tables with the same name as the
> non-root tables in publisher.
>
> Thoughts?
>

I'd adjust that wording slightly, because "we can always replicate to
..." sounds a bit vague, and saying that an option is set or not set
could be misinterpreted, as the option could be "set" to false.

How about:

- If "publish_via_partition_root" is true for a publication, then data
is replicated to the table with the same name as the root (i.e.
partitioned) table in the publisher.
- If "publish_via_partition_root" is false (the default) for a
publication, then data is replicated to tables with the same name as
the non-root (i.e. partition) tables in the publisher.

?
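
To make that concrete, a minimal illustration of the behavior described
above (assuming a partitioned table "parent" with a partition "child" on
the publisher; the names are only for illustration):

-- replicated changes target a subscriber table named "parent"
CREATE PUBLICATION pub_root FOR TABLE parent
  WITH (publish_via_partition_root = true);

-- replicated changes target subscriber tables named after the
-- partitions, e.g. "child"
CREATE PUBLICATION pub_leaf FOR TABLE parent
  WITH (publish_via_partition_root = false);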

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Added schema level support for publication.

2021-11-03 Thread Peter Smith
On Thu, Nov 4, 2021 at 3:24 PM vignesh C  wrote:
>
> On Thu, Nov 4, 2021 at 5:41 AM Peter Smith  wrote:
> >
> > FYI - I found a small problem with one of the new PublicationObjSpec
> > parser error messages that was introduced by the recent schema
> > publication commit [1].
> >
> > The error message text is assuming that the error originates from
> > CREATE PUBLICATION, but actually that same error can also come from
> > ALTER PUBLICATION.
> >
> > For example,
> >
> > e.g.1) Here the error came from CREATE PUBLICATION, so the message
> > text looks OK.
> >
> > test_pub=# CREATE PUBLICATION p1 FOR t1;
> > 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at
> > character 27
> > 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s)
> > LINE 1: CREATE PUBLICATION p1 FOR t1;
> >   ^
> > ~~
> >
> > e.g.2) Here the error came from ALTER PUBLICATION, so the message text
> > is not OK because the ALTER syntax [2] does not even have a FOR
> > keyword.
> >
> > test_pub=# ALTER PUBLICATION p1 SET t1;
> > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at
> > character 26
> > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s)
> > LINE 1: ALTER PUBLICATION p1 SET t1;
> >  ^
>
> Thanks for the comment, I changed the error message to remove the FOR
> keyword. The attached patch has these changes.
>

LGTM.

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: parallel vacuum comments

2021-11-03 Thread Masahiko Sawada
On Wed, Nov 3, 2021 at 1:08 PM Amit Kapila  wrote:
>
> On Tue, Nov 2, 2021 at 11:17 AM Masahiko Sawada  wrote:
> >
> > On Tue, Nov 2, 2021 at 5:57 AM Peter Geoghegan  wrote:
> > >
> >
> > > Rather than inventing PARALLEL_VACUUM_KEY_INDVAC_CHECK (just for
> > > assert-enabled builds), we should invent PARALLEL_VACUUM_STATS -- a
> > > dedicated shmem area for the array of LVSharedIndStats (no more
> > > storing LVSharedIndStats entries at the end of the LVShared space in
> > > an ad-hoc, type unsafe way). There should be one array element for
> > > each and every index -- even those indexes where parallel index
> > > vacuuming is unsafe or not worthwhile (unsure if avoiding parallel
> > > processing for "not worthwhile" indexes actually makes sense, BTW). We
> > > can then get rid of the bitmap/IndStatsIsNull() stuff entirely. We'd
> > > also add new per-index state fields to LVSharedIndStats itself. We
> > > could directly record the status of each index (e.g., parallel unsafe,
> > > amvacuumcleanup processing done, ambulkdelete processing done)
> > > explicitly. All code could safely subscript the LVSharedIndStats array
> > > directly, using idx style integers. That seems far more robust and
> > > consistent.
> >
> > Sounds good.
> >
> > During the development, I wrote the patch while considering using
> > fewer shared memory but it seems that it brought complexity (and
> > therefore the bug). It would not be harmful even if we allocate index
> > statistics on DSM for unsafe indexes and “not worthwhile" indexes in
> > practice.
> >
>
> If we want to allocate index stats for all indexes in DSM then why not
> consider it on the lines of buf/wal_usage means tack those via
> LVParallelState? And probably replace bitmap with an array of bools
> that indicates which indexes can be skipped by the parallel worker.
>

I've attached a draft patch. The patch incorporates all comments from
Andres except for the last one, about moving parallel-related code to
another file. I'd like to discuss how we should split vacuumlazy.c.
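
To make the layout described above a bit more concrete, here is a rough
sketch of a per-index entry (illustrative only; the names and fields in
the attached patch may differ):

typedef struct LVSharedIndStats
{
	bool	parallel_safe;		/* may a parallel worker process this index? */
	bool	bulkdelete_done;	/* ambulkdelete already called? */
	bool	cleanup_done;		/* amvacuumcleanup already called? */
	IndexBulkDeleteResult istat;	/* results returned by the index AM */
} LVSharedIndStats;

/*
 * A dedicated shmem area then holds one such entry per index, so all
 * code can simply subscript indstats[idx] instead of walking a bitmap.
 */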

Regarding tests, I'd like to add tests to check that a vacuum with
multiple index scans (i.e., due to a small maintenance_work_mem) works
fine. The problem is that we need at least about 200,000 garbage tuples
to perform the index scan twice even with the minimum
maintenance_work_mem, which takes some time to load.
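
For reference, a rough sketch of the kind of setup that would be needed
(not the actual test; the numbers are only illustrative):

SET maintenance_work_mem = '1MB';	-- the minimum
CREATE TABLE vac_test (a int, b int);
CREATE INDEX vac_test_a_idx ON vac_test (a);
CREATE INDEX vac_test_b_idx ON vac_test (b);
INSERT INTO vac_test SELECT g, g FROM generate_series(1, 200000) g;
DELETE FROM vac_test;			-- ~200,000 dead tuples
-- the dead-tuple array overflows 1MB, forcing a second index vacuum pass
VACUUM (VERBOSE, PARALLEL 2) vac_test;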

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


parallel_vacuum_refactor.patch
Description: Binary data


Re: row filtering for logical replication

2021-11-03 Thread Peter Smith
On Thu, Nov 4, 2021 at 2:08 PM Amit Kapila  wrote:
>
> On Thu, Nov 4, 2021 at 8:17 AM Peter Smith  wrote:
> >
> >
> > PROPOSAL
> >
> > I propose that we change the way duplicate tables are processed to
> > make it so that it is always the *last* one that takes effect (instead
> > of the *first* one).
> >
>
> I don't have a good reason to prefer one over another but I think if
> we do this then we should document the chosen behavior. BTW, why not
> give an error if the duplicate table is present and any one of them or
> both have row-filters? I think the current behavior makes sense
> because it makes no difference if the table is present more than once
> in the list but with row-filter it can make difference so it seems to
> me that giving an error should be considered.

Yes,  giving an error if any duplicate table has a filter is also a
good alternative solution.

I only wanted to demonstrate the current problem, and get some
consensus on the solution before implementing a fix. If others are
happy to give an error for this case then that is fine by me too.

--
Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-11-03 Thread Bharath Rupireddy
On Wed, Nov 3, 2021 at 6:21 PM Daniel Gustafsson  wrote:
>
> > On 26 Jul 2021, at 09:33, Bharath Rupireddy 
> >  wrote:
>
> > PSA v7.
>
> This patch no longer applies on top of HEAD, please submit a rebased version.

Here's a rebased v8 patch. Please review it.

Regards,
Bharath Rupireddy.


v8-0001-Improve-publication-error-messages.patch
Description: Binary data


Re: Added schema level support for publication.

2021-11-03 Thread vignesh C
On Thu, Nov 4, 2021 at 5:41 AM Peter Smith  wrote:
>
> FYI - I found a small problem with one of the new PublicationObjSpec
> parser error messages that was introduced by the recent schema
> publication commit [1].
>
> The error message text is assuming that the error originates from
> CREATE PUBLICATION, but actually that same error can also come from
> ALTER PUBLICATION.
>
> For example,
>
> e.g.1) Here the error came from CREATE PUBLICATION, so the message
> text looks OK.
>
> test_pub=# CREATE PUBLICATION p1 FOR t1;
> 2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at
> character 27
> 2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s)
> LINE 1: CREATE PUBLICATION p1 FOR t1;
>   ^
> ~~
>
> e.g.2) Here the error came from ALTER PUBLICATION, so the message text
> is not OK because the ALTER syntax [2] does not even have a FOR
> keyword.
>
> test_pub=# ALTER PUBLICATION p1 SET t1;
> 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at
> character 26
> 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s)
> LINE 1: ALTER PUBLICATION p1 SET t1;
>  ^

Thanks for the comment, I changed the error message to remove the FOR
keyword. The attached patch has these changes.

Regards,
Vignesh
From 8a36508d18144150d63812778a8c675a75d0f56d Mon Sep 17 00:00:00 2001
From: Vigneshwaran C 
Date: Wed, 3 Nov 2021 09:21:22 +0530
Subject: [PATCH v3] Renamed few enums and changed error message in publish
 schema feature.

Renamed PUBLICATIONOBJ_REL_IN_SCHEMA, PUBLICATIONOBJ_CURRSCHEMA,
DO_PUBLICATION_REL_IN_SCHEMA and PRIO_PUBLICATION_REL_IN_SCHEMA
to PUBLICATIONOBJ_TABLE_IN_SCHEMA, PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA,
DO_PUBLICATION_TABLE_IN_SCHEMA and PRIO_PUBLICATION_TABLE_IN_SCHEMA in publish
schema feature, this change is required to avoid confusion with other objects
like sequences which will be available in the future and also changed
the error message to keep it common for both create and alter publication.
---
 src/backend/commands/publicationcmds.c|  8 
 src/backend/parser/gram.y | 14 +++---
 src/bin/pg_dump/pg_dump.c |  6 +++---
 src/bin/pg_dump/pg_dump.h |  2 +-
 src/bin/pg_dump/pg_dump_sort.c|  6 +++---
 src/include/nodes/parsenodes.h|  5 +++--
 src/test/regress/expected/publication.out |  4 ++--
 7 files changed, 23 insertions(+), 22 deletions(-)

diff --git a/src/backend/commands/publicationcmds.c b/src/backend/commands/publicationcmds.c
index d1fff13d2e..7d4a0e95f6 100644
--- a/src/backend/commands/publicationcmds.c
+++ b/src/backend/commands/publicationcmds.c
@@ -169,13 +169,13 @@ ObjectsInPublicationToOids(List *pubobjspec_list, ParseState *pstate,
 			case PUBLICATIONOBJ_TABLE:
 *rels = lappend(*rels, pubobj->pubtable);
 break;
-			case PUBLICATIONOBJ_REL_IN_SCHEMA:
+			case PUBLICATIONOBJ_TABLE_IN_SCHEMA:
 schemaid = get_namespace_oid(pubobj->name, false);
 
 /* Filter out duplicates if user specifies "sch1, sch1" */
 *schemas = list_append_unique_oid(*schemas, schemaid);
 break;
-			case PUBLICATIONOBJ_CURRSCHEMA:
+			case PUBLICATIONOBJ_TABLE_IN_CUR_SCHEMA:
 search_path = fetch_search_path(false);
 if (search_path == NIL) /* nothing valid in search_path? */
 	ereport(ERROR,
@@ -214,7 +214,7 @@ CheckObjSchemaNotAlreadyInPublication(List *rels, List *schemaidlist,
 
 		if (list_member_oid(schemaidlist, relSchemaId))
 		{
-			if (checkobjtype == PUBLICATIONOBJ_REL_IN_SCHEMA)
+			if (checkobjtype == PUBLICATIONOBJ_TABLE_IN_SCHEMA)
 ereport(ERROR,
 		errcode(ERRCODE_INVALID_PARAMETER_VALUE),
 		errmsg("cannot add schema \"%s\" to publication",
@@ -613,7 +613,7 @@ AlterPublicationSchemas(AlterPublicationStmt *stmt,
 		rels = OpenReliIdList(reloids);
 
 		CheckObjSchemaNotAlreadyInPublication(rels, schemaidlist,
-			  PUBLICATIONOBJ_REL_IN_SCHEMA);
+			  PUBLICATIONOBJ_TABLE_IN_SCHEMA);
 
 		CloseTableList(rels);
 		PublicationAddSchemas(pubform->oid, schemaidlist, false, stmt);
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index d0eb80e69c..a6d0cefa6b 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -9662,14 +9662,14 @@ PublicationObjSpec:
 			| ALL TABLES IN_P SCHEMA ColId
 {
 	$$ = makeNode(PublicationObjSpec);
-	$$->pubobjtype = PUBLICATIONOBJ_REL_IN_SCHEMA;
+	$$->pubobjtype = PUBLICATIONOBJ_TABLE_IN_SCHEMA;
 	$$->name = $5;
 	$$->location = @5;
 }
 			| ALL TABLES IN_P SCHEMA CURRENT_SCHEMA
 {
 	$$ 

Re: Data is copied twice when specifying both child and parent table in publication

2021-11-03 Thread Amit Kapila
On Wed, Nov 3, 2021 at 3:43 PM Amit Kapila  wrote:
>
> On Thu, Oct 28, 2021 at 1:55 PM Amit Langote  wrote:
> >
> >
> > Thanks for the summary, Hou-san, and sorry about my late reply.
> >
> > I had thought about this some last week and I am coming around to
> > recognizing the confusing user experience of the current behavior.
> > Though, I am not sure if removing partitions from the result of
> > pg_publication_tables view for pubviaroot publications is acceptable
> > as a fix, because that prevents replicating into a subscriber node
> > where tables that are partition root and a partition respectively on
> > the publisher are independent tables on the subscriber.
> >
>
> But we already do that way when the publication is "For All Tables".
> Anyway, for the purpose of initial sync, it will just replicate the
> same data in two different tables if the corresponding tables on the
> subscriber-side are non-partitioned which I am not sure is what the
> user will be expecting.
>

On further thinking about this, I think we should define the behavior
of replication among partitioned (on the publisher) and
non-partitioned (on the subscriber) tables a bit more clearly.

- If the "publish_via_partition_root" is set for a publication then we
can always replicate to the table with the same name as the root table
in publisher.
- If the "publish_via_partition_root" is *not* set for a publication
then we can always replicate to the tables with the same name as the
non-root tables in publisher.

Thoughts?

-- 
With Regards,
Amit Kapila.




Re: enhance pg_log_backend_memory_contexts() to log memory contexts of auxiliary processes

2021-11-03 Thread Bharath Rupireddy
On Mon, Nov 1, 2021 at 6:42 AM Kyotaro Horiguchi
 wrote:
>
> At Fri, 29 Oct 2021 22:25:04 +0530, Bharath Rupireddy 
>  wrote in
> > On Mon, Oct 11, 2021 at 9:55 AM Bharath Rupireddy
> >  wrote:
> > >
> > > On Mon, Oct 11, 2021 at 8:21 AM torikoshia  
> > > wrote:
> > > > If we can use debuggers, it's possible to know the memory contexts e.g.
> > > > using MemoryContextStats().
> > > > So IMHO if it's necessary to know memory contexts without attaching gdb
> > > > for other than client backends(probably this means using under
> > > > production environment), this enhancement would be pay.
> > >
> > > Thanks for providing your thoughts. Knowing memory usage of auxiliary
> > > processes is as important as backends (user session processes) without
> > > attaching debugger in production environments.
> > >
> > > There are some open points as mentioned in my first mail in this
> > > thread, I will start working  on this patch once we agree on them.
> >
> > I'm attaching the v1 patch that enables
> > pg_log_backend_memory_contexts() to log memory contexts of auxiliary
> > processes. Please review it.
> >
> > Here's the CF entry - https://commitfest.postgresql.org/35/3385/
>
> After the patch applied the function looks like this
>
>   proc = BackendPidGetProc(pid);
>   if (proc == NULL)
> 
> 
>   if (proc == NULL)
> 
>   if (!is_aux_proc)
> 
>   SendProcSignal(.., the backend id);
>
> is_aux_proc lookslike making the code complex.  I think we can remove
> it.
>
>
> +   /* Only regular backends will have valid backend id, auxiliary 
> processes don't. */
> +   if (!is_aux_proc)
> +   backendId = proc->backendId;
>
> I think the reason we need to do this is not that aux processes have
> the invalid backend id (=InvalidBackendId) but that "some" auxiliary
> processes may have a broken proc->backendId in regard to
> SendProcSignal (we know that's the startup for now.).

I wanted to avoid any problems signalling the startup process with the
current code. Yes, the startup process is the only auxiliary process
that has a valid backend id, and we have other threads fixing it. Let's
keep it the way it is in the v1 patch. Based on whichever patch gets in,
we can modify the code.

> +SELECT pg_log_backend_memory_contexts(memcxt_get_proc_pid('autovacuum launcher'));
> +SELECT pg_log_backend_memory_contexts(memcxt_get_proc_pid('logical replication launcher'));
> ...
>
> Maybe we can reduce (a quite bit of) run time of the test by
> loopingover the processes but since the test only checks if the
> function doesn't fail to send a signal, I'm not sure we need to
> perform the test for all of the processes here.

Okay, let me choose the checkpointer for this test; I will remove the other tests.
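
Something like the following, say (just a sketch; signalling the
checkpointer only works once the v1 patch is applied):

SELECT pg_log_backend_memory_contexts(pid)
  FROM pg_stat_activity
 WHERE backend_type = 'checkpointer';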

> On the other hand,
> the test is missing the most significant target of the startup
> process.

If we were to have tests for the startup process, then they would need
to be TAP tests, as we have to start a hot standby where the startup
process runs continuously. Is there any other way that we can add the
test case in a .sql file? Do we need to get into this much complexity
for the test case?

Regards,
Bharath Rupireddy.




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Tom Lane
Thomas Munro  writes:
> On Thu, Nov 4, 2021 at 4:33 PM Tom Lane  wrote:
>> But I don't get the point about where HEAD is different from v14?
>> be-secure-openssl.c isn't.

> I don't understand what's going on and I don't have the headers to
> look at, but I was thinking that WIN32_LEAN_AND_MEAN must be causing a
> different state to be reached that somehow leaves the bad definition
> of X509_NAME in place.  It's confusing though, because you'd hope
> that'd cause *less* stuff to get defined...

Yeah, I noted the comment about WIN32_LEAN_AND_MEAN in the
stackoverflow thread too ... but as you say, it seems like
that should make the problem less probable not more so.
Still, it's hard to think of any other relevant change.

Anyway, my thought now is (1) move the openssl includes to
after system includes in both *-secure-openssl.c files,
and (2) add comments explaining why the order is critical.
But it's late here and I'm not going to mess with it right now.
If you want to take a shot at a blind fix before hamerkop's
next run, have at it.

regards, tom lane




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Thomas Munro
On Thu, Nov 4, 2021 at 4:33 PM Tom Lane  wrote:
> But I don't get the point about where HEAD is different from v14?
> be-secure-openssl.c isn't.

I don't understand what's going on and I don't have the headers to
look at, but I was thinking that WIN32_LEAN_AND_MEAN must be causing a
different state to be reached that somehow leaves the bad definition
of X509_NAME in place.  It's confusing though, because you'd hope
that'd cause *less* stuff to get defined...




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Tom Lane
Thomas Munro  writes:
> Ahh, I think this contains some relevant bits, and we have indeed
> messed around with mentioned Windows headers in master.
> https://stackoverflow.com/questions/49504648/x509-name-macro-in-c-wont-compile/49504794

Oooh  note the comment there about

This helped me too, but I found the simplest thing to do was just make sure 
the OpenSSL headers were #included last in my source file. – 

So the fact that be-secure-openssl.c and fe-secure-openssl.c
are including things in different orders *is* relevant.  I'd
previously discovered this bit in OpenSSL's headers (ossl_typ.h):

# ifdef _WIN32
#  undef X509_NAME
...
#endif
...
typedef struct X509_name_st X509_NAME;

So that will work around the problem as long as it's #include'd
after the relevant Windows headers.
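
In other words, something along these lines in both *-secure-openssl.c
files (a sketch of the idea, not the final patch):

/*
 * The Windows SDK's wincrypt.h defines X509_NAME as a macro.
 * OpenSSL's ossl_typ.h #undef's it, but that only helps if the Windows
 * headers have already been seen; if they come later, the macro is
 * re-defined and breaks the typedef'd uses.  Hence: OpenSSL headers last.
 */
#include "postgres.h"

#include <sys/stat.h>		/* ... other system headers ... */

#include <openssl/ssl.h>
#include <openssl/x509v3.h>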

But I don't get the point about where HEAD is different from v14?
be-secure-openssl.c isn't.

regards, tom lane




RE: row filtering for logical replication

2021-11-03 Thread houzj.f...@fujitsu.com
On Wednesday, November 3, 2021 8:51 PM Ajin Cherian  wrote:
> On Tue, Nov 2, 2021 at 10:44 PM Ajin Cherian  wrote:
> .
> >
> > The patch 0005 and 0006 has not yet been rebased but will be updated
> > in a few days.
> >
> 
> Here's a rebase of all the 6 patches. Issue remaining:

Thanks for the patches.
I started to review the patches and here are a few comments.

1)
/*
 * ALTER PUBLICATION ... ADD TABLE provides a PublicationTable 
List
 * (Relation, Where clause). ALTER PUBLICATION ... DROP TABLE 
provides
 * a Relation List. Check the List element to be used.
 */
if (IsA(lfirst(lc), PublicationTable))
whereclause = true;
else
whereclause = false;

I am not sure about the comments here; wouldn't it be better to always provide
a PublicationTable List, which would be more consistent?

2)
+   if ($3)
+   {
+   $$->pubtable->whereClause = $3;
+   }

It seems we can remove the if ($3) check here.


3)

+   oldctx = 
MemoryContextSwitchTo(CacheMemoryContext);
+   rfnode = 
stringToNode(TextDatumGetCString(rfdatum));
+   exprstate = 
pgoutput_row_filter_init_expr(rfnode);
+   entry->exprstates = 
lappend(entry->exprstates, exprstate);
+   MemoryContextSwitchTo(oldctx);
+   }

Currently, the patch saves and executes each expression separately. I was
thinking it might be better if we could use "AND" to combine all the expressions
into one expression; then we could initialize and optimize the final expression
and execute it only once.
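
Something like this, for example (only a sketch; variable names such as
rfdatums and entry are taken loosely from the patch context,
make_ands_explicit() is the existing helper for AND'ing a list of
clauses, and pgoutput_row_filter_init_expr() is the function added by
the patch):

List	   *filters = NIL;
Node	   *rfnode;
ExprState  *exprstate;

/* collect every row filter found for this relation */
foreach(lc, rfdatums)
	filters = lappend(filters, stringToNode(TextDatumGetCString(lfirst(lc))));

/* build a single "(f1) AND (f2) AND ..." expression and init it once */
rfnode = (Node *) make_ands_explicit(filters);
exprstate = pgoutput_row_filter_init_expr(rfnode);
entry->exprstate = exprstate;	/* one state instead of a list of them */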

Best regards,
Hou zj


Re: row filtering for logical replication

2021-11-03 Thread Amit Kapila
On Thu, Nov 4, 2021 at 8:17 AM Peter Smith  wrote:
>
>
> PROPOSAL
>
> I propose that we change the way duplicate tables are processed to
> make it so that it is always the *last* one that takes effect (instead
> of the *first* one).
>

I don't have a good reason to prefer one over another but I think if
we do this then we should document the chosen behavior. BTW, why not
give an error if the duplicate table is present and any one of them or
both have row-filters? I think the current behavior makes sense
because it makes no difference if the table is present more than once
in the list but with row-filter it can make difference so it seems to
me that giving an error should be considered.

-- 
With Regards,
Amit Kapila.




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Thomas Munro
On Thu, Nov 4, 2021 at 3:39 PM Thomas Munro  wrote:
> On Thu, Nov 4, 2021 at 3:16 PM Michael Paquier  wrote:
> > Could it be possible to copy-paste on this thread some of the
> > buildfarm logs that show the compilation failure?  No issues from me
> > even if these are in Japanese.
>
> BTW It looks like these messages can be translated to Unicode like
> this, in python2:

Ahh, I think this contains some relevant bits, and we have indeed
messed around with mentioned Windows headers in master.

https://stackoverflow.com/questions/49504648/x509-name-macro-in-c-wont-compile/49504794

Here's the full transcoded text (using the previous trick with """-quotes
to grab the whole region of the log), which makes it clear that X509_NAME
is what's confusing it:

FinalizeBuildStatus:
  ファイル ".\Release\pg_waldump\pg_waldump.tlog\unsuccessfulbuild" を削除しています。
  ".\Release\pg_waldump\pg_waldump.tlog\pg_waldump.lastbuildstate"
のタッチ タスクを実行しています。
プロジェクト "c:\build-farm-local\buildroot\HEAD\pgsql.build\pg_waldump.vcxproj"
(既定のターゲット) のビルドが完了しました。
プロジェクト "c:\build-farm-local\buildroot\HEAD\pgsql.build\pgsql.sln"
(既定のターゲット) のビルドが終了しました -- 失敗。

ビルドに失敗しました。

"c:\build-farm-local\buildroot\HEAD\pgsql.build\pgsql.sln" (既定のターゲット) (1) ->
"c:\build-farm-local\buildroot\HEAD\pgsql.build\adminpack.vcxproj"
(既定のターゲット) (9) ->
"c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj"
(既定のターゲット) (10) ->
(ClCompile ターゲット) ->
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(577):
warning C4047: '関数': 間接参照のレベルが 'X509_NAME *' と 'int' で異なっています。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(577):
warning C4024: 'X509_NAME_get_text_by_NID': の型が 1 の仮引数および実引数と異なります。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(583):
warning C4047: '関数': 間接参照のレベルが 'X509_NAME *' と 'int' で異なっています。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(583):
warning C4024: 'X509_NAME_get_text_by_NID': の型が 1 の仮引数および実引数と異なります。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(624):
warning C4047: '関数': 間接参照のレベルが 'X509_NAME *' と 'int' で異なっています。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(624):
warning C4024: 'X509_NAME_print_ex': の型が 2 の仮引数および実引数と異なります。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(1282):
warning C4013: 関数 'X509_NAME_to_cstring' は定義されていません。int
型の値を返す外部関数と見なします。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(1282):
warning C4047: '関数': 間接参照のレベルが 'const char *' と 'int' で異なっています。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(1282):
warning C4024: 'strlcpy': の型が 2 の仮引数および実引数と異なります。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(1291):
warning C4047: '関数': 間接参照のレベルが 'const char *' と 'int' で異なっています。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(1291):
warning C4024: 'strlcpy': の型が 2 の仮引数および実引数と異なります。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]


"c:\build-farm-local\buildroot\HEAD\pgsql.build\pgsql.sln" (既定のターゲット) (1) ->
"c:\build-farm-local\buildroot\HEAD\pgsql.build\adminpack.vcxproj"
(既定のターゲット) (9) ->
"c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj"
(既定のターゲット) (10) ->
(ClCompile ターゲット) ->
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(68):
error C2143: 構文エラー: ')' が '(' の前にありません。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(68):
error C2091: 関数は関数を返せません。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(68):
error C2059: 構文エラー: ')'
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(68):
error C2143: 構文エラー: ')' が '定数' の前にありません。
[c:\build-farm-local\buildroot\HEAD\pgsql.build\postgres.vcxproj]
  
c:\build-farm-local\buildroot\head\pgsql.build\src\backend\libpq\be-secure-openssl.c(68):
error C2143: 構文エラー:

Re: row filtering for logical replication

2021-11-03 Thread Peter Smith
Hi.

During some ad-hoc filter testing I observed a quirk when there are
duplicate tables. I think we need to define/implement some proper
rules for this behaviour.

=

BACKGROUND

When the same table appears multiple times in a CREATE PUBLICATION
then those duplicates are simply ignored. The end result is that the
table is only one time in the publication.

This is fine and makes no difference where there are no row-filters
(because the duplicates are all exactly the same as each other), but
if there *are* row-filters there there is a quirky behaviour.

=

PROBLEM

Apparently it is the *first* of the occurrences that is used and all
the other duplicates are ignored.

In practice it looks like this.

ex.1)

DROP PUBLICATION
test_pub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a=1), t1 WHERE (a=2);
CREATE PUBLICATION
test_pub=# \dRp+ p1
   Publication p1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
--++-+-+-+---+--
 postgres | f  | t   | t   | t   | t | f
Tables:
"public.t1" WHERE ((a = 1))

** Notice that the 2nd filter (a=2) was ignored

~

IMO ex1 is wrong behaviour. I think that any subsequent duplicate
table names should behave the same as if the CREATE was a combination
of CREATE PUBLICATION then ALTER PUBLICATION SET.

Like this:

ex.2)

test_pub=# CREATE PUBLICATION p1 FOR TABLE t1 WHERE (a=1);
CREATE PUBLICATION
test_pub=# ALTER PUBLICATION p1 SET TABLE t1 WHERE (a=2);
ALTER PUBLICATION
test_pub=# \dRp+ p1
   Publication p1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
--++-+-+-+---+--
 postgres | f  | t   | t   | t   | t | f
Tables:
"public.t1" WHERE ((a = 2))

** Notice that the 2nd filter (a=2) overwrites the 1st filter (a=1) as expected.

~~

The current behaviour of duplicates becomes even more "unexpected" if
duplicate tables occur in a single ALTER PUBLICATION ... SET command.

ex.3)

test_pub=# CREATE PUBLICATION p1;
CREATE PUBLICATION
test_pub=# ALTER PUBLICATION p1 SET TABLE t1 WHERE (a=1), t1 WHERE (a=2);
ALTER PUBLICATION
test_pub=# \dRp+ p1
   Publication p1
  Owner   | All tables | Inserts | Updates | Deletes | Truncates | Via root
--++-+-+-+---+--
 postgres | f  | t   | t   | t   | t | f
Tables:
"public.t1" WHERE ((a = 1))


** Notice the 2nd filter (a=2) did not overwrite the 1st filter (a=1).
I think a user would be quite surprised by this behaviour.

=

PROPOSAL

I propose that we change the way duplicate tables are processed to
make it so that it is always the *last* one that takes effect (instead
of the *first* one). AFAIK doing this won't affect any current PG
behaviour, but doing this will let the new row-filter feature work in
a consistent/predictable/sane way.

Thoughts?

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Thomas Munro
On Thu, Nov 4, 2021 at 3:16 PM Michael Paquier  wrote:
> On Thu, Nov 04, 2021 at 10:56:13AM +0900, 近藤雄太 wrote:
> > We'll check it again, but there is no difference between HEAD and v14 
> > branch.
> > We simply added v14 branch to build recently, and did not change any 
> > settings of HEAD at that time.
>
> Thanks for checking.
>
> Could it be possible to copy-paste on this thread some of the
> buildfarm logs that show the compilation failure?  No issues from me
> even if these are in Japanese.

BTW It looks like these messages can be translated to Unicode like
this, in python2:

>>> print("\202\314\214^\202\252 1 
>>> \202\314\211\274\210\370\220\224\202\250\202\346\202\321\216\300\210\370\220\224\202\306\210\331\202\310\202\350\202\334\202\267\201B".decode("SJIS"))
の型が 1 の仮引数および実引数と異なります。

... which Google translates as:

"Different from formal and actual arguments of type 1"




Re: extended stats on partitioned tables

2021-11-03 Thread Justin Pryzby
On Thu, Nov 04, 2021 at 12:44:45AM +0100, Tomas Vondra wrote:
> > I probably did this to make the code change small, to avoid indentin the 
> > whole
> > block.
> 
> But indenting the block is not necessary. It's possible to do something
> like this:
> 
> if (!rel->inh)
> return 1.0;
> 
> or whatever is the "default" result for that function.

You're right.  I did it like that, except in examine_variable, which already does
it with "break".

> > Maybe the for inh<=1 loop should instead be two calls to new functions 
> > factored
> > out of get_relation_statistics() and RemoveStatisticsById(), which take 
> > "bool
> > inh".

I did it like that in a separate patch for now.
And I avoided making a !inh tuple for partitioned tables, since they're never
populated.

> >> And I'm not sure we do the right thing after removing children, for example
> >> (that should drop the inheritance stats, I guess).
> > 
> > Do you mean for inheritance only ?  Or partitions too ?
> > I think for partitions, the stats should stay.
> > And for inheritence, they can stay, for consistency with partitions, and 
> > since
> > it does no harm.
> > 
> 
> I think the behavior should be the same as for data in pg_statistic,
> i.e. if we keep/remove those, we should do the same thing for extended
> statistics.

This works for column stats the way I proposed for extended stats: child stats
are never removed, neither when the only child is dropped, nor when re-running
ANALYZE (actually, that part is odd).

I can stop sending patches if it makes it hard to reconcile, but I wanted to
put it "on paper" to see/show what the patch series would look like, for v15
and back branches.

-- 
Justin
>From 907046c93d8f20e8edcc4fb0334feed86d194b81 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Sat, 25 Sep 2021 19:42:41 -0500
Subject: [PATCH v3 1/6] Do not use extended statistics on inheritance trees.

Since 859b3003de, inherited ext stats are not built.
However, the non-inherited stats were incorrectly used during planning of
queries with inheritance hierarchies.

Since the ext stats do not include child tables, they can lead to worse
estimates.  This is remarkably similar to 427c6b5b9, which affected column
statistics 15 years ago.

choose_best_statistics is handled a bit differently (in the calling function),
because it isn't passed rel nor rel->inh, and it's an exported function, so
avoid changing its signature in back branches.

https://www.postgresql.org/message-id/flat/20210925223152.ga7...@telsasoft.com

Backpatch to v10
---
 src/backend/statistics/dependencies.c   |  5 +
 src/backend/statistics/extended_stats.c |  5 +
 src/backend/utils/adt/selfuncs.c|  9 +
 src/test/regress/expected/stats_ext.out | 23 +++
 src/test/regress/sql/stats_ext.sql  | 14 ++
 5 files changed, 56 insertions(+)

diff --git a/src/backend/statistics/dependencies.c b/src/backend/statistics/dependencies.c
index 8bf80db8e4..015b9e28f6 100644
--- a/src/backend/statistics/dependencies.c
+++ b/src/backend/statistics/dependencies.c
@@ -1410,11 +1410,16 @@ dependencies_clauselist_selectivity(PlannerInfo *root,
 	int			ndependencies;
 	int			i;
 	AttrNumber	attnum_offset;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
 
 	/* unique expressions */
 	Node	  **unique_exprs;
 	int			unique_exprs_cnt;
 
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return 1.0;
+
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_DEPENDENCIES))
 		return 1.0;
diff --git a/src/backend/statistics/extended_stats.c b/src/backend/statistics/extended_stats.c
index 69ca52094f..b9e755f44e 100644
--- a/src/backend/statistics/extended_stats.c
+++ b/src/backend/statistics/extended_stats.c
@@ -1694,6 +1694,11 @@ statext_mcv_clauselist_selectivity(PlannerInfo *root, List *clauses, int varReli
 	List	  **list_exprs;		/* expressions matched to any statistic */
 	int			listidx;
 	Selectivity sel = (is_or) ? 0.0 : 1.0;
+	RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return sel;
 
 	/* check if there's any stats that might be useful for us. */
 	if (!has_stats_of_kind(rel->statlist, STATS_EXT_MCV))
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 10895fb287..a0932e39e1 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3913,6 +3913,11 @@ estimate_multivariate_ndistinct(PlannerInfo *root, RelOptInfo *rel,
 	Oid			statOid = InvalidOid;
 	MVNDistinct *stats;
 	StatisticExtInfo *matched_info = NULL;
+	RangeTblEntry		*rte = root->simple_rte_array[rel->relid];
+
+	/* If it's an inheritence tree, skip statistics (which do not include child stats) */
+	if (rte->inh)
+		return false;
 
 	/* bail out immediately if the table has no

Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Michael Paquier
On Thu, Nov 04, 2021 at 10:56:13AM +0900, 近藤雄太 wrote:
> We'll check it again, but there is no difference between HEAD and v14 branch.
> We simply added v14 branch to build recently, and did not change any settings 
> of HEAD at that time.

Thanks for checking.

Could it be possible to copy-paste on this thread some of the
buildfarm logs that show the compilation failure?  No issues from me
even if these are in Japanese.
--
Michael


signature.asc
Description: PGP signature


Re: Consider parallel for lateral subqueries with limit

2021-11-03 Thread Greg Nancarrow
On Thu, Nov 4, 2021 at 12:49 AM James Coleman  wrote:
>
> Greg,
>
> Do you believe this is now ready for committer?
>

The patch LGTM.
I have set the status to "Ready for Committer".

Regards,
Greg Nancarrow
Fujitsu Australia




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread 近藤雄太
On Wed, 03 Nov 2021 10:35:36 -0400
Tom Lane  wrote:

> Michael Paquier  writes:
> > I would follow the practice of upstream to include both if were me
> > to be consistent, but I'm also fine if you just add x509v3.h to
> > be-secure-openssl.c.
> 
> 24f9e49e4 has not fixed it, so I have no idea what to do next.
> We definitely need some input from the machine's owner:
> 
> * what OpenSSL version is that?

1.0.2p (not upgraded recently)

> * is HEAD being built any differently from v14 (seems like it must be)?

We'll check it again, but there is no difference between HEAD and v14 branch.
We simply added v14 branch to build recently, and did not change any settings 
of HEAD at that time.

--
Kondo Yuta 




Re: Added schema level support for publication.

2021-11-03 Thread Peter Smith
On Thu, Nov 4, 2021 at 11:55 AM houzj.f...@fujitsu.com
 wrote:
>
> On Thurs, Nov 4, 2021 8:12 AM Peter Smith  wrote:
> > FYI - I found a small problem with one of the new PublicationObjSpec parser
> > error messages that was introduced by the recent schema publication commit
> > [1].
> >
> > The error message text is assuming that the error originates from CREATE
> > PUBLICATION, but actually that same error can also come from ALTER
> > PUBLICATION.
> > e.g.2) Here the error came from ALTER PUBLICATION, so the message text is
> > not OK because the ALTER syntax [2] does not even have a FOR keyword.
> >
> > test_pub=# ALTER PUBLICATION p1 SET t1;
> > 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> > SCHEMA should be specified before the table/schema name(s) at character 26
> > 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET
> > t1;
> > ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> > the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;
>
> I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be 
> specified before ...".
>

+1 - Yes, I also thought the fix should just be some simple/generic
change of the wording, like your suggestion (rather than having
different messages for the two cases).

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Logical insert/update/delete WAL records for custom table AMs

2021-11-03 Thread Jeff Davis
On Wed, 2021-11-03 at 11:25 +0530, Amit Kapila wrote:
> You have modeled DecodeLogicalInsert based on current DecodeInsert
> and
> it generates the same change message, so not sure how exactly these
> new messages will be different from current heap_insert/update/delete
> messages? 

Is there a reason you think the messages should be different for heap
versus another table AM? Isn't the table AM a physical implementation
detail?

Regards,
Jeff Davis






RE: Added schema level support for publication.

2021-11-03 Thread houzj.f...@fujitsu.com
On Thurs, Nov 4, 2021 8:12 AM Peter Smith  wrote:
> FYI - I found a small problem with one of the new PublicationObjSpec parser
> error messages that was introduced by the recent schema publication commit
> [1].
> 
> The error message text is assuming that the error originates from CREATE
> PUBLICATION, but actually that same error can also come from ALTER
> PUBLICATION.
> e.g.2) Here the error came from ALTER PUBLICATION, so the message text is
> not OK because the ALTER syntax [2] does not even have a FOR keyword.
> 
> test_pub=# ALTER PUBLICATION p1 SET t1;
> 2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
> SCHEMA should be specified before the table/schema name(s) at character 26
> 2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET
> t1;
> ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
> the table/schema name(s) LINE 1: ALTER PUBLICATION p1 SET t1;

I think it might be better to report " TABLE/ALL TABLES IN SCHEMA should be 
specified before ...".

Best regards,
Hou zj


Re: Failed transaction statistics to measure the logical replication progress

2021-11-03 Thread Greg Nancarrow
On Tue, Nov 2, 2021 at 12:18 AM osumi.takami...@fujitsu.com
 wrote:
>
> On Thursday, October 28, 2021 11:19 PM I wrote:
> > I've created a new patch that extends pg_stat_subscription_workers to 
> > include
> > other transaction statistics.
> >
> > Note that this patch depends on v18 patch-set in [1]...
> Rebased based on the v19 in [1].
> Also, fixed documentation a little and made tests tidy.
> FYI, the newly included TAP test(027_worker_xact_stats.pl) is stable
> because I checked that 100 times of its execution in a tight loop all passed.
>

I have done some basic testing of the patch and have some initial
review comments:

(1) I think this patch needs to be in "git format-patch" format, with
a proper commit message that describes the purpose of the patch and
the functionality it adds, and any high-level design points (something
like the overview given in the initial post, and accounting for the
subsequent discussion points and updated functionality).

(2) doc/src/sgml/monitoring.sgml
There are some grammatical issues in the current description. I
suggest changing it to something like:

BEFORE:
+  At least one row per subscription, showing about
transaction statistics and error summary that
AFTER:
+  At least one row per subscription, showing transaction
statistics and information about errors that

(2) doc/src/sgml/monitoring.sgml
The current description seems a little confusing.
Per subscription, it shows the transaction statistics and any last
error info from tablesync/apply workers? If this is the case, I'd
suggest the following change:

BEFORE:
+   one row per subscription for transaction statistics and summary of the last
+   error reported by workers applying logical replication changes and workers
+   handling the initial data copy of the subscribed tables.
AFTER:
+   one row per subscription, showing corresponding transaction statistics and
+   information about the last error reported by workers applying
logical replication
+   changes or by workers handling the initial data copy of the
subscribed tables.

(3) xact_commit
I think that the "xact_commit" column should be named
"xact_commit_count" or "xact_commits".
Similarly, I think "xact_error" should be named "xact_error_count" or
"xact_errors", and "xact_abort" should be named "xact_abort_count" or
"xact_aborts".

(4) xact_commit_bytes

+   Amount of transactions data successfully applied in this subscription.
+   Consumed memory for xact_commit is displayed.

I find this description a bit confusing. "Consumed memory for
xact_commit" seems different to "transactions data".
Could the description be something like: Amount of data (in bytes)
successfully applied in this subscription, across "xact_commit_count"
transactions.

(5)
I'd suggest some minor rewording for the following:

BEFORE:
+   Number of transactions failed to be applied and caught by table
+   sync worker or main apply worker in this subscription.
AFTER:
+   Number of transactions that failed to be applied by the table
+   sync worker or main apply worker in this subscription.

(6) xact_error_bytes
Again, it's a little confusing referring to "consumed memory" here.
How about rewording this, something like:

BEFORE:
+   Amount of transactions data unsuccessfully applied in this subscription.
+   Consumed memory that past failed transaction used is displayed.
AFTER:
+   Amount of data (in bytes) unsuccessfully applied in this
subscription by the last failed transaction.

(7)
The additional information provided for "xact_abort_bytes" needs some
rewording, something like:

BEFORE:
+   Increase logical_decoding_work_mem on the publisher
+   so that it exceeds the size of whole streamed transaction
+   to suppress unnecessary consumed network bandwidth in addition to change
+   in memory of the subscriber, if unexpected amount of streamed
transactions
+   are aborted.
AFTER:
+   In order to suppress unnecessary consumed network bandwidth, increase
+   logical_decoding_work_mem on the publisher so that it
+   exceeds the size of the whole streamed transaction, and
additionally increase
+   the available subscriber memory, if an unexpected amount of
streamed transactions
+   are aborted.

(8)
Suggested update:

BEFORE:
+ *  Tell the collector that worker transaction has finished without problem.
AFTER:
+ *  Tell the collector that the worker transaction has successfully completed.

(9) src/backend/postmaster/pgstat.c
I think that the GID copying is unnecessarily copying the whole GID
buffer or using an additional strlen().
It should be changed to use strlcpy() to match other code:

BEFORE:
+ /* get the gid for this two phase operation */
+ if (command == LOGICAL_REP_MSG_PREPARE ||
+   command == LOGICAL_REP_MSG_STREAM_PREPARE)
+   memcpy(msg.m_gid, prepare_data->gid, GIDSIZE);
+ else if (command == LOGICAL_REP_MSG_COMMIT_PREPARED)
+   memcpy(msg.m_gid, commit_data->gid, GIDSIZE);
+ else /* rollback pre

Re: pgbench bug candidate: negative "initial connection time"

2021-11-03 Thread Yugo NAGATA
On Tue, 2 Nov 2021 23:11:39 +0900
Fujii Masao  wrote:

> 
> 
> On 2021/11/01 23:01, Fujii Masao wrote:
> >> The remainings are the changes of handling of initial connection or
> >> logfile open failures. I agree to push them at least for the master.
> >> But I'm not sure if they should be back-patched. Without these changes,
> >> even when those failures happen, pgbench proceeds the benchmark and
> >> reports the result. But with the changes, pgbench exits immediately in
> >> that case. I'm not sure if there are people who expect this behavior,
> >> but if there are, maybe we should not change it at least at stable 
> >> branches.
> >> Thought?
> > 
> > The current behavior should be improved, but not a bug.
> > So I don't think that the patch needs to be back-patched.
> > Barring any objection, I will push the attached patch to the master.
> 
> Pushed. Thanks!

Thanks!

> 
> I also pushed the typo-fix patch that I proposed upthread.
> 
> Regards,
> 
> -- 
> Fujii Masao
> Advanced Computing Technology Center
> Research and Development Headquarters
> NTT DATA CORPORATION
> 
> 


-- 
Yugo NAGATA 




Re: Added schema level support for publication.

2021-11-03 Thread Peter Smith
FYI - I found a small problem with one of the new PublicationObjSpec
parser error messages that was introduced by the recent schema
publication commit [1].

The error message text is assuming that the error originates from
CREATE PUBLICATION, but actually that same error can also come from
ALTER PUBLICATION.

For example,

e.g.1) Here the error came from CREATE PUBLICATION, so the message
text looks OK.

test_pub=# CREATE PUBLICATION p1 FOR t1;
2021-11-04 10:50:17.208 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
SCHEMA should be specified before the table/schema name(s) at
character 27
2021-11-04 10:50:17.208 AEDT [738] STATEMENT:  CREATE PUBLICATION p1 FOR t1;
ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
the table/schema name(s)
LINE 1: CREATE PUBLICATION p1 FOR t1;
  ^
~~

e.g.2) Here the error came from ALTER PUBLICATION, so the message text
is not OK because the ALTER syntax [2] does not even have a FOR
keyword.

test_pub=# ALTER PUBLICATION p1 SET t1;
2021-11-04 10:51:53.912 AEDT [738] ERROR:  FOR TABLE/FOR ALL TABLES IN
SCHEMA should be specified before the table/schema name(s) at
character 26
2021-11-04 10:51:53.912 AEDT [738] STATEMENT:  ALTER PUBLICATION p1 SET t1;
ERROR:  FOR TABLE/FOR ALL TABLES IN SCHEMA should be specified before
the table/schema name(s)
LINE 1: ALTER PUBLICATION p1 SET t1;
 ^
--
[1] 
https://github.com/postgres/postgres/commit/5a2832465fd8984d089e8c44c094e6900d987fcd
[2] https://www.postgresql.org/docs/devel/sql-alterpublication.html

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Fixing WAL instability in various TAP tests

2021-11-03 Thread Mark Dilger



> On Oct 21, 2021, at 3:23 PM, Bossart, Nathan  wrote:
> 
> Do we intend to proceed with those, or should we just
> close out the Commmitfest entry?

I have withdrawn the patch.  The issues were intermittent on the buildfarm, and 
committing other changes along with what Tom already committed would seem to 
confuse matters if any new issues were to arise.  We can come back to this 
sometime in the future, if need be.

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: extended stats on partitioned tables

2021-11-03 Thread Tomas Vondra



On 11/4/21 12:19 AM, Justin Pryzby wrote:
> On Wed, Nov 03, 2021 at 11:48:44PM +0100, Tomas Vondra wrote:
>> On 10/8/21 12:45 AM, Justin Pryzby wrote:
>>> On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:
 On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
> On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:
>> It seems like your patch should also check "inh" in examine_variable and
>> statext_expressions_load.
>
> I tried adding that - I mostly kept my patches separate.
> Hopefully this is more helpful than a complication.
> I added at: https://commitfest.postgresql.org/35/3332/
>

 Actually, this is confusing. Which patch is the one we should be
 reviewing?
>>>
>>> It is confusing, but not as much as I first thought.  Please check the 
>>> commit
>>> messages.
>>>
>>> The first two patches are meant to be applied to master *and* backpatched.  
>>> The
>>> first one intends to fixes the bug that non-inherited stats are being used 
>>> for
>>> queries of inheritance trees.  The 2nd one fixes the regression that stats 
>>> are
>>> not collected for inheritence trees of partitioned tables (which is the only
>>> type of stats they could ever possibly have).
>>
>> I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
>> the (!rte->inh) check in the rel->statlist loops. AFAICS both places
>> could do that right at the beginning, because it does not depend on the
>> statistics object at all, just the RelOptInfo.
> 
> I probably did this to make the code change small, to avoid indentin the whole
> block.

But indenting the block is not necessary. It's possible to do something
like this:

if (!rel->inh)
return 1.0;

or whatever is the "default" result for that function.

> 
>>> And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
>>> inherited and non-inherited stats, only in master, since it requires a 
>>> catalog
>>> change.  It's a bit confusing that patch #4 removes most what I added in
>>> patches 1 and 2.  But that's exactly what's needed to collect and apply both
>>> inherited and non-inherited stats: the first two patches avoid applying 
>>> stats
>>> collected with the wrong inheritence.  That's also what's needed for the
>>> patchset to follow the normal "apply to master and backpatch" process, 
>>> rather
>>> than 2 patches which are backpatched but not applied to master, and one 
>>> which
>>> is applied to master and not backpatched..
>>>
>>
>> Yeah. Af first I was a bit confused because after applying 0003 there
>> are both the fixes and the "correct" way, but then I realized 0004
>> removes the unnecessary bits.
> 
This was to leave your 0003 (mostly) unchanged, so you can see and/or apply my
changes.  They should be squashed together.
> 

Yep.

>> The one thing 0003 still needs is to rework the places that need to
>> touch both inh and !inh stats. The patch simply does
>>
>> for (inh = 0; inh <= 1; inh++) { ... }
>>
>> but that feels a bit too hackish. But if we don't know which of the two
>> stats exist, I'm not sure what to do about it. 
> 
> There's also this:
> 
> On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
>> +   /* create only the "stxdinherit=false", because that always exists */
>> +   datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = 
>> ObjectIdGetDatum(false);
>>
>> That'd be confusing for partitioned tables, no?
>> They'd always have an row with no data.
>> I guess it could be stxdinherit = BoolGetDatum(rel->rd_rel->relkind == 
>> RELKIND_PARTITIONED_TABLE).
>> (not ObjectIdGetDatum).
>> Then, that affects the loops which delete the tuples - neither inh nor !inh 
>> is
>> guaranteed, unless you check relkind there, too.
> 
> Maybe the for inh<=1 loop should instead be two calls to new functions 
> factored
> out of get_relation_statistics() and RemoveStatisticsById(), which take "bool
> inh".
> 

Well, yeah. That's part of the strange 1:2 mapping between the stats
definition and data. Although, even with regular stats we have such
mapping, except the "definition" is the pg_attribute row.

>> And I'm not sure we do the right thing after removing children, for example
>> (that should drop the inheritance stats, I guess).
> 
> Do you mean for inheritance only ?  Or partitions too ?
> I think for partitions, the stats should stay.
> And for inheritence, they can stay, for consistency with partitions, and since
> it does no harm.
> 

I think the behavior should be the same as for data in pg_statistic,
i.e. if we keep/remove those, we should do the same thing for extended
statistics.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM?

2021-11-03 Thread Peter Geoghegan
The code in gistvacuum.c is closely based on similar code in nbtree.c,
except that it only acquires an exclusive lock -- not a
super-exclusive lock. I suspect that that's because it seemed
unnecessary; nbtree plain index scans have their own special reasons
for this, that don't apply to GiST. Namely: nbtree plain index scans
that don't use an MVCC snapshot clearly need some other interlock to
protect against concurrent recycling of pointed-to-by-leaf-page TIDs.
And so as a general rule nbtree VACUUM needs a full
super-exclusive/cleanup lock, just in case there is a plain index scan
that uses some other kind of snapshot (logical replication, say).

To say the same thing another way: nbtree follows "the third rule"
described by "62.4. Index Locking Considerations" in the docs [1], but
GiST does not. The idea that GiST's behavior is okay here does seem
consistent with what the same docs go on to say about it: "When using
an MVCC-compliant snapshot, there is no problem because the new
occupant of the slot is certain to be too new to pass the snapshot
test".

But what about index-only scans, which GiST also supports? I think
that the rules are different there, even though index-only scans use
an MVCC snapshot.

The (admittedly undocumented) reason why we can never drop the leaf
page pin for an index-only scan in nbtree (but can do so for plain
index scans) also relates to heap interlocking -- but with a twist.
Here's the twist: the second heap pass by VACUUM can set visibility
map bits independently of the first (once LP_DEAD items from the first
pass over the heap are set to LP_UNUSED, which renders the page
all-visible) -- this all happens at the end of
lazy_vacuum_heap_page(). That's why index-only scans can't just assume
that VACUUM won't have deleted the TID from the leaf page they're
scanning immediately after they're done reading it. VACUUM could even
manage to set the visibility map bit for a relevant heap page inside
lazy_vacuum_heap_page(), before the index-only scan can read the
visibility map. If that is allowed to happen, the index-only would
give wrong answers if one of the TID references held in local memory
by the index-only scan happens to be marked LP_UNUSED inside
lazy_vacuum_heap_page(). IOW, it looks like we run the risk of a
concurrently recycled dead-to-everybody TID becoming visible during
GiST index-only scans, just because we have no interlock.

In summary:

IIUC this is only safe for nbtree because 1.) It acquires a
super-exclusive lock when vacuuming leaf pages, and 2.) Index-only
scans never drop their pin on the leaf page when accessing the
visibility map "in-sync" with the scan (of course we hope not to
access the heap proper at all for index-only scans). These precautions
are both necessary to make the race condition I describe impossible,
because they ensure that VACUUM cannot reach lazy_vacuum_heap_page()
until after our index-only scan reads the visibility map (and then has
to read the heap for at least that one dead-to-all TID, discovering
that the TID is dead to its snapshot). Why wouldn't GiST need to take
the same precautions, though?

[1] https://www.postgresql.org/docs/devel/index-locking.html
--
Peter Geoghegan




Re: lastOverflowedXid does not handle transaction ID wraparound

2021-11-03 Thread Stan Hu
Good catch on doing this in ExpireAllKnownAssignedTransactionIds() as well.
Thanks. Looks good to me!

As Nikolay mentioned, I think this is an important bug that we are seeing
in production and would appreciate a backport to v12 if possible.

On Wed, Nov 3, 2021 at 3:07 PM Alexander Korotkov 
wrote:

> Hi!
>
> On Wed, Nov 3, 2021 at 8:51 PM Simon Riggs 
> wrote:
> > It is however, an undocumented modularity violation. I think that is
> > acceptable because of the ProcArrayLock traffic, but needs to have a
> > comment to explain this at the call to
> > ExpireOldKnownAssignedTransactionIds() i.e. " and potentially reset
> > lastOverflowedXid", as well as a comment on the
> > ExpireOldKnownAssignedTransactionIds() function.
>
> Thank you for your feedback.  Please find the revised patch attached.
> It incorporates these function comment changes together with minor
> edits and the commit message. Let me know if you have further
> suggestions.
>
> I'm going to push this if no objections.
>
> --
> Regards,
> Alexander Korotkov
>


Re: minor gripe about lax reloptions parsing for views

2021-11-03 Thread Mark Dilger


> On Oct 1, 2021, at 12:34 PM, Mark Dilger  wrote:
> 
> The patch does it this way. 

A rebased patch is attached.



v2-0001-Reject-storage-options-in-toast-namespace-in-view.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: extended stats on partitioned tables

2021-11-03 Thread Justin Pryzby
On Wed, Nov 03, 2021 at 11:48:44PM +0100, Tomas Vondra wrote:
> On 10/8/21 12:45 AM, Justin Pryzby wrote:
> > On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:
> >> On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
> >>> On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:
>  It seems like your patch should also check "inh" in examine_variable and
>  statext_expressions_load.
> >>>
> >>> I tried adding that - I mostly kept my patches separate.
> >>> Hopefully this is more helpful than a complication.
> >>> I added at: https://commitfest.postgresql.org/35/3332/
> >>>
> >>
> >> Actually, this is confusing. Which patch is the one we should be
> >> reviewing?
> > 
> > It is confusing, but not as much as I first thought.  Please check the 
> > commit
> > messages.
> > 
> > The first two patches are meant to be applied to master *and* backpatched.
> > The first one intends to fix the bug that non-inherited stats are being
> > used for queries of inheritance trees.  The 2nd one fixes the regression
> > that stats are not collected for inheritance trees of partitioned tables
> > (which is the only type of stats they could ever possibly have).
> 
> I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
> the (!rte->inh) check in the rel->statlist loops. AFAICS both places
> could do that right at the beginning, because it does not depend on the
> statistics object at all, just the RelOptInfo.

I probably did this to make the code change small, to avoid indenting the whole
block.

> > And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
> > inherited and non-inherited stats, only in master, since it requires a
> > catalog change.  It's a bit confusing that patch #4 removes most of what
> > I added in patches 1 and 2.  But that's exactly what's needed to collect
> > and apply both inherited and non-inherited stats: the first two patches
> > avoid applying stats collected with the wrong inheritance.  That's also
> > what's needed for the patchset to follow the normal "apply to master and
> > backpatch" process, rather than 2 patches which are backpatched but not
> > applied to master, and one which is applied to master and not backpatched.
> > 
> 
> Yeah. At first I was a bit confused because after applying 0003 there
> are both the fixes and the "correct" way, but then I realized 0004
> removes the unnecessary bits.

This was to leave your 0003 (mostly) unchanged, so you can see and/or apply my
changes.  They should be squished together.

> The one thing 0003 still needs is to rework the places that need to
> touch both inh and !inh stats. The patch simply does
> 
> for (inh = 0; inh <= 1; inh++) { ... }
> 
> but that feels a bit too hackish. But if we don't know which of the two
> stats exist, I'm not sure what to do about it. 

There's also this:

On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
> +   /* create only the "stxdinherit=false", because that always exists */
> +   datavalues[Anum_pg_statistic_ext_data_stxdinherit - 1] = ObjectIdGetDatum(false);
> 
> That'd be confusing for partitioned tables, no?
> They'd always have a row with no data.
> I guess it could be stxdinherit = BoolGetDatum(rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE).
> (not ObjectIdGetDatum).
> Then, that affects the loops which delete the tuples - neither inh nor !inh is
> guaranteed, unless you check relkind there, too.

Maybe the for inh<=1 loop should instead be two calls to new functions factored
out of get_relation_statistics() and RemoveStatisticsById(), which take "bool
inh".

> And I'm not sure we do the right thing after removing children, for example
> (that should drop the inheritance stats, I guess).

Do you mean for inheritance only ?  Or partitions too ?
I think for partitions, the stats should stay.
And for inheritance, they can stay, for consistency with partitions, and since
it does no harm.




Re: Extending amcheck to check toast size and compression

2021-11-03 Thread Mark Dilger


> On Oct 20, 2021, at 12:06 PM, Mark Dilger  
> wrote:
> 
> Ok.  How about:

Done that way.



v3-0001-Adding-more-toast-pointer-checks-to-amcheck.patch
Description: Binary data


—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: extended stats on partitioned tables

2021-11-03 Thread Tomas Vondra
On 10/8/21 12:45 AM, Justin Pryzby wrote:
> On Thu, Oct 07, 2021 at 03:26:46PM -0500, Jaime Casanova wrote:
>> On Sun, Sep 26, 2021 at 03:25:50PM -0500, Justin Pryzby wrote:
>>> On Sat, Sep 25, 2021 at 05:31:52PM -0500, Justin Pryzby wrote:
 It seems like your patch should also check "inh" in examine_variable and
 statext_expressions_load.
>>>
>>> I tried adding that - I mostly kept my patches separate.
>>> Hopefully this is more helpful than a complication.
>>> I added at: https://commitfest.postgresql.org/35/3332/
>>>
>>
>> Actually, this is confusing. Which patch is the one we should be
>> reviewing?
> 
> It is confusing, but not as much as I first thought.  Please check the commit
> messages.
> 
> The first two patches are meant to be applied to master *and* backpatched.
> The first one intends to fix the bug that non-inherited stats are being used
> for queries of inheritance trees.  The 2nd one fixes the regression that stats
> are not collected for inheritance trees of partitioned tables (which is the
> only type of stats they could ever possibly have).
> 

I think 0001 and 0002 seem mostly fine, but it seems a bit strange to do
the (!rte->inh) check in the rel->statlist loops. AFAICS both places
could do that right at the beginning, because it does not depend on the
statistics object at all, just the RelOptInfo.

> And the 3rd+4th patches (Tomas' plus my changes) allow collecting both
> inherited and non-inherited stats, only in master, since it requires a catalog
> change.  It's a bit confusing that patch #4 removes most of what I added in
> patches 1 and 2.  But that's exactly what's needed to collect and apply both
> inherited and non-inherited stats: the first two patches avoid applying stats
> collected with the wrong inheritance.  That's also what's needed for the
> patchset to follow the normal "apply to master and backpatch" process, rather
> than 2 patches which are backpatched but not applied to master, and one which
> is applied to master and not backpatched.
> 

Yeah. At first I was a bit confused because after applying 0003 there
are both the fixes and the "correct" way, but then I realized 0004
removes the unnecessary bits.

The one thing 0003 still needs is to rework the places that need to
touch both inh and !inh stats. The patch simply does

for (inh = 0; inh <= 1; inh++) { ... }

but that feels a bit too hackish. But if we don't know which of the two
stats exist, I'm not sure what to do about it. And I'm not sure we do
the right thing after removing children, for example (that should drop
the inheritance stats, I guess).

The 1:2 mapping between pg_statistic_ext and pg_statistic_ext_data is a
bit strange, but I can't think of a better way.


> @Tomas: I just found commit 427c6b5b9, which is a remarkably similar issue
> affecting column stats 15 years ago.
> 

What can I say? The history repeats itself ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Justin Pryzby
On Tue, Nov 02, 2021 at 08:02:41PM -0400, Tom Lane wrote:
> I'm still of the position that the default ought to be that a
> normally-functioning server generates no ongoing log output.
> Only people who have got Nagios watching their logs, or some
> such setup, are going to want anything different.  And that is
> a minority use-case.  There are going to be way more people
> bitching because their postmaster log overflowed their disk
> than there will be people who are happier because you made
> such output the default.  (Don't forget that our default
> logging setup does not rotate the logs.)

The default also has logging_collector=off, so is ENOSPC really a concern for
anyone at all ?

I think there's no issue for distros who distribute postgres with their own
modified defaults, since they can either use their own customized default with
log_checkpoints=off, or they can accommodate our "default default".

Debian uses logging_collector=off but does its own redirection, with log
rotation and compression.  That's the case that Robert described.

-- 
Justin




Re: CREATE ROLE IF NOT EXISTS

2021-11-03 Thread Tom Lane
David Christensen  writes:
> Updated version attached.

I'm generally pretty down on IF NOT EXISTS semantics in all cases,
but it seems particularly dangerous for something as fundamental
to privilege checks as a role.  It's not hard at all to conjure up
scenarios in which this permits privilege escalation.  That is,
Alice wants to create role Bob and give it some privileges, but
she's lazy and writes a quick-and-dirty script using CREATE ROLE
IF NOT EXISTS.  Meanwhile Charlie sneaks in and creates Bob first,
and then grants it to himself.  Now Alice's script is giving away
all sorts of privilege to Charlie.  (Admittedly, Charlie must have
CREATEROLE privilege already, but that doesn't mean he has every
privilege that Alice has --- especially not as we continue working
to slice the superuser salami ever more finely.)
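
To spell the scenario out concretely (a hypothetical session using the
syntax proposed in this patch; the role names and the particular GRANT
are only for illustration):

    -- Charlie, who has CREATEROLE, gets there first:
    CREATE ROLE bob;
    GRANT bob TO charlie;

    -- Alice's quick-and-dirty script runs later:
    CREATE ROLE IF NOT EXISTS bob;    -- silently does nothing
    GRANT pg_read_all_data TO bob;    -- Charlie now effectively has this too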

Do we really need this?

regards, tom lane




Re: row filtering for logical replication

2021-11-03 Thread Peter Smith
On Tue, Nov 2, 2021 at 10:44 PM Ajin Cherian  wrote:
>
> Here's a rebase of the first 4 patches of the row-filter patch. Some
> issues still remain:
>
> 1. the following changes for adding OptWhereClause to the
> PublicationObjSpec has not been added
> as the test cases for this has not been yet rebased:
>
> PublicationObjSpec:
> ...
> + TABLE relation_expr OptWhereClause
> ...
> + | ColId OptWhereClause
> ...
>  + | ColId indirection OptWhereClause
> ...
> + | extended_relation_expr OptWhereClause
>

This is addressed in the v36-0001 patch [1]

--
[1] 
https://www.postgresql.org/message-id/CAFPTHDYKfxTr2zpA-fC12u%2BhL2abCc%3D276OpJQUTyc6FBgYX9g%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: lastOverflowedXid does not handle transaction ID wraparound

2021-11-03 Thread Alexander Korotkov
Hi!

On Wed, Nov 3, 2021 at 8:51 PM Simon Riggs  wrote:
> It is however, an undocumented modularity violation. I think that is
> acceptable because of the ProcArrayLock traffic, but needs to have a
> comment to explain this at the call to
> ExpireOldKnownAssignedTransactionIds() i.e. " and potentially reset
> lastOverflowedXid", as well as a comment on the
> ExpireOldKnownAssignedTransactionIds() function.

Thank you for your feedback.  Please find the revised patch attached.
It incorporates these function comment changes together with minor
edits and the commit message. Let me know if you have further
suggestions.

I'm going to push this if no objections.

--
Regards,
Alexander Korotkov


0001-Reset-lastOverflowedXid-on-standby-when-needed.patch
Description: Binary data


Re: CREATE ROLE IF NOT EXISTS

2021-11-03 Thread David Christensen
>
>
> This fails the roleattributes test in "make check", with what seems to be a
> trivial change in the output.  Can you please submit a rebased version
> fixing
> the test?
>

Updated version attached.

David


CREATE-ROLE-IF-NOT-EXISTS-v2.patch
Description: Binary data


Re: pg14 psql broke \d datname.nspname.relname

2021-11-03 Thread Mark Dilger


> On Nov 3, 2021, at 12:07 PM, Tom Lane  wrote:
> 
> Mark Dilger  writes:
>> [ v1-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch ]
> 
> This needs a rebase over the recent renaming of our Perl test modules.
> (Per the cfbot, so do several of your other pending patches.)
> 
>   regards, tom lane

Thanks for calling my attention to it.



v2-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: [PATCH] Proof of concept for GUC improvements

2021-11-03 Thread David Christensen

> On Nov 3, 2021, at 5:35 AM, Daniel Gustafsson  wrote:
> 
> 
>> 
>>> On 15 Oct 2021, at 23:54, Cary Huang  wrote:
>> 
>> I scanned through the GUC list and found that the following parameters can
>> potentially be categorized in the "special_disabled0" group, just for your
>> reference.
> 
> 
> This patch no longer applies, can you please submit a rebased version?  Also,
> do you have any thoughts on Cary's suggestion in the above review?

Hi, enclosed is a v3, which includes the rebase as well as the additional int 
GUCs mentioned in Cary’s review. I can add support for Reals (required for the 
JIT and several other ones), but wanted to get feedback/buy-in on the overall 
idea first. 

Best,

David



special-guc-values-v3.patch
Description: Binary data


Re: [PATCH] Native spinlock support on RISC-V

2021-11-03 Thread Thomas Munro
On Wed, Nov 3, 2021 at 5:41 PM Thomas Munro  wrote:
> 'generic' is not a recognized processor for this target (ignoring processor)

I still don't know what's wrong but I spent 20 minutes searching for
more clues this morning...

First, 'generic' is coming from LLVMGetHostCPUName(), and yet it's not
accepted by LLVMCreateTargetMachine().  Ok then, let's try something
else:

$ clang12 --print-supported-cpus

Target: riscv64-portbld-freebsd13.0
Thread model: posix
InstalledDir: /usr/local/llvm12/bin
Available CPUs for this target:

generic-rv32
generic-rv64
rocket-rv32
rocket-rv64
sifive-7-rv32
sifive-7-rv64
sifive-e31
sifive-e76
sifive-u54
sifive-u74

So I hacked my copy of PostgreSQL thusly:

+   cpu = LLVMCreateMessage("generic-rv64");

... and now I get:

2021-11-03 20:27:28.487 UTC [26880] FATAL:  fatal llvm error:
Relocation type not implemented yet!

Taking that at face value, instead of LLVMRelocDefault, I tried a
couple of other values like LLVMRelocDynamicNoPic, but same result.
So I attached a debugger to see the stack producing the error, and
huh, I see a clue that it's confusing this architecture with SystemZ
(= IBM mainframe).

(gdb) break fatal_llvm_error_handler
Function "fatal_llvm_error_handler" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (fatal_llvm_error_handler) pending.
(gdb) cont
Continuing.

Breakpoint 1, fatal_llvm_error_handler (user_data=0x0,
reason="Relocation type not implemented yet!", gen_crash_diag=true) at
llvmjit_error.cpp:147
147ereport(FATAL,
(gdb) bt
#0  fatal_llvm_error_handler (user_data=0x0, reason="Relocation type
not implemented yet!", gen_crash_diag=true) at llvmjit_error.cpp:147
#1  0x4dc3729a in llvm::report_fatal_error(llvm::Twine const&,
bool) () from /usr/local/llvm12/lib/libLLVM-12.so
#2  0x4dc371d0 in llvm::report_fatal_error(char const*, bool)
() from /usr/local/llvm12/lib/libLLVM-12.so
#3  0x4ef61010 in
llvm::RuntimeDyldELF::resolveSystemZRelocation(llvm::SectionEntry
const&, unsigned long, unsigned long, unsigned int, long) () from
/usr/local/llvm12/lib/libLLVM-12.so
#4  0x4ef547c6 in
llvm::RuntimeDyldImpl::applyExternalSymbolRelocations(llvm::StringMap) () from /usr/local/llvm12/lib/libLLVM-12.so
#5  0x4ef54fb4 in ?? () from /usr/local/llvm12/lib/libLLVM-12.so
#6  0x4ef54be6 in
llvm::RuntimeDyldImpl::finalizeAsync(std::__1::unique_ptr >,
llvm::unique_function,
std::__1::unique_ptr >,
llvm::Error)>, llvm::object::OwningBinary,
std::__1::unique_ptr >) ()
   from /usr/local/llvm12/lib/libLLVM-12.so
#7  0x4ef55e2a in
llvm::jitLinkForORC(llvm::object::OwningBinary,
llvm::RuntimeDyld::MemoryManager&, llvm::JITSymbolResolver&, bool,
llvm::unique_function,
std::__1::allocator > >)>, llvm::unique_function,
std::__1::unique_ptr >,
llvm::Error)>) ()
   from /usr/local/llvm12/lib/libLLVM-12.so
#8  0x4ef421ea in
llvm::orc::RTDyldObjectLinkingLayer::emit(std::__1::unique_ptr >,
std::__1::unique_ptr >) () from
/usr/local/llvm12/lib/libLLVM-12.so
#9  0x4ef3e946 in
llvm::orc::ObjectTransformLayer::emit(std::__1::unique_ptr >,
std::__1::unique_ptr >) () from
/usr/local/llvm12/lib/libLLVM-12.so
#10 0x4ef25c0c in
llvm::orc::IRCompileLayer::emit(std::__1::unique_ptr >,
llvm::orc::ThreadSafeModule) () from
/usr/local/llvm12/lib/libLLVM-12.so
#11 0x4ef25fb2 in
llvm::orc::IRTransformLayer::emit(std::__1::unique_ptr >,
llvm::orc::ThreadSafeModule) () from
/usr/local/llvm12/lib/libLLVM-12.so
#12 0x4ef25fb2 in
llvm::orc::IRTransformLayer::emit(std::__1::unique_ptr >,
llvm::orc::ThreadSafeModule) () from
/usr/local/llvm12/lib/libLLVM-12.so
#13 0x4ef2a614 in
llvm::orc::BasicIRLayerMaterializationUnit::materialize(std::__1::unique_ptr >)
() from /usr/local/llvm12/lib/libLLVM-12.so
#14 0x4ef08dc2 in ?? () from /usr/local/llvm12/lib/libLLVM-12.so
#15 0x4ef0f61c in std::__1::__function::__func >,
std::__1::unique_ptr >),
std::__1::allocator >,
std::__1::unique_ptr
>)>, void (std::__1::unique_ptr >,
std::__1::unique_ptr
>)>::operator()(std::__1::unique_ptr >&&,
std::__1::unique_ptr
>&&) ()
   from /usr/local/llvm12/lib/libLLVM-12.so
#16 0x4ef09c12 in
llvm::orc::ExecutionSession::dispatchOutstandingMUs() () from
/usr/local/llvm12/lib/libLLVM-12.so
#17 0x4ef0b684 in
llvm::orc::ExecutionSession::OL_completeLookup(std::__1::unique_ptr >,
std::__1::shared_ptr,
std::__1::function >,
llvm::DenseMapInfo,
llvm::detail::DenseMapPair > > > const&)>) () from
/usr/local/llvm12/lib/libLLVM-12.so
#18 0x4ef141ee in ?? () from /usr/local/llvm12/lib/libLLVM-12.so
#19 0x4ef01c1e in
llvm::orc::ExecutionSession::OL_applyQueryPhase1(std::__1::unique_ptr >,
llvm::Error) () from /usr/local/llvm12/lib/libLLVM-12.so
#20 0x4ef0096c in
llvm::orc::ExecutionSession::lookup(llvm::orc::LookupKind,
std::__1::vector,
std::__1::allocator > > const&,
llvm::orc::Symbol

Re: On login trigger: take three

2021-11-03 Thread Daniel Gustafsson
> On 3 Nov 2021, at 17:15, Ivan Panchenko  wrote:
> Wednesday, 29 September 2021, 15:14 +03:00 from Teodor Sigaev  >:
> 2 For logging purposes, a failing trigger will make it impossible to log in; it
> could be worked around by catching the error in the trigger function, but that's
> not so obvious for a DBA.
> If the trigger contains an error, nobody can log in. The database is bricked.
> The previous variant of the patch proposed to fix this by going to single-user
> mode.
> For faster recovery I propose to also have a GUC variable to turn the login
> triggers on/off.
> It should be 'on' by default.

As voiced earlier, I disagree with this and I dislike the idea of punching a
hole for circumventing infrastructure intended for auditing.

Use-cases for a login-trigger commonly involve audit trail logging, session
initialization etc.  If the login trigger bricks the production database to the
extent that it needs to be restarted with the magic GUC, then it's highly
likely that you *don't* want regular connections to the database for the
duration of this.  Any such connection won't be subject to what the trigger
does which seem counter to having the trigger in the first place.  This means
that it's likely that the superuser fixing it will have to disable logins for
everyone else while fixing, and it quicly becomes messy.

With that in mind, I think single-user mode actually *helps* the users in this
case, and we avoid a hole punched which in worst case can be a vector for an
attack.

Maybe I'm overly paranoid, but adding a backdoor of sorts for a situation which
really shouldn't happen doesn't seem like a good idea.
> some other issues:
> 3 It's easy to create a security hole, see the attachment where a non-privileged
> user can close access to the database even for a superuser.
> Such cases can be avoided by careful design of the login triggers and related
> permissions. Added a note about this to the documentation.

If all login triggers are written carefully like that, we don't need the GUC to
disable them?

--
Daniel Gustafsson   https://vmware.com/





Re: Empty string in lexeme for tsvector

2021-11-03 Thread Tom Lane
Jean-Christophe Arnu  writes:
>> (By the same token, I think there's a good argument for
>> tsvector_delete_arr to just ignore nulls, not throw an error.
>> That's a somewhat orthogonal issue, though.)

> Nulls are now ignored in tsvector_delete_arr().

I think setweight() with an array should handle this the same as
ts_delete() with an array, so the attached v6 does it like that.

> I also wonder if we should not also consider changing COPY FROM  behaviour
> on empty string lexemes.
> The current version just crashes on empty string lexemes. Should
> we allow them anyway as COPY FROM input (it seems weird not to be able
> to re-import dumped data) or "skipping them" just like array_to_tsvector()
> does in the patched version (that may lead to database content changes) or
> finally should we not change COPY behaviour ?

No, I don't think so.  tsvector's restriction against empty lexemes was
certainly intentional from the beginning, so I wouldn't be surprised if
we'd run into semantic difficulties if we remove it.  Moreover, we're
going on fourteen years with that restriction and we've had few
complaints, so there's no field demand to loosen it.  It's clearly just
an oversight that array_to_tsvector() failed to enforce the restriction.
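
For illustration, the behavior described above (with the attached patch;
exact error wording aside) comes out roughly as:

    SELECT array_to_tsvector('{fat,cat,rat}'::text[]);   -- fine
    SELECT array_to_tsvector('{fat,"",rat}'::text[]);    -- now rejected: empty lexeme
    SELECT ts_delete('fat:2 cat:3'::tsvector,
                     ARRAY['fat', NULL]);                -- NULL elements are just ignored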

I polished the docs and tests a bit more, too.  I think the attached
is committable -- any objections?

regards, tom lane

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4b49dff2ff..24447c0017 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -12920,8 +12920,10 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
 tsvector


-Converts an array of lexemes to a tsvector.
-The given strings are used as-is without further processing.
+Converts an array of text strings to a tsvector.
+The given strings are used as lexemes as-is, without further
+processing.  Array elements must not be empty strings
+or NULL.


 array_to_tsvector('{fat,cat,rat}'::text[])
@@ -13104,6 +13106,9 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
 Assigns the specified weight to elements
 of the vector that are listed
 in lexemes.
+The strings in lexemes are taken as lexemes
+as-is, without further processing.  Strings that do not match any
+lexeme in vector are ignored.


 setweight('fat:2,4 cat:3 rat:5,6B'::tsvector, 'A', '{cat,rat}')
@@ -13265,6 +13270,8 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple

 Removes any occurrence of the given lexeme
 from the vector.
+The lexeme string is treated as a lexeme as-is,
+without further processing.


 ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')
@@ -13281,6 +13288,9 @@ CREATE TYPE rainbow AS ENUM ('red', 'orange', 'yellow', 'green', 'blue', 'purple
 Removes any occurrences of the lexemes
 in lexemes
 from the vector.
+The strings in lexemes are taken as lexemes
+as-is, without further processing.  Strings that do not match any
+lexeme in vector are ignored.


 ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])
diff --git a/src/backend/utils/adt/tsvector_op.c b/src/backend/utils/adt/tsvector_op.c
index 9236ebcc8f..11ccb5297c 100644
--- a/src/backend/utils/adt/tsvector_op.c
+++ b/src/backend/utils/adt/tsvector_op.c
@@ -322,10 +322,9 @@ tsvector_setweight_by_filter(PG_FUNCTION_ARGS)
 		int			lex_len,
 	lex_pos;
 
+		/* Ignore null array elements, they surely don't match */
 		if (nulls[i])
-			ereport(ERROR,
-	(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
-	 errmsg("lexeme array may not contain nulls")));
+			continue;
 
 		lex = VARDATA(dlexemes[i]);
 		lex_len = VARSIZE(dlexemes[i]) - VARHDRSZ;
@@ -602,10 +601,9 @@ tsvector_delete_arr(PG_FUNCTION_ARGS)
 		int			lex_len,
 	lex_pos;
 
+		/* Ignore null array elements, they surely don't match */
 		if (nulls[i])
-			ereport(ERROR,
-	(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
-	 errmsg("lexeme array may not contain nulls")));
+			continue;
 
 		lex = VARDATA(dlexemes[i]);
 		lex_len = VARSIZE(dlexemes[i]) - VARHDRSZ;
@@ -761,13 +759,21 @@ array_to_tsvector(PG_FUNCTION_ARGS)
 
 	deconstruct_array(v, TEXTOID, -1, false, TYPALIGN_INT, &dlexemes, &nulls, &nitems);
 
-	/* Reject nulls (maybe we should just ignore them, instead?) */
+	/*
+	 * Reject nulls and zero length strings (maybe we should just ignore them,
+	 * instead?)
+	 */
 	for (i = 0; i < nitems; i++)
 	{
 		if (nulls[i])
 			ereport(ERROR,
 	(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 	 errmsg("lexeme array may not contain nulls")));
+
+		if (VARSIZE(dlexemes[i]) - VARHDRSZ == 0)
+			ereport(ERROR,
+	(errcode(ERRCODE_ZERO_LENGTH_CHARACTER_STRING),
+

Re: remove internal support in pgcrypto?

2021-11-03 Thread Daniel Gustafsson
> On 3 Nov 2021, at 16:06, Peter Eisentraut  
> wrote:
> 
> On 03.11.21 11:16, Daniel Gustafsson wrote:
>>> On 30 Oct 2021, at 14:11, Peter Eisentraut 
>>>  wrote:
>>> 
>>> On 24.08.21 11:13, Peter Eisentraut wrote:
 So I'm tempted to suggest that we remove the built-in, non-OpenSSL cipher 
 and hash implementations in pgcrypto (basically INT_SRCS in 
 pgcrypto/Makefile), and then also pursue the simplifications in the 
 OpenSSL code paths described in [0].
>>> 
>>> Here is a patch for this.
>> This patch doesn't work on Windows, which I think is because it pulls in
>> pgcrypto even in builds without OpenSSL.  Poking at that led me to realize 
>> that
>> we can simplify even more with this.  The conditonal source includes can go
>> away and be replaced with a simple OBJS clause, and with that the special 
>> hacks
>> in Mkvcbuild.pm to overcome that.
>> Attached is a diff on top of your patch to do the above.  I haven't tested it
>> on Windows yet, but if you think it's in the right direction we'll take it 
>> for
>> a spin in a CI with/without OpenSSL.
> 
> Here is a consolidated patch.  I have tested it locally, so it should be okay 
> on Windows.

I don't think this bit is correct, as OSSL_TESTS has been removed from the
Makefile:

- $config->{openssl}
- ? GetTests("OSSL_TESTS", $m)
- : GetTests("INT_TESTS",  $m);
+ GetTests("OSSL_TESTS", $m);

I think we need something like the (untested) diff below:

diff --git a/src/tools/msvc/vcregress.pl b/src/tools/msvc/vcregress.pl
index e3a323b8bf..fc2406b2be 100644
--- a/src/tools/msvc/vcregress.pl
+++ b/src/tools/msvc/vcregress.pl
@@ -729,13 +729,10 @@ sub fetchTests
# pgcrypto is special since the tests depend on the
# configuration of the build

-   my $cftests =
- GetTests("OSSL_TESTS", $m);
my $pgptests =
  $config->{zlib}
  ? GetTests("ZLIB_TST", $m)
  : GetTests("ZLIB_OFF_TST", $m);
-   $t =~ s/\$\(CF_TESTS\)/$cftests/;
$t =~ s/\$\(CF_PGP_TESTS\)/$pgptests/;
}
}

>> Now, *if* we merge the NSS patch this does introduce special cases again 
>> which
>> this rips out.  I prefer to try and fix them in that patch to keep avoiding 
>> the
>> need for them rather than keep them on speculation for a patch which hasn't
>> been decided on.
> 
> Okay, I wasn't sure about the preferred way forward here.  I'm content with 
> the approach you have chosen.

I'm honestly not sure either; but as the NSS patch author, if I break it I get
to keep both pieces =)

--
Daniel Gustafsson   https://vmware.com/





Re: Non-superuser subscription owners

2021-11-03 Thread Mark Dilger


> On Nov 1, 2021, at 10:58 AM, Mark Dilger  wrote:
> 
> ALTER SUBSCRIPTION..[ENABLE | DISABLE] do not synchronously start or stop 
> subscription workers.  The ALTER command updates the catalog's subenabled 
> field, but workers only lazily respond to that.  Disabling and enabling the 
> subscription as part of the OWNER TO would not reliably accomplish anything.

I have rethought my prior analysis.  The problem in the previous patch was that 
the subscription apply workers did not check for a change in ownership the way 
they checked for other changes, instead only picking up the new ownership 
information when the worker restarted for some other reason.  This next patch 
set fixes that.  The application of a change record may continue under the old 
ownership permissions when a concurrent command changes the ownership of the 
subscription, but the worker will pick up the new permissions before applying 
the next record.  I think that is consistent enough with reasonable 
expectations.

The first two patches are virtually unchanged.  The third updates the behavior 
of the apply workers, and updates the documentation to match.



v2-0001-Handle-non-superuser-subscription-owners-sensibly.patch
Description: Binary data


v2-0002-Allow-subscription-ownership-by-non-superusers.patch
Description: Binary data


v2-0003-Respect-permissions-within-logical-replication.patch
Description: Binary data

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company





Re: pg14 psql broke \d datname.nspname.relname

2021-11-03 Thread Tom Lane
Mark Dilger  writes:
> [ v1-0001-Reject-patterns-with-too-many-parts-or-wrong-db.patch ]

This needs a rebase over the recent renaming of our Perl test modules.
(Per the cfbot, so do several of your other pending patches.)

regards, tom lane




Re: Add option --drop-cascade for pg_dump/restore

2021-11-03 Thread Tom Lane
Wu Haotian  writes:
> here's the rebased patch.

Looks like it needs rebasing again, probably as a result of our recent
renaming of our Perl test modules.

FWIW, I'd strongly recommend that it's time to pull all that SQL code
hacking out of RestoreArchive and put it in its own function.

regards, tom lane




Re: Predefined role pg_maintenance for VACUUM, ANALYZE, CHECKPOINT.

2021-11-03 Thread Stephen Frost
Greetings,

* Bossart, Nathan (bossa...@amazon.com) wrote:
> On 11/2/21, 11:27 AM, "Stephen Frost"  wrote:
> > * Bossart, Nathan (bossa...@amazon.com) wrote:
> >> The approach in the patch looks alright to me, but another one could
> >> be to build a SelectStmt when parsing CHECKPOINT.  I think that'd
> >> simplify the standard_ProcessUtility() changes.
> >
> > For my 2c, at least, I'm not really partial to either approach, though
> > I'd want to see what error messages end up looking like.  Seems like we
> > might want to exercise a bit more control than we'd be able to if we
> > transformed it directly into a SelectStmt (that is, we might add a HINT:
> > roles with execute rights on pg_checkpoint() can run this command, or
> > something; maybe not too tho).
> 
> I don't feel strongly one way or the other as well, but you have a
> good point about extra control over the error messages.  The latest
> patch just does a standard aclcheck_error(), so you'd probably see
> "permission denied for function" if you didn't have privileges for
> CHECKPOINT.  That could be confusing.

Yeah, that's exactly the thing I was thinking about that might seem odd.
I don't think it's a huge deal but I do think it'd be good for us to at
least think about if we're ok with that or if we want to try and do
something a bit better.
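
For instance (purely illustrative, assuming the patch routes the CHECKPOINT
command through an EXECUTE check on a pg_checkpoint() function as discussed
above), a user lacking the privilege would see something like:

    CHECKPOINT;
    ERROR:  permission denied for function pg_checkpoint

i.e. a complaint about a function the user never typed, rather than a message
phrased in terms of the CHECKPOINT command itself.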

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: XTS cipher mode for cluster file encryption

2021-11-03 Thread Stephen Frost
Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Mon, Nov  1, 2021 at 02:24:36PM -0400, Stephen Frost wrote:
> > I can understand the general idea that we should be sure to engineer
> > this in a way that multiple methods can be used, as surely one day folks
> > will say that AES128 isn't acceptable any more.  In terms of what we'll
> > do from the start, I would think providing the options of AES128 and
> > AES256 would be good to ensure that we have the bits covered to support
> > multiple methods and I don't think that would put us into a situation of
> 
> My patch supports AES128, AES192, and AES256.

Right, so we're already showing that it's flexible to allow for multiple
encryption methods.  If folks want more then it's on them to research
how they'd work exactly and explain why they'd be useful to add and how
users might make an informed choice (though, again, I don't think we
need to go *too* deep into that as we don't for, eg, pgcrypto, and I
don't believe we've ever heard people complain about that).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: lastOverflowedXid does not handle transaction ID wraparound

2021-11-03 Thread Simon Riggs
On Thu, 21 Oct 2021 at 05:01, Kyotaro Horiguchi  wrote:
>
> At Wed, 20 Oct 2021 08:55:12 -0700, Stan Hu  wrote in
> > On Wed, Oct 20, 2021 at 4:00 AM Andrey Borodin  wrote:
> >
> > >
> > >
> > > > 17 окт. 2021 г., в 21:55, Dmitry Dolgov <9erthali...@gmail.com>
> > > написал(а):
> > > > I wonder what would be side
> > > > effects of clearing it when the snapshot is not suboverfloved anymore?
> > >
> > > I think we should just invalidate lastOverflowedXid on every
> > > XLOG_RUNNING_XACTS if subxid_overflow == false. I can't find a reason not
> > > to do so.

I believe that to be an incorrect fix, but so very nearly correct.
There is a documented race condition in the generation of an
XLOG_RUNNING_XACTS record that means there could be a new overflow event
after the snapshot was taken but before it was logged.

> > On a replica, I think it's possible for lastOverflowedXid to be set even if
> > subxid_overflow is false on the primary and secondary (
> > https://github.com/postgres/postgres/blob/dc899146dbf0e1d23fb24155a5155826ddce34c9/src/backend/storage/ipc/procarray.c#L1327).
> > I thought subxid_overflow only gets set if there are more than
> > PGPROC_MAX_CACHED_SUBXIDS (64) used in a given transaction.
> >
> > Should the replica be invalidating lastOverflowedXid if subxcnt goes to
> > zero in XLOG_RUNNING_XACTS? But if there's an outstanding snapshot with an
> > xmin that precedes lastOverflowedXid we might violate MVCC if we invalidate
> > this, so I wonder if we also need to check the snapshot with the lowest
> > xmin?
>
> lastOverflowedXid is the smallest subxid that possibly exists but is
> possibly not known to the standby. So if all top-level transactions
> older than lastOverflowedXid end, that means that all the
> subtransactions in doubt are known to have ended.

Agreed

> XLOG_RUNNING_XACTS reports oldestRunningXid, which is the oldest
> running top-transaction.  The standby expires xids in the KnownAssignedXids
> array that precede the oldestRunningXid.  We are sure that all
> possibly-overflowed subtransactions are gone as well if the oldest xid
> is newer than the first overflowed subtransaction.

Agreed

> As a cross check, the following existing code in GetSnapshotData means
> that no overflow is happening if the smallest xid in the known
> assigned list is larger than lastOverflowedXid, which agrees with the
> consideration above.
>
> procarray.c:2428
> >   subcount = KnownAssignedXidsGetAndSetXmin(snapshot->subxip, &xmin, xmax);
> >
> >   if (TransactionIdPrecedesOrEquals(xmin, procArray->lastOverflowedXid))
> >       suboverflowed = true;
>
>
> If the discussion so far is correct, the following diff will fix the
> issue.
>
> diff --git a/src/backend/storage/ipc/procarray.c 
> b/src/backend/storage/ipc/procarray.c
> index bd3c7a47fe..19682b73ec 100644
> --- a/src/backend/storage/ipc/procarray.c
> +++ b/src/backend/storage/ipc/procarray.c
> @@ -4463,6 +4463,12 @@ ExpireOldKnownAssignedTransactionIds(TransactionId xid)
>  {
> LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> KnownAssignedXidsRemovePreceding(xid);
> +   /*
> +    * reset lastOverflowedXid if we know transactions that have been
> +    * possibly running are gone.
> +    */
> +   if (TransactionIdPrecedes(procArray->lastOverflowedXid, xid))
> +   procArray->lastOverflowedXid = InvalidTransactionId;
> LWLockRelease(ProcArrayLock);
>  }

So I agree with this fix.

It is however, an undocumented modularity violation. I think that is
acceptable because of the ProcArrayLock traffic, but needs to have a
comment to explain this at the call to
ExpireOldKnownAssignedTransactionIds() i.e. " and potentially reset
lastOverflowedXid", as well as a comment on the
ExpireOldKnownAssignedTransactionIds() function.

-- 
Simon Riggs    http://www.EnterpriseDB.com/




Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Jan Wieck

On 11/3/21 09:09, Robert Haas wrote:


For better or for worse, the
distinction between ERROR, FATAL, and PANIC is entirely based on what
we do after printing the message, and NOT on how serious the message
is.


THAT is a real problem with our error handling and logging system. Often 
using RAISE in a user defined function or procedure is part of the 
communication between the application's code in the backend and the 
parts in the middleware. The information that a function rejected user 
input after a sanity check doesn't belong in the system log. It should 
produce an error message on the user's end and that is it. However, 
currently there is no way for a UDF to ensure the transaction gets 
rolled back without raising an ERROR and bloating the log.
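
A minimal sketch of the pattern being described (hypothetical PL/pgSQL, with
made-up table and function names):

    CREATE FUNCTION new_order(item_id int) RETURNS void
    LANGUAGE plpgsql AS $$
    BEGIN
        IF NOT EXISTS (SELECT 1 FROM item WHERE id = item_id) THEN
            -- The only way to guarantee the transaction is rolled back is
            -- to RAISE an ERROR, which also lands in the server log.
            RAISE EXCEPTION 'unknown item %', item_id;
        END IF;
        -- ... the actual NEW-ORDER work would go here ...
    END;
    $$;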


For example, the BenchmarkSQL UDF implementation raises such an ERROR for 
the transactions that the TPC-C requires to contain an input error 
(non-existing ITEM), which happens on 1% of all NEW-ORDER transactions. 
Running an 8-core system with plenty of RAM close to saturation will 
generate about 10MB of log just for that per hour. That is a quarter GB 
per day of useless garbage that no DBA is ever going to be interested in.


If anybody is worried about producing too much log output, this should 
be their primary focus.



Regards, Jan





Re: Parallelize correlated subqueries that execute within each worker

2021-11-03 Thread James Coleman
Hi Robert, thanks for the detailed reply.

On Wed, Nov 3, 2021 at 10:48 AM Robert Haas  wrote:
>
> As a preliminary comment, it would be quite useful to get Tom Lane's
> opinion on this, since it's not an area I understand especially well,
> and I think he understands it better than anyone.
>
> On Fri, May 7, 2021 at 12:30 PM James Coleman  wrote:
> > The basic idea is that we need to track (both on nodes and relations)
> > not only whether that node or rel is parallel safe but also whether
> > it's parallel safe assuming params are rechecked in the using context.
> > That allows us to delay making a final decision until we have
> > sufficient context to conclude that a given usage of a Param is
> > actually parallel safe or unsafe.
>
> I don't really understand what you mean by "assuming params are
> rechecked in the using context." However, I think that a possibly
> better approach to this whole area would be to try to solve the
> problem by putting limits on where you can insert a Gather node.
> Consider:
>
> Nested Loop
> -> Seq Scan on x
> -> Index Scan on y
>Index Cond: y.q = x.q
>
> If you insert a Gather node atop the Index Scan, necessarily changing
> it to a Parallel Index Scan, then you need to pass values around. For
> every value we get for x.q, we would need to start workers, sending
> them the value of x.q, and they do a parallel index scan working
> together to find all rows where y.q = x.q, and then exit. We repeat
> this for every tuple from x.q. In the absence of infrastructure to
> pass those parameters, we can't put the Gather there. We also don't
> want to, because it would be really slow.
>
> If you insert the Gather node atop the Seq Scan or the Nested Loop, in
> either case necessarily changing the Seq Scan to a Parallel Seq Scan,
> you have no problem. If you put it on top of the Nested Loop, the
> parameter will be set in the workers and used in the workers and
> everything is fine. If you put it on top of the Seq Scan, the
> parameter will be set in the leader -- by the Nested Loop -- and used
> in the leader, and again you have no problem.
>
> So in my view of the world, the parameter just acts as an additional
> constraint on where Gather nodes can be placed. I don't see that there
> are any parameters that are unsafe categorically -- they're just
> unsafe if the place where they are set is on a different side of the
> Gather from the place where they are used. So I don't understand --
> possibly just because I'm dumb -- the idea behind
> consider_parallel_rechecking_params, because that seems to be making a
> sort of overall judgement about the safety or unsafety of the
> parameter on its own merits, rather than thinking about the Gather
> placement.

I had to read through this several times before I understood the point
(not your fault, this is, as you note, a complicated area). I *think*
if I grok it properly you're effectively describing what this patch
results in conceptually (but possibly solving it from a different
direction).

As I understand the current code, parallel plans are largely chosen
based not on where it's safe to insert a Gather node but rather by
determining if a given path is parallel safe. Through that lens params
are a bit of an odd man out -- they aren't inherently unsafe in the
way a parallel-unsafe function is, but they can only be used in
parallel plans under certain conditions (whether because of project
policy, performance, or missing infrastructure).

Under that paradigm the existing consider_parallel and parallel_safe
boolean values imply everything is about whether a plan is inherently
parallel safe. Thus the current code doesn't have the context to handle the
nuance of params (as they are not inherently parallel-safe or unsafe).

Introducing consider_parallel_rechecking_params and
parallel_safe_ignoring_params allows us to keep more context on params
and make a more nuanced decision at the proper level of the plan. This
is what I mean by "rechecked in the using context", though I realize
now that both "recheck" and "context" are overloaded terms in the
project, so don't describe the concept particularly clearly. When a
path relies on params we can only make a final determination about its
parallel safety if we know whether or not the current parallel node
can provide the param's value. We don't necessarily know that
information until we attempt to generate a full parallel node in the
plan (I think what you're describing as "inserting a Gather node")
since the param may come from another node in the plan. These new
values allow us to do that by tracking tentatively parallel-safe
subplans (given proper Gather node placement) and delaying the
parallel-safety determination until the point at which a param is
available (or not).

Is that a more helpful framing of what my goal is here?
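
To make that concrete, a (contrived) example of the query shape in question --
a correlated subquery whose Param is produced and consumed below wherever a
Gather might go (the table and column names are made up):

    SELECT *
    FROM orders o
    WHERE o.amount > (SELECT avg(o2.amount)
                      FROM orders o2
                      WHERE o2.customer_id = o.customer_id);

Today the correlated Param makes the subquery parallel-restricted outright;
the idea here is to defer that judgement until we know whether the Param's
value is supplied on the same side of the eventual Gather.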

> When I last worked on this, I had hoped that extParam or allParam
> would be the thing that would answer the question: are there any
> parameters used under this no

Re: [PATCH] Full support for index LP_DEAD hint bits on standby

2021-11-03 Thread Antonin Houska
Michail Nikolaev  wrote:

> > * Is the purpose of the repeatable read (RR) snapshot to test that
> >  heap_hot_search_buffer() does not set deadness->all_dead if some 
> > transaction
> >  can still see a tuple of the chain?
> 
> The main purpose is to test xactStartedInRecovery logic after the promotion.
> For example -
> > if (scan->xactStartedInRecovery && !RecoveryInProgress())`

I understand that the RR snapshot is used to check the MVCC behaviour; however,
this comment seems to indicate that the RR snapshot should also prevent the
standby from setting the hint bits.

# Make sure previous queries not set the hints on standby because
# of RR snapshot

I can imagine that on the primary, but I don't think that the backend that
checks visibility on the standby checks other snapshots/backends. And it
didn't work when I ran the test manually, although I could have missed
something.


A few more notes regarding the tests:

* 026_standby_index_lp_dead.pl should probably be renamed to
  027_standby_index_lp_dead.pl (026_* was created in the master branch
  recently)


* The test fails, although I have configured the build with
  --enable-tap-tests.

BEGIN failed--compilation aborted at t/026_standby_index_lp_dead.pl line 5.
t/026_standby_index_lp_dead.pl .. Dubious, test returned 2 (wstat 512, 0x200)

I suspect the testing infrastructure changed recently.

* Messages like this

is(hints_num($node_standby_1), qq(10),
'hints are set on standby1 because FPI but marked as non-safe');

  say that the hints are "marked as non-safe", but the hints_num() function
  does not seem to check that.

* wording:

is(hints_num($node_standby_2), qq(10), 'index hint bits already set on second 
standby 2');
->
is(hints_num($node_standby_2), qq(10), 'index hint bits already set on standby 
2');


And a few more notes on the code:

* There's an extra colon in mask_lp_dead():

bufmask.c:148:38: warning: for loop has empty body [-Wempty-body]
 offnum = OffsetNumberNext(offnum));
   ^
bufmask.c:148:38: note: put the semicolon on a separate line to silence this 
warning

* the header comment of heap_hot_search_buffer() still says "*all_dead"
  whereas I'd expect "->all_dead".

  The same for "*page_lsn".

* I can see no test for the INDEX_LP_DEAD_OK_MIN_LSN value of the
  IndexLpDeadAllowedResult enumeration. Shouldn't there be only two values,
  e.g. INDEX_LP_DEAD_OK and INDEX_LP_DEAD_MAYBE_OK ? Or a boolean variable (in
  index_fetch_heap()) of the appropriate name, e.g. kill_maybe_allowed, and
  rename the function is_index_lp_dead_allowed() to
  is_index_lp_dead_maybe_allowed()?

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com




Re: Perl warnings when building contrib on RHEL 9 beta

2021-11-03 Thread Tom Lane
"Daniel Westermann (DWE)"  writes:
> just compiled PostgreSQL head on RedHat 9 beta. Generally seems to be fine 
> but when building contrib, a lot of these warnings pop up:

> /usr/lib64/perl5/CORE/inline.h:2635:5: warning: '}' and ')' tokens 
> terminating statement expression appear in different macro expansion contexts 
> [-Wcompound-token-split-by-macro]

Yeah, I've been seeing that on macOS as well, with the latest Xcode;
and several other buildfarm members are showing it too.

I think this is a new clang warning that the Perl headers haven't
caught up with yet.  I'm not sure there is anything we can do about
it other than add -Wno-compound-token-split-by-macro.

regards, tom lane




Re: PITR: enhance getRecordTimestamp()

2021-11-03 Thread Simon Riggs
On Wed, 3 Nov 2021 at 13:28, Daniel Gustafsson  wrote:
>
> > On 30 Jun 2021, at 11:59, Simon Riggs  wrote:
>
> > For PITR, getRecordTimestamp() did not include all record types that
> > contain times.
> > Add handling for checkpoints, end of recovery and prepared xact record 
> > types.
>
> + 
> This breaks doc compilation, and looks like a stray tag as you want this entry
> in the currently open variablelist?

Thanks. Fixed and rebased.

-- 
Simon Riggs    http://www.EnterpriseDB.com/


pitr_enhance_getRecordTimestamp.v2.patch
Description: Binary data


Re: Parallelize correlated subqueries that execute within each worker

2021-11-03 Thread Tom Lane
Robert Haas  writes:
> One thing I discovered when I was looking at this a few years ago is
> that there was only one query in the regression tests where extParam
> and allParam were not the same. The offending query was select 1 =
> all(select (select 1)), and the resulting plan has a Materialize node
> with an attached InitPlan. For that Materialize node, extParam = {}
> and allParam = {$0}, with $0 also being the output parameter of the
> InitPlan attached to that Materialize node. In every other node in
> that plan and in every node of every other plan generated by the
> regression tests, the values were identical. So it's extremely niche
> that these fields are even different from each other, and it's unclear
> to me that we really need both of them.

Yeah, I've had that nagging feeling about them too.  But ISTR trying to
reduce them to one value years ago, and finding that it didn't quite work,
or at least would result in more subquery-re-evaluation than we do today.
You have to dig into what the executor uses these values for to really
grok them.  I'm afraid that that detail is all swapped out right now, so
I can't say much more.

regards, tom lane




Re: Parallelize correlated subqueries that execute within each worker

2021-11-03 Thread Robert Haas
On Wed, Nov 3, 2021 at 11:14 AM Tom Lane  wrote:
> FWIW, I've never been very happy with those fields either.  IIRC the
> design in that area was all Vadim's, but to the extent that there's
> any usable documentation of extParam/allParam, it was filled in by me
> while trying to understand what Vadim did.  If somebody wants to step
> up and do a rewrite to make the planner's Param management more useful
> or at least easier to understand, I think that'd be great.

Good to know, thanks.

> But anyway: yeah, those fields as currently constituted don't help
> much.  They tell you which Params are consumed by this node or its
> subnodes, but not where those Params came from.  The planner's
> plan_params and outer_params fields might be more nearly the right
> thing, but I'm not sure they're spot-on either, nor that they're
> up-to-date at the point where you'd want to make decisions about
> Gather safety.

One thing I discovered when I was looking at this a few years ago is
that there was only one query in the regression tests where extParam
and allParam were not the same. The offending query was select 1 =
all(select (select 1)), and the resulting plan has a Materialize node
with an attached InitPlan. For that Materialize node, extParam = {}
and allParam = {$0}, with $0 also being the output parameter of the
InitPlan attached that that Materialize node. In every other node in
that plan and in every node of every other plan generated by the
regression tests, the values were identical. So it's extremely niche
that these fields are even different from each other, and it's unclear
to me that we really need both of them.

What's also interesting is that extParam is computed (by
finalize_plan) as plan->extParam = bms_del_members(plan->extParam,
initSetParam). So I think it mostly ends up that extParam for a node
is not exactly all the parameters that anything under that node cares
about, but rather - approximately - all the things that anything under
that node cares about that aren't also set someplace under that node.
If it were exactly that, I think it would be perfect for our needs
here: if the set of things used but not set below the current level is
empty, it's OK to insert a Gather node; otherwise, it's not, at least,
not unless we find a way to pipe parameters from the leader into the
workers. But I think there's some reason that I no longer remember why
it's not exactly that, and therefore the idea doesn't work.

One problem I do remember is that attaching initplans at the top of
each subquery level as we presently do is really not good for this
kind of thing. Suppose you have several levels of Nested Loop and
someplace down in the plan you reference an InitPlan. The planner sees
no harm in attaching the InitPlan at the top level, which makes it
unsafe to put the Gather any place but at the top level. If you
attached the InitPlan to the lowest node in the plan tree that is high
enough to be above all the places that use the value from that
parameter, you could potentially shift the Gather down the plan tree,
which would be great if, for example, there's exactly one
parallel-restricted join and the rest are parallel-safe. The best plan
might be to do all the other joins under a Gather and then perform the
parallel-restricted join above it.

But I found it very hard to figure out how to rejigger the logic that
places InitPlans to be more intelligent, and eventually gave up.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re[2]: On login trigger: take three

2021-11-03 Thread Ivan Panchenko

Dear colleagues, 
Please see my suggestions below and the updated patch attached.

>Wednesday, 29 September 2021, 15:14 +03:00 from Teodor Sigaev < teo...@sigaev.ru >:
> 
>Hi!
>
>Nice feature, but, sorry, I see some design problems in the suggested feature.
>AFAIK, there are two use cases for this feature:
>1 Permitting / prohibiting login for some users
>2 Just logging the fact of a user's login
>
>The suggested patch implements prohibition of login only by failing the login
>trigger, and it has at least two issues:
>1 In the case of prohibiting login, there is no clean way to store information
>about the unsuccessful login. Ok, it could be solved by the dblink module but
>it seems too scary.
This is a common problem of logging errors and unsuccessful transactions of any 
kind. It can be solved by:
- logging to external log storage (stupid logging to files or syslog or 
whatever you can imagine with PL/Perl (sorry))
- logging inside the database by dblink or through a background worker (like 
described in  
https://dpavlin.wordpress.com/2017/05/09/david-rader-autonomous-transactions-using-pg_background/
 )
- or by implementing autonomous transactions in PostgreSQL, which is already 
under development by some of my and Teodor’s colleagues.
 
So I propose not to invent another solution to this common problem here.
>2 For logging purposes, a failing trigger will make it impossible to log in; it
>could be worked around by catching the error in the trigger function, but that's
>not so obvious for a DBA.
If the trigger contains an error, nobody can log in. The database is bricked.
The previous variant of the patch proposed to fix this by going to single-user
mode.
For faster recovery I propose to also have a GUC variable to turn the login
triggers on/off. It should be 'on' by default.
>
>some other issues:
>3 It's easy to create a security hole, see the attachment where a non-privileged
>user can close access to the database even for a superuser.
Such cases can be avoided by careful design of the login triggers and related
permissions. Added a note about this to the documentation.
>4 Feature is not applicable for logging unsuccessful login with wrong
>username/password or not matched by pg_hba.conf. Oracle could operate with such
>cases. But I don't think that this issue is a stopper for the patch.
Yes. Btw, note that pg_hba functionality can be implemented completely inside 
the login trigger :) !
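
For illustration, something along these lines (a rough sketch only -- the event
name and exact syntax depend on the patch version, and the table and function
names are invented):

    CREATE TABLE allowed_logins (username text PRIMARY KEY);

    CREATE FUNCTION check_login() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
      -- sketch: event name and trigger semantics per this patch, not released syntax
      IF NOT EXISTS (SELECT 1 FROM allowed_logins
                     WHERE username = session_user) THEN
        RAISE EXCEPTION 'login of % rejected by login trigger', session_user;
      END IF;
    END;
    $$;

    CREATE EVENT TRIGGER reject_unlisted_logins
      ON login EXECUTE FUNCTION check_login();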
>
>Maybe, to solve those issues we could introduce a return value for the trigger and/or
>add something like IGNORE ERROR to the CREATE EVENT TRIGGER command.
Any solution which makes the syntax more complicated can lead Postgres to become
Oracle (in the worst sense). I do not like this.

A new version of the patch is attached. It applies to the current master and 
contains the above mentioned GUC code, relevant tests and docs.
Regards,
Ivan Panchenko
>
>On 15.09.2021 16:32, Greg Nancarrow wrote:
>> On Wed, Sep 8, 2021 at 10:56 PM Daniel Gustafsson < dan...@yesql.se > wrote:
>>>
>>> I took a look at this, and while I like the proposed feature I think the 
>>> patch
>>> has a bit more work required.
>>>
>>
>> Thanks for reviewing the patch.
>> I am not the original patch author (who no longer seems active) but
>> I've been contributing a bit and keeping the patch alive because I
>> think it's a worthwhile feature.
>>
>>>
>>> +
>>> +-- 2) Initialize some user session data
>>> + CREATE TEMP TABLE session_storage (x float, y integer);
>>> The example in the documentation use a mix of indentations, neither of 
>>> which is
>>> the 2-space indentation used elsewhere in the docs.
>>>
>>
>> Fixed, using 2-space indentation.
>> (to be honest, the indentation seems inconsistent elsewhere in the
>> docs e.g. I'm seeing a nearby case of 2-space on 1st indent, 6-space
>> on 2nd indent)
>>
>>>
>>> + /* Get the command tag. */
>>> + tag = parsetree ? CreateCommandTag(parsetree) : CMDTAG_CONNECT;
>>> This is hardcoding knowledge that currently only this trigger wont have a
>>> parsetree, and setting the tag accordingly. This should instead check the
>>> event and set CMDTAG_UNKNOWN if it isn't the expected one.
>>>
>>
>> I updated that, but maybe not exactly how you expected?
>>
>>>
>>> + /* database has on-login triggers */
>>> + bool dathaslogontriggers;
>>> This patch uses three different names for the same thing: client connection,
>>> logontrigger and login trigger. Since this trigger only fires after 
>>> successful
>>> authentication it’s not accurate to name it client connection, as that 
>>> implies
>>> it running on connections rather than logins. The nomenclature used in the
>>> tree is "login" so the patch should be adjusted everywhere to use that.
>>>
>>
>> Fixed.
>>
>>>
>>> +/* Hook for plugins to get control at start of session */
>>> +client_connection_hook_type client_connection_hook = EventTriggerOnConnect;
>>> I don't see the reason for adding core functionality by hooks. Having a hook
>>> might be a useful thing on its own (to be discussed in a new thread, not 
>>> hidden
>>> her

Re: Schema variables - new implementation for Postgres 15

2021-11-03 Thread Tomas Vondra
FWIW the patch was marked as RFC for about a year, but there was plenty
of discussion / changes since then, so that seemed premature. I've
switched it back to WoA.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: GiST operator class for bool

2021-11-03 Thread Tomas Vondra
Hi,

I looked at this patch today - it's pretty simple and in pretty good
shape, I can't think of anything that'd need fixing. Perhaps the test
might also do EXPLAIN like for other types, to verify the new index is
actually used. But that's minor enough to handle during commit.
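
Something along these lines would do, assuming the new opclass ships in
btree_gist like the other scalar types (the object names here are illustrative):

CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE booltmp (a bool);
INSERT INTO booltmp SELECT i % 2 = 0 FROM generate_series(1, 1000) i;
CREATE INDEX booltmp_gist_idx ON booltmp USING gist (a);
SET enable_seqscan = off;
-- expect an index (or bitmap) scan on booltmp_gist_idx in the plan
EXPLAIN (COSTS OFF) SELECT * FROM booltmp WHERE a = true;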


I've marked this as RFC and will get it committed in a day or two.

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Parallelize correlated subqueries that execute within each worker

2021-11-03 Thread Tom Lane
Robert Haas  writes:
> When I last worked on this, I had hoped that extParam or allParam
> would be the thing that would answer the question: are there any
> parameters used under this node that are not also set under this node?
> But I seem to recall that neither seemed to be answering precisely
> that question, and the lousy naming of those fields and limited
> documentation of their intended purpose did not help.

FWIW, I've never been very happy with those fields either.  IIRC the
design in that area was all Vadim's, but to the extent that there's
any usable documentation of extParam/allParam, it was filled in by me
while trying to understand what Vadim did.  If somebody wants to step
up and do a rewrite to make the planner's Param management more useful
or at least easier to understand, I think that'd be great.

But anyway: yeah, those fields as currently constituted don't help
much.  They tell you which Params are consumed by this node or its
subnodes, but not where those Params came from.  The planner's
plan_params and outer_params fields might be more nearly the right
thing, but I'm not sure they're spot-on either, nor that they're
up-to-date at the point where you'd want to make decisions about
Gather safety.

regards, tom lane




Re: Should we support new definition for Identity column : GENERATED BY DEFAULT ON NULL?

2021-11-03 Thread Geoff Winkless
On Wed, 3 Nov 2021 at 14:39, David Fetter  wrote:

> Unfortunately, the PREPARE/EXECUTE infrastructure, and not just for
> PostgreSQL, has no way of passing along DEFAULT explicitly, i.e. it
>

My $.02: I'd be much happier with the idea of changing those
obviously-deficient interfaces to allow explicit DEFAULT than of having an
explicit NULL treated differently depending on the column type.

Geoff


Re: Partial aggregates pushdown

2021-11-03 Thread Alexander Pyhalov

Daniel Gustafsson wrote on 2021-11-03 16:45:
On 2 Nov 2021, at 10:12, Alexander Pyhalov  
wrote:



Updated and rebased patch.


+   state = (Int128AggState *) palloc0(sizeof(Int128AggState));
+   state->calcSumX2 = false;
+
+   if (!PG_ARGISNULL(0))
+   {
+#ifdef HAVE_INT128
+   do_int128_accum(state, (int128) PG_GETARG_INT64(0));
+#else
+   do_numeric_accum(state, int64_to_numeric(PG_GETARG_INT64(0)));
+#endif

This fails on non-INT128 platforms as state cannot be cast to Int128AggState
outside of HAVE_INT128; it's not defined there.  This needs to be a
PolyNumAggState no?


Hi.
Thank you for noticing this. It indeed fails with
pgac_cv__128bit_int=no.

Updated patch.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
From f72a3d52a2b85ad9ea5f61f8ff5c46cb50ae3ec8 Mon Sep 17 00:00:00 2001
From: Alexander Pyhalov 
Date: Thu, 14 Oct 2021 17:30:34 +0300
Subject: [PATCH] Partial aggregates push down

---
 contrib/postgres_fdw/deparse.c|  57 -
 .../postgres_fdw/expected/postgres_fdw.out| 185 -
 contrib/postgres_fdw/postgres_fdw.c   | 196 +-
 contrib/postgres_fdw/sql/postgres_fdw.sql |  27 ++-
 src/backend/catalog/pg_aggregate.c|  28 ++-
 src/backend/commands/aggregatecmds.c  |  23 +-
 src/backend/utils/adt/numeric.c   |  96 +
 src/bin/pg_dump/pg_dump.c |  21 +-
 src/include/catalog/catversion.h  |   2 +-
 src/include/catalog/pg_aggregate.dat  | 106 +-
 src/include/catalog/pg_aggregate.h|  10 +-
 src/include/catalog/pg_proc.dat   |   6 +
 src/test/regress/expected/oidjoins.out|   1 +
 13 files changed, 674 insertions(+), 84 deletions(-)

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd666818..8cee12c1b2a 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -196,6 +196,7 @@ static bool is_subquery_var(Var *node, RelOptInfo *foreignrel,
 static void get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
 		  int *relno, int *colno);
 
+static bool partial_agg_ok(Aggref *agg);
 
 /*
  * Examine each qual clause in input_conds, and classify them into two groups,
@@ -831,8 +832,10 @@ foreign_expr_walker(Node *node,
 if (!IS_UPPER_REL(glob_cxt->foreignrel))
 	return false;
 
-/* Only non-split aggregates are pushable. */
-if (agg->aggsplit != AGGSPLIT_SIMPLE)
+if ((agg->aggsplit != AGGSPLIT_SIMPLE) && (agg->aggsplit != AGGSPLIT_INITIAL_SERIAL))
+	return false;
+
+if (agg->aggsplit == AGGSPLIT_INITIAL_SERIAL && !partial_agg_ok(agg))
 	return false;
 
 /* As usual, it must be shippable. */
@@ -3249,7 +3252,7 @@ deparseAggref(Aggref *node, deparse_expr_cxt *context)
 	bool		use_variadic;
 
 	/* Only basic, non-split aggregation accepted. */
-	Assert(node->aggsplit == AGGSPLIT_SIMPLE);
+	Assert(node->aggsplit == AGGSPLIT_SIMPLE || node->aggsplit == AGGSPLIT_INITIAL_SERIAL);
 
 	/* Check if need to print VARIADIC (cf. ruleutils.c) */
 	use_variadic = node->aggvariadic;
@@ -3719,3 +3722,51 @@ get_relation_column_alias_ids(Var *node, RelOptInfo *foreignrel,
 	/* Shouldn't get here */
 	elog(ERROR, "unexpected expression in subquery output");
 }
+
+/*
+ * Check that partial aggregate agg is fine to push down
+ */
+static bool
+partial_agg_ok(Aggref *agg)
+{
+	HeapTuple	aggtup;
+	Form_pg_aggregate aggform;
+
+	Assert(agg->aggsplit == AGGSPLIT_INITIAL_SERIAL);
+
+	/* We don't support complex partial aggregates */
+	if (agg->aggdistinct || agg->aggvariadic || agg->aggkind != AGGKIND_NORMAL || agg->aggorder != NIL)
+		return false;
+
+	aggtup = SearchSysCache1(AGGFNOID, ObjectIdGetDatum(agg->aggfnoid));
+	if (!HeapTupleIsValid(aggtup))
+		elog(ERROR, "cache lookup failed for function %u", agg->aggfnoid);
+	aggform = (Form_pg_aggregate) GETSTRUCT(aggtup);
+
+	/* Only aggregates, marked as pushdown safe, are allowed */
+	if (!aggform->aggpartialpushdownsafe)
+	{
+		ReleaseSysCache(aggtup);
+		return false;
+	}
+
+	/*
+	 * If an aggregate requires serialization/deserialization, partial
+	 * converter should be defined
+	 */
+	if (agg->aggtranstype == INTERNALOID && aggform->aggpartialconverterfn == InvalidOid)
+	{
+		ReleaseSysCache(aggtup);
+		return false;
+	}
+
+	/* In this case we currently don't use converter */
+	if (agg->aggtranstype != INTERNALOID && get_func_rettype(agg->aggfnoid) != agg->aggtranstype)
+	{
+		ReleaseSysCache(aggtup);
+		return false;
+	}
+
+	ReleaseSysCache(aggtup);
+	return true;
+}
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index fd141a0fa5c..80c507783e6 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9279,13 +9279,13 @@ RESET enable_partitionwise_join;
 -- =

Re: Parallelize correlated subqueries that execute within each worker

2021-11-03 Thread Robert Haas
As a preliminary comment, it would be quite useful to get Tom Lane's
opinion on this, since it's not an area I understand especially well,
and I think he understands it better than anyone.

On Fri, May 7, 2021 at 12:30 PM James Coleman  wrote:
> The basic idea is that we need to track (both on nodes and relations)
> not only whether that node or rel is parallel safe but also whether
> it's parallel safe assuming params are rechecked in the using context.
> That allows us to delay making a final decision until we have
> sufficient context to conclude that a given usage of a Param is
> actually parallel safe or unsafe.

I don't really understand what you mean by "assuming params are
rechecked in the using context." However, I think that a possibly
better approach to this whole area would be to try to solve the
problem by putting limits on where you can insert a Gather node.
Consider:

Nested Loop
-> Seq Scan on x
-> Index Scan on y
   Index Cond: y.q = x.q
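
A query of roughly this shape could produce that plan, assuming hypothetical
tables x and y with an index on y.q:

CREATE TABLE x (q int);
CREATE TABLE y (q int);
CREATE INDEX ON y (q);

SELECT * FROM x JOIN y ON y.q = x.q;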

If you insert a Gather node atop the Index Scan, necessarily changing
it to a Parallel Index Scan, then you need to pass values around. For
every value we get for x.q, we would need to start workers, sending
them the value of x.q, and they do a parallel index scan working
together to find all rows where y.q = x.q, and then exit. We repeat
this for every tuple from x.q. In the absence of infrastructure to
pass those parameters, we can't put the Gather there. We also don't
want to, because it would be really slow.

If you insert the Gather node atop the Seq Scan or the Nested Loop, in
either case necessarily changing the Seq Scan to a Parallel Seq Scan,
you have no problem. If you put it on top of the Nested Loop, the
parameter will be set in the workers and used in the workers and
everything is fine. If you put it on top of the Seq Scan, the
parameter will be set in the leader -- by the Nested Loop -- and used
in the leader, and again you have no problem.

So in my view of the world, the parameter just acts as an additional
constraint on where Gather nodes can be placed. I don't see that there
are any parameters that are unsafe categorically -- they're just
unsafe if the place where they are set is on a different side of the
Gather from the place where they are used. So I don't understand --
possibly just because I'm dumb -- the idea behind
consider_parallel_rechecking_params, because that seems to be making a
sort of overall judgement about the safety or unsafety of the
parameter on its own merits, rather than thinking about the Gather
placement.

When I last worked on this, I had hoped that extParam or allParam
would be the thing that would answer the question: are there any
parameters used under this node that are not also set under this node?
But I seem to recall that neither seemed to be answering precisely
that question, and the lousy naming of those fields and limited
documentation of their intended purpose did not help.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Should we support new definition for Identity column : GENERATED BY DEFAULT ON NULL?

2021-11-03 Thread David Fetter
On Wed, Nov 03, 2021 at 11:07:58AM +0100, Vik Fearing wrote:
> On 11/2/21 12:19 PM, Himanshu Upadhyaya wrote:
> > Hi,
> > 
> > Trying to insert NULL value to the Identity column defined by "GENERATED BY
> > DEFAULT" is disallowed, but there can be use cases where the user would
> > like to have an identity column where manual NULL insertion is required(and
> > it should not error-out by Postgres).
> 
> What could possibly be the use case for this?

Unfortunately, the PREPARE/EXECUTE infrastructure, and not just for
PostgreSQL, has no way of passing along DEFAULT explicitly, i.e. it
can only explicitly pass literals and NULL, so people come up with
workarounds like this. I keep saying "explicitly" because one of the
workarounds is to pass DEFAULT by omission in the target list, which
is not a great way to handle anything, being a communication by
silence and all. I'm thinking that lack is the problem we should
actually address.
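
To make that concrete, a small sketch of the "DEFAULT by omission" workaround
(the table is illustrative):

CREATE TABLE t (id int GENERATED BY DEFAULT AS IDENTITY, payload text);

-- DEFAULT cannot be bound as a parameter, and an explicit NULL is rejected
-- by the identity column's implicit NOT NULL constraint:
PREPARE ins_explicit (int, text) AS
    INSERT INTO t (id, payload) VALUES ($1, $2);
EXECUTE ins_explicit (NULL, 'boom');   -- fails

-- so the column is simply omitted from the target list instead:
PREPARE ins_omitted (text) AS
    INSERT INTO t (payload) VALUES ($1);
EXECUTE ins_omitted ('ok');            -- id is generated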

> > Thoughts?
> 
> I don't like it.

Neither do I, but I do get some of the motivation behind it. I don't
suppose the standard actually says much about PREPARE, or whether we
should care about anything it happens to say that gets in the way of
doing something more helpful.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Missing include in be-secure-openssl.c?

2021-11-03 Thread Tom Lane
Michael Paquier  writes:
> I would follow the practice of upstream to include both if were me
> to be consistent, but I'm also fine if you just add x509v3.h to
> be-secure-openssl.c.

24f9e49e4 has not fixed it, so I have no idea what to do next.
We definitely need some input from the machine's owner:

* what OpenSSL version is that?
* is HEAD being built any differently from v14 (seems like it must be)?

regards, tom lane




Re: removing global variable ThisTimeLineID

2021-11-03 Thread Robert Haas
On Wed, Nov 3, 2021 at 9:12 AM Amul Sul  wrote:
> 0002:
> -GetFlushRecPtr(void)
> +GetFlushRecPtr(TimeLineID *insertTLI)
>
> Should we rename this argument to more generic as "tli", like
> GetStandbyFlushRecPtr, since the caller in 003 patch trying to fetch a
> TLI that means different for them, e.g. currTLI, FlushTLI, etc.

It's possible that more consistency would be better here, but I don't
think making this name more generic is going in the right direction.
If somebody gets the TLI using this function, it's the timeline into
which a system not in recovery is inserting WAL, which is the same
timeline to which WAL is being flushed. I think it's important for
this name to try to make that clear. It doesn't really matter if
someone treats the insertTLI as the writeTLI or the flushTLI since on
a system in production they're all the same - but they must not
confuse it with, say, the replayTLI.

> How about logsegtli instead, to be inlined with logsegno ?
> Similarly, openLogSegTLI ?

Hmm, well I guess it depends on how you think the words group. If you
read logsegno as meaning "the number of the log segment" then it would
make sense to have logsegtli mean "the tli of the log segment." But I
think of logsegno as meaning "the segment number of the log file," so
it makes more sense to have logtli mean "the TLI of the log file".
I think it's definitely not a "log segment file" - just a "log file".

> Can't GetWALInsertionTimeLine() called instead? I guess the reason is
> to avoid the unnecessary overhead involved in the function call. (Same
> at a few other places.)

That, plus to avoid failing the assertion in that function. As we've
discussed on the ALTER SYSTEM READ ONLY thread, xlog.c itself inserts
some WAL before recovery is officially over.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Predefined role pg_maintenance for VACUUM, ANALYZE, CHECKPOINT.

2021-11-03 Thread Daniel Gustafsson
> On 2 Nov 2021, at 19:26, Stephen Frost  wrote:

>> Otherwise, I see a couple of warnings when compiling:
>>xlogfuncs.c:54: warning: implicit declaration of function 
>> ‘RequestCheckpoint’
>>xlogfuncs.c:56: warning: control reaches end of non-void function
> 
> Yeah, such things would need to be cleaned up, of course.

The Commitfest CI has -Werror,-Wimplicit-function-declaration on some platforms,
where this patch consequently fails to build, so I think we should apply the
below (or something similar) to ensure this is tested everywhere:

diff --git a/src/backend/access/transam/xlogfuncs.c 
b/src/backend/access/transam/xlogfuncs.c
index 7ecaca4788..c9e1df39c1 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -26,6 +26,7 @@
 #include "funcapi.h"
 #include "miscadmin.h"
 #include "pgstat.h"
+#include "postmaster/bgwriter.h"
 #include "replication/walreceiver.h"
 #include "storage/fd.h"
 #include "storage/ipc.h"
@@ -53,6 +54,7 @@ pg_checkpoint(PG_FUNCTION_ARGS)
 {
RequestCheckpoint(CHECKPOINT_IMMEDIATE | CHECKPOINT_WAIT |
  (RecoveryInProgress() ? 0 : 
CHECKPOINT_FORCE));
+   PG_RETURN_VOID();
 }

--
Daniel Gustafsson   https://vmware.com/





Re: Consider parallel for lateral subqueries with limit

2021-11-03 Thread James Coleman
On Fri, Jul 16, 2021 at 3:16 PM James Coleman  wrote:
>
> On Thu, May 27, 2021 at 9:01 PM Greg Nancarrow  wrote:
> >
> > On Tue, Dec 8, 2020 at 10:46 AM James Coleman  wrote:
> > >
> > > While I haven't actually tracked down to guarantee this is handled
> > > elsewhere, a thought experiment -- I think -- shows it must be so.
> > > Here's why: suppose we don't have a limit here, but the query return
> > > order is different in different backends. Then we would have the same
> > > problem you bring up. In that case this code is already setting
> > > consider_parallel=true on the rel. So I don't think we're changing any
> > > behavior here.
> > >
> >
> > AFAICS, the patch seems very reasonable and specifically targets
> > lateral subqueries with limit/offset. It seems like the uncorrelated
> > case is the only real concern.
> > I generally agree that the current patch is probably not changing any
> > behavior in the uncorrelated case (and like yourself, haven't yet
> > found a case for which it breaks), but I'm not sure Brian's concerns
> > can be ruled out entirely.
> >
> > How about a minor update to the patch to make it slightly more
> > restrictive, to exclude the case when there are no lateral
> > cross-references, so we'll be allowing parallelism only when we know
> > the lateral subquery will be evaluated anew for each source row?
> > I was thinking of the following patch modification:
> >
> > BEFORE:
> > -if (limit_needed(subquery))
> > +   if (!rte->lateral && limit_needed(subquery))
> >
> > AFTER:
> > -if (limit_needed(subquery))
> > +   if ((!rte->lateral || bms_is_empty(rel->lateral_relids)) &&
> > + limit_needed(subquery))
> >
> >
> > Thoughts?
>
> Apologies for the delayed response; this seems fine to me. I've
> attached patch v2.
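
For reference, the restriction quoted above targets correlated LATERAL
subqueries with LIMIT, i.e. queries of roughly this shape (table names are
illustrative), where the subquery is re-evaluated for each outer row:

SELECT o.id, latest.val
FROM orders o,
     LATERAL (SELECT e.val
              FROM events e
              WHERE e.order_id = o.id
              ORDER BY e.created_at DESC
              LIMIT 1) AS latest;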

Greg,

Do you believe this is now ready for committer?

Thanks,
James Coleman




Re: Parallelize correlated subqueries that execute within each worker

2021-11-03 Thread James Coleman
On Wed, Sep 8, 2021 at 8:47 AM James Coleman  wrote:

> See updated patch series attached.

Jaime,

I noticed on 3-October you moved this into "waiting on author"; I
don't see anything waiting in this thread, however. Am I missing
something?

I'm planning to change it back to "needs review".

Thanks,
James




Re: Partial aggregates pushdown

2021-11-03 Thread Daniel Gustafsson
> On 2 Nov 2021, at 10:12, Alexander Pyhalov  wrote:

> Updated and rebased patch.

+   state = (Int128AggState *) palloc0(sizeof(Int128AggState));
+   state->calcSumX2 = false;
+
+   if (!PG_ARGISNULL(0))
+   {
+#ifdef HAVE_INT128
+   do_int128_accum(state, (int128) PG_GETARG_INT64(0));
+#else
+   do_numeric_accum(state, int64_to_numeric(PG_GETARG_INT64(0)));
+#endif

This fails on non-INT128 platforms as state cannot be cast to Int128AggState
outside of HAVE_INT128; it's not defined there.  This needs to be a
PolyNumAggState no?

--
Daniel Gustafsson   https://vmware.com/





Re: PROXY protocol support

2021-11-03 Thread Daniel Gustafsson
> On 28 Sep 2021, at 15:23, Magnus Hagander  wrote:
> On Fri, Sep 10, 2021 at 1:44 AM Jacob Champion  wrote:

>> The TAP test will need to be rebased over the changes in 201a76183e.
> 
> Done

And now the TAP test will need to be rebased over the changes in
b3b4d8e68ae83f432f43f035c7eb481ef93e1583.

--
Daniel Gustafsson   https://vmware.com/





Re: PITR: enhance getRecordTimestamp()

2021-11-03 Thread Daniel Gustafsson
> On 30 Jun 2021, at 11:59, Simon Riggs  wrote:

> For PITR, getRecordTimestamp() did not include all record types that
> contain times.
> Add handling for checkpoints, end of recovery and prepared xact record types.

+ 
This breaks doc compilation, and looks like a stray tag as you want this entry
in the currently open variablelist?

--
Daniel Gustafsson   https://vmware.com/





Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Robert Haas
On Wed, Nov 3, 2021 at 9:04 AM Bruce Momjian  wrote:
> Well, another issue is that if something unusual does happen, it appears
> very visibly if you are looking just for LOG messages, while if you have
> many checkpoint log messages, it might get lost.

That's a theoretical risk, but in practice the problem is actually
exactly the opposite of that, at least in my experience. If somebody
sends me a log file from a machine with log_checkpoints=on, the file
is typically hundreds of megabytes in size and in that file someplace
there are a handful of log lines generated by log_checkpoints mixed in
with millions of log lines about other things. So it's not that
log_checkpoints=on would keep you from noticing other things happening
on the machine. It's that other things happening on the machine would
keep you from noticing the log_checkpoints=on output.

In fact this problem is true for pretty much anything important that
shows up in the log: it's likely to be completely swamped by other
messages logged *at the exact same log level* that are not important
at all. The ERROR that could have alerted you to data corruption is
likely mixed in with millions of innocuous ERROR messages, for
example.

> If we want to log more
> by default, I think we are looking at several issues:
>
> *  enabling log rotation and log file reuse in the default install
> *  changing the labels of some of the normal-operation log messages
> *  changing the way some of these log messages are controlled
> *  perhaps using a ring buffer for common log messages

If we were talking about increasing the log volume by an amount that
was actually meaningful, we might need to think about these things,
but that's not what is being proposed. The only systems where this is
going to lead to a significant percentage increase in log volume are
the ones that are pretty much idle now. On real systems, this is going
to lead to a change that is less than 1%, and maybe less than 0.001%.
We don't have to rearchitect anything as a result of decisions that
have so little practical impact - especially considering that at least
some packagers are already putting log rotation in place
automatically, in a way consistent with distro policies.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Bruce Momjian
On Wed, Nov  3, 2021 at 09:09:18AM -0400, Robert Haas wrote:
> I think that's pretty impractical. In order to get rid of all of the
> harmless application errors, you'd have to set log_min_messages to
> FATAL. That's a bad idea for two reasons. First, it wouldn't even
> filter out all the garbage that you don't care about, and second, it
> would filter out a lot of really important things that you do care
> about. For example, you would still get messages in the logs like
> "FATAL: connection to client lost" and "FATAL: canceling
> authentication due to timeout" that probably nobody really cares
> about, but you would not get messages like "ERROR: downlink or sibling
> link points to deleted block in index " which are indicative
> of data corruption and thus important. For better or for worse, the
> distinction between ERROR, FATAL, and PANIC is entirely based on what
> we do after printing the message, and NOT on how serious the message
> is.
> 
> Now maybe we want to improve that at some point, but it shouldn't
> stand in the way of this proposal. If people are living with the
> gigantic volume of ERROR and FATAL messages that typically end up in
> the logs today, they can certainly live with the absolutely tiny
> volume of additional information that would be generated by
> log_checkpoints=on.

See my later email --- I think we need to make a few changes for this to
work well.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: removing global variable ThisTimeLineID

2021-11-03 Thread Amul Sul
On Tue, Nov 2, 2021 at 12:46 AM Robert Haas  wrote:
>
> [...]
> I would like to clean this up. Attached is a series of patches which
> try to do that. For ease of review, this is separated out into quite a
> few separate patches, but most likely we'd combine some of them for
> commit. Patches 0001 through 0004 eliminate all use of the global
> variable ThisTimeLineID outside of xlog.c, and patch 0005 makes it
> "static" so that it ceases to be visible outside of xlog.c. Patches
> 0006 to 0010 clean up uses of ThisTimeLineID itself, passing it around
> as a function parameter, or otherwise arranging to fetch the relevant
> timeline ID from someplace sensible rather than using the global
> variable as a scratchpad. Finally, patch 0011 deletes the global
> variable.
>
> I have not made a serious attempt to clear up all of the
> terminological confusion created by the term ThisTimeLineID, which
> also appears as part of some structs, so you'll still see that name in
> the source code after applying these patches, just not as the name of
> a global variable. I have, however, used names like insertTLI and
> replayTLI in many places changed by the patch, and I think that makes
> it significantly more clear which timeline is being discussed in each
> case. In some places I have endeavored to improve the comments. There
> is doubtless more that could be done here, but I think this is a
> fairly respectable start.
>
> Opinions welcome.
>

I had a plan to look at these patches closely, but unfortunately,
didn't get enough time; and might not be able to spend time in the
rest of this week. Here are a few thoughts that I had in the initial
go-through, but that may or may not sound very interesting:

0002:
-GetFlushRecPtr(void)
+GetFlushRecPtr(TimeLineID *insertTLI)

Should we rename this argument to more generic as "tli", like
GetStandbyFlushRecPtr, since the caller in 003 patch trying to fetch a
TLI that means different for them, e.g. currTLI, FlushTLI, etc.


0004:
 static int
-XLogFileInitInternal(XLogSegNo logsegno, bool *added, char *path)
+XLogFileInitInternal(XLogSegNo logsegno, TimeLineID logtli,
+bool *added, char *path)

 int
-XLogFileInit(XLogSegNo logsegno)
+XLogFileInit(XLogSegNo logsegno, TimeLineID logtli)
 {

How about logsegtli instead, to be in line with logsegno?


0007:
 static XLogSegNo openLogSegNo = 0;
+static TimeLineID openLogTLI = 0;

Similarly, openLogSegTLI ?


0008:
+   /*
+* Given that we're not in recovery, ThisTimeLineID is set and can't
+* change, so we can read it without a lock.
+*/
+   insertTLI = XLogCtl->ThisTimeLineID;

Can't GetWALInsertionTimeLine() called instead? I guess the reason is
to avoid the unnecessary overhead involved in the function call. (Same
at a few other places.)

Regards,
Amul




Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Robert Haas
On Tue, Nov 2, 2021 at 7:46 PM Bruce Momjian  wrote:
> I realize in this thread that Robert Haas mentioned foreign key
> violations that would like also appear in logs, but those are
> ERROR/WARNING etc. messages that can be filtered out with
> log_min_message.

I think that's pretty impractical. In order to get rid of all of the
harmless application errors, you'd have to set log_min_messages to
FATAL. That's a bad idea for two reasons. First, it wouldn't even
filter out all the garbage that you don't care about, and second, it
would filter out a lot of really important things that you do care
about. For example, you would still get messages in the logs like
"FATAL: connection to client lost" and "FATAL: canceling
authentication due to timeout" that probably nobody really cares
about, but you would not get messages like "ERROR: downlink or sibling
link points to deleted block in index " which are indicative
of data corruption and thus important. For better or for worse, the
distinction between ERROR, FATAL, and PANIC is entirely based on what
we do after printing the message, and NOT on how serious the message
is.

Now maybe we want to improve that at some point, but it shouldn't
stand in the way of this proposal. If people are living with the
gigantic volume of ERROR and FATAL messages that typically end up in
the logs today, they can certainly live with the absolutely tiny
volume of additional information that would be generated by
log_checkpoints=on.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Schema variables - new implementation for Postgres 15

2021-11-03 Thread Tomas Vondra
Hi,

I took a quick look at the latest patch version. In general the patch
looks pretty complete and clean, and for now I have only some basic
comments. The attached patch tweaks some of this, along with a couple
additional minor changes that I'll not discuss here.


1) Not sure why we need to call this "schema variables". Most objects
are placed in a schema, and we don't say "schema tables" for example.
And it's CREATE VARIABLE and not CREATE SCHEMA VARIABLE, so it's a bit
inconsistent.

The docs actually use "Global variables" in one place for some reason.


2) I find this a bit confusing:

test=# select non_existent_variable;
ERROR:  column "non_existent_variable" does not exist
LINE 1: select non_existent_variable;

I wonder if this means using SELECT to read variables is a bad idea, and
we should have a separate command, just like we have LET (instead of
just using UPDATE in some way).


3) I've reworded / tweaked a couple places in the docs, but this really
needs a native speaker - I don't have a very good "feeling" for this
technical language so it's probably still quite cumbersome.


4) Is a sequential scan of the hash table in clean_cache_callback() a
good idea? I wonder how fast (with how many variables) it'll become
noticeable, but it may be good enough for now and we can add something
better (tracing which variables need resetting) later.


5) In what situation would we call clean_cache_callback() without a
transaction state? If that happens it seems more like a bug, so
maybe elog(ERROR) or Assert() would be more appropriate?


6) free_schema_variable does not actually use the force parameter


7) The target_exprkind expression in transformSelectStmt really needs
some explanation. Because what's the chance you'll look at this in 6 months
and understand what it does?

target_exprkind =
(pstate->p_expr_kind != EXPR_KIND_LET_TARGET ||
 pstate->parentParseState != NULL) ?
EXPR_KIND_SELECT_TARGET : EXPR_KIND_LET_TARGET;


8) immutable variables without a default value

IMO this case should not be allowed. On 2021/08/29 you wrote:

I thought about this case, and I have one scenario, where this
behaviour can be useful. When the variable is declared as IMMUTABLE
NOT NULL without not null default, then any access to the content of
the variable has to fail. I think it can be used for detection,
where and when the variable is first used. So this behavior is
allowed just because I think, so this feature can be interesting for
debugging. If this idea is too strange, I have no problem to disable
this case.

This seems like a really strange use case. In production code you'll
not do this, because then the variable is useless and the code does not
work at all (it'll just fail whenever it attempts to access the var).
And if you can modify the code, there are other / better ways to do this
(raising an exception, ...).

So this seems pretty useless to me, +1 to disabling it.
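
A sketch of the case in question; the exact CREATE VARIABLE spelling below is an
assumption about the patch's syntax, not a quotation from it:

-- an immutable NOT NULL variable with no default can never hold a value,
-- so every access fails
CREATE IMMUTABLE VARIABLE probe AS int NOT NULL;
SELECT probe;    -- errors: no value, and none can ever be assigned
LET probe = 1;   -- errors: the variable is immutable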


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/advanced.sgml b/doc/src/sgml/advanced.sgml
index 2d6555497c..bd4733c0bb 100644
--- a/doc/src/sgml/advanced.sgml
+++ b/doc/src/sgml/advanced.sgml
@@ -714,24 +714,16 @@ SELECT name, elevation
 

 Schema variables are database objects that can hold a value.
-Schema variables, like relations, exist within a schema and their access is
+Schema variables, like relations, exist within a schema and the access is
 controlled via GRANT and REVOKE commands.
-The schema variable can be created by the CREATE VARIABLE command.
+A schema variable can be created by the CREATE VARIABLE command.

 

-The value of a schema variable is local to the current session. Retrieving
-a variable's value returns either a NULL or a default value, unless its value
-is set to something else in the current session with a LET command. The content
-of a variable is not transactional. This is the same as in regular variables
-in PL languages.
-   
-
-   
-Schema variables are retrieved by the SELECT SQL command.
-Their value is set with the LET SQL command.
+The value of a schema variable is set with the LET SQL command.
 While schema variables share properties with tables, their value cannot be updated
-with an UPDATE command.
+with an UPDATE command. The value of a schema variable may be
+retrieved by the SELECT SQL command.
 
 CREATE VARIABLE var1 AS date;
 LET var1 = current_date;
@@ -747,6 +739,15 @@ LET current_user_id = (SELECT id FROM users WHERE usename = session_user);
 SELECT current_user_id;
 

+
+   
+The value of a schema variable is local to the current session. Retrieving
+a variable's value returns either a NULL or a default value,
+unless its value is set to something else in the current session using the
+LET command. The content of a variable is not tra

Re: Map WAL segment files on PMEM as WAL buffers

2021-11-03 Thread Daniel Gustafsson
> On 28 Oct 2021, at 08:09, Takashi Menjo  wrote:

> Rebased, and added the patches below into the patchset.

Looks like the 0001 patch needs to be updated to support Windows and MSVC.  See
src/tools/msvc/Mkvcbuild.pm and Solution.pm et.al for inspiration on how to add
the MSVC equivalent of --with-libpmem.  Currently the patch fails in the
"Generating configuration headers" step in Solution.pm.

--
Daniel Gustafsson   https://vmware.com/





Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Bruce Momjian
On Wed, Nov  3, 2021 at 08:45:46AM -0400, Robert Haas wrote:
> If log_checkpoints=on wouldn't fill up the disk on my 5+-year old
> Raspberry Pi in less time that it takes to raise multiple children to
> adulthood even after disabling the OS-provided log rotation and
> compression, then it seems more than fair to say that for the vast
> majority of users, this isn't a real problem. And for those few for
> whom it *is* a real problem, they can still shut off log_checkpoints.
> It's not like anybody is proposing to remove the option.

Well, another issue is that if something unusual does happen, it appears
very visibly if you are looking just for LOG messages, while if you have
many checkpoint log messages, it might get lost.  If we want to log more
by default, I think we are looking at several issues:

*  enabling log rotation and log file reuse in the default install
*  changing the labels of some of the normal-operation log messages
*  changing the way some of these log messages are controlled
*  perhaps using a ring buffer for common log messages

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Bruce Momjian
On Tue, Nov  2, 2021 at 08:02:41PM -0400, Tom Lane wrote:
> I'm still of the position that the default ought to be that a
> normally-functioning server generates no ongoing log output.
> Only people who have got Nagios watching their logs, or some
> such setup, are going to want anything different.  And that is
> a minority use-case.  There are going to be way more people
> bitching because their postmaster log overflowed their disk
> than there will be people who are happier because you made
> such output the default.  (Don't forget that our default
> logging setup does not rotate the logs.)
> 
> However, I'm prepared to accept a tight definition of what
> "normally-functioning" means.  For instance, I don't have a
> problem with the idea of having log_autovacuum_min_duration
> defaulting to something positive, as long as it's fairly large.
> Then it's only going to emit anything if there is a situation
> that really needs attention.
> 
> My objection to log_checkpoints=on is that that's going to produce
> a constant stream of messages even when *nothing at all* is wrong.
> Worse yet, a novice DBA would likely have a hard time understanding
> from those messages whether there was anything to worry about or not.
> If we could design a variant of log_checkpoints that would produce
> output only when the situation really needs attention, I'd be fine
> with enabling that by default.

I wonder if we need to follow the Unix model on this by having kernel
messages logged to a file via syslog (and potentially filtered), and
then have all recent activity logged into a ring buffer and visible via
something like dmesg.

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  If only the physical world exists, free will is an illusion.





Re: Logical Replication - improve error message while adding tables to the publication in check_publication_add_relation

2021-11-03 Thread Daniel Gustafsson
> On 26 Jul 2021, at 09:33, Bharath Rupireddy 
>  wrote:

> PSA v7.

This patch no longer applies on top of HEAD, please submit a rebased version.

--
Daniel Gustafsson   https://vmware.com/





Re: RFC: Logging plan of the running query

2021-11-03 Thread Daniel Gustafsson
This patch no longer applies on top of HEAD, please submit a rebased version.

--
Daniel Gustafsson   https://vmware.com/





Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Robert Haas
On Tue, Nov 2, 2021 at 8:02 PM Tom Lane  wrote:
> I'm still of the position that the default ought to be that a
> normally-functioning server generates no ongoing log output.
> Only people who have got Nagios watching their logs, or some
> such setup, are going to want anything different.  And that is
> a minority use-case.  There are going to be way more people
> bitching because their postmaster log overflowed their disk
> than there will be people who are happier because you made
> such output the default.  (Don't forget that our default
> logging setup does not rotate the logs.)

I mean, you still seem to be postulating a setup where none of the
user activity ever causes any logging messages to get generated, which
is just not the way real systems behave. And as to log volume, if you
log a checkpoint every 5 minutes, generating roughly 1/4 kB of log
output each time, that's ~3kB/hour, or ~27MB/year.

Just for fun, I went and had a look at the least-used PostgreSQL
instances that I know about, which is the one running on my Raspberry
Pi. It powers a little web site that I wrote and people sometimes use
that web site but not that often. It's running PostgreSQL 9.4.5
because it's not an important enough system that I can be bothered to
upgrade it. The OS that came preinstalled is some Debian derivative
that arranges to rotate the logs automatically, so I only have logs
going back to May. For the most part, it generates pretty much nothing
in the logs, although back end of May/early June it kicked out about
56MB of logs. I guess I must have temporarily enabled
log_statement=all to debug something or other.

Anyway, there are three points I'd like to make about this machine.
The first is that the root partition, where pretty much all the files
are located on this tiny box, is 6GB. It currently has 825MB of free
space. If you enabled log_checkpoints=on on this machine, first of all
I suppose most of the time it wouldn't log anything because there's
usually no activity and thus the checkpoint could just be skipped.
Second, even if it did checkpoint every 5 minutes and generate
27MB/year of logs, that would take roughly 30 years to free up the
825MB of free space I have on the root partition. I doubt this machine
will be around in 30 years; it's uncertain that *I'll* be around in 30
years. Third, this machine came with PostgreSQL log compression and
rotation set up by the operating system, which I think is pretty good
evidence that configurations including those features are common. The
compression ratio on the logs this machine is generating seems to be
about 98.5%. After compression, the overhead of the log_checkpoints
output for the three months of logs that it keeps - assuming it had
enough activity to checkpoint every 5 minutes - would be perhaps
100kB, which is not only little enough not to be a problem on this
machine, but also little enough not to be a problem on the first IBM
PC that I ever used, back in the 1980s.

If log_checkpoints=on wouldn't fill up the disk on my 5+-year old
Raspberry Pi in less time than it takes to raise multiple children to
adulthood even after disabling the OS-provided log rotation and
compression, then it seems more than fair to say that for the vast
majority of users, this isn't a real problem. And for those few for
whom it *is* a real problem, they can still shut off log_checkpoints.
It's not like anybody is proposing to remove the option.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Kerberos delegation support in libpq and postgres_fdw

2021-11-03 Thread Daniel Gustafsson
This patch no longer applies following the Perl namespace changes, can you
please submit a rebased version? Marking the patch "Waiting on Author".

--
Daniel Gustafsson   https://vmware.com/





Re: Commitfest 2021-11

2021-11-03 Thread Daniel Gustafsson
> On 1 Nov 2021, at 14:38, Daniel Gustafsson  wrote:
> 
> It is now November 1st AOE and thus the 2021-11 commitfest is now in progress.
> I've switched the status and opened 2022-01 for new patches, but noone has so
> far raised their hand to run it AFAICT?  Do we have a volunteer Commitfest
> Manager keen to help the community make progress on closing patches?

I'll take that as a No, so I'm voluntelling myself this time.  Let's close some
patches!

--
Daniel Gustafsson   https://vmware.com/





Re: should we enable log_checkpoints out of the box?

2021-11-03 Thread Robert Haas
On Tue, Nov 2, 2021 at 11:18 PM Jan Wieck  wrote:
> The thing I don't want to see us doing is *nothing at all* when pretty
> much everyone with some customer experience in the field is saying "this
> is the information we want to see post incident and nobody has it so we
> sit there waiting for the next time it happens."

Quite so.

I'm not convinced that the proposal to log checkpoints only when
they're triggered by WAL rather than by time is really solving
anything. It isn't as if a time-based checkpoint couldn't have caused
a problem. What you're going to be looking for is something much more
complicated than that. Were the fsyncs slow? Did the checkpoint around
the time the user reported a problem write significantly more data
than the other checkpoints? I guess if a checkpoint wrote 1MB of data
and took 0.1 seconds to complete the fsyncs, I don't much care whether
it shows up in the log or not. If it wrote 4GB of data, or if it took
15 seconds to complete the fsyncs, I care. That's easily enough to
account for some problem that somebody had.

I'm not sure whether there are any other interesting criteria.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Eval expression R/O once time (src/backend/executor/execExpr.c)

2021-11-03 Thread Ranier Vilela
Em ter., 2 de nov. de 2021 às 15:33, Andres Freund 
escreveu:

> On 2021-11-02 13:43:46 -0400, Tom Lane wrote:
> > Ranier Vilela  writes:
> > > It seems that 1ec7679f1b67e84be688a311dce234eeaa1d5de8 caused the
> problem.
> >
> > Indeed.  Fix pushed.
>
> Thanks to both of you!
>
You are welcome, Andres.

regards,
Ranier Vilela


Re: Replication protocol doc fix

2021-11-03 Thread Daniel Gustafsson
The commitfest CI times out on all platforms and never finishes when running
make check with this patch, so unless the patch is dropped due to concerns
already raised then that seems like a good thing to fix.

--
Daniel Gustafsson   https://vmware.com/





Re: stat() vs ERROR_DELETE_PENDING, round N + 1

2021-11-03 Thread Daniel Gustafsson
This patch doesn't compile on Windows according to Appveyor, seemingly because
of a syntax error in the new win32ntdll.h file, but the MSVC logs are hard on
the eye so it might be unrelated.

--
Daniel Gustafsson   https://vmware.com/





Re: CREATE ROLE IF NOT EXISTS

2021-11-03 Thread Daniel Gustafsson
> On 19 Oct 2021, at 22:12, David Christensen 
>  wrote:
> 
> Greetings -hackers,
> 
> Enclosed is a patch that implements CREATE ROLE IF NOT EXISTS (along with the 
> same support for USER/GROUP).  This is a fairly straightforward approach in 
> that we do no validation of anything other than existence, with the user 
> needing to ensure that permissions/grants are set up in the proper way.
> 
> Comments?

This fails the roleattributes test in "make check", with what seems to be a
trivial change in the output.  Can you please submit a rebased version fixing
the test?

--
Daniel Gustafsson   https://vmware.com/





Re: [PATCH] Proof of concept for GUC improvements

2021-11-03 Thread Daniel Gustafsson
> On 15 Oct 2021, at 23:54, Cary Huang  wrote:

> I scanned through the GUC list and found that the following parameters can
> potentially be categorized in the "special_disabled0" group, just for your
> reference.


This patch no longer applies, can you please submit a rebased version?  Also,
do you have any thoughts on Cary's suggestion in the above review?

--
Daniel Gustafsson   https://vmware.com/





Re: remove internal support in pgcrypto?

2021-11-03 Thread Daniel Gustafsson
> On 30 Oct 2021, at 14:11, Peter Eisentraut 
>  wrote:
> 
> On 24.08.21 11:13, Peter Eisentraut wrote:
>> So I'm tempted to suggest that we remove the built-in, non-OpenSSL cipher 
>> and hash implementations in pgcrypto (basically INT_SRCS in 
>> pgcrypto/Makefile), and then also pursue the simplifications in the OpenSSL 
>> code paths described in [0].
> 
> Here is a patch for this.

This patch doesn't work on Windows, which I think is because it pulls in
pgcrypto even in builds without OpenSSL.  Poking at that led me to realize that
we can simplify even more with this.  The conditional source includes can go
away and be replaced with a simple OBJS clause, and with that so can the special
hacks in Mkvcbuild.pm that were needed to cope with them.

Attached is a diff on top of your patch to do the above.  I haven't tested it
on Windows yet, but if you think it's in the right direction we'll take it for
a spin in a CI with/without OpenSSL.

Now, *if* we merge the NSS patch this does introduce special cases again which
this rips out.  I prefer to try and fix them in that patch to keep avoiding the
need for them rather than keep them on speculation for a patch which hasn't
been decided on.

--
Daniel Gustafsson   https://vmware.com/



pgcrypto_openssl.diff
Description: Binary data


Re: Data is copied twice when specifying both child and parent table in publication

2021-11-03 Thread Amit Kapila
On Thu, Oct 28, 2021 at 1:55 PM Amit Langote  wrote:
>
> On Thu, Oct 28, 2021 at 4:35 PM houzj.f...@fujitsu.com
>  wrote:
> > As there are basically two separate issues mentioned in the thread, I tried 
> > to
> > summarize the discussion so far which might be helpful to others.
> >
> > * The first issue[1]:
> >
> > If we include both the partitioned table and (explicitly) its child 
> > partitions
> > in the publication when set publish_via_partition_root=true, like:
> > ---
> > CREATE PUBLICATION pub FOR TABLE parent_table, child_table with 
> > (publish_via_partition_root=on);
> > ---
> > It could execute initial sync for both the partitioned(parent_table) table 
> > and
> > (explicitly) its child partitions(child_table) which cause duplication of
> > data in partition(child_table) in subscriber side.
> >
> > The reasons I considered this behavior a bug are:
> >
> > a) In this case, the behavior of initial sync is inconsistent with the 
> > behavior
> > of transaction streaming. All changes in the leaf the partition will be 
> > applied
> > using the identity and schema of the partitioned(root) table. But for the
> > initial sync, it will execute sync for both the partitioned(root) table and
> > (explicitly) its child partitions which cause duplication of data.
> >
> > b) The behavior of FOR TABLE is inconsistent with the behavior of FOR ALL 
> > TABLE.
> > If user create a FOR ALL TABLE publication and set 
> > publish_via_partition_root=true,
> > then only the top most partitioned(root) table will execute initial sync.
> >
> > IIRC, most people in this thread agreed that the current behavior is not
> > expected. So, maybe it's time to try to fix it.
> >
> > Attach my fix patch here. The patch try to fix this issue by making the
> > pg_publication_tables view only show partitioned table when
> > publish_via_partition_root is true.
> >
> >
> > * The second issue[2]:
> > -
> > CREATE TABLE sale (sale_date date not null,country_code text, product_sku 
> > text,
> > units integer) PARTITION BY RANGE (sale_date);
> > CREATE TABLE sale_201901 PARTITION OF sale FOR VALUES FROM ('2019-01-01') TO
> > ('2019-02-01');
> > CREATE TABLE sale_201902 PARTITION OF sale FOR VALUES FROM ('2019-02-01') TO
> > ('2019-03-01');
> >
> > (1) PUB:  CREATE PUBLICATION pub FOR TABLE sale_201901,
> > sale_201902 WITH (publish_via_partition_root=true);
> > (2) SUB:  CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres 
> > host=localhost port=5432' PUBLICATION pub;
> > (3) PUB:  INSERT INTO sale VALUES('2019-01-01', 'AU', 'cpu', 5), 
> > ('2019-01-02', 'AU', 'disk', 8);
> > (4) SUB:  SELECT * FROM sale;
> > (5) PUB:  ALTER PUBLICATION pub ADD TABLE sale;
> > (6) SUB:  ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
> > (7) SUB:  SELECT * FROM sale;
> > -
> >
> > In step (7), we can see duplication of data.
> >
> > The reason is that the INSERTed data is first published though the 
> > partitions,
> > since initially there is no partitioned table in the publication (so
> > publish_via_partition_root=true doesn't have any effect). But then adding 
> > the
> > partitioned table to the publication and refreshing the publication in the
> > subscriber, the data is then published "using the identity and schema of the
> > partitioned table" due to publish_via_partition_root=true.
> > (Copied from Greg's analysis).
> >
> > Whether this behavior is correct is still under debate.
> >
> >
> > Overall, I think the second issue still needs further discussion while the
> > first issue seems clear that most people think it's unexpected. So, I think 
> > it
> > might be better to fix the first issue.
>
> Thanks for the summary, Hou-san, and sorry about my late reply.
>
> I had thought about this some last week and I am coming around to
> recognizing the confusing user experience of the current behavior.
> Though, I am not sure if removing partitions from the result of
> pg_publication_tables view for pubviaroot publications is acceptable
> as a fix, because that prevents replicating into a subscriber node
> where tables that are partition root and a partition respectively on
> the publisher are independent tables on the subscriber.
>

But we already do it that way when the publication is "For All Tables".
Anyway, for the purpose of initial sync, it will just replicate the
same data in two different tables if the corresponding tables on the
subscriber-side are non-partitioned which I am not sure is what the
user will be expecting.

>  ISTM that
> what Amit K mentioned in his first reply may be closer to what we may
> ultimately need to do, which is this:
>
> "I think one possible idea to investigate is that on the
> subscriber-side, after fetching tables, we check the already
> subscribed tables and if the child tables already exist then we ignore
> the parent table and vice versa."
>
> I had also thought about a way to implement that a bit and part of
> that is to make pg_publication_tables always expose leaf partitions,
> that is, even if pubviar

Re: Should we support new definition for Identity column : GENERATED BY DEFAULT ON NULL?

2021-11-03 Thread Vik Fearing
On 11/2/21 12:19 PM, Himanshu Upadhyaya wrote:
> Hi,
> 
> Trying to insert NULL value to the Identity column defined by "GENERATED BY
> DEFAULT" is disallowed, but there can be use cases where the user would
> like to have an identity column where manual NULL insertion is required(and
> it should not error-out by Postgres).

What could possibly be the use case for this?

> Thoughts?

I don't like it.
-- 
Vik Fearing




Re: lastOverflowedXid does not handle transaction ID wraparound

2021-11-03 Thread Andrey Borodin



> On 3 Nov 2021, at 14:08, Alexander Korotkov  wrote:
> 
> On Wed, Nov 3, 2021 at 11:44 AM Andrey Borodin  
> wrote:
>>> On 21 Oct 2021, at 09:01, Kyotaro Horiguchi  wrote:
>>> 
>>> If the discussion so far is correct, the following diff will fix the
>>> issue.
>>> 
>>> diff --git a/src/backend/storage/ipc/procarray.c 
>>> b/src/backend/storage/ipc/procarray.c
>>> index bd3c7a47fe..19682b73ec 100644
>>> --- a/src/backend/storage/ipc/procarray.c
>>> +++ b/src/backend/storage/ipc/procarray.c
>>> @@ -4463,6 +4463,12 @@ ExpireOldKnownAssignedTransactionIds(TransactionId 
>>> xid)
>>> {
>>>   LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
>>>   KnownAssignedXidsRemovePreceding(xid);
>>> +   /*
>>> +* reset lastOverflowedXid if we know transactions that have been 
>>> possiblly
>>> +* running are being gone.
>>> +*/
>>> +   if (TransactionIdPrecedes(procArray->lastOverflowedXid, xid))
>>> +   procArray->lastOverflowedXid = InvalidTransactionId;
>>>   LWLockRelease(ProcArrayLock);
>>> }
>> 
>> The patch seems correct bugfix to me. The only question I have: is it right 
>> place from modularity standpoint? procArray->lastOverflowedXid is not a part 
>> of KnownAssignedTransactionIds?
> 
> It seems the right place because we take ProcArrayLock here.
Oh.. I see. ProcArrayApplyRecoveryInfo() is taking ProcArrayLock in so many 
places.

>  It would
> be undesirable to take it twice.  We could give a better name for
> ExpireOldKnownAssignedTransactionIds() indicating that it could modify
> lastOverflowedXid as well.  Any ideas?
Looking more I think the name is OK. KnownAssignedXidsReset() and 
KnownAssignedXidsRemovePreceding() interferes with procArray a lot.

> Should ExpireAllKnownAssignedTransactionIds() be also involved here?
I think it's good for unification, but I do not see how 
procArray->lastOverflowedXid can be used after 
ExpireAllKnownAssignedTransactionIds().


Best regards, Andrey Borodin.



Re: Teach pg_receivewal to use lz4 compression

2021-11-03 Thread gkokolatos


‐‐‐ Original Message ‐‐‐

On Wednesday, November 3rd, 2021 at 9:05 AM,  wrote:

> ‐‐‐ Original Message ‐‐‐
>
> On Wednesday, November 3rd, 2021 at 12:23 AM, Michael Paquier 
> mich...@paquier.xyz wrote:
> > On Tue, Nov 02, 2021 at 12:31:47PM -0400, Robert Haas wrote:
> > > On Tue, Nov 2, 2021 at 8:17 AM Magnus Hagander mag...@hagander.net wrote:
> > >
> > > > I think for the end user, it is strictly better to name it "gzip",
> > > > and given that the target of this option is the end user we should
> > > > do so. (It'd be different it we were talking about a build-time
> > > > parameter to configure).
> > >
> > > I agree. Also, I think there's actually a file format called "zlib"
> > > which is slightly different from the "gzip" format, and you have to be
> > > careful not to generate the wrong one.
> >
> > Okay, fine by me. It would be better to be also consistent in
> > WalCompressionMethods once we switch to this option value, then.
>
> I will revert to gzip for version 9. Should be out shortly.

Please find v9 attached.
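
For quick reference, with this patch the new method is selected like so (the
target directory below is just an example):

    pg_receivewal -D /path/to/archive --compression-method=lz4

gzip remains available via --compression-method=gzip, optionally combined with
--compress to choose the level; --compress cannot be combined with lz4.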

Cheers,
//Georgios
>
> > Michael
From 8e33136f81c3197020053cba0f7f070d594f056e Mon Sep 17 00:00:00 2001
From: Georgios Kokolatos 
Date: Wed, 3 Nov 2021 08:59:58 +
Subject: [PATCH v9 2/2] Teach pg_receivewal to use LZ4 compression

The program pg_receivewal can use gzip compression to store the received WAL.
This commit teaches it to also use LZ4 compression. This requires that the
binary is built with the -llz4 flag, which is enabled via the --with-lz4
option at configure time.

The option `--compression-method` has been expanded to include the value lz4.
The option `--compress` cannot be used with LZ4 compression.

Under the hood there is nothing exceptional to note. Tar-based archives have
not yet been taught to use LZ4 compression; if that is felt to be useful, it
can easily be added in the future.

Tests have been added to verify the creation and correctness of the generated
LZ4 files. The latter is achieved by using the lz4 program, if present in the
installation.
---
 doc/src/sgml/ref/pg_receivewal.sgml  |   5 +-
 src/Makefile.global.in   |   1 +
 src/bin/pg_basebackup/Makefile   |   1 +
 src/bin/pg_basebackup/pg_receivewal.c| 129 +++
 src/bin/pg_basebackup/t/020_pg_receivewal.pl |  72 -
 src/bin/pg_basebackup/walmethods.c   | 159 ++-
 src/bin/pg_basebackup/walmethods.h   |   1 +
 7 files changed, 358 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/ref/pg_receivewal.sgml b/doc/src/sgml/ref/pg_receivewal.sgml
index cf2eaa1486..411b275de0 100644
--- a/doc/src/sgml/ref/pg_receivewal.sgml
+++ b/doc/src/sgml/ref/pg_receivewal.sgml
@@ -268,8 +268,11 @@ PostgreSQL documentation
   

 Enables compression of write-ahead logs using the specified method.
-Supported values gzip,
+Supported values are lz4, gzip,
 and none.
+For the LZ4 method to be available,
+PostgreSQL must have been compiled
+with --with-lz4.

   
  
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 533c12fef9..05c54b27de 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -350,6 +350,7 @@ XGETTEXT = @XGETTEXT@
 
 GZIP	= gzip
 BZIP2	= bzip2
+LZ4	= lz4
 
 DOWNLOAD = wget -O $@ --no-use-server-timestamps
 #DOWNLOAD = curl -o $@
diff --git a/src/bin/pg_basebackup/Makefile b/src/bin/pg_basebackup/Makefile
index 459d514183..387d728345 100644
--- a/src/bin/pg_basebackup/Makefile
+++ b/src/bin/pg_basebackup/Makefile
@@ -24,6 +24,7 @@ export TAR
 # used by the command "gzip" to pass down options, so stick with a different
 # name.
 export GZIP_PROGRAM=$(GZIP)
+export LZ4
 
 override CPPFLAGS := -I$(libpq_srcdir) $(CPPFLAGS)
 LDFLAGS_INTERNAL += -L$(top_builddir)/src/fe_utils -lpgfeutils $(libpq_pgport)
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 9449b50868..af3eba8845 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -29,6 +29,10 @@
 #include "receivelog.h"
 #include "streamutil.h"
 
+#ifdef HAVE_LIBLZ4
+#include "lz4frame.h"
+#endif
+
 /* Time to sleep between reconnection attempts */
 #define RECONNECT_SLEEP_TIME 5
 
@@ -137,6 +141,15 @@ is_xlogfilename(const char *filename, bool *ispartial,
 		return true;
 	}
 
+	/* File looks like a complete LZ4 compressed XLOG file */
+	if (fname_len == XLOG_FNAME_LEN + strlen(".lz4") &&
+		strcmp(filename + XLOG_FNAME_LEN, ".lz4") == 0)
+	{
+		*ispartial = false;
+		*wal_compression_method = COMPRESSION_LZ4;
+		return true;
+	}
+
 	/* File looks like a partial uncompressed XLOG file */
 	if (fname_len == XLOG_FNAME_LEN + strlen(".partial") &&
 		strcmp(filename + XLOG_FNAME_LEN, ".partial") == 0)
@@ -155,6 +168,15 @@ is_xlogfilename(const char *filename, bool *ispartial,
 		return true;
 	}
 
+	/* File looks like a partia

Re: lastOverflowedXid does not handle transaction ID wraparound

2021-11-03 Thread Alexander Korotkov
On Wed, Nov 3, 2021 at 11:44 AM Andrey Borodin wrote:
> > On 21 Oct 2021, at 09:01, Kyotaro Horiguchi wrote:
> >
> > If the discussion so far is correct, the following diff will fix the
> > issue.
> >
> > diff --git a/src/backend/storage/ipc/procarray.c 
> > b/src/backend/storage/ipc/procarray.c
> > index bd3c7a47fe..19682b73ec 100644
> > --- a/src/backend/storage/ipc/procarray.c
> > +++ b/src/backend/storage/ipc/procarray.c
> > @@ -4463,6 +4463,12 @@ ExpireOldKnownAssignedTransactionIds(TransactionId 
> > xid)
> > {
> >LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
> >KnownAssignedXidsRemovePreceding(xid);
> > +   /*
> > +* reset lastOverflowedXid if we know that transactions which were
> > +* possibly running are now gone.
> > +*/
> > +   if (TransactionIdPrecedes(procArray->lastOverflowedXid, xid))
> > +   procArray->lastOverflowedXid = InvalidTransactionId;
> >LWLockRelease(ProcArrayLock);
> > }
>
> The patch seems correct bugfix to me. The only question I have: is it right 
> place from modularity standpoint? procArray->lastOverflowedXid is not a part 
> of KnownAssignedTransactionIds?

It seems the right place because we take ProcArrayLock here.  It would
be undesirable to take it twice.  We could give a better name for
ExpireOldKnownAssignedTransactionIds() indicating that it could modify
lastOverflowedXid as well.  Any ideas?

Should ExpireAllKnownAssignedTransactionIds() be also involved here?

--
Regards,
Alexander Korotkov




Re: Added schema level support for publication.

2021-11-03 Thread Amit Kapila
On Wed, Nov 3, 2021 at 11:55 AM vignesh C  wrote:
>
> On Wed, Nov 3, 2021 at 11:37 AM Peter Smith  wrote:
> >
> > While you are changing these, maybe also change:
> >
> > Before: PUBLICATIONOBJ..._CURRSCHEMA
> > After: PUBLICATIONOBJ..._CUR_SCHEMA
> >
> > Which seems more readable to me.
>
> Thanks for the comment, the attached patch has the changes for the same.
>

LGTM. I'll push this tomorrow unless there are more comments or suggestions.

-- 
With Regards,
Amit Kapila.



