Re: Converting contrib SQL functions to new style

2021-09-01 Thread Peter Eisentraut

On 14.04.21 00:26, Tom Lane wrote:

Attached are some draft patches to convert almost all of the
contrib modules' SQL functions to use SQL-standard function bodies.


This first patch is still the patch of record in CF 2021-09, but from 
the subsequent discussion, it seems more work is being contemplated.





Re: Proposal: More structured logging

2021-09-01 Thread Peter Eisentraut

On 23.08.21 11:33, Magnus Hagander wrote:

In short, I would also support the presence of JSON log format in
core. (but as a proper log_destination of course -- or if it's time to
actually split that into a separate thing, being one parameter for
log_destination and another for log_format)


It would be useful if there was some kind of standardized schema for 
JSON logging, meaning what the keys should be named etc.  Then we don't 
need to make this all up from scratch, and there could be some 
integration with existing log analysis tools.
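
For illustration, a minimal Python sketch of the idea (the key names here are hypothetical stand-ins mirroring the CSV-log columns; no standardized schema exists yet, which is exactly the point):

```python
import json

def format_json_log(entry):
    """Render one log entry as a single JSON line.

    The keys are hypothetical placeholders for whatever a standardized
    schema would settle on.
    """
    return json.dumps(entry, sort_keys=True)

line = format_json_log({
    "timestamp": "2021-09-01 09:36:50.123 CEST",
    "user": "postgres",
    "database": "postgres",
    "error_severity": "NOTICE",
    "message": "My log message",
})

# Log analysis tools could then filter on well-known keys instead of
# parsing free-form text.
```

With agreed key names, integrations would not have to guess whether the severity field is called "error_severity", "level", or "severity".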





Re: Proposal: More structured logging

2021-09-01 Thread Peter Eisentraut

On 13.08.21 15:23, Ronan Dunklau wrote:

The logging system already captures a lot of information in the ErrorData. But
at present there is no way for log message authors to include more metadata
about the log outside of the log message itself. For example, including the
extension name which can be useful for filtering / dispatching log messages.
This can be done in the log message itself, but that requires specialized
parsing in the log output.

Even though among the available loggers in core, only the csv logger would be
able to make sense of it, I feel it would be beneficial to add a tagging system
to logs, by adding different (tag, value) combinations to a log entry, with an
API similar to what we do for other ErrorData fields:

ereport(NOTICE,
  (errmsg("My log message")),
  (errtag("EMITTER", "MYEXTENSION")),
  (errtag("MSG-ID", "%d", error_message_id))
);


What are some more examples of where this could be used?  The extension 
name could be added more or less automatically to ereport() calls.  I 
don't know what the MSG-ID is supposed to be.  Are there other use cases?






Re: Proposal: More structured logging

2021-09-01 Thread Ronan Dunklau
On Wednesday, September 1, 2021 at 09:36:50 CEST, Peter Eisentraut wrote:
> On 13.08.21 15:23, Ronan Dunklau wrote:
> > The logging system already captures a lot of information in the ErrorData.
> > But at present there is no way for log message authors to include more
> > metadata about the log outside of the log message itself. For example,
> > including the extension name which can be useful for filtering /
> > dispatching log messages. This can be done in the log message itself, but
> > that requires specialized parsing in the log output.
> > 
> > Even though among the available loggers in core, only the csv logger would
> > be able to make sense of it, I feel it would be beneficial to add a
> > tagging system to logs, by adding different (tag, value) combinations to
> > a log entry, with an API similar to what we do for other ErrorData
> > fields:
> > 
> > ereport(NOTICE,
> > 
> >   (errmsg("My log message")),
> >   (errtag("EMITTER", "MYEXTENSION")),
> >   (errtag("MSG-ID", "%d", error_message_id))
> > 
> > );
> 
> What are some more examples of where this could be used?  The extension
> name could be added more or less automatically to ereport() calls.  I
> don't know what the MSG-ID is supposed to be.  Are there other use cases?

Adding it automatically would be nice, but how would you go about that?

In-core it would open up the possibility to split log messages into different 
fields, for example the different statistics reported in the logs by VACUUM / 
ANALYZE VERBOSE and make it easier to consume the output without having to 
parse the message. Parsing the message also means that any tool parsing it 
needs to either be aware of the locale, or to force the user to use a specific 
one.

Out-of-core, the same things could be done for extensions like pg_audit which 
logs structured information into the message itself, leaving the message 
format to be parsed at a later time.
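
As a toy illustration of the tagging idea (a Python mock, not the actual ereport() C API — the only assumption is that tags are plain (name, value) pairs carried next to the message):

```python
def ereport(level, message, *tags):
    """Mock of an ereport() call where errtag()-style (tag, value)
    pairs are attached to the entry instead of being embedded in the
    human-readable message text."""
    entry = {"level": level, "message": message}
    for name, value in tags:
        entry[name] = value
    return entry

entry = ereport("NOTICE", "My log message",
                ("EMITTER", "MYEXTENSION"),
                ("MSG-ID", 42))

# A structured destination (e.g. the CSV log) can emit the tags as
# their own columns, so a consumer filters on entry["EMITTER"] without
# any locale-aware parsing of the message itself.
```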

-- 
Ronan Dunklau






Re: Fix around conn_duration in pgbench

2021-09-01 Thread Fujii Masao




On 2021/09/01 1:10, Fujii Masao wrote:

+1. So we reached the consensus!

Attached is the updated version of the patch. Based on Nagata-san's latest 
patch,
I just modified the comment slightly and ran pgindent. Barring any objection,
I will commit this patch only in master and v14.


Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Dilip Kumar
On Tue, Aug 31, 2021 at 3:20 PM Amit Kapila  wrote:
>
> On Fri, Aug 27, 2021 at 12:04 PM Dilip Kumar  wrote:
> >
>
> Few comments on v6-0002*
> =
> 1.
> -BufFileDeleteFileSet(FileSet *fileset, const char *name)
> +BufFileDeleteFileSet(FileSet *fileset, const char *name, bool missing_ok)
>  {
>   char segment_name[MAXPGPATH];
>   int segment = 0;
> @@ -358,7 +369,7 @@ BufFileDeleteFileSet(FileSet *fileset, const char *name)
>   for (;;)
>   {
>   FileSetSegmentName(segment_name, name, segment);
> - if (!FileSetDelete(fileset, segment_name, true))
> + if (!FileSetDelete(fileset, segment_name, !missing_ok))
>
> I don't think the usage of missing_ok is correct here. If you see
> FileSetDelete->PathNameDeleteTemporaryFile, it already tolerates that
> the file doesn't exist but gives an error only when it is unable to
> link. So, with this missing_ok users (say like worker.c) won't even
> get errors when they are not able to remove files whereas I think the
> need for the patch is to not get an error when the file doesn't exist.
> I think you don't need to change anything in the way we invoke
> FileSetDelete.

Right, fixed.
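
The distinction Amit describes — tolerating a file that does not exist versus swallowing a real removal failure — can be sketched like this (a Python stand-in for FileSetDelete, not PostgreSQL code):

```python
import os
import tempfile

def fileset_delete(path, missing_ok=False):
    """Delete one fileset segment, distinguishing "not there" from
    "cannot remove".

    missing_ok only tolerates a nonexistent file; any other failure
    (e.g. a permission error) still raises, which is the behaviour the
    review asks to preserve.
    """
    try:
        os.remove(path)
        return True
    except FileNotFoundError:
        if missing_ok:
            return False
        raise

# usage: delete an existing segment, then tolerate its absence
d = tempfile.mkdtemp()
p = os.path.join(d, "changes.0")
open(p, "w").close()
first = fileset_delete(p, missing_ok=True)   # True: file existed, removed
second = fileset_delete(p, missing_ok=True)  # False: already gone, tolerated
```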

> 2.
> -static HTAB *xidhash = NULL;
> +static FileSet *stream_fileset = NULL;
>
> Can we keep this in LogicalRepWorker and initialize it accordingly?

Done

> 3.
> + /* Open the subxact file, if it does not exist, create it. */
> + fd = BufFileOpenFileSet(stream_fileset, path, O_RDWR, true);
> + if (fd == NULL)
> + fd = BufFileCreateFileSet(stream_fileset, path);
>
> I think retaining the existing comment: "Create the subxact file if it
> not already created, otherwise open the existing file." seems better
> here.

Done

> 4.
>   /*
> - * If there is no subtransaction then nothing to do, but if already have
> - * subxact file then delete that.
> + * If there are no subtransactions, there is nothing to be done, but if
> + * subxacts already exist, delete it.
>   */
>
> How about changing the above comment to something like: "Delete the
> subxacts file, if exists"?

Done

> 5. Can we slightly change the commit message as:
> Optimize fileset usage in apply worker.
>
> Use one fileset for the entire worker lifetime instead of using
> separate filesets for each streaming transaction. Now, the
> changes/subxacts files for every streaming transaction will be created
> under the same fileset and the files will be deleted after the
> transaction is completed.
>
> This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
> APIs to allow users to specify whether to give an error on missing
> files.

Done


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
From fade8ae2779e8aeda0d1fee3902a93eaedb0187f Mon Sep 17 00:00:00 2001
From: Dilip Kumar 
Date: Fri, 27 Aug 2021 11:49:12 +0530
Subject: [PATCH v7] Optimize fileset usage in apply worker

Use one fileset for the entire worker lifetime instead of using
separate filesets for each streaming transaction. Now, the
changes/subxacts files for every streaming transaction will be
created under the same fileset and the files will be deleted
after the transaction is completed.

This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet
APIs to allow users to specify whether to give an error on missing
files.
---
 src/backend/replication/logical/launcher.c |   6 +-
 src/backend/replication/logical/worker.c   | 249 +
 src/backend/storage/file/buffile.c |  21 ++-
 src/backend/utils/sort/logtape.c   |   2 +-
 src/backend/utils/sort/sharedtuplestore.c  |   3 +-
 src/include/replication/worker_internal.h  |  11 +-
 src/include/storage/buffile.h  |   5 +-
 7 files changed, 80 insertions(+), 217 deletions(-)

diff --git a/src/backend/replication/logical/launcher.c b/src/backend/replication/logical/launcher.c
index 8b1772d..3fb4caa 100644
--- a/src/backend/replication/logical/launcher.c
+++ b/src/backend/replication/logical/launcher.c
@@ -379,6 +379,7 @@ retry:
 	worker->relid = relid;
 	worker->relstate = SUBREL_STATE_UNKNOWN;
 	worker->relstate_lsn = InvalidXLogRecPtr;
+	worker->stream_fileset = NULL;
 	worker->last_lsn = InvalidXLogRecPtr;
 	TIMESTAMP_NOBEGIN(worker->last_send_time);
 	TIMESTAMP_NOBEGIN(worker->last_recv_time);
@@ -648,8 +649,9 @@ logicalrep_worker_onexit(int code, Datum arg)
 
 	logicalrep_worker_detach();
 
-	/* Cleanup filesets used for streaming transactions. */
-	logicalrep_worker_cleanupfileset();
+	/* Cleanup fileset used for streaming transactions. */
+	if (MyLogicalRepWorker->stream_fileset != NULL)
+		FileSetDeleteAll(MyLogicalRepWorker->stream_fileset);
 
 	ApplyLauncherWakeup();
 }
diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index bfb7d1a..a222cb3 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -236,20 +236,6 @@ static ApplyErrorCallbackArg apply_error_callback_arg =
 	.ts = 0,
 };
 
-/*
- * Stream xid hash entry. Whene

Re: Failure of subscription tests with topminnow

2021-09-01 Thread Amit Kapila
On Tue, Aug 31, 2021 at 6:13 PM Ajin Cherian  wrote:
>
> On Tue, Aug 31, 2021 at 3:47 PM Masahiko Sawada  wrote:
>
> Thanks Masahiko-san. I have included this change and made a new patch-set.
>
> Hi Amit,
>
> I have included your comments as well and also attached the patches
> for the back-branches.
>

You forgot to make changes in 022_twophase_cascade.pl in the head
version patch. I have made the required changes and pushed it. Thanks
to you and Sawada-San for diagnosing and fixing this problem.

-- 
With Regards,
Amit Kapila.




Re: Trap errors from streaming child in pg_basebackup to exit early

2021-09-01 Thread Daniel Gustafsson


> On 30 Aug 2021, at 12:31, Bharath Rupireddy 
>  wrote:

> Here are some comments on the patch:
> 1) Do we need volatile keyword here to read the value of the variables
> always from the memory?
> +static volatile sig_atomic_t bgchild_exited = false;

Yes, fixed.
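
The pattern under discussion — a handler that only sets a flag which the main loop polls — looks like this in a POSIX Python sketch (the C version uses `volatile sig_atomic_t` because that is the only type guaranteed safe to write from a signal handler):

```python
import os
import signal
import time

bgchild_exited = False   # C version: static volatile sig_atomic_t

def sigchld_handler(signum, frame):
    # Do nothing unsafe here: just record that a child died and return.
    global bgchild_exited
    bgchild_exited = True

signal.signal(signal.SIGCHLD, sigchld_handler)

pid = os.fork()
if pid == 0:
    os._exit(1)          # background "WAL receiver" dies unexpectedly

# The main loop (normally busy receiving base backup data) polls the
# flag so it can bail out early instead of producing a useless backup.
deadline = time.monotonic() + 5
while not bgchild_exited and time.monotonic() < deadline:
    time.sleep(0.01)

os.waitpid(pid, 0)       # reap the child
```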

> 2) Do we need #ifndef WIN32 ... #endif around sigchld_handler function
> definition?

Ah yes, good point. Fixed.

> 3) I'm not sure if the new value of bgchild_exited being set in the
> child thread will reflect in the main process on Windows? But
> theoretically, I can understand that the memory will be shared between
> the main process thread and child thread.

The child does not have its own copy of bgchild_exited.

> 4) How about "set the same flag that we use on Unix to signal
> SIGCHLD."  instead of "* set the flag used on Unix to signal
> SIGCHLD."?

Fixed.

> 5) How about "background WAL receiver terminated unexpectedly" instead
> of "log streamer child terminated unexpectedly"? This will be in sync
> with the existing message "starting background WAL receiver". "log
> streamer" is the word used internally in the code, user doesn't know
> it with that name.

Good point, that’s better.

> 6) How about giving the exit code (like postmaster's reaper function
> does) instead of just a message saying unexpected termination? It will
> be useful to know for what reason the process exited. For Windows, we
> can use GetExitCodeThread (I'm referring to the code around waitpid in
> pg_basebackup) and for Unix we can use waitpid.

The rest of the program is doing exit(1) regardless of the failure of the
child/thread, so it seems more consistent to keep doing that for this class of
error as well.

A v2 with the above fixes is attached.

--
Daniel Gustafsson   https://vmware.com/



v2-0001-Quick-exit-on-log-stream-child-exit-in-pg_basebac.patch
Description: Binary data


Re: Returning to Postgres community work

2021-09-01 Thread Amit Kapila
On Tue, Aug 31, 2021 at 11:24 AM Gurjeet Singh  wrote:
>
> Hi All,
>
>  I'm very happy to announce that I now work for Supabase [1]. They
> have hired me so that I can participate in, and contribute to the
> Postgres community.
>

Congratulations! Glad to hear this news.

-- 
With Regards,
Amit Kapila.




Question about creation of a new PostgreSQL Extension.

2021-09-01 Thread A Z
To whom it may concern,

I am trying to get a project completed to enhance PostgreSQL's arithmetic
and elementary-function prowess by means of two new high-precision mixed
decimal number types in a self-installing extension.  Hopefully, I want
this to be a free or low cost project.

Is there anyone who can read these project specifications and email me back
at powerus...@live.com.au to give me a quote for this project?  They are in
my top posting at this discussion thread:

https://github.com/dvarrazzo/pgmp/issues/22

The extension could be called HPPM, High Precision Postgresql Mathematics.
It is to be written in C, and will need a number of offline installers for
major operating systems, like Windows 10/11 or rpm-based Linux.  The
project could be hosted on SourceForge or GitHub.

If anyone on this list is interested, or knows which direction to point me
in, could they please reply to me here, at powerus...@live.com.au?

ZM.



Re: pgstat_send_connstats() introduces unnecessary timestamp and UDP overhead

2021-09-01 Thread Laurenz Albe
On Tue, 2021-08-31 at 21:16 -0700, Andres Freund wrote:
> On 2021-09-01 05:39:14 +0200, Laurenz Albe wrote:
> > On Tue, 2021-08-31 at 18:55 -0700, Andres Freund wrote:
> > > > > On Tue, Aug 31, 2021 at 04:55:35AM +0200, Laurenz Albe wrote:
> > > > > In the view of that, how about doubling PGSTAT_STAT_INTERVAL to 1000
> > > > > milliseconds?  That would mean slightly less up-to-date statistics,
> > > > > but I doubt that that will be a problem.
> > > 
> > > I think it's not helpful. Still increases the number of messages 
> > > substantially in workloads
> > > with a lot of connections doing occasional queries. Which is common.
> > 
> > How come?  If originally you send table statistics every 500ms, and now you 
> > send
> > table statistics and session statistics every second, that should amount to 
> > the
> > same thing.  Where is my misunderstanding?
> 
> Consider the case of one query a second.

I guess I am too stupid.  I don't see it.
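
For readers following along, one way to read Andres's example (my reading, not necessarily his): a backend flushes its stats at end of query, and PGSTAT_STAT_INTERVAL only suppresses a flush when the previous one is too recent.  With one query per second, even a 1000 ms interval never throttles anything, so doubling it saves no messages, while connection stats still piggyback on every flush.  A quick simulation of that reading:

```python
def reports_sent(query_gap_s, stat_interval_ms, duration_s=60):
    """Reports flushed by one backend issuing a query every query_gap_s
    seconds, where a flush happens at end of query only if at least
    stat_interval_ms elapsed since the previous flush."""
    last = float("-inf")
    sent = 0
    t = 0.0
    while t < duration_s:
        if (t - last) * 1000.0 >= stat_interval_ms:
            sent += 1
            last = t
        t += query_gap_s
    return sent

# One query per second: the interval never throttles anything, so
# doubling it does not reduce the message count.
assert reports_sent(1.0, 500) == reports_sent(1.0, 1000) == 60
```

Under that reading, the message count is governed by the query rate, not by the interval.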

Yours,
Laurenz Albe





Re: Kerberos delegation support in libpq and postgres_fdw

2021-09-01 Thread Peter Eisentraut

On 22.07.21 10:39, Peifeng Qiu wrote:
I've slightly modified the patch to support "gssencmode" and added TAP 
tests.


For the TAP tests, please put them under src/test/kerberos/, instead of 
copying the whole infrastructure to contrib/postgres_fdw/.  Just make a 
new file, for example t/002_postgres_fdw_proxy.pl, and put your tests there.


Also, you can put code and tests in one patch, no need to separate.

I wonder if this feature would also work in dblink.  Since there are no 
substantial code changes in postgres_fdw itself as part of this patch, I 
would suspect yes.  Can you check?





Re: pg_receivewal: remove extra conn = NULL; in StreamLog

2021-09-01 Thread Bharath Rupireddy
On Sun, Aug 29, 2021 at 1:27 AM Daniel Gustafsson  wrote:
>
> > On 28 Aug 2021, at 14:10, Bharath Rupireddy 
> >  wrote:
>
> > It seems there's a redundant assignment statement conn = NULL in
> > pg_receivewal's StreamLog function. Attaching a tiny patch herewith.
> > Thoughts?
>
> Agreed, while harmless this is superfluous since conn is already set to NULL
> after the PQfinish call a few lines up (which was added in a4205fa00d526c3).
> Unless there are objections I’ll apply this tomorrow or Monday.

Thanks for picking this up. I added this to CF to not lose it in the
wild - https://commitfest.postgresql.org/34/3317/

Regards,
Bharath Rupireddy.




RE: AIX: Symbols are missing in libpq.a

2021-09-01 Thread REIX, Tony
Thanks for your help!
That wasn't so difficult, once I've refreshed my memory.
Here is a new patch, using the export.txt whenever it does exist.
I have tested it with v13.4 : it's OK.
Patch for 14beta3 should be the same since there was no change for 
src/Makefile.shlib between v13 and v14.

Regards/Cordialement,

Tony Reix

tony.r...@atos.net

ATOS / Bull SAS
ATOS Expert
IBM-Bull Cooperation Project: Architect & OpenSource Technical Leader
1, rue de Provence - 38432 ECHIROLLES - FRANCE
www.atos.net

From: Noah Misch 
Sent: Tuesday, August 31, 2021 05:33
To: REIX, Tony 
Cc: pgsql-hackers@lists.postgresql.org ; 
CHIGOT, CLEMENT 
Subject: Re: AIX: Symbols are missing in libpq.a

On Mon, Aug 30, 2021 at 03:35:23PM +, REIX, Tony wrote:
> Yes, trying to use the create lib$(NAME).exp from $(SHLIB_EXPORTS) when it 
> exists was my first idea, too.
> However, I do not master (or I forgot) this kind of "if" in a Makefile 
> and I was unable to find a solution by reading Makefile manuals or by 
> searching for a similar example. So, I did it in an easier (to me!) and 
> quicker way: merge with a new command line in the Makefile rule.
> Now that we have a clear understanding of what is happening, I may have a 
> deeper look at a clean Makefile solution. However, if you know how to manage 
> this, I would really appreciate some example. I'm asking my colleague too if 
> he can help me here.

Here's an example from elsewhere in Makefile.shlib:

# If SHLIB_EXPORTS is set, the rules below will build a .def file from that.
# Else we just use --export-all-symbols.
ifeq (,$(SHLIB_EXPORTS))
$(shlib): $(OBJS) | $(SHLIB_PREREQS)
	$(CC) $(CFLAGS)  -shared -static-libgcc -o $@  $(OBJS) $(LDFLAGS) $(LDFLAGS_SL) $(SHLIB_LINK) $(LIBS) -Wl,--export-all-symbols -Wl,--out-implib=$(stlib)
else
DLL_DEFFILE = lib$(NAME)dll.def

$(shlib): $(OBJS) $(DLL_DEFFILE) | $(SHLIB_PREREQS)
	$(CC) $(CFLAGS)  -shared -static-libgcc -o $@  $(OBJS) $(DLL_DEFFILE) $(LDFLAGS) $(LDFLAGS_SL) $(SHLIB_LINK) $(LIBS) -Wl,--out-implib=$(stlib)

UC_NAME = $(shell echo $(NAME) | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')

$(DLL_DEFFILE): $(SHLIB_EXPORTS)
	echo 'LIBRARY LIB$(UC_NAME).dll' >$@
	echo 'EXPORTS' >>$@
	sed -e '/^#/d' -e 's/^\(.*[  ]\)\([0-9][0-9]*\)/\1@ \2/' $< >>$@
endif



postgresql-13.4-exports.txt.patch
Description: postgresql-13.4-exports.txt.patch


Re: Add option --drop-cascade for pg_dump/restore

2021-09-01 Thread Daniel Gustafsson
> On 16 Aug 2021, at 08:35, Wu Haotian  wrote:
> 
> The documentation already states that --clean only works with plain text
> output, so adding a check for --clean seems like a breaking change to me.
> 
> I've updated the docs to indicate that --drop-cascade and --if-exists only
> work with plain text output.

This patch fails to apply after recent changes to the pg_dump TAP tests.
Please submit a rebased version.

--
Daniel Gustafsson   https://vmware.com/





Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-09-01 Thread Daniel Gustafsson
> On 9 Mar 2021, at 20:30, Joel Jacobson  wrote:

> Attached is a patch implementing it this way.

This patch no longer applies, can you please submit a rebased version?

--
Daniel Gustafsson   https://vmware.com/





Re: Autovacuum (analyze) on partitioned tables for ATTACH/DETACH/DROP commands

2021-09-01 Thread Daniel Gustafsson
> On 21 Jun 2021, at 10:21, yuzuko  wrote:

> Attach the v1 patch.  What do you think?

This patch no longer applies to HEAD, can you please submit a rebased version
for the commitfest?

--
Daniel Gustafsson   https://vmware.com/





Re: Returning to Postgres community work

2021-09-01 Thread Julien Rouhaud
On Wed, Sep 1, 2021 at 1:02 AM Gurjeet Singh  wrote:
>
> On Tue, Aug 31, 2021 at 8:04 AM Alvaro Herrera  
> wrote:
> >
> > You know what I've heard?  That your index advisor module languishes
> > unmaintained and that there's no shortage of people wishing to use it.
>
> Now there's a masterclass in making someone feel great and guilty in
> the same sentence ;-)
>
> > Heck, we spent a lot of back-and-forth in the spanish mailing list
> > with somebody building a super obsolete version of Postgres just so that
> > they could compile your index advisor.  I dunno, if you have some spare
> > time, maybe updating that one would be a valuable contribution from
> > users' perspective.
>
> Aye-aye Capn' :-)
>
> EDB folks reached out to me a few months ago to assign a license to
> the project, which I did and it is now a Postgres-licensed project
> [1].
>
> Given the above, it is safe to assume that this tool is at least being
> maintained by EDB, at least internally for their customers. I would
> request them to contribute the changes back in the open.
>
> Regardless of that, please rest assured that I will work on making it
> compatible with the current supported versions of Postgres. Lack of
> time is not an excuse anymore :-)

For the record, we created an index adviser, which can be used either
with the PoWA user interface (which requires a bit more effort to set up
but gives a lot of additional performance info) or standalone in
SQL using only the pg_qualstats extension.

Unlike most advisers, it uses predicates sampled from the actual
workload rather than working on a per-query basis to come up with its
suggestions.  As a result it can give better results, as it can e.g.
suggest multi-column indexes to optimize multiple queries at once
rather than suggesting multiple partially redundant indexes for each
query.  The UI version can also check all the suggested indexes using
hypopg to verify if they're sensible and also give a rough idea on how
much the queries can benefit from it.  You can see a naive example at
[1].

Note that this is compatible with all postgres version down to 9.4.

[1]: https://powa.readthedocs.io/en/latest/_images/hypopg_db1.png




Re: Patch: shouldn't timezone(text, timestamp[tz]) be STABLE?

2021-09-01 Thread Aleksander Alekseev
Hi Tom,

> No, because date_trunc depends on the current timezone setting,
> or at least its stable variants do.

Once again, many thanks for your answers!

> I wasn't excited enough about it personally to change it, and I'm
> still not --- but if you want to, send a patch.

Here is the patch.

-- 
Best regards,
Aleksander Alekseev


v3-0001-timezone_volatility.patch
Description: Binary data


Re: [PATCH] Native spinlock support on RISC-V

2021-09-01 Thread Christoph Berg
Re: Tom Lane
> I did not like confusing the RISC-V case with the ARM case: duplicating
> the code block seems better, in case there's ever reason to add
> arch-specific stuff like SPIN_DELAY.  So I split it off to its own
> code block and pushed it.

Fwiw I can report success on Debian's riscv port with that commit
cherry-picked onto 13.4:

https://buildd.debian.org/status/fetch.php?pkg=postgresql-13&arch=riscv64&ver=13.4-3&stamp=1630411643&raw=0

Christoph




Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread Aleksander Alekseev
Hi John,

Looking at the timestamptz_bin() implementation, I don't see why it
should be STABLE. Its return value depends only on the input values.
It doesn't look at the session parameters. timestamptz_in() and
timestamptz_out() are STABLE, that's true, but this is no concern of
timestamptz_bin().

Am I missing something?

-- 
Best regards,
Aleksander Alekseev




Re: Fix around conn_duration in pgbench

2021-09-01 Thread Yugo NAGATA
On Wed, 1 Sep 2021 17:07:43 +0900
Fujii Masao  wrote:

> 
> 
> On 2021/09/01 1:10, Fujii Masao wrote:
> > +1. So we reached the consensus!
> > 
> > Attached is the updated version of the patch. Based on Nagata-san's latest 
> > patch,
> > I just modified the comment slightly and ran pgindent. Barring any 
> > objection,
> > I will commit this patch only in master and v14.
> 
> Pushed. Thanks!

Thank you!

-- 
Yugo NAGATA 




RE: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)

2021-09-01 Thread kuroda.hay...@fujitsu.com
Dear Fujii-san,

> Can we split the patch into two as follows? If so, we can review
> and commit them one by one.
> 
> #1. Add application_name GUC into postgres_fdw
> #2. Allow to specify special escape characters in application_name that
> postgres_fdw uses.

OK, I have split the patch and attached it accordingly. 0001 adds the
new GUC, and 0002 allows escape sequences to be accepted.

> For now I've not found real use case that implementing the feature
> in libpq would introduce. So IMO postgres_fdw version is better.

+1, so I'll focus on this version.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



v05_0002_allow_to_escape.patch
Description: v05_0002_allow_to_escape.patch


v05_0001_add_application_name_GUC.patch
Description: v05_0001_add_application_name_GUC.patch


Re: Trap errors from streaming child in pg_basebackup to exit early

2021-09-01 Thread Bharath Rupireddy
On Wed, Sep 1, 2021 at 1:56 PM Daniel Gustafsson  wrote:
> A v2 with the above fixes is attached.

Thanks for the updated patch. Here are some comments:

1) Do we need to set bgchild = -1 before the exit(1); in the code
below so that we don't kill(bgchild, SIGTERM); unnecessarily in
kill_bgchild_atexit?
+ if (bgchild_exited)
+ {
+ pg_log_error("background WAL receiver terminated unexpectedly");
+ exit(1);
+ }
+

2) Missing "," after "On Windows, we use a ."
+ * that time. On Windows we use a background thread which can communicate

3) How about "/* Flag to indicate whether or not child process exited
*/" instead of +/* State of child process */?

4) Instead of just exiting from the main pg_basebackup process when
the child WAL receiver dies, can't we think of restarting the child
process, probably with the WAL streaming position where it left off or
stream from the beginning? This way, the work that the main
pg_basebackup has done so far doesn't get wasted. I'm not sure if this
affects the pg_basebackup functionality. We can restart the child
process once or twice; if it still dies, we can kill the main
pg_basebackup process too. Thoughts?

Regards,
Bharath Rupireddy.




Re: Next Steps with Hash Indexes

2021-09-01 Thread Sadhuprasad Patro
>
> That's a significant difference. Have you checked via perf or some
> other way what causes this difference? I have seen that sometimes
> single client performance with pgbench is not stable, so can you
> please once check with 4 clients or so and possibly with a larger
> dataset as well.

I have also verified this manually, without the pgbench tool. I can see a
significant difference for each query fired under both versions of the
patch. As shown below, I ran the same queries on the same dataset with
both patches, and there is a significant performance impact with
separate hash values for multiple key columns.

SingleHash_MultiColumn:
postgres=# create table perftest(a int, b int, c int, d int, e int, f int);
CREATE TABLE

postgres=# insert into perftest values (generate_series(1, 1000),
generate_series(1, 1000), generate_series(1, 1000), 9, 7);
INSERT 0 1000

postgres=# create index idx on perftest using hash(a, b, c);
CREATE INDEX

postgres=# select * from perftest where a=5999 and b=5999 and c=5999;
  a   |  b   |  c   | d | e | f
------+------+------+---+---+---
 5999 | 5999 | 5999 | 9 | 7 |
(1 row)
Time: 2.022 ms

postgres=# select * from perftest where a=597989 and b=597989 and c=597989;
   a    |   b    |   c    | d | e | f
--------+--------+--------+---+---+---
 597989 | 597989 | 597989 | 9 | 7 |
(1 row)
Time: 0.867 ms

postgres=# select * from perftest where a=6297989 and b=6297989 and c=6297989;
    a    |    b    |    c    | d | e | f
---------+---------+---------+---+---+---
 6297989 | 6297989 | 6297989 | 9 | 7 |
(1 row)
Time: 1.439 ms

postgres=# select * from perftest where a=6290798 and b=6290798 and c=6290798;
    a    |    b    |    c    | d | e | f
---------+---------+---------+---+---+---
 6290798 | 6290798 | 6290798 | 9 | 7 |
(1 row)
Time: 1.013 ms

postgres=# select * from perftest where a=6290791 and b=6290791 and c=6290791;
    a    |    b    |    c    | d | e | f
---------+---------+---------+---+---+---
 6290791 | 6290791 | 6290791 | 9 | 7 |
(1 row)
Time: 0.903 ms

postgres=# select * from perftest where a=62907 and b=62907 and c=62907;
   a   |   b   |   c   | d | e | f
-------+-------+-------+---+---+---
 62907 | 62907 | 62907 | 9 | 7 |
(1 row)
Time: 0.894 ms

SeparateHash_MultiColumn:
postgres=# create table perftest(a int, b int, c int, d int, e int, f int);
CREATE TABLE

postgres=# insert into perftest values (generate_series(1, 1000),
generate_series(1, 1000), generate_series(1, 1000), 9, 7);
INSERT 0 1000

postgres=# create index idx on perftest using hash(a, b, c);
CREATE INDEX

postgres=# select * from perftest where a=5999 and b=5999 and c=5999;
  a   |  b   |  c   | d | e | f
------+------+------+---+---+---
 5999 | 5999 | 5999 | 9 | 7 |
(1 row)
Time: 2.915 ms

postgres=# select * from perftest where a=597989 and b=597989 and c=597989;
   a    |   b    |   c    | d | e | f
--------+--------+--------+---+---+---
 597989 | 597989 | 597989 | 9 | 7 |
(1 row)
Time: 1.129 ms

postgres=# select * from perftest where a=6297989 and b=6297989 and c=6297989;
    a    |    b    |    c    | d | e | f
---------+---------+---------+---+---+---
 6297989 | 6297989 | 6297989 | 9 | 7 |
(1 row)
Time: 2.454 ms

postgres=# select * from perftest where a=6290798 and b=6290798 and c=6290798;
    a    |    b    |    c    | d | e | f
---------+---------+---------+---+---+---
 6290798 | 6290798 | 6290798 | 9 | 7 |
(1 row)
Time: 2.327 ms

postgres=# select * from perftest where a=6290791 and b=6290791 and c=6290791;
    a    |    b    |    c    | d | e | f
---------+---------+---------+---+---+---
 6290791 | 6290791 | 6290791 | 9 | 7 |
(1 row)
Time: 1.676 ms

postgres=# select * from perftest where a=62907 and b=62907 and c=62907;
   a   |   b   |   c   | d | e | f
-------+-------+-------+---+---+---
 62907 | 62907 | 62907 | 9 | 7 |
(1 row)
Time: 2.614 ms
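
For readers following along, the two designs being compared can be sketched as follows (a Python toy, not the patch code; the combine step is loosely modeled on PostgreSQL's hash_combine()):

```python
def hash_combine32(a, b):
    """Mix two 32-bit hashes; loosely modeled on PostgreSQL's
    hash_combine(), purely for illustration."""
    a ^= (b + 0x9E3779B9 + ((a << 6) & 0xFFFFFFFF) + (a >> 2)) & 0xFFFFFFFF
    return a & 0xFFFFFFFF

def index_hash_separate(values):
    """SeparateHash_MultiColumn: hash every key column, then combine.
    Each extra key column costs an extra per-column hash + combine."""
    h = 0
    for v in values:
        h = hash_combine32(h, hash(v) & 0xFFFFFFFF)
    return h

def index_hash_single(values):
    """SingleHash_MultiColumn: one hash over the whole key at once."""
    return hash(tuple(values)) & 0xFFFFFFFF

h_sep = index_hash_separate((5999, 5999, 5999))
h_one = index_hash_single((5999, 5999, 5999))
```

The timings above suggest the extra per-column hashing in the separate-hash design is visible in single-client lookups.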

If I do a test with 4 clients, then there is not much visible
difference. I think this is because of contention. And here our focus
is single-thread, single-operation performance.

>
> One more thing to consider is that it seems that the planner requires
> a condition for the first column of an index before considering an
> indexscan plan. See Tom's email [1] in this regard. I think it would
> be better to see what kind of work is involved there if you want to
> explore a single hash value for all columns idea.
>
> [1] - https://www.postgresql.org/message-id/29263.1506483172%40sss.pgh.pa.us

About this point, I will analyze further and update.

Thanks & Regards
SadhuPrasad
EnterpriseDB: http://www.enterprisedb.com




Re: support for MERGE

2021-09-01 Thread Daniel Gustafsson
> On 21 Mar 2021, at 14:57, Alvaro Herrera  wrote:
> On 2021-Mar-21, Magnus Hagander wrote:

>> But what is it we *want* it to do? That is, what should be the target?
>> Is it 2021-07 with status WoA?
> 
> Yeah, that sounds like it, thanks.

This patch fails to apply, will there be a new version for this CF?

--
Daniel Gustafsson   https://vmware.com/





Re: logical replication empty transactions

2021-09-01 Thread Ajin Cherian
On Wed, Aug 25, 2021 at 5:15 PM Peter Smith  wrote:
>
> I reviewed the v14-0001 patch.
>
> All my previous comments have been addressed.
>
> Apply / build / test was all OK.
>
> --
>
> More review comments:
>
> 1. Params names in the function declarations should match the rest of the 
> code.
>
> 1a. src/include/replication/logical.h
>
> @@ -26,7 +26,8 @@ typedef LogicalOutputPluginWriterWrite
> LogicalOutputPluginWriterPrepareWrite;
>
>  typedef void (*LogicalOutputPluginWriterUpdateProgress) (struct
> LogicalDecodingContext *lr,
>   XLogRecPtr Ptr,
> - TransactionId xid
> + TransactionId xid,
> + bool send_keep_alive
>
> =>
> Change "send_keep_alive" --> "send_keepalive"
>
> ~~
>
> 1b. src/include/replication/output_plugin.h
>
> @@ -243,6 +243,6 @@ typedef struct OutputPluginCallbacks
>  /* Functions in replication/logical/logical.c */
>  extern void OutputPluginPrepareWrite(struct LogicalDecodingContext
> *ctx, bool last_write);
>  extern void OutputPluginWrite(struct LogicalDecodingContext *ctx,
> bool last_write);
> -extern void OutputPluginUpdateProgress(struct LogicalDecodingContext *ctx);
> +extern void OutputPluginUpdateProgress(struct LogicalDecodingContext
> *ctx, bool send_keep_alive);
>
> =>
> Change "send_keep_alive" --> "send_keepalive"
>
> --
>
> 2. Comment should be capitalized - src/backend/replication/walsender.c
>
> @@ -170,6 +170,9 @@ static TimestampTz last_reply_timestamp = 0;
>  /* Have we sent a heartbeat message asking for reply, since last reply? */
>  static bool waiting_for_ping_response = false;
>
> +/* force keep alive when skipping transactions in synchronous
> replication mode */
> +static bool force_keepalive_syncrep = false;
>
> =>
> "force" --> "Force"
>
> --
>
> Otherwise, v14-0001 LGTM.
>

Thanks for the comments. Addressed them in the attached patch.

regards,
Ajin Cherian
Fujitsu Australia


v15-0001-Skip-empty-transactions-for-logical-replication.patch
Description: Binary data


Re: NOT VALID for Unique Indexes

2021-09-01 Thread Daniel Gustafsson
> On 26 Feb 2021, at 10:36, Simon Riggs  wrote:

> I won't be able to finish this patch in time for this next CF, but
> thanks for your interest, I will complete for PG15 later this year.

This patch no longer applies to HEAD, will there be an updated version for this
CF?

--
Daniel Gustafsson   https://vmware.com/





Re: Parallelize correlated subqueries that execute within each worker

2021-09-01 Thread Daniel Gustafsson
> On 7 May 2021, at 18:30, James Coleman  wrote:

> ..here we are now, and I finally have this patch cleaned up
> enough to share.

This patch no longer applies to HEAD, can you please submit a rebased version?

--
Daniel Gustafsson   https://vmware.com/





Re: [PATCH] Partial foreign key updates in referential integrity triggers

2021-09-01 Thread Daniel Gustafsson
> On 4 Aug 2021, at 23:49, Paul Martinez  wrote:

> I've rebased my original patch and it should work now. 

This patch no longer applies, can you please submit a rebased version?  It
currently fails on catversion.h; to keep that from happening repeatedly you
can, IMO, omit that file from the patch submission.

--
Daniel Gustafsson   https://vmware.com/





Re: Postgres Win32 build broken?

2021-09-01 Thread Ranier Vilela
On Tue, Aug 31, 2021 at 10:52 PM, Michael Paquier wrote:

> On Tue, Aug 31, 2021 at 07:49:40PM -0300, Ranier Vilela wrote:
> > I'm not a perl specialist and it seems to me that the Win32 build is
> broken.
> > The Win32 build is still important because of the 32-bit clients still in
> > use.
> > I'm investigating the problem.
>
> Being able to see the command you are using for build.pl, your
> buildenv.pl and/or config.pl, as well as your build dependencies
> should help to know what's wrong.
>
When I build Postgres to post, I basically don't change anything;
everything is the default from head. config.pl does not exist.
The command to build, on either x64 or Win32, is:
build.bat


>
> MSVC builds are tested by various buildfarm members on a daily basis,
> and nothing is red.  I also have a x86 and x64 configuration with
> VS2015 that prove to work as of HEAD at de1d4fe, FWIW.  Now, by
> experience, one could say that N Windows PG developpers finish with at
> least (N+1) different environments.  Basically Simon Riggs's theorem
> applied to Windows development..
>
I'm using the latest msvc 2019.
From the error message, there is no Release|x86, but Release|Win32.
But I still haven't found where this "Release|x86" is being set.

regards,
Ranier Vilela


Re: postgres_fdw: Handle boolean comparison predicates

2021-09-01 Thread Daniel Gustafsson
> On 31 May 2021, at 18:51, Emre Hasegeli  wrote:
> 
>> Please add this patch to the commitfest so that it's not forgotten. It
>> will be considered as a new feature so will be considered for commit
>> after the next commitfest.
> 
> I did [1].

The patch no longer applies to HEAD, can you please submit a rebased version?

--
Daniel Gustafsson   https://vmware.com/





Re: row filtering for logical replication

2021-09-01 Thread Euler Taveira
On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
> Here are the new v26* patches. This is a refactoring of the row-filter
> caches to remove all the logic from the get_rel_sync_entry function
> and delay it until if/when needed in the pgoutput_row_filter function.
> This is now implemented per Amit's suggestion to move all the cache
> code [1]. It is a replacement for the v25* patches.
> 
> The make check and TAP subscription tests are all OK. I have repeated
> the performance tests [2] and those results are good too.
> 
> v26-0001 <--- v23 (base RF patch)
> v26-0002 <--- ExprState cache mods (refactored row filter caching)
> v26-0002 <--- ExprState cache extra debug logging (temp)
Peter, I'm still reviewing this new cache mechanism. I will provide a feedback
as soon as I integrate it as part of this recent modification.

I'm attaching a new version that simply includes Houzj's review [1]. It is
based on v23.

There has been a discussion about which row should be used by the row filter.
We don't have a unanimous choice, so I think it is prudent to provide a way
for the user to change it. I suggested in a previous email [2] that a
publication option should be added. Hence, the row filter can be applied to
the old tuple, the new tuple, or both. This approach is simpler than using
OLD/NEW references (less code, and it avoids validation such as a NEW
reference for DELETEs or an OLD reference for INSERTs). I thought about a
reasonable default value, and it seems the _new_ tuple is a good one because
(i) it is always available and (ii) the user doesn't have to figure out that
replication is broken due to a column that is not part of the replica
identity. I'm attaching a POC that implements it. I'm still polishing it.
Adding tests for multiple row filters and integrating Peter's caching
mechanism [3] are the next steps.

[1] 
https://www.postgresql.org/message-id/OS0PR01MB571696CA853B3655F7DE752994E29%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] 
https://www.postgresql.org/message-id/5a3f74df-ffa1-4126-a5d8-dbb081d3e439%40www.fastmail.com
[3] 
https://www.postgresql.org/message-id/CAHut%2BPsgRHymwLhJ9t3By6%2BKNaVDzfjf6Y4Aq%3DJRD-y8t1mEFg%40mail.gmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/
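For reference, the feature implemented by the attached patch boils down to syntax along these lines (a sketch; the table and predicate are illustrative):

```sql
CREATE TABLE users (id int PRIMARY KEY, active bool, region text);

-- The WHERE clause must be enclosed in parentheses; rows for which it
-- evaluates to false (or NULL) are not replicated.
CREATE PUBLICATION active_users FOR TABLE users WHERE (active IS TRUE);
```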
From 018cdb79733ddf4f0de1e4eace3a172bd685d53c Mon Sep 17 00:00:00 2001
From: Euler Taveira 
Date: Mon, 18 Jan 2021 12:07:51 -0300
Subject: [PATCH v27 1/2] Row filter for logical replication

This feature adds row filter for publication tables. When a publication
is defined or modified, rows that don't satisfy an optional WHERE clause
will be filtered out. This allows a database or set of tables to be
partially replicated. The row filter is per table. A new row filter can
be added simply by specifying a WHERE clause after the table name. The
WHERE clause must be enclosed by parentheses.

The WHERE clause should probably contain only columns that are part of the
primary key or that are covered by REPLICA IDENTITY. Otherwise, any DELETEs
won't be replicated. DELETE uses the old row version (that is limited to
primary key or REPLICA IDENTITY) to evaluate the row filter. INSERT and UPDATE
use the new row version to evaluate the row filter, hence, you can use any
column. If the row filter evaluates to NULL, it returns false. For simplicity,
functions are not allowed; it could possibly be addressed in a future patch.

If you choose to do the initial table synchronization, only data that satisfies
the row filters is sent. If the subscription has several publications in which
a table has been published with different WHERE clauses, rows must satisfy all
expressions to be copied. If subscriber is a pre-15 version, data
synchronization won't use row filters if they are defined in the publisher.
Previous versions cannot handle row filters.

If your publication contains a partitioned table, the publication parameter
publish_via_partition_root determines if it uses the partition row filter (if
the parameter is false, the default) or the root partitioned table row filter.
---
 doc/src/sgml/catalogs.sgml  |   8 +
 doc/src/sgml/ref/alter_publication.sgml |  11 +-
 doc/src/sgml/ref/create_publication.sgml|  33 ++-
 doc/src/sgml/ref/create_subscription.sgml   |  10 +-
 src/backend/catalog/pg_publication.c|  47 ++-
 src/backend/commands/publicationcmds.c  | 101 ---
 src/backend/nodes/copyfuncs.c   |  14 +
 src/backend/nodes/equalfuncs.c  |  12 +
 src/backend/parser/gram.y   |  24 +-
 src/backend/parser/parse_agg.c  |  10 +
 src/backend/parser/parse_expr.c |  21 +-
 src/backend/parser/parse_func.c |   3 +
 src/backend/parser/parse_oper.c |   7 +
 src/backend/replication/logical/tablesync.c |  95 ++-
 src/backend/replication/pgoutput/pgoutput.c | 257 -
 src/bin/pg_dump/pg_dump.c   |  24 +-
 src/bin/pg_dump/pg_dump.h   |   1 +
 src/bin/psql/describe.c |  15 +-
 src/include

Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Amit Kapila
On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar  wrote:
>

The latest patch looks good to me. I have made some changes in the
comments, see attached. I am planning to push this tomorrow unless you
or others have any comments on it.

-- 
With Regards,
Amit Kapila.


v8-0001-Optimize-fileset-usage-in-apply-worker.patch
Description: Binary data


Re: Update of partition key on foreign server

2021-09-01 Thread Etsuro Fujita
On Thu, Aug 26, 2021 at 11:10 PM Ilya Gladyshev
 wrote:
> I have developed a simple patch to fix this; while I'm not fully satisfied
> with it, it gets the job done.

Thanks for working on this!

> In message [1] there’s also mentioned possibility of existence of triggers on 
> the foreign server, and transforming the update to delete+insert will cause 
> these triggers to be omitted.

Yeah, I still think so.

> While it is true, I feel like partition pruning already has a similar effect,
> as it makes it possible to skip scanning foreign partitions.

I don’t fully understand this part.  Could you elaborate a bit more on it?

> The only way to both fix the update and have the triggers work, that I came 
> up with, is to use parent partitioned table as a target for the foreign 
> update (FDW request would then be "UPDATE public.players …"), while this is 
> possible, it requires the foreign server to have the same partitioned table, 
> which seems quite restrictive to me.

Yeah, that would be too restrictive.

Best regards,
Etsuro Fujita




Re: Support reset of Shared objects statistics in "pg_stat_reset" function

2021-09-01 Thread Fujii Masao



On 2021/08/24 13:07, Mahendra Singh Thalor wrote:

Thanks Sadhu for the updated patch.

Patch looks good to me and I don't have any more comments.

I marked this patch as 'ready for committer'.
https://commitfest.postgresql.org/34/3282/ 



Attached is the updated version of the patch. In this patch, I updated
the description for pg_stat_reset_single_table_counters() in pg_proc.dat.
Barring any objection, I will commit this patch.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 74a58a916c..2281ba120f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5097,7 +5097,7 @@ SELECT pid, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event i


 Resets statistics for a single table or index in the current database
-to zero.
+or shared across all databases in the cluster to zero.


 This function is restricted to superusers by default, but other users
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index 4a280897b1..3450a10129 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -38,6 +38,7 @@
 #include "access/transam.h"
 #include "access/twophase_rmgr.h"
 #include "access/xact.h"
+#include "catalog/catalog.h"
 #include "catalog/pg_database.h"
 #include "catalog/pg_proc.h"
 #include "common/ip.h"
@@ -5140,7 +5141,8 @@ pgstat_recv_resetsharedcounter(PgStat_MsgResetsharedcounter *msg, int len)
 /* --
  * pgstat_recv_resetsinglecounter() -
  *
- * Reset a statistics for a single object
+ * Reset a statistics for a single object, which may be of current
+ * database or shared across all databases in the cluster.
  * --
  */
 static void
@@ -5148,7 +5150,10 @@ pgstat_recv_resetsinglecounter(PgStat_MsgResetsinglecounter *msg, int len)
 {
PgStat_StatDBEntry *dbentry;
 
-   dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
+   if (IsSharedRelation(msg->m_objectid))
+   dbentry = pgstat_get_db_entry(InvalidOid, false);
+   else
+   dbentry = pgstat_get_db_entry(msg->m_databaseid, false);
 
if (!dbentry)
return;
diff --git a/src/include/catalog/pg_proc.dat b/src/include/catalog/pg_proc.dat
index 4f170eaf48..1a750a49ca 100644
--- a/src/include/catalog/pg_proc.dat
+++ b/src/include/catalog/pg_proc.dat
@@ -5754,7 +5754,7 @@
   proname => 'pg_stat_reset_shared', provolatile => 'v', prorettype => 'void',
   proargtypes => 'text', prosrc => 'pg_stat_reset_shared' },
 { oid => '3776',
-  descr => 'statistics: reset collected statistics for a single table or index in the current database',
+  descr => 'statistics: reset collected statistics for a single table or index in the current database or shared across all databases in the cluster',
   proname => 'pg_stat_reset_single_table_counters', provolatile => 'v',
   prorettype => 'void', proargtypes => 'oid',
   prosrc => 'pg_stat_reset_single_table_counters' },
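With the patch above applied, the existing function would also accept shared catalogs, e.g. (sketch):

```sql
-- Previously this only worked for relations of the current database;
-- pg_database is shared across all databases in the cluster.
SELECT pg_stat_reset_single_table_counters('pg_database'::regclass);
```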


Re: Separate out FileSet from SharedFileSet (was Re: pgsql: pgstat: Bring up pgstat in BaseInit() to fix uninitialized use o)

2021-09-01 Thread Dilip Kumar
On Wed, Sep 1, 2021 at 5:23 PM Amit Kapila  wrote:

> On Wed, Sep 1, 2021 at 1:53 PM Dilip Kumar  wrote:
> >
>
> The latest patch looks good to me. I have made some changes in the
> comments, see attached. I am planning to push this tomorrow unless you
> or others have any comments on it.
>

These comments changes look good to me.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: support for MERGE

2021-09-01 Thread Alvaro Herrera
On 2021-Sep-01, Daniel Gustafsson wrote:

> > On 21 Mar 2021, at 14:57, Alvaro Herrera  wrote:
> > On 2021-Mar-21, Magnus Hagander wrote:
> 
> >> But what is it we *want* it to do? That is, what should be the target?
> >> Is it 2021-07 with status WoA?
> > 
> > Yeah, that sounds like it, thanks.
> 
> This patch fails to apply, will there be a new version for this CF?

No, sorry, there won't.  I think it's better to close it as RfW at this
point, I'll create a new one when I have it.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: Add jsonlog log_destination for JSON server logs

2021-09-01 Thread Sehrope Sarkuni
On Tue, Aug 31, 2021 at 8:43 PM Michael Paquier  wrote:

> On Tue, Aug 31, 2021 at 11:34:56AM -0400, Sehrope Sarkuni wrote:
> > The second commit adds a TAP test for log_destination "csvlog". This was
> > done to both confirm that the previous change didn't break anything and
> as
> > a skeleton for the test in the next commit.
>
> +note "Before sleep";
> +usleep(100_000);
> +note "Before rotate";
> +$node->logrotate();
> +note "After rotate";
> +usleep(100_000);
>
> Do you really need a rotation of the log files here?  Wouldn't it be
> better to grab the position of the current log file with a fixed log
> file name, and then slurp the file from this position with your
> expected output?  That would make the test faster, as well.
>

Yes, that was intentional. I used the log rotation TAP test as a base and
kept that piece in there so it verifies that the csv log files are actually
rotated. Same for the json logs.


> Rather than making elog.c larger, I think that we should try to split
> that into more files.  Why not refactoring out the CSV part first?
> You could just call that csvlog.c, then create a new jsonlog.c for the
> meat of the patch.
>

That's a good idea. I'll try that out.

> The list of fields is not up to date.  At quick glance, you are
> missing:
> - backend type.
> - leader PID.
> - query ID.
> - Session start timestamp (?)
>

Thanks. I'll take a look.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/


Re: row filtering for logical replication

2021-09-01 Thread Amit Kapila
On Wed, Sep 1, 2021 at 4:53 PM Euler Taveira  wrote:
>
> On Sun, Aug 29, 2021, at 11:14 PM, Peter Smith wrote:
>
> Here are the new v26* patches. This is a refactoring of the row-filter
> caches to remove all the logic from the get_rel_sync_entry function
> and delay it until if/when needed in the pgoutput_row_filter function.
> This is now implemented per Amit's suggestion to move all the cache
> code [1]. It is a replacement for the v25* patches.
>
> The make check and TAP subscription tests are all OK. I have repeated
> the performance tests [2] and those results are good too.
>
> v26-0001 <--- v23 (base RF patch)
> v26-0002 <--- ExprState cache mods (refactored row filter caching)
> v26-0002 <--- ExprState cache extra debug logging (temp)
>
> Peter, I'm still reviewing this new cache mechanism. I will provide a feedback
> as soon as I integrate it as part of this recent modification.
>
> I'm attaching a new version that simply including Houzj review [1]. This is
> based on v23.
>
> There has been a discussion about which row should be used by the row filter.
> We don't have a unanimous choice, so I think it is prudent to provide a way
> for the user to change it. I suggested in a previous email [2] that a
> publication option should be added. Hence, the row filter can be applied to
> the old tuple, the new tuple, or both. This approach is simpler than using
> OLD/NEW references (less code, and it avoids validation such as a NEW
> reference for DELETEs or an OLD reference for INSERTs). I thought about a
> reasonable default value, and it seems the _new_ tuple is a good one because
> (i) it is always available and (ii) the user doesn't have to figure out that
> replication is broken due to a column that is not part of the replica
> identity.
>

I think this or any other similar solution for row filters (on
updates) won't work till we solve the problem reported by Hou-San [1].
The main reason is that we don't have data for unchanged toast columns
in WAL. For that, we have discussed some probable solutions in email
[2], however, that also required us to solve one of the existing
bugs[3].

[1] - 
https://www.postgresql.org/message-id/OS0PR01MB571618736E7E79309A723BBE94E99%40OS0PR01MB5716.jpnprd01.prod.outlook.com
[2] - 
https://www.postgresql.org/message-id/CAA4eK1JLQqNZypOpN7h3%3DVt0JJW4Yb_FsLJS%3DT8J9J-WXgFMYg%40mail.gmail.com
[3] - 
https://www.postgresql.org/message-id/os0pr01mb611342d0a92d4f4bf26c0f47fb...@os0pr01mb6113.jpnprd01.prod.outlook.com

-- 
With Regards,
Amit Kapila.




Re: support for MERGE

2021-09-01 Thread Daniel Gustafsson
> On 1 Sep 2021, at 14:25, Alvaro Herrera  wrote:
> 
> On 2021-Sep-01, Daniel Gustafsson wrote:
> 
>>> On 21 Mar 2021, at 14:57, Alvaro Herrera  wrote:
>>> On 2021-Mar-21, Magnus Hagander wrote:
>> 
 But what is it we *want* it to do? That is, what should be the target?
 Is it 2021-07 with status WoA?
>>> 
>>> Yeah, that sounds like it, thanks.
>> 
>> This patch fails to apply, will there be a new version for this CF?
> 
> No, sorry, there won't.  I think it's better to close it as RfW at this
> point, I'll create a new one when I have it.

No worries, I did that now.

--
Daniel Gustafsson   https://vmware.com/





Re: psql tab auto-complete for CREATE PUBLICATION

2021-09-01 Thread Fujii Masao




On 2021/07/17 2:19, vignesh C wrote:

Thanks for the patch, the changes look good to me.


The patch looks good to me, too. Pushed. Thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




2021-09 Commitfest

2021-09-01 Thread Daniel Gustafsson
It is now 2021-09-01 Anywhere On Earth so I’ve set the September commitfest to
In Progress and opened the November one for new entries.  Jaime Casanova has
volunteered for CFM [0], so let’s help him close the 284 still open items in
the queue.

--
Daniel Gustafsson   https://vmware.com/

[0] https://postgr.es/m/20210826231608.GA7242@ahch-to





Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0

2021-09-01 Thread Tom Lane
[ redirecting to -hackers ]

Mario Emmenlauer  writes:
> On 05.10.20 14:35, Tom Lane wrote:
>> Mario Emmenlauer  writes:
>>> I get reproducibly the error:
>>> 2020-10-05 11:48:19.720 CEST [84731] WARNING:  dup(0) failed after 0 
>>> successes: Bad file descriptor

>> Hmph.  That code loop assumes that stdin exists to be duplicated,
>> but maybe if it had been closed, you'd get this error.
>> 
>> However, that logic hasn't changed in decades, and we've not heard
>> complaints about it before.  Are you starting the server in some
>> unusual way?

> Replying to a very old thread here: I could indeed trace the problem of
> the failing `dup(0)` back to how we start the server! We start the
> server from an executable that closes stdin very early on, and this
> seems to lead to the problem.
> We solved it by not closing stdin. But for future reference, if other
> people may be affected by this, the code could probably be modified to
> revert to stdout or stderr or a temporary file in case stdin is not
> available (guessing here...).

Hm.  I'm tempted to propose that we simply change that from dup(0) to
dup(2).  Formally, that's just moving the problem around.  In practice
though, there are good reasons not to close the server's stderr, ie you
will lose error messages that might be important.  OTOH there does not
seem to be any obvious reason why the server should need valid stdin,
so if we can get rid of an implementation artifact that makes it require
that, that seems like an improvement.

regards, tom lane




Re: Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-09-01 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 9 Mar 2021, at 20:30, Joel Jacobson  wrote:
>> Attached is a patch implementing it this way.

> This patch no longer applies, can you please submit a rebased version?

Also, since 642433707 ("This patch adds new functions regexp_count(),
regexp_instr(), regexp_like(), and regexp_substr(), and extends
regexp_replace() with some new optional arguments") is already in,
we need to think about how this interacts with that.  Do we even
still need any more functionality in this area?  Should we try to
align the APIs?

Those new function APIs have some Oracle-isms that I don't especially
care for, like use of int for what should be a boolean.  Still, users
aren't going to give us a pass for wildly inconsistent APIs just because
some functions came from Oracle and some didn't.
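For reference, the functions added by 642433707 behave along these lines (illustrative examples of the committed API):

```sql
SELECT regexp_count('ABC abc', 'abc', 1, 'i');  -- 2 (case-insensitive count)
SELECT regexp_instr('hello world', 'o');        -- 5 (position of first match)
SELECT regexp_like('hello', '^h.*o$');          -- true
SELECT regexp_substr('a1b2c3', '[0-9]+');       -- '1'
```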

regards, tom lane




Re: Add statistics refresh materialized view

2021-09-01 Thread Fujii Masao




On 2021/07/09 1:39, Seino Yuki wrote:

Hi.

This is a proposal for a new feature in statistics collector.
I think we need to add statistics about refresh matview to pg_stat_all_tables 
view.


Why do you want to treat only REFRESH MATERIALIZED VIEW command special?
What about other utility commands like TRUNCATE, CLUSTER, etc?

It's not good design to add new columns per utility command into
pg_stat_all_tables. Otherwise pg_stat_all_tables will have to have lots of
columns to expose the stats of many utility commands at last. Which is
ugly and very user-unfriendly.

Most entries in pg_stat_all_tables are basically for tables. So the columns
about REFRESH MATERIALIZED VIEW are useless for those most entries.
This is another reason why I think the design is not good.




When the "REFRESH MATERIALIZED VIEW" was executed, the number of times it was
executed and the date and time of execution were not recorded anywhere.


pg_stat_statements and log_statement would help?




"pg_stat_statements" can be used to get the number of executions and the date 
and time of execution,
but this information is statement-based, not view-based.


pg_stat_statements reports different records for REFRESH MATERIALIZED VIEW
commands on different views. So ISTM that we can aggregate the information
per view, from pg_stat_statements. No?
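A sketch of the kind of per-view aggregation being suggested (assuming pg_stat_statements is installed; column names as of recent releases):

```sql
SELECT query, calls, total_exec_time
FROM pg_stat_statements
WHERE query ILIKE 'REFRESH MATERIALIZED VIEW%'
ORDER BY calls DESC;
```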

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: 2021-09 Commitfest

2021-09-01 Thread Jaime Casanova
On Wed, Sep 01, 2021 at 03:10:32PM +0200, Daniel Gustafsson wrote:
> It is now 2021-09-01 Anywhere On Earth so I’ve set the September commitfest to
> In Progress and opened the November one for new entries.  Jaime Casanova has
> volunteered for CFM [0], so let’s help him close the 284 still open items in
> the queue.
> 

Thank you Daniel for editing the commitfest entries, that's something I
cannot do.

And you're right, we have 284 patches in the queue (excluding committed, 
returned with feedback, withdrawn and rejected)... 18 of them for more than
10 commitfests!

Needs review: 192. 
Waiting on Author: 68. 
Ready for Committer: 24

If you have a patch in this commitfest, please check in
http://commitfest.cputube.org/ if your patch still applies and passes
tests. 

Thanks to all of you for your great work!

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: AIX: Symbols are missing in libpq.a

2021-09-01 Thread Tony Reix
A new patch has been provided that uses the exports.txt file instead of
building the list of symbols to be exported and merging the two files. This
seems much better.

Re: row filtering for logical replication

2021-09-01 Thread Euler Taveira
On Wed, Sep 1, 2021, at 9:36 AM, Amit Kapila wrote:
> I think this or any other similar solution for row filters (on
> updates) won't work till we solve the problem reported by Hou-San [1].
> The main reason is that we don't have data for unchanged toast columns
> in WAL. For that, we have discussed some probable solutions in email
> [2], however, that also required us to solve one of the existing
> bugs[3].
I didn't mention it, but I'm working on that in parallel.

I agree with you that including TOAST values in the WAL is a possible solution
for this issue. This is a popular request for wal2json [1][2][3] and I think
other output plugins have the same request too. It is useful for CDC solutions.

I'm experimenting with two approaches: (i) always include unchanged TOAST
values in the new tuple if a GUC is set, and (ii) include unchanged TOAST
values in the new tuple iff they weren't included in the old tuple. The
advantage of the first option is that you fix the problem by adjusting a
parameter in your configuration file. However, the disadvantage is that,
depending on your setup (REPLICA IDENTITY FULL), you might have the same TOAST
value for a single change twice in the WAL. The second option avoids the
disadvantage of (i), but it only works if you have REPLICA IDENTITY FULL and
Dilip's patch applied [4] (I expect to review it soon). In the output plugin,
(i) requires a simple modification (removing the restriction on unchanged
TOAST values) but (ii) needs more complex surgery.

[1] https://github.com/eulerto/wal2json/issues/205
[2] https://github.com/eulerto/wal2json/issues/132
[3] https://github.com/eulerto/wal2json/issues/42
[4] 
https://www.postgresql.org/message-id/CAFiTN-uW50w0tWoUBg_VYCdvNeCzT%3Dt%3DJzhmiFd452FrLOwMMQ%40mail.gmail.com


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Allow escape in application_name (was: [postgres_fdw] add local pid to fallback_application_name)

2021-09-01 Thread Fujii Masao




On 2021/09/01 19:04, kuroda.hay...@fujitsu.com wrote:

OK, I split and attached them like that. 0001 adds the new GUC, and
0002 makes it accept escapes.


Thanks for splitting and updating the patches!

Here are the comments for 0001 patch.

-   /* Use "postgres_fdw" as fallback_application_name. */
+   /* Use GUC paramter if set */
+   if (pgfdw_application_name && *pgfdw_application_name != '\0')

This GUC parameter should be set after the options of the foreign server
are set so that its setting can override the server-level ones.
Isn't it better to add a comment about this?


+static bool
+check_pgfdw_application_name(char **newval, void **extra, GucSource source)
+{
+   /* Only allow clean ASCII chars in the application name */
+   if (*newval)
+   pg_clean_ascii(*newval);
+   return true;

Do we really need this check hook? Even without that, any non-ASCII characters
in application_name would be replaced with "?" in the foreign PostgreSQL server
when it's passed to that.

On the other hand, if it's really necessary, application_name set in foreign
server object also needs to be processed in the same way.


+  NULL,
+  PGC_USERSET,
+  GUC_IS_NAME,

Why is GUC_IS_NAME flag necessary?


postgres_fdw.application_name overrides the application_name set in the
foreign server object.
Shouldn't this information be documented?


Isn't it better to have the regression test for this feature?
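To illustrate the override behavior discussed above (a sketch; the server and names are illustrative, and the GUC comes from the patch under review):

```sql
-- server-level option
ALTER SERVER loopback OPTIONS (ADD application_name 'from_server_option');

-- proposed GUC; when set and non-empty, it would take precedence over the
-- server-level option
SET postgres_fdw.application_name TO 'from_guc';
```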

Regards,


--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-01 Thread Jacob Champion
On Tue, 2021-08-31 at 19:39 +, Jacob Champion wrote:
> Hello,
> 
> There was a brief discussion [1] back in February on allowing user
> mapping for LDAP, in order to open up some more complex authorization
> logic (and slightly reduce the need for LDAP-to-Postgres user
> synchronization). Attached is an implementation of this that separates
> the LDAP authentication and authorization identities, and lets the
> client control the former with an `ldapuser` connection option or its
> associated PGLDAPUSER envvar.

The cfbot found a failure in postgres_fdw, which I completely neglected
in my design. I think the desired functionality should be to allow the
ldapuser connection option during CREATE USER MAPPING but not CREATE
SERVER. I'll have a v2 up today to fix that.
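For reference, the configuration the patch aims to enable would look roughly like this (hypothetical host and map names; `map=` for the ldap method is the new part):

```
# pg_hba.conf: authenticate against LDAP, then map the LDAP identity
# to a PostgreSQL role via the usermap
host  all  all  0.0.0.0/0  ldap  ldapserver=ldap.example.com ldapbasedn="dc=example,dc=com" map=ldapmap

# pg_ident.conf
# MAPNAME   SYSTEM-USERNAME          PG-USERNAME
ldapmap     /^(.*)@example\.com$     \1
```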

--Jacob


Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0

2021-09-01 Thread Mario Emmenlauer
On 01.09.21 15:54, Tom Lane wrote:
> Mario Emmenlauer  writes:
>> On 05.10.20 14:35, Tom Lane wrote:
>>> Mario Emmenlauer  writes:
 I get reproducibly the error:
 2020-10-05 11:48:19.720 CEST [84731] WARNING:  dup(0) failed after 0 
 successes: Bad file descriptor
> 
>>> Hmph.  That code loop assumes that stdin exists to be duplicated,
>>> but maybe if it had been closed, you'd get this error.
> 
>> Replying to a very old thread here: I could indeed trace the problem of
>> the failing `dup(0)` back to how we start the server! We start the
>> server from an executable that closes stdin very early on, and this
>> seems to lead to the problem.
> 
> Hm.  I'm tempted to propose that we simply change that from dup(0) to
> dup(2).  Formally, that's just moving the problem around.  In practice
> though, there are good reasons not to close the server's stderr, ie you
> will lose error messages that might be important.  OTOH there does not
> seem to be any obvious reason why the server should need valid stdin,
> so if we can get rid of an implementation artifact that makes it require
> that, that seems like an improvement.

The idea to switch to dup(2) sounds very good to me. Also, while at it,
maybe the error message could be improved? The kids nowadays don't learn
so much about system I/O any more, and if someone does not know `dup()`,
then the error message is not very telling. It took me a while to
understand what the code was supposed to do. So it may be helpful to add
to the error message something like "possibly the stderr stream is
closed; this is not supported". What do you think?

All the best,

Mario Emmenlauer




Re: dup(0) fails on Ubuntu 20.04 and macOS 10.15 with 13.0

2021-09-01 Thread Tom Lane
Mario Emmenlauer  writes:
> The idea to switch to dup(2) sounds very good to me. Also, while at it,
> maybe the error message could be improved? The kids nowadays don't learn
> so much about system I/O any more, and if someone does not know `dup()`,
> then the error message is not very telling. It took me a while to
> understand what the code was supposed to do. So it may be helpful to add
> to the error message something like "possibly the stderr stream is
> closed; this is not supported". What do you think?

Meh ... it's been like that for ~20 years and you're the first one
to complain, so I'm not inclined to make our translators spend effort
on a HINT message.  However, we could reword it to the extent of, say,

elog(WARNING, "duplicating stderr failed after %d successes: %m", used);

which at least reduces the jargon level to something that Unix users
should have heard of.
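For readers unfamiliar with the trick being discussed: the server reserves spare file descriptors by repeatedly dup()ing an already-open one. A minimal Python sketch of the idea (the helper name and structure are illustrative only, not the actual C code in the server):

```python
import os

# Reserve `count` spare file descriptors by duplicating `source_fd`.
# PostgreSQL's C loop dup()s fd 0 (stdin); the proposal above is to
# dup() fd 2 (stderr) instead, since a parent process is far less
# likely to have closed stderr than stdin.
def reserve_fds(source_fd, count):
    reserved = []
    try:
        for _ in range(count):
            reserved.append(os.dup(source_fd))
    except OSError:
        # mirrors the server's "dup(N) failed after M successes" warning
        print("dup(%d) failed after %d successes" % (source_fd, len(reserved)))
    return reserved

fds = reserve_fds(2, 3)   # duplicate stderr three times
for fd in fds:
    os.close(fd)
```

If `source_fd` has been closed, `os.dup()` raises `OSError` (EBADF), which is exactly the failure mode reported at the start of this thread.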

regards, tom lane




Re: WIP: System Versioned Temporal Table

2021-09-01 Thread Jaime Casanova
Hi,

This doesn't pass tests because of a missing file. Can we fix that
please and send the patch again?

On Tue, Aug 10, 2021 at 7:20 AM Simon Riggs 
wrote:

> On Wed, 14 Jul 2021 at 12:48, vignesh C  wrote:
>
> > The patch does not apply on Head anymore, could you rebase and post a
> > patch. I'm changing the status to "Waiting for Author".
>
> OK, so I've rebased the patch against current master to take it to v15.
>
> I've then worked on the patch some myself to make v16 (attached),
> adding these things:
>
> * Add code, docs and test to remove the potential anomaly where
> endtime < starttime, using the sqlstate 2201H as pointed out by Vik
> * Add code and tests to handle multiple changes in a transaction
> correctly, according to SQL Std
> * Add code and tests to make Foreign Keys behave correctly, according to
> SQL Std
> * Fixed nascent bug in relcache setup code
> * Various small fixes from Japin's review - thanks! I've used
> starttime and endtime as default column names
> * Additional tests and docs to show that the functionality works with
> or without PKs on the table
>
> I am now satisfied that the patch does not have any implementation
> anomalies in behavioral design, but it is still a long way short in
> code architecture.
>
> There are various aspects still needing work. This is not yet ready
> for Commit, but it is appropriate now to ask for initial design
> guidance on architecture and code placement by a Committer, so I am
> setting this to Ready For Committer, in the hope that we get the
> review in SeptCF and a later version can be submitted for later commit
> in JanCF. With the right input, this patch is about a person-month
> away from being ready, assuming we don't hit any blocking issues.
>
> Major Known Issues
> * SQLStd says that it should not be possible to update historical
> rows, but those tests show we fail to prevent that and there is code
> marked NOT_USED in those areas
> * The code is structured poorly around
> parse-analyze/rewriter/optimizer/executor and that needs positive
> design recommendations, rather than critical review
> * Joins currently fail because of the botched way WHERE clauses are
> added, resulting in duplicate names
> * Views probably don't work, but there are no tests
> * CREATE TABLE (LIKE foo) doesn't correctly copy across all features -
> test for that added, with test failure accepted for now
> * ALTER TABLE is still incomplete and also broken; I suggest we remove
> that for the first version of the patch to reduce patch size for an
> initial commit.
>
> Minor Known Issues
> * Logical replication needs some minor work, no tests yet
> * pg_dump support looks like it exists and might work easily, but
> there are no tests yet
> * Correlation names don't work in FROM clause - shift/reduce errors
> from double use of AS
> * Add test and code to prevent triggers referencing period cols in the
> WHEN clause
> * No tests yet to prove you can't set various parameters/settings on
> the period time start/end cols
> * Code needs some cleanup in a few places
> * Not really sure what value is added by
> lock-update-delete-system-versioned.spec
>
> * IMHO we should make the PK definition use "endtime DESC", so that
> the current version is always the first row found in the PK for any
> key, since historical indexes will grow bigger over time
>
> There are no expected issues with integration with MERGE, since SQLStd
> explains how to handle that.
>
> Other reviews are welcome.
>
> --
> Simon Riggs    http://www.EnterpriseDB.com/
>


--


Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-09-01 Thread Jacob Champion
On Fri, 2021-08-27 at 15:02 +0900, Michael Paquier wrote:
> On Fri, Aug 13, 2021 at 12:08:16AM +, Jacob Champion wrote:
> > (Things would get hairier if someone included the SSL Makefile
> > somewhere else, but I don't see anyone doing that now and I don't know
> > why someone would.)
> 
> That would not happen.  Hopefully.

:)

> FWIW, I'll try to spend a
> couple of hours on what you had upthread in 0002 for the
> simplification of SSL stuff generation and see if I can come up with
> something.

Thanks! The two-patch v3 no longer applies so I've attached a v4 to
make the cfbot happy.

--Jacob
From 6e038173717798dcd375e3741862152979c414fb Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Fri, 26 Feb 2021 15:25:28 -0800
Subject: [PATCH v4] test/ssl: rework the sslfiles Makefile target

Adding a new certificate is as easy as dropping in a new .config file,
adding it to the top of the Makefile, and running `make sslfiles`.

Improvements:
- Interfile dependencies should be fixed, with the exception of the CRL
  dirs.
- New certificates have serial numbers based on the current time,
  reducing the chance of collision.
- The CA index state is created on demand and cleaned up automatically
  at the end of the Make run.
- *.config files are now self-contained; one certificate needs one
  config file instead of two.
- Duplication is reduced, and along with it some unneeded code (and
  possible copy-paste errors).
---
 src/Makefile.global.in|   5 +-
 src/test/ssl/Makefile | 258 +++---
 src/test/ssl/cas.config   |  10 +-
 src/test/ssl/client-dn.config |   1 -
 src/test/ssl/client-revoked.config|  13 +
 src/test/ssl/client.config|   1 -
 src/test/ssl/client_ca.config |   5 +
 src/test/ssl/root_ca.config   |   1 +
 src/test/ssl/server-cn-only.config|   3 +-
 src/test/ssl/server-no-names.config   |   5 +-
 src/test/ssl/server-revoked.config|   3 +-
 src/test/ssl/server_ca.config |   5 +
 src/test/ssl/ssl/both-cas-1.crt   |  86 +++---
 src/test/ssl/ssl/both-cas-2.crt   |  86 +++---
 src/test/ssl/ssl/client+client_ca.crt |  65 ++---
 src/test/ssl/ssl/client-crldir/9bb9e3c3.r0|  18 +-
 src/test/ssl/ssl/client-dn.crt|  34 +--
 src/test/ssl/ssl/client-revoked.crt   |  31 ++-
 src/test/ssl/ssl/client.crl   |  18 +-
 src/test/ssl/ssl/client.crt   |  31 ++-
 src/test/ssl/ssl/client_ca.crt|  34 +--
 .../ssl/ssl/root+client-crldir/9bb9e3c3.r0|  18 +-
 .../ssl/ssl/root+client-crldir/a3d11bff.r0|  16 +-
 src/test/ssl/ssl/root+client.crl  |  34 +--
 src/test/ssl/ssl/root+client_ca.crt   |  52 ++--
 .../ssl/ssl/root+server-crldir/a3d11bff.r0|  16 +-
 .../ssl/ssl/root+server-crldir/a836cc2d.r0|  18 +-
 src/test/ssl/ssl/root+server.crl  |  34 +--
 src/test/ssl/ssl/root+server_ca.crt   |  52 ++--
 src/test/ssl/ssl/root.crl |  16 +-
 src/test/ssl/ssl/root_ca.crt  |  18 +-
 src/test/ssl/ssl/server-cn-and-alt-names.crt  |  36 +--
 src/test/ssl/ssl/server-cn-only.crt   |  33 +--
 src/test/ssl/ssl/server-crldir/a836cc2d.r0|  18 +-
 .../ssl/ssl/server-multiple-alt-names.crt |  36 +--
 src/test/ssl/ssl/server-no-names.crt  |  32 +--
 src/test/ssl/ssl/server-revoked.crt   |  33 +--
 src/test/ssl/ssl/server-single-alt-name.crt   |  34 +--
 src/test/ssl/ssl/server.crl   |  18 +-
 src/test/ssl/ssl/server_ca.crt|  34 +--
 src/test/ssl/t/001_ssltests.pl|  17 +-
 41 files changed, 681 insertions(+), 597 deletions(-)
 create mode 100644 src/test/ssl/client-revoked.config

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 6e2f224cc4..cf349d19e4 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -32,8 +32,11 @@ all:
 # started to update the file.
 .DELETE_ON_ERROR:
 
-# Never delete any intermediate files automatically.
+# Never delete any intermediate files automatically unless the Makefile
+# explicitly asks for it.
+ifneq ($(clean_intermediates),yes)
 .SECONDARY:
+endif
 
 # PostgreSQL version number
 VERSION = @PACKAGE_VERSION@
diff --git a/src/test/ssl/Makefile b/src/test/ssl/Makefile
index c216560dcc..458520dc6d 100644
--- a/src/test/ssl/Makefile
+++ b/src/test/ssl/Makefile
@@ -11,168 +11,216 @@
 
 subdir = src/test/ssl
 top_builddir = ../../..
+
+clean_intermediates := yes
 include $(top_builddir)/src/Makefile.global
 
 export with_ssl
 
-CERTIFICATES := server_ca server-cn-and-alt-names \
+#
+# To add a new server or client certificate, add a new .config file in
+# this directory, then add  to either SERVERS or CLIENTS below. A
+# key/certificate pair will be generated for you, signed by the appropriate CA.
+#
+SERVERS := ser

Is it safe to use the extended protocol with COPY?

2021-09-01 Thread Daniele Varrazzo
Hello,

in psycopg 3 we are currently using PQexecParams - although with no
params - to send COPY commands. The reason is mostly to prevent people
from sending COPY together with other statements, especially if other
operations are chained after COPY: we would only notice them after the
copy is finished. Data changes might have been applied by then, so
throwing an exception seems impolite (the result might have been
applied already), but managing the result is awkward too.
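To illustrate the concern, here is a deliberately naive sketch (not psycopg's actual implementation) of a client-side check for statements chained after COPY. A real version would of course have to skip semicolons inside string literals, dollar-quoted strings, and comments:

```python
# Naive statement splitter: good enough to show why a driver might want
# to flag "COPY ...; something else" up front instead of discovering the
# trailing statements only after the copy has finished.
def statements(sql):
    return [s.strip() for s in sql.split(";") if s.strip()]

query = "COPY t FROM STDIN; DELETE FROM t"
parts = statements(query)
if len(parts) > 1 and parts[0].upper().startswith("COPY"):
    print("warning: statements chained after COPY:", parts[1:])
```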

Someone [1] has pointed out this conversation [2] which suggests that
COPY with extended protocol might break in the future.

[1] https://github.com/psycopg/psycopg/issues/78
[2] 
https://www.postgresql.org/message-id/flat/CAMsr%2BYGvp2wRx9pPSxaKFdaObxX8DzWse%2BOkWk2xpXSvT0rq-g%40mail.gmail.com#camsr+ygvp2wrx9ppsxakfdaobxx8dzwse+okwk2xpxsvt0r...@mail.gmail.com

As far as PostgreSQL is concerned, would it be better to stick to
PQexec with COPY, and let people who append statements afterwards
deal with the consequences themselves (the server applying the
changes, the client throwing an exception)?

Thank you very much

-- Daniele




Re: 2021-09 Commitfest

2021-09-01 Thread Magnus Hagander
On Wed, Sep 1, 2021 at 4:26 PM Jaime Casanova
 wrote:
>
> On Wed, Sep 01, 2021 at 03:10:32PM +0200, Daniel Gustafsson wrote:
> > It is now 2021-09-01 Anywhere On Earth so I’ve set the September commitfest to
> > In Progress and opened the November one for new entries.  Jaime Casanova has
> > volunteered for CFM [0], so let’s help him close the 284 still open items in
> > the queue.
> >
>
> Thank you Daniel for editing the commitfest entries, that's something I
> cannot do.

I've added cf admin permissions to you as well now.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: Is it safe to use the extended protocol with COPY?

2021-09-01 Thread Tom Lane
Daniele Varrazzo  writes:
> Someone [1] has pointed out this conversation [2] which suggests that
> COPY with extended protocol might break in the future.

As was pointed out in that same thread, the odds of us actually
breaking that case are nil.  I wouldn't recommend changing your
code on this basis.

regards, tom lane




Re: PoC/WIP: Extended statistics on expressions

2021-09-01 Thread Tomas Vondra


On 8/24/21 3:13 PM, Justin Pryzby wrote:

On Mon, Aug 16, 2021 at 05:41:57PM +0200, Tomas Vondra wrote:

This still seems odd:

postgres=# CREATE STATISTICS asf ON i FROM t;
ERROR:  extended statistics require at least 2 columns
postgres=# CREATE STATISTICS asf ON (i) FROM t;
CREATE STATISTICS

It seems wrong that the command works with added parens, but builds expression
stats on a simple column (which is redundant with what analyze does without
extended stats).


Well, yeah. But I think this is a behavior that was discussed somewhere in
this thread, and the agreement was that it's not worth the complexity, as
this comment explains

   * XXX We do only the bare minimum to separate simple attribute and
   * complex expressions - for example "(a)" will be treated as a complex
   * expression. No matter how elaborate the check is, there'll always be
   * a way around it, if the user is determined (consider e.g. "(a+0)"),
   * so it's not worth protecting against it.

Patch 0001 fixes the "double parens" issue discussed elsewhere in this
thread, and patch 0002 tweaks CREATE STATISTICS to treat "(a)" as a simple
column reference.


0002 refuses to create expressional stats on a simple column reference like
(a), which I think helps to avoid a user accidentally creating useless ext
stats objects (which are redundant with the table's column stats).

0002 does not attempt to refuse cases like (a+0), which I think is fine:
we don't try to reject useless cases if someone insists on it.
See 240971675, 701fd0bbc.

So I am +1 to apply both patches.

I added this as an Opened Item for increased visibility.



I've pushed both fixes, so the open item should be resolved.

However, while polishing the second patch, I realized we're allowing
statistics on expressions referencing system attributes. So this fails:


CREATE STATISTICS s ON ctid, x FROM t;

but this passes:

CREATE STATISTICS s ON (ctid::text), x FROM t;

IMO we should reject such expressions, just like we reject direct 
references to system attributes - patch attached.


Furthermore, I wonder if we should reject expressions without any Vars? 
This works now:


CREATE STATISTICS s ON (11::text) FROM t;

but it seems rather silly / useless, so maybe we should reject it.
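The shape of the check being proposed can be sketched as follows. This is only an illustration in Python; the `Var`/`Expr` classes are hypothetical stand-ins for PostgreSQL's node types, and the real walk is `pull_varattnos()` in the attached C patch:

```python
from dataclasses import dataclass, field

@dataclass
class Var:
    attnum: int               # <= 0 means a system attribute such as ctid

@dataclass
class Expr:
    args: list = field(default_factory=list)

# Collect every attribute number referenced anywhere in the expression
# tree (the role pull_varattnos() plays in the C code).
def pull_attnums(node, acc):
    if isinstance(node, Var):
        acc.add(node.attnum)
    else:
        for a in node.args:
            pull_attnums(a, acc)
    return acc

# Reject system attributes; optionally also reject expressions that
# reference no table column at all, per the question above.
def check_stats_expr(expr):
    attnums = pull_attnums(expr, set())
    if any(a <= 0 for a in attnums):
        raise ValueError("statistics creation on system columns is not supported")
    if not attnums:
        raise ValueError("statistics expression references no table column")

check_stats_expr(Expr([Var(2)]))     # fine: references an ordinary column
```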

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
>From a3ade67b8b12cdbfa585bf351ca5569599977b41 Mon Sep 17 00:00:00 2001
From: Tomas Vondra 
Date: Wed, 1 Sep 2021 17:28:21 +0200
Subject: [PATCH] Reject extended statistics referencing system attributes

Extended statistics are not allowed to reference system attributes, but
we only enforced that for simple columns. For expressions, it was
possible to define statistics on an expression referencing a system
attribute, e.g.

CREATE STATISTICS s ON (ctid::text) FROM t;

which seems strange. This adds a check rejecting expressions referencing
system attributes, just like we do for simple columns.

Backpatch to 14, where extended statistics on expressions were added.

Backpatch-through: 14
---
 src/backend/commands/statscmds.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 59369f8736..bf01840d8a 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -288,9 +288,24 @@ CreateStatistics(CreateStatsStmt *stmt)
 			Node	   *expr = selem->expr;
 			Oid			atttype;
 			TypeCacheEntry *type;
+			Bitmapset  *attnums = NULL;
+			int			k;
 
 			Assert(expr != NULL);
 
+			/* Disallow expressions referencing system attributes. */
+			pull_varattnos(expr, 1, &attnums);
+
+			k = -1;
+			while ((k = bms_next_member(attnums, k)) >= 0)
+			{
+				AttrNumber	attnum = k + FirstLowInvalidHeapAttributeNumber;
+				if (attnum <= 0)
+					ereport(ERROR,
+							(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+							 errmsg("statistics creation on system columns is not supported")));
+			}
+
 			/*
 			 * Disallow data types without a less-than operator.
 			 *
-- 
2.31.1



Re: 2021-09 Commitfest

2021-09-01 Thread Jaime Casanova
On Wed, Sep 01, 2021 at 06:32:38PM +0200, Magnus Hagander wrote:
> On Wed, Sep 1, 2021 at 4:26 PM Jaime Casanova
>  wrote:
> >
> > On Wed, Sep 01, 2021 at 03:10:32PM +0200, Daniel Gustafsson wrote:
> > > It is now 2021-09-01 Anywhere On Earth so I’ve set the September commitfest to
> > > In Progress and opened the November one for new entries.  Jaime Casanova has
> > > volunteered for CFM [0], so let’s help him close the 284 still open items in
> > > the queue.
> > >
> >
> > Thank you Daniel for editing the commitfest entries, that's something I
> > cannot do.
> 
> I've added cf admin permissions to you as well now.
> 

I have the power! mwahahaha!
eh! i mean, thanks Magnus ;)

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: prevent immature WAL streaming

2021-09-01 Thread Andres Freund
Hi,

On 2021-09-01 15:01:43 +0900, Fujii Masao wrote:
> Thanks for clarifying that! Unless I misunderstand that, when recovery ends
> at a partially-flushed segment-spanning record, it sets
> XLP_FIRST_IS_ABORTED_PARTIAL in the next segment before starting writing
> new WAL data there. Therefore XLP_FIRST_IS_CONTRECORD or
> XLP_FIRST_IS_ABORTED_PARTIAL must be set in the next segment following
> a partially-flushed segment-spanning record. If none of them is set,
> the validation code in recovery should report an error.

Right. With the small addition that I think we shouldn't just do this for
segment spanning records, but for all records spanning pages.

Greetings,

Andres Freund




Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread John Naylor
On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev 
wrote:
>
> By looking at timestamptz_bin() implementation I don't see why it
> should be STABLE. Its return value depends only on the input values.
> It doesn't look at the session parameters. timestamptz_in() and
> timestamptz_out() are STABLE, that's true, but this is no concern of
> timestamptz_bin().

I'm not quite willing to bet the answer couldn't change if the timezone
changes, but it's possible I'm the one missing something.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: Converting contrib SQL functions to new style

2021-09-01 Thread Tom Lane
Peter Eisentraut  writes:
> On 14.04.21 00:26, Tom Lane wrote:
>> Attached are some draft patches to convert almost all of the
>> contrib modules' SQL functions to use SQL-standard function bodies.

> This first patch is still the patch of record in CF 2021-09, but from 
> the subsequent discussion, it seems more work is being contemplated.

Yeah, it looks like we already did the unconditionally-safe part
(i.e. making initdb-created SQL functions use new style, cf 767982e36).

The rest of this is stuck pending investigation of the ideas about
making new-style function creation safer when the creation-time path
isn't secure, so I suppose we should mark it RWF rather than leaving
it in the queue.  Will go do that.

regards, tom lane




Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-01 Thread Jacob Champion
On Wed, 2021-09-01 at 15:42 +, Jacob Champion wrote:
> The cfbot found a failure in postgres_fdw, which I completely neglected
> in my design. I think the desired functionality should be to allow the
> ldapuser connection option during CREATE USER MAPPING but not CREATE
> SERVER.

Fixed in v2, attached.

--Jacob
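For readers unfamiliar with pg_ident-style user maps, the behavior this patch enables for LDAP can be sketched as follows. Purely illustrative: the real matching lives in the backend's HBA code, and the map format here is simplified to (regex, role) pairs:

```python
import re

# A map line pairs a regex over the system (here: LDAP) user name with
# the database role it may log in as; "\1" in the role refers to the
# regex's first capture group, as in pg_ident.conf.
def check_usermap(map_lines, system_user, pg_role):
    for pattern, role in map_lines:
        m = re.match(pattern, system_user)
        if not m:
            continue
        mapped = m.expand(role.replace("\\1", "\\g<1>")) if "\\1" in role else role
        if mapped == pg_role:
            return True
    return False

# strip the domain: alice@example.com may connect as role "alice"
ldapmap = [(r"^(.*)@example\.com$", "\\1")]
print(check_usermap(ldapmap, "alice@example.com", "alice"))   # True
```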
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e3ee30f1aa..fe47cff920 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -188,7 +188,7 @@ ALTER USER MAPPING FOR public SERVER testserver1
 ALTER USER MAPPING FOR public SERVER testserver1
 	OPTIONS (ADD sslmode 'require');
 ERROR:  invalid option "sslmode"
-HINT:  Valid options in this context are: user, password, sslpassword, password_required, sslcert, sslkey
+HINT:  Valid options in this context are: user, password, sslpassword, ldapuser, password_required, sslcert, sslkey
 -- But we can add valid ones fine
 ALTER USER MAPPING FOR public SERVER testserver1
 	OPTIONS (ADD sslpassword 'dummy');
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index c574ca2cf3..0d9a070dc3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -306,10 +306,12 @@ InitPgFdwOptions(void)
 		popt->keyword = lopt->keyword;
 
 		/*
-		 * "user" and any secret options are allowed only on user mappings.
-		 * Everything else is a server option.
+		 * "user", "ldapuser", and any secret options are allowed only on user
+		 * mappings.  Everything else is a server option.
 		 */
-		if (strcmp(lopt->keyword, "user") == 0 || strchr(lopt->dispchar, '*'))
+		if (strcmp(lopt->keyword, "user") == 0 ||
+			strcmp(lopt->keyword, "ldapuser") == 0 ||
+			strchr(lopt->dispchar, '*'))
 			popt->optcontext = UserMappingRelationId;
 		else
 			popt->optcontext = ForeignServerRelationId;
diff --git a/doc/src/sgml/postgres-fdw.sgml b/doc/src/sgml/postgres-fdw.sgml
index 0075bc3dbb..32a0d35e4f 100644
--- a/doc/src/sgml/postgres-fdw.sgml
+++ b/doc/src/sgml/postgres-fdw.sgml
@@ -120,7 +120,8 @@
 
  
   
-   user, password and sslpassword (specify these
+   user, ldapuser,
+   password and sslpassword (specify these
    in a user mapping, instead, or use a service file)
   
  
From 74e403a482ba4f16badf4dfc8171bb00c0064a37 Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Mon, 30 Aug 2021 16:29:59 -0700
Subject: [PATCH v2] Allow user name mapping with LDAP

Enable the `map` HBA option for the ldap auth method. To make effective
use of this from the client side, the ldapuser connection option (and a
corresponding environment variable, PGLDAPUSER) has been added; it
defaults to the PGUSER.

For more advanced mapping, the ldap_map_dn HBA option can be set to use
the full user Distinguished Name during user mapping. (This parallels
the include_realm=1 and clientname=DN options.)
---
 .../postgres_fdw/expected/postgres_fdw.out|   2 +-
 contrib/postgres_fdw/option.c |   8 +-
 doc/src/sgml/client-auth.sgml |  45 +-
 doc/src/sgml/libpq.sgml   |  32 
 doc/src/sgml/postgres-fdw.sgml|   3 +-
 src/backend/libpq/auth.c  |  38 +++--
 src/backend/libpq/hba.c   |  29 +++-
 src/backend/postmaster/postmaster.c   |   2 +
 src/include/libpq/hba.h   |   1 +
 src/include/libpq/libpq-be.h  |   5 +
 src/interfaces/libpq/fe-connect.c |   6 +
 src/interfaces/libpq/fe-protocol3.c   |   2 +
 src/interfaces/libpq/libpq-int.h  |   1 +
 src/test/ldap/t/001_auth.pl   | 145 +-
 14 files changed, 299 insertions(+), 20 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e3ee30f1aa..fe47cff920 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -188,7 +188,7 @@ ALTER USER MAPPING FOR public SERVER testserver1
 ALTER USER MAPPING FOR public SERVER testserver1
 	OPTIONS (ADD sslmode 'require');
 ERROR:  invalid option "sslmode"
-HINT:  Valid options in this context are: user, password, sslpassword, password_required, sslcert, sslkey
+HINT:  Valid options in this context are: user, password, sslpassword, ldapuser, password_required, sslcert, sslkey
 -- But we can add valid ones fine
 ALTER USER MAPPING FOR public SERVER testserver1
 	OPTIONS (ADD sslpassword 'dummy');
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index c574ca2cf3..0d9a070dc3 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib

Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread Tom Lane
John Naylor  writes:
> On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev 
> wrote:
>> By looking at timestamptz_bin() implementation I don't see why it
>> should be STABLE. Its return value depends only on the input values.
>> It doesn't look at the session parameters. timestamptz_in() and
>> timestamptz_out() are STABLE, that's true, but this is no concern of
>> timestamptz_bin().

> I'm not quite willing to bet the answer couldn't change if the timezone
> changes, but it's possible I'm the one missing something.

After playing with it for a while, it seems like the behavior is indeed
not TZ-dependent, but the real question is: should it be?
As an example,

regression=# set timezone to 'America/New_York';
SET
regression=# select date_bin('1 day', '2021-11-01 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
        date_bin
------------------------
 2021-10-31 00:00:00-04
(1 row)

regression=# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
        date_bin
------------------------
 2021-11-08 23:00:00-05
(1 row)

I see that these two answers are both exactly multiples of 24 hours away
from the given origin.  But if I'm binning on the basis of "days" or
larger units, I would sort of expect to get local midnight, and I'm not
getting that once I cross a DST boundary.

If this is indeed the behavior we want, I concur with Aleksander
that date_bin isn't TZ-sensitive and needn't be marked STABLE.
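The observed behavior is consistent with date_bin() doing plain integer bucketing in absolute time: floor-divide the distance from the origin by the stride, then add that many strides back to the origin. A back-of-the-envelope Python model of that assumption (not the server implementation) reproduces the second example above, including the 23:00 local result after the DST transition:

```python
from datetime import datetime, timedelta, timezone

# Assumed model: bucket the source timestamp by whole strides from the
# origin, entirely in absolute (UTC) time. Under this model the result
# is TZ-independent, so after a DST change it drifts off local midnight.
def date_bin(stride, source, origin):
    bins = (source - origin) // stride    # floor division on timedeltas
    return origin + bins * stride

origin = datetime(2021, 9, 1, 4, tzinfo=timezone.utc)    # 2021-09-01 00:00 -04
source = datetime(2021, 11, 10, tzinfo=timezone.utc)     # 2021-11-10 00:00 +00
print(date_bin(timedelta(days=1), source, origin))
# 2021-11-09 04:00:00+00:00, i.e. 2021-11-08 23:00 in America/New_York
```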

regards, tom lane




Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread Justin Pryzby
On Wed, Sep 01, 2021 at 01:26:26PM -0400, John Naylor wrote:
> On Wed, Sep 1, 2021 at 5:32 AM Aleksander Alekseev 
> wrote:
> >
> > By looking at timestamptz_bin() implementation I don't see why it
> > should be STABLE. Its return value depends only on the input values.
> > It doesn't look at the session parameters. timestamptz_in() and
> > timestamptz_out() are STABLE, that's true, but this is no concern of
> > timestamptz_bin().
> 
> I'm not quite willing to bet the answer couldn't change if the timezone
> changes, but it's possible I'm the one missing something.

ts=# SET timezone='-12';
ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01');
date_bin | 2021-07-01 00:00:00-12

ts=# SET timezone='+12';
ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01');
date_bin | 2021-07-02 00:00:00+12

-- 
Justin




Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread Tom Lane
Justin Pryzby  writes:
> On Wed, Sep 01, 2021 at 01:26:26PM -0400, John Naylor wrote:
>> I'm not quite willing to bet the answer couldn't change if the timezone
>> changes, but it's possible I'm the one missing something.

> ts=# SET timezone='-12';
> ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01');
> date_bin | 2021-07-01 00:00:00-12

> ts=# SET timezone='+12';
> ts=# SELECT date_bin('1hour', '2021-07-01 -1200', '2021-01-01');
> date_bin | 2021-07-02 00:00:00+12

Yeah, but those are the same timestamptz value.

Another problem with this example as written is that the origin
values being used are not the same in the two cases ... so I
think it's a bit accidental that the answers come out the same.

regards, tom lane




Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread John Naylor
On Wed, Sep 1, 2021 at 2:44 PM Tom Lane  wrote:

> regression=# set timezone to 'America/New_York';
> SET
> regression=# select date_bin('1 day', '2021-11-01 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
>         date_bin
> ------------------------
>  2021-10-31 00:00:00-04
> (1 row)
>
> regression=# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz, '2021-09-01 00:00 -04'::timestamptz);
>         date_bin
> ------------------------
>  2021-11-08 23:00:00-05
> (1 row)
>
> I see that these two answers are both exactly multiples of 24 hours away
> from the given origin.  But if I'm binning on the basis of "days" or
> larger units, I would sort of expect to get local midnight, and I'm not
> getting that once I cross a DST boundary.

Hmm, that seems like a reasonable expectation. I can get local midnight
if I recast to timestamp:

# select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp, '2021-09-01 00:00 -04'::timestamptz::timestamp);
      date_bin
---------------------
 2021-11-09 00:00:00
(1 row)

It's a bit unintuitive, though.

--
John Naylor
EDB: http://www.enterprisedb.com


Re: mark the timestamptz variant of date_bin() as stable

2021-09-01 Thread Tom Lane
John Naylor  writes:
> On Wed, Sep 1, 2021 at 2:44 PM Tom Lane  wrote:
>> I see that these two answers are both exactly multiples of 24 hours away
>> from the given origin.  But if I'm binning on the basis of "days" or
>> larger units, I would sort of expect to get local midnight, and I'm not
>> getting that once I cross a DST boundary.

> Hmm, that seems like a reasonable expectation. I can get local midnight
> if I recast to timestamp:

> # select date_bin('1 day', '2021-11-10 00:00 +00'::timestamptz::timestamp, '2021-09-01 00:00 -04'::timestamptz::timestamp);
>       date_bin
> ---------------------
>  2021-11-09 00:00:00
> (1 row)

Yeah, and then back to timestamptz if that's what you really need :-(

> It's a bit unintuitive, though.

Agreed.  If we keep it like this, adding some documentation around
the point would be a good idea I think.

regards, tom lane




Re: PoC/WIP: Extended statistics on expressions

2021-09-01 Thread Justin Pryzby
On Wed, Sep 01, 2021 at 06:45:29PM +0200, Tomas Vondra wrote:
> > > Patch 0001 fixes the "double parens" issue discussed elsewhere in this
> > > thread, and patch 0002 tweaks CREATE STATISTICS to treat "(a)" as a simple
> > > column reference.
> > 
> > 0002 refuses to create expressional stats on a simple column reference like
> > (a), which I think helps to avoid a user accidentally creating useless ext
> > stats objects (which are redundant with the table's column stats).
> > 
> > 0002 does not attempt to refuse cases like (a+0), which I think is fine:
> > we don't try to reject useless cases if someone insists on it.
> > See 240971675, 701fd0bbc.
> > 
> > So I am +1 to apply both patches.
> > 
> > I added this as an Opened Item for increased visibility.
> 
> I've pushed both fixes, so the open item should be resolved.

Thank you - I marked it as such.

There are some typos in 537ca68db (refenrece)
I'll add them to my typos branch if you don't want to patch them right now or
wait to see if someone notices anything else.

diff --git a/src/backend/commands/statscmds.c b/src/backend/commands/statscmds.c
index 59369f8736..17cbd97808 100644
--- a/src/backend/commands/statscmds.c
+++ b/src/backend/commands/statscmds.c
@@ -205,27 +205,27 @@ CreateStatistics(CreateStatsStmt *stmt)
numcols = list_length(stmt->exprs);
if (numcols > STATS_MAX_DIMENSIONS)
ereport(ERROR,
(errcode(ERRCODE_TOO_MANY_COLUMNS),
 errmsg("cannot have more than %d columns in 
statistics",
STATS_MAX_DIMENSIONS)));
 
/*
 * Convert the expression list to a simple array of attnums, but also 
keep
 * a list of more complex expressions.  While at it, enforce some
 * constraints - we don't allow extended statistics on system 
attributes,
-* and we require the data type to have less-than operator.
+* and we require the data type to have a less-than operator.
 *
-* There are many ways how to "mask" a simple attribute refenrece as an
+* There are many ways to "mask" a simple attribute reference as an
 * expression, for example "(a+0)" etc. We can't possibly detect all of
-* them, but we handle at least the simple case with attribute in 
parens.
+* them, but we handle at least the simple case with the attribute in 
parens.
 * There'll always be a way around this, if the user is determined (like
 * the "(a+0)" example), but this makes it somewhat consistent with how
 * indexes treat attributes/expressions.
 */
foreach(cell, stmt->exprs)
{
StatsElem  *selem = lfirst_node(StatsElem, cell);
 
if (selem->name)/* column reference */
{
char   *attname;




Re: PoC/WIP: Extended statistics on expressions

2021-09-01 Thread Tomas Vondra




On 9/1/21 9:38 PM, Justin Pryzby wrote:

On Wed, Sep 01, 2021 at 06:45:29PM +0200, Tomas Vondra wrote:

Patch 0001 fixes the "double parens" issue discussed elsewhere in this
thread, and patch 0002 tweaks CREATE STATISTICS to treat "(a)" as a simple
column reference.


0002 refuses to create expressional stats on a simple column reference like
(a), which I think helps to avoid a user accidentally creating useless ext
stats objects (which are redundant with the table's column stats).

0002 does not attempt to refuse cases like (a+0), which I think is fine:
we don't try to reject useless cases if someone insists on it.
See 240971675, 701fd0bbc.

So I am +1 to apply both patches.

I added this as an Opened Item for increased visibility.


I've pushed both fixes, so the open item should be resolved.


Thank you - I marked it as such.

There are some typos in 537ca68db (refenrece)
I'll add them to my typos branch if you don't want to patch them right now or
wait to see if someone notices anything else.



Yeah, probably better to wait a bit. Any opinions on rejecting 
expressions referencing system attributes or no attributes at all?


regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-01 Thread Zhihong Yu
On Wed, Sep 1, 2021 at 11:43 AM Jacob Champion  wrote:

> On Wed, 2021-09-01 at 15:42 +, Jacob Champion wrote:
> > The cfbot found a failure in postgres_fdw, which I completely neglected
> > in my design. I think the desired functionality should be to allow the
> > ldapuser connection option during CREATE USER MAPPING but not CREATE
> > SERVER.
>
> Fixed in v2, attached.
>
> --Jacob
>
Hi,

+   if (strcmp(val, "1") == 0)
+   hbaline->ldap_map_dn = true;
+   else
+   hbaline->ldap_map_dn = false;

The above can be shortened as:

  hbaline->ldap_map_dn = strcmp(val, "1") == 0;

Cheers


Re: Postgres Win32 build broken?

2021-09-01 Thread Andrew Dunstan


On 8/31/21 9:52 PM, Michael Paquier wrote:
> On Tue, Aug 31, 2021 at 07:49:40PM -0300, Ranier Vilela wrote:
>> I'm not a perl specialist and it seems to me that the Win32 build is broken.
>> The Win32 build is still important because of the 32-bit clients still in
>> use.
>> I'm investigating the problem.
> Being able to see the command you are using for build.pl, your
> buildenv.pl and/or config.pl, as well as your build dependencies
> should help to know what's wrong.
>
> MSVC builds are tested by various buildfarm members on a daily basis,
> and nothing is red.  I also have a x86 and x64 configuration with
> VS2015 that prove to work as of HEAD at de1d4fe, FWIW.  Now, by
> experience, one could say that N Windows PG developers finish with at
> least (N+1) different environments.  Basically Simon Riggs's theorem
> applied to Windows development..



I am seeing the same result as Ranier using VS2017 and VS 2019.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Add jsonlog log_destination for JSON server logs

2021-09-01 Thread Sehrope Sarkuni
Updated patch set is attached.

This version splits out the existing csvlog code into its own file and
centralizes the common helpers into a new elog-internal.h so that they're
only included by the actual write_xyz sources.

That makes the elog.c changes in the JSON logging patch minimal as all it's
really doing is invoking the new write_jsonlog(...) function.

It also adds those missing fields to the JSON logger output.

Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
From d5b3f5fe44e91d35aefdd570758d5b2a9e9c1a36 Mon Sep 17 00:00:00 2001
From: Sehrope Sarkuni 
Date: Wed, 10 Jul 2019 10:02:31 -0400
Subject: [PATCH 1/4] Adds separate dest field to log protocol PipeProtoHeader

Adds a separate dest field to PipeProtoHeader to store the log destination
requested by the sending process. Also changes the is_last field to only
store whether the chunk is the last one for a message rather than also
including whether the destination is csvlog.
---
 src/backend/postmaster/syslogger.c | 15 ++-
 src/backend/utils/error/elog.c |  4 +++-
 src/include/postmaster/syslogger.h |  4 ++--
 3 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index cad43bdef2..edd8f33204 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -878,7 +878,6 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
 {
 	char	   *cursor = logbuffer;
 	int			count = *bytes_in_logbuffer;
-	int			dest = LOG_DESTINATION_STDERR;
 
 	/* While we have enough for a header, process data... */
 	while (count >= (int) (offsetof(PipeProtoHeader, data) + 1))
@@ -891,8 +890,9 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
 		if (p.nuls[0] == '\0' && p.nuls[1] == '\0' &&
 			p.len > 0 && p.len <= PIPE_MAX_PAYLOAD &&
 			p.pid != 0 &&
-			(p.is_last == 't' || p.is_last == 'f' ||
-			 p.is_last == 'T' || p.is_last == 'F'))
+			(p.is_last == 't' || p.is_last == 'f') &&
+			(p.dest == LOG_DESTINATION_CSVLOG ||
+			 p.dest == LOG_DESTINATION_STDERR))
 		{
 			List	   *buffer_list;
 			ListCell   *cell;
@@ -906,9 +906,6 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
 			if (count < chunklen)
 break;
 
-			dest = (p.is_last == 'T' || p.is_last == 'F') ?
-LOG_DESTINATION_CSVLOG : LOG_DESTINATION_STDERR;
-
 			/* Locate any existing buffer for this source pid */
 			buffer_list = buffer_lists[p.pid % NBUFFER_LISTS];
 			foreach(cell, buffer_list)
@@ -924,7 +921,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
 	free_slot = buf;
 			}
 
-			if (p.is_last == 'f' || p.is_last == 'F')
+			if (p.is_last == 'f')
 			{
 /*
  * Save a complete non-final chunk in a per-pid buffer
@@ -970,7 +967,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
 	appendBinaryStringInfo(str,
 		   cursor + PIPE_HEADER_SIZE,
 		   p.len);
-	write_syslogger_file(str->data, str->len, dest);
+	write_syslogger_file(str->data, str->len, p.dest);
 	/* Mark the buffer unused, and reclaim string storage */
 	existing_slot->pid = 0;
 	pfree(str->data);
@@ -979,7 +976,7 @@ process_pipe_input(char *logbuffer, int *bytes_in_logbuffer)
 {
 	/* The whole message was one chunk, evidently. */
 	write_syslogger_file(cursor + PIPE_HEADER_SIZE, p.len,
-		 dest);
+		 p.dest);
 }
 			}
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index a3e1c59a82..cd13111708 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -3250,6 +3250,8 @@ write_pipe_chunks(char *data, int len, int dest)
 
 	p.proto.nuls[0] = p.proto.nuls[1] = '\0';
 	p.proto.pid = MyProcPid;
+	p.proto.dest = (int32) dest;
+	p.proto.is_last = 'f';
 
 	/* write all but the last chunk */
 	while (len > PIPE_MAX_PAYLOAD)
@@ -3264,7 +3266,7 @@ write_pipe_chunks(char *data, int len, int dest)
 	}
 
 	/* write the last chunk */
-	p.proto.is_last = (dest == LOG_DESTINATION_CSVLOG ? 'T' : 't');
+	p.proto.is_last = 't';
 	p.proto.len = len;
 	memcpy(p.proto.data, data, len);
 	rc = write(fd, &p, PIPE_HEADER_SIZE + len);
diff --git a/src/include/postmaster/syslogger.h b/src/include/postmaster/syslogger.h
index 1491eecb0f..41d026a474 100644
--- a/src/include/postmaster/syslogger.h
+++ b/src/include/postmaster/syslogger.h
@@ -46,8 +46,8 @@ typedef struct
 	char		nuls[2];		/* always \0\0 */
 	uint16		len;			/* size of this chunk (counts data only) */
 	int32		pid;			/* writer's pid */
-	char		is_last;		/* last chunk of message? 't' or 'f' ('T' or
- * 'F' for CSV case) */
+	int32		dest;			/* log destination */
+	char		is_last;/* last chunk of message? 't' or 'f'*/
 	char		data[FLEXIBLE_ARRAY_MEMBER];	/* data payload starts here */
 } PipeProtoHeader;
 
-- 
2.17.1

From dfb17c0b1804b9e54a287e6a058d02dd1be27ffb Mon Sep 17 00:00:00 2001
From: Sehrope Sarkuni 
Date: Tue, 

Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-09-01 Thread Tom Lane
Etsuro Fujita  writes:
> On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed  wrote:
>> On Wed, Jul 14, 2021 at 1:41 AM Tom Lane  wrote:
>>> Seems to just need an update of the expected-file to account for test
>>> cases added recently.  (I take no position on whether the new results
>>> are desirable; some of these might be breaking the intent of the case.
>>> But this should quiet the cfbot anyway.)

>> The test case was added by commit "Add support for asynchronous execution"
>> "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can comment
>> whether the new results are desirable or not.

> The new results aren't what I intended.  I'll update the patch to
> avoid that by modifying the original test cases properly, if there are
> no objections.

Please follow up on that sometime?  In the meantime, here is a rebase
over aa769f80e and 2dc53fe2a, to placate the cfbot.

The real reason that this hasn't gotten committed is that I remain
pretty uncomfortable about whether it's an acceptable solution to
the problem.  Suddenly asking people to plaster COLLATE clauses
on all their textual remote columns seems like a big compatibility
gotcha.  However, I lack any ideas about a less unpleasant solution.

regards, tom lane

diff --git a/contrib/postgres_fdw/deparse.c b/contrib/postgres_fdw/deparse.c
index d98bd66681..654b09273e 100644
--- a/contrib/postgres_fdw/deparse.c
+++ b/contrib/postgres_fdw/deparse.c
@@ -328,7 +328,9 @@ foreign_expr_walker(Node *node,
 
 /*
  * If the Var is from the foreign table, we consider its
- * collation (if any) safe to use.  If it is from another
+ * collation (if any) safe to use, *unless* it's
+ * DEFAULT_COLLATION_OID.  We treat that as meaning "we don't
+ * know which collation this is".  If it is from another
  * table, we treat its collation the same way as we would a
  * Param's collation, ie it's not safe for it to have a
  * non-default collation.
@@ -350,7 +352,12 @@ foreign_expr_walker(Node *node,
 
 	/* Else check the collation */
 	collation = var->varcollid;
-	state = OidIsValid(collation) ? FDW_COLLATE_SAFE : FDW_COLLATE_NONE;
+	if (collation == InvalidOid)
+		state = FDW_COLLATE_NONE;
+	else if (collation == DEFAULT_COLLATION_OID)
+		state = FDW_COLLATE_UNSAFE;
+	else
+		state = FDW_COLLATE_SAFE;
 }
 else
 {
@@ -941,8 +948,24 @@ foreign_expr_walker(Node *node,
 
 	/*
 	 * Now, merge my collation information into my parent's state.
+	 *
+	 * If one branch of an expression derives a non-default collation safely
+	 * (that is, from a foreign Var) and another one derives the same
+	 * collation unsafely, we can consider the expression safe overall.  This
+	 * allows cases such as "foreign_var = ('foo' COLLATE x)" where x is the
+	 * same collation the foreign_var has anyway.  Note that we will not ship
+	 * any explicit COLLATE clause to the remote, but rely on it to re-derive
+	 * the correct collation based on the foreign_var.
 	 */
-	if (state > outer_cxt->state)
+	if (collation == outer_cxt->collation &&
+		((state == FDW_COLLATE_UNSAFE &&
+		  outer_cxt->state == FDW_COLLATE_SAFE) ||
+		 (state == FDW_COLLATE_SAFE &&
+		  outer_cxt->state == FDW_COLLATE_UNSAFE)))
+	{
+		outer_cxt->state = FDW_COLLATE_SAFE;
+	}
+	else if (state > outer_cxt->state)
 	{
 		/* Override previous parent state */
 		outer_cxt->collation = collation;
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index e3ee30f1aa..9e52e09a8b 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -32,29 +32,29 @@ CREATE SCHEMA "S 1";
 CREATE TABLE "S 1"."T 1" (
 	"C 1" int NOT NULL,
 	c2 int NOT NULL,
-	c3 text,
+	c3 text collate "C",
 	c4 timestamptz,
 	c5 timestamp,
-	c6 varchar(10),
-	c7 char(10),
+	c6 varchar(10) collate "C",
+	c7 char(10) collate "C",
 	c8 user_enum,
 	CONSTRAINT t1_pkey PRIMARY KEY ("C 1")
 );
 CREATE TABLE "S 1"."T 2" (
 	c1 int NOT NULL,
-	c2 text,
+	c2 text collate "C",
 	CONSTRAINT t2_pkey PRIMARY KEY (c1)
 );
 CREATE TABLE "S 1"."T 3" (
 	c1 int NOT NULL,
 	c2 int NOT NULL,
-	c3 text,
+	c3 text collate "C",
 	CONSTRAINT t3_pkey PRIMARY KEY (c1)
 );
 CREATE TABLE "S 1"."T 4" (
 	c1 int NOT NULL,
 	c2 int NOT NULL,
-	c3 text,
+	c3 text collate "C",
 	CONSTRAINT t4_pkey PRIMARY KEY (c1)
 );
 -- Disable autovacuum for these tables to avoid unexpected effects of that
@@ -94,16 +94,18 @@ ANALYZE "S 1"."T 3";
 ANALYZE "S 1"."T 4";
 -- ===
 -- create foreign tables
+-- Note: to ensure stable regression results, all collatable columns
+-- in these tables must have explicitly-specified collations.
 -- ===
 CREATE FOREIGN TABLE ft1 (
 	c0 int,
 	c1 int NOT NULL,
 	c2 int NOT NULL,
-	c3 text,
+	c3 text collate "C",
 	c4 timestamptz,
 	

Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-01 Thread Jacob Champion
On Wed, 2021-09-01 at 12:59 -0700, Zhihong Yu wrote:
> +   if (strcmp(val, "1") == 0)
> +   hbaline->ldap_map_dn = true;
> +   else
> +   hbaline->ldap_map_dn = false;
> 
> The above can be shortened as:
> 
>   hbaline->ldap_map_dn = strcmp(val, "1") == 0;

I usually prefer simplifying those conditionals, too, but in this case
I think it'd be a pretty big departure from the existing style. See for
example the handling of include_realm and compat_realm just after this
hunk.

--Jacob


Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-01 Thread Zhihong Yu
On Wed, Sep 1, 2021 at 1:56 PM Jacob Champion  wrote:

> On Wed, 2021-09-01 at 12:59 -0700, Zhihong Yu wrote:
> > +   if (strcmp(val, "1") == 0)
> > +   hbaline->ldap_map_dn = true;
> > +   else
> > +   hbaline->ldap_map_dn = false;
> >
> > The above can be shortened as:
> >
> >   hbaline->ldap_map_dn = strcmp(val, "1") == 0;
>
> I usually prefer simplifying those conditionals, too, but in this case
> I think it'd be a pretty big departure from the existing style. See for
> example the handling of include_realm and compat_realm just after this
> hunk.
>
> --Jacob
>
Hi,
I looked at v2-Allow-user-name-mapping-with-LDAP.patch
and src/backend/postmaster/postmaster.c in master branch but didn't find
what you mentioned.

I still think my recommendation is concise.

Cheers


Re: Column Filtering in Logical Replication

2021-09-01 Thread Rahila Syed
Hi,


>> Another point is what if someone drops the column used in one of the
>> publications? Do we want to drop the entire relation from publication
>> or just remove the column filter or something else?
>>
>
After thinking about this, I think it is best to remove the entire table
from the publication if a column specified in the column filter is
dropped from the table. If we dropped only the filter without dropping
the table, all the columns would be replicated, and the downstream
server table might not have those columns.
If we drop only the column from the filter we might have to recreate the
filter and check for replica identity.
That means if the replica identity column is dropped, you can't drop it
from the filter,
and might have to drop the entire publication-table mapping anyway.

Thus, I think it is cleanest to drop the entire relation from publication.

This has been implemented in the attached version.


> Do we want to consider that the columns specified in the filter must
>> not have NOT NULL constraint? Because, otherwise, the subscriber will
>> error out inserting such rows?
>>
>> I think you mean columns *not* specified in the filter must not have NOT
> NULL constraint
> on the subscriber, as this will break during insert, as it will try to
> insert NULL for columns
> not sent by the publisher.
> I will look into fixing this. Probably this won't be a problem in
> case the column is auto generated or contains a default value.
>
>
I am not sure if this needs to be handled. Ideally, we need to prevent the
subscriber tables from having a NOT NULL
constraint if the publisher uses column filters to publish the values of
the table. There is no way
to do this at the time of creating a table on the subscriber.
As this involves querying the publisher for this information, it can be
done at the time of initial table synchronization.
i.e. error out if any of the subscribed tables has a NOT NULL constraint on
non-filter columns.
This will lead to the user dropping and recreating the subscription after
removing the
NOT NULL constraint from the table.
I think the same can be achieved by doing nothing and letting the
subscriber error out while inserting rows.

Minor comments:
>> 
>>   pq_sendbyte(out, flags);
>> -
>>   /* attribute name */
>>   pq_sendstring(out, NameStr(att->attname));
>>
>> @@ -953,6 +1000,7 @@ logicalrep_write_attrs(StringInfo out, Relation rel)
>>
>>   /* attribute mode */
>>   pq_sendint32(out, att->atttypmod);
>> +
>>   }
>>
>>   bms_free(idattrs);
>> diff --git a/src/backend/replication/logical/relation.c
>> b/src/backend/replication/logical/relation.c
>> index c37e2a7e29..d7a7b00841 100644
>> --- a/src/backend/replication/logical/relation.c
>> +++ b/src/backend/replication/logical/relation.c
>> @@ -354,7 +354,6 @@ logicalrep_rel_open(LogicalRepRelId remoteid,
>> LOCKMODE lockmode)
>>
>>   attnum = logicalrep_rel_att_by_name(remoterel,
>>   NameStr(attr->attname));
>> -
>>   entry->attrmap->attnums[i] = attnum;
>>
>> There are quite a few places in the patch that contains spurious line
>> additions or removals.
>>
>>
> Fixed these in the attached patch.

Having said that, I'm not sure I agree with this design decision; what I
> think this is doing is hiding from the user the fact that they are
> publishing columns that they don't want to publish.  I think as a user I
> would rather get an error in that case:


  ERROR:  invalid column list in published set
>   DETAIL:  The set of published commands does not include all the replica
> identity columns.


Added this.

Also added some more tests. Please find attached a rebased and updated
patch.

Thank you,
Rahila Syed


v4-0001-Add-column-filtering-to-logical-replication.patch
Description: Binary data


Re: [PATCH] Support pg_ident mapping for LDAP

2021-09-01 Thread Jacob Champion
On Wed, 2021-09-01 at 14:20 -0700, Zhihong Yu wrote:
> I looked at v2-Allow-user-name-mapping-with-LDAP.patch
> and src/backend/postmaster/postmaster.c in master branch but didn't
> find what you mentioned.

This hunk is in src/backend/libpq/hba.c, in the parse_hba_auth_opt()
function. The code there uses the less concise form throughout, as far
as I can see.

--Jacob


Re: [PATCH] regexp_positions ( string text, pattern text, flags text ) → setof int4range[]

2021-09-01 Thread Daniel Gustafsson
> On 1 Sep 2021, at 16:02, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>>> On 9 Mar 2021, at 20:30, Joel Jacobson  wrote:
>>> Attached is a patch implementing it this way.
> 
>> This patch no longer applies, can you please submit a rebased version?

On a brief skim, this patch includes the doc stanza for regexp_replace which I
assume is a copy/pasteo.

+   TupleDescInitEntry(tupdesc, (AttrNumber) 1, "starts",
While “start_positions” is awfully verbose, just “starts” doesn’t really roll
off the tongue.  Perhaps “positions” would be more explanatory?

> Also, since 642433707 ("This patch adds new functions regexp_count(),
> regexp_instr(), regexp_like(), and regexp_substr(), and extends
> regexp_replace() with some new optional arguments") is already in,
> we need to think about how this interacts with that.  Do we even
> still need any more functionality in this area?  Should we try to
> align the APIs?

I can see value in a function like this one, and the API is AFAICT fairly
aligned with what I as a user would expect it to be given what we already have.

--
Daniel Gustafsson   https://vmware.com/





stat() vs ERROR_DELETE_PENDING, round N + 1

2021-09-01 Thread Thomas Munro
Hi,

Back in 2017, Michael and Magnus apparently fixed a bug report[1] about
failing basebackups on Windows due to its concurrent file access
semantics:

commit 9951741bbeb3ec37ca50e9ce3df1808c931ff6a6
Author: Magnus Hagander 
Date:   Wed Jan 4 10:48:30 2017 +0100

Attempt to handle pending-delete files on Windows

I think this has been re-broken by:

commit bed90759fcbcd72d4d06969eebab81e47326f9a2
Author: Tom Lane 
Date:   Fri Oct 9 16:20:12 2020 -0400

Fix our Windows stat() emulation to handle file sizes > 4GB.

There's code in there that appears to understand the
ERROR_DELETE_PENDING stuff, but it seems to be too late, as this bit
will fail with ERROR_ACCESS_DENIED first:

/* fast not-exists check */
if (GetFileAttributes(name) == INVALID_FILE_ATTRIBUTES)
{
_dosmaperr(GetLastError());
return -1;
}

... and if you comment that out, then the CreateFile() call will fail
and we'll return before we get to the code that purports to grok
pending deletes.  I don't really understand that code, but I can
report that it's not reached.

This came up because in our work on AIO, we have extra io worker
processes that might have file handles open even in a single session
scenario like 010_pg_basebackup.pl, so we make these types of problems
more likely to hit (hence also my CF entry to fix a related problem in
DROP TABLESPACE).  But that's just chance: I assume basebackup could
fail for anyone in 14 for the same reason due to any other backend
that hasn't processed a sinval to close the file yet.

Perhaps we need some combination of the old way (that apparently knew
how to detect pending deletes) and the new way (that knows about large
files)?

[1] 
https://www.postgresql.org/message-id/flat/20160712083220.1426.58667%40wrigleys.postgresql.org




Re: libpq compression

2021-09-01 Thread Daniel Gustafsson
> On 29 Jul 2021, at 16:57, Daniil Zakhlystov  wrote:
> 
> Forgot to attach the updated patch :) 

This fails to build on Windows due to the use of strcasecmp:

+   if (strcasecmp(supported_algorithms[zpq->compressors[i].impl], 
"zstd") == 

Was that meant to be pg_strcasecmp?

--
Daniel Gustafsson   https://vmware.com/





Re: unpack_sql_state not called?

2021-09-01 Thread Peter Smith
Thanks for pushing!

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Postgres Win32 build broken?

2021-09-01 Thread Andrew Dunstan


On 9/1/21 4:00 PM, Andrew Dunstan wrote:
> On 8/31/21 9:52 PM, Michael Paquier wrote:
>> On Tue, Aug 31, 2021 at 07:49:40PM -0300, Ranier Vilela wrote:
>>> I'm not a perl specialist and it seems to me that the Win32 build is broken.
>>> The Win32 build is still important because of the 32-bit clients still in
>>> use.
>>> I'm investigating the problem.
>> Being able to see the command you are using for build.pl, your
>> buildenv.pl and/or config.pl, as well as your build dependencies
>> should help to know what's wrong.
>>
>> MSVC builds are tested by various buildfarm members on a daily basis,
>> and nothing is red.  I also have a x86 and x64 configuration with
>> VS2015 that prove to work as of HEAD at de1d4fe, FWIW.  Now, by
>> experience, one could say that N Windows PG developers finish with at
>> least (N+1) different environments.  Basically Simon Riggs's theorem
>> applied to Windows development..
>
>
> I am seeing the same result as Ranier using VS2017 and VS 2019.
>
>

But not with VS2013. If you need to build 32 bit client libraries, using
an older VS release is probably your best bet.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: psql tab auto-complete for CREATE PUBLICATION

2021-09-01 Thread Peter Smith
On Wed, Sep 1, 2021 at 11:04 PM Fujii Masao  wrote:
>
>
>
> On 2021/07/17 2:19, vignesh C wrote:
> > Thanks for the patch, the changes look good to me.
>
> The patch looks good to me, too. Pushed. Thanks!
>

Thanks for pushing!

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Numeric x^y for negative x

2021-09-01 Thread Jaime Casanova
On Fri, Aug 06, 2021 at 09:27:03PM +0100, Dean Rasheed wrote:
> On Fri, 6 Aug 2021 at 21:26, Tom Lane  wrote:
> >
> > Dean Rasheed  writes:
> > > On Fri, 6 Aug 2021 at 17:15, Tom Lane  wrote:
> > >> Looks plausible by eyeball (I've not tested).
> >
> > > So, I have back-branch patches for this ready to go. The question is,
> > > is it better to push now, or wait until after next week's releases?
> >
> > I'd push now, given we have a failing buildfarm member.
> >
> > Admittedly, there may be nobody else using that compiler out in
> > the real world, but we don't know that.
> >
> 
> OK. Will do.
> 

Hi Dean,

It seems you already committed this. But it's still as "Ready for
committer" in the commitfest app. 

Are we waiting for something else or we can mark it as committed?

-- 
Jaime Casanova
Director de Servicios Profesionales
SystemGuards - Consultores de PostgreSQL




Re: Postgres Win32 build broken?

2021-09-01 Thread Michael Paquier
On Wed, Sep 01, 2021 at 06:49:31PM -0400, Andrew Dunstan wrote:
> On 9/1/21 4:00 PM, Andrew Dunstan wrote:
>> I am seeing the same result as Ranier using VS2017 and VS 2019.
>
> But not with VS2013. If you need to build 32 bit client libraries, using
> an older VS release is probably your best bet.

That's annoying.  Should we be more careful with the definition of
{platform} in DeterminePlatform for those versions of VS?
--
Michael




Re: Postgres Win32 build broken?

2021-09-01 Thread Ranier Vilela
Em qua., 1 de set. de 2021 às 19:49, Andrew Dunstan 
escreveu:

>
> On 9/1/21 4:00 PM, Andrew Dunstan wrote:
> > On 8/31/21 9:52 PM, Michael Paquier wrote:
> >> On Tue, Aug 31, 2021 at 07:49:40PM -0300, Ranier Vilela wrote:
> >>> I'm not a perl specialist and it seems to me that the Win32 build is
> broken.
> >>> The Win32 build is still important because of the 32-bit clients still
> in
> >>> use.
> >>> I'm investigating the problem.
> >> Being able to see the command you are using for build.pl, your
> >> buildenv.pl and/or config.pl, as well as your build dependencies
> >> should help to know what's wrong.
> >>
> >> MSVC builds are tested by various buildfarm members on a daily basis,
> >> and nothing is red.  I also have a x86 and x64 configuration with
> >> VS2015 that prove to work as of HEAD at de1d4fe, FWIW.  Now, by
> >> experience, one could say that N Windows PG developers finish with at
> >> least (N+1) different environments.  Basically Simon Riggs's theorem
> >> applied to Windows development..
> >
> >
> > I am seeing the same result as Ranier using VS2017 and VS 2019.
> >
> >
>
> But not with VS2013. If you need to build 32 bit client libraries, using
> an older VS release is probably your best bet.
>
Thanks Andrew, but I finally got a workaround for the problem.
set MSBFLAGS=/p:Platform="Win32"

Now Postgres builds fine in 32 bits with the latest msvc (2019).
Is it worth documenting this?

regards,
Ranier Vilela


doc_build_in_32bits.patch
Description: Binary data


Re: prevent immature WAL streaming

2021-09-01 Thread Bossart, Nathan
On 9/1/21, 10:06 AM, "Andres Freund"  wrote:
> On 2021-09-01 15:01:43 +0900, Fujii Masao wrote:
>> Thanks for clarifying that! Unless I misunderstand that, when recovery ends
>> at a partially-flushed segment-spanning record, it sets
>> XLP_FIRST_IS_ABORTED_PARTIAL in the next segment before starting writing
>> new WAL data there. Therefore XLP_FIRST_IS_CONTRECORD or
>> XLP_FIRST_IS_ABORTED_PARTIAL must be set in the next segment following
>> a partially-flushed segment-spanning record. If none of them is set,
>> the validation code in recovery should report an error.
>
> Right. With the small addition that I think we shouldn't just do this for
> segment spanning records, but for all records spanning pages.

This approach seems promising.  I like that it avoids adding extra
work in the hot path for writing WAL.  I'm assuming it won't be back-
patchable, though.

Nathan



Re: [PATCH] test/ssl: rework the sslfiles Makefile target

2021-09-01 Thread Jacob Champion
On Fri, 2021-08-27 at 15:02 +0900, Michael Paquier wrote:
> On Fri, Aug 13, 2021 at 12:08:16AM +, Jacob Champion wrote:
> > If _that's_ the case, then our build system is holding all of us
> > hostage. Which is frustrating to me. Should I shift focus to help with
> > that, first?
> 
> Fresh ideas in this area are welcome, yes.

Since the sslfiles target is its own little island in the dependency
graph (it doesn't need anything from Makefile.global), would it be
acceptable to just move it to a standalone sslfiles.mk that the
Makefile defers to? E.g.

sslfiles:
	$(MAKE) -f sslfiles.mk

Then we wouldn't have to touch Makefile.global at all, because
sslfiles.mk wouldn't need to include it. This also reduces .NOTPARALLEL
pollution as a bonus.

--Jacob


Re: prevent immature WAL streaming

2021-09-01 Thread Kyotaro Horiguchi
At Wed, 1 Sep 2021 15:01:43 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/09/01 12:15, Andres Freund wrote:
> > Hi,
> > On 2021-09-01 11:34:34 +0900, Fujii Masao wrote:
> >> On 2021/09/01 0:53, Andres Freund wrote:
> >>> Of course, we need to be careful to not weaken WAL validity checking
> >>> too
> >>> much. How about the following:
> >>>
> >>> If we're "aborting" a continued record, we set
> >>> XLP_FIRST_IS_ABORTED_PARTIAL on
> >>> the page at which we do so (i.e. the page after the valid end of the
> >>> WAL).
> >>
> >> When do you expect that XLP_FIRST_IS_ABORTED_PARTIAL is set? It's set
> >> when recovery finds a partially-flushed segment-spanning record?
> >> But maybe we cannot do that (i.e., cannot overwrite the page) because
> >> the page that the flag is set in might have already been archived. No?
> > I was imagining that XLP_FIRST_IS_ABORTED_PARTIAL would be set in the
> > "tail
> > end" of a partial record. I.e. if there's a partial record starting in
> > the
> > successfully archived segment A, but the end of the record, in B, has
> > not been
> > written to disk before a crash, we'd set XLP_FIRST_IS_ABORTED_PARTIAL
> > at the
> > end of the valid data in B. Which could not have been archived yet, or
> > we'd
> > not have a partial record.  So we should never need to set the flag on
> > an
> > already archived page.
> 
> Thanks for clarifying that! Unless I misunderstand that, when recovery
> ends
> at a partially-flushed segment-spanning record, it sets
> XLP_FIRST_IS_ABORTED_PARTIAL in the next segment before starting
> writing
> new WAL data there. Therefore XLP_FIRST_IS_CONTRECORD or
> XLP_FIRST_IS_ABORTED_PARTIAL must be set in the next segment following
> a partially-flushed segment-spanning record. If none of them is set,
> the validation code in recovery should report an error.
> 
> Yes, this design looks good to me.

So, this is a crude PoC of that.

At the end of recovery:

- When XLogReadRecord misses a page where the next part of the current
  continuation record should be seen, xlogreader->ContRecAbortPtr is
  set to the beginning of the missing page.

- When StartupXLOG receives a valid ContRecAbortPtr, the value is used
  as the next WAL insertion location then sets the same to
  XLogCtl->contAbortedRecPtr.

- When XLogCtl->contAbortedRecPtr is set, AdvanceXLInsertBuffer()
  (called under XLogInsertRecord()) sets XLP_FIRST_IS_ABORTED_PARTIAL
  flag to the page.

While recovery:
- When XLogReadRecord meets a XLP_FIRST_IS_ABORT_PARTIAL page, it
  rereads a record from there.

In this PoC,

1. This patch is written on the current master, but it doesn't
  interfere with the seg-boundary-memorize patch since it removes the
  calls to RegisterSegmentBoundary.

2. Since xlogreader cannot emit a log message immediately, we don't
  have a means to leave a log message informing that recovery met an
  aborted partial continuation record. (In this PoC, it is done by
  fprintf:p)

3. Maybe we need pg_waldump to show partial continuation records,
  but I'm not sure how to realize that.

4. By the way, is this (method) applicable at this stage?


The attached first is the PoC including debug-crash aid.  The second
is the repro script.  It fails to reproduce the situation once per
several trials.

The following log messages are shown by a run of the script.

> ...
> TRAP: FailedAssertion("c++ < 1", File: "xlog.c", Line: 2675, PID: 254644)
> ...
> LOG:  database system is shut down
> ...
> 
> LOG:  redo starts at 0/228
> LOG:  redo done at 0/6A8 system usage:...
> LOG:   Recovery finished: ContRecAbort: 0/700 (EndRecPtr: 0/6E8)

The record from 6E8 is missing the trailing part after 700.

> LOG:   EndOfLog=0/700
> LOG:   set XLP_FIRST_IS_ABORT_PARTIAL@0/700

So, WAL insertion starts from 700 and the first page is set the flag.

> LOG:  database system is ready to accept connections
> ...
> LOG:  database system is shut down
> ...
> #
> ...
> LOG:  redo starts at 0/228
> LOG:  consistent recovery state reached at 0/2000100
> ...
> LOG:  restored log file "00010007" from archive
>  aborted partial continuation record found at 0/6E8, continue from 0/700

The record from 6E8 is immature so skip it and continue reading
from 700.

> LOG:  last completed transaction was at log time 2021-09-01 20:40:21.775295+09
> LOG:   Recovery finished: ContRecAbort: 0/0 (EndRecPtr: 0/800)
> LOG:  restored log file "00010007" from archive
> LOG:  selected new timeline ID: 2
> LOG:  archive recovery complete
> LOG:   EndOfLog=0/800

Recovery ends.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 24165ab03e..b0f18e4e5e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -111,6 +111,7 @@ int CommitSiblings = 5; /*

Re: Possible missing segments in archiving on standby

2021-09-01 Thread Kyotaro Horiguchi
At Wed, 1 Sep 2021 14:37:43 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/09/01 12:12, Kyotaro Horiguchi wrote:
> > Putting aside the issue C, it would work as far as recovery is not
> > paused or delayed.  Although simply doing that means we run an
> > additional (and a bit wasteful) XLogArchiveCheckDone() in most cases,
> > it's hard to imagine moving the responsibility to notify a finished
> > segment from walsender (writer side) to startup (reader side).
> 
> You mean walreceiver, not walsender?

Sorry, it's walreceiver.

> I was thinking to apply my latest patch, to address the issue A and C.
> So walreceiver is still basically responsible to create .ready file.

Considering the following discussion, I don't object to the patch.

> Also regarding the issue B, I was thinking to make the startup process
> call XLogArchiveCheckDone() or something only when it finds
> XLOG_SWITCH record. Thought?

Sounds workable.  I came to agree with the reader-side amendment as
below.  But I might prefer to do that at every segment switch, in case
of a crash.

> What happens if the server is promoted before that walreceiver is
> invoked?

Hm.  A partial segment is not created if a server is promoted just at
a segment boundary, so the previous segment won't get archived until
the next checkpoint runs.
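The boundary condition in that corner case comes down to arithmetic on the end-of-WAL position. A hedged sketch, with XLogRecPtr and the offset computation simplified from the XLogSegmentOffset() macro in access/xlog_internal.h:

```c
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Mirrors XLogSegmentOffset(); the segment size must be a power of
 * two (16MB by default). */
static uint64_t
seg_offset(XLogRecPtr ptr, uint64_t wal_segment_size)
{
    return ptr & (wal_segment_size - 1);
}

/* The corner case above: if promotion happens while the end of WAL
 * sits exactly on a segment boundary, no partial segment is created,
 * so nothing on the write side triggers archiving of the segment that
 * just finished -- hence the need for a reader-side check. */
static int
promotion_skips_archiving(XLogRecPtr end_of_wal, uint64_t wal_segment_size)
{
    return seg_offset(end_of_wal, wal_segment_size) == 0;
}
```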

Ok, I agree that the reader-side needs an amendment.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Using COPY FREEZE in pgbench

2021-09-01 Thread Tatsuo Ishii
>> I tested this with -s100 and got similar results, but with -s1000 it
>> was more like 1.5x faster:
>> 
>> master:
>> done in 111.33 s (drop tables 0.00 s, create tables 0.01 s,
>> client-side generate 52.45 s, vacuum 32.30 s, primary keys 26.58 s)
>> 
>> patch:
>> done in 74.04 s (drop tables 0.46 s, create tables 0.04 s, client-side
>> generate 51.81 s, vacuum 2.11 s, primary keys 19.63 s)
>> 
>> Nice!
>> 
>> Regards,
>> Dean
> 
> If there's no objection, I am going to commit/push to master branch in
> early September.

I have pushed the patch to master branch.
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=06ba4a63b85e5aa47b325c3235c16c05a0b58b96

Thank you for those who gave me the valuable reviews!
Reviewed-by: Fabien COELHO, Laurenz Albe, Peter Geoghegan, Dean Rasheed
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Postgres Win32 build broken?

2021-09-01 Thread Andrew Dunstan


> On Sep 1, 2021, at 8:01 PM, Ranier Vilela  wrote:
> 
> 
>> Em qua., 1 de set. de 2021 às 19:49, Andrew Dunstan  
>> escreveu:
>> 
>> On 9/1/21 4:00 PM, Andrew Dunstan wrote:
>> > On 8/31/21 9:52 PM, Michael Paquier wrote:
>> >> On Tue, Aug 31, 2021 at 07:49:40PM -0300, Ranier Vilela wrote:
>> >>> I'm not a perl specialist and it seems to me that the Win32 build is 
>> >>> broken.
>> >>> The Win32 build is still important because of the 32-bit clients still in
>> >>> use.
>> >>> I'm investigating the problem.
>> >> Being able to see the command you are using for build.pl, your
>> >> buildenv.pl and/or config.pl, as well as your build dependencies
>> >> should help to know what's wrong.
>> >>
>> >> MSVC builds are tested by various buildfarm members on a daily basis,
>> >> and nothing is red.  I also have a x86 and x64 configuration with
>> >> VS2015 that prove to work as of HEAD at de1d4fe, FWIW.  Now, by
>> >> experience, one could say that N Windows PG developpers finish with at
>> >> least (N+1) different environments.  Basically Simon Riggs's theorem
>> >> applied to Windows development..
>> >
>> >
>> > I am seeing the same result as Ranier using VS2017 and VS 2019.
>> >
>> >
>> 
>> But not with VS2013. If you need to build 32 bit client libraries, using
>> an older VS release is probably your best bet.
> Thanks Andrew, but I finally got a workaround for the problem.
> set MSBFLAGS=/p:Platform="Win32"
> 
> Now Postgres builds fine in 32 bits with the latest msvc (2019).
> Is it worth documenting this?

Better to try to automate it.

Cheers

Andrew


RE: Failed transaction statistics to measure the logical replication progress

2021-09-01 Thread osumi.takami...@fujitsu.com
Hi


I've made a new patch to extend pg_stat_subscription
as suggested in [1] to have the columns
xact_commit, xact_error and an independent xact_abort mentioned in [2].
Also, during the discussion, we touched on whether we should
include data sizes for each column above and concluded that it's
better to have them. Accordingly, I've implemented corresponding
columns to show the data sizes as well.

Note that this patch depends on the v12 patchset of the apply error callback
provided in [3]. Therefore, applying that patchset first is required
if you would like to test my patch.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1%2BtOV-%2BssGjj1zq%2BnAL8a9LfPsxbtyupZGvZ0U7nV0A7g%40mail.gmail.com
[2] - 
https://www.postgresql.org/message-id/CAA4eK1KMT8biciVqTBoZ9gYV-Gf297JFeNhJaxZNmFrZL8m2jA%40mail.gmail.com
[3] - 
https://www.postgresql.org/message-id/CAD21AoA5HrhXqwbYLpSobGzV6rWoJmH3-NB9J3YarKDwARBj4w%40mail.gmail.com


Best Regards,
Takamichi Osumi



extend_subscription_stats_of_transaction_v02.patch
Description: extend_subscription_stats_of_transaction_v02.patch


Re: BUG #16583: merge join on tables with different DB collation behind postgres_fdw fails

2021-09-01 Thread Etsuro Fujita
On Thu, Sep 2, 2021 at 5:42 AM Tom Lane  wrote:
> Etsuro Fujita  writes:
> > On Thu, Jul 15, 2021 at 4:17 AM Ibrar Ahmed  wrote:
> >> On Wed, Jul 14, 2021 at 1:41 AM Tom Lane  wrote:
> >>> Seems to just need an update of the expected-file to account for test
> >>> cases added recently.  (I take no position on whether the new results
> >>> are desirable; some of these might be breaking the intent of the case.
> >>> But this should quiet the cfbot anyway.)
>
> >> The test case was added by commit "Add support for asynchronous execution"
> >> "27e1f14563cf982f1f4d71e21ef24782a052" by Etsuro Fujita. He can comment
> >> whether the new results are desirable or not.
>
> > The new results aren't what I intended.  I'll update the patch to
> > avoid that by modifying the original test cases properly, if there are
> > no objections.
>
> Please follow up on that sometime?

Will do in this commitfest.

> In the meantime, here is a rebase
> over aa769f80e and 2dc53fe2a, to placate the cfbot.

Thanks for the rebase!

Best regards,
Etsuro Fujita




Re: Skipping logical replication transactions on subscriber side

2021-09-01 Thread Greg Nancarrow
On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada  wrote:
>
>
> I've attached rebased patches. 0004 patch is not the scope of this
> patch. It's borrowed from another thread[1] to fix the assertion
> failure for newly added tests. Please review them.
>

I have some initial feedback on the v12-0001 patch.
Most of these are suggested improvements to wording, and some typo fixes.


(0) Patch comment

Suggestion to improve the patch comment:

BEFORE:
Add pg_stat_subscription_errors statistics view.

This commits adds new system view pg_stat_logical_replication_error,
showing errors happening during applying logical replication changes
as well as during performing initial table synchronization.

The subscription error entries are removed by autovacuum workers when
the table synchronization competed in table sync worker cases and when
dropping the subscription in apply worker cases.

It also adds SQL function pg_stat_reset_subscription_error() to
reset the single subscription error.

AFTER:
Add a subscription errors statistics view "pg_stat_subscription_errors".

This commit adds a new system view, pg_stat_logical_replication_errors,
which records information about any errors that occur during application
of logical replication changes as well as during initial table
synchronization.

The subscription error entries are removed by autovacuum workers after
table synchronization completes in table sync worker cases and after
dropping the subscription in apply worker cases.

It also adds an SQL function pg_stat_reset_subscription_error() to
reset a single subscription error.



doc/src/sgml/monitoring.sgml:

(1)
BEFORE:
+  One row per error that happened on subscription, showing
information about
+   the subscription errors.
AFTER:
+  One row per error that occurred on subscription,
providing information about
+   each subscription error.

(2)
BEFORE:
+   The pg_stat_subscription_errors view will
contain one
AFTER:
+   The pg_stat_subscription_errors view contains one


(3)
BEFORE:
+Name of the database in which the subscription is created.
AFTER:
+Name of the database in which the subscription was created.


(4)
BEFORE:
+   OID of the relation that the worker is processing when the
+   error happened.
AFTER:
+   OID of the relation that the worker was processing when the
+   error occurred.


(5)
BEFORE:
+Name of command being applied when the error happened.  This
+field is always NULL if the error is reported by
+tablesync worker.
AFTER:
+Name of command being applied when the error occurred.  This
+field is always NULL if the error is reported by a
+tablesync worker.

(6)
BEFORE:
+Transaction ID of publisher node being applied when the error
+happened.  This field is always NULL if the error is reported
+by tablesync worker.
AFTER:
+Transaction ID of the publisher node being applied when the error
+happened.  This field is always NULL if the error is reported
+by a tablesync worker.

(7)
BEFORE:
+Type of worker reported the error: apply or
+tablesync.
AFTER:
+Type of worker reporting the error: apply or
+tablesync.


(8)
BEFORE:
+   Number of times error happened on the worker.
AFTER:
+   Number of times the error occurred in the worker.

[or "Number of times the worker reported the error" ?]


(9)
BEFORE:
+   Time at which the last error happened.
AFTER:
+   Time at which the last error occurred.

(10)
BEFORE:
+   Error message which is reported last failure time.
AFTER:
+   Error message which was reported at the last failure time.

Maybe this should just say "Last reported error message" ?


(11)
You shouldn't call hash_get_num_entries() on a NULL pointer.

Suggest swapping the lines, as shown below:

BEFORE:
+ int32 nerrors = hash_get_num_entries(subent->suberrors);
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+continue;
AFTER:
+ int32 nerrors;
+
+ /* Skip this subscription if not have any errors */
+ if (subent->suberrors == NULL)
+continue;
+ nerrors = hash_get_num_entries(subent->suberrors);


(12)
Typo:  legnth -> length

+ * contains the fixed-legnth error message string which is



src/backend/postmaster/pgstat.c

(13)
"Subscription stat entries" hashtable is created in two different
places, one with HASH_CONTEXT and the other without. Is this
intentional?
Shouldn't there be a single function for creating this?


(14)
+ * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.

Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscriptions." ?

(15)
+ * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
+ * errors.

Seems to be missing a word, is it meant to say "Sent by the autovacuum
to purge the subscription errors." ?
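The hazard in item (11) boils down to evaluation order at declaration time: the initializer dereferences a pointer before the NULL check runs. A minimal sketch of the corrected ordering (the types here are stand-ins, not the patch's actual structs):

```c
#include <stddef.h>

typedef struct { int nentries; } MiniHash;
typedef struct { MiniHash *suberrors; } SubEntry;

/* Stand-in for hash_get_num_entries(); like the real function, it
 * must never be handed a NULL hash table. */
static int
mini_hash_num_entries(MiniHash *ht)
{
    return ht->nentries;
}

/* The corrected ordering from item (11): test for NULL before calling
 * the accessor, rather than initializing a variable with a call that
 * dereferences the possibly-NULL pointer. */
static int
count_sub_errors(SubEntry *subent)
{
    int nerrors;

    if (subent->suberrors == NULL)
        return 0;               /* skip: no errors recorded */
    nerrors = mini_hash_num_entries(subent->suberrors);
    return nerrors;
}
```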


Regards,
Greg Nancarrow
Fujitsu Australia




Re: corruption of WAL page header is never reported

2021-09-01 Thread Fujii Masao




On 2021/07/19 18:52, Yugo NAGATA wrote:

Well, I think my first patch was not wrong. The difference with the latest
patch is just whether we perform the additional check when we are not in
standby mode or not. The behavior is basically the same although which function
detects and prints the page-header error in cases of crash recovery is 
different.


Yes, so which patch do you think is better? I like your version
because there seems no reason why XLogPageRead() should skip
XLogReaderValidatePageHeader() when not in standby mode.

Also I'm tempted to move ereport() and reset of errmsg_buf to
under next_record_is_invalid as follows. That is, in standby mode
whenever we find an invalid record and retry reading WAL page
in XLogPageRead(), we report the error message and reset it.
For now in XLogPageRead(), only XLogReaderValidatePageHeader()
sets errmsg_buf, but in the future other code or function doing that
may be added. For that case, the following change seems more elegant.
Thought?

 * shouldn't be a big deal from a performance point of view.
 */
if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
-   {
-   /* reset any error XLogReaderValidatePageHeader() might have set */
-   xlogreader->errormsg_buf[0] = '\0';
goto next_record_is_invalid;
-   }
 
 	return readLen;
 
@@ -12517,7 +12513,17 @@ next_record_is_invalid:
 
 	/* In standby-mode, keep trying */

if (StandbyMode)
+   {
+   if (xlogreader->errormsg_buf[0])
+   {
+   ereport(emode_for_corrupt_record(emode, EndRecPtr),
+   (errmsg_internal("%s", xlogreader->errormsg_buf)));
+
+   /* reset any error XLogReaderValidatePageHeader() might have set */
+   xlogreader->errormsg_buf[0] = '\0';
+   }
goto retry;
+   }
else
return -1;
 }
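The report-then-reset pattern sketched in the diff generalizes to any buffer-based error channel. Here is a self-contained sketch: MiniReader stands in for the xlogreader state, and copying the message into an output buffer stands in for the actual ereport(emode_for_corrupt_record(...)) call.

```c
#include <stdio.h>
#include <string.h>

#define ERRBUF_LEN 128

typedef struct
{
    char errormsg_buf[ERRBUF_LEN];  /* set by validation, like xlogreader's */
} MiniReader;

/* Validation failure leaves a message in the buffer, the way
 * XLogReaderValidatePageHeader() does. */
static int
validate_page(MiniReader *r, int page_is_good)
{
    if (!page_is_good)
    {
        snprintf(r->errormsg_buf, ERRBUF_LEN, "invalid page header");
        return 0;
    }
    return 1;
}

/* The proposed retry path: report whatever message a validator left,
 * then clear it so a later retry does not re-report a stale error. */
static int
report_and_reset(MiniReader *r, char *out, size_t outlen)
{
    if (r->errormsg_buf[0])
    {
        snprintf(out, outlen, "%s", r->errormsg_buf);
        r->errormsg_buf[0] = '\0';
        return 1;               /* something was reported */
    }
    return 0;
}
```

One nicety of centralizing the reset this way is that any future validator that writes into errormsg_buf gets reported through the same retry path for free.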

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: row filtering for logical replication

2021-09-01 Thread Amit Kapila
On Wed, Sep 1, 2021 at 8:29 PM Euler Taveira  wrote:
>
> On Wed, Sep 1, 2021, at 9:36 AM, Amit Kapila wrote:
>
> I think this or any other similar solution for row filters (on
> updates) won't work till we solve the problem reported by Hou-San [1].
> The main reason is that we don't have data for unchanged toast columns
> in WAL. For that, we have discussed some probable solutions in email
> [2], however, that also required us to solve one of the existing
> bugs[3].
>
> I didn't mention but I'm working on it in parallel.
>
> I agree with you that including TOAST values in the WAL is a possible solution
> for this issue. This is a popular request for wal2json [1][2][3] and I think
> other output plugins have the same request too. It is useful for CDC 
> solutions.
>
> I'm experimenting with 2 approaches: (i) always include unchanged TOAST values
> in the new tuple if a GUC is set and (ii) include unchanged TOAST values in the
> new tuple iff they weren't included in the old tuple.
>

In the second approach, we will always end up having unchanged toast
columns for non-key columns in the WAL which will be a significant
overhead, so not sure if that can be acceptable if we want to do it by
default.

> The advantage of the first option is
> that you fix the problem adjusting a parameter in your configuration file.
> However, the disadvantage is that, depending on your setup -- REPLICA IDENTITY
> FULL, you might have the same TOAST value for a single change twice in the 
> WAL.
> The second option solves the disadvantage of (i) but it only works if you have
> REPLICA IDENTITY FULL and Dilip's patch applied [4] (I expect to review it
> soon).
>

Thanks for offering the review of that patch. I think it will be good
to get it committed.

> In the output plugin, (i) requires a simple modification (remove
> restriction for unchanged TOAST values) but (ii) needs a more complex surgery.
>

I think if we get Dilip's patch, then we can have a rule for filter
columns such that they can contain only replica identity key columns.
This rule is anyway required for Deletes and we can have it for
Updates. At this stage, I haven't checked what it takes to implement
such a solution, but it would be worth investigating.
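The rule contemplated here (filter columns restricted to the replica identity key) reduces to a subset check at DDL-validation time. A hedged sketch, under the assumption that column sets are represented as bitmasks (PostgreSQL would use a Bitmapset for this):

```c
#include <stdbool.h>

/* Column sets as bitmasks over attribute numbers -- an assumption for
 * this sketch, not the server's actual representation. */
typedef unsigned int ColSet;

/* A row filter for UPDATE/DELETE may only use columns that belong to
 * the replica identity key, since those are the only old-tuple values
 * guaranteed to be present in WAL. */
static bool
filter_cols_allowed(ColSet filter_cols, ColSet replica_identity_cols)
{
    /* every filter column must also be an identity column */
    return (filter_cols & ~replica_identity_cols) == 0;
}
```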

-- 
With Regards,
Amit Kapila.




Re: corruption of WAL page header is never reported

2021-09-01 Thread Kyotaro Horiguchi
At Thu, 2 Sep 2021 12:19:25 +0900, Fujii Masao  
wrote in 
> 
> 
> On 2021/07/19 18:52, Yugo NAGATA wrote:
> > Well, I think my first patch was not wrong. The difference with the
> > latest
> > patch is just whether we perform the additional check when we are not
> > in
> > standby mode or not. The behavior is basically the same although which
> > function
> > detects and prints the page-header error in cases of crash recovery is
> > different.
> 
> Yes, so which patch do you think is better? I like your version
> because there seems no reason why XLogPageRead() should skip
> XLogReaderValidatePageHeader() when not in standby mode.

Did you read the comment just above?

xlog.c:12523
>* Check the page header immediately, so that we can retry immediately 
> if
>* it's not valid. This may seem unnecessary, because XLogReadRecord()
>* validates the page header anyway, and would propagate the failure up 
> to
>* ReadRecord(), which would retry. However, there's a corner case with
>* continuation records, if a record is split across two pages such that

So when not in standby mode, the same check is performed by xlogreader,
which has the responsibility to validate the binary data read by
XLogPageRead. The page-header validation here is a compromise to handle
a specific corner case.

> Also I'm tempted to move ereport() and reset of errmsg_buf to
> under next_record_is_invalid as follows. That is, in standby mode
> whenever we find an invalid record and retry reading WAL page
> in XLogPageRead(), we report the error message and reset it.
> For now in XLogPageRead(), only XLogReaderValidatePageHeader()
> sets errmsg_buf, but in the future other code or function doing that
> may be added. For that case, the following change seems more elegant.
> Thought?

I don't think it is a good choice to conflate read failure and
header-validation failure, from the viewpoint of modularity.

>* shouldn't be a big deal from a performance point of view.
>*/
>   if (!XLogReaderValidatePageHeader(xlogreader, targetPagePtr, readBuf))
> - {
> - /* reset any error XLogReaderValidatePageHeader() might have set */
> - xlogreader->errormsg_buf[0] = '\0';
>   goto next_record_is_invalid;
> - }
>   return readLen;
>  @@ -12517,7 +12513,17 @@ next_record_is_invalid:
>   /* In standby-mode, keep trying */
>   if (StandbyMode)
> + {
> + if (xlogreader->errormsg_buf[0])
> + {
> + ereport(emode_for_corrupt_record(emode, EndRecPtr),
> + (errmsg_internal("%s", xlogreader->errormsg_buf)));
> +
> + /* reset any error XLogReaderValidatePageHeader() might have set */
> + xlogreader->errormsg_buf[0] = '\0';
> + }
>   goto retry;
> + }
>   else
>   return -1;
>  }

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



