Re: POC: Cleaning up orphaned files using undo logs

2022-01-12 Thread Julien Rouhaud
Hi,

On Thu, Nov 25, 2021 at 10:00 PM Antonin Houska  wrote:
>
> So it should be ok if the temporary undo is managed and discarded by
> individual backends. Patch 0005 of the new series tries to do that.

The cfbot reports that at least the 0001 patch doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3228.log
> === applying patch ./undo-20211125/0001-Add-SmgrId-to-smgropen-and-BufferTag.patch
> [...]
> patching file src/bin/pg_waldump/pg_waldump.c
> Hunk #1 succeeded at 480 (offset 17 lines).
> Hunk #2 FAILED at 500.
> Hunk #3 FAILED at 531.
> 2 out of 3 hunks FAILED -- saving rejects to file src/bin/pg_waldump/pg_waldump.c.rej

Could you send a rebased version?  In the meantime I'll switch the cf
entry to Waiting on Author.




Re: DELETE CASCADE

2022-01-12 Thread Julien Rouhaud
Hi,

On Wed, Sep 29, 2021 at 03:55:22PM -0500, David Christensen wrote:
> 
> I can see the argument for this in terms of being cautious/explicit about
> what gets removed; however, the utility in this particular form was related
> to being able to *avoid* having to manually figure out the relationship
> chains and the specific constraints involved.  Might there be some sort of
> middle ground here?
> [...]
> > I think we could do something like extending the syntax to be
> >
> > SET CONSTRAINTS conname [ON tablename] [,...] new_properties
> 
> This part seems reasonable.  I need to look at how the existing SET
> CONSTRAINTS is implemented; it would be interesting to see how the
> per-table/session settings are managed, as that would be illustrative
> for additional transient state like this.

The cfbot reports that this patch doesn't apply anymore:
http://cfbot.cputube.org/patch_36_3195.log

> patching file src/backend/utils/adt/ri_triggers.c
> Hunk #1 succeeded at 93 (offset 3 lines).
> Hunk #2 FAILED at 181.
> Hunk #3 succeeded at 556 (offset 5 lines).
> Hunk #4 succeeded at 581 (offset 5 lines).
> Hunk #5 succeeded at 755 (offset 5 lines).
> Hunk #6 succeeded at 776 (offset 5 lines).
> 1 out of 6 hunks FAILED -- saving rejects to file src/backend/utils/adt/ri_triggers.c.rej

Are you currently working on a possibly different approach and/or grammar?  If
not, could you send a rebased patch?  In the meantime I will switch the cf
entry to Waiting on Author.




Re: row filtering for logical replication

2022-01-12 Thread Amit Kapila
On Wed, Jan 12, 2022 at 3:00 AM Alvaro Herrera  wrote:
>
> I just looked at 0002 because of Justin Pryzby's comment in the column
> filtering thread, and realized that the pgoutput row filtering has a
> very strange API, which receives both heap tuples and slots; and we seem
> to convert to and from slots in seemingly unprincipled ways.  I don't
> think this is going to fly.  I think it's OK for the initial entry into
> pgoutput to be HeapTuple (but only because that's what
> ReorderBufferTupleBuf has), but it should be converted to a slot right when
> it enters pgoutput, and then used as a slot throughout.
>

Another thing that we can improve about 0002 is to unify the APIs
for row filtering for update and insert/delete. I find having separate
APIs a bit awkward.
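
For illustration, a minimal sketch of the slot conversion being suggested,
using the standard executor helpers. This is not the patch's actual code:
the surrounding pgoutput plumbing and the "relation" and "tuple" variables
are assumed.

#include "executor/tuptable.h"
#include "utils/rel.h"

/*
 * Build a heap-tuple slot once at the pgoutput entry point and store the
 * incoming HeapTuple into it, so the row filter and everything downstream
 * deal only in slots.
 */
TupleTableSlot *slot = MakeSingleTupleTableSlot(RelationGetDescr(relation),
                                                &TTSOpsHeapTuple);

ExecStoreHeapTuple(tuple, slot, false); /* false: slot does not own the tuple */

/* ... evaluate the row filter and send the change using "slot" only ... */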

-- 
With Regards,
Amit Kapila.




Re: libpq compression (part 2)

2022-01-12 Thread Andrey Borodin



> On 8 Jan 2022, at 01:56, Stephen Frost wrote:
>> 
>> It's discussed in last year's thread.  The thinking is that there tend to be
>> *fewer* exploitable opportunities between application->DB than between
>> browser->app.
> 
> Yes, this was discussed previously and addressed.

What else do we need to decide architecturally to make protocol compression
happen in 15? As far as I can see, only the HBA/GUC part.

Best regards, Andrey Borodin.





Re: [PATCH] Disable bgworkers during servers start in pg_upgrade

2022-01-12 Thread Julien Rouhaud
On Sat, Aug 28, 2021 at 10:40:42AM +0900, Michael Paquier wrote:
> On Fri, Aug 27, 2021 at 12:28:42PM -0700, Andres Freund wrote:
> > Isn't that just going to end up with extension code erroring out and/or
> > blocking waiting for a bgworker to start?
> 
> Well, that's the point to block things during an upgrade.  Do you have
> a list of requirements you'd like to see satisfied here?  POWA would
> be fine with blocking the execution of bgworkers AFAIK (Julien feel
> free to correct me here if necessary).  It could be possible that
> preventing extension code from executing this way could prevent hazards as
> well.  The idea from upthread to prevent any writes and/or WAL
> activity is not really different as extensions may still generate an
> error because of pg_upgrade's safety measures we'd put in, no?

This thread is now almost one year old, and AFAICT there's still no consensus
on how to fix this problem.  It would be good to have something done in pg15,
ideally backpatched, as this is a corruption hazard that triggered at least
once already.

Andres, do you still have an objection to either preventing bgworker
registration/launch or WAL writes during the impacted pg_upgrade steps, or do
you have a better alternative to fix the problem?




Re: small bug in ecpg unicode identifier error handling

2022-01-12 Thread Peter Eisentraut

On 10.01.22 14:14, Peter Eisentraut wrote:

I think this patch is necessary:

diff --git a/src/interfaces/ecpg/preproc/pgc.l b/src/interfaces/ecpg/preproc/pgc.l
index 07fee80a9c..3529b2ea86 100644
--- a/src/interfaces/ecpg/preproc/pgc.l
+++ b/src/interfaces/ecpg/preproc/pgc.l
@@ -753,7 +753,7 @@ cppline		{space}*#([^i][A-Za-z]*|{if}|{ifdef}|{ifndef}|{import})((\/\*[^*/]*\*+
 				}
 {dquote}		{
 					BEGIN(state_before_str_start);
-					if (literallen == 2) /* "U&" */
+					if (literallen == 0)
 						mmerror(PARSE_ERROR, ET_ERROR, "zero-length delimited identifier");
 					/* The backend will truncate the identifier here.  We do not as it does not change the result. */

 					base_yylval.str = psprintf("U&\"%s\"", literalbuf);

The old code doesn't make sense.  The literallen is the length of the
data in literalbuf, which clearly doesn't include the "U&" as the
comment suggests.

A test case is to preprocess a file like this (ecpg test.pgc):

exec sql select u&"

which currently does *not* give the above error, but it should.


Committed.

For the record, the correct test case was actually

exec sql select u&"";




Re: 2022-01 Commitfest

2022-01-12 Thread Amit Kapila
On Wed, Jan 12, 2022 at 11:11 AM Julien Rouhaud  wrote:
>
> The January commitfest should have started almost two weeks ago, but given
> that nothing happened until now I think it's safe to assume that either
> everyone forgot or no one wanted to volunteer.
>
> I'm therefore volunteering to manage this commitfest,
>

Thanks!

-- 
With Regards,
Amit Kapila.




Re: dynamic result sets support in extended query protocol

2022-01-12 Thread Julien Rouhaud
Hi,

On Mon, Aug 30, 2021 at 02:11:34PM -0700, Zhihong Yu wrote:
> On Mon, Aug 30, 2021 at 1:23 PM Peter Eisentraut
> <peter.eisentr...@enterprisedb.com> wrote:
> 
> > rebased patch set
> 
> +WITH RETURN
> +WITHOUT RETURN
> +
> + 
> +  This option is only valid for cursors defined inside a procedure.
> 
> Since there are two options listed, I think using 'These options are' would
> be better.
> 
> For CurrentProcedure(),
> 
> +   return InvalidOid;
> +   else
> +   return llast_oid(procedure_stack);
> 
> The word 'else' can be omitted.

The cfbot reports that the patch doesn't apply anymore:
http://cfbot.cputube.org/patch_36_2911.log.

Since you mentioned that this patch depends on the SHOW_ALL_RESULTS psql patch
which is still being worked on, I'm not expecting much activity here until the
prerequirements are done.  It also seems better to mark this patch as Waiting
on Author as further reviews are probably not really needed for now.




Re: sepgsql logging

2022-01-12 Thread Dave Page
On Tue, Jan 11, 2022 at 5:55 PM Tom Lane  wrote:

> Andrew Dunstan  writes:
> > I am not that person either. I agree this looks reasonable, but I also
> > would like the opinion of an expert, if we have one.
>
> I'm not sure we do anymore.  Anyway, I tried this on Fedora 35 and
> confirmed that it compiles and the (very tedious) test process
> described in the sepgsql docs still passes.  Looking in the system's
> logs, it appears that Dave didn't precisely emulate how SELinux
> logs this setting, because I see messages like
>
> Jan  4 12:25:46 nuc1 audit[1754]: AVC avc:  denied  { setgid } for
> pid=1754 comm="sss_cache" capability=6
> scontext=unconfined_u:unconfined_r:useradd_t:s0-s0:c0.c1023
> tcontext=unconfined_u:unconfined_r:useradd_t:s0-s0:c0.c1023
> tclass=capability permissive=0
>
> So it looks like their plan is to unconditionally write "permissive=0"
> or "permissive=1", while Dave's patch just prints nothing in enforcing
> mode.  While I can see some virtue in brevity, I think that doing
> exactly what SELinux does is probably a better choice.  For one thing,
> it'd remove doubt about whether one is looking at a log from a sepgsql
> version that logs this or one that doesn't.
>
> Other than that nitpick, I think we should just push this.
>

Here's an update that adds the "permissive=0" case.

-- 
Dave Page
Blog: https://pgsnake.blogspot.com
Twitter: @pgsnake

EDB: https://www.enterprisedb.com


sepgsql_permissive_logging_v2.diff
Description: Binary data


Re: Error "initial slot snapshot too large" in create replication slot

2022-01-12 Thread Julien Rouhaud
Hi,

On Tue, Nov 02, 2021 at 04:40:39PM +0530, Dilip Kumar wrote:
> 
> I have fixed this, the updated patch is attached.

The cfbot reports that this patch doesn't compile:
https://cirrus-ci.com/task/564273490432?logs=build

[03:01:24.477] snapbuild.c: In function ‘SnapBuildInitialSnapshot’:
[03:01:24.477] snapbuild.c:587:2: error: ‘newsubxcnt’ undeclared (first use in this function); did you mean ‘newsubxip’?
[03:01:24.477]   587 |  newsubxcnt = 0;
[03:01:24.477]   |  ^~
[03:01:24.477]   |  newsubxip
[03:01:24.477] snapbuild.c:587:2: note: each undeclared identifier is reported only once for each function it appears in
[03:01:24.477] snapbuild.c:535:8: warning: unused variable ‘maxxidcnt’ [-Wunused-variable]
[03:01:24.477]   535 |  int   maxxidcnt;
[03:01:24.477]   |^

Could you send a new version?  In the meantime I will switch the patch to
Waiting on Author.




Stream replication test fails on cfbot/Windows Server 2019

2022-01-12 Thread Pavel Borisov
Hi, hackers!

I've noticed that on my branch with amcheck improvements, the cfbot run on
Windows Server 2019 fails the stream replication test.
https://cirrus-ci.com/task/5353503093686272

I don't see any relation between this and the changes in my patch.
Furthermore, it also fails on another CF branch:
https://cirrus-ci.com/task/4599128897355776

Is this a known cfbot problem? Do I need to do something to my amcheck CF
branch mentioned above for it to become green on cfbot eventually?

-- 
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Re: Stream replication test fails on cfbot/Windows Server 2019

2022-01-12 Thread Michail Nikolaev
Hello.

Looks like logical replication is also affected:

[08:26:35.599] # poll_query_until timed out executing this query:
[08:26:35.599] # SELECT count(1) = 0 FROM pg_subscription_rel WHERE
srsubstate NOT IN ('r', 's');
[08:26:35.599] # expecting this output:
[08:26:35.599] # t
[08:26:35.599] # last actual query output:
[08:26:35.599] # f

https://cirrus-ci.com/task/6532060239101952
https://cirrus-ci.com/task/471606276096

Best regards,
Michail.




Re: Logical replication timeout problem

2022-01-12 Thread Amit Kapila
On Tue, Jan 11, 2022 at 8:13 PM Fabrice Chapuis 
wrote:

> Can you explain why you think this will help in solving your current
> problem?
>
> Indeed you are right, this function won't help; we have to look elsewhere.
>
> It is still not clear to me why the problem happened? IIUC, after
> restoring 4096 changes from snap files, we send them to the subscriber, and
> then the apply worker should apply those one by one. Now, is it taking one
> minute to restore 4096 changes due to which apply worker is timed out?
>
> Now I can easily reproduce the problem.
> In a first phase, snap files are generated and stored in pg_replslot. This
> process ends when 1420 files are present in pg_replslot (this is in relation
> to statements that must be replayed from WAL). In the pg_stat_replication
> view, the state field is set to *catchup*.
> In a 2nd phase, the snap files must be decoded. However, after one minute
> (wal_receiver_timeout parameter set to 1 minute) the worker process stops
> with a timeout.
>
>
What exactly do you mean by the first and second phase in the above
description?

-- 
With Regards,
Amit Kapila.


Re: Stream replication test fails on cfbot/Windows Server 2019

2022-01-12 Thread Julien Rouhaud
On Wed, Jan 12, 2022 at 01:51:24PM +0300, Michail Nikolaev wrote:
> Hello.
> 
> Looks like logical replication is also affected:
> 
> [08:26:35.599] # poll_query_until timed out executing this query:
> [08:26:35.599] # SELECT count(1) = 0 FROM pg_subscription_rel WHERE
> srsubstate NOT IN ('r', 's');
> [08:26:35.599] # expecting this output:
> [08:26:35.599] # t
> [08:26:35.599] # last actual query output:
> [08:26:35.599] # f
> 
> https://cirrus-ci.com/task/6532060239101952
> https://cirrus-ci.com/task/471606276096

Indeed, and yet CI on postgres tree doesn't exhibit any problem:
https://cirrus-ci.com/github/postgres/postgres




Re: Column Filtering in Logical Replication

2022-01-12 Thread Peter Eisentraut

On 12.01.22 01:41, Alvaro Herrera wrote:

Therefore, I propose to add an index on pg_publication_rel.prpubid.


That seems very reasonable.




Re: Improve error handling of HMAC computations and SCRAM

2022-01-12 Thread Michael Paquier
On Wed, Jan 12, 2022 at 12:56:17PM +0900, Michael Paquier wrote:
> Attached is a rebased patch for the HMAC portions, with a couple of
> fixes I noticed while going through this stuff again (mostly around
> SASLprep and pg_fe_scram_build_secret), and a fix for a conflict
> coming from 9cb5518.  psql's \password is wrong to assume that the
only error that can happen for scram-sha-256 is an OOM, but we'll get
> there.

With an attachment, that's even better.  (Thanks, Daniel.)
--
Michael
From a6bcfefa9a8dd98bdc6f0e105f7b55dc8739c49e Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 12 Jan 2022 12:46:27 +0900
Subject: [PATCH v2] Improve HMAC error handling

---
 src/include/common/hmac.h|  1 +
 src/include/common/scram-common.h| 14 +++--
 src/backend/libpq/auth-scram.c   | 22 ---
 src/common/hmac.c| 64 
 src/common/hmac_openssl.c| 90 
 src/common/scram-common.c| 47 +++
 src/interfaces/libpq/fe-auth-scram.c | 67 +++--
 src/interfaces/libpq/fe-auth.c   | 10 ++--
 src/interfaces/libpq/fe-auth.h   |  3 +-
 9 files changed, 269 insertions(+), 49 deletions(-)

diff --git a/src/include/common/hmac.h b/src/include/common/hmac.h
index cf7aa17be4..c18783fe11 100644
--- a/src/include/common/hmac.h
+++ b/src/include/common/hmac.h
@@ -25,5 +25,6 @@ extern int	pg_hmac_init(pg_hmac_ctx *ctx, const uint8 *key, size_t len);
 extern int	pg_hmac_update(pg_hmac_ctx *ctx, const uint8 *data, size_t len);
 extern int	pg_hmac_final(pg_hmac_ctx *ctx, uint8 *dest, size_t len);
 extern void pg_hmac_free(pg_hmac_ctx *ctx);
+extern const char *pg_hmac_error(pg_hmac_ctx *ctx);
 
 #endif			/* PG_HMAC_H */
diff --git a/src/include/common/scram-common.h b/src/include/common/scram-common.h
index d53b4fa7f5..d1f840c11c 100644
--- a/src/include/common/scram-common.h
+++ b/src/include/common/scram-common.h
@@ -47,12 +47,16 @@
 #define SCRAM_DEFAULT_ITERATIONS	4096
 
 extern int	scram_SaltedPassword(const char *password, const char *salt,
- int saltlen, int iterations, uint8 *result);
-extern int	scram_H(const uint8 *str, int len, uint8 *result);
-extern int	scram_ClientKey(const uint8 *salted_password, uint8 *result);
-extern int	scram_ServerKey(const uint8 *salted_password, uint8 *result);
+ int saltlen, int iterations, uint8 *result,
+ const char **errstr);
+extern int	scram_H(const uint8 *str, int len, uint8 *result,
+	const char **errstr);
+extern int	scram_ClientKey(const uint8 *salted_password, uint8 *result,
+			const char **errstr);
+extern int	scram_ServerKey(const uint8 *salted_password, uint8 *result,
+			const char **errstr);
 
 extern char *scram_build_secret(const char *salt, int saltlen, int iterations,
-const char *password);
+const char *password, const char **errstr);
 
 #endif			/* SCRAM_COMMON_H */
diff --git a/src/backend/libpq/auth-scram.c b/src/backend/libpq/auth-scram.c
index 7c9dee70ce..ee7f52218a 100644
--- a/src/backend/libpq/auth-scram.c
+++ b/src/backend/libpq/auth-scram.c
@@ -465,6 +465,7 @@ pg_be_scram_build_secret(const char *password)
 	pg_saslprep_rc rc;
 	char		saltbuf[SCRAM_DEFAULT_SALT_LEN];
 	char	   *result;
+	const char *errstr = NULL;
 
 	/*
 	 * Normalize the password with SASLprep.  If that doesn't work, because
@@ -482,7 +483,8 @@ pg_be_scram_build_secret(const char *password)
  errmsg("could not generate random salt")));
 
 	result = scram_build_secret(saltbuf, SCRAM_DEFAULT_SALT_LEN,
-SCRAM_DEFAULT_ITERATIONS, password);
+SCRAM_DEFAULT_ITERATIONS, password,
+&errstr);
 
 	if (prep_password)
 		pfree(prep_password);
@@ -509,6 +511,7 @@ scram_verify_plain_password(const char *username, const char *password,
 	uint8		computed_key[SCRAM_KEY_LEN];
 	char	   *prep_password;
 	pg_saslprep_rc rc;
+	const char *errstr = NULL;
 
 	if (!parse_scram_secret(secret, &iterations, &encoded_salt,
 			stored_key, server_key))
@@ -539,10 +542,10 @@ scram_verify_plain_password(const char *username, const char *password,
 
 	/* Compute Server Key based on the user-supplied plaintext password */
 	if (scram_SaltedPassword(password, salt, saltlen, iterations,
-			 salted_password) < 0 ||
-		scram_ServerKey(salted_password, computed_key) < 0)
+			 salted_password, &errstr) < 0 ||
+		scram_ServerKey(salted_password, computed_key, &errstr) < 0)
 	{
-		elog(ERROR, "could not compute server key");
+		elog(ERROR, "could not compute server key: %s", errstr);
 	}
 
 	if (prep_password)
@@ -1113,6 +1116,7 @@ verify_client_proof(scram_state *state)
 	uint8		client_StoredKey[SCRAM_KEY_LEN];
 	pg_hmac_ctx *ctx = pg_hmac_create(PG_SHA256);
 	int			i;
+	const char *errstr = NULL;
 
 	/*
 	 * Calculate ClientSignature.  Note that we don't log directly a failure
@@ -1133,7 +1137,8 @@ verify_client_proof(scram_state *state)
 	   strlen(state->client_final_message_without_proof)

RE: Failed transaction statistics to measure the logical replication progress

2022-01-12 Thread osumi.takami...@fujitsu.com
On Thursday, December 23, 2021 6:37 PM Wang, Wei/王 威  
wrote:
> On Wednesday, December 22, 2021 10:30 PM osumi.takami...@fujitsu.com
>  wrote:
> > On Wednesday, December 22, 2021 8:38 PM I wrote:
> > > Do we expect these commit counts which come from empty transactions ?
> > This is another issue discussed in [1] where the patch in the thread
> > is a work in progress, I think.
> > ..
> > IMHO, the conclusion is we are currently in the middle of fixing the 
> > behavior.
> 
> Thank you for telling me this.
> After applying v19-* and
> v15-0001-Skip-empty-transactions-for-logical-replication.patch,
> I retested v19-* patches. The result of previous case looks good to me.
> 
> But the results of the following cases are also similar to the previous
> unexpected result, which increases commit_count or abort_count unexpectedly.
> [1]
> (Based on environment in the previous example, set TWO_PHASE=true)
> [Publisher] begin; insert into replica_test1 values(1,'1'); prepare 
> transaction
> 'id'; commit prepared 'id';
> 
> In subscriber side, the commit_count of two records(sub1 and sub2) is
> increased.
> 
> [2]
> (Based on environment in the previous example, set STREAMING=on)
> [Publisher] begin; INSERT INTO replica_test1 SELECT i, md5(i::text) FROM
> generate_series(1, 5000) s(i); commit;
> 
> In subscriber side, the commit_count of two records(sub1 and sub2) is
> increased.
> 
> [3]
> (Based on environment in the previous example, set TWO_PHASE=true)
> [Publisher] begin; insert into replica_test1 values(1,'1'); prepare 
> transaction
> 'id'; rollback prepared 'id';
> 
> In subscriber side, the abort_count of two records(sub1 and sub2) is
> increased.
> 
> I think the problem may be that the patch you mentioned
> (Skip-empty-transactions-for-logical-replication.patch) is not finished yet.
> Sharing this information here.
Hi, thank you for your report.

Yes. As the patch's commit message mentions, the patch doesn't
cover streaming and two-phase transactions.

In the above reply, I said that this was an independent issue
and that we were in the middle of modifying the behavior,
but empty transactions turned out to be harmful enough for this feature.
As far as I searched, current logical replication sends
every transaction, even those unrelated to the publication specification.
It means that in a common scenario where some tables are
replicated but others are not, the subscription statistics will
be buried by updates from the empty transactions for the latter,
which greatly damages this patch's value.

Therefore, I included a description of this in the documentation,
as you reported.

The attached v21 has a couple of other minor updates
like a modification of error message text.


Best Regards,
Takamichi Osumi



v21-0001-Fix-alignments-of-TAP-tests-for-pg_stat_subscrip.patch
Description:  v21-0001-Fix-alignments-of-TAP-tests-for-pg_stat_subscrip.patch


v21-0002-Extend-pg_stat_subscription_workers-to-include-g.patch
Description:  v21-0002-Extend-pg_stat_subscription_workers-to-include-g.patch


Re: Add 64-bit XIDs into PostgreSQL 15

2022-01-12 Thread Simon Riggs
On Sat, 8 Jan 2022 at 08:21, Maxim Orlov  wrote:
>>
>>Perhaps we can merge some of the code cleanup that it contained, such as 
>> using XID_FMT everywhere and creating a type for the kind of page returned 
>> by TransactionIdToPage() to make the code cleaner.
>
>
> Agree, I think this is a good idea.

Looks to me like the best next actions would be:

1. Submit a patch that uses XID_FMT everywhere, as a cosmetic change.
This looks like it will reduce the main patch size considerably and
make it much less scary. That can be cleaned up and committed while we
discuss the main approach.

2. Write up the approach in a detailed README, so people can
understand the proposal and assess if there are problems. A few short
notes and a link back to old conversations isn't enough to allow wide
review and give confidence on such a major patch.

-- 
Simon Riggs                http://www.EnterpriseDB.com/




Re: Add 64-bit XIDs into PostgreSQL 15

2022-01-12 Thread Finnerty, Jim
Re: patch that uses XID_FMT everywhere ... to make the main patch much smaller

That's exactly what my previous patch did, plus the patch to support 64-bit 
GUCs.

Maxim, maybe it's still a good idea to isolate those two patches and submit 
them separately first, to reduce the size of the rest of the patch?





Re: Postgres Replication from windows to linux

2022-01-12 Thread Peter Eisentraut

On 10.01.22 19:56, rajesh singarapu wrote:
I am trying to migrate my postgres to linux, as we are moving away from 
windows.
I am trying both dump/restore  and logical decoding, but people are not 
happy with performance.
Is there a way/tooling I can use around WAL shipping/physical 
replication here ?


Cross-platform physical replication is always risky.  It might work, but 
there is no easy way to find out whether everything is ok afterwards.


Aside from the issue of possible storage format differences, I would 
also be worried about locale differences affecting text sort order.





Re: Add 64-bit XIDs into PostgreSQL 15

2022-01-12 Thread Pavel Borisov
>
> Maxim, maybe it's still a good idea to isolate those two patches and
> submit them separately first, to reduce the size of the rest of the patch?
>


> Looks to me like the best next actions would be:
>
> 1. Submit a patch that uses XID_FMT everywhere, as a cosmetic change.
> This looks like it will reduce the main patch size considerably and
> make it much less scary. That can be cleaned up and committed while we
> discuss the main approach.
>
> 2. Write up the approach in a detailed README, so people can
> understand the proposal and assess if there are problems. A few short
> notes and a link back to old conversations isn't enough to allow wide
> review and give confidence on such a major patch.
>

Big thanks to all for your ideas!

We intend to do the following work on the patch soon:
1. Write a detailed README
2. Split the patch into several pieces including a separate part for
XID_FMT. But if committers meanwhile choose to commit Jim's XID_FMT patch
we also appreciate this and will rebase our patch accordingly.
 2A. Probably refactor it to store precalculated XMIN/XMAX in the in-memory
tuple representation instead of t_xid_base/t_multi_base
 2B. Split the in-memory part of the patch as a separate patch
3. Construct some variants for leaving "double xmax" format as a temporary
one just after upgrade for having only one persistent on-disk format
instead of two.
 3A. By using SQL function "vacuum doublexmax;"
OR
 3B. By freeing space on all heap pages for pd_special before
pg_upgrade.
OR
 3C. By automatically repacking all "double xmax" pages after upgrade
(with a priority specified by common vacuum-related GUCs)
4. Intentionally prohibit starting a new transaction with an XID difference of
more than 2^32 from the oldest currently running one. This is to encourage
some DBA action to clean up defunct transactions, but it is not binding:
they can wait if they consider these old transactions not defunct.
5. Investigate and add a solution for archs without 64-bit atomic values.
   5A. Provide XID 8-byte alignment for systems where 64-bit atomics are
provided for 8-byte aligned values.
   5B. Wrap XID reading in PG atomic locks for the remaining 32-bit ones
(they are expected to be rare).

--
Best regards,
Pavel Borisov

Postgres Professional: http://postgrespro.com 


Re: Skipping logical replication transactions on subscriber side

2022-01-12 Thread vignesh C
On Wed, Jan 12, 2022 at 11:32 AM Masahiko Sawada  wrote:
>
> On Wed, Jan 12, 2022 at 12:21 PM Amit Kapila  wrote:
> >
> > On Wed, Jan 12, 2022 at 5:49 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Tue, Jan 11, 2022 at 7:08 PM Amit Kapila  
> > > wrote:
> > > >
> > > > On Tue, Jan 11, 2022 at 1:51 PM Masahiko Sawada  
> > > > wrote:
> > > > >
> > > > > On second thought, the same is true for other cases, for example,
> > > > > preparing the transaction and clearing skip_xid while handling a
> > > > > prepare message. That is, currently we don't clear skip_xid while
> > > > > handling a prepare message but do that while handling commit/rollback
> > > > > prepared message, in order to avoid the worst case. If we do both
> > > > > while handling a prepare message and the server crashes between them,
> > > > > it ends up that skip_xid is cleared and the transaction will be
> > > > > resent, which is identical to the worst-case above.
> > > > >
> > > >
> > > > How are you thinking to update the skip xid before prepare? If we do
> > > > it in the same transaction then the changes in the catalog will be
> > > > part of the prepared xact but won't be committed. Now, say if we do it
> > > > after prepare, then the situation won't be the same because after
> > > > restart the same xact won't appear again.
> > >
> > > I was thinking to commit the catalog change first in a separate
> > > transaction while not updating origin LSN and then prepare an empty
> > > transaction while updating origin LSN.
> > >
> >
> > But, won't it complicate the handling if in the future we try to
> > enhance this API such that it skips partial changes like skipping only
> > for particular relation(s) or particular operations as discussed
> > previously in this thread?
>
> Right. I was thinking that if we accept the situation that the user
> has to set skip_xid again in case of the server crashes, we might be
> able to accept also the situation that the user has to clear skip_xid
> in a case of the server crashes. But it seems the former is less
> problematic.
>
> I've attached an updated patch that incorporated all comments I got so far.

Thanks for the updated patch; a few comments:
1) Currently skip_xid is not displayed when describing subscriptions; can
we include it too:
\dRs+ sub1
                                                         List of subscriptions
 Name |  Owner  | Enabled | Publication | Binary | Streaming | Two phase commit | Synchronous commit |            Conninfo
------+---------+---------+-------------+--------+-----------+------------------+--------------------+--------------------------------
 sub1 | vignesh | t       | {pub1}      | f      | f         | e                | off                | dbname=postgres host=localhost
(1 row)

2) This import "use PostgreSQL::Test::Utils;" is not required:
+# Tests for skipping logical replication transactions.
+use strict;
+use warnings;
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More tests => 6;

3) Some of the comments use a punctuation mark and some of them do not.
Should we keep them consistent:
+# Wait for worker error
+$node_subscriber->poll_query_until(
+   'postgres',

+# Set skip xid
+$node_subscriber->safe_psql(
+   'postgres',

+# Create publisher node.
+my $node_publisher = PostgreSQL::Test::Cluster->new('publisher');
+$node_publisher->init(allows_streaming => 'logical');


+# Create subscriber node.
+my $node_subscriber = PostgreSQL::Test::Cluster->new('subscriber');

4) Should this be changed:
+ * True if we are skipping all data modification changes (INSERT,
UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid.  Once we
start skipping
+ * changes, we don't stop it until the we skip all changes of the
transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid
gets changed or
to:
+ * True if we are skipping all data modification changes (INSERT,
UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid.  Once we
start skipping
+ * changes, we don't stop it until we skip all changes of the transaction even
+ * if pg_subscription is updated that and MySubscription->skipxid
gets changed or

In "stop it until the we skip all changes", here the is not required.

Regards,
Vignesh




Re: libpq compression (part 2)

2022-01-12 Thread Justin Pryzby
zlib still causes check-world to get stuck.  I first mentioned this last March:
20210319062800.gi11...@telsasoft.com

Actually all the compression methods seem to get stuck with:
time make check -C src/bin/pg_rewind
time make check -C src/test/isolation

For CI purposes, there should be an 0003 patch which enables compression by
default, for all message types and maybe all lengths.

I removed the thread from the CFBOT until that's resolved.

-- 
Justin




Re: generic plans and "initial" pruning

2022-01-12 Thread Amit Langote
Thanks for taking the time to look at this.

On Wed, Jan 12, 2022 at 1:22 AM Robert Haas  wrote:
> On Fri, Dec 24, 2021 at 10:36 PM Amit Langote  wrote:
> > However, using an idea that Robert suggested to me off-list a little
> > while back, it seems possible to determine the set of partitions that
> > we can safely skip locking.  The idea is to look at the "initial" or
> > "pre-execution" pruning instructions contained in a given Append or
> > MergeAppend node when AcquireExecutorLocks() is collecting the
> > relations to lock and consider relations from only those sub-nodes
> > that survive performing those instructions.   I've attempted
> > implementing that idea in the attached patch.
>
> Hmm. The first question that occurs to me is whether this is fully safe.
>
> Currently, AcquireExecutorLocks calls LockRelationOid for every
> relation involved in the query. That means we will probably lock at
> least one relation on which we previously had no lock and thus
> AcceptInvalidationMessages(). That will end up marking the query as no
> longer valid and CheckCachedPlan() will realize this and tell the
> caller to replan. In the corner case where we already hold all the
> required locks, we will not accept invalidation messages at this
> point, but must have done so after acquiring the last of the locks
> required, and if that didn't mark the plan invalid, it can't be
> invalid now either. Either way, everything is fine.
>
> With the proposed patch, we might never lock some of the relations
> involved in the query. Therefore, if one of those relations has been
> modified in some way that would invalidate the plan, we will
> potentially fail to discover this, and will use the plan anyway.  For
> instance, suppose there's one particular partition that has an extra
> index and the plan involves an Index Scan using that index. Now
> suppose that the scan of the partition in question is pruned, but
> meanwhile, the index has been dropped. Now we're running a plan that
> scans a nonexistent index. Admittedly, we're not running that part of
> the plan. But is that enough for this to be safe? There are things
> (like EXPLAIN or auto_explain) that we might try to do even on a part
> of the plan tree that we don't try to run. Those things might break,
> because for example we won't be able to look up the name of an index
> in the catalogs for EXPLAIN output if the index is gone.
>
> This is just a relatively simple example and I think there are
> probably a bunch of others. There are a lot of kinds of DDL that could
> be performed on a partition that gets pruned away: DROP INDEX is just
> one example. The point is that to my knowledge we have no existing
> case where we try to use a plan that might be only partly valid, so if
> we introduce one, there's some risk there. I thought for a while, too,
> about whether changes to some object in a part of the plan that we're
> not executing could break things for the rest of the plan even if we
> never do anything with the plan but execute it. I can't quite see any
> actual hazard. For example, I thought about whether we might try to
> get the tuple descriptor for the pruned-away object and get a
> different tuple descriptor than we were expecting. I think we can't,
> because (1) the pruned object has to be a partition, and tuple
> descriptors have to match throughout the partitioning hierarchy,
> except for column ordering, which currently can't be changed
> after-the-fact and (2) IIRC, the tuple descriptor is stored in the
> plan and not reconstructed at runtime and (3) if we don't end up
> opening the relation because it's pruned, then we certainly can't do
> anything with its tuple descriptor. But it might be worth giving more
> thought to the question of whether there's any other way we could be
> depending on the details of an object that ended up getting pruned.
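
(For reference, the current behavior described above amounts to the loop
below, a simplified sketch of plancache.c's AcquireExecutorLocks rather than
the exact code.)

#include "nodes/parsenodes.h"
#include "nodes/plannodes.h"
#include "storage/lmgr.h"

/*
 * Simplified sketch: lock every relation in the plan's range table.  The
 * first lock actually acquired triggers AcceptInvalidationMessages(),
 * which is what lets a concurrently modified plan get flagged for
 * replanning.
 */
static void
acquire_executor_locks_sketch(PlannedStmt *plannedstmt)
{
    ListCell   *lc;

    foreach(lc, plannedstmt->rtable)
    {
        RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc);

        if (rte->rtekind != RTE_RELATION)
            continue;
        LockRelationOid(rte->relid, rte->rellockmode);
    }
}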

I have pondered on the possible hazards before writing the patch,
mainly because the concerns about a previously discussed proposal were
along similar lines [1].

IIUC, you're saying the plan tree is subject to inspection by non-core
code before ExecutorStart() has initialized a PlanState tree, which
must have discarded pruned portions of the plan tree.  I wouldn't
claim to have scanned *all* of the core code that could possibly
access the invalidated portions of the plan tree, but from what I have
seen, I couldn't find any site that does.  An ExecutorStart_hook()
gets to do that, but from what I can see it is expected to call
standard_ExecutorStart() before doing its thing and supposedly only
looks at the PlanState tree, which must be valid.  Actually, EXPLAIN
also does ExecutorStart() before starting to look at the plan (the
PlanState tree), so must not run into pruned plan tree nodes.  All
that said, it does sound like wishful thinking to say that no problems
can possibly occur.

At first, I had tried to implement this such that the
Append/MergeAppend nodes are edited to record the result of initial
pruni

Re: do only critical work during single-user vacuum?

2022-01-12 Thread John Naylor
On Tue, Jan 11, 2022 at 8:57 PM Peter Geoghegan  wrote:
> On Tue, Jan 11, 2022 at 4:59 PM John Naylor wrote:
> > For the PoC I wanted to try re-using existing keywords. I went with
> > "VACUUM LIMIT" since LIMIT is already a keyword that cannot be used as
> > a table name. It also brings "wraparound limit" to mind. We could add
> > a single-use unreserved keyword (such as VACUUM_MINIMAL or
> > VACUUM_FAST), but that doesn't seem great.
>
> This seems reasonable, but you could add a new option instead, without
> much downside. While INDEX_CLEANUP kind of looks like a keyword, it
> isn't really a keyword. (Perhaps you knew this already.)
>
> Making this a new option is a little awkward, admittedly. It's not
> clear what it means to "VACUUM (LIMIT) my_table" -- do you just throw
> an error for stuff like that? So perhaps your approach of adding
> VacuumMinimalStmt (a minimal variant of the VACUUM command) is better.

We'd also have to do some checks to either ignore other options or
throw an error, which seems undesirable for code maintenance. For that
reason, I prefer the separate top-level statement, but I'm open to
bike-shedding on the actual syntax. I also briefly looked into a SQL
function, but the transaction management would make that more
difficult.

> > I'm not sure what the right trade-off is, but as written I used 95% of
> > max age. It might be undesirable to end up so close to kicking off
> > uninterruptible vacuums, but the point is to get out of single-user
> > mode and back to streaming WAL as quickly as possible. It might also
> > be worth overriding the min ages as well, but haven't done so here.
>
> I wonder if we should keep autovacuum_freeze_max_age out of it -- its
> default is too conservative in general. I'm concerned that applying
> this autovacuum_freeze_max_age test during VACUUM LIMIT doesn't go far
> enough -- it may require VACUUM LIMIT to do significantly more work
> than is needed to get the system back online (while leaving a sensible
> amount of headroom). Also seems like it might be a good idea to avoid
> relying on the user configuration, given that VACUUM LIMIT is only run
> when everything is already in disarray. (Besides, it's not clear that
> it's okay to use the autovacuum_freeze_max_age GUC without also using
> the reloption of the same name.)
>
> What do you think of applying a similar test using a generic 1 billion
> XID (and 1 billion MXID) age cutoff?

I like that a lot, actually. It's simple and insulates us from
wondering about corner cases in configuration.

> The GetNewTransactionId() WARNINGs ought to be changed to reference
> VACUUM LIMIT. (You probably just didn't get around to that in this
> POC, but couldn't hurt to remind you.)

I'll do that as well as documentation after we have agreement (or at
least lack of objection) on the syntax.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: do only critical work during single-user vacuum?

2022-01-12 Thread John Naylor
On Tue, Jan 11, 2022 at 9:20 PM Justin Pryzby  wrote:
>
> On Tue, Jan 11, 2022 at 07:58:56PM -0500, John Naylor wrote:
> > + // FIXME: also check reloption
> > + // WIP: 95% is a starting point for discussion
> > + if ((table_xid_age < autovacuum_freeze_max_age * 0.95) ||
> > + (table_mxid_age < autovacuum_multixact_freeze_max_age 
> > * 0.95))
> > + continue;
>
> Should be &&

Thanks! Will fix.
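
(For clarity, the corrected fragment, with the same names and the WIP 0.95
headroom as the quoted PoC: with ||, a table that is old on one axis but
young on the other would wrongly be skipped; with && a table is skipped
only when it is comfortably young on both.)

if (table_xid_age < autovacuum_freeze_max_age * 0.95 &&
    table_mxid_age < autovacuum_multixact_freeze_max_age * 0.95)
    continue;       /* young enough on both axes, so skip this table */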

> Should this emergency vacuum "order by age() DESC"?

That would add complexity and only save a marginal amount of time.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: 2022-01 Commitfest

2022-01-12 Thread Magnus Hagander
On Wed, Jan 12, 2022 at 6:42 AM Julien Rouhaud  wrote:
>
> Note that I don't have admin permissions on the cf app, so I'd be glad if
> someone could grant it!

Granted!


-- 
 Magnus Hagander
 Me: https://www.hagander.net/
 Work: https://www.redpill-linpro.com/




Re: 2022-01 Commitfest

2022-01-12 Thread Julien Rouhaud
On Wed, Jan 12, 2022 at 04:16:36PM +0100, Magnus Hagander wrote:
> On Wed, Jan 12, 2022 at 6:42 AM Julien Rouhaud  wrote:
> >
> > Note that I don't have admin permissions on the cf app, so I'd be glad if
> > someone could grant it!
> 
> Granted!

Thanks Magnus!




Add non-blocking version of PQcancel

2022-01-12 Thread Jelte Fennema
The existing PQcancel API is using blocking IO. This makes PQcancel
impossible to use in an event loop based codebase, without blocking the
event loop until the call returns.
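
(For contrast, a minimal sketch of the existing blocking API using the
long-standing PQgetCancel/PQcancel/PQfreeCancel calls: PQcancel() opens its
own socket and blocks until the cancel request has been sent, which is
exactly what stalls an event loop. "conn" is an assumed established
connection.)

#include <stdio.h>
#include <libpq-fe.h>

/* Existing blocking cancellation: safe to call from a signal handler,
 * but it blocks the calling thread while the request is sent. */
static void
cancel_blocking(PGconn *conn)
{
    char        errbuf[256];
    PGcancel   *cancel = PQgetCancel(conn);

    if (cancel == NULL)
        return;
    if (!PQcancel(cancel, errbuf, sizeof(errbuf)))      /* blocks here */
        fprintf(stderr, "cancel failed: %s\n", errbuf);
    PQfreeCancel(cancel);
}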

This patch adds a new cancellation API to libpq which is called
PQcancelConnectionStart. This API can be used to send cancellations in a
non-blocking fashion. To do this it internally uses regular PGconn
connection establishment. This has the downside that
PQcancelConnectionStart cannot be safely called from a signal handler.

Luckily, this should be fine for most usages of this API, since most
code that's using an event loop handles signals in that event loop as
well (as opposed to calling functions from the signal handler directly).

There are also a few advantages of this approach:
1. No need to add and maintain a second non-blocking connection
   establishment codepath.
2. Cancel connections benefit automatically from any improvements made
   to the normal connection establishment codepath. Examples of things
   that it currently gets for free are TLS support and
   keepalive settings.

This patch also includes a test for this new API (and also the already
existing cancellation APIs). The test can be easily run like this:

cd src/test/modules/libpq_pipeline
make && ./libpq_pipeline cancel

NOTE: I have not tested this with GSS for the moment. My expectation is
that using this new API with a GSS connection will result in a
CONNECTION_BAD status when calling PQcancelStatus. The reason for this
is that GSS reads will also need to communicate back that an EOF was
found, just like I've done for TLS reads and unencrypted reads, since in
the case of a cancel connection an EOF is actually expected and should not
be treated as an error.

0001-Add-non-blocking-version-of-PQcancel.patch
Description: 0001-Add-non-blocking-version-of-PQcancel.patch


cpluspluscheck failure

2022-01-12 Thread Andrew Dunstan


While testing a buildfarm module to automate running headerscheck and
cpluspluscheck, I encountered a bunch of errors like this:

Jan 12 09:35:57 In file included from
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/port/atomics.h:70,
Jan 12 09:35:57  from
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/storage/lwlock.h:21,
Jan 12 09:35:57  from
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/storage/lock.h:23,
Jan 12 09:35:57  from
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/storage/proc.h:21,
Jan 12 09:35:57  from
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/storage/shm_mq.h:18,
Jan 12 09:35:57  from
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/libpq/pqmq.h:17,
Jan 12 09:35:57  from /tmp/cpluspluscheck.16q7jo/test.cpp:3:
Jan 12 09:35:57
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/port/atomics/arch-x86.h:
In function ‘bool pg_atomic_test_set_flag_impl(volatile pg_atomic_flag*)’:
Jan 12 09:35:57
/home/andrew/bf/root/HEAD/pgsql.build/../pgsql/src/include/port/atomics/arch-x86.h:143:23:
warning: ISO C++17 does not allow ‘register’ storage class specifier
[-Wregister]
Jan 12 09:35:57   143 | register char _res = 1;
Jan 12 09:35:57   |   ^~~~


there are similar complaints about s_lock.h.


The compiler is: g++ (GCC) 11.2.1 20210728 (Red Hat 11.2.1-1)


Do we need to add -Wno-register to the CXXFLAGS?


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: do only critical work during single-user vacuum?

2022-01-12 Thread John Naylor
On Wed, Jan 12, 2022 at 1:49 AM Masahiko Sawada  wrote:
> It seems to me that adding new syntax instead of a new option is less
> flexible. In the future, for instance, when we support parallel heap
> scan for VACUUM, we may want to add a parallel-related option to both
> VACUUM statement and VACUUM LIMIT statement. VACUUM LIMIT statement
> would end up becoming like VACUUM statement?

This is intended for single-user mode, so parallelism is not relevant.

> As another idea, we might be able to add a new option that takes an
> optional integer value, like VACUUM (MIN_XID), VACUUM (MIN_MXID), and
> VACUUM (MIN_XID 50). We vacuum only tables whose age is older than
> the given value. If the value is omitted, we vacuum only tables whose
> age exceeds a threshold (say autovacuum_freeze_max_age * 0.95), which
> can be used in an emergency case and output in GetNewTransactionID()
> WARNINGs output. vacuumdb’s --min-xid-age and --min-mxid-age can use
> this option instead of fetching the list of tables from the server.

That could work, and maybe also have general purpose, but I see a few
problems if I understand you correctly:

- If we have a default threshold when the values are omitted, that
implies we need to special-case single-user mode with non-obvious
behavior, which is not ideal, as Andres mentioned upthread. (Or, now
manual VACUUM by default would not do anything, except in extreme
cases, which is worse.)

- In the single-user case, the admin would still need to add
INDEX_CLEANUP = off for minimum downtime, and it should be really
simple.

- For the general case, we would now have the ability to vacuum a
table, and possibly have no effect at all. That seems out of place
with the other options.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: [EXTERNAL] Re: PQcancel does not use tcp_user_timeout, connect_timeout and keepalive settings

2022-01-12 Thread Jelte Fennema
Thanks for all the cleanup and the addition of Windows support. To me it now
looks good to merge.

Meanwhile I've created another patch that adds a non-blocking version of
PQcancel to libpq, which doesn't have this problem by design, because it
simply reuses the normal code for connection establishment. It also
includes tests for PQcancel itself.



trigger example for plsample

2022-01-12 Thread Mark Wong
Hi everyone,

I've adapted the work that Konstantina did for pl/julia as part of her
GSoC project to add an example of handling triggers to plsample, which
was based on pl/tcl and pl/perl.

One aspect that I'm not sure about is whether the example should duplicate
code (as it does now) to keep the example contained within a single
function.  The only reason I can come up with is to let the reader go
through the example with minimal jumping around.

Hoping this is a good start.

Regards,
Mark
diff --git a/src/test/modules/plsample/expected/plsample.out b/src/test/modules/plsample/expected/plsample.out
index a0c318b6df..832db79b5c 100644
--- a/src/test/modules/plsample/expected/plsample.out
+++ b/src/test/modules/plsample/expected/plsample.out
@@ -34,3 +34,280 @@ NOTICE:  argument: 0; name: a1; value: {foo,bar,hoge}
  
 (1 row)
 
+CREATE FUNCTION my_trigger_func() RETURNS trigger AS $$
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+$$ language plsample;
+CREATE TABLE my_table (num integer, description text);
+CREATE TRIGGER my_trigger_func BEFORE INSERT OR UPDATE ON my_table
+   FOR EACH ROW EXECUTE FUNCTION my_trigger_func();
+CREATE TRIGGER my_trigger_func2 AFTER INSERT OR UPDATE ON my_table
+   FOR EACH ROW EXECUTE FUNCTION my_trigger_func(8);
+INSERT INTO my_table (num, description)
+VALUES (1, 'first'), (2, 'second'), (1, 'first');
+NOTICE:  source text of function "my_trigger_func": 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  compile trigger function: my_trigger_func(TD_name, TD_relid, TD_table_name, TD_table_schema, TD_event, TD_when, TD_level, TD_NEW, TD_OLD, args) 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  trigger name: my_trigger_func
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered BEFORE
+NOTICE:  triggered per row
+NOTICE:  out
+NOTICE:  source text of function "my_trigger_func": 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  compile trigger function: my_trigger_func(TD_name, TD_relid, TD_table_name, TD_table_schema, TD_event, TD_when, TD_level, TD_NEW, TD_OLD, args) 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  trigger name: my_trigger_func
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered BEFORE
+NOTICE:  triggered per row
+NOTICE:  out
+NOTICE:  source text of function "my_trigger_func": 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  compile trigger function: my_trigger_func(TD_name, TD_relid, TD_table_name, TD_table_schema, TD_event, TD_when, TD_level, TD_NEW, TD_OLD, args) 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  trigger name: my_trigger_func
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered BEFORE
+NOTICE:  triggered per row
+NOTICE:  out
+NOTICE:  source text of function "my_trigger_func": 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  compile trigger function: my_trigger_func(TD_name, TD_relid, TD_table_name, TD_table_schema, TD_event, TD_when, TD_level, TD_NEW, TD_OLD, args) 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  trigger name: my_trigger_func2
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered AFTER
+NOTICE:  triggered per row
+NOTICE:  trigger arg[0]: 8
+NOTICE:  out
+NOTICE:  source text of function "my_trigger_func": 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  compile trigger function: my_trigger_func(TD_name, TD_relid, TD_table_name, TD_table_schema, TD_event, TD_when, TD_level, TD_NEW, TD_OLD, args) 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+else
+return "OK"
+end
+
+NOTICE:  trigger name: my_trigger_func2
+NOTICE:  trigger relation: my_table
+NOTICE:  trigger relation schema: public
+NOTICE:  triggered by INSERT
+NOTICE:  triggered AFTER
+NOTICE:  triggered per row
+NOTICE:  trigger arg[0]: 8
+NOTICE:  out
+NOTICE:  source text of function "my_trigger_func": 
+if TD_event == "INSERT"
+return TD_NEW
+elseif TD_event == "UPDATE"
+return TD_NEW
+

Re: do only critical work during single-user vacuum?

2022-01-12 Thread Bossart, Nathan
On 1/12/22, 7:43 AM, "John Naylor"  wrote:
> On Wed, Jan 12, 2022 at 1:49 AM Masahiko Sawada  wrote:
>> As another idea, we might be able to add a new option that takes an
>> optional integer value, like VACUUM (MIN_XID), VACUUM (MIN_MXID), and
>> VACUUM (MIN_XID 50). We vacuum only tables whose age is older than
>> the given value. If the value is omitted, we vacuum only tables whose
>> age exceeds a threshold (say autovacuum_freeze_max_age * 0.95), which
>> can be used in an emergency case and output in GetNewTransactionID()
>> WARNINGs output. vacuumdb’s --min-xid-age and --min-mxid-age can use
>> this option instead of fetching the list of tables from the server.
>
> That could work, and maybe also have general purpose, but I see two
> problems if I understand you correctly:
>
> - If we have a default threshold when the values are omitted, that
> implies we need to special-case single-user mode with non-obvious
> behavior, which is not ideal, as Andres mentioned upthread. (Or, now
> manual VACUUM by default would not do anything, except in extreme
> cases, which is worse.)

I agree, I don't think such options should have a default value.

> - In the single-user case, the admin would still need to add
> INDEX_CLEANUP = off for minimum downtime, and it should be really
> simple.
>
> - For the general case, we would now have the ability to vacuum a
> table, and possibly have no effect at all. That seems out of place
> with the other options.

Perhaps a message could be emitted when tables are specified but
skipped due to the min-xid-age option.

As I've stated upthread, Sawada-san's suggested approach was my
initial reaction to this thread.  I'm not wedded to the idea of adding
new options, but I think there are a couple of advantages.  For both
single-user mode and normal operation (which may be in imminent
wraparound danger), you could use the same command:

VACUUM (MIN_XID_AGE 16, ...);

(As an aside, we'd need to figure out how XID and MXID options would
work together.  Presumably most users would want to OR them.)

This doesn't really tie in super nicely with the failsafe mechanism,
but adding something like a FAILSAFE option doesn't seem right to me,
as it's basically just an alias for a bunch of other options.  In my
mind, even a new top-level command would just be an alias for the
aforementioned command.  Of course, providing a new option is not
quite as simple as opening up single-user mode and typing "BAIL OUT,"
but I don't know if it is prohibitively complicated for end users.
They'll already have had to figure out how to start single-user mode
in the first place, and we can have nice ERROR/WARNING messages that
provide a suggested VACUUM command.

The other advantage I see with age-related options is that it can be
useful for non-imminent-wraparound situations as well.  For example,
maybe a user just wants to manually vacuum everything (including
indexes) with an age above 500M on the weekends.

Another idea is to do both.  We could add age-related options, and we
could also add a "BAIL OUT" command that is just an alias for a
special VACUUM command that we feel will help get things under control
as quickly as possible.

Nathan



is ErrorResponse possible on Sync?

2022-01-12 Thread Andrei Matei
Hello Postgres friends,

I've got a question about the wire protocol; the relevant text in the docs
seems a bit ambiguous to me. If the processing of a Sync message fails
(e.g. because the commit of the current transaction fails), is the backend
allowed to respond with an ErrorResponse, in addition to the ReadyForQuery
message? Or, does the backend swallow the error, and return only the
ReadyForQuery (I hope not).

The docs say:
"""
At completion of each series of extended-query messages, the frontend
should issue a Sync message. This parameterless message causes the backend
to close the current transaction if it's not inside a BEGIN/COMMIT
transaction block (“close” meaning to commit if no error, or roll back if
error). Then a ReadyForQuery response is issued. The purpose of Sync is to
provide a resynchronization point for error recovery. When an error is
detected while processing any extended-query message, the backend issues
ErrorResponse, then reads and discards messages until a Sync is reached,
then issues ReadyForQuery and returns to normal message processing. (But
note that no skipping occurs if an error is detected while processing Sync
— this ensures that there is one and only one ReadyForQuery sent for each
Sync.)
"""

This paragraph acknowledges that an error can be "detected" while
processing a Sync, but one reading of it might suggest that the only
response to a Sync is a single ReadyForQuery message.

Thanks!

- Andrei


Re: is ErrorResponse possible on Sync?

2022-01-12 Thread Vladimir Sitnikov
> Or, does the backend swallow the error, and return only the ReadyForQuery
> (I hope not).

What is your backend version?

Here's a well-known case when the backend did swallow the error:
"Error on failed COMMIT"
https://www.postgresql.org/message-id/b9fb50dc-0f6e-15fb-6555-8ddb86f4aa71%40postgresfriends.org

I don't remember if the behavior has been fixed or not.
The expected behavior was "commit" returned "rollback" status without any
error.

Vladimir


Re: is ErrorResponse possible on Sync?

2022-01-12 Thread Tom Lane
Andrei Matei  writes:
> I've got a question about the wire protocol; the relevant text in the docs
> seems a bit ambiguous to me. If the processing of a Sync message fails
> (e.g. because the commit of the current transaction fails), is the backend
> allowed to respond with an ErrorResponse, in addition to the ReadyForQuery
> message? Or, does the backend swallow the error, and return only the
> ReadyForQuery (I hope not).

Uh ... I don't think Sync itself can fail.  Any ErrorResponse you see
there is really from failure of some prior command.  The Sync is really
delimiting how much stuff you'd like to skip in case of a failure.
Basically this is to allow pipelining of commands, with the ability to
discard later commands if an earlier one fails.

But in any case, no, Sync would not suppress an error message if
one is needed.

regards, tom lane




Re: is ErrorResponse possible on Sync?

2022-01-12 Thread Andrei Matei
Thanks!

I work on CockroachDB - which is wire-compatible with Postgres - so I'm
interested in what the server can and cannot do.


> Uh ... I don't think Sync itself can fail.  Any ErrorResponse you see
> there is really from failure of some prior command.


Hmm, this got me curious. If Sync itself cannot fail, then what is this
sentence really saying: "This parameterless message (ed. Sync) causes the
backend to close the current transaction if it's not inside a BEGIN/COMMIT
transaction block (“close” meaning to commit if no error, or roll back if
error)." ?
This seems to say that, outside of BEGIN/END, the transaction is committed
at Sync time (i.e. if the Sync is never sent, nothing is committed).
Presumably, committing a transaction can fail even if no
previous command/statement failed, right?



>   The Sync is really
> delimiting how much stuff you'd like to skip in case of a failure.
> Basically this is to allow pipelining of commands, with the ability to
> discard later commands if an earlier one fails.
>
> But in any case, no, Sync would not suppress an error message if
> one is needed.
>
> regards, tom lane
>


Re: generic plans and "initial" pruning

2022-01-12 Thread Robert Haas
On Wed, Jan 12, 2022 at 9:32 AM Amit Langote  wrote:
> I have pondered on the possible hazards before writing the patch,
> mainly because the concerns about a previously discussed proposal were
> along similar lines [1].

True. I think that the hazards are narrower with this proposal,
because if you *delay* locking a partition that you eventually need,
then you might end up trying to actually execute a portion of the plan
that's no longer valid. That seems like hopelessly bad news. On the
other hand, with this proposal, you skip locking altogether, but only
for parts of the plan that you don't plan to execute. That's still
kind of scary, but not to nearly the same degree.

> IIUC, you're saying the plan tree is subject to inspection by non-core
> code before ExecutorStart() has initialized a PlanState tree, which
> must have discarded pruned portions of the plan tree.  I wouldn't
> claim to have scanned *all* of the core code that could possibly
> access the invalidated portions of the plan tree, but from what I have
> seen, I couldn't find any site that does.  An ExecutorStart_hook()
> gets to do that, but from what I can see it is expected to call
> standard_ExecutorStart() before doing its thing and supposedly only
> looks at the PlanState tree, which must be valid.  Actually, EXPLAIN
> also does ExecutorStart() before starting to look at the plan (the
> PlanState tree), so must not run into pruned plan tree nodes.  All
> that said, it does sound like wishful thinking to say that no problems
> can possibly occur.

Yeah. I don't think it's only non-core code we need to worry about
either. What if I just do EXPLAIN ANALYZE on a prepared query that
ends up pruning away some stuff? IIRC, the pruned subplans are not
shown, so we might escape disaster here, but FWIW if I'd committed
that code I would have pushed hard for showing those and saying "(not
executed)", so it's not too crazy to imagine a world in which things
work that way.

> At first, I had tried to implement this such that the
> Append/MergeAppend nodes are edited to record the result of initial
> pruning, but it felt wrong to be munging the plan tree in plancache.c.

It is. You can't munge the plan tree: it's required to be strictly
read-only once generated. It can be serialized and deserialized for
transmission to workers, and it can be shared across executions.

> Or, maybe this won't be a concern if performing ExecutorStart() is
> made a part of CheckCachedPlan() somehow, which would then take locks
> on the relation as the PlanState tree is built capturing any plan
> invalidations, instead of AcquireExecutorLocks(). That does sound like
> an ambitious undertaking though.

On the surface that would seem to involve abstraction violations, but
maybe that could be finessed somehow. The plancache shouldn't know too
much about what the executor is going to do with the plan, but it
could ask the executor to perform a step that has been designed for
use by the plancache. I guess the core problem here is how to pass
around information that is node-specific before we've stood up the
executor state tree. Maybe the executor could have a function that
does the pruning and returns some kind of array of results that can be
used both to decide what to lock and also what to consider as pruned
at the start of execution. (I'm hand-waving about the details because
I don't know.)
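To make that concrete, here is a very rough C sketch of what such an
interface might look like. Every name below is invented for
illustration; nothing like this exists in the tree today:

    /* Hypothetical: pruning results for one Append/MergeAppend node. */
    typedef struct ExecPruneSummary
    {
        int         plan_node_id;   /* plan node these results belong to */
        Bitmapset  *validsubplans;  /* indexes of surviving subplans */
    } ExecPruneSummary;

    /*
     * Hypothetical: run only the "initial" pruning steps of a cached plan,
     * without building a PlanState tree.  The plancache could use the
     * returned list (of ExecPruneSummary) to lock just the surviving
     * relations, and the executor could reuse it at ExecutorStart() time.
     */
    extern List *ExecDoInitialPruning(PlannedStmt *stmt, ParamListInfo params);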

> Yeah, the premise of the patch is that "initial" pruning steps produce
> the same result both times.  I assumed that would be true because the
> pruning steps are not allowed to contain any VOLATILE expressions.
> Regarding the possibility that IMMUTABLE labeling of functions may be
> incorrect, I haven't considered if the runtime pruning code can cope
> or whether it should try to.  If such a case does occur in practice,
> the bad outcome would be an Assert failure in
> ExecGetRangeTableRelation() or using a partition unlocked in the
> non-assert builds, the latter of which feels especially bad.

Right. I think it's OK for a query to produce wrong answers under
those kinds of conditions - the user has broken everything and gets to
keep all the pieces - but doing stuff that might violate fundamental
assumptions of the system like "relations can only be accessed when
holding a lock on them" feels quite bad. It's not a stretch to imagine
that failing to follow those invariants could take the whole system
down, which is clearly too severe a consequence for the user's failure
to label things properly.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Time to drop plpython2?

2022-01-12 Thread Robert Haas
On Wed, Jan 12, 2022 at 2:39 AM Peter Eisentraut
 wrote:
> Well, the minimum supported version has always been the oldest version
> that actually works.  I don't think we ever said, we support >= X, even
> though < X still actually works, about any dependency.

I think that we sometimes say that versions < X are unsupported if we
are unable to test whether or not they work. In other words, I think
the relevant question is whether we are able to demonstrate that it
works, not whether it actually does work.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: Time to drop plpython2?

2022-01-12 Thread Tom Lane
Peter Eisentraut  writes:
> On 11.01.22 17:06, Tom Lane wrote:
>> Nonetheless, we need to make a recommendation to the
>> buildfarm owners about what's the minimum python3 version we intend
>> to support going forward.

> Well, the minimum supported version has always been the oldest version 
> that actually works.  I don't think we ever said, we support >= X, even 
> though < X still actually works, about any dependency.

The concern I have is how do we know what "actually works", if we're
not testing it?  installation.sgml currently promises python2 >= 2.6,
and we know that that works because we have 2.6 in the buildfarm.
It also promises python3 >= 3.1, but we have no buildfarm animals
testing anything older than 3.4.3, so I don't think that promise
is worth the electrons it's written on.  Furthermore, if the meson
conversion forces people to update their python3 to something newer,
there will probably be no testing of plpython against anything older
than what meson requires.

> I don't care much to tie this to Meson right now.  Meson might well move 
> to 3.8 next week and ruin this whole scheme.

Wouldn't be a problem unless our build scripts require that newer
version of meson.  Andres mentioned earlier that we should be able
to run with some older meson versions that only require python 3.5
or so, so I'm hoping we can end up with directives like "use meson
X or later and python 3.5 or later".

> I'm okay with issuing some sort of recommendation for what is reasonable 
> to test, and 3.5 or 3.6 seems like a good cutoff, considering what LTS 
> OS currently ship.  But I'm not sure if that is the same as "support".

Well, I'll see about putting 3.5 on my dinosaurs, and hope I don't
have to do it over.

Anyway, getting back to the point: I think we should notify the
owners ASAP and set a 30-day deadline.  We should try to get this
done before the March CF starts, so it's too late for a 60-day
grace period.  In any case, the worst-case scenario for an owner
is to disable --with-python until they have time to do an upgrade,
so it doesn't seem like a month is a big problem.

regards, tom lane




Re: [PATCH] Proof of concept for GUC improvements

2022-01-12 Thread David Christensen
> Hi,
> 
> According to the cfbot, the patch doesn't apply anymore and needs a
> rebase: http://cfbot.cputube.org/patch_36_3290.log

V4 rebased attached. 



special-guc-values-v4.patch
Description: Binary data


Re: Column Filtering in Logical Replication

2022-01-12 Thread Alvaro Herrera
On 2022-Jan-11, Justin Pryzby wrote:

> Is there any coordination between the "column filter" patch and the "row
> filter" patch ?

Not beyond the grammar, which I tested.

> Are they both on track for PG15 ?

I think they're both on track, yes.

> Has anybody run them together ?

Not me.

> I have a suggestion: for the functions for which both patches are adding
> additional argument types, define a filtering structure for both patches to
> use.  Similar to what we did for some utility statements in a3dc92600.
> 
> I'm referring to:
> logicalrep_write_update()
> logicalrep_write_tuple()

Fixed: the row filter patch no longer adds extra arguments to those
functions.

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Tiene valor aquel que admite que es un cobarde" (Fernandel)




Re: sepgsql logging

2022-01-12 Thread Tom Lane
Dave Page  writes:
> On Tue, Jan 11, 2022 at 5:55 PM Tom Lane  wrote:
>> So it looks like their plan is to unconditionally write "permissive=0"
>> or "permissive=1", while Dave's patch just prints nothing in enforcing
>> mode.  While I can see some virtue in brevity, I think that doing
>> exactly what SELinux does is probably a better choice.  For one thing,
>> it'd remove doubt about whether one is looking at a log from a sepgsql
>> version that logs this or one that doesn't.

> Here's an update that adds the "permissive=0" case.

You forgot to update the expected-results files :-(.
Done and pushed.

regards, tom lane




Re: Add index scan progress to pg_stat_progress_vacuum

2022-01-12 Thread Bossart, Nathan
On 1/11/22, 11:46 PM, "Masahiko Sawada"  wrote:
> Regarding the new pg_stat_progress_vacuum_index view, why do we need
> to have a separate view? Users will have to check two views. If this
> view is expected to be used together with and joined to
> pg_stat_progress_vacuum, why don't we provide one view that has full
> information from the beginning? Especially, I think it's not useful
> that the total number of indexes to vacuum (num_indexes_to_vacuum
> column) and the current number of indexes that have been vacuumed
> (index_ordinal_position column) are shown in separate views.

I suppose we could add all of the new columns to
pg_stat_progress_vacuum and just set columns to NULL as appropriate.
But is that really better than having a separate view?

> Also, I’m not sure how useful index_tuples_removed is; what can we
> infer from this value (without a total number)?

I think the idea was that you can compare it against max_dead_tuples
and num_dead_tuples to get an estimate of the current cycle progress.
Otherwise, it just shows that progress is being made.
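For example, something along these lines (a sketch only: the
pg_stat_progress_vacuum_index view and its columns come from the
proposed patch, and I'm assuming the two views can be joined on pid):

    SELECT v.pid,
           i.index_tuples_removed::float8 / NULLIF(v.num_dead_tuples, 0)
               AS estimated_cycle_progress
    FROM pg_stat_progress_vacuum v
    JOIN pg_stat_progress_vacuum_index i USING (pid);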

Nathan




Windows vs recovery tests

2022-01-12 Thread Andrew Dunstan


For some considerable time the recovery tests have been at best flaky on
Windows, and at worst disastrous (i.e. they can hang rather than just
fail). It's a problem I worked around on my buildfarm animals by
disabling the tests, hoping to find time to get back to analysing the
problem. But now we are seeing failures on the cfbot too (e.g.
https://cirrus-ci.com/task/5860692694663168 and
https://cirrus-ci.com/task/5316745152954368 ) so I think we need to
spend some effort on finding out what's going on here.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: is ErrorResponse possible on Sync?

2022-01-12 Thread Alvaro Herrera
On 2022-Jan-12, Andrei Matei wrote:

> If Sync itself cannot fail, then what is this
> sentence really saying: "This parameterless message (ed. Sync) causes the
> backend to close the current transaction if it's not inside a BEGIN/COMMIT
> transaction block (“close” meaning to commit if no error, or roll back if
> error)." ?
> This seems to say that, outside of BEGIN/END, the transaction is committed
> at Sync time (i.e. if the Sync is never sent, nothing is committed).
> Presumably, committing a transaction can fail even if no
> previous command/statement failed, right?

A deferred trigger can cause a failure at COMMIT time for which no
previous error was reported.

alvherre=# create table t (a int unique deferrable initially deferred);
CREATE TABLE
alvherre=# insert into t values (1);
INSERT 0 1
alvherre=# begin;
BEGIN
alvherre=*# insert into t values (1);
INSERT 0 1
alvherre=*# commit;
ERROR:  duplicate key value violates unique constraint "t_a_key"
DETALLE:  Key (a)=(1) already exists.

I'm not sure if you can cause this to explode with just a Sync message, though.
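If it can, I'd expect the extended-protocol equivalent of the above to
look roughly like this (an untested sketch in the notation used
elsewhere in this thread; given the pre-existing row, the deferred
check fires when Sync commits the implicit transaction):

FE=> Parse(stmt="", query="INSERT INTO t VALUES(1)")
FE=> Bind(stmt="", portal="")
FE=> Execute(portal="")
FE=> Sync
BE<= ParseComplete, BindComplete, CommandComplete
BE<= ErrorResponse (duplicate key value violates unique constraint)
BE<= ReadyForQuery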

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: support for MERGE

2022-01-12 Thread Andrew Dunstan


On 1/5/22 02:53, Simon Riggs wrote:
> On Wed, 22 Dec 2021 at 11:35, Simon Riggs  
> wrote:
>> On Mon, 15 Nov 2021 at 22:45, Alvaro Herrera  wrote:
>>> On 2021-Nov-15, Alvaro Herrera wrote:
>>>
 Thanks everyone for the feedback.  I attach a version with the fixes
 that were submitted, as well as some additional changes:
>>> Attachment failure.
>> I rebased this, please check.
>>
>> And 2 patch-on-patches that resolve a few minor questions/docs wording.
> Here is another patch-on-patch to fix a syntax error in the MERGE docs.



The problem with patches like this is that they seriously screw up the
cfbot, which doesn't know about the previous patches you want this based
on top of. See 


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: SQL/JSON: functions

2022-01-12 Thread Andrew Dunstan


On 12/9/21 09:04, Himanshu Upadhyaya wrote:
>
>
>
> Few comments For 0002-SQL-JSON-constructors-v59.patch:
> 1)
> +       if (IsA(node, JsonConstructorExpr))
> +       {
> +               JsonConstructorExpr *ctor = (JsonConstructorExpr *) node;
> +               ListCell   *lc;
> +               bool            is_jsonb =
> +                       ctor->returning->format->format == JS_FORMAT_JSONB;
> +
> +               /* Check argument_type => json[b] conversions */
> +               foreach(lc, ctor->args)
> +               {
> +                       Oid                     typid = exprType(lfirst(lc));
> +
> +                       if (is_jsonb ?
> +                               !to_jsonb_is_immutable(typid) :
> +                               !to_json_is_immutable(typid))
> +                               return true;
> +               }
> +
> +               /* Check all subnodes */
> +       }
> can have ctor as const pointer?


I guess we could, although I'm not sure it's an important advance.


>
> 2)
> +typedef struct JsonFormat
> +{
> +       NodeTag         type;
> +       JsonFormatType format;          /* format type */
> +       JsonEncoding encoding;          /* JSON encoding */
> +       int             location;       /* token location, or -1 if unknown */
> +} JsonFormat;
>
> I think it will be good if we can have a JsonformatType(defined in
> patch 0001-Common-SQL-JSON-clauses-v59.patch) member named as
> format_type or formatType instead of format?
> There are places in the patch where we access it as "if
> (format->format == JS_FORMAT_DEFAULT)". "format->format" looks little
> difficult to understand.
> "format->format_type == JS_FORMAT_DEFAULT" will be easy to follow.


That's a fair criticism.



>
> 3)
> +               if (have_jsonb)
> +               {
> +                       returning->typid = JSONBOID;
> +                       returning->format->format = JS_FORMAT_JSONB;
> +               }
> +               else if (have_json)
> +               {
> +                       returning->typid = JSONOID;
> +                       returning->format->format = JS_FORMAT_JSON;
> +               }
> +               else
> +               {
> +                       /* XXX TEXT is default by the standard, but we return JSON */
> +                       returning->typid = JSONOID;
> +                       returning->format->format = JS_FORMAT_JSON;
> +               }
>
> why we need a separate "else if (have_json)" statement in the below
> code, "else" is also doing the same thing?



I imagine it's more or less a placeholder in case we want to do
something else in the default case. But I agree it's confusing.


>
> 4)
> -test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath
> +test: json jsonb json_encoding jsonpath jsonpath_encoding jsonb_jsonpath sqljson
>
> can we rename sqljson sql test file to json_constructor?


Not really - the later patches in the series add to it, so it ends up
being more than just constructor tests.


Thanks for reviewing,


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: Windows vs recovery tests

2022-01-12 Thread Tom Lane
Andrew Dunstan  writes:
> For some considerable time the recovery tests have been at best flaky on
> Windows, and at worst disastrous (i.e. they can hang rather than just
> fail).

How long is "some considerable time"?  I'm wondering if this isn't
the same issue under discussion in

https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com

regards, tom lane




Re: Windows vs recovery tests

2022-01-12 Thread Michail Nikolaev
Hello.

It could also be related -
https://www.postgresql.org/message-id/flat/20220112112425.pgzymqcgdy62e7m3%40jrouhaud#097b54a539ac3091ca4e4ed8ce9ab89c
(both Windows and Linux cases).

Best regards,
Michail.


Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-01-12 Thread Bossart, Nathan
On 12/31/21, 4:44 AM, "Bharath Rupireddy" 
 wrote:
> Currently the server is erroring out when unable to remove/parse a
> logical rewrite file in CheckPointLogicalRewriteHeap wasting the
> amount of work the checkpoint has done and preventing the checkpoint
> from finishing. This is unlike CheckPointSnapBuild does for snapshot
> files i.e. it just emits a message at LOG level and continues if it is
> unable to parse or remove the file. Attaching a small patch applying
> the same idea to the mapping files.

This seems reasonable to me.  AFAICT moving on to other files after an
error shouldn't cause any problems.  In fact, it's probably beneficial
to try to clean up as much as possible so that the files do not
continue to build up.

The only feedback I have for the patch is that I don't think the new
comments are necessary.

Nathan



Re: Windows vs recovery tests

2022-01-12 Thread Andrew Dunstan


On 1/12/22 16:15, Tom Lane wrote:
> Andrew Dunstan  writes:
>> For some considerable time the recovery tests have been at best flaky on
>> Windows, and at worst disastrous (i.e. they can hang rather than just
>> fail).
> How long is "some considerable time"?  I'm wondering if this isn't
> the same issue under discussion in
>
> https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BOeoETZQ%3DQw5Ub5h3tmwQhBmDA%3DnuNO3KG%3DzWfUypFAw%40mail.gmail.com
>
>   



many months - this isn't a new thing.


I'm going to set up a system where I run the test in a fairly tight loop
and see if I can find out more.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Re: A qsort template

2022-01-12 Thread John Naylor
On Mon, Jun 28, 2021 at 8:16 PM Thomas Munro  wrote:

[v4 patchset]

Hi Thomas,

(Sorry for the delay -- I have some time to put into this now.)

> The lower numbered patches are all things that are reused in many
> places, and in my humble opinion improve the notation and type safety
> and code deduplication generally when working with common types

I think 0001-0003 have had enough review previously to commit them, as
they are mostly notational. There's a small amount of bitrot, but not
enough to change the conclusions any. Also 0011 with the missing
#undef.

On Thu, Aug 5, 2021 at 7:18 PM Peter Geoghegan  wrote:
>
> If somebody wants to get a sense of what the size hit is from all of
> these specializations, I can recommend the diff feature of bloaty:
>
> https://github.com/google/bloaty/blob/master/doc/using.md#size-diffs
>
> Obviously you'd approach this by building postgres without the patch,
> and diffing that baseline to postgres with the patch. And possibly
> variations of the patch, with less or more sort specializations.

Thanks, that's a neat feature! For 0001-0003, the diff shows +700
bytes in memory, so pretty small:

$ bloaty -s vm src/backend/postgres -- src/backend/postgres.orig
    FILE SIZE        VM SIZE
 --------------  --------------
  +0.0%    +608  +0.0%    +608    .text
  +0.0%     +64  +0.0%     +64    .eh_frame
  +0.0%     +24  +0.0%     +24    .dynsym
  +0.0%     +14  +0.0%     +14    .dynstr
  +0.0%      +2  +0.0%      +2    .gnu.version
  +0.0%     +58  [ = ]       0    .debug_abbrev
  +0.1%     +48  [ = ]       0    .debug_aranges
  +0.0% +1.65Ki  [ = ]       0    .debug_info
  +0.0%    +942  [ = ]       0    .debug_line
  +0.1%     +26  [ = ]       0    .debug_line_str
  +0.0%    +333  [ = ]       0    .debug_loclists
  -0.0%     -23  [ = ]       0    .debug_rnglists
  +0.0%     +73  [ = ]       0    .debug_str
  -1.0%      -4  [ = ]       0    .shstrtab
  +0.0%     +20  [ = ]       0    .strtab
  +0.0%     +24  [ = ]       0    .symtab
  +131% +3.30Ki  [ = ]       0    [Unmapped]
  +0.0% +7.11Ki  +0.0%    +712    TOTAL

[back to Thomas]

> I tried to measure a speedup in vacuum, but so far I have not.  I did
> learn some things though:  While doing that with an uncorrelated index
> and a lot of deleted tuples, I found that adding more
> maintenance_work_mem doesn't help beyond a few MB, because then cache
> misses dominate to the point where it's not better than doing multiple
> passes (and this is familiar to me from work on hash joins).  If I
> turned on huge pages on Linux and set min_dynamic_shared_memory so
> that the parallel DSM used by vacuum lives in huge pages, then
> parallel vacuum with a large maintenance_work_mem starts to do much
> better than non-parallel vacuum by improving the TLB misses (as with
> hash joins).  I thought that was quite interesting!  Perhaps
> bsearch_itemptr might help with correlated indexes with a lot of
> deleted indexes (so not dominated by cache misses), though?
>
> (I wouldn't be suprised if someone comes up with a much better idea
> than bsearch for that anyway...  a few ideas have been suggested.)

That's interesting about the (un)correlated index having such a large
effect on cache hit rate! By now there has been some discussion and a
benchmark for dead tuple storage [1], but there doesn't seem to be
recent activity on that thread. We might consider adding the ItemPtr
comparator work I did in [2] for v15 if we don't have any of the other
proposals in place by feature freeze. My concern there is that the
speedups I observed occurred when the values were comfortably in L2
cache, IIRC. That would need wider testing.

That said, I think what I'll do next is test the v3-0001 standalone
patch with tuplesort specializations for more data types. I already
have a decent test script that I can build on for this. (this is the
one currently in CI)

Then, I want to do at least preliminary testing of the qsort boundary
parameters.

Those two things should be doable for this commitfest.

[1] 
https://www.postgresql.org/message-id/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
[2] 
https://www.postgresql.org/message-id/CAFBsxsG_c24CHKA3cWrOP1HynWGLOkLb8hyZfsD9db5g-ZPagA%40mail.gmail.com

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: PATCH: generate fractional cheapest paths in generate_orderedappend_path

2022-01-12 Thread Tomas Vondra

Pushed, after clarifying the comments a bit.

I also looked into what it would take to consider incremental paths, and
in principle it doesn't seem all that complicated. The attached PoC
patch extends get_cheapest_fractional_path_for_pathkeys() to optionally
build an incremental sort on paths if needed. There are two GUCs that
make experimenting simpler:


* enable_fractional_paths - disables fractional paths entirely, i.e. we 
get behavior without the part I already pushed


* enable_fractional_incremental_paths - disables the incremental sort 
part added by the attached patch
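
For example, after applying the PoC one can experiment like this (both
GUCs exist only in the attached patch, not in core):

    SET enable_fractional_paths = on;
    SET enable_fractional_incremental_paths = on;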


With this, I get this plan (see the test in partitionwise_join.sql)

test=# EXPLAIN (COSTS OFF)
test-# SELECT * FROM fract_t x LEFT JOIN fract_t y USING (id1, id2) 
ORDER BY id1 ASC, id2 ASC LIMIT 10;
                              QUERY PLAN
----------------------------------------------------------------------
 Limit
   ->  Merge Left Join
         Merge Cond: ((x.id1 = y.id1) AND (x.id2 = y.id2))
         ->  Append
               ->  Index Only Scan using fract_t0_id1_id2_idx on fract_t0 x_1
               ->  Incremental Sort
                     Sort Key: x_2.id1, x_2.id2
                     Presorted Key: x_2.id1
                     ->  Index Scan using fract_t1_pkey on fract_t1 x_2
         ->  Materialize
               ->  Append
                     ->  Incremental Sort
                           Sort Key: y_1.id1, y_1.id2
                           Presorted Key: y_1.id1
                           ->  Index Scan using fract_t0_pkey on fract_t0 y_1
                                 Index Cond: (id1 = id1)
                                 Filter: (id2 = id2)
                     ->  Incremental Sort
                           Sort Key: y_2.id1, y_2.id2
                           Presorted Key: y_2.id1
                           ->  Index Scan using fract_t1_pkey on fract_t1 y_2
                                 Index Cond: (id1 = id1)
                                 Filter: (id2 = id2)
(23 rows)

instead of

                              QUERY PLAN
----------------------------------------------------------------------
 Limit
   ->  Incremental Sort
         Sort Key: x.id1, x.id2
         Presorted Key: x.id1
         ->  Merge Left Join
               Merge Cond: (x.id1 = y.id1)
               Join Filter: (x.id2 = y.id2)
               ->  Append
                     ->  Index Scan using fract_t0_pkey on fract_t0 x_1
                     ->  Index Scan using fract_t1_pkey on fract_t1 x_2
               ->  Materialize
                     ->  Append
                           ->  Index Scan using fract_t0_pkey on fract_t0 y_1
                           ->  Index Scan using fract_t1_pkey on fract_t1 y_2
(14 rows)

i.e. the incremental sorts moved below the merge join (and the cost is 
lower, but that's not shown here).


So that seems reasonable, but there's a couple issues too:

1) Planning works (hence we can run explain), but execution fails
because of a segfault in CheckVarSlotCompatibility - it gets a NULL slot
for some reason. I haven't looked into the details, but I'd guess we
need to pass a different rel to create_incrementalsort_path, not
childrel.


2) enable_partitionwisejoin=on seems to cause some confusion, because it 
results in picking a different plan with higher cost. But that's clearly 
not because of this patch.


3) I'm still a bit skeptical about the cost of this implementation - it 
builds the incrementalsort path, just to throw most of the paths away. 
It'd be much better to just calculate the cost, which should be much 
cheaper, and only do the heavy work for the one "best" path.


4) One thing I did not realize before is what pathkeys we even consider 
here. Imagine you have two tables:


   CREATE TABLE t1 (a int, b int, primary key (a));
   CREATE TABLE t2 (a int, b int, primary key (a));

and query

   SELECT * FROM t1 JOIN t2 USING (a,b);

It seems reasonable to also consider pathkeys (a,b) because that makes
e.g. a mergejoin much cheaper, right? But no, we'll not do that - we only
consider pathkeys that at least one child relation has, so in this case 
it's just (a). Which means we'll never consider incremental sort for 
this particular example.


It's a bit strange, because it's enough to create an index on (a,b) for
just one of the relations, and it'll suddenly consider building an
incremental sort on both sides.



I don't plan to pursue this further at this point, so I pushed the first 
part because that's a fairly simple improvement over what we have now. 
But it seems like a fairly promising area for improvements.


Also, the non-intuitive effects of enabling partition-wise joins (i.e.
picking plans with higher estimated costs) seem worth investigating
separately.

Re: Improve error handling of HMAC computations and SCRAM

2022-01-12 Thread Sergey Shinderuk

On 12.01.2022 14:32, Michael Paquier wrote:
> On Wed, Jan 12, 2022 at 12:56:17PM +0900, Michael Paquier wrote:
>> Attached is a rebased patch for the HMAC portions, with a couple of
>> fixes I noticed while going through this stuff again (mostly around
>> SASLprep and pg_fe_scram_build_secret), and a fix for a conflict
>> coming from 9cb5518.  psql's \password is wrong to assume that the
>> only error that can happen for scram-sha-256 is an OOM, but we'll get
>> there.
>
> With an attachment, that's even better.  (Thanks, Daniel.)

Gave it a thorough read.  Looks good, except for errstr not set in a
couple of places (see the diff attached).

Didn't test it.

--
Sergey Shinderuk                https://postgrespro.com/

diff --git a/src/common/hmac.c b/src/common/hmac.c
index 592f9b20a38..a27778e86b3 100644
--- a/src/common/hmac.c
+++ b/src/common/hmac.c
@@ -46,9 +46,7 @@ typedef enum pg_hmac_errno
PG_HMAC_ERROR_INTERNAL
 } pg_hmac_errno;
 
-/*
- * Internal structure for pg_hmac_ctx->data with this implementation.
- */
+/* Internal pg_hmac_ctx structure */
 struct pg_hmac_ctx
 {
pg_cryptohash_ctx *hash;
diff --git a/src/common/hmac_openssl.c b/src/common/hmac_openssl.c
index c352f9db9e9..44f36d51dcb 100644
--- a/src/common/hmac_openssl.c
+++ b/src/common/hmac_openssl.c
@@ -60,9 +60,7 @@ typedef enum pg_hmac_errno
PG_HMAC_ERROR_OPENSSL
 } pg_hmac_errno;
 
-/*
- * Internal structure for pg_hmac_ctx->data with this implementation.
- */
+/* Internal pg_hmac_ctx structure */
 struct pg_hmac_ctx
 {
HMAC_CTX   *hmacctx;
diff --git a/src/common/scram-common.c b/src/common/scram-common.c
index 5f90397c66d..8896b1e73e4 100644
--- a/src/common/scram-common.c
+++ b/src/common/scram-common.c
@@ -44,7 +44,10 @@ scram_SaltedPassword(const char *password,
pg_hmac_ctx *hmac_ctx = pg_hmac_create(PG_SHA256);
 
if (hmac_ctx == NULL)
+   {
+   *errstr = pg_hmac_error(NULL);  /* returns OOM */
return -1;
+   }
 
/*
 * Iterate hash calculation of HMAC entry using given salt.  This is
@@ -126,7 +129,10 @@ scram_ClientKey(const uint8 *salted_password, uint8 *result,
 	pg_hmac_ctx *ctx = pg_hmac_create(PG_SHA256);
 
 	if (ctx == NULL)
+	{
+		*errstr = pg_hmac_error(NULL);	/* returns OOM */
 		return -1;
+	}
 
 	if (pg_hmac_init(ctx, salted_password, SCRAM_KEY_LEN) < 0 ||
 		pg_hmac_update(ctx, (uint8 *) "Client Key", strlen("Client Key")) < 0 ||


Re: Stream replication test fails of cfbot/windows server 2019

2022-01-12 Thread Thomas Munro
On Thu, Jan 13, 2022 at 12:24 AM Julien Rouhaud  wrote:
> On Wed, Jan 12, 2022 at 01:51:24PM +0300, Michail Nikolaev wrote:
> > https://cirrus-ci.com/task/6532060239101952
> > https://cirrus-ci.com/task/471606276096

For the record, cfbot only started running the recovery tests on
Windows a couple of weeks ago (when the new improved .cirrus.yml
landed in the tree).  I don't know if it's significant that Pavel's
patch is failing every time:

https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3464

... while one mentioned by Michail has lower frequency random failures:

https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/2979

> Indeed, and yet CI on postgres tree doesn't exhibit any problem:
> https://cirrus-ci.com/github/postgres/postgres

(It's very cool that we have that turned on now!)  That has run ~35
times (once per commit) and never failed.  Across all cfbot branches,
cfbot is triggering over 100 builds a day, so something like 1400
since we started running the recovery test on Windows, so it's not a
fair comparison: plenty more chances for random/timing based failures
to show up.

I don't know how many different kinds of flakiness we're suffering
from on Windows.  Could these cases be explained by the FD_CLOSE
problem + timing differences?




Re: parse/analyze API refactoring

2022-01-12 Thread Bossart, Nathan
On 12/28/21, 8:25 AM, "Peter Eisentraut"  
wrote:
> (The "withcb" naming maybe isn't great; better ideas welcome.)

FWIW I immediately understood that this meant "with callback," so it
might be okay.

> Not included in this patch set, but food for further thought:  The
> pg_analyze_and_rewrite_*() functions aren't all that useful (anymore).
> One might as well write
>
>  pg_rewrite_query(parse_analyze_xxx(...))

I had a similar thought while reading through the patches.  If further
deduplication isn't too much trouble, I'd vote for that.

Nathan



Re: is ErrorResponse possible on Sync?

2022-01-12 Thread Tatsuo Ishii
> Hmm, this got me curious. If Sync itself cannot fail, then what is this
> sentence really saying: "This parameterless message (ed. Sync) causes the
> backend to close the current transaction if it's not inside a BEGIN/COMMIT
> transaction block (“close” meaning to commit if no error, or roll back if
> error)." ?
> This seems to say that, outside of BEGIN/END, the transaction is committed
> at Sync time (i.e. if the Sync is never sent, nothing is committed).

Yes, if you do not send Sync and terminate the session, then the
transaction will not be committed.

FE=> Parse(stmt="", query="INSERT INTO t1 VALUES(2)")
FE=> Bind(stmt="", portal="")
FE=> Execute(portal="")
FE=> Terminate

After this, I don't see the row (2) in table t1.
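For comparison, a sketch (not an actual trace) of the same exchange
with a Sync added before Terminate:

FE=> Parse(stmt="", query="INSERT INTO t1 VALUES(2)")
FE=> Bind(stmt="", portal="")
FE=> Execute(portal="")
FE=> Sync
FE=> Terminate

Here the implicit transaction is committed at the Sync, so the row (2)
is visible afterwards.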

> Presumably, committing a transaction can fail even if no
> previous command/statement failed, right?

Right. Alvaro gave an excellent example.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Windows vs recovery tests

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-12 14:34:00 -0500, Andrew Dunstan wrote:
> For some considerable time the recovery tests have been at best flaky on
> Windows, and at worst disastrous (i.e. they can hang rather than just
> fail). It's a problem I worked around on my buildfarm animals by
> disabling the tests, hoping to find time to get back to analysing the
> problem. But now we are seeing failures on the cfbot too (e.g.
> https://cirrus-ci.com/task/5860692694663168 and
> https://cirrus-ci.com/task/5316745152954368 ) so I think we need to
> spend some effort on finding out what's going on here.

I'm somewhat certain that this is caused by assertions or aborts hanging with
a GUI popup, e.g. due to a check in the CRT.

I saw these kind of hangs a lot in the aio development tree before I merged
the changes to change error/abort handling on windows. Before the recent CI
changes cfbot ran windows tests without assertions, which - besides just
running fewer tests - explains having fewer such hang before, because there's
more sources of such error popups in the debug CRT.

It'd be nice if somebody could look at the patch and discussion in
https://www.postgresql.org/message-id/20211005193033.tg4pqswgvu3hcolm%40alap3.anarazel.de


The debugging information for the cirrus-ci tasks has a list of
processes. E.g. for https://cirrus-ci.com/task/5860692694663168 there's

  1 agent.exe
  1 CExecSvc.exe
  1 csrss.exe
  1 fontdrvhost.exe
  1 lsass.exe
  1 msdtc.exe
  1 psql.exe
  1 services.exe
  1 wininit.exe
  9 cmd.exe
  9 perl.exe
  9 svchost.exe
 49 postgres.exe
processes.

So we know that some tests were actually still in progress... It's
particularly interesting that there's a psql process still hanging around...


Before I "simplified" that away, the CI patch ran all tests with a shorter
individual timeout than the overall CI timeout, so we'd see error logs
etc. Perhaps that was a mistake to remove. IIRC I did something like

"C:\Program Files\Git\usr\bin\timeout.exe" -v --kill-after=35m 30m perl 
path/to/vcregress.pl ...

Perhaps worth re-adding?

Greetings,

Andres Freund




Re: disfavoring unparameterized nested loops

2022-01-12 Thread Peter Geoghegan
On Tue, Jun 22, 2021 at 4:13 AM Robert Haas  wrote:
> I think that's a reasonable request. I'm not sure that I believe it's
> 100% necessary, but it's certainly an improvement on a technical
> level, and given that the proposed change could impact quite a lot of
> plans, it's fair to want to see some effort being put into mitigating
> the possible downsides. Now, I'm not sure when I might have time to
> actually try to do the work, which kind of sucks, but that's how it
> goes sometimes.

Should I take it that you've dropped this project? I was rather hoping
that you'd get back to it, FWIW.

-- 
Peter Geoghegan




Re: Stream replication test fails of cfbot/windows server 2019

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-13 12:40:00 +1300, Thomas Munro wrote:
> On Thu, Jan 13, 2022 at 12:24 AM Julien Rouhaud  wrote:
> > On Wed, Jan 12, 2022 at 01:51:24PM +0300, Michail Nikolaev wrote:
> > > https://cirrus-ci.com/task/6532060239101952
> > > https://cirrus-ci.com/task/471606276096
>
> For the record, cfbot only started running the recovery tests on
> Windows a couple of weeks ago (when the new improved .cirrus.yml
> landed in the tree).  I don't know if it's significant that Pavel's
> patch is failing every time:
>
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3464

I think we need to make wait_for_catchup() log something more useful. It's
pretty hard to debug failures involving it right now.

E.g. the logs for https://cirrus-ci.com/task/5353503093686272 tell us that

  Waiting for replication conn standby_1's replay_lsn to pass '0/3023208' on 
primary
  # poll_query_until timed out executing this query:
  # SELECT '0/3023208' <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'standby_1';
  # expecting this output:
  # t
  # last actual query output:
  #
  # with stderr:
  timed out waiting for catchup at t/001_stream_rep.pl line 50.

and there's lots of instances of that query in the primary's logs. But nowhere
do we see what the actual replication progress is. So we don't know if the
problem is just that there's a record that doesn't need to be flushed to disk,
or something more fundamental.

I think instead of croak("timed out waiting for catchup") we should make
wait_for_catchup() query the primary all columns of pg_stat_replication and
report those. And perhaps also report the result of
  SELECT * FROM pg_control_recovery(), pg_control_checkpoint();
on the standby?
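Something like this inside wait_for_catchup() might be enough (an
untested sketch; $self, $dbname and $query stand in for whatever is
actually in scope there):

    if (!$self->poll_query_until($dbname, $query, 't'))
    {
        # Hypothetical diagnostics: dump the full replication state
        # before dying, instead of the bare croak() we have today.
        my $state = $self->safe_psql($dbname,
            'SELECT * FROM pg_catalog.pg_stat_replication');
        diag("pg_stat_replication on " . $self->name . ":\n$state");
        croak("timed out waiting for catchup");
    }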



> I don't know how many different kinds of flakiness we're suffering
> from on Windows.  Could these cases be explained by the FD_CLOSE
> problem + timing differences?

Maybe. There's certainly something odd going on:

https://api.cirrus-ci.com/v1/artifact/task/5353503093686272/log/src/test/recovery/tmp_check/log/001_stream_rep_primary.log
https://api.cirrus-ci.com/v1/artifact/task/5353503093686272/log/src/test/recovery/tmp_check/log/001_stream_rep_standby_1.log

standby_1:
  2022-01-12 08:21:36.543 GMT [8584][walreceiver] LOG:  started streaming WAL 
from primary at 0/300 on timeline 1

primary:
  2022-01-12 08:21:38.855 GMT [6276][postmaster] LOG:  received fast shutdown 
request
  2022-01-12 08:21:39.146 GMT [6276][postmaster] LOG:  database system is shut 
down
  2022-01-12 08:21:50.235 GMT [5524][postmaster] LOG:  starting PostgreSQL 
15devel, compiled by Visual C++ build 1929, 64-bit
  2022-01-12 08:21:50.417 GMT [5524][postmaster] LOG:  database system is ready 
to accept connections

standby_1:
  2022-01-12 08:21:53.469 GMT [5108][walsender] [standby_2][2/0:0] LOG:  
received replication command: START_REPLICATION 0/300 TIMELINE 1
  2022-01-12 08:28:33.949 GMT [6484][postmaster] LOG:  database system is shut 
down

afaict standby_1's walreceiver never realized that the primary stopped?


More evidence to that fact is that the above "last actual query output:" shows 
nothing
rather than 'f' for
  SELECT '0/3023208' <= replay_lsn AND state = 'streaming' FROM 
pg_catalog.pg_stat_replication WHERE application_name = 'standby_1';



I wonder if it's relevant that the .cirrus file uses "unix sockets" on
windows as well, to avoid port conflicts (otherwise I saw frequent spurious
test failures due to port conflicts).
# Avoids port conflicts between concurrent tap test runs
PG_TEST_USE_UNIX_SOCKETS: 1

It's not particularly hard to imagine that either our "windows unix socket"
support still has some bugs, or that windows' implementation of unix sockets
is borked.


Greetings,

Andres Freund




Re: Add non-blocking version of PQcancel

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-12 15:22:18 +, Jelte Fennema wrote:
> This patch also includes a test for this new API (and also the already
> existing cancellation APIs). The test can be easily run like this:
>
> cd src/test/modules/libpq_pipeline
> make && ./libpq_pipeline cancel

Right now the tests fail to build on Windows with:

[15:45:10.518] src/interfaces/libpq/libpqdll.def : fatal error LNK1121: 
duplicate ordinal number '189' [c:\cirrus\libpq.vcxproj]
and fails tests on other platforms. See
https://cirrus-ci.com/build/4791821363576832


> NOTE: I have not tested this with GSS for the moment. My expectation is
> that using this new API with a GSS connection will result in a
> CONNECTION_BAD status when calling PQcancelStatus. The reason for this
> is that GSS reads will also need to communicate back that an EOF was
> found, just like I've done for TLS reads and unencrypted reads. Since in
> case of a cancel connection an EOF is actually expected, and should not
> be treated as an error.

The failures do not seem related to this.

Greetings,

Andres Freund




RE: Skipping logical replication transactions on subscriber side

2022-01-12 Thread tanghy.f...@fujitsu.com
On Wed, Jan 12, 2022 2:02 PM Masahiko Sawada  wrote:
> 
> I've attached an updated patch that incorporated all comments I got so far.
> 

Thanks for updating the patch. Here are some comments:

1)
+  Skip applying changes of the particular transaction.  If incoming data

Should "Skip" be "Skips" ?

2)
+  prepared by enabling two_phase on susbscriber.  After 
h
+  the logical replication successfully skips the transaction, the 
transaction

The "h" after word "After" seems redundant.

3)
+   Skipping the whole transaction includes skipping the cahnge that may not 
violate

"cahnge" should be "changes" I think.

4)
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE, 
etc.) of
+ * the specified transaction at MySubscription->skipxid.  Once we start 
skipping
...
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes() (TransactionIdIsValid(skipping_xid))

Maybe we should modify this comment. Something like:
skipping_xid is valid if we are skipping all data modification changes ...
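Perhaps something like this (just a rough suggestion for the comment,
not tested against the patch):

    /*
     * skipping_xid is valid while we are skipping all data modification
     * changes (INSERT, UPDATE, etc.) of the transaction given by
     * MySubscription->skipxid.  Once we start skipping changes, we keep
     * doing so until the end of that transaction.
     */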

5)
+   if (!superuser())
+   ereport(ERROR,
+   
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+errmsg("must 
be superuser to set %s", "skip_xid")));

Should we change the message to "must be superuser to skip xid"?
Because the SQL stmt is "ALTER SUBSCRIPTION ... SKIP (xid = XXX)".

Regards,
Tang


Re: Windows vs recovery tests

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-12 15:58:26 -0800, Andres Freund wrote:
> On 2022-01-12 14:34:00 -0500, Andrew Dunstan wrote:
> > For some considerable time the recovery tests have been at best flaky on
> > Windows, and at worst disastrous (i.e. they can hang rather than just
> > fail). It's a problem I worked around on my buildfarm animals by
> > disabling the tests, hoping to find time to get back to analysing the
> > problem. But now we are seeing failures on the cfbot too (e.g.
> > https://cirrus-ci.com/task/5860692694663168 and
> > https://cirrus-ci.com/task/5316745152954368 ) so I think we need to
> > spend some effort on finding out what's going on here.
> 
> I'm somewhat certain that this is caused by assertions or aborts hanging with
> a GUI popup, e.g. due to a check in the CRT.

Oh, that was only about https://cirrus-ci.com/task/5860692694663168 not
https://cirrus-ci.com/task/5316745152954368

Looking through the recent recovery failures that were just on windows, I see
three different "classes" of recovery test failures:

1) Tests sometimes never finish, resulting in CI timing out
2) Tests sometimes finish, but t/001_stream_rep.pl fails
3) Tests fail with patch specific issues (e.g. 36/2096, 36/3461, 36/3459)

From the cases I looked at, the failures in 1) always have a successful
t/001_stream_rep.pl. This makes me think that we're likely looking at
two separate types of problems?


One might think that
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/3464
conflicts with the above grouping. But all but the currently last failure were
due a compiler warning in an earlier version of the patch.


There's one interesting patch that also times out just on windows, albeit in
another test group:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/2096

This IMO looks likely to be a bug in psql introduced by that patch.

Greetings,

Andres Freund




Re: null iv parameter passed to combo_init()

2022-01-12 Thread Noah Misch
On Mon, Jan 10, 2022 at 03:34:27PM -0800, Zhihong Yu wrote:
> On Sun, Jan 9, 2022 at 6:45 PM Zhihong Yu  wrote:
> > gcc -Wall -Wmissing-prototypes -Wpointer-arith
> > -Wdeclaration-after-statement -Werror=vla -Wendif-labels
> > -Wmissing-format-attribute -Wimplicit-fallthrough=3 -Wcast-function-type
> > -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard
> > -Wno-format-truncation -Wno-stringop-truncation -fsanitize=undefined
> > -fsanitize-undefined-trap-on-error -I../../src/port -DFRONTEND
> > -I../../src/include  -D_GNU_SOURCE   -c -o path.o path.c
> 
> Patch v3 passes `make check-world`

The patch uses the "LENGTH_VAR != 0" style in px.c, but it uses "POINTER_VAR
!= NULL" style in the other files.  Please use "LENGTH_VAR != 0" style in each
place you're changing.
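That is, guard on the length rather than the pointer, e.g. (an
illustrative sketch, not the patch's exact code):

    if (ivlen != 0)
        memcpy(ivbuf, iv, ivlen);

so that a NULL iv with a zero length never reaches memcpy().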

Assuming the next version looks good, I'll likely back-patch it to v10.  Would
anyone like to argue for a back-patch all the way to 9.2?




Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

2022-01-12 Thread Fujii Masao




On 2022/01/06 17:29, Etsuro Fujita wrote:
> On Fri, Dec 3, 2021 at 6:07 PM Fujii Masao wrote:
>> On 2021/11/16 18:55, Etsuro Fujita wrote:
>>> I changed my mind; I’ll update the patch to ignore the error as
>>> before, because 1) as far as I know, there are no reports from the
>>> field concerning that we ignore all kinds of errors in cleaning up the
>>> prepared statements, so maybe we don’t need to change that, and 2) we
>>> already committed at least one of the remote transactions, so it’s not
>>> good to abort the local transaction unless we really have to.
>
> Done.
>
>> Are you planning to update the patch? In addition to this change,
>> at least documentation about new parallel_commit parameter needs
>> to be included in the patch.
>
> Done.  Attached is a new version.
>
> * 0001
> This is an updated version of the previous patch.  In addition to the
> above, I expanded a comment in do_sql_command_end() a bit to explain
> why we do PQconsumeInput() before doing pgfdw_get_result(), to address
> your comment.  Also, I moved the code to finish closing pending
> (sub)transactions in pgfdw_xact_callback()(pgfdw_subxact_callback())
> into separate functions.  Also, I modified regression test cases a bit
> to access multiple foreign servers.


Thanks for updating the patch!

First, I'm reading the 0001 patch. Here are my comments on it.

The 0001 patch failed to apply. Could you rebase it?

+   entry->changing_xact_state = true;
+   do_sql_command_begin(entry->conn, "DEALLOCATE ALL");
+   pending_deallocs = lappend(pending_deallocs, entry);

Originally entry->changing_xact_state is not set to true when executing
DEALLOCATE ALL, but the patch does that. Why do we need this change?

The source comment explains that we intentionally ignore errors in the
DEALLOCATE. But the patch changes DEALLOCATE ALL so that it's executed via
do_sql_command_begin(), which can cause an error. Is this OK?

+   if (ignore_errors)
+   pgfdw_report_error(WARNING, res, conn, true, sql);

Originally, when DEALLOCATE fails, not even a warning message is logged. But
the patch changes DEALLOCATE so that its result is received via
do_sql_command_end(), which can log a warning message even when the
ignore_errors argument is enabled. Why do we need to change the behavior?

+  
+   This option controls whether postgres_fdw commits
+   multiple remote (sub)transactions opened in a local (sub)transaction
+   in parallel when the local (sub)transaction commits.

Since parallel_commit is a foreign server option, shouldn't the documentation
instead describe how postgres_fdw handles a server with this option enabled?
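(For reference, the option would presumably be set per server, along
the lines of the following sketch; parallel_commit comes from the
patch, it is not committed syntax:

    ALTER SERVER loopback OPTIONS (ADD parallel_commit 'true');
)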

+   
+Note that if many remote (sub)transactions are opened on a remote server
+in a local (sub)transaction, this option might increase the remote
+server’s load significantly when those remote (sub)transactions are
+committed.  So be careful when using this option.
+   

This paragraph should be inside the listitem for parallel_commit, shouldn't it?

async_capable=true may also cause a similar issue? If so, shouldn't this kind
of note also be documented for async_capable?

This explains that the remote server's load will be increased *significantly*.
But is the "significantly" part really true? I'd like to know how much
parallel_commit=true can actually increase the load on a remote server.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: null iv parameter passed to combo_init()

2022-01-12 Thread Zhihong Yu
On Wed, Jan 12, 2022 at 6:49 PM Noah Misch  wrote:

> On Mon, Jan 10, 2022 at 03:34:27PM -0800, Zhihong Yu wrote:
> > On Sun, Jan 9, 2022 at 6:45 PM Zhihong Yu  wrote:
> > > gcc -Wall -Wmissing-prototypes -Wpointer-arith
> > > -Wdeclaration-after-statement -Werror=vla -Wendif-labels
> > > -Wmissing-format-attribute -Wimplicit-fallthrough=3
> -Wcast-function-type
> > > -Wformat-security -fno-strict-aliasing -fwrapv
> -fexcess-precision=standard
> > > -Wno-format-truncation -Wno-stringop-truncation -fsanitize=undefined
> > > -fsanitize-undefined-trap-on-error -I../../src/port -DFRONTEND
> > > -I../../src/include  -D_GNU_SOURCE   -c -o path.o path.c
> >
> > Patch v3 passes `make check-world`
>
> The patch uses the "LENGTH_VAR != 0" style in px.c, but it uses
> "POINTER_VAR
> != NULL" style in the other files.  Please use "LENGTH_VAR != 0" style in
> each
> place you're changing.
>
> Assuming the next version looks good, I'll likely back-patch it to v10.
> Would
> anyone like to argue for a back-patch all the way to 9.2?
>
Hi,
Please take a look at patch v5.

Cheers


0005-memcpy-null.patch
Description: Binary data


Re: ExecRTCheckPerms() and many prunable partitions

2022-01-12 Thread Julien Rouhaud
Hi,

On Mon, Dec 20, 2021 at 04:13:04PM +0900, Amit Langote wrote:
> 
> Patch 0002 needed a rebase, because a conflicting change to
> expected/rules.out has since been committed.

The cfbot reports new conflicts since about a week ago with this patch:

Latest failure: https://cirrus-ci.com/task/4686414276198400 and
https://api.cirrus-ci.com/v1/artifact/task/4686414276198400/regress_diffs/src/test/regress/regression.diffs

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out 
/tmp/cirrus-ci-build/src/test/regress/results/xml.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out  2022-01-12 
05:24:02.795477001 +
+++ /tmp/cirrus-ci-build/src/test/regress/results/xml.out   2022-01-12 
05:28:20.329086031 +
@@ -603,12 +603,12 @@
 CREATE VIEW xmlview9 AS SELECT xmlserialize(content 'good' as text);
 SELECT table_name, view_definition FROM information_schema.views
   WHERE table_name LIKE 'xmlview%' ORDER BY 1;
- table_name |  view_definition
-+---
+ table_name |  view_definition
++
  xmlview1   |  SELECT xmlcomment('test'::text) AS xmlcomment;
  xmlview2   |  SELECT XMLCONCAT('hello'::xml, 'you'::xml) AS "xmlconcat";
  xmlview3   |  SELECT XMLELEMENT(NAME element, XMLATTRIBUTES(1 AS ":one:", 
'deuce' AS two), 'content&') AS "xmlelement";
- xmlview4   |  SELECT XMLELEMENT(NAME employee, XMLFOREST(emp.name AS name, 
emp.age AS age, emp.salary AS pay)) AS "xmlelement"+
+ xmlview4   |  SELECT XMLELEMENT(NAME employee, XMLFOREST(name AS name, age AS 
age, salary AS pay)) AS "xmlelement" +
 |FROM emp;
  xmlview5   |  SELECT XMLPARSE(CONTENT 'x'::text STRIP WHITESPACE) 
AS "xmlparse";
  xmlview6   |  SELECT XMLPI(NAME foo, 'bar'::text) AS "xmlpi";
diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/compression.out 
/tmp/cirrus-ci-build/src/test/regress/results/compression.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/compression.out  
2022-01-12 05:24:02.739471690 +
+++ /tmp/cirrus-ci-build/src/test/regress/results/compression.out   
2022-01-12 05:28:23.537403929 +
@@ -187,7 +187,7 @@
 
+--+---+--+-+--+-+--+-
  x  | text |   |  | | extended | | 
 |
 View definition:
- SELECT cmdata1.f1 AS x
+ SELECT f1 AS x
FROM cmdata1;

 SELECT pg_column_compression(f1) FROM cmdata1;
@@ -274,7 +274,7 @@
 
+--+---+--+-+--+-+--+-
  x  | text |   |  | | extended | lz4 | 
 |
 View definition:
- SELECT cmdata1.f1 AS x
+ SELECT f1 AS x
FROM cmdata1;

Could you send a rebased patch?  In the meantime I'll switch the cf entry to
Waiting on Author.




Re: a misbehavior of partition row movement (?)

2022-01-12 Thread Julien Rouhaud
Hi,

On Tue, Jan 11, 2022 at 05:08:59PM +0900, Amit Langote wrote:
> 
> I think I've managed to apply f4566345cf40b into v13 and v14.  Patches 
> attached.
> 

FTR this doesn't play well with the cfbot unfortunately as it tries to apply
both patches, and obviously on the wrong branches anyway.

It means that the previous-0002-now-0001 patch that Álvaro previously sent
(https://www.postgresql.org/message-id/202201052227.bc4yvvy6lqpb%40alvherre.pgsql)
is not tested anymore, and IIUC it's not pushed yet so it's not ideal.

There's now an official documentation on how to send patches that should be
ignored by the cfbot [1], so sending backpatch versions with a .txt extension
could be useful.  Just in case I'm attaching the pending patch to this mail to
make the cfbot happy again.

[1] 
https://wiki.postgresql.org/wiki/Cfbot#Which_attachments_are_considered_to_be_patches.3F
>From d1b067ad541f80191763e329577e0d3f62d00d82 Mon Sep 17 00:00:00 2001
From: amitlan 
Date: Mon, 11 Oct 2021 14:57:19 +0900
Subject: [PATCH v12] Enforce foreign key correctly during cross-partition
 updates

When an update on a partitioned table referenced in foreign keys
constraint causes a row to move from one partition to another, which
is implemented by deleting the old row from the source leaf partition
followed by inserting the new row into the destination leaf partition,
firing the foreign key triggers on that delete event can result in
surprising outcomes for those keys.  For example, a given foreign
key's delete trigger which implements the ON DELETE CASCADE clause of
that key will delete any referencing rows, although it should not,
because the referenced row is simply being moved into another
partition.

This commit teaches trigger.c to skip queuing such delete trigger
events on the leaf partitions in favor of an UPDATE event fired on
the "root" target relation.  Doing so makes sense because both the
old and the new tuple "logically" belong to the latter.

The after trigger event queuing interface now allows passing the
source and the destination partitions of a particular cross-partition
update when registering the update event for the root partitioned
table.  Along with the 2 ctids of the old and the new tuple, an after
trigger event now also stores the OIDs of those partitions. The tuples
fetched from the source and the destination partitions are converted
into the root table format before they are passed to the trigger
function.

The implementation currently has a limitation: only the foreign
keys pointing into the query's target relation are considered, not
those of its sub-partitioned partitions.  That seems like a
reasonable limitation; it sounds rare to have distinct foreign keys
pointing into sub-partitioned partitions, but not into the root
table.
---
 doc/src/sgml/ref/update.sgml  |   7 +
 src/backend/commands/trigger.c| 322 +++---
 src/backend/executor/execMain.c   |  19 +-
 src/backend/executor/execReplication.c|   5 +-
 src/backend/executor/nodeModifyTable.c| 187 -
 src/backend/utils/adt/ri_triggers.c   |   6 +
 src/include/commands/trigger.h|   4 +
 src/include/executor/executor.h   |   3 +-
 src/include/nodes/execnodes.h |   3 +
 src/test/regress/expected/foreign_key.out | 204 +-
 src/test/regress/sql/foreign_key.sql  | 135 -
 11 files changed, 840 insertions(+), 55 deletions(-)

diff --git a/doc/src/sgml/ref/update.sgml b/doc/src/sgml/ref/update.sgml
index 3fa54e5f70..3ba13010e7 100644
--- a/doc/src/sgml/ref/update.sgml
+++ b/doc/src/sgml/ref/update.sgml
@@ -316,6 +316,13 @@ UPDATE count
partition (provided the foreign data wrapper supports tuple routing), they
cannot be moved from a foreign-table partition to another partition.
   
+
+  
+   An attempt to move a row from one partition to another will fail if a
+   foreign key is found to directly reference a non-root partitioned table
+   in the partition tree, unless that table is also directly mentioned
+   in the UPDATE query.
+  
  
 
  
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 452b743f21..1ed6dd1b38 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -94,7 +94,11 @@ static HeapTuple ExecCallTriggerFunc(TriggerData *trigdata,
 									 FmgrInfo *finfo,
 									 Instrumentation *instr,
 									 MemoryContext per_tuple_context);
-static void AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
+static void AfterTriggerSaveEvent(EState *estate,
+								  ModifyTableState *mtstate,
+								  ResultRelInfo *relinfo,
+
Re: [PATCH] Add tab completion for foreign table

2022-01-12 Thread Fujii Masao




On 2022/01/11 21:43, tanghy.f...@fujitsu.com wrote:

Hi

Attached is a patch to improve the tab completion for foreign table.


Thanks!

Isn't it better to tab-complete not only "PARTITION OF" but also "(" for CREATE 
FOREIGN TABLE?
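
Something like this in tab-complete.c, perhaps (just a sketch, untested; the
exact spot among the existing completion rules is assumed):

    /* CREATE FOREIGN TABLE <name> */
    else if (TailMatches("CREATE", "FOREIGN", "TABLE", MatchAny))
        COMPLETE_WITH("(", "PARTITION OF");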



Also modified some doc descriptions of ALTER TABLE at [1] in accordance with
CREATE INDEX at [2].

In [1], we use "ALTER INDEX ATTACH PARTITION"
In [2], we use "ALTER INDEX ... ATTACH PARTITION"

I think the format in [2] is better.


Agreed.

IMO it's better to make the docs changes in a separate patch because they are
not directly related to the improvement of tab-completion.

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Add index scan progress to pg_stat_progress_vacuum

2022-01-12 Thread Imseih (AWS), Sami
On 1/12/22, 1:28 PM, "Bossart, Nathan"  wrote:

On 1/11/22, 11:46 PM, "Masahiko Sawada"  wrote:
> Regarding the new pg_stat_progress_vacuum_index view, why do we need
> to have a separate view? Users will have to check two views. If this
> view is expected to be used together with and joined to
> pg_stat_progress_vacuum, why don't we provide one view that has full
> information from the beginning? Especially, I think it's not useful
> that the total number of indexes to vacuum (num_indexes_to_vacuum
> column) and the current number of indexes that have been vacuumed
> (index_ordinal_position column) are shown in separate views.

> I suppose we could add all of the new columns to
> pg_stat_progress_vacuum and just set columns to NULL as appropriate.
> But is that really better than having a separate view?

To add: since a vacuum can utilize parallel worker processes plus the main
vacuum process to perform index vacuuming, it made sense to report the
backends doing index vacuum/cleanup in a separate view.
Besides what Nathan suggested, the only other clean option I can think of is
to create a json column in pg_stat_progress_vacuum that includes all the new
fields. My concern with this approach is that flattening the json would be
difficult for users.
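
For example, just to get per-index rows back out, users would have to write
something like this (the "index_progress" column name and its fields are
purely hypothetical):

    SELECT v.pid, x.*
    FROM pg_stat_progress_vacuum v,
         json_to_recordset(v.index_progress)
           AS x(indexrelid oid, tuples_removed bigint)
    WHERE v.index_progress IS NOT NULL;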

> Also, I’m not sure how useful index_tuples_removed is; what can we
> infer from this value (without a total number)?

> I think the idea was that you can compare it against max_dead_tuples
> and num_dead_tuples to get an estimate of the current cycle progress.
> Otherwise, it just shows that progress is being made.

The main purpose is to show that the "index vacuum" phase is actually making
progress. Note that for certain types of indexes, e.g. GIN/GiST, the number
of tuples_removed will end up exceeding num_dead_tuples.

Nathan

[0] https://postgr.es/m/7874FB21-FAA5-49BD-8386-2866552656C7%40amazon.com




Re: Windows vs recovery tests

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-12 18:25:26 -0800, Andres Freund wrote:
> There's one interesting patch that also times out just on windows, albeit in
> another test group:
> https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest/36/2096
> 
> This IMO looks likely to be a bug in psql introduced by that patch.

vcregress unfortunately doesn't say which tests it's about to run, but
comparing with a successful run (on another branch) shows that the tests
running after pgbench are the psql tests.


I pushed a branch to my github repository containing cfbot's commit and one
that runs the psql tests in isolation, under a timeout... Which predictably
failed. But at least we see the logs...

https://cirrus-ci.com/task/6723083204558848?logs=psql_test_tcp#L15

Based on the log files it looks like psql's 001_basic test did run:

# Test clean handling of unsupported replication command responses
psql_like(
	$node,
	'handling of unexpected PQresultStatus',
	'START_REPLICATION 0/0',
	undef, qr/unexpected PQresultStatus: 8$/);

https://api.cirrus-ci.com/v1/artifact/task/6723083204558848/log/src/bin/psql/tmp_check/log/001_basic_main.log
2022-01-13 03:28:45.973 GMT [604][walsender] [001_basic.pl][3/0:0] STATEMENT:  START_REPLICATION 0/0

https://api.cirrus-ci.com/v1/artifact/task/6723083204558848/tap/src/bin/psql/tmp_check/log/regress_log_001_basic

The last log entry in the TAP log is:

ok 23 - \help with argument: stdout matches


So it looks like psql is hanging somewhere after that. I assume with an error
popup that nobody can click on :/.

Greetings,

Andres Freund




Custom Operator for citext LIKE predicates question

2022-01-12 Thread Efrain J. Berdecia
After attempting to use gin and gist indexes for our queries that run against 
citext columns, our team has come up with the following to make our queries run 
from 2 mins to 25ms;CREATE EXTENSION pg_trgmCREATE EXTENSION btree_gin --may 
not be needed, checking
CREATE OPERATOR CLASS gin_trgm_ops_ci_newFOR TYPE citext USING ginASOPERATOR 1 
% (text, text),FUNCTION 1 btint4cmp (int4, int4),FUNCTION 2 
gin_extract_value_trgm (text, internal),FUNCTION 3 gin_extract_query_trgm 
(text, internal, int2, internal, internal, internal, internal),FUNCTION 4 
gin_trgm_consistent (internal,int2, text, int4, internal, internal, internal, 
internal),STORAGE int4;
ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADDOPERATOR 3 ~~ (citext, 
citext),OPERATOR 4 ~~* (citext, citext);ALTER OPERATOR FAMILY 
gin_trgm_ops_ci_new USING gin ADDOPERATOR 7 %> (text, text),FUNCTION 6 
(text,text) gin_trgm_triconsistent (internal, int2, text, int4, internal, 
internal, internal);
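
For reference, we then create the index and query it roughly like this (the
index name below is illustrative):

    CREATE INDEX t775_c240001002_trgm_ci
        ON t775 USING gin (c240001002 gin_trgm_ops_ci_new);

    SELECT count(*) FROM t775 WHERE c240001002 ~~* 'smp%';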

Our question is: does anyone see any flaw in this?
Also, could this not be incorporated into postgres natively?
I'm posting the old and new explain plans:
New explain plan:

QUERY PLAN
--------------------------------------------------------------------------------------------------
 Aggregate  (cost=874327.76..874327.77 rows=1 width=8) (actual time=21.952..21.954 rows=1 loops=1)
   ->  Nested Loop  (cost=1620.95..874284.13 rows=17449 width=0) (actual time=6.259..21.948 rows=9 loops=1)
         ->  Bitmap Heap Scan on t775 b1  (cost=1620.39..525029.25 rows=45632 width=35) (actual time=6.212..8.189 rows=13 loops=1)
               Recheck Cond: ((c240001002 ~~ 'smp%'::citext) OR (c20020 ~~ 'smp%'::citext) OR (c20001 ~~ 'smp%'::citext))
               Rows Removed by Index Recheck: 259
               Filter: ((c400079600 <> 'ABC_BUSINESSSERVICE'::citext) AND (c400127400 = 'ABC.ASSET'::citext) AND ((c11 = 'Mrictton Global'::citext) OR (c11 = 'ABCOpsMonitoring'::citext) OR (c11 = 'Mrictton'::citext) OR (c11 = 'Mrictton EITTE'::citext) OR (c11 = 'Mrictton Finance'::citext) OR (c11 = 'Mrictton Generic Services and Support'::citext) OR (c11 = 'Mrictton Global'::citext) OR (c11 = 'Mrictton Global Demo Solutions'::citext) OR (c11 = 'Mrictton HR Direct'::citext) OR (c11 = 'Mrictton Marketing and Communications'::citext) OR (c11 = 'Ericsson Master Data Management'::citext) OR (c11 = 'Mrictton OHS'::citext) OR (c11 = 'Mrictton Patents and Licensing'::citext) OR (c11 = 'Mrictton Sales'::citext) OR (c11 = 'MricttonSecurity'::citext) OR (c11 = 'Mrictton Shared Services'::citext) OR (c11 = 'Mrictton Sourcing'::citext) OR (c11 = 'Mrict

Re: a misbehavior of partition row movement (?)

2022-01-12 Thread Amit Langote
On Thu, Jan 13, 2022 at 12:19 PM Julien Rouhaud  wrote:
> On Tue, Jan 11, 2022 at 05:08:59PM +0900, Amit Langote wrote:
> >
> > I think I've managed to apply f4566345cf40b into v13 and v14.  Patches 
> > attached.
> >
>
> FTR this doesn't play well with the cfbot unfortunately as it tries to apply
> both patches, and obviously on the wrong branches anyway.

Oops, that's right.  Thanks for the heads up.

> It means that the previous-0002-now-0001 patch that Álvaro previously sent
> (https://www.postgresql.org/message-id/202201052227.bc4yvvy6lqpb%40alvherre.pgsql)
> is not tested anymore, and IIUC it's not pushed yet so it's not ideal.

Agreed.

> There's now an official documentation on how to send patches that should be
> ignored by the cfbot [1], so sending backpatch versions with a .txt extension
> could be useful.  Just in case I'm attaching the pending patch to this mail to
> make the cfbot happy again.

Thanks, and sorry, I wasn't aware of the rule about sending back-patch versions.

-- 
Amit Langote
EDB: http://www.enterprisedb.com




Re: Windows vs recovery tests

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-12 20:03:14 -0800, Andres Freund wrote:
> So it looks like psql is hanging somewhere after that. I assume with an error
> popup that nobody can click on :/.

Not quite. Psql is actually just logging output in an endless loop. I
connected with cdb.exe.

kP:

`007fd3c8 7ffc`0d00f13a ntdll!NtWriteFile+0x14
`007fd3d0 7ffc`03978ec3 KERNELBASE!WriteFile+0x7a
`007fd440 7ffc`03979d21 ucrtbased!write_text_ansi_nolock(
        int fh = 0n2,
        char * buffer = 0x`007febb0 " ???",
        unsigned int buffer_size = 1)+0x183
`007fe8f0 7ffc`039798a7 ucrtbased!_write_nolock(
        int fh = 0n2,
        void * buffer = 0x`007febb0,
        unsigned int buffer_size = 1,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x451
`007fea80 7ffc`03920e1d ucrtbased!_write_internal(
        int fh = 0n2,
        void * buffer = 0x`007febb0,
        unsigned int size = 1,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x377
`007feb20 7ffc`0392090e ucrtbased!write_buffer_nolock(
        char c = 0n32 ' ',
        class __crt_stdio_stream stream = class __crt_stdio_stream,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x27d
`007febb0 7ffc`03921242 ucrtbased!common_flush_and_write_nolock(
        int c = 0n32,
        class __crt_stdio_stream stream = class __crt_stdio_stream,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x22e
`007fec20 7ffc`038ddf5a ucrtbased!__acrt_stdio_flush_and_write_narrow_nolock(
        int c = 0n32,
        struct _iobuf * stream = 0x7ffc`03a27ce0,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x32
`007fec60 7ffc`038dd5a3 ucrtbased!_fwrite_nolock_internal(
        void * buffer = 0x`007ff020,
        unsigned int64 element_size = 1,
        unsigned int64 element_count = 7,
        struct _iobuf * public_stream = 0x7ffc`03a27ce0,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x79a
`007fed60 7ffc`038dd426 ucrtbased!::operator()(void)+0x73
`007fedd0 7ffc`038dd4a8 ucrtbased!__crt_seh_guarded_call::operator()<, &, >(
        class __acrt_lock_stream_and_call::__l2:: * setup = 0x0000`007fee58,
        class _fwrite_internal::__l2:: * action = 0x`007feec0,
        class __acrt_lock_stream_and_call::__l2:: * cleanup = 0x`007fee50)+0x36
`007fee10 7ffc`038dd72d ucrtbased!__acrt_lock_stream_and_call<>(
        struct _iobuf * stream = 0x7ffc`03a27ce0,
        class _fwrite_internal::__l2:: * action = 0x`007feec0)+0x58
`007fee70 7ffc`038de046 ucrtbased!_fwrite_internal(
        void * buffer = 0x`007ff020,
        unsigned int64 size = 1,
        unsigned int64 count = 7,
        struct _iobuf * stream = 0x7ffc`03a27ce0,
        class __crt_cached_ptd_host * ptd = 0x`007fef40)+0x15d
`007fef00 0001`4004a639 ucrtbased!fwrite(
        void * buffer = 0x`007ff020,
        unsigned int64 size = 1,
        unsigned int64 count = 7,
        struct _iobuf * stream = 0x7ffc`03a27ce0)+0x56
`007fef90 0001`4004a165 psql!flushbuffer(
        struct PrintfTarget * target = 0x`007feff8)+0x59
`007fefd0 0001`4004a1e6 psql!pg_vfprintf(
        struct _iobuf * stream = 0x7ffc`03a27ce0,
        char * fmt = 0x0001`40094268 "error: ",
        char * args = 0x`007ff4a0 "@???")+0xa5
`007ff450 0001`40045962 psql!pg_fprintf(
        struct _iobuf * stream = 0x7ffc`03a27ce0,
        char * fmt = 0x0001`40094268 "error: ")+0x36
`007ff490 0001`40045644 psql!pg_log_generic_v(
        pg_log_level level = PG_LOG_ERROR (0n4),
        char * fmt = 0x0001`40062e90 "unexpected PQresultStatus: %d",
        char * ap = 0x`007ff540 "???")+0x302
`007ff4f0 0001`4000ef1f psql!pg_log_generic(
        pg_log_level level = PG_LOG_ERROR (0n4),
        char * fmt = 0x0001`4

Re: psql - add SHOW_ALL_RESULTS option

2022-01-12 Thread Andres Freund
Hi,

On 2022-01-08 19:32:36 +0100, Fabien COELHO wrote:
> Attached v13 where the crash test is moved to tap.

The reason this test constantly fails on cfbot windows is a use-after-free
bug.

I figured that out in the context of another thread, so the debugging is
there:

https://postgr.es/m/20220113054123.ib4khtafgq34lv4z%40alap3.anarazel.de
> Ah, I see the bug. It's a use-after-free introduced in the patch:
>
> SendQueryAndProcessResults(const char *query, double *pelapsed_msec,
>                            bool is_watch, const printQueryOpt *opt,
>                            FILE *printQueryFout, bool *tx_ended)
> ...
> 	/* first result */
> 	result = PQgetResult(pset.db);
>
> 	while (result != NULL)
> ...
> 		if (!AcceptResult(result, false))
> 		{
> ...
> 			ClearOrSaveResult(result);
> 			success = false;
>
> 			/* and switch to next result */
> 			result_status = PQresultStatus(result);
> 			if (result_status == PGRES_COPY_BOTH ||
> 				result_status == PGRES_COPY_OUT ||
> 				result_status == PGRES_COPY_IN)
>
> So we called ClearOrSaveResult(), which did a PQclear(), and then we go and
> call PQresultStatus().
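
For illustration, the safe ordering in a minimal standalone form, using plain
libpq calls (a sketch of the pattern only, not the actual psql code):

    #include <libpq-fe.h>

    /* Drain all pending results, reading each status before freeing it. */
    static void
    drain_results(PGconn *conn)
    {
        PGresult   *result;

        while ((result = PQgetResult(conn)) != NULL)
        {
            /* capture the status while the PGresult is still alive */
            ExecStatusType result_status = PQresultStatus(result);

            PQclear(result);    /* "result" must not be touched after this */

            if (result_status == PGRES_COPY_BOTH ||
                result_status == PGRES_COPY_OUT ||
                result_status == PGRES_COPY_IN)
                break;          /* COPY mode ends the result loop */
        }
    }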

Greetings,

Andres Freund




Re: Custom Operator for citext LIKE predicates question

2022-01-12 Thread Tom Lane
"Efrain J. Berdecia"  writes:
> After attempting to use gin and gist indexes for our queries that run
> against citext columns, our team has come up with the following to make
> our queries run from 2 mins to 25ms:
>
> CREATE EXTENSION pg_trgm;
> CREATE EXTENSION btree_gin;  -- may not be needed, checking
>
> CREATE OPERATOR CLASS gin_trgm_ops_ci_new
> FOR TYPE citext USING gin AS
>     OPERATOR 1 % (text, text),
>     FUNCTION 1 btint4cmp (int4, int4),
>     FUNCTION 2 gin_extract_value_trgm (text, internal),
>     FUNCTION 3 gin_extract_query_trgm (text, internal, int2, internal, internal, internal, internal),
>     FUNCTION 4 gin_trgm_consistent (internal, int2, text, int4, internal, internal, internal, internal),
>     STORAGE int4;
>
> ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADD
>     OPERATOR 3 ~~ (citext, citext),
>     OPERATOR 4 ~~* (citext, citext);
>
> ALTER OPERATOR FAMILY gin_trgm_ops_ci_new USING gin ADD
>     OPERATOR 7 %> (text, text),
>     FUNCTION 6 (text, text) gin_trgm_triconsistent (internal, int2, text, int4, internal, internal, internal);

> Our question is: does anyone see any flaw in this?

Umm ... does it actually work?  I'd expect that you get case-sensitive
comparison behavior in such an index, because those support functions
are for plain text and they're not going to know that you'd like
case-insensitive behavior.
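
A quick way to check (assuming a table t with a citext column col) is to
compare the indexed answer against a forced sequential scan:

    SET enable_indexscan = off;
    SET enable_bitmapscan = off;
    SELECT count(*) FROM t WHERE col ~~* 'SMP%';  -- ground truth

    RESET enable_indexscan;
    RESET enable_bitmapscan;
    SELECT count(*) FROM t WHERE col ~~* 'SMP%';  -- must match the above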

You generally can't make a new gin or gist opclass without actually
writing some C code, because the support functions embody all
the semantics of the operators.

regards, tom lane




Re: Avoid erroring out when unable to remove or parse logical rewrite files to save checkpoint work

2022-01-12 Thread Bharath Rupireddy
On Thu, Jan 13, 2022 at 3:47 AM Bossart, Nathan  wrote:
>
> On 12/31/21, 4:44 AM, "Bharath Rupireddy" 
>  wrote:
> > Currently the server errors out when unable to remove/parse a
> > logical rewrite file in CheckPointLogicalRewriteHeap, wasting the
> > work the checkpoint has done and preventing the checkpoint
> > from finishing. This is unlike what CheckPointSnapBuild does for
> > snapshot files, i.e. it just emits a message at LOG level and continues
> > if it is unable to parse or remove the file. Attaching a small patch
> > applying the same idea to the mapping files.
>
> This seems reasonable to me.  AFAICT moving on to other files after an
> error shouldn't cause any problems.  In fact, it's probably beneficial
> to try to clean up as much as possible so that the files do not
> continue to build up.
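
For reference, the change boils down to this shape, mirroring what
CheckPointSnapBuild already does (a sketch, not the exact hunk from the
patch):

    if (unlink(path) < 0)
    {
        ereport(LOG,
                (errcode_for_file_access(),
                 errmsg("could not remove file \"%s\": %m", path)));
        continue;   /* move on to the next mapping file */
    }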

Thanks for the review Nathan!

> The only feedback I have for the patch is that I don't think the new
> comments are necessary.

I borrowed the comments as-is from CheckPointSnapBuild, introduced
by commit b89e15105. IMO the comments should stay, as they
explain why we are not emitting ERRORs; however, I will leave it to the
committer to decide on that.

Regards,
Bharath Rupireddy.




Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2022-01-12 Thread Bharath Rupireddy
On Wed, Jan 12, 2022 at 11:39 AM Julien Rouhaud  wrote:
>
> Hi,
>
> On Tue, Dec 28, 2021 at 10:56 AM Bharath Rupireddy
>  wrote:
> >
> > attaching v1-0001-XXX from the initial mail again just for the sake of
> > completion:
>
> Unfortunately this breaks the cfbot as it tries to apply this patch
> too: http://cfbot.cputube.org/patch_36_3474.log.
>
> For this kind of situation I think that the usual solution is to use a
> .txt extension to make sure that the cfbot won't try to apply it.

Thanks.  IMO, the following format of logging is better, so attaching
the v2-0001-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch as
.patch

2021-12-28 02:52:24.464 UTC [2394396] LOG:  checkpoint completed at
location=0/212FFC8 with REDO start location=0/212FF90: wrote 451
buffers (2.8%); 0 WAL file(s) added, 0 removed, 1 recycled;
write=0.012 s, sync=0.032 s, total=0.071 s; sync files=6,
longest=0.022 s, average=0.006 s; distance=6272 kB, estimate=6272 kB

Others are attached as .txt files.

Regards,
Bharath Rupireddy.
From c23010c67a66dd21e318ae4d475ee0d85c5c1d08 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 28 Dec 2021 02:52:49 +
Subject: [PATCH v2] Add checkpoint and redo LSN to LogCheckpointEnd log
 message

It is useful (for debugging purposes) if the checkpoint end message
has the checkpoint LSN and REDO LSN. It gives more context while
analyzing checkpoint-related issues. pg_controldata gives the
last checkpoint LSN and REDO LSN, but having this info in the
log message helps analyze issues that happened previously,
connect the dots and identify the root cause.
---
 src/backend/access/transam/xlog.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1e1fbe957f..cd3fce6a2c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8920,11 +8920,14 @@ LogCheckpointEnd(bool restartpoint)
 
 	if (restartpoint)
 		ereport(LOG,
-				(errmsg("restartpoint complete: wrote %d buffers (%.1f%%); "
+				(errmsg("restartpoint completed at location=%X/%X with REDO start location=%X/%X: "
+						"wrote %d buffers (%.1f%%); "
 						"%d WAL file(s) added, %d removed, %d recycled; "
 						"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
 						"sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
 						"distance=%d kB, estimate=%d kB",
+						LSN_FORMAT_ARGS(ControlFile->checkPoint),
+						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_segs_added,
@@ -8940,11 +8943,14 @@ LogCheckpointEnd(bool restartpoint)
 						(int) (CheckPointDistanceEstimate / 1024.0)));
 	else
 		ereport(LOG,
-				(errmsg("checkpoint complete: wrote %d buffers (%.1f%%); "
+				(errmsg("checkpoint completed at location=%X/%X with REDO start location=%X/%X: "
+						"wrote %d buffers (%.1f%%); "
 						"%d WAL file(s) added, %d removed, %d recycled; "
 						"write=%ld.%03d s, sync=%ld.%03d s, total=%ld.%03d s; "
 						"sync files=%d, longest=%ld.%03d s, average=%ld.%03d s; "
 						"distance=%d kB, estimate=%d kB",
+						LSN_FORMAT_ARGS(ControlFile->checkPoint),
+						LSN_FORMAT_ARGS(ControlFile->checkPointCopy.redo),
 						CheckpointStats.ckpt_bufs_written,
 						(double) CheckpointStats.ckpt_bufs_written * 100 / NBuffers,
 						CheckpointStats.ckpt_segs_added,
-- 
2.25.1



v2-0001-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch
Description: Binary data
From 999cfd53ce4e16ccfff94c0022cd80fe8ff84be5 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Tue, 28 Dec 2021 02:45:51 +
Subject: [PATCH v2] Add checkpoint and redo LSN to LogCheckpointEnd log
 message

It is useful (for debugging purposes) if the checkpoint end message
has the checkpoint LSN and REDO LSN. It gives more context while
analyzing checkpoi

Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-12 Thread Thomas Munro
On Wed, Jan 12, 2022 at 8:00 PM Alexander Lakhin  wrote:
> By the look of things, you are right and this is the localhost-only issue.

But can't that be explained with timing races?  You change some stuff
around and it becomes less likely that you get a FIN to arrive in a
super narrow window, which I'm guessing looks something like: recv ->
EWOULDBLOCK, [receive FIN], wait -> FD_CLOSE, wait [hangs].  Note that
it's not happening on several Windows BF animals, and the ones it is
happening on only do it only every few weeks.

Here's a draft attempt at a fix.  First I tried to use recv(fd, &c, 1,
MSG_PEEK) == 0 to detect EOF, which seemed to me to be a reasonable
enough candidate, but somehow it corrupts the stream (!?), so I used
Alexander's POLLHUP idea, except I pushed it down to a more principled
place IMHO.  Then I suppressed it after the initial check because then
the logic from my earlier patch takes over, so stuff like FeBeWaitSet
doesn't suffer from extra calls, only these two paths that haven't
been converted to long-lived WESes yet.  Does this pass the test?

I wonder if this POLLHUP technique is reliable enough (I know that
wouldn't work on other systems[1], which is why I was trying to make
MSG_PEEK work...).

What about environment variable PG_TEST_USE_UNIX_SOCKETS=1, does it
reproduce with that set, and does the patch fix it?  I'm hoping that
explains some Windows CI failures from a nearby thread[2].

[1] 
https://illumos.topicbox.com/groups/developer/T5576767e764aa26a-Maf8f3460c2866513b0ac51bf
[2] 
https://www.postgresql.org/message-id/flat/CALT9ZEG%3DC%3DJSypzt2gz6NoNtx-ew2tYHbwiOfY_xNo%2ByBY_%3Djw%40mail.gmail.com
From a2cf2b016afc03d384a1682cd02aab760273d6fb Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Thu, 13 Jan 2022 15:48:14 +1300
Subject: [PATCH] Fix handling of socket FD_CLOSE events on Windows.

In some situations we could hang waiting for an FD_CLOSE event that has
already been reported.  Fix this by adding some state to WaitEvent to
remember that we've already been told the socket is closed.  To handle
client code that creates and destroys multiple temporary WaitEventSet
objects and thus would lose that state, check for EOF explicitly the
first time through.
---
 src/backend/storage/ipc/latch.c | 41 +
 src/include/storage/latch.h |  1 +
 2 files changed, 42 insertions(+)

diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index 61c876beff..86b86e71c4 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -899,6 +899,7 @@ AddWaitEventToSet(WaitEventSet *set, uint32 events, pgsocket fd, Latch *latch,
 	event->user_data = user_data;
 #ifdef WIN32
 	event->reset = false;
+	event->eof = -1;			/* unknown */
 #endif
 
 	if (events == WL_LATCH_SET)
@@ -1851,6 +1852,45 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 			cur_event->reset = false;
 		}
 
+		/*
+		 * Windows does not report FD_CLOSE repeatedly, and does not report
+		 * FD_READ on EOF.  Therefore we need to remember if we've seen
+		 * FD_CLOSE, to defend against callers that wait twice in a row
+		 * without a recv() that fails with WSAEWOULDBLOCK in between.
+		 */
+		if (cur_event->events & WL_SOCKET_READABLE)
+		{
+			/*
+			 * Check for EOF explicitly the first time through, to bridge the
+			 * gap between temporary WaitEventSet objects (such as those
+			 * created by WaitLatchOrSocket()).
+			 */
+			if (cur_event->eof == -1)
+			{
+WSAPOLLFD	pollfd;
+int			rc;
+
+pollfd.fd = cur_event->fd;
+pollfd.events = POLLRDNORM;
+pollfd.revents = 0;
+rc = WSAPoll(&pollfd, 1, 0);
+if (rc == 1 && pollfd.revents & POLLHUP)
+	cur_event->eof = 1; /* EOF */
+else
+	cur_event->eof = 0; /* data or error available */
+			}
+
+			/* If we've seen EOF or FD_CLOSE, keep reporting readiness. */
+			if (cur_event->eof == 1)
+			{
+occurred_events->pos = cur_event->pos;
+occurred_events->user_data = cur_event->user_data;
+occurred_events->events = WL_SOCKET_READABLE;
+occurred_events->fd = cur_event->fd;
+return 1;
+			}
+		}
+
 		/*
 		 * Windows does not guarantee to log an FD_WRITE network event
 		 * indicating that more data can be sent unless the previous send()
@@ -2002,6 +2042,7 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
 		{
 			/* EOF/error, so signal all caller-requested socket flags */
 			occurred_events->events |= (cur_event->events & WL_SOCKET_MASK);
+			cur_event->eof = 1;
 		}
 
 		if (occurred_events->events != 0)
diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h
index 3aa7b33834..41a857c0c3 100644
--- a/src/include/storage/latch.h
+++ b/src/include/storage/latch.h
@@ -147,6 +147,7 @@ typedef struct WaitEvent
 	void	   *user_data;		/* pointer provided in AddWaitEventToSet */
 #ifdef WIN32
 	bool		reset;			/* Is reset of the event required? */
+	int8		eof;			/* Has EOF been reached on a socket? */
 #endif
 } WaitEvent;

Re: row filtering for logical replication

2022-01-12 Thread Peter Smith
Thanks for posting the merged v63.

Here are my review comments for the v63-0001 changes.

~~~

1. src/backend/replication/logical/proto.c - logicalrep_write_tuple

  TupleDesc desc;
- Datum values[MaxTupleAttributeNumber];
- bool isnull[MaxTupleAttributeNumber];
+ Datum *values;
+ bool *isnull;
  int i;
  uint16 nliveatts = 0;

Those separate declarations for values / isnull are not strictly
needed anymore, so those vars could be deleted. IIRC those were only
added before when there were both slots and tuples. OTOH, maybe you
prefer to keep it this way just for code readability?

~~~

2. src/backend/replication/pgoutput/pgoutput.c - typedef

+typedef enum RowFilterPubAction
+{
+ PUBACTION_INSERT,
+ PUBACTION_UPDATE,
+ PUBACTION_DELETE,
+ NUM_ROWFILTER_PUBACTIONS  /* must be last */
+} RowFilterPubAction;

This typedef is not currently used by any of the code.

So I think choices are:

- Option 1: remove the typedef, because nobody is using it.

- Option 2: keep the typedef, but use it! e.g. everywhere there is an
exprstate array index variable probably it should be declared as a
'RowFilterPubAction idx' instead of just 'int idx'.

I prefer option 2, but YMMV.
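
For example, option 2 could look like this wherever the array is walked (a
sketch using names from the patch, assuming the enum values start at 0):

    RowFilterPubAction idx;

    for (idx = PUBACTION_INSERT; idx < NUM_ROWFILTER_PUBACTIONS; idx++)
        entry->exprstate[idx] = NULL;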

~~~

3. src/backend/replication/pgoutput/pgoutput.c - map_changetype_pubaction

After this recent v63 refactoring and merging of some APIs it seems
that the map_changetype_pubaction is now ONLY used by
pgoutput_row_filter function. So this can now be a static member of
pgoutput_row_filter function instead of being declared at file scope.

~~~

4. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter comments

+
+/*
+ * Change is checked against the row filter, if any.
+ *
+ * If it returns true, the change is replicated, otherwise, it is not.
+ *
+ * FOR INSERT: evaluates the row filter for new tuple.
+ * FOR DELETE: evaluates the row filter for old tuple.
+ * For UPDATE: evaluates the row filter for old and new tuple. If both
+ * evaluations are true, it sends the UPDATE. If both evaluations are false, it
+ * doesn't send the UPDATE. If only one of the tuples matches the row filter
+ * expression, there is a data consistency issue. Fixing this issue requires a
+ * transformation.
+ *
+ * Transformations:
+ * Updates are transformed to inserts and deletes based on the
+ * old tuple and new tuple. The new action is updated in the
+ * action parameter. If not updated, action remains as update.
+ *
+ * Case 1: old-row (no match)new-row (no match)  -> (drop change)
+ * Case 2: old-row (no match)new row (match) -> INSERT
+ * Case 3: old-row (match)   new-row (no match)  -> DELETE
+ * Case 4: old-row (match)   new row (match) -> UPDATE
+ *
+ * If the change is to be replicated this function returns true, else false.
+ *
+ * Examples:

The function header comment says the same thing 2x about the return values.

The 1st text "If it returns true, the change is replicated, otherwise,
it is not." should be replaced by the better wording of the 2nd text
("If the change is to be replicated this function returns true, else
false."). Then, that 2nd text can be removed (from where it is later
in this same comment).

~~~

5. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ ExprContext *ecxt;
+ int filter_index = map_changetype_pubaction[*action];
+
+ /* *action is already assigned default by caller */
+ Assert(*action == REORDER_BUFFER_CHANGE_INSERT ||
+        *action == REORDER_BUFFER_CHANGE_UPDATE ||
+        *action == REORDER_BUFFER_CHANGE_DELETE);
+

Accessing the map_changetype_pubaction array should be done *after* the Assert.

~~~

6. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

Actually, instead of assigning the filter_insert and then referring to
entry->exprstate[filter_index] in multiple places, now the code might
be neater if we simply assign a local variable “filter_exprstate” like
below and use that instead.

ExprState *filter_exprstate;
...
filter_exprstate = entry->exprstate[map_changetype_pubaction[*action]];

Please consider what way you think is best.

~~~

7. src/backend/replication/pgoutput/pgoutput.c - pgoutput_row_filter

+ /*
+ * For the following occasions where there is only one tuple, we can
+ * evaluates the row filter for the not null tuple and return.
+ *
+ * For INSERT: we only have new tuple.
+ *
+ * For UPDATE: if no old tuple, it means none of the replica identity
+ * columns changed and this would reduce to a simple update. we only need
+ * to evaluate the row filter for new tuple.
+ *
+ * FOR DELETE: we only have old tuple.
+ */

There are several things not quite right with that comment:
a. I thought now it should refer to "slots" instead of "tuples"
b. Some of the upper/lowercase is wonky
c. Maybe it reads better without the ":"

Suggested replacement comment:

/*
* For the following occasions where there is only one slot, we can
* evaluate the row filter for the not-null slot and return.
*
* For INSERT we only have the new slot.
*
* For UPDATE if 

RE: [PATCH] Add tab completion for foreign table

2022-01-12 Thread tanghy.f...@fujitsu.com
On Thursday, January 13, 2022 12:38 PM, Fujii Masao 
 wrote:
> Isn't it better to tab-complete not only "PARTITION OF" but also "(" for
> CREATE FOREIGN TABLE?

Thanks for your review. Left bracket completion added.

> IMO it's better to make the docs changes in a separate patch because they
> are not directly related to the improvement of tab-completion.

Agreed. The former single patch was divided into two:
0001 patch: adds tab completion for foreign table.
0002 patch: modifies some doc descriptions.

Regards,
Tang


v-0001-add-tab-completion-for-PARTITION-OF-when-creating.patch
Description:  v-0001-add-tab-completion-for-PARTITION-OF-when-creating.patch


v2-0002-Modify-doc-description-for-ALTER-TABLE-in-accordi.patch
Description:  v2-0002-Modify-doc-description-for-ALTER-TABLE-in-accordi.patch


Re: Why is src/test/modules/committs/t/002_standby.pl flaky?

2022-01-12 Thread Alexander Lakhin
13.01.2022 09:36, Thomas Munro wrote:
> On Wed, Jan 12, 2022 at 8:00 PM Alexander Lakhin  wrote:
>> By the look of things, you are right and this is the localhost-only issue.
> But can't that be explained with timing races?  You change some stuff
> around and it becomes less likely that you get a FIN to arrive in a
> super narrow window, which I'm guessing looks something like: recv ->
> EWOULDBLOCK, [receive FIN], wait -> FD_CLOSE, wait [hangs].  Note that
> it's not happening on several Windows BF animals, and the ones it is
> happening on only do it only every few weeks.
But I still see the issue when I run both test parts on a single
machine: first instance is `vcregress taptest src\test\restart` and the
second `set NO_TEMP_INSTALL=1 & vcregress taptest contrib/postgres_fdw`
(see attachment).

I'll try new tests and continue investigation later today/tomorrow. Thanks!

Best regards,
Alexander


Re: Improve error handling of HMAC computations and SCRAM

2022-01-12 Thread Michael Paquier
On Thu, Jan 13, 2022 at 02:01:24AM +0300, Sergey Shinderuk wrote:
> Gave it a thorough read.  Looks good, except for errstr not set in a couple
> of places (see the diff attached).

Thanks for the review.  The comments about pg_hmac_ctx->data were
wrong from the beginning, coming, I guess, from one of the earlier
patch versions where this was discussed.  So I have applied that
independently.

I have also spent a good amount of time on that to close the loop and
make sure that no code paths are missing an error context, adjusted a
couple of comments to explain more the role of *errstr in all the
SCRAM routines, and finally applied it on HEAD.
--
Michael


signature.asc
Description: PGP signature


Re: 2022-01 Commitfest

2022-01-12 Thread Michael Paquier
On Wed, Jan 12, 2022 at 01:41:42PM +0800, Julien Rouhaud wrote:
> The January commitfest should have started almost two weeks ago, but given 
> that
> nothing happened until now I think that it's safe to assume that either
> everyone forgot or no one wanted to volunteer.

Thanks, Julien!
--
Michael


signature.asc
Description: PGP signature