Re: [PATCH] Completed unaccent dictionary with many missing characters

2022-06-27 Thread Michael Paquier
On Thu, Jun 23, 2022 at 02:10:42PM +0200, Przemysław Sztoch wrote:
> The only division that is probably possible is the one attached.

Well, the addition of Cyrillic does not make the removal of
SOUND RECORDING COPYRIGHT or the DEGREEs necessary; that implies the
use of a dictionary when manipulating the set of codepoints, but that's
me being too picky.  Just to say that I am fine with what you are
proposing here.

By the way, could you add a couple of regression tests for each
patch with a sample of the characters added?  U+210C is a particularly
sensitive case, as we should really make sure that it maps to what we
want even if Latin-ASCII.xml tells a different story.  This requires
the addition of a couple of queries in unaccent.sql with the expected
output updated in unaccent.out.
--
Michael




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread Andres Freund
Hi,

On 2022-06-28 11:17:42 +0700, John Naylor wrote:
> On Mon, Jun 27, 2022 at 10:23 PM Hannu Krosing  wrote:
> >
> > > Another thought: for non-x86 platforms, the SIMD nodes degenerate to
> > > "simple loop", and looping over up to 32 elements is not great
> > > (although possibly okay). We could do binary search, but that has bad
> > > branch prediction.
> >
> > I am not sure that for relevant non-x86 platforms SIMD / vector
> > instructions would not be used (though it would be a good idea to
> > verify)
> 
> By that logic, we can also dispense with intrinsics on x86 because the
> compiler will autovectorize there too (if I understand your claim
> correctly). I'm not quite convinced of that in this case.

Last time I checked (maybe a year ago?) none of the popular compilers could
autovectorize that code pattern.
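
For context, the pattern in question is essentially a linear search over a
small fixed-size key array inside a node; the intrinsics are what a
hand-written SSE2 path buys us.  A rough sketch (illustrative only, not
code from the patch; it assumes the key array is padded to a multiple of
16 bytes, as in a fixed-size node):

```
#include <emmintrin.h>			/* SSE2 */
#include <stdint.h>

/* Scalar fallback: loop over up to 32 keys. */
static int
node_search_scalar(const uint8_t *keys, int count, uint8_t key)
{
	for (int i = 0; i < count; i++)
	{
		if (keys[i] == key)
			return i;
	}
	return -1;
}

/* SSE2 version: compare 16 keys at a time, then inspect the bitmask. */
static int
node_search_sse2(const uint8_t *keys, int count, uint8_t key)
{
	__m128i		needle = _mm_set1_epi8((char) key);

	for (int i = 0; i < count; i += 16)
	{
		__m128i		chunk = _mm_loadu_si128((const __m128i *) (keys + i));
		int			mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, needle));

		if (mask != 0)
		{
			int			pos = i + __builtin_ctz(mask);	/* gcc/clang builtin */

			return (pos < count) ? pos : -1;
		}
	}
	return -1;
}
```

The scalar loop above is the code pattern the compilers were being asked
to vectorize on their own.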

Greetings,

Andres Freund




Re: Comments referring to pg_start/stop_backup

2022-06-27 Thread Kyotaro Horiguchi
At Tue, 28 Jun 2022 13:41:58 +0900, Michael Paquier  wrote 
in 
> Hi all,
> 
> While browsing through the recent changes with the base backup APIs, I
> have noticed that a couple of comments did not get the renaming of the
> SQL functions to pg_backup_start/stop, as of the attached.
> 
> That's not a big deal, but let's be right.

+1 and I don't find other instances of the same mistake.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Comments referring to pg_start/stop_backup

2022-06-27 Thread Michael Paquier
Hi all,

While browsing through the recent changes with the base backup APIs, I
have noticed that a couple of comments did not get the renaming of the
SQL functions to pg_backup_start/stop, as of the attached.

That's not a big deal, but let's be right.

Thanks,
--
Michael
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6eba626420..e23451b0f1 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -716,7 +716,7 @@ InitWalRecovery(ControlFileData *ControlFile, bool *wasShutdown_ptr,
 		 * know how far we need to replay the WAL before we reach consistency.
 		 * This can happen for example if a base backup is taken from a
 		 * running server using an atomic filesystem snapshot, without calling
-		 * pg_start/stop_backup. Or if you just kill a running primary server
+		 * pg_backup_start/stop. Or if you just kill a running primary server
 		 * and put it into archive recovery by creating a recovery signal
 		 * file.
 		 *
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 5244823ff8..95440013c0 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 
-#include "access/xlog_internal.h"	/* for pg_start/stop_backup */
+#include "access/xlog_internal.h"	/* for pg_backup_start/stop */
 #include "common/compression.h"
 #include "common/file_perm.h"
 #include "commands/defrem.h"




Re: WIP Patch: Add a function that returns binary JSONB as a bytea

2022-06-27 Thread John Naylor
I wrote:
> We can also shave a
> few percent by having pg_utf8_verifystr use SSE2 for the ascii path. I
> can look into this.

Here's a patch for that. If the input is mostly ascii, I'd expect that
part of the flame graph to shrink by 40-50% and give a small boost
overall.

-- 
John Naylor
EDB: http://www.enterprisedb.com
 src/common/wchar.c   | 18 
 src/include/mb/pg_wchar.h| 50 ++--
 src/test/regress/expected/conversion.out |  3 +-
 src/test/regress/sql/conversion.sql  |  3 +-
 4 files changed, 58 insertions(+), 16 deletions(-)

diff --git a/src/common/wchar.c b/src/common/wchar.c
index 1e6e198bf2..a305e0e66b 100644
--- a/src/common/wchar.c
+++ b/src/common/wchar.c
@@ -1918,26 +1918,20 @@ pg_utf8_verifystr(const unsigned char *s, int len)
 	const int	orig_len = len;
 	uint32		state = BGN;
 
-/*
- * Sixteen seems to give the best balance of performance across different
- * byte distributions.
- */
-#define STRIDE_LENGTH 16
-
-	if (len >= STRIDE_LENGTH)
+	if (len >= ASCII_CHECK_LEN)
 	{
-		while (len >= STRIDE_LENGTH)
+		while (len >= ASCII_CHECK_LEN)
 		{
 			/*
 			 * If the chunk is all ASCII, we can skip the full UTF-8 check,
 			 * but we must first check for a non-END state, which means the
 			 * previous chunk ended in the middle of a multibyte sequence.
 			 */
-			if (state != END || !is_valid_ascii(s, STRIDE_LENGTH))
-				utf8_advance(s, &state, STRIDE_LENGTH);
+			if (state != END || !is_valid_ascii(s, ASCII_CHECK_LEN))
+				utf8_advance(s, &state, ASCII_CHECK_LEN);
 
-			s += STRIDE_LENGTH;
-			len -= STRIDE_LENGTH;
+			s += ASCII_CHECK_LEN;
+			len -= ASCII_CHECK_LEN;
 		}
 
 		/* The error state persists, so we only need to check for it here. */
diff --git a/src/include/mb/pg_wchar.h b/src/include/mb/pg_wchar.h
index 31f5b393da..ca238c212b 100644
--- a/src/include/mb/pg_wchar.h
+++ b/src/include/mb/pg_wchar.h
@@ -700,19 +700,64 @@ extern WCHAR *pgwin32_message_to_UTF16(const char *str, int len, int *utf16len);
 #endif
 
 
+/*
+ * Note: We piggy-back on the check for SSE 4.2 intrinsics but only need SSE2 at runtime.
+ * That's supported by all x86-64 hardware, so we don't need an indirect function call.
+ * WIP: put this somewhere useful
+ */
+#if (defined (__x86_64__) || defined(_M_AMD64)) && (defined(USE_SSE42_CRC32C) || defined(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK))
+#include 
+#define USE_SSE2
+#endif
+
 /*
  * Verify a chunk of bytes for valid ASCII.
  *
  * Returns false if the input contains any zero bytes or bytes with the
- * high-bit set. Input len must be a multiple of 8.
+ * high-bit set. Input len must be a multiple of the chunk size (8 or 16).
  */
 static inline bool
 is_valid_ascii(const unsigned char *s, int len)
 {
+#ifdef USE_SSE2
+	__m128i chunk,
+error_cum = _mm_setzero_si128(),
+zeros;
+
+/*
+ * With two chunks, gcc can unroll the loop, so provide a convenience macro for
+ * callers. Even if the compiler can unroll a longer loop, it's not worth it
+ * because callers might have to use a byte-wise algorithm if we return false.
+ */
+#define ASCII_CHECK_LEN (2 * sizeof(__m128i))
+	Assert(len % sizeof(chunk) == 0);
+
+	while (len > 0)
+	{
+		chunk = _mm_loadu_si128((const __m128i *) s);
+
+		/* Capture all set bits in this chunk. */
+		error_cum = _mm_or_si128(error_cum, chunk);
+
+		/*
+		 * Set all bits in each lane of the error accumulator where input bytes are zero.
+		 */
+		zeros = _mm_cmpeq_epi8(chunk, _mm_setzero_si128());
+		error_cum = _mm_or_si128(error_cum, zeros);
+
+		s += sizeof(chunk);
+		len -= sizeof(chunk);
+	}
+
+	/* Check if any high bits in the error accumulator got set. */
+	return _mm_movemask_epi8(error_cum) == 0;
+
+#else
 	uint64		chunk,
 highbit_cum = UINT64CONST(0),
 zero_cum = UINT64CONST(0x8080808080808080);
 
+#define ASCII_CHECK_LEN (2 * sizeof(uint64))
 	Assert(len % sizeof(chunk) == 0);
 
 	while (len > 0)
@@ -734,7 +779,7 @@ is_valid_ascii(const unsigned char *s, int len)
 		 */
 		zero_cum &= (chunk + UINT64CONST(0x7f7f7f7f7f7f7f7f));
 
-		/* Capture any set bits in this chunk. */
+		/* Capture all set bits in this chunk. */
 		highbit_cum |= chunk;
 
 		s += sizeof(chunk);
@@ -750,6 +795,7 @@ is_valid_ascii(const unsigned char *s, int len)
 		return false;
 
 	return true;
+#endif /* USE_SSE2 */
 }
 
 #endif			/* PG_WCHAR_H */
diff --git a/src/test/regress/expected/conversion.out b/src/test/regress/expected/conversion.out
index 442e7aff2b..434dc4d93c 100644
--- a/src/test/regress/expected/conversion.out
+++ b/src/test/regress/expected/conversion.out
@@ -140,7 +140,8 @@ select description, (test_conv(inbytes, 'utf8', 'utf8')).* from utf8_verificatio
 -- will contain all 4 bytes if they are present, so various
 -- expressions below add 3 ASCII bytes to the end to ensure
 -- consistent error messages.
--- The number 64 below needs to be at least the value of STRIDE_LENGTH in wchar.c.
+-- The number 64 below needs to equal or a multiple of the 

Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread Julien Rouhaud
Hi,

On Mon, Jun 27, 2022 at 12:03:20PM -0400, Isaac Morland wrote:
> On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud  wrote:
>
> > More generally, I'm -0.5 on the feature.
> > I prefer to force using SQL-compliant queries, and also not take bad
> > habits.
> >
>
> As to forcing SQL-compliant queries, that ship sailed a long time ago:
> Postgres allows but does not enforce the use of SQL-compliant queries, and
> many of its important features are extensions anyway, so forcing SQL
> compliant queries is out of the question (although I could see the utility
> of a mode where it warns or errors on non-compliant queries, at least in
> principle).

Sure, but it doesn't mean that we should support even more non-compliant
syntax without any restraint.  In this case, I don't see much benefit, as
it's not solving a performance problem or anything like that.

> As to bad habits, I'm having trouble understanding. Why do you think
> leaving the alias off a subquery is a bad habit (assuming it were allowed)?

I think it's a bad habit because, as far as I can see, it's not supported on
MySQL or SQL Server.

> If the name is never used, why are we required to supply it?

I'm not saying that I'm thrilled about having to do so, but it's also not a
huge burden.  And since it's required, I'm in the habit of automatically
putting in some random alias when writing a one-shot query that indeed
doesn't need to use the alias.

But similarly, I have many times relied on the fact that writable CTEs are
executed even if not explicitly referenced.  So by the same argument,
shouldn't we allow something like this?

WITH (INSERT INTO t SELECT * FROM pending WHERE ts < now())
SELECT now() AS last_processing_time;




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread John Naylor
On Mon, Jun 27, 2022 at 10:23 PM Hannu Krosing  wrote:
>
> > Another thought: for non-x86 platforms, the SIMD nodes degenerate to
> > "simple loop", and looping over up to 32 elements is not great
> > (although possibly okay). We could do binary search, but that has bad
> > branch prediction.
>
> I am not sure that for relevant non-x86 platforms SIMD / vector
> instructions would not be used (though it would be a good idea to
> verify)

By that logic, we can also dispense with intrinsics on x86 because the
compiler will autovectorize there too (if I understand your claim
correctly). I'm not quite convinced of that in this case.

> I would definitely test before assuming binary search is better.

I wasn't very clear in my language, but I did reject binary search as
having bad branch prediction.

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: Perform streaming logical transactions by background workers and parallel apply

2022-06-27 Thread Amit Kapila
On Tue, Jun 28, 2022 at 8:51 AM wangw.f...@fujitsu.com
 wrote:
>
> On Thu, Jun 23, 2022 at 16:44 PM Amit Kapila  wrote:
> > On Thu, Jun 23, 2022 at 12:51 PM wangw.f...@fujitsu.com
> >  wrote:
> > >
> > > On Mon, Jun 20, 2022 at 11:00 AM Amit Kapila 
> > wrote:
> > > > I have improved the comments in this and other related sections of the
> > > > patch. See attached.
> > > Thanks for your comments and patch!
> > > Improved the comments as you suggested.
> > >
> > > > > > 3.
> > > > > > +
> > > > > > +  
> > > > > > +   Setting streaming mode to apply could export 
> > > > > > invalid
> > > > LSN
> > > > > > +   as finish LSN of failed transaction. Changing the streaming 
> > > > > > mode and
> > > > making
> > > > > > +   the same conflict writes the finish LSN of the failed 
> > > > > > transaction in the
> > > > > > +   server log if required.
> > > > > > +  
> > > > > >
> > > > > > How will the user identify that this is an invalid LSN value and she
> > > > > > shouldn't use it to SKIP the transaction? Can we change the second
> > > > > > sentence to: "User should change the streaming mode to 'on' if they
> > > > > > would instead wish to see the finish LSN on error. Users can use
> > > > > > finish LSN to SKIP applying the transaction." I think we can give
> > > > > > reference to docs where the SKIP feature is explained.
> > > > > Improved the sentence as suggested.
> > > > >
> > > >
> > > > You haven't answered first part of the comment: "How will the user
> > > > identify that this is an invalid LSN value and she shouldn't use it to
> > > > SKIP the transaction?". Have you checked what value it displays? For
> > > > example, in one of the case in apply_error_callback as shown in below
> > > > code, we don't even display finish LSN if it is invalid.
> > > > else if (XLogRecPtrIsInvalid(errarg->finish_lsn))
> > > > errcontext("processing remote data for replication origin \"%s\"
> > > > during \"%s\" in transaction %u",
> > > >errarg->origin_name,
> > > >logicalrep_message_type(errarg->command),
> > > >errarg->remote_xid);
> > > I am sorry that I missed something in my previous reply.
> > > The invalid LSN value here is to say InvalidXLogRecPtr (0/0).
> > > Here is an example :
> > > ```
> > > 2022-06-23 14:30:11.343 CST [822333] logical replication worker CONTEXT:
> > processing remote data for replication origin "pg_16389" during "INSERT" for
> > replication target relation "public.tab" in transaction 727 finished at 0/0
> > > ```
> > >
> >
> > I don't think it is a good idea to display invalid values. We can mask
> > this as we are doing in other cases in function apply_error_callback.
> > The ideal way is that we provide a view/system table for users to
> > check these errors but that is a matter of another patch. So users
> > probably need to check Logs to see if the error is from a background
> > apply worker to decide whether or not to switch streaming mode.
>
> Thanks for your comments.
> I improved it as you suggested. I mask the LSN if it is an invalid LSN (0/0).
> Also, I improved the related pg-doc as following:
> ```
>When the streaming mode is parallel, the finish LSN of
>failed transactions may not be logged. In that case, it may be necessary to
>change the streaming mode to on and cause the same
>conflicts again so the finish LSN of the failed transaction will be written
>to the server log. For the usage of finish LSN, please refer to linkend="sql-altersubscription">ALTER SUBSCRIPTION ...
>SKIP.
> ```
> After improving this (mask invalid LSN), I found that this improvement and
> parallel apply patch do not seem to have a strong correlation. Would it be
> better to improve and commit in another separate patch?
>

Is there any other case where we can hit this code path (mask the
invalid LSN) without this patch?

-- 
With Regards,
Amit Kapila.




Re: NAMEDATALEN increase because of non-latin languages

2022-06-27 Thread Julien Rouhaud
On Sat, Jun 25, 2022 at 08:00:04PM -0700, Andres Freund wrote:
>
> On 2022-06-26 10:48:24 +0800, Julien Rouhaud wrote:
> > Anyway, per the nearby discussions I don't see much interest, especially 
> > not in
> > the context of varlena identifiers (I should have started a different 
> > thread,
> > sorry about that), so I don't think it's worth investing more efforts into
> > it.
>
> FWIW, I still think it's a worthwhile feature

Oh, ok, I will start a new thread then.

> - just, to me, unrelated to varlena identifiers.

Yeah I get that and I agree.

> I've seen tremendous amounts of space wasted in tables
> just because of alignment issues...

Same here unfortunately.




Re: Support logical replication of DDLs

2022-06-27 Thread Amit Kapila
On Sun, Jun 26, 2022 at 11:47 PM Alvaro Herrera  wrote:
>
> On 2022-Jun-22, vignesh C wrote:
>
> > 1) Creation of temporary table fails infinitely in the subscriber.
> > CREATE TEMPORARY TABLE temp1 (a int primary key);
> >
> > The above statement is converted to the below format:
> > CREATE TEMPORARY TABLE  pg_temp.temp1 (a pg_catalog.int4   ,
> > CONSTRAINT temp1_pkey PRIMARY KEY (a));
> > While handling the creation of temporary table in the worker, the
> > worker fails continuously with the following error:
> > 2022-06-22 14:24:01.317 IST [240872] ERROR:  schema "pg_temp" does not exist
>
> Perhaps one possible fix is to change the JSON format string used in
> deparse_CreateStmt.  Currently, the following is used:
>
> +   if (node->ofTypename)
> +   fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s 
> %{identity}D "
> +   "OF %{of_type}T %{table_elements}s "
> +   "%{with_clause}s %{on_commit}s %{tablespace}s";
> +   else
> +   fmtstr = "CREATE %{persistence}s TABLE %{if_not_exists}s 
> %{identity}D "
> +   "(%{table_elements:, }s) %{inherits}s "
> +   "%{with_clause}s %{on_commit}s %{tablespace}s";
> +
> +   createStmt =
> +   new_objtree_VA(fmtstr, 1,
> +  "persistence", ObjTypeString,
> +  
> get_persistence_str(relation->rd_rel->relpersistence));
>
> (Note that the word for the "persistence" element here comes straight
> from relation->rd_rel->relpersistence.)  Maybe it would be more
> appropriate to set the schema to empty when the table is temp, since the
> temporary-ness is in the %{persistence} element, and thus there is no
> need to schema-qualify the table name.
>
>
> However, that would still replicate a command that involves a temporary
> table, which perhaps should not be considered fit for replication.  So
> another school of thought is that if the %{persistence} is set to
> TEMPORARY, then it would be better to skip replicating the command
> altogether.
>

+1. I think it doesn't make sense to replicate temporary tables.
Similarly, we don't need to replicate unlogged tables.

>  I'm not sure how to plug that in the replication layer,
> however.
>

I see two possibilities: (a) we can check the persistence and skip
logging in the event trigger where the patch deparses the DDL and
WAL-logs it, or (b) we can add a similar check in pgoutput.c where we
send the DDL downstream.

I feel (a) is better, unless it is difficult to detect at that stage,
as it saves additional WAL.
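
For illustration, option (a) essentially amounts to an early persistence
check in the deparse path.  A hedged sketch follows; the enclosing helper
is hypothetical, and only RELPERSISTENCE_TEMP is an existing PostgreSQL
definition:

```
/* Hypothetical helper for option (a): skip deparsing/WAL-logging DDL that
 * targets a temporary relation.  The function name is illustrative only. */
static bool
ddl_deparse_should_skip(Relation relation)
{
	/* Temp (and, per the discussion, perhaps unlogged) tables are local. */
	return relation->rd_rel->relpersistence == RELPERSISTENCE_TEMP;
}
```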

-- 
With Regards,
Amit Kapila.




RE: Perform streaming logical transactions by background workers and parallel apply

2022-06-27 Thread wangw.f...@fujitsu.com
On Thu, Jun 23, 2022 at 9:41 AM Peter Smith  wrote:
> Here are some review comments for v12-0002

Thanks for your comments.

> 3. .../subscription/t/022_twophase_cascade.pl
> 
> For every test file in this patch the new function is passed $is_apply
> = 0/1 to indicate to use 'on' or 'apply' parameter value. But in this
> test file the parameter is passed as $streaming_mode = 'on'/'apply'.
> 
> I was wondering if (for the sake of consistency) it might be better to
> use the same parameter kind for all of the test files. Actually, I
> don't care if you choose to do nothing and leave this as-is; I am just
> posting this review comment in case it was not a deliberate decision
> to implement them differently.
> 
> e.g.
> + my ($node_publisher, $node_subscriber, $appname, $is_apply) = @_;
> 
> versus
> + my ($node_A, $node_B, $node_C, $appname_B, $appname_C,
> $streaming_mode) =
> +   @_;

This is because in 022_twophase_cascade.pl the ALTER SUBSCRIPTION that
switches the streaming mode is inside test_streaming(), so it is more
convenient to pass the streaming mode we use (on or apply) and use it
directly in the ALTER SUBSCRIPTION command.
In the other files, we need the option only because we check the log in
apply mode, so I think it is sufficient to pass 'is_apply' (whose value
is 0 or 1).
Because of these differences, I did not change it.

The rest of the comments are improved as suggested.
The new patches were attached in [1].

[1] - 
https://www.postgresql.org/message-id/OS3PR01MB62758DBE8FA12BA72A43AC819EB89%40OS3PR01MB6275.jpnprd01.prod.outlook.com

Regards,
Wang wei


RE: Perform streaming logical transactions by background workers and parallel apply

2022-06-27 Thread wangw.f...@fujitsu.com
On Mon, Jun 21, 2022 at 9:41 AM Peter Smith  wrote:
> Here are some review comments for the v11-0001 patch.
> 
> (I will review the remaining patches 0002-0005 and post any comments later)
> 

Thanks for your comments.

> 6. doc/src/sgml/protocol.sgml
> 
> Since there are protocol changes made here, shouldn’t there also be
> some corresponding LOGICALREP_PROTO_XXX constants and special checking
> added in the worker.c?

I think it is okay not to add a new macro, because we just expanded the
existing option ("streaming"), and we added a version check in the
function apply_handle_stream_abort.

> 8. doc/src/sgml/ref/create_subscription.sgml
> 
> + 
> +  If set to on, the changes of transaction are
> +  written to temporary files and then applied at once after the
> +  transaction is committed on the publisher.
> + 
> 
> SUGGESTION
> If set to on, the incoming changes are written to a temporary file and
> then applied only after the transaction is committed on the publisher.

In "on" mode, there may be more than one temporary file for one streaming
transaction. (see the invocation of function BufFileCreateFileSet in function
stream_open_file and function subxact_info_write)
So I think the existing description might be better.
If you feel this sentence is not clear, I will try to improve it later.

> 10. src/backend/access/transam/xact.c
> 
> @@ -1741,6 +1742,13 @@ RecordTransactionAbort(bool isSubXact)
>   elog(PANIC, "cannot abort transaction %u, it was already committed",
>   xid);
> 
> + /*
> + * Are we using the replication origins feature?  Or, in other words,
> + * are we replaying remote actions?
> + */
> + replorigin = (replorigin_session_origin != InvalidRepOriginId &&
> +   replorigin_session_origin != DoNotReplicateId);
> +
>   /* Fetch the data we need for the abort record */
>   nrels = smgrGetPendingDeletes(false, );
>   nchildren = xactGetCommittedChildren();
> @@ -1765,6 +1773,11 @@ RecordTransactionAbort(bool isSubXact)
>  MyXactFlags, InvalidTransactionId,
>  NULL);
> 
> + if (replorigin)
> + /* Move LSNs forward for this replication origin */
> + replorigin_session_advance(replorigin_session_origin_lsn,
> +XactLastRecEnd);
> +
> 
> I did not see any reason why the code assigning the 'replorigin' and
> the code checking the 'replorigin' are separated like they are. I
> thought these 2 new code fragments should be kept together. Perhaps it
> was decided this assignment must be outside the critical section? But
> if that’s the case maybe a comment explaining so would be good.
> 
> ~~~
> 
> 11. src/backend/access/transam/xact.c
> 
> + if (replorigin)
> + /* Move LSNs forward for this replication origin */
> + replorigin_session_advance(replorigin_session_origin_lsn,
> +
> 
> The positioning of that comment is unusual. Maybe better before the check?

As Amit-san said in [1], this is just for consistency with the code in the
function RecordTransactionCommit.

> 12. src/backend/commands/subscriptioncmds.c - defGetStreamingMode
> 
> + /*
> + * If no parameter given, assume "true" is meant.
> + */
> + if (def->arg == NULL)
> + return SUBSTREAM_ON;
> 
> SUGGESTION for comment
> If the streaming parameter is given but no parameter value is
> specified, then assume "true" is meant.

I think it might be better to be consistent with the function defGetBoolean
here.

> 24. .../replication/logical/applybgwroker.c - LogicalApplyBgwLoop
> 
> +/* Apply Background Worker main loop */
> +static void
> +LogicalApplyBgwLoop(shm_mq_handle *mqh, volatile ApplyBgworkerShared
> *pst)
> 
> Why is the name inconsistent with other function names in the file?
> Should it be apply_bgworker_loop?

I think it is better for this function name to be consistent with the
function LogicalRepApplyLoop.

> 28. .../replication/logical/applybgwroker.c - LogicalApplyBgwMain
> 
> For consistency should it be called apply_bgworker_main?

I think it is better for this function name to be consistent with the
function ApplyWorkerMain.

> 30. src/backend/replication/logical/decode.c
> 
> @@ -651,9 +651,10 @@ DecodeCommit(LogicalDecodingContext *ctx,
> XLogRecordBuffer *buf,
>   {
>   for (i = 0; i < parsed->nsubxacts; i++)
>   {
> - ReorderBufferForget(ctx->reorder, parsed->subxacts[i], buf->origptr);
> + ReorderBufferForget(ctx->reorder, parsed->subxacts[i], buf->origptr,
> + commit_time);
>   }
> - ReorderBufferForget(ctx->reorder, xid, buf->origptr);
> + ReorderBufferForget(ctx->reorder, xid, buf->origptr, commit_time);
> 
> ReorderBufferForget was declared with 'abort_time' param. So it makes
> these calls a bit confusing looking to be passing 'commit_time'
> 
> Maybe better to do like below and pass 'forget_time' (inside that
> 'if') along with an explanatory comment:
> 
> TimestampTz forget_time = commit_time;

I did not change this. I am just not sure how much this will help.

> 36. src/backend/replication/logical/launcher.c -
> logicalrep_apply_background_worker_count
> 
> + int res 

Re: Repeatability of installcheck for test_oat_hooks

2022-06-27 Thread Michael Paquier
On Tue, Jun 28, 2022 at 10:12:48AM +0900, Michael Paquier wrote:
> As far as I can see, test_oat_hooks has no need to keep around the
> extra role it creates as part of the regression tests, because at the
> end of the test there are no objects that depend on it.  Wouldn't it
> be better to make the test self-isolated?  NO_INSTALLCHECK is set in
> the module because of the issue with caching and the namespace search
> hooks, but it seems to me that we'd better make the test self-isolated
> in the long term, like in the attached.

And actually, I have found a second issue here.  The tests issue a
GRANT on work_mem, like this:
GRANT SET ON PARAMETER work_mem TO PUBLIC;

This has the effect of leaving an entry around in pg_parameter_acl,
which is by design in aclchk.c.  However, this interacts with
guc_privs.sql in unsafe_tests, because those tests include similar
GRANT queries, also on work_mem.  So, if one issues an installcheck on
test_oat_modules followed by an installcheck in unsafe_tests, the
latter fails.  I think we'd better add an extra REVOKE to clear the
contents of pg_parameter_acl.
--
Michael
diff --git a/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out b/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
index 39b274b8fa..a86455395d 100644
--- a/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
+++ b/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
@@ -44,6 +44,9 @@ privileges for parameter another.bogus
 GRANT SET ON PARAMETER work_mem TO PUBLIC;
 NOTICE:  in process utility: superuser attempting GrantStmt
 NOTICE:  in process utility: superuser finished GrantStmt
+REVOKE SET ON PARAMETER work_mem FROM PUBLIC;
+NOTICE:  in process utility: superuser attempting GrantStmt
+NOTICE:  in process utility: superuser finished GrantStmt
 REVOKE ALTER SYSTEM ON PARAMETER work_mem FROM PUBLIC;
 NOTICE:  in process utility: superuser attempting GrantStmt
 NOTICE:  in process utility: superuser finished GrantStmt
@@ -299,3 +302,4 @@ REVOKE ALL PRIVILEGES ON PARAMETER
 	test_oat_hooks.user_var2, test_oat_hooks.super_var2
 	FROM regress_role_joe;
 DROP ROLE regress_role_joe;
+DROP ROLE regress_test_user;
diff --git a/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql b/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql
index 8b6d5373aa..c27d0068ba 100644
--- a/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql
+++ b/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql
@@ -26,6 +26,7 @@ DROP ROLE regress_role_joe;
 
 -- Check the behavior of the hooks relative to do-nothing grants and revokes
 GRANT SET ON PARAMETER work_mem TO PUBLIC;
+REVOKE SET ON PARAMETER work_mem FROM PUBLIC;
 REVOKE ALTER SYSTEM ON PARAMETER work_mem FROM PUBLIC;
 
 -- Check the behavior of the hooks relative to unrecognized parameters
@@ -98,3 +99,4 @@ REVOKE ALL PRIVILEGES ON PARAMETER
 	test_oat_hooks.user_var2, test_oat_hooks.super_var2
 	FROM regress_role_joe;
 DROP ROLE regress_role_joe;
+DROP ROLE regress_test_user;




Re: Handle infinite recursion in logical replication setup

2022-06-27 Thread Peter Smith
Here are my review comments for the v24* patch set.

Now, these few comments are all trivial and non-functional. Apart from
these, everything looks good to me.


v24-0001


No comments. LGTM


V24-0002


2.1 doc/src/sgml/ref/create_subscription.sgml

+ 
+  Specifies whether the subscription will request the publisher to only
+  send changes that originated locally, or to send any changes
+  regardless of origin. Setting origin to
+  local means that the subscription will request the
+  publisher to only send changes that originated locally. Setting
+  origin to any means that the
+  publisher sends any changes regardless of their origin. The default
+  is any.
+ 

2.1a.
IMO remove the word "any" from "any changes". Then the text will match
what is written in catalogs.sgml.
"send any changes regardless of origin" -> "send changes regardless of
origin" (occurs 2x)

2.1b.
This same text is cut/paste to the commit message so that can also be updated.


~~~

2.2 src/backend/commands/subscriptioncmds.c

+ /*
+ * Even though "origin" parameter allows only "local" and "any"
+ * values, the "origin" parameter type is implemented as string
+ * type instead of boolean to extend the "origin" parameter to
+ * support filtering of origin name specified by the user in the
+ * later versions.
+ */

2.2a.
SUGGESTION
Even though "origin" parameter allows only "local" and "any" values,
it is implemented as a string type so that the parameter can be
extended in future versions to support filtering using origin names
specified by the user.

2.2b.
This same text is cut/paste to the commit message so that can also be updated.



v24-0003


3.1 Commit message

This patch does a couple of things:
change 1) Checks and throws an error if 'copy_data = on' and 'origin =
local' but the publication tables were also replicating from other publishers.
change 2) Adds 'force' value for copy_data parameter.

~

"replicating" -> "replicated"
"change 1)" -> "1)"
"change 2)" -> "2)"


v24-0004


No comments. LGTM

--
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: ICU for global collation

2022-06-27 Thread Justin Pryzby
On Mon, Jun 27, 2022 at 09:10:29AM +0200, Peter Eisentraut wrote:
> On 27.06.22 08:42, Michael Paquier wrote:
> > On Mon, Jun 27, 2022 at 08:23:59AM +0200, Peter Eisentraut wrote:
> > > On 26.06.22 05:51, Julien Rouhaud wrote:
> > > > On Sat, Jun 25, 2022 at 10:19:30AM -0500, Justin Pryzby wrote:
> > > > > > +   if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)
> > > 
> > > I think the fix here is to change <= to < ?
> > 
> > Yes.
> 
> Ok, committed.
> 
> (I see now that in the context of pg_upgrade, writing <= 1400 is equivalent,
> but I find that confusing, so I did < 1500.)

I suggested using <= 1400 for consistency with the other code, and per
bc1fbc960.  But YMMV.

-- 
Justin




Repeatability of installcheck for test_oat_hooks

2022-06-27 Thread Michael Paquier
Hi all,
(Andrew in CC.)

When running installcheck multiple times under src/test/modules/, we
get two failures, in test_pg_dump and test_oat_hooks, because these
keep around some roles created by the tests.

Keeping a role around for test_pg_dump has been discussed already,
since the buildfarm can use it for pg_upgrade and because there are
many objects that depend on the role created:
https://www.postgresql.org/message-id/20180904203012.gg20...@paquier.xyz

Note that it would require a DROP OWNED BY and DROP ROLE, but anyway:
--- a/src/test/modules/test_pg_dump/sql/test_pg_dump.sql
+++ b/src/test/modules/test_pg_dump/sql/test_pg_dump.sql
@@ -106,3 +106,5 @@ ALTER EXTENSION test_pg_dump DROP SERVER s0;
 ALTER EXTENSION test_pg_dump DROP TABLE test_pg_dump_t1;
 ALTER EXTENSION test_pg_dump DROP TYPE test_pg_dump_e1;
 ALTER EXTENSION test_pg_dump DROP VIEW test_pg_dump_v1;
+DROP OWNED BY regress_dump_test_role;
+DROP ROLE regress_dump_test_role;

As far as I can see, test_oat_hooks has no need to keep around the
extra role it creates as part of the regression tests, because at the
end of the test there are no objects that depend on it.  Wouldn't it
be better to make the test self-isolated?  NO_INSTALLCHECK is set in
the module because of the issue with caching and the namespace search
hooks, but it seems to me that we'd better make the test self-isolated
in the long term, like in the attached.

Thoughts?
--
Michael
diff --git a/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out b/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
index 39b274b8fa..c28251b6b6 100644
--- a/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
+++ b/src/test/modules/test_oat_hooks/expected/test_oat_hooks.out
@@ -299,3 +299,4 @@ REVOKE ALL PRIVILEGES ON PARAMETER
 	test_oat_hooks.user_var2, test_oat_hooks.super_var2
 	FROM regress_role_joe;
 DROP ROLE regress_role_joe;
+DROP ROLE regress_test_user;
diff --git a/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql b/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql
index 8b6d5373aa..09a38b902a 100644
--- a/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql
+++ b/src/test/modules/test_oat_hooks/sql/test_oat_hooks.sql
@@ -98,3 +98,4 @@ REVOKE ALL PRIVILEGES ON PARAMETER
 	test_oat_hooks.user_var2, test_oat_hooks.super_var2
 	FROM regress_role_joe;
 DROP ROLE regress_role_joe;
+DROP ROLE regress_test_user;




Re: [BUG] Panic due to incorrect missingContrecPtr after promotion

2022-06-27 Thread Kyotaro Horiguchi
At Mon, 27 Jun 2022 15:02:11 +0900, Michael Paquier  wrote 
in 
> On Fri, Jun 24, 2022 at 04:17:34PM +, Imseih (AWS), Sami wrote:
> > It has been difficult to get a generic repro, but the way we reproduce
> > is through our test suite. To give more details, we are running tests
> > in which we constantly fail over and promote standbys. The issue
> > surfaces after we have gone through a few promotions, which occur
> > every few hours or so (not really important, but to give context).
> 
> Hmm.  Could you describe exactly the failover scenario you are using?
> Is the test using a set of cascading standbys linked to the promoted
> one?  Are the standbys recycled from the promoted nodes with pg_rewind
> or created from scratch with a new base backup taken from the
> freshly-promoted primary?  I have been looking more at this thread
> through the day but I don't see a remaining issue.  It could be
> perfectly possible that we are missing a piece related to the handling
> of those new overwrite contrecords in some cases, like in a rewind.
> 
> > I am adding some additional debugging  to see if I can draw a better
> > picture of what is happening. Will also give aborted_contrec_reset_3.patch 
> > a go, although I suspect it will not handle the specific case we are deaing 
> > with.
> 
> Yeah, this is not going to change much things if you are still seeing
> an issue.  This patch does not change the logic, aka it just

True.  That is a significant hint about what happened at the time.

- Are there only two hosts in the replication set?  I am concerned with
  whether it is a cascading set or not.

- Exactly what are you performing at every failover?  In particular, do
  the steps contain pg_rewind, and do you copy pg_wal and/or archive
  files between the failover hosts?

> simplifies the tracking of the continuation record data, resetting it
> when a complete record has been read.  Saying that, getting rid of the
> dependency on StandbyMode because we cannot promote in the middle of a
> record is nice (my memories around that were a bit blurry but even
> recovery_target_lsn would not recover in the middle of an continuation
> record), and this is not bug so there is limited reason to backpatch
> this part of the change.

Agreed.  In the first place, my "repro" (or the test case) is a bit too
intricate to happen in the field.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Allowing REINDEX to have an optional name

2022-06-27 Thread Justin Pryzby
Maybe the answer is to 1) add a parenthesized option REINDEX(SYSTEM) (to allow
the current behavior); and 2) make REINDEX DATABASE an alias which implies
"SYSTEM false"; 3) prohibit REINDEX (SYSTEM true) SYSTEM, or consider removing
"REINDEX SYSTEM;".

That avoids the opaque and surprising behavior that "REINDEX DATABASE" skips
system stuff but "REINDEX DATABASE foo" doesn't, while allowing the old
behavior (disabled by default).




Re: pg_upgrade (12->14) fails on aggregate

2022-06-27 Thread Justin Pryzby
On Sat, Jun 25, 2022 at 03:34:49PM +0500, Andrey Borodin wrote:
> > On 25 Jun 2022, at 01:28, Justin Pryzby  wrote:
> 
> >  this is my latest.
> > <0001-WIP-pg_upgrade-check-detect-old-polymorphics-from-pr.patch>
> 
> Let's rename "databases_with_old_polymorphics.txt" to something like 
> "old_polymorphics.txt" or maybe even "incompatible_polymorphics_usage.txt"?
> I think you will come up with a better name; my point is that here everything 
> is in "databases", and "old" doesn't describe the essence of the problem.

> Also, let's check that oid of used functions belong to system catalog 
> (<16384)? We don't care about user-defined functions with the same name.

Right now, we test
  =ANY(ARRAY['array_remove(anyarray,anyelement)',...]::regprocedure)

..which will find the system's array_remove, and not some other one, due to
ALWAYS_SECURE_SEARCH_PATH_SQL (which is also why ::regprocedure prints a
namespace for the non-system functions we're interested in displaying).

I had "transnsp.nspname='pg_catalog'", which was redundant, so I removed it.

I tested that this allows upgrades with aggregates on top of non-system
functions of the same name/args:

postgres=# CREATE FUNCTION array_append(anyarray, anyelement) RETURNS ANYARRAY 
LANGUAGE SQL AS $$ $$;
postgres=# CREATE AGGREGATE foo(anyelement) (sfunc=public.array_append, 
stype=anyarray, initcond='{}');

> And, probably, we can do this unconditionally:
> if (old_cluster.major_version >= 9500)
> appendPQExpBufferStr(_polymorphics,
> Nothing bad will happen if we blacklist usage of nonexistent functions.

Nope, it's as I said: this would break pg_upgrade from older versions.

> I realized that my latest patch would break upgrades from old servers, which 
> do
> not have array_position/s nor width_bucket, so ::regprocedure would fail.  
> Maybe
> Andrey's way is better (checking proname rather than its OID).

This fixes several errors with the version test.

-- 
Justin
>From 965deb773dd0170018f4b82d27420c61690ad690 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 10 Jun 2022 11:17:36 -0500
Subject: [PATCH] WIP: pg_upgrade --check: detect old polymorphics from pre-14

These fail when upgrading from pre-14 (as expected), but they should fail
during pg_upgrade --check, not after dumping the cluster and in the middle
of restoring it.

CREATE AGGREGATE array_accum(anyelement) (sfunc=array_append, stype=anyarray, initcond='{}');
CREATE OPERATOR !@# (PROCEDURE = array_append, LEFTARG=anyarray, rightarg=anyelement);

See also:
9e38c2bb5093ceb0c04d6315ccd8975bd17add66
97f73a978fc1aca59c6ad765548ce0096d95a923
---
 src/bin/pg_upgrade/check.c | 131 +
 1 file changed, 131 insertions(+)

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index ace7387edaf..20e7923c20c 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -31,6 +31,7 @@ static void check_for_jsonb_9_4_usage(ClusterInfo *cluster);
 static void check_for_pg_role_prefix(ClusterInfo *cluster);
 static void check_for_new_tablespace_dir(ClusterInfo *new_cluster);
 static void check_for_user_defined_encoding_conversions(ClusterInfo *cluster);
+static void check_for_incompatible_polymorphics(ClusterInfo *cluster);
 static char *get_canonical_locale_name(int category, const char *locale);
 
 
@@ -122,6 +123,12 @@ check_and_dump_old_cluster(bool live_check)
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1300)
 		check_for_user_defined_postfix_ops(&old_cluster);
 
+	/*
+	 * PG 14 changed polymorphic functions from anyarray to anycompatiblearray.
+	 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1300)
+		check_for_incompatible_polymorphics(&old_cluster);
+
 	/*
 	 * Pre-PG 12 allowed tables to be declared WITH OIDS, which is not
 	 * supported anymore. Verify there are none, iff applicable.
@@ -775,6 +782,130 @@ check_proper_datallowconn(ClusterInfo *cluster)
 }
 
 
+/*
+ *	check_for_incompatible_polymorphics()
+ *
+ *	Make sure nothing is using old polymorphic functions with
+ *	anyarray/anyelement rather than the new anycompatible variants.
+ */
+static void
+check_for_incompatible_polymorphics(ClusterInfo *cluster)
+{
+	PGresult   *res;
+	FILE	   *script = NULL;
+	char		output_path[MAXPGPATH];
+	PQExpBufferData old_polymorphics;
+
+	initPQExpBuffer(&old_polymorphics);
+
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 903)
+		appendPQExpBufferStr(&old_polymorphics,
+			"'array_remove(anyarray,anyelement)', "
+			"'array_replace(anyarray,anyelement,anyelement)', ");
+
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 905)
+		appendPQExpBufferStr(&old_polymorphics,
+			"'array_position(anyarray,anyelement)', "
+			"'array_position(anyarray,anyelement,integer)', "
+			"'array_positions(anyarray,anyelement)', "
+			"'width_bucket(anyelement,anyarray)', ");
+
+	appendPQExpBufferStr(&old_polymorphics,
+		"'array_append(anyarray,anyelement)', "
+		"'array_cat(anyarray,anyarray)', "
+		

Re: [PoC] Let libpq reject unexpected authentication requests

2022-06-27 Thread Jacob Champion
On Mon, Jun 27, 2022 at 12:05 PM Jacob Champion  wrote:
> v5 adds a second patch which implements a client-certificate analogue
> to gssencmode; I've named it sslcertmode.

...and v6 fixes check-world, because I always forget about postgres_fdw.

--Jacob
diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 44457f930c..47df10119e 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -9529,7 +9529,7 @@ DO $d$
 END;
 $d$;
 ERROR:  invalid option "password"
-HINT:  Valid options in this context are: service, passfile, channel_binding, 
connect_timeout, dbname, host, hostaddr, port, options, application_name, 
keepalives, keepalives_idle, keepalives_interval, keepalives_count, 
tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslrootcert, 
sslcrl, sslcrldir, sslsni, requirepeer, ssl_min_protocol_version, 
ssl_max_protocol_version, gssencmode, krbsrvname, gsslib, target_session_attrs, 
use_remote_estimate, fdw_startup_cost, fdw_tuple_cost, extensions, updatable, 
truncatable, fetch_size, batch_size, async_capable, parallel_commit, 
keep_connections
+HINT:  Valid options in this context are: service, passfile, channel_binding, 
connect_timeout, dbname, host, hostaddr, port, options, application_name, 
keepalives, keepalives_idle, keepalives_interval, keepalives_count, 
tcp_user_timeout, sslmode, sslcompression, sslcert, sslkey, sslcertmode, 
sslrootcert, sslcrl, sslcrldir, sslsni, requirepeer, require_auth, 
ssl_min_protocol_version, ssl_max_protocol_version, gssencmode, krbsrvname, 
gsslib, target_session_attrs, use_remote_estimate, fdw_startup_cost, 
fdw_tuple_cost, extensions, updatable, truncatable, fetch_size, batch_size, 
async_capable, parallel_commit, keep_connections
 CONTEXT:  SQL statement "ALTER SERVER loopback_nopw OPTIONS (ADD password 
'dummypw')"
 PL/pgSQL function inline_code_block line 3 at EXECUTE
 -- If we add a password for our user mapping instead, we should get a different
From 3973c6d89874c265d0c2187374baf72f968c294f Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Mon, 28 Feb 2022 09:40:43 -0800
Subject: [PATCH v6 1/2] libpq: let client reject unexpected auth methods

The require_auth connection option allows the client to choose a list of
acceptable authentication types for use with the server. There is no
negotiation: if the server does not present one of the allowed
authentication requests, the connection fails. Additionally, all methods
in the list may be negated, e.g. '!password', in which case the server
must NOT use the listed authentication type. The special method "none"
allows/disallows the use of unauthenticated connections (but it does not
govern transport-level authentication via TLS or GSSAPI).

Internally, the patch expands the role of check_expected_areq() to
ensure that the incoming request is compatible with conn->require_auth.
It also introduces a new flag, conn->client_finished_auth, which is set
by various authentication routines when the client side of the handshake
is finished. This signals to check_expected_areq() that an OK message
from the server is expected, and allows the client to complain if the
server forgoes authentication entirely.

(Since the client can't generally prove that the server is actually
doing the work of authentication, this last part is mostly useful for
SCRAM without channel binding. It could also provide a client with a
decent signal that, at the very least, it's not connecting to a database
with trust auth, and so the connection can be tied to the client in a
later audit.)

Deficiencies:
- This feature currently doesn't prevent unexpected certificate exchange
  or unexpected GSSAPI encryption.
- This is unlikely to be very forwards-compatible at the moment,
  especially with SASL/SCRAM.
- SSPI support is "implemented" but untested.
- require_auth=none,scram-sha-256 currently allows the server to leave a
  SCRAM exchange unfinished. This is not net-new behavior but may be
  surprising.
---
 .../postgres_fdw/expected/postgres_fdw.out|   2 +-
 doc/src/sgml/libpq.sgml   | 105 +++
 src/include/libpq/pqcomm.h|   1 +
 src/interfaces/libpq/fe-auth-scram.c  |   1 +
 src/interfaces/libpq/fe-auth.c| 138 +++
 src/interfaces/libpq/fe-connect.c | 164 ++
 src/interfaces/libpq/fe-secure-openssl.c  |   1 -
 src/interfaces/libpq/libpq-int.h  |   7 +
 src/test/authentication/t/001_password.pl | 149 
 src/test/kerberos/t/001_auth.pl   |  26 +++
 src/test/ldap/t/001_auth.pl   |   6 +
 src/test/ssl/t/002_scram.pl   |  25 +++
 12 files changed, 623 insertions(+), 2 deletions(-)

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 44457f930c..493931e003 100644
--- 

Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-27 Thread Hannu Krosing
My current thinking is (based on more insights from Andres) that we
should also have a startup flag to disable superuser altogether to
avoid bypasses via direct manipulation of pg_proc.

Experience shows that 99% of the time one can run PostgreSQL just fine
without a superuser, so having a superuser available all the time is
kind of like leaving a loaded gun on the kitchen table because you
sometimes need to go hunting.

I am especially waiting for Andres' feedback on the viability of this approach.


Cheers
Hannu

On Mon, Jun 27, 2022 at 10:37 PM Jeff Davis  wrote:
>
> On Sat, 2022-06-25 at 00:08 +0200, Hannu Krosing wrote:
> > Hi Pgsql-Hackers
> >
> > As part of ongoing work on PostgreSQL  security hardening we have
> > added a capability to disable all file system access (COPY TO/FROM
> > [PROGRAM] , pg_*file*() functions, lo_*() functions
> > accessing files, etc) in a way that can not be re-enabled without
> > already having access to the file system. That is via a flag which
> > can
> > be set only in postgresql.conf or on the command line.
>
> How much of this can be done as a special extension already?
>
> For instance, a ProcessUtility_hook can prevent superuser from
> executing COPY TO/FROM PROGRAM.
>
> As others point out, that would still leave a lot of surface area for
> attacks, e.g. by manipulating the catalog. But it could be a starting
> place to make attacks "harder", without core postgres needing to make
> security promises that will be hard to keep.
>
> Regards,
> Jeff Davis
>
>




Re: Hardening PostgreSQL via (optional) ban on local file system access

2022-06-27 Thread Jeff Davis
On Sat, 2022-06-25 at 00:08 +0200, Hannu Krosing wrote:
> Hi Pgsql-Hackers
> 
> As part of ongoing work on PostgreSQL  security hardening we have
> added a capability to disable all file system access (COPY TO/FROM
> [PROGRAM] , pg_*file*() functions, lo_*() functions
> accessing files, etc) in a way that can not be re-enabled without
> already having access to the file system. That is via a flag which
> can
> be set only in postgresql.conf or on the command line.

How much of this can be done as a special extension already?

For instance, a ProcessUtility_hook can prevent superuser from
executing COPY TO/FROM PROGRAM.
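
A minimal sketch of such an extension, assuming the hook signature of
recent releases (the error code and message text are illustrative, not a
real policy):

```
#include "postgres.h"
#include "fmgr.h"
#include "nodes/parsenodes.h"
#include "tcop/utility.h"

PG_MODULE_MAGIC;

static ProcessUtility_hook_type prev_ProcessUtility = NULL;

static void
no_program_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
						  bool readOnlyTree, ProcessUtilityContext context,
						  ParamListInfo params, QueryEnvironment *queryEnv,
						  DestReceiver *dest, QueryCompletion *qc)
{
	/* Reject COPY ... TO/FROM PROGRAM regardless of role. */
	if (IsA(pstmt->utilityStmt, CopyStmt) &&
		((CopyStmt *) pstmt->utilityStmt)->is_program)
		ereport(ERROR,
				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
				 errmsg("COPY TO/FROM PROGRAM is disabled on this server")));

	if (prev_ProcessUtility)
		prev_ProcessUtility(pstmt, queryString, readOnlyTree, context,
							params, queryEnv, dest, qc);
	else
		standard_ProcessUtility(pstmt, queryString, readOnlyTree, context,
								params, queryEnv, dest, qc);
}

void
_PG_init(void)
{
	prev_ProcessUtility = ProcessUtility_hook;
	ProcessUtility_hook = no_program_ProcessUtility;
}
```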

As others point out, that would still leave a lot of surface area for
attacks, e.g. by manipulating the catalog. But it could be a starting
place to make attacks "harder", without core postgres needing to make
security promises that will be hard to keep.

Regards,
Jeff Davis






Re: do only critical work during single-user vacuum?

2022-06-27 Thread Peter Geoghegan
On Mon, Jun 27, 2022 at 12:36 PM Justin Pryzby  wrote:
> By chance, I came across this prior thread, which advocated the same thing
> directly (rather than indirectly, as in this year's thread).

Revisiting this topic reminded me that PostgreSQL 14 (the first
version that had the wraparound failsafe mechanism controlled by
vacuum_failsafe_age) has been a stable release for 9 months now. As of
today I am still not aware of even one user that ran into the failsafe
mechanism in production. It might well have happened by now, of
course, but I am not aware of any specific case. Perhaps this will
change soon enough -- maybe somebody else will read this and enlighten
me.

To me the fact that the failsafe seems to seldom kick-in in practice
suggests something about workload characteristics in general: that it
isn't all that common for users to try to get away with putting off
freezing until a table attains an age that is significantly above 1
billion XIDs.

When people talk about things like 64-bit XIDs, I tend to wonder: if 2
billion XIDs wasn't enough, why should 4 billion or 8 billion be
enough? *Maybe* the system can do better by getting even further into
debt than it can today, but you can't expect to avoid freezing
altogether (without significant work elsewhere). My general sense is
that freezing isn't a particularly good thing to try to do lazily --
even if we ignore the risk of an eventual wraparound failure.

-- 
Peter Geoghegan




Re: Retrieving unused tuple attributes in ExecScan

2022-06-27 Thread Finnerty, Jim
Re: So I'm actually using the columns during merge join, basically I'm building 
a bloom filter on the outer relation and filtering out data on the inner 
relation of the join. I'm building the filter on the join keys

We had a whole implementation for Bloom filtering for hash inner join, complete 
with costing and pushdown of the Bloom filter from the build side to the 
execution tree on the probe side (i.e. building a Bloom filter on the inner 
side of the join at the conclusion of the build phase of the hash join, then 
pushing it down as a semi-join filter to the probe side of the join, where it 
could potentially be applied to multiple scans).  After a large change to that 
same area of the code by the community it got commented out and has been in 
that state ever since.  It's a good example of the sort of change that really 
ought to be made with the community because there's too much merge burden 
otherwise.

It was a pretty effective optimization in some cases, though.  Most commercial 
systems have an optimization like this, sometimes with special optimizations 
when the number of distinct join keys is very small. If there is interest in 
reviving this functionality, we could probably extract some patches and work 
with the community to try to get it running again.  
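
For readers unfamiliar with the mechanism, the filter itself is just a
Bloom filter keyed on the join keys.  A hedged sketch using the in-core
helpers from lib/bloomfilter.h (signatures recalled from memory; double-check
them against your tree), assuming pass-by-value join keys:

```
#include "postgres.h"
#include "lib/bloomfilter.h"

/* Build phase: hash every inner-side join key into the filter. */
static bloom_filter *
build_join_filter(Datum *keys, int nkeys)
{
	/* 4096 kB working-memory budget; purely illustrative. */
	bloom_filter *filter = bloom_create(nkeys, 4096, 0);

	for (int i = 0; i < nkeys; i++)
		bloom_add_element(filter, (unsigned char *) &keys[i], sizeof(Datum));
	return filter;
}

/* Probe phase: rows whose key is definitely absent can be discarded early. */
static bool
outer_row_may_match(bloom_filter *filter, Datum key)
{
	return !bloom_lacks_element(filter, (unsigned char *) &key, sizeof(Datum));
}
```

False positives only cost the work the join would have done anyway, so the
filter never changes results; it only skips rows that cannot match.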

   /Jim




Re: Retrieving unused tuple attributes in ExecScan

2022-06-27 Thread Andres Freund
Hi,

(please don't top-quote on PG lists)

On 2022-06-27 19:29:34 +, Ma, Marcus wrote:
> So I'm actually using the columns during merge join, basically I'm building a 
> bloom filter on the outer relation and filtering out data on the inner 
> relation of the join. I'm building the filter on the join keys, so the 
> columns are being used further up the execution tree. However, even on a 
> command like:
> 
> Select * from t1 inner join t2 on t1.c1 = t2.c2;
> 
> The execScan function returns slots that have (0, 0, 0) even though t1.c1 and 
> t2.c2 will be used later on. I know that the Sort node and the MergeJoin node 
> are able to read the actual values of the join keys, but for some reason the 
> values aren't showing up on the SeqScan level. However, as soon as I add a 
> qualification, such as:
> 
> Select * from t1 inner join t2 on t1.c1 = t2.c2 where t1.c1 % 2 = 0;
> 
> The qualification makes the t1.c1 value show up during execScan, but not the 
> t2.c2 value.

Slots can incrementally deform tuples. You need to call
   slot_getsomeattrs(slot, attnum-up-to-which-you-need-columns)
to reliably have the columns deformed.
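
For instance, a caller that wants the second column materialized could do
something like this hedged sketch (not from any patch; the helper name is
made up):

```
#include "postgres.h"
#include "executor/tuptable.h"

/* Illustrative only: fetch attribute 2 of the slot's current tuple. */
static Datum
read_second_column(TupleTableSlot *slot, bool *isnull)
{
	/* Deform attributes 1..2; otherwise tts_values[1] may not be set yet. */
	slot_getsomeattrs(slot, 2);

	*isnull = slot->tts_isnull[1];
	return slot->tts_values[1];
}
```

slot_getallattrs(slot) does the same for every attribute, if the highest
needed attribute number isn't known in advance.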

Greetings,

Andres Freund




Re: do only critical work during single-user vacuum?

2022-06-27 Thread Justin Pryzby
On Thu, Feb 03, 2022 at 01:05:50PM -0500, Robert Haas wrote:
> On Thu, Dec 9, 2021 at 8:56 PM Andres Freund  wrote:
> > I think we should move *away* from single user mode, rather than the
> > opposite. It's a substantial code burden and it's hard to use.
> 
> Yes. This thread seems to be largely devoted to the topic of making
> single-user vacuum work better, but I don't see anyone asking the
> question "why do we have a message that tells people to vacuum in
> single user mode in the first place?". It's basically bad advice,

> The correct thing to do would be to remove
> the hint as bad advice that we never should have offered in the first
> place. And so here. We should not try to make vacuum in single
> user-mode work better or differently, or at least that shouldn't be
> our primary objective. We should just stop telling people to do it. We
> should probably add messages and documentation *discouraging* the use
> of single user mode for recovering from wraparound trouble, exactly
> the opposite of what we do now. There's nothing we can do in
> single-user mode that we can't do equally well in multi-user mode. If
> people try to fix wraparound problems in multi-user mode, they still
> have read-only access to their database, they can use parallelism,
> they can use command line utilities like vacuumdb, and they can use
> psql which has line editing and allows remote access and is a way
> nicer user experience than running postgres --single. We need a really
> compelling reason to tell people to give up all those advantages, and
> there is no such reason. It makes just as much sense as telling people
> to deal with wraparound problems by angering a live anaconda.

By chance, I came across this prior thread, which advocated the same thing
directly (rather than indirectly, as in this year's thread).

https://www.postgresql.org/message-id/flat/CAMT0RQTmRj_Egtmre6fbiMA9E2hM3BsLULiV8W00stwa3URvzA%40mail.gmail.com
|We should stop telling users to "vacuum that database in single-user mode"




Re: Retrieving unused tuple attributes in ExecScan

2022-06-27 Thread Ma, Marcus
Hey Andres,

So I'm actually using the columns during merge join, basically I'm building a 
bloom filter on the outer relation and filtering out data on the inner relation 
of the join. I'm building the filter on the join keys, so the columns are being 
used further up the execution tree. However, even on a command like:

Select * from t1 inner join t2 on t1.c1 = t2.c2;

The execScan function returns slots that have (0, 0, 0) even though t1.c1 and 
t2.c2 will be used later on. I know that the Sort node and the MergeJoin node 
are able to read the actual values of the join keys, but for some reason the 
values aren't showing up on the SeqScan level. However, as soon as I add a 
qualification, such as:

Select * from t1 inner join t2 on t1.c1 = t2.c2 where t1.c1 % 2 = 0;

The qualification makes the t1.c1 value show up during execScan, but not the 
t2.c2 value.

Marcus

On 6/27/22, 3:10 PM, "Andres Freund"  wrote:


Hi,

On 2022-06-27 19:00:44 +, Ma, Marcus wrote:
> If I understand correctly, when a Sequential Scan takes place, the 
ExecScan function (located in executor/execScan.c) does not retrieve all 
attributes per tuple in the TupleTableSlot and only retrieves the necessary 
attribute. So for example, let’s imagine we have a table t1 with 3 number 
fields, c1, c2, and c3. So in the command:
>
> Select * from t1 where t1.c1 > 500;
>
> The returned TupleTableSlot will have its field of tts_values in the form 
(X, 0, 0), where X is the real value of t1.c1 but the fields of c2 and c3 are 
not actually retrieved because they aren’t used. Similarly, for the command:
>
> Select * from t1;
>
> The TupleTableSlot will always return the values of (0, 0, 0) because no
> comparisons are necessary. I am working on code where I’ll need access to
> attributes that aren’t listed in any qualification – what code should I
> change in execScan, or nodeSeqScan to be able to retrieve any attribute 
of a
> tuple? Basically, being able to make execScan return (X, Y, Z) instead of
> (0, 0, 0) even if the command doesn’t use any attribute comparisons.

You'll need to tell the planner that those columns are needed. It's not just
seqscans that otherwise will discard / not compute values.

Where exactly do you need those columns and why?

Greetings,

Andres Freund



Re: Retrieving unused tuple attributes in ExecScan

2022-06-27 Thread Andres Freund
Hi,

On 2022-06-27 19:00:44 +, Ma, Marcus wrote:
> If I understand correctly, when a Sequential Scan takes place, the ExecScan 
> function (located in executor/execScan.c) does not retrieve all attributes 
> per tuple in the TupleTableSlot and only retrieves the necessary attribute. 
> So for example, let’s imagine we have a table t1 with 3 number fields, c1, 
> c2, and c3. So in the command:
> 
> Select * from t1 where t1.c1 > 500;
> 
> The returned TupleTableSlot will have its field of tts_values in the form (X, 
> 0, 0), where X is the real value of t1.c1 but the fields of c2 and c3 are not 
> actually retrieved because they aren’t used. Similarly, for the command:
> 
> Select * from t1;
> 
> The TupleTableSlot will always return the values of (0, 0, 0) because no
> comparisons are necessary. I am working on code where I’ll need access to
> attributes that aren’t listed in any qualification – what code should I
> change in execScan, or nodeSeqScan to be able to retrieve any attribute of a
> tuple? Basically, being able to make execScan return (X, Y, Z) instead of
> (0, 0, 0) even if the command doesn’t use any attribute comparisons.

You'll need to tell the planner that those columns are needed. It's not just
seqscans that otherwise will discard / not compute values.

Where exactly do you need those columns and why?

Greetings,

Andres Freund




Re: [PoC] Let libpq reject unexpected authentication requests

2022-06-27 Thread Jacob Champion
On Fri, Jun 24, 2022 at 12:17 PM Jacob Champion  wrote:
> Both NOT (via ! negation) and "none" are implemented in v4.

v5 adds a second patch which implements a client-certificate analogue
to gssencmode; I've named it sslcertmode. This takes the place of the
require_auth=[!]cert setting implemented previously.

As I mentioned upthread, I think sslcertmode=require is the weakest
feature here, since the server always sends a certificate request if
you are using TLS. It would potentially be more useful if we start
expanding TLS setups and middlebox options, but I still only see it as
a troubleshooting feature for administrators. By contrast,
sslcertmode=disable lets you turn off the use of the certificate, no
matter what libpq is able to find in your environment or home
directory. That seems more immediately useful.

With this addition, I'm wondering if GSS encrypted transport should be
removed from the definition/scope of require_auth=gss. We already have
gssencmode to control that, and it would remove an ugly special case
from the patch.

I'll add this patchset to the commitfest.
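
For anyone who wants to try it out, here is a rough sketch of the intended
client-side usage; the require_auth and sslcertmode option names come from the
patch below, and the host/database values are made up:

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /*
     * Refuse to proceed unless the server authenticates us with SCRAM,
     * and never send a client certificate.
     */
    PGconn *conn = PQconnectdb("host=db.example.com dbname=postgres "
                               "require_auth=scram-sha-256 sslcertmode=disable");

    if (PQstatus(conn) != CONNECTION_OK)
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));

    PQfinish(conn);
    return 0;
}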

--Jacob
From 30739bd7e4c96f06101e3f235363a97c66adfced Mon Sep 17 00:00:00 2001
From: Jacob Champion 
Date: Mon, 28 Feb 2022 09:40:43 -0800
Subject: [PATCH v5 1/2] libpq: let client reject unexpected auth methods

The require_auth connection option allows the client to choose a list of
acceptable authentication types for use with the server. There is no
negotiation: if the server does not present one of the allowed
authentication requests, the connection fails. Additionally, all methods
in the list may be negated, e.g. '!password', in which case the server
must NOT use the listed authentication type. The special method "none"
allows/disallows the use of unauthenticated connections (but it does not
govern transport-level authentication via TLS or GSSAPI).

Internally, the patch expands the role of check_expected_areq() to
ensure that the incoming request is compatible with conn->require_auth.
It also introduces a new flag, conn->client_finished_auth, which is set
by various authentication routines when the client side of the handshake
is finished. This signals to check_expected_areq() that an OK message
from the server is expected, and allows the client to complain if the
server forgoes authentication entirely.

(Since the client can't generally prove that the server is actually
doing the work of authentication, this last part is mostly useful for
SCRAM without channel binding. It could also provide a client with a
decent signal that, at the very least, it's not connecting to a database
with trust auth, and so the connection can be tied to the client in a
later audit.)

Deficiencies:
- This feature currently doesn't prevent unexpected certificate exchange
  or unexpected GSSAPI encryption.
- This is unlikely to be very forwards-compatible at the moment,
  especially with SASL/SCRAM.
- SSPI support is "implemented" but untested.
- require_auth=none,scram-sha-256 currently allows the server to leave a
  SCRAM exchange unfinished. This is not net-new behavior but may be
  surprising.
---
 doc/src/sgml/libpq.sgml   | 105 ++
 src/include/libpq/pqcomm.h|   1 +
 src/interfaces/libpq/fe-auth-scram.c  |   1 +
 src/interfaces/libpq/fe-auth.c| 138 ++
 src/interfaces/libpq/fe-connect.c | 164 ++
 src/interfaces/libpq/fe-secure-openssl.c  |   1 -
 src/interfaces/libpq/libpq-int.h  |   7 +
 src/test/authentication/t/001_password.pl | 149 
 src/test/kerberos/t/001_auth.pl   |  26 
 src/test/ldap/t/001_auth.pl   |   6 +
 src/test/ssl/t/002_scram.pl   |  25 
 11 files changed, 622 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index 37ec3cb4e5..cc831158b7 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -1221,6 +1221,101 @@ postgresql://%2Fvar%2Flib%2Fpostgresql/dbname
   
  
 
+ 
+  require_auth
+  
+  
+Specifies the authentication method that the client requires from the
+server. If the server does not use the required method to authenticate
+the client, or if the authentication handshake is not fully completed by
+the server, the connection will fail. A comma-separated list of methods
+may also be provided, of which the server must use exactly one in order
+for the connection to succeed. By default, any authentication method is
+accepted, and the server is free to skip authentication altogether.
+  
+  
+Methods may be negated with the addition of a !
+prefix, in which case the server must not attempt
+the listed method; any other method is accepted, and the server is free
+not to authenticate the client at all. If a comma-separated list is
+provided, the server may not attempt any of 

Retrieving unused tuple attributes in ExecScan

2022-06-27 Thread Ma, Marcus
Hey,

If I understand correctly, when a Sequential Scan takes place, the ExecScan 
function (located in executor/execScan.c) does not retrieve all attributes per 
tuple in the TupleTableSlot and only retrieves the necessary attribute. So for 
example, let’s imagine we have a table t1 with 3 number fields, c1, c2, and c3. 
So in the command:

Select * from t1 where t1.c1 > 500;

The returned TupleTableSlot will have its field of tts_values in the form (X, 
0, 0), where X is the real value of t1.c1 but the fields of c2 and c3 are not 
actually retrieved because they aren’t used. Similarly, for the command:

Select * from t1;

The TupleTableSlot will always return the values of (0, 0, 0) because no 
comparisons are necessary. I am working on code where I’ll need access to 
attributes that aren’t listed in any qualification – what code should I change 
in execScan, or nodeSeqScan to be able to retrieve any attribute of a tuple? 
Basically, being able to make execScan return (X, Y, Z) instead of (0, 0, 0) 
even if the command doesn’t use any attribute comparisons.

Marcus


Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread Dean Rasheed
On Mon, 27 Jun 2022 at 19:43, David G. Johnston
 wrote:
>
> On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed  
> wrote:
>>
>> On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud  wrote:
>> >
>> > It doesn't play that well if you have something called subquery though:
>> >
>> > [example that changes a user-provided alias]
>> >
>> > While the output is a valid query, it's not nice that it's replacing a
>> > user provided alias with another one (or force an alias if you have a
>> > relation called subquery).
>>
>> It's already the case that user-provided aliases can get replaced by
>> new ones in the query-deparsing code, e.g.:
>>
>
> Regardless, is there any reason to not just prefix our made-up aliases with 
> "pg_" to make it perfectly clear they were generated by the system and are 
> basically implementation details as opposed to something that appeared in the 
> originally written query?
>
> I suppose, "because we've haven't until now, so why start" suffices...but 
> still doing a rename/suffixing because of query rewriting and inventing one 
> where we made it optional seem different enough to justify implementing 
> something different.
>

I think "pg_" would be a bad idea, since it's too easily confused with
things like system catalogs. The obvious precedent we have for a
made-up alias is "unnamed_join", so perhaps "unnamed_subquery" would
be better.

Regards,
Dean




Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread David G. Johnston
On Mon, Jun 27, 2022 at 11:25 AM Dean Rasheed 
wrote:

> On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud  wrote:
> >
> > It doesn't play that well if you have something called subquery though:
> >
> > [example that changes a user-provided alias]
> >
> > While the output is a valid query, it's not nice that it's replacing a
> > user provided alias with another one (or force an alias if you have a
> > relation called subquery).
>
> It's already the case that user-provided aliases can get replaced by
> new ones in the query-deparsing code, e.g.:
>
>
Regardless, is there any reason to not just prefix our made-up aliases with
"pg_" to make it perfectly clear they were generated by the system and are
basically implementation details as opposed to something that appeared in
the originally written query?

I suppose, "because we've haven't until now, so why start" suffices...but
still doing a rename/suffixing because of query rewriting and inventing one
where we made it optional seem different enough to justify implementing
something different.

David J.


Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread Dean Rasheed
On Mon, 27 Jun 2022 at 16:12, Julien Rouhaud  wrote:
>
> It doesn't play that well if you have something called subquery though:
>
> [example that changes a user-provided alias]
>
> While the output is a valid query, it's not nice that it's replacing a
> user provided alias with another one (or force an alias if you have a
> relation called subquery).

It's already the case that user-provided aliases can get replaced by
new ones in the query-deparsing code, e.g.:

CREATE OR REPLACE VIEW test_view AS
  SELECT x.a, y.b
FROM foo AS x,
 (SELECT b FROM foo AS x) AS y;

\sv test_view

CREATE OR REPLACE VIEW public.test_view AS
 SELECT x.a,
y.b
   FROM foo x,
( SELECT x_1.b
   FROM foo x_1) y

and similarly it may invent technically unnecessary aliases where
there were none before. The query-deparsing code has never been
alias-preserving, unless you take care to give everything a globally
unique alias.

Regards,
Dean




Re: SYSTEM_USER reserved word implementation

2022-06-27 Thread Alvaro Herrera
On 2022-Jun-25, Drouvot, Bertrand wrote:

> diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
> index 0af130fbc5..8d761512fd 100644
> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -364,6 +364,10 @@ extern void InitializeSessionUserIdStandalone(void);
>  extern void SetSessionAuthorization(Oid userid, bool is_superuser);
>  extern Oid   GetCurrentRoleId(void);
>  extern void SetCurrentRoleId(Oid roleid, bool is_superuser);
> +/* kluge to avoid including libpq/libpq-be.h here */
> +typedef struct Port MyPort;
> +extern void InitializeSystemUser(const MyPort *port);
> +extern const char* GetSystemUser(void);

This typedef here looks quite suspicious.  I think this should suffice:

+/* kluge to avoid including libpq/libpq-be.h here */
+struct Port;
+extern void InitializeSystemUser(struct Port *port);

I suspect that having a typedef called MyPort is going to wreak serious
havoc for pgindent.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread Isaac Morland
On Mon, 27 Jun 2022 at 11:12, Julien Rouhaud  wrote:

> More generally, I'm -0.5 on the feature.
> I prefer to force using SQL-compliant queries, and also not take bad
> habits.
>

As to forcing SQL-compliant queries, that ship sailed a long time ago:
Postgres allows but does not enforce the use of SQL-compliant queries, and
many of its important features are extensions anyway, so forcing SQL
compliant queries is out of the question (although I could see the utility
of a mode where it warns or errors on non-compliant queries, at least in
principle).

As to bad habits, I'm having trouble understanding. Why do you think
leaving the alias off a subquery is a bad habit (assuming it were allowed)?
If the name is never used, why are we required to supply it?


Re: Lazy JIT IR code generation to increase JIT speed with partitions

2022-06-27 Thread Alvaro Herrera
On 2021-Jan-18, Luc Vlaming wrote:

> I would like this topic to somehow progress and was wondering what other
> benchmarks / tests would be needed to have some progress? I've so far
> provided benchmarks for small(ish) queries and some tpch numbers, assuming
> those would be enough.

Hi, some time ago I reported a case[1] where our JIT implementation does
a very poor job and perhaps the changes that you're making could explain
what is going on, and maybe even fix it:

[1] https://postgr.es/m/20241706.wqq7xoyigwa2@alvherre.pgsql

The query for which I investigated the problem involved some pg_logical
metadata tables, so I didn't post it anywhere public; but the blog post
I found later contains a link to a query that shows the same symptoms,
and which is luckily still available online:
https://gist.github.com/saicitus/251ba20b211e9e73285af35e61b19580
I attach it here in case it goes missing sometime.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/


go_api_metadata_pg_query.sql
Description: application/sql


Re: Emit postgres log messages that have security or PII with special flags/error code/elevel

2022-06-27 Thread Julien Rouhaud
Hi,

On Mon, Jun 27, 2022 at 06:41:21PM +0530, Bharath Rupireddy wrote:
>
> Here's an idea - what if postgres can emit log messages that have sensitive
> information with special error codes or flags? The emit_log_hook
> implementers will then just need to look for those special error codes or
> flags to treat them differently.

This has been discussed multiple times in the past, and always rejected.  The
main reason for that is that it's impossible to accurately determine whether a
message contains sensitive information or not, and if it were there wouldn't be
a single definition that would fit everyone.

As a simple example, how would you handle the log emitted by this query?

ALTER ROLE myuser WITH PASSWORD 'my super secret password';




[Commitfest 2022-07] Begins This Friday

2022-06-27 Thread Jacob Champion
Hi all,

Just a reminder that the July 2022 commitfest will begin this coming
Friday, July 1. I'll send out reminders this week to get your patches
registered/rebased, and I'll be updating stale statuses in the CF app.

Thanks,
--Jacob




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread Andres Freund
Hi,

On 2022-06-27 18:12:13 +0700, John Naylor wrote:
> Another thought: for non-x86 platforms, the SIMD nodes degenerate to
> "simple loop", and looping over up to 32 elements is not great
> (although possibly okay). We could do binary search, but that has bad
> branch prediction.

I'd be quite quite surprised if binary search were cheaper. Particularly on
less fancy platforms.

- Andres




Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread Hannu Krosing
> Another thought: for non-x86 platforms, the SIMD nodes degenerate to
> "simple loop", and looping over up to 32 elements is not great
> (although possibly okay). We could do binary search, but that has bad
> branch prediction.

I am not sure that for relevant non-x86 platforms SIMD / vector
instructions would not be used (though it would be a good idea to
verify). Do you know any modern platforms that do not have SIMD?

I would definitely test before assuming binary search is better.

Often other approaches, like a counting search over such small vectors, are
much better when the vector fits in cache (or even a cache line) and you
always visit all items, as this completely avoids branch misprediction and
allows the compiler to vectorize and/or unroll the loop as needed.
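
A minimal sketch of such a counting search (assuming a small sorted array of
key bytes, as in the node types discussed in this thread; illustration only,
not code from the patch):

#include <stdint.h>

/* return the index of the first element >= key, always visiting all items */
static inline int
node_search_count(const uint8_t *chunks, int count, uint8_t key)
{
    int     idx = 0;

    for (int i = 0; i < count; i++)
        idx += (chunks[i] < key);   /* no data-dependent branch */

    return idx;
}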

Cheers
Hannu




Re: JSON/SQL: jsonpath: incomprehensible error message

2022-06-27 Thread Andrew Dunstan

On 2022-06-26 Su 11:44, Erik Rijkers wrote:
> JSON/SQL jsonpath
>
> For example, a jsonpath string with deliberate typo 'like_regexp'
> (instead of 'like_regex'):
>
> select js
> from (values (jsonb '{}')) as f(js)
> where js @? '$ ? (@ like_regexp "^xxx")';
>
> ERROR:  syntax error, unexpected IDENT_P at or near " " of jsonpath input
> LINE 1: ...s from (values (jsonb '{}')) as f(js) where js @? '$ ? (@
> li...
>  ^
>
> Both  'IDENT_P'  and  'at or near " "'  seem pretty useless.
>
> Perhaps some improvement can be thought of?
>
> Similar messages in release 14 seem to use 'invalid token', which is
> better:
>
> select js
> from (values (jsonb '{"a":"b"}')) as f(js)
> where js @? '$ ? (@.a .= "b")';
> ERROR:  syntax error, unexpected invalid token at or near "=" of
> jsonpath input
>
>

Yeah :-(

This apparently goes back to the original jsonpath commit 72b6460336e.
There are similar error messages in the back branch regression tests:

andrew@ub20:pgl $ grep -r IDENT_P pg_*/src/test/regress/expected/
pg_12/src/test/regress/expected/jsonpath.out:ERROR:  syntax error, unexpected 
IDENT_P at end of jsonpath input
pg_13/src/test/regress/expected/jsonpath.out:ERROR:  syntax error, unexpected 
IDENT_P at end of jsonpath input
pg_14/src/test/regress/expected/jsonpath.out:ERROR:  syntax error, unexpected 
IDENT_P at end of jsonpath input

For some reason the parser contains a '%error-verbose' directive, unlike
all our other bison parsers. Removing that fixes it, as in this patch.
I'm a bit inclined to say we should backpatch the removal of the
directive, but I guess a lack of complaints suggests it's not a huge issue.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/backend/utils/adt/jsonpath_gram.y b/src/backend/utils/adt/jsonpath_gram.y
index 06d4c8c229..57f6beb27b 100644
--- a/src/backend/utils/adt/jsonpath_gram.y
+++ b/src/backend/utils/adt/jsonpath_gram.y
@@ -74,7 +74,6 @@ static JsonPathParseItem *makeItemLikeRegex(JsonPathParseItem *expr,
 %pure-parser
 %expect 0
 %name-prefix="jsonpath_yy"
-%error-verbose
 %parse-param {JsonPathParseResult **result}
 
 %union
diff --git a/src/test/regress/expected/jsonb_sqljson.out b/src/test/regress/expected/jsonb_sqljson.out
index ec7dc50593..e2f7df50a8 100644
--- a/src/test/regress/expected/jsonb_sqljson.out
+++ b/src/test/regress/expected/jsonb_sqljson.out
@@ -2083,7 +2083,7 @@ SELECT JSON_QUERY(jsonb '{"a": 123}', '$' || '.' || 'a' WITH WRAPPER);
 
 -- Should fail (invalid path)
 SELECT JSON_QUERY(jsonb '{"a": 123}', 'error' || ' ' || 'error');
-ERROR:  syntax error, unexpected IDENT_P at or near " " of jsonpath input
+ERROR:  syntax error at or near " " of jsonpath input
 -- Should fail (not supported)
 SELECT * FROM JSON_TABLE(jsonb '{"a": 123}', '$' || '.' || 'a' COLUMNS (foo int));
 ERROR:  only string constants supported in JSON_TABLE path specification


Re: Making the subquery alias optional in the FROM clause

2022-06-27 Thread Julien Rouhaud
On Mon, Jun 27, 2022 at 9:49 PM Dean Rasheed  wrote:
>
> This was discussed previously in [1], and there seemed to be general
> consensus in favour of it, but no new patch emerged.
>
> Attached is a patch that takes the approach of not generating an alias
> at all, which seems to be neater and simpler, and less code than
> trying to generate a unique alias.
>
> It still generates an eref for the subquery RTE, which has a made-up
> relation name, but that is marked as not visible on the
> ParseNamespaceItem, so it doesn't conflict with anything else, need
> not be unique, and cannot be used for qualified references to the
> subquery's columns.
>
> The only place that exposes the eref's made-up relation name is the
> existing query deparsing code in ruleutils.c, which uniquifies it and
> generates SQL spec-compliant output. For example:
>
> CREATE OR REPLACE VIEW test_view AS
>   SELECT *
> FROM (SELECT a, b FROM foo),
>  (SELECT c, d FROM bar)
>WHERE a = c;
>
> \sv test_view
>
> CREATE OR REPLACE VIEW public.test_view AS
>  SELECT subquery.a,
> subquery.b,
> subquery_1.c,
> subquery_1.d
>FROM ( SELECT foo.a,
> foo.b
>FROM foo) subquery,
> ( SELECT bar.c,
> bar.d
>FROM bar) subquery_1
>   WHERE subquery.a = subquery_1.c

It doesn't play that well if you have something called subquery though:

CREATE OR REPLACE VIEW test_view AS
  SELECT *
FROM (SELECT a, b FROM foo),
 (SELECT c, d FROM bar), (select relname from pg_class limit
1)  as subquery
   WHERE a = c;

\sv test_view
CREATE OR REPLACE VIEW public.test_view AS
 SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d,
subquery_2.relname
   FROM ( SELECT foo.a,
foo.b
   FROM foo) subquery,
( SELECT bar.c,
bar.d
   FROM bar) subquery_1,
( SELECT pg_class.relname
   FROM pg_class
 LIMIT 1) subquery_2
  WHERE subquery.a = subquery_1.c

While the output is a valid query, it's not nice that it's replacing a
user provided alias with another one (or force an alias if you have a
relation called subquery).  More generally, I'm -0.5 on the feature.
I prefer to force using SQL-compliant queries, and also not take bad
habits.




Re: Lazy JIT IR code generation to increase JIT speed with partitions

2022-06-27 Thread David Geier
Hi hackers,

I picked this up and had a closer look at how the total JIT time
depends on the number of modules to be jitted.

Indeed, the total JIT time increases the more modules are used. The reason
for this to happen is that the inlining pass loads and deserializes all to
be inlined modules (.bc files) from disk prior to inlining them via
llvm::IRMover. There's already a cache for such modules in the code, but it
is currently unused. This is because llvm::IRMover takes the module to be
inlined as a std::unique_ptr<llvm::Module>. The by-value argument requires
the source module to be moved, which means it cannot be reused afterwards.
The code is accounting for that by erasing the module from the cache after
inlining it, which in turns requires reloading the module next time a
reference to it is encountered.

Instead of each time loading and deserializing all to be inlined modules
from disk, they can reside in the cache and instead be cloned via
llvm::CloneModule() before they get inlined. Key to calling
llvm::CloneModule() is fully deserializing the module upfront, instead of
loading the module lazily. That is why I changed the call from
LLVMGetBitcodeModuleInContext2() (which lazily loads the module via
llvm::getOwningLazyBitcodeModule()) to LLVMParseBitCodeInContext2() (which
fully loads the module via llvm::parseBitcodeFile()). Beyond that it seems
like that prior to LLVM 13, cloning modules could fail with an assertion
(not sure though if that would cause problems in a release build without
assertions). Andres reported this problem back in the days here [1]. In the
meanwhile the issue got discussed in [2] and finally fixed for LLVM 13, see
[3].
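
A rough sketch of that caching idea (the cache lookup/store helpers are assumed
placeholders; the LLVM C API names are the ones mentioned above):

#include "postgres.h"
#include <llvm-c/Core.h>
#include <llvm-c/BitReader.h>

/* assumed helpers: a simple path -> LLVMModuleRef cache */
extern LLVMModuleRef lookup_module_cache(const char *path);
extern void store_module_cache(const char *path, LLVMModuleRef mod);

static LLVMModuleRef
get_inline_module(LLVMContextRef ctx, LLVMMemoryBufferRef buf, const char *path)
{
    LLVMModuleRef mod = lookup_module_cache(path);

    if (mod == NULL)
    {
        /* fully parse the bitcode so that the module can be cloned later */
        if (LLVMParseBitcodeInContext2(ctx, buf, &mod))
            elog(ERROR, "could not parse bitcode file \"%s\"", path);
        store_module_cache(path, mod);
    }

    /* hand a clone to the inliner; the cached module stays reusable */
    return LLVMCloneModule(mod);
}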

Lazily jitting, typically, results in the generation of half as many
modules as there are functions. The IR for tuple deforming and expression
evaluation is always created together in ExecCompileExpr(), before the
jitted program gets used for the first time. However, if e.g. no rows gets
processed or a plan node is never executed, with the new version the
corresponding JIT program will never be created and the function / module
count can be lower.

In the following you can see some performance numbers and the reproduction
steps. Everything is based on REL_14_2. The query I built is based on
Andres initial suggestion and results in 53 functions. With the patch,
instead of a single module, 27 modules are created. Arbitrarily more
modules and functions can be generated by adding more UNIONs to the query
below. I also left out the first invocation in every test, because during
the first invocation the .bc files to be inlined still need to be loaded,
which is (1) slower now because the bitcode file is parsed completely and
(2) needs to be done only once as it now gets cached.

SET jit_above_cost = 0;
SET max_parallel_workers_per_gather = 0;
SET jit_inline_above_cost = 0;
SET jit_optimize_above_cost = 0;
CREATE TABLE foo (blub INT, zap INT, blarg INT);
INSERT INTO foo SELECT generate_series(1, 100), 1, 2;
ANALYZE foo;
EXPLAIN (ANALYZE, VERBOSE)
(SELECT blarg, count(*), sum(zap) FROM foo WHERE blub = 2 GROUP BY blarg)
UNION
(SELECT blarg, count(*), sum(zap) FROM foo WHERE blub = 2 GROUP BY blarg)
UNION
(SELECT blarg, count(*), sum(zap) FROM foo WHERE blub = 2 GROUP BY blarg)
UNION
(SELECT blarg, count(*), sum(zap) FROM foo WHERE blub = 2 GROUP BY blarg)
UNION
(SELECT blarg, count(*), sum(zap) FROM foo WHERE blub = 2 GROUP BY blarg)
UNION
(SELECT blarg, count(*), sum(zap) FROM foo WHERE blub = 2 GROUP BY blarg);


Baseline (1 module)
---
Timing: Generation 0.432 ms, Inlining 4.705 ms, Optimization  29.662 ms,
Emission  21.300 ms, Total  56.098 ms (1 module,  9 functions)
Timing: Generation 2.042 ms, Inlining 4.014 ms, Optimization 164.745 ms,
Emission 120.334 ms, Total 291.135 ms (1 module, 59 functions)

CloneModule()
-
Timing: Generation 0.454 ms, Inlining 0.077 ms, Optimization 14.441 ms,
Emission 12.234 ms, Total  27.206 ms (1 module,  9 functions)
Timing: Generation 2.061 ms, Inlining 0.193 ms, Optimization 95.627 ms,
Emission 77.652 ms, Total 175.533 ms (1 module, 59 functions)

CloneModule(), lazy jitting (> 1 modules)
-
Timing: Generation 0.457 ms, Inlining 0.154 ms, Optimization  15.972 ms,
Emission 13.152 ms, Total  29.735 ms ( 4 modules,  8 functions)
Timing: Generation 2.715 ms, Inlining 0.974 ms, Optimization 105.122 ms,
Emission 86.776 ms, Total 195.587 ms (27 modules, 53 functions)

CloneModule(), lazy jitting, optimization settings
--
Timing: Generation 0.758 ms, Inlining 0.153 ms, Optimization 13.137 ms,
Emission 13.805 ms, Total  27.854 ms ( 4 modules,  8 functions)
Timing: Generation 2.712 ms, Inlining 1.133 ms, Optimization 92.661 ms,
Emission 84.538 ms, Total 181.043 ms (27 modules, 53 functions)


Cloning the module instead of reloading it brings down inlining time, even
for a single module. Obviously, it pays off the more modules are 

Making the subquery alias optional in the FROM clause

2022-06-27 Thread Dean Rasheed
This was discussed previously in [1], and there seemed to be general
consensus in favour of it, but no new patch emerged.

Attached is a patch that takes the approach of not generating an alias
at all, which seems to be neater and simpler, and less code than
trying to generate a unique alias.

It still generates an eref for the subquery RTE, which has a made-up
relation name, but that is marked as not visible on the
ParseNamespaceItem, so it doesn't conflict with anything else, need
not be unique, and cannot be used for qualified references to the
subquery's columns.

The only place that exposes the eref's made-up relation name is the
existing query deparsing code in ruleutils.c, which uniquifies it and
generates SQL spec-compliant output. For example:

CREATE OR REPLACE VIEW test_view AS
  SELECT *
FROM (SELECT a, b FROM foo),
 (SELECT c, d FROM bar)
   WHERE a = c;

\sv test_view

CREATE OR REPLACE VIEW public.test_view AS
 SELECT subquery.a,
subquery.b,
subquery_1.c,
subquery_1.d
   FROM ( SELECT foo.a,
foo.b
   FROM foo) subquery,
( SELECT bar.c,
bar.d
   FROM bar) subquery_1
  WHERE subquery.a = subquery_1.c

Regards,
Dean

[1] 
https://www.postgresql.org/message-id/flat/1487773980.3143.15.camel%40oopsware.de
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
new file mode 100644
index 80bb8ad..8cae456
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -51,7 +51,7 @@ SELECT [ ALL | DISTINCT [ ON ( table_name [ * ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
 [ TABLESAMPLE sampling_method ( argument [, ...] ) [ REPEATABLE ( seed ) ] ]
-[ LATERAL ] ( select ) [ AS ] alias [ ( column_alias [, ...] ) ]
+[ LATERAL ] ( select ) [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
 with_query_name [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
 [ LATERAL ] function_name ( [ argument [, ...] ] )
 [ WITH ORDINALITY ] [ [ AS ] alias [ ( column_alias [, ...] ) ] ]
@@ -490,11 +490,18 @@ TABLE [ ONLY ] VALUES command
 can also be used here.

+
+   
+An alias may be provided to name the output of the
+sub-SELECT, and optionally rename its columns.
+Note that according to the SQL standard, a
+sub-SELECT must have an alias,
+whereas in PostgreSQL it is optional.
+   
   
  
 
@@ -2041,6 +2048,16 @@ SELECT 2+2;

   
 
+  
+   Omitting sub-SELECT aliases in FROM
+
+   
+According to the SQL standard, a sub-SELECT in the
+FROM list must have an alias.  In
+PostgreSQL, this alias may be omitted.
+   
+  
+
   
ONLY and Inheritance
 
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
new file mode 100644
index 969c9c1..3b25f7c
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -13346,33 +13346,6 @@ table_ref:	relation_expr opt_alias_claus
 	n->lateral = false;
 	n->subquery = $1;
 	n->alias = $2;
-	/*
-	 * The SQL spec does not permit a subselect
-	 * () without an alias clause,
-	 * so we don't either.  This avoids the problem
-	 * of needing to invent a unique refname for it.
-	 * That could be surmounted if there's sufficient
-	 * popular demand, but for now let's just implement
-	 * the spec and see if anyone complains.
-	 * However, it does seem like a good idea to emit
-	 * an error message that's better than "syntax error".
-	 */
-	if ($2 == NULL)
-	{
-		if (IsA($1, SelectStmt) &&
-			((SelectStmt *) $1)->valuesLists)
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-	 errmsg("VALUES in FROM must have an alias"),
-	 errhint("For example, FROM (VALUES ...) [AS] foo."),
-	 parser_errposition(@1)));
-		else
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-	 errmsg("subquery in FROM must have an alias"),
-	 errhint("For example, FROM (SELECT ...) [AS] foo."),
-	 parser_errposition(@1)));
-	}
 	$$ = (Node *) n;
 }
 			| LATERAL_P select_with_parens opt_alias_clause
@@ -13382,23 +13355,6 @@ table_ref:	relation_expr opt_alias_claus
 	n->lateral = true;
 	n->subquery = $2;
 	n->alias = $3;
-	/* same comment as above */
-	if ($3 == NULL)
-	{
-		if (IsA($2, SelectStmt) &&
-			((SelectStmt *) $2)->valuesLists)
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-	 errmsg("VALUES in FROM must have an alias"),
-	 errhint("For example, FROM (VALUES ...) [AS] foo."),
-	 parser_errposition(@2)));
-		else
-			ereport(ERROR,
-	(errcode(ERRCODE_SYNTAX_ERROR),
-	 errmsg("subquery in FROM must have an alias"),
-	 errhint("For example, FROM (SELECT ...) [AS] foo."),
-	 parser_errposition(@2)));
-	}
 	$$ = (Node *) n;
 }
 			| joined_table
diff --git 

Emit postgres log messages that have security or PII with special flags/error code/elevel

2022-06-27 Thread Bharath Rupireddy
Hi,

Today, postgres doesn't distinguish the log messages that it emits to
server logs via ereport/elog mechanism, based on security information or
PII (Personally Identifiable Information) or other sensitive information
[1]. In production environments, these log messages would be captured and
stored (perhaps in a different intermediate database specially designed for
text and log analytics) for debug, analytical, reporting or
on-demand-delivery to the customers via portal/tools. In this context, the
customers will expect to treat the sensitive information differently
(perhaps encode/mask before storing) for security and compliance purposes.
Also, it's not safe to show all the log messages as-is for internal
debugging purposes as the sensitive information can be misused
intentionally or unintentionally.

Today, one can implement an emit_log_hook which can look for sensitive log
messages based on the errmsg i.e. "text" and treat them differently. But
the errmsg based approach has its own disadvantages - errmsg can get
tweaked, there can be too many sensitive type log messages, not everyone
can rightly distinguish what a sensitive log message is and what is not,
the hook implementation and maintainability is a huge problem in the long
run.

Here's an idea - what if postgres can emit log messages that have sensitive
information with special error codes or flags? The emit_log_hook
implementers will then just need to look for those special error codes or
flags to treat them differently.
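
To make the idea a bit more concrete, here is a minimal sketch of an
emit_log_hook consuming such a marker; ERRCODE_SENSITIVE_DATA is an assumed,
not-yet-existing error code used purely for illustration:

#include "postgres.h"
#include "fmgr.h"
#include "utils/elog.h"

PG_MODULE_MAGIC;

/* assumed marker, defined here only so the sketch compiles */
#define ERRCODE_SENSITIVE_DATA  MAKE_SQLSTATE('X','X','S','0','1')

static emit_log_hook_type prev_emit_log_hook = NULL;

static void
redacting_emit_log_hook(ErrorData *edata)
{
    /* e.g. mask the message text before it reaches the server log */
    if (edata->sqlerrcode == ERRCODE_SENSITIVE_DATA)
        edata->message = pstrdup("<redacted: sensitive log message>");

    if (prev_emit_log_hook)
        prev_emit_log_hook(edata);
}

void
_PG_init(void)
{
    prev_emit_log_hook = emit_log_hook;
    emit_log_hook = redacting_emit_log_hook;
}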

Thoughts?

[1]
errmsg("role \"%s\" cannot be dropped because some objects depend on it"
errmsg("role \"%s\" already exists"
errmsg("must have admin option on role \"%s\""
errmsg("role \"%s\" is a member of role \"%s\""
errmsg("must have admin option on role \"%s\""
errmsg("pg_hba.conf rejects replication connection for host \"%s\", user
\"%s\", %s"
errmsg("duplicate key value violates unique constraint \"%s\""
log_connections and log_disconnections messages
.
.

Regards,
Bharath Rupireddy.


Re: Add non-blocking version of PQcancel

2022-06-27 Thread Alvaro Herrera
On 2022-Jun-27, Justin Pryzby wrote:

> On Fri, Jun 24, 2022 at 07:36:16PM -0500, Justin Pryzby wrote:

> > Apparently there's still an occasional issue.
> > https://cirrus-ci.com/task/6613309985128448
> 
> I think that failure is actually not related to this patch.

Yeah, it's not -- Kyotaro diagnosed it as a problem in libpq's pipeline
mode.  I hope to push his fix soon, but there are nearby problems that I
haven't been able to track down a good fix for.  I'm looking into the
whole.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/




Re: Add non-blocking version of PQcancel

2022-06-27 Thread Justin Pryzby
On Fri, Jun 24, 2022 at 07:36:16PM -0500, Justin Pryzby wrote:
> Resending with a problematic email removed from CC...
> 
> On Mon, Apr 04, 2022 at 03:21:54PM +, Jelte Fennema wrote:
> > 2. Added some extra sleeps to the cancellation test, to remove random 
> > failures on FreeBSD.
> 
> Apparently there's still an occasional issue.
> https://cirrus-ci.com/task/6613309985128448

I think that failure is actually not related to this patch.

There are probably others, but I noticed because it also affected one of my
patches, which changes nothing relevant.
https://cirrus-ci.com/task/5904044051922944





Re: Logging query parameters in auto_explain

2022-06-27 Thread Dagfinn Ilmari Mannsåker
Dagfinn Ilmari Mannsåker  writes:

> Michael Paquier  writes:
>
>> On Thu, Jun 09, 2022 at 11:55:11PM +0100, Dagfinn Ilmari Mannsåker wrote:
>>> Done (and more tests added), v3 attached.
>>
>> One thing I am wondering is if we'd better mention errdetail_params()
>> at the top of the initial check done in ExplainQueryParameters().
>> That's a minor issue, though.
>>
>> +sub query_log
>> +{
>> Perhaps a short description at the top of this routine to explain it
>> is worth it?
>
> Done. I also moved the function to the bottom of the file, to avoid
> distracting from the actual test queries.

I forgot to mention, I also changed the order of the query and
parameters, so that they can more naturally be left out when no changes
are needed.

- ilmari




Re: Logging query parameters in auto_explain

2022-06-27 Thread Dagfinn Ilmari Mannsåker
Michael Paquier  writes:

> On Thu, Jun 09, 2022 at 11:55:11PM +0100, Dagfinn Ilmari Mannsåker wrote:
>> Done (and more tests added), v3 attached.
>
> One thing I am wondering is if we'd better mention errdetail_params()
> at the top of the initial check done in ExplainQueryParameters().
> That's a minor issue, though.
>
> +sub query_log
> +{
> Perhaps a short description at the top of this routine to explain it
> is worth it?

Done. I also moved the function to the bottom of the file, to avoid
distracting from the actual test queries.

> The test still does a set of like() and unlike() after running each
> query when the parameter updates are done.  One thing I would have
> played with is to group the set of logs expected or not expected as
> parameters of the centralized routine, but that would reduce the
> customization of the test names, so at the end the approach you have
> taken for query_log() looks good to me.

I did consider passing the tests as a data structure to the function,
but that would amount to specifying exactly the same things but as a
data structure, and then calling the appropriate function by reference,
which just makes things more cluttered.

If we were using TAP subtests, it might make sense to have the
function run each set of related tests in a subtest, rather than having
multiple subtest calls at the top level, but we don't, so it doesn't.

> +$node->stop('fast');
> There is no need for that.  The END block of Cluster.pm does that
> already.

Ah, I was not aware of that. The call was there in the original version,
so I had just left it in. Removed.

v4 patch attached.

- ilmari

>From 1c504beecdf2a0e64a7bbd1d0e07b6b33fdc533b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Dagfinn=20Ilmari=20Manns=C3=A5ker?= 
Date: Tue, 31 May 2022 21:12:12 +0100
Subject: [PATCH v4] Log query parameters in auto_explain

Add an auto_explain.log_parameter_max_length config setting, similar
to the corresponding core setting, that controls the inclusion of
query parameters in the logged explain output.

Also adjust the tests to only look at the relevant parts of the log
for each query, and to reset the GUCs after each test.
---
 contrib/auto_explain/auto_explain.c|  15 +++
 contrib/auto_explain/t/001_auto_explain.pl | 134 ++---
 doc/src/sgml/auto-explain.sgml |  19 +++
 src/backend/commands/explain.c |  22 
 src/include/commands/explain.h |   1 +
 5 files changed, 177 insertions(+), 14 deletions(-)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index c9a0d947c8..1ba7536879 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -19,12 +19,14 @@
 #include "common/pg_prng.h"
 #include "executor/instrument.h"
 #include "jit/jit.h"
+#include "nodes/params.h"
 #include "utils/guc.h"
 
 PG_MODULE_MAGIC;
 
 /* GUC variables */
 static int	auto_explain_log_min_duration = -1; /* msec or -1 */
+static int	auto_explain_log_parameter_max_length = -1; /* bytes or -1 */
 static bool auto_explain_log_analyze = false;
 static bool auto_explain_log_verbose = false;
 static bool auto_explain_log_buffers = false;
@@ -105,6 +107,18 @@ _PG_init(void)
 			NULL,
 			NULL);
 
+	DefineCustomIntVariable("auto_explain.log_parameter_max_length",
+			"Sets the maximum length of query parameters to log.",
+			"Zero logs no query parameters, -1 logs them in full.",
+			_explain_log_parameter_max_length,
+			-1,
+			-1, INT_MAX,
+			PGC_SUSET,
+			GUC_UNIT_BYTE,
+			NULL,
+			NULL,
+			NULL);
+
 	DefineCustomBoolVariable("auto_explain.log_analyze",
 			 "Use EXPLAIN ANALYZE for plan logging.",
 			 NULL,
@@ -389,6 +403,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 
 			ExplainBeginOutput(es);
 			ExplainQueryText(es, queryDesc);
+			ExplainQueryParameters(es, queryDesc->params, auto_explain_log_parameter_max_length);
 			ExplainPrintPlan(es, queryDesc);
 			if (es->analyze && auto_explain_log_triggers)
 ExplainPrintTriggers(es, queryDesc);
diff --git a/contrib/auto_explain/t/001_auto_explain.pl b/contrib/auto_explain/t/001_auto_explain.pl
index 82e4d9d15c..2d04147090 100644
--- a/contrib/auto_explain/t/001_auto_explain.pl
+++ b/contrib/auto_explain/t/001_auto_explain.pl
@@ -17,41 +17,147 @@
 $node->start;
 
 # run a couple of queries
-$node->safe_psql("postgres", "SELECT * FROM pg_class;");
-$node->safe_psql("postgres",
-	"SELECT * FROM pg_proc WHERE proname = 'int4pl';");
+my $log_contents = query_log("SELECT * FROM pg_class;");
 
-# emit some json too
-$node->append_conf('postgresql.conf', "auto_explain.log_format = json");
-$node->reload;
-$node->safe_psql("postgres", "SELECT * FROM pg_proc;");
-$node->safe_psql("postgres",
-	"SELECT * FROM pg_class WHERE relname = 'pg_class';");
+like(
+	$log_contents,
+	qr/Query Text: SELECT \* FROM pg_class;/,
+	"query text logged, text mode");
 
-$node->stop('fast');
-
-my $log = 

Re: [PoC] Improve dead tuple storage for lazy vacuum

2022-06-27 Thread John Naylor
On Mon, Jun 20, 2022 at 7:57 AM Masahiko Sawada  wrote:

[v3 patch]

Hi Masahiko,

Since there are new files, and they are pretty large, I've attached
most specific review comments and questions as a diff rather than in
the email body. This is not a full review, which will take more time
-- this is a first pass mostly to aid my understanding, and discuss
some of the design and performance implications.

I tend to think it's a good idea to avoid most cosmetic review until
it's close to commit, but I did mention a couple things that might
enhance readability during review.

As I mentioned to you off-list, I have some thoughts on the nodes using SIMD:

> On Thu, Jun 16, 2022 at 4:30 PM John Naylor
>  wrote:
> >
> > For now, though, I'd like to question
> > why we even need to use 32-byte registers in the first place. For one,
> > the paper referenced has 16-pointer nodes, but none for 32 (next level
> > is 48 and uses a different method to find the index of the next
> > pointer). Andres' prototype has 32-pointer nodes, but in a quick read
> > of his patch a couple weeks ago I don't recall a reason mentioned for
> > it.
>
> I might be wrong but since AVX2 instruction set is introduced in
> Haswell microarchitecture in 2013 and the referenced paper is
> published in the same year, the art didn't use AVX2 instruction set.

Sure, but with a bit of work the same technique could be done on that
node size with two 16-byte registers.

> 32-pointer nodes are better from a memory perspective as you
> mentioned. Andres' prototype supports both 16-pointer nodes and
> 32-pointer nodes (out of 6 node types). This would provide better
> memory usage but on the other hand, it would also bring overhead of
> switching the node type.

Right, using more node types provides smaller increments of node size.
Just changing node type can be better or worse, depending on the
input.

> Anyway, it's an important design decision to
> support which size of node to support. It should be done based on
> experiment results and documented.

Agreed. I would add that in the first step, we want something
straightforward to read and easy to integrate into our codebase. I
suspect other optimizations would be worth a lot more than using AVX2:
- collapsing inner nodes
- taking care when constructing the key (more on this when we
integrate with VACUUM)
...and a couple Andres mentioned:
- memory management: in
https://www.postgresql.org/message-id/flat/20210717194333.mr5io3zup3kxahfm%40alap3.anarazel.de
- node dispatch:
https://www.postgresql.org/message-id/20210728184139.qhvx6nbwdcvo63m6%40alap3.anarazel.de

Therefore, I would suggest that we use SSE2 only, because:
- portability is very easy
- to avoid a performance hit from indirecting through a function pointer
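
For illustration, a minimal sketch of an SSE2-only equality search over a
16-key node (assuming GCC/Clang's __builtin_ctz for the bit scan; illustration
only, not code from the patch):

#include <stdint.h>
#include <emmintrin.h>

/* return the index of 'chunk' among the 16 key bytes, or -1 if absent */
static inline int
node16_search_eq(const uint8_t *keys, uint8_t chunk)
{
    __m128i     spread = _mm_set1_epi8((char) chunk);
    __m128i     haystack = _mm_loadu_si128((const __m128i *) keys);
    uint32_t    mask;

    mask = (uint32_t) _mm_movemask_epi8(_mm_cmpeq_epi8(spread, haystack));

    return mask ? (int) __builtin_ctz(mask) : -1;
}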

When the PG16 cycle opens, I will work separately on ensuring the
portability of using SSE2, so you can focus on other aspects. I think
it would be a good idea to have both node16 and node32 for testing.
During benchmarking we can delete one or the other and play with the
other thresholds a bit.

Ideally, node16 and node32 would have the same code with a different
loop count (1 or 2). More generally, there is too much duplication of
code (noted by Andres in his PoC), and there are many variable names
with the node size embedded. This is a bit tricky to make more
general, so we don't need to try it yet, but ideally we would have
something similar to:

switch (node->kind) // todo: inspect tagged pointer
{
  case RADIX_TREE_NODE_KIND_4:
   idx = node_search_eq(node, chunk, 4);
   do_action(node, idx, 4, ...);
   break;
  case RADIX_TREE_NODE_KIND_32:
   idx = node_search_eq(node, chunk, 32);
   do_action(node, idx, 32, ...);
  ...
}

static pg_alwaysinline int
node_search_eq(radix_tree_node node, uint8 chunk, int16 node_fanout)
{
if (node_fanout <= SIMPLE_LOOP_THRESHOLD)
  // do simple loop with (node_simple *) node;
else if (node_fanout <= VECTORIZED_LOOP_THRESHOLD)
  // do vectorized loop where available with (node_vec *) node;
...
}

...and let the compiler do loop unrolling and branch removal. Not sure
how difficult this is to do, but something to think about.

Another thought: for non-x86 platforms, the SIMD nodes degenerate to
"simple loop", and looping over up to 32 elements is not great
(although possibly okay). We could do binary search, but that has bad
branch prediction.

-- 
John Naylor
EDB: http://www.enterprisedb.com
diff --git a/src/backend/lib/radixtree.c b/src/backend/lib/radixtree.c
index bf87f932fd..2bb04eba86 100644
--- a/src/backend/lib/radixtree.c
+++ b/src/backend/lib/radixtree.c
@@ -16,6 +16,11 @@
  *
  * The key is a 64-bit unsigned integer and the value is a Datum. Both internal
  * nodes and leaf nodes have the identical structure. For internal tree nodes,
+It might be worth mentioning:
+- the paper refers to this technique as "Multi-value leaves"
+- we chose it (I assume) for simplicity and to avoid an additional pointer traversal
+- it is the 

Re: Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row

2022-06-27 Thread Bharath Rupireddy
On Mon, Jun 27, 2022 at 1:40 PM Drouvot, Bertrand  wrote:
>
> Hi,
>
> On 6/27/22 9:31 AM, Hamid Akhtar wrote:
>
>
> Hello Hackers,
>
> While working on one of my blogs on the B-Tree indexes, I needed to look at a 
> range of B-Tree page statistics. So the goto solution was to use pageinspect. 
> However, reviewing stats for multiple pages meant issuing multiple queries.

+1 to improve the API.

> I felt that there's an opportunity for improvement in the extension by 
> extending the API to output the statistics for multiple pages with a single 
> query.
>
> That attached patch is based on the master branch. It makes the following 
> changes to the pageinspect contrib module:
> - Updates bt_page_stats_internal function to accept 3 arguments instead of 2.
> - The function now uses SRF macros to return a set rather than a single row. 
> The function call now requires specifying column names.
>
> The extension version is bumped to 1.11 (PAGEINSPECT_V1_11).
> To maintain backward compatibility, for versions below 1.11, the multi-call 
> mechanism is ended to keep the old behavior consistent.
>
> Regression test cases for the module are updated as well as part of this 
> change. Here is a subset of queries that are added to the btree.sql test case 
> file for pageinspect.
>
> 
> CREATE TABLE test2 AS (SELECT generate_series(1, 5000) AS col1);
> CREATE INDEX test2_col1_idx ON test2(col1);
> SELECT * FROM bt_page_stats('test2_col1_idx', 1, 2);
>
> For example, this could be written as:
>
> select * from
> generate_series(1, 2) blkno ,
> bt_page_stats('test2_col1_idx',blkno::int);
>
> Or, if one wants to inspect to whole relation, something like:
>
> select * from
> generate_series(1, pg_relation_size('test2_col1_idx'::regclass::text) / 8192 
> - 1) blkno ,
> bt_page_stats('test2_col1_idx',blkno::int);

Good one. But not all may know the alternatives. Do we have any
difference in the execution times for the above query vs the new
function introduced in the v1 patch? If there's not much difference, I
would suggest adding an SQL function around the generate_series
approach in the pageinspect extension for better and easier usability.

Regards,
Bharath Rupireddy.




Re: last_archived_wal is not necessary the latest WAL file (was Re: pgsql: Add test case for an archive recovery corner case.)

2022-06-27 Thread Michael Paquier
On Mon, Jun 27, 2022 at 12:04:57AM -0700, Noah Misch wrote:
> For me, it reproduces consistently with a sleep just before the startup
> process exits:

Nice catch.

> One can adapt the test to the server behavior by having the test wait for the
> archiver to start, as attached.  This is sufficient to make check-world pass
> with the above sleep in place.  I think we should also modify the PostgresNode
> archive_command to log a message.  That lack of logging was a obstacle
> upthread (as seen in commit 3279cef) and again here.

  ? qq{copy "%p" "$path%f"}
- : qq{cp "%p" "$path/%f"};
+ : qq{echo >&2 "ARCHIVE_COMMAND %p"; cp "%p" "$path/%f"};

This is a bit inelegant.  Perhaps it would be better through a perl
wrapper like cp_history_files?

> An alternative would be to declare that the test is right and the server is
> wrong.  The postmaster knows how to start the checkpointer if the checkpointer
> is not running when the postmaster needs a shutdown checkpoint.  It could
> start the archiver around that same area:
> 
>   /* Start the checkpointer if not running */
>   if (CheckpointerPID == 0)
>   CheckpointerPID = StartCheckpointer();
>   /* And tell it to shut down */
>   if (CheckpointerPID != 0)
>   {
>   signal_child(CheckpointerPID, SIGUSR2);
>   pmState = PM_SHUTDOWN;
>   }
> 
> Any opinions between the change-test and change-server approaches?

The startup sequence can be sometimes tricky.  Though I don't have a
specific argument coming into mind, I would stick to a fix in the
test.
--
Michael


signature.asc
Description: PGP signature


Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2022-06-27 Thread Jelte Fennema
(resent because it was blocked from the mailing-list due to inclusion of a 
blocked email address in the To line)

From: Andres Freund 
> On 2022-04-04 15:21:54 +, Jelte Fennema wrote:
> > 2. Added some extra sleeps to the cancellation test, to remove random
> > failures on FreeBSD.
> 
> That's extremely extremely rarely the solution to address test reliability
> issues. It'll fail when running test under valgrind etc.
> 
> Why do you need sleeps / can you find another way to make the test reliable?

The problem they are solving is racy behaviour between sending the query
and sending the cancellation. If the cancellation is handled before the query
is started, then the query doesn't get cancelled. To solve this problem I used
the sleeps to wait a bit before sending the cancellation request.

When I wrote this, I couldn't think of a better way to do it than with sleeps.
But I didn't like it either (and I still don't). These emails made me start to
think again about other ways of solving the problem. I think I've found another
solution (see attached patch). The way I solve it now is by using another 
connection to check the state of the first one.

Jelte

0001-Add-documentation-for-libpq_pipeline-tests.patch
Description: 0001-Add-documentation-for-libpq_pipeline-tests.patch


0002-Add-non-blocking-version-of-PQcancel.patch
Description: 0002-Add-non-blocking-version-of-PQcancel.patch


Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row

2022-06-27 Thread Hamid Akhtar
Hello Hackers,

While working on one of my blogs on the B-Tree indexes,
I needed to look at a range of B-Tree page statistics. So the goto solution
was to use pageinspect. However, reviewing stats for multiple pages meant
issuing multiple queries. I felt that there's an opportunity for
improvement in the extension by extending the API to output the statistics
for multiple pages with a single query.

That attached patch is based on the master branch. It makes the following
changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments instead of
2.
- The function now uses SRF macros to return a set rather than a single
row. The function call now requires specifying column names.

The extension version is bumped to 1.11 (PAGEINSPECT_V1_11).
To maintain backward compatibility, for versions below 1.11, the multi-call
mechanism is ended to keep the old behavior consistent.
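
For readers less familiar with the SRF machinery involved, a stripped-down
sketch of the multi-call pattern (illustration only, not the patch code; the
per-row result is simplified to the block number):

#include "postgres.h"
#include "fmgr.h"
#include "funcapi.h"

PG_FUNCTION_INFO_V1(bt_page_stats_sketch);

Datum
bt_page_stats_sketch(PG_FUNCTION_ARGS)
{
    FuncCallContext *fctx;

    if (SRF_IS_FIRSTCALL())
    {
        fctx = SRF_FIRSTCALL_INIT();
        /* third argument: number of blocks the caller asked for */
        fctx->max_calls = (uint64) PG_GETARG_INT64(2);
    }

    fctx = SRF_PERCALL_SETUP();

    if (fctx->call_cntr < fctx->max_calls)
    {
        /* a real implementation would build one stats tuple here */
        SRF_RETURN_NEXT(fctx, Int64GetDatum((int64) fctx->call_cntr + 1));
    }

    SRF_RETURN_DONE(fctx);
}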

Regression test cases for the module are updated as well as part of this
change. Here is a subset of queries that are added to the btree.sql test
case file for pageinspect.


CREATE TABLE test2 AS (SELECT generate_series(1, 5000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 2);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 0);
SELECT * FROM bt_page_stats('test2_col1_idx', 0, 1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, -1);
DROP TABLE test2;


Regards.

--
Hamid Akhtar,
Percona LLC,
URL : www.percona.com
CELL:+923335449950
SKYPE: engineeredvirus
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 5c0736564a..31d7b1c581 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -13,12 +13,12 @@ OBJS = \
 	rawpage.o
 
 EXTENSION = pageinspect
-DATA =  pageinspect--1.9--1.10.sql pageinspect--1.8--1.9.sql \
-	pageinspect--1.7--1.8.sql pageinspect--1.6--1.7.sql \
-	pageinspect--1.5.sql pageinspect--1.5--1.6.sql \
-	pageinspect--1.4--1.5.sql pageinspect--1.3--1.4.sql \
-	pageinspect--1.2--1.3.sql pageinspect--1.1--1.2.sql \
-	pageinspect--1.0--1.1.sql
+DATA =  pageinspect--1.10--1.11.sql pageinspect--1.9--1.10.sql \
+	pageinspect--1.8--1.9.sql pageinspect--1.7--1.8.sql \
+	pageinspect--1.6--1.7.sql pageinspect--1.5.sql \
+	pageinspect--1.5--1.6.sql pageinspect--1.4--1.5.sql \
+	pageinspect--1.3--1.4.sql pageinspect--1.2--1.3.sql \
+	pageinspect--1.1--1.2.sql pageinspect--1.0--1.1.sql
 PGFILEDESC = "pageinspect - functions to inspect contents of database pages"
 
 REGRESS = page btree brin gin gist hash checksum oldextversions
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 62f2c1b315..85127fcfdb 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -45,6 +45,7 @@ PG_FUNCTION_INFO_V1(bt_page_items_1_9);
 PG_FUNCTION_INFO_V1(bt_page_items);
 PG_FUNCTION_INFO_V1(bt_page_items_bytea);
 PG_FUNCTION_INFO_V1(bt_page_stats_1_9);
+PG_FUNCTION_INFO_V1(bt_page_stats_1_11);
 PG_FUNCTION_INFO_V1(bt_page_stats);
 
 #define IS_INDEX(r) ((r)->rd_rel->relkind == RELKIND_INDEX)
@@ -80,6 +81,26 @@ typedef struct BTPageStat
 	BTCycleId	btpo_cycleid;
 } BTPageStat;
 
+/*
+ * cross-call data structure for SRF for page stats
+ */
+typedef struct ua_page_stats
+{
+	uint64		blkno;
+	uint64		blk_count;
+}			ua_page_stats;
+
+/*
+ * cross-call data structure for SRF for page items
+ */
+typedef struct ua_page_items
+{
+	Page		page;
+	OffsetNumber offset;
+	bool		leafpage;
+	bool		rightmost;
+	TupleDesc	tupd;
+}			ua_page_items;
 
 /* -
  * GetBTPageStatistics()
@@ -176,35 +197,17 @@ GetBTPageStatistics(BlockNumber blkno, Buffer buffer, BTPageStat *stat)
 		stat->avg_item_size = 0;
 }
 
+
 /* ---
- * bt_page_stats()
+ * bt_index_block_validate()
  *
- * Usage: SELECT * FROM bt_page_stats('t1_pkey', 1);
+ * Validate index type is btree and block number
+ * is within a valid block number range.
  * ---
  */
-static Datum
-bt_page_stats_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
+static void
+bt_index_block_validate(Relation rel, int64 blkno)
 {
-	text	   *relname = PG_GETARG_TEXT_PP(0);
-	int64		blkno = (ext_version == PAGEINSPECT_V1_8 ? PG_GETARG_UINT32(1) : PG_GETARG_INT64(1));
-	Buffer		buffer;
-	Relation	rel;
-	RangeVar   *relrv;
-	Datum		result;
-	HeapTuple	tuple;
-	TupleDesc	tupleDesc;
-	int			j;
-	char	   *values[11];
-	BTPageStat	stat;
-
-	if (!superuser())
-		ereport(ERROR,
-(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
- errmsg("must be superuser to use pageinspect functions")));
-
-	relrv = makeRangeVarFromNameList(textToQualifiedNameList(relname));
-	rel = relation_openrv(relrv, AccessShareLock);
-
 	if (!IS_INDEX(rel) || !IS_BTREE(rel))
 		ereport(ERROR,
 

Re: ICU for global collation

2022-06-27 Thread Peter Eisentraut

On 27.06.22 08:42, Michael Paquier wrote:

On Mon, Jun 27, 2022 at 08:23:59AM +0200, Peter Eisentraut wrote:

On 26.06.22 05:51, Julien Rouhaud wrote:

On Sat, Jun 25, 2022 at 10:19:30AM -0500, Justin Pryzby wrote:

+   if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)


I think the fix here is to change <= to < ?


Yes.


Ok, committed.

(I see now that in the context of pg_upgrade, writing <= 1400 is 
equivalent, but I find that confusing, so I did < 1500.)
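
Concretely, the check in pg_upgrade's get_db_infos() now reads roughly as
follows (a sketch based on the patch quoted elsewhere in this thread; pre-15
clusters have no datcollprovider or daticucoll columns, so constants are
substituted for them):

    if (GET_MAJOR_VERSION(old_cluster.major_version) < 1500)
        snprintf(query + strlen(query), sizeof(query) - strlen(query),
                 "'c' AS datcollprovider, NULL AS daticucoll, ");
    else
        snprintf(query + strlen(query), sizeof(query) - strlen(query),
                 "datcollprovider, daticucoll, ");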





Re: last_archived_wal is not necessary the latest WAL file (was Re: pgsql: Add test case for an archive recovery corner case.)

2022-06-27 Thread Noah Misch
On Wed, Feb 16, 2022 at 01:42:27AM +0200, Heikki Linnakangas wrote:
> On 15/02/2022 23:28, Tom Lane wrote:
> >Heikki Linnakangas  writes:
> >>That was interesting: the order that WAL segments are archived when a
> >>standby is promoted is not fully deterministic.
> >
> >Oh, of course.
> >
> >>I find it a bit surprising that pg_stat_archiver.last_archived_wal is
> >>not necessarily the highest-numbered segment that was archived. I
> >>propose that we mention that in the docs, as in the attached patch.
> >
> >+1, but I think the description of that field in the pg-stat-archiver-view
> >table is also pretty misleading.  Maybe like
> >
> >-  Name of the last WAL file successfully archived
> >+  Name of the WAL file most recently successfully archived
> >
> >and similarly s/last/most recent/ for the other fields claiming
> >to be "last" something.
> 
> Makes sense, committed it that way.

This has seen two failures like the following:
#   Failed test 'check table contents after point-in-time recovery'
#   at t/028_pitr_timelines.pl line 167.
#  got: '2'
# expected: '3'
# Looks like you failed 1 test of 3.

 sysname  │      snapshot       │ branch │ bfurl
──────────┼─────────────────────┼────────┼───────
 sungazer │ 2022-03-08 16:24:46 │ HEAD   │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2022-03-08%2016%3A24%3A46
 mylodon  │ 2022-05-18 00:14:19 │ HEAD   │ http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=mylodon&dt=2022-05-18%2000%3A14%3A19

For me, it reproduces consistently with a sleep just before the startup
process exits:

--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -266,6 +266,8 @@ StartupProcessMain(void)
 */
StartupXLOG();
 
+   pg_usleep(3 * 1000 * 1000);
+
/*
 * Exit normally. Exit code 0 tells postmaster that we completed recovery
 * successfully.

A normal/passing run gets this sequence of events:

checkpointer: end-of-recovery checkpoint
startup process: exits
postmaster reaper(): "Startup succeeded, commence normal operations" = 
StartArchiver()
test issues its INSERT
shutdown checkpoint
checkpointer: exits
postmaster reaper(): signal_child(PgArchPID, SIGUSR2)
archiver: completes archiving, exits

However, if the startup process exit is slow enough, you get:

checkpointer: end-of-recovery checkpoint
test issues its INSERT
shutdown checkpoint
checkpointer: exits
postmaster reaper(): no PgArchPID, skip that part
[no archiving]

One can adapt the test to the server behavior by having the test wait for the
archiver to start, as attached.  This is sufficient to make check-world pass
with the above sleep in place.  I think we should also modify the PostgresNode
archive_command to log a message.  That lack of logging was an obstacle
upthread (as seen in commit 3279cef) and again here.
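
A rough sketch of the idea (names are illustrative; $archivedir stands for
the test's archive directory, and $node for any test node with archiving
enabled) would be to wrap the copy so that every archived segment shows up
in the server log:

	# sketch: log each archived segment to make missing archives obvious
	$node->append_conf('postgresql.conf',
		qq{archive_command = 'echo "archiving %f" && cp "%p" "$archivedir/%f"'});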

An alternative would be to declare that the test is right and the server is
wrong.  The postmaster knows how to start the checkpointer if the checkpointer
is not running when the postmaster needs a shutdown checkpoint.  It could
start the archiver around that same area:

	/* Start the checkpointer if not running */
	if (CheckpointerPID == 0)
		CheckpointerPID = StartCheckpointer();
	/* And tell it to shut down */
	if (CheckpointerPID != 0)
	{
		signal_child(CheckpointerPID, SIGUSR2);
		pmState = PM_SHUTDOWN;
	}
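
A rough sketch of that alternative (illustrative only; the exact guard may
differ) would be to also start the archiver before requesting the shutdown
checkpoint, so that the final segments still get archived:

	/* Start the archiver if not running, so it can drain the last segments */
	if (PgArchPID == 0 && PgArchStartupAllowed())
		PgArchPID = StartArchiver();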

Any opinions between the change-test and change-server approaches?

The test is new in HEAD, but I get the same behavior in v14 and v13.  In v12,
node_pitr2 never exits pg_is_in_recovery().  v11 would need test code changes,
which I did not attempt.

Thanks,
nm
Author: Noah Misch 
Commit: Noah Misch 



diff --git a/src/test/recovery/t/028_pitr_timelines.pl b/src/test/recovery/t/028_pitr_timelines.pl
index a8b12d9..bad02ed 100644
--- a/src/test/recovery/t/028_pitr_timelines.pl
+++ b/src/test/recovery/t/028_pitr_timelines.pl
@@ -140,6 +140,13 @@ is($result, qq{1}, "check table contents after point-in-time recovery");
 # back to this timeline.
 $node_pitr->safe_psql('postgres', "INSERT INTO foo VALUES(3);");
 
+# Wait for the archiver to be running.  The startup process might have yet to
+# exit, in which case the postmaster has not started the archiver.  If we
+# stop() without an archiver, the archive will be incomplete.
+$node_pitr->poll_query_until('postgres',
+   "SELECT true FROM pg_stat_activity WHERE backend_type = 'archiver';")
+  or die "Timed 

Re: ICU for global collation

2022-06-27 Thread Michael Paquier
On Mon, Jun 27, 2022 at 08:23:59AM +0200, Peter Eisentraut wrote:
> On 26.06.22 05:51, Julien Rouhaud wrote:
>> On Sat, Jun 25, 2022 at 10:19:30AM -0500, Justin Pryzby wrote:
 +  if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)
> 
> I think the fix here is to change <= to < ?

Yes.
--
Michael


signature.asc
Description: PGP signature


Re: PG15 beta1 fix pg_stats_ext/pg_stats_ext_exprs view manual

2022-06-27 Thread Michael Paquier
On Mon, Jun 27, 2022 at 03:49:20AM +, Shinoda, Noriyoshi (PN Japan FSIP) 
wrote:
> Thanks for your comment. sorry for the late reply.
> I hope it will be fixed during the period of PostgreSQL 15 Beta.

Apologies for the delay, fixed in time for beta2.
--
Michael


signature.asc
Description: PGP signature


Re: ICU for global collation

2022-06-27 Thread Peter Eisentraut

On 26.06.22 05:51, Julien Rouhaud wrote:

Hi,

On Sat, Jun 25, 2022 at 10:19:30AM -0500, Justin Pryzby wrote:

commit f2553d43060edb210b36c63187d52a632448e1d2 says >=1500 in a few places,
but in pg_upgrade says <=1500, which looks wrong for upgrades from v15.
I think it should say <= 1400.

On Wed, Feb 02, 2022 at 02:01:23PM +0100, Peter Eisentraut wrote:

diff --git a/src/bin/pg_upgrade/info.c b/src/bin/pg_upgrade/info.c
index 69ef23119f..2a9ca0e389 100644
--- a/src/bin/pg_upgrade/info.c
+++ b/src/bin/pg_upgrade/info.c
@@ -312,11 +312,20 @@ get_db_infos(ClusterInfo *cluster)
i_spclocation;
charquery[QUERY_ALLOC];
  
  	snprintf(query, sizeof(query),

-"SELECT d.oid, d.datname, d.encoding, d.datcollate, 
d.datctype, "
+"SELECT d.oid, d.datname, d.encoding, d.datcollate, 
d.datctype, ");
+   if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1500)


I think the fix here is to change <= to < ?


+   snprintf(query + strlen(query), sizeof(query) - strlen(query),
+"'c' AS datcollprovider, NULL AS daticucoll, 
");
+   else
+   snprintf(query + strlen(query), sizeof(query) - strlen(query),
+"datcollprovider, daticucoll, ");
+   snprintf(query + strlen(query), sizeof(query) - strlen(query),
 "pg_catalog.pg_tablespace_location(t.oid) AS spclocation 
"
 "FROM pg_catalog.pg_database d "
 " LEFT OUTER JOIN pg_catalog.pg_tablespace t "


Indeed!








Re: [BUG] Panic due to incorrect missingContrecPtr after promotion

2022-06-27 Thread Michael Paquier
On Fri, Jun 24, 2022 at 04:17:34PM +, Imseih (AWS), Sami wrote:
> It has been difficult to get a generic repro, but the way we reproduce
> is through our test suite. To give more details, we are running tests
> in which we constantly fail over and promote standbys. The issue
> surfaces after we have gone through a few promotions, which occur
> every few hours or so (not really important, but to give context).

Hmm.  Could you describe exactly the failover scenario you are using?
Is the test using a set of cascading standbys linked to the promoted
one?  Are the standbys recycled from the promoted nodes with pg_rewind
or created from scratch with a new base backup taken from the
freshly-promoted primary?  I have been looking more at this thread
throughout the day, but I don't see a remaining issue.  It is
perfectly possible that we are missing a piece related to the handling
of those new overwrite contrecords in some cases, like in a rewind.

> I am adding some additional debugging to see if I can draw a better
> picture of what is happening. Will also give aborted_contrec_reset_3.patch
> a go, although I suspect it will not handle the specific case we are dealing
> with.

Yeah, this is not going to change much if you are still seeing
an issue.  This patch does not change the logic; it just
simplifies the tracking of the continuation record data, resetting it
when a complete record has been read.  That said, getting rid of the
dependency on StandbyMode because we cannot promote in the middle of a
record is nice (my memories around that were a bit blurry, but even
recovery_target_lsn would not recover in the middle of a continuation
record), and this is not a bug, so there is limited reason to backpatch
this part of the change.
--
Michael


signature.asc
Description: PGP signature