Re: Exposing the lock manager's WaitForLockers() to SQL

2024-01-28 Thread Will Mortensen
Minor style fix; sorry for the spam.


v9-0001-Refactor-GetLockConflicts-into-more-general-GetLo.patch
Description: Binary data


v9-0002-Allow-specifying-single-lockmode-in-WaitForLocker.patch
Description: Binary data


v9-0003-Add-pg_wait_for_lockers-function.patch
Description: Binary data


RE: Fix inappropriate comments in function ReplicationSlotAcquire

2024-01-28 Thread Wei Wang (Fujitsu)
On Mon, Jan 29, 2024 at 14:50 Amit Kapila  wrote:
> Pushed!

Thanks for pushing.

Regards,
Wang Wei


Re: Network failure may prevent promotion

2024-01-28 Thread Kyotaro Horiguchi
Thank you for fixing the issue.

At Tue, 23 Jan 2024 11:43:43 +0200, Heikki Linnakangas  wrote in
> There's an existing AmWalReceiverProcess() macro too. Let's use that.

Mmm. I sought an Is* function because "IsLogicalWorker()" is placed on
the previous line. Our convention regarding those functions (macros)
and variables seems inconsistent. However, I can't say for sure that
we should unify all of them.

> (See also
> https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0%40iki.fi
> for refactoring in this area)
> 
> Here's a patch set summarizing the changes so far. They should be
> squashed, but I kept them separate for now to help with review:
> 
> 1. revert the revert of 728f86fec6.
> 2. your walrcv_shutdown_deblocking_v2-2.patch
> 3. Also replace libpqrcv_PQexec() and libpqrcv_PQgetResult() with the
> wrappers from libpq-be-fe-helpers.h

Both replacements look fine. I didn't find another instance of similar
code.

> 4. Replace IsWalReceiver() with AmWalReceiverProcess()

Looks fine.

> >> - pqsignal(SIGTERM, SignalHandlerForShutdownRequest); /* request
> >> - shutdown */
> >> + pqsignal(SIGTERM, WalRcvShutdownSignalHandler); /* request shutdown
> >> */
> >>
> >> Can't we just use die(), instead?
> > There was a comment explaining the problems associated with exiting
> > within a signal handler;
> > - * Currently, only SIGTERM is of interest.  We can't just exit(1) within
> > - * the
> > - * SIGTERM signal handler, because the signal might arrive in the middle
> > - * of
> > - * some critical operation, like while we're holding a spinlock.
> > - * Instead, the
> > And I think we should keep the considerations it suggests. The patch
> > removes the comment itself, but it does so because it implements our
> > standard process exit procedure, which incorporates points suggested
> > by the now-removed comment.
> 
> die() doesn't call exit(1). Unless DoingCommandRead is set, but it
> never is in the walreceiver. It looks just like the new
> WalRcvShutdownSignalHandler() function. Am I missing something?

Ugh.. Doesn't the name 'die()' suggest exit()?
I agree that die() can be used instead.

> Hmm, but doesn't bgworker_die() have that problem with exit(1)ing in
> the signal handler?

I noticed that but ignored it for now.

> I also wonder if we should replace SignalHandlerForShutdownRequest()
> completely with die(), in all processes? The difference is that
> SignalHandlerForShutdownRequest() uses ShutdownRequestPending, while
> die() uses ProcDiePending && InterruptPending to indicate that the
> signal was received. Or do some of the processes want to check for
> ShutdownRequestPending only at specific places, and don't want to get
> terminated at any random CHECK_FOR_INTERRUPTS()?

At least, pg_log_backend_memory_context() causes a call to
ProcessInterrupts via "ereport(LOG_SERVER_ONLY" which can lead to an
exit due to ProcDiePending. In this regard, checkpointer clearly
requires the distinction.

Rather than merely consolidating the notification variables and
striving to eliminate CFI calls from the execution path, I believe we
need a shutdown mechanism that CFI doesn't react to. As for how to
achieve that, whether we should keep the notification variables
separate as they are now, or instead introduce a variable that causes
CFI to ignore ProcDiePending, is I think still open to discussion.
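For reference, a minimal sketch of the flag-based pattern under
discussion (illustrative only, not the actual walreceiver code): the
handler only sets the interrupt flags and the latch, and the actual
exit happens later at a CHECK_FOR_INTERRUPTS() call.

/*
 * Minimal sketch of the flag-based shutdown pattern discussed above
 * (illustrative only; not the actual walreceiver code).  The handler must
 * not exit() directly, because the signal may arrive while we hold a
 * spinlock; it only sets flags and wakes the process.
 */
#include "postgres.h"

#include "miscadmin.h"
#include "storage/latch.h"

static void
sketch_shutdown_handler(SIGNAL_ARGS)
{
	InterruptPending = true;
	ProcDiePending = true;

	SetLatch(MyLatch);			/* wake the process if it is waiting */
}

static void
sketch_main_loop_step(void)
{
	/*
	 * At a point where it is safe to exit, the process reacts to the flags;
	 * ProcessInterrupts() performs the actual proc_exit().
	 */
	CHECK_FOR_INTERRUPTS();
}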

The attached patches are the rebased version of v3 (0003 is updated),
plus an additional 0005 that uses die() instead of walreceiver's
custom handler function.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
From daac3aa06bd33f5e8f04a406c8a59fd4cc97 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas 
Date: Tue, 23 Jan 2024 11:01:03 +0200
Subject: [PATCH v4 1/5] Revert "Revert "libpqwalreceiver: Convert to
 libpq-be-fe-helpers.h""

This reverts commit 21ef4d4d897563adb2f7920ad53b734950f1e0a4.
---
 .../libpqwalreceiver/libpqwalreceiver.c   | 55 +++
 1 file changed, 8 insertions(+), 47 deletions(-)

diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 2439733b55..dbee2f8f0e 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -24,6 +24,7 @@
 #include "common/connect.h"
 #include "funcapi.h"
 #include "libpq-fe.h"
+#include "libpq/libpq-be-fe-helpers.h"
 #include "mb/pg_wchar.h"
 #include "miscadmin.h"
 #include "pgstat.h"
@@ -136,7 +137,6 @@ libpqrcv_connect(const char *conninfo, bool logical, bool must_use_password,
  const char *appname, char **err)
 {
 	WalReceiverConn *conn;
-	PostgresPollingStatusType status;
 	const char *keys[6];
 	const char *vals[6];
 	int			i = 0;
@@ -192,56 +192,17 @@ libpqrcv_connect(const char *conninfo, bool logical, bool must_use_password,
 	Assert(i < sizeof(keys));
 
 	conn = 

Re: [PoC] Improve dead tuple storage for lazy vacuum

2024-01-28 Thread Masahiko Sawada
On Fri, Jan 26, 2024 at 11:05 PM Masahiko Sawada  wrote:
>
> On Wed, Jan 24, 2024 at 3:42 PM John Naylor  wrote:
> >
> > On Tue, Jan 23, 2024 at 10:58 AM Masahiko Sawada  
> > wrote:
> > >
> > > The new patches probably need to be polished but the VacDeadItemInfo
> > > idea looks good to me.
> >
> > That idea looks good to me, too. Since you already likely know what
> > you'd like to polish, I don't have much to say except for a few
> > questions below. I also did a quick sweep through every patch, so some
> > of these comments are unrelated to recent changes:
>
> Thank you!
>
> >
> > +/*
> > + * Calculate the slab blocksize so that we can allocate at least 32 chunks
> > + * from the block.
> > + */
> > +#define RT_SLAB_BLOCK_SIZE(size) \
> > + Max((SLAB_DEFAULT_BLOCK_SIZE / (size)) * (size), (size) * 32)
> >
> > The first parameter seems to be trying to make the block size exact,
> > but that's not right, because of the chunk header, and maybe
> > alignment. If the default block size is big enough to waste only a
> > tiny amount of space, let's just use that as-is.
>
> Agreed.
>

As of v55 patch, the sizes of each node class are:

- node4: 40 bytes
- node16_lo: 168 bytes
- node16_hi: 296 bytes
- node48: 784 bytes
- node256: 2088 bytes

If we use SLAB_DEFAULT_BLOCK_SIZE (8kB) for each node class, we waste
(approximately):

- node4: 32 bytes
- node16_lo: 128 bytes
- node16_hi: 200 bytes
- node48: 352 bytes
- node256: 1928 bytes

We might want to calculate a better slab block size for node256 at least.
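For reference, the waste figures above follow from simple remainder
arithmetic; here is a standalone sketch that ignores the slab block
header and chunk header overhead, which is why the numbers above are
approximate:

/* Standalone sketch reproducing the approximate per-block waste above. */
#include <stdio.h>

int
main(void)
{
	const int	blocksize = 8192;	/* SLAB_DEFAULT_BLOCK_SIZE */
	const int	chunksizes[] = {40, 168, 296, 784, 2088};
	const char *names[] = {"node4", "node16_lo", "node16_hi", "node48", "node256"};

	for (int i = 0; i < 5; i++)
		printf("%-10s %4d chunks/block, %4d bytes wasted\n",
			   names[i], blocksize / chunksizes[i], blocksize % chunksizes[i]);
	return 0;
}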

> >
> > + * TODO: The caller must be certain that no other backend will attempt to
> > + * access the TidStore before calling this function. Other backend must
> > + * explicitly call TidStoreDetach() to free up backend-local memory 
> > associated
> > + * with the TidStore. The backend that calls TidStoreDestroy() must not 
> > call
> > + * TidStoreDetach().
> >
> > Do we need to do anything now?
>
> No, will remove it.
>

I misunderstood something. I think the above statement is still true,
but we don't need to do anything at this stage. It's typical usage
that the leader destroys the shared data only after confirming all
workers are detached. It's not a TODO but probably a NOTE.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: New Table Access Methods for Multi and Single Inserts

2024-01-28 Thread Bharath Rupireddy
On Wed, Jan 17, 2024 at 10:57 PM Bharath Rupireddy
 wrote:
>
> Thank you. I'm attaching v8 patch-set here which includes use of new
> insert TAMs for COPY FROM. With this, postgres not only will have the
> new TAM for inserts, but they also can make the following commands
> faster - CREATE TABLE AS, SELECT INTO, CREATE MATERIALIZED VIEW,
> REFRESH MATERIALIZED VIEW and COPY FROM. I'll perform some testing in
> the coming days and post the results here, until then I appreciate any
> feedback on the patches.
>
> I've also added this proposal to CF -
> https://commitfest.postgresql.org/47/4777/.

Some of the tests related to Incremental Sort added to aggregates.sql by
the recent commit 0452b461bc4 are failing when the multi inserts feature
is used for CTAS (as done in the 0002 patch). I'm not sure whether it's
because of the reduction in CTAS execution time: the table 'btg' created
with CREATE TABLE AS (added by commit 0452b461bc4) takes 25.3 msec with
single inserts and 17.7 msec with multi inserts, i.e. multi inserts are
about 1.43 times or 30.04% faster than single inserts. A couple of ways
to make these tests pick Incremental Sort as expected: 1) CLUSTER btg
USING abc; or 2) increase the number of rows in table btg from 10K to
100K. FWIW, if I reduce the number of rows in the table from 10K to 1K,
Incremental Sort won't get picked on HEAD with CTAS using single inserts
either. Hence, I chose option (2) to fix the issue.

Please find the attached v9 patch set.

[1]
 -- Engage incremental sort
 explain (COSTS OFF) SELECT x,y FROM btg GROUP BY x,y,z,w;
-   QUERY PLAN
--
+  QUERY PLAN
+--
  Group
Group Key: x, y, z, w
-   ->  Incremental Sort
+   ->  Sort
  Sort Key: x, y, z, w
- Presorted Key: x, y
- ->  Index Scan using btg_x_y_idx on btg
-(6 rows)
+ ->  Seq Scan on btg
+(5 rows)

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From a84107e498ffddc56ef4fbb207d6ba6e82717901 Mon Sep 17 00:00:00 2001
From: Bharath Rupireddy 
Date: Mon, 29 Jan 2024 05:39:55 +
Subject: [PATCH v9 1/4] New TAMs for inserts

---
 src/backend/access/heap/heapam.c | 224 +++
 src/backend/access/heap/heapam_handler.c |   9 +
 src/include/access/heapam.h  |  49 +
 src/include/access/tableam.h | 143 +++
 4 files changed, 425 insertions(+)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 707460a536..7df305380e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -68,6 +68,7 @@
 #include "utils/datum.h"
 #include "utils/inval.h"
 #include "utils/lsyscache.h"
+#include "utils/memutils.h"
 #include "utils/relcache.h"
 #include "utils/snapmgr.h"
 #include "utils/spccache.h"
@@ -2446,6 +2447,229 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	pgstat_count_heap_insert(relation, ntuples);
 }
 
+/*
+ * Initialize state required for inserting a single tuple or multiple tuples
+ * into a heap.
+ */
+TableInsertState *
+heap_insert_begin(Relation rel, CommandId cid, int am_flags, int insert_flags)
+{
+	TableInsertState *tistate;
+
+	tistate = palloc0(sizeof(TableInsertState));
+	tistate->rel = rel;
+	tistate->cid = cid;
+	tistate->am_flags = am_flags;
+	tistate->insert_flags = insert_flags;
+
+	if ((am_flags & TABLEAM_MULTI_INSERTS) != 0 ||
+		(am_flags & TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY))
+		tistate->am_data = palloc0(sizeof(HeapInsertState));
+
+	if ((am_flags & TABLEAM_MULTI_INSERTS) != 0)
+	{
+		HeapMultiInsertState *mistate;
+
+		mistate = palloc0(sizeof(HeapMultiInsertState));
+		mistate->slots = palloc0(sizeof(TupleTableSlot *) * HEAP_MAX_BUFFERED_SLOTS);
+
+		mistate->context = AllocSetContextCreate(CurrentMemoryContext,
+ "heap_multi_insert_v2 memory context",
+ ALLOCSET_DEFAULT_SIZES);
+
+		((HeapInsertState *) tistate->am_data)->mistate = mistate;
+	}
+
+	if ((am_flags & TABLEAM_BULKWRITE_BUFFER_ACCESS_STRATEGY) != 0)
+		((HeapInsertState *) tistate->am_data)->bistate = GetBulkInsertState();
+
+	return tistate;
+}
+
+/*
+ * Insert a single tuple into a heap.
+ */
+void
+heap_insert_v2(TableInsertState * state, TupleTableSlot *slot)
+{
+	bool		shouldFree = true;
+	HeapTuple	tuple = ExecFetchSlotHeapTuple(slot, true, &shouldFree);
+	BulkInsertState bistate = NULL;
+
+	Assert(state->am_data != NULL &&
+		   ((HeapInsertState *) state->am_data)->mistate == NULL);
+
+	/* Update tuple with table oid */
+	slot->tts_tableOid = RelationGetRelid(state->rel);
+	tuple->t_tableOid = slot->tts_tableOid;
+
+	if (state->am_data != NULL &&
+		((HeapInsertState *) state->am_data)->bistate != NULL)
+		bistate = ((HeapInsertState *) state->am_data)->bistate;
+
+	/* Perform insertion, and copy the resulting ItemPointer */
+	

Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Junwang Zhao
On Mon, Jan 29, 2024 at 2:03 PM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Sat, 27 Jan 2024 14:15:02 +0800,
>   Junwang Zhao  wrote:
>
> > I have been working on a *COPY TO JSON* extension since yesterday,
> > which is based on your V6 patch set, I'd like to give you more input
> > so you can make better decisions about the implementation(with only
> > pg-copy-arrow you might not get everything considered).
>
> Thanks!
>
> > 0009 is some changes made by me, I changed CopyToGetFormat to
> > CopyToSendCopyBegin because pg_copy_json need to send different bytes
> > in SendCopyBegin, get the format code along is not enough
>
> Oh, I haven't cared about the case.
> How about the following API instead?
>
> static void
> SendCopyBegin(CopyToState cstate)
> {
>     StringInfoData buf;
>
>     pq_beginmessage(&buf, PqMsg_CopyOutResponse);
>     cstate->opts.to_routine->CopyToFillCopyOutResponse(cstate, &buf);
>     pq_endmessage(&buf);
>     cstate->copy_dest = COPY_FRONTEND;
> }
>
> static void
> CopyToJsonFillCopyOutResponse(CopyToState cstate, StringInfoData *buf)
> {
>     int16       format = 0;
>
>     pq_sendbyte(buf, format);      /* overall format */
>
>     /*
>      * JSON mode is always one non-binary column
>      */
>     pq_sendint16(buf, 1);
>     pq_sendint16(buf, format);
> }

Makes sense to me.

>
> > 00010 is the pg_copy_json extension, I think this should be a good
> > case which can utilize the *extendable copy format* feature
>
> It seems that it's convenient that we have one more callback
> for initializing CopyToState::opaque. It's called only once
> when Copy{To,From}Routine is chosen:
>
> typedef struct CopyToRoutine
> {
> void(*CopyToInit) (CopyToState cstate);
> ...
> };

I like this, we can alloc private data in this hook.

>
> void
> ProcessCopyOptions(ParseState *pstate,
>CopyFormatOptions *opts_out,
>bool is_from,
>void *cstate,
>List *options)
> {
> ...
> foreach(option, options)
> {
> DefElem*defel = lfirst_node(DefElem, option);
>
> if (strcmp(defel->defname, "format") == 0)
> {
> ...
> opts_out->to_routine = 
> opts_out->to_routine->CopyToInit(cstate);
> ...
> }
> }
> ...
> }
>
>
> >  maybe we
> > should delete copy_test_format if we have this extension as an
> > example?
>
> I haven't read the COPY TO format json thread[1] carefully
> (sorry), but we may add the JSON format as a built-in
> format. If we do it, copy_test_format is useful to test the
> extension API.

Yeah, maybe, I have no strong opinion here, pg_copy_json is
just a toy extension for discussion.

>
> [1] 
> https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com
>
>
> Thanks,
> --
> kou



-- 
Regards
Junwang Zhao




Re: Small fix on COPY ON_ERROR document

2024-01-28 Thread Yugo NAGATA
On Sun, 28 Jan 2024 19:14:58 -0700
"David G. Johnston"  wrote:

> > Also, I think "invalid input syntax" is a bit ambiguous. For example,
> > COPY FROM raises an error when the number of input column does not match
> > to the table schema, but this error is not ignored by ON_ERROR while
> > this seems to fall into the category of "invalid input syntax".
> 
> 
> 
> It is literally the error text that appears if one were not to ignore it.
> It isn’t a category of errors.  But I’m open to ideas here.  But being
> explicit with what one actually sees in the system seemed preferable to
> inventing new classification terms not otherwise used.

Thank you for the explanation! I understand the words are from the error
messages that users actually see. However, as Torikoshi-san said in [1],
errors other than "invalid input syntax" (e.g. range errors) can also be
ignored, so it would be better to describe the to-be-ignored errors more
specifically.

[1] 
https://www.postgresql.org/message-id/7f1457497fa3bf9dfe486f162d1c8ec6%40oss.nttdata.com

> 
> >
> > So, keeping consistency with the existing description, we can say:
> >
> > "Specifies which how to behave when encountering an error due to
> >  column values unacceptable to the input function of each attribute's
> >  data type."
> 
> 
> Yeah, I was considering something along those lines as an option as well.
> But I’d rather add that wording to the glossary.

Although I am still not convinced that we have to introduce the words
"soft error" into the documentation, I don't mind if there are no
other opposing opinions.

> 
> > Currently, ON_ERROR doesn't support other soft errors, so it can explain
> > it more simply without introducing the new concept, "soft error" to users.
> >
> >
> Good point.  Seems we should define what user-facing errors are ignored
> anywhere in the system and if we aren’t consistently leveraging these in
> all areas/commands make the necessary qualifications in those specific
> places.
> 

> > I think "left in a deleted state" is also unclear for users because this
> > explains the internal state but not how it looks from the user's view. How about
> > leaving the explanation "These rows will not be visible or accessible" in
> > the existing statement?
> >
> 
> Just visible then, I don’t like an “or” there and as tuples at least they
> are accessible to the system, in vacuum especially.  But I expected the
> user to understand “as if you deleted it” as their operational concept more
> readily than visible.  I think this will be read by people who haven’t read
> MVCC to fully understand what visible means but know enough to run vacuum
> to clean up updated and deleted data as a rule.

Ok, I agree we can omit "or accessible". How do you like the following?
Still redundant?

 "If the command fails, these rows are left in a deleted state;
  these rows will not be visible, but they still occupy disk space. "

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Sutou Kouhei
Hi,

In 
  "Re: Make COPY format extendable: Extract COPY TO format implementations" on 
Sat, 27 Jan 2024 14:15:02 +0800,
  Junwang Zhao  wrote:

> I have been working on a *COPY TO JSON* extension since yesterday,
> which is based on your V6 patch set, I'd like to give you more input
> so you can make better decisions about the implementation(with only
> pg-copy-arrow you might not get everything considered).

Thanks!

> 0009 is some changes made by me, I changed CopyToGetFormat to
> CopyToSendCopyBegin because pg_copy_json need to send different bytes
> in SendCopyBegin, get the format code along is not enough

Oh, I haven't cared about the case.
How about the following API instead?

static void
SendCopyBegin(CopyToState cstate)
{
    StringInfoData buf;

    pq_beginmessage(&buf, PqMsg_CopyOutResponse);
    cstate->opts.to_routine->CopyToFillCopyOutResponse(cstate, &buf);
    pq_endmessage(&buf);
    cstate->copy_dest = COPY_FRONTEND;
}

static void
CopyToJsonFillCopyOutResponse(CopyToState cstate, StringInfoData *buf)
{
    int16       format = 0;

    pq_sendbyte(buf, format);      /* overall format */

    /*
     * JSON mode is always one non-binary column
     */
    pq_sendint16(buf, 1);
    pq_sendint16(buf, format);
}

> 00010 is the pg_copy_json extension, I think this should be a good
> case which can utilize the *extendable copy format* feature

It seems that it's convenient that we have one more callback
for initializing CopyToState::opaque. It's called only once
when Copy{To,From}Routine is chosen:

typedef struct CopyToRoutine
{
void(*CopyToInit) (CopyToState cstate);
...
};

void
ProcessCopyOptions(ParseState *pstate,
   CopyFormatOptions *opts_out,
   bool is_from,
   void *cstate,
   List *options)
{
...
foreach(option, options)
{
DefElem*defel = lfirst_node(DefElem, option);

if (strcmp(defel->defname, "format") == 0)
{
...
opts_out->to_routine = 
opts_out->to_routine->CopyToInit(cstate);
...
}
}
...
}


>  maybe we
> should delete copy_test_format if we have this extension as an
> example?

I haven't read the COPY TO format json thread[1] carefully
(sorry), but we may add the JSON format as a built-in
format. If we do it, copy_test_format is useful to test the
extension API.

[1] 
https://www.postgresql.org/message-id/flat/CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU%3Dkcg%40mail.gmail.com


Thanks,
-- 
kou




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-28 Thread Dilip Kumar
On Fri, Jan 26, 2024 at 11:08 PM Alvaro Herrera  wrote:
>
> I've continued to review this and decided that I don't like the mess
> this patch proposes in order to support pg_commit_ts's deletion of all
> files.  (Yes, I know that I was the one that proposed this idea. It's
> still a bad one).  I'd like to change that code by removing the limit
> that we can only have 128 bank locks in a SLRU.  To recap, the reason we
> did this is that commit_ts sometimes wants to delete all files while
> running (DeactivateCommitTs), and for this it needs to acquire all bank
> locks.  Since going above the MAX_SIMUL_LWLOCKS limit is disallowed, we
> added the SLRU limit making multiple banks share lwlocks.
>
> I propose two alternative solutions:
>
> 1. The easiest is to have DeactivateCommitTs continue to hold
> CommitTsLock until the end, including while SlruScanDirectory does its
> thing.  This sounds terrible, but considering that this code only runs
> when the module is being deactivated, I don't think it's really all that
> bad in practice.  I mean, if you deactivate the commit_ts module and
> then try to read commit timestamp data, you deserve to wait for a few
> seconds just as a punishment for your stupidity.

I think this idea looks reasonable.  I agree that if we are trying to
read commit_ts after deactivating it then it's fine to make it wait.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: SLRU optimization - configurable buffer pool and partitioning the SLRU lock

2024-01-28 Thread Dilip Kumar
On Thu, Jan 25, 2024 at 10:03 PM Alvaro Herrera  wrote:
>
> On 2024-Jan-25, Alvaro Herrera wrote:
>
> > Here's a touched-up version of this patch.
>
> > diff --git a/src/backend/storage/lmgr/lwlock.c 
> > b/src/backend/storage/lmgr/lwlock.c
> > index 98fa6035cc..4a5e05d5e4 100644
> > --- a/src/backend/storage/lmgr/lwlock.c
> > +++ b/src/backend/storage/lmgr/lwlock.c
> > @@ -163,6 +163,13 @@ static const char *const BuiltinTrancheNames[] = {
> >   [LWTRANCHE_LAUNCHER_HASH] = "LogicalRepLauncherHash",
> >   [LWTRANCHE_DSM_REGISTRY_DSA] = "DSMRegistryDSA",
> >   [LWTRANCHE_DSM_REGISTRY_HASH] = "DSMRegistryHash",
> > + [LWTRANCHE_COMMITTS_SLRU] = "CommitTSSLRU",
> > + [LWTRANCHE_MULTIXACTOFFSET_SLRU] = "MultixactOffsetSLRU",
> > + [LWTRANCHE_MULTIXACTMEMBER_SLRU] = "MultixactMemberSLRU",
> > + [LWTRANCHE_NOTIFY_SLRU] = "NotifySLRU",
> > + [LWTRANCHE_SERIAL_SLRU] = "SerialSLRU"
> > + [LWTRANCHE_SUBTRANS_SLRU] = "SubtransSLRU",
> > + [LWTRANCHE_XACT_SLRU] = "XactSLRU",
> >  };
>
> Eeek.  Last minute changes ...  Fixed here.

Thank you for working on this.  There is one thing that I feel is
problematic.  We have kept the allowed values for these GUCs as
multiples of SLRU_BANK_SIZE (i.e. 16), and that's the reason the min
values were changed to 16; but in this refactoring patch you have
changed that to 8 for some of the buffers, so I think that's not good.

+ {
+ {"multixact_offsets_buffers", PGC_POSTMASTER, RESOURCES_MEM,
+ gettext_noop("Sets the size of the dedicated buffer pool used for
the MultiXact offset cache."),
+ NULL,
+ GUC_UNIT_BLOCKS
+ },
+ _offsets_buffers,
+ 16, 8, SLRU_MAX_ALLOWED_BUFFERS,
+ check_multixact_offsets_buffers, NULL, NULL
+ },

Other than this patch looks good to me.
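For illustration, a check hook such as the check_multixact_offsets_buffers
referenced above might enforce the multiple-of-bank-size rule roughly as
follows; this is a sketch only, not the code from the patch:

/*
 * Rough sketch of a GUC check hook enforcing the multiple-of-bank-size
 * rule discussed above; illustrative only, not the patch's actual hook.
 * SLRU_BANK_SIZE is the constant introduced by the patch.
 */
bool
check_multixact_offsets_buffers(int *newval, void **extra, GucSource source)
{
	if (*newval % SLRU_BANK_SIZE != 0)
	{
		GUC_check_errdetail("The value must be a multiple of %d.",
							SLRU_BANK_SIZE);
		return false;
	}

	return true;
}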

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Fix inappropriate comments in function ReplicationSlotAcquire

2024-01-28 Thread Amit Kapila
On Fri, Jan 26, 2024 at 6:47 AM Wei Wang (Fujitsu)
 wrote:
>
> It looks good to me. So, I updated the patch as suggested.
>

Pushed!

-- 
With Regards,
Amit Kapila.




Re: More new SQL/JSON item methods

2024-01-28 Thread Tom Lane
Kyotaro Horiguchi  writes:
> Having said that, I'm a bit concerned about the case where an overly
> long string is given there. However, considering that we already have
> "invalid input syntax for type xxx: %x" messages (including for the
> numeric), this concern might be unnecessary.

Yeah, we have not worried about that in the past.

> Another concern is that the input value is already a numeric, not a
> string. This situation occurs when the input is NaN of +-Inf. Although
> numeric_out could be used, it might cause another error. Therefore,
> simply writing out as "argument NaN and Infinity.." would be better.

Oh!  I'd assumed that we were discussing a string that we'd failed to
convert to numeric.  If the input is already numeric, then either
the error is unreachable or what we're really doing is rejecting
special values such as NaN on policy grounds.  I would ask first
if that policy is sane at all.  (I'd lean to "not" --- if we allow
it in the input JSON, why not in the output?)  If it is sane, the
error message needs to be far more specific.

regards, tom lane




Re: Update the doc that refers to the removed column of pg_replication_slots.

2024-01-28 Thread Amit Kapila
On Mon, Jan 29, 2024 at 10:38 AM Zhijie Hou (Fujitsu)
 wrote:
>
> I noticed one minor thing that the upgrade doc refers to the old 'conflicting'
> column of pg_replication_slot. As the column has been changed to 
> 'conflict_reason',
> I think the doc needs to be updated like the attachment.
>

Right, this should be fixed. I'll take care of this.

-- 
With Regards,
Amit Kapila.




Re: More new SQL/JSON item methods

2024-01-28 Thread Kyotaro Horiguchi
At Sun, 28 Jan 2024 22:47:02 -0500, Tom Lane  wrote in

> Kyotaro Horiguchi  writes:
> > They seem to be suggesting that PostgreSQL has the types "decimal" and
> > "number". I know of the former, but I don't think PostgreSQL has the
> > latter type. Perhaps the "number" was intended to refer to "numeric"?
>
> Probably.  But I would write just "type numeric".  We do not generally
> acknowledge "decimal" as a separate type, because for us it's only an
> alias for numeric (there is not a pg_type entry for it).
>
> Also, that leads to the thought that "numeric argument ... is out of
> range for type numeric" seems either redundant or contradictory
> depending on how you look at it.  So I suggest wording like
>
> argument "...input string here..." of jsonpath item method .%s() is out of
> range for type numeric
>
> > (And I think it is largely helpful if the given string were shown in
> > the error message, but it would be another issue.)
>
> Agreed, so I suggest the above.

Having said that, I'm a bit concerned about the case where an overly
long string is given there. However, considering that we already have
"invalid input syntax for type xxx: %x" messages (including for
numeric), this concern might be unnecessary.

Another concern is that the input value is already a numeric, not a
string. This situation occurs when the input is NaN or +-Inf. Although
numeric_out could be used, it might cause another error. Therefore,
simply writing it out as "argument NaN and Infinity.." would be better.

Once we resolve these issues, another question arises regarding one of
the messages. In the case of NaN or Infinity, the message will be as
follows:

"argument NaN or Infinity of jsonpath item method .%s() is out of
range for type numeric"

This message appears quite strange and puzzling. I suspect that the
intended message was somewhat different.



> > The same commit has introduced the following set of messages:
>
> >> %s format is not recognized: "%s"
> >> date format is not recognized: "%s"
> >> time format is not recognized: "%s"
> >> time_tz format is not recognized: "%s"
> >> timestamp format is not recognized: "%s"
> >> timestamp_tz format is not recognized: "%s"
>
> > I believe that the first line was intended to cover all the others:p
>
> +1


regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center




Update the doc that refers to the removed column of pg_replication_slots.

2024-01-28 Thread Zhijie Hou (Fujitsu)
Hi,

I noticed one minor thing: the upgrade doc refers to the old 'conflicting'
column of pg_replication_slots. As the column has been renamed to
'conflict_reason', I think the doc needs to be updated as in the attachment.

Best Regards,
Hou Zhijie


0001-Update-the-doc-for-upgrading-slots.patch
Description: 0001-Update-the-doc-for-upgrading-slots.patch


Re: Documentation to upgrade logical replication cluster

2024-01-28 Thread vignesh C
On Mon, 29 Jan 2024 at 06:34, Peter Smith  wrote:
>
> Hi Vignesh,
>
> Here are some review comments for patch v4.
>
> These are cosmetic only; otherwise v4 LGTM.
>
> ==
> doc/src/sgml/ref/pgupgrade.sgml
>
> 1.
> Configure the servers for log shipping.  (You do not need to run
> pg_backup_start() and
> pg_backup_stop()
> or take a file system backup as the standbys are still synchronized
> -   with the primary.)  Only logical slots on the primary are copied to 
> the
> -   new standby, but other slots on the old standby are not copied so must
> -   be recreated manually.
> +   with the primary.)  If the old cluster is prior to 17.0, then no slots
> +   on the primary are copied to the new standby, so all the slots must be
> +   recreated manually. If the old cluster is 17.0 or later, then only
> +   logical slots on the primary are copied to the new standby, but other
> +   slots on the old standby are not copied so must be recreated manually.
>
>
> Perhaps the part from "If the old cluster is prior..." should be in a
> new paragraph.

Modified

> ==
> doc/src/sgml/logical-replication.sgml
>
> 2.
> +   
> +Setup the 
> +subscriber configurations in the new subscriber.
> +pg_upgrade attempts to migrate subscription
> +dependencies which includes the subscription's table information present 
> in
> +pg_subscription_rel
> +system catalog and also the subscription's replication origin. This 
> allows
> +logical replication on the new subscriber to continue from where the
> +old subscriber was up to. Migration of subscription dependencies is only
> +supported when the old cluster is version 17.0 or later. Subscription
> +dependencies on clusters before version 17.0 will silently be ignored.
> +   
>
> Perhaps the part from "pg_upgrade attempts..." should be in a new paragraph.

Modified

Thanks for the comments, the attached v5 version patch has the changes
for the same.

Regards,
Vignesh
From 08d42bc4b8601c9af9302e9778523640a238f8d5 Mon Sep 17 00:00:00 2001
From: Vignesh C 
Date: Wed, 13 Dec 2023 14:11:58 +0530
Subject: [PATCH v5] Documentation for upgrading logical replication cluster.

Documentation for upgrading logical replication cluster.
---
 doc/src/sgml/logical-replication.sgml | 810 ++
 doc/src/sgml/ref/pgupgrade.sgml   | 147 +
 2 files changed, 829 insertions(+), 128 deletions(-)

diff --git a/doc/src/sgml/logical-replication.sgml b/doc/src/sgml/logical-replication.sgml
index ec2130669e..650a3151ff 100644
--- a/doc/src/sgml/logical-replication.sgml
+++ b/doc/src/sgml/logical-replication.sgml
@@ -1926,6 +1926,816 @@ CONTEXT:  processing remote data for replication origin "pg_16395" during "INSER
 
  
 
+ 
+  Upgrade
+
+  
+   Migration of logical replication clusters is possible only when all the
+   members of the old logical replication clusters are version 17.0 or later.
+   Before reading this section, refer  page for
+   more details about pg_upgrade.
+  
+
+  
+   Prepare for publisher upgrades
+
+   
+pg_upgrade attempts to migrate logical
+slots. This helps avoid the need for manually defining the same
+logical slots on the new publisher. Migration of logical slots is
+only supported when the old cluster is version 17.0 or later.
+Logical slots on clusters before version 17.0 will silently be
+ignored.
+   
+
+   
+Before you start upgrading the publisher cluster, ensure that the
+subscription is temporarily disabled, by executing
+ALTER SUBSCRIPTION ... DISABLE.
+Re-enable the subscription after the upgrade.
+   
+
+   
+There are some prerequisites for pg_upgrade to
+be able to upgrade the logical slots. If these are not met an error
+will be reported.
+   
+
+   
+
+ 
+  The new cluster must have
+  wal_level as
+  logical.
+ 
+
+
+ 
+  The new cluster must have
+  max_replication_slots
+  configured to a value greater than or equal to the number of slots
+  present in the old cluster.
+ 
+
+
+ 
+  The output plugins referenced by the slots on the old cluster must be
+  installed in the new PostgreSQL executable
+  directory.
+ 
+
+
+ 
+  The old cluster has replicated all the transactions and logical decoding
+  messages to subscribers.
+ 
+
+
+ 
+  All slots on the old cluster must be usable, i.e., there are no slots
+  whose
+  pg_replication_slots.conflicting
+  is true.
+ 
+
+
+ 
+  The new cluster must not have permanent logical slots, i.e.,
+  there must be no slots where
+  pg_replication_slots.temporary
+  is false.
+ 
+
+   
+  
+
+  
+   Prepare for subscriber upgrades
+
+   
+Setup the 
+subscriber configurations in the new subscriber.
+   
+
+   
+pg_upgrade attempts to migrate subscription
+

Re: Exposing the lock manager's WaitForLockers() to SQL

2024-01-28 Thread Will Mortensen
I guess the output of the deadlock test was unstable, so I simply
removed it in v8 here, but I can try to fix it instead if it seems
important to test that.


v8-0001-Refactor-GetLockConflicts-into-more-general-GetLo.patch
Description: Binary data


v8-0003-Add-pg_wait_for_lockers-function.patch
Description: Binary data


v8-0002-Allow-specifying-single-lockmode-in-WaitForLocker.patch
Description: Binary data


Re: Synchronizing slots from primary to standby

2024-01-28 Thread Amit Kapila
On Mon, Jan 29, 2024 at 9:21 AM Peter Smith  wrote:
>
> Here are some review comments for v70-0001.
>
> ==
> doc/src/sgml/protocol.sgml
>
> 1.
> Related to this, please also review my other patch to the same docs
> page protocol.sgml [1].
>

We can check that separately.

> ==
> src/backend/replication/logical/tablesync.c
>
> 2.
>   walrcv_create_slot(LogRepWorkerWalRcvConn,
>  slotname, false /* permanent */ , false /* two_phase */ ,
> +false,
>  CRS_USE_SNAPSHOT, origin_startpos);
>
> I know it was previously mentioned in this thread that inline
> parameter comments are unnecessary, but here they are already in the
> existing code so shouldn't we do the same?
>

I think it is better to remove even the existing ones, as they often
make the code difficult to read.

-- 
With Regards,
Amit Kapila.




Re: Synchronizing slots from primary to standby

2024-01-28 Thread Peter Smith
Here are some review comments for v70-0001.

==
doc/src/sgml/protocol.sgml

1.
Related to this, please also review my other patch to the same docs
page protocol.sgml [1].

==
src/backend/replication/logical/tablesync.c

2.
  walrcv_create_slot(LogRepWorkerWalRcvConn,
 slotname, false /* permanent */ , false /* two_phase */ ,
+false,
 CRS_USE_SNAPSHOT, origin_startpos);

I know it was previously mentioned in this thread that inline
parameter comments are unnecessary, but here they are already in the
existing code so shouldn't we do the same?

==
src/backend/replication/repl_gram.y

3.
+/* ALTER_REPLICATION_SLOT slot options */
+alter_replication_slot:
+ K_ALTER_REPLICATION_SLOT IDENT '(' generic_option_list ')'
+ {
+ AlterReplicationSlotCmd *cmd;
+ cmd = makeNode(AlterReplicationSlotCmd);
+ cmd->slotname = $2;
+ cmd->options = $4;
+ $$ = (Node *) cmd;
+ }
+ ;
+

IMO write that comment with parentheses, so it matches the code.

SUGGESTION
ALTER_REPLICATION_SLOT slot ( options )

==
[1] 
https://www.postgresql.org/message-id/CAHut%2BPtDWSmW8uiRJF1LfGQJikmo7V2jdysLuRmtsanNZc7fNw%40mail.gmail.com

Kind Regards,
Peter Smith.
Fujitsu Australia




Re: More new SQL/JSON item methods

2024-01-28 Thread Tom Lane
Kyotaro Horiguchi  writes:
> I have two possible issues in a recent commit.
> Commit 66ea94e8e6 has introduced the following messages:

>> errmsg("numeric argument of jsonpath item method .%s() is out of range for 
>> type decimal or number",

> They seem to be suggesting that PostgreSQL has the types "decimal" and
> "number". I know of the former, but I don't think PostgreSQL has the
>  latter type. Perhaps the "number" was intended to refer to "numeric"?

Probably.  But I would write just "type numeric".  We do not generally
acknowledge "decimal" as a separate type, because for us it's only an
alias for numeric (there is not a pg_type entry for it).

Also, that leads to the thought that "numeric argument ... is out of
range for type numeric" seems either redundant or contradictory
depending on how you look at it.  So I suggest wording like

argument "...input string here..." of jsonpath item method .%s() is out of 
range for type numeric

> (And I think it is largely helpful if the given string were shown in
> the error message, but it would be another issue.)

Agreed, so I suggest the above.
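In code form that might look roughly like the following (a sketch only,
reusing the names visible at the existing error site; whether the input
string is still available as numstr at that point is an assumption):

RETURN_ERROR(ereport(ERROR,
					 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
					  errmsg("argument \"%s\" of jsonpath item method .%s() is out of range for type numeric",
							 numstr, jspOperationName(jsp->type)))));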

> The same commit has introduced the following set of messages:

>> %s format is not recognized: "%s"
>> date format is not recognized: "%s"
>> time format is not recognized: "%s"
>> time_tz format is not recognized: "%s"
>> timestamp format is not recognized: "%s"
>> timestamp_tz format is not recognized: "%s"

> I believe that the first line was intended to cover all the others:p

+1

regards, tom lane




Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-28 Thread Richard Guo
On Mon, Jan 29, 2024 at 11:20 AM vignesh C  wrote:

> On Mon, 29 Jan 2024 at 08:01, Richard Guo  wrote:
> > On Sat, Jan 27, 2024 at 10:08 AM vignesh C  wrote:
> >> I have changed the status of the commitfest entry to "Committed" as I
> >> noticed the patch has already been committed.
> >
> > Well, the situation seems a little complex here.  At first, this thread
> > was dedicated to discussing the 'Examine-simple-variable-for-Var-in-CTE'
> > patch, which has already been pushed in [1].  Subsequently, I proposed
> > another patch 'Propagate-pathkeys-from-CTEs-up-to-the-outer-query' in
> > [2], which is currently under review and is what the commitfest entry
> > for.  Later on, within the same thread, another patch was posted as a
> > fix to the first patch and was subsequently pushed in [3].  I believe
> > this sequence of events might have led to confusion.
> >
> > What is the usual practice in such situations?  I guess I'd better to
> > fork a new thread to discuss my proposed patch which is about the
> > 'Propagate-pathkeys-from-CTEs-up-to-the-outer-query'.
>
> Sorry, I missed noticing that there was one pending patch yet to be
> committed. I feel you can continue discussing it here, just to avoid
> losing any historical information about the issue and the continuation
> of the discussion. You can add a new commitfest entry for this.


It seems to me that a fresh new thread is a better option.  I have just
started a new thread in [1], and have tried to migrate the necessary
context over there.  I have also updated the commitfest entry
accordingly.

[1]
https://www.postgresql.org/message-id/flat/CAMbWs49xYd3f8CrE8-WW3--dV1zH_sDSDn-vs2DzHj81Wcnsew%40mail.gmail.com

Thanks
Richard


Commitfest 2024-01 fourth week update

2024-01-28 Thread vignesh C
Hi,

Here's a quick status report after the fourth week:
Status summary:
status                   |  w1 |  w2 |  w3 |  w4
-------------------------+-----+-----+-----+-----
Needs review:            | 238 | 213 | 181 | 147
Waiting on Author:       |  44 |  46 |  52 |  65
Ready for Committer:     |  27 |  27 |  26 |  24
Committed:               |  36 |  46 |  57 |  70
Moved to next CF:        |   1 |   3 |   4 |   4
Withdrawn:               |   2 |   4 |  12 |  14
Returned with Feedback:  |   3 |  12 |  18 |  26
Rejected:                |   1 |   1 |   2 |   2
Total:                   | 352 | 352 | 352 | 352

If you have submitted a patch and it's in "Waiting for author" state,
please aim to get it to "Needs review" state soon if you can, as
that's where people are most likely to be looking for things to
review.

I have pinged all the threads that are in "Needs review" state and
don't apply, compile warning-free, or pass check-world.  I'll do some
more of that sort of thing.

I have sent out mails for the patches that have had no activity for a
long time. Please respond to those mails if you are planning to
continue working on them; if you plan to work on a patch in the next
commitfest instead, please move it there. If there is no response, I'm
planning to return these patches; they can be added again once they
are being actively worked on.

Regards,
Vignesh




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Junwang Zhao
On Mon, Jan 29, 2024 at 11:22 AM Masahiko Sawada  wrote:
>
> On Mon, Jan 29, 2024 at 12:10 PM Junwang Zhao  wrote:
> >
> > On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada  
> > wrote:
> > >
> > > On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao  wrote:
> > > >
> > > > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  
> > > > wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > In 
> > > > > 
> > > > >   "Re: Make COPY format extendable: Extract COPY TO format 
> > > > > implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > > > >   Junwang Zhao  wrote:
> > > > >
> > > > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > > > single option, and store the options in the opaque field,  but it 
> > > > > > can not
> > > > > > check the relation of two options, for example, considering json 
> > > > > > format,
> > > > > > the `header` option can not be handled by these two functions.
> > > > > >
> > > > > > I want to find a way when the user specifies the header option, 
> > > > > > customer
> > > > > > handler can error out.
> > > > >
> > > > > Ah, you want to use a built-in option (such as "header")
> > > > > value from a custom handler, right? Hmm, it may be better
> > > > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > > > for all options including built-in options.
> > > > >
> > > > Hmm, still I don't think it can handle all cases, since we don't know
> > > > the sequence of the options, we need all the options been parsed
> > > > before we check the compatibility of the options, or customer
> > > > handlers will need complicated logic to resolve that, which might
> > > > lead to ugly code :(
> > > >
> > >
> > > Does it make sense to pass only non-builtin options to the custom
> > > format callback after parsing and evaluating the builtin options? That
> > > is, we parse and evaluate only the builtin options and populate
> > > opts_out first, then pass each rest option to the custom format
> > > handler callback. The callback can refer to the builtin option values.
> >
> > Yeah, I think this makes sense.
> >
> > > The callback is expected to return false if the passed option is not
> > > supported. If one of the builtin formats is specified and the rest
> > > options list has at least one option, we raise "option %s not
> > > recognized" error.  IOW it's the core's responsibility to ranse the
> > > "option %s not recognized" error, which is in order to raise a
> > > consistent error message. Also, I think the core should check the
> > > redundant options including bultiin and custom options.
> >
> > It would be good that core could check all the redundant options,
> > but where should core do the book-keeping of all the options? I have
> > no idea about this, in my implementation of pg_copy_json extension,
> > I handle redundant options by adding a xxx_specified field for each
> > xxx.
>
> What I imagined is that while parsing all the specified options, we
> evaluate builtin options and add non-builtin options to another list.
> Then, when parsing a non-builtin option, we check whether this option
> already exists in the list; if it does, we raise the "option %s not
> recognized" error. Once we complete checking all options, we pass each
> option in the list to the callback.

LGTM.

>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao




Re: Exposing the lock manager's WaitForLockers() to SQL

2024-01-28 Thread Will Mortensen
On Fri, Jan 26, 2024 at 4:54 AM vignesh C  wrote:
>
> CFBot shows that there is one warning as in [1]:
> patching file doc/src/sgml/libpq.sgml
> ...
> [09:30:40.000] [943/2212] Compiling C object
> src/backend/postgres_lib.a.p/storage_lmgr_lock.c.obj
> [09:30:40.000] c:\cirrus\src\backend\storage\lmgr\lock.c(4084) :
> warning C4715: 'ParseLockmodeName': not all control paths return a
> value

Thanks Vignesh, I guess the MS compiler doesn't have
__builtin_constant_p()? So I added an unreachable return, and a
regression test that exercises this error path.
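Roughly, the fix looks like this (a sketch with an assumed shape for
ParseLockmodeName, not the actual patch code):

/*
 * Sketch of the MSVC C4715 workaround mentioned above (illustrative only;
 * the real ParseLockmodeName in the patch differs).  ereport(ERROR, ...)
 * never returns, but MSVC cannot see that, so a dummy return keeps every
 * control path formally returning a value.
 */
static LOCKMODE
ParseLockmodeName(const char *lockmodename)
{
	if (strcmp(lockmodename, "AccessShareLock") == 0)
		return AccessShareLock;
	/* ... other lock modes ... */

	ereport(ERROR,
			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
			 errmsg("invalid lock mode name \"%s\"", lockmodename)));

	return NoLock;				/* keep compiler quiet; not reached */
}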

I also made various other simplifications and minor fixes to the code,
docs, and tests.

Back in v5 (with a new SQL command) I had a detailed example in the
docs, which I removed when changing to a function, and I'm not sure if
I should try to add it back now...I could shrink it but it might still
be too long for this part of the docs?

Anyway, please see attached.


v7-0001-Refactor-GetLockConflicts-into-more-general-GetLo.patch
Description: Binary data


v7-0002-Allow-specifying-single-lockmode-in-WaitForLocker.patch
Description: Binary data


v7-0003-Add-pg_wait_for_lockers-function.patch
Description: Binary data


Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Masahiko Sawada
On Mon, Jan 29, 2024 at 12:10 PM Junwang Zhao  wrote:
>
> On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada  
> wrote:
> >
> > On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao  wrote:
> > >
> > > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  wrote:
> > > >
> > > > Hi,
> > > >
> > > > In 
> > > >   "Re: Make COPY format extendable: Extract COPY TO format 
> > > > implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > > >   Junwang Zhao  wrote:
> > > >
> > > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > > single option, and store the options in the opaque field,  but it can 
> > > > > not
> > > > > check the relation of two options, for example, considering json 
> > > > > format,
> > > > > the `header` option can not be handled by these two functions.
> > > > >
> > > > > I want to find a way when the user specifies the header option, 
> > > > > customer
> > > > > handler can error out.
> > > >
> > > > Ah, you want to use a built-in option (such as "header")
> > > > value from a custom handler, right? Hmm, it may be better
> > > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > > for all options including built-in options.
> > > >
> > > Hmm, still I don't think it can handle all cases, since we don't know
> > > the sequence of the options, we need all the options been parsed
> > > before we check the compatibility of the options, or customer
> > > handlers will need complicated logic to resolve that, which might
> > > lead to ugly code :(
> > >
> >
> > Does it make sense to pass only non-builtin options to the custom
> > format callback after parsing and evaluating the builtin options? That
> > is, we parse and evaluate only the builtin options and populate
> > opts_out first, then pass each rest option to the custom format
> > handler callback. The callback can refer to the builtin option values.
>
> Yeah, I think this makes sense.
>
> > The callback is expected to return false if the passed option is not
> > supported. If one of the builtin formats is specified and the rest
> > options list has at least one option, we raise "option %s not
> > recognized" error.  IOW it's the core's responsibility to ranse the
> > "option %s not recognized" error, which is in order to raise a
> > consistent error message. Also, I think the core should check the
> > redundant options including bultiin and custom options.
>
> It would be good that core could check all the redundant options,
> but where should core do the book-keeping of all the options? I have
> no idea about this, in my implementation of pg_copy_json extension,
> I handle redundant options by adding a xxx_specified field for each
> xxx.

What I imagined is that while parsing all the specified options, we
evaluate builtin options and add non-builtin options to another list.
Then, when parsing a non-builtin option, we check whether this option
already exists in the list; if it does, we raise the "option %s not
recognized" error. Once we complete checking all options, we pass each
option in the list to the callback.
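A rough sketch of that flow (illustrative only; is_builtin_copy_option()
and find_defelem() are assumed helpers, and the callback signature is an
assumption rather than the actual patch API):

/*
 * Rough sketch of the two-pass option handling described above
 * (illustrative only; helper names and callback signature are assumed).
 */
List	   *extra_options = NIL;
ListCell   *option;

foreach(option, options)
{
	DefElem    *defel = lfirst_node(DefElem, option);

	if (is_builtin_copy_option(defel))
	{
		/* evaluate builtin options and populate opts_out as today */
	}
	else
	{
		/* core rejects duplicated non-builtin options here */
		if (find_defelem(extra_options, defel->defname) != NULL)
			ereport(ERROR,
					(errcode(ERRCODE_SYNTAX_ERROR),
					 errmsg("conflicting or redundant options")));
		extra_options = lappend(extra_options, defel);
	}
}

/* after all builtin options are evaluated, hand the rest to the handler */
foreach(option, extra_options)
{
	DefElem    *defel = lfirst_node(DefElem, option);

	if (!opts_out->to_routine->CopyToProcessOption(cstate, defel))
		ereport(ERROR,
				(errcode(ERRCODE_SYNTAX_ERROR),
				 errmsg("option \"%s\" not recognized", defel->defname)));
}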

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-28 Thread vignesh C
On Mon, 29 Jan 2024 at 08:01, Richard Guo  wrote:
>
>
> On Sat, Jan 27, 2024 at 10:08 AM vignesh C  wrote:
>>
>> On Mon, 8 Jan 2024 at 22:21, Tom Lane  wrote:
>> >
>> > Richard Guo  writes:
>> > > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane  wrote:
>> > >> Thanks for the report!  I guess we need something like the attached.
>> >
>> > > +1.
>> >
>> > Pushed, thanks for looking at it.
>>
>> I have changed the status of the commitfest entry to "Committed" as I
>> noticed the patch has already been committed.
>
>
> Well, the situation seems a little complex here.  At first, this thread
> was dedicated to discussing the 'Examine-simple-variable-for-Var-in-CTE'
> patch, which has already been pushed in [1].  Subsequently, I proposed
> another patch 'Propagate-pathkeys-from-CTEs-up-to-the-outer-query' in
> [2], which is currently under review and is what the commitfest entry
> for.  Later on, within the same thread, another patch was posted as a
> fix to the first patch and was subsequently pushed in [3].  I believe
> this sequence of events might have led to confusion.
>
> What is the usual practice in such situations?  I guess I'd better to
> fork a new thread to discuss my proposed patch which is about the
> 'Propagate-pathkeys-from-CTEs-up-to-the-outer-query'.

Sorry, I missed noticing that there was one pending patch yet to be
committed. I feel you can continue discussing it here, just to avoid
losing any historical information about the issue and the continuation
of the discussion. You can add a new commitfest entry for this.

Regards,
Vignesh




Propagate pathkeys from CTEs up to the outer query

2024-01-28 Thread Richard Guo
In [1] we've reached a conclusion that for a MATERIALIZED CTE it's okay
to 'allow our statistics or guesses for the sub-query to subsequently
influence what the upper planner does'.  Commit f7816aec23 exposes
column statistics to the upper planner.  In the light of that, here is a
patch that exposes info about the ordering of the CTE result to the
upper planner.

This patch was initially posted in that same thread and has received
some comments from Tom in [2].  Due to the presence of multiple patches
in that thread, it has led to confusion.  So fork a new thread here
specifically dedicated to discussing the patch about exposing pathkeys
from CTEs to the upper planner.

Comments/thoughts?

[1] https://www.postgresql.org/message-id/482957.1700192299%40sss.pgh.pa.us
[2] https://www.postgresql.org/message-id/1339607.1700502358%40sss.pgh.pa.us

Thanks
Richard


v3-0001-Propagate-pathkeys-from-CTEs-up-to-the-outer-query.patch
Description: Binary data


Re: More new SQL/JSON item methods

2024-01-28 Thread Kyotaro Horiguchi
I have found two possible issues in a recent commit.

Commit 66ea94e8e6 has introduced the following messages:
(Apologies in advance if the commit is not related to this thread.)

jsonpath_exec.c:1287
> if (numeric_is_nan(num) || numeric_is_inf(num))
>   RETURN_ERROR(ereport(ERROR,
>  (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
>   errmsg("numeric argument of jsonpath item method .%s() is out 
> of range for type decimal or number",
>jspOperationName(jsp->type);

:1387
>   noerr = DirectInputFunctionCallSafe(numeric_in, numstr,
...
> if (!noerr || escontext.error_occurred)
>   RETURN_ERROR(ereport(ERROR,
>  (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
>   errmsg("string argument of jsonpath item method .%s() is not a 
> valid representation of a decimal or number",

They seem to be suggesting that PostgreSQL has the types "decimal" and
"number". I know of the former, but I don't think PostgreSQL has the
 latter type. Perhaps the "number" was intended to refer to "numeric"?
(And I think it is largely helpful if the given string were shown in
the error message, but it would be another issue.)


The same commit has introduced the following set of messages:

> %s format is not recognized: "%s"
> date format is not recognized: "%s"
> time format is not recognized: "%s"
> time_tz format is not recognized: "%s"
> timestamp format is not recognized: "%s"
> timestamp_tz format is not recognized: "%s"

I believe that the first line was intended to cover all the others:p

They are attached to this message separately.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/utils/adt/jsonpath_exec.c b/src/backend/utils/adt/jsonpath_exec.c
index 22f598cd35..c10926a8a2 100644
--- a/src/backend/utils/adt/jsonpath_exec.c
+++ b/src/backend/utils/adt/jsonpath_exec.c
@@ -1287,7 +1287,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 	if (numeric_is_nan(num) || numeric_is_inf(num))
 		RETURN_ERROR(ereport(ERROR,
 			 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
-			  errmsg("numeric argument of jsonpath item method .%s() is out of range for type decimal or number",
+			  errmsg("numeric argument of jsonpath item method .%s() is out of range for type decimal or numeric",
 	 jspOperationName(jsp->type);
 
 	if (jsp->type == jpiDecimal)
@@ -1312,14 +1312,14 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 	if (!noerr || escontext.error_occurred)
 		RETURN_ERROR(ereport(ERROR,
 			 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
-			  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or number",
+			  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or numeric",
 	 jspOperationName(jsp->type);
 
 	num = DatumGetNumeric(datum);
 	if (numeric_is_nan(num) || numeric_is_inf(num))
 		RETURN_ERROR(ereport(ERROR,
 			 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
-			  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or number",
+			  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or numeric",
 	 jspOperationName(jsp->type);
 
 	res = jperOk;
@@ -1401,7 +1401,7 @@ executeItemOptUnwrapTarget(JsonPathExecContext *cxt, JsonPathItem *jsp,
 	if (!noerr || escontext.error_occurred)
 		RETURN_ERROR(ereport(ERROR,
 			 (errcode(ERRCODE_NON_NUMERIC_SQL_JSON_ITEM),
-			  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or number",
+			  errmsg("string argument of jsonpath item method .%s() is not a valid representation of a decimal or numeric",
 	 jspOperationName(jsp->type);
 
 	num = DatumGetNumeric(numdatum);
diff --git a/src/test/regress/expected/jsonb_jsonpath.out b/src/test/regress/expected/jsonb_jsonpath.out
index eea2af30c8..9d8ce48a25 100644
--- a/src/test/regress/expected/jsonb_jsonpath.out
+++ b/src/test/regress/expected/jsonb_jsonpath.out
@@ -2144,7 +2144,7 @@ select jsonb_path_query('"1.23"', '$.decimal()');
 (1 row)
 
 select jsonb_path_query('"1.23aaa"', '$.decimal()');
-ERROR:  string argument of jsonpath item method .decimal() is not a valid representation of a decimal or number
+ERROR:  string argument of jsonpath item method .decimal() is not a valid representation of a decimal or numeric
 select jsonb_path_query('1e1000', '$.decimal()');
 

Re: Add new error_action COPY ON_ERROR "log"

2024-01-28 Thread torikoshia
On Fri, Jan 26, 2024 at 10:44 PM jian he  wrote:

> I doubt the following part:
>   If the log value is specified,
>   COPY behaves the same as ignore, except that
>   it logs information that should have resulted in errors to
>   PostgreSQL log at INFO level.
>
> I think it does something like:
> When an error happens, cstate->escontext->error_data->elevel will be
> ERROR.
> You manually change cstate->escontext->error_data->elevel to LOG,
> then you call ThrowErrorData.
>
> But it's not related to `INFO level`?
> My log_min_messages is default, warning.


Thanks!

Modified them to NOTICE in accordance with the following summary 
message:

NOTICE:  x row was skipped due to data type incompatibility
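
For illustration, a minimal sketch of what the NOTICE-level reporting could
look like in copyfrom.c (not the attached patch; it assumes the
ErrorSaveContext fields and the COPY_ON_ERROR_LOG value discussed above):

/*
 * Sketch only: re-emit the soft error saved by the input function at
 * NOTICE level instead of aborting, then continue with the next row.
 */
if (cstate->opts.on_error == COPY_ON_ERROR_LOG &&
    cstate->escontext->error_occurred)
{
    ErrorData  *edata = cstate->escontext->error_data;

    edata->elevel = NOTICE;     /* downgrade from ERROR before emitting */
    ThrowErrorData(edata);      /* report the details, keep copying */
}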



On 2024-01-27 00:43, David G. Johnston wrote:

On Thu, Jan 25, 2024 at 9:42 AM torikoshia
 wrote:


Hi,

As described in 9e2d870119, COPY ON_ERROR is expected to have more
"error_action".
(Note that option name was changed by b725b7eec)

I'd like to have a new option "log", which skips soft errors and
logs
information that should have resulted in errors to PostgreSQL log.


Seems like an easy win but largely unhelpful in the typical case.  I
suppose ETL routines using this feature may be running on their
machine under root or "postgres" but in a system where they are not
this very useful information is inaccessible to them.  I suppose the
DBA could set up an extractor to send these specific log lines
elsewhere but that seems like enough hassle to disfavor this approach
and favor one that can place the soft error data and feedback into
user-specified tables in the same database.  Setting up temporary
tables or unlogged tables probably is going to be a more acceptable
methodology than trying to get to the log files.

David J.


I agree that not a few people would prefer to store error information in 
tables and there have already been suggestions[1].


OTOH not everyone thinks saving the information to a table is the best idea[2].

I think it would be desirable for ON_ERROR to be in a form that allows 
the user to choose where to store error information from among some 
options, such as table, log and file.


"ON_ERROR log" would be useful at least in the case of 'running on their 
machine under root or "postgres"' as you pointed out.



[1] 
https://www.postgresql.org/message-id/CACJufxEkkqnozdnvNMGxVAA94KZaCPkYw_Cx4JKG9ueNaZma_A%40mail.gmail.com


[2] 
https://www.postgresql.org/message-id/20231109002600.fuihn34bjqqgm...@awork3.anarazel.de


--
Regards,

--
Atsushi Torikoshi
NTT DATA Group CorporationFrom 5f44cc7525641302842a3d67c14ebb09615bf67b Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Mon, 29 Jan 2024 12:02:32 +0900
Subject: [PATCH v2] Add new error_action "log" to ON_ERROR option

Currently ON_ERROR option only has "ignore" to skip malformed data and
there are no ways to know where and why COPY skipped them.

"log" skips malformed data as well as "ignore", but it logs information that
should have resulted in errors to PostgreSQL log.
---
 doc/src/sgml/ref/copy.sgml  |  9 +++--
 src/backend/commands/copy.c |  4 +++-
 src/backend/commands/copyfrom.c | 24 
 src/include/commands/copy.h |  1 +
 src/test/regress/expected/copy2.out | 18 +-
 src/test/regress/sql/copy2.sql  |  9 +
 6 files changed, 53 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 21a5c4a052..3d949f04a4 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -380,12 +380,17 @@ COPY { table_name [ ( 
   Specifies which 
   error_action to perform when there is malformed data in the input.
-  Currently, only stop (default) and ignore
-  values are supported.
+  Currently, only stop (default), ignore
+  and log values are supported.
   If the stop value is specified,
   COPY stops operation at the first error.
   If the ignore value is specified,
   COPY skips malformed data and continues copying data.
+  If the log value is specified,
+  COPY behaves the same as ignore,
+  except that it logs information that should have resulted in errors to
+  PostgreSQL log at NOTICE
+  level.
   The option is allowed only in COPY FROM.
   Only stop value is allowed when
   using binary format.
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cc0786c6f4..812ca63350 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -415,13 +415,15 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
 		return COPY_ON_ERROR_STOP;
 
 	/*
-	 * Allow "stop", or "ignore" values.
+	 * Allow "stop", "ignore" or "log" values.
 	 */
 	sval = defGetString(def);
 	if (pg_strcasecmp(sval, "stop") == 0)
 		return COPY_ON_ERROR_STOP;
 	if (pg_strcasecmp(sval, "ignore") == 0)
 		return COPY_ON_ERROR_IGNORE;
+	if (pg_strcasecmp(sval, "log") == 0)
+		return COPY_ON_ERROR_LOG;
 
 	

Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Junwang Zhao
On Mon, Jan 29, 2024 at 10:42 AM Masahiko Sawada  wrote:
>
> On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao  wrote:
> >
> > On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  wrote:
> > >
> > > Hi,
> > >
> > > In 
> > >   "Re: Make COPY format extendable: Extract COPY TO format 
> > > implementations" on Fri, 26 Jan 2024 16:41:50 +0800,
> > >   Junwang Zhao  wrote:
> > >
> > > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > > single option, and store the options in the opaque field,  but it can 
> > > > not
> > > > check the relation of two options, for example, considering json format,
> > > > the `header` option can not be handled by these two functions.
> > > >
> > > > I want to find a way when the user specifies the header option, customer
> > > > handler can error out.
> > >
> > > Ah, you want to use a built-in option (such as "header")
> > > value from a custom handler, right? Hmm, it may be better
> > > that we call CopyToProcessOption()/CopyFromProcessOption()
> > > for all options including built-in options.
> > >
> > Hmm, still I don't think it can handle all cases, since we don't know
> > the sequence of the options, we need all the options been parsed
> > before we check the compatibility of the options, or customer
> > handlers will need complicated logic to resolve that, which might
> > lead to ugly code :(
> >
>
> Does it make sense to pass only non-builtin options to the custom
> format callback after parsing and evaluating the builtin options? That
> is, we parse and evaluate only the builtin options and populate
> opts_out first, then pass each remaining option to the custom format
> handler callback. The callback can refer to the builtin option values.

Yeah, I think this makes sense.

> The callback is expected to return false if the passed option is not
> supported. If one of the builtin formats is specified and the rest
> options list has at least one option, we raise "option %s not
> recognized" error.  IOW it's the core's responsibility to raise the
> "option %s not recognized" error, which is in order to raise a
> consistent error message. Also, I think the core should check the
> redundant options including builtin and custom options.

It would be good if core could check all the redundant options,
but where should core do the book-keeping for all the options? I have
no idea about this; in my implementation of the pg_copy_json extension,
I handle redundant options by adding an xxx_specified field for each
xxx.
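
As an illustration of that bookkeeping (the names here are hypothetical,
not an existing API), a custom format handler's opaque state could track
whether each of its options was given explicitly:

/* Sketch of per-option bookkeeping in a custom COPY format handler */
typedef struct CopyJsonState
{
    bool        force_array;
    bool        force_array_specified;  /* option given explicitly? */
} CopyJsonState;

static bool
copy_json_process_option(CopyJsonState *state, DefElem *def)
{
    if (strcmp(def->defname, "force_array") == 0)
    {
        if (state->force_array_specified)
            ereport(ERROR,
                    (errcode(ERRCODE_SYNTAX_ERROR),
                     errmsg("conflicting or redundant options")));
        state->force_array = defGetBoolean(def);
        state->force_array_specified = true;
        return true;
    }
    return false;               /* not an option this handler knows */
}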

>
> Regards,
>
> --
> Masahiko Sawada
> Amazon Web Services: https://aws.amazon.com



-- 
Regards
Junwang Zhao




Re: Tab completion for ATTACH PARTITION

2024-01-28 Thread vignesh C
On Wed, 13 Sept 2023 at 06:57, bt23nguyent  wrote:
>
> Hi,
>
> Currently, psql's tab completion does not work properly
> for ATTACH PARTITION. When the TAB key is typed after "ALTER TABLE
> <table name> ATTACH PARTITION ", all possible table names should be
> displayed; however, foreign table names are not displayed. So I created
> a patch that addresses this issue by ensuring that psql displays not
> only normal table names but also foreign table names in this case.
>
> Any kind of feedback is appreciated.

I have changed the status of the commitfest entry to RWF as Alvaro's
comments have not yet been addressed. Kindly post an updated version
by addressing the comments and add a new commitfest entry for the
same.

Regards,
Vignesh




Re: Make COPY format extendable: Extract COPY TO format implementations

2024-01-28 Thread Masahiko Sawada
On Fri, Jan 26, 2024 at 6:02 PM Junwang Zhao  wrote:
>
> On Fri, Jan 26, 2024 at 4:55 PM Sutou Kouhei  wrote:
> >
> > Hi,
> >
> > In 
> >   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> > on Fri, 26 Jan 2024 16:41:50 +0800,
> >   Junwang Zhao  wrote:
> >
> > > CopyToProcessOption()/CopyFromProcessOption() can only handle
> > > single option, and store the options in the opaque field,  but it can not
> > > check the relation of two options, for example, considering json format,
> > > the `header` option can not be handled by these two functions.
> > >
> > > I want to find a way when the user specifies the header option, customer
> > > handler can error out.
> >
> > Ah, you want to use a built-in option (such as "header")
> > value from a custom handler, right? Hmm, it may be better
> > that we call CopyToProcessOption()/CopyFromProcessOption()
> > for all options including built-in options.
> >
> Hmm, still I don't think it can handle all cases, since we don't know
> the sequence of the options, we need all the options been parsed
> before we check the compatibility of the options, or customer
> handlers will need complicated logic to resolve that, which might
> lead to ugly code :(
>

Does it make sense to pass only non-builtin options to the custom
format callback after parsing and evaluating the builtin options? That
is, we parse and evaluate only the builtin options and populate
opts_out first, then pass each remaining option to the custom format
handler callback. The callback can refer to the builtin option values.
The callback is expected to return false if the passed option is not
supported. If one of the builtin formats is specified and the rest
options list has at least one option, we raise "option %s not
recognized" error.  IOW it's the core's responsibility to raise the
"option %s not recognized" error, which is in order to raise a
consistent error message. Also, I think the core should check the
redundant options including builtin and custom options.
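
A rough sketch of that flow (the callback and structure names are
placeholders for the proposed API, not existing code):

/* After the built-in options have been parsed into opts_out ... */
ListCell   *lc;

foreach(lc, unprocessed_options)
{
    DefElem    *defel = lfirst_node(DefElem, lc);

    /* callback returns false if it does not recognize the option */
    if (!opts_out->routine->CopyFromProcessOption(cstate, defel))
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("option \"%s\" not recognized",
                        defel->defname)));
}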

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Wrong rows estimations with joins of CTEs slows queries by more than factor 500

2024-01-28 Thread Richard Guo
On Sat, Jan 27, 2024 at 10:08 AM vignesh C  wrote:

> On Mon, 8 Jan 2024 at 22:21, Tom Lane  wrote:
> >
> > Richard Guo  writes:
> > > On Sun, Jan 7, 2024 at 6:41 AM Tom Lane  wrote:
> > >> Thanks for the report!  I guess we need something like the attached.
> >
> > > +1.
> >
> > Pushed, thanks for looking at it.
>
> I have changed the status of the commitfest entry to "Committed" as I
> noticed the patch has already been committed.


Well, the situation seems a little complex here.  At first, this thread
was dedicated to discussing the 'Examine-simple-variable-for-Var-in-CTE'
patch, which has already been pushed in [1].  Subsequently, I proposed
another patch 'Propagate-pathkeys-from-CTEs-up-to-the-outer-query' in
[2], which is currently under review and is what the commitfest entry is
for.  Later on, within the same thread, another patch was posted as a
fix to the first patch and was subsequently pushed in [3].  I believe
this sequence of events might have led to confusion.

What is the usual practice in such situations?  I guess I'd better
fork a new thread to discuss my proposed patch, which is about the
'Propagate-pathkeys-from-CTEs-up-to-the-outer-query'.

[1] https://www.postgresql.org/message-id/754093.1700250120%40sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/CAMbWs49gAHeEOn0rpdUUYXryaa60KZ8JKwk1aSERttY9caCYkA%40mail.gmail.com
[3] https://www.postgresql.org/message-id/1941515.1704732682%40sss.pgh.pa.us

Thanks
Richard


RE: Documentation to upgrade logical replication cluster

2024-01-28 Thread Hayato Kuroda (Fujitsu)
Dear Vignesh,

Thanks for updating the patch! For now, v4 patch LGTM.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED
https://www.fujitsu.com/ 



Re: Remove unused fields in ReorderBufferTupleBuf

2024-01-28 Thread Masahiko Sawada
On Fri, Jan 26, 2024 at 4:04 PM Masahiko Sawada  wrote:
>
> On Thu, Jan 25, 2024 at 10:17 PM Aleksander Alekseev
>  wrote:
> >
> > Hi,
> >
> > > > Here is the corrected patch.
> > >
> > > Thank you for updating the patch! I have some comments:
> >
> > Thanks for the review.
> >
> > > -tuple = (ReorderBufferTupleBuf *)
> > > +tuple = (HeapTuple)
> > >  MemoryContextAlloc(rb->tup_context,
> > > -
> > > sizeof(ReorderBufferTupleBuf) +
> > > +   HEAPTUPLESIZE +
> > > MAXIMUM_ALIGNOF +
> > > alloc_len);
> > > -tuple->alloc_tuple_size = alloc_len;
> > > -tuple->tuple.t_data = ReorderBufferTupleBufData(tuple);
> > > +tuple->t_data = (HeapTupleHeader)((char *)tuple + HEAPTUPLESIZE);
> > >
> > > Why do we need to add MAXIMUM_ALIGNOF? since HEAPTUPLESIZE is the
> > > MAXALIGN'd size, it seems we don't need it. heap_form_tuple() does a
> > > similar thing but it doesn't add it.
> >
> > Indeed. I gave it a try and nothing crashed, so it appears that
> > MAXIMUM_ALIGNOF is not needed.
> >
> > > ---
> > >  xl_parameter_change *xlrec =
> > > -(xl_parameter_change *)
> > > XLogRecGetData(buf->record);
> > > +(xl_parameter_change *)
> > > XLogRecGetData(buf->record);
> > >
> > > Unnecessary change.
> >
> > That's pgindent. Fixed.
> >
> > > ---
> > > -/*
> > > - * Free a ReorderBufferTupleBuf.
> > > - */
> > > -void
> > > -ReorderBufferReturnTupleBuf(ReorderBuffer *rb, ReorderBufferTupleBuf 
> > > *tuple)
> > > -{
> > > -pfree(tuple);
> > > -}
> > > -
> > >
> > > Why does ReorderBufferReturnTupleBuf need to be moved from
> > > reorderbuffer.c to reorderbuffer.h? It seems not related to this
> > > refactoring patch so I think we should do it in a separate patch if we
> > > really want it. I'm not sure it's necessary, though.
> >
> > OK, fixed.
>
> Thank you for updating the patch. It looks good to me. I'm going to
> push it next week, barring any objections.

Pushed (08e6344fd642).

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Small fix on COPY ON_ERROR document

2024-01-28 Thread David G. Johnston
On Sunday, January 28, 2024, Yugo NAGATA  wrote:

> On Fri, 26 Jan 2024 08:04:45 -0700
> "David G. Johnston"  wrote:
>
> > On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA  wrote:
> >
> > > On Fri, 26 Jan 2024 00:00:57 -0700
> > > "David G. Johnston"  wrote:
> > >
> > > > I will need to make this tweak and probably a couple others to my own
> > > > suggestions in 12 hours or so.
> > > >
> > >
> > >
> > And here is my v2.
> >
> > Notably I choose to introduce the verbiage "soft error" and then define
> in
> > the ON_ERROR clause the specific soft error that matters here - "invalid
> > input syntax".
>
> I am not sure we should use "soft error" without any explanation
> because it seems to me that the meaning of words is unclear for users.


Agreed. It needs to be added to the glossary.



>
> Also, I think "invalid input syntax" is a bit ambiguous. For example,
> COPY FROM raises an error when the number of input columns does not match
> the table schema, but this error is not ignored by ON_ERROR even though
> this seems to fall into the category of "invalid input syntax".



It is literally the error text that appears if one were not to ignore it.
It isn’t a category of errors.  But I’m open to ideas here.  But being
explicit with what one actually sees in the system seemed preferable to
inventing new classification terms not otherwise used.


>
> So, keeping consistency with the existing description, we can say:
>
> "Specifies which how to behave when encountering an error due to
>  column values unacceptable to the input function of each attribute's
>  data type."


Yeah, I was considering something along those lines as an option as well.
But I’d rather add that wording to the glossary.



> Currently, ON_ERROR doesn't support other soft errors, so it can explain
> it more simply without introducing the new concept, "soft error" to users.
>
>
Good point.  Seems we should define what user-facing errors are ignored
anywhere in the system and if we aren’t consistently leveraging these in
all areas/commands make the necessary qualifications in those specific
places.



> > I also note the log message behavior when ignore mode is chosen.  I
> haven't
> > confirmed that it is accurate but that is readily tweaked if approved of.
> >
>
> +  An INFO level context message containing the
> ignored row count is
> +  emitted at the end of the COPY FROM if at least
> one row was discarded.
>
>
> The log level is NOTICE not INFO.


Makes sense, I hadn’t experimented.


> I think "left in a deleted state" is also unclear for users because this
> explains the internal state but not how it looks from the user's view. How about
> leaving the explanation "These rows will not be visible or accessible" in
> the existing statement?
>

Just visible then, I don’t like an “or” there and as tuples at least they
are accessible to the system, in vacuum especially.  But I expected the
user to understand “as if you deleted it” as their operational concept more
readily than visible.  I think this will be read by people who haven’t read
MVCC to fully understand what visible means but know enough to run vacuum
to clean up updated and deleted data as a rule.

David J.


Re: Small fix on COPY ON_ERROR document

2024-01-28 Thread torikoshia

On 2024-01-27 00:04, David G. Johnston wrote:

On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA 
wrote:


On Fri, 26 Jan 2024 00:00:57 -0700
"David G. Johnston"  wrote:


I will need to make this tweak and probably a couple others to my

own

suggestions in 12 hours or so.



And here is my v2.

Notably I choose to introduce the verbiage "soft error" and then
define in the ON_ERROR clause the specific soft error that matters
here - "invalid input syntax".

I also note the log message behavior when ignore mode is chosen.  I
haven't confirmed that it is accurate but that is readily tweaked if
approved of.

David J.


Thanks for refining the doc.



+  Specifies which how to behave when encountering a soft error.


To be consistent with other parts in the manual[1][2], should be “soft” 
error?


+  An error_action 
value of

+  stop means fail the command, while
+  ignore means discard the input row and 
continue with the next one.

+  The default is stop


Is "." required at the end of the line?

+ 
+  The only relevant soft error is "invalid input syntax", which 
manifests when attempting

+  to create a column value from the text input.
+ 

I think it is not restricted to "invalid input syntax".
We can handle out-of-range errors too:

  =# create table t1(i int);
  CREATE TABLE

  =# copy t1  from stdin with(ON_ERROR ignore);
  Enter data to be copied followed by a newline.
  End with a backslash and a period on a line by itself, or an EOF
  signal.
  >> 1
  >> \.
  NOTICE:  1 row was skipped due to data type incompatibility
  COPY 0


Also, I'm a little concerned that users might wonder what a soft error is.

Certainly there are already references to "soft" errors in the manual, 
but they seem to be aimed at developers, such as when creating a new TYPE for 
PostgreSQL.


It might be better to describe what a soft error is, as below:


-- src/backend/utils/fmgr/README
An error reported "softly" must be safe, in the sense that there is
no question about our ability to continue normal processing of the
transaction.


[1] https://www.postgresql.org/docs/devel/sql-createtype.html
[2] https://www.postgresql.org/docs/devel/functions-info.html

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation




Re: Small fix on COPY ON_ERROR document

2024-01-28 Thread Yugo NAGATA
On Fri, 26 Jan 2024 08:04:45 -0700
"David G. Johnston"  wrote:

> On Fri, Jan 26, 2024 at 2:30 AM Yugo NAGATA  wrote:
> 
> > On Fri, 26 Jan 2024 00:00:57 -0700
> > "David G. Johnston"  wrote:
> >
> > > I will need to make this tweak and probably a couple others to my own
> > > suggestions in 12 hours or so.
> > >
> >
> >
> And here is my v2.
> 
> Notably I choose to introduce the verbiage "soft error" and then define in
> the ON_ERROR clause the specific soft error that matters here - "invalid
> input syntax".

I am not sure we should use "soft error" without any explanation
because it seems to me that the meaning of words is unclear for users. 

This is explained in src/backend/utils/fmgr/README;

 * Some callers of datatype input functions (and in future perhaps
 other classes of functions) pass an instance of ErrorSaveContext.
 This indicates that the caller wishes to handle "soft" errors without
 a transaction-terminating exception being thrown: instead, the callee
 should store information about the error cause in the ErrorSaveContext
 struct and return a dummy result value. 

However, this is not mentioned in the documentation anywhere. So, it
would be hard for users to understand what "soft error" is because
they would not read README.

Also, I think "invalid input syntax" is a bit ambiguous. For example, 
COPY FROM raises an error when the number of input columns does not match
the table schema, but this error is not ignored by ON_ERROR even though
this seems to fall into the category of "invalid input syntax".

I found the description as followings in the documentation in COPY:

 The column values themselves are strings (...) acceptable to the input
 function, of each attribute's data type.

So, keeping consistency with the existing description, we can say:

"Specifies which how to behave when encountering an error due to
 column values unacceptable to the input function of each attribute's
 data type."

Currently, ON_ERROR doesn't support other soft errors, so it can explain
it more simply without introducing the new concept, "soft error" to users.

> I also note the log message behavior when ignore mode is chosen.  I haven't
> confirmed that it is accurate but that is readily tweaked if approved of.
> 

+  An INFO level context message containing the ignored 
row count is
+  emitted at the end of the COPY FROM if at least one 
row was discarded.


The log level is NOTICE not INFO.


-COPY stops operation at the first error when
-ON_ERROR is not specified. This
-should not lead to problems in the event of a COPY
-TO, but the target table will already have received
-earlier rows in a COPY FROM. These rows will not
-be visible or accessible, but they still occupy disk space. This might
+The COPY FROM command physically inserts input rows
+into the table as it progresses.  If the command fails these rows are
+left in a deleted state, still occupying disk space. This might
 amount to a considerable amount of wasted disk space if the failure
-happened well into a large copy operation. You might wish to invoke
-VACUUM to recover the wasted space.
+happened well into a large copy operation. VACUUM
+should be used to recover the wasted space.


I think "left in a deleted state" is also unclear for users because this
explains the internal state but not how it looks from the user's view. How about
leaving the explanation "These rows will not be visible or accessible" in
the existing statement?

Regards,
Yugo Nagata

-- 
Yugo NAGATA 




Re: speed up a logical replica setup

2024-01-28 Thread Euler Taveira
On Fri, Jan 26, 2024, at 4:55 AM, Hayato Kuroda (Fujitsu) wrote:
> Again, thanks for updating the patch! There are my random comments for v9.

Thanks for checking v9. I already incorporated some of the points below into
the next patch. Give me a couple of hours to include some important points.

> 01.
> I cannot find your replies for my comments#7 [1] but you reverted related 
> changes.
> I'm not sure you are still considering it or you decided not to include 
> changes.
> Can you clarify your opinion?
> (It is needed because changes are huge so it quite affects other 
> developments...)

It is still on my list. As I said in a previous email I'm having a hard time
reviewing pieces from your 0002 patch because you include a bunch of things
into one patch.

> 02.
> ```
> +   -t  class="parameter">seconds
> +   --timeout= class="parameter">seconds
> ```
> 
> But source codes required `--recovery-timeout`. Please update either of them,

Oops. Fixed. My preference is --recovery-timeout because someone can decide to
include a --timeout option for this tool.

> 03.
> ```
> + *Create a new logical replica from a standby server
> ```
> 
> Junwang pointed out to change here but the change was reverted [2]
> Can you clarify your opinion as well?

I'll review the documentation once I fix the code. Since the tool includes the
*create* verb in its name, it seems strange to use another verb (convert) in the
description. Maybe we should just remove the word *new* and have a longer
description explain that the action is to turn the standby (physical replica) into a logical
replica.

> 04.
> ```
> +/*
> + * Is the source server ready for logical replication? If so, create the
> + * publications and replication slots in preparation for logical replication.
> + */
> +static bool
> +setup_publisher(LogicalRepInfo *dbinfo)
> ```
> 
> But this function verifies the source server. I felt they should be in
> different functions.

I split setup_publisher() into check_publisher() and setup_publisher().

> 05.
> ```
> +/*
> + * Is the target server ready for logical replication?
> + */
> +static bool
> +setup_subscriber(LogicalRepInfo *dbinfo)
> 
> 
> Actually, this function does not set up the subscriber. It just verifies
> whether the target can become a subscriber, right? It should be renamed.

I renamed setup_subscriber() -> check_subscriber().

> 06.
> ```
> +atexit(cleanup_objects_atexit);
> ```
> 
> The registration of the cleanup function is too early. This sometimes triggers
> a core-dump. E.g., 

I forgot to apply the atexit() patch.

> 07.
> Missing a removal of publications on the standby.

It was on my list to do. It will be in the next patch.

> 08.
> Missing registration of LogicalRepInfo in the typedefs.list.

Done.

> 09
> ```
> + 
> +  
> +   pg_subscriber
> +   option
> +  
> + 
> ```
> 
> Can you reply my comment#2 [4]? I think mandatory options should be written.

I included the mandatory options into the synopsis.

> 10.
> Just to confirm - will you implement start_standby/stop_standby functions in 
> next version?

It is still on my list.


--
Euler Taveira
EDB   https://www.enterprisedb.com/


Re: Documentation to upgrade logical replication cluster

2024-01-28 Thread Peter Smith
Hi Vignesh,

Here are some review comments for patch v4.

These are cosmetic only; otherwise v4 LGTM.

==
doc/src/sgml/ref/pgupgrade.sgml

1.
Configure the servers for log shipping.  (You do not need to run
pg_backup_start() and
pg_backup_stop()
or take a file system backup as the standbys are still synchronized
-   with the primary.)  Only logical slots on the primary are copied to the
-   new standby, but other slots on the old standby are not copied so must
-   be recreated manually.
+   with the primary.)  If the old cluster is prior to 17.0, then no slots
+   on the primary are copied to the new standby, so all the slots must be
+   recreated manually. If the old cluster is 17.0 or later, then only
+   logical slots on the primary are copied to the new standby, but other
+   slots on the old standby are not copied so must be recreated manually.
   

Perhaps the part from "If the old cluster is prior..." should be in a
new paragraph.

==
doc/src/sgml/logical-replication.sgml

2.
+   
+Setup the 
+subscriber configurations in the new subscriber.
+pg_upgrade attempts to migrate subscription
+dependencies which includes the subscription's table information present in
+pg_subscription_rel
+system catalog and also the subscription's replication origin. This allows
+logical replication on the new subscriber to continue from where the
+old subscriber was up to. Migration of subscription dependencies is only
+supported when the old cluster is version 17.0 or later. Subscription
+dependencies on clusters before version 17.0 will silently be ignored.
+   

Perhaps the part from "pg_upgrade attempts..." should be in a new paragraph.

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

2024-01-28 Thread David G. Johnston
On Sun, Jan 28, 2024 at 4:51 PM jian he  wrote:

> On Fri, Jan 26, 2024 at 11:09 PM David G. Johnston
>  wrote:
> >
> > Hi,
> >
> > The option choice of "ignore" in the COPY ON_ERROR clause seems overly
> generic.  There would seem to be two relevant ways to ignore bad column
> input data - drop the entire row or just set the column value to null.  I
> can see us wanting to provide the set to null option and in any case having
> the option name be explicit that it ignores the row seems like a good idea.
>
> two issue I found out while playing around with it;
> create table x1(a int not null, b int not null );
>
> another issue:
> COPY x1 from stdin (on_error null);
>
> when we already have `not null` top level constraint for table x1.
> Do we need an error immediately?
> "on_error null" seems to conflict with `not null` constraint (assume
> refers to the same column).
> it may fail while doing bulk inserts while on_error is set to null
> because of violating a not null constraint.
>

You should not error immediately since whether or not there is a problem is
table and data dependent.  I would not check for the case of all columns
being defined not null and just let the mismatch happen.

That said, maybe with this being a string we can accept something like:
'null, ignore'

And so if attempting to place any one null fails, assuming we can make that
a soft error too, we would then ignore the entire row.

David J.


Re: Use of backup_label not noted in log

2024-01-28 Thread Michael Paquier
On Fri, Jan 26, 2024 at 12:08:46PM +0900, Michael Paquier wrote:
> Well, I'm OK with this consensus on 1d35f705e if folks think this is
> useful enough for all the stable branches.

I have done that down to REL_15_STABLE for now as this is able to
apply cleanly there.  Older branches have a lack of information here,
actually, because read_backup_label() does not return the TLI
retrieved from the start WAL segment, so we don't have the whole
package of information.
--
Michael


signature.asc
Description: PGP signature


Re: SQL:2011 application time

2024-01-28 Thread jian he
I fixed your tests; some of them can be simplified (mainly, the
primary key constraint is unnecessary for the failing tests).
Also, in your foreign key patch the test table temporal_rng is created at
line 141 but used around line 320, so it's hard to find the definition of
temporal_rng. I drop the table and recreate it,
so people can review the patch with its tests more easily.


+ 
+  In a temporal foreign key, the delete/update will use
+  FOR PORTION OF semantics to constrain the
+  effect to the bounds being deleted/updated in the referenced row.
+ 

In v24-0003-Add-temporal-FOREIGN-KEYs.patch,
FOR PORTION OF is not yet implemented, so we should
not mention it.

+ 
+  If the last column is marked with PERIOD,
+  it must be a period or range column, and the referenced table
+  must have a temporal primary key.
Can we change "it must be a period or range column" to "it must be a
range column"? Maybe we can do that in another patch.


v1-0001-refactor-temporal-FOREIGN-KEYs-test.based_on_v24
Description: Binary data


Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row

2024-01-28 Thread jian he
On Fri, Jan 26, 2024 at 11:09 PM David G. Johnston
 wrote:
>
> Hi,
>
> The option choice of "ignore" in the COPY ON_ERROR clause seems overly 
> generic.  There would seem to be two relevant ways to ignore bad column input 
> data - drop the entire row or just set the column value to null.  I can see 
> us wanting to provide the set to null option and in any case having the 
> option name be explicit that it ignores the row seems like a good idea.

two issue I found out while playing around with it;
create table x1(a int not null, b int not null );

you can only do:
COPY x1 from stdin (on_error 'null');
but you cannot do
COPY x1 from stdin (on_error null);
we would need to hack gram.y to escape the "null" keyword; I don't know how
to make it work.
related post I found:
https://stackoverflow.com/questions/31786611/how-to-escape-flex-keyword
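
For what it's worth, one possible (untested) direction, sketched from the
existing copy_generic_opt_arg rule in gram.y, would be to accept a bare NULL
keyword and map it to the string "null":

copy_generic_opt_arg:
            opt_boolean_or_string       { $$ = (Node *) makeString($1); }
            | NumericOnly               { $$ = (Node *) $1; }
            | NULL_P                    { $$ = (Node *) makeString("null"); }
            | '*'                       { $$ = (Node *) makeString("*"); }
            | '(' copy_generic_opt_arg_list ')'    { $$ = (Node *) $2; }
            | /* EMPTY */               { $$ = NULL; }
        ;

Whether this introduces grammar conflicts has not been checked.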

another issue:
COPY x1 from stdin (on_error null);

when we already have `not null` top level constraint for table x1.
Do we need an error immediately?
"on_error null" seems to conflict with `not null` constraint (assume
refers to the same column).
it may fail while doing bulk inserts while on_error is set to null
because of violating a not null constraint.




Re: MERGE ... RETURNING

2024-01-28 Thread jian he
On Fri, Jan 19, 2024 at 1:44 AM Dean Rasheed  wrote:
>
>
> Thanks for reviewing. Updated patch attached.
>
> The wider question is whether people are happy with the overall
> approach this patch now takes, and the new MERGING() function and
> MergingFunc node.
>

one minor white space issue:

git diff --check
doc/src/sgml/func.sgml:22482: trailing whitespace.
+ action | clause_number | product_id | in_stock | quantity


@@ -3838,7 +3904,7 @@ ExecModifyTable(PlanState *pstate)
  }
  slot = ExecGetUpdateNewTuple(resultRelInfo, context.planSlot,
  oldSlot);
- context.relaction = NULL;
+ node->mt_merge_action = NULL;

I wonder what the purpose of setting node->mt_merge_action to NULL is.
I added `node->mt_merge_action = NULL;` at the end of each branch in
`switch (operation)`, and all the tests still passed.
Other than this question, this patch is very good.




Re: Add recovery to pg_control and remove backup_label

2024-01-28 Thread David Steele

On 1/28/24 19:11, Michael Paquier wrote:

On Fri, Jan 26, 2024 at 06:27:30PM +0530, vignesh C wrote:

Please post an updated version for the same.

[1] - http://cfbot.cputube.org/patch_46_3511.log


With the recent introduction of incremental backups that depend on
backup_label and the rather negative feedback received, I think that
it would be better to return this entry as RwF for now.  What do you
think?


I've been thinking it makes little sense to update the patch. It would 
be a lot of work with all the new changes for incremental backup and 
since Andres and Robert appear to be very against the idea, I doubt it 
would be worth the effort.


I have withdrawn the patch.

Regards,
-David




Re: Add recovery to pg_control and remove backup_label

2024-01-28 Thread Michael Paquier
On Fri, Jan 26, 2024 at 06:27:30PM +0530, vignesh C wrote:
> Please post an updated version for the same.
> 
> [1] - http://cfbot.cputube.org/patch_46_3511.log

With the recent introduction of incremental backups that depend on
backup_label and the rather negative feedback received, I think that
it would be better to return this entry as RwF for now.  What do you
think?
--
Michael


signature.asc
Description: PGP signature


Re: proposal: psql: show current user in prompt

2024-01-28 Thread Jelte Fennema-Nio
On Sun, 28 Jan 2024 at 20:01, Pavel Stehule  wrote:
> There is another reason - compatibility with other drivers.  We maintain just 
> libpq, but there are JDBC, Npgsql, and some native Python drivers. I haven't
> checked, but maybe they expect GUCs with the GUC_REPORT flag.

This doesn't matter, because these drivers themselves would only stop
receiving certain GUC report messages if they changed the
_pq_.report_parameters GUC. And at that point the other driver is
disabling the reporting on purpose. But like I said, I'm fine with
forcing the currently default GUC_REPORT GUCs to be GUC_REPORT always
(maybe excluding application_name).

> But now, I don't see any design without problems. Yours looks a little bit
> fragile to me,

Can you explain what still looks fragile to you about my design? Like
I explained at least from a proxy perspective this is the least
fragile imho, since it can reuse already existing and battle tested
code.

> From my perspective the situation would be better if I knew that the default
> set of reported GUCs is fixed and cannot be broken. Then for almost all
> clients (except pgbouncer users), the CUSTOM_REPORT_GUC GUC will contain just
> "role", and the risk is minimal.

Which risk are you talking about here?

> But there are still problems with the handling of RESET ALL - it means I
> need to recheck the local state every time I show a prompt with %N. That is
> not nice, but probably with a short list it will not be a problem.

I'm not entirely sure what you mean here. Is this still a problem if
RESET ALL is ignored for _pq_.report_parameters? If so, what problem
are you talking about then?

> But I can imagine a client crash, and then pgbouncer executes RESET ALL, and 
> at this moment I would like to reset GUC_REPORT on "role" GUC. Maybe the 
> introduction of a new flag for DISCARD can solve it. But again there can be a 
> problem for which GUC the flag GUC_REPORT should be removed, because there 
> are not two independent lists.

I don't think this is a problem. PgBouncer wouldn't rely on RESET ALL
to reset the state of _pq_.report_parameters. Before handing off the
old connection to a new client, PgBouncer would simply change the
_pq_.report_parameters GUC back to its default value by sending a
ParameterSet message. i.e. PgBouncer would use the same logic as it
currently uses to correctly reset tracked GUCs (application_name,
client_encoding, etc).
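
A conceptual sketch of that hand-off (ParameterSet and _pq_.report_parameters
are part of the proposal under discussion, and the helper names below are
hypothetical, not existing PgBouncer or libpq API):

static void
reset_reporting_before_handoff(PgSocket *server)
{
    /*
     * Same idea as PgBouncer's existing resync of tracked GUCs such as
     * application_name and client_encoding: queue a ParameterSet that
     * restores reporting to the protocol default before the connection
     * is given to the next client.
     */
    queue_parameter_set(server, "_pq_.report_parameters",
                        DEFAULT_REPORT_PARAMETERS);
}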




Re: Things I don't like about \du's "Attributes" column

2024-01-28 Thread David G. Johnston
On Sun, Jan 28, 2024 at 1:29 PM Pavel Luzanov 
wrote:

> I'd suggest pulling out this system view change into its own patch.
>
>
> But within this thread or new one?
>
>
>
Thread.  The subject line needs to make clear we are proposing changing a
system view.

The connection limit can be set to 0. What should be displayed in this
case, blank or 0?
>
> 0 or even "not allowed" to make it clear


> The connection limit can be set for superusers. What should be displayed in 
> this case,
> blank or actual non-effective value?
>
> print "# (ignored)" ?


CREATE|ALTER ROLE commands allow incorrect values to be set for 'Conn
limit' and 'Valid until'.
> How can the administrator see them and fix them?
>
>
That is unfortunate...but they can always go look at the actual system
view.  Or do what I showed above and add (invalid) after the real value.
Note I'm only really talking about -1 here as the value that is simply
hidden from display, since it means unlimited rather than an actual -1.

I'd be more inclined to print "forever" for valid until since the existing
presentation of a timestamp is already multiple characters. Using a word
for a column that is typically a number is less appealing.
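
For illustration only (not a committed change), the query built by
describeRoles() in psql's describe.c could render those two columns roughly
like this, leaving the connection limit blank when it is -1 and printing
"forever" for a NULL rolvaliduntil:

appendPQExpBufferStr(&buf,
                     "  CASE WHEN r.rolconnlimit = -1 THEN ''\n"
                     "       ELSE r.rolconnlimit::text END AS \"Connection limit\",\n"
                     "  CASE WHEN r.rolvaliduntil IS NULL THEN 'forever'\n"
                     "       ELSE r.rolvaliduntil::text END AS \"Valid until\"\n");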

David J.


Re: Things I don't like about \du's "Attributes" column

2024-01-28 Thread Pavel Luzanov

On 23.01.2024 01:59, David G. Johnston wrote:


The attribute names correspond to the keywords of the CREATE ROLE command.
The attributes are listed in the same order as in the documentation.
(I think that the LOGIN attribute should be moved to the first place,
both in the documentation and in the command.)

I'd just flip INHERIT and LOGIN


ok


I'm strongly in favor of using mixed-case for the attributes.  The SQL 
Command itself doesn't care about capitalization and it is much easier 
on the eyes.  I'm also strongly in favor of newlines, as can be seen 
by the default bootstrap superuser entry putting everything on one 
line eats up 65 characters.


                                      List of roles
 Role name | Attributes  | Password? | Valid until | Connection limit | Description
-----------+-------------+-----------+-------------+------------------+-------------
 davidj    | Superuser  +| no        |             |               -1 |
           | CreateDB   +|           |             |                  |
           | CreateRole +|           |             |                  |
           | Inherit    +|           |             |                  |
           | Login      +|           |             |                  |
           | Replication+|           |             |                  |
           | BypassRLS   |           |             |                  |
(1 row)


ok, I will continue with this display variant.




As noted by Peter this patch didn't update the two affected expected 
output files. psql.out and, due to the system view change, rules.out.  
That particular change requires a documentation update to the pg_roles 
system view page.


Yes, I was waiting for the direction of implementation to appear. Now it is 
there.



I'd suggest pulling out this system view change into its own patch.


But within this thread or new one?


On 23.01.2024 05:30, David G. Johnston wrote:
On Sun, Jan 21, 2024 at 2:35 PM Pavel Luzanov 
 wrote:


                           List of roles
 Role name | Attributes  | Password? | Valid until | Connection limit
-----------+-------------+-----------+-------------+------------------
 admin     | INHERIT     | no        |             |               -1
 alice     | SUPERUSER  +| yes       | infinity    |                5

I think I'm in the minority on believing that these describe outputs 
should not be beholden to internal implementation details.


Yes, I prefer real column values. But it can be discussed.


But seeing a -1 in the limit column is just jarring to my 
sensibilities.  I suggest displaying blank (not null, \pset should not 
influence this) if the connection limit is "no limit", only showing 
positive numbers when there is meaningful exceptional information for 
the user to absorb.


The connection limit can be set to 0. What should be displayed in this case, 
blank or 0?
The connection limit can be set for superusers. What should be displayed in 
this case,
blank or actual non-effective value?
CREATE|ALTER ROLE commands allow incorrect values to be set for 'Conn limit' 
and 'Valid until'.
How can the administrator see them and fix them?

These are my reasons for real column values.

--
Pavel Luzanov
Postgres Professional:https://postgrespro.com


Re: Schema variables - new implementation for Postgres 15

2024-01-28 Thread Dmitry Dolgov
> On Sun, Jan 28, 2024 at 08:34:40PM +0100, Pavel Stehule wrote:
> > * I've noticed an interesting result when a LET statement is used to
> > assign a
> >   value without a subquery:
> >
> > create variable test as text;
> > -- returns NULL
> > select test;
> >
> > -- use repeat directly without a subquery
> > let test = repeat("test", 10);
> >
> > -- returns NULL
> > select test;
> >
> >   I was expecting to see an error here, is this a correct behaviour?
> >
>
> what is strange about this result?

Never mind, I've got confused about the quotes here -- it was referring
to the variable content, not a string.




Re: Things I don't like about \du's "Attributes" column

2024-01-28 Thread Pavel Luzanov

On 23.01.2024 04:18, Tom Lane wrote:

I think expecting the pg_roles view to change for this is problematic.
You can't have that in the back branches, so with this patch psql
will show something different against a pre-17 server than later
versions.  At best, that's going to be confusing.


I've been thinking about it. Therefore, the column "Password?" is shown
only for version 17 and later.


   Can you get the same result without changing pg_roles?


Hm. I'm not sure if this is possible.


Actually, even more to the point: while this doesn't expose the
contents of a role's password, it does expose whether the role
*has* a password to every user in the installation.  I doubt
that that's okay from a security standpoint.  It'd need debate
at the least.


Yes, I remember your caution about security from the original post.
I'll try to explain why changing pg_roles is acceptable.
Now \du shows column "Valid until". We know that you can set
the password expiration date without having a password, but this is
more likely a mistake in role maintenance. In most cases, a non-null
value indicates that the password has been set. Therefore, security
should not suffer much, but it will help the administrator to see
incorrect values.

On 23.01.2024 05:22, David G. Johnston wrote:
At present it seems like a createrole permissioned user is unable 
to determine whether a given role has a password or not even in the case

when that role would be allowed to alter a role they've created to set or
remove said password.  Keeping with the changes made in v16 it does seem
worthwhile modifying pg_roles to be sensitive to the role querying the view
having both createrole and admin membership on the role being displayed.
With now three possible outcomes: NULL if no password is in use, *
if a password is in use and the user has the ability to alter role, or
 (alt. N/A).


Good point.
But what about "Valid until". Can roles without superuser or createrole
attributes see it? The same about "Connection limit"?

I'll think about it and try to implement in the next patch version within a few 
days.
Thank you for review.

--
Pavel Luzanov
Postgres Professional:https://postgrespro.com


Re: Schema variables - new implementation for Postgres 15

2024-01-28 Thread Pavel Stehule
ne 28. 1. 2024 v 19:00 odesílatel Dmitry Dolgov <9erthali...@gmail.com>
napsal:

> Thanks for the update, smaller patches looks promising.
>
> Off the list Pavel has mentioned that the first two patches contain a
> bare minimum for session variables, so I've reviewed them once more and
> suggest to concentrate on them first. I'm afraid the memory cleanup
> patch has to be added to the "bare minimum" set as well -- otherwise in
> my tests it was too easy to run out of memory via creating, assigning
> and dropping variables. Unfortunately one can't extract those three
> patches from the series and apply only them, the memory patch would have
> some conflicts. Can you maybe reshuffle the series to have those patches
> (1, 2 + 8) as first three?
>
> If that's possible, my proposal would be to proceed with them first. To the
> best of my knowledge they look good to me, except few minor details:
>
> * The documentation says in a couple of places (ddl.sgml,
>   create_variable.sgml) that "Retrieving a session variable's value
>   returns either a NULL or a default value", but as far as I see the
>   default value feature is not implemented within first two patches.
>
> * Similar with mentioning immutable session variables in plpgsql.sgml .
>
> * Commentary to LookupVariable mentions a rowtype_only argument:
>
> +/*
> + * Returns oid of session variable specified by possibly
> qualified identifier.
> + *
> + * If not found, returns InvalidOid if missing_ok, else throws
> error.
> + * When rowtype_only argument is true the session variables of not
> + * composite types are ignored. This should to reduce possible
> collisions.
> + */
> +Oid
> +LookupVariable(const char *nspname,
> +  const char *varname,
> +  bool missing_ok)
>
>   but the function doesn't have it.
>
> * I've noticed an interesting result when a LET statement is used to
> assign a
>   value without a subquery:
>
> create variable test as text;
> -- returns NULL
> select test;
>
> -- use repeat directly without a subquery
> let test = repeat("test", 10);
>
> -- returns NULL
> select test;
>
>   I was expecting to see an error here, is this a correct behaviour?
>

what is strange about this result?

(2024-01-28 20:32:05) postgres=# let test = 'ab';
LET
(2024-01-28 20:32:12) postgres=# let test = repeat("test", 10);
LET
(2024-01-28 20:32:19) postgres=# select test;
┌──┐
│ test │
╞══╡
│ abababababababababab │
└──┘
(1 row)

(2024-01-28 20:32:21) postgres=# let test = null;
LET
(2024-01-28 20:32:48) postgres=# let test = repeat("test", 10);
LET
(2024-01-28 20:32:51) postgres=# select test;
┌──┐
│ test │
╞══╡
│ ∅│
└──┘
(1 row)

(2024-01-28 20:32:53) postgres=# select repeat(test, 10);
┌┐
│ repeat │
╞╡
│ ∅  │
└┘
(1 row)

"repeat" is the usual scalar function. Maybe you thought different function


Re: proposal: psql: show current user in prompt

2024-01-28 Thread Pavel Stehule
ne 28. 1. 2024 v 10:42 odesílatel Jelte Fennema-Nio 
napsal:

> On Sat, 27 Jan 2024 at 20:44, Pavel Stehule 
> wrote:
> > client_encoding, standard_conforming_strings, server_version,
> default_transaction_read_only, in_hot_standby and scram_iterations
> > are used by libpq directly, so it can be wrong to introduce the
> possibility to break it.
>
> libpq could add these ones automatically to the list, just like a
> proxy. But I think you are probably right and always reporting our
> current default set is probably easier.
>

There is another reason - compatibility with other drivers.  We maintain
just libpq, but there are JDBC, Npgsql, and some native Python drivers. I
haven't checked, but maybe they expect GUCs with the GUC_REPORT flag.


> >> Maybe I'm misunderstanding what you're saying, but it's not clear to
> >> me why you are seeing two different use cases here. To me if we have
> >> the ParameterSet message then they are both the same. When you enable
> >> %N you would send a ParamaterSet message for _pq_.report_parameters
> >> and set it to a list of gucs including "role". And when you disable %N
> >> you would set _pq_.report_parameters to a list of GUCs without "role".
> >> Then if any proxy still needed "role" even if the list it receives in
> >> _pq_.report_parameters doesn't contain it, then this proxy would
> >> simply prepend "role" to the list of GUCs before forwarding the
> >> ParameterSet message.
> >
> >
> > Your scenario can work but looks too fragile. I checked - GUC now cannot
> contain some special chars, so writing parser should not be hard work. But
> your proposal means the proxy should be smart about it, and have to check
> any change of _pq_.report_parameters, and this point can be fragile and a
> source of hardly searched bugs.
>
> Yes, proxies should be smart about it. But if there's new message
> types introduced specifically for this, then proxies need to be smart
> about it too. Because they would need to remember which reporting was
> requested by the client, to be able to correctly ask for reporting
> GUCs after server connection. Using GUCs actually makes this
> easier to implement (and thus less error prone), because proxies
> already have logic to re-sync GUCs after connection assignment.
>
> I think this is probably one of the core reasons why I would very much
> prefer GUCs over new message types to configure protocol extensions
> like this: It means proxies would not to keep track of and re-sync a
> new kind of connection state every time a protocol extension is added.
> They can make their GUC tracking and re-syncing robust, and that's all
> they would need.
>

I am not against GUC-based solutions. I think that for proxies it is probably
the best solution. But I just don't see a purely GUC-based solution as
practical for applications.

Things are more complex when we consider the possibility that more than one
application maintains the list of reported GUCs. But now, I don't see any
design without problems. Yours looks a little bit fragile to me, and my
proposal probably needs two independent lists of reported GUCs, which is not
nice either. From my perspective the situation would be better if I knew that
the default set of reported GUCs is fixed and cannot be broken. Then for
almost all clients (except pgbouncer users), the CUSTOM_REPORT_GUC GUC will
contain just "role", and the risk is minimal. But there are still problems
with the handling of RESET ALL - it means I need to recheck the local state
every time I show a prompt with %N. That is not nice, but probably with a
short list it will not be a problem.



> > This is true, but how common is this situation? Probably every client
> that uses one proxy will use the same defaultly reported GUC.
>
> If you have different clients connecting to the same proxy, it seems
> quite likely that this will happen. This does not seem uncommon to me,
> e.g. actual application would need different things always reported
> than some dev client. Or clients for different languages might ask to
> report slightly different settings.
>
> > Reporting has no extra overhead. The notification is reduced. When there
> is a different set of reported GUC, then the proxy can try to find another
> connection with the same set or can reconnect.
>
> Honestly, this logic seems much more fragile to implement. And
> requiring reconnection seems problematic from a performance point of
> view.
>
> > I think there is still an open question about what the correct behaviour
> > should be when the client executes RESET ALL or DISCARD ALL.  Without
> > special protection the communication with the proxy can be broken - and if
> > we use a GUC for reported variables, then in my case the prompt in psql
> > will be broken too. Inside psql I have no callback on changes of reported
> > GUCs. So this is the reason why reporting based on a mutable GUC is fragile :-/
>
> Specifically for this reason, the current patchset in the other thread
> already ignores RESET ALL and DISCARD ALL for protocol 

Re: MultiXact\SLRU buffers configuration

2024-01-28 Thread Andrey M. Borodin


> On 28 Jan 2024, at 17:49, Alvaro Herrera  wrote:
> 
> I'd appreciate it if you or Horiguchi-san can update his patch to remove
> use of usleep in favor of a CV in multixact, and keep this CF entry to
> cover it.

Sure! Sounds great!

> Perhaps a test to make the code reach the usleep(1000) can be written
> using injection points (49cd2b93d7db)?

I've tried to prototype something like that. But the interesting point between
GetNewMultiXactId() and RecordNewMultiXact() is inside a critical section, and
we cannot have injection points in critical sections...
Also, to implement such a test we need a "wait" type of injection point; see
step 2 in the attachment. With this type of injection point I can stop a
backend in the middle of entering information about a new MultiXact.
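
For reference, a minimal sketch of the CV-based wait (the helper, CV field,
and wait-event names below are placeholders, not the attached patches):

/* reader side, e.g. in GetMultiXactIdMembers() */
for (;;)
{
    offset = read_multixact_offset(multi);    /* placeholder helper */
    if (offset != 0)
        break;                                /* creator has filled it in */
    ConditionVariableSleep(&MultiXactState->nextoff_cv,
                           WAIT_EVENT_MULTIXACT_CREATION);
}
ConditionVariableCancelSleep();

/* creator side, once RecordNewMultiXact() has recorded the offset */
ConditionVariableBroadcast(&MultiXactState->nextoff_cv);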


Best regards, Andrey Borodin.



0001-Add-conditional-variable-to-wait-for-next-MultXact-o.patch
Description: Binary data


0002-Add-wait-type-for-injection-points.patch
Description: Binary data


0003-Try-to-test-multixact-CV-sleep.patch
Description: Binary data


Re: Schema variables - new implementation for Postgres 15

2024-01-28 Thread Dmitry Dolgov
Thanks for the update, smaller patches looks promising.

Off the list Pavel has mentioned that the first two patches contain a
bare minimum for session variables, so I've reviewed them once more and
suggest to concentrate on them first. I'm afraid the memory cleanup
patch has to be added to the "bare minimum" set as well -- otherwise in
my tests it was too easy to run out of memory via creating, assigning
and dropping variables. Unfortunately one can't extract those three
patches from the series and apply only them, the memory patch would have
some conflicts. Can you maybe reshuffle the series to have those patches
(1, 2 + 8) as first three?

If that's possible, my proposal would be to proceed with them first. To the
best of my knowledge they look good to me, except few minor details:

* The documentation says in a couple of places (ddl.sgml,
  create_variable.sgml) that "Retrieving a session variable's value
  returns either a NULL or a default value", but as far as I see the
  default value feature is not implemented within first two patches.

* Similar with mentioning immutable session variables in plpgsql.sgml .

* Commentary to LookupVariable mentions a rowtype_only argument:

+/*
+ * Returns oid of session variable specified by possibly qualified 
identifier.
+ *
+ * If not found, returns InvalidOid if missing_ok, else throws error.
+ * When rowtype_only argument is true the session variables of not
+ * composite types are ignored. This should to reduce possible 
collisions.
+ */
+Oid
+LookupVariable(const char *nspname,
+  const char *varname,
+  bool missing_ok)

  but the function doesn't have it.

* I've noticed an interesting result when a LET statement is used to assign a
  value without a subquery:

create variable test as text;
-- returns NULL
select test;

-- use repeat directly without a subquery
let test = repeat("test", 10);

-- returns NULL
select test;

  I was expecting to see an error here, is this a correct behaviour?




Re: MultiXact\SLRU buffers configuration

2024-01-28 Thread Alvaro Herrera
On 2024-Jan-27, Andrey Borodin wrote:

> thanks for the ping! Most important parts of this patch set are discussed in 
> [0]. If that patchset will be committed, I'll withdraw entry for this thread 
> from commitfest.
> There's a version of Multixact-specific optimizations [1], but I hope they 
> will not be necessary with effective caches developed in [0]. It seems to me 
> that most important part of those optimization is removing sleeps under SLRU 
> lock on standby [2] by Kyotaro Horiguchi. But given that cache optimizations 
> took 4 years to get closer to commit, I'm not sure we will get this 
> optimization any time soon...

I'd appreciate it if you or Horiguchi-san could update his patch to remove
the use of usleep in favor of a CV in multixact, and keep this CF entry to
cover it.

Perhaps a test that makes the code reach the usleep(1000) could be written
using injection points (49cd2b93d7db)?

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"La experiencia nos dice que el hombre peló millones de veces las patatas,
pero era forzoso admitir la posibilidad de que en un caso entre millones,
las patatas pelarían al hombre" (Ijon Tichy)




Re: UUID v7

2024-01-28 Thread Sergey Prokhorenko
By the way, a Go UUID library has already implemented a function for UUIDv7: 
https://pkg.go.dev/github.com/gofrs/uuid#NewV7




Sergey Prokhorenko sergeyprokhore...@yahoo.com.au 

On Thursday, 25 January 2024 at 12:49:46 am GMT+3, Sergey Prokhorenko 
 wrote:  
 
 That's right! There is no point in waiting for the official approval of the 
new RFC, which obviously will not change anything. I have been a contributor to 
this RFC for several years, and I can testify that every aspect imaginable has 
been thoroughly researched and agreed upon. Definitely, nothing new will appear 
in the new RFC.


Sergey Prokhorenko sergeyprokhore...@yahoo.com.au 

On Monday, 22 January 2024 at 07:22:32 am GMT+3, Nikolay Samokhvalov 
 wrote:  
 
 On Fri, Jan 19, 2024 at 10:07 AM Andrey Borodin  wrote:



> On 19 Jan 2024, at 13:25, Andrey Borodin  wrote:
> 
> Also, I've added some documentation on all functions.

Here's v12. Changes:
1. Documentation improvements
2. Code comments
3. Better commit message and reviews list


Thank you, Andrey! I have just checked v12 – it applied cleanly to HEAD, and the 
functions work well. I especially like the fact that we keep 
uuid_extract_time(..) here – this is a great thing to have for time-based 
partitioning, and in many cases it will let us decide not to have a creation 
timestamp column (e.g., "created_at") at all, saving 8 bytes (a quick sketch of 
what I mean is below).
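
For instance, a minimal sketch of that kind of setup (assuming 
uuid_extract_time() is marked IMMUTABLE, as partition key expressions require; 
the table and partition names are just illustrative):

CREATE TABLE events (
    id      uuid NOT NULL,   -- a UUIDv7, carrying the creation time
    payload text
) PARTITION BY RANGE (uuid_extract_time(id));

CREATE TABLE events_2024_01 PARTITION OF events
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
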
The docs and comments look great too.
Overall, the patch looks mature enough. It would be great to have it in pg17. 
Yes, the RFC is not fully finalized yet, but it's very close. And many 
libraries already include implementations of UUIDv7 – here are some 
examples:
- https://www.npmjs.com/package/uuidv7
- https://crates.io/crates/uuidv7
- https://github.com/google/uuid/pull/139
Nik

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-28 Thread Jelte Fennema-Nio
On Sun, 28 Jan 2024 at 10:51, Jelte Fennema-Nio  wrote:
> Both of those are fixed now.

Okay, there turned out to also be an issue on Windows with
setKeepalivesWin32 not being available in fe-cancel.c. That's fixed
now too (as well as some minor formatting issues).
From 4efbb0c75341f4612f0c5b8d5d3fe3f8f9c3b43c Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Fri, 26 Jan 2024 17:01:28 +0100
Subject: [PATCH v28 2/5] libpq: Add pq_release_conn_hosts function

In a follow up PR we'll need to free this connhost field in a function
defined in fe-cancel.c

So this extracts the logic to a dedicated extern function.
---
 src/interfaces/libpq/fe-connect.c | 38 ---
 src/interfaces/libpq/libpq-int.h  |  1 +
 2 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 5d08b4904d3..bc1f6521650 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -4395,19 +4395,7 @@ freePGconn(PGconn *conn)
 		free(conn->events[i].name);
 	}
 
-	/* clean up pg_conn_host structures */
-	for (int i = 0; i < conn->nconnhost; ++i)
-	{
-		free(conn->connhost[i].host);
-		free(conn->connhost[i].hostaddr);
-		free(conn->connhost[i].port);
-		if (conn->connhost[i].password != NULL)
-		{
-			explicit_bzero(conn->connhost[i].password, strlen(conn->connhost[i].password));
-			free(conn->connhost[i].password);
-		}
-	}
-	free(conn->connhost);
+	pq_release_conn_hosts(conn);
 
 	free(conn->client_encoding_initial);
 	free(conn->events);
@@ -4526,6 +4514,30 @@ release_conn_addrinfo(PGconn *conn)
 	}
 }
 
+/*
+ * pq_release_conn_hosts
+ *	 - Free the host list in the PGconn.
+ */
+void
+pq_release_conn_hosts(PGconn *conn)
+{
+	if (conn->connhost)
+	{
+		for (int i = 0; i < conn->nconnhost; ++i)
+		{
+			free(conn->connhost[i].host);
+			free(conn->connhost[i].hostaddr);
+			free(conn->connhost[i].port);
+			if (conn->connhost[i].password != NULL)
+			{
+explicit_bzero(conn->connhost[i].password, strlen(conn->connhost[i].password));
+free(conn->connhost[i].password);
+			}
+		}
+		free(conn->connhost);
+	}
+}
+
 /*
  * sendTerminateConn
  *	 - Send a terminate message to backend.
diff --git a/src/interfaces/libpq/libpq-int.h b/src/interfaces/libpq/libpq-int.h
index 48c10b474f5..4cbad2c2c83 100644
--- a/src/interfaces/libpq/libpq-int.h
+++ b/src/interfaces/libpq/libpq-int.h
@@ -680,6 +680,7 @@ extern int	pqPacketSend(PGconn *conn, char pack_type,
 extern bool pqGetHomeDirectory(char *buf, int bufsize);
 extern bool pq_parse_int_param(const char *value, int *result, PGconn *conn,
 			   const char *context);
+extern void pq_release_conn_hosts(PGconn *conn);
 
 extern pgthreadlock_t pg_g_threadlock;
 
-- 
2.34.1

From f1168ac4c3dd758a77be3ceb8c40bacb9aebef8c Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio 
Date: Fri, 26 Jan 2024 16:47:51 +0100
Subject: [PATCH v28 3/5] libpq: Change some static functions to extern

This is in preparation of a follow up commit that starts using these
functions from fe-cancel.c.
---
 src/interfaces/libpq/fe-connect.c | 85 +++
 src/interfaces/libpq/libpq-int.h  |  6 +++
 2 files changed, 46 insertions(+), 45 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index bc1f6521650..aeb3adc0e31 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -387,15 +387,10 @@ static const char uri_designator[] = "postgresql://";
 static const char short_uri_designator[] = "postgres://";
 
 static bool connectOptions1(PGconn *conn, const char *conninfo);
-static bool connectOptions2(PGconn *conn);
-static int	connectDBStart(PGconn *conn);
-static int	connectDBComplete(PGconn *conn);
 static PGPing internal_ping(PGconn *conn);
-static PGconn *makeEmptyPGconn(void);
 static void pqFreeCommandQueue(PGcmdQueueEntry *queue);
 static bool fillPGconn(PGconn *conn, PQconninfoOption *connOptions);
 static void freePGconn(PGconn *conn);
-static void closePGconn(PGconn *conn);
 static void release_conn_addrinfo(PGconn *conn);
 static int	store_conn_addrinfo(PGconn *conn, struct addrinfo *addrlist);
 static void sendTerminateConn(PGconn *conn);
@@ -644,7 +639,7 @@ pqDropServerData(PGconn *conn)
  * PQconnectStart or PQconnectStartParams (which differ in the same way as
  * PQconnectdb and PQconnectdbParams) and PQconnectPoll.
  *
- * Internally, the static functions connectDBStart, connectDBComplete
+ * Internally, the static functions pqConnectDBStart, pqConnectDBComplete
  * are part of the connection procedure.
  */
 
@@ -678,7 +673,7 @@ PQconnectdbParams(const char *const *keywords,
 	PGconn	   *conn = PQconnectStartParams(keywords, values, expand_dbname);
 
 	if (conn && conn->status != CONNECTION_BAD)
-		(void) connectDBComplete(conn);
+		(void) pqConnectDBComplete(conn);
 
 	return conn;
 }
@@ -731,7 +726,7 @@ PQconnectdb(const char *conninfo)
 	PGconn	   *conn = 

Re: [EXTERNAL] Re: Add non-blocking version of PQcancel

2024-01-28 Thread Jelte Fennema-Nio
On Sun, 28 Jan 2024 at 04:15, vignesh C  wrote:
> CFBot shows that the patch has few compilation errors as in [1]:
> [17:07:07.621] /usr/bin/ld:
> ../../../src/fe_utils/libpgfeutils.a(cancel.o): in function
> `handle_sigint':
> [17:07:07.621] cancel.c:(.text+0x50): undefined reference to `PQcancel'

I forgot to update ./configure-based builds with the new file; only
meson was working. Also, it seems I trimmed the header list of fe-cancel.c
a bit too much for OSX, so I added unistd.h back.

Both of those are fixed now.


v27-0002-libpq-Add-pq_release_conn_hosts-function.patch
Description: Binary data


v27-0005-Start-using-new-libpq-cancel-APIs.patch
Description: Binary data


v27-0003-libpq-Change-some-static-functions-to-extern.patch
Description: Binary data


v27-0004-Add-non-blocking-version-of-PQcancel.patch
Description: Binary data


v27-0001-libpq-Move-cancellation-related-functions-to-fe-.patch
Description: Binary data


Re: proposal: psql: show current user in prompt

2024-01-28 Thread Jelte Fennema-Nio
On Sat, 27 Jan 2024 at 20:44, Pavel Stehule  wrote:
> client_encoding, standard_conforming_strings, server_version, 
> default_transaction_read_only, in_hot_standby and scram_iterations
> are used by libpq directly, so it can be wrong to introduce the possibility 
> to break it.

libpq could add these automatically to the list, just like a proxy
would. But I think you are probably right: always reporting our
current default set is probably easier.

>> Maybe I'm misunderstanding what you're saying, but it's not clear to
>> me why you are seeing two different use cases here. To me if we have
>> the ParameterSet message then they are both the same. When you enable
>> %N you would send a ParamaterSet message for _pq_.report_parameters
>> and set it to a list of gucs including "role". And when you disable %N
>> you would set _pq_.report_parameters to a list of GUCs without "role".
>> Then if any proxy still needed "role" even if the list it receives in
>> _pq_.report_parameters doesn't contain it, then this proxy would
>> simply prepend "role" to the list of GUCs before forwarding the
>> ParameterSet message.
>
>
> Your scenario can work but looks too fragile. I checked - GUC now cannot 
> contain some special chars, so writing parser should not be hard work. But 
> your proposal means the proxy should be smart about it, and have to check any 
> change of _pq_.report_parameters, and this point can be fragile and a source 
> of hardly searched bugs.

Yes, proxies should be smart about it. But if new message types are
introduced specifically for this, then proxies need to be smart about
those too, because they would need to remember which reporting was
requested by the client in order to correctly ask for GUC reporting
again after a server connection is assigned. Using GUCs actually makes
this easier to implement (and thus less error prone), because proxies
already have logic to re-sync GUCs after connection assignment.

I think this is probably one of the core reasons why I would very much
prefer GUCs over new message types to configure protocol extensions
like this: it means proxies would not need to keep track of and re-sync
a new kind of connection state every time a protocol extension is added.
They can make their GUC tracking and re-syncing robust, and that's all
they would need.

> This is true, but how common is this situation? Probably every client  that 
> uses one proxy will use the same defaultly reported GUC.

If you have different clients connecting to the same proxy, it seems
quite likely that this will happen. This does not seem uncommon to me:
e.g. an actual application would need different things always reported
than some dev client would, or clients for different languages might ask
to report slightly different settings.

> Reporting has no extra overhead. The notification is reduced. When there is a 
> different set of reported GUC, then the proxy can try to find another 
> connection with the same set or can reconnect.

Honestly, this logic seems much more fragile to implement. And
requiring reconnection seems problematic from a performance point of
view.

> I think so there is still opened question what should be correct behaviour 
> when client execute RESET ALL or DISCARD ALL.  Without special protection the 
> communication with proxy can be broken - and we use GUC for reported 
> variables, then my case, prompt in psql will be broken too. Inside psql I 
> have not callback on change of reported GUC. So this is reason why reporting 
> based on mutable GUC is fragile :-/

Specifically for this reason, the current patchset in the other thread
already ignores RESET ALL and DISCARD ALL for protocol extension
parameters (including _pq_.report_parameters). So this would be a
non-issue.




Re: Evaluate arguments of correlated SubPlans in the referencing ExprState

2024-01-28 Thread Alena Rybakina

On 26.01.2024 05:37, vignesh C wrote:

On Tue, 24 Oct 2023 at 01:47, Alena Rybakina  wrote:

Hi!

I looked through your patch and noticed that it no longer applied to the current 
version of master. I rebased it and attached a new version. I didn't see any 
problems and, honestly, no big changes were needed; all regression tests 
passed.

I think it's better to add a test, but to be honest, I haven't been able to 
come up with something yet.

The patch does not apply anymore as in CFBot at [1]:

=== Applying patches on top of PostgreSQL commit ID
7014c9a4bba2d1b67d60687afb5b2091c1d07f73 ===
=== applying patch
./v2-0001-WIP-Evaluate-arguments-of-correlated-SubPlans-in-the.patch

patching file src/include/executor/execExpr.h
Hunk #1 succeeded at 160 (offset 1 line).
Hunk #2 succeeded at 382 (offset 2 lines).
Hunk #3 FAILED at 778.
1 out of 3 hunks FAILED -- saving rejects to file
src/include/executor/execExpr.h.rej
patching file src/include/nodes/execnodes.h
Hunk #1 succeeded at 959 (offset 7 lines).

Please have a look and post an updated version.

[1] - http://cfbot.cputube.org/patch_46_4209.log

Regards,
Vignesh


Thank you!

I fixed it. The code remains the same.

--
Regards,
Alena Rybakina
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From bf40b14c0cb63f47280299fd3f76a1711db6aada Mon Sep 17 00:00:00 2001
From: Alena Rybakina 
Date: Sun, 28 Jan 2024 11:58:44 +0300
Subject: [PATCH] WIP: Evaluate arguments of correlated SubPlans in the
 referencing ExprState.

---
 src/backend/executor/execExpr.c   | 93 +--
 src/backend/executor/execExprInterp.c | 22 +++
 src/backend/executor/execProcnode.c   |  5 ++
 src/backend/executor/nodeSubplan.c| 30 -
 src/backend/jit/llvm/llvmjit_expr.c   |  6 ++
 src/backend/jit/llvm/llvmjit_types.c  |  1 +
 src/include/executor/execExpr.h   |  6 +-
 src/include/nodes/execnodes.h |  1 -
 8 files changed, 112 insertions(+), 52 deletions(-)

diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 3181b1136a2..d2e539e7b28 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -88,6 +88,9 @@ static void ExecBuildAggTransCall(ExprState *state, AggState *aggstate,
   FunctionCallInfo fcinfo, AggStatePerTrans pertrans,
   int transno, int setno, int setoff, bool ishash,
   bool nullcheck);
+static void ExecInitSubPlanExpr(SubPlan *subplan,
+ExprState *state,
+Datum *resv, bool *resnull);
 
 
 /*
@@ -1386,7 +1389,6 @@ ExecInitExprRec(Expr *node, ExprState *state,
 		case T_SubPlan:
 			{
 SubPlan*subplan = (SubPlan *) node;
-SubPlanState *sstate;
 
 /*
  * Real execution of a MULTIEXPR SubPlan has already been
@@ -1403,19 +1405,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
 	break;
 }
 
-if (!state->parent)
-	elog(ERROR, "SubPlan found with no parent plan");
-
-sstate = ExecInitSubPlan(subplan, state->parent);
-
-/* add SubPlanState nodes to state->parent->subPlan */
-state->parent->subPlan = lappend(state->parent->subPlan,
- sstate);
-
-scratch.opcode = EEOP_SUBPLAN;
-scratch.d.subplan.sstate = sstate;
-
-ExprEvalPushStep(state, );
+ExecInitSubPlanExpr(subplan, state, resv, resnull);
 break;
 			}
 
@@ -2752,29 +2742,12 @@ ExecPushExprSetupSteps(ExprState *state, ExprSetupInfo *info)
 	foreach(lc, info->multiexpr_subplans)
 	{
 		SubPlan*subplan = (SubPlan *) lfirst(lc);
-		SubPlanState *sstate;
 
 		Assert(subplan->subLinkType == MULTIEXPR_SUBLINK);
 
-		/* This should match what ExecInitExprRec does for other SubPlans: */
-
-		if (!state->parent)
-			elog(ERROR, "SubPlan found with no parent plan");
-
-		sstate = ExecInitSubPlan(subplan, state->parent);
-
-		/* add SubPlanState nodes to state->parent->subPlan */
-		state->parent->subPlan = lappend(state->parent->subPlan,
-		 sstate);
-
-		scratch.opcode = EEOP_SUBPLAN;
-		scratch.d.subplan.sstate = sstate;
-
 		/* The result can be ignored, but we better put it somewhere */
-		scratch.resvalue = >resvalue;
-		scratch.resnull = >resnull;
-
-		ExprEvalPushStep(state, );
+		ExecInitSubPlanExpr(subplan, state,
+			>resvalue, >resnull);
 	}
 }
 
@@ -4181,3 +4154,57 @@ ExecBuildParamSetEqual(TupleDesc desc,
 
 	return state;
 }
+
+static void
+ExecInitSubPlanExpr(SubPlan *subplan,
+	ExprState *state,
+	Datum *resv, bool *resnull)
+{
+	ExprEvalStep scratch = {0};
+	SubPlanState *sstate;
+	ListCell   *pvar;
+	ListCell   *l;
+
+	if (!state->parent)
+		elog(ERROR, "SubPlan found with no parent plan");
+
+	/*
+	 * Generate steps to evaluate input arguments for the subplan.
+	 *
+	 * We evaluate the argument expressions into ExprState's resvalue/resnull,
+	 * and then use PARAM_SET to update the parameter. We do that, instead of
+	 * evaluating directly into the param, to avoid depending on the pointer
+	 * value remaining stable / being included 

Re: tablecmds.c/MergeAttributes() cleanup

2024-01-28 Thread Alexander Lakhin

Hello Peter,

26.01.2024 16:42, Peter Eisentraut wrote:


I have committed all this.  These are great improvements.



Please look at the segmentation fault triggered by the following query since
4d969b2f8:
CREATE TABLE t1(a text COMPRESSION pglz);
CREATE TABLE t2(a text);
CREATE TABLE t3() INHERITS(t1, t2);
NOTICE:  merging multiple inherited definitions of column "a"
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.

Core was generated by `postgres: law regression [local] CREATE TABLE '.
Program terminated with signal SIGSEGV, Segmentation fault.

(gdb) bt
#0  __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
#1  0x5606fbcc9d52 in MergeAttributes (columns=0x0, supers=supers@entry=0x5606fe293d30, relpersistence=112 'p',
    is_partition=false, supconstr=supconstr@entry=0x7fff4046d410, supnotnulls=supnotnulls@entry=0x7fff4046d418)
    at tablecmds.c:2811
#2  0x5606fbccd764 in DefineRelation (stmt=stmt@entry=0x5606fe26a130, relkind=relkind@entry=114 'r', ownerId=10,
    ownerId@entry=0, typaddress=typaddress@entry=0x0,
    queryString=queryString@entry=0x5606fe2695c0 "CREATE TABLE t3() INHERITS(t1, t2);") at tablecmds.c:885
...

Best regards,
Alexander