Re: [PoC] pg_upgrade: allow to upgrade publisher node
Hi Kuroda-san.

Here are some review comments for v28-0003.

==
src/bin/pg_upgrade/check.c

1. check_and_dump_old_cluster

+	/*
+	 * Logical replication slots can be migrated since PG17. See comments atop
+	 * get_old_cluster_logical_slot_infos().
+	 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
+		check_old_cluster_for_valid_slots(live_check);
+

IIUC we are preferring to use the <= 1600 style of version check instead
of >= 1700 where possible. So this comment and version check ought to be
removed from here, and done inside check_old_cluster_for_valid_slots()
instead.

~~~

2. check_old_cluster_for_valid_slots

+/*
+ * check_old_cluster_for_valid_slots()
+ *
+ * Make sure logical replication slots can be migrated to new cluster.
+ * Following points are checked:
+ *
+ * - All logical replication slots are usable.
+ * - All logical replication slots consumed all WALs, except a
+ *   CHECKPOINT_SHUTDOWN record.
+ */
+static void
+check_old_cluster_for_valid_slots(bool live_check)

I suggested in the previous comment above (#1) that the version check
should be moved into this function. Therefore, this function comment now
should also mention slot upgrade is only allowed for >= PG17.

~~~

3.

+static void
+check_old_cluster_for_valid_slots(bool live_check)
+{
+	int			i,
+				ntups,
+				i_slotname;
+	PGresult   *res;
+	DbInfo	   *active_db = &old_cluster.dbarr.dbs[0];
+	PGconn	   *conn;
+
+	/* Quick exit if the cluster does not have logical slots. */
+	if (count_old_cluster_logical_slots() == 0)
+		return;

3a.
See comment #1. At the top of this function body there should be a
version check like:

if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
    return;

~

3b.
/Quick exit/Quick return/

~~~

4.

+	prep_status("Checking for logical replication slots");

I felt that should add the word "valid" like:
"Checking for valid logical replication slots"

~~~

5.

+	/* Check there are no logical replication slots with a 'lost' state. */
+	res = executeQueryOrDie(conn,
+							"SELECT slot_name FROM pg_catalog.pg_replication_slots "
+							"WHERE wal_status = 'lost' AND "
+							"temporary IS FALSE;");

Since the SQL is checking if there *are* lost slots I felt it would be
more natural to reverse that comment.

SUGGESTION
/* Check and reject if there are any logical replication slots with a
'lost' state. */

~~~

6.

+	/*
+	 * Do additional checks if a live check is not required. This requires
+	 * that confirmed_flush_lsn of all the slots is the same as the latest
+	 * checkpoint location, but it would be satisfied only when the server has
+	 * been shut down.
+	 */
+	if (!live_check)

I think the comment can be rearranged slightly:

SUGGESTION
Do additional checks to ensure that 'confirmed_flush_lsn' of all the
slots is the same as the latest checkpoint location.

Note: This can be satisfied only when the old_cluster has been shut
down, so we skip this for "live" checks.

==
src/bin/pg_upgrade/controldata.c

7.

+	/*
+	 * Read the latest checkpoint location if the cluster is PG17
+	 * or later. This is used for upgrading logical replication
+	 * slots.
+	 */
+	if (GET_MAJOR_VERSION(cluster->major_version) >= 1700)
+	{

Fetching this "Latest checkpoint location:" value is only needed for the
check_old_cluster_for_valid_slots validation check, isn't it? But AFAICT
this code is common for both old_cluster and new_cluster.

I am not sure what is best to do:
- Do only the minimal logic needed?
- Read the value redundantly even for new_cluster just to keep code simpler?

Either way, maybe the comment should say something about this.

==
.../t/003_logical_replication_slots.pl

8. Consider adding one more test

Maybe there should also be some "live check" test performed (e.g. using
--check, and a running old_cluster). This would demonstrate pg_upgrade
working successfully even when the WAL records are not consumed (because
LSN checks would be skipped in check_old_cluster_for_valid_slots
function).

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Extract numeric filed in JSONB more effectively
(Sorry for leaving this discussion for such a long time, how time flies!)

On Sun, Aug 27, 2023 at 6:28 AM Chapman Flack wrote:
> On 2023-08-22 08:16, Chapman Flack wrote:
> > On 2023-08-22 01:54, Andy Fan wrote:
> >> After we label it, we will get error like this:
> >>
> >> select (a->'a')::int4 from m;
> >> ERROR: cannot display a value of type internal
> >
> > Without looking in depth right now, I would double-check
> > what relabel node is being applied at the result. The idea,
> > of course, was to relabel the result as the expected result
>
> as I suspected, looking at v10-0003, there's this:
>
> + fexpr = (FuncExpr *) makeRelabelType((Expr *) fexpr, INTERNALOID,
> + 0, InvalidOid,
> COERCE_IMPLICIT_CAST);
>
> compared to the example I had sent earlier:
>
> On 2023-08-18 17:02, Chapman Flack wrote:
> > expr = (Expr *) makeRelabelType((Expr *) fexpr,
> > targetOid, -1, InvalidOid, COERCE_IMPLICIT_CAST);
>
> The key difference: this is the label going on the result type
> of the function we are swapping in.

I feel we have some understanding gap in this area; let's see what it
is. Suppose the original query is:

numeric(jsonb_object_field(v_jsonb, text)) -> numeric.

Without the patch 003, the rewritten query is:

jsonb_object_field_type(NUMERICOID, v_jsonb, text) -> NUMERIC.

However the declared type of jsonb_object_field_type is:

jsonb_object_field_type(internal, jsonb, text) -> internal.

So the situation is:
a). We input a CONST(type=OIDOID, ..) for an internal argument.
b). We return a NUMERIC type, which matches the original query.
c). The result type NUMERIC doesn't match the declared type 'internal'.
d). It doesn't match the run-time type of the internal argument, which is OID.

Case a) is fixed by RelabelType. Case b) shouldn't be treated as an
issue. I thought you wanted to address case c), and patch 003 tries to
fix it, hence the ERROR above. At last I realized case c) isn't the one
you want to fix. Case d) shouldn't be a requirement in the first place,
IIUC.
So your new method is:

1. jsonb_{op}_start() -> internal (internal is actually JsonbValue).
2. jsonb_finish_{type}(internal, ..) -> type. (internal is JsonbValue).

This avoids case a) at the very beginning. I'd like to provide patches
for both solutions for comparison. Are there any other benefits of this
method that I am missing?

> Two more minor differences: (1) the node you get from
> makeRelabelType is an Expr, but not really a FuncExpr. Casting
> it to FuncExpr is a bit bogus. Also, the third argument to
> makeRelabelType is a typmod, and I believe the "not-modified"
> typmod is -1, not 0.

My fault, you are right.

> > On 2023-08-21 06:50, Andy Fan wrote:
> > I'm not very excited with this manner, reasons are: a). It will have
> > to emit more steps in ExprState->steps which will be harmful for
> > execution. The overhead is something I'm not willing to afford.
>
> I would be open to a performance comparison, but offhand I am not
> sure whether the overhead of another step or two in an ExprState
> is appreciably more than some of the overhead in the present patch,
> such as the every-time-through fcinfo initialization buried in
> DirectFunctionCall1 where you don't necessarily see it. I bet the
> fcinfo in an ExprState step gets set up once, and just has
> new argument values slammed into it each time through.

The fcinfo initialization in DirectFunctionCall1 is an interesting
point! So I am persuaded that the extra steps in ExprState may not be
worse than the current way, due to the "every-time-through fcinfo
initialization" (in which case the memory is allocated once on the heap
rather than every time on the stack). I can do a comparison at last to
see if we can find some other interesting findings.

> I would not underestimate the benefit of reducing the code
> duplication and keeping the patch as clear as possible.
> The key contributions of the patch are getting a numeric or
> boolean efficiently out of the JSON operation. Getting from
> numeric to int or float are things the system already does
> well.

True; reusing the casting system should be better than hard-coding the
casting function manually. I'd apply this to both methods.

> A patch that focuses on what it contributes, and avoids
> redoing things the system already can do--unless the duplication
> can be shown to have a strong performance benefit--is easier to
> review and probably to get integrated.

Agreed. At last, thanks for the great insights and patience!

--
Best Regards
Andy Fan
Re: Synchronizing slots from primary to standby
On Wed, Aug 23, 2023 at 4:21 PM Dilip Kumar wrote:
>
> On Wed, Aug 23, 2023 at 3:38 PM shveta malik wrote:
>
> I have reviewed the v12-0002 patch and I have some comments. I see the
> latest version posted sometime back and if any of this comment is
> already fixed in this version then feel free to ignore that.
>
> In general, code still needs a lot more comments to make it readable
> and in some places, code formatting is also not as per PG standard so
> that needs to be improved.
> There are some other specific comments as listed below
>

Please see the latest patch-set (v14). Did some code-formatting, used
pg_indent as well. Added more comments. Let me know specifically if some
more comments or formatting is needed.

> 1.
> @@ -925,7 +936,7 @@ ApplyLauncherRegister(void)
>  	memset(&bgw, 0, sizeof(bgw));
>  	bgw.bgw_flags = BGWORKER_SHMEM_ACCESS |
>  		BGWORKER_BACKEND_DATABASE_CONNECTION;
> -	bgw.bgw_start_time = BgWorkerStart_RecoveryFinished;
> +	bgw.bgw_start_time = BgWorkerStart_ConsistentState;
>
> What is the reason for this change, can you add some comments?

Sure, done.

> 2.
> ApplyLauncherShmemInit(void)
> {
> 	bool found;
> +	bool foundSlotSync;
>
> Is there any specific reason to launch the sync worker from the
> logical launcher instead of making this independent?
> I mean in the future if we plan to sync physical slots as well then it
> wouldn't be an expandable design.

> 3.
> +	/*
> +	 * Remember the old dbids before we stop and cleanup this worker
> +	 * as these will be needed in order to relaunch the worker.
> +	 */
> +	copied_dbcnt = worker->dbcount;
> +	copied_dbids = (Oid *) palloc0(worker->dbcount * sizeof(Oid));
> +
> +	for (i = 0; i < worker->dbcount; i++)
> +		copied_dbids[i] = worker->dbids[i];
>
> probably we can just memcpy the memory containing the dbids.

Done.

> 4.
> +	/*
> +	 * If worker is being reused, and there is vacancy in dbids array,
> +	 * just update dbids array and dbcount and we are done.
> +	 * But if dbids array is exhausted, stop the worker, reallocate
> +	 * dbids in dsm, relaunch the worker with same set of dbs as earlier
> +	 * plus the new db.
> +	 */
>
> Why do we need to relaunch the worker, can't we just use dsa_pointer
> to expand the shared memory whenever required?

Done.

> 5.
>
> +static bool
> +WaitForSlotSyncWorkerAttach(SlotSyncWorker *worker,
> +							uint16 generation,
> +							BackgroundWorkerHandle *handle)
>
> this function is an exact duplicate of WaitForReplicationWorkerAttach
> except for the LWlock, why don't we use the same function by passing
> the LWLock as a parameter
>
> 6.
> +/*
> + * Attach Slot-sync worker to worker-slot assigned by launcher.
> + */
> +void
> +slotsync_worker_attach(int slot)
>
> this is also very similar to the logicalrep_worker_attach function.
>
> Please check other similar functions and reuse them wherever possible

Will revisit these as stated in [1].

[1]: https://www.postgresql.org/message-id/CAJpy0uDeWjJj%2BU8nn%2BHbnGWkfY%2Bn-Bbw_kuHqgphETJ1Lucy%2BQ%40mail.gmail.com

> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
Re: persist logical slots to disk during shutdown checkpoint
On Wed, Aug 30, 2023 at 9:03 AM Julien Rouhaud wrote: > > On Tue, Aug 29, 2023 at 02:21:15PM +0530, Amit Kapila wrote: > > On Tue, Aug 29, 2023 at 10:16 AM vignesh C wrote: > > > > > > That makes sense. The attached v6 version has the changes for the > > > same, apart from this I have also fixed a) pgindent issues b) perltidy > > > issues c) one variable change (flush_lsn_changed to > > > confirmed_flush_has_changed) d) corrected few comments in the test > > > file. Thanks to Peter for providing few offline comments. > > > > > > > The latest version looks good to me. Julien, Ashutosh, and others, > > unless you have more comments or suggestions, I would like to push > > this in a day or two. > > Unfortunately I'm currently swamped with some internal escalations so I > couldn't keep up closely with the latest activity here. > > I think I recall that you wanted to > change the timing at which logical slots are shutdown, I'm assuming that this > change won't lead to always have a difference between the LSN and latest > persisted LSN being different? > I think here by LSN you are referring to confirmed_flush LSN. If so, this doesn't create any new difference between the values for the confirmed_flush LSN in memory and in disk. We just remember the last persisted value to avoid writes of slots at shutdown time. > Otherwise saving the latest persisted LSN to > try to avoid persisting again all logical slots on shutdown seems reasonable > to > me. Thanks for responding. -- With Regards, Amit Kapila.
Re: pg_stat_get_backend_subxact() and backend IDs?
On Tue, Aug 29, 2023 at 07:01:51PM -0700, Nathan Bossart wrote:
> On Wed, Aug 30, 2023 at 08:22:27AM +0900, Michael Paquier wrote:
>> +extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid);
>>
>> Still I would to a bit more of s/beid/id/ for cases where the code
>> refers to an index ID, and not a backend ID, especially for the
>> internal routines.
>
> Makes sense. I did this in v4.

Yep, that looks more consistent, at quick glance.
--
Michael
Re: Synchronizing slots from primary to standby
On Fri, Aug 25, 2023 at 11:09 AM shveta malik wrote: > > On Wed, Aug 23, 2023 at 4:21 PM Dilip Kumar wrote: > > > > On Wed, Aug 23, 2023 at 3:38 PM shveta malik wrote: > > > > > I have reviewed the v12-0002 patch and I have some comments. I see the > > latest version posted sometime back and if any of this comment is > > already fixed in this version then feel free to ignore that. > > > > Thanks for the feedback. Please find my comments on a few. I will work on > rest. > > > > 2. > > ApplyLauncherShmemInit(void) > > { > > bool found; > > + bool foundSlotSync; > > > > Is there any specific reason to launch the sync worker from the > > logical launcher instead of making this independent? > > I mean in the future if we plan to sync physical slots as well then it > > wouldn't be an expandable design. > > When we started working on this, it was reusing logical-apply worker > infra, so I separated it from logical-apply worker but let it be > managed by a replication launcher considering that only logical slots > needed to be synced. I think this needs more thought and I would like > to know from others as well before concluding anything here. > > > > 5. > > > > +static bool > > +WaitForSlotSyncWorkerAttach(SlotSyncWorker *worker, > > +uint16 generation, > > +BackgroundWorkerHandle *handle) > > > > this function is an exact duplicate of WaitForReplicationWorkerAttach > > except for the LWlock, why don't we use the same function by passing > > the LWLock as a parameter > > > > Here workers (first argument) are different. We can always pass > LWLock, but since workers are different, in order to merge the common > functionality, we need to have some common worker structure between > the two workers (apply and sync-slot) and pass that to functions which > need to be merged (similar to NodeTag used in Insert/CreateStmt etc). > But changing LogicalRepWorker() would mean changing > applyworker/table-sync worker/parallel-apply-worker files. 
> Since there are only two such functions which you pointed out (attach
> and wait_for_attach), I preferred to keep the functions as is until we
> conclude on where the slot-sync worker functionality actually fits in.
> I can revisit these comments then. Or if you see any better way to do
> it, kindly let me know.
>
> > 6.
> > +/*
> > + * Attach Slot-sync worker to worker-slot assigned by launcher.
> > + */
> > +void
> > +slotsync_worker_attach(int slot)
> >
> > this is also very similar to the logicalrep_worker_attach function.
> >
> > Please check other similar functions and reuse them wherever possible
> >
> > Also, why this function is not registering the cleanup function on
> > shmmem_exit?
>
> It is doing it in ReplSlotSyncMain() since we have dsm-seg there.
> Please see this:
>
> /* Primary initialization is complete. Now, attach to our slot. */
> slotsync_worker_attach(worker_slot);
> before_shmem_exit(slotsync_worker_detach, PointerGetDatum(seg));

PFA new patch-set which attempts to fix these:

a) Synced slots on standby are not consumable, i.e.
pg_logical_slot_get/peek_changes will give an error on these, while they
will work on user-created slots.

b) User-created slots on standby will not be dropped by slot-sync
workers anymore. Earlier the slot-sync worker was dropping all the slots
which were not part of synchronize_slot_names.

c) Now DSA is being used for dbids to facilitate memory extension if
required without needing to restart the worker. Earlier dsm was used
alone, which needed a restart of the worker in case the allocated memory
needed to be extended. Changes are in patch 0002.

Next in pipeline:

1. Handling of corner scenarios which I explained in:
https://www.postgresql.org/message-id/CAJpy0uC%2B2agRtF3H%3Dn-hW5JkoPfaZkjPXJr%3D%3Dy3_PRE04dQvhw%40mail.gmail.com

2. Revisiting comments (older ones in this thread and latest given) for
patch 1.

thanks
Shveta

v14-0001-Allow-logical-walsenders-to-wait-for-physical-st.patch
v14-0002-Add-logical-slot-sync-capability-to-physical-sta.patch
Re: persist logical slots to disk during shutdown checkpoint
Hi, On Tue, Aug 29, 2023 at 02:21:15PM +0530, Amit Kapila wrote: > On Tue, Aug 29, 2023 at 10:16 AM vignesh C wrote: > > > > That makes sense. The attached v6 version has the changes for the > > same, apart from this I have also fixed a) pgindent issues b) perltidy > > issues c) one variable change (flush_lsn_changed to > > confirmed_flush_has_changed) d) corrected few comments in the test > > file. Thanks to Peter for providing few offline comments. > > > > The latest version looks good to me. Julien, Ashutosh, and others, > unless you have more comments or suggestions, I would like to push > this in a day or two. Unfortunately I'm currently swamped with some internal escalations so I couldn't keep up closely with the latest activity here. I think I recall that you wanted to change the timing at which logical slots are shutdown, I'm assuming that this change won't lead to always have a difference between the LSN and latest persisted LSN being different? Otherwise saving the latest persisted LSN to try to avoid persisting again all logical slots on shutdown seems reasonable to me.
Re: Debian 12 gcc warning
Bruce Momjian writes: > On Tue, Aug 29, 2023 at 10:18:36AM -0400, Tom Lane wrote: >> That seems like a pretty clear compiler bug, particularly since it just >> appears in this one version. Rather than contorting our code, I'd >> suggest filing a gcc bug. > I assume I have to create a test case to report this to the gcc team. I > tried to create such a test case on gcc 12 but it doesn't generate the > warning. Attached is my attempt. Any ideas? I assume we can't just > tell them to download our software and compile it. IIRC, they'll accept preprocessed compiler input for the specific file; you don't need to provide a complete source tree. Per https://gcc.gnu.org/bugs/ Please include all of the following items, the first three of which can be obtained from the output of gcc -v: the exact version of GCC; the system type; the options given when GCC was configured/built; the complete command line that triggers the bug; the compiler output (error messages, warnings, etc.); and the preprocessed file (*.i*) that triggers the bug, generated by adding -save-temps to the complete compilation command, or, in the case of a bug report for the GNAT front end, a complete set of source files (see below). Obviously, if you can trim the input it's good, but it doesn't have to be a minimal reproducer. regards, tom lane
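[Editor's note] The collection steps Tom lists can be sketched as a shell session. Here repro.c is a hypothetical stand-in for the real source file that triggers the warning, and the compile flags are illustrative, not the actual build flags:

```shell
# A tiny stand-in source file; the real report would use the file
# that produces the spurious warning.
cat > repro.c <<'EOF'
#include <stdio.h>
int main(void) { printf("hello\n"); return 0; }
EOF

# 1. Record the exact compiler version, system type, and configure
#    options; gcc -v prints these to stderr.
gcc -v 2> gcc-version.txt

# 2. Re-run the failing compilation with -save-temps added; this keeps
#    the preprocessed file (repro.i) next to the source.
gcc -Wall -O2 -c -save-temps repro.c

# 3. Attach repro.i (plus the full command line and the warning output)
#    to the bug report; trimming it to a minimal reproducer is optional.
ls repro.i
```

The key point is that the preprocessed *.i file lets the GCC developers rebuild the translation unit without needing the project's headers or source tree.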
Re: [PoC] pg_upgrade: allow to upgrade publisher node
Here are some minor review comments for patch v28-0002

==
src/sgml/ref/pgupgrade.sgml

1.
- with the primary.) Replication slots are not copied and must
- be recreated.
+ with the primary.) Replication slots on old standby are not copied.
+ Only logical slots on the primary are migrated to the new standby,
+ and other slots must be recreated.

/on old standby/on the old standby/

==
src/bin/pg_upgrade/info.c

2. get_old_cluster_logical_slot_infos

+void
+get_old_cluster_logical_slot_infos(void)
+{
+	int			dbnum;
+
+	/* Logical slots can be migrated since PG17. */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+		return;
+
+	pg_log(PG_VERBOSE, "\nsource databases:");
+
+	for (dbnum = 0; dbnum < old_cluster.dbarr.ndbs; dbnum++)
+	{
+		DbInfo	   *pDbInfo = &old_cluster.dbarr.dbs[dbnum];
+
+		get_old_cluster_logical_slot_infos_per_db(pDbInfo);
+
+		if (log_opts.verbose)
+		{
+			pg_log(PG_VERBOSE, "Database: \"%s\"", pDbInfo->db_name);
+			print_slot_infos(&pDbInfo->slot_arr);
+		}
+	}
+}

It might be worth putting an Assert before calling the
get_old_cluster_logical_slot_infos_per_db(...) just as a sanity check:

Assert(pDbInfo->slot_arr.nslots == 0);

This also helps to better document the "Note" of the
count_old_cluster_logical_slots() function comment.

~~~

3. count_old_cluster_logical_slots

+/*
+ * count_old_cluster_logical_slots()
+ *
+ * Sum up and return the number of logical replication slots for all databases.
+ *
+ * Note: this function always returns 0 if the old_cluster is PG16 and prior
+ * because old_cluster.dbarr.dbs[dbnum].slot_arr is set only for PG17 and
+ * later.
+ */
+int
+count_old_cluster_logical_slots(void)

Maybe that "Note" should be expanded a bit to say who does this:

SUGGESTION
Note: This function always returns 0 if the old_cluster is PG16 and
prior because old_cluster.dbarr.dbs[dbnum].slot_arr is set only for PG17
and later. See where get_old_cluster_logical_slot_infos_per_db() is
called.

==
src/bin/pg_upgrade/pg_upgrade.c

4.

+	/*
+	 * Logical replication slot upgrade only supported for old_cluster >=
+	 * PG17.
+	 *
+	 * Note: This must be done after doing the pg_resetwal command because
+	 * pg_resetwal would remove required WALs.
+	 */
+	if (count_old_cluster_logical_slots())
+	{
+		start_postmaster(&new_cluster, true);
+		create_logical_replication_slots();
+		stop_postmaster(false);
+	}
+

4a.
I felt this comment needs a bit more detail otherwise you can't tell how
the >= PG17 version check works.

4b.
/slot upgrade only supported/slot upgrade is only supported/

~

SUGGESTION
Logical replication slot upgrade is only supported for old_cluster >=
PG17. An explicit version check is not necessary here because function
count_old_cluster_logical_slots() will always return 0 for old_cluster
<= PG16.

--
Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Missing comments/docs about custom scan path
On Tue, Aug 29, 2023 at 5:08 PM Etsuro Fujita wrote: > Another thing I would like to propose is minor adjustments to the docs > related to parallel query: > > A custom scan provider will typically add paths for a base relation by > setting the following hook, which is called after the core code has > generated all the access paths it can for the relation (except for > Gather paths, which are made after this call so that they can use > partial paths added by the hook): > > For clarity, I think "except for Gather paths" should be "except for > Gather and Gather Merge paths". > > Although this hook function can be used to examine, modify, or remove > paths generated by the core system, a custom scan provider will > typically confine itself to generating CustomPath objects and adding > them to rel using add_path. > > For clarity, I think "adding them to rel using add_path" should be eg, > "adding them to rel using add_path, or using add_partial_path if they > are partial paths". +1. I can see that this change makes the doc more consistent with the comments in set_rel_pathlist. Thanks Richard
Re: pg_stat_get_backend_subxact() and backend IDs?
On Wed, Aug 30, 2023 at 08:22:27AM +0900, Michael Paquier wrote: > On Tue, Aug 29, 2023 at 09:46:55AM -0700, Nathan Bossart wrote: >> This was my first reaction [0]. I was concerned about renaming the >> exported functions so close to release, so I was suggesting that we hold >> off on that part until v17. If there isn't a concern with renaming these >> functions in v16, I can proceed with something more like v2. > > Thanks for the pointer. This version is much better IMO, because it > removes entirely the source of the confusion between the difference in > backend ID and index ID treatments when fetching the local entries in > the array. So I'm okay to rename these functions now, before .0 is > released to get things in a better shape while addressing the issue > reported. Okay. > +extern LocalPgBackendStatus *pgstat_get_local_beentry_by_index(int beid); > > Still I would to a bit more of s/beid/id/ for cases where the code > refers to an index ID, and not a backend ID, especially for the > internal routines. Makes sense. I did this in v4. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 3e360c1b970526079528159fe1c49be03d0b13b5 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Thu, 24 Aug 2023 07:42:54 -0700 Subject: [PATCH v4 1/2] rename some pgstat functions --- src/backend/utils/activity/backend_status.c | 28 src/backend/utils/adt/pgstatfuncs.c | 36 ++--- src/include/utils/backend_status.h | 4 +-- 3 files changed, 34 insertions(+), 34 deletions(-) diff --git a/src/backend/utils/activity/backend_status.c b/src/backend/utils/activity/backend_status.c index 38f91a495b..a4860b10fc 100644 --- a/src/backend/utils/activity/backend_status.c +++ b/src/backend/utils/activity/backend_status.c @@ -849,8 +849,8 @@ pgstat_read_current_status(void) /* * The BackendStatusArray index is exactly the BackendId of the * source backend. Note that this means localBackendStatusTable - * is in order by backend_id. pgstat_fetch_stat_beentry() depends - * on that. 
+ * is in order by backend_id. pgstat_get_beentry_by_backend_id() + * depends on that. */ localentry->backend_id = i; BackendIdGetTransactionIds(i, @@ -1073,21 +1073,21 @@ cmp_lbestatus(const void *a, const void *b) } /* -- - * pgstat_fetch_stat_beentry() - + * pgstat_get_beentry_by_backend_id() - * * Support function for the SQL-callable pgstat* functions. Returns * our local copy of the current-activity entry for one backend, * or NULL if the given beid doesn't identify any known session. * * The beid argument is the BackendId of the desired session - * (note that this is unlike pgstat_fetch_stat_local_beentry()). + * (note that this is unlike pgstat_get_local_beentry_by_index()). * * NB: caller is responsible for a check if the user is permitted to see * this info (especially the querystring). * -- */ PgBackendStatus * -pgstat_fetch_stat_beentry(BackendId beid) +pgstat_get_beentry_by_backend_id(BackendId beid) { LocalPgBackendStatus key; LocalPgBackendStatus *ret; @@ -,13 +,13 @@ pgstat_fetch_stat_beentry(BackendId beid) /* -- - * pgstat_fetch_stat_local_beentry() - + * pgstat_get_local_beentry_by_index() - * - * Like pgstat_fetch_stat_beentry() but with locally computed additions (like - * xid and xmin values of the backend) + * Like pgstat_get_beentry_by_backend_id() but with locally computed additions + * (like xid and xmin values of the backend) * - * The beid argument is a 1-based index in the localBackendStatusTable - * (note that this is unlike pgstat_fetch_stat_beentry()). + * The idx argument is a 1-based index in the localBackendStatusTable + * (note that this is unlike pgstat_get_beentry_by_backend_id()). * Returns NULL if the argument is out of range (no current caller does that). 
* * NB: caller is responsible for a check if the user is permitted to see @@ -1125,14 +1125,14 @@ pgstat_fetch_stat_beentry(BackendId beid) * -- */ LocalPgBackendStatus * -pgstat_fetch_stat_local_beentry(int beid) +pgstat_get_local_beentry_by_index(int idx) { pgstat_read_current_status(); - if (beid < 1 || beid > localNumBackends) + if (idx < 1 || idx > localNumBackends) return NULL; - return [beid - 1]; + return [idx - 1]; } @@ -1141,7 +1141,7 @@ pgstat_fetch_stat_local_beentry(int beid) * * Support function for the SQL-callable pgstat* functions. Returns * the number of sessions known in the localBackendStatusTable, i.e. - * the maximum 1-based index to pass to pgstat_fetch_stat_local_beentry(). + * the maximum 1-based index to pass to pgstat_get_local_beentry_by_index(). * -- */ int diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c index 2b9742ad21..49cc887b97 100644 --- a/src/backend/utils/adt/pgstatfuncs.c +++ b/src/backend/utils/adt/pgstatfuncs.c @@ -211,7 +211,7 @@
Re: pgbench: allow to exit immediately when any client is aborted
On Wed, 30 Aug 2023 10:11:10 +0900 (JST) Tatsuo Ishii wrote: > Yugo, > Fabien, > > >>> I start to think this behavior is ok and consistent with previous > >>> behavior of pgbench because serialization (and dealock) errors have > >>> been treated specially from other types of errors, such as accessing > >>> non existing tables. However, I suggest to add more sentences to the > >>> explanation of this option so that users are not confused by this. > >>> > >>> + > >>> + --exit-on-abort > >>> + > >>> + > >>> +Exit immediately when any client is aborted due to some error. > >>> Without > >>> +this option, even when a client is aborted, other clients could > >>> continue > >>> +their run as specified by -t or > >>> -T option, > >>> +and pgbench will print an incomplete > >>> results > >>> +in this case. > >>> + > >>> + > >>> + > >>> + > >>> > >>> What about inserting "Note that serialization failures or deadlock > >>> failures will not abort client. See >>> linkend="failures-and-retries"/> for more information." into the end > >>> of this paragraph? > >> > >> --exit-on-abort is related to "abort" of a client instead of error or > >> failure itself, so rather I wonder a bit that mentioning > >> serialization/deadlock > >> failures might be confusing. However, if users may think of such failures > >> from > >> "abort", it could be beneficial to add the sentences with some > >> modification as > >> below. > > > > I myself confused by this and believe that adding extra paragraph is > > beneficial to users. > > > >> "Note that serialization failures or deadlock failures does not abort the > >> client, so they are not affected by this option. > >> See for more information." > > > > "does not" --> "do not". > > > >>> BTW, I think: > >>> Exit immediately when any client is aborted due to some error. > >>> Without > >>> > >>> should be: > >>> Exit immediately when any client is aborted due to some errors. > >>> Without > >>> > >>> (error vs. 
erros)
> >>
> >> Well, I chose "some" to mean "unknown or unspecified", not "an
> >> unspecified amount or number of", so singular form "error" is used.
> >
> > Ok.
> >
> >> Instead, should we use "due to a error"?
> >
> > I don't think so.
> >
> >>> Also:
> >>> + --exit-on-abort is specified . Otherwise in
> >>> the worst
> >>>
> >>> There is an extra space between "specified" and ".".
> >>
> >> Fixed.
> >>
> >> Also, I fixed the place of the description in the documentation
> >> to alphabetical order
> >>
> >> Attached is the updated patch.
> >
> > Looks good to me. If there's no objection, I will commit this next week.
>
> I have pushed the patch. Thank you for the contributions!

Thank you!

Regards,
Yugo Nagata

> Best regards,
> --
> Tatsuo Ishii
> SRA OSS LLC
> English: http://www.sraoss.co.jp/index_en/
> Japanese:http://www.sraoss.co.jp

--
Yugo NAGATA
Re: Fix shadow warnings in logical replication code
On Wed, Aug 30, 2023 at 8:49 AM Michael Paquier wrote: > There is much more going on with -Wshadow, but let's do things > incrementally, case by case. Yeah, IIRC the source tree currently is able to be built without any shadow-related warnings with -Wshadow=compatible-local. But with -Wshadow or -Wshadow=local, you can still see a lot of warnings. Thanks Richard
A proposal to revise \watch help message
Recently \watch got the following help message.

> \watch [[i=]SEC] [c=N] [m=MIN]
>   execute query every SEC seconds, up to N times
>   stop if less than MIN rows are returned

The "m=MIN" can be a bit misleading. It may look like it's about
interval or counts, but it actually refers to the row number, which is
not spelled out in the line.

Would it make sense to change it to MINROWS? There's enough room in
the line for that change and the doc already uses min_rows.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 38c165a627..8c8616205d 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -200,9 +200,9 @@ slashUsage(unsigned short int pager)
 	HELP0(" \\gset [PREFIX] execute query and store result in psql variables\n");
 	HELP0(" \\gx [(OPTIONS)] [FILE] as \\g, but forces expanded output mode\n");
 	HELP0(" \\q quit psql\n");
-	HELP0(" \\watch [[i=]SEC] [c=N] [m=MIN]\n");
+	HELP0(" \\watch [[i=]SEC] [c=N] [m=MINROWS]\n");
 	HELP0(" execute query every SEC seconds, up to N times\n");
-	HELP0(" stop if less than MIN rows are returned\n");
+	HELP0(" stop if less than MINROWS rows are returned\n");
 	HELP0("\n");
 	HELP0("Help\n");
Re: Wrong usage of pqMsg_Close message code?
On Wed, Aug 30, 2023 at 07:56:33AM +0900, Michael Paquier wrote: > On Tue, Aug 29, 2023 at 02:11:06PM -0700, Nathan Bossart wrote: >> I plan to commit the attached patch shortly. > > WFM. Committed. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: A proposal to revise \watch help message
At Wed, 30 Aug 2023 10:21:26 +0900 (JST), Kyotaro Horiguchi wrote in
> Recently \watch got the following help message.
>
> > \watch [[i=]SEC] [c=N] [m=MIN]
> >   execute query every SEC seconds, up to N times
> >   stop if less than MIN rows are returned
>
> The "m=MIN" can be a bit misleading. It may look like it's about
> interval or counts, but it actually refers to the row number, which is
> not spelled out in the line.
>
> Would it make sense to change it to MINROWS? There's enough room in
> the line for that change and the doc already uses min_rows.

Mmm. I noticed the continuation lines are indented too much, probably
because of the backslash escape in the main line. The attached includes
the fix for that.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 38c165a627..2da79a75f5 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -200,9 +200,9 @@ slashUsage(unsigned short int pager)
 	HELP0(" \\gset [PREFIX] execute query and store result in psql variables\n");
 	HELP0(" \\gx [(OPTIONS)] [FILE] as \\g, but forces expanded output mode\n");
 	HELP0(" \\q quit psql\n");
-	HELP0(" \\watch [[i=]SEC] [c=N] [m=MIN]\n");
-	HELP0(" execute query every SEC seconds, up to N times\n");
-	HELP0(" stop if less than MIN rows are returned\n");
+	HELP0(" \\watch [[i=]SEC] [c=N] [m=MINROWS]\n");
+	HELP0(" execute query every SEC seconds, up to N times\n");
+	HELP0(" stop if less than MINROWS rows are returned\n");
 	HELP0("\n");
 	HELP0("Help\n");
Re: should frontend tools use syncfs() ?
On Wed, Aug 30, 2023 at 09:10:47AM +0900, Michael Paquier wrote: > I understand that I'm perhaps sounding pedantic about fsync_pgdata().. > But, after thinking more about it, I would still make this code fail > hard with an exit(EXIT_FAILURE) to let any C code calling directly > this routine with sync_method = DATA_DIR_SYNC_METHOD_SYNCFS know that > the build does not allow the use of this option when we don't have > HAVE_SYNCFS. parse_sync_method() offers some protection, but adding > this restriction also in the execution path is more friendly than > falling back silently to the default of flushing each file if > fsync_pgdata() is called with syncfs but the build does not support > it. At least that's more predictible. That seems fair enough. I did this in v7. I restructured fsync_pgdata() and fsync_dir_recurse() so that any new sync methods should cause compiler warnings until they are implemented. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From f5ecb3c8b5d397c94a9f8ab9a053b3a0cfd7f854 Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 21 Aug 2023 19:05:14 -0700 Subject: [PATCH v7 1/2] move PG_TEMP_FILE* macros to file_utils.h --- src/backend/backup/basebackup.c | 2 +- src/backend/postmaster/postmaster.c | 1 + src/backend/storage/file/fileset.c | 1 + src/bin/pg_checksums/pg_checksums.c | 10 -- src/bin/pg_rewind/filemap.c | 2 +- src/include/common/file_utils.h | 4 src/include/storage/fd.h| 4 7 files changed, 8 insertions(+), 16 deletions(-) diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c index 45be21131c..5d66014499 100644 --- a/src/backend/backup/basebackup.c +++ b/src/backend/backup/basebackup.c @@ -25,6 +25,7 @@ #include "commands/defrem.h" #include "common/compression.h" #include "common/file_perm.h" +#include "common/file_utils.h" #include "lib/stringinfo.h" #include "miscadmin.h" #include "nodes/pg_list.h" @@ -37,7 +38,6 @@ #include "storage/bufpage.h" #include "storage/checksum.h" #include 
"storage/dsm_impl.h" -#include "storage/fd.h" #include "storage/ipc.h" #include "storage/reinit.h" #include "utils/builtins.h" diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index d7bfb28ff3..54e9bfb8c4 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -94,6 +94,7 @@ #include "access/xlogrecovery.h" #include "catalog/pg_control.h" #include "common/file_perm.h" +#include "common/file_utils.h" #include "common/ip.h" #include "common/pg_prng.h" #include "common/string.h" diff --git a/src/backend/storage/file/fileset.c b/src/backend/storage/file/fileset.c index e9951b0269..84cae32548 100644 --- a/src/backend/storage/file/fileset.c +++ b/src/backend/storage/file/fileset.c @@ -25,6 +25,7 @@ #include "catalog/pg_tablespace.h" #include "commands/tablespace.h" +#include "common/file_utils.h" #include "common/hashfn.h" #include "miscadmin.h" #include "storage/ipc.h" diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c index 19eb67e485..9011a19b4e 100644 --- a/src/bin/pg_checksums/pg_checksums.c +++ b/src/bin/pg_checksums/pg_checksums.c @@ -52,16 +52,6 @@ typedef enum PG_MODE_ENABLE } PgChecksumMode; -/* - * Filename components. - * - * XXX: fd.h is not declared here as frontend side code is not able to - * interact with the backend-side definitions for the various fsync - * wrappers. 
- */ -#define PG_TEMP_FILES_DIR "pgsql_tmp" -#define PG_TEMP_FILE_PREFIX "pgsql_tmp" - static PgChecksumMode mode = PG_MODE_CHECK; static const char *progname; diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c index bd5c598e20..58280d9abc 100644 --- a/src/bin/pg_rewind/filemap.c +++ b/src/bin/pg_rewind/filemap.c @@ -27,12 +27,12 @@ #include #include "catalog/pg_tablespace_d.h" +#include "common/file_utils.h" #include "common/hashfn.h" #include "common/string.h" #include "datapagemap.h" #include "filemap.h" #include "pg_rewind.h" -#include "storage/fd.h" /* * Define a hash table which we can use to store information about the files diff --git a/src/include/common/file_utils.h b/src/include/common/file_utils.h index b7efa1226d..dd1532bcb0 100644 --- a/src/include/common/file_utils.h +++ b/src/include/common/file_utils.h @@ -46,4 +46,8 @@ extern ssize_t pg_pwritev_with_retry(int fd, extern ssize_t pg_pwrite_zeros(int fd, size_t size, off_t offset); +/* Filename components */ +#define PG_TEMP_FILES_DIR "pgsql_tmp" +#define PG_TEMP_FILE_PREFIX "pgsql_tmp" + #endif /* FILE_UTILS_H */ diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h index 6791a406fc..aea30c0622 100644 --- a/src/include/storage/fd.h +++ b/src/include/storage/fd.h @@ -195,8 +195,4 @@ extern int durable_unlink(const char *fname, int elevel); extern void SyncDataDirectory(void); extern int data_sync_elevel(int elevel); -/* Filename components
Re: pgbench: allow to exit immediately when any client is aborted
Yugo,
Fabien,

>>> I start to think this behavior is ok and consistent with previous
>>> behavior of pgbench because serialization (and deadlock) errors have
>>> been treated specially from other types of errors, such as accessing
>>> non existing tables. However, I suggest to add more sentences to the
>>> explanation of this option so that users are not confused by this.
>>>
>>> + --exit-on-abort
>>> +
>>> + Exit immediately when any client is aborted due to some error. Without
>>> + this option, even when a client is aborted, other clients could continue
>>> + their run as specified by -t or -T option,
>>> + and pgbench will print an incomplete results
>>> + in this case.
>>>
>>> What about inserting "Note that serialization failures or deadlock
>>> failures will not abort client. See <xref linkend="failures-and-retries"/>
>>> for more information." into the end of this paragraph?
>>
>> --exit-on-abort is related to "abort" of a client instead of error or
>> failure itself, so rather I wonder a bit that mentioning serialization/deadlock
>> failures might be confusing. However, if users may think of such failures
>> from "abort", it could be beneficial to add the sentences with some
>> modification as below.
>
> I myself was confused by this and believe that adding an extra paragraph
> is beneficial to users.
>
>> "Note that serialization failures or deadlock failures does not abort the
>> client, so they are not affected by this option.
>> See <xref linkend="failures-and-retries"/> for more information."
>
> "does not" --> "do not".
>
>>> BTW, I think:
>>> Exit immediately when any client is aborted due to some error. Without
>>> should be:
>>> Exit immediately when any client is aborted due to some errors. Without
>>> (error vs. errors)
>>
>> Well, I chose "some" to mean "unknown or unspecified", not "an unspecified
>> amount or number of", so the singular form "error" is used.
>
> Ok.
>
>> Instead, should we use "due to a error"?
>
> I don't think so.
>
>>> Also:
>>> + --exit-on-abort is specified . Otherwise in the worst
>>>
>>> There is an extra space between "specified" and ".".
>>
>> Fixed.
>>
>> Also, I fixed the place of the description in the documentation
>> to alphabetical order.
>>
>> Attached is the updated patch.
>
> Looks good to me. If there's no objection, I will commit this next week.

I have pushed the patch. Thank you for the contributions!

Best regards,
--
Tatsuo Ishii
SRA OSS LLC
English: http://www.sraoss.co.jp/index_en/
Japanese: http://www.sraoss.co.jp
Re: Remove IndexInfo.ii_OpclassOptions field
On Tue, Aug 29, 2023 at 10:51:10AM +0200, Peter Eisentraut wrote: > At a glance, however, I think my patch is (a) not related, and (b) if it > were, it would probably *help*, because the change is to not allocate any > long-lived structures that no one needs and that might get out of date. Hmm, yeah, perhaps you're right about (b) here. I have a few other high-priority items for stable branches on my board before being able to look at all this in more details, unfortunately, so feel free to ignore me if you think that this is an improvement anyway even regarding the other issue discussed. -- Michael signature.asc Description: PGP signature
Re: Fix shadow warnings in logical replication code
On Wed, Aug 30, 2023 at 09:16:38AM +1000, Peter Smith wrote:
> logicalfuncs.c:184:13: warning: declaration of ‘name’ shadows a
> previous local [-Wshadow]
>    char *name = TextDatumGetCString(datum_opts[i]);
> logicalfuncs.c:105:8: warning: shadowed declaration is here [-Wshadow]
>    Name name;

A bit confusing here, particularly as the name is reused with
ReplicationSlotAcquire() at the end of pg_logical_slot_get_changes_guts()
once again.

> reorderbuffer.c:4843:10: warning: declaration of ‘isnull’ shadows a
> previous local [-Wshadow]
>    bool isnull;
> reorderbuffer.c:4734:11: warning: shadowed declaration is here [-Wshadow]
>    bool *isnull;

Agreed as well about this one.

> walsender.c:3543:14: warning: declaration of ‘sentPtr’ shadows a
> global declaration [-Wshadow]
>    XLogRecPtr sentPtr;
> walsender.c:155:19: warning: shadowed declaration is here [-Wshadow]
>    static XLogRecPtr sentPtr = InvalidXLogRecPtr;

This one looks pretty serious to me, particularly as the static sentPtr
is used quite a bit. It is fortunate that the impact is limited to the
WAL sender stat function.

Fixing all these seems like a good thing in the long term, so OK for me.
Like all the fixes similar to this one, I don't see a need for a
backpatch based on their locality, even if sentPtr makes me a bit
nervous to keep even in stable branches.

There is much more going on with -Wshadow, but let's do things
incrementally, case by case.
--
Michael

signature.asc
Description: PGP signature
Re: Strange presentation related to inheritance in \d+
At Tue, 29 Aug 2023 19:28:28 +0200, Alvaro Herrera wrote in
> On 2023-Aug-29, Kyotaro Horiguchi wrote:
>
> > Attached is the initial version of the patch. It prevents "CREATE
> > TABLE" from executing if there is an inconsistent not-null
> > constraint. Also I noticed that "ALTER TABLE t ADD NOT NULL c NO
> > INHERIT" silently ignores the "NO INHERIT" part and fixed it.
>
> Great, thank you. I pushed it after modifying it a bit -- instead of
> throwing the error in MergeAttributes, I did it in
> AddRelationNotNullConstraints(). It seems cleaner this way, mostly
> because we already have to match these two constraints there. (I guess

I agree that it is cleaner.

> you could argue that we waste catalog-insertion work before the error is
> reported and the whole thing is aborted; but I don't think this is a
> serious problem in practice.)

Given the rarity and the speed required, I agree that early-catching is
not that crucial here. Thanks for clearing that up.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
Re: Standardize spelling of "power of two"
At Tue, 29 Aug 2023 14:39:42 +0200, Daniel Gustafsson wrote in > > On 29 Aug 2023, at 13:11, Alvaro Herrera wrote: > > > > On 2023-Aug-29, Daniel Gustafsson wrote: > > > >> Agreed. While we have numerous "power of 2" these were the only ones > >> in translated user-facing messages, so I've pushed this to master (it > >> didn't seem worth disrupting translations for 16 as we are so close to > >> wrapping it, if others disagree I can backpatch). > > > > I'd rather backpatch it. There's only 5 translations that are 100% for > > initdb.po, and they have two weeks to make the change from "2" to "two". > > I don't think this is a problem. > > Fair enough, backpatched to v16. Thank you for committing this. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
Re: should frontend tools use syncfs() ?
On Tue, Aug 29, 2023 at 08:45:59AM -0700, Nathan Bossart wrote:
> rebased

0001 looks OK, worth its own, independent, commit.

I understand that I'm perhaps sounding pedantic about fsync_pgdata()..
But, after thinking more about it, I would still make this code fail
hard with an exit(EXIT_FAILURE) to let any C code calling directly
this routine with sync_method = DATA_DIR_SYNC_METHOD_SYNCFS know that
the build does not allow the use of this option when we don't have
HAVE_SYNCFS. parse_sync_method() offers some protection, but adding
this restriction also in the execution path is more friendly than
falling back silently to the default of flushing each file if
fsync_pgdata() is called with syncfs but the build does not support
it. At least that's more predictable.

I'm fine with the doc changes.
--
Michael

signature.asc
Description: PGP signature
Re: pg_stat_get_backend_subxact() and backend IDs?
On Tue, Aug 29, 2023 at 09:46:55AM -0700, Nathan Bossart wrote:
> This was my first reaction [0]. I was concerned about renaming the
> exported functions so close to release, so I was suggesting that we hold
> off on that part until v17. If there isn't a concern with renaming these
> functions in v16, I can proceed with something more like v2.

Thanks for the pointer. This version is much better IMO, because it
removes entirely the source of the confusion between the difference in
backend ID and index ID treatments when fetching the local entries in
the array. So I'm okay to rename these functions now, before .0 is
released, to get things in a better shape while addressing the issue
reported.

+extern LocalPgBackendStatus *pgstat_fetch_stat_local_beentry_by_index(int beid);

Still I would do a bit more of s/beid/id/ for cases where the code
refers to an index ID, and not a backend ID, especially for the
internal routines.
--
Michael

signature.asc
Description: PGP signature
Fix shadow warnings in logical replication code
The -Wshadow compiler option reported 3 shadow warnings within the
logical replication files. (These are all in old code) PSA a patch to
address those.

==

logicalfuncs.c:184:13: warning: declaration of ‘name’ shadows a previous local [-Wshadow]
   char *name = TextDatumGetCString(datum_opts[i]);
logicalfuncs.c:105:8: warning: shadowed declaration is here [-Wshadow]
   Name name;

~~~

reorderbuffer.c:4843:10: warning: declaration of ‘isnull’ shadows a previous local [-Wshadow]
   bool isnull;
reorderbuffer.c:4734:11: warning: shadowed declaration is here [-Wshadow]
   bool *isnull;

~~~

walsender.c:3543:14: warning: declaration of ‘sentPtr’ shadows a global declaration [-Wshadow]
   XLogRecPtr sentPtr;
walsender.c:155:19: warning: shadowed declaration is here [-Wshadow]
   static XLogRecPtr sentPtr = InvalidXLogRecPtr;

--
Kind Regards,
Peter Smith.
Fujitsu Australia

v1-0001-Remove-shadows-found-in-logical-replication-files.patch
Description: Binary data
Re: Avoid a possible overflow (src/backend/utils/sort/logtape.c)
On Thu, Aug 24, 2023 at 07:02:40PM -0700, Peter Geoghegan wrote: > FWIW I'm pretty sure that it's impossible to run into problems here in > practice -- the minheap is allocated by palloc(), and the high > watermark number of free pages is pretty small. Even still, I agree > with your conclusion. There is really no reason to not be consistent > here. Postgres 16 RC1 is now tagged, so applied down to 13. -- Michael signature.asc Description: PGP signature
Re: Allow specifying a dbname in pg_basebackup connection string
Hi Jelte

On 29.08.23 15:55, Jelte Fennema wrote:
> Thanks for the review. I've updated the documentation to make it
> clearer (using some of your suggestions but also some others)

This patch applies and builds cleanly, and the documentation is very
clear. I tested it using the 'replication-support' branch from your
github fork:

pg_basebackup --dbname "port=6432 user=postgres dbname=foo" -D /tmp/dump1/

pgbouncer log:

2023-08-30 00:50:52.866 CEST [811770] LOG C-0x555fbd65bf40: (nodb)/postgres@unix(811776):6432 login attempt: db=foo user=postgres tls=no replication=yes

However, pgbouncer closes with a segmentation fault, so I couldn't test
the result of pg_basebackup itself - but I guess it isn't the issue here.

Other than that, everything looks good to me.

Jim
Re: Wrong usage of pqMsg_Close message code?
On Tue, Aug 29, 2023 at 02:11:06PM -0700, Nathan Bossart wrote: > I plan to commit the attached patch shortly. WFM. -- Michael signature.asc Description: PGP signature
Re: Autogenerate some wait events code and documentation
On Tue, Aug 29, 2023 at 02:21:48PM +0200, Alvaro Herrera wrote: > Yeah, I have a mild preference for keeping the prefix, but it's mild > because I also imagine that if somebody doesn't see the full symbol name > when grepping they will think to remove the prefix. So only -0.1. So, are you fine with the patch as presented? Or are there other things you'd like to see changed in the format? > I think the DOCONLY stuff should be better documented; they make no > sense without looking at the commit message for fa88928470b5. Good point. However, with 0002 in place these are gone. -- Michael signature.asc Description: PGP signature
Re: Is pg_regress --use-existing used by anyone or is it broken?
On Tue, Aug 29, 2023 at 3:37 PM Daniel Gustafsson wrote: > > It's handy when using pg_regress with a custom test suite, where I > > don't want to be nagged about disconnecting from the database every > > time. > > I'm curious about your workflow around it, it seems to me that it's kind of > broken so I wonder if we instead then should make it an equal citizen with > temp > instance? I'm confused. You seem to think that it's a problem that --use-existing doesn't create databases and roles. But that's the whole point, at least for me. I don't use --use-existing to run the standard regression tests, or anything like that. I use it to run my own custom test suite, often while relying upon the database having certain data already. Sometimes it's a nontrivial amount of data. I don't want to have to set up and tear down the data every time, since it isn't usually necessary. I usually have a relatively small and fast running read-only test suite, and a larger test suite that does indeed need to do various setup and teardown steps. It isn't possible to run the smaller test suite without having first run the larger one at least once. But this is just for me, during development. Right now, with my SAOP nbtree project, the smaller test suite takes me about 50ms to run, while the larger one takes almost 10 seconds. -- Peter Geoghegan
Re: Is pg_regress --use-existing used by anyone or is it broken?
> On 30 Aug 2023, at 00:33, Peter Geoghegan wrote:
>
> On Tue, Aug 29, 2023 at 2:53 PM Daniel Gustafsson wrote:
>> Having looked a bit more on it I have a feeling that plain removing it
>> would be the best option. Unless someone chimes in as a user of it I'll
>> propose a patch to remove it.
>
> -1. I use it.

Thanks for confirming!

> It's handy when using pg_regress with a custom test suite, where I
> don't want to be nagged about disconnecting from the database every
> time.

I'm curious about your workflow around it; it seems to me that it's kind
of broken, so I wonder whether we should instead make it an equal citizen
with the temp instance?

--
Daniel Gustafsson
Re: Is pg_regress --use-existing used by anyone or is it broken?
On Tue, Aug 29, 2023 at 2:53 PM Daniel Gustafsson wrote: > Having looked a bit more on it I have a feeling that plain removing it would > be > the best option. Unless someone chimes in as a user of it I'll propose a > patch > to remove it. -1. I use it. It's handy when using pg_regress with a custom test suite, where I don't want to be nagged about disconnecting from the database every time. -- Peter Geoghegan
Re: Is pg_regress --use-existing used by anyone or is it broken?
> On 29 Aug 2023, at 23:38, Nathan Bossart wrote: > On Mon, Aug 28, 2023 at 03:11:15PM +0200, Daniel Gustafsson wrote: >> Does anyone here use it? > > I don't think I've ever used it. AFAICT it was added with hot standby mode > (efc16ea) to support 'make standbycheck', which was removed last year > (4483b2cf). Unless someone claims to be using it, it's probably fine to > just remove it. Having looked a bit more on it I have a feeling that plain removing it would be the best option. Unless someone chimes in as a user of it I'll propose a patch to remove it. -- Daniel Gustafsson
Re: Query execution in Perl TAP tests needs work
On Wed, Aug 30, 2023 at 1:49 AM Noah Misch wrote:
> On Tue, Aug 29, 2023 at 04:25:24PM +1200, Thomas Munro wrote:
> > On Tue, Aug 29, 2023 at 1:48 PM Noah Misch wrote:
> > > https://github.com/cpan-authors/IPC-Run/issues/166#issuecomment-1288190929
> >
> > Interesting. But that shows a case with no pipes connected, using
> > select() as a dumb sleep and ignoring SIGCHLD. In our usage we have
> > pipes connected, and I think select() should return when the child's
> > output pipes become readable due to EOF. I guess something about that
> > might be b0rked on Windows? I see there is an extra helper process
> > doing socket<->pipe conversion (hah, that explains an extra ~10ms at
> > the start in my timestamps)...
>
> In that case, let's assume it's not the same issue.

Yeah, I think it amounts to the same thing, if EOF never arrives. I
suspect that we could get ->safe_psql() down to about ~25ms baseline if
someone could fix the posited IPC::Run EOF bug and change its internal
helper process to a helper thread. Even if we fix tests to reuse
backends, I expect that'd help CI measurably. (The native way would be
to use pipes directly, changing select() to WaitForMultipleObjects(),
but I suspect IPC::Run might have painted itself into a corner by
exposing the pseudo-pipes and documenting that they can be used with
select(). Oh well.)
Re: Is pg_regress --use-existing used by anyone or is it broken?
On Mon, Aug 28, 2023 at 03:11:15PM +0200, Daniel Gustafsson wrote: > When looking at pg_regress I noticed that the --use-existing support didn't > seem to work. ISTM that the removal/creation of test databases and roles > doesn't run since the conditional is reversed. There is also no support for > using a non-default socket directory with PG_REGRESS_SOCK_DIR. The attached > hack fixes these and allows the tests to execute for me, but even with that > the > test_setup suite fails due to the tablespace not being dropped and recreated > like databases and roles. > > Is it me who is too thick to get it working, or is it indeed broken? If it's > the latter, it's been like that for a long time which seems to indicate that > it > isn't really used and should probably be removed rather than fixed? > > Does anyone here use it? I don't think I've ever used it. AFAICT it was added with hot standby mode (efc16ea) to support 'make standbycheck', which was removed last year (4483b2cf). Unless someone claims to be using it, it's probably fine to just remove it. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Wrong usage of pqMsg_Close message code?
On Tue, Aug 29, 2023 at 09:15:55AM -0700, Nathan Bossart wrote:
> Thanks for the report. I'll get this fixed up. My guess is that this was
> leftover from an earlier version of the patch that used the same macro for
> identical protocol characters.

I plan to commit the attached patch shortly.
--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

>From 3a895c081fb26a2f818d07c7d28a54b94c278391 Mon Sep 17 00:00:00 2001
From: Nathan Bossart
Date: Tue, 29 Aug 2023 14:01:18 -0700
Subject: [PATCH v2 1/1] Fix misuse of PqMsg_Close.

EndCommand() and EndReplicationCommand() should use PqMsg_CommandComplete
instead. Oversight in commit f4b54e1ed9.

Reported-by: Pavel Stehule, Tatsuo Ishii
Author: Pavel Stehule
Reviewed-by: Aleksander Alekseev, Michael Paquier
---
 src/backend/tcop/dest.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/tcop/dest.c b/src/backend/tcop/dest.c
index 06d1872b9a..bc8494ee7d 100644
--- a/src/backend/tcop/dest.c
+++ b/src/backend/tcop/dest.c
@@ -176,7 +176,7 @@ EndCommand(const QueryCompletion *qc, CommandDest dest, bool force_undecorated_o
 			len = BuildQueryCompletionString(completionTag, qc,
 											 force_undecorated_output);
-			pq_putmessage(PqMsg_Close, completionTag, len + 1);
+			pq_putmessage(PqMsg_CommandComplete, completionTag, len + 1);
 		case DestNone:
 		case DestDebug:
@@ -200,7 +200,7 @@ EndCommand(const QueryCompletion *qc, CommandDest dest, bool force_undecorated_o
 void
 EndReplicationCommand(const char *commandTag)
 {
-	pq_putmessage(PqMsg_Close, commandTag, strlen(commandTag) + 1);
+	pq_putmessage(PqMsg_CommandComplete, commandTag, strlen(commandTag) + 1);
 }

 /*
--
2.25.1
Re: tablecmds.c/MergeAttributes() cleanup
On Tue, Aug 29, 2023 at 08:44:02PM +0200, Alvaro Herrera wrote: > On 2023-Aug-29, Nathan Bossart wrote: >> My compiler is complaining about 1fa9241b: >> >> ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: >> ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be >> used uninitialized in this function [-Werror=maybe-uninitialized] >> 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); >> | ^~~~ >> >> This went away when I added a default case that ERROR'd or initialized >> coldef to NULL. > > Makes sense. However, maybe we should replace those ugly defines and > their hardcoded values in DefineSequence with a proper array with their > names and datatypes. That might be an improvement, but IIUC you'd still need to enumerate all of the columns or data types to make sure you use the right get-Datum function. It doesn't help that last_value uses Int64GetDatumFast and log_cnt uses Int64GetDatum. I could be missing something, though. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: tablecmds.c/MergeAttributes() cleanup
On 2023-Aug-29, Nathan Bossart wrote: > On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote: > > I have committed a few more patches from this series that were already > > agreed upon. The remaining ones are rebased and reordered a bit, attached. > > My compiler is complaining about 1fa9241b: > > ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: > ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be > used uninitialized in this function [-Werror=maybe-uninitialized] > 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); > | ^~~~ > > This went away when I added a default case that ERROR'd or initialized > coldef to NULL. Makes sense. However, maybe we should replace those ugly defines and their hardcoded values in DefineSequence with a proper array with their names and datatypes. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Los trabajadores menos efectivos son sistematicamente llevados al lugar donde pueden hacer el menor daño posible: gerencia." (El principio Dilbert)
Re: tablecmds.c/MergeAttributes() cleanup
On Tue, Aug 29, 2023 at 10:43:39AM +0200, Peter Eisentraut wrote: > I have committed a few more patches from this series that were already > agreed upon. The remaining ones are rebased and reordered a bit, attached. My compiler is complaining about 1fa9241b: ../postgresql/src/backend/commands/sequence.c: In function ‘DefineSequence’: ../postgresql/src/backend/commands/sequence.c:196:21: error: ‘coldef’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 196 | stmt->tableElts = lappend(stmt->tableElts, coldef); | ^~~~ This went away when I added a default case that ERROR'd or initialized coldef to NULL. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
Re: Strange presentaion related to inheritance in \d+
On 2023-Aug-29, Kyotaro Horiguchi wrote:

> Attached is the initial version of the patch. It prevents "CREATE
> TABLE" from executing if there is an inconsistent not-null
> constraint. Also I noticed that "ALTER TABLE t ADD NOT NULL c NO
> INHERIT" silently ignores the "NO INHERIT" part and fixed it.

Great, thank you. I pushed it after modifying it a bit -- instead of
throwing the error in MergeAttributes, I did it in
AddRelationNotNullConstraints(). It seems cleaner this way, mostly
because we already have to match these two constraints there. (I guess
you could argue that we waste catalog-insertion work before the error is
reported and the whole thing is aborted; but I don't think this is a
serious problem in practice.)

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"En las profundidades de nuestro inconsciente hay una obsesiva necesidad
de un universo lógico y coherente. Pero el universo real se halla siempre
un paso más allá de la lógica" (Irulan)
Re: broken master regress tests
On Tue, Aug 29, 2023 at 17:54, Alvaro Herrera wrote:
> On 2023-Aug-27, Thomas Munro wrote:
>
> > On Sun, Aug 27, 2023 at 3:03 AM Pavel Stehule wrote:
> > > So it looks like IPC::Run::run ignores the parent environment
> >
> > I guess the new initdb template captures lc_messages in
> > postgresql.conf, when it runs earlier? I guess if you put
> > $node->append_conf('postgresql.conf', 'lc_messages=C'); into
> > src/bin/pg_amcheck/t/003_check.pl then it will work. I'm not sure
> > what the correct fix should be, ie if the template mechanism should
> > notice this difference and not use the template, or if tests that
> > depend on the message locale should explicitly say so with
> > lc_messages=C or similar (why is this the only one?), or ...
>
> So I tried this technique, but it gets old pretty fast: apparently
> there's a *ton* of tests that depend on the locale. I gave up after
> patching the first five files, and noticing that in a second run there
> were another half a dozen failing tests that hadn't failed the first
> time around. (Not sure why this happened.)
>
> So I think injecting --no-locale to the initdb line that creates the
> template is a better approach; something like the attached.

OK, thank you for fixing it.

Regards

Pavel

> --
> Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
Re: pg_stat_get_backend_subxact() and backend IDs?
On Mon, Aug 28, 2023 at 10:53:52AM +0900, Michael Paquier wrote:
> I understand that this is not a fantastic naming, but renaming
> pgstat_fetch_stat_local_beentry() to something like
> pgstat_fetch_stat_local_beentry_by_{index|position}_id() would make
> the difference much easier to grasp, and we should avoid the use of
> "beid" when we refer to the *position/index ID* in
> localBackendStatusTable, because it is not a BackendId at all, just a
> position in the local array.

This was my first reaction [0]. I was concerned about renaming the
exported functions so close to release, so I was suggesting that we
hold off on that part until v17. If there isn't a concern with renaming
these functions in v16, I can proceed with something more like v2.

[0] https://postgr.es/m/20230824161913.GA1394441%40nathanxps13.lan

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Re: Wrong usage of pqMsg_Close message code?
Hi everyone,

Thanks for the report. I'll get this fixed up. My guess is that this
was leftover from an earlier version of the patch that used the same
macro for identical protocol characters.

On Tue, Aug 29, 2023 at 10:01:47AM -0400, Tom Lane wrote:
> Michael Paquier writes:
> > Actually, this may be OK as well as it stands. One can also say that
> > the parallel processing is out of this scope, being used only
> > internally. I cannot help wondering whether we should put more
> > effort into documenting the parallel worker/leader protocol. That's
> > internal to the backend and out of the scope of this thread, still..
>
> Yeah. I'm not sure whether the leader/worker protocol needs better
> documentation, but the parts of it that are not common with the
> frontend protocol should NOT be documented in protocol.sgml.
> That would just confuse authors of frontend code.
>
> I don't mind having constants for the leader/worker protocol in
> protocol.h, as long as they're in a separate section that's clearly
> marked as relevant only for server-internal parallelism.

+1, I left the parallel stuff (and a couple other things) out in the
first round to avoid prolonging the naming discussion, but we can
continue to add to protocol.h.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
Re: broken master regress tests
On 2023-Aug-27, Thomas Munro wrote:

> On Sun, Aug 27, 2023 at 3:03 AM Pavel Stehule wrote:
> > So it looks like IPC::Run::run ignores the parent environment
>
> I guess the new initdb template captures lc_messages in
> postgresql.conf, when it runs earlier? I guess if you put
> $node->append_conf('postgresql.conf', 'lc_messages=C'); into
> src/bin/pg_amcheck/t/003_check.pl then it will work. I'm not sure
> what the correct fix should be, ie if the template mechanism should
> notice this difference and not use the template, or if tests that
> depend on the message locale should explicitly say so with
> lc_messages=C or similar (why is this the only one?), or ...

So I tried this technique, but it gets old pretty fast: apparently
there's a *ton* of tests that depend on the locale. I gave up after
patching the first five files, and noticing that in a second run there
were another half a dozen failing tests that hadn't failed the first
time around. (Not sure why this happened.)

So I think injecting --no-locale to the initdb line that creates the
template is a better approach; something like the attached.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/

diff --git a/meson.build b/meson.build
index 8b2b521a01..5422885b0a 100644
--- a/meson.build
+++ b/meson.build
@@ -3107,7 +3107,7 @@ sys.exit(sp.returncode)
 ''',
       test_initdb_template,
       temp_install_bindir / 'initdb',
-      '-A', 'trust', '-N', '--no-instructions'
+      '-A', 'trust', '-N', '--no-instructions', '--no-locale'
     ],
     priority: setup_tests_priority - 1,
     timeout: 300,
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 0b4ca0eb6a..765b60db02 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -425,7 +425,7 @@ ifeq ($(MAKELEVEL),0)
 	$(MAKE) -C '$(top_builddir)' DESTDIR='$(abs_top_builddir)'/tmp_install install >'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
 	$(MAKE) -j1 $(if $(CHECKPREP_TOP),-C $(CHECKPREP_TOP),) checkprep >>'$(abs_top_builddir)'/tmp_install/log/install.log 2>&1
-	$(with_temp_install) initdb -A trust -N --no-instructions '$(abs_top_builddir)'/tmp_install/initdb-template >>'$(abs_top_builddir)'/tmp_install/log/initdb-template.log 2>&1
+	$(with_temp_install) initdb -A trust -N --no-instructions --no-locale '$(abs_top_builddir)'/tmp_install/initdb-template >>'$(abs_top_builddir)'/tmp_install/log/initdb-template.log 2>&1
 endif
 endif
 endif
Re: should frontend tools use syncfs() ?
rebased -- Nathan Bossart Amazon Web Services: https://aws.amazon.com >From 1bb28cfe5d40670386ae663e14c8854dc1b5486d Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Mon, 21 Aug 2023 19:05:14 -0700 Subject: [PATCH v6 1/2] move PG_TEMP_FILE* macros to file_utils.h --- src/backend/backup/basebackup.c | 2 +- src/backend/postmaster/postmaster.c | 1 + src/backend/storage/file/fileset.c | 1 + src/bin/pg_checksums/pg_checksums.c | 10 -- src/bin/pg_rewind/filemap.c | 2 +- src/include/common/file_utils.h | 4 src/include/storage/fd.h| 4 7 files changed, 8 insertions(+), 16 deletions(-) diff --git a/src/backend/backup/basebackup.c b/src/backend/backup/basebackup.c index 45be21131c..5d66014499 100644 --- a/src/backend/backup/basebackup.c +++ b/src/backend/backup/basebackup.c @@ -25,6 +25,7 @@ #include "commands/defrem.h" #include "common/compression.h" #include "common/file_perm.h" +#include "common/file_utils.h" #include "lib/stringinfo.h" #include "miscadmin.h" #include "nodes/pg_list.h" @@ -37,7 +38,6 @@ #include "storage/bufpage.h" #include "storage/checksum.h" #include "storage/dsm_impl.h" -#include "storage/fd.h" #include "storage/ipc.h" #include "storage/reinit.h" #include "utils/builtins.h" diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index d7bfb28ff3..54e9bfb8c4 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -94,6 +94,7 @@ #include "access/xlogrecovery.h" #include "catalog/pg_control.h" #include "common/file_perm.h" +#include "common/file_utils.h" #include "common/ip.h" #include "common/pg_prng.h" #include "common/string.h" diff --git a/src/backend/storage/file/fileset.c b/src/backend/storage/file/fileset.c index e9951b0269..84cae32548 100644 --- a/src/backend/storage/file/fileset.c +++ b/src/backend/storage/file/fileset.c @@ -25,6 +25,7 @@ #include "catalog/pg_tablespace.h" #include "commands/tablespace.h" +#include "common/file_utils.h" #include "common/hashfn.h" #include 
"miscadmin.h" #include "storage/ipc.h" diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c index 19eb67e485..9011a19b4e 100644 --- a/src/bin/pg_checksums/pg_checksums.c +++ b/src/bin/pg_checksums/pg_checksums.c @@ -52,16 +52,6 @@ typedef enum PG_MODE_ENABLE } PgChecksumMode; -/* - * Filename components. - * - * XXX: fd.h is not declared here as frontend side code is not able to - * interact with the backend-side definitions for the various fsync - * wrappers. - */ -#define PG_TEMP_FILES_DIR "pgsql_tmp" -#define PG_TEMP_FILE_PREFIX "pgsql_tmp" - static PgChecksumMode mode = PG_MODE_CHECK; static const char *progname; diff --git a/src/bin/pg_rewind/filemap.c b/src/bin/pg_rewind/filemap.c index bd5c598e20..58280d9abc 100644 --- a/src/bin/pg_rewind/filemap.c +++ b/src/bin/pg_rewind/filemap.c @@ -27,12 +27,12 @@ #include #include "catalog/pg_tablespace_d.h" +#include "common/file_utils.h" #include "common/hashfn.h" #include "common/string.h" #include "datapagemap.h" #include "filemap.h" #include "pg_rewind.h" -#include "storage/fd.h" /* * Define a hash table which we can use to store information about the files diff --git a/src/include/common/file_utils.h b/src/include/common/file_utils.h index b7efa1226d..dd1532bcb0 100644 --- a/src/include/common/file_utils.h +++ b/src/include/common/file_utils.h @@ -46,4 +46,8 @@ extern ssize_t pg_pwritev_with_retry(int fd, extern ssize_t pg_pwrite_zeros(int fd, size_t size, off_t offset); +/* Filename components */ +#define PG_TEMP_FILES_DIR "pgsql_tmp" +#define PG_TEMP_FILE_PREFIX "pgsql_tmp" + #endif /* FILE_UTILS_H */ diff --git a/src/include/storage/fd.h b/src/include/storage/fd.h index 6791a406fc..aea30c0622 100644 --- a/src/include/storage/fd.h +++ b/src/include/storage/fd.h @@ -195,8 +195,4 @@ extern int durable_unlink(const char *fname, int elevel); extern void SyncDataDirectory(void); extern int data_sync_elevel(int elevel); -/* Filename components */ -#define PG_TEMP_FILES_DIR "pgsql_tmp" 
-#define PG_TEMP_FILE_PREFIX "pgsql_tmp" - #endif /* FD_H */ -- 2.25.1 >From 81e87db711bb90f4641b41b4f863f1f5064eb05d Mon Sep 17 00:00:00 2001 From: Nathan Bossart Date: Fri, 28 Jul 2023 15:56:24 -0700 Subject: [PATCH v6 2/2] allow syncfs in frontend utilities --- doc/src/sgml/config.sgml | 12 +--- doc/src/sgml/filelist.sgml| 1 + doc/src/sgml/postgres.sgml| 1 + doc/src/sgml/ref/initdb.sgml | 22 +++ doc/src/sgml/ref/pg_basebackup.sgml | 25 doc/src/sgml/ref/pg_checksums.sgml| 22 +++ doc/src/sgml/ref/pg_dump.sgml | 21 +++ doc/src/sgml/ref/pg_rewind.sgml | 22 +++ doc/src/sgml/ref/pgupgrade.sgml | 23 +++ doc/src/sgml/syncfs.sgml | 36 +++ src/backend/storage/file/fd.c | 4 +- src/backend/utils/misc/guc_tables.c
Re: Eager page freeze criteria clarification
On Mon, Aug 28, 2023 at 4:30 PM Melanie Plageman wrote:

> By low-velocity, do you mean lower overall TPS? In that case, wouldn't
> you be less likely to run into xid wraparound and thus need less
> aggressive opportunistic freezing?

Yes. But it also means that we've got slack capacity to do extra work
now without really hurting anything. If we can leverage that capacity
to reduce the pain of future bulk operations, that seems good. When
resources are tight, doing speculative work immediately becomes less
appealing. It's pretty hard to take such things into account, though.
I was just mentioning it.

> So, this is where the caveat about absolute number of page freezes
> matters. In algorithm A, master only did 57 page freezes (spread
> across the various pgbench tables). At the end of the run, 2 pages
> were still frozen.

I'm increasingly feeling that it's hard to make sense of the ratio.
Maybe show the number of pages frozen, the number that are frozen at
the end, and the number of pages in the database at the end of the run
as three separate values.

> (1) seems bad to me because it doesn't consider whether or not
> freezing will be useful -- only if it will be cheap. It froze very
> little of the cold data in a workload where a small percentage of it
> was being modified (especially workloads A, C, H). And it froze a lot
> of data in workloads where it was being uniformly modified (workload
> B).

Sure, but if the cost of being wrong is low, you can be wrong a lot
and still be pretty happy. It's unclear to me to what extent we should
gamble on making only inexpensive mistakes and to what extent we
should gamble on making only infrequent mistakes, but they're both
valid strategies.

> I suggested (4) and (5) because I think the "older than 33%" threshold
> is better than the "older than 10%" threshold. I chose both because I
> am still unclear on our values. Are we willing to freeze more
> aggressively at the expense of emitting more FPIs? As long as it
> doesn't affect throughput? For pretty much all of these workloads, the
> algorithms which froze based on page modification recency OR FPI
> required emitted many more FPIs than those which froze based only on
> page modification recency.

Let's assume for a moment that the rate at which the insert LSN is
advancing is roughly constant over time, so that it serves as a good
proxy for wall-clock time. Consider four tables A, B, C, and D that
are, respectively, vacuumed once per minute, once per hour, once per
day, and once per week. With a 33% threshold, pages in table A will be
frozen if they haven't been modified in 20 seconds, pages in table B
will be frozen if they haven't been modified in 20 minutes, pages in
table C will be frozen if they haven't been modified in 8 hours, and
pages in table D will be frozen if they haven't been modified in 2
days, 8 hours. My intuition is that this feels awfully aggressive for
A and awfully passive for D.

To expand on that: apparently D doesn't actually get much write
activity, else there would be more vacuuming happening. So it's very
likely that pages in table D are going to get checkpointed and evicted
before they get modified again. Freezing them therefore seems like a
good bet: it's a lot cheaper to freeze those pages when they're
already in shared_buffers and dirty than it is if we have to read and
write them specifically for freezing.

It's less obvious that what we're doing in table A is wrong, and it
could be exactly right, but workloads where a row is modified, a human
thinks about something (e.g. whether to complete the proposed
purchase), and then the same row is modified again are not uncommon,
and human thinking times can easily exceed 20 seconds. On the other
hand, workloads where a row is modified, a computer thinks about
something, and then the row is modified again are also quite common,
and computer thinking times can easily be less than 20 seconds. It
feels like a toss-up whether we get it right.
For this kind of table, I suspect we'd be happier freezing pages that are about to be evicted or about to be written as part of a checkpoint rather than freezing pages opportunistically in vacuum. Maybe that's something we need to think harder about. If we froze dirty pages that wouldn't need a new FPI just before evicting them, and just before they would be written out for a checkpoint, under what circumstances would we still want vacuum to opportunistically freeze? I think the answer might be "only when it's very cheap." If it's very cheap to freeze now, it's appealing to gamble on doing it before a checkpoint comes along and makes the same operation require an extra FPI. But if it isn't too cheap, then why not just wait and see what happens? If the buffer gets modified again before it gets written out, then freeing immediately is a waste and freeze-on-evict is better. If it doesn't, freezing immediately and freeze-on-evict are the same price. I'm hand-waving a bit here because maybe freeze-on-evict
Re: Debian 12 gcc warning
Bruce Momjian writes:
> On Tue, Aug 29, 2023 at 10:26:27AM +0700, John Naylor wrote:
> > It looks like the former, since I can silence it on gcc 13 / -O1 by
> > doing:
> > /* keep compiler quiet */
> > actual_arg_types[0] = InvalidOid;

> Agreed, that fixes it for me too. In fact, assigning to only element
> 99 or 200 also prevents the warning, and considering the array is
> defined for 100 elements, the fact it accepts 200 isn't a good thing.
> Patch attached.

That seems like a pretty clear compiler bug, particularly since it
just appears in this one version. Rather than contorting our code,
I'd suggest filing a gcc bug.

regards, tom lane
Re: Wrong usage of pqMsg_Close message code?
Michael Paquier writes:
> Actually, this may be OK as well as it stands. One can also say that
> the parallel processing is out of this scope, being used only
> internally. I cannot help wondering whether we should put more effort
> into documenting the parallel worker/leader protocol. That's internal
> to the backend and out of the scope of this thread, still..

Yeah. I'm not sure whether the leader/worker protocol needs better
documentation, but the parts of it that are not common with the
frontend protocol should NOT be documented in protocol.sgml. That
would just confuse authors of frontend code.

I don't mind having constants for the leader/worker protocol in
protocol.h, as long as they're in a separate section that's clearly
marked as relevant only for server-internal parallelism.

regards, tom lane
Re: Allow specifying a dbname in pg_basebackup connection string
On Mon, 28 Aug 2023 at 23:50, Tristen Raab wrote:

> I've reviewed your patch and it applies and builds without error. When
> testing this patch I was slightly confused as to what its purpose was;
> after testing it I now understand. Initially, I thought this was a
> change to add database-level replication. I would suggest some
> clarifications to the documentation, such as changing:

Thanks for the review. I've updated the documentation to make it
clearer (using some of your suggestions but also some others).

v3-0001-Allow-specifying-a-dbname-in-pg_basebackup-connec.patch
Description: Binary data
Re: Query execution in Perl TAP tests needs work
On Tue, Aug 29, 2023 at 04:25:24PM +1200, Thomas Munro wrote:
> On Tue, Aug 29, 2023 at 1:48 PM Noah Misch wrote:
> > https://github.com/cpan-authors/IPC-Run/issues/166#issuecomment-1288190929
>
> Interesting. But that shows a case with no pipes connected, using
> select() as a dumb sleep and ignoring SIGCHLD. In our usage we have
> pipes connected, and I think select() should return when the child's
> output pipes become readable due to EOF. I guess something about that
> might be b0rked on Windows? I see there is an extra helper process
> doing socket<->pipe conversion (hah, that explains an extra ~10ms at
> the start in my timestamps)...

In that case, let's assume it's not the same issue.
Re: A failure in 031_recovery_conflict.pl on Debian/s390x
Re: Thomas Munro
> 2022), and then backpatched to all releases. They were disabled again
> in release branches 10-14 (discussion at
> https://postgr.es/m/3447060.1652032...@sss.pgh.pa.us):
>
> +plan skip_all => "disabled until after minor releases, due to instability";

Right:

https://pgdgbuild.dus.dg-i.net/view/Snapshot/job/postgresql-14-binaries-snapshot/366/architecture=s390x,distribution=sid/consoleText
-> t/031_recovery_conflict.pl ... skipped: disabled until after minor releases, due to instability

> Now your mainframe build bot is regularly failing for whatever timing
> reason, in 16 and master. That's quite useful because your tests have
> made us or at least me a lot more confident that the fix is good (one
> of the reasons progress has been slow is that failures in CI and BF
> were pretty rare and hard to analyse). But... I wonder why it isn't
> failing for you in 15? Are you able to check if it ever has? I
> suppose we could go and do the "disabled due to instability" thing in
> 15 and 16, and then later we'll un-disable it in 16 when we back-patch
> the fix into it.

The current explanation is that builds are only running frequently for
16 and devel, and the periodic extra test jobs only run "make check"
and the postgresql-common testsuite, not the full server testsuite.

I had "snapshot" jobs configured for basically everything (server and
extensions) for some time, but ultimately realized that I don't have
the slightest time to look at everything, so I disabled them again
about 3 weeks ago. This removed enough noise so I could actually look
at the failing 16 and devel jobs again.

I don't see any failures of that kind for the 15 snapshot build while
it was still running, the failures in there are all something else
(mostly problems outside PG).
https://pgdgbuild.dus.dg-i.net/view/Snapshot/job/postgresql-15-binaries-snapshot/

I'll enable the server snapshot builds again, those shouldn't be too
noisy.

Christoph
Re: logical_replication_mode
On 27.08.23 14:05, Zhijie Hou (Fujitsu) wrote:
> On Friday, August 25, 2023 5:56 PM Amit Kapila wrote:
> > On Fri, Aug 25, 2023 at 12:38 PM Peter Eisentraut wrote:
> > > On 25.08.23 08:52, Zhijie Hou (Fujitsu) wrote:
> > > > On Friday, August 25, 2023 12:28 PM Amit Kapila wrote:
> > > > > On Thu, Aug 24, 2023 at 12:45 PM Peter Eisentraut wrote:
> > > > > > I suggest we rename this setting to something starting with
> > > > > > debug_. Right now, the name looks much too tempting for users
> > > > > > to fiddle with. I think this is similar to force_parallel_mode.
> > > > >
> > > > > +1. How about debug_logical_replication?
> > > > >
> > > > > > Also, the descriptions in guc_tables.c could be improved. For
> > > > > > example, gettext_noop("Controls when to replicate or apply
> > > > > > each change."), is pretty content-free and unhelpful.
> > > > >
> > > > > The other possibility I could think of is to change short_desc
> > > > > as: "Allows to replicate each change for large transactions.".
> > > > > Do you have any better ideas?
> > > >
> > > > How about "Forces immediate streaming or serialization of changes
> > > > in large transactions.", which is similar to the description in
> > > > the document.
> > > >
> > > > I agree that renaming it to debug_xx would be better and here is
> > > > a patch that tries to do this.
> > >
> > > Maybe debug_logical_replication is too general? Something like
> > > debug_logical_replication_streaming would be more concrete.
> >
> > Yeah, that sounds better.
>
> OK, here is the debug_logical_replication_streaming version.

committed
Re: Debian 12 gcc warning
On Tue, Aug 29, 2023 at 10:26:27AM +0700, John Naylor wrote:
> On Tue, Aug 29, 2023 at 6:56 AM David Rowley wrote:
> >
> > I'm just not sure if it's unable to figure out if at least nargs
> > elements is set or if it won't be happy until all 100 elements are
> > set.
>
> It looks like the former, since I can silence it on gcc 13 / -O1 by doing:
>
> /* keep compiler quiet */
> actual_arg_types[0] = InvalidOid;

Agreed, that fixes it for me too. In fact, assigning to only element 99
or 200 also prevents the warning, and considering the array is defined
for 100 elements, the fact it accepts 200 isn't a good thing. Patch
attached.

I think the question is whether we add this to silence a common
compiler at a non-default optimization level. It is the only such case
in our source code right now.

--
Bruce Momjian https://momjian.us
EDB https://enterprisedb.com

Only you can decide what is important to you.

diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index da258968b8..f4a1d1049c 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -4284,6 +4284,10 @@ recheck_cast_function_args(List *args, Oid result_type,
 	if (list_length(args) > FUNC_MAX_ARGS)
 		elog(ERROR, "too many function arguments");
 	nargs = 0;
+
+	/* Silence gcc 12 compiler at -O1. */
+	actual_arg_types[0] = InvalidOid;
+
 	foreach(lc, args)
 	{
 		actual_arg_types[nargs++] = exprType((Node *) lfirst(lc));
Re: Eliminate redundant tuple visibility check in vacuum
Hi David,

Thanks for taking a look!

On Tue, Aug 29, 2023 at 5:07 AM David Geier wrote:
> Hi Melanie,
>
> On 8/29/23 01:49, Melanie Plageman wrote:
> > While working on a set of patches to combine the freeze and
> > visibility map WAL records into the prune record, I wrote the
> > attached patches reusing the tuple visibility information collected
> > in heap_page_prune() back in lazy_scan_prune().
> >
> > heap_page_prune() collects the HTSV_Result for every tuple on a page
> > and saves it in an array used by heap_prune_chain(). If we make that
> > array available to lazy_scan_prune(), it can use it when collecting
> > stats for vacuum and determining whether or not to freeze tuples.
> > This avoids calling HeapTupleSatisfiesVacuum() again on every tuple
> > in the page.
> >
> > It also gets rid of the retry loop in lazy_scan_prune().
>
> How did you test this change?

I didn't add a new test, but you can confirm some existing test
coverage if you, for example, set every HTSV_Result in the array to
HEAPTUPLE_LIVE and run the regression tests. Tests relying on vacuum
removing the right tuples may fail. For example,

  select count(*) from gin_test_tbl where i @> array[1, 999];

in src/test/regress/sql/gin.sql fails for me since it sees a tuple it
shouldn't.

> Could you measure any performance difference?
>
> If so could you provide your test case?

I created a large table and then updated a tuple on every page in the
relation and vacuumed it. I saw a consistent slight improvement in
vacuum execution time. I profiled a bit with perf stat as well. The
difference is relatively small for this kind of example, but I can
work on a more compelling, realistic example. I think eliminating the
retry loop is also useful, as I have heard that users have had to
cancel vacuums which were in this retry loop for too long.

- Melanie
Re: pg_rewind WAL segments deletion pitfall
On 2023-08-24 09:45, Kyotaro Horiguchi wrote:

At Wed, 23 Aug 2023 13:44:52 +0200, Alexander Kukushkin wrote in

On Tue, 22 Aug 2023 at 07:32, Michael Paquier wrote:
> I don't like much this patch. While it takes correctly advantage of
> the backward record read logic from SimpleXLogPageRead() able to
> handle correctly timeline jumps, it creates a hidden dependency in the
> code between the hash table from filemap.c and the page callback.
> Wouldn't it be simpler to build a list of the segment names using the
> information from WALOpenSegment and build this list in
> findLastCheckpoint()?

I think the first version of the patch more or less did that. Not
necessarily a list, but a hash table of WAL file names that we want to
keep. But Kyotaro Horiguchi didn't like it and suggested creating
entries in the filemap.c hash table instead. But, I agree, doing it
directly from findLastCheckpoint() makes the code easier to understand.

...

> + /*
> +  * Some entries (WAL segments) already have an action assigned
> +  * (see SimpleXLogPageRead()).
> +  */
> + if (entry->action == FILE_ACTION_UNDECIDED)
> +     entry->action = decide_file_action(entry);
>
> This change makes me a bit uneasy, per se my previous comment with the
> additional code dependencies.

We can revert to the original approach (see
v1-0001-pg_rewind-wal-deletion.patch from the very first email) if you
like.

On the other hand, that approach brings in another source that
suggests the way that file should be handled. I still think that
entry->action should be the only source.

+1. Imagining a case when we come to need to decide how to treat files
based on yet another factor, I feel that a single source of truth is
better than creating a list or hash for each factor. However, it seems
I'm in the minority here. So I'm not tied to that approach.

> I think that this scenario deserves a test case. If one wants to
> emulate a delay in WAL archiving, it is possible to set
> archive_command to a command that we know will fail, for instance.

Yes, I totally agree, it is on our radar, but meanwhile please see the
new version, just to check if I correctly understood your idea.

Thanks for the patch. I tested the v4 patch using the script attached
in the below thread and it has successfully finished.
https://www.postgresql.org/message-id/2e75ae22dce9a227c3d47fa6d0ed094a%40oss.nttdata.com

--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation
Re: Standardize spelling of "power of two"
> On 29 Aug 2023, at 13:11, Alvaro Herrera wrote:
>
> On 2023-Aug-29, Daniel Gustafsson wrote:
>
>> Agreed. While we have numerous "power of 2" these were the only ones
>> in translated user-facing messages, so I've pushed this to master (it
>> didn't seem worth disrupting translations for 16 as we are so close
>> to wrapping it, if others disagree I can backpatch).
>
> I'd rather backpatch it. There's only 5 translations that are 100% for
> initdb.po, and they have two weeks to make the change from "2" to
> "two". I don't think this is a problem.

Fair enough, backpatched to v16.

--
Daniel Gustafsson
Re: subscription/015_stream sometimes breaks
On 2023-Aug-29, Amit Kapila wrote:

> On Mon, Aug 28, 2023 at 5:35 AM Peter Smith wrote:
> >
> > On Fri, Aug 25, 2023 at 8:15 PM Amit Kapila wrote:
> >
> > IMO there are inconsistencies in the second patch that was pushed.
>
> I find your suggestions reasonable. Alvaro, do you have any comments?

Well, my main comment is that at this point I'm not sure these
isFooWorker() macros are worth their salt. It looks like we could
replace their uses with direct type comparisons in their callsites and
remove them, with no loss of readability. The am_sth_worker() ones are
probably a bit more useful, though some of the callsites could end up
better if replaced with a straight type comparison.

All in all, I don't disagree with Peter's suggestions, but this is
pretty much in "meh" territory for me.

--
Álvaro Herrera         Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: Autogenerate some wait events code and documentation
On 2023-Aug-29, Michael Paquier wrote:

> On Tue, Aug 29, 2023 at 08:17:10AM +0200, Drouvot, Bertrand wrote:
> > Agree that done that way one could easily grep the events from the
> > source code and match them with wait_event_names.txt. Then I don't
> > think the "search" issue in the code is still a concern with the
> > current proposal.
>
> It could still be possible to append WAIT_EVENT_ to the first column
> of the file. I'd just rather keep it shorter.

Yeah, I have a mild preference for keeping the prefix, but it's mild
because I also imagine that if somebody doesn't see the full symbol
name when grepping they will think to remove the prefix. So only -0.1.

I think the DOCONLY stuff should be better documented; the entries make
no sense without looking at the commit message for fa88928470b5.

--
Álvaro Herrera         Breisgau, Deutschland — https://www.EnterpriseDB.com/
Use the same Windows image on both VS and MinGW tasks
Hi,

The VS and MinGW Windows images are merged on Andres' pg-vm-images
repository now [1]. So, the old pg-ci-windows-ci-vs-2019 and
pg-ci-windows-ci-mingw64 images will not be updated from now on. This
new merged image (pg-ci-windows-ci) needs to be used on both VS and
MinGW tasks. I attached a patch for using the pg-ci-windows-ci Windows
image on the VS and MinGW tasks.

CI run when pg-ci-windows-ci is used:
https://cirrus-ci.com/build/6063036847357952

[1]: https://github.com/anarazel/pg-vm-images/commit/6747f676b97348d47f041b05aa9b36cde43c33fe

Regards,
Nazir Bilal Yavuz
Microsoft

From 8db9f8672a9216f02e8e599626121c0de5befd59 Mon Sep 17 00:00:00 2001
From: Nazir Bilal Yavuz
Date: Mon, 28 Aug 2023 13:23:41 +0300
Subject: [PATCH v1] windows: Use the same Windows image on both VS and MinGW
 tasks

The VS and MinGW Windows images are merged now [1]. So, the old
pg-ci-windows-ci-vs-2019 and pg-ci-windows-ci-mingw64 images will not
be updated from now on. This new merged image (pg-ci-windows-ci) needs
to be used on both VS and MinGW tasks.

[1]: https://github.com/anarazel/pg-vm-images/commit/6747f676b97348d47f041b05aa9b36cde43c33fe
---
 .cirrus.tasks.yml | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/.cirrus.tasks.yml b/.cirrus.tasks.yml
index e137769850d..66aee48a5d3 100644
--- a/.cirrus.tasks.yml
+++ b/.cirrus.tasks.yml
@@ -514,6 +514,7 @@ WINDOWS_ENVIRONMENT_BASE: &windows_environment_base
     PG_TEST_USE_UNIX_SOCKETS: 1
     PG_REGRESS_SOCK_DIR: "c:/cirrus/"
     DISK_SIZE: 50
+    IMAGE_FAMILY: pg-ci-windows-ci
 
   sysinfo_script: |
     chcp
@@ -537,7 +538,6 @@ task:
     # given that it explicitly prevents crash dumps from working...
     # 0x8001 is SEM_FAILCRITICALERRORS | SEM_NOOPENFILEERRORBOX
     CIRRUS_WINDOWS_ERROR_MODE: 0x8001
-    IMAGE_FAMILY: pg-ci-windows-ci-vs-2019
 
   <<: *windows_task_template
@@ -598,7 +598,6 @@ task:
     # Start bash in current working directory
     CHERE_INVOKING: 1
     BASH: C:\msys64\usr\bin\bash.exe -l
-    IMAGE_FAMILY: pg-ci-windows-ci-mingw64
 
   <<: *windows_task_template
-- 
2.40.1
Re: proposal: psql: show current user in prompt
On Mon, 28 Aug 2023 at 15:00, Pavel Stehule wrote: + minServerMajor = 1600; + serverMajor = PQserverVersion(pset.db) / 100; Instead of using the server version, we should instead use the protocol version negotiation that's provided by the NegotiateProtocolVersion message type. We should bump the requested protocol version from 3.0 to 3.1 and check that the server supports 3.1. Otherwise proxies or connection poolers might get this new message type, without knowing what to do with them. + +ReportGUC (F) We'll need some docs on the "Message Flow" page too: https://www.postgresql.org/docs/current/protocol-flow.html Specifically what response is expected, if any. Another thing that should be described there is that this falls outside of the transaction flow, i.e. it's changes are not reverted on ROLLBACK. But that leaves an important consideration: What happens when an error occurs on the server during handling of this message (e.g. the GUC does not exist or an OOM is triggered). Is any open transaction aborted in that case? If not, we should have a test for that. + if (PQresultStatus(res) != PGRES_COMMAND_OK) + pg_fatal("failed to link custom variable: %s", PQerrorMessage(conn)); + PQclear(res); The tests should also change the config after running PQlinkParameterStatus/PQunlinkParameterStatus to show that the guc is reported then or not reported then. + if (!PQsendTypedCommand(conn, PqMsg_ReportGUC, 't', paramName)) + return NULL; I think we'll need some bikeshedding on what the protocol message should look like exactly. I'm not entirely sure what is the most sensible here, so please treat everything I write next as suggestions/discussion: I see that you're piggy-backing on PQsendTypedCommand, which seems nice to avoid code duplication. It has one downside though: not every type, is valid for each command anymore. One way to avoid that would be to not introduce a new command, but only add a new type that is understood by Describe and Close, e.g. a 'G' (for GUC). 
Then PqMsg_Describe, G would become the equivalent of what the current patch expresses as PqMsg_ReportGUC, 't', and PqMsg_Close, G would be the same as PqMsg_ReportGUC, 'f'.

The rest of this email assumes that we're keeping your current proposal for the protocol message, so it might not make sense to address all of this feedback, in case we're still going to change the protocol:

+ if (is_set == 't')
+ {
+ SetGUCOptionFlag(name, GUC_REPORT);
+ status = "SET REPORT_GUC";
+ }
+ else
+ {
+ UnsetGUCOptionFlag(name, GUC_REPORT);
+ status = "UNSET REPORT_GUC";
+ }

I think we should be strict about what we accept here. Any other value than 'f'/'t' for is_set should result in an error imho.
Re: persist logical slots to disk during shutdown checkpoint
On Tue, Aug 29, 2023 at 2:21 PM Amit Kapila wrote: > > On Tue, Aug 29, 2023 at 10:16 AM vignesh C wrote: > > > > That makes sense. The attached v6 version has the changes for the > > same, apart from this I have also fixed a) pgindent issues b) perltidy > > issues c) one variable change (flush_lsn_changed to > > confirmed_flush_has_changed) d) corrected few comments in the test > > file. Thanks to Peter for providing few offline comments. > > > > The latest version looks good to me. Julien, Ashutosh, and others, > unless you have more comments or suggestions, I would like to push > this in a day or two. I am looking at it. If you can wait till the end of the week, that will be great. -- Best Wishes, Ashutosh Bapat
Re: tablecmds.c/MergeAttributes() cleanup
On 2023-Aug-29, Peter Eisentraut wrote: > From 471fda80c41fae835ecbe63ae8505526a37487a9 Mon Sep 17 00:00:00 2001 > From: Peter Eisentraut > Date: Wed, 12 Jul 2023 16:12:35 +0200 > Subject: [PATCH v2 04/10] Add TupleDescGetDefault() > > This unifies some repetitive code. > > Note: I didn't push the "not found" error message into the new > function, even though all existing callers would be able to make use > of it. Using the existing error handling as-is would probably require > exposing the Relation type via tupdesc.h, which doesn't seem > desirable. Note that all errors here are elog(ERROR), not user-facing, so ISTM it would be OK to change them to report the relation OID rather than the name. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "In fact, the basic problem with Perl 5's subroutines is that they're not crufty enough, so the cruft leaks out into user-defined code instead, by the Conservation of Cruft Principle." (Larry Wall, Apocalypse 6)
RE: [PoC] pg_upgrade: allow to upgrade publisher node
Dear Amit,

Thank you for giving comments! PSA new version. I ran the pgindent.

> > 1. check_and_dump_old_cluster
> >
> > CURRENT CODE (with v26-0003 patch applied)
> >
> > /* Extract a list of logical replication slots */
> > get_old_cluster_logical_slot_infos();
> >
> > ...
> >
> > /*
> > * Logical replication slots can be migrated since PG17. See comments atop
> > * get_old_cluster_logical_slot_infos().
> > */
> > if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
> > {
> > check_old_cluster_for_lost_slots();
> >
> > /*
> > * Do additional checks if a live check is not required. This requires
> > * that confirmed_flush_lsn of all the slots is the same as the latest
> > * checkpoint location, but it would be satisfied only when the server
> > * has been shut down.
> > */
> > if (!live_check)
> > check_old_cluster_for_confirmed_flush_lsn();
> > }
> >
> >
> > SUGGESTION
> >
> > /*
> > * Logical replication slots can be migrated since PG17. See comments atop
> > * get_old_cluster_logical_slot_infos().
> > */
> > if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700) // NOTE 1a.
> > {
> > /* Extract a list of logical replication slots */
> > get_old_cluster_logical_slot_infos();
> >
> > if (count_old_cluster_slots()) // NOTE 1b.
> > {
> > check_old_cluster_for_lost_slots();
> >
> > /*
> > * Do additional checks if a live check is not required. This requires
> > * that confirmed_flush_lsn of all the slots is the same as the latest
> > * checkpoint location, but it would be satisfied only when the server
> > * has been shut down.
> > */
> > if (!live_check)
> > check_old_cluster_for_confirmed_flush_lsn();
> > }
> > }
> >
> I think a slightly better way to achieve this is to combine the code
> from check_old_cluster_for_lost_slots() and
> check_old_cluster_for_confirmed_flush_lsn() into
> check_old_cluster_for_valid_slots(). That will even save us a new
> connection for the second check.

They are combined into one function.
> Also, I think we can simplify another check in the patch:
> @@ -1446,8 +1446,10 @@ check_new_cluster_logical_replication_slots(void)
>  char *wal_level;
>
>  /* Logical slots can be migrated since PG17. */
> - if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
> - nslots = count_old_cluster_logical_slots();
> + if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
> + return;
> +
> + nslots = count_old_cluster_logical_slots();
>

Fixed.

Also, I have tested the combination of this patch and a physical standby:

1. Logical slots defined on the old physical standby *cannot be upgraded*.
2. Logical slots defined on the physical primary *are migrated* to the new physical standby.

The primary reason is that pg_upgrade cannot be used for a physical standby. If users want to upgrade a standby, the rsync command is used instead. That command creates the cluster based on the new primary, hence the slots are replicated to the new standby. In contrast, the old cluster is basically ignored, so slots on the old cluster are not upgraded. I updated the doc accordingly.

Best Regards,
Hayato Kuroda
FUJITSU LIMITED

v28-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch
Description: v28-0001-Persist-logical-slots-to-disk-during-a-shutdown-.patch

v28-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch
Description: v28-0002-pg_upgrade-Allow-to-replicate-logical-replicatio.patch

v28-0003-pg_upgrade-Add-check-function-for-logical-replic.patch
Description: v28-0003-pg_upgrade-Add-check-function-for-logical-replic.patch
Re: tablecmds.c/MergeAttributes() cleanup
On 2023-Aug-29, Peter Eisentraut wrote: Regarding this hunk in 0002, > @@ -3278,13 +3261,16 @@ MergeAttributes(List *schema, List *supers, char > relpersistence, > * > * constraints is a list of CookedConstraint structs for previous > constraints. > * > - * Returns true if merged (constraint is a duplicate), or false if it's > - * got a so-far-unique name, or throws error if conflict. > + * If the constraint is a duplicate, then the existing constraint's > + * inheritance count is updated. If the constraint doesn't match or conflict > + * with an existing one, a new constraint is appended to the list. If there > + * is a conflict (same name but different expression), throw an error. This wording confused me: "If the constraint doesn't match or conflict with an existing one, a new constraint is appended to the list." I first read it as "doesn't match or conflicts with ..." (i.e., the negation only applied to the first verb, not both) which would have been surprising (== broken) behavior. I think it's clearer if you say "doesn't match nor conflict", but I'm not sure if this is grammatically correct. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
Re: Support prepared statement invalidation when result types change
On Tue, 29 Aug 2023 at 11:29, jian he wrote: > regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl; > SELECT 5 > regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest > WHERE q1 = $1;' > PREPARE > regression=# alter table pcachetest rename q1 to x; > ALTER TABLE > regression=# EXECUTE prepstmt2(123); > 2023-08-29 17:23:59.148 CST [1382074] ERROR: column "q1" does not > exist at character 61 > 2023-08-29 17:23:59.148 CST [1382074] HINT: Perhaps you meant to > reference the column "pcachetest.q2". > 2023-08-29 17:23:59.148 CST [1382074] STATEMENT: EXECUTE prepstmt2(123); > ERROR: column "q1" does not exist > HINT: Perhaps you meant to reference the column "pcachetest.q2". Thank you for the full set of commands. In that case the issue you're describing is completely separate from this patch. The STATEMENT: part of the log will always show the query that was received by the server. This behaviour was already present even without my patch (I double checked with PG15.3).
Re: Standardize spelling of "power of two"
On 2023-Aug-29, Daniel Gustafsson wrote:

> Agreed. While we have numerous "power of 2" these were the only ones
> in translated user-facing messages, so I've pushed this to master (it
> didn't seem worth disrupting translations for 16 as we are so close to
> wrapping it, if others disagree I can backpatch).

I'd rather backpatch it. There's only 5 translations that are 100% for initdb.po, and they have two weeks to make the change from "2" to "two". I don't think this is a problem.

-- 
Álvaro Herrera        Breisgau, Deutschland — https://www.EnterpriseDB.com/
Re: Support run-time partition pruning for hash join
On Fri, Aug 25, 2023 at 11:03 AM David Rowley wrote:

> I'd suggest writing some cost which costs an execution of run-time
> pruning. With LIST and RANGE you probably want something like
> cpu_operator_cost * LOG2(nparts) once for each hashed tuple to account
> for the binary search over the sorted datum array. For HASH
> partitions, something like cpu_operator_cost * npartcols once for each
> hashed tuple.
>
> You'll need to then come up with some counter costs to subtract from
> the Append/MergeAppend. This is tricky, as discussed. Just come up
> with something crude for now.
>
> To start with, it could just be as crude as:
>
> total_costs *= (Min(expected_outer_rows, n_append_subnodes) /
> n_append_subnodes);
>
> i.e assume that every outer joined row will require exactly 1 new
> partition up to the total number of partitions. That's pretty much
> worst-case, but it'll at least allow the optimisation to work for
> cases like where the hash table is expected to contain just a tiny
> number of rows (fewer than the number of partitions)
>
> To make it better, you might want to look at join selectivity
> estimation and see if you can find something there to influence
> something better.

I had a go at writing some costing code according to your suggestion. That's compute_partprune_cost() in the v2 patch.

For the hash side, this function computes the pruning cost as cpu_operator_cost * LOG2(nparts) * inner_rows for LIST and RANGE, and cpu_operator_cost * nparts * inner_rows for HASH.

For the Append/MergeAppend side, this function first estimates the size of the outer side that matches, using the same idea as when we estimate the joinrel size for JOIN_SEMI. Then it assumes that each outer joined row occupies one new partition (the worst case) and computes how much cost can be saved from partition pruning. If the cost saved from the Append/MergeAppend side is larger than the pruning cost from the Hash side, then we say that partition pruning is a win.
Note that this costing logic runs for each Append-Hash pair, so it copes with the case where we have multiple join levels.

With this costing logic added, I performed the same performance comparisons of the worst case as in [1], and here is what I got.

tuples   unpatched   patched
1        44.66       44.37     -0.006493506
2        52.41       52.29     -0.002289639
3        61.11       61.12     +0.000163639
4        67.87       68.24     +0.005451599
5        74.51       74.75     +0.003221044
6        82.3        81.55     -0.009113001
7        87.16       86.98     -0.002065168
8        93.49       93.89     +0.004278532
9        101.52      100.83    -0.00679669
10       108.34      108.56    +0.002030644

So the costing logic successfully avoids performing the partition pruning in the worst case.

I also tested the cases where partition pruning is possible with different sizes of the hash side.

tuples   unpatched   patched
100      36.86       2.4       -0.934888768
200      35.87       2.37      -0.933928074
300      35.95       2.55      -0.92906815
400      36.4        2.63      -0.927747253
500      36.39       2.85      -0.921681781
600      36.32       2.97      -0.918226872
700      36.6        3.23      -0.911748634
800      36.88       3.44      -0.906724512
900      37.02       3.46      -0.906537007
1000     37.25       37.21     -0.001073826

The first 9 rows show that the costing logic allows the partition pruning to be performed and the pruning turns out to be a big win. The last row shows that the partition pruning is disallowed by the costing logic because it thinks no partition can be pruned (we have 1000 partitions in total).

So it seems that the new costing logic is quite crude and tends to be very conservative, but it can help avoid the large overhead in the worst cases. I think this might be a good start to push this patch forward. Any thoughts or comments?

[1] https://www.postgresql.org/message-id/CAMbWs49%2Bp6hBxXJHFiSwOtPCSkAHwhJj3hTpCR_pmMiUUVLZ1Q%40mail.gmail.com

Thanks
Richard

v2-0001-Support-run-time-partition-pruning-for-hash-join.patch
Description: Binary data
Re: Sync scan & regression tests
(noticed this thread just now)

On 07/08/2023 03:55, Tom Lane wrote:
> Having said that ... I just noticed that chipmunk, which I'd been
> ignoring because it had been having configuration-related failures ever
> since it came back to life about three months ago, has gotten past those
> problems

Yes, I finally got around to fixing the configuration...

> and is now failing with what looks suspiciously like syncscan problems:
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=chipmunk&dt=2023-08-06%2008%3A20%3A21
>
> This is possibly explained by the fact that it uses (per its
> extra_config) 'shared_buffers = 10MB', although it's done that for a
> long time and portals.out hasn't changed since before chipmunk's latest
> success. Perhaps some change in an earlier test script affected this?

I changed the config yesterday to 'shared_buffers = 20MB', before seeing this thread. If we expect the regression tests to work with 'shared_buffers=10MB' - and I think that would be nice - I'll change it back. But let's see what happens with 'shared_buffers=20MB'. It will take a few days for the tests to complete.

> I think it'd be worth running to ground exactly what is causing these
> failures and when they started.

I bisected it to this commit:

commit 7ae0ab0ad9704b10400a611a9af56c63304c27a5
Author: David Rowley
Date: Fri Feb 3 16:20:43 2023 +1300

    Reduce code duplication between heapgettup and heapgettup_pagemode

    The code to get the next block number was exactly the same between
    these two functions, so let's just put it into a helper function and
    call that from both locations.

    Author: Melanie Plageman
    Reviewed-by: Andres Freund, David Rowley
    Discussion: https://postgr.es/m/CAAKRu_bvkhka0CZQun28KTqhuUh5ZqY=_t8qeqzqol02rpi...@mail.gmail.com

From that description, that was supposed to be just code refactoring, with no change in behavior.
Looking at the new heapgettup_advance_block() function and the code that it replaced, it's now skipping this ss_report_location() on the last call, when it has reached the end of the scan:

		/*
		 * Report our new scan position for synchronization purposes. We
		 * don't do that when moving backwards, however. That would just
		 * mess up any other forward-moving scanners.
		 *
		 * Note: we do this before checking for end of scan so that the
		 * final state of the position hint is back at the start of the
		 * rel. That's not strictly necessary, but otherwise when you run
		 * the same query multiple times the starting position would shift
		 * a little bit backwards on every invocation, which is confusing.
		 * We don't guarantee any specific ordering in general, though.
		 */
		if (scan->rs_base.rs_flags & SO_ALLOW_SYNC)
			ss_report_location(scan->rs_base.rs_rd, block);

The comment explicitly says that we do that before checking for end of scan, but that commit moved it to after. I propose the attached to move it back to before the end-of-scan checks. Melanie, David, any thoughts?

> But assuming that they are triggered by syncscan, my first thought about
> dealing with it is to disable syncscan within the affected tests.
> However ... do we exercise the syncscan logic at all within the
> regression tests, normally? Is there a coverage issue to be dealt with?

Good question..
-- 
Heikki Linnakangas
Neon (https://neon.tech)

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 3bb1d5cff68..48b4ca45439 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -647,17 +647,6 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber block, ScanDirection dir
 		if (block >= scan->rs_nblocks)
 			block = 0;
 
-		/* we're done if we're back at where we started */
-		if (block == scan->rs_startblock)
-			return InvalidBlockNumber;
-
-		/* check if the limit imposed by heap_setscanlimits() is met */
-		if (scan->rs_numblocks != InvalidBlockNumber)
-		{
-			if (--scan->rs_numblocks == 0)
-				return InvalidBlockNumber;
-		}
-
 		/*
 		 * Report our new scan position for synchronization purposes. We
 		 * don't do that when moving backwards, however. That would just
@@ -673,6 +662,17 @@ heapgettup_advance_block(HeapScanDesc scan, BlockNumber block, ScanDirection dir
 		if (scan->rs_base.rs_flags & SO_ALLOW_SYNC)
 			ss_report_location(scan->rs_base.rs_rd, block);
 
+		/* we're done if we're back at where we started */
+		if (block == scan->rs_startblock)
+			return InvalidBlockNumber;
+
+		/* check if the limit imposed by heap_setscanlimits() is met */
+		if (scan->rs_numblocks != InvalidBlockNumber)
+		{
+			if (--scan->rs_numblocks == 0)
+				return InvalidBlockNumber;
+		}
+
 		return block;
 	}
 	else
Re: Prevent psql \watch from running queries that return no rows
> On 22 Aug 2023, at 23:23, Greg Sabino Mullane wrote: > > Thank you for the feedback, everyone. Attached is version 4 of the patch, > featuring a few tests and minor rewordings. I went over this once more, and pushed it along with pgindenting. I did reduce the number of tests since they were overlapping and a bit too expensive to have multiples of. Thanks for the patch! -- Daniel Gustafsson
Re: [PATCH v1] PQputCopyEnd never returns 0, fix the inaccurate comment
On Tue, Aug 29, 2023 at 6:40 AM Michael Paquier wrote: > > On Mon, Aug 28, 2023 at 09:46:07PM +0800, Junwang Zhao wrote: > > Yeah, it makes sense to me, or maybe just `PQputCopyEnd(...) == -1`, > > let's wait for some other opinions. > > One can argue that PQputCopyEnd() returning 0 could be possible in an > older version of libpq these callers are linking to, but this has > never existed from what I can see (just checked down to 8.2 now). > Anyway, changing these callers may create some backpatching conflicts, > so I'd let them as they are, and just fix the comment. sure, thanks for the explanation. > -- > Michael -- Regards Junwang Zhao
RE: logical_replication_mode
On Tuesday, August 29, 2023 3:26 PM Peter Smith wrote: Thanks for reviewing. > 2. DebugLogicalRepStreamingMode > > -/* possible values for logical_replication_mode */ > +/* possible values for debug_logical_replication_streaming */ > typedef enum > { > - LOGICAL_REP_MODE_BUFFERED, > - LOGICAL_REP_MODE_IMMEDIATE > -} LogicalRepMode; > + DEBUG_LOGICAL_REP_STREAMING_BUFFERED, > + DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE > +} DebugLogicalRepStreamingMode; > > Shouldn't this typedef name be included in the typedef.list file? I think it's unnecessary to add this as there currently is no reference to the name. See other similar examples like DebugParallelMode, RecoveryPrefetchValue ... And the name is also not included in BF[1] [1] https://buildfarm.postgresql.org/cgi-bin/typedefs.pl?branch=HEAD Best Regards, Hou zj
Re: Support prepared statement invalidation when result types change
On Tue, Aug 29, 2023 at 12:51 AM Jelte Fennema wrote: > > Could you share the full set of commands that cause the reporting > issue? I don't think my changes should impact this reporting, so I'm > curious if this is a new issue, or an already existing one. I didn't apply your v2 patch. full set of commands: regression=# CREATE TEMP TABLE pcachetest AS SELECT * FROM int8_tbl; SELECT 5 regression=# PREPARE prepstmt2(bigint) AS SELECT * FROM pcachetest WHERE q1 = $1;' PREPARE regression=# alter table pcachetest rename q1 to x; ALTER TABLE regression=# EXECUTE prepstmt2(123); 2023-08-29 17:23:59.148 CST [1382074] ERROR: column "q1" does not exist at character 61 2023-08-29 17:23:59.148 CST [1382074] HINT: Perhaps you meant to reference the column "pcachetest.q2". 2023-08-29 17:23:59.148 CST [1382074] STATEMENT: EXECUTE prepstmt2(123); ERROR: column "q1" does not exist HINT: Perhaps you meant to reference the column "pcachetest.q2". --
Re: Standardize spelling of "power of two"
> On 29 Aug 2023, at 10:56, Kyotaro Horiguchi wrote: > pg_resetwal and initdb has an error message like this: > > msgid "argument of --wal-segsize must be a power of 2 between 1 and 1024" > > In other parts in the tree, however, we spell it as "power of two". I > think it would make sense to standardize the spelling for > consistency. See the attached. Agreed. While we have numerous "power of 2" these were the only ones in translated user-facing messages, so I've pushed this to master (it didn't seem worth disrupting translations for 16 as we are so close to wrapping it, if others disagree I can backpatch). -- Daniel Gustafsson
Re: Missing comments/docs about custom scan path
On Thu, Aug 3, 2023 at 6:01 PM Etsuro Fujita wrote: > On Mon, Jul 31, 2023 at 7:05 PM Etsuro Fujita wrote: > > While working on [1], I noticed $SUBJECT: Another thing I would like to propose is minor adjustments to the docs related to parallel query: A custom scan provider will typically add paths for a base relation by setting the following hook, which is called after the core code has generated all the access paths it can for the relation (except for Gather paths, which are made after this call so that they can use partial paths added by the hook): For clarity, I think "except for Gather paths" should be "except for Gather and Gather Merge paths". Although this hook function can be used to examine, modify, or remove paths generated by the core system, a custom scan provider will typically confine itself to generating CustomPath objects and adding them to rel using add_path. For clarity, I think "adding them to rel using add_path" should be eg, "adding them to rel using add_path, or using add_partial_path if they are partial paths". Attached is a patch for that. Best regards, Etsuro Fujita further-update-custom-scan-path-docs.patch Description: Binary data
Re: Eliminate redundant tuple visibility check in vacuum
Hi Melanie, On 8/29/23 01:49, Melanie Plageman wrote: While working on a set of patches to combine the freeze and visibility map WAL records into the prune record, I wrote the attached patches reusing the tuple visibility information collected in heap_page_prune() back in lazy_scan_prune(). heap_page_prune() collects the HTSV_Result for every tuple on a page and saves it in an array used by heap_prune_chain(). If we make that array available to lazy_scan_prune(), it can use it when collecting stats for vacuum and determining whether or not to freeze tuples. This avoids calling HeapTupleSatisfiesVacuum() again on every tuple in the page. It also gets rid of the retry loop in lazy_scan_prune(). How did you test this change? Could you measure any performance difference? If so could you provide your test case? -- David Geier (ServiceNow)
Standardize spelling of "power of two"
Hello.

pg_resetwal and initdb have an error message like this:

msgid "argument of --wal-segsize must be a power of 2 between 1 and 1024"

In other parts of the tree, however, we spell it as "power of two". I think it would make sense to standardize the spelling for consistency. See the attached.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index c66467eb95..905b979947 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -3350,7 +3350,7 @@ main(int argc, char *argv[])
 	check_need_password(authmethodlocal, authmethodhost);
 
 	if (!IsValidWalSegSize(wal_segment_size_mb * 1024 * 1024))
-		pg_fatal("argument of %s must be a power of 2 between 1 and 1024", "--wal-segsize");
+		pg_fatal("argument of %s must be a power of two between 1 and 1024", "--wal-segsize");
 
 	get_restricted_token();
 
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 9bebc2a995..25ecdaaa15 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -298,7 +298,7 @@ main(int argc, char *argv[])
 				exit(1);
 			set_wal_segsize = wal_segsize_mb * 1024 * 1024;
 			if (!IsValidWalSegSize(set_wal_segsize))
-				pg_fatal("argument of %s must be a power of 2 between 1 and 1024", "--wal-segsize");
+				pg_fatal("argument of %s must be a power of two between 1 and 1024", "--wal-segsize");
 			break;
 		}
Re: persist logical slots to disk during shutdown checkpoint
On Tue, Aug 29, 2023 at 10:16 AM vignesh C wrote: > > That makes sense. The attached v6 version has the changes for the > same, apart from this I have also fixed a) pgindent issues b) perltidy > issues c) one variable change (flush_lsn_changed to > confirmed_flush_has_changed) d) corrected few comments in the test > file. Thanks to Peter for providing few offline comments. > The latest version looks good to me. Julien, Ashutosh, and others, unless you have more comments or suggestions, I would like to push this in a day or two. -- With Regards, Amit Kapila.
Re: Remove IndexInfo.ii_OpclassOptions field
On 25.08.23 03:31, Michael Paquier wrote: On Thu, Aug 24, 2023 at 08:57:58AM +0200, Peter Eisentraut wrote: During some refactoring I noticed that the field IndexInfo.ii_OpclassOptions is kind of useless. The IndexInfo struct is notionally an executor support node, but this field is not used in the executor or by the index AM code. It is really just used in DDL code in index.c and indexcmds.c to pass information around locally. For that, it would be clearer to just use local variables, like for other similar cases. With that change, we can also remove RelationGetIndexRawAttOptions(), which only had one caller left, for which it was overkill. I am not so sure. There is a very recent thread where it has been pointed out that we have zero support for relcache invalidation with index options, causing various problems: https://www.postgresql.org/message-id/CAGem3qAM7M7B3DdccpgepRxuoKPd2Y74qJ5NSNRjLiN21dPhgg%40mail.gmail.com Perhaps we'd better settle on the other one before deciding if the change you are proposing here is adapted or not. Ok, I'll wait for the resolution of that. At a glance, however, I think my patch is (a) not related, and (b) if it were, it would probably *help*, because the change is to not allocate any long-lived structures that no one needs and that might get out of date.
Re: postgres_fdw: wrong results with self join + enable_nestloop off
Hi, Thanks for the detailed explanation! On Mon, Aug 21, 2023 at 10:34 PM Önder Kalacı wrote: >> As described in the commit message, we assume that extensions use the >> hook in a similar way to FDWs > I'm not sure if it is fair to assume that extensions use any hook in any way. I am not sure either, but as for the hook, I think it is an undeniable fact that the core system assumes that extensions will use it in that way. >> So my question is: does the Citus extension use the hook like this? >> (Sorry, I do not fully understand Onder's explanation.) > I haven't gone into detail about how Citus uses this hook, but I don't think > we should > need to explain it. In general, Citus uses many hooks, and many other > extensions > use this specific hook. With minor version upgrades, we haven't seen this > kind of > behavior change before. > > In general, Citus relies on this hook for collecting information about joins > across > relations/ctes/subqueries. So, its scope is bigger than a single join for > Citus. > > The extension assigns a special marker(s) for RTE Relations, and then checks > whether > all relations with these special markers joined transitively across > subqueries, such that > it can decide to pushdown the whole or some parts of the (sub)query. IIUC, I think that that is going beyond what the hook supports. > But the bigger issue is that there has usually been a clear line between the > extensions and > the PG itself when it comes to hooks within the minor version upgrades. > Sadly, this change > breaks that line. We wanted to share our worries here and find out what > others think. My understanding is: at least for hooks with intended usages, if an extension uses them as intended, it is guaranteed that the extension as-is will work correctly with minor version upgrades; otherwise it is not necessarily. I think it is unfortunate that my commit broke the Citus extension, though. 
>> >Except that this was only noticed after it was released in a set of minor >> > versions, I would say that 6f80a8d9c should just straight up be reverted. > I cannot be the one to ask for reverting a commit in PG, but I think doing it > would be a > fair action. We kindly ask those who handle this to think about it. Reverting the commit would resolve your issue, but re-introduce the issue mentioned upthread to extensions that use the hook properly, so I do not think that reverting the commit would be a fair action. Sorry for the delay. Best regards, Etsuro Fujita
Re: tablecmds.c/MergeAttributes() cleanup
On 12.07.23 16:29, Peter Eisentraut wrote: On 11.07.23 20:17, Alvaro Herrera wrote: I spent a few minutes doing a test merge of this to my branch with NOT NULL changes. Here's a quick review. Subject: [PATCH 01/17] Remove obsolete comment about OID support Obvious, trivial. +1 Subject: [PATCH 02/17] Remove ancient special case code for adding oid columns LGTM; deletes dead code. Subject: [PATCH 03/17] Remove ancient special case code for dropping oid columns Hmm, interesting. Yay for more dead code removal. Didn't verify it. I have committed these first three. I'll leave it at that for now. I have committed a few more patches from this series that were already agreed upon. The remaining ones are rebased and reordered a bit, attached. There was some doubt about the patch "Refactor ATExecAddColumn() to use BuildDescForRelation()" (now 0009), whether it's too clever to build a fake one-item tuple descriptor. I am working on another patch, which I hope to post this week, that proposes to replace the use of tuple descriptors there with a List of something. That would then allow maybe rewriting this in a less-clever way. That patch incidentally also wants to move BuildDescForRelation from tupdesc.c to tablecmds.c (patch 0007 here). From 4d01d244305c63a5a602558267d0021910e200b6 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Wed, 12 Jul 2023 16:12:34 +0200 Subject: [PATCH v2 01/10] Clean up MergeAttributesIntoExisting() Make variable naming clearer and more consistent. Move some variables to smaller scope. Remove some unnecessary intermediate variables. Try to save some vertical space. Apply analogous changes to nearby MergeConstraintsIntoExisting() and RemoveInheritance() for consistency. 
---
 src/backend/commands/tablecmds.c | 123 ---
 1 file changed, 48 insertions(+), 75 deletions(-)

diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index d097da3c78..e7846178b3 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15691,92 +15691,76 @@ static void
 MergeAttributesIntoExisting(Relation child_rel, Relation parent_rel)
 {
 	Relation	attrrel;
-	AttrNumber	parent_attno;
-	int			parent_natts;
-	TupleDesc	tupleDesc;
-	HeapTuple	tuple;
-	bool		child_is_partition = false;
+	TupleDesc	parent_desc;
 
 	attrrel = table_open(AttributeRelationId, RowExclusiveLock);
+	parent_desc = RelationGetDescr(parent_rel);
 
-	tupleDesc = RelationGetDescr(parent_rel);
-	parent_natts = tupleDesc->natts;
-
-	/* If parent_rel is a partitioned table, child_rel must be a partition */
-	if (parent_rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
-		child_is_partition = true;
-
-	for (parent_attno = 1; parent_attno <= parent_natts; parent_attno++)
+	for (AttrNumber parent_attno = 1; parent_attno <= parent_desc->natts; parent_attno++)
 	{
-		Form_pg_attribute attribute = TupleDescAttr(tupleDesc,
-													parent_attno - 1);
-		char	   *attributeName = NameStr(attribute->attname);
+		Form_pg_attribute parent_att = TupleDescAttr(parent_desc, parent_attno - 1);
+		char	   *parent_attname = NameStr(parent_att->attname);
+		HeapTuple	tuple;
 
 		/* Ignore dropped columns in the parent. */
-		if (attribute->attisdropped)
+		if (parent_att->attisdropped)
 			continue;
 
 		/* Find same column in child (matching on column name). */
-		tuple = SearchSysCacheCopyAttName(RelationGetRelid(child_rel),
-										  attributeName);
+		tuple = SearchSysCacheCopyAttName(RelationGetRelid(child_rel), parent_attname);
 		if (HeapTupleIsValid(tuple))
 		{
-			/* Check they are same type, typmod, and collation */
-			Form_pg_attribute childatt = (Form_pg_attribute) GETSTRUCT(tuple);
+			Form_pg_attribute child_att = (Form_pg_attribute) GETSTRUCT(tuple);
 
-			if (attribute->atttypid != childatt->atttypid ||
-				attribute->atttypmod != childatt->atttypmod)
+			if (parent_att->atttypid != child_att->atttypid ||
+				parent_att->atttypmod != child_att->atttypmod)
 				ereport(ERROR,
 						(errcode(ERRCODE_DATATYPE_MISMATCH),
 						 errmsg("child table \"%s\" has different type for column \"%s\"",
-
Re: logical_replication_mode
On Tue, Aug 29, 2023 at 12:56 PM Peter Smith wrote:
>
> I had a look at the patch 0001.
>
> It looks OK to me, but here are a couple of comments:
>
> ======
>
> 1. Is this fix intended for PG16?

Yes.

> I found some mention of this GUC old name lurking in the release v16
> notes [1].

That should be changed as well but we can do that as a separate patch
just for v16.

--
With Regards,
Amit Kapila.
Re: Wrong usage of pqMsg_Close message code?
On Tue, Aug 29, 2023 at 10:04:24AM +0900, Tatsuo Ishii wrote:
>> Yeah, both of you are right here.  Anyway, it seems to me that there
>> is a bit more going on in protocol.h.  I have noticed two more things
>> that are incorrect:
>> - HandleParallelMessage is missing a message for 'P', but I think that
>> we should have a code for it as well as part of the parallel query
>> protocol.
>
> I did not know this.  Why is this not explained in the frontend/backend
> protocol document?

Hmm.  Thinking more about it, I am actually not sure that we need to do
that in this case, so perhaps things are OK as they stand for this one.

>> - PqMsg_Terminate can be sent by the frontend *and* the backend, see
>> fe-connect.c and parallel.c.  However, protocol.h documents it as a
>> frontend-only code.
>
> I do not blame protocol.h because our frontend/backend protocol
> document also states that it's a frontend-only message.  Someone who
> started to use 'X' in the backend should have added that in the
> documentation.

Actually, this may be OK as well as it stands.  One can also say that
the parallel processing is out of this scope, being used only
internally.  I cannot help wondering whether we should put more effort
into documenting the parallel worker/leader protocol.  That's internal
to the backend and out of the scope of this thread, still..

--
Michael
Re: logical_replication_mode
Hi Hou-san.

I had a look at the patch 0001.

It looks OK to me, but here are a couple of comments:

======

1. Is this fix intended for PG16?

I found some mention of this GUC old name lurking in the release v16
notes [1].

~~~

2. DebugLogicalRepStreamingMode

-/* possible values for logical_replication_mode */
+/* possible values for debug_logical_replication_streaming */
 typedef enum
 {
-	LOGICAL_REP_MODE_BUFFERED,
-	LOGICAL_REP_MODE_IMMEDIATE
-} LogicalRepMode;
+	DEBUG_LOGICAL_REP_STREAMING_BUFFERED,
+	DEBUG_LOGICAL_REP_STREAMING_IMMEDIATE
+} DebugLogicalRepStreamingMode;

Shouldn't this typedef name be included in the typedef.list file?

------
[1] https://www.postgresql.org/docs/16/release-16.html

Kind Regards,
Peter Smith.
Fujitsu Australia
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 29/08/2023 09:21, Heikki Linnakangas wrote:
> Thinking about this some more, the ListenSockets array is a bit silly
> anyway.  We fill the array starting from index 0, always append to the
> end, and never remove entries from it.  It would seem more
> straightforward to keep track of the used size of the array.  Currently
> we always loop through the unused parts too, and e.g.
> ConfigurePostmasterWaitSet() needs to walk the array to count how many
> elements are in use.

Like this.

--
Heikki Linnakangas
Neon (https://neon.tech)

From 796280f07dd5dbf50897c9895715ab5e2dbf187b Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas
Date: Tue, 29 Aug 2023 09:53:08 +0300
Subject: [PATCH 1/1] Refactor ListenSocket array.

Keep track of the used size of the array.  That avoids looping through
the whole array in a few places.  It doesn't matter from a performance
point of view since the array is small anyway, but this feels less
surprising and is a little less code.

Now that we have an explicit NumListenSockets variable that is
statically initialized to 0, we don't need the loop to initialize the
array.

Allocate the array in PostmasterContext.  The array isn't needed in
child processes, so this allows reusing that memory.

We could easily make the array resizable now, but we haven't heard any
complaints about the current 64 sockets limit.
---
 src/backend/libpq/pqcomm.c          | 18 +++
 src/backend/postmaster/postmaster.c | 76 +++--
 src/include/libpq/libpq.h           |  2 +-
 3 files changed, 36 insertions(+), 60 deletions(-)

diff --git a/src/backend/libpq/pqcomm.c b/src/backend/libpq/pqcomm.c
index 8d217b0645..48ae7704fb 100644
--- a/src/backend/libpq/pqcomm.c
+++ b/src/backend/libpq/pqcomm.c
@@ -311,8 +311,9 @@ socket_close(int code, Datum arg)
  * specified.  For TCP ports, hostName is either NULL for all interfaces or
  * the interface to listen on, and unixSocketDir is ignored (can be NULL).
  *
- * Successfully opened sockets are added to the ListenSocket[] array (of
- * length MaxListen), at the first position that isn't PGINVALID_SOCKET.
+ * Successfully opened sockets are appended to the ListenSockets[] array.  The
+ * current size of the array is *NumListenSockets, it is updated to reflect
+ * the opened sockets.  MaxListen is the allocated size of the array.
 *
 * RETURNS: STATUS_OK or STATUS_ERROR
 */
@@ -320,7 +321,7 @@
 int
 StreamServerPort(int family, const char *hostName, unsigned short portNumber,
 				 const char *unixSocketDir,
-				 pgsocket ListenSocket[], int MaxListen)
+				 pgsocket ListenSockets[], int *NumListenSockets, int MaxListen)
 {
 	pgsocket	fd;
 	int			err;
@@ -335,7 +336,6 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber,
 	struct addrinfo *addrs = NULL,
 			   *addr;
 	struct addrinfo hint;
-	int			listen_index = 0;
 	int			added = 0;
 	char		unixSocketPath[MAXPGPATH];
 #if !defined(WIN32) || defined(IPV6_V6ONLY)
@@ -401,12 +401,7 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber,
 	}

 		/* See if there is still room to add 1 more socket. */
-		for (; listen_index < MaxListen; listen_index++)
-		{
-			if (ListenSocket[listen_index] == PGINVALID_SOCKET)
-				break;
-		}
-		if (listen_index >= MaxListen)
+		if (*NumListenSockets == MaxListen)
 		{
 			ereport(LOG,
 					(errmsg("could not bind to all requested addresses: MAXLISTEN (%d) exceeded",
@@ -573,7 +568,8 @@ StreamServerPort(int family, const char *hostName, unsigned short portNumber,
 				(errmsg("listening on %s address \"%s\", port %d",
 						familyDesc, addrDesc, (int) portNumber)));

-		ListenSocket[listen_index] = fd;
+		ListenSockets[*NumListenSockets] = fd;
+		(*NumListenSockets)++;
 		added++;
 	}

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index d7bfb28ff3..2659329b82 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -226,7 +226,8 @@ int			ReservedConnections;

 /* The socket(s) we're listening to. */
 #define MAXLISTEN	64
-static pgsocket ListenSocket[MAXLISTEN];
+static int	NumListenSockets = 0;
+static pgsocket *ListenSockets = NULL;

 /* still more option variables */
 bool		EnableSSL = false;
@@ -587,7 +588,6 @@ PostmasterMain(int argc, char *argv[])
 	int			status;
 	char	   *userDoption = NULL;
 	bool		listen_addr_saved = false;
-	int			i;
 	char	   *output_config_variable = NULL;

 	InitProcessGlobals();
@@ -1141,17 +1141,6 @@ PostmasterMain(int argc, char *argv[])
 				 errmsg("could not remove file \"%s\": %m",
 						LOG_METAINFO_DATAFILE)));

-	/*
-	 * Initialize input sockets.
-	 *
-	 * Mark them all closed, and set up an on_proc_exit function that's
-	 * charged with closing the sockets again at postmaster shutdown.
-	 */
-	for (i = 0; i < MAXLISTEN; i++)
-		ListenSocket[i] = PGINVALID_SOCKET;
-
-	on_proc_exit(CloseServerPorts, 0);
-
 	/*
 	 * If enabled, start up syslogger collection
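[Editorial sketch, not part of the original mail.] The refactoring above boils down to one small pattern: replace "scan the array for the first PGINVALID_SOCKET slot" with an explicit element count. A minimal standalone C sketch of that pattern (the names mirror the patch, but the type and helpers here are simplified stand-ins, not the postmaster's real code):

```c
#include <assert.h>

#define MAXLISTEN 64
typedef int pgsocket;               /* simplified stand-in */

static pgsocket ListenSockets[MAXLISTEN];
static int NumListenSockets = 0;    /* statically initialized to 0 */

/* Append a newly opened socket; fails when MAXLISTEN is exceeded. */
static int
add_listen_socket(pgsocket fd)
{
	if (NumListenSockets == MAXLISTEN)
		return -1;              /* "could not bind to all requested addresses" */
	ListenSockets[NumListenSockets++] = fd;
	return 0;
}

/*
 * With an explicit count, a caller (like ConfigurePostmasterWaitSet() in
 * the patch) no longer needs to walk the whole array to count the
 * elements in use.
 */
static int
count_listen_sockets(void)
{
	return NumListenSockets;
}
```

Appending is O(1) and counting is a single read, at the cost of keeping the counter and the array consistent, which is trivial when all mutation goes through one append function.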
Re: Query execution in Perl TAP tests needs work
On Tue, Aug 29, 2023 at 1:48 PM Noah Misch wrote:
> On Mon, Aug 28, 2023 at 05:29:56PM +1200, Thomas Munro wrote:
> > 1. Don't fork anything at all: open (and cache) a connection directly
> > from Perl.
> > 1a. Write xsub or ffi bindings for libpq.  Or vendor (parts) of the
> > popular Perl xsub library?
> > 1b. Write our own mini pure-perl pq client module.  Or vendor (parts)
> > of some existing one.
> > 2. Use long-lived psql sessions.
> > 2a. Something building on BackgroundPsql.
> > 2b. Maybe give psql or a new libpq-wrapper a new low level stdio/pipe
> > protocol that is more fun to talk to from Perl/machines?
>
> (2a) seems adequate and easiest to achieve.  If DBD::Pg were under a
> compatible license, I'd say use it as the vendor for (1a).  Maybe we
> can get it relicensed?  Building a separate way of connecting from
> Perl would be sad.

Here's my minimal POC of 2a.  It only changes ->safe_psql() and not the
various other things like ->psql() and ->poll_query_until().  Hence use
of amcheck/001_verify_heapam as an example: it runs a lot of
safe_psql() queries.  It fails in all kinds of ways if enabled
generally, which would take some investigation (some tests require
there to be no connections at various times, so we'd probably need to
insert disable/re-enable calls at various places).
From 0f6f4b1bdc69a10b1048731d5b1d744567978cce Mon Sep 17 00:00:00 2001
From: Thomas Munro
Date: Tue, 29 Aug 2023 18:12:07 +1200
Subject: [PATCH] XXX cache psql processes

diff --git a/contrib/amcheck/t/001_verify_heapam.pl b/contrib/amcheck/t/001_verify_heapam.pl
index 46d5b53181..316e876c5d 100644
--- a/contrib/amcheck/t/001_verify_heapam.pl
+++ b/contrib/amcheck/t/001_verify_heapam.pl
@@ -18,6 +18,7 @@
 $node = PostgreSQL::Test::Cluster->new('test');
 $node->init;
 $node->append_conf('postgresql.conf', 'autovacuum=off');
 $node->start;
+$node->enable_auto_background_psql(1);
 $node->safe_psql('postgres', q(CREATE EXTENSION amcheck));

 #
diff --git a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
index 924b57ab21..53ee69c801 100644
--- a/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
+++ b/src/test/perl/PostgreSQL/Test/BackgroundPsql.pm
@@ -221,7 +221,7 @@ sub query
 	$output = $self->{stdout};

 	# remove banner again, our caller doesn't care
-	$output =~ s/\n$banner$//s;
+	$output =~ s/\n?$banner$//s;

 	# clear out output for the next query
 	$self->{stdout} = '';
diff --git a/src/test/perl/PostgreSQL/Test/Cluster.pm b/src/test/perl/PostgreSQL/Test/Cluster.pm
index 227c34ab4d..f53844e25e 100644
--- a/src/test/perl/PostgreSQL/Test/Cluster.pm
+++ b/src/test/perl/PostgreSQL/Test/Cluster.pm
@@ -986,6 +986,8 @@ sub stop

 	local %ENV = $self->_get_env();

+	$self->_close_auto_background_psqls;
+
 	$mode = 'fast' unless defined $mode;
 	return 1 unless defined $self->{_pid};
@@ -1025,6 +1027,8 @@ sub reload

 	local %ENV = $self->_get_env();

+	$self->_close_auto_background_psqls;
+
 	print "### Reloading node \"$name\"\n";
 	PostgreSQL::Test::Utils::system_or_bail('pg_ctl', '-D', $pgdata,
 		'reload');
@@ -1049,6 +1053,8 @@ sub restart

 	local %ENV = $self->_get_env(PGAPPNAME => undef);

+	$self->_close_auto_background_psqls;
+
 	print "### Restarting node \"$name\"\n";

 	# -w is now the default but having it here does no harm and helps
@@ -1349,7 +1355,9 @@ sub new
 		_logfile_base =>
 		  "$PostgreSQL::Test::Utils::log_path/${testname}_${name}",
 		_logfile =>
-		  "$PostgreSQL::Test::Utils::log_path/${testname}_${name}.log"
+		  "$PostgreSQL::Test::Utils::log_path/${testname}_${name}.log",
+		_auto_background_psqls => {},
+		_use_auto_background_psqls => 0
 	};

 	if ($params{install_path})
@@ -1508,6 +1516,39 @@ sub _get_env
 	return (%inst_env);
 }

+# Private routine to close and forget any psql processes we may be
+# holding onto.
+sub _close_auto_background_psqls
+{
+	my ($self) = @_;
+	foreach my $dbname (keys %{$self->{_auto_background_psqls}})
+	{
+		my $psql = $self->{_auto_background_psqls}{$dbname};
+		delete $self->{_auto_background_psqls}{$dbname};
+		$psql->quit;
+	}
+}
+
+=pod
+
+=item enable_auto_background_psql()
+
+Enable or disable the automatic reuse of long-lived psql sessions.
+
+=cut
+
+sub enable_auto_background_psql
+{
+	my $self = shift;
+	my $value = shift;
+
+	$self->{_use_auto_background_psqls} = $value;
+	if (!$value)
+	{
+		$self->_close_auto_background_psqls;
+	}
+}
+
 # Private routine to get an installation path qualified command.
 #
 # IPC::Run maintains a cache, %cmd_cache, mapping commands to paths.  Tests
@@ -1744,13 +1785,32 @@ sub safe_psql

 	my ($stdout, $stderr);

-	my $ret = $self->psql(
-		$dbname, $sql,
-		%params,
-		stdout => \$stdout,
-		stderr => \$stderr,
-		on_error_die => 1,
-		on_error_stop => 1);
+
+	# If there are no special parameters, and re-use of sessions
+	# hasn't been disabled, create or re-use a long-lived psql
+	# process.
+	if (!%params &&
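[Editorial sketch, not part of the original mail.] The POC's caching scheme can be summarized language-independently: keep one long-lived session per database name, reuse it for subsequent queries, and drop every cached session whenever the node stops, reloads, or restarts. A tiny C sketch of that lifecycle (the Session struct and every name here are hypothetical illustrations of the idea, not the Perl API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_SESSIONS 8

typedef struct
{
	char	dbname[64];
	bool	open;
	int		queries_run;        /* stands in for a live psql handle */
} Session;

static Session sessions[MAX_SESSIONS];

/* Analogue of the POC's lookup in _auto_background_psqls: reuse or create. */
static Session *
get_session(const char *dbname)
{
	for (int i = 0; i < MAX_SESSIONS; i++)
		if (sessions[i].open && strcmp(sessions[i].dbname, dbname) == 0)
			return &sessions[i];        /* cache hit: reuse the session */

	for (int i = 0; i < MAX_SESSIONS; i++)
		if (!sessions[i].open)
		{
			snprintf(sessions[i].dbname, sizeof(sessions[i].dbname),
					 "%s", dbname);
			sessions[i].open = true;
			sessions[i].queries_run = 0;
			return &sessions[i];        /* cache miss: "spawn" a session */
		}
	return NULL;                        /* cache full */
}

/* Analogue of _close_auto_background_psqls, called on stop/reload/restart. */
static void
close_all_sessions(void)
{
	for (int i = 0; i < MAX_SESSIONS; i++)
		sessions[i].open = false;
}
```

The key design point is the same one the POC's stop/reload/restart hooks address: cached sessions must be invalidated at every event that makes them stale, otherwise later queries silently talk to a dead server.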
Re: Autogenerate some wait events code and documentation
On Tue, Aug 29, 2023 at 08:17:10AM +0200, Drouvot, Bertrand wrote:
> Agree that done that way one could easily grep the events from the
> source code and match them with wait_event_names.txt.  Then I don't
> think the "search" issue in the code is still a concern with the
> current proposal.

It could still be able to append WAIT_EVENT_ to the first column of the
file.  I'd just rather keep it shorter.

> FWIW, I'm getting:
>
> $ git am v3-000*
> Applying: Rename wait events with more consistent camelcase style
> Applying: Remove column for wait event names in wait_event_names.txt
> error: patch failed: src/backend/utils/activity/wait_event_names.txt:261
> error: src/backend/utils/activity/wait_event_names.txt: patch does not apply
> Patch failed at 0002 Remove column for wait event names in
> wait_event_names.txt

That may be a bug in the matrix because of bb90022, as git am can be
easily pissed.  I am attaching a new patch series, but it does not seem
to matter here.  I have double-checked the docs generated, while on it,
and I am not seeing anything missing, particularly for the LWLock and
Lock parts..

--
Michael

From 7e221db0fe26f9d36df90202934f4177daa86ff7 Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Mon, 28 Aug 2023 14:47:13 +0900
Subject: [PATCH v4 1/2] Rename wait events with more consistent camelcase
 style

This will help in the introduction of more simplifications with the
generation of wait event data using wait_event_names.txt.  The event
names used the same case-insensitive characters, hence applying lower()
or upper() to the monitoring queries allows the detection of the same
events as before this change.
---
 src/backend/libpq/pqmq.c                        |  2 +-
 src/backend/storage/ipc/shm_mq.c                |  6 +-
 .../utils/activity/wait_event_names.txt         | 90 +--
 3 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
index 253181f47a..38b042804c 100644
--- a/src/backend/libpq/pqmq.c
+++ b/src/backend/libpq/pqmq.c
@@ -182,7 +182,7 @@ mq_putmessage(char msgtype, const char *s, size_t len)
 			break;

 		(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-						 WAIT_EVENT_MQ_PUT_MESSAGE);
+						 WAIT_EVENT_MESSAGE_QUEUE_PUT_MESSAGE);

 		ResetLatch(MyLatch);
 		CHECK_FOR_INTERRUPTS();
 	}
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index d134a09a19..06d6b73527 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -1017,7 +1017,7 @@ shm_mq_send_bytes(shm_mq_handle *mqh, Size nbytes, const void *data,
 			 * cheaper than setting one that has been reset.
 			 */
 			(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-							 WAIT_EVENT_MQ_SEND);
+							 WAIT_EVENT_MESSAGE_QUEUE_SEND);

 			/* Reset the latch so we don't spin. */
 			ResetLatch(MyLatch);
@@ -1163,7 +1163,7 @@ shm_mq_receive_bytes(shm_mq_handle *mqh, Size bytes_needed, bool nowait,
 			 * setting one that has been reset.
 			 */
 			(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-							 WAIT_EVENT_MQ_RECEIVE);
+							 WAIT_EVENT_MESSAGE_QUEUE_RECEIVE);

 			/* Reset the latch so we don't spin. */
 			ResetLatch(MyLatch);
@@ -1252,7 +1252,7 @@ shm_mq_wait_internal(shm_mq *mq, PGPROC **ptr, BackgroundWorkerHandle *handle)

 		/* Wait to be signaled. */
 		(void) WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, 0,
-						 WAIT_EVENT_MQ_INTERNAL);
+						 WAIT_EVENT_MESSAGE_QUEUE_INTERNAL);

 		/* Reset the latch so we don't spin. */
 		ResetLatch(MyLatch);
diff --git a/src/backend/utils/activity/wait_event_names.txt b/src/backend/utils/activity/wait_event_names.txt
index 13774254d2..0cace8f40a 100644
--- a/src/backend/utils/activity/wait_event_names.txt
+++ b/src/backend/utils/activity/wait_event_names.txt
@@ -45,15 +45,15 @@ Section: ClassName - WaitEventActivity

 WAIT_EVENT_ARCHIVER_MAIN	ArchiverMain	"Waiting in main loop of archiver process."
-WAIT_EVENT_AUTOVACUUM_MAIN	AutoVacuumMain	"Waiting in main loop of autovacuum launcher process."
-WAIT_EVENT_BGWRITER_HIBERNATE	BgWriterHibernate	"Waiting in background writer process, hibernating."
-WAIT_EVENT_BGWRITER_MAIN	BgWriterMain	"Waiting in main loop of background writer process."
+WAIT_EVENT_AUTOVACUUM_MAIN	AutovacuumMain	"Waiting in main loop of autovacuum launcher process."
+WAIT_EVENT_BGWRITER_HIBERNATE	BgwriterHibernate	"Waiting in background writer process, hibernating."
+WAIT_EVENT_BGWRITER_MAIN	BgwriterMain	"Waiting in main loop of background writer process."
 WAIT_EVENT_CHECKPOINTER_MAIN	CheckpointerMain	"Waiting in main loop of checkpointer process."
 WAIT_EVENT_LOGICAL_APPLY_MAIN	LogicalApplyMain	"Waiting in main loop of logical replication apply process."
 WAIT_EVENT_LOGICAL_LAUNCHER_MAIN	LogicalLauncherMain	"Waiting in main loop of logical replication launcher process."
 WAIT_EVENT_LOGICAL_PARALLEL_APPLY_MAIN	LogicalParallelApplyMain	"Waiting in
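[Editorial sketch, not part of the original mail.] The generation rule discussed in this thread — "the wait event names like 'FooBar' are generated from the enums by splitting using their underscores and doing some lc()" — is easy to sketch in C (the real logic lives in a Perl build script; this function is only an illustration of the rule):

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Derive a camelcase wait event name from an enum suffix: split on
 * underscores, capitalize the first letter of each part, lowercase the
 * rest.  E.g. AUTOVACUUM_MAIN -> AutovacuumMain.
 */
static void
camelcase_from_enum(const char *enum_suffix, char *out, size_t outlen)
{
	size_t	j = 0;
	int		start_of_word = 1;

	if (outlen == 0)
		return;

	for (const char *p = enum_suffix; *p && j + 1 < outlen; p++)
	{
		if (*p == '_')
		{
			start_of_word = 1;  /* underscores are dropped */
			continue;
		}
		out[j++] = start_of_word ? (char) toupper((unsigned char) *p)
								 : (char) tolower((unsigned char) *p);
		start_of_word = 0;
	}
	out[j] = '\0';
}
```

Note how this rule forces the renames in the patch above: the old hand-written "BgWriterMain" cannot be derived from BGWRITER_MAIN by any per-word rule, so it becomes "BgwriterMain".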
Re: Logger process and "LOG: could not close client or listen socket: Bad file descriptor"
On Tue, Aug 29, 2023 at 09:27:48AM +0300, Heikki Linnakangas wrote:
> Just to close the loop on this thread: I committed and backpatched
> Michael's fix.

Thanks!

--
Michael
Re: Logger process and "LOG: could not close client or listen socket: Bad file descriptor"
On 29/08/2023 06:18, Peter Geoghegan wrote:
> On Sun, Aug 27, 2023 at 5:52 PM Michael Paquier wrote:
>> From what I can see, this is a rather old issue, because
>> ListenSocket[] is filled with PGINVALID_SOCKET *after* starting the
>> syslogger.  It seems to me that we should just initialize the array
>> before starting the syslogger, so as we don't get these incorrect
>> logs?  Thoughts?  Please see the attached.
>
> Agreed, this is very annoying.  I'm going to start using your patch
> with the feature branch I'm working on.  Hopefully that won't be
> necessary for too much longer.

Just to close the loop on this thread: I committed and backpatched
Michael's fix.

Discussion on other thread at
https://www.postgresql.org/message-id/9caed67f-f93e-5701-8c25-265a2b139ed0%40iki.fi.

--
Heikki Linnakangas
Neon (https://neon.tech)
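[Editorial sketch, not part of the original mail.] The bug fixed here is easy to reproduce in miniature: a statically allocated socket array is zero-filled, and 0 is a valid file descriptor (stdin), so "closing all listen sockets" before the array has been set to the invalid-socket sentinel means closing fd 0 over and over. A hedged sketch (the array and helpers are simplified stand-ins for the postmaster's code):

```c
#include <assert.h>

#define MAXLISTEN 64
#define PGINVALID_SOCKET (-1)
typedef int pgsocket;

static pgsocket listen_sockets[MAXLISTEN];  /* static storage => all zeros */

/* The fix: mark every slot closed *before* any child process is forked. */
static void
init_listen_sockets(void)
{
	for (int i = 0; i < MAXLISTEN; i++)
		listen_sockets[i] = PGINVALID_SOCKET;
}

/*
 * How many close() calls would a forked child issue?  On the zeroed,
 * uninitialized array every slot "holds" fd 0, so the child would try to
 * close stdin MAXLISTEN times -- the symptom reported in this thread.
 */
static int
count_close_attempts(void)
{
	int		n = 0;

	for (int i = 0; i < MAXLISTEN; i++)
		if (listen_sockets[i] != PGINVALID_SOCKET)
			n++;
	return n;
}
```

Ordering is the whole fix: once the sentinel initialization happens before the first fork, the child's cleanup loop finds nothing to close.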
Re: [PoC] pg_upgrade: allow to upgrade publisher node
On Mon, Aug 28, 2023 at 1:01 PM Peter Smith wrote:
>
> Hi, here are my review comments for v26-0003
>
> It seems I must defend some of my previous suggestions from v25* [1],
> so here goes...
>
> ======
> src/bin/pg_upgrade/check.c
>
> 1. check_and_dump_old_cluster
>
> CURRENT CODE (with v26-0003 patch applied)
>
> /* Extract a list of logical replication slots */
> get_old_cluster_logical_slot_infos();
>
> ...
>
> /*
>  * Logical replication slots can be migrated since PG17. See comments atop
>  * get_old_cluster_logical_slot_infos().
>  */
> if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
> {
>     check_old_cluster_for_lost_slots();
>
>     /*
>      * Do additional checks if a live check is not required. This requires
>      * that confirmed_flush_lsn of all the slots is the same as the latest
>      * checkpoint location, but it would be satisfied only when the server
>      * has been shut down.
>      */
>     if (!live_check)
>         check_old_cluster_for_confirmed_flush_lsn();
> }
>
> SUGGESTION
>
> /*
>  * Logical replication slots can be migrated since PG17. See comments atop
>  * get_old_cluster_logical_slot_infos().
>  */
> if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700) // NOTE 1a.
> {
>     /* Extract a list of logical replication slots */
>     get_old_cluster_logical_slot_infos();
>
>     if (count_old_cluster_slots()) // NOTE 1b.
>     {
>         check_old_cluster_for_lost_slots();
>
>         /*
>          * Do additional checks if a live check is not required. This requires
>          * that confirmed_flush_lsn of all the slots is the same as the latest
>          * checkpoint location, but it would be satisfied only when the server
>          * has been shut down.
>          */
>         if (!live_check)
>             check_old_cluster_for_confirmed_flush_lsn();
>     }
> }

I think a slightly better way to achieve this is to combine the code
from check_old_cluster_for_lost_slots() and
check_old_cluster_for_confirmed_flush_lsn() into
check_old_cluster_for_valid_slots().  That will even save us a new
connection for the second check.

Also, I think we can simplify another check in the patch:

@@ -1446,8 +1446,10 @@ check_new_cluster_logical_replication_slots(void)
 	char	   *wal_level;

 	/* Logical slots can be migrated since PG17. */
-	if (GET_MAJOR_VERSION(old_cluster.major_version) >= 1700)
-		nslots = count_old_cluster_logical_slots();
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 1600)
+		return;
+
+	nslots = count_old_cluster_logical_slots();

--
With Regards,
Amit Kapila.
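[Editorial sketch, not part of the original mail.] The suggested "<= 1600" gate works because pg_upgrade compares truncated major version numbers. A minimal sketch of that check (the macro is a simplified stand-in for pg_upgrade's GET_MAJOR_VERSION, which divides the full version number by 100, so e.g. 160002 becomes 1600):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in: PostgreSQL encodes 16.2 as 160002, 17.0 as 170000. */
#define GET_MAJOR_VERSION(v) ((v) / 100)

/* Logical slots can be migrated since PG17, so bail out early otherwise. */
static bool
can_migrate_logical_slots(int old_cluster_version)
{
	if (GET_MAJOR_VERSION(old_cluster_version) <= 1600)
		return false;
	return true;
}
```

Writing the gate as an early "return" on "<= 1600" (rather than wrapping the rest of the function in ">= 1700") is exactly the style preference from the review: old-version clusters are rejected up front and the main body stays unindented.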
Re: Use FD_CLOEXEC on ListenSockets (was Re: Refactoring backend fork+exec code)
On 29/08/2023 01:28, Michael Paquier wrote:
> In case you've not noticed:
> https://www.postgresql.org/message-id/zovvuqe0rdj2s...@paquier.xyz
> But it does not really matter now ;)

Ah sorry, missed that thread.

>> Yes, so it seems.  Syslogger is started before the ListenSockets
>> array is initialized, so it's still all zeros.  When syslogger is
>> forked, the child process tries to close all the listen sockets,
>> which are all zeros.  So syslogger calls close(0) MAXLISTEN (64)
>> times.
>>
>> Attached patch moves the array initialization earlier.
>
> Yep, I've reached the same conclusion.  Wouldn't it be cleaner to move
> the callback registration of CloseServerPorts() closer to the array
> initialization, though?

Ok, pushed that way.  I checked the history of this: it goes back to
commit 9a86f03b4e in version 13.  The SysLogger_Start() call used to be
later, after setting up ListenSockets, but that commit moved it.  So I
backpatched this to v13.

>> The first close(0) actually does have an effect: it closes stdin,
>> which is fd 0.  That is surely accidental, but I wonder if we should
>> indeed close stdin in child processes?  The postmaster process
>> doesn't do anything with stdin either, although I guess a library
>> might try to read a passphrase from stdin before starting up, for
>> example.
>
> We would have heard about that, wouldn't we?  I may be missing
> something of course, but on HEAD, the array initialization is done
> before starting any child processes, and the syslogger is the first
> one forked.

Yes, syslogger is the only process that closes stdin.  After moving the
initialization, it doesn't close it either.

Thinking about this some more, the ListenSockets array is a bit silly
anyway.  We fill the array starting from index 0, always append to the
end, and never remove entries from it.  It would seem more
straightforward to keep track of the used size of the array.  Currently
we always loop through the unused parts too, and e.g.
ConfigurePostmasterWaitSet() needs to walk the array to count how many
elements are in use.

--
Heikki Linnakangas
Neon (https://neon.tech)
Re: Autogenerate some wait events code and documentation
Hi,

On 8/28/23 10:04 AM, Michael Paquier wrote:
> On Mon, Jul 17, 2023 at 10:16:02AM +0900, Michael Paquier wrote:
>> So you mean to switch a line that now looks like that:
>> WAIT_EVENT_FOO_BAR	FooBar	"Waiting on Foo Bar."
>> To that:
>> FOO_BAR	"Waiting on Foo Bar."
>> Or even that:
>> WAIT_EVENT_FOO_BAR	"Waiting on Foo Bar."
>>
>> Sure, it is an improvement for any wait events that use WAIT_EVENT_
>> when searching them, but this adds more magic into the LWLock and
>> Lock areas if the same conversion is applied there.  Or am I right to
>> assume that you'd mean to *not* do any of that for these two classes?
>> These can be treated as exceptions in the script when generating the
>> wait event names from the enum elements, of course.
>
> I have looked again at that, and switching wait_event_names.txt to use
> two columns made of the typedef definitions and the docs like that is
> not a problem:
> FOO_BAR	"Waiting on Foo Bar."
>
> WAIT_EVENT_ is appended to the typedef definitions in the script.  The
> wait event names like "FooBar" are generated from the enums by
> splitting using their underscores and doing some lc().  Lock and
> LWLock don't need to change.  This way, it is easy to grep the wait
> events from the source code and match them with wait_event_names.txt.
>
> Thoughts or comments?

Agree that done that way one could easily grep the events from the
source code and match them with wait_event_names.txt.  Then I don't
think the "search" issue in the code is still a concern with the
current proposal.

FWIW, I'm getting:

$ git am v3-000*
Applying: Rename wait events with more consistent camelcase style
Applying: Remove column for wait event names in wait_event_names.txt
error: patch failed: src/backend/utils/activity/wait_event_names.txt:261
error: src/backend/utils/activity/wait_event_names.txt: patch does not apply
Patch failed at 0002 Remove column for wait event names in
wait_event_names.txt

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com