Re: using an end-of-recovery record in all cases

2022-04-19 Thread Amul Sul
 On Tue, Apr 19, 2022 at 2:14 AM Robert Haas  wrote:
>
> On Sat, Jan 15, 2022 at 11:52 PM Julien Rouhaud  wrote:
> > The cfbot reports that this version of the patch doesn't apply anymore:
>
> Here is a new version of the patch which, unlike v1, I think is
> something we could seriously consider applying (not before v16, of
> course). It now removes CHECKPOINT_END_OF_RECOVERY completely, and I
> attach a second patch as well which nukes checkPoint.PrevTimeLineID as
> well.
>
> I mentioned two problems with $SUBJECT in the first email with this
> subject line. One was a bug, which Noah has since fixed (thanks,
> Noah). The other problem is that LogStandbySnapshot() and a bunch of
> its friends expect latestCompletedXid to always be a normal XID, which
> is a problem because (1) we currently set nextXid to 3 and (2) at
> startup, we compute latestCompletedXid = nextXid - 1. The current code
> dodges this kind of accidentally: the checkpoint that happens at
> startup is a "shutdown checkpoint" and thus skips logging a standby
> snapshot, since a shutdown checkpoint is a sure indicator that there
> are no running transactions. With the changes, the checkpoint at
> startup happens after we've started allowing write transactions, and
> thus a standby snapshot needs to be logged also. In the cases where
> the end-of-recovery record was already being used, the problem could
> have happened already, except for the fact that those cases involve a
> standby promotion, which doesn't happen during initdb. I explored a
> few possible ways of solving this problem.
>
> The first thing I considered was replacing latestCompletedXid with a
> name like incompleteXidHorizon. The idea is that, where
> latestCompletedXid is the highest XID that is known to have committed
> or aborted, incompleteXidHorizon would be the lowest XID such that all
> known commits or aborts are for lower XIDs. In effect,
> incompleteXidHorizon would be latestCompletedXid + 1. Since
> latestCompletedXid is always normal except when it's 2,
> incompleteXidHorizon would be normal in all cases. Initially this
> seemed fairly promising, but it kind of fell down when I realized that
> we copy latestCompletedXid into
> ComputeXidHorizonsResult.latest_completed. That seemed to me to make
> the consequences of the change a bit more far-reaching than I liked.
> Also, it wasn't entirely clear to me that I wouldn't be introducing
> any off-by-one errors into various wraparound calculations with this
> approach.
>
> The second thing I considered was skipping LogStandbySnapshot() during
> initdb. There are two ways of doing this and neither of them seem that
> great to me. Something that does work is to skip LogStandbySnapshot()
> when in single user mode, but there is no particular reason why WAL
> generated in single user mode couldn't be replayed on a standby, so
> this doesn't seem great. It's too big a hammer for what we really
> want, which is just to suppress this during initdb. Another way of
> approaching it is to skip it in bootstrap mode, but that actually
> doesn't work: initdb then fails during the post-bootstrapping step
> rather than during bootstrapping. I thought about patching around that
> by forcing the code that generates checkpoint records to forcibly
> advance an XID of 3 to 4, but that seemed like working around the
> problem from the wrong end.
>
> So ... I decided that the best approach here seems to be to just skip
> FirstNormalTransactionId and use FirstNormalTransactionId + 1 for the
> first write transaction that the cluster ever processes. That's very
> simple and doesn't seem likely to break anything else. On the downside
> it seems a bit grotty, but I don't see anything better, and on the
> whole, I think with this approach we come out substantially ahead.

IIUC, the failure was something like this on initdb:

running bootstrap script ... TRAP:
FailedAssertion("TransactionIdIsNormal(CurrentRunningXacts->latestCompletedXid)",
File: "procarray.c", Line: 2892, PID: 60363)

/bin/postgres(ExceptionalCondition+0xb9)[0xb3917d]
/bin/postgres(GetRunningTransactionData+0x36c)[0x96aa26]
/bin/postgres(LogStandbySnapshot+0x64)[0x974393]
/bin/postgres(CreateCheckPoint+0x67f)[0x5928bf]
/bin/postgres(RequestCheckpoint+0x26)[0x8ca649]
/bin/postgres(StartupXLOG+0xf51)[0x591126]
/bin/postgres(InitPostgres+0x188)[0xb4f2ac]
/bin/postgres(BootstrapModeMain+0x4d3)[0x5ac6de]
/bin/postgres(main+0x275)[0x7ca72e]
/lib64/libc.so.6(__libc_start_main+0xf5)[0x7f71af82d445]
/bin/postgres[0x48aae9]
child process was terminated by signal 6: Aborted
initdb: removing data directory "/inst/data"

That was happening because RequestCheckpoint() was called from StartupXLOG()
unconditionally, but with the v5 patch that is not true.

If my understanding is correct then we don't need any handling
for latestCompletedXid, at least in this patch.
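
(For context on why the assertion can fire at all: at that point nextXid is
still the very first XID, so the computed latestCompletedXid is not a normal
XID.  A minimal sketch of the arithmetic, using the constants from
access/transam.h; this is illustration only, not code from the patch:)

    #include "postgres.h"
    #include "access/transam.h"

    /*
     * Sketch only: during initdb the next XID is FirstNormalTransactionId (3),
     * so "latestCompletedXid = nextXid - 1" lands on FrozenTransactionId (2),
     * which TransactionIdIsNormal() rejects -- hence the TRAP above when a
     * standby snapshot gets logged at that point.
     */
    static void
    bootstrap_xid_sketch(void)
    {
        TransactionId nextXid = FirstNormalTransactionId;    /* 3 */
        TransactionId latestCompletedXid = nextXid - 1;      /* 2 == FrozenTransactionId */

        Assert(TransactionIdIsNormal(latestCompletedXid));   /* fails */
    }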

Regards,
Amul




Re: Stabilizing the test_decoding checks, take N

2022-04-19 Thread Amit Kapila
On Tue, Apr 19, 2022 at 3:16 PM Amit Kapila  wrote:
>
> On Tue, Apr 19, 2022 at 11:38 AM Dilip Kumar  wrote:
> >
> > On Mon, Apr 18, 2022 at 3:29 PM Dilip Kumar  wrote:
> >
> > >
> > > This needs to be verified once by doing some manual testing as it may
> > > not be easily reproducible every time. If this happens to be true then
> > > I think your suggestion related to increasing autovacuum_naptime would
> > > work.
> > >
> > >
> > > I will try to reproduce this, maybe by reducing the autovacuum_naptime or 
> > > parallelly running some script that continuously performs DDL-only 
> > > transactions.
> >
> > I have reproduced it [1] by repeatedly running the attached
> > script (stream.sql) from one session and running a vacuum analyze in
> > parallel from another session.
> >
> > I have also changed the config for testing decoding to set the
> > autovacuum_naptime to 1d (patch attached)
> >
>
> Thanks, I am also able to see similar results. This shows the analysis
> was right. I will push the autovacuum_naptime change in HEAD and 14
> (as both contain this test) tomorrow unless someone thinks otherwise.
>

Pushed.

-- 
With Regards,
Amit Kapila.




Re: Handle infinite recursion in logical replication setup

2022-04-19 Thread Peter Smith
Below are my review comments for the v9-0002 patch (TAP test part only).

(The order of my comments is a bit muddled because I jumped around in
the file a bit while reviewing it).

==

1. create_subscription - missing comment.

+sub create_subscription
+{
+ my ($node_subscriber, $node_publisher, $sub_name, $node_connstr,
+ $application_name, $pub_name, $copy_data_val)
+   = @_;

Maybe add a comment for this subroutine and describe the expected parameters.

~~

2. create_subscription - hides the options

IMO the "create_subscription" subroutine is hiding too many of the
details, so now it is less clear (from the caller's POV) what the test
is really doing. Perhaps the options could be passed more explicitly
to this subroutine.

e.g. Instead of

create_subscription($node_B, $node_A, $appname_B1, $node_A_connstr,
   $appname_B1, 'tap_pub_A', 'on');

perhaps explicitly set the WITH options like:

create_subscription($node_B, $node_A, $appname_B1, $node_A_connstr,
   $appname_B1, 'tap_pub_A', 'local_only = on, copy_data = on');

~~~

3. the application names are confusing

+my $appname_A1 = 'tap_sub_A_B';
+my $appname_A2 = 'tap_sub_A_C';
+my $appname_B1 = 'tap_sub_B_A';
+my $appname_B2 = 'tap_sub_B_C';
+my $appname_C1 = 'tap_sub_C_A';
+my $appname_C2 = 'tap_sub_C_B';

I found the application names using '1' and '2' to be a bit confusing.
i.e  it was unnecessarily hard to associate them (in my head) with
their relevant subscriptions. IMO it would be easier to name them
using letters like below:

SUGGESTED
my $appname_AB = 'tap_sub_A_B';
my $appname_AC = 'tap_sub_A_C';
my $appname_BA = 'tap_sub_B_A';
my $appname_BC = 'tap_sub_B_C';
my $appname_CA = 'tap_sub_C_A';
my $appname_CB = 'tap_sub_C_B';

~~~

4. create_subscription - passing the $appname 2x.

+create_subscription($node_B, $node_A, $appname_B1, $node_A_connstr,
+   $appname_B1, 'tap_pub_A', 'on');


It seemed confusing that the $app_name is passed twice.

IMO we should rename all those $appname_XX vars (see previous comment) to
be $subname_XX, and just pass that to create_subscription instead.

my $subname_AB = 'tap_sub_A_B';
my $subname_AC = 'tap_sub_A_C';
my $subname_BA = 'tap_sub_B_A';
my $subname_BC = 'tap_sub_B_C';
my $subname_CA = 'tap_sub_C_A';
my $subname_CB = 'tap_sub_C_B';

Then create_subscription subroutine should have one less argument.
Just add a comment saying that the appname is always assigned the same
value as the subname.

~~~

5. cleanup part - seems a bit misleading

+# cleanup
+$node_B->safe_psql(
+   'postgres', "
+DROP SUBSCRIPTION $appname_B2");
+$node_C->safe_psql(
+ 'postgres', "
+DELETE FROM tab_full");
+$node_B->safe_psql(
+ 'postgres', "
+DELETE FROM tab_full where a = 13");

Is that comment misleading? IIUC this is not really cleaning up
everything. It is just a cleanup of the previous Node_C test part
isn't it?

~~~

6. Error case (when copy_data is true)

+# Error when creating susbcription with local_only and copy_data as true when
+# the publisher has replicated data

6a. Typo "susbcription"

6b. That comment maybe needs some more explanation - e.g. say that
since Node_A is already subscribing to Node_B, when Node_B makes
another subscription to Node_A the copy doesn't really know whether the
data really originated from Node_A or not...

6c. Maybe the comment needed to be more like ## style to
denote this (and the one that follows) is a separate test case.

~~~

7. Error case (when copy_data is force)

+# Creating subscription with local_only and copy_data as force should be
+# successful when the publisher has replicated data

7a. Same typo "subscription"

~~~

8. Add 3rd node when the existing node has some data

8a. Maybe that comment needs to be expanded more. This isn't just
"adding a node" - you are joining it to the others as another
bi-directional participant in a 3-way group. And the comment should
clarify exactly which nodes have the initial data. Both the existing 2
you are joining to? The new one only? All of them?

8b. IMO it would be much clearer to do SELECT from all the nodes at
the start of this test just to re-confirm what is all the initial data
on these nodes before joining the 3rd node.

NOTE - These same review comments apply to the other test combinations too
- Add 3rd node when the existing node has no data
- Add 3rd node when the new node has some data

~~~

9. Inserted data

+# insert some data in all the nodes
+$node_A->safe_psql('postgres', "INSERT INTO tab_full VALUES (13);");
+$node_B->safe_psql('postgres', "INSERT INTO tab_full VALUES (21);");
+$node_C->safe_psql('postgres', "INSERT INTO tab_full VALUES (31);");

They seemed like strange values (13, 21, 31) to add. I couldn't work out
the "meaning" of them.

Wouldn't values like 13, 23, 33 make more sense (e.g. the pattern is 10 x
node# + something)?

~~~

10. verify_data($node_A, $node_B, $node_C);

All the expected values are too buried in this subroutine which makes
it hard to read. I think it wo

Re: Intermittent buildfarm failures on wrasse

2022-04-19 Thread Masahiko Sawada
On Wed, Apr 20, 2022 at 3:29 AM Tom Lane  wrote:
>
> Alvaro Herrera  writes:
> > On 2022-Apr-15, Tom Lane wrote:
> >> Here's a WIP patch for that.  The only exciting thing in it is that
> >> because of some undocumented cowboy programming in walsender.c, the
> >> Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
> >> in ProcArrayInstallRestoredXmin fires unless we skip that.
>
> > Hmm, maybe a better use of that define is to use to select which flags
> > to copy, rather than to ensure we they are the only ones set.  What
> > about this?
>
> Yeah, I thought about that too, but figured that the author probably
> had a reason for writing the assertion the way it was.

The motivation behind the assertion was that when we copy the whole
statusFlags from the leader process to the worker process, we want to
make sure that the flags we're copying are a known subset of the flags
that are valid to copy from the leader.

> If we want
> to redefine PROC_COPYABLE_FLAGS as "flags associated with xmin",
> that's fine by me.  But I'd suggest that both the name of the macro
> and the comment for it in proc.h should be revised to match that
> definition.
>
> Another point is that as you've coded it, the code doesn't so much
> copy those flags as union them with whatever the recipient had,
> which seems wrong.  I could go with either
>
> Assert(!(MyProc->statusFlags & PROC_XMIN_FLAGS));
> MyProc->statusFlags |= (proc->statusFlags & PROC_XMIN_FLAGS);
>
> or
>
> MyProc->statusFlags = (MyProc->statusFlags & ~PROC_XMIN_FLAGS) |
>   (proc->statusFlags & PROC_XMIN_FLAGS);
>
> Perhaps the latter is more future-proof.

Copying only xmin-related flags in this way also makes sense to me, and
there is no problem at least for now. One note: when we introduce a new
flag that needs to be copied in the future, we need to make sure to add it
to PROC_XMIN_FLAGS so it is copied. Otherwise, an issue similar to the one
we fixed in 0f0cfb494004befb0f6e could happen again.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Kyotaro Horiguchi
At Tue, 19 Apr 2022 10:55:26 -0700, Andres Freund  wrote in 
> Hi,
> 
> On 2022-04-19 10:36:24 -0700, Andres Freund wrote:
> > On 2022-04-19 13:50:25 +0200, Erik Rijkers wrote:
> > > The 12th run of statbug.sh crashed and gave a corefile.
> > 
> > I ran through quite a few iterations by now, without reproducing :(
> > 
> > I guess there's some timing issue and you're hitting on your system
> > due to the slower disks.
> 
> Ah. I found the issue. The new pgstat_report_stat(true) call in
> LogicalRepApplyLoop()'s "timeout" section doesn't check if we're in a
> transaction. And the transactional stats code doesn't handle that (never
> has).
> 
> I think all that's needed is a if (IsTransactionState()) around that
> pgstat_report_stat().

if (!IsTransactionState()) ?

> It might be possible to put an assertion into pgstat_report_stat(), but
> I need to look at the process exit code to see if it is.

Inserting a sleep in pgoutput_commit_txn reproduced this. It crashes with
the same stack trace and a similar variable state.

diff --git a/src/backend/replication/pgoutput/pgoutput.c b/src/backend/replication/pgoutput/pgoutput.c
index b197bfd565..def4d751d3 100644
--- a/src/backend/replication/pgoutput/pgoutput.c
+++ b/src/backend/replication/pgoutput/pgoutput.c
@@ -568,6 +568,7 @@ pgoutput_commit_txn(LogicalDecodingContext *ctx, ReorderBufferTXN *txn,
 		return;
 	}
 
+	sleep(2);
 	OutputPluginPrepareWrite(ctx, true);
 	logicalrep_write_commit(ctx->out, txn, commit_lsn);
 	OutputPluginWrite(ctx, true);

The following actually works for this.

diff --git a/src/backend/replication/logical/worker.c b/src/backend/replication/logical/worker.c
index 4171371296..f4e5359513 100644
--- a/src/backend/replication/logical/worker.c
+++ b/src/backend/replication/logical/worker.c
@@ -2882,10 +2882,11 @@ LogicalRepApplyLoop(XLogRecPtr last_received)
 			send_feedback(last_received, requestReply, requestReply);
 
 			/*
-			 * Force reporting to ensure long idle periods don't lead to
-			 * arbitrarily delayed stats.
+			 * Force reporting to ensure long out-of-transaction idle periods
+			 * don't lead to arbitrarily delayed stats.
 			 */
-			pgstat_report_stat(true);
+			if (!IsTransactionState())
+				pgstat_report_stat(true);
 		}
 	}
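
An assertion along the lines Andres mentions could also catch this earlier.
A rough sketch of what it might look like near the top of
pgstat_report_stat() (hypothetical, not a tested fix, and as Andres notes it
may not hold during process exit):

    /*
     * Hypothetical guard: pending transactional stats entries are still tied
     * to the open (sub)transaction, so flushing them from inside one is
     * unsafe.  Sketch only.
     */
    Assert(!IsTransactionState());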
 
regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: typos

2022-04-19 Thread Amit Kapila
On Tue, Apr 19, 2022 at 4:35 PM Alvaro Herrera  wrote:
>
> Yeah, more invasive rewording seems called for.  I propose this:
>
>For publications containing partitioned tables, the row filter for each
>partition is taken from the published partitioned table if the
>publication parameter publish_via_partition_root is 
> true,
>or from the partition itself if it is false (the default).
>
> I think we should also mention that this parameter affects row filters,
> in the  for the WITH clause.  Currently it has
>
> publish_via_partition_root 
> (boolean)
> 
>  
>   This parameter determines whether changes in a partitioned table (or
>   on its partitions) contained in the publication will be published
>   using the identity and schema of the partitioned table rather than
>   that of the individual partitions that are actually changed; the
>   latter is the default.  Enabling this allows the changes to be
>   replicated into a non-partitioned table or a partitioned table
>   consisting of a different set of partitions.
>  
>
> I propose to add
>
> publish_via_partition_root 
> (boolean)
> 
>  
>   This parameter determines whether changes in a partitioned table (or
>   on its partitions) contained in the publication will be published
>   using the identity and schema of the partitioned table rather than
>   that of the individual partitions that are actually changed; the
>   latter is the default.  Enabling this allows the changes to be
>   replicated into a non-partitioned table or a partitioned table
>   consisting of a different set of partitions.
>  
>
> 
>  This parameter also affects how row filters are chosen for 
> partitions;
>  see below for details.
> 
>

Your proposed changes look good to me but I think all these places
need to mention 'column list' as well because the behavior is the same
for it.

> More generally, I think we need to connect the WHERE keyword with "row
> filters" more explicitly.  Right now, the parameter reference says
>
>   If the optional WHERE clause is specified, rows for
>   which the expression
>   evaluates to false or null will not be published. Note that parentheses
>   are required around the expression. It has no effect on
>   TRUNCATE commands.
>
> I propose to make it "If the optional WHERE clause is specified, it
> defines a row filter expression.  Rows for which
> the row filter expression evaluates to false ..."
>

Looks good to me.

-- 
With Regards,
Amit Kapila.




effective_io_concurrency and NVMe devices

2022-04-19 Thread Bruce Momjian
NVMe devices have a maximum queue length of 64k:

https://blog.westerndigital.com/nvme-queues-explained/

but our effective_io_concurrency maximum is 1,000:

test=> set effective_io_concurrency = 1001;
ERROR:  1001 is outside the valid range for parameter 
"effective_io_concurrency" (0 .. 1000)

Should we increase its maximum to 64k?  Backpatched?  (SATA has a
maximum queue length of 256.)
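
If we did, the change itself looks trivial; a sketch, assuming the ceiling is
still the MAX_IO_CONCURRENCY define in src/include/storage/bufmgr.h that the
GUC's upper bound points at:

    -#define MAX_IO_CONCURRENCY	1000
    +#define MAX_IO_CONCURRENCY	65536	/* match NVMe's 64k queue depth */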

-- 
  Bruce Momjian  https://momjian.us
  EDB  https://enterprisedb.com

  Indecision is a decision.  Inaction is an action.  Mark Batterson





RE: Logical replication timeout problem

2022-04-19 Thread wangw.f...@fujitsu.com
On Mon, Apr 18, 2022 at 00:36 PM Amit Kapila  wrote:
> On Mon, Apr 18, 2022 at 9:29 AM Amit Kapila  wrote:
> >
> > On Thu, Apr 14, 2022 at 5:52 PM Euler Taveira  wrote:
> > >
> > > On Wed, Apr 13, 2022, at 7:45 AM, Amit Kapila wrote:
> > >
> > > Sawada-San, Euler, do you have any opinion on this approach? I
> > > personally still prefer the approach implemented in v10 [1]
> > > especially due to the latest finding by Wang-San that we can't
> > > update the lag-tracker apart from when it is invoked at the transaction 
> > > end.
> > > However, I am fine if we like this approach more.
> > >
> > > It seems v15 is simpler and less error prone than v10. v10 has a mix
> > > of
> > > OutputPluginUpdateProgress() and the new function update_progress().
> > > The v10 also calls update_progress() for every change action in
> > > pgoutput_change(). It is not a good approach for maintainability --
> > > new changes like sequences need extra calls.
> > >
> >
> > Okay, let's use the v15 approach as Sawada-San also seems to have a
> > preference for that.
> >
> > > However, as you mentioned, it should handle the lag tracking case.
> > >
> > > Both patches change the OutputPluginUpdateProgress() so it cannot be
> > > backpatched. Are you planning to backpatch it? If so, the boolean
> > > variable (last_write or end_xacts depending of which version you are
> > > considering) could be added to LogicalDecodingContext.
> > >
> >
> > If we add it to LogicalDecodingContext then I think we have to always
> > reset the variable after its use which will make it look ugly and
> > error-prone. I was not thinking to backpatch it because of the API
> > change but I guess if we want to backpatch then we can add it to
> > LogicalDecodingContext for back-branches. I am not sure if that will
> > look committable but surely we can try.
> >
> 
> Even, if we want to add the variable in the struct in back-branches, we need 
> to
> ensure not to change the size of the struct as it is exposed, see email [1] 
> for a
> similar mistake we made in another case.
> 
> [1] - https://www.postgresql.org/message-
> id/2358496.1649168259%40sss.pgh.pa.us
Thanks for your comments.

I did some checks on adding the new variable to LogicalDecodingContext.
I found that, because of padding, adding the new variable at the end of
the structure does not change the structure size. I verified this in
REL_10~REL_14.

So, as suggested by Euler-San and Amit-San, I wrote the patch for REL_14; it is
attached. To prevent patch confusion, the patch for HEAD is also attached.
The patch for REL_14:
REL_14_v1-0001-Fix-the-logical-replication-timeout-during-large-.patch
The patch for HEAD:
v17-0001-Fix-the-logical-replication-timeout-during-large.patch

The following is the details of checks.
On gcc/Linux/x86-64, in REL_14, by looking at the size of each member variable
in the structure LogicalDecodingContext, I found that there are three runs of
padding after the following member variables:
- 7 bytes after fast_forward
- 4 bytes after prepared_write
- 4 bytes after write_xid

If we add the new variable at the end of the structure (a bool takes one byte),
it only consumes one byte of the padding after the member write_xid, and 3
bytes of padding are still required at the end of the struct. For easy
understanding, please refer to the following simple calculation.
(In REL_14, the size of the structure LogicalDecodingContext is 304 bytes.)
Before adding the new variable (REL_14):
8+8+8+8+8+1+168+8+8+8+8+8+8+8+8+1+1+1+1+8+4 = 289 (if padding is not considered)
+7 +4 +4 = +15 (the padding)
So the size of the structure LogicalDecodingContext is 289+15 = 304.
After adding the new variable (REL_14 with patch):
8+8+8+8+8+1+168+8+8+8+8+8+8+8+8+1+1+1+1+8+4+1 = 290 (if padding is not considered)
+7 +4+3 = +14 (the padding)
So the size of the structure LogicalDecodingContext is 290+14 = 304.

BTW, the size of the structure LogicalDecodingContext in REL_10~REL_13 is 184,
200, 200, 200 respectively. And I found that at the end of the structure
LogicalDecodingContext, there are always the following members:
```
XLogRecPtr  write_location;   --> 8
TransactionId write_xid;  --> 4
  --> There are 4 padding after write_xid.
```
It means that at the end of the structure LogicalDecodingContext there are 4
bytes of padding. So, if we add a new bool variable (one byte) at the end of
the structure LogicalDecodingContext, I think that in the current REL_10~REL_14,
because of padding, the size of the structure LogicalDecodingContext will not
change.
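
To double-check that on a particular platform, a throwaway compile-time check
like the following could be dropped into a build (sketch only; 304 is the
x86-64 Linux figure above, and on older branches StaticAssertStmt inside a
function may be needed instead of StaticAssertDecl):

    #include "postgres.h"
    #include "replication/logical.h"

    /* Sketch: fail the build if adding the bool changed the struct size. */
    StaticAssertDecl(sizeof(LogicalDecodingContext) == 304,
                     "LogicalDecodingContext size changed unexpectedly");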

Regards,
Wang wei


REL_14_v1-0001-Fix-the-logical-replication-timeout-during-large-.patch
Description:  REL_14_v1-0001-Fix-the-logical-replication-timeout-during-large-.patch


v17-0001-Fix-the-logical-replication-timeout-during-large.patch
Description:  v17-0001-Fix-the-logical-replication-timeout-during-large.patch


Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-04-19 Thread Michael Paquier
On Tue, Apr 19, 2022 at 10:12:32PM -0300, Martín Marqués wrote:
> The typo is in `exist in in a running cluster`. There's two `in` in a row.

Thanks, fixed.
--
Michael




Re: Dump/Restore of non-default PKs

2022-04-19 Thread David G. Johnston
On Tue, Apr 19, 2022 at 9:14 AM Simon Riggs 
wrote:

> On Mon, 18 Apr 2022 at 22:05, Simon Riggs 
> wrote:
> >
> > On Mon, 18 Apr 2022 at 21:48, Tom Lane  wrote:
> > >
> > > "David G. Johnston"  writes:
> > > > On Mon, Apr 18, 2022 at 1:00 PM Simon Riggs <
> simon.ri...@enterprisedb.com>
> > > > wrote:
> > > >> I propose that we change pg_dump so that when it creates a PK it
> does
> > > >> so in 2 commands:
> > > >> 1. CREATE [UNIQUE] INDEX iname ...
> > > >> 2. ALTER TABLE .. ADD PRIMARY KEY USING INDEX iname;
> > >
> > > > Why not just get rid of the limitation that constraint definitions
> don't
> > > > support non-default methods?
> > >
> > > That approach would be doubling down on the assumption that we can
> always
> > > shoehorn more custom options into SQL-standard constraint clauses, and
> > > we'll never fall foul of shift/reduce problems or future spec
> additions.
> > > I think for example that USING INDEX TABLESPACE is a blot on humanity,
> > > and I'd be very glad to see pg_dump stop using it in favor of doing
> > > things as Simon suggests.
> >
> > Sigh, agreed. It's more work, but its cleaner in the longer term to
> > separate indexes from constraints.
> >
> > I'll look in more detail and come back here later.
> >
> > Thanks both.
>
> Anyway, the main question is how should the code be structured?
>
>
I don't have a good answer to that question but the patch presently
produces the dump below for a partitioned table with one partition.

After manually adjusting the order of operations you end up with:

psql:/vagrant/pg_dump_indexattach.v1.txt:67: ERROR:  index "parent_pkey" is
not valid
LINE 2: ADD CONSTRAINT parent_pkey PRIMARY KEY USING INDEX paren...
^
Because:

https://www.postgresql.org/docs/current/sql-altertable.html
ADD table_constraint_using_index
...This form is not currently supported on partitioned tables.

David J.

= pg_dump with manual re-ordering of create/alter index before alter
table

CREATE TABLE public.parent (
id integer NOT NULL,
class text NOT NULL
)
PARTITION BY LIST (class);

CREATE TABLE public.parent_a (
id integer NOT NULL,
class text NOT NULL
);

ALTER TABLE public.parent_a OWNER TO vagrant;

ALTER TABLE ONLY public.parent ATTACH PARTITION public.parent_a FOR VALUES
IN ('a');

CREATE UNIQUE INDEX parent_pkey ON ONLY public.parent USING btree (id,
class);

ALTER TABLE ONLY public.parent
ADD CONSTRAINT parent_pkey PRIMARY KEY USING INDEX parent_pkey;

CREATE UNIQUE INDEX parent_a_pkey ON public.parent_a USING btree (id,
class);

ALTER INDEX public.parent_pkey ATTACH PARTITION public.parent_a_pkey;

ALTER TABLE ONLY public.parent_a
ADD CONSTRAINT parent_a_pkey PRIMARY KEY USING INDEX parent_a_pkey;


Re: Handle infinite recursion in logical replication setup

2022-04-19 Thread Peter Smith
Below are my review comments for the v9-0002 patch (except I did not
yet look at the TAP tests).

~~~

1. General comment - describe.c

I wondered why the copy_data enum value is not displayed by the psql
\drs+ command. Should it be?

~~~

2. General comment - SUBOPT_LOCAL_ONLY

@@ -1134,7 +1204,7 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,

  case ALTER_SUBSCRIPTION_SET_PUBLICATION:
  {
- supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH;
+ supported_opts = SUBOPT_COPY_DATA | SUBOPT_REFRESH | SUBOPT_LOCAL_ONLY;
  parse_subscription_options(pstate, stmt->options,
 supported_opts, &opts);

@@ -1236,7 +1308,8 @@ AlterSubscription(ParseState *pstate,
AlterSubscriptionStmt *stmt,
  errmsg("ALTER SUBSCRIPTION ... REFRESH is not allowed for disabled
subscriptions")));

  parse_subscription_options(pstate, stmt->options,
-SUBOPT_COPY_DATA, &opts);
+SUBOPT_COPY_DATA | SUBOPT_LOCAL_ONLY,
+&opts);

I noticed there is some new code that appears to be setting
SUBOPT_LOCAL_ONLY as a supported option. Shouldn't those changes belong
in patch 0001 instead of in this patch 0002?

~~~

3. Commit message - wording

CURRENT
a) Node1 - Publication publishing employee table.
b) Node2 - Subscription subscribing from publication pub1 with
local_only.
c) Node2 - Publication publishing employee table.
d) Node1 - Subscription subscribing from publication pub2 with
local_only.

SUGGESTION (I think below has the same meaning but is simpler)
a) Node1 - PUBLICATION pub1 for the employee table
b) Node2 - SUBSCRIPTION from pub1 with local_only=true
c) Node2 - PUBLICATION pub2 for the employee table
d) Node1 - SUBSCRIPTION from pub2 with local_only=true

~~~

4. Commit message - meaning?

CURRENT
Now when user is trying to add another node Node3 to the
above Multi master logical replication setup:
a) user will have to create one subscription subscribing from Node1 to
Node3
b) user wil have to create another subscription subscribing from
Node2 to Node3 using local_only option and copy_data as true.

While the subscription is created, server will identify that Node2 is
subscribing from Node1 and Node1 is subscribing from Node2 and throw an
error so that user can handle the initial copy data.

~

The wording above confused me. Can you clarify it?
e.g.
a) What exactly was the user hoping to achieve here?
b) Are the user steps doing something deliberately wrong just so you
can describe later that an error gets thrown?

~~~

5. doc/src/sgml/logical-replication.sgml - how to get here?

I didn’t see any easy way to get to this page. (No cross refs from anywhere?)

~~~

6. doc/src/sgml/logical-replication.sgml - section levels

I think the section levels may be a bit messed up. e.g. The HTML
rendering of sections looks a bit confused. Maybe this is the same as my
review comment #13.

~~

7. doc/src/sgml/logical-replication.sgml - headings

Setting Bidirection logical replication between two nodes:

7a. Maybe better just to have a simpler main heading like
"Bidirectional logical replication".

7b. Don't put ":" in the title.

~~~

8. doc/src/sgml/logical-replication.sgml - missing intro

IMO this page needs some sort of introduction/blurb instead of leaping
straight into examples without any preamble text to give context.

~~~

9. doc/src/sgml/logical-replication.sgml - bullets

Suggest removing all the bullets from the example steps. (I had
something similar a while ago for the "Row Filter" page but
eventually, they all had to be removed).

~~~

10. doc/src/sgml/logical-replication.sgml - SQL too long

Many of the example commands are much too long, and for me, they are
giving scroll bars when rendered. It would be better if they can be
wrapped in appropriate places so they are easier to read (and no resulting
scroll bars).

~~~

11. doc/src/sgml/logical-replication.sgml - add the psql prompt

IMO it would also be easier to understand the examples if you show the
psql prompt. Then you can easily know the node context without having
to describe it in the text so often.

e.g.

+
+ Create the subscription in node2 to subscribe the changes from node1:
+
+CREATE SUBSCRIPTION sub_node1_node2 ...

SUGGESTION
+
+ Create the subscription in node2 to subscribe the changes from node1
+
+node_2=# CREATE SUBSCRIPTION sub_node1_node2 ...

~~~

12. doc/src/sgml/logical-replication.sgml - typo

+  
+   Now the BiDirectional logical replication setup is complete between node1

typo "BiDirectional"

~~~

13. doc/src/sgml/logical-replication.sgml - deep levels

The section levels are very deep, but levels > 3 will not appear in the table
of contents when rendered. Maybe you can rearrange to raise them all
up one level, then IMO the TOC will work better and the whole page
will be easier to read.

~~~

14. doc/src/sgml/logical-replication.sgml - unnecessarily complex?

+
+# Truncate the table data but do not replicate the truncate.
+ALTER PUBLICATION pub_node3 SET (publish='insert,update,delete');
+TRUNCATE t1;
+
+CREATE S

Re: using an end-of-recovery record in all cases

2022-04-19 Thread Kyotaro Horiguchi
At Tue, 19 Apr 2022 13:37:59 -0700, Nathan Bossart  
wrote in 
> On Mon, Apr 18, 2022 at 04:44:03PM -0400, Robert Haas wrote:
> > Here is a new version of the patch which, unlike v1, I think is
> > something we could seriously consider applying (not before v16, of
> > course). It now removes CHECKPOINT_END_OF_RECOVERY completely, and I
> > attach a second patch as well which nukes checkPoint.PrevTimeLineID as
> > well.
> 
> I'd like to add a big +1 for this change.  IIUC this should help with some
> of the problems I've noted elsewhere [0].

Agreed.

> > I mentioned two problems with $SUBJECT in the first email with this
> > subject line. One was a bug, which Noah has since fixed (thanks,
> > Noah). The other problem is that LogStandbySnapshot() and a bunch of
> > its friends expect latestCompletedXid to always be a normal XID, which
> > is a problem because (1) we currently set nextXid to 3 and (2) at
> > startup, we compute latestCompletedXid = nextXid - 1. The current code
> > dodges this kind of accidentally: the checkpoint that happens at
> > startup is a "shutdown checkpoint" and thus skips logging a standby
> > snapshot, since a shutdown checkpoint is a sure indicator that there
> > are no running transactions. With the changes, the checkpoint at
> > startup happens after we've started allowing write transactions, and
> > thus a standby snapshot needs to be logged also. In the cases where
> > the end-of-recovery record was already being used, the problem could
> > have happened already, except for the fact that those cases involve a
> > standby promotion, which doesn't happen during initdb. I explored a
> > few possible ways of solving this problem.
> 
> Shouldn't latestCompletedXid be set to MaxTransactionId in this case?  Or
> is this related to the logic in FullTransactionIdRetreat() that avoids
> skipping over the "actual" special transaction IDs?

As a result, FullTransactionIdRetreat(FirstNormalFullTransactionId)
results in FrozenTransactionId, which looks odd.  It seems to me it
should rather be InvalidFullTransactionId, or we should simply
assert-out that case.  But incrementing the very first xid avoids all that
complexity.  It is somewhat hacky but very simple and understandable.

> > So ... I decided that the best approach here seems to be to just skip
> > FirstNormalTransactionId and use FirstNormalTransactionId + 1 for the
> > first write transaction that the cluster ever processes. That's very
> > simple and doesn't seem likely to break anything else. On the downside
> > it seems a bit grotty, but I don't see anything better, and on the
> > whole, I think with this approach we come out substantially ahead.
> > 0001 removes 3 times as many lines as it inserts, and 0002 saves a few
> > more lines of code.
> 
> This doesn't seem all that bad to me.  It's a little hacky, but it's very
> easy to understand and only happens once per initdb.  I don't think it's
> worth any extra complexity.

+1.

> > Now, I still don't really know that there isn't some theoretical
> > difficulty here that makes this whole approach a non-starter, but I
> > also can't think of what it might be. If the promotion case, which has
> > used the end-of-recovery record for many years, is basically safe,
> > despite the fact that it switches TLIs, then it seems to me that the
> > crash recovery case, which doesn't have that complication, ought to be
> > safe too. But I might well be missing something, so if you see a
> > problem, please speak up!
> 
> Your reasoning seems sound to me.
> 
> [0] https://postgr.es/m/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com

FWIW, I don't find a flaw in the reasoning, too.

By the way, do we need to keep CheckPoint.PrevTimeLineID? It is always
the same value as ThisTimeLineID.  The most dubious part is
ApplyWalRecord, but XLOG_CHECKPOINT_SHUTDOWN no longer induces a timeline
switch.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: minor MERGE cleanups

2022-04-19 Thread Michael Paquier
On Tue, Apr 19, 2022 at 03:45:22PM +0200, Alvaro Herrera wrote:
> I expect these fixups in new code should be uncontroversial.

The whole set looks rather sane to me.
--
Michael




Re: Postgres restart in the middle of exclusive backup and the presence of backup_label file

2022-04-19 Thread Martín Marqués
Hi,

I was looking at the commit for this patch and noticed this small typo
in the comment in `basebackup.c`:

```
diff --git a/src/backend/replication/basebackup.c
b/src/backend/replication/basebackup.c
index 
6884cad2c00af1632eacec07903098e7e1874393..815681ada7dd0c135af584ad5da2dd13c9a12465
100644 (file)
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -184,10 +184,8 @@ static const struct exclude_list_item excludeFiles[] =
{RELCACHE_INIT_FILENAME, true},

/*
-* If there's a backup_label or tablespace_map file, it belongs to a
-* backup started by the user with pg_start_backup().  It is *not* correct
-* for this backup.  Our backup_label/tablespace_map is injected into the
-* tar separately.
+* backup_label and tablespace_map should not exist in in a running cluster
+* capable of doing an online backup, but exclude them just in case.
 */
{BACKUP_LABEL_FILE, false},
{TABLESPACE_MAP, false},
```

The typo is in `exist in in a running cluster`. There are two `in` in a row.

P.D.: I was looking at this just because I was looking at an issue
where someone bumped their head with this "problem", so great that
we're in a better place now. Hopefully one day everyone will be
running PG15 or better :)

Kind regards, Martín

El mié, 6 abr 2022 a las 16:29, Stephen Frost () escribió:
>
> Greetings,
>
> * Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> > On Tue, 2022-04-05 at 13:06 -0700, Nathan Bossart wrote:
> > > On Tue, Apr 05, 2022 at 11:25:36AM -0400, Stephen Frost wrote:
> > > > Please find attached an updated patch + commit message.  Mostly, I just
> > > > went through and did a bit more in terms of updating the documentation
> > > > and improving the comments (there were some places that were still
> > > > worrying about the chance of a 'stray' backup_label file existing, which
> > > > isn't possible any longer), along with some additional testing and
> > > > review.  This is looking pretty good to me, but other thoughts are
> > > > certainly welcome.  Otherwise, I'm hoping to commit this tomorrow.
> > >
> > > LGTM!
> >
> > Cassandra (not the software) from the sidelines predicts that we will
> > get some fire from users for this, although I concede the theoretical
> > sanity of the change.
>
> Great, thanks for that.
>
> This has now been committed- thanks again to everyone for their help!
>
> Stephen



-- 
Martín Marqués
It’s not that I have something to hide,
it’s that I have nothing I want you to see




Re: Odd off-by-one dirty buffers and checkpoint buffers written

2022-04-19 Thread David G. Johnston
On Tue, Apr 19, 2022 at 4:36 PM Nathan Bossart 
wrote:

> On Tue, Apr 19, 2022 at 04:21:21PM -0700, David G. Johnston wrote:
> > I've done this four times in a row and while the number of dirty buffers
> > shown each time vary (see below) I see that "wrote N buffers" is always
> > exactly one more than the total count of dirty buffers.  I'm just curious
> > if anyone has a quick answer for this unusual correspondence.
>
> I see that SlruInternalWritePage() increments ckpt_bufs_written, so my
> first guess would be that it's due to something like CheckPointCLOG().
>
>
I peeked at pg_stat_bgwriter and see an increase in buffers_checkpoint
matching the dirty buffers number.

I also looked at pg_stat_slru to try and find the corresponding change
caused by:

slru.c:766 (SlruPhysicalWritePage)
pgstat_count_slru_page_written(shared->slru_stats_idx);

I do see (Xact) blks_hit change during this process (after the
update/commit, not the checkpoint, though) but it increases by 2 when dirty
buffers is 4.  I was expecting 4, thinking that blocks and buffers and
pages are basically the same things (which [1] seems to affirm).

https://www.postgresql.org/message-id/13563.1044552279%40sss.pgh.pa.us

David J.


Re: Replace open mode with PG_BINARY_R/W/A macros

2022-04-19 Thread Japin Li


On Tue, 19 Apr 2022 at 22:21, Tom Lane  wrote:
> Japin Li  writes:
>> On Tue, 19 Apr 2022 at 14:20, Tom Lane  wrote:
>>> I think the comment's at best misleading.  See e.g. 66f8687a8.
>>> It might be okay to use "rb" to read a text file when there
>>> is actually \r-stripping logic present, but you need to check
>>> that.  Using "wb" to write a text file is flat wrong.
>
>> Thanks for the detail explanation.  Should we remove the misleading comment?
>
> We should rewrite it, not just remove it.  But I'm not 100% sure
> what to say instead.  I wonder whether the comment's claims about
> control-Z processing still apply on modern Windows.
>

It might be true [1].

[1] 
https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fopen-wfopen?view=msvc-170

> Another question is whether we actually like the current shape of
> the code.  I can see at least two different directions we might
> prefer to the status quo:
>
> * Invent '#define PG_TEXT_R "r"' and so on, and use those in the
> calls that currently use plain "r" etc, establishing a project
> policy that you should use one of these six macros and never the
> underlying strings directly.  This perhaps has some advantages
> in greppability and clarity of intent, but I can't help wondering
> if it's mostly obsessive-compulsiveness.
>
> * In the other direction, decide that the PG_BINARY_X macros are
> offering no benefit at all and just rip 'em out, writing "rb" and
> so on in their place.  POSIX specifies that the character "b" has
> no effect on Unix-oid systems, and it has said that for thirty years
> now, so we do not really need the platform dependency that presently
> exists in the macro definitions.  The presence or absence of "b"
> would serve fine as an indicator of intent, and there would be one
> less PG-specific coding convention to remember.
>

I'm inclined toward the second direction if we need to change this.
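
For illustration, at a call site the second direction would just look like
this (a sketch; "path" and "fp" are placeholder names here, and per POSIX the
"b" is a no-op on Unix):

    /* before: platform-dependent macro from c.h ("rb" on Windows, "r" elsewhere) */
    fp = AllocateFile(path, PG_BINARY_R);

    /* after: spell the mode directly; the "b" still documents binary intent */
    fp = AllocateFile(path, "rb");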

> Or maybe it's fine as-is.  Any sort of wide-ranging change like this
> creates hazards for back-patching, so we shouldn't do it lightly.
>

Agreed.  Thanks again for the explanation.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Postgres perl module namespace

2022-04-19 Thread Michael Paquier
On Tue, Apr 19, 2022 at 07:24:58PM -0400, Andrew Dunstan wrote:
> On 2022-04-19 Tu 18:39, Michael Paquier wrote:
>> +*generate_ascii_string = *TestLib::generate_ascii_string;
>> +*slurp_dir = *TestLib::slurp_dir;
>> +*slurp_file = *TestLib::slurp_file;
>>
>> I am not sure if it is possible and my perl-fu is limited in this
>> area, but could a failure be enforced when loading this path if a new
>> routine added in TestLib.pm is forgotten in this list?
> 
> Not very easily that I'm aware of, but maybe some superior perl wizard
> will know better.

Okay.  Please do not consider this as a blocker.  I was just wondering
about ways to further ease the error reports when it comes to
back-patching, and this would surface the error a bit earlier in the stack.
--
Michael




Re: [PATCH] Add native windows on arm64 support

2022-04-19 Thread Michael Paquier
On Tue, Apr 19, 2022 at 03:22:30PM +0100, Niyas Sait wrote:
> Sorry for the delay! Configuring the scripts took some time. I have
> successfully run the builfarm animal script using my git repository [1]
> which contains the proposed patch on a Windows Arm64 machine.
> 
> I made a request to add a new machine to build farm through [2].

Have you tested with the amount of coverage provided by vcregress.pl?

Another thing I was wondering about is if it would be possible to have
this option in Travis, but that does not seem to be the case:
https://docs.travis-ci.com/user/reference/windows/#windows-version

> I believe we should be able to enable the build farm machine once the
> change has been merged.

Better to wait for the beginning of the development cycle at this
stage, based on all the replies received.  That would bring us to the
beginning of July.

+   if ($solution->{platform} eq 'ARM64') {
+   push(@pgportfiles, 'pg_crc32c_armv8_choose.c');
+   push(@pgportfiles, 'pg_crc32c_armv8.c');
+   } else {
+   push(@pgportfiles, 'pg_crc32c_sse42_choose.c');
+   push(@pgportfiles, 'pg_crc32c_sse42.c');
+   }

+++ b/src/port/pg_crc32c_armv8.c
+#ifndef _MSC_VER
 #include 
+#endif
[ ... ]
+#ifdef _M_ARM64
+   /*
+* arm64 way of hinting processor for spin loops optimisations
+* ref: 
https://community.arm.com/support-forums/f/infrastructure-solutions-forum/48654/ssetoneon-faq
+*/
+   __isb(_ARM64_BARRIER_SY);
+#else
I think that such extra optimizations had better be in a separate
patch, and we should focus on getting the build done first.

+   # arm64 linker only supports dynamic base address
+   my $cfgrandbaseaddress = $self->{platform} eq 'ARM64' ? 'true' : 'false';
This issue is still lying around, and you may have been lucky.  Would
there be any issue with removing this change to get basic support in?
As mentioned upthread, there is a long history of Postgres with ASLR.

This would mean that a basic patch could be done with just the changes
for gendef.pl, and the first part of the changes in MSBuildProject.pm.
Would that not be enough?
--
Michael




Re: DBT-5 Stored Procedure Development (2022)

2022-04-19 Thread Peter Geoghegan
On Tue, Apr 19, 2022 at 11:31 AM Mark Wong  wrote:
> As some of the tasks proposed are actually in place, one other task could be
> updating egen (the TPC-supplied code).  The kit was last developed against
> 1.12, and 1.14 is current as of this email.

As you know, I have had some false starts with using DBT5 on a modern
Linux distribution. Perhaps I gave up too easily at the time, but I'm
definitely still interested. Has there been work on that since?

Thanks
-- 
Peter Geoghegan




Re: Odd off-by-one dirty buffers and checkpoint buffers written

2022-04-19 Thread Nathan Bossart
On Tue, Apr 19, 2022 at 04:21:21PM -0700, David G. Johnston wrote:
> I've done this four times in a row and while the number of dirty buffers
> shown each time vary (see below) I see that "wrote N buffers" is always
> exactly one more than the total count of dirty buffers.  I'm just curious
> if anyone has a quick answer for this unusual correspondence.

I see that SlruInternalWritePage() increments ckpt_bufs_written, so my
first guess would be that it's due to something like CheckPointCLOG().

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Postgres perl module namespace

2022-04-19 Thread Andrew Dunstan


On 2022-04-19 Tu 18:39, Michael Paquier wrote:
> On Tue, Apr 19, 2022 at 04:06:28PM -0400, Andrew Dunstan wrote:
>> Here's a version with a fixed third patch that corrects a file misnaming
>> and fixes the export issue referred to above. Passes my testing so far.
> Wow.  That's really cool.  You are combining the best of both worlds
> here to ease backpatching, as far as I understand what you wrote.


Thanks.


>
> +*generate_ascii_string = *TestLib::generate_ascii_string;
> +*slurp_dir = *TestLib::slurp_dir;
> +*slurp_file = *TestLib::slurp_file;
>
> I am not sure if it is possible and my perl-fu is limited in this
> area, but could a failure be enforced when loading this path if a new
> routine added in TestLib.pm is forgotten in this list?



Not very easily that I'm aware of, but maybe some superior perl wizard
will know better.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com





Odd off-by-one dirty buffers and checkpoint buffers written

2022-04-19 Thread David G. Johnston
view bc is just a joining wrapper around pg_buffercache.

regression=# select datname, relname, count(*), sum(count(*)) over () AS
total from bc where isdirty group by datname, relname;
 datname | relname | count | total
-+-+---+---
(0 rows)

regression=# update tenk1 set stringu1 = stringu1 || '' where (unique1 %
384) = 3;
UPDATE 27
regression=# select datname, relname, count(*), sum(count(*)) over () AS
total from bc where isdirty group by datname, relname;
  datname   | relname | count | total
+-+---+---
 regression | tenk1   | 3 | 3
(1 row)

regression=# checkpoint;
CHECKPOINT

2022-04-19 23:17:08.256 UTC [161084] LOG:  checkpoint starting: immediate
force wait
2022-04-19 23:17:08.264 UTC [161084] LOG:  checkpoint complete: wrote 4
buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.002 s,
sync=0.002 s, total=0.009 s; sync files=2, longest=0.002 s, average=0.001
s; distance=12 kB, estimate=72358 kB

I've done this four times in a row and, while the number of dirty buffers
shown each time varies (see below), I see that "wrote N buffers" is always
exactly one more than the total count of dirty buffers.  I'm just curious
if anyone has a quick answer for this unusual correspondence.

David J.

regression=# update tenk1 set stringu1 = stringu1 || '' where (unique1 %
384) = 3;
UPDATE 27
regression=# select datname, relname, count(*), sum(count(*)) over () AS
total from bc where isdirty group by datname, relname;
  datname   |   relname| count | total
+--+---+---
 regression | tenk1|33 |   102
 regression | tenk1_hundred| 9 |   102
 regression | tenk1_thous_tenthous |18 |   102
 regression | tenk1_unique1|27 |   102
 regression | tenk1_unique2|15 |   102
(5 rows)

2022-04-19 23:13:03.480 UTC [161084] LOG:  checkpoint starting: immediate
force wait
2022-04-19 23:13:03.523 UTC [161084] LOG:  checkpoint complete: wrote 103
buffers (0.6%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.004 s,
sync=0.014 s, total=0.044 s; sync files=8, longest=0.008 s, average=0.002
s; distance=721 kB, estimate=110165 kB


Re: Postgres perl module namespace

2022-04-19 Thread Michael Paquier
On Tue, Apr 19, 2022 at 04:06:28PM -0400, Andrew Dunstan wrote:
> Here's a version with a fixed third patch that corrects a file misnaming
> and fixes the export issue referred to above. Passes my testing so far.

Wow.  That's really cool.  You are combining the best of both worlds
here to ease backpatching, as far as I understand what you wrote.

+*generate_ascii_string = *TestLib::generate_ascii_string;
+*slurp_dir = *TestLib::slurp_dir;
+*slurp_file = *TestLib::slurp_file;

I am not sure if it is possible and my perl-fu is limited in this
area, but could a failure be enforced when loading this path if a new
routine added in TestLib.pm is forgotten in this list?
--
Michael




Re: Estimating HugePages Requirements?

2022-04-19 Thread Michael Paquier
On Thu, Mar 24, 2022 at 02:07:26PM +0100, Magnus Hagander wrote:
> But neither would the suggestion of redirecting stderr to /dev/null.
> In fact, doing the redirect it will *also* throw away any FATAL that
> happens. In fact, using the 2>/dev/null method, we *also* remove the
> message that says there's another postmaster running in this
> directory, which is strictly worse than the override of
> log_min_messages.

Well, we could also tweak more the command with a redirection of
stderr to a log file or such, and tell to look at it for errors.

> That said, the redirect can be removed without recompiling postgres,
> so it is probably still hte better choice as a temporary workaround.
> But we should really look into getting a better solution in place once
> we start on 16.

But do we really need a better or more invasive solution for already
running servers though?  A SHOW command would be able to do the work
already in this case.  This would lack consistency compared to the
offline case, but we are not without options either.  That leaves the
case where the server is running, has allocated memory, but is not
ready to accept connections, like crash recovery; still, this use case
looks rather thin to me.

>> My solution for the docs is perhaps too confusing for the end-user,
>> and we are talking about a Linux-only thing here anyway.  So, at the
>> end, I am tempted to just add the "2> /dev/null" as suggested upthread
>> by Nathan and call it a day.  Does that sound fine?
> 
> What would be a linux only thing?

Perhaps not at some point in the future.  Now that's under a section
of the docs only for Linux.
--
Michael




Re: Bad estimate with partial index

2022-04-19 Thread Tom Lane
I wrote:
> it looks like the problem is that the extended stats haven't been used
> while forming the estimate of the number of index entries retrieved, so
> we overestimate the cost of using this index.
> That seems like a bug.  Tomas?

I dug into this enough to locate the source of the problem.
btcostestimate includes the partial index clauses in what it
sends to clauselist_selectivity, but per the comments for
add_predicate_to_index_quals:

 * Note that indexQuals contains RestrictInfo nodes while the indpred
 * does not, so the output list will be mixed.  This is OK for both
 * predicate_implied_by() and clauselist_selectivity(), but might be
 * problematic if the result were passed to other things.

That comment was true when it was written, but it's been falsified
by the extended-stats patches, which have added a whole lot of logic
in and under clauselist_selectivity that ignores clauses that are not
RestrictInfos.

While we could perhaps fix this by having add_predicate_to_index_quals
add RestrictInfos, I'm inclined to feel that the extended-stats code
is in the wrong.  The contract for clauselist_selectivity has always
been that it could optimize if given RestrictInfos rather than bare
clauses, not that it would fail to work entirely without them.
There are probably more places besides add_predicate_to_index_quals
that are relying on that.
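
For concreteness, the add_predicate_to_index_quals route would amount to
something like the sketch below: wrap each predicate clause in a RestrictInfo
before mixing it with indexQuals.  This is only a sketch, not a proposed
patch; it assumes make_simple_restrictinfo() is usable here, which would also
mean threading the PlannerInfo down to this function, and as said above the
extended-stats side may be the better place to fix it anyway.

    /* Sketch only, loosely following the existing selfuncs.c function. */
    static List *
    add_predicate_to_index_quals_sketch(PlannerInfo *root, IndexOptInfo *index,
                                        List *indexQuals)
    {
        List       *predExtraQuals = NIL;
        ListCell   *lc;

        if (index->indpred == NIL)
            return indexQuals;

        foreach(lc, index->indpred)
        {
            Node       *predQual = (Node *) lfirst(lc);

            if (!predicate_implied_by(list_make1(predQual), indexQuals, false))
                predExtraQuals =
                    lappend(predExtraQuals,
                            make_simple_restrictinfo(root, (Expr *) predQual));
        }

        return list_concat(predExtraQuals, indexQuals);
    }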

regards, tom lane




Re: using an end-of-recovery record in all cases

2022-04-19 Thread Nathan Bossart
On Mon, Apr 18, 2022 at 04:44:03PM -0400, Robert Haas wrote:
> Here is a new version of the patch which, unlike v1, I think is
> something we could seriously consider applying (not before v16, of
> course). It now removes CHECKPOINT_END_OF_RECOVERY completely, and I
> attach a second patch as well which nukes checkPoint.PrevTimeLineID as
> well.

I'd like to add a big +1 for this change.  IIUC this should help with some
of the problems I've noted elsewhere [0].

> I mentioned two problems with $SUBJECT in the first email with this
> subject line. One was a bug, which Noah has since fixed (thanks,
> Noah). The other problem is that LogStandbySnapshot() and a bunch of
> its friends expect latestCompletedXid to always be a normal XID, which
> is a problem because (1) we currently set nextXid to 3 and (2) at
> startup, we compute latestCompletedXid = nextXid - 1. The current code
> dodges this kind of accidentally: the checkpoint that happens at
> startup is a "shutdown checkpoint" and thus skips logging a standby
> snapshot, since a shutdown checkpoint is a sure indicator that there
> are no running transactions. With the changes, the checkpoint at
> startup happens after we've started allowing write transactions, and
> thus a standby snapshot needs to be logged also. In the cases where
> the end-of-recovery record was already being used, the problem could
> have happened already, except for the fact that those cases involve a
> standby promotion, which doesn't happen during initdb. I explored a
> few possible ways of solving this problem.

Shouldn't latestCompletedXid be set to MaxTransactionId in this case?  Or
is this related to the logic in FullTransactionIdRetreat() that avoids
skipping over the "actual" special transaction IDs?

> So ... I decided that the best approach here seems to be to just skip
> FirstNormalTransactionId and use FirstNormalTransactionId + 1 for the
> first write transaction that the cluster ever processes. That's very
> simple and doesn't seem likely to break anything else. On the downside
> it seems a bit grotty, but I don't see anything better, and on the
> whole, I think with this approach we come out substantially ahead.
> 0001 removes 3 times as many lines as it inserts, and 0002 saves a few
> more lines of code.

This doesn't seem all that bad to me.  It's a little hacky, but it's very
easy to understand and only happens once per initdb.  I don't think it's
worth any extra complexity.

> Now, I still don't really know that there isn't some theoretical
> difficulty here that makes this whole approach a non-starter, but I
> also can't think of what it might be. If the promotion case, which has
> used the end-of-recovery record for many years, is basically safe,
> despite the fact that it switches TLIs, then it seems to me that the
> crash recovery case, which doesn't have that complication, ought to be
> safe too. But I might well be missing something, so if you see a
> problem, please speak up!

Your reasoning seems sound to me.

[0] https://postgr.es/m/C1EE64B0-D4DB-40F3-98C8-0CED324D34CB%40amazon.com

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com




Re: Postgres perl module namespace

2022-04-19 Thread Andrew Dunstan

On 2022-04-19 Tu 11:36, Andrew Dunstan wrote:
> On 2022-04-18 Mo 14:07, Tom Lane wrote:
>> Andrew Dunstan  writes:
>>> No, I think we could probably just port the whole of src/test/PostgreSQL
>>> back if required, and have it live alongside the old modules. Each TAP
>>> test is a separate miracle - see comments elsewhere about port
>>> assignment in parallel TAP tests.
>>> But that would mean we have some tests in the old flavor and some in the
>>> new flavor in the back branches, which might get confusing.
>> That works for back-patching entire new test scripts, but not for adding
>> some cases to an existing script, which I think is more common.
>>
>>  
>
> I think I've come up with a better scheme that I hope will fix all or
> almost all of the pain complained of in this thread. I should note that
> we deliberately delayed making these changes until fairly early in the
> release 15 development cycle, and that was clearly a good decision.
>
> The attached three patches basically implement the new naming scheme for
> the back branches without doing away with the old scheme or doing a
> wholesale copy of the new modules.
>
> The first simply implements a proper "new" constructor for PostgresNode,
> just like we have in PostgreSQL::Test::Cluster. It's not really essential
> but it seems like a good idea. The second adds all the visible
> functionality of the PostgresNode and TestLib modules to the
> PostgreSQL::Test::Cluster and PostgreSQL::Test::Utils namespaces. The
> third adds dummy packages so that any code doing 'use
> PostgreSQL::Test::Utils;' or 'use PostgreSQL::Test::Cluster;' will
> actually import the old modules. This last piece is where there might be
> some extra work needed, to export the names so that using an unqualified
> function or variable, say, 'slurp_file("foo");' will work. But in
> general, modulo that issue, I believe things should Just Work (tm). You
> should basically just be able to backpatch any new or modified TAP test
> without difficulty, sed script usage, etc.
>
> Comments welcome.
>
>

Here's a version with a fixed third patch that corrects a file misnaming
and fixes the export issue referred to above. Passes my testing so far.


cheers


andrew


--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 92d9303e23..94b39341ce 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -137,16 +137,22 @@ INIT
 
 =over
 
-=item PostgresNode::new($class, $name, $pghost, $pgport)
+=item PostgresNode->new(node_name, %params)
 
-Create a new PostgresNode instance. Does not initdb or start it.
-
-You should generally prefer to use get_new_node() instead since it takes care
-of finding port numbers, registering instances for cleanup, etc.
+Class oriented alias for get_new_node()
 
 =cut
 
 sub new
+{
+	my $class = $_[0];
+	$class->isa(__PACKAGE__) || die "new() not called as a class method";
+	return get_new_node(@_);
+}
+
+# internal subroutine used by get_new_node(). Should not be called by any
+# external client of the module.
+sub _new
 {
 	my ($class, $name, $pghost, $pgport) = @_;
 	my $testname = basename($0);
@@ -1229,7 +1235,7 @@ sub get_new_node
 	}
 
 	# Lock port number found by creating a new node
-	my $node = $class->new($name, $host, $port);
+	my $node = _new($class, $name, $host, $port);
 
 	if ($params{install_path})
 	{
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 92d9303e23..f5d2ba72e6 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2761,4 +2761,16 @@ sub corrupt_page_checksum
 
 =cut
 
+# support release 15+ perl module namespace
+
+package PostgreSQL::Test::Cluster;
+
+sub new
+{
+	shift; # remove class param from args
+	return PostgresNode->get_new_node(@_);
+}
+
+sub get_free_port { return PostgresNode::get_free_port(); }
+
 1;
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 0dfc414b07..92583f84b4 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -948,4 +948,46 @@ sub command_checks_all
 
 =cut
 
+# support release 15+ perl module namespace
+
+package PostgreSQL::Test::Utils;
+
+# we don't want to export anything, but we want to support things called
+# via this package name explicitly.
+
+# use typeglobs to alias these functions and variables
+
+*generate_ascii_string = *TestLib::generate_ascii_string;
+*slurp_dir = *TestLib::slurp_dir;
+*slurp_file = *TestLib::slurp_file;
+*append_to_file = *TestLib::append_to_file;
+*check_mode_recursive = *TestLib::check_mode_recursive;
+*chmod_recursive = *TestLib::chmod_recursive;
+*check_pg_config = *TestLib::check_pg_config;
+*dir_symlink = *TestLib::dir_symlink;
+*system_or_bail = *TestLib::system_or_bail;
+*system_log = *TestLib::system_log;
+*run_log = *TestLib::run_log;
+*run_command = *TestLib::run_command;
+sub pump_until { die "pump_until not implemented in TestLib"; }
+*command_o

Re: postgres_fdw: commit remote (sub)transactions in parallel during pre-commit

2022-04-19 Thread David Zhang
I tried to apply the patch to master and plan to run some tests, but got 
the errors below due to other commits.


$ git apply --check 
v7-0003-postgres-fdw-Add-support-for-parallel-abort.patch

error: patch failed: doc/src/sgml/postgres-fdw.sgml:479
error: doc/src/sgml/postgres-fdw.sgml: patch does not apply


+     * remote server in parallel at (sub)transaction end.

Here, I think the comment above could potentially apply to multiple
remote servers.



I'm not sure if there is a way to avoid the repeated comments. For example, the
same comment below appears in two places (lines 231 and 296).


+    /*
+     * If requested, consume whatever data is available from the socket.
+     * (Note that if all data is available, this allows
+     * pgfdw_get_cleanup_result to call PQgetResult without forcing the
+     * overhead of WaitLatchOrSocket, which would be large compared to the
+     * overhead of PQconsumeInput.)
+     */

On 2022-03-24 11:46 p.m., Etsuro Fujita wrote:

On Thu, Mar 24, 2022 at 1:34 PM Etsuro Fujita  wrote:

Attached is a new patch set.  The new version of 0002 is just a
cleanup patch (see the commit message in 0002), and I think it's
committable, so I'm planning to commit it, if no objections.

Done.

Attached is the 0003 patch, which is the same as the one I sent yesterday.

Best regards,
Etsuro Fujita

--
Best regards,
David

Software Engineer
Highgo Software Inc. (Canada)
www.highgo.ca




Re: error handling in pqRowProcessor broken

2022-04-19 Thread Tom Lane
Peter Eisentraut  writes:
> I find that this doesn't work anymore.  If you set *errmsgp = "some
> message" and return 0, then psql will just print a result set with zero
> rows.

Ah, I see the problem: a few places in fe-protocol3 didn't get the memo
that conn->error_result represents a "pending" PGresult that hasn't
been constructed yet.  The attached fixes it for me --- can you try it
on whatever test case led you to this?

> (Even before the above commit, the handling of the returned message was
> a bit weird: The error output was just the message string, without any
> prefix like "ERROR:".)

libpq's always acted that way for internally-generated messages.
Most of them are so rare that we're probably not used to seeing 'em.
Perhaps there's a case for making it more verbose, but right now
doesn't seem like the time to undertake that.

regards, tom lane

diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 78cff4475c..1b9c04f47e 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1196,6 +1196,7 @@ pqSaveParameterStatus(PGconn *conn, const char *name, const char *value)
  *	  Returns 1 if OK, 0 if error occurred.
  *
  * On error, *errmsgp can be set to an error string to be returned.
+ * (Such a string should already be translated via libpq_gettext().)
  * If it is left NULL, the error is presumed to be "out of memory".
  *
  * In single-row mode, we create a new result holding just the current row,
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 94b4a448b9..e0da22dba7 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -209,7 +209,7 @@ pqParseInput3(PGconn *conn)
 case 'C':		/* command complete */
 	if (pqGets(&conn->workBuffer, conn))
 		return;
-	if (conn->result == NULL)
+	if (conn->result == NULL && !conn->error_result)
 	{
 		conn->result = PQmakeEmptyPGresult(conn,
 		   PGRES_COMMAND_OK);
@@ -263,7 +263,7 @@ pqParseInput3(PGconn *conn)
 	}
 	break;
 case 'I':		/* empty query */
-	if (conn->result == NULL)
+	if (conn->result == NULL && !conn->error_result)
 	{
 		conn->result = PQmakeEmptyPGresult(conn,
 		   PGRES_EMPTY_QUERY);
@@ -281,7 +281,7 @@ pqParseInput3(PGconn *conn)
 	if (conn->cmd_queue_head &&
 		conn->cmd_queue_head->queryclass == PGQUERY_PREPARE)
 	{
-		if (conn->result == NULL)
+		if (conn->result == NULL && !conn->error_result)
 		{
 			conn->result = PQmakeEmptyPGresult(conn,
 			   PGRES_COMMAND_OK);
@@ -362,7 +362,7 @@ pqParseInput3(PGconn *conn)
 	if (conn->cmd_queue_head &&
 		conn->cmd_queue_head->queryclass == PGQUERY_DESCRIBE)
 	{
-		if (conn->result == NULL)
+		if (conn->result == NULL && !conn->error_result)
 		{
 			conn->result = PQmakeEmptyPGresult(conn,
 			   PGRES_COMMAND_OK);


Re: Readd use of TAP subtests

2022-04-19 Thread Daniel Gustafsson
> On 19 Apr 2022, at 21:07, Jacob Champion  wrote:
> 
> On Mon, 2022-02-28 at 17:02 +0100, Peter Eisentraut wrote:
>> Perhaps in another 7 years or so this will be resolved and we can make 
>> another attempt at this. ;-)
> 
> For what it's worth, the TAP 14 spec was officially released today:

Interesting.  I hadn't even registered that a v14 was in the works, and come to
think of it I'm not sure I've yet seen a v13 consumer or producer.  For the TAP
support in pg_regress I've kept to the original spec.

--
Daniel Gustafsson   https://vmware.com/





Re: Extract epoch from Interval weird behavior

2022-04-19 Thread Peter Eisentraut

On 08.04.22 15:10, Tom Lane wrote:

Peter Eisentraut  writes:

We really wanted to avoid doing calculations in numeric as much as
possible.  So we should figure out a different way to write this.  The
attached patch works for me.  It's a bit ugly since it hardcodes some
factors.  Maybe we can rephrase it a bit more elegantly.


I think it's fine but needs some commentary.  Maybe about like
"To do this calculation in integer arithmetic even though
DAYS_PER_YEAR is fractional, multiply everything by 4
and then divide by 4 again at the end.  This relies on
DAYS_PER_YEAR being a multiple of 0.25 and on SECS_PER_DAY
being a multiple of 4."

BTW, it might be good to parenthesize as

(... big calculation ...) * (SECS_PER_DAY/4)

to eliminate any question of whether the value could overflow
before the final division by 4.


done that way
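
As a rough illustration of that integer-arithmetic trick, here is an editorial sketch based on the description above (not necessarily the committed code):

#include "postgres.h"
#include "datatype/timestamp.h"

/*
 * Sketch only: whole seconds contributed by an interval's month and day
 * fields, computed in integer arithmetic.  DAYS_PER_YEAR is 365.25, so
 * scale by 4 up front and fold the final division by 4 into SECS_PER_DAY/4;
 * this relies on DAYS_PER_YEAR being a multiple of 0.25 and SECS_PER_DAY
 * being a multiple of 4.
 */
static int64
interval_day_month_secs(const Interval *span)
{
	return ((int64) (4 * DAYS_PER_YEAR) * (span->month / MONTHS_PER_YEAR) +
			(int64) (4 * DAYS_PER_MONTH) * (span->month % MONTHS_PER_YEAR) +
			(int64) 4 * span->day) * (SECS_PER_DAY / 4);
}

The remaining microseconds from span->time would then be added separately when forming the full epoch value.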




Re: Readd use of TAP subtests

2022-04-19 Thread Jacob Champion
On Mon, 2022-02-28 at 17:02 +0100, Peter Eisentraut wrote:
> Perhaps in another 7 years or so this will be resolved and we can make 
> another attempt at this. ;-)

For what it's worth, the TAP 14 spec was officially released today:

https://testanything.org/tap-version-14-specification.html

--Jacob


Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Vik Fearing

On 4/19/22 16:00, Robert Haas wrote:

On Mon, Apr 18, 2022 at 3:39 PM Mark Dilger
 wrote:

Since there hasn't been any agreement on that point, I've just rebased the 
patch to apply cleanly against the current master:


This looks OK to me. There may be better ways to do some of it, but
there's no rule against further improving the code later. Also, since
the issue was introduced in v14, we probably shouldn't wait forever to
do something about it. However, there is a procedural issue here now
that we are past feature freeze. I think someone could defensibly take
any of the following positions:

(A) This is a new feature. Wait for v16.
(B) This is a bug fix. Commit it now and back-patch to v14.
(C) This is a cleanup that is OK to put into v15 even after feature
freeze but since it is a behavior change we shouldn't back-patch it.

I vote for (C). What do other people think?



I vote for (B).
--
Vik Fearing




Re: DBT-5 Stored Procedure Development (2022)

2022-04-19 Thread Mark Wong
Hi Mahesh,

On Tue, Apr 19, 2022 at 02:01:54PM -0400, Mahesh Gouru wrote:
> Dear all,
> 
> Please review the attached for my jerry-rigged project proposal. I am
> seeking to continually refactor the proposal as I can!

My comments might be briefer than they should be, but I need to write this
quickly.  :)

* The 4 steps in the description aren't needed, they already exist.
* May 20: I think this should be more about reviewing the TPC-E
  specification rather than industry research, as we want to try to
  follow specification guidelines.
* June 20: Random data generation and scaling are provided by and
  already defined by the spec
* Aug 01: A report generator already exists, but I think time could be
  allocated to redoing the raw HTML generation with something like
  reStructuredText, something that is easier to generate with scripts
  and convertible into other formats with other tools

As some of the tasks proposed are actually in place, one other task could be
updating egen (the TPC-supplied code).  The kit was last developed against
1.12, and 1.14 is current as of this email.

Regards,
Mark




Re: Intermittent buildfarm failures on wrasse

2022-04-19 Thread Tom Lane
Alvaro Herrera  writes:
> On 2022-Apr-15, Tom Lane wrote:
>> Here's a WIP patch for that.  The only exciting thing in it is that
>> because of some undocumented cowboy programming in walsender.c, the
>> Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
>> in ProcArrayInstallRestoredXmin fires unless we skip that.

> Hmm, maybe a better use of that define is to use it to select which flags
> to copy, rather than to ensure they are the only ones set.  What
> about this?

Yeah, I thought about that too, but figured that the author probably
had a reason for writing the assertion the way it was.  If we want
to redefine PROC_COPYABLE_FLAGS as "flags associated with xmin",
that's fine by me.  But I'd suggest that both the name of the macro
and the comment for it in proc.h should be revised to match that
definition.

Another point is that as you've coded it, the code doesn't so much
copy those flags as union them with whatever the recipient had,
which seems wrong.  I could go with either

Assert(!(MyProc->statusFlags & PROC_XMIN_FLAGS));
MyProc->statusFlags |= (proc->statusFlags & PROC_XMIN_FLAGS);

or

MyProc->statusFlags = (MyProc->statusFlags & ~PROC_XMIN_FLAGS) |
  (proc->statusFlags & PROC_XMIN_FLAGS);

Perhaps the latter is more future-proof.

regards, tom lane




Re: Intermittent buildfarm failures on wrasse

2022-04-19 Thread Alvaro Herrera
On 2022-Apr-15, Tom Lane wrote:

> Here's a WIP patch for that.  The only exciting thing in it is that
> because of some undocumented cowboy programming in walsender.c, the
>   Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
> in ProcArrayInstallRestoredXmin fires unless we skip that.

Hmm, maybe a better use of that define is to use it to select which flags
to copy, rather than to ensure they are the only ones set.  What
about this?


-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"¿Qué importan los años?  Lo que realmente importa es comprobar que
a fin de cuentas la mejor edad de la vida es estar vivo"  (Mafalda)
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 25c310f675..4347941568 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -2685,17 +2685,14 @@ ProcArrayInstallRestoredXmin(TransactionId xmin, PGPROC *proc)
 		TransactionIdIsNormal(xid) &&
 		TransactionIdPrecedesOrEquals(xid, xmin))
 	{
-		/* Install xmin */
+		/*
+		 * Install xmin.  In addition, propagate statusFlags that affect how
+		 * the value is interpreted by vacuum.
+		 */
 		MyProc->xmin = TransactionXmin = xmin;
 
-		/* walsender cheats by passing proc == MyProc, don't check its flags */
-		if (proc != MyProc)
-		{
-			/* Flags being copied must be valid copy-able flags. */
-			Assert((proc->statusFlags & (~PROC_COPYABLE_FLAGS)) == 0);
-			MyProc->statusFlags = proc->statusFlags;
-			ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
-		}
+		MyProc->statusFlags |= (proc->statusFlags & PROC_COPYABLE_FLAGS);
+		ProcGlobal->statusFlags[MyProc->pgxactoff] = MyProc->statusFlags;
 
 		result = true;
 	}


Re: DBT-5 Stored Procedure Development (2022)

2022-04-19 Thread Peter Geoghegan
On Tue, Apr 19, 2022 at 11:02 AM Mahesh Gouru  wrote:
> Please review the attached for my jerry-rigged project proposal. I am seeking 
> to continually refactor the proposal as I can!

I for one see a lot of value in this proposal. I think it would be
great to revive DBT-5, since TPC-E has a number of interesting
bottlenecks that we'd likely learn something from. It's particularly
good at stressing concurrency control, which TPC-C really doesn't do.
It's also a lot easier to run smaller benchmarks that don't require
lots of storage space, but are nevertheless correct according to the
spec.

-- 
Peter Geoghegan




Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Erik Rijkers

Op 19-04-2022 om 19:36 schreef Andres Freund:

Hi,

On 2022-04-19 13:50:25 +0200, Erik Rijkers wrote:

The 12th run of statbug.sh crashed and gave a corefile.


I ran through quite a few iterations by now, without reproducing :(

I guess there's some timing issue and you're hitting it on your system
due to the slower disks.



Program terminated with signal 6, Aborted.
#0  0x00357d6324f5 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00357d6324f5 in raise () from /lib64/libc.so.6
#1  0x00357d633cd5 in abort () from /lib64/libc.so.6
#2  0x00b3bada in ExceptionalCondition (conditionName=0xd389a1
"tabstat->trans == trans", errorType=0xd388b2 "FailedAssertion",
fileName=0xd388a0 "pgstat_relation.c", lineNumber=508) at assert.c:69
#3  0x009bf5dc in AtEOXact_PgStat_Relations (xact_state=0x31b1b50,
isCommit=true) at pgstat_relation.c:508
#4  0x009c4107 in AtEOXact_PgStat (isCommit=true, parallel=false) at
pgstat_xact.c:54
#5  0x00583764 in CommitTransaction () at xact.c:2360
#6  0x00584354 in CommitTransactionCommand () at xact.c:3048
#7  0x0090b34e in apply_handle_commit_internal
(commit_data=0x7ffd024b5940) at worker.c:1532
#8  0x0090a287 in apply_handle_commit (s=0x7ffd024b59b0) at
worker.c:845
#9  0x0090ce3a in apply_dispatch (s=0x7ffd024b59b0) at worker.c:2473
#10 0x0090d41c in LogicalRepApplyLoop (last_received=74680880) at
worker.c:2757
#11 0x0090e974 in start_apply (origin_startpos=0) at worker.c:3526
#12 0x0090f156 in ApplyWorkerMain (main_arg=0) at worker.c:3782
#13 0x008c7623 in StartBackgroundWorker () at bgworker.c:858
#14 0x008d1557 in do_start_bgworker (rw=0x30ff0a0) at
postmaster.c:5802
#15 0x008d1903 in maybe_start_bgworkers () at postmaster.c:6026
#16 0x008d09ba in sigusr1_handler (postgres_signal_arg=10) at
postmaster.c:5191
#17 <signal handler called>
#18 0x00357d6e1683 in __select_nocancel () from /lib64/libc.so.6
#19 0x008cc6c1 in ServerLoop () at postmaster.c:1757
#20 0x008cc0aa in PostmasterMain (argc=11, argv=0x30d6590) at
postmaster.c:1465
#21 0x007c9256 in main (argc=11, argv=0x30d6590) at main.c:202
(gdb) f 3
#3  0x009bf5dc in AtEOXact_PgStat_Relations (xact_state=0x31b1b50,
isCommit=true) at pgstat_relation.c:508
508 Assert(tabstat->trans == trans);
(gdb) p tabstat
$1 = (PgStat_TableStatus *) 0x319e630
(gdb) p *tabstat
$2 = {t_id = 2139062143, t_shared = 127, trans = 0x7f7f7f7f7f7f7f7f,
t_counts = {t_numscans = 9187201950435737471, t_tuples_returned =
9187201950435737471, t_tuples_fetched = 9187201950435737471,
 t_tuples_inserted = 9187201950435737471, t_tuples_updated =
9187201950435737471, t_tuples_deleted = 9187201950435737471,
t_tuples_hot_updated = 9187201950435737471, t_truncdropped = 127,
 t_delta_live_tuples = 9187201950435737471, t_delta_dead_tuples =
9187201950435737471, t_changed_tuples = 9187201950435737471,
t_blocks_fetched = 9187201950435737471, t_blocks_hit = 9187201950435737471},
   relation = 0x7f7f7f7f7f7f7f7f}
(gdb) p trans
$3 = (PgStat_TableXactStatus *) 0x31b1ba8
(gdb) p *trans
$4 = {tuples_inserted = 1, tuples_updated = 0, tuples_deleted = 0,
truncdropped = false, inserted_pre_truncdrop = 0, updated_pre_truncdrop = 0,
deleted_pre_truncdrop = 0, nest_level = 1, upper = 0x0,
   parent = 0x319e630, next = 0x31b1ab8}
(gdb)


Could you print out
p xact_state
p *xact_state
p xact_state->first
p *xact_state->first

Do you have the server log file for the failed run / instance?



(gdb) p xact_state
$5 = (PgStat_SubXactStatus *) 0x31b1b50

(gdb) p *xact_state
$6 = {nest_level = 1, prev = 0x0, pending_drops = {head = {prev = 
0x31b1b60, next = 0x31b1b60}}, pending_drops_count = 0, first = 0x31b1ba8}


(gdb) p xact_state->first
$7 = (PgStat_TableXactStatus *) 0x31b1ba8

(gdb) p *xact_state->first
$8 = {tuples_inserted = 1, tuples_updated = 0, tuples_deleted = 0, 
truncdropped = false, inserted_pre_truncdrop = 0, updated_pre_truncdrop 
= 0, deleted_pre_truncdrop = 0, nest_level = 1, upper = 0x0,

  parent = 0x319e630, next = 0x31b1ab8}
(gdb)


The logfile is attached.


Erik



Greetings,

Andres Freund

2022-04-19 13:33:42.944 CEST [24981] LOG:  starting PostgreSQL 
15devel_HEAD_20220419_1308_a62bff74b135 on x86_64-pc-linux-gnu, compiled by gcc 
(GCC) 8.2.0, 64-bit
2022-04-19 13:33:42.945 CEST [24981] LOG:  listening on IPv4 address 
"127.0.0.1", port 6526
2022-04-19 13:33:43.010 CEST [24981] LOG:  listening on Unix socket 
"/tmp/.s.PGSQL.6526"
2022-04-19 13:33:43.085 CEST [24991] LOG:  database system was shut down at 
2022-04-19 13:33:41 CEST
2022-04-19 13:33:43.181 CEST [24981] LOG:  database system is ready to accept 
connections
2022-04-19 13:33:51.040 CEST [25047] LOG:  received replication command: 
CREATE_REPLICATION_SLOT "pub_6527_from_6526" LOGICAL pgoutput (SNAPSHOT 
'nothing')
2022-04-19 13:33:51.040 CEST [25047] STATEMENT:  CREATE_REPLICATION_SLOT 
"pub_6527_from_6526" LOGICAL pgoutput (SNAPS

DBT-5 Stored Procedure Development (2022)

2022-04-19 Thread Mahesh Gouru
Dear all,

Please review the attached for my jerry-rigged project proposal. I am
seeking to continually refactor the proposal as I can!

Thanks,
Mahesh


GSoC (2022) - Proposal for PostgreSQL-Gouru.pdf
Description: Adobe PDF document


Re: Bad estimate with partial index

2022-04-19 Thread Tom Lane
=?iso-8859-1?Q?Andr=E9_H=E4nsel?=  writes:
> I have a case where Postgres chooses the wrong index and I'm not sure what
> to do about it:

The core problem here seems to be a poor estimate for the selectivity
of "WHERE cropped AND NOT resized":

regression=# EXPLAIN ANALYZE
SELECT count(*) FROM t
WHERE cropped AND NOT resized ;
...
   ->  Bitmap Heap Scan on t  (cost=35.26..6352.26 rows=91100 width=0) (actual 
time=0.121..0.190 rows=1000 loops=1)
 Recheck Cond: (cropped AND (NOT resized))
...

I think this is because the planner expects those two columns to be
independent, which they are completely not in your test data.  Perhaps
that assumption is more true in your real-world data, but since you're
here complaining, I suppose not :-(.  What you can do about that, in
recent Postgres versions, is to create extended statistics on the
combination of the columns:

regression=# create statistics t_stats on cropped, resized from t;
CREATE STATISTICS
regression=# analyze t;
ANALYZE
regression=# EXPLAIN ANALYZE  
SELECT count(*) FROM t
WHERE cropped AND NOT resized AND create_date < CURRENT_DATE;
  QUERY PLAN
  
--
 Aggregate  (cost=3145.15..3145.16 rows=1 width=8) (actual time=9.765..9.766 
rows=1 loops=1)
   ->  Index Scan using idx_resized on t  (cost=0.29..3142.65 rows=1000 
width=0) (actual time=9.608..9.735 rows=1000 loops=1)
 Filter: (cropped AND (create_date < CURRENT_DATE))
 Rows Removed by Filter: 10
 Planning Time: 0.115 ms
 Execution Time: 9.779 ms

Better estimate, but it's still using the wrong index :-(.  If we force
use of the other one:

regression=# drop index idx_resized;
DROP INDEX
regression=# EXPLAIN ANALYZE
regression-# SELECT count(*) FROM t
regression-# WHERE cropped AND NOT resized AND create_date < CURRENT_DATE;
  QUERY PLAN
   
---
 Aggregate  (cost=6795.38..6795.39 rows=1 width=8) (actual time=0.189..0.191 
rows=1 loops=1)
   ->  Bitmap Heap Scan on t  (cost=13.40..6792.88 rows=1000 width=0) (actual 
time=0.047..0.147 rows=1000 loops=1)
 Recheck Cond: (cropped AND (NOT resized))
 Filter: (create_date < CURRENT_DATE)
 Heap Blocks: exact=6
 ->  Bitmap Index Scan on specific  (cost=0.00..13.15 rows=91565 
width=0) (actual time=0.035..0.035 rows=1000 loops=1)
  ^^
 Planning Time: 0.154 ms
 Execution Time: 0.241 ms

it looks like the problem is that the extended stats haven't been used
while forming the estimate of the number of index entries retrieved,
so we overestimate the cost of using this index.

That seems like a bug.  Tomas?

In the meantime, maybe you could dodge the problem by combining
"cropped" and "resized" into one multivalued column, so that there's
not a need to depend on extended stats to arrive at a decent estimate.

regards, tom lane




Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Andres Freund
Hi,

On 2022-04-19 10:36:24 -0700, Andres Freund wrote:
> On 2022-04-19 13:50:25 +0200, Erik Rijkers wrote:
> > The 12th run of statbug.sh crashed and gave a corefile.
> 
> I ran through quite a few iterations by now, without reproducing :(
> 
> I guess there's some timing issue and you're hitting it on your system
> due to the slower disks.

Ah. I found the issue. The new pgstat_report_stat(true) call in
LogicalRepApplyLoop()'s "timeout" section doesn't check if we're in a
transaction. And the transactional stats code doesn't handle that (never
has).

I think all that's needed is an if (IsTransactionState()) around that
pgstat_report_stat().
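
For illustration, a minimal sketch of that guard, reading the suggestion as "skip the report while the apply worker still has a transaction open"; this is only a sketch, not necessarily the committed fix:

/* in LogicalRepApplyLoop()'s timeout branch (sketch only) */
if (!IsTransactionState())
	pgstat_report_stat(true);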

It might be possible to put an assertion into pgstat_report_stat(), but
I need to look at the process exit code to see if it is.

Greetings,

Andres Freund




Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Andres Freund
Hi,

On 2022-04-19 13:50:25 +0200, Erik Rijkers wrote:
> The 12th run of statbug.sh crashed and gave a corefile.

I ran through quite a few iterations by now, without reproducing :(

I guess there's some timing issue and you're hitting it on your system
due to the slower disks.


> Program terminated with signal 6, Aborted.
> #0  0x00357d6324f5 in raise () from /lib64/libc.so.6
> (gdb) bt
> #0  0x00357d6324f5 in raise () from /lib64/libc.so.6
> #1  0x00357d633cd5 in abort () from /lib64/libc.so.6
> #2  0x00b3bada in ExceptionalCondition (conditionName=0xd389a1
> "tabstat->trans == trans", errorType=0xd388b2 "FailedAssertion",
> fileName=0xd388a0 "pgstat_relation.c", lineNumber=508) at assert.c:69
> #3  0x009bf5dc in AtEOXact_PgStat_Relations (xact_state=0x31b1b50,
> isCommit=true) at pgstat_relation.c:508
> #4  0x009c4107 in AtEOXact_PgStat (isCommit=true, parallel=false) at
> pgstat_xact.c:54
> #5  0x00583764 in CommitTransaction () at xact.c:2360
> #6  0x00584354 in CommitTransactionCommand () at xact.c:3048
> #7  0x0090b34e in apply_handle_commit_internal
> (commit_data=0x7ffd024b5940) at worker.c:1532
> #8  0x0090a287 in apply_handle_commit (s=0x7ffd024b59b0) at
> worker.c:845
> #9  0x0090ce3a in apply_dispatch (s=0x7ffd024b59b0) at worker.c:2473
> #10 0x0090d41c in LogicalRepApplyLoop (last_received=74680880) at
> worker.c:2757
> #11 0x0090e974 in start_apply (origin_startpos=0) at worker.c:3526
> #12 0x0090f156 in ApplyWorkerMain (main_arg=0) at worker.c:3782
> #13 0x008c7623 in StartBackgroundWorker () at bgworker.c:858
> #14 0x008d1557 in do_start_bgworker (rw=0x30ff0a0) at
> postmaster.c:5802
> #15 0x008d1903 in maybe_start_bgworkers () at postmaster.c:6026
> #16 0x008d09ba in sigusr1_handler (postgres_signal_arg=10) at
> postmaster.c:5191
> #17 <signal handler called>
> #18 0x00357d6e1683 in __select_nocancel () from /lib64/libc.so.6
> #19 0x008cc6c1 in ServerLoop () at postmaster.c:1757
> #20 0x008cc0aa in PostmasterMain (argc=11, argv=0x30d6590) at
> postmaster.c:1465
> #21 0x007c9256 in main (argc=11, argv=0x30d6590) at main.c:202
> (gdb) f 3
> #3  0x009bf5dc in AtEOXact_PgStat_Relations (xact_state=0x31b1b50,
> isCommit=true) at pgstat_relation.c:508
> 508 Assert(tabstat->trans == trans);
> (gdb) p tabstat
> $1 = (PgStat_TableStatus *) 0x319e630
> (gdb) p *tabstat
> $2 = {t_id = 2139062143, t_shared = 127, trans = 0x7f7f7f7f7f7f7f7f,
> t_counts = {t_numscans = 9187201950435737471, t_tuples_returned =
> 9187201950435737471, t_tuples_fetched = 9187201950435737471,
> t_tuples_inserted = 9187201950435737471, t_tuples_updated =
> 9187201950435737471, t_tuples_deleted = 9187201950435737471,
> t_tuples_hot_updated = 9187201950435737471, t_truncdropped = 127,
> t_delta_live_tuples = 9187201950435737471, t_delta_dead_tuples =
> 9187201950435737471, t_changed_tuples = 9187201950435737471,
> t_blocks_fetched = 9187201950435737471, t_blocks_hit = 9187201950435737471},
>   relation = 0x7f7f7f7f7f7f7f7f}
> (gdb) p trans
> $3 = (PgStat_TableXactStatus *) 0x31b1ba8
> (gdb) p *trans
> $4 = {tuples_inserted = 1, tuples_updated = 0, tuples_deleted = 0,
> truncdropped = false, inserted_pre_truncdrop = 0, updated_pre_truncdrop = 0,
> deleted_pre_truncdrop = 0, nest_level = 1, upper = 0x0,
>   parent = 0x319e630, next = 0x31b1ab8}
> (gdb)

Could you print out
p xact_state
p *xact_state
p xact_state->first
p *xact_state->first

Do you have the server log file for the failed run / instance?

Greetings,

Andres Freund




Re: Postgres perl module namespace

2022-04-19 Thread Andres Freund
Hi,

On 2022-04-19 11:36:44 -0400, Andrew Dunstan wrote:
> The attached three patches basically implement the new naming scheme for
> the back branches without doing away with the old scheme or doing a
> wholesale copy of the new modules.

That sounds like good plan!

I don't know perl enough to comment on the details, but it looks roughly
sane to me.

Greetings,

Andres Freund




Re: Defer selection of asynchronous subplans until the executor initialization stage

2022-04-19 Thread Zhihong Yu
On Tue, Apr 19, 2022 at 2:01 AM Etsuro Fujita 
wrote:

> Hi,
>
> On Sun, Apr 17, 2022 at 7:30 PM Zhihong Yu  wrote:
> > On Sun, Apr 17, 2022 at 1:48 AM Etsuro Fujita 
> wrote:
> >> I think we might support more cases in the switch statement in the
> >> future.  My concern about your proposal is that it might make it hard
> >> to add new cases to the statement.  I agree that what I proposed has a
> >> bit of redundant code, but writing code inside each case independently
> >> would make it easy to add them, making code consistent across branches
> >> and thus making back-patching easy.
>
> > When a new case arises where the plan is not a Result node, this func
> can be rewritten.
> > If there is only one such new case, the check at the beginning of the
> func can be tuned to exclude that case.
>
> Sorry, I don't agree with you.
>
> > I still think the check should be lifted to the beginning of the func
> (given the current cases).
>
> The given path isn't limited to SubqueryScanPath, ForeignPath and
> ProjectionPath, so another concern is extra cycles needed when the
> path is other path type that is projection-capable (e.g., Path for
> sequential scan, IndexPath, NestPath, ...).  Assume that the given
> path is a Path (that doesn't contain pseudoconstant quals).  In that
> case the given SeqScan plan node wouldn't contain a gating Result
> node, so if we put the if test at the top of the function, we need to
> execute not only the test but the switch statement for the given
> path/plan nodes.  But if we put the if test inside each case block, we
> only need to execute the switch statement, without executing the test.
> In the latter case I think we can save cycles for normal cases.
>
> In short: I don't think it's a great idea to put the if test at the
> top of the function.
>
> Best regards,
> Etsuro Fujita
>
Hi,
It is okay to keep the formulation in your patch.

Cheers


Re: GSoC: Database Load Stress Benchmark (2022)

2022-04-19 Thread Mark Wong
Hi!

On Mon, Apr 18, 2022 at 03:40:23PM +0200, Mohammad Zain Abbas wrote:
> Dear concerned,
> 
> I hope you are doing well.
> 
> I am Mohammad Zain Abbas, currently enrolled in Erasmus Mundus (BDMA)
> program. I would like you to have a look at my proposal for the "*Database
> Load Stress Benchmark" *project.
> 
> Link:
> https://docs.google.com/document/d/1TThl7ODGD301GkjITY2k4OU88fZIhc1XvJYGqPCnOns/edit?usp=sharing
> 
> I would appreciate any feedback or guidance that you are able to provide.

I think you've covered all the bases here.  Good luck!

Regards,
Mark




Re: Dump/Restore of non-default PKs

2022-04-19 Thread Simon Riggs
On Mon, 18 Apr 2022 at 22:05, Simon Riggs  wrote:
>
> On Mon, 18 Apr 2022 at 21:48, Tom Lane  wrote:
> >
> > "David G. Johnston"  writes:
> > > On Mon, Apr 18, 2022 at 1:00 PM Simon Riggs 
> > > wrote:
> > >> I propose that we change pg_dump so that when it creates a PK it does
> > >> so in 2 commands:
> > >> 1. CREATE [UNIQUE] INDEX iname ...
> > >> 2. ALTER TABLE .. ADD PRIMARY KEY USING INDEX iname;
> >
> > > Why not just get rid of the limitation that constraint definitions don't
> > > support non-default methods?
> >
> > That approach would be doubling down on the assumption that we can always
> > shoehorn more custom options into SQL-standard constraint clauses, and
> > we'll never fall foul of shift/reduce problems or future spec additions.
> > I think for example that USING INDEX TABLESPACE is a blot on humanity,
> > and I'd be very glad to see pg_dump stop using it in favor of doing
> > things as Simon suggests.
>
> Sigh, agreed. It's more work, but its cleaner in the longer term to
> separate indexes from constraints.
>
> I'll look in more detail and come back here later.
>
> Thanks both.

My original plan was to get pg_dump to generate

--
-- Name: foo foo_a_idx; Type: CONSTRAINT; Schema: public; Owner: postgres
--
CREATE UNIQUE INDEX foo_a_idx ON public.foo USING btree (a);
ALTER TABLE ONLY public.foo
ADD CONSTRAINT foo_a_idx PRIMARY KEY USING INDEX foo_a_idx;

so the index definition is generated as a CONSTRAINT, not an INDEX.

Separating things a bit more generates this output, which is what I
think we want:

--
-- Name: foo foo_a_idx; Type: CONSTRAINT; Schema: public; Owner: postgres
--
ALTER TABLE ONLY public.foo
ADD CONSTRAINT foo_a_idx PRIMARY KEY USING INDEX foo_a_idx;
--
-- Name: foo_a_idx; Type: INDEX; Schema: public; Owner: postgres
--
CREATE UNIQUE INDEX foo_a_idx ON public.foo USING btree (a);

Which is better, but there is still some ugly code for REPLICA
IDENTITY and CLUSTER duplicated in dumpIndex() and dumpConstraint().

The attached patch includes a change to pg_dump_sort.c which changes
the priority of CONSTRAINT, but that doesn't seem to have any effect
on the output. I'm hoping that's a quick fix, but I haven't seen it
yet, even after losing sanity points trying to read the priority code.

Anyway, the main question is how should the code be structured?

-- 
Simon Riggs        http://www.EnterpriseDB.com/


pg_dump_constraint_using_index.v1.patch
Description: Binary data


pg_dump_test_setup.sql
Description: Binary data


Re: Add --{no-,}bypassrls flags to createuser

2022-04-19 Thread Robert Haas
On Mon, Apr 18, 2022 at 9:50 PM Kyotaro Horiguchi
 wrote:
> Hmm.. So, "-r/--role" and "-m/--member(ship)" is the (least worse) way
> to go?  Or we can give up adding -m for the reason of being hard to
> name it..

Hmm, yeah, I hadn't quite realized what the problem was when I wrote
that. I honestly don't know what to do about that. Renaming the
existing option is not great, but having the syntax diverge between
SQL and CLI is not great either. Giving up is also not great. Not sure
what is best.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

2022-04-19 Thread Robert Haas
On Tue, Apr 19, 2022 at 10:36 AM Tom Lane  wrote:
> I'm not here to claim that there are precisely zero remaining bugs
> of this ilk.  I'm just saying that I think we've flushed out most
> of them.  I think there is some value in trying to think of a way
> to prove that none remain, but it's not a problem we can solve
> for v15.

Sure, that's fine.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: make MaxBackends available in _PG_init

2022-04-19 Thread Nathan Bossart
On Tue, Apr 19, 2022 at 05:49:13PM +0800, Julien Rouhaud wrote:
> On Mon, Apr 18, 2022 at 08:17:04PM -0400, Tom Lane wrote:
>> Nathan Bossart  writes:
>> > I'm looking for a clean way to ERROR if someone attempts to call
>> > RequestAddinShmemSpace() or RequestNamedLWLockTranche() outside of the
>> > hook.  Currently, we are using static variables in ipci.c and lwlock.c to
>> > silently ignore invalid requests.  I could add a new 'extern bool' called
>> > 'process_shmem_requests_in_progress', but extensions could easily hack
>> > around that to allow requests in _PG_init().  Maybe I am overthinking all
>> > this and that is good enough.
>> 
>> If they do that and it breaks something, that's their fault not ours.
>> (It's not like there's not $BIGNUM ways for a C-language module to
>> break the backend, anyway.)
> 
> Agreed.  Similarly the process_shared_preload_libraries_in_progress flag could
> be modified by an extension, and that wouldn't be any better.
> 
>> BTW, I'd make such errors FATAL, as it's unlikely that we can recover
>> cleanly from an error during initialization of a loadable module.
>> The module's likely to be only partially initialized/hooked in.
> 
> While at it, should we make process_shmem_requests_in_progress true when the
> new hook is called?  The hook should only be called when that's the case, and
> extension authors may feel like asserting it.

Okay, I did it this way in v5.
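
For illustration, a rough sketch of how a module might use the proposed hook and flag; the names follow this thread's patches, MySharedState and my_shmem_request are hypothetical, and the final committed API may differ:

#include "postgres.h"

#include "fmgr.h"
#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"

PG_MODULE_MAGIC;

/* hypothetical shared state for the example */
typedef struct MySharedState
{
	int			counter;
} MySharedState;

static shmem_request_hook_type prev_shmem_request_hook = NULL;

static void my_shmem_request(void);

void
_PG_init(void)
{
	/* request shared memory from the hook, not from _PG_init() itself */
	prev_shmem_request_hook = shmem_request_hook;
	shmem_request_hook = my_shmem_request;
}

static void
my_shmem_request(void)
{
	if (prev_shmem_request_hook)
		prev_shmem_request_hook();

	/* per the discussion above, requests are only legal while this is set */
	Assert(process_shmem_requests_in_progress);

	RequestAddinShmemSpace(MAXALIGN(sizeof(MySharedState)));
	RequestNamedLWLockTranche("my_extension", 1);
}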

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From f7c137ffe6fd8ba24f74398333fdad8832647f09 Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Tue, 12 Apr 2022 14:57:00 -0700
Subject: [PATCH v5 1/2] Fix comments about bgworker registration before
 MaxBackends initialization

Since 6bc8ef0b, InitializeMaxBackends() has used max_worker_processes
instead of adapting MaxBackends to the number of background workers
registered by modules loaded in shared_preload_libraries (at this time,
bgworkers were only static, but gained dynamic capabilities as part
of supporting parallel queries, meaning that a control cap was
necessary).

Some comments referred to the past registration logic, making them
confusing and incorrect, so fix these.

Some of the out-of-core modules that could be loaded in this path
sometimes like to dynamically manipulate some of the resource-related
GUCs for their own needs; this commit adds a note about that.

Author: Nathan Bossart
Discussion: https://postgr.es/m/20220127181815.GA551692@nathanxps13
---
 src/backend/postmaster/postmaster.c | 10 --
 src/backend/utils/init/postinit.c   |  5 ++---
 2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 964a56dec4..ce4007bb2c 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1005,10 +1005,8 @@ PostmasterMain(int argc, char *argv[])
 	LocalProcessControlFile(false);
 
 	/*
-	 * Register the apply launcher.  Since it registers a background worker,
-	 * it needs to be called before InitializeMaxBackends(), and it's probably
-	 * a good idea to call it before any modules had chance to take the
-	 * background worker slots.
+	 * Register the apply launcher.  It's probably a good idea to call this
+	 * before any modules had a chance to take the background worker slots.
 	 */
 	ApplyLauncherRegister();
 
@@ -1029,8 +1027,8 @@ PostmasterMain(int argc, char *argv[])
 #endif
 
 	/*
-	 * Now that loadable modules have had their chance to register background
-	 * workers, calculate MaxBackends.
+	 * Now that loadable modules have had their chance to alter any GUCs,
+	 * calculate MaxBackends.
 	 */
 	InitializeMaxBackends();
 
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 9139fe895c..a28612b375 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -538,9 +538,8 @@ pg_split_opts(char **argv, int *argcp, const char *optstr)
 /*
  * Initialize MaxBackends value from config options.
  *
- * This must be called after modules have had the chance to register background
- * workers in shared_preload_libraries, and before shared memory size is
- * determined.
+ * This must be called after modules have had the chance to alter GUCs in
+ * shared_preload_libraries and before shared memory size is determined.
  *
  * Note that in EXEC_BACKEND environment, the value is passed down from
  * postmaster to subprocesses via BackendParameters in SubPostmasterMain; only
-- 
2.25.1

>From 77cc9c5181d4f6dd477542d4875c166236aec2ed Mon Sep 17 00:00:00 2001
From: Nathan Bossart 
Date: Mon, 18 Apr 2022 15:25:37 -0700
Subject: [PATCH v5 2/2] Add a new shmem_request_hook hook.

Currently, preloaded libraries are expected to request additional
shared memory and LWLocks in _PG_init().  However, it is not unusual
for such requests to depend on MaxBackends, which won't be
initialized at that time.  Such requests could also depend on GUCs
that other modules m

Re: Postgres perl module namespace

2022-04-19 Thread Andrew Dunstan

On 2022-04-18 Mo 14:07, Tom Lane wrote:
> Andrew Dunstan  writes:
>> No, I think we could probably just port the whole of src/test/PostgreSQL
>> back if required, and have it live alongside the old modules. Each TAP
>> test is a separate miracle - see comments elsewhere about port
>> assignment in parallel TAP tests.
>> But that would mean we have some tests in the old flavor and some in the
>> new flavor in the back branches, which might get confusing.
> That works for back-patching entire new test scripts, but not for adding
> some cases to an existing script, which I think is more common.
>
>   


I think I've come up with a better scheme that I hope will fix all or
almost all of the pain complained of in this thread. I should note that
we deliberately delayed making these changes until fairly early in the
release 15 development cycle, and that was clearly a good decision.

The attached three patches basically implement the new naming scheme for
the back branches without doing away with the old scheme or doing a
wholesale copy of the new modules.

The first simply implements a proper "new" constructor for PostgresNode,
just like we have in PostgreSQL::Test::Cluster. It's not really essential
but it seems like a good idea. The second adds all the visible
functionality of the PostgresNode and TestLib modules to the
PostgreSQL::Test::Cluster and PostgreSQL::Test::Utils namespaces. The
third adds dummy packages so that any code doing 'use
PostgreSQL::Test::Utils;' or 'use PostgreSQL::Test::Cluster;' will
actually import the old modules. This last piece is where there might be
some extra work needed, to export the names so that using an unqualified
function or variable, say, 'slurp_file("foo");' will work. But in
general, modulo that issue, I believe things should Just Work (tm). You
should basically just be able to backpatch any new or modified TAP test
without difficulty, sed script usage, etc.

Comments welcome.


cheers


andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 92d9303e23..94b39341ce 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -137,16 +137,22 @@ INIT
 
 =over
 
-=item PostgresNode::new($class, $name, $pghost, $pgport)
+=item PostgresNode->new(node_name, %params)
 
-Create a new PostgresNode instance. Does not initdb or start it.
-
-You should generally prefer to use get_new_node() instead since it takes care
-of finding port numbers, registering instances for cleanup, etc.
+Class oriented alias for get_new_node()
 
 =cut
 
 sub new
+{
+	my $class = $_[0];
+	$class->isa(__PACKAGE__) || die "new() not called as a class method";
+	return get_new_node(@_);
+}
+
+# internal subroutine used by get_new_node(). Should not be called by any
+# external client of the module.
+sub _new
 {
 	my ($class, $name, $pghost, $pgport) = @_;
 	my $testname = basename($0);
@@ -1229,7 +1235,7 @@ sub get_new_node
 	}
 
 	# Lock port number found by creating a new node
-	my $node = $class->new($name, $host, $port);
+	my $node = _new($class, $name, $host, $port);
 
 	if ($params{install_path})
 	{
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 92d9303e23..f5d2ba72e6 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -2761,4 +2761,16 @@ sub corrupt_page_checksum
 
 =cut
 
+# support release 15+ perl module namespace
+
+package PostgreSQL::Test::Cluster;
+
+sub new
+{
+	shift; # remove class param from args
+	return PostgresNode->get_new_node(@_);
+}
+
+sub get_free_port { return PostgresNode::get_free_port(); }
+
 1;
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 0dfc414b07..92583f84b4 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -948,4 +948,46 @@ sub command_checks_all
 
 =cut
 
+# support release 15+ perl module namespace
+
+package PostgreSQL::Test::Utils;
+
+# we don't want to export anything, but we want to support things called
+# via this package name explicitly.
+
+# use typeglobs to alias these functions and variables
+
+*generate_ascii_string = *TestLib::generate_ascii_string;
+*slurp_dir = *TestLib::slurp_dir;
+*slurp_file = *TestLib::slurp_file;
+*append_to_file = *TestLib::append_to_file;
+*check_mode_recursive = *TestLib::check_mode_recursive;
+*chmod_recursive = *TestLib::chmod_recursive;
+*check_pg_config = *TestLib::check_pg_config;
+*dir_symlink = *TestLib::dir_symlink;
+*system_or_bail = *TestLib::system_or_bail;
+*system_log = *TestLib::system_log;
+*run_log = *TestLib::run_log;
+*run_command = *TestLib::run_command;
+sub pump_until { die "pump_until not implemented in TestLib"; }
+*command_ok = *TestLib::command_ok;
+*command_fails = *TestLib::command_fails;
+*command_exit_is = *TestLib::command_exit_is;
+*program_help_ok = *TestLib::program_help_ok;
+*program_version_ok = *TestLib::program_version_ok;
+*program_options_handling_ok = *TestL

Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Tom Lane
Justin Pryzby  writes:
> On Tue, Apr 19, 2022 at 10:00:01AM -0400, Robert Haas wrote:
>> (A) This is a new feature. Wait for v16.
>> (B) This is a bug fix. Commit it now and back-patch to v14.
>> (C) This is a cleanup that is OK to put into v15 even after feature
>> freeze but since it is a behavior change we shouldn't back-patch it.
>> I vote for (C). What do other people think?

> I thought the plan was to backpatch to v14.

> v14 psql had an unintentional behavior change, rejecting \d
> datname.nspname.relname.

I agree that the v14 behavior is a bug, so ordinarily I'd vote
for back-patching.

A possible objection to doing that is that the patch changes the
APIs of processSQLNamePattern and patternToSQLRegex.  We would avoid
making such a change in core-backend APIs in a minor release, but
I'm not certain whether there are equivalent stability concerns
for src/fe_utils/.

On the whole I'd vote for (B), with (C) as second choice.

regards, tom lane




Re: error handling in pqRowProcessor broken

2022-04-19 Thread Tom Lane
Peter Eisentraut  writes:
> The error handling for pqRowProcessor is described as
>   *Add the received row to the current async result (conn->result).
>   *Returns 1 if OK, 0 if error occurred.
>   *
>   * On error, *errmsgp can be set to an error string to be returned.
>   * If it is left NULL, the error is presumed to be "out of memory".

> I find that this doesn't work anymore.

Will look into it, thanks for reporting.

(Hmm, seems like this API spec is deficient anyway.  Is the error
string to be freed later?  Is it already translated?)

regards, tom lane




Re: automatically generating node support functions

2022-04-19 Thread Peter Eisentraut

On 19.04.22 16:39, Tom Lane wrote:

Maybe we should fix JsonPathSpec to be less creative while we
still can?  It's not real clear to me why that typedef even exists,
rather than using a String node, or just a plain char * field.


Yeah, let's get rid of it and use char *.

I see in JsonCommon a pathspec is converted to a String node, so it's 
not like JsonPathSpec is some kind of universal representation of the 
thing anyway.





Re: automatically generating node support functions

2022-04-19 Thread Tom Lane
Alvaro Herrera  writes:
> I rebased this mostly out of curiousity.  I fixed some smallish
> conflicts and fixed a typedef problem new in JSON support; however, even
> with these fixes it doesn't compile, because JsonPathSpec uses a novel
> typedef pattern that apparently will need bespoke handling in the
> gen_nodes_support.pl script.  It seemed better to post this even without
> that, though.

Maybe we should fix JsonPathSpec to be less creative while we
still can?  It's not real clear to me why that typedef even exists,
rather than using a String node, or just a plain char * field.

regards, tom lane




Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

2022-04-19 Thread Tom Lane
Robert Haas  writes:
> On Mon, Apr 18, 2022 at 4:07 PM Tom Lane  wrote:
>> There may be some corner cases that aren't described by any of these
>> three blanket scenarios, but they've got to be pretty few and far
>> between.

> My first thought whenever anything like this comes up is cursors,
> especially but not only holdable cursors. Also, plpgsql variables,
> maybe mixed with embedded COMMIT/ROLLBACK.

Those exact cases have had detoasting bugs in the past and are now fixed.

> I don't find it
> particularly hard to believe we have some bugs in
> insufficiently-well-considered parts of the system that pass around
> datums outside of the normal executor flow, but I don't know exactly
> how to find them all, either.

I'm not here to claim that there are precisely zero remaining bugs
of this ilk.  I'm just saying that I think we've flushed out most
of them.  I think there is some value in trying to think of a way
to prove that none remain, but it's not a problem we can solve
for v15.

regards, tom lane




error handling in pqRowProcessor broken

2022-04-19 Thread Peter Eisentraut

The error handling for pqRowProcessor is described as

 *Add the received row to the current async result (conn->result).
 *Returns 1 if OK, 0 if error occurred.
 *
 * On error, *errmsgp can be set to an error string to be returned.
 * If it is left NULL, the error is presumed to be "out of memory".

I find that this doesn't work anymore.  If you set *errmsgp = "some 
message" and return 0, then psql will just print a result set with zero 
rows.


Bisecting points to

commit 618c16707a6d6e8f5c83ede2092975e4670201ad
Author: Tom Lane 
Date:   Fri Feb 18 15:35:15 2022 -0500

Rearrange libpq's error reporting to avoid duplicated error text.

It is very uncommon to get an error from pqRowProcessor().  To 
reproduce, I inserted this code:


diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index c7c48d07dc..9c1b33c6e2 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -1124,6 +1124,12 @@ pqRowProcessor(PGconn *conn, const char **errmsgp)
return 0;
}

+   if (nfields == 7)
+   {
+   *errmsgp = "gotcha";
+   goto fail;
+   }
+
/*
 * Basically we just allocate space in the PGresult for each field and
 * copy the data over.

This will produce assorted failures in the regression tests that 
illustrate the effect.


(Even before the above commit, the handling of the returned message was 
a bit weird:  The error output was just the message string, without any 
prefix like "ERROR:".)





Re: Add version and data directory to initdb output

2022-04-19 Thread Daniel Gustafsson
> On 19 Apr 2022, at 15:56, David G. Johnston  
> wrote:

> The motivating situation had me placing it as close to the last line as 
> possible so my 8 line or so tmux panel would show it to me without scrolling. 
>  The version is all I cared about, but when writing the patch the path seemed 
> to be at least worth considering.
> 
> As for "Success", I'm confused about the --no-instructions choice to change 
> it the way it did, but given that precedent I only felt it important to leave 
> the word Success as the leading word on a line.  Scripts should be triggering 
> on the exit code anyway and presently --no-instructions removes the Success 
> acknowledgement completely anyway.

Good point, I forgot about the no-instructions option.

./daniel

Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Justin Pryzby
On Tue, Apr 19, 2022 at 10:00:01AM -0400, Robert Haas wrote:
> (A) This is a new feature. Wait for v16.
> (B) This is a bug fix. Commit it now and back-patch to v14.
> (C) This is a cleanup that is OK to put into v15 even after feature
> freeze but since it is a behavior change we shouldn't back-patch it.
> 
> I vote for (C). What do other people think?

I thought the plan was to backpatch to v14.

v14 psql had an unintentional behavior change, rejecting \d
datname.nspname.relname.

This patch is meant to relax that change by allowing datname, but only if it
matches the name of the current database ... without returning to the v13
behavior, which allowed arbitrary leading junk.

-- 
Justin




Re: [PATCH] Add native windows on arm64 support

2022-04-19 Thread Niyas Sait
 > Niyas, any updates on that?

Sorry for the delay! Configuring the scripts took some time. I have
successfully run the buildfarm animal script using my git repository [1]
which contains the proposed patch on a Windows Arm64 machine.

I made a request to add a new machine to build farm through [2].

I believe we should be able to enable the build farm machine once the
change has been merged.

Thanks,
Niyas

[1] https://github.com/nsait-linaro/postgres.git
[2] https://buildfarm.postgresql.org/cgi-bin/register-form.pl


On Thu, 7 Apr 2022 at 18:54, Andres Freund  wrote:

> Hi,
>
> On 2022-04-07 13:42:49 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2022-04-07 09:40:43 -0400, Tom Lane wrote:
> > >> If it causes problems down the road, how will we debug it?
> >
> > > If what causes problems down the road? Afaics the patch doesn't change
> > > anything outside of windows-on-arm, so it shouldn't cause any breakage
> we care
> > > about until we get a buildfarm animal.
> >
> > Do we have a volunteer to run a buildfarm animal?  I don't see much
> > point in thinking about this till there is one ready to go.
>
> OP said that they might be able to:
>
> https://www.postgresql.org/message-id/CAFPTBD_NZb%3Dq_6uE-QV%2BS%2Bpm%3DRc9sBKrg8CQ_Y3dc27Awrm7cQ%40mail.gmail.com
>
> Niyas, any updates on that?
>
> Greetings,
>
> Andres Freund
>


Re: Replace open mode with PG_BINARY_R/W/A macros

2022-04-19 Thread Tom Lane
Japin Li  writes:
> On Tue, 19 Apr 2022 at 14:20, Tom Lane  wrote:
>> I think the comment's at best misleading.  See e.g. 66f8687a8.
>> It might be okay to use "rb" to read a text file when there
>> is actually \r-stripping logic present, but you need to check
>> that.  Using "wb" to write a text file is flat wrong.

> Thanks for the detail explanation.  Should we remove the misleading comment?

We should rewrite it, not just remove it.  But I'm not 100% sure
what to say instead.  I wonder whether the comment's claims about
control-Z processing still apply on modern Windows.

Another question is whether we actually like the current shape of
the code.  I can see at least two different directions we might
prefer to the status quo:

* Invent '#define PG_TEXT_R "r"' and so on, and use those in the
calls that currently use plain "r" etc, establishing a project
policy that you should use one of these six macros and never the
underlying strings directly.  This perhaps has some advantages
in greppability and clarity of intent, but I can't help wondering
if it's mostly obsessive-compulsiveness.
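
For illustration, a minimal sketch of what those six macros might look
like -- the PG_TEXT_* names are purely hypothetical, and only the
PG_BINARY_* side mirrors the existing definitions in c.h:

#ifdef WIN32
#define PG_BINARY_R "rb"		/* existing: binary read */
#define PG_BINARY_W "wb"		/* existing: binary write */
#define PG_BINARY_A "ab"		/* existing: binary append */
#else
#define PG_BINARY_R "r"
#define PG_BINARY_W "w"
#define PG_BINARY_A "a"
#endif

/* hypothetical text-mode counterparts, the same string on every platform */
#define PG_TEXT_R	"r"			/* text read */
#define PG_TEXT_W	"w"			/* text write */
#define PG_TEXT_A	"a"			/* text append */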

* In the other direction, decide that the PG_BINARY_X macros are
offering no benefit at all and just rip 'em out, writing "rb" and
so on in their place.  POSIX specifies that the character "b" has
no effect on Unix-oid systems, and it has said that for thirty years
now, so we do not really need the platform dependency that presently
exists in the macro definitions.  The presence or absence of "b"
would serve fine as an indicator of intent, and there would be one
less PG-specific coding convention to remember.

Or maybe it's fine as-is.  Any sort of wide-ranging change like this
creates hazards for back-patching, so we shouldn't do it lightly.

regards, tom lane




Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread David G. Johnston
On Tue, Apr 19, 2022 at 7:00 AM Robert Haas  wrote:

> On Mon, Apr 18, 2022 at 3:39 PM Mark Dilger
>  wrote:
> > Since there hasn't been any agreement on that point, I've just rebased
> the patch to apply cleanly against the current master:
>
> This looks OK to me. There may be better ways to do some of it, but
> there's no rule against further improving the code later. Also, since
> the issue was introduced in v14, we probably shouldn't wait forever to
> do something about it. However, there is a procedural issue here now
> that we are past feature freeze. I think someone could defensibly take
> any of the following positions:
>
> (A) This is a new feature. Wait for v16.
> (B) This is a bug fix. Commit it now and back-patch to v14.
> (C) This is a cleanup that is OK to put into v15 even after feature
> freeze but since it is a behavior change we shouldn't back-patch it.
>
> I vote for (C). What do other people think?
>
>
I vote for (B).  The behavioral change in v14 turns working usage patterns
into errors when it should not have.  It is a design bug and a POLA
violation that should be corrected.

"""
such that the above example was
interpreted as schema=production, relation=marketing.customers.
This turns out to be highly unintuitive to users.
"""

My concern here about a behavior-affecting bug fix - which we do allow - is
reduced by the fact that this feature is almost exclusively an interactive one.
That supports not having only v14, and maybe v15, behave differently from
v13 and v16 when it comes to expected usage patterns:

"""
We've had reports that users sometimes copy-and-paste database- and
schema-qualified relation names from the logs.
"""

David J.


Re: GSoC Proposal Submission.

2022-04-19 Thread Stephen Frost
Greetings,

An actual message would be better when sending to this list in the
future.  Thanks for your GSoC proposal.

Stephen


signature.asc
Description: PGP signature


Re: Proposal for New and improved website for pgjdbc (JDBC) for GSOC 2022

2022-04-19 Thread Stephen Frost
Greetings,

* Vedant Gokhale (gokhalevedan...@gmail.com) wrote:
> I was reading the contributor guidelines and it mentioned sending the
> proposal to this email address. The guidelines also mentioned that you must
> be subscribed to the mailing list. Please let me know if something more has
> to be done.
> 
> The proposal is attached to this email

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: pg14 psql broke \d datname.nspname.relname

2022-04-19 Thread Robert Haas
On Mon, Apr 18, 2022 at 3:39 PM Mark Dilger
 wrote:
> Since there hasn't been any agreement on that point, I've just rebased the 
> patch to apply cleanly against the current master:

This looks OK to me. There may be better ways to do some of it, but
there's no rule against further improving the code later. Also, since
the issue was introduced in v14, we probably shouldn't wait forever to
do something about it. However, there is a procedural issue here now
that we are past feature freeze. I think someone could defensibly take
any of the following positions:

(A) This is a new feature. Wait for v16.
(B) This is a bug fix. Commit it now and back-patch to v14.
(C) This is a cleanup that is OK to put into v15 even after feature
freeze but since it is a behavior change we shouldn't back-patch it.

I vote for (C). What do other people think?

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Re: A qsort template

2022-04-19 Thread John Naylor
> Okay, this makes logical sense and is a smaller patch to boot. I've
> re-run my tests (attached) to make sure we have our bases covered. I'm
> sharing the min-of-five, as before, but locally I tried . The

My sentence there was supposed to read "I tried using median and it
was a bit less noisy".

-- 
John Naylor
EDB: http://www.enterprisedb.com




Re: Add version and data directory to initdb output

2022-04-19 Thread David G. Johnston
On Tue, Apr 19, 2022 at 2:28 AM Daniel Gustafsson  wrote:

> > On 16 Apr 2022, at 01:50, David G. Johnston 
> wrote:
>
> > initdb is already pretty chatty, and the version of the cluster being
> installed seems useful to include as well.
>
> That seems quite reasonable.
>
> > The data directory is probably less so - though I am thinking that the
> absolute path would be useful to report, especially when a relative path is
> specified (I didn't figure that part out yet, figured I'd get the idea
> approved before working out how to make it happen).
>
> I'm less convinced that it will be worth the additional code to make it
> portable across *nix/Windows etc.
>

ok

>
> > Moving "Success" to that "summary output" line and leaving the optional
> shell command line just be the shell command made sense to me.
>
> Looking at the output, couldn't it alternatively be printed grouped with
> the
> other info on the cluster, ie the final three rows in the example below:
>
>   ./bin/initdb -D data
>   The files belonging to this database system will be owned by user
> "".
>   This user must also own the server process.
>
>   The database cluster will be initialized with locale "en_US.UTF-8".
>   The default database encoding has accordingly been set to "UTF8".
>   The default text search configuration will be set to "english".
>
> How about 'The database cluster will be initialized with version "14.2".'
> added
> there, which then can keep the "Success" line in place in case existing
> scripts
> are triggering on that line?
>
>
The motivating situation had me placing it as close to the last line as
possible so my 8 line or so tmux panel would show it to me without
scrolling.  The version is all I cared about, but when writing the patch
the path seemed to be at least worth considering.

As for "Success", I'm confused about the --no-instructions choice to change
it the way it did, but given that precedent I only felt it important to
leave the word Success as the leading word on a line.  Scripts should be
triggering on the exit code anyway and presently --no-instructions removes
the Success acknowledgement completely anyway.

David J.


Re: A qsort template

2022-04-19 Thread John Naylor
On Tue, Apr 19, 2022 at 12:30 PM David Rowley  wrote:
>
> Thanks for looking at this.
>
> On Tue, 19 Apr 2022 at 02:11, John Naylor  
> wrote:
> > IIUC, this function is called by tuplesort_begin_common, which in turn
> > is called by tuplesort_begin_{heap, indexes, etc}. The latter callers
> > set the onlyKey and now oneKeySort variables as appropriate, and
> > sometimes hard-coded to false. Is it intentional to set them here
> > first?
> >
> > Falling under the polish that you were likely thinking of above:
>
> I did put the patch together quickly just for the benchmark and at the
> time I was subtly aware that the onlyKey field was being set using a
> similar condition as I was using to set the boolean field I'd added.
> On reflection today, it should be fine just to check if that field is
> NULL or not in the 3 new comparison functions. Similarly to before,
> this only needs to be done if the datums compare equally, so does not
> add any code to the path where the datums are non-equal.  It looks
> like the other tuplesort_begin_* functions use a different comparison
> function that will never make use of the specialization comparison
> functions added by 697492434.

Okay, this makes logical sense and is a smaller patch to boot. I've
re-run my tests (attached) to make sure we have our bases covered. I'm
sharing the min-of-five, as before, but locally I tried . The
regression is fixed, and most other differences from v15 seem to be
noise. It's possible the naturally fastest cases (pre-sorted ints and
bigints) are slower than v15-revert by more than expected from noise, but it's
not clear.

I think this is good to go.
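
For reference, the shape David describes above is roughly the following
(just a sketch of one of the specialized comparators, not the actual patch):

static int
qsort_tuple_unsigned_compare(SortTuple *a, SortTuple *b, Tuplesortstate *state)
{
	int			compare;

	compare = ApplySortComparator(a->datum1, a->isnull1,
								  b->datum1, b->isnull1,
								  &state->sortKeys[0]);
	if (compare != 0)
		return compare;

	/*
	 * Only reached when the first-key datums compare equally.  Checking the
	 * field for NULL here replaces the extra boolean that the earlier patch
	 * version added.
	 */
	if (state->onlyKey != NULL)
		return 0;

	return state->comparetup(a, b, state);
}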

-- 
John Naylor
EDB: http://www.enterprisedb.com


qsort-fix-regression-jcn2.ods
Description: application/vnd.oasis.opendocument.spreadsheet


minor MERGE cleanups

2022-04-19 Thread Alvaro Herrera
There's a complaint from Coverity about outer_plan being referenced while
possibly NULL, which can be silenced by using an existing local
variable.  0001 does that.

0002 and 0003 are unrelated: in the former, we avoid creating a separate
local variable planSlot when we can just refer to the eponymous member
of ModifyTableContext.  In the latter, we reduce the scope where
'lockmode' is defined by moving it from ModifyTableContext to
UpdateContext, which means we can save initializing it in a few spots;
this makes the code more natural.

I expect these fixups in new code should be uncontroversial.

-- 
Álvaro Herrera   48°01'N 7°57'E  —  https://www.EnterpriseDB.com/
>From 30a3bfadac90fabba239533a754dfe4efb85a5d8 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 19 Apr 2022 13:57:24 +0200
Subject: [PATCH 1/3] reuse local variable, to silence coverity

---
 src/backend/utils/adt/ruleutils.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 7e08d7fe6c..5d49f564a2 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -4996,7 +4996,7 @@ set_deparse_plan(deparse_namespace *dpns, Plan *plan)
 	if (IsA(plan, ModifyTable))
 	{
 		if (((ModifyTable *) plan)->operation == CMD_MERGE)
-			dpns->inner_tlist = dpns->outer_plan->targetlist;
+			dpns->inner_tlist = dpns->outer_tlist;
 		else
 			dpns->inner_tlist = ((ModifyTable *) plan)->exclRelTlist;
 	}
-- 
2.30.2

>From 309e6f9717890713dde6b2a62cf20dcafdc176d0 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 19 Apr 2022 13:57:08 +0200
Subject: [PATCH 2/3] use context.planSlot instead of having a separate
 planSlot

---
 src/backend/executor/nodeModifyTable.c | 24 ++--
 1 file changed, 10 insertions(+), 14 deletions(-)

diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 171575cd73..0de6abd5bb 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -3452,7 +3452,6 @@ ExecModifyTable(PlanState *pstate)
 	ResultRelInfo *resultRelInfo;
 	PlanState  *subplanstate;
 	TupleTableSlot *slot;
-	TupleTableSlot *planSlot;
 	TupleTableSlot *oldSlot;
 	ItemPointerData tuple_ctid;
 	HeapTupleData oldtupdata;
@@ -3525,10 +3524,10 @@ ExecModifyTable(PlanState *pstate)
 		if (pstate->ps_ExprContext)
 			ResetExprContext(pstate->ps_ExprContext);
 
-		planSlot = ExecProcNode(subplanstate);
+		context.planSlot = ExecProcNode(subplanstate);
 
 		/* No more tuples to process? */
-		if (TupIsNull(planSlot))
+		if (TupIsNull(context.planSlot))
 			break;
 
 		/*
@@ -3542,7 +3541,7 @@ ExecModifyTable(PlanState *pstate)
 			bool		isNull;
 			Oid			resultoid;
 
-			datum = ExecGetJunkAttribute(planSlot, node->mt_resultOidAttno,
+			datum = ExecGetJunkAttribute(context.planSlot, node->mt_resultOidAttno,
 		 &isNull);
 			if (isNull)
 			{
@@ -3556,9 +3555,8 @@ ExecModifyTable(PlanState *pstate)
  */
 if (operation == CMD_MERGE)
 {
-	EvalPlanQualSetSlot(&node->mt_epqstate, planSlot);
+	EvalPlanQualSetSlot(&node->mt_epqstate, context.planSlot);
 
-	context.planSlot = planSlot;
 	context.lockmode = 0;
 
 	ExecMerge(&context, node->resultRelInfo, NULL, node->canSetTag);
@@ -3589,13 +3587,13 @@ ExecModifyTable(PlanState *pstate)
 			 * ExecProcessReturning by IterateDirectModify, so no need to
 			 * provide it here.
 			 */
-			slot = ExecProcessReturning(resultRelInfo, NULL, planSlot);
+			slot = ExecProcessReturning(resultRelInfo, NULL, context.planSlot);
 
 			return slot;
 		}
 
-		EvalPlanQualSetSlot(&node->mt_epqstate, planSlot);
-		slot = planSlot;
+		EvalPlanQualSetSlot(&node->mt_epqstate, context.planSlot);
+		slot = context.planSlot;
 
 		tupleid = NULL;
 		oldtuple = NULL;
@@ -3637,9 +3635,8 @@ ExecModifyTable(PlanState *pstate)
 {
 	if (operation == CMD_MERGE)
 	{
-		EvalPlanQualSetSlot(&node->mt_epqstate, planSlot);
+		EvalPlanQualSetSlot(&node->mt_epqstate, context.planSlot);
 
-		context.planSlot = planSlot;
 		context.lockmode = 0;
 
 		ExecMerge(&context, node->resultRelInfo, NULL, node->canSetTag);
@@ -3698,7 +3695,6 @@ ExecModifyTable(PlanState *pstate)
 		}
 
 		/* complete context setup */
-		context.planSlot = planSlot;
 		context.lockmode = 0;
 
 		switch (operation)
@@ -3707,7 +3703,7 @@ ExecModifyTable(PlanState *pstate)
 /* Initialize projection info if first time for this table */
 if (unlikely(!resultRelInfo->ri_projectNewInfoValid))
 	ExecInitInsertProjection(node, resultRelInfo);
-slot = ExecGetInsertNewTuple(resultRelInfo, planSlot);
+slot = ExecGetInsertNewTuple(resultRelInfo, context.planSlot);
 slot = ExecInsert(&context, resultRelInfo, slot,
   node->canSetTag, NULL, NULL);
 break;
@@ -3737,7 +3733,7 @@ ExecModifyTable(PlanState *pstate)
 	   oldSlot))
 		elog(ERROR, "failed to fet

Re: Fwd: Declarative partitioning and partition pruning/check (+postgis)

2022-04-19 Thread Justin Pryzby
On Tue, Apr 19, 2022 at 02:39:12PM +0200, Mats Taraldsvik wrote:
> I'm re-trying this email here, as there were no answers in the psql-general
> list. Hope that's ok. (please cc me when answering as I'm not subscribed
> (yet) )

-hackers is for development and bug reports, so this isn't the right place.
If you had mailed on -performance, I would have responded there.

> The first part, getting the rows into the "right" partition isn't
> especially interesting: Reduce every geometry to a point, and use the x and
> y coordinates separately in a range partition. This is possible with
> PostgreSQL as it is a normal range partition on double.

I agree that it's conceptually simple.  Have you tried it ?

ts=# CREATE TABLE t(a geometry) PARTITION BY RANGE(st_x(a));
ts=# CREATE TABLE t1 PARTITION OF t FOR VALUES FROM (1) TO (2);
...

> The second part is more interesting. Whenever the spatial index is
> (implicitly or directly) used in a query, the partition pruning step
> (during execution) checks the spatial index's root bounding box to
> determine if the partition can be skipped.
> 
> Is this possible to achieve in PostgreSQL? There is already a function in
> PostGIS to get the spatial index root bounding box
> (_postgis_index_extent(tbl regclass, col text)), but I think the real issue
> is that the actual SQL query might not even call the index directly (SELECT
> * FROM a WHERE ST_Intersects(mygeom, a.geom) - the ST_Intersects function
> uses the index internally).

For partition pruning to work, a query would have to include a WHERE clause
which is sufficient to prune the partitions.  If the table is partitioned by
RANGE(st_x(col)), then the query would need to say "st_x(col) <= 11" (or
similar).  If st_x() is compared to a constant, then partition pruning can
happen at planning time; if not, it might (since v11) happen at execution time.

https://www.postgresql.org/docs/current/ddl-partitioning.html#DDL-PARTITION-PRUNING

I doubt your queries would have the necessary condition for this to do what
you want.  It would be easy to 1) try; and then 2) post a question with the
necessary SQL to set up the test, and show what you've tried.

-- 
Justin




Re: automatically generating node support functions

2022-04-19 Thread Alvaro Herrera
I rebased this mostly out of curiosity.  I fixed some smallish
conflicts and a typedef problem new in the JSON support; however, even
with these fixes it doesn't compile, because JsonPathSpec uses a novel
typedef pattern that apparently will need bespoke handling in the
gen_node_support.pl script.  It seemed better to post this even without
that, though.

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/
"El miedo atento y previsor es la madre de la seguridad" (E. Burke)
>From 56cf5918f1a2abcb285ad2d4d880779eaed388d0 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 19 Apr 2022 13:36:21 +0200
Subject: [PATCH 1/2] Complete JsonTableColumnType typedef

---
 src/include/nodes/parsenodes.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index da02658c81..b1f81feb46 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -1610,7 +1610,7 @@ typedef enum JsonQuotes
  * JsonTableColumnType -
  *		enumeration of JSON_TABLE column types
  */
-typedef enum
+typedef enum JsonTableColumnType
 {
 	JTC_FOR_ORDINALITY,
 	JTC_REGULAR,
-- 
2.30.2

>From 0cd2f44259c4778eb19568d9c2f3a81965642303 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 19 Apr 2022 12:03:19 +0200
Subject: [PATCH 2/2] Automatically generate node support functions

Add a script to automatically generate the node support functions
(copy, equal, out, and read, as well as the node tags enum) from the
struct definitions.

For each of the four node support files, it creates two include files,
e.g., copyfuncs.funcs.c and copyfuncs.switch.c, to include in the main
file.  All the scaffolding of the main file stays in place.

TODO: In this patch, I have only ifdef'ed out the code that could be
removed, mainly so that it won't constantly have merge conflicts.
Eventually, that should all be changed to delete the code.  All the
code comments that are worth keeping from those sections have already
been moved to the header files where the structs are defined.

I have tried to mostly make the coverage of the output match what is
currently there.  For example, one could now do out/read coverage of
utility statement nodes, but I have manually excluded those for now.
The reason is mainly that it's easier to diff the before and after,
and adding a bunch of stuff like this might require a separate
analysis and review.

Subtyping (TidScan -> Scan) is supported.

For the hard cases, you can just write a manual function and exclude
generating one.  For the not so hard cases, there is a way of
annotating struct fields to get special behaviors.  For example,
pg_node_attr(equal_ignore) has the field ignored in equal functions.

Discussion: https://www.postgresql.org/message-id/flat/c1097590-a6a4-486a-64b1-e1f9cc0533ce%40enterprisedb.com
---
 src/backend/Makefile  |   8 +-
 src/backend/nodes/.gitignore  |   4 +
 src/backend/nodes/Makefile|  46 ++
 src/backend/nodes/copyfuncs.c |  19 +-
 src/backend/nodes/equalfuncs.c|  22 +-
 src/backend/nodes/gen_node_support.pl | 729 ++
 src/backend/nodes/outfuncs.c  |  34 +-
 src/backend/nodes/readfuncs.c |  23 +-
 src/include/nodes/.gitignore  |   2 +
 src/include/nodes/nodes.h |  28 +
 src/include/nodes/parsenodes.h|   3 +-
 src/include/nodes/pathnodes.h | 170 +++---
 src/include/nodes/plannodes.h |  90 ++--
 src/include/nodes/primnodes.h |  33 +-
 src/include/utils/rel.h   |   6 +-
 src/tools/msvc/Solution.pm|  46 ++
 16 files changed, 1114 insertions(+), 149 deletions(-)
 create mode 100644 src/backend/nodes/.gitignore
 create mode 100644 src/backend/nodes/gen_node_support.pl
 create mode 100644 src/include/nodes/.gitignore

diff --git a/src/backend/Makefile b/src/backend/Makefile
index 4a02006788..821bef2694 100644
--- a/src/backend/Makefile
+++ b/src/backend/Makefile
@@ -143,11 +143,15 @@ storage/lmgr/lwlocknames.h: storage/lmgr/generate-lwlocknames.pl storage/lmgr/lw
 submake-catalog-headers:
 	$(MAKE) -C catalog distprep generated-header-symlinks
 
+# run this unconditionally to avoid needing to know its dependencies here:
+submake-nodes-headers:
+	$(MAKE) -C nodes distprep generated-header-symlinks
+
 # run this unconditionally to avoid needing to know its dependencies here:
 submake-utils-headers:
 	$(MAKE) -C utils distprep generated-header-symlinks
 
-.PHONY: submake-catalog-headers submake-utils-headers
+.PHONY: submake-catalog-headers submake-nodes-headers submake-utils-headers
 
 # Make symlinks for these headers in the include directory. That way
 # we can cut down on the -I options. Also, a symlink is automatically
@@ -162,7 +166,7 @@ submake-utils-headers:
 
 .PHONY: generated-headers
 
-generated-headers: $(top_builddir)/src/include/parser/gram.h $(top_builddir)/src/include/storage/lwlocknames.h subma

Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Alvaro Herrera
On 2022-Apr-19, Erik Rijkers wrote:

> (gdb) p tabstat
> $1 = (PgStat_TableStatus *) 0x319e630
> (gdb) p *tabstat
> $2 = {t_id = 2139062143, t_shared = 127, trans = 0x7f7f7f7f7f7f7f7f,
> t_counts = {t_numscans = 9187201950435737471, t_tuples_returned =
> 9187201950435737471, t_tuples_fetched = 9187201950435737471,
> t_tuples_inserted = 9187201950435737471, t_tuples_updated =
> 9187201950435737471, t_tuples_deleted = 9187201950435737471,
> t_tuples_hot_updated = 9187201950435737471, t_truncdropped = 127,
> t_delta_live_tuples = 9187201950435737471, t_delta_dead_tuples =
> 9187201950435737471, t_changed_tuples = 9187201950435737471,
> t_blocks_fetched = 9187201950435737471, t_blocks_hit = 9187201950435737471},
>   relation = 0x7f7f7f7f7f7f7f7f}

It looks like this struct is freed or is in a memory context that was
reset.  Perhaps its lifetime wasn't carefully considered in the logical
replication code, which takes some shortcuts.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/
"El que vive para el futuro es un iluso, y el que vive para el pasado,
un imbécil" (Luis Adler, "Los tripulantes de la noche")




Fwd: Declarative partitioning and partition pruning/check

2022-04-19 Thread Mats Taraldsvik
I'm re-trying this email here, as there were no answers in the psql-general
list. Hope that's ok. (please cc me when answering as I'm not subscribed
(yet) )

Hi,

I have tried to read about Oracle's spatial partitioning feature (
https://www.oracle.com/technetwork/database/enterprise-edition/spatial-twp-partitioningbp-10gr2-05-134277.pdf)
and wondered if something like this is possible with PostgreSQL (with
PostGIS):

The first part, getting the rows into the "right" partition isn't
especially interesting: Reduce every geometry to a point, and use the x and
y coordinates separately in a range partition. This is possible with
PostgreSQL as it is a normal range partition on double.

The second part is more interesting. Whenever the spatial index is
(implicitly or directly) used in a query, the partition pruning step
(during execution) checks the spatial index's root bounding box to
determine if the partition can be skipped.

Is this possible to achieve in PostgreSQL? There is already a function in
PostGIS to get the spatial index root bounding box
(_postgis_index_extent(tbl regclass, col text)), but I think the real issue
is that the actual SQL query might not even call the index directly (SELECT
* FROM a WHERE ST_Intersects(mygeom, a.geom) - the ST_Intersects function
uses the index internally).

Best Regards,
Mats Taraldsvik


Re: Patch a potential memory leak in describeOneTableDetails()

2022-04-19 Thread Daniel Gustafsson
> On 24 Feb 2022, at 14:55, Daniel Gustafsson  wrote:
>> On 24 Feb 2022, at 07:32, Kyotaro Horiguchi  wrote:

>> So the attached does that.
> 
> I think this is reasonable, since it's pretty clear that more(1) supporting
> "-x" is fairly rare. I'll await others commenting.

The original patch which prompted this got a bit buried; the attached 0001
contains that fix (using free(), which matches the surrounding code) and 0002
has the documentation fixes for the more/less -x command.
 
--
Daniel Gustafsson   https://vmware.com/



v2-0002-Fix-out-of-date-description-mentioning-the-comman.patch
Description: Binary data


v2-0001-Fix-small-memory-leak-in-psql-d.patch
Description: Binary data


Re: using an end-of-recovery record in all cases

2022-04-19 Thread Robert Haas
On Mon, Apr 18, 2022 at 11:56 PM Amul Sul  wrote:
> I think RequestCheckpoint() should be called conditionally.  What is the need
> of the checkpoint if we haven't been through the recovery, in other words,
> starting up from a clean shutdown?

Good point. v5 attached.

-- 
Robert Haas
EDB: http://www.enterprisedb.com


v5-0001-Remove-the-end-of-recovery-checkpoint-in-all-case.patch
Description: Binary data


v5-0002-Remove-previous-timeline-ID-from-checkpoint.patch
Description: Binary data


Re: postgres_fdw: batch inserts vs. before row triggers

2022-04-19 Thread Tomas Vondra



On 4/19/22 11:16, Etsuro Fujita wrote:
> On Sun, Apr 17, 2022 at 6:20 PM Etsuro Fujita  wrote:
>> Here
>> is an example using HEAD:
>>
>> create extension postgres_fdw;
>> create server loopback foreign data wrapper postgres_fdw options
>> (dbname 'postgres');
>> create user mapping for current_user server loopback;
>> create table t (a int);
>> create foreign table ft (a int) server loopback options (table_name 't');
>> create function ft_rowcount_tf() returns trigger as $$ begin raise
>> notice '%: rows = %', tg_name, (select count(*) from ft); return new;
>> end; $$ language plpgsql;
>> create trigger ft_rowcount before insert on ft for each row execute
>> function ft_rowcount_tf();
>>
>> insert into ft select i from generate_series(1, 10) i;
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 1
>> NOTICE:  ft_rowcount: rows = 2
>> NOTICE:  ft_rowcount: rows = 3
>> NOTICE:  ft_rowcount: rows = 4
>> NOTICE:  ft_rowcount: rows = 5
>> NOTICE:  ft_rowcount: rows = 6
>> NOTICE:  ft_rowcount: rows = 7
>> NOTICE:  ft_rowcount: rows = 8
>> NOTICE:  ft_rowcount: rows = 9
>> INSERT 0 10
>>
>> This looks good, but when batch insert is enabled, the trigger
>> produces incorrect results:
>>
>> alter foreign table ft options (add batch_size '10');
>> delete from ft;
>>
>> insert into ft select i from generate_series(1, 10) i;
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> NOTICE:  ft_rowcount: rows = 0
>> INSERT 0 10
> 
> Actually, the results are correct, as we do batch-insert here.  But I
> just wanted to show that the trigger behaves *differently* when doing
> batch-insert.
> 
>> So I think we should disable batch insert in such cases, just as we
>> disable multi insert when there are any before row triggers on the
>> target (local) tables/partitions in copyfrom.c.  Attached is a patch
>> for that.
> 
> If there are no objections from Tomas or anyone else, I'll commit the patch.
> 

+1, I think it's a bug to do batch insert in this case.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-04-19 Thread Fabien COELHO




I find it a little annoying that there is no change in tests; it
means that the format is not checked at all :-(


Yeah. Perhaps it's a little bit hard to perform this kind of test in
the TAP test?


Not really. I'll look into it.

--
Fabien.




Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Erik Rijkers

Op 19-04-2022 om 11:25 schreef Kyotaro Horiguchi:

Thanks, Erik.

At Tue, 19 Apr 2022 07:00:30 +0200, Erik Rijkers  wrote in

Op 19-04-2022 om 02:15 schreef Kyotaro Horiguchi:

Could you read tabstat, *tabstat, trans, *trans here?


To be honest I'm not sure how to, but I gave it a try:




I rebuilt the newest master (a62bff74b135) with

export CUSTOM_COPT="-O0 -g"

The 12th run of statbug.sh crashed and gave a corefile.


GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later 


This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu".
For bug reporting instructions, please see:
...
Reading symbols from 
/home/aardvark/pg_stuff/pg_installations/pgsql.HEAD/bin/postgres...done.

[New LWP 25058]

warning: Can't read pathname for load map: Input/output error.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: logical replication worker for 
subscription 16411   '.

Program terminated with signal 6, Aborted.
#0  0x00357d6324f5 in raise () from /lib64/libc.so.6
(gdb) bt
#0  0x00357d6324f5 in raise () from /lib64/libc.so.6
#1  0x00357d633cd5 in abort () from /lib64/libc.so.6
#2  0x00b3bada in ExceptionalCondition (conditionName=0xd389a1 
"tabstat->trans == trans", errorType=0xd388b2 "FailedAssertion", 
fileName=0xd388a0 "pgstat_relation.c", lineNumber=508) at assert.c:69
#3  0x009bf5dc in AtEOXact_PgStat_Relations 
(xact_state=0x31b1b50, isCommit=true) at pgstat_relation.c:508
#4  0x009c4107 in AtEOXact_PgStat (isCommit=true, 
parallel=false) at pgstat_xact.c:54

#5  0x00583764 in CommitTransaction () at xact.c:2360
#6  0x00584354 in CommitTransactionCommand () at xact.c:3048
#7  0x0090b34e in apply_handle_commit_internal 
(commit_data=0x7ffd024b5940) at worker.c:1532
#8  0x0090a287 in apply_handle_commit (s=0x7ffd024b59b0) at 
worker.c:845

#9  0x0090ce3a in apply_dispatch (s=0x7ffd024b59b0) at worker.c:2473
#10 0x0090d41c in LogicalRepApplyLoop (last_received=74680880) 
at worker.c:2757

#11 0x0090e974 in start_apply (origin_startpos=0) at worker.c:3526
#12 0x0090f156 in ApplyWorkerMain (main_arg=0) at worker.c:3782
#13 0x008c7623 in StartBackgroundWorker () at bgworker.c:858
#14 0x008d1557 in do_start_bgworker (rw=0x30ff0a0) at 
postmaster.c:5802

#15 0x008d1903 in maybe_start_bgworkers () at postmaster.c:6026
#16 0x008d09ba in sigusr1_handler (postgres_signal_arg=10) at 
postmaster.c:5191

#17 
#18 0x00357d6e1683 in __select_nocancel () from /lib64/libc.so.6
#19 0x008cc6c1 in ServerLoop () at postmaster.c:1757
#20 0x008cc0aa in PostmasterMain (argc=11, argv=0x30d6590) at 
postmaster.c:1465

#21 0x007c9256 in main (argc=11, argv=0x30d6590) at main.c:202
(gdb) f 3
#3  0x009bf5dc in AtEOXact_PgStat_Relations 
(xact_state=0x31b1b50, isCommit=true) at pgstat_relation.c:508

508 Assert(tabstat->trans == trans);
(gdb) p tabstat
$1 = (PgStat_TableStatus *) 0x319e630
(gdb) p *tabstat
$2 = {t_id = 2139062143, t_shared = 127, trans = 0x7f7f7f7f7f7f7f7f, 
t_counts = {t_numscans = 9187201950435737471, t_tuples_returned = 
9187201950435737471, t_tuples_fetched = 9187201950435737471,
t_tuples_inserted = 9187201950435737471, t_tuples_updated = 
9187201950435737471, t_tuples_deleted = 9187201950435737471, 
t_tuples_hot_updated = 9187201950435737471, t_truncdropped = 127,
t_delta_live_tuples = 9187201950435737471, t_delta_dead_tuples = 
9187201950435737471, t_changed_tuples = 9187201950435737471, 
t_blocks_fetched = 9187201950435737471, t_blocks_hit = 9187201950435737471},

  relation = 0x7f7f7f7f7f7f7f7f}
(gdb) p trans
$3 = (PgStat_TableXactStatus *) 0x31b1ba8
(gdb) p *trans
$4 = {tuples_inserted = 1, tuples_updated = 0, tuples_deleted = 0, 
truncdropped = false, inserted_pre_truncdrop = 0, updated_pre_truncdrop 
= 0, deleted_pre_truncdrop = 0, nest_level = 1, upper = 0x0,

  parent = 0x319e630, next = 0x31b1ab8}
(gdb)



Looks like we're one step further, no?


Erik



(gdb) p tabstat
$1 = 


Great!  That's it.  But unfortunately they are optimized out.  Could
you reproduce the crash with an -O0 binary?  You will see the variables with
it.  You can rebuild with the option as follows.

$ make clean; make install CUSTOM_COPT="-O0 -g"

You can still dump the whole xact_state chain from the current core
file, but the result will give only a somewhat obscure hint for diagnosis.

regards.






Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

2022-04-19 Thread Robert Haas
On Mon, Apr 18, 2022 at 4:07 PM Tom Lane  wrote:
> There may be some corner cases that aren't described by any of these
> three blanket scenarios, but they've got to be pretty few and far
> between.

My first thought whenever anything like this comes up is cursors,
especially but not only holdable cursors. Also, plpgsql variables,
maybe mixed with embedded COMMIT/ROLLBACK. I don't find it
particularly hard to believe we have some bugs in
insufficiently-well-considered parts of the system that pass around
datums outside of the normal executor flow, but I don't know exactly
how to find them all, either.

-- 
Robert Haas
EDB: http://www.enterprisedb.com




Bad estimate with partial index

2022-04-19 Thread André Hänsel
Hi list,

I have a case where Postgres chooses the wrong index and I'm not sure what
to do about it:

https://dbfiddle.uk/?rdbms=postgres_14&fiddle=f356fd56a920ea8a93c192f5a8c16b1c

Setup:

CREATE TABLE t (
filename int,
cropped bool not null default false,
resized bool not null default false,
create_date date not null default '1970-01-01'
);

INSERT INTO t
SELECT generate_series(1, 100);

UPDATE t SET cropped = true, resized = true
WHERE filename IN (SELECT filename FROM t ORDER BY random() LIMIT 90);
UPDATE t SET resized = false
WHERE filename IN (SELECT filename FROM t WHERE cropped = true ORDER BY
random() LIMIT 1000);

VACUUM FULL t;
ANALYZE t;

Data now looks like this:

SELECT cropped, resized, count(*)
FROM t
GROUP BY 1,2;

I create two indexes:

CREATE INDEX idx_resized ON t(resized) WHERE NOT resized;
CREATE INDEX specific ON t(cropped,resized) WHERE cropped AND NOT resized;

And then run my query:

EXPLAIN ANALYZE
SELECT count(*) FROM t WHERE cropped AND NOT resized AND create_date <
CURRENT_DATE;

Aggregate  (cost=4001.25..4001.26 rows=1 width=8) (actual
time=478.557..478.558 rows=1 loops=1)
  ->  Index Scan using idx_resized on t  (cost=0.29..3777.71 rows=89415
width=0) (actual time=478.177..478.480 rows=1000 loops=1)
Filter: (cropped AND (create_date < CURRENT_DATE))
Rows Removed by Filter: 10

It takes 478 ms on dbfiddle.uk (on my machine it's faster but the difference
is still visible).

Now I delete an index:

DROP INDEX idx_resized;

and run the same query again and I get a much better plan:

Aggregate  (cost=11876.27..11876.28 rows=1 width=8) (actual
time=0.315..0.316 rows=1 loops=1)
  ->  Bitmap Heap Scan on t  (cost=35.50..11652.73 rows=89415 width=0)
(actual time=0.054..0.250 rows=1000 loops=1)
Recheck Cond: (cropped AND (NOT resized))
Filter: (create_date < CURRENT_DATE)
Heap Blocks: exact=6
->  Bitmap Index Scan on specific  (cost=0.00..13.15 rows=89415
width=0) (actual time=0.040..0.040 rows=1000 loops=1)

which uses the index specific and completes in less than a ms on both
dbfiddle.uk and my machine.

Additional mystery - when I set the values not with an UPDATE but with a
DEFAULT, then the correct index is chosen. What is going on?
https://dbfiddle.uk/?rdbms=postgres_14&fiddle=dc7d8aea14e90f08ab6537a855f34d8c

Regards,
André





Re: typos

2022-04-19 Thread Alvaro Herrera
CCing Amit K, because I propose a few relatively minor changes to
logical rep docs.

On 2022-Apr-13, Justin Pryzby wrote:

> $ git grep -F ", the default)"
> doc/src/sgml/ref/create_publication.sgml:   partition's row filter (if the 
> parameter is false, the default) or the root
> 
> Maybe what's needed is more like this.
> 
>If the publication contains a partitioned table, and the publication 
> parameter
>publish_via_partition_root is false (the default), then 
> the
>row filter is taken from the partition; otherwise, the row filter is taken
>from the root partitioned table.

Yeah, more invasive rewording seems called for.  I propose this:

   For publications containing partitioned tables, the row filter for each
   partition is taken from the published partitioned table if the
   publication parameter publish_via_partition_root is true,
   or from the partition itself if it is false (the default).

I think we should also mention that this parameter affects row filters,
in the  for the WITH clause.  Currently it has

publish_via_partition_root 
(boolean)

 
  This parameter determines whether changes in a partitioned table (or
  on its partitions) contained in the publication will be published
  using the identity and schema of the partitioned table rather than
  that of the individual partitions that are actually changed; the
  latter is the default.  Enabling this allows the changes to be
  replicated into a non-partitioned table or a partitioned table
  consisting of a different set of partitions.
 

I propose to add 

publish_via_partition_root 
(boolean)

 
  This parameter determines whether changes in a partitioned table (or
  on its partitions) contained in the publication will be published
  using the identity and schema of the partitioned table rather than
  that of the individual partitions that are actually changed; the
  latter is the default.  Enabling this allows the changes to be
  replicated into a non-partitioned table or a partitioned table
  consisting of a different set of partitions.
 


 This parameter also affects how row filters are chosen for partitions;
 see below for details.


More generally, I think we need to connect the WHERE keyword with "row
filters" more explicitly.  Right now, the parameter reference says

  If the optional WHERE clause is specified, rows for
  which the expression
  evaluates to false or null will not be published. Note that parentheses
  are required around the expression. It has no effect on
  TRUNCATE commands.

I propose to make it "If the optional WHERE clause is specified, it
defines a row filter expression.  Rows for which
the row filter expression evaluates to false ..."


> $ git grep 'zstd.*product' doc
> doc/src/sgml/config.sgml:zstd (if 
> PostgreSQL
> $ git grep 'ZSTD.*product' doc
> doc/src/sgml/install-windows.sgml: 
> ZSTD
> doc/src/sgml/install-windows.sgml:  Required for supporting 
> ZSTD compression
> doc/src/sgml/installation.sgml:  You need 
> ZSTD, if you want to support
> doc/src/sgml/installation.sgml: Build with 
> ZSTD compression support.

I don't see any official sources calling it all-uppercase ZSTD.  In a
quick non-scientific survey, most seem to use Zstd or zstd.  The
non-abbreviated official name is Zstandard, but it's hard to find any
places using that spelling, and I don't think our docs are a place to
educate people on what the official name or pronunciation is.

I propose we standardize on Zstd everywhere.
Users can look it up if they're really interested.

-- 
Álvaro Herrera PostgreSQL Developer  —  https://www.EnterpriseDB.com/




Re: Fix NULL pointer reference in _outPathTarget()

2022-04-19 Thread Alvaro Herrera
On 2022-Apr-18, Tom Lane wrote:

> I suppose that Peter was trying to remove special cases from the
> outfuncs.c code, but do we want to put this one back?  Richard's
> proposal would not accurately reflect the contents of the data
> structure, so I'm not too thrilled with it.

Yeah -- looking at the script to generate node support functions[1], it
might be better to go back to the original formulation (i.e., your proposed
patch), and then use a "path_hack4" for this struct member, which looks
similar to other hacks already there for other cases that require
bespoke handling.

[1] https://postgr.es/m/bee9fdb0-cd10-5fdb-3027-c4b5a240b...@enterprisedb.com

-- 
Álvaro HerreraBreisgau, Deutschland  —  https://www.EnterpriseDB.com/




Re: make MaxBackends available in _PG_init

2022-04-19 Thread Julien Rouhaud
Hi,

On Mon, Apr 18, 2022 at 08:17:04PM -0400, Tom Lane wrote:
> Nathan Bossart  writes:
> > I'm looking for a clean way to ERROR if someone attempts to call
> > RequestAddinShmemSpace() or RequestNamedLWLockTranche() outside of the
> > hook.  Currently, we are using static variables in ipci.c and lwlock.c to
> > silently ignore invalid requests.  I could add a new 'extern bool' called
> > 'process_shmem_requests_in_progress', but extensions could easily hack
> > around that to allow requests in _PG_init().  Maybe I am overthinking all
> > this and that is good enough.
> 
> If they do that and it breaks something, that's their fault not ours.
> (It's not like there's not $BIGNUM ways for a C-language module to
> break the backend, anyway.)

Agreed.  Similarly, the process_shared_preload_libraries_in_progress flag could
be modified by an extension, and that wouldn't be any better.

> BTW, I'd make such errors FATAL, as it's unlikely that we can recover
> cleanly from an error during initialization of a loadable module.
> The module's likely to be only partially initialized/hooked in.

While at it, should we make process_shmem_requests_in_progress true when the
new hook is called?  The hook should only be called when that's the case, and
extension authors may feel like asserting it.
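
For illustration, a rough sketch of what an extension might then do (the
hook and flag names follow the proposal in this thread and are not final):

/* sketch only: assumes the proposed shmem_request_hook and flag exist */
static shmem_request_hook_type prev_shmem_request_hook = NULL;

static void
my_shmem_request(void)
{
	Assert(process_shmem_requests_in_progress);

	if (prev_shmem_request_hook)
		prev_shmem_request_hook();

	/* MySharedState is a made-up struct, for illustration only */
	RequestAddinShmemSpace(MAXALIGN(sizeof(MySharedState)));
	RequestNamedLWLockTranche("my_extension", 1);
}

void
_PG_init(void)
{
	prev_shmem_request_hook = shmem_request_hook;
	shmem_request_hook = my_shmem_request;
}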




Re: Stabilizing the test_decoding checks, take N

2022-04-19 Thread Amit Kapila
On Tue, Apr 19, 2022 at 11:38 AM Dilip Kumar  wrote:
>
> On Mon, Apr 18, 2022 at 3:29 PM Dilip Kumar  wrote:
>
> >
> > This needs to be verified once by doing some manual testing as it may
> > not be easily reproducible every time. If this happens to be true then
> > I think your suggestion related to increasing autovacuum_naptime would
> > work.
> >
> >
> > I will try to reproduce this, maybe by reducing the autovacuum_naptime or
> > running in parallel some script that continuously performs DDL-only
> > transactions.
>
> I have reproduced it [1] by repeatedly running the attached
> script (stream.sql) from one session and running the vacuum analyze in
> parallel from another session.
>
> I have also changed the config for testing decoding to set the
> autovacuum_naptime to 1d (patch attached)
>

Thanks, I am also able to see similar results. This shows the analysis
was right. I will push the autovacuum_naptime change in HEAD and 14
(as both contain this test) tomorrow unless someone thinks otherwise.


-- 
With Regards,
Amit Kapila.




Re: Add version and data directory to initdb output

2022-04-19 Thread Daniel Gustafsson
> On 16 Apr 2022, at 01:50, David G. Johnston  
> wrote:

> initdb is already pretty chatty, and the version of the cluster being 
> installed seems useful to include as well.  

That seems quite reasonable.

> The data directory is probably less so - though I am thinking that the 
> absolute path would be useful to report, especially when a relative path is 
> specified (I didn't figure that part out yet, figured I'd get the idea 
> approved before working out how to make it happen).

I'm less convinced that it will be worth the additional code to make it
portable across *nix/Windows etc.

> Moving "Success" to that "summary output" line and leaving the optional shell 
> command line just be the shell command made sense to me.

Looking at the output, couldn't it alternatively be printed grouped with the
other info on the cluster, ie the final three rows in the example below:

  ./bin/initdb -D data
  The files belonging to this database system will be owned by user 
"".
  This user must also own the server process.

  The database cluster will be initialized with locale "en_US.UTF-8".
  The default database encoding has accordingly been set to "UTF8".
  The default text search configuration will be set to "english".

How about 'The database cluster will be initialized with version "14.2".' added
there, which then can keep the "Success" line in place in case existing scripts
are triggering on that line?

--
Daniel Gustafsson   https://vmware.com/





Re: TRAP: FailedAssertion("tabstat->trans == trans", File: "pgstat_relation.c", Line: 508

2022-04-19 Thread Kyotaro Horiguchi
Thanks, Erik.

At Tue, 19 Apr 2022 07:00:30 +0200, Erik Rijkers  wrote in 
> Op 19-04-2022 om 02:15 schreef Kyotaro Horiguchi:
> > Could you read tabstat, *tabstat, trans, *trans here?
> 
> To be honest I'm not sure how to, but I gave it a try:
>
> (gdb) p tabstat
> $1 = 

Great!  That's it.  But unfortunately they are optimized out.  Could
you reproduce the crash with an -O0 binary?  You will see the variables with
it.  You can rebuild with the option as follows.

$ make clean; make install CUSTOM_COPT="-O0 -g"

You can still dump the whole xact_state chain from the current core
file, but the result will give only a somewhat obscure hint for diagnosis.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: postgres_fdw: batch inserts vs. before row triggers

2022-04-19 Thread Etsuro Fujita
On Sun, Apr 17, 2022 at 6:20 PM Etsuro Fujita  wrote:
> Here
> is an example using HEAD:
>
> create extension postgres_fdw;
> create server loopback foreign data wrapper postgres_fdw options
> (dbname 'postgres');
> create user mapping for current_user server loopback;
> create table t (a int);
> create foreign table ft (a int) server loopback options (table_name 't');
> create function ft_rowcount_tf() returns trigger as $$ begin raise
> notice '%: rows = %', tg_name, (select count(*) from ft); return new;
> end; $$ language plpgsql;
> create trigger ft_rowcount before insert on ft for each row execute
> function ft_rowcount_tf();
>
> insert into ft select i from generate_series(1, 10) i;
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 1
> NOTICE:  ft_rowcount: rows = 2
> NOTICE:  ft_rowcount: rows = 3
> NOTICE:  ft_rowcount: rows = 4
> NOTICE:  ft_rowcount: rows = 5
> NOTICE:  ft_rowcount: rows = 6
> NOTICE:  ft_rowcount: rows = 7
> NOTICE:  ft_rowcount: rows = 8
> NOTICE:  ft_rowcount: rows = 9
> INSERT 0 10
>
> This looks good, but when batch insert is enabled, the trigger
> produces incorrect results:
>
> alter foreign table ft options (add batch_size '10');
> delete from ft;
>
> insert into ft select i from generate_series(1, 10) i;
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> NOTICE:  ft_rowcount: rows = 0
> INSERT 0 10

Actually, the results are correct, as we do batch-insert here.  But I
just wanted to show that the trigger behaves *differently* when doing
batch-insert.

> So I think we should disable batch insert in such cases, just as we
> disable multi insert when there are any before row triggers on the
> target (local) tables/partitions in copyfrom.c.  Attached is a patch
> for that.

If there are no objections from Tomas or anyone else, I'll commit the patch.
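
For reference, the shape of the check is roughly as below (a sketch only,
mirroring the copyfrom.c precedent; the attached patch is authoritative):

/* sketch: fall back to single-row inserts when a BEFORE ROW INSERT trigger exists */
if (resultRelInfo->ri_TrigDesc != NULL &&
	resultRelInfo->ri_TrigDesc->trig_insert_before_row)
	resultRelInfo->ri_BatchSize = 1;
else
	resultRelInfo->ri_BatchSize =
		resultRelInfo->ri_FdwRoutine->GetForeignModifyBatchSize(resultRelInfo);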

Best regards,
Etsuro Fujita




Re: Defer selection of asynchronous subplans until the executor initialization stage

2022-04-19 Thread Etsuro Fujita
Hi,

On Sun, Apr 17, 2022 at 7:30 PM Zhihong Yu  wrote:
> On Sun, Apr 17, 2022 at 1:48 AM Etsuro Fujita  wrote:
>> I think we might support more cases in the switch statement in the
>> future.  My concern about your proposal is that it might make it hard
>> to add new cases to the statement.  I agree that what I proposed has a
>> bit of redundant code, but writing code inside each case independently
>> would make it easy to add them, making code consistent across branches
>> and thus making back-patching easy.

> When a new case arises where the plan is not a Result node, this func can be 
> rewritten.
> If there is only one such new case, the check at the beginning of the func 
> can be tuned to exclude that case.

Sorry, I don't agree with you.

> I still think the check should be lifted to the beginning of the func (given 
> the current cases).

The given path isn't limited to SubqueryScanPath, ForeignPath and
ProjectionPath, so another concern is the extra cycles needed when the
path is some other path type that is projection-capable (e.g., Path for
sequential scan, IndexPath, NestPath, ...).  Assume that the given
path is a Path (that doesn't contain pseudoconstant quals).  In that
case the given SeqScan plan node wouldn't contain a gating Result
node, so if we put the if test at the top of the function, we would need to
execute not only the test but also the switch statement for the given
path/plan nodes.  But if we put the if test inside each case block, we
only need to execute the switch statement, without executing the test.
In the latter case I think we can save cycles for normal cases.

In short: I don't think it's a great idea to put the if test at the
top of the function.

Best regards,
Etsuro Fujita




RE: Data is copied twice when specifying both child and parent table in publication

2022-04-19 Thread shiy.f...@fujitsu.com
On Tue, Apr 19, 2022 3:05 PM houzj.f...@fujitsu.com  
wrote:
>
> > -Original Message-
> > From: Wang, Wei/王 威 
> On Thursday, April 7, 2022 11:08 AM
> >
> > On Thur, Mar 10, 2021 at 10:08 AM houzj.f...@fujitsu.com wrote:
> > > Hi,
> > >
> > > When reviewing some logical replication related features. I noticed 
> > > another
> > > possible problem if the subscriber subscribes multiple publications which
> > > publish parent and child table.
> > >
> > > For example:
> > >
> > > pub
> > > create table t (a int, b int, c int) partition by range (a);
> > > create table t_1 partition of t for values from (1) to (10);
> > >
> > > create publication pub1 for table t
> > >   with (PUBLISH_VIA_PARTITION_ROOT);
> > > create publication pub2 for table t_1
> > >   with (PUBLISH_VIA_PARTITION_ROOT);
> > >
> > > sub
> > >  prepare table t and t_1
> > > CREATE SUBSCRIPTION sub CONNECTION 'port=1 dbname=postgres'
> > > PUBLICATION pub1, pub2;
> > >
> > > select * from pg_subscription_rel ;
> > >  srsubid | srrelid | srsubstate | srsublsn
> > > -+-++---
> > >16391 |   16385(t) | r  | 0/150D100
> > >16391 |   16388(t_1) | r  | 0/150D138
> > >
> > > If subscribe two publications one of them publish parent table with
> > > (pubviaroot=true) and another publish child table. Both the parent table 
> > > and
> > > child table will exist in pg_subscription_rel which also means we will do
> > > initial copy for both tables.
> > >
> > > But after initial copy, we only publish change with the schema of the 
> > > parent
> > > table(t). It looks a bit inconsistent.
> > >
> > > Based on the document of PUBLISH_VIA_PARTITION_ROOT option. I think
> > the
> > > expected behavior could be we only store the top most parent(table t) in
> > > pg_subscription_rel and do initial copy for it if pubviaroot is on. I 
> > > haven't
> > > thought about how to fix this and will investigate this later.
> > Hi,
> > I try to fix this bug. Attach the patch.
> >
> > The current HEAD get table list for one publication by invoking function
> > pg_get_publication_tables. If multiple publications are subscribed, then 
> > this
> > function is invoked multiple times. So option PUBLISH_VIA_PARTITION_ROOT
> > works
> > independently on every publication, I think it does not work correctly on
> > different publications of the same subscription.
> >
> > So I fix this bug by the following two steps:
> > First step,
> > I get oids of subscribed tables by publication list. Then for tables with 
> > the
> > same topmost root table, I filter them base on the option
> > PUBLISH_VIA_PARTITION_ROOT(see new function filter_partitions_oids).
> > After filtering, I get the final oid list.
> > Second step,
> > I get the required informations(nspname and relname) base on the oid list of
> > first step.
> 
> Thanks for updating the patch.
> I confirmed that the bug is fixed by this patch.
> 
> One suggestion: can we simplify the code by moving the logic of checking
> the ancestor into the SQL?  For example, we could filter the output of
> pg_publication_tables by adding a WHERE clause which checks whether the table
> is a partition and if its ancestor is also in the output. I think we can also
> filter out the needless partitions in this approach.
> 

I agree with you, and I tried to fix this problem in a simpler way. What we want
is to exclude the partitioned table whose ancestor also needs to be
replicated, so how about implementing that by using the following SQL when
getting the table list from the publisher?

SELECT DISTINCT ns.nspname, c.relname
FROM pg_catalog.pg_publication_tables t
JOIN pg_catalog.pg_namespace ns ON ns.nspname = t.schemaname
JOIN pg_catalog.pg_class c ON c.relname = t.tablename AND c.relnamespace = 
ns.oid
WHERE t.pubname IN ('p0','p2')
AND (c.relispartition IS FALSE OR NOT EXISTS (SELECT 1 FROM 
pg_partition_ancestors(c.oid)
WHERE relid IN ( SELECT DISTINCT (schemaname||'.'||tablename)::regclass::oid
FROM pg_catalog.pg_publication_tables t
WHERE t.pubname IN ('p0','p2') ) AND relid != c.oid));

Please find the attached patch which used this approach, I also merged the test
in Wang's patch into it.

Regards,
Shi yu


v2-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch
Description:  v2-0001-Fix-data-replicated-twice-when-specifying-PUBLISH.patch


Re: Crash in new pgstats code

2022-04-19 Thread Thomas Munro
On Tue, Apr 19, 2022 at 2:50 AM Andres Freund  wrote:
> Kestrel won't go that far back even - I set it up 23 days ago...

Here's a ~6 month old example from mylodon (I can't see much further
back than that with HTTP requests... I guess BF records are purged?):

https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=mylodon&dt=2021-10-19%2022%3A57%3A54&stg=recovery-check




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-04-19 Thread Tatsuo Ishii
>>> Ok. Fine with me. Possibly at some point there was the idea that there
>>> could be other failures counted, but there are none. Also, there have
>>> been questions about the failures detailed option, or whether the
>>> reports should always be detailed, and the result may be some kind of
>>> not convincing compromise.
>>
>> Attached is the patch to remove the column.
> 
> Patch applies cleanly. Compilation ok. Global and local "make check"
> ok.
> Doc build ok.

Thank you for reviewing. Patch pushed.

> I find it a little annoying that there is no change in tests; it
> means that the format is not checked at all :-(

Yeah. Perhaps it's a little bit hard to perform this kind of test in
the TAP test?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] WIP aPatch: Pgbench Serialization and deadlock errors

2022-04-19 Thread Fabien COELHO




Ok. Fine with me. Possibly at some point there was the idea that there
could be other failures counted, but there are none. Also, there have
been questions about the failures detailed option, or whether the
reports should always be detailed, and the result may be some kind of
not convincing compromise.


Attached is the patch to remove the column.


Patch applies cleanly. Compilation ok. Global and local "make check" ok.
Doc build ok.

I find it a little annoying that there is no change in tests; it means
that the format is not checked at all :-(


--
Fabien.




Re: Replace open mode with PG_BINARY_R/W/A macros

2022-04-19 Thread Japin Li


On Tue, 19 Apr 2022 at 14:20, Tom Lane  wrote:
> Japin Li  writes:
>> On Mon, 18 Apr 2022 at 22:41, Tom Lane  wrote:
>>> A lot of these changes look wrong to me: they are substituting "rb" for
>>> "r", etc, in places that mean to read text files.  You have to think
>>> about the Windows semantics.
>
>> I do this substituting, since the comment says it can be used for opening
>> text files.  Maybe I misunderstand the comment.
>
> I think the comment's at best misleading.  See e.g. 66f8687a8.
> It might be okay to use "rb" to read a text file when there
> is actually \r-stripping logic present, but you need to check
> that.  Using "wb" to write a text file is flat wrong.
>

Thanks for the detailed explanation.  Should we remove the misleading comment?

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Replace open mode with PG_BINARY_R/W/A macros

2022-04-19 Thread Japin Li


On Tue, 19 Apr 2022 at 14:14, Michael Paquier  wrote:
> On Tue, Apr 19, 2022 at 01:29:18PM +0800, Japin Li wrote:
>> On Mon, 18 Apr 2022 at 22:41, Tom Lane  wrote:
>>> Japin Li  writes:
 I found we defined PG_BINARY_R/W/A macros for opening files, however,
 there are some places use the constant strings.  IMO we should use
 those macros instead of constant strings.  Here is a patch for it.
 Any thoughts?
>>>
>>> A lot of these changes look wrong to me: they are substituting "rb" for
>>> "r", etc, in places that mean to read text files.  You have to think
>>> about the Windows semantics.
>
> This reminded me of the business from a couple of years ago in
> pgwin32_open() to enforce the text mode in the frontend if O_BINARY is
> not specified.
>
>> I did this substitution, since the comment says it can be used for opening
>> text files.  Maybe I misunderstood the comment.
>
> 'b' is normally ignored on POSIX platforms (per the Linux man page for
> fopen), but your patch has the effect of silently switching all those
> code paths to binary mode on Windows.  See _setmode() in pgwin32_open(),
> which changes the CRLF behavior when reading or writing such files,
> as described here:
> https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=msvc-170
>
> The change in adminpack.c would actually be fine, as 'b' should be ignored
> on non-WIN32, but Tom's point is not to take all the others lightly.

Oh, I understand your points.  Thanks for the explanation.
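
To make sure I have it right, here is a minimal sketch of the distinction
(just an illustration: the file names are made up, and the macro definitions
below are simplified stand-ins for the ones in c.h, not copied from it):

#include <stdio.h>

/*
 * Simplified stand-ins for the PG_BINARY_R macro in c.h (assumption: the
 * real macro adds "b" on Windows and is a plain "r" elsewhere); _WIN32 is
 * used here only so this toy example compiles standalone.
 */
#if defined(_WIN32)
#define PG_BINARY_R "rb"
#else
#define PG_BINARY_R "r"
#endif

int
main(void)
{
	/*
	 * A file whose exact bytes matter (dump data, WAL, and so on) should be
	 * opened in binary mode, so PG_BINARY_R is the right choice: on Windows
	 * it suppresses the CRT's CRLF -> LF translation, elsewhere it is just
	 * "r".
	 */
	FILE	   *datafile = fopen("base.tar", PG_BINARY_R);

	/*
	 * A text file read line by line normally wants to stay in text mode on
	 * Windows, so the CRT strips the \r for us; blindly replacing "r" with
	 * "rb" here would leave a literal \r at the end of every line unless
	 * the caller strips it itself.
	 */
	FILE	   *textfile = fopen("postgresql.conf", "r");

	if (datafile)
		fclose(datafile);
	if (textfile)
		fclose(textfile);
	return 0;
}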

-- 
Regards,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.




Re: Postgres perl module namespace

2022-04-19 Thread Daniel Gustafsson
> On 18 Apr 2022, at 19:59, Andrew Dunstan  wrote:
> On 2022-04-18 Mo 13:43, Tom Lane wrote:

>> Another thing that ought to be on the table is back-patching
>> 549ec201d (Replace Test::More plans with done_testing). Those
>> test counts are an even bigger pain for back-patching than the
>> renaming, because the count is often different in each branch.
>> 
> 
> +1 for doing that

I'll get to work on that then.

--
Daniel Gustafsson   https://vmware.com/





RE: Data is copied twice when specifying both child and parent table in publication

2022-04-19 Thread houzj.f...@fujitsu.com


> -----Original Message-----
> From: Wang, Wei/王 威 
> On Thursday, April 7, 2022 11:08 AM
> 
> On Thur, Mar 10, 2021 at 10:08 AM houzj.f...@fujitsu.com wrote:
> > Hi,
> >
> > When reviewing some logical replication related features, I noticed another
> > possible problem when the subscriber subscribes to multiple publications
> > that publish a parent table and its child table.
> >
> > For example:
> >
> > pub
> > create table t (a int, b int, c int) partition by range (a);
> > create table t_1 partition of t for values from (1) to (10);
> >
> > create publication pub1 for table t
> >   with (PUBLISH_VIA_PARTITION_ROOT);
> > create publication pub2 for table t_1
> >   with (PUBLISH_VIA_PARTITION_ROOT);
> >
> > sub
> >  prepare table t and t_1
> > CREATE SUBSCRIPTION sub CONNECTION 'port=1 dbname=postgres'
> > PUBLICATION pub1, pub2;
> >
> > select * from pg_subscription_rel ;
> >  srsubid |   srrelid    | srsubstate | srsublsn
> > ---------+--------------+------------+-----------
> >    16391 | 16385 (t)    | r          | 0/150D100
> >    16391 | 16388 (t_1)  | r          | 0/150D138
> >
> > If we subscribe to two publications, one of which publishes the parent table
> > with (pubviaroot=true) and the other publishes the child table, then both the
> > parent table and the child table will exist in pg_subscription_rel, which
> > also means we will do the initial copy for both tables.
> >
> > But after the initial copy, we only publish changes using the schema of the
> > parent table (t). It looks a bit inconsistent.
> >
> > Based on the documentation of the PUBLISH_VIA_PARTITION_ROOT option, I think
> > the expected behavior could be that we only store the topmost parent (table t)
> > in pg_subscription_rel and do the initial copy for it if pubviaroot is on.
> > I haven't thought about how to fix this yet and will investigate it later.
> Hi,
> I tried to fix this bug. Attached is the patch.
> 
> The current HEAD gets the table list for one publication by invoking the
> function pg_get_publication_tables. If multiple publications are subscribed,
> this function is invoked multiple times, so the PUBLISH_VIA_PARTITION_ROOT
> option works independently on every publication. I think it does not work
> correctly across different publications of the same subscription.
> 
> So I fix this bug with the following two steps:
> First step:
> I get the OIDs of the subscribed tables from the publication list. Then, for
> tables with the same topmost root table, I filter them based on the
> PUBLISH_VIA_PARTITION_ROOT option (see the new function
> filter_partitions_oids). After filtering, I get the final OID list.
> Second step:
> I get the required information (nspname and relname) based on the OID list
> from the first step.

Thanks for updating the patch.
I confirmed that the bug is fixed by this patch.

One suggestion: could we simplify the code by moving the ancestor-checking
logic into the SQL? For example, we could filter the output of
pg_publication_tables by adding a WHERE clause which checks whether the table
is a partition and whether its ancestor is also in the output. I think we can
also filter out the needless partitions with this approach.
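
Just to illustrate the idea, something along these lines (an untested sketch
only, reusing the pub1/pub2 names from the example above; the identifier
formatting is just for readability) would keep a published table only when
none of its ancestors is also published by the subscribed publications:

SELECT DISTINCT pt.schemaname, pt.tablename
FROM pg_publication_tables pt
WHERE pt.pubname IN ('pub1', 'pub2')
  AND NOT EXISTS (
        SELECT 1
        FROM pg_publication_tables pt2
        WHERE pt2.pubname IN ('pub1', 'pub2')
          AND (pt2.schemaname, pt2.tablename) <> (pt.schemaname, pt.tablename)
          AND format('%I.%I', pt2.schemaname, pt2.tablename)::regclass IN
              (SELECT relid
               FROM pg_partition_ancestors(
                        format('%I.%I', pt.schemaname, pt.tablename)::regclass)));

This is only meant to show the shape of the filter; the patch would of course
do the equivalent in whichever query the subscriber uses to build
pg_subscription_rel.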

Best regards,
Hou zj