Re: trying again to get incremental backup

2023-12-10 Thread Dilip Kumar
On Mon, Dec 11, 2023 at 11:44 AM Dilip Kumar  wrote:
>
> On Tue, Dec 5, 2023 at 11:40 PM Robert Haas  wrote:
> >
> > On Mon, Dec 4, 2023 at 3:58 PM Robert Haas  wrote:
> > > Considering all this, what I'm inclined to do is go and put
> > > UPLOAD_MANIFEST back, instead of INCREMENTAL_WAL_RANGE, and adjust
> > > accordingly. But first: does anybody see more problems here that I may
> > > have missed?
> >
> > OK, so here's a new version with UPLOAD_MANIFEST put back. I wrote a
> > long comment explaining why that's believed to be necessary and
> > sufficient. I committed 0001 and 0002 from the previous series also,
> > since it doesn't seem like anyone has further comments on those
> > renamings.
>
> I have done some testing on standby, but I am facing some issues,
> although things are working fine on the primary.  As shown in test [1]
> below, the standby is reporting an error that the manifest requires WAL
> from 0/6F8, but this backup starts at 0/628.  Then I looked into the
> manifest file of the full backup, and it shows the contents in [0].
> From this WARNING and ERROR, I am not clear what the problem is: I
> understand that the full backup ends at "0/6F8", so for the next
> incremental backup we should be looking for a summary that has WAL
> starting at "0/6F8", and we do have those WALs.  In fact, the error
> message says "this backup starts at 0/628", which is before "0/6F8",
> so what's the issue?
>
> [0]
> "WAL-Ranges": [
> { "Timeline": 1, "Start-LSN": "0/628", "End-LSN": "0/6F8" }
>
>
> [1]
> -- test on primary
> dilipkumar@dkmac bin % ./pg_basebackup -D d
> dilipkumar@dkmac bin % ./pg_basebackup -D d1 -i d/backup_manifest
>
> -- cleanup the backup directory
> dilipkumar@dkmac bin % rm -rf d
> dilipkumar@dkmac bin % rm -rf d1
>
> --test on standby
> dilipkumar@dkmac bin % ./pg_basebackup -D d -p 5433
> dilipkumar@dkmac bin % ./pg_basebackup -D d1 -i d/backup_manifest -p 5433
>
> WARNING:  aborting backup due to backend exiting before pg_backup_stop
> was called
> pg_basebackup: error: could not initiate base backup: ERROR:  manifest
> requires WAL from final timeline 1 ending at 0/6F8, but this
> backup starts at 0/628
> pg_basebackup: removing data directory "d1"

Jakub pinged me offlist and pointed me to the thread [1] where this is
already explained, so I think we can ignore it.

[1] 
https://www.postgresql.org/message-id/CA%2BTgmoYuC27_ToGtTTNyHgpn_eJmdqrmhJ93bAbinkBtXsWHaA%40mail.gmail.com

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Synchronizing slots from primary to standby

2023-12-10 Thread Drouvot, Bertrand

Hi,

On 12/8/23 10:06 AM, Amit Kapila wrote:

On Wed, Dec 6, 2023 at 4:53 PM shveta malik  wrote:


PFA v43, changes are:



I wanted to discuss the 0003 patch about cascading standbys. It is not
clear to me whether we want to allow physical standbys to further wait
for cascading standbys to sync their slots. If we allow such a feature,
one may expect even the primary to wait for all the cascading standbys,
because otherwise a logical subscriber can still be ahead of one of the
cascading standbys.


I have the same feeling here. I think it would probably be expected that
the primary also waits for all the cascading standbys.


I feel even if we want to allow such a behaviour we
can do it later once the main feature is committed. 


Agree.


I think it would
be good to just allow logical walsenders on primary to wait for
physical standbys represented by GUC 'standby_slot_names'.


That makes sense to me for v1.


If we agree
on that, then it would be good to prohibit setting this GUC on a standby,
or at least make it a no-op even if this GUC is set on a
physical standby.


I'd prefer to completely prohibit it on standby (to make it very clear it's not
working at all) as long as one can enable it without downtime once the standby
is promoted (which is the case currently).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2023-12-10 Thread Peter Smith
Here are some review comments for v44-0001

==
src/backend/replication/slot.c


1. ReplicationSlotCreate

  * during getting changes, if the two_phase option is enabled it can skip
  * prepare because by that time start decoding point has been moved. So the
  * user will only get commit prepared.
+ * failover: Allows the slot to be synced to physical standbys so that logical
+ * replication can be resumed after failover.
  */
 void
 ReplicationSlotCreate(const char *name, bool db_specific,

~

/Allows the slot.../If enabled, allows the slot.../

==

2. validate_standby_slots

+validate_standby_slots(char **newval)
+{
+ char*rawname;
+ List*elemlist;
+ ListCell   *lc;
+ bool ok = true;
+
+ /* Need a modifiable copy of string */
+ rawname = pstrdup(*newval);
+
+ /* Verify syntax and parse string into list of identifiers */
+ if (!(ok = SplitIdentifierString(rawname, ',', &elemlist)))
+ GUC_check_errdetail("List syntax is invalid.");
+
+ /*
+ * If there is a syntax error in the name or if the replication slots'
+ * data is not initialized yet (i.e., we are in the startup process), skip
+ * the slot verification.
+ */
+ if (!ok || !ReplicationSlotCtl)
+ {
+ pfree(rawname);
+ list_free(elemlist);
+ return ok;
+ }


2a.
You don't need to initialize 'ok' during declaration because it is
assigned immediately anyway.

~

2b.
AFAIK assignment within a conditional like this is not a normal PG
coding style unless there is no other way to do it.

~

2c.
/into list/into a list/

SUGGESTION
/* Verify syntax and parse string into a list of identifiers */
ok = SplitIdentifierString(rawname, ',', &elemlist);
if (!ok)
  GUC_check_errdetail("List syntax is invalid.");


~~~

3. assign_standby_slot_names

+ if (!SplitIdentifierString(standby_slot_names_cpy, ',', &standby_slots))
+ {
+ /* This should not happen if GUC checked check_standby_slot_names. */
+ elog(ERROR, "list syntax is invalid");
+ }

The error message here and the one in validate_standby_slots() are
different -- "list" versus "List".

==
src/backend/replication/walsender.c


4. WalSndFilterStandbySlots


+ foreach(lc, standby_slots_cpy)
+ {
+ char*name = lfirst(lc);
+ XLogRecPtr restart_lsn = InvalidXLogRecPtr;
+ bool invalidated = false;
+ char*warningfmt = NULL;
+ ReplicationSlot *slot;
+
+ slot = SearchNamedReplicationSlot(name, true);
+
+ if (slot && SlotIsPhysical(slot))
+ {
+ SpinLockAcquire(&slot->mutex);
+ restart_lsn = slot->data.restart_lsn;
+ invalidated = slot->data.invalidated != RS_INVAL_NONE;
+ SpinLockRelease(&slot->mutex);
+ }
+
+ /* Continue if the current slot hasn't caught up. */
+ if (!invalidated && !XLogRecPtrIsInvalid(restart_lsn) &&
+ restart_lsn < wait_for_lsn)
+ {
+ /* Log warning if no active_pid for this physical slot */
+ if (slot->active_pid == 0)
+ ereport(WARNING,
+ errmsg("replication slot \"%s\" specified in parameter \"%s\" does
not have active_pid",
+name, "standby_slot_names"),
+ errdetail("Logical replication is waiting on the "
+   "standby associated with \"%s\"", name),
+ errhint("Consider starting standby associated with "
+ "\"%s\" or amend standby_slot_names", name));
+
+ continue;
+ }
+ else if (!slot)
+ {
+ /*
+ * It may happen that the slot specified in standby_slot_names GUC
+ * value is dropped, so let's skip over it.
+ */
+ warningfmt = _("replication slot \"%s\" specified in parameter
\"%s\" does not exist, ignoring");
+ }
+ else if (SlotIsLogical(slot))
+ {
+ /*
+ * If a logical slot name is provided in standby_slot_names, issue
+ * a WARNING and skip it. Although logical slots are disallowed in
+ * the GUC check_hook(validate_standby_slots), it is still
+ * possible for a user to drop an existing physical slot and
+ * recreate a logical slot with the same name. Since it is
+ * harmless, a WARNING should be enough, no need to error-out.
+ */
+ warningfmt = _("cannot have logical replication slot \"%s\" in
parameter \"%s\", ignoring");
+ }
+ else if (XLogRecPtrIsInvalid(restart_lsn) || invalidated)
+ {
+ /*
+ * Specified physical slot may have been invalidated, so there is no point
+ * in waiting for it.
+ */
+ warningfmt = _("physical slot \"%s\" specified in parameter \"%s\"
has been invalidated, ignoring");
+ }
+ else
+ {
+ Assert(restart_lsn >= wait_for_lsn);
+ }

This if/else chain seems structured awkwardly. IMO it would be tidier
to handle the NULL-slot and logical-slot cases up-front, which would
also simplify some of the subsequent conditions.

SUGGESTION

slot = SearchNamedReplicationSlot(name, true);

if (!slot)
{
...
}
else if (SlotIsLogical(slot))
{
...
}
else
{
  Assert(SlotIsPhysical(slot));

  SpinLockAcquire(&slot->mutex);
  restart_lsn = slot->data.restart_lsn;
  invalidated = slot->data.invalidated != RS_INVAL_NONE;
  SpinLockRelease(&slot->mutex);

  if (XLogRecPtrIsInvalid(restart_lsn) || invalidated)
  {
  ...
  }
  else if (!invalidated && !XLogRecPtrIsInvalid(restart_lsn) &&
restart_lsn < wait_for_lsn)
  {
  ...
  }
  else
  {
Assert(restart_lsn >= wait_for_lsn);
  }
}



5. WalSndWaitForWal

+ else

Re: [PoC] Improve dead tuple storage for lazy vacuum

2023-12-10 Thread Masahiko Sawada
On Fri, Dec 8, 2023 at 9:44 PM Masahiko Sawada  wrote:
>
> On Fri, Dec 8, 2023 at 7:46 PM John Naylor  wrote:
> >
> > On Fri, Dec 8, 2023 at 3:06 PM Masahiko Sawada  
> > wrote:
> > >
> > > BTW Given that the actual value size can be calculated only by the
> > > caller, how does the tree know if the value is embedded or not? It's
> > > probably related to how to store combined pointer/value slots.
> >
> > Right, this is future work. At first, variable-length types will have
> > to be single-value leaves. In fact, the idea for storing up to 3
> > offsets in the bitmap header could be done this way -- it would just
> > be a (small) single-value leaf.
>
> Agreed.
>
> >
> > (Reminder: Currently, fixed-length values are compile-time embeddable
> > if the platform pointer size is big enough.)
> >
> > > If leaf
> > > nodes have a bitmap array that indicates the corresponding slot is an
> > > embedded value or a pointer to a value, it would be easy.
> >
> > That's the most general way to do it. We could do it much more easily
> > with a pointer tag, although for the above idea it may require some
> > endian-aware coding. Both were mentioned in the paper, I recall.
>
> True. Probably we can use the combined pointer/value slots approach
> only if the tree is able to use the pointer tagging. That is, if the
> caller allows the tree to use one bit of the value.
>
> I'm going to update the patch based on the recent discussion (RT_SET()
> and variable-length values) etc., and post the patch set early next
> week.

I've attached the updated patch set. From the previous patch set, I've
merged patches 0007 to 0010. The other changes, such as adding RT_GET(),
are still unmerged for now, for discussion. Probably we can make them
follow-up patches, as we discussed. Patches 0011 to 0015 are new changes
for the v44 patch set; they remove RT_SEARCH() and RT_SET() and add
support for variable-length values.
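
As an aside, here is a minimal, self-contained sketch (not code from the
patch set) of the pointer-tagging idea discussed above: the caller gives up
the low bit of each pointer-sized slot, so the tree can mark a slot as
holding an embedded value rather than a pointer to a leaf. It assumes leaf
allocations are at least 2-byte aligned, so bit 0 is always free.

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLOT_TAG_EMBEDDED ((uintptr_t) 1)

typedef uintptr_t rt_slot;

/* Store a small value directly in the slot, shifted past the tag bit. */
static inline rt_slot
slot_from_value(uintptr_t value)
{
    assert(value <= (UINTPTR_MAX >> 1));
    return (value << 1) | SLOT_TAG_EMBEDDED;
}

/* Store a pointer to a single-value leaf; alignment keeps bit 0 clear. */
static inline rt_slot
slot_from_leaf(void *leaf)
{
    assert(((uintptr_t) leaf & SLOT_TAG_EMBEDDED) == 0);
    return (uintptr_t) leaf;
}

static inline bool
slot_is_embedded(rt_slot slot)
{
    return (slot & SLOT_TAG_EMBEDDED) != 0;
}

static inline uintptr_t
slot_get_value(rt_slot slot)
{
    assert(slot_is_embedded(slot));
    return slot >> 1;
}

static inline void *
slot_get_leaf(rt_slot slot)
{
    assert(!slot_is_embedded(slot));
    return (void *) slot;
}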

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com


v44-ART.tar.gz
Description: GNU Zip compressed data


Re: trying again to get incremental backup

2023-12-10 Thread Dilip Kumar
On Tue, Dec 5, 2023 at 11:40 PM Robert Haas  wrote:
>
> On Mon, Dec 4, 2023 at 3:58 PM Robert Haas  wrote:
> > Considering all this, what I'm inclined to do is go and put
> > UPLOAD_MANIFEST back, instead of INCREMENTAL_WAL_RANGE, and adjust
> > accordingly. But first: does anybody see more problems here that I may
> > have missed?
>
> OK, so here's a new version with UPLOAD_MANIFEST put back. I wrote a
> long comment explaining why that's believed to be necessary and
> sufficient. I committed 0001 and 0002 from the previous series also,
> since it doesn't seem like anyone has further comments on those
> renamings.

I have done some testing on standby, but I am facing some issues,
although things are working fine on the primary.  As shown in test [1]
below, the standby is reporting an error that the manifest requires WAL
from 0/6F8, but this backup starts at 0/628.  Then I looked into the
manifest file of the full backup, and it shows the contents in [0].
From this WARNING and ERROR, I am not clear what the problem is: I
understand that the full backup ends at "0/6F8", so for the next
incremental backup we should be looking for a summary that has WAL
starting at "0/6F8", and we do have those WALs.  In fact, the error
message says "this backup starts at 0/628", which is before "0/6F8",
so what's the issue?

[0]
"WAL-Ranges": [
{ "Timeline": 1, "Start-LSN": "0/628", "End-LSN": "0/6F8" }


[1]
-- test on primary
dilipkumar@dkmac bin % ./pg_basebackup -D d
dilipkumar@dkmac bin % ./pg_basebackup -D d1 -i d/backup_manifest

-- cleanup the backup directory
dilipkumar@dkmac bin % rm -rf d
dilipkumar@dkmac bin % rm -rf d1

--test on standby
dilipkumar@dkmac bin % ./pg_basebackup -D d -p 5433
dilipkumar@dkmac bin % ./pg_basebackup -D d1 -i d/backup_manifest -p 5433

WARNING:  aborting backup due to backend exiting before pg_backup_stop
was called
pg_basebackup: error: could not initiate base backup: ERROR:  manifest
requires WAL from final timeline 1 ending at 0/6F8, but this
backup starts at 0/628
pg_basebackup: removing data directory "d1"


--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: undetected deadlock in ALTER SUBSCRIPTION ... REFRESH PUBLICATION

2023-12-10 Thread Amit Kapila
On Sat, Dec 9, 2023 at 12:16 PM Amit Kapila  wrote:
>
> Thanks, I could also reproduce the issue on back branches (tried till
> 12), and the fix works. I'll push this on Monday.
>

Peter sent one minor suggestion offlist (to write the check differently
for easier understanding), which I addressed, and pushed the patch.

-- 
With Regards,
Amit Kapila.




Re: RFC: Logging plan of the running query

2023-12-10 Thread torikoshia

On 2023-12-07 08:33, Rafael Thofehrn Castro wrote:

Hello hackers,

Last Saturday I submitted a patch to the pgsql-hackers list with the
title
"Proposal: In-flight explain logging" with a patch proposing a feature
very
similar to the one being worked on in this thread. I should have done
a better
search in the commitfest before implementing something from scratch.

So, as recommended by Ashutosh, I am sending an incremental patch
containing
an additional feature I personally think we should include: logging
the plan
with instrumentation details if enabled.


Thanks for the proposal and making the patch!


When targeting a query with instrumentation PG should log the complete
EXPLAIN ANALYZE plan with current row count and, if enabled, timing
for each
node. This gives the user not only the ability to see what the plan is
but also what was executed so far, which is super useful when
troubleshooting queries that never finish.



I think showing the progress of the query execution would be useful.

OTOH it seems to at least need some modifications around Instrumentation,
as in your patch.
As a first step, I think it would be better to minimize the scope and focus
on the fundamental functionality.
For the same reason, getting queries for parallel workers is also
prohibited in the current patch, as discussed here [1].


[1] 
https://www.postgresql.org/message-id/c25ae6015be96a1964eddd964657660b%40oss.nttdata.com


So I think the steps below would be better than pushing all the
functionality into the first commit.


- First, develop the function to enable output of query progress
  (v34-0001-Add-function-to-log-the-plan-of-the-query.patch).

- Then enhance the function:
  - showing the progress of the query execution
    (v34-0002-Log-plan-along-with-instrumentation-details.patch), etc.



--
https://www.postgresql.org/message-id/CAG0ozMp3g3drnkDa6RZxXO_OmnisL2sD9vBrmpu5fOBoYpC-3w%40mail.gmail.com
- ExplainState customization

An ExplainState is allocated and customized for the in-flight logging.
Instrumentation-related settings are enabled based on how the target
query started, which is usually via EXPLAIN ANALYZE or with
auto_explain.


Does this mean the progress can be obtained only when the target query was
run with EXPLAIN ANALYZE or with auto_explain.log_analyze enabled?


If so, there might be limited situations in which we can get the progress,
since I imagine EXPLAIN ANALYZE is used when the user wants the plan from
the beginning, and auto_explain.log_analyze can have a negative impact on
performance as described in the manual, so there may not be many
environments that enable it.


--
Regards,

--
Atsushi Torikoshi
NTT DATA Group Corporation




Re: Adding facility for injection points (or probe points?) for more advanced tests

2023-12-10 Thread Dilip Kumar
On Tue, Nov 28, 2023 at 4:07 AM Michael Paquier  wrote:
>
> On Mon, Nov 27, 2023 at 12:14:05PM +0530, Ashutosh Bapat wrote:
> > Since you wroten "(still need to improve ...) I thought you are
> > working on v6. No problem. Sorry for the confusion.
>
> I see why my previous message could be confusing.  Sorry about that.

I haven't specifically done a review or testing of this patch, but I
have used it for testing the CLOG group update code with my
SLRU-specific changes, and I found it quite helpful for testing some of
the concurrent areas where you need to stop processing somewhere in the
middle of the code; testing such an area without this kind of
injection point framework is really difficult, or may not even be
possible.  We wanted to test the case of CLOG group update where we
can get multiple processes added to a single group and have the xid
status updated by the group leader; you can refer to my test in that
thread [1] (the last patch, test_group_commit.patch, uses this
framework for testing).  Overall I feel this framework is quite useful
and easy to use as well.

[1] 
https://www.postgresql.org/message-id/CAFiTN-udSTGG_t5n9Z3eBbb4_%3DzNoKU%2B8FP-S6zpv-r4Gm-Y%2BQ%40mail.gmail.com
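
To sketch the general idea for anyone who hasn't looked at the patch (the
names below are placeholders of mine, not the API from the patch): server
code invokes a named hook at the interesting spot, and a test attaches a
callback to that name in order to pause or observe execution there.

#include <stddef.h>
#include <string.h>

typedef void (*injection_callback) (const char *name);

static const char *attached_name = NULL;
static injection_callback attached_cb = NULL;

/* Test harness attaches behavior (e.g. "wait here") to a named point. */
static void
injection_attach(const char *name, injection_callback cb)
{
    attached_name = name;
    attached_cb = cb;
}

/* Server code marks the spot; this is a no-op unless a test attached something. */
static void
injection_run(const char *name)
{
    if (attached_cb && attached_name && strcmp(attached_name, name) == 0)
        attached_cb(name);
}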


-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: Oversight in reparameterize_path_by_child leading to executor crash

2023-12-10 Thread Richard Guo
On Fri, Dec 8, 2023 at 5:39 PM Alena Rybakina 
wrote:

> On 06.12.2023 10:30, Richard Guo wrote:
> > I've self-reviewed this patch again and I think it's now in a
> > committable state.  I'm wondering if we can mark it as 'Ready for
> > Committer' now, or we need more review comments/feedbacks.
> >
> > To recap, this patch postpones reparameterization of paths until
> > createplan.c, which would help avoid building the reparameterized
> > expressions we might not use.  More importantly, it makes it possible to
> > modify the expressions in RTEs (e.g. tablesample) and in RelOptInfos
> > (e.g. baserestrictinfo) for reparameterization.  Failing to do that can
> > lead to executor crashes, wrong results, or planning errors, as we have
> > already seen.

I marked it as 'Ready for Committer'. I think it is ready.


Thank you.  Appreciate that.

Thanks
Richard


Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-12-10 Thread Andrei Lepikhov

On 11/12/2023 09:31, Richard Guo wrote:
On Fri, Dec 8, 2023 at 3:13 PM Alexander Pyhalov wrote:

Andrei Lepikhov wrote 2023-12-08 07:37:
 > I'd already clashed with Tom on copying the required_relids field and
 > voluntarily made unnecessary copies in the project [1].
 > And ... stuck into huge memory consumption. The reason was in
 > Bitmapsets:
 > When we have 1E3-1E4 partitions and try to reparameterize a join, one
 > bitmapset field can have a size of about 1kB. Having bitmapset
 > referencing Relation with a large index value, we had a lot of (for
 > example, 1E4 * 1kB) copies on each reparametrization of such a field.
 > Alexander Pyhalov should remember that case.
Yes. If it matters, this happened during reparametrization when 2
partitioned tables with 1000 partitions each were joined. Then the
asymmetric pw join managed to eat lots of memory for bitmapsets (by
lots of memory I mean all available on the test VM).
By reparametrization did you mean the work done in
reparameterize_path_by_child()?  If so maybe you'd be interested in the
patch [1] which postpones reparameterization of paths until createplan.c
and thus can help avoid unnecessary reparametrization work.


Yeah, I have already discovered it. It is a promising solution that only
needs a bit more review. But here, I covered some corner cases with the
idea that we may not see other cases right now. And also, sometimes the
Bitmapset field is significant - that is not a corner case.


--
regards,
Andrei Lepikhov
Postgres Professional





Re: Postgres db Update to Version 15

2023-12-10 Thread David Rowley
On Sun, 10 Dec 2023 at 04:10, Ritthaler, Axel  wrote:
> The Root Cause of this behavior, as aligned with AWS RDS Support, has been a 
> new feature-set coding (parallel_feature_query) with Postgres Version 15, 
> that shows a different behavior due to related parameter 
> (max_parallel_workers_per_gather).

What is parallel_feature_query?  No version of PostgreSQL has a
setting by that name.

> Remaining question now is, what has to be done to move related 
> Live-Landscapes back to the default parameter value (2) without creating the 
> same impact again.

You'll need to identify the query or queries causing the problem.
We've likely made many more query shapes parallelizable in PG15
compared to PG11. So it does not seem unusual that PG15 will be able
to parallelize more of your queries than PG11 could do.  That
could lead to parallel plans not getting the workers they desire due
to workers being busy with other queries.

> What is your suggestion and recommended way-forward to enable parallel-worker 
> setup again ?

Identify the queries causing the problem.  Then determine if the plan
has changed since PG11. You can then check all the release notes
starting with PG12 to see if anything is mentioned about why the plan
might have changed. e.g. something in the query is parallelizable in
this version that wasn't in PG11.

One thing to keep in mind is that PostgreSQL does not opt to
parallelize the cheapest serial plan. It will attempt to find the
cheapest plan with or without parallel workers.  The difference here
is that it's optimized for speed rather than resource usage.  I'm not
sure if this is a factor in your issue, but it may be something to
keep in mind while investigating.

David




Re: Assert failure on 'list_member_ptr(rel->joininfo, restrictinfo)'

2023-12-10 Thread Richard Guo
On Fri, Dec 8, 2023 at 3:13 PM Alexander Pyhalov 
wrote:

> Andrei Lepikhov wrote 2023-12-08 07:37:
> > I'd already clashed with Tom on copying the required_relids field and
> > voluntarily made unnecessary copies in the project [1].
> > And ... stuck into huge memory consumption. The reason was in
> > Bitmapsets:
> > When we have 1E3-1E4 partitions and try to reparameterize a join, one
> > bitmapset field can have a size of about 1kB. Having bitmapset
> > referencing Relation with a large index value, we had a lot of (for
> > example, 1E4 * 1kB) copies on each reparametrization of such a field.
> > Alexander Pyhalov should remember that case.
>
> Yes. If it matters, this happened during reparametrization when 2
> partitioned tables with 1000 partitions each were joined. Then
> asymmetric  pw join managed to eat lots of memory for bitmapsets (by
> lots of memory I mean all available on the test VM).


By reparametrization did you mean the work done in
reparameterize_path_by_child()?  If so maybe you'd be interested in the
patch [1] which postpones reparameterization of paths until createplan.c
and thus can help avoid unnecessary reparametrization work.

[1]
https://www.postgresql.org/message-id/CAMbWs48PBwe1YadzgKGW_ES%3DV9BZhq00BaZTOTM6Oye8n_cDNg%40mail.gmail.com

Thanks
Richard


Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-10 Thread Masahiko Sawada
On Sun, Dec 10, 2023 at 5:55 AM Sutou Kouhei  wrote:
>
> Hi,
>
> In 
>   "Re: Make COPY format extendable: Extract COPY TO format implementations" 
> on Fri, 8 Dec 2023 15:42:06 +0900,
>   Masahiko Sawada  wrote:
>
> > So a custom COPY extension would not be able to define SQL functions
> > just like arrow(internal) for example. We might need to define a rule
> > like the function returning copy_in/out_handler must be defined as
> > _to(internal) and _from(internal).
>
> We may not need to add a "_to"/"_from" suffix if we check both
> the argument type and the return type, because we use different
> return types for copy_in/out_handler.
>
> But the current LookupFuncName() family doesn't check the return
> type. If we use this approach, we need to improve the
> current LookupFuncName() family too.

IIUC we cannot create two functions with the same name and the same
arguments but different return types in the first place. It seems to me
to be overkill to change such a design.

Another idea is to encapsulate copy_to/from_handler in a super class
like copy_handler. The handler function is called with an argument,
say copyto, and returns a copy_handler encapsulating either
copy_to_handler or copy_from_handler depending on the argument.
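
To illustrate the idea (the struct and member names below are hypothetical,
not taken from any posted patch), the single handler could return a tagged
wrapper, and COPY would pick the direction-specific callback set out of it:

#include <stdbool.h>

/* Direction-specific callback sets, one per COPY direction. */
typedef struct CopyToRoutine
{
    void        (*start_fn) (void *cstate);
    void        (*one_row_fn) (void *cstate);
    void        (*end_fn) (void *cstate);
} CopyToRoutine;

typedef struct CopyFromRoutine
{
    void        (*start_fn) (void *cstate);
    bool        (*next_row_fn) (void *cstate);
    void        (*end_fn) (void *cstate);
} CopyFromRoutine;

/* The "super class": one object wraps either direction. */
typedef struct CopyHandler
{
    bool        is_from;
    union
    {
        const CopyToRoutine *to;
        const CopyFromRoutine *from;
    }           routine;
} CopyHandler;

The single SQL-callable handler function would then take a boolean (or enum)
argument for the direction and fill in the matching member.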

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: pg_upgrade failing for 200+ million Large Objects

2023-12-10 Thread Tom Lane
I spent some time looking at the v7 patch.  I can't help feeling
that this is going off in the wrong direction, primarily for
these reasons:

* It focuses only on cutting the number of transactions needed
to restore a large number of blobs (large objects).  Certainly
that's a pain point, but it's not the only one of this sort.
If you have a lot of tables, restore will consume just as many
transactions as it would for a similar number of blobs --- probably
more, in fact, since we usually need more commands per table than
per blob.

* I'm not too thrilled with the (undocumented) rearrangements in
pg_dump.  I really don't like the idea of emitting a fundamentally
different TOC layout in binary-upgrade mode; that seems unmaintainably
bug-prone.  Plus, the XID-consumption problem is not really confined
to pg_upgrade.

What I think we actually ought to do is one of the alternatives
discussed upthread: teach pg_restore to be able to commit
every so often, without trying to provide the all-or-nothing
guarantees of --single-transaction mode.  This cuts its XID
consumption by whatever multiple "every so often" is, while
allowing us to limit the number of locks taken during any one
transaction.  It also seems a great deal safer than the idea
I floated of not taking locks at all during a binary upgrade;
plus, it has some usefulness with regular pg_restore that's not
under control of pg_upgrade.

So I had a go at coding that, and attached is the result.
It invents a --transaction-size option, and when that's active
it will COMMIT after every N TOC items.  (This seems simpler to
implement and less bug-prone than every-N-SQL-commands.)
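
For illustration only, here is a self-contained libpq sketch of the batching
behavior (this is not pg_restore code; run_batched, exec_or_warn, and
txn_size are names I made up): committing after every txn_size steps means
no single transaction holds locks on more than that many restored objects.

#include <stdio.h>
#include <libpq-fe.h>

static void
exec_or_warn(PGconn *conn, const char *sql)
{
    PGresult   *res = PQexec(conn, sql);

    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "command failed: %s", PQerrorMessage(conn));
    PQclear(res);
}

/* Run ncmds restore steps, committing after every txn_size of them. */
static void
run_batched(PGconn *conn, const char **cmds, int ncmds, int txn_size)
{
    int         in_batch = 0;

    exec_or_warn(conn, "BEGIN");
    for (int i = 0; i < ncmds; i++)
    {
        exec_or_warn(conn, cmds[i]);
        if (++in_batch >= txn_size)
        {
            exec_or_warn(conn, "COMMIT");
            exec_or_warn(conn, "BEGIN");
            in_batch = 0;
        }
    }
    exec_or_warn(conn, "COMMIT");
}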

I had initially supposed that in a parallel restore we could
have child workers also commit after every N TOC items, but was
soon disabused of that idea.  After a worker processes a TOC
item, any dependent items (such as index builds) might get
dispatched to some other worker, which had better be able to
see the results of the first worker's step.  So at least in
this implementation, we disable the multi-command-per-COMMIT
behavior during the parallel part of the restore.  Maybe that
could be improved in future, but it seems like it'd add a
lot more complexity, and it wouldn't make life any better for
pg_upgrade (which doesn't use parallel pg_restore, and seems
unlikely to want to in future).

I've not spent a lot of effort on pg_upgrade changes here:
I just hard-wired it to select --transaction-size=1000.
Given the default lock table size of 64*100, that gives us
enough headroom for each TOC to take half a dozen locks.
We could go higher than that by making pg_upgrade force the
destination postmaster to create a larger-than-default lock
table, but I'm not sure if it's worth any trouble.  We've
already bought three orders of magnitude improvement as it
stands, which seems like enough ambition for today.  (Also,
having pg_upgrade override the user's settings in the
destination cluster might not be without downsides.)

Another thing I'm wondering about is why this is only a pg_restore
option not also a pg_dump/pg_dumpall option.  I did it like that
because --single-transaction is pg_restore only, but that seems more
like an oversight or laziness than a well-considered decision.
Maybe we should back-fill that omission; but it could be done later.

Thoughts?

regards, tom lane

diff --git a/doc/src/sgml/ref/pg_restore.sgml b/doc/src/sgml/ref/pg_restore.sgml
index 1a23874da6..2e3ba80258 100644
--- a/doc/src/sgml/ref/pg_restore.sgml
+++ b/doc/src/sgml/ref/pg_restore.sgml
@@ -786,6 +786,30 @@ PostgreSQL documentation
   
  
 
+ 
+  --transaction-size=N
+  
+   
+Execute the restore as a series of transactions, each processing
+up to N database
+objects.  This option implies --exit-on-error.
+   
+   
+--transaction-size offers an intermediate choice
+between the default behavior (one transaction per SQL command)
+and -1/--single-transaction
+(one transaction for all restored objects).
+While --single-transaction has the least
+overhead, it may be impractical for large databases because the
+transaction will take a lock on each restored object, possibly
+exhausting the server's lock table space.
+Using --transaction-size with a size of a few
+thousand objects offers nearly the same performance benefits while
+capping the amount of lock table space needed.
+   
+  
+ 
+
  
   --use-set-session-authorization
   
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 9ef2f2017e..fbf5f1c515 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -149,7 +149,9 @@ typedef struct _restoreOptions
  * compression */
 	int			suppressDumpWarnings;	/* Suppress output of WARNING entries
 		 * to stderr */
-	bool		single_txn;
+
+	bool		single_txn;		/* 

Some useless includes in llvmjit_inline.cpp

2023-12-10 Thread Thomas Munro
Hi,

This is not exhaustive; I just noticed in passing that we don't need these.
From ccadf2778192db48376632c35474736a3370b0c2 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 11 Dec 2023 13:50:18 +1300
Subject: [PATCH] Remove useless includes from llvmjit_inline.cpp.

---
 src/backend/jit/llvm/llvmjit_inline.cpp | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/backend/jit/llvm/llvmjit_inline.cpp b/src/backend/jit/llvm/llvmjit_inline.cpp
index d92d7f3c88..d6e6f2a559 100644
--- a/src/backend/jit/llvm/llvmjit_inline.cpp
+++ b/src/backend/jit/llvm/llvmjit_inline.cpp
@@ -34,9 +34,7 @@ extern "C"
 #include 
 #include 
 
-#include "common/string.h"
 #include "miscadmin.h"
-#include "storage/fd.h"
 }
 
 #include 
-- 
2.39.2



unconstify()/unvolatize() vs g++/clang++

2023-12-10 Thread Thomas Munro
Hi,

AFAICS you can't use unconstify()/unvolatize() in a static inline
function in a .h file, or in a .cpp file, because
__builtin_types_compatible_p is only available in C, not C++.  Seems
like a reasonable thing to want to be able to do, no?  I'm not
immediately sure what the right fix is; would #if
defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P) && !defined(__cplusplus)
around the relevant versions of unconstify()/unvolatize() be too easy?
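
Something like the following, as a sketch only (assuming the existing
definition in c.h keeps its current shape; unvolatize() would get the same
treatment):

#if defined(HAVE__BUILTIN_TYPES_COMPATIBLE_P) && !defined(__cplusplus)
#define unconstify(underlying_type, expr) \
    (StaticAssertExpr(__builtin_types_compatible_p(__typeof(expr), const underlying_type), \
                      "wrong cast"), \
     (underlying_type) (expr))
#else
/*
 * C++ (or a compiler without the builtin) loses the static type check
 * but keeps the cast, so code using unconstify() still compiles.
 */
#define unconstify(underlying_type, expr) \
    ((underlying_type) (expr))
#endif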

HAVE__BUILTIN_TYPES_COMPATIBLE_P is also tested in relptr.h, but only
for further preprocessor stuff, not in functions that the compiler
will see, so cpluspluscheck doesn't have anything to reject, and
nothing will break unless someone writing C++ code actually tries to
use relptr_access().  I think we can live with that one?




Re: Make COPY format extendable: Extract COPY TO format implementations

2023-12-10 Thread Masahiko Sawada
On Sun, Dec 10, 2023 at 5:44 AM Sutou Kouhei  wrote:
>
> Hi,
>
> Thanks for reviewing our latest patch!
>
> In
>  
> 
>   "RE: Make COPY format extendable: Extract COPY TO format implementations" 
> on Sat, 9 Dec 2023 02:43:49 +,
>   "Hayato Kuroda (Fujitsu)"  wrote:
>
> > (I remember that this theme was talked at Japan PostgreSQL conference)
>
> Yes. I should have talked to you more at the conference...
> I will do it next time!
>
>
> Can we discuss how to proceed with this improvement?
>
> There are 2 approaches for it:
>
> 1. Do the followings concurrently:
>a. Implementing small changes that got a consensus and
>   merge them step-by-step
>   (e.g. We got a consensus that we need to extract the
>   current format related routines.)
>b. Discuss design

It's preferable to make patches small for easy review. We can merge
them anytime before commit if necessary.

I think we need to discuss the overall design of the callbacks and how
extensions define a custom COPY handler, etc. It may require some PoC
patches. Once we have a consensus on the overall design, we can polish the
patches, including the documentation changes and regression tests.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com




Re: Synchronizing slots from primary to standby

2023-12-10 Thread Peter Smith
FYI -- the patch 0002 did not apply cleanly for me on top of the 050
test file created by patch 0001.

[postgres@CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/v44-0001-Allow-logical-walsenders-to-wait-for-the-physica.patch

[postgres@CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/v44-0002-Add-logical-slot-sync-capability-to-the-physical.patch
error: patch failed: src/test/recovery/t/050_standby_failover_slots_sync.pl:289
error: src/test/recovery/t/050_standby_failover_slots_sync.pl: patch
does not apply

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: GUC names in messages

2023-12-10 Thread Peter Smith
This v5* looks good to me, except it will need some further
modification if PeterE's suggestion [1] to keep quotes for the
MixedCase GUCs is adopted.

==
[1] 
https://www.postgresql.org/message-id/9e7802b2-2cf2-4c2d-b680-b2ccb9db1d2f%40eisentraut.org

Kind Regards,
Peter Smith.
Fujitsu Australia.




Re: GUC names in messages

2023-12-10 Thread Peter Smith
On Sat, Dec 9, 2023 at 1:48 AM Peter Eisentraut  wrote:
>
> On 08.12.23 05:10, Peter Smith wrote:
> > Patch 0001 -- "datestyle" becomes DateStyle in messages
> > Rebased this again, which was part of an earlier patch set
> > - I think any GUC names documented as MixedCase should keep that same
> > case in messages; this also obeys the guidelines recently pushed [1].
> > - Some others agreed, expecting the exact GUC name (in the message)
> > can be found in pg_settings [2].
> > - OTOH, Michael didn't like the diff churn [3] caused by this patch.
>
> I'm fine with adjusting the mixed-case stuff, but intuitively, I don't
> think removing the quotes in this is an improvement:
>
> - GUC_check_errdetail("Conflicting \"datestyle\" specifications.");
> + GUC_check_errdetail("Conflicting DateStyle specifications.");
>

My original intention of this thread was only to document the GUC name
quoting guidelines and then apply those consistently in the code.

I'm happy either way for the MixedCase names to be quoted or not
quoted, whatever is the consensus.

If the rule is changed to quote those MixedCase GUCs, then the docs
will require minor tweaking:

CURRENT
   
In messages containing configuration variable names, do not include quotes
when the names are visibly not natural English words, such as when they
have underscores, are all-uppercase or have mixed case. Otherwise, quotes
must be added. Do include quotes in a message where an arbitrary variable
name is to be expanded.
   

"are all-uppercase or have mixed case." --> "or are all-uppercase."

==
Kind Regards,
Peter Smith.
Fujitsu Australia




Re: Change GUC hashtable to use simplehash?

2023-12-10 Thread John Naylor
I wrote:

> On Sun, Dec 10, 2023 at 2:18 AM Jeff Davis  wrote:
> >
> > On Sat, 2023-12-09 at 18:52 +0700, John Naylor wrote:
> > > > I tested using the new hash function APIs for my search path cache,
> > > > and
> > > > there's a significant speedup for cases not benefiting from
> > > > a86c61c9ee.
> > > > It's enough that we almost don't need a86c61c9ee. So a definite +1
> > > > to
> > > > the new APIs.
>
> Interesting, thanks for testing! SearchPathCache is a better starting
> point than dynahash for removing strlen calls anyway -- it's more
> localized, uses simplehash, and we can test it with at-hand tests.

Since I had to fix a misalignment in the original to keep ubsan from
crashing CI anyway (v8-0005), I thought I'd take the experiment with
search path cache and put the temporary validation of the hash
function output in there (v8-0004). I had to finagle a bit to get the
bytewise interface to give the same answer as the original, but that's
okay: The bytewise interface is intended for when we don't know the
length up front (and therefore the internal seed can't be tweaked with
the length), but it's nice to make sure nothing's broken.

There is also a chunkwise version for search path cache. That might be
a little faster. Perf testing can be done as is, because I put the
validation in assert builds only.

I've left out the GUC stuff for now, just want to get CI green again.
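
For reference, a minimal usage sketch of the bytewise interface from the
attached patches, hashing a NUL-terminated string. Passing 0 as the length
to fasthash_init() is my assumption for the unknown-length case, since the
real length isn't known up front:

#include "postgres.h"
#include "common/hashfn_unstable.h"

static inline uint64
hash_cstring(const char *str, uint64 seed)
{
    fasthash_state hs;

    /* length is unknown up front, so it can't be mixed into the seed here */
    fasthash_init(&hs, 0, seed);

    for (; *str; str++)
        fasthash_accum_byte(&hs, (unsigned char) *str);

    return fasthash_final64(&hs);
}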
From 54e6419a632b04d97cad847603035050ab48c84f Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Sat, 9 Dec 2023 16:32:05 +0700
Subject: [PATCH v8 3/5] Add bytewise interface

This is useful for hashing values with unknown length,
like NUL-terminated strings. It should be faster than calling
strlen() first and passing the length, which most hash
functions require.

Note: This method can't give the same answer as
regular fasthash, so it will need to be evaluated. It's possible
we need to mix in the length at the finalization step (at which
time we can know the length), in order to safeguard against
collisions.
---
 src/include/common/hashfn_unstable.h | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
index fbae7a5522..80aec98dc9 100644
--- a/src/include/common/hashfn_unstable.h
+++ b/src/include/common/hashfn_unstable.h
@@ -49,6 +49,7 @@ typedef struct fasthash_state
 {
 	uint64 accum;
 #define FH_SIZEOF_ACCUM sizeof(uint64)
+	int8	accum_len;
 	uint64 hash;
 } fasthash_state;
 
@@ -69,6 +70,7 @@ fasthash_combine(fasthash_state* hs)
 
 	/* reset hash state for next input */
 	hs->accum = 0;
+	hs->accum_len = 0;
 }
 
 static inline void
@@ -82,6 +84,18 @@ fasthash_init(fasthash_state *hs, int len, uint64 seed)
 	hs->hash = seed ^ (len * 0x880355f21e6d1965ULL);
 }
 
+static inline void
+fasthash_accum_byte(fasthash_state *hs, const unsigned char ch)
+{
+	hs->accum <<= BITS_PER_BYTE;
+	hs->accum |= ch;
+	hs->accum_len++;
+
+	// wip: is there a better way to get sizeof struct member?
+	if (hs->accum_len == sizeof(((fasthash_state *) 0)->accum))
+		fasthash_combine(hs);
+}
+
 static inline void
 fasthash_accum(fasthash_state *hs, const unsigned char *k, int len)
 {
@@ -117,6 +131,11 @@ fasthash_accum(fasthash_state *hs, const unsigned char *k, int len)
 static inline uint64
 fasthash_final64(fasthash_state *hs)
 {
+	// check for remaining bytes to combine into hash
+	// should only be used by the bytewise interface
+	if (hs->accum_len > 0)
+		fasthash_combine(hs);
+
 	return fasthash_mix(hs->hash);
 }
 
-- 
2.43.0

From f5ab683d61724e9766d43e58c6f3177a30f708d0 Mon Sep 17 00:00:00 2001
From: John Naylor 
Date: Sun, 10 Dec 2023 12:11:37 +0700
Subject: [PATCH v8 2/5] Rewrite fasthash functions using a homegrown
 incremental interface

The incremental interface will be useful for cases we don't know
the length up front, such as NUL-terminated strings. First, we
need to validate that this interface can give the same answer
as the original functions when we do know the length. A future

commit will add a temporary assert for testing in CI.
---
 src/include/common/hashfn_unstable.h | 161 +--
 1 file changed, 153 insertions(+), 8 deletions(-)

diff --git a/src/include/common/hashfn_unstable.h b/src/include/common/hashfn_unstable.h
index a5bf965fa2..fbae7a5522 100644
--- a/src/include/common/hashfn_unstable.h
+++ b/src/include/common/hashfn_unstable.h
@@ -1,3 +1,25 @@
+/*
+Building blocks for creating fast inlineable hash functions. The
+unstable designation is in contrast to hashfn.h, which cannot break
+compatibility because hashes can be written to disk and so must have
+the same hashes between versions.
+
+ *
+ * Portions Copyright (c) 2018-2023, PostgreSQL Global Development Group
+ *
+ * src/include/common/hashfn_unstable.c
+ */
+
+#ifndef HASHFN_UNSTABLE_H
+#define HASHFN_UNSTABLE_H
+
+/*
+ * fasthash is a modification of code taken from
+ * https://code.google.com/archive/p/fast-hash/source/default/source
+ * under the 

Re: Synchronizing slots from primary to standby

2023-12-10 Thread Amit Kapila
On Wed, Dec 6, 2023 at 4:53 PM shveta malik  wrote:
>
> v43-002:
>

Review comments on v43-0002:
=
1.
synchronize_one_slot()
{
...
+ /*
+ * With hot_standby_feedback enabled and invalidations handled
+ * apropriately as above, this should never happen.
+ */
+ if (remote_slot->restart_lsn < MyReplicationSlot->data.restart_lsn)
+ {
+ ereport(ERROR,
+ errmsg("not synchronizing local slot \"%s\" LSN(%X/%X)"
+" to remote slot's LSN(%X/%X) as synchronization "
+" would move it backwards", remote_slot->name,
+LSN_FORMAT_ARGS(MyReplicationSlot->data.restart_lsn),
+LSN_FORMAT_ARGS(remote_slot->restart_lsn)));
+
+ goto cleanup;
...
}

After the error, the control won't return, so the above goto doesn't
make any sense.

2.
synchronize_one_slot()
{
...
+ /* Search for the named slot */
+ if ((s = SearchNamedReplicationSlot(remote_slot->name, true)))
+ {
+ SpinLockAcquire(&s->mutex);
+ sync_state = s->data.sync_state;
+ SpinLockRelease(&s->mutex);
+ }
...
...
+ ReplicationSlotAcquire(remote_slot->name, true);
+
+ /*
+ * Copy the invalidation cause from remote only if local slot is not
+ * invalidated locally, we don't want to overwrite existing one.
+ */
+ if (MyReplicationSlot->data.invalidated == RS_INVAL_NONE)
+ {
+ SpinLockAcquire(&MyReplicationSlot->mutex);
+ MyReplicationSlot->data.invalidated = remote_slot->invalidated;
+ SpinLockRelease(&MyReplicationSlot->mutex);
+ }
+
+ /* Skip the sync if slot has been invalidated locally. */
+ if (MyReplicationSlot->data.invalidated != RS_INVAL_NONE)
+ goto cleanup;
...

It seems useless to acquire the slot if it is locally invalidated in
the first place. Won't it be better if after the search we first check
whether the slot is locally invalidated and take appropriate action?

3. After doing the above two, I think it doesn't make sense to have
goto at the remaining places in synchronize_one_slot(). We can simply
release the slot and commit the transaction at other places.

4.
+ * Returns nap time for the next sync-cycle.
+ */
+static long
+synchronize_slots(WalReceiverConn *wrconn)

Returning nap time from here appears a bit awkward. I think it is
better if this function returns any_slot_updated and then the caller
decides the adjustment of naptime.

5.
+synchronize_slots(WalReceiverConn *wrconn)
{
...
...
+ /* The syscache access needs a transaction env. */
+ StartTransactionCommand();
+
+ /*
+ * Make result tuples live outside TopTransactionContext to make them
+ * accessible even after transaction is committed.
+ */
+ MemoryContextSwitchTo(oldctx);
+
+ /* Construct query to get slots info from the primary server */
+ initStringInfo(&s);
+ construct_slot_query(&s);
+
+ elog(DEBUG2, "slot sync worker's query:%s \n", s.data);
+
+ /* Execute the query */
+ res = walrcv_exec(wrconn, s.data, SLOTSYNC_COLUMN_COUNT, slotRow);
+ pfree(s.data);
+
+ if (res->status != WALRCV_OK_TUPLES)
+ ereport(ERROR,
+ (errmsg("could not fetch failover logical slots info "
+ "from the primary server: %s", res->err)));
+
+ CommitTransactionCommand();
...
...
}

Where exactly in the above code is there a syscache access, as
mentioned above StartTransactionCommand()?

6.
-  ~/.pgpass file on the standby server (use
+  ~/.pgpass file on the standby server. (use
   replication as the database name).

Why do we need this change?

7.
+ standby. Additionally, similar to creating a logical replication slot
+ on the hot standby, hot_standby_feedback should be
+ set on the standby and a physical slot between the primary and the standby
+ should be used.

In this, I don't understand the relation between the first part of the
line: "Additionally, similar to creating a logical replication slot on
the hot standby ..." and the rest.

8.
However,
+ the slots which were in initiated sync_state ('i) and were not

A single quote after 'i' is missing.

9.
the slots with state 'r' and 'i' can neither be used for logical
+  decoded nor dropped by the user.

/decoded/decoding

10.
+/*
+ * Allocate and initialize slow sync worker shared memory
+ */

/slow/slot

-- 
With Regards,
Amit Kapila.




Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features)

2023-12-10 Thread jian he
On Fri, Dec 8, 2023 at 3:09 PM Alena Rybakina  wrote:
>
> Thank you for your work. Unfortunately, your code contained errors during the 
> make installation:
>
> 'SAVEPOINT' after 'SAVE_ERROR' in unreserved_keyword list is misplaced
> 'SAVEPOINT' after 'SAVE_ERROR' in bare_label_keyword list is misplaced
> make[2]: *** [../../../src/Makefile.global:783: gram.c] Error 1
> make[1]: *** [Makefile:131: parser/gram.h] Error 2
> make[1]: *** Waiting for unfinished jobs
> make: *** [src/Makefile.global:383: submake-generated-headers] Error 2
>
> I have ubuntu 22.04 operation system.
>
> On 06.12.2023 13:47, jian he wrote:
>
> On Tue, Dec 5, 2023 at 6:07 PM Alena Rybakina  
> wrote:
>
> Hi!
>
> Thank you for your contribution to this thread.
>
>
> I reviewed it and have a few questions.
>
> 1. I have seen that you delete a table before creating it, to which you want 
> to add errors due to a failed "copy from" operation. I think this is wrong 
> because this table can save useful data for the user.
> At a minimum, we should warn the user about this, but I think we can just add 
> some number at the end of the name, such as name_table1, name_table_2.
>
> Sorry. I don't understand this part.
> Currently, if the error table name already exists, then the copy will
> fail and an error will be reported.
> I try to first create a table; if no error happens, then the error table
> will be dropped.
>
> To be honest, first of all, I misunderstood this part of the code. Now I see
> that it works the way you mentioned.
>
> However, I didn't see whether you dealt with cases where we already had a
> table with the same name as the error table.
> I mean, when it is trying to create it for the first time, or will we never
> be able to face such a problem?
>
> Can you demo the expected behavior?
>
> Unfortunately, I was unable to launch it due to a build issue.
>

Hopefully the attached will work.

> 2. I noticed that you are forming a table name using the type of errors that 
> prevent rows from being added during 'copy from' operation.
> I think it would be better to use the name of the source file that was used 
> while 'copy from' was running.
> In addition, there may be several such files, it is also worth considering.
>
> Another column was added.
> Now it looks like:
>
> SELECT * FROM save_error_csv_error;
>  filename | lineno |line
>  | field | source | err_message |
> err_detail | errorcode
> --+++---++-++---
>  STDIN|  1 | 2002232 40  50  60  70
> 80 | NULL  | NULL   | extra data after last expected column   |
> NULL   | 22P04
>  STDIN|  1 | 2000230 23
>  | d | NULL   | missing data for column "d" | NULL
>   | 22P04
>  STDIN|  1 | z,,""
>  | a | z  | invalid input syntax for type integer: "z"  | NULL
>   | 22P02
>  STDIN|  2 | \0,,
>  | a | \0 | invalid input syntax for type integer: "\0" | NULL
>   | 22P02
>
> Yes, I see the "filename" column, and this will solve the problem, but 
> "STDIN" is unclear to me.

please see comment in struct CopyFromStateData:
char *filename; /* filename, or NULL for STDIN */


>  */
>
> Maybe we can rewrite it like this:
>
> /* Check, the err_nsp.error_rel table has already existed
> * and if it is, check its column name and data types.
>
refactored.
From 2510dc2e2b13c60a5a7e184bf8e55325601d97e0 Mon Sep 17 00:00:00 2001
From: pgaddict 
Date: Sun, 10 Dec 2023 09:51:42 +0800
Subject: [PATCH v10 1/1] Make COPY FROM more error tolerant

Currently COPY FROM has 3 types of errors while processing the source file:
* extra data after last expected column
* missing data for column \"%s\"
* data type conversion error.

Instead of throwing errors while copying, save_error will save errors to a table automatically.
We check the table definition via column name and column data type.
If the table already exists and meets the criteria, then errors will be saved to that table.
If the table does not exist, then one is created.

Only works for COPY FROM, non-BINARY mode.

While copying, if no error ever happened, the error-saving table will be dropped at the end of COPY FROM.
If the error-saving table already exists, meaning COPY FROM errors have happened at least once,
then all future errors will be saved to that table.
We save the error to the error-saving table using SPI: we construct a query, then execute it.
---
 contrib/file_fdw/file_fdw.c  |   4 +-
 doc/src/sgml/ref/copy.sgml   |  93 +
 src/backend/commands/copy.c  |  12 ++
 src/backend/commands/copyfrom.c  | 146 +++-
 src/backend/commands/copyfromparse.c | 169 +--
 src/backend/parser/gram.y|   8 +-
 src/bin/psql/tab-complete.c  |   3 +-
 

Re: postgres_fdw test timeouts

2023-12-10 Thread Alexander Lakhin

08.12.2023 02:02, Nathan Bossart wrote:

On Fri, Dec 08, 2023 at 09:55:58AM +1300, Thomas Munro wrote:

Now we have the question of whether to go forwards (commit the "socket
table" thing), or backwards (revert 04a09ee for now to clear the CI
failures).  I don't love the hidden complexity of the socket table and
am not in a hurry to commit it, but I don't currently see another
way... on the other hand we have other CI flapping due to that problem
too so reverting 04a09ee would be sweeping problems under the carpet.
I still need to process your feedback/discoveries on that other thread
and it may take a few weeks for me to get to it.

I don't think we need to revert 04a09ee provided the issue is unrelated and
a fix is in development.


I've reviewed the links posted upthread and analyzed statistics of such
failures:
yes, it happens rather frequently in Cirrus CI, but there might be dozens
of successful runs, for example:
https://cirrus-ci.com/github/postgresql-cfbot/postgresql/commitfest%2F45%2F3686
has 1 postgres_fdw failure on Windows per 32 runs.
And there is only one such failure for 90 days in the buildfarm.
(Perhaps the probability of the failure depends on external factors, such as
concurrent activity.)

So I would not say that it's a dominant failure for now, and given that
04a09ee lives in master only, maybe we can save two commits (Revert +
Revert of revert) while moving to a more persistent solution.

Best regards,
Alexander