Re: Clock-skew management in logical replication

2024-09-24 Thread Nisha Moond
On Mon, Sep 23, 2024 at 4:00 PM Nisha Moond  wrote:
>
> On Fri, Sep 20, 2024 at 7:51 PM Tom Lane  wrote:
> >
> > Nisha Moond  writes:
> > > While considering the implementation of timestamp-based conflict
> > > resolution (last_update_wins) in logical replication (see [1]), there
> > > was a feedback at [2] and the discussion on whether or not to manage
> > > clock-skew at database level.
> >
> > FWIW, I cannot see why we would do anything beyond suggesting that
> > people run NTP.  That's standard anyway on the vast majority of
> > machines these days.  Why would we add complexity that we have
> > to maintain (and document) in order to cater to somebody not doing
> > that?
> >
> > regards, tom lane
>
> Thank you for your response.
>
> I agree with suggesting users to run NTP and we can recommend it in
> the docs rather than introducing additional complexities.
>
> In my research on setting up NTP servers on Linux, I found that
> Chrony[1] is a lightweight and efficient solution for time
> synchronization across nodes. Another reliable option is the classic
> NTP daemon (ntpd)[2], which is also easy to configure and maintain.
> Both Chrony and ntpd can be used to configure a local machine as an
> NTP server for localized time synchronization, or as clients syncing
> from public NTP servers such as 'ntp.ubuntu.com' (default ntp server
> pool for Ubuntu systems) or 'time.google.com'(Google Public NTP).
> For example, on Ubuntu, Chrony is straightforward to install and
> configure[3]. Comprehensive NTP(ntpd) configuration guides are
> available for various Linux distributions, such as Ubuntu[4] and
> RedHat-Linux[5].
>
> Further, I’m exploring options for implementing NTP on Windows systems.
>

Windows provides a built-in time synchronization service (the Windows
Time service). As a client, it lets users sync the system time with
internet or public NTP servers, and this can be configured simply by
selecting a public NTP server in the Date and Time settings. More
details can be found at [1].
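
For example, a client can be pointed at a public NTP server from an
elevated command prompt using the built-in w32tm tool (the server name
below is only an example):

  w32tm /config /manualpeerlist:"time.google.com" /syncfromflags:manual /update
  w32tm /resync
  w32tm /query /status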

Additionally, Windows servers can be configured as NTP servers for
localized time synchronization within a network, allowing other nodes
to sync with them. Further instructions on configuring an NTP server
on Windows can be found at [2].

[1] 
https://learn.microsoft.com/en-us/windows-server/networking/windows-time-service/how-the-windows-time-service-works
[2] 
https://learn.microsoft.com/en-us/troubleshoot/windows-server/active-directory/configure-authoritative-time-server

Thanks,
Nisha




Re: Clock-skew management in logical replication

2024-09-23 Thread Nisha Moond
On Fri, Sep 20, 2024 at 7:51 PM Tom Lane  wrote:
>
> Nisha Moond  writes:
> > While considering the implementation of timestamp-based conflict
> > resolution (last_update_wins) in logical replication (see [1]), there
> > was a feedback at [2] and the discussion on whether or not to manage
> > clock-skew at database level.
>
> FWIW, I cannot see why we would do anything beyond suggesting that
> people run NTP.  That's standard anyway on the vast majority of
> machines these days.  Why would we add complexity that we have
> to maintain (and document) in order to cater to somebody not doing
> that?
>
> regards, tom lane

Thank you for your response.

I agree with suggesting that users run NTP, and we can recommend it in
the docs rather than introducing additional complexity.

In my research on setting up NTP servers on Linux, I found that
Chrony[1] is a lightweight and efficient solution for time
synchronization across nodes. Another reliable option is the classic
NTP daemon (ntpd)[2], which is also easy to configure and maintain.
Both Chrony and ntpd can be used to configure a local machine as an
NTP server for localized time synchronization, or as clients syncing
from public NTP servers such as 'ntp.ubuntu.com' (the default NTP
server pool for Ubuntu systems) or 'time.google.com' (Google Public
NTP). For example, on Ubuntu, Chrony is straightforward to install and
configure[3]. Comprehensive ntpd configuration guides are also
available for various Linux distributions, such as Ubuntu[4] and
Red Hat Enterprise Linux[5].
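
As a rough illustration, a minimal Chrony client configuration (in
/etc/chrony/chrony.conf on Ubuntu; the path and values vary by
distribution and are only an example) could look like:

  # sync from a public NTP pool
  pool ntp.ubuntu.com iburst
  # step the clock at startup if it is badly off
  makestep 1.0 3

and, to additionally serve time to other nodes on a local network:

  # allow NTP clients from the local subnet (subnet shown is an example)
  allow 192.168.1.0/24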

Further, I’m exploring options for implementing NTP on Windows systems.

[1] https://chrony-project.org/index.html
[2] https://www.ntp.org/documentation/4.2.8-series/
[3] 
https://documentation.ubuntu.com/server/how-to/networking/serve-ntp-with-chrony/
[4] https://askubuntu.com/questions/14558/how-do-i-setup-a-local-ntp-server
[5] 
https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/7/html/system_administrators_guide/ch-configuring_ntp_using_ntpd#s1-Understanding_the_ntpd_Configuration_File

Thanks,
Nisha




Clock-skew management in logical replication

2024-09-20 Thread Nisha Moond
Hello Hackers,
(CC people involved in the earlier discussion)

While considering the implementation of timestamp-based conflict
resolution (last_update_wins) in logical replication (see [1]), there
was feedback at [2] and a discussion on whether or not to manage
clock skew at the database level. We researched the history of
clock-skew related discussions in Postgres itself and summarized it
at [3].

We also analyzed how other databases deal with it. Based on our
research, other classic RDBMSs such as Oracle and IBM Informix, which
use similar timestamp-based resolution methods, do not address clock
skew at the database level. Instead, they recommend using external
time synchronization solutions, such as NTP.

- Oracle, while handling conflicts[4], assumes clocks are synchronized
and relies on external tools like NTP for time synchronization between
nodes[5].
- IBM Informix, similarly, recommends using its network commands to
ensure clock synchronization across nodes[6].

Other Postgres-based databases, like EDB-BDR and YugabyteDB, provide
GUC parameters to manage clock skew within the database:

- EDB-BDR allows configuration of parameters like
bdr.maximum_clock_skew and bdr.maximum_clock_skew_action to define the
acceptable skew and the action to take when it is exceeded[7].
- YugabyteDB offers a max_clock_skew_usec setting, which causes the
node to crash if the clock skew exceeds the specified value[8].

There are, of course, other approaches to managing clock skew used by
distributed systems, such as NTP daemons, centralized logical clocks,
atomic clocks (as in Google Spanner), and time-sync services like the
Amazon Time Sync Service[9].

Implementing any of these time-sync services for CDR would be quite a
deviation and a big project in itself, and we are not sure it is really
needed. At most, as an aid to users, we could provide a GUC-based
mechanism to handle clock skew in logical replication. The idea is that
users should handle clock skew outside of the database, but in
worst-case scenarios they can rely on these GUCs.

We have attempted to implement a patch which manages clock skew in
logical replication. It works based on these new GUCs (see [10] for a
detailed discussion):

- max_logical_rep_clock_skew: Defines the tolerable limit for clock skew.
- max_logical_rep_clock_skew_action: Configures the action to take when
the clock skew exceeds the limit.
- max_logical_rep_clock_skew_wait: Limits the maximum wait time if the
action is configured as "wait".
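
For illustration, on the subscriber these could be set along the
following lines (the values and units shown here are only an example;
the patch defines the exact defaults and accepted settings):

  max_logical_rep_clock_skew = 1min          # tolerable clock-skew limit
  max_logical_rep_clock_skew_action = wait   # action when the limit is exceeded
  max_logical_rep_clock_skew_wait = 5min     # upper bound on the wait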

The proposed idea is implemented in the attached patch v1. Thanks to
Shveta for implementing it, and thanks to Kuroda-san for assisting in
the research.

Thoughts? Looking forward to hearing others' opinions!

[1]: 
https://www.postgresql.org/message-id/CAJpy0uD0-DpYVMtsxK5R%3DzszXauZBayQMAYET9sWr_w0CNWXxQ%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/CAFiTN-uTycjZWdp1kEpN9w7b7SQpoGL5zyg_qZzjpY_vr2%2BKsg%40mail.gmail.com
[3]: 
https://www.postgresql.org/message-id/CAA4eK1Jn4r-y%2BbkW%3DJaKCbxEz%3DjawzQAS1Z4wAd8jT%2B1B0RL2w%40mail.gmail.com
[4]: 
https://www.oracle.com/cn/a/tech/docs/technical-resources/wp-oracle-goldengate-activeactive-final2-1.pdf
[5]: 
https://docs.oracle.com/en/operating-systems/oracle-linux/8/network/network-ConfiguringNetworkTime.html
[6]: 
https://www.ibm.com/docs/en/informix-servers/14.10?topic=environment-time-synchronization
[7]:  
https://www.enterprisedb.com/docs/pgd/latest/reference/pgd-settings/#bdrmaximum_clock_skew
[8]:  
https://support.yugabyte.com/hc/en-us/articles/4403707404173-Too-big-clock-skew-leading-to-error-messages-or-tserver-crashes
[9]: 
https://aws.amazon.com/about-aws/whats-new/2023/11/amazon-time-sync-service-microsecond-accurate-time/
[10]: 
https://www.postgresql.org/message-id/CAJpy0uDCW%2BvrBoUZWrBWPjsM%3D9wwpwbpZuZa8Raj3VqeVYs3PQ%40mail.gmail.com

--
Thanks,
Nisha


v1-0001-Implements-Clock-skew-management-between-nodes.patch
Description: Binary data


Re: Conflict Detection and Resolution

2024-09-19 Thread Nisha Moond
On Fri, Sep 20, 2024 at 8:40 AM Nisha Moond  wrote:
>
> On Wed, Sep 18, 2024 at 10:46 AM vignesh C  wrote:
> >
> > On Thu, 12 Sept 2024 at 14:03, Ajin Cherian  wrote:
> > >
> > > On Tue, Sep 3, 2024 at 7:42 PM vignesh C  wrote:
> > >
> > > On Fri, 30 Aug 2024 at 11:01, Nisha Moond  
> > > wrote:
> > > >
> > > > Here is the v11 patch-set. Changes are:
> > >
> > > 1) This command crashes:
> > > ALTER SUBSCRIPTION name RESET CONFLICT RESOLVER FOR NULL;
> > > #0  __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
> > > #1  0x55c67270600a in ResetConflictResolver (subid=16404,
> > > conflict_type=0x0) at conflict.c:744
> > > #2  0x55c67247e0c3 in AlterSubscription (pstate=0x55c6748ff9d0,
> > > stmt=0x55c67497dfe0, isTopLevel=true) at subscriptioncmds.c:1664
> > >
> > > +   | ALTER SUBSCRIPTION name RESET CONFLICT
> > > RESOLVER FOR conflict_type
> > > +   {
> > > +   AlterSubscriptionStmt *n =
> > > +   
> > > makeNode(AlterSubscriptionStmt);
> > > +
> > > +   n->kind =
> > > ALTER_SUBSCRIPTION_RESET_CONFLICT_RESOLVER;
> > > +   n->subname = $3;
> > > +   n->conflict_type = $8;
> > > +   $$ = (Node *) n;
> > > +   }
> > > +   ;
> > > +conflict_type:
> > > +   Sconst
> > >  { $$ = $1; }
> > > +   | NULL_P
> > >  { $$ = NULL; }
> > > ;
> > >
> > > May be conflict_type should be changed to:
> > > +conflict_type:
> > > +   Sconst
> > >  { $$ = $1; }
> > > ;
> > >
> > >
> > > Fixed.
> > >
> >
> > Few comments:
> > 1) This should be in  (fout->remoteVersion >= 18) check to support
> > dumping backward compatible server objects, else dump with older
> > version will fail:
> > +   /* Populate conflict type fields using the new query */
> > +   confQuery = createPQExpBuffer();
> > +   appendPQExpBuffer(confQuery,
> > + "SELECT confrtype,
> > confrres FROM pg_catalog.pg_subscription_conflict "
> > + "WHERE confsubid =
> > %u;", subinfo[i].dobj.catId.oid);
> > +   confRes = ExecuteSqlQuery(fout, confQuery->data,
> > PGRES_TUPLES_OK);
> > +
> > +   ntuples = PQntuples(confRes);
> > +   for (j = 0; j < ntuples; j++)
> >
> > 2) Can we check and throw an error before the warning is logged in
> > this case as it seems strange to throw a warning first and then an
> > error for the same track_commit_timestamp configuration:
> > postgres=# create subscription sub1 connection ... publication pub1
> > conflict resolver (insert_exists = 'last_update_wins');
> > WARNING:  conflict detection and resolution could be incomplete due to
> > disabled track_commit_timestamp
> > DETAIL:  Conflicts update_origin_differs and delete_origin_differs
> > cannot be detected, and the origin and commit timestamp for the local
> > row will not be logged.
> > ERROR:  resolver last_update_wins requires "track_commit_timestamp" to
> > be enabled
> > HINT:  Make sure the configuration parameter "track_commit_timestamp" is 
> > set.
> >
>
> Thanks for the review.
> Here is the v14 patch-set fixing review comments in [1] and [2].

Just noticed that there are failures for the new
034_conflict_resolver.pl test on CFbot. From an initial review, it
seems to be a test issue rather than a bug.
We will fix these failures along with the next version of the patch set.

Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-09-19 Thread Nisha Moond
On Fri, Sep 20, 2024 at 8:40 AM Nisha Moond  wrote:
>
> On Wed, Sep 18, 2024 at 10:46 AM vignesh C  wrote:
> >
> > On Thu, 12 Sept 2024 at 14:03, Ajin Cherian  wrote:
> > >
> > > On Tue, Sep 3, 2024 at 7:42 PM vignesh C  wrote:
> > >
> > > On Fri, 30 Aug 2024 at 11:01, Nisha Moond  
> > > wrote:
> > > >
> > > > Here is the v11 patch-set. Changes are:
> > >
> > > 1) This command crashes:
> > > ALTER SUBSCRIPTION name RESET CONFLICT RESOLVER FOR NULL;
> > > #0  __strcmp_avx2 () at ../sysdeps/x86_64/multiarch/strcmp-avx2.S:116
> > > #1  0x55c67270600a in ResetConflictResolver (subid=16404,
> > > conflict_type=0x0) at conflict.c:744
> > > #2  0x55c67247e0c3 in AlterSubscription (pstate=0x55c6748ff9d0,
> > > stmt=0x55c67497dfe0, isTopLevel=true) at subscriptioncmds.c:1664
> > >
> > > +   | ALTER SUBSCRIPTION name RESET CONFLICT
> > > RESOLVER FOR conflict_type
> > > +   {
> > > +   AlterSubscriptionStmt *n =
> > > +   
> > > makeNode(AlterSubscriptionStmt);
> > > +
> > > +   n->kind =
> > > ALTER_SUBSCRIPTION_RESET_CONFLICT_RESOLVER;
> > > +   n->subname = $3;
> > > +   n->conflict_type = $8;
> > > +   $$ = (Node *) n;
> > > +   }
> > > +   ;
> > > +conflict_type:
> > > +   Sconst
> > >  { $$ = $1; }
> > > +   | NULL_P
> > >  { $$ = NULL; }
> > > ;
> > >
> > > May be conflict_type should be changed to:
> > > +conflict_type:
> > > +   Sconst
> > >  { $$ = $1; }
> > > ;
> > >
> > >
> > > Fixed.
> > >
> >
> > Few comments:
> > 1) This should be in  (fout->remoteVersion >= 18) check to support
> > dumping backward compatible server objects, else dump with older
> > version will fail:
> > +   /* Populate conflict type fields using the new query */
> > +   confQuery = createPQExpBuffer();
> > +   appendPQExpBuffer(confQuery,
> > + "SELECT confrtype,
> > confrres FROM pg_catalog.pg_subscription_conflict "
> > + "WHERE confsubid =
> > %u;", subinfo[i].dobj.catId.oid);
> > +   confRes = ExecuteSqlQuery(fout, confQuery->data,
> > PGRES_TUPLES_OK);
> > +
> > +   ntuples = PQntuples(confRes);
> > +   for (j = 0; j < ntuples; j++)
> >
> > 2) Can we check and throw an error before the warning is logged in
> > this case as it seems strange to throw a warning first and then an
> > error for the same track_commit_timestamp configuration:
> > postgres=# create subscription sub1 connection ... publication pub1
> > conflict resolver (insert_exists = 'last_update_wins');
> > WARNING:  conflict detection and resolution could be incomplete due to
> > disabled track_commit_timestamp
> > DETAIL:  Conflicts update_origin_differs and delete_origin_differs
> > cannot be detected, and the origin and commit timestamp for the local
> > row will not be logged.
> > ERROR:  resolver last_update_wins requires "track_commit_timestamp" to
> > be enabled
> > HINT:  Make sure the configuration parameter "track_commit_timestamp" is 
> > set.
> >
>
> Thanks for the review.
> Here is the v14 patch-set fixing review comments in [1] and [2].

Clarification:
The fixes for the above comments from Vignesh ([1] & [2]) are included
in patch-0001. Thank you, Ajin, for providing the changes.

> New in patches:
> 1) Added partition table tests in 034_conflict_resolver.pl in 002 and
> 003 patches.
> 2) 003 has a bug fix for update_exists conflict resolution on
> partitioned tables.
>
> [1]: 
> https://www.postgresql.org/message-id/CALDaNm3es1JqU8Qcv5Yw%3D7Ts2dOvaV8a_boxPSdofB%2BDTx1oFg%40mail.gmail.com
> [2]: 
> https://www.postgresql.org/message-id/CALDaNm18HuAcNsEC47J6qLRC7rMD2Q9_wT_hFtcc4UWqsfkgjA%40mail.gmail.com
>
> Thanks,
> Nisha




Re: Conflict Detection and Resolution

2024-09-09 Thread Nisha Moond
On Fri, Sep 6, 2024 at 2:05 PM Ajin Cherian  wrote:
>
>
>
> On Thu, Aug 29, 2024 at 2:50 PM shveta malik  wrote:
>>
>> On Wed, Aug 28, 2024 at 4:07 PM shveta malik  wrote:
>> >
>> > > On Wed, Aug 28, 2024 at 10:30 AM Ajin Cherian  wrote:
>> > > >
>> >
>> > The review is WIP. Please find a few comments on patch001.
>> >
>>
>> More comments on ptach001 in continuation of previous comments:
>>
>
> Thank you for your feedback, Shveta. I've addressed both sets of comments you 
> provided.

Thanks for the patches. I tested the v12-0001 patch, and here are my comments:

1) An unexpected error occurs when attempting to alter the resolvers
for multiple conflict_types in the ALTER SUB...CONFLICT RESOLVER
command. See the examples below:

postgres=#  alter subscription sub2 CONFLICT RESOLVER
(update_exists=keep_local,  delete_missing=error,
update_origin_differs=error);
ERROR:  unrecognized node type: 1633972341

postgres=#  alter subscription sub2 CONFLICT RESOLVER (
update_origin_differs=error, update_exists=error);
ERROR:  unrecognized node type: 1633972341

postgres=#  alter subscription sub2 CONFLICT RESOLVER (
delete_origin_differs=error, delete_missing=error);
ERROR:  unrecognized node type: 1701602660

postgres=#  alter subscription sub2 CONFLICT RESOLVER
(update_exists=keep_local,  delete_missing=error);
ALTER SUBSCRIPTION

-- It appears that the error occurs only when at least two conflict
types belong to the same category, either UPDATE or DELETE.

2) Given the above issue, it would be beneficial to add a test in
subscription.sql to cover cases where all valid conflict types are set
with appropriate resolvers in both the ALTER and CREATE commands.
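
For instance, something along these lines (the subscription name,
connection string, and resolver choices here are only illustrative):

  CREATE SUBSCRIPTION regress_sub CONNECTION '...' PUBLICATION pub1
      CONFLICT RESOLVER (insert_exists = error,
                         update_exists = error,
                         update_origin_differs = apply_remote,
                         update_missing = skip,
                         delete_origin_differs = apply_remote,
                         delete_missing = skip);

  ALTER SUBSCRIPTION regress_sub CONFLICT RESOLVER
      (insert_exists = keep_local, update_exists = error,
       update_origin_differs = error, update_missing = skip,
       delete_origin_differs = error, delete_missing = skip);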

Thanks,
Nisha




Re: Commit Timestamp and LSN Inversion issue

2024-09-08 Thread Nisha Moond
On Wed, Sep 4, 2024 at 12:23 PM shveta malik  wrote:
>
> Hello hackers,
> (Cc people involved in the earlier discussion)
>
> I would like to discuss the $Subject.
>
> While discussing Logical Replication's Conflict Detection and
> Resolution (CDR) design in [1] , it came to  our notice that the
> commit LSN and timestamp may not correlate perfectly i.e. commits may
> happen with LSN1 < LSN2 but with Ts1 > Ts2. This issue may arise
> because, during the commit process, the timestamp (xactStopTimestamp)
> is captured slightly earlier than when space is reserved in the WAL.
>
>  ~~
>
>  Reproducibility of conflict-resolution problem due to the timestamp inversion
> 
> It was suggested that timestamp inversion *may* impact the time-based
> resolutions such as last_update_wins (targeted to be implemented in
> [1]) as we may end up making wrong decisions if timestamps and LSNs
> are not correctly ordered. And thus we tried some tests but failed to
> find any practical scenario where it could be a problem.
>
> Basically, the proposed conflict resolution is a row-level resolution,
> and to cause the row value to be inconsistent, we need to modify the
> same row in concurrent transactions and commit the changes
> concurrently. But this doesn't seem possible because concurrent
> updates on the same row are disallowed (e.g., the later update will be
> blocked due to the row lock).  See [2] for the details.
>
> We tried to give some thoughts on multi table cases as well e.g.,
> update table A with foreign key and update the table B that table A
> refers to. But update on table A will block the update on table B as
> well, so we could not reproduce data-divergence due to the
> LSN/timestamp mismatch issue there.
>
>  ~~
>
> Idea proposed to fix the timestamp inversion issue
> 
> There was a suggestion in [3] to acquire the timestamp while reserving
> the space (because that happens in LSN order). The clock would need to
> be monotonic (easy enough with CLOCK_MONOTONIC), but also cheap. The
> main problem why it's being done outside the critical section, because
> gettimeofday() may be quite expensive. There's a concept of hybrid
> clock, combining "time" and logical counter, which might be useful
> independently of CDR.
>
> On further analyzing this idea, we found that CLOCK_MONOTONIC can be
> accepted only by clock_gettime() which has more precision than
> gettimeofday() and thus is equally or more expensive theoretically (we
> plan to test it and post the results). It does not look like a good
> idea to call any of these when holding spinlock to reserve the wal
> position.  As for the suggested solution "hybrid clock", it might not
> help here because the logical counter is only used to order the
> transactions with the same timestamp. The problem here is how to get
> the timestamp along with wal position
> reservation(ReserveXLogInsertLocation).
>

Here are the tests done to compare clock_gettime() and gettimeofday()
performance.

Machine details :
Intel(R) Xeon(R) CPU E7-4890 v2 @ 2.80GHz
CPU(s): 120; 800GB RAM

Three functions were tested across three different call volumes (1
million, 100 million, and 1 billion):
1) clock_gettime() with CLOCK_REALTIME
2) clock_gettime() with CLOCK_MONOTONIC
3) gettimeofday()

--> clock_gettime() with CLOCK_MONOTONIC sometimes shows slightly
better performance, but not consistently. The difference in time taken
by the three functions is minimal, with averages varying by no more
than ~2.5%. Overall, the performance of CLOCK_MONOTONIC and
gettimeofday() is essentially the same.

Below are the test results -
(each test was run twice for consistency)

1) For 1 million calls:
 1a) clock_gettime() with CLOCK_REALTIME:
- Run 1: 0.01770 seconds, Run 2: 0.01772 seconds, Average: 0.01771 seconds.
 1b) clock_gettime() with CLOCK_MONOTONIC:
- Run 1: 0.01753 seconds, Run 2: 0.01748 seconds, Average: 0.01750 seconds.
 1c) gettimeofday():
- Run 1: 0.01742 seconds, Run 2: 0.01777 seconds, Average: 0.01760 seconds.

2) For 100 million calls:
 2a) clock_gettime() with CLOCK_REALTIME:
- Run 1: 1.76649 seconds, Run 2: 1.76602 seconds, Average: 1.76625 seconds.
 2b) clock_gettime() with CLOCK_MONOTONIC:
- Run 1: 1.72768 seconds, Run 2: 1.72988 seconds, Average: 1.72878 seconds.
 2c) gettimeofday():
- Run 1: 1.72436 seconds, Run 2: 1.72174 seconds, Average: 1.72305 seconds.

3) For 1 billion calls:
 3a) clock_gettime() with CLOCK_REALTIME:
- Run 1: 17.63859 seconds, Run 2: 17.65529 seconds, Average:
17.64694 seconds.
 3b) clock_gettime() with CLOCK_MONOTONIC:
- Run 1: 17.15109 seconds, Run 2: 17.27406 seconds, Average:
17.21257 seconds.
 3c) gettimeofday():
- Run 1: 17.21368 seconds, Run 2: 17.22983 seconds, Average:
17.22175 seconds.

Attached are the scripts used for the tests.
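
A minimal sketch of the kind of timing loop used for such a comparison
(this is an illustration only, not the exact attached script):

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>
  #include <sys/time.h>

  int
  main(int argc, char **argv)
  {
      long            i,
                      n = (argc > 1) ? atol(argv[1]) : 1000000;
      struct timespec ts, start, end;
      struct timeval  tv;

      /* time n calls of clock_gettime() (CLOCK_REALTIME or CLOCK_MONOTONIC) */
      clock_gettime(CLOCK_MONOTONIC, &start);
      for (i = 0; i < n; i++)
          clock_gettime(CLOCK_REALTIME, &ts);
      clock_gettime(CLOCK_MONOTONIC, &end);
      printf("clock_gettime: %.5f s\n",
             (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9);

      /* time n calls of gettimeofday() */
      clock_gettime(CLOCK_MONOTONIC, &start);
      for (i = 0; i < n; i++)
          gettimeofday(&tv, NULL);
      clock_gettime(CLOCK_MONOTONIC, &end);
      printf("gettimeofday:  %.5f s\n",
             (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9);

      return 0;
  }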

--
Thanks,
Nisha


Re: Conflict Detection and Resolution

2024-09-06 Thread Nisha Moond
On Thu, Aug 29, 2024 at 4:43 PM Amit Kapila  wrote:
>
> On Fri, Aug 23, 2024 at 10:39 AM shveta malik  wrote:
> >
> > Please find issues which need some thoughts and approval for
> > time-based resolution and clock-skew.
> >
> > 1)
> > Time based conflict resolution and two phase transactions:
> >
> > Time based conflict resolution (last_update_wins) is the one
> > resolution which will not result in data-divergence considering
> > clock-skew is taken care of. But when it comes to two-phase
> > transactions, it might not be the case. For two-phase transaction, we
> > do not have commit timestamp when the changes are being applied. Thus
> > for time-based comparison, initially it was decided to user prepare
> > timestamp but it may result in data-divergence.  Please see the
> > example at [1].
> >
> > Example at [1] is a tricky situation, and thus in the initial draft,
> > we decided to restrict usage of 2pc and CDR together. The plan is:
> >
> > a) During Create subscription, if the user has given last_update_wins
> > resolver for any conflict_type and 'two_phase' is also enabled, we
> > ERROR out.
> > b) During Alter subscription, if the user tries to update resolver to
> > 'last_update_wins' but 'two_phase' is enabled, we error out.
> >
> > Another solution could be to save both prepare_ts and commit_ts. And
> > when any txn comes for conflict resolution, we first check if
> > prepare_ts is available, use that else use commit_ts.  Availability of
> > prepare_ts would indicate it was a prepared txn and thus even if it is
> > committed, we should use prepare_ts for comparison for consistency.
> > This will have some overhead of storing prepare_ts along with
> > commit_ts. But if the number of prepared txns are reasonably small,
> > this overhead should be less.
> >
>
> Yet another idea is that if the conflict is detected and the
> resolution strategy is last_update_wins then from that point we start
> writing all the changes to the file similar to what we do for
> streaming mode and only once commit_prepared arrives, we will read and
> apply changes. That will solve this problem.
>
> > We currently plan to go with restricting 2pc and last_update_wins
> > together, unless others have different opinions.
> >
>
> Sounds reasonable but we should add comments on the possible solution
> like the one I have mentioned so that we can extend it afterwards.
>

Done, the v12-004 patch has comments describing the possible solution.

> > ~~
> >
> > 2)
> > parallel apply worker and conflict-resolution:
> > As discussed in [2] (see last paragraph in [2]), for streaming of
> > in-progress transactions by parallel worker, we do not have
> > commit-timestamp with each change and thus it makes sense to disable
> > parallel apply worker with CDR. The plan is to not start parallel
> > apply worker if 'last_update_wins' is configured for any
> > conflict_type.
> >
>
> The other idea is that we can let the changes written to file if any
> conflict is detected and then at commit time let the remaining changes
> be applied by apply worker. This can introduce some complexity, so
> similar to two_pc we can extend this functionality later.
>

The v12-004 patch has comments noting that this can be extended later.

> >  ~~
> >
> > 3)
> > parallel apply worker and clock skew management:
> > Regarding clock-skew management as discussed in [3], we will wait for
> > the local clock to come within tolerable range during 'begin' rather
> > than before 'commit'. And this wait needs commit-timestamp in the
> > beginning, thus we plan to restrict starting pa-worker even when
> > clock-skew related GUCs are configured.
> >
> > Earlier we had restricted both 2pc and parallel worker worker start
> > when detect_conflict was enabled, but now since detect_conflict
> > parameter is removed, we will change the implementation to restrict
> > all 3 above cases when last_update_wins is configured. When the
> > changes are done, we will post the patch.
> >
>
> At this stage, we are not sure how we want to deal with clock skew.
> There is an argument that clock-skew should be handled outside the
> database, so we can probably have the clock-skew-related stuff in a
> separate patch.
>

Separated the clock-skew related code into the v12-005 patch.

--
Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-08-29 Thread Nisha Moond
On Mon, Aug 26, 2024 at 2:23 PM Amit Kapila  wrote:
>
> On Thu, Aug 22, 2024 at 3:45 PM shveta malik  wrote:
> >
> > On Wed, Aug 21, 2024 at 4:08 PM Nisha Moond  
> > wrote:
> > >
> > > The patches have been rebased on the latest pgHead following the merge
> > > of the conflict detection patch [1].
> >
> > Thanks for working on patches.
> >
> > Summarizing the issues which need some suggestions/thoughts.
> >
> > 1)
> > For subscription based resolvers, currently the syntax implemented is:
> >
> > 1a)
> > CREATE SUBSCRIPTION 
> > CONNECTION  PUBLICATION 
> > CONFLICT RESOLVER
> > (conflict_type1 = resolver1, conflict_type2 = resolver2,
> > conflict_type3 = resolver3,...);
> >
> > 1b)
> > ALTER SUBSCRIPTION  CONFLICT RESOLVER
> > (conflict_type1 = resolver1, conflict_type2 = resolver2,
> > conflict_type3 = resolver3,...);
> >
> > Earlier the syntax suggested in [1] was:
> > CREATE SUBSCRIPTION  CONNECTION  PUBLICATION 
> > CONFLICT RESOLVER 'conflict_resolver1' FOR 'conflict_type1',
> > CONFLICT RESOLVER 'conflict_resolver2' FOR 'conflict_type2';
> >
> > I think the currently implemented syntax  is good as it has less
> > repetition, unless others think otherwise.
> >
> > ~~
> >
> > 2)
> > For subscription based resolvers, do we need a RESET command to reset
> > resolvers to default? Any one of below or both?
> >
> > 2a) reset all at once:
> >  ALTER SUBSCRIPTION  RESET CONFLICT RESOLVERS
> >
> > 2b) reset one at a time:
> >  ALTER SUBSCRIPTION  RESET CONFLICT RESOLVER for 'conflict_type';
> >
> > The issue I see here is, to implement 1a and 1b, we have introduced
> > the  'RESOLVER' keyword. If we want to implement 2a, we will have to
> > introduce the 'RESOLVERS' keyword as well. But we can come up with
> > some alternative syntax if we plan to implement these. Thoughts?
> >
>
> It makes sense to have a RESET on the lines of (a) and (b). At this
> stage, we should do minimal in extending the syntax. How about RESET
> CONFLICT RESOLVER ALL for (a)?
>

Done, v11 implements the suggested RESET command.
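
For example, with v11 the following forms are accepted (the
subscription name and conflict type below are only illustrative):

  ALTER SUBSCRIPTION sub1 RESET CONFLICT RESOLVER ALL;
  ALTER SUBSCRIPTION sub1 RESET CONFLICT RESOLVER FOR 'insert_exists';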

> > ~~
> >

--
Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-08-29 Thread Nisha Moond
On Fri, Aug 23, 2024 at 10:39 AM shveta malik  wrote:
>
> On Thu, Aug 22, 2024 at 3:44 PM shveta malik  wrote:
> >
> >
> > For clock-skew and timestamp based resolution, if needed, I will post
> > another email for the design items where suggestions are needed.
> >
>
> Please find issues which need some thoughts and approval for
> time-based resolution and clock-skew.
>
> 1)
> Time based conflict resolution and two phase transactions:
>
> Time based conflict resolution (last_update_wins) is the one
> resolution which will not result in data-divergence considering
> clock-skew is taken care of. But when it comes to two-phase
> transactions, it might not be the case. For two-phase transaction, we
> do not have commit timestamp when the changes are being applied. Thus
> for time-based comparison, initially it was decided to user prepare
> timestamp but it may result in data-divergence.  Please see the
> example at [1].
>
> Example at [1] is a tricky situation, and thus in the initial draft,
> we decided to restrict usage of 2pc and CDR together. The plan is:
>
> a) During Create subscription, if the user has given last_update_wins
> resolver for any conflict_type and 'two_phase' is also enabled, we
> ERROR out.
> b) During Alter subscription, if the user tries to update resolver to
> 'last_update_wins' but 'two_phase' is enabled, we error out.
>
> Another solution could be to save both prepare_ts and commit_ts. And
> when any txn comes for conflict resolution, we first check if
> prepare_ts is available, use that else use commit_ts.  Availability of
> prepare_ts would indicate it was a prepared txn and thus even if it is
> committed, we should use prepare_ts for comparison for consistency.
> This will have some overhead of storing prepare_ts along with
> commit_ts. But if the number of prepared txns are reasonably small,
> this overhead should be less.
>
> We currently plan to go with restricting 2pc and last_update_wins
> together, unless others have different opinions.
>

Done. v11-004 implements the idea of restricting 2pc and
last_update_wins together.

> ~~
>
> 2)
> parallel apply worker and conflict-resolution:
> As discussed in [2] (see last paragraph in [2]), for streaming of
> in-progress transactions by parallel worker, we do not have
> commit-timestamp with each change and thus it makes sense to disable
> parallel apply worker with CDR. The plan is to not start parallel
> apply worker if 'last_update_wins' is configured for any
> conflict_type.
>

Done.

>  ~~
>
> 3)
> parallel apply worker and clock skew management:
> Regarding clock-skew management as discussed in [3], we will wait for
> the local clock to come within tolerable range during 'begin' rather
> than before 'commit'. And this wait needs commit-timestamp in the
> beginning, thus we plan to restrict starting pa-worker even when
> clock-skew related GUCs are configured.
>

Done. v11 implements it.

> Earlier we had restricted both 2pc and parallel worker worker start
> when detect_conflict was enabled, but now since detect_conflict
> parameter is removed, we will change the implementation to restrict
> all 3 above cases when last_update_wins is configured. When the
> changes are done, we will post the patch.
>
> ~~
>
--
Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-08-27 Thread Nisha Moond
On Fri, Aug 23, 2024 at 10:39 AM shveta malik  wrote:
>
> On Thu, Aug 22, 2024 at 3:44 PM shveta malik  wrote:
> >
> >
> > For clock-skew and timestamp based resolution, if needed, I will post
> > another email for the design items where suggestions are needed.
> >
>
> Please find issues which need some thoughts and approval for
> time-based resolution and clock-skew.
>
> 1)
> Time based conflict resolution and two phase transactions:
>
> Time based conflict resolution (last_update_wins) is the one
> resolution which will not result in data-divergence considering
> clock-skew is taken care of. But when it comes to two-phase
> transactions, it might not be the case. For two-phase transaction, we
> do not have commit timestamp when the changes are being applied. Thus
> for time-based comparison, initially it was decided to user prepare
> timestamp but it may result in data-divergence.  Please see the
> example at [1].
>
> Example at [1] is a tricky situation, and thus in the initial draft,
> we decided to restrict usage of 2pc and CDR together. The plan is:
>
> a) During Create subscription, if the user has given last_update_wins
> resolver for any conflict_type and 'two_phase' is also enabled, we
> ERROR out.
> b) During Alter subscription, if the user tries to update resolver to
> 'last_update_wins' but 'two_phase' is enabled, we error out.
>
> Another solution could be to save both prepare_ts and commit_ts. And
> when any txn comes for conflict resolution, we first check if
> prepare_ts is available, use that else use commit_ts.  Availability of
> prepare_ts would indicate it was a prepared txn and thus even if it is
> committed, we should use prepare_ts for comparison for consistency.
> This will have some overhead of storing prepare_ts along with
> commit_ts. But if the number of prepared txns are reasonably small,
> this overhead should be less.
>
> We currently plan to go with restricting 2pc and last_update_wins
> together, unless others have different opinions.
>
> ~~
>
> 2)
> parallel apply worker and conflict-resolution:
> As discussed in [2] (see last paragraph in [2]), for streaming of
> in-progress transactions by parallel worker, we do not have
> commit-timestamp with each change and thus it makes sense to disable
> parallel apply worker with CDR. The plan is to not start parallel
> apply worker if 'last_update_wins' is configured for any
> conflict_type.
>
>  ~~
>
> 3)
> parallel apply worker and clock skew management:
> Regarding clock-skew management as discussed in [3], we will wait for
> the local clock to come within tolerable range during 'begin' rather
> than before 'commit'. And this wait needs commit-timestamp in the
> beginning, thus we plan to restrict starting pa-worker even when
> clock-skew related GUCs are configured.
>
> Earlier we had restricted both 2pc and parallel worker worker start
> when detect_conflict was enabled, but now since detect_conflict
> parameter is removed, we will change the implementation to restrict
> all 3 above cases when last_update_wins is configured. When the
> changes are done, we will post the patch.
>
> ~~
>
> 4)
> 
> Earlier when 'detect_conflict' was enabled, we were giving WARNING if
> 'track_commit_timestamp' was not enabled. This was during CREATE and
> ALTER subscription. Now with this parameter removed, this WARNING has
> also been removed. But I think we need to bring back this WARNING.
> Currently default resolvers set may work without
> 'track_commit_timestamp' but when user gives CONFLICT RESOLVER in
> create-sub or alter-sub explicitly making them configured to
> non-default values (or say any values, does not matter if few are
> defaults), we may still emit this warning to alert user:
>
> 2024-07-26 09:14:03.152 IST [195415] WARNING:  conflict detection
> could be incomplete due to disabled track_commit_timestamp
> 2024-07-26 09:14:03.152 IST [195415] DETAIL:  Conflicts update_differ
> and delete_differ cannot be detected, and the origin and commit
> timestamp for the local row will not be logged.
>
> Thoughts?
>
> If we emit this WARNING during each resolution, then it may flood our
> log files, thus it seems better to emit it during create or alter
> subscription instead of during resolution.
>
>
Done.
v10 implements the suggested warning when a user explicitly specifies
CONFLICT RESOLVER in CREATE SUBSCRIPTION or ALTER SUBSCRIPTION.

Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-08-27 Thread Nisha Moond
On Mon, Aug 26, 2024 at 2:23 PM Amit Kapila  wrote:
>
> On Thu, Aug 22, 2024 at 3:45 PM shveta malik  wrote:
> >
> > On Wed, Aug 21, 2024 at 4:08 PM Nisha Moond  
> > wrote:
> > >
> > > The patches have been rebased on the latest pgHead following the merge
> > > of the conflict detection patch [1].
> >
> > Thanks for working on patches.
> >
> > Summarizing the issues which need some suggestions/thoughts.
> >
> > 1)
> > For subscription based resolvers, currently the syntax implemented is:
> >
> > 1a)
> > CREATE SUBSCRIPTION 
> > CONNECTION  PUBLICATION 
> > CONFLICT RESOLVER
> > (conflict_type1 = resolver1, conflict_type2 = resolver2,
> > conflict_type3 = resolver3,...);
> >
> > 1b)
> > ALTER SUBSCRIPTION  CONFLICT RESOLVER
> > (conflict_type1 = resolver1, conflict_type2 = resolver2,
> > conflict_type3 = resolver3,...);
> >
> > Earlier the syntax suggested in [1] was:
> > CREATE SUBSCRIPTION  CONNECTION  PUBLICATION 
> > CONFLICT RESOLVER 'conflict_resolver1' FOR 'conflict_type1',
> > CONFLICT RESOLVER 'conflict_resolver2' FOR 'conflict_type2';
> >
> > I think the currently implemented syntax  is good as it has less
> > repetition, unless others think otherwise.
> >
> > ~~
> >
> > 2)
> > For subscription based resolvers, do we need a RESET command to reset
> > resolvers to default? Any one of below or both?
> >
> > 2a) reset all at once:
> >  ALTER SUBSCRIPTION  RESET CONFLICT RESOLVERS
> >
> > 2b) reset one at a time:
> >  ALTER SUBSCRIPTION  RESET CONFLICT RESOLVER for 'conflict_type';
> >
> > The issue I see here is, to implement 1a and 1b, we have introduced
> > the  'RESOLVER' keyword. If we want to implement 2a, we will have to
> > introduce the 'RESOLVERS' keyword as well. But we can come up with
> > some alternative syntax if we plan to implement these. Thoughts?
> >
>
> It makes sense to have a RESET on the lines of (a) and (b). At this
> stage, we should do minimal in extending the syntax. How about RESET
> CONFLICT RESOLVER ALL for (a)?
>
> > ~~
> >
> > 3)  Regarding update_exists:
> >
> > 3a)
> > Currently update_exists resolver patch is kept separate. The reason
> > being, it performs resolution which will need deletion of multiple
> > rows. It will be good to discuss if we want to target this in the
> > first draft. Please see the example:
> >
> > create table tab (a int primary key, b int unique, c int unique);
> >
> > Pub: insert into tab  values (1,1,1);
> > Sub:
> > insert into tab  values (2,20,30);
> > insert into tab values (3,40,50);
> > insert into tab values (4,60,70);
> >
> > Pub: update tab set a=2,b=40,c=70 where a=1;
> >
> > The above 'update' on pub will result in 'update_exists' on sub and if
> > resolution is in favour of 'apply', then it will conflict with all the
> > three local rows of subscriber due to unique constraint present on all
> > three columns. Thus in order to resolve the conflict, it will have to
> > delete these 3 rows on sub:
> >
> > 2,20,30
> > 3,40,50
> > 4,60,70
> > and then update 1,1,1 to 2,40,70.
> >
> > Just need opinion on if we shall target this in the initial draft.
> >
>
> This case looks a bit complicated. It seems there is no other
> alternative than to delete the multiple rows. It is better to create a
> separate top-up patch for this and we can discuss in detail about this
> once the basic patch is in better shape.

From v9 onwards, patch-0003 is a separate top-up patch implementing update_exists.

> > 3b)
> > If we plan to implement this, we need to work on optimal design where
> > we can find all the conflicting rows at once and delete those.
> > Currently the implementation has been done using recursion i.e. find
> > one conflicting row, then delete it and then next and so on i.e. we
> > call  apply_handle_update_internal() recursively. On initial code
> > review, I feel it is doable to scan all indexes at once and get
> > conflicting-tuple-ids in one go and get rid of recursion. It can be
> > attempted once we decide on 3a.
> >
>
> I suggest following the simplest strategy (even if that means calling
> the update function recursively) by adding comments on the optimal
> strategy. We can optimize it later as well.
>
> > ~~
> >
> > 4)
> > Now for insert_exists and update_exists, we are doing a pre-

Re: Conflict Detection and Resolution

2024-08-27 Thread Nisha Moond
On Mon, Aug 26, 2024 at 9:05 AM shveta malik  wrote:
>
> On Wed, Aug 21, 2024 at 4:08 PM Nisha Moond  wrote:
> >
> > The patches have been rebased on the latest pgHead following the merge
> > of the conflict detection patch [1]. The detect_conflict option has
> > been removed, and conflict detection is now enabled by default. This
> > change required the following updates in resolver patches:
> > patch-0001:
> >  - Removed dependency on the detect_conflict option. Now, default
> > conflict resolvers are set on CREATE SUBSCRIPTION if no values are
> > provided.
> >  - To keep the behavior unchanged, the default resolvers are now set as -
> >   insert_exists  = error
> >   update_exists  = error
> >   update_differ  = apply_remote
> >   update_missing = skip
> >   delete_missing = skip
> >   delete_differ  = apply_remote
> >  - Added documentation for conflict resolvers.
> >
> > patch-0002:
> > - Removed dependency on the detect_conflict option.
> > - Updated test cases in 034_conflict_resolver.pl to reflect new
> > default resolvers and the removal of the detect_conflict option.
> >
> > patch-0003:
> > - Implemented resolver for the update_exists conflict type. Supported
> > resolvers are: apply_remote, keep_local, error.
> >
>
> Thanks Nisha for the patches, I was running some tests on
> update_exists and found this case wherein it misses to LOG one
> conflict out of 3.
>
> create table tab (a int primary key, b int unique, c int unique);
> Pub: insert into tab  values (1,1,1);
>
> Sub:
> insert into tab  values (2,20,30);
> insert into tab values (3,40,50);
> insert into tab values (4,60,70);
>
> Pub: update tab set a=2,b=40,c=70 where a=1;
>
> Here it logs update_exists conflict and the resolution for  Key
> (b)=(40) and Key (c)=(70) but misses to LOG first one which is with
> Key (a)=(2).
>

Fixed.

Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-08-25 Thread Nisha Moond
On Thu, Aug 22, 2024 at 3:45 PM shveta malik  wrote:
>
> On Wed, Aug 21, 2024 at 4:08 PM Nisha Moond  wrote:
> >
> > The patches have been rebased on the latest pgHead following the merge
> > of the conflict detection patch [1].
>
> Thanks for working on patches.
>
> Summarizing the issues which need some suggestions/thoughts.
>
> 1)
> For subscription based resolvers, currently the syntax implemented is:
>
> 1a)
> CREATE SUBSCRIPTION 
> CONNECTION  PUBLICATION 
> CONFLICT RESOLVER
> (conflict_type1 = resolver1, conflict_type2 = resolver2,
> conflict_type3 = resolver3,...);
>
> 1b)
> ALTER SUBSCRIPTION  CONFLICT RESOLVER
> (conflict_type1 = resolver1, conflict_type2 = resolver2,
> conflict_type3 = resolver3,...);
>
> Earlier the syntax suggested in [1] was:
> CREATE SUBSCRIPTION  CONNECTION  PUBLICATION 
> CONFLICT RESOLVER 'conflict_resolver1' FOR 'conflict_type1',
> CONFLICT RESOLVER 'conflict_resolver2' FOR 'conflict_type2';
>
> I think the currently implemented syntax  is good as it has less
> repetition, unless others think otherwise.
>
> ~~
>
> 2)
> For subscription based resolvers, do we need a RESET command to reset
> resolvers to default? Any one of below or both?
>
> 2a) reset all at once:
>  ALTER SUBSCRIPTION  RESET CONFLICT RESOLVERS
>
> 2b) reset one at a time:
>  ALTER SUBSCRIPTION  RESET CONFLICT RESOLVER for 'conflict_type';
>
> The issue I see here is, to implement 1a and 1b, we have introduced
> the  'RESOLVER' keyword. If we want to implement 2a, we will have to
> introduce the 'RESOLVERS' keyword as well. But we can come up with
> some alternative syntax if we plan to implement these. Thoughts?
>
> ~~
>
> 3)  Regarding update_exists:
>
> 3a)
> Currently update_exists resolver patch is kept separate. The reason
> being, it performs resolution which will need deletion of multiple
> rows. It will be good to discuss if we want to target this in the
> first draft. Please see the example:
>
> create table tab (a int primary key, b int unique, c int unique);
>
> Pub: insert into tab  values (1,1,1);
> Sub:
> insert into tab  values (2,20,30);
> insert into tab values (3,40,50);
> insert into tab values (4,60,70);
>
> Pub: update tab set a=2,b=40,c=70 where a=1;
>
> The above 'update' on pub will result in 'update_exists' on sub and if
> resolution is in favour of 'apply', then it will conflict with all the
> three local rows of subscriber due to unique constraint present on all
> three columns. Thus in order to resolve the conflict, it will have to
> delete these 3 rows on sub:
>
> 2,20,30
> 3,40,50
> 4,60,70
> and then update 1,1,1 to 2,40,70.
>
> Just need opinion on if we shall target this in the initial draft.
>
> 3b)
> If we plan to implement this, we need to work on optimal design where
> we can find all the conflicting rows at once and delete those.
> Currently the implementation has been done using recursion i.e. find
> one conflicting row, then delete it and then next and so on i.e. we
> call  apply_handle_update_internal() recursively. On initial code
> review, I feel it is doable to scan all indexes at once and get
> conflicting-tuple-ids in one go and get rid of recursion. It can be
> attempted once we decide on 3a.
>
> ~~
>
> 4)
> Now for insert_exists and update_exists, we are doing a pre-scan of
> all unique indexes to find conflict. Also there is post-scan to figure
> out if the conflicting row is inserted meanwhile. This needs to be
> reviewed for optimization. We need to avoid pre-scan wherever
> possible. I think the only case for which it can be avoided is
> 'ERROR'. For the cases where resolver is in favor of remote-apply, we
> need to check conflict beforehand to avoid rollback of already
> inserted data. And for the case where resolver is in favor of skipping
> the change, then too we should know beforehand about the conflict to
> avoid heap-insertion and rollback. Thoughts?
>
+1 to the idea of optimization, but it seems that when the resolver is
set to ERROR, skipping the pre-scan only optimizes the case where no
conflict exists.
If a conflict is found, the apply-worker will error out during the
pre-scan, and no post-scan occurs, so there's no opportunity for
optimization.
However, if no conflict is present, we currently do both pre-scan and
post-scan. Skipping the pre-scan in this scenario could be a
worthwhile optimization, even if it only benefits the no-conflict
case.

--
Thanks,
Nisha




Re: Conflict detection and logging in logical replication

2024-08-12 Thread Nisha Moond
On Mon, Aug 5, 2024 at 10:05 AM shveta malik  wrote:
>
> On Mon, Aug 5, 2024 at 9:19 AM Amit Kapila  wrote:
> >
> > On Fri, Aug 2, 2024 at 6:28 PM Nisha Moond  wrote:
> > >
> > > Performance tests done on the v8-0001 and v8-0002 patches, available at 
> > > [1].
> > >
> >
> > Thanks for doing the detailed tests for this patch.
> >
> > > The purpose of the performance tests is to measure the impact on
> > > logical replication with track_commit_timestamp enabled, as this
> > > involves fetching the commit_ts data to determine
> > > delete_differ/update_differ conflicts.
> > >
> > > Fortunately, we did not see any noticeable overhead from the new
> > > commit_ts fetch and comparison logic. The only notable impact is
> > > potential overhead from logging conflicts if they occur frequently.
> > > Therefore, enabling conflict detection by default seems feasible, and
> > > introducing a new detect_conflict option may not be necessary.
> > >
> > ...
> > >
> > > Test 1: create conflicts on Sub using pgbench.
> > > 
> > > Setup:
> > >  - Both publisher and subscriber have pgbench tables created as-
> > >   pgbench -p $node1_port postgres -qis 1
> > >  - At Sub, a subscription created for all the changes from Pub node.
> > >
> > > Test Run:
> > >  - To test, ran pgbench for 15 minutes on both nodes simultaneously,
> > > which led to concurrent updates and update_differ conflicts on the
> > > Subscriber node.
> > >  Command used to run pgbench on both nodes-
> > > ./pgbench postgres -p 8833 -c 10 -j 3 -T 300 -P 20
> > >
> > > Results:
> > > For each case, note the “tps” and total time taken by the apply-worker
> > > on Sub to apply the changes coming from Pub.
> > >
> > > Case1: track_commit_timestamp = off, detect_conflict = off
> > > Pub-tps = 9139.556405
> > > Sub-tps = 8456.787967
> > > Time of replicating all the changes: 19min 28s
> > > Case 2 : track_commit_timestamp = on, detect_conflict = on
> > > Pub-tps = 8833.016548
> > > Sub-tps = 8389.763739
> > > Time of replicating all the changes: 20min 20s
> > >
> >
> > Why is there a noticeable tps (~3%) reduction in publisher TPS? Is it
> > the impact of track_commit_timestamp = on or something else?

When both the publisher and subscriber nodes are on the same machine,
we observe a decrease in the publisher's TPS when
'track_commit_timestamp' is ON for the subscriber. Testing on pgHead
(without the patch) also showed a similar reduction in the publisher's
TPS.

Test Setup: The test was conducted with the same setup as Test-1.

Results:
Case 1: pgHead - 'track_commit_timestamp' = OFF
  - Pub TPS: 9306.25
  - Sub TPS: 8848.91
Case 2: pgHead - 'track_commit_timestamp' = ON
  - Pub TPS: 8915.75
  - Sub TPS: 8667.12

On pgHead too, there was a ~400tps reduction in the publisher when
'track_commit_timestamp' was enabled on the subscriber.

Additionally, code profiling of the walsender on the publisher showed
that the overhead in Case-2 was mainly in the DecodeCommit() call
stack, causing slower write operations, especially in
logicalrep_write_update() and OutputPluginWrite().

case1 : 'track_commit_timestamp' = OFF
--11.57%--xact_decode
| |  DecodeCommit
| |  ReorderBufferCommit
...
| |  --6.10%--pgoutput_change
| ||
| ||--3.09%--logicalrep_write_update
| |  
| |  |--2.01%--OutputPluginWrite
| ||--1.97%--WalSndWriteData

case2: 'track_commit_timestamp' = ON
|--53.19%--xact_decode
| |  DecodeCommit
| |  ReorderBufferCommit
...
| | --30.25%--pgoutput_change
| |  |
| |  |--15.23%--logicalrep_write_update
| |  
| |  |--9.82%--OutputPluginWrite
| |   |--9.57%--WalSndWriteData

-- In Case 2, the subscriber's process of writing timestamp data for
millions of rows appears to have impacted all write operations on the
machine.

To confirm the profiling results, we conducted the same test with the
publisher and subscriber on separate machines.

Results:
Case 1: 'track_commit_timestamp' = OFF
  - Run 1: Pub TPS: 2144.10, Sub TPS: 2216.02
  - Run 2: Pub TPS: 2159.41, Sub TPS: 2233.82

Case 2: 'track_commit_timestamp' = ON
  - Run 1: Pub TPS: 2174.39, Sub TPS: 2226.89
  - Run 2: Pub TPS: 2148.92, Sub TPS: 2224.80

Note: The machines used in this test were not as powerful as the one
used in the earlier tests, resulting in lower overall TPS (~2k vs.
~8-9k).
However, the results show no noticeable reduction in the publisher's
TPS when 'track_commit_timestamp' is enabled on the subscriber and the
two nodes run on separate machines.

Re: Conflict detection and logging in logical replication

2024-08-02 Thread Nisha Moond
Performance tests done on the v8-0001 and v8-0002 patches, available at [1].

The purpose of the performance tests is to measure the impact on
logical replication with track_commit_timestamp enabled, as this
involves fetching the commit_ts data to determine
delete_differ/update_differ conflicts.

Fortunately, we did not see any noticeable overhead from the new
commit_ts fetch and comparison logic. The only notable impact is
potential overhead from logging conflicts if they occur frequently.
Therefore, enabling conflict detection by default seems feasible, and
introducing a new detect_conflict option may not be necessary.

Please refer to the following for detailed test results:

For all the tests, created two server nodes, one publisher and one as
subscriber. Both the nodes are configured with below settings -
  wal_level = logical
  shared_buffers = 40GB
  max_worker_processes = 32
  max_parallel_maintenance_workers = 24
  max_parallel_workers = 32
  synchronous_commit = off
  checkpoint_timeout = 1d
  max_wal_size = 24GB
  min_wal_size = 15GB
  autovacuum = off


Test 1: create conflicts on Sub using pgbench.

Setup:
 - Both publisher and subscriber have pgbench tables created as-
  pgbench -p $node1_port postgres -qis 1
 - At Sub, a subscription created for all the changes from Pub node.

Test Run:
 - To test, ran pgbench for 15 minutes on both nodes simultaneously,
which led to concurrent updates and update_differ conflicts on the
Subscriber node.
 Command used to run pgbench on both nodes-
./pgbench postgres -p 8833 -c 10 -j 3 -T 300 -P 20

Results:
For each case, note the “tps” and total time taken by the apply-worker
on Sub to apply the changes coming from Pub.

Case1: track_commit_timestamp = off, detect_conflict = off
Pub-tps = 9139.556405
Sub-tps = 8456.787967
Time of replicating all the changes: 19min 28s
Case 2 : track_commit_timestamp = on, detect_conflict = on
Pub-tps = 8833.016548
Sub-tps = 8389.763739
Time of replicating all the changes: 20min 20s
Case3: track_commit_timestamp = on, detect_conflict = off
Pub-tps = 8886.101726
Sub-tps = 8374.508017
Time of replicating all the changes: 19min 35s
Case 4: track_commit_timestamp = off, detect_conflict = on
Pub-tps = 8981.924596
Sub-tps = 8411.120808
Time of replicating all the changes: 19min 27s

**The difference in TPS between the cases is small, though I can see a
slight increase in the replication time (about 5%) when enabling both
track_commit_timestamp and detect_conflict.

Test2: create conflict using a manual script

 - To measure the precise time taken by the apply-worker in all cases,
create a test with a table having 10 million rows.
 - To record the total time taken by the apply-worker, dump the
current time in the logfile for apply_handle_begin() and
apply_handle_commit().

Setup:
Pub : has a table ‘perf’ with 10 million rows.
Sub : has the same table ‘perf’ with its own 10 million rows (inserted
by 1000 different transactions). This table is subscribed for all
changes from Pub.

Test Run:
At Pub: run UPDATE on the table ‘perf’ to update all its rows in a
single transaction. (this will lead to update_differ conflict for all
rows on Sub when enabled).
At Sub: record the time(from log file) taken by the apply-worker to
apply all updates coming from Pub.

Results:
Below table shows the total time taken by the apply-worker
(apply_handle_commit Time - apply_handle_begin Time ).
(Two test runs for each of the four cases)

Case1: track_commit_timestamp = off, detect_conflict = off
Run1 - 2min 42sec 579ms
Run2 - 2min 41sec 75ms
Case 2 : track_commit_timestamp = on, detect_conflict = on
Run1 - 6min 11sec 602ms
Run2 - 6min 25sec 179ms
Case3: track_commit_timestamp = on, detect_conflict = off
Run1 - 2min 34sec 223ms
Run2 - 2min 33sec 482ms
Case 4: track_commit_timestamp = off, detect_conflict = on
Run1 - 2min 35sec 276ms
Run2 - 2min 38sec 745ms

** In Case 2, when both track_commit_timestamp and detect_conflict
are enabled, the time taken by the apply-worker is ~140% higher.

Test3: Case when no conflict is detected.

This test measures the time taken by the apply-worker when no conflict
is detected. It is meant to confirm whether the time overhead in
Test1-Case2 is due to the new function GetTupleCommitTs(), which
fetches the origin and timestamp information for each row in the table
before applying the update.

Setup:
 - The Publisher and Subscriber both have an empty table to start with.
 - At Sub, the table is subscribed for all changes from Pub.
 - At Pub: Insert 10 million rows and the same will be replicated to
the Sub table as well.

Test Run:
At Pub: run an UPDATE on the table to update all rows in a single
transaction. (This will NOT hit the update_differ 

Re: Conflict detection and logging in logical replication

2024-07-26 Thread Nisha Moond
On Thu, Jul 25, 2024 at 12:04 PM Zhijie Hou (Fujitsu)
 wrote:
> Here is the V6 patch set which addressed Shveta and Nisha's comments
> in [1][2][3][4].

Thanks for the patch.
I tested the v6-0001 patch with partition table scenarios. Please
review the following scenario where Pub updates a tuple, causing it to
move from one partition to another on Sub.

Setup:
Pub:
 create table tab (a int not null, b int not null);
 alter table tab add constraint tab_pk primary key (a,b);
Sub:
 create table tab (a int not null, b int not null) partition by range (b);
 alter table tab add constraint tab_pk primary key (a,b);
 create table tab_1 partition of tab FOR values from (MINVALUE) TO (100);
 create table tab_2 partition of tab FOR values from (101) TO (MAXVALUE);

Test:
 Pub: insert into tab values (1,1);
 Sub: update tab set a=1 where a=1;  --> just to make it Sub's origin
 Sub: insert into tab values (1,101);
 Pub: update tab set b=101 where b=1;  --> Both 'update_differ' and
'insert_exists' are detected.

For non-partitioned tables, a similar update results in
'update_differ' and 'update_exists' conflicts: after detecting
'update_differ', the apply worker proceeds to apply the remote update,
and if a tuple with the updated key already exists, it raises
'update_exists'.
The same behavior is expected for partitioned tables too.

Thanks,
Nisha




Re: Conflict detection and logging in logical replication

2024-07-23 Thread Nisha Moond
On Thu, Jul 18, 2024 at 7:52 AM Zhijie Hou (Fujitsu)
 wrote:
>
> Attach the V5 patch set which changed the following:
>

Tested the v5-0001 patch; it fails to detect the update_exists
conflict for a setup where Pub has a non-partitioned table and Sub has
the same table partitioned.
Below is a test case showcasing the issue:

Setup:
Pub:
create table tab (a int not null, b int not null);
alter table tab add constraint tab_pk primary key (a,b);

Sub:
create table tab (a int not null, b int not null) partition by range (b);
alter table tab add constraint tab_pk primary key (a,b);
CREATE TABLE tab_1 PARTITION OF tab FOR VALUES FROM (MINVALUE) TO (100);
CREATE TABLE tab_2 PARTITION OF tab FOR VALUES FROM (101) TO (MAXVALUE);

Test:
Pub: insert into tab values (1,1);
Sub: insert into tab values (2,1);
Pub: update tab set a=2 where a=1;  --> After this update on Pub,
'update_exists' is expected on Sub, but the conflict is not detected
and the key violation error is logged -

ERROR:  duplicate key value violates unique constraint "tab_1_pkey"
DETAIL:  Key (a, b)=(2, 1) already exists.

Thanks,
Nisha




Re: Improve the connection failure error messages

2024-07-11 Thread Nisha Moond
On Tue, Jul 9, 2024 at 1:00 AM Tom Lane  wrote:
>
> Nisha Moond  writes:
> > Attached v5 patch with the translator comments as suggested.
>
> I looked at this, and I agree with the goal, but I find just about all
> of the translator comments unnecessary.  The ones that are useful are
> useful only because the message is violating one of our message style
> guidelines [1]:
>
> When citing the name of an object, state what kind of object it is.
>
> Rationale: Otherwise no one will know what “foo.bar.baz” refers to.
>
> So, for example, where you have
>
> +
> +/*
> + * translator: first %s is the subscription name, second %s is 
> the
> + * error
> + */
> + errmsg("subscription \"%s\" could not connect to the 
> publisher: %s", stmt->subname, err)));
>
> I don't find that that translator comment is adding anything.
> But there are a couple of places like
>
> +/*
> + * translator: first %s is the slotsync worker name, second %s is the
> + * error
> + */
> +errmsg("\"%s\" could not connect to the primary server: %s", 
> app_name.data, err));
>
> I think that the right cure for the ambiguity here is not to add a
> translator comment, but to label the name properly, perhaps like
>
> errmsg("synchronization worker \"%s\" could not connect to 
> the primary server: %s",
>app_name.data, err));
>
>
> regards, tom lane
>
> [1] 
> https://www.postgresql.org/docs/devel/error-style-guide.html#ERROR-STYLE-GUIDE-OBJECT-TYPE

Thank you for the review.

Attached the patch v6 with suggested improvements.
- Removed unnecessary translator comments.
- Added appropriate identifier names where missing.

--
Thanks,
Nisha


v6-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Re: Conflict Detection and Resolution

2024-07-05 Thread Nisha Moond
On Mon, Jul 1, 2024 at 1:17 PM Ajin Cherian  wrote:
>
>
>
> On Thu, Jun 27, 2024 at 1:14 PM Nisha Moond  wrote:
>>
>> Please find the attached  'patch0003', which implements conflict
>> resolutions according to the global resolver settings.
>>
>> Summary of Conflict Resolutions Implemented in 'patch0003':
>>
>> INSERT Conflicts:
>> 
>> 1) Conflict Type: 'insert_exists'
>>
>> Supported Resolutions:
>> a) 'remote_apply': Convert the INSERT to an UPDATE and apply.
>> b) 'keep_local': Ignore the incoming (conflicting) INSERT and retain
>> the local tuple.
>> c) 'error': The apply worker will error out and restart.
>>
>
> Hi Nisha,
>
> While testing the patch, when conflict resolution is configured and 
> insert_exists is set to "remote_apply", I see this warning in the logs due to 
> a resource not being closed:
>
> 2024-07-01 02:52:59.427 EDT [20304] LOG:  conflict insert_exists detected on 
> relation "public.test1"
> 2024-07-01 02:52:59.427 EDT [20304] DETAIL:  Key already exists. Applying 
> resolution method "remote_apply"
> 2024-07-01 02:52:59.427 EDT [20304] CONTEXT:  processing remote data for 
> replication origin "pg_16417" during message type "INSERT" for replication 
> target relation "public.test1" in transaction 763, finished at 0/15E7F68
> 2024-07-01 02:52:59.427 EDT [20304] WARNING:  resource was not closed: [138] 
> (rel=base/5/16413, blockNum=0, flags=0x9380, refcount=1 1)
> 2024-07-01 02:52:59.427 EDT [20304] CONTEXT:  processing remote data for 
> replication origin "pg_16417" during message type "COMMIT" in transaction 
> 763, finished at 0/15E7F68
> 2024-07-01 02:52:59.427 EDT [20304] WARNING:  resource was not closed: 
> TupleDesc 0x7f8c0439e448 (16402,-1)
> 2024-07-01 02:52:59.427 EDT [20304] CONTEXT:  processing remote data for 
> replication origin "pg_16417" during message type "COMMIT" in transaction 
> 763, finished at 0/15E7F68
>
Thank you, Ajin, for reporting the issue. This is now fixed in the
v4-0003 patch.

--
Thanks,
Nisha




Re: Conflict detection and logging in logical replication

2024-06-24 Thread Nisha Moond
On Mon, Jun 24, 2024 at 7:39 AM Zhijie Hou (Fujitsu)
 wrote:
>
> When testing the patch, I noticed a bug that when reporting the conflict
> after calling ExecInsertIndexTuples(), we might find the tuple that we
> just inserted and report it.(we should only report conflict if there are
> other conflict tuples which are not inserted by us) Here is a new patch
> which fixed this and fixed a compile warning reported by CFbot.
>
Thank you for the patch!
A review comment: The patch does not detect 'update_differ' conflicts
when the Publisher has a non-partitioned table and the Subscriber has
a partitioned version.

Here’s a simple failing test case:
Pub: create table tab (a int primary key, b int not null, c varchar(5));

Sub: create table tab (a int not null, b int not null, c varchar(5))
partition by range (b);
alter table tab add constraint tab_pk primary key (a, b);
create table tab_1 partition of tab for values from (minvalue) to (100);
create table tab_2 partition of tab for values from (100) to (maxvalue);

With the above setup, if the Subscriber table has a tuple with its
own origin, the apply of the incoming remote update from the Publisher
fails to detect the 'update_differ' conflict.
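An illustrative sequence that hits this case (the values are made up;
the point is only that the row's latest change on Sub is a local one):

Pub: insert into tab values (1, 1, 'pub');
Sub: update tab set c = 'sub' where a = 1;  --> gives the row Sub's own origin
Pub: update tab set c = 'new' where a = 1;  --> expected to be reported as
'update_differ' on Sub, but it is not detected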

--
Thanks,
Nisha




Re: Conflict Detection and Resolution

2024-06-06 Thread Nisha Moond
On Wed, Jun 5, 2024 at 7:29 PM Dilip Kumar  wrote:
>
> On Tue, Jun 4, 2024 at 9:37 AM Amit Kapila  wrote:
> >
> > Can you share the use case of "earliest_timestamp_wins" resolution
> > method? It seems after the initial update on the local node, it will
> > never allow remote update to succeed which sounds a bit odd. Jan has
> > shared this and similar concerns about this resolution method, so I
> > have added him to the email as well.
> >
> I can not think of a use case exactly in this context but it's very
> common to have such a use case while designing a distributed
> application with multiple clients.  For example, when we are doing git
> push concurrently from multiple clients it is expected that the
> earliest commit wins.
>

Here are more use cases for the "earliest_timestamp_wins" resolution method:
1) Applications where recording the first occurrence of an event is
important. For example, in sensor-based applications like earthquake
detection systems, capturing the first seismic wave's time is crucial.
2) Scheduling systems, like appointment booking, prioritize the
earliest request when handling concurrent ones.
3) Contexts where maintaining chronological order is important -
  a) Social media platforms display comments ensuring that the
earliest ones are visible first.
  b) Financial transaction processing systems rely on timestamps to
prioritize the processing of transactions, ensuring that the earliest
transaction is handled first.

--
Thanks,
Nisha




Re: Improve the connection failure error messages

2024-05-31 Thread Nisha Moond
On Fri, Apr 26, 2024 at 1:10 PM Daniel Gustafsson  wrote:
>
> > On 22 Mar 2024, at 11:42, Nisha Moond  wrote:
>
> > Here is the v4 patch with changes required in slotfuncs.c and slotsync.c 
> > files.
>
> -   errmsg("could not connect to the primary server: %s", err));
> +   errmsg("\"%s\" could not connect to the primary server: %s", 
> app_name.data, err));
>
> Messages like this should perhaps have translator comments to indicate what 
> the
> leading "%s" will contain?

Attached v5 patch with the translator comments as suggested.

--
Thanks,
Nisha


v5-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Re: Conflict Detection and Resolution

2024-05-28 Thread Nisha Moond
On Mon, May 27, 2024 at 11:19 AM shveta malik  wrote:
>
> On Sat, May 25, 2024 at 2:39 AM Tomas Vondra
>  wrote:
> >
> > On 5/23/24 08:36, shveta malik wrote:
> > > Hello hackers,
> > >
> > > Please find the proposal for Conflict Detection and Resolution (CDR)
> > > for Logical replication.
> > >  > > below details.>
> > >
> > > Introduction
> > > 
> > > In case the node is subscribed to multiple providers, or when local
> > > writes happen on a subscriber, conflicts can arise for the incoming
> > > changes.  CDR is the mechanism to automatically detect and resolve
> > > these conflicts depending on the application and configurations.
> > > CDR is not applicable for the initial table sync. If locally, there
> > > exists conflicting data on the table, the table sync worker will fail.
> > > Please find the details on CDR in apply worker for INSERT, UPDATE and
> > > DELETE operations:
> > >
> >
> > Which architecture are you aiming for? Here you talk about multiple
> > providers, but the wiki page mentions active-active. I'm not sure how
> > much this matters, but it might.
>
> Currently, we are working for multi providers case but ideally it
> should work for active-active also. During further discussion and
> implementation phase, if we find that, there are cases which will not
> work in straight-forward way for active-active, then our primary focus
> will remain to first implement it for multiple providers architecture.
>
> >
> > Also, what kind of consistency you expect from this? Because none of
> > these simple conflict resolution methods can give you the regular
> > consistency models we're used to, AFAICS.
>
> Can you please explain a little bit more on this.
>
> >
> > > INSERT
> > > 
> > > To resolve INSERT conflict on subscriber, it is important to find out
> > > the conflicting row (if any) before we attempt an insertion. The
> > > indexes or search preference for the same will be:
> > > First check for replica identity (RI) index.
> > >   - if not found, check for the primary key (PK) index.
> > > - if not found, then check for unique indexes (individual ones or
> > > added by unique constraints)
> > >  - if unique index also not found, skip CDR
> > >
> > > Note: if no RI index, PK, or unique index is found but
> > > REPLICA_IDENTITY_FULL is defined, CDR will still be skipped.
> > > The reason being that even though a row can be identified with
> > > REPLICAT_IDENTITY_FULL, such tables are allowed to have duplicate
> > > rows. Hence, we should not go for conflict detection in such a case.
> > >
> >
> > It's not clear to me why would REPLICA_IDENTITY_FULL mean the table is
> > allowed to have duplicate values? It just means the upstream is sending
> > the whole original row, there can still be a PK/UNIQUE index on both the
> > publisher and subscriber.
>
> Yes, right. Sorry for confusion. I meant the same i.e. in absence of
> 'RI index, PK, or unique index', tables can have duplicates. So even
> in presence of Replica-identity (FULL in this case) but in absence of
> unique/primary index, CDR will be skipped for INSERT.
>
> >
> > > In case of replica identity ‘nothing’ and in absence of any suitable
> > > index (as defined above), CDR will be skipped for INSERT.
> > >
> > > Conflict Type:
> > > 
> > > insert_exists: A conflict is detected when the table has the same
> > > value for a key column as the new value in the incoming row.
> > >
> > > Conflict Resolution
> > > 
> > > a) latest_timestamp_wins:The change with later commit timestamp wins.
> > > b) earliest_timestamp_wins:   The change with earlier commit timestamp 
> > > wins.
> > > c) apply:   Always apply the remote change.
> > > d) skip:Remote change is skipped.
> > > e) error:   Error out on conflict. Replication is stopped, manual
> > > action is needed.
> > >
> >
> > Why not to have some support for user-defined conflict resolution
> > methods, allowing to do more complex stuff (e.g. merging the rows in
> > some way, perhaps even with datatype-specific behavior)?
>
> Initially, for the sake of simplicity, we are targeting to support
> built-in resolvers. But we have a plan to work on user-defined
> resolvers as well. We shall propose that separately.
>
> >
> > > The change will be converted to 'UPDATE' and applied if the decision
> > > is in favor of applying remote change.
> > >
> > > It is important to have commit timestamp info available on subscriber
> > > when latest_timestamp_wins or earliest_timestamp_wins method is chosen
> > > as resolution method.  Thus ‘track_commit_timestamp’ must be enabled
> > > on subscriber, in absence of which, configuring the said
> > > timestamp-based resolution methods will result in error.
> > >
> > > Note: If the user has chosen the latest or earliest_timestamp_wins,
> > > and the remote and local timestamps are the same, then it will go by
> > > system identifier. The change with a higher system identifier will
> > 

Re: Synchronizing slots from primary to standby

2024-04-01 Thread Nisha Moond
I did a performance test of the optimization patch
(v2-0001-optimize-the-slot-advancement.patch). Please find the
results below:

Setup:
- One primary node with 100 failover-enabled logical slots
- 20 DBs, each having 5 failover-enabled logical replication slots
- One physical standby node with 'sync_replication_slots' off but the
other parameters required by slot sync enabled.

Node Configurations: please see config.txt

Test Plan:
1) Create 20 Databases on Primary node, each with 5 failover slots
using "pg_create_logical_replication_slot()". Overall 100 failover
slots.
2) Use pg_sync_replication_slot() to sync them to the standby. Note
the execution time of the sync and the LSN values.
3) On Primary node, run pgbench for 15 mins on postgres db
4) Advance lsns of all the 100 slots on primary using
pg_replication_slot_advance().
5) Use pg_sync_replication_slot() to sync the slots to the standby.
Note the execution time of the sync and the LSN values.
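For reference, a sketch of the SQL behind steps 1, 4, and 2/5 (the
slot names and the output plugin are illustrative; the looping over
databases and slots is done by the attached scripts):

-- Step 1, run in each of the 20 databases, repeated for 5 slot names per DB
-- (the last argument enables failover):
SELECT pg_create_logical_replication_slot('db1_slot_1', 'test_decoding', false, false, true);

-- Step 4, on the primary, per database for its own logical slots:
SELECT pg_replication_slot_advance('db1_slot_1', pg_current_wal_lsn());

-- Steps 2 and 5, on the standby (the sync function exercised by the patch under test):
SELECT pg_sync_replication_slot();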

Executed the above test plan for three cases and did a time-elapsed
comparison for pg_replication_slot_advance() -

(1) HEAD
Time taken by pg_sync_replication_slot() on Standby node -
  a) The initial sync (step 2) = 140.208 ms
  b) Sync after pgbench run on primary (step 5) = 66.994 ms

(2) HEAD + v3-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch
  a) The initial sync (step 2) = 163.885 ms
  b) Sync after pgbench run on primary (step 5) = 837901.290 ms (13:57.901)

  >> With the v3 patch, pg_sync_replication_slot() takes a significant
amount of time to sync the slots.

(3) HEAD + v3-0001-advance-the-restart_lsn-of-synced-slots-using-log.patch
+ v2-0001-optimize-the-slot-advancement.patch
  a) The initial sync (step 2) = 165.554 ms
  b) Sync after pgbench run on primary (step 5) = 7991.718 ms (00:07.992)

  >> With the optimization patch, the time taken by
pg_sync_replication_slot() is reduced significantly to ~7 seconds.

We did the same test with a single DB too by creating all 100 failover
slots in postgres DB and the results were almost similar.

Attached are the scripts used for the test -
"v3_perf_test_scripts.tar.gz" includes the files -
setup_multidb.sh : setup primary and standby nodes
createdb20.sql : create 20 DBs
createslot20.sql : create total 100 logical slots, 5 on each DB
run_sync.sql : call pg_replication_slot_advance() with timing
advance20.sql : advance lsn of all slots on Primary node to current lsn
advance20_perdb.sql : use on HEAD to advance lsn on Primary node
get_synced_data.sql : get details of the synced slots
config.txt : configuration used for nodes


v3_perf_test_scripts.tar.gz
Description: GNU Zip compressed data


Re: Improve the connection failure error messages

2024-03-22 Thread Nisha Moond
On Wed, Mar 13, 2024 at 11:16 AM Peter Smith  wrote:
>
> FYI -- some more code has been pushed since this patch was last
> updated. AFAICT perhaps you'll want to update this patch again for the
> following new connection messages on HEAD:
>
> - slotfuncs.c [1]
> - slotsync.c [2]
>
> --
> [1] 
> https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/slotfuncs.c#L989
> [2] 
> https://github.com/postgres/postgres/blob/0b84f5c419a300dc1b1a70cf63b9907208e52643/src/backend/replication/logical/slotsync.c#L1258
>
Thanks for the update.
Here is the v4 patch with changes required in slotfuncs.c and slotsync.c files.

--
Thanks,
Nisha


v4-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Re: Synchronizing slots from primary to standby

2024-03-04 Thread Nisha Moond
I did performance tests for the v99 patch w.r.t. wait-time analysis.
Since this patch introduces a wait for the standby before sending
changes to a subscriber, I logged the time at the start and end of the
XLogSendLogical() call (which eventually calls WalSndWaitForWal()) at
the primary node and calculated the total time taken by this function
during the load run.

For load, ran pgbench for 15 minutes:
Creating tables: pgbench -p 5833 postgres -qis 2
Running benchmark: pgbench postgres -p 5833 -c 10 -j 3 -T 900 -P 20

Machine details:
11th Gen Intel(R) Core(TM) i9-11950H @ 2.60GHz 32GB RAM
OS - Windows 10 Enterprise

Test setup:
Primary node -->
  -> One physical standby node
  -> One subscriber node having only one subscription with failover=true

-- the slot-sync relevant parameters are set to default (OFF) for all
the tests i.e.
hot_standby_feedback = off
sync_replication_slots = false

-- additional configuration on each instance is:
shared_buffers = 6GB
max_worker_processes = 32
max_parallel_maintenance_workers = 24
max_parallel_workers = 32
synchronous_commit = off
checkpoint_timeout = 1d
max_wal_size = 24GB
min_wal_size = 15GB
autovacuum = off

To review the wait-time impact with and without the patch, I compared
three cases (two runs for each case) -
(1) HEAD code:
time taken in run 1 = 103.935631 seconds
time taken in run 2 = 104.832186 seconds

(2) HEAD code + v99 patch ('standby_slot_names' is not set):
time taken in run 1 = 104.076343 seconds
time taken in run 2 = 103.116226 seconds

(3) HEAD code + v99 patch + a valid 'standby_slot_names' is set:
time taken in run 1 = 103.871012 seconds
time taken in run 2 = 103.793524 seconds

The time consumption of XLogSendLogical() is almost the same in all
cases, and no performance degradation is observed.

--
Thanks,
Nisha




Re: Synchronizing slots from primary to standby

2024-02-07 Thread Nisha Moond
We conducted stress testing of the patch with a setup of one primary
node with 100 tables and five subscribers, each having 20
subscriptions. We then created three physical standbys syncing the
logical replication slots from the primary node.
All 100 slots were successfully synced on all three standbys. We then
ran the load and monitored LSN convergence using the prescribed SQL
checks.
Once the standbys were failover-ready, we were able to successfully
promote one of them, and all the subscribers seamlessly migrated to
the new primary node.
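For context, the LSN convergence on the standby can be monitored with
a query along these lines (illustrative only; the actual checks used
were the ones prescribed in the patch's documentation):

SELECT slot_name, synced, confirmed_flush_lsn
  FROM pg_replication_slots
 WHERE NOT temporary;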

We replicated the tests with 200 tables, creating 200 logical
replication slots. With the increased load, all the tests were
completed successfully.

Minor errors (not due to the patch) observed during the tests -

1) When the load was run, the logical replication apply workers on the
subscribers started failing due to timeouts. This is not related to
the patch, as it happened because of the small "wal_receiver_timeout"
setting relative to the load. To confirm, we ran the same load without
the patch too, and the same failure happened.
2) There was a buffer overflow exception on the primary node in the
'200 replication slots' case. It was not related to the patch, as it
was due to a low memory configuration.

All the tests were done on both Windows and Linux environments.
Thank you, Ajin, for the stress test and analysis on Linux.




Re: Improve the connection failure error messages

2024-01-31 Thread Nisha Moond
> AFAIK some recent commits patches (e,g [1]  for the "slot sync"
> development) have created some more cases of "could not connect..."
> messages. So, you might need to enhance your patch to deal with any
> new ones in the latest HEAD.
>
> ==
> [1]
> https://github.com/postgres/postgres/commit/776621a5e4796fa214b6b29a7ca134f6c138572a
>
Thank you for the update.
The v3 patch has the changes needed as per the latest HEAD.

--
Thanks,
Nisha


v3-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Re: Improve the connection failure error messages

2024-01-18 Thread Nisha Moond
On Fri, Jan 12, 2024 at 7:06 PM Aleksander Alekseev
 wrote:
>
> Hi,
>
> Thanks for the patch.
>
> > Due to this behavior, it is not possible to add a test to show the
> > error message as it is done for CREATE SUBSCRIPTION.
> > Let me know if you think there is another way to add this test.
>
> I believe it can be done with TAP tests, see for instance:
>
> contrib/auto_explain/t/001_auto_explain.pl
>
> However I wouldn't insist on including the test in scope of this
> particular patch. Such a test doesn't currently exist, it can be added
> as a separate patch, and whether this is actually a useful test (all
> the tests consume time after all...) is somewhat debatable. Personally
> I agree that it would be nice to have though.
Thank you for providing this information. Yes, I can write a TAP test
to check the log for the same error message.
I'll attempt it and perform a time analysis. I'm unsure where to
appropriately add this test. Any suggestions?

Following your suggestion, I won't include the test in the scope of
this patch. Instead, I'll start a new thread once I have gathered the
required information.

--
Thanks,
Nisha




Re: Improve the connection failure error messages

2024-01-17 Thread Nisha Moond
>
> ~~
>
> BTW, while experimenting with the bad connection ALTER I also tried
> setting 'disable_on_error' like below:
>
> ALTER SUBSCRIPTION sub4 SET (disable_on_error);
> ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1';
>
> ...but here the subscription did not become DISABLED as I expected it
> would do on the next connection error iteration. It remains enabled
> and just continues to loop relaunch/ERROR indefinitely same as before.
>
> That looks like it may be a bug. Thoughts?
>
Ideally, if the already running apply worker in
"LogicalRepApplyLoop()" hits any exception/error, it will be handled
and the subscription will be disabled if 'disable_on_error' is set -

start_apply(XLogRecPtr origin_startpos)
{
PG_TRY();
{
LogicalRepApplyLoop(origin_startpos);
}
PG_CATCH();
{
if (MySubscription->disableonerr)
DisableSubscriptionAndExit();
...

What is happening in this case is that the control reaches the function -
run_apply_worker() -> start_apply() -> LogicalRepApplyLoop ->
maybe_reread_subscription()
...
/*
* Exit if any parameter that affects the remote connection was changed.
* The launcher will start a new worker but note that the parallel apply
* worker won't restart if the streaming option's value is changed from
* 'parallel' to any other value or the server decides not to stream the
* in-progress transaction.
*/
if (strcmp(newsub->conninfo, MySubscription->conninfo) != 0 ||
...

and it sees a change in the parameter and calls apply_worker_exit().
This exits the current process without throwing an exception to
the caller, and the postmaster will try to restart the apply worker.
The new apply worker, before reaching start_apply() [where we
handle the exception], will hit the code that establishes the
connection to the publisher -

ApplyWorkerMain() -> run_apply_worker() -
...
LogRepWorkerWalRcvConn = walrcv_connect(MySubscription->conninfo,
true /* replication */ ,
true,
must_use_password,
MySubscription->name, &err);

if (LogRepWorkerWalRcvConn == NULL)
  ereport(ERROR,
  (errcode(ERRCODE_CONNECTION_FAILURE),
   errmsg("could not connect to the publisher: %s", err)));
...
and due to the bad connection string in the subscription, it will error out.
[28680] ERROR:  could not connect to the publisher: invalid port number: "-1"
[3196] LOG:  background worker "logical replication apply worker" (PID
28680) exited with exit code 1

Now, the postmaster keeps trying to restart the apply worker and it
will keep failing until the connection string is corrected or the
subscription is disabled manually.

I think this is a bug that needs to be handled in run_apply_worker()
when disable_on_error is set.
IMO, this bug-fix discussion deserves a separate thread. Thoughts?




Re: Synchronizing slots from primary to standby

2024-01-16 Thread Nisha Moond
A review on v62-006: failover-ready validation steps doc -

+   Next, check that the logical replication slots identified above exist on
+   the standby server. This step can be skipped if
+   standby_slot_names has been correctly configured.
+
+test_standby=# SELECT bool_and(synced AND NOT temporary AND
conflict_reason IS NULL) AS failover_ready
+   FROM pg_replication_slots
+   WHERE slot_name in ('sub');
+ failover_ready
+
+ t
+(1 row)

This query does not ensure that all the logical replication slots
exist on the standby. Due to the 'IN ('slots')' check, it will return
'true' even if only one or a few of the listed slots exist.
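One way to make the check robust is to compare against the expected
number of slots instead of relying on bool_and() alone, e.g. (same
columns as the query above; the slot list and the expected count are
illustrative):

SELECT count(*) = 1 AS failover_ready   -- 1 = number of slots listed in the IN clause
  FROM pg_replication_slots
  WHERE slot_name IN ('sub')
    AND synced AND NOT temporary AND conflict_reason IS NULL;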




Re: Improve the connection failure error messages

2024-01-16 Thread Nisha Moond
Thanks for reviewing, please find my response inline.

On Wed, Jan 17, 2024 at 4:56 AM Peter Smith  wrote:
>
> On Sat, Jan 13, 2024 at 12:36 AM Aleksander Alekseev
>  wrote:
> >
> > Hi,
> >
> > Thanks for the patch.
> >
> > > Due to this behavior, it is not possible to add a test to show the
> > > error message as it is done for CREATE SUBSCRIPTION.
> > > Let me know if you think there is another way to add this test.
> >
> > I believe it can be done with TAP tests, see for instance:
> >
> > contrib/auto_explain/t/001_auto_explain.pl
> >
> > However I wouldn't insist on including the test in scope of this
> > particular patch. Such a test doesn't currently exist, it can be added
> > as a separate patch, and whether this is actually a useful test (all
> > the tests consume time after all...) is somewhat debatable. Personally
> > I agree that it would be nice to have though.
> >
> > This being said...
> >
> > > The ALTER SUBSCRIPTION command does not error out on the user
> > > interface if updated with a bad connection string and the connection
> > > failure error can only be seen in the respective log file.
> >
> > I wonder if we should fix this. Again, not necessarily in scope of
> > this work, but if this is not a difficult task, again, it would be
> > nice to have.
> >
> > Other than that v2 looks good.
> >
>
> OK.  I see now that any ALTER of the subscription's connection, even
> to some value that fails, will restart a new worker (like ALTER of any
> other subscription parameters). For a bad connection, it will continue
> to relaunch-worker/ERROR over and over.
>
> test_sub=# \r2024-01-17 09:34:28.665 AEDT [11274] LOG:  logical
> replication apply worker for subscription "sub4" has started
> 2024-01-17 09:34:28.666 AEDT [11274] ERROR:  could not connect to the
> publisher: invalid port number: "-1"
> 2024-01-17 09:34:28.667 AEDT [928] LOG:  background worker "logical
> replication apply worker" (PID 11274) exited with exit code 1
> dRs su2024-01-17 09:34:33.669 AEDT [11391] LOG:  logical replication
> apply worker for subscription "sub4" has started
> 2024-01-17 09:34:33.669 AEDT [11391] ERROR:  could not connect to the
> publisher: invalid port number: "-1"
> 2024-01-17 09:34:33.670 AEDT [928] LOG:  background worker "logical
> replication apply worker" (PID 11391) exited with exit code 1
> b4
> ...
>
> I don't really have any opinion if that behaviour is good or bad, but
> anyway, it is deliberate, and IMO it is outside the scope of your
> patch, so v2 patch LGTM.

Upon code review: ALTER SUBSCRIPTION updates the connection string
after checking for parse errors and a few other obvious problems, and
it does not attempt to establish a connection. It is the apply worker
running for the respective subscription that will try to connect and
fail in case of a bad connection string.
This seems to be an intentional design behavior, and I agree that
deciding whether to change or keep this behavior is out of this
patch's scope.
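For illustration, reusing the bad connection string from the logs
quoted above: the statement itself succeeds, and the failure only
surfaces later in the apply worker's log -

ALTER SUBSCRIPTION sub4 CONNECTION 'port = -1';
-- completes with ALTER SUBSCRIPTION; the relaunched apply worker then logs
-- ERROR:  could not connect to the publisher: invalid port number: "-1"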

--
Thanks,
Nisha




Re: Improve the connection failure error messages

2024-01-12 Thread Nisha Moond
Thanks for the review. Attached v2 patch with suggested changes.
Please find my response inline.

On Fri, Jan 12, 2024 at 8:20 AM Peter Smith  wrote:
>
> Thanks for the patch! Here are a couple of review comments for it.
>
> ==
> src/backend/commands/subscriptioncmds.c
>
> 1.
> @@ -742,7 +742,7 @@ CreateSubscription(ParseState *pstate,
> CreateSubscriptionStmt *stmt,
>   if (!wrconn)
>   ereport(ERROR,
>   (errcode(ERRCODE_CONNECTION_FAILURE),
> - errmsg("could not connect to the publisher: %s", err)));
> + errmsg("\"%s\" could not connect to the publisher: %s", stmt->subname, 
> err)));
>
> In practice, these commands give errors like:
>
> test_sub=# create subscription sub1 connection 'dbname=bogus' publication 
> pub1;
> ERROR:  could not connect to the publisher: connection to server on
> socket "/tmp/.s.PGSQL.5432" failed: FATAL:  database "bogus" does not
> exist
>
> and logs like:
>
> 2024-01-12 12:45:05.177 AEDT [13108] ERROR:  could not connect to the
> publisher: connection to server on socket "/tmp/.s.PGSQL.5432" failed:
> FATAL:  database "bogus" does not exist
> 2024-01-12 12:45:05.177 AEDT [13108] STATEMENT:  create subscription
> sub1 connection 'dbname=bogus' publication pub1;
>
> Since the subscription name is already obvious from the statement that
> caused the error I'm not sure it benefits much to add this to the
> error message (but maybe it is useful if the error message was somehow
> read in isolation from the statement).
>
> Anyway, I felt at least it should include the word "subscription" for
> better consistency with the other messages in this patch:
>
> SUGGESTION
> subscription \"%s\" could not connect to the publisher: %s

Done.

> ==
>
> 2.
> + appname = cluster_name[0] ? cluster_name : "walreceiver";
> +
>   /* Establish the connection to the primary for XLOG streaming */
> - wrconn = walrcv_connect(conninfo, false, false,
> - cluster_name[0] ? cluster_name : "walreceiver",
> - &err);
> + wrconn = walrcv_connect(conninfo, false, false, appname, &err);
>   if (!wrconn)
>   ereport(ERROR,
>   (errcode(ERRCODE_CONNECTION_FAILURE),
> - errmsg("could not connect to the primary server: %s", err)));
> + errmsg("%s could not connect to the primary server: %s", appname, err)));
>
> I think your new %s should be quoted according to the guidelines at [1].

Done.

> ==
> src/test/regress/expected/subscription.out
>
> 3.
> Apparently, there is no existing regression test case for the ALTER
> "could not connect" message because if there was, it would have
> failed. Maybe a test should be added?
>
The ALTER SUBSCRIPTION command does not error out at the user
interface when updated with a bad connection string; the connection
failure error can only be seen in the respective log file.
Due to this behavior, it is not possible to add a test that shows the
error message the way it is done for CREATE SUBSCRIPTION.
Let me know if you think there is another way to add this test.

--
Thanks,
Nisha


v2-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Improve the connection failure error messages

2024-01-11 Thread Nisha Moond
Hi Hackers,

Various parts of the code use the walrcv_connect() function, employed
by processes such as the walreceiver, the logical replication apply
worker, etc., to establish connections with other hosts.
Presently, in case of a connection failure, the error message lacks
information about the specific process that attempted to connect and
encountered the failure.

The attached patch enhances the error messages for such connection
failures, offering more detail on the process that failed to
establish the connection.

--
Thanks,
Nisha


v1-0001-Improve-the-connection-failure-error-messages.patch
Description: Binary data


Re: Intermittent failure with t/003_logical_slots.pl test on windows

2023-12-26 Thread Nisha Moond
Thanks for working on it. I tested the patch on my system, and it
resolved the issue with commands run with -V (version check).

As you mentioned, I am also still seeing intermittent errors even with
the patch, as below -
 in 'pg_upgrade/002_pg_upgrade' -

# Running: pg_upgrade --no-sync -d
D:\Project\pg2\postgres\build/testrun/pg_upgrade/002_pg_upgrade\data/t_002_pg_upgrade_old_node_data/pgdata
-D 
D:\Project\pg2\postgres\build/testrun/pg_upgrade/002_pg_upgrade\data/t_002_pg_upgrade_new_node_data/pgdata
-b D:/Project/pg2/postgres/build/tmp_install/Project/pg2/postgresql/bin
-B D:/Project/pg2/postgres/build/tmp_install/Project/pg2/postgresql/bin
-s 127.0.0.1 -p 56095 -P 56096 --copy --check
Performing Consistency Checks
-
Checking cluster versions ok

The source cluster lacks cluster state information:
Failure, exiting
[12:37:38.666](3.317s) not ok 12 - run of pg_upgrade --check for new instance
[12:37:38.666](0.000s) #   Failed test 'run of pg_upgrade --check for
new instance'
#   at D:/Project/pg2/postgres/src/bin/pg_upgrade/t/002_pg_upgrade.pl line 375.

and in 'pg_upgrade/003_logical_slots'  -

[12:35:33.773](0.001s) not ok 6 - run of pg_upgrade of old cluster
with slots having unconsumed WAL records stdout /(?^:Your installation
contains logical replication slots that can't be upgraded.)/
[12:35:33.773](0.000s) #   Failed test 'run of pg_upgrade of old
cluster with slots having unconsumed WAL records stdout /(?^:Your
installation contains logical replication slots that can't be
upgraded.)/'
#   at D:/Project/pg2/postgres/src/bin/pg_upgrade/t/003_logical_slots.pl
line 102.
[12:35:33.773](0.000s) #   'Performing Consistency Checks
# -
# Checking cluster versions ok
#
# The target cluster lacks cluster state information:
# Failure, exiting

It seems the 'Performing Consistency Checks' step fails due to a lack
of some information, and it is possible that it can also be fixed
along the same lines.

--
Thanks,
Nisha




Re: Synchronizing slots from primary to standby

2023-12-15 Thread Nisha Moond
Review for v47 patch -

(1)
When we try to create a subscription using a synced slot on the
standby that is in the 'r' sync_state, the subscription will be
created at the subscriber, and on the standby two actions will take
place -
(i)  As copy_data is true by default, it will switch the failover
state of the synced slot to 'false'.
(ii) As we don't allow using synced slots, it will start giving
the expected error in the log file -
ERROR:  cannot use replication slot "logical_slot" for logical decoding
DETAIL:  This slot is being synced from the primary server.
HINT:  Specify another replication slot.

The first one seems to be an issue: it toggles failover to false, and
it then remains false. I think it should be fixed.
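An illustrative command for (1) (connection string and object names
are hypothetical; the key point is that copy_data is left at its
default of true):

CREATE SUBSCRIPTION sub
    CONNECTION 'dbname=postgres host=standby_host port=5434'
    PUBLICATION pub
    WITH (slot_name = 'logical_slot', create_slot = false, failover = true);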

(2)
With the patch, the 'CREATE SUBSCRIPTION' command with a 'slot_name'
of an 'active' logical slot fails and errors out -
ERROR:  could not alter replication slot "logical_slot" on
publisher: ERROR:  replication slot "logical_slot1" is active for PID


Without the patch, the create subscription with an 'active' slot_name
succeeds and the log file shows the error "could not start WAL
streaming: ERROR:  replication slot "logical_slot" is active for PID
".

Given that the specified active slot_name has failover set to false
and the create subscription command also specifies failover=false, the
behavior of the "with-patch" case is expected to be the same as that
of the "without-patch" scenario.




Re: Synchronizing slots from primary to standby

2023-12-12 Thread Nisha Moond
A review of the v45 patch:

If one creates a logical slot with failover=true as -
select pg_create_logical_replication_slot('logical_slot','pgoutput',
false, true, true);

Then, uses the existing logical slot while creating a subscription -
postgres=#  create subscription sub4 connection 'dbname=postgres
host=localhost port=5433' publication pub1t4 WITH
(slot_name=logical_slot, create_slot=false, failover=true);
NOTICE:  changed the failover state of replication slot "logical_slot"
on publisher to false
CREATE SUBSCRIPTION

Despite configuring logical_slot's failover to true and specifying
failover=true during subscription creation, the NOTICE indicates a
change of the failover state to 'false' without providing any
explanation for this transition.
This can be confusing for users, so IMO the notice should include the
reason for switching failover to 'false', or should give a hint to use
either refresh=false or copy_data=false to enable failover=true for
the slot, as we do in other similar 'ALTER SUBSCRIPTION ...' scenarios.

--
Thanks & Regards,
Nisha




Re: Synchronizing slots from primary to standby

2023-12-01 Thread Nisha Moond
On Fri, Dec 1, 2023 at 5:40 PM Nisha Moond  wrote:
>
> Review for v41 patch.
>
> 1.
> ==
> src/backend/utils/misc/postgresql.conf.sample
>
> +#enable_syncslot = on # enables slot synchronization on the physical
> standby from the primary
>
> enable_syncslot is disabled by default, so, it should be 'off' here.
>
> ~~~
> 2.
> IIUC, the slotsyncworker's connection to the primary is to execute a
> query. Its aim is not walsender type connection, but at primary when
> queried, the 'backend_type' is set to 'walsender'.
> Snippet from primary db-
>
> datname  |   usename   | application_name | wait_event_type | backend_type
> -+-+--+-+--
> postgres | replication | slotsyncworker   | Client  | walsender
>
> Is it okay?
>
> ~~~
> 3.
> As per current logic, If there are slots on primary with disabled
> subscriptions, then, when standby is created it replicates these slots
> but can't make them sync-ready until any activity happens on the
> slots.
> So, such slots stay in 'i' sync-state and get dropped when failover
> happens. Now, if the subscriber tries to enable their existing
> subscription after failover, it gives an error that the slot does not
> exist.

Is this behavior expected? If yes, is it worth documenting that slots
of disabled subscriptions are not synced?

--
Thanks,
Nisha




Re: Synchronizing slots from primary to standby

2023-12-01 Thread Nisha Moond
Review for v41 patch.

1.
==
src/backend/utils/misc/postgresql.conf.sample

+#enable_syncslot = on # enables slot synchronization on the physical
standby from the primary

enable_syncslot is disabled by default, so it should be 'off' here.

~~~
2.
IIUC, the slotsync worker's connection to the primary is only meant to
execute queries; it is not a walsender-type connection. Yet on the
primary, when queried, the 'backend_type' shows up as 'walsender'.
Snippet from primary db-

datname  |   usename   | application_name | wait_event_type | backend_type
-+-+--+-+--
postgres | replication | slotsyncworker   | Client  | walsender

Is it okay?

~~~
3.
As per the current logic, if there are slots on the primary for
disabled subscriptions, then when a standby is created it replicates
these slots but can't make them sync-ready until some activity happens
on the slots.
So such slots stay in the 'i' sync state and get dropped when failover
happens. If the subscriber then tries to enable its existing
subscription after failover, it gets an error that the slot does not
exist.

~~~
4. primary_slot_name GUC value test:

When the standby is started with a non-existent primary_slot_name, the
walreceiver gives an error but the slot-sync worker does not raise
any error/warning. It is a no-op though, as it has the check 'if
(XLogRecPtrIsInvalid(WalRcv->latestWalEnd)) do nothing'. Is this
okay, or should the slot-sync worker also raise an error and exit?

In another case, when the standby is started with a valid
primary_slot_name but it is changed to some invalid value at runtime,
the walreceiver starts giving errors but the slot-sync worker keeps
running. In this case, unlike the previous one, it does not even go
into no-op mode (as it still sees a valid WalRcv->latestWalEnd from the
earlier run) and keeps pinging the primary repeatedly for slots. Should
it error out here, or at least become a no-op until a valid
primary_slot_name is given?

--
Thanks,
Nisha




Re: Intermittent failure with t/003_logical_slots.pl test on windows

2023-11-06 Thread Nisha Moond
On Fri, Nov 3, 2023 at 5:02 PM Nisha Moond  wrote:
>
> On Thu, Nov 2, 2023 at 11:52 AM Kyotaro Horiguchi
>  wrote:
> >
> > At Tue, 31 Oct 2023 18:11:48 +0530, vignesh C  wrote in
> > > Few others are also facing this problem with similar code like in:
> > > https://stackoverflow.com/questions/15882799/fgets-returning-error-for-file-returned-by-popen
> >
> > I'm inclined to believe that the pipe won't enter the EOF state until
> > the target command terminates (then the top-level cmd.exe). The target
> > command likely terminated prematurely due to an error before printing
> > any output.
> >
> > If we append "2>&1" to the command line, we can capture the error
> > message through the returned pipe if any. Such error messages will
> > cause the subsequent code to fail with an error such as "unexpected
> > string: 'the output'". I'm not sure, but if this is permissive, the
> > returned error messages could potentially provide insight into the
> > underlying issue, paving the way for a potential solution.
> >
>
> Appending '2>&1 test:
> The command still results in NULL and ends up failing as no data is
> returned. Which means even no error message is returned. The error log
> with appended '2>$1' is -
>
> no data was returned by command
> ""D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal" -V 2>&1"
>
> check for "D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal"
> failed: cannot execute
> Failure, exiting
>
> Further observations:
> 1. To make sure the forked process completes before fgets(), I tested
> with Sleep(100) before fgets() call.
> ...
> ...
> if ((pgver = popen(cmd, "r")) == NULL)
> {
> perror("popen failure");
> return NULL;
> }
>
> errno = 0;
> Sleep(100);
> if (fgets(line, maxsize, pgver) == NULL)
> {
> if (feof(pgver))
> fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
> ...
> ...
>
> This also doesn't resolve the issue, the error is still seen intermittently.
>
> 2.  Even though fgets() fails, the output is still getting captured in
> 'line' string.
> Tested with printing the 'line' in case of failure:
> ...
> ...
> if ((pgver = popen(cmd, "r")) == NULL)
> {
> perror("popen failure");
> return NULL;
> }
>
> errno = 0;
> if (fgets(line, maxsize, pgver) == NULL)
> {
> if (line)
>   fprintf(stderr, "cmd output - %s\n", line);
>
> if (feof(pgver))
>   fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
> …
> …
> And the log looks like -
> cmd output - postgres (PostgreSQL) 17devel
> no data was returned by command
> ""D:/Project/pg1/postgres/tmp_install/bin/pg_controldata" -V"
>
> check for "D:/Project/pg1/postgres/tmp_install/bin/pg_controldata"
> failed: cannot execute
> Failure, exiting
>
> Attached test result log for the same - "regress_log_003_logical_slots".

The same failure is observed with the 't/002_pg_upgrade.pl' test too
(intermittently). So it is not limited to the "t/003_logical_slots.pl"
test alone; it seems more closely associated with running the
pg_upgrade command.

--
Thanks,
Nisha Moond




Re: Intermittent failure with t/003_logical_slots.pl test on windows

2023-11-03 Thread Nisha Moond
On Thu, Nov 2, 2023 at 11:52 AM Kyotaro Horiguchi
 wrote:
>
> At Tue, 31 Oct 2023 18:11:48 +0530, vignesh C  wrote in
> > Few others are also facing this problem with similar code like in:
> > https://stackoverflow.com/questions/15882799/fgets-returning-error-for-file-returned-by-popen
>
> I'm inclined to believe that the pipe won't enter the EOF state until
> the target command terminates (then the top-level cmd.exe). The target
> command likely terminated prematurely due to an error before printing
> any output.
>
> If we append "2>&1" to the command line, we can capture the error
> message through the returned pipe if any. Such error messages will
> cause the subsequent code to fail with an error such as "unexpected
> string: 'the output'". I'm not sure, but if this is permissive, the
> returned error messages could potentially provide insight into the
> underlying issue, paving the way for a potential solution.
>

Appending '2>&1' test:
The command still results in NULL and ends up failing as no data is
returned, which means that not even an error message is returned. The
error log with '2>&1' appended is -

no data was returned by command
""D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal" -V 2>&1"

check for "D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal"
failed: cannot execute
Failure, exiting

Further observations:
1. To make sure the forked process completes before fgets(), I tested
with a Sleep(100) call before the fgets() call.
...
...
if ((pgver = popen(cmd, "r")) == NULL)
{
    perror("popen failure");
    return NULL;
}

errno = 0;
Sleep(100);
if (fgets(line, maxsize, pgver) == NULL)
{
    if (feof(pgver))
        fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
...
...

This also doesn't resolve the issue, the error is still seen intermittently.

2. Even though fgets() fails, the output still gets captured in the
'line' string.
Tested by printing 'line' in the failure case:
...
...
if ((pgver = popen(cmd, "r")) == NULL)
{
    perror("popen failure");
    return NULL;
}

errno = 0;
if (fgets(line, maxsize, pgver) == NULL)
{
    if (line)
        fprintf(stderr, "cmd output - %s\n", line);

    if (feof(pgver))
        fprintf(stderr, "no data was returned by command \"%s\"\n", cmd);
...
...
And the log looks like -
cmd output - postgres (PostgreSQL) 17devel
no data was returned by command
""D:/Project/pg1/postgres/tmp_install/bin/pg_controldata" -V"

check for "D:/Project/pg1/postgres/tmp_install/bin/pg_controldata"
failed: cannot execute
Failure, exiting

Attached test result log for the same - "regress_log_003_logical_slots".

Thanks,
Nisha Moond


regress_log_003_logical_slots
Description: Binary data


Intermittent failure with t/003_logical_slots.pl test on windows

2023-10-31 Thread Nisha Moond
Hi hackers,

There is a failure in the 't/003_logical_slots.pl' test during the
upgrade. The failure is intermittent and is observed in the Windows
environment.

Details-
Test - pg_upgrade/t/003_logical_slots.pl
Result -
t/003_logical_slots.pl .. 5/?
#   Failed test 'run of pg_upgrade of old cluster'
#   at t/003_logical_slots.pl line 165.
t/003_logical_slots.pl .. 10/?
#   Failed test 'check the slot exists on new cluster'
#   at t/003_logical_slots.pl line 171.
#  got: ''
# expected: 'regress_sub|t'
# Tests were run but no plan was declared and done_testing() was not seen.
# Looks like your test exited with 25 just after 11.
t/003_logical_slots.pl .. Dubious, test returned 25 (wstat 6400, 0x1900)
Failed 2/11 subtests

Test Summary Report
---
t/003_logical_slots.pl (Wstat: 6400 (exited 25) Tests: 11 Failed: 2)
  Failed tests:  10-11
  Non-zero exit status: 25
  Parse errors: No plan found in TAP output
Files=1, Tests=11, 32 wallclock secs ( 0.03 usr +  0.01 sys =  0.05 CPU)
Result: FAIL

log attached - 'regress_log_003_logical_slots'.

The failure cause is -
no data was returned by command
""D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal" -V"
check for "D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal"
failed: cannot execute

Failure, exiting
[16:24:21.144](6.275s) not ok 10 - run of pg_upgrade of old cluster

If the same command is run manually, it succeeds -

>"D:/Project/pg1/postgres/tmp_install/bin/pg_resetwal" -V
pg_resetwal (PostgreSQL) 17devel

The same intermittent test failure is also seen with other commands,
like pg_ctl and pg_dump, failing in the same way while retrieving the
version -
Ex -
no data was returned by command
""D:/Project/pg1/postgres/tmp_install/bin/pg_dump" -V"
check for "D:/Project/pg1/postgres/tmp_install/bin/pg_dump" failed:
cannot execute

Failure, exiting
[16:08:50.444](7.434s) not ok 10 - run of pg_upgrade of old cluster

Has anyone come across this issue? I am not sure what the issue is here.
Any thoughts?

Thanks,
Nisha Moond


regress_log_003_logical_slots
Description: Binary data