Conflict Detection and Resolution

2024-05-23 Thread shveta malik
Hello hackers, Please find the proposal for Conflict Detection and Resolution (CDR) for Logical replication. Introduction In case the node is subscribed to multiple providers, or when local writes happen on a subscriber, conflicts can arise for the incoming changes. CDR is the

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 5:28 PM Zhijie Hou (Fujitsu) wrote: > > On Monday, April 29, 2024 5:11 PM shveta malik wrote: > > > > On Mon, Apr 29, 2024 at 11:38 AM shveta malik > > wrote: > > > > > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) >

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 11:38 AM shveta malik wrote: > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > > > On Th

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 11:38 AM shveta malik wrote: > > On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) > wrote: > > > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > > wrote: > > > > > > Hi, > > > > > > On Th

Re: Synchronizing slots from primary to standby

2024-04-29 Thread shveta malik
On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu) wrote: > > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Mar 14, 2024 at 02:22:44AM +, Zhijie Hou (Fujitsu) wrote: > > > Hi, > > > > > > Since the standby_slot_names patch has been committed, I

Re: promotion related handling in pg_sync_replication_slots()

2024-04-23 Thread shveta malik
On Tue, Apr 23, 2024 at 9:07 AM Amit Kapila wrote: > > On Mon, Apr 22, 2024 at 7:04 PM Masahiko Sawada wrote: > > > > On Mon, Apr 22, 2024 at 9:02 PM shveta malik wrote: > > > > > > Thanks for the patch, the changes look good Amit. Please find the mer

Re: promotion related handling in pg_sync_replication_slots()

2024-04-22 Thread shveta malik
On Mon, Apr 22, 2024 at 5:10 PM Amit Kapila wrote: > > On Fri, Apr 19, 2024 at 1:52 PM shveta malik wrote: > > > > Please find v9 with the above comments addressed. > > > > I have made minor modifications in the comments and a function name. > Please see the

Re: Disallow changing slot's failover option in transaction block

2024-04-22 Thread shveta malik
On Mon, Apr 22, 2024 at 5:57 AM Zhijie Hou (Fujitsu) wrote: > > On Friday, April 19, 2024 10:54 AM Kuroda, Hayato/黒田 隼人 > wrote: > > In your patch, the pg_dump.c was updated. IIUC the main reason was that > > pg_restore executes some queries as single transactions so that ALTER > > SUBSCRIPTION

Re: Disallow changing slot's failover option in transaction block

2024-04-19 Thread shveta malik
On Fri, Apr 19, 2024 at 6:09 AM Zhijie Hou (Fujitsu) wrote: > > Here is V2 patch which includes the changes for CREATE SUBSCRIPTION as > suggested. Since we don't connect pub to alter slot when (create_slot=false) > anymore, the restriction that disallows failover=true when connect=false is >

Re: promotion related handling in pg_sync_replication_slots()

2024-04-19 Thread shveta malik
On Fri, Apr 19, 2024 at 11:37 AM shveta malik wrote: > > On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > > > Please find v8 attached. Changes are: > >

Re: promotion related handling in pg_sync_replication_slots()

2024-04-19 Thread shveta malik
On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot wrote: > > Hi, > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > > Please find v8 attached. Changes are: > > Thanks! > > A few comments: Thanks for reviewing. > 1 === > > @@ -1440,7

Re: promotion related handling in pg_sync_replication_slots()

2024-04-18 Thread shveta malik
On Thu, Apr 18, 2024 at 12:35 PM shveta malik wrote: > > To fix above issues, these changes have been made in v7: Please find v8 attached. Changes are: 1) It fixes ShutDownSlotSync() issue, where we perform kill(SlotSyncCtx->pid). There are chances that after we release spin-lock and

Re: Disallow changing slot's failover option in transaction block

2024-04-18 Thread shveta malik
On Thu, Apr 18, 2024 at 11:40 AM Hayato Kuroda (Fujitsu) wrote: > > Dear Shveta, > > Sorry for delay response. I missed your post. > > > +1. > > > > Similar to ALTER SUB, CREATE SUB also needs to be fixed. This is > > because we call alter_replication_slot in CREATE SUB as well, for the > > case

Re: promotion related handling in pg_sync_replication_slots()

2024-04-18 Thread shveta malik
On Tue, Apr 16, 2024 at 2:52 PM Bertrand Drouvot wrote: > > Hi, > > On Tue, Apr 16, 2024 at 02:06:45PM +0530, shveta malik wrote: > > On Tue, Apr 16, 2024 at 1:55 PM Zhijie Hou (Fujitsu) > > wrote: > > > I personally feel adding the additional check for

Re: Disallow changing slot's failover option in transaction block

2024-04-18 Thread shveta malik
On Thu, Apr 18, 2024 at 11:22 AM Amit Kapila wrote: > > On Tue, Apr 16, 2024 at 5:06 PM shveta malik wrote: > > > > On Tue, Apr 16, 2024 at 1:45 PM Hayato Kuroda (Fujitsu) > > wrote: > > > > > > Dear Hou, > > > > > > > Kuroda-San r

Re: Disallow changing slot's failover option in transaction block

2024-04-16 Thread shveta malik
On Tue, Apr 16, 2024 at 1:45 PM Hayato Kuroda (Fujitsu) wrote: > > Dear Hou, > > > Kuroda-San reported an issue off-list that: > > > > If user execute ALTER SUBSCRIPTION SET (failover) command inside a txn block > > and rollback, only the subscription option change can be rolled back, while > >

Re: promotion related handling in pg_sync_replication_slots()

2024-04-16 Thread shveta malik
On Tue, Apr 16, 2024 at 1:55 PM Zhijie Hou (Fujitsu) wrote: > > On Tuesday, April 16, 2024 2:52 PM Bertrand Drouvot > wrote: > > > Hi, > > > On Tue, Apr 16, 2024 at 10:00:04AM +0530, shveta malik wrote: > > > Please find v5 addressing above comments. > &g

Re: promotion related handling in pg_sync_replication_slots()

2024-04-16 Thread shveta malik
On Tue, Apr 16, 2024 at 12:03 PM Bertrand Drouvot wrote: > > > I think for now let's restrict their usage in parallel and make the > > promotion behavior consistent both for worker and API. > > Okay, let's do it that way. Is it worth to add a few words in the doc related > to >

Re: promotion related handling in pg_sync_replication_slots()

2024-04-15 Thread shveta malik
On Tue, Apr 16, 2024 at 9:27 AM Zhijie Hou (Fujitsu) wrote: > > On Monday, April 15, 2024 6:09 PM shveta malik wrote: > > > > Please find v4 addressing the above comments. > > Thanks for the patch. > > Here are few comments: Thanks f

Re: promotion related handling in pg_sync_replication_slots()

2024-04-15 Thread shveta malik
On Mon, Apr 15, 2024 at 2:29 PM Amit Kapila wrote: > > On Fri, Apr 12, 2024 at 5:25 PM shveta malik wrote: > > > > Please find v3 addressing race-condition and one other comment. > > > > Up-thread it was suggested that, probably, checking > > SlotS

Re: promotion related handling in pg_sync_replication_slots()

2024-04-12 Thread shveta malik
On Fri, Apr 12, 2024 at 4:18 PM Amit Kapila wrote: > > On Fri, Apr 12, 2024 at 7:57 AM shveta malik wrote: > > > > On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy > > wrote: > > > > > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik > >

Re: promotion related handling in pg_sync_replication_slots()

2024-04-11 Thread shveta malik
On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy wrote: > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik wrote: > > > > Please find v2. Changes are: > > Thanks for the patch. Here are some comments. Thanks for reviewing. > > 1. Can we have a clear saying in th

Re: promotion related handling in pg_sync_replication_slots()

2024-04-11 Thread shveta malik
On Sat, Apr 6, 2024 at 11:49 AM Amit Kapila wrote: > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik wrote: > > > > Please find v2. Changes are: > > 1) Rebased the patch as there was conflict due to recent commit 6f132ed. > > 2) Added an Assert in update_synced_sl

Re: Synchronizing slots from primary to standby

2024-04-05 Thread shveta malik
On Fri, Apr 5, 2024 at 4:31 PM Bertrand Drouvot wrote: > > BTW, I just realized that the LSN I used in my example in the > LSN_FORMAT_ARGS() > are not the right ones. Noted. Thanks. Please find v3 with the comments addressed. thanks Shveta

Re: Synchronizing slots from primary to standby

2024-04-05 Thread shveta malik
On Fri, Apr 5, 2024 at 10:09 AM Bertrand Drouvot wrote: > > What about something like? > > ereport(LOG, > errmsg("synchronized confirmed_flush_lsn for slot \"%s\" differs from > remote slot", > remote_slot->name), > errdetail("Remote slot has LSN %X/%X but local slot has

Re: promotion related handling in pg_sync_replication_slots()

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 5:17 PM Amit Kapila wrote: > > On Thu, Apr 4, 2024 at 5:05 PM shveta malik wrote: > > > > Hello hackers, > > > > Currently, promotion related handling is missing in the slot sync SQL > > function pg_sync_replication_slots(). Here is

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Fri, Apr 5, 2024 at 9:22 AM Bertrand Drouvot wrote: > > Hi, > > On Thu, Apr 04, 2024 at 05:31:45PM +0530, shveta malik wrote: > > On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > > > > > > > Prior to commit 2ec005b, this check was

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 5:53 PM Bharath Rupireddy wrote: > > On Thu, Apr 4, 2024 at 4:35 PM Amit Kapila wrote: > > > > > Thanks for the changes. v34-0001 LGTM. > > > > I was doing a final review before pushing 0001 and found that > > 'inactive_since' could be set twice during startup after

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Thu, Apr 4, 2024 at 2:59 PM shveta malik wrote: > > > Prior to commit 2ec005b, this check was okay, as we did not expect > restart_lsn of the synced slot to be ahead of remote since we were > directly copying the lsns. But now when we use 'advance' to do logical > d

promotion related handling in pg_sync_replication_slots()

2024-04-04 Thread shveta malik
Hello hackers, Currently, promotion related handling is missing in the slot sync SQL function pg_sync_replication_slots(). Here is the background on how it is done in slot sync worker: During promotion, the startup process in order to shut down the slot-sync worker, sets the 'stopSignaled'

Re: Synchronizing slots from primary to standby

2024-04-04 Thread shveta malik
On Wed, Apr 3, 2024 at 3:36 PM Amit Kapila wrote: > > On Wed, Apr 3, 2024 at 11:13 AM Amit Kapila wrote: > > > > On Wed, Apr 3, 2024 at 9:36 AM Bharath Rupireddy > > wrote: > > > > > I quickly looked at v8, and have a nit, rest all looks good. > > > > > > +if (DecodingContextReady(ctx)

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-03 Thread shveta malik
On Wed, Apr 3, 2024 at 11:17 AM Bharath Rupireddy wrote: > > On Wed, Apr 3, 2024 at 8:38 AM shveta malik wrote: > > > > > > Or a simple solution is that the slotsync worker updates > > > > inactive_since as it does for non-synced slots, and disables >

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-04-02 Thread shveta malik
On Tue, Apr 2, 2024 at 11:58 AM Bertrand Drouvot wrote: > > Hi, > > On Tue, Apr 02, 2024 at 12:07:54PM +0900, Masahiko Sawada wrote: > > On Mon, Apr 1, 2024 at 12:18 PM Bharath Rupireddy > > > > FWIW, coming to this thread late, I think that the inactive_since > > should not be synchronized from

Re: Synchronizing slots from primary to standby

2024-04-01 Thread shveta malik
On Mon, Apr 1, 2024 at 5:05 PM Amit Kapila wrote: > > > 2 === > > > > + { > > + if (SnapBuildSnapshotExists(remote_slot->restart_lsn)) > > + { > > > > That could call SnapBuildSnapshotExists() multiple times for the same > > "restart_lsn" (for example in case of

Re: Synchronizing slots from primary to standby

2024-04-01 Thread shveta malik
On Mon, Apr 1, 2024 at 10:40 AM Amit Kapila wrote: > > On Mon, Apr 1, 2024 at 10:01 AM Bharath Rupireddy > wrote: > > > > On Thu, Mar 28, 2024 at 10:08 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > [2] The steps to reproduce the data miss issue on a primary->standby > > > setup: > > > > I'm

Re: Synchronizing slots from primary to standby

2024-03-28 Thread shveta malik
On Fri, Mar 29, 2024 at 6:36 AM Zhijie Hou (Fujitsu) wrote: > > Attach a new version patch which fixed an un-initialized variable issue and > added some comments. Also, temporarily enable DEBUG2 for the 040 tap-test so > that > we can analyze the possible CFbot failures easily. As suggested by

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 9:00 PM Bharath Rupireddy wrote: > > Thanks. I'm attaching v29 patches. 0001 managing inactive_since on the > standby for sync slots. 0002 implementing inactive timeout GUC based > invalidation mechanism. > > Please have a look. Thanks for the patches. v29-001 looks good

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 2:55 PM Bharath Rupireddy wrote: > > On Wed, Mar 27, 2024 at 11:39 AM shveta malik wrote: > > > > Thanks for the patch. Few trivial things: > > Thanks for reviewing. > > > -- > > 1) > > system-views.sgml: > > &g

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-27 Thread shveta malik
On Wed, Mar 27, 2024 at 11:05 AM Bharath Rupireddy wrote: > > Fixed an issue in synchronize_slots where DatumGetLSN is being used in > place of DatumGetTimestampTz. Found this via CF bot member [1], not on > my dev system. > > Please find the attached v6 patch. Thanks for the patch. Few trivial

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 6:05 PM Bertrand Drouvot wrote: > > > > We can think on that later if we really need another > > field which give us sync time. > > I think that calling GetCurrentTimestamp() so frequently could be too costly, > so > I'm not sure we should. Agreed. > > In my second

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Wed, Mar 27, 2024 at 10:22 AM Amit Kapila wrote: > > On Wed, Mar 27, 2024 at 10:08 AM Bharath Rupireddy > wrote: > > > > On Tue, Mar 26, 2024 at 11:22 PM Bertrand Drouvot > > wrote: > > > > > 3) > > > update_synced_slots_inactive_time(): > > > > > > This assert is removed, is it intentional?

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 9:59 PM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy > wrote: > > > > If we just sync inactive_since value for synced slots while in > > recovery from the primary, so be it. Why do we need to update it to > > the current time when the

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 4:35 PM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 4:18 PM shveta malik wrote: > > > > > What about another approach?: inactive_since gives data synced from > > > primary for > > > synced slots and another dedicate

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 3:50 PM Bertrand Drouvot wrote: > > Hi, > > > I think there may have been some misunderstanding here. > > Indeed ;-) > > > But now if I > > rethink this, I am fine with 'inactive_since' getting synced from > > primary to standby. But if we do that, we need to add docs

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
the time of creation of the synced slot. 2) during standby restart. 3) during promotion of standby. I have attached a sample patch for this idea as.txt file. I am fine with any of these approaches. One gives data synced from primary for synced slots, while another gives actual inactivity data of syn

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 2:27 PM Bharath Rupireddy wrote: > > > > > 1) > > Commti msg: > > > > ensures the value is set to current timestamp during the > > shutdown to help correctly interpret the time if the standby gets > > promoted without a restart. > > > > shutdown --> shutdown of slot sync

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 11:08 AM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 9:30 AM shveta malik wrote: > > > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik > > wrote: > > > > > > I have one concern, for synced slots on standby, how do we d

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-26 Thread shveta malik
On Tue, Mar 26, 2024 at 11:36 AM Bertrand Drouvot wrote: > > > > The issue that I can see with your proposal is: what if one synced the slots > > manually (with pg_sync_replication_slots()) but does not use the sync > > worker? > > Then I think ShutDownSlotSync() is not going to help in that

Re: pgsql: Track last_inactive_time in pg_replication_slots.

2024-03-25 Thread shveta malik
On Tue, Mar 26, 2024 at 1:50 AM Bharath Rupireddy wrote: > > On Tue, Mar 26, 2024 at 1:30 AM Nathan Bossart > wrote: > > > > On Mon, Mar 25, 2024 at 04:49:12PM +, Bertrand Drouvot wrote: > > > On Mon, Mar 25, 2024 at 12:25:37PM -0400, Robert Haas wrote: > > >> In the same vein, I think

Re: pgsql: Track last_inactive_time in pg_replication_slots.

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 9:54 PM Bertrand Drouvot wrote: > > Hi, > > On Mon, Mar 25, 2024 at 07:32:11PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 6:57 PM Robert Haas wrote: > > > And I'm suspicious that having an exception for slots being synced is > > > a bad idea. That makes too

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 12:43 PM shveta malik wrote: > > I have one concern, for synced slots on standby, how do we disallow > invalidation due to inactive-timeout immediately after promotion? > > For synced slots, last_inactive_time and inactive_timeout are both > set. Let'

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 5:10 PM Amit Kapila wrote: > > I think we should keep pg_alter_replication_slot() as the last > priority among the remaining patches for this release. Let's try to > first finish the primary functionality of inactive_timeout patch. > Otherwise, I agree that the problem

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 3:31 PM Bharath Rupireddy wrote: > > Right. Done that way i.e. not setting the last_inactive_time for slots > both while releasing the slot and restoring from the disk. > > Also, I've added a TAP function to check if the captured times are > sane per Bertrand's review

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot wrote: > > Hi, > > Yeah, and I can see last_inactive_time is moving on the standby (while not the > case on the primary), probably due to the sync worker slot acquisition/release > which does not seem right. > Yes, you are right,

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 1:37 PM Bertrand Drouvot wrote: > > Hi, > > On Mon, Mar 25, 2024 at 12:59:52PM +0530, Amit Kapila wrote: > > On Mon, Mar 25, 2024 at 12:43 PM shveta malik > > wrote: > > > > > > On Mon, Mar 25, 2024 at 11:53 AM shveta malik &

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 11:53 AM shveta malik wrote: > > On Mon, Mar 25, 2024 at 10:33 AM shveta malik wrote: > > > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > > wrote: > > > > > > I've attached the v18 patch set here. I have one conce

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-25 Thread shveta malik
On Mon, Mar 25, 2024 at 10:33 AM shveta malik wrote: > > On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy > wrote: > > > > I've attached the v18 patch set here. > I have a question. Don't we allow creating subscriptions on an existing slot with a non-null 'ina

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-24 Thread shveta malik
On Sun, Mar 24, 2024 at 3:06 PM Bharath Rupireddy wrote: > > I've attached the v18 patch set here. Thanks for the patches. Please find few comments: patch 001: 1) slot.h: + /* The time at which this slot become inactive */ + TimestampTz last_inactive_time; become -->became

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-15 Thread shveta malik
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy wrote: > > While we wait to hear from others on this, I'm attaching the v9 patch > set implementing the above idea (check 0001 patch). Please have a > look. I'll come back to the other review comments soon. > patch002: 1) I would like to

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-14 Thread shveta malik
On Thu, Mar 14, 2024 at 7:58 PM Bharath Rupireddy wrote: > > On Thu, Mar 14, 2024 at 12:24 PM Amit Kapila wrote: > > > > On Wed, Mar 13, 2024 at 9:24 PM Bharath Rupireddy > > > > > > Yes, there will be some sort of duplicity if we emit conflict_reason > > > as a text field. However, I still

Re: Have pg_basebackup write "dbname" in "primary_conninfo"?

2024-03-14 Thread shveta malik
On Thu, Mar 14, 2024 at 10:57 AM Amit Kapila wrote: > > On Thu, Mar 14, 2024 at 5:57 AM Masahiko Sawada wrote: > > > > This fact makes me think that the slotsync worker might be able to > > accept the primary_conninfo value even if there is no dbname in the > > value. That is, if there is no

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-13 Thread shveta malik
On Fri, Mar 8, 2024 at 10:42 PM Bharath Rupireddy wrote: > > On Wed, Mar 6, 2024 at 4:49 PM Amit Kapila wrote: > > > > You might want to consider its interaction with sync slots on standby. > > Say, there is no activity on slots in terms of processing the changes > > for slots. Now, we won't

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-13 Thread shveta malik
> JFYI, the patch does not apply to the head. There is a conflict in > multiple files. For review purposes, I applied v8 to the March 6 code-base. I have yet to review in detail, please find my initial thoughts: 1) I found that 'inactive_replication_slot_timeout' works only if there was any

Re: Introduce XID age and inactive timeout based replication slot invalidation

2024-03-12 Thread shveta malik
On Wed, Mar 6, 2024 at 2:47 PM Bharath Rupireddy wrote: > > On Wed, Mar 6, 2024 at 2:42 PM Bertrand Drouvot > wrote: > > > > Hi, > > > > On Tue, Mar 05, 2024 at 01:44:43PM -0600, Nathan Bossart wrote: > > > On Wed, Mar 06, 2024 at 12:50:38AM +0530, Bharath Rupireddy wrote: > > > > On Mon, Mar 4,

Re: Add comment to specify timeout unit in ConditionVariableTimedSleep()

2024-03-11 Thread shveta malik
On Sat, Mar 9, 2024 at 12:19 PM Michael Paquier wrote: > > On Tue, Mar 05, 2024 at 03:20:48PM +0900, Michael Paquier wrote: > > That sounds like a good idea to me, so I'm OK with your suggestion. > > Applied this one as f160bf06f72a. Thanks. Thanks! thanks Shveta

Re: Regardign RecentFlushPtr in WalSndWaitForWal()

2024-03-11 Thread shveta malik
On Sat, Mar 2, 2024 at 4:44 PM Amit Kapila wrote: > > Right, I think the quoted code has check "if (!RecoveryInProgress())". > > > > But apart from that, your > > observation seems accurate, yes. > > > > I also find the observation correct and the code has been like that > since commit 5a991ef8

Re: Synchronizing slots from primary to standby

2024-03-07 Thread shveta malik
On Fri, Mar 8, 2024 at 9:56 AM Ajin Cherian wrote: > >> Pushed with minor modifications. I'll keep an eye on BF. >> >> BTW, one thing that we should try to evaluate a bit more is the >> traversal of slots in StandbySlotsHaveCaughtup() where we verify if >> all the slots mentioned in

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-06 Thread shveta malik
On Thu, Mar 7, 2024 at 11:12 AM Michael Paquier wrote: > > Yeah, it is possible that what you retrieve from > pgstat_fetch_replslot() does not refer exactly to the slot's content > under concurrent activity, but you cannot protect a single scan of > pg_stat_replication_slots as of an effect of

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-06 Thread shveta malik
On Wed, Mar 6, 2024 at 2:36 PM Bertrand Drouvot wrote: > > Hi, > > On Wed, Mar 06, 2024 at 10:24:46AM +0530, shveta malik wrote: > > On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot > > wrote: > > Thanks. Can we try to get rid of multiple LwLockRelease in > >

Re: Synchronizing slots from primary to standby

2024-03-06 Thread shveta malik
On Wed, Mar 6, 2024 at 6:54 PM Zhijie Hou (Fujitsu) wrote: > > On Wednesday, March 6, 2024 9:13 PM Zhijie Hou (Fujitsu) > wrote: > > > > On Wednesday, March 6, 2024 11:04 AM Zhijie Hou (Fujitsu) > > wrote: > > > > > > On Wednesday, March 6, 2024 9:30 AM Masahiko Sawada > > > wrote: > > > > >

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread shveta malik
On Tue, Mar 5, 2024 at 6:52 PM Bertrand Drouvot wrote: > > > /* > > * Nothing to do for physical slots as we collect stats only for logical > > * slots. > > */ > > if (SlotIsPhysical(slot)) > > return; > > D'oh! Thanks! Fixed in v2 shared up-thread. Thanks. Can we try to get rid of multiple

Re: Missing LWLock protection in pgstat_reset_replslot()

2024-03-05 Thread shveta malik
On Tue, Mar 5, 2024 at 1:25 PM Heikki Linnakangas wrote: > SearchNamedReplicationSlot() will also acquire the lock in LW_SHARED > mode, when you pass need_lock=true. So that at least should be changed > to false. > Also don't we need to release the lock when we return here: /* * Nothing to do

Re: Synchronizing slots from primary to standby

2024-03-04 Thread shveta malik
on(and not on each change) should be outside of this function. It is too embedded. And the comment which states the order of wait (first flush and then standbys confirmation) should be outside the for-loop in WalSndWaitForWal(), but yes we do need both the comments. Attached a patch (.txt) for comments

Add comment to specify timeout unit in ConditionVariableTimedSleep()

2024-03-04 Thread shveta malik
Hi hackers, ConditionVariableTimedSleep() accepts a timeout parameter, but it doesn't explicitly state the unit for the timeout anywhere. To determine this, one needs to look into the details of the function to find it out from the comments of the internally called function WaitLatch(). It would

Re: Synchronizing slots from primary to standby

2024-02-29 Thread shveta malik
On Wed, Feb 28, 2024 at 4:51 PM Amit Kapila wrote: > > On Wed, Feb 28, 2024 at 3:26 PM shveta malik wrote: > > > > > > Here is the patch which addresses the above comments. Also optimized > > the test a little bit. Now we use pg_sync_replication_slots() function

Re: Synchronizing slots from primary to standby

2024-02-29 Thread shveta malik
On Thu, Feb 29, 2024 at 7:04 AM Zhijie Hou (Fujitsu) wrote: > > Here is the v101 patch set which addressed above comments. > Thanks for the patch. Few comments: 1) Shall we mention in doc that shutdown will wait for standbys in standby_slot_names to confirm receiving WAL: Suggestion for

Re: Synchronizing slots from primary to standby

2024-02-28 Thread shveta malik
On Wed, Feb 28, 2024 at 1:33 PM Bertrand Drouvot wrote: > > Hi, > A few comments: Thanks for reviewing. > > 1 === > > +* used to run normal SQL queries > > s/run normal SQL/run SQL/ ? > > As mentioned up-thread I don't like that much the idea of creating such a test > but if we do then

Re: Synchronizing slots from primary to standby

2024-02-27 Thread shveta malik
On Wed, Feb 28, 2024 at 8:49 AM Amit Kapila wrote: > > > Few comments: Thanks for the feedback. > === > 1. > - if (logical) > + if (logical || !replication) > { > > Can we add a comment about connection types that require > ALWAYS_SECURE_SEARCH_PATH_SQL? > > 2. > Can we add a test

Re: Synchronizing slots from primary to standby

2024-02-26 Thread shveta malik
On Mon, Feb 26, 2024 at 5:18 PM Amit Kapila wrote: > > > > > + if (!ok) > > > > + GUC_check_errdetail("List syntax is invalid."); > > > > + > > > > + /* > > > > +* If there is a syntax error in the name or if the replication > > > > slots' > > > > +*

Regardign RecentFlushPtr in WalSndWaitForWal()

2024-02-26 Thread shveta malik
Hi hackers, I would like to understand why we have code [1] that retrieves RecentFlushPtr in WalSndWaitForWal() outside of the loop. We utilize RecentFlushPtr later within the loop, but prior to that, we already have [2]. Wouldn't [2] alone be sufficient? Just to check the impact, I ran 'make

Re: Synchronizing slots from primary to standby

2024-02-25 Thread shveta malik
for the patch, changes look good. I have corporated it in the patch which addresses the rest of your comments in [1]. I have attached the patch as .txt [1]: https://www.postgresql.org/message-id/ZdcejBDCr%2BwlVGnO%40ip-10-97-1-34.eu-west-3.compute.internal thanks Shveta From f1489f783748e85749a576cf

Re: Synchronizing slots from primary to standby

2024-02-23 Thread shveta malik
On Fri, Feb 23, 2024 at 1:28 PM Bertrand Drouvot wrote: > > Hi, > > Because one could create say the "=" OPERATOR in their own schema, attach a > function to it doing undesired stuff and change the search_path for the > database > the sync slot worker connects to. > > Then this new "=" operator

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Fri, Feb 23, 2024 at 10:06 AM Zhijie Hou (Fujitsu) wrote: > > > I noticed one CFbot failure[1] which is because the tap-test doesn't wait for > the > standby to catch up before promoting, thus the data inserted after promotion > could not be replicated to the subscriber. Add a

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Fri, Feb 23, 2024 at 8:35 AM shveta malik wrote: > > On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot > wrote: > > > > Suppose that in synchronize_slots() the query would be: > > > > const char *query = "SELECT slot_name, plugin, confirmed

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Thu, Feb 22, 2024 at 4:35 PM Bertrand Drouvot wrote: > > Suppose that in synchronize_slots() the query would be: > > const char *query = "SELECT slot_name, plugin, confirmed_flush_lsn," > " restart_lsn, catalog_xmin, two_phase, failover," > " database, conflict_reason" >

Re: Synchronizing slots from primary to standby

2024-02-22 Thread shveta malik
On Thu, Feb 22, 2024 at 3:44 PM Bertrand Drouvot wrote: > > Hi, > > Thanks! > > Some random comments about v92_001 (Sorry if it has already been discussed > up-thread): Thanks for the feedback. The patch is pushed 15 minutes back. I will prepare a top-up patch for your comments. > 1 === > > +

Re: Synchronizing slots from primary to standby

2024-02-21 Thread shveta malik
On Thu, Feb 22, 2024 at 10:31 AM shveta malik wrote: > > Please find patch001 attached. There is a CFBot failure in patch002. > The test added there needs some adjustment. We will rebase and post > rest of the patches once we fix that issue. > There was a recent commit 801792e t

Re: Synchronizing slots from primary to standby

2024-02-21 Thread shveta malik
On Wed, Feb 21, 2024 at 5:19 PM Amit Kapila wrote: > > A few minor comments: Thanks for the feedback. > = > 1. > +/* > + * Is stopSignaled set in SlotSyncCtx? > + */ > +bool > +IsStopSignaledSet(void) > +{ > + bool signaled; > + > + SpinLockAcquire(>mutex); > + signaled =

Re: 'Shutdown <= SmartShutdown' check while launching processes in postmaster.

2024-02-21 Thread shveta malik
On Wed, Feb 21, 2024 at 10:01 AM Tom Lane wrote: > > shveta malik writes: > > I would like to know that why we have 'Shutdown <= SmartShutdown' > > check before launching few processes (WalReceiver, WalSummarizer, > > AutoVacuum worker) while rest of the pro

Re: Synchronizing slots from primary to standby

2024-02-20 Thread shveta malik
On Tue, Feb 20, 2024 at 6:19 PM Masahiko Sawada wrote: > > On Tue, Feb 20, 2024 at 12:33 PM shveta malik wrote: > Thank you for the explanation. It makes sense to me to move the check. > > > 2. If the wal_level is not logical, the server will need to restart > anyway to

'Shutdown <= SmartShutdown' check while launching processes in postmaster.

2024-02-20 Thread shveta malik
Hi hackers, I would like to know that why we have 'Shutdown <= SmartShutdown' check before launching few processes (WalReceiver, WalSummarizer, AutoVacuum worker) while rest of the processes (BGWriter, WalWriter, Checkpointer, Archiver etc) do not have any such check. If I have to launch a new

Re: A new message seems missing a punctuation

2024-02-20 Thread shveta malik
On Tue, Feb 20, 2024 at 5:01 PM Amit Kapila wrote: > > > We do expose the required information (restart_lsn, catalog_xmin, > synced, temporary, etc) via pg_replication_slots. So, I feel the LOG > message here is sufficient to DEBUG (or know the details) when the > slot sync doesn't succeed. >

Re: About a recently-added message

2024-02-20 Thread shveta malik
On Tue, Feb 20, 2024 at 2:13 PM Amit Kapila wrote: > > I would prefer the changed ones as those clearly explain the problem > without additional information. okay, attached v2 patch with changed error msgs and double quotes around logical. thanks Shveta

Re: Synchronizing slots from primary to standby

2024-02-19 Thread shveta malik
On Tue, Feb 20, 2024 at 8:25 AM Masahiko Sawada wrote: > > > I've reviewed the v91 patch. Here are random comments: Thanks for the comments. > --- > /* > * Checks the remote server info. > * > - * We ensure that the 'primary_slot_name' exists on the remote server and the > - * remote

Re: Synchronizing slots from primary to standby

2024-02-19 Thread shveta malik
On Mon, Feb 19, 2024 at 5:32 PM Amit Kapila wrote: > > Few comments on 0001 Thanks for the feedback. > > 1. I think it is better to error out when the valid GUC or option is > not set in ensure_valid_slotsync_params() and > ensure_valid_remote_info() instead of waiting. And

Re: About a recently-added message

2024-02-18 Thread shveta malik
On Mon, Feb 19, 2024 at 11:10 AM Amit Kapila wrote: > > On Thu, Feb 15, 2024 at 11:49 AM Kyotaro Horiguchi > wrote: > > > > At Thu, 15 Feb 2024 09:22:23 +0530, shveta malik > > wrote in > > > > > > +1 on changing the msg(s) suggested way. Plea

Re: A new message seems missing a punctuation

2024-02-18 Thread shveta malik
On Mon, Feb 19, 2024 at 10:31 AM Robert Haas wrote: > > On Mon, Feb 19, 2024 at 10:10 AM Kyotaro Horiguchi > wrote: > > A recent commit (7a424ece48) added the following message: > > > > > could not sync slot information as remote slot precedes local slot: > > > remote slot "%s": LSN (%X/%X),

Re: Synchronizing slots from primary to standby

2024-02-18 Thread shveta malik
On Sun, Feb 18, 2024 at 7:40 PM Zhijie Hou (Fujitsu) wrote: > > On Friday, February 16, 2024 6:41 PM shveta malik > wrote: > Thanks for the patch. Here are few comments: Thanks for the comments. > > 2. > > +static bool > +validate_remote_info(WalRecei

Re: About a recently-added message

2024-02-14 Thread shveta malik
On Thu, Feb 15, 2024 at 8:26 AM Amit Kapila wrote: > > On Wed, Feb 14, 2024 at 7:51 PM Euler Taveira wrote: > > > > On Wed, Feb 14, 2024, at 8:45 AM, Amit Kapila wrote: > > > > Now, I am less clear about whether to quote "logical" or not in the > > above message. Do you have any suggestions? > >

Re: Synchronizing slots from primary to standby

2024-02-13 Thread shveta malik
On Fri, Feb 9, 2024 at 10:04 AM Amit Kapila wrote: > > +reserve_wal_for_local_slot(XLogRecPtr restart_lsn) > { > ... > + /* > + * Find the oldest existing WAL segment file. > + * > + * Normally, we can determine it by using the last removed segment > + * number. However, if no WAL segment files

  1   2   3   >