Re: Exit walsender before confirming remote flush in logical replication

2023-02-14 Thread Kyotaro Horiguchi
At Mon, 13 Feb 2023 17:13:43 -0800, Andres Freund wrote in > On 2023-02-14 10:05:40 +0900, Kyotaro Horiguchi wrote: > > What do you think about the need for explicitly specifying the > > default? I'm fine with specifying the default using a single word, > > such as WAIT_FOR_REMOTE_FLUSH. > >

Re: Exit walsender before confirming remote flush in logical replication

2023-02-13 Thread Andres Freund
On 2023-02-14 10:05:40 +0900, Kyotaro Horiguchi wrote: > What do you think about the need for explicitly specifying the > default? I'm fine with specifying the default using a single word, > such as WAIT_FOR_REMOTE_FLUSH. We obviously shouldn't force the option to be present. Why would we want

Re: Exit walsender before confirming remote flush in logical replication

2023-02-13 Thread Kyotaro Horiguchi
At Mon, 13 Feb 2023 08:27:01 +0530, Amit Kapila wrote in > On Mon, Feb 13, 2023 at 7:26 AM Kyotaro Horiguchi > wrote: > > > > IMHO I vaguely don't like that we lose a means to specify the default > > behavior here. And I'm not sure we definitely don't need other than > > flush and immedaite

Re: Exit walsender before confirming remote flush in logical replication

2023-02-12 Thread Peter Smith
Here are some comments for patch v7-0002. == Commit Message 1. This commit extends START_REPLICATION to accept SHUTDOWN_MODE clause. It is currently implemented only for logical replication. ~ "to accept SHUTDOWN_MODE clause." --> "to accept a SHUTDOWN_MODE clause." ==

Re: Exit walsender before confirming remote flush in logical replication

2023-02-12 Thread Amit Kapila
On Mon, Feb 13, 2023 at 7:26 AM Kyotaro Horiguchi wrote: > > At Fri, 10 Feb 2023 12:40:43 +, "Hayato Kuroda (Fujitsu)" > wrote in > > Dear Amit, > > > > > Can't we have this option just as a bool (like shutdown_immediate)? > > > Why do we want to keep multiple modes? > > > > Of course we

Re: Exit walsender before confirming remote flush in logical replication

2023-02-12 Thread Kyotaro Horiguchi
At Fri, 10 Feb 2023 12:40:43 +, "Hayato Kuroda (Fujitsu)" wrote in > Dear Amit, > > > Can't we have this option just as a bool (like shutdown_immediate)? > > Why do we want to keep multiple modes? > > Of course we can use boolean instead, but current style is motivated by the > post[1].

RE: Exit walsender before confirming remote flush in logical replication

2023-02-10 Thread Hayato Kuroda (Fujitsu)
Dear Amit, > Can't we have this option just as a bool (like shutdown_immediate)? > Why do we want to keep multiple modes? Of course we can use boolean instead, but current style is motivated by the post[1]. This allows to add another option in future, whereas I do not have idea now. I want to

Re: Exit walsender before confirming remote flush in logical replication

2023-02-10 Thread Amit Kapila
On Fri, Feb 10, 2023 at 5:24 PM Hayato Kuroda (Fujitsu) wrote: > Can't we have this option just as a bool (like shutdown_immediate)? Why do we want to keep multiple modes? -- With Regards, Amit Kapila.

Re: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Peter Smith
Here are my review comments for the v6-0002 patch. == Commit Message 1. This commit extends START_REPLICATION to accept SHUTDOWN_MODE term. Currently, it works well only for logical replication. ~ 1a. "to accept SHUTDOWN term" --> "to include a SHUTDOWN_MODE clause." ~ 1b. "it works well

RE: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Hayato Kuroda (Fujitsu)
Dear Sawada-san, Thank you for reviewing! > +/* > + * Options for controlling the behavior of the walsender. Options can be > + * specified in the START_STREAMING replication command. Currently only one > + * option is allowed. > + */ > +typedef struct > +{ > +WalSndShutdownMode

RE: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Hayato Kuroda (Fujitsu)
Dear Osumi-san, Thank you for reviewing! PSA new version. > (1) > > + Decides the condition for exiting the walsender process. > + 'wait_flush', which is the default, the > walsender > + will wait for all the sent WALs to be flushed on the subscriber > side, > +

Re: Exit walsender before confirming remote flush in logical replication

2023-02-09 Thread Masahiko Sawada
Hi, On Wed, Feb 8, 2023 at 6:47 PM Hayato Kuroda (Fujitsu) wrote: > > > > > Dear Horiguchi-san, > > > > Thank you for checking the patch! PSA new version. > > PSA rebased patch that supports updated time-delayed patch[1]. > Thank you for the patch! Here are some comments on v5 patch: +/* + *

RE: Exit walsender before confirming remote flush in logical replication

2023-02-08 Thread Takamichi Osumi (Fujitsu)
On Wednesday, February 8, 2023 6:47 PM Hayato Kuroda (Fujitsu) wrote: > PSA rebased patch that supports updated time-delayed patch[1]. Hi, Thanks for creating the patch ! Minor review comments on v5-0002. (1) + Decides the condition for exiting the walsender process. +

RE: Exit walsender before confirming remote flush in logical replication

2023-02-08 Thread Hayato Kuroda (Fujitsu)
> > Dear Horiguchi-san, > > Thank you for checking the patch! PSA new version. PSA rebased patch that supports updated time-delayed patch[1]. [1]: https://www.postgresql.org/message-id/tyapr01mb5866c11daf8ab04f3cc181d3f5...@tyapr01mb5866.jpnprd01.prod.outlook.com Best Regards, Hayato Kuroda

RE: Exit walsender before confirming remote flush in logical replication

2023-02-08 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san, Thank you for checking the patch! PSA new version. > 0002: > > This patch doesn't seem to offer a means to change the default > walsender behavior. We need a subscription option named like > "walsender_exit_mode" to do that. As I said in another mail[1], I'm thinking the

RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Thanks for giving comments! > > > > 0002: > > > > This patch doesn't seem to offer a means to change the default > > walsender behavior. We need a subscription option named like > > "walsender_exit_mode" to do that. > > > > I don't think at this stage we need a subscription-level

Re: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Amit Kapila
On Wed, Feb 8, 2023 at 7:57 AM Kyotaro Horiguchi wrote: > > I agree to the direction and thanks for the patch. > > At Tue, 7 Feb 2023 17:08:54 +, "Hayato Kuroda (Fujitsu)" > wrote in > > > I noticed that previous ones are rejected by cfbot, even if they passed > > > on my > > >

Re: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Kyotaro Horiguchi
I agree to the direction and thanks for the patch. At Tue, 7 Feb 2023 17:08:54 +, "Hayato Kuroda (Fujitsu)" wrote in > > I noticed that previous ones are rejected by cfbot, even if they passed on > > my > > environment... > > PSA fixed version. > > While analyzing more, I found the

RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
> I noticed that previous ones are rejected by cfbot, even if they passed on my > environment... > PSA fixed version. While analyzing more, I found the further bug that forgets initialization. PSA new version that could be passed automated tests on my github repository. Sorry for noise. Best

RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
> Dear Andres, Amit, > > > On 2023-02-07 09:00:13 +0530, Amit Kapila wrote: > > > On Tue, Feb 7, 2023 at 2:04 AM Andres Freund wrote: > > > > How about we make it an option in START_REPLICATION? Delayed logical > > rep can > > > > toggle that on by default. > > > > > Works for me. So, when this

RE: Exit walsender before confirming remote flush in logical replication

2023-02-07 Thread Hayato Kuroda (Fujitsu)
Dear Andres, Amit, > On 2023-02-07 09:00:13 +0530, Amit Kapila wrote: > > On Tue, Feb 7, 2023 at 2:04 AM Andres Freund wrote: > > > How about we make it an option in START_REPLICATION? Delayed logical > rep can > > > toggle that on by default. > > > Works for me. So, when this option is set in

Re: Exit walsender before confirming remote flush in logical replication

2023-02-06 Thread Andres Freund
Hi, On 2023-02-07 09:00:13 +0530, Amit Kapila wrote: > On Tue, Feb 7, 2023 at 2:04 AM Andres Freund wrote: > > How about we make it an option in START_REPLICATION? Delayed logical rep can > > toggle that on by default. > Works for me. So, when this option is set in START_REPLICATION > message,

Re: Exit walsender before confirming remote flush in logical replication

2023-02-06 Thread Amit Kapila
On Tue, Feb 7, 2023 at 2:04 AM Andres Freund wrote: > > On 2023-02-06 12:23:54 +0530, Amit Kapila wrote: > > On Mon, Feb 6, 2023 at 10:33 AM Andres Freund wrote: > > > Smart shutdown is practically unusable. I don't think it makes sense to > > > tie behavior of walsender to it in any way. > > >

Re: Exit walsender before confirming remote flush in logical replication

2023-02-06 Thread Andres Freund
Hi, On 2023-02-06 12:23:54 +0530, Amit Kapila wrote: > On Mon, Feb 6, 2023 at 10:33 AM Andres Freund wrote: > > Smart shutdown is practically unusable. I don't think it makes sense to tie > > behavior of walsender to it in any way. > > > > So, we have the following options: (a) do nothing for

Re: Exit walsender before confirming remote flush in logical replication

2023-02-05 Thread Amit Kapila
On Mon, Feb 6, 2023 at 10:33 AM Andres Freund wrote: > > On February 5, 2023 8:29:19 PM PST, Amit Kapila > wrote: > >> > >> But that seems a too narrow view to me. Imagine you want to decomission > >> the current primary, and instead start to use the logical standby as the > >> primary. For

Re: Exit walsender before confirming remote flush in logical replication

2023-02-05 Thread Andres Freund
Hi, On February 5, 2023 8:29:19 PM PST, Amit Kapila wrote: >On Sat, Feb 4, 2023 at 6:31 PM Andres Freund wrote: >> >> On 2023-02-02 11:21:54 +0530, Amit Kapila wrote: >> > The main problem we want to solve here is to avoid shutdown failing in >> > case walreceiver/applyworker is busy waiting

Re: Exit walsender before confirming remote flush in logical replication

2023-02-05 Thread Amit Kapila
On Sat, Feb 4, 2023 at 6:31 PM Andres Freund wrote: > > On 2023-02-02 11:21:54 +0530, Amit Kapila wrote: > > The main problem we want to solve here is to avoid shutdown failing in > > case walreceiver/applyworker is busy waiting for some lock or for some > > other reason as shown in the email

Re: Exit walsender before confirming remote flush in logical replication

2023-02-04 Thread Andres Freund
Hi, On 2023-02-02 11:21:54 +0530, Amit Kapila wrote: > The main problem we want to solve here is to avoid shutdown failing in > case walreceiver/applyworker is busy waiting for some lock or for some > other reason as shown in the email [1]. Isn't handling this part of the job of

Re: Exit walsender before confirming remote flush in logical replication

2023-02-04 Thread Amit Kapila
On Fri, Feb 3, 2023 at 5:38 PM Hayato Kuroda (Fujitsu) wrote: > > Dear Amit, Sawada-san, > > > > IIUC there is no difference between smart shutdown and fast shutdown > > > in logical replication walsender, but reading the doc[1], it seems to > > > me that in the smart shutdown mode, the server

RE: Exit walsender before confirming remote flush in logical replication

2023-02-03 Thread Hayato Kuroda (Fujitsu)
Dear Amit, Sawada-san, > > IIUC there is no difference between smart shutdown and fast shutdown > > in logical replication walsender, but reading the doc[1], it seems to > > me that in the smart shutdown mode, the server stops existing sessions > > normally. For example, If the client is psql

Re: Exit walsender before confirming remote flush in logical replication

2023-02-01 Thread Amit Kapila
On Thu, Feb 2, 2023 at 10:04 AM Kyotaro Horiguchi wrote: > > At Wed, 1 Feb 2023 14:58:14 +0530, Amit Kapila > wrote in > > On Wed, Feb 1, 2023 at 2:09 PM Masahiko Sawada > > wrote: > > > > > > Otherwise, we will end up terminating > > > the WAL stream without the done message. Which will lead

Re: Exit walsender before confirming remote flush in logical replication

2023-02-01 Thread Amit Kapila
On Thu, Feb 2, 2023 at 10:48 AM Masahiko Sawada wrote: > > On Wed, Feb 1, 2023 at 6:28 PM Amit Kapila wrote: > > > > > > > In a case where pq_is_send_pending() doesn't become false > > > for a long time, (e.g., the network socket buffer got full due to the > > > apply worker waiting on a lock),

Re: Exit walsender before confirming remote flush in logical replication

2023-02-01 Thread Masahiko Sawada
On Wed, Feb 1, 2023 at 6:28 PM Amit Kapila wrote: > > On Wed, Feb 1, 2023 at 2:09 PM Masahiko Sawada wrote: > > > > On Fri, Jan 20, 2023 at 7:45 PM Amit Kapila wrote: > > > > > > On Tue, Jan 17, 2023 at 2:41 PM Amit Kapila > > > wrote: > > > > > > > > Let me try to summarize the discussion

Re: Exit walsender before confirming remote flush in logical replication

2023-02-01 Thread Kyotaro Horiguchi
At Wed, 1 Feb 2023 14:58:14 +0530, Amit Kapila wrote in > On Wed, Feb 1, 2023 at 2:09 PM Masahiko Sawada wrote: > > > > Otherwise, we will end up terminating > > the WAL stream without the done message. Which will lead to an error > > message "ERROR: could not receive data from WAL stream:

Re: Exit walsender before confirming remote flush in logical replication

2023-02-01 Thread Amit Kapila
On Wed, Feb 1, 2023 at 2:09 PM Masahiko Sawada wrote: > > On Fri, Jan 20, 2023 at 7:45 PM Amit Kapila wrote: > > > > On Tue, Jan 17, 2023 at 2:41 PM Amit Kapila wrote: > > > > > > Let me try to summarize the discussion till now. The problem we are > > > trying to solve here is to allow a

Re: Exit walsender before confirming remote flush in logical replication

2023-02-01 Thread Masahiko Sawada
On Fri, Jan 20, 2023 at 7:45 PM Amit Kapila wrote: > > On Tue, Jan 17, 2023 at 2:41 PM Amit Kapila wrote: > > > > Let me try to summarize the discussion till now. The problem we are > > trying to solve here is to allow a shutdown to complete when walsender > > is not able to send the entire WAL.

RE: Exit walsender before confirming remote flush in logical replication

2023-01-26 Thread Hayato Kuroda (Fujitsu)
Dear Dilip, hackers, Thanks for giving your opinion. I analyzed the relation with the given commit, and I thought I could keep my patch. How do you think? # Abstract * Some modifications should be needed. * We cannot rollback the shutdown if walsenders are stuck * We don't have a good way to

Re: Exit walsender before confirming remote flush in logical replication

2023-01-22 Thread Dilip Kumar
On Fri, Jan 20, 2023 at 4:15 PM Amit Kapila wrote: > > On Tue, Jan 17, 2023 at 2:41 PM Amit Kapila wrote: > > > > Let me try to summarize the discussion till now. The problem we are > > trying to solve here is to allow a shutdown to complete when walsender > > is not able to send the entire WAL.

Re: Exit walsender before confirming remote flush in logical replication

2023-01-20 Thread Amit Kapila
On Tue, Jan 17, 2023 at 2:41 PM Amit Kapila wrote: > > Let me try to summarize the discussion till now. The problem we are > trying to solve here is to allow a shutdown to complete when walsender > is not able to send the entire WAL. Currently, in such cases, the > shutdown fails. As per our

RE: Exit walsender before confirming remote flush in logical replication

2023-01-19 Thread Hayato Kuroda (Fujitsu)
Dear Amit, hackers, > Let me try to summarize the discussion till now. The problem we are > trying to solve here is to allow a shutdown to complete when walsender > is not able to send the entire WAL. Currently, in such cases, the > shutdown fails. As per our current understanding, this can

Re: Exit walsender before confirming remote flush in logical replication

2023-01-17 Thread Amit Kapila
On Mon, Jan 16, 2023 at 4:39 PM Hayato Kuroda (Fujitsu) wrote: > > > In logical replication apply preceeds write and flush so we have no > > indication whether a record is "replicated" to standby by other than > > apply LSN. On the other hand, logical recplication doesn't have a > > business with

RE: Exit walsender before confirming remote flush in logical replication

2023-01-16 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san, > If I'm grabbing the discussion here correctly, in my memory, it is > because: physical replication needs all records that have written on > primary are written on standby for switchover to succeed. It is > annoying that normal shutdown occasionally leads to switchover >

RE: Exit walsender before confirming remote flush in logical replication

2023-01-16 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san, Amit, > At Fri, 13 Jan 2023 16:41:08 +0530, Amit Kapila > wrote in > > Okay, but what happens in the case of physical replication when > > synchronous_commit = remote_apply? In that case, won't it ensure that > > apply has also happened? If so, then shouldn't the time delay

Re: Exit walsender before confirming remote flush in logical replication

2023-01-15 Thread Kyotaro Horiguchi
At Fri, 13 Jan 2023 16:41:08 +0530, Amit Kapila wrote in > Okay, but what happens in the case of physical replication when > synchronous_commit = remote_apply? In that case, won't it ensure that > apply has also happened? If so, then shouldn't the time delay feature > also cause a similar

Re: Exit walsender before confirming remote flush in logical replication

2023-01-15 Thread Kyotaro Horiguchi
At Wed, 28 Dec 2022 09:15:41 +, "Hayato Kuroda (Fujitsu)" wrote in > > Another thing we can investigate here why do we need to ensure that > > there is no pending send before shutdown. > > I have not done yet about it, will continue next year. > It seems that walsenders have been sending

Re: Exit walsender before confirming remote flush in logical replication

2023-01-13 Thread Amit Kapila
On Wed, Dec 28, 2022 at 8:18 AM Hayato Kuroda (Fujitsu) wrote: > > Dear Amit, > > > > Firstly I considered 2, but I thought 1 seemed to be better. > > > PSA the updated patch. > > > > > > > I think even for logical replication we should check whether there is > > any pending WAL (via

RE: Exit walsender before confirming remote flush in logical replication

2022-12-28 Thread Hayato Kuroda (Fujitsu)
Dear Amit, > Thanks for the verification. BTW, do you think we should document this > either with time-delayed replication or otherwise unless this is > already documented? I think this should be documented at "Shutting Down the Server" section in runtime.sgml or logical-replicaiton.sgml, but I

Re: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Amit Kapila
On Wed, Dec 28, 2022 at 8:19 AM Hayato Kuroda (Fujitsu) wrote: > > > In logical replication, it can happen today as well without > > time-delayed replication. Basically, say apply worker is waiting to > > acquire some lock that is already acquired by some backend then it > > will have the same

RE: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Amit, > In logical replication, it can happen today as well without > time-delayed replication. Basically, say apply worker is waiting to > acquire some lock that is already acquired by some backend then it > will have the same behavior. I have not verified this, so you may want > to check

RE: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Amit, > > Firstly I considered 2, but I thought 1 seemed to be better. > > PSA the updated patch. > > > > I think even for logical replication we should check whether there is > any pending WAL (via pq_is_send_pending()) to be sent. Otherwise, what > is the point to send the done message?

Re: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Amit Kapila
On Tue, Dec 27, 2022 at 2:50 PM Amit Kapila wrote: > > On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu) > wrote: > > > > Thanks for checking my proposal! > > > > > - * Note that if we determine that there's still more data to send, this > > > - * function will return control to the

Re: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Amit Kapila
On Tue, Dec 27, 2022 at 1:44 PM Hayato Kuroda (Fujitsu) wrote: > > Thanks for checking my proposal! > > > - * Note that if we determine that there's still more data to send, this > > - * function will return control to the caller. > > + * Note that if we determine that there's still more data to

RE: Exit walsender before confirming remote flush in logical replication

2022-12-27 Thread Hayato Kuroda (Fujitsu)
Dear Dilip, Thanks for checking my proposal! > - * Note that if we determine that there's still more data to send, this > - * function will return control to the caller. > + * Note that if we determine that there's still more data to send or we are > in > + * the physical replication more, this

Re: Exit walsender before confirming remote flush in logical replication

2022-12-26 Thread Dilip Kumar
On Thu, Dec 22, 2022 at 11:16 AM Hayato Kuroda (Fujitsu) wrote: > In case of logical replication, however, we cannot support the use-case that > switches the role publisher <-> subscriber. Suppose same case as above, > additional > transactions are committed while doing step2. To catch up such

RE: Exit walsender before confirming remote flush in logical replication

2022-12-26 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san, > > Thus how about before entering an apply_delay, logrep worker sending a > > kind of crafted feedback, which reports commit_data.end_lsn as > > flushpos? A little tweak is needed in send_feedback() but seems to > > work.. > > Thanks for replying! I tested your saying but

RE: Exit walsender before confirming remote flush in logical replication

2022-12-23 Thread Hayato Kuroda (Fujitsu)
Dear Horiguchi-san, > Thus how about before entering an apply_delay, logrep worker sending a > kind of crafted feedback, which reports commit_data.end_lsn as > flushpos? A little tweak is needed in send_feedback() but seems to > work.. Thanks for replying! I tested your saying but it could not

Re: Exit walsender before confirming remote flush in logical replication

2022-12-22 Thread Amit Kapila
On Fri, Dec 23, 2022 at 7:51 AM Kyotaro Horiguchi wrote: > > At Thu, 22 Dec 2022 17:29:34 +0530, Ashutosh Bapat > wrote in > > On Thu, Dec 22, 2022 at 11:16 AM Hayato Kuroda (Fujitsu) > > wrote: > > > In case of logical replication, however, we cannot support the use-case > > > that > > >

Re: Exit walsender before confirming remote flush in logical replication

2022-12-22 Thread Kyotaro Horiguchi
At Thu, 22 Dec 2022 17:29:34 +0530, Ashutosh Bapat wrote in > On Thu, Dec 22, 2022 at 11:16 AM Hayato Kuroda (Fujitsu) > wrote: > > In case of logical replication, however, we cannot support the use-case that > > switches the role publisher <-> subscriber. Suppose same case as above, > >

Re: Exit walsender before confirming remote flush in logical replication

2022-12-22 Thread Ashutosh Bapat
On Thu, Dec 22, 2022 at 11:16 AM Hayato Kuroda (Fujitsu) wrote: > > Dear hackers, > (I added Amit as CC because we discussed in another thread) > > This is a fork thread from time-delayed logical replication [1]. > While discussing, we thought that we could extend the condition of walsender >

Exit walsender before confirming remote flush in logical replication

2022-12-21 Thread Hayato Kuroda (Fujitsu)
Dear hackers, (I added Amit as CC because we discussed in another thread) This is a fork thread from time-delayed logical replication [1]. While discussing, we thought that we could extend the condition of walsender shutdown[2][3]. Currently, walsenders delay the shutdown request until