Re: Disallow cancellation of waiting for synchronous replication

2021-04-23 Thread Andrey Borodin
Hi Aleksander! Thanks for looking into this. > On 23 Apr 2021, at 14:30, Aleksander Alekseev wrote: > > Hi hackers, > After using a patch for a while it became obvious that PANICing during termination is not a good idea. Even when we wait for synchronous replication.

Re: Disallow cancellation of waiting for synchronous replication

2021-04-23 Thread Aleksander Alekseev
Hi hackers, > >> After using a patch for a while it became obvious that PANICing during > >> termination is not a good idea. Even when we wait for synchronous > >> replication. It generates undesired coredumps. > >> I think in presence of SIGTERM it's reasonable to say that we cannot > >>

Re: Disallow cancellation of waiting for synchronous replication

2021-03-11 Thread Andrey Borodin
Thanks for looking into this! > On 11 March 2021, at 19:15, Fujii Masao wrote: > > > > On 2020/12/09 18:07, Andrey Borodin wrote: >>> On 9 June 2020, at 23:32, Jeff Davis wrote: >>> >>> >> After using a patch for a while it became obvious that PANICing during >> termination is

Re: Disallow cancellation of waiting for synchronous replication

2021-03-11 Thread Fujii Masao
On 2020/12/09 18:07, Andrey Borodin wrote: On 9 June 2020, at 23:32, Jeff Davis wrote: After using a patch for a while it became obvious that PANICing during termination is not a good idea. Even when we wait for synchronous replication. It generates undesired coredumps. I think

Re: Disallow cancellation of waiting for synchronous replication

2021-03-11 Thread David Steele
On 12/9/20 4:07 AM, Andrey Borodin wrote: On 9 June 2020, at 23:32, Jeff Davis wrote: After using a patch for a while it became obvious that PANICing during termination is not a good idea. Even when we wait for synchronous replication. It generates undesired coredumps. I think in

Re: Disallow cancellation of waiting for synchronous replication

2020-12-09 Thread Andrey Borodin
> On 9 June 2020, at 23:32, Jeff Davis wrote: > > After using a patch for a while it became obvious that PANICing during termination is not a good idea. Even when we wait for synchronous replication. It generates undesired coredumps. I think in presence of SIGTERM it's reasonable to

Re: Disallow cancellation of waiting for synchronous replication

2020-06-09 Thread Jeff Davis
On Sat, 2019-12-21 at 11:34 +0100, Marco Slot wrote: > The GUCs are not re-checked in the main loop in SyncRepWaitForLSN, so > backends will remain stuck there even if synchronous replication has > been (temporarily) disabled while they were waiting. If you do: alter system set
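
The point quoted above (a waiting backend does not re-check the GUCs, so it stays stuck even after synchronous replication is disabled) can be illustrated with a toy model. This is only a sketch under stated assumptions, not PostgreSQL code: `SyncRepConfig`, `wait_for_ack`, the `recheck_config` flag, and the return strings are all hypothetical names invented for illustration, and the `max_polls` bound exists only so the toy terminates.

```python
import threading


class SyncRepConfig:
    """Hypothetical stand-in for the synchronous replication settings."""

    def __init__(self, standby_names):
        self._lock = threading.Lock()
        self._standby_names = standby_names

    def set_standby_names(self, value):
        # Models a configuration reload changing the setting.
        with self._lock:
            self._standby_names = value

    def sync_rep_enabled(self):
        # An empty setting is treated here as "synchronous replication off".
        with self._lock:
            return bool(self._standby_names)


def wait_for_ack(config, ack_event, recheck_config,
                 poll_interval=0.01, max_polls=50):
    """Toy wait loop.

    Returns "acked" if the standby acknowledgement arrives, "released" if
    the loop notices that synchronous replication was turned off, and
    "stuck" if neither happens within max_polls iterations.
    """
    for _ in range(max_polls):
        if ack_event.wait(timeout=poll_interval):
            return "acked"
        # Without this re-check the loop can only end via an acknowledgement.
        if recheck_config and not config.sync_rep_enabled():
            return "released"
    return "stuck"
```

With synchronous replication switched off and no acknowledgement ever arriving, the loop that never re-reads the configuration keeps waiting, while the variant that re-checks it on each iteration returns.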

Re: Disallow cancellation of waiting for synchronous replication

2020-02-20 Thread Michail Nikolaev
Hello. Just want to share some thoughts about how it looks from perspective of a high availability web-service application developer. Because sometimes things look different from other sides. And everything looks like disaster to be honest. But let's take it one at a time. First - the problem

Re: Disallow cancellation of waiting for synchronous replication

2020-01-15 Thread Maksim Milyutin
On 15.01.2020 01:53, Andres Freund wrote: On 2020-01-12 16:18:38 +0500, Andrey Borodin wrote: On 11 Jan 2020, at 7:34, Bruce Momjian wrote: Actually, it might be worse than that. In my reading of RecordTransactionCommit(), we do this: write to WAL flush WAL (durable)

Re: Disallow cancellation of waiting for synchronous replication

2020-01-14 Thread Andres Freund
Hi, On 2020-01-12 16:18:38 +0500, Andrey Borodin wrote: > > On 11 Jan 2020, at 7:34, Bruce Momjian wrote: > > > > Actually, it might be worse than that. In my reading of > > RecordTransactionCommit(), we do this: > > > > write to WAL > > flush WAL (durable) > > make visible

Re: Disallow cancellation of waiting for synchronous replication

2020-01-12 Thread Andrey Borodin
> On 11 Jan 2020, at 7:34, Bruce Momjian wrote: > > Actually, it might be worse than that. In my reading of > RecordTransactionCommit(), we do this: > > write to WAL > flush WAL (durable) > make visible to other backends > replicate > communicate to the

Re: Disallow cancellation of waiting for synchronous replication

2020-01-10 Thread Bruce Momjian
On Thu, Jan 2, 2020 at 10:26:16PM +0500, Andrey Borodin wrote: > > > > On 2 Jan 2020, at 19:13, Robert Haas wrote: > > > > On Sun, Dec 29, 2019 at 4:13 AM Andrey Borodin wrote: > >> Not losing data is a nice property of the database, too. > > > > Sure, but there's more than one

Re: Disallow cancellation of waiting for synchronous replication

2020-01-02 Thread Andrey Borodin
> On 2 Jan 2020, at 19:13, Robert Haas wrote: > > On Sun, Dec 29, 2019 at 4:13 AM Andrey Borodin wrote: >> Not losing data is a nice property of the database, too. > > Sure, but there's more than one way to fix that problem, as I pointed > out in my first response. Sorry, it took

Re: Disallow cancellation of waiting for synchronous replication

2020-01-02 Thread Robert Haas
On Mon, Dec 30, 2019 at 9:39 AM Bruce Momjian wrote: > This gets to the heart of something I was hoping to discuss. When is > something committed? You would think it is when the client receives the > commit message, but Postgres can commit something, and try to inform the > client but fail to

Re: Disallow cancellation of waiting for synchronous replication

2020-01-02 Thread Robert Haas
On Sun, Dec 29, 2019 at 4:13 AM Andrey Borodin wrote: > Not losing data is a nice property of the database, too. Sure, but there's more than one way to fix that problem, as I pointed out in my first response. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise

Re: Disallow cancellation of waiting for synchronous replication

2019-12-30 Thread Bruce Momjian
On Sat, Dec 28, 2019 at 04:55:55PM -0500, Robert Haas wrote: > On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin wrote: > > Currently, we can have split brain with the combination of following steps: > > 0. Setup cluster with synchronous replication. Isolate primary from > > standbys. > > 1. Issue

Re: Disallow cancellation of waiting for synchronous replication

2019-12-29 Thread Andrey Borodin
> On 29 Dec 2019, at 4:54, Robert Haas wrote: > > On Sat, Dec 28, 2019 at 6:19 PM Maksim Milyutin wrote: >> The stuckness of backend is not deadlock here. To cancel waiting of >> backend fluently, client is enough to turn off synchronous replication >> (change synchronous_standby_names

Re: Disallow cancellation of waiting for synchronous replication

2019-12-28 Thread Robert Haas
On Sat, Dec 28, 2019 at 6:19 PM Maksim Milyutin wrote: > The stuckness of backend is not deadlock here. To cancel waiting of > backend fluently, client is enough to turn off synchronous replication > (change synchronous_standby_names through server reload) or change > synchronous replica to

Re: Disallow cancellation of waiting for synchronous replication

2019-12-28 Thread Maksim Milyutin
On 29.12.2019 00:55, Robert Haas wrote: On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin wrote: Currently, we can have split brain with the combination of following steps: 0. Setup cluster with synchronous replication. Isolate primary from standbys. 1. Issue upsert query INSERT .. ON CONFLICT

Re: Disallow cancellation of waiting for synchronous replication

2019-12-28 Thread Robert Haas
On Fri, Dec 20, 2019 at 12:04 AM Andrey Borodin wrote: > Currently, we can have split brain with the combination of following steps: > 0. Setup cluster with synchronous replication. Isolate primary from standbys. > 1. Issue upsert query INSERT .. ON CONFLICT DO NOTHING > 2. CANCEL 1 during wait

Re: Disallow cancellation of waiting for synchronous replication

2019-12-26 Thread Maksim Milyutin
On 25.12.2019 13:45, Andrey Borodin wrote: On 25 Dec 2019, at 15:28, Maksim Milyutin wrote: Synchronous replication does not guarantee that a committed write is actually on any replica, but it does in general guarantee that a commit has been replicated before sending a response to the

Re: Disallow cancellation of waiting for synchronous replication

2019-12-25 Thread Maksim Milyutin
On 25.12.2019 14:27, Marco Slot wrote: On Wed, Dec 25, 2019, 11:28 Maksim Milyutin wrote: But in this case locally committed data becomes visible to new incoming transactions that is bad side-effect of this issue. Your application should be

Re: Disallow cancellation of waiting for synchronous replication

2019-12-25 Thread Marco Slot
On Wed, Dec 25, 2019, 11:28 Maksim Milyutin wrote: > But in this case locally committed data becomes visible to new incoming > transactions that is bad side-effect of this issue. > Your application should be prepared for that in any case. At the point where synchronous replication waits, the

Re: Disallow cancellation of waiting for synchronous replication

2019-12-25 Thread Andrey Borodin
> On 25 Dec 2019, at 15:28, Maksim Milyutin wrote: > >> Synchronous replication >> does not guarantee that a committed write is actually on any replica, >> but it does in general guarantee that a commit has been replicated >> before sending a response to the client. That's arguably more

Re: Disallow cancellation of waiting for synchronous replication

2019-12-25 Thread Maksim Milyutin
On 21.12.2019 13:34, Marco Slot wrote: I do agree with the general sentiment that terminating the connection is preferable over sending a response to the client (except when synchronous replication was already disabled). But in this case locally committed data becomes visible to new incoming

Re: Disallow cancellation of waiting for synchronous replication

2019-12-25 Thread Maksim Milyutin
On 21.12.2019 00:19, Tom Lane wrote: There is still a problem when backend is not canceled, but terminated [2]. Exactly. If you don't have a fix that handles that case, you don't have anything. In fact, you've arguably made things worse, by increasing the temptation to terminate or "kill -9"

Re: Disallow cancellation of waiting for synchronous replication

2019-12-21 Thread Andrey Borodin
> On 21 Dec 2019, at 2:19, Tom Lane wrote: > > Andrey Borodin writes: >> I think proper solution here would be to add GUC to disallow cancellation of >> synchronous replication. > > This sounds entirely insane to me. There is no possibility that you > can prevent a failure from

Re: Disallow cancellation of waiting for synchronous replication

2019-12-21 Thread Marco Slot
On Fri, Dec 20, 2019 at 11:07 AM Andrey Borodin wrote: > I think changing synchronous_standby_names to some available standbys will > resume all backends waiting for synchronous replication. > Do we need to check necessity of synchronous replication in any other case? The GUCs are not

Re: Disallow cancellation of waiting for synchronous replication

2019-12-20 Thread Michael Paquier
On Fri, Dec 20, 2019 at 03:07:26PM +0500, Andrey Borodin wrote: >> Sending a cancellation is currently the only way to resume after >> disabling synchronous replication. Some HA solutions (e.g. >> pg_auto_failover) rely on this behaviour. Would it be worth checking >> whether synchronous

Re: Disallow cancellation of waiting for synchronous replication

2019-12-20 Thread Tom Lane
Andrey Borodin writes: > I think proper solution here would be to add GUC to disallow cancellation of > synchronous replication. This sounds entirely insane to me. There is no possibility that you can prevent a failure from occurring at this step. > There is still a problem when backend is

Re: Disallow cancellation of waiting for synchronous replication

2019-12-20 Thread Andrey Borodin
> On 20 Dec 2019, at 12:23, Marco Slot wrote: > > On Fri, Dec 20, 2019 at 6:04 AM Andrey Borodin wrote: >> I think proper solution here would be to add GUC to disallow cancellation of >> synchronous replication. Retry step 3 will wait on locks after hanging 1 and >> data will be

Re: Disallow cancellation of waiting for synchronous replication

2019-12-19 Thread Marco Slot
On Fri, Dec 20, 2019 at 6:04 AM Andrey Borodin wrote: > I think proper solution here would be to add GUC to disallow cancellation of > synchronous replication. Retry step 3 will wait on locks after hanging 1 and > data will be consistent. > There is still a problem when backend is not canceled,

Disallow cancellation of waiting for synchronous replication

2019-12-19 Thread Andrey Borodin
Hi, hackers! This is continuation of thread [0] in pgsql-general with proposed changes. As Maksim pointed out, this topic was raised before here [1]. Currently, we can have split brain with the combination of following steps: 0. Setup cluster with synchronous replication. Isolate primary from