On Fri, Dec 4, 2020 at 7:22 PM Drouvot, Bertrand <bdrou...@amazon.com> wrote: > > Hi, > > On 12/4/20 2:21 AM, Fujii Masao wrote: > > > > On 2020/12/04 9:28, Masahiko Sawada wrote: > >> On Fri, Dec 4, 2020 at 2:54 AM Fujii Masao > >> <masao.fu...@oss.nttdata.com> wrote: > >>> > >>> > >>> > >>> On 2020/12/01 17:29, Drouvot, Bertrand wrote: > >>>> Hi, > >>>> > >>>> On 12/1/20 12:35 AM, Masahiko Sawada wrote: > >>>>> CAUTION: This email originated from outside of the organization. > >>>>> Do not click links or open attachments unless you can confirm the > >>>>> sender and know the content is safe. > >>>>> > >>>>> > >>>>> > >>>>> On Tue, Dec 1, 2020 at 3:25 AM Alvaro Herrera > >>>>> <alvhe...@alvh.no-ip.org> wrote: > >>>>>> On 2020-Dec-01, Fujii Masao wrote: > >>>>>> > >>>>>>> + if (proc) > >>>>>>> + { > >>>>>>> + if (nprocs == 0) > >>>>>>> + appendStringInfo(&buf, "%d", proc->pid); > >>>>>>> + else > >>>>>>> + appendStringInfo(&buf, ", %d", proc->pid); > >>>>>>> + > >>>>>>> + nprocs++; > >>>>>>> > >>>>>>> What happens if all the backends in wait_list have gone? In > >>>>>>> other words, > >>>>>>> how should we handle the case where nprocs == 0 (i.e., nprocs > >>>>>>> has not been > >>>>>>> incrmented at all)? This would very rarely happen, but can happen. > >>>>>>> In this case, since buf.data is empty, at least there seems no > >>>>>>> need to log > >>>>>>> the list of conflicting processes in detail message. > >>>>>> Yes, I noticed this too; this can be simplified by changing the > >>>>>> condition in the ereport() call to be "nprocs > 0" (rather than > >>>>>> wait_list being null), otherwise not print the errdetail. (You > >>>>>> could > >>>>>> test buf.data or buf.len instead, but that seems uglier to me.) > >>>>> +1 > >>>>> > >>>>> Maybe we can also improve the comment of this function from: > >>>>> > >>>>> + * This function also reports the details about the conflicting > >>>>> + * process ids if *wait_list is not NULL. > >>>>> > >>>>> to " This function also reports the details about the conflicting > >>>>> process ids if exist" or something. > >>>>> > >>>> Thank you all for the review/remarks. > >>>> > >>>> They have been addressed in the new attached patch version. > >>> > >>> Thanks for updating the patch! I read through the patch again > >>> and applied the following chages to it. Attached is the updated > >>> version of the patch. Could you review this version? If there is > >>> no issue in it, I'm thinking to commit this version. > >> > >> Thank you for updating the patch! I have one question. > >> > >>> > >>> + timeouts[cnt].id = STANDBY_TIMEOUT; > >>> + timeouts[cnt].type = TMPARAM_AFTER; > >>> + timeouts[cnt].delay_ms = DeadlockTimeout; > >>> > >>> Maybe STANDBY_TIMEOUT should be STANDBY_DEADLOCK_TIMEOUT here? > >>> I changed the code that way. > >> > >> As the comment of ResolveRecoveryConflictWithLock() says the > >> following, a deadlock is detected by the ordinary backend process: > >> > >> * Deadlocks involving the Startup process and an ordinary backend > >> proces > >> * will be detected by the deadlock detector within the ordinary > >> backend. > >> > >> If we use STANDBY_DEADLOCK_TIMEOUT, > >> SendRecoveryConflictWithBufferPin() will be called after > >> DeadlockTimeout passed, but I think it's not necessary for the startup > >> process in this case. > > > > Thanks for pointing this! You are right. > > > > > >> If we want to just wake up the startup process > >> maybe we can use STANDBY_TIMEOUT here? > > > Thanks for the patch updates! Except what we are still discussing below, > it looks good to me. > > > When STANDBY_TIMEOUT happens, a request to release conflicting buffer > > pins is sent. Right? If so, we should not also use STANDBY_TIMEOUT there? > > Agree > > > > > Or, first of all, we don't need to enable the deadlock timer at all? > > Since what we'd like to do is to wake up after deadlock_timeout > > passes, we can do that by changing ProcWaitForSignal() so that it can > > accept the timeout and giving the deadlock_timeout to it. If we do > > this, maybe we can get rid of STANDBY_LOCK_TIMEOUT from > > ResolveRecoveryConflictWithLock(). Thought?
Where do we enable deadlock timeout in hot standby case? You meant to enable it in ProcWaitForSignal() or where we set a timer for not hot standby case, in ProcSleep()? > > Why not simply use (again) the STANDBY_LOCK_TIMEOUT one? (as it triggers > a call to StandbyLockTimeoutHandler() which does nothing, except waking > up. That's what we want, right?) Right, what I wanted to mean is STANDBY_LOCK_TIMEOUT. The startup process can wake up and do nothing. Thank you for pointing out. Regards, -- Masahiko Sawada EnterpriseDB: https://www.enterprisedb.com/