On 2020-08-06 18:55:58 -0400, Alvaro Herrera wrote:
> Ashutosh Bapat noticed that WalSndWaitForWal() is setting
> waiting_for_ping_response after sending a keepalive that does *not*
> request a reply.  The bad consequence is that other callers that do
> require a reply end up in not sending a keepalive, because they think it
> was already sent previously.  So the whole thing gets stuck.
>
> He found that commit 41d5f8ad734 failed to remove the setting of
> waiting_for_ping_response after changing the "request" parameter
> WalSndKeepalive from true to false; that seems to have been an omission
> and it breaks the algorithm.  Thread at [1].
>
> The simplest fix is just to remove the line that sets
> waiting_for_ping_response, but I think it is less error-prone to have
> WalSndKeepalive set the flag itself, instead of expecting its callers to
> do it (and know when to).  Patch attached.  Also rewords some related
> commentary.
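
For readers following along, here is a minimal standalone sketch of the failure mode and of the proposed shape of the fix. It is not the walsender code and not the attached patch: only the flag name waiting_for_ping_response comes from the description above, and every function name (send_keepalive, request_reply_if_needed, buggy_wait_path, fixed_wait_path, reply_requests_after) is a hypothetical stand-in made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for walsender state; not the real PostgreSQL code. */
static bool waiting_for_ping_response = false;
static int  reply_requests_sent = 0;    /* keepalives that asked for a reply */

/*
 * Send a keepalive.  Following the proposed fix, the function itself
 * records that a reply is pending, and only when one was requested.
 */
static void
send_keepalive(bool request_reply)
{
    if (request_reply)
    {
        reply_requests_sent++;
        waiting_for_ping_response = true;
    }
}

/* A caller that needs a reply; it skips sending if one looks pending. */
static void
request_reply_if_needed(void)
{
    if (!waiting_for_ping_response)
        send_keepalive(true);
}

/* Buggy pattern: a no-reply keepalive followed by a leftover flag write. */
static void
buggy_wait_path(void)
{
    send_keepalive(false);
    waiting_for_ping_response = true;   /* the line that should have gone */
}

/* Fixed pattern: the caller no longer touches the flag at all. */
static void
fixed_wait_path(void)
{
    send_keepalive(false);
}

/*
 * Run one wait path from a clean state, then let a reply-requiring caller
 * run, and report how many reply requests actually went out.
 */
static int
reply_requests_after(void (*wait_path) (void))
{
    assert(wait_path != NULL);
    waiting_for_ping_response = false;
    reply_requests_sent = 0;
    wait_path();
    request_reply_if_needed();
    return reply_requests_sent;
}
```

In this toy model, reply_requests_after(buggy_wait_path) yields 0 (the reply request is suppressed, so nothing ever clears the flag), while reply_requests_after(fixed_wait_path) yields 1.
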
Thanks for diagnosis and fix!

- Andres