On Tue, Apr 11, 2023 at 01:10:57PM -0700, Andres Freund wrote:
> On 2023-04-11 11:04:50 +0200, Drouvot, Bertrand wrote:
> > On 4/11/23 10:55 AM, Drouvot, Bertrand wrote:
> > > I think we might want to add:
> > > 
> > > $node_primary->wait_for_replay_catchup($node_standby);
> > > 
> > > before calling the slot creation.

> Pushed. Seems like a clear race in the test, so I didn't think it was worth
> waiting for testing it on hoverfly.

We'll see what happens in the next run.

> I think we should lower the log level, but perhaps wait for a few more cycles
> in case there are random failures?

Fine with me.

> I wonder if we should make the connections in poll_query_until to reduce
> verbosity - it's pretty annoying how much that can bloat the log. Perhaps also
> introduce some backoff? It's really annoying to have to trawl through all
> those logs when there's a problem.

Agreed.  My ranked wish list for poll_query_until is:

1. Exponential backoff
2. Closed-loop time control via Time::HiRes or similar, instead of assuming
   that ten loops complete in ~1s.  I've seen the loop take 3x as long as the
   intended timeout.
3. Connect less often than today, which reconnects once per probe
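
To make items 1 and 2 concrete, here is a rough sketch, not a patch. The
name poll_until, its option names and its defaults are all made up for
illustration; none of this is the PostgreSQL::Test::Cluster API. It assumes
the probe is a code ref returning true on success, and it uses wall-clock
time from Time::HiRes rather than a monotonic clock. Item 3 is out of scope
here, since it depends on how the probe reaches the server.

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper, for illustration only.  Calls $probe until it
# returns true or $timeout seconds of measured time have elapsed.
# Returns 1 on success, 0 on timeout.
sub poll_until
{
    my ($probe, %opt) = @_;
    my $timeout   = $opt{timeout}   // 180;     # seconds (assumed default)
    my $delay     = $opt{min_delay} // 0.01;    # first sleep, in seconds
    my $max_delay = $opt{max_delay} // 1.0;     # cap on any single sleep

    # Closed-loop control: compare against a measured deadline instead of
    # counting iterations and assuming how long each one takes.
    my $deadline = time() + $timeout;

    while (1)
    {
        return 1 if $probe->();

        my $remaining = $deadline - time();
        return 0 if $remaining <= 0;

        # Never sleep past the deadline.
        my $nap = $delay < $remaining ? $delay : $remaining;
        sleep($nap);

        # Exponential backoff, capped at $max_delay.
        $delay *= 2;
        $delay = $max_delay if $delay > $max_delay;
    }
}

# Usage: the probe here is a stand-in that succeeds on its fourth call.
my $calls = 0;
my $ok = poll_until(sub { return ++$calls >= 4 }, timeout => 5);
```

Because the deadline is checked against the clock after every probe, a slow
probe shortens the remaining budget instead of silently stretching the
total wait.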

