Hi William,

On Mon, Apr 27, 2020 at 08:34:48PM +0200, William Dauchy wrote:
> If this can help, the issue is possibly between dev5 and dev6. We have
> been following quite closely the dev releases, but we had to revert
> from dev6 to dev5 as it produced issues on our side - where it is
> perfectly running fine on dev5.

Just to let you know, we finally nailed the issue down to a change I
made to mux-h1 at the end of dev4 (so dev5 is affected as well). Only
front connections dying in a dirty way were affected, i.e. users
disappearing from the net after they got a response: in that case the
timeout was not rearmed.

I'm not aware of anything past dev5 which could condition its
occurrence. As such, it's very likely that you have this bug as well,
since I don't see what could work around it, but your workload makes
you very unlikely to encounter it.

I've pushed the fix below now:

  ca39747dcf ("BUG/MEDIUM: mux-h1: make sure we always have a timeout on front connections")

Next time you rebuild, it could make sense to apply it (even on top of
your dev5) to improve stability. If by any chance you retest dev6, I'd
be interested in knowing whether you still get problems with it, of
course :-)

Thanks,
Willy
