On Tue, Mar 16, 2021 at 01:46:48PM +0100, Maciej Zdeb wrote:
> In the last message I said that the old process (after
> reload) is consuming cpu for lua processing and that's not true, it is
> processing other things also.
> 
> I'll take a break. ;) Then I'll verify if the issue exists on 2.3 and maybe
> 2.4 branch. For each version I need a week or two to be sure the issue does
> not occur. :(

Don't worry, we know how it is. On the developer side it's annoying to
have such bugs, but on the other hand, when bugs take one week to become
visible, it's likely that there aren't that many left, so this gives some
incentive to find the root cause :-)

The perf top you sent indicates that *something* is waking a task in
a loop. Unfortunately we don't know what. In 2.4 we have more tools to
observe this, so if you still see it there I can suggest some gdb
commands or a few patches to figure out exactly what is happening.

> If 2.3 and 2.4 behave the same way 2.2 does, I'll try to confirm whether
> there is any relation between infinite loops and custom configuration:
> - lua scripts (mainly used for header generation/manipulation),
> - spoe (used for sending metadata about each request to external service),
> - peers (we have a cluster of 12 HAProxy servers connected to each other).

Any of the 3 could be indirectly responsible, as the stopping state
sometimes has to be taken care of in special ways. The peers are a bit
less likely because they're used a lot, though that doesn't mean they're
immune to this. But Lua and SPOE are a bit less commonly deployed and
could still be facing some rare stupid corner cases that have not yet
been identified. Obviously I have no idea which ones :-/

Cheers,
Willy
