Hi Christopher,

Thanks! I'm building a patched version and will return with feedback!

Kind regards,

On Fri, 19 Mar 2021 at 16:40, Christopher Faulet <cfau...@haproxy.com>
wrote:

> On 16/03/2021 at 13:46, Maciej Zdeb wrote:
> > Sorry for the spam. In my last message I said that the old process (after
> > reload) is consuming CPU for Lua processing, and that's not true: it is
> > processing other things as well.
> >
> > I'll take a break. ;) Then I'll verify whether the issue exists on the 2.3
> > and maybe 2.4 branches. For each version I need a week or two to be sure
> > the issue does not occur. :(
> >
> > If 2.3 and 2.4 behave the same way 2.2 does, I'll try to confirm whether
> > there is any relation between the infinite loops and our custom
> > configuration:
> > - Lua scripts (mainly used for header generation/manipulation),
> > - SPOE (used for sending metadata about each request to an external
> >   service),
> > - peers (we have a cluster of 12 HAProxy servers connected to each other).
> >
> Hi Maciej,
>
> I've read your backtraces more carefully and, indeed, it seems to be related
> to Lua processing. I don't know if the watchdog is triggered because of Lua
> or if it is just a side effect. But the Lua execution is interrupted inside
> the memory allocator, and malloc/realloc are not async-signal-safe.
> Unfortunately, when the Lua stack is dumped, the same allocator is also used.
> At this stage, because a lock was not properly released, HAProxy enters a
> deadlock.
>
> On the other threads, we loop in the watchdog, waiting for our turn to dump
> the thread information, and that explains the 100% CPU usage you observe.
>
> So, to prevent this situation, the Lua stack must not be dumped if execution
> was interrupted inside an unsafe part. It is the easiest way we found to
> work around this bug. And because it is pretty rare, it should be good enough.
>
> However, I'm unable to reproduce the bug. Could you please test the attached
> patches? I attached patches for 2.4, 2.3 and 2.2. Because you experienced
> this bug on 2.2, it is probably easier to test the patches for that version.
>
> If that fixes it, it would be good to figure out why the watchdog is
> triggered on your old processes.
>
> --
> Christopher Faulet
>
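
For anyone following the thread, here is a minimal sketch of the workaround idea described above: mark the region around the allocator as unsafe, and have the signal handler skip the dump when it fires inside that region. This is not the code from the attached patches. All names (in_unsafe_section, guarded_alloc, watchdog_handler, demo) are hypothetical, SIGUSR1 merely stands in for the watchdog signal, and the "dump" is reduced to a counter.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not HAProxy's actual identifiers. */
static volatile sig_atomic_t in_unsafe_section = 0; /* 1 while inside the allocator */
static volatile sig_atomic_t dumps_done = 0;        /* dumps that were performed */
static volatile sig_atomic_t dumps_skipped = 0;     /* dumps that were refused */

/* Stand-in for the watchdog handler: refuse to dump if the interrupted
 * code was inside the allocator, since a dump would allocate again. */
static void watchdog_handler(int sig)
{
    (void)sig;
    if (in_unsafe_section) {
        dumps_skipped++;
        return;
    }
    dumps_done++; /* the real handler would dump the Lua stack here */
}

/* Allocation wrapper that flags the unsafe region. When interrupt_inside
 * is non-zero, the signal is raised mid-allocation to simulate the
 * watchdog firing at the worst possible moment. */
static void *guarded_alloc(size_t size, int interrupt_inside)
{
    void *p;

    in_unsafe_section = 1;
    if (interrupt_inside)
        raise(SIGUSR1);
    p = malloc(size);
    in_unsafe_section = 0;
    return p;
}

/* Exercise both paths; returns 0 when both behaved as intended. */
int demo(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = watchdog_handler;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR1, &sa, NULL);

    free(guarded_alloc(64, 1)); /* signal inside the allocator: dump skipped */
    raise(SIGUSR1);             /* signal at a safe point: dump performed */

    assert(dumps_skipped == 1);
    assert(dumps_done == 1);
    return 0;
}
```

Calling demo() from a small main() exercises both cases. The trade-off is the one mentioned in the quoted mail: a dump is occasionally lost, but the handler never re-enters the allocator while its lock may still be held.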
