Hi Frank,

On Thu, Apr 05, 2018 at 09:41:25AM +0000, Frank Schreuder wrote:
> Hi Willy
> 
> >>>> There are very few abort() calls in the code :
> >>>>   - some in the thread debugging code to detect recursive locks ;
> >>>>   - one in the cache applet which triggers on an impossible case very
> >>>>     likely resulting from cache corruption (hence a bug)
> >>>>   - a few inside the Lua library
> >>>>   - a few in the HPACK decompressor, detecting a few possible bugs there
> 
> After playing around with some config changes we managed to not have haproxy
> throw the "worker <pid> exited with code 134" error for at least a day. Which
> is a long time as before we had this error at least 5 times a day...

Great!
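
For context, code 134 fits the abort() theory: by the usual shell
convention a status above 128 means "killed by signal (status - 128)",
and 134 - 128 = 6, which is SIGABRT on Linux. Below is a minimal sketch
of that arithmetic, assuming the convention applies to the worker exit
code reported here. describe_exit_code is a hypothetical helper for
illustration, not something from haproxy.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit status.

    Assumption: statuses above 128 encode "terminated by signal
    (code - 128)", as POSIX shells report them.
    """
    if code > 128:
        signum = code - 128
        try:
            # Map the number to a symbolic name using this platform's table.
            name = signal.Signals(signum).name
        except ValueError:
            # Unknown signal number on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited normally with status {code}"


if __name__ == "__main__":
    # 134 is the code quoted in the report above.
    print(describe_exit_code(134))
```

Signal numbers are platform-specific, so the name printed for 134
depends on where this runs.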

> The line we removed from our config to get this result was:
> compression algo gzip

Hmmm interesting.

> Could it be a locking issue in the compression code? I'm going to run a few
> more days without compression enabled, but for now this looks promising!

In fact, the locking is totally disabled when not using threads, so that
cannot be the cause. Also, most of the recently fixed bugs can only
be triggered with H2 or threads, neither of which you're using. I rechecked
the compression code to try to spot anything obvious, but nothing popped
out :-/

All I can strongly recommend if you retry with compression enabled is to
do it with the latest 1.8 release. I'm currently checking that I didn't
miss anything before issuing 1.8.6, hopefully today. If it still dies, this
will at least rule out the possible side effects of a few of the bugs we've
fixed since, all of which were really tricky.

Cheers,
Willy
