Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Willy Tarreau
On Wed, May 27, 2020 at 11:48:05AM +1000, Igor Cicimov wrote:
> Hi Willy,
> 
> On Tue, May 26, 2020 at 4:43 PM Willy Tarreau wrote:
> 
> > On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > > We are getting segfaults with haproxy 1.8.25
> >
> > By the way, does this mean you didn't get them with a previous version
> > (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and
> > 1.8.25, only 23.
> >
> 
> Yes, it started happening recently for some reason, definitely on 1.8.25 only:

OK, very useful!

> Unfortunately (in context of figuring out the issue) we are not using lua
> :-/

So I can hardly see a candidate here, unless we have another latent bug
that was woken up by one of the recent fixes.

> One thing I noticed though was that there was an OCSP file that had landed
> by mistake inside the SSL directory HAP is loading the certificates from.
> Do you think something like that can cause this to happen over the course
> of time? I don't but thought worth mentioning since that was the only diff
> I could see from our standard config elsewhere.

I don't think so, but thanks for mentioning it. We never know, and any lead
should be considered!

Cheers,
Willy



Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Igor Cicimov
Hi Willy,

On Tue, May 26, 2020 at 4:43 PM Willy Tarreau wrote:

> On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > We are getting segfaults with haproxy 1.8.25
>
> By the way, does this mean you didn't get them with a previous version
> (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and
> 1.8.25, only 23.
>

Yes, it started happening recently for some reason, definitely on 1.8.25 only:

# zgrep -i segfault /var/log/syslog.*.gz
/var/log/syslog.4.gz:May 23 00:36:52 ip-172-31-37-74 kernel: [30284682.620567] haproxy[14736]: segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in libc-2.19.so[7f1b9388e000+1be000]
/var/log/syslog.5.gz:May 22 01:18:55 ip-172-31-37-74 kernel: [30200805.498707] haproxy[7361]: segfault at 5575725c8fff ip 7f7d35bd4c10 sp 7b58a078 error 4 in libc-2.19.so[7f7d35b3a000+1be000]
/var/log/syslog.5.gz:May 22 12:15:55 ip-172-31-37-74 kernel: [30240225.673643] haproxy[15054]: segfault at 555f41a03fff ip 7f594d00fc10 sp 7ffeac111c98 error 4 in libc-2.19.so[7f594cf75000+1be000]
/var/log/syslog.6.gz:May 21 12:08:13 ip-172-31-37-74 kernel: [30153363.801627] haproxy[28398]: segfault at 55b4b43ccfff ip 7f5f33b53c10 sp 7ffe7fc290e8 error 4 in libc-2.19.so[7f5f33ab9000+1be000]
/var/log/syslog.7.gz:May 20 16:04:54 ip-172-31-37-74 kernel: [30081165.011057] haproxy[4830]: segfault at 563a387fafff ip 7fa11e6f2c10 sp 7fffaacd82c8 error 4 in libc-2.19.so[7fa11e658000+1be000]
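
As a quick cross-check of the log lines above, the following Python sketch
decodes one kernel segfault line. It subtracts the mapping base from the
instruction pointer to get the offset inside the shared object, and splits
the x86 page-fault error code into its three low bits. The function name
decode_segfault and the regular expression are illustrative assumptions,
not part of any haproxy or kernel tooling.

```python
import re

# Matches the kernel's "segfault at ... in obj[base+size]" message.
# All numeric fields, including the error code, are printed in hex.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>\S+?)\s*\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the faulting object, the ip offset in it, and error-code bits."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)
    return {
        "object": m.group("obj"),
        # Offset of the faulting instruction relative to the mapping base.
        "offset": int(m.group("ip"), 16) - int(m.group("base"), 16),
        "present": bool(err & 1),  # 0: page not present, 1: protection fault
        "write": bool(err & 2),    # 0: read access, 1: write access
        "user": bool(err & 4),     # 1: fault happened in user mode
    }


line = ("segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 "
        "error 4 in libc-2.19.so[7f1b9388e000+1be000]")
info = decode_segfault(line)
print(info["object"], hex(info["offset"]))  # libc-2.19.so 0x9ac10
```

Applied to the five entries above, this gives the same offset, 0x9ac10,
inside libc-2.19.so each time, and error 4 decodes as a user-mode read of a
page that is not present. If debug symbols for that libc build are
available, a tool such as addr2line may be able to map the offset to a
function name.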


>
> The only one among them that I can see possibly having a
> side effect in unclear parts of the code would be this one:
>
>   3d69a6029 ("BUG/MINOR: lua: Ignore the reserve to know if a channel is
>   full or not")
>
> Do you use some Lua code which would involve the is_full() attribute on
> a channel ?
>
> Willy
>

Unfortunately (in context of figuring out the issue) we are not using lua
:-/
One thing I noticed though was that there was an OCSP file that had landed
by mistake inside the SSL directory HAP is loading the certificates from.
Do you think something like that can cause this to happen over the course
of time? I don't but thought worth mentioning since that was the only diff
I could see from our standard config elsewhere.

Thanks,
Igor


Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Igor Cicimov
Hi Willy,

On Tue, May 26, 2020 at 4:31 PM Willy Tarreau wrote:

> Hi Igor,
>
> On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > Hi guys,
> >
> > We are getting segfaults with haproxy 1.8.25 and thought I would ask if
> > this rings any bell:
> >
> > segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
> > libc-2.19.so[7f1b9388e000+1be000]
>
> At this point, no unfortunately. This could be a memcpy() on a NULL
> pointer or a use after free for example.
>
> > It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
> > only on this particular one out of many dozens we have on Ubuntu-14.04 and
> > 16.04
> >
> > I have attached strace, so there should be more details upon the next crash.
>
> I doubt you'll see much more using strace. You'd be better off attaching
> gdb to it and letting it run. This way, when it crashes again, you can
> issue "bt full" and see the whole trace.
>
>
Done. Hopefully I get something useful on the next segfault.

> It is even possible to force a core to be dumped from gdb for later
> inspection using "generate-core-file". Some people also know how to script
> it so that it automatically dumps and detaches upon crash, and limits the
> service interruption time, but I never remember how to do this, and the
> help embedded in it is next to nonexistent :-/
>

Nice, good to know, thanks. I will dig around for details.

>
> Regards,
> Willy
>

Cheers,
Igor


Re: Haproxy 1.8.25 segfault

2020-05-26 Thread Willy Tarreau
Hi Igor,

On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> Hi guys,
> 
> We are getting segfaults with haproxy 1.8.25 and thought I would ask if
> this rings any bell:
> 
> segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
> libc-2.19.so[7f1b9388e000+1be000]

At this point, no unfortunately. This could be a memcpy() on a NULL
pointer or a use after free for example.

> It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
> only on this particular one out of many dozens we have on Ubuntu-14.04 and
> 16.04
> 
> I have attached strace, so there should be more details upon the next crash.

I doubt you'll see much more using strace. You'd be better off attaching
gdb to it and letting it run. This way, when it crashes again, you can
issue "bt full" and see the whole trace.

It is even possible to force a core to be dumped from gdb for later
inspection using "generate-core-file". Some people also know how to script
it so that it automatically dumps and detaches upon crash, and limits the
service interruption time, but I never remember how to do this, and the
help embedded in it is next to nonexistent :-/
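
For reference, one possible way to script the dump-and-detach approach is
sketched below in Python. It only builds a gdb command line that attaches in
batch mode, resumes the process, and runs "bt full", "generate-core-file"
and "detach" once the process stops on a signal. The helper name
gdb_autodump_argv, the PID, the core path and the choice of SIGPIPE as a
signal to pass through silently are all assumptions for illustration. It has
not been validated against haproxy, and other signals the process normally
receives may need the same "handle" treatment.

```python
import shlex


def gdb_autodump_argv(pid, core_path, quiet_signals=("SIGPIPE",)):
    """Build a gdb command line that dumps a core and detaches on a crash."""
    argv = ["gdb", "-p", str(pid), "-batch"]
    for sig in quiet_signals:
        # Let routine signals reach the process without stopping in gdb.
        argv += ["-ex", "handle %s nostop noprint pass" % sig]
    argv += [
        "-ex", "continue",                          # run until gdb stops the process
        "-ex", "bt full",                           # full backtrace at the stop point
        "-ex", "generate-core-file " + core_path,   # save a core for later inspection
        "-ex", "detach",                            # release the process
    ]
    return argv


# Hypothetical PID and output path, for illustration only.
argv = gdb_autodump_argv(14736, "/tmp/haproxy.core")
print(" ".join(shlex.quote(a) for a in argv))
```

The printed line can be reviewed and then run by hand. In batch mode gdb
exits after the last command, which should keep the interruption short.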

Regards,
Willy



Haproxy 1.8.25 segfault

2020-05-23 Thread Igor Cicimov
Hi guys,

We are getting segfaults with haproxy 1.8.25 and thought I would ask if
this rings any bell:

segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
libc-2.19.so[7f1b9388e000+1be000]

It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
only on this particular one out of many dozens we have on Ubuntu-14.04 and
16.04

I have attached strace, so there should be more details upon the next crash.

Thanks,
Igor