Re: Haproxy 1.8.25 segfault
On Wed, May 27, 2020 at 11:48:05AM +1000, Igor Cicimov wrote:
> Hi Willy,
>
> On Tue, May 26, 2020 at 4:43 PM Willy Tarreau wrote:
>
> > On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > > We are getting segfaults with haproxy 1.8.25
> >
> > By the way, does this mean you didn't get them with a previous version
> > (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and
> > 1.8.25, only 23.
> >
>
> Yes, it started happening recently for some reason def on 1.8.25 only:

OK, very useful!

> Unfortunately (in context of figuring out the issue) we are not using lua
> :-/

So I'm hardly seeing a candidate here, unless we have another latent bug
that was woken up by one of the recent fixes.

> One thing I noticed though was that there was an OCSP file that had landed
> by mistake inside the SSL directory HAP is loading the certificates from.
> Do you think something like that can cause this to happen over the course
> of time? I don't but thought worth mentioning since that was the only diff
> I could see from our standard config elsewhere.

I don't think so, but thanks for mentioning it, we never know and any
track should be considered!

Cheers,
Willy
Re: Haproxy 1.8.25 segfault
Hi Willy,

On Tue, May 26, 2020 at 4:43 PM Willy Tarreau wrote:

> On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > We are getting segfaults with haproxy 1.8.25
>
> By the way, does this mean you didn't get them with a previous version
> (presumably 1.8.24) ? There aren't that many fixes between 1.8.24 and
> 1.8.25, only 23.
>

Yes, it started happening recently for some reason def on 1.8.25 only:

# zgrep -i segfault /var/log/syslog.*.gz
/var/log/syslog.4.gz:May 23 00:36:52 ip-172-31-37-74 kernel: [30284682.620567] haproxy[14736]: segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in libc-2.19.so [7f1b9388e000+1be000]
/var/log/syslog.5.gz:May 22 01:18:55 ip-172-31-37-74 kernel: [30200805.498707] haproxy[7361]: segfault at 5575725c8fff ip 7f7d35bd4c10 sp 7b58a078 error 4 in libc-2.19.so [7f7d35b3a000+1be000]
/var/log/syslog.5.gz:May 22 12:15:55 ip-172-31-37-74 kernel: [30240225.673643] haproxy[15054]: segfault at 555f41a03fff ip 7f594d00fc10 sp 7ffeac111c98 error 4 in libc-2.19.so [7f594cf75000+1be000]
/var/log/syslog.6.gz:May 21 12:08:13 ip-172-31-37-74 kernel: [30153363.801627] haproxy[28398]: segfault at 55b4b43ccfff ip 7f5f33b53c10 sp 7ffe7fc290e8 error 4 in libc-2.19.so [7f5f33ab9000+1be000]
/var/log/syslog.7.gz:May 20 16:04:54 ip-172-31-37-74 kernel: [30081165.011057] haproxy[4830]: segfault at 563a387fafff ip 7fa11e6f2c10 sp 7fffaacd82c8 error 4 in libc-2.19.so [7fa11e658000+1be000]

>
> The only one among them that I'm seeing capable of possibly having a
> side effect in unclear code parts would be this one:
>
> 3d69a6029 ("BUG/MINOR: lua: Ignore the reserve to know if a channel is
> full or not")
>
> Do you use some Lua code which would involve the is_full() attribute on
> a channel ?
>
> Willy
>

Unfortunately (in context of figuring out the issue) we are not using lua
:-/

One thing I noticed though was that there was an OCSP file that had landed
by mistake inside the SSL directory HAP is loading the certificates from.
Do you think something like that can cause this to happen over the course
of time? I don't but thought worth mentioning since that was the only diff
I could see from our standard config elsewhere.

Thanks,
Igor
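
A note on reading the kernel lines quoted in this message: "ip" is the faulting instruction address and the bracketed pair is the base address and size of the libc mapping, so "ip" minus the base gives the offset of the faulting instruction inside libc-2.19.so. On x86, "error 4" means a user-mode read of a page that is not mapped. The following is only a minimal sketch of that subtraction; the function name segv_offset is made up for illustration and is not part of any tool mentioned in the thread.

```shell
#!/bin/sh
# Sketch: given a kernel "segfault at ..." line, print the offset of the
# faulting instruction inside the mapped library (ip minus mapping base).
# segv_offset is a hypothetical helper name, not an existing command.

segv_offset() {
    line=$1
    # Instruction pointer: the hex value following " ip ".
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    # Mapping base: the hex value before "+" inside the last [base+size].
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*\].*/\1/p')
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    printf '0x%x\n' "$((0x$ip - 0x$base))"
}

segv_offset 'haproxy[14736]: segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in libc-2.19.so [7f1b9388e000+1be000]'
```

For the line above this prints 0x9ac10, and the other four lines give the same offset, so every crash happened at the same instruction in libc. Which libc function that offset belongs to cannot be told from the log alone; it would need the symbols of the exact libc build installed on that machine.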
Re: Haproxy 1.8.25 segfault
Hi Willy,

On Tue, May 26, 2020 at 4:31 PM Willy Tarreau wrote:

> Hi Igor,
>
> On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> > Hi guys,
> >
> > We are getting segfaults with haproxy 1.8.25 and thought I would ask if
> > this rings any bell:
> >
> > segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
> > libc-2.19.so[7f1b9388e000+1be000]
>
> At this point, no unfortunately. This could be a memcpy() on a NULL
> pointer or a use after free for example.
>
> > It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
> > only on this particular one out of many dozens we have on Ubuntu-14.04 and
> > 16.04
> >
> > I have attached strace so more details upon the next crash.
>
> I doubt you'll see much more using strace. You'd rather attach gdb to
> it and let it run. This way when it crashes again you can issue "bt full"
> and see the whole trace.
>

Done. Hopefully I get something useful on the next segfault.

> It is even possible to force a core to be dumped from gdb for later
> inspection using "generate-core-file". Some people also know how to script
> it so that it automatically dumps and detaches upon crash, and limits the
> service interruption time, but I never remember how to do this, and the
> help embedded in it is next to inexistent :-/
>

Nice, good to know thanks will dig around for details.

> Regards,
> Willy
>

Cheers,
Igor
Re: Haproxy 1.8.25 segfault
Hi Igor,

On Sun, May 24, 2020 at 10:35:10AM +1000, Igor Cicimov wrote:
> Hi guys,
>
> We are getting segfaults with haproxy 1.8.25 and thought I would ask if
> this rings any bell:
>
> segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
> libc-2.19.so[7f1b9388e000+1be000]

At this point, no unfortunately. This could be a memcpy() on a NULL
pointer or a use after free for example.

> It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
> only on this particular one out of many dozens we have on Ubuntu-14.04 and
> 16.04
>
> I have attached strace so more details upon the next crash.

I doubt you'll see much more using strace. You'd rather attach gdb to
it and let it run. This way when it crashes again you can issue "bt full"
and see the whole trace.

It is even possible to force a core to be dumped from gdb for later
inspection using "generate-core-file". Some people also know how to script
it so that it automatically dumps and detaches upon crash, and limits the
service interruption time, but I never remember how to do this, and the
help embedded in it is next to inexistent :-/

Regards,
Willy
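
One possible way to script the "dump and detach upon crash" idea from this message is gdb's batch mode with a command file. This is a rough sketch and not a tested recipe from the thread. The file paths, the watch_haproxy function name, the use of pidof, and the choice of signals to pass through are all assumptions for illustration. The gdb commands themselves (handle, continue, bt full, generate-core-file, detach) are standard.

```shell
#!/bin/sh
# Sketch: attach gdb to a running haproxy process and, when it stops on
# a fatal signal, save a full backtrace and a core file, then detach.
# Assumptions (not from the thread): all paths below, the function name
# watch_haproxy, and the list of signals passed through untouched.

GDB_CMDS="${TMPDIR:-/tmp}/haproxy-crash.gdb"

# gdb runs these commands in order; "continue" blocks until the process
# stops, so the lines after it only run once a signal such as SIGSEGV
# has interrupted the process.
cat > "$GDB_CMDS" <<'EOF'
set pagination off
handle SIGPIPE nostop noprint pass
handle SIGUSR1 nostop noprint pass
continue
bt full
generate-core-file /tmp/haproxy-crash.core
detach
quit
EOF

# Attach to the PID given as $1 and log everything gdb prints.
watch_haproxy() {
    gdb -p "$1" -batch -x "$GDB_CMDS" > /tmp/haproxy-crash.bt 2>&1
}

# Usage (as root, while haproxy is running):
#   watch_haproxy "$(pidof -s haproxy)"
```

The backtrace then ends up in /tmp/haproxy-crash.bt and the core in /tmp/haproxy-crash.core for later inspection. If haproxy runs as several processes, each PID would need its own gdb instance.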
Haproxy 1.8.25 segfault
Hi guys,

We are getting segfaults with haproxy 1.8.25 and thought I would ask if
this rings any bell:

segfault at 5609a853 ip 7f1b93928c10 sp 7ffd5e731fd8 error 4 in
libc-2.19.so[7f1b9388e000+1be000]

It is running on Ubuntu-14.04.2 (kernel 4.4.0-144-generic) and is happening
only on this particular one out of many dozens we have on Ubuntu-14.04 and
16.04

I have attached strace so more details upon the next crash.

Thanks,
Igor