Hi Amaury, Willy,
Thank you! That sounds good. I'll give 2.8-dev1 a try when I have the
chance (probably later this week or next).
Best,
Luke
Luke Seelenbinder
Founder, Stadia Maps
https://stadiamaps.com
On 1/9/23 09:41, Amaury Denoyelle wrote:
On Sat, Jan 07, 2023 at 02:22:01PM +0100, Willy Tarreau wrote:
Hi Luke,
On Sat, Jan 07, 2023 at 01:44:30PM +0100, Luke Seelenbinder wrote:
Hi list,
We've been running 2.7.1 on a subset of our edge servers with QUIC + HTTP/3
enabled, and we're seeing routine, but infrequent (~daily), crashes (mix of
SIGABRT / SIGSEGV). I have coredumps and there doesn't seem to be any common
thread across crashes / machines, but it's possible I'm missing something.
Two of the coredumps show the following backtrace:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x000055b0fe319ce7 in qc_release_frm (qc=0x55b101236570, frm=0x7fd8201fbbf0 <main_arena+112>) at src/quic_conn.c:1569
1569 pn = f->pkt->pn_node.key;

Program terminated with signal SIGSEGV, Segmentation fault.
#0 qc_release_frm (qc=0x5652aa588fc0, frm=0x5652aa2537d0) at src/quic_conn.c:1564
1564 list_for_each_entry_safe(f, tmp, &origin->reflist, ref) {
which seem similar enough to possibly share a common cause. The other
crashes occur in quictls (SIGABRT), htx.h (SIGSEGV), and ebtree.h (SIGSEGV).
Are there known fixes from 2.8-dev or internal trackers that could be
related? I can dig deeper, but for now I'll probably disable QUIC since that
seems to be the most likely culprit.
I'm seeing the following QUIC fix, which was merged right after 2.7.1
was released and which suggests potential crashes:
15337fd80 ("BUG/MEDIUM: mux-quic: fix double delete from qcc.opening_list")
So you might indeed be hitting that bug. If you're interested in giving
2.8-dev1 a try, it would confirm whether you're facing this exact issue.
At the moment we're not aware of any remaining crash-inducing bugs in
2.8-dev, so if it still fails for you, that would indicate a new,
unknown bug.
Luke, the crashes you reported are nearly identical to the ones I had
before I introduced the fix. Indeed, you should try 2.8-dev1 if you can
and report back on whether this has solved the issue.
Thanks for your help,