Hello, I checked the backtrace of a crashed dhcpd running on 4.4.1-2.1ubuntu5.
(gdb) info threads
  Id   Target Id                          Frame
* 1    Thread 0x7fb4ddecb700 (LWP 3170)   __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
  2    Thread 0x7fb4dd6ca700 (LWP 3171)   __lll_lock_wait (futex=futex@entry=0x7fb4de6d2028, private=0) at lowlevellock.c:52
  3    Thread 0x7fb4de6cc700 (LWP 3169)   futex_wake (private=<optimized out>, processes_to_wake=1, futex_word=<optimized out>) at ../sysdeps/nptl/futex-internal.h:364
  4    Thread 0x7fb4de74f740 (LWP 3148)   futex_wait_cancelable (private=<optimized out>, expected=0, futex_word=0x7fb4de6cd0d0) at ../sysdeps/nptl/futex-internal.h:183

(gdb) frame 2
#2  0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist, cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52

(gdb) bt
#1  0x00007fb4deaa7859 in __GI_abort () at abort.c:79
#2  0x00007fb4dec85985 in isc_assertion_failed (file=file@entry=0x7fb4decd8878 "../../../../lib/isc/unix/socket.c", line=line@entry=3361, type=type@entry=isc_assertiontype_insist, cond=cond@entry=0x7fb4decda033 "!sock->pending_send") at ../../../lib/isc/assertions.c:52
#3  0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
#4  process_fd (writeable=<optimized out>, readable=<optimized out>, fd=11, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4054
#5  process_fds (writefds=<optimized out>, readfds=0x7fb4de6d1090, maxfd=13, manager=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4211
#6  watcher (uap=0x7fb4de6d0010) at ../../../../lib/isc/unix/socket.c:4397
#7  0x00007fb4dea68609 in start_thread (arg=<optimized out>) at pthread_create.c:477
#8  0x00007fb4deba4103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

(gdb) frame 3
#3  0x00007fb4decc17e1 in dispatch_send (sock=0x7fb4de6d4990) at ../../../../lib/isc/unix/socket.c:4041
4041    in ../../../../lib/isc/unix/socket.c

(gdb) p sock->pending_send
$2 = 1

The code is crashing on this assertion:
https://gitlab.isc.org/isc-projects/bind9/-/blob/v9_11_3/lib/isc/unix/socket.c#L3364

This was already reported, and marked as fixed in Debian (?), via [0]:

"""
Now if a wakeup event occurres the socket would be dispatched for processing regardless which kind of event (timer?) triggered the wakeup. At least I did not find any sanity checks in process_fds() except SOCK_DEAD(sock).

This leads to the following situation: The sock is not dead yet but it is still pending when it is dispatched again.

I would now check sock->pending_send before calling dispatch_send(). This would at least prevent the assertion failure - well knowing that the situation described above ( not dead but still pending and alerting ) is not a very pleasant one - until someone comes up with a better solution.
"""

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=430065#20

** Follow-up questions:

0) The reproducer doesn't seem to be consistent; the crash appears to be related to a race condition associated with an internal timer/futex.

1) Can anyone confirm that a pristine upstream 4.4.1 doesn't reproduce the issue?

** Bug watch added: Debian Bug tracker #430065
   https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=430065

https://bugs.launchpad.net/bugs/1872118

Title:
  DHCP Cluster crashes after a few hours
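
For reference, the workaround suggested in the Debian report quoted above (check sock->pending_send before calling dispatch_send()) could look roughly like the sketch below. This is only an illustration under stated assumptions: struct demo_sock, demo_dispatch_send() and demo_process_writeable() are hypothetical stand-ins, not the real isc_socket_t, dispatch_send() or process_fd() from lib/isc/unix/socket.c, and the sketch ignores locking and threads entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the socket state.
 * This is NOT the real isc_socket_t. */
struct demo_sock {
    bool dead;          /* plays the role of SOCK_DEAD(sock) */
    bool pending_send;  /* a send event is already queued for this socket */
    int  dispatched;    /* number of send events queued, for illustration */
};

/* Stand-in for dispatch_send(): like the real code, it insists that
 * no send event is pending when it is called. */
void demo_dispatch_send(struct demo_sock *sock)
{
    assert(!sock->pending_send);  /* mirrors INSIST(!sock->pending_send) */
    sock->pending_send = true;
    sock->dispatched++;
}

/* Stand-in for the "writeable" branch of process_fd(), with the guard
 * proposed in the Debian report. Returns true if a send was dispatched. */
bool demo_process_writeable(struct demo_sock *sock)
{
    if (sock->dead)
        return false;             /* the only check the report found */
    if (sock->pending_send)
        return false;             /* proposed guard: skip, do not re-dispatch */
    demo_dispatch_send(sock);
    return true;
}
```

With the guard in place, a second wakeup for a socket whose send is still pending is skipped instead of tripping the assertion. As the Debian report itself notes, this only avoids the abort; it does not explain why the socket is still pending when it is dispatched again.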