On Mon, Jun 20, 2022 at 06:50:11PM +0200, Anton Lindqvist wrote:
> On Sun, Jun 19, 2022 at 11:34:53AM +0000, Visa Hankala wrote:
> > On Fri, Jun 17, 2022 at 04:25:48PM +0300, Mikhail wrote:
> > > I was debugging tog in lldb and in second tmux window opened another
> > > bare tog instance, after a second I got this panic:
> > >
> > > panic: kernel diagnostic assetion "p->p_kq->kq_refcnt.r_refs == 1"
> > > failed file "/usr/src/sys/kern/kern_event.c", line 839
> > >
> > > There were also couple of xterms and chrome launched.
> > >
> > > There was an update of kern_event.c from 12 Jun - not sure if it's the
> > > fix for this panic or not.
> > >
> > > After the panic I updated to the latest snapshot, and can't reproduce it
> > > anymore, but maybe someone will have a clue.
> >
> > The 12 Jun kern_event.c commit is unrelated.
> >
> > This report shows no kernel stack trace, so I don't know if the panic
> > was caused by some unexpected thread exit path.
> >
> > However, it looks that there is a problem with kqueue_task(). Even
> > though the task holds a reference to the kqueue, the task should be
> > cleared before the kqueue is deleted. Otherwise, the kqueue's lifetime
> > can extend beyond that of the file descriptor table, causing
> > a use-after-free in KQRELE(). In addition, the task clearing should
> > avoid the unexpected reference count in kqpoll_exit().
> >
> > The lifetime bug can be lured out by adding a brief sleep between
> > taskq_next_work() and (*work.t_func)(work.t_arg) in taskq_thread().
> > With the sleep in place, regress/sys/kern/kqueue causes the following
> > panic:
> >
> > panic: pool_do_get: fdescpl free list modified: page 0xfffffd811cf0e000;
> > item addr 0xfffffd811cf0e888; offset 0x48=0xdead4113
> > Stopped at      db_enter+0x10:  popq    %rbp
> >     TID    PID    UID     PRFLAGS     PFLAGS  CPU  COMMAND
> > *338246  40644   1001    0x100003          0    3K make
> > db_enter() at db_enter+0x10
> > panic(ffffffff81f841ac) at panic+0xbf
> > pool_do_get(ffffffff823c2fb8,9,ffff8000226dc904) at pool_do_get+0x35c
> > pool_get(ffffffff823c2fb8,9) at pool_get+0x96
> > fdcopy(ffff8000ffff13d0) at fdcopy+0x38
> > process_new(ffff8000ffff9500,ffff8000ffff13d0,1) at process_new+0x107
> > fork1(ffff8000ffff6d30,1,ffffffff81a6eab0,0,ffff8000226dcb00,0) at fork1+0x236
> > syscall(ffff8000226dcb70) at syscall+0x374
> > Xsyscall() at Xsyscall+0x128
> > end of kernel
> > end trace frame: 0x7f7fffff28e0, count: 6
>
> syzkaller discovered the same panic with a reproducer available. I have
> not tried it myself, yet.
>
> https://syzkaller.appspot.com/bug?id=40ba6706cc749e6df21236a0da62fcfaa4f0e4f4

Indeed, the syzkaller report is about the same issue.
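
For illustration only, the ordering problem described in the quoted text
can be modelled in user space. This is a rough sketch, not the kernel
code: every name below (struct fdtable, struct kq, kq_release(),
kq_task_clear(), teardown(), and so on) is a hypothetical stand-in, and
the assumption that the final release touches the file descriptor table
is taken from the description of the use-after-free in KQRELE() above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the file descriptor table owning the kqueue. */
struct fdtable {
	bool freed;
};

/* Hypothetical stand-in for a reference-counted kqueue. */
struct kq {
	struct fdtable *fdp;	/* only safe to use while the table is alive */
	int refs;		/* one for the owner, one for a pending task */
	bool task_pending;
	bool destroyed;
	bool touched_freed_fdp;	/* records the modelled use-after-free */
};

void
kq_init(struct kq *kq, struct fdtable *fdp)
{
	kq->fdp = fdp;
	kq->refs = 1;
	kq->task_pending = false;
	kq->destroyed = false;
	kq->touched_freed_fdp = false;
}

/* Drop one reference; the last release is assumed to touch the fd table. */
void
kq_release(struct kq *kq)
{
	assert(kq->refs > 0);
	if (--kq->refs > 0)
		return;
	if (kq->fdp->freed)
		kq->touched_freed_fdp = true;
	kq->destroyed = true;
}

/* Schedule deferred work; the pending task holds its own reference. */
void
kq_task_add(struct kq *kq)
{
	if (!kq->task_pending) {
		kq->task_pending = true;
		kq->refs++;
	}
}

/* Cancel a pending task and drop the reference it was holding. */
void
kq_task_clear(struct kq *kq)
{
	if (kq->task_pending) {
		kq->task_pending = false;
		kq_release(kq);
	}
}

/* The task finally runs (possibly late) and drops its reference. */
void
kq_task_run(struct kq *kq)
{
	if (kq->task_pending) {
		kq->task_pending = false;
		kq_release(kq);
	}
}

/*
 * Model a teardown with a task still queued. Returns true if the last
 * release happened after the fd table was freed.
 */
bool
teardown(bool clear_task_first)
{
	struct fdtable fdt = { .freed = false };
	struct kq kq;

	kq_init(&kq, &fdt);
	kq_task_add(&kq);

	if (clear_task_first)
		kq_task_clear(&kq);
	kq_release(&kq);	/* owner deletes the kqueue */
	fdt.freed = true;	/* fd table goes away afterwards */
	kq_task_run(&kq);	/* delayed task runs, if still pending */

	return kq.touched_freed_fdp;
}
```

In this model, teardown(false) leaves the task's reference outstanding
until after the table is freed, so the final release touches freed
state. teardown(true) clears the task first, so the owner's release is
the last one and happens while the table is still valid.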