> Date: Mon, 21 Nov 2022 20:28:35 +0100
> From: Alexander Bluhm <alexander.bl...@gmx.net>
>
> Hi,
>
> Some of my test machines hang while booting userland.
>
> starting network
> -> here it hangs
> load: 0.02 cmd: ifconfig 81303 [sbar] 0.00u 0.15s 0% 78k
>
> ddb shows these two processes.
>
> 81303 375320 89140 0 3 0x3 sbar ifconfig
> 48135 157353 0 0 3 0x14200 netlock systqmp
>
> ddb{0}> trace /t 0t375320
> sleep_finish(ffff800022d31318,1) at sleep_finish+0xfe
> cond_wait(ffff800022d313b0,ffffffff81f15e9d) at cond_wait+0x54
> sched_barrier(ffff800022512ff0) at sched_barrier+0x73
> ixgbe_stop(ffff800000118000) at ixgbe_stop+0x1f7
> ixgbe_init(ffff800000118000) at ixgbe_init+0x32
> ixgbe_ioctl(ffff800000118048,8020690c,ffff80000022ec00) at ixgbe_ioctl+0x13a
> in_ifinit(ffff800000118048,ffff80000022ec00,ffff800022d31740,1) at in_ifinit+0xef
> in_ioctl_change_ifaddr(8040691a,ffff800022d31730,ffff800000118048,1) at in_ioctl_change_ifaddr+0x3a4
> in_control(fffffd81901dc740,8040691a,ffff800022d31730,ffff800000118048) at in_control+0x75
> ifioctl(fffffd81901dc740,8040691a,ffff800022d31730,ffff800022d60000) at ifioctl+0x982
> sys_ioctl(ffff800022d60000,ffff800022d31840,ffff800022d318a0) at sys_ioctl+0x2c4
> syscall(ffff800022d31910) at syscall+0x384
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7ffffd94a0, count: -13
>
> ddb{0}> trace /t 0t157353
> sleep_finish(ffff800022ca8b70,1) at sleep_finish+0xfe
> rw_enter(ffffffff822b4f80,1) at rw_enter+0x1cb
> pf_purge(0) at pf_purge+0x1d
> taskq_thread(ffffffff822ac568) at taskq_thread+0x100
> end trace frame: 0x0, count: -4
>
> ifconfig waits for the sched_barrier_task() on the systqmp task
> queue while holding the netlock. pf_purge() runs on the systqmp
> task queue and is waiting for the netlock. The netlock has been
> taken by ifconfig in in_ioctl_change_ifaddr().
>
> The problem has been introduced when pf_purge() was moved from systq
> to systqmp.
> https://marc.info/?l=openbsd-cvs&m=166818274216800&w=2

I'd say pf_purge() should be moved to its own taskq. ixgb(4) holding the netlock while calling sched_barrier() is probably wrong too.