> Date: Mon, 21 Nov 2022 20:28:35 +0100
> From: Alexander Bluhm <alexander.bl...@gmx.net>
> 
> Hi,
> 
> Some of my test machines hang while booting userland.
> 
> starting network
> -> here it hangs
> load: 0.02  cmd: ifconfig 81303 [sbar] 0.00u 0.15s 0% 78k
> 
> ddb shows these two processes.
> 
>  81303  375320  89140      0  3         0x3  sbar          ifconfig
>  48135  157353      0      0  3     0x14200  netlock       systqmp
> 
> ddb{0}> trace /t 0t375320
> sleep_finish(ffff800022d31318,1) at sleep_finish+0xfe
> cond_wait(ffff800022d313b0,ffffffff81f15e9d) at cond_wait+0x54
> sched_barrier(ffff800022512ff0) at sched_barrier+0x73
> ixgbe_stop(ffff800000118000) at ixgbe_stop+0x1f7
> ixgbe_init(ffff800000118000) at ixgbe_init+0x32
> ixgbe_ioctl(ffff800000118048,8020690c,ffff80000022ec00) at ixgbe_ioctl+0x13a
> in_ifinit(ffff800000118048,ffff80000022ec00,ffff800022d31740,1) at 
> in_ifinit+0x
> ef
> in_ioctl_change_ifaddr(8040691a,ffff800022d31730,ffff800000118048,1) at 
> in_ioct
> l_change_ifaddr+0x3a4
> in_control(fffffd81901dc740,8040691a,ffff800022d31730,ffff800000118048) at 
> in_c
> ontrol+0x75
> ifioctl(fffffd81901dc740,8040691a,ffff800022d31730,ffff800022d60000) at 
> ifioctl
> +0x982
> sys_ioctl(ffff800022d60000,ffff800022d31840,ffff800022d318a0) at 
> sys_ioctl+0x2c
> 4
> syscall(ffff800022d31910) at syscall+0x384
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7ffffd94a0, count: -13
> 
> ddb{0}> trace /t 0t157353
> sleep_finish(ffff800022ca8b70,1) at sleep_finish+0xfe
> rw_enter(ffffffff822b4f80,1) at rw_enter+0x1cb
> pf_purge(0) at pf_purge+0x1d
> taskq_thread(ffffffff822ac568) at taskq_thread+0x100
> end trace frame: 0x0, count: -4
> 
> ifconfig waits for the sched_barrier_task() on the systqmp task
> queue while holding the netlock.  pf_purge() runs on the systqmp
> task queue and is waiting for the netlock.  The netlock has been
> taken by ifconfig in in_ioctl_change_ifaddr().
> 
> The problem has been introduced when pf_purge() was moved from systq
> to systqmp.
> https://marc.info/?l=openbsd-cvs&m=166818274216800&w=2

I'd say pfpurge should be moved to itw own taskq.

ixgb(4) holding netlock while calling sched_barrier() is probably
wrong too.

Reply via email to