On 29/01/18(Mon) 20:38, Artturi Alm wrote:
> On Mon, Jan 29, 2018 at 10:42:20AM +0100, Martin Pieuchot wrote:
> > Hello Artturi,
> >
> > On 28/01/18(Sun) 09:08, Artturi Alm wrote:
> > > >Synopsis: stuck in netlock
> > > >Category: amd64
> > > >Environment:
> > > System : OpenBSD 6.2
> > > Details : OpenBSD 6.2-current (GENERIC.MP) #333: Sun Jan 7
> > > 09:13:00 MST 2018
> > >
> > > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > >
> > > Architecture: OpenBSD.amd64
> > > Machine : amd64
> > > >Description:
> > > processes getting stuck w/STATE=netlock, kill has no effect.
> > > >How-To-Repeat:
> > > using the desktop normally, until trying to restart chrome ends
> > > up failing.
> >
> > What do you mean with "using the desktop normally"? Which applications
> > are you using? Which browser plugins? Can you find out the minimum
> > setup to reproduce this deadlock?
> >
> > > I've had this happen to me atleast twice in the last few of weeks.
> >
> > Do you know how to reproduce it easily?
> >
>
> this time i had less than 10tabs open, so i guess it can be narrowed
> down even further.
>
> > > At first time i noticed how trying to launch chrome did lock up
> > > all the other processes in netlock, and "pkill chrome" did allow
> > > the system to recover, i was unable to figure out what was wrong
> > > and rebooting did make everything work again, while ie.
> > > removing ~/.cache & ~/.config did not.
> >
> > So the deadlock is related to your chrome usage?
> >
>
> now it does feel like so. i'll upgrade tonight.
>
> > > long before running the "ps cl" below, i had already killed all
> > > the xterm-windows those processes were in. cwm(1) was unable to
> > > kill some of those, but xkill did not.
> >
> > Well killing process waiting for the 'netlock' won't help. What has to
> > be find is which process is holding it. For that we need the full ps
> > output, including kernel and userland threads.
> > >
> > > after exiting X w/ctrl+alt+backspace(iirc?) i didn't get back to
> > > $-prompt, and ^T did show xauth stuck in netlock..
> > > i guess it's obvious where it was heading; so i got pics of
> > > "# reboot -nq" failing because stuck in the fckng netlock -_-
> > >
> > > i do have ddb.{panic,console,log}=1, but
> > > "# sysctl ddb.trigger=1" ==
> > > "sysctl: ddb.trigger: Operation not supported by device"
> >
> > Not having DDB access will limit the debugging experience. Are you sure
> > you tried to enter it on your console?
> >
>
> so this requires ttyC0, right?
> this time it was ifconfig in [netlock], that prevented using ttyC0.
> i got there from X by running "virsh shutdown <domain" from the kvm host,
> i guess it emulates what pressing actual power button would(acpi?).
>
> > > so i had no option but "virsh reset <domain>"...
> >
> > Did you try top(1)? What were the kernel processes doing?
>
> see below, if "top -bCHS -d 1 999" should do.
> anything else i could do? anyway, thanks in advance:)

This is where the problem comes from:

> 33315 443734 -6 0 141M 102M idle viowait 0:00 0.00% chrome:

I don't understand how chrome can end up sleeping in vio_ioctl() or why
it sleeps forever. But this thread is holding the NET_LOCK(), which
prevents the rest of the kernel from making progress.

Could you try a virtual interface other than vio(4) and see if you
can still reproduce the problem?
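
For illustration only, the failure mode described above (one thread going
to sleep while it owns a lock that everything else needs) can be mimicked
in userland. This is a rough Python analogy under stated assumptions, not
OpenBSD kernel code: net_lock, io_done, holder and try_network_op are
made-up names, and the real NET_LOCK() and vio(4) paths behave differently
in detail.

```python
import threading

net_lock = threading.Lock()    # hypothetical stand-in for NET_LOCK()
io_done = threading.Event()    # hypothetical wakeup the sleeper waits for
holding = threading.Event()    # set once the holder owns the lock


def holder():
    # Take the lock, then sleep waiting for a wakeup, loosely like the
    # chrome thread that shows up in "viowait".
    with net_lock:
        holding.set()
        io_done.wait()         # blocks until someone delivers the wakeup


def try_network_op(timeout=0.2):
    # Any other thread that needs the lock; True if it was acquired.
    got = net_lock.acquire(timeout=timeout)
    if got:
        net_lock.release()
    return got


t = threading.Thread(target=holder, daemon=True)
t.start()
holding.wait()                 # make sure the holder really owns the lock

blocked = not try_network_op() # everyone else is stuck behind the sleeper
io_done.set()                  # deliver the missing wakeup
t.join()                       # holder wakes up and drops the lock
recovered = try_network_op()   # now the lock can be taken again

print(blocked, recovered)
```

Killing the waiters changes nothing in this model; only waking up (or
getting rid of) the holder lets the others proceed, which matches the
"pkill chrome" observation earlier in the thread.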