[Bug 216304] Adding xn0 to bridge0 causes kernel panic

2017-01-19 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216304

            Bug ID: 216304
           Summary: Adding xn0 to bridge0 causes kernel panic
           Product: Base System
           Version: 11.0-RELEASE
          Hardware: amd64
                OS: Any
            Status: New
          Severity: Affects Some People
          Priority: ---
         Component: kern
          Assignee: freebsd-bugs@FreeBSD.org
          Reporter: m...@rootbsd.net
                CC: freebsd-am...@freebsd.org

We've encountered a kernel panic in FreeBSD 11.0-RELEASE when attempting to add
xn0 as a member of bridge0. The panic happens immediately after the command to
add xn0 to bridge0 is issued. Oddly, the panic does not occur on a system that
was upgraded in place from 10.3-RELEASE to 11.0-RELEASE before adding xn0 to
bridge0. It seems to affect only fresh 11.0-RELEASE installs.

All installs we've seen this issue on are virtual machines running on Xen 3.4.4
hypervisors. The virtual machine we upgraded from 10.3 to 11.0 (where adding
xn0 to bridge0 works fine) is also on a Xen 3.4.4 hypervisor.

Output of "uname -r" on 11.0-RELEASE vm with kernel panic issue:
11.0-RELEASE-p2

Output of "uname -r" on 11.0-RELEASE vm upgraded from 10.3-RELEASE without
kernel panic issue:
11.0-RELEASE-p2

Commands used on both servers to add bridge0 and then add xn0 to bridge0:
ifconfig bridge create
ifconfig bridge0 addm xn0

Output of "kgdb kernel.debug /var/crash/vmcore.0":

GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd"...

Unread portion of the kernel message buffer:
Sleeping thread (tid 100076, pid 831) owns a non-sleepable lock
KDB: stack backtrace of thread 100076:
#0 0x80ae46e2 at mi_switch+0xd2
#1 0x80b3279a at sleepq_timedwait+0x3a
#2 0x80ae4091 at _sleep+0x281
#3 0x8096e9c8 at xn_ioctl+0x5d8
#4 0x822194b3 at bridge_ioctl_add+0x4b3
#5 0x8221af8f at bridge_ioctl+0x29f
#6 0x80bdcbec at ifioctl+0xfbc
#7 0x80b41ab4 at kern_ioctl+0x2d4
#8 0x80b41771 at sys_ioctl+0x171
#9 0x80fa168e at amd64_syscall+0x4ce
#10 0x80f8442b at Xfast_syscall+0xfb
panic: sleeping thread
cpuid = 0
KDB: stack backtrace:
#0 0x80b24077 at kdb_backtrace+0x67
#1 0x80ad93e2 at vpanic+0x182
#2 0x80ad9253 at panic+0x43
#3 0x80b39a99 at propagate_priority+0x299
#4 0x80b3a59f at turnstile_wait+0x3ef
#5 0x80ab493d at __mtx_lock_sleep+0x13d
#6 0x8221d4c5 at bridge_output+0x75
#7 0x80be286e at ether_output+0x68e
#8 0x80c62fe7 at ip_output+0x16c7
#9 0x80cf593e at tcp_output+0x191e
#10 0x80d01396 at tcp_timer_rexmt+0x526
#11 0x80af325a at softclock_call_cc+0x18a
#12 0x80af37d4 at softclock+0x94
#13 0x80a9340f at intr_event_execute_handlers+0x20f
#14 0x80a93676 at ithread_loop+0xc6
#15 0x80a90055 at fork_exit+0x85
#16 0x80f8467e at fork_trampoline+0xe
Uptime: 2m27s
Dumping 85 out of 479 MB:..19%..38%..57%..76%..94%

Reading symbols from /boot/kernel/if_bridge.ko...done.
Loaded symbols for /boot/kernel/if_bridge.ko
Reading symbols from /boot/kernel/bridgestp.ko...done.
Loaded symbols for /boot/kernel/bridgestp.ko
#0  doadump (textdump=) at pcpu.h:221
221 __asm("movq %%gs:%1,%0" : "=r" (td)
(kgdb) backtrace
#0  doadump (textdump=) at pcpu.h:221
#1  0x80ad8e69 in kern_reboot (howto=260) at
/usr/src/sys/kern/kern_shutdown.c:366
#2  0x80ad941b in vpanic (fmt=, ap=) at /usr/src/sys/kern/kern_shutdown.c:759
#3  0x80ad9253 in panic (fmt=0x0) at
/usr/src/sys/kern/kern_shutdown.c:690
#4  0x80b39a99 in propagate_priority (td=) at
/usr/src/sys/kern/subr_turnstile.c:226
#5  0x80b3a59f in turnstile_wait (ts=,
owner=, queue=)
at /usr/src/sys/kern/subr_turnstile.c:742
#6  0x80ab493d in __mtx_lock_sleep (c=,
tid=18446735277668753408, opts=, 
file=, line=) at
/usr/src/sys/kern/kern_mutex.c:583
#7  0x8221d4c5 in bridge_output () from /boot/kernel/if_bridge.ko
#8  0x80be286e in ether_output (ifp=, m=, dst=0xf800033bd9b0, ro=)
at /usr/src/sys/net/if_ethersubr.c:407
#9  0x80c62fe7 in ip_output (m=0x0, opt=,
ro=, flags=, 
imo=, inp=) at
/usr/src/sys/netinet/ip_output.c:661
#10 0x80cf593e in tcp_output (tp=) at
/usr/src/sys/netinet/tcp_output.c:1422
#11 0x80d01396 in tcp_timer_rexmt (xtp=) at
/usr/src/sys/netinet/tcp_timer.c:812
#12 0x80af325a in softclock_call_cc (c=, cc=, direct=)
at /usr/src/sys/kern/kern_timeout.c:729
#13 0x80af37d4

[Bug 216304] Adding xn0 to bridge0 causes kernel panic

2017-01-20 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216304

Kristof Provost changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@freebsd.org

--- Comment #1 from Kristof Provost ---
Tests with WITNESS show:

xn0: performing interface reset due to feature change
uma_zalloc_arg: zone "64" with the following non-sleepable locks held:
exclusive sleep mutex if_bridge (if_bridge) r = 0 (0xf80003f0aa18) locked @
/usr/home/rootbsd/freebsd/sys/modules/if_bridge/../../net/if_bridge.c:826
stack backtrace:
#0 0x80a66720 at witness_debugger+0x70
#1 0x80a679c7 at witness_warn+0x3d7
#2 0x80cb00bb at uma_zalloc_arg+0x3b
#3 0x809f328c at malloc+0xfc
#4 0x80a57c45 at sbuf_new+0x95
#5 0x808eea18 at xs_rm+0x28
#6 0x808eb411 at xn_ioctl+0x211
#7 0x8221d4b3 at bridge_ioctl_add+0x4b3
#8 0x8221ef8f at bridge_ioctl+0x29f
#9 0x80ae577a at ifioctl+0x104a
#10 0x80a6ab64 at kern_ioctl+0x214
#11 0x80a6a8e1 at sys_ioctl+0x171
#12 0x80e35fd4 at amd64_syscall+0x314
#13 0x80e1c1ab at Xfast_syscall+0xfb

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
freebsd-bugs@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-bugs
To unsubscribe, send any mail to "freebsd-bugs-unsubscr...@freebsd.org"


[Bug 216304] Adding xn0 to bridge0 causes kernel panic

2017-01-20 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216304

Kristof Provost changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |roy...@freebsd.org

--- Comment #2 from Kristof Provost ---
I think what happens here is that the bridge code (through bridge_ioctl_add())
calls the Xen driver's ioctl() handler for SIOCSIFCAP, which through xn_ioctl()
calls xs_rm(xenbus_get_node(dev), "feature-gso-tcp4"), which tries to compose a
string with the sbuf functions, which use an M_WAITOK allocation.

That means that we can end up sleeping (because of malloc(M_WAITOK)) with the
bridge lock (a standard mutex) held.
Sleeping with a mutex held violates the locking rules, so WITNESS warns us
about this.

If we're unlucky enough that another thread actually tries to acquire the
bridge lock in the meantime (say because it wants to transmit a packet), we can
end up panicking.
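
The rule being broken can be modelled in userspace. The following is only a
minimal sketch, assuming a pthread mutex in place of the bridge lock and a
per-thread counter in place of WITNESS; every name in it (model_lock(),
alloc_may_sleep(), driver_set_caps(), add_member_locked() and so on) is
hypothetical and does not exist in the FreeBSD kernel. Calling the driver
before taking the lock is shown as one possible pattern, not as the fix for
this bug.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
static pthread_mutex_t bridge_lock = PTHREAD_MUTEX_INITIALIZER;
static _Thread_local int nonsleepable_held; /* locks held by this thread */
static _Thread_local int violations;        /* "sleeps" with a lock held */

static void model_lock(void)
{
    pthread_mutex_lock(&bridge_lock);
    nonsleepable_held++;
}

static void model_unlock(void)
{
    assert(nonsleepable_held > 0);
    nonsleepable_held--;
    pthread_mutex_unlock(&bridge_lock);
}

/* Stands in for malloc(M_WAITOK): an allocation that is allowed to sleep. */
static void *alloc_may_sleep(size_t n)
{
    if (nonsleepable_held > 0)
        violations++; /* the point where WITNESS would warn */
    return malloc(n);
}

/* Stands in for the driver's ioctl handler, which builds a string. */
static void driver_set_caps(void)
{
    void *buf = alloc_may_sleep(64);
    free(buf);
}

/* Pattern from the report: the driver is called with the lock held. */
int add_member_locked(void)
{
    violations = 0;
    model_lock();
    driver_set_caps();
    model_unlock();
    return violations;
}

/* One possible alternative: call the driver before taking the lock. */
int add_member_unlocked_call(void)
{
    violations = 0;
    driver_set_caps();
    model_lock();
    /* the member list would be updated here, under the lock */
    model_unlock();
    return violations;
}
```

In this model add_member_locked() records one violation and
add_member_unlocked_call() records none. Whether the real bridge code can
safely call the driver without its lock is a separate question that this sketch
does not answer.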

It's not obvious to me how this can be fixed, however. I'm cc-ing royger because
he touched the xen-netfront code at some point.

Perhaps we can allocate the strings that xs_rm() needs at device initialisation
time, but that would require the result of xenbus_get_node(dev) to be constant,
and I don't know if that's a valid assumption.
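
As a rough illustration of the preallocation idea, here is a userspace sketch
that builds the path once at attach time so that no allocation is needed later.
It assumes the node path really is constant (the open question above) and that
the path is formed as "node/key"; struct model_dev, model_attach(),
model_gso_path() and the example node "device/vif/0" are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-device state; not the real netfront softc. */
struct model_dev {
    char gso_path[128]; /* built once, while sleeping is still allowed */
};

/*
 * Build the path at attach time. Returns 0 on success, -1 if the
 * path does not fit in the fixed-size buffer.
 */
int model_attach(struct model_dev *d, const char *node)
{
    int n = snprintf(d->gso_path, sizeof(d->gso_path), "%s/%s",
        node, "feature-gso-tcp4");

    return (n > 0 && (size_t)n < sizeof(d->gso_path)) ? 0 : -1;
}

/* Later callers read the prebuilt string; nothing is allocated here. */
const char *model_gso_path(const struct model_dev *d)
{
    return d->gso_path;
}
```

This only removes the string allocation from the locked path. Any other
operation in the real call chain that may sleep would still need separate
treatment.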



[Bug 216304] Adding xn0 to bridge0 causes kernel panic

2017-01-21 Thread bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=216304

Mark Linimon changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|freebsd-am...@freebsd.org   |
           Assignee|freebsd-bugs@FreeBSD.org    |freebsd-...@freebsd.org
