Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-12 Thread Linus Torvalds
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu wrote: > > OK. Here is the original faddr2line output: > > $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40 > vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 > > And below is call trace embedded with full fadd

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-09 Thread Cong Wang
On Thu, Nov 9, 2017 at 7:51 AM, Girish Moodalbail wrote: > > Upon receiving NETDEV_DOWN event, we are calling > > vlan_vid_del(dev, htons(ETH_P_8021Q), 0); > > which in turn calls call_rcu() to queue vlan_info_free_rcu() to be called at > some point. This free function frees the array[] >

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-09 Thread Girish Moodalbail
On 11/8/17 10:34 PM, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
On Thu, Nov 09, 2017 at 02:55:10PM +0800, Fengguang Wu wrote: On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is ther

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
On Thu, Nov 09, 2017 at 12:09:45PM +0800, Fengguang Wu wrote: On Thu, Nov 09, 2017 at 11:12:06AM +0800, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel with

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
On Wed, Nov 08, 2017 at 10:34:10PM -0800, Cong Wang wrote: On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet s

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
On Thu, Nov 09, 2017 at 10:43:08AM +0800, Fengguang Wu wrote: Of course, if it's bisectable, that would be great too. Yes, bisect is on the way. So far it's bisecting in the 4.12 commits. The bisect was unsuccessful due to an unrelated DRM_BOCHS oops in 4.11. Disabling the buggy driver, I man

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Cong Wang
On Wed, Nov 8, 2017 at 7:12 PM, Fengguang Wu wrote: > Hi Alex, > >> So looking over the trace the panic seems to be happening after a >> decnet interface is getting deleted. Is there any chance we could try >> compiling the kernel without decnet support to see if that is the >> source of these iss

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
On Thu, Nov 09, 2017 at 11:12:06AM +0800, Fengguang Wu wrote: Hi Alex, So looking over the trace the panic seems to be happening after a decnet interface is getting deleted. Is there any chance we could try compiling the kernel without decnet support to see if that is the source of these issues

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
Of course, if it's bisectable, that would be great too. Yes, bisect is on the way. So far it's bisecting in the 4.12 commits. The bisect was unsuccessful due to an unrelated DRM_BOCHS oops in 4.11. Disabling the buggy driver, I managed to reproduce the vlan_device_event bug in 4.11. However on

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Alexander Duyck
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu wrote: > Hi Linus, > > > On Wed, Nov 08, 2017 at 08:20:38AM -0800, Linus Torvalds wrote: >> >> On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu >> wrote: >>> >>> >>> Now I got the faddr2line output. :) >> >> >> Thank you, but this also shows that you then

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Linus Torvalds
On Wed, Nov 8, 2017 at 9:12 AM, Fengguang Wu wrote: > > OK. Here is the original faddr2line output: > > $ ~/linux/scripts/faddr2line vmlinux vlan_device_event+0x7f5/0xa40 > vlan_device_event+0x7f5/0xa40: > vlan_device_event at net/8021q/vlan.h:60 Hmm. Yes, that's not what I hoped for. > I notice

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
[...] I notice that this trace shows no additional inline files at all. Is it because I did some kconfig option wrong, so that inline info is lost? Eg. CONFIG_OPTIMIZE_INLINING=y (reading lib/Kconfig.debug, it looks better set to N) CONFIG_DEBUG_INFO_REDUCED=y CONFIG_DEBUG_INFO_SPLIT=y (full .co

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Fengguang Wu
Hi Linus, On Wed, Nov 08, 2017 at 08:20:38AM -0800, Linus Torvalds wrote: On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu wrote: Now I got the faddr2line output. :) Thank you, but this also shows that you then compress the output too much for convenience. [ 745.719623] BUG: unable to handle

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-08 Thread Linus Torvalds
On Wed, Nov 8, 2017 at 1:48 AM, Fengguang Wu wrote: > > Now I got the faddr2line output. :) Thank you, but this also shows that you then compress the output too much for convenience. > [ 745.719623] BUG: unable to handle kernel paging request at 6b6b6f4f > [ 745.732871] IP: vlan_device_event a

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-07 Thread Fengguang Wu
On Tue, Nov 07, 2017 at 08:25:03AM -0800, Linus Torvalds wrote: On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wu wrote: FYI this happens in v4.14-rc8 -- it's not necessarily a new bug. Probably not. Looks like a use-after-free bug in vlan_device_event() judging by the base pointer: ECX: 6b6

Re: [vlan_device_event] BUG: unable to handle kernel paging request at 6b6b6ccf

2017-11-07 Thread Linus Torvalds
On Tue, Nov 7, 2017 at 2:21 AM, Fengguang Wu wrote: > > FYI this happens in v4.14-rc8 -- it's not necessarily a new bug. Probably not. Looks like a use-after-free bug in vlan_device_event() judging by the base pointer: ECX: 6b6b6b6b this is one of those circumstances where having the faddr