Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-10 Thread Dave Chinner
On Thu, Oct 10, 2013 at 04:23:50PM +0800, Fengguang Wu wrote: > Dave, > > >> This is an easily reproducible bug. And I further confirmed it in > >> two ways: > >> > >> 1) turn off XFS, build 39 commits and boot them 2000+ times > >> > >> => no single mount error > > > >That doesn't tell you it is

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-10 Thread Fengguang Wu
Dave, >> This is an easily reproducible bug. And I further confirmed it in >> two ways: >> >> 1) turn off XFS, build 39 commits and boot them 2000+ times >> >> => no single mount error > >That doesn't tell you it is an XFS error. Absence of symptoms != >absence of bug. True. >> 2) turn off all o

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-10 Thread Dave Chinner
On Thu, Oct 10, 2013 at 02:03:34PM +0800, Fengguang Wu wrote: > On Thu, Oct 10, 2013 at 03:28:20PM +1100, Dave Chinner wrote: > > On Thu, Oct 10, 2013 at 11:38:34AM +0800, Fengguang Wu wrote: > > > On Thu, Oct 10, 2013 at 11:33:00AM +0800, Fengguang Wu wrote: > > > > On Thu, Oct 10, 2013 at 11:26:3

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-10 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 02:23:13PM +0800, Fengguang Wu wrote: > Dave, > > Here are the first oops chunks that show up in the 3.12-rc4 kernel > with only XFS built in. Attached are the kconfig and one full dmesg. > > Hope there are more clues in them. I'll further test whether the > problems disapp

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 03:28:20PM +1100, Dave Chinner wrote: > On Thu, Oct 10, 2013 at 11:38:34AM +0800, Fengguang Wu wrote: > > On Thu, Oct 10, 2013 at 11:33:00AM +0800, Fengguang Wu wrote: > > > On Thu, Oct 10, 2013 at 11:26:37AM +0800, Fengguang Wu wrote: > > > > Dave, > > > > > > > > > I note

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Dave Chinner
On Thu, Oct 10, 2013 at 11:38:34AM +0800, Fengguang Wu wrote: > On Thu, Oct 10, 2013 at 11:33:00AM +0800, Fengguang Wu wrote: > > On Thu, Oct 10, 2013 at 11:26:37AM +0800, Fengguang Wu wrote: > > > Dave, > > > > > > > I note that you have CONFIG_SLUB=y, which means that the cache slabs > > > > are

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 11:33:00AM +0800, Fengguang Wu wrote: > On Thu, Oct 10, 2013 at 11:26:37AM +0800, Fengguang Wu wrote: > > Dave, > > > > > I note that you have CONFIG_SLUB=y, which means that the cache slabs > > > are shared with objects of other types. That means that the memory > > > corr

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 11:26:37AM +0800, Fengguang Wu wrote: > Dave, > > > I note that you have CONFIG_SLUB=y, which means that the cache slabs > > are shared with objects of other types. That means that the memory > > corruption problem is likely to be caused by one of the other > > filesystems

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Fengguang Wu
Dave, > I note that you have CONFIG_SLUB=y, which means that the cache slabs > are shared with objects of other types. That means that the memory > corruption problem is likely to be caused by one of the other > filesystems that is probing the block device(s), not XFS. Good to know that, it would

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Dave Chinner
On Thu, Oct 10, 2013 at 09:41:17AM +0800, Fengguang Wu wrote: > On Thu, Oct 10, 2013 at 09:16:40AM +0800, Fengguang Wu wrote: > > On Thu, Oct 10, 2013 at 11:59:00AM +1100, Dave Chinner wrote: > > > [add x...@oss.sgi.com to cc] > > > > Thanks. > > > > To help debug the problem, I searched for XFS in m

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 09:16:40AM +0800, Fengguang Wu wrote: > On Thu, Oct 10, 2013 at 11:59:00AM +1100, Dave Chinner wrote: > > [add x...@oss.sgi.com to cc] > > Thanks. > > To help debug the problem, I searched for XFS in my tests' oops database > and found one kernel that failed 4 times (out of 12

Re: [XFS on bad superblock] BUG: unable to handle kernel NULL pointer dereference at 00000003

2013-10-09 Thread Fengguang Wu
On Thu, Oct 10, 2013 at 11:59:00AM +1100, Dave Chinner wrote: > [add x...@oss.sgi.com to cc] Thanks. To help debug the problem, I searched for XFS in my tests' oops database and found one kernel that failed 4 times (out of 12 total boots) with basically the same error: 4 BUG: sleeping function