Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-30 Thread Hugh Dickins
On Thu, 1 May 2014, Srivatsa S. Bhat wrote: > > I tried to recall the *exact* steps that I had carried out when I first > hit the bug. I realized that I had actually used kexec to boot the new > kernel. I had originally booted into a 3.7.7 kernel that happens to be > on that machine, and then

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-30 Thread Linus Torvalds
On Wed, Apr 30, 2014 at 12:16 PM, Srivatsa S. Bhat wrote: > > So I tried the same recipe again (boot into 3.7.7 and kexec into 3.15-rc3+) > and I got totally random crashes so far, once in sys_kill and two times in > exit_mmap. So I guess the bug is in 3.7.x and probably 3.15-rc is fine after >

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-30 Thread Srivatsa S. Bhat
On 05/01/2014 12:48 AM, Srivatsa S. Bhat wrote: > On 05/01/2014 12:46 AM, Srivatsa S. Bhat wrote: >> On 04/29/2014 03:29 PM, Srivatsa S. Bhat wrote: >>> On 04/29/2014 03:55 AM, Linus Torvalds wrote: On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso wrote: > > I think that returning

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-30 Thread Srivatsa S. Bhat
On 05/01/2014 12:46 AM, Srivatsa S. Bhat wrote: > On 04/29/2014 03:29 PM, Srivatsa S. Bhat wrote: >> On 04/29/2014 03:55 AM, Linus Torvalds wrote: >>> On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso wrote: I think that returning some stale/bogus vma is causing those segfaults in

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-29 Thread Oleg Nesterov
On 04/28, Davidlohr Bueso wrote:
>
> @@ -29,6 +30,7 @@ void use_mm(struct mm_struct *mm)
>  		tsk->active_mm = mm;
>  	}
>  	tsk->mm = mm;
> +	vmacache_flush(tsk);

But this can't help, we need to do this in unuse_mm(). And we can race with vmacache_flush_all() which relies

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-29 Thread Srivatsa S. Bhat
On 04/29/2014 05:41 AM, Davidlohr Bueso wrote: > On Mon, 2014-04-28 at 16:57 -0700, Linus Torvalds wrote: >> On Mon, Apr 28, 2014 at 4:11 PM, Andrew Morton >> wrote: >>> >>> unuse_mm() leaves current->mm at NULL so we'd hear about it pretty >>> quickly if a user task was running use_mm/unuse_mm.

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-29 Thread Srivatsa S. Bhat
On 04/29/2014 03:55 AM, Linus Torvalds wrote: > On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso wrote: >> >> I think that returning some stale/bogus vma is causing those segfaults >> in udev. It shouldn't occur in a normal scenario. What puzzles me is >> that it's not always reproducible. This

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-29 Thread Srivatsa S. Bhat
On 04/29/2014 05:30 AM, Dave Jones wrote: > On Tue, Apr 29, 2014 at 12:48:14AM +0530, Srivatsa S. Bhat wrote: > > Hi, > > > > I hit this during boot on v3.15-rc3, just once so far. > > Subsequent reboots went fine, and a few quick runs of multi-threaded > > ebizzy also didn't recreate the

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-29 Thread Srivatsa S. Bhat
On 04/29/2014 04:09 AM, Davidlohr Bueso wrote: > Adding Oleg. > > On Mon, 2014-04-28 at 14:55 -0700, Linus Torvalds wrote: >> On Mon, Apr 28, 2014 at 2:20 PM, Linus Torvalds >> wrote: >>> >>> That said, the bug does seem to be that some path doesn't invalidate >>> the vmacache sufficiently, or

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-29 Thread Srivatsa S. Bhat
On 04/29/2014 03:25 AM, Linus Torvalds wrote: > On Mon, Apr 28, 2014 at 2:20 PM, Linus Torvalds > wrote: >> >> That said, the bug does seem to be that some path doesn't invalidate >> the vmacache sufficiently, or something inserts a vmacache entry into >> the current process when looking up a

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Davidlohr Bueso
On Mon, 2014-04-28 at 16:57 -0700, Linus Torvalds wrote: > On Mon, Apr 28, 2014 at 4:11 PM, Andrew Morton > wrote: > > > > unuse_mm() leaves current->mm at NULL so we'd hear about it pretty > > quickly if a user task was running use_mm/unuse_mm. > > Yes. > > > I think so. Maybe it's time to

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Dave Jones
On Tue, Apr 29, 2014 at 12:48:14AM +0530, Srivatsa S. Bhat wrote: > Hi, > > I hit this during boot on v3.15-rc3, just once so far. > Subsequent reboots went fine, and a few quick runs of multi-threaded > ebizzy also didn't recreate the problem. > > The kernel I was running was v3.15-rc3

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Linus Torvalds
On Mon, Apr 28, 2014 at 4:11 PM, Andrew Morton wrote: > > unuse_mm() leaves current->mm at NULL so we'd hear about it pretty > quickly if a user task was running use_mm/unuse_mm. Yes. > I think so. Maybe it's time to cook up a debug patch for Srivatsa to > use? Dump the vma cache when the bug

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Andrew Morton
On Mon, 28 Apr 2014 15:58:02 -0700 Linus Torvalds wrote: > On Mon, Apr 28, 2014 at 3:39 PM, Davidlohr Bueso wrote: > > > > Is this perhaps a KVM guest? fwiw I see CONFIG_KVM_ASYNC_PF=y which is a > > user of use_mm(). > > So I tried to look through these guys, and that was one of the ones I

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Linus Torvalds
On Mon, Apr 28, 2014 at 3:39 PM, Davidlohr Bueso wrote: > > Is this perhaps a KVM guest? fwiw I see CONFIG_KVM_ASYNC_PF=y which is a > user of use_mm(). So I tried to look through these guys, and that was one of the ones I looked at. It's using use_mm(), but it's only called through

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Davidlohr Bueso
Adding Oleg. On Mon, 2014-04-28 at 14:55 -0700, Linus Torvalds wrote: > On Mon, Apr 28, 2014 at 2:20 PM, Linus Torvalds > wrote: > > > > That said, the bug does seem to be that some path doesn't invalidate > > the vmacache sufficiently, or something inserts a vmacache entry into > > the current

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Linus Torvalds
On Mon, Apr 28, 2014 at 3:14 PM, Davidlohr Bueso wrote: > > I think that returning some stale/bogus vma is causing those segfaults > in udev. It shouldn't occur in a normal scenario. What puzzles me is > that it's not always reproducible. This makes me wonder what else is > going on... I've

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Davidlohr Bueso
On Mon, 2014-04-28 at 15:05 -0700, Hugh Dickins wrote: > On Mon, 28 Apr 2014, Linus Torvalds wrote: > > On Mon, Apr 28, 2014 at 2:20 PM, Linus Torvalds > > wrote: > > > > > > That said, the bug does seem to be that some path doesn't invalidate > > > the vmacache sufficiently, or something inserts

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Hugh Dickins
On Mon, 28 Apr 2014, Linus Torvalds wrote: > On Mon, Apr 28, 2014 at 2:20 PM, Linus Torvalds > wrote: > > > > That said, the bug does seem to be that some path doesn't invalidate > > the vmacache sufficiently, or something inserts a vmacache entry into > > the current process when looking up a

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Linus Torvalds
On Mon, Apr 28, 2014 at 2:20 PM, Linus Torvalds wrote: > > That said, the bug does seem to be that some path doesn't invalidate > the vmacache sufficiently, or something inserts a vmacache entry into > the current process when looking up a remote process or whatever. > Davidlohr, ideas? Maybe we

Re: [BUG] kernel BUG at mm/vmacache.c:85!

2014-04-28 Thread Linus Torvalds
On Mon, Apr 28, 2014 at 12:18 PM, Srivatsa S. Bhat wrote: > > I hit this during boot on v3.15-rc3, just once so far. > Subsequent reboots went fine, and a few quick runs of multi-threaded > ebizzy also didn't recreate the problem. > > The kernel I was running was v3.15-rc3 + some totally >
