Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Borislav Petkov
On Thu, Oct 27, 2016 at 05:03:13PM -0400, Bob Peterson wrote: > I rebooted the machine with and without your patch, about 15 times > each, and no failures. Not sure why I got it the first time. Must have > been a one-off. Ok, thanks for giving it a try! -- Regards/Gruss, Boris. ECO tip

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Bob Peterson
- Original Message - | I mean, it would be great if you try a couple times but even if you're | unsuccessful, that's fine too - the fix is obviously correct and I've | confirmed that it boots fine in my VM here. Hi Boris, I rebooted the machine with and without your patch, about 15 times

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Borislav Petkov
On Thu, Oct 27, 2016 at 02:51:30PM -0400, Bob Peterson wrote: > I couldn't recreate that first boot failure, even using .config.old, > and even after removing (rm -fR) my linux.git and untarring it from the > original tarball, doing a make clean, etc. Hmm, so it could also depend on the

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Bob Peterson
| Here's a fix which works here - I'd appreciate it if you ran it and | checked the microcode was applied correctly, i.e.: | | $ dmesg | grep -i microcode | | before and after the patch. Please paste that output in a mail too. Hi Borislav, Sorry it's taken me so long. I've been having issues.

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Peter Zijlstra
On Thu, Oct 27, 2016 at 10:07:42AM +0100, Mel Gorman wrote: > > Something like so could work I suppose, but then there's a slight > > regression in the page_unlock() path, where we now do an unconditional > > spinlock; iow. we lose the unlocked waitqueue_active() test. > > > > I can't convince

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Mel Gorman
On Thu, Oct 27, 2016 at 11:44:49AM +0200, Peter Zijlstra wrote: > On Thu, Oct 27, 2016 at 10:07:42AM +0100, Mel Gorman wrote: > > > Something like so could work I suppose, but then there's a slight > > > regression in the page_unlock() path, where we now do an unconditional > > > spinlock; iow. we

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Peter Zijlstra
On Thu, Oct 27, 2016 at 12:07:26AM +0100, Mel Gorman wrote: > > but I consider PeterZ's > > patch the fix to that, so I wouldn't worry about it. > > > > Agreed. Peter, do you plan to finish that patch? I was waiting for you guys to hash out the 32bit issue. But if we're now OK with having this

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Nicholas Piggin
On Thu, 27 Oct 2016 10:08:52 +0200 Peter Zijlstra wrote: > On Thu, Oct 27, 2016 at 12:07:26AM +0100, Mel Gorman wrote: > > > but I consider PeterZ's > > > patch the fix to that, so I wouldn't worry about it. > > > > > > > Agreed. Peter, do you plan to finish that patch? > > I was waiting

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Borislav Petkov
On Wed, Oct 26, 2016 at 08:37:25PM -0400, Bob Peterson wrote: > Attached, but as Linus suggested, I turned off the AMD microcode driver, > so it should be the same if you turn it back on. If you want, I can > do it and re-send so you have a more pristine .config. Let me know. Thanks, but I was

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-27 Thread Mel Gorman
On Thu, Oct 27, 2016 at 10:08:52AM +0200, Peter Zijlstra wrote: > On Thu, Oct 27, 2016 at 12:07:26AM +0100, Mel Gorman wrote: > > > but I consider PeterZ's > > > patch the fix to that, so I wouldn't worry about it. > > > > > > > Agreed. Peter, do you plan to finish that patch? > > I was waiting

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Borislav Petkov
On Wed, Oct 26, 2016 at 05:01:24PM -0400, Bob Peterson wrote: > Hm. It didn't even boot, at least on my amd box in the lab. > I've made no attempt to debug this. Btw, can you send me your .config so that I can try to reproduce? I'm assuming you're booting latest Linus' tree on it? I'd need to

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Mel Gorman
On Wed, Oct 26, 2016 at 03:09:41PM -0700, Linus Torvalds wrote: > On Wed, Oct 26, 2016 at 3:03 PM, Mel Gorman > wrote: > > > > To be clear, are you referring to PeterZ's patch that avoids the lookup? If > > so, I see your point. > > Yup, that's the one. I think you tested it. In fact, I'm sure

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Borislav Petkov
On Wed, Oct 26, 2016 at 02:30:49PM -0700, Linus Torvalds wrote: > Ok, similar issue, I think - passing a non-1:1 address to __phys_addr(). > > But the call trace has nothing to do with gfs2 or the bitlocks: > > > [2.504561] Call Trace: > > [2.507005]

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 3:03 PM, Mel Gorman wrote: > > To be clear, are you referring to PeterZ's patch that avoids the lookup? If > so, I see your point. Yup, that's the one. I think you tested it. In fact, I'm sure you did, because I remember seeing performance numbers from you ;) So yes,

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Mel Gorman
On Wed, Oct 26, 2016 at 02:26:57PM -0700, Linus Torvalds wrote: > On Wed, Oct 26, 2016 at 1:31 PM, Mel Gorman > wrote: > > > > IO wait activity is not all that matters. We hit the lock/unlock paths > > during a lot of operations like reclaim. > > I doubt we do. > > Yes, we hit the lock/unlock

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 2:01 PM, Bob Peterson wrote: > > Hm. It didn't even boot, at least on my amd box in the lab. > I've made no attempt to debug this. Hmm. Looks like a completely independent issue from the patch. Did you try booting that machine without the patch? > [2.378877] kernel

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 1:31 PM, Mel Gorman wrote: > > IO wait activity is not all that matters. We hit the lock/unlock paths > during a lot of operations like reclaim. I doubt we do. Yes, we hit the lock/unlock itself, but do we hit the *contention*? The current code is nasty, and always ends

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Bob Peterson
- Original Message - | On Wed, Oct 26, 2016 at 11:04 AM, Bob Peterson wrote: | > | > I can test it for you, if you give me about an hour. | | I can definitely wait an hour, it would be lovely to see more testing. | Especially if you have a NUMA machine and an interesting workload. | |

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Mel Gorman
On Wed, Oct 26, 2016 at 10:15:30AM -0700, Linus Torvalds wrote: > On Wed, Oct 26, 2016 at 9:32 AM, Linus Torvalds > wrote: > > > > Quite frankly, I think the solution is to just rip out all the insane > > zone crap. > > IOW, something like the attached. > > Advantage: > > - just look at the

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Bob Peterson
- Original Message - | On Wed, Oct 26, 2016 at 11:04 AM, Bob Peterson wrote: | > | > I can test it for you, if you give me about an hour. Sorry. I guess I underestimated the time it takes to build a kernel on my test box. It will take a little longer, but it's compiling now. | I can

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 11:04 AM, Bob Peterson wrote: > > I can test it for you, if you give me about an hour. I can definitely wait an hour, it would be lovely to see more testing. Especially if you have a NUMA machine and an interesting workload. And if you actually have that NUMA machine and

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Bob Peterson
- Original Message - | On Wed, Oct 26, 2016 at 10:15 AM, Linus Torvalds | wrote: | > | > Oh, and the patch is obviously entirely untested. I wouldn't want to | > ruin my reputation by *testing* the patches I send out. What would be | > the fun in that? | | So I tested it. It compiles,

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 10:15 AM, Linus Torvalds wrote: > > Oh, and the patch is obviously entirely untested. I wouldn't want to > ruin my reputation by *testing* the patches I send out. What would be > the fun in that? So I tested it. It compiles, and it actually also solves the performance

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 9:32 AM, Linus Torvalds wrote: > > Quite frankly, I think the solution is to just rip out all the insane > zone crap. IOW, something like the attached. Advantage: - just look at the number of garbage lines removed! 21 insertions(+), 182 deletions(-) - it will

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Linus Torvalds
On Wed, Oct 26, 2016 at 8:51 AM, Andy Lutomirski wrote: >> >> I get the following BUG with 4.9-rc2, CONFIG_VMAP_STACK and >> CONFIG_DEBUG_VIRTUAL turned on: >> >> kernel BUG at arch/x86/mm/physaddr.c:26! > > const struct zone *zone = page_zone(virt_to_page(word)); > > If the stack is vmalloced,

Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Andy Lutomirski
On Wed, Oct 26, 2016 at 5:51 AM, Andreas Gruenbacher wrote: > Hi, > > CONFIG_VMAP_STACK has broken gfs2 and I'm trying to figure out what's > going on. What I'm seeing is the following: on a fresh gfs2 filesystem > created with: > > mkfs.gfs2 -p lock_nolock $DEVICE > > I get the following BUG

CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

2016-10-26 Thread Andreas Gruenbacher
Hi, CONFIG_VMAP_STACK has broken gfs2 and I'm trying to figure out what's going on. What I'm seeing is the following: on a fresh gfs2 filesystem created with: mkfs.gfs2 -p lock_nolock $DEVICE I get the following BUG with 4.9-rc2, CONFIG_VMAP_STACK and CONFIG_DEBUG_VIRTUAL turned on: kernel
