On 10/20/2014 07:00 PM, Dave Jones wrote:
> On Fri, May 30, 2014 at 08:41:00AM -0700, Linus Torvalds wrote:
> > On Fri, May 30, 2014 at 8:25 AM, H. Peter Anvin wrote:
> > >
> > > If we removed struct thread_info from the stack allocation then one
> > > could do a guard page below the stack. Of course, we'd have to use IST
> > > for #PF in that case, which makes it a non-production option.
On Fri, May 30, 2014 at 08:41:00AM -0700, Linus Torvalds wrote:
> On Fri, May 30, 2014 at 8:25 AM, H. Peter Anvin wrote:
> >
> > If we removed struct thread_info from the stack allocation then one
> > could do a guard page below the stack. Of course, we'd have to use IST
> > for #PF in that case, which makes it a non-production option.
On Tue, Jun 3, 2014 at 6:28 AM, Rasmus Villemoes
wrote:
> Possibly stupid question: Is it true that any given task can only be
> using one wait_queue_t at a time?
Nope.
Being on multiple different wait-queues is actually very common. The
obvious case is select/poll, but there are others. The mor
Possibly stupid question: Is it true that any given task can only be
using one wait_queue_t at a time? If so, would it be an idea to put a
wait_queue_t into struct task_struct [maybe union'ed with a struct
> wait_bit_queue] and avoid allocating this 40-byte structure repeatedly
> on the stack?
E.g., i
On Sat, May 31, 2014 at 6:06 AM, Jens Axboe wrote:
> On 2014-05-28 20:42, Linus Torvalds wrote:
>>>
>>> Regardless of whether it is swap or something external queues the
>>> bio on the plug, perhaps we should look at why it's done inline
>>> rather than by kblockd, where it was moved because it wa
On Fri, May 30, 2014 at 08:06:53PM -0600, Jens Axboe wrote:
> On 2014-05-28 20:42, Linus Torvalds wrote:
> >Well, we've definitely had some issues with deeper callchains
> >with md, but I suspect virtio might be worse, and the new blk-mq code
> >is likely worse in this respect too.
>
> I don
On 2014-05-28 20:42, Linus Torvalds wrote:
Regardless of whether it is swap or something external queues the
bio on the plug, perhaps we should look at why it's done inline
rather than by kblockd, where it was moved because it was blowing
the stack from schedule():
So it sounds like we need to
On Thu, May 29, 2014 at 9:37 PM, Linus Torvalds
wrote:
>
> It really might be very good to create a "struct alloc_info" that
> contains those shared arguments, and just pass a (const) pointer to
> that around. [ .. ]
>
> Ugh. I think I'll try looking at that tomorrow.
I did look at it, but the th
Linus Torvalds writes:
> From a quick glance at the frame usage, some of it seems to be gcc
> being rather bad at stack allocation, but lots of it is just nasty
> spilling around the disgusting call-sites with tons of arguments. A
> _lot_ of the stack slots are marked as "%sfp" (which is gcc'ese
On 05/30/2014 10:24 AM, Dave Hansen wrote:
> On 05/30/2014 09:06 AM, Linus Torvalds wrote:
>> On Fri, May 30, 2014 at 8:52 AM, H. Peter Anvin wrote:
>>>> That said, it's still likely a non-production option due to the page
>>>> table games we'd have to play at fork/clone time.
>>>
>>> Still, seems
On 05/30/2014 09:06 AM, Linus Torvalds wrote:
> On Fri, May 30, 2014 at 8:52 AM, H. Peter Anvin wrote:
>>> That said, it's still likely a non-production option due to the page
>>> table games we'd have to play at fork/clone time.
>>
>> Still, seems much more tractable.
>
> We might be able to mak
On Fri, May 30, 2014 at 8:52 AM, H. Peter Anvin wrote:
>>
>> That said, it's still likely a non-production option due to the page
>> table games we'd have to play at fork/clone time.
>
> Still, seems much more tractable.
We might be able to make it more attractive by having a small
front-end cach
On 05/30/2014 08:41 AM, Linus Torvalds wrote:
> On Fri, May 30, 2014 at 8:25 AM, H. Peter Anvin wrote:
>>
>> If we removed struct thread_info from the stack allocation then one
>> could do a guard page below the stack. Of course, we'd have to use IST
>> for #PF in that case, which makes it a non-
On Fri, May 30, 2014 at 8:25 AM, H. Peter Anvin wrote:
>
> If we removed struct thread_info from the stack allocation then one
> could do a guard page below the stack. Of course, we'd have to use IST
> for #PF in that case, which makes it a non-production option.
We could just have the guard pag
On Fri, May 30, 2014 at 2:48 AM, Richard Weinberger
wrote:
>
> If we raise the stack size on x86_64 to 16k, what about i386?
> Besides the fact that most of you consider 32-bit dead and think it must die... ;)
x86-32 doesn't have nearly the same issue, since a large portion of
stack content tends to
On 05/29/2014 06:34 PM, Dave Chinner wrote:
>> ...
>> "kworker/u24:1 (94) used greatest stack depth: 8K bytes left; it means
>> there is some horrible stack hog in your kernel. Please report it
>> to LKML and enable stacktrace to investigate who the culprit is"
>
> That, however, presumes that a u
On Thu, May 29, 2014 at 5:24 PM, Linus Torvalds
wrote:
> So I'm not in fact arguing against Minchan's patch of upping
> THREAD_SIZE_ORDER to 2 on x86-64, but at the same time stack size does
> remain one of my "we really need to be careful" issues, so while I am
> basically planning on applying th
On Thu, May 29, 2014 at 06:24:02PM -0700, Linus Torvalds wrote:
> On Thu, May 29, 2014 at 5:50 PM, Minchan Kim wrote:
> >>
> >> You could also try Dave's patch, and _not_ do my mm/vmscan.c part.
> >
> > Sure. While I write this, Rusty's test crashed, so I will try Dave's
> > patch, then yours except the vmscan.c part.
Final result,
I tested the machine with the patch below (Dave's suggestion plus some
parts I modified) and I couldn't see the problem any more (tested for 4
hours; I will queue it on the machine over the weekend for a
long-running test if I don't get a more enhanced version before leaving
the office today), but as I reported
On Thu, May 29, 2014 at 7:12 PM, Minchan Kim wrote:
>
> Interim report,
>
> And the result is as follows: it reduces stack usage by about 800 bytes
> compared to my first report, but usage still seems high.
> The VM functions really need a diet.
Yes. And in this case uninlining things might actually help, be
On Thu, May 29, 2014 at 6:58 PM, Dave Chinner wrote:
>
> If the patch I sent solves the swap stack usage issue, then perhaps
> we should look towards adding "blk_plug_start_async()" to pass such
> hints to the plug flushing. I'd want to use the same behaviour in
> __xfs_buf_delwri_submit() for bul
On Fri, May 30, 2014 at 10:15:58AM +1000, Dave Chinner wrote:
> On Fri, May 30, 2014 at 08:36:38AM +0900, Minchan Kim wrote:
> > Hello Dave,
> >
> > On Thu, May 29, 2014 at 11:58:30AM +1000, Dave Chinner wrote:
> > > On Thu, May 29, 2014 at 11:30:07AM +1000, Dave Chinner wrote:
> > > > On Wed, May
On Thu, May 29, 2014 at 06:24:02PM -0700, Linus Torvalds wrote:
> On Thu, May 29, 2014 at 5:50 PM, Minchan Kim wrote:
> >>
> >> You could also try Dave's patch, and _not_ do my mm/vmscan.c part.
> >
> > Sure. While I write this, Rusty's test crashed, so I will try Dave's
> > patch, then yours except the vmscan.c part.
On Fri, May 30, 2014 at 09:32:19AM +0900, Minchan Kim wrote:
> On Fri, May 30, 2014 at 10:21:13AM +1000, Dave Chinner wrote:
> > On Thu, May 29, 2014 at 08:06:49PM -0400, Dave Jones wrote:
> > > On Fri, May 30, 2014 at 09:53:08AM +1000, Dave Chinner wrote:
> > >
> > > > That sounds like a plan. P
On Thu, May 29, 2014 at 5:05 PM, Linus Torvalds
wrote:
>
> So maybe test a patch something like the attached.
>
> NOTE! This is absolutely TOTALLY UNTESTED!
It's still untested, but I realized that the whole
"blk_flush_plug_list(plug, true);" thing is pointless, since
schedule() itself will do th
On Thu, May 29, 2014 at 5:50 PM, Minchan Kim wrote:
>>
>> You could also try Dave's patch, and _not_ do my mm/vmscan.c part.
>
> Sure. While I write this, Rusty's test crashed, so I will try Dave's patch,
> then yours except the vmscan.c part.
Looking more at Dave's patch (well, description), I do
On Thu, May 29, 2014 at 05:31:42PM -0700, Linus Torvalds wrote:
> On Thu, May 29, 2014 at 5:20 PM, Minchan Kim wrote:
> >
> > I guess this part, which avoids swapout in direct reclaim, would be key
> > to this patch's success. But it could make anon pages rotate back from
> > the tail to the head of the inactive list in the direct reclaim path
> > until kswapd can catch up.
On Thu, May 29, 2014 at 5:20 PM, Minchan Kim wrote:
>
> I guess this part, which avoids swapout in direct reclaim, would be key
> to this patch's success. But it could make anon pages rotate back from
> the tail to the head of the inactive list in the direct reclaim path
> until kswapd can catch up. And kswapd ksw
On Fri, May 30, 2014 at 10:21:13AM +1000, Dave Chinner wrote:
> On Thu, May 29, 2014 at 08:06:49PM -0400, Dave Jones wrote:
> > On Fri, May 30, 2014 at 09:53:08AM +1000, Dave Chinner wrote:
> >
> > > That sounds like a plan. Perhaps it would be useful to add a
> > > WARN_ON_ONCE(stack_usage > 8k
On Fri, May 30, 2014 at 10:21:13AM +1000, Dave Chinner wrote:
> On Thu, May 29, 2014 at 08:06:49PM -0400, Dave Jones wrote:
> > On Fri, May 30, 2014 at 09:53:08AM +1000, Dave Chinner wrote:
> >
> > > That sounds like a plan. Perhaps it would be useful to add a
> > > WARN_ON_ONCE(stack_usage
On Thu, May 29, 2014 at 08:06:49PM -0400, Dave Jones wrote:
> On Fri, May 30, 2014 at 09:53:08AM +1000, Dave Chinner wrote:
>
> > That sounds like a plan. Perhaps it would be useful to add a
> > WARN_ON_ONCE(stack_usage > 8k) (or some other arbitrary depth beyond
> > 8k) so that we get some ind
Hello Linus,
On Thu, May 29, 2014 at 05:05:17PM -0700, Linus Torvalds wrote:
> On Thu, May 29, 2014 at 4:36 PM, Minchan Kim wrote:
> >
> > I did the hacky test below to apply your idea, and the result was an
> > overflow again. So, again, this seconds stack expansion. Otherwise, we
> > should prevent swapout in direct reclaim.
On Fri, May 30, 2014 at 08:36:38AM +0900, Minchan Kim wrote:
> Hello Dave,
>
> On Thu, May 29, 2014 at 11:58:30AM +1000, Dave Chinner wrote:
> > On Thu, May 29, 2014 at 11:30:07AM +1000, Dave Chinner wrote:
> > > On Wed, May 28, 2014 at 03:41:11PM -0700, Linus Torvalds wrote:
> > > commit a237c1c5
On Fri, May 30, 2014 at 09:53:08AM +1000, Dave Chinner wrote:
> That sounds like a plan. Perhaps it would be useful to add a
> WARN_ON_ONCE(stack_usage > 8k) (or some other arbitrary depth beyond
> 8k) so that we get some indication that we're hitting a deep stack
> but the system otherwise ke
On Thu, May 29, 2014 at 4:36 PM, Minchan Kim wrote:
>
> I did the hacky test below to apply your idea, and the result was an
> overflow again. So, again, this seconds stack expansion. Otherwise, we
> should prevent swapout in direct reclaim.
So changing io_schedule() is bad, for the reasons I outlined e
On Thu, May 29, 2014 at 08:24:49AM -0700, Linus Torvalds wrote:
> On Thu, May 29, 2014 at 12:26 AM, Dave Chinner wrote:
> >
> > What concerns me about both __alloc_pages_nodemask() and
> > kernel_map_pages is that when I look at the code I see functions
> > that have no obvious stack usage problem
Hello Linus,
On Thu, May 29, 2014 at 08:24:49AM -0700, Linus Torvalds wrote:
> On Thu, May 29, 2014 at 12:26 AM, Dave Chinner wrote:
> >
> > What concerns me about both __alloc_pages_nodemask() and
> > kernel_map_pages is that when I look at the code I see functions
> > that have no obvious stack
Hello Dave,
On Thu, May 29, 2014 at 11:58:30AM +1000, Dave Chinner wrote:
> On Thu, May 29, 2014 at 11:30:07AM +1000, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 03:41:11PM -0700, Linus Torvalds wrote:
> > commit a237c1c5bc5dc5c76a21be922dca4826f3eca8ca
> > Author: Jens Axboe
> > Date: Sat
On Thu, May 29, 2014 at 12:26 AM, Dave Chinner wrote:
>
> What concerns me about both __alloc_pages_nodemask() and
> kernel_map_pages is that when I look at the code I see functions
> that have no obvious stack usage problem. However, the compiler is
> producing functions with huge stack footprint
> Hmm, stupid question: what happens when 16K is not enough either? Do we
> increase again? When do we stop increasing? 1M, 2M... ?
It's not a stupid question; it's IMHO the most important question
> Sounds like we want to make it a config option with a couple of sizes
> for everyone to be happy. :-
On Wed, May 28, 2014 at 07:42:40PM -0700, Linus Torvalds wrote:
> On Wed, May 28, 2014 at 6:30 PM, Dave Chinner wrote:
> >
> > You're focussing on the specific symptoms, not the bigger picture.
> > i.e. you're ignoring all the other "let's start IO" triggers in
> > direct reclaim, e.g. there are two
Linus Torvalds writes:
> Well, we've definitely had some issues with deeper callchains
> with md, but I suspect virtio might be worse, and the new blk-mq code
> is likely worse in this respect too.
I looked at this; I've now got a couple of virtio core cleanups, and
I'm testing with Minchan
On Wed, May 28, 2014 at 12:06:58PM -0400, Johannes Weiner wrote:
> On Wed, May 28, 2014 at 07:13:45PM +1000, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> > > [ cc XFS list ]
> >
> > [and now there is a complete copy on the XFS list, I'll add my 2c]
> >
>
On 05/28/2014 07:42 PM, Linus Torvalds wrote:
>
> And Minchan running out of stack is at least _partly_ due to his debug
> options (that DEBUG_PAGEALLOC thing as an extreme example, but I
> suspect there are a few other options there that generate more bloated
> data structures too).
>
I have
On Wed, May 28, 2014 at 09:13:15PM -0700, Linus Torvalds wrote:
> On Wed, May 28, 2014 at 8:46 PM, Minchan Kim wrote:
> >
> > Yes. For example, by marking __alloc_pages_slowpath noinline_for_stack,
> > we can save 176 bytes.
>
> Well, but it will then call that __alloc_pages_slowpath() function,
>
On Wed, May 28, 2014 at 8:46 PM, Minchan Kim wrote:
>
> Yes. For example, by marking __alloc_pages_slowpath noinline_for_stack,
> we can save 176 bytes.
Well, but it will then call that __alloc_pages_slowpath() function,
which has a 176-byte stack frame, plus the call frame.
Now, that only trigg
On Wed, May 28, 2014 at 10:44:48PM -0400, Steven Rostedt wrote:
> On Thu, 29 May 2014 10:09:40 +0900
> Minchan Kim wrote:
>
> > stacktrace reported that vring_add_indirect used 376 bytes, and objdump says:
> >
> > 8141dc60 <vring_add_indirect>:
> > 8141dc60:  55                  push   %rbp
> >
On Wed, May 28, 2014 at 09:09:23AM -0700, Linus Torvalds wrote:
> On Tue, May 27, 2014 at 11:53 PM, Minchan Kim wrote:
> >
> > So, my stupid idea is to just expand the stack size and keep an eye on
> > the stack consumption of each kernel function via ftrace's stacktrace.
>
> We probably have to
[ Crossed emails ]
On Wed, May 28, 2014 at 6:58 PM, Dave Chinner wrote:
> On Thu, May 29, 2014 at 11:30:07AM +1000, Dave Chinner wrote:
>>
>> And now we have too deep a stack due to unplugging from io_schedule()...
>
> So, if we make io_schedule() push the plug list off to the kblockd
> like is d
Minchan Kim writes:
> On Wed, May 28, 2014 at 12:04:09PM +0300, Michael S. Tsirkin wrote:
>> On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
>> > [ 1065.604404] kworker/-57660d..2 1071625993us : stack_trace_call:
>> > 9) 6456 80 __kmalloc+0x1cb/0x200
>> > [ 1065.6044
On Thu, 29 May 2014 10:09:40 +0900
Minchan Kim wrote:
> stacktrace reported that vring_add_indirect used 376 bytes, and objdump says:
>
> 8141dc60 <vring_add_indirect>:
> 8141dc60:  55                  push   %rbp
> 8141dc61:  48 89 e5            mov    %rsp,%rbp
> 8141
On Wed, May 28, 2014 at 6:30 PM, Dave Chinner wrote:
>
> You're focussing on the specific symptoms, not the bigger picture.
> i.e. you're ignoring all the other "let's start IO" triggers in
> direct reclaim, e.g. there are two separate plug flush triggers in
> shrink_inactive_list(), one of which is:
On Thu, May 29, 2014 at 11:30:07AM +1000, Dave Chinner wrote:
> On Wed, May 28, 2014 at 03:41:11PM -0700, Linus Torvalds wrote:
> commit a237c1c5bc5dc5c76a21be922dca4826f3eca8ca
> Author: Jens Axboe
> Date: Sat Apr 16 13:27:55 2011 +0200
>
> block: let io_schedule() flush the plug inline
>
On Wed, May 28, 2014 at 03:41:11PM -0700, Linus Torvalds wrote:
> On Wed, May 28, 2014 at 3:31 PM, Dave Chinner wrote:
> >
> > Indeed, the call chain reported here is not caused by swap issuing
> > IO.
>
> Well, that's one way of reading that callchain.
>
> I think it's the *wrong* way of readin
On 05/28/2014 04:17 PM, Dave Chinner wrote:
>>
>> You were the one calling it a canary.
>
> That doesn't mean it's to blame. Don't shoot the messenger...
>
Fair enough.
-hpa
On Wed, May 28, 2014 at 03:42:18PM -0700, H. Peter Anvin wrote:
> On 05/28/2014 03:11 PM, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 07:23:23AM -0700, H. Peter Anvin wrote:
> >> We tried for 4K on x86-64, too, for quite a while as I recall.
> >> The kernel stack is one of the main costs for a thread.
On 05/28/2014 03:11 PM, Dave Chinner wrote:
> On Wed, May 28, 2014 at 07:23:23AM -0700, H. Peter Anvin wrote:
>> We tried for 4K on x86-64, too, for quite a while as I recall.
>> The kernel stack is one of the main costs for a thread. I would
>> like to decouple struct thread_info from the ker
On Wed, May 28, 2014 at 3:31 PM, Dave Chinner wrote:
>
> Indeed, the call chain reported here is not caused by swap issuing
> IO.
Well, that's one way of reading that callchain.
I think it's the *wrong* way of reading it, though. Almost dishonestly
so. Because very clearly, the swapout _is_ what
On Wed, May 28, 2014 at 09:09:23AM -0700, Linus Torvalds wrote:
> On Tue, May 27, 2014 at 11:53 PM, Minchan Kim wrote:
> >
> > So, my stupid idea is to just expand the stack size and keep an eye on
> > the stack consumption of each kernel function via ftrace's stacktrace.
> But what *does*
On Wed, May 28, 2014 at 07:23:23AM -0700, H. Peter Anvin wrote:
> We tried for 4K on x86-64, too, for quite a while as I recall.
> The kernel stack is one of the main costs for a thread. I would
> like to decouple struct thread_info from the kernel stack (PJ
> Waskewicz was working on that bef
On Wed, May 28, 2014 at 12:06:58PM -0400, Johannes Weiner wrote:
> On Wed, May 28, 2014 at 07:13:45PM +1000, Dave Chinner wrote:
> > On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> > > [ cc XFS list ]
> >
> > [and now there is a complete copy on the XFS list, I'll add my 2c]
> >
>
On Wed, 28 May 2014 17:43:50 +0200
Richard Weinberger wrote:
> > diff --git a/arch/x86/include/asm/page_64_types.h
> > b/arch/x86/include/asm/page_64_types.h
> > index 8de6d9cf3b95..678205195ae1 100644
> > --- a/arch/x86/include/asm/page_64_types.h
> > +++ b/arch/x86/include/asm/page_64_types.h
On Wed, May 28, 2014 at 9:08 AM, Steven Rostedt wrote:
>
> What performance impact are you looking for? Now if the system is short
> on memory, it would probably cause issues in creating tasks.
It doesn't necessarily need to be short on memory, it could just be
fragmented. But a page order of 2 s
On 28.05.2014 18:08, Steven Rostedt wrote:
> On Wed, 28 May 2014 17:43:50 +0200
> Richard Weinberger wrote:
>
>
>>> diff --git a/arch/x86/include/asm/page_64_types.h
>>> b/arch/x86/include/asm/page_64_types.h
>>> index 8de6d9cf3b95..678205195ae1 100644
>>> --- a/arch/x86/include/asm/page_64_t
On Tue, May 27, 2014 at 11:53 PM, Minchan Kim wrote:
>
> So, my stupid idea is to just expand the stack size and keep an eye on
> the stack consumption of each kernel function via ftrace's stacktrace.
We probably have to do this at some point, but that point is not -rc7.
And quite frankly, from
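The change Linus is weighing here is named explicitly elsewhere in the thread ("upping THREAD_SIZE_ORDER to 2 on x86-64"), and the truncated hunk quoted in Richard Weinberger's reply touches arch/x86/include/asm/page_64_types.h. Putting the two together, the patch essentially amounts to the one-line bump below; this is a reconstruction, so treat the context line as approximate:

```diff
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@
-#define THREAD_SIZE_ORDER	1
+#define THREAD_SIZE_ORDER	2
 #define THREAD_SIZE  (PAGE_SIZE << THREAD_SIZE_ORDER)
```

With 4K pages, order 1 gives an 8K kernel stack and order 2 gives 16K, which is the "16k" figure discussed throughout the thread.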
On Wed, May 28, 2014 at 07:13:45PM +1000, Dave Chinner wrote:
> On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> > [ cc XFS list ]
>
> [and now there is a complete copy on the XFS list, I'll add my 2c]
>
> > On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> > > While I
On Wed, May 28, 2014 at 8:53 AM, Minchan Kim wrote:
> While I was testing in-house patches under heavy memory pressure on
> qemu-kvm, the 3.14 kernel crashed randomly. The reason was a kernel
> stack overflow.
>
> When I investigated the problem, the call stack was a little deeper
> because reclaim functions were involved, though not via the direct
> reclaim path.
We tried for 4K on x86-64, too, for quite a while as I recall. The kernel
stack is one of the main costs for a thread. I would like to decouple struct
thread_info from the kernel stack (PJ Waskewicz was working on that before he
left Intel) but that doesn't buy us all that much.
8K additi
This looks like something that Linus should be involved in too. He's
been critical in the past about stack usage.
On Wed, 28 May 2014 15:53:59 +0900
Minchan Kim wrote:
> While I was testing in-house patches under heavy memory pressure on
> qemu-kvm, the 3.14 kernel crashed randomly. The reason was a kernel
> stack overflow.
On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> While I was testing in-house patches under heavy memory pressure on
> qemu-kvm, the 3.14 kernel crashed randomly. The reason was a kernel
> stack overflow.
>
> When I investigated the problem, the call stack was a little deeper
> because reclaim functions were involved, though not via the direct
> reclaim path.
On Wed, May 28, 2014 at 06:37:38PM +1000, Dave Chinner wrote:
> [ cc XFS list ]
[and now there is a complete copy on the XFS list, I'll add my 2c]
> On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> > While I was testing in-house patches under heavy memory pressure on
> > qemu-kvm, the 3.14 kernel crashed randomly.
On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> While I was testing in-house patches under heavy memory pressure on
> qemu-kvm, the 3.14 kernel crashed randomly. The reason was a kernel
> stack overflow.
>
> When I investigated the problem, the call stack was a little deeper
> because reclaim functions were involved, though not via the direct
> reclaim path.
[ cc XFS list ]
On Wed, May 28, 2014 at 03:53:59PM +0900, Minchan Kim wrote:
> While I was testing in-house patches under heavy memory pressure on
> qemu-kvm, the 3.14 kernel crashed randomly. The reason was a kernel
> stack overflow.
>
> When I investigated the problem, the call stack was a little deeper
>
While I was testing in-house patches under heavy memory pressure on
qemu-kvm, the 3.14 kernel crashed randomly. The reason was a kernel
stack overflow.
When I investigated the problem, the call stack was a little deeper
because reclaim functions were involved, though not via the direct
reclaim path.
I tried to diet stack s