* [EMAIL PROTECTED] <[EMAIL PROTECTED]> wrote:
> I think this pretty clearly points out the need for some arch-generic
> infrastructure in Linux. An awful lot of arch hooks are for one or
> two architectures with some peculiarities, and the other 90% of the
> implementations are identical.
>
> include/asm-alpha/mmu_context.h | 6 ++
> include/asm-arm/mmu_context.h | 6 ++
> include/asm-arm26/mmu_context.h | 6 ++
> include/asm-cris/mmu_context.h | 6 ++
> include/asm-frv/mmu_context.h | 6 ++
> include/asm-h8300/mmu_context.h
* Peter Zijlstra <[EMAIL PROTECTED]> wrote:
> code like that always makes me think of duffs-device
> http://www.lysator.liu.se/c/duffs-device.html
>
> although it might be that the compiler generates better code from the
> current incarnation; just my .02 ;-)
yeah, will do that. First wanted
* Russell King <[EMAIL PROTECTED]> wrote:
> On Fri, Jul 29, 2005 at 10:28:26AM +0200, Ingo Molnar wrote:
> > @@ -2872,10 +2878,10 @@ go_idle:
> > /*
> > * Prefetch (at least) a cacheline below the current
> > * kernel stack (in expectation of any new task touching
> > -* the sta
> -
> unroll prefetch_range() loops manually.
>
> Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
>
> include/linux/prefetch.h | 31 +--
> 1 files changed, 29 insertions(+), 2 deletions(-)
>
> Index: linux/include/linux/prefetch.h
>
On Fri, Jul 29, 2005 at 10:28:26AM +0200, Ingo Molnar wrote:
> @@ -2872,10 +2878,10 @@ go_idle:
> /*
>  * Prefetch (at least) a cacheline below the current
>  * kernel stack (in expectation of any new task touching
> - * the stack at least minimally), and a cacheline above
>
Ingo Molnar wrote on Friday, July 29, 2005 1:36 AM
> * Chen, Kenneth W <[EMAIL PROTECTED]> wrote:
> > It generates slightly different code because the previous patch asks for a
> > little over 5 cache lines' worth of bytes, so it always goes to the for loop.
>
> ok - fix below. But i'm not that sure we want
* Eric Dumazet <[EMAIL PROTECTED]> wrote:
> Please test that len is a constant, or else the inlining is too large
> for the non constant case.
yeah. fix below.
Ingo
-
noticed by Eric Dumazet: unrolling should be dependent on a constant
length, otherwise inlining gets too large.
S
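
A minimal sketch of the idea discussed here: unroll only when the length is a compile-time constant, and otherwise fall back to the plain loop. This is not the patch from the thread. The names sketch_prefetch(), sketch_prefetch_range() and SKETCH_STRIDE are hypothetical, the 64-byte stride is an assumption, and __builtin_prefetch / __builtin_constant_p are GCC/Clang builtins standing in for the kernel's own helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed cache-line size for this sketch; a real kernel takes it from the arch. */
#define SKETCH_STRIDE 64

/* Hypothetical stand-in for the kernel's prefetch(); uses a GCC/Clang builtin. */
static inline void sketch_prefetch(const void *p)
{
	__builtin_prefetch(p);
}

/* Prefetch [addr, addr + len) and return how many prefetches were issued. */
static inline size_t sketch_prefetch_range(const void *addr, size_t len)
{
	const char *cp = addr;
	const char *end = cp + len;
	size_t issued = 0;

	assert(addr != NULL);

	/*
	 * Unroll only if the compiler can prove len is constant and small.
	 * For a non-constant len the switch would bloat every inlined call
	 * site, so that case uses the loop below.
	 */
	if (__builtin_constant_p(len) && len <= 5 * SKETCH_STRIDE) {
		switch ((len + SKETCH_STRIDE - 1) / SKETCH_STRIDE) {
		case 5: sketch_prefetch(cp + 4 * SKETCH_STRIDE); issued++; /* fall through */
		case 4: sketch_prefetch(cp + 3 * SKETCH_STRIDE); issued++; /* fall through */
		case 3: sketch_prefetch(cp + 2 * SKETCH_STRIDE); issued++; /* fall through */
		case 2: sketch_prefetch(cp + 1 * SKETCH_STRIDE); issued++; /* fall through */
		case 1: sketch_prefetch(cp); issued++;
		}
		return issued;
	}

	/* Generic path: one prefetch per stride until the range is covered. */
	for (; cp < end; cp += SKETCH_STRIDE) {
		sketch_prefetch(cp);
		issued++;
	}
	return issued;
}
```

Both paths issue the same number of prefetches for a given length; the constant-length path only changes the code the compiler can emit.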
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote:
> Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM
> > the patch below unrolls the prefetch_range() loop manually, for up to 5
> > cachelines prefetched. This patch, ontop of the 4 previous patches,
> > should generate similar code to the assembly
Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM
> the patch below unrolls the prefetch_range() loop manually, for up to 5
> cachelines prefetched. This patch, ontop of the 4 previous patches,
> should generate similar code to the assembly code in your original
> patch. The full patch-series
Ingo Molnar a écrit :
unroll prefetch_range() loops manually.
Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]>
include/linux/prefetch.h | 31 +--
1 files changed, 29 insertions(+), 2 deletions(-)
Index: linux/include/linux/prefetch.h
=
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote:
> On ia64, we have two kernel stacks, one for outgoing task, and one for
> incoming task. for outgoing task, we haven't called switch_to() yet.
> So the switch stack structure for 'current' will be allocated
> immediately below current 'sp' pointer
Keith Owens wrote on Friday, July 29, 2005 12:38 AM
> BTW, for ia64 you may as well prefetch pt_regs, that is also quite
> large.
>
> #define MIN_KERNEL_STACK_FOOTPRINT (IA64_SWITCH_STACK_SIZE +
> IA64_PT_REGS_SIZE)
This has to be carefully done, because you really don't want to overwhelm
number
Keith Owens wrote on Friday, July 29, 2005 12:46 AM
> On Fri, 29 Jul 2005 00:22:43 -0700,
> "Chen, Kenneth W" <[EMAIL PROTECTED]> wrote:
> >On ia64, we have two kernel stacks, one for outgoing task, and one for
> >incoming task. for outgoing task, we haven't called switch_to() yet.
> >So the swit
On Fri, 29 Jul 2005 00:22:43 -0700,
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote:
>On ia64, we have two kernel stacks, one for outgoing task, and one for
>incoming task. for outgoing task, we haven't called switch_to() yet.
>So the switch stack structure for 'current' will be allocated immediately
On Fri, 29 Jul 2005 09:04:48 +0200,
Ingo Molnar <[EMAIL PROTECTED]> wrote:
>ok, how about the additional patch below? Does this do the trick on
>ia64? It makes complete sense on every architecture to prefetch from
>below the current kernel stack, in the expectation of the next task
>touching th
Ingo Molnar wrote on Friday, July 29, 2005 12:05 AM
> --- linux.orig/kernel/sched.c
> +++ linux/kernel/sched.c
> @@ -2869,7 +2869,14 @@ go_idle:
>  * its thread_info, its kernel stack and mm:
>  */
> prefetch(next->thread_info);
> - prefetch(kernel_stack(next));
> + /*
> +
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > Sorry, this is not enough. Switch stack on ia64 is 528 bytes. We
> > need to prefetch 5 lines. It probably should use prefetch_range().
>
> ok, how about the additional patch below? Does this do the trick on
> ia64? It makes complete sense on e
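
The "528 bytes, 5 lines" arithmetic quoted above can be sketched as a round-up division. The helper name sketch_lines_needed() is hypothetical, and the 128-byte line size is an assumption chosen because it reproduces the 5 lines mentioned in the thread; the sketch also assumes the structure starts on a cache-line boundary.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of cache lines needed to cover `bytes` bytes, assuming the
 * region starts on a line boundary. An unaligned start can touch one
 * extra line, which this sketch does not account for.
 */
static inline size_t sketch_lines_needed(size_t bytes, size_t line_size)
{
	assert(line_size > 0);
	return (bytes + line_size - 1) / line_size;
}
```

With an assumed 128-byte line, sketch_lines_needed(528, 128) gives 5, since four lines cover only 512 bytes.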
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote:
> > i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a
> > real kernel_stack() implementation, the other architectures all return
> > 'next'. (I've also cleaned up a couple of other things in the
> > prefetch-next area, see the
> i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a
> real kernel_stack() implementation, the other architectures all return
> 'next'. (I've also cleaned up a couple of other things in the
> prefetch-next area, see the changelog below.)
>
> Ken, would this patch generate a
Ingo Molnar wrote:
* Nick Piggin <[EMAIL PROTECTED]> wrote:
[...]
prefetch_area(void *first_addr, void *last_addr)
(or as addr,len)
Yep. We have prefetch_range.
Yeah, then a specific field _within_ next->mm or thread_info may want
to be fetched. In short, I don't see any argu
* Nick Piggin <[EMAIL PROTECTED]> wrote:
> >i'm not sure what you mean by prefetching next->timestamp, it's an
> >inline field to 'next', in the first cacheline of it, which we've
> >already used so it's present. (If you mean the value of next->timestamp,
> >that has no address meaning at all
Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
> > > such as?
> > Not sure. thread_info? Maybe next->timestamp or some other fields in
> > next, something in next->mm?
> next->thread_info we could and should prefetch - but from the generic
> scheduler code (see the patch i just sent).
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> next->mm we might want to prefetch, but it's probably not worth it
> because we are referencing it too soon, in context_switch(). (while
> the kernel stack itself wont be referenced until the full
> context-switch is done) But might be worth trying -
* Nick Piggin <[EMAIL PROTECTED]> wrote:
> >>Just a minor point, I agree with David: I'd like it to be called
> >>prefetch_task(), because some architecture may want to prefetch other
> >>memory.
> >
> >such as?
>
> Not sure. thread_info? Maybe next->timestamp or some other fields in
> next,
* Ingo Molnar <[EMAIL PROTECTED]> wrote:
> [...] If yes then we want to have something like:
>
> prefetch(kernel_stack(next));
>
> to make it more generic. By default kernel_stack(next) could be
> next->thread_info (to make sure we prefetch something real). On e.g.
> x86/x64, kernel_sta
Ingo Molnar wrote:
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
> > > No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
> > > thread_info is at ~0xda0, depending on the config. The switch_stack
> > > can be as high as 0x7bd0 in the kernel stack, depending on why the task
> > > is sleeping.
> > Just a minor
* Nick Piggin <[EMAIL PROTECTED]> wrote:
> >No, they can be up to 30K apart. See include/asm-ia64/ptrace.h.
> >thread_info is at ~0xda0, depending on the config. The switch_stack
> >can be as high as 0x7bd0 in the kernel stack, depending on why the task
> >is sleeping.
> >
>
> Just a minor poi
Keith Owens wrote:
> On Thu, 28 Jul 2005 09:41:18 +0200,
> Ingo Molnar <[EMAIL PROTECTED]> wrote:
> > i'm wondering, is the switch_stack at the same/similar place as
> > next->thread_info? If yes then we could simply do a
> > prefetch(next->thread_info).
> No, they can be up to 30K apart. See include/asm-ia
On Thu, 28 Jul 2005 09:41:18 +0200,
Ingo Molnar <[EMAIL PROTECTED]> wrote:
>
>* david mosberger <[EMAIL PROTECTED]> wrote:
>
>> Also, should this be called prefetch_stack() or perhaps even just
>> prefetch_task()? Not every architecture defines a switch_stack
>> structure.
>
>yeah. I'd too suggest
* Keith Owens wrote:
> >yeah. I'd too suggest to call it prefetch_stack(), and not make it a
> >macro & hook but something defined on all arches, with for now only ia64
> >having any real code in the inline function.
> >
> >i'm wondering, is the switch_stack at the same/similar place as
> >next-
* david mosberger <[EMAIL PROTECTED]> wrote:
> Also, should this be called prefetch_stack() or perhaps even just
> prefetch_task()? Not every architecture defines a switch_stack
> structure.
yeah. I'd too suggest to call it prefetch_stack(), and not make it a
macro & hook but something defin
Also, should this be called prefetch_stack() or perhaps even just
prefetch_task()? Not every architecture defines a switch_stack
structure.
--david
--
Mosberger Consulting LLC, voice/fax: 510-744-9372,
http://www.mosberger-consulting.com/
35706 Runckel Lane, Fremont, CA 94536
On 7/27/05, And
"Chen, Kenneth W" <[EMAIL PROTECTED]> wrote:
>
> +#ifdef ARCH_HAS_PREFETCH_SWITCH_STACK
> +extern void prefetch_switch_stack(struct task_struct*);
> +#else
> +#define prefetch_switch_stack(task) do { } while (0)
> +#endif
It is better to use
static inline void prefetch_switch_stack(struct tas
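
The reply above is cut off, so the following is only one plausible shape of the suggestion, not the text of the mail. An empty static inline stub still type-checks its argument on every architecture, whereas an empty macro discards the argument unseen. The struct task_struct definition and sketch_pick_next() here are stand-ins invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct task_struct; hypothetical, sketch only. */
struct task_struct {
	int pid;
};

#ifdef ARCH_HAS_PREFETCH_SWITCH_STACK
/* An architecture that defines the symbol supplies the real function. */
extern void prefetch_switch_stack(struct task_struct *task);
#else
/* Everyone else gets a no-op that the compiler still type-checks. */
static inline void prefetch_switch_stack(struct task_struct *task)
{
	(void)task;
}
#endif

/* Hypothetical caller, showing where the hook would sit before a switch. */
static inline struct task_struct *sketch_pick_next(struct task_struct *next)
{
	assert(next != NULL);
	prefetch_switch_stack(next);
	return next;
}
```

With the stub, passing something that is not a struct task_struct pointer triggers a compiler diagnostic even on architectures that do no prefetching, which the do { } while (0) macro form would not catch.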
I would like to propose adding a prefetch switch stack hook in
the scheduler function. For architectures like ia64, the switch
stack structure is fairly large (currently 528 bytes). For context
switch intensive applications, we found that a significant number of
cache misses occur in switch_to() fun