Re: [sched, patch] better wake-balancing, #3

2005-07-29 Thread Nick Piggin
Ingo Molnar wrote: * Ingo Molnar <[EMAIL PROTECTED]> wrote: there's an even simpler way: only do wakeup-balancing if this_cpu is idle. (tbench results are still OK, and other workloads improved.) here's an updated patch. It handles one more detail: on SCHED_SMT we should check the idleness

Re: __copy_user exception handling

2005-07-29 Thread Matt Chapman
On Fri, Jul 29, 2005 at 04:23:33PM -0700, Chen, Kenneth W wrote: > > Because exception handler use to work at instruction bundle granularity. > The first EX would automatically catch the 2nd ld8 or st8, with a caveat > that this code is assuming gcc 2.x tool chain. With moving to gcc 3.x > assemb

RE: __copy_user exception handling

2005-07-29 Thread Chen, Kenneth W
Matt Chapman wrote on Friday, July 29, 2005 3:11 PM > The main __copy_user loop looks like this: > > 2: > EX(.failure_in3,(p16) ld8 val1[0]=[src1],16) > (p16) ld8 val2[0]=[src2],16 > > EX(.failure_out, (EPI) st8 [dst1]=val1[PIPE_DEPTH-1],16) > (EPI) st8 [dst2]=val2[PIPE_DEPTH

[PATCH] Add ACPI based P-state support

2005-07-29 Thread Venkatesh Pallipadi
Patch to support P-state transitions on ia64. This driver is based on ACPI, and uses the ACPI processor driver interface to find out the P-state support information for the processor. This driver plugs into generic cpufreq infrastructure. Once this driver is loaded successfully, ondemand/userspac

Re: 2.6.13-rc3-mm3

2005-07-29 Thread Andrew Morton
Khalid Aziz <[EMAIL PROTECTED]> wrote: > > Serial console is broken on ia64 on an HP rx2600 machine on > 2.6.13-rc3-mm3. When kernel is booted up with "console=ttyS,...", no > output ever appears on the console and system is hung. So I booted the > kernel with "console=uart,mmio,0xff5e" to enab

__copy_user exception handling

2005-07-29 Thread Matt Chapman
The main __copy_user loop looks like this: 2: EX(.failure_in3,(p16) ld8 val1[0]=[src1],16) (p16) ld8 val2[0]=[src2],16 EX(.failure_out, (EPI) st8 [dst1]=val1[PIPE_DEPTH-1],16) (EPI) st8 [dst2]=val2[PIPE_DEPTH-1],16 br.ctop.dptk 2b What I'm trying to understand is why

Re: [PATCH] /proc/iomem update

2005-07-29 Thread Khalid Aziz
That "continue" is intentional and you are right, it will cause memory leak. It should be preceded by kfree(res). I will fix that. Thanks, Khalid On Fri, 2005-07-29 at 11:22 -0700, david mosberger wrote: > Is the "continue" here spurious? If not, it would cause > memory-leaking since "res" has b
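The leak discussed in this thread is a general pattern: an object is allocated at the top of a loop body, and an early "continue" skips the code that would have stored or released it. The following is a minimal userspace sketch of the fix, not the patch itself. It uses malloc()/free() in place of the kernel allocator, and the enum values, struct res layout, live_allocs counter and walk() function are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the EFI memory types named in the thread. */
enum mem_type { MEM_RAM, MEM_MAPPED_IO, MEM_MAPPED_IO_PORT_SPACE };

struct res { enum mem_type type; };

/* Number of live allocations, so a leak would be observable. */
static int live_allocs;

static struct res *res_alloc(void)
{
    struct res *r = malloc(sizeof(*r));
    if (r)
        live_allocs++;
    return r;
}

static void res_free(struct res *r)
{
    assert(r != NULL);
    free(r);
    live_allocs--;
}

/* Walk the descriptors; store kept entries in out[] and return their count. */
static int walk(const enum mem_type *types, int n, struct res **out)
{
    int kept = 0;

    for (int i = 0; i < n; i++) {
        struct res *r = res_alloc();   /* allocated before the type is checked */
        if (!r)
            break;

        switch (types[i]) {
        case MEM_MAPPED_IO:
        case MEM_MAPPED_IO_PORT_SPACE:
            res_free(r);               /* without this, "continue" leaks r */
            continue;                  /* skips to the next loop iteration */
        default:
            break;
        }

        r->type = types[i];
        out[kept++] = r;
    }
    return kept;
}
```

An alternative with the same effect would be to check the type before allocating, so that skipped entries never allocate at all.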

Re: [PATCH] /proc/iomem update

2005-07-29 Thread david mosberger
Is the "continue" here spurious? If not, it would cause memory-leaking since "res" has been allocated already. + case EFI_MEMORY_MAPPED_IO: + case EFI_MEMORY_MAPPED_IO_PORT_SPACE: + continue; +

[PATCH] /proc/iomem update

2005-07-29 Thread Khalid Aziz
Included patch updates /proc/iomem on ia64 to include information about system RAM and kernel memory. This makes /proc/iomem on ia64 similar to one on x86. This patch is relative to 2.6.13-rc3 and applies on top of the EFI memory map walk rewrite patch at

RE: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Chen, Kenneth W
Ingo Molnar wrote on Friday, July 29, 2005 4:26 AM > * Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > > To demonstrate the problem, we turned off these two flags in the cpu > > sd domain and measured a stunning 2.15% performance gain! And > > deleting all the code in the try_to_wake_up() pertain t

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Ingo Molnar
* Nick Piggin <[EMAIL PROTECTED]> wrote: > Chen, Kenneth W wrote: > >Nick Piggin wrote on Thursday, July 28, 2005 7:01 PM > > >This clearly outlines an issue with the implementation. Optimize for one > >type of workload has detrimental effect on another workload and vice versa. > > > > Yep. Th

[sched, patch] better wake-balancing, #3

2005-07-29 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > there's an even simpler way: only do wakeup-balancing if this_cpu is > idle. (tbench results are still OK, and other workloads improved.) here's an updated patch. It handles one more detail: on SCHED_SMT we should check the idleness of siblings too. B
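The rule described in this message (wakeup-balance only when this_cpu is idle, and on SMT also look at the siblings) can be sketched as a small predicate. This is an illustrative userspace model, not the patch: struct cpu_state, cpu_is_idle() and may_wake_balance() are hypothetical names, and treating "check the idleness of siblings" as "every sibling must be idle" is an assumption about the intent.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CPUS 8

/* Hypothetical per-CPU state; this is not the kernel's runqueue. */
struct cpu_state {
    int nr_running;          /* runnable tasks on this CPU */
    int sibling[MAX_CPUS];   /* ids of SMT siblings */
    int nr_siblings;
};

static bool cpu_is_idle(const struct cpu_state *cpus, int cpu)
{
    return cpus[cpu].nr_running == 0;
}

/*
 * Allow wakeup-balancing only if this_cpu is idle and, when smt is set,
 * all of its SMT siblings are idle as well (assumed interpretation).
 */
static bool may_wake_balance(const struct cpu_state *cpus, int this_cpu, bool smt)
{
    assert(this_cpu >= 0 && this_cpu < MAX_CPUS);

    if (!cpu_is_idle(cpus, this_cpu))
        return false;

    if (smt) {
        for (int i = 0; i < cpus[this_cpu].nr_siblings; i++)
            if (!cpu_is_idle(cpus, cpus[this_cpu].sibling[i]))
                return false;
    }
    return true;
}
```

With two sibling CPUs, a busy sibling blocks balancing in the SMT case even though this_cpu itself is idle, while the non-SMT check looks only at this_cpu.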

[sched, patch] better wake-balancing, #2

2005-07-29 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > another approach would be the patch below, to do wakeup-balancing only > if the wakeup CPU or the task CPU is idle. there's an even simpler way: only do wakeup-balancing if this_cpu is idle. (tbench results are still OK, and other workloads improved.)

[sched, patch] better wake-balancing

2005-07-29 Thread Ingo Molnar
another approach would be the patch below, to do wakeup-balancing only if the wakeup CPU or the task CPU is idle. I've measured half-loaded tbench and unlike total wakeup-balancing removal it does not degrade with this patch applied, while fully loaded tbench and other workloads clearly improv

[patch] remove wake-balancing

2005-07-29 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > > If we can get performance to within a couple of tenths of a percent > > of the zero balancing case, then that would be preferable I think. > > I won't try to compromise between the two. If you do so, we would end > up with two half baked raw tur

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > To demonstrate the problem, we turned off these two flags in the cpu > sd domain and measured a stunning 2.15% performance gain! And > deleting all the code in the try_to_wake_up() pertain to load > balancing gives us another 0.2% gain. another

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Peter Zijlstra <[EMAIL PROTECTED]> wrote: > code like that always makes me think of duffs-device > http://www.lysator.liu.se/c/duffs-device.html > > although it might be that the compiler generates better code from the > current incarnation; just my .02 ;-) yeah, will do that. First wanted

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Russell King <[EMAIL PROTECTED]> wrote: > On Fri, Jul 29, 2005 at 10:28:26AM +0200, Ingo Molnar wrote: > > @@ -2872,10 +2878,10 @@ go_idle: > > /* > > * Prefetch (at least) a cacheline below the current > > * kernel stack (in expectation of any new task touching > > -* the sta

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Peter Zijlstra
> - > unroll prefetch_range() loops manually. > > Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]> > > include/linux/prefetch.h | 31 +-- > 1 files changed, 29 insertions(+), 2 deletions(-) > > Index: linux/include/linux/prefetch.h >

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Ingo Molnar
* Nick Piggin <[EMAIL PROTECTED]> wrote: > Well, you can easily see suboptimal scheduling decisions on many > programs with lots of interprocess communication. For example, tbench > on a dual Xeon: > > processes1 2 3 4 > > 2.6.13-rc4: 187, 183, 17

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Ingo Molnar
* Nick Piggin <[EMAIL PROTECTED]> wrote:
> >>processes           1               2               3               4
> >>
> >>2.6.13-rc4:   187, 183, 179   260, 259, 256   340, 320, 349   504, 496, 500
> >>no wake-bal:  180, 180, 177   254, 254, 253   268, 270, 348   345, 290, 500
> >>
> >>Numbers are MB/s, hi

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Russell King
On Fri, Jul 29, 2005 at 10:28:26AM +0200, Ingo Molnar wrote: > @@ -2872,10 +2878,10 @@ go_idle: > /* >* Prefetch (at least) a cacheline below the current >* kernel stack (in expectation of any new task touching > - * the stack at least minimally), and a cacheline above >

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Nick Piggin
Ingo Molnar wrote:
* Nick Piggin <[EMAIL PROTECTED]> wrote:
processes           1               2               3               4
2.6.13-rc4:   187, 183, 179   260, 259, 256   340, 320, 349   504, 496, 500
no wake-bal:  180, 180, 177   254, 254, 253   268, 270, 348   345, 290, 500
Numbers are MB/s, high

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Ingo Molnar
* Nick Piggin <[EMAIL PROTECTED]> wrote: > Well, you can easily see suboptimal scheduling decisions on many > programs with lots of interprocess communication. For example, tbench > on a dual Xeon: > > processes1 2 3 4 > > 2.6.13-rc4: 187, 183, 17

Re: Delete scheduler SD_WAKE_AFFINE and SD_WAKE_BALANCE flags

2005-07-29 Thread Nick Piggin
Chen, Kenneth W wrote: Nick Piggin wrote on Thursday, July 28, 2005 7:01 PM This clearly outlines an issue with the implementation. Optimize for one type of workload has detrimental effect on another workload and vice versa. Yep. That comes up fairly regularly when tuning the scheduler :(

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Eric Dumazet <[EMAIL PROTECTED]> wrote: > Please test that len is a constant, or else the inlining is too large > for the non constant case. yeah. fix below. Ingo - noticed by Eric Dumazet: unrolling should be dependent on a constant length, otherwise inlining gets too large. S
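The point raised here, that manual unrolling should apply only when the length is a compile-time constant, can be illustrated with the GCC/Clang builtin __builtin_constant_p(). The sketch below is not the patch from this thread: LINE_BYTES, MAX_UNROLL, the two helper functions and the PREFETCH_RANGE macro are hypothetical, and the 128-byte line size is an assumption. Both paths return the number of lines touched so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES 128   /* assumed cache line size, for illustration only */
#define MAX_UNROLL 5     /* unroll at most this many lines */

/* Generic path: a loop, kept out of line-heavy inlining. */
static size_t prefetch_looped(const char *p, size_t len)
{
    size_t issued = 0;

    for (size_t off = 0; off < len; off += LINE_BYTES) {
        __builtin_prefetch(p + off);   /* a hint; never faults */
        issued++;
    }
    return issued;
}

/* Unrolled path: only meant for small lengths (at most MAX_UNROLL lines). */
static inline size_t prefetch_unrolled(const char *p, size_t len)
{
    size_t n = (len + LINE_BYTES - 1) / LINE_BYTES;

    assert(n <= MAX_UNROLL);
    switch (n) {
    case 5: __builtin_prefetch(p + 4 * LINE_BYTES); /* fall through */
    case 4: __builtin_prefetch(p + 3 * LINE_BYTES); /* fall through */
    case 3: __builtin_prefetch(p + 2 * LINE_BYTES); /* fall through */
    case 2: __builtin_prefetch(p + 1 * LINE_BYTES); /* fall through */
    case 1: __builtin_prefetch(p);
    default: break;
    }
    return n;
}

/*
 * Use the unrolled form only when len is a small compile-time constant;
 * otherwise fall back to the loop. Note that len is evaluated more than once.
 */
#define PREFETCH_RANGE(addr, len)                                        \
    ((__builtin_constant_p(len) && (len) <= MAX_UNROLL * LINE_BYTES)     \
         ? prefetch_unrolled((const char *)(addr), (len))                \
         : prefetch_looped((const char *)(addr), (len)))
```

When len is a literal, the compiler can reduce the switch to a straight run of prefetches; with a runtime length only the compact loop is emitted, which is the inlining-size concern raised in the message.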

RE: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Chen, Kenneth W
Ingo Molnar wrote on Friday, July 29, 2005 1:36 AM > * Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > > It generate slight different code because previous patch asks for a little > > over 5 cache lines worth of bytes and it always go to the for loop. > > ok - fix below. But i'm not that sure we want

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM > > the patch below unrolls the prefetch_range() loop manually, for up to 5 > > cachelines prefetched. This patch, ontop of the 4 previous patches, > > should generate similar code to the assembly

RE: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Chen, Kenneth W
Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM > the patch below unrolls the prefetch_range() loop manually, for up to 5 > cachelines prefetched. This patch, ontop of the 4 previous patches, > should generate similar code to the assembly code in your original > patch. The full patch-series

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Eric Dumazet
Ingo Molnar wrote: unroll prefetch_range() loops manually. Signed-off-by: Ingo Molnar <[EMAIL PROTECTED]> include/linux/prefetch.h | 31 +-- 1 files changed, 29 insertions(+), 2 deletions(-) Index: linux/include/linux/prefetch.h =

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > On ia64, we have two kernel stacks, one for outgoing task, and one for > incoming task. for outgoing task, we haven't called switch_to() yet. > So the switch stack structure for 'current' will be allocated > immediately below current 'sp' pointer

RE: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Chen, Kenneth W
Keith Owens wrote on Friday, July 29, 2005 12:38 AM > BTW, for ia64 you may as well prefetch pt_regs, that is also quite > large. > > #define MIN_KERNEL_STACK_FOOTPRINT (IA64_SWITCH_STACK_SIZE + > IA64_PT_REGS_SIZE) This has to be carefully done, because you really don't want to overwhelm number

RE: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Chen, Kenneth W
Keith Owens wrote on Friday, July 29, 2005 12:46 AM > On Fri, 29 Jul 2005 00:22:43 -0700, > "Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: > >On ia64, we have two kernel stacks, one for outgoing task, and one for > >incoming task. for outgoing task, we haven't called switch_to() yet. > >So the swit

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Keith Owens
On Fri, 29 Jul 2005 00:22:43 -0700, "Chen, Kenneth W" <[EMAIL PROTECTED]> wrote: >On ia64, we have two kernel stacks, one for outgoing task, and one for >incoming task. for outgoing task, we haven't called switch_to() yet. >So the switch stack structure for 'current' will be allocated immediately

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Keith Owens
On Fri, 29 Jul 2005 09:04:48 +0200, Ingo Molnar <[EMAIL PROTECTED]> wrote: >ok, how about the additional patch below? Does this do the trick on >ia64? It makes complete sense on every architecture to prefetch from >below the current kernel stack, in the expectation of the next task >touching th

[PATCH 2.6.13-rc4] Remove unnecessary function declarations

2005-07-29 Thread takeuchi satoru
Hi, I resend this patch because I sent the previous mail without Subject: line. Thanks, Satoru Takeuchi This patch removes unnecessary function declarations in . Signed-off-by: Satoru Takeuchi <[EMAIL PROTECTED]> --- include/asm-ia64/iosapic.h |2 -- 1 files changed, 2 deletions(-) diff

RE: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Chen, Kenneth W
Ingo Molnar wrote on Friday, July 29, 2005 12:05 AM > --- linux.orig/kernel/sched.c > +++ linux/kernel/sched.c > @@ -2869,7 +2869,14 @@ go_idle: >* its thread_info, its kernel stack and mm: >*/ > prefetch(next->thread_info); > - prefetch(kernel_stack(next)); > + /* > +

[no subject]

2005-07-29 Thread takeuchi satoru
This patch removes unnecessary function declarations in . Signed-off-by: Satoru Takeuchi <[EMAIL PROTECTED]> --- include/asm-ia64/iosapic.h |2 -- 1 files changed, 2 deletions(-) diff -puN include/asm-ia64/iosapic.h~my-patch include/asm-ia64/iosapic.h --- linux-2.6.13-rc4/include/asm-ia64

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Ingo Molnar <[EMAIL PROTECTED]> wrote: > > Sorry, this is not enough. Switch stack on ia64 is 528 bytes. We > > need to prefetch 5 lines. It probably should use prefetch_range(). > > ok, how about the additional patch below? Does this do the trick on > ia64? It makes complete sense on e
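The arithmetic behind "528 bytes ... 5 lines" is a round-up division by the cache line size. The sketch below assumes 128-byte lines, which is consistent with the figures quoted but is not stated in this message; lines_to_cover() and prefetch_range_sketch() are hypothetical names, and the loop is a plain illustration rather than the kernel's prefetch_range().

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES 128   /* assumed cache line size; not stated in the thread */

/* Number of cache lines needed to cover len bytes, rounding up. */
static size_t lines_to_cover(size_t len)
{
    return (len + LINE_BYTES - 1) / LINE_BYTES;
}

/*
 * Issue one prefetch per line in [addr, addr + len) and return how many
 * were issued. Assumes addr is line-aligned; an unaligned range could
 * straddle one additional line.
 */
static size_t prefetch_range_sketch(const void *addr, size_t len)
{
    const char *p = addr;
    size_t issued = 0;

    assert(addr != NULL);
    for (size_t off = 0; off < len; off += LINE_BYTES) {
        __builtin_prefetch(p + off);   /* GCC/Clang builtin; a hint only */
        issued++;
    }
    return issued;
}
```

Under that assumption, 528 bytes gives (528 + 127) / 128 = 5 lines, while 512 bytes would fit in exactly 4, so a single prefetch() call covers only a fraction of the structure.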

Re: Add prefetch switch stack hook in scheduler function

2005-07-29 Thread Ingo Molnar
* Chen, Kenneth W <[EMAIL PROTECTED]> wrote: > > i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a > > real kernel_stack() implementation, the other architectures all return > > 'next'. (I've also cleaned up a couple of other things in the > > prefetch-next area, see the