The access_ok() and negative-length checks on each iov segment in
generic_file_aio_read/write are redundant; they are already performed
before calling down into these low-level generic functions.
Vector I/O (both sync and async) is checked via rw_copy_check_uvector().
For single segment
On a 64-bit arch like x86_64, struct bio is 104 bytes. Since the bio slab
is created with the SLAB_HWCACHE_ALIGN flag, there is usually spare memory
available at the end of each bio. I think we can utilize that memory for
bio_vec allocation. The purpose is not so much saving memory consumption
for bio_vec,
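As a rough illustration of the arithmetic (assuming a 64-byte x86_64 cache line and a 16-byte bio_vec of one page pointer plus two 32-bit fields; both are assumptions, not figures from the posting):

```c
#include <stddef.h>

/* Round x up to the next multiple of the power-of-two a. */
#define ALIGN_UP(x, a)  (((x) + (a) - 1) & ~((size_t)(a) - 1))

/* SLAB_HWCACHE_ALIGN rounds every object up to a cache-line multiple;
 * the rounding slack is unused tail space inside the same object. */
static size_t slab_tail_spare(size_t obj_size, size_t line_size)
{
    return ALIGN_UP(obj_size, line_size) - obj_size;
}
```

A 104-byte bio rounds up to 128 bytes, leaving 24 spare bytes at the tail: enough for one 16-byte bio_vec with no extra allocation.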
Zach Brown wrote on Monday, December 04, 2006 11:19 AM
On Dec 4, 2006, at 8:26 AM, Chen, Kenneth W wrote:
The access_ok() and negative length check on each iov segment in function
generic_file_aio_read/write are redundant. They are all already checked
before calling down
Andrew Morton wrote on Monday, December 04, 2006 11:36 AM
On Mon, 4 Dec 2006 08:26:36 -0800
Chen, Kenneth W [EMAIL PROTECTED] wrote:
So it's not possible to call down to generic_file_aio_read/write with an
invalid iov segment. Patch proposed to delete this redundant code.
erp, please
Jens Axboe wrote on Monday, December 04, 2006 12:07 PM
On Mon, Dec 04 2006, Chen, Kenneth W wrote:
On 64-bit arch like x86_64, struct bio is 104 bytes. Since the bio slab is
created with the SLAB_HWCACHE_ALIGN flag, there is usually spare memory
available at the end of bio. I think we can utilize
This patch implements a block-device-specific .direct_IO method instead
of going through the generic direct_io_worker for block devices.
direct_io_worker is fairly complex because it needs to handle O_DIRECT
on file systems, where it must perform block allocation, hole detection,
extents file on
A pagevec is never expected to hold more than PAGEVEC_SIZE pages, so I
think an unsigned char is enough to count them. This patch makes nr and
cold unsigned char and also adds an iterator index. With that, the size
can even be bumped up by 1 to 15.
Signed-off-by: Ken Chen [EMAIL PROTECTED]
diff -Nurp
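A sketch of the layouts before and after (PAGEVEC_SIZE of 14 matches kernels of that era; the field names follow the description above, and the padding is whatever the compiler inserts):

```c
#include <stddef.h>

struct page;                 /* opaque, as in the kernel */

#define PAGEVEC_SIZE 14

struct pagevec_old {         /* layout before the change */
    unsigned long nr;
    unsigned long cold;
    struct page *pages[PAGEVEC_SIZE];
};

struct pagevec_new {         /* nr/cold shrunk, iterator index added */
    unsigned char nr;
    unsigned char cold;
    unsigned char idx;
    struct page *pages[PAGEVEC_SIZE + 1];  /* bumped up by 1 to 15 */
};
```

Because pointer alignment pads the three bytes out anyway, the struct gets no larger while gaining a 15th slot.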
Andrew Morton wrote on Monday, December 04, 2006 9:45 PM
On Mon, 4 Dec 2006 21:21:31 -0800
Chen, Kenneth W [EMAIL PROTECTED] wrote:
pagevec is never expected to be more than PAGEVEC_SIZE, I think a
unsigned char is enough to count them. This patch makes nr, cold
to be unsigned char
Chris Mason wrote on Friday, December 01, 2006 7:37 AM
It benefits from a shorter path length. It takes much less time to process
one I/O request, in both the submit and completion paths. I always think in
terms of how many instructions, or clock ticks, it takes to convert a user
request
Zach Brown wrote on Thursday, November 30, 2006 1:45 PM
At that time, a patch was written for raw device to demonstrate that
large performance head room is achievable (at ~20% speedup for micro-
benchmark and ~2% for db transaction processing benchmark) with a
tight I/O submission
I've been complaining about O_DIRECT I/O processing being exceedingly
complex and slow since March 2005, see posting below:
http://marc.theaimsgroup.com/?l=linux-kernel&m=111033309732261&w=2
At that time, a patch was written for raw device to demonstrate that
large performance head room is achievable
Mike Galbraith wrote on Friday, November 17, 2006 2:19 PM
> > And a changelog, then we're all set!
> >
> > Oh. And a patch, too.
>
> Co-opt rq->timestamp_last_tick to maintain a cache_hot_time evaluation
> reference timestamp at both tick and sched times to prevent said
> reference, formerly
Ingo Molnar wrote on Friday, November 17, 2006 11:21 AM
> * Mike Galbraith <[EMAIL PROTECTED]> wrote:
>
> > One way to improve granularity, and eliminate the possibility of
> > p->last_run being > rq->timestamp_last_tick, and thereby short
> > circuiting the evaluation of cache_hot_time, is to
Mike Galbraith wrote on Friday, November 17, 2006 8:57 AM
> On Tue, 2006-11-14 at 15:00 -0800, Chen, Kenneth W wrote:
> > The argument used for task_hot in can_migrate_task() looks wrong:
> >
> > int can_migrate_task()
> > { ...
> >if (task_h
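For illustration, the test being discussed can be sketched stand-alone (2.6-era names; this is not the kernel's exact code, and the signed comparison reflects the fix being argued for rather than the original):

```c
/* A task is "cache hot" if it last ran within cache_hot_time of now.
 * All timestamps are in the same units (e.g. nanoseconds). */
static int task_hot_sketch(unsigned long long last_run,
                           unsigned long long now,
                           unsigned long long cache_hot_time)
{
    /* Signed difference: if clock granularity makes last_run exceed
     * now, the difference goes negative and the task still counts as
     * hot, instead of wrapping to a huge unsigned value and never
     * looking hot. */
    return (long long)(now - last_run) < (long long)cache_hot_time;
}
```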
Repost of a previously discussed patch (from Jul 27, 2005). Ingo did
the same thing for all arches with a 471-line patch. I'm still
advocating this little 30-line patch, of which 6 lines introduce the
prefetch_stack() generic interface.
Andrew, please consider -mm inclusion. Or advise me what I need
to do to
Dave McCracken wrote on Tuesday, August 30, 2005 3:13 PM
> This patch implements page table sharing for all shared memory regions that
> span an entire page table page. It supports sharing at multiple page
> levels, depending on the architecture.
In function pt_share_pte():
> +
Dave McCracken wrote on Tuesday, August 30, 2005 3:13 PM
> This patch implements page table sharing for all shared memory regions that
> span an entire page table page. It supports sharing at multiple page
> levels, depending on the architecture.
>
>
> This version of the patch supports i386
Ingo Molnar wrote on Saturday, July 30, 2005 12:19 AM
> * Nick Piggin <[EMAIL PROTECTED]> wrote:
>
> > > here's an updated patch. It handles one more detail: on SCHED_SMT we
> > > should check the idleness of siblings too. Benchmark numbers still
> > > look good.
> >
> > Maybe. Ken hasn't
"nohalt" option is currently broken on IPF (2.6.12 is the earliest
kernel I looked, might be broken even earlier).
- Ken
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chen, Kenneth W
Sent: Monday, August 08, 2005 3:25 PM
To: linux-ia64@vger.
Adam Litke wrote on Monday, August 08, 2005 3:17 PM
The reason for the VM_FAULT_SIGBUS default return is because I thought a
fault on a pte_present hugetlb page was an invalid/unhandled fault.
I'll have another think about races to the fault handler though.
Two threads fault on the same pte,
The "nohalt" option is currently broken on IPF (2.6.12 is the earliest
kernel I looked at; it might be broken even earlier).
- Ken
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Chen, Kenneth W
Sent: Monday, August 08, 2005 3:25 PM
To: linux-ia64@vger.kernel.org
Adam Litke wrote on Friday, August 05, 2005 8:22 AM
Below is a patch to implement demand faulting for huge pages. The main
motivation for changing from prefaulting to demand faulting is so that
huge page allocations can follow the NUMA API. Currently, huge pages
are allocated round-robin
Adam Litke wrote on Friday, August 05, 2005 8:22 AM
+int hugetlb_pte_fault(struct mm_struct *mm, struct vm_area_struct *vma,
+ unsigned long address, int write_access)
+{
+ int ret = VM_FAULT_MINOR;
+ unsigned long idx;
+ pte_t *pte;
+ struct page *page;
from all NUMA nodes.
Chen, Kenneth W wrote on Friday, August 05, 2005 2:34 PM
Spurious WARN_ON. Calls to hugetlb_pte_fault() are conditioned upon
if (is_vm_hugetlb_page(vma))
Broken here. Return VM_FAULT_SIGBUS when *pte is present?? Why
can't you move all the logic
Andi Kleen wrote on Thursday, August 04, 2005 3:54 PM
> > This might be too low on large system. We usually stress shm pretty hard
> > for db application and usually use more than 87% of total memory in just
> > one shm segment. So I prefer either no limit or a tunable.
>
> With large system
Andi Kleen wrote on Thursday, August 04, 2005 6:24 AM
> I think we should just get rid of the per process limit and keep
> the global limit, but make it auto tuning based on available memory.
> That is still not very nice because that would likely keep it < available
> memory/2, but I suspect
Nick Piggin wrote on Thursday, July 28, 2005 7:01 PM
Chen, Kenneth W wrote:
Well, that's exactly what I'm trying to do: make them not aggressive
at all by not performing any load balance :-) The workload gets maximum
benefit with zero aggressiveness.
Unfortunately we can't forget about
Ingo Molnar wrote on Friday, July 29, 2005 12:05 AM
--- linux.orig/kernel/sched.c
+++ linux/kernel/sched.c
@@ -2869,7 +2869,14 @@ go_idle:
* its thread_info, its kernel stack and mm:
*/
prefetch(next->thread_info);
- prefetch(kernel_stack(next));
+ /*
+ *
Keith Owens wrote on Friday, July 29, 2005 12:46 AM
On Fri, 29 Jul 2005 00:22:43 -0700,
Chen, Kenneth W [EMAIL PROTECTED] wrote:
On ia64, we have two kernel stacks, one for the outgoing task and one for
the incoming task. For the outgoing task, we haven't called switch_to() yet.
So the switch stack
Keith Owens wrote on Friday, July 29, 2005 12:38 AM
BTW, for ia64 you may as well prefetch pt_regs, that is also quite
large.
#define MIN_KERNEL_STACK_FOOTPRINT (IA64_SWITCH_STACK_SIZE +
IA64_PT_REGS_SIZE)
This has to be carefully done, because you really don't want to overwhelm
number of
Ingo Molnar wrote on Friday, July 29, 2005 12:07 AM
the patch below unrolls the prefetch_range() loop manually, for up to 5
cachelines prefetched. This patch, on top of the 4 previous patches,
should generate similar code to the assembly code in your original
patch. The full patch-series is:
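The snippet above cuts off; purely as an illustrative aside (not Ingo's actual patch), a prefetch_range()-style loop can be sketched as below, assuming a 128-byte line size:

```c
#include <stddef.h>

#define CACHE_LINE 128   /* assumption: ia64 L2 line size of that era */

/* Sketch of a prefetch_range()-style helper: issue one software
 * prefetch per CACHE_LINE bytes over [addr, addr + len). Returns the
 * number of prefetches issued so the loop is easy to check. */
static unsigned int prefetch_range_sketch(const void *addr, size_t len)
{
    const char *p = (const char *)addr;
    const char *end = p + len;
    unsigned int lines = 0;

    for (; p < end; p += CACHE_LINE, lines++)
        __builtin_prefetch(p, 0 /* read */, 3 /* keep in cache */);
    return lines;
}
```

A 528-byte ia64 switch stack then takes ceil(528/128) = 5 prefetches, matching the 5-cacheline unroll discussed here.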
Ingo Molnar wrote on Friday, July 29, 2005 1:36 AM
* Chen, Kenneth W [EMAIL PROTECTED] wrote:
It generates slightly different code because the previous patch asks for a
little over 5 cache lines' worth of bytes and it always goes to the for loop.
ok - fix below. But i'm not that sure we want
Ingo Molnar wrote on Friday, July 29, 2005 4:26 AM
* Chen, Kenneth W [EMAIL PROTECTED] wrote:
To demonstrate the problem, we turned off these two flags in the cpu
sd domain and measured a stunning 2.15% performance gain! And
deleting all the code in try_to_wake_up() pertaining to load
Nick Piggin wrote on Thursday, July 28, 2005 6:46 PM
> I'd like to try making them less aggressive first if possible.
Well, that's exactly what I'm trying to do: make them not aggressive
at all by not performing any load balance :-) The workload gets maximum
benefit with zero aggressiveness.
-
Nick Piggin wrote on Thursday, July 28, 2005 6:25 PM
> Well pipes are just an example. It could be any type of communication.
> What's more, even the synchronous wakeup uses the wake balancing path
> (although that could be modified to only do wake balancing for synch
> wakeups, I'd have to be
Nick Piggin wrote on Thursday, July 28, 2005 4:35 PM
> Wake balancing provides an opportunity to provide some input bias
> into the load balancer.
>
> For example, if you started 100 pairs of tasks which communicate
> through a pipe. On a 2 CPU system without wake balancing, probably
> half of
What sort of workload needs SD_WAKE_AFFINE and SD_WAKE_BALANCE?
SD_WAKE_AFFINE is not useful in conjunction with interrupt binding.
In fact, it creates more harm than good, causing detrimental
process migration, destroying process cache affinity, etc. Also,
SD_WAKE_BALANCE is giving us
i.e. like the patch below. Boot-tested on x86. x86, x64 and ia64 have a
real kernel_stack() implementation, the other architectures all return
'next'. (I've also cleaned up a couple of other things in the
prefetch-next area, see the changelog below.)
Ken, would this patch generate a
This patch adds an ia64-specific implementation to prefetch the switch
stack structure. It applies on top of "add prefetch switch stack hook ..."
posted earlier. Using my favorite industry-standard OLTP workload, we
measured a 6.2X reduction in cache misses in the context-switch
code and yielded
I would like to propose adding a prefetch switch stack hook in
the scheduler function. For an architecture like ia64, the switch
stack structure is fairly large (currently 528 bytes). For context
switch intensive applications, we found that a significant number of
cache misses occur in switch_to()
[EMAIL PROTECTED] wrote on Thursday, July 14, 2005 3:18 PM
Chen, Kenneth W [EMAIL PROTECTED] writes:
I'm pleased to announce that we have established a linux kernel
performance project, hosted at sourceforge.net:
http://kernel-perf.sourceforge.net
That's very cool. Thanks a lot
Alexey Dobriyan wrote on Thursday, July 14, 2005 3:34 PM
On Friday 15 July 2005 00:21, Chen, Kenneth W wrote:
I'm pleased to announce that we have established a linux kernel
performance project, hosted at sourceforge.net:
http://kernel-perf.sourceforge.net
Perhaps, some cool-looking
Too lazy to write a patch, but the inline debugfs function declarations
for the following three functions disagree between CONFIG_DEBUG_FS
and !CONFIG_DEBUG_FS.
4th argument mismatch; looks like an obvious copy-and-paste error.
u16, u32, and u32?
static inline struct dentry *debugfs_create_u16(const
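For illustration, a consistent pair of !CONFIG_DEBUG_FS stubs would look like the sketch below (simplified, with local typedefs and an int mode; the real prototypes live in include/linux/debugfs.h, and the point is only that the value-pointer type must match the function's width):

```c
#include <stddef.h>

typedef unsigned short u16;
typedef unsigned int   u32;
struct dentry;

/* Sketch of !CONFIG_DEBUG_FS stubs: each width takes the matching
 * pointer type. Copy-pasting a u32 * argument into the u16 variant is
 * exactly the mismatch being reported. */
static inline struct dentry *debugfs_create_u16(const char *name, int mode,
                                                struct dentry *parent,
                                                u16 *value)
{
    (void)name; (void)mode; (void)parent; (void)value;
    return NULL;   /* debugfs disabled */
}

static inline struct dentry *debugfs_create_u32(const char *name, int mode,
                                                struct dentry *parent,
                                                u32 *value)
{
    (void)name; (void)mode; (void)parent; (void)value;
    return NULL;   /* debugfs disabled */
}
```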
On Tue, Apr 12 2005, Nick Piggin wrote:
Actually the patches I have sent you do fix real bugs, but they also
make the block layer less likely to recurse into page reclaim, so it
may be eg. hiding the problem that Neil's patch fixes.
Jens Axboe wrote on Tuesday, April 12, 2005 12:08 AM
Can you
Chen, Kenneth W wrote on Tuesday, April 05, 2005 5:13 PM
Jens Axboe wrote on Tuesday, April 05, 2005 7:54 AM
On Tue, Mar 29 2005, Chen, Kenneth W wrote:
Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
No such promise was ever made, noop just means it does 'basically
nothing
Nick Piggin wrote on Tuesday, April 12, 2005 4:09 AM
Chen, Kenneth W wrote:
I like the patch a lot and already did bench it on our db setup. However,
I'm seeing a negative regression compared to a very very crappy patch (see
attached, you can laugh at me for doing things like that :-).
OK
Ingo Molnar wrote on Tuesday, April 05, 2005 11:46 PM
> ok, the delay of 16 secs is a lot better. Could you send me the full
> detection log, how stable is the curve?
Full log attached.
Ingo Molnar wrote on Sunday, April 03, 2005 11:24 PM
great! How long does the benchmark take (hours?), and is there any way
to speed up the benchmarking (without hurting accuracy), so that
multiple migration-cost settings could be tried? Would it be possible to
try a few other values via the
Jens Axboe wrote on Tuesday, April 05, 2005 7:54 AM
On Tue, Mar 29 2005, Chen, Kenneth W wrote:
Jens Axboe wrote on Tuesday, March 29, 2005 12:04 PM
No such promise was ever made, noop just means it does 'basically
nothing'. It never meant FIFO in anyway, we cannot break the semantics
Ingo Molnar wrote on Monday, April 04, 2005 8:05 PM
latest patch attached. Changes:
- stabilized calibration even more, by using cache flushing
instructions to generate a predictable working set. The cache
flushing itself is not timed, it is used to create quiescent
cache state.
* Chen, Kenneth W [EMAIL PROTECTED] wrote:
The cache size information on ia64 is already available at our fingertips.
Here is a patch that I whipped up to set max_cache_size for ia64.
Ingo Molnar wrote on Monday, April 04, 2005 4:38 AM
thanks - i've added this to my tree.
i've attached
Ingo Molnar wrote on Sunday, April 03, 2005 7:30 AM
> how close are these numbers to the real worst-case migration costs on
> that box?
I booted your latest patch on a 4-way SMP box (1.5 GHz, 9MB ia64). This
is what it produces. I think the estimate is excellent.
[00]: -10.4(0) 10.4(0)
Ingo Molnar wrote on Saturday, April 02, 2005 11:04 PM
the default on ia64 (32MB) was way too large and caused the search to
start from 64MB. That can take a _long_ time.
i've attached a new patch with your changes included, and a couple of
new things added:
- removed the 32MB
Siddha, Suresh B wrote on Friday, April 01, 2005 8:05 PM
On Sat, Apr 02, 2005 at 01:11:20PM +1000, Nick Piggin wrote:
How important is this? Any application to real workloads? Even if
not, I agree it would be nice to improve this more. I don't know
if I really like this approach - I guess
Paul Jackson wrote on Friday, April 01, 2005 5:45 PM
> Kenneth wrote:
> > Paul, you definitely want to check this out on your large numa box.
>
> Interesting - thanks. I can get a kernel patched and booted on a big
> box easily enough. I don't know how to run an "industry db benchmark",
> and
the problem that would be an easy thing to fix for 2.6.12.
Chen, Kenneth W wrote on Thursday, March 31, 2005 9:15 PM
> Yes, we are increasing the number in our experiments. It's in the queue
> and I should have a result soon.
Hot off the press: bumping up cache_hot_time to 10ms on our db s
Grecko OSCP wrote on Friday, April 01, 2005 10:22 AM
> I noticed yesterday a news article on Linux.org about more kernel
> performance testing being called for, and I decided it would be a nice
> project to try. I have 10 completely identical systems that can be
> used for this, and would like to