Re: remove zero_page (was Re: -mm merge plans for 2.6.24)

2007-10-09 Thread Nick Piggin
On Thursday 04 October 2007 01:21, Linus Torvalds wrote: > On Wed, 3 Oct 2007, Nick Piggin wrote: > > I don't know if Linus actually disliked the patch itself, or disliked > > my (maybe confusingly worded) rationale? > > Yes. I'd happily accept the patch, but I'd want

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-09 Thread Nick Piggin
On Tuesday 09 October 2007 12:12, Mark Fasheh wrote: > On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote: > > > block_page_mkwrite() is just using generic interfaces to do this, > > > same as pretty much any write() system call. The idea was to make it > >

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-09 Thread Nick Piggin
On Tuesday 09 October 2007 12:12, Mark Fasheh wrote: On Mon, Oct 08, 2007 at 05:47:52PM +1000, Nick Piggin wrote: block_page_mkwrite() is just using generic interfaces to do this, same as pretty much any write() system call. The idea was to make it as similar to the write() call path

Re: remove zero_page (was Re: -mm merge plans for 2.6.24)

2007-10-09 Thread Nick Piggin
On Thursday 04 October 2007 01:21, Linus Torvalds wrote: On Wed, 3 Oct 2007, Nick Piggin wrote: I don't know if Linus actually disliked the patch itself, or disliked my (maybe confusingly worded) rationale? Yes. I'd happily accept the patch, but I'd want it clarified and made obvious what

Re: [PATCH -mm -v4 1/3] i386/x86_64 boot: setup data

2007-10-09 Thread Nick Piggin
On Tuesday 09 October 2007 16:40, Huang, Ying wrote: +unsigned long copy_from_phys(void *to, unsigned long from_phys, + unsigned long n) +{ + struct page *page; + void *from; + unsigned long remain = n, offset, trunck; + + while (remain) { +

Re: [PATCH -mm -v4 1/3] i386/x86_64 boot: setup data

2007-10-09 Thread Nick Piggin
On Tuesday 09 October 2007 18:22, Huang, Ying wrote: On Tue, 2007-10-09 at 01:25 +1000, Nick Piggin wrote: On Tuesday 09 October 2007 16:40, Huang, Ying wrote: +unsigned long copy_from_phys(void *to, unsigned long from_phys, + unsigned long n) I suppose that's

Re: [PATCH -mm -v4 1/3] i386/x86_64 boot: setup data

2007-10-09 Thread Nick Piggin
On Tuesday 09 October 2007 18:55, Huang, Ying wrote: On Tue, 2007-10-09 at 02:06 +1000, Nick Piggin wrote: I'm just wondering whether you really need to access highmem in boot code... Because the zero page (boot_parameters) of i386 boot protocol has 4k limitation, a linked list style boot

Re: [13/18] x86_64: Allow fallback for the stack

2007-10-09 Thread Nick Piggin
On Wednesday 10 October 2007 04:39, Christoph Lameter wrote: On Mon, 8 Oct 2007, Nick Piggin wrote: The tight memory restrictions on stack usage do not come about because of the difficulty in increasing the stack size :) It is because we want to keep stack sizes small! Increasing

Re: remove zero_page (was Re: -mm merge plans for 2.6.24)

2007-10-09 Thread Nick Piggin
On Wednesday 10 October 2007 00:52, Linus Torvalds wrote: On Tue, 9 Oct 2007, Nick Piggin wrote: I have done some tests which indicate a couple of very basic common tools don't do much zero-page activity (ie. kbuild). And also combined with some logical arguments to say that a sane app

Re: [13/18] x86_64: Allow fallback for the stack

2007-10-09 Thread Nick Piggin
On Wednesday 10 October 2007 11:26, Christoph Lameter wrote: On Tue, 9 Oct 2007, Nick Piggin wrote: We already use 32k stacks on IA64. So the memory argument fail there. I'm talking about generic code. The stack size is set in arch code not in generic code. Generic code must assume a 4K

Re: remove zero_page (was Re: -mm merge plans for 2.6.24)

2007-10-09 Thread Nick Piggin
On Wednesday 10 October 2007 12:22, Linus Torvalds wrote: On Tue, 9 Oct 2007, Nick Piggin wrote: Where do you suggest I go from here? Is there any way I can convince you to try it? Make it a config option? (just kidding) No, I'll take the damn patch, but quite frankly, I think your

Re: howto boost write(2) performance?

2007-10-09 Thread Nick Piggin
On Tuesday 09 October 2007 23:50, Michael Stiller wrote: Hi list, i'm developing an application (in C) which needs to write about 1Gbit/s (125Mb/s) to a disk array attached via U320 SCSI. It runs on Dual Core 2 Xeons @2Ghz utilizing kernel 2.6.22.7. I buffer the data in (currently 4) 400Mb

Re: [13/18] x86_64: Allow fallback for the stack

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:36, Christoph Lameter wrote: > On Sun, 7 Oct 2007, Nick Piggin wrote: > > > The problem can become non-rare on special low memory machines doing > > > wild swapping things though. > > > > But only your huge systems will be using hu

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 09:36, David Chinner wrote: > On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote: > > On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote: > > > Force a balance call if ->page_mkwrite() was successful. > > > > Would it b

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:51, Andrew Morton wrote: > On Mon, 8 Oct 2007 10:28:43 -0700 > > I'll now add remap_file_pages soon. > > Maybe those other 2 tests aren't strong enough (?). > > Or maybe they don't return a non-0 exit status even when they fail... > > (I'll check.) > > Perhaps Yan

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:04, Andrew Morton wrote: > On Mon, 8 Oct 2007 19:45:08 +0800 "Yan Zheng" <[EMAIL PROTECTED]> wrote: > > Hi all > > > > The test for VM_CAN_NONLINEAR always fails > > > > Signed-off-by: Yan Zheng<[EMAIL PROTECTED]> > > > > diff -ur linux-2.6.23-rc9/mm/fremap.c

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
ing. I doubt anybody will be using nonlinear mappings on anything but regular files for the time being, but as a trivial fix, I think this probably should go into 2.6.23. Thanks for spotting this problem Acked-by: Nick Piggin <[EMAIL PROTECTED]> > I hope Nick or Miklos is clearer on

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote: > It seems that with the recent usage of ->page_mkwrite() a little detail > was overlooked. > > .22-rc1 merged OCFS2 usage of this hook > .23-rc1 merged XFS usage > .24-rc1 will most likely merge NFS usage > > Please consider this for .23

[patch] fs: restore nobh

2007-10-08 Thread Nick Piggin
Hi, This is overdue, sorry. Got a little complicated, and I've been away from my filesystem test setup so I didn't want to send it (lucky, coz I found a bug after more substantial testing). Anyway, RFC? --- Implement nobh in new aops. This is a bit tricky. FWIW, nobh_truncate is now implemented

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote: It seems that with the recent usage of ->page_mkwrite() a little detail was overlooked. .22-rc1 merged OCFS2 usage of this hook .23-rc1 merged XFS usage .24-rc1 will most likely merge NFS usage Please consider this for .23 final and

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
for the time being, but as a trivial fix, I think this probably should go into 2.6.23. Thanks for spotting this problem Acked-by: Nick Piggin <[EMAIL PROTECTED]> I hope Nick or Miklos is clearer on what the risks are. (Apologies for all the nots and nons here, I'm embarrassed after just criticizing

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:04, Andrew Morton wrote: On Mon, 8 Oct 2007 19:45:08 +0800 "Yan Zheng" <[EMAIL PROTECTED]> wrote: Hi all The test for VM_CAN_NONLINEAR always fails Signed-off-by: Yan Zheng <[EMAIL PROTECTED]> diff -ur linux-2.6.23-rc9/mm/fremap.c linux/mm/fremap.c ---

Re: [PATCH]fix VM_CAN_NONLINEAR check in sys_remap_file_pages

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:51, Andrew Morton wrote: On Mon, 8 Oct 2007 10:28:43 -0700 I'll now add remap_file_pages soon. Maybe those other 2 tests aren't strong enough (?). Or maybe they don't return a non-0 exit status even when they fail... (I'll check.) Perhaps Yan Zheng can

Re: [PATCH] mm: set_page_dirty_balance() vs ->page_mkwrite()

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 09:36, David Chinner wrote: On Mon, Oct 08, 2007 at 04:37:00PM +1000, Nick Piggin wrote: On Tuesday 09 October 2007 02:54, Peter Zijlstra wrote: Force a balance call if ->page_mkwrite() was successful. Would it be better to just have the callers

Re: [13/18] x86_64: Allow fallback for the stack

2007-10-08 Thread Nick Piggin
On Tuesday 09 October 2007 03:36, Christoph Lameter wrote: On Sun, 7 Oct 2007, Nick Piggin wrote: The problem can become non-rare on special low memory machines doing wild swapping things though. But only your huge systems will be using huge stacks? I have no idea who else would

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-07 Thread Nick Piggin
On Friday 05 October 2007 12:44, Andrew Morton wrote: > On Thu, 04 Oct 2007 18:43:32 -0700 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote: > > David's change 10a8d6ae4b3182d6588a5809a8366343bc295c20, "i386: add > > ptep_test_and_clear_{dirty,young}" has introduced an SMP race which > > affects the

Re: [DEBUG PATCH] demonstration of hang in sched domain partitioning code

2007-10-07 Thread Nick Piggin
On Friday 05 October 2007 19:24, Paul Jackson wrote: > Nick, > > I'm running into a problem trying to use your suggested > partition_sched_domains(cpumask_t *partition) routine the way > I thought I was supposed to be able to use it. > > If I ask to set up the same partition twice in a row having

Re: [13/18] x86_64: Allow fallback for the stack

2007-10-07 Thread Nick Piggin
On Friday 05 October 2007 07:20, Christoph Lameter wrote: > On Thu, 4 Oct 2007, Rik van Riel wrote: > > > Well we can now address the rarity. That is the whole point of the > > > patchset. > > > > Introducing complexity to fight a very rare problem with a good > > fallback (refusing to fork more

Re: [13/18] x86_64: Allow fallback for the stack

2007-10-07 Thread Nick Piggin
On Friday 05 October 2007 07:20, Christoph Lameter wrote: On Thu, 4 Oct 2007, Rik van Riel wrote: Well we can now address the rarity. That is the whole point of the patchset. Introducing complexity to fight a very rare problem with a good fallback (refusing to fork more tasks, as well

Re: [DEBUG PATCH] demonstration of hang in sched domain partitioning code

2007-10-07 Thread Nick Piggin
On Friday 05 October 2007 19:24, Paul Jackson wrote: Nick, I'm running into a problem trying to use your suggested partition_sched_domains(cpumask_t *partition) routine the way I thought I was supposed to be able to use it. If I ask to set up the same partition twice in a row having just

Re: race with page_referenced_one->ptep_test_and_clear_young and pagetable setup/pulldown

2007-10-07 Thread Nick Piggin
On Friday 05 October 2007 12:44, Andrew Morton wrote: On Thu, 04 Oct 2007 18:43:32 -0700 Jeremy Fitzhardinge <[EMAIL PROTECTED]> wrote: David's change 10a8d6ae4b3182d6588a5809a8366343bc295c20, "i386: add ptep_test_and_clear_{dirty,young}" has introduced an SMP race which affects the Xen pv-ops

Re: [BUG] kernel BUG at arch/i386/mm/highmem.c:15! on 2.6.23-rc8/rc9

2007-10-04 Thread Nick Piggin
On Thursday 04 October 2007 00:53, Nick Piggin wrote: > On Thursday 04 October 2007 16:37, gurudas pai wrote: > > Hi, > > > > While running Oracle database test on x86/6GB RAM machine panics with > > following messages. > > Hi, > > Hmm, seems like somet

Re: [BUG] kernel BUG at arch/i386/mm/highmem.c:15! on 2.6.23-rc8/rc9

2007-10-04 Thread Nick Piggin
On Thursday 04 October 2007 16:37, gurudas pai wrote: > Hi, > > While running Oracle database test on x86/6GB RAM machine panics with > following messages. Hi, Hmm, seems like something in sys_remap_file_pages might have broken. It's a bit hard to work out from the backtrace, though. Is it

Re: [BUG] kernel BUG at arch/i386/mm/highmem.c:15! on 2.6.23-rc8/rc9

2007-10-04 Thread Nick Piggin
On Thursday 04 October 2007 16:37, gurudas pai wrote: Hi, While running Oracle database test on x86/6GB RAM machine panics with following messages. Hi, Hmm, seems like something in sys_remap_file_pages might have broken. It's a bit hard to work out from the backtrace, though. Is it possible

Re: [BUG] kernel BUG at arch/i386/mm/highmem.c:15! on 2.6.23-rc8/rc9

2007-10-04 Thread Nick Piggin
On Thursday 04 October 2007 00:53, Nick Piggin wrote: On Thursday 04 October 2007 16:37, gurudas pai wrote: Hi, While running Oracle database test on x86/6GB RAM machine panics with following messages. Hi, Hmm, seems like something in sys_remap_file_pages might have broken. It's

[rfc][patch 2/3] x86: fix IO write barriers

2007-10-03 Thread Nick Piggin
wmb() on x86 must always include a barrier, because stores can go out of order in many cases when dealing with devices (eg. WC memory). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/include/asm-i386/sy

[rfc][patch 3/3] x86: optimise barriers

2007-10-03 Thread Nick Piggin
. smp_rmb on buggy pentium pros remains a locked op, which is apparently required. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/include/asm-i386/system.h === --- linux-2.6.orig/include/asm-i386/system.h +++

[rfc][patch 1/3] x86_64: fence nontemproal stores

2007-10-03 Thread Nick Piggin
sfences _before_ movnt* everywhere too? ] Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/arch/x86_64/lib/copy_user_nocache.S === --- linux-2.6.orig/arch/x86_64/lib/copy_user_nocache.S +++ linux-2.6/arch/x86_

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 22:17, Paul Jackson wrote: > Nick wrote: > > OK, so I don't exactly understand you either. To make it simple, can > > you give a concrete example of a cpuset hierarchy that wouldn't > > work? > > It's more a matter of knowing how my third party batch scheduler > coders

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 22:41, Paul Jackson wrote: > > pdflush > > is not pinned at all and can be dynamically created and destroyed. Ditto > > for kjournald, as well as many others. > > Whatever is not pinned is moved out of the top cpuset, on the kind of > systems I'm most familiar with.

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 22:14, Paul Jackson wrote: > > These are what I'm worried about, and things like kswapd, pdflush, > > could definitely use a huge amount of CPU. > > > > If you are interested in hard partitioning the system, you most > > definitely want these things to be balanced

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 21:38, Paul Jackson wrote: > > OK, so to really do anything different (from a non-partitioned setup), > > you would need to set sched_load_balance=0 for the root cpuset? > > Suppose you do that to hard partition the machine, what happens to > > newly created tasks

Re: pgd_none_or_clear_bad strangeness?

2007-10-03 Thread Nick Piggin
On Tue, Oct 02, 2007 at 05:20:03PM -0500, Matt Mackall wrote: > In lib/pagewalk.c, I've been using the various forms of > {pgd,pud,pmd}_none_or_clear_bad while walking page tables as that > seemed the canonical way to do things. Lately (eg with -rc7-mm1), > these have been triggering messages like

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 19:55, Paul Jackson wrote: > > > Yeah -- cpusets are hierarchical. And some of the use cases for > > > which cpusets are designed are hierarchical. > > > > But partitioning isn't. > > Yup. We've got a square peg and a round hole. An impedance mismatch. > That's the

remove zero_page (was Re: -mm merge plans for 2.6.24)

2007-10-03 Thread Nick Piggin
On Tuesday 02 October 2007 07:22, Andrew Morton wrote: > remove-zero_page.patch > > Linus dislikes it. Probably drop it. I don't know if Linus actually disliked the patch itself, or disliked my (maybe confusingly worded) rationale? To clarify: it is not zero_page that fundamentally causes a

Re: [patch] sched: fix sched-domains partitioning by cpusets

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 19:39, Paul Jackson wrote: > > in any case i'd like to see the externally visible API get in foremost - > > and there now seems to be agreement about that. (yay!) Any internal > > shaping of APIs can be done flexibly between cpusets and the scheduler. > > Yup - though

Re: [patch] sched: fix sched-domains partitioning by cpusets

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 19:21, Paul Jackson wrote: > Nick wrote: > > Sorry for the confusion: I only meant the sched.c part of that > > patch, not the full thing. > > Ah - ok. We're getting closer then. Good. > > Let me be sure I've got this right then. > > You prefer the interface from

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 17:25, Paul Jackson wrote: > Nick wrote: > > BTW. as far as the sched.c changes in your patch go, I much prefer > > the partition_sched_domains API: http://lkml.org/lkml/2006/10/19/85 > > > > The caller should manage everything itself, rather than > >

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 16:58, Paul Jackson wrote: > > > Yup - it's asking for load balancing over that set. That is why it is > > > called that. There's no idea here of better or worse load balancing, > > > that's an internal kernel scheduler subtlety -- it's just a request > > > that load

Re: [PATCH] mark read_crX() asm code as volatile

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 16:18, H. Peter Anvin wrote: > Nick Piggin wrote: > >> This should work because the result gets used before reading again: > >> > >> read_cr3(a); > >> write_cr3(a | 1); > >> read_cr3(a); > >> > >>

Re: [patch] sched: fix sched-domains partitioning by cpusets

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 16:56, Paul Jackson wrote: > I must NAQ this patch, and I'm surprised to see Nick propose it > again, as I thought he had already agreed that it didn't suffice. Sorry for the confusion: I only meant the sched.c part of that patch, not the full thing.

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Tuesday 02 October 2007 04:15, Paul Jackson wrote: > Nick wrote: > > which you could equally achieve by adding > > a second set of sched domains (and the global domains could keep > > globally balancing). > > Hmmm ... this could be the key to this discussion. > > Nick - can two sched domains

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Tuesday 02 October 2007 04:15, Paul Jackson wrote: Nick wrote: which you could equally achieve by adding a second set of sched domains (and the global domains could keep globally balancing). Hmmm ... this could be the key to this discussion. Nick - can two sched domains overlap? And

Re: [patch] sched: fix sched-domains partitioning by cpusets

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 16:56, Paul Jackson wrote: I must NAQ this patch, and I'm surprised to see Nick propose it again, as I thought he had already agreed that it didn't suffice. Sorry for the confusion: I only meant the sched.c part of that patch, not the full thing.

Re: [PATCH] mark read_crX() asm code as volatile

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 16:18, H. Peter Anvin wrote: Nick Piggin wrote: This should work because the result gets used before reading again: read_cr3(a); write_cr3(a | 1); read_cr3(a); But this might be reordered so that b gets read before the write: read_cr3(a); write_cr3

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 16:58, Paul Jackson wrote: Yup - it's asking for load balancing over that set. That is why it is called that. There's no idea here of better or worse load balancing, that's an internal kernel scheduler subtlety -- it's just a request that load balancing

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 17:25, Paul Jackson wrote: Nick wrote: BTW. as far as the sched.c changes in your patch go, I much prefer the partition_sched_domains API: http://lkml.org/lkml/2006/10/19/85 The caller should manage everything itself, rather than partition_sched_domains

Re: [patch] sched: fix sched-domains partitioning by cpusets

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 19:21, Paul Jackson wrote: Nick wrote: Sorry for the confusion: I only meant the sched.c part of that patch, not the full thing. Ah - ok. We're getting closer then. Good. Let me be sure I've got this right then. You prefer the interface from your proposed

Re: [patch] sched: fix sched-domains partitioning by cpusets

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 19:39, Paul Jackson wrote: in any case i'd like to see the externally visible API get in foremost - and there now seems to be agreement about that. (yay!) Any internal shaping of APIs can be done flexibly between cpusets and the scheduler. Yup - though Nick and

remove zero_page (was Re: -mm merge plans for 2.6.24)

2007-10-03 Thread Nick Piggin
On Tuesday 02 October 2007 07:22, Andrew Morton wrote: remove-zero_page.patch Linus dislikes it. Probably drop it. I don't know if Linus actually disliked the patch itself, or disliked my (maybe confusingly worded) rationale? To clarify: it is not zero_page that fundamentally causes a

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 19:55, Paul Jackson wrote: Yeah -- cpusets are hierarchical. And some of the use cases for which cpusets are designed are hierarchical. But partitioning isn't. Yup. We've got a square peg and a round hole. An impedance mismatch. That's the root cause of

Re: pgd_none_or_clear_bad strangeness?

2007-10-03 Thread Nick Piggin
On Tue, Oct 02, 2007 at 05:20:03PM -0500, Matt Mackall wrote: In lib/pagewalk.c, I've been using the various forms of {pgd,pud,pmd}_none_or_clear_bad while walking page tables as that seemed the canonical way to do things. Lately (eg with -rc7-mm1), these have been triggering messages like bad

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 21:38, Paul Jackson wrote: OK, so to really do anything different (from a non-partitioned setup), you would need to set sched_load_balance=0 for the root cpuset? Suppose you do that to hard partition the machine, what happens to newly created tasks like kernel

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 22:14, Paul Jackson wrote: These are what I'm worried about, and things like kswapd, pdflush, could definitely use a huge amount of CPU. If you are interested in hard partitioning the system, you most definitely want these things to be balanced across the

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 22:41, Paul Jackson wrote: pdflush is not pinned at all and can be dynamically created and destroyed. Ditto for kjournald, as well as many others. Whatever is not pinned is moved out of the top cpuset, on the kind of systems I'm most familiar with. They are

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-03 Thread Nick Piggin
On Wednesday 03 October 2007 22:17, Paul Jackson wrote: Nick wrote: OK, so I don't exactly understand you either. To make it simple, can you give a concrete example of a cpuset hierarchy that wouldn't work? It's more a matter of knowing how my third party batch scheduler coders think.

[rfc][patch 1/3] x86_64: fence nontemproal stores

2007-10-03 Thread Nick Piggin
sfences _before_ movnt* everywhere too? ] Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/arch/x86_64/lib/copy_user_nocache.S === --- linux-2.6.orig/arch/x86_64/lib/copy_user_nocache.S +++ linux-2.6/arch/x86_64/lib

[rfc][patch 2/3] x86: fix IO write barriers

2007-10-03 Thread Nick Piggin
wmb() on x86 must always include a barrier, because stores can go out of order in many cases when dealing with devices (eg. WC memory). Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> Index: linux-2.6/include/asm-i386/system.h

[rfc][patch 3/3] x86: optimise barriers

2007-10-03 Thread Nick Piggin
. smp_rmb on buggy pentium pros remains a locked op, which is apparently required. Signed-off-by: Nick Piggin <[EMAIL PROTECTED]> --- Index: linux-2.6/include/asm-i386/system.h === --- linux-2.6.orig/include/asm-i386/system.h +++ linux-2.6

Re: wibbling over the cpuset shed domain connnection

2007-10-02 Thread Nick Piggin
On Wednesday 03 October 2007 15:21, Paul Jackson wrote: > > In the meantime, that patch should be merged though, shouldn't it? > > Which patch do you refer to: > 1) the year old patch to disconnect cpusets and sched domains: > cpuset-remove-sched-domain-hooks-from-cpusets.patch > 2) my

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-02 Thread Nick Piggin
On Monday 01 October 2007 13:42, Paul Jackson wrote: > Nick wrote: > > Moreover, sched_load_balance doesn't really sound like a good name > > for asking for a partition. > > Yup - it's not a good name for asking for a partition. > > That's because it isn't asking for a partition. > > It's asking

Re: wibbling over the cpuset shed domain connnection

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 07:34, Paul Jackson wrote: > In -mm merge plans for 2.6.24, Andrew wrote: > > cpuset-remove-sched-domain-hooks-from-cpusets.patch > > > > Paul continues to wibble over this. Hold, I guess. > > Oh dear ... after looking at the following to figure out what > a wibble

Re: [PATCH] mark read_crX() asm code as volatile

2007-10-02 Thread Nick Piggin
On Wednesday 03 October 2007 04:27, Chuck Ebbert wrote: > On 10/02/2007 11:28 AM, Arjan van de Ven wrote: > > On Tue, 02 Oct 2007 18:08:32 +0400 > > > > Kirill Korotaev <[EMAIL PROTECTED]> wrote: > >> Some gcc versions (I checked at least 4.1.1 from RHEL5 & 4.1.2 from > >> gentoo) can generate

Re: per BDI dirty limit (was Re: -mm merge plans for 2.6.24)

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 21:40, Peter Zijlstra wrote: > On Tue, 2007-10-02 at 13:21 +0200, Kay Sievers wrote: > > How about adding this information to the tree then, instead of > > creating a new top-level hack, just because something that you think > > you need doesn't exist. > > So you

Re: kswapd min order, slub max order [was Re: -mm merge plans for 2.6.24]

2007-10-02 Thread Nick Piggin
On Wednesday 03 October 2007 02:06, Hugh Dickins wrote: > On Mon, 1 Oct 2007, Andrew Morton wrote: > > # > > # slub && antifrag > > # > > have-kswapd-keep-a-minimum-order-free-other-than-order-0.patch > > only-check-absolute-watermarks-for-alloc_high-and-alloc_harder-allocations.patch

Re: [PATCH 05/12] mm: trylock_page

2007-10-02 Thread Nick Piggin
On Sunday 30 September 2007 01:01, Peter Zijlstra wrote: > On Fri, 2007-09-28 at 13:11 +1000, Nick Piggin wrote: > > On Friday 28 September 2007 17:42, Peter Zijlstra wrote: > > > Replace raw TestSetPageLocked() usage with trylock_page() > > > > I have such a thing q

Re: [PATCH] Add ability to dump mapped pages with /proc/sys/vm/drop_caches

2007-10-02 Thread Nick Piggin
On Monday 01 October 2007 04:03, Soeren Sandmann wrote: > This patch adds the ability to drop mapped pages with > /proc/sys/vm/drop_caches. This is useful to get repeatable > measurements of startup time for applications. > > Without it, pages that are mapped in already-running applications will >

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 06:50, Christoph Lameter wrote: > On Fri, 28 Sep 2007, Nick Piggin wrote: > > I thought it was slower. Have you fixed the performance regression? > > (OK, I read further down that you are still working on it but not > > confirmed yet...) > > Th

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 07:01, Christoph Lameter wrote: > On Sat, 29 Sep 2007, Peter Zijlstra wrote: > > On Fri, 2007-09-28 at 11:20 -0700, Christoph Lameter wrote: > > > Really? That means we can no longer even allocate stacks for forking. > > > > I think I'm running with 4k stacks... > > 4k

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 07:01, Christoph Lameter wrote: On Sat, 29 Sep 2007, Peter Zijlstra wrote: On Fri, 2007-09-28 at 11:20 -0700, Christoph Lameter wrote: Really? That means we can no longer even allocate stacks for forking. I think I'm running with 4k stacks... 4k stacks will

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 06:50, Christoph Lameter wrote: On Fri, 28 Sep 2007, Nick Piggin wrote: I thought it was slower. Have you fixed the performance regression? (OK, I read further down that you are still working on it but not confirmed yet...) The problem is with the weird way

Re: [PATCH 05/12] mm: trylock_page

2007-10-02 Thread Nick Piggin
On Sunday 30 September 2007 01:01, Peter Zijlstra wrote: On Fri, 2007-09-28 at 13:11 +1000, Nick Piggin wrote: On Friday 28 September 2007 17:42, Peter Zijlstra wrote: Replace raw TestSetPageLocked() usage with trylock_page() I have such a thing queued too, for the lock bitops patches

Re: [PATCH] Add ability to dump mapped pages with /proc/sys/vm/drop_caches

2007-10-02 Thread Nick Piggin
On Monday 01 October 2007 04:03, Soeren Sandmann wrote: This patch adds the ability to drop mapped pages with /proc/sys/vm/drop_caches. This is useful to get repeatable measurements of startup time for applications. Without it, pages that are mapped in already-running applications will not

Re: kswapd min order, slub max order [was Re: -mm merge plans for 2.6.24]

2007-10-02 Thread Nick Piggin
On Wednesday 03 October 2007 02:06, Hugh Dickins wrote: On Mon, 1 Oct 2007, Andrew Morton wrote: # # slub && antifrag # have-kswapd-keep-a-minimum-order-free-other-than-order-0.patch only-check-absolute-watermarks-for-alloc_high-and-alloc_harder-allocations.patch

Re: per BDI dirty limit (was Re: -mm merge plans for 2.6.24)

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 21:40, Peter Zijlstra wrote: On Tue, 2007-10-02 at 13:21 +0200, Kay Sievers wrote: How about adding this information to the tree then, instead of creating a new top-level hack, just because something that you think you need doesn't exist. So you suggest adding

Re: [PATCH] mark read_crX() asm code as volatile

2007-10-02 Thread Nick Piggin
On Wednesday 03 October 2007 04:27, Chuck Ebbert wrote: On 10/02/2007 11:28 AM, Arjan van de Ven wrote: On Tue, 02 Oct 2007 18:08:32 +0400 Kirill Korotaev [EMAIL PROTECTED] wrote: Some gcc versions (I checked at least 4.1.1 from RHEL5 and 4.1.2 from gentoo) can generate incorrect code with

Re: wibbling over the cpuset shed domain connnection

2007-10-02 Thread Nick Piggin
On Tuesday 02 October 2007 07:34, Paul Jackson wrote: In -mm merge plans for 2.6.24, Andrew wrote: cpuset-remove-sched-domain-hooks-from-cpusets.patch Paul continues to wibble over this. Hold, I guess. Oh dear ... after looking at the following to figure out what a wibble is, I wonder

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-10-02 Thread Nick Piggin
On Monday 01 October 2007 13:42, Paul Jackson wrote: Nick wrote: Moreover, sched_load_balance doesn't really sound like a good name for asking for a partition. Yup - it's not a good name for asking for a partition. That's because it isn't asking for a partition. It's asking for load

Re: wibbling over the cpuset shed domain connnection

2007-10-02 Thread Nick Piggin
On Wednesday 03 October 2007 15:21, Paul Jackson wrote: In the meantime, that patch should be merged though, shouldn't it? Which patch do you refer to: 1) the year old patch to disconnect cpusets and sched domains: cpuset-remove-sched-domain-hooks-from-cpusets.patch 2) my patch of a

Re: 2.6.21 -> 2.6.22 & 2.6.23-rc8 performance regression

2007-09-30 Thread Nick Piggin
Hi Denys, thanks for reporting (btw. please reply-to-all when replying on lkml). You say that SLAB is better than SLUB on an otherwise identical kernel, but I didn't see if you quantified the actual numbers? It sounds like there is still a regression with SLAB? On Monday 01 October 2007 03:48,

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-30 Thread Nick Piggin
On Monday 01 October 2007 06:12, Andrew Morton wrote: > On Sun, 30 Sep 2007 05:09:28 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: > > On Sunday 30 September 2007 05:20, Andrew Morton wrote: > > > We can't "run out of unfragmented memory" for an o

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-09-30 Thread Nick Piggin
On Monday 01 October 2007 04:07, Paul Jackson wrote: > Nick wrote: > > The user should just be able to specify exactly the partitioning of > > tasks required, and cpusets should ask the scheduler to do the best > > job of load balancing possible. > > If the cpusets which have 'sched_load_balance'

Re: [patch] x86: improved memory barrier implementation

2007-09-30 Thread Nick Piggin
On Sat, Sep 29, 2007 at 09:07:30AM -0700, Linus Torvalds wrote: > > > On Sat, 29 Sep 2007, Nick Piggin wrote: > > > > > > The non-temporal stores should be basically considered to be "IO", not > > > any > > > normal memory operatio

Re: [rfc][patch] i386: remove comment about barriers

2007-09-30 Thread Nick Piggin
On Sat, Sep 29, 2007 at 12:12:52PM -0700, Davide Libenzi wrote: > On Sat, 29 Sep 2007, Nick Piggin wrote: > > > [ This is true for x86's sfence/lfence, but raises a question about Linux's > > memory barriers. Does anybody expect that a sequence of smp_wmb and smp_rmb > > w

Re: [rfc][patch] i386: remove comment about barriers

2007-09-30 Thread Nick Piggin
On Sat, Sep 29, 2007 at 08:16:47PM -0700, Paul E. McKenney wrote: > On Sat, Sep 29, 2007 at 03:28:48PM +0200, Nick Piggin wrote: > > Acked-by: Paul E. McKenney <[EMAIL PROTECTED]> > Thanks v much for confirming, everyone. > > Signed-off-by: Nick

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-09-30 Thread Nick Piggin
On Sunday 30 September 2007 20:44, Paul Jackson wrote: > From: Paul Jackson <[EMAIL PROTECTED]> > > Add a new per-cpuset flag called 'sched_load_balance'. > > When enabled in a cpuset (the default value) it tells the kernel > scheduler that the scheduler should provide the normal load > balancing

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-30 Thread Nick Piggin
On Sunday 30 September 2007 05:20, Andrew Morton wrote: > On Sat, 29 Sep 2007 06:19:33 +1000 Nick Piggin <[EMAIL PROTECTED]> wrote: > > On Saturday 29 September 2007 19:27, Andrew Morton wrote: > > > On Sat, 29 Sep 2007 11:14:02 +0200 Peter Zijlstra > > >

Re: [15/17] SLUB: Support virtual fallback via SLAB_VFALLBACK

2007-09-30 Thread Nick Piggin
On Sunday 30 September 2007 05:20, Andrew Morton wrote: On Sat, 29 Sep 2007 06:19:33 +1000 Nick Piggin [EMAIL PROTECTED] wrote: On Saturday 29 September 2007 19:27, Andrew Morton wrote: On Sat, 29 Sep 2007 11:14:02 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote: oom-killings

Re: [PATCH] cpuset and sched domains: sched_load_balance flag

2007-09-30 Thread Nick Piggin
On Sunday 30 September 2007 20:44, Paul Jackson wrote: From: Paul Jackson [EMAIL PROTECTED] Add a new per-cpuset flag called 'sched_load_balance'. When enabled in a cpuset (the default value) it tells the kernel scheduler that the scheduler should provide the normal load balancing on the
