[PATCH] change zonelist order v5 [1/3] implements zonelist order selection

2007-05-08 Thread KAMEZAWA Hiroyuki
n Node-based order. command: %echo Z > /proc/sys/vm/numa_zonelist_order will rebuild zonelist in Zone-based order. Tested on ia64 2-Node NUMA. Works well. Thanks to Lee Schermerhorn, who gave me much help and code. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> --- include/linux

[PATCH] change zonelist order v5 [0/3]

2007-05-08 Thread KAMEZAWA Hiroyuki
Hi, this is zonelist-order-fix patch version 5. against 2.6.21-mm1. works well in my ia64/NUMA environment. ChangeLog V4 -> V5 - separated 'doc' patch and rewrote it. - more clean ups. - sysctl/boot option params are simplified. ChangeLog V2 -> V4 - automatic configuration is added. - automatic

Re: [BUG] cpu-hotplug: Can't offline the CPU with naughty realtime processes

2007-05-07 Thread KAMEZAWA Hiroyuki
On Mon, 07 May 2007 19:10:05 +0900 Satoru Takeuchi <[EMAIL PROTECTED]> wrote: > kstopmachine is created, bound to the CPU1, and woken up here, but > this process can't start to run because reschedule doesn't occur on > CPU1. Hence CPU0 also be able to run because it's waiting completion > of CPU1

[PATCH] change global zonelist order v4 [2/2] auto configuration

2007-04-26 Thread KAMEZAWA Hiroyuki
Node (A)'s ZONE_DMA/DMA32 occupies 60% of Node(A)'s memory. Otherwise, ZONE_ORDER_ZONE is selected. Note: a user can specify this ordering with a boot option. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> --- mm/page_alloc.c | 44 +

[PATCH] change global zonelist order v4 [1/2] change zonelist ordering.

2007-04-26 Thread KAMEZAWA Hiroyuki
ing order(new style, ZONE order) Node(0)'s NORMAL -> Node(1)'s NORMAL -> Node(0)'s DMA. means put more priority on zone_type. And you can specify this option as boot param. Because autoconfig function does nothing. Default is "Node" order. Tested on ia64 2-Node NU
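A minimal sketch of the two orderings discussed in this thread, for illustration only. The types and function names (`zone_ref`, `build_node_order`, `build_zone_order`) are hypothetical and are not taken from the patch. The sketch also assumes only two zone types and visits nodes in index order, ignoring NUMA distance.

```c
#include <stddef.h>

/* Hypothetical, simplified zone types; the real kernel has more. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, MAX_NR_ZONES };

/* One entry of an illustrative fallback list. */
struct zone_ref {
    int node;
    enum zone_type type;
};

/*
 * "Node" order: use every populated zone of a node (highest zone
 * first) before falling back to the next node.
 */
size_t build_node_order(int nr_nodes, int populated[][MAX_NR_ZONES],
                        struct zone_ref *out)
{
    size_t n = 0;

    for (int node = 0; node < nr_nodes; node++) {
        for (int t = MAX_NR_ZONES - 1; t >= 0; t--) {
            if (populated[node][t]) {
                out[n].node = node;
                out[n].type = (enum zone_type)t;
                n++;
            }
        }
    }
    return n;
}

/*
 * "Zone" order: use the highest zone type on every node before
 * falling back to a lower zone type such as ZONE_DMA.
 */
size_t build_zone_order(int nr_nodes, int populated[][MAX_NR_ZONES],
                        struct zone_ref *out)
{
    size_t n = 0;

    for (int t = MAX_NR_ZONES - 1; t >= 0; t--) {
        for (int node = 0; node < nr_nodes; node++) {
            if (populated[node][t]) {
                out[n].node = node;
                out[n].type = (enum zone_type)t;
                n++;
            }
        }
    }
    return n;
}
```

With node 0 holding DMA and NORMAL and node 1 holding only NORMAL, `build_zone_order` yields Node(0)'s NORMAL -> Node(1)'s NORMAL -> Node(0)'s DMA, while `build_node_order` yields Node(0)'s NORMAL -> Node(0)'s DMA -> Node(1)'s NORMAL.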

[PATCH] change global zonelist order v4 [0/2]

2007-04-26 Thread KAMEZAWA Hiroyuki
Hi, this is version 4, including Lee Schermerhorn's good rework and automatic configuration at boot time. (This patch is reworked from V2, so skip V3 changelog.) ChangeLog V2 -> V4 - automatic configuration is added. - automatic configuration is now default. - relaxed_zone_order is renamed to be

Re: [PATCH] change global zonelist order on NUMA v2

2007-04-26 Thread KAMEZAWA Hiroyuki
On Thu, 26 Apr 2007 18:25:10 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Fri, 27 Apr 2007, KAMEZAWA Hiroyuki wrote: > > > > DMA memory. > > > > > It seems a bit complicated. If we do so, following can occur, > > > > Node

Re: [PATCH] change global zonelist order on NUMA v2

2007-04-26 Thread KAMEZAWA Hiroyuki
On Thu, 26 Apr 2007 17:57:40 -0400 Lee Schermerhorn <[EMAIL PROTECTED]> wrote: > On Thu, 2007-04-26 at 18:34 +0900, KAMEZAWA Hiroyuki wrote: > > Changelog from V1 -> V2 > > - sysctl name is changed to be relaxed_zone_order > > - NORMAL->NORMAL->->DM

Re: [PATCH] change global zonelist order on NUMA v2

2007-04-26 Thread KAMEZAWA Hiroyuki
On Thu, 26 Apr 2007 08:48:19 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Thu, 26 Apr 2007, KAMEZAWA Hiroyuki wrote: > > > (1)Use new zonelist ordering always and move init_task's tied cpu to a > > cpu on the best node. > > Child process

[PATCH] change global zonelist order on NUMA v3

2007-04-26 Thread KAMEZAWA Hiroyuki
changes *default* zone order to Node(0)'s NORMAL -> Node(1)'s NORMAL -> Node(0)'s DMA. tested ia64 2-Node NUMA. works well. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.21-rc7-mm2/mm/page_alloc.c ===

Re: [PATCH] change global zonelist order on NUMA v2

2007-04-26 Thread KAMEZAWA Hiroyuki
On Thu, 26 Apr 2007 11:47:44 +0200 Andi Kleen <[EMAIL PROTECTED]> wrote: > On Thursday 26 April 2007 11:34:17 KAMEZAWA Hiroyuki wrote: > > > > Changelog from V1 -> V2 > > - sysctl name is changed to be relaxed_zone_order > > - NORMAL->NORMAL->-

[PATCH] change global zonelist order on NUMA v2

2007-04-26 Thread KAMEZAWA Hiroyuki
pported by arch. But this style zonelist can easily cause OOM-Kill because of ZONE_DMA exhaustion. Be careful. command: echo 0 > /proc/sys/vm/relaxed_zone_order will rebuild zonelist as Node(0)'s NORMAL -> Node(1)'s NORMAL -> Node(0)'s DM

Re: [RFC][PATCH] syctl for selecting global zonelist[] order

2007-04-25 Thread KAMEZAWA Hiroyuki
On Thu, 26 Apr 2007 09:31:12 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > > > > So a IA64 platform with i386 sicknesses? And pretty bad case of it since I > > assume that the memory sizes per node are equal. Your solution of taking > > 4G off node 0 and t

Re: [RFC][PATCH] syctl for selecting global zonelist[] order

2007-04-25 Thread KAMEZAWA Hiroyuki
On Wed, 25 Apr 2007 12:17:15 -0700 (PDT) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Wed, 25 Apr 2007, KAMEZAWA Hiroyuki wrote: > > > Make zonelist policy selectable from sysctl. > > > > Assume 2 node NUMA, only node(0) has ZONE_DMA (ZONE_DMA32). > >

Re: [RFC][PATCH] syctl for selecting global zonelist[] order

2007-04-25 Thread KAMEZAWA Hiroyuki
On Wed, 25 Apr 2007 00:42:14 -0700 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Wed, 25 Apr 2007 12:19:46 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> > wrote: > > > Make zonelist policy selectable from sysctl. > > > > Assume 2 node NUMA, only node(0)

[RFC][PATCH] syctl for selecting global zonelist[] order

2007-04-24 Thread KAMEZAWA Hiroyuki
useful in some users with heavy memory pressure and mlocks. Tested under ia64 2 node NUMA against 2.6.21-rc7.. works well. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.21-rc7/kernel/sysctl.c === -

Re: [PATCH] mm: PageLRU can be non-atomic bit operation

2007-04-23 Thread KAMEZAWA Hiroyuki
On Tue, 24 Apr 2007 10:54:27 +0900 Hisashi Hifumi <[EMAIL PROTECTED]> wrote: > In the case that changing the same bit concurrently, lock prefix or other > spinlock is needed. But, I think that concurrent bit operation on different > bits > is just like OR operation , so lock prefix is not needed.

Re: [PATCH]Fix parsing kernelcore boot option for ia64

2007-04-23 Thread KAMEZAWA Hiroyuki
On Mon, 23 Apr 2007 19:32:46 +0100 [EMAIL PROTECTED] (Mel Gorman) wrote: > > > I wasn't even aware of this kernelcore thing. It's pretty nasty-looking. > > > yet another reminder that this code hasn't been properly reviewed in the > > > past year or three. > > > > Just now, I'm making memory-un

Re: [PATCH] fix OOM killing processes wrongly thought MPOL_BIND

2007-04-18 Thread KAMEZAWA Hiroyuki
ce we initialize nodemask in constrained_alloc(). > thank you for catching bug. Acked-by: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> > Signed-off-by: Hugh Dickins <[EMAIL PROTECTED]> > --- > Perhaps appropriate for 2.6.20-stable too - regression since 2.6.19. > >

Re: [PATCH][2/2] double stack limit (rfc)

2007-03-22 Thread KAMEZAWA Hiroyuki
On Thu, 22 Mar 2007 21:56:03 -0700 "Tony Luck" <[EMAIL PROTECTED]> wrote: > On 3/22/07, KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > > I hear some people says that "When I set stack-size-limit to 32M, > > I want to use 32M of memory stack..." an

[PATCH][2/2] double stack limit (rfc)

2007-03-22 Thread KAMEZAWA Hiroyuki
r-backing store cannot be expanded because the memory stack uses the whole stack". How about this ? Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.21-rc4/arch/ia64/mm/init.c === --- linux-2.6.21-rc4.orig/

[PATCH][ia64][1/2] bugfix stack layout upside-down

2007-03-22 Thread KAMEZAWA Hiroyuki
by adjusting register-stack. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.21-rc4/arch/ia64/mm/init.c === --- linux-2.6.21-rc4.orig/arch/ia64/mm/init.c +++ linux-2.6.21-rc4/arch/ia64/mm/init.c @@ -155,7

Re: thread stacks and strict vm overcommit accounting

2007-03-15 Thread KAMEZAWA Hiroyuki
On Thu, 15 Mar 2007 11:06:21 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > On Tue, 13 Mar 2007 18:33:20 +0200 Dan Aloni <[EMAIL PROTECTED]> wrote: > > Hello, > > > > This question is relevant to 2.6.20. > > > > I noticed that if the RSS for the stack size is say, 8MB, running > > a single-t

Re: [BUGFIX][PATCH] fixing placement of register stack under ulimit -s

2007-03-15 Thread KAMEZAWA Hiroyuki
Please allow me to explain more. "Why register-stack/memory-stack upside down is bad" is a bit complicated. So...this is a test and result for explaining the bug. This is sample code and its result on 2.6.21-rc3. Note: the base address of the memory stack can change randomly. == sample code == [EMAIL PRO

Re: [BUGFIX][PATCH] fixing placement of register stack under ulimit -s

2007-03-15 Thread KAMEZAWA Hiroyuki
On Fri, 16 Mar 2007 06:20:47 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > On Thu, 15 Mar 2007 09:57:28 -0600 > "David Mosberger-Tang" <[EMAIL PROTECTED]> wrote: > > > But aren't you going to be limited to less than a page worth of > > reg

Re: [BUGFIX][PATCH] fixing placement of register stack under ulimit -s

2007-03-15 Thread KAMEZAWA Hiroyuki
y if I don't catch your point. --Kame > --david > > On 3/15/07, KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > > This patch fixes ia64's bug in ulimit -s handling. against 2.6.21-rc3. > > > > At first,the address of register stack is defined by

[BUGFIX][PATCH] fixing placement of register stack under ulimit -s

2007-03-15 Thread KAMEZAWA Hiroyuki
't handle this case. In this case, register stack expansion causes SEGV. This means that the user program can use only 1 page for its register stack. This patch fixes the above case by moving the register stack to a suitable place. Note) fixing the page fault handler seems to be another way...but a bit

Re: mm: migrate_pages using

2007-03-14 Thread KAMEZAWA Hiroyuki
On Mon, 12 Mar 2007 19:57:58 +0100 Michal Hocko <[EMAIL PROTECTED]> wrote: > What do you think about that. Is this way correct? > If you are sure that your "original" pages are never freed while you are migrating them, maybe. -Kame - To unsubscribe from this list: send the line "unsubscribe li

Re: [BUGFIX][PATCH] fix NULL pointer in ia64/irq_chip-mask/unmask function

2007-03-06 Thread KAMEZAWA Hiroyuki
On Tue, 6 Mar 2007 22:57:10 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Wed, 7 Mar 2007 15:23:17 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > > > This patch fixes boot failure because irq_desc->mask() is NULL. > > > > - Added mask/unmas

[BUGFIX][PATCH] fix NULL pointer in ia64/irq_chip-mask/unmask function

2007-03-06 Thread KAMEZAWA Hiroyuki
This patch fixes boot failure because irq_desc->mask() is NULL. - Added mask/unmask functions to ia64's irq desc function table. But I'm not sure this fix is correct or not. please review. - rename hw_interrupt_type to irq_chip. hw_interrupt_type is old name. Signed-Off-By: KAME

Re: 2.6.21-rc2-mm2

2007-03-06 Thread KAMEZAWA Hiroyuki
On Tue, 6 Mar 2007 03:09:27 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > == > > > > Is "mask" always valid pointer ? > > I can only find two `struct irq_chip's in arch/ia64 and they both have a > .mask. And a .unmask. So perhaps that is a misreading of what oopsed. > > There are no cha

Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-02 Thread KAMEZAWA Hiroyuki
zones are overkill > there and anti-fragmentation on its own is good enough). Pages hot-added > to ZONE_MOVABLE will always be reclaimable or migratable in the case of > mlock(). Kamezawa Hiroyuki has indicated that his hot-remove patches also > do something like ZONE_MOVABLE. I would hope t

Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-01 Thread KAMEZAWA Hiroyuki
On Thu, 1 Mar 2007 21:11:58 -0800 (PST) Linus Torvalds <[EMAIL PROTECTED]> wrote: > The whole DRAM power story is a bedtime story for gullible children. Don't > fall for it. It's not realistic. The hardware support for it DOES NOT > EXIST today, and probably won't for several years. And the real

Re: The performance and behaviour of the anti-fragmentation related patches

2007-03-01 Thread KAMEZAWA Hiroyuki
On Thu, 1 Mar 2007 16:09:15 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Thu, 1 Mar 2007 10:12:50 + > [EMAIL PROTECTED] (Mel Gorman) wrote: > > > Any opinion on merging these patches into -mm > > for wider testing? > > I'm a little reluctant to make changes to -mm's core mm unless tho

Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-02-26 Thread KAMEZAWA Hiroyuki
On Tue, 27 Feb 2007 09:50:16 +0900 Tomoki Sekiyama <[EMAIL PROTECTED]> wrote: > Hi Kamezawa-san, > > thanks for your reply. > > KAMEZAWA Hiroyuki wrote: > > Interesting, but how about adjust this parameter like below instead of > > adding new control knob ?(thi

Re: SLUB: The unqueued Slab allocator

2007-02-23 Thread KAMEZAWA Hiroyuki
On Thu, 22 Feb 2007 10:42:23 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > G. Slab merging > > > > > >We often have slab caches with similar parameters. SLUB detects those > > >on bootup and merges them into the corresponding general caches. This > > >leads to more ef

Re: [RFC][PATCH 0/3] VM throttling: avoid blocking occasional writers

2007-02-23 Thread KAMEZAWA Hiroyuki
On Fri, 23 Feb 2007 21:03:37 +0900 Tomoki Sekiyama <[EMAIL PROTECTED]> wrote: > Hi, > > I have observed a problem that write(2) can be blocked for a long time > if a system has several disks and is under heavy I/O pressure. This > patchset is to avoid the problem. > > Example of the problem: >

Re: [rfc][patch] dynamic resizing dentry hash using RCU

2007-02-23 Thread KAMEZAWA Hiroyuki
On Fri, 23 Feb 2007 16:37:43 +0100 Nick Piggin <[EMAIL PROTECTED]> wrote: > +static void dcache_hash_resize(unsigned int new_shift); > +static void mod_nr_dentry(int mod) > +{ > + unsigned long dentry_size; > + > + dentry_stat.nr_dentry += mod; > + > + dentry_size = 1 << dentry_hash->s

Re: [PATCH] fix handling of SIGCHILD from reaped child

2007-02-20 Thread KAMEZAWA Hiroyuki
On Tue, 20 Feb 2007 15:10:07 -0800 (PST) Roland McGrath <[EMAIL PROTECTED]> wrote: > I'm usually the stickler for anal POSIX compliance, but this is one thing > that I did notice a while ago, realized Linux had never done it, and > decided I didn't care. > Okay, I don't think this is a big troubl

Re: [PATCH] fix handling of SIGCHILD from reaped child

2007-02-20 Thread KAMEZAWA Hiroyuki
On Tue, 20 Feb 2007 20:20:49 +0300 Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > > > + clear_stale_sigchild(current, retval); > > > > > > But we are not checking that SIGCHLD is blocked? > > > > > I'm sorry if I don't read SUSv3 correctly. SUSv3 doesn't define how we > > should > >

Re: [PATCH] fix handling of SIGCHILD from reaped child

2007-02-20 Thread KAMEZAWA Hiroyuki
On Tue, 20 Feb 2007 17:22:57 +0300 Oleg Nesterov <[EMAIL PROTECTED]> wrote: > > I'd suggest to make a separate function, but not complicate collect_signal(). > okay. I'll try again if people admit me to go ahead. > > --- linux-2.6.20-devel.orig/kernel/exit.c > > +++ linux-2.6.20-devel/kernel/exi

[PATCH] fix handling of SIGCHILD from reaped child

2007-02-20 Thread KAMEZAWA Hiroyuki
reaped process.) please review... works well on 2.6.20/ia64/NUMA environment and passed my easy test. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.20-devel/kernel/signal.c === --- linux-2.6.20-devel.orig/

Re: [RFC][PATCH][3/4] Add reclaim support

2007-02-19 Thread KAMEZAWA Hiroyuki
On Mon, 19 Feb 2007 12:20:42 +0530 Balbir Singh <[EMAIL PROTECTED]> wrote: > +int memctlr_mm_overlimit(struct mm_struct *mm, void *sc_cont) > +{ > + struct container *cont; > + struct memctlr *mem; > + long usage, limit; > + int ret = 1; > + > + if (!sc_cont) > + go

Re: [PATCH] slab: ensure cache_alloc_refill terminates

2007-02-19 Thread KAMEZAWA Hiroyuki
On Mon, 19 Feb 2007 10:22:52 +0200 (EET) Pekka J Enberg <[EMAIL PROTECTED]> wrote: > @@ -2987,6 +2987,14 @@ > slabp = list_entry(entry, struct slab, list); > check_slabp(cachep, slabp); > check_spinlock_acquired(cachep); > + > + /* > +

[Qeustion][Maybe BUG?] simaltaneous wait and SIGCHLD handling

2007-02-18 Thread KAMEZAWA Hiroyuki
Hi, From SUSv3, I expected that SIGCHLD from dead processes (already reaped by wait(2)) should be cleared. But it seems that such a situation is not handled in Linux. Here is a test program. It sets a SIGCHLD handler and calls waitpid() in main(). == #include #include #include #include int sigchld_handl

[PATCH] fix mempolicy's check on a system with memory-less-node take4

2007-02-14 Thread KAMEZAWA Hiroyuki
log: v2 -> v3 - removed ambiguous void *pointer usage. - fixed warnings...misuse of PTR_ERR. Changelog: v1 -> v2 - avoid extra pgdat scanning; it is not necessary. Tested on ia64/NUMA with memory-less-node. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: li

Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread KAMEZAWA Hiroyuki
On Tue, 13 Feb 2007 10:50:53 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 13 Feb 2007, Martin J. Bligh wrote: > > > What's wrong with just setting the existing counters like > > node_spanned_pages / node_present_pages to zero? > > Will this fix the breakage that Kame-san sa

Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread KAMEZAWA Hiroyuki
On Tue, 13 Feb 2007 09:25:00 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 13 Feb 2007, KAMEZAWA Hiroyuki wrote: > > > NODE_DATA(nid) is always a valid pointer if a node is online. > > NODE_DATA(nid)->present_pages can be 0 even if a node is online,

Re: [RFC] [PATCH] more support for memory-less-node.

2007-02-13 Thread KAMEZAWA Hiroyuki
On Tue, 13 Feb 2007 09:29:49 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > > > In my understanding, a "node" is a block of cpu, memory, devices. > > and there could be cpu-only-node, memory-only-node, device-only-node... > > The trouble with this is that you'll need to harden large parts > of co

[RFC] [PATCH] more support for memory-less-node.

2007-02-12 Thread KAMEZAWA Hiroyuki
continue; x This patch adds a new node mask "node_memory_online_map" for nodes which have memory. for_each_node_mask(nid, node_memory_online_map) walks all memory-ready-nodes. This mask is updated at node-hotplug ops. Signed-Off-By: KAMEZAWA H

Re: Fw: [BUG][PATCH] fix mempolcy's check on a system with memory-less-node take3

2007-02-08 Thread KAMEZAWA Hiroyuki
On Thu, 8 Feb 2007 11:28:30 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > > @@ -193,9 +197,11 @@ > > break; > > case MPOL_BIND: > > policy->v.zonelist = bind_zonelist(nodes); > > - if (policy->v.zonelist == NULL) { > > + if (IS_ERR(policy

Fw: [BUG][PATCH] fix mempolcy's check on a system with memory-less-node take3

2007-02-08 Thread KAMEZAWA Hiroyuki
of zonelist is zero, just returns -EINVAL. Changelog: v2 -> v3 - changed handling of void *pointer - fixed warnings...misuse of PTR_ERR. Changelog: v1 -> v2 - avoid extra pgdat scanning; it is not necessary. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: lin

Re: [BUG][PATCH] fix mempolcy's check on a system with memory-less-node take2

2007-02-08 Thread KAMEZAWA Hiroyuki
On Thu, 8 Feb 2007 08:49:41 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > > > This panic(hang) was found by a numa test-set on a system with 3 nodes, > > where > > node(2) was memory-less-node. > > I still think it's the wrong fix -- just get rid of the memory less node. > I expect you'll likel

[BUG][PATCH] fix mempolcy's check on a system with memory-less-node take2

2007-02-07 Thread KAMEZAWA Hiroyuki
-EINVAL. Changelog: v1 -> v2 - avoid extra pgdat scanning; it is not necessary. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.20/mm/mempolicy.c === --- linux-2.6.20.orig/mm/mempolicy.c 2007-02

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On Wed, 7 Feb 2007 09:43:44 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > > and to > > > accurately present the machine's topology to the user without us having to > > > go adding falsehoods like this? > > > > a node is a piece of memory. Without memory it doesn't make sense. > > Who said?

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On Wed, 7 Feb 2007 06:05:56 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Wed, 7 Feb 2007, KAMEZAWA Hiroyuki wrote: > > > > IMHO there shouldn't be any memory less nodes. The architecture code > > > should not create them. The CPU

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On Wed, 7 Feb 2007 12:32:36 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > > > How for_each_online_node(nid) works ? it can handle alias-nid ? > > > > == > > for_each_online_node(nid) { > > pgdat = NODE_DATA(nid); > > == > > This code never accesses pgdat_for_A twice ? > > It wou

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On Wed, 7 Feb 2007 11:41:25 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > On Wednesday 07 February 2007 11:37, KAMEZAWA Hiroyuki wrote: > > On Wed, 7 Feb 2007 11:19:02 +0100 > > Andi Kleen <[EMAIL PROTECTED]> wrote: > > > You can also alias node numbers to

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On Wed, 7 Feb 2007 11:19:02 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > > > AFAIK, ia64 creates nodes just depends on SRAT's possible resource > > information. > > Then, ia64 can create cpu-memory-less-node(node with no available > > resource.). > > (*)I don't like this. > > > > If we don't

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On 07 Feb 2007 11:20:06 +0100 Andi Kleen <[EMAIL PROTECTED]> wrote: > KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> writes: > > > current mempolicy just checks whether a node is online or not. > > If there is memory-less-node, mempolicy's target node can be >

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-07 Thread KAMEZAWA Hiroyuki
On Wed, 7 Feb 2007 00:04:41 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Wed, 7 Feb 2007, KAMEZAWA Hiroyuki wrote: > > > > Hmmm... Remove the node from the node_online_map instead? > > > > > Changing definition of node_online_map is har

Re: [2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-06 Thread KAMEZAWA Hiroyuki
On Tue, 6 Feb 2007 09:26:53 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Tue, 6 Feb 2007, KAMEZAWA Hiroyuki wrote: > > > This means an access to NULL,here. > > == > > unsigned slab_node(struct mempolicy *policy) >

[2.6.20][PATCH] fix mempolicy error check on a system with memory-less-node

2007-02-06 Thread KAMEZAWA Hiroyuki
node. */ return zone_to_nid(policy->v.zonelist->zones[0]); } == length of this zonelist was 0. It seems fixing a NULL access here is also O.K. This patch is just an idea. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: linux-2.6.

Re: [RFC] Track mlock()ed pages

2007-01-26 Thread KAMEZAWA Hiroyuki
On Fri, 26 Jan 2007 10:10:27 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Fri, 26 Jan 2007 07:44:42 -0800 (PST) > Christoph Lameter <[EMAIL PROTECTED]> wrote: > > > On Fri, 26 Jan 2007, Andrew Morton wrote: > > > > > > > > Track mlocked pages via a ZVC > > > > > > Why? > > > > Large amo

Re: [RFC] Limit the size of the pagecache

2007-01-26 Thread KAMEZAWA Hiroyuki
On Fri, 26 Jan 2007 02:29:55 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > On Wed, 24 Jan 2007 14:15:10 +0900 > KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > > > - One for stability > > When a customer constructs their database (Oracle), the system ofte

Re: [RFC] Limit the size of the pagecache

2007-01-24 Thread KAMEZAWA Hiroyuki
On Thu, 25 Jan 2007 00:40:54 -0500 Rik van Riel <[EMAIL PROTECTED]> wrote: > KAMEZAWA Hiroyuki wrote: > > On Wed, 24 Jan 2007 23:28:15 -0500 > > Rik van Riel <[EMAIL PROTECTED]> wrote: > > > >> KAMEZAWA Hiroyuki wrote: > > I always says Linux

Re: [RFC] Limit the size of the pagecache

2007-01-24 Thread KAMEZAWA Hiroyuki
On Wed, 24 Jan 2007 23:28:15 -0500 Rik van Riel <[EMAIL PROTECTED]> wrote: > KAMEZAWA Hiroyuki wrote: > > > FYI: > > Because some customers are migrated from mainframes, they want to control > > almost all features in OS, IOW, designing memory usages. > >

Re: [RFC] Limit the size of the pagecache

2007-01-24 Thread KAMEZAWA Hiroyuki
On Wed, 24 Jan 2007 18:41:27 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > > But I can't think of the way to show that. > > == > > [EMAIL PROTECTED] src]$ free > > total used free sharedbuffers cached > > Mem:741604 724628 16976

Re: [RFC] Limit the size of the pagecache

2007-01-24 Thread KAMEZAWA Hiroyuki
On Wed, 24 Jan 2007 14:15:10 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > And...some customers want to keep memory Free as much as possible. > 99% memory usage makes them feel insecure ;) > If there is a way that the "free" command can show "never used"

Re: [RFC] Limit the size of the pagecache

2007-01-23 Thread KAMEZAWA Hiroyuki
On Tue, 23 Jan 2007 20:30:16 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Wed, 24 Jan 2007, KAMEZAWA Hiroyuki wrote: > > > I don't prefer to cause zone fallback by this. > > This may use ZONE_DMA before exhausting ZONE_NORMAL (ia64), > > H

Re: [RFC] Limit the size of the pagecache

2007-01-23 Thread KAMEZAWA Hiroyuki
one more thing... On Tue, 23 Jan 2007 16:49:55 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > @@ -1168,6 +1170,11 @@ zonelist_scan: > !cpuset_zone_allowed_softwall(zone, gfp_mask)) > goto try_next_zone; > > + if ((gfp_

Re: [RFC] Limit the size of the pagecache

2007-01-23 Thread KAMEZAWA Hiroyuki
On Tue, 23 Jan 2007 16:49:55 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > If we enter reclaim and the number of page cache pages > is too high then we switch off swapping during reclaim > to avoid touching anonymous pages. In general, I like this (kind of) feature. > + /* > +

[BUG][PATCH] fix oom killer kills current every time if there is memory-less-node take2

2006-12-22 Thread KAMEZAWA Hiroyuki
n zonelist[]. constrained_alloc() should take memory-less nodes into account. Otherwise, it always thinks 'oom is from mempolicy'. This means that current process dies at any time. This patch fixes it. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> mm/oom_kill.c |7 ++- 1 files c

Re: [BUG][PATCH] fix oom killer kills current every time if there is memory-less-node

2006-12-21 Thread KAMEZAWA Hiroyuki
On Thu, 21 Dec 2006 21:18:12 -0800 Paul Jackson <[EMAIL PROTECTED]> wrote: > KAMEZAWA-san wrote: > > But there is memory-less-node. constrained_alloc() should get > > memory_less_node into count. > > This patch looks ok to me. > > One line in the patch comment seems backward: > > If zone_lis

[BUG][PATCH] fix oom killer kills current every time if there is memory-less-node

2006-12-21 Thread KAMEZAWA Hiroyuki
time. This patch fixes it. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> mm/oom_kill.c |7 ++- 1 files changed, 6 insertions(+), 1 deletion(-) Index: devel-2.6.20-rc1-mm1/mm/oom_kill.c === --- devel-2.6.20-rc1-mm

[BUG ?] oom with empty nodes.

2006-12-20 Thread KAMEZAWA Hiroyuki
On some system, there are memory-less-nodes. (IOW, cpu-only-node) Then,there are online nodes which has no memory. Now, below code is used to detect the context where oom happens. === static inline int constrained_alloc(struct zonelist *zonelist, gfp_t gfp_mask) { #ifdef CONFIG_NUMA struc

Re: [PATCH][2.6.20-rc1-mm1] sparsemem vmem_map optimzed pfn_valid() [0/2]

2006-12-20 Thread KAMEZAWA Hiroyuki
On Wed, 20 Dec 2006 15:06:28 -0500 "Bob Picco" <[EMAIL PROTECTED]> wrote: > Sorry I was looking for AIM VII and/or reaim which are multiuser loads. > The results (2.6.20-rc1-mm1) for EXTREME, SPARSEMEM+VMEMMAP and > SPARSEMEM+VMEMMAP+your+patch are below. Note SPARSEMEM+VMEMMAP AIM VII > wasn't be

Re: [PATCH] Fix sparsemem on Cell

2006-12-18 Thread KAMEZAWA Hiroyuki
On Mon, 18 Dec 2006 15:16:20 -0800 Dave Hansen <[EMAIL PROTECTED]> wrote: > enum context > { > EARLY, > HOTPLUG > }; I like this :) Thanks, -Kame

Re: [PATCH][2.6.20-rc1-mm1] sparsemem vmem_map optimzed pfn_valid() [0/2]

2006-12-16 Thread KAMEZAWA Hiroyuki
On Sat, 16 Dec 2006 10:38:53 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> wrote: > On Sat, 16 Dec 2006, KAMEZAWA Hiroyuki wrote: > > > By this, we'll not access mem_section[] in usual ops. > > Why do we need mem_section? We have a page table that fulfills the s

Re: [PATCH] Fix sparsemem on Cell

2006-12-16 Thread KAMEZAWA Hiroyuki
On Fri, 15 Dec 2006 11:45:36 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > Perhaps if the function's role in the world was commented it would be clearer. > How about a patch like this? (This one is not tested.) Is an already-existing, more generic flag available? -Kame == include/linux/memory

[PATCH][2.6.20-rc1-mm1] sparsemem vmem_map optimzed pfn_valid() [2/2] for ia64

2006-12-16 Thread KAMEZAWA Hiroyuki
ia64 support for sparsemem vmem_map optimize pfn_valid() patch. Because ia64 has its own virtual mem_map, we can reuse the same code. So this patch is simple. To support optimized pfn_valid() in other arch, you (may) have to modify fault handler in kernel address space. Signed-Off-By: KAMEZAWA

[PATCH][2.6.20-rc1-mm1] sparsemem vmem_map optimzed pfn_valid() [1/2] generic arch.

2006-12-16 Thread KAMEZAWA Hiroyuki
m_map range. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> include/linux/mmzone.h | 10 ++ mm/Kconfig |4 mm/sparse.c|7 +++ 3 files changed, 21 insertions(+) Index: devel-2.6.20-rc1-mm1/include/lin

[PATCH][2.6.20-rc1-mm1] sparsemem vmem_map optimzed pfn_valid() [0/2]

2006-12-16 Thread KAMEZAWA Hiroyuki
This patch implements a pfn_valid() micro-optimization. It uses the ia64_pfn_valid() idea to check whether mem_map is valid, instead of sparsemem's logic. With this, we'll not access mem_section[] in usual ops. I attach my simple test result with a *micro* benchmark on an SMP system. I'm glad if you give me

Re: 2.6.19-mm1

2006-12-11 Thread KAMEZAWA Hiroyuki
On Mon, 11 Dec 2006 22:06:17 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > When I use ftp on 2.6.19-mm1, the transferred file is always broken. > > like this: > > == > > [EMAIL PROTECTED] ~]$ file ./linux-2.6.19.tar.bz2 (got on 2.6.19-mm1) > > ./linux-2.6.19.tar.bz2: data > > (I confirmed original

Re: 2.6.19-mm1

2006-12-11 Thread KAMEZAWA Hiroyuki
On Mon, 11 Dec 2006 00:58:07 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > Temporarily at > > http://userweb.kernel.org/~akpm/2.6.19-mm1/ > > Will appear later at > > > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19/2.6.19-mm1/ > When I use ftp on 2.6.

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [2/4] generic virtual mem_map on sparsemem

2006-12-10 Thread KAMEZAWA Hiroyuki
On Sat, 9 Dec 2006 22:17:00 +0900 KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> wrote: > I'll renew this in the next week. > Hi, this is a fix patch. Sorry for my carelessness. I'll post next add-on patch against the next -mm which will be shipped. What I have now are - pf

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [0/4] introduction

2006-12-09 Thread KAMEZAWA Hiroyuki
On Sat, 9 Dec 2006 12:51:37 +0100 Heiko Carstens <[EMAIL PROTECTED]> wrote: > > Virtual mem_map is not useful for 32bit archs. This uses huge virtual > > address range. > > Why? The s390 vmem_map implementation which I sent last week to linux-mm > is merged in the meantime. It supports both 32 an

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [2/4] generic virtual mem_map on sparsemem

2006-12-09 Thread KAMEZAWA Hiroyuki
On Sat, 9 Dec 2006 13:05:47 +0100 Heiko Carstens <[EMAIL PROTECTED]> wrote: > > +#ifdef CONFIG_SPARSEMEM_VMEMMAP > > +#if (((BITS_PER_LONG/4) * PAGES_PER_SECTION) % PAGE_SIZE) != 0 > > +#error "PAGE_SIZE/SECTION_SIZE relationship is not suitable for vmem_map" > > +#endif > > Why the BITS_PER_LONG

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [4/4] ia64 support

2006-12-08 Thread KAMEZAWA Hiroyuki
I tested ia64 with this patch under - DISCONTIGMEM + VIRTUAL_MEM_MAP - SPARSEMEM - SPARSEMEM_VMEMMAP on SMP with tiger4_defconfig. Fix typo for DISCONTIGMEM Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: devel-2.6.19/include/asm-ia64/pgt

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [1/4] map and unmap

2006-12-08 Thread KAMEZAWA Hiroyuki
This removes implicit default actions in map_generic_kernel() call. Also changes comments in vmalloc.h Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: devel-2.6.19/include/linux/vmalloc.h === --- devel-2.6.1

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [3/4] static virtual mem_map

2006-12-08 Thread KAMEZAWA Hiroyuki
for avoiding complex inclusion of a header file in the middle of another header file. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: devel-2.6.19/include/linux/mmzone.h === --- devel-2.6.19.orig/include/linux/mm

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [3/4] static virtual mem_map

2006-12-08 Thread KAMEZAWA Hiroyuki
On Fri, 8 Dec 2006 19:33:23 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > > > > Would prefer to unconditionally include the header file - conditional > > > inclusions > > > like this can cause compile failures when someone changes a config > > > option. They > > > generally raise the comp

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [3/4] static virtual mem_map

2006-12-08 Thread KAMEZAWA Hiroyuki
On Fri, 8 Dec 2006 16:30:20 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > > +#ifdef CONFIG_SPARSEMEM_VMEMMAP_STATIC > > +#include > > +extern struct page mem_map[]; > > +#else > > extern struct page* mem_map; > > #endif > > +#endif > > This looks rather unpleasant - what went wrong here? >

Re: [RFC] [PATCH] virtual memmap on sparsemem v3 [1/4] map and unmap

2006-12-08 Thread KAMEZAWA Hiroyuki
On Fri, 8 Dec 2006 16:28:19 -0800 Andrew Morton <[EMAIL PROTECTED]> wrote: > Generally we prefer to simply *require* that the function vector be filled > in appropriately. So if the caller has no special needs, the caller will > set their gen_map_kern_ops.k_pte_alloc to point at pte_alloc_kernel(

[RFC] [PATCH] virtual memmap on sparsemem v3 [4/4] ia64 support

2006-12-07 Thread KAMEZAWA Hiroyuki
ia64 support for sparsemem/vmem_map. * defines mem_map[] and sets its value (statically). * changes definitions of VMALLOC_START. * adds CONFIGS. Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: devel-2.6.19/arch/ia64/K

[RFC] [PATCH] virtual memmap on sparsemem v3 [3/4] static virtual mem_map

2006-12-07 Thread KAMEZAWA Hiroyuki
This patch adds support for a statically allocated virtual mem_map. (That is, the virtual address of the mem_map array is defined statically.) This removes the reference to *(&mem_map). Signed-Off-By: KAMEZAWA Hiroyuki <[EMAIL PROTECTED]> Index: devel-2.6.19/include/linu
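
The difference between a pointer-style and a statically placed mem_map can be sketched in user-space C as below. This is only an illustrative model, not the patch's code: the names backing, mem_map_ptr, mem_map_static, ptr_pfn_to_page(), static_pfn_to_page() and self_check() are hypothetical, and NR_PAGES is an arbitrary small size. With a pointer, the base address is a variable that has to be loaded before indexing; with an array at a fixed address, the base is a constant known at link time.

```c
#include <assert.h>

/* Minimal stand-in for the kernel's struct page (illustrative only). */
struct page { unsigned long flags; };

#define NR_PAGES 8 /* arbitrary size for this sketch */

/* Storage the pointer-style mem_map refers to. */
static struct page backing[NR_PAGES];

/* Pointer style: the base lives in a variable, so each lookup
 * first reads mem_map_ptr from memory, then adds the index. */
static struct page *mem_map_ptr = backing;

/* Static style: the array's address is fixed at link time, so the
 * base can be folded into the address computation directly. */
static struct page mem_map_static[NR_PAGES];

static struct page *ptr_pfn_to_page(unsigned long pfn)
{
    return mem_map_ptr + pfn;
}

static struct page *static_pfn_to_page(unsigned long pfn)
{
    return mem_map_static + pfn;
}

/* Both styles must resolve a pfn to the same slot in their own array. */
static void self_check(void)
{
    assert(ptr_pfn_to_page(2) == &backing[2]);
    assert(static_pfn_to_page(2) == &mem_map_static[2]);
}
```

Both functions compute the same kind of address; the static form only saves the extra load of the base pointer, which is the reference the snippet says is removed.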

[RFC] [PATCH] virtual memmap on sparsemem v3 [2/4] generic virtual mem_map on sparsemem

2006-12-07 Thread KAMEZAWA Hiroyuki
This patch implements virtual mem_map on sparsemem. It includes only the arch-independent part and depends on the generic kernel map/unmap function in this patch series. The usual sparsemem(_extreme) has to do a global table lookup in pfn_to_page()/page_to_pfn(), which seems a bit costly. If an ar
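
The cost difference described in this snippet can be illustrated with a small user-space model in C. It is a simplified sketch under stated assumptions, not the kernel implementation: section_table, section_pages, vmem_map and the helper functions are hypothetical names, and PAGES_PER_SECTION and NR_SECTIONS are tiny illustrative values. The sparsemem-style lookup goes through a per-section table first, while the virtual mem_map style indexes one virtually contiguous array directly.

```c
#include <assert.h>

/* Minimal stand-in for the kernel's struct page (illustrative only). */
struct page { unsigned long flags; };

/* Tiny, made-up geometry for this sketch. */
#define PAGES_PER_SECTION 4UL
#define NR_SECTIONS       3UL

/* Sparsemem-style model: each section owns its own struct page array,
 * and a global table maps section number -> that array. */
static struct page section_pages[NR_SECTIONS][PAGES_PER_SECTION];
static struct page *section_table[NR_SECTIONS] = {
    section_pages[0], section_pages[1], section_pages[2],
};

static struct page *sparse_pfn_to_page(unsigned long pfn)
{
    /* Extra memory access: find the section's array in the table. */
    struct page *base = section_table[pfn / PAGES_PER_SECTION];
    return base + pfn % PAGES_PER_SECTION;
}

/* Virtual mem_map model: one virtually contiguous array indexed by pfn. */
static struct page vmem_map[NR_SECTIONS * PAGES_PER_SECTION];

static struct page *vmem_pfn_to_page(unsigned long pfn)
{
    return &vmem_map[pfn]; /* plain pointer arithmetic, no table */
}

static unsigned long vmem_page_to_pfn(const struct page *page)
{
    return (unsigned long)(page - vmem_map);
}

/* Round-trip check for the virtual mem_map helpers. */
static void self_check(void)
{
    for (unsigned long pfn = 0; pfn < NR_SECTIONS * PAGES_PER_SECTION; pfn++)
        assert(vmem_page_to_pfn(vmem_pfn_to_page(pfn)) == pfn);
}
```

In this model the virtual mem_map conversion is a single add or subtract, at the price of reserving a contiguous virtual range large enough for every possible pfn.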

[RFC] [PATCH] virtual memmap on sparsemem v3 [1/4] map and unmap

2006-12-07 Thread KAMEZAWA Hiroyuki
e their virtual/physical space by themselves. Because it's complex and dangerous to manage virtual address space by each function's own code, it's better to use a fixed address. Note: My first purpose is supporting virtual mem_map at both boot and hotplug, sharing the same logic. Signed-Off

[RFC] [PATCH] virtual memmap on sparsemem v3 [0/4] introduction

2006-12-07 Thread KAMEZAWA Hiroyuki
Hi, virtual mem_map on sparsemem/generic patch version 3. I myself like this patch, but someone may feel it is intrusive and scattered. Please point that out if so. Changes v2 -> v3 - make map/unmap function for general purpose. (for my purpose ;) - drop memory hotplug support. will be posted a

Re: [PATCH] Add __GFP_MOVABLE for callers to flag allocations that may be migrated

2006-12-04 Thread KAMEZAWA Hiroyuki
Hi, your plan looks good to me. Some comments. On Mon, 4 Dec 2006 23:45:32 + (GMT) Mel Gorman <[EMAIL PROTECTED]> wrote: > 1. Use lumpy-reclaim to intelligently reclaim contiguous pages. The same > logic can be used to reclaim within a PFN range > 2. Merge anti-frag to help high-order allo
