On Wed, Apr 04, 2001 at 09:39:23AM -0700, Kanoj Sarcar wrote:
> example, for NUMA, we need to try hard to schedule a thread on the
> node that has most of its memory (for no reason other than to decrease
> memory latency). Independently, some NUMA machines build in multilevel
> caches and local snoops that also means that specific processors on
> the same node as the last_processor are also good candidates to run
> the process next.
yes. That will probably need to be optional and choosen by the architecture
at compile time too. The probably most important factor to consider is the
penality of accessing remote memory, I think I can say on all recent and future
machines with a small difference between local and remote memory (and possibly
as you say with a decent cache protocol able to snoop cacheline data from the
other cpus even if they're not dirty) it's much better to always try to keep
the task in its last node. My patch is actually assuming recent machines and it
keeps the task in its last node if not in the last cpu and it keeps doing
memory allocation from there and it forgets about its original node where it
started allocating the memory from. This provided the best performance during
userspace CPU bound load as far I can tell and it also better distribute the load.
Kanoj could you also have a look at the NUMA related common code MM fixes I did
in this patch? I'd like to get them integrated (just skip the arch/alpha/*
include/asm-alpha/* stuff while reading the patch, they're totally orthogonal).
ftp://ftp.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.3aa1/00_alpha-numa-1
If you prefer I can extract them in a more finegrinded patch just dropping
the alpha stuff by hand.
Andrea
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/