* Christoph Lameter <c...@linux.com> wrote:

> On Mon, 12 Nov 2012, Peter Zijlstra wrote:
>
> > The biggest conceptual addition, beyond the elimination of
> > the home node, is that the scheduler is now able to
> > recognize 'private' versus 'shared' pages, by carefully
> > analyzing the pattern of how CPUs touch the working set
> > pages. The scheduler automatically recognizes tasks that
> > share memory with each other (and make dominant use of that
> > memory) - versus tasks that allocate and use their working
> > set privately.
>
> That is a key distinction to make and if this really works
> then that is major progress.
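To make the 'private' versus 'shared' distinction quoted above concrete, here is a toy user-space sketch. It is not the kernel implementation, whose mechanism is not described in this message; the names `classify_pages` and `group_tasks` and the (task, page) access log are hypothetical. It assumes a page touched by exactly one task counts as private, and that tasks touching a common page are "memory related".

```python
from collections import defaultdict


def classify_pages(accesses):
    """Label each page 'private' or 'shared' from (task, page) records.

    Assumption for illustration: a page is private when exactly one
    task has touched it, shared otherwise.
    """
    touched_by = defaultdict(set)
    for task, page in accesses:
        touched_by[page].add(task)
    return {
        page: "private" if len(tasks) == 1 else "shared"
        for page, tasks in touched_by.items()
    }


def group_tasks(accesses):
    """Group tasks that touch at least one common page (union-find)."""
    parent = {}

    def find(t):
        parent.setdefault(t, t)
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    first_toucher = {}
    for task, page in accesses:
        root = find(task)
        if page in first_toucher:
            # Merge this task's group with the group that touched the page first.
            parent[root] = find(first_toucher[page])
        else:
            first_toucher[page] = task

    groups = defaultdict(set)
    for task in parent:
        groups[find(task)].add(task)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical access log: tasks A and B share page 1, C works alone.
log = [("A", 1), ("B", 1), ("C", 2), ("A", 3)]
pages = classify_pages(log)
groups = group_tasks(log)
```

In this sketch, a placement policy would try to keep each multi-task group on one node and spread the single-task groups across nodes, which is the behaviour the quoted text describes.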

I posted updated benchmark results yesterday, and the approach
is indeed a performance breakthrough:

    http://lkml.org/lkml/2012/11/12/330

It also made the code more generic and more maintainable from a
scheduler POV.

> > This new scheduler code is then able to group tasks that are
> > "memory related" via their memory access patterns together:
> > in the NUMA context moving them on the same node if
> > possible, and spreading them amongst nodes if they use
> > private memory.
>
> What happens if processes' memory accesses are related but the
> common set of data does not fit into the memory provided by a
> single node?

The other (very common) node-overload case is that there are
more tasks for a shared piece of memory than fit on a single
node.

I have measured two such workloads. One is the Java SPEC
benchmark:

    v3.7-vanilla:    494828 transactions/sec
    v3.7-NUMA:       627228 transactions/sec    [ +26.7% ]

the other is the 'numa01' testcase of autonumabench:

    v3.7-vanilla:    340.3 seconds
    v3.7-NUMA:       216.9 seconds              [ +56% ]

> The correct resolution usually is in that case to interleave
> the pages over both nodes in use.

I'd not go as far as to claim that to be a general rule: the
correct placement depends on the system and workload specifics:
how much memory is on each node, how many tasks run on each
node, and whether the access patterns and working sets of the
tasks are symmetric amongst each other - which is not a given at
all.

Consider, say, a database server that executes small and large
queries over a large, memory-shared database, and assigns worker
tasks to clients to serve each query. Depending on the nature of
the queries, interleaving can easily be the wrong thing to do.

Thanks,

	Ingo
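
For reference, the bracketed percentages in the two benchmark results above can be reproduced with a few lines of arithmetic. This is only a sketch: the helper name `relative_gain` is made up for illustration, and it assumes the runtime figure is expressed as a speedup (old time divided by new time) rather than as a reduction in runtime.

```python
def relative_gain(baseline, improved, higher_is_better=True):
    """Return the improvement over baseline in percent.

    For throughput (higher is better) the ratio is improved/baseline;
    for runtimes (lower is better) it is baseline/improved, i.e. a speedup.
    """
    ratio = improved / baseline if higher_is_better else baseline / improved
    return (ratio - 1.0) * 100.0


# Figures taken from the message above.
spec_gain = relative_gain(494828, 627228)                            # transactions/sec
numa01_gain = relative_gain(340.3, 216.9, higher_is_better=False)    # seconds

print(f"Java SPEC: +{spec_gain:.2f}%")
print(f"numa01:    +{numa01_gain:.2f}%")
```

Both computed values sit slightly above the quoted ones, so the quoted +26.7% and +56% look truncated rather than rounded.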