On 26/03/2012 04:25, Sajith T S wrote:
Date: Sun, 25 Mar 2012 22:49:52 -0400
From: Sajith T S <saj...@gmail.com>
To: The Haskell Cafe <haskell-c...@haskell.org>
Subject: Google Summer of Code: a NUMA wishlist!
Dear Cafe,
It's last minute-ish to bring this up (in my part of the world it's
still March 25), but graduate students are famously a busy and lazy
lot. :) I study at Indiana University Bloomington, and I wish to
propose^W rush in this proposal and solicit feedback, mentors, etc
while I can.
Since the student application deadline is April 6, I figure we can beat
this into a real proposal's shape by then. This probably also falls
on the naive and ambitious side of things, and I might not even know
what I'm talking about, but let's see! That's the idea of a proposal,
yes?
Broadly, the idea is to improve support for NUMA systems. Specifically:
-- Real physical processor affinity with forkOn [1]. Can we fire all
CPUs if we want to? (Currently, the number passed to forkOn is
interpreted as number modulo the value returned by
getNumCapabilities [2]).
You can get real processor affinity with +RTS -qa in combination with
forkOn.
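As a minimal sketch of the point above (not taken from the thread), the program below forks threads with forkOn and reports the capability each one lands on, using threadCapability from Control.Concurrent. The helper name whereDoesItRun is made up for illustration. Forking on more numbers than there are capabilities shows the modulo wrap-around; whether a capability then maps to a fixed physical CPU depends on running with -qa and on OS support.

```haskell
-- Build: ghc -threaded Pin.hs
-- Run:   ./Pin +RTS -N4 -qa   (-qa asks the RTS to pin OS threads to cores)
import Control.Concurrent
import Control.Monad (forM_)

-- Hypothetical helper: fork a thread with forkOn and report the
-- capability it runs on, plus whether the RTS considers it pinned.
whereDoesItRun :: Int -> IO (Int, Bool)
whereDoesItRun i = do
  done <- newEmptyMVar
  _ <- forkOn i (myThreadId >>= threadCapability >>= putMVar done)
  takeMVar done

main :: IO ()
main = do
  n <- getNumCapabilities
  -- Ask for twice as many targets as capabilities to expose the
  -- "number modulo getNumCapabilities" interpretation.
  forM_ [0 .. 2 * n - 1] $ \i -> do
    (cap, pinned) <- whereDoesItRun i
    putStrLn $ "forkOn " ++ show i ++ " -> capability " ++ show cap
            ++ " of " ++ show n ++ " (pinned: " ++ show pinned ++ ")"
```

Watching the process in htop while it does real work is one way to check that the OS-level pinning took effect.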
-- Also kind of associated with the above: when launching processes,
we might want to specify a list of CPUs rather than the number of
CPUs. Say, a -N [0,1,3] flag rather than -N 3 flag. This shall
enable us to gawk at real pretty htop [3] output.
I like that idea.
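To make the wish concrete, here is a rough sketch of how such a CPU list might be validated and mapped onto capabilities. Everything in it is hypothetical: GHC's -N flag takes a count, not a list, and the names parseCpuList and capabilityMap exist only for this example.

```haskell
import Data.List (nub)
import Text.Read (readMaybe)

-- Hypothetical: parse an argument such as "[0,1,3]" into CPU ids,
-- rejecting empty lists, negative ids and duplicates.
parseCpuList :: String -> Maybe [Int]
parseCpuList s = do
  cpus <- readMaybe s
  if null cpus || any (< 0) cpus || nub cpus /= cpus
    then Nothing
    else Just cpus

-- Hypothetical: capability i would be bound to the i-th listed CPU,
-- so the list length plays the role of today's -N count.
capabilityMap :: [Int] -> [(Int, Int)]
capabilityMap = zip [0 ..]

main :: IO ()
main =
  case parseCpuList "[0,1,3]" of
    Nothing   -> putStrLn "invalid CPU list"
    Just cpus -> print (capabilityMap cpus)  -- prints [(0,0),(1,1),(2,3)]
```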
-- From a very recent discussion on parallel-haskell [4], we learn
that the RTS's NUMA support could be improved. The hypothesis is that
allocating nurseries per Capability might be a better plan than
using a global pool. We might borrow/steal ideas from hwloc [5] for
this.
I like this idea too (since I suggested it :-).
-- Finally, a logging/monitoring infrastructure to verify assumptions
and determine whether, and how well, work stays local.
I'm not sure if you're suggesting a *new* logging/monitoring framework
here, but in any case it would make much more sense to extend ghc-events
and ThreadScope rather than building something new. There is ongoing
work to have ThreadScope understand the output of the Linux "perf" tool,
which would give insight into CPU scheduling activity amongst other
things. Talk to Duncan Coutts <dun...@well-typed.com> about how far
along this is and the best way for a GSoC project to help (usually it
works best when the GSoC project is not dependent on, or depended on by,
other ongoing projects - reducing synchronisation overhead and latency
due to blocking is always good!).
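As a small illustration of building on the existing eventlog rather than a new framework, the sketch below emits user events with traceEventIO from Debug.Trace, recording which capability each worker starts and finishes on. The helper names logLocation and worker are invented for this example, and the workload is a placeholder. The events only reach the eventlog when the program is run with +RTS -l (older GHC versions also need -eventlog at link time); the resulting file can then be opened in ThreadScope or read with ghc-events.

```haskell
-- Build: ghc -threaded -eventlog Locality.hs
-- Run:   ./Locality +RTS -N -l
import Control.Concurrent
import Control.Exception (evaluate)
import Control.Monad (forM)
import Debug.Trace (traceEventIO)

-- Hypothetical helper: write a user event naming the capability the
-- calling thread is currently on, and return that capability.
logLocation :: String -> IO Int
logLocation label = do
  (cap, _pinned) <- myThreadId >>= threadCapability
  traceEventIO (label ++ " on capability " ++ show cap)
  return cap

-- Placeholder workload; reports whether the thread ended on the
-- capability it started on.
worker :: Int -> IO Bool
worker i = do
  before <- logLocation ("worker " ++ show i ++ " start")
  _ <- evaluate (sum [1 .. 2000000 + i])
  after <- logLocation ("worker " ++ show i ++ " end")
  return (before == after)

main :: IO ()
main = do
  n <- getNumCapabilities
  -- forkIO (not forkOn) so the scheduler is free to migrate threads.
  results <- forM [1 .. 2 * n] $ \i -> do
    done <- newEmptyMVar
    _ <- forkIO (worker i >>= putMVar done)
    return done
  stayed <- mapM takeMVar results
  putStrLn $ show (length (filter id stayed)) ++ " of "
          ++ show (length stayed)
          ++ " workers finished on the capability they started on"
```

This only samples the capability at two points per worker; the scheduler's own migration events in the eventlog give the fuller picture.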
Cheers,
Simon
(I would like to acknowledge my fellow conspirators and leave them
unnamed, lest they shall be embarrassed by my... naivete.)
Thanks,
Sajith.
[1] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#v:forkOn
[2] http://www.haskell.org/ghc/docs/latest/html/libraries/base/Control-Concurrent.html#v:getNumCapabilities
[3] http://htop.sourceforge.net/
[4] http://groups.google.com/group/parallel-haskell/browse_thread/thread/7ec1ebc73dde8bbd
[5] http://www.open-mpi.org/projects/hwloc/