Terry Dontje wrote:
On 12/10/2010 09:19 AM, Richard Treumann wrote:
It seems to me the MPI_Get_processor_name
description is too ambiguous to make this 100% portable. I assume most
MPI implementations simply use the hostname so all processes on the
same host will return the same string. The suggestion would work then.
However, it would also be
reasonable for an MPI that did processor binding to return
"hostname.socket#.core#" so every rank would have a unique processor
name.
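As a rough illustration of the suggestion being discussed, here is a minimal sketch in C of grouping ranks by their name string. It assumes the names have already been collected into one flat buffer of fixed-width strings (for example with MPI_Allgather over each rank's MPI_Get_processor_name result); the helper name colors_from_names is hypothetical and not part of MPI. The resulting color could then be passed to MPI_Comm_split.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not an MPI call.
 * names  : flat buffer holding n NUL-terminated strings, each in a
 *          slot of `width` bytes (the layout an allgather of
 *          fixed-size name buffers would give).
 * colors : output; colors[i] is the lowest rank whose name equals
 *          rank i's name, so ranks with equal names share a color. */
void colors_from_names(const char *names, int n, int width, int *colors)
{
    assert(n >= 0 && width > 0);
    for (int i = 0; i < n; i++) {
        colors[i] = i;  /* default: rank i is the first with this name */
        for (int j = 0; j < i; j++) {
            if (strncmp(names + (size_t)i * width,
                        names + (size_t)j * width,
                        (size_t)width) == 0) {
                colors[i] = colors[j];  /* same name as an earlier rank */
                break;
            }
        }
    }
}
```

If an implementation returned a per-core string such as "hostname.socket#.core#", every rank would get its own color under this scheme, which is exactly the portability concern raised above.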
Fair enough. However, I think it is a lot more stable than grabbing
information from the bowels of the runtime environment. Of course one
could just call the appropriate system call to get the hostname, if you
are on the right type of OS/Architecture :-).
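For the "appropriate system call" route, a minimal sketch on a POSIX system might look like the following. It assumes gethostname from <unistd.h> is available, which is the OS-specific caveat noted above; the wrapper name host_label is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#endif
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical wrapper: fill buf with this machine's hostname.
 * Returns 0 on success, -1 if gethostname fails. */
int host_label(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (gethostname(buf, len) != 0)
        return -1;
    /* POSIX does not promise NUL termination if the name is
     * truncated, so terminate explicitly. */
    buf[len - 1] = '\0';
    return 0;
}
```

Unlike MPI_Get_processor_name, this gives the same string for every process on a host regardless of binding, but it only works where the call exists.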
The extension idea is a
bit at odds with the idea that MPI is an architecture independent API.
That does not rule out the option if there is a good use case but it
does raise the bar just a bit.
Yeah, that is kind of the rub, isn't it? There are enough architectural
differences out there that it might be difficult to come to an
agreement on the elements of locality you should focus on. It would be
nice if there were some sort of distance value that would be assigned to
each peer a process has. Of course, then you still have the problem of
trying to figure out what distance you really want to base your
grouping on.
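To make the distance idea concrete, here is a small sketch in C. It assumes each process somehow has an integer distance to every peer; no standard MPI call provides such values, so both the dist array and the helper peers_within are purely hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: collect the ranks of all peers whose distance
 * from the calling process is at most max_dist.
 * dist : dist[i] is the assumed distance to rank i (0 for self).
 * out  : receives the selected ranks in ascending order; must have
 *        room for n entries.
 * Returns the number of ranks written to out. */
int peers_within(const int *dist, int n, int max_dist, int *out)
{
    assert(n >= 0);
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (dist[i] <= max_dist)
            out[count++] = i;
    }
    return count;
}
```

With distances {0, 1, 1, 3, 3, 5}, a threshold of 1 selects three ranks and a threshold of 3 selects five, so the grouping depends entirely on which cutoff you pick, which is the open question above.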
Similar issues arise within a node (e.g., hwloc, shared caches, sockets,
boards, etc.) as outside a node (same/different hosts, number of switch
hops, number of torus hops, etc.). There is lots of potential complexity,
but the main difference inside/outside a node is that nodal boundaries
present "hard" process-migration boundaries.