Terry Dontje wrote:
On 12/10/2010 09:19 AM, Richard Treumann wrote:
It seems to me the MPI_Get_processor_name description is too ambiguous to make this 100% portable.  I assume most MPI implementations simply use the hostname so all processes on the same host will return the same string.  The suggestion would work then.

However, it would also be reasonable for an MPI that did processor binding to return "hostname.socket#.core#" so every rank would have a unique processor name.
Fair enough.  However, I think it is a lot more stable than grabbing information from the bowels of the runtime environment.  Of course one could just call the appropriate system call to get the hostname, if you are on the right type of OS/Architecture :-).
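
For concreteness, a rough (untested) sketch of that suggestion is below.  It assumes the implementation really does return the identical string for every process on a host; if the name also encodes socket/core, the groups simply degenerate to one rank each.  The helper name is made up for the example.

/* Split "comm" into one sub-communicator per distinct processor name. */
#include <mpi.h>
#include <stdlib.h>
#include <string.h>

int split_by_processor_name(MPI_Comm comm, MPI_Comm *node_comm)
{
    char name[MPI_MAX_PROCESSOR_NAME];
    int len, rank, size;

    MPI_Comm_rank(comm, &rank);
    MPI_Comm_size(comm, &size);
    memset(name, 0, sizeof(name));
    MPI_Get_processor_name(name, &len);

    /* Gather every rank's name so colors can be assigned by exact string
       match rather than by a hash (no collision worries). */
    char *all = malloc((size_t)size * MPI_MAX_PROCESSOR_NAME);
    MPI_Allgather(name, MPI_MAX_PROCESSOR_NAME, MPI_CHAR,
                  all, MPI_MAX_PROCESSOR_NAME, MPI_CHAR, comm);

    /* Color = lowest rank that reported the same name as ours. */
    int color = rank;
    for (int i = 0; i < rank; i++) {
        if (strcmp(all + (size_t)i * MPI_MAX_PROCESSOR_NAME, name) == 0) {
            color = i;
            break;
        }
    }
    free(all);

    return MPI_Comm_split(comm, color, rank, node_comm);
}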
The extension idea is a bit at odds with the idea that MPI is an architecture-independent API.  That does not rule out the option if there is a good use case, but it does raise the bar just a bit.
Yeah, that is kind of the rub, isn't it?  There are enough architectural differences out there that it might be difficult to come to an agreement on the elements of locality you should focus on.  It would be nice if there were some sort of distance value assigned to each peer a process has.  Of course, then you still have the problem of trying to figure out what distance you really want to base your grouping on.
There are similar issues within a node (hwloc topology, shared caches, sockets, boards, etc.) as outside a node (same/different hosts, number of switch hops, number of torus hops, etc.).  Lots of potential complexity, but the main difference between inside and outside a node is that node boundaries present "hard" process-migration boundaries.
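
Purely as an illustration of the "distance value per peer" idea (nothing like this exists in MPI today; the names below are invented), the levels one might want to distinguish could look something like:

/* Hypothetical locality levels, ordered from closest to farthest. */
typedef enum {
    LOCALITY_SAME_CORE   = 0,  /* bound to the same core                   */
    LOCALITY_SAME_SOCKET = 1,  /* share a socket (and likely a cache)      */
    LOCALITY_SAME_BOARD  = 2,  /* same board / NUMA domain                 */
    LOCALITY_SAME_NODE   = 3,  /* same host: the "hard" migration boundary */
    LOCALITY_SAME_SWITCH = 4,  /* one switch hop away                      */
    LOCALITY_REMOTE      = 5   /* multiple switch/torus hops away          */
} locality_level_t;

Even with something like that, an application still has to pick which level to split on, which is exactly the "what distance do you base your grouping on" question above.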
