So in essence the user sets one parameter, and whether or not orted is
being used to launch the job determines when the process binding happens
(process launch vs. MPI_Init time). If one needs/wants to rely on a
different launcher to do the binding, then one doesn't specify the OMPI
parameter at all.
Is that right?
So, will there be a way to force MPI_Init-based binding even if one is
using orted to launch a job? Not sure there really is a use case for
that; just curious.
--td
Ralph Castain wrote:
FWIW: Jeff and I chatted about this on the phone and came up with two
issues that need resolving:
1. we use mpi_paffinity_alone to indicate that we should bind
processes, yet the orteds have no way of seeing that MCA param as it
is registered and evaluated in the MPI layer. We propose to resolve
this by (a) declaring an opal_paffinity_alone MCA param in the
paffinity framework, and then (b) declaring an alias of
mpi_paffinity_alone for it, also in the paffinity framework. This
obviously is an abstraction break, but we feel it is an acceptable one
under the circumstances.
Our apologies to Lenny, whose ears were boxed over doing just this
last year...sigh.
This will allow the orteds to check to see if processes should be
bound before launching them.
2. we would not be able to bind processes launched without daemons
under systems that do not provide their own process binding
capability. For example, on Torque, we have an ability to natively
launch processes from within mpirun - those processes currently can
bind themselves in MPI_Init, but would not be able to do so any longer
under this proposed change.
To alleviate that problem, we propose to leave the process binding
code that is currently in MPI_Init, but surround it with a test to see
if an MCA param has been set indicating that the proc is to use that
code to bind itself. Thus, when launching without daemons (but via
mpirun), we can set the flag and instruct the procs to bind
themselves. However, procs that are launched without daemons via
something which has its own binding capability (e.g., SLURM), and
procs that were launched via daemon (and hence would have already been
bound), would not attempt to do so.
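A minimal sketch of the guarded self-binding described above, not the actual MPI_Init code. It assumes Linux (sched_setaffinity) and uses an environment variable as a stand-in for the MCA param. The flag name OMPI_MCA_example_bind_in_init and the helper maybe_bind_self() are hypothetical; the proposal does not name the real param.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flag name: the proposal does not name the actual MCA param. */
#define BIND_FLAG_ENV "OMPI_MCA_example_bind_in_init"

/*
 * Bind the calling process to `cpu` only if the launcher set the flag.
 * Returns 1 if the process bound itself, 0 if binding was skipped
 * (flag not set), and -1 if the binding call failed.
 */
static int maybe_bind_self(int cpu)
{
    const char *flag = getenv(BIND_FLAG_ENV);

    /* No flag: launched via a daemon (already bound) or via a launcher
     * with its own binding capability, so do nothing. */
    if (flag == NULL || strcmp(flag, "1") != 0) {
        return 0;
    }

    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    /* pid 0 means "the calling process". */
    return sched_setaffinity(0, sizeof(mask), &mask) == 0 ? 1 : -1;
}
```

In this sketch, mpirun would set the flag only when launching directly without daemons, so procs started by an orted or by SLURM would fall through the early return.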
Any further thoughts are welcome...
Ralph
On May 7, 2009, at 12:59 PM, Ralph Castain wrote:
I can do the coding - just want to ensure interested others get their
$0.002 in on how it should work.
I came up with a way to do it that doesn't require changes to the
paffinity framework. I can complete the prototype next week on an hg
branch and let you look at it. Mostly consists of moving what is now
in MPI_Init into the odls modules between the fork and exec, as Brian
suggested.
On May 7, 2009, at 12:43 PM, Terry Dontje wrote:
Brian W. Barrett wrote:
On Wed, 6 May 2009, Ralph Castain wrote:
Any thoughts on this? Should we change it?
Yes, we should change this (IMHO) :).
Me too.
If so, who wants to be involved in the re-design? I'm pretty sure
it would require some modification of the paffinity framework,
plus some minor mods to the odls framework and (since you cannot
bind a process other than yourself) addition of a new small
"proxy" script that would bind-then-exec each process started by
the orted (Eugene posted a candidate on the user list, though we
will have to deal with some system-specific issues in it).
I can't contribute a whole lot of time, but I'd be happy to lurk,
offer advice, and write some small bits of code. But I definitely
can't lead.
First offering of opinion from me. I think we can avoid the "proxy"
script by doing the binding after the fork but before the exec.
This will definitely require minor changes to the odls and probably
a bunch of changes to the paffinity framework. This will make
things slightly less fragile than a script would, and yet get us
what we want.
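A rough sketch of the bind-after-fork, before-exec idea, assuming Linux and sched_setaffinity; this is not odls or paffinity code. The helper launch_bound() is hypothetical. It works because the child is binding itself, and on Linux the affinity mask is preserved across execve().

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: fork, bind the child to one CPU, then exec argv.
 * Returns the child's exit status, or -1 if the launch failed or the
 * child did not exit normally.
 */
static int launch_bound(int cpu, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }

    if (pid == 0) {
        /* Child: bind ourselves before the application image is loaded. */
        cpu_set_t mask;
        CPU_ZERO(&mask);
        CPU_SET(cpu, &mask);
        if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            _exit(127);
        }

        /* The affinity mask set above survives the exec. */
        execvp(argv[0], argv);
        perror("execvp");
        _exit(127);
    }

    /* Parent (the launcher): wait for the child and report its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, launch_bound(0, argv) with argv = {"hostname", NULL} would run hostname bound to CPU 0, with no wrapper script involved.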
I'll have to talk with Len to see if Sun has any time to allocate to
this.
--td
_______________________________________________
devel mailing list
[email protected]
http://www.open-mpi.org/mailman/listinfo.cgi/devel