On 03/27/2014 03:02 PM, Ralph Castain wrote:
Or use --display-map to see the process to node assignments


Aha!
That one was not on my radar.
Maybe because somehow I can't find it in the
OMPI 1.6.5 mpiexec man page.
However, it seems to work with that version also, which is great.
(--display-map goes to stdout, whereas -report-bindings goes to stderr, right?)
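Since the two options write to different streams, they can be captured separately; a quick sketch (program name and file names are made up):

```shell
# --display-map goes to stdout, -report-bindings to stderr,
# so redirecting each stream keeps the two reports apart:
mpiexec --display-map -report-bindings -np 4 ./my_app \
    1> process_map.txt  2> bindings.txt
```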
Thanks, Ralph!

Gus Correa

Sent from my iPhone

On Mar 27, 2014, at 11:47 AM, Gus Correa <g...@ldeo.columbia.edu> wrote:

PS - The (OMPI 1.6.5) mpiexec default is -bind-to-none,
in which case -report-bindings won't report anything.

So, if you are using the default,
you can apply Joe Landman's suggestion
(or alternatively use the MPI_Get_processor_name function,
in lieu of uname(&uts); cpu_name = uts.nodename; ).
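The MPI_Get_processor_name route might look like the sketch below; this is a minimal illustration, not anyone's actual application code (build with mpicc, launch with mpiexec):

```c
/* Minimal sketch: each rank reports the node it runs on,
 * using MPI_Get_processor_name in lieu of
 * uname(&uts); cpu_name = uts.nodename; */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, len;
    char name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(name, &len);  /* typically the node's hostname */
    printf("rank %d is running on %s\n", rank, name);
    MPI_Finalize();
    return 0;
}
```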

However, many MPI applications benefit from some type of hardware binding, 
maybe yours will do also, and as a bonus
-report-bindings will tell you where each rank ran.
mpiexec's -tag-output is also helpful for debugging,
but won't tell you the node name, just the MPI rank.

You can set up a lot of these things as your preferred defaults,
via MCA parameters, and omit them from the mpiexec command line.
The trick is to match each mpiexec option to
the appropriate MCA parameter, as the names are not exactly the same.
"ompi_info --all" may help in that regard.
See this FAQ:
http://www.open-mpi.org/faq/?category=tuning#setting-mca-params
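For instance, something along these lines (the parameter name orte_report_bindings and the values shown are illustrative; check "ompi_info --all" for the exact names on your installation):

```shell
# Per-user MCA defaults, read by mpiexec at startup:
#   $HOME/.openmpi/mca-params.conf
#   orte_report_bindings = 1

# Equivalent one-off forms, on the command line or via the environment:
mpiexec --mca orte_report_bindings 1 -np 4 ./my_app
export OMPI_MCA_orte_report_bindings=1
```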

Again, the OMPI FAQ page is your friend!  :)
http://www.open-mpi.org/faq/

I hope this helps,
Gus Correa

On 03/27/2014 02:06 PM, Gus Correa wrote:
Hi John

Take a look at the mpiexec/mpirun options:

-report-bindings (this one should report what you want)

and maybe also:

-bycore, -bysocket, -bind-to-core, -bind-to-socket, ...

and similar, if you want more control on where your MPI processes run.
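A couple of illustrative invocations with those OMPI 1.6-era option names (program name made up):

```shell
# One process per socket slot, bound to sockets, with a binding report:
mpiexec -np 8 -bysocket -bind-to-socket -report-bindings ./my_app

# Or round-robin by core, bound to cores:
mpiexec -np 8 -bycore -bind-to-core -report-bindings ./my_app
```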

"man mpiexec" is your friend!

I hope this helps,
Gus Correa

On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote:
When a piece of software built against OpenMPI fails, I will see an
error referring to the rank of the MPI task which incurred the failure.
For example:

MPI_ABORT was invoked on rank 1236 in communicator MPI_COMM_WORLD

with errorcode 1.

Unfortunately, I do not have access to the software code, just the
installation directory tree for OpenMPI.  My question is:  Is there a
flag that can be passed to mpirun, or an environment variable set, which
would reveal the mapping of ranks to the hosts they are on?

I do understand that one could have multiple MPI ranks running on the
same host, but finding a way to determine which rank ran on what host
would go a long way in helping troubleshoot problems which may be
central to the host.  Thanks!

                   --john



_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


