I just checked the head of both the master and 3.0.x branches, and they both work fine:

$ mpirun --map-by ppr:1:socket:pe=1 date
[rhc001:139231] SETTING BINDING TO CORE
[rhc002.cluster:203672] SETTING BINDING TO CORE
Wed Dec 20 00:20:55 PST 2017
Wed Dec 20 00:20:55 PST 2017
Tue Dec 19 18:37:03 PST 2017
Tue Dec 19 18:37:03 PST 2017
$

I’ll remove the debug, but it looks like this was already fixed.

> On Dec 19, 2017, at 10:49 PM, Siegmar Gross <siegmar.gr...@informatik.hs-fulda.de> wrote:
>
> Hi,
>
> I've installed openmpi-v3.0.0 on my "SUSE Linux Enterprise Server 12.3
> (x86_64)" with gcc-6.4.0. Today I discovered that I get an error for --map-by
> that I don't get with older versions.
>
>
> loki fd1026 115 which mpiexec
> /usr/local/openmpi-2.0.3_64_gcc/bin/mpiexec
> loki fd1026 116 mpiexec --host pc02:2,pc03:2 --map-by ppr:1:socket:pe=1 date
> Wed Dec 20 07:41:00 CET 2017
> ,...
>
> loki fd1026 107 which mpiexec
> /usr/local/openmpi-2.1.2_64_gcc/bin/mpiexec
> loki fd1026 108 mpiexec --host pc02:2,pc03:2 --map-by ppr:1:socket:pe=1 date
> Wed Dec 20 07:41:27 CET 2017
> ...
>
> loki fd1026 107 which mpiexec
> /usr/local/openmpi-3.0.0_64_gcc/bin/mpiexec
> loki fd1026 108 mpiexec --host pc02:2,pc03:2 --map-by ppr:1:socket:pe=1 date
> [loki:32662] SETTING BINDING TO CORE
> [pc02:04420] SETTING BINDING TO CORE
> [pc03:04788] SETTING BINDING TO CORE
> --------------------------------------------------------------------------
> The request to bind processes could not be completed due to
> an internal error - the locale of the following process was
> not set by the mapper code:
>
>   Process:  [[57386,1],3]
>
> Please contact the OMPI developers for assistance. Meantime,
> you will still be able to run your application without binding
> by specifying "--bind-to none" on your command line.
> --------------------------------------------------------------------------
> --------------------------------------------------------------------------
> ORTE has lost communication with a remote daemon.
>
>   HNP daemon   : [[57386,0],0] on node loki
>   Remote daemon: [[57386,0],2] on node pc03
>
> This is usually due to either a failure of the TCP network
> connection to the node, or possibly an internal failure of
> the daemon itself. We cannot recover from this failure, and
> therefore will terminate the job.
> --------------------------------------------------------------------------
> [loki:32662] 1 more process has sent help message help-orte-rmaps-base.txt / rmaps:no-locale
> [loki:32662] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
> loki fd1026 109
>
>
>
> I would be grateful, if somebody can fix the problem. Do you need anything
> else? Thank you very much for any help in advance.
>
>
> Kind regards
>
> Siegmar
> _______________________________________________
> users mailing list
> users@lists.open-mpi.org
> https://lists.open-mpi.org/mailman/listinfo/users