Hello all, I'd like to bind 8 cores to a single MPI rank for hybrid MPI/OpenMP codes. In OMPI 1.6.3, I can do:

$ mpirun -np 2 -cpus-per-rank 8 -machinefile ./nodes ./hello

I get one rank bound to procs 0-7 and the other bound to 8-15. Great!

But I'm having some difficulties doing this with openmpi 1.8.1:

$ mpirun -np 2 -cpus-per-rank 8 -machinefile ./nodes ./hello
--------------------------------------------------------------------------
The following command line options and corresponding MCA parameter have
been deprecated and replaced as follows:

  Command line options:
    Deprecated:  --cpus-per-proc, -cpus-per-proc, --cpus-per-rank, -cpus-per-rank
    Replacement: --map-by <obj>:PE=N

  Equivalent MCA parameter:
    Deprecated:  rmaps_base_cpus_per_proc
    Replacement: rmaps_base_mapping_policy=<obj>:PE=N

The deprecated forms *will* disappear in a future version of Open MPI.
Please update to the new syntax.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2 slots
that were requested by the application:
  ./hello

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

OK, let me try the new syntax...

$ mpirun -np 2 --map-by core:pe=8 -machinefile ./nodes ./hello
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 2 slots
that were requested by the application:
  ./hello

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

What am I doing wrong? The documentation on these new options is somewhat poor and confusing so I'm probably doing something wrong. If anyone could provide some pointers here it'd be much appreciated! If it's not something simple and you need config logs and such please let me know.

As a side note: if I try this using the PBS nodefile with the above, I get a confusing message:

--------------------------------------------------------------------------
A request for multiple cpus-per-proc was given, but a directive
was also give to map to an object level that has less cpus than
requested ones:

  #cpus-per-proc:  8
  number of cpus:  1
  map-by:          BYCORE:NOOVERSUBSCRIBE

Please specify a mapping level that has more cpus, or
else let us define a default mapping that will allow multiple
cpus-per-proc.
--------------------------------------------------------------------------

From what I've gathered, this is because I have a node listed 16 times in my PBS nodefile, so it's assuming that I have 1 core per node? Some better documentation here would be helpful. I haven't been able to figure out how to use the "oversubscribe" option listed in the docs. Not that I really want to oversubscribe, of course; I need to modify the nodefile. But this stumped me for a while, as 1.6.3 didn't have this behavior.
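
For reference, the kind of nodefile rewrite I have in mind would look something like the sketch below. It assumes that Torque lists each node once per core and that the Open MPI hostfile "slots=N" syntax is the right way to express the core count; collapse_nodefile is just a name I made up, and I haven't confirmed that this makes the mapping error go away.

```shell
# Collapse a Torque/PBS nodefile (one line per core) into an Open MPI
# hostfile with one "host slots=N" line per node, keeping the order in
# which nodes first appear. The function name is hypothetical.
collapse_nodefile() {
    # $1: path to the nodefile; the hostfile is written to stdout
    awk 'NF {
             if (!($1 in count)) order[++n] = $1   # remember first-seen order
             count[$1]++                           # one line per core
         }
         END {
             for (i = 1; i <= n; i++)
                 print order[i] " slots=" count[order[i]]
         }' "$1"
}

# Usage inside a job (./nodes is the hostfile name used above):
#   collapse_nodefile "$PBS_NODEFILE" > ./nodes
```

With a node listed 16 times, this should leave a single line for that node carrying slots=16, rather than 16 separate one-core entries.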

As an extra bonus, I get a segfault in this situation:

$ mpirun -np 2 -machinefile ./nodes ./hello
[conte-a497:13255] *** Process received signal ***
[conte-a497:13255] Signal: Segmentation fault (11)
[conte-a497:13255] Signal code: Address not mapped (1)
[conte-a497:13255] Failing at address: 0x2c
[conte-a497:13255] [ 0] /lib64/libpthread.so.0[0x3c9460f500]
[conte-a497:13255] [ 1] /apps/rhel6/openmpi/1.8.1/intel-14.0.2.144/lib/libopen-rte.so.7(orte_plm_base_complete_setup+0x615)[0x2ba960a59015]
[conte-a497:13255] [ 2] /apps/rhel6/openmpi/1.8.1/intel-14.0.2.144/lib/libopen-pal.so.6(opal_libevent2021_event_base_loop+0xa05)[0x2ba961666715]
[conte-a497:13255] [ 3] mpirun(orterun+0x1b45)[0x40684f]
[conte-a497:13255] [ 4] mpirun(main+0x20)[0x4047f4]
[conte-a497:13255] [ 5] /lib64/libc.so.6(__libc_start_main+0xfd)[0x3a1bc1ecdd]
[conte-a497:13255] [ 6] mpirun[0x404719]
[conte-a497:13255] *** End of error message ***
Segmentation fault (core dumped)

My "nodes" file simply contains the first two lines of my original $PBS_NODEFILE provided by Torque. See above for why I modified it. It works fine if I use the full file.

Thanks in advance for any pointers you all may have!

Dan

--
Dan Dietz
Scientific Applications Analyst
ITaP Research Computing, Purdue University