Hi,

I could successfully use the following rankfile on Linux with openmpi-1.6.4rc3r27923, but it doesn't work with a patched openmpi-1.6.4rc4r28022 (patch.diff from Eugene). Perhaps this information helps to track down the error.

tyr rankfiles 114 cat rf_ex_linpc
# mpiexec -report-bindings -rf rf_ex_linpc hostname

rank 0=linpc0 slot=0:0-1,1:0-1
rank 1=linpc1 slot=0:0-1
rank 2=linpc1 slot=1:0
rank 3=linpc1 slot=1:1


linpc1 rankfiles 99 mpiexec -report-bindings -rf rf_ex_linpc hostname
------------------------------------------------------------------------
The rankfile that was used claimed that a host was either not
allocated or oversubscribed its slots. Please review your rank-slot
assignments and your host allocation to ensure a proper match. Also,
some systems may require using full hostnames, such as
"host1.example.com" (instead of just plain "host1").

  Host: linpc0
------------------------------------------------------------------------
linpc1 rankfiles 100 ompi_info | grep "MPI:"
                Open MPI: 1.6.4rc4r28022
linpc1 rankfiles 101 exit


tyr rankfiles 110 ssh linpc1
linpc1 fd1026 96 cd .../prog/mpi/rankfiles/
linpc1 rankfiles 97 mpiexec -report-bindings -rf rf_ex_linpc hostname
[linpc1:21351] MCW rank 1 bound to socket 0[core 0-1]: [B B][. .] (slot list 0:0-1)
[linpc1:21351] MCW rank 2 bound to socket 1[core 0]: [. .][B .] (slot list 1:0)
[linpc1:21351] MCW rank 3 bound to socket 1[core 1]: [. .][. B] (slot list 1:1)
[linpc0:08012] MCW rank 0 bound to socket 0[core 0-1] socket 1[core 0-1]: [B B][B B] (slot list 0:0-1,1:0-1)
linpc1 rankfiles 98 ompi_info | grep "MPI:"
                Open MPI: 1.6.4rc3r27923
linpc1 rankfiles 99


I will build an unpatched openmpi-1.6.4rc4 and check whether the above rankfile works with it. Unfortunately I can only check tomorrow, because new packages are mirrored to all machines during the night, so the build will not be available on both machines today. I will let you know the result.


Kind regards

Siegmar