Hi everyone, I've been having a pretty odd issue with Slurm and Open MPI over the last few days. I just set up a heterogeneous Slurm cluster consisting of 32-bit Pentium 4 machines and a few newer 64-bit i7 machines, all running the latest version of Ubuntu Linux. I compiled Open MPI 1.3.3 with the following flags:
./configure --enable-heterogeneous --with-threads --with-slurm \
    --with-memory-manager --with-openib --without-udapl \
    --disable-openib-ibcm

I also wrote a trivial test program that broadcasts a large character buffer from rank 0:

#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>

#define LEN 12000000

int main(int argc, char *argv[])
{
    int rank, i, len = LEN;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Optionally override the buffer length from the command line. */
    if (argc > 1)
        len = atoi(argv[1]);
    printf("Size: %d, ", len);

    char *greeting = malloc(sizeof(char) * len);
    if (rank == 0) {
        for (i = 0; i < len - 1; i++)
            greeting[i] = ' ';
        greeting[len - 1] = '\0';
    }

    /* Broadcast the buffer from rank 0 to all other ranks. */
    MPI_Bcast(greeting, len, MPI_BYTE, 0, MPI_COMM_WORLD);
    printf("rank: %d\n", rank);

    MPI_Finalize();
    free(greeting);
    return 0;
}

I run this on my Slurm cluster with:

salloc -n 28 mpirun -n 28 mpitest

At 12,000,000 characters, this command works exactly as expected, no issues at all. However, beyond a certain critical limit somewhere around 16,000,000 characters, the program consistently segfaults with this error message:

salloc -n 28 -p all mpiexec -n 28 mpitest 16500000
salloc: Granted job allocation 234
[ibogaine:24883] *** Process received signal ***
[ibogaine:24883] Signal: Segmentation fault (11)
[ibogaine:24883] Signal code: Address not mapped (1)
[ibogaine:24883] Failing at address: 0x101a60f58
[ibogaine:24883] [ 0] /lib/libpthread.so.0 [0x7f6c00405080]
[ibogaine:24883] [ 1] /usr/local/lib/openmpi/mca_pml_ob1.so [0x7f6bfd9dff68]
[ibogaine:24883] [ 2] /usr/local/lib/openmpi/mca_btl_tcp.so [0x7f6bfcf3ec7c]
[ibogaine:24883] [ 3] /usr/local/lib/libopen-pal.so.0 [0x7f6c00ed5ee8]
[ibogaine:24883] [ 4] /usr/local/lib/libopen-pal.so.0(opal_progress+0xa1) [0x7f6c00eca7b1]
[ibogaine:24883] [ 5] /usr/local/lib/libmpi.so.0 [0x7f6c013a185d]
[ibogaine:24883] [ 6] /usr/local/lib/openmpi/mca_coll_tuned.so [0x7f6bfc10c29c]
[ibogaine:24883] [ 7] /usr/local/lib/openmpi/mca_coll_tuned.so [0x7f6bfc10c9eb]
[ibogaine:24883] [ 8] /usr/local/lib/openmpi/mca_coll_tuned.so [0x7f6bfc10295c]
[ibogaine:24883] [ 9] /usr/local/lib/openmpi/mca_coll_sync.so [0x7f6bfc31b35a]
[ibogaine:24883] [10] /usr/local/lib/libmpi.so.0(MPI_Bcast+0xa3) [0x7f6c013b78c3]
[ibogaine:24883] [11] mpitest(main+0xd4) [0x400bc0]
[ibogaine:24883] [12] /lib/libc.so.6(__libc_start_main+0xe6) [0x7f6c000a25a6]
[ibogaine:24883] [13] mpitest [0x400a29]
[ibogaine:24883] *** End of error message ***

As far as I can tell, the segfault occurs on the root node doing the broadcast. The error only appears when I send across heterogeneous sections of the cluster: if I communicate only within a homogeneous subset, I can go as high as 120,000,000 characters without issue. Across the heterogeneous cluster, however, there seems to be a hard limit somewhere just under 16,000,000 characters. Any ideas?
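
For what it's worth, I can probably work around this by splitting the broadcast into chunks below the apparent limit. Here's an untested sketch (bcast_chunked is just my own helper, and the 8,000,000-byte chunk size is an arbitrary guess below the threshold, not a documented limit):

#include "mpi.h"

/* Chunk size: an arbitrary value below the apparent ~16,000,000-byte
 * failure threshold; not a documented limit. */
#define BCAST_CHUNK 8000000

/* Broadcast a large buffer in pieces small enough to stay under the
 * threshold. Every rank runs the same loop, so the chunk boundaries
 * match on all ranks. */
static void bcast_chunked(char *buf, int len, int root, MPI_Comm comm)
{
    int offset;
    for (offset = 0; offset < len; offset += BCAST_CHUNK) {
        int n = len - offset;
        if (n > BCAST_CHUNK)
            n = BCAST_CHUNK;
        MPI_Bcast(buf + offset, n, MPI_BYTE, root, comm);
    }
}

In the test program above, this would replace the single MPI_Bcast call with bcast_chunked(greeting, len, 0, MPI_COMM_WORLD). Even if that works, though, I'd still like to understand why one large MPI_Bcast only fails across the 32/64-bit boundary.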