Hi,
I just compiled openmpi-4.0.1 using --with-sge to work with the Univa Grid Engine that we have on our cluster.

I first tried the basic hello world C program, where each worker prints its rank and the world size on stdout and then quits. This seems to work fine and I've had it running on 64 nodes with no issues.

I then moved on to a more complex test program where each worker calculates its share of a sum from 1 to N and then communicates its partial sum to rank 0, which collects all the answers using the MPI_Reduce() function. Now that the workers communicate with each other, the program fails. I get errors such as the following:

WARNING: Open MPI accepted a TCP connection from what appears to be a
another Open MPI process but cannot find a corresponding process
entry for that peer.

This attempted connection will be ignored; your MPI job may or may not
continue properly.

  Local host: node-hp0409
  PID:        58849

I've googled for this error and, as far as I can tell, there is nothing relevant to this issue. Does anyone have any idea what might be going on and what solutions there may be?

Our nodes are running Scientific Linux release 7.2.

Regards,
Emyr James
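
P.S. In case it helps to picture the second test, the work split is roughly along the lines below. This is a simplified, MPI-free sketch rather than the actual source: the function names partial_sum and total_sum and the contiguous chunking scheme are assumptions made up for illustration, and the loop in total_sum only stands in for what MPI_Reduce() with MPI_SUM does across ranks.

```c
/* Hypothetical helper: partial sum for one worker when the integers
 * 1..n are split into contiguous chunks across `size` workers.
 * The first (n % size) ranks each take one extra element. */
long long partial_sum(int rank, int size, long long n)
{
    long long chunk = n / size;   /* base number of elements per rank */
    long long rem   = n % size;   /* leftover elements to spread out  */

    /* First value owned by this rank (values are 1-based). */
    long long lo = rank * chunk + (rank < rem ? rank : rem) + 1;
    /* Last value owned by this rank; hi < lo means an empty share. */
    long long hi = lo + chunk - 1 + (rank < rem ? 1 : 0);

    long long sum = 0;
    for (long long i = lo; i <= hi; i++)
        sum += i;
    return sum;
}

/* Serial stand-in for the reduction step: in the MPI program each rank
 * would pass its partial sum to MPI_Reduce() and rank 0 would receive
 * the total; here the partial sums are simply added in a loop. */
long long total_sum(int size, long long n)
{
    long long total = 0;
    for (int rank = 0; rank < size; rank++)
        total += partial_sum(rank, size, n);
    return total;
}
```

For any number of workers the total should match the closed form N(N+1)/2, which is how the result on rank 0 can be checked.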