Hello,

 Running OpenMPI 1.4.3 on an SGI Altix cluster with 4096 cores, I get
this error message right at startup:
mca_oob_tcp_peer_recv_connect_ack: received unexpected process identifier [[13816,0],209]

and the whole job then spins indefinitely, without crashing or aborting.

 What could be the cause?
Is there a workaround?
Which parameter should be tuned?

 Thanks in advance for any help,
 Best,
 G.

