Re: [OMPI users] Hanging vs Stopping behaviour in communication failures

2009-12-14 Thread Constantinos Makassikis
Jeff Squyres wrote:
On Dec 9, 2009, at 3:47 AM, Constantinos Makassikis wrote:
sometimes when running Open MPI jobs, the application hangs. By looking at the output I get the following error message:
[ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv

Re: [OMPI users] Hanging vs Stopping behaviour in communication failures

2009-12-14 Thread Jeff Squyres
On Dec 9, 2009, at 3:47 AM, Constantinos Makassikis wrote:
> sometimes when running Open MPI jobs, the application hangs. By looking at the
> output I get the following error message:
>
> [ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv
>
> ] mca_btl_

[OMPI users] Hanging vs Stopping behaviour in communication failures

2009-12-09 Thread Constantinos Makassikis
Dear all,

sometimes when running Open MPI jobs, the application hangs. By looking at the output I get the following error message:

[ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv ] mca_btl_tcp_frag_recv: readv failed: No route to host (113)

I woul
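For readers trying to make sense of the error line quoted in the original post, here is a minimal sketch that splits it into its bracketed fields and looks up the trailing number as an OS errno. The regular expression and the helper name `parse_btl_error` are assumptions made for illustration only; they are not part of Open MPI, and the pattern is fitted to this one sample line rather than to any documented log format. On Linux, errno 113 commonly corresponds to `EHOSTUNREACH`, but the numeric value differs between platforms, so the lookup is only meaningful on the same kind of system that produced the message.

```python
import errno
import os
import re

# Hypothetical pattern fitted to the sample line from the post:
# [host][[...],N][source_file:line:function ] message (errno)
LOG_PATTERN = re.compile(
    r"^\[(?P<host>[^\[\]]+)\]"
    r"\[(?P<proc>\[[^\]]*\],\d+)\]"
    r"\[(?P<file>[^:\]]+):(?P<line>\d+):(?P<func>\w+)\s*\]"
    r"\s*(?P<msg>.+?)\s*\((?P<errno>\d+)\)\s*$"
)


def parse_btl_error(line):
    """Return the fields of one error line as a dict, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    fields["errno"] = int(fields["errno"])
    return fields


# The error line exactly as quoted in the original post.
SAMPLE = (
    "[ic17][[34562,1],74][../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c"
    ":216:mca_btl_tcp_frag_recv ] mca_btl_tcp_frag_recv: readv failed: "
    "No route to host (113)"
)

parsed = parse_btl_error(SAMPLE)

if parsed is not None:
    # errno numbers are platform specific; this lookup uses the local table.
    symbolic = errno.errorcode.get(parsed["errno"], "unknown")
    print("host:    ", parsed["host"])
    print("source:  ", parsed["func"], "at line", parsed["line"])
    print("message: ", parsed["msg"])
    print("errno:   ", parsed["errno"], symbolic, "-", os.strerror(parsed["errno"]))
```

Run on the machine that reported the failure, this shows which host logged the error and which OS-level error the `readv` call returned, which can help separate a network problem from an application bug.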