I'm AFK, but let me reply about the IB thing: double ports / multi-rail is a good thing. It's not a good thing if they're on the same subnet.
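(In case it helps to check: a port's subnet prefix is the upper 64 bits of its GID at index 0, so a small libibverbs program can show whether both ports really share a subnet. A rough, untested sketch -- assuming the libibverbs headers are installed; compile with -libverbs:)

#include <stdio.h>
#include <stdint.h>
#include <endian.h>
#include <infiniband/verbs.h>

/* print the subnet prefix of every port of every IB device:
 * if two ports report the same prefix, they are on the same subnet */
int main(void)
{
    int n = 0, i;
    uint8_t p;
    struct ibv_device **devs = ibv_get_device_list(&n);
    if (!devs) { perror("ibv_get_device_list"); return 1; }

    for (i = 0; i < n; i++) {
        struct ibv_context *ctx = ibv_open_device(devs[i]);
        if (!ctx) continue;

        struct ibv_device_attr attr;
        if (ibv_query_device(ctx, &attr) == 0) {
            for (p = 1; p <= attr.phys_port_cnt; p++) {
                union ibv_gid gid;   /* GID 0 = subnet prefix + port GUID */
                if (ibv_query_gid(ctx, p, 0, &gid) == 0)
                    printf("%s port %u: subnet prefix 0x%016llx\n",
                           ibv_get_device_name(devs[i]), p,
                           (unsigned long long) be64toh(gid.global.subnet_prefix));
            }
        }
        ibv_close_device(ctx);
    }
    ibv_free_device_list(devs);
    return 0;
}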
Check the FAQ - http://www.open-mpi.org/faq/?category=openfabrics - I can't see it well enough on the small screen of my phone, but I think there's a question on there about how multi-rail destinations are chosen. Spoiler: put your ports in different subnets so that OMPI makes deterministic choices.

Sent from my phone. No type good.

On Jun 2, 2014, at 6:55 AM, "Gilles Gouaillardet" <gilles.gouaillar...@gmail.com> wrote:

Jeff,

On Mon, Jun 2, 2014 at 7:26 PM, Jeff Squyres (jsquyres) <jsquy...@cisco.com> wrote:

On Jun 2, 2014, at 5:03 AM, Gilles Gouaillardet <gilles.gouaillar...@gmail.com> wrote:

> i faced a slightly different problem, but it is 100% reproducible:
> - i launch mpirun (no batch manager) from a node with one IB port
> - i use -host node01,node02, where node01 and node02 both have two IB ports on the same subnet

FWIW: 2 IB ports on the same subnet? That's not a good idea.

could you please elaborate a bit? from what i saw, this basically doubles the bandwidth (IMB PingPong benchmark) between two nodes (!), which is not a bad thing. i can only guess this might not scale (e.g. if 16 tasks are running on each host, the overhead associated with the use of two ports might void the extra bandwidth).

> by default, this will hang.

...but it still shouldn't hang. I wonder if it's somehow related to https://svn.open-mpi.org/trac/ompi/ticket/4442...?

i doubt it ... here is my command line (from node0):

`which mpirun` -np 2 -host node1,node2 --mca rtc_freq_priority 0 --mca btl openib,self --mca btl_openib_if_include mlx4_0 ./abort

on top of that, the usnic btl is not built (nor installed).
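(For reference, the ./abort test program itself isn't included in this thread; a minimal reproducer presumably looks something like the following, where every rank calls MPI_Abort right after MPI_Init:)

#include <stdio.h>
#include <mpi.h>

/* every rank calls MPI_Abort(); mpirun is expected to kill the job,
 * but with two IB ports on the same subnet the job hangs instead */
int main(int argc, char *argv[])
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d: calling MPI_Abort\n", rank);
    MPI_Abort(MPI_COMM_WORLD, 1);
    /* never reached */
    MPI_Finalize();
    return 0;
}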
> if this is a "feature" (e.g. openmpi does not support this kind of configuration) i am fine with it.
>
> when i run mpirun --mca btl_openib_if_exclude mlx4_1, then if the application is a success, it works just fine.
>
> if the application calls MPI_Abort() /* and even if all tasks call MPI_Abort() */ then it will hang 100% of the time.
> i do not see that as a feature but as a bug.

Yes, OMPI should never hang upon a call to MPI_Abort. Can you get some stack traces to show where the hung process(es) are stuck? That would help Ralph pin down where things aren't working down in ORTE.

on node0 :

\_ -bash
    \_ /.../local/ompi-trunk/bin/mpirun -np 2 -host node1,node2 --mca rtc_freq_priority 0 --mc
        \_ /usr/bin/ssh -x node1 PATH=/.../local/ompi-trunk/bin:$PATH ; export PATH ; LD_LIBRAR
        \_ /usr/bin/ssh -x node2 PATH=/.../local/ompi-trunk/bin:$PATH ; export PATH ; LD_LIBRAR

pstack (mpirun) :

$ pstack 10913
Thread 2 (Thread 0x7f0ecad35700 (LWP 10914)):
#0  0x0000003ba66e15e3 in select () from /lib64/libc.so.6
#1  0x00007f0ecad4391e in listen_thread () from /.../local/ompi-trunk/lib/openmpi/mca_oob_tcp.so
#2  0x0000003ba72079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x0000003ba66e8b6d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f0ecc601700 (LWP 10913)):
#0  0x0000003ba66df343 in poll () from /lib64/libc.so.6
#1  0x00007f0ecc6b1a05 in poll_dispatch () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#2  0x00007f0ecc6a641c in opal_libevent2021_event_base_loop () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#3  0x00000000004056a1 in orterun ()
#4  0x00000000004039f4 in main ()

on node 1 :

sshd: gouaillardet@notty
    \_ bash -c PATH=/.../local/ompi-trunk/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/...
        \_ /.../local/ompi-trunk/bin/orted -mca ess env -mca orte_ess_jobid 3459448832 -mca orte_ess_vpid
            \_ [abort] <defunct>

$ pstack (orted)
#0  0x00007fe0ba6a0566 in vfprintf () from /lib64/libc.so.6
#1  0x00007fe0ba6c9a52 in vsnprintf () from /lib64/libc.so.6
#2  0x00007fe0ba6a9523 in snprintf () from /lib64/libc.so.6
#3  0x00007fe0bbc019b6 in orte_util_print_jobids () from /.../local/ompi-trunk/lib/libopen-rte.so.0
#4  0x00007fe0bbc01791 in orte_util_print_name_args () from /.../local/ompi-trunk/lib/libopen-rte.so.0
#5  0x00007fe0b8e16a8b in mca_oob_tcp_component_hop_unknown () from /.../local/ompi-trunk/lib/openmpi/mca_oob_tcp.so
#6  0x00007fe0bb94ab7a in event_process_active_single_queue () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#7  0x00007fe0bb94adf2 in event_process_active () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#8  0x00007fe0bb94b470 in opal_libevent2021_event_base_loop () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#9  0x00007fe0bbc1fa7b in orte_daemon () from /.../local/ompi-trunk/lib/libopen-rte.so.0
#10 0x000000000040093a in main ()

on node 2 :

sshd: gouaillardet@notty
    \_ bash -c PATH=/.../local/ompi-trunk/bin:$PATH ; export PATH ; LD_LIBRARY_PATH=/...
        \_ /.../local/ompi-trunk/bin/orted -mca ess env -mca orte_ess_jobid 3459448832 -mca orte_ess_vpid
            \_ [abort] <defunct>

$ pstack (orted)
#0  0x00007fe8fc435e39 in strchrnul () from /lib64/libc.so.6
#1  0x00007fe8fc3ef8f5 in vfprintf () from /lib64/libc.so.6
#2  0x00007fe8fc41aa52 in vsnprintf () from /lib64/libc.so.6
#3  0x00007fe8fc3fa523 in snprintf () from /lib64/libc.so.6
#4  0x00007fe8fd9529b6 in orte_util_print_jobids () from /.../local/ompi-trunk/lib/libopen-rte.so.0
#5  0x00007fe8fd952791 in orte_util_print_name_args () from /.../local/ompi-trunk/lib/libopen-rte.so.0
#6  0x00007fe8fab6c1b5 in resend () from /.../local/ompi-trunk/lib/openmpi/mca_oob_tcp.so
#7  0x00007fe8fab67ce3 in mca_oob_tcp_component_hop_unknown () from /.../local/ompi-trunk/lib/openmpi/mca_oob_tcp.so
#8  0x00007fe8fd69bb7a in event_process_active_single_queue () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#9  0x00007fe8fd69bdf2 in event_process_active () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#10 0x00007fe8fd69c470 in opal_libevent2021_event_base_loop () from /.../local/ompi-trunk/lib/libopen-pal.so.0
#11 0x00007fe8fd970a7b in orte_daemon () from /.../local/ompi-trunk/lib/libopen-rte.so.0
#12 0x000000000040093a in main ()

the orted processes loop forever in event_process_active_single_queue: mca_oob_tcp_component_hop_unknown gets called again and again

mca_oob_tcp_component_hop_unknown (fd=-1, args=4, cbdata=0x99dc50) at ../../../../../../src/ompi-trunk/orte/mca/oob/tcp/oob_tcp_component.c:1369

> in another thread, Jeff mentioned that the usnic btl is doing stuff even if there is no usnic hardware (this will be fixed shortly).
> Do you still see the intermittent hang without listing usnic as a btl?

Yeah, there's a definite race in the usnic BTL ATM. If you care, here's what's happening:

thanks for the insights :-)

Cheers,

Gilles

_______________________________________________
devel mailing list
de...@open-mpi.org
Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
Link to this post: http://www.open-mpi.org/community/lists/devel/2014/06/14943.php