Hello again,

attaching gdb to mpirun, the backtrace when it hangs is

(gdb) bt
#0  0x00002b039f74169d in poll () from /usr/lib64/libc.so.6
#1  0x00002b039e1a9c42 in poll_dispatch () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#2  0x00002b039e1a2751 in opal_libevent2022_event_base_loop () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#3  0x00000000004056ef in orterun (argc=13, argv=0x7ffef20a79f8) at orterun.c:1057
#4  0x00000000004035a0 in main (argc=13, argv=0x7ffef20a79f8) at main.c:13
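For completeness, the attach procedure was roughly the following; the PID (11690) is from this particular run, so treat the transcript as an illustration only:

ps aux | grep mpirun        <- find the mpirun PID, here 11690
gdb -p 11690
(gdb) bt                    <- gives the backtrace above
(gdb) info threads          <- lists the same threads as the pstack output below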
Using pstack on mpirun I see several threads; they are shown below.

Thread 5 (Thread 0x2b03a33b0700 (LWP 11691)):
#0  0x00002b039f743413 in select () from /usr/lib64/libc.so.6
#1  0x00002b039c599979 in listen_thread () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-rte.so.20
#2  0x00002b039defedc5 in start_thread () from /usr/lib64/libpthread.so.0
#3  0x00002b039f74bced in clone () from /usr/lib64/libc.so.6
Thread 4 (Thread 0x2b03a3be9700 (LWP 11692)):
#0  0x00002b039f74c2c3 in epoll_wait () from /usr/lib64/libc.so.6
#1  0x00002b039e1a0f42 in epoll_dispatch () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#2  0x00002b039e1a2751 in opal_libevent2022_event_base_loop () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#3  0x00002b039e1fa996 in progress_engine () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#4  0x00002b039defedc5 in start_thread () from /usr/lib64/libpthread.so.0
#5  0x00002b039f74bced in clone () from /usr/lib64/libc.so.6
Thread 3 (Thread 0x2b03a3dea700 (LWP 11693)):
#0  0x00002b039f743413 in select () from /usr/lib64/libc.so.6
#1  0x00002b039e1f3a5f in listen_thread () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#2  0x00002b039defedc5 in start_thread () from /usr/lib64/libpthread.so.0
#3  0x00002b039f74bced in clone () from /usr/lib64/libc.so.6
Thread 2 (Thread 0x2b03a3feb700 (LWP 11694)):
#0  0x00002b039f743413 in select () from /usr/lib64/libc.so.6
#1  0x00002b039c55616b in listen_thread_fn () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-rte.so.20
#2  0x00002b039defedc5 in start_thread () from /usr/lib64/libpthread.so.0
#3  0x00002b039f74bced in clone () from /usr/lib64/libc.so.6
Thread 1 (Thread 0x2b039c324100 (LWP 11690)):
#0  0x00002b039f74169d in poll () from /usr/lib64/libc.so.6
#1  0x00002b039e1a9c42 in poll_dispatch () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#2  0x00002b039e1a2751 in opal_libevent2022_event_base_loop () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
#3  0x00000000004056ef in orterun (argc=13, argv=0x7ffef20a79f8) at orterun.c:1057
#4  0x00000000004035a0 in main (argc=13, argv=0x7ffef20a79f8) at main.c:13
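To make my suspicion explicit: it looks as if one rank (rank 0, via the wannier90 error exit) terminates without calling MPI_ABORT while the other ranks are already inside the collective, so they wait forever. Below is a minimal, untested Fortran sketch of that situation; the program name and vector size are made up, and it is of course not the VASP code, just the same in-place MPI_ALLREDUCE pattern as in m_sum_i (quoted further down):

program allreduce_hang
  use mpi
  implicit none
  integer :: ierror, rank, n
  integer :: ivec(8)

  call MPI_INIT(ierror)
  call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierror)

  n = 8
  ivec = rank

  if (rank == 0) then
     ! Simulates the wannier90 error path: this rank leaves without
     ! telling MPI. The clean way out would be
     !    call MPI_ABORT(MPI_COMM_WORLD, 1, ierror)
     stop 'simulated wannier90 error exit'
  end if

  ! The remaining ranks block here, spinning in opal_progress,
  ! like the m_sum_i call in the pstack trace.
  call MPI_ALLREDUCE(MPI_IN_PLACE, ivec(1), n, MPI_INTEGER, &
       MPI_SUM, MPI_COMM_WORLD, ierror)

  call MPI_FINALIZE(ierror)
end program allreduce_hang

If this sketch behaves like the real run, mpirun -n 2 on it should leave rank 1 spinning and never return with openmpi 2.0.1, while 1.10 apparently notices the exit; I have not verified this. Replacing the STOP with the MPI_ABORT call should tear down the whole job cleanly, but that would be a wannier90/VASP fix, not an openmpi one.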
Best Regards

Christof

On Wed, Dec 07, 2016 at 02:07:27PM +0100, Christof Koehler wrote:
> Hello,
>
> thank you for the fast answer.
>
> On Wed, Dec 07, 2016 at 08:23:43PM +0900, Gilles Gouaillardet wrote:
> > Christoph,
> >
> > can you please try again with
> >
> > mpirun --mca btl tcp,self --mca pml ob1 ...
>
> mpirun -n 20 --mca btl tcp,self --mca pml ob1
> /cluster/vasp/5.3.5/intel2016/openmpi-2.0/bin/vasp-mpi
>
> Deadlocks/hangs, has no effect.
>
> > mpirun --mca btl tcp,self --mca pml ob1 --mca coll ^tuned ...
>
> mpirun -n 20 --mca btl tcp,self --mca pml ob1 --mca coll ^tuned
> /cluster/vasp/5.3.5/intel2016/openmpi-2.0/bin/vasp-mpi
>
> Deadlocks/hangs, has no effect. There is additional output:
>
> wannier90 error: examine the output/error file for details
> [node109][[55572,1],16][btl_tcp_frag.c:230:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> [node109][[55572,1],8][btl_tcp_frag.c:230:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> [node109][[55572,1],4][btl_tcp_frag.c:230:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> [node109][[55572,1],1][btl_tcp_frag.c:230:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> [node109][[55572,1],2][btl_tcp_frag.c:230:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
>
> Please note: the "wannier90 error: examine the output/error file for
> details" is expected; there is in fact an error in the input file, and
> the program is supposed to terminate.
>
> However, with mvapich2 and with openmpi 1.10.4 it terminates completely,
> i.e. I get my shell prompt back. Whether a segfault is involved with
> mvapich2 (as is apparently the case with openmpi 1.10.4, based on the
> termination message) I do not know. I tried
>
> export MV2_DEBUG_SHOW_BACKTRACE=1
> mpirun -n 20 /cluster/vasp/5.3.5/intel2016/mvapich2-2.2/bin/vasp-mpi
>
> but did not get any indication of a problem (segfault); the last lines are
>
> calculate QP shifts <psi_nk| G(iteration)W_0 |psi_nk>: iteration 1
> writing wavefunctions
> wannier90 error: examine the output/error file for details
> node109 14:00 /scratch/ckoe/gw %
>
> The last line is my shell prompt.
>
> > if everything fails, can you describe how MPI_Allreduce is invoked ?
> > /* number of tasks, datatype, number of elements */
>
> Difficult, this is not our code in the first place [1], and the problem
> occurs when using an ("officially" supported) third-party library [2].
>
> From the stack trace of the hanging process, the vasp routine which calls
> allreduce is "m_sum_i_". That is in the mpi.F source file. Allreduce is
> called as
>
> CALL MPI_ALLREDUCE( MPI_IN_PLACE, ivec(1), n, MPI_INTEGER, &
>      & MPI_SUM, COMM%MPI_COMM, ierror )
>
> n and ivec(1) are of data type integer. It was originally run with 20
> ranks; I tried 2 ranks now as well, and it hangs, too. With one (!) rank,
>
> mpirun -n 1 --mca btl tcp,self --mca pml ob1 --mca coll ^tuned
> /cluster/vasp/5.3.5/intel2016/openmpi-2.0/bin/vasp-mpi
>
> I of course get a shell prompt back.
>
> I then started it normally in the shell with 2 ranks,
>
> mpirun -n 2 --mca btl tcp,self --mca pml ob1 --mca coll ^tuned
> /cluster/vasp/5.3.5/intel2016/openmpi-2.0/bin/vasp-mpi
>
> and attached gdb to the rank with the lowest pid (3478). I do not get a
> prompt back (it hangs), the second rank 3479 is still at 100 % CPU and
> mpirun is still a process I can see with "ps", but gdb says
>
> (gdb) continue   <- that is where I attached it!
> Continuing.
> [Thread 0x2b8366806700 (LWP 3480) exited]
> [Thread 0x2b835da1c040 (LWP 3478) exited]
> [Inferior 1 (process 3478) exited normally]
> (gdb) bt
> No stack.
>
> So, as far as gdb is concerned, the rank with the lowest pid (which is
> gone while the other rank is still eating CPU time) terminated normally?
>
> I hope this helps. I have only very basic experience with debuggers
> (never needed them really) and even less with using them in parallel.
> I can try to catch the contents of ivec, but I do not think that would
> be helpful? If you need them I can try of course; I have no idea how
> large the vector is.
>
> Best Regards
>
> Christof
>
> [1] https://www.vasp.at/
> [2] http://www.wannier.org/, old version 1.2
>
> > Cheers,
> >
> > Gilles
> >
> > On Wed, Dec 7, 2016 at 7:38 PM, Christof Koehler
> > <christof.koeh...@bccms.uni-bremen.de> wrote:
> > > Hello everybody,
> > >
> > > I am observing a deadlock in allreduce with openmpi 2.0.1 on a single
> > > node. A stack trace (pstack) of one rank is below, showing the program
> > > (vasp 5.3.5) and the two psm2 progress threads. However:
> > >
> > > In fact, the vasp input is not ok and it should abort at the point where
> > > it hangs. It does when using mvapich 2.2. With openmpi 2.0.1 it just
> > > deadlocks in some allreduce operation. Originally it was started with 20
> > > ranks; when it hangs there are only 19 left. From the PIDs I would
> > > assume it is the master rank which is missing. So this looks like a
> > > failure to terminate.
> > >
> > > With 1.10 I get a clean
> > > --------------------------------------------------------------------------
> > > mpiexec noticed that process rank 0 with PID 18789 on node node109
> > > exited on signal 11 (Segmentation fault).
> > > --------------------------------------------------------------------------
> > >
> > > Any ideas what to try? Of course in this situation it may well be the
> > > program. Still, with the observed difference between 2.0.1 and 1.10 (and
> > > mvapich) this might be interesting to someone.
> > >
> > > Best Regards
> > >
> > > Christof
> > >
> > >
> > > Thread 3 (Thread 0x2ad362577700 (LWP 4629)):
> > > #0  0x00002ad35b1562c3 in epoll_wait () from /lib64/libc.so.6
> > > #1  0x00002ad35d114f42 in epoll_dispatch () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #2  0x00002ad35d116751 in opal_libevent2022_event_base_loop () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #3  0x00002ad35d16e996 in progress_engine () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #4  0x00002ad359efbdc5 in start_thread () from /lib64/libpthread.so.0
> > > #5  0x00002ad35b155ced in clone () from /lib64/libc.so.6
> > > Thread 2 (Thread 0x2ad362778700 (LWP 4640)):
> > > #0  0x00002ad35b14b69d in poll () from /lib64/libc.so.6
> > > #1  0x00002ad35d11dc42 in poll_dispatch () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #2  0x00002ad35d116751 in opal_libevent2022_event_base_loop () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #3  0x00002ad35d0c61d1 in progress_engine () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #4  0x00002ad359efbdc5 in start_thread () from /lib64/libpthread.so.0
> > > #5  0x00002ad35b155ced in clone () from /lib64/libc.so.6
> > > Thread 1 (Thread 0x2ad35978d040 (LWP 4609)):
> > > #0  0x00002ad35b14b69d in poll () from /lib64/libc.so.6
> > > #1  0x00002ad35d11dc42 in poll_dispatch () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #2  0x00002ad35d116751 in opal_libevent2022_event_base_loop () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #3  0x00002ad35d0c28cf in opal_progress () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libopen-pal.so.20
> > > #4  0x00002ad35adce8d8 in ompi_request_wait_completion () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> > > #5  0x00002ad35adce838 in mca_pml_cm_recv () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> > > #6  0x00002ad35ad4da42 in ompi_coll_base_allreduce_intra_recursivedoubling () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> > > #7  0x00002ad35ad52906 in ompi_coll_tuned_allreduce_intra_dec_fixed () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> > > #8  0x00002ad35ad1f0f4 in PMPI_Allreduce () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi.so.20
> > > #9  0x00002ad35aa99c38 in pmpi_allreduce__ () from /cluster/mpi/openmpi/2.0.1/intel2016/lib/libmpi_mpifh.so.20
> > > #10 0x000000000045f8c6 in m_sum_i_ ()
> > > #11 0x0000000000e1ce69 in mlwf_mp_mlwf_wannier90_ ()
> > > #12 0x00000000004331ff in vamp () at main.F:2640
> > > #13 0x000000000040ea1e in main ()
> > > #14 0x00002ad35b080b15 in __libc_start_main () from /lib64/libc.so.6
> > > #15 0x000000000040e929 in _start ()
> > >
> > > --
> > > Dr. rer. nat. Christof Köhler       email: c.koeh...@bccms.uni-bremen.de
> > > Universitaet Bremen/ BCCMS          phone: +49-(0)421-218-62334
> > > Am Fallturm 1/ TAB/ Raum 3.12       fax:   +49-(0)421-218-62770
> > > 28359 Bremen
> > >
> > > PGP: http://www.bccms.uni-bremen.de/cms/people/c_koehler/
>
> --
> Dr. rer. nat. Christof Köhler       email: c.koeh...@bccms.uni-bremen.de
> Universitaet Bremen/ BCCMS          phone: +49-(0)421-218-62334
> Am Fallturm 1/ TAB/ Raum 3.12       fax:   +49-(0)421-218-62770
> 28359 Bremen
>
> PGP: http://www.bccms.uni-bremen.de/cms/people/c_koehler/

--
Dr. rer. nat. Christof Köhler       email: c.koeh...@bccms.uni-bremen.de
Universitaet Bremen/ BCCMS          phone: +49-(0)421-218-62334
Am Fallturm 1/ TAB/ Raum 3.12       fax:   +49-(0)421-218-62770
28359 Bremen

PGP: http://www.bccms.uni-bremen.de/cms/people/c_koehler/