On Mon, Feb 6, 2012 at 11:20 PM, Jed Brown <jedbrown at mcs.anl.gov> wrote:
>
> Hmm, progress semantics of MPI should ensure completion. Stalling the
> process with gdb should not change anything (assuming you weren't actually
> making changes with gdb). Can you run with MPICH2?
>

Ok - an update on this. I recompiled my whole stack with mvapich2... and it
still is hanging in the same place:

#0  0x00002b336a732f40 in PMI_Get_rank () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#1  0x00002b336a6bf453 in MPIDI_CH3I_MRAILI_Cq_poll () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#2  0x00002b336a675818 in MPIDI_CH3I_read_progress () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#3  0x00002b336a67485b in MPIDI_CH3I_Progress () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#4  0x00002b336a6bea96 in MPIC_Wait () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#5  0x00002b336a6be9db in MPIC_Sendrecv () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#6  0x00002b336a6be8aa in MPIC_Sendrecv_ft () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#7  0x00002b336a652db1 in MPIR_Allgather_intra_MV2 () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#8  0x00002b336a652965 in MPIR_Allgather_MV2 () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#9  0x00002b336a651846 in MPIR_Allgather_impl () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#10 0x00002b336a6517b1 in PMPI_Allgather () from /apps/local/mvapich2/1.7/intel-12.1.1/opt/lib/libmpich.so.3
#11 0x00000000004a1f23 in PetscLayoutSetUp ()
#12 0x000000000054e469 in MatMPIAIJSetPreallocation_MPIAIJ ()
#13 0x000000000055584a in MatCreateMPIAIJ ()

It's been hung there for about 35 minutes. This particular job has ~100
million DoFs with 512 MPI processes.

Any ideas?

Derek
