The code logic is as follows: after the matrix is assembled, I iterate over all distributed patches in the domain to see which patches abut a Dirichlet boundary. A processor calls this routine only for patches that abut a physical Dirichlet boundary. However, that same processor “owns” that DoF, which would be on its diagonal.
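A toy sketch of why this calling pattern can hang, assuming (as Barry notes in the quoted thread) that each MatZeroRows call has to be matched by all processes. It uses no PETSc or MPI; BoundaryDof, collectDirichletRows, callsPerRow, callsBatched, and countsMatch are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one boundary DoF visited by the patch loops.
struct BoundaryDof
{
    int row;           // global row index (u_dof_index in the thread)
    bool is_dirichlet; // true when the boundary coefficient b is zero
};

// Gather the Dirichlet rows a rank owns instead of zeroing them one at a time.
std::vector<int> collectDirichletRows(const std::vector<BoundaryDof>& dofs)
{
    std::vector<int> rows;
    for (const BoundaryDof& dof : dofs)
    {
        if (dof.is_dirichlet) rows.push_back(dof.row);
    }
    return rows;
}

// Number of collective calls a rank issues when it calls once per Dirichlet row.
std::size_t callsPerRow(const std::vector<BoundaryDof>& dofs)
{
    return collectDirichletRows(dofs).size();
}

// Number of collective calls when the rows are batched: always one per rank,
// even when the rank has no Dirichlet rows at all.
std::size_t callsBatched(const std::vector<BoundaryDof>& /*dofs*/)
{
    return 1;
}

// A collective can only complete if every rank issues the same number of calls.
bool countsMatch(const std::vector<std::size_t>& calls_by_rank)
{
    assert(!calls_by_rank.empty());
    for (std::size_t c : calls_by_rank)
    {
        if (c != calls_by_rank.front()) return false;
    }
    return true;
}
```

With three ranks owning two, zero, and one Dirichlet rows, the per-row counts differ while the batched counts agree. In real code the batched variant would correspond to a single MatZeroRows(mat, n, rows, diag, NULL, NULL) call on every rank after the loops, made even when n is 0. That assumes every zeroed row gets the same diagonal value, which the loop in the thread (diag_value = a) may not guarantee.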
I think Barry already mentioned that this will not work unless I set the flag that tells MatZeroRows not to communicate. However, that flag is not working as it should here for some reason. I could always set the matrix coefficients for the Dirichlet rows during MatSetValues, but that would lengthen my code, and I was trying to avoid that.

On Wed, Nov 29, 2023 at 10:02 AM Matthew Knepley <knep...@gmail.com> wrote:

> On Wed, Nov 29, 2023 at 12:30 PM Amneet Bhalla <mail2amn...@gmail.com> wrote:
>
>> Ok, I added both, but it still hangs. Here, is bt from all three tasks:
>
> It looks like two processes are calling AllReduce, but one is not. Are all
> procs not calling MatZeroRows?
>
> Thanks,
>
> Matt
>
>> Task 1:
>>
>> amneetb@APSB-MacBook-Pro-16:~$ lldb -p 44691
>> (lldb) process attach --pid 44691
>> Process 44691 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>     frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8
>> libsystem_kernel.dylib`:
>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40>
>>    0x18a2d7510 <+12>: pacibsp
>>    0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]!
>>    0x18a2d7518 <+20>: mov x29, sp
>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d".
>> Architecture set to: arm64-apple-macosx-.
>> (lldb) cont
>> Process 44691 resuming
>> Process 44691 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>     frame #0: 0x000000010ba40b60 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752
>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release:
>> -> 0x10ba40b60 <+752>: add w8, w8, #0x1
>>    0x10ba40b64 <+756>: ldr w9, [x22]
>>    0x10ba40b68 <+760>: cmp w8, w9
>>    0x10ba40b6c <+764>: b.lt 0x10ba40b4c ; <+732>
>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>> (lldb) bt
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>   * frame #0: 0x000000010ba40b60 libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_release + 752
>>     frame #1: 0x000000010ba48528 libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 1088
>>     frame #2: 0x000000010ba47964 libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368
>>     frame #3: 0x000000010ba35e78 libpmpi.12.dylib`MPIR_Allreduce + 1588
>>     frame #4: 0x0000000103f587dc libmpi.12.dylib`MPI_Allreduce + 2280
>>     frame #5: 0x0000000106d67650 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000105846470, N=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3
>>     frame #6: 0x0000000106aadfac libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000105846470, numRows=1, rows=0x000000016dbfa9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3
>>     frame #7: 0x00000001023952d0 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016dc04168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016dc04398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer<SAMRAI::hier::PatchLevel<2> > @ 0x000000016dbfcec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:799:36
>>     frame #8: 0x00000001023acb8c fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016dc04018, x=0x000000016dc05778, (null)=0x000000016dc05680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5
>>     frame #9: 0x000000010254a2dc fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016dc04018, x=0x000000016dc05778, b=0x000000016dc05680) at PETScLevelSolver.cpp:340:5
>>     frame #10: 0x0000000102202e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016dc07450) at fo_acoustic_streaming_solver.cpp:400:22
>>     frame #11: 0x0000000189fbbf28 dyld`start + 2236
>> (lldb)
>>
>> Task 2:
>>
>> amneetb@APSB-MacBook-Pro-16:~$ lldb -p 44692
>> (lldb) process attach --pid 44692
>> Process 44692 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>     frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8
>> libsystem_kernel.dylib`:
>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40>
>>    0x18a2d7510 <+12>: pacibsp
>>    0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]!
>>    0x18a2d7518 <+20>: mov x29, sp
>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d".
>> Architecture set to: arm64-apple-macosx-.
>> (lldb) cont
>> Process 44692 resuming
>> Process 44692 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>     frame #0: 0x000000010e5a022c libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516
>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather:
>> -> 0x10e5a022c <+516>: ldr x10, [x19, #0x4e8]
>>    0x10e5a0230 <+520>: cmp x9, x10
>>    0x10e5a0234 <+524>: b.hs 0x10e5a0254 ; <+556>
>>    0x10e5a0238 <+528>: add w8, w8, #0x1
>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>> (lldb) bt
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>   * frame #0: 0x000000010e5a022c libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 516
>>     frame #1: 0x000000010e59fd14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224
>>     frame #2: 0x000000010e59fb60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44
>>     frame #3: 0x000000010e585490 libpmpi.12.dylib`MPIR_Barrier + 900
>>     frame #4: 0x0000000106ac5030 libmpi.12.dylib`MPI_Barrier + 684
>>     frame #5: 0x0000000108e62638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=1140850688, comm_out=0x00000001408ae4b0, first_tag=0x00000001408ae4e4) at tagm.c:235:5
>>     frame #6: 0x0000000108e6a910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x00000001408ae470, classid=1211228, class_name="KSP", descr="Krylov Method", mansec="KSP", comm=1140850688, destroy=(libpetsc.3.17.dylib`KSPDestroy at itfunc.c:1418), view=(libpetsc.3.17.dylib`KSPView at itcreate.c:113)) at inherit.c:62:3
>>     frame #7: 0x000000010aa28010 libpetsc.3.17.dylib`KSPCreate(comm=1140850688, inksp=0x000000016b0a4160) at itcreate.c:679:3
>>     frame #8: 0x00000001050aa2f4 fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a4018, x=0x000000016b0a5778, b=0x000000016b0a5680) at PETScLevelSolver.cpp:344:12
>>     frame #9: 0x0000000104d62e5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0a7450) at fo_acoustic_streaming_solver.cpp:400:22
>>     frame #10: 0x0000000189fbbf28 dyld`start + 2236
>> (lldb)
>>
>> Task 3:
>>
>> amneetb@APSB-MacBook-Pro-16:~$ lldb -p 44693
>> (lldb) process attach --pid 44693
>> Process 44693 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>     frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8
>> libsystem_kernel.dylib`:
>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40>
>>    0x18a2d7510 <+12>: pacibsp
>>    0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]!
>>    0x18a2d7518 <+20>: mov x29, sp
>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d".
>> Architecture set to: arm64-apple-macosx-.
>> (lldb) cont
>> Process 44693 resuming
>> Process 44693 stopped
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>     frame #0: 0x000000010e59c68c libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952
>> libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather:
>> -> 0x10e59c68c <+952>: ldr w9, [x21]
>>    0x10e59c690 <+956>: cmp w8, w9
>>    0x10e59c694 <+960>: b.lt 0x10e59c670 ; <+924>
>>    0x10e59c698 <+964>: bl 0x10e59ce64 ; MPID_Progress_test
>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>> (lldb) bt
>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>   * frame #0: 0x000000010e59c68c libpmpi.12.dylib`MPIDI_POSIX_mpi_release_gather_gather + 952
>>     frame #1: 0x000000010e5a44bc libpmpi.12.dylib`MPIDI_POSIX_mpi_allreduce_release_gather + 980
>>     frame #2: 0x000000010e5a3964 libpmpi.12.dylib`MPIDI_Allreduce_intra_composition_gamma + 368
>>     frame #3: 0x000000010e591e78 libpmpi.12.dylib`MPIR_Allreduce + 1588
>>     frame #4: 0x0000000106ab47dc libmpi.12.dylib`MPI_Allreduce + 2280
>>     frame #5: 0x00000001098c3650 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x0000000136862270, N=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:827:3
>>     frame #6: 0x0000000109609fac libpetsc.3.17.dylib`MatZeroRows(mat=0x0000000136862270, numRows=1, rows=0x000000016b09e9f4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at matrix.c:5935:3
>>     frame #7: 0x0000000104ef12d0 fo_acoustic_streaming_solver_2d`IBAMR::AcousticStreamingPETScMatUtilities::constructPatchLevelFOAcousticStreamingOp(mat=0x000000016b0a8168, omega=1, sound_speed=1, rho_idx=3, mu_idx=2, lambda_idx=4, u_bc_coefs=0x000000016b0a8398, data_time=NaN, num_dofs_per_proc=size=3, u_dof_index_idx=27, p_dof_index_idx=28, patch_level=Pointer<SAMRAI::hier::PatchLevel<2> > @ 0x000000016b0a0ec0, mu_interp_type=VC_HARMONIC_INTERP) at AcousticStreamingPETScMatUtilities.cpp:799:36
>>     frame #8: 0x0000000104f08b8c fo_acoustic_streaming_solver_2d`IBAMR::FOAcousticStreamingPETScLevelSolver::initializeSolverStateSpecialized(this=0x000000016b0a8018, x=0x000000016b0a9778, (null)=0x000000016b0a9680) at FOAcousticStreamingPETScLevelSolver.cpp:149:5
>>     frame #9: 0x00000001050a62dc fo_acoustic_streaming_solver_2d`IBTK::PETScLevelSolver::initializeSolverState(this=0x000000016b0a8018, x=0x000000016b0a9778, b=0x000000016b0a9680) at PETScLevelSolver.cpp:340:5
>>     frame #10: 0x0000000104d5ee5c fo_acoustic_streaming_solver_2d`main(argc=11, argv=0x000000016b0ab450) at fo_acoustic_streaming_solver.cpp:400:22
>>     frame #11: 0x0000000189fbbf28 dyld`start + 2236
>> (lldb)
>>
>> On Wed, Nov 29, 2023 at 7:22 AM Barry Smith <bsm...@petsc.dev> wrote:
>>
>>> On Nov 29, 2023, at 1:16 AM, Amneet Bhalla <mail2amn...@gmail.com> wrote:
>>>
>>> BTW, I think you meant using MatSetOption(mat, *MAT_NO_OFF_PROC_ZERO_ROWS*, PETSC_TRUE)
>>>
>>> Yes
>>>
>>> instead of MatSetOption(mat, *MAT_NO_OFF_PROC_ENTRIES*, PETSC_TRUE) ??
>>>
>>> Please try setting both flags.
>>>
>>> However, that also did not help to overcome the MPI Barrier issue.
>>>
>>> If there is still a problem please trap all the MPI processes when
>>> they hang in the debugger and send the output from using bt on all of them.
>>> This way we can see the different places the different MPI processes are stuck at.
>>>
>>> On Tue, Nov 28, 2023 at 9:57 PM Amneet Bhalla <mail2amn...@gmail.com> wrote:
>>>
>>>> I added that option but the code still gets stuck at the same call
>>>> MatZeroRows with 3 processors.
>>>>
>>>> On Tue, Nov 28, 2023 at 7:23 PM Amneet Bhalla <mail2amn...@gmail.com> wrote:
>>>>
>>>>> On Tue, Nov 28, 2023 at 6:42 PM Barry Smith <bsm...@petsc.dev> wrote:
>>>>>
>>>>>> for (int comp = 0; comp < 2; ++comp)
>>>>>> {
>>>>>>     .......
>>>>>>     for (Box<NDIM>::Iterator bc(bc_coef_box); bc; bc++)
>>>>>>     {
>>>>>>         ......
>>>>>>         if (IBTK::abs_equal_eps(b, 0.0))
>>>>>>         {
>>>>>>             const double diag_value = a;
>>>>>>             ierr = MatZeroRows(mat, 1, &u_dof_index, diag_value, NULL, NULL);
>>>>>>             IBTK_CHKERRQ(ierr);
>>>>>>         }
>>>>>>     }
>>>>>> }
>>>>>>
>>>>>> In general, this code will not work because each process calls
>>>>>> MatZeroRows a different number of times, so it cannot match up with all
>>>>>> the processes.
>>>>>>
>>>>>> If u_dof_index is always local to the current process, you can call
>>>>>> MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES,PETSC_TRUE) above the for loop
>>>>>> and the MatZeroRows will not synchronize across the MPI processes (since
>>>>>> it does not need to and you told it that).
>>>>>
>>>>> Yes, u_dof_index is going to be local and I put a check on it a few
>>>>> lines before calling MatZeroRows.
>>>>>
>>>>> Can MatSetOption() be called after the matrix has been assembled?
>>>>>
>>>>>> If the u_dof_index will not always be local, then you need, on each
>>>>>> process, to list all the u_dof_index for each process in an array and
>>>>>> then call MatZeroRows() once after the loop so it can exchange the needed
>>>>>> information with the other MPI processes to get the row indices to the
>>>>>> right place.
>>>>>>
>>>>>> Barry
>>>>>>
>>>>>> On Nov 28, 2023, at 6:44 PM, Amneet Bhalla <mail2amn...@gmail.com> wrote:
>>>>>>
>>>>>> Hi Folks,
>>>>>>
>>>>>> I am using MatZeroRows() to set Dirichlet boundary conditions. This
>>>>>> works fine for the serial run and the solver produces correct results
>>>>>> (verified through analytical solution). However, when I run the case in
>>>>>> parallel, the simulation gets stuck at MatZeroRows(). My understanding is
>>>>>> that this function needs to be called after the MatAssemblyBegin{End}()
>>>>>> has been called, and should be called by all processors. Here is that bit
>>>>>> of the code which calls MatZeroRows() after the matrix has been assembled
>>>>>>
>>>>>> https://github.com/IBAMR/IBAMR/blob/amneetb/acoustically-driven-flows/src/acoustic_streaming/AcousticStreamingPETScMatUtilities.cpp#L724-L801
>>>>>>
>>>>>> I ran the parallel code (on 3 processors) in the debugger
>>>>>> (-start_in_debugger).
>>>>>> Below is the call stack from the processor that gets stuck
>>>>>>
>>>>>> amneetb@APSB-MBP-16:~$ lldb -p 4307
>>>>>> (lldb) process attach --pid 4307
>>>>>> Process 4307 stopped
>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>>>>>     frame #0: 0x000000018a2d750c libsystem_kernel.dylib`__semwait_signal + 8
>>>>>> libsystem_kernel.dylib`:
>>>>>> -> 0x18a2d750c <+8>: b.lo 0x18a2d752c ; <+40>
>>>>>>    0x18a2d7510 <+12>: pacibsp
>>>>>>    0x18a2d7514 <+16>: stp x29, x30, [sp, #-0x10]!
>>>>>>    0x18a2d7518 <+20>: mov x29, sp
>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>>>>>> Executable module set to "/Users/amneetb/Softwares/IBAMR-Git/objs-dbg/tests/IBTK/fo_acoustic_streaming_solver_2d".
>>>>>> Architecture set to: arm64-apple-macosx-.
>>>>>> (lldb) cont
>>>>>> Process 4307 resuming
>>>>>> Process 4307 stopped
>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>>>>>     frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400
>>>>>> libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather:
>>>>>> -> 0x109d281b8 <+400>: ldr w9, [x24]
>>>>>>    0x109d281bc <+404>: cmp w8, w9
>>>>>>    0x109d281c0 <+408>: b.lt 0x109d281a0 ; <+376>
>>>>>>    0x109d281c4 <+412>: bl 0x109d28e64 ; MPID_Progress_test
>>>>>> Target 0: (fo_acoustic_streaming_solver_2d) stopped.
>>>>>> (lldb) bt
>>>>>> * thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
>>>>>>   * frame #0: 0x0000000109d281b8 libpmpi.12.dylib`MPIDI_POSIX_mpi_barrier_release_gather + 400
>>>>>>     frame #1: 0x0000000109d27d14 libpmpi.12.dylib`MPIDI_SHM_mpi_barrier + 224
>>>>>>     frame #2: 0x0000000109d27b60 libpmpi.12.dylib`MPIDI_Barrier_intra_composition_alpha + 44
>>>>>>     frame #3: 0x0000000109d0d490 libpmpi.12.dylib`MPIR_Barrier + 900
>>>>>>     frame #4: 0x000000010224d030 libmpi.12.dylib`MPI_Barrier + 684
>>>>>>     frame #5: 0x00000001045ea638 libpetsc.3.17.dylib`PetscCommDuplicate(comm_in=-2080374782, comm_out=0x000000010300bcb0, first_tag=0x000000010300bce4) at tagm.c:235:5
>>>>>>     frame #6: 0x00000001045f2910 libpetsc.3.17.dylib`PetscHeaderCreate_Private(h=0x000000010300bc70, classid=1211227, class_name="PetscSF", descr="Star Forest", mansec="PetscSF", comm=-2080374782, destroy=(libpetsc.3.17.dylib`PetscSFDestroy at sf.c:224), view=(libpetsc.3.17.dylib`PetscSFView at sf.c:841)) at inherit.c:62:3
>>>>>>     frame #7: 0x00000001049cf820 libpetsc.3.17.dylib`PetscSFCreate(comm=-2080374782, sf=0x000000016f911a50) at sf.c:62:3
>>>>>>     frame #8: 0x0000000104cd3024 libpetsc.3.17.dylib`MatZeroRowsMapLocal_Private(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, nr=0x000000016f911df8, olrows=0x000000016f911e00) at zerorows.c:36:5
>>>>>>     frame #9: 0x000000010504ea50 libpetsc.3.17.dylib`MatZeroRows_MPIAIJ(A=0x00000001170c1270, N=1, rows=0x000000016f912cb4, diag=1, x=0x0000000000000000, b=0x0000000000000000) at mpiaij.c:768:3