With Chris's example, I was able to reproduce the "MPI_ERR_BUFFER: invalid buffer pointer" error on one machine. I am looking into it.
Thanks. --Junchao Zhang On Tue, Jul 22, 2025 at 9:51 AM Zongze Yang <yangzon...@gmail.com> wrote: > Hi, > I encountered a similar issue with Firedrake when using the -log_view option > with XML format on macOS. Below is the error message. The Firedrake code > and the shell script used to run it are attached. > > ``` > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: General MPI error > > [0]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer > > [0]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!bitkMAVSBHfkO71IthJuocmtSkJAoWdXju0W8ra3pkNhAQ0ULGK2V2SDIVEXJTW6DekHmZoJorx9h8YpGr1EJj_T7kfT$ > > <https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!eiv8Wo1VhQz4c2L8MbDoPcg0KZ0loiWlwjI1MR6VEtFfLWTjZNV4UssfSUT-F9tKXb2GjX8Ar-YrWmBGIAY9ujQp$> > for trouble shooting. > > [0]PETSC ERROR: PETSc Release Version 3.23.4, unknown > > [0]PETSC ERROR: test.py with 2 MPI process(es) and PETSC_ARCH > arch-firedrake-default on 192.168.10.51 by zzyang Tue Jul 22 22:24:05 2025 > > [0]PETSC ERROR: Configure options: PETSC_ARCH=arch-firedrake-default > --COPTFLAGS="-O3 -march=native -mtune=native" --CXXOPTFLAGS="-O3 > -march=native -mtune=native" --FOPTFLAGS="-O3 -mtune=native" > --with-c2html=0 --with-debugging=0 --with-fortran-bindings=0 > --with-shared-libraries=1 --with-strict-petscerrorcode --download-cmake > --download-bison --download-fftw --download-mumps-avoid-mpi-in-place > --with-hdf5-dir=/opt/homebrew --with-hwloc-dir=/opt/homebrew > --download-metis --download-mumps --download-netcdf --download-pnetcdf > --download-ptscotch --download-scalapack --download-suitesparse > --download-superlu_dist --download-slepc --with-zlib --download-hpddm > --download-libpng --download-ctetgen --download-tetgen --download-triangle > --download-mmg --download-parmmg --download-p4est --download-eigen > --download-hypre 
--download-pragmatic > > [0]PETSC ERROR: #1 PetscLogNestedTreePrintLine() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:289 > > [0]PETSC ERROR: #2 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:383 > > [0]PETSC ERROR: #3 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #4 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #5 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #6 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #7 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #8 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #9 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #10 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #11 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #12 PetscLogNestedTreePrint() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [0]PETSC ERROR: #13 PetscLogNestedTreePrintTop() at > 
/Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:420 > > [0]PETSC ERROR: #14 PetscLogHandlerView_Nested_XML() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/xmlviewer.c:443 > > [0]PETSC ERROR: #15 PetscLogHandlerView_Nested() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/impls/nested/lognested.c:405 > > [0]PETSC ERROR: #16 PetscLogHandlerView() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/handler/interface/loghandler.c:342 > > [0]PETSC ERROR: #17 PetscLogView() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:2043 > > [0]PETSC ERROR: #18 PetscLogViewFromOptions() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/logging/plog.c:2084 > > [0]PETSC ERROR: #19 PetscFinalize() at > /Users/zzyang/opt/firedrake/firedrake-pip/petsc/src/sys/objects/pinit.c:1552 > > PetscFinalize() failed [error code: 98] > > -------------------------------------------------------------------------- > > prterun has exited due to process rank 0 with PID 28986 on node > 192.168.10.51 exiting > > improperly. There are three reasons this could occur: > > > 1. this process did not call "init" before exiting, but others in the > > job did. This can cause a job to hang indefinitely while it waits for > > all processes to call "init". By rule, if one process calls "init", > > then ALL processes must call "init" prior to termination. > > > 2. this process called "init", but exited without calling "finalize". > > By rule, all processes that call "init" MUST call "finalize" prior to > > exiting or it will be considered an "abnormal termination" > > > 3. this process called "MPI_Abort" or "prte_abort" and the mca > > parameter prte_create_session_dirs is set to false. In this case, the > > run-time cannot detect that the abort call was an abnormal > > termination. 
Hence, the only error message you will receive is this > > one. > > > This may have caused other processes in the application to be > > terminated by signals sent by prterun (as reported here). > > > You can avoid this message by specifying -quiet on the prterun command > > line. > > -------------------------------------------------------------------------- > ``` > > Best wishes, > Zongze > > *From: *petsc-users <petsc-users-boun...@mcs.anl.gov> on behalf of Klaij, > Christiaan via petsc-users <petsc-users@mcs.anl.gov> > *Date: *Monday, July 14, 2025 at 15:58 > *To: *Barry Smith <bsm...@petsc.dev> > *Cc: *PETSc users list <petsc-users@mcs.anl.gov> > *Subject: *Re: [petsc-users] problem with nested logging, standalone > example > > @Junchao: yes, all with my ex2f.F90 variation on two or three cores > > @Barry: it's really puzzling that you cannot reproduce. Can you try > running it a dozen times in a row? And look at the report_performance.xml > file? When it hangs I see some nan's, for instance here in the VecAXPY > event: > > <events> > <event> > <name>VecAXPY</name> > <time> > <avgvalue>0.00610203</avgvalue> > <minvalue>0.</minvalue> > <maxvalue>0.0122041</maxvalue> > <minloc>1</minloc> > <maxloc>0</maxloc> > </time> > <ncalls> > <avgvalue>0.5</avgvalue> > <minvalue>0.</minvalue> > <maxvalue>1.</maxvalue> > <minloc>1</minloc> > <maxloc>0</maxloc> > </ncalls> > </event> > <event> > <name>self</name> > <time> > <value>-nan.</value> > </time> > > This is what I did in my latest attempt on the login node of our Rocky > Linux 9 cluster: > 1) download petsc-3.23.4.tar.gz from the petsc website > 2) ./configure -prefix=~/petsc/install --with-cxx=0 --with-debugging=0 > --with-mpi-dir=/cm/shared/apps/mpich/ge/gcc/64/3.4.2 > 3) adjust my example to this version of petsc (file is attached) > 4) make ex2f-cklaij-dbg-v2 > 5) mpirun -n 2 ./ex2f-cklaij-dbg-v2 > > So the exact versions are: petsc-3.23.4, system mpich 3.4.2, system gcc > 11.5.0 > > 
________________________________________ > From: Barry Smith <bsm...@petsc.dev> > Sent: Friday, July 11, 2025 11:22 PM > To: Klaij, Christiaan > Cc: Junchao Zhang; PETSc users list > Subject: Re: [petsc-users] problem with nested logging, standalone example > > > And yet we cannot reproduce. > > Please tell us the exact PETSc version and MPI implementation versions. > And reattach your reproducing example. And exactly how you run it. > > > Can you reproduce it on an "ordinary" machine, say a Mac or Linux > laptop? > > Barry > > If I could reproduce the problem, here is how I would debug it. I would use > -start_in_debugger and then put breakpoints in places that seem > problematic. Presumably I would end up with a hang with each MPI process in > a "different place", and from that I may be able to determine how that > happened. > > > > > On Jul 11, 2025, at 7:58 AM, Klaij, Christiaan <c.kl...@marin.nl> wrote: > > > > In summary for future reference: > > - tested 3 different machines, two at Marin, one at the national HPC > > - tested 3 different MPI implementations (intelmpi, openmpi and mpich) > > - tested openmpi in both release and debug > > - tested 2 different compilers (intel and gnu), both older and very > recent versions > > - tested with the most basic config (./configure --with-cxx=0 > --with-debugging=0 --download-mpich) > > > > All of these tests either segfault, hang, or error out at the call to > PetscLogView. > > > > Chris > > > > ________________________________________ > > From: Klaij, Christiaan <c.kl...@marin.nl> > > Sent: Friday, July 11, 2025 10:10 AM > > To: Barry Smith; Junchao Zhang > > Cc: PETSc users list > > Subject: Re: [petsc-users] problem with nested logging, standalone > example > > > > @Matt: no MPI errors indeed. I've tried with MPICH and I get the same > hang. > > @Barry: the two stack traces aren't exactly the same, see a sample with > MPICH below. 
> > > > If it cannot be reproduced at your side, I'm afraid this is another dead > end. Thanks anyway, I really appreciate all your help. > > > > Chris > > > > (gdb) bt > > #0 0x000015555033bc2e in > MPIDI_POSIX_mpi_release_gather_gather.constprop.0 () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #1 0x000015555033db8a in MPIDI_POSIX_mpi_allreduce_release_gather () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #2 0x000015555033e70f in MPIR_Allreduce () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #3 0x000015555033f22e in PMPI_Allreduce () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #4 0x0000155553f85d69 in MPIU_Allreduce_Count (comm=-2080374782, > > op=1476395020, dtype=1275072547, count=1, outbuf=0x7fffffffac70, > > inbuf=0x7fffffffac60) > > at /home/cklaij/petsc/petsc-3.23.4/src/sys/objects/pinit.c:1839 > > #5 MPIU_Allreduce_Private (inbuf=inbuf@entry=0x7fffffffac60, > > outbuf=outbuf@entry=0x7fffffffac70, count=count@entry=1, > > dtype=dtype@entry=1275072547, op=op@entry=1476395020, > comm=-2080374782) > > at /home/cklaij/petsc/petsc-3.23.4/src/sys/objects/pinit.c:1869 > > #6 0x0000155553f33dbe in PetscPrintXMLNestedLinePerfResults ( > > viewer=viewer@entry=0x458890, name=name@entry=0x155554ef6a0d > 'mbps\000', > > value=<optimized out>, minthreshold=minthreshold@entry=0, > > maxthreshold=maxthreshold@entry=0.01, > > minmaxtreshold=minmaxtreshold@entry=1.05) > > at > /home/cklaij/petsc/petsc-3.23.4/src/sys/logging/handler/impls/nested/xmlviewer.c:255 > > > > > > (gdb) bt > > #0 0x000015554fed3b17 in clock_gettime@GLIBC_2.2.5 () from > /lib64/libc.so.6 > > #1 0x0000155550b0de71 in ofi_gettime_ns () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #2 0x0000155550b0dec9 in ofi_gettime_ms () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #3 0x0000155550b2fab5 in sock_cq_sreadfrom () > > from 
/cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #4 0x00001555505ca6f7 in MPIDI_OFI_progress () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #5 0x0000155550591fe9 in progress_test () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #6 0x00001555505924a3 in MPID_Progress_wait () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #7 0x000015555043463e in MPIR_Wait_state () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #8 0x000015555052ec49 in MPIC_Wait () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #9 0x000015555053093e in MPIC_Sendrecv () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #10 0x00001555504bf674 in MPIR_Allreduce_intra_recursive_doubling () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > #11 0x00001555505b61de in MPIDI_OFI_mpi_finalize_hook () > > from /cm/shared/apps/mpich/ge/gcc/64/3.4.2/lib/libmpi.so.12 > > > > ________________________________________ > > From: Barry Smith <bsm...@petsc.dev> > > Sent: Thursday, July 10, 2025 11:10 PM > > To: Junchao Zhang > > Cc: Klaij, Christiaan; PETSc users list > > Subject: Re: [petsc-users] problem with nested logging, standalone > example > > > > > > I cannot reproduce > > > > On Jul 10, 2025, at 3:46 PM, Junchao Zhang <junchao.zh...@gmail.com> > wrote: > > > > Adding -mca coll_hcoll_enable 0 didn't change anything at my end. > Strange. > > > > --Junchao Zhang > > > > > > On Thu, Jul 10, 2025 at 3:39 AM Klaij, Christiaan <c.kl...@marin.nl > <mailto:c.kl...@marin.nl>> wrote: > > An additional clue perhaps: with the option > OMPI_MCA_coll_hcoll_enable=0, the code does not hang but gives the error > below. 
> > > > Chris > > > > > > $ mpirun -mca coll_hcoll_enable 0 -n 2 ./ex2f-cklaij-dbg -pc_type jacobi > -ksp_monitor_short -ksp_gmres_cgs_refinement_type refine_always > > 0 KSP Residual norm 1.11803 > > 1 KSP Residual norm 0.591608 > > 2 KSP Residual norm 0.316228 > > 3 KSP Residual norm < 1.e-11 > > 0 KSP Residual norm 0.707107 > > 1 KSP Residual norm 0.408248 > > 2 KSP Residual norm < 1.e-11 > > Norm of error < 1.e-12 iterations 3 > > [1]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [1]PETSC ERROR: General MPI error > > [1]PETSC ERROR: MPI error 1 MPI_ERR_BUFFER: invalid buffer pointer > > [1]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dcT9AzbxDJMLIie0NhYIw4YU2TObPM3WHhzR-HlzrpfbjPd6sgsPX009yFy1lw_eLLu2WprNwYRABMK43J9p4SM$ > < > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!cbfMf1uAUCQ_T756UiU6Vd_NZkAvFLYRqJzL47P2JiAVi_2KCG5Q1u2oHseUcGLNAIW5qWtWbWHMIk_YNR8bJjkYxsN9$> > for trouble shooting. 
> > [1]PETSC ERROR: Petsc Release Version 3.22.4, Mar 01, 2025 > > [1]PETSC ERROR: ./ex2f-cklaij-dbg with 2 MPI process(es) and PETSC_ARCH > on login1 by cklaij Thu Jul 10 10:33:33 2025 > > [1]PETSC ERROR: Configure options: > --prefix=/home/cklaij/ReFRESCO/trunk/install/extLibs > --with-mpi-dir=/cm/shared/apps/openmpi/gcc/5.0.6-debug --with-x=0 > --with-mpe=0 --with-debugging=0 --download-superlu_dist= > https://urldefense.us/v3/__https://updates.marin.nl/refresco/libs/superlu_dist-8.1.2.tar.gz__;!!G_uCfscf7eWS!dcT9AzbxDJMLIie0NhYIw4YU2TObPM3WHhzR-HlzrpfbjPd6sgsPX009yFy1lw_eLLu2WprNwYRABMK4VVy6P4U$ > < > https://urldefense.us/v3/__https://updates.marin.nl/refresco/libs/superlu_dist-8.1.2.tar.gz__;!!G_uCfscf7eWS!cbfMf1uAUCQ_T756UiU6Vd_NZkAvFLYRqJzL47P2JiAVi_2KCG5Q1u2oHseUcGLNAIW5qWtWbWHMIk_YNR8bJkouVHb2$> > --with-blaslapack-dir=/cm/shared/apps/oneapi/2024.2.1/mkl/2024.2 > --download-parmetis= > https://urldefense.us/v3/__https://updates.marin.nl/refresco/libs/parmetis-4.0.3-p9.tar.gz__;!!G_uCfscf7eWS!dcT9AzbxDJMLIie0NhYIw4YU2TObPM3WHhzR-HlzrpfbjPd6sgsPX009yFy1lw_eLLu2WprNwYRABMK4-9b1K84$ > < > https://urldefense.us/v3/__https://updates.marin.nl/refresco/libs/parmetis-4.0.3-p9.tar.gz__;!!G_uCfscf7eWS!cbfMf1uAUCQ_T756UiU6Vd_NZkAvFLYRqJzL47P2JiAVi_2KCG5Q1u2oHseUcGLNAIW5qWtWbWHMIk_YNR8bJrjo6-SP$> > --download-metis= > https://urldefense.us/v3/__https://updates.marin.nl/refresco/libs/metis-5.1.0-p11.tar.gz__;!!G_uCfscf7eWS!dcT9AzbxDJMLIie0NhYIw4YU2TObPM3WHhzR-HlzrpfbjPd6sgsPX009yFy1lw_eLLu2WprNwYRABMK4Y9uaqiQ$ > < > https://urldefense.us/v3/__https://updates.marin.nl/refresco/libs/metis-5.1.0-p11.tar.gz__;!!G_uCfscf7eWS!cbfMf1uAUCQ_T756UiU6Vd_NZkAvFLYRqJzL47P2JiAVi_2KCG5Q1u2oHseUcGLNAIW5qWtWbWHMIk_YNR8bJhCc9MRE$> > --with-packages-build-dir=/home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild > --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall > -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall > -funroll-all-loops -O3 -DNDEBUG " 
COPTFLAGS="-std=gnu11 -Wall > -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall > -funroll-all-loops -O3 -DNDEBUG " FCFLAGS="-Wall -funroll-all-loops > -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime > -Wno-unused-function -O3 -DNDEBUG" F90FLAGS="-Wall -funroll-all-loops > -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime > -Wno-unused-function -O3 -DNDEBUG" FOPTFLAGS="-Wall -funroll-all-loops > -ffree-line-length-0 -Wno-maybe-uninitialized -Wno-target-lifetime > -Wno-unused-function -O3 -DNDEBUG" > > [1]PETSC ERROR: #1 PetscLogNestedTreePrintLine() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:289 > > [1]PETSC ERROR: #2 PetscLogNestedTreePrint() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:377 > > [1]PETSC ERROR: #3 PetscLogNestedTreePrint() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:384 > > [1]PETSC ERROR: #4 PetscLogNestedTreePrintTop() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:420 > > [1]PETSC ERROR: #5 PetscLogHandlerView_Nested_XML() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/xmlviewer.c:443 > > [1]PETSC ERROR: #6 PetscLogHandlerView_Nested() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/impls/nested/lognested.c:405 > > [1]PETSC ERROR: #7 PetscLogHandlerView() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/handler/interface/loghandler.c:342 > > [1]PETSC ERROR: #8 PetscLogView() at > /home/cklaij/ReFRESCO/trunk/build-extlibs/superbuild/petsc/src/src/sys/logging/plog.c:2040 > > [1]PETSC ERROR: #9 ex2f-cklaij-dbg.F90:301 > > > 
-------------------------------------------------------------------------- > > MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_SELF > > Proc: [[55228,1],1] > > Errorcode: 98 > > > > NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. > > You may or may not see output from other processes, depending on > > exactly when Open MPI kills them. > > > -------------------------------------------------------------------------- > > > -------------------------------------------------------------------------- > > prterun has exited due to process rank 1 with PID 0 on node login1 > calling > > "abort". This may have caused other processes in the application to be > > terminated by signals sent by prterun (as reported here). > > > -------------------------------------------------------------------------- > > > > ________________________________________ > > dr. ir. Christiaan Klaij | senior researcher > > Research & Development | CFD Development > > T +31 317 49 33 44 | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!dcT9AzbxDJMLIie0NhYIw4YU2TObPM3WHhzR-HlzrpfbjPd6sgsPX009yFy1lw_eLLu2WprNwYRABMK4BUEn1h8$ > > > > 
> > > From: Klaij, Christiaan <c.kl...@marin.nl> > > Sent: Thursday, July 10, 2025 10:15 AM > > To: Junchao Zhang > > Cc: PETSc users list > > Subject: Re: [petsc-users] problem with nested logging, standalone > example > > > > Hi Junchao, > > > > Thanks for testing. I've fixed the error, but unfortunately that doesn't > change the behavior: the code still hangs as before, with the same stack > trace... > > > > Chris > > > > ________________________________________ > > From: Junchao Zhang <junchao.zh...@gmail.com> > > Sent: Tuesday, July 8, 2025 10:58 PM > > To: Klaij, Christiaan > > Cc: PETSc users list > > Subject: Re: [petsc-users] problem with nested logging, standalone > example > > > > Hi, Chris, > > First, I had to fix an error in your test by adding "PetscCallA(MatSetFromOptions(AA,ierr))" at line 254. > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > > [0]PETSC ERROR: Object is in wrong state > > [0]PETSC ERROR: Mat object's type is not set: Argument # 1 > > ... 
> > [0]PETSC ERROR: #1 MatSetValues() at > /scratch/jczhang/petsc/src/mat/interface/matrix.c:1503 > > [0]PETSC ERROR: #2 ex2f.F90:258 > > > > Then I could run the test without problems > > mpirun -n 2 ./ex2f -pc_type jacobi -ksp_monitor_short > -ksp_gmres_cgs_refinement_type refine_always > > 0 KSP Residual norm 1.11803 > > 1 KSP Residual norm 0.591608 > > 2 KSP Residual norm 0.316228 > > 3 KSP Residual norm < 1.e-11 > > 0 KSP Residual norm 0.707107 > > 1 KSP Residual norm 0.408248 > > 2 KSP Residual norm < 1.e-11 > > Norm of error < 1.e-12 iterations 3 > > > > I used petsc-3.22.4, gcc-11.3, openmpi-5.0.6 and configured with > > ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran > --download-openmpi --with-ssl=0 --with-shared-libraries=1 > CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" > CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG " > COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" > CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG " > FCFLAGS="-Wall -funroll-all-loops -ffree-line-length-0 > -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 > -DNDEBUG" F90FLAGS="-Wall -funroll-all-loops -ffree-line-length-0 > -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 > -DNDEBUG" FOPTFLAGS="-Wall -funroll-all-loops -ffree-line-length-0 > -Wno-maybe-uninitialized -Wno-target-lifetime -Wno-unused-function -O3 > -DNDEBUG" > > > > Could you fix the error and retry? > > > > --Junchao Zhang > > > > > > On Sun, Jul 6, 2025 at 12:57 PM Klaij, Christiaan via petsc-users <petsc-users@mcs.anl.gov> wrote: > > Attached is a standalone example of the issue described in the > > earlier thread "problem with nested logging". The issue appeared > > somewhere between petsc 3.19.4 and 3.23.4. 
> > > > The example is a variation of ../ksp/tutorials/ex2f.F90, where > > I've added the nested log viewer with one event as well as the > > solution of a small system on rank zero. > > > > When running on multiple procs the example hangs during > > PetscLogView with the backtrace below. The configure.log is also > > attached in the hope that you can replicate the issue. > > > > Chris > > > > > > #0 0x000015554c84ea9e in mca_pml_ucx_recv (buf=0x7fffffff9e30, count=1, > > datatype=0x15554c9ef900 <ompi_mpi_2dblprec>, src=1, tag=-12, > > comm=0x7f1e30, mpi_status=0x0) at pml_ucx.c:700 > > #1 0x000015554c65baff in > ompi_coll_base_allreduce_intra_recursivedoubling ( > > sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1, > > dtype=0x15554c9ef900 <ompi_mpi_2dblprec>, > > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630) > > at base/coll_base_allreduce.c:247 > > #2 0x000015554c6a7e40 in ompi_coll_tuned_allreduce_intra_do_this ( > > sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1, > > dtype=0x15554c9ef900 <ompi_mpi_2dblprec>, > > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630, > > algorithm=3, faninout=0, segsize=0) at > coll_tuned_allreduce_decision.c:142 > > #3 0x000015554c6a054f in ompi_coll_tuned_allreduce_intra_dec_fixed ( > > sbuf=0x7fffffff9e20, rbuf=0x7fffffff9e30, count=1, > > dtype=0x15554c9ef900 <ompi_mpi_2dblprec>, > > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaec630) > > at coll_tuned_decision_fixed.c:216 > > #4 0x000015554c68e160 in mca_coll_hcoll_allreduce (sbuf=0x7fffffff9e20, > > rbuf=0x7fffffff9e30, count=1, dtype=0x15554c9ef900 <ompi_mpi_2dblprec>, > > op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30, module=0xaecb80) > > at coll_hcoll_ops.c:217 > > #5 0x000015554c59811a in PMPI_Allreduce (sendbuf=0x7fffffff9e20, > > recvbuf=0x7fffffff9e30, count=1, datatype=0x15554c9ef900 > <ompi_mpi_2dblprec>, op=0x15554ca28980 <ompi_mpi_op_maxloc>, comm=0x7f1e30) > at allreduce.c:123 > > #6 
0x0000155553eabede in MPIU_Allreduce_Private () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #7 0x0000155553e50d08 in PetscPrintXMLNestedLinePerfResults () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #8 0x0000155553e5123e in PetscLogNestedTreePrintLine () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #9 0x0000155553e51f3a in PetscLogNestedTreePrint () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #10 0x0000155553e51e96 in PetscLogNestedTreePrint () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #11 0x0000155553e51e96 in PetscLogNestedTreePrint () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #12 0x0000155553e52142 in PetscLogNestedTreePrintTop () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #13 0x0000155553e5257b in PetscLogHandlerView_Nested_XML () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #14 0x0000155553e4e5a0 in PetscLogHandlerView_Nested () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #15 0x0000155553e56232 in PetscLogHandlerView () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #16 0x0000155553e588c3 in PetscLogView () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #17 0x0000155553e40eb5 in petsclogview_ () from > /home/cklaij/ReFRESCO/trunk/install/extLibs/lib/libpetsc.so.3.22 > > #18 0x0000000000402c8b in MAIN__ () > > #19 0x00000000004023df in main ()