Difficult to tell what is going on. 

  The message "User provided function() line 0 in unknown file" indicates 
that the crash took place OUTSIDE of PETSc code, and the error message 
"INTERNAL Error: recvd root arrowhead" is definitely not coming from PETSc. 

   Yes, debug with a debug build of PETSc (configure using 
--with-debugging=yes) and also run under valgrind to look for memory 
corruption.
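
   As a rough sketch of the valgrind run, the small Python helper below only 
assembles and prints an mpiexec + valgrind command line; it does not run 
anything. The helper name valgrind_mpi_command, the rank count of 4, and the 
relative executable path are placeholders for illustration, not values taken 
from the failing run. The valgrind flags are standard memcheck options; adjust 
them to your installation.

```python
import shlex


def valgrind_mpi_command(executable, nprocs, program_args=()):
    """Return an mpiexec + valgrind memcheck command as a list of arguments.

    Hypothetical helper for illustration only; it builds the command
    and leaves running it to the caller.
    """
    return [
        "mpiexec", "-n", str(nprocs),
        "valgrind",
        "--tool=memcheck",
        "-q",                          # only report errors
        "--num-callers=20",            # deeper stack traces in the report
        "--log-file=valgrind.log.%p",  # %p is the pid, so one log per process
        executable,
        *program_args,
    ]


if __name__ == "__main__":
    # The rank count (4) and the relative path are assumptions.
    cmd = valgrind_mpi_command("./structural_example_5", 4)
    print(shlex.join(cmd))
```

   Run the printed command against the debug build, then read the 
valgrind.log.<pid> files, starting with the earliest invalid read or write 
reported on any rank.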


> On Apr 8, 2019, at 12:12 PM, Manav Bhatia via petsc-users 
> <petsc-users@mcs.anl.gov> wrote:
> Hi,
>     I am running a nonlinear simulation using mesh refinement on 
> libMesh. The code runs without issues on a Mac (can run for days without 
> issues), but crashes on Linux (CentOS 6). I am using PETSc version 3.11 on 
> Linux with openmpi 3.1.3 and gcc 8.2. 
>     I tried to use the -on_error_attach_debugger, but it only gave me this 
> message. Does this message imply something to the more experienced eyes? 
>     I am going to try to build a debug version of petsc to figure out what is 
> going wrong. I will get and share more detailed logs in a bit. 
> Regards,
> Manav
> ------------------------------------------------------------------------
> [8]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, 
> probably memory access out of range
> [8]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [8]PETSC ERROR: or see 
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [8]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to 
> find memory corruption errors
> [8]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and 
> run 
> [8]PETSC ERROR: to get more information on the crash.
> [8]PETSC ERROR: User provided function() line 0 in  unknown file  
> PETSC: Attaching gdb to 
> /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5
>  of pid 2108 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
> PETSC: Attaching gdb to 
> /cavs/projects/brg_codes/users/bhatia/mast/mast_topology/opt/examples/structural/example_5/structural_example_5
>  of pid 2112 on display localhost:10.0 on machine Warhawk1.HPC.MsState.Edu
>            0 :INTERNAL Error: recvd root arrowhead 
>            0 :not belonging to me. IARR,JARR=       67525       67525
>            0 :IROW_GRID,JCOL_GRID=           0           4
>            0 :MYROW, MYCOL=           0           0
>            0 :IPOSROOT,JPOSROOT=    92264688    92264688
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
> with errorcode -99.
> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
> You may or may not see output from other processes, depending on
> exactly when Open MPI kills them.
> --------------------------------------------------------------------------
