There is very little information in this stack trace. You would get much more information if you used a debug build of PETSc, i.e. one configured with --with-debugging=yes. It is recommended to always debug problems using a debug build of PETSc together with a debug build of your application.
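For example, you could reconfigure with the same options shown in your log, only with debugging switched on (this is just your configure line with --with-debugging=no replaced; adjust paths and the exact configure invocation to your installation):

  ./configure --with-clanguage=cxx --with-shared-libraries=1 \
      --download-fblaslapack=1 --with-mpi=1 --download-parmetis=1 \
      --download-metis=1 --with-netcdf=1 --download-exodusii=1 \
      --with-hdf5-dir=/glade/apps/opt/hdf5-mpi/1.8.12/intel/12.1.5 \
      --with-c2html=0 --with-64-bit-indices=1 --with-debugging=yes

Then rebuild PETSc and relink your application against the resulting debug arch before reproducing the crash.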
Thanks,
Dave

On 27 November 2015 at 20:05, Fande Kong <fdkong...@gmail.com> wrote:
> Hi all,
>
> I implemented a parallel IO based on the Vec and IS which uses HDF5. I am
> testing this loader on a supercomputer. I occasionally (not always)
> encounter the following errors (using 8192 cores):
>
> [7689]PETSC ERROR: ------------------------------------------------------------------------
> [7689]PETSC ERROR: Caught signal number 5 TRAP
> [7689]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [7689]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [7689]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [7689]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run
> [7689]PETSC ERROR: to get more information on the crash.
> [7689]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [7689]PETSC ERROR: Signal received
> [7689]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [7689]PETSC ERROR: Petsc Release Version 3.6.2, unknown
> [7689]PETSC ERROR: ./fsi on a arch-linux2-cxx-opt named ys6103 by fandek Fri Nov 27 11:26:30 2015
> [7689]PETSC ERROR: Configure options --with-clanguage=cxx --with-shared-libraries=1 --download-fblaslapack=1 --with-mpi=1 --download-parmetis=1 --download-metis=1 --with-netcdf=1 --download-exodusii=1 --with-hdf5-dir=/glade/apps/opt/hdf5-mpi/1.8.12/intel/12.1.5 --with-debugging=no --with-c2html=0 --with-64-bit-indices=1
> [7689]PETSC ERROR: #1 User provided function() line 0 in unknown file
> Abort(59) on node 7689 (rank 7689 in comm 1140850688): application called MPI_Abort(MPI_COMM_WORLD, 59) - process 7689
> ERROR: 0031-300 Forcing all remote tasks to exit due to exit code 1 in task 7689
>
> Make and configure logs are attached.
>
> Thanks,
>
> Fande Kong,
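In case it helps to narrow things down once you have the debug build: the stock PETSc path for reading a named Vec from HDF5 looks roughly like the sketch below. The file name and dataset name are placeholders, not taken from your code, and this is only a minimal comparison point, not your actual loader.

  /* Minimal sketch: read one named Vec from an HDF5 file in parallel.
     "restart.h5" and "solution" are placeholder names. */
  #include <petscvec.h>
  #include <petscviewerhdf5.h>

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;
    PetscViewer    viewer;
    Vec            x;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL); if (ierr) return ierr;

    /* Collective open of the HDF5 file for reading */
    ierr = PetscViewerHDF5Open(PETSC_COMM_WORLD, "restart.h5", FILE_MODE_READ, &viewer);CHKERRQ(ierr);

    /* The object's name selects which HDF5 dataset is read */
    ierr = VecCreate(PETSC_COMM_WORLD, &x);CHKERRQ(ierr);
    ierr = PetscObjectSetName((PetscObject)x, "solution");CHKERRQ(ierr);
    ierr = VecLoad(x, viewer);CHKERRQ(ierr);

    ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
    ierr = VecDestroy(&x);CHKERRQ(ierr);
    return PetscFinalize();
  }

If your loader diverges from this pattern, that would be the first place to look at with the debug build and, if the crash persists, under valgrind as the error message suggests.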