A couple of suggestions.

- Try building with gcc/gfortran. The compiler will likely flag issues (warnings) in the sources, and those might be the cause of some of the errors.
- Try using the PetscInt datatype across all sources (i.e. use the .F90 suffix and include the PETSc include files) to avoid any lingering mismatch (as a fix for some of the above warnings).
- After that, you might be able to simplify your makefile to be more portable [using a PETSc-formatted makefile].

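As a starting point for the PetscInt suggestion, something like the following could list the declarations worth reviewing. This is only a rough sketch: the function name `list_integer_decls` and the `SRC_DIR` variable are made up for illustration, and the match is a plain text search rather than a Fortran parser, so it will also catch lines such as `integer function ...` and will miss implicitly typed variables.

```shell
#!/bin/sh
# list_integer_decls DIR
# Print file:line:text for every line that starts with "integer"
# (any case) in the Fortran sources under DIR. Hypothetical helper:
# it only narrows down which declarations to review by hand before
# changing them to PetscInt.
list_integer_decls() {
    dir="${1:-.}"
    find "$dir" -type f \
        \( -name '*.F90' -o -name '*.f90' -o -name '*.F' -o -name '*.f' \) \
        -exec grep -nHiE '^[[:space:]]*integer' {} +
}

# Run only when SRC_DIR (an assumed variable name) is set.
if [ -n "${SRC_DIR:-}" ]; then
    list_integer_decls "$SRC_DIR"
fi
```

For example, `SRC_DIR=./src sh list_integer_decls.sh` would print one line per candidate declaration. Integers that never reach a PETSc call do not have to change, so the list is a superset of what needs editing.
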
Satish

On Sun, 2 Jun 2024, Matthew Knepley wrote:

> On Sun, Jun 2, 2024 at 10:27 AM Matthew Knepley <knep...@gmail.com> wrote:
>
> > On Sat, Jun 1, 2024 at 11:39 PM Carpenter, Mark H. (LARC-D302) via
> > petsc-users <petsc-users@mcs.anl.gov> wrote:
> >
> >> Mark Carpenter, NASA Langley.
> >>
> >> I am a novice PETSC user of about 10 years. I’ve built a DG-FEM code
> >> with petsc as one of the solver paths (I have my own as well).
> >> Furthermore, I use petsc for MPI communication.
> >>
> >> I’m running the DG-FEM code on our NAS supercomputer. Everything works
> >> when my integer sizes are small. When I exceed the 2^32 limit of integer
> >> arithmetic the code fails in very strange ways.
> >>
> >> The users that originally set up the petsc infrastructure in the code are
> >> no longer at NASA and I’m “dead in the water”.
> >>
>
> One additional point. I have looked at the error message. When you make
> PETSc calls, each call should be wrapped in PetscCall(). Here is a Fortran
> example:
>
>   https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/tutorials/ex22f.F90?ref_type=heads
>
> This checks the return value after each call and ends early if there is an
> error. It would make your error output much more readable.
>
>   Thanks,
>
>      Matt
>
> >> I think I’ve promoted all the integers that are problematic in my code
> >> (F95). On PETSC side: I’ve tried
> >>
> >> 1. Reinstall petsc with --with-64-bit-integers (no luck)
> >>
> > That option does not exist, so this will not work.
> >
> >> 2. Reinstall petsc with --with-64-bit-integers and
> >> --with-64-bit-indices (code will not compile with these options.
> >> Additional variables on F90 side require promotion and then the errors
> >> cascade through code when making PETSC calls.)
> >>
> > We should fix this. I feel confident we can get the code to compile.
> >
> >> 3. It’s possible that I’ve missed offending integers, but the petsc
> >> error messages are so cryptic that I can’t even tell where it is
> >> failing.
> >>
> >> Further complicating matters:
> >>
> >> The problem by definition needs to be HUGE. Problem sizes requiring 1000
> >> cores (10^6 elements at P5) are needed to experience the errors, which
> >> involves waiting in queues for ½ day at least.
> >>
> >> Attached are the
> >>
> >> 1. Install script used to install PETSC on our machine
> >> 2. The Makefile used on the fortran side
> >> 3. A data dump from an offending simulation (which is huge and I
> >> can’t see any useful information.)
> >>
> >> How do I attack this problem?
> >>
> >> (I’ve never gotten debugging working properly).
> >>
> >
> > Let's get the install for 64-bit indices to work. So we
> >
> > 1) Configure PETSc adding --with-64-bit-indices to the configure line. Does
> > this work? If not, send configure.log
> >
> > 2) Compile PETSc. Does this work? If not, send make.log
> >
> > 3) Compile your code. Does this work? If not, send all output.
> >
> > 4) Do one of the 1/2 day runs and let us know what happens. An alternative
> > is to run a small number of processes on a large memory workstation.
> > We do this to test at the lab.
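
Steps 1) to 3) above could look roughly like this in a shell. It is only a sketch under assumptions: `PETSC_DIR`, `PETSC_ARCH`, `APP_DIR` and `EXTRA_OPTS` are placeholders (the actual install script and Makefile are attachments not shown here), and the `run` helper is made up so the commands can be previewed before anything is executed. The configure flag is spelled `--with-64-bit-indices`.

```shell
#!/bin/sh
# Placeholders: adjust to the real install. EXTRA_OPTS stands for whatever
# options the existing install script already passes to configure.
PETSC_DIR="${PETSC_DIR:-/path/to/petsc}"
PETSC_ARCH="${PETSC_ARCH:-arch-64-bit-indices}"
APP_DIR="${APP_DIR:-/path/to/dgfem}"
EXTRA_OPTS="${EXTRA_OPTS:-}"

# Hypothetical helper: with DRY_RUN=1 (the default here) each command is
# printed instead of executed; set DRY_RUN=0 to really run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# 1) Configure with 64-bit indices. If this fails, send configure.log.
run cd "$PETSC_DIR" || exit 1
# EXTRA_OPTS is left unquoted on purpose so it splits into separate options.
run ./configure PETSC_ARCH="$PETSC_ARCH" --with-64-bit-indices $EXTRA_OPTS || exit 1

# 2) Build PETSc. If this fails, send make.log.
run make PETSC_DIR="$PETSC_DIR" PETSC_ARCH="$PETSC_ARCH" all || exit 1

# 3) Build the application and keep all output in build.log to send.
#    (The exit status of this pipeline is tee's, not make's.)
run cd "$APP_DIR" || exit 1
run sh -c 'make 2>&1 | tee build.log'
```

Run as is, the script only prints the commands; rerun with `DRY_RUN=0` once the paths and options look right. Step 4) stays a manual job submission.
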
> >
> > Thanks,
> >
> > Matt
> >
> >> Mark
> >>
> >
> > --
> > What most experimenters take for granted before they begin their
> > experiments is infinitely more interesting than any results to which their
> > experiments lead.
> > -- Norbert Wiener
> >
> > https://www.cse.buffalo.edu/~knepley/