Let us know how it goes with MPICH


> On May 30, 2019, at 2:01 AM, Sanjay Govindjee <s...@berkeley.edu> wrote:
> 
> I put in calls to PetscMemoryGetCurrentUsage() around KSPSolve and my data 
> exchange routine.  The problem is clearly mostly in my data exchange routine.
> Attached are graphs of the change in memory for each call.  Many calls show 
> zero change, but at regular intervals the memory increases after the data 
> exchange; much less so with the KSPSolve calls (and then mostly on the first calls).
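> 
> For reference, the bracketing looks roughly like the following (a minimal C 
> sketch; the actual code is Fortran, and SolveAndReportRSS is just a 
> hypothetical helper name):
> 
>   #include <petscksp.h>
> 
>   /* Bracket a KSPSolve() with PetscMemoryGetCurrentUsage() and print the
>      change in resident set size; the same pattern wraps the data exchange. */
>   PetscErrorCode SolveAndReportRSS(KSP ksp, Vec b, Vec x)
>   {
>     PetscLogDouble before, after;
>     PetscErrorCode ierr;
> 
>     ierr = PetscMemoryGetCurrentUsage(&before);CHKERRQ(ierr);
>     ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);
>     ierr = PetscMemoryGetCurrentUsage(&after);CHKERRQ(ierr);
>     ierr = PetscPrintf(PETSC_COMM_SELF, "RSS change for this KSPSolve: %g bytes\n", after - before);CHKERRQ(ierr);
>     return 0;
>   }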
> 
> For the CG/JACOBI:     data_exchange_total = 21,311,488; kspsolve_total = 2,625,536
> For the GMRES/BJACOBI: data_exchange_total =  6,619,136; kspsolve_total = 54,403,072 (dominated by the initial calls)
> 
> I will try to switch up my MPI to see if anything changes; right now my 
> configure is with  --download-openmpi.
> I've also attached the data exchange routine in case there is something 
> obviously wrong.
> 
> NB: Graphs/Data are from just one run each.
> 
> -sanjay
> 
> On 5/29/19 10:17 PM, Smith, Barry F. wrote:
>>    This is indeed worrisome.
>> 
>>     Would it be possible to put PetscMemoryGetCurrentUsage() around each 
>> call to KSPSolve() and each call to your data exchange? See if at each step 
>> they increase?
>> 
>>    One thing to be aware of with "max resident set size" is that it 
>> measures the number of pages that have actually been set up, not the amount 
>> of memory allocated.  So if, for example, you allocate a very large array 
>> but don't read or write the memory in that array until later in the code, it 
>> won't appear in the "resident set size" until you touch that memory 
>> (because Unix doesn't set up pages until it needs to).
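>> 
>>    A small standalone C illustration of this effect (an assumed example, not 
>> taken from your code); note also that ru_maxrss is reported in bytes on 
>> macOS but in kilobytes on Linux:
>> 
>>   #include <stdio.h>
>>   #include <stdlib.h>
>>   #include <string.h>
>>   #include <sys/resource.h>
>> 
>>   static long maxrss(void)
>>   {
>>     struct rusage ru;
>>     getrusage(RUSAGE_SELF, &ru);
>>     return ru.ru_maxrss;                    /* bytes on macOS, kilobytes on Linux */
>>   }
>> 
>>   int main(void)
>>   {
>>     size_t n = (size_t)512 * 1024 * 1024;   /* allocate 512 MB ...           */
>>     char  *a = malloc(n);
>>     printf("after malloc: ru_maxrss = %ld\n", maxrss());  /* pages not set up yet */
>>     memset(a, 1, n);                        /* ... now touch every page      */
>>     printf("after memset: ru_maxrss = %ld\n", maxrss());  /* RSS jumps only here  */
>>     free(a);
>>     return 0;
>>   }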
>> 
>>    You should also try another MPI. Both OpenMPI and MPICH can be installed 
>> with brew, or you can use --download-mpich or --download-openmpi to see if 
>> the MPI implementation is making a difference.
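>> 
>>    For example, two separate builds that differ only in the MPI flag (all 
>> other options as in your current configure; the bracketed part is a 
>> placeholder):
>> 
>>   ./configure --download-mpich   [your usual options]
>>   ./configure --download-openmpi [your usual options]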
>> 
>>     For now I would focus on the PETSc-only solvers to eliminate one 
>> variable from the equation; once that is understood you can go back to the 
>> question of the memory management of the other solvers.
>> 
>>   Barry
>> 
>> 
>>> On May 29, 2019, at 11:51 PM, Sanjay Govindjee via petsc-users 
>>> <petsc-users@mcs.anl.gov> wrote:
>>> 
>>> I am trying to track down a memory issue with my code; apologies in advance 
>>> for the longish message.
>>> 
>>> I am solving an FEA problem with a number of load steps involving about 3000
>>> right-hand-side and tangent assemblies and solves.  The program is mainly 
>>> Fortran, with a C memory allocator.
>>> 
>>> When I run my code in strictly serial mode (no PETSc or MPI routines) the 
>>> memory stays constant over the whole run.
>>> 
>>> When I run it in parallel mode with PETSc solvers and num_processes=1, the 
>>> memory (max resident set size) also stays constant:
>>> 
>>> PetscMalloc = 28,976, ProgramNativeMalloc = constant, Resident Size = 
>>> 24,854,528 (constant) [CG/JACOBI]
>>> 
>>> [PetscMalloc and Resident Size as reported by PetscMallocGetCurrentUsage 
>>> and PetscMemoryGetCurrentUsage (and summed across processes as needed);
>>> ProgramNativeMalloc reported by program memory allocator.]
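>>> 
>>> (The per-process values are combined with a reduction; roughly the 
>>> following C sketch, where ReportGlobalMemory is a hypothetical helper name 
>>> and the actual bookkeeping lives in the Fortran code:)
>>> 
>>>   #include <petscsys.h>
>>> 
>>>   /* Sum the per-rank memory numbers onto rank 0 and print them. */
>>>   PetscErrorCode ReportGlobalMemory(void)
>>>   {
>>>     PetscLogDouble mall, rss, mall_sum, rss_sum;
>>>     PetscErrorCode ierr;
>>> 
>>>     ierr = PetscMallocGetCurrentUsage(&mall);CHKERRQ(ierr); /* bytes PetscMalloc'ed on this rank */
>>>     ierr = PetscMemoryGetCurrentUsage(&rss);CHKERRQ(ierr);  /* resident set size on this rank   */
>>>     ierr = MPI_Reduce(&mall, &mall_sum, 1, MPI_DOUBLE, MPI_SUM, 0, PETSC_COMM_WORLD);CHKERRQ(ierr);
>>>     ierr = MPI_Reduce(&rss,  &rss_sum,  1, MPI_DOUBLE, MPI_SUM, 0, PETSC_COMM_WORLD);CHKERRQ(ierr);
>>>     ierr = PetscPrintf(PETSC_COMM_WORLD, "PetscMalloc = %g  Resident Size = %g\n", mall_sum, rss_sum);CHKERRQ(ierr);
>>>     return 0;
>>>   }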
>>> 
>>> When I run it in parallel mode with PETSc solvers but num_processes=2, the 
>>> resident memory grows steadily during the run:
>>> 
>>> PetscMalloc = 3,039,072 (constant), ProgramNativeMalloc = constant, 
>>> Resident Size = (finish) 31,313,920 (start) 24,698,880 [CG/JACOBI]
>>> 
>>> When I run it in parallel mode with PETSc solvers but num_processes=4, the 
>>> resident memory grows steadily during the run:
>>> 
>>> PetscMalloc = 3,307,888 (constant), ProgramNativeMalloc = 1,427,584 
>>> (constant), Resident Size = (finish) 70,787,072  (start) 45,801,472 
>>> [CG/JACOBI]
>>> PetscMalloc = 5,903,808 (constant), ProgramNativeMalloc = 1,427,584 
>>> (constant), Resident Size = (finish) 112,410,624 (start) 52,076,544 
>>> [GMRES/BJACOBI]
>>> PetscMalloc = 3,188,944 (constant), ProgramNativeMalloc = 1,427,584 
>>> (constant), Resident Size = (finish) 712,798,208 (start) 381,480,960 
>>> [SUPERLU]
>>> PetscMalloc = 6,539,408 (constant), ProgramNativeMalloc = 1,427,584 
>>> (constant), Resident Size = (finish) 591,048,704 (start) 278,671,360 [MUMPS]
>>> 
>>> The memory growth feels alarming but maybe I do not understand the values 
>>> in ru_maxrss from getrusage().
>>> 
>>> My box (MacBook Pro) has a broken Valgrind so I need to get to a system 
>>> with a functional one; notwithstanding, the code has always been Valgrind 
>>> clean.
>>> There are no Fortran pointers or allocatable arrays in the part of 
>>> the code being used.  The program's C memory allocator keeps track of 
>>> itself, so I do not see that the problem is there.  The PETSc malloc is 
>>> also steady.
>>> 
>>> Other random hints:
>>> 
>>> 1) If I comment out the calls to KSPSolve and to my MPI data-exchange 
>>> routine (which passes solution values between processes after each solve 
>>> using MPI_Isend, MPI_Recv, and MPI_BARRIER; see the sketch after hint 3), 
>>> the memory growth essentially goes away.
>>> 
>>> 2) If I comment out the call to my MPI data-exchange routine but leave the 
>>> call to KSPSolve, the problem remains but is substantially reduced 
>>> for CG/JACOBI, and is marginally reduced for the GMRES/BJACOBI, SUPERLU, 
>>> and MUMPS runs.
>>> 
>>> 3) If I comment out the call to KSPSolve but leave the call to my MPI 
>>> data-exchange routine, the problem remains.
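>>> 
>>> (Aside on hint 1: one common source of steady growth in an MPI_Isend-based 
>>> exchange is never completing the send requests; each MPI_Isend allocates a 
>>> request that is only released by MPI_Wait/MPI_Waitall or MPI_Request_free. 
>>> Below is a minimal C sketch of the intended pattern, with hypothetical 
>>> names (nneigh, neigh, sendbuf, recvbuf, counts), purely for comparison 
>>> against psetb.F:)
>>> 
>>>   #include <mpi.h>
>>>   #include <stdlib.h>
>>> 
>>>   /* Post all sends, do the blocking receives, then COMPLETE the sends.
>>>      Omitting the MPI_Waitall leaks one MPI_Request per MPI_Isend per call. */
>>>   void exchange(int nneigh, const int *neigh, double **sendbuf, double **recvbuf,
>>>                 const int *counts, MPI_Comm comm)
>>>   {
>>>     MPI_Request *req = malloc(nneigh * sizeof(MPI_Request));
>>>     for (int i = 0; i < nneigh; i++)
>>>       MPI_Isend(sendbuf[i], counts[i], MPI_DOUBLE, neigh[i], 0, comm, &req[i]);
>>>     for (int i = 0; i < nneigh; i++)
>>>       MPI_Recv(recvbuf[i], counts[i], MPI_DOUBLE, neigh[i], 0, comm, MPI_STATUS_IGNORE);
>>>     MPI_Waitall(nneigh, req, MPI_STATUSES_IGNORE);  /* releases the send requests */
>>>     free(req);
>>>     /* MPI_Barrier(comm); only if the algorithm actually requires it */
>>>   }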
>>> 
>>> Any suggestions/hints of where to look will be great.
>>> 
>>> -sanjay
>>> 
>>> 
> 
> <cg.png><gmres.png><psetb.F>
