Sanjay & Barry,
Sorry, I made a mistake when I said I could reproduce Sanjay's
experiments. I found that 1) to correctly use
PetscMallocGetCurrentUsage() when PETSc is configured without
debugging, I have to run the program with -malloc; and 2) I have to
instrument the code outside of KSPSolve(). In my case, that is in
SNESSolve_NEWTONLS. In the old experiments I did it inside KSPSolve;
since KSPSolve can call KSPSolve recursively, the old results were
misleading.
With these fixes, I measured the differences in RSS and PETSc malloc
before/after KSPSolve. I ran the experiments on my MacBook
using src/ts/examples/tutorials/advection-diffusion-reaction/ex5.c
with commands like mpirun -n 4 ./ex5 -da_grid_x 64 -da_grid_y 64
-ts_type beuler -ts_max_steps 500 -malloc.
I find that if the grid size is small, I see non-zero RSS deltas at
random, either with one MPI rank or with multiple ranks, and with
either MPICH or OpenMPI. If I increase the grid size, e.g.,
-da_grid_x 256 -da_grid_y 256, I only see non-zero RSS deltas at
random in the first few iterations (again with MPICH or OpenMPI).
When the computer's workload is high because ex5-openmpi and
ex5-mpich are running simultaneously, the MPICH run shows many more
non-zero RSS deltas. But the "Malloc Delta" behavior is stable across
all runs: there is only one nonzero malloc delta, in the first
KSPSolve call, and all the remaining ones are zero. Something like this:
mpirun -n 4 ./ex5-mpich -da_grid_x 256 -da_grid_y 256 -ts_type
beuler -ts_max_steps 500 -malloc
RSS Delta= 32489472, Malloc Delta= 26290304, RSS End= 136114176
RSS Delta= 32768, Malloc Delta= 0, RSS End= 138510336
RSS Delta= 0, Malloc Delta= 0, RSS End= 138522624
RSS Delta= 0, Malloc Delta= 0, RSS End= 138539008
So I think I can conclude that KSPSolve() leaves behind no unfreed
memory allocated by PETSc. Has MPICH allocated unfreed memory in
KSPSolve? That is possible, and I am trying to find a way, like
PetscMallocGetCurrentUsage(), to measure that. Also, I think the RSS
delta is not a good way to measure memory allocation: it is dynamic
and depends on the state of the computer (swap, which shared
libraries are loaded, etc.) when the code runs. We should focus on
malloc instead. A valgrind-like tool that, in the manner of
performance profiling tools, let users measure memory allocated but
not freed in a user-specified code segment would be very helpful
here, but I have not found one.
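One crude thing I may try on Linux (it is glibc-specific, so not on
my MacBook) is mallinfo(), which counts heap bytes in use regardless
of which library allocated them; a sketch:

  #include <malloc.h> /* glibc-specific */
  #include <stddef.h>

  /* Heap bytes currently in use by anyone (PETSc, MPICH, ...).
     uordblks = in-use bytes in the main arena; hblkhd = mmap'd
     blocks. The struct fields are ints, so this overflows past
     2 GB, but it should suffice to see whether the MPI library
     grows the heap across a KSPSolve. */
  size_t HeapBytesInUse(void)
  {
    struct mallinfo mi = mallinfo();
    return (size_t)mi.uordblks + (size_t)mi.hblkhd;
  }

Calling that before and after KSPSolve would capture MPICH's
allocations as well, not just PETSc's.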
Sanjay, did you say that currently you can run with OpenMPI without
running out of memory, but with MPICH you run out of memory? Is it
feasible to share your code so that I can test with it? Thanks.
--Junchao Zhang
On Sat, Jun 1, 2019 at 3:21 AM Sanjay Govindjee <s...@berkeley.edu> wrote:
Barry,
If you look at the graphs I generated (on my Mac), you will see that
OpenMPI and MPICH have very different values (along with the fact
that MPICH does not seem to adhere to the standard for releasing
MPI_Isend resources following an MPI_Wait).
-sanjay
PS: I agree with Barry's assessment; this is really not acceptable.
On 6/1/19 1:00 AM, Smith, Barry F. wrote:
> Junchao,
>
> This is insane. Either the OpenMPI library or something in the OS
> underneath it, related to sockets and interprocess communication,
> is grabbing additional space for each round of MPI communication!
> Does MPICH have the same values as OpenMPI, or different ones?
> When you run on Linux, do you get the same values as on Apple, or
> different ones? Same values would indicate the issue is inside
> OpenMPI/MPICH; different values would indicate the problem is more
> likely at the OS level. Does this happen only with the default
> VecScatter that uses blocking MPI? What happens with PetscSF under
> Vec? Is it somehow related to PETSc's use of nonblocking sends and
> receives? I don't think we can just shrug and say this is the way
> it is; we need to track down and understand the cause (and fix it
> if possible). One could presumably use valgrind to see exactly
> which lines in which code are causing these increases.
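> For example (a sketch; massif is valgrind's heap profiler and
> writes one output file per process, named by the pid):
>
> mpirun -n 4 valgrind --tool=massif ./ex5 -da_grid_x 128
> -da_grid_y 128 -ts_type beuler -ts_max_steps 500
> ms_print massif.out.<pid>
>
> That records heap snapshots over time together with the allocating
> call stacks.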
>
> Barry
>
>
>> On May 31, 2019, at 2:53 PM, Zhang, Junchao <jczh...@mcs.anl.gov> wrote:
>>
>> Sanjay,
>> I tried PETSc with MPICH and OpenMPI on my MacBook. I inserted
>> PetscMemoryGetCurrentUsage/PetscMallocGetCurrentUsage at the
>> beginning and end of KSPSolve, computed the deltas, and summed
>> them over processes. Then I tested with
>> src/ts/examples/tutorials/advection-diffusion-reaction/ex5.c
>> With OpenMPI,
>> mpirun -n 4 ./ex5 -da_grid_x 128 -da_grid_y 128 -ts_type beuler
>> -ts_max_steps 500 > 128.log
>> grep -n -v "RSS Delta= 0, Malloc Delta= 0" 128.log
>> 1:RSS Delta= 69632, Malloc Delta= 0
>> 2:RSS Delta= 69632, Malloc Delta= 0
>> 3:RSS Delta= 69632, Malloc Delta= 0
>> 4:RSS Delta= 69632, Malloc Delta= 0
>> 9:RSS Delta=9.25286e+06, Malloc Delta= 0
>> 22:RSS Delta= 49152, Malloc Delta= 0
>> 44:RSS Delta= 20480, Malloc Delta= 0
>> 53:RSS Delta= 49152, Malloc Delta= 0
>> 66:RSS Delta= 4096, Malloc Delta= 0
>> 97:RSS Delta= 16384, Malloc Delta= 0
>> 119:RSS Delta= 20480, Malloc Delta= 0
>> 141:RSS Delta= 53248, Malloc Delta= 0
>> 176:RSS Delta= 16384, Malloc Delta= 0
>> 308:RSS Delta= 16384, Malloc Delta= 0
>> 352:RSS Delta= 16384, Malloc Delta= 0
>> 550:RSS Delta= 16384, Malloc Delta= 0
>> 572:RSS Delta= 16384, Malloc Delta= 0
>> 669:RSS Delta= 40960, Malloc Delta= 0
>> 924:RSS Delta= 32768, Malloc Delta= 0
>> 1694:RSS Delta= 20480, Malloc Delta= 0
>> 2099:RSS Delta= 16384, Malloc Delta= 0
>> 2244:RSS Delta= 20480, Malloc Delta= 0
>> 3001:RSS Delta= 16384, Malloc Delta= 0
>> 5883:RSS Delta= 16384, Malloc Delta= 0
>>
>> If I increase the grid size:
>> mpirun -n 4 ./ex5 -da_grid_x 512 -da_grid_y 512 -ts_type beuler
>> -ts_max_steps 500 -malloc_test > 512.log
>> grep -n -v "RSS Delta= 0, Malloc Delta= 0" 512.log
>> 1:RSS Delta=1.05267e+06, Malloc Delta= 0
>> 2:RSS Delta=1.05267e+06, Malloc Delta= 0
>> 3:RSS Delta=1.05267e+06, Malloc Delta= 0
>> 4:RSS Delta=1.05267e+06, Malloc Delta= 0
>> 13:RSS Delta=1.24932e+08, Malloc Delta= 0
>>
>> So we did see RSS increases in 4K-page-sized chunks after
>> KSPSolve. As long as there are no memory leaks, why do you care
>> about it? Is it because you run out of memory?
>>
>> On Thu, May 30, 2019 at 1:59 PM Smith, Barry F. <bsm...@mcs.anl.gov> wrote:
>>
>> Thanks for the update. So the current conclusions are that using
>> the Waitall in your code
>>
>> 1) solves the memory issue with OpenMPI in your code
>>
>> 2) does not solve the memory issue with PETSc KSPSolve
>>
>> 3) MPICH has memory issues both for your code and for PETSc
>> KSPSolve (despite the Waitall fix)?
>>
>> If you literally just comment out the call to KSPSolve() with
>> OpenMPI, is there no growth in memory usage?
>>
>>
>> Both 2 and 3 are concerning; they indicate possible memory-leak
>> bugs in MPICH and/or KSPSolve() not freeing all of its MPI
>> resources.
>>
>> Junchao, can you please investigate 2 and 3 with, for example, a
>> TS example that uses the linear solver (like with -ts_type
>> beuler)? Thanks
>>
>>
>> Barry
>>
>>
>>
>>> On May 30, 2019, at 1:47 PM, Sanjay Govindjee <s...@berkeley.edu> wrote:
>>>
>>> Lawrence,
>>> Thanks for taking a look! This is what I had been wondering
>>> about -- my knowledge of MPI is pretty minimal, and the origins
>>> of the routine are with a programmer we hired a decade+ back from
>>> NERSC. I'll have to look into VecScatter. It will be great to
>>> dispense with our roll-your-own routines (we even have our own
>>> reduceALL scattered around the code).
>>>
>>> Interestingly, the MPI_Waitall has solved the problem when using
>>> OpenMPI, but it still persists with MPICH. Graphs attached.
>>> I'm going to run with OpenMPI for now (but I guess I really
>>> still need to figure out what is wrong with MPICH and Waitall;
>>> I'll try Barry's suggestion of
>>> --download-mpich-configure-arguments="--enable-error-messages=all
>>> --enable-g" later today and report back).
>>>
>>> Regarding MPI_Barrier, it was put in due to a problem where some
>>> processes were finishing their sends and receives and exiting the
>>> subroutine before the receiving processes had completed (which
>>> resulted in data loss, as the buffers are freed after the call to
>>> the routine). MPI_Barrier was the solution proposed to us. I
>>> don't think I can dispense with it, but I will think about it
>>> some more.
>>>
>>> I'm not so sure about using MPI_Irecv, as it will require a bit
>>> of rewriting: right now I process the received data sequentially
>>> after each blocking MPI_Recv -- clearly slower, but easier to
>>> code.
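>>>
>>> (If I did rewrite it, I suppose the pattern would be something
>>> like the following sketch -- in C for brevity, with made-up
>>> names: post all the receives up front, then use MPI_Waitany to
>>> handle each message as it completes, which keeps the per-message
>>> processing.)
>>>
>>> #include <mpi.h>
>>> #include <stdlib.h>
>>>
>>> /* stand-in for whatever is done with each received message */
>>> void process_message(int source, const double *buf, int n);
>>>
>>> void exchange_recv(int nrecv, double **recvbuf, const int *counts,
>>>                    const int *sources, int tag, MPI_Comm comm)
>>> {
>>>   MPI_Request *reqs = (MPI_Request *)malloc(nrecv*sizeof(MPI_Request));
>>>   int j, idx;
>>>   for (j = 0; j < nrecv; j++)   /* post all receives up front */
>>>     MPI_Irecv(recvbuf[j], counts[j], MPI_DOUBLE, sources[j], tag,
>>>               comm, &reqs[j]);
>>>   for (j = 0; j < nrecv; j++) { /* handle messages in completion order */
>>>     MPI_Waitany(nrecv, reqs, &idx, MPI_STATUS_IGNORE);
>>>     process_message(sources[idx], recvbuf[idx], counts[idx]);
>>>   }
>>>   free(reqs);
>>> }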
>>>
>>> Thanks again for the help.
>>>
>>> -sanjay
>>>
>>> On 5/30/19 4:48 AM, Lawrence Mitchell wrote:
>>>> Hi Sanjay,
>>>>
>>>>> On 30 May 2019, at 08:58, Sanjay Govindjee via petsc-users
>>>>> <petsc-users@mcs.anl.gov> wrote:
>>>>>
>>>>> The problem seems to persist, but with a different signature.
>>>>> Graphs attached as before.
>>>>>
>>>>> Totals with MPICH (NB: single run)
>>>>>
>>>>> For the CG/Jacobi: data_exchange_total = 41,385,984;
>>>>> kspsolve_total = 38,289,408
>>>>> For the GMRES/BJACOBI: data_exchange_total = 41,324,544;
>>>>> kspsolve_total = 41,324,544
>>>>>
>>>>> Just reading the MPI docs, I am wondering if I need some sort
>>>>> of MPI_Wait/MPI_Waitall before my MPI_Barrier in the data
>>>>> exchange routine? I would have thought that with the blocking
>>>>> receives and the MPI_Barrier everything would have fully
>>>>> completed and been cleaned up before all processes exited the
>>>>> routine, but perhaps I am wrong about that.
>>>> Skimming the Fortran code you sent, you do:
>>>>
>>>> do i = 1, nsend
>>>>    call MPI_Isend(..., req, ierr)
>>>> end do
>>>>
>>>> do i = 1, nrecv
>>>>    call MPI_Recv(..., ierr)
>>>> end do
>>>>
>>>> But you never call MPI_Wait on the requests you got back from
>>>> the Isends, so the MPI library will never free the data
>>>> structures it created.
>>>>
>>>> The usual pattern for these non-blocking communications is to
>>>> allocate an array of requests of length nsend+nrecv and then do:
>>>>
>>>> do i = 1, nsend
>>>>    call MPI_Isend(..., req(i), ierr)
>>>> end do
>>>> do j = 1, nrecv
>>>>    call MPI_Irecv(..., req(nsend+j), ierr)
>>>> end do
>>>>
>>>> call MPI_Waitall(nsend+nrecv, req, MPI_STATUSES_IGNORE, ierr)
>>>>
>>>> I note also that there's no need for the Barrier at the end of
>>>> the routine: this kind of communication does neighbourwise
>>>> synchronisation, so there is no need to add (unnecessary) global
>>>> synchronisation too.
>>>>
>>>> As an aside, is there a reason you don't use PETSc's VecScatter
>>>> to manage this global-to-local exchange?
>>>>
>>>> Cheers,
>>>>
>>>> Lawrence
>>>
[Attachments: cg_mpichwall.png, cg_wall.png, gmres_mpichwall.png, gmres_wall.png]