Re: Memory Leaks with Trilinos
Hi Trevor,

My mystery deepens. Today, I've compiled and tried the following combinations of Trilinos/SWIG:

  12.8.1  / 3.0.2
  11.14.3 / 3.0.2
  11.12.1 / 3.0.2
  11.10.1 / 2.0.8
  11.2.5  / 2.0.8

All of these combinations leak memory at about the same rate. I am using Boost 1.54, OpenMPI 1.6.5, and GCC 4.8.4. Are you using similar versions?

Thanks,
-Mike

On 03/30/2016 05:06 PM, Keller, Trevor (Fed) wrote:

Mike & Jon,

Running 25 steps on Debian wheezy with PyTrilinos 11.10.2 and swig 2.0.8, monitoring with memory-profiler, I do not see a memory leak (see attachment): the simulation is fairly steady at 8 GB RAM. The same code with PyTrilinos 12.1.0 ramps up to 16 GB in the same simulation time. Later revisions of Trilinos 12 remain defiant (though compiling swig-3.0.8 has improved matters). If one of them compiles and works, I will let you know. If not, Mike, could you test the next incremental version (11.10.2), if possible?

Trevor

From: fipy-boun...@nist.gov on behalf of Michael Waters
Sent: Wednesday, March 30, 2016 4:36 PM
To: FIPY
Subject: Re: Memory Leaks with Trilinos

Hi Jon,

I just compiled an old version of swig (2.0.8) and compiled Trilinos (11.10.1) against that. Sadly, I am still having the leak. I am out of ideas for the day... and should be looking for a postdoc anyway.

Thanks,
-mike

On 3/30/16 3:32 PM, Guyer, Jonathan E. Dr. (Fed) wrote:

No worries. If building Trilinos doesn't blindside you with something unexpected and unpleasant, you're not doing it right.

I have a conda recipe at https://github.com/guyer/conda-recipes/tree/trilinos_upgrade_11_10_2/trilinos that has worked for me to build 11.10.2 on both OS X and Docker (Debian?). I haven't tried to adjust it to 12.x yet.

On Mar 30, 2016, at 2:42 PM, Michael Waters wrote:

Hi Jon,

I was just reviewing my version of Trilinos 11.10 and discovered that there is no way I compiled it last night after exercising: it has unsatisfied dependencies on my machine. So I must apologize; I must have been more tired than I thought. Sorry for the error!

-Mike Waters

On 3/30/16 11:52 AM, Guyer, Jonathan E. Dr. (Fed) wrote:

It looked to me like steps and accuracy were the way to do it, but my runs finish in one step, so I was confused. When I change to accuracy = 10.0**-6, it takes 15 steps, but still no leak (note: the hiccup in RSS and in ELAPSED time is because I put my laptop to sleep for a while, but VSIZE is rock-steady).

The fact that things never (or slowly) converge for you and Trevor, in addition to the leak, makes me wonder if Trilinos seriously broke something between 11.x and 12.x. Trevor's been struggling to build 12.4. I'll try to find time to do the same.

In case it matters, I'm running on OS X. What's your system?

- Jon

On Mar 29, 2016, at 3:59 PM, Michael Waters wrote:

When I did my testing and made those graphs, I ran Trilinos in serial; Syrupy didn't seem to track the other processes' memory. I watched in real time as the parallel version ate all my RAM, though.

To make the program run longer while not changing the memory use:

  steps = 100                             # increase this (limits the number of self-consistent iterations)
  accuracy = 10.0**-5                     # make this smaller (relative energy-eigenvalue change for being considered converged)
  initial_solver_iterations_per_step = 7  # reduce this to 1 (solver iterations per self-consistent iteration; too small and it's slow, too high and the solutions are not stable)

I did those tests on a machine with 128 GB of RAM, so I wasn't expecting any swapping.

Thanks,
-mike

On 3/29/16 3:38 PM, Guyer, Jonathan E. Dr. (Fed) wrote:

I guess I spoke too soon. FWIW, I'm running Trilinos version 11.10.2.

On Mar 29, 2016, at 3:34 PM, Guyer, Jonathan E. Dr. (Fed) wrote:

I'm not seeing a leak. The below is for Trilinos: VSIZE grows to about 11 MiB and saturates, and RSS saturates at around 5 MiB. VSIZE is more relevant for tracking leaks, as RSS is deeply tied to your system's swapping architecture and whatever else is running; either way, neither seems to be leaking, but this problem does use a lot of memory.

What do I need to do to get it to run longer?

On Mar 25, 2016, at 7:16 PM, Michael Waters wrote:

Hello,

I still have a large memory leak when using Trilinos. I am not sure where to start looking, so I made an example code that reproduces my problem, in hopes that someone can help me. But my example is cool: I implemented Density Functional Theory in FiPy! My code is slow, but it runs in parallel and is simple (relative to most DFT codes). The example I have attached is just a lithium and a hydrogen atom. The electrostatic boundary conditions are goofy, but they work well enough for demonstration purposes.

If you set use_trilinos to True, the code will slowly use more memory. If not, it will try to use Pysparse.

Thanks,
-Michael Waters
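For leak-hunting like the above, a per-step log of peak resident memory makes the trend unambiguous. Below is a minimal sketch using only the Python standard library (the stepping function is a hypothetical stand-in for one self-consistent iteration of the DFT example; Syrupy and memory-profiler gather comparable numbers from outside the process):

  import gc
  import resource

  def peak_rss_mb():
      # ru_maxrss is reported in kilobytes on Linux, but in bytes on OS X
      return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

  def do_one_scf_step():
      pass  # hypothetical stand-in for one self-consistent iteration

  steps = 100
  for step in range(steps):
      do_one_scf_step()
      gc.collect()  # rule out deferred garbage collection before measuring
      print("step %3d: peak RSS %.1f MB" % (step, peak_rss_mb()))

A true leak shows up as a peak that keeps climbing step after step; a well-behaved solver plateaus once the problem's working set has been allocated.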
RE: Memory Leakage & Object Build-up with FiPy Sweeps
Dear Jonathan, Daniel,

Thank you for your responses. Just yesterday, I discovered and solved this problem (another remains). It wasn't a result of the calls to .sweep. The time-varying boundary condition for one of the PDEs was being re-defined within the time-stepping loop using the PDE.faceGrad.constrain() method. This created new objects with every timestep, irrespective of how often the garbage collector was called, and that in turn slowed the simulation down. The solution was instead to update the value of the boundary condition with the .setValue() method within the time-stepping loop, as below.

  # Outside the loop, declare a FaceVariable for the value of the BC:
  species_flux_neg_particle_surf = FaceVariable(mesh=p2d_mesh)

  # Next, apply that value to the BC at the top of that mesh:
  Cs_p2d.faceGrad.constrain(species_flux_neg_particle_surf, where=p2d_mesh.facesTop)

  # Within the time-stepping loop, update the boundary condition using .setValue:
  species_flux_neg_particle_surf.setValue(my_new_BC_value)

  # Enjoy not creating new objects

I haven't yet had time to produce a new vprof memory-consumption plot for comparison. However, it's clear from Pympler's SummaryTracker().print_diff() output that this change to the way the BC is updated solved the memory leak. For comparison, here are the number of objects and the CPU time per timestep, plotted against three seconds of simulation time, first using the .faceGrad.constrain() method and then using the .setValue() method for the update. The now-stable object count illustrates the fixed leak.

faceGrad.constrain, leaking: https://goo.gl/3LqSm7
setValue, memory leak fixed: https://goo.gl/6kQMjH

There is a new issue, described below, which also slows the simulation to an unusable level. With the memory leak solved, I was able to run the simulation well beyond three seconds, and discovered that the number of sweeps required per timestep begins to grow exponentially after around 120 s of simulation time. This in turn drives up the CPU time required per timestep. The plot at the following link illustrates the problem:

https://goo.gl/G9DD5r

I do not know why this happens. A memory leak is clearly no longer the cause - the number of objects is essentially constant (varying only slightly between garbage-collector cycles). Plotting the residuals returned by the .sweep() function for each of the six PDEs, at 1 s and at 150 s into the simulation, provides some insight into the stability of convergence. Each subplot in the figure is for one of the six PDEs:

https://goo.gl/Nnm7Si

At 150 s, the residuals still decrease with sweeping towards the tolerance (1e-4), but at a much slower rate. Do you know why this might be happening?

With best regards,

- Ian
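For anyone reproducing the object accounting described above, a minimal sketch of the Pympler check (the loop body is a hypothetical stand-in for sweeping the six coupled PDEs):

  from pympler.tracker import SummaryTracker

  def sweep_all_equations():
      pass  # hypothetical stand-in for sweeping the six PDEs to tolerance

  tracker = SummaryTracker()  # baseline snapshot of tracked objects
  for timestep in range(10):
      sweep_all_equations()
      tracker.print_diff()  # table of objects created/destroyed since the last call

With the constrain-in-the-loop version, the diff reports roughly the same ~3.2 MB of new objects every timestep; after the .setValue() fix, the diff settles to approximately zero.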
-----Original Message-----
From: fipy-boun...@nist.gov [mailto:fipy-boun...@nist.gov] On Behalf Of Guyer, Jonathan E. Dr. (Fed)
Sent: 11 October 2016 16:37
To: FIPY
Subject: Re: Memory Leakage & Object Build-up with FiPy Sweeps

I have access to their code. Ian, please provide an explicit recipe for demonstrating the leak with the code in your github repo.

- Jon

> On Oct 11, 2016, at 11:15 AM, Daniel Wheeler wrote:
>
> Hi Ian,
>
> Could you possibly post your code, or a version of the code that demonstrates the problem? Also, do you have the same issue with different solver suites?
>
> Cheers,
>
> Daniel
>
> On Fri, Sep 30, 2016 at 12:41 PM, Campbell, Ian wrote:
> Hi All,
>
> We are sweeping six PDEs in a time-stepping loop. We've noticed that as CPU time progresses, the duration of each time-step increases, although the sweep count remains constant. This is illustrated in the Excel file of data logged from the simulation, which is available at the first hyperlink below.
>
> Hence, we suspected a memory leak. After conducting memory-focused line-profiling with the vprof tool, we observed a linear increase in total memory consumption of approximately 3 MB per timestep loop. This is evident in the graph at the second link below, which illustrates the memory increase over three seconds of simulation.
>
> As a further step, we used Pympler to investigate the source of the RAM-consumption increase at each timestep. The table below is an output from Pympler's SummaryTracker().print_diff(), which describes the additional objects created within every time-step. Clearly, ~3.2 MB of additional data is being generated with every loop; this correlates perfectly with the total rate of increase of memory consumption reported by vprof. Although we are not yet sure, we suspect that the increasing time spent per loop is the result of this apparent memory leak.
>
> We suspect this is the result of the calls to .sweep, since we are not explicitly creating these objects. Can the origin of these objects be traced, and furthermore, is there a way to avoid re-creating them and consuming more memory with every loop? Without some method of unloading or preventing this object build-up, it isn't feasible to run our simulation for long durations.
Re: Memory Leakage & Object Build-up with FiPy Sweeps
Hi all,

Could Ian's issue be related to the issues I came across at the end of March? Trevor Keller seems to have isolated my memory leak to versions of Trilinos newer than 12.0.

Trevor, it seems that I never got around to testing an older version of Trilinos; I'll do that now.

-Mike

On 10/11/2016 10:36 AM, Guyer, Jonathan E. Dr. (Fed) wrote:
> I have access to their code. Ian, please provide an explicit recipe for demonstrating the leak with the code in your github repo.
>
> - Jon
>
>> On Oct 11, 2016, at 11:15 AM, Daniel Wheeler wrote:
>>
>> Hi Ian,
>>
>> Could you possibly post your code, or a version of the code that demonstrates the problem? Also, do you have the same issue with different solver suites?
>>
>> Cheers,
>>
>> Daniel
>>
>> On Fri, Sep 30, 2016 at 12:41 PM, Campbell, Ian wrote:
>> Hi All,
>>
>> We are sweeping six PDEs in a time-stepping loop. We've noticed that as CPU time progresses, the duration of each time-step increases, although the sweep count remains constant. This is illustrated in the Excel file of data logged from the simulation, which is available at the first hyperlink below.
>>
>> Hence, we suspected a memory leak. After conducting memory-focused line-profiling with the vprof tool, we observed a linear increase in total memory consumption of approximately 3 MB per timestep loop. This is evident in the graph at the second link below, which illustrates the memory increase over three seconds of simulation.
>>
>> As a further step, we used Pympler to investigate the source of the RAM-consumption increase at each timestep. The table below is an output from Pympler's SummaryTracker().print_diff(), which describes the additional objects created within every time-step. Clearly, ~3.2 MB of additional data is being generated with every loop; this correlates perfectly with the total rate of increase of memory consumption reported by vprof. Although we are not yet sure, we suspect that the increasing time spent per loop is the result of this apparent memory leak.
>>
>> We suspect this is the result of the calls to .sweep, since we are not explicitly creating these objects. Can the origin of these objects be traced, and furthermore, is there a way to avoid re-creating them and consuming more memory with every loop? Without some method of unloading or preventing this object build-up, it isn't feasible to run our simulation for long durations.
>>
>> types                         | # objects | total size
>> ============================= | ========= | ==========
>> dict                          |      2684 |  927.95 KB
>> type                          |      1716 |  757.45 KB
>> tuple                         |      9504 |  351.31 KB
>> list                          |      4781 |  227.09 KB
>> str                           |      2582 |  210.70 KB
>> numpy.ndarray                 |       396 |  146.78 KB
>> cell                          |      3916 |  107.08 KB
>> property                      |      2288 |   98.31 KB
>> weakref                       |      2287 |   98.27 KB
>> function (getName)            |      1144 |   67.03 KB
>> function (getRank)            |      1144 |   67.03 KB
>> function (_calcValue_)        |      1144 |   67.03 KB
>> function (__init__)           |      1144 |   67.03 KB
>> function (_getRepresentation) |      1012 |   59.30 KB
>> function (__setitem__)        |       572 |   33.52 KB
>> SUM                           |           | 3285.88 KB
>>
>> https://imperialcollegelondon.box.com/s/zp9jj67du3mxdcfgbc4el8cqpxwnv0y4
>>
>> https://imperialcollegelondon.box.com/s/ict9tnswqk9z57ovx8r3ll5po5ccrib9
>>
>> With best regards,
>>
>> - Ian & Krishna
>>
>> P.S. Daniel, thank you very much for the excellent example solution you provided in response to our question on obtaining the sharp discontinuity.
>>
>> Ian Campbell | PhD Candidate
>> Electrochemical Science & Engineering Group
>> Imperial College London, SW7 2AZ, United Kingdom
>>
>> --
>> Daniel Wheeler
Re: Memory Leakage & Object Build-up with FiPy Sweeps
I have access to their code. Ian, please provide an explicit recipe for demonstrating the leak with the code in your github repo.

- Jon

> On Oct 11, 2016, at 11:15 AM, Daniel Wheeler wrote:
>
> Hi Ian,
>
> Could you possibly post your code, or a version of the code that demonstrates the problem? Also, do you have the same issue with different solver suites?
>
> Cheers,
>
> Daniel
>
> On Fri, Sep 30, 2016 at 12:41 PM, Campbell, Ian wrote:
> Hi All,
>
> We are sweeping six PDEs in a time-stepping loop. We've noticed that as CPU time progresses, the duration of each time-step increases, although the sweep count remains constant. This is illustrated in the Excel file of data logged from the simulation, which is available at the first hyperlink below.
>
> Hence, we suspected a memory leak. After conducting memory-focused line-profiling with the vprof tool, we observed a linear increase in total memory consumption of approximately 3 MB per timestep loop. This is evident in the graph at the second link below, which illustrates the memory increase over three seconds of simulation.
>
> As a further step, we used Pympler to investigate the source of the RAM-consumption increase at each timestep. The table below is an output from Pympler's SummaryTracker().print_diff(), which describes the additional objects created within every time-step. Clearly, ~3.2 MB of additional data is being generated with every loop; this correlates perfectly with the total rate of increase of memory consumption reported by vprof. Although we are not yet sure, we suspect that the increasing time spent per loop is the result of this apparent memory leak.
>
> We suspect this is the result of the calls to .sweep, since we are not explicitly creating these objects. Can the origin of these objects be traced, and furthermore, is there a way to avoid re-creating them and consuming more memory with every loop? Without some method of unloading or preventing this object build-up, it isn't feasible to run our simulation for long durations.
>
> types                         | # objects | total size
> ============================= | ========= | ==========
> dict                          |      2684 |  927.95 KB
> type                          |      1716 |  757.45 KB
> tuple                         |      9504 |  351.31 KB
> list                          |      4781 |  227.09 KB
> str                           |      2582 |  210.70 KB
> numpy.ndarray                 |       396 |  146.78 KB
> cell                          |      3916 |  107.08 KB
> property                      |      2288 |   98.31 KB
> weakref                       |      2287 |   98.27 KB
> function (getName)            |      1144 |   67.03 KB
> function (getRank)            |      1144 |   67.03 KB
> function (_calcValue_)        |      1144 |   67.03 KB
> function (__init__)           |      1144 |   67.03 KB
> function (_getRepresentation) |      1012 |   59.30 KB
> function (__setitem__)        |       572 |   33.52 KB
> SUM                           |           | 3285.88 KB
>
> https://imperialcollegelondon.box.com/s/zp9jj67du3mxdcfgbc4el8cqpxwnv0y4
>
> https://imperialcollegelondon.box.com/s/ict9tnswqk9z57ovx8r3ll5po5ccrib9
>
> With best regards,
>
> - Ian & Krishna
>
> P.S. Daniel, thank you very much for the excellent example solution you provided in response to our question on obtaining the sharp discontinuity.
> Ian Campbell | PhD Candidate
> Electrochemical Science & Engineering Group
> Imperial College London, SW7 2AZ, United Kingdom
>
> --
> Daniel Wheeler
Re: Memory Leakage & Object Build-up with FiPy Sweeps
Hi Ian,

Could you possibly post your code, or a version of the code that demonstrates the problem? Also, do you have the same issue with different solver suites?

Cheers,

Daniel

On Fri, Sep 30, 2016 at 12:41 PM, Campbell, Ian wrote:
> Hi All,
>
> We are sweeping six PDEs in a time-stepping loop. We've noticed that as CPU time progresses, the duration of each time-step increases, although the sweep count remains constant. This is illustrated in the Excel file of data logged from the simulation, which is available at the first hyperlink below.
>
> Hence, we suspected a memory leak. After conducting memory-focused line-profiling with the vprof tool, we observed a linear increase in total memory consumption of approximately 3 MB per timestep loop. This is evident in the graph at the second link below, which illustrates the memory increase over three seconds of simulation.
>
> As a further step, we used Pympler to investigate the source of the RAM-consumption increase at each timestep. The table below is an output from Pympler's SummaryTracker().print_diff(), which describes the additional objects created within every time-step. Clearly, ~3.2 MB of additional data is being generated with every loop; this correlates perfectly with the total rate of increase of memory consumption reported by vprof. Although we are not yet sure, we suspect that the increasing time spent per loop is the result of this apparent memory leak.
>
> We suspect this is the result of the calls to .sweep, since we are not explicitly creating these objects. Can the origin of these objects be traced, and furthermore, is there a way to avoid re-creating them and consuming more memory with every loop? Without some method of unloading or preventing this object build-up, it isn't feasible to run our simulation for long durations.
>
> types                         | # objects | total size
> ============================= | ========= | ==========
> dict                          |      2684 |  927.95 KB
> type                          |      1716 |  757.45 KB
> tuple                         |      9504 |  351.31 KB
> list                          |      4781 |  227.09 KB
> str                           |      2582 |  210.70 KB
> numpy.ndarray                 |       396 |  146.78 KB
> cell                          |      3916 |  107.08 KB
> property                      |      2288 |   98.31 KB
> weakref                       |      2287 |   98.27 KB
> function (getName)            |      1144 |   67.03 KB
> function (getRank)            |      1144 |   67.03 KB
> function (_calcValue_)        |      1144 |   67.03 KB
> function (__init__)           |      1144 |   67.03 KB
> function (_getRepresentation) |      1012 |   59.30 KB
> function (__setitem__)        |       572 |   33.52 KB
> SUM                           |           | 3285.88 KB
>
> https://imperialcollegelondon.box.com/s/zp9jj67du3mxdcfgbc4el8cqpxwnv0y4
>
> https://imperialcollegelondon.box.com/s/ict9tnswqk9z57ovx8r3ll5po5ccrib9
>
> With best regards,
>
> - Ian & Krishna
>
> P.S. Daniel, thank you very much for the excellent example solution you provided in response to our question on obtaining the sharp discontinuity.
>
> Ian Campbell | PhD Candidate
> Electrochemical Science & Engineering Group
> Imperial College London, SW7 2AZ, United Kingdom

--
Daniel Wheeler
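For readers unfamiliar with sweeping in FiPy, here is a minimal single-equation sketch of the per-timestep pattern discussed in this thread (the real simulation couples six PDEs, but the structure is the same; the mesh, coefficients, and step counts are illustrative only):

  from fipy import CellVariable, DiffusionTerm, Grid1D, TransientTerm

  mesh = Grid1D(nx=100, dx=0.01)
  phi = CellVariable(mesh=mesh, value=0.0, hasOld=True)
  phi.setValue(1.0, where=mesh.cellCenters[0] > 0.5)  # step profile, so the sweeps have work to do
  eq = TransientTerm() == DiffusionTerm(coeff=1.0)

  dt = 1e-3
  tolerance = 1e-4  # residual tolerance, as in the discussion above

  for step in range(10):
      phi.updateOld()
      residual = 1e100
      sweeps = 0
      while residual > tolerance:
          # sweep() rebuilds and solves the linearized system once and returns its residual
          residual = eq.sweep(var=phi, dt=dt)
          sweeps += 1
      print("step %d converged in %d sweeps" % (step, sweeps))

A healthy run converges in a roughly constant number of sweeps per step; Ian's later observation that the sweeps per timestep grow with simulation time points to slowing convergence of the nonlinear iteration rather than to a memory problem.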