Caleb,

Thanks for doing all of this investigation and providing something easy to reproduce.

With the help of valgrind, I believe I've tracked it down to a bug in PyCXX, the Python/C++ interface tool matplotlib uses.

I have attached a patch that seems to remove the leak for me, but as I'm not a PyCXX expert, I'm not comfortable committing it to the repository just yet. I'm hoping you and/or some other developers could test it on your systems (a fully clean rebuild is required) and report any problems back. I also plan to raise this question on the PyCXX mailing list to get any thoughts they may have.

Cheers,
Mike

On 11/19/2010 04:14 PM, Caleb Constantine wrote:
On Thu, Nov 18, 2010 at 4:50 PM, Benjamin Root <ben.r...@ou.edu> wrote:
Caleb,

Interesting analysis.  One possible source of a leak would be some sort of 
dangling reference that still hangs around even though the plot objects have 
been cleared.  By the time of the matplotlib 1.0.0 release, we did seem to 
clear out pretty much all of these, but it is possible there are still some 
lurking about.  We should probably run your script against the latest svn to 
see how the results compare.

Another possibility might be related to numpy.  However this is the draw 
statement, so I don't know how much numpy is used in there. The latest refactor 
work in numpy has revealed some memory leaks that have existed, so who knows?

Might be interesting to try making equivalent versions of this script using 
different backends, and different package versions to possibly isolate the 
source of the memory leak.

Thanks for your observations,
Ben Root

Sorry for the double post; it seems the first is not displaying
correctly on SourceForge.

I conducted a couple more experiments taking into consideration suggestions
made in responses to my original post (thanks for the response).

First, I ran my original test (as close to it as possible anyway) using the
Agg back end for 3 hours, plotting 16591 times (about 1.5Hz). Memory usage
increased by 86MB. That's about 5.3KB per redraw, very similar to my original
experiment. As suggested, I called gc.collect() after each iteration. It
returned 67 for every iteration (no increase), although len(gc.garbage)
reported 0 each iteration.
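
A note on reading those two numbers: gc.collect() returns how many
unreachable objects it found on that pass, while gc.garbage only holds
objects the collector found but could not free. A steady nonzero return
together with an empty gc.garbage suggests that reference cycles are being
created each iteration and then cleaned up, so the growth is probably not
coming from Python-level cycles. A minimal sketch of that behaviour follows.
The Node class and make_cycle helper are made up for illustration, and the
snippet targets a current Python 3 rather than the 2.x used in the scripts
in this message.

```python
import gc


class Node(object):
    """Hypothetical object that can point at a partner object."""

    def __init__(self):
        self.partner = None


def make_cycle():
    # Two objects that refer to each other: reference counting alone
    # cannot free them once the local names go away.
    a, b = Node(), Node()
    a.partner, b.partner = b, a


gc.disable()                  # keep automatic collection out of the way
gc.collect()                  # start from a clean state
garbage_before = len(gc.garbage)

make_cycle()
found = gc.collect()          # number of unreachable objects found
leftover = len(gc.garbage) - garbage_before

print("collect: {0}, garbage: {1}".format(found, leftover))
gc.enable()
```

Here collect is nonzero because the cycle was found, and garbage stays at
zero because the collector was able to free it.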

Second, I ran a test targeting TkAgg for 3 hours, plotting 21374 times. Memory
usage fluctuated over time, but essentially did not increase: starting at
32.54MB and ending at 32.79MB. gc.collect() reported 0 after each iteration
as did len(gc.garbage).
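
As a quick check of the arithmetic behind the per-redraw figures, using only
the numbers quoted above (the variable names are illustrative, and the
snippet is written for Python 3):

```python
KB_PER_MB = 1024.0
BYTES_PER_MB = 1024.0 * 1024.0
DURATION_S = 3 * 60 * 60          # both tests ran for 3 hours

# Agg experiment: 86MB growth over 16591 redraws.
agg_growth_mb = 86.0
agg_redraws = 16591

# TkAgg experiment: 32.54MB at start, 32.79MB at end, 21374 redraws.
tkagg_growth_mb = 32.79 - 32.54
tkagg_redraws = 21374

agg_kb_per_redraw = agg_growth_mb * KB_PER_MB / agg_redraws
agg_rate_hz = agg_redraws / float(DURATION_S)
tkagg_bytes_per_redraw = tkagg_growth_mb * BYTES_PER_MB / tkagg_redraws

print("Agg:   {0:.2f} KB per redraw at {1:.2f} Hz".format(
    agg_kb_per_redraw, agg_rate_hz))
print("TkAgg: {0:.1f} bytes per redraw".format(tkagg_bytes_per_redraw))
```

The Agg growth works out to several kilobytes per redraw, while the TkAgg
growth is on the order of bytes per redraw, which is consistent with the
fluctuation described above rather than a steady leak.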

Attached are images of plots showing change in memory usage over time for each
experiment.

Any comments would be appreciated.

Following is the code for each experiment.

Agg
-----

from random import random
from datetime import datetime
import os
import gc
import time
import win32api
import win32con
import win32process

import numpy

import matplotlib
matplotlib.use("Agg")
from matplotlib.figure import Figure
from matplotlib.backends.backend_agg import FigureCanvasAgg as FigureCanvas

def get_process_memory_info(process_id):
     memory = {}
     process = None
     try:
         process = win32api.OpenProcess(
             win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
             False, process_id)
         if process is not None:
             return win32process.GetProcessMemoryInfo(process)
     finally:
         if process:
             win32api.CloseHandle(process)
     return memory

meg = 1024.0 * 1024.0

figure = Figure(dpi=None)
canvas = FigureCanvas(figure)
axes = figure.add_subplot(1,1,1)

def draw(channel, seconds):
     axes.clear()
     axes.plot(channel, seconds)
     canvas.print_figure('test.png')

channel = numpy.sin(numpy.arange(1000) * random())
seconds = numpy.arange(len(channel))
testDuration = 60 * 60 * 3
startTime = time.time()

print "starting memory: ", \
     get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

while (time.time() - startTime) < testDuration:
     draw(channel, seconds)

     t = datetime.now()
     memory = get_process_memory_info(os.getpid())
     print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
         t,
         memory["WorkingSetSize"]/meg,
         gc.collect(),
         len(gc.garbage) )

     time.sleep(0.5)


TkAgg
---------
from random import random
from datetime import datetime
import sys
import os
import gc
import time
import win32api
import win32con
import win32process

import numpy

import matplotlib
matplotlib.use("TkAgg")
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg \
     as FigureCanvas

import Tkinter as tk

def get_process_memory_info(process_id):
     memory = {}
     process = None
     try:
         process = win32api.OpenProcess(
             win32con.PROCESS_QUERY_INFORMATION|win32con.PROCESS_VM_READ,
             False, process_id)
         if process is not None:
             return win32process.GetProcessMemoryInfo(process)
     finally:
         if process:
             win32api.CloseHandle(process)
     return memory

meg = 1024.0 * 1024.0

rootTk = tk.Tk()
rootTk.wm_title("TKAgg Memory Leak")

figure = Figure()
canvas = FigureCanvas(figure, master=rootTk)
axes = figure.add_subplot(1,1,1)

def draw(channel, seconds):
     axes.clear()
     axes.plot(channel, seconds)

channel = numpy.sin(numpy.arange(1000) * random())
seconds = numpy.arange(len(channel))

testDuration = 60 * 60 * 3
startTime = time.time()

print "starting memory: ", \
     get_process_memory_info(os.getpid())["WorkingSetSize"]/meg

draw(channel, seconds)
canvas.show()
canvas.get_tk_widget().pack(side=tk.TOP, fill=tk.BOTH, expand=1)

rate = 500

def on_tick():
     canvas.get_tk_widget().after(rate, on_tick)

     if (time.time() - startTime) >= testDuration:
         return

     draw(channel, seconds)

     t = datetime.now()
     memory = get_process_memory_info(os.getpid())
     print "time: {0}, working: {1:f}, collect: {2}, garbage: {3}".format(
         t,
         memory["WorkingSetSize"]/meg,
         gc.collect(),
         len(gc.garbage) )

canvas.get_tk_widget().after(rate, on_tick)
tk.mainloop()


_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users

Index: CXX/Python2/ExtensionOldType.hxx
===================================================================
--- CXX/Python2/ExtensionOldType.hxx	(revision 8806)
+++ CXX/Python2/ExtensionOldType.hxx	(working copy)
@@ -173,7 +173,7 @@
             Tuple self( 2 );
 
             self[0] = Object( this );
-            self[1] = Object( PyCObject_FromVoidPtr( method_def, do_not_dealloc ) );
+            self[1] = Object( PyCObject_FromVoidPtr( method_def, do_not_dealloc ), true );
 
             PyObject *func = PyCFunction_New( &method_def->ext_meth_def, self.ptr() );
 
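
For readers less familiar with reference counting, here is a rough Python
illustration of the kind of leak this one-argument change appears to address.
My understanding, which should be checked against the PyCXX sources, is that
Object( ptr ) takes its own reference to ptr, while Object( ptr, true ) adopts
the reference the caller already holds. PyCObject_FromVoidPtr returns a new
reference, so without the `true` one reference would never be released. The
sketch fakes that extra reference with ctypes on CPython; Payload and
make_object are hypothetical names, and none of this is PyCXX code.

```python
import ctypes
import gc
import weakref

# Py_IncRef is part of the CPython C API; declare its signature for ctypes.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None


class Payload(object):
    """Hypothetical stand-in for the object wrapped in the patch."""


def make_object(owned):
    obj = Payload()
    watcher = weakref.ref(obj)     # lets us see whether obj is ever freed
    if not owned:
        # Simulate a wrapper taking a second reference that nothing
        # will ever release (assumed analogue of the unpatched line).
        ctypes.pythonapi.Py_IncRef(obj)
    del obj                        # drop the only reference we know about
    return watcher


adopted = make_object(owned=True)
leaked = make_object(owned=False)
gc.collect()                       # the cycle collector cannot help here

print("adopted object freed: {0}".format(adopted() is None))
print("leaked object freed:  {0}".format(leaked() is None))
```

The leaked object is not part of a reference cycle, so gc.collect() does not
count it and gc.garbage stays empty, which would match the Agg numbers
reported earlier in this thread: memory grows while the collector sees
nothing wrong.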