Re: [Matplotlib-users] Matplotlib eating memory

Martin MOKREJŠ Thu, 10 Oct 2013 15:11:37 -0700

Hi Ben,
  thank you for your comments. Looks I will have a bad sleep tonight. :( Some 
quick
answers below.

Benjamin Root wrote:
> 
> 
> 
> On Thu, Oct 10, 2013 at 10:21 AM, Michael Droettboom <md...@stsci.edu 
> <mailto:md...@stsci.edu>> wrote:
> 
>     Thanks.  This is much more helpful.
> 
>     What we need, however, is a "self contained, standalone example".  The 
> code below calls functions that are not present.  See http://sscce.org/ for 
> why this is so important.  Again, I would have to guess what those functions 
> do -- it may be relevant, it may not.  If I have something that I can *just 
> run* then I can use various introspection tools to see what is going wrong.
> 
>     Mike
> 
> 
> That being said, I do see a number of anti-patterns here that could be 
> significant. For example:
> 
>         for _x, _y, _c in izip(mydata_x, mydata_y, colors):
>             # _Line2D = _ax1.plot(_x, _y) # returns Line2D object
>             _my_PathCollection = _ax1.scatter(_x, _y, color=_c, s=objsize) # 
> , label=_l) # returns PathCollection object
>             _series.append(_my_PathCollection)
> 
> Could be more concisely written as:
> 
>         _series = [_ax1.scatter(_x, _y, color=_c, s=objsize) for _x, _y, _c 
> in izip(mydata_x, mydata_y, colors)]
> 
> Python can then more intelligently handle memory management by intelligently 
> allocating the memory for _series. You can then use _series.extend() for when 
> you are doing the scatter plots for _ax2 with a similar list comprehension 
> (or even a generator statement).

You are right the .append() is ugly, maybe is a the real source of troubles. I 
somehow
do not understand myself right now why under the "if legends:" use ax1 instead 
of ax2.
Weird. I actually stopped using legends with this function because that was my 
first guess
that they cause the memory issues. Seems the culprit is elsewhere so I should 
add them
back and likely fix the ax2 vs. ax1 copy/paste (most likely) error.

As you could have seen, I used in the past label=_l but for some reason I 
switched away
to the current ugly code. Will try to find out why I did that.

Hmm, I don't know what you mean with _series.extend() at the moment, will read 
some
python Intro on using lists. :(

> 
> I would also question the need to store _series in the first place. You use 
> it for the call to legend, but you could have simply passed a label to each 
> call of scatter as well.

As I said, I used that in the past but somehow that did not work. Maybe time to 
re-try that.

> 
> Some other things of note:
> 
> 1) The clear() call here is completely useless as the figure is already clear.
>         _figure = pylab.figure()
>         _figure.clear()

Right, I was just trying to ensure everything is cleared. I somewhat suspect 
python
garbage collector does not recycle too often, and therefore added more and more 
del()
and gc.collect() calls.

> 
> 2) When limits are set on an axis, autoscaling for that axis is automatically 
> turned off anyway, so no need to turn if off yourself (also not sure why you 
> are calling out to an external function here):
>         _ax1.set_autoscale_on(False)
>         set_limits(_ax1, xmin, xmax, ymin, ymax)

The set_limits() is called because I got unstable coordinates in every figure.
Sometimes, matplotlib used wider offset from the axes line while sometimes not.
So, I basically force same layout for expected layouts.

> 
> 3) Finally, some discussion on the end of your function here:
>         if legends:
>             _figure.savefig(filename, dpi=100) #, bbox_inches='tight')
>             del(_my_PathCollection)
>             del(_ax2)
>         else:
>             _figure.savefig(filename, dpi=100)
> 
>         del(_series)
>         del(_ax1)
>         _figure.clear()
>         del(_figure)
>         pylab.clf()
>         pylab.close()
> first, as discussed, you can easily eliminate the need for _my_PathCollection 
> and possibly even _series. Second, when calling _figure.clear(), all of its 
> axes objects are deleted for you, so you don't need to delete them yourself. 
> Third, you delete the _figure object, but then call "pylab.clf()". I haven't 
> double-checked exactly what would happen, but I think you might run the risk 
> of accidentially clearing some other existing figure by doing that. Lastly, 
> you then call pylab.close(), which I point out the same caveat as before. 
> Really, all you needed was pylab.close() and you can eliminate the 5 
> preceding lines and the other two del()'s. All del() really does is remove 
> the variable out of scope. Once that object is out of everybody's scope, then 
> the gc can clean it up. Since the function was ending anyway, there is no 
> point in deleting the variable.

Right, but I suspect that garbage collector  does not recycle quickly enough 
unused objects
after the function is left. If I generate many figure sin a loop, one after 
another, it
appeared to me helpful to interleave the function calls with the gc.collect() 
calls.

> 
> I don't know if this would fix your problem, and there are a bunch of other 
> style issues here (particularly, pylab really shouldn't be used this way), 
> but hopefully this gives some food for thought.

I think I will start tomorrow finishing up the broken testcase so that we can 
be sure
where was the culprit. Then should improve the function as you proposed. I am 
not sure
some places what you really mean but will resolve it hopefully.

I was thinking about submitting several other functions like this one for 
discussion and
improvement, so that so that such wrapper functions could be included in 
matplotlib. I am
sure you would not like the many function argument and would prefer kwargs 
instead, but
something have same API would be helpful if I want to switch easily between 
scatter,
histplot, piechart. Actually, the hist2d substring in this function name is a 
remnant of
my attempts to do 2d charts but I did not take that route in the end. Just in 
case you were
puzzled by the function name. ;)

Thank you,
Martin

> 
> Cheers!
> Ben Root

------------------------------------------------------------------------------
October Webinars: Code for Performance
Free Intel webinars can help you accelerate application performance.
Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from 
the latest Intel processors and coprocessors. See abstracts and register >
http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktrk
_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users

Re: [Matplotlib-users] Matplotlib eating memory

Reply via email to