Benjamin Root wrote:
> 
> On Thu, Oct 10, 2013 at 9:47 AM, Martin MOKREJŠ <mmokr...@gmail.com> wrote:
> 
>     Benjamin Root wrote:
>     >
>     >
>     >
>     > On Thu, Oct 10, 2013 at 9:05 AM, Martin MOKREJŠ <mmokr...@gmail.com> wrote:
>     >
>     >     Hi,
>     >       rendering some of my charts takes almost 50GB of RAM. I believe below is a stack trace
>     >     of one such situation, when it had already taken 15GB. Could somebody comment on what
>     >     matplotlib is doing at that very moment? Why the recursion?
>     >
>     >       The charts had 262422 data points in a 2D scatter plot, and each point is assigned
>     >     its own color. The points come in batches, so there are only 153 distinct colors, but
>     >     I nevertheless assigned a color value to each data point. There are also 153 legend
>     >     items (one color won't be used).
>     >
>     >     ^CTraceback (most recent call last):
>     >     ...
>     >         _figure.savefig(filename, dpi=100)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/figure.py", line 1421, in savefig
>     >         self.canvas.print_figure(*args, **kwargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/backend_bases.py", line 2220, in print_figure
>     >         **kwargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 505, in print_png
>     >         FigureCanvasAgg.draw(self)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/backends/backend_agg.py", line 451, in draw
>     >         self.figure.draw(self.renderer)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/artist.py", line 54, in draw_wrapper
>     >         draw(artist, renderer, *args, **kwargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/figure.py", line 1034, in draw
>     >         func(*args)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/artist.py", line 54, in draw_wrapper
>     >         draw(artist, renderer, *args, **kwargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/axes.py", line 2086, in draw
>     >         a.draw(renderer)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/artist.py", line 54, in draw_wrapper
>     >         draw(artist, renderer, *args, **kwargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/collections.py", line 718, in draw
>     >         return Collection.draw(self, renderer)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/artist.py", line 54, in draw_wrapper
>     >         draw(artist, renderer, *args, **kwargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/collections.py", line 276, in draw
>     >         offsets, transOffset, self.get_facecolor(), self.get_edgecolor(),
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/collections.py", line 551, in get_edgecolor
>     >         return self._edgecolors
>     >     KeyboardInterrupt
>     >     ^CError in atexit._run_exitfuncs:
>     >     Traceback (most recent call last):
>     >       File "/usr/lib64/python2.7/atexit.py", line 24, in _run_exitfuncs
>     >         func(*targs, **kargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/_pylab_helpers.py", line 90, in destroy_all
>     >         gc.collect()
>     >     KeyboardInterrupt
>     >     Error in sys.exitfunc:
>     >     Traceback (most recent call last):
>     >       File "/usr/lib64/python2.7/atexit.py", line 24, in _run_exitfuncs
>     >         func(*targs, **kargs)
>     >       File "/usr/lib64/python2.7/site-packages/matplotlib/_pylab_helpers.py", line 90, in destroy_all
>     >         gc.collect()
>     >     KeyboardInterrupt
>     >
>     >     ^C
>     >
>     >
>     >     Clues what is the code doing? I use mpl-1.3.0.
>     >     Thank you,
>     >     Martin
>     >
>     >
>     > Unfortunately, that stacktrace isn't very useful. There is no recursion there, but rather
>     > the perfectly normal drawing of the figure object that has a child axes, which has child
>     > collections, which have child artist objects.
>     >
>     > Without the accompanying code, it would be difficult to determine where the memory hog is.
> 
>     Could there be places where gc.collect() could be introduced? Are there places where
>     matplotlib could del() unnecessary objects right away? I think the problem is with huge
>     lists or Python dicts. I saved 10GB of RAM when I converted one Python dict to a bsddb3
>     file taking just 10MB on disk. I speculate that matplotlib keeps the data in some huge
>     list, or more likely a dict, in that code, and that this is the same issue.
> 
>     Are you sure you cannot see where the problem is? It happens (is visible) only with a huge
>     number of dots, of course.
> 
> 
> I am not going to claim that matplotlib is the leanest graphing library out there, and we
> already know where we can make continued improvements, but the symptom you are describing
> (50 GB for a couple hundred thousand scatter points) is just unheard of for matplotlib.
> Without a simple, concise, complete code example to demonstrate your problem, we can only
> hazard guesses. For all I know, you might be "appending" to numpy arrays in a loop prior to
> plotting, which would eat up a significant amount of memory without it being the fault of
> matplotlib.
> 
> As far as I am aware, we don't use very large dictionaries, so I am doubtful that is the
> issue either.
> 
> As a side note, I have typically found that situations where del() significantly improved
> memory usage were situations where I was "doing it wrong" in the first place, and a simple
> refactor of the code improved memory and (sometimes) speed, with the added benefit of
> improved readability. I have even seen situations where calling del() in the wrong places
> (say, for a list created at the beginning of the loop) actually hurt performance because
> Python couldn't recycle that chunk of memory.
> 
> Give us a code example that reproduces your problem, and then we can start 
> doing some more serious debugging.

It should be in your inboxes now. I had to rush to a meeting, so there is no
example call to that function with sample data, but I hope I have already
written enough, as I gave the number of dots and legend items to be drawn.
Also, the number of columns is determined elsewhere; put 2 as the value of
that variable.
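
For anyone who wants to try reproducing the workload described in this thread,
here is a minimal sketch. It is not the code referred to above: the function
name make_scatter, the random data, the "hsv" colormap, the marker size and the
one-scatter()-call-per-batch layout are all assumptions made for illustration.
Only the point count (262422), the 153 colors and legend items, the 2 legend
columns and the savefig(..., dpi=100) call come from the messages above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, as in the backend_agg frames of the traceback
import matplotlib.pyplot as plt
import numpy as np


def make_scatter(filename, n_points=262422, n_colors=153, legend_columns=2):
    """Draw n_points random dots in n_colors batches and save the figure.

    Hypothetical helper: one scatter() call (and one legend entry) per batch.
    Returns the number of collections added to the axes.
    """
    rng = np.random.RandomState(0)
    x = rng.rand(n_points)
    y = rng.rand(n_points)
    # Assumed grouping: point k belongs to batch k % n_colors.
    batch = np.arange(n_points) % n_colors
    cmap = plt.get_cmap("hsv")

    fig, ax = plt.subplots()
    for i in range(n_colors):
        mask = batch == i
        color = cmap(i / float(n_colors))  # one RGBA tuple per batch
        ax.scatter(x[mask], y[mask], s=4, color=color, label="batch %d" % i)
    ax.legend(ncol=legend_columns, fontsize="xx-small")

    n_collections = len(ax.collections)
    fig.savefig(filename, dpi=100)
    plt.close(fig)  # release the figure so repeated calls do not accumulate
    return n_collections
```

Calling make_scatter("scatter.png") runs the full-size case; passing a smaller
n_points first makes it easier to watch how memory use grows with the number
of dots.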

Surely one can rewrite the code, but ideally I would also propose that
matplotlib be improved so that others with a similarly bad coding style do
not hit the issue. ;)
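
As an illustration of the kind of coding pattern in question, the "appending
to numpy arrays in a loop" guess from the quoted reply can be sketched as
below. Whether the code that was sent actually does this is unknown; the two
function names are made up for the example. Each np.append call allocates a
new array and copies the existing data into it, so the first version copies
more and more data as the array grows, while the second copies each piece once.

```python
import numpy as np


def grow_with_append(chunks):
    """Pattern to avoid: np.append copies the whole array on every iteration."""
    out = np.empty(0)
    for chunk in chunks:
        out = np.append(out, chunk)  # new allocation plus full copy each time
    return out


def grow_with_list(chunks):
    """Collect the pieces in a list and concatenate once at the end."""
    parts = [np.asarray(chunk, dtype=float) for chunk in chunks]
    if not parts:
        return np.empty(0)
    return np.concatenate(parts)
```

Both functions return the same values, so swapping one for the other is the
sort of small refactor that does not change the resulting plot.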

Thank you for your time,
Martin

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users
