Re: [matplotlib-devel] low-hanging fruit on the transforms branch

Eric Firing Tue, 06 Nov 2007 14:03:31 -0800

Michael Droettboom wrote:
> Eric Firing wrote:
>> Mike,
>>
>> Thanks for the quick response.
>>
>> I was wrong as usual: the masked array overhead in your original 
>> version of the path initializer was actually small.  I misinterpreted 
>> the kcachegrind display.  Rats!  I was hoping for a big gain.  It 
>> looks like anything that makes a huge number of paths is going to be 
>> slow, no matter what we do to try to optimize the path initializer.
> 
> Do you think the overhead is just from Python object creation?  On the 
> trunk, pcolor ultimately creates a large nested array, which on the 
> branch is then converted to a Python list of Path objects (each 
> containing two arrays).  I can definitely imagine how that would be 
> slower...


I think the problem is at least partly the creation of a large number of 
numpy arrays; this is a known performance problem with numpy.  It is 
also partly python function call overhead.  In my benchmark, that path 
initializer gets called a *lot* of times, and every little bit of 
overhead adds up.

> 
>> A partial solution for pcolor is pcolormesh, using the quadmesh 
>> extension code, although that still has a bug. (Paul Kienzle was going 
>> to look into it.) Is the quadmesh extension compatible with your 
>> transforms branch?
> 
> I'm going to say 'no', only because I wasn't aware of it ;)  Where is 
> that code?  Trunk contains a "draw_quad_mesh" method in the Agg backend 
> which has been replaced by creating Path objects in Python and sending 
> them to a more generic draw_path_collection method.

That's it--"draw_quad_mesh" in Agg backend.

> 
>> My impression is that the transforms branch is going to be a big step 
>> forward, with performance improvements in some areas, at worst minor 
>> penalties in others--except for some problems like pcolor that need to 
>> be solved.
> 
> I hope you're right.  In all honesty, I actually haven't seen many 
> performance improvements myself in what I've benchmarked.  Certainly 
> there must be some degenerate synthetic benchmark I can devise to prove 
> its superiority <wink>.  Seriously, I was focusing on generalization, 
> ease of adding new transforms, and reduction of interactions with 
> extension code -- all of which aren't certain to create performance 
> gains.  I hope that through more pervasive use of numpy, however, some 
> of the performance lost can be made up -- and the reduction in lines of 
> code may make it easier to optimize more broadly.

I think a fair amount of extension code may remain necessary, but your 
reorganization should make it easier to have a clean interface to the 
extension code.

> 
>> In order to replace matlab in my application, a very fast interactive 
>> pcolor-type capability is absolutely essential.  I think this simply 
>> has to be done via extension code, like quadmesh and the image codes. 
>> (Pcolor in the trunk isn't fast enough, either.)  Unfortunately, I 
>> have found those codes hard to understand.  Only the regular-grid 
>> image code is fully integrated into the trunk, and even it has a 
>> long-standing bug revealed by extreme zooming.  The irregular-grid 
>> image routine might be a big help, but it has never been integrated.  
>> I don't remember which bugs it shares with quadmesh and image, if any.
> 
> We'll definitely have to find better ways of doing this stuff.
> 
> It would be really nice if all this could be done independently of the 
> backends.  The primary reason for having the Path abstraction is so that 
> backends only need to understand one basic thing.    Images (as they are 
> now) are an exception to that for the obvious optimization benefits.  I 
> wonder if generating the Path objects in an extension would be a 
> worthwhile compromise to adding pcolor support in all the backends (hard 
> to say what the factor of improvement might be).

1) It might indeed be possible to speed up path object generation via 
fairly simple extension code using the numpy C API.  We will have to 
look more closely to see where the time is being taken.

2) If we take Agg as the standard core for interactive backends, then it 
is not so crucial to optimize the non-interactive backends, although it 
would still be nice.  Alternatively, it may be that the way to go for 
all very complex plots on all backends is to use an image for the 
complex part.  Matlab did something like this, with their "zbuffer" 
versus "painters" renderer choice, but they fouled it up--with zbuffer, 
everything was bitmapped, including lines and fonts, so the result 
looked terrible, and the ps files were sometimes too big to print.  (I 
know very little about options for images in svg, ps and pdf.)  This was 
many years ago; I haven't run into this sort of problem with Matlab more 
recently.

> 
> If by interactive you mean "panning and zooming" performance, than the 
> branch is already at least twice as fast as the trunk (finally a 
> performance benchmark in favor of the branch!).  I converted your 
> example into a benchmark (attached) and get these results:
> 
> trunk:
> 
> init:  1.39
> fps: 0.448591422932
> 
> branch:
> 
> init:  5.31
> fps: 1.08885017422

Good, this confirms my subjective impression.  But for comparison, try 
the benchmark on the trunk with pcolormesh instead of pcolor.

> 
> Of course, if you mean "interactive" as in "updating the data", then, 
> yes, the branch has a long way to go to catch up to the trunk.
> 
Both aspects--data updating and pan/zoom--are important for interactive use.

Eric

> Cheers,
> Mike
> 
>> Michael Droettboom wrote:
>>> Eric Firing wrote:
>>>> Mike,
>>>>
>>>> I made a quick test and took a quick look, and I certainly see a 
>>>> ripe mango within reach.  I don't know what your constraints and 
>>>> strategy are, but I thought I would give you the off-the-cuff idea 
>>>> before I forget what I did.
>>>>
>>>> The test was pcolortest.py, and the kcachegrind input is the .log file.
>>>>
>>>> The problem is the path initializer: it is converting everything to 
>>>> a masked array, which in the vast majority of cases is not needed, 
>>>> and is very costly.
>>>
>>> Thanks for finding this.  I agree completely.  I think that was 
>>> basically a typo that ended up "working", just suboptimally.  The 
>>> input to the path constructor may be either a numpy array, an ma 
>>> array or a regular Python sequence.  If it's the first two, it should 
>>> be left alone (if there is an array mask, it is dealt with later on 
>>> in the constructor), but if the latter, it should be converted to a 
>>> numpy array.
>>>
>>> What I meant to type was:
>>>
>>>          if not ma.isMaskedArray(vertices):
>>>              vertices = npy.asarray(vertices, npy.float_)
>>>
>>> The argument against just "npy.asarray(vertices, npy.float_)" is that 
>>> the mask needs to be preserved.
>>>
>>> If I understand correctly, that will be essentially a no-op when the 
>>> input is a numpy array, albeit with the overhead of some checks.
>>>
>>>> We need to think carefully about the levels of API, and what should 
>>>> be done at which levels.  One possibility is that at the level of 
>>>> the path initializer, only ordinary ndarrays should be 
>>>> acceptable--any mask manipulations and compressions should already 
>>>> have been done.  This would require a helper function to generate 
>>>> the codes for that case. Another is that the path initializer could 
>>>> get a flag telling it whether to check for masked arrays.  And 
>>>> another is that a check for existance of a mask should be done at 
>>>> the start, and the mask processing done only if there is a mask.
>>>
>>> This option was the intent.
>>>
>>>> Yet another is that if a mask is needed, it be passed in as an 
>>>> optional 1-D array kwarg.  An advantage of this is that the code 
>>>> that calls the path initializer may be in a better position to know 
>>>> what is needed to generate the 1-D mask (that is, a mask for each 
>>>> (x,y) point rather than for x and y separately)--that mask may 
>>>> already be sitting around.
>>>
>>> Many of these options I fear would significantly complicate the code. 
>>> One of the driving motivations for the refactoring is to allow 
>>> transformations to be combined more generally.  Think of the case 
>>> where you have a polar plot with a logarithmic scale on the r-axis 
>>> (this wasn't ever possible in the trunk).  The log scale means that 
>>> there is potential for negative masked values, but the polar part of 
>>> the transformation shouldn't have to know or care whether masked 
>>> values are being passed through.  Requiring it to do so would need 
>>> the same checks currently performed in the Path constructor, but they 
>>> would be copied all over the code in every kind of new transformation.
>>>
>>> FWIW, there already is a deliberate "quarantining" of masked arrays 
>>> -- it happens where the logical elements of the plot hit the drawing 
>>> commands of the plot (the Path object).  It could have been 
>>> implemented such that the backends must understand masked arrays and 
>>> draw accordingly, but it proved to be faster (based on the 
>>> simple_plot_fps.py benchmark) to convert to a non-masked array with 
>>> MOVETO codes upfront and reuse that.  (Not surprising, given the 
>>> overhead of masked arrays).   This means that masked arrays are not 
>>> used at all during panning and zooming operations where speed is 
>>> perhaps the most crucial.
>>>
>>>> Masked arrays are pretty clunky and slow.  The maskedarray 
>>>> implementation by Pierre GM is nicer, more complete, and faster for 
>>>> many operations than numpy.ma, but it still adds a lot of overhead, 
>>>> especially for small arrays.  (It needs to have its core in C; so 
>>>> far I have failed dismally in trying to understand how to do that 
>>>> without repeating the bulk of the ndarray code.)
>>>>
>>>> A related point: can you (or is it OK if I do it) change all the 
>>>> "import numpy.ma as ma" or whatever to "from matplotlib.numerix 
>>>> import npyma as ma"?  The advantage is that it makes it easy to test 
>>>> the new version with either maskedarray or ma.  This should be 
>>>> temporary; I am still hoping and expecting that maskedarray will 
>>>> replace ma in the core numpy distribution.
>>>
>>> That sounds like a very good idea.  I'll go ahead and do this (on the 
>>> branch only).
>>>
>>> Cheers,
>>> Mike
>>>
>>
> 


-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
_______________________________________________
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel

Re: [matplotlib-devel] low-hanging fruit on the transforms branch

Reply via email to