Re: [matplotlib-devel] mpl1 draft

Peter Wang Mon, 23 Jul 2007 16:46:10 -0700

On Jul 19, 2007, at 12:18 PM, John Hunter wrote:

> = Data copying =
>
> Push the data to the backend only once, or only when required.  Update
> the transforms in the backend, but do not push transformed data on
> every draw.  This is potentially a major win, because we currently
> move the data around on every draw.


Does the backend keep a copy of the untransformed data around, so  
that it can easily create new transformed data when its transform is  
updated?  If so, is there a coherent mechanism for invalidating a  
piece of data that is being graphed in multiple plots?  If not, then  
how does hittesting determine the correct index into the data (since,  
presumably, hittesting will require the exact transform in the backend)?
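
To make the invalidation question concrete, here is a minimal sketch of
one possible answer.  It is not taken from mpl1 or Chaco: DataSource,
PlotView and every method name are hypothetical, and the "transform" is
reduced to a 1D scale and offset.  The untransformed data lives in one
place, each plot caches its own transformed copy, and a change to the
data or to the transform only marks the cache dirty.

```python
class DataSource:
    """Hypothetical holder of untransformed data, shared by several plots."""

    def __init__(self, values):
        self._values = list(values)
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def get_data(self):
        return list(self._values)

    def set_data(self, values):
        self._values = list(values)
        # Invalidate every plot that graphs this piece of data.
        for callback in self._listeners:
            callback()


class PlotView:
    """Hypothetical plot that caches transformed data until it is invalidated."""

    def __init__(self, source, scale, offset):
        self.source = source
        self.scale = scale
        self.offset = offset
        self._cache = None
        self.recompute_count = 0
        source.add_listener(self.invalidate)

    def invalidate(self):
        self._cache = None

    def set_transform(self, scale, offset):
        self.scale, self.offset = scale, offset
        self.invalidate()

    def screen_data(self):
        # Recompute only when the data or the transform has changed.
        if self._cache is None:
            data = self.source.get_data()
            self._cache = [self.scale * v + self.offset for v in data]
            self.recompute_count += 1
        return self._cache

    def hit_test(self, screen_value, tolerance=0.5):
        """Return the index into the untransformed data, or None."""
        for index, value in enumerate(self.screen_data()):
            if abs(value - screen_value) <= tolerance:
                return index
        return None
```

Because the cached screen data keeps the same ordering as the source,
a hit in screen space maps straight back to a data index.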

> = Transformations =
>
> Support a normal transformation architecture.  The current draft
> implementation assumes one nonlinear transformation, which happens at
> a high layer, and all transformations after that are affines.  In the
> mpl1 draft, there are three affines: the transformation from view
> limits -> axes units (AxesCoords.affineview), the transformation from
> axes units to normalized figure units (AxesCoords.affineaxes), and the
> transformation from normalized figure units to display
> (Renderer.affinerenderer)
>
> Do we want to use 3x3 or 4x4 to leave the door open for 3D developers?

I admit the temptation of having basic 3D support, but the problem is  
that it really doesn't scale well in software.  Even the simple blits  
that we do in Chaco start to hit their limits on big, high-res LCDs  
that are getting cheaper every day.  The approach that I think we're  
going to have to take in Chaco is to only let 3D be available when  
using the OpenGL backend, and to restrict the Agg-based backends to  
be 2D only.

Of course, I'm thinking about all this from an interactive  
standpoint, so if speed is not a concern, then there's no reason not  
to build in 3D support from the get-go.
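
For reference, the 2D case of the three-affine chain quoted above can be
sketched with plain 3x3 matrices.  The names affineview, affineaxes and
affinerenderer come from the draft; the helper functions and the
numbers (view limits, axes rectangle, a 640x480 display) are made up
for illustration.

```python
def scale_translate(sx, sy, tx, ty):
    """3x3 homogeneous matrix for x' = sx*x + tx, y' = sy*y + ty."""
    return [[sx, 0.0, tx],
            [0.0, sy, ty],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 3x3 matrices; matmul(a, b) applies b first, then a."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def apply(m, x, y):
    """Apply an affine 3x3 matrix to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# View limits x in [0, 10], y in [-1, 1]  ->  axes units in [0, 1].
affineview = scale_translate(0.1, 0.5, 0.0, 0.5)
# Axes rectangle at left=0.1, bottom=0.1, width=0.8, height=0.8 of the figure.
affineaxes = scale_translate(0.8, 0.8, 0.1, 0.1)
# Normalized figure units -> a 640x480 pixel display.
affinerenderer = scale_translate(640.0, 480.0, 0.0, 0.0)

# The whole chain collapses into one matrix that a backend could hold.
total = matmul(affinerenderer, matmul(affineaxes, affineview))
```

A 4x4 version would follow the same pattern with one extra row and
column; nothing here says which of the two the draft should pick.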

> How do transformations (linear and nonlinear) play with Axis features
> (ticking and gridding).  The ideal is a framework in which ticking,
> gridding and labeling work intelligently with arbitrary, user
> supplied, transformations.  What is the proper transformation API?

This is something we've been puzzling over for Chaco as well.  Dave  
Kammeyer pointed out long ago that the problem with trying to write a  
generic axis/grid renderer while supporting arbitrary transformations  
is that straight lines become curves under arbitrary transforms.  The  
basic idea is that the backend (or GraphicsContext, in the Chaco  
world) needs to provide transformation-aware implementations of  
line_to() that automatically convert line segments into bezier curves  
while at the same time providing drawing methods that are guaranteed  
to be "straight" or aligned with screen coordinates.  This way, you  
can get curved axes in a hyperbolic space "for free", while your  
ticks stay perfectly straight and the label text is screen-aligned.   
(Of course, to be perfectly accurate, you would need to handle polar  
coordinates in a special way anyway.)
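
A rough sketch of the two kinds of drawing call described above.  Note
the substitution: instead of fitting bezier curves, this version simply
samples the segment in data space and maps each sample, which yields a
polyline approximation.  The function names and the polar transform are
assumptions for illustration, not Kiva or matplotlib API.

```python
import math


def polar_to_screen(r, theta):
    """Example nonlinear transform: polar data space to Cartesian screen."""
    return (r * math.cos(theta), r * math.sin(theta))


def transformed_line(p0, p1, transform, segments=16):
    """Map a straight data-space segment through `transform`.

    The segment is sampled at segments + 1 evenly spaced points, so a
    line that is straight in data space comes out curved on screen.
    """
    points = []
    for i in range(segments + 1):
        t = i / segments
        x = p0[0] + t * (p1[0] - p0[0])
        y = p0[1] + t * (p1[1] - p0[1])
        points.append(transform(x, y))
    return points


def screen_tick(anchor, length=5.0):
    """A tick that stays straight: only its anchor point was transformed."""
    x, y = anchor
    return [(x, y), (x, y - length)]
```

Drawing the axis with transformed_line() and the ticks with
screen_tick() gives a curved axis whose ticks remain screen-aligned.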

> = Objects that talk to the backend "primitives" =
>
> Have just a few, fairly rich objects, that the backends need to
> understand.  Clear candidates are a Path, Text and Image, but despite
> their names, don't confuse these with the eponymous matplotlib
> Artists, which are higher level than what I'm thinking of
> here (eg matplotlib.text.Text does *a lot* of layout, and this would
> be offloaded to the backend in this conception of the Text primitive).
> Each of these will carry their metadata, eg a path will carry its
> stroke color, facecolor, linewidth, etc..., and Text will carry its
> font size, color, etc....  We may need some optimizations down the
> road, but we should start small.  For now, let's call these objects
> "primitives".
>
> = How much of an intermediate artist layer do we need? =
>
> Do we want to create high level objects like Circle, Rectangle and
> Line, each of which manage a Path object under the hood?  Probably,
> for user convenience and general compatibility with matplotlib.  By
> using traits properly here, many current matplotlib Artists will be
> thin interfaces around one or more primitives.

I included these two together because I think they both concern a  
very fundamental matter, which is the drawing model.  If you are  
going to create higher-level "primitives" which encapsulate state  
(e.g. color, dash style, line width), then you are moving  
significantly away from the model of a Canvas as just a place to dump  
pixels/vector drawing commands, and more towards the model of Canvas  
as a container of stateful objects.  But as soon as you do this, a  
whole host of questions pop up...  Are the Circle, Rectangle, etc. in  
the "intermediate artist layer" objects in their own right, with  
parameters like 'radius', 'position', etc., or are they just  
convenience functions to create more low-level primitives on the  
Canvas?  If the former, then you suddenly have a hierarchical
Canvas.  If the latter, then is there any structure to how these  
primitives are held in the Canvas?  Even if they are just held in a  
simple list, are they drawn in the same order as they appear in that  
list?  If they were inserted by a single Circle or Rectangle  
intermediate artist, is there any way to ensure that they maintain  
coherency when re-ordering that draw order?

Also, if you have a list of these primitives, it seems natural to  
hittest against them for picking and interaction.  Does this also  
happen in the same order that they appear in the list?  Is there a  
straightforward way to make them process events in a different  
order?  If you have lots of these little primitives, are there  
optimizations you can design in so that you don't have to hittest  
thousands of little primitives on each mouse_move event?
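
One simple answer to the ordering question, sketched with hypothetical
Primitive and Canvas classes: keep the primitives in draw order and
hit-test in the reverse of that order, so the topmost primitive under
the cursor wins.  Bounding boxes stand in for real geometry, and the
spatial index you would want for thousands of primitives is left out.

```python
class Primitive:
    """Hypothetical primitive reduced to a name and a bounding box."""

    def __init__(self, name, bbox):
        self.name = name
        self.bbox = bbox  # (xmin, ymin, xmax, ymax) in screen units

    def contains(self, x, y):
        xmin, ymin, xmax, ymax = self.bbox
        return xmin <= x <= xmax and ymin <= y <= ymax


class Canvas:
    """Hypothetical flat canvas: a list of primitives in draw order."""

    def __init__(self):
        self.primitives = []  # first element is drawn first (bottom)

    def add(self, primitive):
        self.primitives.append(primitive)
        return primitive

    def draw_order(self):
        return [p.name for p in self.primitives]

    def hit_test(self, x, y):
        # Walk from the top of the stack down; the first hit is the
        # primitive the user actually sees under the cursor.
        for primitive in reversed(self.primitives):
            if primitive.contains(x, y):
                return primitive
        return None
```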

> = Where do the plot functions live? =
>
> In matplotlib, the plot functions are matplotlib.axes.Axes methods and
> I think there is consensus that this is a poor design.  Where should
> these live, what should they create, etc?

Well, you can probably guess my answer to this question. :)  It seems  
to me that if you're going to have a drawing model that supports  
stateful graphics on a canvas, plot renderers should just be  
glorified graphics that live on the canvas, no different from a  
Circle or a Rectangle or whatnot.  In this case, the plot functions  
then just become convenience functions that create these graphics and  
stick them on a canvas.
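
A minimal sketch of that arrangement, with every name (Canvas, Circle,
LinePlot, plot) invented for illustration.  The plot renderer is just
another graphic with a draw() method, and the plot function only
constructs it and adds it to the canvas.

```python
class Circle:
    """Hypothetical stateful graphic."""

    def __init__(self, center, radius):
        self.center = center
        self.radius = radius

    def draw(self, commands):
        commands.append(("circle", self.center, self.radius))


class LinePlot:
    """Hypothetical plot renderer: a graphic like any other."""

    def __init__(self, xs, ys, color="blue"):
        if len(xs) != len(ys):
            raise ValueError("xs and ys must have the same length")
        self.xs = list(xs)
        self.ys = list(ys)
        self.color = color

    def draw(self, commands):
        commands.append(("polyline", list(zip(self.xs, self.ys)), self.color))


class Canvas:
    """Hypothetical canvas that holds graphics and asks each to draw itself."""

    def __init__(self):
        self.graphics = []

    def add(self, graphic):
        self.graphics.append(graphic)
        return graphic

    def render(self):
        commands = []
        for graphic in self.graphics:
            graphic.draw(commands)
        return commands


def plot(canvas, xs, ys, **style):
    """Convenience function: create a LinePlot and stick it on the canvas."""
    return canvas.add(LinePlot(xs, ys, **style))
```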

> I think the whole matplotlib.collections module is poorly designed,
> and should be chucked wholesale, in favor of faster, more elegant,
> optimizations and special cases.  Just having the right Path object
> will reduce the need for many of these, eg LineCollection,
> PolygonCollection, etc...  Also, everything should be numpy enabled,
> and the sequence-of-python-tuples approach that many of the
> collections take should be dropped. Obviously some of the more useful
> things there, like quad meshes, need to be ported and retained.

In Chaco we can get interactive speeds just by having a few fast  
drawing calls at the Kiva layer: lines(), line_set(),  
draw_marker_at_points().  These were quite easy to implement in both  
the Agg and Quartz backends.  We've talked about introducing another  
set of drawing commands that allow passing in a "style index" with  
every point, so that we can speed up our colormapped scatter plots  
and eventually do colormapped lines and such.
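
To illustrate what a per-point "style index" buys, here is a sketch
that groups points by index so each style costs one batched draw call.
This is only an assumption about how such a command might be used, not
how Kiva implements anything; batch_by_style, draw_scatter and the
draw_markers callback are hypothetical.

```python
from collections import defaultdict


def batch_by_style(points, style_indices):
    """Group points by their style index, preserving point order."""
    if len(points) != len(style_indices):
        raise ValueError("need exactly one style index per point")
    batches = defaultdict(list)
    for point, index in zip(points, style_indices):
        batches[index].append(point)
    return dict(batches)


def draw_scatter(points, style_indices, styles, draw_markers):
    """Call draw_markers(points, style) once per style that is in use."""
    batches = batch_by_style(points, style_indices)
    for index in sorted(batches):
        draw_markers(batches[index], styles[index])
```

With a colormap of N colors, a scatter plot then needs at most N calls
into the backend regardless of how many points it holds.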

> = Z-ordering, containers, etc =
>
> Peter has been doing a lot of nice work on z-order and layers for
> chaco, stuff that looks really useful for picking, interaction, etc...
> We should look at this approach, and think carefully about how this
> should be handled.  Paul may be a good candidate for this, since he
> has been working recently on the picking API.

I think that you should really consider integrating the event  
propagation model with the drawing model.  The Chaco model of
"containers of graphical components" is pretty straightforward and  
even though we've implemented it in pure python, it is responsive  
enough for interactivity.  The nice thing about it is that there's  
nothing in the container/component model that is intrinsically  
related to plotting, so you can use it to build simple widgets that  
play nicely with the rest of your plot because they use the same  
event propagation and component drawing model. I can put together  
some more thorough documentation on all this, if folks are interested.
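
In the meantime, a stripped-down sketch of the idea.  This is not
Chaco's actual code: Event, Component and Container here are
simplified, hypothetical classes.  The same child list drives both
drawing (bottom to top) and event dispatch (top to bottom).

```python
class Event:
    def __init__(self, x, y):
        self.x = x
        self.y = y
        self.handled = False


class Component:
    """Hypothetical graphical component with a position and a size."""

    def __init__(self, name, x, y, width, height):
        self.name = name
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.received = []

    def contains(self, x, y):
        return (self.x <= x < self.x + self.width
                and self.y <= y < self.y + self.height)

    def dispatch(self, event):
        if self.contains(event.x, event.y):
            self.received.append((event.x, event.y))
            event.handled = True

    def draw(self, commands):
        commands.append(self.name)


class Container(Component):
    """Hypothetical container: draws and dispatches through its children."""

    def __init__(self, name, x, y, width, height):
        super().__init__(name, x, y, width, height)
        self.children = []

    def add(self, child):
        self.children.append(child)
        return child

    def draw(self, commands):
        commands.append(self.name)
        for child in self.children:  # bottom to top
            child.draw(commands)

    def dispatch(self, event):
        if not self.contains(event.x, event.y):
            return
        # Children see coordinates relative to their container.
        local = Event(event.x - self.x, event.y - self.y)
        for child in reversed(self.children):  # top to bottom
            child.dispatch(local)
            if local.handled:
                event.handled = True
                return
        # No child claimed the event, so the container handles it.
        super().dispatch(event)
```

Nothing in these classes knows about plotting, which is the point: a
plot and a button can sit side by side in the same container.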

> I also plan to use the SWIG
> agg wrapper, so this gets rid of _backend_agg.  If we can enhance the
> SWIG agg wrapper, we can also do images through there, getting rid of
> _image.cpp.  Having a fully featured, python-exposed agg wrapper will
> be a plus in mpl and beyond.  But with the agg license change, I'm
> open to discussion of other approaches.

How exactly are you guys wrapping Agg?  I guess I need to take a look  
at that stuff in more detail... Kiva has been fairly stable, even  
though we don't do much maintenance on it, and the DisplayPDF drawing  
model has worked out fairly well.  After Robert put together the  
Quartz backend for it, we can nicely verify that our Agg-based  
implementation of DisplayPDF is fairly good, since our plots render  
the same on Windows and Mac.  If we had just put in a little more  
effort on optimization for Linux and cleaning up some outdated cruft,  
I think it would be in really good shape.  Additionally, Phil  
Thompson is going to be working on porting Kiva and Enable to Qt.   
Kiva's Agg backend is based on Agg 2.4, which is still BSD.

> The major missing piece is ft2font, which is a pretty elaborate CXX
> module.  Michael may want to consider alternatives, including looking
> at the agg support for freetype, and the kiva/chaco approach.

Unfortunately, Chaco's font handling isn't anything to write home
about.  I think the world is crying out for a nice Python library for
font lookup and font metrics.

> = Traits =
>
> I think we should make a major commitment to traits and use them from
> the ground up.  Even without the UI stuff, they add plenty to make
> them worthwhile, especially the validation and notification features.
> With the UI (wx only) , they are a major win for many GUI developers.
> Compare the logic for sharing an x-axis using matplotlib transforms
> with Axes.sharex with the approach used in mpl1.py with sync_trait-ed
> affines.

Once you start using trait events and notifications extensively, you  
won't want to go back. :)  It encourages a very componentized model  
of development that is both a world apart from normal OOP while at  
the same time feeling very natural.

> = Axis handling =
>
> The whole concept of the Axes object needs to be rethought, in light
> of the fact that we need to support multiple axis objects on one Axes.
> The matplotlib implementation assumes 1 xaxis and 1 yaxis per Axes,
> and we hack two y-axis support (examples/two_scales.py) with some
> transform shenanigans via twinx and multiple Axes where one is hidden,
> but the approach is not scalable and is unwieldy.
>
> This will require a fair amount of thought, but we should aim for
> supporting an arbitrary number of axis objects, presumably associated
> with individual artists or primitives.
> ...
> The other important feature for axis support is that, for the most
> part, they should be arbitrarily placeable (eg a "detached" axis).

I think you should consider separating the two concerns that are  
being overloaded onto the Axis object: (1) an axis represents a range  
in data space that controls the transforms/mappings between data and  
screen space, and (2) an axis is a visual component that needs to be  
rendered at a particular place on the screen and receives events from  
the user (e.g. double-clicking to set its parameters).

If you create a separate DataRange object, then you can use it to  
drive one or more Transforms as well as multiple Axis objects.  This  
is basically how Chaco gets synchronized axes for "free".  The actual  
graphical Axis objects can render themselves however they want to,  
and their actual layout on the screen (on opposite sides of a plot,  
piled up in a stack on the left or right, etc.) is determined by a  
layout mechanism that is completely orthogonal to the issues of  
mapping.  This also allows for "detached" axes and such.
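
Roughly, the separation looks like the sketch below.  The classes are
simplified stand-ins written for this message, so DataRange,
LinearMapper, Axis and their methods should be read as hypothetical
rather than as Chaco's real API.

```python
class DataRange:
    """Hypothetical range in data space; knows nothing about the screen."""

    def __init__(self, low, high):
        self.low = low
        self.high = high
        self._listeners = []

    def add_listener(self, callback):
        self._listeners.append(callback)

    def set_bounds(self, low, high):
        self.low, self.high = low, high
        for callback in self._listeners:
            callback()


class LinearMapper:
    """Hypothetical data-to-screen mapping driven by a shared DataRange."""

    def __init__(self, data_range, screen_low, screen_high):
        self.range = data_range
        self.screen_low = screen_low
        self.screen_high = screen_high

    def map_screen(self, value):
        span = self.range.high - self.range.low
        fraction = (value - self.range.low) / span
        return self.screen_low + fraction * (self.screen_high - self.screen_low)


class Axis:
    """Hypothetical visual axis; its placement is independent of the mapping."""

    def __init__(self, mapper, position):
        self.mapper = mapper
        self.position = position  # e.g. "left", "right", or detached
        self.needs_redraw = True
        mapper.range.add_listener(self._range_changed)

    def _range_changed(self):
        self.needs_redraw = True

    def tick_positions(self, count):
        """Screen positions of `count` evenly spaced ticks (count >= 2)."""
        low, high = self.mapper.range.low, self.mapper.range.high
        step = (high - low) / (count - 1)
        return [self.mapper.map_screen(low + i * step) for i in range(count)]
```

Two Axis objects built on mappers that share one DataRange stay in sync
without knowing about each other.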

> = Chaco and Kiva =
>
> It is a good idea for an enterprising developer to take a careful look
> at the current Chaco and Kiva to see if we can further integrate with
> them.  I am gun shy because they seem formidable and complex, and one
> of my major goals here is to streamline and simplify, but they are
> incredible pieces of work and we need to carefully consider them,
> especially as we integrate other parts of the enthought suite into our
> core, eg traits, increasing the possibility of synergies.

I'm really glad to read this, because I think there are clearly a lot
of common problems that we all have to solve.  At its core, Chaco is  
not *that* complex - it's just rather poorly documented, and that is  
no one's fault but mine.  The structure, however, is really pretty  
straightforward.  Its container/component model is not much more  
complicated than what a minimal solution to some of the problems I've  
outlined in previous paragraphs would entail.

I guess the key question I would ask is this: What is the vision, or  
driving purpose, behind mpl1?  Is it to develop a better backend  
architecture for pylab, or something more?  I ask this because some  
of the designs you have proposed for various pieces of mpl1 look very  
much like they are trying to solve the same problems that we're  
trying to solve in Chaco; if you really are quite prepared to "break  
the hell out of matplotlib", I think that now would be a really good  
time for collaboration.  :)



-Peter

