On Wed, Mar 24, 2010 at 4:19 PM, Michael Droettboom <md...@stsci.edu> wrote:
> Rich Krauter wrote:
>>>
>>> Rich Krauter wrote:
>>>
>>>>
>>>> Hello,
>>>>
>>>> I am a relatively new user of matplotlib; thank you to the matplotlib
>>>> team for this excellent package.
>>>> I have a question about serializing matplotlib figures. I have searched
>>>> for serialization options for matplotlib figures but have not found much
>>>> information. I am interested to hear about serialization use cases and the
>>>> approaches others use in these cases.
>>>>
>>>> Here is the reason I am asking:
>>>>
>>>> My use case for serialization is that I want to build a CouchDB database
>>>> of matplotlib figures. The database could be accessed from a web
>>>> application (in my case I want to build a django app to create, edit and
>>>> manage figures) or desktop gui, or whatever. For storage of the figures in
>>>> CouchDB, I am working on JSON representations of matplotlib figures. The
>>>> JSON could be run through simple python functions to regenerate the
>>>> matplotlib figures. I have very simple working examples, but to more
>>>> completely test out this approach I would attempt to recreate the plots in
>>>> the matplotlib gallery using JSON representations and a small set of
>>>> (hopefully) very simple python functions which would process the JSON
>>>> markup.
>>>>
>>>> Before I get too far, I wanted to see what others have done for similar
>>>> use cases, make sure I am not missing existing approaches, etc. I am
>>>> getting ahead of myself now, but if there is broader interest in this
>>>> approach, and no other better solutions exist, I would set up a project on
>>>> Google Code or some other site to work on this.
>>>>
>>
>> On Wed, Mar 24, 2010 at 1:15 PM, Michael Droettboom <md...@stsci.edu>
>> wrote:
>>
>>>
>>> What is the advantage of JSON (in this specific case) over Python source
>>> code? matplotlib is designed around it and it's more flexible.
>>> Unless you're planning on automatically manipulating the JSON, I don't
>>> see why you wouldn't just use Python source.
>>>
>>> Mike
>>>
>>>
>>
>> Mike,
>>
>> I don't know that there is much of a benefit to JSON outside of my use
>> case or similar use cases. I want to manipulate the JSON
>> representation of a figure within a javascript-based web interface to
>> provide dynamic plotting through a web page. I also want to be able
>> to store and query JSON representations using CouchDB.
>>
>> I am probably not exactly clear on what you mean by "using python
>> source" to represent a figure. Is there a standard agreed upon way to
>> do this?
>
> In general, most matplotlib users write Python scripts to generate their
> plots. These scripts usually read in data from an external file in any
> number of formats (the format tends to be domain-specific, but matplotlib
> provides support for a number of CSV formats; Numpy itself supports a number
> of ways of reading arrays, etc.). matplotlib tends to be agnostic about data
> (as long as you can convert it to a Numpy array somehow, it's happy), but
> has a clearly defined API for plot types and styles.
>>
>> I do have python source code representations of figures,
>> i.e. I have dict representations of matplotlib figures. The dicts
>> have a "required" internal structure. I feed the dict to a function
>> which regenerates the figure graphic from that structure. If I want
>> to update the plot, I just change the contents of the dict data
>> structure representing the plot, not the source code that is used to
>> generate the figure. If I instead had a JSON object representation of
>> a figure, I would convert it to a python dict and use the same
>> function as before to produce the figure.
>>
>
> I guess I have trouble seeing why a dictionary representation which is then
> interpreted to convert it to function calls is better than just making the
> function calls directly.
> That's the "interface" to matplotlib that is known
> and tested.
>
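For concreteness, the dict-driven approach discussed above (a dict with a required structure, fed to one function, with JSON converted to a dict first) might look roughly like the sketch below. This is only an illustration under stated assumptions: the spec layout, `render`, and `RecordingAxes` are hypothetical names and not part of matplotlib or of the code mentioned in this thread. The stand-in class just records calls so that the sketch runs without matplotlib installed.

```python
import json

# Hypothetical spec layout (an assumption, not a matplotlib format):
# axis labels plus a list of data series.
SPEC_JSON = """
{
  "title": "Example",
  "xlabel": "x",
  "ylabel": "y",
  "series": [
    {"x": [0, 1, 2], "y": [0, 1, 4], "label": "squares"}
  ]
}
"""


def render(spec, ax):
    """Replay a figure spec (a plain dict) as calls on an Axes-like object."""
    for s in spec["series"]:
        ax.plot(s["x"], s["y"], label=s.get("label"))
    ax.set_title(spec.get("title", ""))
    ax.set_xlabel(spec.get("xlabel", ""))
    ax.set_ylabel(spec.get("ylabel", ""))
    return ax


class RecordingAxes:
    """Stand-in for a matplotlib Axes that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Any unknown attribute becomes a method that logs its arguments.
        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))
        return record


# JSON -> dict -> the same render function used for native dicts.
spec = json.loads(SPEC_JSON)
ax = render(spec, RecordingAxes())
called = [name for name, _, _ in ax.calls]
```

With matplotlib installed, passing a real Axes (for example the one returned by `matplotlib.pyplot.subplots()`) in place of `RecordingAxes()` should draw the plot, since `plot`, `set_title`, `set_xlabel`, and `set_ylabel` are ordinary Axes methods.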
Here are my reasons why a structured representation (dict, JSON, XML,
...) is useful:

- I want to access the same plot representation through both python
  and through javascript. I need to access it in python to run MPL and
  create plot images, and I want to use javascript to build the user
  interface.
- I want to separate the plot content from the plot generation. I can
  serialize a data structure containing plot contents more easily than
  I can record the commands a user might call to generate a plot. The
  content of the plot is not python specific, only the generation of
  the MPL plot is. I need to be able to serialize the content to
  support later modifications.

> The only use case I can imagine where a dictionary might be preferable would
> be if an external tool needs to read in the dictionary, modify it and spit
> it back out. Reading arbitrary Python code is of course extremely hairy,
> whereas the JSON dictionary could be defined to be a more limited and
> manageable subset. Another possible advantage may be security related -- if
> you need to run untrusted plot code, you certainly don't want to be running
> untrusted Python code.
>>
>> I haven't found much discussion about serialization of matplotlib
>> figures, but I probably have not searched well enough, or maybe it is
>> not a high interest topic. The discussion I have found seems to
>> suggest using the script you used to create the figure as the
>> serialization of that figure. To modify the figure, you modify the
>> script and rerun it.
>
> Yes -- that's the general consensus (at least among the core developers)
> when the discussion comes up. There have been discussions and experiments
> using enthought.Traits that might make plots serializable and malleable, but
> it's a significant refactoring of matplotlib to take such an approach, for a
> fairly minor gain. It's also extremely difficult to invent a serialization
> that would survive version upgrades to matplotlib.
> One advantage of the script approach is that when APIs change in a
> backward-incompatible way, it is generally easier for end users to update
> their plots. If plots were in a less human readable/writable format the
> changes required may be less obvious.
>>

I can see why an MPL-internal serialization capability would be low on
the priority list. It's hard to do and no one is really asking for it.
I don't think you were implying this, but just to be clear I am not
requesting a change to MPL or complaining about its functionality.
Hope I didn't give the impression that I was. Agreed that API changes
could be difficult to deal with.

>> What I would like to have (and what I have some
>> very preliminary examples for) are versioned data structures that can
>> be converted to matplotlib figures without modifying any python source
>> code (other than the structured representation of the figure itself.)
>> However, I don't know how much the matplotlib API changes, and an
>> approach like this may be very sensitive to those changes.
>>
>
> I don't understand the motivation to avoid modifying Python source code. If
> you want to have common functionality that needs to change en masse, you can
> use Python functions in a library. You could write Python scripts defining
> a plot that are nothing more than data and a single function call to said
> library.

I am not opposed to modifying python source code. What I meant is that
I tried to separate the content of the figure from its generation.
There is nothing python-specific about the content of a figure. To
change a plot I change its content (as represented by a python dict,
XML, JSON, etc.), not the python code used to generate the figure from
the content. I can add support for other MPL features by changing the
JSON, XML, python dict representation; I shouldn't have to add server
side python code to support additional MPL features.
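To make the "versioned data structure" idea a little more concrete, here is a rough sketch of one way it could work. It is an assumption-laden illustration, not existing code: the spec layout, the `version` field, `upgrade`, `replay`, and the version 1 to version 2 change are all invented for this example. Only the method names in the allowlist are real matplotlib Axes methods. Restricting replay to an allowlist also relates to the point above about not running untrusted Python code.

```python
# Hypothetical allowlist: only these Axes method names may be replayed.
ALLOWED_METHODS = {
    "plot", "scatter", "bar",
    "set_title", "set_xlabel", "set_ylabel", "legend",
}
CURRENT_VERSION = 2


def upgrade(spec):
    """Return a copy of the spec migrated to CURRENT_VERSION."""
    spec = dict(spec)
    if spec.get("version", 1) == 1:
        # Assumed change, for illustration only: version 1 kept the title
        # as a top-level key; version 2 stores it as an ordinary call.
        calls = list(spec.get("calls", []))
        if "title" in spec:
            calls.append({"method": "set_title", "args": [spec.pop("title")]})
        spec["calls"] = calls
        spec["version"] = 2
    if spec["version"] != CURRENT_VERSION:
        raise ValueError(f"unsupported spec version: {spec['version']!r}")
    return spec


def replay(spec, ax):
    """Upgrade the spec, then replay its calls on an Axes-like object."""
    spec = upgrade(spec)
    for call in spec["calls"]:
        name = call["method"]
        if name not in ALLOWED_METHODS:
            raise ValueError(f"method not allowed: {name!r}")
        getattr(ax, name)(*call.get("args", []), **call.get("kwargs", {}))
    return spec


class RecordingAxes:
    """Stand-in for a matplotlib Axes that only records the calls it receives."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(*args, **kwargs):
            self.calls.append((name, args, kwargs))
        return record


old_spec = {
    "version": 1,
    "title": "Demo",
    "calls": [
        {"method": "plot", "args": [[0, 1], [1, 3]], "kwargs": {"color": "red"}},
    ],
}
ax = RecordingAxes()
new_spec = replay(old_spec, ax)
```

In this layout, supporting another plot type means adding a name to the allowlist and writing a new entry in the data, while a breaking change to the layout is handled once in `upgrade` rather than in every stored figure. How well that holds up against real matplotlib API changes is exactly the open question raised above.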
> Are you indexing the JSON at a fine-grained level in the couchdb, or are
> they ultimately just blobs anyway? In which case a Python blob or a JSON
> blob should make no difference.
>

Good point, they will probably mostly be blobs, with some associated
metadata to query against.

> I'm not trying to dissuade you from creating a JSON frontend if there's a
> strong advantage. But keeping that frontend in sync with the progress of
> matplotlib may be difficult, depending on how much coverage you want to
> provide.
>
> Mike

Understood, and thanks for the input.

Rich

_______________________________________________
Matplotlib-users mailing list
Matplotlib-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-users