Re: [matplotlib-devel] low-hanging fruit on the transforms branch

2007-11-06 Thread Michael Droettboom
Eric Firing wrote:
> Mike,
> 
> I made a quick test and took a quick look, and I certainly see a ripe 
> mango within reach.  I don't know what your constraints and strategy 
> are, but I thought I would give you the off-the-cuff idea before I 
> forget what I did.
> 
> The test was pcolortest.py, and the kcachegrind input is the .log file.
> 
> The problem is the path initializer: it is converting everything to a 
> masked array, which in the vast majority of cases is not needed, and is 
> very costly.

Thanks for finding this.  I agree completely.  I think that was 
basically a typo that ended up "working", just suboptimally.  The input 
to the path constructor may be either a numpy array, an ma array or a 
regular Python sequence.  If it's the first two, it should be left alone 
(if there is an array mask, it is dealt with later on in the 
constructor), but if the latter, it should be converted to a numpy array.

What I meant to type was:

    if not ma.isMaskedArray(vertices):
        vertices = npy.asarray(vertices, npy.float_)

The argument against just "npy.asarray(vertices, npy.float_)" is that 
the mask needs to be preserved.

If I understand correctly, that will be essentially a no-op when the 
input is a numpy array, albeit with the overhead of some checks.
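
For anyone following along, the check described above can be sketched as a standalone function. This is only a minimal sketch, not the actual Path constructor: the helper name `normalize_vertices` is hypothetical, and it assumes the modern `numpy` / `numpy.ma` import names and `dtype=float` rather than the `npy.float_` spelling used in the snippet.

```python
import numpy as np
import numpy.ma as ma


def normalize_vertices(vertices):
    """Hypothetical helper: leave arrays alone, convert plain sequences."""
    # Masked arrays are passed through untouched so the mask survives.
    if not ma.isMaskedArray(vertices):
        # For a float ndarray this returns the same object (no copy);
        # for a list or tuple it builds a new float array.
        vertices = np.asarray(vertices, dtype=float)
    return vertices


plain = np.array([[0.0, 0.0], [1.0, 1.0]])
masked = ma.masked_invalid(np.array([[0.0, np.nan], [1.0, 1.0]]))
from_list = normalize_vertices([[0, 0], [1, 1]])
```

With these inputs, `plain` and `masked` come back as the very same objects, and only the Python list is converted.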

> We need to think carefully about the levels of API, and what should be 
> done at which levels.  One possibility is that at the level of the path 
> initializer, only ordinary ndarrays should be acceptable--any mask 
> manipulations and compressions should already have been done.  This 
> would require a helper function to generate the codes for that case. 
> Another is that the path initializer could get a flag telling it whether 
> to check for masked arrays.  And another is that a check for existence 
> of a mask should be done at the start, and the mask processing done only 
> if there is a mask.

This option was the intent.

> Yet another is that if a mask is needed, it be 
> passed in as an optional 1-D array kwarg.  An advantage of this is that 
> the code that calls the path initializer may be in a better position to 
> know what is needed to generate the 1-D mask (that is, a mask for each 
> (x,y) point rather than for x and y separately)--that mask may already 
> be sitting around.

Many of these options I fear would significantly complicate the code. 
One of the driving motivations for the refactoring is to allow 
transformations to be combined more generally.  Think of the case where 
you have a polar plot with a logarithmic scale on the r-axis (this 
wasn't ever possible in the trunk).  The log scale means that there is 
potential for negative masked values, but the polar part of the 
transformation shouldn't have to know or care whether masked values are 
being passed through.  Requiring it to do so would need the same checks 
currently performed in the Path constructor, but they would be copied 
all over the code in every kind of new transformation.
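
To illustrate the point about composition, here is a rough sketch of a log step followed by a polar step. It is not the transforms-branch code: the function names are hypothetical and the `(theta, r)` column layout is an assumption. The log step produces the mask and the polar step does nothing mask-specific, it just does arithmetic.

```python
import numpy as np
import numpy.ma as ma


def log_scale_r(points):
    """Hypothetical log step: log10 of the r column of (theta, r) rows."""
    theta, r = points[:, 0], points[:, 1]
    # ma.log10 masks entries outside the logarithm's domain (r <= 0).
    return ma.column_stack([theta, ma.log10(r)])


def polar_to_cartesian(points):
    """Hypothetical polar step: contains no mask checks at all."""
    theta, r = points[:, 0], points[:, 1]
    # Any mask on r simply propagates through the arithmetic.
    return ma.column_stack([r * np.cos(theta), r * np.sin(theta)])


pts = np.array([[0.0, 10.0], [np.pi / 2, 100.0], [0.0, -1.0]])
xy = polar_to_cartesian(log_scale_r(pts))
```

The row with the negative r ends up masked in `xy` even though `polar_to_cartesian` never looks at a mask.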

FWIW, there already is a deliberate "quarantining" of masked arrays -- 
it happens where the logical elements of the plot hit the drawing 
commands of the plot (the Path object).  It could have been implemented 
such that the backends must understand masked arrays and draw 
accordingly, but it proved to be faster (based on the simple_plot_fps.py 
benchmark) to convert to a non-masked array with MOVETO codes upfront 
and reuse that.  (Not surprising, given the overhead of masked arrays). 
  This means that masked arrays are not used at all during panning and 
zooming operations where speed is perhaps the most crucial.
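
The "convert to a non-masked array with MOVETO codes" step could look roughly like the sketch below. It assumes an Nx2 vertex array; the function name is hypothetical and the numeric values of `MOVETO` and `LINETO` are illustrative, so this is not necessarily how the branch implements it.

```python
import numpy as np
import numpy.ma as ma

MOVETO, LINETO = 1, 2  # illustrative code values, assumed for this sketch


def compress_masked_vertices(vertices):
    """Hypothetical helper: drop masked points, break the line at each gap."""
    vertices = ma.asarray(vertices, dtype=float)
    # A point is unusable if either of its coordinates is masked.
    bad = ma.getmaskarray(vertices).any(axis=1)
    codes = np.full(len(vertices), LINETO, dtype=np.uint8)
    if len(vertices):
        codes[0] = MOVETO
    # The point that follows a masked point starts a new sub-path.
    codes[1:][bad[:-1]] = MOVETO
    good = ~bad
    return ma.getdata(vertices)[good], codes[good]


raw = np.array([[0, 0], [1, 1], [np.nan, 2], [3, 3], [4, 4]], dtype=float)
verts, codes = compress_masked_vertices(ma.masked_invalid(raw))
```

The result is a plain ndarray plus a code array, which can be computed once and reused while panning and zooming.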

> Masked arrays are pretty clunky and slow.  The maskedarray 
> implementation by Pierre GM is nicer, more complete, and faster for many 
> operations than numpy.ma, but it still adds a lot of overhead, 
> especially for small arrays.  (It needs to have its core in C; so far I 
> have failed dismally in trying to understand how to do that without 
> repeating the bulk of the ndarray code.)
> 
> A related point: can you (or is it OK if I do it) change all the "import 
> numpy.ma as ma" or whatever to "from matplotlib.numerix import npyma as 
> ma"?  The advantage is that it makes it easy to test the new version 
> with either maskedarray or ma.  This should be temporary; I am still 
> hoping and expecting that maskedarray will replace ma in the core numpy 
> distribution.

That sounds like a very good idea.  I'll go ahead and do this (on the 
branch only).

Cheers,
Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___

Re: [matplotlib-devel] STIX fonts (status with matplotlib)

2007-11-06 Thread Michael Droettboom
The STIX fonts are provided as OpenType wrappers around Adobe Compact 
Font Format (CFF or Type 2) fonts (sometimes called OpenType CFF or 
'OTTO' font, because of the header tag on these files).  With the Agg 
backend, this isn't a problem, since freetype supports these fonts 
transparently.  However, the Ps and Pdf backends are more-or-less 
hardwired for TrueType fonts and will need to have significant tricky 
bits of code glued on to support these fonts.



In the PDF backend, the fonts work, with the exception of subsetting.  I 
have patched the PDF backend so that it will fall back to the 
non-subsetting font output when it encounters an OpenType CFF font. 
This is not ideal, since the STIX fonts are so large.

The Ps backend does not support these fonts at all -- I have patched it 
to raise an exception when you try to use an OpenType CFF font.  New 
code would need to be written to extract the CFF section from the 
OpenType wrapper and embed it in the PostScript drawing.  I made a first 
attempt at this, but couldn't get it to work.  The PostScript spec 
provides an example of wrapping the CFF in a PostScript wrapper, but not 
actually embedding the CFF in a PostScript drawing...  Perhaps others on 
this list have some thoughts.

Agg, SVG and Cairo backends are working.

Long term, I think the font subsetting code should be rewritten so that 
it works with freetype (which understands all kinds of input font 
formats) on the input end, and then we should dump ttconv.  (I say that 
regretfully, having been the one to suggest ttconv in the first place). 
   The SVG backend already works like this.  The reason the Ps and Pdf 
backends didn't go that route was to avoid dealing with the intricacies 
of writing those more complex font formats.



All that said (and sorry for the gory details), if we just convert the 
OTF STIX fonts to TTF, all these problems go away.  matplotlib supports 
.ttf fonts quite well at this point.  As Darren pointed out, the STIX 
license allows this conversion:

"""
3. You may (a) convert the Fonts from one format to another (e.g.,
from TrueType to PostScript), in which case the normal and reasonable
distortion that occurs during such conversion shall be permitted and (b)
embed or include a subset of the Fonts in a document for the purposes of
allowing users to read text in the document that utilizes the Fonts. In
each case, you may use the STIX Fonts-TM mark to designate the resulting
Fonts or subset of the Fonts.
"""

This fontforge script seems to do the conversion quite well:

#!/usr/bin/fontforge
Open($1);
Generate($1:r+".ttf");
Quit(0);

If there are no objections, I'll go ahead and do that and commit the 
results.
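
If the script above were saved to a file, a batch conversion could be driven along these lines. This is only a sketch: the script filename `otf2ttf.pe` and the helper name are hypothetical, and it assumes `fontforge` is on the PATH and is invoked with its `-script` option. The helper only builds the command lines; actually running them is left to the caller.

```python
from pathlib import Path


def conversion_commands(font_dir, script="otf2ttf.pe"):
    """Build one `fontforge -script` command line per .otf file in font_dir."""
    return [
        ["fontforge", "-script", script, str(otf)]
        for otf in sorted(Path(font_dir).glob("*.otf"))
    ]


# Each command could then be run with, for example:
#     subprocess.run(cmd, check=True)
```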

Cheers,
Mike

Michael Droettboom wrote:
> STIX fonts seem to break with PDF or PS font subsetting.  Looking 
> into it...
> 
> Cheers,
> Mike
> 
> Michael Droettboom wrote:
>> The STIX fonts are now passing the mathtext_examples.py unit test.  This 
>> font blends much better with fonts like Times.
>>
>> The rcParam "mathtext.use_cm" (which is new since the last release) has 
>> been replaced with "mathtext.fontset" which takes either "cm", "stix" or 
>> "custom".  To use the STIX fonts, set it to "stix".  While "custom" 
>> mostly works with the STIX fonts, "stix" will turn on a little extra 
>> code that knows how to use the dynamically sized characters (such as the 
>> radical sign) from the correct STIX fonts.
>>
>> There are far more characters in the STIX fonts than in the Bakoma 
>> fonts, and many of them are not accessible through a "named" symbol, 
>> such as "\foo".  At present, matplotlib only understands the common math 
>> symbols in core LaTeX, and a handful of symbols defined in commonly used 
>> LaTeX extension packages.  Ideally, now that we have much more complete 
>> fonts, we could create mappings from all the symbols in the 
>> "Comprehensive LaTeX symbol list" to Unicode, but that's a considerable 
>> amount of bookkeeping work, unless someone else has already done it for 
>> some other project.  I suspect that there's a 90/10 rule here: 90% of 
>> users use 10% of the symbols, and vice versa.  (It may even be more like 
>> 99/1.)
>>
>> As a way around this, you can insert Unicode characters directly into 
>> the math string and it will correctly use that character in the STIX 
>> font.  For example, the following will produce a carriage return symbol:
>>
>>  ur"$\u23ce$"
>>
>> This even works for the *really* rare symbols (that don't have an 
>> official Unicode code point and have been placed in the "Private Use 
>> Area" codepage in a separate font file)... matplotlib has a little extra 
>> code to use the "Non-Unicode" fonts when necessary (when the codepoint 
>> is E000 - F8FF).
>>
>> Currently, there's no way to get at all of the fancy integral signs that 
>> STIX provides.
>>
>> Cheers,
>> Mike
>>
>> Michael Droettboom wrote:
>>> John Hunter wrote:
 On 11/5/07, Darren Dale <[EMAIL PR
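
As a small illustration of the "Private Use Area" range mentioned in the quoted message (E000 - F8FF), the check amounts to a code point comparison. This is a sketch with a hypothetical function name, not matplotlib's own code. It is written for Python 3, where the `ur"..."` prefix used above no longer exists and an ordinary string such as `"$\u23ce$"` is used instead.

```python
PUA_FIRST, PUA_LAST = 0xE000, 0xF8FF  # range quoted in the message above


def in_private_use_area(ch):
    """Return True if the single character ch lies in E000 - F8FF."""
    return PUA_FIRST <= ord(ch) <= PUA_LAST


math_string = "$\u23ce$"  # the return symbol example, Python 3 spelling
```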

Re: [matplotlib-devel] remove ipython hack?

2007-11-06 Thread Fernando Perez
On 11/5/07, John Hunter <[EMAIL PROTECTED]> wrote:
> On 11/4/07, Eric Firing <[EMAIL PROTECTED]> wrote:
> > John, Fernando,
> >
> > Is it OK to remove the hack now?  In pyplot.py:
> >
> > # a hack to keep old versions of ipython working with mpl after bug
> > # fix #1209354
>
> This was added in 2005 when mpl was at 0.83 and ipython was at 0.6.15,
> so yes, it is OK to remove it now.  If someone wants to run the latest
> mpl, surely they can upgrade ipython.

Agreed from this side.

Cheers,

f

Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel


Re: [matplotlib-devel] remove ipython hack?

2007-11-06 Thread Eric Firing
Fernando Perez wrote:
> On 11/5/07, John Hunter <[EMAIL PROTECTED]> wrote:
>> On 11/4/07, Eric Firing <[EMAIL PROTECTED]> wrote:
>>> John, Fernando,
>>>
>>> Is it OK to remove the hack now?  In pyplot.py:
>>>
>>> # a hack to keep old versions of ipython working with mpl after bug
>>> # fix #1209354
>> This was added in 2005 when mpl was at 0.83 and ipython was at 0.6.15,
>> so yes, it is OK to remove it now.  If someone wants to run the latest
>> mpl, surely they can upgrade ipython.
> 
> Agreed from this side.
> 
> Cheers,
> 
> f

Fernando,

Thanks.  I did it.

Eric



Re: [matplotlib-devel] STIX fonts

2007-11-06 Thread Michael Droettboom
Darren Dale wrote:
> On Sunday 04 November 2007 9:04:15 am Michael Droettboom wrote:
>> I should also add -- it would be really nice to have STIX fonts working in
>> the upcoming stable release if possible.  Hopefully tomorrow morning I can
>> assess how much work that will be and maybe delay tagging the release
>> slightly so this can be worked through.  It would be nice to remove the
>> Computer Modern fonts (in mathtext only), but they still serve a niche in
>> that they match the LaTeX fonts for users who can't/won't use usetex.  So
>> we're probably stuck with them for the long term even if STIX becomes a
>> nicer/cleaner option.
> 
> I haven't found sans-serif or monospaced fonts in their distribution. Maybe I 
> don't know where to look. I sent an email to the STIX website asking about 
> them, but haven't heard back from them. I tried opening the fonts in 
> fontforge, and there are a lot of missing glyphs.

Unicode includes a "Mathematical Alphanumeric Symbols" range 
1D400-1D7FF.  The STIXGeneral font includes some (not all) sans-serif 
and monospaced Greek and Latin glyphs in that range.  Mathtext won't 
actually use them yet, but doing so should be reasonably straightforward.
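
As a rough idea of what using that range might involve, the sketch below maps ASCII letters onto the sans-serif letters of the Mathematical Alphanumeric Symbols block. The function name is hypothetical and this is not mathtext code; it also says nothing about which of these glyphs STIXGeneral actually contains.

```python
import unicodedata

SANS_CAPITAL_A = 0x1D5A0  # MATHEMATICAL SANS-SERIF CAPITAL A
SANS_SMALL_A = 0x1D5BA    # MATHEMATICAL SANS-SERIF SMALL A


def to_math_sans(text):
    """Hypothetical helper: replace A-Z and a-z with sans-serif math letters."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(SANS_CAPITAL_A + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(SANS_SMALL_A + ord(ch) - ord("a")))
        else:
            out.append(ch)  # digits, punctuation, etc. are left alone
    return "".join(out)


converted = to_math_sans("Az+1")
names = [unicodedata.name(c) for c in converted[:2]]
```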

Cheers,
Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA



Re: [matplotlib-devel] low-hanging fruit on the transforms branch

2007-11-06 Thread Eric Firing
Mike,

Thanks for the quick response.

I was wrong as usual: the masked array overhead in your original version 
of the path initializer was actually small.  I misinterpreted the 
kcachegrind display.  Rats!  I was hoping for a big gain.  It looks like 
anything that makes a huge number of paths is going to be slow, no 
matter what we do to try to optimize the path initializer.

A partial solution for pcolor is pcolormesh, using the quadmesh 
extension code, although that still has a bug. (Paul Kienzle was going 
to look into it.) Is the quadmesh extension compatible with your 
transforms branch?

My impression is that the transforms branch is going to be a big step 
forward, with performance improvements in some areas, at worst minor 
penalties in others--except for some problems like pcolor that need to 
be solved.

In order to replace matlab in my application, a very fast interactive 
pcolor-type capability is absolutely essential.  I think this simply has 
to be done via extension code, like quadmesh and the image codes. 
(Pcolor in the trunk isn't fast enough, either.)  Unfortunately, I have 
found those codes hard to understand.  Only the regular-grid image code 
is fully integrated into the trunk, and even it has a long-standing bug 
revealed by extreme zooming.  The irregular-grid image routine might be 
a big help, but it has never been integrated.  I don't remember which 
bugs it shares with quadmesh and image, if any.

Eric




Re: [matplotlib-devel] STIX fonts (status with matplotlib)

2007-11-06 Thread John Hunter
On Nov 6, 2007 1:05 PM, Michael Droettboom <[EMAIL PROTECTED]> wrote:

> This fontforge script seems to do the conversion quite well:
>
> #!/usr/bin/fontforge
> Open($1);
> Generate($1:r+".ttf");
> Quit(0);
>
> If there are no objections, I'll go ahead and do that and commit the
> results.

This certainly seems to be the path of least resistance.  Would there
be much savings in terms of file size if we just extracted the fonts
we need, e.g. the parts we need for LaTeX expressions?  If a significant
portion of the fonts are relatively inaccessible Unicode, maybe we
should just distribute the subset we need.

JDH



Re: [matplotlib-devel] STIX fonts (status with matplotlib)

2007-11-06 Thread Michael Droettboom
John Hunter wrote:
> On Nov 6, 2007 1:05 PM, Michael Droettboom <[EMAIL PROTECTED]> wrote:
> 
>> This fontforge script seems to do the conversion quite well:
>>
>> #!/usr/bin/fontforge
>> Open($1);
>> Generate($1:r+".ttf");
>> Quit(0);
>>
>> If there are no objections, I'll go ahead and do that and commit the
>> results.
> 
> This certainly seems to be the path of least resistance.  Would there
> be much savings in terms of file size if we just extracted the fonts
> we need, e.g. the parts we need for LaTeX expressions?  If a significant
> portion of the fonts are relatively inaccessible Unicode, maybe we
> should just distribute the subset we need.

Now that subsetting works, I'd be more inclined to just include 
everything and let the backends sort it out.  The total filesize of all 
the STIX fonts converted to .ttf is 1,334,524 bytes.  Removing the font 
files that matplotlib can't currently access at all, the total is 
1,167,920 bytes.
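
Worked out from the two byte counts above (the variable names are just for illustration), dropping the inaccessible font files saves about 12.5% of the total:

```python
total_bytes = 1_334_524       # all STIX fonts converted to .ttf
accessible_bytes = 1_167_920  # after removing files matplotlib cannot use

saved_bytes = total_bytes - accessible_bytes
saved_fraction = saved_bytes / total_bytes

print(f"saved {saved_bytes} bytes ({saved_fraction:.1%} of the total)")
```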

I don't consider that to be a large amount of space anymore. 
(Sourceforge provides the hosting and bandwidth anyway, right? ;)

There is, of course, some time and memory overhead to loading larger 
fonts, but it may not be significant.

The other issue with subsetting the fonts before distributing them is 
just a matter of person-time: someone has to write the scripts to do it, 
and then re-run them when the STIX fonts are updated.

In the meantime, I'll commit the .ttf versions of the STIX fonts, and 
remove the .otf versions, which doesn't preclude subsetting the fonts 
further down the road.

Cheers,
Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA



Re: [matplotlib-devel] STIX fonts (status with matplotlib)

2007-11-06 Thread John Hunter
On Nov 6, 2007 1:36 PM, Michael Droettboom <[EMAIL PROTECTED]> wrote:
> There is, of course, some time and memory overhead to loading larger
> fonts, but it may not be significant.
>
> The other issue with subsetting the fonts before distributing them is
> just a matter of person-time: someone has to write the scripts to do it,
> and then re-run them when the STIX fonts are updated.

Sounds good to me -- if there were a big savings in file size I
thought it might be worth looking into, but he who writes the code
gets the most votes :-)

I think we should make stix the default for mathtext.fontset once you
get these changes incorporated, presuming PDF and PS work as expected
-- is there any reason not to?

JDH



Re: [matplotlib-devel] STIX fonts (status with matplotlib)

2007-11-06 Thread Michael Droettboom


John Hunter wrote:
> On Nov 6, 2007 1:36 PM, Michael Droettboom <[EMAIL PROTECTED]> wrote:
>> There is, of course, some time and memory overhead to loading larger
>> fonts, but it may not be significant.
>>
>> The other issue with subsetting the fonts before distributing them is
>> just a matter of person-time: someone has to write the scripts to do it,
>> and then re-run them when the STIX fonts are updated.
> 
> Sounds good to me -- if there were a big savings in file size I
> thought it might be worth looking into, but he who writes the code
> gets the most votes :-)

Not sure -- Maybe we were talking about different things...  I was 
talking about actually removing individual glyphs from the files -- that 
may result in much larger savings percentage-wise, but would be 
labor-intensive.  Removing whole font files is quite easy (and I 
actually did remove the files that mathtext can't currently access).

> I think we should make stix the default for mathtext.fontset once you
> get these changes incorporated, presuming PDF and PS work as expected
> -- is there any reason not to?

No reason, other than the usual "hasn't been tested as much", and how 
that may affect the upcoming release.  I anticipate more mis-mapped 
glyphs (I found some already, but I'm sure not all of them).  But it 
won't get tested much unless people are nudged into using it ;)  I'd 
encourage people on this list to kick it around a bit before we widen 
the audience.  But don't let that hold back the release -- it's always 
hard to know when to press that big red button... ;)

Cheers,
Mike

-- 
Michael Droettboom
Science Software Branch
Operations and Engineering Division
Space Telescope Science Institute
Operated by AURA for NASA



Re: [matplotlib-devel] STIX fonts (status with matplotlib)

2007-11-06 Thread John Hunter
On Nov 6, 2007 2:09 PM, Michael Droettboom <[EMAIL PROTECTED]> wrote:

> No reason, other than the usual "hasn't been tested as much", and how
> that may affect the upcoming release.  I anticipate more mis-mapped
> glyphs (I found some already, but I'm sure not all of them).  But it
> won't get tested much unless people are nudged into using it ;)  I'd

Release early, release often.  We can always roll out a bug fix
release, if Charlie is willing.

Charlie, what say you to an MPL 0.92.0 release for Monday or Tuesday
next week?  I will work on the release notes and updates to the web
page in the meantime...

JDH



Re: [matplotlib-devel] low-hanging fruit on the transforms branch

2007-11-06 Thread Michael Droettboom
Eric Firing wrote:
> Mike,
> 
> Thanks for the quick response.
> 
> I was wrong as usual: the masked array overhead in your original version 
> of the path initializer was actually small.  I misinterpreted the 
> kcachegrind display.  Rats!  I was hoping for a big gain.  It looks like 
> anything that makes a huge number of paths is going to be slow, no 
> matter what we do to try to optimize the path initializer.

Do you think the overhead is just from Python object creation?  On the 
trunk, pcolor ultimately creates a large nested array, which on the 
branch is then converted to a Python list of Path objects (each 
containing two arrays).  I can definitely imagine how that would be 
slower...

> A partial solution for pcolor is pcolormesh, using the quadmesh 
> extension code, although that still has a bug. (Paul Kienzle was going 
> to look into it.) Is the quadmesh extension compatible with your 
> transforms branch?

I'm going to say 'no', only because I wasn't aware of it ;)  Where is 
that code?  Trunk contains a "draw_quad_mesh" method in the Agg backend 
which has been replaced by creating Path objects in Python and sending 
them to a more generic draw_path_collection method.

> My impression is that the transforms branch is going to be a big step 
> forward, with performance improvements in some areas, at worst minor 
> penalties in others--except for some problems like pcolor that need to 
> be solved.

I hope you're right.  In all honesty, I actually haven't seen many 
performance improvements myself in what I've benchmarked.  Certainly 
there must be some degenerate synthetic benchmark I can devise to prove 
its superiority.  Seriously, I was focusing on generalization, 
ease of adding new transforms, and reduction of interactions with 
extension code -- all of which aren't certain to create performance 
gains.  I hope that through more pervasive use of numpy, however, some 
of the performance lost can be made up -- and the reduction in lines of 
code may make it easier to optimize more broadly.

> In order to replace matlab in my application, a very fast interactive 
> pcolor-type capability is absolutely essential.  I think this simply has 
> to be done via extension code, like quadmesh and the image codes. 
> (Pcolor in the trunk isn't fast enough, either.)  Unfortunately, I have 
> found those codes hard to understand.  Only the regular-grid image code 
> is fully integrated into the trunk, and even it has a long-standing bug 
> revealed by extreme zooming.  The irregular-grid image routine might be 
> a big help, but it has never been integrated.  I don't remember which 
> bugs it shares with quadmesh and image, if any.

We'll definitely have to find better ways of doing this stuff.

It would be really nice if all this could be done independently of the 
backends.  The primary reason for having the Path abstraction is so that 
backends only need to understand one basic thing.  Images (as they are 
now) are an exception to that for the obvious optimization benefits.  I 
wonder if generating the Path objects in an extension would be a 
worthwhile compromise to adding pcolor support in all the backends (hard 
to say what the factor of improvement might be).

If by interactive you mean "panning and zooming" performance, then the 
branch is already at least twice as fast as the trunk (finally a 
performance benchmark in favor of the branch!).  I converted your 
example into a benchmark (attached) and get these results:

trunk:

init:  1.39
fps: 0.448591422932

branch:

init:  5.31
fps: 1.08885017422

Of course, if you mean "interactive" as in "updating the data", then, 
yes, the branch has a long way to go to catch up to the trunk.
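
Putting the numbers above side by side (assuming "init" is a setup time in seconds, which the output does not state, and that "fps" is frames per second while panning and zooming):

```python
trunk = {"init": 1.39, "fps": 0.448591422932}
branch = {"init": 5.31, "fps": 1.08885017422}

# Higher fps is better; a higher init value means slower setup.
fps_speedup = branch["fps"] / trunk["fps"]
init_slowdown = branch["init"] / trunk["init"]

print(f"branch redraws {fps_speedup:.2f}x faster than trunk")
print(f"branch init takes {init_slowdown:.2f}x as long as trunk")
```

So on this benchmark the branch redraws roughly 2.4 times faster but takes roughly 3.8 times as long to initialize.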

Cheers,
Mike


Re: [matplotlib-devel] low-hanging fruit on the transforms branch

2007-11-06 Thread Eric Firing
Michael Droettboom wrote:
> Eric Firing wrote:
>> Mike,
>>
>> Thanks for the quick response.
>>
>> I was wrong as usual: the masked array overhead in your original 
>> version of the path initializer was actually small.  I misinterpreted 
>> the kcachegrind display.  Rats!  I was hoping for a big gain.  It 
>> looks like anything that makes a huge number of paths is going to be 
>> slow, no matter what we do to try to optimize the path initializer.
> 
> Do you think the overhead is just from Python object creation?  On the 
> trunk, pcolor ultimately creates a large nested array, which on the 
> branch is then converted to a Python list of Path objects (each 
> containing two arrays).  I can definitely imagine how that would be 
> slower...

I think the problem is at least partly the creation of a large number of 
numpy arrays; this is a known performance problem with numpy.  It is 
also partly python function call overhead.  In my benchmark, that path 
initializer gets called a *lot* of times, and every little bit of 
overhead adds up.
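
A rough way to see the cost of creating many small arrays is sketched below. It is not the profiling that was actually done; the function names are hypothetical, and the quad count is only chosen to match the 1000 x 100 grid in the benchmark. It builds the same vertex data once as a list of small arrays (one per quad, as a list of Path objects would hold) and once as a single large array.

```python
import time

import numpy as np

# Closed unit quad: five vertices, the last repeating the first.
QUAD = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]


def quads_as_many_arrays(n):
    """Build n quads, each as its own small (5, 2) vertex array."""
    return [np.asarray(QUAD, dtype=float) + i for i in range(n)]


def quads_as_one_array(n):
    """Build the same n quads as a single (n, 5, 2) array."""
    template = np.asarray(QUAD, dtype=float)
    offsets = np.arange(n, dtype=float)[:, None, None]
    return template[None, :, :] + offsets


def timed(func, n):
    """Return (result, elapsed seconds) for func(n)."""
    t0 = time.perf_counter()
    result = func(n)
    return result, time.perf_counter() - t0


if __name__ == "__main__":
    n = 100000  # one quad per cell of a 1000 x 100 grid
    many, t_many = timed(quads_as_many_arrays, n)
    one, t_one = timed(quads_as_one_array, n)
    print("%d small arrays: %.3f s; one large array: %.3f s" % (n, t_many, t_one))
```

The absolute times depend on the machine and the numpy version, so no expected output is given; the point is only that the per-array and per-call costs are paid n times in the first variant.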

> 
>> A partial solution for pcolor is pcolormesh, using the quadmesh 
>> extension code, although that still has a bug. (Paul Kienzle was going 
>> to look into it.) Is the quadmesh extension compatible with your 
>> transforms branch?
> 
> I'm going to say 'no', only because I wasn't aware of it ;)  Where is 
> that code?  Trunk contains a "draw_quad_mesh" method in the Agg backend 
> which has been replaced by creating Path objects in Python and sending 
> them to a more generic draw_path_collection method.

That's it--"draw_quad_mesh" in Agg backend.

> 
>> My impression is that the transforms branch is going to be a big step 
>> forward, with performance improvements in some areas, at worst minor 
>> penalties in others--except for some problems like pcolor that need to 
>> be solved.
> 
> I hope you're right.  In all honesty, I actually haven't seen many 
> performance improvements myself in what I've benchmarked.  Certainly 
> there must be some degenerate synthetic benchmark I can devise to prove 
> its superiority.  Seriously, I was focusing on generalization, 
> ease of adding new transforms, and reduction of interactions with 
> extension code -- all of which aren't certain to create performance 
> gains.  I hope that through more pervasive use of numpy, however, some 
> of the performance lost can be made up -- and the reduction in lines of 
> code may make it easier to optimize more broadly.

I think a fair amount of extension code may remain necessary, but your 
reorganization should make it easier to have a clean interface to the 
extension code.

> 
>> In order to replace matlab in my application, a very fast interactive 
>> pcolor-type capability is absolutely essential.  I think this simply 
>> has to be done via extension code, like quadmesh and the image codes. 
>> (Pcolor in the trunk isn't fast enough, either.)  Unfortunately, I 
>> have found those codes hard to understand.  Only the regular-grid 
>> image code is fully integrated into the trunk, and even it has a 
>> long-standing bug revealed by extreme zooming.  The irregular-grid 
>> image routine might be a big help, but it has never been integrated.  
>> I don't remember which bugs it shares with quadmesh and image, if any.
> 
> We'll definitely have to find better ways of doing this stuff.
> 
> It would be really nice if all this could be done independently of the 
> backends.  The primary reason for having the Path abstraction is so that 
> backends only need to understand one basic thing.  Images (as they are 
> now) are an exception to that for the obvious optimization benefits.  I 
> wonder if generating the Path objects in an extension would be a 
> worthwhile compromise to adding pcolor support in all the backends (hard 
> to say what the factor of improvement might be).

1) It might indeed be possible to speed up path object generation via 
fairly simple extension code using the numpy C API.  We will have to 
look more closely to see where the time is being taken.

2) If we take Agg as the standard core for interactive backends, then it 
is not so crucial to optimize the non-interactive backends, although it 
would still be nice.  Alternatively, it may be that the way to go for 
all very complex plots on all backends is to use an image for the 
complex part.  Matlab did something like this, with their "zbuffer" 
versus "painters" renderer choice, but they fouled it up--with zbuffer, 
everything was bitmapped, including lines and fonts, so the result 
looked terrible, and the ps files were sometimes too big to print.  (I 
know very little about options for images in svg, ps and pdf.)  This was 
many years ago; I haven't run into this sort of problem with Matlab more 
recently.

> 
> If by interactive you mean "panning and zooming" performance, then the 
> branch is already at least twice as fast as the trunk (finally a 
> performance benchmark in favor of the branch!).  I converted yo

Re: [matplotlib-devel] low-hanging fruit on the transforms branch

2007-11-06 Thread Michael Droettboom

Attaching benchmark.

from numpy.random import rand
import matplotlib
from matplotlib.pyplot import pcolor, savefig, show, ion, axis, draw, axes
import time

# Interactive mode, so each plotting command updates the figure.
ion()

# Time the initial creation of a 1000 x 100 pcolor plot.
t = time.clock()
pcolor(rand(1000,100))
print "init: ", time.clock() - t

# Time 25 redraws while shrinking the y range from full toward half.
frames = 25.0
t = time.clock()
for i in xrange(int(frames)):
    part = (1.0 - (i / frames) / 2.0)
    axes().set_ylim((0.0, 1000.0 * part))
    draw()
print "fps:", frames / (time.clock() - t)

# show()

-
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >> http://get.splunk.com/
___
Matplotlib-devel mailing list
Matplotlib-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/matplotlib-devel


[matplotlib-devel] internal enthought.traits package: a progress report

2007-11-06 Thread Darren Dale
I have been working on updating the trunk to provide enthought.traits version 
2.6b1. backend_driver.py is running without exceptions using the traited 
config package with the internal traits package.

Issues:

1) There are lots of absolute package imports scattered throughout traits' 
code. I worked around this by adding a line to matplotlib/__init__.py:

sys.path.append(os.path.split(__file__)[0])

This lets matplotlib access enthought.traits without modifying Enthought's code 
(any more than Gael had already done by stripping the pkg_resources imports).

2) When I tried updating rc_traits.py to import matplotlib.enthought.traits 
instead of enthought.traits (which isn't on the PYTHONPATH), I discovered a 
problem:

enthought.traits.trait_errors.TraitError: The 'parents_items' trait of a 
ViewElements instance must be a TraitListEvent, but a value of 
 was specified.

So traits would be a behind-the-scenes package, for internal mpl use only.

3) We cannot include traits-3 without either adding setuptools as an external 
dependency (which is already true for python-2.3 users) or monkey-patching 
distutils. traits-3 includes some pyrex code, which standard distutils does 
not recognize.
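
The sys.path workaround from point 1 can be illustrated with a throwaway package tree. This is a minimal sketch, not the actual matplotlib layout: `hostpkg` stands in for matplotlib and `bundled_lib` for the bundled enthought package, and both names are hypothetical.

```python
import importlib
import os
import shutil
import sys
import tempfile


def build_demo_tree(root):
    """Write a host package that bundles a library using absolute imports."""
    host = os.path.join(root, "hostpkg")
    lib = os.path.join(host, "bundled_lib")
    os.makedirs(lib)
    with open(os.path.join(host, "__init__.py"), "w") as f:
        # Same idea as the line in point 1: put the host package's own
        # directory on sys.path so the bundled library can be found.
        f.write("import os\nimport sys\n"
                "sys.path.append(os.path.split(__file__)[0])\n")
    with open(os.path.join(lib, "__init__.py"), "w") as f:
        f.write("")
    with open(os.path.join(lib, "core.py"), "w") as f:
        f.write("VALUE = 42\n")
    with open(os.path.join(lib, "api.py"), "w") as f:
        # An absolute import, left exactly as the bundled library wrote it.
        f.write("from bundled_lib.core import VALUE\n")


def demo():
    """Return (importable before host import?, value seen after host import)."""
    root = tempfile.mkdtemp()
    build_demo_tree(root)
    saved_path = list(sys.path)
    sys.path.insert(0, root)
    try:
        try:
            importlib.import_module("bundled_lib.api")
            found_before = True
        except ImportError:
            found_before = False
        importlib.import_module("hostpkg")  # runs the sys.path.append line
        api = importlib.import_module("bundled_lib.api")
        return found_before, api.VALUE
    finally:
        # Undo the side effects so the demo can be run repeatedly.
        sys.path[:] = saved_path
        for name in list(sys.modules):
            if name.split(".")[0] in ("hostpkg", "bundled_lib"):
                del sys.modules[name]
        shutil.rmtree(root)


if __name__ == "__main__":
    print(demo())
```

Note that in this sketch the bundled library becomes importable under its top-level name for the whole process once the host package has been imported, which is a side effect worth weighing when deciding whether the package should stay internal.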

I have not committed my work to svn yet. I wanted to get some feedback on 
points 1 and 2 first. Is it acceptable to use traits internally, but not 
expose it to the end user? I think the answer is yes, and that this is even a 
benefit. If we want to make it an external dependency we can strip the 
package out without impacting any user code.

Darren



Re: [matplotlib-devel] low-hanging fruit on the transforms branch

2007-11-06 Thread Eric Firing
Mike,

On my machine, with pcolor from the trunk:
[EMAIL PROTECTED]:~/test$ python pcolortest2.py
init:  2.0
fps: 0.287026406429

And substituting pcolormesh for pcolor:
init:  0.27
fps: 5.48245614035

Now that's more like it!

Using image can be another order of magnitude faster than pcolormesh 
(but with limitations, of course). I suspect nonuniform image code is 
intermediate, but it is a long time since I have tried it.
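
For anyone wanting to repeat the pcolor versus pcolormesh comparison today, a minimal sketch follows. It assumes a current matplotlib on Python 3 with the Agg backend, and it differs from the script in this thread in a few ways: it draws off-screen, uses time.perf_counter, and includes the first draw in the "init" time. The function name `benchmark` and its parameters are made up for illustration.

```python
import time

import matplotlib
matplotlib.use("Agg")  # off-screen backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np


def benchmark(plot_name, shape=(1000, 100), frames=25):
    """Time plot creation and repeated redraws while zooming the y axis.

    plot_name is the name of an Axes method, e.g. "pcolor" or "pcolormesh".
    Returns (init_seconds, frames_per_second).
    """
    fig, ax = plt.subplots()
    data = np.random.rand(*shape)

    t0 = time.perf_counter()
    getattr(ax, plot_name)(data)
    fig.canvas.draw()  # include the first render in the init time
    init = time.perf_counter() - t0

    t0 = time.perf_counter()
    for i in range(frames):
        # Shrink the visible y range from the full height toward half.
        part = 1.0 - (i / frames) / 2.0
        ax.set_ylim(0.0, shape[0] * part)
        fig.canvas.draw()
    fps = frames / (time.perf_counter() - t0)

    plt.close(fig)
    return init, fps


if __name__ == "__main__":
    for name in ("pcolor", "pcolormesh"):
        init, fps = benchmark(name)
        print("%s: init %.2f s, fps %.2f" % (name, init, fps))
```

The numbers will not match the 2007 figures above, since both the library and the hardware have changed; only the relative ordering is of interest.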

Eric

Michael Droettboom wrote:
> Attaching benchmark.
> 
> from numpy.random import rand
> import matplotlib
> from matplotlib.pyplot import pcolor, savefig, show, ion, axis, draw, axes
> import time
> 
> ion()
> 
> t = time.clock()
> pcolor(rand(1000,100))
> print "init: ", time.clock() - t
> 
> frames = 25.0
> t = time.clock()
> for i in xrange(int(frames)):
>     part = (1.0 - (i / frames) / 2.0)
>     axes().set_ylim((0.0, 1000.0 * part))
>     draw()
> print "fps:", frames / (time.clock() - t)
> 
> # show()
> 




Re: [matplotlib-devel] internal enthought.traits package: a progress report

2007-11-06 Thread Gael Varoquaux
On Tue, Nov 06, 2007 at 09:00:23PM -0500, Darren Dale wrote:
> I have not committed my work to svn yet. I wanted to get some feedback on 
> points 1 and 2 first. Is it acceptable to use traits internally, but not 
> expose it to the end user? I think the answer is yes, and that this is even a 
> benefit. If we want to make it an external dependency we can strip the 
> package out without impacting any user code.

I agree that by itself this is a benefit. It might also be interesting to
discuss this use case on the enthought mailing list. If you tell these
guys that this is an important use case, that projects like ipython and
matplotlib would like to use traits as an internal dependency, but would
still like to be able to use the benefit of traits when interfacing with
other libraries, there might be a solution.

Gaël
