Re: [Numpy-discussion] Creating parallel curves
You can get a polygon buffer from http://angusj.com/delphi/clipper.php and write a Cython interface to it. HTH Niki ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Initializing an array to a constant value
I have a pretty silly question about initializing an array a to a given scalar value, say A. Most of the time I use a = np.ones(shape)*A, which seems to be the most widespread idiom, but I recently got interested in squeezing out some performance. I tried a = np.zeros(shape)+A, based on broadcasting, but it seems to be equivalent in terms of speed. The fastest so far is a = np.empty(shape) followed by a.fill(A), but that is a two-step instruction to do one thing, which I feel doesn't look very nice. Did I miss an all-in-one function like numpy.fill(shape, A)? Best, Pierre ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
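For reference, a minimal sketch of the three idioms being compared (the shape and the fill value below are arbitrary choices for illustration):

import numpy as np

shape, A = (1000, 1000), 3.14

a1 = np.ones(shape) * A    # allocate ones, then multiply by the constant
a2 = np.zeros(shape) + A   # allocate zeros, then add the constant by broadcasting
a3 = np.empty(shape)       # allocate uninitialized memory...
a3.fill(A)                 # ...and fill it in place (the two-step, fastest variant)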
[Numpy-discussion] Indexing 2d arrays by column using an integer array
Hi, Apologies if the following is a trivial question. I wish to index the columns of the following 2D array In [78]: neighbourhoods Out[78]: array([[8, 0, 1], [0, 1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6], [5, 6, 7], [6, 7, 8], [7, 8, 0]]) using the integer array In [76]: perf[neighbourhoods].argmax(axis=1) Out[76]: array([2, 1, 0, 2, 1, 0, 0, 2, 1]) to produce a 9-element array but can't find a way of applying the indices to the columns rather than the rows. Is this do-able without using loops? The looped version of what I want is np.array( [neighbourhoods[i][perf[neighbourhoods].argmax(axis=1)[i]] for i in xrange(neighbourhoods.shape[0])] ) Regards, -- Will Furnass Doctoral Student Pennine Water Group Department of Civil and Structural Engineering University of Sheffield Phone: +44 (0)114 22 25768 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Indexing 2d arrays by column using an integer array
I think the following is what you want: neighbourhoods[range(9), perf[neighbourhoods].argmax(axis=1)] -Travis On Feb 13, 2012, at 1:26 PM, William Furnass wrote: np.array( [neighbourhoods[i][perf[neighbourhoods].argmax(axis=1)[i]] for i in xrange(neighbourhoods.shape[0])] ) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
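Spelled out on a small made-up example, the trick is to pair each row index with the chosen column index, so exactly one element per row is selected:

import numpy as np

neighbourhoods = np.array([[8, 0, 1],
                           [0, 1, 2],
                           [1, 2, 3]])
cols = np.array([2, 1, 0])   # one column index per row

picked = neighbourhoods[np.arange(len(cols)), cols]
print(picked)                # [1 1 1]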
Re: [Numpy-discussion] Creating parallel curves
On Mon, Feb 13, 2012 at 1:01 AM, Niki Spahiev niki.spah...@gmail.com wrote: You can get polygon buffer from http://angusj.com/delphi/clipper.php and make cython interface to it. This should be built into GEOS as well, and the shapely package provides a python wrapper already. -Chris HTH Niki ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- Christopher Barker, Ph.D. Oceanographer Emergency Response Division NOAA/NOS/ORR (206) 526-6959 voice 7600 Sand Point Way NE (206) 526-6329 fax Seattle, WA 98115 (206) 526-6317 main reception chris.bar...@noaa.gov ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
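A rough sketch of the Shapely route (assuming the shapely package is available; the sample curve and offset distance are made up for illustration):

import numpy as np
from shapely.geometry import LineString

t = np.linspace(0.0, 2.0 * np.pi, 50)
curve = LineString(np.column_stack((t, np.sin(t))))

# Buffering a line by d yields a polygon whose boundary lies at distance d
# from the curve: essentially the two offset curves joined by end caps.
offset = curve.buffer(0.5)
boundary = np.array(offset.exterior.coords)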
Re: [Numpy-discussion] Indexing 2d arrays by column using an integer array
Thank you, that does the trick. Regards, Will On 13 February 2012 19:39, Travis Oliphant tra...@continuum.io wrote: I think the following is what you want: neighborhoods[range(9),perf[neighbourhoods].argmax(axis=1)] -Travis On Feb 13, 2012, at 1:26 PM, William Furnass wrote: np.array( [neighbourhoods[i][perf[neighbourhoods].argmax(axis=1)[i]] for i in xrange(neighbourhoods.shape[0])] ) ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Issue Tracking
On Mon, Feb 13, 2012 at 12:12 AM, Travis Oliphant tra...@continuum.iowrote: I'm wondering about using one of these commercial issue tracking plans for NumPy and would like thoughts and comments.Both of these plans allow Open Source projects to have unlimited plans for free. Free usage of a tool that's itself not open source is not all that different from using Github, so no objections from me. YouTrack from JetBrains: http://www.jetbrains.com/youtrack/features/issue_tracking.html This looks promising. It seems to have good Github integration, and I checked that you can easily export all your issues (so no lock-in). It's a company that isn't going anywhere (I hope), and they do a very nice job with PyCharm. JIRA: http://www.atlassian.com/software/jira/overview/tour/code-integration Haven't looked into this one in much detail. I happen to have a dislike for Confluence (their wiki system), so someone else can say some nice things about JIRA. Haven't tried either tracker though. Anyone with actual experience? What Mark Wiebe said about making it easy to manage the issues quickly and what Eric said about making sure there are interfaces with dense information content really struck chords with me. I have seen a lot of time wasted on issue management with Trac --- time that could be better spent on NumPy.I'd like to make issue management efficient --- even if it means a system separate from GitHub. Issue management is a very important part of the open-source process. While we're at it, our buildbot situation is much worse than our issue tracker situation. This also looks good (and free): http://www.jetbrains.com/teamcity/ Ralf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Issue Tracking
On Mon, Feb 13, 2012 at 12:12 AM, Travis Oliphant tra...@continuum.io wrote: I'm wondering about using one of these commercial issue tracking plans for NumPy and would like thoughts and comments.Both of these plans allow Open Source projects to have unlimited plans for free. Free usage of a tool that's itself not open source is not all that different from using Github, so no objections from me. YouTrack from JetBrains: http://www.jetbrains.com/youtrack/features/issue_tracking.html This looks promising. It seems to have good Github integration, and I checked that you can easily export all your issues (so no lock-in). It's a company that isn't going anywhere (I hope), and they do a very nice job with PyCharm. I do like the team behind JetBrains. And I've seen and heard good things about TeamCity. Thanks for reminding me about the build-bot situation. That is one thing I would like to address sooner rather than later as well. Thanks, -Travis JIRA: http://www.atlassian.com/software/jira/overview/tour/code-integration Haven't looked into this one in much detail. I happen to have a dislike for Confluence (their wiki system), so someone else can say some nice things about JIRA. Haven't tried either tracker though. Anyone with actual experience? What Mark Wiebe said about making it easy to manage the issues quickly and what Eric said about making sure there are interfaces with dense information content really struck chords with me. I have seen a lot of time wasted on issue management with Trac --- time that could be better spent on NumPy. I'd like to make issue management efficient --- even if it means a system separate from GitHub. Issue management is a very important part of the open-source process. While we're at it, our buildbot situation is much worse than our issue tracker situation. This also looks good (and free): http://www.jetbrains.com/teamcity/ Ralf ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Issue Tracking
Hi, On Mon, Feb 13, 2012 at 12:44 PM, Travis Oliphant tra...@continuum.io wrote: On Mon, Feb 13, 2012 at 12:12 AM, Travis Oliphant tra...@continuum.io wrote: I'm wondering about using one of these commercial issue tracking plans for NumPy and would like thoughts and comments. Both of these plans allow Open Source projects to have unlimited plans for free. Free usage of a tool that's itself not open source is not all that different from using Github, so no objections from me. YouTrack from JetBrains: http://www.jetbrains.com/youtrack/features/issue_tracking.html This looks promising. It seems to have good Github integration, and I checked that you can easily export all your issues (so no lock-in). It's a company that isn't going anywhere (I hope), and they do a very nice job with PyCharm. I do like the team behind JetBrains. And I've seen and heard good things about TeamCity. Thanks for reminding me about the build-bot situation. That is one thing I would like to address sooner rather than later as well. We've (nipy) got a buildbot collection working OK. If you want to go that way you are welcome to use our machines. It's a somewhat flaky setup though. http://nipy.bic.berkeley.edu/builders I have the impression that the Cython / SAGE team are happy with their Jenkins configuration. Ondrej did some nice stuff on integrating a build with the github pull requests: https://github.com/sympy/sympy-bot Some discussion of buildbot and Jenkins: http://vperic.blogspot.com/2011/05/continuous-integration-and-sympy.html See you, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Initializing an array to a constant value
On 13/02/2012 19:17, eat wrote: wouldn't it be nice if you could just write: a = np.empty(shape).fill(A) this would be possible if .fill(.) just returned self. Thanks for the tip. I had noticed several times that this was not working (because, of course, in the meantime I had forgotten it...) but I had totally overlooked the reason, just imagining there was some garbage-collection magic making my arrays vanish!! I find the syntax np.empty(shape).fill(A) would indeed be a good alternative to the burden of creating a new numpy.fill (or numpy.filled?) function. -- Pierre ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
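To make the pitfall concrete: ndarray.fill works in place and returns None, so chaining it onto the constructor silently loses the array.

import numpy as np

a = np.empty(3).fill(7.0)
print(a)        # None -- fill() modified a temporary and returned nothing

b = np.empty(3)
b.fill(7.0)     # the two-step form keeps a reference to the filled array
print(b)        # [ 7.  7.  7.]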
[Numpy-discussion] Fwd: Re: Creating parallel curves
-- Forwarded message -- From: Andrea Gavana andrea.gav...@gmail.com Date: Feb 13, 2012 11:31 PM Subject: Re: [Numpy-discussion] Creating parallel curves To: Jonathan Hilmer jkhil...@gmail.com Thank you Jonathan for this, it's exactly what I was looking for. I'll try it tomorrow on the 768 well trajectories I have and I'll let you know if I stumble upon any issue. If someone could shed some light on my problem number 2 (how to adjust the scaling/distance) so that the curves look parallel on a matplotlib graph even though the axes scales are different, I'd be more than grateful. Thank you in advance. Andrea. On Feb 13, 2012 4:32 AM, Jonathan Hilmer jkhil...@gmail.com wrote: Andrea, This is playing some tricks with 2D array expansion to make a tradeoff of memory for speed. Given two sets of matching vectors (one reference, given first, and a newly-expanded one, given second), it removes all points from the expanded vectors that aren't needed to describe the new contour.

def filter_expansion(x, y, x_expan, y_expan, distance_target, tol=1e-6):
    target_xx, expansion_xx = scipy.meshgrid(x, x_expan)
    target_yy, expansion_yy = scipy.meshgrid(y, y_expan)
    distance = scipy.sqrt((expansion_yy - target_yy)**2 + (expansion_xx - target_xx)**2)
    valid = distance.min(axis=1) >= distance_target*(1.-tol)
    return x_expan.compress(valid), y_expan.compress(valid)

# Jonathan

On Sun, Feb 12, 2012 at 2:31 PM, Robert Kern robert.k...@gmail.com wrote: On Sun, Feb 12, 2012 at 20:26, Andrea Gavana andrea.gav...@gmail.com wrote: I know, my definition of parallel was probably not orthodox enough. What I am looking for is to generate 2 curves that look graphically parallel enough to the original one, and not parallel in the true mathematical sense. There is a rigorous way to define the curve that you are looking for, and fortunately it gives some hints for implementation. For each point (x,y) in space, associate with it the nearest distance D from that point to the reference curve. The parallel curves are just two sides of the level set where D(x,y) is equal to the specified distance (possibly removing the circular caps that surround the ends of the reference curve). If performance is not a constraint, then you could just evaluate that D(x,y) function on a fine-enough grid and do marching squares to find the level set. matplotlib's contour plotting routines can help here. There is a hint in the PyX page that you linked to that you should consider. Angles in the reference curve become circular arcs in the parallel curves. So if your reference curve is just a bunch of line segments, then what you can do is take each line segment, and make parallel copies the same length to either side. Now you just need to connect up these parallel segments with each other. You do this by using circular arcs centered on the vertices of the reference curve. Do this on both sides. On the outer side, the arcs will go forward while on the inner side, the arcs will go backwards just like the cusps that you saw in your attempt. Now let's take care of that. You will have two self-intersecting curves consisting of alternating line segments and circular arcs. Parts of these curves will be too close to the reference curve. You will have to go through these curves to find the locations of self-intersection and remove the parts of the segments and arcs that are too close to the reference curve. This is tricky to do, but the formulae for segment-segment, segment-arc, and arc-arc intersection can be found online.
-- Robert Kern I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
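A rough sketch of the grid-based level-set idea described above (the grid resolution, offset distance and sample curve are arbitrary choices; for long curves the all-pairs distance computation below gets memory hungry):

import numpy as np
import matplotlib.pyplot as plt

# Reference curve sampled as a dense polyline.
t = np.linspace(0.0, 2.0 * np.pi, 150)
cx, cy = t, np.sin(t)

# Distance D(x, y) from every grid point to the nearest sample of the curve.
gx, gy = np.meshgrid(np.linspace(-1.0, 7.5, 150), np.linspace(-2.5, 2.5, 150))
D = np.sqrt((gx[..., None] - cx) ** 2 + (gy[..., None] - cy) ** 2).min(axis=-1)

# The level set D(x, y) == d traces the two offset curves (plus end caps).
plt.contour(gx, gy, D, [0.5], colors='r')
plt.plot(cx, cy, 'k')
plt.show()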
Re: [Numpy-discussion] Issue Tracking
On 2/13/12 2:56 PM, Matthew Brett wrote: I have the impression that the Cython / SAGE team are happy with their Jenkins configuration. I'm not aware of a Jenkins buildbot system for Sage, though I think Cython uses such a system: https://sage.math.washington.edu:8091/hudson/ We do have a number of systems we build and test Sage on, though I don't think we have continuous integration yet. I've CCd Jeroen Demeyer, who is the current release manager for Sage. Jeroen, do we have an automatic buildbot system for Sage? Thanks, Jason -- Jason Grout ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Issue Tracking
Hi, On Mon, Feb 13, 2012 at 2:33 PM, jason-s...@creativetrax.com wrote: On 2/13/12 2:56 PM, Matthew Brett wrote: I have the impression that the Cython / SAGE team are happy with their Jenkins configuration. I'm not aware of a Jenkins buildbot system for Sage, though I think Cython uses such a system: https://sage.math.washington.edu:8091/hudson/ We do have a number of systems we build and test Sage on, though I don't think we have continuous integration yet. I've CCd Jeroen Demeyer, who is the current release manager for Sage. Jeroen, do we have an automatic buildbot system for Sage? Ah - sorry - I was thinking of the Cython system on the SAGE server. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Issue Tracking
On Mon, Feb 13, 2012 at 12:56 PM, Matthew Brett matthew.br...@gmail.com wrote: I have the impression that the Cython / SAGE team are happy with their Jenkins configuration. So are we in IPython, thanks to Thomas Kluyver's recent leadership on this front it's now running quite smoothly: https://jenkins.shiningpanda.com/ipython/ I'm pretty sure Thomas is on this list, if you folks have any questions on the details of the setup. Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] Index Array Performance
Hi, I have a short piece of code where the use of an index array feels right, but incurs a severe performance penalty: It's about an order of magnitude slower than all other operations with arrays of that size. It comes up in a piece of code which is doing a large number of on the fly histograms via hist[i,j] += 1 where i is an array with the bin index to be incremented and j is simply enumerating the histograms. I attach a full short sample code below which shows how it's being used in context, and corresponding timeit output from the critical code section. Questions:
- Is this a matter of principle, or due to an inefficient implementation?
- Is there an equivalent way of doing it which is fast?
Regards, Marcel

=

#! /usr/bin/env python
# Plot the bifurcation diagram of the logistic map
from pylab import *

Nx = 800
Ny = 600
I = 5

rmin = 2.5
rmax = 4.0
ymin = 0.0
ymax = 1.0

rr = linspace (rmin, rmax, Nx)
x = 0.5*ones(rr.shape)
hist = zeros((Ny+1,Nx), dtype=int)
j = arange(Nx)
dy = ymax/Ny

def f(x):
    return rr*x*(1.0-x)

for n in xrange(1000):
    x = f(x)

for n in xrange(I):
    x = f(x)
    i = array(x/dy, dtype=int)
    hist[i,j] += 1

figure()
imshow(hist, cmap='binary', origin='lower', interpolation='nearest',
       extent=(rmin,rmax,ymin,ymax), norm=matplotlib.colors.LogNorm())
xlabel ('$r$')
ylabel ('$x$')
title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$')
show()

In [4]: timeit y=f(x)
1 loops, best of 3: 19.4 us per loop
In [5]: timeit i = array(x/dy, dtype=int)
1 loops, best of 3: 22 us per loop
In [6]: timeit img[i,j] += 1
1 loops, best of 3: 119 us per loop

___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [Enthought-Dev] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3
On Feb 13, 2012, at 3:55 PM, Fernando Perez wrote: ... - Extra operators/PEP 225. Here's a summary from the last time we went over this, years ago at Scipy 2008: http://mail.scipy.org/pipermail/numpy-discussion/2008-October/038234.html, and the current status of the document we wrote about it is here: file:///home/fperez/www/site/_build/html/py4science/numpy-pep225/numpy-pep225.html. ... The link to the document isn't quite right. Please update it -- I can't wait for some nostalgic reading ;-) Travis___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [Enthought-Dev] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3
On Mon, Feb 13, 2012 at 3:46 PM, Travis Vaught tra...@vaught.net wrote: - Extra operators/PEP 225. Here's a summary from the last time we went over this, years ago at Scipy 2008: http://mail.scipy.org/pipermail/numpy-discussion/2008-October/038234.html, and the current status of the document we wrote about it is here: file:///home/fperez/www/site/_build/html/py4science/numpy-pep225/numpy-pep225.html. ... The link to the document isn't quite right. Please update it -- I can't wait for some nostalgic reading ;-) Oops, sorry; I pasted the local build url by accident: http://fperez.org/py4science/numpy-pep225/numpy-pep225.html And BTW, this discussion will take place on Friday March 2nd, most likely 3-5pm. We'll add that info to the pydata page as soon as it's finalized. Cheers, f ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Index Array Performance
On Mon, Feb 13, 2012 at 6:23 PM, Marcel Oliver m.oli...@jacobs-university.de wrote: Hi, I have a short piece of code where the use of an index array feels right, but incurs a severe performance penalty: It's about an order of magnitude slower than all other operations with arrays of that size. It comes up in a piece of code which is doing a large number of on the fly histograms via hist[i,j] += 1 where i is an array with the bin index to be incremented and j is simply enumerating the histograms. I attach a full short sample code below which shows how it's being used in context, and corresponding timeit output from the critical code section. Questions: - Is this a matter of principle, or due to an inefficient implementation? - Is there an equivalent way of doing it which is fast? Regards, Marcel = #! /usr/bin/env python # Plot the bifurcation diagram of the logistic map from pylab import * Nx = 800 Ny = 600 I = 5 rmin = 2.5 rmax = 4.0 ymin = 0.0 ymax = 1.0 rr = linspace (rmin, rmax, Nx) x = 0.5*ones(rr.shape) hist = zeros((Ny+1,Nx), dtype=int) j = arange(Nx) dy = ymax/Ny def f(x): return rr*x*(1.0-x) for n in xrange(1000): x = f(x) for n in xrange(I): x = f(x) i = array(x/dy, dtype=int) hist[i,j] += 1 figure() imshow(hist, cmap='binary', origin='lower', interpolation='nearest', extent=(rmin,rmax,ymin,ymax), norm=matplotlib.colors.LogNorm()) xlabel ('$r$') ylabel ('$x$') title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$') show() In [4]: timeit y=f(x) 1 loops, best of 3: 19.4 us per loop In [5]: timeit i = array(x/dy, dtype=int) 1 loops, best of 3: 22 us per loop In [6]: timeit img[i,j] += 1 1 loops, best of 3: 119 us per loop ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion

This suggests to me that fancy indexing could be quite a bit faster in this case:

In [40]: timeit hist[i,j] += 1
1 loops, best of 3: 58.2 us per loop
In [39]: timeit hist.put(np.ravel_multi_index((i, j), hist.shape), 1)
1 loops, best of 3: 20.6 us per loop

I wrote a simple Cython method

def fancy_inc(ndarray[int64_t, ndim=2] values,
              ndarray[int64_t] iarr, ndarray[int64_t] jarr, int64_t inc):
    cdef:
        Py_ssize_t i, n = len(iarr)
    for i in range(n):
        values[iarr[i], jarr[i]] += inc

that does even faster:

In [8]: timeit sbx.fancy_inc(hist, i, j, 1)
10 loops, best of 3: 4.85 us per loop

About 10% faster if bounds checking and wraparound are disabled. Kind of a bummer -- perhaps this should go high on the NumPy 2.0 TODO list? - Wes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
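Another pure-NumPy variant that might be worth timing alongside the above (a sketch, not code from the thread): count the flattened (i, j) positions with bincount and add the counts to the histogram in one shot, which also handles repeated index pairs correctly.

import numpy as np

def bincount_inc(hist, i, j):
    # Flatten the 2D indices, count occurrences, and accumulate in place.
    flat = np.ravel_multi_index((i, j), hist.shape)
    hist += np.bincount(flat, minlength=hist.size).reshape(hist.shape)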
[Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
Hi, I recently noticed a change in the upcasting rules in numpy 1.6.0 / 1.6.1 and I just wanted to check it was intentional. For all versions of numpy I've tested, we have:

import numpy as np
Adata = np.array([127], dtype=np.int8)
Bdata = np.int16(127)
(Adata + Bdata).dtype
dtype('int8')

That is - adding an integer scalar of a larger dtype does not result in upcasting of the output dtype, if the data in the scalar type fits in the smaller. For numpy < 1.6.0 we have this:

Bdata = np.int16(128)
(Adata + Bdata).dtype
dtype('int8')

That is - even if the data in the scalar does not fit in the dtype of the array to which it is being added, there is no upcasting. For numpy >= 1.6.0 we have this:

Bdata = np.int16(128)
(Adata + Bdata).dtype
dtype('int16')

There is upcasting... I can see why the numpy 1.6.0 way might be preferable but it is an API change I suppose. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
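One way to query the promotion rule directly, without building the sum (shown for a NumPy version with the 1.6-style behaviour):

import numpy as np

Adata = np.array([127], dtype=np.int8)
print(np.result_type(Adata, np.int16(127)))   # int8  -- the scalar value fits in int8
print(np.result_type(Adata, np.int16(128)))   # int16 -- 128 overflows int8, so upcast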
Re: [Numpy-discussion] Index Array Performance
How would you fix it? I shouldn't speculate without profiling, but I'll be naughty. Presumably the problem is that python turns that into something like hist[i,j] = hist[i,j] + 1 which means there's no way for numpy to avoid creating a temporary array. So maybe this could be fixed by adding a fused __inplace_add__ protocol to the language (and similarly for all the other inplace operators), but that seems really unlikely. Fundamentally this is just the sort of optimization opportunity you miss when you don't have a compiler with a global view; Fortran or c++ expression templates will win every time. Maybe pypy will fix it someday. Perhaps it would help to make np.add(hist, 1, out=hist, where=(i,j)) work? - N On Feb 14, 2012 12:18 AM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 6:23 PM, Marcel Oliver m.oli...@jacobs-university.de wrote: Hi, I have a short piece of code where the use of an index array feels right, but incurs a severe performance penalty: It's about an order of magnitude slower than all other operations with arrays of that size. It comes up in a piece of code which is doing a large number of on the fly histograms via hist[i,j] += 1 where i is an array with the bin index to be incremented and j is simply enumerating the histograms. I attach a full short sample code below which shows how it's being used in context, and corresponding timeit output from the critical code section. Questions: - Is this a matter of principle, or due to an inefficient implementation? - Is there an equivalent way of doing it which is fast? Regards, Marcel = #! /usr/bin/env python # Plot the bifurcation diagram of the logistic map from pylab import * Nx = 800 Ny = 600 I = 5 rmin = 2.5 rmax = 4.0 ymin = 0.0 ymax = 1.0 rr = linspace (rmin, rmax, Nx) x = 0.5*ones(rr.shape) hist = zeros((Ny+1,Nx), dtype=int) j = arange(Nx) dy = ymax/Ny def f(x): return rr*x*(1.0-x) for n in xrange(1000): x = f(x) for n in xrange(I): x = f(x) i = array(x/dy, dtype=int) hist[i,j] += 1 figure() imshow(hist, cmap='binary', origin='lower', interpolation='nearest', extent=(rmin,rmax,ymin,ymax), norm=matplotlib.colors.LogNorm()) xlabel ('$r$') ylabel ('$x$') title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$') show() In [4]: timeit y=f(x) 1 loops, best of 3: 19.4 us per loop In [5]: timeit i = array(x/dy, dtype=int) 1 loops, best of 3: 22 us per loop In [6]: timeit img[i,j] += 1 1 loops, best of 3: 119 us per loop ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion This suggests to me that fancy indexing could be quite a bit faster in this case: In [40]: timeit hist[i,j] += 11 loops, best of 3: 58.2 us per loop In [39]: timeit hist.put(np.ravel_multi_index((i, j), hist.shape), 1) 1 loops, best of 3: 20.6 us per loop I wrote a simple Cython method def fancy_inc(ndarray[int64_t, ndim=2] values, ndarray[int64_t] iarr, ndarray[int64_t] jarr, int64_t inc): cdef: Py_ssize_t i, n = len(iarr) for i in range(n): values[iarr[i], jarr[i]] += inc that does even faster In [8]: timeit sbx.fancy_inc(hist, i, j, 1) 10 loops, best of 3: 4.85 us per loop About 10% faster if bounds checking and wraparound are disabled. Kind of a bummer-- perhaps this should go high on the NumPy 2.0 TODO list? - Wes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Index Array Performance
On Mon, Feb 13, 2012 at 7:30 PM, Nathaniel Smith n...@pobox.com wrote: How would you fix it? I shouldn't speculate without profiling, but I'll be naughty. Presumably the problem is that python turns that into something like hist[i,j] = hist[i,j] + 1 which means there's no way for numpy to avoid creating a temporary array. So maybe this could be fixed by adding a fused __inplace_add__ protocol to the language (and similarly for all the other inplace operators), but that seems really unlikely. Fundamentally this is just the sort of optimization opportunity you miss when you don't have a compiler with a global view; Fortran or c++ expression templates will win every time. Maybe pypy will fix it someday. Perhaps it would help to make np.add(hist, 1, out=hist, where=(i,j)) work? - N Nope, don't buy it: In [33]: timeit arr.__iadd__(1) 1000 loops, best of 3: 1.13 ms per loop In [37]: timeit arr[:] += 1 1000 loops, best of 3: 1.13 ms per loop - Wes On Feb 14, 2012 12:18 AM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 6:23 PM, Marcel Oliver m.oli...@jacobs-university.de wrote: Hi, I have a short piece of code where the use of an index array feels right, but incurs a severe performance penalty: It's about an order of magnitude slower than all other operations with arrays of that size. It comes up in a piece of code which is doing a large number of on the fly histograms via hist[i,j] += 1 where i is an array with the bin index to be incremented and j is simply enumerating the histograms. I attach a full short sample code below which shows how it's being used in context, and corresponding timeit output from the critical code section. Questions: - Is this a matter of principle, or due to an inefficient implementation? - Is there an equivalent way of doing it which is fast? Regards, Marcel = #! /usr/bin/env python # Plot the bifurcation diagram of the logistic map from pylab import * Nx = 800 Ny = 600 I = 5 rmin = 2.5 rmax = 4.0 ymin = 0.0 ymax = 1.0 rr = linspace (rmin, rmax, Nx) x = 0.5*ones(rr.shape) hist = zeros((Ny+1,Nx), dtype=int) j = arange(Nx) dy = ymax/Ny def f(x): return rr*x*(1.0-x) for n in xrange(1000): x = f(x) for n in xrange(I): x = f(x) i = array(x/dy, dtype=int) hist[i,j] += 1 figure() imshow(hist, cmap='binary', origin='lower', interpolation='nearest', extent=(rmin,rmax,ymin,ymax), norm=matplotlib.colors.LogNorm()) xlabel ('$r$') ylabel ('$x$') title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$') show() In [4]: timeit y=f(x) 1 loops, best of 3: 19.4 us per loop In [5]: timeit i = array(x/dy, dtype=int) 1 loops, best of 3: 22 us per loop In [6]: timeit img[i,j] += 1 1 loops, best of 3: 119 us per loop ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion This suggests to me that fancy indexing could be quite a bit faster in this case: In [40]: timeit hist[i,j] += 11 loops, best of 3: 58.2 us per loop In [39]: timeit hist.put(np.ravel_multi_index((i, j), hist.shape), 1) 1 loops, best of 3: 20.6 us per loop I wrote a simple Cython method def fancy_inc(ndarray[int64_t, ndim=2] values, ndarray[int64_t] iarr, ndarray[int64_t] jarr, int64_t inc): cdef: Py_ssize_t i, n = len(iarr) for i in range(n): values[iarr[i], jarr[i]] += inc that does even faster In [8]: timeit sbx.fancy_inc(hist, i, j, 1) 10 loops, best of 3: 4.85 us per loop About 10% faster if bounds checking and wraparound are disabled. Kind of a bummer-- perhaps this should go high on the NumPy 2.0 TODO list? 
- Wes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Index Array Performance
On Mon, Feb 13, 2012 at 7:46 PM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 7:30 PM, Nathaniel Smith n...@pobox.com wrote: How would you fix it? I shouldn't speculate without profiling, but I'll be naughty. Presumably the problem is that python turns that into something like hist[i,j] = hist[i,j] + 1 which means there's no way for numpy to avoid creating a temporary array. So maybe this could be fixed by adding a fused __inplace_add__ protocol to the language (and similarly for all the other inplace operators), but that seems really unlikely. Fundamentally this is just the sort of optimization opportunity you miss when you don't have a compiler with a global view; Fortran or c++ expression templates will win every time. Maybe pypy will fix it someday. Perhaps it would help to make np.add(hist, 1, out=hist, where=(i,j)) work? - N Nope, don't buy it: In [33]: timeit arr.__iadd__(1) 1000 loops, best of 3: 1.13 ms per loop In [37]: timeit arr[:] += 1 1000 loops, best of 3: 1.13 ms per loop - Wes Actually, apologies, I'm being silly (had too much coffee or something). Python may be doing something nefarious with the hist[i,j] += 1. So both a get, add, then set, which is probably the problem. On Feb 14, 2012 12:18 AM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 6:23 PM, Marcel Oliver m.oli...@jacobs-university.de wrote: Hi, I have a short piece of code where the use of an index array feels right, but incurs a severe performance penalty: It's about an order of magnitude slower than all other operations with arrays of that size. It comes up in a piece of code which is doing a large number of on the fly histograms via hist[i,j] += 1 where i is an array with the bin index to be incremented and j is simply enumerating the histograms. I attach a full short sample code below which shows how it's being used in context, and corresponding timeit output from the critical code section. Questions: - Is this a matter of principle, or due to an inefficient implementation? - Is there an equivalent way of doing it which is fast? Regards, Marcel = #! 
/usr/bin/env python # Plot the bifurcation diagram of the logistic map from pylab import * Nx = 800 Ny = 600 I = 5 rmin = 2.5 rmax = 4.0 ymin = 0.0 ymax = 1.0 rr = linspace (rmin, rmax, Nx) x = 0.5*ones(rr.shape) hist = zeros((Ny+1,Nx), dtype=int) j = arange(Nx) dy = ymax/Ny def f(x): return rr*x*(1.0-x) for n in xrange(1000): x = f(x) for n in xrange(I): x = f(x) i = array(x/dy, dtype=int) hist[i,j] += 1 figure() imshow(hist, cmap='binary', origin='lower', interpolation='nearest', extent=(rmin,rmax,ymin,ymax), norm=matplotlib.colors.LogNorm()) xlabel ('$r$') ylabel ('$x$') title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$') show() In [4]: timeit y=f(x) 1 loops, best of 3: 19.4 us per loop In [5]: timeit i = array(x/dy, dtype=int) 1 loops, best of 3: 22 us per loop In [6]: timeit img[i,j] += 1 1 loops, best of 3: 119 us per loop ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion This suggests to me that fancy indexing could be quite a bit faster in this case: In [40]: timeit hist[i,j] += 11 loops, best of 3: 58.2 us per loop In [39]: timeit hist.put(np.ravel_multi_index((i, j), hist.shape), 1) 1 loops, best of 3: 20.6 us per loop I wrote a simple Cython method def fancy_inc(ndarray[int64_t, ndim=2] values, ndarray[int64_t] iarr, ndarray[int64_t] jarr, int64_t inc): cdef: Py_ssize_t i, n = len(iarr) for i in range(n): values[iarr[i], jarr[i]] += inc that does even faster In [8]: timeit sbx.fancy_inc(hist, i, j, 1) 10 loops, best of 3: 4.85 us per loop About 10% faster if bounds checking and wraparound are disabled. Kind of a bummer-- perhaps this should go high on the NumPy 2.0 TODO list? - Wes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Index Array Performance
On Mon, Feb 13, 2012 at 7:48 PM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 7:46 PM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 7:30 PM, Nathaniel Smith n...@pobox.com wrote: How would you fix it? I shouldn't speculate without profiling, but I'll be naughty. Presumably the problem is that python turns that into something like hist[i,j] = hist[i,j] + 1 which means there's no way for numpy to avoid creating a temporary array. So maybe this could be fixed by adding a fused __inplace_add__ protocol to the language (and similarly for all the other inplace operators), but that seems really unlikely. Fundamentally this is just the sort of optimization opportunity you miss when you don't have a compiler with a global view; Fortran or c++ expression templates will win every time. Maybe pypy will fix it someday. Perhaps it would help to make np.add(hist, 1, out=hist, where=(i,j)) work? - N Nope, don't buy it: In [33]: timeit arr.__iadd__(1) 1000 loops, best of 3: 1.13 ms per loop In [37]: timeit arr[:] += 1 1000 loops, best of 3: 1.13 ms per loop - Wes Actually, apologies, I'm being silly (had too much coffee or something). Python may be doing something nefarious with the hist[i,j] += 1. So both a get, add, then set, which is probably the problem. On Feb 14, 2012 12:18 AM, Wes McKinney wesmck...@gmail.com wrote: On Mon, Feb 13, 2012 at 6:23 PM, Marcel Oliver m.oli...@jacobs-university.de wrote: Hi, I have a short piece of code where the use of an index array feels right, but incurs a severe performance penalty: It's about an order of magnitude slower than all other operations with arrays of that size. It comes up in a piece of code which is doing a large number of on the fly histograms via hist[i,j] += 1 where i is an array with the bin index to be incremented and j is simply enumerating the histograms. I attach a full short sample code below which shows how it's being used in context, and corresponding timeit output from the critical code section. Questions: - Is this a matter of principle, or due to an inefficient implementation? - Is there an equivalent way of doing it which is fast? Regards, Marcel = #! 
/usr/bin/env python # Plot the bifurcation diagram of the logistic map from pylab import * Nx = 800 Ny = 600 I = 5 rmin = 2.5 rmax = 4.0 ymin = 0.0 ymax = 1.0 rr = linspace (rmin, rmax, Nx) x = 0.5*ones(rr.shape) hist = zeros((Ny+1,Nx), dtype=int) j = arange(Nx) dy = ymax/Ny def f(x): return rr*x*(1.0-x) for n in xrange(1000): x = f(x) for n in xrange(I): x = f(x) i = array(x/dy, dtype=int) hist[i,j] += 1 figure() imshow(hist, cmap='binary', origin='lower', interpolation='nearest', extent=(rmin,rmax,ymin,ymax), norm=matplotlib.colors.LogNorm()) xlabel ('$r$') ylabel ('$x$') title('Attractor of the logistic map $x_{n+1} = r \, x_n (1-x_n)$') show() In [4]: timeit y=f(x) 1 loops, best of 3: 19.4 us per loop In [5]: timeit i = array(x/dy, dtype=int) 1 loops, best of 3: 22 us per loop In [6]: timeit img[i,j] += 1 1 loops, best of 3: 119 us per loop ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion This suggests to me that fancy indexing could be quite a bit faster in this case: In [40]: timeit hist[i,j] += 11 loops, best of 3: 58.2 us per loop In [39]: timeit hist.put(np.ravel_multi_index((i, j), hist.shape), 1) 1 loops, best of 3: 20.6 us per loop I wrote a simple Cython method def fancy_inc(ndarray[int64_t, ndim=2] values, ndarray[int64_t] iarr, ndarray[int64_t] jarr, int64_t inc): cdef: Py_ssize_t i, n = len(iarr) for i in range(n): values[iarr[i], jarr[i]] += inc that does even faster In [8]: timeit sbx.fancy_inc(hist, i, j, 1) 10 loops, best of 3: 4.85 us per loop About 10% faster if bounds checking and wraparound are disabled. Kind of a bummer-- perhaps this should go high on the NumPy 2.0 TODO list? - Wes ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion But: In [40]: timeit hist[i, j] 1 loops, best of 3: 32 us per loop So that's roughly 7-8x slower than a simple Cython method, so I sincerely hope it could be brought down to the sub 10 microsecond level with a little bit of work.
Re: [Numpy-discussion] [IPython-dev] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3
I'd like the ability to make in (i.e., __contains__) return something other than a bool. Also, the ability to make the x < y < z syntax overridable would be useful. It's been suggested that the ability to override the boolean operators (and, or, not) would be the way to do this (PEP 335), though I'm not 100% convinced that's the way to go. Aaron Meurer On Mon, Feb 13, 2012 at 2:55 PM, Fernando Perez fperez@gmail.com wrote: Hi folks, [ I'm broadcasting this widely for maximum reach, but I'd appreciate it if replies can be kept to the *numpy* list, which is sort of the 'base' list for scientific/numerical work. It will make it much easier to organize a coherent set of notes later on. Apology if you're subscribed to all and get it 10 times. ] As part of the PyData workshop (http://pydataworkshop.eventbrite.com) to be held March 2 and 3 at the Mountain View Google offices, we have scheduled a session for an open discussion with Guido van Rossum and hopefully as many core python-dev members who can make it. We wanted to seize the combined opportunity of the PyData workshop bringing a number of 'scipy people' to Google with the timeline for Python 3.3, the first release after the Python language moratorium, being within sight: http://www.python.org/dev/peps/pep-0398. While a number of scientific Python packages are already available for Python 3 (either in released form or in their master git branches), it's fair to say that there hasn't been a major transition of the scientific community to Python3. Since there is no more development being done on the Python2 series, eventually we will all want to find ways to make this transition, and we think that this is an excellent time to engage the core python development team and consider ideas that would make Python3 generally a more appealing language for scientific work. Guido has made it clear that he doesn't speak for the day-to-day development of Python anymore, so we all should be aware that any ideas that come out of this panel will still need to be discussed with python-dev itself via standard mechanisms before anything is implemented. Nonetheless, the opportunity for a solid face-to-face dialog for brainstorming was too good to pass up. The purpose of this email is then to solicit, from all of our community, ideas for this discussion. In a week or so we'll need to summarize the main points brought up here and make a more concrete agenda out of it; I will also post a summary of the meeting afterwards here. Anything is a valid topic, some points just to get the conversation started: - Extra operators/PEP 225. Here's a summary from the last time we went over this, years ago at Scipy 2008: http://mail.scipy.org/pipermail/numpy-discussion/2008-October/038234.html, and the current status of the document we wrote about it is here: file:///home/fperez/www/site/_build/html/py4science/numpy-pep225/numpy-pep225.html. - Improved syntax/support for rationals or decimal literals? While Python now has both decimals (http://docs.python.org/library/decimal.html) and rationals (http://docs.python.org/library/fractions.html), they're quite clunky to use because they require full constructor calls. Guido has mentioned in previous discussions toying with ideas about support for different kinds of numeric literals... - Using the numpy docstring standard python-wide, and thus having python improve the pathetic state of the stdlib's docstrings?
This is an area where our community is light years ahead of the standard library, but we'd all benefit from Python itself improving on this front. I'm toying with the idea of giving a lighting talk at PyConn about this, comparing the great, robust culture and tools of good docstrings across the Scipy ecosystem with the sad, sad state of docstrings in the stdlib. It might spur some movement on that front from the stdlib authors, esp. if the core python-dev team realizes the value and benefit it can bring (at relatively low cost, given how most of the information does exist, it's just in the wrong places). But more importantly for us, if there was truly a universal standard for high-quality docstrings across Python projects, building good documentation/help machinery would be a lot easier, as we'd know what to expect and search for (such as rendering them nicely in the ipython notebook, providing high-quality cross-project help search, etc). - Literal syntax for arrays? Sage has been floating a discussion about a literal matrix syntax (https://groups.google.com/forum/#!topic/sage-devel/mzwepqZBHnA). For something like this to go into python in any meaningful way there would have to be core multidimensional arrays in the language, but perhaps it's time to think about a piece of the numpy array itself into Python? This is one of the more 'out there' ideas, but after all, that's the point of a discussion like this, especially considering we'll have both
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
Hmmm. This seems like a regression. The scalar casting API was fairly intentional. What is the reason for the change? -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 6:25 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I recently noticed a change in the upcasting rules in numpy 1.6.0 / 1.6.1 and I just wanted to check it was intentional. For all versions of numpy I've tested, we have: import numpy as np Adata = np.array([127], dtype=np.int8) Bdata = np.int16(127) (Adata + Bdata).dtype dtype('int8') That is - adding an integer scalar of a larger dtype does not result in upcasting of the output dtype, if the data in the scalar type fits in the smaller. For numpy < 1.6.0 we have this: Bdata = np.int16(128) (Adata + Bdata).dtype dtype('int8') That is - even if the data in the scalar does not fit in the dtype of the array to which it is being added, there is no upcasting. For numpy >= 1.6.0 we have this: Bdata = np.int16(128) (Adata + Bdata).dtype dtype('int16') There is upcasting... I can see why the numpy 1.6.0 way might be preferable but it is an API change I suppose. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] can_cast with structured array output - bug?
Hi, I've also just noticed this oddity: In [17]: np.can_cast('c', 'u1') Out[17]: False OK so far, but... In [18]: np.can_cast('c', [('f1', 'u1')]) Out[18]: True In [19]: np.can_cast('c', [('f1', 'u1')], 'safe') Out[19]: True In [20]: np.can_cast(np.ones(10, dtype='c'), [('f1', 'u1')]) Out[20]: True I think this must be a bug. In the other direction, it makes more sense to me: In [24]: np.can_cast([('f1', 'u1')], 'c') Out[24]: False In [25]: np.can_cast([('f1', 'u1')], [('f1', 'u1')]) Out[25]: True Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 5:00 PM, Travis Oliphant tra...@continuum.iowrote: Hmmm. This seems like a regression. The scalar casting API was fairly intentional. What is the reason for the change? In order to make 1.6 ABI-compatible with 1.5, I basically had to rewrite this subsystem. There were virtually no tests in the test suite specifying what the expected behavior should be, and there were clear inconsistencies where for example a+b could result in a different type than b+a. I recall there being some bugs in the tracker related to this as well, but I don't remember those details. This change felt like an obvious extension of an existing behavior for eliminating overflow, where the promotion changed unsigned - signed based on the value of the scalar. This change introduced minimal upcasting only in a set of cases where an overflow was guaranteed to happen without that upcasting. During the 1.6 beta period, I signaled that this subsystem had changed, as the bullet point starting The ufunc uses a more consistent algorithm for loop selection.: http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html The behavior Matthew has observed is a direct result of how I designed the minimization function mentioned in that bullet point, and the algorithm for it is documented in the 'Notes' section of the result_type page: http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html Hopefully that explains it well enough. I made the change intentionally and carefully, tested its impact on SciPy and other projects, and advocated for it during the release cycle. Cheers, Mark -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 6:25 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I recently noticed a change in the upcasting rules in numpy 1.6.0 / 1.6.1 and I just wanted to check it was intentional. For all versions of numpy I've tested, we have: import numpy as np Adata = np.array([127], dtype=np.int8) Bdata = np.int16(127) (Adata + Bdata).dtype dtype('int8') That is - adding an integer scalar of a larger dtype does not result in upcasting of the output dtype, if the data in the scalar type fits in the smaller. For numpy 1.6.0 we have this: Bdata = np.int16(128) (Adata + Bdata).dtype dtype('int8') That is - even if the data in the scalar does not fit in the dtype of the array to which it is being added, there is no upcasting. For numpy = 1.6.0 we have this: Bdata = np.int16(128) (Adata + Bdata).dtype dtype('int16') There is upcasting... I can see why the numpy 1.6.0 way might be preferable but it is an API change I suppose. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Issue Tracking
I'm wondering about using one of these commercial issue tracking plans for NumPy and would like thoughts and comments.Both of these plans allow Open Source projects to have unlimited plans for free. JIRA: http://www.atlassian.com/software/jira/overview/tour/code-integration At work we just transitioned off JIRA to TFS. Have to say, for bug tracking, JIRA was a lot better than TFS, not too good as a planning tool though. It is quite customizable and flexible. Nice ability to set up automatic e-mails and such as well. -- --- | Alan K. Jackson| To see a World in a Grain of Sand | | a...@ajackson.org | And a Heaven in a Wild Flower, | | www.ajackson.org | Hold Infinity in the palm of your hand | | Houston, Texas | And Eternity in an hour. - Blake | --- ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] [IPython-dev] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3
It might be nice to turn the matrix class into a short class hierarchy, something like this: class MatrixBase class DenseMatrix(MatrixBase) class TriangularMatrix(MatrixBase) # Maybe a few variations of upper/lower triangular and whether the diagonal is stored class SymmetricMatrix(MatrixBase) These other matrix classes could use packed storage, and could call the specific optimized BLAS/LAPACK functions to get higher performance when it is known the matrix is triangular or symmetric. I'm not sure whether this affects the discussion of the matrix * and \ operators, but it's a possibility to consider. -Mark On Mon, Feb 13, 2012 at 4:53 PM, Aaron Meurer asmeu...@gmail.com wrote: I'd like the ability to make in (i.e., __contains__) return something other than a bool. Also, the ability to make the x y z syntax would be useful. It's been suggested that the ability to override the boolean operators (and, or, not) would be the way to do this (pep 335), though I'm not 100% convinced that's the way to go. Aaron Meurer On Mon, Feb 13, 2012 at 2:55 PM, Fernando Perez fperez@gmail.com wrote: Hi folks, [ I'm broadcasting this widely for maximum reach, but I'd appreciate it if replies can be kept to the *numpy* list, which is sort of the 'base' list for scientific/numerical work. It will make it much easier to organize a coherent set of notes later on. Apology if you're subscribed to all and get it 10 times. ] As part of the PyData workshop (http://pydataworkshop.eventbrite.com) to be held March 2 and 3 at the Mountain View Google offices, we have scheduled a session for an open discussion with Guido van Rossum and hopefully as many core python-dev members who can make it. We wanted to seize the combined opportunity of the PyData workshop bringing a number of 'scipy people' to Google with the timeline for Python 3.3, the first release after the Python language moratorium, being within sight: http://www.python.org/dev/peps/pep-0398. While a number of scientific Python packages are already available for Python 3 (either in released form or in their master git branches), it's fair to say that there hasn't been a major transition of the scientific community to Python3. Since there is no more development being done on the Python2 series, eventually we will all want to find ways to make this transition, and we think that this is an excellent time to engage the core python development team and consider ideas that would make Python3 generally a more appealing language for scientific work. Guido has made it clear that he doesn't speak for the day-to-day development of Python anymore, so we all should be aware that any ideas that come out of this panel will still need to be discussed with python-dev itself via standard mechanisms before anything is implemented. Nonetheless, the opportunity for a solid face-to-face dialog for brainstorming was too good to pass up. The purpose of this email is then to solicit, from all of our community, ideas for this discussion. In a week or so we'll need to summarize the main points brought up here and make a more concrete agenda out of it; I will also post a summary of the meeting afterwards here. Anything is a valid topic, some points just to get the conversation started: - Extra operators/PEP 225. 
Here's a summary from the last time we went over this, years ago at Scipy 2008: http://mail.scipy.org/pipermail/numpy-discussion/2008-October/038234.html, and the current status of the document we wrote about it is here: file:///home/fperez/www/site/_build/html/py4science/numpy-pep225/numpy-pep225.html. - Improved syntax/support for rationals or decimal literals? While Python now has both decimals (http://docs.python.org/library/decimal.html) and rationals (http://docs.python.org/library/fractions.html), they're quite clunky to use because they require full constructor calls. Guido has mentioned in previous discussions toying with ideas about support for different kinds of numeric literals... - Using the numpy docstring standard python-wide, and thus having python improve the pathetic state of the stdlib's docstrings? This is an area where our community is light years ahead of the standard library, but we'd all benefit from Python itself improving on this front. I'm toying with the idea of giving a lighting talk at PyConn about this, comparing the great, robust culture and tools of good docstrings across the Scipy ecosystem with the sad, sad state of docstrings in the stdlib. It might spur some movement on that front from the stdlib authors, esp. if the core python-dev team realizes the value and benefit it can bring (at relatively low cost, given how most of the information does exist, it's just in the wrong places). But more importantly for us, if there was truly a universal standard for high-quality docstrings across Python
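A minimal sketch of the storage idea behind the matrix class hierarchy proposed at the top of this message (the names, the packed-triangle layout, and the plain-NumPy matvec are illustrative assumptions; a real implementation would dispatch to the specialized symmetric/triangular BLAS and LAPACK routines):

import numpy as np

class MatrixBase(object):
    def matvec(self, x):
        raise NotImplementedError

class DenseMatrix(MatrixBase):
    def __init__(self, data):
        self.data = np.asarray(data)
    def matvec(self, x):
        return np.dot(self.data, x)

class SymmetricMatrix(MatrixBase):
    # Store only the upper triangle; a real version would call symmetric BLAS.
    def __init__(self, data):
        data = np.asarray(data)
        self.n = data.shape[0]
        self.iu = np.triu_indices(self.n)
        self.packed = data[self.iu]
    def matvec(self, x):
        full = np.zeros((self.n, self.n))
        full[self.iu] = self.packed
        full = full + full.T - np.diag(full.diagonal())
        return np.dot(full, x)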
Re: [Numpy-discussion] can_cast with structured array output - bug?
I took a look into the code to see what is causing this, and the reason is that nothing has ever been implemented to deal with the fields. This means it falls back to treating all struct dtypes as if they were a plain void dtype, which allows anything to be cast to it. While I was redoing the casting subsystem for 1.6, I did think on this issue, and decided that it wasn't worth tackling it at the time because the 'safe'/'same_kind'/'unsafe' don't seem sufficient to handle what might be desired. I tried to leave this alone as much as possible. Some random thoughts about this are: * Casting a scalar to a struct dtype: should it be safe if the scalar can be safely cast to each member of the struct dtype? This is the NumPy broadcasting rule applied to dtypes as if the struct dtype is another dimension. * Casting one struct dtype to another: If the fields of the source are a subset of the target, and the types can safely convert, should that be a safe cast? If the fields of the source are not a subset of the target, should that still be a same_kind cast? Should a second enum which complements the safe/same_kind/unsafe one, but is specific for how adding/removing struct fields be added? This is closely related to adding ufunc support for struct dtypes, and the choices here should probably be decided at the same time as designing how the ufuncs should work. -Mark On Mon, Feb 13, 2012 at 5:20 PM, Matthew Brett matthew.br...@gmail.comwrote: Hi, I've also just noticed this oddity: In [17]: np.can_cast('c', 'u1') Out[17]: False OK so far, but... In [18]: np.can_cast('c', [('f1', 'u1')]) Out[18]: True In [19]: np.can_cast('c', [('f1', 'u1')], 'safe') Out[19]: True In [20]: np.can_cast(np.ones(10, dtype='c'), [('f1', 'u1')]) Out[20]: True I think this must be a bug. In the other direction, it makes more sense to me: In [24]: np.can_cast([('f1', 'u1')], 'c') Out[24]: False In [25]: np.can_cast([('f1', 'u1')], [('f1', 'u1')]) Out[25]: True Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
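To make the field-wise rule floated above concrete, a small sketch of what such a check could look like (this is a hypothetical helper illustrating the proposed semantics, not existing NumPy behaviour):

import numpy as np

def can_cast_to_struct(from_dtype, to_dtype, casting='safe'):
    # Treat a struct target like an extra dimension: allow the cast only
    # if the source can be cast to every field of the target.
    to_dtype = np.dtype(to_dtype)
    if to_dtype.fields is None:
        return np.can_cast(from_dtype, to_dtype, casting)
    return all(np.can_cast(from_dtype, to_dtype.fields[name][0], casting)
               for name in to_dtype.names)

print(can_cast_to_struct('u1', [('f1', 'u1')]))   # True: u1 fits every field
print(can_cast_to_struct('i2', [('f1', 'u1')]))   # False: i2 -> u1 is not safe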
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
The problem is that these sorts of things take a while to emerge. The original system was more consistent than I think you give it credit. What you are seeing is that most people get NumPy from distributions and are relying on us to keep things consistent. The scalar coercion rules were deterministic and based on the idea that a scalar does not determine the output dtype unless it is of a different kind. The new code changes that unfortunately. Another thing I noticed is that I thought that int16 op scalar float would produce float32 originally. This seems to have changed, but I need to check on an older version of NumPy. Changing the scalar coercion rules is an unfortunate substantial change in semantics and should not have happened in the 1.X series. I understand you did not get a lot of feedback and spent a lot of time on the code which we all appreciate. I worked to stay true to the Numeric casting rules incorporating the changes to prevent scalar upcasting due to the absence of single precision Numeric literals in Python. We will need to look in detail at what has changed. I will write a test to do that. Thanks, Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 7:58 PM, Mark Wiebe mwwi...@gmail.com wrote: On Mon, Feb 13, 2012 at 5:00 PM, Travis Oliphant tra...@continuum.io wrote: Hmmm. This seems like a regression. The scalar casting API was fairly intentional. What is the reason for the change? In order to make 1.6 ABI-compatible with 1.5, I basically had to rewrite this subsystem. There were virtually no tests in the test suite specifying what the expected behavior should be, and there were clear inconsistencies where for example a+b could result in a different type than b+a. I recall there being some bugs in the tracker related to this as well, but I don't remember those details. This change felt like an obvious extension of an existing behavior for eliminating overflow, where the promotion changed unsigned - signed based on the value of the scalar. This change introduced minimal upcasting only in a set of cases where an overflow was guaranteed to happen without that upcasting. During the 1.6 beta period, I signaled that this subsystem had changed, as the bullet point starting The ufunc uses a more consistent algorithm for loop selection.: http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html The behavior Matthew has observed is a direct result of how I designed the minimization function mentioned in that bullet point, and the algorithm for it is documented in the 'Notes' section of the result_type page: http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html Hopefully that explains it well enough. I made the change intentionally and carefully, tested its impact on SciPy and other projects, and advocated for it during the release cycle. Cheers, Mark -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 6:25 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I recently noticed a change in the upcasting rules in numpy 1.6.0 / 1.6.1 and I just wanted to check it was intentional. For all versions of numpy I've tested, we have: import numpy as np Adata = np.array([127], dtype=np.int8) Bdata = np.int16(127) (Adata + Bdata).dtype dtype('int8') That is - adding an integer scalar of a larger dtype does not result in upcasting of the output dtype, if the data in the scalar type fits in the smaller. 
For numpy < 1.6.0 we have this: Bdata = np.int16(128) (Adata + Bdata).dtype dtype('int8') That is - even if the data in the scalar does not fit in the dtype of the array to which it is being added, there is no upcasting. For numpy >= 1.6.0 we have this: Bdata = np.int16(128) (Adata + Bdata).dtype dtype('int16') There is upcasting... I can see why the numpy >= 1.6.0 way might be preferable but it is an API change I suppose. Best, Matthew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
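For anyone who wants to check their own install, a short script along these lines reproduces the difference (the dtypes noted in the comments describe the 1.5/1.6-era releases being discussed; later versions may differ again):

    import numpy as np

    a = np.array([127], dtype=np.int8)

    # Scalar value fits in int8: no upcast on either 1.5 or 1.6.
    print((a + np.int16(127)).dtype)      # int8

    # Scalar value does not fit in int8: 1.5 keeps int8 (and wraps),
    # 1.6 upcasts to int16.
    print((a + np.int16(128)).dtype)      # int8 on < 1.6, int16 on >= 1.6

    # np.result_type (new in 1.6) exposes the same value-based rule directly:
    print(np.result_type(a, np.int16(127)))   # int8 under the 1.6 rules
    print(np.result_type(a, np.int16(128)))   # int16 under the 1.6 rules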
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
I believe the main lessons to draw from this are just how incredibly important a complete test suite and staying on top of code reviews are. I'm of the opinion that any explicit design choice of this nature should be reflected in the test suite, so that if someone changes it years later, they get immediate feedback that they're breaking something important. NumPy has gradually increased its test suite coverage, and when I dealt with the type promotion subsystem, I added fairly extensive tests: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345 Another subsystem which is in a similar state as what the type promotion subsystem was, is the subscript operator and how regular/fancy indexing work. What this means is that any attempt to improve it that doesn't coincide with the original intent years ago can easily break things that were originally intended without them being caught by a test. I believe this subsystem needs improvement, and the transition to new/improved code will probably be trickier to manage than for the dtype promotion case. Let's try to learn from the type promotion case as best we can, and use it to improve NumPy's process. I believe Charles and Ralph have been doing a great job of enforcing high standards in new NumPy code, and managing the release process in a way that has resulted in very few bugs and regressions in the release. Most of these quality standards are still informal, however, and it's probably a good idea to write them down in a canonical location. It will be especially helpful for newcomers, who can treat the standards as a checklist before submitting pull requests. Thanks, -Mark On Mon, Feb 13, 2012 at 7:11 PM, Travis Oliphant tra...@continuum.iowrote: The problem is that these sorts of things take a while to emerge. The original system was more consistent than I think you give it credit. What you are seeing is that most people get NumPy from distributions and are relying on us to keep things consistent. The scalar coercion rules were deterministic and based on the idea that a scalar does not determine the output dtype unless it is of a different kind. The new code changes that unfortunately. Another thing I noticed is that I thought that int16 op scalar float would produce float32 originally. This seems to have changed, but I need to check on an older version of NumPy. Changing the scalar coercion rules is an unfortunate substantial change in semantics and should not have happened in the 1.X series. I understand you did not get a lot of feedback and spent a lot of time on the code which we all appreciate. I worked to stay true to the Numeric casting rules incorporating the changes to prevent scalar upcasting due to the absence of single precision Numeric literals in Python. We will need to look in detail at what has changed. I will write a test to do that. Thanks, Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 7:58 PM, Mark Wiebe mwwi...@gmail.com wrote: On Mon, Feb 13, 2012 at 5:00 PM, Travis Oliphant tra...@continuum.iowrote: Hmmm. This seems like a regression. The scalar casting API was fairly intentional. What is the reason for the change? In order to make 1.6 ABI-compatible with 1.5, I basically had to rewrite this subsystem. There were virtually no tests in the test suite specifying what the expected behavior should be, and there were clear inconsistencies where for example a+b could result in a different type than b+a. 
I recall there being some bugs in the tracker related to this as well, but I don't remember those details. This change felt like an obvious extension of an existing behavior for eliminating overflow, where the promotion changed unsigned - signed based on the value of the scalar. This change introduced minimal upcasting only in a set of cases where an overflow was guaranteed to happen without that upcasting. During the 1.6 beta period, I signaled that this subsystem had changed, as the bullet point starting The ufunc uses a more consistent algorithm for loop selection.: http://mail.scipy.org/pipermail/numpy-discussion/2011-March/055156.html The behavior Matthew has observed is a direct result of how I designed the minimization function mentioned in that bullet point, and the algorithm for it is documented in the 'Notes' section of the result_type page: http://docs.scipy.org/doc/numpy/reference/generated/numpy.result_type.html Hopefully that explains it well enough. I made the change intentionally and carefully, tested its impact on SciPy and other projects, and advocated for it during the release cycle. Cheers, Mark -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 6:25 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, I recently noticed a change in the upcasting rules in numpy 1.6.0 / 1.6.1 and I just wanted to check it was intentional. For all versions of numpy
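A minimal sketch of the kind of promotion test being described, in numpy.testing style; the assertions encode the 1.6-era rules under discussion here and are far less exhaustive than the real tests linked above:

    import numpy as np
    from numpy.testing import assert_equal

    def test_promotion_is_symmetric():
        # a + b and b + a should promote to the same dtype
        for d1, d2 in [(np.int8, np.uint8), (np.int32, np.float32)]:
            assert_equal(np.promote_types(d1, d2), np.promote_types(d2, d1))

    def test_small_same_kind_scalar_does_not_upcast():
        arr = np.array([127], dtype=np.int8)
        # 1.6 rule: a same-kind scalar whose value fits leaves the dtype alone
        assert_equal((arr + np.int16(127)).dtype, np.dtype(np.int8))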
Re: [Numpy-discussion] [IPython-dev] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3
On 02/13/2012 06:19 PM, Mark Wiebe wrote: It might be nice to turn the matrix class into a short class hierarchy, something like this: class MatrixBase class DenseMatrix(MatrixBase) class TriangularMatrix(MatrixBase) # Maybe a few variations of upper/lower triangular and whether the diagonal is stored class SymmetricMatrix(MatrixBase) These other matrix classes could use packed storage, and could call the specific optimized BLAS/LAPACK functions to get higher performance when it is known the matrix is triangular or symmetric. I'm not sure whether this affects the discussion of the matrix * and \ operators, but it's a possibility to consider. I've been working on exactly this (+ some more) in January, and will be continuing to in the months to come. (Can write more tomorrow if anybody's interested -- or email me directly as I don't have a 0.1 release to show yet -- got to go now) Dag ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
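A bare-bones sketch of the hierarchy Dag and Mark are describing; the class names follow Mark's list, and the scipy.linalg calls are just one plausible way to route each case to a more specific LAPACK driver:

    import numpy as np
    from scipy import linalg

    class MatrixBase(object):
        def solve(self, b):
            raise NotImplementedError

    class DenseMatrix(MatrixBase):
        def __init__(self, a):
            self.a = np.asarray(a)
        def solve(self, b):
            # general dense solver (LAPACK gesv)
            return linalg.solve(self.a, b)

    class TriangularMatrix(DenseMatrix):
        def __init__(self, a, lower=False):
            DenseMatrix.__init__(self, a)
            self.lower = lower
        def solve(self, b):
            # specialized triangular solver (LAPACK trtrs), O(n^2) instead of O(n^3)
            return linalg.solve_triangular(self.a, b, lower=self.lower)

    class SymmetricMatrix(DenseMatrix):
        def solve(self, b):
            # assumes symmetric positive-definite data; a Cholesky
            # factorization is roughly twice as fast as the general LU path
            c, low = linalg.cho_factor(self.a)
            return linalg.cho_solve((c, low), b)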
Re: [Numpy-discussion] [SciPy-Dev] Discussion with Guido van Rossum and (hopefully) core python-dev on scientific Python and Python3
On Monday, February 13, 2012, Aaron Meurer asmeu...@gmail.com wrote: I'd like the ability to make in (i.e., __contains__) return something other than a bool. Also, the ability to override the chained comparison syntax x < y < z would be useful. It's been suggested that the ability to override the boolean operators (and, or, not) would be the way to do this (pep 335), though I'm not 100% convinced that's the way to go. Aaron Meurer +1 on these syntax ideas, however I do agree that it might be a bit problematic. Also, I remember once talking about labeled arrays and discussing ways to index them and ways to indicate which axis the indexing was for. That might require some sort of syntax changes. Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
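For anyone unfamiliar with the limitation Aaron is pointing at, a tiny demonstration: the interpreter forces the result of __contains__ through bool(), so an element-wise or symbolic result cannot survive, whereas rich comparisons like __lt__ can already return arbitrary objects:

    class Demo(object):
        def __contains__(self, item):
            # we would like to return something element-wise here...
            return [True, False]
        def __lt__(self, other):
            return "symbolic(self < %r)" % (other,)

    d = Demo()
    print(1 in d)     # True -- the list was collapsed to a plain bool
    print(d < 5)      # 'symbolic(self < 5)' -- __lt__ may return anything
    # Chained comparisons are the remaining gap: 0 < d < 5 is evaluated as
    # (0 < d) and (d < 5), and the implicit 'and' calls bool() on the first
    # result, which is what PEP 335 proposed making overridable.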
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 9:04 PM, Travis Oliphant tra...@continuum.iowrote: I disagree with your assessment of the subscript operator, but I'm sure we will have plenty of time to discuss that. I don't think it's correct to compare the corner cases of the fancy indexing and regular indexing to the corner cases of type coercion system.If you recall, I was quite nervous about all the changes you made to the coercion rules because I didn't believe you fully understood what had been done before and I knew there was not complete test coverage. It is true that both systems have emerged from a long history and could definitely use fresh perspectives which we all appreciate you and others bringing. It is also true that few are aware of the details of how things are actually implemented and that there are corner cases that are basically defined by the algorithm used (this is more true of the type-coercion system than fancy-indexing, however). I think it would have been wise to write those extensive tests prior to writing new code. I'm curious if what you were expecting for the output was derived from what earlier versions of NumPy produced.NumPy has never been in a state where you could just re-factor at will and assume that tests will catch all intended use cases. Numeric before it was not in that state either. This is a good goal, and we always welcome new tests.It just takes a lot of time and a lot of tedious work that the volunteer labor to this point have not had the time to do. Very few of us have ever been paid to work on NumPy directly and have often been trying to fit in improvements to the code base between other jobs we are supposed to be doing.Of course, you and I are hoping to change that this year and look forward to the code quality improving commensurately. Thanks for all you are doing. I also agree that Rolf and Charles have-been and are invaluable in the maintenance and progress of NumPy and SciPy. They deserve as much praise and kudos as anyone can give them. Well, the typecasting wasn't perfect and, as Mark points out, it wasn't commutative. The addition of float16 also complicated the picture, and user types is going to do more in that direction. And I don't see how a new developer should be responsible for tests enforcing old traditions, the original developers should be responsible for those. But history is history, it didn't happen that way, and here we are. That said, I think we need to show a little flexibility in the corner cases. And going forward I think that typecasting is going to need a rethink. Chuck On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote: I believe the main lessons to draw from this are just how incredibly important a complete test suite and staying on top of code reviews are. I'm of the opinion that any explicit design choice of this nature should be reflected in the test suite, so that if someone changes it years later, they get immediate feedback that they're breaking something important. NumPy has gradually increased its test suite coverage, and when I dealt with the type promotion subsystem, I added fairly extensive tests: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345 Another subsystem which is in a similar state as what the type promotion subsystem was, is the subscript operator and how regular/fancy indexing work. What this means is that any attempt to improve it that doesn't coincide with the original intent years ago can easily break things that were originally intended without them being caught by a test. 
I believe this subsystem needs improvement, and the transition to new/improved code will probably be trickier to manage than for the dtype promotion case. Let's try to learn from the type promotion case as best we can, and use it to improve NumPy's process. I believe Charles and Ralph have been doing a great job of enforcing high standards in new NumPy code, and managing the release process in a way that has resulted in very few bugs and regressions in the release. Most of these quality standards are still informal, however, and it's probably a good idea to write them down in a canonical location. It will be especially helpful for newcomers, who can treat the standards as a checklist before submitting pull requests. Thanks, -Mark On Mon, Feb 13, 2012 at 7:11 PM, Travis Oliphant tra...@continuum.iowrote: The problem is that these sorts of things take a while to emerge. The original system was more consistent than I think you give it credit. What you are seeing is that most people get NumPy from distributions and are relying on us to keep things consistent. The scalar coercion rules were deterministic and based on the idea that a scalar does not determine the output dtype unless it is of a different kind. The new code changes that unfortunately. Another thing I noticed is that I thought that int16 op scalar float
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
It hasn't changed: since float is of a fundamentally different kind of data, it's expected to upcast the result. However, if I may add a personal comment on numpy's casting rules: until now, I've found them confusing and somewhat inconsistent. Some of the inconsistencies I've found were bugs, while others were unintuitive behavior (or, you may say, me not having the correct intuition ;) In particular the rule about mixed scalar / array operations is currently only described in the doc by a rather vague sentence. Also, the fact that the result's dtype can depend on the actual numerical values can be confusing when you work with variables whose values can span a wide range. So I think if you could come up with a table that says an operation involving two arrays of dtype1 and dtype2 always returns an output of dtype3, and a similar table for mixed scalar / array operations, that would be great! My 2 cents, -=- Olivier On 13 February 2012 23:08, Travis Oliphant tra...@continuum.io wrote: I can also confirm that at least on NumPy 1.5.1: integer array * (literal Python float scalar) --- creates a double result. So, my memory was incorrect on that (unless it changed at an earlier release, but I don't think so). -Travis On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote: I believe the main lessons to draw from this are just how incredibly important a complete test suite and staying on top of code reviews are. I'm of the opinion that any explicit design choice of this nature should be reflected in the test suite, so that if someone changes it years later, they get immediate feedback that they're breaking something important. NumPy has gradually increased its test suite coverage, and when I dealt with the type promotion subsystem, I added fairly extensive tests: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345 Another subsystem which is in a similar state as what the type promotion subsystem was, is the subscript operator and how regular/fancy indexing work. What this means is that any attempt to improve it that doesn't coincide with the original intent years ago can easily break things that were originally intended without them being caught by a test. I believe this subsystem needs improvement, and the transition to new/improved code will probably be trickier to manage than for the dtype promotion case. Let's try to learn from the type promotion case as best we can, and use it to improve NumPy's process. I believe Charles and Ralph have been doing a great job of enforcing high standards in new NumPy code, and managing the release process in a way that has resulted in very few bugs and regressions in the release. Most of these quality standards are still informal, however, and it's probably a good idea to write them down in a canonical location. It will be especially helpful for newcomers, who can treat the standards as a checklist before submitting pull requests. Thanks, -Mark On Mon, Feb 13, 2012 at 7:11 PM, Travis Oliphant tra...@continuum.iowrote: The problem is that these sorts of things take a while to emerge. The original system was more consistent than I think you give it credit. What you are seeing is that most people get NumPy from distributions and are relying on us to keep things consistent. The scalar coercion rules were deterministic and based on the idea that a scalar does not determine the output dtype unless it is of a different kind. The new code changes that unfortunately. 
Another thing I noticed is that I thought that int16 op scalar float would produce float32 originally. This seems to have changed, but I need to check on an older version of NumPy. Changing the scalar coercion rules is an unfortunate substantial change in semantics and should not have happened in the 1.X series. I understand you did not get a lot of feedback and spent a lot of time on the code which we all appreciate. I worked to stay true to the Numeric casting rules incorporating the changes to prevent scalar upcasting due to the absence of single precision Numeric literals in Python. We will need to look in detail at what has changed. I will write a test to do that. Thanks, Travis -- Travis Oliphant (on a mobile) 512-826-7480 On Feb 13, 2012, at 7:58 PM, Mark Wiebe mwwi...@gmail.com wrote: On Mon, Feb 13, 2012 at 5:00 PM, Travis Oliphant tra...@continuum.iowrote: Hmmm. This seems like a regression. The scalar casting API was fairly intentional. What is the reason for the change? In order to make 1.6 ABI-compatible with 1.5, I basically had to rewrite this subsystem. There were virtually no tests in the test suite specifying what the expected behavior should be, and there were clear inconsistencies where for example a+b could result in a different type than b+a. I recall there being some bugs in the tracker related to this as well, but I don't remember those details. This change felt
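A sketch of the array-with-array half of the table being requested above; np.promote_types (new in 1.6) answers that part directly, though it deliberately ignores the value-dependent scalar cases that make the full story harder to tabulate:

    import numpy as np

    dtypes = [np.int8, np.int16, np.int32, np.float32, np.float64, np.complex64]
    names = [np.dtype(d).name for d in dtypes]

    # header row
    print(''.join('%11s' % n for n in [''] + names))
    # one row per left-hand dtype
    for d1 in dtypes:
        cells = [np.promote_types(d1, d2).name for d2 in dtypes]
        print(''.join('%11s' % n for n in [np.dtype(d1).name] + cells))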
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Monday, February 13, 2012, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Feb 13, 2012 at 9:04 PM, Travis Oliphant tra...@continuum.io wrote: I disagree with your assessment of the subscript operator, but I'm sure we will have plenty of time to discuss that. I don't think it's correct to compare the corner cases of the fancy indexing and regular indexing to the corner cases of type coercion system.If you recall, I was quite nervous about all the changes you made to the coercion rules because I didn't believe you fully understood what had been done before and I knew there was not complete test coverage. It is true that both systems have emerged from a long history and could definitely use fresh perspectives which we all appreciate you and others bringing. It is also true that few are aware of the details of how things are actually implemented and that there are corner cases that are basically defined by the algorithm used (this is more true of the type-coercion system than fancy-indexing, however). I think it would have been wise to write those extensive tests prior to writing new code. I'm curious if what you were expecting for the output was derived from what earlier versions of NumPy produced.NumPy has never been in a state where you could just re-factor at will and assume that tests will catch all intended use cases. Numeric before it was not in that state either. This is a good goal, and we always welcome new tests.It just takes a lot of time and a lot of tedious work that the volunteer labor to this point have not had the time to do. Very few of us have ever been paid to work on NumPy directly and have often been trying to fit in improvements to the code base between other jobs we are supposed to be doing.Of course, you and I are hoping to change that this year and look forward to the code quality improving commensurately. Thanks for all you are doing. I also agree that Rolf and Charles have-been and are invaluable in the maintenance and progress of NumPy and SciPy. They deserve as much praise and kudos as anyone can give them. Well, the typecasting wasn't perfect and, as Mark points out, it wasn't commutative. The addition of float16 also complicated the picture, and user types is going to do more in that direction. And I don't see how a new developer should be responsible for tests enforcing old traditions, the original developers should be responsible for those. But history is history, it didn't happen that way, and here we are. That said, I think we need to show a little flexibility in the corner cases. And going forward I think that typecasting is going to need a rethink. Chuck On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote: I believe the main lessons to draw from this are just how incredibly important a complete test suite and staying on top of code reviews are. I'm of the opinion that any explicit design choice of this nature should be reflected in the test suite, so that if someone changes it years later, they get immediate feedback that they're breaking something important. NumPy has gradually increased its test suite coverage, and when I dealt with the type promotion subsystem, I added fairly extensive tests: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345 Another subsystem which is in a similar state as what the type promotion subsystem was, is the subscript operator and how regular/fancy indexing work. 
What this means is that any attempt to improve it that doesn't coincide with the original intent years ago can easily break things that were originally intended without them being caught by a test. I believe this subsystem needs improvement, and the transition to new/improved code will probably be trickier to manage than for the dtype promotion case. Let's try to learn from the type promotion case as best we can, and use it to improve NumPy's process. I believe Charles and Ralph have been doing a great job of enforcing high standards in new NumPy code, and managing the release process in a way that has resulted in very few bugs and regressions in the release. Most of these quality standards are still informal, however, and it's probably a good idea to write them down in a canonical location. It will be especially helpful for newcomers, who can treat the standards as a checklist before submitting pull requests. Thanks, -Mark On Mon, Feb 13, 2012 at 7:11 PM, Travis Oliphant tra...@continuum.io wrote: The problem is that these sorts of things take a while to emerge. The original system was more consistent than I think you give it credit. What you are seeing is that most people get NumPy from distributions and are relying on us to keep things consistent. The scalar coercion rules were deterministic and based on the idea that a scalar does not determine the output dtype unless it is of a different kind. The new code changes that unfortunately. Another thing I noticed is that I thought
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 8:04 PM, Travis Oliphant tra...@continuum.iowrote: I disagree with your assessment of the subscript operator, but I'm sure we will have plenty of time to discuss that. I don't think it's correct to compare the corner cases of the fancy indexing and regular indexing to the corner cases of type coercion system.If you recall, I was quite nervous about all the changes you made to the coercion rules because I didn't believe you fully understood what had been done before and I knew there was not complete test coverage. It is true that both systems have emerged from a long history and could definitely use fresh perspectives which we all appreciate you and others bringing. It is also true that few are aware of the details of how things are actually implemented and that there are corner cases that are basically defined by the algorithm used (this is more true of the type-coercion system than fancy-indexing, however). Likely the only way we will be able to know for certain the extent to which our opinions are accurate is to actually dig into the code. I think we can agree, however, that at the very least it could use some performance improvement. :) I think it would have been wise to write those extensive tests prior to writing new code. I'm curious if what you were expecting for the output was derived from what earlier versions of NumPy produced.NumPy has never been in a state where you could just re-factor at will and assume that tests will catch all intended use cases. Numeric before it was not in that state either. This is a good goal, and we always welcome new tests.It just takes a lot of time and a lot of tedious work that the volunteer labor to this point have not had the time to do. I did put quite a bit of effort into maintaining compatibility, and was incredibly careful about the change we're discussing. I used something I suspect you created, the can cast safely table here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules I extended it to more cases including scalar/array combinations of type promotion, and validated that 1.5 and 1.6 produced the same outputs. The script I used is here: https://github.com/numpy/numpy/blob/master/numpy/testing/print_coercion_tables.py I definitely didn't jump into the change blind, but I did approach it from a clean perspective with the willingness to try and make things better. I understand this is a delicate balance to walk, and I'd like to stress that I didn't take any of the changes I made here lightly. Very few of us have ever been paid to work on NumPy directly and have often been trying to fit in improvements to the code base between other jobs we are supposed to be doing.Of course, you and I are hoping to change that this year and look forward to the code quality improving commensurately. Well, everything I did for 1.6 that we're discussing here was volunteer work too. :) You and Enthought have all the credit for the later bit where I did get paid a little bit to do the datetime64 and NA stuff! Thanks for all you are doing. I also agree that Rolf and Charles have-been and are invaluable in the maintenance and progress of NumPy and SciPy. They deserve as much praise and kudos as anyone can give them. It's great to have you back and active in the community again too. I'm sure this is improving the moods of many NumPy and SciPy users. 
-Mark -Travis On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote: I believe the main lessons to draw from this are just how incredibly important a complete test suite and staying on top of code reviews are. I'm of the opinion that any explicit design choice of this nature should be reflected in the test suite, so that if someone changes it years later, they get immediate feedback that they're breaking something important. NumPy has gradually increased its test suite coverage, and when I dealt with the type promotion subsystem, I added fairly extensive tests: https://github.com/numpy/numpy/blob/master/numpy/core/tests/test_numeric.py#L345 Another subsystem which is in a similar state as what the type promotion subsystem was, is the subscript operator and how regular/fancy indexing work. What this means is that any attempt to improve it that doesn't coincide with the original intent years ago can easily break things that were originally intended without them being caught by a test. I believe this subsystem needs improvement, and the transition to new/improved code will probably be trickier to manage than for the dtype promotion case. Let's try to learn from the type promotion case as best we can, and use it to improve NumPy's process. I believe Charles and Ralph have been doing a great job of enforcing high standards in new NumPy code, and managing the release process in a way that has resulted in very few bugs and regressions in the release. Most of these quality standards are still informal,
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
These are great suggestions. I am happy to start digging into the code. I'm also happy to re-visit any and all design decisions for NumPy 2.0 (with a strong-eye towards helping people migrate and documenting the results). Mark, I think you have done an excellent job of working with a stodgy group and pushing things forward. That is a rare talent, and the world is a better place because you jumped in. There is a lot of cruft all over the place, I know. I also know a lot more now than I did 6 years ago about software design :-)I'm very excited about what we are going to be able to do with NumPy together --- and with the others in the community. But, I am also aware of *a lot* of users who never voice their opinion on this list, and a lot of features that they want and need and are currently working around the limitations of NumPy to get.These are going to be my primary focus for the rest of the 1.X series. I see at least a NumPy 1.8 at this point with maybe even a NumPy 1.9. At the same time, I am looking forward to working with you and others in the community as you lead the push toward NumPy 2.0 (which I hope is not delayed too long with all the possible discussions that can take place :-) ) Best regards, -Travis On Feb 13, 2012, at 10:31 PM, Mark Wiebe wrote: On Mon, Feb 13, 2012 at 8:04 PM, Travis Oliphant tra...@continuum.io wrote: I disagree with your assessment of the subscript operator, but I'm sure we will have plenty of time to discuss that. I don't think it's correct to compare the corner cases of the fancy indexing and regular indexing to the corner cases of type coercion system.If you recall, I was quite nervous about all the changes you made to the coercion rules because I didn't believe you fully understood what had been done before and I knew there was not complete test coverage. It is true that both systems have emerged from a long history and could definitely use fresh perspectives which we all appreciate you and others bringing. It is also true that few are aware of the details of how things are actually implemented and that there are corner cases that are basically defined by the algorithm used (this is more true of the type-coercion system than fancy-indexing, however). Likely the only way we will be able to know for certain the extent to which our opinions are accurate is to actually dig into the code. I think we can agree, however, that at the very least it could use some performance improvement. :) I think it would have been wise to write those extensive tests prior to writing new code. I'm curious if what you were expecting for the output was derived from what earlier versions of NumPy produced.NumPy has never been in a state where you could just re-factor at will and assume that tests will catch all intended use cases. Numeric before it was not in that state either. This is a good goal, and we always welcome new tests.It just takes a lot of time and a lot of tedious work that the volunteer labor to this point have not had the time to do. I did put quite a bit of effort into maintaining compatibility, and was incredibly careful about the change we're discussing. I used something I suspect you created, the can cast safely table here: http://docs.scipy.org/doc/numpy/reference/ufuncs.html#casting-rules I extended it to more cases including scalar/array combinations of type promotion, and validated that 1.5 and 1.6 produced the same outputs. 
The script I used is here: https://github.com/numpy/numpy/blob/master/numpy/testing/print_coercion_tables.py I definitely didn't jump into the change blind, but I did approach it from a clean perspective with the willingness to try and make things better. I understand this is a delicate balance to walk, and I'd like to stress that I didn't take any of the changes I made here lightly. Very few of us have ever been paid to work on NumPy directly and have often been trying to fit in improvements to the code base between other jobs we are supposed to be doing.Of course, you and I are hoping to change that this year and look forward to the code quality improving commensurately. Well, everything I did for 1.6 that we're discussing here was volunteer work too. :) You and Enthought have all the credit for the later bit where I did get paid a little bit to do the datetime64 and NA stuff! Thanks for all you are doing. I also agree that Rolf and Charles have-been and are invaluable in the maintenance and progress of NumPy and SciPy. They deserve as much praise and kudos as anyone can give them. It's great to have you back and active in the community again too. I'm sure this is improving the moods of many NumPy and SciPy users. -Mark -Travis On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote: I believe the main lessons to draw from this are just
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Feb 13, 2012, at 10:14 PM, Charles R Harris wrote: On Mon, Feb 13, 2012 at 9:04 PM, Travis Oliphant tra...@continuum.io wrote: I disagree with your assessment of the subscript operator, but I'm sure we will have plenty of time to discuss that. I don't think it's correct to compare the corner cases of the fancy indexing and regular indexing to the corner cases of type coercion system.If you recall, I was quite nervous about all the changes you made to the coercion rules because I didn't believe you fully understood what had been done before and I knew there was not complete test coverage. It is true that both systems have emerged from a long history and could definitely use fresh perspectives which we all appreciate you and others bringing. It is also true that few are aware of the details of how things are actually implemented and that there are corner cases that are basically defined by the algorithm used (this is more true of the type-coercion system than fancy-indexing, however). I think it would have been wise to write those extensive tests prior to writing new code. I'm curious if what you were expecting for the output was derived from what earlier versions of NumPy produced.NumPy has never been in a state where you could just re-factor at will and assume that tests will catch all intended use cases. Numeric before it was not in that state either. This is a good goal, and we always welcome new tests.It just takes a lot of time and a lot of tedious work that the volunteer labor to this point have not had the time to do. Very few of us have ever been paid to work on NumPy directly and have often been trying to fit in improvements to the code base between other jobs we are supposed to be doing.Of course, you and I are hoping to change that this year and look forward to the code quality improving commensurately. Thanks for all you are doing. I also agree that Rolf and Charles have-been and are invaluable in the maintenance and progress of NumPy and SciPy. They deserve as much praise and kudos as anyone can give them. Well, the typecasting wasn't perfect and, as Mark points out, it wasn't commutative. The addition of float16 also complicated the picture, and user types is going to do more in that direction. And I don't see how a new developer should be responsible for tests enforcing old traditions, the original developers should be responsible for those. But history is history, it didn't happen that way, and here we are. That said, I think we need to show a little flexibility in the corner cases. And going forward I think that typecasting is going to need a rethink. No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. New developers are awesome, and the life-blood of a project. But, you have to respect the history of a code-base and if you are re-factoring code that might create a change in corner-cases, then you are absolutely responsible for writing the tests if they aren't there already.That is a pretty simple rule. If you are changing semantics and are not doing a new major version number that you can document the changes in, then any re-factor needs to have tests written *before* the re-factor to ensure behavior does not change. 
That might be annoying, for sure, and might make you curse the original author for not writing the tests you wish were already written --- but it doesn't change the fact that a released code has many, many tests already written for it in the way of applications and users. All of these are outside of the actual code-base, and may rely on behavior that you can't just change even if you think it needs to change. Bug-fixes are different, of course, but it can sometimes be difficult to discern what is a bug and what is just behavior that seems inappropriate. Type-coercion, in particular, can be a difficult nut to crack because NumPy doesn't always control what happens and is trying to work-within Python's stunted type-system. I've often thought that it might be easier if NumPy were more tightly integrated into Python. For example, it would be great if NumPy's Int-scalar was the same thing as Python's int. Same for float and complex.It would also be nice if you could specify scalar literals with different precisions in Python directly.I've often wished that NumPy developers had more access to all the great language people who have spent their time on IronPython, Jython, and PyPy instead. -Travis Chuck On Feb 13, 2012, at 9:40 PM, Mark Wiebe wrote: I believe the main lessons to draw from this are just how incredibly important
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Monday, February 13, 2012, Travis Oliphant tra...@continuum.io wrote: On Feb 13, 2012, at 10:14 PM, Charles R Harris wrote: On Mon, Feb 13, 2012 at 9:04 PM, Travis Oliphant tra...@continuum.io wrote: I disagree with your assessment of the subscript operator, but I'm sure we will have plenty of time to discuss that. I don't think it's correct to compare the corner cases of the fancy indexing and regular indexing to the corner cases of type coercion system.If you recall, I was quite nervous about all the changes you made to the coercion rules because I didn't believe you fully understood what had been done before and I knew there was not complete test coverage. It is true that both systems have emerged from a long history and could definitely use fresh perspectives which we all appreciate you and others bringing. It is also true that few are aware of the details of how things are actually implemented and that there are corner cases that are basically defined by the algorithm used (this is more true of the type-coercion system than fancy-indexing, however). I think it would have been wise to write those extensive tests prior to writing new code. I'm curious if what you were expecting for the output was derived from what earlier versions of NumPy produced.NumPy has never been in a state where you could just re-factor at will and assume that tests will catch all intended use cases. Numeric before it was not in that state either. This is a good goal, and we always welcome new tests.It just takes a lot of time and a lot of tedious work that the volunteer labor to this point have not had the time to do. Very few of us have ever been paid to work on NumPy directly and have often been trying to fit in improvements to the code base between other jobs we are supposed to be doing.Of course, you and I are hoping to change that this year and look forward to the code quality improving commensurately. Thanks for all you are doing. I also agree that Rolf and Charles have-been and are invaluable in the maintenance and progress of NumPy and SciPy. They deserve as much praise and kudos as anyone can give them. Well, the typecasting wasn't perfect and, as Mark points out, it wasn't commutative. The addition of float16 also complicated the picture, and user types is going to do more in that direction. And I don't see how a new developer should be responsible for tests enforcing old traditions, the original developers should be responsible for those. But history is history, it didn't happen that way, and here we are. That said, I think we need to show a little flexibility in the corner cases. And going forward I think that typecasting is going to need a rethink. No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutibility, and was unexpected. So, if it walks like a duck... Now we are in an odd situation. We have undocumented old behavior, and documented new behavior. What do we do? I understand the drive to revert, but I hate the idea of putting back what I see as buggy, especially when new software may fail with old behavior. 
Maybe a Boolean switch defaulting to new behavior? Anybody having issues with old software could just flip the switch? Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
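Purely as a strawman for Ben's switch idea, something along these lines is what a user-visible flag might look like; nothing like this exists in NumPy, and the "old rule" branch is only a crude approximation of the pre-1.6 behaviour:

    import numpy as np

    # Hypothetical module-level flag -- not a real NumPy feature.
    LEGACY_SCALAR_PROMOTION = False

    def promoted_dtype(arr, scalar):
        if LEGACY_SCALAR_PROMOTION:
            # crude pre-1.6 approximation: a same-kind scalar never widens the dtype
            if np.asarray(scalar).dtype.kind == arr.dtype.kind:
                return arr.dtype
        # otherwise defer to the 1.6 value-based rules
        return np.result_type(arr, scalar)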
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutibility, and was unexpected. So, if it walks like a duck... First of all, I don't recall the broken commutibility issue --- nor how long it had actually been in the code-base. So, I'm not sure how much weight to give that problem The problem I see with the weighting of these issues that is being implied is that 1) Requiring a re-compile is getting easier and easier as more and more people get their NumPy from distributions and not from downloads of NumPy itself. They just wait until the distribution upgrades and everything is re-compiled. 2) That same trend means that changes to run-time code (like those that can occur when type-coercion is changed) is likely to affect people much later after the discussions have taken place on the list and everyone who was involved in the discussion assumes all is fine. This sort of change should be signaled by a version change.I would like to understand what the bugginess was and where it was better because I think we are painting a wide-brush. Some-things I will probably agree with you were buggy, but others are likely just different preferences. I have a script that documents the old-behavior. I will compare it to the new behavior and we can go from there.Certainly, there is precedent for using something like a __future__ statement to move forward which your boolean switch implies. -Travis Now we are in an odd situation. We have undocumented old behavior, and documented new behavior. What do we do? I understand the drive to revert, but I hate the idea of putting back what I see as buggy, especially when new software may fail with old behavior. Maybe a Boolean switch defaulting to new behavior? Anybody having issues with old software could just flip the switch? Ben Root ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutibility, and was unexpected. So, if it walks like a duck... Now we are in an odd situation. We have undocumented old behavior, and documented new behavior. What do we do? I understand the drive to revert, but I hate the idea of putting back what I see as buggy, especially when new software may fail with old behavior. Maybe a Boolean switch defaulting to new behavior? Anybody having issues with old software could just flip the switch? I think we just leave it as is. If it was a big problem we would have heard screams of complaint long ago. The post that started this off wasn't even a complaint, more of a see this. Spending time reverting or whatever would be a waste of resources, IMHO. Chuck You might be right, Chuck. I would like to investigate more, however. What I fear is that there are *a lot* of users still on NumPy 1.3 and NumPy 1.5. The fact that we haven't heard any complaints, yet, does not mean to me that we aren't creating headache for people later who have just not had time to try things. However, I can believe that the specifics of minor casting rules are probably not relied upon by a lot of codes out there. Still, as Robert Kern often reminds us well --- our intuitions about this are usually not worth much. I may be making more of this then it's worth, I realize. I was just sensitive to it at the time things were changing (even though I didn't have time to be vocal), and now hearing this users experience, it confirms my bias... Believe me, I do not want to revert if at all possible.There is plenty of more work to do, and I'm very much in favor of the spirit of the work Mark was and is doing. Best regards, -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 11:00 PM, Travis Oliphant tra...@continuum.iowrote: No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutibility, and was unexpected. So, if it walks like a duck... First of all, I don't recall the broken commutibility issue --- nor how long it had actually been in the code-base. So, I'm not sure how much weight to give that problem The problem I see with the weighting of these issues that is being implied is that 1) Requiring a re-compile is getting easier and easier as more and more people get their NumPy from distributions and not from downloads of NumPy itself. They just wait until the distribution upgrades and everything is re-compiled. 2) That same trend means that changes to run-time code (like those that can occur when type-coercion is changed) is likely to affect people much later after the discussions have taken place on the list and everyone who was involved in the discussion assumes all is fine. This sort of change should be signaled by a version change.I would like to understand what the bugginess was and where it was better because I think we are painting a wide-brush. Some-things I will probably agree with you were buggy, but others are likely just different preferences. I have a script that documents the old-behavior. I will compare it to the new behavior and we can go from there.Certainly, there is precedent for using something like a __future__ statement to move forward which your boolean switch implies. Let it go, Travis. It's a waste of time. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 11:07 PM, Travis Oliphant tra...@continuum.iowrote: No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutibility, and was unexpected. So, if it walks like a duck... Now we are in an odd situation. We have undocumented old behavior, and documented new behavior. What do we do? I understand the drive to revert, but I hate the idea of putting back what I see as buggy, especially when new software may fail with old behavior. Maybe a Boolean switch defaulting to new behavior? Anybody having issues with old software could just flip the switch? I think we just leave it as is. If it was a big problem we would have heard screams of complaint long ago. The post that started this off wasn't even a complaint, more of a see this. Spending time reverting or whatever would be a waste of resources, IMHO. Chuck You might be right, Chuck. I would like to investigate more, however. What I fear is that there are *a lot* of users still on NumPy 1.3 and NumPy 1.5. The fact that we haven't heard any complaints, yet, does not mean to me that we aren't creating headache for people later who have just not had time to try things. However, I can believe that the specifics of minor casting rules are probably not relied upon by a lot of codes out there. Still, as Robert Kern often reminds us well --- our intuitions about this are usually not worth much. I may be making more of this then it's worth, I realize. I was just sensitive to it at the time things were changing (even though I didn't have time to be vocal), and now hearing this users experience, it confirms my bias... Believe me, I do not want to revert if at all possible.There is plenty of more work to do, and I'm very much in favor of the spirit of the work Mark was and is doing. I think writing tests would be more productive. The current coverage is skimpy in that we typically don't cover *all* the combinations. Sometimes we don't cover any of them ;) I know you are sensitive to the typecasting, it was one of your babies. Nevertheless, I don't think it is that big an issue at the moment. If you can think of ways to *improve* it I think everyone will be interested in that. The lack of commutativity wasn't in precision, it was in the typecodes, and was there from the beginning. That caused confusion. A current cause of confusion is the many to one relation of, say, int32 and long, longlong which varies platform to platform. I think that confusion is a more significant problem. Having some types derived from Python types, a correspondence that also varies platform to platform is another source of inconsistent behavior that can be confusing. So there are still plenty of issues to deal with. I'd like to point out that the addition of float16 necessitated a certain amount of rewriting, as well as the addition of datetime. It was only through Mark's work that we were able to include the latter in the 1.* series at all. Before, we always had to remove datetime before a release, a royal PITA, while waiting on the ever receding 2.0. 
So there were very good reasons to deal with the type system. That isn't to say that typecasting can't use some tweaks here and there; I think we are all open to discussion along those lines. But it should be about specific cases. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
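A quick way to see the many-to-one, platform-dependent mapping Chuck mentions; which of int32/int64 the C-derived names end up aliasing depends on the platform and compiler, which is exactly the source of confusion:

    import numpy as np

    for name in ['intc', 'int_', 'longlong', 'intp', 'int32', 'int64']:
        print('%-9s -> %s' % (name, np.dtype(getattr(np, name))))

    # Typical 64-bit Linux output pairs 'int_' and 'longlong' with int64 and
    # 'intc' with int32; other platforms group the names differently.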
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 10:38 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Feb 13, 2012 at 11:07 PM, Travis Oliphant tra...@continuum.iowrote: No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far-less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutibility, and was unexpected. So, if it walks like a duck... Now we are in an odd situation. We have undocumented old behavior, and documented new behavior. What do we do? I understand the drive to revert, but I hate the idea of putting back what I see as buggy, especially when new software may fail with old behavior. Maybe a Boolean switch defaulting to new behavior? Anybody having issues with old software could just flip the switch? I think we just leave it as is. If it was a big problem we would have heard screams of complaint long ago. The post that started this off wasn't even a complaint, more of a see this. Spending time reverting or whatever would be a waste of resources, IMHO. Chuck You might be right, Chuck. I would like to investigate more, however. What I fear is that there are *a lot* of users still on NumPy 1.3 and NumPy 1.5. The fact that we haven't heard any complaints, yet, does not mean to me that we aren't creating headache for people later who have just not had time to try things. However, I can believe that the specifics of minor casting rules are probably not relied upon by a lot of codes out there. Still, as Robert Kern often reminds us well --- our intuitions about this are usually not worth much. I may be making more of this then it's worth, I realize. I was just sensitive to it at the time things were changing (even though I didn't have time to be vocal), and now hearing this users experience, it confirms my bias... Believe me, I do not want to revert if at all possible.There is plenty of more work to do, and I'm very much in favor of the spirit of the work Mark was and is doing. I think writing tests would be more productive. The current coverage is skimpy in that we typically don't cover *all* the combinations. Sometimes we don't cover any of them ;) I know you are sensitive to the typecasting, it was one of your babies. Nevertheless, I don't think it is that big an issue at the moment. If you can think of ways to *improve* it I think everyone will be interested in that. The lack of commutativity wasn't in precision, it was in the typecodes, and was there from the beginning. That caused confusion. A current cause of confusion is the many to one relation of, say, int32 and long, longlong which varies platform to platform. I think that confusion is a more significant problem. Having some types derived from Python types, a correspondence that also varies platform to platform is another source of inconsistent behavior that can be confusing. So there are still plenty of issues to deal with. This reminds me of something that it would be really nice for the bug tracker to have - user votes. This might be a particularly good way to draw in some more of the users who don't want to stick their neck out with emails and comments, put are comfortable adding a vote to a bug. 
Something like this: http://code.google.com/p/googleappengine/issues/detail?id=190 where it says that 566 people have starred the issue. -Mark I'd like to point out that the addition of float16 necessitated a certain amount of rewriting, as well as the addition of datetime. It was only through Mark's work that we were able to include the latter in the 1.* series at all. Before, we always had to remove datetime before a release, a royal PITA, while waiting on the ever-receding 2.0. So there were very good reasons to deal with the type system. That isn't to say that typecasting can't use some tweaks here and there, I think we are all open to discussion along those lines. But it should be about specific cases. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
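As a minimal, illustrative sketch of the kind of array-with-scalar casting difference under discussion (the exact dtypes produced depend on the NumPy version and platform, so treat the commented results as typical for the 1.6.x rules rather than guaranteed):

    import numpy as np

    a = np.array([1, 2, 3], dtype=np.int8)

    # A Python int that fits in int8: value-based casting keeps the array dtype.
    print((a + 5).dtype)      # int8

    # A Python int that does not fit in int8: under the 1.6.x rules the result
    # is upcast to a wider type, whereas 1.5.x reportedly kept int8 (wrapping).
    print((a + 300).dtype)    # e.g. int16 with the 1.6.x rules

    # The 1.6.x promotion rules can also be queried directly
    # (np.result_type was added in 1.6):
    print(np.result_type(a, 300))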
Re: [Numpy-discussion] numpy.arange() error?
I think the problem is quite easy to solve, without changing the documented behaviour. The doc says: Help on built-in function arange in module numpy.core.multiarray: arange(...) arange([start,] stop[, step,], dtype=None) Return evenly spaced values within a given interval. Values are generated within the half-open interval ``[start, stop)`` (in other words, the interval including `start` but excluding `stop`). For integer arguments the function is equivalent to the Python built-in `range` (http://docs.python.org/lib/built-in-funcs.html) function, but returns an ndarray rather than a list. stop is exclusive by definition. So subtracting a very small value from stop when it is processed is, I think, the best way. Matteo On 10/02/2012 02:22, Drew Frank wrote: On Thu, Feb 9, 2012 at 3:40 PM, Benjamin Root ben.r...@ou.edu wrote: On Thursday, February 9, 2012, Sturla Molden stu...@molden.no wrote: On 9 Feb 2012, at 22:44, eat e.antero.ta...@gmail.com wrote: Maybe this issue has been raised earlier as well, but wouldn't it be more consistent to let arange operate only with integers (like Python's range) and let linspace handle the floats as well? Perhaps. Another possibility would be to let arange take decimal arguments, possibly entered as text strings. Sturla Personally, I treat arange() to mean "give me a sequence of values from x to y, exclusive, with a specific step size". Nowhere in that statement does it guarantee a particular number of elements. Whereas linspace() means "give me a sequence of evenly spaced numbers from x to y, optionally inclusive, such that there are exactly N elements". They complement each other well. I agree -- both functions are useful and I think about them the same way. The unfortunate part is that tiny precision errors in y can make arange appear to be sometimes-exclusive rather than always exclusive. I've always imagined there to be a sort of duality between the two functions, where arange(low, high, step) == linspace(low, high-step, round((high-low)/step)) in cases where (high - low)/step is integral, but it turns out this is not the case. There are times when I intentionally will specify a range where the step size will not nicely fit, e.g. np.arange(1, 7, 3.5). I wouldn't want this to change. Nor would I. What I meant to express earlier is that I like how Matlab addresses this particular class of floating-point precision errors, not that I think arange output should somehow include both endpoints. My vote is that if users want Matlab-colon-like behavior, we could make a new function - maybe erange() for exact range? Ben Root That could work; it would completely replace arange for me in every circumstance I can think of, but I understand we can't just go changing the behavior of core functions. Drew ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion -- --- Matteo Malosio, Eng. Researcher ITIA-CNR (www.itia.cnr.it) Institute of Industrial Technologies and Automation National Research Council via Bassini 15, 20133 MILANO, ITALY Ph: +39 0223699625 Fax: +39 0223699925 e-mail: matteo.malo...@itia.cnr.it --- ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
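As a small, hedged illustration of the precision issue described in this thread: with a non-integer step, rounding in (stop - start) / step can push the element count up by one, so the nominally excluded endpoint appears. The exact output depends on the platform's floating-point arithmetic, so the commented results are typical rather than guaranteed.

    import numpy as np

    # (1.3 - 1.0) / 0.1 evaluates to slightly more than 3 in double precision,
    # so arange produces four elements here on a typical IEEE-754 setup,
    # the last one being approximately 1.3.
    print(np.arange(1.0, 1.3, 0.1))

    # linspace sidesteps the issue by fixing the element count instead:
    print(np.linspace(1.0, 1.2, 3))   # [1.  1.1  1.2]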
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
On Mon, Feb 13, 2012 at 10:48 PM, Mark Wiebe mwwi...@gmail.com wrote: On Mon, Feb 13, 2012 at 10:38 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Mon, Feb 13, 2012 at 11:07 PM, Travis Oliphant tra...@continuum.io wrote: No argument on any of this. It's just that this needs to happen at NumPy 2.0, not in the NumPy 1.X series. I think requiring a re-compile is far less onerous than changing the type-coercion subtly in a 1.5 to 1.6 release. That's my major point, and I'm surprised others are more cavalier about this. I thought the whole datetime debacle was the impetus for binary compatibility? Also, I disagree with your cavalier charge here. When we looked at the rationale for the changes Mark made, the old behavior was not documented, broke commutativity, and was unexpected. So, if it walks like a duck... Now we are in an odd situation. We have undocumented old behavior, and documented new behavior. What do we do? I understand the drive to revert, but I hate the idea of putting back what I see as buggy, especially when new software may fail with old behavior. Maybe a Boolean switch defaulting to new behavior? Anybody having issues with old software could just flip the switch? I think we just leave it as is. If it was a big problem we would have heard screams of complaint long ago. The post that started this off wasn't even a complaint, more of a "see this". Spending time reverting or whatever would be a waste of resources, IMHO. Chuck You might be right, Chuck. I would like to investigate more, however. What I fear is that there are *a lot* of users still on NumPy 1.3 and NumPy 1.5. The fact that we haven't heard any complaints yet does not mean to me that we aren't creating headaches for people later who have just not had time to try things. However, I can believe that the specifics of minor casting rules are probably not relied upon by a lot of codes out there. Still, as Robert Kern often reminds us well --- our intuitions about this are usually not worth much. I may be making more of this than it's worth, I realize. I was just sensitive to it at the time things were changing (even though I didn't have time to be vocal), and now hearing this user's experience confirms my bias... Believe me, I do not want to revert if at all possible. There is plenty more work to do, and I'm very much in favor of the spirit of the work Mark was and is doing. I think writing tests would be more productive. The current coverage is skimpy in that we typically don't cover *all* the combinations. Sometimes we don't cover any of them ;) I know you are sensitive to the typecasting, it was one of your babies. Nevertheless, I don't think it is that big an issue at the moment. If you can think of ways to *improve* it I think everyone will be interested in that. The lack of commutativity wasn't in precision, it was in the typecodes, and was there from the beginning. That caused confusion. A current cause of confusion is the many-to-one relation of, say, int32 and long/longlong, which varies from platform to platform. I think that confusion is a more significant problem. Having some types derived from Python types, a correspondence that also varies from platform to platform, is another source of inconsistent behavior that can be confusing. So there are still plenty of issues to deal with. This reminds me of something that it would be really nice for the bug tracker to have - user votes.
This might be a particularly good way to draw in some more of the users who don't want to stick their neck out with emails and comments, but are comfortable adding a vote to a bug. Something like this: http://code.google.com/p/googleappengine/issues/detail?id=190 where it says that 566 people have starred the issue. Here's how this feature looks in YouTrack: http://youtrack.jetbrains.net/issues?q=sort+by%3Avotes -Mark I'd like to point out that the addition of float16 necessitated a certain amount of rewriting, as well as the addition of datetime. It was only through Mark's work that we were able to include the latter in the 1.* series at all. Before, we always had to remove datetime before a release, a royal PITA, while waiting on the ever-receding 2.0. So there were very good reasons to deal with the type system. That isn't to say that typecasting can't use some tweaks here and there, I think we are all open to discussion along those lines. But it should be about specific cases. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
You might be right, Chuck. I would like to investigate more, however. What I fear is that there are *a lot* of users still on NumPy 1.3 and NumPy 1.5. The fact that we haven't heard any complaints yet does not mean to me that we aren't creating headaches for people later who have just not had time to try things. However, I can believe that the specifics of minor casting rules are probably not relied upon by a lot of codes out there. Still, as Robert Kern often reminds us well --- our intuitions about this are usually not worth much. I may be making more of this than it's worth, I realize. I was just sensitive to it at the time things were changing (even though I didn't have time to be vocal), and now hearing this user's experience confirms my bias... Believe me, I do not want to revert if at all possible. There is plenty more work to do, and I'm very much in favor of the spirit of the work Mark was and is doing. I think writing tests would be more productive. The current coverage is skimpy in that we typically don't cover *all* the combinations. Sometimes we don't cover any of them ;) I know you are sensitive to the typecasting, it was one of your babies. Nevertheless, I don't think it is that big an issue at the moment. If you can think of ways to *improve* it I think everyone will be interested in that. First of all, I would hardly call it one of my babies. I care far more for my actual babies than for this. It was certainly one of my headaches that I had to deal with and write code for (and take previous behavior into account). I certainly spent a lot of time wrestling with type-coercion and integrating numerous opinions as quickly as I could with it --- even in Numeric with the funny down_casting arrays. At best the resulting system was a compromise (with an implementation that you could reason about with the right perspective, despite claims to the contrary). This discussion is not about me being sensitive because I wrote some code or had a hand in a design that needed changing. I hope we replace all the code I've written with something better. I expect that eventually. This just has to be done in an appropriate way. I'm sensitive because I understand where the previous code came from and *why it was written*, and am concerned about changing things out from under users in ways that are subtle. I continue to affirm that breaking ABI compatibility is much preferable to changing type-casting behavior. I know people disagree with me. Distributions help solve the ABI compatibility problem, but nothing solves required code changes due to subtle type-casting issues. I would just expect this sort of change at NumPy 2.0. We could have waited for half-float until then. I will send the results of my analysis of what changed between 1.5.1 and 1.6.1 shortly. -Travis ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] Fwd: Re: Creating parallel curves
2012/2/13 Andrea Gavana andrea.gav...@gmail.com -- Forwarded message -- From: Andrea Gavana andrea.gav...@gmail.com Date: Feb 13, 2012 11:31 PM Subject: Re: [Numpy-discussion] Creating parallel curves To: Jonathan Hilmer jkhil...@gmail.com Thank you Jonathan for this, it's exactly what I was looking for. I'll try it tomorrow on the 768 well trajectories I have and I'll let you know if I stumble upon any issue. If someone could shed some light on my problem number 2 (how to adjust the scaling/distance so that the curves look parallel on a matplotlib graph even though the axes scales are different), I'd be more than grateful. Thank you in advance. Hi. Maybe this could help you as a starting point:

    from shapely.geometry import LineString
    from matplotlib import pyplot

    myline = LineString(...)  # ... stands for your sequence of (x, y) points
    x, y = myline.xy
    # coordinates of the offset outline around myline;
    # distancefromyline is the offset distance you choose
    xx, yy = myline.buffer(distancefrommyline).exterior.xy
    pyplot.plot(x, y)
    pyplot.plot(xx, yy)
    pyplot.show()

Best. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
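Regarding problem number 2 in the forwarded message (making the offset curves actually look parallel when the x and y axis scales differ), a minimal sketch of one common fix is to force an equal aspect ratio on the matplotlib axes; the curve data below is made up purely for illustration.

    import numpy as np
    from matplotlib import pyplot

    # An illustrative curve and a vertically shifted copy standing in
    # for an offset curve.
    x = np.linspace(0.0, 10.0, 200)
    y = np.sin(x)

    fig, ax = pyplot.subplots()
    ax.plot(x, y)
    ax.plot(x, y + 0.5)

    # Without this, differing x/y scales visually distort the offset, so the
    # two curves no longer appear a constant distance apart on screen.
    ax.set_aspect('equal', adjustable='datalim')
    pyplot.show()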
Re: [Numpy-discussion] Change in scalar upcasting rules for 1.6.x?
The lack of commutativity wasn't in precision, it was in the typecodes, and was there from the beginning. That caused confusion. A current cause of confusion is the many-to-one relation of, say, int32 and long/longlong, which varies from platform to platform. I think that confusion is a more significant problem. Having some types derived from Python types, a correspondence that also varies from platform to platform, is another source of inconsistent behavior that can be confusing. So there are still plenty of issues to deal with. I didn't think it was in the precision. I knew what you meant. However, I'm still hoping for an example of what you mean by a lack of commutativity in the typecodes. The confusion of long and longlong varying from platform to platform comes from C. The whole point of having long and longlong is to ensure that you can specify the same types in Python that you would in C. They should not be used if you don't care about that. Deriving from Python types for some array scalars is an issue. I don't like that either. However, Python itself special-cases its scalars in ways that made this necessary so that some use cases would not fall over. This shows a limitation of Python. I would prefer that all array scalars were recognized appropriately by the Python type system. Most of the concerns that you mention here are misunderstandings. Maybe there are solutions that fix the problem without just educating people. I am open to them. I do think that it was a mistake to have the intp and uintp dtypes as *separate* dtypes. They should have just mapped to the right one. I think it was also a mistake to have dtypes for all the C-spellings instead of just a dtype for each different bit-length with an alias for the C-spellings. We should change that in NumPy 2.0. -Travis I'd like to point out that the addition of float16 necessitated a certain amount of rewriting, as well as the addition of datetime. It was only through Mark's work that we were able to include the latter in the 1.* series at all. Before, we always had to remove datetime before a release, a royal PITA, while waiting on the ever-receding 2.0. So there were very good reasons to deal with the type system. That isn't to say that typecasting can't use some tweaks here and there, I think we are all open to discussion along those lines. But it should be about specific cases. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion
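As a small sketch of the platform-dependent, many-to-one mapping between C-named types and fixed-width dtypes discussed above (the printed names and sizes differ between platforms, e.g. 64-bit Linux vs. 64-bit Windows, so the comments only describe typical outcomes):

    import numpy as np

    # Typecodes for C-spelled types: 'l' is C long, 'q' is C long long,
    # 'p' is intp. Which fixed-width type each one aliases depends on the
    # platform: on most 64-bit Linux/macOS builds 'l' and 'q' are both
    # 64-bit, while on 64-bit Windows 'l' is 32-bit and 'q' is 64-bit.
    for code in ('l', 'q', 'p'):
        dt = np.dtype(code)
        print(code, dt.name, dt.itemsize)

    # Some array scalars derive from Python types, which is the other
    # source of platform-varying behavior mentioned; float64, for example,
    # is a subclass of the Python float type.
    print(issubclass(np.float64, float))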