[Numpy-discussion] numpy.stack -- which function, if any, deserves the name?

2015-03-15 Thread Stephan Hoyer
In the past few months there have been two proposals for new numpy
functions using the name "stack":

1. np.stack for stacking like np.asarray(np.bmat(...))
http://thread.gmane.org/gmane.comp.python.numeric.general/58748/
https://github.com/numpy/numpy/pull/5057

2. np.stack for stacking along an arbitrary new axis (this was my proposal)
http://thread.gmane.org/gmane.comp.python.numeric.general/59850/
https://github.com/numpy/numpy/pull/5605

Both functions generalize the notion of stacking arrays from the existing
hstack, vstack and dstack, but in two very different ways. Both could be
useful -- but only one of them can be called "stack". Which one deserves
that name?

The existing *stack functions already use the word "stack" for two
similarly different ways of combining arrays:
a. ND -> ND stacking along an existing dimension (like
numpy.concatenate and proposal 1)
b. ND -> (N+1)D stacking along a new dimension (like proposal 2).
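
To make (a) and (b) concrete, here is a small sketch of how the existing
functions behave on example inputs. The array names and shapes are
illustrative assumptions, not taken from either proposal.

```python
import numpy as np

a = np.zeros((2, 3))
b = np.ones((2, 3))
x = np.arange(3)
y = np.arange(3, 6)

# (a) ND -> ND: 2D inputs are joined along an existing axis.
vstack_2d = np.vstack([a, b])   # rows appended, still 2D
hstack_2d = np.hstack([a, b])   # columns appended, still 2D
same_as_concat = np.array_equal(vstack_2d, np.concatenate([a, b], axis=0))

# (b) ND -> (N+1)D: the result has one more dimension than the inputs.
vstack_1d = np.vstack([x, y])   # 1D inputs become the rows of a 2D array
dstack_2d = np.dstack([a, b])   # 2D inputs are joined along a new third axis

print(vstack_2d.shape, hstack_2d.shape, vstack_1d.shape, dstack_2d.shape)
```

So the same function (vstack) does (a) for 2D inputs and (b) for 1D inputs.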

I think the API design would be much cleaner if we had different words to
denote these two operations. "Concatenate" already exists for combining
along an existing dimension, so my thought (when I wrote proposal 2) was
that the verb "stack" could be reserved, going forward, for combining
along a new dimension. This also has the advantage of suggesting that
"concatenate" and "stack" are the two fundamental operations for
combining N-dimensional arrays. The documentation on this is currently
quite confusing, mostly because no function like the one in proposal 2
exists yet.
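
A minimal sketch of the distinction, assuming equal-shaped inputs. The
helper name stack_new_axis is hypothetical and chosen only for this
illustration; it is not the code from either pull request.

```python
import numpy as np

def stack_new_axis(arrays, axis=0):
    """Hypothetical helper: join equal-shaped arrays along a NEW axis."""
    # Give every input a length-1 axis at the requested position, then
    # concatenate along that freshly created axis.
    expanded = [np.expand_dims(np.asarray(arr), axis) for arr in arrays]
    return np.concatenate(expanded, axis=axis)

arrays = [np.full((2, 4), i) for i in range(3)]

# Combining along an existing dimension keeps the number of dimensions.
concatenated = np.concatenate(arrays, axis=0)

# Combining along a new dimension adds one, at any requested position.
stacked_first = stack_new_axis(arrays, axis=0)
stacked_middle = stack_new_axis(arrays, axis=1)
stacked_last = stack_new_axis(arrays, axis=2)

print(concatenated.shape, stacked_first.shape,
      stacked_middle.shape, stacked_last.shape)
```

With axis=0 the result can be indexed as stacked_first[i] to recover the
i-th input, which is not true of the concatenated array.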

Of course, the *stack functions have existed for quite some time, and in
many cases vstack and hstack are indeed used for concatenate-like
functionality (e.g., whenever they are used for 2D arrays/matrices). So the
case is not entirely clear-cut. (We'll never be able to remove this
functionality from NumPy.)

In any case, I would appreciate your thoughts.

Best,
Stephan
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] Rewrite np.histogram in c?

2015-03-15 Thread Robert McGibbon
Hi,

numpy.histogram is implemented in Python and is a little sluggish. This
has been discussed previously on the mailing list [1, 2]. It came up in a
project that I maintain, where a new feature is bottlenecked by
numpy.histogram, and one developer suggested a faster implementation in
Cython [3].

Would it make sense to reimplement this function in C or Cython? Is moving
functions like this from Python to C to improve performance within the
scope of the development roadmap for NumPy? I have started implementing
this in C [4], but I figured I should check in here first.

-Robert

[1] http://scipy-user.10969.n7.nabble.com/numpy-histogram-is-slow-td17208.html
[2] http://numpy-discussion.10968.n7.nabble.com/Fast-histogram-td9359.html
[3] https://github.com/mdtraj/mdtraj/pull/734
[4] https://github.com/rmcgibbo/numpy/tree/histogram