Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-10 Thread Robert Kern
On Thu, Jul 10, 2008 at 10:33, Anne Archibald <[EMAIL PROTECTED]> wrote:
> 2008/7/9 Robert Kern <[EMAIL PROTECTED]>:
>
>> Because that's just what a buffer= argument *is*. It is not a place
>> for presenting the starting pointer to exotically-strided memory. Use
>> __array_interface__s to describe the full range of representable
>> memory. See below.
>
> Aha! Is this stuff documented somewhere?

_The Guide to Numpy_, section 3.1.4.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-10 Thread Charles R Harris
On Thu, Jul 10, 2008 at 2:25 PM, Anne Archibald <[EMAIL PROTECTED]>
wrote:

> 2008/7/10 Charles R Harris <[EMAIL PROTECTED]>:
>
> > On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald <
> [EMAIL PROTECTED]>
> > wrote:
> >>
> >> 2008/7/9 Robert Kern <[EMAIL PROTECTED]>:
> >>
> >> > Because that's just what a buffer= argument *is*. It is not a place
> >> > for presenting the starting pointer to exotically-strided memory. Use
> >> > __array_interface__s to describe the full range of representable
> >> > memory. See below.
> >>
> >> Aha! Is this stuff documented somewhere?
> >>
> >> > I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
> >> > trunk.
> >>
> >> Nice! Unfortunately it can't quite do what I want... for the linear
> >> algebra I need something that can broadcast all but certain axes. For
> >> example, take an array of matrices and an array of vectors. The
> >> "array" axes need broadcasting, but you can't broadcast on all axes
> >> without (incorrectly) turning the vector into a matrix. I've written a
> >> (messy) implementation, but the corner cases are giving me headaches.
> >> I'll let you know when I have something that works.
> >
> > I think something like a matrix/vector dtype would be another way to go
> for
> > stacks of matrices and vectors. It would have to be a user defined type
> to
> > fit into the current type hierarchy for ufuncs, but I think the base
> > machinery is there with the generic inner loops.
>
> There's definitely room for discussion about how such a linear algebra
> system should ultimately work. In the short term, though, I think it's
> valuable to create a prototype system, even if much of the looping has
> to be in python, just to figure out how it should look to the user.
>
> For example, I'm not sure whether a matrix/vector dtype would make
> easy things like indexing operations - fancy indexing spanning both
> array and matrix axes,


True enough. And I would expect the ufunc approach to require the sub
matrices to be contiguous. In any case, ufuncs aren't ready for this yet, in
fact they aren't ready for string types or other numeric types, so there is
a lot of work to be done just at that level.

for example. dtypes can also be a little
> impenetrable to use, so even if this is how elementwise linear algebra
> is ultimately implemented, we may want to provide a different user
> frontend.
>
> My idea for a first sketch was simply to provide (for example)
> np.elementwise_linalg.dot_mm, that interprets its arguments both as
> arrays of matrices and yields an array of matrices. A second sketch
> might be a subclass MatrixArray, which could serve much as Matrix does
> now, to tell various functions which axes to do linear algebra over
> and which axes to iterate over.
>

Definitely needs doing. It's hard to see how things work out in practice
without some experimentation.


> There's also the question of how to make these efficient; one could of
> course write a C wrapper for each linear algebra function that simply
> iterates, but your suggestion of using the ufunc machinery with a
> matrix dtype might be better. Or, if cython acquires sensible
> polymorphism over numpy dtypes, a cython wrapper might be the way to
> go. But I think establishing what it should look like to users is a
> good first step, and I think that's best done with sample
> implementations.
>

Amen.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-10 Thread Anne Archibald
2008/7/10 Charles R Harris <[EMAIL PROTECTED]>:

> On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald <[EMAIL PROTECTED]>
> wrote:
>>
>> 2008/7/9 Robert Kern <[EMAIL PROTECTED]>:
>>
>> > Because that's just what a buffer= argument *is*. It is not a place
>> > for presenting the starting pointer to exotically-strided memory. Use
>> > __array_interface__s to describe the full range of representable
>> > memory. See below.
>>
>> Aha! Is this stuff documented somewhere?
>>
>> > I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
>> > trunk.
>>
>> Nice! Unfortunately it can't quite do what I want... for the linear
>> algebra I need something that can broadcast all but certain axes. For
>> example, take an array of matrices and an array of vectors. The
>> "array" axes need broadcasting, but you can't broadcast on all axes
>> without (incorrectly) turning the vector into a matrix. I've written a
>> (messy) implementation, but the corner cases are giving me headaches.
>> I'll let you know when I have something that works.
>
> I think something like a matrix/vector dtype would be another way to go for
> stacks of matrices and vectors. It would have to be a user defined type to
> fit into the current type hierarchy for ufuncs, but I think the base
> machinery is there with the generic inner loops.

There's definitely room for discussion about how such a linear algebra
system should ultimately work. In the short term, though, I think it's
valuable to create a prototype system, even if much of the looping has
to be in python, just to figure out how it should look to the user.

For example, I'm not sure whether a matrix/vector dtype would make
easy things like indexing operations - fancy indexing spanning both
array and matrix axes, for example. dtypes can also be a little
impenetrable to use, so even if this is how elementwise linear algebra
is ultimately implemented, we may want to provide a different user
frontend.

My idea for a first sketch was simply to provide (for example)
np.elementwise_linalg.dot_mm, that interprets its arguments both as
arrays of matrices and yields an array of matrices. A second sketch
might be a subclass MatrixArray, which could serve much as Matrix does
now, to tell various functions which axes to do linear algebra over
and which axes to iterate over.

There's also the question of how to make these efficient; one could of
course write a C wrapper for each linear algebra function that simply
iterates, but your suggestion of using the ufunc machinery with a
matrix dtype might be better. Or, if cython acquires sensible
polymorphism over numpy dtypes, a cython wrapper might be the way to
go. But I think establishing what it should look like to users is a
good first step, and I think that's best done with sample
implementations.

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-10 Thread Stéfan van der Walt
2008/7/10 Robert Kern <[EMAIL PROTECTED]>:
> I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

Robert, this is fantastic!  I think people are going to enjoy your
talk at SciPy'08.  If you want, we could also tutor this in the
advanced NumPy session.

Cheers
Stéfan
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-10 Thread Charles R Harris
On Thu, Jul 10, 2008 at 9:33 AM, Anne Archibald <[EMAIL PROTECTED]>
wrote:

> 2008/7/9 Robert Kern <[EMAIL PROTECTED]>:
>
> > Because that's just what a buffer= argument *is*. It is not a place
> > for presenting the starting pointer to exotically-strided memory. Use
> > __array_interface__s to describe the full range of representable
> > memory. See below.
>
> Aha! Is this stuff documented somewhere?
>
> > I was about a week ahead of you. See numpy/lib/stride_tricks.py in the
> trunk.
>
> Nice! Unfortunately it can't quite do what I want... for the linear
> algebra I need something that can broadcast all but certain axes. For
> example, take an array of matrices and an array of vectors. The
> "array" axes need broadcasting, but you can't broadcast on all axes
> without (incorrectly) turning the vector into a matrix. I've written a
> (messy) implementation, but the corner cases are giving me headaches.
> I'll let you know when I have something that works.
>

I think something like a matrix/vector dtype would be another way to go for
stacks of matrices and vectors. It would have to be a user defined type to
fit into the current type hierarchy for ufuncs, but I think the base
machinery is there with the generic inner loops.

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-10 Thread Anne Archibald
2008/7/9 Robert Kern <[EMAIL PROTECTED]>:

> Because that's just what a buffer= argument *is*. It is not a place
> for presenting the starting pointer to exotically-strided memory. Use
> __array_interface__s to describe the full range of representable
> memory. See below.

Aha! Is this stuff documented somewhere?

> I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

Nice! Unfortunately it can't quite do what I want... for the linear
algebra I need something that can broadcast all but certain axes. For
example, take an array of matrices and an array of vectors. The
"array" axes need broadcasting, but you can't broadcast on all axes
without (incorrectly) turning the vector into a matrix. I've written a
(messy) implementation, but the corner cases are giving me headaches.
I'll let you know when I have something that works.

Thanks,
Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-09 Thread Robert Kern
On Wed, Jul 9, 2008 at 21:29, Anne Archibald <[EMAIL PROTECTED]> wrote:
> 2008/7/9 Robert Kern <[EMAIL PROTECTED]>:
>
>> Yes, the buffer interface, at least the subset that ndarray()
>> consumes, requires that all of the data be contiguous in memory.
>> array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which
>> looks like this:
>>
>> #define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 || 
>>  \
>> PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) ||   
>> \
>> PyArray_CHKFLAGS(m, NPY_FORTRAN))
>>
>> Trying to get a buffer object from anything that is neither C- or
>> Fortran-contiguous will fail. E.g.
>>
>> In [1]: from numpy import *
>>
>> In [2]: A = arange(10)
>>
>> In [3]: B = A[::2]
>>
>> In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
>> ---
>> TypeError Traceback (most recent call last)
>>
>> /Users/rkern/today/ in ()
>>
>> TypeError: expected a single-segment buffer object
>
> Is this really necessary? What does making this restriction gain? It
> certainly means that many arrays whose storage is a contiguous block
> of memory can still not be used (just permute the axes of a 3d array,
> say; it may even be possible for an array to be in C contiguous order
> but for the flag not to be set), but how is one to construct exotic
> slices of an array that is strided in memory? (The real part of a
> complex array, say.)

Because that's just what a buffer= argument *is*. It is not a place
for presenting the starting pointer to exotically-strided memory. Use
__array_interface__s to describe the full range of representable
memory. See below.

> I suppose one could follow the linked list of .bases up to the
> original ndarray, which should normally be C- or Fortran-contiguous,
> then work out the offset, but even this may not always work: what if
> the original array was constructed with non-C-contiguous strides from
> some preexisting buffer?
>
> If the concern is that this allows users to shoot themselves in the
> foot, it's worth noting that even with the current setup you can
> easily fabricate strides and shapes that go outside the allocated part
> of memory.
>
>> What is the use case, here? One rarely has to use the ndarray
>> constructor by itself. For example, the result you seem to want from
>> the call you make above can be done just fine with .view().
>
> I was presenting a simple example. I was actually trying to use
> zero-strided arrays to implement broadcasting.  The code was rather
> long, but essentially what it was meant to do was generate a view of
> an array in which an axis of length one had been replaced by an axis
> of length m with stride zero. (The point of all this was to create a
> class like vectorize that was suitable for use on, for example,
> np.linalg.inv().) But I also ran into this problem while writing
> segmentaxis.py, the code to produce a "matrix" of sliding windows.
> (See http://www.scipy.org/Cookbook/SegmentAxis .) There I caught the
> exception and copied the array (unnecessarily) if this came up.

I was about a week ahead of you. See numpy/lib/stride_tricks.py in the trunk.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-09 Thread Anne Archibald
2008/7/9 Robert Kern <[EMAIL PROTECTED]>:

> Yes, the buffer interface, at least the subset that ndarray()
> consumes, requires that all of the data be contiguous in memory.
> array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which
> looks like this:
>
> #define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 ||  
> \
> PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) ||   \
> PyArray_CHKFLAGS(m, NPY_FORTRAN))
>
> Trying to get a buffer object from anything that is neither C- or
> Fortran-contiguous will fail. E.g.
>
> In [1]: from numpy import *
>
> In [2]: A = arange(10)
>
> In [3]: B = A[::2]
>
> In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
> ---
> TypeError Traceback (most recent call last)
>
> /Users/rkern/today/ in ()
>
> TypeError: expected a single-segment buffer object

Is this really necessary? What does making this restriction gain? It
certainly means that many arrays whose storage is a contiguous block
of memory can still not be used (just permute the axes of a 3d array,
say; it may even be possible for an array to be in C contiguous order
but for the flag not to be set), but how is one to construct exotic
slices of an array that is strided in memory? (The real part of a
complex array, say.)

I suppose one could follow the linked list of .bases up to the
original ndarray, which should normally be C- or Fortran-contiguous,
then work out the offset, but even this may not always work: what if
the original array was constructed with non-C-contiguous strides from
some preexisting buffer?

If the concern is that this allows users to shoot themselves in the
foot, it's worth noting that even with the current setup you can
easily fabricate strides and shapes that go outside the allocated part
of memory.

> What is the use case, here? One rarely has to use the ndarray
> constructor by itself. For example, the result you seem to want from
> the call you make above can be done just fine with .view().

I was presenting a simple example. I was actually trying to use
zero-strided arrays to implement broadcasting.  The code was rather
long, but essentially what it was meant to do was generate a view of
an array in which an axis of length one had been replaced by an axis
of length m with stride zero. (The point of all this was to create a
class like vectorize that was suitable for use on, for example,
np.linalg.inv().) But I also ran into this problem while writing
segmentaxis.py, the code to produce a "matrix" of sliding windows.
(See http://www.scipy.org/Cookbook/SegmentAxis .) There I caught the
exception and copied the array (unnecessarily) if this came up.

Anne
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] "expected a single-segment buffer object"

2008-07-09 Thread Robert Kern
On Wed, Jul 9, 2008 at 18:55, Anne Archibald <[EMAIL PROTECTED]> wrote:
> Hi,
>
> When trying to construct an ndarray, I sometimes run into the
> more-or-less mystifying error "expected a single-segment buffer
> object":
>
> Out[54]: (0, 16, 8)
> In [55]: A=np.zeros(2); A=A[np.newaxis,...];
> np.ndarray(strides=A.strides,shape=A.shape,buffer=A,dtype=A.dtype)
> ---
>  Traceback (most recent call last)
>
> /home/peridot/ in ()
>
> : expected a single-segment buffer object
>
> In [56]: A.strides
> Out[56]: (0, 8)
>
> That is, when I try to construct an ndarray based on an array with a
> zero stride, I get this mystifying error. Zero-strided arrays appear
> naturally when one uses newaxis, but they are valuable in their own
> right (for example for broadcasting purposes). So it's a bit awkward
> to have this error appearing when one tries to feed them to
> ndarray.__new__ as a buffer. I can, I think, work around it by
> removing all axes with stride zero:
>
> def bufferize(A):
>   idx = []
>   for v in A.strides:
>   if v==0:
>   idx.append(0)
>   else:
>   idx.append(slice(None,None,None))
>   return A[tuple(idx)]
>
> Is there any reason for this restriction?

Yes, the buffer interface, at least the subset that ndarray()
consumes, requires that all of the data be contiguous in memory.
array_as_buffer() checks for that using PyArray_ISONE_SEGMENT(), which
looks like this:

#define PyArray_ISONESEGMENT(m) (PyArray_NDIM(m) == 0 ||  \
 PyArray_CHKFLAGS(m, NPY_CONTIGUOUS) ||   \
 PyArray_CHKFLAGS(m, NPY_FORTRAN))

Trying to get a buffer object from anything that is neither C- or
Fortran-contiguous will fail. E.g.

In [1]: from numpy import *

In [2]: A = arange(10)

In [3]: B = A[::2]

In [4]: ndarray(strides=B.strides, shape=B.shape, buffer=B, dtype=B.dtype)
---
TypeError Traceback (most recent call last)

/Users/rkern/today/ in ()

TypeError: expected a single-segment buffer object


What is the use case, here? One rarely has to use the ndarray
constructor by itself. For example, the result you seem to want from
the call you make above can be done just fine with .view().

In [8]: C = B.view(ndarray)

In [9]: C
Out[9]: array([0, 2, 4, 6, 8])

In [10]: B
Out[10]: array([0, 2, 4, 6, 8])

In [11]: C is B
Out[11]: False

In [12]: B[0] = 10

In [13]: C
Out[13]: array([10,  2,  4,  6,  8])

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion