Re: [Numpy-discussion] Buffer interface PEP

2007-03-27 Thread Zachary Pincus
>> Is this saying that either NULL or a pointer to "B" can be supplied
>> by getbufferproc to indicate to the caller that the array is unsigned
>> bytes? If so, is there a specific reason to put the (minor)
>> complexity of handling this case in the caller's hands, instead of
>> dealing with it internally to getbufferproc? In either case, the
>> wording is a bit unclear, I think.
>>
>
> Yes, the wording could be more clear.   I'm trying to make it easy for
> exporters to change to the new buffer interface.

> The main idea I really want to see is that if the caller just passes
> NULL instead of an address then it means they are assuming the data
> will be "unsigned bytes".  It is up to the exporter to either allow
> this or raise an error.
>
> The exporter should always be explicit if an argument for returning
> the format is provided (I may have thought differently a few days ago).

Understood -- I'm for the exporters being as explicit as possible if  
the argument is provided.

>> The general question is that there are several other instances where
>> getbufferproc is allowed to return ambiguous information which must
>> be handled on the client side. For example, C-contiguous data can be
>> indicated either by a NULL strides pointer or a pointer to a
>> properly-constructed strides array.
>
> Here.  I'm trying to be easy on the exporter and the consumer.  If the
> data is contiguous, then neither the exporter nor the consumer will
> likely care about the strides.  Allowing this to be NULL is like the
> current array protocol convention which allows this to be None.

See below. My comments here aren't suggesting that NULL should be  
disallowed. I'm basically wondering whether it is a good idea to  
allow NULL and something else to represent the same information.  
(E.g. as above, an exporter could choose to show C-contiguous data  
with a NULL returned to the client, or with a trivial strides array).

Otherwise two different exporters exporting identical data could  
provide different representations, which the clients would need to be  
able to handle. I'm not sure that this is a recipe for perfect  
interoperability.

>> Clients that can't handle C-contiguous
>> data (contrived example, I know there is a function to deal with
>> that) would then need to check both for NULL *and* inside the strides
>> array if not null, before properly deciding that the data isn't
>> usable to them.
> Not really.  A client that cannot deal with strides will simply not  
> pass
> an address to a stride array to the buffer protocol (that argument  
> will
> be NULL).  If the exporter cannot provide memory without stride
> information, then an error will be raised.

This doesn't really address my question, which I obscured with a  
poorly-chosen example. The PEP says (or at least that's how I read  
it) that if the client *does* provide an address for the stride  
array, then for un-strided arrays, the exporter may either choose to  
fill in NULL at that address, or provide a strides array.

Might it be easier for clients if the PEP required that NULL be  
returned if the array is C-contiguous? Or at least strongly suggested  
that? (I understand that there might be cases where a naive exporter  
"thinks" it is dealing with a strided array but it really is  
contiguous, and the exporter shouldn't be required to do that  
detection.)

The use-case isn't too strong here, but I think it's clear in the  
suboffsets case (see below).

>> Similarly, the suboffsets can be either all negative or
>> NULL to indicate the same thing.
> I think it's much easier to check if suboffsets is NULL rather than
> checking all the entries to see if they are -1 for the very common case
> (i.e. the NumPy case) of no dereferencing.  Also, if you can't deal
> with suboffsets you would just not provide an address for them.

My point exactly! As written, the PEP allows an exporter to either  
return NULL, or an array of all negative numbers (in the case that  
the client requested that information), forcing a fully-conforming  
client to make *both* checks in order to decide what to do.

Especially in this case, it would make sense to require a NULL be  
returned in the case of no suboffsets. This makes things easier for  
both clients that can deal with suboffset and non-suboffset data (e.g.  
they can branch on NULL, not on NULL-or-all-are-negative), and also  
for clients that can *only* deal with suboffsets.

Now, in these two cases, the use-case is pretty narrow, I agree.  
Basically it makes things easier for savvy clients that can deal with  
different data types, by not forcing them to make two checks (strides  
== NULL or strides array is trivial; suboffsets == NULL or suboffsets  
are all negative) when one would do. Again, this PEP allows the same  
information to be passed in two very different ways, when it really  
doesn't seem like that ambiguity makes life much easier for  
exporters.
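The same dual representation is already visible in the current array protocol, where __array_interface__['strides'] may be either None or an explicit tuple for C-contiguous data. A minimal sketch of the double check a fully-conforming consumer ends up making (the helper name is purely illustrative, not part of any API):

```python
import numpy as np

def is_c_contiguous(shape, strides, itemsize):
    # A consumer must accept *both* representations: strides may be
    # None, or an explicit tuple that happens to describe the same
    # C-contiguous layout.
    if strides is None:
        return True
    expected, acc = [], itemsize
    for dim in reversed(shape):
        expected.insert(0, acc)
        acc *= dim
    return tuple(strides) == tuple(expected)

a = np.zeros((3, 4))
info = a.__array_interface__
# NumPy itself reports None here for contiguous data...
assert info['strides'] is None
# ...but an exporter could legally have handed out (32, 8) instead.
assert is_c_contiguous(info['shape'], (32, 8), a.itemsize)
```

Both branches describe the very same memory layout, which is exactly the two-checks-instead-of-one situation described above.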

Maybe I'm wrong about this last point, though.

[Numpy-discussion] Further matrix oddities: no inner product

2007-03-27 Thread Charles R Harris

In mathematics, an inner product is a sesquilinear form on pairs of
vectors, so at the least it should return a scalar. In numpy, inner is a sum
over the last indices. OK, so we have

In [10]: inner(ones(2),ones(2))
Out[10]: 2.0

This doesn't work as an inner product for column vectors, which would be the
usual textbook convention, but that's alright, it's not a 'real' inner
product. But what happens when matrices are involved?

In [11]: inner(mat(ones(2)),mat(ones(2)))
Out[11]: array([[ 2.]])

Hmm, we get an array, not a scalar. Maybe we can cheat

In [12]: mat(ones(2))*mat(ones(2)).T
Out[12]: matrix([[ 2.]])

What about vdot (conjugate of the mathematical convention, i.e., the Dirac
convention)

In [17]: vdot(mat(ones(2)),mat(ones(2)))
---------------------------------------------------------------------------
exceptions.ValueError                     Traceback (most recent call last)

/home/charris/

ValueError: vectors have different lengths

In [18]: vdot(mat(ones(2)),mat(ones(2)).T)
---------------------------------------------------------------------------
exceptions.ValueError                     Traceback (most recent call last)

/home/charris/

ValueError: vectors have different lengths


Nope, vdot doesn't work for row and column vectors. So there is *no* builtin
inner product that works for matrices. I wonder if we should have one, and
if so, what it should be called. I think that vdot should probably be
modified to do the job.  There is also the question of whether or not v.T *
v should be a scalar when v is a column vector. I believe that construction
is commonly used in matrix algebra as an alias for the inner product,
although strictly speaking it uses the mapping between a vector space and
its dual that the inner product provides.
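A sketch of what such a builtin might do, flattening its operands so that 1-d arrays, row vectors and column vectors are all accepted (the name minner is made up for illustration, not an existing NumPy function):

```python
import numpy as np

def minner(a, b):
    # Flatten both operands so row vectors, column vectors and 1-d
    # arrays are treated alike; vdot conjugates its first argument,
    # matching the Dirac convention mentioned above.
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    if a.shape != b.shape:
        raise ValueError("vectors have different lengths")
    return np.vdot(a, b)

assert minner(np.ones(2), np.ones(2)) == 2.0
row = np.ones((1, 2))
assert minner(row, row.T) == 2.0   # row and column vectors now work
```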

Chuck
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Alan G Isaac
>> On Tue, 27 Mar 2007, Robert Kern apparently wrote:
>>> Gram-Schmidt orthogonalization

> Alan G Isaac wrote:
>> I take it from context that you consider it desirable
>> to end up with a list of matrices?


Robert wrote:
> Honestly, I don't care. You asked about iteration, and 
> I gave you an example
> where it was important that the iterates were matrices 
> (such that .T worked correctly).
> Please stop moving the goal posts.

It is really not my intent to be irritating or to move the posts.
But I had two reasons to ask.

First, your claim that this was important in your example is false.
(To see this, replace M[1:] with M.A[1:].)
So actually your code would work the same if iteration over 
matrices were producing arrays
(after replacing M[0] with M[0,:] to match that case).

Second, when I spoke of *desirability*, the output is relevant.
Try numpy.mat(ortho) to see what I mean.
If iteration were to produce arrays,
the outcome of implementing the algorithm
(e.g., using ``dot``)
would, I suggest, be more "desirable",
in that numpy.mat(ortho) would work as hoped/expected.

In this sense, the example perhaps speaks against your intent.
You offer code that would work just fine if iteration 
yielded arrays.

Apologies in advance if this again seems tangential to my 
original request.  To me it does not.


> If you really want iterates to be arrays, just use .A.

I have mentioned that several times.
It is orthogonal to the design issue I raised.
(I suggested that under the current behavior you do not 
usually end up with what you want when iterating over matrices.)
But again, I recognize that I am alone in that view.
I am just trying to understand why.
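For concreteness, the behavioral difference at issue can be sketched like this; iterating over a matrix yields 1xN matrices (so .T is meaningful), while iterating over its .A view yields 1-d arrays:

```python
import numpy as np

M = np.matrix([[1., 2.], [3., 4.]])

# Iterating over the matrix yields 1x2 matrices, so .T turns each
# row into a column -- what the Gram-Schmidt example relied on.
row = next(iter(M))
assert row.shape == (1, 2)
assert row.T.shape == (2, 1)

# Iterating over the .A view yields 1-d arrays, for which .T is a no-op.
arr = next(iter(M.A))
assert arr.shape == (2,)
assert arr.T.shape == (2,)
```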

Cheers,
Alan Isaac




Re: [Numpy-discussion] use of concatenate with subclass of ndarray

2007-03-27 Thread Pierre GM
On Tuesday 27 March 2007 20:08:04 Bryce Hendrix wrote:
> We have a class which is a subclass of ndarray which defines
> __array_finalize__ to add an attribute. The class looks something like
> this:

Ahah, we ran into this problem a few months ago:
You should not initialize your "units" attribute in the __new__ method: first, 
it is not thread-safe, but more importantly, you will lose the attribute if you 
try to get a view of your new object (because in that case, __new__ is not 
called).
Instead, try something like that:

def __array_finalize__(self, obj):
    self.units = getattr(obj, 'units', yourdefaultunit)

where yourdefaultunit is the value you would use in __new__. True, 
yourdefaultunit may change, and even be lost if there's a break between 
__new__ and __array_finalize__, but that should work. Lemme know.

Note that things are always tricky w/ concatenate: when you concatenate 
UnitArrays, you should get a UnitArray, OK. But what if you mix UnitArray 
and ndarray? Which should take precedence?
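A minimal, runnable sketch of the advice above (with yourdefaultunit simply replaced by None for illustration); views, slices and ufunc results all pick the attribute up through __array_finalize__:

```python
import numpy as np

class UnitArray(np.ndarray):
    def __new__(cls, data, units=None):
        obj = np.asarray(data).view(cls)   # __array_finalize__ runs here
        obj.units = units
        return obj

    def __array_finalize__(self, obj):
        # Called for explicit construction, views, and ufunc results
        # alike, so the attribute survives all three.
        self.units = getattr(obj, 'units', None)

a = UnitArray([1., 2., 3.], units='m')
assert a[1:].units == 'm'       # slicing keeps the attribute
assert (2 * a).units == 'm'     # so do ufunc results
```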


[Numpy-discussion] use of concatenate with subclass of ndarray

2007-03-27 Thread Bryce Hendrix
We have a class which is a subclass of ndarray which defines 
__array_finalize__ to add an attribute. The class looks something like 
this:

class UnitArray(ndarray):
    # ...
    def __new__(cls, data, dtype=None, copy=True, units=None):
        # ...
        arr = array(data, dtype=dtype, copy=copy)
        res = ndarray.__new__(cls, arr.shape, arr.dtype, buffer=arr)
        res.units = units
        return res

    def __array_finalize__(self, obj):
        if isinstance(obj, UnitArray):
            if hasattr(obj, 'units'):
                self.units = obj.units

This works for all the ufuncs I've tried, but the result of concatenate 
is missing the units attribute. Is this a bug with concatenate, or 
expected behavior?

The real class can be found here:
https://svn.enthought.com/svn/enthought/trunk/src/lib/enthought/numerical_modeling/units/unit_array.py

Bryce



Re: [Numpy-discussion] Complex 'where' expression

2007-03-27 Thread Tom Denniston
You need to use numpy.logical_and.

Python's 'and' and 'or' operators do not work elementwise on arrays, to
the best of my understanding. I think it is a limitation on the way
logical operators are applied in Python, or maybe it is an intentional
numpy limitation.  I'm sure others on the list could explain better.

Also, where is the wrong function here.  You want compress:

In [8]: image2 = numpy.array([-1,1,1,1])

In [9]: image1 = numpy.array([1,-1,2,3])

In [10]: numpy.arange(len(image2)).compress(numpy.logical_and(image1 >
0, image2 > 0))
Out[10]: array([2, 3])

Others might have a better way but this at least works.
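For reference, here is Ludwig's original pseudo-code rewritten with the elementwise '&' operator (equivalent to numpy.logical_and) plus numpy.where, which returns the index arrays directly; the sigma and quality values are made up for the example:

```python
import numpy as np

image1 = np.array([1, -1, 2, 3])
image2 = np.array([-1, 1, 1, 1])
sigma = np.array([0.1, 0.4, 0.2, 0.9])   # illustrative values
quality = 0.5                            # illustrative threshold

# '&' is elementwise AND; Python's 'and' tries to collapse each array
# to a single truth value, which raises the "ambiguous" error.
indexes = np.where((image1 > 0) & (image2 > 0) & (sigma < quality))

result = np.zeros(image1.shape)
result[indexes] = 7.0                    # store a scalar, as in the pseudo-code
assert list(indexes[0]) == [2]
```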

--Tom

On 3/27/07, Ludwig <[EMAIL PROTECTED]> wrote:
> A bit new to numpy, trying to move over some IDL code.
>
> I have a complex array selection expression, where I need an array of indexes 
> as
> a result, so I can use this as a selection expression to apply changes to a
> different array.
>
> I have two images, image1 and image2, plus an array sigma, the result of a
> previous calculation. All have the same shape. quality is a scalar threshold.
>
> I need the indexes where image1 and image2 are not 0 and the sigma value at 
> that
> point is lower than my threshold. I then take these indices and store some 
> value
> against this in a different array.
>
> In pseudo code:
>
> indexes = where(image1 > 0 and image2 > 0 and sigma < quality)
> result[indexes] = i # scalar
>
>
> When I run this numpy tells me that the truth value of the array
> comparison is ambiguous -- but I do not want to compare the whole arrays here,
> but the value for every point.
>
> How do I do this?
>
> Regards
>
> Ludwig
>


Re: [Numpy-discussion] Buffer interface PEP

2007-03-27 Thread Travis Oliphant
Zachary Pincus wrote:
> Hi,
>
>   
> Is this saying that either NULL or a pointer to "B" can be supplied  
> by getbufferproc to indicate to the caller that the array is unsigned  
> bytes? If so, is there a specific reason to put the (minor)  
> complexity of handling this case in the caller's hands, instead of  
> dealing with it internally to getbufferproc? In either case, the  
> wording is a bit unclear, I think.
>   

Yes, the wording could be more clear.   I'm trying to make it easy for 
exporters to change to the new buffer interface.   

The main idea I really want to see is that if the caller just passes 
NULL instead of an address then it means they are assuming the data will 
be "unsigned bytes".  It is up to the exporter to either allow this or 
raise an error. 

The exporter should always be explicit if an argument for returning the 
format is provided (I may have thought differently a few days ago).

> The general question is that there are several other instances where  
> getbufferproc is allowed to return ambiguous information which must  
> be handled on the client side. For example, C-contiguous data can be  
> indicated either by a NULL strides pointer or a pointer to a 
> properly-constructed strides array. 

Here.  I'm trying to be easy on the exporter and the consumer.  If the 
data is contiguous, then neither the exporter nor the consumer will 
likely care about the strides.  Allowing this to be NULL is like the 
current array protocol convention which allows this to be None.  

> Clients that can't handle C-contiguous  
> data (contrived example, I know there is a function to deal with  
> that) would then need to check both for NULL *and* inside the strides  
> array if not null, before properly deciding that the data isn't  
> usable to them.
Not really.  A client that cannot deal with strides will simply not pass 
an address to a stride array to the buffer protocol (that argument will 
be NULL).  If the exporter cannot provide memory without stride 
information, then an error will be raised.

> Similarly, the suboffsets can be either all negative or  
> NULL to indicate the same thing.
>   
I think it's much easier to check if suboffsets is NULL rather than 
checking all the entries to see if they are -1 for the very common case 
(i.e. the NumPy case) of no dereferencing.  Also, if you can't deal 
with suboffsets you would just not provide an address for them.
> Might it be more appropriate to specify only one canonical behavior  
> in these cases? Otherwise clients which don't do all the checks on  
> the data might not properly interoperate with providers which format  
> these values in the alternate manner.
>   
It's important to also be easy to use.  I don't think clients should be 
required to ask for strides and suboffsets if they can't handle them. 
>
> Also, some typos, and places additional clarification could help:
>
>   
>> 253 PYBUF_STRIDES (strides and isptr)
>> 
> Should 'isptr' be 'suboffsets'?
>   

Yes, but I think we are going to take out the multiple locks.
>   
>> 75 of a larger array can be described without copying the data.   T
>> 
> Dangling 'T'.
>   
Thanks,

>   
>> 279 Get the buffer and optional information variables about the  
>> buffer.
>> 280 Return an object-specific view object (which may be simply a
>> 281 borrowed reference to the object itself).
>> 
> This phrasing (and similar phrasing elsewhere) is somewhat opaque to  
> me. What's an "object-specific view object"?
>   
At the moment it's the buffer provider.  It is not defined because it 
could be a different thing for each exporter.   We are still discussing 
this particular point and may drop it.
>   
>> 333 The struct string-syntax is missing some characters to fully
>> 334 implement data-format descriptions already available elsewhere (in
>> 335 ctypes and NumPy for example).  Here are the proposed additions:
>> 
> Is the following table just the additions? If so, it might be good to  
> show the full spec, and flag the specific additions. If not, then the  
> additions should be flagged.
>   

Yes, these are just the additions.  I don't want to do the full spec, it 
is already available elsewhere in the Python docs.

>   
>> 341 't'   bit (number before states how many bits)
>> 
> vs.
>   
>> 372 According to the struct-module, a number can precede a character
>> 373 code to specify how many of that type there are.  The
>> 
> I'm confused -- could this be phrased more clearly? Does '5t' refer  
> to a field 5-bits wide, or 5-one bit fields? Is 'ttttt' allowed? If  
> so, is it equivalent to or different from '5t'?
>   
Yes, 'ttttt' is equivalent to '5t', and the difference between one field 
5-bits wide and 5 one-bit fields is a confusion based on thinking there 
are fields at all.   Both of those are equivalent.  If you want "fields" 
then you have to define names. 

>   
>> 378 Functions should be added to ctypes to create a ctypes object from
>> 379 a struct description, and add long-double, and ucs-2 to ctypes.

Re: [Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Pierre GM
> Is there something deficient about numpy.core.ma?

numpy.core.ma objects are not ndarrays. Therefore, their mask disappears when 
they are used as the input of something like numpy.array(data, subok=True), 
which quickly becomes problematic. 
Moreover, they are not especially subclassing-friendly, which was my biggest 
bone of contention, and the main reason why I started rewriting the 
implementation.


Re: [Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Glen W. Mabey
On Tue, Mar 27, 2007 at 05:47:24PM -0500, Glen W. Mabey wrote:
> Is there something deficient about numpy.core.ma?

Okay, I just found sandbox/maskedarray/CHANGELOG .

Glen


Re: [Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Glen W. Mabey
On Tue, Mar 27, 2007 at 06:35:51PM -0400, Pierre GM wrote:
> 
> > Whoa.  You didn't mean numpy SVN, I take it.  So, will numpy.median not
> > ever take into account the mask?
> 
> Mmh, that should be part of numpy.core.ma instead: let's keep the masked 
> arrays a bit on the side.
>
> > And, how recently did you upload them?  I updated scipy today and I
> > still get:
> 
> Ah, sorry, I should have been clearer: scipy SVN, 
> sandbox/maskedarray/mstats.py is the file I was mentioning. That's more of a 
> place holder than an actual library at this point, I'll update it regularly 
> when the need arises. 
> I can't start messing with scipy.stats to take masked arrays into account, 
> as I'm far too biased towards maskedarray vs numpy.core.ma. 

Uh, forgive me but I'm gleaning from your comments that numpy.ma is not
the same as a maskedarray in scipy?

Okay, yup, that's clarified in sandbox/maskedarray/README .

Is there something deficient about numpy.core.ma?

Thanks,
Glen


Re: [Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Pierre GM

> Whoa.  You didn't mean numpy SVN, I take it.  So, will numpy.median not
> ever take into account the mask?

Mmh, that should be part of numpy.core.ma instead: let's keep the masked 
arrays a bit on the side.

> And, how recently did you upload them?  I updated scipy today and I
> still get:

Ah, sorry, I should have been clearer: scipy SVN, 
sandbox/maskedarray/mstats.py is the file I was mentioning. That's more of a 
place holder than an actual library at this point, I'll update it regularly 
when the need arises. 
I can't start messing with scipy.stats to take masked arrays into account, as 
I'm far too biased towards maskedarray vs numpy.core.ma. 

I'm not familiar with memmap, but the fact that numpy.core.ma masked arrays 
are not ndarrays may be a hint.


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Alan Isaac
On Tue, 27 Mar 2007, Fernando Perez wrote: 
> it's probably worth mentioning, as a data point, that the 
> python language has a well established precedent for 
> 'iteration over a FOO where a FOO is yielded': 
> In [1]: s='abc' 

Yes.
And I recall being bitten by this long ago.
Now it feels natural enough, I confess.

But then, on reflection, it is an oddity.  ;-)

Cheers,
Alan Isaac






Re: [Numpy-discussion] A unique function...

2007-03-27 Thread Pierre GM
Zach,
> There's unique and unique1d, but these don't output the number of
> occurrences.
> There's also bincount, which outputs the number of each element, but
> includes zeros for non-present elements and so could be problematic
> for certain data.

Well, I found it anyway. As I was pretty sure it was a private function, I 
grepped on 'def _' | sed '/__/' and found it in scipy.stats.morestats: it's 
the find_repeats function.
Note that I could probably build one with a combo unique1d/setmember1d.
Thanks nevertheless.


Re: [Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Glen W. Mabey
On Tue, Mar 27, 2007 at 05:54:05PM -0400, Pierre GM wrote:
> On Tuesday 27 March 2007 17:42:18 Glen W. Mabey wrote:
> > However, using a memmap for the mask value seems to load the entire
> > object into memory, even if it is already of type bool.
> >
> > How hard would it be to change that behavior?
> 
> Oh, I have no idea about that...

Anyone else care to comment on this?

> > Also, it seems that median() is not implemented on masked arrays?
> 
> Great minds meet, I just uploaded some very basic stats functions for the new 
> implementation of masked array in the scipy SVN.

Whoa.  You didn't mean numpy SVN, I take it.  So, will numpy.median not
ever take into account the mask?

And, how recently did you upload them?  I updated scipy today and I
still get:

In [41]:scipy.stats.median( ma_data[100:,1] )
Out[41]:1989792384.0

In [42]:numpy.median( ma_data[100:,1].filled() )
Out[42]:1989792384.0

In [43]:ma_data[100:,1]
Out[43]:
array(data =
 [ 8829714.  -100.   7007859.  ...,  3921109.5  3815157.   4447688.  ],
  mask =
 [False  True False ..., False False False],
  fill_value=-100.0)


> > Is there some method for a masked array like ravel() but that only
> > includes the non-masked values?
> 
> Yep, compressed(). 

Perfect.  Thank you.

--
Glen W. Mabey

"Happiness in marriage and parenthood can exceed a thousand times any 
 other happiness."
  -- James Esdras Faust


Re: [Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Pierre GM
On Tuesday 27 March 2007 17:42:18 Glen W. Mabey wrote:
> However, using a memmap for the mask value seems to load the entire
> object into memory, even if it is already of type bool.
>
> How hard would it be to change that behavior?

Oh, I have no idea about that...

> Also, it seems that median() is not implemented on masked arrays?

Great minds meet, I just uploaded some very basic stats functions for the new 
implementation of masked array in the scipy SVN.

> Is there some method for a masked array like ravel() but that only
> includes the non-masked values?

Yep, compressed(). Note that it will flatten your array first, so if you work 
with nd arrays, that's not the best solution. If you work with 2d arrays, you 
can simply loop on the other dimension, such as

(n, p) = data.shape
med = numpy.empty((p,), dtype=data.dtype)
for i in range(p):
    med[i] = numpy.median(data[:, i].compressed())

That works well if p < n/3.


[Numpy-discussion] masked array with mask an instance of numpy.memmap

2007-03-27 Thread Glen W. Mabey

I've been using numpy.memmap to access large arrays very nicely, but now
I desire to instantiate a masked array that uses memmap objects.

Using a memmap instance for the data works just fine and as expected.

However, using a memmap for the mask value seems to load the entire
object into memory, even if it is already of type bool.

How hard would it be to change that behavior?

Also, it seems that median() is not implemented on masked arrays?

Is there some method for a masked array like ravel() but that only
includes the non-masked values?

Thank you,
Glen Mabey



Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Robert Kern
Alan G Isaac wrote:
> On Tue, 27 Mar 2007, Robert Kern apparently wrote: 
>> Gram-Schmidt orthogonalization 
> 
> I take it from context that you consider it desirable
> to end up with a list of matrices?

Honestly, I don't care. You asked about iteration, and I gave you an example
where it was important that the iterates were matrices (such that .T worked
correctly).

Please stop moving the goal posts.

> I guess I would find it more natural to work with the 
> arrays, but perhaps that starts just being taste.

Personally, I find working with arrays from start to finish the most natural.

If you really want iterates to be arrays, just use .A.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] A unique function...

2007-03-27 Thread Zachary Pincus
Hello,

There's unique and unique1d, but these don't output the number of  
occurrences.
There's also bincount, which outputs the number of each element, but  
includes zeros for non-present elements and so could be problematic  
for certain data.
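In case it helps, a sort-based sketch of the missing combination -- unique values together with their counts -- built only from sorting and differencing primitives (the function name is made up):

```python
import numpy as np

def unique_with_counts(a):
    # Sort, find the boundaries where the value changes, and take
    # differences of the boundary positions to get the counts.
    # Assumes a non-empty input for simplicity.
    a = np.sort(np.asarray(a).ravel())
    change = np.empty(a.shape, dtype=bool)
    change[0] = True
    change[1:] = a[1:] != a[:-1]
    values = a[change]
    starts = np.flatnonzero(change)
    counts = np.diff(np.append(starts, a.size))
    return values, counts

vals, counts = unique_with_counts([3, 1, 2, 3, 3, 2])
assert list(vals) == [1, 2, 3]
assert list(counts) == [1, 2, 3]
```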

Zach



On Mar 27, 2007, at 2:28 PM, Pierre GM wrote:

> All,
> I could swear that I ran once into a numpy (or scipy) function that  
> output the unique values of a 1d ndarray along with the number of  
> occurrences of each element. If I'm not completely mistaken, it's  
> a private function initially in Fortran.
> Does this ring a bell to anyone? Where could I find this function?



[Numpy-discussion] A unique function...

2007-03-27 Thread Pierre GM
All,
I could swear that I ran once into a numpy (or scipy) function that output the 
unique values of a 1d ndarray along with the number of occurrences of each 
element. If I'm not completely mistaken, it's a private function initially 
in Fortran.
Does this ring a bell to anyone? Where could I find this function?


Re: [Numpy-discussion] Buffer interface PEP

2007-03-27 Thread Zachary Pincus
Hi,

I have a specific question and then a general question, and some  
minor issues for clarification.

Specifically, regarding the arguments to getbufferproc:
> 166 format
> 167     address of a format-string (following extended struct
> 168     syntax) indicating what is in each element
> 169     of memory.  The number of elements is len / itemsize,
> 170     where itemsize is the number of bytes implied by the format.
> 171     NULL if not needed in which case format is "B" for
> 172     unsigned bytes.  The memory for this string must not
> 173     be freed by the consumer --- it is managed by the exporter.

Is this saying that either NULL or a pointer to "B" can be supplied  
by getbufferproc to indicate to the caller that the array is unsigned  
bytes? If so, is there a specific reason to put the (minor)  
complexity of handling this case in the caller's hands, instead of  
dealing with it internally to getbufferproc? In either case, the  
wording is a bit unclear, I think.

The general question is that there are several other instances where  
getbufferproc is allowed to return ambiguous information which must  
be handled on the client side. For example, C-contiguous data can be  
indicated either by a NULL strides pointer or a pointer to a  
properly-constructed strides array. Clients that can't handle C-contiguous  
data (contrived example, I know there is a function to deal with  
that) would then need to check both for NULL *and* inside the strides  
array if not null, before properly deciding that the data isn't  
usable to them. Similarly, the suboffsets can be either all negative or  
NULL to indicate the same thing.

Might it be more appropriate to specify only one canonical behavior  
in these cases? Otherwise clients which don't do all the checks on  
the data might not properly interoperate with providers which format  
these values in the alternate manner.


Also, some typos, and places additional clarification could help:

> 253 PYBUF_STRIDES (strides and isptr)
Should 'isptr' be 'suboffsets'?

> 75 of a larger array can be described without copying the data.   T
Dangling 'T'.

> 279 Get the buffer and optional information variables about the  
> buffer.
> 280 Return an object-specific view object (which may be simply a
> 281 borrowed reference to the object itself).
This phrasing (and similar phrasing elsewhere) is somewhat opaque to  
me. What's an "object-specific view object"?

> 287 Call this function to tell obj that you are done with your "view"
Similarly, the 'view' concept and terminology should be defined more  
clearly in the preamble.

> 333 The struct string-syntax is missing some characters to fully
> 334 implement data-format descriptions already available elsewhere (in
> 335 ctypes and NumPy for example).  Here are the proposed additions:
Is the following table just the additions? If so, it might be good to  
show the full spec, and flag the specific additions. If not, then the  
additions should be flagged.

> 341 't'   bit (number before states how many bits)
vs.
> 372 According to the struct-module, a number can preceed a character
> 373 code to specify how many of that type there are.  The
I'm confused -- could this be phrased more clearly? Does '5t' refer  
to a field 5 bits wide, or to five one-bit fields? Is 't' alone allowed?  
If so, is it equivalent to or different from '5t'?
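For reference, the existing struct-module convention the quoted text appeals to is repetition -- a count repeats the code rather than widening the field. A small check (note that 't' itself is only a proposed addition and is not in the stdlib struct module):

```python
import struct

# In the stdlib struct module a leading count repeats the code:
# '5b' is five one-byte fields, not one five-byte field.
assert struct.calcsize('5b') == 5
assert struct.calcsize('5b') == struct.calcsize('bbbbb')
```

The question above is whether a count before 't' repeats in the same way (five one-bit fields) or instead widens the field to five bits.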

> 378 Functions should be added to ctypes to create a ctypes object from
> 379 a struct description, and add long-double, and ucs-2 to ctypes.
Very cool.

In general, the logic of the 'locking mechanism' should be described  
at a high level at some point. It's described in nitty-gritty  
details, but at least I would have appreciated a bit more of a  
discussion about the general how and why -- this would be helpful to  
clients trying to use the locking mechanism properly.

Thanks to Travis and everyone else involved for your work on this  
cogent and sorely-needed PEP.

Zach



On Mar 27, 2007, at 12:42 PM, Travis Oliphant wrote:

>
> Hi all,
>
> I'm having a good discussion with Carl Banks and Greg Ewing over on
> python-dev about the buffer interface. We are converging on a pretty
> good design, IMHO.   If anybody wants to participate in the  
> discussion,
> please read the PEP (there are a few things that still need to change)
> and add your two cents over on python-dev (you can read it through the
> gmane gateway and participate without signing up).
>
> The PEP is stored in numpy/doc/pep_buffer.txt on the SVN tree for  
> NumPy
>
> Best regards,
>
> -Travis
>
>

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Joachim Dahl

On 3/27/07, Zachary Pincus <[EMAIL PROTECTED]> wrote:



So we are left with the current behavior:
(a) M[i,:] is a matrix.
(b) M[i] is a matrix.
(c) Iteration over M yields matrices.



I find it useful if M[i] indexes the matrix interpreted as a column-vector
(vec(M), or M(:) in
Matlab notation),  e.g.,  you could pick out the diagonal as M[::n+1],  or
if N is an index-set
of certain elements in M,  then you could access those elements as M[N].
Then I would also say that iterating over a matrix should just return the
elements of vec(M) one by
one.
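In current numpy, this vec-style linear indexing can already be emulated with `.flat` -- a sketch (numpy's flat order is row-major, unlike Matlab's column-major vec, but the diagonal stride trick works either way):

```python
import numpy as np

n = 3
M = np.arange(n * n).reshape(n, n)
# Linear indexing of the flattened matrix, akin to Matlab's M(1:n+1:end):
diag = M.flat[::n + 1]
assert diag.tolist() == [0, 4, 8]
```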

In general, I think the Matlab notation works well:
* M[I,J]  where I and J are index-sets returns a len(I) x len(J) matrix
* M[I] returns a len(I) vector (a column or row vector depending on
orientation of I)
The index-sets could be a single integer, a list, or a slice.

But I can see there are good arguments for using the NumPy conventions as
well...


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Fernando Perez
On 3/27/07, Alan G Isaac <[EMAIL PROTECTED]> wrote:
> Hi Zach,
>
> The use case I requested was for iteration over a
> matrix where it is desirable that matrices are yielded.

I'm personally not really partial to any side of this discussion,
given how I don't use matrices at all (I'm content with a simple
mental model of arrays, and use dot() as necessary).  But it's
probably worth mentioning, as a data point, that the python language
has a well established precedent for 'iteration over a FOO where a FOO
is yielded':

In [1]: s='abc'

In [2]: type(s)
Out[2]: <type 'str'>

In [3]: map(type,s)
Out[3]: [<type 'str'>, <type 'str'>, <type 'str'>]


I know strings aren't matrices, so take this as you will in terms of
being plus, neutral or minus regarding your discussion.  I'm just
offering the data, not interpreting it :)

Cheers,

f


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Zachary Pincus
> The use case I requested was for iteration over a
> matrix where it is desirable that matrices are yielded.
> That is not what you offered.

No, I offered a thorough discussion of why the design of numpy, as I  
imperfectly understand it, makes the trade-offs to achieve your  
desired goal unpalatable.

> The context for this request is my own experience:
> whenever I have needed to iterate over matrices,
> I have always wanted the arrays.  So I am simply
> interested in seeing an example of the opposite desire.

I'm glad that Robert provided such a nice example.

Nevertheless, I think that even absent any example it should be  
pretty clear at this point that if you want iteration over matrices  
to return arrays, either:
(1) you need to be willing to accept that list(iter(M))[i] != M[i]
or
(2) M[i], which equals M[i,:], should be an array.
(or 3, that M[i] and iteration over M is not defined)

Clearly, (1) is fundamentally ridiculous. Clearly use-cases (as well  
as the mathematical definition of matrices) abound to illustrate why  
(2) is no good. Option (3) helps nobody. So which is it? You have to  
choose one!

It is my assertion that if you are iterating over matrices and hoping  
to get arrays, you are misusing the matrix class. Almost by  
definition if you expect M[i,:] or M[i] (or equivalently, the output  
of an iterator over M) to be an array, then M is being used outside  
of a linear algebra context -- and thus, in my view, misused. In  
those cases, the matrix should be cast to an array, which has the  
desired behavior. Robert nicely illustrated a linear-algebraic  
context for matrix iteration.

Now, Bill offers up a different suggestion: indexing M yields neither  
a matrix nor an array, but a class that operates more or less like an  
array, except insofar as it interacts with other matrix objects, or  
other objects of similar classes. I'm interested in hearing more  
about this, what trade-offs or other compromises it might involve.

Zach


>
> Cheers,
> Alan Isaac
>
>
>


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Alan G Isaac
On Tue, 27 Mar 2007, Robert Kern apparently wrote: 
> Gram-Schmidt orthogonalization 

I take it from context that you consider it desirable
to end up with a list of matrices?
I guess I would find it more natural to work with the 
arrays, but perhaps that starts just being taste.

Thank you,
Alan Isaac




Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Robert Kern
Alan G Isaac wrote:
> Hi Zach,
> 
> The use case I requested was for iteration over a
> matrix where it is desirable that matrices are yielded.
> That is not what you offered.
> 
> The context for this request is my own experience:
> whenever I have needed to iterate over matrices,
> I have always wanted the arrays.  So I am simply
> interested in seeing an example of the opposite desire.

Gram-Schmidt orthogonalization.

ortho = [mat[0] / sqrt(mat[0] * mat[0].T)]
for rowv in mat[1:]:
    for base in ortho:
        rowv = rowv - base * float(rowv * base.T)
    rowv = rowv / sqrt(rowv * rowv.T)
    ortho.append(rowv)
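With imports added and a quick self-check appended, the loop above can be exercised like this (the random test matrix is just an assumption for illustration):

```python
import numpy as np

mat = np.matrix(np.random.RandomState(0).rand(4, 4))
ortho = [mat[0] / np.sqrt(mat[0] * mat[0].T)]
for rowv in mat[1:]:          # iterating a matrix yields 1xn matrices,
    for base in ortho:        # which is exactly what makes this loop work
        rowv = rowv - base * float(rowv * base.T)
    rowv = rowv / np.sqrt(rowv * rowv.T)
    ortho.append(rowv)

Q = np.matrix(np.vstack(ortho))
assert np.allclose(Q * Q.T, np.eye(4))   # the rows are orthonormal
```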

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Alan G Isaac
Hi Zach,

The use case I requested was for iteration over a
matrix where it is desirable that matrices are yielded.
That is not what you offered.

The context for this request is my own experience:
whenever I have needed to iterate over matrices,
I have always wanted the arrays.  So I am simply
interested in seeing an example of the opposite desire.

Cheers,
Alan Isaac





Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Zachary Pincus
Hello all,

I suspect my previous email did not contain the full chain of my  
reasoning, because I thought that some parts were basically obvious.  
My point was merely that given some pretty basic fundamental tenets  
of numpy, Alan's suggestions quickly lead to *serious issues* far  
worse than the original problem. Thanks Chris for making this more  
explicit; here I will do so further.

Let A be an array of arbitrary shape, and M be m-by-n matrix.

(1) A[i] equals by definition A[i,...]

This is a pretty fundamental part of numpy, right? That property (and  
its relatives) along with the broadcasting behavior, are in large  
part what make numpy so very expressive compared to matlab, etc.

I can write code in numpy to do multidimensional operations over  
arbitrary-shaped and arbitrary-dimensioned data that look *exactly*  
like the corresponding 1D operations. It's amazing, really, how  
numpy's "implicit indexing" (this property), broadcasting (sort of  
the inverse of this, really, where shape tuples are extended as  
necessary with implicit 1's, instead of slices being extended  
with :'s as necessary), and fancy indexing, allow complex operations  
to be written compactly, clearly, and in a fully-general manner.

I thought that point (1) was obvious, as it's a pretty fundamental  
part of numpy (and Numeric and Numarray before that). It's in  
absolutely no way a "trivial convenience". Of course, broadcasting  
and ufuncs, etc, contribute a lot more to expressiveness than this  
property, but it's part of the family.


(2) Thus, M[i] equals by definition M[i,:]. By the mathematical  
definition of matrices and for pragmatic reasons, M[i,:] is another  
matrix.

Here the trouble starts. Should matrices, not being arbitrary in  
their dimension, even have the ability to be indexed as M[i]? Or  
should they be completely different beasts than arrays? Which  
violates the "principle of least surprise" more?

Now, I think we all more or less agree that M[i,:] has to be a  
matrix, because otherwise the matrix class is pretty worthless for  
linear algebra, and doesn't operate according to the mathematical  
definition of a matrix. Indeed, this is one of the basic properties  
of the numpy matrix -- that, and that the multiplication operator is  
defined as matrix-multiplication. Remove that and there's almost no  
point in *having* a separate matrix class.

Want a use-case? Do the singular value decomposition, and get the  
product of the second-smallest singular vectors as Vt[:,-2] * U[-2,:].  
(Things vaguely akin to this come up a lot.) You can copy the  
operation from any linear algebra text if the row- and column-vectors  
are treated as above. Otherwise you have to wonder whether that's  
supposed to be an outer or inner product, and use the corresponding  
operator -- and at that point, why was I using matrix classes anyway?
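A sketch of that use-case (shapes only; the random test matrix and the np.matrix wrapping of svd's outputs are assumptions for illustration):

```python
import numpy as np

M = np.matrix(np.random.RandomState(0).rand(4, 4))
U, s, Vt = np.linalg.svd(M)
# Matrix slices stay 2-d, so this is unambiguously a (4,1) x (1,4)
# outer product -- no guessing about inner vs. outer:
outer = np.matrix(Vt)[:, -2] * np.matrix(U)[-2, :]
assert outer.shape == (4, 4)
```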

(3) If M[i] = M[i,:], and M[i,:] is a matrix, then the iteration  
behavior of matrices *has* to yield other matrices, unless Alan is  
willing to suggest that list(iter(m))[i] should have a different type  
than M[i] -- a clear absurdity. This is the point that I was trying  
to make previously, having mistakenly assumed that point (1) was  
clear to all, and that point (2) had been clearly established.

So, if we trace the reasoning of Alan's suggestion, coupled with  
these above properties, we get just that:
(a) Iteration over M should yield arrays.
(b) Ergo M[i] should be an array.
(c) Ergo M[i,:] should be an array.
(d) Ergo the matrix class should be identical to the array class  
except for overloaded multiplication, and the only way to get a 'row'  
or 'column' vector from a matrix that operates as a proper row or  
column vector is M[i:i+1,:]. Which is a lot of typing to get  
the i-th vector from the matrix.

I don't think these changes are worth it -- it basically discards the  
utility of the matrix class for linear algebra in order to make the  
matrix class more useful for general purpose data storage (a role  
already filled by the array type).

Here is another set of changes which would make matrices fully  
consistent with their linear-algebra roots:
(a) M[i,:] should be a matrix.
(b) M[i] is an error.
(c) Iteration over M is an error.

That's kind of lame though, right? Because then matrices are  
completely inconsistent with their numpy roots.

So we are left with the current behavior:
(a) M[i,:] is a matrix.
(b) M[i] is a matrix.
(c) Iteration over M yields matrices.
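That current behavior is easy to verify with a minimal check:

```python
import numpy as np

M = np.matrix([[1, 2], [3, 4]])
assert type(M[0, :]) is np.matrix and M[0, :].shape == (1, 2)   # (a)
assert type(M[0]) is np.matrix and (M[0] == M[0, :]).all()      # (b)
assert all(type(row) is np.matrix for row in M)                 # (c)
```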


My view is that, fundamentally, they are linear algebra constructs.  
However, I also think it would be far more harmful to remove basic  
behavior common to the rest of numpy (e.g. that M[i] = M[i,:]), and  
behavior that comes along with that (iterability). Hence my  
suggestion: treat matrices like linear algebra constructs -- don't  
iterate over them (unless you know what you're doing). Don't index  
them like M[i] (unless you know what you're doing).

Maybe it's "just a 

[Numpy-discussion] Buffer interface PEP

2007-03-27 Thread Travis Oliphant

Hi all,

I'm having a good discussion with Carl Banks and Greg Ewing over on 
python-dev about the buffer interface. We are converging on a pretty 
good design, IMHO.   If anybody wants to participate in the discussion, 
please read the PEP (there are a few things that still need to change) 
and add your two cents over on python-dev (you can read it through the 
gmane gateway and participate without signing up).

The PEP is stored in numpy/doc/pep_buffer.txt on the SVN tree for NumPy

Best regards,

-Travis




Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Bill Spotz
It seems to me that using shapes, (m,1) versus (1,n), to determine  
whether a vector is column- or row-oriented is a hack (or at least  
feels like one).  If we know we have a vector, then we want to use a  
single index to obtain a scalar, and that extra "0," or ",0"  
shouldn't be necessary.

It also seems to me that there is an object-oriented design solution  
to this, namely defining column_vector and row_vector classes that  
inherit from matrix, that accept a single index, but behave as  
matrices consistent with their type.  I'm sure there are implications  
to this I haven't thought through yet, but at least it gives us  
logical indexing AND persistence of matrices through operations.
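A minimal sketch of what such a class might look like (row_vector is a hypothetical name for illustration, not an existing numpy class):

```python
import numpy as np

class row_vector(np.matrix):
    """Hypothetical sketch: a 1xn matrix that accepts a single index."""
    def __getitem__(self, index):
        if isinstance(index, (int, np.integer)):
            # A single index picks out a scalar -- no leading "0," needed.
            return np.matrix.__getitem__(self, (0, index))
        return np.matrix.__getitem__(self, index)

r = row_vector([[1, 2, 3]])
assert r[1] == 2            # scalar from a single index
assert r.shape == (1, 3)    # but otherwise still a 1x3 matrix
```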

On Mar 27, 2007, at 1:11 PM, Travis Oliphant wrote:

> Thanks for listing your points.   I can see that this is an issue  
> where
> reasonable people will disagree because there are multiple ways of
> looking at it.
>
> The idea that matrix selection would return matrices really comes from
> wanting to keep matrices more persistent through operations.
>
> M[0]  could be seen either as a 1xn matrix or a n-length array.   I
> agree that both concepts are possible.  Seeing it as a 1xn matrix  
> allows
> matrices to remain persistent more often.
>
> So, the arguments for the current approach and the arguments  
> against it
> to me seem on the same level, so I don't see a reason to change the
> current behavior and see a lot of strong reasons not to change the
> behavior (we are in a 1.0 release and could not change anything  
> until at
> least 1.1 anyway).
>
> With that said:
>
> One of my goals for the next year or two is to create a matrix  
> class in
> C and incorporate CVXOPT matrices and its coupled sparse
> matrix.  We
> can re-visit this question in that process.  I would like for there to
> be a sparse matrix implementation in NumPy (without the solver which
> will remain in SciPy).
>
> But, the sparse matrix and the matrix need to have the same behaviors
> and should be able to interoperate with each other.
>
> So, if you would like to help with that project all input is welcome.

** Bill Spotz  **
** Sandia National Laboratories  Voice: (505)845-0170  **
** P.O. Box 5800 Fax:   (505)284-5451  **
** Albuquerque, NM 87185-0370Email: [EMAIL PROTECTED] **





Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Travis Oliphant
Alan Isaac wrote:
> On Mon, 26 Mar 2007, Travis Oliphant wrote: 
>   
>> It actually has been offered.   You just don't accept it.  
>> Matrices are containers of matrices.  
>> If M is an (mxn) matrix then M[0] is a (1xn) matrix.  
>> Viewing this 1xn matrix as a 1-d array loses its row-vectorness.  
>> This seems perfectly logical and acceptable to me.  I'm waiting for a 
>> better explanation as to why this is not acceptable.  
>> Arguments that rest on what is and what isn't "Pythonic" are not very 
>> persuasive as this is very often in the eye of the beholder. 
>> 
>
> Again, I only raised a *question* about whether
> there might be a design problem here. 

Thanks for listing your points.   I can see that this is an issue where 
reasonable people will disagree because there are multiple ways of 
looking at it. 

The idea that matrix selection would return matrices really comes from 
wanting to keep matrices more persistent through operations.

M[0]  could be seen either as a 1xn matrix or a n-length array.   I 
agree that both concepts are possible.  Seeing it as a 1xn matrix allows 
matrices to remain persistent more often.

So, the arguments for the current approach and the arguments against it 
to me seem on the same level, so I don't see a reason to change the 
current behavior and see a lot of strong reasons not to change the 
behavior (we are in a 1.0 release and could not change anything until at 
least 1.1 anyway).

With that said:

One of my goals for the next year or two is to create a matrix class in 
C and incorporate CVXOPT matrices and its coupled sparse matrix.  We 
can re-visit this question in that process.  I would like for there to 
be a sparse matrix implementation in NumPy (without the solver which 
will remain in SciPy). 

But, the sparse matrix and the matrix need to have the same behaviors 
and should be able to interoperate with each other.

So, if you would like to help with that project all input is welcome.

Best regards,

-Travis









>  My goal
> was only to have this explored, and I've tried
> to explain why.
>
> The evidence I offer:
> - it is surprising (at least to some) that iterating
>   over a matrix yields matrices
> - I believe it is unexpected (prior to instruction) and that 
>   there is a natural more expected behavior
> - if that is right, deviation from the expected should have 
>   a good justification
> - this behavior has tripped up at least a couple people and
>   I expect that to happen to many others over time (because 
>   I believe the behavior is unexpected)
> - when I desire to iterate over a matrix I always want the 
>   arrays. (Of course there is a way to have them; that's
>   not the point).  I'm interested to see a use case where
>   the rows are desired as matrices
>
> As you indicate, none of this constitutes an "argument".
> And since nobody appears to agree, I should shut up.
> This will be my last post on this subject.
>
> Cheers,
> Alan Isaac
>
>
>
>


Re: [Numpy-discussion] Complex 'where' expression

2007-03-27 Thread Robert Kern
Ludwig wrote:

> In pseudo code:
> 
> indexes = where(image1 > 0 and image2 > 0 and sigma < quality)
> result[indexes] = i # scalar
> 
> 
> When I run this numpy tells me that the that the truth value of the array
> comparison is ambiguous -- but I do not want to compare the whole arrays here,
> but the value for every point. 

The Python keywords "and", "or", and "not" try to evaluate the truth value of
the objects. We cannot override them to operate elementwise. Instead, the
operators &, |, and ~, respectively, are the elementwise operators on arrays of
booleans. Thus, you want this:

  indexes = where((image1>0) & (image2>0) & (sigma<quality))
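Putting that together with Ludwig's pseudo code, a runnable sketch (the small arrays and threshold are illustrative assumptions):

```python
import numpy as np

image1 = np.array([1.0, 0.0, 2.0])
image2 = np.array([3.0, 4.0, 0.0])
sigma = np.array([0.1, 0.1, 0.9])
quality = 0.5

# & combines the three boolean arrays elementwise; "and" would not.
indexes = np.where((image1 > 0) & (image2 > 0) & (sigma < quality))
result = np.zeros(3)
result[indexes] = 7.0   # scalar stored at the selected points
assert result.tolist() == [7.0, 0.0, 0.0]
```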


[Numpy-discussion] Complex 'where' expression

2007-03-27 Thread Ludwig
A bit new to numpy, trying to move over some IDL code.

I have a complex array selection expression, where I need an array of indexes as
a result, so I can use this as a selection expression to apply changes to a
different array.

I have two images, image1 and image2, plus an array sigma, the result of a
previous calculation. All have the same shape. quality is a scalar threshold.

I need the indexes where image1 and image2 are not 0 and the sigma value at that
point is lower than my threshold. I then take these indices and store some value
against this in a different array.

In pseudo code:

indexes = where(image1 > 0 and image2 > 0 and sigma < quality)
result[indexes] = i # scalar


When I run this numpy tells me that the that the truth value of the array
comparison is ambiguous -- but I do not want to compare the whole arrays here,
but the value for every point. 

How do I do this

Regards

Ludwig








Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Colin J. Williams
Alan G Isaac wrote:
> On Mon, 26 Mar 2007, "Colin J. Williams" apparently wrote: 
>> One would expect the iteration over A to return row 
>> vectors, represented by (1, n) matrices. 
> 
> This is again simple assertion.
> **Why** would "one" expect this?
> Some people clearly do not.
> 
> One person commented that this unexpected behavior was 
> a source of error in their code.
> 
> Another person commented that they did not even guess that 
> such a thing would be possible.
> 
> Experience with Python should lead to the ability to 
> anticipate the outcome.  Apparently this is not the case.
> That suggests a design problem.
> 
> What about **Python** would lead us to expect this behavior??
> 
> In *contrast*, everyone agrees that for a matrix M,
> we should get a matrix from M[0,:].
> This is expected and desirable.

Perhaps our differences lies in two things:

1. the fact that the text books typically take the column vector as the 
default.  For a Python version based on C, it makes more sense to treat 
the rows as vectors, as data is stored contiguously by row.

2. the convention has been proposed that the vector is more conveniently 
implemented as a matrix, where one dimension is one. The vector could be 
treated as a subclass of the matrix but this adds complexity with little 
clear benefit.  PyMatrix has matrix methods isVector, isCVector and 
isRVector.

I can see some merit in conforming to text book usage and would be glad 
to consider changes when I complete the port to numpy, in a few months.

Colin W.
> 
> Cheers,
> Alan Isaac



Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Joachim Dahl

I would like to hear your opinion on developing an explicit sparse/dense 2D
matrix class with indexing similar to Matlab, and without significant
differences between sparse and dense matrices from the user's
perspective...

I know that it's not one of Numpy/Scipy's goals to clone Matlab,  but I
think I represent a potentially large scientific Python user base,  who
would find Python matrices that feel a bit more like Matlab/Octave etc.
extremely useful.  I have a great respect for all the work in Numpy and
Scipy,
but at the same time I feel that numerical linear algebra in Python would
benefit from a more dedicated matrix library,  and I think that Numpy (or
another single package) should provide that
in a homogeneous way - without ties to how Numpy array works.

- Joachim


On 3/27/07, Christopher Barker <[EMAIL PROTECTED]> wrote:


Zachary Pincus wrote:
> rest of linear algebra -- e.g. that m[0] yields a matrix if m is a
> matrix-- it almost certainly would violate the principle of least
> surprise for iteration over m (intuitively understood to be choosing
> m[0] then m[1] and so forth) to yield anything other than a matrix.

I don't think the OP was suggesting that. Rather, I think the suggestion
was that for an mxn matrix, M:

M[i].shape == (n,)

M[i:i+1].shape == (1,n)

that is, indexing (or iterating) returns a vector, and slicing returns a
matrix. This is, indeed, exactly how numpy arrays behave!

The problem with this is:

numpy matrices were created specifically to support linear algebra
calculations. For linear algebra, the distinction between row vectors
and column vectors is critical. By definition, a row vector has shape:
(1,n), and a column vector has shape (m,1).

In this case, perhaps the OP is thinking that a shape (n,) array could
be considered a row vector, but in that case:

M[1,:].shape == (n,)
M[:,1].shape == (m,)

which of these is the row and which the column? This is why matrices
index this way instead:

M[1,:].shape == (1, n)
M[:,1].shape == (m, 1)

now we know exactly what is a row and what is a column.

By the way, I think with the way numpy works: M[i] == M[i,:] by
definition, so those couldn't yield different shaped results. Is that
right?
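The contrast described above can be checked directly (a small sketch):

```python
import numpy as np

A = np.arange(12).reshape(3, 4)   # plain array
M = np.matrix(A)                  # matrix with the same data
assert A[1].shape == (4,)         # array: indexing drops a dimension
assert A[1:2].shape == (1, 4)     # array: slicing keeps it
assert M[1, :].shape == (1, 4)    # matrix: rows stay 2-d ...
assert M[:, 1].shape == (3, 1)    # ... and so do columns
assert (M[1] == M[1, :]).all()    # and M[i] is indeed M[i,:]
```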

I think we got a bit sidetracked by the example given. If I have a bunch
of points I want to store, I'm going to use an (m,2) *array*, not a
matrix, then then A[i] will yield a (2,) array, which makes sense for
(2-d) points. In fact, I do this a LOT.

If I happen to need to do some linear algebra on that array of points,
I'll convert to a matrix, do the linear algebra, then convert back to an
a array (or just use the linear algebra functions on the array).

I hope this helps

-Chris








> This can't possibly be what you're asking for, right? You aren't
> suggesting that m[0] and list(iter(m))[0] should be different types?
>
> There are many clear and definite use cases for m[0] being a matrix,
> by the way, in addition to the theoretical arguments -- these aren't
> hard to come by. Whether or nor there are actual good use-cases for
> iterating over a matrix, if you believe that m[0] should be a matrix,
> it's madness to suggest that list(iter(m))[0] should be otherwise.
>
> My opinion? Don't iterate over matrices. Use matrices for linear
> algebra. There's no "iteration" construct in linear algebra. The
> behavior you find so puzzling is a necessary by-product of the
> fundamental behavior of the matrix class -- which has been explained
> and which you offered no resistance to.
>
> Zach
>

--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R(206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] Tuning sparse stuff in NumPy

2007-03-27 Thread Robert Cimrman
David Koch wrote:
> On 3/27/07, Robert Cimrman <[EMAIL PROTECTED]> wrote:
>>
>>
>> ok. now which version of scipy (scipy.__version__) do you use (you may
>> have posted it, but I missed it)? Not so long ago, there was an effort
>> by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
>> get better usability and performance. My (almost latest) version is
>> 0.5.3.dev2860.
>>
> My version is 0.5.2, the latest official version available for download,
> right?

I see. Now while it is desirable and fine in the majority of cases to
install the stable version of software, with numpy and scipy it is different
:-), as the development is very fast. You should try the 'bleeding edge'
versions via SVN, as many issues you see could have been already
fixed/improved. Thanks to the extensive unit testing, the latest version
usually (read: 99.9%) works better than the stable one.

r.


Re: [Numpy-discussion] defmatrix.py

2007-03-27 Thread Colin J. Williams
Charles R Harris wrote:
> 
> 
> On 3/26/07, *Travis Oliphant* <[EMAIL PROTECTED] 
> > wrote:
> 
> Charles R Harris wrote:
> 
>  >
>  >
>  > The rule 1-d is always a *row* vector only applies when
> converting
>  > to a
>  > matrix.
>  >
>  > In this case, the dot operator does not "convert to a matrix"
> but
>  > uses
>  > rules for operating with mixed 2-d and 1-d arrays inherited from
>  > Numeric.
>  >
>  > I'm very hesitant to change those rules.
>  >
>  >
>  > I wasn't suggesting that, just noticing that the rule was 1-d vector
>  > on right is treated as a column vector by dot, which is why an
>  > exception was raised in the posted case. If it is traditional for
>  > matrix routines to always treat it as a row vector, so be it.
> 
> O.K .  So, the problem is that when I defined matrix multiplication, I
> mistakenly believed that dot always interpreted 1-d arrays as row
> vectors which it is now clear it doesn't.
> 
> I think this led to the issues.
> 
> So, making dot always return arrays seems like the needed fix.
> 
> 
> Sounds good. Do you want to leave the matrix  * operator as it now is?

I hope so.  numarray used the row convention.

Colin W.
> 
> Chuck
> 
> 
> 
> 
> 


Re: [Numpy-discussion] defmatrix.py

2007-03-27 Thread Colin J. Williams
Charles R Harris wrote:
> 
> 
> On 3/26/07, *Travis Oliphant* <[EMAIL PROTECTED] 
> > wrote:
> 
> 
>  > I think that might be the simplest thing, dot overrides subtypes.
> BTW,
>  > here is another ambiguity
>  >
>  > In [6]: dot(array([[1]]),ones(2))
>  >
> 
> ---
> 
>  > exceptions.ValueErrorTraceback (most
>  > recent call last)
>  >
>  > /home/charris/
>  >
>  > ValueError: matrices are not aligned
>  >
>  > Note that in this case dot acts like the rhs is always a column
> vector
>  > although it returns a 1-d vector. I don't know that this is a bad
>  > thing, but perhaps we should extend this behaviour to matrices,
> which
>  > would be different from the now current 1-d is always a *row*
> vector, i.e.
> 
> 
> The rule 1-d is always a *row* vector only applies when converting to a
> matrix.
> 
> In this case, the dot operator does not "convert to a matrix" but uses
> rules for operating with mixed 2-d and 1-d arrays inherited from
> Numeric.
> 
> I'm very hesitant to change those rules.
> 
> 
> I wasn't suggesting that, just noticing that the rule was 1-d vector on 
> right is treated as a column vector by dot, which is why an exception 
> was raised in the posted case. If it is traditional for matrix routines 
> to always treat it as a row vector, so be it.

My recollection is that text books treat the column vector, represented 
by a lower case letter, bold or underlined, as the default.  If b 
(dressed as described before) is a column vector, then b' represents a 
row vector.

For numpy, it makes sense to consider b as a row vector, since the 
underlying array uses the C convention where each row is stored 
contiguously.
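The dot behavior under discussion is easy to check directly; here is a minimal sketch using plain numpy arrays (no matrix class involved):

```python
import numpy as np

# A 1-d array on the right-hand side of dot is treated as a column
# vector: the inner dimensions here (1 vs. 2) disagree, so it raises.
try:
    np.dot(np.array([[1]]), np.ones(2))
except ValueError as exc:
    print("raises:", exc)

# With matching inner dimensions the result is 1-d, not an (m, 1) column:
out = np.dot(np.array([[1, 2]]), np.ones(2))
print(out.shape)  # (1,)
```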

Colin W.

> 
> Chuck


Re: [Numpy-discussion] missing use case

2007-03-27 Thread Alan G Isaac
Hi Zachary, 

I think your response highlights very well 
the apparent design flaw. 

Here is your response to my request for a use case where 
iteration over a matrix should yield matrices: 
do not iterate! 

Does some part of you not find that just a little bizarre?! 

As for offering as a use case that with a matrix M 
you are allowed to use M[0] instead of M[0,:] 
when you want the first row as a matrix, 
I really cannot take that trivial convenience seriously 
as the basis of a fundamental design decision. 

However there is no sympathy for my view on this, 
so I am not debating it any more.  Instead I have asked 
a simple question: what use case is this design decision 
supporting?  I am interested in this so that I can see into 
the decision better. 

Now Chris proposes that "M[i] == M[i,:] by definition".
If so, that is a design-based answer to my question.
I agree that M[i,:] should return a matrix.
But my understanding was different:
I thought M[i] relied on standard Python indexing
while M[i,:] was numpy indexing.
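A quick check with numpy's matrix class shows that the two spellings do coincide (a sketch; both forms dispatch to the same `__getitem__`):

```python
import numpy as np

M = np.matrix([[1, 2], [3, 4]])

row_a = M[0]      # single-index ("standard Python") form
row_b = M[0, :]   # explicit slice form

# Both agree in type, shape, and value:
print(type(row_a).__name__, row_a.shape)  # matrix (1, 2)
print(bool((row_a == row_b).all()))       # True
```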

Cheers, 
Alan Isaac




Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Christopher Barker
Zachary Pincus wrote:
> rest of linear algebra -- e.g. that m[0] yields a matrix if m is a  
> matrix-- it almost certainly would violate the principle of least  
> surprise for iteration over m (intuitively understood to be choosing 
> m[0] then m[1] and so forth) to yield anything other than a matrix.

I don't think the OP was suggesting that. Rather, I think the suggestion 
was that for an m×n matrix, M:

M[i].shape == (n,)

M[i:i+1].shape == (1,n)

that is, indexing (or iterating) returns a vector, and slicing returns a 
matrix. This is, indeed, exactly how numpy arrays behave!

The problem with this is:

numpy matrices were created specifically to support linear algebra 
calculations. For linear algebra, the distinction between row vectors 
and column vectors is critical. By definition, a row vector has shape: 
(1,n), and a column vector has shape (m,1).

In this case, perhaps the OP is thinking that a shape (n,) array could 
be considered a row vector, but in that case:

M[1,:].shape == (n,)
M[:,1].shape == (m,)

which of these is the row and which the column? This is why matrices 
index this way instead:

M[1,:].shape == (1, n)
M[:,1].shape == (m, 1)

now we know exactly what is a row and what is a column.

By the way, I think with the way numpy works: M[i] == M[i,:] by 
definition, so those couldn't yield different shaped results. Is that right?
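The shape rules above are easy to verify; a minimal sketch comparing the two classes:

```python
import numpy as np

m, n = 3, 4
A = np.arange(m * n).reshape(m, n)  # plain ndarray
M = np.matrix(A)                    # matrix with the same data

# Arrays drop the row/column distinction -- both slices come back 1-d:
print(A[1, :].shape, A[:, 1].shape)  # (4,) (3,)

# Matrices keep it: a row is (1, n) and a column is (m, 1):
print(M[1, :].shape, M[:, 1].shape)  # (1, 4) (3, 1)
```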

I think we got a bit sidetracked by the example given. If I have a bunch 
of points I want to store, I'm going to use an (m,2) *array*, not a 
matrix; then A[i] will yield a (2,) array, which makes sense for 
(2-d) points. In fact, I do this a LOT.

If I happen to need to do some linear algebra on that array of points, 
I'll convert to a matrix, do the linear algebra, then convert back to an 
array (or just use the linear algebra functions on the array).
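That convert, work, convert-back pattern might look like the following sketch (the rotation matrix is just an illustrative choice of linear-algebra step):

```python
import numpy as np

points = np.random.rand(5, 2)   # (m, 2) array of 2-d points
R = np.matrix([[0.0, -1.0],     # 90-degree rotation, as a matrix
               [1.0,  0.0]])

M = np.matrix(points)           # convert to a matrix...
rotated = np.asarray(M * R.T)   # ...do the linear algebra, convert back

print(type(rotated).__name__, rotated.shape)  # ndarray (5, 2)
```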

I hope this helps

-Chris

> This can't possibly be what you're asking for, right? You aren't  
> suggesting that m[0] and list(iter(m))[0] should be different types?
> 
> There are many clear and definite use cases for m[0] being a matrix,  
> by the way, in addition to the theoretical arguments -- these aren't  
> hard to come by. Whether or not there are actual good use-cases for  
> iterating over a matrix, if you believe that m[0] should be a matrix,  
> it's madness to suggest that list(iter(m))[0] should be otherwise.
> 
> My opinion? Don't iterate over matrices. Use matrices for linear  
> algebra. There's no "iteration" construct in linear algebra. The  
> behavior you find so puzzling is a necessary by-product of the  
> fundamental behavior of the matrix class -- which has been explained  
> and which you offered no resistance to.
> 
> Zach
> 

-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] Tuning sparse stuff in NumPy

2007-03-27 Thread David Koch

On 3/27/07, Robert Cimrman <[EMAIL PROTECTED]> wrote:



ok. now which version of scipy (scipy.__version__) do you use (you may
have posted it, but I missed it)? Not so long ago, there was an effort
by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
get better usability and performance. My (almost latest) version is
0.5.3.dev2860.




My version is 0.5.2, the latest official version available for download,
right?

Thanks,

/David


Re: [Numpy-discussion] Tuning sparse stuff in NumPy

2007-03-27 Thread Robert Cimrman
David Koch wrote:
> Ok,
> 
> I did and the results are:
> csc * csc: 372.601957083
> csc * csc: 3.90811300278

a typo here? which one is csr?

> csr * csc: 15.3202679157
> csr * csr:  3.84498214722
> 
> Mhm, quite insightful. Note that in an operation X.transpose() * X,
> where X is a csc_matrix, X.transpose() is automatically cast to a
> csr_matrix. A re-cast to csc makes the whole operation faster. It's
> still about 1000 times slower than Matlab but 4 times faster than before.
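The re-cast trick being described can be sketched like this (note: `sparse.random` and these class names come from a modern scipy.sparse, not the 0.5.x version under discussion; sizes and density are arbitrary):

```python
from scipy import sparse

# A random sparse matrix in CSC format:
X = sparse.random(200, 50, density=0.05, format="csc", random_state=0)

Xt = X.transpose()        # transposing CSC yields a CSR matrix
G = Xt.tocsc() * X        # re-cast to CSC before the product: X^T X

print(G.shape)            # (50, 50)
```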

ok. now which version of scipy (scipy.__version__) do you use (you may
have posted it, but I missed it)? Not so long ago, there was an effort
by Nathan Bell and others reimplementing sparsetools + scipy.sparse to
get better usability and performance. My (almost latest) version is
0.5.3.dev2860.

r.



Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Zachary Pincus
> Exactly: that was one other thing I found artificial.
> Surely the points will then be wanted as arrays.
>
> So my view is that we still do not have a use case
> for wanting matrices yielded when iterating across
> rows of a matrix.

It's pretty clear from my perspective: 1-D slices of matrices *must*  
be matrices. I gave an intuitive make-it-fit-with-known-notation  
explanation, and Charles gave a theoretically-grounded explanation.  
There was no objection to this from any quarter.

Given that 1-D slices of matrices must be matrices for deep  
mathematical reasons as well as notational compatibility with the  
rest of linear algebra -- e.g. that m[0] yields a matrix if m is a  
matrix-- it almost certainly would violate the principle of least  
surprise for iteration over m (intuitively understood to be choosing 
m[0] then m[1] and so forth) to yield anything other than a matrix.  
This can't possibly be what you're asking for, right? You aren't  
suggesting that m[0] and list(iter(m))[0] should be different types?

There are many clear and definite use cases for m[0] being a matrix,  
by the way, in addition to the theoretical arguments -- these aren't  
hard to come by. Whether or not there are actual good use-cases for  
iterating over a matrix, if you believe that m[0] should be a matrix,  
it's madness to suggest that list(iter(m))[0] should be otherwise.

My opinion? Don't iterate over matrices. Use matrices for linear  
algebra. There's no "iteration" construct in linear algebra. The  
behavior you find so puzzling is a necessary by-product of the  
fundamental behavior of the matrix class -- which has been explained  
and which you offered no resistance to.

Zach



Re: [Numpy-discussion] ATLAS problems

2007-03-27 Thread Simon Burton

The last time I had this problem I ended up running python
in gdb and setting breakpoints at the appropriate dgemm calls.
That way I could see exactly what numpy was using to do dgemm.
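Before reaching for gdb, a lighter-weight first check is to ask numpy itself what it was built against, and to time a multiply (a sketch; the exact configuration output depends on the build):

```python
import time
import numpy as np

# Show the BLAS/LAPACK configuration numpy was compiled with:
np.__config__.show()

# Rough timing sanity check: an ATLAS-backed dot on a 500x500 matrix
# should complete in a small fraction of a second.
a = np.random.rand(500, 500)
t0 = time.time()
a.dot(a)
print("500x500 dot: %.3f s" % (time.time() - t0))
```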

Simon.

On Tue, 27 Mar 2007 10:20:41 +1000
"Justin Bedo" <[EMAIL PROTECTED]> wrote:

> Hi All,
> 
> I'm trying to compile NumPy using the ATLAS libraries but am having
> some difficulty with it.  I've managed to build numpy such that
> lapack_lite.so is linked as follows:
> 
> linux-gate.so.1 =>  (0xe000)
> libatlas.so => /usr/local/atlas/lib/libatlas.so (0xb79e9000)
> libptcblas.so => /usr/local/atlas/lib/libptcblas.so (0xb79c7000)
> libptf77blas.so => /usr/local/atlas/lib/libptf77blas.so (0xb79ac000)
> libpython2.5.so.1.0 => /usr/lib/libpython2.5.so.1.0 (0xb788f000)
> libgfortran.so.1 => /usr/lib/libgfortran.so.1 (0xb7817000)
> libm.so.6 => /lib/libm.so.6 (0xb77f1000)
> libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb77e5000)
> libc.so.6 => /lib/libc.so.6 (0xb76b7000)
> libpthread.so.0 => /lib/libpthread.so.0 (0xb769f000)
> libdl.so.2 => /lib/libdl.so.2 (0xb769a000)
> libutil.so.1 => /lib/libutil.so.1 (0xb7696000)
> /lib/ld-linux.so.2 (0x8000)
> 
> However despite this, the performance when multiplying matrices is
> extremely slow, and only a single thread is used even though
> lapack_lite.so is linked with the pt atlas libraries.  Does anybody
> know how to fix this problem?
> 
> Thanks
> Justin


Re: [Numpy-discussion] matrix indexing question

2007-03-27 Thread Alan G Isaac
On Tue, 27 Mar 2007, Bill Baxter apparently wrote: 
> xformedPt = someComplicatedNonLinearThing(pt)

> I do stuff like the above quite frequently in my code, 
> although with arrays rather than matrices. 


Exactly: that was one other thing I found artificial.
Surely the points will then be wanted as arrays.

So my view is that we still do not have a use case
for wanting matrices yielded when iterating across
rows of a matrix.

Alan Isaac

PS In the "large number of points" case, the thing to
do would be to extract modest sized submatrices.
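That extraction might look like the following sketch (the block size of 4 is an arbitrary illustration):

```python
import numpy as np

M = np.matrix(np.arange(20).reshape(10, 2))  # many "points" as a matrix

block = 4  # modest submatrix height
for start in range(0, M.shape[0], block):
    sub = M[start:start + block, :]  # slicing keeps the matrix type
    print(sub.shape)  # (4, 2), (4, 2), then (2, 2)
```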





Re: [Numpy-discussion] defmatrix.py

2007-03-27 Thread Sven Schreiber
Travis Oliphant schrieb:

> O.K.  So, the problem is that when I defined matrix multiplication, I 
> mistakenly believed that dot always interpreted 1-d arrays as row 
> vectors which it is now clear it doesn't.
> 
> I think this led to the issues. 
> 
> So, making dot always return arrays seems like the needed fix.
> 

Does that mean that...

a) dot(a,b) will return an array even if a and b are both numpy-matrices?

and/or

b) dot(c,d) will return an array if c or d is a numpy-matrix?

Thanks for any clarifications,

Sven
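For reference, the pre-fix behavior can be probed directly; a sketch (subclass handling of `dot` may differ across numpy versions):

```python
import numpy as np

m = np.matrix([[1, 2], [3, 4]])
a = np.asarray(m)  # the same data as a plain array

# Case a): matrix in, matrix out.
print(type(np.dot(m, m)).__name__)
# Case b): mixed inputs -- the matrix subclass wins via array priority.
print(type(np.dot(a, m)).__name__)
```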