Re: [Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-26 Thread David Cournapeau
Anne Archibald wrote:
>
> I would think sign should return NaN (does it not now?) unless its
> return type is integer, in which case I can't see a better answer than
> raising an exception (we certainly don't want it silently swallowing
> NaNs).
>   

signbit (the C99 macro) returns an integer, so I think we have to check
for NaN explicitly.
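
To illustrate the point, here is a minimal Python sketch. `np.signbit` only reports the sign bit, so its result has no way to carry a NaN through; a sign function that should propagate NaN has to test for it explicitly. The helper `sign_with_nan` is hypothetical and is not how NumPy implements `sign`.

```python
import numpy as np

def sign_with_nan(x):
    """Hypothetical helper: -1, 0 or 1 as floats, with NaN propagated."""
    x = np.asarray(x, dtype=float)
    # Build the sign from comparisons; both comparisons are False for NaN,
    # so without the check below a NaN would silently come out as 0.
    out = (x > 0).astype(float) - (x < 0).astype(float)
    # Explicit NaN check: put the NaN back where the input had one.
    return np.where(np.isnan(x), np.nan, out)

values = np.array([-2.0, 0.0, 3.0, np.nan])
print(np.signbit(values))     # boolean array: the NaN cannot be flagged
print(sign_with_nan(values))  # NaN stays NaN
```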

>
> Is it really a good idea to duplicate the maskedarray sorting code?
>   

This should be done in C (it is not difficult to do so, has a clear
speed advantage, and is the only solution for numpy arrays themselves)
and then, masked arrays can reuse this code.

cheers,

David
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-26 Thread Anne Archibald
2008/9/26 David Cournapeau <[EMAIL PROTECTED]>:
> Charles R Harris wrote:
>>
>> I'm also wondering about the sign ufunc. It should probably return nan
>> for nans, but -1,0,1 are the current values. We also need to decide
>> which end of the sorted values the nans should go to. I'm a bit
>> partial to the beginning but the end would be fine with me, it might
>> even be a bit more natural as searchsorted would find the beginning of
>> the nans by default.

I would think sign should return NaN (does it not now?) unless its
return type is integer, in which case I can't see a better answer than
raising an exception (we certainly don't want it silently swallowing
NaNs).

> Note that in both suggested approaches, sort with NaN would raise an
> error. We could then provide a nansort, which ignores the NaN, and deal
> with your case; would it be hard to have an option for the beginning vs
> the end?

Is it really a good idea to duplicate the maskedarray sorting code?

Anne


Re: [Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-26 Thread Pierre GM
On Friday 26 September 2008 08:29:02 Charles R Harris wrote:
> It shouldn't be any more difficult to do either based on a keyword.
> Argsorts shouldn't be a problem either. I'm thinking that the most flexible
> way to handle the sorts is to make a preliminary pass through the data and
> collect all the nans at one end or the other and sort the remainder. That
> would also make it easy to raise an error if we wanted to and avoid an
> extra compare in all the other sorting passes. That approach could probably
> be extended to masked arrays also.

Note that MaskedArray.sort already has an extra keyword, 'endwith', that lets
you decide whether missing data should go at the beginning or the end of the
array.
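
For illustration, a short sketch of that keyword. In current NumPy it is spelled `endwith` and is accepted by both `MaskedArray.sort` and `np.ma.sort`; the sample array is made up for the example.

```python
import numpy as np

# The middle element is masked, i.e. treated as missing.
a = np.ma.array([3.0, 1.0, 2.0], mask=[False, True, False])

at_end = np.ma.sort(a, endwith=True)     # missing values sorted last
at_start = np.ma.sort(a, endwith=False)  # missing values sorted first

print(at_end)
print(at_start)
```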



Re: [Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-26 Thread Charles R Harris
On Thu, Sep 25, 2008 at 10:56 PM, David Cournapeau <
[EMAIL PROTECTED]> wrote:

> Charles R Harris wrote:
> >
> > I'm also wondering about the sign ufunc. It should probably return nan
> > for nans, but -1,0,1 are the current values. We also need to decide
> > which end of the sorted values the nans should go to. I'm a bit
> > partial to the beginning but the end would be fine with me, it might
> > even be a bit more natural as searchsorted would find the beginning of
> > the nans by default.
>
> Note that in both suggested approaches, sort with NaN would raise an
> error. We could then provide a nansort, which ignores the NaN, and deal
> with your case; would it be hard to have an option for the beginning vs
> the end?
>

It shouldn't be any more difficult to do either based on a keyword. Argsorts
shouldn't be a problem either. I'm thinking that the most flexible way to
handle the sorts is to make a preliminary pass through the data and collect
all the nans at one end or the other and sort the remainder. That would also
make it easy to raise an error if we wanted to and avoid an extra compare in
all the other sorting passes. That approach could probably be extended to
masked arrays also.
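
The preliminary-pass idea can be sketched in Python as follows. This is only an illustration of the approach described above, not the C implementation that was being proposed; the function names and the `nan_position` and `raise_on_nan` keywords are hypothetical.

```python
import numpy as np

def sort_with_nans(a, nan_position="end", raise_on_nan=False):
    """Hypothetical: sort a 1-D float array, collecting NaNs at one end."""
    if nan_position not in ("start", "end"):
        raise ValueError("nan_position must be 'start' or 'end'")
    a = np.asarray(a, dtype=float)
    # Preliminary pass: locate the NaNs once, before any sorting.
    is_nan = np.isnan(a)
    n_nan = int(np.count_nonzero(is_nan))
    if raise_on_nan and n_nan:
        raise ValueError("input contains NaN")
    # Sort only the remainder, so the sort never compares against NaN.
    rest = np.sort(a[~is_nan])
    nans = np.full(n_nan, np.nan)
    parts = [nans, rest] if nan_position == "start" else [rest, nans]
    return np.concatenate(parts)

def argsort_with_nans(a, nan_position="end"):
    """Hypothetical: the same idea, returning indices into the input."""
    if nan_position not in ("start", "end"):
        raise ValueError("nan_position must be 'start' or 'end'")
    a = np.asarray(a, dtype=float)
    is_nan = np.isnan(a)
    nan_idx = np.flatnonzero(is_nan)
    rest_idx = np.flatnonzero(~is_nan)
    # Order the indices of the non-NaN values by the values they point at.
    rest_idx = rest_idx[np.argsort(a[rest_idx], kind="stable")]
    parts = [nan_idx, rest_idx] if nan_position == "start" else [rest_idx, nan_idx]
    return np.concatenate(parts)

data = [3.0, np.nan, 1.0, np.nan, 2.0]
print(sort_with_nans(data, nan_position="start"))
print(argsort_with_nans(data, nan_position="end"))
```

Because the NaNs are found up front, raising an error costs nothing extra, and the same two-part split would apply to a mask instead of `np.isnan`.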

Chuck


Re: [Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-25 Thread David Cournapeau
Charles R Harris wrote:
>
> I'm also wondering about the sign ufunc. It should probably return nan
> for nans, but -1,0,1 are the current values. We also need to decide
> which end of the sorted values the nans should go to. I'm a bit
> partial to the beginning but the end would be fine with me, it might
> even be a bit more natural as searchsorted would find the beginning of
> the nans by default.

Note that in both suggested approaches, sort with NaN would raise an
error. We could then provide a nansort, which ignores the NaN, and deal
with your case; would it be hard to have an option for the beginning vs
the end?
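
As a rough sketch of the split between the two functions: a strict sort that refuses NaN, and a `nansort` that ignores them. Both helpers are hypothetical (NumPy does not ship a `nansort`), and "ignore" is taken here to mean dropping the NaNs, which is one possible reading.

```python
import numpy as np

def strict_sort(a):
    """Hypothetical: refuse to sort data that contains NaN."""
    a = np.asarray(a, dtype=float)
    if np.isnan(a).any():
        raise ValueError("cannot sort: input contains NaN")
    return np.sort(a)

def nansort(a):
    """Hypothetical: sort a 1-D array after dropping its NaNs."""
    a = np.asarray(a, dtype=float)
    return np.sort(a[~np.isnan(a)])

data = [3.0, np.nan, 1.0, 2.0]
print(nansort(data))  # shorter than the input: the NaN is gone
try:
    strict_sort(data)
except ValueError as exc:
    print("strict_sort raised:", exc)
```

Dropping the NaNs changes the length of the result, which is why an option to keep them at the beginning or the end is a separate design question.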

cheers,

David


Re: [Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-25 Thread Charles R Harris
On Thu, Sep 25, 2008 at 10:15 PM, David Cournapeau <
[EMAIL PROTECTED]> wrote:

> Hi,
>
> We started a small document to gather all information given in
> previous threads about NaN handling in max and co in numpy. Thanks to
> Anne (Archibald) for useful comments/additions/typo corrections:
>
> http://projects.scipy.org/scipy/numpy/wiki/ProperNanHandling
>
> We describe approaches taken by R and matlab for the relevant functions,
> and possible approaches to take in numpy. I have a patch almost ready
> (with tests) for the approach "Returning NaN" (Chuck said he would be
> willing to implement the sort/argsort part, which is the part missing in
> my current implementation).
>
> I think it would be nice for 1.3.0,
>

I'm also wondering about the sign ufunc. It should probably return nan for
nans, but -1,0,1 are the current values. We also need to decide which end of
the sorted values the nans should go to. I'm a bit partial to the beginning
but the end would be fine with me, it might even be a bit more natural as
searchsorted would find the beginning of the nans by default.
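
A small example of the searchsorted remark, using the behaviour of current NumPy releases (which may differ from the 2008 code under discussion): `np.sort` places NaNs at the end, and `np.searchsorted` with its default `side="left"` then returns the index of the first NaN.

```python
import numpy as np

a = np.array([2.0, np.nan, 1.0, np.nan, 3.0])
s = np.sort(a)                          # NaNs are placed at the end
first_nan = np.searchsorted(s, np.nan)  # default side="left"

print(s[:first_nan])  # the ordinary values, sorted
print(s[first_nan:])  # the block of NaNs
```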

Chuck


[Numpy-discussion] Proper NaN handling in max and co: a redux

2008-09-25 Thread David Cournapeau
Hi,

We started a small document to gather all information given in
previous threads about NaN handling in max and co in numpy. Thanks to
Anne (Archibald) for useful comments/additions/typo corrections:

http://projects.scipy.org/scipy/numpy/wiki/ProperNanHandling

We describe approaches taken by R and matlab for the relevant functions,
and possible approaches to take in numpy. I have a patch almost ready
(with tests) for the approach "Returning NaN" (Chuck said he would be
willing to implement the sort/argsort part, which is the part missing in
my current implementation).
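
For readers unfamiliar with the distinction, a minimal example of what "Returning NaN" means for max, shown with current NumPy rather than the patch mentioned above: `np.max` propagates a NaN in its input, while `np.nanmax` ignores it.

```python
import numpy as np

a = np.array([1.0, np.nan, 3.0])

propagated = np.max(a)  # the NaN propagates to the result
ignored = np.nanmax(a)  # the NaN is skipped

print(propagated, ignored)
```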

I think it would be nice for 1.3.0,

cheers,

David