Re: [Numpy-discussion] min() of array containing NaN

2008-08-15 Thread Ryan May
>
> Availability of the NaN functionality in a method of ndarray
>
> The last point is key.  The NaN behavior is central to analyzing real
> data containing unavoidable bad values, which is the bread and butter
> of a substantial fraction of the user base.  In the languages they're
> switching from, handling NaNs is just part of doing business, and is
> an option of every relevant routine; there's no need for redundant
> sets of routines.  In contrast, numpy appears to consider data
> analysis to be secondary, somehow, to pure math, and takes the NaN
> functionality out of routines like min() and std().  This means it's
> not possible to use many ndarray methods.  If we're ready to handle a
> NaN by returning it, why not enable the more useful behavior of
> ignoring it, at user discretion?
>

Maybe I missed this somewhere, but this seems like a better use for masked
arrays, not NaN's.  Masked arrays were specifically designed to add
functions that work well with masked/invalid data points.  Why reinvent the
wheel here?
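
For illustration (not part of the original message), a minimal sketch of the masked-array route. It assumes a current NumPy release in which np.ma.masked_invalid is available, and it reuses the array from the original report in this thread:

```python
import numpy as np

# Array from the original report; the NaN stands in for a bad data point.
x = np.array([0, 1, 2, np.nan, 4, 5, 6])

# masked_invalid masks NaN and +/-inf entries.
mx = np.ma.masked_invalid(x)

lowest = mx.min()      # the reduction skips the masked entry
n_valid = mx.count()   # number of unmasked elements
```

The mask travels with the array, so later reductions skip the bad point without any extra keyword.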

Ryan

-- 
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] min() of array containing NaN

2008-08-15 Thread Joe Harrington
> If you're willing to do arithmetic you might even be able to
> pull it off, since NaNs tend to propagate:
> if (new [... rest of this line was lost in the archive ...]
> Whether the speed of this is worth its impenetrability I couldn't say.

Code comments cure impenetrability, and have no cost in speed.  One
could write a paragraph explaining it (if it really needed that
much).  The comments could even reference the current discussion.

--jh--


Re: [Numpy-discussion] min() of array containing NaN

2008-08-14 Thread Andrew Dalke
Anne Archibald:
 > Sadly, it's not possible without extra overhead. Specifically: the
 > NaN-ignorant implementation does a single comparison between each
 > array element and a placeholder, and decides based on the result which
 > to keep.

Did my example code go through?  The test for NaN only needs to be
done when a new min value is found, which will happen something like
O(log(n)) times in a randomly ordered array.

(Here's the hand-waving.  The first requires a NaN check.  The
second has a 1/2 chance of being the new minimum.  The third has
a 1/3 chance, etc.  The sum of the harmonic series goes as O(ln(n)).)
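
For illustration (not part of the original message), the hand-waving can be checked with a small simulation. This is a sketch under the assumption of independent uniform random values; count_new_minima and harmonic are hypothetical helper names:

```python
import math
import random

def count_new_minima(values):
    """Count how often the running minimum changes (the first element counts)."""
    count = 0
    best = math.inf
    for v in values:
        if v < best:
            best = v
            count += 1
    return count

def harmonic(n):
    """n-th harmonic number, 1 + 1/2 + ... + 1/n."""
    return sum(1.0 / k for k in range(1, n + 1))

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 2000
trials = 300
average = sum(
    count_new_minima([rng.random() for _ in range(n)]) for _ in range(trials)
) / trials
```

The average number of updates should land close to harmonic(n), which grows like ln(n), so the NaN check inside the update branch runs only rarely on large random arrays.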

This depends on a double inversion, so that the test for a new min value
and the test for NaN occur at the same time.  Here's pseudocode:

from math import isnan

def nan_min(array):
    best = array[0]
    if isnan(best):
        return best
    for item in array[1:]:
        # True when item is a new minimum OR item is NaN
        if not (best <= item):
            best = item
            if isnan(best):
                return best
    return best


 > If you're willing to do two tests, sure, but that's overhead (and
 > probably comparable to an isnan).

In Python the extra inversion costs an extra PVM instruction.  In C
by comparison the resulting assembly code for "best > item" and
"!(best <= item)" have identical lengths, with no real performance
difference.

There's no extra cost for doing the extra inversion in the common
case, and for large arrays the ratio of (NaN check) / (no check) -> 1.0.

 > What do compilers' min builtins do with NaNs? This might well be
 > faster than an if statement even in the absence of NaNs...

This comes from a g++ implementation of min:

   /**
    *  @brief This does what you think it does.
    *  @param  a  A thing of arbitrary type.
    *  @param  b  Another thing of arbitrary type.
    *  @return   The lesser of the parameters.
    *
    *  This is the simple classic generic implementation.  It will work on
    *  temporary expressions, since they are only evaluated once, unlike a
    *  preprocessor macro.
    */
   template<typename _Tp>
     inline const _Tp&
     min(const _Tp& __a, const _Tp& __b)
     {
       // concept requirements
       __glibcxx_function_requires(_LessThanComparableConcept<_Tp>)
       //return __b < __a ? __b : __a;
       if (__b < __a)
         return __b;
       return __a;
     }


The isnan function in another version of gcc is built from a bunch of
#defines, leading to

 static __inline__  int __inline_isnanf( float __x ) { return __x != __x; }
 static __inline__  int __inline_isnand( double __x ) { return __x != __x; }
 static __inline__  int __inline_isnan( long double __x ) { return __x != __x; }

Andrew
[EMAIL PROTECTED]



Re: [Numpy-discussion] min() of array containing NaN

2008-08-14 Thread robert . kern
On 2008-08-14, Joe Harrington <[EMAIL PROTECTED]> wrote:
>> I'm doing nothing. Someone else must volunteer.
>
> Fair enough.  Would the code be accepted if contributed?

Like I said, I would be amenable to such a change. The other
developers haven't weighed in on this particular proposal, but I
suspect they will agree with me.

>> There is a
>> reasonable design rule that if you have a boolean argument which you
>> expect to only be passed literal Trues and Falses, you should instead
>> just have two different functions.
>
> Robert, can you list some reasons to favor this design rule?

nanmin(x) vs. min(x, nan=True)

A boolean argument that will almost always take literal Trues and
Falses basically is just a switch between different functionality. The
usual mechanism for the programmer to pick between different
functionality is to use the appropriate function.

The =True is extraneous, and puts important semantic information last
rather than at the front.
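
For illustration (not part of the original message), a sketch of the two call styles side by side. min_switch is a hypothetical wrapper, not a NumPy function, and the sketch assumes a current NumPy release in which np.min propagates NaN:

```python
import numpy as np

def min_switch(x, nan=False):
    """Hypothetical keyword-style API: nan=True ignores NaNs."""
    return np.nanmin(x) if nan else np.min(x)

x = np.array([3.0, np.nan, 1.0])

keyword_style = min_switch(x, nan=True)   # behaviour chosen by a literal flag
function_style = np.nanmin(x)             # behaviour named up front
```

The wrapper body is nothing more than a dispatch between two existing functions, which is the situation the design rule describes.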

> Here are some reasons to favor richly functional routines:
>
> User's code is more readable because subtle differences affect args,
>not functions

This isn't subtle.

> Easier learning for new users

You have no evidence of this.

> Much briefer and more readable docs

Briefer is possible. More readable is debatable. "Much" is overstating the case.

> Similar behavior across languages

This is not, has never been, and never will be a goal. Similar
behavior happens because of convergent design constraints and
occasionally laziness, never for its own sake.

> Smaller number of functions in the core package (a recent list topic)

In general, this is a reasonable concern that must be traded off with
the other concerns. In this particular case, it has no weight.
nanmin() and nanmax() already exist.

> Many fewer routines to maintain, particularly if multiple switches exist

Again, in this case, neither of these is relevant. Yes, if there are
multiple boolean switches, it might make sense to keep them all in
the same function. Typically, these switches will also be affecting
the semantics only in minor details, too.

> Availability of the NaN functionality in a method of ndarray

Point, but see below.

> The last point is key.  The NaN behavior is central to analyzing real
> data containing unavoidable bad values, which is the bread and butter
> of a substantial fraction of the user base.  In the languages they're
> switching from, handling NaNs is just part of doing business, and is
> an option of every relevant routine; there's no need for redundant
> sets of routines.  In contrast, numpy appears to consider data
> analysis to be secondary, somehow, to pure math, and takes the NaN
> functionality out of routines like min() and std().  This means it's
> not possible to use many ndarray methods.  If we're ready to handle a
> NaN by returning it, why not enable the more useful behavior of
> ignoring it, at user discretion?

Let's get something straight. numpy has no opinion on the primacy of
data analysis tasks versus "pure math", however you want to define
those. Now, the numpy developers *do* tend to have an opinion on how
NaNs are used. NaNs were invented to handle invalid results of
*computations*. They were not invented as place markers for missing
data. They can frequently be used as such because the IEEE-754
semantics of NaNs sometimes works for missing data (e.g. in z=x+y, z
will have a NaN wherever either x or y have NaNs). But at least as
frequently, they don't, and other semantics need to be specifically
placed on top of it (e.g. nanmin()).
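
For illustration (not part of the original message), a short sketch of where the IEEE-754 semantics already behave like missing data and where extra semantics are overlaid. It assumes a current NumPy release:

```python
import numpy as np

x = np.array([1.0, np.nan, 3.0])
y = np.array([10.0, 20.0, np.nan])

z = x + y                # NaN wherever either input has one
total = np.sum(x)        # arithmetic reductions propagate NaN too
skipped = np.nansum(x)   # overlaid semantics: treat NaN as missing
```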

numpy is a general purpose computational tool that needs to apply to
many different fields and use cases. Consequently, when presented with
a choice like this, we tend to go for the path that makes the minimum
of assumptions and overlaid semantics.

Now to address the idea that all of the relevant ndarray methods
should take nan=True arguments. I am sympathetic to the idea that we
should have the functionality somewhere. I do doubt that the users you
are thinking about will be happy adding nan=True to a substantial
fraction of their calls. My experience with such APIs is that it gets
tedious real fast. Instead, I would suggest that if you want a wide
range of nan-skipping versions of functions that we have, let's put
them all as functions into a module. This gives the programmer the
possibility of using relatively clean calls.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
  -- Umberto Eco


Re: [Numpy-discussion] min() of array containing NaN

2008-08-14 Thread Anne Archibald
2008/8/14 Norbert Nemec <[EMAIL PROTECTED]>:
> Travis E. Oliphant wrote:
>> NAN's don't play well with comparisons because comparison with them is
>> undefined.  See numpy.nanmin
>>
> This is not true! Each single comparison with a NaN has a well defined
> outcome. The difficulty is only that certain logical assumptions do not
> hold any more when NaNs are involved (e.g. [A<B] -> [not(A>=B)]).
> Assuming an IEEE compliant processor and C compiler, it
> should be possible to code a NaN safe min routine without additional
> overhead.

Sadly, it's not possible without extra overhead. Specifically: the
NaN-ignorant implementation does a single comparison between each
array element and a placeholder, and decides based on the result which
to keep. If you try to rewrite the comparison to do the right thing
when a NaN is involved, you get stuck: any comparison with a NaN on
either side always returns False, so you cannot distinguish between
the temporary being a NaN and the new element being a non-NaN (keep
the temporary) and the temporary being a non-NaN and the new element
being a NaN (replace the temporary). If you're willing to do two
tests, sure, but that's overhead (and probably comparable to an
isnan). If you're willing to do arithmetic you might even be able to
pull it off, since NaNs tend to propagate:
if (new [... rest of this message was lost in the archive ...]


Re: [Numpy-discussion] min() of array containing NaN

2008-08-14 Thread Norbert Nemec
Travis E. Oliphant wrote:
> Thomas J. Duck wrote:
>   
>> Determining the minimum value of an array that contains NaN produces  
>> a surprising result:
>>
>>  >>> x = numpy.array([0,1,2,numpy.nan,4,5,6])
>>  >>> x.min()
>> 4.0
>>
>> I expected 0.0.  Is this the intended behaviour or a bug?  I am using  
>> numpy 1.1.1.
>>   
>> 
> NAN's don't play well with comparisons because comparison with them is
> undefined.  See numpy.nanmin
>
This is not true! Each single comparison with a NaN has a well defined
outcome. The difficulty is only that certain logical assumptions do not
hold any more when NaNs are involved (e.g. [A<B] -> [not(A>=B)]).
Assuming an IEEE compliant processor and C compiler, it
should be possible to code a NaN safe min routine without additional
overhead.
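
For illustration (not part of the original message), a sketch of how a comparison-only reduction can produce the reported 4.0. This is not claimed to be the inner loop of NumPy 1.1.1; it only shows that each comparison is well defined while the overall result depends on how the comparison is written:

```python
from functools import reduce

nan = float("nan")
data = [0.0, 1.0, 2.0, nan, 4.0, 5.0, 6.0]

def keep_left_if_less(a, b):
    # Keeps the running value only when "a < b" is True.
    return a if a < b else b

def take_right_if_less(a, b):
    # Replaces the running value only when "b < a" is True.
    return b if b < a else a

result_a = reduce(keep_left_if_less, data)    # NaN displaces 0.0, then 4.0 displaces NaN
result_b = reduce(take_right_if_less, data)   # NaN never displaces the running value
```

With the first form the result is 4.0, matching the report; with the second it is 0.0.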



Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Joe Harrington
> I'm doing nothing. Someone else must volunteer.

Fair enough.  Would the code be accepted if contributed?

> There is a
> reasonable design rule that if you have a boolean argument which you
> expect to only be passed literal Trues and Falses, you should instead
> just have two different functions.

Robert, can you list some reasons to favor this design rule?  

Here are some reasons to favor richly functional routines:

User's code is more readable because subtle differences affect args,
   not functions
Easier learning for new users
Much briefer and more readable docs
Similar behavior across languages
Smaller number of functions in the core package (a recent list topic)
Many fewer routines to maintain, particularly if multiple switches exist
Availability of the NaN functionality in a method of ndarray

The last point is key.  The NaN behavior is central to analyzing real
data containing unavoidable bad values, which is the bread and butter
of a substantial fraction of the user base.  In the languages they're
switching from, handling NaNs is just part of doing business, and is
an option of every relevant routine; there's no need for redundant
sets of routines.  In contrast, numpy appears to consider data
analysis to be secondary, somehow, to pure math, and takes the NaN
functionality out of routines like min() and std().  This means it's
not possible to use many ndarray methods.  If we're ready to handle a
NaN by returning it, why not enable the more useful behavior of
ignoring it, at user discretion?

--jh--


Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Kevin Jacobs <[EMAIL PROTECTED]>
On Wed, Aug 13, 2008 at 4:01 PM, Robert Kern <[EMAIL PROTECTED]> wrote:

> On Wed, Aug 13, 2008 at 14:37, Joe Harrington <[EMAIL PROTECTED]> wrote:
> >>On Tue, Aug 12, 2008 at 19:28, Charles R Harris
> >><[EMAIL PROTECTED]> wrote:
> >>>
> >>>
> >>> On Tue, Aug 12, 2008 at 5:13 PM, Andrew Dalke <[EMAIL PROTECTED]>
> >>> wrote:
> >>>>
> >>>> On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
> >>>> > Er, is this actually a bug? I would instead consider the fact that
> >>>> > np.min([]) raises an exception a bug of sorts - the identity of min is
> >>>> > inf.
> >>>
> >>>
> >>>
> >>>>
> >>>> Personally, I expect that if my array 'x' has a NaN then
> >>>> min(x) must be a NaN.
> >>>
> >>> I suppose you could use
> >>>
> >>> min(a,b) = (abs(a - b) + a + b)/2
> >>>
> >>> which would have that effect.
> >
> >>Or we could implement the inner loop of the minimum ufunc to return
> >>NaN if there is a NaN. Currently it just compares the two values
> >>(which causes the unpredictable results since having a NaN on either
> >>side of the < is always False). I would be amenable to that provided
> >>that the C isnan() call does not cause too much slowdown in the normal
> >>case.
> >
> > While you're doing that, can you do it so that if keyword nan=False it
> > returns NaN if NaNs exist, and if keyword nan=True it ignores NaNs?
>
> I'm doing nothing. Someone else must volunteer.
>


I've volunteered to implement this functionality and will have some time
over the weekend to prepare and post a patch for further discussion.

-Kevin


Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Andrew Dalke
Robert Kern wrote:
> Or we could implement the inner loop of the minimum ufunc to return
> NaN if there is a NaN. Currently it just compares the two values
> (which causes the unpredictable results since having a NaN on either
> side of the < is always False). I would be amenable to that provided
> that the C isnan() call does not cause too much slowdown in the normal
> case.

Reading this again, I realize that I don't know how ufuncs work so
this suggestion might not be feasible. 

It doesn't need to be unpredictable.  Make sure the first value is
not a NaN (if it is, quit).  The test against NaN always returns
false, so by inverting the comparison then inverting the result
you end up with a test for "is a new minimum OR is NaN".  (I
checked the assembly output.  There's no effective difference
in code length between the normal and the inverted forms.  I
didn't test performance.)
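
For illustration (not part of the original message), a small sketch comparing the plain and the inverted test against a running minimum of 3.0:

```python
nan = float("nan")
best = 3.0

rows = []
for item in (5.0, 3.0, 1.0, nan):
    plain = best > item             # True only for a new minimum
    inverted = not (best <= item)   # True for a new minimum OR a NaN item
    rows.append((plain, inverted))
```

The two tests agree for ordinary numbers and differ only for the NaN item, where the inverted form fires and the plain form does not.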


For random values in the array the test should pass less and
less often, so sticking the isnan test in there has something
like O(log(N)) cost instead of O(N) cost.  That's handwaving,
btw, but it's probably a log because the effect is scale invariant.

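Here's example code

#include <math.h>

double nan_min(int n, double *data) {
   int i;
   double best = data[0];
   if (isnan(best)) {
      return best;
   }
   /* Archive truncated the original here; loop reconstructed from the text above. */
   for (i=1; i<n; i++) {
      if (!(best <= data[i])) {   /* new minimum OR NaN */
         best = data[i];
         if (isnan(best)) {
            return best;
         }
      }
   }
   return best;
}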


Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Robert Kern
On Wed, Aug 13, 2008 at 14:37, Joe Harrington <[EMAIL PROTECTED]> wrote:
>>On Tue, Aug 12, 2008 at 19:28, Charles R Harris
>><[EMAIL PROTECTED]> wrote:
>>>
>>>
>>> On Tue, Aug 12, 2008 at 5:13 PM, Andrew Dalke <[EMAIL PROTECTED]>
>>> wrote:

>>>> On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
>>>> > Er, is this actually a bug? I would instead consider the fact that
>>>> > np.min([]) raises an exception a bug of sorts - the identity of min is
>>>> > inf.
>>>
>>>
>>>
>>>>
>>>> Personally, I expect that if my array 'x' has a NaN then
>>>> min(x) must be a NaN.
>>>
>>> I suppose you could use
>>>
>>> min(a,b) = (abs(a - b) + a + b)/2
>>>
>>> which would have that effect.
>
>>Or we could implement the inner loop of the minimum ufunc to return
>>NaN if there is a NaN. Currently it just compares the two values
>>(which causes the unpredictable results since having a NaN on either
>>side of the < is always False). I would be amenable to that provided
>>that the C isnan() call does not cause too much slowdown in the normal
>>case.
>
> While you're doing that, can you do it so that if keyword nan=False it
> returns NaN if NaNs exist, and if keyword nan=True it ignores NaNs?

I'm doing nothing. Someone else must volunteer.

But I'm not in favor of using a keyword argument. There is a
reasonable design rule that if you have a boolean argument which you
expect to only be passed literal Trues and Falses, you should instead
just have two different functions. Since we already have names staked
out for this alternate version (nanmin() and nanmax()), we might as
well use them.

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco


Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Joe Harrington
>On Tue, Aug 12, 2008 at 19:28, Charles R Harris
><[EMAIL PROTECTED]> wrote:
>>
>>
>> On Tue, Aug 12, 2008 at 5:13 PM, Andrew Dalke <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
>>> > Er, is this actually a bug? I would instead consider the fact that
>>> > np.min([]) raises an exception a bug of sorts - the identity of min is
>>> > inf.
>>
>> 
>>
>>>
>>> Personally, I expect that if my array 'x' has a NaN then
>>> min(x) must be a NaN.
>>
>> I suppose you could use
>>
>> min(a,b) = (abs(a - b) + a + b)/2
>>
>> which would have that effect.

>Or we could implement the inner loop of the minimum ufunc to return
>NaN if there is a NaN. Currently it just compares the two values
>(which causes the unpredictable results since having a NaN on either
>side of the < is always False). I would be amenable to that provided
>that the C isnan() call does not cause too much slowdown in the normal
>case.

While you're doing that, can you do it so that if keyword nan=False it
returns NaN if NaNs exist, and if keyword nan=True it ignores NaNs?
We can argue which should be the default (see my prior post).  Both
are compatible with the current undefined behavior.

I assume that the fastest way to do it is two separate loops for the
separate cases, but it might be fast enough straight (with a
conditional in the inner loop), or with some other trick (macro magic,
function pointer, whatever).

Thanks,

--jh--


Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Christopher Barker
Robert Kern wrote:
> Or we could implement the inner loop of the minimum ufunc to return
> NaN if there is a NaN. Currently it just compares the two values
> (which causes the unpredictable results since having a NaN on either
> side of the < is always False). I would be amenable to that provided
> that the C isnan() call does not cause too much slowdown in the normal
> case.

+1 -- this seems to be the only reasonable option.

-Chris



-- 
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

[EMAIL PROTECTED]


Re: [Numpy-discussion] min() of array containing NaN

2008-08-13 Thread Alok Singhal
On 12/08/08: 18:31, Charles R Harris wrote:
>On Tue, Aug 12, 2008 at 6:28 PM, Charles R Harris
><[EMAIL PROTECTED]> wrote:
>I suppose you could use
>min(a,b) = (abs(a - b) + a + b)/2
>which would have that effect.
> 
>Hmm, that is for the max, min would be
>(a + b - |a - b|)/2

This would break when there is an overflow because of
addition/subtraction:

def new_min(a, b):
  return (a + b - abs(a-b))/2

a = 1e308
b = -1e308

new_min(a, b) # returns -inf
min(a, b) # returns -1e308

-- 
Alok Singhal
http://www.astro.virginia.edu/~as8ca/


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Robert Kern
On Tue, Aug 12, 2008 at 19:28, Charles R Harris
<[EMAIL PROTECTED]> wrote:
>
>
> On Tue, Aug 12, 2008 at 5:13 PM, Andrew Dalke <[EMAIL PROTECTED]>
> wrote:
>>
>> On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
>> > Er, is this actually a bug? I would instead consider the fact that
>> > np.min([]) raises an exception a bug of sorts - the identity of min is
>> > inf.
>
> 
>
>>
>> Personally, I expect that if my array 'x' has a NaN then
>> min(x) must be a NaN.
>
> I suppose you could use
>
> min(a,b) = (abs(a - b) + a + b)/2
>
> which would have that effect.

Or we could implement the inner loop of the minimum ufunc to return
NaN if there is a NaN. Currently it just compares the two values
(which causes the unpredictable results since having a NaN on either
side of the < is always False). I would be amenable to that provided
that the C isnan() call does not cause too much slowdown in the normal
case.
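
For illustration (not part of the original message), a sketch of the two behaviours as they can be observed in current NumPy releases. This says nothing about the 1.1.1 release discussed in the thread; it assumes only that np.minimum and np.fmin are available as ufuncs:

```python
import numpy as np

x = np.array([0, 1, 2, np.nan, 4, 5, 6])

propagating = np.minimum.reduce(x)   # NaN if any element is NaN
ignoring = np.fmin.reduce(x)         # skips NaN when a non-NaN value exists
```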

-- 
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless
enigma that is made terrible by our own mad attempt to interpret it as
though it had an underlying truth."
 -- Umberto Eco


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Charles R Harris
On Tue, Aug 12, 2008 at 6:28 PM, Charles R Harris <[EMAIL PROTECTED]> wrote:

>
>
> On Tue, Aug 12, 2008 at 5:13 PM, Andrew Dalke <[EMAIL PROTECTED]> wrote:
>
>> On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
>> > Er, is this actually a bug? I would instead consider the fact that
>> > np.min([]) raises an exception a bug of sorts - the identity of min is
>> > inf.
>>
> 
>
>
>>
>> Personally, I expect that if my array 'x' has a NaN then
>> min(x) must be a NaN.
>>
>
> I suppose you could use
>
> min(a,b) = (abs(a - b) + a + b)/2
>
> which would have that effect.
>

Hmm, that is for the max, min would be

(a + b - |a - b|)/2
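
For illustration (not part of the original message), a sketch of the two arithmetic formulas in plain Python. formula_min and formula_max are hypothetical names, and the results are exact only while a + b and a - b stay representable without overflow or rounding:

```python
import math

def formula_min(a, b):
    # (a + b - |a - b|) / 2 picks the smaller value; NaN in either input propagates.
    return (a + b - abs(a - b)) / 2

def formula_max(a, b):
    # (a + b + |a - b|) / 2 picks the larger value; NaN in either input propagates.
    return (a + b + abs(a - b)) / 2

nan = float("nan")
small = formula_min(2.0, 5.0)
large = formula_max(2.0, 5.0)
with_nan = formula_min(nan, 1.0)
```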

>
> Chuck
>
>
>


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Charles R Harris
On Tue, Aug 12, 2008 at 5:13 PM, Andrew Dalke <[EMAIL PROTECTED]> wrote:

> On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
> > Er, is this actually a bug? I would instead consider the fact that
> > np.min([]) raises an exception a bug of sorts - the identity of min is
> > inf.
>



>
> Personally, I expect that if my array 'x' has a NaN then
> min(x) must be a NaN.
>

I suppose you could use

min(a,b) = (abs(a - b) + a + b)/2

which would have that effect.

Chuck


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Andrew Dalke
On Aug 12, 2008, at 9:54 AM, Anne Archibald wrote:
> Er, is this actually a bug? I would instead consider the fact that
> np.min([]) raises an exception a bug of sorts - the identity of min is
> inf.

That'll break consistency with the normal 'max'
function in Python.
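
For illustration (not part of the original message), a sketch of the empty-input behaviour. It assumes a current NumPy release, where np.min accepts an initial keyword that lets the caller supply the identity explicitly:

```python
import numpy as np

empty = np.array([], dtype=float)

try:
    np.min(empty)
    numpy_raises = False
except ValueError:
    numpy_raises = True

try:
    min([])
    builtin_raises = False
except ValueError:
    builtin_raises = True

# Opting in to "the identity of min is inf" at the call site.
with_identity = np.min(empty, initial=np.inf)
```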

> Really nanmin of an array containing only nans should be the same
> as an empty array; both should be infinity.

One thing I expect is that if min(x) exists then there is
some i where x[i] "is" min(x) .  Returning +inf for min([NaN])
breaks that.

However, my expectation doesn't hold true for Python.  If
I use Python's object identity test 'is' then object
identity is lost in numpy.min, although it is preserved
under Python's min:

 >>> import numpy as np
 >>> x = [200, 300]
 >>> np.min(x)
200
 >>> np.min(x) is x[0]
False
 >>> min(x) is x[0]
True
 >>>

and if I use '==' for equality testing then my
expectation will fail if isnan(x[i]) because
then x[i] != x[i].

 >>> import numpy as np
 >>> np.nan
nan
 >>> np.nan == np.nan
False

So when I say "is" I mean "acts the same as,
except in some strange corner cases".

Or to put it another way, it should be possible
to implement a hypothetical 'argnanmin' just
like there is an 'argmin' which complements 'min'.
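
For illustration (not part of the original message), current NumPy releases provide np.nanargmin, which plays the role of the hypothetical 'argnanmin'. A minimal sketch:

```python
import numpy as np

x = np.array([4.0, np.nan, 1.5, 7.0])

i = np.nanargmin(x)     # index of the smallest non-NaN element
value = np.nanmin(x)    # the value found at that index
```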

> I guess this is a problem for types that don't have an infinity
> (returning maxint is a poor substitute), but what is the correct
> behaviour here?

"Doctor, doctor it hurts when I do this."
"Well, don't do that."

Raise an exception.  Refuse the temptation to guess.
Force the user to handle this case as appropriate.

Personally, I expect that if my array 'x' has a NaN then
min(x) must be a NaN.

Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread David Cournapeau
On Tue, Aug 12, 2008 at 10:02 AM, Thomas J. Duck <[EMAIL PROTECTED]> wrote:
>
>  It is quite often the case that NaNs are unexpected, so it
> would be helpful to raise an Exception.

from numpy import seterr
seterr(all = 'warn')

This emits a warning whenever any kind of floating point error is
encountered.  You can even use 'raise' instead of 'warn', in which case
you will get an exception.
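
For illustration (not part of the original message), a sketch of the 'raise' setting using np.errstate, the context-manager form of seterr in current NumPy releases. divide_strict is a hypothetical helper. Note that this catches the operation that creates a NaN, not a later min() over an array that already holds one:

```python
import numpy as np

def divide_strict(num, den):
    """Divide element-wise, raising instead of silently producing NaN or inf."""
    with np.errstate(invalid="raise", divide="raise"):
        return np.asarray(num, dtype=float) / np.asarray(den, dtype=float)

try:
    divide_strict([0.0, 1.0], [0.0, 2.0])   # 0/0 is an invalid operation
    raised = False
except FloatingPointError:
    raised = True

fine = divide_strict([1.0, 3.0], [2.0, 4.0])
```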

cheers,

David


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Joe Harrington
> It really isn't very hard to replace
> np.sum(A)
> with
> np.sum(A[~isnan(A)])
> if you want to ignore NaNs instead of propagating them. So I don't
> feel a need for special code in sum() that treats NaN as 0.

That's all well and good, until you want to set the axis= keyword.
Then you're stuck with looping.  As doing stats for each pixel column
in a stack of astronomical images with bad pixels and cosmic-ray hits
is one of the most common actions in astronomical data analysis, this
is an issue for a significant number of current and future users.

>>> a=np.arange(9, dtype=float)
>>> a.shape=(3,3)
>>> a[1,1]=np.nan
>>> a
array([[ 0.,  1.,  2.],
       [ 3., nan,  5.],
       [ 6.,  7.,  8.]])
>>> np.sum(a)
nan
>>> np.sum(a[~np.isnan(a)])
32.0

Good, but...

>>> np.sum(a[~np.isnan(a)], axis=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.5/site-packages/numpy/core/fromnumeric.py", line 634, in sum
    return sum(axis, dtype, out)
ValueError: axis(=1) out of bounds

Uh-oh...

>>> np.sum(a[~np.isnan(a)], axis=0)
32.0

Worse: wrong answer but not an exception, since

>>> a[~np.isnan(a)] 
array([ 0.,  1.,  2.,  3.,  5.,  6.,  7.,  8.])

has the undesired side effect of irreversibly flattening the array.
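
For illustration (not part of the original message), a sketch of three ways to keep the array shape while ignoring NaNs along an axis. It assumes a current NumPy release and uses the same 3x3 array as above:

```python
import numpy as np

a = np.arange(9, dtype=float).reshape(3, 3)
a[1, 1] = np.nan

row_sums = np.nansum(a, axis=1)                        # NaN counted as 0
col_sums = np.where(np.isnan(a), 0.0, a).sum(axis=0)   # same idea, spelled out
masked_rows = np.ma.masked_invalid(a).sum(axis=1)      # masked-array route
```

Replacing NaN with 0 only works for sums; for a mean or standard deviation the count of valid elements per row or column matters as well, which is where the nan-aware functions or masked arrays come in.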

--jh--


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Thomas J. Duck

Christopher Barker wrote:
> well, it's not a bug because the result if there is a NaN is undefined.
> However, it sure could trip people up. If you know there is likely to be
> a NaN in there, then you could use nanmin() or masked arrays. The
> problem comes up when you have no idea there might be a NaN in there, in
> which case you get a bogus answer -- this is very bad.

  This is exactly what happened to me.  I was getting crazy  
results when contour plotting with matplotlib, although the pcolor  
plots looked fine.  In particular, the colorscale had incorrect  
limits.  This led me to check the min() and max() values in my array,  
which were clearly wrong as illustrated by the pcolor plot.  Further  
investigation revealed unexpected NaNs in my array.

> Is there an error state that will trigger an error or warning in these
> situations? Otherwise, I'd have to say that the default should be to
> test for NaN's, and either raise an error or return NaN. If that really
> does slow things down too much, there could be a flag that lets you turn
> it off.


  It is quite often the case that NaNs are unexpected, so it  
would be helpful to raise an Exception.

  Thanks for all of the helpful discussion on this issue.

--

Thomas J. Duck <[EMAIL PROTECTED]>

Associate Professor,
Department of Physics and Atmospheric Science, Dalhousie University,
Halifax, Nova Scotia, Canada, B3H 3J5.
Tel: (902)494-1456 | Fax: (902)494-5191 | Lab: (902)494-3813
Web: http://aolab.phys.dal.ca/




Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Bruce Southey
Anne Archibald wrote:
> 2008/8/12 Joe Harrington <[EMAIL PROTECTED]>:
>
>   
>> So, I endorse extending min() and all other statistical routines to
>> handle NaNs, possibly with a switch to turn it on if a suitably fast
>> algorithm cannot be found (which is competitor IDL's solution).
>> Certainly without a switch the default behavior should be to return
>> NaN, not to return some random value, if a NaN is present.  Otherwise
>> the user may never know a NaN is present, and therefore has to check
>> every use for NaNs.  That constant manual NaN checking is slower and
>> more error-prone than any numerical speed advantage.
>>
>> So to sum, proposed for statistical routines:
>> if NaN is not present, return value
>> if NaN is present, return NaN
>> if NaN is present and nan=True, return value ignoring all NaNs
>>
>> OR:
>> if NaN is not present, return value
>> if NaN is present, return value ignoring all NaNs
>> if NaN is present and nan=True, return NaN
>>
>> I'd prefer the latter.  IDL does the former and it is a pain to do
>> /nan all the time.  However, the latter might trip up the unwary,
>> whereas the former never does.
>>
>> This would apply at least to:
>> min
>> max
>> sum
>> prod
>> mean
>> median
>> std
>> and possibly many others.
>> 
>
> For almost all of these the current behaviour is to propagate NaNs
> arithmetically. For example, the sum of anything with a NaN is NaN. I
> think this is perfectly sufficient, given how easy it is to strip out
> NaNs if that's what you want. The issue that started this thread (and
> the many other threads that have come up as users stub their toes on
> this behaviour) is that min (and other functions based on comparisons)
> do not propagate NaNs. If you do np.amin(A) and A contains NaNs, you
> can't count on getting a NaN back, unlike np.mean or np.std. The fact
> that you get some random value, not the minimum, just adds insult to
> injury. (It is probably also true that the value you get back depends
> on how the array is stored in memory.)
>
> It really isn't very hard to replace
> np.sum(A)
> with
> np.sum(A[~np.isnan(A)])
> if you want to ignore NaNs instead of propagating them. So I don't
> feel a need for special code in sum() that treats NaN as 0. I would be
> content if the comparison-based functions propagated NaNs
> appropriately.
>
> If you did decide it was essential to make versions of the functions
> that removed NaNs, it would get you most of the way there to add an
> optional keyword argument to ufuncs' reduce method that skipped NaNs.
>
> Anne
>
>   
Actually, you probably need to use isfinite rather than isnan, because
NumPy follows IEEE 754, in which NaN is distinct from infinity.
Also, doesn't this require an additional temporary copy of A?

The problem I have with this is that you must always know in advance
that NaNs or infinities are present, and it assumes you want to ignore them.
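
A small sketch of the two points above, assuming a float array `A` with made-up values: `isnan` leaves infinities in place while `isfinite` removes them, and boolean-mask indexing produces a new array rather than a view.

```python
import numpy as np

# Made-up data containing NaN and both infinities.
A = np.array([1.0, np.inf, np.nan, -np.inf, 2.0])

no_nan = A[~np.isnan(A)]      # drops only the NaN; +/-inf are kept
finite = A[np.isfinite(A)]    # drops the NaN and both infinities

# Boolean-mask indexing returns a copy, so the filtered array is a temporary.
copied = not np.shares_memory(A, finite)

print(no_nan, finite, copied)
```

Whether the extra copy matters depends on the array size; for a large array it roughly doubles the memory touched by the reduction.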

An alternative is something simple like a new function (see the code below my name).

Bruce

import numpy as np

def minnan(x, axis=None, out=None, hasnan=False):
    if hasnan:
        # Caller expects NaNs: ignore them.
        return np.nanmin(x, axis)
    elif np.isfinite(x).all():
        # No NaN or infinity present: the ordinary minimum is safe.
        return np.min(x, axis, out)
    else:
        return np.nan # actually should be something else here


x = np.array([1,2,np.nan,4,5,6])
y = np.array([1,2,3,4,5,6])

print('NumPy Min:', np.min(x))
print('NumPy NaNMin:', np.nanmin(x))
print('NumPy MinNaN:', minnan(x))
print('NumPy MinNaN T:', minnan(x, hasnan=True))
print('NumPy Min:', np.min(y))
print('NumPy NaNMin:', np.nanmin(y))
print('NumPy MinNaN:', minnan(y))
print('NumPy MinNaN T:', minnan(y, hasnan=True))



Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Kevin Jacobs <[EMAIL PROTECTED]>
On Tue, Aug 12, 2008 at 1:46 AM, Andrew Dalke <[EMAIL PROTECTED]> wrote:

> Here's the implementation, from lib/function_base.py
>
> def nanmin(a, axis=None):
>     """Find the minimium over the given axis, ignoring NaNs.
>     """
>     y = array(a,subok=True)
>     if not issubclass(y.dtype.type, _nx.integer):
>         y[isnan(a)] = _nx.inf
>     return y.min(axis)
>
>
No wonder nanmin is slow.  A C implementation would run at virtually the
same speed as min.  If there is interest, I'll be happy to code C versions.
A better solution would be to just support NaNs and Inf in the generic code.

-Kevin


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Anne Archibald
2008/8/12 Joe Harrington <[EMAIL PROTECTED]>:

> So, I endorse extending min() and all other statistical routines to
> handle NaNs, possibly with a switch to turn it on if a suitably fast
> algorithm cannot be found (which is competitor IDL's solution).
> Certainly without a switch the default behavior should be to return
> NaN, not to return some random value, if a NaN is present.  Otherwise
> the user may never know a NaN is present, and therefore has to check
> every use for NaNs.  That constant manual NaN checking is slower and
> more error-prone than any numerical speed advantage.
>
> So to sum up, proposed for statistical routines:
> if NaN is not present, return value
> if NaN is present, return NaN
> if NaN is present and nan=True, return value ignoring all NaNs
>
> OR:
> if NaN is not present, return value
> if NaN is present, return value ignoring all NaNs
> if NaN is present and nan=True, return NaN
>
> I'd prefer the latter.  IDL does the former and it is a pain to do
> /nan all the time.  However, the latter might trip up the unwary,
> whereas the former never does.
>
> This would apply at least to:
> min
> max
> sum
> prod
> mean
> median
> std
> and possibly many others.

For almost all of these the current behaviour is to propagate NaNs
arithmetically. For example, the sum of anything with a NaN is NaN. I
think this is perfectly sufficient, given how easy it is to strip out
NaNs if that's what you want. The issue that started this thread (and
the many other threads that have come up as users stub their toes on
this behaviour) is that min (and other functions based on comparisons)
do not propagate NaNs. If you do np.amin(A) and A contains NaNs, you
can't count on getting a NaN back, unlike np.mean or np.std. The fact
that you get some random value, not the minimum, just adds insult to
injury. (It is probably also true that the value you get back depends
on how the array is stored in memory.)

It really isn't very hard to replace
np.sum(A)
with
np.sum(A[~np.isnan(A)])
if you want to ignore NaNs instead of propagating them. So I don't
feel a need for special code in sum() that treats NaN as 0. I would be
content if the comparison-based functions propagated NaNs
appropriately.
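
A minimal sketch of the substitution described above, assuming a float array `A` with made-up values. It shows the NaN propagating through a plain sum and being dropped by the boolean mask; `np.nansum` is included only for comparison.

```python
import numpy as np

A = np.array([1.0, 2.0, np.nan, 4.0])

propagated = np.sum(A)               # the NaN propagates through the addition
ignored = np.sum(A[~np.isnan(A)])    # the mask removes the NaN before summing
via_nansum = np.nansum(A)            # existing helper with the same effect

print(propagated, ignored, via_nansum)
```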

If you did decide it was essential to make versions of the functions
that removed NaNs, it would get you most of the way there to add an
optional keyword argument to ufuncs' reduce method that skipped NaNs.
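
For illustration only: later NumPy releases accept `where=` and `initial=` keywords on a ufunc's `reduce` method, which is close in spirit to the keyword suggested above. These keywords were not available in the 1.1 series under discussion, so the sketch below assumes a recent NumPy; the array values are made up.

```python
import numpy as np

A = np.array([3.0, np.nan, 1.0, 2.0])
keep = ~np.isnan(A)   # True where the element should take part in the reduction

# add has an identity (0), so no initial value is needed.
total = np.add.reduce(A, where=keep)

# minimum has no identity, so an explicit starting value is required.
smallest = np.minimum.reduce(A, where=keep, initial=np.inf)

print(total, smallest)
```

This skips the NaNs without building a filtered copy of `A`, though the boolean mask itself is still a temporary array.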

Anne


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Anne Archibald
2008/8/12 Stéfan van der Walt <[EMAIL PROTECTED]>:
> Hi Andrew
>
> 2008/8/12 Andrew Dalke <[EMAIL PROTECTED]>:
>> This is buggy for the case of a list containing only NaNs.
>>
>>  >>> import numpy as np
>>  >>> np.NAN
>> nan
>>  >>> np.min([np.NAN])
>> nan
>>  >>> np.nanmin([np.NAN])
>> inf
>>  >>>
>
> Thanks for the report.  This should be fixed in r5630.

Er, is this actually a bug? I would instead consider the fact that
np.min([]) raises an exception a bug of sorts - the identity of min is
inf. Really nanmin of an array containing only nans should be the same
as an empty array; both should be infinity.

I guess this is a problem for types that don't have an infinity
(returning maxint is a poor substitute), but what is the correct
behaviour here?
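
A short sketch of the identity argument, assuming a recent NumPy in which `np.min` accepts an `initial=` keyword (not available in the release discussed in this thread). With +inf as the starting value, the minimum of an empty float array is +inf; for integers the only stand-in is the largest representable value.

```python
import numpy as np

empty = np.array([], dtype=float)

# Without a starting value, the empty reduction is an error.
try:
    np.min(empty)
    raised = False
except ValueError:
    raised = True

# With +inf as the identity, the empty minimum is well defined.
with_identity = np.min(empty, initial=np.inf)

# Integer types have no infinity; maxint is the (poor) substitute mentioned above.
int_fallback = np.iinfo(np.int64).max

print(raised, with_identity, int_fallback)
```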

Anne


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Joe Harrington
Masked arrays are a bit clunky for something as simple and standard as
NaN handling.  They also have the inverse of the standard truth sense,
at least as used in my field.  1 (or True) usually means the item is
allowed, not denied, so that you can multiply the mask by the data to
zero all bad values, add and subtract masks in sensible ways and get
what's expected, etc.  For example, in the "stacked, masked mean"
image processing algorithm, you sum the data along an axis, sum the
masks along that axis, and divide the results to get the mean image
without bad pixels.  This is much more accurate than taking a median,
and admits to error analysis, which the median does not (easily).
While the regular behavior is "just a ~ away", as Stefan pointed out
to me once, that's not acceptable if the image cube is large and
memory or speed are at issue, and it's also very prone to bugs if
you're negating everything all the time.
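
The "stacked, masked mean" can be sketched as follows. The cube and mask values are made up, and the mask uses the truth sense described above (1 means good). Multiplying by the mask only zeroes bad pixels whose stored value is finite, since NaN times 0 is still NaN, so the bad pixels here are flagged with a finite sentinel (99).

```python
import numpy as np

# Three 2x2 "images" stacked along axis 0; 99.0 marks a bad pixel.
cube = np.array([[[1.0, 2.0], [3.0, 4.0]],
                 [[3.0, 99.0], [5.0, 4.0]],
                 [[5.0, 4.0], [99.0, 4.0]]])

# 1 = good pixel, 0 = bad pixel.
good = np.array([[[1, 1], [1, 1]],
                 [[1, 0], [1, 1]],
                 [[1, 1], [0, 1]]])

data_sum = (cube * good).sum(axis=0)   # bad pixels contribute zero to the sum
n_good = good.sum(axis=0)              # number of good frames at each pixel
mean_image = data_sum / n_good         # per-pixel mean over good frames only

print(mean_image)
```

A pixel that is bad in every frame would give a division by zero here; a real routine would need to decide what to return in that case.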

Further, with ma you have to convert to using an entirely different
and redundant set of routines instead of having the very standard
handling of NaNs found in our competitor programs, such as IDL.  The
issue of not having an in-place method in ma was also raised earlier.
I'll add the difficulty of converting code if a standard thing like
NaN handling has to be simulated in multiple calls.

So, I endorse extending min() and all other statistical routines to
handle NaNs, possibly with a switch to turn it on if a suitably fast
algorithm cannot be found (which is competitor IDL's solution).
Certainly without a switch the default behavior should be to return
NaN, not to return some random value, if a NaN is present.  Otherwise
the user may never know a NaN is present, and therefore has to check
every use for NaNs.  That constant manual NaN checking is slower and
more error-prone than any numerical speed advantage.

So to sum up, proposed for statistical routines:
if NaN is not present, return value
if NaN is present, return NaN
if NaN is present and nan=True, return value ignoring all NaNs

OR:
if NaN is not present, return value
if NaN is present, return value ignoring all NaNs
if NaN is present and nan=True, return NaN

I'd prefer the latter.  IDL does the former and it is a pain to do
/nan all the time.  However, the latter might trip up the unwary,
whereas the former never does.
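
The first variant of the proposal can be sketched as a wrapper. `reduce_with_nan_switch` is a hypothetical helper written for this illustration, not an existing NumPy function, and the `nan` keyword simply mirrors the name used in the proposal.

```python
import numpy as np

def reduce_with_nan_switch(func, a, nan=False):
    """Hypothetical helper: apply func, handling NaNs per the first variant."""
    a = np.asarray(a, dtype=float)
    bad = np.isnan(a)
    if not bad.any():
        return func(a)          # no NaN present: ordinary result
    if nan:
        return func(a[~bad])    # nan=True: ignore the NaNs
    return np.nan               # default: make the NaN visible to the caller

x = [0, 1, 2, np.nan, 4, 5, 6]
print(reduce_with_nan_switch(np.min, x))            # NaN is reported
print(reduce_with_nan_switch(np.min, x, nan=True))  # NaN is ignored
```

The second variant is the same function with the two NaN branches swapped. Note that with `nan=True` and an all-NaN input, `func` receives an empty array, which `np.min` rejects.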

This would apply at least to:
min
max
sum
prod
mean
median
std
and possibly many others.

--jh--


Re: [Numpy-discussion] min() of array containing NaN

2008-08-12 Thread Stéfan van der Walt
Hi Andrew

2008/8/12 Andrew Dalke <[EMAIL PROTECTED]>:
> This is buggy for the case of a list containing only NaNs.
>
>  >>> import numpy as np
>  >>> np.NAN
> nan
>  >>> np.min([np.NAN])
> nan
>  >>> np.nanmin([np.NAN])
> inf
>  >>>

Thanks for the report.  This should be fixed in r5630.

Regards
Stéfan


Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Andrew Dalke
On Aug 12, 2008, at 7:05 AM, Christopher Barker wrote:
> Actually, I think it skips over NaN -- otherwise, the min would always
> be zero if there were a NaN, and "a very small negative number" if
> there were a -inf.


Here's the implementation, from lib/function_base.py

def nanmin(a, axis=None):
    """Find the minimium over the given axis, ignoring NaNs.
    """
    y = array(a,subok=True)
    if not issubclass(y.dtype.type, _nx.integer):
        y[isnan(a)] = _nx.inf
    return y.min(axis)

This is buggy for the case of a list containing only NaNs.

 >>> import numpy as np
 >>> np.NAN
nan
 >>> np.min([np.NAN])
nan
 >>> np.nanmin([np.NAN])
inf
 >>>
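
One way the all-NaN case could be handled, as a sketch only. `nanmin_sketch` is a hypothetical name, and returning NaN for an all-NaN slice is an assumption about the desired behaviour rather than a description of any actual fix. It keeps the inf-substitution trick from the code above and then restores NaN wherever every element was NaN.

```python
import numpy as np

def nanmin_sketch(a, axis=None):
    """Hypothetical variant: minimum ignoring NaNs, NaN if all are NaN."""
    y = np.array(a, dtype=float)      # always a copy, so the input is untouched
    nan_mask = np.isnan(y)
    y[nan_mask] = np.inf              # same substitution as the code above
    result = y.min(axis)
    all_nan = nan_mask.all(axis)      # True where there was no valid data at all
    # [()] turns a 0-d result back into a scalar and leaves arrays unchanged.
    return np.where(all_nan, np.nan, result)[()]

print(nanmin_sketch([np.nan]))
print(nanmin_sketch([1.0, np.nan, 0.5]))
```

A genuine +inf in the data is still returned as the minimum when it is the only non-NaN value, because the all-NaN test looks at the mask and not at the substituted values.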


Andrew
[EMAIL PROTECTED]




Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Christopher Barker
Bruce Southey wrote:
> Actually this could be viewed as a bug because it ignores the entries
> to the left of the NaN.

Well, it's not a bug, because the result is undefined when there is a NaN. 
However, it sure could trip people up. If you know there is likely to be 
a NaN in there, then you could use nanmin() or masked arrays. The 
problem comes up when you have no idea there might be a NaN in there, in 
which case you get a bogus answer -- this is very bad.

Is there an error state that will trigger an error or warning in these 
situations? Otherwise, I'd have to say that the default should be to 
test for NaN's, and either raise an error or return NaN. If that really 
does slow things down too much, there could be a flag that lets you turn 
it off.
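
A sketch of the "test by default, with a flag to turn it off" idea. `checked_min` and its `check` keyword are hypothetical names chosen for this illustration; nothing like this exists in NumPy under that name.

```python
import numpy as np

def checked_min(a, axis=None, check=True):
    """Hypothetical wrapper: refuse to return a minimum if a NaN is present."""
    a = np.asarray(a)
    # Only floating-point and complex arrays can hold NaN.
    if check and a.dtype.kind in "fc" and np.isnan(a).any():
        raise ValueError("array contains NaN; minimum is undefined")
    return a.min(axis)

print(checked_min([3, 1, 2]))
try:
    checked_min([0, 1, np.nan])
except ValueError as err:
    print("error:", err)
```

Passing `check=False` skips the extra pass over the data for callers who know their input is clean and need the speed.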

This situation now makes me very nervous.

>  because
> nanmin treats NaNs as zero, positive infinity as a really large
> positive number and negative infinity as a very small or negative
> number.

Actually, I think it skips over NaN -- otherwise, the min would always
be zero if there were a NaN, and "a very small negative number" if
there were a -inf.

I have to say that one of the things I always liked about Matlab was
its handling of NaN, inf, and -inf.

-Chris


-- 
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115   (206) 526-6317   main reception


Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Bruce Southey
I agree with using Masked arrays...

Actually this could be viewed as a bug because it ignores the entries
to the left of the NaN.
>>> numpy.__version__
'1.1.1.dev5559'
>>> x = numpy.array([0,1,2,numpy.nan, 4, 5, 6])
>>> numpy.min(x)
4.0
>>> x = numpy.array([numpy.nan,0,1,2, 4, 5, 6])
>>> x.min()
0.0
>>> x = numpy.array([0,1,2, 4, 5, 6, numpy.nan])
>>> x.min()
-1.#IND

As has been recently said on this list (as per Stefan's post), NaNs
and infinity have a higher computational cost. I am not sure of the
relative cost of, say, using isnan first as a check versus having a NaN
flag stored as part of the ndarray class.

As per Travis's post, technically it should return NaN. But I don't
agree with Charles that it should automatically call nanmin because
nanmin treats NaNs as zero, positive infinity as a really large
positive number and negative infinity as a very small or negative
number. This may not be what the user wants. An alternative is to
change the signature to include a flag to include or exclude NaN and
infinity which would also remove the need for nanmin and friends.

Bruce

On Mon, Aug 11, 2008 at 6:41 PM, Pierre GM <[EMAIL PROTECTED]> wrote:
> *cough* MaskedArrays anyone ? *cough*
>
> The ideal would be for min/max to output a NaN when there's a NaN somewhere.
> That way, you'd know that there's a potential problem in your data, and that you
> should use the nanfunctions or masked arrays.
>
> Is there a page on the wiki for that matter? It seems to show up regularly...
>
> On Monday 11 August 2008 18:49:06 Stéfan van der Walt wrote:
>> 2008/8/11 Charles Doutriaux <[EMAIL PROTECTED]>:
>> > Seems to me like min should automagically call nanmin if it spots any
>> > NaN, no?
>>
>> Nanmin is quite a bit slower:
>>
>> In [2]: x = np.random.random((5000))
>>
>> In [3]: timeit np.min(x)
>> 1 loops, best of 3: 24.8 µs per loop
>>
>> In [4]: timeit np.nanmin(x)
>> 1 loops, best of 3: 136 µs per loop
>>
>> So, I'm not sure if that will happen.  One option is to use `nanmin`
>> by default, and to provide `min` for people who need the speed.  The
>> fact that results with nan's are almost always unexpected is certainly
>> a valid concern.
>>
>> Cheers
>> Stéfan
>
>
>


Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Pierre GM
*cough* MaskedArrays anyone ? *cough*

The ideal would be for min/max to output a NaN when there's a NaN somewhere. 
That way, you'd know that there's a potential problem in your data, and that you 
should use the nanfunctions or masked arrays.
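
For reference, a minimal sketch of the masked-array route using the array from the original report. `np.ma.masked_invalid` masks NaN and infinite entries, and reductions on the result skip the masked elements.

```python
import numpy as np

x = np.array([0, 1, 2, np.nan, 4, 5, 6])
mx = np.ma.masked_invalid(x)   # masks NaN and +/-inf entries

print(mx.min())     # minimum over the unmasked entries
print(mx.count())   # number of unmasked entries
```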

Is there a page on the wiki for that matter? It seems to show up regularly...

On Monday 11 August 2008 18:49:06 Stéfan van der Walt wrote:
> 2008/8/11 Charles Doutriaux <[EMAIL PROTECTED]>:
> > Seems to me like min should automagically call nanmin if it spots any
> > NaN, no?
>
> Nanmin is quite a bit slower:
>
> In [2]: x = np.random.random((5000))
>
> In [3]: timeit np.min(x)
> 1 loops, best of 3: 24.8 µs per loop
>
> In [4]: timeit np.nanmin(x)
> 1 loops, best of 3: 136 µs per loop
>
> So, I'm not sure if that will happen.  One option is to use `nanmin`
> by default, and to provide `min` for people who need the speed.  The
> fact that results with nan's are almost always unexpected is certainly
> a valid concern.
>
> Cheers
> Stéfan




Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Stéfan van der Walt
2008/8/11 Charles Doutriaux <[EMAIL PROTECTED]>:
> Seems to me like min should automagically call nanmin if it spots any
> NaN, no?

Nanmin is quite a bit slower:

In [2]: x = np.random.random((5000))

In [3]: timeit np.min(x)
1 loops, best of 3: 24.8 µs per loop

In [4]: timeit np.nanmin(x)
1 loops, best of 3: 136 µs per loop

So, I'm not sure if that will happen.  One option is to use `nanmin`
by default, and to provide `min` for people who need the speed.  The
fact that results with nan's are almost always unexpected is certainly
a valid concern.
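
The comparison can be repeated outside IPython with the standard `timeit` module. The sketch below also times a third option, an `isnan` check followed by the ordinary minimum; `min_after_check` is a hypothetical helper for this illustration. No timings are quoted here because they depend on the machine and the NumPy version.

```python
import timeit
import numpy as np

x = np.random.random(5000)

def min_after_check(a):
    # One extra pass to look for NaNs before the ordinary reduction.
    return np.nan if np.isnan(a).any() else a.min()

candidates = [("np.min", lambda: np.min(x)),
              ("np.nanmin", lambda: np.nanmin(x)),
              ("isnan check + min", lambda: min_after_check(x))]

for name, fn in candidates:
    # Best of 3 runs of 1000 calls each, reported per call.
    per_call = min(timeit.repeat(fn, number=1000, repeat=3)) / 1000
    print("%-18s %.1f us per call" % (name, per_call * 1e6))
```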

Cheers
Stéfan


Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Charles Doutriaux
Seems to me like min should automagically call nanmin if it spots any
NaN, no?

C.

Fabrice Silva wrote:
> Try the nanmin function:
>
> $ python
> Python 2.5.2 (r252:60911, Jul 31 2008, 07:39:27) 
> [GCC 4.3.1] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import numpy
> >>> numpy.__version__
> '1.1.0'
> >>> x = numpy.array([0,1,2,numpy.nan, 4, 5, 6])
> >>> x.min()
> 4.0
> >>> numpy.nanmin(x)
> 0.0
>
> There is no nanmin method for array instances, i.e. one cannot execute
> >>> x.nanmin()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: 'numpy.ndarray' object has no attribute 'nanmin'
> 
>   



Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Fabrice Silva
Try the nanmin function:

$ python
Python 2.5.2 (r252:60911, Jul 31 2008, 07:39:27) 
[GCC 4.3.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy
>>> numpy.__version__
'1.1.0'
>>> x = numpy.array([0,1,2,numpy.nan, 4, 5, 6])
>>> x.min()
4.0
>>> numpy.nanmin(x)
0.0

There is no nanmin method for array instances, i.e. one cannot execute
>>> x.nanmin()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'nanmin'

-- 
Fabrice Silva
LMA UPR CNRS 7051 - équipe S2M



Re: [Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Travis E. Oliphant
Thomas J. Duck wrote:
> Determining the minimum value of an array that contains NaN produces  
> a surprising result:
>
>  >>> x = numpy.array([0,1,2,numpy.nan,4,5,6])
>  >>> x.min()
> 4.0
>
> I expected 0.0.  Is this the intended behaviour or a bug?  I am using  
> numpy 1.1.1.
>   
NaNs don't play well with comparisons because comparison with them is
undefined. See numpy.nanmin.
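
To make the point concrete, here is a pure-Python sketch. Every ordered comparison involving NaN is False, so the result of a comparison-based minimum depends on where the NaN sits. The `running_min` loop is an illustration only; it happens to return 4.0 for the array in the report, but whether NumPy's own loop has this exact form is an assumption.

```python
import math

nan = float("nan")

# Every ordered comparison with NaN is False, and NaN is not equal to itself.
comparisons = (nan < 1.0, nan > 1.0, nan == nan, 1.0 <= nan)

def running_min(values):
    """Illustrative loop: keep m only while the test `m <= v` succeeds."""
    m = values[0]
    for v in values[1:]:
        if not (m <= v):   # False whenever m or v is NaN, so m is replaced
            m = v
    return m

print(comparisons)
print(running_min([0.0, 1.0, 2.0, nan, 4.0, 5.0, 6.0]))   # NaN in the middle
print(running_min([nan, 0.0, 1.0, 2.0, 4.0, 5.0, 6.0]))   # NaN first
print(math.isnan(running_min([0.0, 1.0, 2.0, 4.0, 5.0, 6.0, nan])))  # NaN last
```

With the NaN in the middle, everything to its left is forgotten; with the NaN last, the NaN itself is returned.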

-Travis



[Numpy-discussion] min() of array containing NaN

2008-08-11 Thread Thomas J. Duck

Determining the minimum value of an array that contains NaN produces  
a surprising result:

 >>> x = numpy.array([0,1,2,numpy.nan,4,5,6])
 >>> x.min()
4.0

I expected 0.0.  Is this the intended behaviour or a bug?  I am using  
numpy 1.1.1.

Thanks,
Tom

--

Thomas J. Duck <[EMAIL PROTECTED]>

Associate Professor,
Department of Physics and Atmospheric Science, Dalhousie University,
Halifax, Nova Scotia, Canada, B3H 3J5.
Tel: (902)494-1456 | Fax: (902)494-5191 | Lab: (902)494-3813
Web: http://aolab.phys.dal.ca/

