Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-30 Thread Mark Wiebe
On Wed, Jun 29, 2011 at 1:51 PM, Lluís xscr...@gmx.net wrote: Mark Wiebe writes: [...] I think that deciding on the value of NA signal values boils down to this question: should 3rd party code be able to interpret missing data information stored in the separate mask array?

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Mark Wiebe
On Tue, Jun 28, 2011 at 7:34 AM, Lluís xscr...@gmx.net wrote: Mark Wiebe writes: The design that's forming is a combination of: * Solve the missing data problem * My ideas of what a good solution looks like: * applies to all NumPy dtypes in a fully general way *

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Charles R Harris
On Wed, Jun 29, 2011 at 11:53 AM, Mark Wiebe mwwi...@gmail.com wrote: On Tue, Jun 28, 2011 at 7:34 AM, Lluís xscr...@gmx.net wrote: Mark Wiebe writes: The design that's forming is a combination of: * Solve the missing data problem * My ideas of what a good solution looks like: *

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-29 Thread Lluís
Mark Wiebe writes: [...] I think that deciding on the value of NA signal values boils down to this question: should 3rd party code be able to interpret missing data information stored in the separate mask array? I'm tossing around some variations of ideas using the iterator to

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-28 Thread Lluís
Charles R Harris writes: I think we may need some standard format for masked data on disk if we don't go the NA value route. As I see it, the mask array is just some metadata that is attached to the dtype descriptor. I don't know how an ndarray is (un)pickled from disk, but I imagine that each

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Laurent Gautier
On 2011-06-24 17:30, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 16:43, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:33, Charles R Harris charlesr.har...@gmail.com wrote:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com ... @Mark - I don't have a clear idea whether you consider the nafloat64 option to be still in play as the first thing to be implemented (before

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 2:10 AM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com wrote: ... Perhaps we should make a wiki page someplace

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Pierre GM
This thread is getting quite long, innit ? And I think it's getting a tad confusing, because we're mixing two different concepts: missing values and masks. There should be support for missing values in numpy.core, I think we all agree on that. * What's been suggested of adding new dtypes

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Gael Varoquaux
On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote: I'm personally worried that the memory overhead of array.masks will make many of us tend to avoid them. I work with images that can easily get large enough that I would not want an array-items size byte array added to my storage.

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Wes McKinney
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote: This thread is getting quite long, innit ? And I think it's getting a tad confusing, because we're mixing two different concepts: missing values and masks. There should be support for missing values in numpy.core, I think

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wesmck...@gmail.com wrote: On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 6:00 AM, Gael Varoquaux gael.varoqu...@normalesup.org wrote: On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote: I'm personally worried that the memory overhead of array.masks will make many of us tend to avoid them. I work with images that can easily

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote: This thread is getting quite long, innit ? And I think it's getting a tad confusing, because we're mixing two different concepts:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:14 PM, Wes McKinney wesmck...@gmail.com wrote: ... I hope you're right. So far it seems that anyone who has spent real time with R (e.g. myself, Nathaniel) has expressed serious concerns about the masked approach. I'm sorry - I have been distracted. For my sake,

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote: This thread is getting quite long, innit ? And

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 5:29

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:44 AM, Wes McKinney wesmck...@gmail.com wrote: On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wesmck...@gmail.com wrote: On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Charles R Harris
On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 3:21

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 3:44 PM, Wes McKinney wesmck...@gmail.com wrote: ... Here are some things I can think of that would be affected by any changes here 1) Right now users of pandas can type pandas.isnull(series[5]) and that will yield True if the value is NA for any dtype. This might

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 4:05 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 8:31

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Olivier Delalleau
2011/6/25 Charles R Harris charlesr.har...@gmail.com I think what we really need to see are the use cases and work flow. The ones that hadn't occurred to me before were memory mapped files and data stored on disk in general. I think we may need some standard format for masked data on disk if

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root ben.r...@ou.edu wrote: Another example of how we use masks in matplotlib is in pcolor(). We have to combine the possible masks of X, Y, and V in both the x and y

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:25 PM, Benjamin Root ben.r...@ou.edu wrote: On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris charlesr.har...@gmail.com

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Benjamin Root
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris charlesr.har...@gmail.com wrote: I think he aims to support both. One complication with masks is keeping them tied to the data on disk. With na values one file can contain both the data and the missing data markers, whereas with masks, two

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 10:59 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote: On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote: This is a situation where I would just... use an array and a mask, rather

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:06 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote: On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 6:00 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com ... @Mark - I don't have a clear idea whether you consider the

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 9:14 AM, Wes McKinney wesmck...@gmail.com wrote: On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote: This thread is getting quite long, innit ? And I think it's getting a tad confusing, because we're mixing two different concepts: missing

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-25 Thread Mark Wiebe
On Sat, Jun 25, 2011 at 9:44 AM, Wes McKinney wesmck...@gmail.com wrote: On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris charlesr.har...@gmail.com wrote: On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wesmck...@gmail.com wrote: On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Thu, Jun 23, 2011 at 10:44 PM, Robert Kern robert.k...@gmail.com wrote: On Thu, Jun 23, 2011 at 15:53, Mark Wiebe mwwi...@gmail.com wrote: Enthought has asked me to look into the missing data problem and how NumPy could treat it better. I've considered the different ideas of adding

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote: ... If we think that the memory overhead for floating point types is too high, it would be easy to add a special case where maybe(float) used a distinguished NaN instead of a separate boolean. The extra complexity

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Laurent Gautier
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote: Lastly, I am not entirely familiar with R, so I am also very curious about what this magical NA value is, and how it compares to how NaNs work. Although, Pierre

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Neal Becker
Just 1 question before I look more closely. What is the cost to the non-MA user of this addition?

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 6:30 AM, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote: Lastly, I am not entirely familiar with R, so I am also very curious about what this

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Keith Goodman
On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman kwgood...@gmail.com wrote: On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe mwwi...@gmail.com wrote: Enthought has asked me to look into the missing data problem and how NumPy

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 06:47, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Thu, Jun 23, 2011 at 10:44 PM, Robert Kern robert.k...@gmail.com wrote: On Thu, Jun 23, 2011 at 15:53, Mark Wiebe mwwi...@gmail.com wrote: Enthought has asked me to look into the missing data problem and how

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote: Lastly, I am not entirely familiar with R, so I am also very curious about what this

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Keith Goodman
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes that are NA-aware. E.g. an nafloat64 would reserve a particular NaN value (there are lots of different NaN bit patterns, we'd just reserve one) that would
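
Editor's sketch of the "reserve one NaN bit pattern" idea quoted above. It is not code from the thread and not a NumPy API: the constant NA_BITS and the helper names are hypothetical, and the thread does not say which bit pattern would be reserved. It only shows, with the standard library, that a float64 NaN carries spare payload bits, so one pattern can be told apart from an ordinary NaN.

```python
import math
import struct

# Hypothetical reserved pattern: a quiet NaN with a non-zero payload.
# The thread does not specify which pattern would actually be chosen.
NA_BITS = 0x7FF80000000007A2

EXPONENT_MASK = 0x7FF0000000000000
MANTISSA_MASK = 0x000FFFFFFFFFFFFF


def float_to_bits(value: float) -> int:
    """Return the 64-bit IEEE 754 pattern of a float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


def bits_to_float(bits: int) -> float:
    """Build a float from a 64-bit IEEE 754 pattern."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def is_nan_bits(bits: int) -> bool:
    """A pattern is a NaN when the exponent is all ones and the mantissa is non-zero."""
    return (bits & EXPONENT_MASK) == EXPONENT_MASK and (bits & MANTISSA_MASK) != 0


def is_na(value: float) -> bool:
    """True only for the single reserved pattern, not for NaNs in general."""
    return float_to_bits(value) == NA_BITS


NA = bits_to_float(NA_BITS)
print(math.isnan(NA), is_na(NA), is_na(float("nan")))
```

To ordinary floating-point code the reserved value is still just a NaN; only code that inspects the bit pattern can distinguish it, which is the trade-off the quoted proposal relies on.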

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Bruce Southey
On 06/24/2011 09:06 AM, Robert Kern wrote: On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote: Lastly, I am not entirely familiar with R, so

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote: Lastly, I am

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes that are NA-aware. E.g. an nafloat64 would reserve a particular NaN value (there are lots

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 09:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes that are NA-aware. E.g. an nafloat64 would reserve a particular NaN value

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes that are

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 8:44 AM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Pierre GM
On Jun 24, 2011, at 4:44 PM, Robert Kern wrote: On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Laurent Gautier
On 2011-06-24 16:43, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes that are

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 16:43, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 10:02, Pierre GM pgmdevl...@gmail.com wrote: On Jun 24, 2011, at 4:44 PM, Robert Kern wrote: On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:06

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote: May be there is not so much need for reservation over the string NA, when making the distinction between: a- the internal representation of a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 11:05, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote: May be there is not so much need for reservation over the string NA, when

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 8:30 AM, Robert Kern robert.k...@gmail.com wrote: I would suggest following R's lead and letting ((NA==NA) == True) unlike NaNs. In R, NA and NaN do behave differently with respect to ==, but not the way you're saying: NA == NA [1] NA if (NA == NA) 1; Error in if (NA
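
Editor's sketch of the R behaviour described in the message above, where comparing NA with NA yields NA and using that result in an `if` is an error. The class NAType and the singleton NA are hypothetical illustrations in plain Python; they are not part of NumPy and not something proposed verbatim in the thread.

```python
class NAType:
    """Hypothetical 'not available' sentinel with R-like comparison semantics."""

    _instance = None

    def __new__(cls):
        # Keep a single shared instance so that identity checks work.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __eq__(self, other):
        # Comparing with a missing value gives a missing result.
        return self

    def __ne__(self, other):
        return self

    def __bool__(self):
        # Mirrors R refusing to branch on NA.
        raise TypeError("truth value of NA is undefined")

    def __repr__(self):
        return "NA"


NA = NAType()

print(NA == NA)                        # the result is NA itself, not True
print(float("nan") == float("nan"))    # IEEE NaN compares unequal to itself

try:
    if NA == NA:
        pass
except TypeError as exc:
    print("error:", exc)
```

The contrast with NaN is the point: a NaN comparison returns a definite False, while a missing-value comparison propagates the missingness and leaves the decision to the caller.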

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Christopher Barker
Nathaniel Smith wrote: The 'dtype factory' idea builds on the way I've structured datetime as a parameterized type, ... Another disadvantage is that we get further from Gael Varoquaux's point: Right now, the numpy array can be seen as an extension of the C array, basically a pointer, a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Thu, Jun 23, 2011 at 8:00 PM, Pierre GM pgmdevl...@gmail.com wrote: On Jun 24, 2011, at 2:42 AM, Mark Wiebe wrote: On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM pgmdevl...@gmail.com wrote: Sorry y'all, I'm just commenting bits by bits: One key problem is a lack of orthogonality with

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Robert Kern
On Fri, Jun 24, 2011 at 11:13, Christopher Barker chris.bar...@noaa.gov wrote: Nathaniel Smith wrote: If we think that the memory overhead for floating point types is too high, it would be easy to add a special case where maybe(float) used a distinguished NaN instead of a separate boolean.

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Christopher Barker
Robert Kern wrote: It's worth noting that this is not a replacement for masked arrays, nor is it intended to be the be-all, end-all solution to missing data problems. It's mostly just intended to be a focused tool to fill in the gaps where masked arrays are less convenient for whatever

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote: The alternative proposal would be to add a few new dtypes that are NA-aware. E.g. an nafloat64 would reserve a particular NaN value (there are lots of different NaN bit patterns, we'd just reserve one) that would

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith n...@pobox.com wrote: It should also be possible to accomplish a general solution at the dtype

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote: ... If we think that the memory overhead for floating point types is too high, it would be easy to add a special case where maybe(float)

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 7:30 AM, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote: Lastly, I am not entirely familiar with R, so I am also very curious about what this

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:01 AM, Neal Becker ndbeck...@gmail.com wrote: Just 1 question before I look more closely. What is the cost to the non-MA user of this addition? I'm following the idea that you don't pay for what you don't use. All the existing stuff will perform the same. -Mark

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote: But on the other hand, we gain: -- simpler implementation: no need to be checking and tracking the mask buffer everywhere. The needed infrastructure is

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 8:57 AM, Keith Goodman kwgood...@gmail.com wrote: On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman kwgood...@gmail.com wrote: On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe mwwi...@gmail.com wrote:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith n...@pobox.com wrote: It's

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 9:27 AM, Bruce Southey bsout...@gmail.com wrote: On 06/24/2011 09:06 AM, Robert Kern wrote: On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote: On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 10:02 AM, Pierre GM pgmdevl...@gmail.com wrote: On Jun 24, 2011, at 4:44 PM, Robert Kern wrote: On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote: On Fri, Jun 24, 2011 at

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 10:07 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 09:33, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, Just as a use case, if I do this: a = np.zeros((big_number,), dtype=np.int32) a[0] = np.NA I think I'm right in saying that, with the array.mask implementation my array memory usage will grow by new big_number bytes, whereas with the np.naint32 implementation you'd get something like:
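
Editor's sketch of the memory arithmetic behind the use case above. It assumes, as the message suggests, a mask that costs one byte per element; the value of big_number and the helper names are hypothetical, and np.NA itself is the proposed API under discussion, not something this code calls.

```python
def data_bytes(n_items: int, itemsize: int) -> int:
    """Bytes needed for the array values alone."""
    return n_items * itemsize


def mask_bytes(n_items: int, bytes_per_flag: int = 1) -> int:
    """Bytes needed for a separate mask (assumed: one byte per element)."""
    return n_items * bytes_per_flag


big_number = 10_000_000   # hypothetical array length
int32_itemsize = 4        # an int32 element occupies 4 bytes

data = data_bytes(big_number, int32_itemsize)
mask = mask_bytes(big_number)

print(f"data only:        {data:>12,d} bytes")
print(f"data plus mask:   {data + mask:>12,d} bytes")
print(f"relative overhead: {mask / data:.0%}")
```

Under these assumptions a one-byte mask adds a quarter to the footprint of an int32 array, and the relative cost grows as the element type gets smaller (it doubles the storage of an 8-bit array).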

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:25 AM, Robert Kern robert.k...@gmail.com wrote: On Fri, Jun 24, 2011 at 11:13, Christopher Barker chris.bar...@noaa.gov wrote: Nathaniel Smith wrote: If we think that the memory overhead for floating point types is too high, it would be easy to add a special

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:25 AM, Christopher Barker chris.bar...@noaa.gov wrote: Robert Kern wrote: It's worth noting that this is not a replacement for masked arrays, nor is it intended to be the be-all, end-all solution to missing data problems. It's mostly just intended to be a focused

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote: ... and the fact that 'missing_value' could be any type would

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 11:54 AM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote: But on the other hand, we gain: -- simpler implementation: no need to be

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 12:06 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe mwwi...@gmail.com wrote: On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote: On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe mwwi...@gmail.com wrote: On

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 1:04 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, Just as a use case, if I do this: a = np.zeros((big_number,), dtype=np.int32) a[0] = np.NA I think I'm right in saying that, with the array.mask implementation my array memory usage will grow by new

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 1:18 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Lluís
Mark Wiebe writes: It should also be possible to accomplish a general solution at the dtype level. We could have a 'dtype factory' used like: np.zeros(10, dtype=np.maybe(float)) where np.maybe(x) returns a new dtype whose storage size

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe mwwi...@gmail.com wrote: For the maybe dtype, it would need to gain access to the ufunc loop of the underlying dtype, and call it appropriately during the inner loop. This appears to require some more invasive upheaval within the ufunc code than the

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote: ... Again, there are pros and cons either way and I see them very orthogonal and complementary. That may be true, but I imagine only one of them will be implemented. @Mark - I don't have a clear idea whether you

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Gael Varoquaux
On Thu, Jun 23, 2011 at 07:51:25PM -0400, josef.p...@gmail.com wrote: From the perspective of statistical analysis, I don't see much advantage of this. What to do with nans depends on the analysis, and needs to be looked at for each case. From someone who actually sometimes does statistics

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote: ... Again, there are pros and cons either way and I

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Matthew Brett
Hi, On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com wrote: ... Perhaps we should make a wiki page someplace summarizing pros and cons of the various implementation approaches? But - we should do this if it really is an open question which one we go for. If not then, we're

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 8:02 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 3:38 PM, Lluís xscr...@gmx.net wrote: Mark Wiebe writes: It should also be possible to accomplish a general solution at the dtype level. We could have a 'dtype factory' used like: np.zeros(10, dtype=np.maybe(float))

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 4:24 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe mwwi...@gmail.com wrote: For the maybe dtype, it would need to gain access to the ufunc loop of the underlying dtype, and call it appropriately during the inner loop. This

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote: ... Again, there are pros and cons either way and I see them very orthogonal and complementary. That may be true, but I imagine only

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 6:10 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote: ... Again, there are pros and cons either way and I

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Fri, Jun 24, 2011 at 10:09 PM,

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Mark Wiebe
On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett matthew.br...@gmail.com wrote: Hi, On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com wrote: ... Perhaps we should make a wiki page someplace summarizing pros and cons of the various implementation approaches? But - we

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root ben.r...@ou.edu wrote: Another example of how we use masks in matplotlib is in pcolor().  We have to combine the possible masks of X, Y, and V in both the x and y directions to find the final mask to use for the final output result (because each
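
The mask-combining use case quoted above can be sketched with the existing numpy.ma module. This is only an illustration, not matplotlib's actual pcolor() code; the arrays x, y and v and their values are made up for the example, and the quoted message goes on to describe a more involved combination across both directions that is not reproduced here.

```python
import numpy as np

# Hypothetical inputs: x and y carry masks, v is a plain ndarray.
x = np.ma.masked_array([0.0, 1.0, 2.0, 3.0], mask=[True, False, False, False])
y = np.ma.masked_array([0.0, 1.0, 2.0, 3.0], mask=[False, False, True, False])
v = np.array([10.0, 11.0, 12.0, 13.0])

# getmaskarray() always returns a full boolean array, even for unmasked input,
# so the three masks can be OR-ed together element by element.
combined = np.ma.getmaskarray(x) | np.ma.getmaskarray(y) | np.ma.getmaskarray(v)

# Apply the combined mask to the values that will actually be drawn.
v_masked = np.ma.masked_array(v, mask=combined)
```

An element ends up masked if it is masked in any of the inputs.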

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Benjamin Root
On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe mwwi...@gmail.com wrote: On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris charlesr.har...@gmail.com wrote: On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Benjamin Root
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root ben.r...@ou.edu wrote: Another example of how we use masks in matplotlib is in pcolor(). We have to combine the possible masks of X, Y, and V in both the x and y

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Nathaniel Smith
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote: On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote: This is a situation where I would just... use an array and a mask, rather than a masked array. Then lots of things -- changing fill values, temporarily
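
The "array and a mask" approach mentioned above might look roughly like the following sketch. The names values and mask, and the convention that True marks a missing element, are assumptions made for the example rather than anything stated in the thread.

```python
import numpy as np

# Data and mask are kept as two ordinary arrays of the same shape.
values = np.array([1.0, 2.0, 3.0, 4.0])
mask = np.array([False, True, False, False])  # assumed convention: True = missing

# Reduce over the valid elements only.
total = values[~mask].sum()

# Changing the fill value is just a different np.where call.
filled_zero = np.where(mask, 0.0, values)
filled_nan = np.where(mask, np.nan, values)

# Temporarily unmasking an element does not touch the stored data.
temp_mask = mask.copy()
temp_mask[1] = False
total_unmasked = values[~temp_mask].sum()
```

Because the data and the mask are separate objects, each can be copied, replaced or ignored independently.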

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Wes McKinney
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote: On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote: This is a situation where I would just... use an array and a mask, rather than a

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-24 Thread Charles R Harris
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com wrote: On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote: On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote: On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:

[Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-23 Thread Mark Wiebe
Enthought has asked me to look into the missing data problem and how NumPy could treat it better. I've considered the different ideas of adding dtype variants with a special signal value and masked arrays, and concluded that adding masks to the core ndarray appears to be the best way to deal with the

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-23 Thread Nathaniel Smith
I'd like to see a statement of what the missing data problem is, and how this solves it? Because I don't think this is entirely intuitive, or that everyone necessarily has the same idea. Reduction operations like 'sum', 'prod', 'min', and 'max' will operate as if the values weren't there For
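
The reduction behaviour quoted from the proposal ("operate as if the values weren't there") can be approximated today with numpy.ma. This is a stand-in for illustration only, not the API proposed in the thread; the sample data is invented.

```python
import numpy as np

# One element (2.0) is marked as missing.
data = np.ma.masked_array([1.0, 2.0, 3.0, 4.0], mask=[False, True, False, False])

# Each reduction skips the masked element.
total = data.sum()
product = data.prod()
smallest = data.min()
largest = data.max()
```

Whether skipping is the right default, as opposed to propagating the missing value into the result, is the question the message above goes on to raise.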

Re: [Numpy-discussion] feedback request: proposal to add masks to the core ndarray

2011-06-23 Thread Mark Wiebe
On Thu, Jun 23, 2011 at 4:19 PM, Nathaniel Smith n...@pobox.com wrote: I'd like to see a statement of what the missing data problem is, and how this solves it? Because I don't think this is entirely intuitive, or that everyone necessarily has the same idea. I agree it represents different
