On Wed, Jun 29, 2011 at 1:51 PM, Lluís xscr...@gmx.net wrote:
Mark Wiebe writes:
[...]
I think that deciding on the value of NA signal values boils down to
this question: should 3rd party code be able to interpret missing data
information stored in the separate mask array?
On Tue, Jun 28, 2011 at 7:34 AM, Lluís xscr...@gmx.net wrote:
Mark Wiebe writes:
The design that's forming is a combination of:
* Solve the missing data problem
* My ideas of what a good solution looks like:
* applies to all NumPy dtypes in a fully general way
*
On Wed, Jun 29, 2011 at 11:53 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Tue, Jun 28, 2011 at 7:34 AM, Lluís xscr...@gmx.net wrote:
Mark Wiebe writes:
The design that's forming is a combination of:
* Solve the missing data problem
* My ideas of what a good solution looks like:
*
Mark Wiebe writes:
[...]
I think that deciding on the value of NA signal values boils down to
this question: should 3rd party code be able to interpret missing data
information stored in the separate mask array?
I'm tossing around some variations of ideas using the iterator to
Charles R Harris writes:
I think we may need some standard format for masked data on disk if we
don't go the NA value route.
As I see it, the mask array is just some metadata that is attached to
the dtype descriptor. I don't know how an ndarray is (un)pickled from
disk, but I imagine that each
On 2011-06-24 17:30, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 16:43, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:33, Charles R Harris
charlesr.har...@gmail.com wrote:
Hi,
On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com
...
@Mark - I don't have a clear idea whether you consider the nafloat64
option to be still in play as the first thing to be implemented (before
Hi,
On Sat, Jun 25, 2011 at 2:10 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com
wrote:
...
Perhaps we should make a wiki page someplace
This thread is getting quite long, innit ?
And I think it's getting a tad confusing, because we're mixing two different
concepts: missing values and masks.
There should be support for missing values in numpy.core, I think we all agree
on that.
* What's been suggested of adding new dtypes
On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote:
I'm personally worried that the memory overhead of array.masks will
make many of us tend to avoid them. I work with images that can
easily get large enough that I would not want an array-items size byte
array added to my storage.
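A rough sketch of the overhead being described, assuming a mask that stores one byte per element (the thread does not fix the mask layout; `n` and the variable names are illustrative stand-ins, not part of any proposal):

```python
import numpy as np

n = 1_000_000  # illustrative stand-in for a large array size

# An int32 array with an assumed one-byte-per-element boolean mask.
data = np.zeros(n, dtype=np.int32)
mask = np.zeros(n, dtype=np.bool_)
int32_overhead = mask.nbytes / data.nbytes   # mask adds 25% on top of the data

# For 8-bit image data the same kind of mask doubles the storage.
image = np.zeros(n, dtype=np.uint8)
uint8_overhead = mask.nbytes / image.nbytes  # mask is as large as the data
```

Under this assumption the relative cost depends on the item size: the smaller the data type, the larger the share taken by the mask.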
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote:
This thread is getting quite long, innit ?
And I think it's getting a tad confusing, because we're mixing two
different concepts: missing values and masks.
There should be support for missing values in numpy.core, I think
On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wesmck...@gmail.com wrote:
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com
wrote:
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith
On Sat, Jun 25, 2011 at 6:00 AM, Gael Varoquaux
gael.varoqu...@normalesup.org wrote:
On Sat, Jun 25, 2011 at 01:02:07AM +0100, Matthew Brett wrote:
I'm personally worried that the memory overhead of array.masks will
make many of us tend to avoid them. I work with images that can
easily
Hi,
On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote:
This thread is getting quite long, innit ?
And I think it's getting a tad confusing, because we're mixing two
different concepts:
Hi,
On Sat, Jun 25, 2011 at 3:14 PM, Wes McKinney wesmck...@gmail.com wrote:
...
I hope you're right. So far it seems that anyone who has spent real
time with R (e.g. myself, Nathaniel) has expressed serious concerns
about the masked approach.
I'm sorry - I have been distracted. For my sake,
On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote:
This thread is getting quite long, innit ?
And
Hi,
On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Sat, Jun 25, 2011 at 3:21 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 5:29
On Sat, Jun 25, 2011 at 8:44 AM, Wes McKinney wesmck...@gmail.com wrote:
On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wesmck...@gmail.com
wrote:
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 8:31 AM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Sat, Jun 25, 2011 at 3:21
Hi,
On Sat, Jun 25, 2011 at 3:44 PM, Wes McKinney wesmck...@gmail.com wrote:
...
Here are some things I can think of that would be affected by any changes here
1) Right now users of pandas can type pandas.isnull(series[5]) and
that will yield True if the value is NA for any dtype. This might
Hi,
On Sat, Jun 25, 2011 at 4:05 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 8:52 AM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Sat, Jun 25, 2011 at 3:46 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 8:31
2011/6/25 Charles R Harris charlesr.har...@gmail.com
I think what we really need to see are the use cases and work flow. The
ones that hadn't occurred to me before were memory mapped files and data
stored on disk in general. I think we may need some standard format for
masked data on disk if
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root ben.r...@ou.edu wrote:
Another example of how we use masks in matplotlib is in pcolor(). We
have
to combine the possible masks of X, Y, and V in both the x and y
On Fri, Jun 24, 2011 at 8:25 PM, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
charlesr.har...@gmail.com
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris charlesr.har...@gmail.com
wrote:
I think he aims to support both. One complication with masks is keeping
them tied to the data on disk. With na values one file can contain both the
data and the missing data markers, whereas with masks, two
On Fri, Jun 24, 2011 at 10:59 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
This is a situation where I would just... use an array and a mask,
rather
On Fri, Jun 24, 2011 at 11:06 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
On Sat, Jun 25, 2011 at 6:00 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Jun 25, 2011 at 1:54 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com
...
@Mark - I don't have a clear idea whether you consider the
On Sat, Jun 25, 2011 at 9:14 AM, Wes McKinney wesmck...@gmail.com wrote:
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com
wrote:
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith
On Sat, Jun 25, 2011 at 9:21 AM, Charles R Harris charlesr.har...@gmail.com
wrote:
On Sat, Jun 25, 2011 at 5:29 AM, Pierre GM pgmdevl...@gmail.com wrote:
This thread is getting quite long, innit ?
And I think it's getting a tad confusing, because we're mixing two
different concepts: missing
On Sat, Jun 25, 2011 at 9:44 AM, Wes McKinney wesmck...@gmail.com wrote:
On Sat, Jun 25, 2011 at 10:25 AM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Sat, Jun 25, 2011 at 8:14 AM, Wes McKinney wesmck...@gmail.com
wrote:
On Sat, Jun 25, 2011 at 12:42 AM, Charles R Harris
Hi,
On Thu, Jun 23, 2011 at 10:44 PM, Robert Kern robert.k...@gmail.com wrote:
On Thu, Jun 23, 2011 at 15:53, Mark Wiebe mwwi...@gmail.com wrote:
Enthought has asked me to look into the missing data problem and how NumPy
could treat it better. I've considered the different ideas of adding
Hi,
On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote:
...
If we think that the memory overhead for floating point types is too
high, it would be easy to add a special case where maybe(float) used a
distinguished NaN instead of a separate boolean. The extra complexity
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote:
Lastly, I am not entirely familiar with R, so I am also very curious about
what this magical NA value is, and how it compares to how NaNs work.
Although, Pierre
Just 1 question before I look more closely. What is the cost to the non-MA
user of this addition?
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
On Fri, Jun 24, 2011 at 6:30 AM, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote:
Lastly, I am not entirely familiar with R, so I am also very curious
about
what this
On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman kwgood...@gmail.com wrote:
On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe mwwi...@gmail.com wrote:
Enthought has asked me to look into the missing data problem and how
NumPy
On Fri, Jun 24, 2011 at 06:47, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Thu, Jun 23, 2011 at 10:44 PM, Robert Kern robert.k...@gmail.com wrote:
On Thu, Jun 23, 2011 at 15:53, Mark Wiebe mwwi...@gmail.com wrote:
Enthought has asked me to look into the missing data problem and how
On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote:
Lastly, I am not entirely familiar with R, so I am also very curious about
what this
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal would be to add a few new dtypes that are
NA-aware. E.g. an nafloat64 would reserve a particular NaN value
(there are lots of different NaN bit patterns, we'd just reserve one)
that would
On 06/24/2011 09:06 AM, Robert Kern wrote:
On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote:
Lastly, I am not entirely familiar with R, so
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote:
Lastly, I am
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal would be to add a few new dtypes that are
NA-aware. E.g. an nafloat64 would reserve a particular NaN value
(there are lots
On Fri, Jun 24, 2011 at 09:33, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal would be to add a few new dtypes that are
NA-aware. E.g. an nafloat64 would reserve a particular NaN value
On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal would be to add a few new dtypes that are
On Fri, Jun 24, 2011 at 8:44 AM, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com
wrote:
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com
On Jun 24, 2011, at 4:44 PM, Robert Kern wrote:
On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal
On 2011-06-24 16:43, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:33, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com
wrote:
The alternative proposal would be to add a few new dtypes that are
Hi,
On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:33, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal would be to add a few new dtypes
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 16:43, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:33, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern robert.k...@gmail.com
On Fri, Jun 24, 2011 at 10:02, Pierre GM pgmdevl...@gmail.com wrote:
On Jun 24, 2011, at 4:44 PM, Robert Kern wrote:
On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:06
On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote:
Maybe there is not so much need for reservation over the string NA, when
making the distinction between:
a- the internal representation of a
On Fri, Jun 24, 2011 at 11:05, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 8:14 AM, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 10:07, Laurent Gautier lgaut...@gmail.com wrote:
Maybe there is not so much need for reservation over the string NA, when
On Fri, Jun 24, 2011 at 8:30 AM, Robert Kern robert.k...@gmail.com wrote:
I would suggest following R's lead and letting ((NA==NA) == True)
unlike NaNs.
In R, NA and NaN do behave differently with respect to ==, but not the
way you're saying:
NA == NA
[1] NA
if (NA == NA) 1;
Error in if (NA
Nathaniel Smith wrote:
The 'dtype factory' idea builds on the way I've structured datetime as a
parameterized type,
...
Another disadvantage is that we get further from Gael Varoquaux's point:
Right now, the numpy array can be seen as an extension of the C
array, basically a pointer, a
On Thu, Jun 23, 2011 at 8:00 PM, Pierre GM pgmdevl...@gmail.com wrote:
On Jun 24, 2011, at 2:42 AM, Mark Wiebe wrote:
On Thu, Jun 23, 2011 at 7:28 PM, Pierre GM pgmdevl...@gmail.com wrote:
Sorry y'all, I'm just commenting bits by bits:
One key problem is a lack of orthogonality with
On Fri, Jun 24, 2011 at 11:13, Christopher Barker chris.bar...@noaa.gov wrote:
Nathaniel Smith wrote:
If we think that the memory overhead for floating point types is too
high, it would be easy to add a special case where maybe(float) used a
distinguished NaN instead of a separate boolean.
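A minimal sketch of the "distinguished NaN" idea, using only the Python standard library. The bit pattern `NA_BITS` and the helper names are hypothetical choices for illustration; the thread does not say which NaN pattern would be reserved, and this sketch does not address whether the payload survives arithmetic:

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # a common quiet-NaN bit pattern
NA_BITS = 0x7FF80000000007A2         # hypothetical pattern reserved to mean "NA"

def bits_to_float(bits):
    """Reinterpret a 64-bit integer as an IEEE 754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]

def float_to_bits(x):
    """Reinterpret an IEEE 754 double as a 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

def is_na(x):
    """True only for the reserved pattern, not for NaNs in general."""
    return float_to_bits(x) == NA_BITS

nan = bits_to_float(QUIET_NAN_BITS)
na = bits_to_float(NA_BITS)

# Both values are NaN as far as floating-point semantics go, so the
# distinction costs no extra storage, but only one matches the reserved bits.
both_are_nan = math.isnan(nan) and math.isnan(na)
```

The trade-off under discussion is visible here: no separate boolean is stored, but the NA test has to inspect bits rather than rely on ordinary float comparison.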
Robert Kern wrote:
It's worth noting that this is not a replacement for masked arrays,
nor is it intended to be the be-all, end-all solution to missing data
problems. It's mostly just intended to be a focused tool to fill in
the gaps where masked arrays are less convenient for whatever
On Fri, Jun 24, 2011 at 7:06 AM, Robert Kern robert.k...@gmail.com wrote:
The alternative proposal would be to add a few new dtypes that are
NA-aware. E.g. an nafloat64 would reserve a particular NaN value
(there are lots of different NaN bit patterns, we'd just reserve one)
that would
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith n...@pobox.com wrote:
It should also be possible to accomplish a general solution at the
dtype
On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote:
...
If we think that the memory overhead for floating point types is too
high, it would be easy to add a special case where maybe(float)
On Fri, Jun 24, 2011 at 7:30 AM, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:56 PM, Benjamin Root ben.r...@ou.edu wrote:
Lastly, I am not entirely familiar with R, so I am also very curious
about
what this
On Fri, Jun 24, 2011 at 8:01 AM, Neal Becker ndbeck...@gmail.com wrote:
Just 1 question before I look more closely. What is the cost to the non-MA
user of this addition?
I'm following the idea that you don't pay for what you don't use. All the
existing stuff will perform the same.
-Mark
On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote:
But on the other hand, we gain:
-- simpler implementation: no need to be checking and tracking the
mask buffer everywhere. The needed infrastructure is
On Fri, Jun 24, 2011 at 8:57 AM, Keith Goodman kwgood...@gmail.com wrote:
On Thu, Jun 23, 2011 at 3:24 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 5:05 PM, Keith Goodman kwgood...@gmail.com
wrote:
On Thu, Jun 23, 2011 at 1:53 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 7:00 PM, Nathaniel Smith n...@pobox.com wrote:
It's
On Fri, Jun 24, 2011 at 9:27 AM, Bruce Southey bsout...@gmail.com wrote:
On 06/24/2011 09:06 AM, Robert Kern wrote:
On Fri, Jun 24, 2011 at 07:30, Laurent Gautier lgaut...@gmail.com wrote:
On 2011-06-24 13:59, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 10:02 AM, Pierre GM pgmdevl...@gmail.com wrote:
On Jun 24, 2011, at 4:44 PM, Robert Kern wrote:
On Fri, Jun 24, 2011 at 09:35, Robert Kern robert.k...@gmail.com
wrote:
On Fri, Jun 24, 2011 at 09:24, Keith Goodman kwgood...@gmail.com
wrote:
On Fri, Jun 24, 2011 at
On Fri, Jun 24, 2011 at 10:07 AM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Fri, Jun 24, 2011 at 3:43 PM, Robert Kern robert.k...@gmail.com
wrote:
On Fri, Jun 24, 2011 at 09:33, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 8:06 AM, Robert Kern
Hi,
Just as a use case, if I do this:
a = np.zeros((big_number,), dtype=np.int32)
a[0] = np.NA
I think I'm right in saying that, with the array.mask implementation
my array memory usage will grow by big_number new bytes, whereas with
the np.naint32 implementation you'd get something like:
On Fri, Jun 24, 2011 at 11:25 AM, Robert Kern robert.k...@gmail.com wrote:
On Fri, Jun 24, 2011 at 11:13, Christopher Barker chris.bar...@noaa.gov
wrote:
Nathaniel Smith wrote:
If we think that the memory overhead for floating point types is too
high, it would be easy to add a special
On Fri, Jun 24, 2011 at 11:25 AM, Christopher Barker
chris.bar...@noaa.gov wrote:
Robert Kern wrote:
It's worth noting that this is not a replacement for masked arrays,
nor is it intended to be the be-all, end-all solution to missing data
problems. It's mostly just intended to be a focused
Hi,
On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith n...@pobox.com wrote:
...
and the fact that 'missing_value' could be any type would
On Fri, Jun 24, 2011 at 11:54 AM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 9:33 AM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote:
But on the other hand, we gain:
-- simpler implementation: no need to be
On Fri, Jun 24, 2011 at 12:06 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 12:33 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Thu, Jun 23, 2011 at 8:32 PM, Nathaniel Smith n...@pobox.com wrote:
On Thu, Jun 23, 2011 at 5:21 PM, Mark Wiebe mwwi...@gmail.com wrote:
On
On Fri, Jun 24, 2011 at 1:04 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
Just as a use case, if I do this:
a = np.zeros((big_number,), dtype=np.int32)
a[0] = np.NA
I think I'm right in saying that, with the array.mask implementation
my array memory usage will grow by new
On Fri, Jun 24, 2011 at 1:18 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Fri, Jun 24, 2011 at 5:45 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 6:59 AM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Fri, Jun 24, 2011 at 2:32 AM, Nathaniel Smith
Mark Wiebe writes:
It should also be possible to accomplish a general
solution at the dtype level. We could have a 'dtype
factory' used like: np.zeros(10, dtype=np.maybe(float))
where np.maybe(x) returns a new dtype whose storage size
On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe mwwi...@gmail.com wrote:
For the maybe dtype, it would need to gain access to the ufunc loop of the
underlying dtype, and call it appropriately during the inner loop. This
appears to require some more invasive upheaval within the ufunc code than
the
Hi,
On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote:
...
Again, there are pros and cons either way and I see them very orthogonal and
complementary.
That may be true, but I imagine only one of them will be implemented.
@Mark - I don't have a clear idea whether you
On Thu, Jun 23, 2011 at 07:51:25PM -0400, josef.p...@gmail.com wrote:
From the perspective of statistical analysis, I don't see much
advantage of this. What to do with nans depends on the analysis, and
needs to be looked at for each case.
From someone who actually sometimes does statistics
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote:
...
Again, there are pros and cons either way and I
Hi,
On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com wrote:
...
Perhaps we should make a wiki page someplace summarizing pros and cons
of the various implementation approaches?
But - we should do this if it really is an open question which one we
go for. If not then, we're
On Fri, Jun 24, 2011 at 8:02 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 5:22 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
On Fri, Jun 24, 2011 at 3:38 PM, Lluís xscr...@gmx.net wrote:
Mark Wiebe writes:
It should also be possible to accomplish a general
solution at the dtype level. We could have a 'dtype
factory' used like: np.zeros(10, dtype=np.maybe(float))
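One way to picture the proposed factory, as a sketch only: `np.maybe` does not exist in NumPy, so the `maybe` helper below is a hypothetical stand-in built from an ordinary structured dtype, with a value field and a per-element NA flag. The field names are assumptions made for illustration.

```python
import numpy as np

def maybe(base):
    """Hypothetical stand-in for the proposed np.maybe dtype factory.

    Returns a structured dtype pairing each value with a boolean NA flag.
    """
    return np.dtype([("value", base), ("is_na", np.bool_)])

a = np.zeros(10, dtype=maybe(float))
a["value"] = np.arange(10.0)
a["is_na"][3] = True  # mark element 3 as missing

# A reduction that skips NA elements has to consult the flag explicitly.
valid_sum = a["value"][~a["is_na"]].sum()
```

With a packed structured dtype each element takes the base item size plus one byte for the flag. Unlike the proposal, ufuncs know nothing about the flag here, which is the part the thread says would need changes inside the ufunc machinery.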
On Fri, Jun 24, 2011 at 4:24 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 12:26 PM, Mark Wiebe mwwi...@gmail.com wrote:
For the maybe dtype, it would need to gain access to the ufunc loop of
the
underlying dtype, and call it appropriately during the inner loop. This
On Fri, Jun 24, 2011 at 5:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote:
...
Again, there are pros and cons either way and I see them very orthogonal
and
complementary.
That may be true, but I imagine only
On Fri, Jun 24, 2011 at 6:10 PM, Charles R Harris charlesr.har...@gmail.com
wrote:
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Fri, Jun 24, 2011 at 10:09 PM, Benjamin Root ben.r...@ou.edu wrote:
...
Again, there are pros and cons either way and I
On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett matthew.br...@gmail.com
wrote:
Hi,
On Fri, Jun 24, 2011 at 10:09 PM,
On Fri, Jun 24, 2011 at 7:02 PM, Matthew Brett matthew.br...@gmail.com wrote:
Hi,
On Sat, Jun 25, 2011 at 12:22 AM, Wes McKinney wesmck...@gmail.com
wrote:
...
Perhaps we should make a wiki page someplace summarizing pros and cons
of the various implementation approaches?
But - we
On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root ben.r...@ou.edu wrote:
Another example of how we use masks in matplotlib is in pcolor(). We have
to combine the possible masks of X, Y, and V in both the x and y directions
to find the final mask to use for the final output result (because each
On Fri, Jun 24, 2011 at 8:00 PM, Mark Wiebe mwwi...@gmail.com wrote:
On Fri, Jun 24, 2011 at 6:22 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 7:10 PM, Charles R Harris
charlesr.har...@gmail.com wrote:
On Fri, Jun 24, 2011 at 4:21 PM, Matthew Brett
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 2:09 PM, Benjamin Root ben.r...@ou.edu wrote:
Another example of how we use masks in matplotlib is in pcolor(). We
have
to combine the possible masks of X, Y, and V in both the x and y
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
This is a situation where I would just... use an array and a mask,
rather than a masked array. Then lots of things -- changing fill
values, temporarily
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
This is a situation where I would just... use an array and a mask,
rather than a
On Fri, Jun 24, 2011 at 10:06 PM, Wes McKinney wesmck...@gmail.com wrote:
On Fri, Jun 24, 2011 at 11:59 PM, Nathaniel Smith n...@pobox.com wrote:
On Fri, Jun 24, 2011 at 6:57 PM, Benjamin Root ben.r...@ou.edu wrote:
On Fri, Jun 24, 2011 at 8:11 PM, Nathaniel Smith n...@pobox.com wrote:
Enthought has asked me to look into the missing data problem and how NumPy
could treat it better. I've considered the different ideas of adding dtype
variants with a special signal value and masked arrays, and concluded that
adding masks to the core ndarray appears to be the best way to deal with the
I'd like to see a statement of what the missing data problem is, and
how this solves it? Because I don't think this is entirely intuitive,
or that everyone necessarily has the same idea.
Reduction operations like 'sum', 'prod', 'min', and 'max' will operate as if
the values weren't there
For
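As an illustration of reductions that "operate as if the values weren't there", here is a small sketch using the existing numpy.ma module. This is a stand-in for the behaviour described, not the core-ndarray mask design under discussion, whose API is not shown in this excerpt.

```python
import numpy as np

# The second element is masked, so reductions ignore it.
values = np.ma.masked_array([1.0, 2.0, 3.0, 4.0],
                            mask=[False, True, False, False])

total = values.sum()     # 1 + 3 + 4
product = values.prod()  # 1 * 3 * 4
smallest = values.min()
largest = values.max()
```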
On Thu, Jun 23, 2011 at 4:19 PM, Nathaniel Smith n...@pobox.com wrote:
I'd like to see a statement of what the missing data problem is, and
how this solves it? Because I don't think this is entirely intuitive,
or that everyone necessarily has the same idea.
I agree it represents different