Re: [Numpy-discussion] missing data discussion round 2

Dag Sverre Seljebotn Tue, 28 Jun 2011 23:53:52 -0700

On 06/28/2011 11:52 PM, Matthew Brett wrote:
> Hi,
>
> On Tue, Jun 28, 2011 at 5:38 PM, Charles R Harris
> <[email protected]>  wrote:
>> Nathaniel, an implementation using masks will look *exactly* like an
>> implementation using na-dtypes from the user's point of view. Except that
>> taking a masked view of an unmasked array allows ignoring values without
>> destroying or copying the original data. The only downside I can see to an
>> implementation using masks is memory and disk storage, and perhaps memory
>> mapped arrays. And I rather expect the former to solve itself in a few
>> years, eight gigs is becoming a baseline for workstations and in a couple of
>> years I expect that to be up around 16-32, and a few years after that.... In
>> any case we are talking 12% - 25% overhead, and in practice I expect it
>> won't be quite as big a problem as folks project.
>
> Or, in the case of 16 bit integers, 50% memory overhead.
>
> I honestly find it hard to believe that I will not care about memory
> use in the near future, and I don't think it's wise to make decisions
> on that assumption.
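
For reference, the overhead figures quoted above can be reproduced with a little arithmetic. This is only a sketch: it assumes the mask costs one byte per array element, which the thread does not actually fix, and `mask_overhead` is a hypothetical helper rather than anything in NumPy.

```python
# Sketch only: assumes a mask that stores one byte per array element.
# `mask_overhead` is a hypothetical helper, not a NumPy function.

def mask_overhead(itemsize_bytes, mask_bytes=1):
    """Return the extra memory a per-element mask adds, as a fraction of the data size."""
    return mask_bytes / itemsize_bytes

# Element sizes in bytes for a few common dtypes.
for name, size in [("float64", 8), ("float32", 4), ("int16", 2)]:
    print(f"{name}: {mask_overhead(size):.1%}")
# float64: 12.5%
# float32: 25.0%
# int16: 50.0%
```

Under that assumption the 12% - 25% range corresponds to 8-byte and 4-byte elements, and the 50% figure to 2-byte integers. A bit-packed mask would shrink each figure by a factor of eight.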


In many sciences, waiting for the future makes things worse, not better, 
simply because the amount of available data easily grows at a faster 
rate than the amount of memory you can get per dollar :-)

Dag Sverre
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
