Re: [Numpy-discussion] Condensing array...

Olivier Grisel Fri, 25 Feb 2011 02:52:56 -0800

2011/2/25 Gael Varoquaux <[email protected]>:
> On Fri, Feb 25, 2011 at 10:36:42AM +0100, Fred wrote:
>> I have a big array (44 GB) I want to decimate.
>
>> But this array has a lot of NaN (only 1/3 has value, in fact, so 2/3 of
>> NaN).
>
>> If I "basically" decimate it (a la NumPy, ie data[::nx, ::ny, ::nz], for
>> instance), the decimated array will also have a lot of NaN.
>
>> What I would like to have in one cell of the decimated array is the
>> nearest (for instance) value in the big array. This is what I call a
>> "condensated array".
>
> What exactly do you mean by 'decimating'? To me it seems that you are
> looking for matrix factorization or matrix completion techniques, which
> are trendy topics in machine learning currently.
>
> They are however a bit challenging, and I fear that you will have to read
> the papers and do some implementation, unless you have a clear
> application in mind that allows simple tricks to solve it.


Indeed, see the following paper and slides by G. Martinsson; there is
also a section on matrix summarization:

  http://arxiv.org/abs/0909.4061
  http://www.stanford.edu/group/mmds/slides2010/Martinsson.pdf

The scikit-learn randomized SVD implementation comes from this paper.
It's pretty useful in practice.
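
To give a rough idea of the basic scheme described in that paper (random
projection to find an approximate range, then an exact SVD of the small
projected matrix), here is a NumPy-only sketch. It is not scikit-learn's
actual code: the function name `randomized_svd_sketch` and its parameters
(`n_oversamples`, `n_iter`, `seed`) are made up for this example.

```python
import numpy as np


def randomized_svd_sketch(A, k, n_oversamples=10, n_iter=2, seed=0):
    """Approximate rank-k SVD of a 2-D array A (illustrative helper only)."""
    rng = np.random.default_rng(seed)
    n = A.shape[1]

    # Sample the range of A with a Gaussian test matrix.
    omega = rng.standard_normal((n, k + n_oversamples))
    Q, _ = np.linalg.qr(A @ omega)

    # A few power iterations sharpen the basis when the singular values
    # decay slowly; re-orthonormalize at each step for stability.
    for _ in range(n_iter):
        Q, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Q)

    # Project A onto the small basis and take an exact SVD there.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small

    # Keep only the k leading components.
    return U[:, :k], s[:k], Vt[:k]
```

Calling `U, s, Vt = randomized_svd_sketch(A, k)` returns factors such that
`(U * s) @ Vt` approximates `A`; the approximation is only good when `A` is
close to rank `k`.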

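For the original question, if the goal is just one representative non-NaN
value per decimated cell, a plain block-wise reduction may already be
enough. The sketch below uses a NaN-aware mean over each (nx, ny, nz) block,
which is a simpler stand-in for the "nearest value" Fred asked about, not
the same thing. The helper name `condense` is made up for this example, and
it assumes a 3-D floating-point array.

```python
import numpy as np


def condense(data, factors):
    """Block-wise NaN-aware mean of a 3-D array (illustrative helper only)."""
    nx, ny, nz = factors

    # Trim each axis to a multiple of its block size.
    sx, sy, sz = (s // f * f for s, f in zip(data.shape, factors))
    trimmed = data[:sx, :sy, :sz]

    # Split every axis into (number of blocks, block size).
    blocks = trimmed.reshape(sx // nx, nx, sy // ny, ny, sz // nz, nz)

    # Count and sum the valid (non-NaN) values inside each block.
    valid = ~np.isnan(blocks)
    counts = valid.sum(axis=(1, 3, 5))
    sums = np.where(valid, blocks, 0.0).sum(axis=(1, 3, 5))

    # Blocks without any valid value stay NaN.
    out = np.full(counts.shape, np.nan)
    np.divide(sums, counts, out=out, where=counts > 0)
    return out
```

With a 44 GB array this would have to be applied slab by slab, for instance
on slices of a `np.memmap`, since the intermediate masks are as large as the
slice they are computed from.
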
-- 
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
