Re: Column Similarities using DIMSUM fails with GC overhead limit exceeded

Reza Zadeh Sun, 01 Mar 2015 20:03:13 -0800

Hi Sab,
In this dense case, the output will contain 10000 x 10000 entries, i.e. 100
million doubles (about 800 MB of raw values), which doesn't fit in 1 GB once
overheads are included.
For a dense matrix, columnSimilarities() scales quadratically in the number of
columns, so you need more memory across the cluster.
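
As a rough sanity check of the arithmetic above, here is a minimal sketch in
plain Python (not Spark code). The helper name dense_similarity_bytes is made
up for illustration, and it counts only the 8 bytes per double, so real JVM
usage with object and shuffle overheads will be noticeably higher.

```python
# Back-of-the-envelope estimate for a dense column-similarity output.
# Assumption: every pair of columns yields one 8-byte double; JVM object
# headers, indices, and shuffle buffers are NOT included.
BYTES_PER_DOUBLE = 8


def dense_similarity_bytes(num_cols: int) -> int:
    """Raw bytes needed for num_cols x num_cols similarity values."""
    entries = num_cols * num_cols  # quadratic in the number of columns
    return entries * BYTES_PER_DOUBLE


n = 10_000
raw = dense_similarity_bytes(n)
print(f"{n} columns -> {n * n:,} entries, {raw / 1e6:.0f} MB raw")

# Doubling the number of columns quadruples the raw size.
print(dense_similarity_bytes(2 * n) // raw)
```

The number of rows (30 here) does not appear in the estimate, because the
output size depends only on the number of columns.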
Reza



On Sun, Mar 1, 2015 at 7:06 PM, Sabarish Sasidharan <
sabarish.sasidha...@manthan.com> wrote:

> Sorry, I actually meant 30 x 10000 matrix (missed a 0)
>
>
> Regards
> Sab
>
>
