Hi Sab,

In this dense case, the output will contain 10000 x 10000 entries, i.e. 100 million doubles. That is about 800 MB of raw data before any overhead, so it doesn't fit in 1 GB. For a dense matrix, similarColumns() scales quadratically in the number of columns, so you need more memory across the cluster.

Reza
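For reference, the arithmetic behind that estimate can be sketched as below. This is only a back-of-the-envelope calculation, assuming 8 bytes per double and ignoring JVM object, index, and serialization overhead; the helper name dense_similarity_bytes is made up for illustration and is not part of any library.

```python
def dense_similarity_bytes(num_columns: int, bytes_per_entry: int = 8) -> int:
    """Raw size in bytes of a dense num_columns x num_columns matrix.

    Assumes 8-byte doubles and no per-entry overhead (hypothetical helper).
    """
    return num_columns * num_columns * bytes_per_entry


n = 10_000
raw = dense_similarity_bytes(n)
print(f"{n * n:,} entries, {raw / 1e6:,.0f} MB raw")
# prints "100,000,000 entries, 800 MB raw"
```

Because the size depends on the square of the column count, doubling the number of columns quadruples the raw footprint.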

On Sun, Mar 1, 2015 at 7:06 PM, Sabarish Sasidharan <sabarish.sasidha...@manthan.com> wrote:
> Sorry, I actually meant 30 x 10000 matrix (missed a 0)
>
> Regards
> Sab