Thanks Reza, interesting approach. On second thought, I think what I actually want is to calculate pairwise distances between the rows. Is there a pattern for that?
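
To make the question concrete, here is a rough sketch of the computation in plain Python rather than Spark. It assumes Euclidean distance and a small in-memory list of rows; the names `euclidean` and `pairwise_distances` are made up for illustration. On an RDD, `cartesian` followed by a filter on row indices could play the role of `combinations`, though that is quadratic in the number of rows and may not scale.

```python
from itertools import combinations
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pairwise_distances(rows):
    """Map each index pair (i, j) with i < j to the distance between rows i and j."""
    return {
        (i, j): euclidean(rows[i], rows[j])
        for i, j in combinations(range(len(rows)), 2)
    }


# Hypothetical sample data: three 2-dimensional rows.
rows = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
dists = pairwise_distances(rows)
```

Only the pairs with i < j are kept, since the distance is symmetric and each row is at distance zero from itself.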

> On Jan 16, 2015, at 9:53 PM, Reza Zadeh <r...@databricks.com> wrote:
>
> You can use K-means with a suitably large k. Each cluster should correspond
> to rows that are similar to one another.
>
>> On Fri, Jan 16, 2015 at 5:18 PM, Andrew Musselman
>> <andrew.mussel...@gmail.com> wrote:
>> What's a good way to calculate similarities between all vector-rows in a
>> matrix or RDD[Vector]?
>>
>> I'm seeing RowMatrix has a columnSimilarities method but I'm not sure I'm
>> going down a good path to transpose a matrix in order to run that.