Thanks Reza, interesting approach. On second thought, I think what I 
actually want is to calculate pairwise distances. Is there a pattern for that?

> On Jan 16, 2015, at 9:53 PM, Reza Zadeh <r...@databricks.com> wrote:
> 
> You can use K-means with a suitably large k. Each cluster should correspond 
> to rows that are similar to one another.
> 
>> On Fri, Jan 16, 2015 at 5:18 PM, Andrew Musselman 
>> <andrew.mussel...@gmail.com> wrote:
>> What's a good way to calculate similarities between all vector-rows in a 
>> matrix or RDD[Vector]?
>> 
>> I'm seeing RowMatrix has a columnSimilarities method but I'm not sure I'm 
>> going down a good path to transpose a matrix in order to run that.
> 
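
For reference, here is a minimal sketch of the all-pairs computation being asked about, in plain Python rather than Spark. The helper names (euclidean, cosine, pairwise) and the sample rows are hypothetical and only illustrate the idea. On an RDD[Vector] the same pattern could presumably be expressed by pairing indexed rows (for example with RDD.cartesian), at a cost quadratic in the number of rows.

```python
import math
from itertools import combinations

def euclidean(a, b):
    # Straight-line distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    # Cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pairwise(rows, metric):
    # Apply `metric` to every unordered pair of rows, keyed by (i, j) with i < j.
    return {(i, j): metric(rows[i], rows[j])
            for i, j in combinations(range(len(rows)), 2)}

# Hypothetical sample data: three rows of a small matrix.
rows = [[1.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
distances = pairwise(rows, euclidean)
similarities = pairwise(rows, cosine)
```

Each result is keyed by a row-index pair, so distances[(0, 2)] is the distance between the first and third rows. Only the upper triangle is computed, since both metrics are symmetric.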
