Hi,

I have two DataFrames and I am cross joining them to compute the distance
between every pair of vectors, so that I can then pick the most similar vectors.

The cross join is too slow. How can I speed it up, or do you have any other
suggestion for this comparison?

 

 

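// Assuming the mllib vector types here.
import org.apache.spark.mllib.linalg.{SparseVector, Vectors}

// join with no condition, i.e. every row of myDict paired with every row of mainDataset
val result = myDict.join(mainDataset).map { x =>
  val orgClassName1 = x.getAs[SparseVector](1)
  val orgClassName2 = x.getAs[SparseVector](2)
  val f1 = x.getAs[SparseVector](3)
  val f2 = x.getAs[SparseVector](4)
  // squared Euclidean distance between the two feature vectors
  val dist = Vectors.sqdist(f1, f2)
  (orgClassName1, orgClassName2, dist)
}.toDF("orgClassName1", "orgClassName2", "dist")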
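
For reference, the comparison described above amounts to the following, shown as a minimal plain-Scala sketch without Spark. The object name NearestSketch, the helpers sqdist and nearest, and the sample data are all hypothetical and only illustrate the all-pairs distance computation followed by picking the closest match. It is not a faster replacement for the join.

```scala
// Plain-Scala sketch (no Spark). All names and sample data are hypothetical.
object NearestSketch {
  // Squared Euclidean distance between two dense vectors of equal length.
  def sqdist(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum
  }

  // For each (name, vector) in `left`, find the closest entry in `right`.
  // This is the same all-pairs work the cross join does:
  // left.size * right.size distance computations.
  def nearest(
      left: Seq[(String, Array[Double])],
      right: Seq[(String, Array[Double])]
  ): Seq[(String, String, Double)] = {
    require(right.nonEmpty, "right side must not be empty")
    left.map { case (name1, v1) =>
      val (name2, dist) =
        right.map { case (n, v) => (n, sqdist(v1, v)) }.minBy(_._2)
      (name1, name2, dist)
    }
  }

  def main(args: Array[String]): Unit = {
    val dict = Seq("a" -> Array(0.0, 0.0), "b" -> Array(1.0, 1.0))
    val data = Seq("x" -> Array(0.0, 1.0), "y" -> Array(1.0, 1.5))
    nearest(dict, data).foreach(println)
  }
}
```

The cost grows with the product of the two input sizes, which is why the join becomes slow as either side grows.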

 

 

 
