Re: KNN for large data set

2015-01-22 Thread DEVAN M.S.
Thanks, Xiangrui Meng, I will try this.

I also found https://github.com/kaushikranjan/knnJoin.
Will it work with double-valued data? Can we compute the z-value of
*Vector(10.3,4.5,3,5)*?
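
On the z-value question: a z-value (Morton code) is built by interleaving the bits of integer coordinates, so double data would first have to be mapped onto an integer grid. The sketch below illustrates that idea only. It is not taken from knnJoin, and the helper names `quantize` and `z_value`, the coordinate range [0, 16], and the 8 bits per dimension are all assumptions chosen for illustration.

```python
def quantize(x, lo, hi, bits):
    """Map a float in [lo, hi] to an integer cell in [0, 2**bits - 1].

    The range [lo, hi] is an assumption: it must be known (or estimated)
    for the data set before any z-value can be computed.
    """
    if not lo <= x <= hi:
        raise ValueError("value outside the assumed coordinate range")
    max_cell = (1 << bits) - 1
    return int((x - lo) / (hi - lo) * max_cell)


def z_value(coords, bits):
    """Interleave the low `bits` bits of each non-negative integer coordinate."""
    dims = len(coords)
    z = 0
    for bit in range(bits):
        for dim, c in enumerate(coords):
            # Bit `bit` of dimension `dim` lands at position bit * dims + dim.
            z |= ((c >> bit) & 1) << (bit * dims + dim)
    return z


# The vector from the question, with an assumed range of [0, 16] per dimension.
point = [10.3, 4.5, 3.0, 5.0]
grid = [quantize(x, 0.0, 16.0, 8) for x in point]
z = z_value(grid, 8)
print(grid, z)
```

Quantizing loses precision: two doubles that fall in the same grid cell get the same z-value, so the number of bits per dimension trades accuracy against key size.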






On Thu, Jan 22, 2015 at 12:25 AM, Xiangrui Meng men...@gmail.com wrote:

 For large datasets, you need hashing in order to compute k-nearest
 neighbors locally. You can start by searching Google Scholar for
 "LSH + k-nearest": http://scholar.google.com/scholar?q=lsh+k+nearest

 -Xiangrui
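
To make the hashing suggestion concrete, here is a minimal single-machine sketch of one LSH variant, random-hyperplane hashing: points that fall on the same side of every hyperplane share a bucket, and the exact k-nearest search is then run only inside the query's bucket. All function names here are hypothetical, and the choice of random hyperplanes is an assumption. This family hashes by angle rather than by Euclidean distance, so it should be read as an illustration of the idea, not as the method recommended in this thread.

```python
import math
import random
from collections import defaultdict


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals (one list of `dim` weights per plane)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def signature(point, planes):
    """Hash a point to a tuple of booleans: which side of each plane it is on."""
    return tuple(
        sum(p * w for p, w in zip(point, plane)) >= 0.0 for plane in planes
    )


def build_index(points, planes):
    """Group point indices by signature; each bucket can be searched locally."""
    buckets = defaultdict(list)
    for idx, point in enumerate(points):
        buckets[signature(point, planes)].append(idx)
    return buckets


def approx_knn(query, points, buckets, planes, k):
    """Exact k-nearest search restricted to the query's bucket (approximate overall)."""
    candidates = buckets.get(signature(query, planes), [])
    return sorted(candidates, key=lambda i: math.dist(query, points[i]))[:k]


points = [[1.0, 1.0], [1.1, 0.9], [-1.0, -1.0], [-1.2, -0.8]]
planes = make_hyperplanes(num_planes=4, dim=2)
index = build_index(points, planes)
print(approx_knn([1.0, 1.0], points, index, planes, k=2))
```

Because only one bucket is searched, true neighbors that hash to a different bucket are missed. Using several independent hash tables and merging their candidates is the usual way to reduce that risk.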

 On Tue, Jan 20, 2015 at 9:55 PM, DEVAN M.S. msdeva...@gmail.com wrote:
  Hi all,
 
  Please help me find the best way to compute k-nearest neighbors using
  Spark for large data sets.
 



Re: KNN for large data set

2015-01-22 Thread Sudipta Banerjee
Hi Devan and Xiangrui,

Can you please explain the cost and optimization function of the KNN
algorithm that is being used?

Thanks and regards,
Sudipta

-- 
Sudipta Banerjee
Consultant, Business Analytics and Cloud Based Architecture


KNN for large data set

2015-01-20 Thread DEVAN M.S.
Hi all,

Please help me find the best way to compute k-nearest neighbors using Spark
for large data sets.