Re: spark mllib kmeans

2015-05-11 Thread Driesprong, Fokko
Hi Paul, I would say that it should be possible, but you'll need a different distance measure which conforms to your coordinate system. 2015-05-11 14:59 GMT+02:00 Pa Rö paul.roewer1...@googlemail.com: hi, is it possible to use a custom distance measure and another data type as vector? i
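
As a rough illustration of a distance measure that conforms to a coordinate system, here is a minimal sketch of a haversine (great-circle) distance, assuming the points are latitude/longitude pairs in degrees; the thread does not say which coordinate system is meant. It is plain Python, not MLlib code, and the names haversine_km, closest_center and EARTH_RADIUS_KM are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, for illustration only


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def closest_center(point, centers, distance=haversine_km):
    """Index of the center nearest to `point` under the given distance (k-means assignment step)."""
    return min(range(len(centers)), key=lambda i: distance(point, centers[i]))
```

Only the assignment step is shown. With a non-Euclidean distance the arithmetic mean is not necessarily the right cluster center, so the update step would need separate thought.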

Re: spark 1.3.1

2015-05-04 Thread Driesprong, Fokko
Hi Saurabh, Did you check the Maven log? 2015-05-04 15:17 GMT+02:00 Saurabh Gupta saurabh.gu...@semusi.com: Hi, I am trying to build the example code given at https://spark.apache.org/docs/latest/sql-programming-guide.html#interoperating-with-rdds The code is: // Import factory methods

Re: MLLib SVM probability

2015-05-04 Thread Driesprong, Fokko
Hi Robert, I would say that the sign of the numbers represents the class of the input vector. What kind of data are you using, and what kind of training set do you use? Fundamentally, an SVM is able to separate only two classes; you can do one vs the rest, as you mentioned. I don't see how LVQ can
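
The two ideas in this reply can be sketched as follows: the sign of the raw score picks the side of the separating hyperplane, and one vs the rest picks the class whose binary model gives the largest score. This is plain Python under the assumption that the raw scores are already available; binary_class and one_vs_rest_class are hypothetical helper names, not MLlib functions.

```python
def binary_class(raw_score):
    """Map a raw SVM score to a 0/1 label: positive side of the hyperplane -> 1, otherwise 0."""
    return 1 if raw_score > 0 else 0


def one_vs_rest_class(scores_by_class):
    """Given {class_label: raw score of that class's one-vs-rest model}, pick the largest score."""
    return max(scores_by_class, key=scores_by_class.get)
```

For example, one_vs_rest_class({"a": -1.2, "b": 0.4, "c": 0.1}) selects "b". Note that these raw scores are margins, not probabilities.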

Re: Compute pairwise distance

2015-04-30 Thread Driesprong, Fokko
further improvement: 1. Create an RDD of your dataset 2. Do a cross join to generate pairs 3. Apply reduceByKey and compute distance. You will get an RDD with key pairs and distance Best Ayan On 30 Apr 2015 06:11, Driesprong, Fokko fo...@driesprong.frl wrote: Dear Sparkers, I am working
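
The three steps above can be sketched in plain Python, with a list standing in for the RDD and itertools.product standing in for the cross join (in Spark this would presumably be cartesian on the RDD). The function names and the choice of Euclidean distance are assumptions for illustration, not taken from the thread.

```python
from itertools import product
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def cross_join_distances(points, distance=euclidean):
    """Return {(i, j): distance} for every ordered pair of distinct points."""
    indexed = list(enumerate(points))   # step 1: the dataset, keyed by index
    pairs = product(indexed, indexed)   # step 2: cross join of the dataset with itself
    return {(i, j): distance(p, q)      # step 3: key pair -> distance
            for (i, p), (j, q) in pairs if i != j}
```

This produces every ordered pair, so each distance is computed twice when the measure is symmetric.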

Compute pairwise distance

2015-04-29 Thread Driesprong, Fokko
Dear Sparkers, I am working on an algorithm which requires the pairwise distance between all points (e.g. DBSCAN, LOF, etc.). Computing this for *n* points requires producing an n^2 matrix. If the distance measure is symmetrical, this can be reduced to roughly (n^2)/2 entries. What would be the most optimal way of
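
To make the saving concrete: for a symmetric measure only the pairs with i < j need to be stored, which is n(n-1)/2 entries rather than n^2. Below is a minimal single-machine sketch, assuming the points fit in memory; unique_pair_distances and lookup are hypothetical names, and nothing here is Spark-specific.

```python
from itertools import combinations
import math


def euclidean(p, q):
    """Euclidean distance between two equal-length coordinate tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def unique_pair_distances(points, distance=euclidean):
    """Compute the distance once per unordered pair, keyed by (i, j) with i < j."""
    return {(i, j): distance(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}


def lookup(dists, i, j):
    """Read a distance for any index order, relying on symmetry; the diagonal is 0."""
    if i == j:
        return 0.0
    return dists[(i, j) if i < j else (j, i)]
```

For 4 points this stores 6 distances instead of 16.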