Hi Robert, I would say the sign of the number represents the class of the input vector. What kind of data are you using, and what kind of training set do you have? Fundamentally an SVM can only separate two classes; for more classes you can do one-vs-rest as you mentioned.
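To make that concrete, here is a small pure-Python sketch (no Spark) using the decision values from your mail: sign(f) gives the class, and a logistic function sigmoid(a*f + b) fitted on (decision value, label) pairs turns f into a probability, which is the idea behind Platt scaling. Note this uses plain gradient descent just to keep the sketch short; Platt's actual procedure fits the two parameters with a damped Newton method on a held-out set.

```python
import math

# (decision value, label) pairs taken from the output in the mail below.
# Assumption: f is the raw SVM decision value (signed, unnormalised
# distance to the hyperplane) and the label is sign(f) mapped to {0, 1}.
pairs = [
    (-18.841544889249917, 0.0),
    (168.32916035523283, 1.0),
    (420.67763915879794, 1.0),
    (-974.1942589201286, 0.0),
    (71.73602841256813, 1.0),
    (233.13636224524993, 1.0),
    (-1000.5902168199027, 0.0),
]

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_platt(data, lr=1e-6, steps=2000):
    """Fit P(y=1 | f) = sigmoid(a*f + b) by minimising the log-loss
    with plain gradient descent (illustrative stand-in for Platt's
    Newton-based fit, which should be done on held-out data)."""
    a, b = 0.0, 0.0
    for _ in range(steps):
        ga = gb = 0.0
        for f, y in data:
            err = sigmoid(a * f + b) - y  # d(log-loss)/d(a*f + b)
            ga += err * f
            gb += err
        a -= lr * ga
        b -= lr * gb
    return a, b

a, b = fit_platt(pairs)
for f, y in pairs:
    # Calibrated probability of class 1; its sign-threshold at 0.5
    # agrees with classifying by sign(f).
    print(f, y, sigmoid(a * f + b))
```

Fitting the sigmoid on the same points it will score, as done here for brevity, overfits; Platt recommends a separate calibration set or cross-validation.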
I don't see how LVQ can benefit the SVM classifier. I would say this is more an SVM problem than a Spark one.

2015-05-04 15:22 GMT+02:00 Robert Musters <robert.must...@openindex.io>:

> Hi all,
>
> I am trying to understand the output of the SVM classifier.
>
> Right now, my output looks like this:
>
> -18.841544889249917 0.0
> 168.32916035523283 1.0
> 420.67763915879794 1.0
> -974.1942589201286 0.0
> 71.73602841256813 1.0
> 233.13636224524993 1.0
> -1000.5902168199027 0.0
>
> The documentation is unclear about what these numbers mean
> <https://spark.apache.org/docs/0.9.2/api/mllib/index.html#org.apache.spark.mllib.regression.LabeledPoint>.
> I think it is the signed distance to the hyperplane.
>
> My main question is: how can I convert distances from hyperplanes to
> probabilities in a multi-class one-vs-all approach?
>
> SVMLib <http://www.csie.ntu.edu.tw/~cjlin/libsvm/> has this functionality
> and refers to the process of getting the probabilities as "Platt scaling"
> <http://www.researchgate.net/profile/John_Platt/publication/2594015_Probabilistic_Outputs_for_Support_Vector_Machines_and_Comparisons_to_Regularized_Likelihood_Methods/links/004635154cff5262d6000000.pdf>.
>
> I think this functionality should be in MLlib, but I can't find it.
> Do you think Platt scaling makes sense?
>
> Making clusters using Learning Vector Quantization, determining the
> spread function of a cluster with a Gaussian function and then retrieving
> the probability makes a lot more sense i.m.o. Using the distances from the
> hyperplanes of several SVM classifiers and then trying to derive some
> probability from these distance measures does not make sense, because
> the distribution of the data points belonging to a cluster is not
> taken into account.
> Does anyone see a fallacy in my reasoning?
>
> With kind regards,
>
> Robert