Hi all,

I am trying to understand the output of the SVM classifier.

Right now, my output looks like this:

-18.841544889249917   0.0
168.32916035523283    1.0
420.67763915879794    1.0
-974.1942589201286    0.0
71.73602841256813     1.0
233.13636224524993    1.0
-1000.5902168199027   0.0



The documentation is unclear about what these numbers mean.

I think the first number is the signed distance to the separating hyperplane (the raw margin), and the second is the predicted class label.
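If that reading is right, the 0.0/1.0 in the second column can be reproduced just by thresholding the margin at zero. A minimal sketch (in Spark MLlib, calling `clearThreshold()` on the model makes `predict` return the raw margin instead of the label):

```python
def predict_label(score, threshold=0.0):
    """Map a raw SVM margin to a 0.0/1.0 class label:
    margins above the threshold -> 1.0, otherwise -> 0.0."""
    return 1.0 if score > threshold else 0.0

# Matches the output above: positive margins were labeled 1.0,
# negative margins 0.0.
print(predict_label(168.32916035523283))   # -> 1.0
print(predict_label(-18.841544889249917))  # -> 0.0
```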



My main question is: How can I convert distances from hyperplanes to 
probabilities in a multi-class one-vs-all approach?

LIBSVM <http://www.csie.ntu.edu.tw/~cjlin/libsvm/> has this functionality and 
refers to the process of obtaining the probabilities as “Platt scaling” 
<http://www.researchgate.net/profile/John_Platt/publication/2594015_Probabilistic_Outputs_for_Support_Vector_Machines_and_Comparisons_to_Regularized_Likelihood_Methods/links/004635154cff5262d6000000.pdf>
 .
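The idea behind Platt scaling is to fit a sigmoid that maps the raw margin f to P(y=1 | f) on held-out data. A simplified sketch (Platt's paper uses a Newton-style solver and smoothed target values; this version just does plain gradient descent on the negative log-likelihood, and all function names are my own):

```python
import math

def _sigmoid(z):
    # Clamp to avoid overflow in math.exp for very large margins
    # (the margins in my output range up to ~1000).
    z = max(min(z, 35.0), -35.0)
    return 1.0 / (1.0 + math.exp(-z))

def platt_fit(scores, labels, lr=0.1, iters=2000):
    """Fit P(y=1 | f) = sigmoid(a*f + b) to (margin, label) pairs
    by gradient descent on the negative log-likelihood."""
    a, b = 0.0, 0.0
    n = float(len(scores))
    for _ in range(iters):
        ga = gb = 0.0
        for f, y in zip(scores, labels):
            p = _sigmoid(a * f + b)
            ga += (p - y) * f
            gb += (p - y)
        a -= lr * ga / n
        b -= lr * gb / n
    return a, b

def platt_prob(a, b, score):
    """Calibrated probability for a new raw margin."""
    return _sigmoid(a * score + b)
```

For multi-class one-vs-all, one sigmoid would be fitted per binary classifier, and the resulting per-class probabilities normalized to sum to one.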

I think this functionality should be in MLlib, but I can't find it. Am I overlooking it?
Do you think Platt scaling makes sense?



In my opinion it makes more sense to form clusters with Learning Vector 
Quantization, model the spread of each cluster with a Gaussian, and then read 
the probability from that density. Taking the distances from the hyperplanes 
of several SVM classifiers and deriving some probability from those distance 
measures does not seem sound to me, because it ignores how the data points 
belonging to a cluster are actually distributed.
Does anyone see a fallacy in my reasoning?
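To make the alternative concrete, here is a one-dimensional sketch of the density-based idea: fit a Gaussian per cluster (mean and variance of its points) and turn the densities into probabilities with Bayes' rule. The names and the 1-D restriction are my own simplifications:

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def class_posteriors(x, class_stats, priors):
    """class_stats: {label: (mean, var)} fitted per cluster;
    priors: {label: prior probability}.
    Returns P(label | x) via Bayes' rule, assuming each cluster
    is Gaussian."""
    joint = {c: priors[c] * gaussian_pdf(x, m, v)
             for c, (m, v) in class_stats.items()}
    z = sum(joint.values())
    return {c: j / z for c, j in joint.items()}
```

Unlike a sigmoid over the margin, this assigns low probability to points far from every cluster, which is exactly the distributional information the hyperplane distance throws away.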



With kind regards,

Robert
