[ https://issues.apache.org/jira/browse/SPARK-20810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017273#comment-16017273 ]
Sean Owen commented on SPARK-20810:
-----------------------------------

Are you pretty sure both are converged? You set the same params, but do they have the same meaning in both implementations? I wonder if you can double-check the loss that both are computing, to see whether they even agree about how good a solution the other has found. I doubt the discontinuity of the hinge loss matters: it only affects the gradient when the loss is exactly 0, and defining the derivative there as 0 or 1 is valid and shouldn't matter much.

> ML LinearSVC vs MLlib SVMWithSGD output different solution
> ----------------------------------------------------------
>
>                 Key: SPARK-20810
>                 URL: https://issues.apache.org/jira/browse/SPARK-20810
>             Project: Spark
>          Issue Type: Question
>          Components: ML, MLlib
>    Affects Versions: 2.2.0
>            Reporter: Yanbo Liang
>
> Fitting an SVM classification model on the same dataset, ML {{LinearSVC}} produces a different solution than MLlib {{SVMWithSGD}}. I understand they use different optimization solvers (OWLQN vs SGD); does it make sense for them to converge to different solutions? Since we use {{sklearn.svm.LinearSVC}} and R e1071 SVM as the reference in {{LinearSVCSuite}}, it looks like {{SVMWithSGD}} produces the wrong solution. Is that the case?
> AFAIK, both of them use {{hinge loss}}, which is convex but not differentiable. Since the derivative of the hinge loss is not uniquely defined at the hinge point, should we switch to {{squared hinge loss}}, which is the default loss function of {{sklearn.svm.LinearSVC}} and more robust than {{hinge loss}}?
> This issue is very easy to reproduce: paste the following code snippet into {{LinearSVCSuite}} and run it in the IntelliJ IDE.
> {code}
> test("LinearSVC vs SVMWithSGD") {
>   import org.apache.spark.mllib.linalg.{Vectors => OldVectors}
>   import org.apache.spark.mllib.regression.{LabeledPoint => OldLabeledPoint}
>   val trainer1 = new LinearSVC()
>     .setRegParam(0.00002)
>     .setMaxIter(200)
>     .setTol(1e-4)
>   val model1 = trainer1.fit(binaryDataset)
>   println(model1.coefficients)
>   println(model1.intercept)
>   val oldData = binaryDataset.rdd.map { case Row(label: Double, features: Vector) =>
>     OldLabeledPoint(label, OldVectors.fromML(features))
>   }
>   val trainer2 = new SVMWithSGD().setIntercept(true)
>   trainer2.optimizer.setRegParam(0.00002).setNumIterations(200).setConvergenceTol(1e-4)
>   val model2 = trainer2.run(oldData)
>   println(model2.weights)
>   println(model2.intercept)
> }
> {code}
> The output is:
> {code}
> [7.24661385022775,14.774484832179743,22.00945617480461,29.558498069476084]
> 7.373454363024084
> [0.58166680313823,1.1938960150473041,1.7940106824589588,2.4884300611292165]
> 0.667790514894194
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
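To make the loss discussion concrete, here is a minimal NumPy sketch (not Spark code; the function names and the toy data are illustrative assumptions) of the hinge loss, a valid subgradient choice at the non-differentiable point, and the squared hinge loss that the issue proposes. One way to act on the suggestion above is to evaluate both reported weight vectors under the same loss function on the same data and compare the values directly.

```python
import numpy as np

def hinge_loss(w, b, X, y):
    # y in {-1, +1}; mean of max(0, 1 - y * (w.x + b)).
    margins = y * (X @ w + b)
    return np.maximum(0.0, 1.0 - margins).mean()

def hinge_subgradient_w(w, b, X, y):
    # Subgradient w.r.t. w: -y*x for points with margin < 1, else 0.
    # At margin == 1 (loss exactly 0) the loss is not differentiable;
    # both 0 and -y*x are valid subgradients there, so the choice
    # should not matter much -- the point made in the comment above.
    margins = y * (X @ w + b)
    active = (margins < 1.0).astype(float)
    return -(active * y) @ X / len(y)

def squared_hinge_loss(w, b, X, y):
    # max(0, 1 - y * (w.x + b))^2: convex and differentiable everywhere,
    # which is why it is the default loss of sklearn.svm.LinearSVC.
    margins = y * (X @ w + b)
    return (np.maximum(0.0, 1.0 - margins) ** 2).mean()

# Toy usage with made-up data: score two candidate solutions
# under the *same* hinge loss to see which fits better.
X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
w = np.array([2.0, 0.0])
print(hinge_loss(w, 0.0, X, y))          # 0.5 on this toy data
print(hinge_subgradient_w(w, 0.0, X, y)) # [0.  0.5]
```

With the actual `binaryDataset`, plugging in the two coefficient/intercept pairs printed by the reproduction snippet would show whether OWLQN and SGD merely took different paths or genuinely disagree on solution quality.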