[ https://issues.apache.org/jira/browse/FLINK-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609762#comment-14609762 ]
ASF GitHub Bot commented on FLINK-2297:
---------------------------------------

Github user tillrohrmann commented on a diff in the pull request:

    https://github.com/apache/flink/pull/874#discussion_r33659515

    --- Diff: flink-staging/flink-ml/src/test/scala/org/apache/flink/ml/classification/SVMITSuite.scala ---
    @@ -69,19 +70,38 @@ class SVMITSuite extends FlatSpec with Matchers with FlinkTestBase {
         svm.fit(trainingDS)
    -    val threshold = 0.0
    -
    -    val predictionPairs = svm.evaluate(test).map {
    -      truthPrediction =>
    -        val truth = truthPrediction._1
    -        val prediction = truthPrediction._2
    -        val thresholdedPrediction = if (prediction > threshold) 1.0 else -1.0
    -        (truth, thresholdedPrediction)
    -    }
    +    val predictionPairs = svm.evaluate(test)
         val absoluteErrorSum = predictionPairs.collect().map{
           case (truth, prediction) => Math.abs(truth - prediction)}.sum
         absoluteErrorSum should be < 15.0
       }
    +
    +  it should "be possible to get the raw decision function values" in {
    +    val env = ExecutionEnvironment.getExecutionEnvironment
    +
    +    val svm = SVM().
    +      setBlocks(env.getParallelism).
    +      setIterations(100).
    +      setLocalIterations(100).
    +      setRegularization(0.002).
    +      setStepsize(0.1).
    +      setSeed(0).
    +      clearThreshold()
    +
    +    val trainingDS = env.fromCollection(Classification.trainingData)
    +
    +    val test = trainingDS.map(x => x.vector)
    +
    +    svm.fit(trainingDS)
    +
    +    val predictions: DataSet[(FlinkVector, Double)] = svm.predict(test)
    +
    +    val preds = predictions.map(vectorLabel => vectorLabel._2).collect()
    +
    +    preds.max should be > 1.0
    --- End diff --

    hmm you could manually set the weight vector for which you are sure that something different than 1/-1 is calculated

> Add threshold setting for SVM binary predictions
> ------------------------------------------------
>
>                 Key: FLINK-2297
>                 URL: https://issues.apache.org/jira/browse/FLINK-2297
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Theodore Vasiloudis
>            Assignee: Theodore Vasiloudis
>            Priority: Minor
>              Labels: ML
>             Fix For: 0.10
>
>
> Currently SVM outputs the raw decision function values when using the predict
> function.
> We should have instead the ability to set a threshold above which examples
> are labeled as positive (1.0) and below negative (-1.0). Then the prediction
> function can be directly used for evaluation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
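
For illustration, the thresholding described in the issue could be sketched in plain Scala, with no Flink dependency. The names `ThresholdSketch` and `applyThreshold` are hypothetical and are not part of the Flink ML API. The strict `>` comparison mirrors the test code removed in the diff; mapping a value exactly equal to the threshold to -1.0 is an assumption taken from that code, not something the issue specifies.

```scala
// Hypothetical helper for illustration only; not part of Flink ML.
object ThresholdSketch {

  /** Maps a raw decision function value to a binary label:
    * 1.0 if the value is strictly above the threshold, -1.0 otherwise.
    */
  def applyThreshold(rawValue: Double, threshold: Double = 0.0): Double =
    if (rawValue > threshold) 1.0 else -1.0

  def main(args: Array[String]): Unit = {
    // Made-up raw decision values, standing in for SVM outputs.
    val rawValues = Seq(-1.7, -0.2, 0.4, 2.3)

    // Apply the default threshold of 0.0 to each raw value.
    val labels = rawValues.map(v => applyThreshold(v))

    println(labels.mkString(", "))
  }
}
```

With labels produced this way, predictions can be compared directly against truth values of 1.0 and -1.0, which is what the evaluation in the test relies on.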