[ https://issues.apache.org/jira/browse/FLINK-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14609840#comment-14609840 ]

ASF GitHub Bot commented on FLINK-2297:
---------------------------------------

Github user thvasilo commented on a diff in the pull request:

    https://github.com/apache/flink/pull/874#discussion_r33663520
  
    --- Diff: flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/classification/SVM.scala ---
    @@ -242,8 +275,21 @@ object SVM{
             }
           }
     
    -      override def predict(value: T, model: DenseVector): Double = {
    -        value.asBreeze dot model.asBreeze
    +      override def predict(value: T, model: DenseVector, predictParameters: ParameterMap):
    +        Double = {
    +        val thresholdOption = predictParameters.get(Threshold)
    +
    +        val rawValue = value.asBreeze dot model.asBreeze
    +        // If the Threshold option has been reset, we will get back a Some(None) thresholdOption
    +        // causing the exception when we try to get the value. In that case we just return the
    +        // raw value
    +        try {
    +          val thresOptionValue = thresholdOption.get
    +          if (rawValue > thresOptionValue) 1.0 else -1.0
    +        }
    +        catch {
    +          case e: java.lang.ClassCastException => rawValue
    +        }
    --- End diff --
    
    This relates to the previous discussion:
    
    I do believe we want this turned on by default: when you train a binary classifier, you expect `predict` to return binary labels, not the decision function values.
    
    So if we have `None` as the default, the user could write:
    
    ```scala
    val svm = SVM().
          setBlocks(env.getParallelism)
    
    svm.fit(train)
    val eval = svm.evaluate(test)
    ```
    
    and the `eval` output would not make sense, because `predict` would return raw decision values rather than labels. But if they wrote
    
    ```scala
    val svm = SVM().
          setBlocks(env.getParallelism).
          setThreshold(0.0)
    
    svm.fit(train)
    val eval = svm.evaluate(test)
    ```
    
    it would.
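    
    For illustration, the threshold handling could be expressed by pattern matching on an `Option[Double]` rather than by catching a `ClassCastException`. This is only a minimal standalone sketch: `ThresholdSketch` and its plain-array signature are hypothetical and are not the FlinkML API.
    
    ```scala
    // Hypothetical standalone sketch; names and signature are assumptions, not FlinkML code.
    object ThresholdSketch {
      // threshold = None means "return the raw decision function value".
      def predict(weights: Array[Double], features: Array[Double],
                  threshold: Option[Double]): Double = {
        require(weights.length == features.length, "dimension mismatch")
        // Raw decision function value: dot product of weights and features.
        val rawValue = weights.zip(features).map { case (w, x) => w * x }.sum
        threshold match {
          case Some(t) => if (rawValue > t) 1.0 else -1.0 // binary label
          case None    => rawValue                        // thresholding disabled
        }
      }
    }
    ```
    
    With a default of `Some(0.0)`, callers would get binary labels unless they explicitly disable the threshold.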


> Add threshold setting for SVM binary predictions
> ------------------------------------------------
>
>                 Key: FLINK-2297
>                 URL: https://issues.apache.org/jira/browse/FLINK-2297
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>            Reporter: Theodore Vasiloudis
>            Assignee: Theodore Vasiloudis
>            Priority: Minor
>              Labels: ML
>             Fix For: 0.10
>
>
> Currently, SVM outputs the raw decision function values when using the predict 
> function.
> Instead, we should have the ability to set a threshold above which examples 
> are labeled as positive (1.0) and below which they are labeled as negative 
> (-1.0). Then the prediction function can be used directly for evaluation.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)