[ 
https://issues.apache.org/jira/browse/FLINK-9664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16579589#comment-16579589
 ] 

ASF GitHub Bot commented on FLINK-9664:
---------------------------------------

azagrebin commented on a change in pull request #6425: [FLINK-9664][Doc] fixing documentation in ML quick start
URL: https://github.com/apache/flink/pull/6425#discussion_r209897278
 
 

 ##########
 File path: docs/dev/libs/ml/quickstart.md
 ##########
 @@ -146,7 +145,23 @@ create a classifier.
 
 ## Classification
 
-Once we have imported the dataset we can train a `Predictor` such as a linear SVM classifier.
+After importing the training and test dataset, they need to be prepared for the classification. 
+Because Flink SVM only supports threshold binary values of `+1.0` and `-1.0`, a conversion is 
+needed after loading the LibSVM dataset since it is labelled using `1`s and `0`s.
 
 Review comment:
   Just a wording thing. I would swap `Because` and `since` in this sentence:
   **Since** Flink SVM only supports threshold binary values of `+1.0` and `-1.0`, a conversion is 
   needed after loading the LibSVM dataset **because** it is labelled using `1`s and `0`s.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> FlinkML Quickstart Loading Data section example doesn't work as described
> -------------------------------------------------------------------------
>
>                 Key: FLINK-9664
>                 URL: https://issues.apache.org/jira/browse/FLINK-9664
>             Project: Flink
>          Issue Type: Bug
>          Components: Documentation, Machine Learning Library
>    Affects Versions: 1.5.0
>            Reporter: Mano Swerts
>            Assignee: Rong Rong
>            Priority: Major
>              Labels: documentation-update, machine_learning, ml, pull-request-available
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> The ML documentation example isn't complete: 
> [https://ci.apache.org/projects/flink/flink-docs-release-1.5/dev/libs/ml/quickstart.html#loading-data]
> The referenced section loads data from an astroparticle binary classification 
> dataset to showcase SVM. The dataset uses 0 and 1 as labels, which doesn't 
> produce correct results, because the SVM predictor expects -1 and 1 labels 
> to predict correctly. The documentation, however, doesn't mention that, so 
> the example fails with no indication of why.
> The documentation should be updated with an explicit mention of the -1 and 1 
> labels and a mapping function that shows the conversion of the labels.
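
The label conversion requested in the issue can be sketched independently of Flink. The following is a minimal illustration in plain Python, not the code from the pull request: `LabeledPoint`, `to_svm_label`, and `normalize_labels` are hypothetical names, and treating every label greater than 0 as the positive class is an assumption that fits a dataset labelled with `1`s and `0`s.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class LabeledPoint:
    """Hypothetical stand-in for a labelled feature vector (not a Flink class)."""
    label: float
    features: Sequence[float]


def to_svm_label(point: LabeledPoint) -> LabeledPoint:
    # Assumption: labels > 0 are the positive class, everything else is negative.
    # The features are passed through unchanged; only the label is rewritten.
    new_label = 1.0 if point.label > 0.0 else -1.0
    return LabeledPoint(new_label, point.features)


def normalize_labels(points: List[LabeledPoint]) -> List[LabeledPoint]:
    # Apply the conversion to every element, as a map over the dataset would.
    return [to_svm_label(p) for p in points]


# Two made-up samples using the 1/0 labelling described in the issue.
raw = [LabeledPoint(1.0, (0.5, 1.2)), LabeledPoint(0.0, (2.0, 0.1))]
converted = normalize_labels(raw)
print([p.label for p in converted])  # prints [1.0, -1.0]
```

In a Flink job the same per-element function would presumably be applied to both the training and the test dataset before fitting and evaluating the SVM, so that both use the `+1.0`/`-1.0` labels.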



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
