Ahmad Ragab created FLINK-4438:
----------------------------------

             Summary: FlinkML Quickstart Guide implies incorrect type for test data
                 Key: FLINK-4438
                 URL: https://issues.apache.org/jira/browse/FLINK-4438
             Project: Flink
          Issue Type: Bug
          Components: Documentation
    Affects Versions: 1.2.0
            Reporter: Ahmad Ragab
            Priority: Minor
             Fix For: 1.2.0

https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/ml/quickstart.html

The documentation under the *LibSVM* section says:
----
We can simply import the dataset then using:

{code:java}
import org.apache.flink.ml.MLUtils

val astroTrain: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1")
val astroTest: DataSet[LabeledVector] = MLUtils.readLibSVM("/path/to/svmguide1.t")
{code}

This gives us two {{DataSet\[LabeledVector\]}} objects that we will use in the following section to create a classifier.
----
Test data would generally not be of type {{LabeledVector}}. As in the other examples, it would be a {{DataSet\[Vector\]}}, since prediction is what should generate the labels. So after the file is read with {{MLUtils}}, the result should be mapped to a vector.

Also, the previous section, *Loading Data*, should include an example of using the {{Splitter}} to prepare the {{survivalLV}} data for use with a learner.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
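
As a minimal sketch of the mapping suggested in the description, the snippet below reads the test file and keeps only the feature vectors. It is an illustration rather than the actual documentation fix: the {{env}} argument passed to {{readLibSVM}} and the intermediate name {{astroTestLV}} are assumptions that do not appear in the quoted snippet, and the path is the same placeholder used there.

{code:java}
import org.apache.flink.api.scala._
import org.apache.flink.ml.MLUtils
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.Vector

// Assumption: readLibSVM expects the ExecutionEnvironment as its first argument.
val env = ExecutionEnvironment.getExecutionEnvironment

// Read the test file as labeled vectors (placeholder path from the quoted docs).
val astroTestLV: DataSet[LabeledVector] = MLUtils.readLibSVM(env, "/path/to/svmguide1.t")

// Drop the labels and keep only the features, so that prediction supplies the labels.
val astroTest: DataSet[Vector] = astroTestLV.map(lv => lv.vector)
{code}

Keeping {{astroTestLV}} around may still be useful where the true labels are needed to evaluate the predictions.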