[ https://issues.apache.org/jira/browse/FLINK-2984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chiwan Park closed FLINK-2984.
------------------------------
       Resolution: Implemented
    Fix Version/s: 1.0.0

Implemented via 615cf42b3d9404dca3884c5abc71650a6cf91d7b. Thanks for reporting this [~jkirsch]. :)

> Support lenient parsing of SVMLight input files
> -----------------------------------------------
>
>                 Key: FLINK-2984
>                 URL: https://issues.apache.org/jira/browse/FLINK-2984
>             Project: Flink
>          Issue Type: Improvement
>          Components: Machine Learning Library
>    Affects Versions: 0.9.1
>            Reporter: Johannes
>            Assignee: Chiwan Park
>            Priority: Trivial
>              Labels: easyfix
>             Fix For: 1.0.0
>
>
> The current implementation of the reader assumes that the input follows the
> exact specification.
> The [splice-site dataset|https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html#splice-site]
> is formatted slightly differently.
> Example:
> {noformat}
> -1  1:0.381846 2:0.163648 3:0.245472 4:0.627318
> {noformat}
> Note the two spaces after the label.
> Currently MLUtils.scala splits on single spaces, so the extra space is not handled.
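
The difference between strict and lenient splitting described in the issue can be illustrated with a minimal sketch. It is written in Python for brevity and is not the code from the referenced commit or from MLUtils.scala; the function name `parse_svmlight_line` is hypothetical, and the sketch assumes each line has the form `label index:value index:value ...` with no comments or extra fields.

```python
def parse_svmlight_line(line):
    """Parse one SVMLight-style line leniently (illustrative sketch only).

    Returns (label, [(index, value), ...]), or None for a blank line.
    """
    # str.split() with no argument splits on runs of whitespace and drops
    # leading/trailing whitespace, so repeated spaces yield no empty tokens.
    tokens = line.split()
    if not tokens:
        return None
    label = float(tokens[0])
    features = []
    for token in tokens[1:]:
        index, value = token.split(":", 1)
        features.append((int(index), float(value)))
    return label, features


sample = "-1  1:0.381846 2:0.163648 3:0.245472 4:0.627318"

# Splitting on a single space leaves an empty token after the label,
# which a strict parser would then fail to read as "index:value".
strict_tokens = sample.split(" ")

# The lenient parser treats the double space like a single separator.
parsed = parse_svmlight_line(sample)
```

Here `strict_tokens` contains an empty string in second position, while `parsed` holds the label and the four features. In Scala, a comparable effect would typically come from splitting on a whitespace regular expression rather than a single space character, though whether the actual fix does this is not stated in this message.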