[ https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-11920:
------------------------------------

    Assignee: Apache Spark

> ML LinearRegression should use correct dataset in examples and user guide doc
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-11920
>                 URL: https://issues.apache.org/jira/browse/SPARK-11920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, ML
>            Reporter: Yanbo Liang
>            Assignee: Apache Spark
>            Priority: Minor
>
> ML LinearRegression uses data/mllib/sample_libsvm_data.txt as the dataset in
> its examples and user guide doc, but that is actually a classification dataset
> rather than a regression dataset. We should use
> data/mllib/sample_linear_regression_data.txt instead.
> The deeper cause is that LinearRegression with the "normal" solver cannot solve
> this dataset correctly, possibly because the problem is ill-conditioned and the
> labels are unreasonable for regression. This issue has been reported as SPARK-11918.
> So we should make this change in the examples and user guides, so that they
> clearly illustrate the usage of the LinearRegression algorithm.
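
For illustration only, here is a minimal sketch of how one might check whether a LIBSVM-format file carries classification-style labels (a few distinct integers) rather than continuous regression targets. It is not part of the issue or of Spark. The helper names `parse_label` and `looks_like_classification`, the `max_classes` threshold, and the sample rows are all hypothetical; the rows are made up and are not copied from the files under data/mllib.

```python
def parse_label(line):
    """Return the label of a LIBSVM-format line ("label index:value ...")."""
    # The label is the first whitespace-separated token on the line.
    return float(line.split(None, 1)[0])


def looks_like_classification(lines, max_classes=10):
    """Heuristic (an assumption, not a Spark rule): every label is an
    integer and there are at most `max_classes` distinct label values."""
    labels = {parse_label(line) for line in lines if line.strip()}
    return all(v.is_integer() for v in labels) and len(labels) <= max_classes


# Hypothetical rows, invented for this sketch.
classification_rows = ["0 1:0.5 3:1.2", "1 2:0.7 3:0.1", "1 1:0.9"]
regression_rows = ["-9.49 1:0.45 2:0.36", "0.25 1:-0.11 2:0.83", "7.1 1:0.3"]

print(looks_like_classification(classification_rows))
print(looks_like_classification(regression_rows))
```

A check like this is only a heuristic: a regression target that happens to take a few integer values would also be flagged, so the result should be read as a hint rather than a verdict.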