[jira] [Assigned] (SPARK-11920) ML LinearRegression should use correct dataset in examples and user guide doc
[ https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-11920:

    Assignee:     (was: Apache Spark)

> ML LinearRegression should use correct dataset in examples and user guide doc
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-11920
>                 URL: https://issues.apache.org/jira/browse/SPARK-11920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, ML
>            Reporter: Yanbo Liang
>            Priority: Minor
>
> ML LinearRegression uses data/mllib/sample_libsvm_data.txt as the dataset in
> the examples and user guide doc, but it is actually a classification dataset
> rather than a regression dataset. We should use
> data/mllib/sample_linear_regression_data.txt instead.
> The deeper cause is that LinearRegression with the "normal" solver cannot solve
> this dataset correctly, possibly because the problem is ill-conditioned and the
> labels are unreasonable. This issue has been reported at SPARK-11918.
> So we should make this change in the examples and user guides, so that they
> clearly illustrate the usage of the LinearRegression algorithm.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-11920) ML LinearRegression should use correct dataset in examples and user guide doc
[ https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-11920:

    Assignee: Apache Spark

> ML LinearRegression should use correct dataset in examples and user guide doc
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-11920
>                 URL: https://issues.apache.org/jira/browse/SPARK-11920
>             Project: Spark
>          Issue Type: Improvement
>          Components: Documentation, ML
>            Reporter: Yanbo Liang
>            Assignee: Apache Spark
>            Priority: Minor
>
> ML LinearRegression uses data/mllib/sample_libsvm_data.txt as the dataset in
> the examples and user guide doc, but it is actually a classification dataset
> rather than a regression dataset. We should use
> data/mllib/sample_linear_regression_data.txt instead.
> The deeper cause is that LinearRegression with the "normal" solver cannot solve
> this dataset correctly, possibly because the problem is ill-conditioned and the
> labels are unreasonable. This issue has been reported at SPARK-11918.
> So we should make this change in the examples and user guides, so that they
> clearly illustrate the usage of the LinearRegression algorithm.
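The distinction the issue draws between a classification dataset and a regression dataset can be checked from the label column of a LIBSVM-format file. The following is a minimal sketch, not part of Spark: the helper names `parse_libsvm_labels` and `looks_like_classification`, the `max_classes` threshold, and the sample rows are all hypothetical and only illustrate the idea that a handful of integer-valued labels suggests class labels, while continuous labels suggest a regression target.

```python
def parse_libsvm_labels(lines):
    """Return the label (first token) of each non-empty LIBSVM-format line."""
    labels = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        labels.append(float(line.split()[0]))
    return labels


def looks_like_classification(labels, max_classes=10):
    """Heuristic (an assumption, not a Spark rule): few distinct,
    integer-valued labels suggest class labels rather than a regression target."""
    distinct = set(labels)
    return len(distinct) <= max_classes and all(v.is_integer() for v in distinct)


# Hypothetical rows in LIBSVM format ("label index:value ...");
# they are illustrative and not copied from the Spark sample files.
classification_rows = ["0 128:51 129:159", "1 159:124 160:253", "1 125:145 126:255"]
regression_rows = ["-9.49 1:0.45 2:0.36", "0.25 1:-0.16 2:0.93", "17.2 1:0.88 2:-0.23"]

print(looks_like_classification(parse_libsvm_labels(classification_rows)))
print(looks_like_classification(parse_libsvm_labels(regression_rows)))
```

To apply the same check to a real file, pass an open file handle (for example one opened on data/mllib/sample_linear_regression_data.txt) to `parse_libsvm_labels`; the heuristic is deliberately crude and only meant as a sanity check before choosing a dataset for a regression example.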