Yanbo Liang created SPARK-11920:
-----------------------------------

             Summary: ML LinearRegression should use correct dataset in examples and user guide doc
                 Key: SPARK-11920
                 URL: https://issues.apache.org/jira/browse/SPARK-11920
             Project: Spark
          Issue Type: Improvement
          Components: Documentation, ML
            Reporter: Yanbo Liang
            Priority: Minor

The ML LinearRegression examples and user guide use data/mllib/sample_libsvm_data.txt as their dataset, but that file is a classification dataset, not a regression dataset. We should use data/mllib/sample_linear_regression_data.txt instead.

Another reason is that LinearRegression with the "normal" solver cannot solve this dataset correctly, possibly because the problem is ill-conditioned and the labels are unreasonable for regression. This issue has been reported in SPARK-11918.

We should therefore make this change in the examples and the user guide, so that they clearly illustrate the usage of the LinearRegression algorithm.
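
A quick way to see whether a LIBSVM-format file is a classification or a regression dataset is to look at its labels. The sketch below is a hypothetical helper, not part of Spark: it parses lines of the form `label index:value ...` and applies a simple heuristic (few distinct, integer-valued labels suggest classification). The function names, the `max_classes` threshold, and the inline sample lines are all assumptions made up for illustration; they are not the contents of the files named above.

```python
def parse_label(line):
    """Return the label (the first whitespace-separated token) of a LIBSVM line."""
    return float(line.split()[0])


def distinct_labels(lines):
    """Collect the distinct labels, skipping blank lines."""
    return {parse_label(line) for line in lines if line.strip()}


def looks_like_classification(lines, max_classes=10):
    """Heuristic only (an assumption, not a Spark rule):
    a small set of integer-valued labels suggests a classification dataset."""
    labels = distinct_labels(lines)
    return len(labels) <= max_classes and all(v.is_integer() for v in labels)


# Made-up sample lines in LIBSVM format, for illustration only.
classification_like = [
    "0 128:51 129:159",
    "1 159:124 160:253",
    "1 125:145 126:255",
]
regression_like = [
    "-9.49 1:0.45 2:0.36",
    "0.25 1:-0.16 2:-0.94",
    "4.12 1:0.10 2:0.88",
]

print(looks_like_classification(classification_like))
print(looks_like_classification(regression_like))
```

To check a real file, the same functions can be applied to its lines, for example `looks_like_classification(open(path))`, where `path` points at a local copy of the dataset.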