[jira] [Assigned] (SPARK-11920) ML LinearRegression should use correct dataset in examples and user guide doc

2015-11-23 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-11920:


Assignee: (was: Apache Spark)

> ML LinearRegression should use correct dataset in examples and user guide doc
> -
>
> Key: SPARK-11920
> URL: https://issues.apache.org/jira/browse/SPARK-11920
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, ML
> Reporter: Yanbo Liang
> Priority: Minor
>
> ML LinearRegression uses data/mllib/sample_libsvm_data.txt as the dataset in 
> the examples and user guide doc, but it is actually a classification dataset 
> rather than a regression dataset. We should use 
> data/mllib/sample_linear_regression_data.txt instead.
> The deeper cause is that LinearRegression with the "normal" solver cannot 
> solve this dataset correctly, possibly because the data is ill-conditioned 
> and the labels are unreasonable. This issue has been reported in SPARK-11918.
> So we should make this change in the examples and user guide so that they 
> clearly illustrate the usage of the LinearRegression algorithm.
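
To illustrate the distinction drawn above between a classification dataset and a regression dataset, here is a minimal sketch in plain Python (no Spark required). It assumes the usual LIBSVM text layout of "label index:value ..." per line. The helper names, the max_classes threshold, and the sample rows are all hypothetical and made up for illustration; the rows are not copied from the Spark data files, and the heuristic is only a rough check, not how Spark itself decides anything.

```python
def parse_libsvm_line(line):
    """Parse one 'label index:value ...' line into (label, {index: value})."""
    parts = line.split()
    label = float(parts[0])
    features = {}
    for item in parts[1:]:
        index, value = item.split(":")
        features[int(index)] = float(value)
    return label, features


def looks_like_classification(lines, max_classes=10):
    """Heuristic (an assumption, not a Spark rule): labels look like class
    labels if they are all whole numbers and there are few distinct values."""
    labels = [parse_libsvm_line(l)[0] for l in lines if l.strip()]
    distinct = set(labels)
    return all(v.is_integer() for v in distinct) and len(distinct) <= max_classes


# Made-up rows for illustration only; NOT taken from the Spark data files.
classification_rows = [
    "0 1:51 2:159 3:253",
    "1 1:124 2:253 3:255",
    "1 2:145 3:255",
]
regression_rows = [
    "-9.49 1:0.45 2:0.36 3:-0.38",
    "0.25 1:-0.83 2:-0.12 3:0.52",
    "17.21 1:0.35 2:-0.99 3:0.10",
]

print(looks_like_classification(classification_rows))
print(looks_like_classification(regression_rows))
```

A check like this could be pointed at the lines of data/mllib/sample_libsvm_data.txt and data/mllib/sample_linear_regression_data.txt to see which one carries continuous targets, though the result depends on the threshold chosen.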



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-11920) ML LinearRegression should use correct dataset in examples and user guide doc

2015-11-23 Thread Apache Spark (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-11920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apache Spark reassigned SPARK-11920:


Assignee: Apache Spark

> ML LinearRegression should use correct dataset in examples and user guide doc
> -
>
> Key: SPARK-11920
> URL: https://issues.apache.org/jira/browse/SPARK-11920
> Project: Spark
> Issue Type: Improvement
> Components: Documentation, ML
> Reporter: Yanbo Liang
> Assignee: Apache Spark
> Priority: Minor


