[ https://issues.apache.org/jira/browse/SPARK-22433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16237884#comment-16237884 ]
Teng Peng commented on SPARK-22433:
-----------------------------------

Thanks for the quick response, Sean.

I am glad this issue is being discussed in the Spark community. I understand how important coherence is, and that it is the users' decision to do what they believe is appropriate.

I just want to propose a one-line change: change eval.setMetricName("r2") to "mse" in test("cross validation with linear regression"). Then we would not leave the impression of "Wait, what? Spark officially cross-validates on R2?"

> Linear regression R^2 train/test terminology related
> -----------------------------------------------------
>
>                 Key: SPARK-22433
>                 URL: https://issues.apache.org/jira/browse/SPARK-22433
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Teng Peng
>            Priority: Minor
>
> Traditional statistics is traditional statistics. Its goals, framework, and
> terminology are not the same as those of ML. However, in the linear regression
> related components, this distinction is not clear, which is reflected in:
> 1. regressionMetric + regressionEvaluator:
> * R2 shouldn't be there.
> * A better name would be "regressionPredictionMetric".
> 2. LinearRegressionSuite:
> * Shouldn't test R2 and residuals on test data.
> * There is no train set and test set in this setting.
> 3. Terminology: there is no "linear regression with L1 regularization".
> Linear regression is linear. Once a penalty term is added, it is no longer
> linear. Just call it "LASSO" or "ElasticNet".
> There are more. I am working on correcting them.
> They are not breaking anything, but it does not make one feel good to see the
> basic distinction blurred.
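
For reference, the one-line change proposed in the comment could look roughly like the sketch below. It shows only the evaluator configuration, not the actual body of test("cross validation with linear regression"), which is not reproduced in this thread. The object name MetricChangeSketch is hypothetical, and the snippet assumes Spark ML (spark-mllib) is on the classpath.

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator

// Hypothetical wrapper object, used only to make the sketch self-contained.
object MetricChangeSketch {
  def main(args: Array[String]): Unit = {
    // Before, as described in the comment: cross validation scored on R^2.
    val before = new RegressionEvaluator().setMetricName("r2")

    // After, as proposed: score on mean squared error instead.
    val eval = new RegressionEvaluator().setMetricName("mse")

    // CrossValidator consults isLargerBetter to decide whether to pick the
    // maximum or the minimum metric, so the switch also flips the direction
    // of model selection: R^2 is maximized, MSE is minimized.
    println(s"${before.getMetricName}: isLargerBetter = ${before.isLargerBetter}")
    println(s"${eval.getMetricName}: isLargerBetter = ${eval.isLargerBetter}")
  }
}
```

Because the evaluator itself reports whether larger values are better, nothing else in a CrossValidator setup should need to change for the metric swap, though this has not been checked against the test suite in question.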