[ https://issues.apache.org/jira/browse/SPARK-18693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15786013#comment-15786013 ]
Ilya Matiach commented on SPARK-18693:
--------------------------------------

Many classifiers in ML do not seem to support weight columns yet, so other JIRAs probably need to be created to add weight columns to them (e.g. DecisionTreeClassifier). Also, it does not look like any packages in MLlib contain weight columns, so I should probably try to limit the changes to ML only, but that is difficult because the ML evaluators are just wrappers around MLlib.

Also, please note that the pull request linked here has not been updated in a long time, and it only resolved the issue for RegressionMetrics in MLlib:
"SPARK-11520 RegressionMetrics should support instance weights"

I'm still planning out the changes that need to be made, since this one looks nontrivial. Any suggestions from Spark folks?

> BinaryClassificationEvaluator, RegressionEvaluator, and
> MulticlassClassificationEvaluator should use sample weight data
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-18693
>                 URL: https://issues.apache.org/jira/browse/SPARK-18693
>             Project: Spark
>          Issue Type: Bug
>          Components: ML
>    Affects Versions: 2.0.2
>            Reporter: Devesh Parekh
>
> The LogisticRegression and LinearRegression models support training with a
> weight column, but the corresponding evaluators do not support computing
> metrics using those weights. This breaks model selection using CrossValidator.
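
For illustration only, here is a minimal sketch of what "computing metrics using those weights" could mean for a regression metric. This is plain Python, not Spark code and not taken from the linked pull request. The function name `weighted_rmse` and the definition used (weighted squared error divided by the total weight) are assumptions chosen for the example.

```python
import math


def weighted_rmse(labels, predictions, weights=None):
    """Root mean squared error where each instance counts in proportion to its weight.

    Hypothetical helper for illustration; with weights=None every instance
    gets weight 1.0, which reduces to the ordinary unweighted RMSE.
    """
    if weights is None:
        weights = [1.0] * len(labels)
    if not (len(labels) == len(predictions) == len(weights)):
        raise ValueError("labels, predictions and weights must have the same length")

    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")

    # Weighted sum of squared errors, normalised by the total weight.
    weighted_sq_error = sum(
        w * (y - p) ** 2 for y, p, w in zip(labels, predictions, weights)
    )
    return math.sqrt(weighted_sq_error / total_weight)


labels = [1.0, 2.0]
predictions = [1.0, 4.0]
weights = [3.0, 1.0]  # the well-predicted instance carries most of the weight

unweighted = weighted_rmse(labels, predictions)
weighted = weighted_rmse(labels, predictions, weights)
```

With these inputs the two values differ, so an evaluator that ignores the weights can rank candidate models differently from the weighted objective they were trained on. That mismatch is the model selection problem the issue description refers to.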