[ https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15812352#comment-15812352 ]
Ilya Matiach commented on SPARK-19053:
--------------------------------------

I like your second api (setMetrics). That way the user can specify which metrics they would like to see. We might even add a special "all" constant which would give all metrics.

> Supporting multiple evaluation metrics in DataFrame-based API: discussion
> -------------------------------------------------------------------------
>
>                 Key: SPARK-19053
>                 URL: https://issues.apache.org/jira/browse/SPARK-19053
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML
>            Reporter: Joseph K. Bradley
>
> This JIRA is to discuss supporting the computation of multiple evaluation
> metrics efficiently in the DataFrame-based API for MLlib.
>
> In the RDD-based API, RegressionMetrics and other *Metrics classes support
> efficient computation of multiple metrics.
>
> In the DataFrame-based API, there are a few options:
> * model/result summaries (e.g., LogisticRegressionSummary): These currently
>   provide the desired functionality, but they require a model and do not let
>   users compute metrics manually from DataFrames of predictions and true labels.
> * Evaluator classes (e.g., RegressionEvaluator): These only support computing
>   a single metric in one pass over the data, but they do not require a model.
> * new class analogous to Metrics: We could introduce a class analogous to
>   Metrics. Model/result summaries could use this internally as a replacement
>   for spark.mllib Metrics classes, or they could (maybe) inherit from these
>   classes.
>
> Thoughts?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
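
To make the setMetrics idea from the comment concrete, here is a minimal sketch in plain Python, not Spark code. Every name in it (MultiMetricRegressionEvaluator, set_metrics, ALL, the metric keys) is hypothetical and chosen only for illustration; the thread does not define an actual signature. It assumes the input is an iterable of (prediction, label) pairs and shows how several regression metrics could be derived from running sums collected in a single pass, with "all" selecting every supported metric.

```python
import math
from typing import Dict, Iterable, List, Tuple

# Hypothetical names throughout: this is an illustration, not Spark's API.
ALL = "all"
SUPPORTED = ("mse", "rmse", "mae", "r2")


class MultiMetricRegressionEvaluator:
    """Computes the requested regression metrics in one pass over the data."""

    def __init__(self) -> None:
        self._metrics: List[str] = ["rmse"]

    def set_metrics(self, metrics: List[str]) -> "MultiMetricRegressionEvaluator":
        # The special "all" constant expands to every supported metric.
        if ALL in metrics:
            self._metrics = list(SUPPORTED)
            return self
        unknown = [m for m in metrics if m not in SUPPORTED]
        if unknown:
            raise ValueError(f"unsupported metrics: {unknown}")
        self._metrics = list(metrics)
        return self

    def evaluate(self, rows: Iterable[Tuple[float, float]]) -> Dict[str, float]:
        # Accumulate sufficient statistics in a single pass.
        n = 0
        sum_sq_err = sum_abs_err = sum_label = sum_label_sq = 0.0
        for prediction, label in rows:
            err = prediction - label
            n += 1
            sum_sq_err += err * err
            sum_abs_err += abs(err)
            sum_label += label
            sum_label_sq += label * label
        if n == 0:
            raise ValueError("no rows to evaluate")

        mse = sum_sq_err / n
        # Total sum of squares of the labels around their mean.
        ss_tot = sum_label_sq - (sum_label * sum_label) / n
        values = {
            "mse": mse,
            "rmse": math.sqrt(mse),
            "mae": sum_abs_err / n,
            "r2": 1.0 - sum_sq_err / ss_tot if ss_tot > 0 else float("nan"),
        }
        # Return only what the user asked for, in the order requested.
        return {m: values[m] for m in self._metrics}


if __name__ == "__main__":
    rows = [(2.5, 3.0), (0.0, -0.5), (2.0, 2.0), (8.0, 7.0)]
    evaluator = MultiMetricRegressionEvaluator().set_metrics([ALL])
    print(evaluator.evaluate(rows))
```

In a DataFrame-based version the loop would presumably become a single aggregation over the prediction and label columns; the point of the sketch is only that the user picks the metrics up front and pays for one pass regardless of how many are requested.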