[ 
https://issues.apache.org/jira/browse/SPARK-19053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15828755#comment-15828755
 ] 

Joseph K. Bradley commented on SPARK-19053:
-------------------------------------------

After thinking about this more and hearing your thoughts, my top pick would be 
to define Metrics classes similar to the spark.mllib ones, with the current 
model summaries inheriting from these metrics classes.  (We should sketch out 
the APIs before implementation, though, to make sure this setup will not break 
existing APIs or force awkward ones.)

Metrics classes could offer both aggregate and per-row metrics.
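
To make the proposed shape concrete, here is a rough, non-authoritative sketch 
in plain Python (no Spark dependency). Every name in it 
(RegressionMetricsSketch, LinearRegressionSummarySketch, residuals) is 
hypothetical and is not an existing or proposed Spark API. It only illustrates 
a metrics class that is built from (prediction, label) pairs without a model, 
computes several aggregate metrics in one pass, exposes a per-row metric, and 
is inherited by a model summary.

```python
import math
from typing import Iterable, List, Tuple


class RegressionMetricsSketch:
    """Hypothetical metrics class built from (prediction, label) pairs; no model needed."""

    def __init__(self, pairs: Iterable[Tuple[float, float]]):
        self._pairs: List[Tuple[float, float]] = list(pairs)
        if not self._pairs:
            raise ValueError("need at least one (prediction, label) pair")
        # One pass accumulates the sufficient statistics for all aggregate metrics.
        n = 0
        sum_abs = sum_sq = sum_label = sum_label_sq = 0.0
        for pred, label in self._pairs:
            err = label - pred
            n += 1
            sum_abs += abs(err)
            sum_sq += err * err
            sum_label += label
            sum_label_sq += label * label
        self.count = n
        self.mean_absolute_error = sum_abs / n
        self.mean_squared_error = sum_sq / n
        self.root_mean_squared_error = math.sqrt(self.mean_squared_error)
        # Total sum of squares of the labels around their mean.
        ss_tot = sum_label_sq - sum_label * sum_label / n
        self.r2 = 1.0 - sum_sq / ss_tot if ss_tot > 0 else float("nan")

    def residuals(self) -> List[float]:
        """Per-row metric: label minus prediction for each row."""
        return [label - pred for pred, label in self._pairs]


class LinearRegressionSummarySketch(RegressionMetricsSketch):
    """Hypothetical model summary: inherits the metrics, adds model-specific fields."""

    def __init__(self, pairs: Iterable[Tuple[float, float]], coefficients: List[float]):
        super().__init__(pairs)
        self.coefficients = coefficients
```

In this sketch a user with only predictions and labels can construct 
RegressionMetricsSketch directly, while a model summary gets the same metrics 
by inheritance. Whether inheritance or internal delegation fits the real 
summaries better is exactly the API question that needs sketching first.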

> Supporting multiple evaluation metrics in DataFrame-based API: discussion
> -------------------------------------------------------------------------
>
>                 Key: SPARK-19053
>                 URL: https://issues.apache.org/jira/browse/SPARK-19053
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML
>            Reporter: Joseph K. Bradley
>
> This JIRA is to discuss supporting the computation of multiple evaluation 
> metrics efficiently in the DataFrame-based API for MLlib.
> In the RDD-based API, RegressionMetrics and other *Metrics classes support 
> efficient computation of multiple metrics.
> In the DataFrame-based API, there are a few options:
> * model/result summaries (e.g., LogisticRegressionSummary): These currently 
> provide the desired functionality, but they require a model and do not let 
> users compute metrics manually from DataFrames of predictions and true labels.
> * Evaluator classes (e.g., RegressionEvaluator): These only support computing 
> a single metric in one pass over the data, but they do not require a model.
> * new class analogous to Metrics: We could introduce a class analogous to 
> Metrics.  Model/result summaries could use this internally as a replacement 
> for spark.mllib Metrics classes, or they could (maybe) inherit from these 
> classes.
> Thoughts?


