[ https://issues.apache.org/jira/browse/SPARK-14604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph K. Bradley reassigned SPARK-14604:
-----------------------------------------

    Assignee:     (was: Joseph K. Bradley)

> Modify design of ML model summaries
> -----------------------------------
>
>                 Key: SPARK-14604
>                 URL: https://issues.apache.org/jira/browse/SPARK-14604
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Priority: Major
>
> Several spark.ml models now have summaries containing evaluation metrics and
> training info:
> * LinearRegressionModel
> * LogisticRegressionModel
> * GeneralizedLinearRegressionModel
> These summaries have unfortunately been added in an inconsistent way. I
> propose to reorganize them to have:
> * For each model, 1 summary (without training info) and 1 training summary
> (with info from training). The non-training summary can be produced for a
> new dataset via {{evaluate}}.
> * A summary should not store the model itself as a public field.
> * A summary should provide a transient reference to the dataset used to
> produce the summary.
> This task will involve reorganizing the GLM summary (which lacks a
> training/non-training distinction) and deprecating the model method in the
> LinearRegressionSummary.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
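
For illustration only, the proposed layout could look roughly like the sketch below. It is plain Python rather than Spark code, and every name in it (Summary, TrainingSummary, ToyLinearModel, objective_history, mean_squared_error) is hypothetical, not the spark.ml API. It assumes that "transient" means the dataset reference is dropped on serialization, which is emulated here with __getstate__.

```python
class Summary:
    """Metrics for a model on some dataset, without training info."""

    def __init__(self, dataset, predictions):
        # Hypothetical layout: the summary receives predictions, not the
        # model, so the model is never stored as a public field.
        self._dataset = dataset
        errors = [(p - label) ** 2 for p, (_, label) in zip(predictions, dataset)]
        self.mean_squared_error = sum(errors) / len(errors)

    @property
    def dataset(self):
        # Reference to the data the summary was computed on.
        return self._dataset

    def __getstate__(self):
        # Emulates a transient field: the dataset is dropped whenever the
        # summary is serialized (pickle calls this hook).
        state = dict(self.__dict__)
        state["_dataset"] = None
        return state


class TrainingSummary(Summary):
    """A summary plus information that only exists after training."""

    def __init__(self, dataset, predictions, objective_history):
        super().__init__(dataset, predictions)
        self.objective_history = list(objective_history)


class ToyLinearModel:
    """Stand-in for a fitted model; rows are (feature, label) pairs."""

    def __init__(self, slope, intercept):
        self.slope = slope
        self.intercept = intercept
        self.training_summary = None

    def predict(self, x):
        return self.slope * x + self.intercept

    def evaluate(self, dataset):
        # Produces the non-training summary for a new dataset.
        return Summary(dataset, [self.predict(x) for x, _ in dataset])


train = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
model = ToyLinearModel(slope=2.0, intercept=1.0)
model.training_summary = TrainingSummary(
    train, [model.predict(x) for x, _ in train], objective_history=[4.0, 1.0, 0.0]
)
test_summary = model.evaluate([(3.0, 7.0), (4.0, 10.0)])
```

Here the model keeps one training summary, evaluate returns a plain Summary for any new dataset, and neither summary class exposes the model.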