[ https://issues.apache.org/jira/browse/SPARK-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xusen Yin updated SPARK-6725:
-----------------------------
    Comment: was deleted

(was: Unlike other models in the ML package, which extend XXXParams along with their estimators, MultilayerPerceptronClassificationModel does not extend MultilayerPerceptronParams. As a result, we cannot test all parameters in the test suite, because MultilayerPerceptronClassificationModel complains that it cannot find parameters such as maxIter and layers. This is reasonable, since users do not need to set those parameters on an MLPC model, but the inconsistency with other models makes its test suite different. Shall we split testEstimatorAndModelReadWrite into two parts in the test suite, i.e. test the estimator and the model separately?)

> Model export/import for Pipeline API
> ------------------------------------
>
>                 Key: SPARK-6725
>                 URL: https://issues.apache.org/jira/browse/SPARK-6725
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>
> This is an umbrella JIRA for adding model export/import to the spark.ml API.
> This JIRA is for adding the internal Saveable/Loadable API and Parquet-based
> format, not for other formats like PMML.
> This will require the following steps:
> * Add export/import for all PipelineStages supported by spark.ml
> ** This will include some Transformers which are not Models.
> ** These can use almost the same format as the spark.mllib model save/load
> functions, but the model metadata must store a different class name (marking
> the class as a spark.ml class).
> * After all PipelineStages support save/load, add an interface which forces
> future additions to support save/load.
> *UPDATE*: In spark.ml, we could save feature metadata using DataFrames.
> Other libraries and formats can support this, and it would be great if we
> could too.
> We could do either of the following:
> * save() optionally takes a dataset (or schema), and load will return a
> (model, schema) pair.
> * Models themselves save the input schema.
> Both options would mean inheriting from new Saveable, Loadable types.
> *UPDATE: DESIGN DOC*: Here's a design doc which I wrote. If you have
> comments about the planned implementation, please comment in this JIRA.
> Thanks!
> [https://docs.google.com/document/d/1RleM4QiKwdfZZHf0_G6FBNaF7_koc1Ui7qfMT1pf4IA/edit?usp=sharing]
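
For readers weighing the first option above (save() optionally takes a schema, and load returns a (model, schema) pair), here is a minimal sketch in plain Python. It is not the spark.ml API and not the design in the linked doc: the names ToyScalerModel, register, save_model, load_model and the class name "example.ml.ToyScalerModel" are all hypothetical, and a single JSON metadata file stands in for the Parquet-based format the issue describes. It only illustrates storing a class name in the model metadata and round-tripping an optional schema.

```python
import json
import os
import tempfile

# Hypothetical registry: class name stored in metadata -> Python class.
_REGISTRY = {}


def register(class_name):
    """Class decorator recording the name that gets written into the metadata."""
    def wrap(cls):
        cls.class_name = class_name
        _REGISTRY[class_name] = cls
        return cls
    return wrap


@register("example.ml.ToyScalerModel")  # made-up name, not a Spark class
class ToyScalerModel:
    def __init__(self, factor):
        self.factor = factor

    def params(self):
        return {"factor": self.factor}

    @classmethod
    def from_params(cls, params):
        return cls(params["factor"])


def save_model(model, path, schema=None):
    """Write class name, params, and an optional input schema to metadata.json."""
    os.makedirs(path, exist_ok=True)
    metadata = {
        "class": model.class_name,
        "params": model.params(),
        "schema": schema,  # stays None when the caller passes no schema
    }
    with open(os.path.join(path, "metadata.json"), "w") as f:
        json.dump(metadata, f)


def load_model(path):
    """Return a (model, schema) pair; schema is None if none was saved."""
    with open(os.path.join(path, "metadata.json")) as f:
        metadata = json.load(f)
    cls = _REGISTRY[metadata["class"]]
    return cls.from_params(metadata["params"]), metadata["schema"]


if __name__ == "__main__":
    schema = [{"name": "features", "type": "double"}]
    with tempfile.TemporaryDirectory() as d:
        save_model(ToyScalerModel(2.5), d, schema=schema)
        model, loaded_schema = load_model(d)
        print(type(model).__name__, model.factor, loaded_schema)
```

Under the second option, the schema would presumably live on the model object itself and be written by the model's own save logic, so load could return the model alone.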