[jira] [Issue Comment Deleted] (SPARK-6725) Model export/import for Pipeline API

Xusen Yin (JIRA) Sat, 21 Nov 2015 01:30:55 -0800

     [ 
https://issues.apache.org/jira/browse/SPARK-6725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Xusen Yin updated SPARK-6725:
-----------------------------
    Comment: was deleted

(was: Unlike other models in the ML package, which extend the same XXXParams 
as their estimators, the MultilayerPerceptronClassificationModel does not 
extend MultilayerPerceptronParams. So we cannot test all parameters in the 
test suite, because the MultilayerPerceptronClassificationModel complains about 
not finding parameters such as maxIter and layers.

However, this is reasonable, because users do not need to set those parameters 
on an MLPC model. But the inconsistency with other models means the test suite 
has to treat this model differently. Shall we split 
testEstimatorAndModelReadWrite into two parts in the test suite, i.e. test the 
estimator and the model separately?)
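
The proposed split can be illustrated with a small, Spark-free sketch. Everything 
below is hypothetical: ToyEstimator, ToyModel, check_estimator_params and 
check_model_data are stand-ins invented for illustration and are not the actual 
spark.ml classes or test helpers. The idea is only that the estimator check 
compares parameters, while the model check compares what the model itself stores.

```python
# Toy stand-ins for illustration only; none of these are Spark APIs.
class Params:
    """Holds named parameters in a plain dict."""

    def __init__(self, **params):
        self._params = dict(params)

    def has_param(self, name):
        return name in self._params

    def get(self, name):
        return self._params[name]


class ToyEstimator(Params):
    """Estimator carrying training parameters such as maxIter and layers."""


class ToyModel:
    """Model that exposes no training parameters, only fitted data."""

    def __init__(self, weights):
        self.weights = list(weights)


def check_estimator_params(original, reloaded, names):
    """Part 1: compare every named parameter between two estimators."""
    for name in names:
        assert reloaded.has_param(name), f"missing param: {name}"
        assert original.get(name) == reloaded.get(name)


def check_model_data(original, reloaded):
    """Part 2: compare only what the model stores, not estimator params."""
    assert original.weights == reloaded.weights


# Usage: each half runs on its own, so a model without params is not a problem.
est = ToyEstimator(maxIter=10, layers=[4, 5, 3])
est_copy = ToyEstimator(maxIter=10, layers=[4, 5, 3])
check_estimator_params(est, est_copy, ["maxIter", "layers"])
check_model_data(ToyModel([0.1, 0.2]), ToyModel([0.1, 0.2]))
```

With a single combined check, the parameter loop would run against the model as 
well and fail on maxIter and layers; keeping the two checks separate avoids that.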

> Model export/import for Pipeline API
> ------------------------------------
>
>                 Key: SPARK-6725
>                 URL: https://issues.apache.org/jira/browse/SPARK-6725
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>
> This is an umbrella JIRA for adding model export/import to the spark.ml API.  
> This JIRA is for adding the internal Saveable/Loadable API and Parquet-based 
> format, not for other formats like PMML.
> This will require the following steps:
> * Add export/import for all PipelineStages supported by spark.ml
> ** This will include some Transformers which are not Models.
> ** These can use almost the same format as the spark.mllib model save/load 
> functions, but the model metadata must store a different class name (marking 
> the class as a spark.ml class).
> * After all PipelineStages support save/load, add an interface which forces 
> future additions to support save/load.
> *UPDATE*: In spark.ml, we could save feature metadata using DataFrames.  
> Other libraries and formats can support this, and it would be great if we 
> could too.  We could do either of the following:
> * save() optionally takes a dataset (or schema), and load will return a 
> (model, schema) pair.
> * Models themselves save the input schema.
> Both options would mean inheriting from new Saveable, Loadable types.
> *UPDATE: DESIGN DOC*: Here's a design doc which I wrote.  If you have 
> comments about the planned implementation, please comment in this JIRA.  
> Thanks!  
> [https://docs.google.com/document/d/1RleM4QiKwdfZZHf0_G6FBNaF7_koc1Ui7qfMT1pf4IA/edit?usp=sharing]
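
As a rough illustration of the point in the description that model metadata must 
store a class name, here is a minimal sketch under stated assumptions. It uses 
JSON files rather than the Parquet-based format the issue describes, and every 
name in it (REGISTRY, register, ToyScalerModel, load, the metadata.json and 
data.json file names) is hypothetical and not part of Spark.

```python
import json
import os
import tempfile

# Hypothetical registry mapping a stored class name to the class that loads it.
REGISTRY = {}


def register(cls):
    """Record the class under the name it writes into its metadata."""
    REGISTRY[cls.class_name] = cls
    return cls


@register
class ToyScalerModel:
    # Assumed naming scheme; a real implementation would store its own class name.
    class_name = "toy.ml.ToyScalerModel"

    def __init__(self, scale):
        self.scale = scale

    def save(self, path):
        """Write metadata (including the class name) and model data separately."""
        os.makedirs(path, exist_ok=True)
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump({"class": self.class_name}, f)
        with open(os.path.join(path, "data.json"), "w") as f:
            json.dump({"scale": self.scale}, f)

    @classmethod
    def from_data(cls, data):
        return cls(data["scale"])


def load(path):
    """Read the class name from the metadata, then let that class rebuild itself."""
    with open(os.path.join(path, "metadata.json")) as f:
        meta = json.load(f)
    cls = REGISTRY[meta["class"]]
    with open(os.path.join(path, "data.json")) as f:
        return cls.from_data(json.load(f))


# Usage: round-trip a model through a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    model_dir = os.path.join(tmp, "model")
    ToyScalerModel(2.5).save(model_dir)
    restored = load(model_dir)
```

Because the loader dispatches on the stored class name, two classes that share a 
data layout can still be told apart, which is the role the description assigns to 
the metadata.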



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
