[ https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634679#comment-16634679 ]
John Bauer commented on SPARK-21542:
------------------------------------

The above is not as minimal as I would have liked. It is based on the unit tests associated with the fix referenced for DefaultParamsReadable and DefaultParamsWritable, which I had expected to test the desired behavior, i.e. saving and loading a pipeline after calling fit(). Unfortunately that case was not tested, so I experimented with the code for a while until I got something that worked. A lot of the leftover unit-test setup could probably be removed, but at least this seems to work.

> Helper functions for custom Python Persistence
> ----------------------------------------------
>
>                 Key: SPARK-21542
>                 URL: https://issues.apache.org/jira/browse/SPARK-21542
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, PySpark
>    Affects Versions: 2.2.0
>            Reporter: Ajay Saini
>            Assignee: Ajay Saini
>            Priority: Major
>             Fix For: 2.3.0
>
> Currently, there is no way to easily persist JSON-serializable parameters in
> Python only. All parameters in Python are persisted by converting them to
> Java objects and using the Java persistence implementation. In order to
> facilitate the creation of custom Python-only pipeline stages, it would be
> good to have a Python-only persistence framework so that these stages do not
> need to be implemented in Scala for persistence.
> This task involves:
> - Adding implementations for DefaultParamsReadable, DefaultParamsWritable,
>   DefaultParamsReader, and DefaultParamsWriter in pyspark.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org