[jira] [Comment Edited] (SPARK-21542) Helper functions for custom Python Persistence

2018-11-09 Thread John Bauer (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681895#comment-16681895
 ] 

John Bauer edited comment on SPARK-21542 at 11/9/18 8:07 PM:
-

Compared to the previous one, the above example is a) much more minimal, b) 
genuinely useful, and c) actually works with save and load. For example:
{code:python}
impute.write().save("impute")
imp = ImputeNormal.load("impute")
imp.explainParams()
impute_model.write().save("impute_model")
impm = ImputeNormalModel.load("impute_model")
impm.explainParams(){code}


was (Author: johnhbauer):
This is a) much more minimal, b) genuinely useful, and c) actually works with 
save and load, for example:
{code:java}
impute.write().save("impute")
imp = ImputeNormal.load("impute")
imp.explainParams()
impute_model.write().save("impute_model")
impm = ImputeNormalModel.load("impute_model")
impm.explainParams(){code}

> Helper functions for custom Python Persistence
> --
>
> Key: SPARK-21542
> URL: https://issues.apache.org/jira/browse/SPARK-21542
> Project: Spark
> Issue Type: New Feature
> Components: ML, PySpark
> Affects Versions: 2.2.0
> Reporter: Ajay Saini
> Assignee: Ajay Saini
> Priority: Major
> Fix For: 2.3.0
>
>
> Currently, there is no easy way to persist JSON-serializable parameters in
> Python only. All parameters in Python are persisted by converting them to
> Java objects and using the Java persistence implementation. To
> facilitate the creation of custom Python-only pipeline stages, it would be
> good to have a Python-only persistence framework so that these stages do not
> need to be implemented in Scala for persistence.
> This task involves:
> - Adding implementations for DefaultParamsReadable, DefaultParamsWritable,
> DefaultParamsReader, and DefaultParamsWriter in pyspark.
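
To make the idea in the description concrete, here is a minimal pure-Python sketch of persisting JSON-serializable parameters without going through Java objects. It is not PySpark's implementation: the mixin `JsonParamsPersistence`, the class `ImputeNormalSketch`, the `metadata.json` file name, and the helper `round_trip` are all hypothetical names chosen for illustration. In PySpark itself the corresponding mixins are `DefaultParamsReadable` and `DefaultParamsWritable` in `pyspark.ml.util`.

```python
import json
import os
import tempfile


class JsonParamsPersistence:
    """Hypothetical mixin: save and load an object's JSON-serializable params."""

    def save(self, path):
        # Create the target directory; fail if it already exists.
        os.makedirs(path, exist_ok=False)
        metadata = {"class": type(self).__name__, "paramMap": self.params}
        with open(os.path.join(path, "metadata.json"), "w") as f:
            json.dump(metadata, f)

    @classmethod
    def load(cls, path):
        with open(os.path.join(path, "metadata.json")) as f:
            metadata = json.load(f)
        # Guard against loading metadata written by a different class.
        if metadata["class"] != cls.__name__:
            raise ValueError(
                "expected %s, found %s" % (cls.__name__, metadata["class"])
            )
        return cls(**metadata["paramMap"])


class ImputeNormalSketch(JsonParamsPersistence):
    """Hypothetical stand-in for a Python-only stage (not the ImputeNormal above)."""

    def __init__(self, inputCol=None, outputCol=None):
        self.params = {"inputCol": inputCol, "outputCol": outputCol}


def round_trip(stage):
    """Save a stage to a temporary directory and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stage")
        stage.save(path)
        return type(stage).load(path)
```

Calling `round_trip(ImputeNormalSketch(inputCol="x", outputCol="x_imputed"))` returns a new instance with the same parameters. This is the same save-then-load pattern as in the comment, with everything staying in Python.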



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-21542) Helper functions for custom Python Persistence

2018-11-09 Thread John Bauer (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681895#comment-16681895
 ] 

John Bauer edited comment on SPARK-21542 at 11/9/18 7:56 PM:
-

This is a) much more minimal, b) genuinely useful, and c) actually works with 
save and load, for example:
{code:java}
impute.write().save("impute")
imp = ImputeNormal.load("impute")
imp.explainParams()
impute_model.write().save("impute_model")
impm = ImputeNormalModel.load("impute_model")
impm.explainParams(){code}


was (Author: johnhbauer):
This is a) much more minimal, b) genuinely useful, and c) actually works with 
save and load, for example:
{code:java}
impute.write().save("impute")
 imp = ImputeNormal.load("impute")
 imp.explainParams()
 impute_model.write().save("impute_model")
 impm = ImputeNormalModel.load("imputer_model")
 impm = ImputeNormalModel.load("impute_model")
 impm.getInputCol()
 impm.getOutputCol()
 impm.getMean()
 impm.getStddev(){code}




[jira] [Comment Edited] (SPARK-21542) Helper functions for custom Python Persistence

2018-11-09 Thread John Bauer (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-21542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681895#comment-16681895
 ] 

John Bauer edited comment on SPARK-21542 at 11/9/18 7:54 PM:
-

This is a) much more minimal, b) genuinely useful, and c) actually works with 
save and load, for example:
{code:java}
impute.write().save("impute")
 imp = ImputeNormal.load("impute")
 imp.explainParams()
 impute_model.write().save("impute_model")
 impm = ImputeNormalModel.load("imputer_model")
 impm = ImputeNormalModel.load("impute_model")
 impm.getInputCol()
 impm.getOutputCol()
 impm.getMean()
 impm.getStddev(){code}


was (Author: johnhbauer):
This is a) much more minimal, b) genuinely useful, and c) actually works with 
save and load, for example:

impute.write().save("impute")
imp = ImputeNormal.load("impute")
imp.explainParams()
impute_model.write().save("impute_model")
impm = ImputeNormalModel.load("imputer_model")
impm = ImputeNormalModel.load("impute_model")
impm.getInputCol()
impm.getOutputCol()
impm.getMean()
impm.getStddev()
