[ https://issues.apache.org/jira/browse/SPARK-10931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15190093#comment-15190093 ]

Timothy Hunter commented on SPARK-10931:
----------------------------------------

Using Python decorators, it is fairly easy to autogenerate all the param 
wrappers, getters, and setters at runtime, and to extract the documentation 
from the Scala side so that each parameter's documentation is included in the 
docstrings of its getter and setter.
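
A minimal sketch of that approach (assuming the wrapper can hand us an 
iterable of (name, doc) pairs extracted from the wrapped Scala object; the 
_attach_params helper and its java_params argument below are illustrative, 
not existing PySpark APIs):

{code:python}
import types

from pyspark.ml.param import Param


def _attach_params(instance, java_params):
    """Attach a Param plus a getter and setter for each (name, doc) pair,
    reusing the Scala-side documentation in the generated docstrings."""
    for name, doc in java_params:
        # Instance-level Param, parented to the wrapper instance.
        setattr(instance, name, Param(instance, name, doc))

        def getter(self, _name=name):
            return self.getOrDefault(getattr(self, _name))

        def setter(self, value, _name=name):
            self._set(**{_name: value})
            return self

        getter.__doc__ = "Gets the value of %s.\n\n%s" % (name, doc)
        setter.__doc__ = "Sets the value of %s.\n\n%s" % (name, doc)

        cap = name[0].upper() + name[1:]
        setattr(instance, "get" + cap, types.MethodType(getter, instance))
        setattr(instance, "set" + cap, types.MethodType(setter, instance))
{code}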

There are two issues with that:
 - Do we need to specialize the documentation or some of the conversions 
between Java and Python? In either case, it is possible to "subclass" and 
make sure those methods do not get overwritten by the autogenerated stubs.
 - The documentation of a class (which is generated from the class 
definition, not from runtime instances) would miss all the params, because 
they are only generated on runtime objects. I believe there are ways around 
this, such as inserting the methods at import time (see the sketch after this 
list), but that would require more investigation.
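
One possible way around that second issue, assuming the (name, doc) pairs can 
be obtained at import time (which would itself require a call into the JVM or 
a code-generation step): install the generated methods with a class decorator, 
so they live on the class rather than on runtime instances and are therefore 
visible to documentation tools. The with_generated_params decorator below is 
illustrative; the class-level Param with a dummy parent mirrors what 
pyspark.ml.param.shared does for the shared params.

{code:python}
from pyspark.ml.param import Param, Params


def with_generated_params(param_docs):
    """param_docs: a list of (name, doc) pairs, assumed to be available
    at import time."""
    def decorate(cls):
        for name, doc in param_docs:
            # Class-level Param with a dummy parent, as in
            # pyspark.ml.param.shared; depending on the PySpark version it
            # may also need to be re-created per instance.
            setattr(cls, name, Param(Params._dummy(), name, doc))

            def getter(self, _name=name):
                return self.getOrDefault(getattr(self, _name))

            def setter(self, value, _name=name):
                self._set(**{_name: value})
                return self

            getter.__doc__ = "Gets the value of %s.\n\n%s" % (name, doc)
            setter.__doc__ = "Sets the value of %s.\n\n%s" % (name, doc)
            cap = name[0].upper() + name[1:]
            setattr(cls, "get" + cap, getter)
            setattr(cls, "set" + cap, setter)
        return cls
    return decorate


@with_generated_params([("maxIter", "max number of iterations (>= 0)")])
class ExampleWrapper(Params):
    """getMaxIter/setMaxIter now exist on the class itself, so they show
    up in the generated class documentation."""
{code}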

> PySpark ML Models should contain Param values
> ---------------------------------------------
>
>                 Key: SPARK-10931
>                 URL: https://issues.apache.org/jira/browse/SPARK-10931
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, PySpark
>            Reporter: Joseph K. Bradley
>
> PySpark spark.ml Models are generally wrappers around Java objects and do not 
> even contain Param values.  This JIRA is for copying the Param values from 
> the Estimator to the model.
> This can likely be solved by modifying Estimator.fit to copy Param values, 
> but should also include proper unit tests.


