[ 
https://issues.apache.org/jira/browse/SPARK-7146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15216769#comment-15216769
 ] 

Michael Zieliński commented on SPARK-7146:
------------------------------------------

We ended up copying a significant portion of the Params traits (HasInputCol(s), 
HasOutputCol(s), HasProbabilityCol) for our custom Transformers/Estimators. 
This creates a problem if you want to abstract over Transformers, e.g. have a 
piece of code that accepts any Transformer with a "setInputCol" method. Because 
our HasInputCol and the native Spark HasInputCol are different traits, that 
results in very clunky code (structural typing plus a few ugly type aliases).
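
To illustrate the kind of workaround meant here, a minimal sketch, assuming 
Scala 2. All names below (SparkHasInputCol, CustomHasInputCol, NativeTokenizer, 
CustomTokenizer, InputColSettable) are hypothetical, simplified stand-ins and 
do not mirror Spark's actual definitions:

```scala
import scala.language.reflectiveCalls

// Hypothetical stand-in for the trait that Spark keeps private.
trait SparkHasInputCol {
  var inputCol: String = ""
  def setInputCol(value: String): this.type = { inputCol = value; this }
}

// Our copied trait: same shape, but an unrelated type.
trait CustomHasInputCol {
  var inputCol: String = ""
  def setInputCol(value: String): this.type = { inputCol = value; this }
}

class NativeTokenizer extends SparkHasInputCol
class CustomTokenizer extends CustomHasInputCol

object StructuralDemo {
  // Structural type alias: anything exposing setInputCol(String).
  type InputColSettable = { def setInputCol(value: String): Any }

  // Accepts both kinds of stage, but the call goes through runtime
  // reflection and the compiler knows nothing else about T.
  def configure[T <: InputColSettable](stage: T, col: String): T = {
    stage.setInputCol(col)
    stage
  }
}
```

Both `StructuralDemo.configure(new NativeTokenizer, "text")` and 
`StructuralDemo.configure(new CustomTokenizer, "text")` compile, but only 
because of the structural alias; the two traits share no common supertype.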

Opening up at least some of the traits would result in cleaner code and more 
reuse. This document 
(https://docs.google.com/document/d/1plFBPJY_PriPTuMiFYLSm7fQgD1FieP4wt3oMVKMGcc/edit#)
separates the traits into a few groups, such as I/O, hyper-parameters and 
optimization. I/O would be a very good place to start, since those traits are 
reused most frequently across algorithms.
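
For comparison, a rough sketch of what a shared public I/O trait would allow. 
HasInputCol here is a hypothetical, simplified stand-in (the real shared Params 
traits are defined differently), and NativeStage and CustomStage are invented 
for illustration:

```scala
// Hypothetical public trait standing in for an opened-up shared Param trait.
trait HasInputCol {
  private var inputColValue: String = ""
  def getInputCol: String = inputColValue
  def setInputCol(value: String): this.type = { inputColValue = value; this }
}

// A built-in stage and a user-defined stage mixing in the same trait.
class NativeStage extends HasInputCol
class CustomStage extends HasInputCol

object SharedTraitDemo {
  // An ordinary upper type bound: no reflection, no structural type alias.
  def configure[T <: HasInputCol](stage: T, col: String): T = {
    stage.setInputCol(col)
    stage
  }
}
```

With a common trait, `SharedTraitDemo.configure` is checked statically and 
works for both stages without any aliases.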

> Should ML sharedParams be a public API?
> ---------------------------------------
>
>                 Key: SPARK-7146
>                 URL: https://issues.apache.org/jira/browse/SPARK-7146
>             Project: Spark
>          Issue Type: Brainstorming
>          Components: ML
>            Reporter: Joseph K. Bradley
>
> Discussion: Should the Param traits in sharedParams.scala be public?
> Pros:
> * Sharing the Param traits helps to encourage standardized Param names and 
> documentation.
> Cons:
> * Users have to be careful since parameters can have different meanings for 
> different algorithms.
> * If the shared Params are public, then implementations could test for the 
> traits.  It is unclear if we want users to rely on these traits, which are 
> somewhat experimental.
> Currently, the shared params are private.
> Proposal: Either
> (a) make the shared params private to encourage users to write specialized 
> documentation and value checks for parameters, or
> (b) design a better way to encourage overriding documentation and parameter 
> value checks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

