[ https://issues.apache.org/jira/browse/SPARK-6682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483729#comment-14483729 ]

Alexander Ulanov edited comment on SPARK-6682 at 4/7/15 6:35 PM:
-----------------------------------------------------------------

This is a very good idea. Please note, though, that there are a few issues here:
1) Setting the optimizer: the optimizers (LBFGS and SGD) take a Gradient and an 
Updater as constructor parameters. I don't think it is a good idea to force 
users to create the Gradient and Updater separately just to be able to create 
an Optimizer. So one has to explicitly implement methods like setLBFGSOptimizer 
or setSGDOptimizer and return the optimizer, so that the user can set its 
parameters.

```
  // Create an LBFGS optimizer from the builder's current gradient and
  // updater, make it the active optimizer, and return it for configuration.
  def LBFGSOptimizer: LBFGS = {
    val lbfgs = new LBFGS(_gradient, _updater)
    optimizer = lbfgs
    lbfgs
  }
```

Another downside is that if someone implements a new Optimizer, one has to add 
a corresponding "setMyOptimizer" method to the builder. The above problems 
might be solved by figuring out a better Optimizer interface that allows 
setting its parameters without actually creating it.
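
To illustrate, here is a minimal sketch of such an interface. All names and signatures here are hypothetical stand-ins, not MLlib's actual API: the point is that if Gradient and Updater were settable properties rather than constructor arguments, the builder could configure any Optimizer uniformly.

```scala
trait Gradient
trait Updater

// Hypothetical Optimizer interface: gradient and updater are settable
// properties instead of constructor arguments.
trait Optimizer {
  private var grad: Option[Gradient] = None
  private var upd: Option[Updater] = None
  def setGradient(g: Gradient): this.type = { grad = Some(g); this }
  def setUpdater(u: Updater): this.type = { upd = Some(u); this }
  def currentGradient: Option[Gradient] = grad
  def currentUpdater: Option[Updater] = upd
}

class LBFGS extends Optimizer
class GradientDescent extends Optimizer

// The builder then needs no per-optimizer setLBFGSOptimizer /
// setSGDOptimizer methods: any Optimizer is configured the same way.
class Builder {
  private var optimizer: Optimizer = new LBFGS
  def setOptimizer(o: Optimizer): this.type = { optimizer = o; this }
  def setGradient(g: Gradient): this.type = { optimizer.setGradient(g); this }
  def setUpdater(u: Updater): this.type = { optimizer.setUpdater(u); this }
  def currentOptimizer: Optimizer = optimizer
}
```

With this shape, a user-defined Optimizer would plug into the builder with no builder changes at all.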

2) Setting parameters after setting the optimizer: what if the user sets the 
Updater after setting the Optimizer? The Optimizer takes the Updater as a 
constructor parameter, so one has to recreate the corresponding Optimizer.

```
  // Propagate a new Gradient to the already-constructed optimizer; only
  // the known optimizer types can be updated in place.
  private[this] def updateGradient(gradient: Gradient): Unit = {
    optimizer match {
      case lbfgs: LBFGS => lbfgs.setGradient(gradient)
      case sgd: GradientDescent => sgd.setGradient(gradient)
      case other => throw new UnsupportedOperationException(
        s"Only LBFGS and GradientDescent are supported but got ${other.getClass}.")
    }
  }
```
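
The Updater case is worse, since it cannot be delegated with a setter at all. Here is a sketch of what the builder is forced to do under the constructor-based API; the class shapes are simplified stand-ins for the real MLlib classes, kept minimal to show the recreation problem:

```scala
trait Gradient
trait Updater

// Simplified stand-ins: as in MLlib, the optimizers take Gradient and
// Updater as constructor parameters.
class LBFGS(val gradient: Gradient, val updater: Updater)
class GradientDescent(val gradient: Gradient, val updater: Updater)

class Builder {
  private var _gradient: Gradient = new Gradient {}
  private var _updater: Updater = new Updater {}
  var optimizer: AnyRef = new LBFGS(_gradient, _updater)

  // Because the updater is a constructor argument, setting a new one
  // forces recreation of the optimizer instance, and the builder must
  // enumerate every concrete optimizer type to do so.
  def setUpdater(updater: Updater): this.type = {
    _updater = updater
    optimizer = optimizer match {
      case l: LBFGS           => new LBFGS(l.gradient, updater)
      case s: GradientDescent => new GradientDescent(s.gradient, updater)
      case other => throw new UnsupportedOperationException(
        s"Only LBFGS and GradientDescent are supported but got ${other.getClass}.")
    }
    this
  }
}
```

Note that any state the old optimizer instance had accumulated is silently discarded by the recreation, which is another argument for fixing the interface instead.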

So it is essential to work out the Optimizer interface first.



> Deprecate static train and use builder instead for Scala/Java
> -------------------------------------------------------------
>
>                 Key: SPARK-6682
>                 URL: https://issues.apache.org/jira/browse/SPARK-6682
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.3.0
>            Reporter: Joseph K. Bradley
>
> In MLlib, we have for some time been unofficially moving away from the old 
> static train() methods and moving towards builder patterns.  This JIRA is to 
> discuss this move and (hopefully) make it official.
> "Old static train()" API:
> {code}
> val myModel = NaiveBayes.train(myData, ...)
> {code}
> "New builder pattern" API:
> {code}
> val nb = new NaiveBayes().setLambda(0.1)
> val myModel = nb.train(myData)
> {code}
> Pros of the builder pattern:
> * Much less code when algorithms have many parameters.  Since Java does not 
> support default arguments, we required *many* duplicated static train() 
> methods (for each prefix set of arguments).
> * Helps to enforce default parameters.  Users should ideally not have to even 
> think about setting parameters if they just want to try an algorithm quickly.
> * Matches spark.ml API
> Cons of the builder pattern:
> * In Python APIs, static train methods are more "Pythonic."
> Proposal:
> * Scala/Java: We should start deprecating the old static train() methods.  We 
> must keep them for API stability, but deprecating will help with API 
> consistency, making it clear that everyone should use the builder pattern.  
> As we deprecate them, we should make sure that the builder pattern supports 
> all parameters.
> * Python: Keep static train methods.
> CC: [~mengxr]


