[jira] [Commented] (SPARK-18023) Adam optimizer

2016-11-20 Thread Vincent (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15682491#comment-15682491
 ] 

Vincent commented on SPARK-18023:
-

thanks [~mlnick]
that's really what we need. When I wrote the code for Adagrad & Adam, I did find 
some conflicts with the original design: these new optimizers do not share a 
common API with what we have now in mllib, and they also follow a different 
workflow, so it's hard to fit them in and make a good PR without changing the 
original design. For now I just made a separate package instead.

> Adam optimizer
> --
>
> Key: SPARK-18023
> URL: https://issues.apache.org/jira/browse/SPARK-18023
> Project: Spark
>  Issue Type: New Feature
>  Components: ML, MLlib
>Reporter: Vincent
>Priority: Minor
>
> It can be very slow for SGD methods to converge, or they may even diverge, if 
> the learning rate alpha is set inappropriately. Many alternative methods have 
> been proposed to produce desirable convergence with less dependence on 
> hyperparameter settings, and to help escape local optima, e.g. Momentum, 
> NAG (Nesterov's Accelerated Gradient), Adagrad, RMSProp, etc.
> Among these, Adam is one of the most popular algorithms; it performs first-order 
> gradient-based optimization of stochastic objective functions. It has proven to 
> be well suited to problems with large data and/or many parameters, and to 
> problems with noisy and/or sparse gradients, and it is computationally efficient. 
> Refer to this paper for details.
> In fact, TensorFlow has implemented most of the adaptive optimization methods 
> mentioned, and we have seen that Adam outperforms most SGD methods in 
> certain cases, such as a very sparse dataset in an FM model.
> It would be nice for Spark to have these adaptive optimization methods. 
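For reference, the Adam update maintains exponentially decaying averages of past gradients and past squared gradients, applies bias correction, and takes a normalized step. A minimal NumPy sketch of a single parameter update (this is illustrative only, not Spark code; the default hyperparameters are the commonly published ones):

```python
import numpy as np

def adam_step(theta, grad, m, v, t, alpha=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update: moment estimates, bias correction, normalized step."""
    m = beta1 * m + (1 - beta1) * grad       # first moment (running mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2  # second moment (running mean of squared gradients)
    m_hat = m / (1 - beta1 ** t)             # bias correction for the zero-initialized moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# toy usage: minimize f(x) = x^2 starting from x = 5.0
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 5001):
    grad = 2 * theta
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.01)
```

Note how the per-coordinate scaling by sqrt(v_hat) is what makes the method attractive for sparse gradients: rarely-updated coordinates get effectively larger steps.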



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18023) Adam optimizer

2016-11-20 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15682471#comment-15682471
 ] 

Nick Pentreath commented on SPARK-18023:


Linking SPARK-17136, which is really a blocker for adding any optimization 
methods. We first need to design a good API for pluggable optimizers, then work 
on adding some more advanced options. We can look at other libraries in R and 
Python, e.g. TensorFlow, to get some ideas on how they have designed these 
interfaces.
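The pluggable-optimizer idea discussed here can be sketched roughly as follows. This is only an illustration of the shape such an API might take (all names are hypothetical, not an actual Spark or TensorFlow interface): each optimizer owns its state and exposes a single update step, so the training loop is written once and the algorithm is swapped freely.

```python
from abc import ABC, abstractmethod
import numpy as np

class Optimizer(ABC):
    """Hypothetical pluggable-optimizer interface."""

    @abstractmethod
    def update(self, params: np.ndarray, grad: np.ndarray) -> np.ndarray:
        """Return updated parameters given the current gradient."""

class SGD(Optimizer):
    def __init__(self, lr=0.1):
        self.lr = lr

    def update(self, params, grad):
        return params - self.lr * grad

class Adagrad(Optimizer):
    def __init__(self, lr=0.1, eps=1e-8):
        self.lr, self.eps = lr, eps
        self.g2_sum = None  # accumulated squared gradients, per coordinate

    def update(self, params, grad):
        if self.g2_sum is None:
            self.g2_sum = np.zeros_like(params)
        self.g2_sum += grad ** 2
        return params - self.lr * grad / (np.sqrt(self.g2_sum) + self.eps)

def minimize(opt: Optimizer, params, grad_fn, steps=100):
    """Generic training loop: the optimizer is a plug-in, not hard-coded."""
    for _ in range(steps):
        params = opt.update(params, grad_fn(params))
    return params
```

With such an interface, `minimize(SGD(), ...)` and `minimize(Adagrad(), ...)` differ only in the object passed in, which is the decoupling this issue and SPARK-17136 are after.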




[jira] [Commented] (SPARK-18023) Adam optimizer

2016-10-19 Thread Vincent (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15590951#comment-15590951
 ] 

Vincent commented on SPARK-18023:
-

I can start with Adam, then maybe the other adaptive methods after that
