[ https://issues.apache.org/jira/browse/SYSTEMML-1159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16090453#comment-16090453 ]
Mike Dusenberry edited comment on SYSTEMML-1159 at 7/17/17 8:02 PM:
--------------------------------------------------------------------

[~return_01] Thanks -- adding HogWild asynchronous SGD would be quite interesting. However, this particular JIRA issue refers to *hyperparameters* rather than the model parameters; HogWild applies to the latter. If you are interested in pursuing support for HogWild, could you please create a new JIRA issue for it and link it to SYSTEMML-540? SYSTEMML-1563 may also be of interest -- I added a distributed synchronous SGD algorithm a while back, currently implemented in the [distributed MNIST LeNet|https://github.com/apache/systemml/blob/master/scripts/nn/examples/mnist_lenet_distrib_sgd.dml] example. We are currently working to improve its engine performance in SYSTEMML-1760.
> Enable Remote Hyperparameter Tuning
> -----------------------------------
>
>                  Key: SYSTEMML-1159
>                  URL: https://issues.apache.org/jira/browse/SYSTEMML-1159
>              Project: SystemML
>           Issue Type: Improvement
>     Affects Versions: SystemML 1.0
>             Reporter: Mike Dusenberry
>             Priority: Blocker
>
> Training a parameterized machine learning model (such as a large neural net in deep learning) requires learning a set of ideal model parameters from the data, as well as determining appropriate hyperparameters (or "settings") for the training process itself. In the latter case, the hyperparameters (i.e. learning rate, regularization strength, dropout percentage, model architecture, etc.) cannot be learned from the data, and instead are determined via a search across a space for each hyperparameter. For large numbers of hyperparameters (such as in deep learning models), the current literature points to performing staged, randomized grid searches over the space to produce distributions of performance, narrowing the space after each search [1]. Thus, for efficient hyperparameter optimization, it is desirable to train several models in parallel, with each model trained over the full dataset. For deep learning models, a mini-batch training approach is currently state-of-the-art, and thus separate models with different hyperparameters could, conceivably, be easily trained on each of the nodes in a cluster.
>
> In order to allow for the training of deep learning models, SystemML needs to determine a solution to enable this scenario with the Spark backend. Specifically, if the user has a {{train}} function that takes a set of hyperparameters and trains a model with a mini-batch approach (and thus is only making use of single-node instructions within the function), the user should be able to wrap this function with, for example, a remote {{parfor}} construct that samples hyperparameters and calls the {{train}} function on each machine in parallel.
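The desired usage could be sketched in DML roughly as follows. This is a hypothetical illustration only: {{train}} and {{eval}} stand in for user-defined functions taking the full dataset plus sampled hyperparameters, and the {{mode=REMOTE_SPARK}} parfor option denotes the remote Spark execution this issue asks to enable.

```
# Hypothetical sketch of remote hyperparameter tuning via parfor.
# X, Y, X_val, Y_val are assumed to be preloaded; train() and eval()
# are assumed user-defined, mini-batch, single-node functions.
num_trials = 16
results = matrix(0, rows=num_trials, cols=3)

parfor (i in 1:num_trials, mode=REMOTE_SPARK) {
  # Sample hyperparameters log-uniformly, per the randomized-search literature [1].
  lr  = 10 ^ -as.scalar(rand(rows=1, cols=1, min=1, max=4))  # learning rate
  reg = 10 ^ -as.scalar(rand(rows=1, cols=1, min=2, max=6))  # regularization strength

  # Each trial trains an independent model over the entire dataset.
  [W, b] = train(X, Y, lr, reg)
  acc = eval(X_val, Y_val, W, b)

  results[i, 1] = lr
  results[i, 2] = reg
  results[i, 3] = acc
}
```

Each parfor iteration would run on a separate node, so the results matrix collects one (lr, reg, accuracy) row per trial, from which the search space could be narrowed for a subsequent stage.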
> To be clear, each model would need access to the entire dataset, and each model would be trained independently.
>
> [1]: http://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)