[ https://issues.apache.org/jira/browse/MADLIB-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16402383#comment-16402383 ]
ASF GitHub Bot commented on MADLIB-1206:
----------------------------------------
GitHub user njayaram2 opened a pull request:
https://github.com/apache/madlib/pull/243
MLP: Add minibatch gradient descent solver
JIRA: MADLIB-1206
This commit adds support for mini-batch based gradient descent for MLP.
If the input table contains a 2D matrix for the independent variable,
minibatch is automatically used as the solver. Two minibatch-specific
optimizer parameters are also introduced: batch_size and n_epochs.
- batch_size defaults to min(200, buffer_size), where buffer_size is
the number of original input rows packed into a single row of the
matrix (see the sketch after this list).
- n_epochs is the number of times all the batches in a buffer are
iterated over (default 1).
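A minimal Python sketch of the batch_size default above; the function
name and signature are hypothetical, for illustration only, not
MADlib's actual internals:

    def default_batch_size(buffer_size, batch_size=None):
        # Fall back to min(200, buffer_size) when the caller gives no
        # explicit batch_size; buffer_size is the number of original
        # input rows packed into one matrix row.
        if batch_size is not None:
            return batch_size
        return min(200, buffer_size)

For example, a buffer packing 1000 original rows gets a default
batch_size of 200, while a 50-row buffer gets 50.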
Other changes include:
- The dependent variable in the minibatch solver is now also a matrix;
it was initially a vector.
- Randomize the order in which batches are processed within an epoch
(a sketch follows this list).
- MLP minibatch does not currently support the weights param; an error
is now thrown if it is specified.
- Delete an unused type named mlp_step_result.
- Add unit tests for the newly added functions in the Python file.
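A minimal sketch of the per-epoch batch shuffling noted above, again
with assumed names rather than MADlib's actual internals:

    import random

    def run_epochs(batches, n_epochs, process_batch):
        # Iterate over all batches n_epochs times; within each epoch,
        # visit the batches in a freshly randomized order.
        order = list(range(len(batches)))
        for _ in range(n_epochs):
            random.shuffle(order)
            for i in order:
                process_batch(batches[i])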
Co-authored-by: Rahul Iyer <[email protected]>
Co-authored-by: Nikhil Kak <[email protected]>
Closes #243
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/madlib/madlib mlp-minibatch-with-preprocessed-data-rebased
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/madlib/pull/243.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #243
----
commit d9306f7c6a44f64c53df13c34759da55468c4d26
Author: Nandish Jayaram <njayaram@...>
Date: 2018-02-28T00:51:42Z
MLP: Add minibatch gradient descent solver
JIRA: MADLIB-1206
MLP: Add minibatch gradient descent solver
JIRA: MADLIB-1206
This commit adds support for mini-batch based gradient descent for MLP.
If the input table contains a 2D matrix for the independent variable,
minibatch is automatically used as the solver. Two minibatch-specific
optimizer parameters are also introduced: batch_size and n_epochs.
- batch_size defaults to min(200, buffer_size), where buffer_size is
the number of original input rows packed into a single row of the
matrix.
- n_epochs is the number of times all the batches in a buffer are
iterated over (default 1).
Other changes include:
- The dependent variable in the minibatch solver is now also a matrix;
it was initially a vector.
- Randomize the order in which batches are processed within an epoch.
- MLP minibatch does not currently support the weights param; an error
is now thrown if it is specified.
- Delete an unused type named mlp_step_result.
- Add unit tests for the newly added functions in the Python file.
Co-authored-by: Rahul Iyer <[email protected]>
Co-authored-by: Nikhil Kak <[email protected]>
Closes #243
----
> Add mini batch based gradient descent support to MLP
> ----------------------------------------------------
>
> Key: MADLIB-1206
> URL: https://issues.apache.org/jira/browse/MADLIB-1206
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Neural Networks
> Reporter: Nandish Jayaram
> Assignee: Nandish Jayaram
> Priority: Major
> Fix For: v1.14
>
>
> Mini-batch gradient descent is typically the algorithm of choice when
> training a neural network.
> MADlib currently supports IGD; we may have to add extensions to include
> mini-batch as a solver for MLP (see the sketch below this quoted
> description). Other modules will continue to use the existing IGD, which
> does not support mini-batching. Later JIRAs will move the other modules
> over, one at a time, to use the new mini-batch GD.
> Related JIRA that will pre-process the input data to be consumed by
> mini-batch is https://issues.apache.org/jira/browse/MADLIB-1200
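To make the IGD vs. mini-batch contrast above concrete, here is a
minimal Python sketch; the function names and the grad callable (the
gradient of the loss, averaged over the rows it is given) are
assumptions for illustration, not MADlib code:

    def igd_pass(w, X, y, grad, lr):
        # IGD: one model update per input row.
        for i in range(len(X)):
            w = w - lr * grad(w, X[i:i+1], y[i:i+1])
        return w

    def minibatch_pass(w, X, y, grad, lr, batch_size):
        # Mini-batch GD: one update per batch of rows, using the
        # gradient averaged over the batch.
        for start in range(0, len(X), batch_size):
            Xb = X[start:start+batch_size]
            yb = y[start:start+batch_size]
            w = w - lr * grad(w, Xb, yb)
        return w

The per-row variant takes many small, noisy steps; the batched variant
takes fewer, smoother steps, which is what the new solver exploits.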