[ https://issues.apache.org/jira/browse/MADLIB-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16380827#comment-16380827 ]
Nandish Jayaram commented on MADLIB-1206:
-----------------------------------------
Based on the output of the preprocessing step in
https://issues.apache.org/jira/browse/MADLIB-1200, MLP should decide whether
to use mini-batch, using some basic checks:
Check that <preprocessed_table_name>_summary and
<preprocessed_table_name>_standardization exist, and inspect their column
names, to verify whether the data has been preprocessed. If it has, use
mini-batch; otherwise use regular IGD.
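
A minimal sketch of that check, in Python. It assumes the catalog lookup has
already been reduced to a dict of table name to column names; the function
name `choose_solver` and the required column names below are hypothetical
placeholders, not the columns the preprocessor actually writes.

```python
# Hypothetical column names: the comment only says to check "the column
# names in them", so these sets are placeholders for the real ones.
SUMMARY_COLUMNS = {"source_table", "independent_varname",
                   "dependent_varname", "buffer_size"}
STANDARDIZATION_COLUMNS = {"mean", "std"}


def choose_solver(table_name, catalog):
    """Return "minibatch" if table_name looks preprocessed, else "igd".

    catalog maps a table name to an iterable of its column names; it
    stands in for a real database catalog query.
    """
    summary_cols = catalog.get(table_name + "_summary")
    std_cols = catalog.get(table_name + "_standardization")

    # Either companion table missing: treat the input as raw data.
    if summary_cols is None or std_cols is None:
        return "igd"

    # Both tables exist; they must also carry the expected columns.
    if (SUMMARY_COLUMNS <= set(summary_cols)
            and STANDARDIZATION_COLUMNS <= set(std_cols)):
        return "minibatch"
    return "igd"
```

Falling back to IGD whenever a table or column is missing keeps the existing
behavior for inputs that never went through the preprocessor.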
Other information we should get from the preprocessing step:
# The mean and standard deviation of the independent variable.
# Whether the data was preprocessed for classification or regression,
determined by looking at a column named `classes` in
<preprocessed_table_name>_summary.
# The original input table name, the independent/dependent variable names,
and the grouping columns, all from <preprocessed_table_name>_summary.
# The buffer size from <preprocessed_table_name>_summary, used to validate
the batch_size to be used in MLP mini-batch.
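
A rough sketch of how two of these items might be consumed, in Python. The
comment does not state the validation rule or how `classes` encodes
regression, so both are assumptions here: `classes` is taken to be empty
(None) for regression, and batch_size is taken to be a positive integer no
larger than the buffer size. The helper names are hypothetical.

```python
def is_classification(summary_row):
    """Guess the task type from one row of the summary table.

    Assumption: `classes` is populated for classification and is None
    for regression. The comment only says to look at this column.
    """
    return summary_row.get("classes") is not None


def validate_batch_size(batch_size, buffer_size):
    """Check batch_size against the preprocessor's buffer size.

    Assumed rule (not stated in the comment): batch_size must be a
    positive integer that does not exceed buffer_size.
    """
    if not isinstance(batch_size, int) or batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if batch_size > buffer_size:
        raise ValueError(
            "batch_size %d exceeds buffer size %d" % (batch_size, buffer_size))
    return batch_size
```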
> Add mini batch based gradient descent support to MLP
> ----------------------------------------------------
>
> Key: MADLIB-1206
> URL: https://issues.apache.org/jira/browse/MADLIB-1206
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Neural Networks
> Reporter: Nandish Jayaram
> Assignee: Nandish Jayaram
> Priority: Major
> Fix For: v1.14
>
>
> Mini-batch gradient descent is typically the algorithm of choice when
> training a neural network.
> MADlib currently supports IGD; we may have to add extensions to include
> mini-batch as a solver for MLP. Other modules will continue to use the
> existing IGD, which does not support mini-batching. Later JIRAs will move
> the other modules over, one at a time, to the new mini-batch GD.
> The related JIRA that will pre-process the input data to be consumed by
> mini-batch is https://issues.apache.org/jira/browse/MADLIB-1200