[ https://issues.apache.org/jira/browse/MADLIB-1206?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16388482#comment-16388482 ]
Nandish Jayaram commented on MADLIB-1206:
-----------------------------------------
We can follow Python's (scikit-learn's) approach for setting the default value of batch_size:
{code}
default_batch_size = min(200, buffer_size)
{code}
The buffer_size is the number of input rows that the preprocessor step packs into a single row (https://issues.apache.org/jira/browse/MADLIB-1200). Note that Python uses the total number of input data points instead of buffer_size.
We may have to revisit this default batch size after some experiments to see how it affects performance/accuracy.
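For concreteness, a minimal sketch of that rule (the function wrapper and names are mine, not MADlib source; the scikit-learn comparison refers to its batch_size='auto' behavior):
{code}
# Hypothetical helper mirroring the proposed default.
def default_batch_size(buffer_size):
    # scikit-learn's MLPClassifier resolves batch_size='auto' to
    # min(200, n_samples); here buffer_size plays the role of n_samples.
    return min(200, buffer_size)

# e.g. buffer_size=1000 -> batch_size=200; buffer_size=50 -> batch_size=50
{code}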
> Add mini batch based gradient descent support to MLP
> ----------------------------------------------------
>
> Key: MADLIB-1206
> URL: https://issues.apache.org/jira/browse/MADLIB-1206
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Neural Networks
> Reporter: Nandish Jayaram
> Assignee: Nandish Jayaram
> Priority: Major
> Fix For: v1.14
>
>
> Mini-batch gradient descent is typically the algorithm of choice when
> training a neural network.
> MADlib currently supports IGD; we may have to add extensions to include
> mini-batch as a solver for MLP. Other modules will continue to use the
> existing IGD, which does not support mini-batching. Later JIRAs will move
> other modules over, one at a time, to use the new mini-batch GD.
> The related JIRA that will pre-process the input data for consumption by
> mini-batch is https://issues.apache.org/jira/browse/MADLIB-1200
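For context, a minimal sketch of one epoch of mini-batch gradient descent (illustrative only, not the MADlib solver; minibatch_gd_epoch and grad_fn are hypothetical names, and the update rule assumes a caller-supplied gradient function):
{code}
import numpy as np

# One epoch of mini-batch gradient descent: shuffle the data, then make
# one weight update per batch of rows rather than per row (as IGD does).
# grad_fn is a hypothetical placeholder for the model's gradient.
def minibatch_gd_epoch(X, y, weights, grad_fn, batch_size, lr=0.01):
    n = X.shape[0]
    idx = np.random.permutation(n)          # shuffle once per epoch
    for start in range(0, n, batch_size):
        batch = idx[start:start + batch_size]
        weights -= lr * grad_fn(weights, X[batch], y[batch])
    return weights
{code}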