[jira] [Commented] (MAHOUT-976) Implement Multilayer Perceptron

Ted Dunning (Commented) (JIRA) Thu, 16 Feb 2012 07:45:25 -0800

    [ 
https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13209446#comment-13209446
 ]


Ted Dunning commented on MAHOUT-976:
------------------------------------

Also, John has had very good results in Vowpal Wabbit with an all-reduce 
operation in his learning system.  He launches a map-only learning task that 
reads the inputs repeatedly and, on every pass over the data, propagates the 
gradient vector using all-reduce.  All-reduce applies an associative 
aggregation to a data structure along a tree imposed on the network.  The 
result of the aggregation is then passed back down the tree to all nodes.
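
To make the two phases concrete, here is a minimal single-process sketch of 
all-reduce over a binary tree.  It is not Vowpal Wabbit's implementation: the 
function name tree_allreduce, the array-style tree layout (worker i has 
children 2i+1 and 2i+2) and the in-memory "network" are all assumptions made 
for illustration, and the combine function is assumed to be commutative as 
well as associative.

```python
from operator import add
from typing import Callable, List, Sequence

Vector = List[float]


def tree_allreduce(
    local: Sequence[Vector],
    combine: Callable[[float, float], float] = add,
) -> List[Vector]:
    """Simulate all-reduce over a binary tree of workers.

    local[i] is the vector held by worker i (hypothetical layout:
    worker 0 is the root, worker i has children 2i+1 and 2i+2).
    Returns the vector every worker holds afterwards.
    """
    n = len(local)
    partial = [list(v) for v in local]

    # Reduce phase: visit workers from the highest index down to 1, so each
    # worker has already absorbed its children before it reports upwards.
    for i in range(n - 1, 0, -1):
        parent = (i - 1) // 2
        partial[parent] = [
            combine(a, b) for a, b in zip(partial[parent], partial[i])
        ]

    # Broadcast phase: the root's aggregate is passed back down the tree.
    result: List[Vector] = [[] for _ in range(n)]
    result[0] = partial[0]
    for i in range(1, n):
        result[i] = list(result[(i - 1) // 2])
    return result


if __name__ == "__main__":
    # Four workers, each holding a local gradient vector.
    gradients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    print(tree_allreduce(gradients))  # every worker ends with [16.0, 20.0]
```

In a real cluster each worker would be a separate process and the two loops 
would be network sends and receives, but the data flow is the same: 
aggregate up the tree, then copy the result back down.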

This allows fast iteration of learning and could also speed up our k-means 
code massively.  Typically, this improves speed by about two orders of 
magnitude because the horrid cost of Hadoop job startup goes away.

Would you be interested in experimenting with this in your parallel 
implementation here?

                
> Implement Multilayer Perceptron
> -------------------------------
>
>                 Key: MAHOUT-976
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-976
>             Project: Mahout
>          Issue Type: New Feature
>    Affects Versions: 0.7
>            Reporter: Christian Herta
>            Priority: Minor
>              Labels: multilayer, networks, neural, perceptron
>         Attachments: MAHOUT-976.patch, MAHOUT-976.patch
>
>   Original Estimate: 80h
>  Remaining Estimate: 80h
>
> Implement a multi layer perceptron
>  * via Matrix Multiplication
>  * Learning by Backpropagation; implementing tricks by Yann LeCun et al.: 
> "Efficient BackProp"
>  * arbitrary number of hidden layers (also 0  - just the linear model)
>  * connection between proximate layers only 
>  * different cost and activation functions (different activation function in 
> each layer) 
>  * test of backprop by gradient checking 
>  * normalization of the inputs (storable) as part of the model
>  
> First:
>  * implementation of "stochastic gradient descent" like gradient machine
>  * simple gradient descent incl. momentum
> Later (new jira issues):  
>  * Distributed Batch learning (see below)  
>  * "Stacked (Denoising) Autoencoder" - Feature Learning
>  * advanced cost minimization like 2nd order methods, conjugate gradient etc.
> Distribution of learning can be done by (batch learning):
>  1 Partitioning of the data into x chunks 
>  2 Learning the weight changes as matrices in each chunk
>  3 Combining the matrices and updating the weights - back to 2
> Maybe this procedure can be done with random parts of the chunks (distributed 
> quasi online learning). 
> Batch learning with delta-bar-delta heuristics for adapting the learning 
> rates.    
>  
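
The three-step batch procedure in the quoted description (partition, learn 
weight changes per chunk, combine and update) can be sketched as below.  This 
is only an illustration under stated assumptions, not the code in the attached 
patch: it uses the zero-hidden-layer case (a plain linear model with squared 
error), runs the chunks sequentially in one process, and the names partition, 
chunk_gradient and train are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[List[float], float]  # (inputs, target)


def partition(data: Sequence[Sample], x: int) -> List[List[Sample]]:
    """Step 1: split the data into x chunks (round-robin assignment)."""
    return [list(data[i::x]) for i in range(x)]


def chunk_gradient(w: List[float], chunk: Sequence[Sample]) -> List[float]:
    """Step 2: summed squared-error gradient of a linear model on one chunk."""
    grad = [0.0] * len(w)
    for inputs, target in chunk:
        err = sum(wi * xi for wi, xi in zip(w, inputs)) - target
        for j, xj in enumerate(inputs):
            grad[j] += err * xj
    return grad


def train(
    data: Sequence[Sample], x: int = 2, rate: float = 0.1, passes: int = 500
) -> List[float]:
    """Repeat steps 2 and 3 for a fixed number of passes over the data."""
    w = [0.0] * len(data[0][0])
    chunks = partition(data, x)
    for _ in range(passes):
        # Step 2: each chunk computes its weight change independently
        # (this is the part that would run in parallel on a cluster).
        grads = [chunk_gradient(w, c) for c in chunks]
        # Step 3: combine the per-chunk changes and update the weights.
        total = [sum(g) for g in zip(*grads)]
        w = [wi - rate * gi / len(data) for wi, gi in zip(w, total)]
    return w


if __name__ == "__main__":
    # Targets follow y = 1 + 2*t; the first input is a constant bias term.
    data = [([1.0, float(t)], 1.0 + 2.0 * t) for t in range(4)]
    print(train(data))  # approaches [1.0, 2.0]
```

Because the per-chunk gradients are simply summed, the combined update is the 
same as a full-batch gradient step, which is why the combine step fits an 
associative aggregation such as all-reduce.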

