[
https://issues.apache.org/jira/browse/MAHOUT-976?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13825355#comment-13825355
]
Yexi Jiang commented on MAHOUT-976:
-----------------------------------
[MAHOUT-1265|https://issues.apache.org/jira/browse/MAHOUT-1265] is actually a
new implementation of MLP based on Ted's comments. For example, users can
freely configure each layer by setting the number of neurons and the squashing
function. Users can also choose the cost function and set parameters such as
the learning rate, momentum weight, and so on.
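To make that concrete, below is a minimal, self-contained Java sketch of such a
configuration surface. The class and method names are hypothetical illustrations
only, not the actual MAHOUT-1265 API: each layer gets a neuron count and a
squashing function, and the cost function, learning rate, and momentum weight
are set separately.
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleUnaryOperator;

// Hypothetical configuration sketch; NOT the actual MAHOUT-1265 API.
public class MlpConfigSketch {

  // One layer: number of neurons plus its squashing (activation) function.
  static final class Layer {
    final int numNeurons;
    final DoubleUnaryOperator squashing;
    Layer(int numNeurons, DoubleUnaryOperator squashing) {
      this.numNeurons = numNeurons;
      this.squashing = squashing;
    }
  }

  private final List<Layer> layers = new ArrayList<>();
  private String costFunction = "SquaredError"; // or e.g. "CrossEntropy"
  private double learningRate = 0.1;
  private double momentumWeight = 0.9;

  MlpConfigSketch addLayer(int numNeurons, DoubleUnaryOperator squashing) {
    layers.add(new Layer(numNeurons, squashing));
    return this;
  }

  MlpConfigSketch costFunction(String name) { this.costFunction = name; return this; }
  MlpConfigSketch learningRate(double lr) { this.learningRate = lr; return this; }
  MlpConfigSketch momentumWeight(double m) { this.momentumWeight = m; return this; }

  public static void main(String[] args) {
    DoubleUnaryOperator sigmoid = x -> 1.0 / (1.0 + Math.exp(-x));
    DoubleUnaryOperator identity = x -> x;

    // Freely configured topology: 4 inputs, one hidden layer of 8, 3 outputs.
    MlpConfigSketch mlp = new MlpConfigSketch()
        .addLayer(4, identity)     // input layer
        .addLayer(8, sigmoid)      // hidden layer
        .addLayer(3, sigmoid)      // output layer
        .costFunction("CrossEntropy")
        .learningRate(0.05)
        .momentumWeight(0.9);
    System.out.println("Configured " + mlp.layers.size() + " layers, cost=" + mlp.costFunction);
  }
}
{code}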
> Implement Multilayer Perceptron
> -------------------------------
>
> Key: MAHOUT-976
> URL: https://issues.apache.org/jira/browse/MAHOUT-976
> Project: Mahout
> Issue Type: New Feature
> Affects Versions: 0.7
> Reporter: Christian Herta
> Assignee: Ted Dunning
> Priority: Minor
> Labels: multilayer, networks, neural, perceptron
> Fix For: Backlog
>
> Attachments: MAHOUT-976.patch, MAHOUT-976.patch, MAHOUT-976.patch,
> MAHOUT-976.patch
>
> Original Estimate: 80h
> Remaining Estimate: 80h
>
> Implement a multilayer perceptron
> * via Matrix Multiplication
> * Learning by Backpropagation; implementing tricks by Yann LeCun et al.:
> "Efficient BackProp"
> * arbitrary number of hidden layers (also 0 - just the linear model)
> * connection between proximate layers only
> * different cost and activation functions (different activation function in
> each layer)
> * test of backprop by gradient checking (see the sketch after this list)
> * normalization of the inputs (storable) as part of the model
>
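As an aside on the gradient-checking item above, here is a small, self-contained
Java sketch (illustrative only, not part of any attached patch): a tiny 2-3-1
network with sigmoid squashing and squared-error cost, whose backprop gradient
for the hidden-to-output weights is compared against a central finite difference.
{code:java}
// Hypothetical gradient-checking sketch (not Mahout code).
import java.util.Random;

public class GradientCheckSketch {

  static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

  // Forward pass: squared-error cost for one example, given both weight layers.
  static double cost(double[][] w1, double[] w2, double[] x, double target) {
    double[] h = new double[w1.length];
    for (int j = 0; j < w1.length; j++) {
      double s = 0.0;
      for (int i = 0; i < x.length; i++) s += w1[j][i] * x[i];
      h[j] = sigmoid(s);
    }
    double net = 0.0;
    for (int j = 0; j < h.length; j++) net += w2[j] * h[j];
    double out = sigmoid(net);
    double d = out - target;
    return 0.5 * d * d;
  }

  public static void main(String[] args) {
    Random rnd = new Random(42);
    double[] x = {0.3, -0.7};
    double target = 1.0;
    double[][] w1 = new double[3][2];
    double[] w2 = new double[3];
    for (double[] row : w1) for (int i = 0; i < row.length; i++) row[i] = rnd.nextGaussian() * 0.5;
    for (int j = 0; j < w2.length; j++) w2[j] = rnd.nextGaussian() * 0.5;

    // Forward pass, then the analytic (backprop) error signal at the output.
    double[] h = new double[3];
    for (int j = 0; j < 3; j++) {
      double s = 0.0;
      for (int i = 0; i < 2; i++) s += w1[j][i] * x[i];
      h[j] = sigmoid(s);
    }
    double net = 0.0;
    for (int j = 0; j < 3; j++) net += w2[j] * h[j];
    double out = sigmoid(net);
    double deltaOut = (out - target) * out * (1.0 - out); // dE/dnet at the output

    double eps = 1e-5;
    for (int j = 0; j < 3; j++) {
      double analytic = deltaOut * h[j];              // dE/dw2[j] by backprop
      double saved = w2[j];
      w2[j] = saved + eps;  double plus  = cost(w1, w2, x, target);
      w2[j] = saved - eps;  double minus = cost(w1, w2, x, target);
      w2[j] = saved;
      double numeric = (plus - minus) / (2.0 * eps);  // central difference
      System.out.printf("w2[%d]: backprop=%.8f numeric=%.8f%n", j, analytic, numeric);
    }
  }
}
{code}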
> First:
> * implementation of "stochastic gradient descent" like the gradient machine
> * simple gradient descent incl. momentum (see the sketch below)
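A minimal sketch of the gradient-descent-with-momentum update mentioned above
(illustrative Java, not Mahout code); the velocity accumulates the previous step,
so the effective update is the scaled gradient plus momentumWeight times the
last step.
{code:java}
// Hypothetical sketch of "simple gradient descent incl. momentum".
public class MomentumUpdateSketch {
  public static void main(String[] args) {
    double learningRate = 0.1;
    double momentumWeight = 0.9;
    double[] weights  = {0.5, -0.3};
    double[] velocity = {0.0, 0.0};

    // Pretend gradients from two consecutive (stochastic) steps.
    double[][] grads = {{0.2, -0.1}, {0.15, -0.05}};
    for (double[] grad : grads) {
      for (int i = 0; i < weights.length; i++) {
        velocity[i] = momentumWeight * velocity[i] - learningRate * grad[i];
        weights[i] += velocity[i];
      }
      System.out.printf("w = [%.4f, %.4f]%n", weights[0], weights[1]);
    }
  }
}
{code}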
> Later (new jira issues):
> * Distributed Batch learning (see below)
> * "Stacked (Denoising) Autoencoder" - Feature Learning
> * advanced cost minimization like 2nd-order methods, conjugate gradient, etc.
> Distribution of learning can be done by (batch learning):
> 1 Partitioning of the data into x chunks
> 2 Learning the weight changes as matrices in each chunk
> 3 Combining the matrices and updating the weights - back to 2 (see the sketch
> below)
> Maybe this procedure can be done with random parts of the chunks (distributed
> quasi online learning).
> Batch learning with delta-bar-delta heuristics for adapting the learning
> rates.
>
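The following is a minimal Java sketch of steps 2-3 of the batch procedure above
(illustrative only, not Mahout/Hadoop code): the per-chunk weight-change matrices
are hard-coded placeholders standing in for per-partition backprop results, and
the combine step averages them before applying one update to the shared weights.
{code:java}
// Hypothetical sketch of the distributed batch-learning combine step.
public class BatchCombineSketch {
  public static void main(String[] args) {
    double learningRate = 0.1;
    double[][] weights = {{0.1, 0.2}, {0.3, 0.4}};

    // Step 2: per-chunk weight-change matrices (faked here; in the real
    // procedure each chunk runs backprop over its partition of the data).
    double[][][] chunkDeltas = {
        {{0.02, -0.01}, {0.00, 0.03}},
        {{0.04, -0.03}, {0.02, 0.01}}
    };

    // Step 3: combine the matrices (simple average) and update the weights,
    // then the procedure would go back to step 2 for the next pass.
    for (int r = 0; r < weights.length; r++) {
      for (int c = 0; c < weights[r].length; c++) {
        double sum = 0.0;
        for (double[][] delta : chunkDeltas) sum += delta[r][c];
        weights[r][c] -= learningRate * (sum / chunkDeltas.length);
      }
    }
    System.out.println(java.util.Arrays.deepToString(weights));
  }
}
{code}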
--
This message was sent by Atlassian JIRA
(v6.1#6144)