[
https://issues.apache.org/jira/browse/MAHOUT-968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13203045#comment-13203045
]
Dirk Weißenborn commented on MAHOUT-968:
----------------------------------------
That depends heavily on the number of training batches and the number of
epochs you are using, and of course on the network structure. You can try one
epoch through the dataset at first, and then maybe a little fine-tuning. If you
have fewer input neurons, it should go much faster. If you need help with
parameter tuning, I can probably help you. If you want to monitor progress,
don't forget the monitor option, which shows the reconstruction error (greedy
pretraining) or the discriminative error (fine-tuning) after each batch.
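
To make the cost factors above concrete, here is a minimal sketch of greedy
pretraining for a single binary RBM with one-step contrastive divergence
(CD-1). It is not the code from the MAHOUT-968 patch: the names TinyRBM,
pretrain and the monitor flag are hypothetical and only illustrate the idea.
The loop runs epochs x batches updates, each update costs roughly
batch size x visible units x hidden units, and the reconstruction error is
reported after each batch.

```python
import math
import random


def sigmoid(x):
    # Numerically safe logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


class TinyRBM:
    """Hypothetical minimal binary RBM trained with CD-1 (not the Mahout patch)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        # Small random weights, zero biases.
        self.w = [[self.rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_hid[j]
                        + sum(v[i] * self.w[i][j] for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.b_vis[i]
                        + sum(h[j] * self.w[i][j] for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def train_batch(self, batch, lr=0.1):
        """One CD-1 update; returns the batch's mean squared reconstruction error."""
        dw = [[0.0] * self.n_hidden for _ in range(self.n_visible)]
        db_vis = [0.0] * self.n_visible
        db_hid = [0.0] * self.n_hidden
        error = 0.0
        for v0 in batch:
            h0 = self.hidden_probs(v0)
            h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
            v1 = self.visible_probs(h0_sample)   # reconstruction
            h1 = self.hidden_probs(v1)
            for i in range(self.n_visible):
                db_vis[i] += v0[i] - v1[i]
                for j in range(self.n_hidden):
                    dw[i][j] += v0[i] * h0[j] - v1[i] * h1[j]
            for j in range(self.n_hidden):
                db_hid[j] += h0[j] - h1[j]
            error += sum((a - b) ** 2 for a, b in zip(v0, v1)) / self.n_visible
        scale = lr / len(batch)
        for i in range(self.n_visible):
            self.b_vis[i] += scale * db_vis[i]
            for j in range(self.n_hidden):
                self.w[i][j] += scale * dw[i][j]
        for j in range(self.n_hidden):
            self.b_hid[j] += scale * db_hid[j]
        return error / len(batch)


def pretrain(rbm, data, batch_size, epochs, monitor=False):
    """Greedy pretraining loop: epochs * number_of_batches weight updates."""
    history = []
    for epoch in range(epochs):
        for start in range(0, len(data), batch_size):
            err = rbm.train_batch(data[start:start + batch_size])
            history.append(err)
            if monitor:
                # Report the reconstruction error after each batch.
                print(f"epoch {epoch} batch {start // batch_size} "
                      f"reconstruction error {err:.4f}")
    return history
```

For example, pretrain(TinyRBM(784, 500), data, batch_size=100, epochs=1,
monitor=True) would correspond to one pass over MNIST-sized inputs; the layer
sizes here are illustrative only. Halving the number of visible units roughly
halves the work in every batch, which is why fewer input neurons train faster.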
> Classifier based on restricted boltzmann machines
> -------------------------------------------------
>
> Key: MAHOUT-968
> URL: https://issues.apache.org/jira/browse/MAHOUT-968
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Affects Versions: 0.7
> Reporter: Dirk Weißenborn
> Labels: classification, mnist
> Fix For: 0.7
>
> Attachments: MAHOUT-968.patch, MAHOUT-968.patch
>
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> This is a proposal for a new classifier based on restricted Boltzmann
> machines. The development of this feature follows the 2009 paper on "Deep
> Boltzmann Machines" (DBM) [1]. The proposed model (DBM) achieved an error
> rate of 0.95% on the MNIST dataset [2], which is very good. The main parts
> of the implementation should also be applicable to scenarios other than
> classification where restricted Boltzmann machines are used (ref. MAHOUT-375).
> I am working on this feature right now, and the results are promising. The
> only problem with the training algorithm is that it is still mostly
> sequential (if training batches are small, which they should be), so
> Map/Reduce is not really beneficial so far. However, since the algorithm
> itself is fast (for a training algorithm), training can be done on a single
> machine in manageable time.
> The algorithm is currently being tested on the MNIST dataset itself to
> reproduce the results of [1]. As soon as the results indicate that everything
> is working fine, I will upload the patch.
> [1] http://www.cs.toronto.edu/~hinton/absps/dbm.pdf
> [2] http://yann.lecun.com/exdb/mnist/