Hi Francois,

This took me a good while to figure out! I'll see if I can give a condensed version of the way I did it, which I think is equivalent. Following the code at the back of the DBN paper is how I finally figured it out. The way I see it, there are 3 separate parts (after pre-training as normal):
- Wake phase: data-driven - updates the generative parameters
- Associative phase: transition from data to model - updates the RBM parameters
- Sleep phase: model-driven - updates the recognition parameters

Prerequisite step: untie the recognition weights (forward-prop / prop-up) from the generative weights (prop-down), and leave the RBM weights tied.

1. Wake:
- Propagate values up to the layer before the RBM at the top (the second-last or penultimate layer if there's only one RBM at the top).
- *Before* going into the RBM, propagate these values back down to get a reconstruction.
- Cost = difference between the input and the reconstruction.
- Update the generative parameters with the derivative of this cost.

2. Associative:
- Do the usual update of the RBM, using n-step CD etc.

3. Sleep:
- Use the values output from n-step contrastive divergence in the associative step as the values of the penultimate layer (pen_RBM).
- Propagate these values down to the input layer.
- Propagate them back up again to calculate the reconstructed penultimate layer (pen_reconstruction).
- Cost = difference between pen_RBM and pen_reconstruction.

That's my understanding anyway - I've put a rough sketch of what one update could look like in code at the bottom of this message. Hope it helps!

All the Best,
Jim

On Monday, 13 February 2017 12:54:19 UTC, Francois Lasson wrote:
>
> Hello Jim,
>
> I'm currently working on the same problem using Theano. Have you
> implemented the contrastive wake-sleep algorithm with this library, and if
> so, could you give me some guidance?
>
> Many thanks,
> François
>
> On Thursday, 14 July 2016 at 11:32:38 UTC+2, Jim O' Donoghue wrote:
>>
>> So I'm going to reply to my own question in case it helps anyone else
>> out. I had another look at the paper; I had forgotten about the
>> contrastive wake-sleep algorithm. That's what's used to train the model
>> completely unsupervised.
>>
>> On Tuesday, 12 July 2016 15:40:48 UTC+1, Jim O' Donoghue wrote:
>>>
>>> Hi there,
>>>
>>> Just wondering how you would fine-tune a DBN for a completely
>>> unsupervised task, i.e. a practical implementation of "fine-tune all the
>>> parameters of this deep architecture with respect to a proxy for the DBN
>>> log-likelihood".
>>>
>>> Would this be something like, for example, a negative log-likelihood
>>> between the original input and the reconstruction of the data when
>>> propagated entirely up and down the network? What makes the final layer an
>>> RBM and the rest just normal directed layers? Or would the only way to do
>>> this be to completely unroll the network and fine-tune it like a deep
>>> autoencoder (as in "Reducing the Dimensionality of Data with Neural
>>> Networks")?
>>>
>>> Many thanks,
>>> Jim
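P.S. Here's the sketch I mentioned above. It's a loose numpy paraphrase of the up-down procedure from the appendix of the DBN paper (per-layer delta-rule updates rather than an explicit cost derivative), not real Theano code, and every name in it (wake_sleep_step, rec_W, gen_W and so on) is made up for illustration. I've also used CD-1 in the associative step to keep it short; substitute n-step CD as you like.

import numpy as np

rng = np.random.RandomState(0)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def sample(p):
    # draw binary states from Bernoulli probabilities
    return (rng.uniform(size=p.shape) < p).astype(p.dtype)

def wake_sleep_step(x, rec_W, rec_b, gen_W, gen_b, W, b_pen, b_top, lr=0.01):
    # rec_W[i] maps layer i up to layer i+1; gen_W[i] maps layer i+1
    # back down to layer i; W is the (still tied) top-level RBM matrix.

    # 1. Wake: drive the net bottom-up with data, update generative weights
    wake = [x]
    for Wr, br in zip(rec_W, rec_b):
        wake.append(sample(sigmoid(wake[-1] @ Wr + br)))
    for i, (Wg, bg) in enumerate(zip(gen_W, gen_b)):
        recon = sigmoid(wake[i + 1] @ Wg + bg)   # top-down prediction of layer i
        gen_W[i] += lr * wake[i + 1].T @ (wake[i] - recon)
        gen_b[i] += lr * (wake[i] - recon).sum(axis=0)

    # 2. Associative: one step of CD on the top RBM
    pen = wake[-1]                               # penultimate-layer states
    h0 = sigmoid(pen @ W + b_top)
    pen_neg = sample(sigmoid(sample(h0) @ W.T + b_pen))
    h1 = sigmoid(pen_neg @ W + b_top)
    W += lr * (pen.T @ h0 - pen_neg.T @ h1)
    b_pen += lr * (pen - pen_neg).sum(axis=0)
    b_top += lr * (h0 - h1).sum(axis=0)

    # 3. Sleep: treat the CD negative sample as the penultimate layer,
    # dream top-down, then update the recognition weights
    sleep = [pen_neg]
    for Wg, bg in zip(reversed(gen_W), reversed(gen_b)):
        sleep.append(sample(sigmoid(sleep[-1] @ Wg + bg)))
    sleep.reverse()                              # now ordered bottom-up
    for i, (Wr, br) in enumerate(zip(rec_W, rec_b)):
        recon = sigmoid(sleep[i] @ Wr + br)      # bottom-up prediction of layer i+1
        rec_W[i] += lr * sleep[i].T @ (sleep[i + 1] - recon)
        rec_b[i] += lr * (sleep[i + 1] - recon).sum(axis=0)

You'd call wake_sleep_step once per minibatch after greedy pre-training, initialising rec_W and gen_W as separate copies of the pre-trained (tied) weights - that's the untying step above. Porting it to Theano should mostly be a matter of expressing the same updates symbolically.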