Hi all,

I have a question about the TD(lambda) training method described by Tesauro (see
http://www.research.ibm.com/massive/tdl.html#h2:learning_methodology).

The formula for adapting the weights of the neural net is

w(t+1) - w(t) = a * [Y(t+1) - Y(t)] * sum(lambda^(t-k) * nabla(w)Y(k); k=1..t).
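Written out as code, I read the update like this (a minimal Python sketch, not
gnubg code; Y and grad_Y are placeholders for the net's output and its gradient
with respect to the weights):

    import numpy as np

    def td_lambda_update(w, xs, Y, grad_Y, alpha, lam, t):
        # w(t+1) - w(t) = a * [Y(t+1) - Y(t)]
        #                   * sum(lambda^(t-k) * nabla(w)Y(k); k=1..t)
        td_error = Y(w, xs[t + 1]) - Y(w, xs[t])
        total = np.zeros_like(w)
        for k in range(1, t + 1):
            # Every gradient here is evaluated at the current weights w;
            # whether that is the intended reading is my question below.
            total += lam ** (t - k) * grad_Y(w, xs[k])
        return w + alpha * td_error * total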

I would like to know whether nabla(w)Y(k) in the formula above is the gradient
of Y(k) with respect to the weights of the net at time t (i.e. the current net)
or with respect to the weights of the net at time k. I assume the former.
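To make the distinction concrete: the usual incremental (eligibility-trace) way
of computing the sum folds each gradient into a running trace as soon as it is
available, so there nabla(w)Y(k) is taken at the weights of time k (again just
a sketch with the same placeholder names):

    def td_lambda_step(w, e, x_t, x_next, Y, grad_Y, alpha, lam):
        # Trace update: e(t) = lambda * e(t-1) + nabla(w)Y(t), with the
        # gradient taken at the weights as they stand at this step.
        e = lam * e + grad_Y(w, x_t)
        # Weight update: w += a * [Y(t+1) - Y(t)] * e(t)
        w = w + alpha * (Y(w, x_next) - Y(w, x_t)) * e
        return w, e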

Thanks in advance!

greetings, boomslang