boomslang wrote:
> Hi Øystein / others,
>
> I didn't know gnubg used just TD(0). This does make things easier for
> me. The Sutton/Barto you're referring to..., is that the book
> "Reinforcement Learning: An Introduction"?
Yes! It's even available online in HTML format.
> I do have a questi
Hi Øystein / others,
Thanks for your quick answer.
I didn't know gnubg used just TD(0). This does make things
easier for me. The Sutton/Barto you're referring
to..., is that the book "Reinforcement Learning: An
Introduction"?
I do have a question about this supervised training,
tho
Ian Shaw wrote:
>> Our experience is: TD is nice for kickstarting the training
>> process. But supervised training is the real thing. Make a big
>> database of positions and the rollout results according to these
>> positions and train supervised.
>>
>> If you still would like to do TD training w
> -----Original Message-----
> From: Øystein Johansen
> Sent: 21 May 2009 09:19
>
> Our experience is: TD is nice for kickstarting the training
> process. But supervised training is the real thing. Make a
> big database of positions and the rollout results according
> to these positions and
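
To make the "database of positions plus rollout results" idea concrete, here is a minimal sketch of supervised training. It is not gnubg's actual code: the single sigmoid unit, the function names `train_supervised` and `predict`, the learning rate, and the two-position toy database are all assumptions chosen for illustration. A real evaluator would use a multi-layer net and real rollout data.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, x):
    """Evaluate one position: a single sigmoid unit over the input features."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def train_supervised(database, n_inputs, epochs=500, alpha=0.5, seed=0):
    """Fit the weights to (features, rollout_value) pairs by stochastic
    gradient descent on squared error. All names here are hypothetical."""
    rng = random.Random(seed)
    w = [0.0] * n_inputs
    data = list(database)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the positions in a fresh order each epoch
        for x, target in data:
            y = predict(w, x)
            # Chain rule for 0.5 * (target - y)^2 through the sigmoid.
            scale = (target - y) * y * (1.0 - y)
            w = [wi + alpha * scale * xi for wi, xi in zip(w, x)]
    return w


# Toy database: (input features, rollout result). Values are made up.
database = [([1.0, 0.0], 0.9), ([0.0, 1.0], 0.1)]
weights = train_supervised(database, n_inputs=2)
```

The difference from TD training is only the target: here each position is pulled toward its stored rollout result rather than toward the net's own estimate of the next position.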
Alexander Smirnov wrote:
> Hello
>
> I wonder if it is possible to reuse the gnubg engine in my application. I'm
> developing an open source backgammon game for the K Desktop Environment and
> am looking for a strong computer opponent. It looks like gnubg is the greatest
> opponent we have now! :-)
Cool! How far ha
boomslang wrote:
> Hi all,
>
> I have a question regarding TD(lambda) training by Tesauro (see
> http://www.research.ibm.com/massive/tdl.html#h2:learning_methodology).
>
> The formula for adapting the weights of the neural net is
>
> w(t+1)-w(t) = a * [Y(t+1)-Y(t)] * sum(lambda^(t-k) * nabla(w)Y
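
A sum of the form sum_k lambda^(t-k) * grad_w Y(k) does not have to be recomputed at every step. It can be carried along as an eligibility trace, e(t) = lambda * e(t-1) + grad_w Y(t). The sketch below shows that bookkeeping for one game. It assumes a linear evaluator Y(x) = w . x (so the gradient is just the feature vector) instead of a neural net, and the function name, parameters, and toy inputs are hypothetical. With lam=0 it reduces to plain TD(0).

```python
def td_lambda_episode(w, features, final_reward, alpha=0.1, lam=0.7):
    """One game of TD(lambda) for a linear evaluator Y(x) = w . x.

    features: feature vectors of the positions visited, in order.
    final_reward: outcome observed after the last position.
    Returns updated weights; the input list w is left unchanged.
    """
    w = list(w)
    trace = [0.0] * len(w)  # e(t) = sum_k lam^(t-k) * grad_w Y(k)
    for t, x in enumerate(features):
        y_t = sum(wi * xi for wi, xi in zip(w, x))
        # For a linear evaluator, grad_w Y(t) is the feature vector itself.
        trace = [lam * e + xi for e, xi in zip(trace, x)]
        if t + 1 < len(features):
            x_next = features[t + 1]
            y_next = sum(wi * xi for wi, xi in zip(w, x_next))
        else:
            y_next = final_reward  # the last step learns from the real outcome
        delta = y_next - y_t
        w = [wi + alpha * delta * e for wi, e in zip(w, trace)]
    return w


# Two positions, then a win (reward 1.0). Toy numbers only.
w_lam = td_lambda_episode([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1.0,
                          alpha=0.5, lam=0.5)
w_td0 = td_lambda_episode([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1.0,
                          alpha=0.5, lam=0.0)
```

With lam=0.5 the final error also adjusts the weight for the first position, scaled by lambda; with lam=0 only the weight for the most recent position moves.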