You have an input that represents whose turn it is (one input for white, one for black; the value is one if that player is on turn and zero otherwise). I think that's in the original Tesauro setup, isn't it?
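Something like this, say (a made-up Python sketch, not the actual gnubg encoding; the function name is invented):

    def turn_inputs(white_on_turn: bool):
        # one input for white, one for black; 1.0 marks the player on turn
        return [1.0, 0.0] if white_on_turn else [0.0, 1.0]

Flipping the board's perspective would then swap these two inputs along with the rest of the encoding.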
On Dec 10, 2011, at 1:10 AM, Joseph Heled <jhe...@gmail.com> wrote:

> Well, I am not sure how you flip the position, since it matters who is
> on the move.
>
> -Joseph
>
> On 10 December 2011 16:17, Mark Higgins <migg...@gmail.com> wrote:
>> I've been playing around a bit with neural networks for backgammon and
>> found something interesting, and want to see whether this is already
>> part of gnubg.
>>
>> Assume a Tesauro-style network with the usual inputs and some number
>> of hidden nodes. For simplicity, assume just one output, representing
>> the probability of a win.
>>
>> If I take a given board, translate the position into the inputs, and
>> evaluate the network, it gives me a probability of win. If I then flip
>> the board's perspective (i.e. white vs black) and do the same, I get
>> another probability of win. Those two probabilities should sum to 1,
>> since one or the other player must win (equivalently, the probability
>> of white winning = the probability of black losing = 1 - the
>> probability of black winning).
>>
>> But that constraint isn't satisfied with the usual TD setup.
>>
>> If however you make a few assumptions:
>>
>> * Hidden layer nodes don't include a bias weight.
>> * The input->hidden weights have a specific antisymmetry: the weight
>> connecting the i'th hidden node to the j'th input node satisfies
>> w(i,j) = -w(i,j*), where j* is the index of the other player's
>> corresponding input.
>> * The output node doesn't include a bias weight.
>>
>> Then you can show that if the output->hidden node weights sum to zero,
>> the flip-the-perspective constraint is satisfied.
>>
>> This reduces the number of weights by about half, since you need only
>> half the middle weights. The network should be more accurate, since a
>> known symmetry is respected, and should converge quicker, since there
>> are fewer parameters to optimize.
>>
>> You can generalize to a bias weight on the output node; in that case
>> the constraint is that the bias weight = -1/2 * sum(output->hidden
>> node weights).
>>
>> You can also generalize to include a "gammon win" output node. In that
>> case there are no constraints on the output->hidden node weights, but
>> the probability of a gammon loss can be calculated from the gammon-win
>> node's weights, so you don't need to explicitly include an output node
>> for the gammon loss.
>>
>> I googled around a fair bit but couldn't figure out whether this is
>> well known or already included somewhere in gnubg. I took a look
>> through eval.c but it's a bit daunting. :) Is there documentation
>> somewhere that I've just not found?
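For concreteness, here's a small numpy sketch that checks both constraints numerically. This isn't gnubg code: the layer sizes, variable names, and input encoding are invented for illustration, and a perspective flip is modeled as simply swapping the two halves of the input vector.

    import numpy as np

    rng = np.random.default_rng(0)
    k, n_hidden = 8, 5                      # toy sizes, not Tesauro's

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # w(i,j) = -w(i,j*): generate half the weights, negate the mirror half.
    A = rng.normal(size=(n_hidden, k))
    W = np.hstack([A, -A])                  # antisymmetric input->hidden weights

    x = rng.random(2 * k)                   # a random "board" encoding
    x_flip = np.concatenate([x[k:], x[:k]]) # flipped perspective: swap halves
    h, h_flip = sigmoid(W @ x), sigmoid(W @ x_flip)  # note h_flip == 1 - h

    # Case 1: no output bias, output->hidden weights summing to zero.
    v0 = rng.normal(size=n_hidden)
    v0 -= v0.mean()                         # force sum(v0) == 0
    print(sigmoid(v0 @ h) + sigmoid(v0 @ h_flip))          # ~1.0

    # Case 2: arbitrary output weights, bias = -1/2 * sum(weights).
    v1 = rng.normal(size=n_hidden)
    b = -0.5 * v1.sum()
    print(sigmoid(b + v1 @ h) + sigmoid(b + v1 @ h_flip))  # ~1.0

Both prints come out at 1.0 up to rounding: with the antisymmetric middle weights and no hidden biases, flipping the board exactly negates the hidden pre-activations, so each hidden output maps to 1 minus itself, and the sum-to-zero (or bias) condition on the output layer turns that into p -> 1 - p.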