Thanks to everybody for the links! They have given me a good amount of stuff to 
look at that will help with the proposal.

Many of these are very much in the same spirit as what I am proposing, though 
most seem to be concerned primarily with the tree rather than the playouts. 
It’s interesting that starting the counters at 1 is equivalent to using a 
uniform prior. Similarly, it is easy to see that with a Beta(prior wins, 
prior losses) prior, the posterior is Beta(prior wins + observed wins, 
prior losses + observed losses), so the posterior mean is simply total 
wins / total runs (with the prior pseudo-counts included in the totals) — 
exactly the estimate already used everywhere.
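As a minimal sketch of the conjugate update described above (the function name `posterior_mean` is mine, purely for illustration):

```python
def posterior_mean(prior_wins, prior_losses, wins, losses):
    """Posterior mean of a move's win rate under a Beta(prior_wins, prior_losses)
    prior, after observing `wins` wins and `losses` losses.

    Conjugacy: the posterior is Beta(prior_wins + wins, prior_losses + losses),
    whose mean is alpha / (alpha + beta) -- i.e. total wins / total runs,
    counting the prior pseudo-counts in the totals.
    """
    alpha = prior_wins + wins
    beta = prior_losses + losses
    return alpha / (alpha + beta)

# Uniform prior = Beta(1, 1): same as starting both counters at 1.
W, L = 7, 3
assert posterior_mean(1, 1, W, L) == (W + 1) / (W + L + 2)
```

So the "start the counters at 1" trick and the Beta(1, 1) prior give identical estimates; a non-uniform Beta prior just changes the starting pseudo-counts.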

Thank you again for your time and the links,

Alex

> On Sep 25, 2014, at 4:06 PM, Álvaro Begué <alvaro.be...@gmail.com> wrote:
> 
> I believe this has been discussed in the mailing list before: If your prior 
> distribution of the win rate of a move is uniform, after L losses and W wins 
> the posterior distribution will be a beta distribution with alpha=W+1 and 
> beta=L+1. The expected value of this distribution is alpha/(alpha+beta) = 
> (W+1)/(W+L+2), which is equivalent to the common trick of starting the 
> counters W and L at 1 instead of at 0.
> 
> Of course one could start with a different prior, but I think staying within 
> the family of beta distributions makes sense because it's very tractable.
> 
> Is that the kind of thing you were looking for?
> 
> 
> Álvaro.
> 
> 
> 
> On Thu, Sep 25, 2014 at 6:28 PM, Alexander Terenin <atere...@ucsc.edu> wrote:
> Hello everybody,
> 
> I’m a PhD student in statistics at the University of California, Santa Cruz 
> who previously worked on the Go program Orego, and I am currently applying 
> for the NSF fellowship. I am working on a research proposal related to 
> Bayesian statistics that I would like to use in my application, and wanted 
> to know whether anyone is aware of existing research on my topic.
> 
> Currently, it seems most MCTS-based Go programs, in the playouts, treat the 
> strength (win rate) of each move as a fixed, unknown value, which is then 
> estimated using frequentist techniques (specifically, by playing a random 
> game, and taking the estimate to be wins / total runs). Has anyone attempted 
> to instead statistically estimate the strength of each move using Bayesian 
> techniques, by defining a set of prior beliefs about the strength of a 
> certain move, playing a random game, and then integrating the information 
> gained from the random game together with the prior beliefs using Bayes' 
> Rule? Equivalently, has anyone defined the strength of each move to be a 
> random variable rather than a fixed but unknown value? Without making this 
> email too long, there are some theoretical advantages that might allow more 
> information to be extracted from each playout under this setup.
> 
> If you are aware of any work in this direction, I would love to hear from 
> you! I’ve been looking through a variety of papers and have yet to find 
> anything; it seems that any work remotely related to Bayes’ Rule has 
> concerned the tree, not the playouts.
> Thank you in advance,
> 
> Alex Terenin
> atere...@ucsc.edu
> _______________________________________________
> Computer-go mailing list
> Computer-go@dvandva.org
> http://dvandva.org/cgi-bin/mailman/listinfo/computer-go
