Re: [computer-go] analysis of UCT and BAST
Remi Munos wrote: I have updated the BAST paper, providing additional comparison with UCT, as suggested by one person in the list. See: https://hal.inria.fr/inria-00150207 after reading the paper, I have two questions: How did you deal with unexplored nodes in Flat UCB and BAST in your experiment? If you assign infinity to their bounds that means that the maximum is also infinity and all of the 2^D leaf nodes in the tree will be explored at least once. This doesn't matter with Flat UCB, because the regret is O(2^D) anyway, but it should matter with BAST, right? Have you considered a different definition of your smoothness assumption? I think the bound on the difference between a sub-optimal leaf reward and the optimal reward is not a good model for Go trees. It might be better to model smoothness by assuming a high probability that two leaf rewards in a branch are close to each other, but there might still be a few outliers (losing moves in a won position). - Markus ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] analysis of UCT and BAST
Please notice that it is not my work. All the experiments were performed by Filip Gruszczynski. He corrected the webpage. (should be EGO_POWER) Best Regards, Lukasz On 5/30/07, Rémi Coulom [EMAIL PROTECTED] wrote: Łukasz Lew wrote: I'm not sure whether You have noticed, but my student made an empirical comparison between BAST, UCT and other formulas. It can be found here: http://students.mimuw.edu.pl/~fg219435/Go/ Best Regards, Lukasz Lew Hi Łukasz, You write that EGO_BAST seems to be a bit more effective against GnuGo, but I cannot see any test result of EGO_BAST against GNU Go on your page. Rémi ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] analysis of UCT and BAST
On 5/31/07, Remi Munos [EMAIL PROTECTED] wrote: Thanks Lukasz for the link. I'm not sure to understand precisely the formulas. For example, for ego_bast_sqrt, you mention that the bast value of a node is the min of the As I said previously this is not my experiment. You can reach the author - Filip Gruszczynski - using gruszczy (at) gmail (dot) com max of the bast values of the children and the own value of the node plus the confidence interval term sqrt(explore_rat * sqrt(p) / n) where p and n are the number of visits of the parent and current nodes. Do you add to the own value, an addition smoothness sequence (the delta term in the bast paper)? Here I think a such smoothness term could be k gamma^d /sqrt(n) (for some well chosen k and gamma constants, with a large k and gamma1) where d if the depth of the current node. This would take into account the time needed for the bias (difference between the true value of a node and the empirical average of the obtained rewards) to decrease to 0 (since we know that from the sqrt sequence above, the bias will decrease with rate O(1/sqrt(n)). From your formula, it does not seem that you add such an additional smoothness sequence, so I would say that this bound would be very close to a ego_sqrt (because the argument in the min above will most always be the second argument). Tell me if I misunderstood the formulas of your student. The simplest and most definite way is to look at the source code. Best Regards, Lukasz PS Filip noted that more complicated versions of BAST were worse. On Wednesday 30 May 2007 21:05:15 Łukasz Lew wrote: I'm not sure whether You have noticed, but my student made an empirical comparison between BAST, UCT and other formulas. It can be found here: http://students.mimuw.edu.pl/~fg219435/Go/ Best Regards, Lukasz Lew On 5/30/07, Remi Munos [EMAIL PROTECTED] wrote: I have updated the BAST paper, providing additional comparison with UCT, as suggested by one person in the list. See: https://hal.inria.fr/inria-00150207 Basically, in the considered examples, BAST with an appropriate smoothness sequence performs always better than both UCT and Flat-UCB. Now, the best smoothness coefficient is always smaller than the constant corresponding to the true smoothnes of the tree, and sometimes very close to 0 (which corresponds to UCT). Since go is way more complex than the problem of optimizing a simple 1-d function, I would say that no theoretical work could predict whether BAST would do better than UCT or not, in Monte-Carlo-go. Best, Remi ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ -- Rémi Munos Directeur de recherche INRIA Sequential Learning project http://www.grappa.univ-lille3.fr/cgi-bin/twiki/view/Sequel/Munos ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
[computer-go] analysis of UCT and BAST
I have updated the BAST paper, providing additional comparison with UCT, as suggested by one person in the list. See: https://hal.inria.fr/inria-00150207 Basically, in the considered examples, BAST with an appropriate smoothness sequence performs always better than both UCT and Flat-UCB. Now, the best smoothness coefficient is always smaller than the constant corresponding to the true smoothnes of the tree, and sometimes very close to 0 (which corresponds to UCT). Since go is way more complex than the problem of optimizing a simple 1-d function, I would say that no theoretical work could predict whether BAST would do better than UCT or not, in Monte-Carlo-go. Best, Remi ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] analysis of UCT and BAST
I'm not sure whether You have noticed, but my student made an empirical comparison between BAST, UCT and other formulas. It can be found here: http://students.mimuw.edu.pl/~fg219435/Go/ Best Regards, Lukasz Lew On 5/30/07, Remi Munos [EMAIL PROTECTED] wrote: I have updated the BAST paper, providing additional comparison with UCT, as suggested by one person in the list. See: https://hal.inria.fr/inria-00150207 Basically, in the considered examples, BAST with an appropriate smoothness sequence performs always better than both UCT and Flat-UCB. Now, the best smoothness coefficient is always smaller than the constant corresponding to the true smoothnes of the tree, and sometimes very close to 0 (which corresponds to UCT). Since go is way more complex than the problem of optimizing a simple 1-d function, I would say that no theoretical work could predict whether BAST would do better than UCT or not, in Monte-Carlo-go. Best, Remi ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/ ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/
Re: [computer-go] analysis of UCT and BAST
Łukasz Lew wrote: I'm not sure whether You have noticed, but my student made an empirical comparison between BAST, UCT and other formulas. It can be found here: http://students.mimuw.edu.pl/~fg219435/Go/ Best Regards, Lukasz Lew Hi Łukasz, You write that EGO_BAST seems to be a bit more effective against GnuGo, but I cannot see any test result of EGO_BAST against GNU Go on your page. Rémi ___ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/