Hi there.

This kind of asks for a large study :)

***************************************************************************
So here is the final question (read on for explanations):
***************************************************************************
How does best_score_goal scale against win_only_goal as the number of playouts increases (it increases for both, but at different rates, with best_score_goal getting the larger increase at each iteration)?
***************************************************************************
We've seen quite a few studies on AMAF lately. I'd now like to ask a question about the behavior of "UCT_like" Go analysers.

First, rather than talking about Go, I'd rather talk about the sub-game of Go that the standard-light-simulator is exploring:
+ Suicide not allowed
+ Playing in own pseudo-eye not allowed
+ Can't pass, unless no empty legal intersections are available.

Now here is what I call "UCT_like" behavior:
- The node that got explored the most is the "best".
- The exploration strategy is designed to achieve logarithmic regret on a fixed distribution.
- The engine uses the information gathered on nodes to gradually shift the probability of selecting the first moves of the playout from "all moves are equiprobable" to "only the best move is considered".

We discard here any consideration of "AMAF" or any other fancy stuff. I tried to frame this definition so that any plain UCT_bot using standard-light-playout would be conforming. Any epsilon-greedy exploration strategy would also be conforming.

---------------- Here are my questions : -----------------------

First, how do the different parameters affect scalability (both against gnu-go-lvl-1 and against conforming engines)? What is the shape of the effect of each of those parameters, as long as we still have a conforming bot? For example, is a basic epsilon-greedy-bot really different from a uct_bot in its efficiency?

Now here is the real question: how do win_only_goal and best_score_goal compare to each other in terms of games won?

It is obviously easier to say "this game is won for Black" than "Black can win this game by up to 4.5 points, no more, assuming perfect play by both players".

By win_only_goal, I mean of course that the bot only tries to find the best move with regard to winning the game under a fixed komi.
By best_score_goal, I mean that the bot dynamically tries to find the absolute best move with regard to the theoretical min/max score.
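To make the terms above concrete, here is a minimal Python sketch of the two goals as playout reward functions, next to the two selection rules mentioned (UCB1 as used by plain UCT, and epsilon-greedy). All function names, the komi value and the constants are assumptions made for illustration; none of this comes from an existing engine. Note also that a constant epsilon gives linear rather than logarithmic regret, so a decaying epsilon would be needed to meet the second criterion strictly.

```python
import math
import random

KOMI = 7.5  # assumed fixed komi; no value is given in this post


def win_only_reward(black_score, white_score, komi=KOMI):
    """win_only_goal: 1.0 if Black wins the playout after komi, else 0.0."""
    return 1.0 if black_score - white_score - komi > 0 else 0.0


def best_score_reward(black_score, white_score, komi=KOMI):
    """best_score_goal: signed margin for Black after komi.

    Not in [0, 1]; it would need rescaling before being fed to UCB1.
    """
    return black_score - white_score - komi


def ucb1_select(visits, total_reward, c=math.sqrt(2)):
    """Index of the child with the highest UCB1 value (rewards in [0, 1])."""
    for i, n in enumerate(visits):
        if n == 0:
            return i  # try every child once before comparing bounds
    total = sum(visits)

    def value(i):
        mean = total_reward[i] / visits[i]
        return mean + c * math.sqrt(math.log(total) / visits[i])

    return max(range(len(visits)), key=value)


def epsilon_greedy_select(visits, total_reward, epsilon=0.1, rng=random):
    """Random child with probability epsilon, else the best mean so far."""
    if rng.random() < epsilon:
        return rng.randrange(len(visits))
    means = [r / n if n else 0.0 for r, n in zip(total_reward, visits)]
    return max(range(len(visits)), key=means.__getitem__)


def most_visited(visits):
    """Final move choice: the node that got explored the most."""
    return max(range(len(visits)), key=visits.__getitem__)
```

Under these assumptions, switching a conforming bot from win_only_goal to best_score_goal only changes which reward function scores the end of each playout; the selection rule and the final most-visited choice stay the same.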
I think it is obvious that if a position is won, say for Black, and assuming enough playouts are made, both win_only_goal and best_score_goal will give a sure win. Which brings me back to the final question stated at the top.

_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/