[computer-go] win_ratio VS mean_score : any hard values ?

Denis fidaali Mon, 03 Nov 2008 03:04:56 -0800

Hi there.
This kind of calls for a large study :)
 ***************************************************************************
 So here is the final question (read more for explanations):
 ***************************************************************************
 How does best_score_goal scale against win_only_goal as the number of playouts
increases (increasing for both, but at different rates, with best_score_goal
getting the larger increase at each iteration)?
 
***********************************************************************************

We've seen quite a few studies on AMAF lately.
I'd now like to ask a question about the behavior of
"UCT_like" Go analysers.

First, rather than talking about Go itself, I'd rather talk
about the sub-game of Go that the standard light simulator explores:
+ Suicide is not allowed
+ Playing in one's own pseudo-eye is not allowed
+ Passing is not allowed, unless no empty legal intersection is available.
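
The three rules above can be sketched as a single move-selection step. This is
only a minimal illustration, not any particular engine's code: the names
choose_playout_move, is_suicide and is_own_pseudo_eye are hypothetical, the two
predicates are assumed to be supplied by a board implementation, the choice is
assumed to be uniform, and ko handling is left out entirely.

```python
import random

PASS = None  # hypothetical sentinel meaning "pass"


def choose_playout_move(empty_points, is_suicide, is_own_pseudo_eye, rng=random):
    """Pick a uniformly random move allowed by the three rules, or PASS.

    empty_points      -- iterable of empty intersections (any hashable type)
    is_suicide        -- assumed predicate: would playing here be suicide?
    is_own_pseudo_eye -- assumed predicate: is this the mover's own pseudo-eye?
    """
    candidates = [
        p for p in empty_points
        if not is_suicide(p) and not is_own_pseudo_eye(p)
    ]
    if not candidates:
        # Passing is only allowed when nothing else is playable.
        return PASS
    return rng.choice(candidates)


# Toy usage: three empty points, one is suicide, one is an own pseudo-eye.
empty = [(0, 0), (0, 1), (2, 2)]
move = choose_playout_move(
    empty,
    is_suicide=lambda p: p == (0, 0),
    is_own_pseudo_eye=lambda p: p == (0, 1),
)
```

In the toy usage only (2, 2) survives both filters, so it is the move played.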

Now here is what I call "UCT_like" behavior:
- The node that got explored the most is the "best".
- The exploration strategy is chosen so as to get logarithmic regret on a
fixed distribution.
- The engine uses the information gathered on nodes to gradually shift the
probability of selecting the first moves of the playout from "all moves are
equiprobable" to "only the best move is considered". We set aside here any
consideration of "AMAF" or any other fancy stuff.

I tried to word this definition so that any plain UCT_bot using
standard light playouts would be conforming. Any epsilon-greedy exploration
strategy would also be conforming.
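
To make the two conforming strategies concrete, here is a rough sketch of child
selection at a single node. The function names and the flat lists of visit and
win counts are assumptions made for illustration. The UCT side uses the usual
UCB1 formula; the epsilon-greedy side uses a fixed epsilon for brevity, and
whether it meets the logarithmic-regret criterion depends on how epsilon is
scheduled, which is left open here.

```python
import math
import random


def ucb1_select(visits, wins, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 value."""
    for i, n in enumerate(visits):
        if n == 0:
            return i  # try every child once before using the formula
    total = sum(visits)
    return max(
        range(len(visits)),
        key=lambda i: wins[i] / visits[i]
        + c * math.sqrt(math.log(total) / visits[i]),
    )


def epsilon_greedy_select(visits, wins, epsilon, rng=random):
    """With probability epsilon explore uniformly, else take the best mean."""
    if rng.random() < epsilon:
        return rng.randrange(len(visits))
    means = [w / n if n else 0.0 for w, n in zip(wins, visits)]
    return max(range(len(means)), key=means.__getitem__)


def best_move(visits):
    """Final answer of a UCT_like engine: the most explored child."""
    return max(range(len(visits)), key=visits.__getitem__)
```

Both selectors only decide where the next playout starts; in either case the
move finally played is the one returned by best_move.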

-----------------------
Here are my questions:
-----------------------

First, how do the different parameters affect scalability (both against
gnu-go-lvl-1 and against conforming engines)? What is the shape of the
effect of each of those parameters, as long as we still have a conforming bot?
For example, is a basic epsilon-greedy-bot really different from a uct_bot in
its efficiency?

Now here is the real question: how do the win_only_goal and the
best_score_goal compare to each other in terms of games won? It is obvious
that it is easier to say "this game is won for Black" than to say "Black
can win this game by up to 4.5 points and no more, assuming perfect play from
both players".

By win_only_goal, I mean of course that the bot only tries to find the best
move with regard to winning the game under a fixed komi.
By best_score_goal, I mean that the bot dynamically tries to find the
absolute best move with regard to the theoretical min/max score.
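
As an illustration of how the two goals can rank moves differently, here is a
small sketch. It reads best_score_goal as "maximize the mean playout score", as
the subject line suggests; that reading is an assumption, and so are the
function names, the komi value and the playout margins, which are made up.

```python
def win_only_reward(black_margin, komi):
    """1.0 if Black wins once komi is subtracted, else 0.0."""
    return 1.0 if black_margin - komi > 0 else 0.0


def score_reward(black_margin, komi):
    """Final score after komi, from Black's point of view."""
    return black_margin - komi


def mean(values):
    return sum(values) / len(values)


komi = 7.5  # assumed fixed komi

# Made-up playout results (Black's margin on the board, before komi).
move_a = [8, 9, 8, 10]   # small but steady wins
move_b = [60, 2, 3, 55]  # big wins mixed with narrow losses

win_ratio_a = mean([win_only_reward(m, komi) for m in move_a])
win_ratio_b = mean([win_only_reward(m, komi) for m in move_b])
mean_score_a = mean([score_reward(m, komi) for m in move_a])
mean_score_b = mean([score_reward(m, komi) for m in move_b])

print(win_ratio_a, win_ratio_b)    # 1.0 0.5
print(mean_score_a, mean_score_b)  # 1.25 22.5
```

With these numbers win_ratio prefers move_a while mean_score prefers move_b,
which is the kind of disagreement the question is about.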

I think it is obvious that if a position is won, say for Black, and assuming
enough playouts are made, both win_only_goal and best_score_goal will find a
sure win.
