Hi Magnus,

Your post makes a great deal of sense. I agree with all the points you
have stated. I don't think you have ever made an illogical post the
way most of us have (myself included); your posts are always well
thought out and well worded.

I have a response to this comment:

    Still I think predicting the best moves is very important in the
    tree part, but this may be much less important in the playouts, and
    perhaps even detrimental as some people have experienced.

A class of "bad" moves is moves on the edge of the board when nothing
else is nearby. In other words, the candidate edge placement doesn't
"touch" any other stones of either color. In my experiments I found
that avoiding these moves in the tree search strengthens the program
significantly (I call them veto patterns, even though they are not
necessarily vetoed permanently).

But I also discovered that there seems to be no benefit whatsoever in
removing them from the play-outs. I have no real explanation for this,
but it does tell me that the play-outs are very different in nature
from the tree - you cannot simply reuse the same algorithms for
selecting and prioritizing moves.

I like your theory that eliminating noise is a good thing, and
probably the most important thing.

Your last comment is what I am most interested in, and what I have
spent quite some time looking at. I call it "learning on the fly."
The use of patterns is acceptable in this setting if those patterns
are learned during the game. The basic observation is that most of
what is learned during the game is relearned over and over and over
again.

All-moves-as-first is a form of this learning, though a very brittle
one of course. And I don't have a problem with imposing knowledge from
the outside if it's used only as a "kick start" - a way to overcome the
initial learning inertia - but not imposed in any absolute way.

- Don
 



Magnus Persson wrote:
> A recurrent concept popping up in discussions on how to improve
> playouts is "balance". So I would like to try to share my philosophy
> behind the playouts of Valkyria and how I define "balance" and how it
> relates to the evaluation of go positions.
>
> *Background
>
> In an old-school program the evaluation function would try to see
> which stones are safely connected to other stones of the same colour.
> Connected stones are called groups, and the program would probably
> also try to evaluate the safety of the groups by looking at the
> eyespace at hand, weak neighbouring groups and so on. This quickly
> gets very complicated. My old program Viking had several thousand
> handmade patterns for evaluating connectivity alone. This worked like
> a dream as long as the position consisted of patterns in the
> database... but in each and every game there were new situations and
> new patterns had to be added. A more robust method would be to use
> tactical search in the program to evaluate connectivity. The problem
> there is to ensure accuracy efficiently. Any tactical search tends to
> either become too time-consuming or to resort to guessing.
>
> *MC-evaluation
>
> Programs using uniformly random MC-eval favor very solid but
> inefficient shape, often building blocks of adjacent stones in the
> center of the board. The reason is that if stones are played more
> loosely they often get cut off and killed in the simulations.
>
> What we want instead is a program that can play efficient moves where
> stones are safely connected but occupy as much territory/eyespace as
> possible.
>
> The tree search (UCT) cannot solve this problem alone. Shapes created
> in a 19x19 game may remain untouched until the late endgame, and it
> is not possible to read out all shapes on the board. It is much
> better if secure shapes stay stable in the simulations.
>
> The way I implemented that in Valkyria is that the playout part is
> basically reactive to random moves that attack shapes on the board.
> It does not in any sense attempt to play the best move on the board.
> If it does not need to defend a shape it plays a uniformly random
> move somewhere. [Note that Valkyria also prunes really ugly moves,
> so it plays the first uniformly drawn move that is not pruned.]
>
> This is also how the pattern system works in Mogo as I understand it.
> If I remember correctly I would say that all Mogo patterns are very
> basic and common sense defenses against attacks on otherwise stable
> shapes.
>
> But there also has to be balance. Valkyria also punishes bad shape.
> That is, if a weak shape is already on the board, or a random move
> attacked two shapes simultaneously in the simulation, then the program
> may attack the weakness (or, in a way, it also reacts to the situation
> by defending against "the weak shape becoming stronger"). Often the
> same move that would have been the proper defence is played.
>
>
> *Eliminating noise rather than predicting the best move
>
> Nothing I wrote above is original or new to the readers of this list.
> But I want to make a distinction between systems that try to predict
> the best move and a system that only tries to eliminate noise from the
> otherwise very random simulations.
>
> Noise is eliminated when strong stones live and weak stones die almost
> always in the simulations. This way the random evaluations will mostly
> react to moves that matter in urgent fighting with shapes that are not
> yet stable. An MC program that does this should stop defending and
> attacking strong shapes and would require many fewer simulations to
> discriminate between good and bad moves. Valkyria2 and Valkyria3 have
> slightly different tree search algorithms but use the same playouts.
> Both versions need only 512 playouts per move to win 50% against
> Gnugo 3.7.10.
>
>
> Still I think predicting the best moves is very important in the tree
> part, but this may be much less important in the playouts, and perhaps
> even detrimental as some people have experienced.
>
> *19x19
>
> The best programs on 19x19 seem to focus the UCT search on local
> fighting. This temporarily overcomes the biases of the simulations
> locally. But the information gained locally about good shape in the
> tree is forgotten when the fighting moves elsewhere, although this
> knowledge can then be rediscovered later if the fighting comes back.
> Could a future improvement to 19x19 go be to use locally narrow
> searches that seed the playouts with strong patterns for the current
> shapes on the board? Maybe someone is already doing this? A really
> strong approach would be to eliminate the need for hard-coded
> patterns or offline pattern harvesting and let the program learn
> during the game.
>
> -Magnus
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
