>I have the feeling that the paper is important, but it is completely 
>obfuscated by the strange reinforcement learning notation and jargon. Can 
>anyone explain it in Go-programming words?

The most important thing in the paper is how to combine RAVE (AMAF)
information with the normal UCT value, like this:

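  uct_value = child->GetUctValue();    // ordinary UCT value of the child
  rave_value = child->GetRaveValue();  // RAVE (AMAF) value of the child
  // beta is 1 for an unvisited node and shrinks toward 0 as visits grow.
  // K is a tuning constant; 3.0 forces floating-point division.
  beta = sqrt(K / (3.0 * node->visits + K));
  uct_rave = beta * rave_value + (1 - beta) * uct_value;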
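
If it helps, here is a minimal self-contained sketch of the same blend that can be compiled and tried out. The function names RaveWeight and BlendUctRave and their parameters are made up for illustration; they are not taken from MoGo or from the paper, and the two input values are assumed to be computed elsewhere.

```cpp
#include <cassert>
#include <cmath>

// Weight given to the RAVE (AMAF) value.
// parent_visits: number of playouts through the parent node.
// k: tuning constant (hypothetical name), must be positive.
// Returns 1.0 when the node is unvisited and decays toward 0 as visits grow.
double RaveWeight(int parent_visits, double k) {
    assert(k > 0.0 && parent_visits >= 0);
    return std::sqrt(k / (3.0 * parent_visits + k));
}

// Linear blend of the two estimates, as in the snippet above.
// uct_value and rave_value are assumed to be computed by the caller.
double BlendUctRave(double uct_value, double rave_value,
                    int parent_visits, double k) {
    const double beta = RaveWeight(parent_visits, k);
    return beta * rave_value + (1.0 - beta) * uct_value;
}
```

For example, with k = 1000 the weight is 1 at zero visits and 0.5 after 1000 visits, so early on the move is ranked almost entirely by its RAVE value and the ordinary UCT value takes over later.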

You do not necessarily have to understand RLGO; they don't use it in the
online version of MoGo.

>It was pointed out by Donald Knuth in his paper on Alpha-Beta, that the - 
>simple - algorithm was not understood for a long time, because of the 
>inappropriate mathematical notation. For recursive functions, (pseudo-)code 
>is much better suited than the mathematical notation. Actually it's 
>pseudo-mathematic notation.
>Why is this inappropriate notation still used?

I agree that the pseudo-code is easy to understand.

--
Yamato
_______________________________________________
computer-go mailing list
computer-go@computer-go.org
http://www.computer-go.org/mailman/listinfo/computer-go/
