Just a quickstart here, and I'm not one of the real experts, so it may contain inaccuracies. Perhaps you already know all this, perhaps not. The basis of many game-playing programs (like chess programs) is the minimax (or negamax) algorithm. It is a simple algorithm by itself and you can find a lot of background and examples for chess or other games if you google it. Basically, minimax explores all possible variations stemming from the current position and then selects the best move. This algorithm works in principle, but there is a problem with it: For interesting games like chess and go, it will usually take the program billions of years to find the best move, given a game position. In the past decades several adaptations of the minimax algorithm were invented to enable the algorithm to find a reasonably good move in a timespan that humans are prepared to wait for (like evaluation functions, alfabeta pruning and transposition tables). The program keeps investigating possibilities for some time and then the move seeming best so far is played. The longer the program is allowed to think (and the faster the computer), the better the move it will come up with. These adaptations work by reducing the number of possibilities the program explores, without weakening the program too much. The program must ignore many possibilities to save time (this is called pruning), but the more possibilities it ignores, the higher the risk that it accidentally ignores the best move. Also, there is allways the danger of the horizon effect: The program misinterprets the situation because it simply doesn't get enough time to fully investigate every possibility. The challenge for the programmer is to strike an optimal balance between time consumption and losing by accidentally ignoring or misinterpreting the best move. Some game programs are even able to finetune themselves to optimize this balance over time (learning). A combination of these adaptations have proven to be well enough to enable chess programs on current hardware to beat any human with reasonable "thinking" time. But computer-go is a harder problem than computer-chess. The adaptations that are succesful in computer-chess are just not enough for computer-go, mainly because there ar far more possibilities (and I mean FAR more) to explore in go. But now there is a recent adaptation that has proven to be fairly succesful in computer-go: MonteCarlo playouts. The program reduces the work to be done by gambling a number of times and measuring the average payoff. If the average payoff is good, the move is assumed to be good for the time being. Also, it doesn't automatically ignore moves with a low payoff (remember the risk of ignoring the best move). The program just postpones further investigation of seemingly bad moves until the moves seeming good initially give more and more dissapointing payoffs after further investigation. Although this invention has had only moderate success in computer-chess, it has proven to be fairly succesful in computer-go. But at the heart of a MonteCarlo go program, you can still recognize the minimax algorithm. Best Regards, Dave ________________________________
Van: [EMAIL PROTECTED] namens tony tang Verzonden: wo 12-11-2008 2:23 Aan: computer-go@computer-go.org Onderwerp: [computer-go] Monte carlo play Greetings all go gurus, I am new to programming go, could some one explain to me how a monte carlo based evalution manages to play random games by itself? ie: who/what is the oppoent which supplies the opposing moves which allows another move to be randomly played after making the initial move at the root. I am implementing in java, is there a package/framework which allows me to train my software. Thank you Kind regards Tony ________________________________ Read amazing stories to your kids on Messenger Try it Now! <http://clk.atdmt.com/UKM/go/117588488/direct/01/>
_______________________________________________ computer-go mailing list computer-go@computer-go.org http://www.computer-go.org/mailman/listinfo/computer-go/