An interesting question remains: how could one make more powerful inferences by observing the skill the players show in each action they take, rather than just the final outcome of each game?
If you saw me play a single game of tennis against Federer you’d have no doubt as to which way the next 100 games would go.

From: Computer-go [mailto:computer-go-boun...@computer-go.org] On Behalf Of Álvaro Begué
Sent: 22 March 2016 17:21
To: computer-go <computer-go@computer-go.org>
Subject: Re: [Computer-go] Congratulations to AlphaGo (Statistical significance of results)

A very simple-minded analysis is that, if the null hypothesis is that AlphaGo and Lee Sedol are equally strong, AlphaGo would do as well as we observed or better 15.625% of the time. That's a p-value that even social scientists don't get excited about. :)

Álvaro.

On Tue, Mar 22, 2016 at 12:48 PM, Jason House <jason.james.ho...@gmail.com> wrote:

Statistical significance requires a null hypothesis... I think it's probably easiest to ask the question of if I assume an ELO difference of x, how likely it's a 4-1 result?

Turns out that 220 to 270 ELO has a 41% chance of that result.
  >= 10% is -50 to 670 ELO
  >= 1% is -250 to 1190 ELO

My numbers may be slightly off from eyeballing things in a simple excel sheet. The idea and ranges should be clear though.

On Mar 22, 2016 12:00 PM, "Lucas, Simon M" <s...@essex.ac.uk> wrote:

Hi all,

I was discussing the results with a colleague outside of the Game AI area the other day when he raised the question (which applies to nearly all sporting events, given the small sample size involved) of statistical significance - suggesting that on another week the result might have been 4-1 to Lee Sedol.

I pointed out that in games of skill there's much more to judge than just the final outcome of each game, but wondered if anyone had any better (or worse :) arguments - or had even engaged in the same type of conversation.
With AlphaGo winning 4 games to 1, from a simplistic stats point of view (with the prior assumption of a fair coin toss) you'd not be able to claim much statistical significance, yet most (me included) believe that AlphaGo is a genuinely better Go player than Lee Sedol.

From a stats viewpoint you can use this approach:

http://www.inference.phy.cam.ac.uk/itprnn/book.pdf
(see section 3.2 on page 51)

but given even priors it won't tell you much.

Anyone know any good references for refuting this type of argument - the fact is of course that a game of Go is nothing like a coin toss. Games of skill tend to base their outcomes on the result of many (in the case of Go many hundreds of) individual actions.

Best wishes,

Simon

_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
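For what it's worth, the remark that even priors "won't tell you much" can be checked in a few lines. The sketch below is only an illustration. It assumes each game is an independent trial with a fixed win probability p (which a real Go match surely is not), and it compares a "fair coin" hypothesis (p = 1/2) against a hypothesis that puts a uniform prior on p, loosely in the spirit of the approach referenced above. The function names are made up for this example.

```python
from fractions import Fraction
from math import comb, factorial


def evidence_fair(wins, losses):
    """Probability of one specific win/loss sequence if p = 1/2."""
    return Fraction(1, 2 ** (wins + losses))


def evidence_uniform(wins, losses):
    """Probability of the same sequence averaged over a uniform prior on p.

    Integral of p^w (1-p)^l over [0, 1] equals w! l! / (w + l + 1)!
    """
    return Fraction(factorial(wins) * factorial(losses),
                    factorial(wins + losses + 1))


def posterior_prob_above(wins, losses, x):
    """P(p > x | data) under a uniform prior, i.e. a Beta(w+1, l+1) posterior.

    For integer parameters the Beta CDF can be written as a binomial tail sum.
    """
    a, b = wins + 1, losses + 1
    n = a + b - 1
    cdf = sum(comb(n, j) * x ** j * (1 - x) ** (n - j) for j in range(a, n + 1))
    return 1 - cdf


wins, losses = 4, 1  # the observed match score (assumed independent games)
bayes_factor = evidence_uniform(wins, losses) / evidence_fair(wins, losses)
posterior_mean = Fraction(wins + 1, wins + losses + 2)
prob_stronger = posterior_prob_above(wins, losses, Fraction(1, 2))

print("Bayes factor (uniform vs fair):", bayes_factor, float(bayes_factor))
print("Posterior mean of p:", posterior_mean, float(posterior_mean))
print("P(p > 1/2 | 4-1):", prob_stronger, float(prob_stronger))
```

Under these assumptions the Bayes factor comes out at 16/15, so with even prior odds the two hypotheses end up almost equally plausible after five games, which fits the comment that this approach won't tell you much here.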
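The two back-of-envelope figures earlier in the thread are easy to reproduce. This is a rough sketch, assuming independent games, no draws, and the usual logistic Elo expected-score formula. The thread does not say which formula the spreadsheet used, so that part is an assumption, as are the function names.

```python
from math import comb


def prob_exact(n, k, p):
    """Probability of exactly k wins in n independent games."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def prob_at_least(n, k, p):
    """Probability of k or more wins in n independent games."""
    return sum(prob_exact(n, j, p) for j in range(k, n + 1))


def elo_win_prob(diff):
    """Expected score for a player rated `diff` points above the opponent.

    Uses the standard logistic Elo curve (an assumption, see the note above).
    """
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))


# Null hypothesis: equal strength, so p = 0.5 per game.
p_exactly_4 = prob_exact(5, 4, 0.5)     # 5/32
p_4_or_more = prob_at_least(5, 4, 0.5)  # 6/32

print(f"P(exactly 4-1 | p = 0.5)   = {p_exactly_4:.5f}")
print(f"P(4-1 or better | p = 0.5) = {p_4_or_more:.5f}")

# Chance of a 4-1 result for a few assumed Elo differences.
for diff in (0, 220, 245, 270):
    p = elo_win_prob(diff)
    print(f"{diff:4d} Elo: p(win) = {p:.3f}, p(4-1) = {prob_exact(5, 4, p):.3f}")
```

With p = 0.5 this gives 15.625% for exactly 4-1 and 18.75% for 4-1 or better, so the 15.625% quoted in the thread corresponds to the exact score rather than the tail. Around a 245-point difference the 4-1 probability comes out near 41%, consistent with the spreadsheet estimate.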