Okay. So I will play along. How do you think you would coax AlphaGo into a
position with superko without AlphaGo having already simulated that pathway
as a less probable win space for itself when compared to other playing
trees which avoid it? IOW, how do you even get AlphaGo into the arcane
state in the first place, especially since uncertainty of outcome is
weighted against wins for itself?
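
To make concrete what I mean by "simulated that pathway as a less probable
win space", here is a rough, hypothetical Python sketch of PUCT-style child
selection. This is not AlphaGo's actual code; the names (Node, puct_select,
C_PUCT) and the numbers are mine, purely illustrative. The point it shows:
a branch whose playouts keep coming back with low win probability stops
attracting visits, so an arcane superko line tends never to be entered at
all.

    import math
    from dataclasses import dataclass

    C_PUCT = 1.5  # exploration constant; an assumed value, not AlphaGo's

    @dataclass
    class Node:
        prior: float            # policy-network prior for this child
        visits: int = 0
        value_sum: float = 0.0  # sum of simulated win probabilities

        @property
        def mean_value(self) -> float:
            return self.value_sum / self.visits if self.visits else 0.0

    def puct_select(parent_visits: int, children: dict) -> str:
        """Pick the child maximising mean win rate plus an exploration bonus.

        A branch whose simulations keep returning low win probability ends
        up with a low mean_value, so after a modest number of visits the
        search stops spending playouts on it."""
        def score(node: Node) -> float:
            exploration = (C_PUCT * node.prior *
                           math.sqrt(parent_visits) / (1 + node.visits))
            return node.mean_value + exploration
        return max(children, key=lambda move: score(children[move]))

    if __name__ == "__main__":
        # Two candidate moves: a normal line and a contrived "arcane" line
        # whose simulations have returned poor win rates so far.
        children = {
            "normal_line": Node(prior=0.6, visits=120, value_sum=66.0),  # ~55% wins
            "arcane_line": Node(prior=0.1, visits=30,  value_sum=6.0),   # ~20% wins
        }
        print(puct_select(parent_visits=150, children=children))  # -> "normal_line"
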

And since I know you cannot definitively answer that, it looks like we'll
just have to wait and see what happens. The professional players will be
open to all sorts of creative ideas on how to find weaknesses in AlphaGo.
But until they get free rein to play as many games as they like against it,
they won't begin to get a feel for strategies that expose probable
weaknesses (and even then we won't know with certainty, as it appears
AlphaGo is now generating its own theory: positions a human rates as a
weakness turn out not to be, and AlphaGo ends up leveraging them to its
advantage). Perhaps you can persuade one of the 9p players to explore your
idea of pushing the AlphaGo AI in this direction.

IOW, we are now well outside of provable spaces and into probabilistic
spaces. At the scales we are discussing, it is improbable we will ever see
anything approaching a mathematical proof about a full game of Go between
two experts, even if those experts are two competing AIs. We cannot
formally prove much simpler models, much less ones with the complexity of
a game of Go.
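
Just to put a rough number on "the scales we are discussing", a trivial
back-of-the-envelope Python snippet (mine, purely illustrative):

    raw_configurations = 3 ** 361  # empty/black/white per point, 19x19 board
    print(f"3^361 has {len(str(raw_configurations))} digits")  # 173 digits, ~1.7e172
    # The number of *legal* positions is somewhat smaller but still astronomical
    # (roughly 2e170 by Tromp's count), and the game tree above it is vastly
    # larger still -- nothing remotely amenable to exhaustive proof.
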


On Fri, Jan 6, 2017 at 12:55 AM, Robert Jasiek <jas...@snafu.de> wrote:

> On 05.01.2017 17:32, Jim O'Flaherty wrote:
>
>> I don't follow.
>>
>
> 1) "For each arcane position reached, there would now be ample data for
> AlphaGo to train on that particular pathway." is false. See below.
>
> 2) "two strategies. The first would be to avoid the state in the first
> place." Does AlphaGo have any strategy ever? If it does, does it have
> strategies of avoiding certain types of positions?
>
> 3) "the second would be to optimize play in that particular state." If you
> mean optimise play = maximise winning probability.
>
> But... optimising this is hard when (under positional superko) optimal
> play can be ca. 13,500,000 moves long and the tree to that is huge. Even
> TPU sampling can be lost then.
>
> Afterwards, there is still only one position from which to train. For NN
> learning, one position is not enough and cannot replace analysis by
> mathematical proofs ALA the NN does not emulate mathematical proving.
>
>
> --
> robert jasiek
_______________________________________________
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
