Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Jim O'Flaherty
BTW, by improvement, I don't mean higher Go playing skill... I mean
appearing close to the same level of Go playing skill _per_ _move_ with far
less computational cost. It's the total game outcomes that will suffer.

On Sun, Jun 12, 2016 at 3:55 PM, Jim O'Flaherty 
wrote:

> The purpose is to see if there is some sort of "simplification" available
> to the emerged complex functions encoded in the weights. It is a typical
> reductionist strategy, especially where there is an attempt to converge on
> human conceptualization. Given the complexity of the nuances in Go, my
> intuition says that it will show excellent improvement in short term play
> at the cost of nuance in longer term play.
>
> On Sun, Jun 12, 2016 at 6:05 AM, Álvaro Begué 
> wrote:
>
>> I don't understand the point of using the deeper network to train the
>> shallower one. If you had enough data to be able to train a model with many
>> parameters, you have enough to train a model with fewer parameters.
>>
>> Álvaro.
>>
>>
>> On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka <
>> michael.marke...@gmail.com> wrote:
>>
>>> Might be worthwhile to try the faster, shallower policy network as a
>>> MCTS replacement if it were fast enough to support enough breadth.
>>> Could cut down on some of the scoring variations that confuse rather
>>> than inform the score expectation.
>>>
>>> On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
>>>  wrote:
>>> > I don't know how the added training compares to direct training of the
>>> > shallow network.
>>> > It's prob. not so important, because both should be much faster than the
>>> > training of the deep NN.
>>> > Accuracy should be slightly improved.
>>> >
>>> > Together, that might not justify the effort. But I think the fact that you
>>> > can create the mimicking NN, after the deep NN has been refined with self
>>> > play, is important.
>>> >
>>> > On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen <petri.t.pitka...@gmail.com>
>>> > wrote:
>>> >>
>>> >> Would the expected improvement be reduced training time or improved
>>> >> accuracy?
>>> >>
>>> >>
>>> >> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick :
>>> >>>
>>> >>> If I understood it right, the playout NN in AlphaGo was created by using
>>> >>> the same training set as the one used for the large NN that is used in the
>>> >>> tree. There would be an alternative though. I don't know if this is the best
>>> >>> source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
>>> >>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
>>> >>> For one thing, this seems to give better results than direct training on the
>>> >>> same set. But also, more importantly, this could be done after the large NN
>>> >>> has been improved with selfplay.
>>> >>> And after that, the selfplay could be restarted with the new playout NN.
>>> >>> So it seems to me, there is real room for improvement here.
>>> >>>
>>> >>> Stefan
>>> >>>

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Jim O'Flaherty
The purpose is to see if there is some sort of "simplification" available
for the complex functions that have emerged, encoded in the weights. It is a
typical reductionist strategy, especially where there is an attempt to
converge on human conceptualization. Given the complexity of the nuances in
Go, my intuition says that it will show excellent improvement in short-term
play at the cost of nuance in longer-term play.

On Sun, Jun 12, 2016 at 6:05 AM, Álvaro Begué 
wrote:

> I don't understand the point of using the deeper network to train the
> shallower one. If you had enough data to be able to train a model with many
> parameters, you have enough to train a model with fewer parameters.
>
> Álvaro.
>
>
> On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka <
> michael.marke...@gmail.com> wrote:
>
>> Might be worthwhile to try the faster, shallower policy network as a
>> MCTS replacement if it were fast enough to support enough breadth.
>> Could cut down on some of the scoring variations that confuse rather
>> than inform the score expectation.
>>
>> On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
>>  wrote:
>> > I don't know how the added training compares to direct training of the
>> > shallow network.
>> > It's prob. not so important, because both should be much faster than the
>> > training of the deep NN.
>> > Accuracy should be slightly improved.
>> >
>> > Together, that might not justify the effort. But I think the fact that you
>> > can create the mimicking NN, after the deep NN has been refined with self
>> > play, is important.
>> >
>> > On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen <petri.t.pitka...@gmail.com>
>> > wrote:
>> >>
>> >> Would the expected improvement be reduced training time or improved
>> >> accuracy?
>> >>
>> >>
>> >> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick :
>> >>>
>> >>> If I understood it right, the playout NN in AlphaGo was created by using
>> >>> the same training set as the one used for the large NN that is used in the
>> >>> tree. There would be an alternative though. I don't know if this is the best
>> >>> source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
>> >>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
>> >>> For one thing, this seems to give better results than direct training on the
>> >>> same set. But also, more importantly, this could be done after the large NN
>> >>> has been improved with selfplay.
>> >>> And after that, the selfplay could be restarted with the new playout NN.
>> >>> So it seems to me, there is real room for improvement here.
>> >>>
>> >>> Stefan
>> >>>

Re: [Computer-go] GRS

2016-06-12 Thread Jim O'Flaherty
Have you considered using either of the two high-level Go AIs (mentioned on
this email group this past week) as your end-of-game live-group estimator?
You could even use their scoring mechanism, too.


On Sun, Jun 12, 2016 at 8:02 AM, Henry Hemming  wrote:

> Unfortunately I had to make some changes to
> http://goratingserver.appspot.com , which broke the bot interface
> (updates now come in via web sockets). However, it should now be a lot
> easier to connect a bot to the server, as I created a jar file and a
> configuration ini file that connect a command-line GTP bot to the server
> just like on KGS. The jar file and an example ini file (for gnu-go), as
> well as the source code, are available at https://github.com/typohh/GTPRest .
>
> Hopefully the site looks prettier and works more reliably now. Dead stone
> estimation should be much more accurate and should only need cleanup in
> exceptional circumstances (fewer than 1 in 200 games). I have no intention
> of putting up ads on the site, and I only plan to charge for custom names,
> nothing else (to cover some of the server costs). Because of the changes in
> protocol, the official launch of the website also got delayed; if everything
> runs smoothly for a few more days, however, I will run a two-day tournament
> for beta testers to stress-test the site, followed by a public tournament
> (basically whoever gets the highest rating before the deadline, plus another
> category with an additional minimum number of games played during the
> tournament time frame). Bots will be welcome to participate in the
> tournament; more on that later.
>
> Right now there are some human beta testers and 4 bots playing on the
> site. The bots range from 15 kyu to 2 dan, which should guarantee a pairing
> for any player within ~15 minutes at most.
>
> Once again, any and all feedback is much appreciated.
>
> -Henry Hemming
>

[Computer-go] GRS

2016-06-12 Thread Henry Hemming
Unfortunately I had to make some changes to
http://goratingserver.appspot.com , which broke the bot interface (updates
now come in via web sockets). However, it should now be a lot easier to
connect a bot to the server, as I created a jar file and a configuration ini
file that connect a command-line GTP bot to the server just like on KGS.
The jar file and an example ini file (for gnu-go), as well as the source
code, are available at https://github.com/typohh/GTPRest .
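For anyone who wants to hook up their own engine, the engine side only needs
to speak plain GTP over stdin/stdout; the jar and ini specifics are documented
in the GTPRest repository. As a rough, illustrative Python sketch (not
GTPRest's own code, and with move generation stubbed out to a pass), a minimal
GTP engine that such a wrapper could drive might look like this:

    import sys

    def respond(payload=""):
        # GTP success replies start with "=" and end with a blank line
        sys.stdout.write("= %s\n\n" % payload)
        sys.stdout.flush()

    def main():
        board_size = 19
        for line in sys.stdin:
            parts = line.strip().split()
            if not parts:
                continue
            cmd, args = parts[0], parts[1:]
            if cmd == "protocol_version":
                respond("2")
            elif cmd == "name":
                respond("toy-bot")
            elif cmd == "version":
                respond("0.1")
            elif cmd == "list_commands":
                respond("protocol_version\nname\nversion\nlist_commands\n"
                        "boardsize\nclear_board\nkomi\nplay\ngenmove\nquit")
            elif cmd == "boardsize":
                board_size = int(args[0])   # a real engine would resize its board here
                respond()
            elif cmd in ("clear_board", "komi", "play"):
                respond()                   # accepted; a real engine updates its state here
            elif cmd == "genmove":
                respond("pass")             # placeholder: always pass
            elif cmd == "quit":
                respond()
                break
            else:
                sys.stdout.write("? unknown command\n\n")
                sys.stdout.flush()

    if __name__ == "__main__":
        main()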

Hopefully the site looks prettier and works more reliably now. Dead stone
estimation should be much more accurate and should only need cleanup in
exceptional circumstances (fewer than 1 in 200 games). I have no intention
of putting up ads on the site, and I only plan to charge for custom names,
nothing else (to cover some of the server costs). Because of the changes in
protocol, the official launch of the website also got delayed; if everything
runs smoothly for a few more days, however, I will run a two-day tournament
for beta testers to stress-test the site, followed by a public tournament
(basically whoever gets the highest rating before the deadline, plus another
category with an additional minimum number of games played during the
tournament time frame). Bots will be welcome to participate in the
tournament; more on that later.

Right now there are some human beta testers and 4 bots playing on the site.
The bots range from 15 kyu to 2 dan, which should guarantee a pairing for
any player within ~15 minutes at most.

Once again, any and all feedback is much appreciated.

-Henry Hemming

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Michael Markefka
I don't remember the content of the paper and currently can't look at the
PDF, but one possible explanation could be that a simple model trained
directly regularizes differently from one trained on the best-fit,
pre-smoothed output of a deeper net. The latter could perhaps offer better
local optimization and regularization at higher accuracy with an equal
parameter count.
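To make the difference concrete: direct training fits the shallow net to hard
one-hot move labels, while mimic training fits it to the deep net's full output
distribution. A minimal sketch of the two losses (a generic soft-target
formulation with an optional temperature, not necessarily what the paper or
AlphaGo actually used) could look like this in Python/PyTorch:

    import torch
    import torch.nn.functional as F

    def hard_label_loss(student_logits, move_index):
        # direct training: cross-entropy against the single recorded move
        return F.cross_entropy(student_logits, move_index)

    def soft_target_loss(student_logits, teacher_logits, temperature=2.0):
        # mimic training: match the deep net's (smoothed) distribution over all
        # points, which carries more information per position than one hard label
        log_p_student = F.log_softmax(student_logits / temperature, dim=1)
        p_teacher = F.softmax(teacher_logits / temperature, dim=1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

    # toy usage with random tensors standing in for real networks
    student_logits = torch.randn(8, 361)        # 8 positions, 19x19 move logits
    teacher_logits = torch.randn(8, 361)
    moves = torch.randint(0, 361, (8,))
    print(hard_label_loss(student_logits, moves).item())
    print(soft_target_loss(student_logits, teacher_logits).item())
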
On 12.06.2016 at 13:05, "Álvaro Begué" wrote:

> I don't understand the point of using the deeper network to train the
> shallower one. If you had enough data to be able to train a model with many
> parameters, you have enough to train a model with fewer parameters.
>
> Álvaro.
>
>
> On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka <
> michael.marke...@gmail.com> wrote:
>
>> Might be worthwhile to try the faster, shallower policy network as a
>> MCTS replacement if it were fast enough to support enough breadth.
>> Could cut down on some of the scoring variations that confuse rather
>> than inform the score expectation.
>>
>> On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
>>  wrote:
>> > I don't know how the added training compares to direct training of the
>> > shallow network.
>> > It's prob. not so important, because both should be much faster than the
>> > training of the deep NN.
>> > Accuracy should be slightly improved.
>> >
>> > Together, that might not justify the effort. But I think the fact that you
>> > can create the mimicking NN, after the deep NN has been refined with self
>> > play, is important.
>> >
>> > On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen <petri.t.pitka...@gmail.com>
>> > wrote:
>> >>
>> >> Would the expected improvement be reduced training time or improved
>> >> accuracy?
>> >>
>> >>
>> >> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick :
>> >>>
>> >>> If I understood it right, the playout NN in AlphaGo was created by using
>> >>> the same training set as the one used for the large NN that is used in the
>> >>> tree. There would be an alternative though. I don't know if this is the best
>> >>> source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
>> >>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
>> >>> For one thing, this seems to give better results than direct training on the
>> >>> same set. But also, more importantly, this could be done after the large NN
>> >>> has been improved with selfplay.
>> >>> And after that, the selfplay could be restarted with the new playout NN.
>> >>> So it seems to me, there is real room for improvement here.
>> >>>
>> >>> Stefan
>> >>>

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Álvaro Begué
I don't understand the point of using the deeper network to train the
shallower one. If you had enough data to be able to train a model with many
parameters, you have enough to train a model with fewer parameters.

Álvaro.


On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka <
michael.marke...@gmail.com> wrote:

> Might be worthwhile to try the faster, shallower policy network as a
> MCTS replacement if it were fast enough to support enough breadth.
> Could cut down on some of the scoring variations that confuse rather
> than inform the score expectation.
>
> On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
>  wrote:
> > I don't know how the added training compares to direct training of the
> > shallow network.
> > It's prob. not so important, because both should be much faster than the
> > training of the deep NN.
> > Accuracy should be slightly improved.
> >
> > Together, that might not justify the effort. But I think the fact that you
> > can create the mimicking NN, after the deep NN has been refined with self
> > play, is important.
> >
> > On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen <petri.t.pitka...@gmail.com>
> > wrote:
> >>
> >> Would the expected improvement be reduced training time or improved
> >> accuracy?
> >>
> >>
> >> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick :
> >>>
> >>> If I understood it right, the playout NN in AlphaGo was created by using
> >>> the same training set as the one used for the large NN that is used in the
> >>> tree. There would be an alternative though. I don't know if this is the best
> >>> source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
> >>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
> >>> For one thing, this seems to give better results than direct training on the
> >>> same set. But also, more importantly, this could be done after the large NN
> >>> has been improved with selfplay.
> >>> And after that, the selfplay could be restarted with the new playout NN.
> >>> So it seems to me, there is real room for improvement here.
> >>>
> >>> Stefan
> >>>

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Michael Markefka
Might be worthwhile to try the faster, shallower policy network as an MCTS
replacement if it were fast enough to support enough breadth. It could cut
down on some of the scoring variations that confuse rather than inform the
score expectation.
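As a rough sketch of what that could mean inside the playout phase
(illustrative names only; predict_move_probs, legal_moves, apply_move,
is_terminal and score are assumed to be supplied by the surrounding engine),
each playout move is sampled from the small net instead of from handcrafted
playout heuristics:

    import numpy as np

    def policy_playout(board, predict_move_probs, legal_moves, apply_move,
                       is_terminal, score, max_moves=400):
        # one playout in which every move is drawn from the small policy net
        for _ in range(max_moves):
            if is_terminal(board):
                break
            moves = legal_moves(board)
            if not moves:
                break
            probs = predict_move_probs(board)            # probability per board point
            p = np.array([probs[m] for m in moves], dtype=np.float64)
            if p.sum() <= 0:
                p = np.ones(len(moves))                  # fall back to uniform
            p = p / p.sum()
            move = moves[np.random.choice(len(moves), p=p)]
            board = apply_move(board, move)
        return score(board)                              # e.g. +1 if Black wins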

On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
 wrote:
> I don't know how the added training compares to direct training of the
> shallow network.
> It's prob. not so important, because both should be much faster than the
> training of the deep NN.
> Accuracy should be slightly improved.
>
> Together, that might not justify the effort. But I think the fact that you
> can create the mimicking NN, after the deep NN has been refined with self
> play, is important.
>
> On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen 
> wrote:
>>
>> Would the expected improvement be reduced training time or improved
>> accuracy?
>>
>>
>> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick
>> :
>>>
>>> If I understood it right, the playout NN in AlphaGo was created by using
>>> the same training set as the one used for the large NN that is used in the
>>> tree. There would be an alternative though. I don't know if this is the best
>>> source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
>>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
>>> For one thing, this seems to give better results than direct training on the
>>> same set. But also, more importantly, this could be done after the large NN
>>> has been improved with selfplay.
>>> And after that, the selfplay could be restarted with the new playout NN.
>>> So it seems to me, there is real room for improvement here.
>>>
>>> Stefan
>>>

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Stefan Kaitschick
I don't know how the added training compares to direct training of the
shallow network. It's probably not so important, because both should be much
faster than the training of the deep NN. Accuracy should be slightly
improved.

Together, that might not justify the effort. But I think the fact that you
can create the mimicking NN after the deep NN has been refined with
self-play is important.
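For what it's worth, the linked paper trains the shallow net to regress the
deep net's pre-softmax scores rather than the original labels, which is
exactly what makes that ordering possible: the targets come from the current,
self-play-refined deep net, not from the fixed game records. A minimal sketch
of that loop (network definitions and the position loader left as
placeholders) might be:

    import torch
    import torch.nn.functional as F

    def train_mimic(shallow_net, deep_net, position_loader, epochs=1, lr=1e-3):
        # fit the shallow net to the (self-play-refined) deep net's logits;
        # no move labels are needed, the deep net itself provides the targets
        opt = torch.optim.Adam(shallow_net.parameters(), lr=lr)
        deep_net.eval()
        for _ in range(epochs):
            for boards in position_loader:               # batches of board tensors
                with torch.no_grad():
                    target_logits = deep_net(boards)     # teacher scores every point
                pred_logits = shallow_net(boards)
                loss = F.mse_loss(pred_logits, target_logits)   # L2 on logits
                opt.zero_grad()
                loss.backward()
                opt.step()
        return shallow_net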

On Sun, Jun 12, 2016 at 9:51 AM, Petri Pitkanen 
wrote:

> Would the expected improvement be reduced training time or improved
> accuracy?
>
>
> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick  >:
>
>> If I understood it right, the playout NN in AlphaGo was created by using
>> the same training set as the one used for the large NN that is used in the
>> tree. There would be an alternative though. I don't know if this is the
>> best source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
>> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
>> For one thing, this seems to give better results than direct training on
>> the same set. But also, more importantly, this could be done after the
>> large NN has been improved with selfplay.
>> And after that, the selfplay could be restarted with the new playout NN.
>> So it seems to me, there is real room for improvement here.
>>
>> Stefan
>>

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Petr Baudis
On Sun, Jun 12, 2016 at 10:51:37AM +0300, Petri Pitkanen wrote:
> 2016-06-11 23:06 GMT+03:00 Stefan Kaitschick :
> 
> > If I understood it right, the playout NN in AlphaGo was created by using
> > the same training set as the one used for the large NN that is used in the
> > tree. There would be an alternative though. I don't know if this is the
> > best source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
> > The idea is to teach a shallow NN to mimic the outputs of a deeper net.
> > For one thing, this seems to give better results than direct training on
> > the same set. But also, more importantly, this could be done after the
> > large NN has been improved with selfplay.
> > And after that, the selfplay could be restarted with the new playout NN.
> > So it seems to me, there is real room for improvement here.
> 
> Would the expected improvement be reduced training time or improved
> accuracy?

Neither - a faster runtime move-scoring procedure, i.e. more board
positions scored throughout the game, plus a latency reduction
(i.e. board scoring available sooner after the move is expanded,
i.e. fewer playouts made without the NN scoring in the last few
moves).
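The speed difference is easy to get a feel for with a toy benchmark; the layer
counts and widths below are made up (nothing to do with AlphaGo's actual
networks), the point is only how strongly positions scored per second depends
on depth:

    import time
    import torch
    import torch.nn as nn

    def conv_policy_net(n_layers, channels=64):
        # 19x19 board, 4 input feature planes, 1x1 head producing 361 move scores
        layers = [nn.Conv2d(4, channels, 3, padding=1), nn.ReLU()]
        for _ in range(n_layers - 1):
            layers += [nn.Conv2d(channels, channels, 3, padding=1), nn.ReLU()]
        layers += [nn.Conv2d(channels, 1, 1), nn.Flatten()]
        return nn.Sequential(*layers)

    def positions_per_second(net, batch=1, reps=200):
        x = torch.randn(batch, 4, 19, 19)
        with torch.no_grad():
            net(x)                                       # warm-up
            start = time.perf_counter()
            for _ in range(reps):
                net(x)
        return reps * batch / (time.perf_counter() - start)

    deep = conv_policy_net(12)
    shallow = conv_policy_net(3)
    print("deep   :", round(positions_per_second(deep)), "positions/s")
    print("shallow:", round(positions_per_second(shallow)), "positions/s")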

-- 
Petr Baudis
If you have good ideas, good data and fast computers,
you can do almost anything. -- Geoffrey Hinton

Re: [Computer-go] Creating the playout NN

2016-06-12 Thread Petri Pitkanen
Would the expected improvement be reduced training time or improved
accuracy?


2016-06-11 23:06 GMT+03:00 Stefan Kaitschick :

> If I understood it right, the playout NN in AlphaGo was created by using
> the same training set as the one used for the large NN that is used in the
> tree. There would be an alternative though. I don't know if this is the
> best source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
> The idea is to teach a shallow NN to mimic the outputs of a deeper net.
> For one thing, this seems to give better results than direct training on
> the same set. But also, more importantly, this could be done after the
> large NN has been improved with selfplay.
> And after that, the selfplay could be restarted with the new playout NN.
> So it seems to me, there is real room for improvement here.
>
> Stefan
>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go