>
> The purpose is to see if there is some sort of "simplification" available
> to the emerged complex functions encoded in the weights. It is a typical
> reductionist strategy, especially where there is an attempt to converge on
> human conceptualization.
>
>
That's an interesting way to look at
>
> BTW, by improvement, I don't mean higher Go playing skill...I mean
> appearing close to the same level of Go playing skill _per_ _move_ with far
> less computational cost. It's the total game outcomes that will fall.
>
>
For the playouts, you always need a relatively inexpensive computation.
BTW, by improvement, I don't mean higher Go playing skill...I mean
appearing close to the same level of Go playing skill _per_ _move_ with far
less computational cost. It's the total game outcomes that will fall.
On Sun, Jun 12, 2016 at 3:55 PM, Jim O'Flaherty wrote:
The purpose is to see if there is some sort of "simplification" available
to the emerged complex functions encoded in the weights. It is a typical
reductionist strategy, especially where there is an attempt to converge on
human conceptualization. Given the complexity of the nuances in Go, my
I don't remember the content of the paper and currently can't look at the
PDF, but one possible explanation could be that a simple model trained
directly may regularize differently from one trained on the best-fit
pre-smoothed output of a deeper net. The second could perhaps offer better
local
I don't understand the point of using the deeper network to train the
shallower one. If you have enough data to train a model with many
parameters, you have enough to train a model with fewer parameters.
Álvaro.
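
For concreteness, the setup being questioned above (fit a small net to the
outputs of a big one rather than to the original labels) can be sketched
roughly as follows. This is only a toy illustration in NumPy, not AlphaGo's
pipeline: the "deep" teacher is a fixed random network standing in for an
already-trained model, and the names teacher_logits, student_forward and
train_mimic, the layer sizes and the learning rate are all made up for the
example. Regressing on the teacher's logits with squared error follows the
general idea of the arXiv paper linked in this thread, as far as I
understand it.

```python
import numpy as np

rng = np.random.default_rng(0)


def teacher_logits(x, params):
    """Stand-in for an already-trained deep net (two tanh hidden layers)."""
    w1, w2, w3 = params
    return np.tanh(np.tanh(x @ w1) @ w2) @ w3


def student_forward(x, w_in, w_out):
    """Shallow student: one tanh hidden layer, linear output (logits)."""
    h = np.tanh(x @ w_in)
    return h, h @ w_out


def train_mimic(x, targets, hidden=32, steps=300, lr=0.05):
    """Fit the student to the teacher's logits by plain gradient descent."""
    n, d = x.shape
    k = targets.shape[1]
    w_in = rng.normal(scale=0.1, size=(d, hidden))
    w_out = rng.normal(scale=0.1, size=(hidden, k))
    losses = []
    for _ in range(steps):
        h, out = student_forward(x, w_in, w_out)
        err = out - targets
        # Loss: 0.5 * squared error summed over outputs, averaged over samples.
        losses.append(0.5 * float(np.sum(err ** 2)) / n)
        grad_out = h.T @ err / n
        grad_h = (err @ w_out.T) * (1.0 - h ** 2)  # backprop through tanh
        grad_in = x.T @ grad_h / n
        w_out -= lr * grad_out
        w_in -= lr * grad_in
    return (w_in, w_out), losses


# Hypothetical sizes: 16 input features, 10 output classes ("moves").
d, k, n = 16, 10, 512
teacher_params = (
    rng.normal(scale=0.25, size=(d, 64)),
    rng.normal(scale=0.125, size=(64, 64)),
    rng.normal(scale=0.125, size=(64, k)),
)
x = rng.normal(size=(n, d))
soft_targets = teacher_logits(x, teacher_params)  # no human labels used here

(student_w_in, student_w_out), losses = train_mimic(x, soft_targets)
_, student_out = student_forward(x, student_w_in, student_w_out)
# Fraction of positions where student and teacher pick the same top move.
agreement = float(np.mean(student_out.argmax(1) == soft_targets.argmax(1)))
print(f"loss {losses[0]:.3f} -> {losses[-1]:.3f}, top-1 agreement {agreement:.2f}")
```

Whether this beats training the small net directly on the original labels is
exactly the open question in this thread; the sketch only shows that the
student's training signal is the teacher's real-valued output for every
class, not a single correct move per position.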
On Sun, Jun 12, 2016 at 5:52 AM, Michael Markefka wrote:
Might be worthwhile to try the faster, shallower policy network as an
MCTS replacement if it were fast enough to support enough breadth.
Could cut down on some of the scoring variations that confuse rather
than inform the score expectation.
On Sun, Jun 12, 2016 at 10:56 AM, Stefan Kaitschick
I don't know how the added training compares to direct training of the
shallow network.
It's probably not so important, because both should be much faster than the
training of the deep NN.
Accuracy should be slightly improved.
Together, that might not justify the effort. But I think the fact that
Would the expected improvement be reduced training time or improved
accuracy?
2016-06-11 23:06 GMT+03:00 Stefan Kaitschick:
> If I understood it right, the playout NN in AlphaGo was created by using
> the same training set as the one used for the large NN that is
If I understood it right, the playout NN in AlphaGo was created by using
the same training set as the one used for the large NN that is used in the
tree. There would be an alternative though. I don't know if this is the
best source, but here is one example: https://arxiv.org/pdf/1312.6184.pdf
The