Why can't you reuse the same self-played games but score them with a different komi value? The policy network does not use komi to choose its moves, so it should make no difference.
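The rescoring idea above can be sketched in a few lines: if each stored game records Black's raw score margin (before komi), the win/loss label can be recomputed for any komi after the fact. This is a hypothetical illustration, not any engine's actual storage format; the `winner` helper and the example margin are assumptions.

```python
# Hypothetical sketch: relabelling a stored self-play game under a new komi.
# Assumes each game record keeps Black's raw score margin (Black's area
# score minus White's, before komi), which is all that is needed to
# recompute the win/loss label for any komi value.

def winner(black_margin, komi):
    """Return 'B' if Black wins once komi is applied, else 'W'."""
    return 'B' if black_margin - komi > 0 else 'W'

# Example: Black controls 184 of 361 points on a 19x19 board, so the raw
# margin is 184 - 177 = +7 for Black.
margin = 7.0

print(winner(margin, 6.5))  # Black wins by 0.5 under komi 6.5
print(winner(margin, 7.5))  # White wins by 0.5 under komi 7.5
```

The same game thus yields different value-network training labels depending on the komi it is scored against, which is the crux of the question.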
> On 21/03/2017 21:08, David Ongaro wrote:
>>> But how would you fix it? Isn't it that you'd need to retrain your value
>>> network from scratch?
>>
>> I would think so as well. But some months ago I already made a
>> proposal on this list to mitigate that problem: instead of training a
>> different value network for each komi, add a "komi adjustment" value as
>> input during the training phase. That should be much more effective,
>> since the "win/lost" evaluation shouldn't change for many (most?)
>> positions for small adjustments, but the resulting value network (when
>> trained for different komi adjustments) has a much greater range of
>> applicability.
>
> The problem is not the training of the network itself (~2-4 weeks of
> letting a program someone else wrote run in the background, the easiest
> thing ever in computer go), or whether you use a komi input or a
> separate network; the problem is getting data for the different komi
> values.
>
> Note that if getting data is not a problem, then a separate network
> would perform better than your proposal.
>
> --
> GCP
> _______________________________________________
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
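The "komi adjustment as input" proposal quoted above can be sketched as follows. This is a toy illustration under assumed shapes (a flattened 19x19 board plane, a linear layer with a tanh value head), not the architecture of any particular engine; the point is only that one network conditioned on a komi scalar can, in principle, cover a range of komi values instead of needing one network per komi.

```python
import numpy as np

# Toy sketch of a value head that takes a komi-adjustment scalar as an
# extra input feature alongside the board features. All shapes and the
# normalisation constant are illustrative assumptions.

rng = np.random.default_rng(0)

def value_net(board_features, komi, w, b):
    """Linear value head over [board features, komi]; output in (-1, 1)."""
    x = np.concatenate([board_features, [komi / 7.5]])  # crude komi scaling
    return np.tanh(x @ w + b)

n_features = 361  # flattened 19x19 board plane
w = rng.normal(scale=0.01, size=n_features + 1)
b = 0.0

# Same position evaluated under two komi values gives two different
# win-probability estimates from one set of weights.
board = rng.integers(-1, 2, size=n_features).astype(float)  # -1/0/+1 stones
print(value_net(board, 6.5, w, b))
print(value_net(board, 7.5, w, b))
```

GCP's objection still applies to this sketch: the weights only learn a meaningful dependence on the komi input if the training data actually contains games scored under different komi values.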