Re: [Computer-go] mit-develops-algorithm-to-accelerate-neural-networks-by-200x
Concur w/ Brian. While the authors present genuine contributions,
meta-learning doesn't apply well to Zero-style architectures. I didn't
get much from the article itself; the arXiv link for the underlying work
is https://arxiv.org/abs/1812.00332

Best,
-Chaz

On Sun, Mar 24, 2019 at 4:17 PM Brian Lee wrote:

> This doesn't actually speed up the neural networks that much; it's a
> technique to more quickly brute-force the search space of possible
> neural networks for ones that execute faster while maintaining similar
> accuracy. Typical hype article.
>
> Anyway, the effort spent looking for bizarre architectures is probably
> better spent doing more iterations of zero-style self-play with the
> same architecture, since it seems likely we haven't maxed out the
> strength of our existing architectures.
>
> On Sun, Mar 24, 2019 at 6:29 PM Ray Tayek wrote:
>
>> https://www.extremetech.com/computing/288152-mit-develops-algorithm-to-accelerate-neural-networks-by-200x
>>
>> I wonder how much this would speed up Go programs?
>>
>> thanks
>>
>> --
>> Honesty is a very expensive gift. So, don't expect it from cheap
>> people - Warren Buffett
>> http://tayek.com/

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
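[For readers skimming the archive: a toy sketch of the naive
brute-force search Brian describes, which the paper makes tractable by
relaxing it into a differentiable form. The search space and the
train_and_eval / measure_latency callables below are hypothetical
stand-ins, not the paper's actual setup.]

    import random

    # Hypothetical discrete search space of architecture choices.
    SPACE = {
        "blocks": [6, 9, 12],
        "channels": [64, 128, 256],
        "kernel_size": [3, 5, 7],
    }

    def random_arch():
        return {name: random.choice(opts) for name, opts in SPACE.items()}

    def search(budget, train_and_eval, measure_latency, min_accuracy):
        # Keep the fastest architecture that is still accurate enough.
        best, best_latency = None, float("inf")
        for _ in range(budget):  # each trial costs a (proxy) training run
            arch = random_arch()
            accuracy = train_and_eval(arch)
            latency = measure_latency(arch)
            if accuracy >= min_accuracy and latency < best_latency:
                best, best_latency = arch, latency
        return best

The cost of the loop is dominated by the training run inside it, which
is why speeding up the search says nothing about the speed of the
networks it finds.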
Re: [Computer-go] Efficient Parameter Tuning Software
Hi Simon,

Thanks for sharing. In my opinion, apart from the need to discretize the
search space, the N-Tuple system takes a very intuitive approach to
hyper-parameter optimization. The GitHub repo's README notes that you're
working on an extended version to handle continuous parameters; what's
your general approach to that issue?

Thanks,
-Chaz

On Sun, Jan 13, 2019 at 11:51 AM Simon Lucas wrote:

> Hi all,
>
> The N-Tuple Bandit Evolutionary Algorithm aims to provide
> sample-efficient optimisation, especially for noisy problems.
>
> Software available in Java and Python:
>
> https://github.com/SimonLucas/ntbea
>
> It also provides stats on the value of each parameter setting and
> combinations of settings.
>
> Best wishes,
>
> Simon
>
> --
> Simon Lucas
> Professor of Artificial Intelligence
> Head of School
> Electronic Engineering and Computer Science
> Queen Mary University of London

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
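[A minimal sketch of the NTBEA core loop for readers of the archive,
using 1-tuples only; the actual library also models pairs and the full
parameter vector, and its API differs. `evaluate` is a hypothetical
noisy fitness function over a full parameter vector.]

    import math
    import random

    def ntbea(search_space, evaluate, iterations=200, neighbours=20, k=2.0):
        # search_space: one list of allowed values per parameter.
        stats = [{v: [0, 0.0] for v in dim} for dim in search_space]
        total = 0
        current = [random.choice(dim) for dim in search_space]

        def ucb(cand):
            # Sum UCB scores of the 1-tuples making up the candidate.
            score = 0.0
            for dim_stats, v in zip(stats, cand):
                n, s = dim_stats[v]
                if n == 0:
                    return float("inf")  # optimistic: try unseen settings
                score += s / n + k * math.sqrt(math.log(total + 1) / n)
            return score

        for _ in range(iterations):
            fitness = evaluate(current)  # one noisy evaluation per iteration
            total += 1
            for dim_stats, v in zip(stats, current):
                dim_stats[v][0] += 1     # visit count for this setting
                dim_stats[v][1] += fitness
            # Mutate one parameter at a time; move to the neighbour the
            # bandit model rates highest, without evaluating it directly.
            pool = []
            for _ in range(neighbours):
                cand = list(current)
                i = random.randrange(len(cand))
                cand[i] = random.choice(search_space[i])
                pool.append(cand)
            current = max(pool, key=ucb)
        return current

Example call, with a made-up match-result function:

    space = [[0.5, 1.0, 2.0],    # UCT exploration constant
             [100, 400, 1600],   # playouts per move
             [0.0, 0.25, 0.5]]   # some pruning threshold
    best = ntbea(space, evaluate=my_noisy_match_result, iterations=500)

The sample efficiency comes from the last step: neighbours are ranked
by the learned tuple statistics, not by spending real evaluations.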
Re: [Computer-go] Significance of resignation in AGZ
Hi Brian,

Thanks for sharing your genuinely interesting result. One question,
though: why would you train a non-"zero" program? Do you think your
program, as a result of your rules, would perform better than a zero
approach, or is imitating the best-known algorithm simply inconvenient
for your purposes?

Best,
-Chaz

On Sat, Dec 2, 2017 at 7:31 PM, Brian Sheppard via Computer-go
<computer-go@computer-go.org> wrote:

> I implemented the ad hoc rule of not training on positions after the
> first pass, and my program is basically playing moves until the first
> pass is forced. (It is not a "zero" program, so I don't mind ad hoc
> rules like this.)
>
> From: Computer-go [mailto:computer-go-boun...@computer-go.org] On
> Behalf Of Xavier Combelle
> Sent: Saturday, December 2, 2017 12:36 PM
> To: computer-go@computer-go.org
> Subject: Re: [Computer-go] Significance of resignation in AGZ
>
> It might make sense to enable the resignation threshold even at a very
> weak level. That way, the first thing the network would learn is not
> to resign too early (even before it learns not to pass).
>
> Le 02/12/2017 à 18:17, Brian Sheppard via Computer-go a écrit :
>
> I have some hard data now. My network's initial training reached the
> same performance in half the iterations. That is, the steepness of
> skill gain in the first day of training was twice as great when I
> avoided training on fill-ins.
>
> This has all the usual caveats: only one run before/after, YMMV, etc.
>
> From: Brian Sheppard [mailto:sheppar...@aol.com]
> Sent: Friday, December 1, 2017 5:39 PM
> To: 'computer-go'
> Subject: RE: [Computer-go] Significance of resignation in AGZ
>
> I didn't measure precisely, because as soon as I saw the training
> artifacts I changed the code. And I am not doing an AGZ-style
> experiment, so there are differences for sure. So I will give you a
> swag...
>
> The speed difference is maybe 20%-ish for 9x9 games.
>
> A frequentist approach will overstate the frequency of fill-in plays
> by a pretty large factor, because fill-in plays are guaranteed to
> occur in every game but are not best in the competitive part of the
> game. This will affect the speed of learning in the early going.
>
> The network will use some fraction (almost certainly <= 20%) of its
> capacity to improve accuracy on positions that will not contribute to
> its ultimate strength. This applies to both the ordering and
> evaluation aspects.
>
> From: Andy [mailto:andy.olsen...@gmail.com]
> Sent: Friday, December 1, 2017 4:55 PM
> To: Brian Sheppard; computer-go
> Subject: Re: [Computer-go] Significance of resignation in AGZ
>
> Brian, do you have any experiments showing what kind of impact it has?
> It sounds like you have tried both with and without your ad hoc
> first-pass approach?
>
> 2017-12-01 15:29 GMT-06:00 Brian Sheppard via Computer-go
> <computer-go@computer-go.org>:
>
> I have concluded that AGZ's policy of resigning "lost" games early is
> somewhat significant. Not as significant as using residual networks,
> for sure, but you wouldn't want to go without either of these
> advantages.
>
> The benefit cited in the paper is speed. Certainly a factor. I see two
> other advantages.
>
> First, training does not include the "fill-in" portion of the game,
> where every move is low value. I see a specific effect on the move
> ordering system, since it is based on frequency. By eliminating
> training on fill-ins, the prioritization function will not be biased
> toward moves that are not relevant to strong play. (That is, there are
> a lot of fill-in moves, which are usually not best in the interesting
> portion of the game, but occur a lot if the game is played out to the
> end, and therefore the move prioritization system would predict them
> more often.) My ad hoc alternative is to not train on positions after
> the first pass in a game. (Note that this does not qualify as "zero
> knowledge", but that is OK with me since I am not trying to reproduce
> AGZ.)
>
> Second, the positional evaluation is not trained on situations where
> everything is decided, so less of the NN capacity is devoted to
> situations in which nothing can be gained.
>
> As always, YMMV.
>
> Best,
> Brian

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
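[To make Brian's ad hoc rule concrete: a minimal sketch of the data
filter, assuming a hypothetical game record stored as a list of
(position, move) pairs; his actual code surely differs.]

    PASS = "pass"  # hypothetical move encoding

    def training_positions(game_record):
        """Keep only positions before the first pass.

        Everything from the first pass onward is the low-value fill-in
        phase, which would otherwise bias the frequency-based move
        prioritization toward moves irrelevant to strong play.
        """
        positions = []
        for position, move in game_record:
            if move == PASS:
                break  # drop this position and everything after it
            positions.append((position, move))
        return positions

The game is still played out to the end (so the winner is known for the
value target); only the training set is truncated.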
Re: [Computer-go] CPU vs GPU
Rémi,

Nvidia launched the K20 GPU in late 2012. Since then, GPUs and their
convolution algorithms have improved considerably, while CPU performance
has been relatively stagnant. I would expect roughly a 10x improvement
with 2016 hardware. For training, that's the difference between running
a job overnight and running it for the entire weekend.

Best,
-Chaz

On Tue, Mar 1, 2016 at 1:03 PM, Rémi Coulom wrote:

> How tremendous is it? On that page, I find this data:
>
> https://github.com/BVLC/caffe/pull/439
>
> "
> These are setup details:
>
> * Desktop: CPU i7-4770 (Haswell), 3.5 GHz, DRAM 16 GB; GPU K20.
> * Ubuntu 12.04; gcc 4.7.3; MKL 11.1.
>
> Test: imagenet, 100 train iterations (batch = 256).
>
> * GPU: time = 260 sec / memory = 0.8 GB
> * CPU: time = 752 sec / memory = 3.5 GiB (memory data is from the
>   system monitor)
> "
>
> This does not look so tremendous to me. What kind of speed difference
> do you get for Go networks?
>
> Rémi
>
> On 03/01/2016 06:19 PM, Petr Baudis wrote:
>
>> On Tue, Mar 01, 2016 at 09:14:39AM -0800, David Fotland wrote:
>>
>>> Very interesting, but it should also mention Aya.
>>>
>>> I'm working on this as well, but I haven't bought any hardware yet.
>>> My goal is not to get 7 dan on expensive hardware, but to get as
>>> much strength as I can on standard PC hardware. I'll be looking at
>>> much smaller nets that don't need a GPU to run. I'll have to buy a
>>> GPU for training.
>>
>> But I think most people who play Go are also fans of computer games
>> that often do use GPUs. :-) Of course, it's something totally
>> different from NVidia Keplers, but still the step up from a CPU is
>> tremendous.
>>
>> Petr Baudis

___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go
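[If anyone wants to measure this on a Go-shaped workload rather than
imagenet, something like the following gives a quick like-for-like
number. This sketch uses PyTorch as an assumption — the quoted figures
came from Caffe — and the network is a made-up stand-in for a Go net;
swap in your real model for a meaningful result.]

    import time

    import torch
    import torch.nn as nn

    # Made-up stand-in for a Go network: a stack of 3x3 convolutions
    # over 19x19 feature planes.
    net = nn.Sequential(
        *[nn.Conv2d(64, 64, 3, padding=1) for _ in range(8)]
    ).eval()
    x = torch.randn(16, 64, 19, 19)  # batch of 16 positions, 64 planes

    devices = ["cpu"] + (["cuda"] if torch.cuda.is_available() else [])
    for device in devices:
        model, inp = net.to(device), x.to(device)
        with torch.no_grad():
            model(inp)  # warm-up (lazy allocation, cuDNN autotuning)
            if device == "cuda":
                torch.cuda.synchronize()  # GPU kernels are asynchronous
            start = time.time()
            for _ in range(50):
                model(inp)
            if device == "cuda":
                torch.cuda.synchronize()
        ms = (time.time() - start) / 50 * 1000
        print(f"{device}: {ms:.1f} ms per forward pass")

The synchronize calls matter: without them the GPU timing measures only
kernel launches, which makes the GPU look arbitrarily fast.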