Re: [Computer-go] mit-develops-algorithm-to-accelerate-neural-networks-by-200x

2019-03-25 Thread Chaz G.
I concur with Brian. While the authors present genuine contributions,
meta-learning of this kind doesn't apply well to zero-style architectures.

I didn't get a lot from the article; the arXiv link for the underlying work is
https://arxiv.org/abs/1812.00332
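
For anyone curious what "searching the space of networks for faster ones" looks like in code, here is a minimal sketch of the general flavor: score each candidate architecture by accuracy minus a latency penalty. This is purely illustrative and not the paper's actual method (as I read it, ProxylessNAS folds latency into a differentiable objective rather than profiling and retraining every candidate separately), and every name below is made up.

# Toy latency-aware architecture search: every function is a stand-in.

def measure_latency_ms(arch):
    """Stand-in for profiling a forward pass on the target device."""
    return 0.5 * arch["blocks"] * (arch["channels"] / 64.0)

def validation_accuracy(arch):
    """Stand-in for briefly training the candidate and evaluating it."""
    return 1.0 - 1.0 / (arch["blocks"] * arch["channels"] ** 0.5)

def score(arch, latency_weight=0.01):
    # Reward accuracy, penalize measured latency, so the search prefers
    # architectures that are both strong and fast on the target hardware.
    return validation_accuracy(arch) - latency_weight * measure_latency_ms(arch)

# Hypothetical candidate space: depth and width choices for a small net.
candidates = [{"blocks": b, "channels": c} for b in (4, 8, 12)
              for c in (64, 128, 256)]
print(max(candidates, key=score))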

Best,
-Chaz

On Sun, Mar 24, 2019 at 4:17 PM Brian Lee wrote:

> This doesn't actually speed up the neural networks that much; it's a
> technique to more quickly brute-force the search space of possible neural
> networks for ones that execute faster while maintaining similar accuracy.
> Typical hype article.
>
> Anyway, the effort spent looking for bizarre architectures is probably
> better spent doing more iterations of zero-style self-play with the same
> architecture, since it seems likely we haven't maxed out the strength of
> our existing architectures.
>
> On Sun, Mar 24, 2019 at 6:29 PM Ray Tayek  wrote:
>
>>
>> https://www.extremetech.com/computing/288152-mit-develops-algorithm-to-accelerate-neural-networks-by-200x
>>
>> I wonder how much this would speed up Go programs?
>>
>> thanks
>>
>> --
>> Honesty is a very expensive gift. So, don't expect it from cheap people -
>> Warren Buffett
>> http://tayek.com/
>>
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Efficient Parameter Tuning Software

2019-01-14 Thread Chaz G.
Hi Simon,

Thanks for sharing. In my opinion, apart from discretizing the search
space, the N-Tuple system takes a very intuitive approach to
hyper-parameter optimization. The GitHub repo README notes that you're
working on an extended version to handle continuous parameters; what's
your general approach to that issue?
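
For the archives, here is my own rough sketch of how I understand the discrete version to work: a noisy fitness function, a 1-tuple statistics model, and a UCB-style pick of the next point to evaluate. It is only an illustration of the flavor of the algorithm; the ntbea repo is the authoritative version, and the real model also tracks pairs of parameters and the full tuple, not just single parameters as here.

import random
from collections import defaultdict

# Toy search space: three discrete hyper-parameters, each with a few values.
SPACE = [(0, 1, 2), (0, 1, 2, 3), (0, 1)]

def noisy_fitness(point):
    """Stand-in for an expensive, noisy evaluation (e.g. win rate over a few games)."""
    return sum(point) + random.gauss(0.0, 1.0)

# 1-tuple statistics: observed fitness sum and count per (dimension, value) pair.
stats = defaultdict(lambda: [0.0, 0])

def model_estimate(point, iteration, c=1.0):
    """Cheap UCB-style estimate of a point built from the per-parameter statistics."""
    total = 0.0
    for dim, val in enumerate(point):
        s, n = stats[(dim, val)]
        mean = s / n if n else 0.0
        explore = c * ((1.0 + iteration) ** 0.5) / (1.0 + n)
        total += mean + explore
    return total

current = tuple(random.choice(values) for values in SPACE)
for t in range(1, 201):
    fitness = noisy_fitness(current)          # one expensive evaluation per iteration
    for dim, val in enumerate(current):       # update the statistics model
        stats[(dim, val)][0] += fitness
        stats[(dim, val)][1] += 1
    # Propose neighbours by changing one parameter at a time, then move to the
    # neighbour the model likes best instead of re-evaluating them all.
    neighbours = [current[:d] + (v,) + current[d + 1:]
                  for d in range(len(SPACE)) for v in SPACE[d] if v != current[d]]
    current = max(neighbours, key=lambda p: model_estimate(p, t))

print("recommended setting:", current)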

Thanks,
-Chaz

On Sun, Jan 13, 2019 at 11:51 AM Simon Lucas  wrote:

> Hi all,
>
> The N-Tuple Bandit Evolutionary Algorithm aims to provide
> sample-efficient optimisation, especially for noisy problems.
>
> Software available in Java and Python:
>
> https://github.com/SimonLucas/ntbea
>
> It also provides stats on the value of each parameter setting
> and combinations of settings.
>
> Best wishes,
>
> Simon
>
> --
> Simon Lucas
> Professor of Artificial Intelligence
> Head of School
> Electronic Engineering and Computer Science
> Queen Mary University of London
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go

Re: [Computer-go] Significance of resignation in AGZ

2017-12-03 Thread Chaz G.
Hi Brian,

Thanks for sharing your genuinely interesting result. One question, though:
why would you train a non-"zero" program? Do you think your program, as a
result of your rules, would perform better than a "zero" one, or is imitating
the best-known algorithm simply inconvenient for your purposes?
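
(For anyone who wants to try the same thing, here is a minimal sketch of the filtering rule as I understand it from the description below: keep self-play positions only up to the first pass. The record format and names are invented for illustration, not taken from anyone's code, and keeping the pass position itself is just one reading of the rule.)

PASS = "pass"

def training_positions(moves):
    """Yield (position_index, move) training pairs up to and including the first pass.

    Positions after the first pass are the low-value fill-in phase described
    below; skipping them keeps the policy and value targets focused on the
    competitive part of the game.
    """
    for index, move in enumerate(moves):
        yield index, move
        if move == PASS:
            break

# Example: a toy game record whose tail is fill-in moves after the first pass.
game = ["D5", "E5", "C4", PASS, "A1", "A2", PASS]
print(list(training_positions(game)))  # stops at the first pass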

Best,
-Chaz

On Sat, Dec 2, 2017 at 7:31 PM, Brian Sheppard via Computer-go <
computer-go@computer-go.org> wrote:

> I implemented the ad hoc rule of not training on positions after the first
> pass, and my program is basically playing moves until the first pass is
> forced. (It is not a “zero” program, so I don’t mind ad hoc rules like
> this.)
>
>
>
> *From:* Computer-go [mailto:computer-go-boun...@computer-go.org] *On
> Behalf Of *Xavier Combelle
> *Sent:* Saturday, December 2, 2017 12:36 PM
> *To:* computer-go@computer-go.org
>
> *Subject:* Re: [Computer-go] Significance of resignation in AGZ
>
>
>
> It might make sense to enable a resignation threshold even at a very weak
> level. That way, the first thing the network learns would be not to resign
> too early (even before it learns not to pass).
>
>
>
> On 02/12/2017 at 18:17, Brian Sheppard via Computer-go wrote:
>
> I have some hard data now. My network’s initial training reached the same
> performance in half the iterations. That is, the steepness of skill gain in
> the first day of training was twice as great when I avoided training on
> fill-ins.
>
>
>
> This has all the usual caveats: only one run before/after, YMMV, etc.
>
>
>
> *From:* Brian Sheppard [mailto:sheppar...@aol.com ]
> *Sent:* Friday, December 1, 2017 5:39 PM
> *To:* 'computer-go' 
> 
> *Subject:* RE: [Computer-go] Significance of resignation in AGZ
>
>
>
> I didn’t measure precisely because as soon as I saw the training artifacts
> I changed the code. And I am not doing an AGZ-style experiment, so there
> are differences for sure. So I will give you a swag…
>
>
>
> Speed difference is maybe 20%-ish for 9x9 games.
>
>
>
> A frequentist approach will overstate the frequency of fill-in plays by a
> pretty large factor, because fill-in plays are guaranteed to occur in every
> game but are not best in the competitive part of the game. This will affect
> the speed of learning in the early going.
>
>
>
> The network will use some fraction (almost certainly <= 20%) of its
> capacity to improve accuracy on positions that will not contribute to its
> ultimate strength. This applies to both ordering and evaluation aspects.
>
> *From:* Andy [mailto:andy.olsen...@gmail.com ]
> *Sent:* Friday, December 1, 2017 4:55 PM
> *To:* Brian Sheppard  ;
> computer-go  
> *Subject:* Re: [Computer-go] Significance of resignation in AGZ
>
>
>
> Brian, do you have any experiments showing what kind of impact it has? It
> sounds like you have tried both with and without your ad hoc first pass
> approach?
>
> 2017-12-01 15:29 GMT-06:00 Brian Sheppard via Computer-go <
> computer-go@computer-go.org>:
>
> I have concluded that AGZ's policy of resigning "lost" games early is
> somewhat significant. Not as significant as using residual networks, for
> sure, but you wouldn't want to go without these advantages.
>
> The benefit cited in the paper is speed. Certainly a factor. I see two
> other advantages.
>
> First is that training does not include the "fill in" portion of the game,
> where every move is low value. I see a specific effect on the move ordering
> system, since it is based on frequency. By eliminating training on
> fill-ins, the prioritization function will not be biased toward moves that
> are not relevant to strong play. (That is, there are a lot of fill-in
> moves, which are usually not best in the interesting portion of the game,
> but occur a lot if the game is played out to the end, and therefore the
> move prioritization system would predict them more often.) My ad hoc
> alternative is to not train on positions after the first pass in a game.
> (Note that this does not qualify as "zero knowledge", but that is OK with
> me since I am not trying to reproduce AGZ.)
>
> Second is the positional evaluation is not training on situations where
> everything is decided, so less of the NN capacity is devoted to situations
> in which nothing can be gained.
>
> As always, YMMV.
>
> Best,
> Brian
>
>
> ___
> Computer-go mailing list
> Computer-go@computer-go.org
> http://computer-go.org/mailman/listinfo/computer-go
>

Re: [Computer-go] CPU vs GPU

2016-03-02 Thread Chaz G.
Rémi,

Nvidia launched the K20 GPU in late 2012. Since then, GPUs and their
convolution algorithms have improved considerably, while CPU performance
has been relatively stagnant. I would expect about a 10x improvement with
2016 hardware.

When it comes to training, it's the difference between running a job
overnight and running a job for the entire weekend.
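
(A back-of-the-envelope version of what I mean, using the numbers from the Caffe thread quoted below; the overnight/weekend hour figures are my own rough assumptions, purely for illustration.)

# 2014-era Caffe benchmark quoted below: K20 GPU vs. Haswell i7 CPU.
cpu_secs, gpu_secs = 752.0, 260.0
print(cpu_secs / gpu_secs)               # ~2.9x on that 2014 setup

# My guess for 2016 hardware: roughly 10x, i.e. the difference between
# a weekend-long CPU job and an overnight GPU job.
weekend_hours, overnight_hours = 60.0, 6.0
print(weekend_hours / overnight_hours)   # ~10x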

Best,
-Chaz

On Tue, Mar 1, 2016 at 1:03 PM, Rémi Coulom  wrote:

> How tremendous is it? On that page, I find this data:
>
> https://github.com/BVLC/caffe/pull/439
>
> "
> These are setup details:
>
>  * Desktop: CPU i7-4770 (Haswell), 3.5 GHz , DRAM - 16 GB; GPU K20.
>  * Ubuntu 12.04; gcc 4.7.3; MKL 11.1.
>
> Test:: imagenet, 100 train iteration (batch = 256).
>
>  * GPU: time= 260 sec / memory = 0.8 GB
>  * CPU: time= 752 sec / memory = 3.5 GiB //Memory data is from system
>monitor.
>
> "
>
> This does not look so tremendous to me. What kind of speed difference do
> you get for Go networks?
>
> Rémi
>
> On 03/01/2016 06:19 PM, Petr Baudis wrote:
>
>> On Tue, Mar 01, 2016 at 09:14:39AM -0800, David Fotland wrote:
>>
>>> Very interesting, but it should also mention Aya.
>>>
>>> I'm working on this as well, but I haven’t bought any hardware yet. My
>>> goal is not to get 7 dan on expensive hardware, but to get as much strength
>>> as I can on standard PC hardware. I'll be looking at much smaller nets
>>> that don’t need a GPU to run. I'll have to buy a GPU for training.
>>>
>> But I think most people who play Go are also fans of computer games that
>> often do use GPUs. :-)  Of course, it's something totally different from
>> NVidia Keplers, but still the step up from a CPU is tremendous.
>>
>> Petr Baudis
___
Computer-go mailing list
Computer-go@computer-go.org
http://computer-go.org/mailman/listinfo/computer-go