Hi,

About testing: it might be interesting to port more of Caffe's demos to Mocha to see how they work. Currently we already have two: MNIST and CIFAR-10.
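(For what it's worth, my rough timings were nothing more sophisticated than wall-clock timing of a training run. A minimal sketch, assuming a `net` and `solver` constructed as in the MNIST example quoted further down:)

```julia
# Rough wall-clock benchmark of one training run on the current backend.
# Note: the first run includes Julia's JIT compilation overhead, so for a
# fair backend comparison, time a second (warmed-up) run.
elapsed = @elapsed solve(solver, net)
println("training took $elapsed seconds")
```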
In terms of performance: the Mocha CPU backend with the native extension performs similarly to Caffe in my very rough tests. On MNIST, both GPU backends perform similarly. On CIFAR-10, which is a larger problem, the Mocha GPU backend is a little slower than Caffe. In theory they should perform similarly, as both use the cuDNN backend for the bottleneck computations (convolution and pooling). The reason Caffe is currently a bit faster might be that Caffe uses multiple CUDA streams to better exploit parallelism, a different implementation of the LRN layer, or some other aspect of the testing environment that I didn't control for. We might consider adding multi-stream support, too. For even larger examples like ImageNet, I haven't done any tests yet.

Best,
Chiyuan

On Friday, November 21, 2014 6:22:16 AM UTC-5, René Donner wrote:
>
> Hi,
>
> as I am just in the process of wrapping caffe, this looks really
> exciting! I will definitely try this out in the coming days.
>
> Are there any specific areas where you would like testing / feedback for
> now?
>
> Do you have an approximate feeling for how the performance compares to caffe?
>
> Cheers,
>
> Rene
>
>
> On 21.11.2014 at 03:00, Chiyuan Zhang <plu...@gmail.com> wrote:
>
> > https://github.com/pluskid/Mocha.jl
> >
> > Mocha is a Deep Learning framework for Julia, inspired by the C++ Deep
> > Learning framework Caffe.
> > Since this is the first time I post an announcement here, the change logs of the last two releases are listed:
> >
> > v0.0.2 2014.11.20
> >
> >   • Infrastructure
> >     • Ability to import Caffe-trained models
> >     • Properly release all allocated resources upon backend shutdown
> >   • Network
> >     • Sigmoid activation function
> >     • Power, Split, Element-wise layers
> >     • Local Response Normalization layer
> >     • Channel Pooling layer
> >     • Dropout layer
> >   • Documentation
> >     • Complete MNIST demo
> >     • Complete CIFAR-10 demo
> >     • Major part of the User's Guide
> >
> > v0.0.1 2014.11.13
> >
> >   • Backend
> >     • Pure Julia CPU
> >     • Julia + C++ Extension CPU
> >     • CUDA + cuDNN GPU
> >   • Infrastructure
> >     • Evaluate on validation set during training
> >     • Automatically save and recover from snapshots
> >   • Network
> >     • Convolution layer, mean and max pooling layers, fully connected layer, softmax loss layer
> >     • ReLU activation function
> >     • L2 regularization
> >   • Solver
> >     • SGD with momentum
> >   • Documentation
> >     • Demo code of LeNet on MNIST
> >     • Tutorial document on the MNIST demo (half finished)
> >
> > Below is a copy of the README file:
> > ----------------------------------------------------------------
> >
> > Mocha is a Deep Learning framework for Julia, inspired by the C++ Deep Learning framework Caffe. Mocha supports multiple backends:
> >
> >   • Pure Julia CPU Backend: implemented in pure Julia; runs out of the box without any external dependencies; reasonably fast on small models thanks to Julia's LLVM-based just-in-time (JIT) compiler and performance annotations that eliminate unnecessary bounds checking.
> >   • CPU Backend with Native Extension: some bottleneck computations (convolution and pooling) have C++ implementations. When compiled and enabled, it can be faster than the pure Julia backend.
> >   • CUDA + cuDNN: an interface to the NVidia® cuDNN GPU-accelerated deep learning library.
> >     When run on CUDA GPU devices, it can be much faster, depending on the size of the problem (e.g. on MNIST the CUDA backend is roughly 20 times faster than the pure Julia backend).
> >
> > Installation
> >
> > To install the release version, simply run
> >
> >     Pkg.add("Mocha")
> >
> > in the Julia console. To install the latest development version, run the following command instead:
> >
> >     Pkg.clone("https://github.com/pluskid/Mocha.jl.git")
> >
> > Then you can run the built-in unit tests with
> >
> >     Pkg.test("Mocha")
> >
> > to verify that everything is functioning properly on your machine.
> >
> > Hello World
> >
> > Please refer to the MNIST tutorial on how to prepare the MNIST dataset for the following example. The complete code for this example is located at examples/mnist/mnist.jl. See below for detailed documentation of other tutorials and the user's guide.
> >
> >     using Mocha
> >
> >     data  = HDF5DataLayer(name="train-data", source="train-data-list.txt", batch_size=64)
> >     conv  = ConvolutionLayer(name="conv1", n_filter=20, kernel=(5,5), bottoms=[:data], tops=[:conv])
> >     pool  = PoolingLayer(name="pool1", kernel=(2,2), stride=(2,2), bottoms=[:conv], tops=[:pool])
> >     conv2 = ConvolutionLayer(name="conv2", n_filter=50, kernel=(5,5), bottoms=[:pool], tops=[:conv2])
> >     pool2 = PoolingLayer(name="pool2", kernel=(2,2), stride=(2,2), bottoms=[:conv2], tops=[:pool2])
> >     fc1   = InnerProductLayer(name="ip1", output_dim=500, neuron=Neurons.ReLU(), bottoms=[:pool2], tops=[:ip1])
> >     fc2   = InnerProductLayer(name="ip2", output_dim=10, bottoms=[:ip1], tops=[:ip2])
> >     loss  = SoftmaxLossLayer(name="loss", bottoms=[:ip2, :label])
> >
> >     sys = System(CuDNNBackend())
> >     init(sys)
> >
> >     common_layers = [conv, pool, conv2, pool2, fc1, fc2]
> >     net = Net("MNIST-train", sys, [data, common_layers...,
> >                loss])
> >
> >     params = SolverParameters(max_iter=10000, regu_coef=0.0005, momentum=0.9,
> >                               lr_policy=LRPolicy.Inv(0.01, 0.0001, 0.75))
> >     solver = SGD(params)
> >
> >     # report training progress every 100 iterations
> >     add_coffee_break(solver, TrainingSummary(), every_n_iter=100)
> >
> >     # save snapshots every 5000 iterations
> >     add_coffee_break(solver, Snapshot("snapshots", auto_load=true),
> >                      every_n_iter=5000)
> >
> >     # show performance on test data every 1000 iterations
> >     data_test = HDF5DataLayer(name="test-data", source="test-data-list.txt", batch_size=100)
> >     accuracy  = AccuracyLayer(name="test-accuracy", bottoms=[:ip2, :label])
> >     test_net  = Net("MNIST-test", sys, [data_test, common_layers..., accuracy])
> >     add_coffee_break(solver, ValidationPerformance(test_net), every_n_iter=1000)
> >
> >     solve(solver, net)
> >
> >     destroy(net)
> >     destroy(test_net)
> >     shutdown(sys)
> >
> > Documentation
> >
> > The Mocha documentation is hosted on readthedocs.org.