The garbage is generated by the indexing operations: each slice such as w[:,:,ti] or d[:,:,ti2] allocates a copy, and the += update allocates further temporaries for the product and the sum. Array views are planned for 0.4 and should solve this problem. For now, you can either manually devectorize the inner loop, or try the @devectorize macro from the Devectorize package, if it works in this case.
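A manually devectorized version could look roughly like the sketch below. It is only an illustration under some assumptions: the names errprop_devec! and out are hypothetical (out plays the role of deltas.d and is passed in directly), size(w,1) must equal size(d,1), and out must already have size (size(w,2), size(d,2), size(w,3)+size(d,3)-1). It has not been benchmarked against the original.

```julia
# Hypothetical devectorized variant of errprop!; `out` stands in for deltas.d.
# Assumes size(w,1) == size(d,1) and
# size(out) == (size(w,2), size(d,2), size(w,3) + size(d,3) - 1).
function errprop_devec!(out::Array{Float32,3}, w::Array{Float32,3}, d::Array{Float32,3})
    fill!(out, 0)                      # reset the accumulator in place
    for ti = 1:size(w,3), ti2 = 1:size(d,3)
        t = ti + ti2 - 1               # output time lag
        # Accumulate w[:,:,ti]' * d[:,:,ti2] into out[:,:,t] element by element,
        # so no slices or temporary matrices are created.
        for j = 1:size(d,2), i = 1:size(w,2)
            acc = 0.0f0
            for k = 1:size(w,1)
                acc += w[k,i,ti] * d[k,j,ti2]
            end
            out[i,j,t] += acc
        end
    end
    out
end
```

With this shape the call would be something like errprop_devec!(deltas.d, w, d). The loops are ordered so that the innermost index k and the output row index i run over contiguous memory in Julia's column-major layout.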
-viral

On Sunday, September 14, 2014 10:34:45 AM UTC+5:30, Michael Oliver wrote:
>
> Hi all,
> I've implemented a time delay neural network module and have been trying
> to optimize it now. This function is for propagating the error backwards
> through the network.
> The deltas.d is just a container for holding the errors so I can do things
> in place and don't have to keep initializing arrays. w and d are
> collections of weights and errors respectively for different time lags.
> This function gets called many many times and according to profiling,
> there is a lot of garbage collection being induced by the fourth line,
> specifically within multidimensional.jl getindex and setindex! and array.jl
> +
>
> function errprop!(w::Array{Float32,3}, d::Array{Float32,3}, deltas)
>     deltas.d[:] = 0.
>     for ti=1:size(w,3), ti2 = 1:size(d,3)
>         deltas.d[:,:,ti+ti2-1] += w[:,:,ti]'*d[:,:,ti2];
>     end
>     deltas.d
> end
>
> Any advice would be much appreciated!
> Best,
> Michael
>