Tim, thanks for the suggestion. To be honest, I primarily work with Julia 0.4 (0.4.0-dev+2523) and do everything there first. Unfortunately, there is no visible performance improvement (roughly the same performance) compared to 0.3.4 for this one-dimensional problem.
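For a one-dimensional signal, one way to sidestep the subarray question is to index the parent array directly inside the window loop, so that no window object (copy or SubArray) is created per step. The following is only a minimal sketch: the function name `rolling_min` is hypothetical, and it assumes the per-window operation is a minimum, which is just one pass of a rolling-ball style baseline and not the full algorithm. No timing is claimed for it.

```julia
# Hypothetical helper (not from the original script): windowed minimum
# computed by indexing `a` directly, so the only allocation is the output.
# Window i covers a[i:i+w], matching the loops in the original post.
function rolling_min(a, w)
    n = length(a) - w          # number of windows, same range as 1:length(a)-w
    out = similar(a, n)        # single allocation for the result
    for i in 1:n
        m = a[i]
        for j in i+1:i+w       # scan the remaining w elements of the window
            if a[j] < m
                m = a[j]
            end
        end
        out[i] = m
    end
    return out
end
```

Called as `rolling_min(a, w)` with `a = rand(300000)` and `w = 200`, it visits the same windows as the loops quoted below; wrapping the call in `@time` would give a like-for-like comparison.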
Tomas

On Tuesday, January 6, 2015 6:01:27 PM UTC+1, Tim Holy wrote:
>
> SubArrays work much better in julia 0.4; on the tasks you posted, you are
> likely to see substantially better performance.
>
> --Tim
>
> On Tuesday, January 06, 2015 08:15:03 AM Tomas Mikoviny wrote:
> > Hi,
> > I'm trying to optimise julia script for spectra baseline correction using
> > rolling ball algorithm
> > (http://linkinghub.elsevier.com/retrieve/pii/0168583X95009086,
> > http://cran.r-project.org/web/packages/baseline).
> > Profiling the code showed that the most time consuming part is actually
> > subarray pickup.
> > I was just wondering if there is any other possible speedup for this
> > problem?
> >
> > I've started initially with standard sub-indexing:
> >
> > a = rand(300000);
> > w = 200;
> >
> > @time for i in 1:length(a)-w
> >     a[i:i+w]
> > end
> > elapsed time: 0.387236571 seconds (645148344 bytes allocated, 56.46% gc time)
> >
> > Then I've tried directly with subarray function and it improved the runtime
> > significantly:
> >
> > @time for i in 1:length(a)-w
> >     sub(a,i:i+w)
> > end
> > elapsed time: 0.10720574 seconds (86321144 bytes allocated, 32.13% gc time)
> >
> > With approach to internally remove and add elements I've gained yet some
> > extra speed-up (eliminating gc):
> >
> > subset = a[1:1+w]
> >
> > @time for i in 2:length(a)-w
> >     splice!(subset,1)
> >     insert!(subset,w+1,a[i+w])
> > end
> > elapsed time: 0.067341484 seconds (33556344 bytes allocated)
> >
> > However I wonder if this is the end....
> >
> > And obligatory version info:
> >
> > Julia Version 0.3.4
> > Commit 3392026* (2014-12-26 10:42 UTC)
> > Platform Info:
> >   System: Darwin (x86_64-apple-darwin13.4.0)
> >   CPU: Intel(R) Core(TM) i5-4690 CPU @ 3.50GHz
> >   WORD_SIZE: 64
> >   BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
> >   LAPACK: libopenblas
> >   LIBM: libopenlibm
> >   LLVM: libLLVM-3.3