Silly me, ignoring all the commented out lines assuming they were 
comments...yes, this is almost certainly it.

On Saturday, April 4, 2015 at 3:24:50 AM UTC-5, Tim Holy wrote:
>
> Devectorization should never slow anything down. If it does, then you have 
> some other problem. Here, M is a global variable, and that's probably 
> what's 
> killing you. Performance tip #1: 
> http://docs.julialang.org/en/latest/manual/performance-tips/ 
>
> --Tim 
>
> On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote: 
> > Hi, 
> > 
> > I am struggling with an issue related to garbage collection taking up 
> the 
> > vast majority (>99%) of compute time on a simple nested for loop. Code 
> > excerpt below: 
> > 
> > # julia version 0.3.7 
> > # counts is an MxN matrix of Float64 
> > # N=15000 
> > # M=108 
> > # h_cols and c_cols are indices \in {1:M} 
> > using HypothesisTests, ArrayViews 
> > ratios = Array(Float64,M) 
> > function compute!(S,tot_i::Int64,i::Int64,j::Int64) 
> >     ratios = view(counts,:,i)./view(counts,:,j) 
> >     #for k=1:M 
> >     #  ratios[k] = counts[k,i]/counts[k,j] 
> >     #end 
> >     #ratios = counts[:,i]./counts[:,j] 
> >     t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols]) 
> >     S[tot_i] = (pvalue(t),i,j) 
> > end 
> > 
> > for i=1:N-1 
> >   @time for j=(i+1):N 
> >     tot_i += 1 
> >     compute!(S,tot_i,i,j) 
> >   end 
> > end 
> > 
> > The loop begins fast, output from time: 
> > 
> > elapsed time: 1.023850054 seconds (62027220 bytes allocated) 
> > elapsed time: 0.170916977 seconds (45785624 bytes allocated) 
> > elapsed time: 0.171598156 seconds (45782760 bytes allocated) 
> > elapsed time: 0.173866309 seconds (45779896 bytes allocated) 
> > elapsed time: 0.170267172 seconds (45777032 bytes allocated) 
> > elapsed time: 0.171754713 seconds (45774168 bytes allocated) 
> > elapsed time: 0.170110142 seconds (45771304 bytes allocated) 
> > elapsed time: 0.175199053 seconds (45768440 bytes allocated) 
> > elapsed time: 0.179893161 seconds (45765576 bytes allocated) 
> > elapsed time: 0.212172824 seconds (45762712 bytes allocated) 
> > elapsed time: 0.252750549 seconds (45759848 bytes allocated) 
> > elapsed time: 0.254874855 seconds (45756984 bytes allocated) 
> > elapsed time: 0.231003319 seconds (45754120 bytes allocated) 
> > elapsed time: 0.235060195 seconds (45751256 bytes allocated) 
> > elapsed time: 0.235379355 seconds (45748392 bytes allocated) 
> > elapsed time: 0.927622743 seconds (45746168 bytes allocated, 77.65% gc 
> time) 
> > elapsed time: 0.9132931 seconds (45742664 bytes allocated, 78.35% gc 
> time) 
> > 
> > But as soon as it starts doing gc the % time spent in increases almost 
> > indefinitely, output from time much later: 
> > 
> > elapsed time: 0.174122929 seconds (36239160 bytes allocated) 
> > elapsed time: 18.535572658 seconds (36236168 bytes allocated, 99.22% gc 
> > time) 
> > elapsed time: 19.189478819 seconds (36233176 bytes allocated, 99.26% gc 
> > time) 
> > elapsed time: 21.812889439 seconds (36230184 bytes allocated, 99.35% gc 
> > time) 
> > elapsed time: 22.182467723 seconds (36227192 bytes allocated, 99.30% gc 
> > time) 
> > elapsed time: 0.169849999 seconds (36224200 bytes allocated) 
> > 
> > The inner loop, despite iterating over fewer and fewer indices has 
> > massively increased the gc, and therefore overall, execution time. I 
> have 
> > tried many things, including creating the compute function, 
> devectorizing 
> > the ratios calculation (which really slowed things down), using view and 
> > sub in various places, profiling with --trace-allocation=all but I can't 
> > figure out why this happens or how to fix it. Doing gc for 22 seconds 
> > obviously kills the performance, and since there are about 22M 
> iterations 
> > this is prohibitive. Can anyone suggest something I can try to improve 
> the 
> > performance here? 
> > 
> > Thanks, 
> > Adam 
>
>

Reply via email to