Re: [julia-users] Extreme garbage collection time

2015-04-08 Thread cdm

just a moment to applaud this exemplary thread !!!

excellent closing post ...

love the Julia community.

best,

cdm


Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Tim Holy
Out of curiosity, what happens if you take the compute! function out of main 
and pass ratios (and any other needed variables) as another argument?

--Tim
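
In other words, roughly this shape (a schematic sketch, not Adam's actual
code; fill_ratios! is a made-up name): the buffer becomes an explicit
argument instead of a variable captured from the enclosing scope, so its
type is known inside the hot loop.

```julia
# The buffer is passed in, not captured from an enclosing function,
# so the compiler can infer its type inside the loop.
function fill_ratios!(ratios::Array{Float64,1}, counts::Array{Float64,2},
                      i::Int, j::Int)
    for k = 1:size(counts, 1)
        ratios[k] = counts[k,i] / counts[k,j]
    end
    return ratios
end

counts = [1.0 2.0; 4.0 8.0]   # toy 2x2 matrix
ratios = zeros(2)
fill_ratios!(ratios, counts, 1, 2)
println(ratios)   # element-wise column-1 / column-2 ratios
```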

On Monday, April 06, 2015 09:14:24 PM Adam Labadorf wrote:
 Thanks for the replies. I took your suggestions (and reread the scope
 section of the docs) and am still experiencing the gc creep. Below is the
 complete program, with the notable changes that I wrapped the main
 computation in a function and eliminated all references to global variables
 inside. I'm also using the most recent nightly build of 0.4. Overall this
 version of the code is much faster, but there is still significant slowdown
 as the computation progresses. Is this expected? Do you see anything I'm
 doing wrong?
 
 # julia v0.4.0-dev+4159
 using HypothesisTests, ArrayViews
 
 type Result
   pvalue::Float64
   i::Int64
   j::Int64
 end
 
 function readtable(fn)
   fp = readcsv(fn)
   columns = fp[1,2:end]
   rows = fp[:,2:end]
   data = float(fp[2:end,2:end])
   return (columns,rows,data)
 end
 @time (cols, genes, counts) =
 readtable("../all_mRNA_nonzero_norm_counts_trim.csv")
 
 h_cols = find([c[1] == 'H' for c in cols])
 c_cols = find([c[1] == 'C' for c in cols])
 
 # filter out genes with zeros since it messes with the ratios
 nonzero = mapslices(x -> !any(x.==0),counts,2)
 counts = counts[find(nonzero),[h_cols;c_cols]]
 
 # slices seem to be faster
 h_cols = 1:length(h_cols)
 c_cols = (length(h_cols)+1):size(counts,2)
 
 # arrays are stored in column order, transpose counts to make
 # accessing more efficient
 counts = transpose(counts)
 
 genes = genes[find(nonzero)]
 
 function main(counts,genes,h_cols,c_cols)
 
   N = size(genes,1)
   M = size(counts,1)
 
   ratios = Array(Float64,M)
  function compute!(S::Array{Result,1},counts::Array{Float64,2},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64)
    for k=1:M
      ratios[k] = counts[k,i]/counts[k,j]
    end
    t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols))
    S[tot_i] = Result(pvalue(t),i,j)
  end
 
   tot_i = 0
   tot = (N^2-N)/2
 
   S = Array(Result,round(Int,tot))
 
   for i=1:N-1
 @time for j=(i+1):N
   tot_i += 1
   compute!(S,counts,tot_i,i,j,h_cols,c_cols,M)
 end
   end
 
 end
 
 S = main(counts,genes,h_cols,c_cols)
 
 
 And the output:
 
 elapsed time: 0.427719149 seconds (23 MB allocated, 39.90% gc time in 2
 pauses with 0 full sweep)
 elapsed time: 0.031006382 seconds (14 MB allocated)
 elapsed time: 0.131579099 seconds (14 MB allocated, 73.64% gc time in 1
 pauses with 1 full sweep)
 elapsed time: 0.140120717 seconds (14 MB allocated, 73.58% gc time in 1
 pauses with 0 full sweep)
 elapsed time: 0.030248237 seconds (14 MB allocated)
 ...
 elapsed time: 0.507894781 seconds (5 MB allocated, 97.65% gc time in 1
 pauses with 0 full sweep)
 elapsed time: 0.011821657 seconds (5 MB allocated)
 elapsed time: 0.011610651 seconds (5 MB allocated)
 elapsed time: 0.011816277 seconds (5 MB allocated)
 elapsed time: 0.50779098 seconds (5 MB allocated, 97.65% gc time in 1
 pauses with 0 full sweep)
 elapsed time: 0.011997168 seconds (5 MB allocated)
 elapsed time: 0.011721667 seconds (5 MB allocated)
 elapsed time: 0.011561071 seconds (5 MB allocated)
 
 On Saturday, April 4, 2015 at 12:38:46 PM UTC-4, Patrick O'Leary wrote:
  Silly me, ignoring all the commented out lines assuming they were
  comments...yes, this is almost certainly it.
  
  On Saturday, April 4, 2015 at 3:24:50 AM UTC-5, Tim Holy wrote:
  Devectorization should never slow anything down. If it does, then you
  have
  some other problem. Here, M is a global variable, and that's probably
  what's
  killing you. Performance tip #1:
  http://docs.julialang.org/en/latest/manual/performance-tips/
  
  --Tim
  
  On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote:
   Hi,
   
   I am struggling with an issue related to garbage collection taking up
  
  the
  
   vast majority (99%) of compute time on a simple nested for loop. Code
   excerpt below:
   
   # julia version 0.3.7
   # counts is an MxN matrix of Float64
   # N=15000
   # M=108
   # h_cols and c_cols are indices \in {1:M}
   using HypothesisTests, ArrayViews
   ratios = Array(Float64,M)
   function compute!(S,tot_i::Int64,i::Int64,j::Int64)
   
   ratios = view(counts,:,i)./view(counts,:,j)
   #for k=1:M
   #  ratios[k] = counts[k,i]/counts[k,j]
   #end
   #ratios = counts[:,i]./counts[:,j]
   t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols])
   S[tot_i] = (pvalue(t),i,j)
   
   end
   
   for i=1:N-1
   
 @time for j=(i+1):N
 
   tot_i += 1
   compute!(S,tot_i,i,j)
 
 end
   
   end
   
   The loop begins fast, output from time:
   
   elapsed time: 1.023850054 seconds (62027220 bytes allocated)
   elapsed time: 0.170916977 seconds (45785624 bytes allocated)
   elapsed time: 0.171598156 seconds (45782760 bytes allocated)
   elapsed time: 0.173866309 

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Kristoffer Carlsson
Can you post your csv file so we can actually run the program?

On Tuesday, April 7, 2015 at 6:21:18 AM UTC+2, Adam Labadorf wrote:

 Thanks for the replies. I took your suggestions (and reread the scope 
 section of the docs) and am still experiencing the gc creep. Below is the 
 complete program, with the notable changes that I wrapped the main 
 computation in a function and eliminated all references to global variables 
 inside. I'm also using the most recent nightly build of 0.4. Overall this 
 version of the code is much faster, but there is still significant slowdown 
 as the computation progresses. Is this expected? Do you see anything I'm 
 doing wrong?

 # julia v0.4.0-dev+4159
 using HypothesisTests, ArrayViews

 type Result
   pvalue::Float64
   i::Int64
   j::Int64
 end

 function readtable(fn)
   fp = readcsv(fn)
   columns = fp[1,2:end]
   rows = fp[:,2:end]
   data = float(fp[2:end,2:end])
   return (columns,rows,data)
 end
 @time (cols, genes, counts) = 
 readtable("../all_mRNA_nonzero_norm_counts_trim.csv")

 h_cols = find([c[1] == 'H' for c in cols])
 c_cols = find([c[1] == 'C' for c in cols])

 # filter out genes with zeros since it messes with the ratios
 nonzero = mapslices(x -> !any(x.==0),counts,2)
 counts = counts[find(nonzero),[h_cols;c_cols]]

 # slices seem to be faster
 h_cols = 1:length(h_cols)
 c_cols = (length(h_cols)+1):size(counts,2)

 # arrays are stored in column order, transpose counts to make
 # accessing more efficient
 counts = transpose(counts)

 genes = genes[find(nonzero)]

 function main(counts,genes,h_cols,c_cols)

   N = size(genes,1)
   M = size(counts,1)

   ratios = Array(Float64,M)
  function compute!(S::Array{Result,1},counts::Array{Float64,2},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64)
   for k=1:M
 ratios[k] = counts[k,i]/counts[k,j]
   end
   t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols))
   S[tot_i] = Result(pvalue(t),i,j)
   end

   tot_i = 0
   tot = (N^2-N)/2

   S = Array(Result,round(Int,tot))

   for i=1:N-1
 @time for j=(i+1):N
   tot_i += 1
   compute!(S,counts,tot_i,i,j,h_cols,c_cols,M)
 end
   end

 end

 S = main(counts,genes,h_cols,c_cols)


 And the output:

 elapsed time: 0.427719149 seconds (23 MB allocated, 39.90% gc time in 2 
 pauses with 0 full sweep)
 elapsed time: 0.031006382 seconds (14 MB allocated)
 elapsed time: 0.131579099 seconds (14 MB allocated, 73.64% gc time in 1 
 pauses with 1 full sweep)
 elapsed time: 0.140120717 seconds (14 MB allocated, 73.58% gc time in 1 
 pauses with 0 full sweep)
 elapsed time: 0.030248237 seconds (14 MB allocated)
 ...
 elapsed time: 0.507894781 seconds (5 MB allocated, 97.65% gc time in 1 
 pauses with 0 full sweep)
 elapsed time: 0.011821657 seconds (5 MB allocated)
 elapsed time: 0.011610651 seconds (5 MB allocated)
 elapsed time: 0.011816277 seconds (5 MB allocated)
 elapsed time: 0.50779098 seconds (5 MB allocated, 97.65% gc time in 1 
 pauses with 0 full sweep)
 elapsed time: 0.011997168 seconds (5 MB allocated)
 elapsed time: 0.011721667 seconds (5 MB allocated)
 elapsed time: 0.011561071 seconds (5 MB allocated)


 On Saturday, April 4, 2015 at 12:38:46 PM UTC-4, Patrick O'Leary wrote:

 Silly me, ignoring all the commented out lines assuming they were 
 comments...yes, this is almost certainly it.

 On Saturday, April 4, 2015 at 3:24:50 AM UTC-5, Tim Holy wrote:

 Devectorization should never slow anything down. If it does, then you 
 have 
 some other problem. Here, M is a global variable, and that's probably 
 what's 
 killing you. Performance tip #1: 
 http://docs.julialang.org/en/latest/manual/performance-tips/ 

 --Tim 

 On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote: 
  Hi, 
  
  I am struggling with an issue related to garbage collection taking up 
 the 
  vast majority (99%) of compute time on a simple nested for loop. Code 
  excerpt below: 
  
  # julia version 0.3.7 
  # counts is an MxN matrix of Float64 
  # N=15000 
  # M=108 
  # h_cols and c_cols are indices \in {1:M} 
  using HypothesisTests, ArrayViews 
  ratios = Array(Float64,M) 
  function compute!(S,tot_i::Int64,i::Int64,j::Int64) 
  ratios = view(counts,:,i)./view(counts,:,j) 
  #for k=1:M 
  #  ratios[k] = counts[k,i]/counts[k,j] 
  #end 
  #ratios = counts[:,i]./counts[:,j] 
  t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols]) 
  S[tot_i] = (pvalue(t),i,j) 
  end 
  
  for i=1:N-1 
@time for j=(i+1):N 
  tot_i += 1 
  compute!(S,tot_i,i,j) 
end 
  end 
  
  The loop begins fast, output from time: 
  
  elapsed time: 1.023850054 seconds (62027220 bytes allocated) 
  elapsed time: 0.170916977 seconds (45785624 bytes allocated) 
  elapsed time: 0.171598156 seconds (45782760 bytes allocated) 
  elapsed time: 0.173866309 seconds (45779896 bytes allocated) 
  elapsed time: 0.170267172 seconds (45777032 bytes allocated) 
  elapsed time: 0.171754713 

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Mauro

On Tue, 2015-04-07 at 06:14, Adam Labadorf alabad...@gmail.com wrote:
 Thanks for the replies. I took your suggestions (and reread the scope 
 section of the docs) and am still experiencing the gc creep. Below is the 
 complete program, with the notable changes that I wrapped the main 
 computation in a function and eliminated all references to global variables 
 inside. I'm also using the most recent nightly build of 0.4. Overall this 
 version of the code is much faster, but there is still significant slowdown 
 as the computation progresses. Is this expected? Do you see anything I'm 
 doing wrong?

 # julia v0.4.0-dev+4159
 using HypothesisTests, ArrayViews

 type Result
   pvalue::Float64
   i::Int64
   j::Int64
 end

 function readtable(fn)
   fp = readcsv(fn)
   columns = fp[1,2:end]
   rows = fp[:,2:end]
   data = float(fp[2:end,2:end])
   return (columns,rows,data)
 end
 @time (cols, genes, counts) = 
 readtable("../all_mRNA_nonzero_norm_counts_trim.csv")

 h_cols = find([c[1] == 'H' for c in cols])
 c_cols = find([c[1] == 'C' for c in cols])

 # filter out genes with zeros since it messes with the ratios
 nonzero = mapslices(x -> !any(x.==0),counts,2)
 counts = counts[find(nonzero),[h_cols;c_cols]]

 # slices seem to be faster
 h_cols = 1:length(h_cols)
 c_cols = (length(h_cols)+1):size(counts,2)

 # arrays are stored in column order, transpose counts to make
 # accessing more efficient
 counts = transpose(counts)

 genes = genes[find(nonzero)]

 function main(counts,genes,h_cols,c_cols)

   N = size(genes,1)
   M = size(counts,1)

   ratios = Array(Float64,M)
  function compute!(S::Array{Result,1},counts::Array{Float64,2},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64)
   for k=1:M
 ratios[k] = counts[k,i]/counts[k,j]
   end
   t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols))
   S[tot_i] = Result(pvalue(t),i,j)
   end

Sadly, nested functions are often bad, as type-inference does not work
properly on them, as Tim suggested. Consider this example:

function a(n)
aa(x,y) = x*y
out = 0
for i=1:n
out += aa(i,i)
end
out
end

bb(x,y) = x*y
function b(n)
out = 0
for i=1:n
out += bb(i,i)
end
out
end
n = 10^7
@time a(n)
@time a(n)  # elapsed time: 0.680312065 seconds (1220 MB allocated, 2.71% gc 
time in 56 pauses with 0 full sweep)
b(n)
@time b(n)  # elapsed time: 3.086e-6 seconds (192 bytes allocated)

@code_warntype a(n) # see how the return type of the function is not inferred!
@code_warntype b(n)

Move compute! out of main and it should be better.


   tot_i = 0
   tot = (N^2-N)/2

   S = Array(Result,round(Int,tot))

   for i=1:N-1
 @time for j=(i+1):N
   tot_i += 1
   compute!(S,counts,tot_i,i,j,h_cols,c_cols,M)
 end
   end

 end

 S = main(counts,genes,h_cols,c_cols)


 And the output:

 elapsed time: 0.427719149 seconds (23 MB allocated, 39.90% gc time in 2 
 pauses with 0 full sweep)
 elapsed time: 0.031006382 seconds (14 MB allocated)
 elapsed time: 0.131579099 seconds (14 MB allocated, 73.64% gc time in 1 
 pauses with 1 full sweep)
 elapsed time: 0.140120717 seconds (14 MB allocated, 73.58% gc time in 1 
 pauses with 0 full sweep)
 elapsed time: 0.030248237 seconds (14 MB allocated)
 ...
 elapsed time: 0.507894781 seconds (5 MB allocated, 97.65% gc time in 1 
 pauses with 0 full sweep)
 elapsed time: 0.011821657 seconds (5 MB allocated)
 elapsed time: 0.011610651 seconds (5 MB allocated)
 elapsed time: 0.011816277 seconds (5 MB allocated)
 elapsed time: 0.50779098 seconds (5 MB allocated, 97.65% gc time in 1 
 pauses with 0 full sweep)
 elapsed time: 0.011997168 seconds (5 MB allocated)
 elapsed time: 0.011721667 seconds (5 MB allocated)
 elapsed time: 0.011561071 seconds (5 MB allocated)

This looks OK-ish to me.  The program runs, allocates memory, and every
so often the memory is garbage collected.  What is not OK is that the gc
runs after only 20 MB has been allocated, and that it takes so long.  But
at that point all the memory is used up, right?  Maybe that is why it
takes so long?

 On Saturday, April 4, 2015 at 12:38:46 PM UTC-4, Patrick O'Leary wrote:

 Silly me, ignoring all the commented out lines assuming they were 
 comments...yes, this is almost certainly it.

 On Saturday, April 4, 2015 at 3:24:50 AM UTC-5, Tim Holy wrote:

 Devectorization should never slow anything down. If it does, then you 
 have 
 some other problem. Here, M is a global variable, and that's probably 
 what's 
 killing you. Performance tip #1: 
 http://docs.julialang.org/en/latest/manual/performance-tips/ 

 --Tim 

 On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote: 
  Hi, 
  
  I am struggling with an issue related to garbage collection taking up 
 the 
  vast majority (99%) of compute time on a simple nested for loop. Code 
  excerpt below: 
  
  # julia version 0.3.7 
  # counts is an MxN matrix of Float64 
  # N=15000 
  # M=108 
  # h_cols and c_cols are 

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Adam Labadorf
I moved compute! out of the main function and pass the ratios as you 
suggested and the performance is a bit better but I still notice the gc 
time increasing albeit more slowly, new code below. The csv I'm using is 
available at 
https://www.dropbox.com/s/uqeisg1vx027gjc/all_mRNA_nonzero_norm_counts_trim.csv?dl=0

using HypothesisTests, ArrayViews

type Result
  pvalue::Float64
  i::Int64
  j::Int64
end

function readtable(fn)
  fp = readcsv(fn)
  columns = fp[1,2:end]
  rows = fp[2:end,1]
  data = float(fp[2:end,2:end])
  return (columns,rows,data)
end
@time (cols, genes, counts) =
readtable("../all_mRNA_nonzero_norm_counts_trim.csv")

h_cols = find([c[1] == 'H' for c in cols])
c_cols = find([c[1] == 'C' for c in cols])

# restrict to HD and control
# add a pseudocount to avoid division by zero errors
counts = counts[:,[h_cols;c_cols]] + 0.01

h_cols = 1:length(h_cols)
c_cols = (length(h_cols)+1):size(counts,2)

# arrays are stored in column order, transpose counts to make
# accessing more efficient
counts = transpose(counts)

function compute!(S::Array{Result,1},ratios::Array{Float64,1},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64)
  t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols))
  S[tot_i] = Result(pvalue(t),i,j)
end

function main(counts,genes,h_cols,c_cols)

  N = size(genes,1)
  M = size(counts,1)

  ratios = Array(Float64,M)

  tot_i = 0
  tot = (N^2-N)/2

  S = Array(Result,round(Int,tot))

  for i=1:N-1
@time for j=(i+1):N
  tot_i += 1

  # use a sigmoid function to compress log ratios to [-10,10]
  b = 10
  for k=1:M
ratios[k] = b*(2/(1+exp(-log2(counts[k,i]/counts[k,j])/(b/2)))-1)
  end

  compute!(S,ratios,tot_i,i,j,h_cols,c_cols,M)
end
@show (tot_i,tot,tot_i/tot)
  end

end

S = main(counts,genes,h_cols,c_cols)



On Tuesday, April 7, 2015 at 8:47:06 AM UTC-4, Mauro wrote:


 On Tue, 2015-04-07 at 06:14, Adam Labadorf alab...@gmail.com wrote: 
  Thanks for the replies. I took your suggestions (and reread the scope 
  section of the docs) and am still experiencing the gc creep. Below is 
 the 
  complete program, with the notable changes that I wrapped the main 
  computation in a function and eliminated all references to global 
 variables 
  inside. I'm also using the most recent nightly build of 0.4. Overall 
 this 
  version of the code is much faster, but there is still significant 
 slowdown 
  as the computation progresses. Is this expected? Do you see anything I'm 
  doing wrong? 
  
  # julia v0.4.0-dev+4159 
  using HypothesisTests, ArrayViews 
  
  type Result 
pvalue::Float64 
i::Int64 
j::Int64 
  end 
  
  function readtable(fn) 
fp = readcsv(fn) 
columns = fp[1,2:end] 
rows = fp[:,2:end] 
data = float(fp[2:end,2:end]) 
return (columns,rows,data) 
  end 
  @time (cols, genes, counts) = 
  readtable("../all_mRNA_nonzero_norm_counts_trim.csv") 
  
  h_cols = find([c[1] == 'H' for c in cols]) 
  c_cols = find([c[1] == 'C' for c in cols]) 
  
  # filter out genes with zeros since it messes with the ratios 
  nonzero = mapslices(x -> !any(x.==0),counts,2) 
  counts = counts[find(nonzero),[h_cols;c_cols]] 
  
  # slices seem to be faster 
  h_cols = 1:length(h_cols) 
  c_cols = (length(h_cols)+1):size(counts,2) 
  
  # arrays are stored in column order, transpose counts to make 
  # accessing more efficient 
  counts = transpose(counts) 
  
  genes = genes[find(nonzero)] 
  
  function main(counts,genes,h_cols,c_cols) 
  
N = size(genes,1) 
M = size(counts,1) 
  
ratios = Array(Float64,M) 
    function compute!(S::Array{Result,1},counts::Array{Float64,2},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64) 
for k=1:M 
  ratios[k] = counts[k,i]/counts[k,j] 
end 
t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols)) 
S[tot_i] = Result(pvalue(t),i,j) 
end 

 Sadly, nested function are often bad as type-inference does not work 
 properly, as Tim suggested. Consider this example: 

 function a(n) 
 aa(x,y) = x*y 
 out = 0 
 for i=1:n 
 out += aa(i,i) 
 end 
 out 
 end 

 bb(x,y) = x*y 
 function b(n) 
 out = 0 
 for i=1:n 
 out += bb(i,i) 
 end 
 out 
 end 
 n = 10^7 
 @time a(n) 
 @time a(n)  # elapsed time: 0.680312065 seconds (1220 MB allocated, 2.71% 
 gc time in 56 pauses with 0 full sweep) 
 b(n) 
 @time b(n)  # elapsed time: 3.086e-6 seconds (192 bytes allocated) 

 @code_warntype a(n) # see how the return type of the function is not 
 inferred! 
 @code_warntype b(n) 

 Move compute! out of main and it should be better. 


tot_i = 0 
tot = (N^2-N)/2 
  
S = Array(Result,round(Int,tot)) 
  
for i=1:N-1 
  @time for j=(i+1):N 
tot_i += 1 
compute!(S,counts,tot_i,i,j,h_cols,c_cols,M) 
  end 
end 
  
  end 
  
  S = 

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Mauro
Make Result an immutable:

immutable Result
  pvalue::Float64
  i::Int64
  j::Int64
end

that way the numbers are stored in the S array directly and all its
memory is preallocated.  Otherwise S just holds pointers to the Results
objects.  gc time is only 2% for me then (but I didn't run it for more
than 10min).

However, why the gc time increases in the original code, I don't know.
I don't think it should.
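
The difference is easy to see (a minimal sketch in the 0.3/0.4-era syntax
used in this thread; PodResult and BoxedResult are made-up names):

```julia
# In this era of Julia, `immutable` declares an immutable type and
# `type` a mutable one (modern Julia: `struct` / `mutable struct`).

immutable PodResult        # all fields are bits types => isbits
    pvalue::Float64
    i::Int64
    j::Int64
end

type BoxedResult           # mutable: each instance is heap-allocated
    pvalue::Float64
    i::Int64
    j::Int64
end

println(isbits(PodResult))    # true: an Array stores these inline
println(isbits(BoxedResult))  # false: an Array holds gc-tracked pointers

# Filling a preallocated array of the immutable creates no new heap
# objects, so the gc has (almost) nothing extra to scan.
S = Array(PodResult, 2)
S[1] = PodResult(0.05, 1, 2)
S[2] = PodResult(0.01, 1, 3)
```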

On Tue, 2015-04-07 at 16:26, Adam Labadorf alabad...@gmail.com wrote:
 I moved compute! out of the main function and pass the ratios as you 
 suggested and the performance is a bit better but I still notice the gc 
 time increasing albeit more slowly, new code below. The csv I'm using is 
 available at 
 https://www.dropbox.com/s/uqeisg1vx027gjc/all_mRNA_nonzero_norm_counts_trim.csv?dl=0

 using HypothesisTests, ArrayViews

 type Result
   pvalue::Float64
   i::Int64
   j::Int64
 end

 function readtable(fn)
   fp = readcsv(fn)
   columns = fp[1,2:end]
   rows = fp[2:end,1]
   data = float(fp[2:end,2:end])
   return (columns,rows,data)
 end
 @time (cols, genes, counts) = 
 readtable("../all_mRNA_nonzero_norm_counts_trim.csv")

 h_cols = find([c[1] == 'H' for c in cols])
 c_cols = find([c[1] == 'C' for c in cols])

 # restrict to HD and control
 # add a pseudocount to avoid division by zero errors
 counts = counts[:,[h_cols;c_cols]] + 0.01

 h_cols = 1:length(h_cols)
 c_cols = (length(h_cols)+1):size(counts,2)

 # arrays are stored in column order, transpose counts to make
 # accessing more efficient
 counts = transpose(counts)

 function compute!(S::Array{Result,1},ratios::Array{Float64,1},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64)
   t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols))
   S[tot_i] = Result(pvalue(t),i,j)
 end

 function main(counts,genes,h_cols,c_cols)

   N = size(genes,1)
   M = size(counts,1)

   ratios = Array(Float64,M)

   tot_i = 0
   tot = (N^2-N)/2

   S = Array(Result,round(Int,tot))

   for i=1:N-1
 @time for j=(i+1):N
   tot_i += 1

   # use a sigmoid function to compress log ratios to [-10,10]
   b = 10
   for k=1:M
 ratios[k] = b*(2/(1+exp(-log2(counts[k,i]/counts[k,j])/(b/2)))-1)
   end

   compute!(S,ratios,tot_i,i,j,h_cols,c_cols,M)
 end
 @show (tot_i,tot,tot_i/tot)
   end

 end

 S = main(counts,genes,h_cols,c_cols)



 On Tuesday, April 7, 2015 at 8:47:06 AM UTC-4, Mauro wrote:


 On Tue, 2015-04-07 at 06:14, Adam Labadorf alab...@gmail.com wrote: 
  Thanks for the replies. I took your suggestions (and reread the scope 
  section of the docs) and am still experiencing the gc creep. Below is 
 the 
  complete program, with the notable changes that I wrapped the main 
  computation in a function and eliminated all references to global 
 variables 
  inside. I'm also using the most recent nightly build of 0.4. Overall 
 this 
  version of the code is much faster, but there is still significant 
 slowdown 
  as the computation progresses. Is this expected? Do you see anything I'm 
  doing wrong? 
  
  # julia v0.4.0-dev+4159 
  using HypothesisTests, ArrayViews 
  
  type Result 
pvalue::Float64 
i::Int64 
j::Int64 
  end 
  
  function readtable(fn) 
fp = readcsv(fn) 
columns = fp[1,2:end] 
rows = fp[:,2:end] 
data = float(fp[2:end,2:end]) 
return (columns,rows,data) 
  end 
  @time (cols, genes, counts) = 
  readtable("../all_mRNA_nonzero_norm_counts_trim.csv") 
  
  h_cols = find([c[1] == 'H' for c in cols]) 
  c_cols = find([c[1] == 'C' for c in cols]) 
  
  # filter out genes with zeros since it messes with the ratios 
  nonzero = mapslices(x -> !any(x.==0),counts,2) 
  counts = counts[find(nonzero),[h_cols;c_cols]] 
  
  # slices seem to be faster 
  h_cols = 1:length(h_cols) 
  c_cols = (length(h_cols)+1):size(counts,2) 
  
  # arrays are stored in column order, transpose counts to make 
  # accessing more efficient 
  counts = transpose(counts) 
  
  genes = genes[find(nonzero)] 
  
  function main(counts,genes,h_cols,c_cols) 
  
N = size(genes,1) 
M = size(counts,1) 
  
ratios = Array(Float64,M) 
    function compute!(S::Array{Result,1},counts::Array{Float64,2},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64) 
for k=1:M 
  ratios[k] = counts[k,i]/counts[k,j] 
end 
t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols)) 
S[tot_i] = Result(pvalue(t),i,j) 
end 

 Sadly, nested function are often bad as type-inference does not work 
 properly, as Tim suggested. Consider this example: 

 function a(n) 
 aa(x,y) = x*y 
 out = 0 
 for i=1:n 
 out += aa(i,i) 
 end 
 out 
 end 

 bb(x,y) = x*y 
 function b(n) 
 out = 0 
 for i=1:n 
 out += bb(i,i) 
 end 
 out 
 end 
 n = 10^7 
 @time a(n) 
 @time a(n)  # elapsed time: 0.680312065 seconds (1220 MB allocated, 

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Adam Labadorf
Yes, that gives a massive improvement and I see the same gc behavior you 
got. I could notice the gc increasing slowly after just watching for a 
minute or two, and that seems to have stopped now. I'll know for sure when 
I run the whole thing and will report back.

Thanks so much to all of you for your swift replies, this was my first 
julia experience and it's great to have such a supportive community.

On Tuesday, April 7, 2015 at 11:40:49 AM UTC-4, Mauro wrote:

 Make Result an immutable: 

 immutable Result 
   pvalue::Float64 
   i::Int64 
   j::Int64 
 end 

 that way the numbers are stored in the S array directly and all its 
 memory is preallocated.  Otherwise S just holds pointers to the Results 
 objects.  gc time is only 2% for me then (but I didn't run it for more 
 than 10min). 

 However, why the gc time increases in the original code, I don't know. 
 I don't think it should. 

 On Tue, 2015-04-07 at 16:26, Adam Labadorf alab...@gmail.com wrote: 
  I moved compute! out of the main function and pass the ratios as you 
  suggested and the performance is a bit better but I still notice the gc 
  time increasing albeit more slowly, new code below. The csv I'm using is 
  available at 
  
 https://www.dropbox.com/s/uqeisg1vx027gjc/all_mRNA_nonzero_norm_counts_trim.csv?dl=0
  
  
  using HypothesisTests, ArrayViews 
  
  type Result 
pvalue::Float64 
i::Int64 
j::Int64 
  end 
  
  function readtable(fn) 
fp = readcsv(fn) 
columns = fp[1,2:end] 
rows = fp[2:end,1] 
data = float(fp[2:end,2:end]) 
return (columns,rows,data) 
  end 
  @time (cols, genes, counts) = 
  readtable("../all_mRNA_nonzero_norm_counts_trim.csv") 
  
  h_cols = find([c[1] == 'H' for c in cols]) 
  c_cols = find([c[1] == 'C' for c in cols]) 
  
  # restrict to HD and control 
  # add a pseudocount to avoid division by zero errors 
  counts = counts[:,[h_cols;c_cols]] + 0.01 
  
  h_cols = 1:length(h_cols) 
  c_cols = (length(h_cols)+1):size(counts,2) 
  
  # arrays are stored in column order, transpose counts to make 
  # accessing more efficient 
  counts = transpose(counts) 
  
  function compute!(S::Array{Result,1},ratios::Array{Float64,1},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64) 
t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols)) 
S[tot_i] = Result(pvalue(t),i,j) 
  end 
  
  function main(counts,genes,h_cols,c_cols) 
  
N = size(genes,1) 
M = size(counts,1) 
  
ratios = Array(Float64,M) 
  
tot_i = 0 
tot = (N^2-N)/2 
  
S = Array(Result,round(Int,tot)) 
  
for i=1:N-1 
  @time for j=(i+1):N 
tot_i += 1 
  
# use a sigmoid function to compress log ratios to [-10,10] 
b = 10 
for k=1:M 
  ratios[k] = 
 b*(2/(1+exp(-log2(counts[k,i]/counts[k,j])/(b/2)))-1) 
end 
  
compute!(S,ratios,tot_i,i,j,h_cols,c_cols,M) 
  end 
  @show (tot_i,tot,tot_i/tot) 
end 
  
  end 
  
  S = main(counts,genes,h_cols,c_cols) 
  
  
  
  On Tuesday, April 7, 2015 at 8:47:06 AM UTC-4, Mauro wrote: 
  
  
  On Tue, 2015-04-07 at 06:14, Adam Labadorf alab...@gmail.com wrote: 
   Thanks for the replies. I took your suggestions (and reread the scope 
   section of the docs) and am still experiencing the gc creep. Below is 
  the 
   complete program, with the notable changes that I wrapped the main 
   computation in a function and eliminated all references to global 
  variables 
   inside. I'm also using the most recent nightly build of 0.4. Overall 
  this 
   version of the code is much faster, but there is still significant 
  slowdown 
   as the computation progresses. Is this expected? Do you see anything 
 I'm 
   doing wrong? 
   
   # julia v0.4.0-dev+4159 
   using HypothesisTests, ArrayViews 
   
   type Result 
 pvalue::Float64 
 i::Int64 
 j::Int64 
   end 
   
   function readtable(fn) 
 fp = readcsv(fn) 
 columns = fp[1,2:end] 
 rows = fp[:,2:end] 
 data = float(fp[2:end,2:end]) 
 return (columns,rows,data) 
   end 
   @time (cols, genes, counts) = 
   readtable("../all_mRNA_nonzero_norm_counts_trim.csv") 
   
   h_cols = find([c[1] == 'H' for c in cols]) 
   c_cols = find([c[1] == 'C' for c in cols]) 
   
   # filter out genes with zeros since it messes with the ratios 
   nonzero = mapslices(x -> !any(x.==0),counts,2) 
   counts = counts[find(nonzero),[h_cols;c_cols]] 
   
   # slices seem to be faster 
   h_cols = 1:length(h_cols) 
   c_cols = (length(h_cols)+1):size(counts,2) 
   
   # arrays are stored in column order, transpose counts to make 
   # accessing more efficient 
   counts = transpose(counts) 
   
   genes = genes[find(nonzero)] 
   
   function main(counts,genes,h_cols,c_cols) 
   
 N = size(genes,1) 
 M = size(counts,1) 
   
 ratios = Array(Float64,M) 
 function 
   
  
 

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Mauro
Glad that helped!  (Also, not all immutables are inlined that way; only
those for which isbits == true.)

Maybe the gc time increased because it had to look at more and more
objects as more and more Result instances were created?
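
The isbits condition can be checked directly (a sketch with hypothetical
types, in the era syntax): every field must itself be a bits type, so a
single reference-typed field, such as a string or an array, disqualifies
the whole immutable.

```julia
immutable Plain            # Float64 and Int64 are bits types
    pvalue::Float64
    i::Int64
end

immutable WithLabel        # the string field is a reference type
    pvalue::Float64
    label::ASCIIString
end

println(isbits(Plain))      # true: inlined when stored in an Array
println(isbits(WithLabel))  # false: stored via pointer, gc-tracked
```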

On Tue, 2015-04-07 at 18:24, Adam Labadorf alabad...@gmail.com wrote:
 Yes, that gives a massive improvement and I see the same gc behavior you 
 got. I could notice the gc increasing slowly after just watching for a 
 minute or two, and that seems to have stopped now. I'll know for sure when 
 I run the whole thing and will report back.

 Thanks so much to all of you for your swift replies, this was my first 
 julia experience and it's great to have such a supportive community.

 On Tuesday, April 7, 2015 at 11:40:49 AM UTC-4, Mauro wrote:

 Make Result an immutable: 

 immutable Result 
   pvalue::Float64 
   i::Int64 
   j::Int64 
 end 

 that way the numbers are stored in the S array directly and all its 
 memory is preallocated.  Otherwise S just holds pointers to the Results 
 objects.  gc time is only 2% for me then (but I didn't run it for more 
 than 10min). 

 However, why the gc time increases in the original code, I don't know. 
 I don't think it should. 

  On Tue, 2015-04-07 at 16:26, Adam Labadorf alab...@gmail.com wrote: 
  I moved compute! out of the main function and pass the ratios as you 
  suggested and the performance is a bit better but I still notice the gc 
  time increasing albeit more slowly, new code below. The csv I'm using is 
  available at 
  
 https://www.dropbox.com/s/uqeisg1vx027gjc/all_mRNA_nonzero_norm_counts_trim.csv?dl=0
  
  
  using HypothesisTests, ArrayViews 
  
  type Result 
pvalue::Float64 
i::Int64 
j::Int64 
  end 
  
  function readtable(fn) 
fp = readcsv(fn) 
columns = fp[1,2:end] 
rows = fp[2:end,1] 
data = float(fp[2:end,2:end]) 
return (columns,rows,data) 
  end 
   @time (cols, genes, counts) = 
   readtable("../all_mRNA_nonzero_norm_counts_trim.csv") 
  
  h_cols = find([c[1] == 'H' for c in cols]) 
  c_cols = find([c[1] == 'C' for c in cols]) 
  
  # restrict to HD and control 
  # add a pseudocount to avoid division by zero errors 
  counts = counts[:,[h_cols;c_cols]] + 0.01 
  
  h_cols = 1:length(h_cols) 
  c_cols = (length(h_cols)+1):size(counts,2) 
  
  # arrays are stored in column order, transpose counts to make 
  # accessing more efficient 
  counts = transpose(counts) 
  
  function compute!(S::Array{Result,1},ratios::Array{Float64,1},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64) 
t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols)) 
S[tot_i] = Result(pvalue(t),i,j) 
  end 
  
  function main(counts,genes,h_cols,c_cols) 
  
N = size(genes,1) 
M = size(counts,1) 
  
ratios = Array(Float64,M) 
  
tot_i = 0 
tot = (N^2-N)/2 
  
S = Array(Result,round(Int,tot)) 
  
for i=1:N-1 
  @time for j=(i+1):N 
tot_i += 1 
  
# use a sigmoid function to compress log ratios to [-10,10] 
b = 10 
for k=1:M 
  ratios[k] = 
 b*(2/(1+exp(-log2(counts[k,i]/counts[k,j])/(b/2)))-1) 
end 
  
compute!(S,ratios,tot_i,i,j,h_cols,c_cols,M) 
  end 
  @show (tot_i,tot,tot_i/tot) 
end 
  
  end 
  
  S = main(counts,genes,h_cols,c_cols) 
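 [Editorial note: the sigmoid compression in the quoted loop can be checked in 
 isolation. The `squash` helper below is hypothetical, mirroring the posted 
 formula with b = 10; algebraically it equals b*tanh(log2(r)/b), which maps a 
 count ratio r into the open interval (-b, b).] 

```julia
# Hypothetical helper mirroring the compression in the quoted inner loop:
# b*(2/(1+exp(-log2(r)/(b/2))) - 1) == b*tanh(log2(r)/b)
squash(r; b = 10) = b * (2 / (1 + exp(-log2(r) / (b / 2))) - 1)

@assert squash(1.0) == 0.0           # equal counts map to 0
@assert abs(squash(1e9)) < 10        # extreme ratios stay inside (-10, 10)
@assert squash(2.0) ≈ -squash(0.5)   # odd in log space
```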
  
  
  
  On Tuesday, April 7, 2015 at 8:47:06 AM UTC-4, Mauro wrote: 
  
  
  On Tue, 2015-04-07 at 06:14, Adam Labadorf alab...@gmail.com wrote: 
   Thanks for the replies. I took your suggestions (and reread the scope 
   section of the docs) and am still experiencing the gc creep. Below is 
  the 
   complete program, with the notable changes that I wrapped the main 
   computation in a function and eliminated all references to global 
  variables 
   inside. I'm also using the most recent nightly build of 0.4. Overall 
  this 
   version of the code is much faster, but there is still significant 
  slowdown 
   as the computation progresses. Is this expected? Do you see anything 
 I'm 
   doing wrong? 
   
   # julia v0.4.0-dev+4159 
   using HypothesisTests, ArrayViews 
   
   type Result 
 pvalue::Float64 
 i::Int64 
 j::Int64 
   end 
   
   function readtable(fn) 
 fp = readcsv(fn) 
 columns = fp[1,2:end] 
  rows = fp[2:end,1] 
 data = float(fp[2:end,2:end]) 
 return (columns,rows,data) 
   end 
   @time (cols, genes, counts) = 
   readtable(../all_mRNA_nonzero_norm_counts_trim.csv) 
   
   h_cols = find([c[1] == 'H' for c in cols]) 
   c_cols = find([c[1] == 'C' for c in cols]) 
   
   # filter out genes with zeros since it messes with the ratios 
   nonzero = mapslices(x -> !any(x .== 0), counts, 2) 
   counts = counts[find(nonzero),[h_cols;c_cols]] 
   
   # slices seem to be faster 
   h_cols = 1:length(h_cols) 
   c_cols = (length(h_cols)+1):size(counts,2) 
   
   # arrays are stored in column order, transpose counts to make 
   # accessing more efficient 
   

Re: [julia-users] Extreme garbage collection time

2015-04-07 Thread Adam Labadorf
The code ran to completion at least 17x faster than the equivalent 
python implementation. Out of curiosity, I collected the elapsed time and % 
spent in gc statistics from @time and plotted them from across the whole 
run:
https://lh3.googleusercontent.com/-ftbz3MuaUfQ/VSRTBC_sJVI/DvA/_MZ96knbNsE/s1600/time_gc_julia.png



This was run on a shared cluster so the spikes in time are probably due to 
other things going on with the node. The linear decrease in elapsed time 
confirms that the immutable type change fixed this issue. The trend in % 
time spent is linear until the end when the cost of gc was greater than the 
computations of the inner loop, makes sense. Not sure if anyone else cares 
about this plot besides me, but figured I'd toss it out there anyway.

On Tuesday, April 7, 2015 at 2:46:48 PM UTC-4, Mauro wrote:

 Glad that helped!  (Also not all immutable are inlined that way, only 
 when isbits==true) 

 Maybe the gc time increased because it had to look at more and more 
 objects as more and more Result instances were created? 

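 [Editorial note: a minimal sketch of the layout difference Mauro describes, 
 written in modern Julia syntax — in 0.3/0.4, `struct` was spelled `immutable` 
 and `mutable struct` was `type`, and `isbitstype` was `isbits` on types. Type 
 names below are illustrative.] 

```julia
struct ResultI          # isbits: instances stored inline in an Array
    pvalue::Float64
    i::Int64
    j::Int64
end

mutable struct ResultM  # heap-allocated: the array holds pointers, and
    pvalue::Float64     # every element is a separate object for gc to scan
    i::Int64
    j::Int64
end

@assert isbitstype(ResultI)
@assert !isbitstype(ResultM)

S = Vector{ResultI}(undef, 10)  # memory fully preallocated up front
S[1] = ResultI(0.05, 1, 2)      # plain store, no per-element allocation
@assert S[1].pvalue == 0.05
```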
 On Tue, 2015-04-07 at 18:24, Adam Labadorf alab...@gmail.com wrote: 
  Yes, that gives a massive improvement and I see the same gc behavior you 
  got. I could notice the gc increasing slowly after just watching for a 
  minute or two, and that seems to have stopped now. I'll know for sure 
 when 
  I run the whole thing and will report back. 
  
  Thanks so much to all of you for your swift replies, this was my first 
  julia experience and it's great to have such a supportive community. 
  
  On Tuesday, April 7, 2015 at 11:40:49 AM UTC-4, Mauro wrote: 
  
  Make Result an immutable: 
  
  immutable Result 
pvalue::Float64 
i::Int64 
j::Int64 
  end 
  
  that way the numbers are stored in the S array directly and all its 
  memory is preallocated.  Otherwise S just holds pointers to the Results 
  objects.  gc time is only 2% for me then (but I didn't run it for more 
  than 10min). 
  
  However, why the gc time increases in the original code, I don't know. 
  I don't think it should. 
  
  On Tue, 2015-04-07 at 16:26, Adam Labadorf alab...@gmail.com wrote: 
   I moved compute! out of the main function and pass the ratios as you 
   suggested and the performance is a bit better but I still notice the 
 gc 
   time increasing albeit more slowly, new code below. The csv I'm using 
 is 
   available at 
   
  
 https://www.dropbox.com/s/uqeisg1vx027gjc/all_mRNA_nonzero_norm_counts_trim.csv?dl=0
  
   
   using HypothesisTests, ArrayViews 
   
   type Result 
 pvalue::Float64 
 i::Int64 
 j::Int64 
   end 
   
   function readtable(fn) 
 fp = readcsv(fn) 
 columns = fp[1,2:end] 
 rows = fp[2:end,1] 
 data = float(fp[2:end,2:end]) 
 return (columns,rows,data) 
   end 
   @time (cols, genes, counts) = 
   readtable(../all_mRNA_nonzero_norm_counts_trim.csv) 
   
   h_cols = find([c[1] == 'H' for c in cols]) 
   c_cols = find([c[1] == 'C' for c in cols]) 
   
   # restrict to HD and control 
   # add a pseudocount to avoid division by zero errors 
   counts = counts[:,[h_cols;c_cols]] + 0.01 
   
   h_cols = 1:length(h_cols) 
   c_cols = (length(h_cols)+1):size(counts,2) 
   
   # arrays are stored in column order, transpose counts to make 
   # accessing more efficient 
   counts = transpose(counts) 
   
   function compute!(S::Array{Result,1},ratios::Array{Float64,1},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64) 
 t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols)) 
 S[tot_i] = Result(pvalue(t),i,j) 
   end 
   
   function main(counts,genes,h_cols,c_cols) 
   
 N = size(genes,1) 
 M = size(counts,1) 
   
 ratios = Array(Float64,M) 
   
 tot_i = 0 
 tot = (N^2-N)/2 
   
 S = Array(Result,round(Int,tot)) 
   
 for i=1:N-1 
   @time for j=(i+1):N 
 tot_i += 1 
   
 # use a sigmoid function to compress log ratios to [-10,10] 
 b = 10 
 for k=1:M 
   ratios[k] = 
  b*(2/(1+exp(-log2(counts[k,i]/counts[k,j])/(b/2)))-1) 
 end 
   
 compute!(S,ratios,tot_i,i,j,h_cols,c_cols,M) 
   end 
   @show (tot_i,tot,tot_i/tot) 
 end 
   
   end 
   
   S = main(counts,genes,h_cols,c_cols) 
   
   
   
   On Tuesday, April 7, 2015 at 8:47:06 AM UTC-4, Mauro wrote: 
   
   
   On Tue, 2015-04-07 at 06:14, Adam Labadorf alab...@gmail.com wrote: 
Thanks for the replies. I took your suggestions (and reread the 
 scope 
section of the docs) and am still experiencing the gc creep. Below 
 is 
   the 
complete program, with the notable changes that I wrapped the main 
computation in a function and eliminated all references to global 
   variables 
inside. I'm also using the most recent nightly build of 0.4. 
 Overall 
   this 
version of the code is much faster, but there is still significant 
   slowdown 
as the computation progresses. Is this expected? Do you see 
 anything 
  I'm 

Re: [julia-users] Extreme garbage collection time

2015-04-06 Thread Adam Labadorf
Thanks for the replies. I took your suggestions (and reread the scope 
section of the docs) and am still experiencing the gc creep. Below is the 
complete program, with the notable changes that I wrapped the main 
computation in a function and eliminated all references to global variables 
inside. I'm also using the most recent nightly build of 0.4. Overall this 
version of the code is much faster, but there is still significant slowdown 
as the computation progresses. Is this expected? Do you see anything I'm 
doing wrong?

# julia v0.4.0-dev+4159
using HypothesisTests, ArrayViews

type Result
  pvalue::Float64
  i::Int64
  j::Int64
end

function readtable(fn)
  fp = readcsv(fn)
  columns = fp[1,2:end]
  rows = fp[2:end,1]
  data = float(fp[2:end,2:end])
  return (columns,rows,data)
end
@time (cols, genes, counts) = 
readtable(../all_mRNA_nonzero_norm_counts_trim.csv)

h_cols = find([c[1] == 'H' for c in cols])
c_cols = find([c[1] == 'C' for c in cols])

# filter out genes with zeros since it messes with the ratios
nonzero = mapslices(x -> !any(x .== 0), counts, 2)
counts = counts[find(nonzero),[h_cols;c_cols]]

# slices seem to be faster
h_cols = 1:length(h_cols)
c_cols = (length(h_cols)+1):size(counts,2)

# arrays are stored in column order, transpose counts to make
# accessing more efficient
counts = transpose(counts)

genes = genes[find(nonzero)]

function main(counts,genes,h_cols,c_cols)

  N = size(genes,1)
  M = size(counts,1)

  ratios = Array(Float64,M)
  function compute!(S::Array{Result,1},counts::Array{Float64,2},tot_i::Int64,i::Int64,j::Int64,h_cols::UnitRange{Int64},c_cols::UnitRange{Int64},M::Int64)
  for k=1:M
ratios[k] = counts[k,i]/counts[k,j]
  end
  t = UnequalVarianceTTest(view(ratios,h_cols),view(ratios,c_cols))
  S[tot_i] = Result(pvalue(t),i,j)
  end

  tot_i = 0
  tot = (N^2-N)/2

  S = Array(Result,round(Int,tot))

  for i=1:N-1
@time for j=(i+1):N
  tot_i += 1
  compute!(S,counts,tot_i,i,j,h_cols,c_cols,M)
end
  end

end

S = main(counts,genes,h_cols,c_cols)


And the output:

elapsed time: 0.427719149 seconds (23 MB allocated, 39.90% gc time in 2 
pauses with 0 full sweep)
elapsed time: 0.031006382 seconds (14 MB allocated)
elapsed time: 0.131579099 seconds (14 MB allocated, 73.64% gc time in 1 
pauses with 1 full sweep)
elapsed time: 0.140120717 seconds (14 MB allocated, 73.58% gc time in 1 
pauses with 0 full sweep)
elapsed time: 0.030248237 seconds (14 MB allocated)
...
elapsed time: 0.507894781 seconds (5 MB allocated, 97.65% gc time in 1 
pauses with 0 full sweep)
elapsed time: 0.011821657 seconds (5 MB allocated)
elapsed time: 0.011610651 seconds (5 MB allocated)
elapsed time: 0.011816277 seconds (5 MB allocated)
elapsed time: 0.50779098 seconds (5 MB allocated, 97.65% gc time in 1 
pauses with 0 full sweep)
elapsed time: 0.011997168 seconds (5 MB allocated)
elapsed time: 0.011721667 seconds (5 MB allocated)
elapsed time: 0.011561071 seconds (5 MB allocated)


On Saturday, April 4, 2015 at 12:38:46 PM UTC-4, Patrick O'Leary wrote:

 Silly me, ignoring all the commented out lines assuming they were 
 comments...yes, this is almost certainly it.

 On Saturday, April 4, 2015 at 3:24:50 AM UTC-5, Tim Holy wrote:

 Devectorization should never slow anything down. If it does, then you 
 have 
 some other problem. Here, M is a global variable, and that's probably 
 what's 
 killing you. Performance tip #1: 
 http://docs.julialang.org/en/latest/manual/performance-tips/ 

 --Tim 

 On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote: 
  Hi, 
  
  I am struggling with an issue related to garbage collection taking up 
 the 
  vast majority (99%) of compute time on a simple nested for loop. Code 
  excerpt below: 
  
  # julia version 0.3.7 
  # counts is an MxN matrix of Float64 
  # N=15000 
  # M=108 
  # h_cols and c_cols are indices \in {1:M} 
  using HypothesisTests, ArrayViews 
  ratios = Array(Float64,M) 
  function compute!(S,tot_i::Int64,i::Int64,j::Int64) 
  ratios = view(counts,:,i)./view(counts,:,j) 
  #for k=1:M 
  #  ratios[k] = counts[k,i]/counts[k,j] 
  #end 
  #ratios = counts[:,i]./counts[:,j] 
  t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols]) 
  S[tot_i] = (pvalue(t),i,j) 
  end 
  
  for i=1:N-1 
@time for j=(i+1):N 
  tot_i += 1 
  compute!(S,tot_i,i,j) 
end 
  end 
  
  The loop begins fast, output from time: 
  
  elapsed time: 1.023850054 seconds (62027220 bytes allocated) 
  elapsed time: 0.170916977 seconds (45785624 bytes allocated) 
  elapsed time: 0.171598156 seconds (45782760 bytes allocated) 
  elapsed time: 0.173866309 seconds (45779896 bytes allocated) 
  elapsed time: 0.170267172 seconds (45777032 bytes allocated) 
  elapsed time: 0.171754713 seconds (45774168 bytes allocated) 
  elapsed time: 0.170110142 seconds (45771304 bytes allocated) 
  elapsed time: 0.175199053 seconds (45768440 bytes allocated) 
  elapsed time: 0.179893161 seconds (45765576 bytes 

Re: [julia-users] Extreme garbage collection time

2015-04-04 Thread Tim Holy
Devectorization should never slow anything down. If it does, then you have 
some other problem. Here, M is a global variable, and that's probably what's 
killing you. Performance tip #1:
http://docs.julialang.org/en/latest/manual/performance-tips/

--Tim
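 [Editorial note: performance tip #1 can be illustrated in miniature. The 
 sketch below is hypothetical, not from the thread: a loop reading a non-const 
 global is type-unstable, while the same loop over an argument is not.] 

```julia
xs = rand(1000)          # non-const global, like M/counts in the posted code

function sum_global()    # slow path: `xs` could be rebound to any type,
    s = 0.0              # so the loop cannot be specialized
    for x in xs
        s += x
    end
    return s
end

function sum_arg(v)      # fast path: v's concrete type is known inside
    s = 0.0
    for x in v
        s += x
    end
    return s
end

@assert sum_global() ≈ sum_arg(xs) ≈ sum(xs)
```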

On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote:
 Hi,
 
 I am struggling with an issue related to garbage collection taking up the
 vast majority (99%) of compute time on a simple nested for loop. Code
 excerpt below:
 
 # julia version 0.3.7
 # counts is an MxN matrix of Float64
 # N=15000
 # M=108
 # h_cols and c_cols are indices \in {1:M}
 using HypothesisTests, ArrayViews
 ratios = Array(Float64,M)
 function compute!(S,tot_i::Int64,i::Int64,j::Int64)
 ratios = view(counts,:,i)./view(counts,:,j)
 #for k=1:M
 #  ratios[k] = counts[k,i]/counts[k,j]
 #end
 #ratios = counts[:,i]./counts[:,j]
 t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols])
 S[tot_i] = (pvalue(t),i,j)
 end
 
 for i=1:N-1
   @time for j=(i+1):N
 tot_i += 1
 compute!(S,tot_i,i,j)
   end
 end
 
 The loop begins fast, output from time:
 
 elapsed time: 1.023850054 seconds (62027220 bytes allocated)
 elapsed time: 0.170916977 seconds (45785624 bytes allocated)
 elapsed time: 0.171598156 seconds (45782760 bytes allocated)
 elapsed time: 0.173866309 seconds (45779896 bytes allocated)
 elapsed time: 0.170267172 seconds (45777032 bytes allocated)
 elapsed time: 0.171754713 seconds (45774168 bytes allocated)
 elapsed time: 0.170110142 seconds (45771304 bytes allocated)
 elapsed time: 0.175199053 seconds (45768440 bytes allocated)
 elapsed time: 0.179893161 seconds (45765576 bytes allocated)
 elapsed time: 0.212172824 seconds (45762712 bytes allocated)
 elapsed time: 0.252750549 seconds (45759848 bytes allocated)
 elapsed time: 0.254874855 seconds (45756984 bytes allocated)
 elapsed time: 0.231003319 seconds (45754120 bytes allocated)
 elapsed time: 0.235060195 seconds (45751256 bytes allocated)
 elapsed time: 0.235379355 seconds (45748392 bytes allocated)
 elapsed time: 0.927622743 seconds (45746168 bytes allocated, 77.65% gc time)
 elapsed time: 0.9132931 seconds (45742664 bytes allocated, 78.35% gc time)
 
 But as soon as it starts doing gc, the % time spent in gc increases almost
 indefinitely; output from @time much later:
 
 elapsed time: 0.174122929 seconds (36239160 bytes allocated)
 elapsed time: 18.535572658 seconds (36236168 bytes allocated, 99.22% gc
 time)
 elapsed time: 19.189478819 seconds (36233176 bytes allocated, 99.26% gc
 time)
 elapsed time: 21.812889439 seconds (36230184 bytes allocated, 99.35% gc
 time)
 elapsed time: 22.182467723 seconds (36227192 bytes allocated, 99.30% gc
 time)
 elapsed time: 0.16984 seconds (36224200 bytes allocated)
 
 The inner loop, despite iterating over fewer and fewer indices has
 massively increased the gc, and therefore overall, execution time. I have
 tried many things, including creating the compute function, devectorizing
 the ratios calculation (which really slowed things down), using view and
 sub in various places, profiling with --trace-allocation=all but I can't
 figure out why this happens or how to fix it. Doing gc for 22 seconds
 obviously kills the performance, and since there are about 22M iterations
 this is prohibitive. Can anyone suggest something I can try to improve the
 performance here?
 
 Thanks,
 Adam



Re: [julia-users] Extreme garbage collection time

2015-04-04 Thread Patrick O'Leary
Silly me, ignoring all the commented out lines assuming they were 
comments...yes, this is almost certainly it.

On Saturday, April 4, 2015 at 3:24:50 AM UTC-5, Tim Holy wrote:

 Devectorization should never slow anything down. If it does, then you have 
 some other problem. Here, M is a global variable, and that's probably 
 what's 
 killing you. Performance tip #1: 
 http://docs.julialang.org/en/latest/manual/performance-tips/ 

 --Tim 

 On Friday, April 03, 2015 09:43:51 AM Adam Labadorf wrote: 
  Hi, 
  
  I am struggling with an issue related to garbage collection taking up 
 the 
  vast majority (99%) of compute time on a simple nested for loop. Code 
  excerpt below: 
  
  # julia version 0.3.7 
  # counts is an MxN matrix of Float64 
  # N=15000 
  # M=108 
  # h_cols and c_cols are indices \in {1:M} 
  using HypothesisTests, ArrayViews 
  ratios = Array(Float64,M) 
  function compute!(S,tot_i::Int64,i::Int64,j::Int64) 
  ratios = view(counts,:,i)./view(counts,:,j) 
  #for k=1:M 
  #  ratios[k] = counts[k,i]/counts[k,j] 
  #end 
  #ratios = counts[:,i]./counts[:,j] 
  t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols]) 
  S[tot_i] = (pvalue(t),i,j) 
  end 
  
  for i=1:N-1 
@time for j=(i+1):N 
  tot_i += 1 
  compute!(S,tot_i,i,j) 
end 
  end 
  
  The loop begins fast, output from time: 
  
  elapsed time: 1.023850054 seconds (62027220 bytes allocated) 
  elapsed time: 0.170916977 seconds (45785624 bytes allocated) 
  elapsed time: 0.171598156 seconds (45782760 bytes allocated) 
  elapsed time: 0.173866309 seconds (45779896 bytes allocated) 
  elapsed time: 0.170267172 seconds (45777032 bytes allocated) 
  elapsed time: 0.171754713 seconds (45774168 bytes allocated) 
  elapsed time: 0.170110142 seconds (45771304 bytes allocated) 
  elapsed time: 0.175199053 seconds (45768440 bytes allocated) 
  elapsed time: 0.179893161 seconds (45765576 bytes allocated) 
  elapsed time: 0.212172824 seconds (45762712 bytes allocated) 
  elapsed time: 0.252750549 seconds (45759848 bytes allocated) 
  elapsed time: 0.254874855 seconds (45756984 bytes allocated) 
  elapsed time: 0.231003319 seconds (45754120 bytes allocated) 
  elapsed time: 0.235060195 seconds (45751256 bytes allocated) 
  elapsed time: 0.235379355 seconds (45748392 bytes allocated) 
  elapsed time: 0.927622743 seconds (45746168 bytes allocated, 77.65% gc 
 time) 
  elapsed time: 0.9132931 seconds (45742664 bytes allocated, 78.35% gc 
 time) 
  
  But as soon as it starts doing gc the % time spent in increases almost 
  indefinitely, output from time much later: 
  
  elapsed time: 0.174122929 seconds (36239160 bytes allocated) 
  elapsed time: 18.535572658 seconds (36236168 bytes allocated, 99.22% gc 
  time) 
  elapsed time: 19.189478819 seconds (36233176 bytes allocated, 99.26% gc 
  time) 
  elapsed time: 21.812889439 seconds (36230184 bytes allocated, 99.35% gc 
  time) 
  elapsed time: 22.182467723 seconds (36227192 bytes allocated, 99.30% gc 
  time) 
  elapsed time: 0.16984 seconds (36224200 bytes allocated) 
  
  The inner loop, despite iterating over fewer and fewer indices has 
  massively increased the gc, and therefore overall, execution time. I 
 have 
  tried many things, including creating the compute function, 
 devectorizing 
  the ratios calculation (which really slowed things down), using view and 
  sub in various places, profiling with --trace-allocation=all but I can't 
  figure out why this happens or how to fix it. Doing gc for 22 seconds 
  obviously kills the performance, and since there are about 22M 
 iterations 
  this is prohibitive. Can anyone suggest something I can try to improve 
 the 
  performance here? 
  
  Thanks, 
  Adam 



[julia-users] Extreme garbage collection time

2015-04-03 Thread Adam Labadorf
Hi,

I am struggling with an issue related to garbage collection taking up the 
vast majority (99%) of compute time on a simple nested for loop. Code 
excerpt below:

# julia version 0.3.7
# counts is an MxN matrix of Float64
# N=15000
# M=108
# h_cols and c_cols are indices \in {1:M}
using HypothesisTests, ArrayViews
ratios = Array(Float64,M)
function compute!(S,tot_i::Int64,i::Int64,j::Int64)
ratios = view(counts,:,i)./view(counts,:,j)
#for k=1:M
#  ratios[k] = counts[k,i]/counts[k,j]
#end
#ratios = counts[:,i]./counts[:,j]
t = UnequalVarianceTTest(ratios[h_cols],ratios[c_cols])
S[tot_i] = (pvalue(t),i,j)
end

for i=1:N-1
  @time for j=(i+1):N
tot_i += 1
compute!(S,tot_i,i,j)
  end
end

The loop begins fast; output from @time:

elapsed time: 1.023850054 seconds (62027220 bytes allocated)
elapsed time: 0.170916977 seconds (45785624 bytes allocated)
elapsed time: 0.171598156 seconds (45782760 bytes allocated)
elapsed time: 0.173866309 seconds (45779896 bytes allocated)
elapsed time: 0.170267172 seconds (45777032 bytes allocated)
elapsed time: 0.171754713 seconds (45774168 bytes allocated)
elapsed time: 0.170110142 seconds (45771304 bytes allocated)
elapsed time: 0.175199053 seconds (45768440 bytes allocated)
elapsed time: 0.179893161 seconds (45765576 bytes allocated)
elapsed time: 0.212172824 seconds (45762712 bytes allocated)
elapsed time: 0.252750549 seconds (45759848 bytes allocated)
elapsed time: 0.254874855 seconds (45756984 bytes allocated)
elapsed time: 0.231003319 seconds (45754120 bytes allocated)
elapsed time: 0.235060195 seconds (45751256 bytes allocated)
elapsed time: 0.235379355 seconds (45748392 bytes allocated)
elapsed time: 0.927622743 seconds (45746168 bytes allocated, 77.65% gc time)
elapsed time: 0.9132931 seconds (45742664 bytes allocated, 78.35% gc time)

But as soon as it starts doing gc, the % time spent in gc increases almost 
indefinitely; output from @time much later:

elapsed time: 0.174122929 seconds (36239160 bytes allocated)
elapsed time: 18.535572658 seconds (36236168 bytes allocated, 99.22% gc 
time)
elapsed time: 19.189478819 seconds (36233176 bytes allocated, 99.26% gc 
time)
elapsed time: 21.812889439 seconds (36230184 bytes allocated, 99.35% gc 
time)
elapsed time: 22.182467723 seconds (36227192 bytes allocated, 99.30% gc 
time)
elapsed time: 0.16984 seconds (36224200 bytes allocated)

The inner loop, despite iterating over fewer and fewer indices, has 
massively increased the gc time, and therefore overall execution time. I have 
tried many things, including creating the compute function, devectorizing 
the ratios calculation (which really slowed things down), using view and 
sub in various places, profiling with --trace-allocation=all but I can't 
figure out why this happens or how to fix it. Doing gc for 22 seconds 
obviously kills the performance, and since there are about 22M iterations 
this is prohibitive. Can anyone suggest something I can try to improve the 
performance here?

Thanks,
Adam
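 [Editorial note: the allocation pattern behind the numbers above can be 
 reproduced in miniature. This sketch is illustrative, in modern Julia syntax: 
 slicing allocates a fresh vector on every pair, while writing into a 
 preallocated buffer — as the thread eventually settles on — does not.] 

```julia
counts = rand(108, 50)                        # M x N, sizes as in the thread
buf = Vector{Float64}(undef, size(counts, 1)) # preallocated once, reused

ratios_slice(c, i, j) = c[:, i] ./ c[:, j]    # allocates on every call

function ratios_inplace!(buf, c, i, j)
    @inbounds for k in axes(c, 1)
        buf[k] = c[k, i] / c[k, j]            # reuses the buffer, no gc work
    end
    return buf
end

@assert ratios_slice(counts, 1, 2) ≈ ratios_inplace!(buf, counts, 1, 2)
```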