Good detective work, please post your code in the issue!
On Fri, 2016-10-14 at 19:51, Andrew <owen...@gmail.com> wrote: > I've found the main problem. I have a function which repeatedly accesses a > 6-dimensional array in a loop. This function took no time in 0.4, but is > actually very slow in 0.5. My problem looks very similar to 18774 > <https://github.com/JuliaLang/julia/issues/18774>. > Here's an example: > > A3 = rand(10, 10, 10); > function test3(A, nx1, nx2, nx3) > for i = 1:10_000_000 > A[nx1, nx2, nx3] > end > end > > A5 = rand(10, 10, 10, 10, 10); > function test5(A, nx1, nx2, nx3, nx4, nx5) > for i = 1:10_000_000 > A[nx1, nx2, nx3, nx4, nx5] > end > end > > A6 = rand(10, 10, 10, 10, 10, 10); > function test6(A, nx1, nx2, nx3, nx4, nx5, nx6) > for i = 1:10_000_000 > A[nx1, nx2, nx3, nx4, nx5, nx6] > end > end > function test6_fast(A, nx1, nx2, nx3, nx4, nx5, nx6) > Asize = size(A) > for i = 1:10_000_000 > A[sub2ind(Asize, nx1, nx2, nx3, nx4, nx5, nx6 )] > end > end > @time test3(A3, 1, 1, 1) > @time test5(A5, 1, 1, 1, 1, 1) > @time test6(A6, 1, 1, 1, 1, 1, 1) > @time test6_fast(A6, 1, 1, 1, 1, 1, 1) > > > test6 takes 0.01s in 0.4 and takes 15s in 0.5. Using a linear index fixes > the problem. > > On Friday, September 30, 2016 at 2:30:02 AM UTC-4, Mauro wrote: >> >> On Fri, 2016-09-30 at 03:45, Andrew <owe...@gmail.com <javascript:>> >> wrote: >> > I checked, and my objective function is evaluated exactly as many times >> > under 0.4 as it is under 0.5. The number of iterations must be the same. >> > >> > I also looked at the times more precisely. For one particular function >> call >> > in the code, I have: >> > >> > 0.4 with old code: 6.7s 18.5M allocations >> > 0.4 with 0.5 style code(regular anonymous functions) 11.6s, 141M >> > allocations >> > 0.5: 36.2s, 189M allocations >> > >> > Surprisingly, 0.4 is still much faster even without the fast anonymous >> > functions trick. It doesn't look like 0.5 is generating many more >> > allocations than 0.4 on the same code, the time is just a lot slower. >> >> Sounds like your not far off a minimal, working example. Post it and >> I'm sure it will be dissected in no time. (And an issue can be filed). >> >> > On Thursday, September 29, 2016 at 3:36:46 PM UTC-4, Tim Holy wrote: >> >> >> >> No real clue about what's happening, but my immediate thought was that >> if >> >> your algorithm is iterative and uses some kind of threshold to decide >> >> convergence, then it seems possible that a change in the accuracy of >> some >> >> computation might lead to it getting "stuck" occasionally due to >> roundoff >> >> error. That's probably more likely to happen because of some kind of >> >> worsening rather than some improvement, but either is conceivable. >> >> >> >> If that's even a possible explanation, I'd check for unusually-large >> >> numbers of iterations and then print some kind of convergence info. >> >> >> >> Best, >> >> --Tim >> >> >> >> On Thu, Sep 29, 2016 at 1:21 PM, Andrew <owe...@gmail.com >> <javascript:>> >> >> wrote: >> >> >> >>> In the 0.4 version the above times are pretty consistent. I never >> observe >> >>> any several thousand allocation calls. I wonder if compilation is >> occurring >> >>> repeatedly. >> >>> >> >>> This isn't terribly pressing for me since I'm not currently working on >> >>> this project, but if there's an easy fix it would be useful for future >> work. >> >>> >> >>> (sorry I didn't mean to post twice. For some reason hitting spacebar >> was >> >>> interpreted as the post command?) >> >>> >> >>> >> >>> On Thursday, September 29, 2016 at 2:15:35 PM UTC-4, Andrew wrote: >> >>>> >> >>>> I've used @code_warntype everywhere I can think to and I've only >> found >> >>>> one Core.box. The @code_warntype looks like this >> >>>> >> >>>> Variables: >> >>>> #self#::#innerloop#3133{#bellman_obj} >> >>>> state::State{IdioState,AggState} >> >>>> EVspline::Dierckx.Spline1D >> >>>> model::Model{CRRA_Family,AggState} >> >>>> policy::PolicyFunctions{Array{Float64,6},Array{Int64,6}} >> >>>> OO::NW >> >>>> >> >>>> >> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj} >> >> >>>> >> >>>> Body: >> >>>> begin >> >>>> >> >>>> >> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj} >> >> >>>> = $(Expr(:new, >> >>>> >> ##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj}, >> >> >>>> :(state), :(EVspline), :(model), :(policy), :(OO), >> >>>> :((Core.getfield)(#self#,:bellman_obj)::#bellman_obj))) >> >>>> SSAValue(0) = >> >>>> >> #3130::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj} >> >> >>>> >> >>>> >> (Core.setfield!)((Core.getfield)(#self#::#innerloop#3133{#bellman_obj},:obj)::CORE.BOX,:contents,SSAValue(0))::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj} >> >> >>>> return SSAValue(0) >> >>>> >> >>>> >> end::##3130#3134{State{IdioState,AggState},Dierckx.Spline1D,Model{CRRA_Family,AggState},PolicyFunctions{Array{Float64,6},Array{Int64,6}},NW,#bellman_obj} >> >> >>>> >> >>>> >> >>>> I put the CORE.BOX in all caps near the bottom. >> >>>> >> >>>> I have no idea if this is actually a problem. The return type is >> stable. >> >>>> Also, this function appears in an outer loop. >> >>>> >> >>>> What I noticed putting a @time in places is that in 0.5, occasionally >> >>>> calls to my nonlinear equation solver take a really long time, like >> here: >> >>>> >> >>>> 0.069224 seconds (9.62 k allocations: 487.873 KB) >> >>>> 0.000007 seconds (39 allocations: 1.922 KB) >> >>>> 0.000006 seconds (29 allocations: 1.391 KB) >> >>>> 0.000011 seconds (74 allocations: 3.781 KB) >> >>>> 0.000009 seconds (54 allocations: 2.719 KB) >> >>>> 0.000008 seconds (54 allocations: 2.719 KB) >> >>>> 0.000008 seconds (49 allocations: 2.453 KB) >> >>>> 0.000007 seconds (44 allocations: 2.188 KB) >> >>>> 0.000007 seconds (44 allocations: 2.188 KB) >> >>>> 0.000006 seconds (39 allocations: 1.922 KB) >> >>>> 0.000007 seconds (39 allocations: 1.922 KB) >> >>>> 0.000006 seconds (39 allocations: 1.922 KB) >> >>>> 0.000005 seconds (34 allocations: 1.656 KB) >> >>>> 0.000005 seconds (34 allocations: 1.656 KB) >> >>>> 0.000004 seconds (29 allocations: 1.391 KB) >> >>>> 0.000004 seconds (24 allocations: 1.125 KB) >> >>>> 0.007399 seconds (248 allocations: 15.453 KB) >> >>>> 0.000009 seconds (30 allocations: 1.594 KB) >> >>>> 0.000004 seconds (25 allocations: 1.328 KB) >> >>>> 0.000004 seconds (25 allocations: 1.328 KB) >> >>>> >> >>>> 0.000010 seconds (70 allocations: 3.719 KB) >> >>>> 0.072703 seconds (41.74 k allocations: 1.615 MB) >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> >> >>>> On Thursday, September 29, 2016 at 1:37:18 AM UTC-4, Kristoffer >> Carlsson >> >>>> wrote: >> >>>>> >> >>>>> Look for Core.Box in @code_warntype. See >> >>>>> https://github.com/JuliaLang/julia/issues/15276 >> >>>> >> >>>> >> >> >>