Le mardi 29 mars 2016 à 09:43 -0700, 博陈 a écrit : > I tried the built-in profiler, and find that the problem lies in > lines I end with ******, the result is shown below: > that proved my guess, how can I pre-allocate these arrays? If I don't > want to divide this code into several parts that calculate these > arrays separately. Can you show us which lines the numbers correspond to? There's no line 211 in the script you posted.
Regards > | lines | backtrace | > | 169 | 9011 | *********** > | 173 | 1552 | > | 175 | 2604 | > | 179 | 2906 | > | 181 | 1541 | > | 192 | 4458 | > | 211 | 13332 ************| > | 214 | 8431 |************ > | 218 | 15871 |*********** > | 221 | 2538 | > > > Have you tried: > > > > (a) calling @code_typewarn on your function > > (b) using the built-in profiler? > > > > > > On Tue, Mar 29, 2016 at 9:23 AM, 博陈 <chenph...@gmail.com> wrote: > > > First of all, have a look at the result. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My code calculates the evolution of 1-d 2-electron system in the electric > > > field, some variables are calculated during the evolution. > > > According to the result of @time evolution, my code must have a > > > pre-allocation problem. Before you see the long code, i suggest that the > > > hotspot might be around the Arrays prop_e, \phio, pp. I have learnt that > > > I can use m = Array(Float64, 1) outside a "for" loop and empty!(m) and > > > push!(m, new_m) inside the loop to pre-allocate the variable m, but in my > > > situations, I don't know how to pre-allocate these arrays. > > > > > > Below is the script (precisely, the main function) itself. > > > > > > function evolution(ϕ::Array{Complex{Float64}, 2}, > > > ele::Array{Float64, 1}, dx::Float64, dt::Float64, > > > flags::Tuple{Int64, Int64, Int64, Int64}) > > > ϕg = copy(ϕ) > > > FFTW.set_num_threads(8) > > > ns = length( ϕ[:, 1] ) > > > x = get_x(ns, dx) > > > p = get_p(ns, dx) > > > if flags[4] == 1 > > > pp = similar(p) > > > A = -cumsum(ele) * dt > > > A² = A.*A > > > ##### splitting > > > r_sp = 150.0 > > > δ_sp = 5.0 > > > splitter = Array(Float64, ns, ns) > > > end > > > nt = length( ele ) > > > > > > # ##### Pre-allocate result and temporary arrays > > > #if flags[1] == 1 > > > σ = zeros(Complex128, nt) > > > #end > > > #if flags[2] == 1 > > > a = zeros(Float64, nt) > > > #end > > > #if flags[3] == 1 > > > r_ionization = 20.0 > > > n1 = round(Int, ns/2 - r_ionization/dx) > > > n2 = round(Int, ns/2 + r_ionization/dx) > > > ip = zeros(Float64, nt) > > > #end > > > > > > ##### FFT plan > > > p_fft! = plan_fft!( similar(ϕ), flags=FFTW.MEASURE ) > > > > > > prop_x = similar( ϕ ) > > > prop_p = similar( prop_x ) > > > prop_e = similar( prop_x ) > > > # this two versions just cost the same time > > > xplusy = Array(Float64, ns, ns) > > > #xplusy = Array( Float64, ns^2) > > > > > > ##### absorb boundary > > > r_a = ns * dx /2 - 50.0 > > > δ = 10.0 > > > absorb = Array(Float64, ns, ns) > > > > > > k0 = 2π / (ns * dx) > > > > > > @inbounds for j in 1:ns > > > @inbounds for i in 1:ns > > > prop_x[i, j] = exp( -im * get_potential(x[i], x[j]) * dt / 2 ) > > > prop_p[i, j] = exp( -im * (p[i]^2 + p[j]^2)/2 * dt ) > > > > > > xplusy[i, j] = x[i] + x[j] > > > > > > absorb[i, j] = (1.0 - get_out(x[i], r_a, δ ))* (1.0 - > > > get_out(x[j], > > > r_a, δ)) > > > end > > > end > > > > > > if flags[2] == 1 > > > pvpx = Array(Float64, ns, ns) > > > @inbounds for j in 1:ns > > > @inbounds for i in 1:ns > > > pvpx[i, j] = get_pvpx(x[i], x[j]) > > > end > > > end > > > end > > > > > > if flags[4] == 1 > > > ϕo = zeros(Complex128, ns, ns) > > > ϕp = zeros(Complex128, ns, ns) > > > @inbounds for j in 1:ns > > > @inbounds for i in 1:ns > > > splitter[i, j] = get_out(x[i], r_sp, δ_sp) * > > > get_out(x[j], r_sp, δ_sp) > > > end > > > end > > > end > > > > > > for i in 1:nt > > > for j in eachindex(ϕ) > > > prop_e[j] = exp( -im * ele[i] * xplusy[j] * dt/2.0) > > > ************************************169 > > > end > > > > > > for j in eachindex(ϕ) > > > ϕ[j] *= prop_x[j] * prop_e[j] > > > end > > > p_fft! * ϕ > > > for j in eachindex(ϕ) > > > ϕ[j] *= prop_p[j] > > > end > > > p_fft! \ ϕ > > > for j in eachindex(ϕ) > > > ϕ[j] *= prop_x[j] * prop_e[j] > > > end > > > ########## autocorrelation function σ(t) > > > if flags[1] == 1 > > > for j in eachindex(ϕ) > > > σ[i] += conj(ϕg[j]) * ϕ[j] > > > end > > > end > > > ########## dipole acceleration a(t) > > > if flags[2] == 1 > > > for j in eachindex(ϕ) > > > a[i] += abs(ϕ[j])^2 * (pvpx[j] + 2ele[i]) > > > end > > > end > > > ########## ionization probability ip(t) > > > if flags[3] == 1 > > > for j1 in n1:n2 > > > for j2 in 1:ns > > > ip[i] += abs( ϕ[j2+ns*(j1-1)] )^2 > > > end > > > end > > > for j1 in [1:n1-1; n2+1:ns] > > > for j2 in n1:n2 > > > ip[i] += abs( ϕ[j2+ns*(j1-1)] )^2 > > > end > > > end > > > end > > > ########## get momentum > > > if flags[4] == 1 > > > for j in eachindex(ϕo) > > > ϕo[j] = ϕ[j] * splitter[j] * exp( -im * A[i]*xplusy[j] ) > > > **********************************211 > > > end > > > for j in eachindex(p) > > > pp[j] = p[j]^2 /2 * (nt-i) - p[j] *sum( A[i:nt] ) + sum( > > > A²[1:nt] ) /2 ******************214 > > > end > > > for j2 in 1:ns > > > for j1 in 1:ns > > > ϕo[j1, j2] = ϕo[j1, j2] * exp( -im * (pp[j1] + > > > pp[j2]) * dt)************************218 > > > end > > > end > > > p_fft! * ϕo > > > for j in eachindex(ϕp) > > > ϕp[j] += ϕo[j] > > > end > > > end > > > > > > ########## absorb boundary > > > if mod(i, 300) == 0 > > > for j in eachindex(ϕ) > > > ϕ[j] *= absorb[j] > > > end > > > end > > > > > > if (mod(i, 500) == 0) > > > println("i = $i") > > > flush(STDOUT) > > > end > > > end > > > σ *= dx^2 > > > a *= dx^2 > > > ip *= dx^2 > > > > > > save("data/fs.jld", "ϕ", ϕ) > > > if flags[1] == 1 > > > save("data/sigma.jld", "σ", σ) > > > end > > > if flags[2] == 1 > > > save("data/a.jld", "a", a) > > > end > > > if flags[3] == 1 > > > save("data/ip.jld", "ip", ip) > > > end > > > if flags[4] == 1 > > > save("data/pf.jld", "ϕp", ϕp) > > > end > > > > > > #return σ, a, ip, ϕ > > > nothing > > > end > > > > > > > >