Hello,

I'm trying to understand how basic functions on vectors perform as a
function of the vector size (n).

When I track the elapsed time per element (in nanoseconds) against the
vector size (n), I see that I need at least 100 elements in the vector to
reach half the maximum speed (which is about 7 ns/element at n = 1e3).

So my question is: what am I measuring between n = 1 and n = 100, and why
is the performance drastically poorer in this region?
Is this the cost of calling the function?
Is this a problem with my profiling method?
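
(To be clear about what I mean by "profiling method": here is a small
sketch, using only @elapsed and mean from Base, that estimates the fixed
overhead of the timing itself; that overhead is also divided by n in my
per-element numbers, so it would matter most for small n.)

# Sketch only: estimate the fixed overhead of the @elapsed macro by timing
# an empty expression. This cost is divided by n in the per-element figures.
timer_overhead_ns = mean([@elapsed nothing for i = 1:100]) * 1e9
println("approx. @elapsed overhead: $timer_overhead_ns ns per call")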

Thanks,
Lionel

CPU(1)     = 300 ns/el
CPU(10)    =  40 ns/el
CPU(100)   =  12 ns/el
CPU(200)   =  10 ns/el
CPU(1_000) =   7 ns/el = max speed
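
(Just the arithmetic on the numbers above, multiplying each per-element
figure back by n to get the total time per sqrt call:)

# Total time per sqrt(a) call, derived from the per-element numbers above
n_vals    = [1, 10, 100, 200, 1_000]
ns_per_el = [300, 40, 12, 10, 7]
total_ns  = n_vals .* ns_per_el    # [300, 400, 1200, 2000, 7000] ns per call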

using Gadfly
using DataFrames

N = [   1,2,3,4,5,6,7,8,9,
        10,20,30,40,50,60,70,80,90,100,200,300,400,500,750,
        1_000,2_500,5_000,7_500,10_000,100_000,1_000_000]

cpu = Float64[]
for n in N
    a = n == 1 ? pi : rand(n)   # scalar for n = 1, random vector otherwise
    sqrt(a)                     # warm-up call to force compilation
    gc()                        # collect before timing
    gc_enable(false)            # disable GC while timing
    t = mean([@elapsed sqrt(a) for i = 1:100]) * (1e9 / n)   # mean of 100 runs, in ns per element
    gc_enable(true)
    push!(cpu, t)
end

df = DataFrame()
df[:N] = N
df[:CPU] = cpu

path = Pkg.dir("MKL") * "/benchmark/"
p = Gadfly.plot(
                layer(df, x = "N", y = "CPU", Geom.line),
                Scale.x_log10,
                Guide.xlabel("n-element vector"),
                Guide.ylabel("CPU time in nsec/element"),
                Guide.title("CPU time for sqrt(X) where X = Float64[] with n elements"))
draw(PNG(path * "sqrt_cpu(n).png", 20cm, 20cm), p)
p