Hello, I'm trying to understand how basic functions on vectors perform as a function of the vector size (n).
When I track the elapsed time per element (in nanoseconds) against the vector size (n), I see that I need at least 100 elements in the vector to reach half of the maximum speed (7 ns/el at n = 1e3). So my question is: what am I measuring between n = 1 and n = 100, and why is performance drastically poorer in this region? Is this the cost of calling the function? Is it a problem with my profiling method?

Thanks, Lionel

CPU(1)     = 300 ns/el
CPU(10)    =  40 ns/el
CPU(100)   =  12 ns/el
CPU(200)   =  10 ns/el
CPU(1_000) =   7 ns/el  (max speed)

```julia
using Gadfly
using DataFrames

N = [1, 2, 3, 4, 5, 6, 7, 8, 9,
     10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500, 750,
     1_000, 2_500, 5_000, 7_500, 10_000, 100_000, 1_000_000]

cpu = Float64[]
for n in N
    a = n == 1 ? pi : rand(n)      # scalar for n = 1, vector otherwise
    sqrt(a)                        # warm-up call to exclude compilation time
    gc()
    gc_enable(false)               # keep the GC out of the timed region
    t = mean([@elapsed sqrt(a) for i = 1:100]) * (1e9 / n)   # mean ns per element
    gc_enable(true)
    push!(cpu, t)
end

df = DataFrame()
df[:N] = N
df[:CPU] = cpu

path = Pkg.dir("MKL") * "/benchmark/"
p = Gadfly.plot(
    layer(df, x = "N", y = "CPU", Geom.line),
    Scale.x_log10,
    Guide.xlabel("n-element vector"),
    Guide.ylabel("CPU time in nsec/element"),
    Guide.title("CPU time for sqrt(X) where X = Float64[] with n elements"))
draw(PNG(path * "sqrt_cpu(n).png", 20cm, 20cm), p)
p
```
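As a cross-check of the profiling side of the question, here is a minimal sketch of the same measurement using the BenchmarkTools package (an assumption on my part, not part of the script above). `@belapsed` reruns the expression many times and reports the minimum time, and interpolating the argument with `$` keeps global-variable lookup out of the timed call, so it may separate the cost of `sqrt` itself from the overhead of the timing loop.

```julia
# A minimal sketch, assuming the BenchmarkTools package is installed.
using BenchmarkTools

for n in [1, 10, 100, 1_000]
    a = n == 1 ? pi : rand(n)      # same inputs as in the script above
    # Minimum elapsed time in seconds over many repetitions of the call;
    # `$a` interpolates the value so globals are not part of the timing.
    t = @belapsed sqrt($a)
    println("n = $n : ", t * 1e9 / n, " ns/el")
end
```

If the numbers from this sketch agree with the `@elapsed` loop, the small-n behaviour would not be a profiling artifact.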