First, don't benchmark in global scope; put the code in a function. Second, see https://github.com/JuliaLang/julia/pull/17623 for .+ and company.
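
A minimal sketch of the first point, using the expressions from the quoted message. The function names nested! and summed! are made up for illustration, and the arrays are built with comprehensions rather than linspace so the setup should not depend on the Julia version. The broadcast! form is one possible way to get a single pass without relying on .+ fusing, assuming (as the linked PR suggests) that .+ is not yet treated as a fusing dot call on your version. Each function is called once before timing so that compilation is not measured.

```julia
# Hypothetical helper names; only the broadcast expressions come from the thread.

# Nested dot calls with in-place assignment: sin and cos are applied in one
# loop over y and the results are written straight into x.
function nested!(x, y)
    x .= cos.(sin.(y))
    return x
end

# Single-pass version of x .= sin.(y) .+ cos.(z) that does not depend on
# `.+` fusing: the whole expression is one scalar function given to broadcast!.
function summed!(x, y, z)
    broadcast!((a, b) -> sin(a) + cos(b), x, y, z)
    return x
end

# Same sizes and ranges as in the question.
y = [i / 999 for i in 0:999]                # 1000 points in [0.0, 1.0]
z = [0.5 + 2.5 * i / 999 for i in 0:999]    # 1000 points in [0.5, 3.0]
x = zeros(Float64, 1000)

# Warm-up calls: the first call of each function includes compilation.
nested!(x, y)
summed!(x, y, z)

# Time the second calls only.
@time nested!(x, y)
@time summed!(x, y, z)
```

Timing the same expressions directly at the prompt, as in the quoted message, also counts compilation and the overhead of working with non-constant globals, so the reported allocations there need not come from intermediate arrays.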
On Thursday, October 20, 2016 at 10:55:25 PM UTC+2, Alexey Cherkaev wrote:
>
> Hi all!
>
> Consider example:
>
> # collect to produce vectors - not strictly necessary though
> y = collect(linspace(0.0, 1.0, 1000))
> z = collect(linspace(0.5,3.0, 1000))
>
> x = zeros(Float64, 1000)
>
> If I do
>
> @time x .= sin.(y)
>
> The timing output is
>
> 0.000057 seconds (7 allocations: 208 bytes)
>
> So, OK, no, let’s call it, “real” allocation. However, if I do:
>
> @time x .= cos.(sin.(y))
>
> I get
>
> 0.018246 seconds (7.16 k allocations: 322.747 KB)
>
> Or
>
> @time x .= sin.(y) .+ cos.(z)
>
> 0.000376 seconds (63 allocations: 25.984 KB)
>
> Better, but still 26 KB allocated! I was under impression that .-operations
> fuse, producing no intermediate arrays. Am I wrong?
>