Hello everyone, I want to compute tanh in place, y = tanh(x), where y and x are preallocated (both are Array{Float64,1} with 1024*1024 elements). I tested different implementations and got very different performance.
Version 1:

julia> @time map!(tanh, y, x)
  0.149988 seconds (3.15 M allocations: 48.000 MB, 1.24% gc time)

Version 2:

julia> @time for i = 1:length(x)
           y[i] = tanh(x[i])
       end
  0.355105 seconds (5.24 M allocations: 95.984 MB, 1.83% gc time)

Version 3: this one is not in-place, but it is the fastest.

julia> @time y += tanh(x)
  0.045402 seconds (10 allocations: 16.000 MB, 6.37% gc time)

It's quite counterintuitive to me that Versions 1 and 2 are much slower than Version 3. Can someone explain this a bit?

Thanks!
Cheng
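
A note that may help with reproducing this: the timings above were taken at the REPL, where x and y are non-constant globals, so the loop and map! calls likely pay for dynamic dispatch and boxing on every element (which would match the millions of allocations reported). Below is a minimal sketch of the same comparison wrapped in functions. The names tanh_map! and tanh_loop! are made up for illustration, and the input here is random data rather than the original arrays.

```julia
# Hypothetical helper names. Wrapping the work in functions lets the compiler
# specialize on the concrete argument types instead of untyped globals.
function tanh_map!(y, x)
    map!(tanh, y, x)          # in-place, element by element
    return y
end

function tanh_loop!(y, x)
    for i in 1:length(x)
        y[i] = tanh(x[i])     # explicit in-place loop
    end
    return y
end

x = rand(1024 * 1024)         # assumed test input, not the original data
y = similar(x)                # preallocated output of the same size and type

# Call each function once first so compilation is not included in the timing.
tanh_map!(y, x)
tanh_loop!(y, x)

@time tanh_map!(y, x)
@time tanh_loop!(y, x)
```

If the global-scope explanation is right, both calls should report close to zero allocations, and the comparison with Version 3 becomes a fair one.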