I think Tom is right here. These lines call the pow function

    movabsq $pow, %rax
    callq   *%rax

but the actual pow function being called is different in each case. I am surprised that the performance difference between the two pow functions is that large; that seems odd. What Mauro says is also interesting: the speed difference is there (and is just as large) even without the @fastmath macro.

My question now is: what does IEEE say about x^double vs. x^int? Is there any reason these should have different performance? If not, it seems to make sense to always convert the exponent to a double and call the libm version. Shouldn't a double be able to represent exactly every integer that the power function takes?

On Thursday, September 24, 2015 at 9:18:45 PM UTC+2, Mauro wrote:
>
> I dissected the bench-method into two, just to be sure (on 0.4-RC2).
>
> julia> function bench(N)
>            for i = 1:N
>                f(π/4)
>            end
>        end
> bench (generic function with 1 method)
>
> julia> function bench_f(N)
>            for i = 1:N
>                f_float(π/4)
>            end
>        end
> bench_f (generic function with 1 method)
>
> They also have identical native code but run differently:
>
> julia> @time bench_f(10^7)
>   0.190613 seconds (5 allocations: 176 bytes)
>
> julia> @time bench(10^7)
>   0.780212 seconds (5 allocations: 176 bytes)
>
> I thought that @code_native shows the code which is actually run, so why
> different speeds?
>
> If I define the f* functions without the @fastmath macro, then I get
> the same performance as above:
>
> julia> @time bench_f(10^7)
>   0.203071 seconds (5 allocations: 176 bytes)
>
> julia> @time bench(10^7)
>   0.787696 seconds (5 allocations: 176 bytes)
>
> but with different native-codes.
>
> > I can reproduce... I think the 2 versions will call these methods
> > respectively... I guess there's a performance difference?
> >
> > pow_fast{T<:FloatTypes}(x::T, y::Integer) =
> >     box(T, Base.powi_llvm(unbox(T,x), unbox(Int32,Int32(y))))
> >
> > pow_fast(x::Float64, y::Float64) =
> >     ccall(("pow",libm), Float64, (Float64,Float64), x, y)
> >
>
> Tom, or are those two functions called within the native-code?  I'm no
> good assembler reader.
>
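
On the representability question, here is a quick check rather than an authoritative answer. It is a minimal sketch that assumes a current Julia release (not 0.4-RC2) and assumes that the integer method only ever sees an Int32 exponent, as the Int32(y) conversion in the quoted pow_fast suggests. It covers only whether the exponent survives the conversion to Float64, not whether the two pow routines return identical results.

```julia
# Exponents at the edges of the Int32 range, plus a few ordinary values.
for n in (typemin(Int32), Int32(-1), Int32(0), Int32(7), typemax(Int32))
    # Converting to Float64 and back must return the same integer
    # if Float64 represents it exactly.
    @assert Int32(Float64(n)) == n
end

# Float64 has a 53-bit significand, so integers are exact up to 2^53 in magnitude...
@assert Float64(2^53) == 2^53

# ...but 2^53 + 1 has no exact Float64 representation and rounds back to 2^53.
@assert Float64(2^53 + 1) == Float64(2^53)
```

So any Int32 exponent converts to a double without loss; only Int64 exponents beyond 2^53 in magnitude would be a problem.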