Re: [julia-users] Re: Same native code, different performance

Kristoffer Carlsson Thu, 24 Sep 2015 12:47:45 -0700

I think Tom is right here. These lines call the pow function:

    movabsq $pow, %rax
    callq   *%rax


but the actual pow function being called is different. I am surprised 
there is that much of a performance difference between the two pow 
functions... That seems odd.

What Mauro says is also interesting: the speed difference is there (and 
is just as large) even without the fastmath macro.

My question now is: what does IEEE say about x^double vs x^int? Is there 
any reason these should have different performance? If not, would it make 
sense to always convert the exponent to a double and call the libm 
version? Shouldn't a double be able to exactly represent every integer 
that the power function takes?
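
As a rough check of the representability question, here is a minimal sketch. 
It assumes current Julia syntax rather than the 0.4-RC2 used in this thread, 
and the names roundtrips, samples, n_large and same are made up for 
illustration. It uses Int32 because the integer pow_fast method quoted below 
converts its exponent to Int32. It only tests exactness of the conversion 
and agreement of the results; it says nothing about speed.

```julia
# Sketch only: does an Int32 exponent survive a round trip through Float64?
# A Float64 has a 53-bit significand, so every 32-bit integer fits exactly.
roundtrips(n::Int32) = Int32(Float64(n)) == n

samples = Int32[typemin(Int32), -7, -1, 0, 1, 2, 10, typemax(Int32)]
all_exact = all(roundtrips, samples)

# Counterexample for wider integers: 2^53 + 1 is not representable as a
# Float64, so the round trip through Float64 changes the value.
n_large = Int64(2)^53 + 1
large_exact = Int64(Float64(n_large)) == n_large

# The integer-exponent and float-exponent paths should agree up to rounding.
x = pi / 4
n = 3
same = isapprox(x^n, x^Float64(n))

println("all Int32 samples exact: ", all_exact)
println("2^53 + 1 exact:          ", large_exact)
println("x^n ≈ x^Float64(n):      ", same)
```

So for Int32 exponents the conversion itself loses nothing; whether the two 
paths round identically in the last bit is a separate question that this 
sketch does not settle.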


On Thursday, September 24, 2015 at 9:18:45 PM UTC+2, Mauro wrote:
>
> I dissected the bench-method into two, just to be sure (on 0.4-RC2). 
>
> julia> function bench(N) 
>           for i = 1:N 
>                f(π/4) 
>           end 
>        end 
> bench (generic function with 1 method) 
>
> julia> function bench_f(N) 
>           for i = 1:N 
>                f_float(π/4) 
>           end 
>        end 
> bench_f (generic function with 1 method) 
>
> They also have identical native code but run differently: 
>
> julia> @time bench_f(10^7) 
>   0.190613 seconds (5 allocations: 176 bytes) 
>
> julia> @time bench(10^7) 
>   0.780212 seconds (5 allocations: 176 bytes) 
>
> I thought that @code_native shows the code which is actually run, so why 
> different speeds? 
>
> If I define the f* functions without the @fastmath macro, then I get 
> the same performance as above: 
>
> julia> @time bench_f(10^7) 
>   0.203071 seconds (5 allocations: 176 bytes) 
>
> julia> @time bench(10^7) 
>   0.787696 seconds (5 allocations: 176 bytes) 
>
> but with different native-codes. 
>
> > I can reproduce... I think the 2 versions will call these methods 
> > respectively... I guess there's a performance difference? 
> > 
> > pow_fast{T<:FloatTypes}(x::T, y::Integer) = 
> >     box(T, Base.powi_llvm(unbox(T,x), unbox(Int32,Int32(y)))) 
> > 
> > pow_fast(x::Float64, y::Float64) = 
> >     ccall(("pow",libm), Float64, (Float64,Float64), x, y) 
> > 
>
> Tom, are those the two functions called within the native code?  I'm not 
> good at reading assembly. 
>
