Can someone explain these results to me?

Two functions:

f(x) = @fastmath cos(x)^3
f_float(x) = @fastmath cos(x)^3.0


Identical native code (the only difference is the address of the constant loaded before the call to pow):

julia> code_native(f, (Float64,))
    .text
Filename: none
Source line: 1
    pushq   %rbp
    movq    %rsp, %rbp
    movabsq $cos, %rax
Source line: 1
    callq   *%rax
    movabsq $140084090479408, %rax  # imm = 0x7F67DE73A330
    vmovsd  (%rax), %xmm1
    movabsq $pow, %rax
    callq   *%rax
    popq    %rbp
    ret

julia> code_native(f_float, (Float64,))
    .text
Filename: none
Source line: 1
    pushq   %rbp
    movq    %rsp, %rbp
    movabsq $cos, %rax
Source line: 1
    callq   *%rax
    movabsq $140084090501536, %rax  # imm = 0x7F67DE73F9A0
    vmovsd  (%rax), %xmm1
    movabsq $pow, %rax
    callq   *%rax
    popq    %rbp
    ret

Still a large difference in performance:

function bench(N)
    @time for i = 1:N
        f(π/4)
    end
    @time for i = 1:N
        f_float(π/4)
    end
end

julia> bench(10^6)
  0.062536 seconds
  0.010077 seconds
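
To rule out measurement artifacts, such as compilation time landing in the first loop or the unused results being optimized away, a variant along these lines could be used. It is only a sketch: `sum_calls` is a hypothetical helper that was not part of the runs above, and no timings from it are reported here.

```julia
f(x) = @fastmath cos(x)^3
f_float(x) = @fastmath cos(x)^3.0

# Hypothetical helper: accumulate the results so the calls cannot be
# dropped as dead code, and return the sum so it is actually used.
function sum_calls(g, N)
    s = 0.0
    for i = 1:N
        s += g(π/4)
    end
    return s
end

# Warm-up calls, so compilation is not counted in the timed runs.
sum_calls(f, 1)
sum_calls(f_float, 1)

@time sum_calls(f, 10^6)
@time sum_calls(f_float, 10^6)
```

If the gap persists with this version, it is less likely to be an artifact of how the loops were timed.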

Secondly, can someone explain why there should be a performance difference
at all? Is raising to a float power that is == an int defined differently
from raising to the int itself? IEEE shenanigans?
