[julia-users] Re: Same native code, different performance

2015-09-25 Thread TY
Out of curiosity, I tried the following (on Julia 0.4-rc1): f(x) = ( c = cos(x); c^3 ) f_float(x) = ( c = cos(x); c^3.0 ). With these I get 0.006489 seconds and 0.013220 seconds, but with the original code I get 0.076714 seconds and 0.013280 seconds (both without @fastmath).
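The comparison above can be sketched as a self-contained benchmark (function names follow the thread; absolute timings are machine-dependent, and the sum accumulator is there only so the loop is not optimized away):

```julia
f(x) = (c = cos(x); c^3)          # integer literal exponent
f_float(x) = (c = cos(x); c^3.0)  # floating-point literal exponent

# Accumulate into a sum so the compiler cannot drop the loop body.
function bench(g, N)
    s = 0.0
    for i in 1:N
        s += g(pi / 4)
    end
    return s
end

bench(f, 1); bench(f_float, 1)    # force compilation before timing
@time bench(f, 10^7)
@time bench(f_float, 10^7)
```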

Re: [julia-users] Re: Same native code, different performance

2015-09-25 Thread Páll Haraldsson
On Thursday, September 24, 2015 at 8:05:52 PM UTC, Jeffrey Sarnoff wrote: > > It could be that integer powers are done with binary shifts in software > and the floating point powers are computed in the fpu. > I suspect not. [At least in this case here, where the numbers to be raised to a power a

Re: [julia-users] Re: Same native code, different performance

2015-09-25 Thread Kristoffer Carlsson
If you want to reproduce the results above and below you can use JuliaBox. This has something to do with the constant propagation of sin and cos I think. Changing cos to x reverses the results. f(x) = @fastmath x^3 f2(x) = @fastmath x^3.0 fs(x) = @fastmath cos(x)^3 fs2(x) = @fastmath cos(x)^3.0
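One way to test the constant-propagation hypothesis is to inspect the generated IR: if the power (or the whole expression) has been folded at compile time, no call to pow appears in the output. A sketch using two of the definitions above:

```julia
fs(x) = @fastmath cos(x)^3
fs2(x) = @fastmath cos(x)^3.0

# Look for a `call` to a pow/powi symbol in each dump; a version where the
# computation was constant-folded or inlined away will show none.
code_llvm(fs, (Float64,))
code_llvm(fs2, (Float64,))
```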

Re: [julia-users] Re: Same native code, different performance

2015-09-25 Thread Kristoffer Carlsson
If you want to reproduce this you can use JuliaBox ( I changed cos(x) to x because it doesnt change anything) On Thursday, September 24, 2015 at 11:03:21 PM UTC+2, Erik Schnetter wrote:

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Erik Schnetter
On Thu, Sep 24, 2015 at 4:56 PM, Yichao Yu wrote: > On Thu, Sep 24, 2015 at 4:42 PM, Erik Schnetter > wrote: > > In the native code above, the C function `pow(double, double)` is called > in > > both cases. Maybe `llvm_powi` is involved; if so, it is lowered to the > same > > `pow` function. The

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Yichao Yu
On Thu, Sep 24, 2015 at 4:42 PM, Erik Schnetter wrote: > In the native code above, the C function `pow(double, double)` is called in > both cases. Maybe `llvm_powi` is involved; if so, it is lowered to the same > `pow` function. The speed difference must have a different reason. Not necessarily,
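The point here is that identical-looking native code can still reach different callees through the indirect `callq *%rax`: the address loaded by `movabsq` can differ between the two versions. A hedged way to check is to compare the native dumps directly:

```julia
f(x) = cos(x)^3
f_float(x) = cos(x)^3.0

# Compare the two dumps: the immediate loaded before `callq *%rax` is the
# address of the actual callee, which can differ even when the surrounding
# instructions look the same.
code_native(f, (Float64,))
code_native(f_float, (Float64,))
```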

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Kristoffer Carlsson
I don't like to invoke the black magic card here. I have tried benchmarking in different ways in different scenarios and the results are consistent. It is also reproducable by others. FWIW this is what lead me to this https://github.com/JuliaDiff/ForwardDiff.jl/issues/57

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Erik Schnetter
In the native code above, the C function `pow(double, double)` is called in both cases. Maybe `llvm_powi` is involved; if so, it is lowered to the same `pow` function. The speed difference must have a different reason. Sometimes there are random things occurring that invalidate benchmark results.

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Kristoffer Carlsson
But the floating ones are the faster ones. Shouldn't it be the opposite?

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Jeffrey Sarnoff
It could be that integer powers are done with binary shifts in software and the floating-point powers are computed in the FPU. On Thursday, September 24, 2015 at 3:47:22 PM UTC-4, Kristoffer Carlsson wrote: > > I think Tom is right here. These lines call the pow function > > movabsq $pow,
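For reference, an integer power needs no libm call at all: it can be computed by repeated squaring, using only multiplications. A minimal sketch of the technique (this is an illustrative helper, not Base's actual implementation):

```julia
function pow_by_squaring(x::Float64, n::Integer)
    n < 0 && return 1.0 / pow_by_squaring(x, -n)
    r = 1.0
    while n > 0
        isodd(n) && (r *= x)  # fold in the current bit of the exponent
        x *= x                # square for the next bit
        n >>= 1
    end
    return r
end

pow_by_squaring(2.0, 10)  # 1024.0
```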

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Kristoffer Carlsson
I think Tom is right here. These lines call the pow function movabsq $pow, %rax callq *%rax but the actual pow function that is being called is different. I am surprised there is that much of a difference in performance between the two pow functions... That seems odd. What Mauro says

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Mauro
I dissected the bench-method into two, just to be sure (on 0.4-RC2). julia> function bench(N) for i = 1:N f(π/4) end end bench (generic function with 1 method) julia> function bench_f(N) for i = 1:N f_float(π/4) end

Re: [julia-users] Re: Same native code, different performance

2015-09-24 Thread Tom Breloff
I can reproduce... I think the 2 versions will call these methods respectively... I guess there's a performance difference? pow_fast{T<:FloatTypes}(x::T, y::Integer) = box(T, Base.powi_llvm(unbox(T,x), unbox(Int32,Int32(y)))) pow_fast(x::Float64, y::Float64) = ccall(("pow",libm), Float64, (Float64,Float64), x, y)
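Since @fastmath rewrites `^` into a pow_fast call, which of the two methods is hit depends only on the type of the exponent literal (Int vs. Float64). A sketch of checking the dispatch, assuming pow_fast lives in Base.FastMath as in the source quoted above:

```julia
# The exponent literals have different types, so the pow_fast call
# dispatches to different methods:
@which Base.FastMath.pow_fast(1.5, 3)    # Integer-exponent method (powi)
@which Base.FastMath.pow_fast(1.5, 3.0)  # Float64-exponent method (libm pow)
```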

[julia-users] Re: Same native code, different performance

2015-09-24 Thread Simon Danisch
I cannot reproduce this on RC2. Probably the inlining fails for f on some Julia version? On Thursday, September 24, 2015 at 18:04:18 UTC+2, Kristoffer Carlsson wrote: > > Can someone explain these results to me. > > Two functions: > f(x) = @fastmath cos(x)^3 > f_float(x) = @fastmath cos(x)^3.0
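Whether inlining happened can be checked directly rather than guessed, using the reflection macros on the thread's definitions (a sketch; output varies by Julia version):

```julia
f(x) = @fastmath cos(x)^3
f_float(x) = @fastmath cos(x)^3.0

# Inspect the typed, post-inlining IR: if the power call survives it shows
# up as an explicit call; if it was inlined or folded, it does not.
@code_typed f(0.5)
@code_typed f_float(0.5)
```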