Try putting some extraneous parentheses around some of your operations, and you'll get good performance again. It's an inlining thing.
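A minimal sketch of what the extra parentheses might look like, assuming the slowdown comes from the long `b[i] * c[i] * (...) * ...` chain being parsed as a single many-argument call to `*` that fails to inline. That cause is my assumption, not something confirmed here. The name `test_grouped!` is hypothetical; the body is the loop from the quoted message with the factors grouped so that no `*` call takes more than three arguments:

```julia
# Hypothetical variant of the quoted `test` function. The extra parentheses
# split the six-factor product into a three-argument product of pairs.
function test_grouped!(a, b, c)
    @inbounds @simd for i in 1:length(a)
        a[i] += (b[i] * c[i]) *
                ((c[i] < b[i]) * (c[i] > b[i])) *
                ((c[i] <= b[i]) * (c[i] >= b[i]))
    end
    return a
end

n = 100000
a = zeros(Float32, n)
b = rand(Float32, n)
c = rand(Float32, n)

test_grouped!(a, b, c)        # first call compiles the function
@time test_grouped!(a, b, c)  # time the second call only
```

If the allocation count reported by `@time` drops back to a handful, the grouping helped. Whether the loop actually vectorises can be checked with `@code_llvm test_grouped!(a, b, c)`.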
Please do report this as an issue: https://github.com/JuliaLang/julia/issues/new

--Tim

On Wednesday, October 14, 2015 08:07:11 AM Damien wrote:
> Hi all,
>
> I'm noticing a strange performance issue with expressions such as this one:
>
> n = 100000
> a = zeros(Float32, n)
> b = rand(Float32, n)
> c = rand(Float32, n)
>
> function test(a, b, c)
>     @simd for i in 1:length(a)
>         @inbounds a[i] += b[i] * c[i] * (c[i] < b[i]) * (c[i] > b[i]) *
>                           (c[i] <= b[i]) * (c[i] >= b[i])
>     end
> end
>
> The problem is that performance and successful vectorisation depend on the
> number of comparison statements in the expression and whether the
> comparisons are explicitly cast to Float32.
>
> In Julia 0.4-rc4, I get the following:
>
> @inbounds a[i] += b[i] * c[i] * (c[i] < b[i]) * (c[i] > b[i]) *
>                   (c[i] <= b[i])
>
> > test(a, b, c)
> > @time test(a, b, c)
>
> 0.000169 seconds (4 allocations: 160 bytes)
>
> @inbounds a[i] += b[i] * c[i] * (c[i] < b[i]) * (c[i] > b[i]) *
>                   (c[i] <= b[i]) * (c[i] >= b[i])
>
> > test(a, b, c)
> > @time test(a, b, c)
>
> 0.007258 seconds (200.00 k allocations: 3.052 MB, 47.59% gc time)
>
> @inbounds a[i] += b[i] * c[i] * Float32(c[i] < b[i]) * Float32(c[i] > b[i]) *
>                   Float32(c[i] <= b[i]) * Float32(c[i] <= b[i])
>
> > test(a, b, c)
> > @time test(a, b, c)
>
> 0.000137 seconds (4 allocations: 160 bytes)
>
> I get a similar behavior in the current 0.5 HEAD (Commit d9f7c21* with the
> fix for issue #13553) but the threshold for the number of comparisons is
> slightly different.
>
> (a) Is it meant to be OK to use expressions like a[i] * (c[i] < b[i]) or
> should I always cast explicitly? I really like the implicit version,
> because it is very readable and a natural translation of equations
> involving cases.
>
> (b) What is causing the vectorisation threshold observed here?
>
> Best,
> Damien