Thanks for the confirmation! Yes, I need more tests to see what the best practice is for my particular problem.
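For concreteness, here's the quick with/without-@simd comparison I'm planning to run, following Chris's suggestion below. This is just a sketch: the function names, the array size, and the repetition count are my own placeholders, not anything from this thread.

function mul_simd!(A, B)
    @inbounds @simd for i = 1:length(A)
        A[i] *= B[i]
    end
    return A
end

function mul_plain!(A, B)
    @inbounds for i = 1:length(A)
        A[i] *= B[i]
    end
    return A
end

n = 300000
A = randn(n)
B = ones(n)                          # multiplying by 1.0 keeps A stable over repeated runs
mul_simd!(A, B); mul_plain!(A, B)    # warm up so compilation isn't timed
@time for k = 1:1000; mul_simd!(A, B); end
@time for k = 1:1000; mul_plain!(A, B); end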
On Monday, June 20, 2016 at 3:05:31 PM UTC+1, Chris Rackauckas wrote:
>
> Most likely. I would also time it with and without @simd at your problem
> size. For some reason I've had some simple loops do better without @simd.
>
> On Monday, June 20, 2016 at 2:50:22 PM UTC+1, chobb...@gmail.com wrote:
>>
>> Thanks! I'm still using v0.4.5. In this case, is the code I highlighted
>> above still the best choice for doing the job?
>>
>> On Monday, June 20, 2016 at 1:57:25 PM UTC+1, Chris Rackauckas wrote:
>>>
>>> I think that for medium-size (but not large) arrays in v0.5 you may want
>>> to use @threads from the threading branch, and then for really large
>>> arrays you may want to use @parallel. But you'd have to test some timings.
>>>
>>> On Monday, June 20, 2016 at 11:38:15 AM UTC+1, chobb...@gmail.com wrote:
>>>>
>>>> I have the same question about how to calculate the entry-wise vector
>>>> product and found this thread. As a novice, I wonder whether the
>>>> following code snippet is still the standard for entry-wise vector
>>>> multiplication that one should stick to in practice. Thanks!
>>>>
>>>> @fastmath @inbounds @simd for i = 1:n
>>>>     A[i] *= B[i]
>>>> end
>>>>
>>>> On Tuesday, October 6, 2015 at 3:28:29 PM UTC+1, Lionel du Peloux wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I'm looking for the fastest way to do element-wise vector
>>>>> multiplication in Julia. The best I could come up with is the following
>>>>> implementation, which still runs 1.5x slower than the dot product. I
>>>>> assume the dot product would include such an operation ... and then do
>>>>> a cumulative sum over the element-wise products.
>>>>>
>>>>> The MKL lib includes such an operation (v?Mul) but it seems OpenBLAS
>>>>> does not. So my questions are:
>>>>>
>>>>> 1) Is there any chance I can do vector element-wise multiplication
>>>>> faster than the actual dot product?
>>>>> 2) Why is the built-in element-wise multiplication operator (.*) so
>>>>> much slower than my own implementation for such a basic linalg
>>>>> operation (full Julia)?
>>>>>
>>>>> Thank you,
>>>>> Lionel
>>>>>
>>>>> Best custom implementation:
>>>>>
>>>>> function xpy!{T<:Number}(A::Vector{T}, B::Vector{T})
>>>>>     n = size(A)[1]
>>>>>     if n == size(B)[1]    # note: silently returns A unchanged if lengths differ
>>>>>         for i = 1:n
>>>>>             @inbounds A[i] *= B[i]
>>>>>         end
>>>>>     end
>>>>>     return A
>>>>> end
>>>>>
>>>>> Benchmark results (JuliaBox, A = randn(300000)):
>>>>>
>>>>> function                          CPU (s)    GC (%)   ALLOCATION (bytes)   CPU (x)
>>>>> dot(A,B)                          1.58e-04   0.00     16                   1.0
>>>>> xpy!(A,B)                         2.31e-04   0.00     80                   1.5
>>>>> NumericExtensions.multiply!(P,Q)  3.60e-04   0.00     80                   2.3
>>>>> xpy!(A,B) - no @inbounds check    4.36e-04   0.00     80                   2.8
>>>>> P.*Q                              2.52e-03   50.36    2400512              16.0
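P.S. For when I eventually move to v0.5, I'll also try the @threads version Chris mentioned. Below is a minimal sketch of what I think that looks like — my own guess at the usage, untested, and it assumes Julia was started with the JULIA_NUM_THREADS environment variable set. (The P.*Q row in the table above seems dominated by allocating the 2,400,512-byte result array and the ~50% GC time that follows, which is why the in-place loops win.)

using Base.Threads

function mul_threads!(A, B)
    length(A) == length(B) || throw(DimensionMismatch("A and B must have equal length"))
    @threads for i = 1:length(A)    # split the index range across threads
        @inbounds A[i] *= B[i]
    end
    return A
end

A = randn(300000); B = randn(300000)
mul_threads!(A, B)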