Thanks for the confirmation! Yes, I need more tests to see what the best 
practice is for my particular problem. 
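
For anyone running the same comparison, a minimal sketch of such a test might look like the following. The helper names (mul_plain!, mul_simd!, best_time) are made up for illustration and do not come from the thread. It uses only Base Julia, and the timings will depend on the machine, the Julia version, and the array size.

```julia
# Hypothetical helpers for illustration only; not taken from the thread.

# In-place entry-wise product with bounds checks off and no @simd.
function mul_plain!(A, B)
    length(A) == length(B) || throw(DimensionMismatch("A and B must have the same length"))
    @inbounds for i in 1:length(A)
        A[i] *= B[i]
    end
    return A
end

# The same loop with @simd added.
function mul_simd!(A, B)
    length(A) == length(B) || throw(DimensionMismatch("A and B must have the same length"))
    @inbounds @simd for i in 1:length(A)
        A[i] *= B[i]
    end
    return A
end

# Best of `reps` timings. One call is made before the timed loop so that
# compilation time is not counted; each timed run works on a fresh copy of A.
function best_time(f!, A, B, reps)
    f!(copy(A), B)
    best = Inf
    for _ in 1:reps
        C = copy(A)
        t = @elapsed f!(C, B)
        best = min(best, t)
    end
    return best
end

n = 300000          # same size as the benchmark quoted further down
A = randn(n)
B = randn(n)

println("plain: ", best_time(mul_plain!, A, B, 100))
println("simd:  ", best_time(mul_simd!, A, B, 100))
```

Taking the minimum over many runs is just one way to reduce noise. Changing n to the real problem size is the point of the exercise, since the ranking of the two loops may differ between small and large arrays.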



On Monday, June 20, 2016 at 3:05:31 PM UTC+1, Chris Rackauckas wrote:
>
> Most likely. I would also time it with and without @simd at your problem 
> size. For some reason I've had some simple loops do better without @simd. 
>
> On Monday, June 20, 2016 at 2:50:22 PM UTC+1, chobb...@gmail.com wrote:
>>
>> Thanks! I'm still using v0.4.5. In this case, is the code I highlighted 
>> above still the best choice for doing the job?
>>
>>
>> On Monday, June 20, 2016 at 1:57:25 PM UTC+1, Chris Rackauckas wrote:
>>>
>>> I think that for medium-sized (but not large) arrays in v0.5 you may want 
>>> to use @threads from the threading branch, and then for really large 
>>> arrays you may want to use @parallel. But you'd have to test some timings.
>>>
>>> On Monday, June 20, 2016 at 11:38:15 AM UTC+1, chobb...@gmail.com wrote:
>>>>
>>>> I have the same question about how to calculate the entry-wise vector 
>>>> product, and found this thread. As a novice, I wonder whether the 
>>>> following code snippet is still the standard for entry-wise vector 
>>>> multiplication that one should stick to in practice? Thanks!
>>>>
>>>>
>>>> @fastmath @inbounds @simd for i=1:n
>>>> A[i] *= B[i]
>>>> end
>>>>
>>>>
>>>>
>>>> On Tuesday, October 6, 2015 at 3:28:29 PM UTC+1, Lionel du Peloux wrote:
>>>>>
>>>>> Dear all,
>>>>>
>>>>> I'm looking for the fastest way to do element-wise vector 
>>>>> multiplication in Julia. The best I could do is the following 
>>>>> implementation, which still runs 1.5x slower than the dot product. I 
>>>>> assume the dot product includes such an operation and then does a 
>>>>> cumulative sum over the element-wise product.
>>>>>
>>>>> The MKL lib includes such an operation (v?Mul) but it seems OpenBLAS 
>>>>> does not. So my question is :
>>>>>
>>>>> 1) Is there any chance I can do vector element-wise multiplication 
>>>>> faster than the actual dot product?
>>>>> 2) Why is the built-in element-wise multiplication operator (.*) much 
>>>>> slower than my own implementation for such a basic linear algebra 
>>>>> operation (full Julia)?
>>>>>
>>>>> Thank you,
>>>>> Lionel
>>>>>
>>>>> Best custom implementation :
>>>>>
>>>>> function xpy!{T<:Number}(A::Vector{T},B::Vector{T})
>>>>>   n = size(A)[1]
>>>>>   if n == size(B)[1]
>>>>>     for i=1:n
>>>>>       @inbounds A[i] *= B[i]
>>>>>     end
>>>>>   end
>>>>>   return A
>>>>> end
>>>>>
>>>>> Benchmark results (JuliaBox, A = randn(300000)):
>>>>>
>>>>> function                          CPU (s)     GC (%)  ALLOCATION (bytes)  CPU (x)
>>>>> dot(A,B)                          1.58e-04    0.00    16                  1.0
>>>>> xpy!(A,B)                         2.31e-04    0.00    80                  1.5
>>>>> NumericExtensions.multiply!(P,Q)  3.60e-04    0.00    80                  2.3
>>>>> xpy!(A,B) - no @inbounds check    4.36e-04    0.00    80                  2.8
>>>>> P.*Q                              2.52e-03    50.36   2400512             16.0
>>>>> ############################################################
>>>>>
>>>>>
