Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-02 Thread Patrick Kofod Mogensen
Does that work for you? I have to write A .= (*).(A,B) On Wednesday, November 2, 2016 at 3:51:54 AM UTC+1, Chris Rackauckas wrote: > > It's the other way around. .* won't fuse because it's still an operator. > .= will. If you want .* to fuse, you can instead do: > > A .= *.(A,B) > > since this invokes the broadcast on *, instead of invoking .*.
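
For concreteness, a minimal sketch of the two spellings discussed here, assuming Julia 0.5/0.6 broadcast-fusion semantics and two equal-length Float64 vectors:

    A = rand(1000); B = rand(1000)

    # On 0.5, .* is still an ordinary operator and does not fuse, so the
    # broadcast is spelled explicitly and combined with in-place .= :
    A .= (*).(A, B)     # same as A .= *.(A, B); updates A element-wise in place

    # From 0.6, .* itself lowers to broadcast, so this fuses as well:
    A .= A .* B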

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-02 Thread Tim Holy
Hmm, that's surprising. Looks like we're using generic broadcasting machinery for that operation (check out what @which P.*P returns). Might be good to add .* to this line: https://github.com/JuliaLang/julia/blob/b7f1aa7554c71d3759702b9c2e14904ebdc94199/base/arraymath.jl#L69. Want to make a pull request?
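
A sketch of the kind of introspection being suggested here (P is any Float64 vector):

    P = rand(100)
    @which P .* P    # reports which .* method is dispatched to; a generic
                     # broadcast fallback, rather than a specialized method in
                     # base/arraymath.jl, means the generic machinery is used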

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-02 Thread Sheehan Olver
OK, good to know. I think putting the function in a package is overkill. > On 2 Nov. 2016, at 6:35 pm, Chris Rackauckas wrote: > > Yes, this most likely won't help for GPU arrays because you likely don't want > to be looping through elements serially: you want to call a vectorized GPU > function which will do the computation in parallel on the GPU.

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-02 Thread Chris Rackauckas
Yes, this most likely won't help for GPU arrays because you likely don't want to be looping through elements serially: you want to call a vectorized GPU function which will do the computation in parallel on the GPU. ArrayFire's mathematical operations are already overloaded to do this, but I do
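
A rough sketch of the overloaded GPU path mentioned here, assuming ArrayFire.jl's AFArray wrapper and its overloaded elementwise operators:

    using ArrayFire

    A = AFArray(rand(Float32, 10^6))
    B = AFArray(rand(Float32, 10^6))
    C = A .* B    # dispatches to an ArrayFire kernel that runs on the GPU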

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-01 Thread Sheehan Olver
Ah thanks! Though I guess if I want the same code to work also on a GPU array then this won't help? Sent from my iPhone > On 2 Nov. 2016, at 13:51, Chris Rackauckas wrote: > > It's the other way around. .* won't fuse because it's still an operator. .= > will. If you want .* to fuse, you can instead do: A .= *.(A,B)

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-01 Thread Chris Rackauckas
It's the other way around. .* won't fuse because it's still an operator. .= will. If you want .* to fuse, you can instead do: A .= *.(A,B) since this invokes the broadcast on *, instead of invoking .*. But that's just a temporary thing. On Tuesday, November 1, 2016 at 7:27:40 PM UTC-7, Tom Breloff wrote:
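
The fused, in-place form can also be written as an explicit broadcast!, which is roughly what the dot syntax lowers to; a sketch (aliasing the destination with an input is fine here because each output element depends only on the corresponding inputs):

    A = rand(1000); B = rand(1000)
    broadcast!(*, A, A, B)    # A[i] = A[i] * B[i] for all i, no temporary array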

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-01 Thread Tom Breloff
As I understand it, the .* will fuse, but the .= will not (until 0.6?), so A will be rebound to a newly allocated array. If my understanding is wrong I'd love to know. There have been many times in the last few days that I would have used it... On Tue, Nov 1, 2016 at 10:06 PM, Sheehan Olver wrote:

Re: [julia-users] Re: Fast vector element-wise multiplication

2016-11-01 Thread Sheehan Olver
Ah, good point. Though I guess that won't work til 0.6 since .* won't auto-fuse yet? Sent from my iPhone > On 2 Nov. 2016, at 12:55, Chris Rackauckas wrote: > > This is pretty much obsolete by the . fusing changes: > > A .= A.*B > > should be an in-place update of A scaled by B (Tomas' solution).

[julia-users] Re: Fast vector element-wise multiplication

2016-11-01 Thread Chris Rackauckas
This is pretty much obsolete by the . fusing changes: A .= A.*B should be an in-place update of A scaled by B (Tomas' solution). On Tuesday, November 1, 2016 at 4:39:15 PM UTC-7, Sheehan Olver wrote: > > Should this be added to a package? I imagine if the arrays are on the GPU > (AFArrays) then the operation could be much faster, and having a consistent > name would be helpful.
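
A quick way to check whether an update really happens in place is to time both forms after compiling them once; a sketch with illustrative function names (on 0.5 the A .* B temporary is still allocated, on 0.6+ the whole expression fuses):

    A = rand(10^6); B = rand(10^6)
    inplace!(A, B)   = (A .= A .* B; nothing)
    outofplace(A, B) = A .* B
    inplace!(A, B); outofplace(A, B)   # compile first
    @time inplace!(A, B)               # ~0 bytes allocated once .* fuses
    @time outofplace(A, B)             # allocates a fresh result array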

[julia-users] Re: Fast vector element-wise multiplication

2016-11-01 Thread Sheehan Olver
Should this be added to a package? I imagine if the arrays are on the GPU (AFArrays) then the operation could be much faster, and having a consistent name would be helpful. On Wednesday, October 7, 2015 at 1:28:29 AM UTC+11, Lionel du Peloux wrote: > > Dear all, > > I'm looking for the fastest way to do element-wise vector multiplication

[julia-users] Re: Fast vector element-wise multiplication

2016-06-20 Thread chobbes158
Thanks for the confirmation! Yes, I need more tests to see what the best practice is for my particular problem. On Monday, June 20, 2016 at 3:05:31 PM UTC+1, Chris Rackauckas wrote: > > Most likely. I would also time it with and without @simd at your problem > size. For some reason I've had some simple loops do better without @simd.

[julia-users] Re: Fast vector element-wise multiplication

2016-06-20 Thread Chris Rackauckas
Most likely. I would also time it with and without @simd at your problem size. For some reason I've had some simple loops do better without @simd. On Monday, June 20, 2016 at 2:50:22 PM UTC+1, chobb...@gmail.com wrote: > > Thanks! I'm still using v0.4.5. In this case, is the code I highlighted > above still the best choice for doing the job?
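
A sketch of the suggested timing comparison, with illustrative function names:

    function mul_simd!(A, B)
        @inbounds @simd for i in 1:length(A)
            A[i] *= B[i]
        end
        return A
    end

    function mul_plain!(A, B)
        @inbounds for i in 1:length(A)
            A[i] *= B[i]
        end
        return A
    end

    A = rand(10^5); B = rand(10^5)
    mul_simd!(A, B); mul_plain!(A, B)   # warm up / compile first
    @time mul_simd!(A, B)
    @time mul_plain!(A, B)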

[julia-users] Re: Fast vector element-wise multiplication

2016-06-20 Thread chobbes158
Thanks! I'm still using v0.4.5. In this case, is the code I highlighted above still the best choice for doing the job? On Monday, June 20, 2016 at 1:57:25 PM UTC+1, Chris Rackauckas wrote: > > I think that for medium size (but not large) arrays in v0.5 you may want > to use @threads from the threading branch, and then for really large arrays > you may want to use @parallel.

[julia-users] Re: Fast vector element-wise multiplication

2016-06-20 Thread Chris Rackauckas
I think that for medium size (but not large) arrays in v0.5 you may want to use @threads from the threading branch, and then for really large arrays you may want to use @parallel. But you'd have to test some timings. On Monday, June 20, 2016 at 11:38:15 AM UTC+1, chobb...@gmail.com wrote: > > I
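
A sketch of the threaded variant, assuming Julia is started with JULIA_NUM_THREADS > 1 (the function name is illustrative):

    using Base.Threads

    function mul_threads!(A, B)
        @threads for i in 1:length(A)
            @inbounds A[i] *= B[i]
        end
        return A
    end

    A = rand(10^7); B = rand(10^7)
    mul_threads!(A, B)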

[julia-users] Re: Fast vector element-wise multiplication

2016-06-20 Thread chobbes158
I have the same question regarding how to calculate the entry-wise vector product and find this thread. As a novice, I wonder if the following code snippet is still the standard for entry-wise vector multiplication that one should stick to in practice? Thanks! @fastmath @inbounds @simd for i=1:n A[i] *= B[i] end

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Lionel du Peloux
Thank you for all of your suggestions. The @simd macro does give a (very) slight performance improvement (5%).

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Steven G. Johnson
Note that the BLAS dot product probably uses all sorts of tricks to squeeze the last cycle of SIMD performance out of the CPU. e.g. here is the OpenBLAS ddot function for SandyBridge, which is hand-coded in assembly: https://github.com/xianyi/OpenBLAS/blob/develop/kernel/x86_64/ddot_microk_sand
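
For context, the baseline in this thread is dot, which goes through BLAS's ddot for Float64 vectors; a hand-written reduction (illustrative name) is what it is being compared against:

    A = rand(10^6); B = rand(10^6)
    dot(A, B)                  # calls BLAS ddot under the hood

    function mydot(A, B)
        s = 0.0
        @inbounds @simd for i in 1:length(A)
            s += A[i] * B[i]
        end
        return s
    end
    mydot(A, B)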

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Steven G. Johnson
On Tuesday, October 6, 2015 at 2:23:33 PM UTC-4, Patrick Kofod Mogensen wrote: > > That was supposed to be "A * B only allocates..." right? > Yes.

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Patrick Kofod Mogensen
That was supposed to be "A * B only allocates..." right? On Tuesday, October 6, 2015 at 1:52:18 PM UTC-4, Steven G. Johnson wrote: > > > > On Tuesday, October 6, 2015 at 12:29:04 PM UTC-4, Christoph Ortner wrote: >> >> a *= b is equivalent to a = a * b, which allocates a temporary variable I >> think?

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Steven G. Johnson
On Tuesday, October 6, 2015 at 12:29:04 PM UTC-4, Christoph Ortner wrote: > > a *= b is equivalent to a = a * b, which allocates a temporary variable I > think? > A * A only allocates memory on the heap if A is an array or some other heap-allocated datatype. For A[i] *= B[i] where A[i] is a number such as a Float64, no heap allocation is needed.
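
A small sketch of the distinction being made: the elementwise array product returns a new heap-allocated array, while per-element scalar updates allocate nothing (function names are illustrative):

    elwise(A, B) = A .* B                 # result is a new array
    function elwise!(A, B)
        @inbounds for i in 1:length(A)
            A[i] *= B[i]                  # Float64 multiply, no heap allocation
        end
        return A
    end

    A = rand(1000); B = rand(1000)
    elwise(A, B); elwise!(A, B)           # compile first
    @allocated elwise(A, B)               # ~8 KB: the result array
    @allocated elwise!(A, B)              # ~0 bytes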

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Christoph Ortner
a *= b is equivalent to a = a * b, which allocates a temporary variable I think? Try

    @fastmath @inbounds @simd for i=1:n
        A[i] *= B[i]
    end
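
Note that these macros only pay off inside a function (loops at global scope are not optimized); a minimal sketch, with an illustrative name:

    function elmul_fast!(A, B)
        n = length(A)
        @fastmath @inbounds @simd for i = 1:n
            A[i] *= B[i]
        end
        return A
    end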

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Christoph Ortner
or, possibly A[i] = A[i] * B[i] (I'm not sure whether @simd automatically translates *= to what it needs) On Tuesday, 6 October 2015 17:29:04 UTC+1, Christoph Ortner wrote: > > a *= b is equivalent to a = a * b, which allocates a temporary variable I > think? > > Try > > @fastmath @inbounds @simd for i=1:n A[i] *= B[i] end

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Patrick Kofod Mogensen
Well, I guess your table pretty much shows it, right? It seems as if it allocates a lot of temporary memory to carry out the calculations. On Tuesday, October 6, 2015 at 10:28:29 AM UTC-4, Lionel du Peloux wrote: > > Dear all, > > I'm looking for the fastest way to do element-wise vector multiplication

[julia-users] Re: Fast vector element-wise multiplication

2015-10-06 Thread Tomas Lycken
I made some simple changes to your `xpy!`, and managed to get it to allocate nothing at all, while performing very close to the speed of `dot`. I don't know anything about e.g. `@simd` instructions, but I imagine they could help speed this up even further. The most significant change was swi