Just noticed that you're allocating memory on each iteration. If you have the
patience to write out all those matrix operations explicitly, it should help.
Alternatively, perhaps try ParallelAccelerator.
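For example, a minimal sketch of that suggestion (the array names and sizes here are hypothetical, not taken from the original code): preallocate the output once and write the per-slice 3x3 multiply as explicit loops, so the hot loop allocates nothing.

```julia
# Sketch: allocate `out` once, outside the loop, and write the
# per-slice 3x3 multiply explicitly so no temporaries are created.
function mul3_accum!(out, a, b, nl)
    for l in 1:nl
        for j in 1:3, i in 1:3
            s = 0.0
            for k in 1:3
                s += a[i, k, l] * b[k, j, l]   # out[:,:,l] += a[:,:,l]*b[:,:,l]
            end
            out[i, j, l] += s
        end
    end
    return out
end

nl = 4
a, b = ones(3, 3, nl), ones(3, 3, nl)
out = zeros(3, 3, nl)        # allocated once, reused on every call
mul3_accum!(out, a, b, nl)
```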
Best,
--Tim
On Monday, August 29, 2016 10:49:40 AM CDT Marius Millea wrote:
> Thanks, just
Dict can be slow. Try
d_cl = Array{Array,1}(np)
for i = 1:np
    d_cl[i] = copy(inv_cl)
end
I can't say whether that helps the threaded scaling, since nthreads() gives
me only 1 today.
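For reference, nthreads() only reflects how the session was launched; a minimal check (the count of 8 below is just an example value):

```julia
# Threads must be enabled before Julia starts, via the JULIA_NUM_THREADS
# environment variable, e.g. from the shell:
#   $ JULIA_NUM_THREADS=8 julia
using Base.Threads
println(nthreads())   # reports the thread count the session was started with
```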
Thanks, I did notice that, but regardless, this shouldn't affect the scaling
with NCPUs, and in fact, as you say, it doesn't change performance at all.
On Monday, August 29, 2016 at 7:27:44 PM UTC+2, Diego Javier Zea wrote:
>
> Looks like the type of *d_cl* isn't inferred correctly. *d_cl =
Thanks, just tried wrapping the for loop inside a function, and it seems to
make the @threads version slightly slower and the serial version slightly
faster, so I'm even further from the speedup I was hoping for! Reading
through that issue and the linked ones, I guess I may not be the only one
seeing this.
Looks like the type of *d_cl* isn't inferred correctly. *d_cl = Dict(i =>
ones(3,3,nl) for i=1:np)::Dict{Int64,Array{Float64,3}}* helps with that,
but I didn't see a change in performance.
Best
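A minimal sketch of that suggestion (the nl/np values and the get_cl helper are hypothetical): constructing the Dict with a concrete value type up front makes lookups inside the loop type-stable, which can be verified with @code_warntype.

```julia
# Sketch: give the Dict a concrete value type so indexing it inside a
# threaded loop is type-stable (nl and np values here are made up).
nl, np = 4, 3
d_cl = Dict{Int64,Array{Float64,3}}(i => ones(3, 3, nl) for i in 1:np)

# A hypothetical accessor; `@code_warntype get_cl(d_cl, 1)` should now show
# a concrete Array{Float64,3} return type rather than Any.
get_cl(d, i) = d[i]
```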
Very quickly (train to catch!): try this
https://github.com/JuliaLang/julia/issues/17395#issuecomment-241911387
and see if it helps.
--Tim
On Monday, August 29, 2016 9:22:09 AM CDT Marius Millea wrote:
> I've parallelized some code with @threads, but instead of a factor NCPUs
> speed
I've parallelized some code with @threads, but instead of a factor-of-NCPUs
speed improvement (for me, 8), I'm seeing only a bit under a factor of 2. I
suppose the answer may be that my bottleneck isn't computation but rather
memory access. But while running the code, I see my CPU usage go to 100%