[julia-users] CUDArt: loop inside device do

2016-02-12 Thread Joaquim Masset Lacombe Dias Garcia
Can I use a standard Julia "for loop" inside a "device do" block of CUDArt?

I tried the following example:

using CUDArt, MyCudaModule

nrow = 10
ncol = 3000

mat = ones(Float64,nrow,ncol)
out1 = zeros(Float64,nrow)
vec = Float64[1:nrow;]
out2 = zeros(Float64,nrow)

d_mat  = CudaArray(mat)
d_out1 = CudaArray(out1)
d_vec  = CudaArray(vec)
d_out2 = CudaArray(out2)
d_nrow = CudaArray(Int32[nrow;])
d_ncol = CudaArray(Int32[ncol;])

result = devices(dev->capability(dev)[1]>=2) do devlist
    MyCudaModule.init(devlist) do dev
        blocks = 1
        threads = nrow
        global result = 0
        result = for i in 1:10
            MyCudaModule.cudaSumCol(d_out1,d_mat,d_ncol,blocks,threads)

            result = to_host(d_out1)[1]
        end
    end
end

cudaSumCol is a function that simply sums a matrix's entries, converting it 
into a column; it was wrapped just like the example in CUDArt's README.
The above code without the loop part works perfectly.

Should I try something different, like not using the do devlist?

thanks,
Joaquim


Re: [julia-users] CUDArt: loop inside device do

2016-02-12 Thread Tim Holy
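That loop does not do what you expect in Julia (i.e., this issue has nothing 
to do with CUDArt): a `for` loop evaluates to `nothing`, so assigning the 
loop to a variable throws away your result.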

julia> global s

julia> s = 0
0

julia> s = for i = 1:10
   s = s+1
   end

julia> s

julia> s == nothing
true
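
The transcript above can be contrasted with a version that keeps the running value in a variable updated inside the loop. This is a minimal plain-Julia sketch with no GPU involved; the function names `count_iterations` and `loop_value` are made up for illustration. The loops are wrapped in functions so the example behaves the same way in a script as at the REPL.

```julia
# Update a variable inside the loop and return it afterwards.
function count_iterations(n)
    s = 0
    for i = 1:n
        s = s + 1
    end
    return s
end

# Assigning the loop itself to a variable only captures `nothing`,
# because a `for` loop does not produce a value.
function loop_value(n)
    v = for i = 1:n
        i
    end
    return v
end

counted = count_iterations(10)
captured = loop_value(10)
```

Here `counted` holds the accumulated count, while `captured` is `nothing`, which matches what the REPL session shows.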

--Tim





Re: [julia-users] CUDArt: loop inside device do

2016-02-12 Thread Joaquim Dias Garcia
Oh! Sure, thanks for the prompt answer!
Sorry for the dumb question... 

Joaquim



Re: [julia-users] CUDArt: loop inside device do

2016-02-12 Thread Joaquim Dias Garcia
Is there any way around it? 

I was planning a Monte Carlo code, but all the iterations rely on a huge 
amount of data that is always the same, so sending it back and forth to the 
device would be a bottleneck...

Re: [julia-users] CUDArt: loop inside device do

2016-02-12 Thread Tim Holy
On Friday, February 12, 2016 08:30:26 PM Joaquim Dias Garcia wrote:
> Is there any way around it?
> 
> I was planning a Monte Carlo code, but all the iterations rely on a huge
> amount of data that is always the same, so sending it back and forth to
> the device would be a bottleneck...

Again, you can use loops; you just have to store each iteration's result 
explicitly instead of assigning the loop itself to a variable. Something 
like this:

result = devices(dev->capability(dev)[1]>=2) do devlist
    MyCudaModule.init(devlist) do dev
        result = Array(T, n)
        d_mat = CudaArray(mat)
        # more allocation here...
        for i = 1:n
            result[i] = my_calculation(d_mat, othervariables, i)
        end
        result
    end
end

The problem with your old version is that a `for` loop does not return a 
value in Julia, so `result = for i = 1:n...` just sets `result` to `nothing`.
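
The same pattern can be tried without a GPU. The sketch below is a CPU-only stand-in, not CUDArt code: `my_calculation` and `run_iterations` are hypothetical names, and the matrix plays the role of the data that would be copied to the device once, before the loop. It uses current Julia syntax for the allocation (`Vector{Float64}(undef, n)`), where the 2016-era code above writes `Array(T, n)`.

```julia
# Hypothetical stand-in for a per-iteration device computation:
# here it just sums column `i` of the matrix on the CPU.
my_calculation(mat, i) = sum(mat[:, i])

function run_iterations(mat, n)
    # Allocate the output once, before the loop.
    result = Vector{Float64}(undef, n)
    for i = 1:n
        # Store each iteration's value explicitly.
        result[i] = my_calculation(mat, i)
    end
    return result   # the function returns the filled array, not the loop
end

mat = ones(Float64, 10, 3000)   # same shape as the matrix in the question
result = run_iterations(mat, 5)
```

Because the large input is created once and only read inside the loop, nothing is copied per iteration; with CUDArt the analogous step would be creating the `CudaArray` before the loop, as in the snippet above.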

--Tim