[julia-users] CUDArt: loop inside device do
Can I have a standard Julia "for loop" inside a "device do" of CUDArt?

I tried the following example:

    using CUDArt, MyCudaModule

    nrow = 10
    ncol = 3000

    mat = ones(Float64,nrow,ncol)
    out1 = zeros(Float64,nrow)
    vec = Float64[1:nrow;]
    out2 = zeros(Float64,nrow)

    d_mat = CudaArray(mat)
    d_out1 = CudaArray(out1)
    d_vec = CudaArray(vec)
    d_out2 = CudaArray(out2)
    d_nrow = CudaArray(Int32[nrow;])
    d_ncol = CudaArray(Int32[ncol;])

    result = devices(dev->capability(dev)[1]>=2) do devlist
        MyCudaModule.init(devlist) do dev
            blocks = 1
            threads = nrow
            global result = 0
            result = for i in 1:10
                MyCudaModule.cudaSumCol(d_out1,d_mat,d_ncol,blocks,threads)

                result = to_host(d_out1)[1]
            end
        end
    end

cudaSumCol is a function that simply sums a matrix's entries, converting it into a column; it was wrapped just like the example in CUDArt's README. The above code works perfectly without the loop part.

Should I try something different, like not using the do devlist?

Thanks,
Joaquim
Re: [julia-users] CUDArt: loop inside device do
That loop is not proper Julia syntax (i.e., this issue has nothing to do with CUDArt).

    julia> global s

    julia> s = 0
    0

    julia> s = for i = 1:10
               s = s+1
           end

    julia> s

    julia> s == nothing
    true

--Tim

On Friday, February 12, 2016 08:51:47 AM Joaquim Masset Lacombe Dias Garcia wrote:
> Can I have a standard Julia "for loop" inside a "device do" of CUDArt?
>
> I tried the following example:
>
> using CUDArt, MyCudaModule
>
> nrow = 10
> ncol = 3000
>
> mat = ones(Float64,nrow,ncol)
> out1 = zeros(Float64,nrow)
> vec = Float64[1:nrow;]
> out2 = zeros(Float64,nrow)
>
> d_mat = CudaArray(mat)
> d_out1 = CudaArray(out1)
> d_vec = CudaArray(vec)
> d_out2 = CudaArray(out2)
> d_nrow = CudaArray(Int32[nrow;])
> d_ncol = CudaArray(Int32[ncol;])
>
> result = devices(dev->capability(dev)[1]>=2) do devlist
>     MyCudaModule.init(devlist) do dev
>         blocks = 1
>         threads = nrow
>         global result = 0
>         result = for i in 1:10
>             MyCudaModule.cudaSumCol(d_out1,d_mat,d_ncol,blocks,threads)
>
>             result = to_host(d_out1)[1]
>         end
>     end
> end
>
> cudaSumCol is a function that simply sums a matrix's entries, converting it
> into a column; it was wrapped just like the example in CUDArt's README.
> The above code works perfectly without the loop part.
>
> Should I try something different, like not using the do devlist?
>
> Thanks,
> Joaquim
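A note on the REPL session above: the parser does accept `s = for ... end`, but a `for` loop is an expression whose value is always `nothing`, so the outer assignment overwrites whatever the loop body stored. The minimal sketch below shows this in plain Julia, without CUDArt. The function names `loop_value` and `last_iteration` are hypothetical and exist only for illustration.

```julia
# The value of a `for` loop is always `nothing`, whatever its body computes.
function loop_value(n)
    v = for i in 1:n
        i * 2              # the body's value is discarded
    end
    return v
end

# Assign inside the loop instead of assigning the loop itself.
function last_iteration(n)
    result = 0
    for i in 1:n
        result = i * 2     # updates the local `result` declared above
    end
    return result
end

println(loop_value(10) === nothing)   # prints true
println(last_iteration(10))           # prints 20
```

Wrapping the loops in functions keeps the variable scoping unambiguous, which is also what happens inside a `do` block.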
Re: [julia-users] CUDArt: loop inside device do
Oh! Sure, thanks for the prompt answer!
Sorry for the dumb question...

Joaquim

> On Feb 12, 2016, at 20:36, Tim Holy wrote:
>
>> On Friday, February 12, 2016 08:30:26 PM Joaquim Dias Garcia wrote:
>> Is there any way around it?
>>
>> I was planning a Monte Carlo code, but all the iterations rely on a huge
>> amount of data that is always the same. So sending it back and forth to
>> the device would be a bottleneck...
>
> Again, you can use loops, you just have to write your code in a way that is
> actually valid syntax. Something like this:
>
> result = devices(dev->capability(dev)[1]>=2) do devlist
>     MyCudaModule.init(devlist) do dev
>         result = Array(T, n)
>         d_mat = CudaArray(mat)
>         # more allocation here...
>         for i = 1:n
>             result[i] = my_calculation(d_mat, othervariables, i)
>         end
>         result
>     end
> end
>
> The problem with your old version is that `result = for i = 1:n...` is not
> supported syntax in Julia.
>
> --Tim
Re: [julia-users] CUDArt: loop inside device do
Is there any way around it?

I was planning a Monte Carlo code, but all the iterations rely on a huge amount of data that is always the same. So sending it back and forth to the device would be a bottleneck...
Re: [julia-users] CUDArt: loop inside device do
On Friday, February 12, 2016 08:30:26 PM Joaquim Dias Garcia wrote:
> Is there any way around it?
>
> I was planning a Monte Carlo code, but all the iterations rely on a huge
> amount of data that is always the same. So sending it back and forth to
> the device would be a bottleneck...

Again, you can use loops, you just have to write your code in a way that is actually valid syntax. Something like this:

    result = devices(dev->capability(dev)[1]>=2) do devlist
        MyCudaModule.init(devlist) do dev
            result = Array(T, n)
            d_mat = CudaArray(mat)
            # more allocation here...
            for i = 1:n
                result[i] = my_calculation(d_mat, othervariables, i)
            end
            result
        end
    end

The problem with your old version is that `result = for i = 1:n...` is not supported syntax in Julia.

--Tim
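The pattern Tim describes can be tried without a GPU: build the large, constant data once, run every iteration inside the `do` block, and make the result array the block's last expression so that the call returns it. Everything named here is a stand-in rather than CUDArt API. `with_data` plays the role of the `devices(...) do` / `init(...) do` wrappers, and `my_calculation` is a hypothetical per-iteration computation. The sketch assumes Julia 1.x, where `Vector{Float64}(undef, n)` replaces the older `Array(T, n)` form used in the message.

```julia
# Hypothetical stand-in for a resource-managing wrapper: builds the data once,
# passes it to the do-block function `f`, and returns whatever `f` returns.
function with_data(f, nrow, ncol)
    mat = ones(Float64, nrow, ncol)   # stands in for data copied to the device once
    return f(mat)
end

# Hypothetical per-iteration work: scale the sum of row 1 by the iteration index.
my_calculation(mat, i) = i * sum(mat[1, :])

n = 5
result = with_data(10, 3000) do mat
    out = Vector{Float64}(undef, n)       # one slot per iteration
    for i in 1:n
        out[i] = my_calculation(mat, i)   # `mat` is reused, never rebuilt
    end
    out                                   # last expression is the block's value
end

println(result)   # prints [3000.0, 6000.0, 9000.0, 12000.0, 15000.0]
```

Because the data is created outside the loop and only small per-iteration results are written into `out`, nothing large moves between iterations, which is the property the Monte Carlo use case needs.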