Dear Lev, dear all,

I have solved the DataTypeError problem: I had overlooked that I was passing float64 instead of float32.

The code is attached. It runs, but the GPU returns a wrong solution: the program prints both results, and they do not match.
Could you explain why?

I read through the issues on scikits.cuda looking for a fix, but that did not help.

Best regards,
Evgeny


On 24.02.2014 23:23, Lev Givon wrote:
Received from Evgeny Lazutkin on Mon, Feb 24, 2014 at 05:08:43PM EST:

(snip)

So, I did the following:
# Transfer to GPU
a_gpu = pycuda.gpuarray.to_gpu(A)
b_gpu = pycuda.gpuarray.to_gpu(B)
#pointer
p1 = pycuda.gpuarray.GPUArray(a_gpu, shape(A)).ptr()

and it raises the following error in gpuarray.py, in __init__:
dtype = np.dtype(dtype)
TypeError: data type not understood
pycuda.gpuarray.to_gpu() creates a GPUArray instance; not sure why you are
passing it to the GPUArray constructor. Also, note that since ptr is a property,
it shouldn't be invoked as a method.

There are examples of how to use pycuda with wrapped library functions in
scikits.cuda in the docstrings of some of the wrapper functions (e.g., see
scikits/cuda/cublas.py).
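
To illustrate the point about ptr being a property, here is a minimal sketch that needs no GPU. FakeGPUArray is a hypothetical stand-in written for this example, not the PyCUDA class; it only mimics the fact that ptr is read as an attribute and yields an integer address.

```python
class FakeGPUArray(object):
    """Hypothetical stand-in for a GPU array object (assumption: no GPU needed)."""

    def __init__(self, address):
        self._address = address

    @property
    def ptr(self):
        # A property is read like an attribute and returns the value directly.
        return self._address


a_gpu = FakeGPUArray(0x1000)

# Correct: plain attribute access yields the integer address.
p1 = a_gpu.ptr

# Wrong: adding parentheses tries to call the integer that ptr returned.
try:
    a_gpu.ptr()
    call_failed = False
except TypeError:
    call_failed = True
```

With a real array returned by pycuda.gpuarray.to_gpu(), the same pattern applies: read a_gpu.ptr without parentheses and pass that value to the wrapped library function.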

import pycuda.driver as drv
import pycuda.tools
import pycuda.autoinit
import numpy
import numpy as np  # cpu_solve() and gpu_solve() refer to numpy as np
import pycuda.gpuarray as gpuarray
from scikits.cuda.cula import *
from scikits import *
from scipy import *



A = array([
    [ 5.18649167,  0.0,         1.03279556,  0.0,        -0.14549722,  0.0,
      0.0,         0.0,         0.0,         0.0,         0.0,         0.0],
    [-5.0819889,   0.0,         2.52459667,  0.0,         0.64549722,  0.0,
      0.0,         0.0,         0.0,         0.0,         0.0,         0.0],
    [ 9.01848057,  0.0,        -8.13118224,  0.0,         5.18649167,  0.0,
      0.0,         0.0,         0.0,         0.0,         0.0,         0.0],
    [-0.5,         4.43649167,  0.0,         1.03279556,  0.0,        -0.14549722,
      0.0,         0.0,         0.0,         0.0,         0.0,         0.0],
    [ 0.0,        -5.0819889,  -0.5,         1.77459667,  0.0,         0.64549722,
      0.0,         0.0,         0.0,         0.0,         0.0,         0.0],
    [ 0.0,         9.01848057,  0.0,        -8.13118224, -0.5,         4.43649167,
      0.0,         0.0,         0.0,         0.0,         0.0,         0.0],
    [ 0.0,         0.0,         0.0,         0.0,         0.0,         0.0,
      5.18649167,  0.0,         1.03279556,  0.0,        -0.14549722,  0.0],
    [ 0.0,         0.0,         0.0,         0.0,         0.0,         0.0,
     -5.0819889,   0.0,         2.52459667,  0.0,         0.64549722,  0.0],
    [ 0.0,         0.0,         0.0,         0.0,         0.0,         0.0,
      9.01848057,  0.0,        -8.13118224,  0.0,         5.18649167,  0.0],
    [ 0.0,         0.0,         0.0,         0.0,         0.0,         0.0,
     -0.5,         4.43649167,  0.0,         1.03279556,  0.0,        -0.14549722],
    [ 0.0,         0.0,         0.0,         0.0,         0.0,         0.0,
      0.0,        -5.0819889,  -0.5,         1.77459667,  0.0,         0.64549722],
    [ 0.0,         0.0,         0.0,         0.0,         0.0,         0.0,
      0.0,         9.01848057,  0.0,        -8.13118224, -0.5,         4.43649167],
], dtype=numpy.float32)


B = array([[-5.32379001 , 0.       ,   0.,          0.        ],
 [ 2.661895 ,   0.        ,  0.,          0.        ],
 [-5.32379001,  0.        ,  0. ,         0.        ],
 [ 0.        , -5.32379001,  0.  ,        0.        ],
 [ 0.        ,  2.661895  ,  0.   ,       0.        ],
 [ 0.        , -5.32379001,  0.    ,      0.        ],
 [ 0.        ,  0. ,        -5.32379001,  0.        ],
 [ 0.        ,  0. ,         2.661895  ,  0.        ],
 [ 0.        ,  0.,         -5.32379001,  0.        ],
 [ 0.        ,  0.,          0.        , -5.32379001],
 [ 0.        ,  0.,          0.        ,  2.661895  ],
 [ 0.        ,  0.,          0.        , -5.32379001]], dtype = numpy.float32)


def cpu_solve(k_,y_):

        start=drv.Event()
        end=drv.Event()
        start.record()

        result = np.linalg.solve(k_, y_)

        end.record()
        end.synchronize()
        print "numpy array time: %fs" %(start.time_till(end)*1e-3)
         
        return result

def gpu_solve(k_,y_):
        k_gpu=gpuarray.to_gpu(k_)
        y_gpu=gpuarray.to_gpu(y_)
        (m,n)=k_.shape
        lda=m
        nrhs=shape(y_)[1]

        ipiv=np.empty(n,dtype=np.int32)
        ipiv_gpu=gpuarray.to_gpu(ipiv)
        ldb=y_.shape[0]

        start=drv.Event()
        end=drv.Event()
        start.record()

        culaDeviceSgesv(n, nrhs, k_gpu.ptr, lda, ipiv_gpu.ptr, y_gpu.ptr, ldb)

        end.record()
        end.synchronize()
        print "GPU array time: %fs" %(start.time_till(end)*1e-3)
        return  y_gpu

culaInitialize()
print cpu_solve(A,B) # check: correct? ---> YES. 
print gpu_solve(A,B) # check: correct? ---> NO!
culaShutdown()
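
One possible cause of the mismatch, offered as a hedged sketch rather than a confirmed diagnosis: CULA follows the LAPACK convention of column-major storage, while NumPy arrays are row-major (C order) by default. A row-major buffer read as column-major looks like the transpose of the intended matrix. The pure-Python example below shows only that layout effect on a 2x2 matrix; the helper names flatten_row_major and read_column_major are made up for this illustration.

```python
def flatten_row_major(rows):
    """Lay out a nested-list matrix the way a C-ordered NumPy array is stored."""
    return [x for row in rows for x in row]


def read_column_major(buf, m, n, ld):
    """Interpret a flat buffer as an m x n column-major matrix.

    Element (i, j) is taken from buf[i + j * ld], the LAPACK-style convention.
    """
    return [[buf[i + j * ld] for j in range(n)] for i in range(m)]


A = [[1.0, 2.0],
     [3.0, 4.0]]

# Uploading the row-major buffer unchanged: the solver sees the transpose of A.
seen = read_column_major(flatten_row_major(A), 2, 2, 2)

# Uploading the row-major buffer of the transpose: the solver sees A itself.
A_transposed = [list(col) for col in zip(*A)]
seen_fixed = read_column_major(flatten_row_major(A_transposed), 2, 2, 2)
```

If this is indeed the issue, an untested adjustment to gpu_solve() would be to upload numpy.ascontiguousarray(k_.T) and numpy.ascontiguousarray(y_.T) instead of k_ and y_, and to transpose the array returned by y_gpu.get() before comparing it with the CPU result.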
_______________________________________________
PyCUDA mailing list
PyCUDA@tiker.net
http://lists.tiker.net/listinfo/pycuda
