Dear Evgeny,

I did something similar a while back and remember having some difficulties. I used another library called pycula (for which there is no support) and solved the above equation using a truncated eigendecomposition. Attached is my source code, where you can see both the CPU and GPU implementations and their run times. In my code I used cula.gpu_devsyevx_index, which returns a NumPy array. I remember there is another function that returns a GPU array. Another issue is that pycula and Scikit GPU arrays are not compatible (at least they were not when I was doing this).
Hope I could help.

Cheers,
Mohsen

On Mon, Feb 24, 2014 at 8:56 PM, Evgeny Lazutkin <evgeny.lazut...@gmail.com> wrote:

> Dear all,
>
> sorry for the delayed answer, I had a problem with the installation. But
> now everything is fine.
>
> So, I have installed Scikit (from GitHub, as was proposed) and CULA.
>
> I am confused. I'd like to solve a very simple system A*X = B, but it
> raises the error:
> TypeError: only length-1 arrays can be converted to python scalars.
> Could you please tell me what is going wrong?
>
> I suppose that I am doing everything wrong. And even if it works, how do I
> obtain parallelization? In his example, Andreas used SourceModule with C
> code, and there it is obvious to me what happens.
>
> But here I cannot understand it. I tried to write my own SourceModule and
> call functions from CULA, but when I try to manipulate memory or write a
> function, I get an error saying that I cannot do that from
> __device__/__global__.
>
> Oh... I am stuck :(
>
> Could you please correct the code and give me an answer? Please find the
> py-file attached.
>
> Best regards,
> Evgeny
>
>
> On 23.02.2014 at 15:03, Lev Givon wrote:
>
>> Received from Evgeny Lazutkin on Sun, Feb 23, 2014 at 03:53:12AM EST:
>>
>>> Dear Andreas, dear all,
>>>
>>> thank you very much! I will install this package and run the sample
>>> code! I hope after that you can correct me.
>>>
>>> Best regards,
>>> Evgeny
>>
>> I suggest that you install the latest revision of the package from GitHub
>> rather than the tarball on PyPI. If you encounter any problems, feel free
>> to submit a report via the project's GitHub issue tracker (scikits.cuda is
>> developed separately from pycuda).
>
> _______________________________________________
> PyCUDA mailing list
> PyCUDA@tiker.net
> http://lists.tiker.net/listinfo/pycuda

--
Mohsen
#===============================================================================
# CUDA libraries
#===============================================================================
#import pycuda.gpuarray as gpuarray
#import pycuda.autoinit
#import pycuda.driver as cuda
#import PyCULA.cula as cula #@UnresolvedImport
#import scikits.cuda.linalg as la

#===============================================================================
# Routine libraries
#===============================================================================
import timeit
import numpy as np
import scipy.linalg as spla
from numpy.lib import stride_tricks

#cula.culaInitialize()
#cula.mixed_init()
#la.init()

#===============================================================================
# Largest Eigenvalue
#===============================================================================
def cpu(k_, y_, lo, hi, flag):
    # The CUDA event timing is commented out together with the CUDA imports,
    # so wall-clock timing is used here instead (otherwise `time` is undefined).
#    start = cuda.Event()
#    end = cuda.Event()
#    start.record()
    t0 = timeit.default_timer()

    # Eigenpairs with indices lo..hi (ascending eigenvalue order).
    # Newer SciPy releases name this argument subset_by_index=(lo, hi).
    w, v = spla.eigh(k_, eigvals=(lo, hi))
    # Truncated inverse V diag(1/w) V^T applied to y_.
    temp = np.dot(np.dot(v, np.diag(1.0/w)), v.T)
    c = np.dot(temp, y_)

    elapsed = timeit.default_timer() - t0
#    end.record()
#    end.synchronize()
#    time = start.time_till(end) * 1e-3

    if flag == "t":
        value = round(elapsed, 5)
        print('cpu: %f' % value)
    elif flag == "r":
        value = c
    else:
        raise ValueError('unrecognized flag')
    return value

#def gpu(k_, y_, il, iu, flag):
#
#    k_gpu = cula.cula_gpuarray_like(k_)
#    n = k_.shape[0]
#    cuda.Context.synchronize()
#    start = cuda.Event()
#    end = cuda.Event()
#    start.record()
#
#    w, v = cula.gpu_devsyevx_index(k_gpu, il, iu, vectors=True, uplo='L')
#    newShape = (n, iu-il+1)
#
#    elmSize = v.itemsize
#    z = stride_tricks.as_strided(v, shape=newShape, strides=(elmSize, elmSize*newShape[0]))
#
#    w = np.delete(w, np.s_[iu-il+1:], 0)  # delete zeros from eigenvalues
#    temp = np.dot(np.dot(z, np.diag(1.0/w)), z.T)
#    c = np.dot(temp, y_)
#
#    end.record()
#    end.synchronize()
#    time = start.time_till(end) * 1e-3
#
#    # delete zero cols. from vectors
##    v = np.delete(v_gpu.get().T, np.s_[iu-il+1:], 1)
#
#    if flag == "t":
#        value = round(time, 5)
#        print 'gpu: %f' % value
#
#    elif flag == "r":
#        value = c
#
#    else:
#        raise ValueError('unrecognized flag')
#    return value
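
For readers without CULA or a GPU, the truncated eigendecomposition solve from the attachment can be tried on the CPU alone. The following is only a minimal sketch, not the attached code: the function name truncated_eig_solve and the random test matrix are made up for illustration, and it computes the full decomposition with numpy.linalg.eigh and slices it, rather than asking LAPACK for an index range as the attachment does.

```python
import numpy as np

def truncated_eig_solve(k, y, lo, hi):
    """Approximate the solution c of k @ c = y for symmetric k,
    using only the eigenpairs with indices lo..hi (inclusive)."""
    # eigh returns the eigenvalues of a symmetric matrix in ascending order.
    w, v = np.linalg.eigh(k)
    w, v = w[lo:hi + 1], v[:, lo:hi + 1]
    # Apply V diag(1/w) V^T to y without forming the n x n matrix:
    # v / w scales column j of v by 1 / w[j].
    return (v / w) @ (v.T @ y)

# Hypothetical test problem: a small symmetric positive definite matrix.
rng = np.random.default_rng(0)
a = rng.standard_normal((6, 6))
k = a @ a.T + 6.0 * np.eye(6)
y = rng.standard_normal(6)

c_full = truncated_eig_solve(k, y, 0, 5)  # all six eigenpairs: exact solve
c_top = truncated_eig_solve(k, y, 3, 5)   # three largest eigenpairs only
```

With all eigenpairs kept, the result matches an ordinary dense solve. Dropping the small eigenvalues gives a solution restricted to the span of the retained eigenvectors, which is the approximation the attachment relies on.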