Hi Vince and Andreas: I've ported the Mersenne Twister code (along with enhancements from http://www.jcornwall.me.uk/2009/04/mersenne-twisters-in-cuda/) for use within PyCUDA kernels. I'd be happy to share what I have, because it's worked quite well for me.
Should I just attach the code to the mailing list, or should I email you both directly?

Best,
Per

On Tue, Jun 30, 2009 at 8:47 AM, Andreas Klöckner <[email protected]> wrote:
> Hi Vince,
>
> On Monday 29 June 2009, Vince Fulco wrote:
>> Dear Andreas-
>>
>> Thank you for the detailed response.
>>
>> At the risk of belabouring, a portion of the Mersenne Twister code
>> contains two kernels/functions for the Box-Muller transformation calcs.
>> One is defined __device__, and the other, which draws on calcs of the
>> first, is a __global__. Would it be possible to re-code the first as a
>> __global__ with appropriate changes internally as well, and then wrap
>> the two with PyCUDA, or am I missing something more obvious? This may
>> not be an efficient use of the device, but could be faster than
>> porting. Of course, there is a larger portion of C which accesses the
>> host and would need to be dealt with as well.
>
> I must admit I'm not really familiar with Nvidia's Monte Carlo example. As
> long as the function you're trying to recode doesn't pass pointers to shared
> memory, what you suggest should be possible. I can't quite say whether it's
> going to be efficient; that depends on a rather large number of factors.
>
> Andreas
>
> _______________________________________________
> PyCUDA mailing list
> [email protected]
> http://tiker.net/mailman/listinfo/pycuda_tiker.net
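For anyone following along: the Box-Muller transform discussed above maps pairs of uniform samples on (0, 1] to independent standard normal samples, which is the math the __device__/__global__ kernel pair computes on the GPU. A minimal host-side NumPy sketch of the transform itself (the function name and sanity check are illustrative, not taken from the NVIDIA example):

```python
import numpy as np

def box_muller(u1, u2):
    """Box-Muller transform: map two arrays of uniform (0, 1] samples
    to two arrays of independent standard normal samples."""
    r = np.sqrt(-2.0 * np.log(u1))       # radius from the first uniform
    theta = 2.0 * np.pi * u2             # angle from the second uniform
    return r * np.cos(theta), r * np.sin(theta)

# Sanity check on the host: the transformed samples should be
# approximately N(0, 1).
rng = np.random.default_rng(0)
u1 = 1.0 - rng.random(100_000)           # shift into (0, 1] to avoid log(0)
u2 = rng.random(100_000)
z0, z1 = box_muller(u1, u2)
```

Recoding this as a __global__ kernel, as Vince suggests, would mainly mean writing the results to global memory and launching one thread per sample pair, rather than returning values to a calling kernel.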
