Dave,

That will probably not be a very good idea: for small problems, the overhead of transferring data to and from the GPU is more expensive than the computation itself. The issue can be partly avoided by writing trivial wrappers for routines like dgemm that only run the multiply on the GPU when the dimensions of the problem are above some threshold, but this would require slightly more work than simply replacing BLAS with cuBLAS.
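To make the threshold idea concrete, here is a minimal sketch in C of that kind of wrapper. Everything named here (`gemm_fn`, `gemm_dispatch`, `dispatch_gemm`, `min_gpu_work`) is hypothetical and not part of BLAS, cuBLAS, or PETSc. The signature is simplified (column-major, no transposes), and `cpu_gemm` is a naive stand-in for the host dgemm. A real GPU backend would have to copy A, B, and C to the device, call the cuBLAS multiply, and copy C back, and the cutoff would have to be measured on the target machine rather than guessed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend signature: C = alpha*A*B + beta*C, column-major,
   no transposes. A is m x k, B is k x n, C is m x n. */
typedef void (*gemm_fn)(int m, int n, int k, double alpha,
                        const double *A, int lda,
                        const double *B, int ldb,
                        double beta, double *C, int ldc);

/* Naive reference multiply; stands in for the host BLAS dgemm. */
void cpu_gemm(int m, int n, int k, double alpha,
              const double *A, int lda,
              const double *B, int ldb,
              double beta, double *C, int ldc)
{
    for (int j = 0; j < n; ++j) {
        for (int i = 0; i < m; ++i) {
            double sum = 0.0;
            for (int p = 0; p < k; ++p)
                sum += A[i + (size_t)p * lda] * B[p + (size_t)j * ldb];
            double *c = &C[i + (size_t)j * ldc];
            /* As in BLAS, C is not read when beta is zero. */
            *c = (beta == 0.0) ? alpha * sum : alpha * sum + beta * (*c);
        }
    }
}

/* Dispatcher state: two backends plus a tunable work threshold. */
typedef struct {
    gemm_fn cpu;           /* host implementation */
    gemm_fn gpu;           /* device implementation (assumed to handle transfers) */
    size_t  min_gpu_work;  /* use the GPU only when m*n*k >= this value */
    int     last_used_gpu; /* records the most recent choice, for inspection */
} gemm_dispatch;

/* Route small problems to the CPU and large ones to the GPU. */
void dispatch_gemm(gemm_dispatch *d, int m, int n, int k, double alpha,
                   const double *A, int lda,
                   const double *B, int ldb,
                   double beta, double *C, int ldc)
{
    assert(d != NULL && d->cpu != NULL && d->gpu != NULL);
    size_t work = (size_t)m * (size_t)n * (size_t)k;
    int use_gpu = work >= d->min_gpu_work;
    d->last_used_gpu = use_gpu;
    (use_gpu ? d->gpu : d->cpu)(m, n, k, alpha, A, lda, B, ldb, beta, C, ldc);
}
```

Using m*n*k as the measure of work is only one possible choice; a cutoff on the individual dimensions would work just as well for the purpose of keeping small blocks on the host.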
Jack

On Fri, Feb 24, 2012 at 8:28 PM, Dave Nystrom <Dave.Nystrom at tachyonlogic.com> wrote:

> I was wondering if anyone had ever tried using cuBlas as a substitute for
> something like MKL with PETSc. I've been wondering if it would give better
> performance than MKL for my direct solves with cholmod even though the block
> sizes are small for cholmod i.e. 32x32 is the default I believe. If so, were
> there any tricky aspects to using cuBlas in this way?
>
> Thanks,
>
> Dave