Hi,

> Is DGEMM your performance-critical operation? Are there any other
> performance-critical operations?
>
> For now we are only looking at (especially sparse) blas3 and
> decompositions. Basically, your normal R base functionality for
> in-memory sparse algebra.

Sparse factorizations (LU, QR, etc.) are very hard to parallelize for
many-core architectures (GPUs in particular).

> One more question i had:
>
> do you guys handle low resource cases? like transfer optimization for
> blockwise multiplication in case operands do not fit -- out-of-core
> algorithms?

Out-of-core has gone out of fashion. The reason is that the differences
in memory speed have become so large that falling back to a slower
memory type almost never pays off.

> Did you look at gpu+cpu combined balanced algorithms (as i guess MAGMA
> did for some)?

Yes, a couple of algorithms in ViennaCL use GPUs for the main work
(i.e. GEMM) and CPUs for the sequential parts of the algorithm.

Best regards,
Karli

_______________________________________________
ViennaCL-devel mailing list
ViennaCL-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/viennacl-devel