Hi Richard,
CPU SpGEMM is about twice as fast even in the GPU-friendly case of a
single rank: http://viennacl.sourceforge.net/viennacl-benchmarks-spmm.html
I agree that it would be good to have a GPU MatMatMult for the sake of
experiments. Given these performance constraints, though, it's not a top
priority.
Best regards,
Karli
On 10/3/19 12:00 AM, Mills, Richard Tran via petsc-dev wrote:
Fellow PETSc developers,
I am wondering why the AIJCUSPARSE and AIJVIENNACL matrix types do not
support the sparse matrix-matrix multiplication (SpGEMM, or MatMatMult()
in PETSc parlance) routines provided by cuSPARSE and ViennaCL,
respectively. Is there a good reason that I shouldn't add those? My
guess is that support was not added because SpGEMM is hard to do well on
a GPU compared to many CPUs (it is hard to compete with, say, Intel Xeon
CPUs and their huge caches), so one has generally been better off doing
these operations on the CPU. Since the trend at the big supercomputing
centers seems to be to put more and more of the computational power into
GPUs, though, I'm thinking that I should add the option to use the GPU
library routines for SpGEMM. Is there some
good reason to *not* do this that I am not aware of? (Maybe the CPUs are
better for this even on a machine like Summit, but I think we're at the
point that we should at least be able to experimentally verify this.)
--Richard
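
For anyone following the thread who hasn't looked at SpGEMM internals, a
minimal sketch of Gustavson's row-wise algorithm on CSR data may help show
where the difficulty comes from. This is plain illustrative Python, not
PETSc, cuSPARSE, or ViennaCL code; the `spgemm_csr` name and the flat-list
CSR layout are assumptions made for the sketch. The key point is that the
nonzero structure of each output row is only discovered while it is being
computed, which is the irregularity that makes GPU implementations hard.

```python
# Gustavson's row-wise SpGEMM on CSR inputs: C = A * B.
# Each matrix is (row_ptr, col_idx, values) in standard CSR form.
# Illustrative sketch only -- a real implementation would preallocate
# (the "symbolic" phase), which is exactly the step GPUs struggle with,
# since nnz per output row is unknown in advance.
def spgemm_csr(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val, n_rows):
    c_ptr, c_idx, c_val = [0], [], []
    for i in range(n_rows):
        acc = {}  # sparse accumulator for row i of C
        for k in range(a_ptr[i], a_ptr[i + 1]):
            j, v = a_idx[k], a_val[k]
            # scatter row j of B into the accumulator, scaled by A[i, j]
            for l in range(b_ptr[j], b_ptr[j + 1]):
                acc[b_idx[l]] = acc.get(b_idx[l], 0.0) + v * b_val[l]
        for col in sorted(acc):  # emit row i in column order
            c_idx.append(col)
            c_val.append(acc[col])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

For example, multiplying A = [[1, 2], [0, 3]] by the 2x2 identity returns
A's own CSR arrays unchanged.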