Hi,
>> The use of aijcusp instead of a dense matrix type certainly adds to
the issue.
I know, but I couldn't find a dense GPU type in the PETSc manual; please
correct me if there is one.
There is indeed no dense GPU matrix type in PETSc (yet).
Please send the output of -log_summary so that we can see where most
time is spent.
I am unable to do that, as for some reason I get no output when I use that option.
I also tried to explicitly call PetscLogView but still nothing is printed out.
If I try with one of the slepc examples, I get the output.
Why is this happening? If I run my code with -info or -log_trace I see their
output, only -log_summary is shy!
Maybe you forgot to call SlepcFinalize()?
If you have good (recent) CPUs in a dual-socket configuration, it's very
unlikely that you will gain anything beyond ~2x with an optimized GPU setup.
Even that ~2x may only be possible by heavily tweaking the current SVD
implementation in SLEPc, of which I don't know the details.
I used Xeon processors from 2010, just like the GPUs.
Ok, this is actually a relatively GPU-friendly setup, because CPUs have
since reduced the gap in terms of FLOPs quite a bit (see for example
http://www.karlrupp.net/2013/06/cpu-gpu-and-mic-hardware-characteristics-over-time/)
This is not good news, as my supervisor is really optimistic about using GPUs and
getting high speed-ups!
Anyway, at the moment my GPU version is several times slower than the CPU
version, so even a 2x would be a win now :D
I'd suggest convincing your supervisor to buy/use a cluster with
current hardware. You would enjoy a higher speedup than what you could
get in an ideal setting with a GPU from 2010 anyway ;-)
(Having said that, my cautious estimate is that you can get some
performance gains for SVD if you deep-dive into the existing SVD
implementation, carefully redesign it to minimize CPU<->GPU
communication, and use optimized library routines for the BLAS 3
operations. Currently there is not enough GPU infrastructure in PETSc to
achieve this via command line parameters only.)
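
To illustrate why CPU<->GPU communication matters so much, here is a rough
back-of-the-envelope model in Python. It is only a sketch: the function
name gpu_speedup, its parameters, and all numbers are made up for
illustration and are not measurements of SLEPc, PETSc, or any particular
hardware. It assumes that a fixed fraction of the CPU run time is moved to
the GPU, that this part runs faster there by a constant factor, and that
all host/device copies add up to one lump of extra time.

```python
def gpu_speedup(cpu_time, kernel_speedup, offload_fraction, transfer_time):
    """Estimated end-to-end speedup of a partially offloaded computation.

    cpu_time:         wall time of the CPU-only run (seconds)
    kernel_speedup:   how much faster the offloaded part runs on the GPU
    offload_fraction: share of cpu_time that is moved to the GPU (0..1)
    transfer_time:    total time spent copying data between host and device
    """
    stays_on_cpu = (1.0 - offload_fraction) * cpu_time
    runs_on_gpu = offload_fraction * cpu_time / kernel_speedup
    return cpu_time / (stays_on_cpu + runs_on_gpu + transfer_time)


# Hypothetical numbers, for illustration only.
cpu_time = 100.0
for transfer_time in (0.0, 20.0, 150.0):
    s = gpu_speedup(cpu_time, kernel_speedup=4.0, offload_fraction=0.8,
                    transfer_time=transfer_time)
    print(f"transfer {transfer_time:6.1f} s -> speedup {s:.2f}x")
```

With these made-up inputs the model drops below 1x (i.e. slower than the
CPU run) once the copies take longer than the compute time they save,
which is the kind of behaviour one sees when data bounces between host and
device in every iteration.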
Best regards,
Karli