Hi Paul,
> Not too heavy. I've already converted much of this code to remove this
> package while supporting existing features, though I haven't pushed it
> into the fork. The real question is whether we want to go down this
> path or not.
I see two options: either txpetscgpu is a self-contained package that
brings its own set of implementation files along, or it gets integrated.
The current model of injected PETSC_HAVE_TXPETSCGPU preprocessor
switches will not be able to compete in any code beauty contest... ;-)
Either way, there is presumably also a licensing question involved, so
you guys need to agree to have txpetscgpu integrated (or not).
> Right now, I think CUSP does not support SpMVs in streams. Thus, in
> order to get an effective multi-GPU SpMV, one has to rewrite all the
> SpMV kernels (for all the different storage formats) to use streams.
> This adds a lot of additional code to support. I would prefer to just
> call some CUSP API with a stream as an input argument, but I don't
> think that exists at the moment. I'm not sure what to do here. Once
> the other code is accepted, perhaps we can address this problem then?
The CUSP API needs to provide streams for that, yes.
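For reference, CUSPARSE already provides this kind of entry point: one
binds a CUDA stream to the CUSPARSE handle, and subsequent SpMV calls
are launched in that stream. A rough, untested sketch (the wrapper and
its parameter names are mine, just for illustration):

  /* Sketch only: bind a stream to the CUSPARSE handle, then run
   * y = A*x asynchronously in that stream for a CSR matrix. */
  #include <cuda_runtime.h>
  #include <cusparse_v2.h>

  static void csr_spmv_on_stream(cusparseHandle_t handle, cudaStream_t stream,
                                 int m, int n, int nnz,
                                 const double *val, const int *rowptr,
                                 const int *colind,
                                 const double *x, double *y)
  {
    const double one = 1.0, zero = 0.0;
    cusparseMatDescr_t descr;
    cusparseCreateMatDescr(&descr);    /* defaults: general matrix, 0-based indices */
    cusparseSetStream(handle, stream); /* subsequent calls launch in 'stream' */
    cusparseDcsrmv(handle, CUSPARSE_OPERATION_NON_TRANSPOSE,
                   m, n, nnz, &one, descr,
                   val, rowptr, colind, x, &zero, y);
    cusparseDestroyMatDescr(descr);
  }

If CUSP exposed a comparable stream argument, the per-format kernel
rewrites you mention would become unnecessary.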
As I noted in my comments on your commits on Bitbucket, I'd prefer to
see CUSP separated from CUSPARSE and to use a CUSPARSE-native matrix
data structure (a simple collection of handles) instead. This way one
can already use the CUSPARSE interface if only the CUDA SDK is
installed, and hook in CUSP later for preconditioners, etc.
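To illustrate what I have in mind (the struct and its members are made
up for this sketch, not actual PETSc types), such a CUSPARSE-native
container only needs to bundle the handles with the raw CSR arrays on
the device:

  #include <cusparse_v2.h>

  /* Hypothetical CUSPARSE-native matrix container: library handle,
   * matrix descriptor, and the CSR arrays living on the device.
   * No CUSP types involved, so the CUDA SDK alone is sufficient. */
  typedef struct {
    cusparseHandle_t   handle;  /* CUSPARSE context */
    cusparseMatDescr_t descr;   /* matrix type, index base, ... */
    int     m, n, nnz;          /* dimensions and number of nonzeros */
    int    *rowptr;             /* device pointer, length m+1 */
    int    *colind;             /* device pointer, length nnz */
    double *val;                /* device pointer, length nnz */
  } CsrMatCUSPARSE;

CUSP-based preconditioners could then be wrapped around these raw
arrays later on, without CUSPARSE depending on CUSP.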
> It works across nodes, but you have to know what you're doing. This is
> a tough problem to solve universally because it's (almost) impossible
> to determine the number of MPI ranks per node in an MPI run; I've
> never seen an MPI function that returns this information.
> Right now, a 1-1 pairing between CPU core and GPU will work across any
> system with any number of nodes. I've tested this on a system with 2
> nodes and 4 GPUs per node (so "mpirun -n 8 -npernode 4" would work).
Thanks, I see. Apparently I'm not the only one struggling with this
abstraction issue...
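For the record, my understanding is that the pairing you describe boils
down to something like the following sketch, which assumes that ranks
are placed node by node (as with -npernode):

  #include <mpi.h>
  #include <cuda_runtime.h>

  /* Sketch of a 1-1 rank-to-GPU pairing: each rank on a node picks a
   * distinct device, assuming ranks are filled node by node. */
  int main(int argc, char **argv)
  {
    int rank, ndev;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    cudaGetDeviceCount(&ndev);     /* GPUs visible on this node */
    cudaSetDevice(rank % ndev);    /* e.g. 4 ranks per node, 4 GPUs per node */
    /* ... set up CUSPARSE/CUSP objects on the selected device ... */
    MPI_Finalize();
    return 0;
  }

With more ranks per node than GPUs, or a different rank placement, this
simple modulo mapping no longer gives a clean pairing, which is exactly
the part that is hard to abstract away.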
Best regards,
Karli