If you use cudaMallocManaged with host affinity, you can drop that into
PETSc malloc and it should "just work," including migrating to the GPU when
touched. Or you can give it device affinity and it will migrate the other
way when the CPU touches it.
This is way more performance portable than system ma…
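The host/device-affinity idea above can be sketched with the CUDA runtime's managed-memory hints. This is a generic illustration, not PETSc code, and it assumes CUDA 8+ and a device supporting concurrent managed access:

```c
// Sketch (untested): a managed allocation with host affinity. Pages are
// preferred on the host and migrate to the GPU on first device touch;
// swapping cudaCpuDeviceId for a device ordinal gives device affinity.
#include <cuda_runtime.h>
#include <stddef.h>

double *make_managed_host_affine(size_t n) {
  double *p = NULL;
  if (cudaMallocManaged((void **)&p, n * sizeof(*p),
                        cudaMemAttachGlobal) != cudaSuccess)
    return NULL;
  cudaMemAdvise(p, n * sizeof(*p), cudaMemAdviseSetPreferredLocation,
                cudaCpuDeviceId);
  return p;
}
```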
OK good to know. I will now worry even less about making this very complete.
On Wed, Sep 2, 2020 at 1:33 PM Barry Smith wrote:
Mark,
Currently you use the NVIDIA-provided cudaMalloc directly for all
mallocs on the GPU; see for example aijcusparse.cu.
I will be using Stefano's work to start developing a unified PETSc-based
system for all memory management, but don't wait for that.
Barry
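For context, the direct-allocation pattern Barry describes looks roughly like the following. This is a generic sketch of device buffers obtained straight from the CUDA runtime, not code taken from aijcusparse.cu:

```c
// Sketch (not from PETSc): allocate a device array directly with the
// CUDA runtime, bypassing any PETSc malloc layer, and copy host data up.
#include <cuda_runtime.h>
#include <stddef.h>

int fill_device_array(double **d_a, const double *h_a, size_t n) {
  cudaError_t err = cudaMalloc((void **)d_a, n * sizeof(double));
  if (err != cudaSuccess) return (int)err;
  // *d_a is device-only memory: the host must not dereference it.
  err = cudaMemcpy(*d_a, h_a, n * sizeof(double), cudaMemcpyHostToDevice);
  return (int)err;
}
```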
I believe there are a few PetscMallocCuda impls in
src/sys/memory/cuda/mcudahost.cu that seem to do what you are describing. If
you are creating Mats you can also consider cudaMallocPitch, but I'm not sure
how that plays with the sparse storage impls that PETSc Mat uses. Seems more
useful for d…
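To make the cudaMallocPitch suggestion concrete, here is a hedged sketch of a pitched 2D allocation. Each row starts on an aligned pitch boundary, which helps coalescing for dense row-major data but does not map naturally onto CSR-style sparse storage, which is presumably the concern raised above:

```c
// Sketch: pitched 2D device allocation. pitch_bytes receives the
// padded row width chosen by the runtime.
#include <cuda_runtime.h>
#include <stddef.h>

double *alloc_pitched(size_t rows, size_t cols, size_t *pitch_bytes) {
  double *d = NULL;
  if (cudaMallocPitch((void **)&d, pitch_bytes,
                      cols * sizeof(double), rows) != cudaSuccess)
    return NULL;
  // Element (i, j) lives at (double *)((char *)d + i * *pitch_bytes) + j.
  return d;
}
```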
PETSc mallocs seem to boil down to PetscMallocAlign. There are switches in
there, but I don't see a CUDA malloc. This would seem to be convenient if I
want to create an object entirely on CUDA or any device.
Are there any thoughts along these lines, or should I just duplicate Mat
creation, for instance?
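One way to picture the "switch inside the malloc layer" being asked about is a replaceable function pointer that a device backend swaps in. This is a hypothetical sketch with illustrative names, not PETSc's actual hook mechanism:

```c
/* Hypothetical sketch of a switchable malloc layer; the names are
 * illustrative, not PETSc's. A CUDA backend would install a function
 * wrapping cudaMalloc or cudaMallocManaged instead of host_malloc. */
#include <stdlib.h>

typedef int (*MallocFn)(size_t size, void **ptr);

/* Default backend: plain host malloc. Returns 0 on success. */
static int host_malloc(size_t size, void **ptr) {
  *ptr = malloc(size);
  return *ptr == NULL;
}

static MallocFn active_malloc = host_malloc;

/* Swap in a different backend (e.g. a device allocator). */
void set_malloc_backend(MallocFn fn) { active_malloc = fn; }

/* Object creation would funnel through this one entry point. */
int my_malloc(size_t size, void **ptr) { return active_malloc(size, ptr); }
```

With a layer like this, creating an object "entirely on the device" would just mean installing the device backend before the object's allocations run, rather than duplicating Mat creation code.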