I saw your update. In PetscCUDAInitialize we have

      /* First get the device count */
      err   = cudaGetDeviceCount(&devCount);

      /* next determine the rank and then set the device via a mod */
      ierr   = MPI_Comm_rank(comm,&rank);CHKERRQ(ierr);
      device = rank % devCount;
      err    = cudaSetDevice(device);

If we rely on the first CUDA call to do the initialization, how could CUDA 
know about this MPI rank-to-device mapping?
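
To make the concern concrete, here is a minimal sketch of just the mapping 
arithmetic from the snippet above, with no MPI or CUDA calls. pick_device is 
a hypothetical helper for illustration, not a PETSc function, and the 
assertions are my own assumption about sensible preconditions.

```c
#include <assert.h>

/* Hypothetical helper (not part of PETSc): maps an MPI rank to a CUDA device
   index round-robin, the same arithmetic as `device = rank % devCount`. */
static int pick_device(int rank, int dev_count)
{
  assert(rank >= 0);     /* MPI ranks are non-negative */
  assert(dev_count > 0); /* the modulo is undefined for zero devices */
  return rank % dev_count;
}
```

With 4 visible devices, ranks 0 through 7 would map to devices 0,1,2,3,0,1,2,3. 
A bare first CUDA call has no access to the rank, so it could not reproduce 
this choice by itself.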
--Junchao Zhang

On Wed, Sep 18, 2019 at 11:42 PM Smith, Barry F. <bsm...@mcs.anl.gov> wrote:

  Fixed the docs. Thanks for pointing out the lack of clarity.

> On Sep 18, 2019, at 11:25 PM, Zhang, Junchao via petsc-dev 
> <petsc-dev@mcs.anl.gov> wrote:
> Barry,
> I saw you added these in init.c
> +  -cuda_initialize - do the initialization in PetscInitialize()
> Notes:
>    Initializing cuBLAS takes about 1/2 second, therefore it is done by 
> default in PetscInitialize() before logging begins
> But I did not follow the other case: with -cuda_initialize 0, when will CUDA 
> be initialized?
> --Junchao Zhang
