Re: [petsc-dev] https://developer.nvidia.com/nccl

2020-06-16 Thread Karl Rupp
From a practical standpoint it seems to me that NCCL is an offering to 
a community that isn't used to MPI. It's categorized as 'Deep Learning 
Software' on the NVIDIA page ;-)


The section 'NCCL and MPI' has some interesting bits:
 https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/mpi.html

At the bottom of the page there is
 "Using NCCL to perform inter-GPU communication concurrently with 
CUDA-aware MPI may create deadlocks. (...) Using both MPI and NCCL to 
perform transfers between the same sets of CUDA devices concurrently is 
therefore not guaranteed to be safe."


While I'm impressed that NVIDIA even 'reinvents' MPI for their GPUs to 
serve the deep learning community, I don't think NCCL provides enough 
beyond MPI for PETSc.


Best regards,
Karli





On 6/17/20 4:13 AM, Junchao Zhang wrote:
> It should be renamed as NCL (NVIDIA Communications Library) as it adds
> point-to-point, in addition to collectives. I am not sure whether to
> implement it in PETSc as no exascale machine uses NVIDIA GPUs.
>
> --Junchao Zhang
>
>
> On Tue, Jun 16, 2020 at 6:44 PM Matthew Knepley wrote:
>
>> It would seem to make more sense to just reverse-engineer this as
>> another MPI impl.
>>
>>     Matt
>>
>> On Tue, Jun 16, 2020 at 6:22 PM Barry Smith <bsm...@petsc.dev> wrote:
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which
>> their experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/




Re: [petsc-dev] https://developer.nvidia.com/nccl

2020-06-16 Thread Junchao Zhang
It should be renamed as NCL (NVIDIA Communications Library) as it adds
point-to-point, in addition to collectives. I am not sure whether to
implement it in PETSc as no exascale machine uses NVIDIA GPUs.

--Junchao Zhang


On Tue, Jun 16, 2020 at 6:44 PM Matthew Knepley wrote:

> It would seem to make more sense to just reverse-engineer this as
> another MPI impl.
>
>    Matt
>
> On Tue, Jun 16, 2020 at 6:22 PM Barry Smith wrote:
>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-dev] https://developer.nvidia.com/nccl

2020-06-16 Thread Matthew Knepley
It would seem to make more sense to just reverse-engineer this as
another MPI impl.

   Matt

On Tue, Jun 16, 2020 at 6:22 PM Barry Smith wrote:

>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


[petsc-dev] https://developer.nvidia.com/nccl

2020-06-16 Thread Barry Smith