I would like, as much as possible, to pass the CUDA and HIP streams to Kokkos, 
since I can directly handle much of the annoyance of wrangling multiple 
streams and stream objects externally. Last I checked, Kokkos was moving 
towards allowing streams to be associated with functions, but admittedly that 
was a while back.
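
To make the idea concrete, here is a rough sketch of the kind of single stream 
object described further down the thread. Every name in it is hypothetical 
(this is not the actual PetscStream API), and the two stream types are opaque 
stand-ins so that the snippet compiles without CUDA or HIP installed. The 
wrapper only records which backend it holds and hands back the native handle.

```cpp
#include <cassert>
#include <stdexcept>

// Opaque stand-ins for cudaStream_t and hipStream_t (assumption: real code
// would include the vendor headers and use the real handle types instead).
struct FakeCudaStream { int id; };
struct FakeHipStream  { int id; };

// Tag that says which backend the wrapper currently holds.
enum class StreamType { None, Cuda, Hip };

// Hypothetical wrapper: one object, one slot per backend, plus a type tag.
class StreamWrapper {
public:
  StreamWrapper() = default;
  explicit StreamWrapper(FakeCudaStream *s) : type_(StreamType::Cuda), cuda_(s) {}
  explicit StreamWrapper(FakeHipStream *s) : type_(StreamType::Hip), hip_(s) {}

  StreamType type() const { return type_; }

  // Return the native handle; throw if the wrapper holds another backend.
  FakeCudaStream *cuda() const {
    if (type_ != StreamType::Cuda) throw std::logic_error("not a CUDA stream");
    return cuda_;
  }
  FakeHipStream *hip() const {
    if (type_ != StreamType::Hip) throw std::logic_error("not a HIP stream");
    return hip_;
  }

private:
  StreamType      type_ = StreamType::None;
  FakeCudaStream *cuda_ = nullptr;
  FakeHipStream  *hip_  = nullptr;
};
```

With the real types, the handle returned by cuda() is what would be handed to 
Kokkos. As far as I know, a Kokkos::Cuda execution space instance can be 
constructed from a cudaStream_t, but please check that against the current 
Kokkos documentation.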

Best regards,

Jacob Faibussowitsch
(Jacob Fai - booss - oh - vitch)
Cell: (312) 694-3391

> On Jan 10, 2021, at 13:10, Mark Adams <mfad...@lbl.gov> wrote:
> 
> 
> 
> On Sat, Jan 9, 2021 at 7:37 PM Jacob Faibussowitsch <jacob....@gmail.com 
> <mailto:jacob....@gmail.com>> wrote:
> It is a single object that holds a pointer to every stream implementation and 
> a toggleable type so it can be universally passed around. It currently has a 
> cudaStream and a hipStream, but this is easily extendable to any other stream 
> implementation.  
> 
> Do you have any thoughts on how this would work with Kokkos?
> 
> Would you want to feed Kokkos your Cuda/Hip, etc., stream or add a Kokkos 
> backend to your object? 
> 
> Junchao might be the person to ask. I would guess Kokkos View (vector) 
> objects carry a stream, because "deep_copy", which moves data to/from the 
> GPU, is blocking.
> 
> Thanks,
> Mark
> 
> 
> Best regards,
> 
> Jacob Faibussowitsch
> (Jacob Fai - booss - oh - vitch)
> Cell: +1 (312) 694-3391
> 
>> On Jan 9, 2021, at 18:19, Mark Adams <mfad...@lbl.gov 
>> <mailto:mfad...@lbl.gov>> wrote:
>> 
>> 
>> Is this stream object going to have Cuda, Kokkos, etc., implementations?
>> 
>> On Sat, Jan 9, 2021 at 4:09 PM Jacob Faibussowitsch <jacob....@gmail.com 
>> <mailto:jacob....@gmail.com>> wrote:
>> I’m currently working on an implementation of a general PetscStream object. 
>> It currently supports only Vector ops and has a proof-of-concept KSPCG, but 
>> it should be extensible to other objects when finished. Junchao is also 
>> indirectly working on pipeline support in his NVSHMEM MR. Take a look at 
>> either MR; it would be very useful to get your input, as tailoring either of 
>> these approaches for pipelined algorithms is key.
>> 
>> Best regards,
>> 
>> Jacob Faibussowitsch
>> (Jacob Fai - booss - oh - vitch)
>> Cell: (312) 694-3391
>> 
>>> On Jan 9, 2021, at 15:01, Mark Adams <mfad...@lbl.gov 
>>> <mailto:mfad...@lbl.gov>> wrote:
>>> 
>>> I would like to put a non-overlapping ASM solve on the GPU. It's not clear 
>>> that we have a model for this. 
>>> 
>>> PCApply_ASM currently pipelines the scatter with the subdomain solves. I 
>>> think we would want to change this and do a 1) scatter-begin loop, 2) 
>>> scatter-end and non-blocking solve loop, 3) solve-wait and scatter-begin 
>>> loop, and 4) scatter-end loop.
>>> 
>>> I'm not sure how to go about doing this.
>>>  * Should we make a new PCApply_ASM_PARALLEL or dump this pipelining 
>>> algorithm and rewrite PCApply_ASM?
>>>  * Add a solver-wait method to KSP?
>>> 
>>> Thoughts?
>>> 
>>> Mark
>> 
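
For reference, the four-loop ordering proposed in the original message can be 
sketched as below. This is only an illustration under stated assumptions: the 
Subdomain and SolveHandle types and the apply_blocks function are hypothetical 
stand-ins, not PETSc code. The "solve" is a diagonal scaling, the scatters are 
plain index copies, and the non-blocking solve is modeled by recording the 
work at launch and running it in wait().

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical handle for a launched solve. In this sketch the work is only
// recorded at launch and executed in wait(); a GPU version would enqueue the
// solve on a stream at launch and synchronize that stream in wait().
struct SolveHandle {
  std::function<void()> work;
  void wait() {
    if (work) { work(); work = nullptr; }
  }
};

// Stand-in for one non-overlapping block (a real block would own a KSP).
struct Subdomain {
  std::vector<std::size_t> idx;        // global indices owned by this block
  double                   diag = 1.0; // the "solve" is sol = rhs / diag
  std::vector<double>      rhs, sol;   // local work vectors
};

std::vector<double> apply_blocks(std::vector<Subdomain> &subs,
                                 const std::vector<double> &b) {
  std::vector<double>      y(b.size(), 0.0);
  std::vector<SolveHandle> pending(subs.size());

  // 1) scatter-begin loop (modeled here by sizing the local buffers).
  for (auto &s : subs) {
    s.rhs.assign(s.idx.size(), 0.0);
    s.sol.assign(s.idx.size(), 0.0);
  }

  // 2) scatter-end and non-blocking solve loop.
  for (std::size_t k = 0; k < subs.size(); ++k) {
    Subdomain *s = &subs[k];
    for (std::size_t i = 0; i < s->idx.size(); ++i) {
      assert(s->idx[i] < b.size());
      s->rhs[i] = b[s->idx[i]];
    }
    pending[k].work = [s] {
      for (std::size_t i = 0; i < s->rhs.size(); ++i)
        s->sol[i] = s->rhs[i] / s->diag;
    };
  }

  // 3) solve-wait loop (a reverse scatter-begin would be issued here).
  for (std::size_t k = 0; k < subs.size(); ++k) pending[k].wait();

  // 4) reverse scatter-end loop: add each block's solution into the result.
  for (const auto &s : subs)
    for (std::size_t i = 0; i < s.idx.size(); ++i) y[s.idx[i]] += s.sol[i];

  return y;
}
```

The point of the ordering is that every solve is launched in loop 2 before any 
of them is waited on in loop 3, so independent subdomain solves have a chance 
to overlap on the device.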
