> On Jan 25, 2022, at 12:25 PM, Jed Brown <j...@jedbrown.org> wrote:
>
> Barry Smith <bsm...@petsc.dev> writes:
>
>>> On Jan 25, 2022, at 11:55 AM, Jed Brown <j...@jedbrown.org> wrote:
>>>
>>> Barry Smith <bsm...@petsc.dev> writes:
>>>
>>>> Thanks Mark, far more interesting. I've improved the formatting to make it
>>>> easier to read (and fixed width font for email reading)
>>>>
>>>> * Can you do the same run with, say, 10 iterations of Jacobi PC?
>>>>
>>>> * PCApply performance (looks like GAMG) is terrible! Problems too small?
>>>
>>> This is -pc_type jacobi.
>>
>> Dang, how come it doesn't warn about all the gamg arguments passed to the
>> program? I saw them and jumped to the wrong conclusion.
>
> We don't have -options_left by default. Mark has a big .petscrc or
> PETSC_OPTIONS.
>
>> How come PCApply is so low while Pointwise mult (which should be all of
>> PCApply) is high?
>
> I also think that's weird.
>
>>>
>>>> * VecScatter time is completely dominated by SFPack! Junchao what's up
>>>> with that? Lots of little kernels in the PCApply? PCJACOBI run will help
>>>> clarify where that is coming from.
>>>
>>> It's all in MatMult.
>>>
>>> I'd like to see a run that doesn't wait for the GPU.
>>
>> Indeed
>
> What is the command line option to turn
> PetscLogGpuTimeBegin/PetscLogGpuTimeEnd into a no-op even when -log_view is
> on? I know it'll mess up attribution, but it'll still tell us how long the
> solve took.
We don't have an API for this yet. It is slightly tricky because turning it
off will also break the regular -log_view timings for operations such as
VecAXPY(), that is, anything that doesn't already need a synchronization with
the CPU.

Because of this I think Mark should just put a PetscTime() around KSPSolve(),
run without -log_view, and we can compare that number to the one from -log_view
to see how much overhead the synchronization in PetscLogGpuTimeBegin/End is
causing. Ad hoc, yes, but a quick, easy way to get the information.
>
> Also, can we make WaitForKokkos a no-op? I don't think it's necessary for
> correctness (docs indicate Kokkos::fence synchronizes).