I am not arguing for a rickety set of scripts; I am arguing that doing more is not so easy, and that it is only worth doing if the underlying benchmark is worth the effort.
> On Jan 22, 2022, at 8:08 PM, Jed Brown <j...@jedbrown.org> wrote:
>
> Yeah, I'm referring to the operational aspect of data management, not
> benchmark design (which is hard; even Sam had years working with Mark and
> me on HPGMG to refine that).
>
> If you run the libCEED BPs (which use PETSc), you can run one command
>
>   srun -N.... ./bps -ceed /cpu/self/xsmm/blocked,/gpu/cuda/gen -degree 2,3,4,5 -local_nodes 1000,5000000 -problem bp1,bp2,bp3,bp4
>
> and it'll loop (in C code) over all the combinations (reusing some
> non-benchmarked things like the DMPlex) across the whole range of sizes,
> problems, and devices. It makes one output file, and you feed that to a
> Python script to read it as a Pandas DataFrame and plot (or read and
> interact with it in a notebook). You can have a basket of files from
> different machines and slice those plots without code changes.
>
> We should do something similar for a suite of PETSc benchmarks, even just
> basic Vec and Mat operations like in the reports. It isn't more work than
> a rickety bundle of scripts, and it's a lot less error-prone.
>
> Barry Smith <bsm...@petsc.dev> writes:
>
>> I submit it is actually a good amount of additional work and requires real
>> creativity and very good judgment; it is not a good intro or undergrad
>> project, especially for someone without a huge amount of hands-on
>> experience already. Look at who had to do the new SPEChpc multigrid
>> benchmark. The last time I checked, Sam was not an undergrad: Senior
>> Scientist, Lawrence Berkeley National Laboratory, cited by 11194. I
>> definitely do not plan to involve myself in any brand-new serious
>> benchmarking studies in my current lifetime; doing one correctly is a
>> massive undertaking IMHO.
>>
>>> On Jan 22, 2022, at 6:43 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>
>>> This isn't so much more or less work as work in more useful places. Maybe
>>> this is a good undergrad or intro project: make a clean workflow for
>>> these experiments.
>>>
>>> Barry Smith <bsm...@petsc.dev> writes:
>>>
>>>> Performance studies are enormously difficult to do well, which is why
>>>> there are so few good ones out there. And unless you fall into the
>>>> LINPACK benchmark or hit upon Streams, the rewards of doing an excellent
>>>> job are pretty thin. Even Streams was not properly maintained for many
>>>> years; you could not just get it and use it out of the box for a variety
>>>> of purposes (which is why PETSc has its own hacked-up versions). I submit
>>>> a proper performance study is a full-time job, and everyone already has
>>>> one of those.
>>>>
>>>>> On Jan 22, 2022, at 2:11 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>>>
>>>>> Barry Smith <bsm...@petsc.dev> writes:
>>>>>
>>>>>>> On Jan 22, 2022, at 12:15 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>>>>>
>>>>>>> Barry, when you did the tech reports, did you make an example to
>>>>>>> reproduce on other architectures? Like, run this one example (it'll
>>>>>>> run all the benchmarks across different sizes) and then run this
>>>>>>> script on the output to make all the figures?
>>>>>>
>>>>>> It is documented in
>>>>>> https://www.overleaf.com/project/5ff8f7aca589b2f7eb81c579
>>>>>> You may need to dig through the submit scripts etc. to find the exact
>>>>>> details.
>>>>>
>>>>> This runs a ton of small jobs, and each job doesn't really preload.
>>>>> Instead of loops in the job submission scripts, the loops could be
>>>>> inside the C code, and it could directly output tabular data. This
>>>>> would run faster and be easier to submit and analyze.
>>>>>
>>>>> https://gitlab.com/hannah_mairs/summit-performance/-/blob/master/summit-submissions/submit_gpu1.lsf
>>>>>
>>>>> It would hopefully also avoid writing the size range manually over here
>>>>> in the analysis script, where it has to match the job submission
>>>>> exactly:
>>>>>
>>>>> https://gitlab.com/hannah_mairs/summit-performance/-/blob/master/python/graphs.py#L8-9
>>>>>
>>>>> We'd make our lives a lot easier understanding new machines if we put
>>>>> into the design of performance studies just a fraction of the kind of
>>>>> thought we put into public library interfaces.
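For concreteness, here is a minimal sketch of the analysis side Jed describes: one tabular output file per machine, read into a Pandas DataFrame and sliced without code changes. The file layout and column names (machine, problem, degree, local_nodes, mflops) are assumptions for illustration, not the actual bps output schema.

  # Hypothetical analysis of one-output-file-per-machine benchmark data.
  # The whitespace-separated format and column names are assumptions.
  from pathlib import Path
  import pandas as pd
  import matplotlib.pyplot as plt

  frames = []
  for path in Path("results").glob("*.txt"):   # e.g. summit.txt, crusher.txt
      df = pd.read_table(path, sep=r"\s+")     # one header row per file
      df["machine"] = path.stem                # tag rows by machine
      frames.append(df)
  data = pd.concat(frames, ignore_index=True)

  # Slicing a basket of machines needs no code changes, just a mask.
  sel = data[(data["problem"] == "bp3") & (data["degree"] == 3)]
  for machine, group in sel.groupby("machine"):
      plt.loglog(group["local_nodes"], group["mflops"], "o-", label=machine)
  plt.xlabel("local nodes")
  plt.ylabel("MFLOP/s")
  plt.legend()
  plt.savefig("bp3-degree3.png")

The same frame works interactively in a notebook, and comparing a new machine is just dropping another file into results/.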
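The hardcoded size list in graphs.py, which must mirror the submission scripts, is the duplication Jed objects to. If the benchmark loops internally and writes tabular output, the analysis can instead recover the sizes from the data itself. A sketch, under the same assumed file format as above:

  import pandas as pd

  data = pd.read_table("results/summit.txt", sep=r"\s+")

  # Recover whatever sizes the benchmark actually ran, rather than
  # hardcoding a list that must match the job submission exactly.
  sizes = sorted(data["local_nodes"].unique())
  for n in sizes:
      best = data.loc[data["local_nodes"] == n, "mflops"].max()
      print(n, best)

If the size range in the C-side loop changes, this script stays correct without edits.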