I am not arguing for a rickety set of scripts; I am arguing that doing more is not so easy, and that it is only worth doing if the underlying benchmark is worth the effort.
> On Jan 22, 2022, at 8:08 PM, Jed Brown <j...@jedbrown.org> wrote:
>
> Yeah, I'm referring to the operational aspect of data management, not
> benchmark design (which is hard; even Sam had years working with Mark and
> me on HPGMG to refine that).
>
> If you run the libCEED BPs (which use PETSc), you can run one command
>
>   srun -N.... ./bps -ceed /cpu/self/xsmm/blocked,/gpu/cuda/gen -degree 2,3,4,5 -local_nodes 1000,5000000 -problem bp1,bp2,bp3,bp4
>
> and it'll loop (in C code) over all the combinations (reusing some
> non-benchmarked things like the DMPlex) across the whole range of sizes,
> problems, and devices. It makes one output file, and you feed that to a
> Python script to read it as a Pandas DataFrame and plot (or read and
> interact with it in a notebook). You can have a basket of files from
> different machines and slice those plots without code changes.
>
> We should do something similar for a suite of PETSc benchmarks, even just
> basic Vec and Mat operations like in the reports. It isn't more work than
> a rickety bundle of scripts, and it's a lot less error-prone.
>
> Barry Smith <bsm...@petsc.dev> writes:
>
>> I submit it is actually a good amount of additional work and requires real
>> creativity and very good judgment; it is not a good intro or undergrad
>> project, especially for someone without a huge amount of hands-on
>> experience already. Look at who had to do the new SPEChpc multigrid
>> benchmark. The last time I checked, Sam was not an undergrad: Senior
>> Scientist, Lawrence Berkeley National Laboratory, cited by 11194. I
>> definitely do not plan to involve myself in any brand-new serious
>> benchmarking studies in my current lifetime; doing one correctly is a
>> massive undertaking IMHO.
>>
>>> On Jan 22, 2022, at 6:43 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>
>>> This isn't so much more or less work as work in more useful places. Maybe
>>> this is a good undergrad or intro project: make a clean workflow for
>>> these experiments.
>>>
>>> Barry Smith <bsm...@petsc.dev> writes:
>>>
>>>> Performance studies are enormously difficult to do well, which is why
>>>> there are so few good ones out there. And unless you fall into the
>>>> LINPACK benchmark or hit upon Streams, the rewards of doing an excellent
>>>> job are pretty thin. Even Streams was not properly maintained for many
>>>> years; you could not just get it and use it out of the box for a variety
>>>> of purposes (which is why PETSc has its own hacked-up versions). I submit
>>>> a proper performance study is a full-time job, and everyone already has
>>>> one of those.
>>>>
>>>>> On Jan 22, 2022, at 2:11 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>>>
>>>>> Barry Smith <bsm...@petsc.dev> writes:
>>>>>
>>>>>>> On Jan 22, 2022, at 12:15 PM, Jed Brown <j...@jedbrown.org> wrote:
>>>>>>>
>>>>>>> Barry, when you did the tech reports, did you make an example to
>>>>>>> reproduce on other architectures? Like, run this one example (it'll
>>>>>>> run all the benchmarks across different sizes) and then run this
>>>>>>> script on the output to make all the figures?
>>>>>>
>>>>>> It is documented in
>>>>>> https://www.overleaf.com/project/5ff8f7aca589b2f7eb81c579
>>>>>> You may need to dig through the submit scripts etc. to find the exact
>>>>>> details.
>>>>>
>>>>> This runs a ton of small jobs, and each job doesn't really preload.
>>>>> Instead of loops in the job submission scripts, the loops could be
>>>>> inside the C code, and it could directly output tabular data. This
>>>>> would run faster and be easier to submit and analyze.
>>>>>
>>>>> https://gitlab.com/hannah_mairs/summit-performance/-/blob/master/summit-submissions/submit_gpu1.lsf
>>>>>
>>>>> It would hopefully also avoid writing the size range manually over here
>>>>> in the analysis script, where it has to match the job submission
>>>>> exactly:
>>>>>
>>>>> https://gitlab.com/hannah_mairs/summit-performance/-/blob/master/python/graphs.py#L8-9
>>>>>
>>>>> We'd make our lives a lot easier understanding new machines if we put
>>>>> into the design of performance studies just a fraction of the kind of
>>>>> thought we put into public library interfaces.
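For concreteness, here is a minimal sketch of the analysis side Jed describes: one tabular output file per machine, read into a Pandas DataFrame and sliced without code changes. The file layout and column names (machine, problem, degree, local_nodes, mflops) are assumptions for illustration, not the actual bps output schema.

  # Hypothetical analysis of one-output-file-per-machine benchmark data.
  # The whitespace-separated format and column names are assumptions.
  from pathlib import Path
  import pandas as pd
  import matplotlib.pyplot as plt

  frames = []
  for path in Path("results").glob("*.txt"):   # e.g. summit.txt, crusher.txt
      df = pd.read_table(path, sep=r"\s+")     # one header row per file
      df["machine"] = path.stem                # tag rows by machine
      frames.append(df)
  data = pd.concat(frames, ignore_index=True)

  # Slicing a basket of machines needs no code changes, just a mask.
  sel = data[(data["problem"] == "bp3") & (data["degree"] == 3)]
  for machine, group in sel.groupby("machine"):
      plt.loglog(group["local_nodes"], group["mflops"], "o-", label=machine)
  plt.xlabel("local nodes")
  plt.ylabel("MFLOP/s")
  plt.legend()
  plt.savefig("bp3-degree3.png")

The same frame works interactively in a notebook, and comparing a new machine is just dropping another file into results/.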
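The hardcoded size list in graphs.py, which must mirror the submission scripts, is the duplication Jed objects to. If the benchmark loops internally and writes tabular output, the analysis can instead recover the sizes from the data itself. A sketch, under the same assumed file format as above:

  import pandas as pd

  data = pd.read_table("results/summit.txt", sep=r"\s+")

  # Recover whatever sizes the benchmark actually ran, rather than
  # hardcoding a list that must match the job submission exactly.
  sizes = sorted(data["local_nodes"].unique())
  for n in sizes:
      best = data.loc[data["local_nodes"] == n, "mflops"].max()
      print(n, best)

If the size range in the C-side loop changes, this script stays correct without edits.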