On Thu, Oct 10, 2019 at 2:21 PM David Greene via cfe-dev <[email protected]> wrote:

> Florian Hahn via llvm-dev <[email protected]> writes:
>
> >> - Performance varies from implementation to implementation. It is
> >>   difficult to keep tests up-to-date for all possible targets and
> >>   subtargets.
> >
> > Could you expand a bit more what you mean here? Are you concerned
> > about having to run the performance tests on different kinds of
> > hardware? In what way do the existing benchmarks require keeping
> > up-to-date?
>
> We have to support many different systems and those systems are always
> changing (new processors, new BIOS, new OS, etc.). Performance can vary
> widely day to day from factors completely outside the compiler's
> control. As the performance changes you have to keep updating the tests
> to expect the new performance numbers. Relying on performance
> measurements to ensure something like vectorization is happening just
> isn't reliable in our experience.

Could you compare performance with vectorization turned on and off?
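Something like the following is what I had in mind -- just an untested sketch against the google benchmark framework that test-suite already uses, with a made-up kernel; the clang loop pragma is there purely to get a "vectorization off" baseline for the same code:

// Rough sketch (untested): the same kernel benchmarked twice, once as-is
// and once with the loop vectorizer/interleaver disabled via pragma, so a
// lost vectorization shows up as the two timings converging.
#include <benchmark/benchmark.h>
#include <vector>

static void saxpy(float *x, const float *y, float a, int n) {
  for (int i = 0; i < n; ++i)
    x[i] = a * x[i] + y[i];
}

static void saxpy_novec(float *x, const float *y, float a, int n) {
#pragma clang loop vectorize(disable) interleave(disable)
  for (int i = 0; i < n; ++i)
    x[i] = a * x[i] + y[i];
}

static void BM_SaxpyVec(benchmark::State &state) {
  int n = static_cast<int>(state.range(0));
  std::vector<float> x(n, 1.0f), y(n, 2.0f);
  for (auto _ : state) {
    saxpy(x.data(), y.data(), 3.0f, n);
    benchmark::ClobberMemory();  // keep the stores from being optimized away
  }
}
BENCHMARK(BM_SaxpyVec)->Arg(1 << 16);

static void BM_SaxpyNoVec(benchmark::State &state) {
  int n = static_cast<int>(state.range(0));
  std::vector<float> x(n, 1.0f), y(n, 2.0f);
  for (auto _ : state) {
    saxpy_novec(x.data(), y.data(), 3.0f, n);
    benchmark::ClobberMemory();
  }
}
BENCHMARK(BM_SaxpyNoVec)->Arg(1 << 16);

BENCHMARK_MAIN();

Running both variants back to back on the same machine should factor out a lot of the day-to-day system noise you describe, though I agree it would not catch everything.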

> > With tests checking ASM, wouldn’t we end up with lots of checks for
> > various targets/subtargets that we need to keep up to date?
>
> Yes, that's true. But the only thing that changes the asm generated is
> the compiler.
>
> > Just considering AArch64 as an example, people might want to check the
> > ASM for different architecture versions and different vector
> > extensions and different vendors might want to make sure that the ASM
> > on their specific cores does not regress.
>
> Absolutely. We do a lot of that sort of thing downstream.
>
> >> - Partially as a result, but also for other reasons, performance tests
> >>   tend to be complicated, either in code size or in the numerous code
> >>   paths tested. This makes such tests hard to debug when there is a
> >>   regression.
> >
> > I am not sure they have to. Have you considered adding the small test
> > functions/loops as micro-benchmarks using the existing google
> > benchmark infrastructure in test-suite?
>
> We have tried nightly performance runs using LNT/test-suite and have
> found it to be very unreliable, especially the microbenchmarks.
>
> > I think that might be able to address the points here relatively
> > adequately. The separate micro benchmarks would be relatively small
> > and we should be able to track down regressions in a similar fashion
> > as if it were a stand-alone file we compile and then analyze the ASM.
> > Plus, we can easily run it and verify the performance on actual
> > hardware.
>
> A few of my colleagues really struggled to get consistent results out of
> LNT. They asked for help and discussed with a few upstream folks, but
> in the end were not able to get something reliable working. I've talked
> to a couple of other people off-list and they've had similar
> experiences. It would be great if we had a reliable performance suite.
> Please tell us how to get it working! :)
>
> But even then, I still maintain there is a place for the kind of
> end-to-end testing I describe. Performance testing would complement it.
> Neither is a replacement for the other.
>
> >> - Performance tests don't focus on the why/how of vectorization. They
> >>   just check, "did it run fast enough?" Maybe the test ran fast enough
> >>   for some other reason but we still lost desired vectorization and
> >>   could have run even faster.
> >
> > If you would add a new micro-benchmark, you could check that it
> > produces the desired result when adding it. The runtime tracking
> > should cover cases where we lost optimizations. I guess if the
> > benchmarks are too big, additional optimizations in one part could
> > hide lost optimizations somewhere else. But I would assume this to be
> > relatively unlikely, as long as the benchmarks are isolated.
>
> Even then I have seen small performance tests vary widely in performance
> due to system issues (see above). Again, there is a place for them but
> they are not sufficient.
>
> > Also, checking the assembly for vector code does not guarantee that
> > the vector code will actually be executed. So for example by just
> > checking the assembly for certain vector instructions, we might miss
> > that we regressed performance, because we messed up the runtime checks
> > guarding the vector loop.
>
> Oh absolutely. Presumably such checks would be included in the test or
> would be checked by a different test. As always, tests have to be
> constructed intelligently. :)
>
> -David
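For concreteness, I imagine the end-to-end ASM checks being discussed would look roughly like the sketch below. The RUN line, target triple and CHECK patterns are purely illustrative on my part, not taken from the actual proposal:

// Hypothetical end-to-end test: build straight from C source to asm and
// verify the loop was vectorized. RUN line and CHECK patterns are
// illustrative only.
// RUN: %clang -O2 --target=x86_64-unknown-linux-gnu -S -o - %s \
// RUN:   | FileCheck %s

void saxpy(float *restrict x, const float *restrict y, float a, int n) {
  for (int i = 0; i < n; ++i)
    x[i] = a * x[i] + y[i];
}

// With the default SSE2 baseline we expect packed single-precision math.
// CHECK-LABEL: saxpy:
// CHECK: mulps
// CHECK: addps

How specific the CHECK patterns should be (exact instructions versus just "some vector registers showed up") is exactly the per-target maintenance trade-off raised above.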
