RE: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [AArch64] Remove redundant ORRWrs which is generated by zero-extend

2021-10-28 Thread Jingu Kang
From: Maxim Kuvyrkov
Sent: 27 October 2021 16:29
To: David Spickett
Cc: Jingu Kang; linaro-toolchain <toolch...@lists.linaro.org>
Subject: Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [AArch64] Remove redundant ORRWrs which is generated by zero-extend

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [AArch64] Remove redundant ORRWrs which is generated by zero-extend

2021-10-27 Thread Maxim Kuvyrkov
Hi David, Thanks for looking at this! I can’t immediately say that this is a false positive; the performance difference reproduces in several independent builds. Looking at the save-temps, at least 400.perlbench’s regexec.s (which hosts S_regmatch()) has 19 extra instructions, which are, if
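For readers who want to repeat this kind of save-temps comparison, here is a minimal sketch of counting instructions in an AArch64 assembly listing. It is not the tooling used in this thread. The function name `count_instructions` and the two inline listings are hypothetical, and the sketch assumes `//` comments, directives that start with `.`, and labels on their own lines.

```python
def count_instructions(asm_text: str) -> int:
    """Count instruction lines in an assembly listing (rough heuristic)."""
    count = 0
    for raw in asm_text.splitlines():
        # Drop trailing '//' comments, then surrounding whitespace.
        line = raw.split("//", 1)[0].strip()
        if not line:
            continue  # blank line or comment-only line
        if line.startswith("."):
            continue  # assembler directive or local label such as .LBB0_1:
        if line.endswith(":"):
            continue  # function or global label
        count += 1
    return count


# Hypothetical listings, invented for illustration only.
before_asm = """
example_fn:                 // @example_fn
    .cfi_startproc
    ldr w8, [x0]
    and w8, w8, #0xff
    ret
"""

after_asm = """
example_fn:                 // @example_fn
    .cfi_startproc
    ldr w8, [x0]
    mov w8, w8
    and w8, w8, #0xff
    ret
"""

extra = count_instructions(after_asm) - count_instructions(before_asm)
print(f"extra instructions: {extra}")
```

A raw count only shows that the generated code changed. Whether the extra instructions sit on a hot path still has to be checked against the perf profile.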

Re: [TCWG CI] 400.perlbench slowed down by 6% after llvm: [AArch64] Remove redundant ORRWrs which is generated by zero-extend

2021-10-27 Thread David Spickett
I think this is a false positive/one-off disturbance in the benchmarking. Based on the contents of the saved temps, FastFullPelBlockMotionSearch has not changed at all (so unless perf is saying time spent in that function and its callees went up, it must be something other than a code change). perl

[TCWG CI] 400.perlbench slowed down by 6% after llvm: [AArch64] Remove redundant ORRWrs which is generated by zero-extend

2021-10-27 Thread ci_notify
After llvm commit a502436259307f95e9c95437d8a1d2d07294341c
Author: Jingu Kang

    [AArch64] Remove redundant ORRWrs which is generated by zero-extend

the following benchmarks slowed down by more than 2%:
- 400.perlbench slowed down by 6% from 9792 to 10354 perf samples
- 464.h264ref slowed down
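The 6% figure in the report can be checked from the two sample counts. The sketch below is an illustration, not the CI's own script. The function name `slowdown_percent` is hypothetical, and it assumes the slowdown is the relative increase in perf samples, with 2% as the reporting threshold named in the notification.

```python
def slowdown_percent(before: int, after: int) -> float:
    """Relative increase in perf samples, expressed in percent."""
    return (after - before) / before * 100.0


THRESHOLD_PERCENT = 2.0  # threshold quoted in the notification

pct = slowdown_percent(9792, 10354)
print(f"400.perlbench: {pct:.2f}% more samples")  # prints 5.74%, i.e. about 6%
print("flagged" if pct > THRESHOLD_PERCENT else "within noise threshold")
```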