RE: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Mekhanoshin, Stanislav
[AMD Official Use Only]

I have tried a lot of different options, I do not recall now. Anyway, it is 
reverted and I do not seem to have resources to further pursue it.

Stas

-Original Message-
From: Maxim Kuvyrkov 
Sent: Friday, October 1, 2021 11:16
To: Mekhanoshin, Stanislav 
Cc: linaro-toolchain@lists.linaro.org
Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
rematerialization of virtual reg uses"

[CAUTION: External Email]

> On 1 Oct 2021, at 21:06, Mekhanoshin, Stanislav 
>  wrote:
>
> [AMD Official Use Only]
>
>> You mentioned that you saw different results for another ARM target — could 
>> you elaborate please?
>
> When I was trying to reproduce hmmer asm I was trying to use different ARM 
> targets. I was never able to pick the one you were using apparently, but then 
> got very different results with different targets.

Our benchmarking CI uses the default armhf target 
(--target=armv7a-linux-gnueabihf) with no additional -mcpu=/-march tuning 
flags.  Is it the same in your testing?  If so, then Clang should generate 
exactly the same assembly in both cases, with the same extra reloads in 456.hmmer.

The hardware used in benchmarking is Cortex-A15, which is still one of the most 
popular cores.  Which one did you use in your experiments?

Thanks,

--
Maxim Kuvyrkov
https://www.linaro.org/

>
> Stas
>
> -Original Message-
> From: Maxim Kuvyrkov 
> Sent: Friday, October 1, 2021 3:05
> To: Mekhanoshin, Stanislav 
> Cc: linaro-toolchain@lists.linaro.org
> Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
> rematerialization of virtual reg uses"
>
> [CAUTION: External Email]
>
> Hi Stanislav,
>
> I fully understand the challenges of compiler optimizations and the fact that 
> a generally-good optimisation can slow down a small number of benchmarks.
>
> Still, benchmarking your original patch (commit 
> 92c1fd19abb15bc68b1127a26137a69e033cdb39) on arm-linux-gnueabihf results in 
> overall runtime slow-down across C/C++ subset of SPEC CPU2006:
> - 0.25% runtime geomean increase at -O2
> - 0.37% runtime geomean increase at -O3
>
> See [1] for the numbers.
>
> You mentioned that you saw different results for another ARM target — could 
> you elaborate please?
>
> [1] 
> https://docs.google.com/spreadsheets/d/1USWty9Vdx6JLo7TGddbkoKVUCiC4wtneOhhbHf5WXfc/edit?usp=sharing
>
> Regards,
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org/
>
>> On 29 Sep 2021, at 20:13, Mekhanoshin, Stanislav 
>>  wrote:
>>
>> [AMD Official Use Only]
>>
>> Maxim,
>>
>> This is really difficult for me to work on this as I do not have various 
>> targets and HW affected. I am sure there were quite a lot of progressions, 
>> but as I said in the beginning regressions are also inevitable, just like 
>> every time a heuristic is involved. For the hmmer case I was getting quite 
>> different results just by selecting a different ARM target. So without a 
>> good way to measure it and given the heuristic approach I cannot satisfy all 
>> the requests from multiple parties. Our target (AMDGPU) does this for a long 
>> time and I believe it is overall beneficial. It is somewhat pity I cannot 
>> make this a universal optimization, but I am also time constrained as there 
>> is other work to do too.
>>
>> Stas
>>
>> -Original Message-
>> From: Maxim Kuvyrkov 
>> Sent: Wednesday, September 29, 2021 4:17
>> To: Mekhanoshin, Stanislav 
>> Cc: linaro-toolchain@lists.linaro.org
>> Subject: Re: [TCWG CI] 456.hmmer slow

RE: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Mekhanoshin, Stanislav
[AMD Official Use Only]

> You mentioned that you saw different results for another ARM target — could 
> you elaborate please?

When I was trying to reproduce hmmer asm I was trying to use different ARM 
targets. I was never able to pick the one you were using apparently, but then 
got very different results with different targets.

Stas

-Original Message-
From: Maxim Kuvyrkov 
Sent: Friday, October 1, 2021 3:05
To: Mekhanoshin, Stanislav 
Cc: linaro-toolchain@lists.linaro.org
Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
rematerialization of virtual reg uses"

[CAUTION: External Email]

Hi Stanislav,

I fully understand the challenges of compiler optimizations and the fact that a 
generally-good optimisation can slow down a small number of benchmarks.

Still, benchmarking your original patch (commit 
92c1fd19abb15bc68b1127a26137a69e033cdb39) on arm-linux-gnueabihf results in 
overall runtime slow-down across C/C++ subset of SPEC CPU2006:
- 0.25% runtime geomean increase at -O2
- 0.37% runtime geomean increase at -O3

See [1] for the numbers.

You mentioned that you saw different results for another ARM target — could you 
elaborate please?

[1] 
https://docs.google.com/spreadsheets/d/1USWty9Vdx6JLo7TGddbkoKVUCiC4wtneOhhbHf5WXfc/edit?usp=sharing

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org/

> On 29 Sep 2021, at 20:13, Mekhanoshin, Stanislav 
>  wrote:
>
> [AMD Official Use Only]
>
> Maxim,
>
> This is really difficult for me to work on this as I do not have various 
> targets and HW affected. I am sure there were quite a lot of progressions, 
> but as I said in the beginning regressions are also inevitable, just like 
> every time a heuristic is involved. For the hmmer case I was getting quite 
> different results just by selecting a different ARM target. So without a good 
> way to measure it and given the heuristic approach I cannot satisfy all the 
> requests from multiple parties. Our target (AMDGPU) does this for a long time 
> and I believe it is overall beneficial. It is somewhat pity I cannot make 
> this a universal optimization, but I am also time constrained as there is 
> other work to do too.
>
> Stas
>
> -Original Message-
> From: Maxim Kuvyrkov 
> Sent: Wednesday, September 29, 2021 4:17
> To: Mekhanoshin, Stanislav 
> Cc: linaro-toolchain@lists.linaro.org
> Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
> rematerialization of virtual reg uses"
>
> [CAUTION: External Email]
>
> I thought the speed up and slow-down from "Allow rematerialization of virtual 
> reg uses" were for different benchmarks, but they are for the same benchmark 
> - 456.hmmer - but for different compilation flags.
>
> - At -O2 the patch slows down 456.hmmer by 5% from 751s to 771s.
> - At -O2 -flto patch speeds up 456.hmmer by 5% from 803s to 765s.
>
> Two observations from this:
> 1. 456.hmmer is very sensitive to this optimisation
> 2. LTO screws up on 456.hmmer.
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org/
>
>> On 29 Sep 2021, at 14:06, Maxim Kuvyrkov  wrote:
>>
>> Hi Stanislav,
>>
>> Just FYI.  Your original patch improved 456.hmmer by 5%, that's a nice speed 
>> up!
>>
>> --
>> Maxim Kuvyrkov
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.linaro.org%2F&data=04%7C01%7CStanislav.Mekhanoshin%40amd.com%7C875cf130a5b3482342a808d984c2e333%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637686796375377160%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=P503kX0DWQuMMr72zn

Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Maxim Kuvyrkov
> On 1 Oct 2021, at 21:06, Mekhanoshin, Stanislav 
>  wrote:
> 
> [AMD Official Use Only]
> 
>> You mentioned that you saw different results for another ARM target — could 
>> you elaborate please?
> 
> When I was trying to reproduce hmmer asm I was trying to use different ARM 
> targets. I was never able to pick the one you were using apparently, but then 
> got very different results with different targets.

Our benchmarking CI uses the default armhf target 
(--target=armv7a-linux-gnueabihf) with no additional -mcpu=/-march tuning 
flags.  Is it the same in your testing?  If so, then Clang should generate 
exactly the same assembly in both cases, with the same extra reloads in 456.hmmer.

The hardware used in benchmarking is Cortex-A15, which is still one of the most 
popular cores.  Which one did you use in your experiments?

Thanks,

--
Maxim Kuvyrkov
https://www.linaro.org

> 
> Stas
> 
> -Original Message-
> From: Maxim Kuvyrkov 
> Sent: Friday, October 1, 2021 3:05
> To: Mekhanoshin, Stanislav 
> Cc: linaro-toolchain@lists.linaro.org
> Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
> rematerialization of virtual reg uses"
> 
> [CAUTION: External Email]
> 
> Hi Stanislav,
> 
> I fully understand the challenges of compiler optimizations and the fact that 
> a generally-good optimisation can slow down a small number of benchmarks.
> 
> Still, benchmarking your original patch (commit 
> 92c1fd19abb15bc68b1127a26137a69e033cdb39) on arm-linux-gnueabihf results in 
> overall runtime slow-down across C/C++ subset of SPEC CPU2006:
> - 0.25% runtime geomean increase at -O2
> - 0.37% runtime geomean increase at -O3
> 
> See [1] for the numbers.
> 
> You mentioned that you saw different results for another ARM target — could 
> you elaborate please?
> 
> [1] 
> https://docs.google.com/spreadsheets/d/1USWty9Vdx6JLo7TGddbkoKVUCiC4wtneOhhbHf5WXfc/edit?usp=sharing
> 
> Regards,
> 
> --
> Maxim Kuvyrkov
> https://www.linaro.org/
> 
>> On 29 Sep 2021, at 20:13, Mekhanoshin, Stanislav 
>>  wrote:
>> 
>> [AMD Official Use Only]
>> 
>> Maxim,
>> 
>> This is really difficult for me to work on this as I do not have various 
>> targets and HW affected. I am sure there were quite a lot of progressions, 
>> but as I said in the beginning regressions are also inevitable, just like 
>> every time a heuristic is involved. For the hmmer case I was getting quite 
>> different results just by selecting a different ARM target. So without a 
>> good way to measure it and given the heuristic approach I cannot satisfy all 
>> the requests from multiple parties. Our target (AMDGPU) does this for a long 
>> time and I believe it is overall beneficial. It is somewhat pity I cannot 
>> make this a universal optimization, but I am also time constrained as there 
>> is other work to do too.
>> 
>> Stas
>> 
>> -Original Message-
>> From: Maxim Kuvyrkov 
>> Sent: Wednesday, September 29, 2021 4:17
>> To: Mekhanoshin, Stanislav 
>> Cc: linaro-toolchain@lists.linaro.org
>> Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
>> rematerialization of virtual reg uses"
>> 
>> [CAUTION: External Email]
>> 
>> I thought the speed up and slow-down from "Allow rematerialization of 
>> virtual reg uses" were for different benchmarks, but they are for the same 
>> benchmark - 456.hmmer - but for different compilation flags.
>> 
>> - At -O2 the patch slows down 456.hmmer by 5% from 751s to 771s.
>> - At -O2 -flto patch speeds up 456.hmmer by 5% from 803s to 765s.
>> 
>> Two observations from this:
>> 1. 456.hmmer is very sensitive to this optimisation
>> 2. LTO screws up on 456.hmmer.
>> 
>> --
>> Maxim Kuvyrkov
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fww

RE: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Mekhanoshin, Stanislav
[AMD Official Use Only]

Maxim,

It is really difficult for me to work on this, as I do not have the various 
targets and HW affected. I am sure there were quite a lot of progressions, but 
as I said in the beginning, regressions are also inevitable, just like every 
time a heuristic is involved. For the hmmer case I was getting quite different 
results just by selecting a different ARM target. So without a good way to 
measure it, and given the heuristic approach, I cannot satisfy all the requests 
from multiple parties. Our target (AMDGPU) has done this for a long time and I 
believe it is overall beneficial. It is somewhat of a pity that I cannot make 
this a universal optimization, but I am also time constrained as there is other 
work to do too.

Stas

-Original Message-
From: Maxim Kuvyrkov 
Sent: Wednesday, September 29, 2021 4:17
To: Mekhanoshin, Stanislav 
Cc: linaro-toolchain@lists.linaro.org
Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
rematerialization of virtual reg uses"

[CAUTION: External Email]

I thought the speed up and slow-down from "Allow rematerialization of virtual 
reg uses" were for different benchmarks, but they are for the same benchmark - 
456.hmmer - but for different compilation flags.

- At -O2 the patch slows down 456.hmmer by 5% from 751s to 771s.
- At -O2 -flto patch speeds up 456.hmmer by 5% from 803s to 765s.

Two observations from this:
1. 456.hmmer is very sensitive to this optimisation
2. LTO screws up on 456.hmmer.

--
Maxim Kuvyrkov
https://www.linaro.org/

> On 29 Sep 2021, at 14:06, Maxim Kuvyrkov  wrote:
>
> Hi Stanislav,
>
> Just FYI.  Your original patch improved 456.hmmer by 5%, that's a nice speed 
> up!
>
> --
> Maxim Kuvyrkov
> https://www.linaro.org/
>
>> On 28 Sep 2021, at 08:21, ci_not...@linaro.org wrote:
>>
>> After llvm commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
>> Author: Stanislav Mekhanoshin 
>>
>>   Revert "Allow rematerialization of virtual reg uses"
>>
>> the following benchmarks slowed down by more than 2%:
>> - 456.hmmer slowed down by 5% from 7649 to 8028 perf samples
>>
>> Below reproducer instructions can be used to re-build both "first_bad" and 
>> "last_good" cross-toolchains used in this bisection.  Naturally, the scripts 
>> will fail when triggering benchmarking jobs if you don't have access to 
>> Linaro TCWG CI.
>>
>> For your convenience, we have uploaded tarballs with pre-processed source 
>> and assembly files at:
>> - First_bad save-temps: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/save-temps/
>> - Last_good save-temps: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/save-temps/
>> - Baseline save-temps: 
>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fci.linaro.org%2Fjob%2Ftcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO%2F16%2Fartifact%2Fartifacts%2Fbuild-baseline%2Fsave-temps%2F&data=04%7C01%7CStanislav.Mekhanoshin%40amd.com%7C06739cf07d704b0ae9c808d9833ab4d

Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-10-01 Thread Maxim Kuvyrkov
Hi Stanislav,

I fully understand the challenges of compiler optimizations and the fact that a 
generally-good optimisation can slow down a small number of benchmarks.

Still, benchmarking your original patch (commit 
92c1fd19abb15bc68b1127a26137a69e033cdb39) on arm-linux-gnueabihf results in 
overall runtime slow-down across C/C++ subset of SPEC CPU2006:
- 0.25% runtime geomean increase at -O2
- 0.37% runtime geomean increase at -O3

See [1] for the numbers.
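
As a rough illustration of what a geomean figure of this kind means, the 
sketch below computes the geometric-mean runtime change from per-benchmark 
"old new" runtime pairs. The function name geomean_change and the input 
values are hypothetical and are not taken from [1] or from the CI scripts.

```shell
# geomean_change: read "old new" runtime pairs (one per line) on stdin and
# print the geometric-mean change from old to new, in percent.
# Hypothetical helper for illustration; not part of the TCWG CI scripts.
geomean_change() {
    awk 'NF == 2 { sum += log($2 / $1); n++ }
         END { if (n) printf "%.2f\n", (exp(sum / n) - 1) * 100 }'
}

# Made-up runtimes in seconds for three benchmarks: before and after a patch.
printf '%s\n' '100 110' '200 190' '50 50' | geomean_change
```

A positive result means the runtimes increased overall, as in the -O2 and 
-O3 figures above; a negative result means an overall speed-up.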

You mentioned that you saw different results for another ARM target — could you 
elaborate please?

[1] 
https://docs.google.com/spreadsheets/d/1USWty9Vdx6JLo7TGddbkoKVUCiC4wtneOhhbHf5WXfc/edit?usp=sharing

Regards,

--
Maxim Kuvyrkov
https://www.linaro.org

> On 29 Sep 2021, at 20:13, Mekhanoshin, Stanislav 
>  wrote:
> 
> [AMD Official Use Only]
> 
> Maxim,
> 
> This is really difficult for me to work on this as I do not have various 
> targets and HW affected. I am sure there were quite a lot of progressions, 
> but as I said in the beginning regressions are also inevitable, just like 
> every time a heuristic is involved. For the hmmer case I was getting quite 
> different results just by selecting a different ARM target. So without a good 
> way to measure it and given the heuristic approach I cannot satisfy all the 
> requests from multiple parties. Our target (AMDGPU) does this for a long time 
> and I believe it is overall beneficial. It is somewhat pity I cannot make 
> this a universal optimization, but I am also time constrained as there is 
> other work to do too.
> 
> Stas
> 
> -Original Message-
> From: Maxim Kuvyrkov 
> Sent: Wednesday, September 29, 2021 4:17
> To: Mekhanoshin, Stanislav 
> Cc: linaro-toolchain@lists.linaro.org
> Subject: Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow 
> rematerialization of virtual reg uses"
> 
> [CAUTION: External Email]
> 
> I thought the speed up and slow-down from "Allow rematerialization of virtual 
> reg uses" were for different benchmarks, but they are for the same benchmark 
> - 456.hmmer - but for different compilation flags.
> 
> - At -O2 the patch slows down 456.hmmer by 5% from 751s to 771s.
> - At -O2 -flto patch speeds up 456.hmmer by 5% from 803s to 765s.
> 
> Two observations from this:
> 1. 456.hmmer is very sensitive to this optimisation
> 2. LTO screws up on 456.hmmer.
> 
> --
> Maxim Kuvyrkov
> https://www.linaro.org/
> 
>> On 29 Sep 2021, at 14:06, Maxim Kuvyrkov  wrote:
>> 
>> Hi Stanislav,
>> 
>> Just FYI.  Your original patch improved 456.hmmer by 5%, that's a nice speed 
>> up!
>> 
>> --
>> Maxim Kuvyrkov
>> https://www.linaro.org/
>> 
>>> On 28 Sep 2021, at 08:21, ci_not...@linaro.org wrote:
>>> 
>>> After llvm commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
>>> Author: Stanislav Mekhanoshin 
>>> 
>>>  Revert "Allow rematerialization of virtual reg uses"
>>> 
>>> the following benchmarks slowed down by more than 2%:
>>> - 456.hmmer slowed down by 5% from 7649 to 8028 perf samples
>>> 
>>> Below reproducer instructions can be used to re-build both "first_bad" and 
>>> "last_good" cross-toolchains used in this bisection.  Naturally, the 
>>> scripts will fail when triggering benchmarking jobs if you don't have 
>>> access to Linaro TCWG CI.
>>> 
>>> For your convenience, we have uploaded tarballs with pre-processed source 
>>> and assembly files at:
>>> - First_bad save-temps: 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fci.linaro.org%2Fjob%2Ftcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO%2F16%2Fartifact%2Fartifacts%2Fbuild-08d7eec06e8cf5c15a96ce11f311f1480291a441%2Fsave-temps%2F&data=04%7C01%7CStanislav.Mekhanoshin%40amd.com%7C06739cf07d704b0ae9c808d9833ab4db%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C63768511045240202

Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-09-30 Thread Maxim Kuvyrkov
Hi Stanislav,

Just FYI.  Your original patch improved 456.hmmer by 5%, that’s a nice speed up!

--
Maxim Kuvyrkov
https://www.linaro.org

> On 28 Sep 2021, at 08:21, ci_not...@linaro.org wrote:
> 
> After llvm commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
> Author: Stanislav Mekhanoshin 
> 
>Revert "Allow rematerialization of virtual reg uses"
> 
> the following benchmarks slowed down by more than 2%:
> - 456.hmmer slowed down by 5% from 7649 to 8028 perf samples
> 
> Below reproducer instructions can be used to re-build both "first_bad" and 
> "last_good" cross-toolchains used in this bisection.  Naturally, the scripts 
> will fail when triggering benchmarking jobs if you don't have access to 
> Linaro TCWG CI.
> 
> For your convenience, we have uploaded tarballs with pre-processed source and 
> assembly files at:
> - First_bad save-temps: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/save-temps/
> - Last_good save-temps: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/save-temps/
> - Baseline save-temps: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-baseline/save-temps/
> 
> Configuration:
> - Benchmark: SPEC CPU2006
> - Toolchain: Clang + Glibc + LLVM Linker
> - Version: all components were built from their tip of trunk
> - Target: arm-linux-gnueabihf
> - Compiler flags: -O2 -flto -marm
> - Hardware: NVidia TK1 4x Cortex-A15
> 
> This benchmarking CI is work-in-progress, and we welcome feedback and 
> suggestions at linaro-toolchain@lists.linaro.org .  In our improvement plans 
> is to add support for SPEC CPU2017 benchmarks and provide "perf 
> report/annotate" data behind these reports.
> 
> THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, 
> REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
> 
> This commit has regressed these CI configurations:
> - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2_LTO
> 
> First_bad build: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/
> Last_good build: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/
> Baseline build: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-baseline/
> Even more details: 
> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/
> 
> Reproduce builds:
> 
> mkdir investigate-llvm-08d7eec06e8cf5c15a96ce11f311f1480291a441
> cd investigate-llvm-08d7eec06e8cf5c15a96ce11f311f1480291a441
> 
> # Fetch scripts
> git clone https://git.linaro.org/toolchain/jenkins-scripts
> 
> # Fetch manifests and test.sh script
> mkdir -p artifacts/manifests
> curl -o artifacts/manifests/build-baseline.sh \
>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/manifests/build-baseline.sh \
>   --fail
> curl -o artifacts/manifests/build-parameters.sh \
>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/manifests/build-parameters.sh \
>   --fail
> curl -o artifacts/test.sh \
>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/test.sh \
>   --fail
> chmod +x artifacts/test.sh
> 
> # Reproduce the baseline build (build all pre-requisites)
> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
> 
> # Save baseline build state (which is then restored in artifacts/test.sh)
> mkdir -p ./bisect
> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ \
>   --exclude /llvm/ ./ ./bisect/baseline/
> 
> cd llvm
> 
> # Reproduce first_bad build
> git checkout --detach 08d7eec06e8cf5c15a96ce11f311f1480291a441
> ../artifacts/test.sh
> 
> # Reproduce last_good build
> git checkout --detach e8e2edd8ca88f8b0a7dba141349b2aa83284f3af
> ../artifacts/test.sh
> 
> cd ..
> 
> 
> Full commit (up to 1000 lines):
> 
> commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
> Author: Stanislav Mekhanoshin 
> Date:   Fri Sep 24 09:53:51 2021 -0700
> 
>Revert "Allow rematerialization of virtual reg uses"
> 
>Reverted due to two distcint performance regression reports.
> 
>This reverts commit 92c1fd19abb15bc68b1127a26137a69e033cdb39.
> ---
> llvm/include/llvm/CodeGen/TargetInstrInfo.h|   12 +-
> llvm/lib/CodeGen/TargetInstrInfo.cpp   |9 +-
> llvm/test/CodeGen/AMDGPU/remat-sop.mir |   60 -
> llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll

[TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-09-30 Thread ci_notify
After llvm commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
Author: Stanislav Mekhanoshin 

Revert "Allow rematerialization of virtual reg uses"

the following benchmarks slowed down by more than 2%:
- 456.hmmer slowed down by 5% from 7649 to 8028 perf samples
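
For reference, the percentage above can be reproduced from the two sample 
counts with a small helper like the one sketched below. The name pct_change 
is hypothetical and is not part of the CI scripts; the sketch assumes the 
reported figure is the plain relative change between the two counts.

```shell
# pct_change OLD NEW: print the relative change from OLD to NEW in percent.
# Hypothetical helper for illustration; not part of the TCWG CI scripts.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%.2f\n", (new - old) / old * 100 }'
}

# perf samples for 456.hmmer before and after the revert
pct_change 7649 8028   # prints 4.95
```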

The reproducer instructions below can be used to re-build both "first_bad" and 
"last_good" cross-toolchains used in this bisection.  Naturally, the scripts 
will fail when triggering benchmarking jobs if you don't have access to Linaro 
TCWG CI.

For your convenience, we have uploaded tarballs with pre-processed source and 
assembly files at:
- First_bad save-temps: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/save-temps/
- Last_good save-temps: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/save-temps/
- Baseline save-temps: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-baseline/save-temps/

Configuration:
- Benchmark: SPEC CPU2006
- Toolchain: Clang + Glibc + LLVM Linker
- Version: all components were built from their tip of trunk
- Target: arm-linux-gnueabihf
- Compiler flags: -O2 -flto -marm
- Hardware: NVidia TK1 4x Cortex-A15

This benchmarking CI is work-in-progress, and we welcome feedback and 
suggestions at linaro-toolchain@lists.linaro.org .  Our improvement plans 
include adding support for SPEC CPU2017 benchmarks and providing "perf 
report/annotate" data behind these reports.

THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, REPRODUCTION 
INSTRUCTIONS, AND THE RAW COMMIT.

This commit has regressed these CI configurations:
 - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2_LTO

First_bad build: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/
Last_good build: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/
Baseline build: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-baseline/
Even more details: 
https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/

Reproduce builds:

mkdir investigate-llvm-08d7eec06e8cf5c15a96ce11f311f1480291a441
cd investigate-llvm-08d7eec06e8cf5c15a96ce11f311f1480291a441

# Fetch scripts
git clone https://git.linaro.org/toolchain/jenkins-scripts

# Fetch manifests and test.sh script
mkdir -p artifacts/manifests
curl -o artifacts/manifests/build-baseline.sh \
  https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/manifests/build-baseline.sh \
  --fail
curl -o artifacts/manifests/build-parameters.sh \
  https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/manifests/build-parameters.sh \
  --fail
curl -o artifacts/test.sh \
  https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/test.sh \
  --fail
chmod +x artifacts/test.sh

# Reproduce the baseline build (build all pre-requisites)
./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh

# Save baseline build state (which is then restored in artifacts/test.sh)
mkdir -p ./bisect
rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ \
  --exclude /llvm/ ./ ./bisect/baseline/

cd llvm

# Reproduce first_bad build
git checkout --detach 08d7eec06e8cf5c15a96ce11f311f1480291a441
../artifacts/test.sh

# Reproduce last_good build
git checkout --detach e8e2edd8ca88f8b0a7dba141349b2aa83284f3af
../artifacts/test.sh

cd ..


Full commit (up to 1000 lines):

commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
Author: Stanislav Mekhanoshin 
Date:   Fri Sep 24 09:53:51 2021 -0700

Revert "Allow rematerialization of virtual reg uses"

Reverted due to two distinct performance regression reports.

This reverts commit 92c1fd19abb15bc68b1127a26137a69e033cdb39.
---
 llvm/include/llvm/CodeGen/TargetInstrInfo.h        |   12 +-
 llvm/lib/CodeGen/TargetInstrInfo.cpp               |    9 +-
 llvm/test/CodeGen/AMDGPU/remat-sop.mir             |   60 -
 llvm/test/CodeGen/ARM/arm-shrink-wrapping-linux.ll |   28 +-
 llvm/test/CodeGen/ARM/funnel-shift-rot.ll          |   32 +-
 llvm/test/CodeGen/ARM/funnel-shift.ll              |   30 +-
 .../test/CodeGen/ARM/illegal-bitfield-loadstore.ll |   30 +-
 llvm/test/CodeGen/ARM/neon-copy.ll                 |   10 +-
 llvm/test/CodeGen/Mips/llvm-ir/ashr.ll             |  227 +-
 llvm/test/CodeGen/Mips/llvm-ir/lshr.ll             |  206 +-
 llvm/test/Cod

Re: [TCWG CI] 456.hmmer slowed down by 5% after llvm: Revert "Allow rematerialization of virtual reg uses"

2021-09-30 Thread Maxim Kuvyrkov
I thought the speed-up and slow-down from "Allow rematerialization of virtual 
reg uses" were for different benchmarks, but they are both for the same 
benchmark, 456.hmmer, built with different compilation flags.

- At -O2 the patch slows down 456.hmmer by 5% from 751s to 771s.
- At -O2 -flto the patch speeds up 456.hmmer by 5% from 803s to 765s.

Two observations from this:
1. 456.hmmer is very sensitive to this optimisation
2. LTO screws up on 456.hmmer.
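
For reference, the relative changes behind these figures can be recomputed 
from the raw numbers.  This is only a minimal sketch: pct_change is a 
hypothetical helper name, not part of the TCWG CI scripts, and it takes the 
quoted timings and perf-sample counts at face value.

```shell
#!/bin/sh
# Hypothetical helper (not part of jenkins-scripts): relative change in
# percent from a "before" measurement to an "after" measurement.
# A positive result means the "after" value is larger, i.e. slower for
# run times and perf-sample counts.
pct_change() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.2f\n", (after - before) / before * 100 }'
}

pct_change 751 771     # -O2 run times quoted above, in seconds
pct_change 803 765     # -O2 -flto run times quoted above, in seconds
pct_change 7649 8028   # perf samples from the quoted CI report
```

Run as written, this prints 2.66, -4.73 and 4.95.  The -O2 -flto timings and 
the perf-sample counts both round to the quoted 5%, while the -O2 timings as 
given work out to roughly 2.7%, so one of the figures in that bullet may be 
rounded or mistyped.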

--
Maxim Kuvyrkov
https://www.linaro.org

> On 29 Sep 2021, at 14:06, Maxim Kuvyrkov  wrote:
> 
> Hi Stanislav,
> 
> Just FYI.  Your original patch improved 456.hmmer by 5%, that’s a nice speed 
> up!
> 
> --
> Maxim Kuvyrkov
> https://www.linaro.org
> 
>> On 28 Sep 2021, at 08:21, ci_not...@linaro.org wrote:
>> 
>> After llvm commit 08d7eec06e8cf5c15a96ce11f311f1480291a441
>> Author: Stanislav Mekhanoshin 
>> 
>>   Revert "Allow rematerialization of virtual reg uses"
>> 
>> the following benchmarks slowed down by more than 2%:
>> - 456.hmmer slowed down by 5% from 7649 to 8028 perf samples
>> 
>> The reproducer instructions below can be used to re-build both "first_bad" 
>> and "last_good" cross-toolchains used in this bisection.  Naturally, the 
>> scripts will fail when triggering benchmarking jobs if you don't have access 
>> to Linaro TCWG CI.
>> 
>> For your convenience, we have uploaded tarballs with pre-processed source 
>> and assembly files at:
>> - First_bad save-temps: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/save-temps/
>> - Last_good save-temps: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/save-temps/
>> - Baseline save-temps: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-baseline/save-temps/
>> 
>> Configuration:
>> - Benchmark: SPEC CPU2006
>> - Toolchain: Clang + Glibc + LLVM Linker
>> - Version: all components were built from their tip of trunk
>> - Target: arm-linux-gnueabihf
>> - Compiler flags: -O2 -flto -marm
>> - Hardware: NVidia TK1 4x Cortex-A15
>> 
>> This benchmarking CI is a work in progress, and we welcome feedback and 
>> suggestions at linaro-toolchain@lists.linaro.org.  Our improvement plans 
>> include adding support for SPEC CPU2017 benchmarks and providing "perf 
>> report/annotate" data behind these reports.
>> 
>> THIS IS THE END OF INTERESTING STUFF.  BELOW ARE LINKS TO BUILDS, 
>> REPRODUCTION INSTRUCTIONS, AND THE RAW COMMIT.
>> 
>> This commit has regressed these CI configurations:
>> - tcwg_bmk_llvm_tk1/llvm-master-arm-spec2k6-O2_LTO
>> 
>> First_bad build: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-08d7eec06e8cf5c15a96ce11f311f1480291a441/
>> Last_good build: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-e8e2edd8ca88f8b0a7dba141349b2aa83284f3af/
>> Baseline build: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/build-baseline/
>> Even more details: 
>> https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/
>> 
>> Reproduce builds:
>> 
>> mkdir investigate-llvm-08d7eec06e8cf5c15a96ce11f311f1480291a441
>> cd investigate-llvm-08d7eec06e8cf5c15a96ce11f311f1480291a441
>> 
>> # Fetch scripts
>> git clone https://git.linaro.org/toolchain/jenkins-scripts
>> 
>> # Fetch manifests and test.sh script
>> mkdir -p artifacts/manifests
>> curl -o artifacts/manifests/build-baseline.sh \
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/manifests/build-baseline.sh \
>>   --fail
>> curl -o artifacts/manifests/build-parameters.sh \
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/manifests/build-parameters.sh \
>>   --fail
>> curl -o artifacts/test.sh \
>>   https://ci.linaro.org/job/tcwg_bmk_ci_llvm-bisect-tcwg_bmk_tk1-llvm-master-arm-spec2k6-O2_LTO/16/artifact/artifacts/test.sh \
>>   --fail
>> chmod +x artifacts/test.sh
>> 
>> # Reproduce the baseline build (build all pre-requisites)
>> ./jenkins-scripts/tcwg_bmk-build.sh @@ artifacts/manifests/build-baseline.sh
>> 
>> # Save baseline build state (which is then restored in artifacts/test.sh)
>> mkdir -p ./bisect
>> rsync -a --del --delete-excluded --exclude /bisect/ --exclude /artifacts/ \
>>   --exclude /llvm/ ./ ./bisect/baseline/
>> 
>> cd llvm
>> 
>> # Reproduce first_bad build
>> git checkout --detach 08d7eec06e8cf5c15a96ce11f311f1480291a441
>> ../artifacts/test.sh
>> 
>> # Reproduce last_good build
>> git checkout --detach e8e2edd8ca88f8b0a7dba141349b2aa8