On 12 October 2016 at 11:14, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > > On 12/10/16 09:59, Christophe Lyon wrote: >> >> Hi Kyrill, >> >> On 7 October 2016 at 17:00, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> >> wrote: >>> >>> Hi Christophe, >>> >>> >>> On 07/09/16 21:05, Christophe Lyon wrote: >>>> >>>> Hi, >>>> >>>> The attached patch is a first part to solve PR 67591: it removes >>>> several occurrences of "IT blocks containing 32-bit Thumb >>>> instructions are deprecated in ARMv8" messages in the >>>> gcc/g++/libstdc++/fortran testsuites. >>>> >>>> It does not remove them all yet. This patch only modifies the >>>> *cmp_and, *cmp_ior, *ior_scc_scc, *ior_scc_scc_cmp, >>>> *and_scc_scc and *and_scc_scc_cmp patterns. >>>> Additional work is required in sub_shiftsi etc, at least. >>>> I've started looking at these, but I decided I could already >>>> post this self-contained patch to check if this implementation >>>> is OK. >>>> >>>> Regarding *cmp_and and *cmp_ior patterns, the addition of the >>>> enabled_for_depr_it attribute is aggressive in the sense that it keeps >>>> only the alternatives with 'l' and 'Py' constraints, while in some >>>> cases the constraints could be relaxed. Indeed, these 2 patterns can >>>> swap their input comparisons, meaning that any of them can be emitted >>>> in the IT-block, and is thus subject to the ARMv8 deprecation. >>>> The generated code is possibly suboptimal in the cases where the >>>> operands are not swapped, since 'r' could be used. >>>> >>>> Cross-tested on arm-none-linux-gnueabihf with -mthumb/-march=armv8-a >>>> and --with-cpu=cortex-a57 --with-mode=thumb, showing only improvements: >>>> >>>> >>>> http://people.linaro.org/~christophe.lyon/cross-validation/gcc-test-patches/239850-depr-it-4/report-build-info.html >>>> >>>> Bootstrapped OK on armv8l HW. >>>> >>>> Is this OK? >>>> >>>> Thanks, >>>> >>>> Christophe >>> >>> >>> (define_insn_and_split "*ior_scc_scc" >>> - [(set (match_operand:SI 0 "s_register_operand" "=Ts") >>> + [(set (match_operand:SI 0 "s_register_operand" "=Ts,Ts") >>> (ior:SI (match_operator:SI 3 "arm_comparison_operator" >>> - [(match_operand:SI 1 "s_register_operand" "r") >>> - (match_operand:SI 2 "arm_add_operand" "rIL")]) >>> + [(match_operand:SI 1 "s_register_operand" "r,l") >>> + (match_operand:SI 2 "arm_add_operand" "rIL,lPy")]) >>> (match_operator:SI 6 "arm_comparison_operator" >>> - [(match_operand:SI 4 "s_register_operand" "r") >>> - (match_operand:SI 5 "arm_add_operand" "rIL")]))) >>> + [(match_operand:SI 4 "s_register_operand" "r,l") >>> + (match_operand:SI 5 "arm_add_operand" "rIL,lPy")]))) >>> >>> Can you please put the more restrictive alternatives (lPy) first? >>> Same with the other patterns your patch touches. >>> Ok with that change if a rebased testing run is ok. >>> Sorry for the delay in reviewing. >>> >> OK, I will update my patch accordingly. >> >> However, when I discussed this with Ramana during the Cauldron, >> he requested benchmark results. So far, I was able to run spec2006 >> on an APM machine, and I'm seeing performance changes in the >> range 11% improvement (465.tonto) to 7% degradation (433.milc). >> >> Would that be acceptable? > > > Those sound like quite large swings. Indeed, but most are in the -1%-+1% range.
> Are you sure the machine was not running anything else at the time > or playing tricks with frequency scaling? No, I had no such guarantee. I used this machine temporarily, first to check that bootstrap worked. I planed to use another board with an A57 "standard" microarch for proper benchmarking, but I'm not sure when I'll have access to it (wrt to e/o gcc stage1), that's why I reported these early figures. > Did all iterations of SPEC show a consistent difference? > > If the changes are consistent, could you have a look at the codegen > to see if there are any clues to the differences? I will update my patch according to your comment, re-run the bench and have a deeper look at the codegen differences. > I'd like to get an explanation for these differences before committing > this patch. If they are just an effect of the more restrictive constraints > then there may be not much we can do, but I'd like to make sure there's not > anything else unexpected going on. > Thanks, Christophe >> >> The number of warnings (IT blocks containing 32-bit Thumb instructions >> are deprecated in ARMv8) >> was 712 without my patch and 122 with it. (using the hosts's binutils >> 2.24/ubuntu). >> I expected some warning, since as I said earlier other patterns need >> to be updated. > > > Understood. That's fine. > > Thanks, > Kyrill > > >> >> Christophe >> >> >>> Thanks for improving this area! >>> Kyrill >>> >