On Tue, 25 Feb 2020 at 14:44, Kyrill Tkachov <kyrylo.tkac...@foss.arm.com> wrote: > > Hi Christophe, > > On 2/24/20 2:16 PM, Christophe Lyon wrote: > > Ping? > > > > I'd also like to backport this and the main patch (svn r279463, > > r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3) > > to the gcc-9 branch. > > > > I found the problem addressed by this patch while validating the > > backport to gcc-9: although the patch applies cleanly except for > > testcases dg directives, there were some failures which I could > > finally reproduce on trunk with -fdisable-rtl-fwprop2. > > > > Here is a summary of the validations I ran using --target arm-eabi: > > * without my patches: > > (1) --with-cpu cortex-m0 > > (2) --with-cpu cortex-m4 > > (3) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code (to build the > > libs with -mpure-code) > > (4) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code > > --target-board=-mpure-code (to also run the tests with -mpure-code) > > > > * with my patches: > > (5) --with-cpu cortex-m0 CFLAGS_FOR_TARGET=-mpure-code > > --target-board=-mpure-code > > (6) --with-cpu cortex-m4 CFLAGS_FOR_TARGET=-mpure-code > > --target-board=-mpure-code > > > > Comparing (4) and (6) ensured that my (v6m) patches introduce no > > regression in v7m cases. > > > > Comparison of (1) vs (5) gave results similar to (2) vs (6), there's a > > bit of noise because some tests cases don't cope well with -mpure-code > > despite my previous testsuite-only patch (svn r277828) > > > > Comparison of (1) vs (2) gave similar results to (5) vs (6). > > > > Ideally, we may also want to backport svn r277828 (testsuite-only > > patch, to handle -mpure-code better), but that's not mandatory. > > > > In summary, is this patch OK for trunk? > > > > Are this patch and r279463, > > r10-5505-ge24f6408df1e4c5e8c09785d7b488c492dfb68b3 OK to backport to > > gcc-9? > > > > This is okay with me. > > I don't think any of the branches are frozen at the moment, so it should > be okay to backport it. >
Cool, thanks. Pushed as r10-6845-ga71f2193d0df71a86c4743aab22891bb0003112e and backported as r9-8277-g9c5db942ca332494ef3d79d4a7d494d83cad8304 r9-8278-g7edf9fa1c5f0e05c22e1d719658ed903fe2b44f4 Thanks, Christophe > Thanks, > > Kyrill > > > > Thanks, > > > > Christophe > > > > On Thu, 13 Feb 2020 at 11:14, Christophe Lyon > > <christophe.l...@linaro.org> wrote: > > > > > > On Mon, 10 Feb 2020 at 17:45, Richard Earnshaw (lists) > > > <richard.earns...@arm.com> wrote: > > > > > > > > On 10/02/2020 09:27, Christophe Lyon wrote: > > > > > On Fri, 7 Feb 2020 at 17:55, Richard Earnshaw (lists) > > > > > <richard.earns...@arm.com> wrote: > > > > >> > > > > >> On 07/02/2020 16:43, Christophe Lyon wrote: > > > > >>> On Fri, 7 Feb 2020 at 14:49, Richard Earnshaw (lists) > > > > >>> <richard.earns...@arm.com> wrote: > > > > >>>> > > > > >>>> On 07/02/2020 13:19, Christophe Lyon wrote: > > > > >>>>> When running the testsuite with -fdisable-rtl-fwprop2 and > > -mpure-code > > > > >>>>> for cortex-m0, I noticed that some testcases were failing > > because we > > > > >>>>> still generate "ldr rX, .LCY", which is what we want to > > avoid with > > > > >>>>> -mpure-code. This is latent since a recent improvement in fwprop > > > > >>>>> (PR88833). > > > > >>>>> > > > > >>>>> In this patch I change the thumb1_movsi_insn pattern so that > > it emits > > > > >>>>> the desired instruction sequence when > > arm_disable_literal_pool is set. > > > > >>>>> > > > > >>>>> I tried to add a define_split instead, but couldn't make it > > work: the > > > > >>>>> compiler then complains it cannot split the instruction, > > while my new > > > > >>>>> define_split accepts the same operand types as > > thumb1_movsi_insn: > > > > >>>>> > > > > >>>>> c-c++-common/torture/complex-sign-mixed-add.c:41:1: error: > > could not split insn > > > > >>>>> (insn 2989 425 4844 (set (reg/f:SI 3 r3 [1342]) > > > > >>>>> (symbol_ref/u:SI ("*.LC6") [flags 0x2])) 836 > > {*thumb1_movsi_insn} > > > > >>>>> (expr_list:REG_EQUIV (symbol_ref/u:SI ("*.LC6") > > [flags 0x2]) > > > > >>>>> (nil))) > > > > >>>>> during RTL pass: final > > > > >>>>> > > > > >>>>> (define_split > > > > >>>>> [(set (match_operand:SI 0 "register_operand" "") > > > > >>>>> (match_operand:SI 1 "general_operand" ""))] > > > > >>>>> "TARGET_THUMB1 > > > > >>>>> && arm_disable_literal_pool > > > > >>>>> && GET_CODE (operands[1]) == SYMBOL_REF" > > > > >>>>> [(clobber (const_int 0))] > > > > >>>>> " > > > > >>>>> gen_thumb1_movsi_symbol_ref(operands[0], operands[1]); > > > > >>>>> DONE; > > > > >>>>> " > > > > >>>>> ) > > > > >>>>> and I put this in thumb1_movsi_insn: > > > > >>>>> if (GET_CODE (operands[1]) == SYMBOL_REF && > > arm_disable_literal_pool) > > > > >>>>> { > > > > >>>>> return \"#\"; > > > > >>>>> } > > > > >>>>> return \"ldr\\t%0, %1\"; > > > > >>>>> > > > > >>>>> 2020-02-07 Christophe Lyon <christophe.l...@linaro.org> > > > > >>>>> > > > > >>>>> * config/arm/thumb1.md (thumb1_movsi_insn): Fix > > ldr alternative to > > > > >>>>> work with -mpure-code. > > > > >>>>> > > > > >>>> > > > > >>>> + case 0: > > > > >>>> + case 1: > > > > >>>> + return \"movs %0, %1\"; > > > > >>>> + case 2: > > > > >>>> + return \"movw %0, %1\"; > > > > >>>> > > > > >>>> This is OK, but please replace the hard tab in the strings > > for MOVS/MOVW > > > > >>>> with \\t. > > > > >>>> > > > > >>> > > > > >>> OK that was merely a cut & paste from the existing code. > > > > >>> > > > > >>> I'm concerned that the length attribute is becoming wrong with my > > > > >>> patch, isn't this a problem? > > > > >>> > > > > >> > > > > >> Potentially yes. The branch range code needs this to handle > > overly long > > > > >> jumps correctly. > > > > >> > > > > > > > > > > Do you mean that the probability of problems due to that shortcoming > > > > > is low enough that I can commit my patch? > > > > > > > > It's hard to say that. The number of instructions you generate > > for this > > > > is significant, so it likely will throw off the calculations and > > > > somebody will get unlucky. On the other hand, I don't think we should > > > > pessimize code for the non-pure-code variants by inflating the > > size for > > > > this unconditionally. > > > > > > > > It seems there are two ways to fix this. > > > > > > > > 1) create a new alternative for the pure-code variant with its own > > > > length attribute, then only enable that for the case you need. That > > > > would also allow you to go back to the templated asm format. > > > > 2) use a level of indirection to calculate the length - I haven't > > tried > > > > this, but it would be something like: > > > > > > > > - create a new attribute - lets say default_length > > > > - rename length for this pattern to default_length > > > > - create another new attribute - lets say purecode_length, add > > lengths > > > > for each alternative but adjusted for the purecode variant. > > > > - make the length attribute for this pattern select based on the > > > > disable_literal_pool variable between the default_length and > > > > purecode_length attributes. > > > > > > > > > > Hi Richard, > > > > > > Thanks for the suggestions. I've implemented option (1) above, does > > > it match what you had in mind? > > > > > > Tested on arm-eabi, with -mpure-code forced, no regression. Manually > > > checked that the expected sequence is generated with > > > -fdisable-rtl-fwprop2. > > > > > > Thanks, > > > > > > Christophe > > > > > > > R. > > > > > > > > > > > > > > Thanks, > > > > > > > > > > Christophe > > > > > > > > > >> R. > > > > >> > > > > >>> Thanks, > > > > >>> > > > > >>> Christophe > > > > >>> > > > > >>>> R. > > > > >> > > > >