Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
On 02/28/2015 at 09:02 AM, Denis Chertykov wrote:
> 2015-02-27 1:45 GMT+03:00 Steven Bosscher stevenb@gmail.com:
>> On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
>>> Take #2 introduces a new, avr-specific rtl pass whose sole purpose
>>> is to rectify notes. The pass is scheduled right before cfg goes
>>> down (right before .*free_cfg) so that cfg and hence df machinery
>>> is available.
>>>
>>> Regression tests look fine and for the test case the patches
>>> produce correct code and correct insn length.
>>
>> Sorry for party-pooping, but it seems to me that the real bug is
>> that the target is lying to reload.
>>
>> IIUC the AVR port selects the insn alternative after register
>> allocation (after reload). It bases its selection on REG_DEAD notes.
>> In PR64331 an alternative is used that clobbers r20 that has a
>> REG_DEAD note, but r20 is not actually dead because hardreg-cprop
>> has propagated it forward without adjusting the note.
>>
>> The normal way of things is that the insn alternative is selected in
>> reload (or in LRA) and that the clobbers are added as necessary. In
>> PR64331, an alternative for insn r17 would be selected that has a
>> CLOBBER for r20, preventing hardreg-cprop from propagating r20.
>>
>> Selecting insns based on REG-notes is dangerous business. Lying to
>> reload and to post-RA passes is a mortal sin, the compiler will
>> punish you. There is no guarantee that nothing will change between
>> your new pass to recompute notes, and the final pass that emits the
>> insns.
>>
>> It's not my port, for sure, but I would look for a real fix instead:
>> Don't select insns to output based on unreliable information like
>> REG-notes.
>
> Steven is right. We will have an endless fight with this problem.

Just consider the patch as a prerequisite for further changes on top of
it, as outlined in

https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01745.html

The current patches are operating correctly and do nothing wrong.

Completely rewriting reg_unused_after will be much more work and more
error prone (e.g. unrecognizable insn, optimization flaws).

> Better to completely drop `reg_unused_after'. (I know that it is used
> around 40 times in the port.)
>
> What do you think, Georg?

I'd prefer to fix it, cf. link above.

Johann

> Denis.
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
2015-03-02 15:32 GMT+03:00 Georg-Johann Lay a...@gjlay.de:
> On 02/28/2015 at 09:02 AM, Denis Chertykov wrote:
>> 2015-02-27 1:45 GMT+03:00 Steven Bosscher stevenb@gmail.com:
>>> On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
>>>> Take #2 introduces a new, avr-specific rtl pass whose sole purpose
>>>> is to rectify notes. The pass is scheduled right before cfg goes
>>>> down (right before .*free_cfg) so that cfg and hence df machinery
>>>> is available.
>>>>
>>>> Regression tests look fine and for the test case the patches
>>>> produce correct code and correct insn length.
>>>
>>> Sorry for party-pooping, but it seems to me that the real bug is
>>> that the target is lying to reload.
>>>
>>> IIUC the AVR port selects the insn alternative after register
>>> allocation (after reload). It bases its selection on REG_DEAD
>>> notes. In PR64331 an alternative is used that clobbers r20 that has
>>> a REG_DEAD note, but r20 is not actually dead because hardreg-cprop
>>> has propagated it forward without adjusting the note.
>>>
>>> The normal way of things is that the insn alternative is selected
>>> in reload (or in LRA) and that the clobbers are added as necessary.
>>> In PR64331, an alternative for insn r17 would be selected that has
>>> a CLOBBER for r20, preventing hardreg-cprop from propagating r20.
>>>
>>> Selecting insns based on REG-notes is dangerous business. Lying to
>>> reload and to post-RA passes is a mortal sin, the compiler will
>>> punish you. There is no guarantee that nothing will change between
>>> your new pass to recompute notes, and the final pass that emits the
>>> insns.
>>>
>>> It's not my port, for sure, but I would look for a real fix
>>> instead: Don't select insns to output based on unreliable
>>> information like REG-notes.
>>
>> Steven is right. We will have an endless fight with this problem.
>
> Just consider the patch as a prerequisite for further changes on top
> of it, as outlined in
>
> https://gcc.gnu.org/ml/gcc-patches/2015-02/msg01745.html
>
> The current patches are operating correctly and do nothing wrong.
>
> Completely rewriting reg_unused_after will be much more work and more
> error prone (e.g. unrecognizable insn, optimization flaws).
>
>> Better to completely drop `reg_unused_after'. (I know that it is
>> used around 40 times in the port.)
>>
>> What do you think, Georg?
>
> I'd prefer to fix it, cf. link above.

Ok. Please fix it.

Denis.

PS: As I remember, I copied `reg_unused_after' from the sparc port
around 10 years ago. `reg_unused_after' was removed from sparc a few
years ago, so only the avr and sh ports still use it. That is dangerous
because GCC core developers are not concerned with two small ports
(ports with a relatively small user base).
At least we can remove `reg_unused_after' at any time. (It's relatively
easy.)
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
On Sat, Feb 28, 2015 at 5:38 PM, Georg-Johann Lay wrote:
> On 02/26/2015 at 11:45 PM, Steven Bosscher wrote:
>> On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
>>> Take #2 introduces a new, avr-specific rtl pass whose sole purpose
>>> is to rectify notes. The pass is scheduled right before cfg goes
>>> down (right before .*free_cfg) so that cfg and hence df machinery
>>> is available.
>>>
>>> Regression tests look fine and for the test case the patches
>>> produce correct code and correct insn length.
>>
>> Sorry for party-pooping, but it seems to me that the real bug is
>> that the target is lying to reload.
>>
>> IIUC the AVR port selects the insn alternative after register
>> allocation (after reload). It bases its selection on REG_DEAD notes.
>> In PR64331 an alternative is used that clobbers r20 that has a
>> REG_DEAD note, but r20 is not actually dead because hardreg-cprop
>> has propagated it forward without adjusting the note.
>
> It's not actually about constraint alternatives. Let me give an
> example: Testing HI for 0. The usual sequence would be
>
>     cc0 = reg.low == 0
>     cc0 = cc0 && reg.high == 0
>
> which costs 2 instructions. If reg is unused after, then ORing can be
> used and cc0 will be set as a side effect. This costs 1 insn:
>
>     cc0 = (reg.low |= reg.high)
>
> Using alternatives would double their number, i.e. 14 instead of 7
> for *cmphi. These constraint alternatives would have to express
>
> 1) reg-alloc, please, use alternative #1 (with clobber of reg) if the
>    register is unused after
>
> 2) reg-alloc, please, use alternative #2 (not clobbering reg) if the
>    register is used after
>
> If we assume for a moment that we have a *testhi insn and the old
> *tsthi had just 1 alternative:
>
> (define_insn "*tsthi"
>   [(set (cc0)
>         (compare (match_operand:ALL2 0 "register_operand" "r")
>                  (const_int 0)))]
>   ...)
>
> Then the new one would be something like
>
> (define_insn "*tsthi"
>   [(set (cc0)
>         (compare (match_operand:ALL2 0 "register_operand" "r,r")
>                  (const_int 0)))
>    (clobber (match_scratch:HI 1 "=0,X"))]
>   ...)
>
> But how can I express 1) and 2) ?

I don't think you can. Other ports express such transformations using
define_peephole2 and peep2_reg_dead_p.

> Currently, my preferred approach is a new drop-in replacement for the
> old reg_unused_after which uses clobbers to decide whether or not op 0
> is still needed. That way, reg-alloc can work like before and there is
> no need to implement dozens of new constraint alternatives across the
> md files.

The problem with this approach is that it may break at random. There's
just no guarantee that it will work, because you're relying on
information that just isn't valid anymore.

Unfortunately I don't know enough about CC0-targets to suggest an
alternative.

Ciao!
Steven
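For readers following the HI zero-test example quoted in this message, here is a minimal sketch in plain C of the trade-off under discussion. It assumes a toy model in which a 16-bit value lives in two 8-bit registers. The type hi_reg, both function names, and the counts written to insn_count are illustrative assumptions (the counts mirror the "2 instructions" versus "1 insn" figures from the thread); none of this is code from avr.c or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model (an assumption for illustration): a 16-bit HImode value
   held in two 8-bit registers.  */
typedef struct
{
  uint8_t low;
  uint8_t high;
} hi_reg;

/* Non-destructive test: looks at both bytes and leaves the register
   intact.  Models the two-instruction sequence from the thread.  */
static bool
hi_is_zero_preserving (const hi_reg *reg, unsigned *insn_count)
{
  assert (reg && insn_count);
  *insn_count = 2;
  return reg->low == 0 && reg->high == 0;
}

/* Destructive test: ORs the high byte into the low byte, so the zero
   flag falls out as a side effect of a single instruction.  The low
   byte is overwritten, hence this is only valid when the caller knows
   the register is unused afterwards.  */
static bool
hi_is_zero_clobbering (hi_reg *reg, unsigned *insn_count)
{
  assert (reg && insn_count);
  *insn_count = 1;
  reg->low |= reg->high;
  return reg->low == 0;
}
```

Both functions return the same answer for the same input; they differ only in cost and in whether the operand survives. That is exactly the reason the choice between them depends on reliable deadness information: picking the clobbering variant for a register that is still live silently corrupts its value.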
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
On 03/02/2015 at 05:12 PM, Steven Bosscher wrote:
> On Sat, Feb 28, 2015 at 5:38 PM, Georg-Johann Lay wrote:
>> On 02/26/2015 at 11:45 PM, Steven Bosscher wrote:
>>> On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
>>>> Take #2 introduces a new, avr-specific rtl pass whose sole purpose
>>>> is to rectify notes. The pass is scheduled right before cfg goes
>>>> down (right before .*free_cfg) so that cfg and hence df machinery
>>>> is available.
>>>>
>>>> Regression tests look fine and for the test case the patches
>>>> produce correct code and correct insn length.
>>>
>>> Sorry for party-pooping, but it seems to me that the real bug is
>>> that the target is lying to reload.
>>>
>>> IIUC the AVR port selects the insn alternative after register
>>> allocation (after reload). It bases its selection on REG_DEAD
>>> notes. In PR64331 an alternative is used that clobbers r20 that has
>>> a REG_DEAD note, but r20 is not actually dead because hardreg-cprop
>>> has propagated it forward without adjusting the note.
>>
>> It's not actually about constraint alternatives. Let me give an
>> example: Testing HI for 0. The usual sequence would be
>>
>>     cc0 = reg.low == 0
>>     cc0 = cc0 && reg.high == 0
>>
>> which costs 2 instructions. If reg is unused after, then ORing can
>> be used and cc0 will be set as a side effect. This costs 1 insn:
>>
>>     cc0 = (reg.low |= reg.high)
>>
>> Using alternatives would double their number, i.e. 14 instead of 7
>> for *cmphi. These constraint alternatives would have to express
>>
>> 1) reg-alloc, please, use alternative #1 (with clobber of reg) if
>>    the register is unused after
>>
>> 2) reg-alloc, please, use alternative #2 (not clobbering reg) if the
>>    register is used after
>>
>> If we assume for a moment that we have a *testhi insn and the old
>> *tsthi had just 1 alternative:
>>
>> (define_insn "*tsthi"
>>   [(set (cc0)
>>         (compare (match_operand:ALL2 0 "register_operand" "r")
>>                  (const_int 0)))]
>>   ...)
>>
>> Then the new one would be something like
>>
>> (define_insn "*tsthi"
>>   [(set (cc0)
>>         (compare (match_operand:ALL2 0 "register_operand" "r,r")
>>                  (const_int 0)))
>>    (clobber (match_scratch:HI 1 "=0,X"))]
>>   ...)
>>
>> But how can I express 1) and 2) ?
>
> I don't think you can. Other ports express such transformations using
> define_peephole2 and peep2_reg_dead_p.

This means duplicating all patterns that currently make use of
reg_unused_after...

With the proposed dead attribute there is no need to duplicate all
patterns. And it is explicit about the insns which want to use that
information: the insn itself carries all needed information :-)

>> Currently, my preferred approach is a new drop-in replacement for
>> the old reg_unused_after which uses clobbers to decide whether or
>> not op 0 is still needed. That way, reg-alloc can work like before
>> and there is no need to implement dozens of new constraint
>> alternatives across the md files.
>
> The problem with this approach is that it may break at random. There's
> just no guarantee that it will work, because you're relying on
> information that just isn't valid anymore.

Would you elaborate on this?

In the proposed solution, the new reg_unused_after just scans the insn
pattern for a clobber of the register of interest. Why isn't that
clobber valid any more?

If a register is dead after a specific insn, isn't it valid to add a
clobber of the respective register to the insn pattern? Why does that
break at random when the pass which adds these clobbers runs before
df / cfg is down and has correct deadness information available?

If some optimization pass, which runs after that clobber has been
added, decides to add an insn which uses the contents of that clobbered
register, then that pass is plain wrong.

The old reg_unused_after has been renamed to avr_reg_unused_after_df and
runs before cfg goes down, hence is basing its decisions on valid df
information -- modulo the fact that reg_unused_after is antique code and
might need to be updated for different reasons. In principle it should
be in order to rely on df at that time (before pass free_cfg),
shouldn't it?

Or am I still missing some gory details for why df / cfg is invalid and
must not be used before pass *free_cfg for some obscure reason?

Johann

> Unfortunately I don't know enough about CC0-targets to suggest an
> alternative.
>
> Ciao!
> Steven
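The "scan the insn pattern for a clobber" idea from this message can be sketched in a few lines of C. This is only a toy model under stated assumptions: toy_code, toy_elem, toy_pattern and toy_reg_clobbered_p are hypothetical names invented for the illustration, registers are plain integers, and multi-register values such as HImode operands are ignored. A real implementation would walk the RTL of the insn pattern instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the elements of an insn
   pattern: each element either sets, uses or clobbers one register.  */
enum toy_code { TOY_SET, TOY_USE, TOY_CLOBBER };

struct toy_elem
{
  enum toy_code code;
  int regno;
};

struct toy_pattern
{
  const struct toy_elem *elems;
  size_t n_elems;
};

/* Return true if PAT carries an explicit clobber of REGNO.  In the
   scheme proposed in the thread, such a clobber would have been added
   by an earlier pass while deadness information was still valid, so
   the query needs no REG_DEAD notes at output time.  */
static bool
toy_reg_clobbered_p (const struct toy_pattern *pat, int regno)
{
  assert (pat != NULL);

  for (size_t i = 0; i < pat->n_elems; i++)
    if (pat->elems[i].code == TOY_CLOBBER && pat->elems[i].regno == regno)
      return true;

  return false;
}
```

The point of the design is that the information travels inside the insn itself: later passes see the clobber and must respect it, whereas a note attached to the insn can go stale without any pass being formally wrong.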
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
2015-02-27 1:45 GMT+03:00 Steven Bosscher stevenb@gmail.com:
> On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
>> Take #2 introduces a new, avr-specific rtl pass whose sole purpose is
>> to rectify notes. The pass is scheduled right before cfg goes down
>> (right before .*free_cfg) so that cfg and hence df machinery is
>> available.
>>
>> Regression tests look fine and for the test case the patches produce
>> correct code and correct insn length.
>
> Sorry for party-pooping, but it seems to me that the real bug is that
> the target is lying to reload.
>
> IIUC the AVR port selects the insn alternative after register
> allocation (after reload). It bases its selection on REG_DEAD notes.
> In PR64331 an alternative is used that clobbers r20 that has a
> REG_DEAD note, but r20 is not actually dead because hardreg-cprop has
> propagated it forward without adjusting the note.
>
> The normal way of things is that the insn alternative is selected in
> reload (or in LRA) and that the clobbers are added as necessary. In
> PR64331, an alternative for insn r17 would be selected that has a
> CLOBBER for r20, preventing hardreg-cprop from propagating r20.
>
> Selecting insns based on REG-notes is dangerous business. Lying to
> reload and to post-RA passes is a mortal sin, the compiler will punish
> you. There is no guarantee that nothing will change between your new
> pass to recompute notes, and the final pass that emits the insns.
>
> It's not my port, for sure, but I would look for a real fix instead:
> Don't select insns to output based on unreliable information like
> REG-notes.

Steven is right. We will have an endless fight with this problem.
Better to completely drop `reg_unused_after'. (I know that it is used
around 40 times in the port.)

What do you think, Georg?

Denis.
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
On 02/26/2015 at 11:45 PM, Steven Bosscher wrote:
> On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
>> Take #2 introduces a new, avr-specific rtl pass whose sole purpose is
>> to rectify notes. The pass is scheduled right before cfg goes down
>> (right before .*free_cfg) so that cfg and hence df machinery is
>> available.
>>
>> Regression tests look fine and for the test case the patches produce
>> correct code and correct insn length.
>
> Sorry for party-pooping, but it seems to me that the real bug is that
> the target is lying to reload.
>
> IIUC the AVR port selects the insn alternative after register
> allocation (after reload). It bases its selection on REG_DEAD notes.
> In PR64331 an alternative is used that clobbers r20 that has a
> REG_DEAD note, but r20 is not actually dead because hardreg-cprop has
> propagated it forward without adjusting the note.

It's not actually about constraint alternatives. Let me give an
example: Testing HI for 0. The usual sequence would be

    cc0 = reg.low == 0
    cc0 = cc0 && reg.high == 0

which costs 2 instructions. If reg is unused after, then ORing can be
used and cc0 will be set as a side effect. This costs 1 insn:

    cc0 = (reg.low |= reg.high)

Using alternatives would double their number, i.e. 14 instead of 7 for
*cmphi. These constraint alternatives would have to express

1) reg-alloc, please, use alternative #1 (with clobber of reg) if the
   register is unused after

2) reg-alloc, please, use alternative #2 (not clobbering reg) if the
   register is used after

If we assume for a moment that we have a *testhi insn and the old
*tsthi had just 1 alternative:

(define_insn "*tsthi"
  [(set (cc0)
        (compare (match_operand:ALL2 0 "register_operand" "r")
                 (const_int 0)))]
  ...)

Then the new one would be something like

(define_insn "*tsthi"
  [(set (cc0)
        (compare (match_operand:ALL2 0 "register_operand" "r,r")
                 (const_int 0)))
   (clobber (match_scratch:HI 1 "=0,X"))]
  ...)

But how can I express 1) and 2) ?

reg-alloc's choices appear to be random from the backend's perspective,
and it fails to produce good results even for much simpler tasks.

The first alternative which matches is the 1st, i.e. it would always
clobber op 0, which results in up to 4 instructions and higher register
pressure if that register is still needed after the insn.

Making alternative 1 more expensive presumably always results in the
2nd alternative, i.e. 2 instructions instead of 1 if op 0 is no longer
needed.

Similarly if we swap alternatives #1 and #2: Then r,X will always match
and appear to be cheaper than r,=0.

Currently, my preferred approach is a new drop-in replacement for the
old reg_unused_after which uses clobbers to decide whether or not op 0
is still needed. That way, reg-alloc can work like before and there is
no need to implement dozens of new constraint alternatives across the
md files.

The clobbers, in turn, are added by a target pass which translates
deadness information to clobbers. The preferred placement of that pass
is the same as for the proposed new avr pass from patch take #2.

Below you find a quick hack that outlines the idea:

A new insn attribute "dead" tags insns which are worth taking into
account for such dead -> clobber translation and provides the operand
number. The new pass scans all insns and translates dead info to
clobbers as requested.

This needs much more work, of course, like adjusting peepholes and
splitters to the new insns with their additional operand.

avr_reg_unused_after_df (the old reg_unused_after) should be rewritten
to use DF info instead of scanning by hand.

Some of the insn templates which used reg_unused_after are move insns;
for these insns it's a bit different.

> The normal way of things is that the insn alternative is selected in
> reload (or in LRA) and that the clobbers are added as necessary. In

Would you outline a concrete example? In particular, how to avoid
clobbers if the operand is still needed after the insn, and use the
clobber alternative if the value is no longer needed? I.e. how would
you express 1) and 2) from above in terms of insn constraints?

> PR64331, an alternative for insn r17 would be selected that has a
> CLOBBER for r20, preventing hardreg-cprop from propagating r20.
>
> Selecting insns based on REG-notes is dangerous business. Lying to
> reload and to post-RA passes is a mortal sin, the compiler will punish
> you. There is no guarantee that nothing will change between your new
> pass to recompute notes, and the final pass that emits the insns.

Ya.

However, for 4.9 I'd still propose patch take #2. It definitely
improves matters, does nothing wrong, and throwing out reg_unused_after
completely will lead to a considerable increase of code size and run
time. And for 4.9 a big change like the one outlined above is way too
intrusive for an already released version, imo.

For 5.0 I am also still proposing the patch. It does nothing wrong and
is the base for a complete
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
On 02/23/2015 at 11:53 AM, Georg-Johann Lay wrote:
> This patch fixes PR64331 which produced wrong code because of outdated
> (too many) REG_DEAD notes. These notes are not (re)computed per
> default, hence do the computation by hand each time
> avr.c:reg_unused_after is called in a different pass.

Let me drop that... Problem was that df relies on cfg which is down
after a specific point (after pass .*free_cfg).

Take #2 introduces a new, avr-specific rtl pass whose sole purpose is to
rectify notes. The pass is scheduled right before cfg goes down (right
before .*free_cfg) so that cfg and hence df machinery is available.

Regression tests look fine and for the test case the patches produce
correct code and correct insn length.

Ok for trunk and 4.9?

Johann

gcc/
    PR target/64331
    * config/avr/avr.c (context.h, tree-pass.h): Include them.
    (avr_pass_data_recompute_notes): New static variable.
    (avr_pass_recompute_notes): New class.
    (avr_register_passes): New static function.
    (avr_option_override): Call it.

gcc/testsuite/
    PR target/64331
    * gcc.target/avr/torture/pr64331.c: New test.

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 220964)
+++ config/avr/avr.c (working copy)
@@ -81,6 +81,8 @@
 #include "basic-block.h"
 #include "df.h"
 #include "builtins.h"
+#include "context.h"
+#include "tree-pass.h"

 /* Maximal allowed offset for an address in the LD command */
 #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE))
@@ -329,6 +331,55 @@ avr_to_int_mode (rtx x)
 }


+static const pass_data avr_pass_data_recompute_notes =
+{
+  RTL_PASS,       // type
+  "",             // name (will be patched)
+  OPTGROUP_NONE,  // optinfo_flags
+  TV_DF_SCAN,     // tv_id
+  0,              // properties_required
+  0,              // properties_provided
+  0,              // properties_destroyed
+  0,              // todo_flags_start
+  TODO_df_finish | TODO_df_verify // todo_flags_finish
+};
+
+
+class avr_pass_recompute_notes : public rtl_opt_pass
+{
+public:
+  avr_pass_recompute_notes (gcc::context *ctxt, const char *name)
+    : rtl_opt_pass (avr_pass_data_recompute_notes, ctxt)
+  {
+    this->name = name;
+  }
+
+  virtual unsigned int execute (function*)
+  {
+    df_note_add_problem ();
+    df_analyze ();
+
+    return 0;
+  }
+}; // avr_pass_recompute_notes
+
+
+static void
+avr_register_passes (void)
+{
+  /* This avr-specific pass (re)computes insn notes, in particular REG_DEAD
+     notes which are used by `avr.c::reg_unused_after' and branch offset
+     computations.  These notes must be correct, i.e. there must be no
+     dangling REG_DEAD notes; otherwise wrong code might result, cf. PR64331.
+
+     DF needs (correct) CFG, hence right before free_cfg is the last
+     opportunity to rectify notes.  */
+
+  register_pass (new avr_pass_recompute_notes (g, "avr-notes-free-cfg"),
+                 PASS_POS_INSERT_BEFORE, "*free_cfg", 1);
+}
+
+
 /* Implement `TARGET_OPTION_OVERRIDE'.  */

 static void
@@ -411,6 +462,11 @@ if (!avr_current_device->macro
   init_machine_status = avr_init_machine_status;

   avr_log_set_avr_log();
+
+  /* Register some avr-specific pass(es).  There is no canonical place for
+     pass registration.  This function is convenient.  */
+
+  avr_register_passes ();
 }

 /* Function to set up the backend function structure.  */
Index: testsuite/gcc.target/avr/torture/pr64331.c
===================================================================
--- testsuite/gcc.target/avr/torture/pr64331.c (revision 0)
+++ testsuite/gcc.target/avr/torture/pr64331.c (revision 0)
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+
+typedef struct
+{
+  unsigned a, b;
+} T2;
+
+
+__attribute__((__noinline__, __noclone__))
+void foo2 (T2 *t, int x)
+{
+  if (x != t->a)
+    {
+      t->a = x;
+
+      if (x && x == t->b)
+        t->a = 20;
+    }
+}
+
+
+T2 t;
+
+int main (void)
+{
+  t.a = 1;
+  t.b = 1234;
+
+  foo2 (&t, 1234);
+
+  if (t.a != 20)
+    __builtin_abort();
+
+  __builtin_exit (0);
+
+  return 0;
+}

Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c (revision 220963)
+++ config/avr/avr.c (working copy)
@@ -51,6 +51,8 @@
 #include "target-def.h"
 #include "params.h"
 #include "df.h"
+#include "context.h"
+#include "tree-pass.h"

 /* Maximal allowed offset for an address in the LD command */
 #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE))
@@ -285,6 +287,58 @@ avr_to_int_mode (rtx x)
 }


+static const pass_data avr_pass_data_recompute_notes =
+{
+  RTL_PASS,       // type
+  "",             // name (will be patched)
+  OPTGROUP_NONE,  // optinfo_flags
+  false,          // has_gate
+  true,           // has_execute
+  TV_DF_SCAN,     // tv_id
+  0,              // properties_required
+  0,              // properties_provided
+  0,              // properties_destroyed
+  0,              // todo_flags_start
+
Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes.
On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote:
> Take #2 introduces a new, avr-specific rtl pass whose sole purpose is
> to rectify notes. The pass is scheduled right before cfg goes down
> (right before .*free_cfg) so that cfg and hence df machinery is
> available.
>
> Regression tests look fine and for the test case the patches produce
> correct code and correct insn length.

Sorry for party-pooping, but it seems to me that the real bug is that
the target is lying to reload.

IIUC the AVR port selects the insn alternative after register
allocation (after reload). It bases its selection on REG_DEAD notes. In
PR64331 an alternative is used that clobbers r20 that has a REG_DEAD
note, but r20 is not actually dead because hardreg-cprop has propagated
it forward without adjusting the note.

The normal way of things is that the insn alternative is selected in
reload (or in LRA) and that the clobbers are added as necessary. In
PR64331, an alternative for insn r17 would be selected that has a
CLOBBER for r20, preventing hardreg-cprop from propagating r20.

Selecting insns based on REG-notes is dangerous business. Lying to
reload and to post-RA passes is a mortal sin, the compiler will punish
you. There is no guarantee that nothing will change between your new
pass to recompute notes, and the final pass that emits the insns.

It's not my port, for sure, but I would look for a real fix instead:
Don't select insns to output based on unreliable information like
REG-notes.

Ciao!
Steven
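The failure mode described in this message (a REG_DEAD note that goes stale after a copy is propagated forward) can be reproduced in a toy model. The sketch below is an illustration under stated assumptions, not GCC code: struct toy_insn, REG, toy_propagate_use and toy_recompute_dead_notes are hypothetical names, registers are bits in a 32-bit mask, and only straight-line code is handled, so no CFG is needed. The recomputation is a plain backward liveness scan, which is a much simplified stand-in for what the df note problem does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REG(n) (UINT32_C (1) << (n))

/* Hypothetical, simplified insn: register sets as bit masks.  */
struct toy_insn
{
  uint32_t uses;        /* registers read by the insn */
  uint32_t defs;        /* registers written by the insn */
  uint32_t dead_notes;  /* registers claimed to die in this insn */
};

/* Replace a use of OLD_REG by NEW_REG without touching any notes.
   This models a propagation that forwards a copy and leaves the
   notes of earlier insns as they were.  */
static void
toy_propagate_use (struct toy_insn *insn, int old_reg, int new_reg)
{
  if (insn->uses & REG (old_reg))
    insn->uses = (insn->uses & ~REG (old_reg)) | REG (new_reg);
}

/* Recompute the notes by a backward scan over straight-line code.
   A register dies in an insn if the insn reads it, does not write it,
   and it is not live afterwards.  LIVE_OUT is the set of registers
   live at the end of the sequence.  */
static void
toy_recompute_dead_notes (struct toy_insn *insns, size_t n,
                          uint32_t live_out)
{
  assert (insns != NULL || n == 0);

  uint32_t live = live_out;

  for (size_t i = n; i-- > 0; )
    {
      insns[i].dead_notes = insns[i].uses & ~(live | insns[i].defs);
      live = (live & ~insns[i].defs) | insns[i].uses;
    }
}
```

As a usage example, take the two-insn sequence "r24 = r20; use r24". After the first recomputation r20 is marked dead in the copy. Rewriting the second insn to use r20 directly leaves that note in place although r20 is now live across the copy; only a second recomputation moves the note to the second insn. Any decision taken from the stale note in between would wrongly treat r20 as free to clobber.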