[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 Jakub Jelinek changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED --- Comment #9 from Jakub Jelinek 2012-01-23 09:28:26 UTC --- Fixed.
[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 --- Comment #8 from Jakub Jelinek 2012-01-23 09:25:55 UTC --- Author: jakub Date: Mon Jan 23 09:25:52 2012 New Revision: 183416 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183416 Log: PR rtl-optimization/51933 * ree.c (transform_ifelse): Return true right away if dstreg is already wider or equal to cand->mode. (enum ext_modified_kind, struct ext_modified, ext_state): New types. (make_defs_and_copies_lists): Remove defs_list and copies_list arguments, add state argument, just truncate state->work_list instead of always allocating and freeing the vector. Assert that get_defs succeeds instead of returning 2. Changed return type to bool. (merge_def_and_ext): Add state argument. If SET_DEST doesn't have ext_src_mode, see if it has been modified already with the right kind of extension and has been extended before from the ext_src_mode. If SET_DEST is already wider or equal to cand->mode, just return true. Remember the original mode in state->modified array. (combine_reaching_defs): Add state argument. Don't allocate and free here def_list, copied_list and vec vectors, instead just VEC_truncate the vectors in *state. Don't handle outcome == 2 here. (find_and_remove_re): Set DF_DEFER_INSN_RESCAN df flag. Add state variable, clear vectors in it, initialize state.modified if needed. Free all the vectors at the end and state.modified too. Don't skip a candidate if the extension expression has been modified. * gcc.c-torture/execute/pr51933.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr51933.c Modified: trunk/gcc/ChangeLog trunk/gcc/ree.c trunk/gcc/testsuite/ChangeLog
[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 Jakub Jelinek changed: What|Removed |Added Attachment #26410|0 |1 is obsolete|| Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |jakub at gcc dot gnu.org |gnu.org | --- Comment #7 from Jakub Jelinek 2012-01-22 18:59:43 UTC --- Created attachment 26414 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26414 gcc47-pr51933.patch Untested fix with deferred insn rescanning. Reverts the PR51924 find_and_remove_re fix as it is no longer needed.
[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 --- Comment #6 from Jakub Jelinek 2012-01-22 09:42:05 UTC --- Created attachment 26410 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26410 gcc47-pr51933.patch Patch I wrote last night. Passed bootstrap/regtest, the only cases where it prevented something was in the new testcase and in jcf-parse.c (rewrite_reflection_indexes) with -m64, where it fixed an miscompilation. What happened there is that some earlier bb (but in a loop) was doing a SImode->DImode zero extension, and later in the loop there was a SImode ior, followed by HImode->SImode zero extension of that, which fed the earlier SImode->DImode zero extension. The SImode->DImode insn was processed first, removed and changed the HImode->SImode zero extension into HImode->DImode zero extension. But as the HImode->SImode extension was changed, its df links were gone. When the HImode-> zero extension is later processed as candidate, get_defs returns NULL and we assume all the dependencies have been adjusted. Which leads me to thinking that this patch is a wrong approach, and perhaps we should instead DF_DEFER_INSN_RESCAN in ree.c and adjust, so that we can cope with the adjusted insns.
[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 Eric Botcazou changed: What|Removed |Added Status|ASSIGNED|NEW AssignedTo|ebotcazou at gcc dot|unassigned at gcc dot |gnu.org |gnu.org --- Comment #5 from Eric Botcazou 2012-01-21 21:27:40 UTC --- > But then when make_defs_and_copies_lists is called on the QImode -> DImode > extension, we reach: > /* Initialize the work list. */ > if (!get_defs (extend_insn, src_reg, &work_list)) > { > VEC_free (rtx, heap, work_list); > /* The number of defs being equal to zero can only mean that all the > definitions have been previously merged. */ > return 2; > } > and because the definition has been changed already. But nothing performs the > ext_src_mode check that used to be performed otherwise. > So we either need to do the src mode checking earlier when we haven't started > modifying insns (in add_removable_extension?), or we'd need to look even at > the > newly added defs and see what mode they extend from. OK, I didn't realize that there could be an implicit truncation hidden within an extension. And I think that your change to add_removable_extension is fine, as it still leaves the new flavor of the pass more powerful than the old one. The other PR is a different problem.
[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 --- Comment #4 from Jakub Jelinek 2012-01-21 18:40:23 UTC --- That unfortunately slightly pessimizes it. E.g. on i686-linux bootstrap on: ../../gcc/config/i386/i386.c ix86_memory_move_cost ../../gcc/config/i386/i386.c ix86_register_move_cost ../../../../libgfortran/io/write.c write_float ../../../libgfortran/io/write.c write_float s-regpat.adb system__regpat__dump_until /usr/src/gcc/gcc/testsuite/gcc.target/i386/avx-vpackuswb-1.c do_test /usr/src/gcc/gcc/testsuite/gcc.target/i386/sse2-packuswb-1.c do_test /usr/src/gcc/libstdc++-v3/testsuite/ext/rope/1.cc _S_tree_concat /usr/src/gcc/libstdc++-v3/testsuite/ext/rope/2.cc _S_tree_concat /usr/src/gcc/libstdc++-v3/testsuite/ext/rope/3.cc _S_tree_concat /usr/src/gcc/libstdc++-v3/testsuite/ext/rope/44963.cc _S_tree_concat /usr/src/gcc/libstdc++-v3/testsuite/ext/rope/4.cc _S_tree_concat /usr/src/gcc/libstdc++-v3/testsuite/ext/rope/pthread7-rope.cc _S_tree_concat and on x86_64-linux bootstrap sans regtest: ../../gcc/ada/par.adb par__prag__process_restrictions_or_restriction_warnings ../../gcc/config/i386/i386.c ix86_memory_move_cost ../../gcc/config/i386/i386.c ix86_register_move_cost ../../gcc/java/jcf-parse.c rewrite_reflection_indexes ../../../libgfortran/io/write.c write_float s-regpat.adb system__regpat__dump_until The problem is if the def insn satisfies is_cond_copy_insn, then it can be in a wider mode, yet all definitions could still be in a narrower mode. So, perhaps we'd need to have a recursive function that for is_cond_copy_insn defs calls get_defs on both args and recurses, checking mode of real definitions only. Not sure if we'd need also this is_insn_visited array for it to avoid recursing infinitely. BTW, for 4.8, I guess a couple of ree.c cleanups will be needed, e.g. several functions allocate and free way too many heap allocated vectors, I guess if we allocate them in the caller (or caller's caller) and just pass addresses of those vectors around, we could just VEC_truncate the vectors and avoid the permanent malloc/free.
[Bug rtl-optimization/51933] [4.7 regression] wrong code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 --- Comment #3 from Jakub Jelinek 2012-01-21 15:57:23 UTC --- Completely untested fix: --- gcc/ree.c.jj2011-12-28 10:52:44.0 +0100 +++ gcc/ree.c2012-01-21 16:55:27.290731139 +0100 @@ -776,6 +776,23 @@ add_removable_extension (rtx x ATTRIBUTE } return; } +else + { +rtx set = single_set (DF_REF_INSN (def->ref)); +if (set == NULL_RTX +|| !REG_P (SET_DEST (set)) +|| GET_MODE (SET_DEST (set)) != GET_MODE (XEXP (src, 0))) + { +if (dump_file) + { +fprintf (dump_file, "Cannot eliminate extension:\n"); +print_rtl_single (dump_file, rei->insn); +fprintf (dump_file, + " because def insn has different mode\n"); + } +return; + } + } /* Then add the candidate to the list and insert the reaching definitions into the definition map. */
[Bug rtl-optimization/51933] [4.7 Regression] Wrong-code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 Eric Botcazou changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2012-01-21 CC||ebotcazou at gcc dot ||gnu.org AssignedTo|unassigned at gcc dot |ebotcazou at gcc dot |gnu.org |gnu.org Ever Confirmed|0 |1 --- Comment #2 from Eric Botcazou 2012-01-21 15:44:46 UTC --- > But then when make_defs_and_copies_lists is called on the QImode -> DImode > extension, we reach: > /* Initialize the work list. */ > if (!get_defs (extend_insn, src_reg, &work_list)) > { > VEC_free (rtx, heap, work_list); > /* The number of defs being equal to zero can only mean that all the > definitions have been previously merged. */ > return 2; > } > and because the definition has been changed already. Yes, this particular return embodies some implicit assumptions... Will fix.
[Bug rtl-optimization/51933] [4.7 Regression] Wrong-code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 --- Comment #1 from Jakub Jelinek 2012-01-21 15:24:26 UTC --- Consider following 3 routines: unsigned long foo (unsigned short *s, int i) { unsigned short x = s[i]; if (x < 0x211) return (unsigned char) x; asm volatile (""); return 6; } unsigned long bar (unsigned short *s, int i) { unsigned short x = s[i]; if (x < 0x211) return (unsigned char) x; asm volatile (""); return x; } unsigned long baz (unsigned short *s, int i) { unsigned short x = s[i]; if (x < 0x211) return x; asm volatile (""); return (unsigned char) x; } In foo ree doesn't optimize away the QImode -> DImode zero extension, because merge_def_and_ext checks: && GET_MODE (SET_DEST (*sub_rtx)) == ext_src_mode and in this case ext_src_mode is QImode, but SET_DEST has HImode. In bar ree doesn't optimize it away either, first merge_def_and_ext_checks is called with the QImode ext_src_mode (which doesn't match) and only afterwards on the HImode -> DImode extension, which is optimized (that is correct). In baz (which is the same as bar in #c0 testcase) unfortunately merge_def_and_ext is first called on the HImode -> DImode extension, which is optimized (that is fine) by turning the HImode load into a DImode zero_extend load from HImode and removing the HImode -> DImode extension. But then when make_defs_and_copies_lists is called on the QImode -> DImode extension, we reach: /* Initialize the work list. */ if (!get_defs (extend_insn, src_reg, &work_list)) { VEC_free (rtx, heap, work_list); /* The number of defs being equal to zero can only mean that all the definitions have been previously merged. */ return 2; } and because the definition has been changed already. But nothing performs the ext_src_mode check that used to be performed otherwise. So we either need to do the src mode checking earlier when we haven't started modifying insns (in add_removable_extension?), or we'd need to look even at the newly added defs and see what mode they extend from.
[Bug rtl-optimization/51933] [4.7 Regression] Wrong-code due to -free
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51933 Jakub Jelinek changed: What|Removed |Added Priority|P3 |P1 Known to work||4.6.3 Target Milestone|--- |4.7.0 Known to fail||4.7.0