[Bug rtl-optimization/21138] wrong code in sixtrack for -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-05-17 15:05 ---
Janis, can you try this patch?

Index: modulo-sched.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/modulo-sched.c,v
retrieving revision 1.29
diff -c -p -r1.29 modulo-sched.c
*** modulo-sched.c 28 Apr 2005 02:25:22 -0000 1.29
--- modulo-sched.c 17 May 2005 11:47:00 -0000
*************** undo_generate_reg_moves (partial_schedul
*** 597,602 ****
--- 597,603 ----
            delete_insn (crr);
            crr = prev;
          }
+       SCHED_FIRST_REG_MOVE (u) = NULL_RTX;
      }

    while (reg_move_replaces)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21138
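The one-line fix in the patch follows a common pattern: after every insn in a node's chain of register moves has been deleted, the node's pointer to the first move is cleared, presumably so that nothing later follows a pointer to a deleted insn. A minimal standalone C sketch of that pattern, not GCC code: `struct move`, `struct node`, `push_move` and `undo_moves` are hypothetical names, with `first_reg_move` standing in for `SCHED_FIRST_REG_MOVE (u)`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reg-move insn that links to the previous one. */
struct move
{
  struct move *prev;
};

/* Hypothetical stand-in for a DDG node; first_reg_move models
   SCHED_FIRST_REG_MOVE (u).  */
struct node
{
  struct move *first_reg_move;
};

/* Prepend a new move to U's chain.  Returns 0 on success, -1 if
   memory allocation fails.  */
int
push_move (struct node *u)
{
  struct move *m;

  assert (u != NULL);
  m = malloc (sizeof *m);
  if (m == NULL)
    return -1;
  m->prev = u->first_reg_move;
  u->first_reg_move = m;
  return 0;
}

/* Delete every move of U, then clear the head pointer so that it no
   longer refers to freed memory.  */
void
undo_moves (struct node *u)
{
  struct move *crr;

  assert (u != NULL);
  crr = u->first_reg_move;
  while (crr != NULL)
    {
      struct move *prev = crr->prev;
      free (crr);
      crr = prev;
    }
  /* The analogue of the line added by the patch.  */
  u->first_reg_move = NULL;
}
```

With the head pointer cleared, calling `undo_moves` a second time on the same node is harmless in this model; without the reset it would walk freed memory.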
[Bug rtl-optimization/21138] wrong code in sixtrack for -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-05-04 12:31 --- It seems that this is not an SMS bug: sixtrack fails for me with -m64 -O2 even without -fmodulo-sched. Janis, have you checked that, and do you get different results? -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21138
[Bug middle-end/20177] ICE in schedule-insns for -O2 -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-17 11:53 ---
The following patch should fix the segmentation fault in gap (from SPEC2000) mentioned in comment 14. It is combined with the patch from comment 13. Janis, can you try it out?

Index: ddg.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/ddg.c,v
retrieving revision 1.11
diff -c -p -r1.11 ddg.c
*** ddg.c 22 Nov 2004 12:23:47 -0000 1.11
--- ddg.c 17 Mar 2005 11:20:37 -0000
*************** create_ddg_dependence (ddg_ptr g, ddg_no
*** 187,192 ****
--- 187,194 ----
            else
              free (e);
          }
+       else if (t == ANTI_DEP && dt == REG_DEP)
+         free (e);  /* We can fix broken anti register deps using reg-moves.  */
        else
          add_edge_to_ddg (g, e);
      }
Index: modulo-sched.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/modulo-sched.c,v
retrieving revision 1.19
diff -c -p -r1.19 modulo-sched.c
*** modulo-sched.c 1 Dec 2004 00:33:05 -0000 1.19
--- modulo-sched.c 17 Mar 2005 11:20:38 -0000
*************** const_iteration_count (rtx count_reg, ba
*** 339,344 ****
--- 339,348 ----
  {
    rtx insn;
    rtx head, tail;
+
+   if (! pre_header)
+     return NULL_RTX;
+
    get_block_head_tail (pre_header->index, &head, &tail);

    for (insn = tail; insn != PREV_INSN (head); insn = PREV_INSN (insn))
*************** print_node_sched_params (FILE * dump_fil
*** 401,406 ****
--- 405,412 ----
  {
    int i;

+   if (! dump_file)
+     return;
    for (i = 0; i < num_nodes; i++)
      {
        node_sched_params_ptr nsp = node_sched_params[i];
*************** calculate_maxii (ddg_ptr g)
*** 443,456 ****
    return maxii;
  }

!
! /* Given the partial schedule, generate register moves when the length
!    of the register live range is more than ii; the number of moves is
!    determined according to the following equation:
!                 SCHED_TIME (use) - SCHED_TIME (def)   { 1 broken loop-carried
!    nreg_moves = ----------------------------------- - {   dependence.
!                               ii                      { 0 if not.
!    This handles the modulo-variable-expansions (mve's) needed for the ps.  */
  static void
  generate_reg_moves (partial_schedule_ptr ps)
  {
--- 449,465 ----
    return maxii;
  }

! /*
!    Breaking intra-loop register anti-dependences:
!    Each intra-loop register anti-dependence implies a cross-iteration true
!    dependence of distance 1. Therefore, we can remove such false dependencies
!    and figure out if the partial schedule broke them by checking if (for a
!    true-dependence of distance 1): SCHED_TIME (def) < SCHED_TIME (use) and
!    if so generate a register move.  The number of such moves is equal to:
!                 SCHED_TIME (use) - SCHED_TIME (def)       { 0 broken
!    nreg_moves = ----------------------------------- + 1 - {   dependence.
!                               ii                          { 1 if not.
! */
  static void
  generate_reg_moves (partial_schedule_ptr ps)
  {
*************** generate_reg_moves (partial_schedule_ptr
*** 474,479 ****
--- 483,491 ----
            {
              int nreg_moves4e = (SCHED_TIME (e->dest) - SCHED_TIME (e->src)) / ii;

+             if (e->distance == 1)
+               nreg_moves4e = (SCHED_TIME (e->dest) - SCHED_TIME (e->src) + ii) / ii;
+
              /* If dest precedes src in the schedule of the kernel, then dest
                 will read before src writes and we can save one reg_copy.  */
              if (SCHED_ROW (e->dest) == SCHED_ROW (e->src)
*************** generate_reg_moves (partial_schedule_ptr
*** 497,502 ****
--- 509,517 ----
            {
              int dest_copy = (SCHED_TIME (e->dest) - SCHED_TIME (e->src)) / ii;

+             if (e->distance == 1)
+               dest_copy = (SCHED_TIME (e->dest) - SCHED_TIME (e->src) + ii) / ii;
+
              if (SCHED_ROW (e->dest) == SCHED_ROW (e->src)
                  && SCHED_COLUMN (e->dest) < SCHED_COLUMN (e->src))
                dest_copy--;
*************** normalize_sched_times (partial_schedule_
*** 539,546 ****
    ddg_ptr g = ps->g;
    int amount = PS_MIN_CYCLE (ps);
    int ii = ps->ii;
!
!   for (i = 0; i < g->num_nodes; i++)
      {
        ddg_node_ptr u = g->nodes[i];
        int normalized_time = SCHED_TIME (u) - amount;
--- 554,561 ----
    ddg_ptr g = ps->g;
    int amount = PS_MIN_CYCLE (ps);
    int ii = ps->ii;
!   /* Don't include the closing branch assuming that it is the last node.  */
!   for (i = 0; i < g->num_nodes - 1; i++)
      {
        ddg_node_ptr u = g->nodes[i];
        int normalized_time = SCHED_TIME (u) - amount;
*************** duplicate_insns_of_cycles (partial_sched
*** 609,615 ****
                /* SCHED_STAGE (u_node) >= from_stage == 0.
Generate increasing number of reg_moves starting with the second occurrence of u_node, which is generated if its SCHED_STAGE
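The register-move count that the patch computes in generate_reg_moves can be tried out in isolation. The sketch below is a standalone restatement of that arithmetic, not the GCC function: `reg_moves_for_edge` and its parameter names are hypothetical, the schedule times are passed in as plain integers, and the flag `use_precedes_def_in_row` stands in for the SCHED_ROW / SCHED_COLUMN comparison.

```c
#include <assert.h>

/* Number of register moves needed for one true-dependence edge.

   def_time, use_time        model SCHED_TIME (e->src) and SCHED_TIME (e->dest).
   distance                  is 0 for an intra-iteration edge and 1 for a
                             cross-iteration edge.
   use_precedes_def_in_row   is nonzero when both insns share a kernel row
                             and the use comes first, which saves one copy.
   ii                        is the initiation interval.  */
int
reg_moves_for_edge (int def_time, int use_time, int distance,
                    int use_precedes_def_in_row, int ii)
{
  int nreg_moves;

  assert (ii > 0);
  assert (distance == 0 || distance == 1);

  nreg_moves = (use_time - def_time) / ii;

  /* A distance-1 edge spans one extra iteration, hence the added ii.  */
  if (distance == 1)
    nreg_moves = (use_time - def_time + ii) / ii;

  if (use_precedes_def_in_row)
    nreg_moves--;

  return nreg_moves;
}
```

For example, with ii = 2 and a use scheduled 5 cycles after its def, this gives 2 moves for an intra-iteration edge and 3 moves when the edge has distance 1.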
[Bug middle-end/20177] ICE in schedule-insns for -O2 -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-16 09:36 --- I suppose that the REG_DEAD note for register 136 in insn 65 is correct, because the next insn is a DEF of 136. So the problem here is that 136 is not in the live-out set of BB 7. My guess is that insn 97, which defines 136, was added by SMS, but isn't it the job of update_life_info to update the live-out set of BB 7? At least that is what I was expecting when I called it from rest_of_handle_sms. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20177
[Bug middle-end/20177] ICE in schedule-insns for -O2 -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-16 12:05 --- After a bit more debugging I found that the error is caused by copying the REG_DEAD note along with the instructions when we generate the prologues and epilogues in SMS. The REG_DEAD note is correct in the inter-block view but not inside the block; update_life_info, however, looks inside the blocks and verifies the information for each block, and that is where we go wrong. The fix is to avoid copying the REG_DEAD note in such a case. We should also look for other REG_ notes that we shouldn't be copying to the prologue/epilogue. REG_USE, for example, should still be copied. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20177
[Bug middle-end/20177] ICE in schedule-insns for -O2 -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-16 17:45 ---
For some reason the REG_DEAD note is not the cause of the failure; the cause is that the SMSed basic block wasn't marked dirty for the update_life_info call that comes after it. Marking it dirty fixes the failure even with the REG_DEAD note still on that insn. The REG_DEAD note is correct when we look inter-block, so maybe it is still correct to keep it there. The question is: what is the correct fix for the longer term? Is it enough to mark the SMSed block dirty, or do we also need to keep REG_DEAD correct in each basic block separately? This patch fixes the failure:

Index: modulo-sched.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/modulo-sched.c,v
retrieving revision 1.19
diff -c -p -r1.19 modulo-sched.c
*** modulo-sched.c 1 Dec 2004 00:33:05 -0000 1.19
--- modulo-sched.c 16 Mar 2005 17:20:59 -0000
*************** sms_schedule (FILE *dump_file)
*** 1109,1114 ****
--- 1109,1116 ----
               scheduling passes doesn't touch it.  */
            if (! flag_resched_modulo_sched)
              g->bb->flags |= BB_DISABLE_SCHEDULE;
+           /* The life-info is not valid any more.  */
+           g->bb->flags |= BB_DIRTY;

            generate_reg_moves (ps);
            if (dump_file)

--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20177
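As a rough illustration of why marking the block dirty matters, the sketch below models a life-info refresh that revisits only the blocks flagged dirty. It is a toy model built on that assumption, not GCC's implementation: `struct block`, `BLOCK_DIRTY`, `modify_block`, `refresh_dirty_blocks` and `life_info_is_stale` are all hypothetical names, and a simple version counter stands in for the real liveness sets.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag modelled on BB_DIRTY.  */
enum { BLOCK_DIRTY = 1 };

/* Hypothetical basic block: version counts modifications, and
   life_info_version records which version the cached life info matches.  */
struct block
{
  unsigned flags;
  int version;
  int life_info_version;
};

/* Change the contents of BB.  A pass that forgets to pass a nonzero
   MARK_DIRTY models the failure described in the comment.  */
void
modify_block (struct block *bb, int mark_dirty)
{
  assert (bb != NULL);
  bb->version++;
  if (mark_dirty)
    bb->flags |= BLOCK_DIRTY;
}

/* Recompute life info for dirty blocks only and clear their flag.
   Returns the number of blocks that were refreshed.  */
int
refresh_dirty_blocks (struct block *blocks, size_t n)
{
  int refreshed = 0;
  size_t i;

  assert (blocks != NULL || n == 0);
  for (i = 0; i < n; i++)
    if (blocks[i].flags & BLOCK_DIRTY)
      {
        blocks[i].life_info_version = blocks[i].version;
        blocks[i].flags &= ~(unsigned) BLOCK_DIRTY;
        refreshed++;
      }
  return refreshed;
}

/* Nonzero when the cached life info no longer matches the block.  */
int
life_info_is_stale (const struct block *bb)
{
  assert (bb != NULL);
  return bb->life_info_version != bb->version;
}
```

In this model a block modified without the flag keeps stale life info after the refresh, while a block modified with the flag is brought up to date.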
[Bug middle-end/20177] ICE in schedule-insns for -O2 -fmodulo-sched
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-17 07:30 --- I get the ICE in schedule_insns at sched-rgn.c:2549 also in bootstrap with -O2 -fmodulo-sched. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20177
[Bug rtl-optimization/20450] New: ICE in postreload-gcse
postreload-gcse ICEs when trying to generate an illegal move insn between registers. This happens when compiling vpr on G5 with -fprofile-generate (gcc -O3 -fprofile-generate -mcpu=G5). I made a smaller test-case that causes the same failure. compiling the following code with

--
           Summary: ICE in postreload-gcse
           Product: gcc
           Version: 4.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: rtl-optimization
        AssignedTo: mustafa at il dot ibm dot com
        ReportedBy: mustafa at il dot ibm dot com
                CC: gcc-bugs at gcc dot gnu dot org
 GCC build triplet: powerpc-apple-darwin7.6.0
  GCC host triplet: powerpc-apple-darwin7.6.0
GCC target triplet: powerpc-apple-darwin7.6.0

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20450
[Bug rtl-optimization/20450] ICE in postreload-gcse
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-13 07:09 --- Created an attachment (id=8381) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8381&action=view) gcse_test.c -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20450
[Bug rtl-optimization/20450] ICE in postreload-gcse
--- Additional Comments From mustafa at il dot ibm dot com 2005-03-13 07:11 ---
Compiling gcse_test.c on powerpc-apple-darwin7.6.0 with the options -O3 -mcpu=G5 -fprofile-generate causes the following ICE:

gcse_test.c: In function 'alloc_and_load_edges_and_switches':
gcse_test.c:58: error: unrecognizable insn:
(insn 206 0 0 (set (reg/v/f:SI 5 r5 [orig:127 list_ptr ] [127])
        (reg:DI 0 r0)) -1 (nil)
    (nil))

--
           What    |Removed                     |Added
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|                            |1

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20450
[Bug tree-optimization/19038] [4.0 Regression] out-of ssa causing loops to have more than one BB
--- Additional Comments From mustafa at il dot ibm dot com 2004-12-24 12:13 --- Doing loop header copying in RTL works around this problem; doing so gains us an improvement of 3.9% for SPECINT on a G5 machine and 66% for the gcc benchmark. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19038