Re: Constant folding and Constant propagation
Jean Christophe Beyler writes: I set up your patch and I get an internal error on this test program: You're right. I haven't handled the case properly when the constant itself was an anchor constant (e.g. 0). Try this version. Adam * cse.c (get_const_anchors): New function. (insert_const_anchors): New function. (cse_insn): Set src_related using anchor constants. Insert constant anchors into the table of available expressions. * config/mips/mips.c (mips_rtx_costs): Make immediate-add even cheaper than loading a simple constant into a register. Index: gcc/cse.c === --- gcc.orig/cse.c 2009-03-08 12:16:56.0 -0700 +++ gcc/cse.c 2009-03-16 23:07:40.0 -0700 @@ -3961,6 +3961,55 @@ record_jump_cond (enum rtx_code code, en merge_equiv_classes (op0_elt, op1_elt); } + +#define TARGET_CONST_ANCHOR 0x8000 + +/* Compute the upper and lower anchors for CST as base, offset pairs. Return + NULL_RTX if CST is equal to an anchor. */ + +static rtx +get_const_anchors (rtx cst, rtx *upper_base, HOST_WIDE_INT *upper_offs, + HOST_WIDE_INT *lower_offs) +{ + HOST_WIDE_INT n, upper, lower; + + n = INTVAL (cst); + lower = n ~(TARGET_CONST_ANCHOR - 1); + if (n == lower) +return NULL_RTX; + upper = (n + (TARGET_CONST_ANCHOR - 1)) ~(TARGET_CONST_ANCHOR - 1); + + *upper_base = GEN_INT (upper); + *upper_offs = n - upper; + *lower_offs = n - lower; + return GEN_INT (lower); +} + +/* Create equivalences between the two anchors of a constant value and the + corresponding register-offset expressions. Use the register REG, which is + equivalent to the constant value CLASSP-exp. */ + +static void +insert_const_anchors (rtx reg, struct table_elt *classp, + enum machine_mode mode) +{ + rtx lower_base, upper_base; + HOST_WIDE_INT lower_offs, upper_offs; + rtx lower_exp, upper_exp; + struct table_elt *celt; + rtx cst = classp-exp; + + lower_base = get_const_anchors (cst, upper_base, upper_offs, lower_offs); + if (!lower_base) +return; + lower_exp = plus_constant (reg, -lower_offs); + upper_exp = plus_constant (reg, -upper_offs); + + celt = insert (lower_base, NULL, HASH (lower_base, mode), mode); + insert (lower_exp, celt, HASH (lower_exp, mode), mode); + celt = insert (upper_base, NULL, HASH (upper_base, mode), mode); + insert (upper_exp, celt, HASH (upper_exp, mode), mode); +} /* CSE processing for one instruction. First simplify sources and addresses of all assignments @@ -4595,6 +4644,67 @@ cse_insn (rtx insn) } #endif /* LOAD_EXTEND_OP */ + /* Try to express the constant using a register-offset expression using +anchor constants. */ + + if (!src_related src_const GET_CODE (src_const) == CONST_INT) + { + rtx lower_base, upper_base; + struct table_elt *lower_elt, *upper_elt, *elt; + HOST_WIDE_INT lower_offs, upper_offs, offs; + + lower_base = get_const_anchors (src_const, upper_base, upper_offs, + lower_offs); + if (lower_base) + { + lower_elt = lookup (lower_base, HASH (lower_base, mode), mode); + upper_elt = lookup (upper_base, HASH (upper_base, mode), mode); + + /* Loop over LOWER_ELTs and UPPER_ELTs to find a reg-offset pair +that we can use to express SRC_CONST. */ + elt = NULL; + if (lower_elt) + { + elt = lower_elt-first_same_value; + offs = lower_offs; + } + else if (upper_elt) + { + elt = upper_elt-first_same_value; + upper_elt = NULL; + offs = upper_offs; + } + while (elt) + { + if (REG_P (elt-exp) + || (GET_CODE (elt-exp) == PLUS + REG_P (XEXP (elt-exp, 0)) + GET_CODE (XEXP (elt-exp, 1)) == CONST_INT)) + { + rtx x = plus_constant (elt-exp, offs); + if (REG_P (x) + || (GET_CODE (x) == PLUS + IN_RANGE (INTVAL (XEXP (x, 1)), + -TARGET_CONST_ANCHOR, + TARGET_CONST_ANCHOR - 1))) + { + src_related = x; + break; + } + } + + if (!elt-next_same_value upper_elt) + { + elt = upper_elt-first_same_value; + upper_elt = NULL; + offs = upper_offs; + } + else + elt = elt-next_same_value; + } + } +
Re: Constant folding and Constant propagation
If I replace your lines: if ((GET_CODE (sets[i].src_elt-exp) == CONST_INT)) insert_const_anchors (dest, sets[i].src_elt, GET_MODE (dest)); with: if ((GET_CODE (sets[i].src_elt-exp) == CONST_INT) (INTVAL (sets[i].src_elt-exp) != 0)) insert_const_anchors (dest, sets[i].src_elt, GET_MODE (dest)); it does not fail anymore since I remove the 0 corner case I am seeing. I fail to see why it is a problem however. Like I said previously, it might be a problem with my target architecture but I fail to see why 0 is considered differently in the target files. Jc
Re: Constant folding and Constant propagation
I set up your patch and I get an internal error on this test program: extern void foo(int, int); extern int bar(void); int startup; void foobar() { int i; while(1) { if (bar()) { foo(0,0); } } } Here's the error: /home/beyler/cyclops64/src/tnt/kernel/process_manager/testnode.c: In function 'foobar': /home/beyler/cyclops64/src/tnt/kernel/process_manager/testnode.c:15: internal compiler error: in insert_regs, at cse.c:1156 Please submit a full bug report, with preprocessed source if appropriate. See http://gcc.gnu.org/bugs.html for instructions. I think it's got something to do with my machine description but this only happens when the values of foo are both 0. I haven't found another case where it fails. It seems when the compiler sees two constants 0 for the register (r8 which is my first input register), it fails on the second. Somewhere between both calls, it must remove the quantity. Any ideas ? Jc
Re: Constant folding and Constant propagation
Adam Nemet ane...@caviumnetworks.com writes: I am actually looking at something similar for PR33699 for MIPS. My plan is to experiment extending cse.c with putting anchor constants to the available expressions along with the original constant and then querying those later for constant expressions. See http://gcc.gnu.org/ml/gcc-patches/2009-03/msg00161.html. Adam
RE: Constant folding and Constant propagation
- If I patch in this code, actually I get the same results I did before where the constants are propagated. It seems that in 4.3.2, every part of the compiler is trying to do that. There are at least two forward propagation passes, one before and another after GCSE. I haven't tried to tackle those passes, but I believe they already have a cost model in place. This patch will prevent partial sums involving registers and constants from being combined during GCSE. - Just for testing I turned off the GCSE pass and it still propagated and folded the constants... GCSE won't help with your trimmed down example int main(void) { long a = 0xcafecafe; printf(Final: %lx %lx %lx\n, a, a+5, a+15); return EXIT_SUCCESS; } I believe Paolo's canon_reg solution together with tweaking rtx_cost of constants with outer code PLUS might help. Any luck with that pass?
Re: Constant folding and Constant propagation
Rahul Kharche ra...@icerasemi.com writes: GCSE won't help with your trimmed down example int main(void) { long a = 0xcafecafe; printf(Final: %lx %lx %lx\n, a, a+5, a+15); return EXIT_SUCCESS; } I believe Paolo's canon_reg solution together with tweaking rtx_cost of constants with outer code PLUS might help. Any luck with that pass? As I understand Paolo's change is to move some of the functionality from cse.c to a new LCM pass. However cse.c does not handle the above yet. I am actually looking at something similar for PR33699 for MIPS. My plan is to experiment extending cse.c with putting anchor constants to the available expressions along with the original constant and then querying those later for constant expressions. Adam
RE: Constant folding and Constant propagation
Hi Jean, Do you have a link to that patch? So that I can see if it applies for me ? See patch below. I put in some comments to try and explain what I have attempted to do. The hash SET generation, to track potential candidates in replacing a register USE, needed some tweaking. This function was not tracking register copies if the associated INSN had a NOTE attached that associated the register with a constant. Also, only the last register or constant in an assignment chain was being added to the hash SET. It may be profitable to have a closer register copy in the assignment chain as a potential replacement. Before the actual replacing operation, the cost of temporary replacement is determined using rtx_cost. A cost cannot be reliably formed if we come across jump INSNs because they may alter jumps or successors. The default replacement is used in this case. Rahul Index: gcse.c === --- gcse.c (revision 141156) +++ gcse.c (working copy) @@ -520,7 +520,7 @@ static void record_set_info (rtx, const_rtx, void *); static void compute_sets (void); static void hash_scan_insn (rtx, struct hash_table *, int); -static void hash_scan_set (rtx, rtx, struct hash_table *); +static void hash_scan_set (rtx, rtx, struct hash_table *, bool); static void hash_scan_clobber (rtx, rtx, struct hash_table *); static void hash_scan_call (rtx, rtx, struct hash_table *); static int want_to_gcse_p (rtx); @@ -560,7 +560,7 @@ static void compute_cprop_data (void); static void find_used_regs (rtx *, void *); static int try_replace_reg (rtx, rtx, rtx); -static struct expr *find_avail_set (int, rtx); +static struct expr *find_avail_set (int, rtx, bool); static int cprop_jump (basic_block, rtx, rtx, rtx, rtx); static void mems_conflict_for_gcse_p (rtx, const_rtx, void *); static int load_killed_in_block_p (const_basic_block, int, const_rtx, int); @@ -1671,11 +1671,11 @@ expression one). */ static void -hash_scan_set (rtx pat, rtx insn, struct hash_table *table) +hash_scan_set (rtx pat, rtx insn, struct hash_table *table, bool use_note) { rtx src = SET_SRC (pat); rtx dest = SET_DEST (pat); - rtx note; + rtx note = NULL_RTX; if (GET_CODE (src) == CALL) hash_scan_call (src, insn, table); @@ -1685,16 +1685,21 @@ unsigned int regno = REGNO (dest); rtx tmp; - /* See if a REG_NOTE shows this equivalent to a simpler expression. -This allows us to do a single GCSE pass and still eliminate -redundant constants, addresses or other expressions that are -constructed with multiple instructions. */ - note = find_reg_equal_equiv_note (insn); - if (note != 0 - (table-set_p - ? gcse_constant_p (XEXP (note, 0)) - : want_to_gcse_p (XEXP (note, 0 - src = XEXP (note, 0), pat = gen_rtx_SET (VOIDmode, dest, src); + /* Do not substitute insn SET_SRC for note if caller uses + uses use_note = false. */ + if (use_note) +{ + /* See if a REG_NOTE shows this equivalent to a simpler expression. +This allows us to do a single GCSE pass and still eliminate +redundant constants, addresses or other expressions that are +constructed with multiple instructions. */ + note = find_reg_equal_equiv_note (insn); + if (note != 0 + (table-set_p + ? gcse_constant_p (XEXP (note, 0)) + : want_to_gcse_p (XEXP (note, 0 + src = XEXP (note, 0), pat = gen_rtx_SET (VOIDmode, dest, src); + } /* Only record sets of pseudo-regs in the hash table. */ if (! table-set_p @@ -1835,14 +1840,30 @@ what's been modified. */ if (GET_CODE (pat) == SET) -hash_scan_set (pat, insn, table); +{ + /* If we're hashing SETs for constant/copy propagation + also hash a SET ignoring any insn notes. */ + if (table-set_p) +{ + hash_scan_set (pat, insn, table, false); + } + hash_scan_set (pat, insn, table, true); +} else if (GET_CODE (pat) == PARALLEL) for (i = 0; i XVECLEN (pat, 0); i++) { rtx x = XVECEXP (pat, 0, i); if (GET_CODE (x) == SET) - hash_scan_set (x, insn, table); + { + /* If we're hashing SETs for constant/copy propagation + also hash a SET ignoring any insn notes. */ + if (table-set_p) + { + hash_scan_set (x, insn, table, false); + } + hash_scan_set (x, insn, table, true); + } else if (GET_CODE (x) == CLOBBER) hash_scan_clobber (x, insn, table); else if (GET_CODE (x) == CALL) @@ -2071,7 +2092,7 @@ if (table-set_p implicit_sets[current_bb-index] != NULL_RTX) hash_scan_set (implicit_sets[current_bb-index], - BB_HEAD (current_bb), table); +
Fwd: Constant folding and Constant propagation
Dear all, I was working on the machine description so I was postponing a bit this problem but now I have a bit more time on my hands to handle it. I'm starting to look at the various links and code you've provided me and will keep you posted if I make any headway or not ;-). For the GCC port I work on, I have fixed this by weighing the rtx_cost of propagating a register copy Vs propagating the constant into an insn. I have an initial patch for this problem. Do you have a link to that patch? So that I can see if it applies for me ? Thanks, Jc On Tue, Feb 10, 2009 at 2:25 PM, Rahul Kharche ra...@icerasemi.com wrote: I am looking at a related problem in GCSE, GCC 4.3 whereby constants are propagated to their use irrespective of the final instruction cost of generating them (machine cycles or instruction count). Global constant propagation effectively voids common expressions that form large constants, identified by global CSE. This is especially true of SYMBOl_REF constants like section-anchors generated while indexing global arrays. For the GCC port I work on, I have fixed this by weighing the rtx_cost of propagating a register copy Vs propagating the constant into an insn. I have an initial patch for this problem. Rahul Vijay Kharche
Re: Constant folding and Constant propagation
I am looking at a related problem in GCSE, GCC 4.3 whereby constants are propagated to their use irrespective of the final instruction cost of generating them (machine cycles or instruction count). Global constant propagation effectively voids common expressions that form large constants, identified by global CSE. This is especially true of SYMBOl_REF constants like section-anchors generated while indexing global arrays. For the GCC port I work on, I have fixed this by weighing the rtx_cost of propagating a register copy Vs propagating the constant into an insn. I have an initial patch for this problem. Rahul Vijay Kharche
Re: Constant folding and Constant propagation
On Friday 06 February 2009, Ian Lance Taylor wrote: Jean Christophe Beyler jean.christophe.bey...@gmail.com writes: All of these have an outer code of SET. Therefore, I'm not quite positive of how I'm supposed to implement my rtx_cost function. Since I don't seem to get a choice between a set 0xcb03 and a (plus 0xcafe 5), how can I tell the compiler the different costs? Make the CONST_INT more expensive than the PLUS. But I don't know that gcc will implement the particular optimization that you are looking for. I'm not aware of any other processor which is able to load a large constant in a single instruction, but for which an add instruction is cheaper if there is a similar constant already available. This is true for Arm/Thumb. You have limited immediate values for [cheap] ALU instructions (mov/add), but can also load arbitrary constants from a pool with a single [relatively expensive] pc-relative load. We lie about what immediate we can load, then fix this up in md-reorg. A quick test shows gcc manages to DTRT at least some of the time. Paul
Re: Constant folding and Constant propagation
Ian Lance Taylor: that you are looking for. I'm not aware of any other processor which is able to load a large constant in a single instruction, but for which an add instruction is cheaper if there is a similar constant already available. Paul Brook: This is true for Arm/Thumb. You have limited immediate values for [cheap] ALU instructions (mov/add), but can also load arbitrary constants from a pool with a single [relatively expensive] pc-relative load. We lie about what immediate we can load, then fix this up in md-reorg. The SH also uses md-reorg to load constants from memory. And the ARCompact allows 32-bit constants, but that requires an extra fetch cycle; cheap 12 bit signed constants are allowed for two-address ALU operations, and 6 bit unsigned for three-address ALU operations. OTOH if you want to store a zero, you either have to load it into a register, or use a (costly) 32 bit constant. And there can only be one 32-bit cosntant per instruction, i.e. you can store a 32 bit immediate to a register + offset address, or store a register to a 32 bit immediate adddress, but not a 32 bit immediate constant to an absolute address.
Re: Constant folding and Constant propagation
Jean Christophe Beyler wrote: Ok, thanks for all this information and if you can dig that up it would be nice too. I'll start looking at that patch and PR33699 to see if I can adapt them to my needs. Here it is. Paolo /* Copy propagation on RTL for GNU compiler. Copyright (C) 2006 Free Software Foundation, Inc. This file is part of GCC. GCC is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. GCC is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with GCC; see the file COPYING. If not, write to the Free Software Foundation, 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. */ #include config.h #include system.h #include coretypes.h #include tm.h #include rtl.h #include obstack.h #include basic-block.h #include insn-config.h #include recog.h #include alloc-pool.h #include timevar.h #include tree-pass.h /* The basic idea is to keep a table of registers with the same value, and replace expressions encountered so that they use the equivalent register. We do this processing on extended basic blocks. Note this can turn a conditional or computed jump into a nop or an unconditional jump. This is left to be cleaned up by CSE, for now. At the start of each basic block, an assignment places a register in a distinct group number. During scan, when the code copies one register (or a related expression, see below) into another, we copy the quantity number. When a register is loaded in any other way, we allocate a new quantity number to describe the value generated by this operation. `reg_group[regno].num' records what quantity register REGNO is currently thought of as containing. Other expressions: A CLOBBER rtx in an instruction invalidates its operand for further reuse. Related expressions: Registers that differ only by an additive integer are called related. Related registers share the same quantity number. */ /* Per-group information tracking. */ struct group_table_elem { rtxreg; HOST_WIDE_INT delta; /* Basic block in which it was defined. */ basic_blockbb; struct group_table_elem *prev, *next; }; /* Per-register data, including register-group mapping. */ struct reg_data { /* Current group. */ struct group_table_elem *entry; /* Pointer into the table of group heads, indexed by register number. Always 0 for unknown values. */ int num; }; /* Length of group_head vector. */ static int max_group; /* Next quantity number to be allocated. This is 1 + the largest number needed so far. */ static int next_group; /* The table of all groups, indexed by group number. */ static struct group_table_elem **group_head; /* The table of all pseudos, indexed by regno. */ static struct reg_data *reg_group; /* Allocation pool. */ static alloc_pool group_elem_pool; /* Basic block being processed. */ static basic_block current_bb; /* We store the state of each basic block (not visited, part of current EBB, finished) in its AUX field. These two functions return the state. */ static inline bool bb_visited (basic_block bb) { return bb-aux != 0; } static inline bool bb_active (basic_block bb) { return bb-aux == (void *) 1L; } /* Routines to manage the data structures of this pass. */ /* Remove the register DEST from the equivalence tables. */ static struct group_table_elem * remove_from_group (rtx dest) { struct group_table_elem *ent = reg_group[REGNO (dest)].entry; int num = reg_group[REGNO (dest)].num; reg_group[REGNO (dest)].num = 0; reg_group[REGNO (dest)].entry = NULL; gcc_assert (ent); if (ent-prev) ent-prev-next = ent-next; else group_head[num] = ent-next; if (ent-next) ent-next-prev = ent-prev; ent-prev = ent-next = NULL; return ent; } /* Check if the entry for register REG in the equivalence tables is up to date. Remove it and return NULL if it is not. Otherwise, return the entry, or NULL if it is absent. */ static inline struct group_table_elem * reg_group_entry (rtx reg) { struct group_table_elem *ent = reg_group[REGNO (reg)].entry; if (ent !bb_active (ent-bb)) { remove_from_group (reg); return NULL; } else return ent; } /* Return the head of the equivalence class for REG (removing stale entries). */ static struct group_table_elem * reg_group_head (rtx reg) { int num = reg_group[REGNO (reg)].num; struct group_table_elem *canon_ent = group_head[num]; while (canon_ent !bb_active (canon_ent-bb)) { remove_from_group
Re: Constant folding and Constant propagation
Steven Bosscher wrote: On Fri, Feb 6, 2009 at 7:32 PM, Adam Nemet ane...@caviumnetworks.com wrote: I think you really need the Joern's optmize_related_values patch. Also see PR33699. I wouldn't recommend that patch, but yes: Something that performs that optimization ;-) Yes, something doing that using LCM was on my list of things to do after fwprop to do what CSE currently does, but better. I never got round to implementing it, I think. I had an LCM-based replacement for canon_reg that I wanted to use as a basis, maybe I can dig it up. Paolo
Re: Constant folding and Constant propagation
Ok, thanks for all this information and if you can dig that up it would be nice too. I'll start looking at that patch and PR33699 to see if I can adapt them to my needs. Any reason why you wouldn't recommend it? I will keep you posted of anything I do for latter reference, Jean Christophe Beyler On Sat, Feb 7, 2009 at 5:05 AM, Paolo Bonzini bonz...@gnu.org wrote: Steven Bosscher wrote: On Fri, Feb 6, 2009 at 7:32 PM, Adam Nemet ane...@caviumnetworks.com wrote: I think you really need the Joern's optmize_related_values patch. Also see PR33699. I wouldn't recommend that patch, but yes: Something that performs that optimization ;-) Yes, something doing that using LCM was on my list of things to do after fwprop to do what CSE currently does, but better. I never got round to implementing it, I think. I had an LCM-based replacement for canon_reg that I wanted to use as a basis, maybe I can dig it up. Paolo
Re: Constant folding and Constant propagation
Dear all, I've been trying to play with the RTX costs by using a target function using the macro TARGET_RTX_COSTS. However, I haven't been able to really make any good progress. This is the program I have: #include stdio.h #include stdlib.h int main(void) { long a = 0xcafe; printf(Final: %lx %lx\n, a, a+5); return EXIT_SUCCESS; } What I would like to see is not have the two elements set as constants in registers but only one and the other set with an add instruction. I first looked at what rtx were sent to this function and I am perplexed about how this would work. After many calls to the function, I finally get this : (const_int 51966 [0xcafe]) (const_int 51966 [0xcafe]) (const_int 51971 [0xcb03]) (const_int 51971 [0xcb03]) All of these have an outer code of SET. Therefore, I'm not quite positive of how I'm supposed to implement my rtx_cost function. Since I don't seem to get a choice between a set 0xcb03 and a (plus 0xcafe 5), how can I tell the compiler the different costs? Jc On Thu, Feb 5, 2009 at 4:10 PM, Ian Lance Taylor i...@google.com wrote: Jean Christophe Beyler jean.christophe.bey...@gmail.com writes: I'm currently working on removing the constant folding and constant propagation because, on the architecture I'm working on, it is highly costly to move a constant into a register if the number is big (we can say over 16 bits). Currently, I've been looking into this and it seems that I have to hack into the fold-const.c file, CCP, SCCVN and probably the propagate passes to tell the compiler to be careful if the number is too big. But I'm sure there are many other places I'll have to hack and slash to tell him to stop the folding and propagation. Is there an easier way to do this ? This type of thing is normally controlled by the TARGET_RTX_COSTS hook. E.g., see the handling of CONST_INT mips_rtx_costs in config/mips/mips.c. Ian
Re: Constant folding and Constant propagation
On Fri, Feb 06, 2009 at 11:46:58AM -0500, Jean Christophe Beyler wrote: I finally get this : (const_int 51966 [0xcafe]) (const_int 51966 [0xcafe]) (const_int 51971 [0xcb03]) (const_int 51971 [0xcb03]) All of these have an outer code of SET. Therefore, I'm not quite positive of how I'm supposed to implement my rtx_cost function. Since I don't seem to get a choice between a set 0xcb03 and a (plus 0xcafe 5), how can I tell the compiler the different costs? They may all have an outer code of SET, but you get the entire inner RTX. Search arm_rtx_costs_1 for CONST_INT:. -- Daniel Jacobowitz CodeSourcery
Re: Constant folding and Constant propagation
Jean Christophe Beyler jean.christophe.bey...@gmail.com writes: All of these have an outer code of SET. Therefore, I'm not quite positive of how I'm supposed to implement my rtx_cost function. Since I don't seem to get a choice between a set 0xcb03 and a (plus 0xcafe 5), how can I tell the compiler the different costs? Make the CONST_INT more expensive than the PLUS. But I don't know that gcc will implement the particular optimization that you are looking for. I'm not aware of any other processor which is able to load a large constant in a single instruction, but for which an add instruction is cheaper if there is a similar constant already available. You may need to implement this as a peephole or as a machine specific pass. Ian
Re: Constant folding and Constant propagation
Ian Lance Taylor wrote: Jean Christophe Beyler jean.christophe.bey...@gmail.com writes: All of these have an outer code of SET. Therefore, I'm not quite positive of how I'm supposed to implement my rtx_cost function. Since I don't seem to get a choice between a set 0xcb03 and a (plus 0xcafe 5), how can I tell the compiler the different costs? Make the CONST_INT more expensive than the PLUS. But I don't know that gcc will implement the particular optimization that you are looking for. I'm not aware of any other processor which is able to load a large constant in a single instruction, but for which an add instruction is cheaper if there is a similar constant already available. You may need to implement this as a peephole or as a machine specific pass. Take a look at reload_cse_move2add. Bernd -- This footer brought to you by insane German lawmakers. Analog Devices GmbH Wilhelm-Wagenfeld-Str. 6 80807 Muenchen Sitz der Gesellschaft Muenchen, Registergericht Muenchen HRB 40368 Geschaeftsfuehrer Thomas Wessel, William A. Martin, Margaret Seif
Re: Constant folding and Constant propagation
Bernd Schmidt bernds_...@t-online.de writes: But I don't know that gcc will implement the particular optimization that you are looking for. I'm not aware of any other processor which is able to load a large constant in a single instruction, but for which an add instruction is cheaper if there is a similar constant already available. You may need to implement this as a peephole or as a machine specific pass. Take a look at reload_cse_move2add. Oh yeah, I completely forgot about that. Thanks. Ian
Re: Constant folding and Constant propagation
Bernd Schmidt bernds_...@t-online.de writes: Take a look at reload_cse_move2add. I don't think that powerful enough; it requires the same destination registers: /* Try to transform (set (REGX) (CONST_INT A)) ... (set (REGX) (CONST_INT B)) to (set (REGX) (CONST_INT A)) ... (set (REGX) (plus (REGX) (CONST_INT B-A))) I think you really need the Joern's optmize_related_values patch. Also see PR33699. Adam
Re: Constant folding and Constant propagation
On Fri, Feb 6, 2009 at 7:32 PM, Adam Nemet ane...@caviumnetworks.com wrote: I think you really need the Joern's optmize_related_values patch. Also see PR33699. I wouldn't recommend that patch, but yes: Something that performs that optimization ;-) Gr. Steven
Constant folding and Constant propagation
Dear all, I'm currently working on removing the constant folding and constant propagation because, on the architecture I'm working on, it is highly costly to move a constant into a register if the number is big (we can say over 16 bits). Currently, I've been looking into this and it seems that I have to hack into the fold-const.c file, CCP, SCCVN and probably the propagate passes to tell the compiler to be careful if the number is too big. But I'm sure there are many other places I'll have to hack and slash to tell him to stop the folding and propagation. Is there an easier way to do this ? I've been looking at what happens for the Itanium since you'd have the same problem if you had to use big numbers. You would in theory prefer to not use too many move long instructions since they use two slots in the bundle. But I fail to see if anything is really done in that respect. Thanks for your help, Jean Christophe Beyler
Re: Constant folding and Constant propagation
On Thu, Feb 05, 2009 at 12:34:14PM -0800, Jean Christophe Beyler wrote: I'm currently working on removing the constant folding and constant propagation because, on the architecture I'm working on, it is highly costly to move a constant into a register if the number is big (we can say over 16 bits). But in practice most constants that cprop deals with are values like -1, 0, 1, or 2. Wouldn't it be better to be able to constrain cprop based on the values of the constants, than to eliminate it altogether?
Re: Constant folding and Constant propagation
On Thu, Feb 05, 2009 at 12:46:01PM -0800, Joe Buck wrote: On Thu, Feb 05, 2009 at 12:34:14PM -0800, Jean Christophe Beyler wrote: I'm currently working on removing the constant folding and constant propagation because, on the architecture I'm working on, it is highly costly to move a constant into a register if the number is big (we can say over 16 bits). But in practice most constants that cprop deals with are values like -1, 0, 1, or 2. Wouldn't it be better to be able to constrain cprop based on the values of the constants, than to eliminate it altogether? I shouldn't have said most, but small constants are common.
Re: Constant folding and Constant propagation
True but what I've noticed with GCC 4.3.2 is that with this code: #include stdio.h #include stdlib.h int main(void) { long a = 0xcafecafe; printf(Final: %lx %lx %lx\n, a, a+5, a+15); return EXIT_SUCCESS; } Whether I compile it with a big number like here or a smaller number, I'll get something like: la r8,$LC0 #Loading the address of Final: %lx %lx %lx\n lid r9,0xcafecafe#Loading the first immediate lid r10,0xcafecb03 #Loading the second immediate lid r11,0xcafecb0d #Loading the third immediate But in my case, the lid is very costly for this large number. It would be more efficient to have: la r8,$LC0 #Loading the address of Final: %lx %lx %lx\n lid r9,0xcafecafe#Loading the first immediate addi r10,r9,0x5 #Loading the second immediate addi r11,r9,0xf #Loading the third immediate So my question is, can I explain this in the machine description alone and if so, does anybody know where? Thanks again, Jean Christophe Beyler On Thu, Feb 5, 2009 at 4:02 PM, Joe Buck joe.b...@synopsys.com wrote: On Thu, Feb 05, 2009 at 12:46:01PM -0800, Joe Buck wrote: On Thu, Feb 05, 2009 at 12:34:14PM -0800, Jean Christophe Beyler wrote: I'm currently working on removing the constant folding and constant propagation because, on the architecture I'm working on, it is highly costly to move a constant into a register if the number is big (we can say over 16 bits). But in practice most constants that cprop deals with are values like -1, 0, 1, or 2. Wouldn't it be better to be able to constrain cprop based on the values of the constants, than to eliminate it altogether? I shouldn't have said most, but small constants are common.
Re: Constant folding and Constant propagation
Jean Christophe Beyler jean.christophe.bey...@gmail.com writes: I'm currently working on removing the constant folding and constant propagation because, on the architecture I'm working on, it is highly costly to move a constant into a register if the number is big (we can say over 16 bits). Currently, I've been looking into this and it seems that I have to hack into the fold-const.c file, CCP, SCCVN and probably the propagate passes to tell the compiler to be careful if the number is too big. But I'm sure there are many other places I'll have to hack and slash to tell him to stop the folding and propagation. Is there an easier way to do this ? This type of thing is normally controlled by the TARGET_RTX_COSTS hook. E.g., see the handling of CONST_INT mips_rtx_costs in config/mips/mips.c. Ian