On 15/11/13 14:19, Yvan Roux wrote:
> Hi,
>
> I agree.  I looked at the ARM backend, and it turns out that the use
> of optimize_insn_for_size_p() was added so that store_minmax is only
> used on cold paths, because of a performance issue.  But in any case
> its use doesn't shrink the instruction count: in ARM mode 3
> instructions are needed either way, i.e. 1 comparison, 1 conditional
> move and 1 store at -O1, -O2 and -O3, and 1 comparison and 2
> conditional stores at -Os:
>
>   cmp     ip, r0
>   movlt   r0, ip
>   str     r0, [lr, r3]
>
Sometimes 4 will be needed, since both original register values may
remain live.

However, I'm inclined to agree that while it should be possible to
decide at the *function* level whether or not an insn is valid, doing
so at the block level is probably unsafe.

R.

> or
>
>   cmp     r5, r4
>   strle   r5, [lr, r3]
>   strgt   r4, [lr, r3]
>
> and in Thumb mode 4 are needed in both cases because of the "it"
> insertion.  So I think this insn can be removed.  Is that OK with
> you?
>
> Thanks,
> Yvan
>
>
> On 10 November 2013 14:17, Richard Biener <richard.guent...@gmail.com> wrote:
>> Steven Bosscher <stevenb....@gmail.com> wrote:
>>> Hello,
>>>
>>> This patch is necessary to make ARM pass the test suite with LRA
>>> enabled.  The symptom is recog failing to recognize a store_minmaxsi
>>> insn, see:
>>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00725.html
>>>
>>> But I am not sure whether that is also the root cause of the
>>> problem, or whether the ARM back end simply should not let
>>> recognition of insn patterns depend on the state of the profile
>>> flags at all.
>>>
>>> The pattern for store_minmaxsi (in arm.md) is:
>>>
>>> (define_insn "*store_minmaxsi"
>>>   [(set (match_operand:SI 0 "memory_operand" "=m")
>>>         (match_operator:SI 3 "minmax_operator"
>>>          [(match_operand:SI 1 "s_register_operand" "r")
>>>           (match_operand:SI 2 "s_register_operand" "r")]))
>>>    (clobber (reg:CC CC_REGNUM))]
>>>   "TARGET_32BIT && optimize_insn_for_size_p()"
>>>   "*
>>>   operands[3] = gen_rtx_fmt_ee (minmax_code (operands[3]), SImode,
>>>                                 operands[1], operands[2]);
>>>   output_asm_insn (\"cmp\\t%1, %2\", operands);
>>>   if (TARGET_THUMB2)
>>>     output_asm_insn (\"ite\t%d3\", operands);
>>>   output_asm_insn (\"str%d3\\t%1, %0\", operands);
>>>   output_asm_insn (\"str%D3\\t%2, %0\", operands);
>>>   return \"\";
>>>   "
>>>   [(set_attr "conds" "clob")
>>>    (set (attr "length")
>>>         (if_then_else (eq_attr "is_thumb" "yes")
>>>                       (const_int 14)
>>>                       (const_int 12)))
>>>    (set_attr "type" "store1")]
>>> )
>>>
>>> Note that the insn condition uses optimize_insn_for_size_p().  This
>>> means the pattern can be valid or invalid depending on the context
>>> where the insn appears: in hot or in cold code.  IMHO this behavior
>>> should not be allowed.  The back end cannot expect the middle end to
>>> know at all times the context that an insn appears in, and more
>>> importantly, whether a pattern is valid or not is independent of
>>> where the insn appears: that is a *cost* issue!
>>>
>>> It seems to me that the ARM back end should be fixed here, not LRA's
>>> check_rtl.
>>>
>>> Comments & thoughts?
>>
>> The intent is to allow this also to control combine and split, but
>> certainly making insns invalid after the fact is bad.  Think of
>> sinking a previously hot insn into a cold path...
>>
>> Honza, how was this supposed to work?
>>
>> Richard.
>>
>>> Ciao!
>>> Steven
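
To make the "sometimes 4" point above concrete: when both source
values are still needed after the minimum (or maximum) has been
stored, the conditional move can no longer overwrite either input, so
the result has to go through a scratch register.  A minimal sketch in
ARM mode (the choice of r4 as the scratch is illustrative, not taken
from the thread):

  cmp     ip, r0
  movge   r4, r0        @ min(ip, r0) into r4; ip and r0 stay intact
  movlt   r4, ip
  str     r4, [lr, r3]  @ 4 instructions instead of 3

In Thumb-2 an "ite ge" would additionally be needed in front of the
two conditional moves, mirroring what Yvan describes for the
conditional-store form.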
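
Separately, if the pattern were to be kept rather than removed, the
function-level decision Richard alludes to could be expressed by
testing GCC's per-function size heuristic instead of the per-insn one.
This is a hypothetical, untested variant of the insn condition only;
nothing in the thread commits to this approach:

  ;; Hypothetical alternative condition for *store_minmaxsi: decide per
  ;; function rather than per insn, so the pattern cannot flip between
  ;; valid and invalid when an insn moves between hot and cold blocks.
  "TARGET_32BIT && optimize_function_for_size_p (cfun)"

Since optimize_function_for_size_p (cfun) gives the same answer for
every insn in the function, recog would then accept the insn no matter
where later passes (including LRA) place it.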