On June 26, 2020 3:24:24 AM GMT+02:00, Alan Lehotsky <[email protected]> wrote:
>On Jun 25, 2020, at 6:37 PM, Jeff Law
><[email protected]> wrote:
>
>On Thu, 2020-06-25 at 15:46 -0400, Alan Lehotsky wrote:
>I’m working on a GCC 8.3 port to a load/store architecture with a
>32-bit data-path between registers and memory;
>
>looking at the gcc.dg/loop-9.c test, I fail to pass because I have
>split the move of a double constant to memory into multiple moves (4 in
>fact, because I only have a 16-bit immediate mode.)
>
>The (define_insn_and_split “movdf” …) is conditioned on
>“reload_completed”.
>
>Is there some other trick I need to get the constant hoisted? I have
>already set the rtx cost of the CONST_DOUBLE ridiculously high (like 10
>insns).
>Hi Alan, it's been a long time...
>
>We'd probably need to see the RTL. A variety of things can get in the
>way of LICM. For example, I'd expect subregs to be problematic because
>they can look like RMW operations.
>
>jeff
>
>
>
>Hello to you too, Jeff…. I’ve been lurking for the last decade or so;
>the last port I actually did was GCC 4 based, so there is lots of new
>stuff to try and wrap my head around. I certainly am grateful to anybody
>with suggestions as to how to track down this problem (I’m not terribly
>eager to step through an x86 gcc in parallel with my port to see where
>they diverge in the loop-invariant recognition.)
>
>Although in crafting this expanded email, I see that the x86 has
>already decided to store the constant 18.4242 in the .rodata section by
>the start of loop-invariance so there’s a
>
> (set (reg:DF…. ) (mem:DF (symbol_ref ….)))
>
>and I bet that’s far easier to move out of the loop than it would be to
>split the original
>
> (set (mem:DF…) (const_double:DF ….))
Immediate operands are never moved or CSEd by either RTL or GIMPLE, so if you
do not have const_double immediates the best thing to do is not make them
legitimate.
Richard.
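
As a source-level illustration only (a sketch, not what any particular port
emits): once the constant is no longer a legitimate immediate, it lives in
memory and the store in the loop is fed by a register. The load that fills
that register is an ordinary invariant, which is what the x86 dump already
shows. The names pool_entry and f_hoisted are hypothetical; pool_entry stands
in for the constant-pool entry in .rodata.

```c
/* Hypothetical stand-in for the constant-pool entry in .rodata.  */
static const double pool_entry = 18.4242;

/* Same loop as f () in loop-9.c, written the way it looks after the
   constant has become an invariant load rather than an immediate.  */
void f_hoisted (double *a)
{
  /* One load from the pool, outside the loop.  */
  double c = pool_entry;
  int i;

  for (i = 0; i < 100; i++)
    a[i] = c;   /* The loop body only stores the register.  */
}
```

With the constant kept as an immediate there is no separate insn for the
invariant pass to move, which matches the single movdf store in the dump
below.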
>— Al
>
>==========
>
>Source code is
>
>void f (double *a)
>{
>int i;
>for (i = 0; i < 100; i++)
>a[i] = 18.4242;
>}
>==========
>
>Here’s the dump from loop-9.c.252r.loop2-invariant (compiled -O1)
>
>
>;; Function f (f, funcdef_no=0, decl_uid=1458, cgraph_uid=0,
>symbol_order=0)
>
>*****starting processing of loop 1 ******
>starting the processing of deferred insns
>ending the processing of deferred insns
>setting blocks to analyze 3, 5
>starting the processing of deferred insns
>ending the processing of deferred insns
>df_analyze called
>df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 2 (
>0.33)
>df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 2 (
>0.33)
>df_worklist_dataflow_doublequeue: n_basic_blocks 6 n_edges 6 count 3 (
>0.5)
>
>
>starting region dump
>
>
>f
>
>Dataflow summary:
>def_info->table_size = 3, use_info->table_size = 23
>;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6
>[d6] 7 [d7] 8 [d8] 9 [d9] 14 [d14] 15 [d15] 16 [a0] 19 [a3] 20 [a4] 24
>[acc0_hi] 25 [acc0_lo] 26 [acc1_hi] 27 [acc1_lo] 28 [source3] 30 [cc]
>31 [int_set0] 32 [int_set1] 33 [int_clr0] 34 [int_clr1] 35
>[scratchpad0] 36 [scratchpad1] 37 [scratchpad2] 38 [scratchpad3]
>;; hardware regs used 23 [sp] 29 [arg] 39 [sfp]
>;; regular block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; eh block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; entry block defs 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7
>[d7] 8 [d8] 9 [d9] 21 [a5] 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; exit block uses 22 [a6] 23 [sp] 39 [sfp]
>;; regs ever live 0 [d0] 30 [cc]
>;; ref usage r0={1d,1u} r1={1d} r2={1d} r3={1d} r4={1d} r5={1d}
>r6={1d} r7={1d} r8={1d} r9={1d} r21={1d} r22={1d,5u} r23={1d,5u}
>r29={1d,4u} r30={3d,1u} r39={1d,5u} r46={2d,4u} r48={1d,1u}
>;; total ref usage 47{21d,26u,0e} in 6{6 regular + 0 call} insns.
>;; Reaching defs:
>;; sparse invalidated
>;; dense invalidated 0, 1
>;; reg->defs[] map: 30[0,1] 46[2,2]
>;; bb 3 artificial_defs: { }
>;; bb 3 artificial_uses: { u7(22){ }u8(23){ }u9(29){ }u10(39){ }}
>;; lr in 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; lr use 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; lr def 30 [cc] 46
>;; live in 46
>;; live gen 30 [cc] 46
>;; live kill 30 [cc]
>;; rd in (1) 46[2]
>;; rd gen (2) 30[1],46[2]
>;; rd kill (3) 30[0,1],46[2]
>;; UD chains for artificial uses at top
>
>(code_label 11 7 8 3 2 (nil) [0 uses])
>(note 8 11 9 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
>;; UD chains for insn luid 0 uid 9
>;; reg 46 { d2(bb 3 insn 10) }
>(insn 9 8 10 3 (set (mem:DF (reg:SI 46 [ ivtmp___6 ]) [0 MEM[base: _15,
>offset: 0B]+0 S8 A32])
>(const_double:DF 1.84241999999999990222931955941021442413330078125e+1
>[0x0.9364c2f837b4ap+5])) "loop-9.c":9 19 {movdf}
> (nil))
>;; UD chains for insn luid 1 uid 10
>;; reg 46 { d2(bb 3 insn 10) }
>(insn 10 9 12 3 (parallel [
> (set (reg:SI 46 [ ivtmp___6 ])
> (plus:SI (reg:SI 46 [ ivtmp___6 ])
> (const_int 8 [0x8])))
> (clobber (reg:CC 30 cc))
> ]) 81 {addsi3_1v5}
> (expr_list:REG_UNUSED (reg:CC 30 cc)
> (nil)))
>;; UD chains for insn luid 2 uid 12
>;; reg 46 { d2(bb 3 insn 10) }
>;; reg 48 { }
>(insn 12 10 13 3 (set (reg:CCWZ 30 cc)
> (compare:CCWZ (reg:SI 46 [ ivtmp___6 ])
> (reg:SI 48 [ _17 ]))) "loop-9.c":8 57 {cmpsi_sub4}
> (nil))
>;; UD chains for insn luid 3 uid 13
>;; reg 30 { d1(bb 3 insn 12) }
>(jump_insn 13 12 18 3 (set (pc)
> (if_then_else (ne:CCWZ (reg:CCWZ 30 cc)
> (const_int 0 [0]))
> (label_ref:SI 18)
> (pc))) "loop-9.c":8 177 {jcc}
> (expr_list:REG_DEAD (reg:CCWZ 30 cc)
> (int_list:REG_BR_PROB 1063004412 (nil)))
> -> 18)
>;; lr out 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; live out 46
>;; rd out (1) 46[2]
>;; UD chains for artificial uses at bottom
>;; reg 22 { }
>;; reg 23 { }
>;; reg 29 { }
>;; reg 39 { }
>
>
>;; bb 5 artificial_defs: { }
>;; bb 5 artificial_uses: { u-1(22){ }u-1(23){ }u-1(29){ }u-1(39){ }}
>;; lr in 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; lr use 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; lr def
>;; live in 46
>;; live gen
>;; live kill
>;; rd in (2) 30[1],46[2]
>;; rd gen (0)
>;; rd kill (0)
>;; UD chains for artificial uses at top
>
>(code_label 18 13 17 5 3 (nil) [1 uses])
>(note 17 18 14 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
>;; lr out 22 [a6] 23 [sp] 29 [arg] 39 [sfp] 46 48
>;; live out 46
>;; rd out (1) 46[2]
>;; UD chains for artificial uses at bottom
>;; reg 22 { }
>;; reg 23 { }
>;; reg 29 { }
>;; reg 39 { }
>
>
>
>*****ending processing of loop 1 ******
>starting the processing of deferred insns
>ending the processing of deferred insns
>
>
>f
>
>Dataflow summary:
>;; invalidated by call 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6
>[d6] 7 [d7] 8 [d8] 9 [d9] 14 [d14] 15 [d15] 16 [a0] 19 [a3] 20 [a4] 24
>[acc0_hi] 25 [acc0_lo] 26 [acc1_hi] 27 [acc1_lo] 28 [source3] 30 [cc]
>31 [int_set0] 32 [int_set1] 33 [int_clr0] 34 [int_clr1] 35
>[scratchpad0] 36 [scratchpad1] 37 [scratchpad2] 38 [scratchpad3]
>;; hardware regs used 23 [sp] 29 [arg] 39 [sfp]
>;; regular block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; eh block artificial uses 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; entry block defs 0 [d0] 1 [d1] 2 [d2] 3 [d3] 4 [d4] 5 [d5] 6 [d6] 7
>[d7] 8 [d8] 9 [d9] 21 [a5] 22 [a6] 23 [sp] 29 [arg] 39 [sfp]
>;; exit block uses 22 [a6] 23 [sp] 39 [sfp]
>;; regs ever live 0 [d0] 30 [cc]
>;; ref usage r0={1d,1u} r1={1d} r2={1d} r3={1d} r4={1d} r5={1d}
>r6={1d} r7={1d} r8={1d} r9={1d} r21={1d} r22={1d,5u} r23={1d,5u}
>r29={1d,4u} r30={3d,1u} r39={1d,5u} r46={2d,4u} r48={1d,1u}
>;; total ref usage 47{21d,26u,0e} in 6{6 regular + 0 call} insns.
>(note 4 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>(insn 2 4 3 2 (set (reg:SI 46 [ ivtmp___6 ])
> (reg:SI 0 d0 [ a ])) "loop-9.c":6 7 {movsi_internal}
> (expr_list:REG_DEAD (reg:SI 0 d0 [ a ])
> (nil)))
>(note 3 2 7 2 NOTE_INSN_FUNCTION_BEG)
>(insn 7 3 11 2 (parallel [
> (set (reg:SI 48 [ _17 ])
> (plus:SI (reg:SI 46 [ ivtmp___6 ])
> (const_int 800 [0x320])))
> (clobber (reg:CC 30 cc))
> ]) 81 {addsi3_1v5}
> (expr_list:REG_UNUSED (reg:CC 30 cc)
> (nil)))
>(code_label 11 7 8 3 2 (nil) [0 uses])
>(note 8 11 9 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
>(insn 9 8 10 3 (set (mem:DF (reg:SI 46 [ ivtmp___6 ]) [0 MEM[base: _15,
>offset: 0B]+0 S8 A32])
>(const_double:DF 1.84241999999999990222931955941021442413330078125e+1
>[0x0.9364c2f837b4ap+5])) "loop-9.c":9 19 {movdf}
> (nil))
>(insn 10 9 12 3 (parallel [
> (set (reg:SI 46 [ ivtmp___6 ])
> (plus:SI (reg:SI 46 [ ivtmp___6 ])
> (const_int 8 [0x8])))
> (clobber (reg:CC 30 cc))
> ]) 81 {addsi3_1v5}
> (expr_list:REG_UNUSED (reg:CC 30 cc)
> (nil)))
>(insn 12 10 13 3 (set (reg:CCWZ 30 cc)
> (compare:CCWZ (reg:SI 46 [ ivtmp___6 ])
> (reg:SI 48 [ _17 ]))) "loop-9.c":8 57 {cmpsi_sub4}
> (nil))
>(jump_insn 13 12 18 3 (set (pc)
> (if_then_else (ne:CCWZ (reg:CCWZ 30 cc)
> (const_int 0 [0]))
> (label_ref:SI 18)
> (pc))) "loop-9.c":8 177 {jcc}
> (expr_list:REG_DEAD (reg:CCWZ 30 cc)
> (int_list:REG_BR_PROB 1063004412 (nil)))
> -> 18)
>(code_label 18 13 17 5 3 (nil) [1 uses])
>(note 17 18 14 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
>(note 14 17 0 4 [bb 4] NOTE_INSN_BASIC_BLOCK)