The dump from the subreg pass shows this:
(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
(const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
(const_int 8 [0x8])))) 71 {movdi_internal} (nil))
(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
(symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)) 71
{movdi_internal} (nil))
...
(insn 10 9 11 2 ex1b.c:8 (set (reg/f:DI 79)
(const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
(const_int 16 [0x10])))) 71 {movdi_internal} (nil))
As we can see, all three insns start from the symbol_ref for data, and
two of them add a constant offset to it. But after cse, we get this:
(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
(const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
(const_int 8 [0x8])))) 71 {movdi_internal} (nil))
(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
(symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)) 71
{movdi_internal} (nil))
...
(insn 10 9 11 2 ex1b.c:8 (set (reg/f:DI 79)
(plus:DI (reg/f:DI 75)
(const_int 16 [0x10]))) 2 {adddi3_port}
(expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("data")
<var_decl 0
(const_int 16 [0x10])))
(nil)))
As we can see, instead of expressing all three in terms of reg 74, the
CSE pass rewrites only the last one, in terms of reg 75.
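To make the difference concrete, here is a rough C-level sketch of the two
address shapes. It is only an illustration: the declaration and contents of
data, the helper load64 and both function names are assumptions, not taken
from ex1b.c. The register names in the comments refer to the pseudos in the
dump above.

```c
#include <stdint.h>

/* Assumed declaration; the real ex1b.c is not shown in full. */
uint64_t data[3] = { 1, 2, 3 };

/* Hypothetical helper: load 8 bytes from base + offset. */
static uint64_t load64(const char *base, long offset)
{
    return *(const uint64_t *)(base + offset);
}

/* Shape after cse1: two separate address loads (reg 74 and reg 75),
   and only the third access reuses reg 75 with an offset of 16. */
uint64_t foo_after_cse(void)
{
    const char *r75 = (const char *)data;      /* la rX,data   */
    const char *r74 = (const char *)data + 8;  /* la rY,data+8 */
    return load64(r74, 0) + load64(r75, 0) + load64(r75, 16);
}

/* Shape with a single base: one address load, offsets 0, 8 and 16. */
uint64_t foo_single_base(void)
{
    const char *base = (const char *)data;     /* la rX,data */
    return load64(base, 0) + load64(base, 8) + load64(base, 16);
}
```

Both functions compute the same sum; the only difference is how many
base addresses have to be materialized.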
I have put the whole cse dump at the end of this email, so that this
part does not get too long.
Thanks again,
Jean Christophe Beyler
------------------ Dump of cse1 ------------------
;; Function foo (foo)
3 basic blocks, 2 edges.
Basic block 0 , next 2, loop_depth 0, count 0, freq 10000, maybe hot.
Predecessors:
Successors: 2 [100.0%] (fallthru)
Basic block 2 , prev 0, next 1, loop_depth 0, count 0, freq 10000, maybe hot.
Predecessors: ENTRY [100.0%] (fallthru)
Successors: EXIT [100.0%] (fallthru)
Basic block 1 , prev 2, loop_depth 0, count 0, freq 10000, maybe hot.
Predecessors: 2 [100.0%] (fallthru)
Successors:
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_overeager:n_basic_blocks 3 n_edges 2 count 3 ( 1)
foo
Dataflow summary:
def_info->table_size = 0, use_info->table_size = 0
;; invalidated by call 2 [r2] 4 [r4] 5 [r5] 6 [r6] 7 [r7] 8 [r8] 9
[r9] 10 [r10] 11 [r11] 12 [r12] 13 [r13] 14 [r14] 15 [r15] 16 [r16] 17
[r17] 18 [r18] 19 [r19] 20 [r20] 21 [r21] 22 [r22] 23 [r23] 24 [r24]
25 [r25] 26 [r26] 27 [r27] 28 [r28] 29 [r29] 30 [r30] 31 [r31] 32
[r32] 33 [r33] 34 [r34] 35 [r35] 36 [r36] 37 [r37] 38 [r38] 39 [r39]
40 [r40] 41 [r41] 42 [r42] 43 [r43] 44 [r44] 45 [r45] 46 [r46] 47
[r47] 63 [r63] 64 [$rap] 65 [cc] 66 [acc]
;; hardware regs used 0 [r0] 1 [r1] 3 [r3]
;; regular block artificial uses 0 [r0] 1 [r1] 3 [r3] 62 [r62]
;; eh block artificial uses 0 [r0] 1 [r1] 3 [r3] 62 [r62]
;; entry block defs 0 [r0] 1 [r1] 3 [r3] 6 [r6] 8 [r8] 9 [r9] 10
[r10] 11 [r11] 12 [r12] 13 [r13] 14 [r14] 15 [r15] 62 [r62] 63 [r63]
;; exit block uses 1 [r1] 3 [r3] 6 [r6] 62 [r62]
;; regs ever live 6[r6]
( )->[0]->( 2 )
;; bb 0 artificial_defs: { d-1(0){ }d-1(1){ }d-1(3){ }d-1(6){ }d-1(8){
}d-1(9){ }d-1(10){ }d-1(11){ }d-1(12){ }d-1(13){ }d-1(14){ }d-1(15){
}d-1(62){ }d-1(63){ }}
;; bb 0 artificial_uses: { }
( 0 )->[2]->( 1 )
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u-1(0){ }u-1(1){ }u-1(3){ }u-1(62){ }}
( 2 )->[1]->( )
;; bb 1 artificial_defs: { }
;; bb 1 artificial_uses: { u-1(1){ }u-1(3){ }u-1(6){ }u-1(62){ }}
Finding needed instructions:
Adding insn 23 to worklist
Finished finding needed instructions:
processing block 2 live out = 0 [r0] 1 [r1] 3 [r3] 6 [r6] 62 [r62]
Adding insn 17 to worklist
Adding insn 13 to worklist
Adding insn 12 to worklist
Adding insn 11 to worklist
Adding insn 10 to worklist
Adding insn 9 to worklist
Adding insn 8 to worklist
Adding insn 7 to worklist
Adding insn 6 to worklist
Adding insn 5 to worklist
df_worklist_dataflow_overeager:n_basic_blocks 3 n_edges 2 count 3 ( 1)
;; Following path with 11 sets: 2
deferring rescan insn with uid = 10.
deferring rescan insn with uid = 17.
try_optimize_cfg iteration 1
(note 3 0 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 2 3 5 2 NOTE_INSN_FUNCTION_BEG)
(insn 5 2 6 2 ex1b.c:8 (set (reg/f:DI 74)
(const:DI (plus:DI (symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)
(const_int 8 [0x8])))) 71 {movdi_internal} (nil))
(insn 6 5 7 2 ex1b.c:8 (set (reg/f:DI 75)
(symbol_ref:DI ("data") <var_decl 0xb7d35058 data>)) 71
{movdi_internal} (nil))
(insn 7 6 8 2 ex1b.c:8 (set (reg:DI 77 [ data+8 ])
(mem/s:DI (reg/f:DI 74) [2 data+8 S8 A64])) 71 {movdi_internal} (nil))
(insn 8 7 9 2 ex1b.c:8 (set (reg:DI 78 [ data ])
(mem/s:DI (reg/f:DI 75) [2 data+0 S8 A64])) 71 {movdi_internal} (nil))
(insn 9 8 10 2 ex1b.c:8 (set (reg:DI 76)
(plus:DI (reg:DI 77 [ data+8 ])
(reg:DI 78 [ data ]))) 2 {adddi3_port} (nil))
(insn 10 9 11 2 ex1b.c:8 (set (reg/f:DI 79)
(plus:DI (reg/f:DI 75)
(const_int 16 [0x10]))) 2 {adddi3_port}
(expr_list:REG_EQUAL (const:DI (plus:DI (symbol_ref:DI ("data")
<var_decl 0xb7d35058 data>)
(const_int 16 [0x10])))
(nil)))
(insn 11 10 12 2 ex1b.c:8 (set (reg:DI 80 [ data+16 ])
(mem/s:DI (reg/f:DI 79) [2 data+16 S8 A64])) 71 {movdi_internal} (nil))
(insn 12 11 13 2 ex1b.c:8 (set (reg:DI 73)
(plus:DI (reg:DI 76)
(reg:DI 80 [ data+16 ]))) 2 {adddi3_port} (nil))
(insn 13 12 17 2 ex1b.c:8 (set (reg:DI 72 [ <result> ])
(reg:DI 73)) 71 {movdi_internal} (nil))
(insn 17 13 23 2 ex1b.c:10 (set (reg/i:DI 6 r6)
(reg:DI 73)) 71 {movdi_internal} (nil))
(insn 23 17 0 2 ex1b.c:10 (use (reg/i:DI 6 r6)) -1 (nil))
starting the processing of deferred insns
rescanning insn with uid = 10.
deleting insn with uid = 10.
rescanning insn with uid = 17.
deleting insn with uid = 17.
ending the processing of deferred insns
On Wed, Jul 15, 2009 at 12:25 PM, Adam Nemet<[email protected]> wrote:
> Jean Christophe Beyler <[email protected]> writes:
>> uint64_t foo (void)
>> {
>> return data[0] + data[1] + data[2];
>> }
>>
>> And this generates :
>>
>> la r9,data
>> la r7,data+8
>> ldd r6,0(r7)
>> ldd r8,0(r9)
>> ldd r7,16(r9)
>>
>> I'm trying to see if there is a problem with my rtx costs function
>> because again, I don't understand why it would generate 2 la instead
>> of using an offset of 8 and 16.
>
> You probably want to look at the RTL dumps. This code should have been
> expanded with the correct offsets (at least that is what happens on
> MIPS). I don't see how later passes would modify the code other than
> removing 2 of the 3 "la rX, data" insns.
>
> Adam
>