From some simple experiments (see below), it appears as though GCC aims to create a lop-sided tree when there are constants involved (func1 below), but a balanced tree when there aren't (func2 below).
Our assumption is that GCC likes having constants all near to each other to aid with tree-based optimisations, but I'm fairly sure that, when it comes to scheduling, it would be better to have a balanced tree, so sched has more choices about what to schedule next? The impact of limiting sched's options can be seen if we look at the pseudo-assembly produced by GCC for our architecture: func1: LSHIFT $c3,$c1,3 # tmp137, a, ADD $c2,$c2,$c3 # tmp138, b, tmp137 ADD $c1,$c2,$c1 #, tmp138, a We think it would be better to avoid using the temporary: func1: ADD $c2,$c1,$c2 # tmp137, a, b LSHIFT $c1,$c1,3 # tmp138, a, ADD $c1,$c2,$c1 # <result>, tmp137, tmp138 As it currently stands, sched doesn't have the option to do this because its input (shown in func.c.172r.asmcons below) is arranged such that the first add depends on the shift and the second add depends on the first add. If the tree were balanced, sched would have the option to do the add first. And, providing the logic was there in sched, we could make it choose to schedule such that we limit the number of temporaries used. Maybe one of the RTL passes prior to scheduling has the potential to balance the tree/RTL, but just isn't on our architecture? ====================================== func.c: -------------------------------------- int func1 (int a, int b) { /* the original expression */ return a + b + (a << 3); } int func2 (int a, int b, int c) { /* the original expression */ return a + b + (a << c); } ====================================== ====================================== func.c.129t.supress_extend: -------------------------------------- ;; Function func1 (func1) func1 (int a, int b) { <bb 2>: return (b + (a << 3)) + a; } func2 (int a, int b, int c) { <bb 2>: return (b + a) + (a << c); } ====================================== ====================================== func.c.172r.asmcons: -------------------------------------- ;; Function func1 (func1) ;; Pred edge ENTRY [100.0%] (fallthru) (note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 2 5 3 2 func.c:2 (set (reg/v:SI 134 [ a ]) (reg:SI 1 $c1 [ a ])) 45 {*movsi} (expr_list:REG_DEAD (reg:SI 1 $c1 [ a ]) (nil))) (note 3 2 4 2 NOTE_INSN_DELETED) (note 4 3 7 2 NOTE_INSN_FUNCTION_BEG) (insn 7 4 8 2 func.c:2 (set (reg:SI 137) (ashift:SI (reg/v:SI 134 [ a ]) (const_int 3 [0x3]))) 80 {ashlsi3} (nil)) (insn 8 7 9 2 func.c:2 (set (reg:SI 138) (plus:SI (reg:SI 2 $c2 [ b ]) (reg:SI 137))) 65 {*addsi3} (expr_list:REG_DEAD (reg:SI 137) (expr_list:REG_DEAD (reg:SI 2 $c2 [ b ]) (nil)))) (note 9 8 14 2 NOTE_INSN_DELETED) (insn 14 9 20 2 func.c:5 (set (reg/i:SI 1 $c1) (plus:SI (reg:SI 138) (reg/v:SI 134 [ a ]))) 65 {*addsi3} (expr_list:REG_DEAD (reg:SI 138) (expr_list:REG_DEAD (reg/v:SI 134 [ a ]) (nil)))) (insn 20 14 0 2 func.c:5 (use (reg/i:SI 1 $c1)) -1 (nil)) ;; Function func2 (func2) ;; Pred edge ENTRY [100.0%] (fallthru) (note 6 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK) (insn 2 6 3 2 func.c:8 (set (reg/v:SI 134 [ a ]) (reg:SI 1 $c1 [ a ])) 45 {*movsi} (expr_list:REG_DEAD (reg:SI 1 $c1 [ a ]) (nil))) (note 3 2 4 2 NOTE_INSN_DELETED) (note 4 3 5 2 NOTE_INSN_DELETED) (note 5 4 8 2 NOTE_INSN_FUNCTION_BEG) (insn 8 5 9 2 func.c:8 (set (reg:SI 138) (plus:SI (reg:SI 2 $c2 [ b ]) (reg/v:SI 134 [ a ]))) 65 {*addsi3} (expr_list:REG_DEAD (reg:SI 2 $c2 [ b ]) (nil))) (insn 9 8 10 2 func.c:8 (set (reg:SI 139) (ashift:SI (reg/v:SI 134 [ a ]) (reg:SI 3 $c3 [ c ]))) 80 {ashlsi3} (expr_list:REG_DEAD (reg/v:SI 134 [ a ]) (expr_list:REG_DEAD (reg:SI 3 $c3 [ c ]) (nil)))) (note 10 9 15 2 NOTE_INSN_DELETED) (insn 15 10 21 2 func.c:11 (set (reg/i:SI 1 $c1) (plus:SI (reg:SI 138) (reg:SI 139))) 65 {*addsi3} (expr_list:REG_DEAD (reg:SI 139) (expr_list:REG_DEAD (reg:SI 138) (nil)))) (insn 21 15 0 2 func.c:11 (use (reg/i:SI 1 $c1)) -1 (nil)) ====================================== Cheers, Ian