Hello,
It seems to me that tree balancing risk of producing wrong result due
to overflow of subexpression. 

Say a = INT_MIN, b = 10, c = 10, d = INT_MAX. 

If
((a + b) + c) + d))

becomes 
((a + b) + (c + d))

c + d will overflow and the original won't. So the behaviour of
two are different. Though the architecture may manage to produce
correct result, it is undefined I think. 


Cheers,
Bingfeng



> -----Original Message-----
> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On 
> Behalf Of Ian Bolton
> Sent: 20 November 2009 15:05
> To: gcc@gcc.gnu.org
> Subject: Worth balancing the tree before scheduling?
> 
> From some simple experiments (see below), it appears as 
> though GCC aims
> to
> create a lop-sided tree when there are constants involved 
> (func1 below),
> but a balanced tree when there aren't (func2 below).
> 
> Our assumption is that GCC likes having constants all near to 
> each other
> to
> aid with tree-based optimisations, but I'm fairly sure that, when it
> comes
> to scheduling, it would be better to have a balanced tree, so 
> sched has
> more
> choices about what to schedule next?
> 
> The impact of limiting sched's options can be seen if we look at the
> pseudo-assembly produced by GCC for our architecture:
> 
> func1:
>         LSHIFT  $c3,$c1,3     # tmp137, a,
>         ADD     $c2,$c2,$c3   # tmp138, b, tmp137
>         ADD     $c1,$c2,$c1   #, tmp138, a
> 
> We think it would be better to avoid using the temporary:
> 
> func1:
>         ADD     $c2,$c1,$c2     # tmp137, a, b
>         LSHIFT  $c1,$c1,3       # tmp138, a,
>         ADD     $c1,$c2,$c1     # <result>, tmp137, tmp138
> 
> As it currently stands, sched doesn't have the option to do 
> this because
> its input (shown in func.c.172r.asmcons below) is arranged 
> such that the
> first add depends on the shift and the second add depends on the first
> add.
> 
> If the tree were balanced, sched would have the option to do the add
> first.
> And, providing the logic was there in sched, we could make it 
> choose to
> schedule such that we limit the number of temporaries used.
> 
> Maybe one of the RTL passes prior to scheduling has the potential to
> balance the tree/RTL, but just isn't on our architecture?
> 
> ======================================
> func.c:
> --------------------------------------
> int func1 (int a, int b)
> {
>   /* the original expression */
>   return a + b + (a << 3);
> }
>  
> 
> int func2 (int a, int b, int c)
> {
>   /* the original expression */
>   return a + b + (a << c);
> }
>  
> 
> ======================================
> 
> ======================================
> func.c.129t.supress_extend:
> --------------------------------------
> ;; Function func1 (func1)
>  
> func1 (int a, int b)
> {
> <bb 2>:
>   return (b + (a << 3)) + a;
> }
> 
> func2 (int a, int b, int c)
> {
> <bb 2>:
>   return (b + a) + (a << c);
> }
> 
>  
> ======================================
> 
> ======================================
> func.c.172r.asmcons:
> --------------------------------------
> 
> ;; Function func1 (func1)
> 
> ;; Pred edge  ENTRY [100.0%]  (fallthru)
> (note 5 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>  
> (insn 2 5 3 2 func.c:2 (set (reg/v:SI 134 [ a ])
>         (reg:SI 1 $c1 [ a ])) 45 {*movsi} (expr_list:REG_DEAD 
> (reg:SI 1
> $c1 [ a ])
>         (nil)))
>  
> (note 3 2 4 2 NOTE_INSN_DELETED)
>  
> (note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
>  
> (insn 7 4 8 2 func.c:2 (set (reg:SI 137)
>         (ashift:SI (reg/v:SI 134 [ a ])
>             (const_int 3 [0x3]))) 80 {ashlsi3} (nil))
>  
> (insn 8 7 9 2 func.c:2 (set (reg:SI 138)
>         (plus:SI (reg:SI 2 $c2 [ b ])
>             (reg:SI 137))) 65 {*addsi3} (expr_list:REG_DEAD 
> (reg:SI 137)
>         (expr_list:REG_DEAD (reg:SI 2 $c2 [ b ])
>             (nil))))
>  
> 
> (note 9 8 14 2 NOTE_INSN_DELETED)
>  
> 
> (insn 14 9 20 2 func.c:5 (set (reg/i:SI 1 $c1)
>         (plus:SI (reg:SI 138)
>             (reg/v:SI 134 [ a ]))) 65 {*addsi3} (expr_list:REG_DEAD
> (reg:SI 138)
>         (expr_list:REG_DEAD (reg/v:SI 134 [ a ])
>             (nil))))
>  
> 
> (insn 20 14 0 2 func.c:5 (use (reg/i:SI 1 $c1)) -1 (nil))
> 
> ;; Function func2 (func2)
> 
> ;; Pred edge  ENTRY [100.0%]  (fallthru)
> (note 6 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>  
> 
> (insn 2 6 3 2 func.c:8 (set (reg/v:SI 134 [ a ])
>         (reg:SI 1 $c1 [ a ])) 45 {*movsi} (expr_list:REG_DEAD 
> (reg:SI 1
> $c1 [ a ])
>         (nil)))
>  
> 
> (note 3 2 4 2 NOTE_INSN_DELETED)
>  
> 
> (note 4 3 5 2 NOTE_INSN_DELETED)
>  
> 
> (note 5 4 8 2 NOTE_INSN_FUNCTION_BEG)
>  
> 
> (insn 8 5 9 2 func.c:8 (set (reg:SI 138)
>         (plus:SI (reg:SI 2 $c2 [ b ])
>             (reg/v:SI 134 [ a ]))) 65 {*addsi3} (expr_list:REG_DEAD
> (reg:SI 2 $c2 [ b ])
>         (nil)))
>  
> 
> (insn 9 8 10 2 func.c:8 (set (reg:SI 139)
>         (ashift:SI (reg/v:SI 134 [ a ])
>             (reg:SI 3 $c3 [ c ]))) 80 {ashlsi3} (expr_list:REG_DEAD
> (reg/v:SI 134 [ a ])
>         (expr_list:REG_DEAD (reg:SI 3 $c3 [ c ])
>             (nil))))
>  
> 
> (note 10 9 15 2 NOTE_INSN_DELETED)
>  
> 
> (insn 15 10 21 2 func.c:11 (set (reg/i:SI 1 $c1)
>         (plus:SI (reg:SI 138)
>             (reg:SI 139))) 65 {*addsi3} (expr_list:REG_DEAD 
> (reg:SI 139)
>         (expr_list:REG_DEAD (reg:SI 138)
>             (nil))))
>  
> 
> (insn 21 15 0 2 func.c:11 (use (reg/i:SI 1 $c1)) -1 (nil))
> 
> ======================================
> 
> Cheers,
> Ian
> 
> 

Reply via email to