On Thu, 2011-04-21 at 12:23 +0100, Andrew Stubbs wrote:
> This patch is a repost of the one I previously posted here:
> 
>    http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
> 
> As requested, I've broken out the other parts of the original patch; 
> those were reposted yesterday (and one has already been committed).
> 
> This (final) part is support for using Thumb2's replicated constants and 
> addw/subw instructions as part of split constant loads. Previously the 
> compiler could use these constants, but only where they would be loaded 
> in a single instruction.
> 
> This patch must be applied on top of the addw/subw patch I posted yesterday.
> 
> The patch also optimizes the use of inverted or negated constants as a 
> short-cut to the final value. The previous code did this in some cases, 
> but could not be easily adapted to replicated constants.
> 
> The previous code also had a bug that prevented optimal use of shifted 
> constants in Thumb code by imposing the same restrictions as ARM code. 
> This has been fixed.
> 
> Example 1: addw as part of a split constant load
> 
> a + 0xfffff
> 
>     Before:
>           movw    r3, #65535       ; 0x0ffff
>           movt    r3, 15           ; 0xf0000
>           adds    r3, r0, r3
>     After:
>           add     r0, r0, #1044480 ; 0xff000
>           addw    r0, r0, #4095    ; 0x00fff
> 
> Example 2: arbitrary shifts bug fix
> 
> a - 0xfff1
> 
>     Before:
>           sub     r0, r0, #65024   ; 0xfe00
>           sub     r0, r0, #496     ; 0x01f0
>           sub     r0, r0, #1       ; 0x0001
>     After:
>           sub     r0, r0, #65280   ; 0xff00
>           sub     r0, r0, #241     ; 0x00f1
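
To make the ARM/Thumb difference in Example 2 concrete, here is a rough sketch of the two "shifted 8-bit" immediate rules as I understand them from the architecture manual: ARM mode takes an 8-bit value rotated right by an even amount, while Thumb-2 takes an 8-bit value shifted left by any amount without wrap-around. The helper names are mine and are not taken from arm.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rotate V left by N bits (N is reduced modulo 32). */
static uint32_t rotl32(uint32_t v, unsigned n)
{
    n &= 31;
    return n ? (v << n) | (v >> (32 - n)) : v;
}

/* ARM mode: an 8-bit value rotated right by an even amount.
   Rotating V left by the same amount must recover the 8-bit value. */
static bool arm_shifted_imm_p(uint32_t v)
{
    for (unsigned rot = 0; rot < 32; rot += 2)
        if (rotl32(v, rot) <= 0xff)
            return true;
    return false;
}

/* Thumb-2: an 8-bit value shifted left by any amount (odd or even),
   but with no wrap-around from bit 31 to bit 0. */
static bool thumb2_shifted_imm_p(uint32_t v)
{
    for (unsigned shift = 0; shift <= 24; shift++)
        if ((v & ~((uint32_t)0xff << shift)) == 0)
            return true;
    return false;
}
```

With these definitions 0x1fe (0xff shifted left by one) is accepted only by the Thumb-2 check, and 0xf000000f (0xff rotated right by four) only by the ARM check, so applying the ARM rule in Thumb code rejects constants that Thumb-2 can encode.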
> 
> Example 3: 16-bit replicated patterns
> 
> a + 0x44004401
> 
>     Before:
>           movw    r3, #17409          ; 0x00004401
>           movt    r3, 17408           ; 0x44000000
>           adds    r3, r0, r3
>     After:
>           add     r0, r0, #1140868096 ; 0x44004400
>           adds    r0, r0, #1          ; 0x00000001
> 
> Example 4: 32-bit replicated patterns
> 
> a & 0xaaaaaa00
> 
>     Before:
>           mov     r3, #43520           ; 0x0000aa00
>           movt    r3, 43690            ; 0xaaaa0000
>           and     r3, r0, r3
>     After:
>           and     r0, r0, #-1431655766 ; 0xaaaaaaaa
>           bic     r0, r0, #170         ; 0x000000aa
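
For reference, the Thumb-2 replicated forms, as I understand them, are 0x00XY00XY, 0xXY00XY00 and 0xXYXYXYXY. A minimal check in C, using a helper name of my own rather than anything from arm.c, might look like this:

```c
#include <stdbool.h>
#include <stdint.h>

/* Return true if V matches one of the Thumb-2 replicated patterns.
   Zero also passes; it is encodable as a plain 8-bit immediate anyway. */
static bool thumb2_replicated_imm_p(uint32_t v)
{
    uint32_t b = v & 0xff;

    if (v == (b | (b << 16)))                          /* 0x00XY00XY */
        return true;
    if (v == (b | (b << 8) | (b << 16) | (b << 24)))   /* 0xXYXYXYXY */
        return true;

    b = (v >> 8) & 0xff;
    return v == ((b << 8) | (b << 24));                /* 0xXY00XY00 */
}
```

Neither 0x44004401 nor 0xaaaaaa00 passes this check directly, but each is one cheap instruction away from a constant that does: 0x44004400 + 1 in Example 3, and 0xaaaaaaaa with 0xaa cleared in Example 4.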
> 
> The constant splitting code was duplicated in two places, and I would 
> have needed to modify both quite heavily, so I have taken the 
> opportunity to unify the two, and hopefully reduce the future 
> maintenance burden.
> 
> Let me respond to a point Richard Earnshaw raised following the original 
> posting:
> 
>  > A final note is that you may have missed some cases.  Now that we
>  > have movw,
>  >    reg & ~(16-bit const)
>  > can now be done in at most 2 insns:
>  >    movw t1, #16-bit const
>  >    bic  Rd, reg, t1
> 
> Actually, I think we can do better than that for a 16-bit constant.
> 
> Given:
> 
>     a & ~(0xabcd)
> 
> Before my changes, GCC gave:
> 
>          bic     r0, r0, #43520
>          bic     r0, r0, #460
>          bic     r0, r0, #1
> 
> and after applying my patch:
> 
>          bic     r0, r0, #43776
>          bic     r0, r0, #205
> 
> Two instructions and no temporary register.
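
The two-BIC sequence works because clearing bits is cumulative: a & ~0xabcd is the same as (a & ~0xab00) & ~0x00cd, and 0xab00 and 0xcd are each 8-bit values at some shift. A small C model of this, with a function name invented for the illustration:

```c
#include <stdint.h>

/* Model of "bic Rd, Rn, #imm": clear in RN the bits that are set in IMM. */
static uint32_t bic(uint32_t rn, uint32_t imm)
{
    return rn & ~imm;
}

/* The "after" sequence from the example above. */
static uint32_t clear_abcd_two_bics(uint32_t a)
{
    a = bic(a, 0xab00);     /* bic r0, r0, #43776 */
    return bic(a, 0x00cd);  /* bic r0, r0, #205   */
}
```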
> 
>  > On thumb-2 you can also use ORN that way as well.
> 
> It turns out that my previous patch was broken for ORN. I traced the 
> problem to some confusing code already in arm.c that set can_invert for 
> IOR, but then explicitly ignored it later (I had removed the second 
> part, but not the first). I posted and committed a patch to fix this 
> yesterday.
> 
> In fact ORN is only of limited use for this kind of thing. Like AND, you 
> can't use multiple ORNs to build a constant. The compiler already does 
> use ORN in some circumstances, and this patch has not changed that.
> 
> Is the patch OK?
> 
> Andrew

+   RETURN_SEQUENCE must be an int[4].

It would be a more robust coding style to define a struct with an int[4]
array as its only member.  Then it wouldn't be possible to pass an
undersized object to these routines.
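
Something along these lines is what I have in mind. This is only a sketch: the struct name and the helper functions are placeholders, not a request for these exact identifiers.

```c
#include <stddef.h>

/* Wrapper so that callers cannot pass a buffer with fewer than four slots.
   A parameter declared as "int seq[4]" decays to "int *", so the compiler
   would not check the size; a pointer to this struct carries it in the type. */
struct four_ints
{
  int i[4];
};

/* Number of slots, derived from the type rather than a magic constant. */
static size_t sequence_slots(const struct four_ints *seq)
{
  return sizeof seq->i / sizeof seq->i[0];
}

/* Example routine taking the wrapper: zero every slot. */
static void clear_sequence(struct four_ints *seq)
{
  for (size_t n = 0; n < sequence_slots(seq); n++)
    seq->i[n] = 0;
}
```

A caller then declares a struct four_ints and passes its address; passing an int[2] by mistake no longer compiles cleanly.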

OK with a change to do that.

R.

