On Thu, 2011-04-21 at 12:23 +0100, Andrew Stubbs wrote:
> This patch is a repost of the one I previously posted here:
>
>    http://gcc.gnu.org/ml/gcc-patches/2010-12/msg00652.html
>
> As requested, I've broken out the other parts of the original patch, and
> those have already been reposted yesterday (and one committed also).
>
> This (final) part is support for using Thumb2's replicated constants and
> addw/subw instructions as part of split constant loads. Previously the
> compiler could use these constants, but only where they would be loaded
> in a single instruction.
>
> This patch must be applied on top of the addw/subw patch I posted yesterday.
>
> The patch also optimizes the use of inverted or negated constants as a
> short-cut to the final value. The previous code did this in some cases,
> but could not be easily adapted to replicated constants.
>
> The previous code also had a bug that prevented optimal use of shifted
> constants in Thumb code by imposing the same restrictions as ARM code.
> This has been fixed.
>
> Example 1: addw as part of a split constant load
>
> a + 0xfffff
>
> Before:
>     movw    r3, #65535          ; 0x0ffff
>     movt    r3, 15              ; 0xf0000
>     adds    r3, r0, r3
> After:
>     add     r0, r0, #1044480    ; 0xff000
>     addw    r0, r0, #4095       ; 0x00fff
>
> Example 2: arbitrary shifts bug fix
>
> a - 0xfff1
>
> Before:
>     sub     r0, r0, #65024      ; 0xfe00
>     sub     r0, r0, #496        ; 0x01f0
>     sub     r0, r0, #1          ; 0x0001
> After:
>     sub     r0, r0, #65280      ; 0xff00
>     sub     r0, r0, #241        ; 0x00f1
>
> Example 3: 16-bit replicated patterns
>
> a + 0x44004401
>
> Before:
>     movw    r3, #17409          ; 0x00004401
>     movt    r3, 17408           ; 0x44000000
>     adds    r3, r0, r3
> After:
>     add     r0, r0, #1140868096 ; 0x44004400
>     adds    r0, r0, #1          ; 0x00000001
>
> Example 4: 32-bit replicated patterns
>
> a & 0xaaaaaa00
>
> Before:
>     mov     r3, #43520          ; 0x0000aa00
>     movt    r3, 43690           ; 0xaaaa0000
>     and     r3, r0, r3
> After:
>     and     r0, r0, #-1431655766 ; 0xaaaaaaaa
>     bic     r0, r0, #170         ; 0x000000aa
>
> The constant splitting code was duplicated in two places, and I would
> have needed to modify both quite heavily, so I have taken the
> opportunity to unify the two, and hopefully reduce the future
> maintenance burden.
>
> Let me respond to a point Richard Earnshaw raised following the original
> posting:
>
> > A final note is that you may have missed some cases. Now that we have
> > movw,
> > reg& ~(16-bit const)
> > can now be done in at most 2 insns:
> > movw t1, #16-bit const
> > bic Rd, reg, t1
>
> Actually, I think we can do better than that for a 16-bit constant.
>
> Given:
>
> a & ~(0xabcd)
>
> Before my changes, GCC gave:
>
>     bic     r0, r0, #43520
>     bic     r0, r0, #460
>     bic     r0, r0, #1
>
> and after applying my patch:
>
>     bic     r0, r0, #43776
>     bic     r0, r0, #205
>
> Two instructions and no temporary register.
>
> > On thumb-2 you can also use ORN that way as well.
>
> It turns out that my previous patch was broken for ORN. I traced the
> problem to some confusing code already in arm.c that set can_invert for
> IOR, but then explicitly ignored it later (I had removed the second
> part, but not the first). I posted, and committed a patch to fix this
> yesterday.
>
> In fact ORN is only of limited use for this kind of thing. Like AND, you
> can't use multiple ORNs to build a constant. The compiler already does
> use ORN in some circumstances, and this patch has not changed that.
>
> Is the patch OK?
>
> Andrew

+   RETURN_SEQUENCE must be an int[4].

It would be a more robust coding style to define a struct with an int[4]
array as its only member. Then it wouldn't be possible to pass an
undersized object to these routines.

OK with a change to do that.

R.
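
To illustrate the struct suggestion, here is a minimal sketch. The names
four_ints_sketch and split_bytes are hypothetical and do not come from the
patch, and the byte-wise split is only a stand-in for the real constant
splitting logic. The point is that the size of the array becomes part of
the parameter's type, whereas an "int *" or "int[4]" parameter decays to a
pointer and accepts an array of any length.

```c
#include <assert.h>

/* Hypothetical wrapper type: an int[4] array as the only member.  */
struct four_ints_sketch
{
  int i[4];
};

/* Hypothetical stand-in for a constant-splitting routine.  It stores
   each non-zero byte of VAL (still in its original bit position) into
   SEQ and returns the number of entries written.  Because SEQ is a
   pointer to the struct, a caller cannot pass a shorter array without
   a type error or an explicit cast.  */
static int
split_bytes (unsigned int val, struct four_ints_sketch *seq)
{
  int n = 0;

  for (int shift = 0; shift < 32; shift += 8)
    {
      unsigned int chunk = val & (0xffu << shift);

      if (chunk != 0)
        {
          /* At most four bytes exist, so this always holds.  */
          assert (n < 4);
          seq->i[n++] = (int) chunk;
        }
    }

  return n;
}
```

A caller would declare "struct four_ints_sketch seq;" and pass "&seq".
Passing, say, an "int[2]" no longer compiles cleanly, which is the extra
robustness being asked for.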