https://gcc.gnu.org/bugzilla/show_bug.cgi?id=46393
--- Comment #1 from Jeffrey A. Law ---
It appears the problem starts with forwprop turning the pointer accesses into
array/structure memory accesses. This is generally a good thing.
However, in this instance it makes it awfully hard to recover the CSE
opportunities that are needed to get good, compact code.
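To illustrate the two forms, here is a minimal sketch. It is not the test case from this report; the 3-byte `struct rec` and the function names `sum_ptr`/`sum_idx` are made up for illustration. The pointer form carries one address that is bumped per element, while the array form nominally recomputes a scaled index for every field access unless something commons it up.

```c
#include <stddef.h>

/* Hypothetical 3-byte record; not the type from the original test case. */
struct rec { unsigned char a, b, c; };

/* Pointer form: a single address, advanced once per element. */
unsigned sum_ptr(const struct rec *p, size_t n)
{
    unsigned sum = 0;
    const struct rec *end = p + n;
    for (; p != end; p++)
        sum += p->a + p->b + p->c;
    return sum;
}

/* Array form: every field access is written as v[i].field, so each one
   nominally recomputes v + i * sizeof(struct rec) unless the compiler
   manages to common that address up. */
unsigned sum_idx(const struct rec *v, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += v[i].a + v[i].b + v[i].c;
    return sum;
}
```

Both functions compute the same value; only the shape of the address arithmetic handed to later passes differs.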
We have 3 instances of:
30 003e D3C2 add.l %d2,%a1
31 0040 D3C9 add.l %a1,%a1
32 0042 D3C2 add.l %d2,%a1
That's 12 wasted bytes.
And we have two instances of:
24 0032 5200 addq.b #1,%d0
25 0034 5288 addq.l #1,%a0
Another two wasted bytes.
Also related, we end up selecting poor addressing modes, which probably costs
another 10-16 bytes.
But the core issue, AFAICT, is the recovery of array/structure accesses from
what were pointer accesses. In theory PRE ought to come along and pull out the
redundant address arithmetic, but it doesn't (not even with -O2).
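As a rough picture of what commoning the address arithmetic would mean at the source level, here is a hand-written sketch. The `struct item` layout and both function names are hypothetical, and this only shows the intended effect, not what PRE does internally.

```c
#include <stddef.h>

/* Hypothetical layout, chosen only for illustration. */
struct item { unsigned char x, y, z; };

/* Redundant form: the scaled index i * sizeof(struct item) is spelled
   out once per field access. */
unsigned read_redundant(const unsigned char *base, size_t i)
{
    unsigned x = base[i * sizeof(struct item) + offsetof(struct item, x)];
    unsigned y = base[i * sizeof(struct item) + offsetof(struct item, y)];
    unsigned z = base[i * sizeof(struct item) + offsetof(struct item, z)];
    return x + y + z;
}

/* Commoned form: the element address is computed once and reused,
   leaving only small constant offsets per access. */
unsigned read_commoned(const unsigned char *base, size_t i)
{
    const unsigned char *elt = base + i * sizeof(struct item);
    unsigned x = elt[offsetof(struct item, x)];
    unsigned y = elt[offsetof(struct item, y)];
    unsigned z = elt[offsetof(struct item, z)];
    return x + y + z;
}
```

The second form is the shape that should map onto a single address computation plus displacement addressing.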
It's not clear how prevalent this is across other architectures, so I'm keeping
this at P4 for now. If someone can show this causing problems on non-dead targets,
then we might consider bumping this up to a P2 priority.