For the following small test case, lower-subreg.c produces unbelievable code
bloat.
The code reads a 4-byte value from AVR's address spaces:
long readx (const __memx long *p)
{
    return *p;
}

long read1 (const __flash1 long *p)
{
    return *p;
}
Compiled with avr-gcc 4.8.0:
$ avr-gcc flash.c -S -dp -Os -mmcu=avr51 -fno-split-wide-types
This yields:
readx:
    /* prologue: function */
    movw r30,r22
    mov r21,r24
    call __xload_4
    ret
read1:
    /* prologue: function */
    movw r30,r24
    ldi r18,1
    out __RAMPZ__,r18
    elpm r22,Z+
    elpm r23,Z+
    elpm r24,Z+
    elpm r25,Z+
    ret
This is reasonable: loads from address space __memx are expensive and are
outsourced to the libgcc function __xload_4.
But without -fno-split-wide-types the code is:
readx:
    push r12
    push r13
    push r14
    /* prologue: function */
    mov r26,r24
    movw r24,r22
    movw r18,r24
    mov r20,r26
    subi r18,-1
    sbci r19,-1
    sbci r20,-1
    movw r30,r18
    mov r21,r20
    call __xload_1
    mov r23,r22
    ldi r18,lo8(2)
    mov r12,r18
    mov r13,__zero_reg__
    mov r14,__zero_reg__
    add r12,r24
    adc r13,r25
    adc r14,r26
    movw r18,r24
    mov r20,r26
    subi r18,-3
    sbci r19,-1
    sbci r20,-1
    movw r30,r12
    mov r21,r14
    call __xload_1
    mov r24,r22
    movw r30,r18
    mov r21,r20
    call __xload_1
    mov r25,r22
    /* epilogue start */
    pop r14
    pop r13
    pop r12
    ret
read1:
    /* prologue: function */
    movw r30,r24
    ldi r18,1
    out __RAMPZ__,r18
    elpm r22,Z+
    ldi r18,1
    out __RAMPZ__,r18
    elpm r23,Z
    movw r18,r24
    subi r18,-2
    sbci r19,-1
    movw r20,r24
    subi r20,-3
    sbci r21,-1
    movw r30,r18
    ldi r24,1
    out __RAMPZ__,r24
    elpm r24,Z
    movw r30,r20
    ldi r25,1
    out __RAMPZ__,r25
    elpm r25,Z
    ret
You don't need to know anything about AVR to see that this code is *really* bad
and bloated to the maximum.
Besides that, the readx code is wrong: there are just 3 __xload_1 calls instead
of 4. But that appears to be a different issue, PR52484.
The reason is that lower-subreg.c does not care about costs at all and greedily
splits everything it gets hold of.
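
To make the complaint concrete, here is a toy model of what a cost-gated
split decision might look like. It is not GCC code: the struct, the function
name should_split_load and all cost values are hypothetical and only
illustrate the idea of comparing the cost of the whole access against the
summed cost of its pieces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cost description of one wide memory load.  None of
   these names exist in GCC; they only model the decision.  */
struct wide_load_cost
{
  int whole_cost;   /* cost of loading the value in one piece */
  int piece_cost;   /* cost of loading one word-sized piece */
  int n_pieces;     /* number of pieces after splitting */
};

/* Return true only if the split form is strictly cheaper than the
   unsplit form.  A greedy pass would return true unconditionally.  */
static bool
should_split_load (const struct wide_load_cost *c)
{
  assert (c->n_pieces > 0);
  return c->n_pieces * c->piece_cost < c->whole_cost;
}
```

With made-up numbers such as whole_cost = 3, piece_cost = 6, n_pieces = 4
(roughly the shape of the __memx case, where every piece needs its own
library call), the function returns false and the load would stay in one
piece.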
A second reason is that GCC shies away from pre/post
increment/modify/decrement addressing modes.
Any idea how to fix this in the backend?
There is TARGET_MODE_DEPENDENT_ADDRESS_P, and it can fix the first case, which
uses PSImode as pointer mode.
The second case, however, uses Pmode, and in that hook there is no way to tell
whether an address refers to the generic address space or to a special address
space: the hook does not pass this information to the backend, and there is no
address-space flavour of the hook.
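
The difference between the two cases can be sketched with a toy model. This
is not the real hook: all toy_* names are hypothetical, the hook is reduced
to a function that sees only the pointer width, and it is assumed that a
result of true is what keeps an access from being split.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for address-space ids; not GCC's real encoding.  */
enum toy_addr_space { TOY_AS_GENERIC, TOY_AS_FLASH1, TOY_AS_MEMX };

/* All the toy hook gets to see: the width of the pointer mode.
   2 bytes models Pmode (HImode on AVR), 3 bytes models PSImode.  */
struct toy_address { int pointer_bytes; };

/* Model of the existing hook: no address space is passed in, so a
   PSImode pointer is the only hint that the access is special.  */
static bool
toy_mode_dependent_address_p (const struct toy_address *addr)
{
  assert (addr->pointer_bytes == 2 || addr->pointer_bytes == 3);
  return addr->pointer_bytes == 3;
}

/* Hypothetical address-space flavour: the caller also passes the
   address space, so 16-bit pointers into __flash1 can be caught.  */
static bool
toy_addr_space_mode_dependent_address_p (const struct toy_address *addr,
                                         enum toy_addr_space as)
{
  assert (addr->pointer_bytes == 2 || addr->pointer_bytes == 3);
  return as != TOY_AS_GENERIC;
}
```

In this model a __flash1 pointer (2 bytes) is indistinguishable from a
generic pointer in the first function, while the second function can tell
them apart because it is handed the address space explicitly.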
Any ideas what to do about that?
Is it a reasonable hack to make
TARGET_MODE_DEPENDENT_ADDRESS_P (PSImode) = false?
Why does lower-subreg not care about costs at all?
...and even if it did, MEMORY_MOVE_COST is not sensitive to address spaces
either.
Johann