https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116467
Bug ID: 116467
Summary: missed optimization: zero-extension duplicated on xtensa
Product: gcc
Version: 12.2.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rsaxvc at gmail dot com
Target Milestone: ---
With GCC 12.2.0 and -O2 -Wall -Wextra, the following code:
#include <stdint.h>
__attribute__ ((noinline)) uint32_t callee(uint32_t x, uint16_t y) {
    return x + y;
}
__attribute__ ((noinline)) uint32_t caller(uint32_t x, uint32_t y) {
    return callee(x, y);
}
compiles to these xtensa instructions:
callee:
entry sp, 32
extui a3, a3, 0, 16
add.n a2, a3, a2
retw.n
caller:
entry sp, 32
extui a11, a3, 0, 16
mov.n a10, a2
call8 callee
mov.n a2, a10
retw.n
I was surprised to find that the zero-extension (extui rDest, rSource, 0, 16)
occurs twice, once in each function. On other targets such as ARM32, a uint16_t
passed in a register appears to be treated as already zero-extended by the
caller, so the callee does not repeat it. ARM32, GCC 12.2, same flags:
callee:
add r0, r0, r1
bx lr
caller:
uxth r1, r1 //similar to extui rDest, rSource, 0, 16
b callee
Could xtensa do the same?
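
To make the question concrete, here is a rough C model of the two behaviours seen in the listings above. It is only a sketch: argument registers are modelled as plain uint32_t values, all function names (zext16, callee_trusts_caller, callee_reextends, caller_a, caller_b, self_check) are made up for illustration, and it says nothing about how GCC implements argument promotion for either target.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: the effect of "extui r, r, 0, 16" or "uxth r, r"
   on a 32-bit register value. */
static uint32_t zext16(uint32_t reg) { return reg & 0xFFFFu; }

/* Behaviour A (as the ARM32 listing appears to do): the caller narrows
   the argument, and the callee uses the register as-is. */
static uint32_t callee_trusts_caller(uint32_t x_reg, uint32_t y_reg) {
    return x_reg + y_reg;
}
static uint32_t caller_a(uint32_t x, uint32_t y) {
    return callee_trusts_caller(x, zext16(y));
}

/* Behaviour B (as in the xtensa listing): the caller narrows the
   argument and the callee narrows it again. */
static uint32_t callee_reextends(uint32_t x_reg, uint32_t y_reg) {
    return x_reg + zext16(y_reg);
}
static uint32_t caller_b(uint32_t x, uint32_t y) {
    return callee_reextends(x, zext16(y));
}

/* Both models return the same value; B simply masks twice. */
static void self_check(void) {
    assert(caller_a(1u, 0x12345u) == 0x2346u);
    assert(caller_b(1u, 0x12345u) == 0x2346u);
}
```

In this model the second mask in behaviour B never changes the result as long as every caller has already narrowed the value, which is the redundancy this report is about.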