On Mon, May 22, 2023 at 10:48 PM Takayuki 'January June' Suwa
<jjsuwa_sys3...@yahoo.co.jp> wrote:
>
> On 2023/05/23 11:27, Max Filippov wrote:
> > Hi Suwa-san,
>
> Hi!
>
> > This change introduces a bunch of test failures on big endian configuration.
> > I believe that's because the starting bit position for zero_extract is
> > counted from different ends depending on the endianness.
>
> Oops, what a stupid mistake... X(
>
> ===
> This patch saves one machine instruction in the "single bit extraction
> with shifting" operation, and tries to eliminate the conditional
> branch if CST2_POW2 doesn't fit into signed 12 bits, with the help
> of the ifcvt optimization.
>
>     /* example #1 */
>     int test0(int x) {
>       return (x & 1048576) != 0 ? 1024 : 0;
>     }
>     extern int foo(void);
>     int test1(void) {
>       return (foo() & 1048576) != 0 ? 16777216 : 0;
>     }
>
>     ;; before
>     test0:
>         movi    a9, 0x400
>         srai    a2, a2, 10
>         and     a2, a2, a9
>         ret.n
>     test1:
>         addi    sp, sp, -16
>         s32i.n  a0, sp, 12
>         call0   foo
>         extui   a2, a2, 20, 1
>         slli    a2, a2, 20
>         beqz.n  a2, .L2
>         movi.n  a2, 1
>         slli    a2, a2, 24
>     .L2:
>         l32i.n  a0, sp, 12
>         addi    sp, sp, 16
>         ret.n
>
>     ;; after
>     test0:
>         extui   a2, a2, 20, 1
>         slli    a2, a2, 10
>         ret.n
>     test1:
>         addi    sp, sp, -16
>         s32i.n  a0, sp, 12
>         call0   foo
>         l32i.n  a0, sp, 12
>         extui   a2, a2, 20, 1
>         slli    a2, a2, 24
>         addi    sp, sp, 16
>         ret.n
>
> In addition, if the left shift amount ('exact_log2(CST2_POW2)') is
> between 1 and 3 and either an addition or a subtraction with another
> register follows, emit an ADDX[248] or SUBX[248] machine instruction
> instead of separate left shift and add/subtract ones.
>
>     /* example #2 */
>     int test2(int x, int y) {
>       return ((x & 1048576) != 0 ? 4 : 0) + y;
>     }
>     int test3(int x, int y) {
>       return ((x & 2) != 0 ? 8 : 0) - y;
>     }
>
>     ;; before
>     test2:
>         movi.n  a9, 4
>         srai    a2, a2, 18
>         and     a2, a2, a9
>         add.n   a2, a2, a3
>         ret.n
>     test3:
>         movi.n  a9, 8
>         slli    a2, a2, 2
>         and     a2, a2, a9
>         sub     a2, a2, a3
>         ret.n
>
>     ;; after
>     test2:
>         extui   a2, a2, 20, 1
>         addx4   a2, a2, a3
>         ret.n
>     test3:
>         extui   a2, a2, 1, 1
>         subx8   a2, a2, a3
>         ret.n
>
> gcc/ChangeLog:
>
>         * config/xtensa/predicates.md (addsub_operator): New.
>         * config/xtensa/xtensa.md (*extzvsi-1bit_ashlsi3,
>         *extzvsi-1bit_addsubx): New insn_and_split patterns.
>         * config/xtensa/xtensa.cc (xtensa_rtx_costs):
>         Add a special case about ifcvt 'noce_try_cmove()' to handle
>         constant loads that do not fit into signed 12 bits in the
>         patterns added above.
> ---
>  gcc/config/xtensa/predicates.md |  3 ++
>  gcc/config/xtensa/xtensa.cc     |  3 +-
>  gcc/config/xtensa/xtensa.md     | 83 +++++++++++++++++++++++++++++++++
>  3 files changed, 88 insertions(+), 1 deletion(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

--
Thanks.
-- Max
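
For readers following the thread, the identities behind the quoted "after" code can be checked in plain C. The sketch below is not part of the patch. All helper names (extract_bit, select_form, extract_shift_form, addx4, subx8, lsb_shift_from_msb_pos, self_check) are hypothetical and exist only for illustration. The last helper assumes that, under big-endian bit numbering, a zero_extract position is counted from the most significant bit of a 32-bit word; that is one reading of the endianness remark quoted above, not something the thread spells out.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "extui ar, as, bit, 1": shift right, keep the low bit. */
static uint32_t extract_bit(uint32_t x, unsigned bit)
{
    return (x >> bit) & 1u;
}

/* Source-level form used in the quoted examples:
   (x & (1 << bit)) != 0 ? (1 << log2_cst2) : 0 */
static uint32_t select_form(uint32_t x, unsigned bit, unsigned log2_cst2)
{
    return (x & (UINT32_C(1) << bit)) != 0 ? (UINT32_C(1) << log2_cst2) : 0;
}

/* Shape of the "after" code: extui followed by slli. */
static uint32_t extract_shift_form(uint32_t x, unsigned bit, unsigned log2_cst2)
{
    return extract_bit(x, bit) << log2_cst2;
}

/* Models of ADDX4 and SUBX8: shift the first operand, then add/subtract. */
static uint32_t addx4(uint32_t a, uint32_t b) { return (a << 2) + b; }
static uint32_t subx8(uint32_t a, uint32_t b) { return (a << 3) - b; }

/* ASSUMPTION: with big-endian bit numbering, a field of `width` bits at
   position `pos` (counted from the MSB) starts at this LSB-based shift. */
static unsigned lsb_shift_from_msb_pos(unsigned width, unsigned pos)
{
    return (32u - (width + pos)) & 31u;
}

/* Compare both forms on a few inputs, mirroring test0..test3 above. */
static void self_check(void)
{
    static const uint32_t xs[] = { 0u, 2u, 0x100000u, 0xFFEFFFFFu, 0xFFFFFFFFu };
    const uint32_t y = 7u;
    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; i++) {
        uint32_t x = xs[i];
        assert(select_form(x, 20, 10) == extract_shift_form(x, 20, 10)); /* test0 */
        assert(select_form(x, 20, 24) == extract_shift_form(x, 20, 24)); /* test1 */
        assert(select_form(x, 20, 2) + y == addx4(extract_bit(x, 20), y)); /* test2 */
        assert(select_form(x, 1, 3) - y == subx8(extract_bit(x, 1), y));   /* test3 */
    }
}
```

For instance, select_form(0x100000, 20, 10) and extract_shift_form(0x100000, 20, 10) both yield 1024, matching test0 in the quoted example. The models use unsigned arithmetic so that the subtraction in the test3 case wraps instead of overflowing.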