https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69941
Bug ID: 69941
Summary: bogus zero_extend transformation in postreload on aarch64
Product: gcc
Version: 6.0
Status: UNCONFIRMED
Keywords: wrong-code
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
CC: rth at gcc dot gnu.org
Target Milestone: ---
Target: aarch64

Consider the code:

    #include <stdio.h>
    int a = 0;
    int b = 0;
    int c = 0;
    int e = 0;
    int f = 0;
    int *g = &e;

    int fn1() { return b ? a : b; }

    int main() {
      int h = fn1() <= 0x8000000000000000ULL; // h = 1;
      int k = f; // k = 0;
      long i = h ? k : k / h; // i = 0;
      long l = (unsigned short)(i - 0x1800); // l = 0xe800
      i = l ? l : c; // i = 0xe800;
      *g = i; // *g = 0xe800; e = 0xe800;
      unsigned char result = e >> 9; // result = 0x74;
      printf("checksum = %x\n", (int)result);
      return 0;
    }

This is supposed to print 74. Before r232540, which rejigged the aarch64 add
expander, splitters and peepholes, this was miscompiled and actually printed f4.
I think r232540 made a bug in the RTL optimisers go latent.

If we take a revision before and after Richard's patch, the difference between
the bad (-) and the good (+) code is:

    -       mov     w0, -6144
    -       add     w0, w0, w1
    +       sub     w0, w1, #4096
    +       sub     w0, w0, #2048
    +       uxth    x0, w0

So in the bad code we miss a zero_extend from HImode after the subtraction.
Looking at the RTL dumps, this gets lost in postreload.

Immediately after reload we have:

    (insn 24 22 25 4 (set (reg:SI 0 x0 [94])
            (const_int -6144 [0xffffffffffffe800])) file.c:31 47 {*movsi_aarch64}
         (expr_list:REG_EQUIV (const_int -6144 [0xffffffffffffe800])
            (nil)))
    (insn 25 24 26 4 (set (reg:SI 0 x0 [93])
            (plus:SI (reg:SI 0 x0 [94])
                (reg:SI 1 x1 [orig:73 iftmp.2_1 ] [73]))) file.c:31 96 {*addsi3_aarch64}
         (nil))
    (insn 26 25 27 4 (set (reg/v:DI 0 x0 [orig:74 l ] [74])
            (zero_extend:DI (reg:HI 0 x0 [93]))) file.c:31 91 {*zero_extendhidi2_aarch64}
         (nil))

but after postreload we have:

    (insn 24 22 26 4 (set (reg:SI 0 x0 [94])
            (const_int -6144 [0xffffffffffffe800])) file.c:31 47 {*movsi_aarch64}
         (expr_list:REG_EQUIV (const_int -6144 [0xffffffffffffe800])
            (nil)))
    (insn 26 24 27 4 (set (reg/v:DI 0 x0 [orig:74 l ] [74])
            (zero_extend:DI (plus:SI (reg:SI 0 x0 [94])
                    (reg:SI 1 x1 [orig:73 iftmp.2_1 ] [73])))) file.c:31 97 {*addsi3_aarch64_uxtw}

The zero_extend in insn 26 lost the HImode inner mode.

As I said, this bug doesn't reproduce anymore on trunk after r232540 because we
carry the plus-immediate as a single insn for longer and split it via peephole
later on.