https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102793
Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|                            |aarch64

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
I wonder why tail-merging doesn't do its job here.  It does (on x86_64-linux):

find_duplicates: <bb 3> duplicate of <bb 5>
Removing basic block 5
;; basic block 5, loop depth 0
;;  pred:
_12 = foo_10(D) ();
;;  succ:       6

and .optimized:

int noccmp (uint64_t * s1, uint64_t * s2, int (*<T3aa>) (void) foo)
{
  uint64_t bar;
  uint64_t d2;
  uint64_t d1;
  long unsigned int _1;
  int _2;
  int _14;

  <bb 2> [local count: 1073741824]:
  d1_6 = *s1_4(D);
  d2_8 = *s2_7(D);
  _1 = d1_6 ^ d2_8;
  bar_9 = _1 & 43981;
  if (bar_9 == 0)
    goto <bb 3>; [34.00%]
  else
    goto <bb 4>; [66.00%]

  <bb 3> [local count: 719407024]:
  _14 = foo_10(D) (); [tail call]
  goto <bb 5>; [100.00%]

  <bb 4> [local count: 708669601]:
  if (d1_6 != d2_8)
    goto <bb 3>; [50.00%]
  else
    goto <bb 5>; [50.00%]

  <bb 5> [local count: 1073741824]:
  # _2 = PHI <_14(3), 0(4)>
  return _2;

}

but what's missing is possibly some if-combine?

IMHO ccmp expansion should be re-written to a pre RTL expansion GIMPLE
transform.
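
For readers who want to map the .optimized dump back to source, here is a
rough C sketch.  It is reconstructed from the GIMPLE above and is not the
testcase attached to the PR.  noccmp_combined is a hypothetical illustration
of the shape an if-combine step might produce (both tests feeding a single
branch, which is what a conditional-compare sequence wants); seven and
self_check are made-up helpers used only to compare the two forms.

```c
#include <assert.h>
#include <stdint.h>

/* Reconstructed from the .optimized dump; an assumption, not the PR's
   actual testcase.  */
int noccmp (uint64_t *s1, uint64_t *s2, int (*foo) (void))
{
  uint64_t d1 = *s1;
  uint64_t d2 = *s2;
  uint64_t bar = (d1 ^ d2) & 43981;

  if (bar == 0)      /* <bb 2> */
    return foo ();   /* <bb 3> */
  if (d1 != d2)      /* <bb 4> */
    return foo ();   /* <bb 3> again, shared after tail merging */
  return 0;          /* <bb 5> */
}

/* Hypothetical combined form: the two conditions are evaluated without
   an intermediate branch and control one jump.  */
int noccmp_combined (uint64_t *s1, uint64_t *s2, int (*foo) (void))
{
  uint64_t d1 = *s1;
  uint64_t d2 = *s2;
  uint64_t bar = (d1 ^ d2) & 43981;

  if ((bar == 0) | (d1 != d2))
    return foo ();
  return 0;
}

/* Hypothetical callee, used only for checking.  */
static int seven (void) { return 7; }

/* Check that both forms agree on a handful of inputs.  */
static void self_check (void)
{
  uint64_t vals[] = { 0, 1, 43981, 0x10000, UINT64_MAX };
  for (unsigned i = 0; i < 5; i++)
    for (unsigned j = 0; j < 5; j++)
      assert (noccmp (&vals[i], &vals[j], seven)
              == noccmp_combined (&vals[i], &vals[j], seven));
}
```

Comparing the two functions in the -fdump-tree-optimized output on x86_64 and
aarch64 should show whether the branches are already merged before RTL
expansion.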