https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79981
Bug ID: 79981
Summary: Forwprop not working for __atomic_compare_exchange_n
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vogt at linux dot vnet.ibm.com
Target Milestone: ---

Trying to figure out why this sample program results in not-so-good RTL code on s390x:

--
extern void locked (void *lock);
extern void not_locked (void);
void csi (int *lock)
{
  int oldval = 0;
  if (__atomic_compare_exchange_n (lock, (void *) &oldval, 1, 1, 2, 0))
    locked ((void *)lock);
  else
    not_locked ();
}
--

Before forwprop2:

--
  _Bool _1;
  __complex__ unsigned int _8;
  unsigned int _9;

  <bb 2> [100.00%]:
  _8 = ATOMIC_COMPARE_EXCHANGE (lock_4(D), 0, 1, 260, 2, 0);
  _9 = IMAGPART_EXPR <_8>;
  _1 = (_Bool) _9;
  if (_1 != 0)
  ...
--

Although GCC seems to know that the SImode value from the complex number can only have values 0 and 1, it fails to propagate _9 into the condition: "if (_9 != 0)". For similar code that calls a normal function, forwprop does exactly that:

--
extern _Complex int func(int *lock, void *old, int new);
extern void locked (void *lock);
extern void not_locked (void);
void foo(int *lock)
{
  int oldval = 0;
  int i = __imag__ func(lock, (void *)&oldval, 1);
  _Bool b = (_Bool)i;
  if (b)
    locked ((void *)lock);
  else
    not_locked ();
}
--

We get (before forwprop1):

--
  complex int _1;

  <bb 2> [0.00%]:
  oldval = 0;
  _1 = func (lock_5(D), &oldval, 1);
  i_7 = IMAGPART_EXPR <_1>;
  b_8 = i_7 != 0;
  if (b_8 != 0)
--

=>

--
;; Function foo (foo, funcdef_no=0, decl_uid=2024, cgraph_uid=0, symbol_order=0)

Applying pattern match.pd:932, gimple-match.c:164
Applying pattern match.pd:3002, gimple-match.c:61428
gimple_simplified to if (b_8 != 0)
Applying pattern match.pd:932, generic-match.c:136
Applying pattern match.pd:3002, generic-match.c:31977
  Replaced 'b_8 != 0' with 'i_7 != 0'
  ...
  _1 = func (lock_5(D), &oldval, 1);
  i_7 = IMAGPART_EXPR <_1>;
  if (i_7 != 0)
  ...
--

What can be done to get rid of the _Bool stuff in the first program? (The RTL code generated by the _Bool logic survives until the combine pass, which is too late, and foils optimizations in the s390x pattern for compare-and-swap.)
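
For anyone who wants to reproduce the first dump, here is a self-contained variant of the csi() test case that also links. This is only a sketch: the bodies of locked() and not_locked() and the two call counters are hypothetical stubs, since the report only declares those functions. The literal memory orders 2 and 0 from the report are spelled with their named equivalents __ATOMIC_ACQUIRE and __ATOMIC_RELAXED.

```c
/* Hypothetical stubs (not in the report) so the test case links and the
   taken branch can be observed.  */
static int locked_calls;
static int not_locked_calls;

void locked (void *lock) { (void) lock; locked_calls++; }
void not_locked (void) { not_locked_calls++; }

/* Same shape as csi() in the report: weak compare-and-swap of 0 -> 1,
   acquire ordering on success (2), relaxed ordering on failure (0).  */
void csi (int *lock)
{
  int oldval = 0;
  if (__atomic_compare_exchange_n (lock, (void *) &oldval, 1, 1,
                                   __ATOMIC_ACQUIRE, __ATOMIC_RELAXED))
    locked ((void *) lock);
  else
    not_locked ();
}
```

Because the exchange is weak, a call on a free lock may fail spuriously on some targets, so only the "already held" path is fully deterministic. Compiling with something like -O2 -fdump-tree-forwprop-details should produce forwprop dumps comparable to the ones quoted above, though the exact SSA names and percentages will likely differ between GCC versions.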