https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102714
            Bug ID: 102714
           Summary: A volatile-related problem caused by ipa inline pass
           Product: gcc
           Version: 12.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: duan.db at linux dot alibaba.com
  Target Milestone: ---

Created attachment 51592
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51592&action=edit
test.i

Hi,

GCC trunk has a volatile-related problem with -O2 -fno-strict-aliasing.
The complete test case (test.i) is attached. Below are the main caller and
callee code snippets.

Caller code snippet:

    restart:
      parent = ((void *)0);
      radix_tree_load_root(root, &node, &maxindex);
      while (radix_tree_is_internal_node(node)) {
        parent = entry_to_node(node);
        if (node == xa_mk_internal(256))
          goto restart;
        if (parent->shift == 0)
          break;
      }

Callee code snippet:

    static unsigned radix_tree_load_root(const struct xarray *root,
                                         struct xa_node **nodep,
                                         unsigned long *maxindex)
    {
      struct xa_node *node = ({
        typeof(root->xa_head) ________p1 = ({
          (*(const volatile typeof(root->xa_head) *)&(root->xa_head));
        });
        ((typeof(*root->xa_head) *)(________p1));
      });

      *nodep = node;

      if (__builtin_expect(!!(radix_tree_is_internal_node(node)), 1)) {
        node = entry_to_node(node);
        *maxindex = node_maxindex(node);
        return node->shift + (0 ? 4 : 6);
      }

      *maxindex = 0;
      return 0;
    }

The callee function radix_tree_load_root reads root->xa_head (a member of
the struct that one input parameter points to) through a volatile-qualified
cast, so that the access to xa_head is not optimized by subsequent passes
such as loop-invariant code motion.

However, during the IPA inline pass, GCC uses the function
redirect_call_stmt_to_callee to rewrite the call statement and its input
parameters. In this process the volatile qualifier on the xa_head access is
lost, so the access is optimized after all, which breaks the goto restart
logic.
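To make the expected behavior concrete, here is a minimal sketch of the same
idiom outside the kernel sources. It is not taken from test.i: the names
struct box, RETRY_MARK, load_head, lookup, publish and demo are all
hypothetical, and the publish hook only stands in for a concurrent writer so
that the example can run single-threaded. In the real code there is no such
call in the loop, so the volatile-qualified access is the only thing that
forces a fresh load on every pass through restart.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct xarray; only the head slot matters. */
struct box {
    void *head;
};

/* Hypothetical retry marker; 0x402 is the constant compared in the listing. */
#define RETRY_MARK ((void *)0x402UL)

static int target; /* object the simulated writer publishes */

/* Volatile-qualified load through a cast, as in the callee above. */
static void *load_head(const struct box *b)
{
    return *(void *const volatile *)&b->head;
}

/* Simulated writer: replaces the retry marker with a real pointer. */
static void publish(struct box *b)
{
    b->head = &target;
}

/* Retry loop: every pass must perform a new load of b->head. */
static void *lookup(struct box *b, void (*between)(struct box *),
                    unsigned *loads)
{
    for (;;) {
        void *node = load_head(b);
        ++*loads;
        if (node != RETRY_MARK)
            return node;
        if (between)
            between(b); /* assumption: models another thread's store */
    }
}

/* Runs the scenario and returns how many loads were needed. */
unsigned demo(void)
{
    struct box b = { RETRY_MARK };
    unsigned loads = 0;
    void *node = lookup(&b, publish, &loads);

    assert(node == &target);
    return loads;
}
```

Calling demo() performs one load that sees the marker and a second load that
sees the published pointer. A loop that reused the first loaded value, as the
miscompiled code does, could never leave the retry path.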
Wrong assembly code:

0000000000000000 <__radix_tree_lookup>:
   0:   4c 8b 47 08             mov    0x8(%rdi),%r8
   4:   4c 89 c6                mov    %r8,%rsi
   7:   4c 89 c0                mov    %r8,%rax
   a:   83 e6 03                and    $0x3,%esi
   d:   48 83 e0 fd             and    $0xfffffffffffffffd,%rax
  11:   31 c9                   xor    %ecx,%ecx
  13:   48 83 fe 02             cmp    $0x2,%rsi
  17:   75 18                   jne    31 <__radix_tree_lookup+0x31>
  19:   0f 1f 80 00 00 00 00    nopl   0x0(%rax)
  20:   48 89 c1                mov    %rax,%rcx
  23:   49 81 f8 02 04 00 00    cmp    $0x402,%r8
  2a:   74 e5                   je     11 <__radix_tree_lookup+0x11>
        // The difference between the correct and the incorrect assembly is
        // this goto jump. The correct assembly here should be
        // (je 0 <__radix_tree_lookup>), which means each cycle re-fetches
        // xa_head from memory.
  2c:   80 38 00                cmpb   $0x0,(%rax)
  2f:   75 ef                   jne    20 <__radix_tree_lookup+0x20>
  31:   48 85 d2                test   %rdx,%rdx
  34:   74 03                   je     39 <__radix_tree_lookup+0x39>
  36:   48 89 0a                mov    %rcx,(%rdx)
  39:   4c 89 c0                mov    %r8,%rax
  3c:   c3                      retq

As far as I can see, when IPA uses the class ipa_param_adjustments to
analyze and store the input parameter information of the call statement, it
does not take the related volatile information into account, so that
information is lost. The early inline pass does not have this problem; it
only exists in the IPA inline pass. If the limit of the early inline pass
is lowered, the problem can be circumvented.

I am not particularly familiar with this part of the GCC code, so I look
forward to suggestions on how to fix this bug.

Thanks a lot.