https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119549
Bug ID: 119549
Summary: [14/15 Regression] SSE4 code inlined into no-sse4 function
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: rguenth at gcc dot gnu.org
Target Milestone: ---
GCC 13 used to reject the following code when compiling with -msse4, with this diagnostic:
t.c: In function ‘rte_eal_trace_generic_void_init’:
t.c:3:5: error: inlining failed in call to ‘always_inline’ ‘rte_trace_feature_is_enabled’: target specific option mismatch
    3 | int rte_trace_feature_is_enabled() { *(v2di *)0 = __builtin_ia32_movntdqa ((void *)0); return 1; }
      |     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
t.c:9:8: note: called from here
    9 |   if (!rte_trace_feature_is_enabled()) return;
--
typedef long long v2di __attribute__((vector_size(16)));
static inline __attribute__((always_inline))
int rte_trace_feature_is_enabled()
{
  // some SSE4 instruction, original code from DPDK just has the return 1
  *(v2di *)0 = __builtin_ia32_movntdqa ((void *)0);
  return 1;
}
void
__attribute__((target ("no-sse3"))) __attribute__((target ("no-sse4")))
rte_eal_trace_generic_void_init(void)
{
  if (!rte_trace_feature_is_enabled()) return;
  // real code follows in DPDK, this is a CTOR function there
  __asm__ volatile ("" : : : "memory");
}
--
but since r14-5607 (AVX10.1 support) we accept it and inline the call, so
rte_eal_trace_generic_void_init ends up containing SSE4 instructions.
The revision does not indicate that it intended to change this behavior.
I'll note that dropping either the no-sse3 or the no-sse4 attribute makes the
testcase accepted and inlined even with GCC 13.
So the behavior looks odd and, I guess, unintended.