https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542
Bug ID: 117542
Summary: Missed loop vectorization for truncate from float to
__bf16.
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: liuhongt at gcc dot gnu.org
Target Milestone: ---
Target: x86_64-*-* i?86-*-*
For loop vectorization, GCC relies on the optab vec_pack_trunc_m to check
whether the backend supports the conversion.
But that optab is already used for truncation from float to _Float16 and can't
be overloaded. The documentation only mentions that the destination has 2*N
elements of size S/2; it doesn't specify the destination mode, and there are
two kinds of half-precision floating-point type (_Float16 and __bf16).
------
‘vec_pack_trunc_m’
Narrow (demote) and merge the elements of two vectors. Operands 1 and 2
are vectors of the same mode having N integral or floating point elements of
size S. Operand 0 is the resulting vector in which 2*N elements of size S/2 are
concatenated after narrowing them down using truncation.
----------
void
foo (__bf16* a, float* b)
{
  for (int i = 0; i != 10000; i++)
    a[i] = b[i];
}
The vectorizer reports:
couldn't vectorize loop
not vectorized: no vectype for stmt: _4 = *_3;
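
To illustrate why "elements of size S/2" does not identify the destination
mode, here is a minimal portable sketch of what each iteration of the loop
computes, working on raw bit patterns. The helper names float_to_bf16_bits and
foo_bits are hypothetical and not part of GCC. The sketch assumes float is IEEE
binary32 and assumes round-to-nearest-even for the narrowing step.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: return the bfloat16 bit pattern
   for F.  Assumes IEEE binary32 float and round-to-nearest-even.  */
static uint16_t
float_to_bf16_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);            /* Reinterpret the binary32 bits.  */

  if ((u & 0x7fffffffu) > 0x7f800000u)  /* NaN: return a quiet NaN.  */
    return (uint16_t) ((u >> 16) | 0x0040u);

  u += 0x7fffu + ((u >> 16) & 1u);      /* Round to nearest, ties to even.  */
  return (uint16_t) (u >> 16);          /* Sign, 8 exponent, 7 mantissa bits.  */
}

/* Scalar equivalent of the loop in the test case, on raw bit patterns.  */
static void
foo_bits (uint16_t *a, const float *b, int n)
{
  for (int i = 0; i != n; i++)
    a[i] = float_to_bf16_bits (b[i]);
}
```

With this layout 1.0f narrows to 0x3f80, whereas the IEEE binary16 encoding of
1.0 used by _Float16 is 0x3c00. Both results are 16 bits wide, so the element
size alone cannot tell the two conversions apart.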