https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77399
Bug ID: 77399
Summary: Fails to use native instructions for vector casts
Product: gcc
Version: 7.0
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: amonakov at gcc dot gnu.org
CC: nsz at gcc dot gnu.org
Target Milestone: ---

For the following testcase:

typedef int v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));
v4sf vec_cast(v4si f)
{
  return (v4sf){f[0], f[1], f[2], f[3]};
}

(where unfortunately the C-style cast of the vector type has to be spelled out
per-component, as '(v4sf)f' would mean a bitwise copy of the vector
representation)

on x86-64 the generated code should be just:

        cvtdq2ps        %xmm0, %xmm0
        retq

(and that is what Clang/LLVM generates), but GCC generates code that unpacks
the input vector and converts each component separately.

Note that the issue arises only with vector types; when auto-vectorizing scalar
code, GCC can use 'cvtdq2ps'. I believe this is because the original code is
lowered to GIMPLE using 4 BIT_FIELD_REFs, and the vectorizer doesn't handle
that.
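
For comparison, here is a minimal sketch that puts the vector-typed testcase next to an equivalent scalar loop, which is the kind of code the auto-vectorizer may turn into 'cvtdq2ps'. The names scalar_cast and casts_agree are hypothetical and not part of the report; whether the loop is actually vectorized depends on the GCC version and flags (for example -O3), so treat this as an illustration rather than a guaranteed reproduction.

```c
#include <assert.h>

typedef int   v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));

/* Both vector types are expected to hold exactly four 32-bit lanes. */
static_assert(sizeof(v4sf) == 4 * sizeof(float), "v4sf must have 4 lanes");
static_assert(sizeof(v4si) == 4 * sizeof(int), "v4si must have 4 lanes");

/* Vector-typed form from the report: a per-component int -> float cast. */
v4sf vec_cast(v4si f)
{
    return (v4sf){f[0], f[1], f[2], f[3]};
}

/* Scalar form (hypothetical helper): the same conversion written as a
   plain loop, which is a candidate for auto-vectorization. */
void scalar_cast(float *restrict dst, const int *restrict src)
{
    for (int i = 0; i < 4; i++)
        dst[i] = (float)src[i];
}

/* Hypothetical self-check: returns 1 if both forms produce the same
   value in every lane for a small sample input, 0 otherwise. */
int casts_agree(void)
{
    v4si in = {1, -2, 3, 1 << 20};
    int src[4] = {1, -2, 3, 1 << 20};
    float out[4];

    v4sf v = vec_cast(in);
    scalar_cast(out, src);

    for (int i = 0; i < 4; i++)
        if (v[i] != out[i])
            return 0;
    return 1;
}
```

Compiling this with 'gcc -O3 -S' and comparing the assembly emitted for vec_cast and scalar_cast is one way to check whether a given compiler version shows the difference described above.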