The target_clones attribute documentation says:

> Note that any subsequent call of a function without target_clone from a
> target_clone caller will not lead to copying (target clone) of the called
> function. If you want to enforce such behavior, we recommend declaring the
> calling function with the flatten attribute?

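A minimal sketch of what that note means for explicit SIMD code, assuming GCC or Clang on x86-64 with glibc (the attribute is compiled out elsewhere). The names `native_float_width`, `width_seen_by_clones` and the `CLONES` macro are hypothetical and only serve the illustration. The helper picks a width from predefined macros, which the preprocessor expands once per translation unit, so every clone of the caller sees the same answer:

```cpp
#include <cstddef>

// Only request clones where the attribute and ifunc are expected to work
// (assumption: x86-64 with glibc); otherwise build a single plain function.
#if defined(__x86_64__) && defined(__GNUC__) && defined(__GLIBC__)
#define CLONES [[gnu::target_clones("default", "avx2")]]
#else
#define CLONES
#endif

// Hypothetical helper: chooses the number of floats per vector from the
// predefined ISA macros. These reflect the command-line flags of the
// translation unit, not the target of whichever clone calls the helper.
constexpr std::size_t native_float_width()
{
#if defined(__AVX512F__)
  return 16;
#elif defined(__AVX__)
  return 8;
#else
  return 4;
#endif
}

// Both clones call the same helper, so the "avx2" clone does not learn
// that 8 floats per vector would be available to it.
CLONES
std::size_t width_seen_by_clones()
{
  return native_float_width();
}
```

Compiled without -mavx, every clone returns 4 here. Adding flatten to the caller would inline the helper but would not change the value, because the macros were already evaluated.
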
The flatten recommendation isn't anywhere near what we need for explicit
vectorization, i.e. when you need to know the native SIMD width and which
intrinsics/builtins are valid to call. I've been adding a few thoughts and
ideas on the topic at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83875.
But before I go any further I'd like to know whether there's a chance this
is ever going to happen.

End goal (made up example):
------------------------------
#include <simd>
#include <span>

namespace simd = std::simd;

[[gnu::target_clones("arch=x86-64,arch=x86-64-v2,arch=x86-64-v3,arch=x86-64-v4")]]
int
do_work(std::span<float> data)
{
  using Vf = simd::vec<float>;
  Vf v = simd::unchecked_load<Vf>(data);
  if (all_of(v == 0.f))
    return 0;
  v += 1.f;
  simd::unchecked_store(v, data);
  return Vf::size();
}
------------------------------

This example requires three different types Vf across the four clones. For
x86-64 and x86-64-v2, Vf is the same type but 'all_of' is implemented
differently (using the ptest builtin for v2).

I have ideas for how this could be done (in principle). I can't implement it
in the compiler for lack of time and knowledge. But before I invest more
time in specifying a solution idea and preparing the std::simd code to work
like that, I'd like to know whether there's interest.

- Matthias

-- 
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Center for Heavy Ion Research               https://gsi.de
 std::simd
──────────────────────────────────────────────────────────────────────────
