[llvm-bugs] [Bug 169935] [x86] Vectorization of `std::replace` and its hand-written equivalent is not very useful and is often even somewhat harmful

LLVM Bugs via llvm-bugs Fri, 28 Nov 2025 23:09:51 -0800

Issue	169935
Summary	[x86] Vectorization of `std::replace` and its hand-written equivalent is not very useful and is often even somewhat harmful
Labels	new issue
Assignees
Reporter	AlexGuteniev

In the following example:
```C++
const char s[] = "....1.......1......1.......1....1......1...1.....1......1.....11....."
                 "......1.....1..........1.......1.........1..........1......1...........1.........1.."
                 ".......1.........1....1...........1......1........1.....1.......1....1....1..1.....";

static void a(benchmark::State& state) {
  char x[sizeof(s)];

  for (auto _ : state) {
    memcpy(x, s, sizeof(s));
    benchmark::DoNotOptimize(x);
    for (int i = 0; i < sizeof(s); i++) {
      if (x[i] == '1')
        x[i] = '2';
    }
    benchmark::DoNotOptimize(x);
  }
}
```
Adding `#pragma clang loop vectorize(disable)` to the loop does not slow it down much, and with this particular data it actually makes it about 1.4x faster.
The reason is that the vectorized code still stores elements one at a time, which is inefficient and results in the same branchy code as the scalar version.

See https://quick-bench.com/q/nTz_9lPk7QlCi-M2RJizQR3UJKU. That site only goes up to Clang 17, but current trunk still produces similar codegen.

_______________________________________________
llvm-bugs mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs
