[Bug libstdc++/88545] std::find compile to memchr in trivial random access cases (patch)

tnfchris at gcc dot gnu.org via Gcc-bugs Mon, 01 Jul 2024 06:29:27 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88545


--- Comment #12 from Tamar Christina <tnfchris at gcc dot gnu.org> ---
I had a bug in the benchmark: I forgot to pin it with taskset.

These are the corrected numbers:

+--------+-----------+---------+---------+
| NEEDLE | scalar 1x | vect    | memchr  |
+--------+-----------+---------+---------+
| 1      | -0.14%    | 174.95% | 373.69% |
| 0      | 0.00%     | -90.60% | -95.21% |
| 100    | 0.03%     | -80.28% | -80.39% |
| 1000   | 0.00%     | -89.46% | -94.06% |
| 10000  | 0.00%     | -90.33% | -95.19% |
| -1     | 0.00%     | -90.60% | -95.21% |
+--------+-----------+---------+---------+

So this shows that on modern cores the unrolled scalar loop makes no measurable
difference, so we should just remove it.

It also shows that memchr is universally faster, and that for the remaining
cases the vectorizer does a pretty good job.  We'll get some additional
speedups there soon as well, but memchr should still win as it's hand-tuned.

So I think for 1-byte types we should use memchr, and for the rest remove the
unrolled code and let the vectorizer handle it.
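
To make the proposal concrete, here is a rough sketch of the dispatch I have in
mind.  This is not the libstdc++ code and not the patch: find_sketch is a
made-up name, it only handles raw pointers, and it assumes C++17 for
"if constexpr".  It only illustrates the split between the 1-byte path and the
plain loop.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical helper for illustration only; not the libstdc++ implementation.
template <typename T>
const T* find_sketch(const T* first, const T* last, const T& value)
{
    if constexpr (std::is_integral_v<T> && sizeof(T) == 1) {
        // 1-byte integral types: hand the search to the hand-tuned memchr.
        const std::size_t n = static_cast<std::size_t>(last - first);
        if (n == 0)
            return last;  // avoid calling memchr on an empty range
        const void* hit =
            std::memchr(first, static_cast<unsigned char>(value), n);
        return hit ? static_cast<const T*>(hit) : last;
    } else {
        // Everything else: a plain loop with no manual unrolling, leaving
        // any vectorization to the compiler.
        for (; first != last; ++first)
            if (*first == value)
                return first;
        return last;
    }
}
```

Like std::find, it returns last when the value is not present, so a caller can
compare the result against the end pointer in either path.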
