https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117008
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|1 |0
Blocks| |53947
Status|NEW |UNCONFIRMED
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Note your `total += values[index];` loop could be reduced down to just `total +=
values.count();` and that will be over 10x faster.
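A minimal sketch of the two forms, assuming `values` is a `std::bitset` (the size `N` and the function names here are made up for illustration and are not taken from the test case):

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical size; the test case in the report may use a different one.
constexpr std::size_t N = 1024;

// Sums the bits one element at a time, like the `total += values[index];` loop.
std::size_t sum_by_index(const std::bitset<N>& values) {
    std::size_t total = 0;
    for (std::size_t index = 0; index < N; ++index)
        total += values[index];  // each bit converts to 0 or 1
    return total;
}

// Same result through count(), which returns the number of set bits.
std::size_t sum_by_count(const std::bitset<N>& values) {
    return values.count();
}
```

Both functions return the number of set bits, so they should agree for any input; only the way the sum is computed differs.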
I am not sure this is a useful benchmark either, because count uses
popcount directly. Maybe GCC could detect the popcount here, but I am not sure.
LLVM does a slightly better job at vectorizing the loop but still messes it up.
Plus, once you add other code around values[index], the vectorizer will no
longer kick in, so the slowdown is only for this bad micro-benchmark.
Referenced Bugs:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations