https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117008
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization
             Status|UNCONFIRMED                 |NEW
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2024-10-08
             Target|                            |x86_64-*-*
--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
It looks like the reduction loop is being vectorized, and that is what causes
the slowdown.
Semi-reduced testcase (the #include is left unexpanded):
```
#include <bitset>
void g(std::bitset<12800000> &);
int f()
{
  unsigned int total = 0;
  std::bitset<12800000> values;
  g(values);
  for (unsigned int index = 0; index != 12800000; ++index)
    total += values[index];
  return total;
}
```
To reproduce on Linux, you need `-m32 -O2 -mavx2`. The `-m32` is needed because
the bitset uses `long` for its storage: on mingw `long` is 32 bits, while on
64-bit Linux it is 64 bits, and the 64-bit version of the loop does not get
vectorized.
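
To make the dependence on the width of `long` concrete, here is a rough sketch of what each `values[index]` read amounts to, assuming the bitset is stored as a packed array of `unsigned long` words (as libstdc++ does). The names `bits_per_word`, `test_bit` and `sum_bits` are made up for illustration and are not libstdc++ internals.

```cpp
#include <bitset>
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical name: number of bits held by one storage word.
// This is 32 with -m32 or on mingw, and 64 on 64-bit Linux.
constexpr std::size_t bits_per_word = sizeof(unsigned long) * CHAR_BIT;

// Hypothetical helper: read bit `index` from a packed word array,
// roughly the work values[index] has to do on every iteration.
inline unsigned int test_bit(const unsigned long *words, std::size_t index)
{
  assert(words != nullptr);
  return (words[index / bits_per_word] >> (index % bits_per_word)) & 1UL;
}

// Scalar form of the reduction loop from the testcase, written as a
// template so it can be tried on small bitsets.
template <std::size_t N>
unsigned int sum_bits(const std::bitset<N> &values)
{
  unsigned int total = 0;
  for (std::size_t index = 0; index != N; ++index)
    total += values[index];
  return total;
}
```

Since both the word index and the shift amount are derived from `bits_per_word`, the loop the vectorizer sees differs between the 32-bit and 64-bit `long` cases. The result of `sum_bits` is the same as `values.count()`.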