https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107905
--- Comment #5 from Alexander Monakov <amonakov at gcc dot gnu.org> ---
Not sure what you don't like about the inputs, they appear quite reasonable. Perhaps GCC's estimation of bb frequencies is off (with profile feedback we achieve good performance).

Georgi: you'll likely see better results with profile-guided optimization. You can first compile the benchmark with -O2 -fprofile-generate, run the output (it will generate *.gcda files), then compile again with -O2 -fprofile-use. For Clang the options are spelled -fprofile-instr-generate and -fprofile-instr-use, respectively.