https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86702
--- Comment #4 from Alexander Nesterovskiy <alexander.nesterovskiy at intel dot com> ---
I've noticed performance regressions on different targets and with different compilation options: not only with highly optimized builds such as "-march=skylake-avx512 -Ofast -flto -funroll-loops", but with plain "-O2" too. The simplest case is 500.perlbench_r built with "-O2" and run as a single copy on Broadwell.

The performance drop is not in one particular place but is "spread" over the whole S_regmatch function, which is really big. My guess was that losing these probabilities affects the passes that follow tree-switchlower1, and that is what I see in the generated assembly: some different spilling/filling and a different order of blocks.