https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85740
--- Comment #9 from Thomas Koenig <tkoenig at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #8)
> (In reply to Richard Biener from comment #7)
> > Confirmed with a Haswell CPU as well. Without the __builtin_expect we
> > rightfully predict the branch to be 50%/50% which means BB re-ordering will
> > do either nothing to pre-existing order or apply some other magic. CFG
> > construction makes the flow exactly as visible in the source.
> >
> > So not sure what you are asking here, but annotating the libgfortran
> > routines or inline expansion from the FE with __builtin_expect is probably
> > a good idea.

First, I think that -funroll-loops is such a big win that it would
be good to have it enabled for such a loop with -Ofast.

Second, vectorization would be nice, but quite possibly unrealistic.
I'll add a test case for AVX2 momentarily, which also clearly shows
a benefit.

> If the code is emitted in Fortran FE, that it's similar to specific
> predictors:
> grep for 'PRED_FORTRAN_'. These are predictors emitted by the FE and can
> have specific probability based on SPEC benchmarks.
>
> Can you Thomas point me to code that emits the maxloc/minloc?

I already added __builtin_expect to the minloc/maxloc routines,
in gfc_conv_intrinsic_minmaxloc. Maybe the likelihood of missing
the maximum case was too high.

> > At least I can't really see how to easily derive a new predictor that would
> > match this case...
>
> Agree.

Running through an array and finding a min/max? Hmm, a pity...
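
For illustration, here is a rough C analogue of the kind of loop under discussion. It is only a sketch and not the code that gfc_conv_intrinsic_minmaxloc generates: the function name maxloc_expect is made up, the array is assumed to be rank 1 and contiguous, and NaN handling is ignored. The hint encodes the assumption that finding a new maximum is the rare case.

```c
#include <stddef.h>

/* Hypothetical C analogue of a Fortran maxloc on a rank-1 array.
   Returns the 1-based position of the first maximum, or 0 for an
   empty array.  NaN handling is deliberately left out. */
size_t maxloc_expect(const double *a, size_t n)
{
    if (n == 0)
        return 0;

    size_t pos = 0;       /* 0-based index of the current maximum */
    double best = a[0];   /* current maximum value */

    for (size_t i = 1; i < n; i++) {
        /* Assumption: a new maximum is rare, so mark the branch as
           unlikely.  The strict '>' keeps the first occurrence. */
        if (__builtin_expect(a[i] > best, 0)) {
            best = a[i];
            pos = i;
        }
    }
    return pos + 1;       /* convert to Fortran-style 1-based index */
}
```

Comparing the assembly for this loop at -O2 with and without -funroll-loops, and with the __builtin_expect removed, should show how the hint and the unrolling affect the generated code. Whether either one pays off will depend on the data and the CPU.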