https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112325
--- Comment #11 from Hongtao Liu <liuhongt at gcc dot gnu.org> --- > Loop body is likely going to simplify further, this is difficult > to guess, we just decrease the result by 1/3. */ > This is introduced by r0-68074-g91a01f21abfe19 /* Estimate number of insns of completely unrolled loop. We assume + that the size of the unrolled loop is decreased in the + following way (the numbers of insns are based on what + estimate_num_insns returns for appropriate statements): + + 1) exit condition gets removed (2 insns) + 2) increment of the control variable gets removed (2 insns) + 3) All remaining statements are likely to get simplified + due to constant propagation. Hard to estimate; just + as a heuristics we decrease the rest by 1/3. + + NINSNS is the number of insns in the loop before unrolling. + NUNROLL is the number of times the loop is unrolled. */ + +static unsigned HOST_WIDE_INT +estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns, + unsigned HOST_WIDE_INT nunroll) +{ + HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3; + if (unr_insns <= 0) + unr_insns = 1; + unr_insns *= (nunroll + 1); + + return unr_insns; +} And r0-93444-g08f1af2ed022e0 try do it more accurately by marking likely_eliminated stmt and minus that from total insns, But 2 / 3 is still keeped. +/* Estimate number of insns of completely unrolled loop. + It is (NUNROLL + 1) * size of loop body with taking into account + the fact that in last copy everything after exit conditional + is dead and that some instructions will be eliminated after + peeling. - NINSNS is the number of insns in the loop before unrolling. - NUNROLL is the number of times the loop is unrolled. */ + Loop body is likely going to simplify futher, this is difficult + to guess, we just decrease the result by 1/3. */ static unsigned HOST_WIDE_INT -estimated_unrolled_size (unsigned HOST_WIDE_INT ninsns, +estimated_unrolled_size (struct loop_size *size, unsigned HOST_WIDE_INT nunroll) { - HOST_WIDE_INT unr_insns = 2 * ((HOST_WIDE_INT) ninsns - 4) / 3; + HOST_WIDE_INT unr_insns = ((nunroll) + * (HOST_WIDE_INT) (size->overall + - size->eliminated_by_peeling)); + if (!nunroll) + unr_insns = 0; + unr_insns += size->last_iteration - size->last_iteration_eliminated_by_peeling; + + unr_insns = unr_insns * 2 / 3; if (unr_insns <= 0) unr_insns = 1; - unr_insns *= (nunroll + 1); It looks to me 1 / 3 overestimates the instructions that can be optimised away, especially if we've subtracted eliminated_by_peeling