https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110649

--- Comment #7 from Jan Hubicka <hubicka at gcc dot gnu.org> ---
I found why the vectorizer scales the profile of the vectorized epilogue
incorrectly. scale_profile_for_vect_loop uses niter_for_unrolled_loop, which
does not account for the fact that when the iteration count is not divisible
by the vectorization factor, the epilogue (unless the loop is masked) will
take the remaining iterations.

The upper bound computation is actually done correctly when loop_info is
updated, so we can just use it directly instead of relying on
niter_for_unrolled_loop.
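
To make the divisibility point concrete, here is a minimal sketch in Python
(not GCC code) of how a scalar iteration count splits between the vector body
and its epilogue. The function name and the parameters niter, vf and masked
are hypothetical and used only for illustration; this is not what
niter_for_unrolled_loop or scale_profile_for_vect_loop actually compute.

```python
def split_iterations(niter, vf, masked=False):
    """Return (vector_body_iterations, epilogue_scalar_iterations).

    Hypothetical helper for illustration only; it does not mirror any
    GCC function. niter is the scalar iteration count, vf the assumed
    vectorization factor.
    """
    if vf <= 0:
        raise ValueError("vf must be positive")
    if masked:
        # A masked loop handles the partial final vector itself,
        # so nothing is left over for an epilogue.
        return (niter + vf - 1) // vf, 0
    # Otherwise the remainder of the division goes to the epilogue.
    return niter // vf, niter % vf
```

Under these assumptions, 10 scalar iterations with a factor of 4 give 2 vector
iterations plus 2 iterations in the epilogue, which is the share an
unroll-style estimate would miss.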

Wrong profile in:

;;   basic block 14, loop depth 2, count 13764235 (guessed, freq 1.9247), maybe hot
;;   Invalid sum of incoming counts 25234431 (guessed, freq 3.5286), should be 13764235 (guessed, freq 1.9247)

The wrong profile above is caused by loop peeling.  The unrolled loop is
peeled 4 times, which seems like a reasonable idea, but I am not sure why the
profile is not updated correctly here.
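
The "Invalid sum of incoming counts" message in the dump compares a block's
count with the sum of the counts on its incoming edges. A minimal sketch of
that consistency check in Python follows. The helper name is hypothetical, it
is not GCC code, and it assumes an exact comparison, whereas the real verifier
may tolerate small differences.

```python
def check_incoming_counts(bb_count, incoming_edge_counts):
    """Return (is_consistent, incoming_sum) for one basic block.

    Hypothetical helper for illustration only. Assumes an exact
    comparison between the block count and the incoming sum.
    """
    incoming_sum = sum(incoming_edge_counts)
    return incoming_sum == bb_count, incoming_sum
```

With the values from the dump, a block count of 13764235 against an incoming
sum of 25234431 is reported as inconsistent.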
