https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88398

--- Comment #19 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #18)
> The duffs device doesn't need to be done with computed jump, it can be done
> with 3 conditional branches + 3 comparisons too.  The advantage of doing
> that is especially if the iter isn't really very small, by doing it that way
> you don't need those 4 unrolled iterations + one scalar loop.

While that is better than an indirect branch, it still branches into the middle
of the loop body, so you don't benefit from optimization across the unrolled
iterations. Using a trailing loop is therefore better in most cases.
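
As a rough illustration of the trailing-loop shape (a sketch only, not taken from the bug report: the function name `sum_unrolled` and the summing loop body are made up for the example), the main loop is unrolled by 4 and entered only from the top, and a scalar loop picks up the 0..3 leftover iterations:

```c
#include <stddef.h>

/* Hypothetical example: sum n ints with the main loop unrolled by 4. */
long sum_unrolled(const int *a, size_t n)
{
    long s = 0;
    size_t i = 0;

    /* Main loop: 4 iterations per trip, entered only from the top, so
       nothing jumps into the middle of the unrolled body. */
    for (; i + 4 <= n; i += 4) {
        s += a[i];
        s += a[i + 1];
        s += a[i + 2];
        s += a[i + 3];
    }

    /* Trailing scalar loop: handles the remaining 0..3 elements. */
    for (; i < n; i++)
        s += a[i];

    return s;
}
```

Because the unrolled body has a single entry point, the four copies form one straight-line block that the compiler is free to schedule or combine as a unit.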

> Of course, if iter is very short, it might be easier/more efficient to
> duplicate iter more times than 4 and do something else.

If iter is a small constant then peeling makes sense as you can completely
remove the loop counter and branches. E.g. for N=3 you end up with something
like this:

if (n >= 2)
  iter1
  iter2
  n -= 2
if (n != 0)
  iter3
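
Written out in C, the pseudocode above might look like the following. This is only a sketch under assumptions: the function name `sum_peeled3` and the summing loop body are invented for illustration, and the caller is assumed to guarantee n <= 3.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: sum at most 3 ints with the loop fully peeled. */
long sum_peeled3(const int *a, size_t n)
{
    long s = 0;

    assert(n <= 3);     /* assumption: the trip count is at most 3 */

    if (n >= 2) {
        s += a[0];      /* iter1 */
        s += a[1];      /* iter2 */
        a += 2;
        n -= 2;
    }
    if (n != 0)
        s += a[0];      /* iter3 (or the only iteration when n == 1) */

    return s;
}
```

There is no backward branch and no loop counter update per iteration, only two forward conditional branches.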
