On 11/07/18 19:45, Eero Tamminen wrote:
Hi,
On 11.07.2018 12:00, Timothy Arceri wrote:
On 11/07/18 18:20, Eero Tamminen wrote:
Have you considered partial loop unrolling support?
I.e. when the loop counter is known but too high for a full unroll, doing
a partial unroll (e.g. by a factor of 4) and dividing the loop
counter by the same amount (if it doesn't divide evenly, the
remainder needs to be unrolled outside of the loop).
This is supported e.g. by the Intel Windows compiler.
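To make the idea concrete, here is a minimal sketch in plain C of what a partial unroll by 4 amounts to. It is only an illustration of the transformation on a CPU-style loop, not how NIR or any driver implements it; the function names are hypothetical.

```c
/* Reference: the original loop, one element per iteration. */
static int sum_rolled(const int *v, unsigned n)
{
    int acc = 0;
    for (unsigned i = 0; i < n; i++)
        acc += v[i];
    return acc;
}

/* Partially unrolled by a factor of 4 (hypothetical example).
 * The main loop runs n / 4 times with four copies of the body;
 * the n % 4 leftover iterations are handled outside of it. */
static int sum_unrolled4(const int *v, unsigned n)
{
    int acc = 0;
    unsigned i = 0;
    unsigned main_end = n - (n % 4); /* largest multiple of 4 <= n */

    for (; i < main_end; i += 4) {
        acc += v[i];
        acc += v[i + 1];
        acc += v[i + 2];
        acc += v[i + 3];
    }

    /* Remainder: at most 3 iterations. With a compile-time-known
     * trip count these could be emitted as straight-line code. */
    for (; i < n; i++)
        acc += v[i];

    return acc;
}
```

With a trip count of 10, the main loop runs twice (8 elements) and the remainder covers the last 2.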
Do you have any examples of apps this helps?
Sorry, no. This was found a while ago when doing synthetic tests for
things that are handled by compilers on the CPU side (GCC, LLVM...).
If I remember correctly there are very few (if any?) shaders in shader-db
that are not unrolled due to the limit.
Loop unrolling should probably have some limits based on shader cache
size and how many instructions the loop content has, not on the loop
counter, but only the backend has that info...
Yes, we already limit based on the contents of the loop. If I recall
correctly that part is not blocking too many loops from unrolling; it's
set at a limit that seems to limit spilling pretty well in a small
number of shaders. The iteration limit is the main hard limit; I guess
there probably is room to play with partial unrolling there.
The limiting code is:
static bool
is_loop_small_enough_to_unroll(nir_shader *shader, nir_loop_info *li)
{
   unsigned max_iter = shader->options->max_unroll_iterations;

   if (li->trip_count > max_iter)
      return false;

   if (li->force_unroll)
      return true;

   bool loop_not_too_large =
      li->num_instructions * li->trip_count <= max_iter * LOOP_UNROLL_LIMIT;

   return loop_not_too_large;
}
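For anyone who wants to play with the numbers, the same check can be sketched as a standalone function. The struct and function names below are hypothetical stand-ins for nir_loop_info and the shader options, and the limits are passed in as parameters rather than taken from any real driver.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the fields of nir_loop_info used above. */
struct loop_info_sketch {
    unsigned num_instructions; /* instructions in the loop body */
    unsigned trip_count;       /* known iteration count */
    bool force_unroll;
};

/* Mirrors the check quoted above. max_iter and unroll_limit are
 * parameters here (assumption: the caller supplies whatever values
 * the driver and LOOP_UNROLL_LIMIT would provide). */
static bool
small_enough_to_unroll_sketch(const struct loop_info_sketch *li,
                              unsigned max_iter, unsigned unroll_limit)
{
    /* Hard limit on iterations: applies even when force_unroll is set. */
    if (li->trip_count > max_iter)
        return false;

    if (li->force_unroll)
        return true;

    /* Size budget: total unrolled instructions vs. max_iter * limit. */
    return li->num_instructions * li->trip_count <= max_iter * unroll_limit;
}
```

As an illustration with assumed values max_iter = 32 and unroll_limit = 26, the budget is 832 instructions: a 10-instruction loop with 32 iterations (320) passes, a 30-instruction loop with 32 iterations (960) does not, and a loop with 33 iterations is rejected regardless of its size or force_unroll.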
- Eero
There was a measurable impact from the unroll limits on the Talos
benchmark for RADV. I guess it might be interesting to try partial
unrolling with that game; it would be good to know where else it might
help.
- Eero
On 11.07.2018 09:48, Timothy Arceri wrote:
This series started out as me trying to unroll some useless loops I
spotted in some shaders from DXVK games (see patch 10), but I found
some other issues and improvements along the way.
The biggest winner seems like it could be the dolphin uber shaders on
i965 (on radeonsi the shaders don't seem to have spilling issues).
The loops in the uber shaders that are unrolled are those used as
wrappers around switches by GLSL IR.
shader-db results for the series on IVB (note: as the loops that are
unrolled only have a single iteration, I enabled shader-db reporting
on shaders where loops are unrolled):
total instructions in shared programs: 10018187 -> 10016468 (-0.02%)
instructions in affected programs: 104080 -> 102361 (-1.65%)
helped: 36
HURT: 15
total cycles in shared programs: 220065064 -> 154529655 (-29.78%)
cycles in affected programs: 126063017 -> 60527608 (-51.99%)
helped: 51
HURT: 0
total loops in shared programs: 2515 -> 2308 (-8.23%)
loops in affected programs: 903 -> 696 (-22.92%)
helped: 51
HURT: 0
total spills in shared programs: 4370 -> 4124 (-5.63%)
spills in affected programs: 1397 -> 1151 (-17.61%)
helped: 9
HURT: 12
total fills in shared programs: 4581 -> 4419 (-3.54%)
fills in affected programs: 2201 -> 2039 (-7.36%)
helped: 9
HURT: 15
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev