On Sun, Sep 06, 2015 at 06:12:38PM +0200, Francisco Jerez wrote:
> This series implements dynamic partitioning of the L3 cache space
> among its clients, the purpose is multiple:
>
>  - Steal a chunk of L3 space when necessary and reserve it for SLM as
>    required to support compute shaders with shared variables.
>
>  - Allow L3 caching of dataport DC memory access where the default L3
>    partitioning doesn't have any space reserved for it (pre-Gen8) --
>    Should improve performance of scratch access (register spills and
>    fills and some forms of indirect array indexing), atomic counters
>    and images.
>
>  - Allow dynamic changes of the L3 configuration for work-loads that
>    could benefit from a partitioning other than the default
>    (e.g. reduce URB size to gain some additional cache space on
>    heavily fragment-bound workloads, or split the L3 allocation of
>    different clients to reduce thrashing).  The basic infrastructure
>    to achieve this is implemented here but no specific heuristics are
>    included yet in this series.

I admit to not knowing how this stuff works pre-GEN8, but it was my
impression that on GEN8+ these kinds of tweaks will make no difference to
3D clients other than for constant buffers and scratch space. Every other
client of the L3 uses a fixed size. Therefore I am skeptical of your last
claim. I'd very much like it if you could help me find where the theory
came from, and some amount of performance data would certainly be very
welcome as well.

I certainly believe the partitioning is critical for optimal usage of SLM
and, as you mention, for ensuring that other users of the dynamic
partitioning don't screw us over. It's the rest that I'm unsure of.

>
> The series can be found here in a testable form:
> http://cgit.freedesktop.org/~currojerez/mesa/log/?h=i965-l3-partitioning
>
> [PATCH 01/13] i965: Define symbolic constants for some useful L3 cache
>               control registers.
> [PATCH 02/13] i965: Keep track of whether LRI is allowed in the context
>               struct.
> [PATCH 03/13] i965: Define state flag to signal that the URB size has been
>               altered.
> [PATCH 04/13] i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if
>               DC flush is set.
> [PATCH 05/13] i965: Import tables enumerating the set of validated L3
>               configurations.
> [PATCH 06/13] i965: Implement programming of the L3 configuration.
> [PATCH 07/13] i965/hsw: Enable L3 atomics.
> [PATCH 08/13] i965: Implement selection of the closest L3 configuration based
>               on a vector of weights.
> [PATCH 09/13] i965: Calculate appropriate L3 partition weights for the
>               current pipeline state.
> [PATCH 10/13] i965: Implement L3 state atom.
> [PATCH 11/13] i965: Add debug flag to print out the new L3 state during
>               transitions.
> [PATCH 12/13] i965: Work around L3 state leaks during context switches.
> [PATCH 13/13] i965: Hook up L3 partitioning state atom.
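
For readers following along, the idea named in the title of patch 08
(picking the validated configuration closest to a vector of weights) could
be sketched roughly as below. This is not the code from the series. The
partition names, the struct layout, the table contents and the use of an L1
distance between normalized vectors are all assumptions made for
illustration; the real tables come from patch 05 and are not reproduced
here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical set of L3 clients; the real driver may split them differently. */
enum l3_partition { L3P_SLM, L3P_URB, L3P_DC, L3P_REST, L3P_COUNT };

/* Hypothetical config entry: cache units assigned to each partition. */
struct l3_config {
   unsigned n[L3P_COUNT];
};

/* Made-up demo table. These are NOT validated hardware configurations. */
static const struct l3_config demo_configs[] = {
   {{ 0, 4, 0, 4 }},
   {{ 0, 2, 2, 4 }},
   {{ 2, 2, 0, 4 }},
   {{ 2, 2, 2, 2 }},
};

/* Scale a vector so its components sum to 1 (all zeros if the sum is 0). */
static void
normalize(const float in[L3P_COUNT], float out[L3P_COUNT])
{
   float total = 0.0f;

   for (int i = 0; i < L3P_COUNT; i++)
      total += in[i];

   for (int i = 0; i < L3P_COUNT; i++)
      out[i] = total > 0.0f ? in[i] / total : 0.0f;
}

/* L1 distance between a config's normalized sizes and normalized weights. */
static float
config_distance(const struct l3_config *cfg, const float want[L3P_COUNT])
{
   float raw[L3P_COUNT], have[L3P_COUNT];
   float dist = 0.0f;

   for (int i = 0; i < L3P_COUNT; i++)
      raw[i] = (float)cfg->n[i];
   normalize(raw, have);

   for (int i = 0; i < L3P_COUNT; i++) {
      const float d = have[i] - want[i];
      dist += d < 0.0f ? -d : d;
   }
   return dist;
}

/* Return the index of the table entry closest to the requested weights.
 * Ties are resolved in favour of the earliest entry. */
static size_t
select_l3_config(const struct l3_config *configs, size_t count,
                 const float weights[L3P_COUNT])
{
   float want[L3P_COUNT];
   size_t best = 0;
   float best_dist;

   assert(count > 0);
   normalize(weights, want);
   best_dist = config_distance(&configs[0], want);

   for (size_t i = 1; i < count; i++) {
      const float dist = config_distance(&configs[i], want);
      if (dist < best_dist) {
         best_dist = dist;
         best = i;
      }
   }
   return best;
}
```

With the demo table, a request weighted { 1, 1, 0, 2 } (some SLM, no DC)
selects entry 2, and { 0, 1, 0, 1 } selects entry 0. Whether any heuristic
that produces such weights pays off on GEN8+ is exactly the open question
above.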
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev