Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

Vadim Girlin Tue, 27 Aug 2013 14:59:44 -0700

On 08/28/2013 12:43 AM, Marek Olšák wrote:

Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
with a Mesa driver that likes to compile shader variants on first use?
It's HORRIBLE.

I don't think that shader variants are bad, but it's definitely bad when we are compiling variants that are never used. Currently glxgears compiles 18 ps/vs shaders. In my branch with initial GS support [1] I switched handling of the shaders to deferred compilation, that is, shaders are compiled only before the actual draw. I found later that it's not really required for GS, but IIRC this change results in only 5 shaders being compiled for glxgears instead of 18. It seems most of the useless variants are results of state changes between creation of the shader state (initial compilation) and actual draw call.
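
A minimal sketch of the deferred-compilation idea described above, assuming hypothetical structures (`shader_key`, `shader_state`, `shader_select` are illustrative names, not the r600g ones). Nothing is compiled when the shader state is created; a variant is compiled only when a draw needs a key that has not been seen yet:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the actual r600g types. */
struct shader_key {
    unsigned nr_cbufs;
    bool alpha_to_one;
};

struct shader_variant {
    struct shader_key key;
    bool compiled;
};

#define MAX_VARIANTS 8

struct shader_state {
    struct shader_variant variants[MAX_VARIANTS];
    unsigned num_variants;
    unsigned compile_count; /* number of real compiles performed */
};

static bool key_equal(const struct shader_key *a, const struct shader_key *b)
{
    return a->nr_cbufs == b->nr_cbufs && a->alpha_to_one == b->alpha_to_one;
}

/* Creating the shader state compiles nothing. */
static void shader_create(struct shader_state *s)
{
    assert(s != NULL);
    s->num_variants = 0;
    s->compile_count = 0;
}

/* Called right before a draw, with the key built from the current state. */
static struct shader_variant *shader_select(struct shader_state *s,
                                            struct shader_key key)
{
    for (unsigned i = 0; i < s->num_variants; i++) {
        if (key_equal(&s->variants[i].key, &key))
            return &s->variants[i]; /* already compiled, reuse it */
    }
    if (s->num_variants == MAX_VARIANTS)
        return NULL; /* sketch only: no eviction policy */

    struct shader_variant *v = &s->variants[s->num_variants++];
    v->key = key;
    v->compiled = true; /* the real compile would happen here */
    s->compile_count++;
    return v;
}
```

With this shape, state changes between shader creation and the draw call cost nothing, because only the key that is live at draw time ever reaches the compiler.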

I had some concerns about increased overhead with those changes, and it's actually noticeable with the drawoverhead demo, but I didn't see any regressions with a few real apps that I tested, e.g. glxgears even showed slightly better performance with these changes. I also probably implemented it in a not very optimal way (I was mostly concentrating on GS support), so the overhead can likely be reduced.

One more thing is duplicate shaders. I analyzed shader dumps from Unigine Heaven 3.0 some time ago and found that from about 320 compiled shaders, only about 180 (50%) were unique; the others were duplicates (detected by comparing the bytecode dumps for them in an automated way). Maybe they had different shader keys (which still resulted in the same bytecode), but I suspect duplicate pipe shaders were also involved. Unfortunately I haven't had time to investigate it more thoroughly since then.
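
The duplicate check mentioned above can be sketched as a plain byte-wise comparison of bytecode dumps. This is only an illustration under assumed types (`struct bytecode` and `count_unique` are hypothetical names), not the script that was actually used:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical view of one compiled shader: a buffer of dwords. */
struct bytecode {
    const uint32_t *dw;
    size_t ndw;
};

static bool bytecode_equal(const struct bytecode *a, const struct bytecode *b)
{
    return a->ndw == b->ndw &&
           memcmp(a->dw, b->dw, a->ndw * sizeof(uint32_t)) == 0;
}

/*
 * Count shaders whose bytecode differs from every earlier one.
 * Quadratic, which is fine for a few hundred dumps.
 */
static size_t count_unique(const struct bytecode *bc, size_t n)
{
    assert(bc != NULL || n == 0);
    size_t unique = 0;
    for (size_t i = 0; i < n; i++) {
        bool seen = false;
        for (size_t j = 0; j < i && !seen; j++)
            seen = bytecode_equal(&bc[i], &bc[j]);
        if (!seen)
            unique++;
    }
    return unique;
}
```

Two shaders compiled from different keys count as one here whenever they end up with identical bytecode, which is the case the paragraph above is concerned with.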

So my point is that we don't really need to eliminate shader variants, first we need to eliminate compilation of unused variants and duplicate shaders. Also we might want to consider offloading of the compilation to separate thread(s) and caching of shader binaries between runs.


Vadim

 [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders


What the patch does is probably the right solution. At least
alpha-test state changes don't cause shader recompilation and
re-binding, which also negatively affects performance. Ideally we
shouldn't depend on the framebuffer state at all, but we need to
emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
should always be fine with key.nr_cbufs forced to 8 for any shader
without that property. I expect app developers to do the right thing
and not write outputs they don't need.
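
One possible reading of the suggestion above, as a sketch only: the shader key follows the framebuffer's color buffer count solely for shaders that set FS_COLOR0_WRITES_ALL_CBUFS, and is pinned to 8 otherwise, so framebuffer changes never create a new variant for ordinary shaders. The function name and parameters are hypothetical, not taken from the driver:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_CBUFS 8 /* value named in the text above */

/*
 * fs_write_all: shader has the FS_COLOR0_WRITES_ALL_CBUFS property.
 * fb_nr_cbufs:  color buffers bound in the current framebuffer state.
 */
static unsigned select_key_nr_cbufs(bool fs_write_all, unsigned fb_nr_cbufs)
{
    assert(fb_nr_cbufs <= MAX_CBUFS);
    /* Only the write-all case needs to know the real buffer count. */
    return fs_write_all ? fb_nr_cbufs : MAX_CBUFS;
}
```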

Marek

On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger <srol...@vmware.com> wrote:

Not that I'm qualified to review r600 code, but couldn't you create
different shader variants depending on whether you need alpha test? At
least I would assume shader exports aren't free.

Roland

Am 27.08.2013 19:56, schrieb Vadim Girlin:

We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin <vadimgir...@gmail.com>
---

Tested on evergreen with multiple combinations of backends - no regressions,
fixes some tests:

   default    - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   llvm       - fixes about 25 tests related to depth/stencil
   llvm+sb    - fixes about 300 tests (llvm's depth/stencil issues and
                regressions caused by reordering of exports in sb)

With this patch, there are no regressions with default+sb vs default.
There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H
instructions are not placed in the same TEX clause with corresponding SAMPLE_G.

  src/gallium/drivers/r600/r600_shader.c | 7 ++++---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c
index 300b5c4..f7eab76 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen,
       unsigned opcode;
       int i, j, k, r = 0;
       int next_pos_base = 60, next_param_base = 0;
+     int max_color_exports = MAX2(key.nr_cbufs, 1);
       /* Declarations used by llvm code */
       bool use_llvm = false;
       bool indirect_gprs;
@@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen,
               radeon_llvm_ctx.face_gpr = ctx.face_gpr;
               radeon_llvm_ctx.r600_inputs = ctx.shader->input;
               radeon_llvm_ctx.r600_outputs = ctx.shader->output;
-             radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1);
+             radeon_llvm_ctx.color_buffer_count = max_color_exports;
               radeon_llvm_ctx.chip_class = ctx.bc->chip_class;
               radeon_llvm_ctx.fs_color_all = shader->fs_write_all && (rscreen->chip_class >= EVERGREEN);
               radeon_llvm_ctx.stream_outputs = &so;
@@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen,
               case TGSI_PROCESSOR_FRAGMENT:
                       if (shader->output[i].name == TGSI_SEMANTIC_COLOR) {
                               /* never export more colors than the number of CBs */
-                             if (shader->output[i].sid >= key.nr_cbufs) {
+                             if (shader->output[i].sid >= max_color_exports) {
                                       /* skip export */
                                       j--;
                                       continue;
@@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen,
                               output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
                               shader->nr_ps_color_exports++;
                               if (shader->fs_write_all && (rscreen->chip_class >= EVERGREEN)) {
-                                     for (k = 1; k < key.nr_cbufs; k++) {
+                                     for (k = 1; k < max_color_exports; k++) {
                                               j++;
                                               memset(&output[j], 0, sizeof(struct r600_bytecode_output));
                                               output[j].gpr = shader->output[i].gpr;
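
The effect of the patch can be modelled in isolation. The sketch below is a simplified stand-in for the export loop, not the driver code: `count_color_exports` and its parameters are hypothetical, and `write_all` stands for `shader->fs_write_all && (rscreen->chip_class >= EVERGREEN)`. It clamps the export limit to at least one, as `max_color_exports` does in the diff:

```c
#include <assert.h>
#include <stdbool.h>

#define MAX2(a, b) ((a) > (b) ? (a) : (b))
#define MAX_CBUFS 8 /* assumed upper bound for this sketch */

/*
 * Count the pixel exports emitted for a fragment shader's color outputs.
 * color_sids: semantic index of each color output the shader writes.
 */
static unsigned count_color_exports(const unsigned *color_sids, unsigned n,
                                    unsigned nr_cbufs, bool write_all)
{
    assert(nr_cbufs <= MAX_CBUFS);
    /* Export at least one color even when no color buffer is bound. */
    unsigned max_color_exports = MAX2(nr_cbufs, 1u);
    unsigned exports = 0;

    for (unsigned i = 0; i < n; i++) {
        if (color_sids[i] >= max_color_exports)
            continue; /* skip export */
        exports++;
        if (write_all)
            exports += max_color_exports - 1; /* replicate to the other CBs */
    }
    return exports;
}
```

With `nr_cbufs == 0` and a shader writing color 0, this yields one export; comparing against `key.nr_cbufs` directly, as the code did before the patch, would skip it.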

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
