Module: Mesa
Branch: master
Commit: 6e1b12c7881fe663cb500cb2f7374f4862bae179
URL:    
http://cgit.freedesktop.org/mesa/mesa/commit/?id=6e1b12c7881fe663cb500cb2f7374f4862bae179

Author: Marek Olšák <marek.ol...@amd.com>
Date:   Wed Jun  8 13:21:25 2016 +0200

radeonsi: enable scratch coalescing

This makes one particular compute shader 8x faster.

Latest LLVM git is required.

Reviewed-by: Nicolai Hähnle <nicolai.haeh...@amd.com>

---

 src/gallium/drivers/radeonsi/si_shader.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
b/src/gallium/drivers/radeonsi/si_shader.c
index 754b4af..f2bd337 100644
--- a/src/gallium/drivers/radeonsi/si_shader.c
+++ b/src/gallium/drivers/radeonsi/si_shader.c
@@ -5903,8 +5903,16 @@ void si_shader_apply_scratch_relocs(struct si_context 
*sctx,
        unsigned i;
        uint32_t scratch_rsrc_dword0 = scratch_va;
        uint32_t scratch_rsrc_dword1 =
-               S_008F04_BASE_ADDRESS_HI(scratch_va >> 32)
-               |  S_008F04_STRIDE(config->scratch_bytes_per_wave / 64);
+               S_008F04_BASE_ADDRESS_HI(scratch_va >> 32);
+
+       /* Enable scratch coalescing if LLVM sets ELEMENT_SIZE & INDEX_STRIDE
+        * correctly.
+        */
+       if (HAVE_LLVM >= 0x0309)
+               scratch_rsrc_dword1 |= S_008F04_SWIZZLE_ENABLE(1);
+       else
+               scratch_rsrc_dword1 |=
+                       S_008F04_STRIDE(config->scratch_bytes_per_wave / 64);
 
        for (i = 0 ; i < shader->binary.reloc_count; i++) {
                const struct radeon_shader_reloc *reloc =

_______________________________________________
mesa-commit mailing list
mesa-commit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-commit

Reply via email to