Samuel Iglesias Gonsálvez <sigles...@igalia.com> writes:

> On 18/11/15 06:54, Jordan Justen wrote:
>> From: Francisco Jerez <curroje...@riseup.net>
>> 
>> Improves performance of the arb_shader_image_load_store-atomicity
>> piglit test by over 25x (which isn't a real benchmark it's just heavy
>> on atomics -- the improvement in a microbenchmark I wrote a while ago
>> seemed to be even greater).  The drawback is one needs to be
>> extra-careful not to hang the GPU (in fact the whole system).  A DC
>> partition must have been allocated on L3, the "convert L3 cycle for DC
>> to UC" bit may not be set, the MOCS L3 cacheability bit must be set
>> for all surfaces accessed using DC atomics, and the SCRATCH1 and
>> ROW_CHICKEN3 bits must be kept in sync.
>> 
>> A fairly recent kernel is required for the command parser to allow
>> writes to these registers.
>> ---
>>  src/mesa/drivers/dri/i965/gen7_l3_state.c | 14 ++++++++++++++
>>  1 file changed, 14 insertions(+)
>> 
>> diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c 
>> b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> index 48bca29..c863b7f 100644
>> --- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> +++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
>> @@ -254,5 +254,19 @@ setup_l3_config(struct brw_context *brw, const struct 
>> brw_l3_config *cfg)
>>                  SET_FIELD(cfg->n[L3P_T], GEN7_L3CNTLREG3_T_ALLOC));
>>  
>>        ADVANCE_BATCH();
>> +
>> +      if (brw->is_haswell && brw->intelScreen->cmd_parser_version >= 4) {
>> +         /* Enable L3 atomics on HSW if we have a DC partition, otherwise 
>> keep
>> +          * them disabled to avoid crashing the system hard.
>> +          */
>> +         BEGIN_BATCH(5);
>> +         OUT_BATCH(MI_LOAD_REGISTER_IMM | (5 - 2));
>> +         OUT_BATCH(HSW_SCRATCH1);
>> +         OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
>> +         OUT_BATCH(HSW_ROW_CHICKEN3);
>> +         OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
>> +                   (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
>
>
> I have not found references to ROW_CHICKEN3 nor register with 0xe49c
> address offset in HSW PRMs, so these could be stupid questions:
>

Yeah, these chicken registers are not part of the public PRMs.

> Why you need to set the L3 atomic disable flag in two different places
> in ROW_CHICKEN3 register? Also, why the first flag is set
> unconditionally while the second one only if we don't have a DC
> partition? This is what you want?

It's a masked register as many other MMIO registers on Gen hardware: the
upper 16 bits control which of the lower 16 bits are modified in the
register.

> 
>
> Also, if the "HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16" is really
> needed, it could be defined as a constant in the first patch of the series.
>
If you think that would be more readable I guess we could define a
REG_MASK() macro and use it wherever we write a masked register.  See
attachment (applies on top of this patch).

> Sam
>
>> +         ADVANCE_BATCH();
>> +      }
>>     }
>>  }
>> 

From 4a3af3cf92bc45401242a676e0178e408eeafbb8 Mon Sep 17 00:00:00 2001
From: Francisco Jerez <curroje...@riseup.net>
Date: Thu, 26 Nov 2015 16:42:43 +0200
Subject: [PATCH] i965: Define and use REG_MASK macro to make masked MMIO
 writes slightly more readable.

---
 src/mesa/drivers/dri/i965/brw_defines.h      | 6 ++++++
 src/mesa/drivers/dri/i965/brw_state_upload.c | 2 +-
 src/mesa/drivers/dri/i965/gen7_l3_state.c    | 2 +-
 src/mesa/drivers/dri/i965/intel_reg.h        | 2 +-
 4 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h
index ade3ede..e08d385 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -41,6 +41,12 @@
 #define GET_BITS(data, high, low) ((data & INTEL_MASK((high), (low))) >> (low))
 #define GET_FIELD(word, field) (((word)  & field ## _MASK) >> field ## _SHIFT)
 
+/**
+ * For use with masked MMIO registers where the upper 16 bits control which
+ * of the lower bits are committed to the register.
+ */
+#define REG_MASK(value) ((value) << 16)
+
 #ifndef BRW_DEFINES_H
 #define BRW_DEFINES_H
 
diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c
index cd1fbee..9ad40d2 100644
--- a/src/mesa/drivers/dri/i965/brw_state_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
@@ -386,7 +386,7 @@ brw_upload_initial_gpu_state(struct brw_context *brw)
       BEGIN_BATCH(3);
       OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
       OUT_BATCH(GEN7_CACHE_MODE_1);
-      OUT_BATCH((GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC << 16) |
+      OUT_BATCH(REG_MASK(GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC) |
                 GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC);
       ADVANCE_BATCH();
    }
diff --git a/src/mesa/drivers/dri/i965/gen7_l3_state.c b/src/mesa/drivers/dri/i965/gen7_l3_state.c
index 239e5bd..022c01a 100644
--- a/src/mesa/drivers/dri/i965/gen7_l3_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_l3_state.c
@@ -412,7 +412,7 @@ setup_l3_config(struct brw_context *brw, const struct brw_l3_config *cfg)
          OUT_BATCH(HSW_SCRATCH1);
          OUT_BATCH(has_dc ? 0 : HSW_SCRATCH1_L3_ATOMIC_DISABLE);
          OUT_BATCH(HSW_ROW_CHICKEN3);
-         OUT_BATCH(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE << 16 |
+         OUT_BATCH(REG_MASK(HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE) |
                    (has_dc ? 0 : HSW_ROW_CHICKEN3_L3_ATOMIC_DISABLE));
          ADVANCE_BATCH();
       }
diff --git a/src/mesa/drivers/dri/i965/intel_reg.h b/src/mesa/drivers/dri/i965/intel_reg.h
index 0b167d5..8888d6f 100644
--- a/src/mesa/drivers/dri/i965/intel_reg.h
+++ b/src/mesa/drivers/dri/i965/intel_reg.h
@@ -183,7 +183,7 @@
 # define GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE (1 << 13)
 # define GEN9_PARTIAL_RESOLVE_DISABLE_IN_VC (1 << 1)
 # define GEN8_HIZ_PMA_MASK_BITS \
-   ((GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE) << 16)
+   REG_MASK(GEN8_HIZ_NP_PMA_FIX_ENABLE | GEN8_HIZ_NP_EARLY_Z_FAILS_DISABLE)
 
 /* Predicate registers */
 #define MI_PREDICATE_SRC0               0x2400
-- 
2.5.1

Attachment: signature.asc
Description: PGP signature

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Reply via email to