[Mesa-dev] [PATCH 2/2] i965/skl: Break down SIMD16 3-source instructions when required.

2015-03-04 Thread Kenneth Graunke
Several steppings of Skylake fail when using SIMD16 with 3-source
instructions (such as MAD).

This implements WaDisableSIMD16On3SrcInstr and fixes ~190 Piglit
tests.

Based on a patch by Neil Roberts.

Signed-off-by: Kenneth Graunke 
Cc: Neil Roberts 
Cc: Matt Turner 
---
 src/mesa/drivers/dri/i965/brw_shader.cpp | 6 ++
 1 file changed, 6 insertions(+)

Neil, what do you think of this approach?  It's a bit smaller of a hammer
than turning off SIMD16 altogether, and pretty simple.  I haven't tested
it at all, though.

Feel free to --reset-author and claim authorship on this patch - it's
really your code, I just moved it over a bit.

diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp 
b/src/mesa/drivers/dri/i965/brw_shader.cpp
index fed4ba3..74c0e50 100644
--- a/src/mesa/drivers/dri/i965/brw_shader.cpp
+++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
@@ -400,6 +400,12 @@ bool
 brw_instruction_supports_simd16(const struct brw_context *brw, enum opcode op)
 {
bool supports_3src = brw->is_haswell || brw->gen >= 8;
+   /* WaDisableSIMD16On3SrcInstr: 3-source instructions don't work in SIMD16
+* on a few steppings of Skylake.
+*/
+   if (brw->gen == 9 && (brw->revision == 2 || brw->revision == 3 ||
+ brw->revision == -1))
+  supports_3src = false;
 
switch (op) {
case BRW_OPCODE_MAD:
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Reserve more batch space to accomodate Gen6 perfmonitors.

2015-03-04 Thread Kenneth Graunke
Ben noticed that I said each PIPE_CONTROL was 4 DWords, but it's
actually 5 DWords on Gen6-7.  We've been reserving insufficient space
for performance monitoring on Sandybridge, which means it would likely
break if you used that functionality.  (Thankfully, no one does...)

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

We should probably Cc this to stable unless we delete the broken
performance monitoring code altogether.

diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index 7bdd836..5a16456 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -22,12 +22,12 @@ extern "C" {
  *   - Disabling OA counters on Gen6+ (3 DWords = 12 bytes)
  *   - Ending MI_REPORT_PERF_COUNT on Gen5+, plus associated PIPE_CONTROLs:
  * - Two sets of PIPE_CONTROLs, which become 3 PIPE_CONTROLs each on SNB,
- *   which are 4 DWords each ==> 2 * 3 * 4 * 4 = 96 bytes
+ *   which are 5 DWords each ==> 2 * 3 * 5 * 4 = 120 bytes
  * - 3 DWords for MI_REPORT_PERF_COUNT itself on Gen6+.  ==> 12 bytes.
  *   On Ironlake, it's 6 DWords, but we have some slack due to the lack of
  *   Sandybridge PIPE_CONTROL madness.
  */
-#define BATCH_RESERVED 146
+#define BATCH_RESERVED 152
 
 struct intel_batchbuffer;
 
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Split Gen4-5 BlitFramebuffer code; prefer BLT over Meta.

2015-03-04 Thread Kenneth Graunke
A while back I switched intel_blit_framebuffer to prefer Meta over the
BLT.  This meant that Gen8 platforms would start using the 3D engine
for blits, just like we do on Gen6-7.5.

However, I hadn't considered Gen4-5 when making that change.  The BLT
engine appears to be substantially faster on 965GM than using Meta to
drive the 3D engine.  This isn't too surprising: original Gen4 doesn't
support tile offsets (that came on G45), and the level/layer fields
don't work for cubemap rendering, so for inconvenient miplevel
alignments, we end up blitting or copying data to/from temporaries
in order to render to it.  We may as well just use the blitter.

I chose to use the BLT on Gen4-5 because they use the same ring for
both 3D and BLT; Gen6+ splits it out.

Fixes regressions on 965GM due to botched tile offset code (we should
fix those properly as well, but they're longstanding bugs - for now,
put things back to the status quo).

Signed-off-by: Kenneth Graunke 
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89430
Cc: "10.5" 
Cc: Mark Janes 
---
 src/mesa/drivers/dri/i965/intel_fbo.c | 50 ++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c 
b/src/mesa/drivers/dri/i965/intel_fbo.c
index 04e5030..121c97f 100644
--- a/src/mesa/drivers/dri/i965/intel_fbo.c
+++ b/src/mesa/drivers/dri/i965/intel_fbo.c
@@ -916,6 +916,51 @@ intel_blit_framebuffer(struct gl_context *ctx,
 }
 
 /**
+ * Gen4-5 implementation of glBlitFrameBuffer().
+ *
+ * Tries BLT, Meta, then swrast.
+ *
+ * Gen4-5 have a single ring for both 3D and BLT operations, so there's no
+ * inter-ring synchronization issues like on Gen6+.  It is apparently faster
+ * than using the 3D pipeline.  Original Gen4 also has to rebase and copy
+ * miptree slices in order to render to unaligned locations.
+ */
+static void
+gen4_blit_framebuffer(struct gl_context *ctx,
+  struct gl_framebuffer *readFb,
+  struct gl_framebuffer *drawFb,
+  GLint srcX0, GLint srcY0, GLint srcX1, GLint srcY1,
+  GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1,
+  GLbitfield mask, GLenum filter)
+{
+   /* Page 679 of OpenGL 4.4 spec says:
+*"Added BlitFramebuffer to commands affected by conditional rendering 
in
+* section 10.10 (Bug 9562)."
+*/
+   if (!_mesa_check_conditional_render(ctx))
+  return;
+
+   mask = intel_blit_framebuffer_with_blitter(ctx, readFb, drawFb,
+  srcX0, srcY0, srcX1, srcY1,
+  dstX0, dstY0, dstX1, dstY1,
+  mask, filter);
+   if (mask == 0x0)
+  return;
+
+   mask = _mesa_meta_BlitFramebuffer(ctx, readFb, drawFb,
+ srcX0, srcY0, srcX1, srcY1,
+ dstX0, dstY0, dstX1, dstY1,
+ mask, filter);
+   if (mask == 0x0)
+  return;
+
+   _swrast_BlitFramebuffer(ctx, readFb, drawFb,
+   srcX0, srcY0, srcX1, srcY1,
+   dstX0, dstY0, dstX1, dstY1,
+   mask, filter);
+}
+
+/**
  * Does the renderbuffer have hiz enabled?
  */
 bool
@@ -1049,7 +1094,10 @@ intel_fbo_init(struct brw_context *brw)
dd->UnmapRenderbuffer = intel_unmap_renderbuffer;
dd->RenderTexture = intel_render_texture;
dd->ValidateFramebuffer = intel_validate_framebuffer;
-   dd->BlitFramebuffer = intel_blit_framebuffer;
+   if (brw->gen >= 6)
+  dd->BlitFramebuffer = intel_blit_framebuffer;
+   else
+  dd->BlitFramebuffer = gen4_blit_framebuffer;
dd->EGLImageTargetRenderbufferStorage =
   intel_image_target_renderbuffer_storage;
 
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Fix URB size for CHV

2015-03-05 Thread Kenneth Graunke
On Thursday, March 05, 2015 07:41:29 PM Ville Syrjälä wrote:
> On Fri, Jan 23, 2015 at 12:12:56PM +0200, ville.syrj...@linux.intel.com wrote:
> > From: Ville Syrjälä 
> > 
> > Increase the device info .urb.size for CHV to match the default URB
> > size (192kB).
> > 
> > Signed-off-by: Ville Syrjälä 
> 
> Ping?

Oh, sorry!  I thought I'd reviewed this.  It does indeed appear to be
192kB.

Reviewed-by: Kenneth Graunke 

Have you tested it?  Assuming it doesn't explode, feel free to push
this.  Thanks for catching the mistake!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir: Make the printer include nir_variable::location too.

2015-03-05 Thread Kenneth Graunke
Being able to see both location and driver_location can be useful when
debugging IO mistakes.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_print.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 6a3c6a0..30d4821 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -228,7 +228,7 @@ print_var_decl(nir_variable *var, print_var_state *state, 
FILE *fp)
if (var->data.mode == nir_var_shader_in ||
var->data.mode == nir_var_shader_out ||
var->data.mode == nir_var_uniform) {
-  fprintf(fp, " (%u)", var->data.driver_location);
+  fprintf(fp, " (%u, %u)", var->data.location, var->data.driver_location);
}
 
fprintf(fp, "\n");
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 6/6] i965: Don't write past the end of the application supplied buffer

2015-03-05 Thread Kenneth Graunke
On Wednesday, March 04, 2015 09:55:46 AM Ian Romanick wrote:
> From: Ian Romanick 
> 
> Both the AMD and Intel APIs provide a dataSize parameter, and this
> function would merrily ignore it.  Neither API specifies what to do when
> the buffer isn't big enough.  I take the easy route of writing all the
> complete bits of data that will fit.  With more complete specs, we could
> probably do something different.
> 
> I noticed this while looking into an unused parameter warning.  The
> warning was actually useful!
> 
> brw_performance_monitor.c: In function 'brw_get_perf_monitor_result':
> brw_performance_monitor.c:1261:37: warning: unused parameter 'data_size' 
> [-Wunused-parameter]
>  GLsizei data_size,
>  ^
> 
> v2: Fix checks to include offset in the calculation.  Noticed by Jan.
> 
> Signed-off-by: Ian Romanick 
> Cc: Kenneth Graunke 
> Cc: Jan Vesely 

Huh.  I could've sworn I reviewed this patch a while back, but I can't
find any evidence of that.  Thanks for fixing this!

The spec isn't particularly clear about whether we should write up to
the very end of the byte limit (i.e. write the group name and counter
name but not the value), or omit the entire last entry when there isn't
space...but I think what you did is quite sensible.

I'm pretty sure we're deleting most of this code in favor of Rob's
efforts, but we may as well land this fix in the meantime.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/nir: Resolve source modifiers on Gen8+ logic operations.

2015-03-06 Thread Kenneth Graunke
On Gen8+, AND/OR/XOR/NOT don't support the abs() source modifier, and
negate changes meaning to bitwise-not (~, not -).  This isn't what NIR
expects, so we should resolve the source modifers via a MOV.

+30 Piglits (fs-op-bit{and,or,xor}-not-abs-*).

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 11 +++
 src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 15 +++
 3 files changed, 27 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d6acc23..428234f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1561,6 +1561,17 @@ fs_visitor::emit_sampleid_setup()
return reg;
 }
 
+void
+fs_visitor::resolve_source_modifiers(fs_reg *src)
+{
+   if (!src->abs && !src->negate)
+  return;
+
+   fs_reg temp = retype(vgrf(1), src->type);
+   emit(MOV(temp, *src));
+   *src = temp;
+}
+
 fs_reg
 fs_visitor::fix_math_operand(fs_reg src)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 70098d8..ec77962 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -299,6 +299,7 @@ public:
  int texunit);
fs_reg emit_mcs_fetch(fs_reg coordinate, int components, fs_reg sampler);
void emit_gen6_gather_wa(uint8_t wa, fs_reg dst);
+   void resolve_source_modifiers(fs_reg *src);
fs_reg fix_math_operand(fs_reg src);
fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0);
fs_inst *emit_math(enum opcode op, fs_reg dst, fs_reg src0, fs_reg src1);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 66f7918..a0300aa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -935,15 +935,30 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
   break;
 
case nir_op_inot:
+  if (brw->gen >= 8) {
+ resolve_source_modifiers(&op[0]);
+  }
   emit(NOT(result, op[0]));
   break;
case nir_op_ixor:
+  if (brw->gen >= 8) {
+ resolve_source_modifiers(&op[0]);
+ resolve_source_modifiers(&op[1]);
+  }
   emit(XOR(result, op[0], op[1]));
   break;
case nir_op_ior:
+  if (brw->gen >= 8) {
+ resolve_source_modifiers(&op[0]);
+ resolve_source_modifiers(&op[1]);
+  }
   emit(OR(result, op[0], op[1]));
   break;
case nir_op_iand:
+  if (brw->gen >= 8) {
+ resolve_source_modifiers(&op[0]);
+ resolve_source_modifiers(&op[1]);
+  }
   emit(AND(result, op[0], op[1]));
   break;
 
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Implement SIMD16 dual source blending.

2015-03-06 Thread Kenneth Graunke
On Thursday, March 05, 2015 09:39:58 PM Jason Ekstrand wrote:
> This looks fine to me.  I just kicked off a build on our test farm and,
> assuming that looks good (I'll send another e-mail in the morning if it
> does),
> 
> Reviewed-by: Jason Ekstrand 
> 
> I ran shader-db on the change and I was kind of surprised to see that it
> doesn't really do anything.
> 
> GAINED: shaders/dolphin/smg.1.shader_test FS SIMD16
> 
> total instructions in shared programs: 5769629 -> 5769629 (0.00%)
> instructions in affected programs: 0 -> 0
> helped:0
> HURT:  0
> GAINED:1
> LOST:  0
> 
> Perhaps shader-db doesn't account for some other GL state required for
> dual-source because I doubt only one shader uses it.  Ken?
> 
> --Jason

That would be dolphin/smg.1.shader_test - the one lonely shader that
uses layout qualifiers to specify the dual source color output index:

layout(location = 0, index = 1) out vec4 ocol1;

Other applications (such as Unigine) most likely call
glBindFragDataLocationIndexed to assign the location and index.

Unfortunately, shader-db doesn't capture this, as it's tied to API
calls, and not part of the shader itself.  Eric's new shader-db-2
project that uses apitrace would catch this (but at a large cost).

We could probably capture this somehow - add some kind of annotations
to the file with the locations/indexes of each shader input/output,
then make the API calls after compiling the shader...relink to make
them take effect, which would also cause a new precompile, then replace
the original results...seems like a pain, but probably doable...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] glsl: Mark array access when copying to a temporary for the ?: operator.

2015-03-06 Thread Kenneth Graunke
Piglit's spec/glsl-1.20/compiler/structure-and-array-operations/
array-selection.vert test contains the following code:

   gl_Position = (pick_from_a_or_b ? a : b)[i];

where "a" and "b" are uniform vec4[2] variables.

ast_to_hir creates a temporary vec4[2] variable, conditional_tmp, and
generates an if-block to copy one or the other:

   (declare (temporary) (array vec4 2) conditional_tmp)
   (if (var_ref pick_from_a_or_b)
 ((assign () (var_ref conditional_tmp) (var_ref a)))
 ((assign () (var_ref conditional_tmp) (var_ref b

However, we failed to update max_array_access for "a" and "b", so it
remained 0 - here, the whole array is being accessed.  At link time,
update_array_sizes() used this bogus information to change the types
of "a" and "b" to vec4[1].  We then had assignments from a vec4[1] to
a vec4[2], which is highly illegal.

This tripped assertions in nir_split_var_copies with scalar VS.

Signed-off-by: Kenneth Graunke 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/glsl/ast_to_hir.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index acb5c76..d387b2e 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1617,6 +1617,12 @@ ast_expression::do_hir(exec_list *instructions,
   && cond_val != NULL) {
  result = cond_val->value.b[0] ? op[1] : op[2];
   } else {
+ /* The copy to conditional_tmp reads the whole array. */
+ if (type->is_array()) {
+mark_whole_array_access(op[1]);
+mark_whole_array_access(op[2]);
+ }
+
  ir_variable *const tmp =
 new(ctx) ir_variable(type, "conditional_tmp", ir_var_temporary);
  instructions->push_tail(tmp);
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] nir: Only do gl_FrontFacing workaround in glsl_to_nir for the FS.

2015-03-06 Thread Kenneth Graunke
Vertex shaders can have shader inputs where location happens to be
VARYING_SLOT_FACE.  Without predicating this on the shader stage,
we suddenly end up with load_front_face intrinsics in vertex shaders,
which is nonsensical.

Fixes spec/arb_vertex_buffer_object/pos-array when using NIR for VS.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/glsl_to_nir.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index ddad207..047cb51 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -251,7 +251,8 @@ nir_visitor::visit(ir_variable *ir)
   break;
 
case ir_var_shader_in:
-  if (ir->data.location == VARYING_SLOT_FACE) {
+  if (stage == MESA_SHADER_FRAGMENT &&
+  ir->data.location == VARYING_SLOT_FACE) {
  /* For whatever reason, GLSL IR makes gl_FrontFacing an input */
  var->data.location = SYSTEM_VALUE_FRONT_FACE;
  var->data.mode = nir_var_system_value;
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] nir: Delete nir_shader::user_structures and num_user_structures.

2015-03-06 Thread Kenneth Graunke
Nothing actually uses these, and the only caller of glsl_to_nir()
(brw_fs_nir.cpp) always passes NULL for the _mesa_glsl_parse_state
pointer, meaning they'll always be NULL and 0, respectively.

Just delete them.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/glsl_to_nir.cpp | 11 ---
 src/glsl/nir/nir.c   |  3 ---
 src/glsl/nir/nir.h   |  4 
 src/glsl/nir/nir_print.c |  4 
 4 files changed, 22 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index adef19c..b82e5c7 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -154,17 +154,6 @@ glsl_to_nir(exec_list *ir, _mesa_glsl_parse_state *state,
 
nir_shader *shader = nir_shader_create(NULL, options);
 
-   if (state) {
-  shader->num_user_structures = state->num_user_structures;
-  shader->user_structures = ralloc_array(shader, glsl_type *,
- shader->num_user_structures);
-  memcpy(shader->user_structures, state->user_structures,
-shader->num_user_structures * sizeof(glsl_type *));
-   } else {
-  shader->num_user_structures = 0;
-  shader->user_structures = NULL;
-   }
-
nir_visitor v1(shader, native_integers);
nir_function_visitor v2(&v1);
v2.run(ir);
diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index ab57fd4..abad3f8 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -42,9 +42,6 @@ nir_shader_create(void *mem_ctx, const 
nir_shader_compiler_options *options)
 
shader->options = options;
 
-   shader->num_user_structures = 0;
-   shader->user_structures = NULL;
-
exec_list_make_empty(&shader->functions);
exec_list_make_empty(&shader->registers);
exec_list_make_empty(&shader->globals);
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index d5df596..b935354 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1400,10 +1400,6 @@ typedef struct nir_shader {
/** list of global register in the shader */
struct exec_list registers;
 
-   /** structures used in this shader */
-   unsigned num_user_structures;
-   struct glsl_type **user_structures;
-
/** next available global register index */
unsigned reg_alloc;
 
diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 30d4821..f8b14a1 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -844,10 +844,6 @@ nir_print_shader(nir_shader *shader, FILE *fp)
print_var_state state;
init_print_state(&state);
 
-   for (unsigned i = 0; i < shader->num_user_structures; i++) {
-  glsl_print_struct(shader->user_structures[i], fp);
-   }
-
struct hash_entry *entry;
 
hash_table_foreach(shader->uniforms, entry) {
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] nir: Plumb the shader stage into glsl_to_nir().

2015-03-06 Thread Kenneth Graunke
The next commit needs to know the shader stage in glsl_to_nir().
To facilitate that, we pass the gl_shader rather than the raw exec_list
of instructions.  This has both the exec_list and the stage.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/glsl_to_nir.cpp | 14 --
 src/glsl/nir/glsl_to_nir.h   |  2 +-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 3 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 0d96e03..ddad207 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -43,7 +43,7 @@ namespace {
 class nir_visitor : public ir_visitor
 {
 public:
-   nir_visitor(nir_shader *shader);
+   nir_visitor(nir_shader *shader, gl_shader_stage stage);
~nir_visitor();
 
virtual void visit(ir_variable *);
@@ -83,6 +83,7 @@ private:
bool supports_ints;
 
nir_shader *shader;
+   gl_shader_stage stage;
nir_function_impl *impl;
exec_list *cf_node_list;
nir_instr *result; /* result of the expression tree last visited */
@@ -125,22 +126,23 @@ private:
 }; /* end of anonymous namespace */
 
 nir_shader *
-glsl_to_nir(exec_list *ir, const nir_shader_compiler_options *options)
+glsl_to_nir(struct gl_shader *sh, const nir_shader_compiler_options *options)
 {
nir_shader *shader = nir_shader_create(NULL, options);
 
-   nir_visitor v1(shader);
+   nir_visitor v1(shader, sh->Stage);
nir_function_visitor v2(&v1);
-   v2.run(ir);
-   visit_exec_list(ir, &v1);
+   v2.run(sh->ir);
+   visit_exec_list(sh->ir, &v1);
 
return shader;
 }
 
-nir_visitor::nir_visitor(nir_shader *shader)
+nir_visitor::nir_visitor(nir_shader *shader, gl_shader_stage stage)
 {
this->supports_ints = shader->options->native_integers;
this->shader = shader;
+   this->stage = stage;
this->is_global = true;
this->var_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
  _mesa_key_pointer_equal);
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index dd62793..3801e8c 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,7 +32,7 @@
 extern "C" {
 #endif
 
-nir_shader *glsl_to_nir(exec_list *ir,
+nir_shader *glsl_to_nir(struct gl_shader *sh,
 const nir_shader_compiler_options *options);
 
 #ifdef __cplusplus
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index ccb5cea..3bb6806 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -87,7 +87,7 @@ fs_visitor::emit_nir_code()
 
/* first, lower the GLSL IR shader to NIR */
lower_output_reads(shader->base.ir);
-   nir_shader *nir = glsl_to_nir(shader->base.ir, options);
+   nir_shader *nir = glsl_to_nir(&shader->base, options);
nir_validate_shader(nir);
 
nir_lower_global_vars_to_local(nir);
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] nir: Add native_integers to nir_shader_compiler_options.

2015-03-06 Thread Kenneth Graunke
glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the
driver supports native integers.  Presumably other passes may as well.

Adding this to nir_shader_compiler_options is an easy way to provide
that information, as it's accessible via nir_shader::options.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/glsl_to_nir.cpp | 11 +--
 src/glsl/nir/glsl_to_nir.h   |  2 +-
 src/glsl/nir/nir.h   |  6 ++
 src/mesa/drivers/dri/i965/brw_context.c  |  4 +++-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  2 +-
 5 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 7e40ef4..0d96e03 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -43,7 +43,7 @@ namespace {
 class nir_visitor : public ir_visitor
 {
 public:
-   nir_visitor(nir_shader *shader, bool supports_ints);
+   nir_visitor(nir_shader *shader);
~nir_visitor();
 
virtual void visit(ir_variable *);
@@ -125,12 +125,11 @@ private:
 }; /* end of anonymous namespace */
 
 nir_shader *
-glsl_to_nir(exec_list *ir, bool native_integers,
-const nir_shader_compiler_options *options)
+glsl_to_nir(exec_list *ir, const nir_shader_compiler_options *options)
 {
nir_shader *shader = nir_shader_create(NULL, options);
 
-   nir_visitor v1(shader, native_integers);
+   nir_visitor v1(shader);
nir_function_visitor v2(&v1);
v2.run(ir);
visit_exec_list(ir, &v1);
@@ -138,9 +137,9 @@ glsl_to_nir(exec_list *ir, bool native_integers,
return shader;
 }
 
-nir_visitor::nir_visitor(nir_shader *shader, bool supports_ints)
+nir_visitor::nir_visitor(nir_shader *shader)
 {
-   this->supports_ints = supports_ints;
+   this->supports_ints = shader->options->native_integers;
this->shader = shader;
this->is_global = true;
this->var_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index 7300945..dd62793 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,7 +32,7 @@
 extern "C" {
 #endif
 
-nir_shader *glsl_to_nir(exec_list *ir, bool native_integers,
+nir_shader *glsl_to_nir(exec_list *ir,
 const nir_shader_compiler_options *options);
 
 #ifdef __cplusplus
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index b935354..669a26e 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1370,6 +1370,12 @@ typedef struct nir_shader_compiler_options {
bool lower_fsqrt;
/** lowers fneg and ineg to fsub and isub. */
bool lower_negate;
+
+   /**
+* Does the driver support real 32-bit integers?  (Otherwise, integers
+* are simulated by floats.)
+*/
+   bool native_integers;
 } nir_shader_compiler_options;
 
 typedef struct nir_shader {
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 892d4ca..03547e9 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -551,7 +551,9 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
}
 
-   static const nir_shader_compiler_options nir_options = {};
+   static const nir_shader_compiler_options nir_options = {
+  .native_integers = true,
+   };
 
/* We want the GLSL compiler to emit code that uses condition codes */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index e24bf92..ccb5cea 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -87,7 +87,7 @@ fs_visitor::emit_nir_code()
 
/* first, lower the GLSL IR shader to NIR */
lower_output_reads(shader->base.ir);
-   nir_shader *nir = glsl_to_nir(shader->base.ir, true, options);
+   nir_shader *nir = glsl_to_nir(shader->base.ir, options);
nir_validate_shader(nir);
 
nir_lower_global_vars_to_local(nir);
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] nir: Try to make sense of the nir_shader_compiler_options code.

2015-03-06 Thread Kenneth Graunke
The code in glsl_to_nir is entirely dead, as we translate from GLSL to
NIR at link time, when there isn't a _mesa_glsl_parse_state to pass,
so every caller passes NULL.

glsl_to_nir seems like the wrong place to try and create the shader
compiler options structure anyway - tgsi_to_nir, prog_to_nir, and other
translators all would have to duplicate that code.  The driver should
set this up once with whatever settings it wants, and pass it in.

Eric also added a NirOptions field to ctx->Const.ShaderCompilerOptions[]
and left a comment saying: "The memory for the options is expected to be
kept in a single static copy by the driver."  This suggests the plan was
to do exactly that.  That pointer was not marked const, however, and the
dead code used a mix of static structures and ralloced ones.

This patch deletes the dead code in glsl_to_nir, instead making it take
the shader compiler options as a mandatory argument.  It creates an
(empty) options struct in the i965 driver, and makes NirOptions point
to that.  It marks the pointer const so that we can actually do so
without generating "discards const qualifier" compiler warnings.

Signed-off-by: Kenneth Graunke 
Cc: Eric Anholt 
---
 src/glsl/nir/glsl_to_nir.cpp | 28 ++--
 src/glsl/nir/glsl_to_nir.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_context.c  |  5 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  5 -
 src/mesa/main/mtypes.h   |  2 +-
 5 files changed, 14 insertions(+), 30 deletions(-)

Eric, does this look reasonable to you?

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index b82e5c7..7e40ef4 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -124,34 +124,10 @@ private:
 
 }; /* end of anonymous namespace */
 
-static const nir_shader_compiler_options default_options = {
-};
-
 nir_shader *
-glsl_to_nir(exec_list *ir, _mesa_glsl_parse_state *state,
-bool native_integers)
+glsl_to_nir(exec_list *ir, bool native_integers,
+const nir_shader_compiler_options *options)
 {
-   const nir_shader_compiler_options *options;
-
-   if (state) {
-  struct gl_context *ctx = state->ctx;
-  struct gl_shader_compiler_options *gl_options =
- &ctx->Const.ShaderCompilerOptions[state->stage];
-
-  if (!gl_options->NirOptions) {
- nir_shader_compiler_options *new_options =
-rzalloc(ctx, nir_shader_compiler_options);
- options = gl_options->NirOptions = new_options;
-
- if (gl_options->EmitNoPow)
-new_options->lower_fpow = true;
-  } else {
- options = gl_options->NirOptions;
-  }
-   } else {
-  options = &default_options;
-   }
-
nir_shader *shader = nir_shader_create(NULL, options);
 
nir_visitor v1(shader, native_integers);
diff --git a/src/glsl/nir/glsl_to_nir.h b/src/glsl/nir/glsl_to_nir.h
index 58b2cee..7300945 100644
--- a/src/glsl/nir/glsl_to_nir.h
+++ b/src/glsl/nir/glsl_to_nir.h
@@ -32,8 +32,8 @@
 extern "C" {
 #endif
 
-nir_shader *glsl_to_nir(exec_list * ir, _mesa_glsl_parse_state *state,
-bool native_integers);
+nir_shader *glsl_to_nir(exec_list *ir, bool native_integers,
+const nir_shader_compiler_options *options);
 
 #ifdef __cplusplus
 }
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 786e6f5..892d4ca 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -68,6 +68,8 @@
 #include "tnl/t_pipeline.h"
 #include "util/ralloc.h"
 
+#include "glsl/nir/nir.h"
+
 /***
  * Mesa's Driver Functions
  ***/
@@ -549,6 +551,8 @@ brw_initialize_context_constants(struct brw_context *brw)
   ctx->Const.Program[MESA_SHADER_FRAGMENT].MaxInputComponents = 128;
}
 
+   static const nir_shader_compiler_options nir_options = {};
+
/* We want the GLSL compiler to emit code that uses condition codes */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
   ctx->Const.ShaderCompilerOptions[i].MaxIfDepth = brw->gen < 6 ? 16 : 
UINT_MAX;
@@ -562,6 +566,7 @@ brw_initialize_context_constants(struct brw_context *brw)
 (i == MESA_SHADER_FRAGMENT);
   ctx->Const.ShaderCompilerOptions[i].EmitNoIndirectUniform = false;
   ctx->Const.ShaderCompilerOptions[i].LowerClipDistance = true;
+  ctx->Const.ShaderCompilerOptions[i].NirOptions = &nir_options;
}
 
ctx->Const.ShaderCompilerOptions[MESA_SHADER_VERTEX].OptimizeForAOS = true;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index a0300aa..e24bf92 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@

Re: [Mesa-dev] [PATCH 05/13] i965: Fix the untyped surface opcodes to deal with indirect surface access.

2015-03-07 Thread Kenneth Graunke
On Friday, March 06, 2015 03:20:26 PM Pohjolainen, Topi wrote:
> On Fri, Mar 06, 2015 at 02:46:51PM +0200, Francisco Jerez wrote:
> > "Pohjolainen, Topi"  writes:
> > 
> > > On Fri, Mar 06, 2015 at 02:29:15PM +0200, Francisco Jerez wrote:
> > >> "Pohjolainen, Topi"  writes:
> > >> 
> > >> > On Fri, Feb 27, 2015 at 05:34:48PM +0200, Francisco Jerez wrote:
> > >> >> Change brw_untyped_atomic() and brw_untyped_surface_read() to take the
> > >> >> surface index as a register instead of a constant and to use
> > >> >> brw_send_indirect_message() to emit the indirect variant of send with
> > >> >> a dynamically calculated message descriptor.  This will be required to
> > >> >> support variable indexing of image arrays for
> > >> >> ARB_shader_image_load_store.
> > >> >> ---
> > >> >>  src/mesa/drivers/dri/i965/brw_eu.h   |  10 +-
> > >> >>  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 158 
> > >> >> +--
> > >> >>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |   4 +-
> > >> >>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |   4 +-
> > >> >>  4 files changed, 96 insertions(+), 80 deletions(-)
> > >> >> 
> > >> >> diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
> > >> >> b/src/mesa/drivers/dri/i965/brw_eu.h
> > >> >> index 87a9f3f..9cc9123 100644
> > >> >> --- a/src/mesa/drivers/dri/i965/brw_eu.h
> > >> >> +++ b/src/mesa/drivers/dri/i965/brw_eu.h
> > >> >> @@ -398,18 +398,18 @@ void brw_CMP(struct brw_compile *p,
> > >> >>  
> > >> >>  void
> > >> >>  brw_untyped_atomic(struct brw_compile *p,
> > >> >> -   struct brw_reg dest,
> > >> >> +   struct brw_reg dst,
> > >> >> struct brw_reg payload,
> > >> >> +   struct brw_reg surface,
> > >> >> unsigned atomic_op,
> > >> >> -   unsigned bind_table_index,
> > >> >> unsigned msg_length,
> > >> >> bool response_expected);
> > >> >>  
> > >> >>  void
> > >> >>  brw_untyped_surface_read(struct brw_compile *p,
> > >> >> - struct brw_reg dest,
> > >> >> - struct brw_reg mrf,
> > >> >> - unsigned bind_table_index,
> > >> >> + struct brw_reg dst,
> > >> >> + struct brw_reg payload,
> > >> >> + struct brw_reg surface,
> > >> >>   unsigned msg_length,
> > >> >>   unsigned num_channels);
> > >> >>  
> > >> >> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> > >> >> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > >> >> index 0b655d4..34695bf 100644
> > >> >> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > >> >> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > >> >> @@ -2518,6 +2518,48 @@ brw_send_indirect_message(struct brw_compile 
> > >> >> *p,
> > >> >> return setup;
> > >> >>  }
> > >> >>  
> > >> >> +static struct brw_inst *
> > >> >> +brw_send_indirect_surface_message(struct brw_compile *p,
> > >> >> +  unsigned sfid,
> > >> >> +  struct brw_reg dst,
> > >> >> +  struct brw_reg payload,
> > >> >> +  struct brw_reg surface,
> > >> >> +  unsigned message_len,
> > >> >> +  unsigned response_len,
> > >> >> +  bool header_present)
> > >> >> +{
> > >> >> +   const struct brw_context *brw = p->brw;
> > >> >> +   struct brw_inst *insn;
> > >> >> +
> > >> >> +   if (surface.file != BRW_IMMEDIATE_VALUE) {
> > >> >> +  struct brw_reg addr = retype(brw_address_reg(0), 
> > >> >> BRW_REGISTER_TYPE_UD);
> > >> >> +
> > >> >> +  brw_push_insn_state(p);
> > >> >> +  brw_set_default_access_mode(p, BRW_ALIGN_1);
> > >> >> +  brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> > >> >> +  brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
> > >> >> +
> > >> >> +  /* Mask out invalid bits from the surface index to avoid hangs 
> > >> >> e.g. when
> > >> >> +   * some surface array is accessed out of bounds.
> > >> >> +   */
> > >> >> +  insn = brw_AND(p, addr,
> > >> >> + suboffset(vec1(retype(surface, 
> > >> >> BRW_REGISTER_TYPE_UD)),
> > >> >> +   BRW_GET_SWZ(surface.dw1.bits.swizzle, 
> > >> >> 0)),
> > >> >> + brw_imm_ud(0xff));
> > >> >> +
> > >> >> +  brw_pop_insn_state(p);
> > >> >> +
> > >> >> +  surface = addr;
> > >> >> +   }
> > >> >> +
> > >> >> +   insn = brw_send_indirect_message(p, sfid, dst, payload, surface);
> > >> >> +   brw_inst_set_mlen(brw, insn, message_len);
> > >> >> +   brw_inst_set_rlen(brw, insn, response_len);
> > >> >> +   brw_inst_set_header_present(brw, insn, header_present);
> > >> >
> > >> > I'll continue the discussion we started with 

Re: [Mesa-dev] [PATCH 01/13] i965: Factor out logic to build a send message instruction with indirect descriptor.

2015-03-07 Thread Kenneth Graunke
On Friday, February 27, 2015 05:34:44 PM Francisco Jerez wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_eu.h   | 19 ++--
>  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 58 
> ++--
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   | 55 +-
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 37 ---
>  4 files changed, 77 insertions(+), 92 deletions(-)

FYI, I glanced over this series briefly, and it looks reasonable to me.

It looks like Topi's done a pretty thorough review (thanks!) - if he's
happy with it, that's good enough for me.  (I suppose we could call that
an Acked-by.)


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4] mesa: improve ARB_copy_image internal format compat check

2015-03-07 Thread Kenneth Graunke
On Saturday, March 07, 2015 10:09:15 PM Jason Ekstrand wrote:
> LGTM.

Jason,

Sean doesn't have commit access (it's actually his first patch) - if
this is good to go, would you mind pushing it for him?

Nice work, Sean!

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/4] i965/fs: Make get_timestamp() return an fs_inst * rather than emitting.

2015-03-08 Thread Kenneth Graunke
On Friday, February 27, 2015 11:37:01 PM Pohjolainen, Topi wrote:
> On Fri, Feb 27, 2015 at 11:15:35AM -0800, Kenneth Graunke wrote:
> > This makes another part of the INTEL_DEBUG=shader_time code emittable
> > at arbitrary locations, rather than just at the end of the instruction
> > stream.
> > 
> > v2: Don't lose smear!  Caught by Topi Pohjolainen.
> > 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 24 +---
> >  src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
> >  2 files changed, 14 insertions(+), 12 deletions(-)
> > 
> > Yikes, good catch!  Thanks for the review, Topi!
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index 9c6f084..d65f1f1 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -680,8 +680,8 @@ fs_visitor::type_size(const struct glsl_type *type)
> > return 0;
> >  }
> >  
> > -fs_reg
> > -fs_visitor::get_timestamp()
> > +fs_inst *
> > +fs_visitor::timestamp_read()
> >  {
> > assert(brw->gen >= 7);
> >  
> > @@ -692,12 +692,6 @@ fs_visitor::get_timestamp()
> >  
> > fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 4);
> >  
> > -   fs_inst *mov = emit(MOV(dst, ts));
> > -   /* We want to read the 3 fields we care about even if it's not enabled 
> > in
> > -* the dispatch.
> > -*/
> > -   mov->force_writemask_all = true;
> > -
> > /* The caller wants the low 32 bits of the timestamp.  Since it's 
> > running
> >  * at the GPU clock rate of ~1.2ghz, it will roll over every ~3 seconds,
> >  * which is plenty of time for our purposes.  It is identical across the
> > @@ -710,14 +704,21 @@ fs_visitor::get_timestamp()
> >  */
> > dst.set_smear(0);
> >  
> > -   return dst;
> > +   fs_inst *mov = MOV(dst, ts);
> 
> Previously the smear wasn't set for the destination in the instruction
> itself. I had to check what set_smear() really does. It also sets stride to
> zero which the original logic left to the init value of one. I guess this is
> not what you intented?

Augh.  Good catch - that totally breaks things.  It makes the timestamp
read a mov(1) instead of a mov(4).

I rebased this code, and discovered that shader_time was just totally
hosed again.  I think I've patched it up, and will send out a respin
shortly.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] i965/fs: Set force_writemask_all on shader_time instructions.

2015-03-08 Thread Kenneth Graunke
These computations don't have anything to do with the currently
executing channels, so they should use force_writemask_all.

This fixes assert failures.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d6acc23..45a5793 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -759,18 +759,23 @@ fs_visitor::emit_shader_time_end()
reset.set_smear(2);
fs_inst *test = emit(AND(reg_null_d, reset, fs_reg(1u)));
test->conditional_mod = BRW_CONDITIONAL_Z;
+   test->force_writemask_all = true;
emit(IF(BRW_PREDICATE_NORMAL));
 
fs_reg start = shader_start_time;
start.negate = true;
fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
-   emit(ADD(diff, start, shader_end_time));
+   fs_inst *add = ADD(diff, start, shader_end_time);
+   add->force_writemask_all = true;
+   emit(add);
 
/* If there were no instructions between the two timestamp gets, the diff
 * is 2 cycles.  Remove that overhead, so I can forget about that when
 * trying to determine the time taken for single instructions.
 */
-   emit(ADD(diff, diff, fs_reg(-2u)));
+   add = ADD(diff, diff, fs_reg(-2u));
+   add->force_writemask_all = true;
+   emit(add);
 
emit_shader_time_write(type, diff);
emit_shader_time_write(written_type, fs_reg(1u));
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] i965/fs: Make emit_shader_time_write return rather than emit.

2015-03-08 Thread Kenneth Graunke
Instead of emit_shader_time_write, we now do emit(SHADER_TIME_ADD(...)).
The advantage is that we can also insert a shader time write at an
arbitrary location in the instruction stream, rather than being
restricted to emitting at the end.

Signed-off-by: Kenneth Graunke 
Reviewed-by: Topi Pohjolainen 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 15 +++
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 +--
 2 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index db8affa..682841b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -778,16 +778,15 @@ fs_visitor::emit_shader_time_end()
add->force_writemask_all = true;
emit(add);
 
-   emit_shader_time_write(type, diff);
-   emit_shader_time_write(written_type, fs_reg(1u));
+   emit(SHADER_TIME_ADD(type, diff));
+   emit(SHADER_TIME_ADD(written_type, fs_reg(1u)));
emit(BRW_OPCODE_ELSE);
-   emit_shader_time_write(reset_type, fs_reg(1u));
+   emit(SHADER_TIME_ADD(reset_type, fs_reg(1u)));
emit(BRW_OPCODE_ENDIF);
 }
 
-void
-fs_visitor::emit_shader_time_write(enum shader_time_shader_type type,
-   fs_reg value)
+fs_inst *
+fs_visitor::SHADER_TIME_ADD(enum shader_time_shader_type type, fs_reg value)
 {
int shader_time_index =
   brw_get_shader_time_index(brw, shader_prog, prog, type);
@@ -799,8 +798,8 @@ fs_visitor::emit_shader_time_write(enum 
shader_time_shader_type type,
else
   payload = vgrf(glsl_type::uint_type);
 
-   emit(new(mem_ctx) fs_inst(SHADER_OPCODE_SHADER_TIME_ADD,
- fs_reg(), payload, offset, value));
+   return new(mem_ctx) fs_inst(SHADER_OPCODE_SHADER_TIME_ADD,
+   fs_reg(), payload, offset, value);
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 70098d8..be1c8a1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -378,8 +378,7 @@ public:
 
void emit_shader_time_begin();
void emit_shader_time_end();
-   void emit_shader_time_write(enum shader_time_shader_type type,
-   fs_reg value);
+   fs_inst *SHADER_TIME_ADD(enum shader_time_shader_type type, fs_reg value);
 
void emit_untyped_atomic(unsigned atomic_op, unsigned surf_index,
 fs_reg dst, fs_reg offset, fs_reg src0,
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] i965/fs: Make get_timestamp() pass back the MOV rather than emitting it.

2015-03-08 Thread Kenneth Graunke
This makes another part of the INTEL_DEBUG=shader_time code emittable
at arbitrary locations, rather than just at the end of the instruction
stream.

v2: Don't lose smear!  Caught by Topi Pohjolainen.
v3: Don't set smear on the destination of the MOV.  Thanks Topi!

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 19 +++
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +-
 2 files changed, 16 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 682841b..8f11600 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -677,8 +677,14 @@ fs_visitor::type_size(const struct glsl_type *type)
return 0;
 }
 
+/**
+ * Create a MOV to read the timestamp register.
+ *
+ * The caller is responsible for emitting the MOV.  The return value is
+ * the destination of the MOV, with extra parameters set.
+ */
 fs_reg
-fs_visitor::get_timestamp()
+fs_visitor::get_timestamp(fs_inst **out_mov)
 {
assert(brw->gen >= 7);
 
@@ -689,7 +695,7 @@ fs_visitor::get_timestamp()
 
fs_reg dst = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 4);
 
-   fs_inst *mov = emit(MOV(dst, ts));
+   fs_inst *mov = MOV(dst, ts);
/* We want to read the 3 fields we care about even if it's not enabled in
 * the dispatch.
 */
@@ -707,6 +713,7 @@ fs_visitor::get_timestamp()
 */
dst.set_smear(0);
 
+   *out_mov = mov;
return dst;
 }
 
@@ -714,7 +721,9 @@ void
 fs_visitor::emit_shader_time_begin()
 {
current_annotation = "shader time start";
-   shader_start_time = get_timestamp();
+   fs_inst *mov;
+   shader_start_time = get_timestamp(&mov);
+   emit(mov);
 }
 
 void
@@ -750,7 +759,9 @@ fs_visitor::emit_shader_time_end()
   unreachable("fs_visitor::emit_shader_time_end missing code");
}
 
-   fs_reg shader_end_time = get_timestamp();
+   fs_inst *tm_read;
+   fs_reg shader_end_time = get_timestamp(&tm_read);
+   emit(tm_read);
 
/* Check that there weren't any timestamp reset events (assuming these
 * were the only two timestamp reads that happened).
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index be1c8a1..f68a476 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -399,7 +399,7 @@ public:
void resolve_ud_negate(fs_reg *reg);
void resolve_bool_comparison(ir_rvalue *rvalue, fs_reg *reg);
 
-   fs_reg get_timestamp();
+   fs_reg get_timestamp(fs_inst **out_mov);
 
struct brw_reg interp_reg(int location, int channel);
void setup_uniform_values(ir_variable *ir);
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] INTEL_DEBUG=shader_time scalar backend fixes

2015-03-08 Thread Kenneth Graunke
Welcome to the rabbit trail.  In order to fix Football Manager, I had to
rework INTEL_DEBUG=shader_time in the FS backend.  While doing that, I
hit two assertion failures.  After fixing that, I compared numbers.
I noticed that VS again accounted for 0 cycles.

Matt - could you take a look at brw_vec4_dead_code_elimination.cpp?
Since 5df88c2096281f (the rewrite to use live intervals), it's again
completely eating the atomic buffer offset initialization, resulting
in us using garbage data in the first register (of an mlen 2 message).
(Sounds like the same bug I fixed in d0575d98fc595dcc17706dc73d1eb4.)

With these patches, and the new vec4 DCE pass reverted, 10.3 vs my
branch produce basically the same numbers on an openarena timedemo.

Available in the 'football-v3' branch of my tree.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] i965/fs: Make emit_shader_time_end() insert before EOT.

2015-03-08 Thread Kenneth Graunke
Previously, we emitted the shader-time epilogue from emit_fb_writes(),
during the middle of looping through color regions (or emit_urb_writes
for the VS).  This is duplicated several times and rather awkward.

I need to fix a bug in our FB write handling, and it will be a lot
easier if we move emit_shader_time_end() out of there.

Now, we simply emit FB writes/URB writes, and subsequently have
emit_shader_time_end() insert instructions before the final SEND with
EOT.  Not only is this simpler, it's actually a slight improvement:
we now include the MOVs to set up the final FB write payload in our
shader-time measurements.

Note that INTEL_DEBUG=shader_time only exists on Gen7+, and uses
send-from-GRF.  (In the past, we might have hit trouble where both
attempt to use MRFs for messages; that's not a problem now.)

v2: Rebase on v3 of the previous patch and other shader_time fixes.

Signed-off-by: Kenneth Graunke 
Reviewed-by: Topi Pohjolainen  [v1]
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 28 ++--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 13 -
 2 files changed, 18 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 8f11600..af48f4b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -759,19 +759,24 @@ fs_visitor::emit_shader_time_end()
   unreachable("fs_visitor::emit_shader_time_end missing code");
}
 
+   /* Insert our code just before the final SEND with EOT. */
+   exec_node *end = this->instructions.get_tail();
+   assert(end && ((fs_inst *) end)->eot);
+
fs_inst *tm_read;
fs_reg shader_end_time = get_timestamp(&tm_read);
-   emit(tm_read);
+   end->insert_before(tm_read);
 
/* Check that there weren't any timestamp reset events (assuming these
 * were the only two timestamp reads that happened).
 */
fs_reg reset = shader_end_time;
reset.set_smear(2);
-   fs_inst *test = emit(AND(reg_null_d, reset, fs_reg(1u)));
+   fs_inst *test = AND(reg_null_d, reset, fs_reg(1u));
test->conditional_mod = BRW_CONDITIONAL_Z;
test->force_writemask_all = true;
-   emit(IF(BRW_PREDICATE_NORMAL));
+   end->insert_before(test);
+   end->insert_before(IF(BRW_PREDICATE_NORMAL));
 
fs_reg start = shader_start_time;
start.negate = true;
@@ -779,7 +784,7 @@ fs_visitor::emit_shader_time_end()
diff.set_smear(0);
fs_inst *add = ADD(diff, start, shader_end_time);
add->force_writemask_all = true;
-   emit(add);
+   end->insert_before(add);
 
/* If there were no instructions between the two timestamp gets, the diff
 * is 2 cycles.  Remove that overhead, so I can forget about that when
@@ -787,13 +792,13 @@ fs_visitor::emit_shader_time_end()
 */
add = ADD(diff, diff, fs_reg(-2u));
add->force_writemask_all = true;
-   emit(add);
+   end->insert_before(add);
 
-   emit(SHADER_TIME_ADD(type, diff));
-   emit(SHADER_TIME_ADD(written_type, fs_reg(1u)));
-   emit(BRW_OPCODE_ELSE);
-   emit(SHADER_TIME_ADD(reset_type, fs_reg(1u)));
-   emit(BRW_OPCODE_ENDIF);
+   end->insert_before(SHADER_TIME_ADD(type, diff));
+   end->insert_before(SHADER_TIME_ADD(written_type, fs_reg(1u)));
+   end->insert_before(new(mem_ctx) fs_inst(BRW_OPCODE_ELSE, dispatch_width));
+   end->insert_before(SHADER_TIME_ADD(reset_type, fs_reg(1u)));
+   end->insert_before(new(mem_ctx) fs_inst(BRW_OPCODE_ENDIF, dispatch_width));
 }
 
 fs_inst *
@@ -3915,6 +3920,9 @@ fs_visitor::run_fs()
 
   emit_fb_writes();
 
+  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
+ emit_shader_time_end();
+
   calculate_cfg();
 
   optimize();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 6b48f70..6766583 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -3645,9 +3645,6 @@ fs_visitor::emit_fb_writes()
 
fs_inst *inst;
if (do_dual_src) {
-  if (INTEL_DEBUG & DEBUG_SHADER_TIME)
- emit_shader_time_end();
-
   this->current_annotation = ralloc_asprintf(this->mem_ctx,
 "FB dual-source write");
   inst = emit_single_fb_write(this->outputs[0], this->dual_src_output,
@@ -3663,19 +3660,12 @@ fs_visitor::emit_fb_writes()
  if (brw->gen >= 6 && key->replicate_alpha && target != 0)
 src0_alpha = offset(outputs[0], 3);
 
- if (target == key->nr_color_regions - 1 &&
- (INTEL_DEBUG & DEBUG_SHADER_TIME))
-emit_shader_time_end();
-
  inst = emit_single_fb_write(this->outputs[target], reg_undef,
  src0_alpha,
  this->output_components[target]);

[Mesa-dev] [PATCH 2/5] i965/fs: Set smear on shader_time diff register.

2015-03-08 Thread Kenneth Graunke
The ADD(diff, diff, fs_reg(-2u)) instruction reads diff, which is a
width 1 register.  We need to read it as <0,1,0> with a subreg of 0,
which is what smear accomplishes.

Fixes assertion:
brw_eu_emit.c:285: validate_reg: Assertion `hstride == 0' failed.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
Signed-off-by: Kenneth Graunke 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 45a5793..db8affa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -765,6 +765,7 @@ fs_visitor::emit_shader_time_end()
fs_reg start = shader_start_time;
start.negate = true;
fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
+   diff.set_smear(0);
fs_inst *add = ADD(diff, start, shader_end_time);
add->force_writemask_all = true;
emit(add);
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] i965/fs: Set force_writemask_all on shader_time instructions.

2015-03-08 Thread Kenneth Graunke
On Sunday, March 08, 2015 01:32:33 PM Matt Turner wrote:
> On Sun, Mar 8, 2015 at 1:08 AM, Kenneth Graunke  wrote:
> > These computations don't have anything to do with the currently
> > executing channels, so they should use force_writemask_all.
> >
> > This fixes assert failures.
> >
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86974
> > Signed-off-by: Kenneth Graunke 
> > Cc: mesa-sta...@lists.freedesktop.org
> > ---
> >  src/mesa/drivers/dri/i965/brw_fs.cpp | 9 +++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> > b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > index d6acc23..45a5793 100644
> > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> > @@ -759,18 +759,23 @@ fs_visitor::emit_shader_time_end()
> > reset.set_smear(2);
> > fs_inst *test = emit(AND(reg_null_d, reset, fs_reg(1u)));
> > test->conditional_mod = BRW_CONDITIONAL_Z;
> > +   test->force_writemask_all = true;
> > emit(IF(BRW_PREDICATE_NORMAL));
> >
> > fs_reg start = shader_start_time;
> > start.negate = true;
> > fs_reg diff = fs_reg(GRF, alloc.allocate(1), BRW_REGISTER_TYPE_UD, 1);
> > -   emit(ADD(diff, start, shader_end_time));
> > +   fs_inst *add = ADD(diff, start, shader_end_time);
> > +   add->force_writemask_all = true;
> > +   emit(add);
> 
> Emit returns a pointer to the instruction it emitted, so you can cut
> one line by wrapping the ADD with emit().
> 
> Either way's fine.

Yeah, I didn't bother because patch 5 changes it to:

   end->insert_before(add)

and that doesn't return a pointer to the inserted instruction.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] meta: Plug memory leak in blit shader creation

2015-03-09 Thread Kenneth Graunke
On Friday, March 06, 2015 06:22:00 PM Ben Widawsky wrote:
> It looks like this has existed since:
> commit f5a477ab76b6e0b268387699cd2253a43db0dfae
> Author: Ian Romanick 
> Date:   Mon Dec 16 11:54:08 2013 -0800
> 
> meta: Refactor shader generation code out of mipmap generation path
> 
> Valgrind was complaining on the piglit test:
> fbo-generatemipmap-formats GL_ARB_texture_float -auto -fbo
> 
> Cc: Ian Romanick 
> Cc: Brian Paul 
> Cc: Eric Anholt 
> Reported-by: Mark Janes  (Jenkins)
> Signed-off-by: Ben Widawsky 
> ---
>  src/mesa/drivers/common/meta.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index fdc4cf1..2c1abe3 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -270,6 +270,7 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>  
> if (shader->shader_prog != 0) {
>_mesa_UseProgram(shader->shader_prog);
> +  ralloc_free(mem_ctx);
>return;
> }

Hi Ben,

You're right - in the case where the shader already exists (due to us
hitting this path once before), we do create a ralloc context and fail
to free it.  We don't actually need one, since we're not constructing
shader code.

I think it would be better to pull 'mem_ctx = ralloc_context(NULL)' out
of the variable declaration, and put it after this early return.

The early return is the common case, and it'd be nice to avoid
allocating and freeing pointless heap memory for no reason.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/9] i965/fs: Store a pointer to brw_sampler_prog_key_data in the visitor.

2015-03-09 Thread Kenneth Graunke
The NIR backend hardcodes brw_wm_prog_key at the moment, which won't
work when we support scalar VS.  We could use get_tex(), but it's a
static method.  I was going to promote it to fs_visitor, but then
realized that both parameters (stage and key) are already members.

It then occured to me that we could just set up a pointer in the
constructor, and skip having a function altogether.

This patch also converts all existing users to use key_tex.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  2 +
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp |  3 +-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 55 
 3 files changed, 27 insertions(+), 33 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index ee6ba98..7f4916a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -418,6 +418,8 @@ public:
void visit_atomic_counter_intrinsic(ir_call *ir);
 
const void *const key;
+   struct brw_sampler_prog_key_data *key_tex;
+
struct brw_stage_prog_data *prog_data;
unsigned int sanity_param_count;
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 3bb6806..249e59c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1612,7 +1612,6 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr *instr)
 void
 fs_visitor::nir_emit_texture(nir_tex_instr *instr)
 {
-   brw_wm_prog_key *key = (brw_wm_prog_key*) this->key;
unsigned sampler = instr->sampler_index;
fs_reg sampler_reg(sampler);
 
@@ -1709,7 +1708,7 @@ fs_visitor::nir_emit_texture(nir_tex_instr *instr)
}
 
if (instr->op == nir_texop_txf_ms) {
-  if (brw->gen >= 7 && key->tex.compressed_multisample_layout_mask & 
(1<gen >= 7 && key_tex->compressed_multisample_layout_mask & (1 << 
sampler))
  mcs = emit_mcs_fetch(coordinate, instr->coord_components, 
sampler_reg);
   else
  mcs = fs_reg(0u);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index e413ae3..2adb299 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1860,19 +1860,6 @@ fs_visitor::emit_texture_gen7(ir_texture_opcode op, 
fs_reg dst,
return inst;
 }
 
-static struct brw_sampler_prog_key_data *
-get_tex(gl_shader_stage stage, const void *key)
-{
-   switch (stage) {
-   case MESA_SHADER_FRAGMENT:
-  return &((brw_wm_prog_key*) key)->tex;
-   case MESA_SHADER_VERTEX:
-  return &((brw_vue_prog_key*) key)->tex;
-   default:
-  unreachable("unhandled shader stage");
-   }
-}
-
 fs_reg
 fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components,
  bool is_rect, uint32_t sampler, int texunit)
@@ -1880,7 +1867,6 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int 
coord_components,
fs_inst *inst = NULL;
bool needs_gl_clamp = true;
fs_reg scale_x, scale_y;
-   struct brw_sampler_prog_key_data *tex = get_tex(stage, this->key);
 
/* The 965 requires the EU to do the normalization of GL rectangle
 * texture coordinates.  We use the program parameter state
@@ -1888,8 +1874,8 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int 
coord_components,
 */
if (is_rect &&
(brw->gen < 6 ||
-(brw->gen >= 6 && (tex->gl_clamp_mask[0] & (1 << sampler) ||
-   tex->gl_clamp_mask[1] & (1 << sampler) {
+(brw->gen >= 6 && (key_tex->gl_clamp_mask[0] & (1 << sampler) ||
+   key_tex->gl_clamp_mask[1] & (1 << sampler) {
   struct gl_program_parameter_list *params = prog->Parameters;
   int tokens[STATE_LENGTH] = {
 STATE_INTERNAL,
@@ -1950,7 +1936,7 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int 
coord_components,
   needs_gl_clamp = false;
 
   for (int i = 0; i < 2; i++) {
-if (tex->gl_clamp_mask[i] & (1 << sampler)) {
+if (key_tex->gl_clamp_mask[i] & (1 << sampler)) {
fs_reg chan = coordinate;
chan = offset(chan, i);
 
@@ -1975,7 +1961,7 @@ fs_visitor::rescale_texcoord(fs_reg coordinate, int 
coord_components,
 
if (coord_components > 0 && needs_gl_clamp) {
   for (int i = 0; i < MIN2(coord_components, 3); i++) {
-if (tex->gl_clamp_mask[i] & (1 << sampler)) {
+if (key_tex->gl_clamp_mask[i] & (1 << sampler)) {
fs_reg chan = coordinate;
chan = offset(chan, i);
 
@@ -2033,14 +2019,13 @@ fs_visitor::emit_texture(ir_texture_opcode op,
  uint32_t sampler,

[Mesa-dev] [PATCH 6/9] i965/fs: Refactor fs_visitor::nir_setup_inputs().

2015-03-09 Thread Kenneth Graunke
No functional change.  In preparation for supporting vertex shaders,
this adds a switch statement on shader stage (since vertex attributes
and fragment shader varyings will need different handling).  It also
renames "varying" to "input", to be more general.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 29 +++--
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index d700523..3baafc4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -199,18 +199,27 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
struct hash_entry *entry;
hash_table_foreach(shader->inputs, entry) {
   nir_variable *var = (nir_variable *) entry->data;
-  fs_reg varying = offset(nir_inputs, var->data.driver_location);
+  fs_reg input = offset(nir_inputs, var->data.driver_location);
 
   fs_reg reg;
-  if (var->data.location == VARYING_SLOT_POS) {
- reg = *emit_fragcoord_interpolation(var->data.pixel_center_integer,
- var->data.origin_upper_left);
- emit_percomp(MOV(varying, reg), 0xF);
-  } else {
- emit_general_interpolation(varying, var->name, var->type,
-(glsl_interp_qualifier) 
var->data.interpolation,
-var->data.location, var->data.centroid,
-var->data.sample);
+  switch (stage) {
+  case MESA_SHADER_VERTEX:
+  case MESA_SHADER_GEOMETRY:
+  case MESA_SHADER_COMPUTE:
+ unreachable("fs_visitor not used for these stages yet.");
+ break;
+  case MESA_SHADER_FRAGMENT:
+ if (var->data.location == VARYING_SLOT_POS) {
+reg = *emit_fragcoord_interpolation(var->data.pixel_center_integer,
+var->data.origin_upper_left);
+emit_percomp(MOV(input, reg), 0xF);
+ } else {
+emit_general_interpolation(input, var->name, var->type,
+   (glsl_interp_qualifier) 
var->data.interpolation,
+   var->data.location, var->data.centroid,
+   var->data.sample);
+ }
+ break;
   }
}
 }
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 8/9] i965/fs: Add VS output support to nir_setup_outputs().

2015-03-09 Thread Kenneth Graunke
Adapted from fs_visitor::visit(ir_variable *).

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 1734d03..d03a337 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -255,7 +255,17 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
   nir_variable *var = (nir_variable *) entry->data;
   fs_reg reg = offset(nir_outputs, var->data.driver_location);
 
-  if (var->data.index > 0) {
+  int vector_elements =
+ var->type->is_array() ? var->type->fields.array->vector_elements
+   : var->type->vector_elements;
+
+  if (stage == MESA_SHADER_VERTEX) {
+ for (int i = 0; i < ALIGN(type_size(var->type), 4) / 4; i++) {
+int output = var->data.location + i;
+this->outputs[output] = offset(reg, 4 * i);
+this->output_components[output] = vector_elements;
+ }
+  } else if (var->data.index > 0) {
  assert(var->data.location == FRAG_RESULT_DATA0);
  assert(var->data.index == 1);
  this->dual_src_output = reg;
@@ -275,10 +285,6 @@ fs_visitor::nir_setup_outputs(nir_shader *shader)
  assert(var->data.location >= FRAG_RESULT_DATA0 &&
 var->data.location < FRAG_RESULT_DATA0 + BRW_MAX_DRAW_BUFFERS);
 
- int vector_elements =
-var->type->is_array() ? var->type->fields.array->vector_elements
-  : var->type->vector_elements;
-
  /* General color output. */
  for (unsigned int i = 0; i < MAX2(1, var->type->length); i++) {
 int output = var->data.location - FRAG_RESULT_DATA0 + i;
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/9] i965/nir: Optimize after nir_lower_var_copies().

2015-03-09 Thread Kenneth Graunke
Array variable copy splitting generates a bunch of stuff we want to
clean up before proceeding.

Signed-off-by: Kenneth Graunke 
Cc: Jason Ekstrand 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 249e59c..fbdfc22 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -102,6 +102,9 @@ fs_visitor::emit_nir_code()
nir_lower_var_copies(nir);
nir_validate_shader(nir);
 
+   /* Get rid of split copies */
+   nir_optimize(nir);
+
nir_lower_io(nir);
nir_validate_shader(nir);
 
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 9/9] i965: Use NIR for scalar VS when INTEL_USE_NIR is set.

2015-03-09 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 15 ++-
 1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 428234f..ee5bc4a 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3818,12 +3818,17 @@ fs_visitor::run_vs()
if (INTEL_DEBUG & DEBUG_SHADER_TIME)
   emit_shader_time_begin();
 
-   foreach_in_list(ir_instruction, ir, shader->base.ir) {
-  base_ir = ir;
-  this->result = reg_undef;
-  ir->accept(this);
+   if (getenv("INTEL_USE_NIR") != NULL) {
+  emit_nir_code();
+   } else {
+  foreach_in_list(ir_instruction, ir, shader->base.ir) {
+ base_ir = ir;
+ this->result = reg_undef;
+ ir->accept(this);
+  }
+  base_ir = NULL;
}
-   base_ir = NULL;
+
if (failed)
   return false;
 
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/9] nir: Add intrinsics for SYSTEM_VALUE_BASE_VERTEX and VERTEX_ID_ZERO_BASE

2015-03-09 Thread Kenneth Graunke
Ian and I added these around the time Connor was developing NIR.  Now
that both exist, we should make them work together!

Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_intrinsics.h  | 2 ++
 src/glsl/nir/nir_lower_system_values.c | 6 ++
 2 files changed, 8 insertions(+)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 3bf102f..8e28765 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -95,6 +95,8 @@ ATOMIC(read, NIR_INTRINSIC_CAN_ELIMINATE)
 
 SYSTEM_VALUE(front_face, 1)
 SYSTEM_VALUE(vertex_id, 1)
+SYSTEM_VALUE(vertex_id_zero_base, 1)
+SYSTEM_VALUE(base_vertex, 1)
 SYSTEM_VALUE(instance_id, 1)
 SYSTEM_VALUE(sample_id, 1)
 SYSTEM_VALUE(sample_pos, 2)
diff --git a/src/glsl/nir/nir_lower_system_values.c 
b/src/glsl/nir/nir_lower_system_values.c
index 328d4f1..a6eec65 100644
--- a/src/glsl/nir/nir_lower_system_values.c
+++ b/src/glsl/nir/nir_lower_system_values.c
@@ -49,6 +49,12 @@ convert_instr(nir_intrinsic_instr *instr)
case SYSTEM_VALUE_VERTEX_ID:
   op = nir_intrinsic_load_vertex_id;
   break;
+   case SYSTEM_VALUE_VERTEX_ID_ZERO_BASE:
+  op = nir_intrinsic_load_vertex_id_zero_base;
+  break;
+   case SYSTEM_VALUE_BASE_VERTEX:
+  op = nir_intrinsic_load_base_vertex;
+  break;
case SYSTEM_VALUE_INSTANCE_ID:
   op = nir_intrinsic_load_instance_id;
   break;
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/9] i965/nir: Lower to registers a bit later.

2015-03-09 Thread Kenneth Graunke
We can't safely call nir_optimize() with register present, since several
passes called in the loop can't handle registers, and will fail asserts.

Notably, nir_lower_vec_alus() and nir_opt_algebraic() really don't want
registers.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index fbdfc22..c5ed55c 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -108,9 +108,6 @@ fs_visitor::emit_nir_code()
nir_lower_io(nir);
nir_validate_shader(nir);
 
-   nir_lower_locals_to_regs(nir);
-   nir_validate_shader(nir);
-
nir_remove_dead_variables(nir);
nir_validate_shader(nir);
 
@@ -125,6 +122,9 @@ fs_visitor::emit_nir_code()
 
nir_optimize(nir);
 
+   nir_lower_locals_to_regs(nir);
+   nir_validate_shader(nir);
+
nir_lower_to_source_mods(nir);
nir_validate_shader(nir);
nir_copy_prop(nir);
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/9] i965/fs: Handle VS inputs in the NIR backend.

2015-03-09 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 3baafc4..1734d03 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -199,11 +199,32 @@ fs_visitor::nir_setup_inputs(nir_shader *shader)
struct hash_entry *entry;
hash_table_foreach(shader->inputs, entry) {
   nir_variable *var = (nir_variable *) entry->data;
+  enum brw_reg_type type = brw_type_for_base_type(var->type);
   fs_reg input = offset(nir_inputs, var->data.driver_location);
 
   fs_reg reg;
   switch (stage) {
-  case MESA_SHADER_VERTEX:
+  case MESA_SHADER_VERTEX: {
+ /* Our ATTR file is indexed by VERT_ATTRIB_*, which is the value
+  * stored in nir_variable::location.
+  *
+  * However, NIR's load_input intrinsics use a different index - an
+  * offset into a single contiguous array containing all inputs.
+  * This index corresponds to the nir_variable::driver_location field.
+  *
+  * So, we need to copy from fs_reg(ATTR, var->location) to
+  * offset(nir_inputs, var->data.driver_location).
+  */
+ unsigned components = var->type->without_array()->components();
+ unsigned array_length = var->type->is_array() ? var->type->length : 1;
+ for (unsigned i = 0; i < array_length; i++) {
+for (unsigned j = 0; j < components; j++) {
+   emit(MOV(retype(offset(input, components * i + j), type),
+offset(fs_reg(ATTR, var->data.location + i, type), 
j)));
+}
+ }
+ break;
+  }
   case MESA_SHADER_GEOMETRY:
   case MESA_SHADER_COMPUTE:
  unreachable("fs_visitor not used for these stages yet.");
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/9] i965: Implement NIR intrinsics for loading VS system values.

2015-03-09 Thread Kenneth Graunke
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 51 
 1 file changed, 51 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index c5ed55c..d700523 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -363,6 +363,30 @@ emit_system_values_block(nir_block *block, void 
*void_visitor)
 
   nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
   switch (intrin->intrinsic) {
+  case nir_intrinsic_load_vertex_id:
+ unreachable("should be lowered by lower_vertex_id().");
+
+  case nir_intrinsic_load_vertex_id_zero_base:
+ assert(v->stage == MESA_SHADER_VERTEX);
+ reg = &v->nir_system_values[SYSTEM_VALUE_VERTEX_ID_ZERO_BASE];
+ if (reg->file == BAD_FILE)
+*reg = *v->emit_vs_system_value(SYSTEM_VALUE_VERTEX_ID_ZERO_BASE);
+ break;
+
+  case nir_intrinsic_load_base_vertex:
+ assert(v->stage == MESA_SHADER_VERTEX);
+ reg = &v->nir_system_values[SYSTEM_VALUE_BASE_VERTEX];
+ if (reg->file == BAD_FILE)
+*reg = *v->emit_vs_system_value(SYSTEM_VALUE_BASE_VERTEX);
+ break;
+
+  case nir_intrinsic_load_instance_id:
+ assert(v->stage == MESA_SHADER_VERTEX);
+ reg = &v->nir_system_values[SYSTEM_VALUE_INSTANCE_ID];
+ if (reg->file == BAD_FILE)
+*reg = *v->emit_vs_system_value(SYSTEM_VALUE_INSTANCE_ID);
+ break;
+
   case nir_intrinsic_load_sample_pos:
  assert(v->stage == MESA_SHADER_FRAGMENT);
  reg = &v->nir_system_values[SYSTEM_VALUE_SAMPLE_POS];
@@ -1344,6 +1368,33 @@ fs_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
*emit_frontfacing_interpolation()));
   break;
 
+   case nir_intrinsic_load_vertex_id:
+  unreachable("should be lowered by lower_vertex_id()");
+
+   case nir_intrinsic_load_vertex_id_zero_base: {
+  fs_reg vertex_id = nir_system_values[SYSTEM_VALUE_VERTEX_ID_ZERO_BASE];
+  assert(vertex_id.file != BAD_FILE);
+  dest.type = vertex_id.type;
+  emit(MOV(dest, vertex_id));
+  break;
+   }
+
+   case nir_intrinsic_load_base_vertex: {
+  fs_reg base_vertex = nir_system_values[SYSTEM_VALUE_BASE_VERTEX];
+  assert(base_vertex.file != BAD_FILE);
+  dest.type = base_vertex.type;
+  emit(MOV(dest, base_vertex));
+  break;
+   }
+
+   case nir_intrinsic_load_instance_id: {
+  fs_reg instance_id = nir_system_values[SYSTEM_VALUE_INSTANCE_ID];
+  assert(instance_id.file != BAD_FILE);
+  dest.type = instance_id.type;
+  emit(MOV(dest, instance_id));
+  break;
+   }
+
case nir_intrinsic_load_sample_mask_in: {
   fs_reg sample_mask_in = nir_system_values[SYSTEM_VALUE_SAMPLE_MASK_IN];
   assert(sample_mask_in.file != BAD_FILE);
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] [v2] meta: Plug memory leak

2015-03-09 Thread Kenneth Graunke
On Monday, March 09, 2015 11:44:18 AM Ben Widawsky wrote:
> It looks like this has existed since
> commit f5a477ab76b6e0b268387699cd2253a43db0dfae
> Author: Ian Romanick 
> Date:   Mon Dec 16 11:54:08 2013 -0800
> 
> meta: Refactor shader generation code out of mipmap generation path
> 
> Valgrind was complaining on fbo-generatemipmap-formats
> 
> v2: Instead, do the allocation after the early return block (v2)
> 
> Cc: Ian Romanick 
> Cc: Brian Paul 
> Cc: Eric Anholt 
> Cc: Kenneth Graunke 
> Signed-off-by: Ben Widawsky 
> ---
> 
> Thanks Ken. I wasn't sure if this path was common or not, and I had opted for
> the standard, define variables at the top, style. If it occurs more than
> infrequently, then I like this solution better.
> 
> (FYI: Jenkins results, http://otc-gfxtest-01.jf.intel.com/job/bwidawsk/100/)
> ---
>  src/mesa/drivers/common/meta.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index fdc4cf1..fa800ec 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -247,7 +247,6 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>   struct blit_shader_table *table)
>  {
> char *vs_source, *fs_source;
> -   void *const mem_ctx = ralloc_context(NULL);
> struct blit_shader *shader = choose_blit_shader(target, table);
> const char *vs_input, *vs_output, *fs_input, *vs_preprocess, 
> *fs_preprocess;
>  
> @@ -273,6 +272,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>return;
> }
>  
> +   void *const mem_ctx = ralloc_context(NULL);
> +
> vs_source = ralloc_asprintf(mem_ctx,
>  "%s\n"
>  "%s vec2 position;\n"
> 

I'm not clear whether we can get away with C99 mixed declarations and code yet
(in the past, it's been a problem for MSVC).  Assuming you move the
declaration back to the top, and leave the initialization here, this
would get a:

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] nir: Fix non-determinism in nir_lower_vars_to_ssa().

2015-03-09 Thread Kenneth Graunke
Previously, we stored derefs in a hash table, using the malloc'd pointer
as the key.  Then, we walked through the hash table and generated code,
based on the order of the hash table's elements.

Memory addresses returned by malloc are pretty much random, which meant
that the hash was random, and the hash table's elements would be walked
in some random order.  This led to successive compiles of the same
shader using different variable names and slightly different orderings
of phi-nodes.  Code could not be diff'd, and the final assembly would
sometimes change slightly too.

It turns out the only point of the hash table was to avoid inserting
the same node multiple times for different dereferences.  We never
actually searched the hash table!  This patch uses an intrusive
linked list instead.  Since exec_list uses head and tail sentinels,
checking prev or next against NULL will tell us whether the node is
already in the list.

Pair programming with Jason Ekstrand.

Signed-off-by: Jason Ekstrand 
Signed-off-by: Kenneth Graunke 
---
 src/glsl/nir/nir_lower_vars_to_ssa.c | 123 ---
 1 file changed, 26 insertions(+), 97 deletions(-)

diff --git a/src/glsl/nir/nir_lower_vars_to_ssa.c 
b/src/glsl/nir/nir_lower_vars_to_ssa.c
index 9e9a418..86e6ab4 100644
--- a/src/glsl/nir/nir_lower_vars_to_ssa.c
+++ b/src/glsl/nir/nir_lower_vars_to_ssa.c
@@ -35,6 +35,13 @@ struct deref_node {
 
bool lower_to_ssa;
 
+   /* Only valid for things that end up in the direct list.
+* Note that multiple nir_deref_vars may correspond to this node, but they
+* will all be equivalent, so any is as good as the other.
+*/
+   nir_deref_var *deref;
+   struct exec_node direct_derefs_link;
+
struct set *loads;
struct set *stores;
struct set *copies;
@@ -69,7 +76,7 @@ struct lower_variables_state {
 * wildcards and no indirects, these are precisely the derefs that we
 * can actually consider lowering.
 */
-   struct hash_table *direct_deref_nodes;
+   struct exec_list direct_deref_nodes;
 
/* Controls whether get_deref_node will add variables to the
 * direct_deref_nodes table.  This is turned on when we are initially
@@ -83,88 +90,6 @@ struct lower_variables_state {
struct hash_table *phi_table;
 };
 
-/* The following two functions implement a hash and equality check for
- * variable dreferences.  When the hash or equality function encounters an
- * array, all indirects are treated as equal and are never equal to a
- * direct dereference or a wildcard.
- */
-static uint32_t
-hash_deref(const void *void_deref)
-{
-   uint32_t hash = _mesa_fnv32_1a_offset_bias;
-
-   const nir_deref_var *deref_var = void_deref;
-   hash = _mesa_fnv32_1a_accumulate(hash, deref_var->var);
-
-   for (const nir_deref *deref = deref_var->deref.child;
-deref; deref = deref->child) {
-  switch (deref->deref_type) {
-  case nir_deref_type_array: {
- nir_deref_array *deref_array = nir_deref_as_array(deref);
-
- hash = _mesa_fnv32_1a_accumulate(hash, deref_array->deref_array_type);
-
- if (deref_array->deref_array_type == nir_deref_array_type_direct)
-hash = _mesa_fnv32_1a_accumulate(hash, deref_array->base_offset);
- break;
-  }
-  case nir_deref_type_struct: {
- nir_deref_struct *deref_struct = nir_deref_as_struct(deref);
- hash = _mesa_fnv32_1a_accumulate(hash, deref_struct->index);
- break;
-  }
-  default:
- assert("Invalid deref chain");
-  }
-   }
-
-   return hash;
-}
-
-static bool
-derefs_equal(const void *void_a, const void *void_b)
-{
-   const nir_deref_var *a_var = void_a;
-   const nir_deref_var *b_var = void_b;
-
-   if (a_var->var != b_var->var)
-  return false;
-
-   for (const nir_deref *a = a_var->deref.child, *b = b_var->deref.child;
-a != NULL; a = a->child, b = b->child) {
-  if (a->deref_type != b->deref_type)
- return false;
-
-  switch (a->deref_type) {
-  case nir_deref_type_array: {
- nir_deref_array *a_arr = nir_deref_as_array(a);
- nir_deref_array *b_arr = nir_deref_as_array(b);
-
- if (a_arr->deref_array_type != b_arr->deref_array_type)
-return false;
-
- if (a_arr->deref_array_type == nir_deref_array_type_direct &&
- a_arr->base_offset != b_arr->base_offset)
-return false;
- break;
-  }
-  case nir_deref_type_struct:
- if (nir_deref_as_struct(a)->index != nir_deref_as_struct(b)->index)
-return false;
- break;
-  default:
- assert("Invalid deref chain");
- return false;
-  }
-
-  assert((a->child == NULL) == (b->child == NULL));
-  if((a->child == NULL) != (b->child == NULL))
- return false;
-   }
-
-   return true;
-}
-
 static int
 typ

[Mesa-dev] [PATCH 1/2] util: Fix foreach_list_typed_safe when exec_node is not at offset 0.

2015-03-09 Thread Kenneth Graunke
From: Jason Ekstrand 

__next and __prev are pointers to the structure containing the exec_node
link, not the embedded exec_node.  NULL checks would fail unless the
embedded exec_node happened to be at offset 0 in the parent struct.

Signed-off-by: Jason Ekstrand 
Reviewed-by: Kenneth Graunke 
---
 src/glsl/list.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/list.h b/src/glsl/list.h
index ddb98f7..680e963 100644
--- a/src/glsl/list.h
+++ b/src/glsl/list.h
@@ -684,7 +684,7 @@ inline void exec_node::insert_before(exec_list *before)
exec_node_data(__type, (__list)->head, __field),\
* __next =  \
exec_node_data(__type, (__node)->__field.next, __field);\
-__next != NULL;\
+&__next->__field != NULL;  \
 __node = __next, __next =  \
exec_node_data(__type, (__next)->__field.next, __field))
 
@@ -693,7 +693,7 @@ inline void exec_node::insert_before(exec_list *before)
exec_node_data(__type, (__list)->tail_pred, __field),   \
* __prev =  \
exec_node_data(__type, (__node)->__field.prev, __field);\
-__prev != NULL;\
+&__prev->__field != NULL;  \
 __node = __prev, __prev =  \
exec_node_data(__type, (__prev)->__field.prev, __field))
 
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: Allow copy propagation on ATTR file registers.

2015-03-10 Thread Kenneth Graunke
This especially helps with NIR because we currently emit MOVs at the top
of the shader to copy from various ATTR registers to a giant VGRF array
of all inputs.  (This could potentially be done better, but since
there's only ever one write to each register, it should be trivial to
copy propagate away...)

With NIR - only vertex shaders:
total instructions in shared programs: 3083505 -> 2868128 (-6.98%)
instructions in affected programs: 2977237 -> 2761860 (-7.23%)
helped:18653

Without NIR - only vertex shaders:
total instructions in shared programs: 2724564 -> 2708831 (-0.58%)
instructions in affected programs: 593696 -> 577963 (-2.65%)
helped:3134

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 1 +
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp | 5 -
 2 files changed, 5 insertions(+), 1 deletion(-)

No apparent Piglit regressions.  Please scrutinize though, I spent very
little actual thought on this, so I may have missed something stupid.

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 9a415ff..6fa90d2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -3235,6 +3235,7 @@ fs_visitor::lower_load_payload()
* destination may be used.
*/
   assert(inst->src[i].file == IMM ||
+ inst->src[i].file == ATTR ||
  inst->src[i].file == UNIFORM);
   mov->force_writemask_all = true;
}
diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 764741d..56f4fa1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -283,7 +283,8 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
acp_entry *entry)
 
if (entry->src.file == IMM)
   return false;
-   assert(entry->src.file == GRF || entry->src.file == UNIFORM);
+   assert(entry->src.file == GRF || entry->src.file == UNIFORM ||
+  entry->src.file == ATTR);
 
if (entry->opcode == SHADER_OPCODE_LOAD_PAYLOAD &&
inst->opcode == SHADER_OPCODE_LOAD_PAYLOAD)
@@ -382,6 +383,7 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, 
acp_entry *entry)
   inst->src[arg].reg_offset = entry->src.reg_offset;
   inst->src[arg].subreg_offset = entry->src.subreg_offset;
   break;
+   case ATTR:
case GRF:
   {
  assert(entry->src.width % inst->src[arg].width == 0);
@@ -598,6 +600,7 @@ can_propagate_from(fs_inst *inst)
((inst->src[0].file == GRF &&
  (inst->src[0].reg != inst->dst.reg ||
   inst->src[0].reg_offset != inst->dst.reg_offset)) ||
+inst->src[0].file == ATTR ||
 inst->src[0].file == UNIFORM ||
 inst->src[0].file == IMM) &&
inst->src[0].type == inst->dst.type &&
-- 
2.2.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-dev, 1/2] i965: fix up asserts in brw_inst_set_jip()

2017-01-28 Thread Kenneth Graunke
On Thu, Jan 26, 2017 at 01:50:41PM +1100, Timothy Arceri wrote:
> We are casting from a signed 32bit int to an unsigned 16bit int
> so shift 15 bits rather than 16.
> ---
>  src/mesa/drivers/dri/i965/brw_inst.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_inst.h 
> b/src/mesa/drivers/dri/i965/brw_inst.h
> index 13fce97..9a777e5 100644
> --- a/src/mesa/drivers/dri/i965/brw_inst.h
> +++ b/src/mesa/drivers/dri/i965/brw_inst.h
> @@ -283,8 +283,8 @@ brw_inst_set_jip(const struct gen_device_info *devinfo,
> if (devinfo->gen >= 8) {
>brw_inst_set_bits(inst, 127, 96, (uint32_t)value);
> } else {
> -  assert(value <= (1 << 16) - 1);
> -  assert(value > -(1 << 16));
> +  assert(value <= (1 << 15) - 1);
> +  assert(value > -(1 << 15));
>brw_inst_set_bits(inst, 111, 96, (uint16_t)value);
> }
>  }

Isn't -32768 legal?  It would be excluded by the above assertions.
A patch to make them:

  assert(value <= (1 << 15) - 1);
  assert(value >= -(1 << 15));

would be:
Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Mesa-dev, 2/2] i965: add assert to while_jumps_before_offset()

2017-01-28 Thread Kenneth Graunke
On Thu, Jan 26, 2017 at 01:50:42PM +1100, Timothy Arceri wrote:
> jip should always be negative here as its the result of
> do instruction - while instruction.
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 257757f..f4bec33 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -2714,6 +2714,7 @@ while_jumps_before_offset(const struct gen_device_info 
> *devinfo,
> int scale = 16 / brw_jump_scale(devinfo);
> int jip = devinfo->gen == 6 ? brw_inst_gen6_jump_count(devinfo, insn)
> : brw_inst_jip(devinfo, insn);
> +   assert(jip < 0);
> return while_offset + jip * scale <= start_offset;
>  }
>  

This one is:
Reviewed-by: Kenneth Graunke 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Implement another VF cache invalidate workaround on Gen8+.

2017-01-29 Thread Kenneth Graunke
...and provide a better citation for the existing one.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_pipe_control.c | 32 +---
 1 file changed, 24 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_pipe_control.c 
b/src/mesa/drivers/dri/i965/brw_pipe_control.c
index b8f740640f2..3e08841e0a9 100644
--- a/src/mesa/drivers/dri/i965/brw_pipe_control.c
+++ b/src/mesa/drivers/dri/i965/brw_pipe_control.c
@@ -118,14 +118,30 @@ brw_emit_pipe_control_flush(struct brw_context *brw, 
uint32_t flags)
   if (brw->gen == 8)
  gen8_add_cs_stall_workaround_bits(&flags);
 
-  if (brw->gen == 9 &&
-  (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE)) {
- /* Hardware workaround: SKL
-  *
-  * Emit Pipe Control with all bits set to zero before emitting
-  * a Pipe Control with VF Cache Invalidate set.
-  */
- brw_emit_pipe_control_flush(brw, 0);
+  if (flags & PIPE_CONTROL_VF_CACHE_INVALIDATE) {
+ if (brw->gen >= 9) {
+/* The PIPE_CONTROL "VF Cache Invalidation Enable" bit description
+ * lists several workarounds:
+ *
+ * "Projects: SKL, KBL, BXT
+ *  If the VF Cache Invalidation Enable is set to a 1 in a
+ *  PIPE_CONTROL, a separate Null PIPE_CONTROL, all bitfields sets
+ *  to 0, with the VF Cache Invalidation Enable set to 0 needs to
+ *  be sent prior to the PIPE_CONTROL with VF Cache Invalidation
+ *  Enable set to a 1."
+ */
+brw_emit_pipe_control_flush(brw, 0);
+
+/* "Projects: BDW+
+ *  When VF Cache Invalidate is set “Post Sync Operation” must
+ *  be enabled to “Write Immediate Data” or “Write PS Depth Count”
+ *  or “Write Timestamp”.
+ */
+brw_emit_pipe_control_write(brw,
+flags | PIPE_CONTROL_WRITE_IMMEDIATE,
+brw->workaround_bo, 0, 0, 0);
+return;
+ }
   }
 
   BEGIN_BATCH(6);
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Revert "i915: Always enable GL 2.0 support."

2017-01-29 Thread Kenneth Graunke
On Sunday, January 29, 2017 6:20:10 PM PST Matt Turner wrote:
> This partially reverts commit 97217a40f97cdeae0304798b607f704deb0c3558.
> It leaves ES 2.0 support in place per Ian's suggestion, because ES 2.0
> is designed to work on hardware like i915.

Your commit message should mention why dropping from OpenGL 2.1 to 1.4
is a good thing.

(IIRC it's because Chrome (and other apps?) use really slow paths with
2.1, and so the general usability of the system is likely to be worse.)

> The piglit results look like:
> 
>name: before-revert-i915 after-revert-i915
>  -- -
>pass:   7171  2169
>fail:933   201
>   crash:  8 7
>skip:  32997 38676
> timeout:  0 0
>warn:  3 1
>  incomplete:  0 0
>  dmesg-warn:  0 0
>  dmesg-fail:  0 0
> changes:  0  6040
>   fixes:  0   292
> regressions:  0 2
>   total:  41112 41054

Why are these interesting?

> Cc: "17.0" 
> ---
>  src/mesa/drivers/dri/i915/intel_extensions.c |  8 ++--
>  src/mesa/drivers/dri/i915/intel_screen.c | 21 +++--
>  2 files changed, 25 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i915/intel_extensions.c 
> b/src/mesa/drivers/dri/i915/intel_extensions.c
> index ab7820f..4f2c6fa 100644
> --- a/src/mesa/drivers/dri/i915/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i915/intel_extensions.c
> @@ -92,8 +92,12 @@ intelInitExtensions(struct gl_context *ctx)
>ctx->Extensions.ATI_separate_stencil = true;
>ctx->Extensions.ATI_texture_env_combine3 = true;
>ctx->Extensions.NV_texture_env_combine4 = true;
> -  ctx->Extensions.ARB_fragment_shader = true;
> -  ctx->Extensions.ARB_occlusion_query = true;
> +
> +  if (driQueryOptionb(&intel->optionCache, "fragment_shader"))
> + ctx->Extensions.ARB_fragment_shader = true;

I get dropping GLSL support, but isn't GL_ARB_fragment_shader sort of
reasonable for this hardware?

> +
> +  if (driQueryOptionb(&intel->optionCache, "stub_occlusion_query"))
> + ctx->Extensions.ARB_occlusion_query = true;
> }
>  
> if (intel->ctx.Mesa_DXTn
> diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
> b/src/mesa/drivers/dri/i915/intel_screen.c
> index 5c7c06a..fe86179 100644
> --- a/src/mesa/drivers/dri/i915/intel_screen.c
> +++ b/src/mesa/drivers/dri/i915/intel_screen.c
> @@ -62,6 +62,10 @@ DRI_CONF_BEGIN
>DRI_CONF_DESC(en, "Enable early Z in classic mode (unstable, 
> 945-only).")
>DRI_CONF_OPT_END
>  
> +  DRI_CONF_OPT_BEGIN_B(fragment_shader, "true")
> +  DRI_CONF_DESC(en, "Enable limited ARB_fragment_shader support on 
> 915/945.")
> +  DRI_CONF_OPT_END
> +
> DRI_CONF_SECTION_END
> DRI_CONF_SECTION_QUALITY
>DRI_CONF_FORCE_S3TC_ENABLE("false")
> @@ -75,6 +79,10 @@ DRI_CONF_BEGIN
>DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS("false")
>DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED("false")
>  
> +  DRI_CONF_OPT_BEGIN_B(stub_occlusion_query, "false")
> +  DRI_CONF_DESC(en, "Enable stub ARB_occlusion_query support on 
> 915/945.")
> +  DRI_CONF_OPT_END
> +
>DRI_CONF_OPT_BEGIN_B(shader_precompile, "true")
>DRI_CONF_DESC(en, "Perform code generation at shader link time.")
>DRI_CONF_OPT_END
> @@ -1125,12 +1133,21 @@ set_max_gl_versions(struct intel_screen *screen)
> __DRIscreen *psp = screen->driScrnPriv;
>  
> switch (screen->gen) {
> -   case 3:
> +   case 3: {
> +  bool has_fragment_shader = driQueryOptionb(&screen->optionCache, 
> "fragment_shader");
> +  bool has_occlusion_query = driQueryOptionb(&screen->optionCache, 
> "stub_occlusion_query");
> +
>psp->max_gl_core_version = 0;
>psp->max_gl_es1_version = 11;
> -  psp->max_gl_compat_version = 21;
>psp->max_gl_es2_version = 20;
> +
> +  if (has_fragment_shader && has_occlusion_query) {
> + psp->max_gl_compat_version = 21;
> +  } else {
> + psp->max_gl_compat_version = 14;
> +  }
>break;
> +   }
> case 2:
>psp->max_gl_core_version = 0;
>psp->max_gl_compat_version = 13;
> 



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv: Improve flushing around STATE_BASE_ADDRESS

2017-01-31 Thread Kenneth Graunke
On Tuesday, January 31, 2017 10:33:41 AM PST Jason Ekstrand wrote:
> It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS
> actually is.  Previously, we knew we needed to flush prior to re-emitting
> STATE_BASE_ADDRESS on gen8+ but we had never confirmed it on gen7 so we
> left it alone and avoided the flush.  Recently, Mark found hangs on gen7
> which appear to be STATE_BASE_ADDRESS related which this patch fixes.
> The theory is that SURFACE_STATE structures are cached in texture cache
> but relative addressing nature of things doesn't work nicely with the
> cache.  This means that you need to invalidate the texture cache when

This description doesn't match the actual change you're making.  We
already invalidate the texture cache after STATE_BASE_ADDRESS, on all
generations.  You're making it flush the render cache.  According to
the comment already in that function, it sounds like the render cache
may also include surface states, so...same explanation, just
s/texture/render/g.  It seems like a fine thing to do.

> you change STATE_BASE_ADDRESS and interesting things can happen if there
> are any EU threads in-flight when surface state base address changes.
> 
> While we're here, we also add data cache flushing in order to ensure
> that any compute shaders running are finished before we change the
> surface state base address.  It's unknown whether or not this would
> actually be a problem but, given how hard these things are to debug, we
> might as well make sure.

That's kind of a separate change from applying the flushes on Gen7/7.5.
It also seems like a fine thing to do.  Might make sense to split into
two patches.

At minimum, assuming you update the commit message,
Reviewed-by: Kenneth Graunke 

> 
> Reported-by: Mark Janes 
> Tested-by: Mark Janes 
> ---
>  src/intel/vulkan/genX_cmd_buffer.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/src/intel/vulkan/genX_cmd_buffer.c 
> b/src/intel/vulkan/genX_cmd_buffer.c
> index f7894a0..7b43c6f 100644
> --- a/src/intel/vulkan/genX_cmd_buffer.c
> +++ b/src/intel/vulkan/genX_cmd_buffer.c
> @@ -55,8 +55,6 @@ genX(cmd_buffer_emit_state_base_address)(struct 
> anv_cmd_buffer *cmd_buffer)
>  {
> struct anv_device *device = cmd_buffer->device;
>  
> -/* XXX: Do we need this on more than just BDW? */
> -#if (GEN_GEN >= 8)
> /* Emit a render target cache flush.
>  *
>  * This isn't documented anywhere in the PRM.  However, it seems to be
> @@ -65,9 +63,10 @@ genX(cmd_buffer_emit_state_base_address)(struct 
> anv_cmd_buffer *cmd_buffer)
>  * clear depth, reset state base address, and then go render stuff.
>  */
> anv_batch_emit(&cmd_buffer->batch, GENX(PIPE_CONTROL), pc) {
> +  pc.DCFlushEnable = true;
>pc.RenderTargetCacheFlushEnable = true;
> +  pc.CommandStreamerStallEnable = true;
> }
> -#endif
>  
> anv_batch_emit(&cmd_buffer->batch, GENX(STATE_BASE_ADDRESS), sba) {
>sba.GeneralStateBaseAddress = (struct anv_address) { NULL, 0 };
> 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] intel/blorp: Handle clearing of A4B4G4R4 on all platforms

2017-01-31 Thread Kenneth Graunke
On Friday, January 27, 2017 2:18:34 PM PST Jason Ekstrand wrote:
> Cc: "13.0 17.0" 
> ---
>  src/intel/blorp/blorp_clear.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/src/intel/blorp/blorp_clear.c b/src/intel/blorp/blorp_clear.c
> index afc505d..f8ac6dc 100644
> --- a/src/intel/blorp/blorp_clear.c
> +++ b/src/intel/blorp/blorp_clear.c
> @@ -349,6 +349,25 @@ blorp_clear(struct blorp_batch *batch,
> if (format == ISL_FORMAT_R9G9B9E5_SHAREDEXP) {
>clear_color.u32[0] = float3_to_rgb9e5(clear_color.f32);
>format = ISL_FORMAT_R32_UINT;
> +   } else if (format == ISL_FORMAT_A4B4G4R4_UNORM) {
> +  /* Broadwell and earlier cannot render to this format so we need to 
> work
> +   * around it by swapping the colors around and using B4G4R4A4 instead.
> +   */
> +
> +  /* First, we apply the swizzle. */
> +  union isl_color_value old;
> +  old.u32[swizzle.r - ISL_CHANNEL_SELECT_RED] = clear_color.u32[0];
> +  old.u32[swizzle.g - ISL_CHANNEL_SELECT_RED] = clear_color.u32[1];
> +  old.u32[swizzle.b - ISL_CHANNEL_SELECT_RED] = clear_color.u32[2];
> +  old.u32[swizzle.a - ISL_CHANNEL_SELECT_RED] = clear_color.u32[3];
> +  swizzle = ISL_SWIZZLE_IDENTITY;

Are ISL_CHANNEL_SELECT_{ZERO,ONE} disallowed by a higher level?  If not,
this will break rather spectacularly (old.u32[-4] = ...).  Maybe add
asserts?

Assuming that isn't a problem,
Reviewed-by: Kenneth Graunke 

> +
> +  /* Now we re-order for the new format */
> +  clear_color.u32[0] = old.u32[1];
> +  clear_color.u32[1] = old.u32[2];
> +  clear_color.u32[2] = old.u32[3];
> +  clear_color.u32[3] = old.u32[0];
> +  format = ISL_FORMAT_B4G4R4A4_UNORM;
> }
>  
> memcpy(¶ms.wm_inputs.clear_color, clear_color.f32, sizeof(float) * 4);
> 



signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] isl/formats: Only advertise sampling for A4B4G4R4 on Broadwell

2017-01-31 Thread Kenneth Graunke
On Friday, January 27, 2017 2:18:35 PM PST Jason Ekstrand wrote:
> This causes hangs on Broadwell if you try to render to it.  I have no
> idea how we managed to not hit this earlier.
> 
> Cc: "13.0 17.0" 
> ---
>  src/intel/isl/isl_format.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c
> index c8daece..bc157d5 100644
> --- a/src/intel/isl/isl_format.c
> +++ b/src/intel/isl/isl_format.c
> @@ -218,9 +218,10 @@ static const struct surface_format_info format_info[] = {
> SF(50, 50,  x,  x,  x,  x,  x,  x,  x,x,   P8A8_UNORM_PALETTE1)
> SF( x,  x,  x,  x,  x,  x,  x,  x,  x,x,   A1B5G5R5_UNORM)
> /* According to the PRM, A4B4G4R4_UNORM isn't supported until Sky Lake
> -* but empirical testing indicates that it works just fine on Broadwell.
> +* but empirical testing indicates that at least sampling works just fine
> +* on Broadwell.
>  */
> -   SF(80, 80,  x,  x, 80,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
> +   SF(80, 80,  x,  x, 90,  x,  x,  x,  x,x,   A4B4G4R4_UNORM)
> SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_UINT)
> SF(90,  x,  x,  x,  x,  x,  x,  x,  x,x,   L8A8_SINT)
> SF( Y,  Y,  x, 45,  Y,  Y,  Y,  x,  x,x,   R8_UNORM)
> 

Still somewhat grumpy that there are mandatory 4-bit formats that
aren't supported in hardware.  We definitely don't want to advertise
rendering to this if it doesn't work.

You can always do your clever hacks in BLORP to make it work for e.g.
copyimage etc, like you did for RGB9E5 and friends.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: add link to http://mesamatrix.net/

2017-02-01 Thread Kenneth Graunke
On Wednesday, February 1, 2017 2:36:02 PM PST Brian Paul wrote:
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95460
> ---
>  docs/features.txt | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/docs/features.txt b/docs/features.txt
> index 55b1fbb..300442b 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -335,3 +335,6 @@ we DO NOT WANT implementations of these extensions for 
> Mesa.
>  
>  More info about these features and the work involved can be found at
>  http://dri.freedesktop.org/wiki/MissingFunctionality
> +
> +Also, a graphical representation of this information can be found at
> +http://mesamatrix.net/
> 

Sounds good to me.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] Revert "i915: Always enable GL 2.0 support."

2017-02-01 Thread Kenneth Graunke
On Wednesday, February 1, 2017 12:35:15 PM PST Eero Tamminen wrote:
> Hi,
> 
> On 31.01.2017 21:12, Matt Turner wrote:
> > On Sun, Jan 29, 2017 at 8:29 PM, Kenneth Graunke  
> > wrote:
> >> On Sunday, January 29, 2017 6:20:10 PM PST Matt Turner wrote:
> >>> This partially reverts commit 97217a40f97cdeae0304798b607f704deb0c3558.
> >>> It leaves ES 2.0 support in place per Ian's suggestion, because ES 2.0
> >>> is designed to work on hardware like i915.
> >>
> >> Your commit message should mention why dropping from OpenGL 2.1 to 1.4
> >> is a good thing.
> >>
> >> (IIRC it's because Chrome (and other apps?) use really slow paths with
> >> 2.1, and so the general usability of the system is likely to be worse.)
>  >
> > Yeah, I'll add
> 
> Has that been profiled?  I assume the issue is the constant uploads 
> Chrome does for the Browser's CPU rendered page content updates.  There 
> are not many other workloads that utilize GPU as inefficiently as 
> current browsers.
> 
> 
>   - Eero

I figured it was just software rendering everything, given that there's
no vertex shading support and only 64 fragment shader instructions...

prog_execute isn't exactly a state-of-the-art interpreter, either.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Only require libdrm 2.4.75 for intel.

2017-02-02 Thread Kenneth Graunke
On Thursday, February 2, 2017 7:35:20 AM PST Chad Versace wrote:
> On Thu 02 Feb 2017, Dave Airlie wrote:
> > On 2 February 2017 at 13:09, Emil Velikov  wrote:
> > > On 2 February 2017 at 02:58, Michel Dänzer  wrote:
> > >> On 02/02/17 09:10 AM, Emil Velikov wrote:
> > >>> On 1 February 2017 at 23:28, Vinson Lee  wrote:
> >  Fixes: b8acb6b17981 ("configure: Require libdrm >= 2.4.75")
> >  Signed-off-by: Vinson Lee 
> > >>> Are you sure that's correct ?
> > >>>
> > >>> Afaict the follow-up commits make use of updated i915_drm.h which
> > >>> should be provided by your distro's libdrm-dev package.
> > >>
> > >> This seems to be at the heart of the confusion here: Is i915_drm.h part
> > >> of libdrm or of libdrm_intel? I'd argue it's the latter, and the fact
> > >> that some or even all downstreams ship a single package with all libdrm*
> > >> headers is irrelevant. That package also contains all the libdrm_*.pc
> > >> files, so Vinson's patch works as intended either way.
> > >>
> > > Are you saying that there's a single -dev package [libdrm-dev] for
> > > everything libdrm* related ?
> > > That sounds like a broken distro package... which would explain some
> > > of the assumptions/discussions on #dri-devel :-)
> > 
> > That is how all distros ship it.
> 
> As Dänzer said, "Vinson's patch works as intended either way".
> 
> If this small patch fixes Vinson's problem; breaks no one's setup; and
> causes no maintenance burden; then the patch is good.
> 
> Is anyone *opposed* to Vinson's patch? (It's hard to tell because all of
> the discussion about what distro's do, don't do, and should do).

I'm not opposed.  Normally, this is what we do.

Bumping LIBDRM_INTEL_REQUIRED when we need a new i915_drm.h seems
totally reasonable to me.  I don't know of any setup that ships
multiple libdrm (why?!)...but it seems like if you have a new enough
libdrm_intel, you'll have a new enough i915_drm.h.

That said...this case is a /little/ different...because we're
introducing a dependency on libsync.h, which is part of core libdrm.
I don't think it's an Intel-specific file, though it is currently only
used in i965...

I don't know that it makes much difference.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure.ac: describe what all the LIBDRM_*REQUIRED macros mean

2017-02-02 Thread Kenneth Graunke
On Thursday, February 2, 2017 9:15:47 AM PST Chad Versace wrote:
> On Thu 02 Feb 2017, Emil Velikov wrote:
> > From: Emil Velikov 
> > 
> > They are versions of the respective libdrm package. They are _not_ the
> > version of libdrm[.so] required for driver X.
> > 
> > Doing the latter will lead to combinatoric explosion and in all fairness
> > things will likely be broken most of the time.
> > 
> > To make things even more confusing the kernel UAPI is provided by libdrm
> > itself.
> > 
> > Cc: Vinson Lee 
> > Cc: Kenneth Graunke 
> > Signed-off-by: Emil Velikov 
> > ---
> > Ken, you/Chad have things spot on. Yet semes like other people struggle
> > deeply with these.
> 
> Actually, I agree with airlied and imirkin. I made a mistake when
> I bumped LIBDRM_REQUIRED.

Me too...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Don't crash when destroying contexts created with no visual.

2017-02-02 Thread Kenneth Graunke
dEQP-EGL.functional.create_context.no_config tries to create a context
with no config, then immediately destroys it.  The drawbuffer is never
set up, so we can't dereference it asking if it's double buffered, or
we'll crash on a null pointer dereference.

Just bail early.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/context.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 7ecd241e6eb..76763489b9f 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -1550,7 +1550,7 @@ _mesa_check_init_viewport(struct gl_context *ctx, GLuint 
width, GLuint height)
 static void
 handle_first_current(struct gl_context *ctx)
 {
-   if (ctx->Version == 0) {
+   if (ctx->Version == 0 || !ctx->DrawBuffer) {
   /* probably in the process of tearing down the context */
   return;
}
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] anv/pipeline: set ThreadDispatchEnable conditionally

2017-02-03 Thread Kenneth Graunke
On Friday, February 3, 2017 1:19:16 PM PST Juan A. Suarez Romero wrote:
> Set 3DSTATE_WM/ThreadDispatchEnable bit on/off based on the same
> conditions as used in the GL version.
> 
> Signed-off-by: Juan A. Suarez Romero 
> ---
>  src/intel/vulkan/genX_pipeline.c | 49 
> +---
>  1 file changed, 26 insertions(+), 23 deletions(-)

looks good to me.

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] docs: replace URL in features.txt

2017-02-03 Thread Kenneth Graunke
On Friday, February 3, 2017 11:48:50 AM PST Brian Paul wrote:
> Replace unmaintained http://dri.freedesktop.org/wiki/MissingFunctionality
> URL with http://mesamatrix.net/
> 
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95460
> ---
>  docs/features.txt | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/features.txt b/docs/features.txt
> index 55b1fbb..2f2d41d 100644
> --- a/docs/features.txt
> +++ b/docs/features.txt
> @@ -333,5 +333,6 @@ we DO NOT WANT implementations of these extensions for 
> Mesa.
>GL_ARB_shadow_ambient Superseded by 
> GL_ARB_fragment_program
>GL_ARB_vertex_blend   Superseded by 
> GL_ARB_vertex_program
>  
> -More info about these features and the work involved can be found at
> -http://dri.freedesktop.org/wiki/MissingFunctionality
> +
> +A graphical representation of this information can be found at
> +http://mesamatrix.net/
> 

Yeah, we haven't used that page in years...

Acked-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Select pipeline and emit state base address in Gen8+ HiZ ops.

2017-02-06 Thread Kenneth Graunke
If a HiZ op is the first thing in the batch, we should make sure
to select the render pipeline and emit state base address before
proceeding.

I believe 3DSTATE_WM_HZ_OP creates 3DPRIMITIVEs internally, and
dispatching those on the GPGPU pipeline seems a bit sketchy.  I'm
not actually sure that STATE_BASE_ADDRESS is necessary, as the
depth related commands use graphics addresses, not ones relative
to the base address...but we're likely to set it as part of the
next operation anyway, so we should just do it right away.

Cc: "17.0" 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen8_depth_state.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index a7e61354fd5..620b32df8bb 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -404,6 +404,9 @@ gen8_hiz_exec(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
if (op == BLORP_HIZ_OP_NONE)
   return;
 
+   brw_select_pipeline(brw, BRW_RENDER_PIPELINE);
+   brw_upload_state_base_address(brw);
+
/* Disable the PMA stall fix since we're about to do a HiZ operation. */
if (brw->gen == 8)
   gen8_write_pma_stall_bits(brw, 0);
-- 
2.11.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Time to merge threaded GL dispatch? (aka glthread)

2017-02-06 Thread Kenneth Graunke
On Monday, February 6, 2017 8:54:40 PM PST Marek Olšák wrote:
> On Mon, Feb 6, 2017 at 8:20 PM, Ernst Sjöstrand  wrote:
> > FYI glmark2 segfaults with mesa_glthread=true. Expected that some programs
> > will segfault?
> 
> Yes, even segfaults are expected with mesa_glthread=true.
> 
> Marek

Would it make sense to be crash-free or even regression-free on at
least Piglit, before merging?  (Or are we there already?)

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: correct compute shader checks for memoryBarrier functions

2017-02-06 Thread Kenneth Graunke
On Monday, February 6, 2017 9:07:30 AM PST Marc Di Luzio wrote:
> As per the spec -
> "The functions memoryBarrierShared() and groupMemoryBarrier() are
> available only in compute shaders; the other functions are available
> in all shader types."
> 
> Conform to this by adding another delegate to check for compute
> shader support instead of only whether the current stage is compute
> 
> This allows some fragment shaders in Dirt Rally to compile
> 
> CC: "17.0" 
> 
> Reviewed-by: Anuj Phogat 

Reviewed-by: Kenneth Graunke 

and pushed:

To ssh://git.freedesktop.org/git/mesa/mesa
   83fb63d31de..21efe2528cd  master -> master

Thanks, Marc!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Time to merge threaded GL dispatch? (aka glthread)

2017-02-09 Thread Kenneth Graunke
On Thursday, February 9, 2017 5:23:47 PM PST Ian Romanick wrote:
[snip]
> Not achieving the desired performance is vastly different from
> segfaults.  Does the threaded mode in NVIDIA's driver crash?  I'd wager
> not.

Sadly, yes it does.  A while back, I tried running some Steam games
on Linux using my NVIDIA machine, with an out of the box configuration,
and they totally crashed.  Workaround was to disable GL threaded
optimizations.  Of course, I also had crashes which I had to update the
CPU microcode to workaround, and tearing right through the middle of
every single frame (needed more workarounds), and all kinds of
problems...so...not sure that's the standard of quality I want to
aspire to...

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH 0/4] New drirc options for Worms WMD, Tropico 5 and Crookz

2017-02-10 Thread Kenneth Graunke
On Friday, February 10, 2017 5:45:03 AM PST Ilia Mirkin wrote:
> On Fri, Feb 10, 2017 at 8:41 AM, Samuel Pitoiset
>  wrote:
> > Hi,
> >
> > Mesa currently doesn't allow to create 3.1+ compatibility profiles mainly
> > because various features are unimplemented and bugs can happen.
> >
> > However, some buggy apps request a compat profile without using any old
> > features but they fail to start because Mesa clamps the GLSL version to
> > 130 for compat.
> 
> And presumably you can't just hand them a core profile because they'll
> do something like glGetString(GL_EXTENSIONS)?

You can totally hand Tropico 5 a core profile and it'll work correctly.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Actually enforce ES SSBO unsized array check for SSBOs.

2017-02-10 Thread Kenneth Graunke
Commit b010fa85675b98962426fe8961466fbae2d25499 re-added this check
after it was erroneously dropped while fixing another bug.  However,
the control flow was slightly off, and it ended up only applying to
UBOs and not SSBOs.

Fixes dEQP-GLES31.functional.debug.negative_coverage.
{callbacks,get_error,log}.shader.compile_compute_shader.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/ast_to_hir.cpp | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index b31b61d1ed6..851a7d6ef4d 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -7903,10 +7903,9 @@ ast_interface_block::hir(exec_list *instructions,
  }
 
  if (var->type->is_unsized_array()) {
-if (var->is_in_shader_storage_block()) {
-   if (is_unsized_array_last_element(var)) {
-  var->data.from_ssbo_unsized_array = true;
-   }
+if (var->is_in_shader_storage_block() &&
+is_unsized_array_last_element(var)) {
+   var->data.from_ssbo_unsized_array = true;
 } else {
/* From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
 *
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1

2017-02-10 Thread Kenneth Graunke
On Friday, February 10, 2017 5:25:27 AM PST Jose Maria Casanova Crespo wrote:
> From GLSL ES 3.10 spec, section 4.1.9 "Arrays":
> 
> "If an array is declared as the last member of a shader storage block
>  and the size is not specified at compile-time, it is sized at run-time.
>  In all other cases, arrays are sized only at compile-time."
> 
> In desktop GLSL it is allowed to have unsized-arrays that are
> not last, as long as we can determine that they are implicitly
> sized, which is detected at link-time.
> 
> With this patch Mesa reports a compilation error as glslang does with
> the following shader:
> 
> buffer SSBO { vec4 data[]; vec4 moreData;};
> void main (void)
> {
> }
> 
> Fixes:
> dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader
> dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader
> dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader
> 
> Cc: "17.0" 
> Signed-off-by: Jose Maria Casanova Crespo 
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)

Thanks!  I missed this patch on the mailing list and independently
arrived at the same solution today, so we'll call that a:

Reviewed-by: Kenneth Graunke 

and pushed:

To ssh://git.freedesktop.org/git/mesa/mesa
   0514b0bdc91..5bc222ebafd  master -> master


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Actually enforce ES SSBO unsized array check for SSBOs.

2017-02-10 Thread Kenneth Graunke
On Friday, February 10, 2017 3:01:10 PM PST Ilia Mirkin wrote:
> Seems pretty similar to https://patchwork.freedesktop.org/patch/138274/

So it does!  I hadn't seen that patch yet.  I just pushed his patch
instead with my R-b.  Thanks!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] mesa: Ignore per-vertex array size in SSO pipeline validation.

2017-02-10 Thread Kenneth Graunke
We were already unwrapping types when the producer was a non-array
stage and the consumer was an arrayed-stage...but we ought to unwrap
both ends for TCS -> TES matching too.

This will allow us to drop the "resize to gl_MaxPatchVertices" check
shortly, which breaks some things.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/shader_query.cpp | 98 --
 1 file changed, 56 insertions(+), 42 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 30e5b087abb..6e077623e15 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1378,11 +1378,12 @@ validate_io(struct gl_program *producer, struct 
gl_program *consumer)
if (producer->sh.data->linked_stages == consumer->sh.data->linked_stages)
   return true;
 
-   const bool nonarray_stage_to_array_stage =
-  producer->info.stage != MESA_SHADER_TESS_CTRL &&
-  (consumer->info.stage == MESA_SHADER_GEOMETRY ||
-   consumer->info.stage == MESA_SHADER_TESS_CTRL ||
-   consumer->info.stage == MESA_SHADER_TESS_EVAL);
+   const bool producer_is_array_stage =
+  producer->info.stage == MESA_SHADER_TESS_CTRL;
+   const bool consumer_is_array_stage =
+  consumer->info.stage == MESA_SHADER_GEOMETRY ||
+  consumer->info.stage == MESA_SHADER_TESS_CTRL ||
+  consumer->info.stage == MESA_SHADER_TESS_EVAL;
 
bool valid = true;
 
@@ -1495,6 +1496,56 @@ validate_io(struct gl_program *producer, struct 
gl_program *consumer)
   if (match_index < num_outputs)
  outputs[match_index] = outputs[num_outputs];
 
+  /* Section 7.4.1 (Shader Interface Matching) of the ES 3.2 spec says:
+   *
+   *"Tessellation control shader per-vertex output variables and
+   * blocks and tessellation control, tessellation evaluation, and
+   * geometry shader per-vertex input variables and blocks are
+   * required to be declared as arrays, with each element representing
+   * input or output values for a single vertex of a multi-vertex
+   * primitive. For the purposes of interface matching, such variables
+   * and blocks are treated as though they were not declared as
+   * arrays."
+   *
+   * So we unwrap those types before matching.
+   */
+  const glsl_type *consumer_type = consumer_var->type;
+  const glsl_type *consumer_interface_type = consumer_var->interface_type;
+  const glsl_type *producer_type = producer_var->type;
+  const glsl_type *producer_interface_type = producer_var->interface_type;
+
+  if (consumer_is_array_stage) {
+ if (consumer_interface_type) {
+/* the interface is the array; the underlying types should match */
+if (consumer_interface_type->is_array() && !consumer_var->patch)
+   consumer_interface_type = consumer_interface_type->fields.array;
+ } else {
+if (consumer_type->is_array() && !consumer_var->patch)
+   consumer_type = consumer_type->fields.array;
+ }
+  }
+
+  if (producer_is_array_stage) {
+ if (producer_interface_type) {
+/* the interface is the array; the underlying types should match */
+if (producer_interface_type->is_array() && !producer_var->patch)
+   producer_interface_type = producer_interface_type->fields.array;
+ } else {
+if (producer_type->is_array() && !producer_var->patch)
+   producer_type = producer_type->fields.array;
+ }
+  }
+
+  if (producer_type != consumer_type) {
+ valid = false;
+ goto out;
+  }
+
+  if (producer_interface_type != consumer_interface_type) {
+ valid = false;
+ goto out;
+  }
+
   /* Section 9.2.2 (Separable Programs) of the GLSL ES spec says:
*
*Qualifier Class|  Qualifier  |in/out
@@ -1525,43 +1576,6 @@ validate_io(struct gl_program *producer, struct 
gl_program *consumer)
* Note that location mismatches are detected by the loops above that
* find the producer variable that goes with the consumer variable.
*/
-  if (nonarray_stage_to_array_stage) {
- if (consumer_var->interface_type != NULL) {
-/* the interface is the array; underlying types should match */
-if (producer_var->type != consumer_var->type) {
-   valid = false;
-   goto out;
-}
-
-if (!consumer_var->interface_type->is_array() ||
-consumer_var->interface_type->fields.array != 
producer_var->interface_type) {
-   valid = false;
-   goto out;
-}
- } else {
-if (producer_var->interface_type != NULL) 

[Mesa-dev] [PATCH 5/5] glsl: Drop resize-to-MaxPatchVertices hack.

2017-02-10 Thread Kenneth Graunke
TCS and TES inputs without an array size are implicitly sized to
gl_MaxPatchVertices.  But TCS outputs are apparently not:

   "If no size is specified, it will be taken from the output patch size
(gl_VerticesOut) declared in the shader."

Fixes dEQP-GLES31.functional.program_interface_query.program_output.
array_size.separable_tess_ctrl.var.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/ast_to_hir.cpp   |  3 --
 src/compiler/glsl/ir.h |  6 
 src/compiler/glsl/linker.cpp   | 32 --
 src/compiler/glsl/lower_named_interface_blocks.cpp |  2 --
 4 files changed, 43 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index ee47b654bfb..b90ad97b1de 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -4404,8 +4404,6 @@ handle_tess_ctrl_shader_output_decl(struct 
_mesa_glsl_parse_state *state,
if (var->data.patch)
   return;
 
-   var->data.tess_varying_implicit_sized_array = var->type->is_unsized_array();
-
validate_layout_qualifier_vertex_count(state, loc, var, num_vertices,
   &state->tcs_output_size,
   "tessellation control shader 
output");
@@ -4442,7 +4440,6 @@ handle_tess_shader_input_decl(struct 
_mesa_glsl_parse_state *state,
if (var->type->is_unsized_array()) {
   var->type = glsl_type::get_array_instance(var->type->fields.array,
 state->Const.MaxPatchVertices);
-  var->data.tess_varying_implicit_sized_array = true;
} else if (var->type->length != state->Const.MaxPatchVertices) {
   _mesa_glsl_error(&loc, state,
"per-vertex tessellation shader input arrays must be "
diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
index 4317c54d498..3544161105e 100644
--- a/src/compiler/glsl/ir.h
+++ b/src/compiler/glsl/ir.h
@@ -844,12 +844,6 @@ public:
   unsigned implicit_sized_array:1;
 
   /**
-   * Is this a non-patch TCS output / TES input array that was implicitly
-   * sized to gl_MaxPatchVertices?
-   */
-  unsigned tess_varying_implicit_sized_array:1;
-
-  /**
* Whether this is a fragment shader output implicitly initialized with
* the previous contents of the specified render target at the
* framebuffer location corresponding to this shader invocation.
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 4c7bf282ce1..b3c7d2c1456 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -3709,17 +3709,6 @@ create_shader_variable(struct gl_shader_program *shProg,
return out;
 }
 
-static const glsl_type *
-resize_to_max_patch_vertices(const struct gl_context *ctx,
- const glsl_type *type)
-{
-   if (!type)
-  return NULL;
-
-   return glsl_type::get_array_instance(type->fields.array,
-ctx->Const.MaxPatchVertices);
-}
-
 static bool
 add_shader_variable(const struct gl_context *ctx,
 struct gl_shader_program *shProg,
@@ -3733,27 +3722,6 @@ add_shader_variable(const struct gl_context *ctx,
const glsl_type *interface_type = var->get_interface_type();
 
if (outermost_struct_type == NULL) {
-  /* Unsized (non-patch) TCS output/TES input arrays are implicitly
-   * sized to gl_MaxPatchVertices.  Internally, we shrink them to a
-   * smaller size.
-   *
-   * This can cause trouble with SSO programs.  Since the TCS declares
-   * the number of output vertices, we can always shrink TCS output
-   * arrays.  However, the TES might not be linked with a TCS, in
-   * which case it won't know the size of the patch.  In other words,
-   * the TCS and TES may disagree on the (smaller) array sizes.  This
-   * can result in the resource names differing across stages, causing
-   * SSO validation failures and other cascading issues.
-   *
-   * Expanding the array size to the full gl_MaxPatchVertices fixes
-   * these issues.  It's also what program interface queries expect,
-   * as that is the official size of the array.
-   */
-  if (var->data.tess_varying_implicit_sized_array) {
- type = resize_to_max_patch_vertices(ctx, type);
- interface_type = resize_to_max_patch_vertices(ctx, interface_type);
-  }
-
   if (var->data.from_named_ifc_block) {
  const char *interface_name = interface_type->name;
 
diff --git a/src/compiler/glsl/lower_named_interface_blocks.cpp 
b/src/compiler/glsl/lower_named_interface_blocks.cpp
index 53ef638c9c4..a00e60dd771 100644
--- a/src/compiler/glsl/lower_named_interface_blocks.cpp
+++ b/src/compiler/glsl/lower_named_interface_blocks.cpp
@@ -193,8 +193,6

[Mesa-dev] [PATCH 3/5] glsl: Update a comment about link errors for TCS && !TES.

2017-02-10 Thread Kenneth Graunke
OpenGL ES actually has spec text to prohibit this.  It's just OpenGL
that's confusing.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/linker.cpp | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 720c22baee0..4c7bf282ce1 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4721,7 +4721,15 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
  goto done;
   }
 
-  /* The spec is self-contradictory here. It allows linking without a tess
+  /* Section 7.3 of the OpenGL ES 3.2 specification says:
+   *
+   *"Linking can fail for [...] any of the following reasons:
+   *
+   * * program contains an object to form a tessellation control
+   *   shader [...] and [...] the program is not separable and
+   *   contains no object to form a tessellation evaluation shader"
+   *
+   * The OpenGL spec is contradictory. It allows linking without a tess
* eval shader, but that can only be used with transform feedback and
* rasterization disabled. However, transform feedback isn't allowed
* with GL_PATCHES, so it can't be used.
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] mesa: Do (TCS && !TES) draw time validation in ES as well.

2017-02-10 Thread Kenneth Graunke
Now that we have OES_tessellation_shader, the same situation can occur
in ES too, not just GL core profile.

Having a TCS but no TES may confuse drivers - i965 crashes, for example.

This prevents regressions in
ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
with some SSO pipeline validation changes I'm making.

Cc: "17.0" 
Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/api_validate.c | 38 +++---
 1 file changed, 19 insertions(+), 19 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index 6c95701ea0e..d64264fe1e9 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -236,6 +236,25 @@ check_valid_to_render(struct gl_context *ctx, const char 
*function)
   return false;
}
 
+   /* The spec argues that this is allowed because a tess ctrl shader
+* without a tess eval shader can be used with transform feedback.
+* However, glBeginTransformFeedback doesn't allow GL_PATCHES and
+* therefore doesn't allow tessellation.
+*
+* Further investigation showed that this is indeed a spec bug and
+* a tess ctrl shader without a tess eval shader shouldn't have been
+* allowed, because there is no API in GL 4.0 that can make use this
+* to produce something useful.
+*
+* Also, all vendors except one don't support a tess ctrl shader without
+* a tess eval shader anyway.
+*/
+   if (ctx->TessCtrlProgram._Current && !ctx->TessEvalProgram._Current) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "%s(tess eval shader is missing)", function);
+  return false;
+   }
+
switch (ctx->API) {
case API_OPENGLES2:
   /* For ES2, we can draw if we have a vertex program/shader). */
@@ -260,25 +279,6 @@ check_valid_to_render(struct gl_context *ctx, const char 
*function)
  return false;
   }
 
-  /* The spec argues that this is allowed because a tess ctrl shader
-   * without a tess eval shader can be used with transform feedback.
-   * However, glBeginTransformFeedback doesn't allow GL_PATCHES and
-   * therefore doesn't allow tessellation.
-   *
-   * Further investigation showed that this is indeed a spec bug and
-   * a tess ctrl shader without a tess eval shader shouldn't have been
-   * allowed, because there is no API in GL 4.0 that can make use this
-   * to produce something useful.
-   *
-   * Also, all vendors except one don't support a tess ctrl shader without
-   * a tess eval shader anyway.
-   */
-  if (ctx->TessCtrlProgram._Current && !ctx->TessEvalProgram._Current) {
- _mesa_error(ctx, GL_INVALID_OPERATION,
- "%s(tess eval shader is missing)", function);
- return false;
-  }
-
   /* Section 7.3 (Program Objects) of the OpenGL 4.5 Core Profile spec
* says:
*
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/5] mesa: Do a draw time check for TES && !TCS in ES 3.x.

2017-02-10 Thread Kenneth Graunke
ES 3.x requires both TCS and TES to be present.  We already checked
the TCS && !TES case above, so we just have to check !TCS && TES here.

Note that this is allowed in OpenGL, just not ES.

This fixes a subcase of:
dEQP-GLES31.functional.debug.negative_coverage.*.tessellation.single_tessellation_stage

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/api_validate.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index d64264fe1e9..41d2611eec4 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -257,6 +257,20 @@ check_valid_to_render(struct gl_context *ctx, const char 
*function)
 
switch (ctx->API) {
case API_OPENGLES2:
+  /* Section 11.2 (Tessellation) of the ES 3.2 spec says:
+   *
+   * "An INVALID_OPERATION error is generated by any command that
+   *  transfers vertices to the GL if the current program state has
+   *  one but not both of a tessellation control shader and tessellation
+   *  evaluation shader."
+   */
+  if (_mesa_is_gles3(ctx) &&
+  ctx->TessEvalProgram._Current && !ctx->TessCtrlProgram._Current) {
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ "%s(tess ctrl shader is missing)", function);
+ return false;
+  }
+
   /* For ES2, we can draw if we have a vertex program/shader). */
   return ctx->VertexProgram._Current != NULL;
 
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] mesa: Do (TCS && !TES) draw time validation in ES as well.

2017-02-11 Thread Kenneth Graunke
On Saturday, February 11, 2017 12:21:36 AM PST Alejandro Piñeiro wrote:
> On 11/02/17 08:52, Kenneth Graunke wrote:
> > Now that we have OES_tessellation_shader, the same situation can occur
> > in ES too, not just GL core profile.
> >
> > Having a TCS but no TES may confuse drivers - i965 crashes, for example.
> >
> > This prevents regressions in
> > ES31-CTS.core.tessellation_shader.single.xfb_captures_data_from_correct_stage
> > with some SSO pipeline validation changes I'm making.
> >
> > Cc: "17.0" 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/main/api_validate.c | 38 +++---
> >  1 file changed, 19 insertions(+), 19 deletions(-)
> >
> > diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> > index 6c95701ea0e..d64264fe1e9 100644
> > --- a/src/mesa/main/api_validate.c
> > +++ b/src/mesa/main/api_validate.c
> > @@ -236,6 +236,25 @@ check_valid_to_render(struct gl_context *ctx, const 
> > char *function)
> >return false;
> > }
> >  
> > +   /* The spec argues that this is allowed because a tess ctrl shader
> > +* without a tess eval shader can be used with transform feedback.
> > +* However, glBeginTransformFeedback doesn't allow GL_PATCHES and
> > +* therefore doesn't allow tessellation.
> > +*
> > +* Further investigation showed that this is indeed a spec bug and
> > +* a tess ctrl shader without a tess eval shader shouldn't have been
> > +* allowed, because there is no API in GL 4.0 that can make use this
> > +* to produce something useful.
> 
> As the check is done for both OpenGL and OpenGL ES, how about update the
> comment to include both, as right now only mentions desktop?
> 
> In any case, this is a nitpick, feel free to ignore if you think that
> increases the comment without good reason.

Good call, I've changed it to:

+   /* Section 11.2 (Tessellation) of the ES 3.2 spec says:
+*
+* "An INVALID_OPERATION error is generated by any command that
+*  transfers vertices to the GL if the current program state has
+*  one but not both of a tessellation control shader and tessellation
+*  evaluation shader."
+*
+* The OpenGL spec argues that this is allowed because a tess ctrl shader

Thanks Alejandro!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] glsl: Handle packed_type == ivec4[] in lower_packed_varyings().

2017-02-11 Thread Kenneth Graunke
For GS input arrays, we may turn a packed_type of ivec4 into an
array of ivec4s.  We still want flat qualification.

Found by inspection.  Not known to help anything.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/lower_packed_varyings.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_packed_varyings.cpp 
b/src/compiler/glsl/lower_packed_varyings.cpp
index 1e9bdda12b1..13f7e5b52da 100644
--- a/src/compiler/glsl/lower_packed_varyings.cpp
+++ b/src/compiler/glsl/lower_packed_varyings.cpp
@@ -704,7 +704,8 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
   packed_var->data.centroid = unpacked_var->data.centroid;
   packed_var->data.sample = unpacked_var->data.sample;
   packed_var->data.patch = unpacked_var->data.patch;
-  packed_var->data.interpolation = packed_type == glsl_type::ivec4_type
+  packed_var->data.interpolation =
+ packed_type->without_array() == glsl_type::ivec4_type
  ? unsigned(INTERP_MODE_FLAT) : unpacked_var->data.interpolation;
   packed_var->data.location = location;
   packed_var->data.precision = unpacked_var->data.precision;
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] glsl: Nerf assert about qualifiers in lower_packed_varyings().

2017-02-11 Thread Kenneth Graunke
In ES 3.0, interpret_interpolation_qualifier() defaults unset
interpolation qualifiers to "smooth"...which has the strange result
of marking some integer variables "smooth".

Interpolation qualifiers really only matter for fragment shader inputs,
other than enforcing the linker rules that changed in various language
versions.  By the time we get to lower_packed_varyings, we probably
don't care anymore.

Just ignore the assert for non-FS stages.  It's sketchy, but this whole
pass is pretty sketchy and needs to die...

Fixes dEQP-GLES31.functional.program_interface_query.program_input.type.
separable_geometry.{int,ivec2,ivec3,ivec4,uint,uvec2,uvec3,uvec4}.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/lower_packed_varyings.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/lower_packed_varyings.cpp 
b/src/compiler/glsl/lower_packed_varyings.cpp
index 13f7e5b52da..989caf63c14 100644
--- a/src/compiler/glsl/lower_packed_varyings.cpp
+++ b/src/compiler/glsl/lower_packed_varyings.cpp
@@ -282,7 +282,8 @@ lower_packed_varyings_visitor::run(struct gl_linked_shader 
*shader)
* together when their interpolation mode is "flat".  Treat integers as
* being flat when the interpolation mode is none.
*/
-  assert(var->data.interpolation == INTERP_MODE_FLAT ||
+  assert(shader->Stage != MESA_SHADER_FRAGMENT ||
+ var->data.interpolation == INTERP_MODE_FLAT ||
  var->data.interpolation == INTERP_MODE_NONE ||
  !var->type->contains_integer());
 
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] i965/sampler_state: Set the "Base Mip Level" field on Sandy Bridge

2017-02-11 Thread Kenneth Graunke
On Friday, February 10, 2017 3:52:44 PM PST Jason Ekstrand wrote:
> Fixes two GL ES 3.0 CTS tests on Sandy Bridge:
> 
> ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear
> ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest
> 
> Cc: "17.0 13.0" 
> ---
>  src/mesa/drivers/dri/i965/brw_sampler_state.c | 20 +++-
>  src/mesa/drivers/dri/i965/brw_state.h |  1 +
>  2 files changed, 20 insertions(+), 1 deletion(-)

This seems like the most likely explanation, and there's approximately
zero information on the subject, so let's go with it for now :)

Series is:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] anv: Use build-id for pipeline cache UUID.

2017-02-14 Thread Kenneth Graunke
On Tuesday, February 14, 2017 12:38:45 PM PST Chad Versace wrote:
> On Tue 14 Feb 2017, Matt Turner wrote:
> 
> 
> >  static bool
> > -anv_get_function_timestamp(void *ptr, uint32_t* timestamp)
> > +anv_device_get_cache_uuid(void *uuid)
> >  {
> > -   Dl_info info;
> > -   struct stat st;
> > -   if (!dladdr(ptr, &info) || !info.dli_fname)
> > +   const struct note *note = build_id_find_nhdr("libvulkan_intel.so");
> > +   if (!note)
> >return false;
> >  
> > -   if (stat(info.dli_fname, &st))
> > +   unsigned len = build_id_length(note);
> > +   if (len < VK_UUID_SIZE)
> >return false;
> >  
> > -   *timestamp = st.st_mtim.tv_sec;
> > -   return true;
> > -}
> > -
> > -static bool
> > -anv_device_get_cache_uuid(void *uuid)
> > -{
> > -   uint32_t timestamp;
> > -
> > -   memset(uuid, 0, VK_UUID_SIZE);
> > -   if (!anv_get_function_timestamp(anv_device_get_cache_uuid, ×tamp))
> > +   unsigned char *build_id = malloc(len);
> > +   if (!build_id)
> >return false;
> >  
> > -   snprintf(uuid, VK_UUID_SIZE, "anv-%d", timestamp);
> > +   build_id_read(note, build_id);
> > +
> > +   memcpy(uuid, build_id, VK_UUID_SIZE);
> > +   free(build_id);
> 
> The Vulkan spec frowns on memory allocations when not needed. If you
> must allocate memory here, then it should be through the VkInstance
> allocation callbacks. However, it's best to avoid the allocation by
> adding a size_t parameter, à la snprintf, to build_id_read().
> 
> Otherwise, the patch looks good to me.

You're worried about the performance of anv_physical_device_init()?
This doesn't happen on lookup...just driver start up...

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] i965: Make a helper for emitting 3DSTATE_CONSTANT_XS packets.

2017-02-14 Thread Kenneth Graunke
This separates the logic from filling out a 3DSTATE_CONSTANT_XS
packet from the decisions about what to put in the various buffers.

It also should make it easier to use more than one buffer, should
we decide to do so.  It also provides a nice place to enforce the
various restrictions via assertions.

By marking the helper as inline, the code for unused buffers should
be constant folded away.

Signed-off-by: Kenneth Graunke 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/gen6_constant_state.c | 169 +---
 1 file changed, 118 insertions(+), 51 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_constant_state.c 
b/src/mesa/drivers/dri/i965/gen6_constant_state.c
index 6c0c32b26f7..7e6fa92ecf2 100644
--- a/src/mesa/drivers/dri/i965/gen6_constant_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_constant_state.c
@@ -27,6 +27,97 @@
 #include "intel_batchbuffer.h"
 #include "program/prog_parameter.h"
 
+#define F(RELOC, BATCH, buf, x) \
+   if (buf) { \
+  RELOC(buf, I915_GEM_DOMAIN_RENDER, 0, x); \
+   } else { \
+  BATCH(x); \
+   }
+#define OUT_PTR64(buf, x) F(OUT_RELOC64, OUT_BATCH64, buf, x)
+#define OUT_PTR(buf, x)   F(OUT_RELOC,   OUT_BATCH,   buf, x)
+
+static inline void
+emit_3dstate_constant(struct brw_context *brw,
+  uint32_t opcode,
+  uint32_t mocs,
+  drm_intel_bo *bufs[4],
+  uint16_t read_lengths[4],
+  uint64_t offsets[4])
+{
+   /* Buffer 0 is relative to Dynamic State Base Address, which we program
+* to the start of the batch buffer.  All others are graphics virtual
+* addresses regardless of the INSTPM settings.
+*/
+   assert(bufs[0] == NULL || bufs[0] == brw->batch.bo);
+
+   assert(read_lengths[0] == 0 || bufs[0] != NULL);
+   assert(read_lengths[1] == 0 || bufs[1] != NULL);
+   assert(read_lengths[2] == 0 || bufs[2] != NULL);
+   assert(read_lengths[3] == 0 || bufs[3] != NULL);
+
+   if (brw->gen >= 8) {
+  BEGIN_BATCH(11);
+  OUT_BATCH(opcode << 16 | (11 - 2));
+  OUT_BATCH(read_lengths[0] | read_lengths[1] << 16);
+  OUT_BATCH(read_lengths[2] | read_lengths[3] << 16);
+  OUT_BATCH64(offsets[0]);
+  OUT_PTR64(bufs[1], offsets[1]);
+  OUT_PTR64(bufs[2], offsets[2]);
+  OUT_PTR64(bufs[3], offsets[3]);
+  ADVANCE_BATCH();
+   } else if (brw->gen == 7) {
+  /* From the Ivybridge PRM, Volume 2, Part 1, Page 112:
+   * "Constant buffers must be enabled in order from Constant Buffer 0 to
+   *  Constant Buffer 3 within this command.  For example, it is not
+   *  allowed to enable Constant Buffer 1 by programming a non-zero value
+   *  in the VS Constant Buffer 1 Read Length without a non-zero value in
+   *  VS Constant Buffer 0 Read Length."
+   *
+   * Haswell removes this restriction.
+   */
+  if (!brw->is_haswell) {
+ assert(read_lengths[3] == 0 || (read_lengths[2] > 0 &&
+ read_lengths[1] > 0 &&
+ read_lengths[0] > 0));
+ assert(read_lengths[2] == 0 || (read_lengths[1] > 0 &&
+ read_lengths[0] > 0));
+ assert(read_lengths[1] == 0 || read_lengths[0] > 0);
+  }
+
+  BEGIN_BATCH(7);
+  OUT_BATCH(opcode << 16 | (7 - 2));
+  OUT_BATCH(read_lengths[0] | read_lengths[1] << 16);
+  OUT_BATCH(read_lengths[2] | read_lengths[3] << 16);
+  OUT_BATCH(offsets[0]);
+  OUT_PTR(bufs[1], offsets[1]);
+  OUT_PTR(bufs[2], offsets[2]);
+  OUT_PTR(bufs[3], offsets[3]);
+  ADVANCE_BATCH();
+   } else if (brw->gen == 6) {
+  /* From the Sandybridge PRM, Volume 2, Part 1, Page 138:
+   * "The sum of all four read length fields (each incremented to
+   *  represent the actual read length) must be less than or equal
+   *  to 32."
+   */
+  assert(read_lengths[0] + read_lengths[1] +
+ read_lengths[2] + read_lengths[3] < 32);
+
+  BEGIN_BATCH(5);
+  OUT_BATCH(opcode << 16 | (5 - 2) |
+(read_lengths[0] ? GEN6_CONSTANT_BUFFER_0_ENABLE : 0) |
+(read_lengths[1] ? GEN6_CONSTANT_BUFFER_1_ENABLE : 0) |
+(read_lengths[2] ? GEN6_CONSTANT_BUFFER_2_ENABLE : 0) |
+(read_lengths[3] ? GEN6_CONSTANT_BUFFER_3_ENABLE : 0));
+  OUT_BATCH(offsets[0] | (read_lengths[0] - 1));
+  OUT_PTR(bufs[1], offsets[1] | (read_lengths[1] - 1));
+  OUT_PTR(bufs[2], offsets[2] | (read_lengths[2] - 1));
+  OUT_PTR(bufs[3], offsets[3] | (read_lengths[3] - 1));
+  ADVANCE_BATCH();
+   } else {
+  unreachable("unhandled gen in emit_3dstate_constant");
+   }
+}
+
 void
 gen7_upload_constant_state(struct brw_context *brw,
const stru

[Mesa-dev] [PATCH 1/2] i965: Add an OUT_BATCH64() macro.

2017-02-14 Thread Kenneth Graunke
This is more convenient than OUT_BATCH'ing both halves.

Signed-off-by: Kenneth Graunke 
Cc: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/gen8_depth_state.c  | 3 +--
 src/mesa/drivers/dri/i965/gen8_ds_state.c | 3 +--
 src/mesa/drivers/dri/i965/gen8_gs_state.c | 3 +--
 src/mesa/drivers/dri/i965/gen8_hs_state.c | 3 +--
 src/mesa/drivers/dri/i965/gen8_ps_state.c | 3 +--
 src/mesa/drivers/dri/i965/gen8_vs_state.c | 3 +--
 src/mesa/drivers/dri/i965/intel_batchbuffer.h | 1 +
 7 files changed, 7 insertions(+), 12 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen8_depth_state.c 
b/src/mesa/drivers/dri/i965/gen8_depth_state.c
index a7e61354fd5..c085246bc92 100644
--- a/src/mesa/drivers/dri/i965/gen8_depth_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_depth_state.c
@@ -72,8 +72,7 @@ emit_depth_packets(struct brw_context *brw,
   OUT_RELOC64(depth_mt->bo,
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER, 0);
} else {
-  OUT_BATCH(0);
-  OUT_BATCH(0);
+  OUT_BATCH64(0);
}
OUT_BATCH(((width - 1) << 4) | ((height - 1) << 18) | lod);
OUT_BATCH(((depth - 1) << 21) | (min_array_element << 10) | mocs_wb);
diff --git a/src/mesa/drivers/dri/i965/gen8_ds_state.c 
b/src/mesa/drivers/dri/i965/gen8_ds_state.c
index ee2f82e1098..55738fd1ffc 100644
--- a/src/mesa/drivers/dri/i965/gen8_ds_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ds_state.c
@@ -56,8 +56,7 @@ gen8_upload_ds_state(struct brw_context *brw)
  I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  ffs(stage_state->per_thread_scratch) - 11);
   } else {
- OUT_BATCH(0);
- OUT_BATCH(0);
+ OUT_BATCH64(0);
   }
   OUT_BATCH(SET_FIELD(prog_data->dispatch_grf_start_reg,
   GEN7_DS_DISPATCH_START_GRF) |
diff --git a/src/mesa/drivers/dri/i965/gen8_gs_state.c 
b/src/mesa/drivers/dri/i965/gen8_gs_state.c
index 2b74f1bd575..31c6f89bc13 100644
--- a/src/mesa/drivers/dri/i965/gen8_gs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_gs_state.c
@@ -63,8 +63,7 @@ gen8_upload_gs_state(struct brw_context *brw)
  I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  ffs(stage_state->per_thread_scratch) - 11);
   } else {
- OUT_BATCH(0);
- OUT_BATCH(0);
+ OUT_BATCH64(0);
   }
 
   /* DW6 */
diff --git a/src/mesa/drivers/dri/i965/gen8_hs_state.c 
b/src/mesa/drivers/dri/i965/gen8_hs_state.c
index ee47e5e54a0..dbdd19b1f5c 100644
--- a/src/mesa/drivers/dri/i965/gen8_hs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_hs_state.c
@@ -57,8 +57,7 @@ gen8_upload_hs_state(struct brw_context *brw)
  I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
  ffs(stage_state->per_thread_scratch) - 11);
   } else {
- OUT_BATCH(0);
- OUT_BATCH(0);
+ OUT_BATCH64(0);
   }
   OUT_BATCH(GEN7_HS_INCLUDE_VERTEX_HANDLES |
 SET_FIELD(prog_data->dispatch_grf_start_reg,
diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
index 03468267ce6..9b1a78c6ee6 100644
--- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
@@ -269,8 +269,7 @@ gen8_upload_ps_state(struct brw_context *brw,
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
   ffs(stage_state->per_thread_scratch) - 11);
} else {
-  OUT_BATCH(0);
-  OUT_BATCH(0);
+  OUT_BATCH64(0);
}
OUT_BATCH(dw6);
OUT_BATCH(dw7);
diff --git a/src/mesa/drivers/dri/i965/gen8_vs_state.c 
b/src/mesa/drivers/dri/i965/gen8_vs_state.c
index 7b66da4b17c..a2b08fe92a0 100644
--- a/src/mesa/drivers/dri/i965/gen8_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen8_vs_state.c
@@ -62,8 +62,7 @@ upload_vs_state(struct brw_context *brw)
   I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
   ffs(stage_state->per_thread_scratch) - 11);
} else {
-  OUT_BATCH(0);
-  OUT_BATCH(0);
+  OUT_BATCH64(0);
}
 
OUT_BATCH((prog_data->dispatch_grf_start_reg <<
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index bf7cadfc4d6..da8f7e561f4 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -161,6 +161,7 @@ intel_batchbuffer_advance(struct brw_context *brw)
 
 #define OUT_BATCH(d) *__map++ = (d)
 #define OUT_BATCH_F(f) OUT_BATCH(float_as_int((f)))
+#define OUT_BATCH64(d) *((uint64_t *) __map) = (d); __map += 2
 
 #define OUT_RELOC(buf, read_domains, write_domain, delta) do {\
uint32_t __offset = (__map - brw->batch.map) * 4;  \
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965/ps: Use ForceThreadDispatchEnable instead of AccessUAV.

2017-02-14 Thread Kenneth Graunke
On Monday, February 13, 2017 6:58:39 PM PST Jason Ekstrand wrote:
> On Mon, Feb 13, 2017 at 6:35 PM, Francisco Jerez 
> wrote:
> 
> > Jason Ekstrand  writes:
> >
> > > The AccessUAV bit is not quite what we want because it's more about
> > > coherency between storage operations than it is about dispatch.  Also,
> > > the 3DSTATE_PS_EXTRA::PixelShaderHasUAV bit seems to cause hangs on
> > > Broadwell for unknown reasons so it's best to just leave it off.  The
> > > 3DSTATE_WM::ForceThreadDispatchEnable bit, on the other hand, does
> > > exactly what we want and seems to work fine.
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_defines.h   |  2 ++
> > >  src/mesa/drivers/dri/i965/gen8_ps_state.c | 52
> > ++-
> > >  2 files changed, 19 insertions(+), 35 deletions(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_defines.h
> > b/src/mesa/drivers/dri/i965/brw_defines.h
> > > index 3c5c6c4..9ad36ca 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_defines.h
> > > +++ b/src/mesa/drivers/dri/i965/brw_defines.h
> > > @@ -2733,6 +2733,8 @@ enum brw_barycentric_mode {
> > >  # define GEN7_WM_MSDISPMODE_PERSAMPLE(0 << 31)
> > >  # define GEN7_WM_MSDISPMODE_PERPIXEL (1 << 31)
> > >  # define HSW_WM_UAV_ONLY(1 << 30)
> > > +# define GEN8_WM_FORCE_THREAD_DISPATCH_OFF  (1 << 19)
> > > +# define GEN8_WM_FORCE_THREAD_DISPATCH_ON   (2 << 19)
> > >
> > >  #define _3DSTATE_PS  0x7820 /* GEN7+ */
> > >  /* DW1: kernel pointer */
> > > diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> > b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> > > index 0346826..cca57e6 100644
> > > --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
> > > +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
> > > @@ -74,39 +74,6 @@ gen8_upload_ps_extra(struct brw_context *brw,
> > > if (brw->gen >= 9 && prog_data->pulls_bary)
> > >dw1 |= GEN9_PSX_SHADER_PULLS_BARY;
> > >
> > > -   /* The stricter cross-primitive coherency guarantees that the
> > hardware
> > > -* gives us with the "Accesses UAV" bit set for at least one shader
> > stage
> > > -* and the "UAV coherency required" bit set on the 3DPRIMITIVE
> > command are
> > > -* redundant within the current image, atomic counter and SSBO GL
> > APIs,
> > > -* which all have very loose ordering and coherency requirements and
> > > -* generally rely on the application to insert explicit barriers
> > when a
> > > -* shader invocation is expected to see the memory writes performed
> > by the
> > > -* invocations of some previous primitive.  Regardless of the value
> > of "UAV
> > > -* coherency required", the "Accesses UAV" bits will implicitly
> > cause an in
> > > -* most cases useless DC flush when the lowermost stage with the bit
> > set
> > > -* finishes execution.
> > > -*
> > > -* It would be nice to disable it, but in some cases we can't
> > because on
> > > -* Gen8+ it also has an influence on rasterization via the PS
> > UAV-only
> > > -* signal (which could be set independently from the coherency
> > mechanism in
> > > -* the 3DSTATE_WM command on Gen7), and because in some cases it will
> > > -* determine whether the hardware skips execution of the fragment
> > shader or
> > > -* not via the ThreadDispatchEnable signal.  However if we know that
> > > -* GEN8_PS_BLEND_HAS_WRITEABLE_RT is going to be set and
> > > -* GEN8_PSX_PIXEL_SHADER_NO_RT_WRITE is not set it shouldn't make
> > any
> > > -* difference so we may just disable it here.
> > > -*
> > > -* Gen8 hardware tries to compute ThreadDispatchEnable for us but
> > doesn't
> > > -* take into account KillPixels when no depth or stencil writes are
> > enabled.
> > > -* In order for occlusion queries to work correctly with no
> > attachments, we
> > > -* need to force-enable here.
> > > -*
> > > -* BRW_NEW_FS_PROG_DATA | BRW_NEW_FRAGMENT_PROGRAM | _NEW_BUFFERS |
> > _NEW_COLOR
> > > -*/
> > > -   if ((prog_data->has_side_effects || prog_data->uses_kill) &&
> > > -   !brw_color_buffer_write_enabled(brw))
> > > -  dw1 |= GEN8_PSX_SHADER_HAS_UAV;
> > > -
> > > if (prog_data->computed_stencil) {
> > >assert(brw->gen >= 9);
> > >dw1 |= GEN9_PSX_SHADER_COMPUTES_STENCIL;
> > > @@ -127,7 +94,6 @@ upload_ps_extra(struct brw_context *brw)
> > >
> > >  const struct brw_tracked_state gen8_ps_extra = {
> > > .dirty = {
> > > -  .mesa  = _NEW_BUFFERS | _NEW_COLOR,
> > >.brw   = BRW_NEW_BLORP |
> > > BRW_NEW_CONTEXT |
> > > BRW_NEW_FRAGMENT_PROGRAM |
> > > @@ -169,6 +135,20 @@ upload_wm_state(struct brw_context *brw)
> > > else if (wm_prog_data->has_side_effects)
> > >dw1 |= GEN7_WM_EARLY_DS_CONTROL_PSEXEC;
> > >
> > > +   /* In most cases, the computation for the
> > WM_INT::ThreadDispatchEnable
> > > +

Re: [Mesa-dev] [PATCH 1/5] i965/fs: Fix the inline nir_op_pack_double optimization

2017-02-15 Thread Kenneth Graunke
On Tuesday, February 14, 2017 11:29:47 PM PST Jason Ekstrand wrote:
> We can only do the optimization if the source *is* SSA.
> 
> Cc: "13.0 17.0" 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 991c20f..94f2751 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -1219,7 +1219,7 @@ fs_visitor::nir_emit_alu(const fs_builder &bld, 
> nir_alu_instr *instr)
> * the unpack operation.
> */
>for (int i = 0; i < 2; i++) {
> - if (instr->src[i].src.is_ssa)
> + if (!instr->src[i].src.is_ssa)
>  continue;
>  
>   const nir_instr *parent_instr = instr->src[i].src.ssa->parent_instr;
> 

Series is:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Always expose GREMEDY_string_marker.

2017-02-15 Thread Kenneth Graunke
Equivalent marker functionality is already included in KHR_debug, which
we already expose unconditionally in all drivers (dummy_true).

Grim Fandango Remastered apparently calls glStringMarkerGREMEDY()
without checking for the extension, spewing GL errors.  Assuming the
existence of the extension is definitely not valid, but it also seems
kinda mean to spew GL errors when we could simply expose the feature
and silently ignore the provided string markers.

This patch enables GREMEDY_string_marker everywhere, and makes the calls
no-ops if the driver doesn't provide the EmitStringMarker() hook, just
like we did for the KHR_debug functionality.

This may impact freedreno, which actually puts markers in its command
buffers.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/debug_output.c   | 4 +---
 src/mesa/main/extensions_table.h   | 2 +-
 src/mesa/state_tracker/st_debug.c  | 1 -
 src/mesa/state_tracker/st_debug.h  | 3 +--
 src/mesa/state_tracker/st_extensions.c | 4 
 5 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/src/mesa/main/debug_output.c b/src/mesa/main/debug_output.c
index bc933db93d4..1d2dee128b4 100644
--- a/src/mesa/main/debug_output.c
+++ b/src/mesa/main/debug_output.c
@@ -1308,12 +1308,10 @@ void GLAPIENTRY
 _mesa_StringMarkerGREMEDY(GLsizei len, const GLvoid *string)
 {
GET_CURRENT_CONTEXT(ctx);
-   if (ctx->Extensions.GREMEDY_string_marker) {
+   if (ctx->Driver.EmitStringMarker) {
   /* if length not specified, string will be null terminated: */
   if (len <= 0)
  len = strlen(string);
   ctx->Driver.EmitStringMarker(ctx, string, len);
-   } else {
-  _mesa_error(ctx, GL_INVALID_OPERATION, "StringMarkerGREMEDY");
}
 }
diff --git a/src/mesa/main/extensions_table.h b/src/mesa/main/extensions_table.h
index 7ea56c8422d..ec48aadde3f 100644
--- a/src/mesa/main/extensions_table.h
+++ b/src/mesa/main/extensions_table.h
@@ -285,7 +285,7 @@ EXT(EXT_vertex_array, dummy_true
 EXT(EXT_vertex_array_bgra   , EXT_vertex_array_bgra
  , GLL, GLC,  x ,  x , 2008)
 EXT(EXT_window_rectangles   , EXT_window_rectangles
  , GLL, GLC,  x ,  30, 2016)
 
-EXT(GREMEDY_string_marker   , GREMEDY_string_marker
  , GLL, GLC,  x ,  x , 2007)
+EXT(GREMEDY_string_marker   , dummy_true   
  , GLL, GLC,  x ,  x , 2007)
 
 EXT(IBM_multimode_draw_arrays   , dummy_true   
  , GLL, GLC,  x ,  x , 1998)
 EXT(IBM_rasterpos_clip  , dummy_true   
  , GLL,  x ,  x ,  x , 1996)
diff --git a/src/mesa/state_tracker/st_debug.c 
b/src/mesa/state_tracker/st_debug.c
index d6cb5cd57d8..f2e982c8c7a 100644
--- a/src/mesa/state_tracker/st_debug.c
+++ b/src/mesa/state_tracker/st_debug.c
@@ -58,7 +58,6 @@ static const struct debug_named_value st_debug_flags[] = {
{ "buffer",   DEBUG_BUFFER, NULL },
{ "wf",   DEBUG_WIREFRAME, NULL },
{ "precompile",  DEBUG_PRECOMPILE, NULL },
-   { "gremedy",  DEBUG_GREMEDY, "Enable GREMEDY debug extensions" },
{ "noreadpixcache", DEBUG_NOREADPIXCACHE, NULL },
DEBUG_NAMED_VALUE_END
 };
diff --git a/src/mesa/state_tracker/st_debug.h 
b/src/mesa/state_tracker/st_debug.h
index 6c1e915f68c..4b92a669a37 100644
--- a/src/mesa/state_tracker/st_debug.h
+++ b/src/mesa/state_tracker/st_debug.h
@@ -50,8 +50,7 @@ st_print_current(void);
 #define DEBUG_BUFFER0x200
 #define DEBUG_WIREFRAME 0x400
 #define DEBUG_PRECOMPILE   0x800
-#define DEBUG_GREMEDY   0x1000
-#define DEBUG_NOREADPIXCACHE 0x2000
+#define DEBUG_NOREADPIXCACHE 0x1000
 
 #ifdef DEBUG
 extern int ST_DEBUG;
diff --git a/src/mesa/state_tracker/st_extensions.c 
b/src/mesa/state_tracker/st_extensions.c
index 37fe4469c37..d9057c77657 100644
--- a/src/mesa/state_tracker/st_extensions.c
+++ b/src/mesa/state_tracker/st_extensions.c
@@ -1167,10 +1167,6 @@ void st_init_extensions(struct pipe_screen *screen,
   extensions->ARB_vertex_attrib_64bit = GL_TRUE;
}
 
-   if ((ST_DEBUG & DEBUG_GREMEDY) &&
-   screen->get_param(screen, PIPE_CAP_STRING_MARKER))
-  extensions->GREMEDY_string_marker = GL_TRUE;
-
if (screen->get_param(screen, PIPE_CAP_COMPUTE)) {
   int compute_supported_irs =
  screen->get_shader_param(screen, PIPE_SHADER_COMPUTE,
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] i965: Make a helper for emitting 3DSTATE_CONSTANT_XS packets.

2017-02-15 Thread Kenneth Graunke
On Wednesday, February 15, 2017 3:23:14 PM PST Jordan Justen wrote:
> On 2017-02-14 13:45:49, Kenneth Graunke wrote:
[snip]
> > diff --git a/src/mesa/drivers/dri/i965/gen6_constant_state.c 
> > b/src/mesa/drivers/dri/i965/gen6_constant_state.c
> > index 6c0c32b26f7..7e6fa92ecf2 100644
> > --- a/src/mesa/drivers/dri/i965/gen6_constant_state.c
> > +++ b/src/mesa/drivers/dri/i965/gen6_constant_state.c
> > @@ -27,6 +27,97 @@
> >  #include "intel_batchbuffer.h"
> >  #include "program/prog_parameter.h"
> >  
> > +#define F(RELOC, BATCH, buf, x) \
> > +   if (buf) { \
> > +  RELOC(buf, I915_GEM_DOMAIN_RENDER, 0, x); \
> > +   } else { \
> > +  BATCH(x); \
> > +   }
> > +#define OUT_PTR64(buf, x) F(OUT_RELOC64, OUT_BATCH64, buf, x)
> > +#define OUT_PTR(buf, x)   F(OUT_RELOC,   OUT_BATCH,   buf, x)
> 
> Is there a better name than 'x'? Unfortunately, I couldn't think of a
> suggestion. :)

Probably...but I couldn't think of one.  It's some bits that get OR'd
in...and can be pretty arbitrary.  OUT_RELOC calls it "delta", but I'm
not sure that historical name makes much sense these days.  OUT_BATCH
uses "d" which is arguably not better.

[snip]

> I didn't see where the 'mocs' value usage moved to in the new code.
> 
> -Jordan

Good point.  It's zero on Gen6 and Gen8, but I accidentally dropped
GEN7_MOCS_L3 on Gen7-7.5.  I'll fix that and send a v2.

Thanks for reading carefully!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 2/2] i965: Make a helper for emitting 3DSTATE_CONSTANT_XS packets.

2017-02-15 Thread Kenneth Graunke
This separates the logic from filling out a 3DSTATE_CONSTANT_XS
packet from the decisions about what to put in the various buffers.

It also should make it easier to use more than one buffer, should
we decide to do so.  It also provides a nice place to enforce the
various restrictions via assertions.

By marking the helper as inline, the code for unused buffers should
be constant folded away.

v2: Don't lose MOCS setting on Gen7-7.5.

Signed-off-by: Kenneth Graunke 
Signed-off-by: Ben Widawsky 
---
 src/mesa/drivers/dri/i965/gen6_constant_state.c | 170 
 1 file changed, 117 insertions(+), 53 deletions(-)

Patch 1/2 remains unchanged so I didn't bother to send it again

diff --git a/src/mesa/drivers/dri/i965/gen6_constant_state.c 
b/src/mesa/drivers/dri/i965/gen6_constant_state.c
index 6c0c32b26f7..83c1dbab97b 100644
--- a/src/mesa/drivers/dri/i965/gen6_constant_state.c
+++ b/src/mesa/drivers/dri/i965/gen6_constant_state.c
@@ -27,70 +27,134 @@
 #include "intel_batchbuffer.h"
 #include "program/prog_parameter.h"
 
+#define F(RELOC, BATCH, buf, x) \
+   if (buf) { \
+  RELOC(buf, I915_GEM_DOMAIN_RENDER, 0, x); \
+   } else { \
+  BATCH(x); \
+   }
+#define OUT_PTR64(buf, x) F(OUT_RELOC64, OUT_BATCH64, buf, x)
+#define OUT_PTR(buf, x)   F(OUT_RELOC,   OUT_BATCH,   buf, x)
+
+static inline void
+emit_3dstate_constant(struct brw_context *brw,
+  uint32_t opcode,
+  drm_intel_bo *bufs[4],
+  uint16_t read_lengths[4],
+  uint64_t offsets[4])
+{
+   /* Buffer 0 is relative to Dynamic State Base Address, which we program
+* to the start of the batch buffer.  All others are graphics virtual
+* addresses regardless of the INSTPM settings.
+*/
+   assert(bufs[0] == NULL || bufs[0] == brw->batch.bo);
+
+   assert(read_lengths[0] == 0 || bufs[0] != NULL);
+   assert(read_lengths[1] == 0 || bufs[1] != NULL);
+   assert(read_lengths[2] == 0 || bufs[2] != NULL);
+   assert(read_lengths[3] == 0 || bufs[3] != NULL);
+
+   if (brw->gen >= 8) {
+  BEGIN_BATCH(11);
+  OUT_BATCH(opcode << 16 | (11 - 2));
+  OUT_BATCH(read_lengths[0] | read_lengths[1] << 16);
+  OUT_BATCH(read_lengths[2] | read_lengths[3] << 16);
+  OUT_BATCH64(offsets[0]);
+  OUT_PTR64(bufs[1], offsets[1]);
+  OUT_PTR64(bufs[2], offsets[2]);
+  OUT_PTR64(bufs[3], offsets[3]);
+  ADVANCE_BATCH();
+   } else if (brw->gen == 7) {
+  /* From the Ivybridge PRM, Volume 2, Part 1, Page 112:
+   * "Constant buffers must be enabled in order from Constant Buffer 0 to
+   *  Constant Buffer 3 within this command.  For example, it is not
+   *  allowed to enable Constant Buffer 1 by programming a non-zero value
+   *  in the VS Constant Buffer 1 Read Length without a non-zero value in
+   *  VS Constant Buffer 0 Read Length."
+   *
+   * Haswell removes this restriction.
+   */
+  if (!brw->is_haswell) {
+ assert(read_lengths[3] == 0 || (read_lengths[2] > 0 &&
+ read_lengths[1] > 0 &&
+ read_lengths[0] > 0));
+ assert(read_lengths[2] == 0 || (read_lengths[1] > 0 &&
+ read_lengths[0] > 0));
+ assert(read_lengths[1] == 0 || read_lengths[0] > 0);
+  }
+
+  BEGIN_BATCH(7);
+  OUT_BATCH(opcode << 16 | (7 - 2));
+  OUT_BATCH(read_lengths[0] | read_lengths[1] << 16);
+  OUT_BATCH(read_lengths[2] | read_lengths[3] << 16);
+  OUT_BATCH(offsets[0] | GEN7_MOCS_L3);
+  OUT_PTR(bufs[1], offsets[1]);
+  OUT_PTR(bufs[2], offsets[2]);
+  OUT_PTR(bufs[3], offsets[3]);
+  ADVANCE_BATCH();
+   } else if (brw->gen == 6) {
+  /* From the Sandybridge PRM, Volume 2, Part 1, Page 138:
+   * "The sum of all four read length fields (each incremented to
+   *  represent the actual read length) must be less than or equal
+   *  to 32."
+   */
+  assert(read_lengths[0] + read_lengths[1] +
+ read_lengths[2] + read_lengths[3] < 32);
+
+  BEGIN_BATCH(5);
+  OUT_BATCH(opcode << 16 | (5 - 2) |
+(read_lengths[0] ? GEN6_CONSTANT_BUFFER_0_ENABLE : 0) |
+(read_lengths[1] ? GEN6_CONSTANT_BUFFER_1_ENABLE : 0) |
+(read_lengths[2] ? GEN6_CONSTANT_BUFFER_2_ENABLE : 0) |
+(read_lengths[3] ? GEN6_CONSTANT_BUFFER_3_ENABLE : 0));
+  OUT_BATCH(offsets[0] | (read_lengths[0] - 1));
+  OUT_PTR(bufs[1], offsets[1] | (read_lengths[1] - 1));
+  OUT_PTR(bufs[2], offsets[2] | (read_lengths[2] - 1));
+  OUT_PTR(bufs[3], offsets[3] | (read_lengths[3] - 1));
+  ADVANCE_BATCH();
+   } else {
+  unreachable("unhandled gen in emit_3dstate_constant");
+   }
+}
+
 voi

Re: [Mesa-dev] [PATCH shader-db 3/4] run: set INTEL_NO_HW together with INTEL_DEVID_OVERRIDE

2017-02-16 Thread Kenneth Graunke
On Thursday, February 16, 2017 4:29:50 AM PST Lionel Landwerlin wrote:
> Since we're already asking the driver to generate code for a different
> hardware than what we're running on, better not even bother with emitting
> any batch.
> 
> Signed-off-by: Lionel Landwerlin 
> ---
>  run.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/run.c b/run.c
> index 62c19c8..7543b2a 100644
> --- a/run.c
> +++ b/run.c
> @@ -370,6 +370,7 @@ main(int argc, char **argv)
>  
>  printf("### Compiling for %s ###\n", platform->name);
>  setenv("INTEL_DEVID_OVERRIDE", platform->pci_id, 1);
> +setenv("INTEL_NO_HW", "1", 1);
>  break;
>  }
>  case 'j':
> 

I don't think you need this patch - libdrm will already not execute
batches if INTEL_DEVID_OVERRIDE is used to force a PCI ID that doesn't
match the one on the system.

Unless the fake PCI ID happens to match the one you're compiling for...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/7] i965: Update brw_save_primitives_written_counters for pre-Gen7.

2017-02-17 Thread Kenneth Graunke
Sandybridge and earlier only have a single counter.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index 41158bd580c..8adac92d07d 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -297,11 +297,17 @@ brw_save_primitives_written_counters(struct brw_context 
*brw,
brw_emit_mi_flush(brw);
 
/* Emit MI_STORE_REGISTER_MEM commands to write the values. */
-   for (int i = 0; i < streams; i++) {
-  int offset = (obj->prim_count_buffer_index + i) * sizeof(uint64_t);
+   if (brw->gen >= 7) {
+  for (int i = 0; i < streams; i++) {
+ int offset = (obj->prim_count_buffer_index + i) * sizeof(uint64_t);
+ brw_store_register_mem64(brw, obj->prim_count_bo,
+  GEN7_SO_NUM_PRIMS_WRITTEN(i),
+  offset);
+  }
+   } else {
   brw_store_register_mem64(brw, obj->prim_count_bo,
-   GEN7_SO_NUM_PRIMS_WRITTEN(i),
-   offset);
+   GEN6_SO_NUM_PRIMS_WRITTEN,
+   obj->prim_count_buffer_index * 
sizeof(uint64_t));
}
 
/* Update where to write data to. */
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/7] i965: Properly reset SVBI counters on ResumeTransformFeedback().

2017-02-17 Thread Kenneth Graunke
This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused
GLES 3.0 test.  When resuming the transform feedback object, we need to
reset the SVBI counters so we continue writing at the correct point in
the buffer.

Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the
Streamed Vertex Buffer Index (SVBI) counters, which contain a count of
vertices emitted.

Unfortunately, there's no straightforward way to store the current SVBI
counter values to a buffer.  They're not available in a register.  You
can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another
internal counter which 3DPRIMITIVE can use...but there's no good way to
extract that either.

So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex
numbers.  Thankfully, we can reuse most of the existing Gen7+ code.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c |   2 +
 src/mesa/drivers/dri/i965/brw_context.h |   6 ++
 src/mesa/drivers/dri/i965/gen6_sol.c| 116 +++-
 3 files changed, 107 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 7240b1f4455..977c59c1c3e 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -481,6 +481,8 @@ brw_init_driver_functions(struct brw_context *brw,
} else {
   functions->BeginTransformFeedback = brw_begin_transform_feedback;
   functions->EndTransformFeedback = brw_end_transform_feedback;
+  functions->PauseTransformFeedback = brw_pause_transform_feedback;
+  functions->ResumeTransformFeedback = brw_resume_transform_feedback;
}
 
if (brw->gen >= 6)
diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 25c90645cea..df406c0d772 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1471,6 +1471,12 @@ void
 brw_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj);
 void
+brw_pause_transform_feedback(struct gl_context *ctx,
+ struct gl_transform_feedback_object *obj);
+void
+brw_resume_transform_feedback(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj);
+void
 brw_save_primitives_written_counters(struct brw_context *brw,
  struct brw_transform_feedback_object 
*obj);
 void
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index f1cc2d59fd4..702cb2830f0 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -314,18 +314,12 @@ brw_save_primitives_written_counters(struct brw_context 
*brw,
obj->prim_count_buffer_index += streams;
 }
 
-/**
- * Compute the number of vertices written by this transform feedback operation.
- */
-void
-brw_compute_xfb_vertices_written(struct brw_context *brw,
- struct brw_transform_feedback_object *obj)
+static void
+compute_vertices_written_so_far(struct brw_context *brw,
+struct brw_transform_feedback_object *obj,
+uint64_t *vertices_written)
 {
const struct gl_context *ctx = &brw->ctx;
-
-   if (obj->vertices_written_valid || !obj->base.EndedAnytime)
-  return;
-
unsigned vertices_per_prim = 0;
 
switch (obj->primitive_mode) {
@@ -346,8 +340,22 @@ brw_compute_xfb_vertices_written(struct brw_context *brw,
tally_prims_generated(brw, obj);
 
for (int i = 0; i < ctx->Const.MaxVertexStreams; i++) {
-  obj->vertices_written[i] = vertices_per_prim * obj->prims_generated[i];
+  vertices_written[i] = vertices_per_prim * obj->prims_generated[i];
}
+}
+
+/**
+ * Compute the number of vertices written by this transform feedback operation.
+ */
+void
+brw_compute_xfb_vertices_written(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj)
+{
+   if (obj->vertices_written_valid || !obj->base.EndedAnytime)
+  return;
+
+   compute_vertices_written_so_far(brw, obj, obj->vertices_written);
+
obj->vertices_written_valid = true;
 }
 
@@ -423,18 +431,92 @@ brw_begin_transform_feedback(struct gl_context *ctx, 
GLenum mode,
   OUT_BATCH(0x);
   ADVANCE_BATCH();
}
+
+   /* We're about to lose the information needed to compute the number of
+* vertices written during the last Begin/EndTransformFeedback section,
+* so we can't delay it any further.
+*/
+   brw_compute_xfb_vertices_written(brw, brw_obj);
+
+   /* No primitives have been generated yet. */
+   for (int i = 0; i < BRW_MAX_XFB_STREAMS; i++) {
+  brw_obj->prims_generated[i] = 0;
+   }
+
+   /* Store the starting 

[Mesa-dev] [PATCH 2/7] i965: Move some code from gen7_sol_state.c to gen6_sol.c.

2017-02-17 Thread Kenneth Graunke
I plan to use these functions on Sandybridge soon.  I changed the prefix
on a couple of functions to "brw" instead of "gen7" as in theory they
should be usable all the way back to G45.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h|   6 ++
 src/mesa/drivers/dri/i965/gen6_sol.c   | 140 +++
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 148 +
 3 files changed, 150 insertions(+), 144 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 01e651b09f0..8d9a75f884b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -1464,6 +1464,12 @@ brw_begin_transform_feedback(struct gl_context *ctx, 
GLenum mode,
 void
 brw_end_transform_feedback(struct gl_context *ctx,
struct gl_transform_feedback_object *obj);
+void
+brw_save_primitives_written_counters(struct brw_context *brw,
+ struct brw_transform_feedback_object 
*obj);
+void
+brw_compute_xfb_vertices_written(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj);
 GLsizei
 brw_get_transform_feedback_vertex_count(struct gl_context *ctx,
 struct gl_transform_feedback_object 
*obj,
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index 6f1d2c2fd04..ca06194ba15 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -220,6 +220,146 @@ brw_delete_transform_feedback(struct gl_context *ctx,
free(brw_obj);
 }
 
+/**
+ * Tally the number of primitives generated so far.
+ *
+ * The buffer contains a series of pairs:
+ * (, ) ;
+ * (, ) ;
+ *
+ * For each stream, we subtract the pair of values (end - start) to get the
+ * number of primitives generated during one section.  We accumulate these
+ * values, adding them up to get the total number of primitives generated.
+ */
+static void
+tally_prims_generated(struct brw_context *brw,
+  struct brw_transform_feedback_object *obj)
+{
+   /* If the current batch is still contributing to the number of primitives
+* generated, flush it now so the results will be present when mapped.
+*/
+   if (drm_intel_bo_references(brw->batch.bo, obj->prim_count_bo))
+  intel_batchbuffer_flush(brw);
+
+   if (unlikely(brw->perf_debug && drm_intel_bo_busy(obj->prim_count_bo)))
+  perf_debug("Stalling for # of transform feedback primitives written.\n");
+
+   drm_intel_bo_map(obj->prim_count_bo, false);
+   uint64_t *prim_counts = obj->prim_count_bo->virtual;
+
+   assert(obj->prim_count_buffer_index % (2 * BRW_MAX_XFB_STREAMS) == 0);
+   int pairs = obj->prim_count_buffer_index / (2 * BRW_MAX_XFB_STREAMS);
+
+   for (int i = 0; i < pairs; i++) {
+  for (int s = 0; s < BRW_MAX_XFB_STREAMS; s++) {
+ obj->prims_generated[s] +=
+prim_counts[BRW_MAX_XFB_STREAMS + s] - prim_counts[s];
+  }
+  prim_counts += 2 * BRW_MAX_XFB_STREAMS; /* move to the next pair */
+   }
+
+   drm_intel_bo_unmap(obj->prim_count_bo);
+
+   /* We've already gathered up the old data; we can safely overwrite it now. 
*/
+   obj->prim_count_buffer_index = 0;
+}
+
+/**
+ * Store the SO_NUM_PRIMS_WRITTEN counters for each stream (4 uint64_t values)
+ * to prim_count_bo.
+ *
+ * If prim_count_bo is out of space, gather up the results so far into
+ * prims_generated[] and allocate a new buffer with enough space.
+ *
+ * The number of primitives written is used to compute the number of vertices
+ * written to a transform feedback stream, which is required to implement
+ * DrawTransformFeedback().
+ */
+void
+brw_save_primitives_written_counters(struct brw_context *brw,
+ struct brw_transform_feedback_object *obj)
+{
+   const int streams = BRW_MAX_XFB_STREAMS;
+
+   /* Check if there's enough space for a new pair of four values. */
+   if (obj->prim_count_bo != NULL &&
+   obj->prim_count_buffer_index + 2 * streams >= 4096 / sizeof(uint64_t)) {
+  /* Gather up the results so far and release the BO. */
+  tally_prims_generated(brw, obj);
+   }
+
+   /* Flush any drawing so that the counters have the right values. */
+   brw_emit_mi_flush(brw);
+
+   /* Emit MI_STORE_REGISTER_MEM commands to write the values. */
+   for (int i = 0; i < streams; i++) {
+  int offset = (obj->prim_count_buffer_index + i) * sizeof(uint64_t);
+  brw_store_register_mem64(brw, obj->prim_count_bo,
+   GEN7_SO_NUM_PRIMS_WRITTEN(i),
+   offset);
+   }
+
+   /* Update where to write data to. */
+   obj->prim_count_buffer_index += streams;
+}
+
+/**
+ * Compute the number of vert

[Mesa-dev] [PATCH 5/7] i965: Save max_index in brw_transform_feedback_object.

2017-02-17 Thread Kenneth Graunke
I'm going to need this in a new Resume hook shortly.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.h | 6 ++
 src/mesa/drivers/dri/i965/gen6_sol.c| 6 --
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
index 8d9a75f884b..25c90645cea 100644
--- a/src/mesa/drivers/dri/i965/brw_context.h
+++ b/src/mesa/drivers/dri/i965/brw_context.h
@@ -567,6 +567,12 @@ struct brw_transform_feedback_object {
GLenum primitive_mode;
 
/**
+* The maximum number of vertices that we can write without overflowing
+* any of the buffers currently being used for transform feedback.
+*/
+   unsigned max_index;
+
+   /**
 * Count of primitives generated during this transform feedback operation.
 *  @{
 */
diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index 8adac92d07d..f1cc2d59fd4 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -382,6 +382,8 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum 
mode,
const struct gl_transform_feedback_info *linked_xfb_info;
struct gl_transform_feedback_object *xfb_obj =
   ctx->TransformFeedback.CurrentObject;
+   struct brw_transform_feedback_object *brw_obj =
+  (struct brw_transform_feedback_object *) xfb_obj;
 
assert(brw->gen == 6);
 
@@ -397,7 +399,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum 
mode,
/* Compute the maximum number of vertices that we can write without
 * overflowing any of the buffers currently being used for feedback.
 */
-   unsigned max_index
+   brw_obj->max_index
   = _mesa_compute_max_transform_feedback_vertices(ctx, xfb_obj,
   linked_xfb_info);
 
@@ -406,7 +408,7 @@ brw_begin_transform_feedback(struct gl_context *ctx, GLenum 
mode,
OUT_BATCH(_3DSTATE_GS_SVB_INDEX << 16 | (4 - 2));
OUT_BATCH(0); /* SVBI 0 */
OUT_BATCH(0); /* starting index */
-   OUT_BATCH(max_index);
+   OUT_BATCH(brw_obj->max_index);
ADVANCE_BATCH();
 
/* Initialize the rest of the unused streams to sane values.  Otherwise,
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.

2017-02-17 Thread Kenneth Graunke
These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG
are supported, which Gen8+ can always do.  So this code is dead.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen7_sol_state.c | 50 ++
 1 file changed, 24 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c 
b/src/mesa/drivers/dri/i965/gen7_sol_state.c
index e6b79ed2342..50631610e51 100644
--- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
@@ -490,13 +490,11 @@ gen7_begin_transform_feedback(struct gl_context *ctx, 
GLenum mode,
struct brw_transform_feedback_object *brw_obj =
   (struct brw_transform_feedback_object *) obj;
 
+   assert(brw->gen == 7);
+
/* Reset the SO buffer offsets to 0. */
-   if (brw->gen >= 8) {
-  brw_obj->zero_offsets = true;
-   } else {
-  intel_batchbuffer_flush(brw);
-  brw->batch.needs_sol_reset = true;
-   }
+   intel_batchbuffer_flush(brw);
+   brw->batch.needs_sol_reset = true;
 
/* We're about to lose the information needed to compute the number of
 * vertices written during the last Begin/EndTransformFeedback section,
@@ -552,17 +550,17 @@ gen7_pause_transform_feedback(struct gl_context *ctx,
/* Flush any drawing so that the counters have the right values. */
brw_emit_mi_flush(brw);
 
+   assert(brw->gen == 7);
+
/* Save the SOL buffer offset register values. */
-   if (brw->gen < 8) {
-  for (int i = 0; i < 4; i++) {
- BEGIN_BATCH(3);
- OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
- OUT_RELOC(brw_obj->offset_bo,
-   I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-   i * sizeof(uint32_t));
- ADVANCE_BATCH();
-  }
+   for (int i = 0; i < 4; i++) {
+  BEGIN_BATCH(3);
+  OUT_BATCH(MI_STORE_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
+  OUT_RELOC(brw_obj->offset_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+i * sizeof(uint32_t));
+  ADVANCE_BATCH();
}
 
/* Store the temporary ending value of the SO_NUM_PRIMS_WRITTEN counters.
@@ -581,17 +579,17 @@ gen7_resume_transform_feedback(struct gl_context *ctx,
struct brw_transform_feedback_object *brw_obj =
   (struct brw_transform_feedback_object *) obj;
 
+   assert(brw->gen == 7);
+
/* Reload the SOL buffer offset registers. */
-   if (brw->gen < 8) {
-  for (int i = 0; i < 4; i++) {
- BEGIN_BATCH(3);
- OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
- OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
- OUT_RELOC(brw_obj->offset_bo,
-   I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
-   i * sizeof(uint32_t));
- ADVANCE_BATCH();
-  }
+   for (int i = 0; i < 4; i++) {
+  BEGIN_BATCH(3);
+  OUT_BATCH(GEN7_MI_LOAD_REGISTER_MEM | (3 - 2));
+  OUT_BATCH(GEN7_SO_WRITE_OFFSET(i));
+  OUT_RELOC(brw_obj->offset_bo,
+I915_GEM_DOMAIN_INSTRUCTION, I915_GEM_DOMAIN_INSTRUCTION,
+i * sizeof(uint32_t));
+  ADVANCE_BATCH();
}
 
/* Store the new starting value of the SO_NUM_PRIMS_WRITTEN counters. */
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] i965: Enable ARB_transform_feedback2 on Sandybridge.

2017-02-17 Thread Kenneth Graunke
The only feature over and above ES 3.0 is DrawTransformFeedback().

We already have to do the whole SOL_NUM_PRIMS_WRITTEN counter dance in
order to compute the SVBI value for ResumeTransformFeedback(), at which
point our existing GetTransformFeedbackVertexCount() implementation will
do the trick (though with a stall to CPU map the buffer).

Someday, we could probably implement DrawTransformFeedback() more
efficiently, using the "Load Internal Vertex Count" feature of
3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit.

Rumor has it this allows people to use WebGL 2.0 on Sandybridge.

Note that we don't need pipelined register writes like Gen7+ because
we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842
Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/brw_context.c  | 2 ++
 src/mesa/drivers/dri/i965/intel_extensions.c | 3 +++
 2 files changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 977c59c1c3e..17800e3fd60 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -483,6 +483,8 @@ brw_init_driver_functions(struct brw_context *brw,
   functions->EndTransformFeedback = brw_end_transform_feedback;
   functions->PauseTransformFeedback = brw_pause_transform_feedback;
   functions->ResumeTransformFeedback = brw_resume_transform_feedback;
+  functions->GetTransformFeedbackVertexCount =
+ brw_get_transform_feedback_vertex_count;
}
 
if (brw->gen >= 6)
diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
b/src/mesa/drivers/dri/i965/intel_extensions.c
index f1290bf7b49..ec7ff02be84 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -158,6 +158,9 @@ intelInitExtensions(struct gl_context *ctx)
   ctx->Extensions.EXT_timer_query = true;
}
 
+   if (brw->gen == 6)
+  ctx->Extensions.ARB_transform_feedback2 = true;
+
if (brw->gen >= 6) {
   ctx->Extensions.ARB_blend_func_extended =
  !driQueryOptionb(&brw->optionCache, "disable_blend_func_extended");
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS.

2017-02-17 Thread Kenneth Graunke
This way on Sandybridge we'll only do 1 stream worth of math, since
we only have one SO_NUM_PRIMS_WRITTEN counter.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/gen6_sol.c | 25 -
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_sol.c 
b/src/mesa/drivers/dri/i965/gen6_sol.c
index ca06194ba15..41158bd580c 100644
--- a/src/mesa/drivers/dri/i965/gen6_sol.c
+++ b/src/mesa/drivers/dri/i965/gen6_sol.c
@@ -230,11 +230,16 @@ brw_delete_transform_feedback(struct gl_context *ctx,
  * For each stream, we subtract the pair of values (end - start) to get the
  * number of primitives generated during one section.  We accumulate these
  * values, adding them up to get the total number of primitives generated.
+ *
+ * Note that we expose one stream pre-Gen7, so the above is just (start, end).
  */
 static void
 tally_prims_generated(struct brw_context *brw,
   struct brw_transform_feedback_object *obj)
 {
+   const struct gl_context *ctx = &brw->ctx;
+   const int streams = ctx->Const.MaxVertexStreams;
+
/* If the current batch is still contributing to the number of primitives
 * generated, flush it now so the results will be present when mapped.
 */
@@ -247,15 +252,14 @@ tally_prims_generated(struct brw_context *brw,
drm_intel_bo_map(obj->prim_count_bo, false);
uint64_t *prim_counts = obj->prim_count_bo->virtual;
 
-   assert(obj->prim_count_buffer_index % (2 * BRW_MAX_XFB_STREAMS) == 0);
-   int pairs = obj->prim_count_buffer_index / (2 * BRW_MAX_XFB_STREAMS);
+   assert(obj->prim_count_buffer_index % (2 * streams) == 0);
+   int pairs = obj->prim_count_buffer_index / (2 * streams);
 
for (int i = 0; i < pairs; i++) {
-  for (int s = 0; s < BRW_MAX_XFB_STREAMS; s++) {
- obj->prims_generated[s] +=
-prim_counts[BRW_MAX_XFB_STREAMS + s] - prim_counts[s];
+  for (int s = 0; s < streams; s++) {
+ obj->prims_generated[s] += prim_counts[streams + s] - prim_counts[s];
   }
-  prim_counts += 2 * BRW_MAX_XFB_STREAMS; /* move to the next pair */
+  prim_counts += 2 * streams; /* move to the next pair */
}
 
drm_intel_bo_unmap(obj->prim_count_bo);
@@ -279,7 +283,8 @@ void
 brw_save_primitives_written_counters(struct brw_context *brw,
  struct brw_transform_feedback_object *obj)
 {
-   const int streams = BRW_MAX_XFB_STREAMS;
+   const struct gl_context *ctx = &brw->ctx;
+   const int streams = ctx->Const.MaxVertexStreams;
 
/* Check if there's enough space for a new pair of four values. */
if (obj->prim_count_bo != NULL &&
@@ -310,6 +315,8 @@ void
 brw_compute_xfb_vertices_written(struct brw_context *brw,
  struct brw_transform_feedback_object *obj)
 {
+   const struct gl_context *ctx = &brw->ctx;
+
if (obj->vertices_written_valid || !obj->base.EndedAnytime)
   return;
 
@@ -332,7 +339,7 @@ brw_compute_xfb_vertices_written(struct brw_context *brw,
/* Get the number of primitives generated. */
tally_prims_generated(brw, obj);
 
-   for (int i = 0; i < BRW_MAX_XFB_STREAMS; i++) {
+   for (int i = 0; i < ctx->Const.MaxVertexStreams; i++) {
   obj->vertices_written[i] = vertices_per_prim * obj->prims_generated[i];
}
obj->vertices_written_valid = true;
@@ -354,7 +361,7 @@ brw_get_transform_feedback_vertex_count(struct gl_context 
*ctx,
   (struct brw_transform_feedback_object *) obj;
 
assert(obj->EndedAnytime);
-   assert(stream < BRW_MAX_XFB_STREAMS);
+   assert(stream < ctx->Const.MaxVertexStreams);
 
brw_compute_xfb_vertices_written(brw, brw_obj);
return brw_obj->vertices_written[stream];
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] vbo: kill primitive restart lowering in glDrawArrays

2017-02-21 Thread Kenneth Graunke
On Monday, February 20, 2017 10:35:36 AM PST Marek Olšák wrote:
> From: Marek Olšák 
> 
> ---
>  src/mesa/vbo/vbo_exec_array.c | 56 
> ++-
>  1 file changed, 7 insertions(+), 49 deletions(-)
> 
> diff --git a/src/mesa/vbo/vbo_exec_array.c b/src/mesa/vbo/vbo_exec_array.c
> index 6a96167..30c52d5 100644
> --- a/src/mesa/vbo/vbo_exec_array.c
> +++ b/src/mesa/vbo/vbo_exec_array.c
> @@ -404,77 +404,35 @@ vbo_bind_arrays(struct gl_context *ctx)
>   */
>  static void
>  vbo_draw_arrays(struct gl_context *ctx, GLenum mode, GLint start,
>  GLsizei count, GLuint numInstances, GLuint baseInstance)
>  {
> struct vbo_context *vbo = vbo_context(ctx);
> struct _mesa_prim prim[2];
>  
> vbo_bind_arrays(ctx);
>  
> -   /* init most fields to zero */
> +   /* OpenGL 4.5 says that primitive restart is ignored with non-indexed
> +* draws.
> +*/

Cool!  Yes it does.  ES 3.2 also seems to agree.

i965 hardware doesn't support primitive restart on non-indexed draws (yet).
Apparently neither does AMD hardware.  NVIDIA does.  So both of us would
need this lowering...if it were required.  But it isn't, so goodbye
code! :)

Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 0/6] Add support for ARB_transform_feedback_overflow_query.

2017-02-21 Thread Kenneth Graunke
On Friday, January 20, 2017 9:53:21 AM PST Rafael Antognolli wrote:
> This patch series implements the ARB_transform_feedback_overflow_query
> extension for i965.
> 
> Changes for v4:
> - Reuse of MI_MATH calcs from hsw_queryobj.c in brw_conditional_render.c
> - Renamed a couple functions as suggested by Kenneth
> - Fallback to CPU-side conditional rendering if MI_MATH is not available.
> 
> The series is available on github here:
> 
> https://github.com/rantogno/mesa/tree/review/overflow_query-v04
> 
> There are also piglit tests available for it here:
> 
> https://github.com/rantogno/piglit/tree/review/overflow_query-v05
> 
> Regards,
> Rafael

Looks great!  Series is:

Reviewed-by: Kenneth Graunke 

I'm guessing you don't have commit access, so I'll try and push it...


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] Model INTEL perf query backend after query object BE

2017-02-21 Thread Kenneth Graunke
ryData)(struct gl_context *ctx,
> +struct gl_perf_query_object *obj,
> +GLsizei dataSize,
> +GLuint *data,
> +GLuint *bytesWritten);
> +   /*@}*/
> +
> +
> +   /**
>  * \name GREMEDY debug/marker functions
>  */
> /*@{*/
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index f3a24df589..e6cf1f4af6 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1860,6 +1860,23 @@ struct gl_perf_monitor_group
>  
>  
>  /**
> + * A query object instance as described in INTEL_performance_query.
> + *
> + * NB: We want to keep this and the corresponding backend structure
> + * relatively lean considering that applications may expect to
> + * allocate enough objects to be able to query around all draw calls
> + * in a frame.
> + */
> +struct gl_perf_query_object
> +{
> +   GLuint Id;/**< hash table ID/name */
> +   GLuint Used:1;/**< has been used for 1 or more queries */
> +   GLuint Active:1;  /**< inside Begin/EndPerfQuery */
> +   GLuint Ready:1;   /**< result is ready? */

Please use "unsigned Id" and "bool Used:1" - we're trying to get away
from GL type aliases when not directly API-facing.

[snip]

> -   /* The specification does not state that this produces an error. */
> +   /* The specification does not state that this produces an error but
> +* to be consistent with glGetFirstPerfQueryIdINTEL we generate an
> +* INVALID_VALUE error */

   */ goes on its own line.

With the "mesa: ..." prefix added, patches 1-2 are (weakly):
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2] i965: Implement INTEL_performance_query backend

2017-02-21 Thread Kenneth Graunke
On Thursday, February 16, 2017 5:20:37 AM PST Robert Bragg wrote:
[snip]
> +   switch(obj->query->kind) {

Space after "switch" please.

Patch 3 is:
Reviewed-by: Kenneth Graunke 


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] Model INTEL perf query backend after query object BE

2017-02-22 Thread Kenneth Graunke
On Wednesday, February 22, 2017 10:35:24 AM PST Robert Bragg wrote:
> On Wed, Feb 22, 2017 at 1:24 AM, Kenneth Graunke  
> wrote:
> > On Wednesday, February 15, 2017 1:37:36 PM PST Robert Bragg wrote:
[snip]
> >> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> >> index f3a24df589..e6cf1f4af6 100644
> >> --- a/src/mesa/main/mtypes.h
> >> +++ b/src/mesa/main/mtypes.h
> >> @@ -1860,6 +1860,23 @@ struct gl_perf_monitor_group
> >>
> >>
> >>  /**
> >> + * A query object instance as described in INTEL_performance_query.
> >> + *
> >> + * NB: We want to keep this and the corresponding backend structure
> >> + * relatively lean considering that applications may expect to
> >> + * allocate enough objects to be able to query around all draw calls
> >> + * in a frame.
> >> + */
> >> +struct gl_perf_query_object
> >> +{
> >> +   GLuint Id;/**< hash table ID/name */
> >> +   GLuint Used:1;/**< has been used for 1 or more queries */
> >> +   GLuint Active:1;  /**< inside Begin/EndPerfQuery */
> >> +   GLuint Ready:1;   /**< result is ready? */
> >
> > Please use "unsigned Id" and "bool Used:1" - we're trying to get away
> > from GL type aliases when not directly API-facing.
> 
> Ah right I was generally aware of that but doing a skimming everything
> with this in mind I found a few other little bits to clean up though I
> ended having some second thoughts about these particular members:
> 
> This Id is a record of the GLuint ID given to the application, just
> used for debugging currently. The value is returned by
> _mesa_HashFindFreeKeyBlock() which is currently implemented in terms
> of the GLuint type. One other place where we access the same ID for
> debugging is via _mesa_HashWalk() which takes a callback expecting a
> GLuint argument. I can still change, but when I thought about this it
> felt like it was indeed a directly api facing value.
> 
> For the bitfields I started over thinking what it means to have a bool
> bitfield since I doubted whether it could be assumed to be unsigned
> and then wondered about the potential for a bug with some code trying
> to compare a bitfield value == 1, or indexing an array. Does Mesa
> require a c99 compiler, otherwise I don't think it's unheard of for
> bool to end up as a signed int typedef. Anyway, besides the
> overly-pedantic thought, I guessed you wouldn't really mind me using
> "unsigned Used:1;" for the sake of avoiding GLuint. I don't think it
> would make a practical difference since the struct will be naturally
> padded to 8 bytes in all likelyhood either way. I'll prod you on IRC
> to check if you really have a strong opinion here before I push.

We assume bool types become 0 or 1 when cast to integers, as guaranteed
by C99.  There are existing examples of bool:1 in NIR, i965, and vc4.
I think it should be OK in Mesa core.

But, feel free to use unsigned.  Or GLuint when you have a rationale
for doing so, as above.  A lot of people just use GL types everywhere
for no particular reason, but it makes some sense here.

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.

2017-02-22 Thread Kenneth Graunke
OpenGL allows the TCS to be missing and supplies an implicit passthrough
shader, but OpenGL ES does not.

One open question is how to handle this for ARB_ES3_2_compatibility.
This patch raises the link error for all ES shading language programs,
but it might make sense to base it on the API.  The approach taken in
this patch is more restrictive, but should still allow any valid ES
programs to work in GL.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/linker.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index b6f8bc4212e..9e68c6e2d71 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -4743,6 +4743,16 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   "tessellation evaluation shader\n");
  goto done;
   }
+
+  if (prog->IsES) {
+ if (num_shaders[MESA_SHADER_TESS_EVAL] > 0 &&
+ num_shaders[MESA_SHADER_TESS_CTRL] == 0) {
+linker_error(prog, "GLSL ES requires non-separable programs "
+ "containing a tessellation evaluation shader to also "
+ "be linked with a tessellation control shader\n");
+goto done;
+ }
+  }
}
 
/* Compute shaders have additional restrictions. */
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Use _mesa_has_OES_geometry_shader() when validating draws

2017-02-23 Thread Kenneth Graunke
On Thursday, February 23, 2017 12:05:18 AM PST Tomasz Figa wrote:
> In validate_DrawElements_common() we need to check for OES_geometry_shader
> extension to determine if we should fail if transform feedback is
> unpaused. However current code reads ctx->Extensions.OES_geometry_shader
> directly, which does not take context version into account. This means
> that if the context is GLES 3.0, which makes the OES_geometry_shader
> inapplicable, we would not validate the draw properly. To fix it, let's
> replace the check with a call to _mesa_has_OES_geometry_shader().
> 
> Fixes following dEQP tests on i965 with a GLES 3.0 context:
> 
> dEQP-GLES3.functional.negative_api.vertex_array#draw_elements
> dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_incomplete_primitive
> dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced
> dEQP-GLES3.functional.negative_api.vertex_array#draw_elements_instanced_incomplete_primitive
> dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements
> dEQP-GLES3.functional.negative_api.vertex_array#draw_range_elements_incomplete_primitive
> 
> Change-Id: Iebc960b479fcd5f6c2b1501cb3e7798b575e6c4d
> Signed-off-by: Tomasz Figa 
> ---
>  src/mesa/main/api_validate.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index 1e8a714067..184bf143ed 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -664,7 +664,8 @@ validate_DrawElements_common(struct gl_context *ctx,
>  * to have been overlooked.  The body of the spec only explicitly allows
>  * the indirect versions.
>  */
> -   if (_mesa_is_gles3(ctx) && !ctx->Extensions.OES_geometry_shader &&
> +   if (_mesa_is_gles3(ctx) &&
> +   !_mesa_has_OES_geometry_shader(ctx) &&
> _mesa_is_xfb_active_and_unpaused(ctx)) {
>_mesa_error(ctx, GL_INVALID_OPERATION,
>"%s(transform feedback active)", caller);
> 

Reviewed-by: Kenneth Graunke 

I'll push this tomorrow if there are no objections.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.

2017-02-23 Thread Kenneth Graunke
On Thursday, February 23, 2017 4:03:36 AM PST Andres Gomez wrote:
> I would welcome a reference to the text in Secption 7.3 of the OpenGL
> ES 3.2 specs as a code comment and commit text but, other than that,
> this is:
> 
> Reviewed-by: Andres Gomez 

I'll add it to the commit message.  I wasn't sure about adding it to the
code since the block above quotes the very same section, so it seemed
like it was already cited (although the particular sentence isn't).

Of course, that didn't show up in the patch...sorry :(

Thanks for the review!


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Use _mesa_has_OES_geometry_shader() when validating draws

2017-02-23 Thread Kenneth Graunke
On Thursday, February 23, 2017 10:46:54 AM PST Ilia Mirkin wrote:
> The assumption was that if you're setting that ext, you can do gles31. Is
> there a driver/hw combo where that's not the case? If so, we should fix
> that instead...

ChromeOS/ARC++ currently disables GLES 3.1 on i965 because we haven't
passed all the dEQP tests for it.  So, we support the functionality,
but they're using GLES 3.0 contexts.  I'm guessing that's why they hit
this and I haven't.

They could probably disable the bit too, but checking whether the
feature is actually exposed seems pretty reasonable.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] ralloc: Delete autofree handling.

2017-02-27 Thread Kenneth Graunke
There was exactly one user of this, and I just removed it.

It also accessed an implicit global context, with no locking.  This
meant that it was only safe if all callers of ralloc_autofree_context()
held the same lock...which is a pretty terrible thing for a utility
library to impose.

Signed-off-by: Kenneth Graunke 
---
 src/util/ralloc.c | 18 --
 src/util/ralloc.h |  9 -
 2 files changed, 27 deletions(-)

diff --git a/src/util/ralloc.c b/src/util/ralloc.c
index 980e4e4f138..d5cc16766b1 100644
--- a/src/util/ralloc.c
+++ b/src/util/ralloc.c
@@ -323,24 +323,6 @@ ralloc_parent(const void *ptr)
return info->parent ? PTR_FROM_HEADER(info->parent) : NULL;
 }
 
-static void *autofree_context = NULL;
-
-static void
-autofree(void)
-{
-   ralloc_free(autofree_context);
-}
-
-void *
-ralloc_autofree_context(void)
-{
-   if (unlikely(autofree_context == NULL)) {
-  autofree_context = ralloc_context(NULL);
-  atexit(autofree);
-   }
-   return autofree_context;
-}
-
 void
 ralloc_set_destructor(const void *ptr, void(*destructor)(void *))
 {
diff --git a/src/util/ralloc.h b/src/util/ralloc.h
index 3e2d342b45e..7d906519661 100644
--- a/src/util/ralloc.h
+++ b/src/util/ralloc.h
@@ -247,15 +247,6 @@ void ralloc_adopt(const void *new_ctx, void *old_ctx);
 void *ralloc_parent(const void *ptr);
 
 /**
- * Return a context whose memory will be automatically freed at program exit.
- *
- * The first call to this function creates a context and registers a handler
- * to free it using \c atexit.  This may cause trouble if used in a library
- * loaded with \c dlopen.
- */
-void *ralloc_autofree_context(void);
-
-/**
  * Set a callback to occur just before an object is freed.
  */
 void ralloc_set_destructor(const void *ptr, void(*destructor)(void *));
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] compiler: Free types in _mesa_glsl_release_types() rather than autofree.

2017-02-27 Thread Kenneth Graunke
Instead of using ralloc_autofree_context() to install an atexit()
handler to ralloc_free(glsl_type::mem_ctx), we can simply free them
from _mesa_glsl_release_types().

This is effectively the same, because _mesa_glsl_release_types() is
called from _mesa_destroy_shader_compiler(), which is called from Mesa's
one_time_fini() function, which Mesa installs as an atexit() handler.

The one advantage here is that it ensures the built-in functions are
destroyed before the types.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl_types.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index a431edcdd3f..1da631dcb98 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -40,7 +40,7 @@ void
 glsl_type::init_ralloc_type_ctx(void)
 {
if (glsl_type::mem_ctx == NULL) {
-  glsl_type::mem_ctx = ralloc_autofree_context();
+  glsl_type::mem_ctx = ralloc_context(NULL);
   assert(glsl_type::mem_ctx != NULL);
}
 }
@@ -416,6 +416,9 @@ _mesa_glsl_release_types(void)
   _mesa_hash_table_destroy(glsl_type::interface_types, NULL);
   glsl_type::interface_types = NULL;
}
+
+   ralloc_free(glsl_type::mem_ctx);
+   glsl_type::mem_ctx = NULL;
 }
 
 
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] mesa: Require mipmap completeness for glCopyImageSubData(), sometimes.

2017-02-27 Thread Kenneth Graunke
This patch makes glCopyImageSubData require mipmap completeness when the
texture object's built-in sampler object has a mipmapping MinFilter.

Fixes (on i965):
dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/copyimage.c | 25 +++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c
index cf25159e880..877c8ac246d 100644
--- a/src/mesa/main/copyimage.c
+++ b/src/mesa/main/copyimage.c
@@ -149,9 +149,30 @@ prepare_target(struct gl_context *ctx, GLuint name, GLenum 
target,
  return false;
   }
 
+  /* The ARB_copy_image specification says:
+   *
+   *"INVALID_OPERATION is generated if either object is a texture and
+   * the texture is not complete (as defined in section 3.9.14)"
+   *
+   * The cited section says:
+   *
+   *"Using the preceding definitions, a texture is complete unless any
+   * of the following conditions hold true: [...]
+   *
+   * * The minification filter requires a mipmap (is neither NEAREST
+   *   nor LINEAR), and the texture is not mipmap complete."
+   *
+   * This imposes the bizarre restriction that glCopyImageSubData requires
+   * mipmap completion at times, which dEQP mandates, and other drivers
+   * appear to implement.  We don't have any texture units here, so we
+   * can't look at any bound separate sampler objects...it appears that
+   * you're supposed to use the sampler object which is built-in to the
+   * texture object.
+   *
+   * See https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16224.
+   */
   _mesa_test_texobj_completeness(ctx, texObj);
-  if (!texObj->_BaseComplete ||
-  (level != 0 && !texObj->_MipmapComplete)) {
+  if (!_mesa_is_texture_complete(texObj, &texObj->Sampler)) {
  _mesa_error(ctx, GL_INVALID_OPERATION,
  "glCopyImageSubData(%sName incomplete)", dbg_prefix);
  return false;
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Move intel_resolve_map.[ch] from i965_compiler_FILES to i965_FILES

2017-02-27 Thread Kenneth Graunke
I have no idea why these were part of the compiler files.  They're
miptree related code, and the compiler doesn't appear to use them.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/drivers/dri/i965/Makefile.sources | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/Makefile.sources 
b/src/mesa/drivers/dri/i965/Makefile.sources
index 5278e86339a..633c3dc00a4 100644
--- a/src/mesa/drivers/dri/i965/Makefile.sources
+++ b/src/mesa/drivers/dri/i965/Makefile.sources
@@ -85,9 +85,7 @@ i965_compiler_FILES = \
intel_asm_annotation.c \
intel_asm_annotation.h \
intel_debug.c \
-   intel_debug.h \
-   intel_resolve_map.c \
-   intel_resolve_map.h
+   intel_debug.h
 
 i965_compiler_GENERATED_FILES = \
brw_nir_trig_workarounds.c
@@ -236,6 +234,8 @@ i965_FILES = \
intel_pixel_draw.c \
intel_pixel.h \
intel_pixel_read.c \
+   intel_resolve_map.c \
+   intel_resolve_map.h \
intel_screen.c \
intel_screen.h \
intel_state.c \
-- 
2.11.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Require mipmap completeness for glCopyImageSubData(), sometimes.

2017-02-27 Thread Kenneth Graunke
On Monday, February 27, 2017 4:54:37 AM PST Kenneth Graunke wrote:
> This patch makes glCopyImageSubData require mipmap completeness when the
> texture object's built-in sampler object has a mipmapping MinFilter.
> 
> Fixes (on i965):
> dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/main/copyimage.c | 25 +++--
>  1 file changed, 23 insertions(+), 2 deletions(-)

Oops...I accidentally mailed this out before regression testing it
(waited for my test job to come back, saw it was green...but it was
the results for a different patch).

It looks like it regresses some Piglit tests (but I think the tests
are probably what needs to change).  I'll look into it.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


<    1   2   3   4   5   6   7   8   9   10   >