date:20150407

On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 Pushing this through a complete piglit run, but it seems to fix

   bin/arb-provoking-vertex-render

 on a3xx. Please take special care to double-check that I didn't mess
 up cw/ccw order or something. I'm especially weak on the quadstrip
 case.

  src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py 
 b/src/gallium/auxiliary/indices/u_indices_gen.py
 index 687a717..b17d132 100644
 --- a/src/gallium/auxiliary/indices/u_indices_gen.py
 +++ b/src/gallium/auxiliary/indices/u_indices_gen.py
 @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, 
 outpv ):
  tri( intype, outtype, ptr, v2, v0, v1 )

  def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
 -do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 -do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +if inpv == LAST:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +else:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v0, v3, v2, inpv, outpv );

Erm, make that v0, v1, v2; v0, v2, v3. Oops :)


  def name(intype, outtype, inpv, outpv, pr, prim):
  if intype == GENERATE:
 @@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
  print ' i += 4;'
  print ' goto restart;'
  print '  }'
 -do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, 
 outpv );
 +if inpv == LAST:
 +do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, 
 outpv );
 +else:
 +do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', inpv, 
 outpv );
  print '   }'
  postamble()

 --
 2.0.5
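
For readers following the correction above, here is a minimal Python sketch of the intended quad split. It is not the generator code itself: split_quad, signed_area and the positions table are hypothetical names used only for illustration, and the sketch ignores the vertex rotation that do_tri() applies when the input and output conventions differ. With the last-vertex convention both triangles end on v3; with the first-vertex convention (using the corrected order v0, v1, v2; v0, v2, v3) both start on v0, and the winding stays the same.

```python
FIRST, LAST = "first", "last"


def split_quad(v0, v1, v2, v3, provoking):
    """Split quad (v0, v1, v2, v3) into two triangles that keep its provoking vertex."""
    if provoking == LAST:
        # Both triangles end on v3, the quad's last vertex.
        return [(v0, v1, v3), (v1, v2, v3)]
    # Both triangles start on v0, the quad's first vertex.
    return [(v0, v1, v2), (v0, v2, v3)]


def signed_area(tri, pos):
    """Positive for counter-clockwise triangles, negative for clockwise ones."""
    (ax, ay), (bx, by), (cx, cy) = (pos[i] for i in tri)
    return 0.5 * ((bx - ax) * (cy - ay) - (cx - ax) * (by - ay))


# Unit square listed counter-clockwise (illustrative data only).
positions = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}

for mode in (FIRST, LAST):
    tris = split_quad(0, 1, 2, 3, mode)
    print(mode, tris, [signed_area(t, positions) for t in tris])
```

Checking the sign of the area for every output triangle is a cheap way to catch the cw/ccw mistakes the patch notes worry about.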

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES 3.00

2015-04-07 Thread Tapani Pälli

From: Kalyan Kondapally kalyan.kondapa...@intel.com

Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
Earlier versions allow 'constant-index-expression' indexing, where
index can contain a loop induction variable.

Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
This change makes 'sampler-array-index.frag' parser test in Piglit
pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.

v2: small change and some more commit message (Tapani)

Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225
---
 src/glsl/ast_array_index.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index ecef651..b2609b6 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
* dynamically uniform expression is undefined.
*/
   if (array->type->element_type()->is_sampler()) {
-if (!state->is_version(130, 100)) {
+if (!state->is_version(130, 300)) {
if (state->es_shader) {
   _mesa_glsl_warning(&loc, state,
  "sampler arrays indexed with non-constant "
-- 
2.1.0
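
The effect of the one-line change above can be sketched in Python. This is a simplified model, not Mesa code: is_version mirrors the usual "pick the desktop or ES threshold, then compare" logic, es_sampler_index_diagnostic is a hypothetical helper, and the "error" result for the other branch is an assumption, since the diff only shows the warning branch.

```python
def is_version(language_version, es_shader, required_glsl, required_glsl_es):
    """True when the shader's version meets the threshold for its API flavour."""
    required = required_glsl_es if es_shader else required_glsl
    return required != 0 and language_version >= required


def es_sampler_index_diagnostic(language_version, es_threshold):
    """Diagnostic for a non-constant sampler array index in an ES shader.

    es_threshold is the second argument passed to is_version():
    100 before the patch, 300 after it.
    """
    if not is_version(language_version, True, 130, es_threshold):
        return "warning"  # branch visible in the diff
    return "error"        # assumed behaviour of the branch not shown


for threshold in (100, 300):
    for version in (100, 300):
        print(threshold, version, es_sampler_index_diagnostic(version, threshold))
```

Under these assumptions an ES 1.00 shader moves from the rejected path to the warning path, while ES 3.00 shaders are treated as before.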


[Mesa-dev] [PATCH 4/5] nir: Allocate dereferences out of their parent instruction or deref.

Jason pointed out that variable dereferences in NIR are really part of
their parent instruction, and should have the same lifetime.

Unlike in GLSL IR, they're not used very often - just for intrinsic
variables, call parameters & return, and indirect samplers for
texturing.  Also, nir_deref_var is the top-level concept, and
nir_deref_array/nir_deref_record are child nodes.

This patch attempts to allocate nir_deref_vars out of their parent
instruction, and any sub-dereferences out of their parent deref.
It enforces these restrictions in the validator as well.

This means that freeing an instruction should free its associated
dereference chain as well.  The memory sweeper pass can also happily
ignore them.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/glsl_to_nir.cpp| 47 -
 src/glsl/nir/nir.c  |  6 ++---
 src/glsl/nir/nir_lower_var_copies.c |  8 +++
 src/glsl/nir/nir_split_var_copies.c |  4 ++--
 src/glsl/nir/nir_validate.c | 13 ++
 src/mesa/program/prog_to_nir.c  |  9 ---
 6 files changed, 45 insertions(+), 42 deletions(-)

This is still a lot of churn, but surprisingly about even on LOC.
With the validator code in place, I suspect we can get this right
going forward without too much trouble.
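
The ownership rule described in the commit message can be pictured with a toy hierarchical allocator. The Python below is only a sketch under stated assumptions: Node, steal and free are hypothetical stand-ins that loosely mimic ralloc contexts, ralloc_steal() and ralloc_free(); none of it is Mesa's actual API.

```python
class Node:
    """Toy stand-in for a ralloc'd allocation: one parent, many children."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False
        if parent is not None:
            steal(parent, self)


def steal(new_parent, node):
    """Reparent node under new_parent (loosely like ralloc_steal)."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)


def free(node):
    """Free node and everything allocated out of it (loosely like ralloc_free)."""
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None
    for child in list(node.children):
        free(child)
    node.freed = True


shader = Node("shader")
copy = Node("copy_var instr", shader)
deref_var = Node("deref_var", shader)         # chain head, built under the shader
deref_array = Node("deref_array", deref_var)  # sub-deref owned by its parent deref
steal(copy, deref_var)                        # what make_deref() does with the head

free(copy)
print(deref_var.freed, deref_array.freed, shader.freed)
```

Because the chain head is reparented to the instruction and each sub-dereference hangs off its parent deref, freeing the instruction takes the whole chain with it and leaves the shader untouched.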

diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
index 80c5b3a..f61a47a 100644
--- a/src/glsl/nir/glsl_to_nir.cpp
+++ b/src/glsl/nir/glsl_to_nir.cpp
@@ -88,6 +88,8 @@ private:
exec_list *cf_node_list;
nir_instr *result; /* result of the expression tree last visited */
 
+   nir_deref_var *make_deref(void *mem_ctx, ir_instruction *ir);
+
/* the head of the dereference chain we're creating */
nir_deref_var *deref_head;
/* the tail of the dereference chain we're creating */
@@ -156,6 +158,14 @@ nir_visitor::~nir_visitor()
_mesa_hash_table_destroy(this->overload_table, NULL);
 }
 
+nir_deref_var *
+nir_visitor::make_deref(void *mem_ctx, ir_instruction *ir)
+{
+   ir->accept(this);
+   ralloc_steal(mem_ctx, this->deref_head);
+   return this->deref_head;
+}
+
 static nir_constant *
 constant_copy(ir_constant *ir, void *mem_ctx)
 {
@@ -582,13 +592,11 @@ void
 nir_visitor::visit(ir_return *ir)
 {
if (ir->value != NULL) {
-  ir->value->accept(this);
   nir_intrinsic_instr *copy =
  nir_intrinsic_instr_create(this->shader, nir_intrinsic_copy_var);
 
-  copy->variables[0] = nir_deref_var_create(this->shader,
-this->impl->return_var);
-  copy->variables[1] = this->deref_head;
+  copy->variables[0] = nir_deref_var_create(copy, this->impl->return_var);
+  copy->variables[1] = make_deref(copy, ir->value);
}
 
nir_jump_instr *instr = nir_jump_instr_create(this->shader, nir_jump_return);
@@ -613,8 +621,7 @@ nir_visitor::visit(ir_call *ir)
   nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader, op);
   ir_dereference *param =
  (ir_dereference *) ir->actual_parameters.get_head();
-  param->accept(this);
-  instr->variables[0] = this->deref_head;
+  instr->variables[0] = make_deref(instr, param);
   nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);
 
   nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
@@ -623,8 +630,7 @@ nir_visitor::visit(ir_call *ir)
  nir_intrinsic_instr_create(shader, nir_intrinsic_store_var);
   store_instr->num_components = 1;
 
-  ir->return_deref->accept(this);
-  store_instr->variables[0] = this->deref_head;
+  store_instr->variables[0] = make_deref(store_instr, ir->return_deref);
   store_instr->src[0].is_ssa = true;
   store_instr->src[0].ssa = &instr->dest.ssa;
 
@@ -642,13 +648,11 @@ nir_visitor::visit(ir_call *ir)
 
unsigned i = 0;
foreach_in_list(ir_dereference, param, &ir->actual_parameters) {
-  param->accept(this);
-  instr->params[i] = this->deref_head;
+  instr->params[i] = make_deref(instr, param);
   i++;
}
 
-   ir->return_deref->accept(this);
-   instr->return_deref = this->deref_head;
+   instr->return_deref = make_deref(instr, ir->return_deref);
nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
 }
 
@@ -663,12 +667,8 @@ nir_visitor::visit(ir_assignment *ir)
   nir_intrinsic_instr *copy =
  nir_intrinsic_instr_create(this->shader, nir_intrinsic_copy_var);
 
-  ir->lhs->accept(this);
-  copy->variables[0] = this->deref_head;
-
-  ir->rhs->accept(this);
-  copy->variables[1] = this->deref_head;
-
+  copy->variables[0] = make_deref(copy, ir->lhs);
+  copy->variables[1] = make_deref(copy, ir->rhs);
 
   if (ir->condition) {
  nir_if *if_stmt = nir_if_create(this->shader);
@@ -700,6 +700,7 @@ nir_visitor::visit(ir_assignment *ir)
   load->num_components = ir->lhs->type->vector_elements;
   nir_ssa_dest_init(&load->instr, &load->dest, num_components, NULL);
   load->variables[0] = lhs_deref;
+  ralloc_steal(load,

[Mesa-dev] [PATCH 2/5] nir: Allocate nir_phi_src values out of the nir_phi_instr.

Phi sources are part of the phi instruction and should have the same
lifetime.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir_lower_phis_to_scalar.c | 2 +-
 src/glsl/nir/nir_lower_vars_to_ssa.c| 2 +-
 src/glsl/nir/nir_to_ssa.c   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/glsl/nir/nir_lower_phis_to_scalar.c 
b/src/glsl/nir/nir_lower_phis_to_scalar.c
index 7cd93ea..4bdb800 100644
--- a/src/glsl/nir/nir_lower_phis_to_scalar.c
+++ b/src/glsl/nir/nir_lower_phis_to_scalar.c
@@ -223,7 +223,7 @@ lower_phis_to_scalar_block(nir_block *block, void *void_state)
 else
nir_instr_insert_after_block(src->pred, &mov->instr);
 
-nir_phi_src *new_src = ralloc(state->mem_ctx, nir_phi_src);
+nir_phi_src *new_src = ralloc(new_phi, nir_phi_src);
 new_src->pred = src->pred;
 new_src->src = nir_src_for_ssa(&mov->dest.dest.ssa);
 
diff --git a/src/glsl/nir/nir_lower_vars_to_ssa.c 
b/src/glsl/nir/nir_lower_vars_to_ssa.c
index 86e6ab4..2ca74d7 100644
--- a/src/glsl/nir/nir_lower_vars_to_ssa.c
+++ b/src/glsl/nir/nir_lower_vars_to_ssa.c
@@ -642,7 +642,7 @@ add_phi_sources(nir_block *block, nir_block *pred,
 
   struct deref_node *node = entry->data;
 
-  nir_phi_src *src = ralloc(state->mem_ctx, nir_phi_src);
+  nir_phi_src *src = ralloc(phi, nir_phi_src);
   src->pred = pred;
   src->src.is_ssa = true;
   src->src.ssa = get_ssa_def_for_block(node, pred, state);
diff --git a/src/glsl/nir/nir_to_ssa.c b/src/glsl/nir/nir_to_ssa.c
index 47cf453..53ff547 100644
--- a/src/glsl/nir/nir_to_ssa.c
+++ b/src/glsl/nir/nir_to_ssa.c
@@ -47,7 +47,7 @@ insert_trivial_phi(nir_register *reg, nir_block *block, void *mem_ctx)
set_foreach(block->predecessors, entry) {
   nir_block *pred = (nir_block *) entry->key;
 
-  nir_phi_src *src = ralloc(mem_ctx, nir_phi_src);
+  nir_phi_src *src = ralloc(instr, nir_phi_src);
   src->pred = pred;
   src->src.is_ssa = false;
   src->src.reg.base_offset = 0;
-- 
2.3.5


[Mesa-dev] [PATCH 3/5] nir: Allocate nir_ssa_def::uses/if_uses out of the instruction.

We can't allocate them out of the nir_ssa_def itself, because it may not
be ralloc'd (for example, nir_dest embeds a nir_ssa_def).

However, allocating them out of the instruction should work.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir.c | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 0f807dd..85ff0f4 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -1834,13 +1834,11 @@ void
 nir_ssa_def_init(nir_instr *instr, nir_ssa_def *def,
  unsigned num_components, const char *name)
 {
-   void *mem_ctx = ralloc_parent(instr);
-
def->name = name;
def->parent_instr = instr;
-   def->uses = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
+   def->uses = _mesa_set_create(instr, _mesa_hash_pointer,
 _mesa_key_pointer_equal);
-   def->if_uses = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
+   def->if_uses = _mesa_set_create(instr, _mesa_hash_pointer,
_mesa_key_pointer_equal);
def->num_components = num_components;
 
-- 
2.3.5


[Mesa-dev] [PATCH 1/5] nir: Allocate nir_call_instr::params out of the nir_call itself.

The lifetime of the params array needs to match that of the nir_call_instr
itself.  So, allocate it using the instruction itself as the context.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
 
This is the 'nir-memory-v2' branch in my tree.

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 5f86eca..0f807dd 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -445,7 +445,7 @@ nir_call_instr_create(void *mem_ctx, nir_function_overload *callee)
 
instr->callee = callee;
instr->num_params = callee->num_params;
-   instr->params = ralloc_array(mem_ctx, nir_deref_var *, instr->num_params);
+   instr->params = ralloc_array(instr, nir_deref_var *, instr->num_params);
instr->return_deref = NULL;
 
return instr;
-- 
2.3.5


[Mesa-dev] [PATCH 5/5] nir: Implement a nir_sweep() pass.

This pass performs a mark and sweep pass over a nir_shader's associated
memory - anything still connected to the program will be kept, and any
dead memory we dropped on the floor will be freed.

The expectation is that this will be called when finished building and
optimizing the shader.  However, it's also fine to call it earlier, and
many times, to free up memory earlier.

v2: (feedback from Jason Ekstrand)
- Skip sweeping impl->start_block, as it's already in the CF list.
- Don't sweep SSA defs (they're owned by their defining instruction).
- Don't steal phi sources (they're owned by nir_phi_instr).
- Don't steal tex->src (it's owned by the tex_inst itself).
- Don't sweep dereference chains (top-level dereferences are owned by
  the instruction; sub-dereferences are owned by the parent deref).
- Don't sweep sources and destinations (SSA defs are handled as part of
  the defining instruction, and registers are handled as part of
  function implementations).
- Just steal instructions; don't walk them (no longer required).

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/Makefile.sources |   1 +
 src/glsl/nir/nir.h|   2 +
 src/glsl/nir/nir_sweep.c  | 151 ++
 3 files changed, 154 insertions(+)
 create mode 100644 src/glsl/nir/nir_sweep.c

This version is much simpler (= faster), thanks to the earlier changes.
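
As an illustration of why the ownership changes make the sweep simpler, here is a toy Python model. It is a sketch under assumptions, not the pass itself: Node, steal, free and sweep are hypothetical names, and the "move everything aside, steal the live nodes back, free the rest" structure is one plausible way to arrange such a pass, since the truncated diff does not show the top-level function.

```python
class Node:
    """Toy stand-in for a ralloc'd allocation (hypothetical, not Mesa's API)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False
        if parent is not None:
            steal(parent, self)


def steal(new_parent, node):
    """Reparent node under new_parent."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)


def free(node):
    """Free node together with everything it owns."""
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None
    for child in list(node.children):
        free(child)
    node.freed = True


def sweep(shader, live):
    """Keep only the nodes in `live` (and whatever they own) under shader."""
    rubbish = Node("rubbish")
    for child in list(shader.children):
        steal(rubbish, child)   # set everything aside
    for node in live:
        steal(shader, node)     # reachable nodes come back
    free(rubbish)               # whatever is left was dead


shader = Node("shader")
block = Node("block", shader)
instr = Node("instr", shader)
deref = Node("deref", instr)            # owned by its instruction
dead = Node("removed instr", shader)    # no longer in any block
dead_deref = Node("removed deref", dead)

sweep(shader, live=[block, instr])
print(deref.freed, dead.freed, dead_deref.freed)
```

The sweep never visits deref or dead_deref directly: they follow their owning instruction, which is the simplification the v2 notes describe.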

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index 9bdcb80..c471eca 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -59,6 +59,7 @@ NIR_FILES = \
nir/nir_search.c \
nir/nir_search.h \
nir/nir_split_var_copies.c \
+   nir/nir_sweep.c \
nir/nir_to_ssa.c \
nir/nir_types.h \
nir/nir_validate.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index e6b7684..0f72301 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1650,6 +1650,8 @@ bool nir_opt_peephole_ffma(nir_shader *shader);
 
 bool nir_opt_remove_phis(nir_shader *shader);
 
+void nir_sweep(nir_shader *shader);
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/src/glsl/nir/nir_sweep.c b/src/glsl/nir/nir_sweep.c
new file mode 100644
index 000..b33d624
--- /dev/null
+++ b/src/glsl/nir/nir_sweep.c
@@ -0,0 +1,151 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+
+/**
+ * \file nir_sweep.c
+ *
+ * The nir_sweep() pass performs a mark and sweep pass over a nir_shader's associated
+ * memory - anything still connected to the program will be kept, and any dead memory
+ * we dropped on the floor will be freed.
+ *
+ * The expectation is that drivers should call this when finished compiling the shader
+ * (after any optimization, lowering, and so on).  However, it's also fine to call it
+ * earlier, and even many times, trading CPU cycles for memory savings.
+ */
+
+#define steal_list(mem_ctx, type, list) \
+   foreach_list_typed(type, obj, node, list) { ralloc_steal(mem_ctx, obj); }
+
+static void sweep_cf_node(nir_shader *nir, nir_cf_node *cf_node);
+
+static void
+sweep_block(nir_shader *nir, nir_block *block)
+{
+   ralloc_steal(nir, block);
+
+   nir_foreach_instr(block, instr) {
+  ralloc_steal(nir, instr);
+   }
+}
+
+static void
+sweep_if(nir_shader *nir, nir_if *iff)
+{
+   ralloc_steal(nir, iff);
+
+   foreach_list_typed(nir_cf_node, cf_node, node, &iff->then_list) {
+  sweep_cf_node(nir, cf_node);
+   }
+
+   foreach_list_typed(nir_cf_node, cf_node, node, &iff->else_list) {
+  sweep_cf_node(nir, cf_node);
+   }
+}
+
+static void
+sweep_loop(nir_shader *nir, nir_loop *loop)
+{
+   ralloc_steal(nir, loop);
+
+   foreach_list_typed(nir_cf_node, cf_node, node, &loop->body) {
+  sweep_cf_node(nir, cf_node);
+   }
+}
+
+static void
+sweep_cf_node(nir_shader *nir, nir_cf_node *cf_node)
+{
+   switch (cf_node->type) {
+   case

Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES 3.00

2015-04-07 Thread Francisco Jerez

Tapani Pälli tapani.pa...@intel.com writes:

 From: Kalyan Kondapally kalyan.kondapa...@intel.com

 Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
 Earlier versions allow 'constant-index-expression' indexing, where
 index can contain a loop induction variable.

 Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
 This change makes 'sampler-array-index.frag' parser test in Piglit
 pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.

 v2: small change and some more commit message (Tapani)

 Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225

Looks good, but did you check what happens now if the shader uses actual
variable indexing (i.e. which lowering cannot turn into a constant) on
an implementation that doesn't support it?  Hopefully no crashes or
hangs?

 ---
  src/glsl/ast_array_index.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
 index ecef651..b2609b6 100644
 --- a/src/glsl/ast_array_index.cpp
 +++ b/src/glsl/ast_array_index.cpp
 @@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
 * dynamically uniform expression is undefined.
 */
 if (array->type->element_type()->is_sampler()) {
 -  if (!state->is_version(130, 100)) {
 +  if (!state->is_version(130, 300)) {
   if (state->es_shader) {
  _mesa_glsl_warning(&loc, state,
 "sampler arrays indexed with non-constant "
 -- 
 2.1.0



Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES 3.00

2015-04-07 Thread Tapani Pälli




On 04/07/2015 01:22 PM, Francisco Jerez wrote:

Tapani Pälli tapani.pa...@intel.com writes:


From: Kalyan Kondapally kalyan.kondapa...@intel.com

Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
Earlier versions allow 'constant-index-expression' indexing, where
index can contain a loop induction variable.

Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
This change makes 'sampler-array-index.frag' parser test in Piglit
pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.

v2: small change and some more commit message (Tapani)

Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225


Looks good, but did you check what happens now if the shader uses actual
variable indexing (i.e. which lowering cannot turn into a constant) on
an implementation that doesn't support it?  Hopefully no crashes or
hangs?


I could test something like this; can you suggest a good victim
platform and some ugly corner case? As a starter, I have a shader_test
that uses an expression containing a uniform as the index.


As a plan B, I think loop analysis could store some information which
can then be used for additional validation of the array index in a later
step (skip it in AST and check only later for ES 1.00).



---
  src/glsl/ast_array_index.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index ecef651..b2609b6 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
 * dynamically uniform expression is undefined.
 */
if (array->type->element_type()->is_sampler()) {
-if (!state->is_version(130, 100)) {
+if (!state->is_version(130, 300)) {
if (state->es_shader) {
   _mesa_glsl_warning(&loc, state,
  "sampler arrays indexed with non-constant "
--
2.1.0


Re: [Mesa-dev] [PATCH 4/5] glsl: Consider active all elements of a shared/std140 block array

2015-04-07 Thread Iago Toral

Besides fixing the mentioned dEQP crashes, this patch also generally
fixes instance arrays with UBOs. The problem we have now is that each
element in the UBO instance array is a separate UBO mapped to a specific
binding point (and thus, a separate buffer), but we kill the instances
that are not being referenced in the shader code, so if we have
something like this:

layout(std140, binding=2) uniform Fragments {
   vec4 v0;
   vec4 v1;
} inst[3];

And then the shader code only references inst[1], for example:

vec4 tfOutput0 = inst[1].v0;

That UBO read for inst[1].v0 can fail because we are killing the UBOs
for inst[0] and inst[2], and we shouldn't.

I hit this while developing SSBO, which is the same thing, and this
patch fixes the problem.

Iago
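
The rule being argued for above can be sketched in a few lines of Python. This is only an illustration: active_block_elements is a hypothetical helper, and the packing names are plain strings rather than Mesa's enums.

```python
def active_block_elements(packing, array_length, referenced):
    """Return the indices of a uniform block array that must stay active.

    std140 and shared layouts have an application-visible layout, so every
    instance is kept even if the shader only reads some of them; for other
    packings only the referenced instances survive.
    """
    if packing in ("std140", "shared"):
        return list(range(array_length))
    return sorted(set(referenced))


# The example from the message: inst[3], with only inst[1] referenced.
print(active_block_elements("std140", 3, [1]))
print(active_block_elements("packed", 3, [1]))
```

With std140 all three instances keep their binding points, so the buffer bound for inst[1] is still the one the shader reads from.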

On Wed, 2015-03-11 at 10:01 +0100, Eduardo Lima Mitev wrote:
 From: Antia Puentes apuen...@igalia.com
 
 Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140'
 blocks or block members) considers active 'shared' and 'std140'
 uniform blocks and uniform block arrays but did not include the
 block array elements. It was possible to have an active uniform
 block array without any elements marked as used, making the
 assertion ((b->num_array_elements > 0) == b->type->is_array())
 in link_uniform_blocks fail.
 
 Fixes the following 5 dEQP tests:
 
  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18
  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24
  * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19
  * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
  * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36
 ---
  src/glsl/link_uniform_block_active_visitor.cpp | 23 +++
  1 file changed, 23 insertions(+)
 
 diff --git a/src/glsl/link_uniform_block_active_visitor.cpp 
 b/src/glsl/link_uniform_block_active_visitor.cpp
 index 292cde3..8379750 100644
 --- a/src/glsl/link_uniform_block_active_visitor.cpp
 +++ b/src/glsl/link_uniform_block_active_visitor.cpp
 @@ -105,6 +105,22 @@ link_uniform_block_active_visitor::visit(ir_variable *var)
 assert(b->num_array_elements == 0);
 assert(b->array_elements == NULL);
 assert(b->type != NULL);
 +   assert(!b->type->is_array() || b->has_instance_name);
 +
 +   /* For uniform block arrays declared with a shared or std140 layout
 +* qualifier, mark all its instances as used.
 +*/
 +   if (b->type->is_array() && b->type->length > 0) {
 +  b->num_array_elements = b->type->length;
 +  b->array_elements = reralloc(this->mem_ctx,
 +   b->array_elements,
 +   unsigned,
 +   b->num_array_elements);
 +
 +  for (unsigned i = 0; i < b->num_array_elements; i++) {
 + b->array_elements[i] = i;
 +  }
 +   }
  
 return visit_continue;
  }
 @@ -146,6 +162,13 @@ link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir)
 assert((b->num_array_elements == 0) == (b->array_elements == NULL));
 assert(b->type != NULL);
  
 +   /* If the block array was declared with a shared or std140 layout qualifier,
 +* all its instances have been already marked as used (see
 +* link_uniform_block_active_visitor::visit(ir_variable *) function).
 +*/
 +   if (var->type->interface_packing == GLSL_INTERFACE_PACKING_PACKED)
 +  return visit_continue;
 +
 ir_constant *c = ir->array_index->as_constant();
  
 if (c) {



Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.

2015-04-07 Thread Jose Fonseca


Sorry for the delay. I was away over Easter.

On 02/04/15 19:02, Matt Turner wrote:

On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote:

These were being defined in SCons, but it's not practical -- we actually
need to include Gallium headers from external source trees, with
completely disjoint build infrastructure, and it's unsustainable to
replicate the HAVE_xxx checks or even hard-coded defines across
everywhere.


To confirm, you're building external sources with gcc? I don't think
these macros are useful for MSVC.


Correct.



No actual change in behavior for autoconf.
---
  configure.ac |  2 +-
  include/c99_compat.h | 45 +
  scons/gallium.py | 27 ---
  src/util/macros.h|  2 ++
  4 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/configure.ac b/configure.ac
index 520cc22..1485bba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS
  _SAVE_CPPFLAGS=$CPPFLAGS

  dnl Compiler macros
-DEFINES=
+DEFINES=-DHAVE_AUTOCONF
  AC_SUBST([DEFINES])
  case $host_os in
  linux*|*-gnu*|gnu*)
diff --git a/include/c99_compat.h b/include/c99_compat.h
index 4fc91bc..62ccd46 100644
--- a/include/c99_compat.h
+++ b/include/c99_compat.h


c99_compat.h doesn't seem like the right location. I know it seems
like a nice place to add this since it's included everywhere, but I
worry that in a few years we're going to be cleaning it up like we've
been doing with compiler.h and friends.

I might make a separate header to define these? Not sure.


I can move the defines out of c99_compat.h into a separate header, e.g.,
mesa/include/fallbackconfig.h.


But I'd prefer to include fallbackconfig.h from c99_compat.h, as
c99_compat.h is pretty much guaranteed to be included all the time.



 Since
 probably all cases of #ifdef HAVE___* have a fallback, that runs the
 risk of never noticing that you weren't including the right header.

Precisely; this is all the more reason why it must be included from a
header that's included all the time.  If it depends on people adding the
include on a case-by-case basis, it is bound to fail, as nobody else but us
cares, and it will easily go unnoticed.




@@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a,
  #endif


+
+/* Fallback definitions, for when these headers are used by build systems which
+ * don't auto-detect these things.*/
+#ifndef HAVE_AUTOCONF


I'd rather flip this condition around and not modify configure.ac. But
maybe you can't do that because you're not actually building
everything with scons?


No biggie either way.


I don't know. This seems nuts. I really don't like adding stuff to the
autotools build system like this.


Sure.



I really don't know how to deal with this. What I'm hearing is that
even the custom scons build system you guys use isn't sufficient for
your own needs. You're not building the external source trees with the
same build system...?


I think you might be getting the wrong idea.

We don't build the .C files from external source trees.  But we do need
to include .h files, so we can interface with components in the Mesa tree.


That is, I only need the .h files to make sense on their own (with Mesa
components, namely mesa/src/gallium/include, and gallium auxiliary
libraries).  But we have so many inline functions, so many #ifdef
HAVE_foo, that unless all the defines match precisely, all hell
breaks loose.



Gallium has from the start been integrated (i.e. embedded) in a myriad of
places.  It was always meant as a framework to write any sort of 3D
driver, not just OpenGL drivers.  Things were much worse when Gallium
was used in Windows XP kernel land or on Windows CE.  I'm glad that neither
I nor anybody else has to deal with the quirkiness of keeping code portable
across these platforms any more.  Things are still much more uniform nowadays.




I mean, in all the build system work I've done I've tried to make sure
scons continues working -- doing things like adding these HAVE_*
definitions to it and such. It's kind of frustrating, and it's even
more frustrating when even that isn't sufficient.



All I'm doing here is basically moving your defines out of scons's python
files into C headers.  Conceptually it's doing pretty much the same
thing as before, but being in a header means that it's there for
all build systems to take.



Remember that Mesa itself is not just autoconf and SCons; there's also
the Android build system.


I don't like it any more than you do, but this is the world we live in: the
fact is that many platforms constrain how software must be built to the
point where using a single build system is impracticable/impossible.  Even
if a build system that met everybody's needs existed, we'd still face the
legacy of existing software using other build systems.




To be honest, IMHO, the Mesa source tree and build systems are a failure if
they can't even sustain external interfaces.



For many drivers, the external

Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES 3.00

2015-04-07 Thread Francisco Jerez

Tapani Pälli tapani.pa...@intel.com writes:

 On 04/07/2015 01:22 PM, Francisco Jerez wrote:
 Tapani Pälli tapani.pa...@intel.com writes:

 From: Kalyan Kondapally kalyan.kondapa...@intel.com

 Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
 Earlier versions allow 'constant-index-expression' indexing, where
 index can contain a loop induction variable.

 Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
 This change makes 'sampler-array-index.frag' parser test in Piglit
 pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.

 v2: small change and some more commit message (Tapani)

 Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225

 Looks good, but did you check what happens now if the shader uses actual
 variable indexing (i.e. which lowering cannot turn into a constant) on
 an implementation that doesn't support it?  Hopefully no crashes or
 hangs?

 I could test something like this; can you suggest a good victim
 platform and some ugly corner case? As a starter, I have a shader_test
 that uses an expression containing a uniform as the index.

I guess SNB would be bad enough.  The hardware actually supports
dynamically uniform indexing of surfaces but we don't implement it and
it would likely violate some assumptions in the back-end if it gets that
far.

 As a plan B, I think loop analysis could store some information which
 can then be used for additional validation of the array index in a later
 step (skip it in AST and check only later for ES 1.00).


Yeah, well.  It seems rather annoying to get right at the GLSL IR level
too, because you'd have to traverse variable defs and built-in function
calls, except maybe after optimization (after loop unrolling and
constant folding at least).  At that point a valid ESSL 1.0 program
should only have sampler arrays indexed by constants, which will probably
make your job easier.

 ---
   src/glsl/ast_array_index.cpp | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
 index ecef651..b2609b6 100644
 --- a/src/glsl/ast_array_index.cpp
 +++ b/src/glsl/ast_array_index.cpp
 @@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
  * dynamically uniform expression is undefined.
  */
 if (array->type->element_type()->is_sampler()) {
 -if (!state->is_version(130, 100)) {
 +if (!state->is_version(130, 300)) {
 if (state->es_shader) {
_mesa_glsl_warning(&loc, state,
   "sampler arrays indexed with non-constant "
 --
 2.1.0


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips

Weird, this seems to regress

bin/arb_shader_texture_lod-texgrad
bin/arb_shader_texture_lod-texgradcube

Visually they look the same, but piglit finds small differences.

On Tue, Apr 7, 2015 at 2:20 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 Pushing this through a complete piglit run, but it seems to fix

   bin/arb-provoking-vertex-render

 on a3xx. Please take special care to double-check that I didn't mess
 up cw/ccw order or something. I'm especially weak on the quadstrip
 case.

  src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py 
 b/src/gallium/auxiliary/indices/u_indices_gen.py
 index 687a717..b17d132 100644
 --- a/src/gallium/auxiliary/indices/u_indices_gen.py
 +++ b/src/gallium/auxiliary/indices/u_indices_gen.py
 @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, 
 outpv ):
  tri( intype, outtype, ptr, v2, v0, v1 )

  def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
 -do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 -do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +if inpv == LAST:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +else:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v0, v3, v2, inpv, outpv );

 Erm, make that v0, v1, v2; v0, v2, v3. Oops :)


  def name(intype, outtype, inpv, outpv, pr, prim):
  if intype == GENERATE:
 @@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
  print ' i += 4;'
  print ' goto restart;'
  print '  }'
 -do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, 
 outpv );
 +if inpv == LAST:
 +do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', 
 inpv, outpv );
 +else:
 +do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', 
 inpv, outpv );
  print '   }'
  postamble()

 --
 2.0.5


Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips

2015-04-07 Thread Roland Scheidegger

It will look different with llvmpipe if you use the right debug
variables (GALLIVM_DEBUG=no_brilinear,no_quad_lod,no_rho_approx), though
still fail.
I think the test may not be really valid. This is because if you use
texgrad, the driver/hw probably will (or should) use per-pixel lod. But
if you don't, it is of course per-quad. For the smallest mip it will
only give the same results here if you pick the right (top/bottom,
left/right) values for doing the lod calculations with implicit lod
(that is, the one from the actually active pixel in the quad, so if the
active pixel was top/left you must calculate ddx with the top values,
and ddy with the left values). And I don't think that is a requirement
anywhere. At least that's what I remember...
And the cube test is probably not quite right either (though llvmpipe
passes it with those mentioned variables, that is more due to the
implementation of cube mapping though - the cube face selection must be
done per pixel and not per quad and there's tons of code to get lods
right be it implicit or explicit).
Don't know though why the test would regress this, as it shouldn't
affect it at all with last provoking vertex.
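
The per-quad versus per-pixel point can be illustrated with a toy
calculation. This is only a sketch under loud assumptions: a single
scalar texture coordinate, lod = log2(max(|ddx|, |ddy|) * size), and a
per-quad scheme that reuses the top/left pixel's derivatives for all four
pixels. Real drivers and hardware use 2D coordinates and approximations;
the names derivatives() and lod() are made up for this note.

```python
import math


def derivatives(quad, row, col):
    """ddx/ddy for pixel (row, col) of a 2x2 quad of scalar coordinates."""
    ddx = quad[row][1] - quad[row][0]  # difference along this pixel's row
    ddy = quad[1][col] - quad[0][col]  # difference along this pixel's column
    return ddx, ddy


def lod(ddx, ddy, tex_size):
    rho = max(abs(ddx), abs(ddy)) * tex_size
    return math.log2(rho)


# Coordinates that are not linear across the quad.
quad = [[0.00, 0.10],
        [0.02, 0.20]]

per_quad = lod(*derivatives(quad, 0, 0), tex_size=64)
per_pixel = [[lod(*derivatives(quad, r, c), tex_size=64) for c in (0, 1)]
             for r in (0, 1)]

print("per-quad lod:", per_quad)
print("per-pixel lods:", per_pixel)
```

With these numbers the bottom row gets a larger lod than the per-quad
value, so an explicit-gradient path and an implicit path only agree if
both use the derivatives belonging to the active pixel.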

Am 07.04.2015 um 16:28 schrieb Ilia Mirkin:
 Oh fun, those tests also fail with nvc0 and llvmpipe. But pass on
 softpipe. (The llvmpipe fail is visually different from the nvc0 and
 freedreno/a3xx one though.)
 
 On Tue, Apr 7, 2015 at 10:25 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Weird, this seems to regress

 bin/arb_shader_texture_lod-texgrad
 bin/arb_shader_texture_lod-texgradcube

 Visually they look the same, but piglit finds small differences.

 On Tue, Apr 7, 2015 at 2:20 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 Pushing this through a complete piglit run, but it seems to fix

   bin/arb-provoking-vertex-render

 on a3xx. Please take special care to double-check that I didn't mess
 up cw/ccw order or something. I'm especially weak on the quadstrip
 case.

  src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py 
 b/src/gallium/auxiliary/indices/u_indices_gen.py
 index 687a717..b17d132 100644
 --- a/src/gallium/auxiliary/indices/u_indices_gen.py
 +++ b/src/gallium/auxiliary/indices/u_indices_gen.py
 @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, 
 outpv ):
  tri( intype, outtype, ptr, v2, v0, v1 )

  def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
 -do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 -do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +if inpv == LAST:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +else:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v0, v3, v2, inpv, outpv );

 Erm, make that v0, v1, v2; v0, v2, v3. Oops :)


  def name(intype, outtype, inpv, outpv, pr, prim):
  if intype == GENERATE:
 @@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
  print ' i += 4;'
  print ' goto restart;'
  print '  }'
 -do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, 
 outpv );
 +if inpv == LAST:
 +do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', 
 inpv, outpv );
 +else:
 +do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', 
 inpv, outpv );
  print '   }'
  postamble()

 --
 2.0.5

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
  
 


Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.

2015-04-07 Thread Emil Velikov

On 7 April 2015 at 13:14, Jose Fonseca jfons...@vmware.com wrote:
 Sorry for the delay. I've been away during the Easter.

 On 02/04/15 19:02, Matt Turner wrote:

 On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote:

 These were being defined in SCons, but it's not practical -- we actually
 need to include Gallium headers from external source trees, with
 completely disjoint build infrastructure, and it's unsustainable to
 replicate the HAVE_xxx checks or even hard-coded defines across
 everywhere.


 To confirm, you're building external sources with gcc? I don't think
 these macros are useful for MSVC.


 Correct.



 No actual change in behavior for autoconf.
 ---
   configure.ac |  2 +-
   include/c99_compat.h | 45 +
   scons/gallium.py | 27 ---
   src/util/macros.h|  2 ++
   4 files changed, 48 insertions(+), 28 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 520cc22..1485bba 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS
   _SAVE_CPPFLAGS=$CPPFLAGS

   dnl Compiler macros
 -DEFINES=
 +DEFINES=-DHAVE_AUTOCONF
   AC_SUBST([DEFINES])
   case $host_os in
   linux*|*-gnu*|gnu*)
 diff --git a/include/c99_compat.h b/include/c99_compat.h
 index 4fc91bc..62ccd46 100644
 --- a/include/c99_compat.h
 +++ b/include/c99_compat.h


 c99_compat.h doesn't seem like the right location. I know it seems
 like a nice place to add this since it's included everywhere, but I
 worry that in a few years we're going to be cleaning it up like we've
 been doing with compiler.h and friends.

 I might make a separate header to define these? Not sure.


 I can move the defines out of c99_compat.h , e.g.,
 mesa/include/fallbackconfig.h.

 But I'd prefer to include fallbackconfig.h out of c99_compat.h , as
 c99_compat.h is pretty much guaranteed to be included all the time.


 Since
 probably all cases of #ifdef HAVE___* have a fallback, that runs the
 risk of never noticing that you weren't including the right header.

 Precisely, this is all the more reason why it must be included from a header
 that's included all the time.  If it depends on people to add the include on
 a case-by-case it is bound to fail, as nobody else but us cares, and it will
 easily go unnoticed.


 @@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a,
   #endif


 +
 +/* Fallback definitions, for when these headers are used by build
 systems which
 + * don't auto-detect these things.*/
 +#ifndef HAVE_AUTOCONF


 I'd rather flip this condition around and not modify configure.ac. But
 maybe you can't do that because you're not actually building
 everything with scons?


 No biggie either way.

 I don't know. This seems nuts. I really don't like adding stuff to the
 autotools build system like this.


 Sure.


 I really don't know how to deal with this. What I'm hearing is that
 even the custom scons build system you guys use isn't sufficient for
 your own needs. You're not building the external source trees with the
 same build system...?


 I think you might be getting the wrong idea.

 We don't build the .C files from external source trees.  But we do need to
 include .h files, so we can interface with components in Mesa tree.

 That is, I only need the .h files to make sense on their own (with Mesa
 components, namely mesa/src/gallium/include, and gallium auxiliary
 libraries).  But we have so many inline functions, so many #ifdef HAVE_foo,
 that unless all the defines match precisely, all hell breaks loose.


 Gallium has from the start been integrated (ie. embedded) on a myriad of
 places.  It was always meant as a framework to write any sort of 3d driver,
 not just OpenGL drivers.  Things were much worse when Gallium was used on
 Windows XP kernel land or Windows CE.  I'm glad that neither I nor anybody
 else has to deal with the quirkiness of keeping code portable across these
 platforms any more.
 Things are still much more uniform nowadays.


 I mean, in all the build system work I've done I've tried to make sure
 scons continues working -- doing things like adding these HAVE_*
 definitions to it and such. It's kind of frustrating, and it's even
 more frustrating when even that isn't sufficient.



 All I'm doing here is basically move your defines out of scons's python
 files into C headers.  Conceptually it's doing pretty much the same thing as
 before, but being in a header that means that it's there for all build
 systems to take.


 Remember that Mesa itself is not just autoconf and SCons, there's also
 the Android build system.

 I don't like it any more than you do, but this is the world we live in: the
 fact is that many platforms constrain how software must be built to a point
 which is impracticable/impossible to build.  Even if a build system that
 meets everybody needs existed, we'd still face the legacy of existing
 software using other build systems.



 To be honest, IMHO, Mesa source

Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips

Oh fun, those tests also fail with nvc0 and llvmpipe. But pass on
softpipe. (The llvmpipe fail is visually different from the nvc0 and
freedreno/a3xx one though.)

On Tue, Apr 7, 2015 at 10:25 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Weird, this seems to regress

 bin/arb_shader_texture_lod-texgrad
 bin/arb_shader_texture_lod-texgradcube

 Visually they look the same, but piglit finds small differences.

 On Tue, Apr 7, 2015 at 2:20 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 Pushing this through a complete piglit run, but it seems to fix

   bin/arb-provoking-vertex-render

 on a3xx. Please take special care to double-check that I didn't mess
 up cw/ccw order or something. I'm especially weak on the quadstrip
 case.

  src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py 
 b/src/gallium/auxiliary/indices/u_indices_gen.py
 index 687a717..b17d132 100644
 --- a/src/gallium/auxiliary/indices/u_indices_gen.py
 +++ b/src/gallium/auxiliary/indices/u_indices_gen.py
 @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, 
 outpv ):
  tri( intype, outtype, ptr, v2, v0, v1 )

  def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
 -do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 -do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +if inpv == LAST:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +else:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v0, v3, v2, inpv, outpv );

 Erm, make that v0, v1, v2; v0, v2, v3. Oops :)


  def name(intype, outtype, inpv, outpv, pr, prim):
  if intype == GENERATE:
 @@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
  print ' i += 4;'
  print ' goto restart;'
  print '  }'
 -do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, 
 outpv );
 +if inpv == LAST:
 +do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', 
 inpv, outpv );
 +else:
 +do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', 
 inpv, outpv );
  print '   }'
  postamble()

 --
 2.0.5


Re: [Mesa-dev] [PATCH 4/5] nir: Allocate dereferences out of their parent instruction or deref.

Other than my nitpicking below this looks great! Thanks for working on this!
On Apr 7, 2015 2:32 AM, Kenneth Graunke kenn...@whitecape.org wrote:

 Jason pointed out that variable dereferences in NIR are really part of
 their parent instruction, and should have the same lifetime.

 Unlike in GLSL IR, they're not used very often - just for intrinsic
 variables, call parameters & return, and indirect samplers for
 texturing.  Also, nir_deref_var is the top-level concept, and
 nir_deref_array/nir_deref_record are child nodes.

 This patch attempts to allocate nir_deref_vars out of their parent
 instruction, and any sub-dereferences out of their parent deref.
 It enforces these restrictions in the validator as well.

 This means that freeing an instruction should free its associated
 dereference chain as well.  The memory sweeper pass can also happily
 ignore them.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/glsl/nir/glsl_to_nir.cpp| 47
-
  src/glsl/nir/nir.c  |  6 ++---
  src/glsl/nir/nir_lower_var_copies.c |  8 +++
  src/glsl/nir/nir_split_var_copies.c |  4 ++--
  src/glsl/nir/nir_validate.c | 13 ++
  src/mesa/program/prog_to_nir.c  |  9 ---
  6 files changed, 45 insertions(+), 42 deletions(-)

 This is still a lot of churn, but surprisingly about even on LOC.
 With the validator code in place, I suspect we can get this right
 going forward without too much trouble.

 diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp
 index 80c5b3a..f61a47a 100644
 --- a/src/glsl/nir/glsl_to_nir.cpp
 +++ b/src/glsl/nir/glsl_to_nir.cpp
 @@ -88,6 +88,8 @@ private:
 exec_list *cf_node_list;
 nir_instr *result; /* result of the expression tree last visited */

 +   nir_deref_var *make_deref(void *mem_ctx, ir_instruction *ir);
 +
 /* the head of the dereference chain we're creating */
 nir_deref_var *deref_head;
 /* the tail of the dereference chain we're creating */
 @@ -156,6 +158,14 @@ nir_visitor::~nir_visitor()
 _mesa_hash_table_destroy(this->overload_table, NULL);
  }

 +nir_deref_var *
 +nir_visitor::make_deref(void *mem_ctx, ir_instruction *ir)

I'm not a huge fan of the name. Maybe evaluate_deref to match
evaluate_rvalue or perhaps build_deref?  In any case, it doesn't really
matter so I won't quibble.

It should, however, take a nir_instr instead of a void as its memory
context.  That makes it a bit more explicit.

 +{
 +   ir->accept(this);
 +   ralloc_steal(mem_ctx, this->deref_head);
 +   return this->deref_head;
 +}
 +
  static nir_constant *
  constant_copy(ir_constant *ir, void *mem_ctx)
  {
 @@ -582,13 +592,11 @@ void
  nir_visitor::visit(ir_return *ir)
  {
 if (ir->value != NULL) {
 -  ir->value->accept(this);
nir_intrinsic_instr *copy =
   nir_intrinsic_instr_create(this->shader,
nir_intrinsic_copy_var);

 -  copy->variables[0] = nir_deref_var_create(this->shader,
 -this->impl->return_var);
 -  copy->variables[1] = this->deref_head;
 +  copy->variables[0] = nir_deref_var_create(copy,
this->impl->return_var);
 +  copy->variables[1] = make_deref(copy, ir->value);
 }

 nir_jump_instr *instr = nir_jump_instr_create(this->shader,
nir_jump_return);
 @@ -613,8 +621,7 @@ nir_visitor::visit(ir_call *ir)
nir_intrinsic_instr *instr = nir_intrinsic_instr_create(shader,
op);
ir_dereference *param =
   (ir_dereference *) ir->actual_parameters.get_head();
 -  param->accept(this);
 -  instr->variables[0] = this->deref_head;
 +  instr->variables[0] = make_deref(instr, param);
nir_ssa_dest_init(&instr->instr, &instr->dest, 1, NULL);

nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
 @@ -623,8 +630,7 @@ nir_visitor::visit(ir_call *ir)
   nir_intrinsic_instr_create(shader, nir_intrinsic_store_var);
store_instr-num_components = 1;

 -  ir->return_deref->accept(this);
 -  store_instr->variables[0] = this->deref_head;
 +  store_instr->variables[0] = make_deref(store_instr,
ir->return_deref);
store_instr->src[0].is_ssa = true;
store_instr->src[0].ssa = &instr->dest.ssa;

 @@ -642,13 +648,11 @@ nir_visitor::visit(ir_call *ir)

 unsigned i = 0;
 foreach_in_list(ir_dereference, param, &ir->actual_parameters) {
 -  param->accept(this);
 -  instr->params[i] = this->deref_head;
 +  instr->params[i] = make_deref(instr, param);
i++;
 }

 -   ir->return_deref->accept(this);
 -   instr->return_deref = this->deref_head;
 +   instr->return_deref = make_deref(instr, ir->return_deref);
 nir_instr_insert_after_cf_list(this->cf_node_list, &instr->instr);
  }

 @@ -663,12 +667,8 @@ nir_visitor::visit(ir_assignment *ir)
nir_intrinsic_instr *copy =
   nir_intrinsic_instr_create(this->shader,
nir_intrinsic_copy_var);

 -  ir->lhs->accept(this);
 -  copy->variables[0] = this->deref_head;
 -
 -

Re: [Mesa-dev] [PATCH 5/5] nir: Implement a nir_sweep() pass.

On Apr 7, 2015 2:32 AM, Kenneth Graunke kenn...@whitecape.org wrote:

 This pass performs a mark and sweep pass over a nir_shader's associated
 memory - anything still connected to the program will be kept, and any
 dead memory we dropped on the floor will be freed.

 The expectation is that this will be called when finished building and
 optimizing the shader.  However, it's also fine to call it earlier, and
 many times, to free up memory earlier.

 v2: (feedback from Jason Ekstrand)
 - Skip sweeping impl->start_block, as it's already in the CF list.
 - Don't sweep SSA defs (they're owned by their defining instruction)
 - Don't steal phi sources (they're owned by nir_phi_instr).
 - Don't steal tex->src (it's owned by the tex_inst itself)
 - Don't sweep dereference chains (top-level dereferences are owned by
   the instruction; sub-dereferences are owned by the parent deref).
 - Don't sweep sources and destinations (SSA defs are handled as part of
   the defining instruction, and registers are handled as part of
   function implementations).
 - Just steal instructions; don't walk them (no longer required).
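
 The ownership rules above can be pictured with a toy model in Python.
 This is only a sketch, not the patch's code: Node, steal(), free() and
 sweep() are stand-ins written for this note that mimic the parent/child
 behaviour of ralloc contexts, and the "rubbish" context used to collect
 everything before reclaiming live nodes is an assumption about how such
 a sweep can be arranged.

```python
class Node:
    """A toy stand-in for a ralloc allocation: every node has one parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        self.freed = False
        if parent is not None:
            steal(parent, self)


def steal(new_parent, node):
    """Reparent node; its subtree follows along, like ralloc_steal()."""
    if node.parent is not None:
        node.parent.children.remove(node)
    node.parent = new_parent
    new_parent.children.append(node)


def free(node):
    """Free a node and everything still parented to it, like ralloc_free()."""
    for child in list(node.children):
        free(child)
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None
    node.freed = True


def sweep(shader, live):
    """Keep only those direct children of shader that are listed in live."""
    rubbish = Node("rubbish")
    for child in list(shader.children):
        steal(rubbish, child)   # assume everything is dead...
    for node in live:
        steal(shader, node)     # ...then reclaim what is still connected
    free(rubbish)               # whatever is left was dropped on the floor


shader = Node("shader")
block = Node("block", shader)
instr = Node("instr", shader)
deref = Node("deref", instr)    # owned by its instruction, not the shader
dead = Node("dead_instr", shader)

sweep(shader, [block, instr])
print([c.name for c in shader.children], deref.freed, dead.freed)
```

 Because the deref is parented to its instruction, stealing the
 instruction is enough to keep it alive, which is why the sweep does not
 need to walk dereference chains.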

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/glsl/Makefile.sources |   1 +
  src/glsl/nir/nir.h|   2 +
  src/glsl/nir/nir_sweep.c  | 151
++
  3 files changed, 154 insertions(+)
  create mode 100644 src/glsl/nir/nir_sweep.c

 This version is much simpler (= faster), thanks to the earlier changes.

 diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
 index 9bdcb80..c471eca 100644
 --- a/src/glsl/Makefile.sources
 +++ b/src/glsl/Makefile.sources
 @@ -59,6 +59,7 @@ NIR_FILES = \
 nir/nir_search.c \
 nir/nir_search.h \
 nir/nir_split_var_copies.c \
 +   nir/nir_sweep.c \
 nir/nir_to_ssa.c \
 nir/nir_types.h \
 nir/nir_validate.c \
 diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
 index e6b7684..0f72301 100644
 --- a/src/glsl/nir/nir.h
 +++ b/src/glsl/nir/nir.h
 @@ -1650,6 +1650,8 @@ bool nir_opt_peephole_ffma(nir_shader *shader);

  bool nir_opt_remove_phis(nir_shader *shader);

 +void nir_sweep(nir_shader *shader);
 +
  #ifdef __cplusplus
  } /* extern "C" */
  #endif
 diff --git a/src/glsl/nir/nir_sweep.c b/src/glsl/nir/nir_sweep.c
 new file mode 100644
 index 000..b33d624
 --- /dev/null
 +++ b/src/glsl/nir/nir_sweep.c
 @@ -0,0 +1,151 @@
 +/*
 + * Copyright © 2015 Intel Corporation
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining
a
 + * copy of this software and associated documentation files (the
Software),
 + * to deal in the Software without restriction, including without
limitation
 + * the rights to use, copy, modify, merge, publish, distribute,
sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice (including the
next
 + * paragraph) shall be included in all copies or substantial portions of
the
 + * Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR
 + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT
SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
OTHER
 + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
ARISING
 + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
DEALINGS
 + * IN THE SOFTWARE.
 + */
 +
 +#include "nir.h"
 +
 +/**
 + * \file nir_sweep.c
 + *
 + * The nir_sweep() pass performs a mark and sweep pass over a
nir_shader's associated
 + * memory - anything still connected to the program will be kept, and
any dead memory
 + * we dropped on the floor will be freed.
 + *
 + * The expectation is that drivers should call this when finished
compiling the shader
 + * (after any optimization, lowering, and so on).  However, it's also
fine to call it
 + * earlier, and even many times, trading CPU cycles for memory savings.
 + */
 +
 +#define steal_list(mem_ctx, type, list) \
 +   foreach_list_typed(type, obj, node, list) { ralloc_steal(mem_ctx,
obj); }
 +
 +static void sweep_cf_node(nir_shader *nir, nir_cf_node *cf_node);
 +
 +static void
 +sweep_block(nir_shader *nir, nir_block *block)
 +{
 +   ralloc_steal(nir, block);
 +
 +   nir_foreach_instr(block, instr) {
 +  ralloc_steal(nir, instr);

We still need to walk the non-ssa sources and steal any indirect register
uses.  Either that or ensure that they're allocated out of the instruction.

 +   }
 +}
 +
 +static void
 +sweep_if(nir_shader *nir, nir_if *iff)
 +{
 +   ralloc_steal(nir, iff);

If has a source that may have an indirect too.

With comments addressed, series is
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 +
 +   foreach_list_typed(nir_cf_node, cf_node, node, &iff->then_list) {
 +

[Mesa-dev] [PATCH v2] glsl: fix assignment of multiple scalar and vecs to matrices.

2015-04-07 Thread Samuel Iglesias Gonsalvez

When a vec has more elements than row components in a matrix, the
code could end up failing an assert inside assign_to_matrix_column().

This patch makes sure that when there is still room in the matrix for
more elements (but in other columns of the matrix), the data is actually
assigned.
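
As an illustration of the intended behaviour (not the patch's code), here
is a small Python sketch that fills a column-major matrix from a list of
constructor parameters, letting one parameter spill over into following
columns while slots remain. fill_matrix() and its arguments are
hypothetical names chosen for this note.

```python
def fill_matrix(cols, rows, params):
    """Fill a cols x rows column-major matrix from a list of parameters."""
    matrix = [[None] * rows for _ in range(cols)]
    remaining_slots = rows * cols
    col_idx = row_idx = 0
    for rhs in params:
        if remaining_slots == 0:
            break
        rhs_base = 0
        # Keep assigning until this parameter is used up or the matrix is full.
        while rhs_base < len(rhs) and remaining_slots > 0:
            count = min(len(rhs) - rhs_base, rows - row_idx)
            matrix[col_idx][row_idx:row_idx + count] = rhs[rhs_base:rhs_base + count]
            rhs_base += count
            row_idx += count
            remaining_slots -= count
            if row_idx == rows:
                col_idx += 1
                row_idx = 0
    return matrix


# float, bvec4, ivec2, bool -> mat4x2 (4 columns of 2 rows each)
params = [[1.0], [1.0, 0.0, 1.0, 0.0], [2.0, 3.0], [1.0]]
print(fill_matrix(4, 2, params))
```

In this example the bvec4 starts in the second row of column 0 and ends
in the first row of column 2, which is the multi-column case described
above.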

This patch fixes the following dEQP test:

  
dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_vertex
  
dEQP-GLES3.functional.shaders.conversions.matrix_combine.float_bvec4_ivec2_bool_to_mat4x2_fragment

Signed-off-by: Samuel Iglesias Gonsalvez sigles...@igalia.com
---

v2:
   - Improve the patch following Ben's comments.

 src/glsl/ast_function.cpp | 110 +-
 1 file changed, 49 insertions(+), 61 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 918be69..0010ffe 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -1370,71 +1370,59 @@ emit_inline_matrix_constructor(const glsl_type *type,
} else {
   const unsigned cols = type->matrix_columns;
   const unsigned rows = type->vector_elements;
+  unsigned remaining_slots = rows * cols;
   unsigned col_idx = 0;
   unsigned row_idx = 0;
 
   foreach_in_list(ir_rvalue, rhs, parameters) {
-const unsigned components_remaining_this_column = rows - row_idx;
-unsigned rhs_components = rhs->type->components();
-unsigned rhs_base = 0;
-
-/* Since the parameter might be used in the RHS of two assignments,
- * generate a temporary and copy the paramter there.
- */
-ir_variable *rhs_var =
-   new(ctx) ir_variable(rhs->type, "mat_ctor_vec", ir_var_temporary);
-instructions->push_tail(rhs_var);
-
-ir_dereference *rhs_var_ref =
-   new(ctx) ir_dereference_variable(rhs_var);
-ir_instruction *inst = new(ctx) ir_assignment(rhs_var_ref, rhs, NULL);
-instructions->push_tail(inst);
-
-/* Assign the current parameter to as many components of the matrix
- * as it will fill.
- *
- * NOTE: A single vector parameter can span two matrix columns.  A
- * single vec4, for example, can completely fill a mat2.
- */
-if (rhs_components >= components_remaining_this_column) {
-   const unsigned count = MIN2(rhs_components,
-   components_remaining_this_column);
-
-   rhs_var_ref = new(ctx) ir_dereference_variable(rhs_var);
-
-   ir_instruction *inst = assign_to_matrix_column(var, col_idx,
-  row_idx,
-  rhs_var_ref, 0,
-  count, ctx);
-   instructions->push_tail(inst);
-
-   rhs_base = count;
-
-   col_idx++;
-   row_idx = 0;
-}
-
-/* If there is data left in the parameter and components left to be
- * set in the destination, emit another assignment.  It is possible
- * that the assignment could be of a vec4 to the last element of the
- * matrix.  In this case col_idx==cols, but there is still data
- * left in the source parameter.  Obviously, don't emit an assignment
- * to data outside the destination matrix.
- */
-if ((col_idx < cols) && (rhs_base < rhs_components)) {
-   const unsigned count = rhs_components - rhs_base;
-
-   rhs_var_ref = new(ctx) ir_dereference_variable(rhs_var);
-
-   ir_instruction *inst = assign_to_matrix_column(var, col_idx,
-  row_idx,
-  rhs_var_ref,
-  rhs_base,
-  count, ctx);
-   instructions->push_tail(inst);
-
-   row_idx += count;
-}
+ unsigned rhs_components = rhs->type->components();
+ unsigned rhs_base = 0;
+
+ if (remaining_slots == 0)
+break;
+
+ /* Since the parameter might be used in the RHS of two assignments,
+  * generate a temporary and copy the paramter there.
+  */
+ ir_variable *rhs_var =
+new(ctx) ir_variable(rhs->type, "mat_ctor_vec", ir_var_temporary);
+ instructions->push_tail(rhs_var);
+
+ ir_dereference *rhs_var_ref =
+new(ctx) ir_dereference_variable(rhs_var);
+ ir_instruction *inst = new(ctx) ir_assignment(rhs_var_ref, rhs, NULL);
+ instructions->push_tail(inst);
+
+ do {
+/* Assign the current parameter to as many components of the matrix
+ * as it will fill.
+ *
+ * NOTE: A single vector parameter can span two matrix columns.  A
+ * single vec4, for example, can

Re: [Mesa-dev] [PATCH] indices: fix provoking vertex for quads/quadstrips

Mystery semi-solved? Previously u_primconvert would always select
*FIRST* provoking order when flatshading wasn't enabled, but the quads
would still follow the last logic. No big deal. I added support for
quads to be able to follow the provoking vertex convention, but now
the way that the quad is split into tris is different (to make it so
that both tris start with vertex 0). This probably tickles one of the
effects that you allude to.

Soo I just changed it to only look at flatshading_first, which
will now generally make it use the LAST provoking order. Problem
solved? Not really, but piglits pass, and this seems more consistent.
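
For reference, the two ways of splitting a quad discussed in this thread
can be sketched in Python. This is a simplified illustration, not the
generator code itself: split_quad() and provoking_vertex() are
hypothetical helpers, the FIRST case uses the corrected order from the
follow-up (v0, v1, v2; v0, v2, v3), and the further reordering that
do_tri applies when input and output conventions differ is left out.

```python
FIRST, LAST = "first", "last"


def split_quad(v0, v1, v2, v3, pv):
    """Split one quad into two triangles, keeping the quad's provoking vertex."""
    if pv == LAST:
        # Both triangles end with v3, the quad's last vertex.
        return [(v0, v1, v3), (v1, v2, v3)]
    # Both triangles start with v0, the quad's first vertex.
    return [(v0, v1, v2), (v0, v2, v3)]


def provoking_vertex(prim, pv):
    return prim[-1] if pv == LAST else prim[0]


quad = (10, 11, 12, 13)
for pv in (FIRST, LAST):
    tris = split_quad(*quad, pv)
    print(pv, tris, [provoking_vertex(t, pv) for t in tris])
```

Note that the two cases cut the quad along different diagonals (v1-v3
for LAST, v0-v2 for FIRST), which is the change in triangulation
mentioned above.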

On Tue, Apr 7, 2015 at 11:06 AM, Roland Scheidegger srol...@vmware.com wrote:
 It will look different with llvmpipe if you use the right debug
 variables (GALLIVM_DEBUG=no_brilinear,no_quad_lod,no_rho_approx), though
 still fail.
 I think the test may not be really valid. This is because if you use
 texgrad, the driver/hw probably will (or should) use per-pixel lod. But
 if you don't, it is of course per-quad. For the smallest mip it will
 only give the same results here if you pick the right (top/bottom,
 left/right) values for doing the lod calculations with implicit lod
 (that is, the one from the actually active pixel in the quad, so if the
 active pixel was top/left you must calculate ddx with the top values,
 and ddy with the left values). And I don't think that is a requirement
 anywhere. At least that's what I remember...
 And the cube test is probably not quite right either (though llvmpipe
 passes it with those mentioned variables, that is more due to the
 implementation of cube mapping though - the cube face selection must be
 done per pixel and not per quad and there's tons of code to get lods
 right be it implicit or explicit).
 Don't know though why the test would regress this, as it shouldn't
 affect it at all with last provoking vertex.

 Am 07.04.2015 um 16:28 schrieb Ilia Mirkin:
 Oh fun, those tests also fail with nvc0 and llvmpipe. But pass on
 softpipe. (The llvmpipe fail is visually different from the nvc0 and
 freedreno/a3xx one though.)

 On Tue, Apr 7, 2015 at 10:25 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Weird, this seems to regress

 bin/arb_shader_texture_lod-texgrad
 bin/arb_shader_texture_lod-texgradcube

 Visually they look the same, but piglit finds small differences.

 On Tue, Apr 7, 2015 at 2:20 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Tue, Apr 7, 2015 at 1:44 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 Pushing this through a complete piglit run, but it seems to fix

   bin/arb-provoking-vertex-render

 on a3xx. Please take special care to double-check that I didn't mess
 up cw/ccw order or something. I'm especially weak on the quadstrip
 case.

  src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
  1 file changed, 10 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py 
 b/src/gallium/auxiliary/indices/u_indices_gen.py
 index 687a717..b17d132 100644
 --- a/src/gallium/auxiliary/indices/u_indices_gen.py
 +++ b/src/gallium/auxiliary/indices/u_indices_gen.py
 @@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, 
 outpv ):
  tri( intype, outtype, ptr, v2, v0, v1 )

  def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
 -do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 -do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +if inpv == LAST:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
 +else:
 +do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
 +do_tri( intype, outtype, ptr+'+3',  v0, v3, v2, inpv, outpv );

 Erm, make that v0, v1, v2; v0, v2, v3. Oops :)


  def name(intype, outtype, inpv, outpv, pr, prim):
  if intype == GENERATE:
 @@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
  print ' i += 4;'
  print ' goto restart;'
  print '  }'
 -do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, 
 outpv );
 +if inpv == LAST:
 +do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', 
 inpv, outpv );
 +else:
 +do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', 
 inpv, outpv );
  print '   }'
  postamble()

 --
 2.0.5


[Mesa-dev] [PATCH v2 1/2] primconvert: select pv convention only from flatshade_first

This should match how drivers program the hardware. It shouldn't matter
when flatshading isn't in effect, but somehow it seems to.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---
 src/gallium/auxiliary/indices/u_primconvert.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/indices/u_primconvert.c 
b/src/gallium/auxiliary/indices/u_primconvert.c
index 00e65aa..70d3e85 100644
--- a/src/gallium/auxiliary/indices/u_primconvert.c
+++ b/src/gallium/auxiliary/indices/u_primconvert.c
@@ -104,8 +104,7 @@ util_primconvert_save_rasterizer_state(struct primconvert_context *pc,
 * we would actually need to save/restore rasterizer state.  As
 * it is, we just need to make note of the pv.
 */
-   pc->api_pv = (rast->flatshade &&
-                 !rast->flatshade_first) ? PV_LAST : PV_FIRST;
+   pc->api_pv = rast->flatshade_first ? PV_FIRST : PV_LAST;
 }
 
 void
-- 
2.0.5
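
For illustration only (not part of the patch): a small Python sketch comparing the old and new api_pv selection over all four rasterizer-state combinations. It assumes the removed expression is `rast->flatshade && !rast->flatshade_first`; the function names are made up for this example.

```python
from itertools import product

PV_FIRST, PV_LAST = "PV_FIRST", "PV_LAST"

def api_pv_old(flatshade, flatshade_first):
    # Assumed reading of the removed lines.
    return PV_LAST if (flatshade and not flatshade_first) else PV_FIRST

def api_pv_new(flatshade, flatshade_first):
    # The added line: only flatshade_first decides.
    return PV_FIRST if flatshade_first else PV_LAST

# Collect the (flatshade, flatshade_first) states where the result changes.
changed = [(fs, ff) for fs, ff in product((False, True), repeat=2)
           if api_pv_old(fs, ff) != api_pv_new(fs, ff)]
```

Under that assumption the only state that changes is flatshade off with flatshade_first off, which now selects PV_LAST instead of PV_FIRST.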


Re: [Mesa-dev] [RFC 0/2] nir and ttn support for indirect/arrays

On Tue, Apr 7, 2015 at 8:52 AM, Rob Clark robdcl...@gmail.com wrote:
 From: Rob Clark robcl...@freedesktop.org

 Introduce intrinsics to load/store global vars (since I'm not sure what
 the point is to have global as a var type if there is no way to access
 it, but maybe I'm missing something), and update ttn to generate global
 variables for arrays.

By and large, variables are supposed to be accessed with
nir_intrinsic_load/store_var.  We then (optionally) lower to explicit
index+offset intrinsics for shader inputs/outputs to make things
easier on the backends.  However, globals and locals are expected to
be lowered to registers not intrinsics.  With TGSI, I think they have
an input/output register file, so it was easier for Eric to simply use
the index+offset intrinsics right from the start.
--Jason

 So, for example:

FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL OUT[0], COLOR
DCL CONST[0..1]
DCL TEMP[0..2], ARRAY(1), LOCAL
DCL TEMP[3..4], LOCAL
DCL ADDR[0]
IMM[0] FLT32 {1., 2., 3., 0.}
IMM[1] FLT32 {4., 5., 6., 7.}
IMM[2] FLT32 {7., 8., 9., 0.}
  0: MOV TEMP[0], IMM[0].xyzx
  1: MOV TEMP[1], IMM[1].xyzx
  2: MOV TEMP[2], IMM[2].xyzx
  3: UARL ADDR[0].x, CONST[0].
  4: FSEQ TEMP[3].xyz, TEMP[ADDR[0].x](1).xyzz, CONST[1].xyzz
  5: AND TEMP[3].y, TEMP[3]., TEMP[3].
  6: AND TEMP[3].x, TEMP[3]., TEMP[3].
  7: UCMP TEMP[4], TEMP[3]., IMM[0].wxwx, TEMP[4]
  8: NOT TEMP[3].x, TEMP[3].
  9: UCMP TEMP[4], TEMP[3]., IMM[0].xwwx, TEMP[4]
 10: MOV OUT[0], TEMP[4]
 11: END

 becomes:

decl_var uniform  vec4[2] uniform_0 (0, 0)
decl_var shader_out  vec4 out_0 (1, 0)
decl_overload main returning void

impl main {
   block block_0:
   /* preds: */
   vec4 ssa_0 = load_const (0x3f800000 /* 1.000000 */, 0x40000000 /* 2.000000 */, 0x40400000 /* 3.000000 */, 0x00000000 /* 0.000000 */
   vec4 ssa_233 = load_const (0x3f800000 /* 1.000000 */, 0x40000000 /* 2.000000 */, 0x40400000 /* 3.000000 */, 0x3f800000 /* 1.000000 */
   intrinsic store_global (ssa_233) () (0, 1)
   vec4 ssa_234 = load_const (0x40800000 /* 4.000000 */, 0x40a00000 /* 5.000000 */, 0x40c00000 /* 6.000000 */, 0x40800000 /* 4.000000 */
   intrinsic store_global (ssa_234) () (1, 1)
   vec4 ssa_235 = load_const (0x40e00000 /* 7.000000 */, 0x41000000 /* 8.000000 */, 0x41100000 /* 9.000000 */, 0x40e00000 /* 7.000000 */
   intrinsic store_global (ssa_235) () (2, 1)
   vec4 ssa_18 = intrinsic load_uniform () () (0, 1)
   vec4 ssa_25 = intrinsic load_global_indirect (ssa_18) () (0, 1)
   vec4 ssa_27 = intrinsic load_uniform () () (1, 1)
   vec1 ssa_132 = feq ssa_25, ssa_27
   vec1 ssa_133 = feq ssa_25.y, ssa_27.y
   vec1 ssa_134 = feq ssa_25.z, ssa_27.z
   vec1 ssa_77 = iand ssa_133, ssa_134
   vec1 ssa_79 = iand ssa_132, ssa_77
   vec1 ssa_244 = bcsel ssa_79, ssa_0.w, ssa_0
   vec1 ssa_246 = bcsel ssa_79, ssa_0, ssa_0.w
   vec1 ssa_254 = load_const (0x00000000 /* 0.000000 */
   vec1 ssa_255 = load_const (0x3f800000 /* 1.000000 */
   vec4 ssa_230 = vec4 ssa_244, ssa_246, ssa_254, ssa_255
   intrinsic store_output (ssa_230) () (0, 1)
   /* succs: block_1 */
   block block_1:
}

 note, in one of the opt passes the 'decl_var  vec4[3] arr_1' is getting lost
 but I haven't debugged that yet
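
For illustration only (not from the thread): the hex words in the load_const lines of the dump are the IEEE-754 single-precision bit patterns of the floats shown in the comments. A minimal Python sketch to reproduce them; `float_bits` is a helper name made up for this example.

```python
import struct

def float_bits(value):
    """Return the 32-bit IEEE-754 pattern of a float as an unsigned int."""
    return struct.unpack("<I", struct.pack("<f", value))[0]

# The immediates declared in the TGSI shader (IMM[0..2]).
for v in (0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0):
    print("0x%08x /* %f */" % (float_bits(v), v))
```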

 Rob Clark (2):
   nir: add intrinsics for load/store global
   gallium/ttn: add support for temp arrays

  src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 
 ++--
  src/glsl/nir/nir_intrinsics.h   |   7 +-
  2 files changed, 100 insertions(+), 23 deletions(-)

 --
 2.1.0


Re: [Mesa-dev] [PATCH v2 1/2] primconvert: select pv convention only from flatshade_first

2015-04-07 Thread Roland Scheidegger

This looks good to me. Note that in general it is not true that this
has no effect when flatshading isn't enabled. That only holds with
old-style semantics, where color is the only attribute which can be
flatshaded (and is done so with the rasterizer setting). It does not
hold if you have attributes which are declared with the flat
interpolation qualifier. Though this is probably another problem...

Roland

Am 07.04.2015 um 18:12 schrieb Ilia Mirkin:
 This should match how drivers program the hardware. It shouldn't matter
 when flatshading isn't in effect, but somehow it seems to.
 
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
  src/gallium/auxiliary/indices/u_primconvert.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)
 
 diff --git a/src/gallium/auxiliary/indices/u_primconvert.c 
 b/src/gallium/auxiliary/indices/u_primconvert.c
 index 00e65aa..70d3e85 100644
 --- a/src/gallium/auxiliary/indices/u_primconvert.c
 +++ b/src/gallium/auxiliary/indices/u_primconvert.c
 @@ -104,8 +104,7 @@ util_primconvert_save_rasterizer_state(struct primconvert_context *pc,
  * we would actually need to save/restore rasterizer state.  As
  * it is, we just need to make note of the pv.
  */
 -   pc->api_pv = (rast->flatshade &&
 -                 !rast->flatshade_first) ? PV_LAST : PV_FIRST;
 +   pc->api_pv = rast->flatshade_first ? PV_FIRST : PV_LAST;
  }
  
  void
 


Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.

2015-04-07 Thread Matt Turner

On Tue, Apr 7, 2015 at 5:14 AM, Jose Fonseca jfons...@vmware.com wrote:
 Sorry for the delay. I was away over Easter.

 On 02/04/15 19:02, Matt Turner wrote:

 On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote:

 These were being defined in SCons, but it's not practical -- we actually
 need to include Gallium headers from external source trees, with
 completely disjoint build infrastructure, and it's unsustainable to
 replicate the HAVE_xxx checks or even hard-coded defines across
 everywhere.


 To confirm, you're building external sources with gcc? I don't think
 these macros are useful for MSVC.


 Correct.



 No actual change in behavior for autoconf.
 ---
   configure.ac |  2 +-
   include/c99_compat.h | 45 +
   scons/gallium.py | 27 ---
   src/util/macros.h|  2 ++
   4 files changed, 48 insertions(+), 28 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 520cc22..1485bba 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS
   _SAVE_CPPFLAGS=$CPPFLAGS

   dnl Compiler macros
 -DEFINES=
 +DEFINES=-DHAVE_AUTOCONF
   AC_SUBST([DEFINES])
   case $host_os in
   linux*|*-gnu*|gnu*)
 diff --git a/include/c99_compat.h b/include/c99_compat.h
 index 4fc91bc..62ccd46 100644
 --- a/include/c99_compat.h
 +++ b/include/c99_compat.h


 c99_compat.h doesn't seem like the right location. I know it seems
 like a nice place to add this since it's included everywhere, but I
 worry that in a few years we're going to be cleaning it up like we've
 been doing with compiler.h and friends.

 I might make a separate header to define these? Not sure.


 I can move the defines out of c99_compat.h , e.g.,
 mesa/include/fallbackconfig.h.

 But I'd prefer to include fallbackconfig.h out of c99_compat.h , as
 c99_compat.h is pretty much guaranteed to be included all the time.


 Since
 probably all cases of #ifdef HAVE___* have a fallback, that runs the
 risk of never noticing that you weren't including the right header.

 Precisely, this is all the more reason why it must be included from a header
 that's included all the time.  If it depends on people adding the include on
 a case-by-case basis it is bound to fail, as nobody else but us cares, and it
 will easily go unnoticed.


 @@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a,
   #endif


 +
 +/* Fallback definitions, for when these headers are used by build
 systems which
 + * don't auto-detect these things.*/
 +#ifndef HAVE_AUTOCONF


 I'd rather flip this condition around and not modify configure.ac. But
 maybe you can't do that because you're not actually building
 everything with scons?


 No biggie either way.

 I don't know. This seems nuts. I really don't like adding stuff to the
 autotools build system like this.


 Sure.


 I really don't know how to deal with this. What I'm hearing is that
 even the custom scons build system you guys use isn't sufficient for
 your own needs. You're not building the external source trees with the
 same build system...?


 I think you might be getting the wrong idea.

 We don't build the .c files from external source trees.  But we do need to
 include .h files, so we can interface with components in the Mesa tree.

 That is, I only need the .h files to make sense on their own (with Mesa
 components, namely mesa/src/gallium/include, and gallium auxiliary
 libraries).  But we have so many inline functions, so many #ifdef HAVE_foo,
 that unless all the defines match precisely, all hell breaks loose.


 Gallium has from the start been integrated (i.e. embedded) in a myriad of
 places.  It was always meant as a framework to write any sort of 3D driver,
 not just OpenGL drivers.  Things were much worse when Gallium was used in
 Windows XP kernel land or on Windows CE.  I'm glad that neither I nor anybody
 else has to deal with the quirkiness of keeping code portable across these
 platforms any more.  Things are much more uniform nowadays.


 I mean, in all the build system work I've done I've tried to make sure
 scons continues working -- doing things like adding these HAVE_*
 definitions to it and such. It's kind of frustrating, and it's even
 more frustrating when even that isn't sufficient.



 All I'm doing here is basically move your defines out of scons's python
 files into C headers.  Conceptually it's doing pretty much the same thing as
 before, but being in a header that means that it's there for all build
 systems to take.


 Remember that Mesa itself is not just autoconf and SCons, there's also the
 Android build system.

 I don't like it any more than you do, but this is the world we live in: the
 fact is that many platforms constrain how software must be built, to the
 point where it is impracticable/impossible to build it any other way.  Even
 if a build system that meets everybody's needs existed, we'd still face the
 legacy of existing software using other build systems.



 To be honest, IMHO, Mesa

Re: [Mesa-dev] [PATCH] glsl: check for forced_language_version in is_version()

2015-04-07 Thread Brian Paul


Ping.

On 04/01/2015 02:38 PM, Brian Paul wrote:

This is a follow-on fix from the earlier "glsl: allow ForceGLSLVersion
to override #version directives" change.  Since we're not changing
the language_version field, we have to check forced_language_version
here.
---
  src/glsl/glsl_parser_extras.h | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 1f5478b..dae7864 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -105,8 +105,10 @@ struct _mesa_glsl_parse_state {
 {
unsigned required_version = this->es_shader ?
   required_glsl_es_version : required_glsl_version;
+  unsigned this_version = this->forced_language_version
+ ? this->forced_language_version : this->language_version;
return required_version != 0
-  && this->language_version >= required_version;
+  && this_version >= required_version;
 }

 bool check_version(unsigned required_glsl_version,
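
For illustration only (not part of the patch): a minimal Python sketch of the version check with the forced-version override. It assumes the comparison in the patch is `required_version != 0 && this_version >= required_version`; the flat argument list is made up for this example, whereas the real code reads these fields from the parse state.

```python
def is_version(language_version, forced_language_version, es_shader,
               required_glsl_version, required_glsl_es_version):
    """Sketch of is_version(): a non-zero forced version wins over #version."""
    required = required_glsl_es_version if es_shader else required_glsl_version
    this_version = forced_language_version or language_version
    return required != 0 and this_version >= required

# Shader says "#version 110", but the version is forced to 130.
forced = is_version(110, 130, False, 130, 300)
# Same shader without the override.
unforced = is_version(110, 0, False, 130, 300)
```

A required version of 0 means the feature is not available in that language flavour at all, so the check fails regardless of the override.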




[Mesa-dev] [PATCH v2 2/2] indices: fix provoking vertex for quads/quadstrips

This allows drivers to provide consistent flat shading for quads.
Otherwise a driver that only supported tris would have to force last
provoking vertex when drawing quads (and would have to say that quads
don't follow the provoking vertex convention).

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

vmware folks -- Please test this out with svga. I see that you might
have a similar issue with how you determine api_pv as
u_primconvert. Also if you don't expect quads to follow, you can
always just pass in pv == LAST, but then you still end up having to
force your rast state to pv = LAST too, for quads.

A good piglit to play with is:

  bin/arb-provoking-vertex-render

 src/gallium/auxiliary/indices/u_indices_gen.py | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/auxiliary/indices/u_indices_gen.py 
b/src/gallium/auxiliary/indices/u_indices_gen.py
index 687a717..97c8e0d 100644
--- a/src/gallium/auxiliary/indices/u_indices_gen.py
+++ b/src/gallium/auxiliary/indices/u_indices_gen.py
@@ -142,8 +142,12 @@ def do_tri( intype, outtype, ptr, v0, v1, v2, inpv, outpv ):
 tri( intype, outtype, ptr, v2, v0, v1 )
 
 def do_quad( intype, outtype, ptr, v0, v1, v2, v3, inpv, outpv ):
-do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
-do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
+if inpv == LAST:
+do_tri( intype, outtype, ptr+'+0',  v0, v1, v3, inpv, outpv );
+do_tri( intype, outtype, ptr+'+3',  v1, v2, v3, inpv, outpv );
+else:
+do_tri( intype, outtype, ptr+'+0',  v0, v1, v2, inpv, outpv );
+do_tri( intype, outtype, ptr+'+3',  v0, v2, v3, inpv, outpv );
 
 def name(intype, outtype, inpv, outpv, pr, prim):
 if intype == GENERATE:
@@ -331,7 +335,10 @@ def quadstrip(intype, outtype, inpv, outpv, pr):
 print ' i += 4;'
 print ' goto restart;'
 print '  }'
-do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+if inpv == LAST:
+do_quad( intype, outtype, 'out+j', 'i+2', 'i+0', 'i+1', 'i+3', inpv, outpv );
+else:
+do_quad( intype, outtype, 'out+j', 'i+0', 'i+1', 'i+3', 'i+2', inpv, outpv );
 print '   }'
 postamble()
 
-- 
2.0.5
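
For illustration only (not part of the patch): a small Python sketch of the vertex orders used by the patched do_quad and quadstrip, checking that both triangles keep the quad's provoking vertex. The helper names are made up for this example, and the extra rotation do_tri applies when inpv != outpv is ignored.

```python
FIRST, LAST = "first", "last"

def split_quad(quad, pv):
    """Split a quad into two triangles using the patched do_quad orders."""
    v0, v1, v2, v3 = quad
    if pv == LAST:
        return [(v0, v1, v3), (v1, v2, v3)]
    return [(v0, v1, v2), (v0, v2, v3)]

def quadstrip_quad(i, pv):
    """Vertex order the patched quadstrip passes to do_quad for quad at i."""
    if pv == LAST:
        return (i + 2, i + 0, i + 1, i + 3)
    return (i + 0, i + 1, i + 3, i + 2)

def provoking(prim, pv):
    """Provoking vertex of a primitive under the given convention."""
    return prim[-1] if pv == LAST else prim[0]

results = {}
for pv in (FIRST, LAST):
    quad = quadstrip_quad(0, pv)
    tris = split_quad(quad, pv)
    results[pv] = (quad, tris)
```

With the pre-patch split, (v0, v1, v3) and (v1, v2, v3), the second triangle starts at v1, so under the first-vertex convention it would take its flat attributes from the wrong vertex.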


[Mesa-dev] [RFC 1/2] nir: add intrinsics for load/store global

From: Rob Clark robcl...@freedesktop.org

Seemed like these were missing in action?  This is how it works for
other vars (uniform/shader_in/shader_out/etc), so seemed sensible.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/glsl/nir/nir_intrinsics.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 8e28765..a047324 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -122,6 +122,10 @@ SYSTEM_VALUE(invocation_id, 1)
INTRINSIC(load_##name##_indirect, extra_srcs + 1, ARR(1, 1), \
  true, 0, 0, 2, flags)
 
+/* NOTE: global can be re-ordered, just not wrt. stores.. not sure if
+ * is a way to express that?
+ */
+LOAD(global, 0, NIR_INTRINSIC_CAN_ELIMINATE)
 LOAD(uniform, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(ubo, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 LOAD(input, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
@@ -136,8 +140,9 @@ LOAD(input, 0, NIR_INTRINSIC_CAN_ELIMINATE | 
NIR_INTRINSIC_CAN_REORDER)
 #define STORE(name, num_indices, flags) \
INTRINSIC(store_##name, 1, ARR(0), false, 0, 0, num_indices, flags) \
INTRINSIC(store_##name##_indirect, 2, ARR(0, 1), false, 0, 0, \
- num_indices, flags) \
+ num_indices, flags)
 
+STORE(global, 2, 0)/* num_indices should be ?? */
 STORE(output, 2, 0)
 /* STORE(ssbo, 3, 0) */
 
-- 
2.1.0


Re: [Mesa-dev] [PATCH] i965: Fix depth field setting in surface state for raw buffer on Gen7/8

2015-04-07 Thread Kristian Høgsberg

On Mon, Apr 6, 2015 at 10:51 PM, Zhenyu Wang zhen...@linux.intel.com wrote:
 On Gen7/8, for the RAW surface format, the depth field (surf[3]) in the
 surface state holds bits [30:21] of the number of entries, unlike other
 surface formats, which use only bits [26:21].

 Signed-off-by: Zhenyu Wang zhen...@linux.intel.com

Is there a bugzilla that this fixes we can link from the commit
message? Either way, this looks good.

Reviewed-by: Kristian Høgsberg k...@bitplanet.net

 ---
  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 7 +--
  src/mesa/drivers/dri/i965/gen8_surface_state.c| 7 +--
  2 files changed, 10 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c 
 b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 index d9361d3..18bcb8a 100644
 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 @@ -238,8 +238,11 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
 surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
 surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
   SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
 -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
 - (pitch - 1);
 +   if (surface_format == BRW_SURFACEFORMAT_RAW)
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
 +   else
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
 +   surf[3] |= (pitch - 1);

 surf[5] = SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS);

 diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c 
 b/src/mesa/drivers/dri/i965/gen8_surface_state.c
 index 0007c95..ba59b05 100644
 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
 @@ -129,8 +129,11 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,

 surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
   SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
 -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
 - (pitch - 1);
 +   if (surface_format == BRW_SURFACEFORMAT_RAW)
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
 +   else
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
 +   surf[3] |= (pitch - 1);
 surf[7] = SET_FIELD(HSW_SCS_RED,   GEN7_SURFACE_SCS_R) |
   SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
   SET_FIELD(HSW_SCS_BLUE,  GEN7_SURFACE_SCS_B) |
 --
 2.1.4


Re: [Mesa-dev] [RFC 2/2] gallium/ttn: add support for temp arrays

On Tue, Apr 7, 2015 at 8:52 AM, Rob Clark robdcl...@gmail.com wrote:
 From: Rob Clark robcl...@freedesktop.org

 Since the rest of NIR really would rather have these as variables rather
 than registers, create a nir_variable per array.  But rather than
 completely re-arrange ttn to be variable based rather than register
 based, keep the registers.  In the cases where there is a matching var
 for the reg, ttn_emit_instruction will append the appropriate intrinsic
 to get things back from the shadow reg into the variable.

 NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give
 an array id.  But those just kinda suck, and should really go away.
 AFAICT we don't get those from glsl.  Might be an issue for some other
 state tracker.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 
 ++--
  1 file changed, 94 insertions(+), 22 deletions(-)

 diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
 b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 index da935a4..1c7b313 100644
 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
 +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 @@ -44,6 +44,7 @@
  struct ttn_reg_info {
 /** nir register containing this TGSI index. */
 nir_register *reg;
 +   nir_variable *var;
 /** Offset (in vec4s) from the start of var for this TGSI index. */
 int offset;
  };
 @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c)

 if (file == TGSI_FILE_TEMPORARY) {
nir_register *reg;
 -  if (c->scan->indirect_files & (1 << file)) {
 +  for (i = 0; i < array_size; i++) {
   reg = nir_local_reg_create(b->impl);
   reg->num_components = 4;
 - reg->num_array_elems = array_size;
 + c->temp_regs[decl->Range.First + i].reg = reg;
 + c->temp_regs[decl->Range.First + i].offset = 0;
 +  }
 +  if (decl->Declaration.Array) {
 + /* for arrays, the register created just serves as a
 +  * shadow register.  We append intrinsic_store_global
 +  * after the tgsi instruction is translated to move
 +  * back from the shadow register to the variable
 +  */
 + nir_variable *var = rzalloc(b->shader, nir_variable);
 + var->type = glsl_array_type(glsl_vec4_type(), array_size);
 + var->data.mode = nir_var_global;
 + var->name = ralloc_asprintf(var, "arr_%d", decl->Array.ArrayID);
 +
 + exec_list_push_tail(&b->shader->globals, &var->node);

   for (i = 0; i < array_size; i++) {
 -c->temp_regs[decl->Range.First + i].reg = reg;
 -c->temp_regs[decl->Range.First + i].offset = i;
 +c->temp_regs[decl->Range.First + i].var = var;
   }
} else {
 - for (i = 0; i < array_size; i++) {
 -reg = nir_local_reg_create(b->impl);
 -reg->num_components = 4;
 -c->temp_regs[decl->Range.First + i].reg = reg;
 -c->temp_regs[decl->Range.First + i].offset = 0;
 - }
}
 } else if (file == TGSI_FILE_ADDRESS) {
c->addr_reg = nir_local_reg_create(b->impl);
 @@ -256,10 +264,34 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,

 switch (file) {
 case TGSI_FILE_TEMPORARY:
 -  src.reg.reg = c->temp_regs[index].reg;
 -  src.reg.base_offset = c->temp_regs[index].offset;
 -  if (indirect)
 - src.reg.indirect = ttn_src_for_indirect(c, indirect);
 +  if (c->temp_regs[index].var) {
 + nir_intrinsic_instr *load;
 + nir_alu_src indirect_address;
 +
 + assert(indirect);
 +
 + load = nir_intrinsic_instr_create(b->shader,
 +   nir_intrinsic_load_global_indirect);
 + load->num_components = 4;
 + load->const_index[0] = index;
 + load->const_index[1] = 1;

Why are we using an intrinsic that has an index and not
nir_intrinsic_load_var with a deref?  A short (2-element) deref chain
will handle this for you and then the lower_vars_to_ssa pass will pick
up on things like if all the indirect uses are actually constant and
lower it to SSA values for you.  If you use an index+offset intrinsic
then it's completely opaque and the rest of NIR doesn't know what to
do with it.

 +
 + memset(&indirect_address, 0, sizeof(indirect_address));
 + indirect_address.src = nir_src_for_reg(c->addr_reg);
 + for (int i = 0; i < 4; i++)
 +indirect_address.swizzle[i] = indirect->Swizzle;
 + load->src[0] = nir_src_for_ssa(nir_imov_alu(b, indirect_address, 1));
 +
 + nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);
 + nir_instr_insert_after_cf_list(b->cf_node_list, &load->instr);
 +
 + src = nir_src_for_ssa(&load->dest.ssa);
 +
 +  } else {
 + assert(!indirect);
 + src.reg.reg = c->temp_regs[index].reg;
 + src.reg.base_offset = c->temp_regs[index].offset;
 +  }
break;

 case

Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.

2015-04-07 Thread Jose Fonseca


On 07/04/15 15:01, Emil Velikov wrote:

On 7 April 2015 at 13:14, Jose Fonseca jfons...@vmware.com wrote:

Sorry for the delay. I was away over Easter.

On 02/04/15 19:02, Matt Turner wrote:


On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote:


These were being defined in SCons, but it's not practical -- we actually
need to include Gallium headers from external source trees, with
completely disjoint build infrastructure, and it's unsustainable to
replicate the HAVE_xxx checks or even hard-coded defines across
everywhere.



To confirm, you're building external sources with gcc? I don't think
these macros are useful for MSVC.



Correct.




No actual change in behavior for autoconf.
---
   configure.ac |  2 +-
   include/c99_compat.h | 45 +
   scons/gallium.py | 27 ---
   src/util/macros.h|  2 ++
   4 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/configure.ac b/configure.ac
index 520cc22..1485bba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS
   _SAVE_CPPFLAGS=$CPPFLAGS

   dnl Compiler macros
-DEFINES=
+DEFINES=-DHAVE_AUTOCONF
   AC_SUBST([DEFINES])
   case $host_os in
   linux*|*-gnu*|gnu*)
diff --git a/include/c99_compat.h b/include/c99_compat.h
index 4fc91bc..62ccd46 100644
--- a/include/c99_compat.h
+++ b/include/c99_compat.h



c99_compat.h doesn't seem like the right location. I know it seems
like a nice place to add this since it's included everywhere, but I
worry that in a few years we're going to be cleaning it up like we've
been doing with compiler.h and friends.

I might make a separate header to define these? Not sure.



I can move the defines out of c99_compat.h , e.g.,
mesa/include/fallbackconfig.h.

But I'd prefer to include fallbackconfig.h out of c99_compat.h , as
c99_compat.h is pretty much guaranteed to be included all the time.



Since
probably all cases of #ifdef HAVE___* have a fallback, that runs the
risk of never noticing that you weren't including the right header.


Precisely, this is all the more reason why it must be included from a header
that's included all the time.  If it depends on people adding the include on
a case-by-case basis it is bound to fail, as nobody else but us cares, and it
will easily go unnoticed.



@@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a,
   #endif


+
+/* Fallback definitions, for when these headers are used by build
systems which
+ * don't auto-detect these things.*/
+#ifndef HAVE_AUTOCONF



I'd rather flip this condition around and not modify configure.ac. But
maybe you can't do that because you're not actually building
everything with scons?



No biggie either way.


I don't know. This seems nuts. I really don't like adding stuff to the
autotools build system like this.



Sure.



I really don't know how to deal with this. What I'm hearing is that
even the custom scons build system you guys use isn't sufficient for
your own needs. You're not building the external source trees with the
same build system...?



I think you might be getting the wrong idea.

We don't build the .c files from external source trees.  But we do need to
include .h files, so we can interface with components in the Mesa tree.

That is, I only need the .h files to make sense on their own (with Mesa
components, namely mesa/src/gallium/include, and gallium auxiliary
libraries).  But we have so many inline functions, so many #ifdef HAVE_foo,
that unless all the defines match precisely, all hell breaks loose.


Gallium has from the start been integrated (i.e. embedded) in a myriad of
places.  It was always meant as a framework to write any sort of 3D driver,
not just OpenGL drivers.  Things were much worse when Gallium was used in
Windows XP kernel land or on Windows CE.  I'm glad that neither I nor anybody
else has to deal with the quirkiness of keeping code portable across these
platforms any more.  Things are much more uniform nowadays.



I mean, in all the build system work I've done I've tried to make sure
scons continues working -- doing things like adding these HAVE_*
definitions to it and such. It's kind of frustrating, and it's even
more frustrating when even that isn't sufficient.




All I'm doing here is basically move your defines out of scons's python
files into C headers.  Conceptually it's doing pretty much the same thing as
before, but being in a header that means that it's there for all build
systems to take.


Remember that Mesa itself is not just autoconf and SCons, there's also the
Android build system.

I don't like it any more than you do, but this is the world we live in: the
fact is that many platforms constrain how software must be built, to the
point where it is impracticable/impossible to build it any other way.  Even
if a build system that meets everybody's needs existed, we'd still face the
legacy of existing software using other build systems.



To be honest, IMHO, Mesa source tree and build systems are

[Mesa-dev] [RFC 2/2] gallium/ttn: add support for temp arrays

From: Rob Clark robcl...@freedesktop.org

Since the rest of NIR really would rather have these as variables rather
than registers, create a nir_variable per array.  But rather than
completely re-arrange ttn to be variable based rather than register
based, keep the registers.  In the cases where there is a matching var
for the reg, ttn_emit_instruction will append the appropriate intrinsic
to get things back from the shadow reg into the variable.

NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give
an array id.  But those just kinda suck, and should really go away.
AFAICT we don't get those from glsl.  Might be an issue for some other
state tracker.

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++--
 1 file changed, 94 insertions(+), 22 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index da935a4..1c7b313 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -44,6 +44,7 @@
 struct ttn_reg_info {
    /** nir register containing this TGSI index. */
    nir_register *reg;
+   nir_variable *var;
    /** Offset (in vec4s) from the start of var for this TGSI index. */
    int offset;
 };
@@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c)
 
    if (file == TGSI_FILE_TEMPORARY) {
       nir_register *reg;
-      if (c->scan->indirect_files & (1 << file)) {
+      for (i = 0; i < array_size; i++) {
          reg = nir_local_reg_create(b->impl);
          reg->num_components = 4;
-         reg->num_array_elems = array_size;
+         c->temp_regs[decl->Range.First + i].reg = reg;
+         c->temp_regs[decl->Range.First + i].offset = 0;
+      }
+      if (decl->Declaration.Array) {
+         /* for arrays, the register created just serves as a
+          * shadow register.  We append intrinsic_store_global
+          * after the tgsi instruction is translated to move
+          * back from the shadow register to the variable
+          */
+         nir_variable *var = rzalloc(b->shader, nir_variable);
+         var->type = glsl_array_type(glsl_vec4_type(), array_size);
+         var->data.mode = nir_var_global;
+         var->name = ralloc_asprintf(var, "arr_%d", decl->Array.ArrayID);
+
+         exec_list_push_tail(&b->shader->globals, &var->node);
 
          for (i = 0; i < array_size; i++) {
-            c->temp_regs[decl->Range.First + i].reg = reg;
-            c->temp_regs[decl->Range.First + i].offset = i;
+            c->temp_regs[decl->Range.First + i].var = var;
          }
       } else {
-         for (i = 0; i < array_size; i++) {
-            reg = nir_local_reg_create(b->impl);
-            reg->num_components = 4;
-            c->temp_regs[decl->Range.First + i].reg = reg;
-            c->temp_regs[decl->Range.First + i].offset = 0;
-         }
       }
    } else if (file == TGSI_FILE_ADDRESS) {
       c->addr_reg = nir_local_reg_create(b->impl);
@@ -256,10 +264,34 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
 
    switch (file) {
    case TGSI_FILE_TEMPORARY:
-      src.reg.reg = c->temp_regs[index].reg;
-      src.reg.base_offset = c->temp_regs[index].offset;
-      if (indirect)
-         src.reg.indirect = ttn_src_for_indirect(c, indirect);
+      if (c->temp_regs[index].var) {
+         nir_intrinsic_instr *load;
+         nir_alu_src indirect_address;
+
+         assert(indirect);
+
+         load = nir_intrinsic_instr_create(b->shader,
+                                           nir_intrinsic_load_global_indirect);
+         load->num_components = 4;
+         load->const_index[0] = index;
+         load->const_index[1] = 1;
+
+         memset(&indirect_address, 0, sizeof(indirect_address));
+         indirect_address.src = nir_src_for_reg(c->addr_reg);
+         for (int i = 0; i < 4; i++)
+            indirect_address.swizzle[i] = indirect->Swizzle;
+         load->src[0] = nir_src_for_ssa(nir_imov_alu(b, indirect_address, 1));
+
+         nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);
+         nir_instr_insert_after_cf_list(b->cf_node_list, &load->instr);
+
+         src = nir_src_for_ssa(&load->dest.ssa);
+
+      } else {
+         assert(!indirect);
+         src.reg.reg = c->temp_regs[index].reg;
+         src.reg.base_offset = c->temp_regs[index].offset;
+      }
       break;
 
    case TGSI_FILE_ADDRESS:
 
case TGSI_FILE_ADDRESS:
@@ -340,29 +372,45 @@ ttn_get_dest(struct ttn_compile *c, struct tgsi_full_dst_register *tgsi_fdst)
 {
    struct tgsi_dst_register *tgsi_dst = &tgsi_fdst->Register;
    nir_alu_dest dest;
+   unsigned index = tgsi_dst->Index;
 
    memset(&dest, 0, sizeof(dest));
 
+   dest.write_mask = tgsi_dst->WriteMask;
+   dest.saturate = false;
+
    if (tgsi_dst->File == TGSI_FILE_TEMPORARY) {
-      dest.dest.reg.reg = c->temp_regs[tgsi_dst->Index].reg;
-      dest.dest.reg.base_offset = c->temp_regs[tgsi_dst->Index].offset;
+      dest.dest.reg.reg = c->temp_regs[index].reg;
+

[Mesa-dev] [RFC 0/2] nir and ttn support for indirect/arrays

From: Rob Clark robcl...@freedesktop.org

Introduce intrinsics to load/store global vars (since I'm not sure what
the point is to have global as a var type if there is no way to access
it, but maybe I'm missing something), and update ttn to generate global
variables for arrays.

So, for example:

   FRAG
   PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
   DCL OUT[0], COLOR
   DCL CONST[0..1]
   DCL TEMP[0..2], ARRAY(1), LOCAL
   DCL TEMP[3..4], LOCAL
   DCL ADDR[0]
   IMM[0] FLT32 {1., 2., 3., 0.}
   IMM[1] FLT32 {4., 5., 6., 7.}
   IMM[2] FLT32 {7., 8., 9., 0.}
 0: MOV TEMP[0], IMM[0].xyzx
 1: MOV TEMP[1], IMM[1].xyzx
 2: MOV TEMP[2], IMM[2].xyzx
 3: UARL ADDR[0].x, CONST[0].
 4: FSEQ TEMP[3].xyz, TEMP[ADDR[0].x](1).xyzz, CONST[1].xyzz
 5: AND TEMP[3].y, TEMP[3]., TEMP[3].
 6: AND TEMP[3].x, TEMP[3]., TEMP[3].
 7: UCMP TEMP[4], TEMP[3]., IMM[0].wxwx, TEMP[4]
 8: NOT TEMP[3].x, TEMP[3].
 9: UCMP TEMP[4], TEMP[3]., IMM[0].xwwx, TEMP[4]
10: MOV OUT[0], TEMP[4]
11: END

becomes:

   decl_var uniform  vec4[2] uniform_0 (0, 0)
   decl_var shader_out  vec4 out_0 (1, 0)
   decl_overload main returning void
   
   impl main {
  block block_0:
  /* preds: */
  vec4 ssa_0 = load_const (0x3f800000 /* 1.000000 */, 0x40000000 /* 2.000000 */, 0x40400000 /* 3.000000 */, 0x00000000 /* 0.000000 */
  vec4 ssa_233 = load_const (0x3f800000 /* 1.000000 */, 0x40000000 /* 2.000000 */, 0x40400000 /* 3.000000 */, 0x3f800000 /* 1.000000 */
  intrinsic store_global (ssa_233) () (0, 1)
  vec4 ssa_234 = load_const (0x40800000 /* 4.000000 */, 0x40a00000 /* 5.000000 */, 0x40c00000 /* 6.000000 */, 0x40800000 /* 4.000000 */
  intrinsic store_global (ssa_234) () (1, 1)
  vec4 ssa_235 = load_const (0x40e00000 /* 7.000000 */, 0x41000000 /* 8.000000 */, 0x41100000 /* 9.000000 */, 0x40e00000 /* 7.000000 */
  intrinsic store_global (ssa_235) () (2, 1)
  vec4 ssa_18 = intrinsic load_uniform () () (0, 1)
  vec4 ssa_25 = intrinsic load_global_indirect (ssa_18) () (0, 1)
  vec4 ssa_27 = intrinsic load_uniform () () (1, 1)
  vec1 ssa_132 = feq ssa_25, ssa_27
  vec1 ssa_133 = feq ssa_25.y, ssa_27.y
  vec1 ssa_134 = feq ssa_25.z, ssa_27.z
  vec1 ssa_77 = iand ssa_133, ssa_134
  vec1 ssa_79 = iand ssa_132, ssa_77
  vec1 ssa_244 = bcsel ssa_79, ssa_0.w, ssa_0
  vec1 ssa_246 = bcsel ssa_79, ssa_0, ssa_0.w
  vec1 ssa_254 = load_const (0x00000000 /* 0.000000 */
  vec1 ssa_255 = load_const (0x3f800000 /* 1.000000 */
  vec4 ssa_230 = vec4 ssa_244, ssa_246, ssa_254, ssa_255
  intrinsic store_output (ssa_230) () (0, 1)
  /* succs: block_1 */
  block block_1:
   }

note, in one of the opt passes the 'decl_var  vec4[3] arr_1' is getting lost
but I haven't debugged that yet

Rob Clark (2):
  nir: add intrinsics for load/store global
  gallium/ttn: add support for temp arrays

 src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 ++--
 src/glsl/nir/nir_intrinsics.h   |   7 +-
 2 files changed, 100 insertions(+), 23 deletions(-)

-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2 1/2] primconvert: select pv convention only from flatshade_first

BTW, I should have mentioned -- this affects the way that quads are
split up into tri's, which is most likely the source of any
differences from my later change which makes quads respect pv order.

On Tue, Apr 7, 2015 at 12:12 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 This should match to how drivers program hardware. It shouldn't matter
 when flatshading isn't in effect, but somehow it seems to.

 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
  src/gallium/auxiliary/indices/u_primconvert.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

 diff --git a/src/gallium/auxiliary/indices/u_primconvert.c 
 b/src/gallium/auxiliary/indices/u_primconvert.c
 index 00e65aa..70d3e85 100644
 --- a/src/gallium/auxiliary/indices/u_primconvert.c
 +++ b/src/gallium/auxiliary/indices/u_primconvert.c
 @@ -104,8 +104,7 @@ util_primconvert_save_rasterizer_state(struct primconvert_context *pc,
      * we would actually need to save/restore rasterizer state.  As
      * it is, we just need to make note of the pv.
      */
 -   pc->api_pv = (rast->flatshade &&
 -                 !rast->flatshade_first) ? PV_LAST : PV_FIRST;
 +   pc->api_pv = rast->flatshade_first ? PV_FIRST : PV_LAST;
  }

  void
 --
 2.0.5
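
To make the behavioural difference concrete, here is a minimal standalone sketch comparing the old and new selection logic. The enum and both helper functions are hypothetical stand-ins, not the actual u_primconvert.c code; only the two expressions mirror the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the PV_FIRST / PV_LAST values used by the patch. */
enum pv_mode { PV_FIRST, PV_LAST };

/* Old logic: last-vertex convention only when flatshading was enabled
 * and flatshade_first was clear. */
static enum pv_mode
old_api_pv(bool flatshade, bool flatshade_first)
{
   return (flatshade && !flatshade_first) ? PV_LAST : PV_FIRST;
}

/* New logic: the convention follows flatshade_first alone. */
static enum pv_mode
new_api_pv(bool flatshade, bool flatshade_first)
{
   (void)flatshade; /* deliberately ignored */
   return flatshade_first ? PV_FIRST : PV_LAST;
}
```

The two functions disagree in exactly one case: with flatshade and flatshade_first both false, the old logic picks PV_FIRST and the new one picks PV_LAST. That is the case where the quad splitting can end up different even though flatshading is off.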


Re: [Mesa-dev] [RFC 2/2] gallium/ttn: add support for temp arrays

On Tue, Apr 7, 2015 at 12:32 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
 On Tue, Apr 7, 2015 at 8:52 AM, Rob Clark robdcl...@gmail.com wrote:
 From: Rob Clark robcl...@freedesktop.org

 Since the rest of NIR really would rather have these as variables rather
 than registers, create a nir_variable per array.  But rather than
 completely re-arrange ttn to be variable based rather than register
 based, keep the registers.  In the cases where there is a matching var
 for the reg, ttn_emit_instruction will append the appropriate intrinsic
 to get things back from the shadow reg into the variable.

 NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give
 an array id.  But those just kinda suck, and should really go away.
 AFAICT we don't get those from glsl.  Might be an issue for some other
 state tracker.

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/auxiliary/nir/tgsi_to_nir.c | 116 
 ++--
  1 file changed, 94 insertions(+), 22 deletions(-)

 diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c 
 b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 index da935a4..1c7b313 100644
 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
 +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 @@ -44,6 +44,7 @@
  struct ttn_reg_info {
     /** nir register containing this TGSI index. */
     nir_register *reg;
 +   nir_variable *var;
     /** Offset (in vec4s) from the start of var for this TGSI index. */
     int offset;
  };
 @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c)

     if (file == TGSI_FILE_TEMPORARY) {
        nir_register *reg;
 -      if (c->scan->indirect_files & (1 << file)) {
 +      for (i = 0; i < array_size; i++) {
           reg = nir_local_reg_create(b->impl);
           reg->num_components = 4;
 -         reg->num_array_elems = array_size;
 +         c->temp_regs[decl->Range.First + i].reg = reg;
 +         c->temp_regs[decl->Range.First + i].offset = 0;
 +      }
 +      if (decl->Declaration.Array) {
 +         /* for arrays, the register created just serves as a
 +          * shadow register.  We append intrinsic_store_global
 +          * after the tgsi instruction is translated to move
 +          * back from the shadow register to the variable
 +          */
 +         nir_variable *var = rzalloc(b->shader, nir_variable);
 +         var->type = glsl_array_type(glsl_vec4_type(), array_size);
 +         var->data.mode = nir_var_global;
 +         var->name = ralloc_asprintf(var, "arr_%d", decl->Array.ArrayID);
 +
 +         exec_list_push_tail(&b->shader->globals, &var->node);

           for (i = 0; i < array_size; i++) {
 -            c->temp_regs[decl->Range.First + i].reg = reg;
 -            c->temp_regs[decl->Range.First + i].offset = i;
 +            c->temp_regs[decl->Range.First + i].var = var;
           }
        } else {
 -         for (i = 0; i < array_size; i++) {
 -            reg = nir_local_reg_create(b->impl);
 -            reg->num_components = 4;
 -            c->temp_regs[decl->Range.First + i].reg = reg;
 -            c->temp_regs[decl->Range.First + i].offset = 0;
 -         }
        }
     } else if (file == TGSI_FILE_ADDRESS) {
        c->addr_reg = nir_local_reg_create(b->impl);
 @@ -256,10 +264,34 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,

     switch (file) {
     case TGSI_FILE_TEMPORARY:
 -      src.reg.reg = c->temp_regs[index].reg;
 -      src.reg.base_offset = c->temp_regs[index].offset;
 -      if (indirect)
 -         src.reg.indirect = ttn_src_for_indirect(c, indirect);
 +      if (c->temp_regs[index].var) {
 +         nir_intrinsic_instr *load;
 +         nir_alu_src indirect_address;
 +
 +         assert(indirect);
 +
 +         load = nir_intrinsic_instr_create(b->shader,
 +                                           nir_intrinsic_load_global_indirect);
 +         load->num_components = 4;
 +         load->const_index[0] = index;
 +         load->const_index[1] = 1;

 Why are we using an intrinsic that has an index and not
 nir_intrinsic_load_var with a deref?  A short (2-element) deref chain
 will handle this for you and then the lower_vars_to_ssa pass will pick
 up on things like if all the indirect uses are actually constant and
 lower it to SSA values for you.  If you use an index+offset intrinsic
 then it's completely opaque and the rest of NIR doesn't know what to
 do with it.

I am *assuming* here that the index refers to which var you are
load/storing.. at least that is how it seemed to work for
uniforms/inputs/outputs.  Ofc I'm mostly just trying to infer about
how things should work from reading code so entirely possible I'm
missing something or haven't read the right parts of the code yet..

I'm starting to think more that I should have added a
nir_intrinsic_{load,store}_var_indirect instead of new intrinsics for
load/store_global(_indirect)..  I guess that would fit in better with
how variables already work.  Although I couldn't see any obvious way
for {load,store}_var to take

Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.

2015-04-07 Thread Jose Fonseca


On 07/04/15 17:16, Matt Turner wrote:

On Tue, Apr 7, 2015 at 5:14 AM, Jose Fonseca jfons...@vmware.com wrote:

Sorry for the delay. I've been away during the Easter.

On 02/04/15 19:02, Matt Turner wrote:


On Thu, Apr 2, 2015 at 7:32 AM, Jose Fonseca jfons...@vmware.com wrote:


These were being defined in SCons, but it's not practical -- we actually
need to include Gallium headers from external source trees, with
completely disjoint build infrastructure, and it's unsustainable to
replicate the HAVE_xxx checks or even hard-coded defines everywhere.



To confirm, you're building external sources with gcc? I don't think
these macros are useful for MSVC.



Correct.




No actual change in behavior for autoconf.
---
   configure.ac |  2 +-
   include/c99_compat.h | 45 +
   scons/gallium.py | 27 ---
   src/util/macros.h|  2 ++
   4 files changed, 48 insertions(+), 28 deletions(-)

diff --git a/configure.ac b/configure.ac
index 520cc22..1485bba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -230,7 +230,7 @@ _SAVE_LDFLAGS=$LDFLAGS
   _SAVE_CPPFLAGS=$CPPFLAGS

   dnl Compiler macros
-DEFINES=
+DEFINES=-DHAVE_AUTOCONF
   AC_SUBST([DEFINES])
   case $host_os in
   linux*|*-gnu*|gnu*)
diff --git a/include/c99_compat.h b/include/c99_compat.h
index 4fc91bc..62ccd46 100644
--- a/include/c99_compat.h
+++ b/include/c99_compat.h



c99_compat.h doesn't seem like the right location. I know it seems
like a nice place to add this since it's included everywhere, but I
worry that in a few years we're going to be cleaning it up like we've
been doing with compiler.h and friends.

I might make a separate header to define these? Not sure.



I can move the defines out of c99_compat.h , e.g.,
mesa/include/fallbackconfig.h.

But I'd prefer to include fallbackconfig.h out of c99_compat.h , as
c99_compat.h is pretty much guaranteed to be included all the time.



Since
probably all cases of #ifdef HAVE___* have a fallback, that runs the
risk of never noticing that you weren't including the right header.


Precisely, this is all the more reason why it must be included from a header
that's included all the time.  If it depends on people adding the include on
a case-by-case basis it is bound to fail, as nobody else but us cares, and it
will easily go unnoticed.



@@ -141,4 +141,49 @@ test_c99_compat_h(const void * restrict a,
   #endif


+
+/* Fallback definitions, for when these headers are used by build systems which
+ * don't auto-detect these things.*/
+#ifndef HAVE_AUTOCONF



I'd rather flip this condition around and not modify configure.ac. But
maybe you can't do that because you're not actually building
everything with scons?



No biggie either way.


I don't know. This seems nuts. I really don't like adding stuff to the
autotools build system like this.



Sure.



I really don't know how to deal with this. What I'm hearing is that
even the custom scons build system you guys use isn't sufficient for
your own needs. You're not building the external source trees with the
same build system...?



I think you might be getting the wrong idea.

We don't build the .c files from external source trees.  But we do need to
include .h files, so we can interface with components in the Mesa tree.

That is, I only need the .h files to make sense on their own (with Mesa
components, namely mesa/src/gallium/include, and gallium auxiliary
libraries).  But we have so many inline functions, so many #ifdef HAVE_foo,
that unless all the defines match precisely, all hell breaks loose.


Gallium has from the start been integrated (i.e. embedded) in a myriad of
places.  It was always meant as a framework to write any sort of 3D driver,
not just OpenGL drivers.  Things were much worse when Gallium was used in
Windows XP kernel land or on Windows CE.  I'm glad that neither I nor anybody
else has to deal with the quirkiness of keeping code portable across these
platforms any more.  Things are still much more uniform nowadays.



I mean, in all the build system work I've done I've tried to make sure
scons continues working -- doing things like adding these HAVE_*
definitions to it and such. It's kind of frustrating, and it's even
more frustrating when even that isn't sufficient.




All I'm doing here is basically moving your defines out of scons's python
files into C headers.  Conceptually it's doing pretty much the same thing as
before, but being in a header means that it's there for all build
systems to use.
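
A minimal sketch of what such a fallback header could look like, assuming a GCC-compatible compiler and a HAVE_AUTOCONF define set by configure as in the patch. The choice of HAVE___BUILTIN_POPCOUNT and the demo_popcount helper are illustrative only and are not the exact contents of the patch.

```c
/* Fallback definitions for build systems that don't auto-detect features. */
#ifndef HAVE_AUTOCONF
#  if defined(__GNUC__)
#    ifndef HAVE___BUILTIN_POPCOUNT
#      define HAVE___BUILTIN_POPCOUNT 1
#    endif
#  endif
#endif

/* Illustrative consumer: an inline whose body depends on the define, so
 * every build system that includes the header must agree on its value. */
static inline unsigned
demo_popcount(unsigned v)
{
#ifdef HAVE___BUILTIN_POPCOUNT
   return (unsigned)__builtin_popcount(v);
#else
   unsigned n = 0;
   for (; v; v &= v - 1)   /* clear the lowest set bit each iteration */
      n++;
   return n;
#endif
}
```

Because the define lives next to the code that consumes it, an external tree that only includes the header gets the same answer as the in-tree build without replicating any configure checks.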


Remember that Mesa itself is not just autoconf and SCons; there's also the
Android build system.

I don't like it any more than you do, but this is the world we live in: the
fact is that many platforms constrain how software must be built to the point
where building it any other way is impracticable or impossible.  Even if a
build system that meets everybody's needs existed, we'd still face the legacy
of existing software using other build systems.



To be honest, IMHO, Mesa source tree and build

[Mesa-dev] [PATCH] swrast: replace __FUNCTION__ with __func__

Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions are intended to support __func__.
No functional changes.

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/mesa/swrast/s_linetemp.h |4 ++--
 src/mesa/swrast/s_span.c |2 +-
 src/mesa/swrast/s_tritemp.h  |2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/swrast/s_linetemp.h b/src/mesa/swrast/s_linetemp.h
index 352c884..035a1e6 100644
--- a/src/mesa/swrast/s_linetemp.h
+++ b/src/mesa/swrast/s_linetemp.h
@@ -106,7 +106,7 @@ NAME( struct gl_context *ctx, const SWvertex *vert0, const SWvertex *vert1 )
    }
 
    /*
-   printf("%s():\n", __FUNCTION__);
+   printf("%s():\n", __func__);
    printf(" (%f, %f, %f) -> (%f, %f, %f)\n",
           vert0->attrib[VARYING_SLOT_POS][0],
           vert0->attrib[VARYING_SLOT_POS][1],
@@ -154,7 +154,7 @@ NAME( struct gl_context *ctx, const SWvertex *vert0, const SWvertex *vert1 )
       return;
 
    /*
-   printf("%s %d,%d  %g %g %g %g  %g %g %g %g\n", __FUNCTION__, dx, dy,
+   printf("%s %d,%d  %g %g %g %g  %g %g %g %g\n", __func__, dx, dy,
           vert0->attrib[VARYING_SLOT_COL1][0],
           vert0->attrib[VARYING_SLOT_COL1][1],
           vert0->attrib[VARYING_SLOT_COL1][2],
diff --git a/src/mesa/swrast/s_span.c b/src/mesa/swrast/s_span.c
index e304b6b..7bb5712 100644
--- a/src/mesa/swrast/s_span.c
+++ b/src/mesa/swrast/s_span.c
@@ -1144,7 +1144,7 @@ _swrast_write_rgba_span( struct gl_context *ctx, SWspan *span)
    struct gl_framebuffer *fb = ctx->DrawBuffer;
 
    /*
-   printf("%s()  interp 0x%x  array 0x%x\n", __FUNCTION__,
+   printf("%s()  interp 0x%x  array 0x%x\n", __func__,
           span->interpMask, span->arrayMask);
    */
 
diff --git a/src/mesa/swrast/s_tritemp.h b/src/mesa/swrast/s_tritemp.h
index fb73b2d..4b6d34c 100644
--- a/src/mesa/swrast/s_tritemp.h
+++ b/src/mesa/swrast/s_tritemp.h
@@ -156,7 +156,7 @@ static void NAME(struct gl_context *ctx, const SWvertex *v0,
 #endif
 
    /*
-   printf("%s()\n", __FUNCTION__);
+   printf("%s()\n", __func__);
    printf("  %g, %g, %g\n",
           v0->attrib[VARYING_SLOT_POS][0],
           v0->attrib[VARYING_SLOT_POS][1],
-- 
1.7.9.5
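
For reference, a small sketch of how the C99 predefined identifier behaves. The function names are made up for illustration and do not come from swrast.

```c
#include <stdio.h>
#include <stddef.h>

/* __func__ is a C99 predefined identifier holding the name of the
 * enclosing function; unlike __FUNCTION__ it is not a compiler extension. */
static const char *
demo_current_function(void)
{
   return __func__;
}

/* Formats "<function name>()" into buf, in the style of the swrast
 * debug printfs.  Returns the number of characters written. */
static int
demo_trace(char *buf, size_t size)
{
   return snprintf(buf, size, "%s()", __func__);
}
```

Since __func__ behaves like a static array declared inside the function, returning a pointer to it is valid, but it cannot be pasted together with adjacent string literals the way a macro string could.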


Re: [Mesa-dev] [PATCH 4/4] autoconf, scons: Move fallback HAVE_* definitions to headers.

2015-04-07 Thread Emil Velikov

On 7 April 2015 at 16:21, Jose Fonseca jfons...@vmware.com wrote:
 On 07/04/15 15:01, Emil Velikov wrote:
...

 So let me see if I got this correct, apologies in advance if it comes out
 too blunt.

 Unless I'm mistaken the gallium interfaces are internal/private, so
 comparing them with public ones (like the Khronos OpenGL) seems like
 comparing apples to oranges. Yet as one tries to have/use gallium
 interfaces as if they were public, the idea of getting some of this
 #ifdef-ery into a single, isolated and easily manageable place is
 valid and honourable.


 From my POV, Gallium interfaces are public and always have been. Admittedly,
 there's no standards body, and the interface is neither stable nor does it
 provide backwards compatibility.  But pretty much from as far as I can
 remember (which is 2007) there were external (as in out-of-tree)
 state-trackers and even externals drivers.

Don't mean to be cheeky, but do you have an example of a project that
has a public interface that is neither stable nor backwards compatible?
I don't think I've heard of one, so I must admit that I found your
statement rather surprising. Then again, I have been proven narrow-minded
on an occasion or two.


 BTW, another solution would be for autotools to generate a config.h.

 And have SCons, etc, include a hand-written drop-in config.h (living in a
 separate directory.)

 This is actually a practice that many projects (off the top of my head I can
 name zlib, libpng, tiff, etc) follow.

I had the same idea for a few months now. Although that would likely
be a slow and long transition, as I would like to avoid severe
breakages.


 Another alternative is for me to pre-include this fakeconfig.h , ie., `gcc
 --include fakeconfig.h`, MSVC's `/FIfakeconfig.h`

Imho this sounds like the better solution. This way when someone
uses gallium interfaces as public they can include it explicitly.



 Afaict the overhead of rebasing an integrated solution on top of newer
 mesa, would be less than having it out-of-tree. Plus it seems like the
 better engineering approach. Perhaps I'm missing something and this
 does not hold true ?


 I'm afraid it doesn't hold true.  It's not worth going into specifics, but
 imagine the following: there's Mesa, there's our Product, and there's the
 Component linking both.  Mesa has its build system. The Product has its
 build system.  Both Product and Mesa are huge, so building one inside the
 other is just impractical.  What you can do is choose to build the
 Component inside Mesa or inside the Product, but either way you'll end up
 with variations of the same problem.  Which one is easier depends on how
 tightly the Component is integrated with Mesa vs the Product.

 The component is sort of a Direct3D state tracker, and is way more tightly
 integrated into the rest of the Product than Mesa, as it really only needs
 the gallium headers and a few of the helper modules.

The situation sounds familiar, although I might be bit biased on the
topic. Let me reword your sentence with an example in place:

You have a Component (st/omx), which depends on Product
(omx-bellagio) at compile/link time. Do your platform(s) have tools
similar to pkg-config or cmake's package-config?

If so one should be able to tackle it as follows:
 1. Bring some versioning into Product.
 2. Making sure that Product's headers/libraries are available via the
pkg-config(alike) tool.
 3. Add the check for Product into the autotools/scons build
 4. Integrate(merge) Component into Mesa.

It does have one small catch though
 - Product cannot depend on Mesa at link time. One can get around
this but it requires some non-trivial changes.

I suspect that the situation might be more elaborate than presented
(or I did not fully understand it), and I'm not trying to push you to
disclose any more information. Just saying that as presented it does
not sound so complex.


Thanks for the comprehensive explanation.

Cheers,
Emil

Re: [Mesa-dev] [PATCH] i965: Use SET_FIELD in 3DSTATE_STREAMOUT packets.

2015-04-07 Thread Anuj Phogat

On Mon, Apr 6, 2015 at 4:12 PM, Kenneth Graunke kenn...@whitecape.org
wrote:

 Suggested by Topi Pohjolainen.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 Cc: Topi Pohjolainen topi.pohjolai...@intel.com
 ---
  src/mesa/drivers/dri/i965/gen7_sol_state.c | 16 
  src/mesa/drivers/dri/i965/gen8_sol_state.c | 16 
  2 files changed, 16 insertions(+), 16 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 index 7e9b285..3f99df9 100644
 --- a/src/mesa/drivers/dri/i965/gen7_sol_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_sol_state.c
 @@ -245,17 +245,17 @@ upload_3dstate_streamout(struct brw_context *brw, bool active,
         * point by reading less and offsetting the register index in the
         * SO_DECLs.
         */
 -      dw2 |= urb_entry_read_offset << SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_0_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_0_VERTEX_READ_LENGTH);

 -      dw2 |= urb_entry_read_offset << SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_1_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_1_VERTEX_READ_LENGTH);

 -      dw2 |= urb_entry_read_offset << SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_2_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_2_VERTEX_READ_LENGTH);

 -      dw2 |= urb_entry_read_offset << SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_3_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_3_VERTEX_READ_LENGTH);
     }

 BEGIN_BATCH(3);
 diff --git a/src/mesa/drivers/dri/i965/gen8_sol_state.c
 b/src/mesa/drivers/dri/i965/gen8_sol_state.c
 index d98a226..58ead68 100644
 --- a/src/mesa/drivers/dri/i965/gen8_sol_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_sol_state.c
 @@ -125,17 +125,17 @@ gen8_upload_3dstate_streamout(struct brw_context *brw, bool active,
         * point by reading less and offsetting the register index in the
         * SO_DECLs.
         */
 -      dw2 |= urb_entry_read_offset << SO_STREAM_0_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_0_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_0_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_0_VERTEX_READ_LENGTH);

 -      dw2 |= urb_entry_read_offset << SO_STREAM_1_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_1_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_1_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_1_VERTEX_READ_LENGTH);

 -      dw2 |= urb_entry_read_offset << SO_STREAM_2_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_2_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_2_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_2_VERTEX_READ_LENGTH);

 -      dw2 |= urb_entry_read_offset << SO_STREAM_3_VERTEX_READ_OFFSET_SHIFT;
 -      dw2 |= (urb_entry_read_length - 1) << SO_STREAM_3_VERTEX_READ_LENGTH_SHIFT;
 +      dw2 |= SET_FIELD(urb_entry_read_offset, SO_STREAM_3_VERTEX_READ_OFFSET);
 +      dw2 |= SET_FIELD(urb_entry_read_length - 1, SO_STREAM_3_VERTEX_READ_LENGTH);

        /* Set buffer pitches; 0 means unbound. */
        if (xfb_obj->Buffers[0])
 --
 2.3.4

 Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
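
For readers unfamiliar with the helper, the sketch below shows the shift-and-mask pattern that a SET_FIELD-style macro encapsulates. The macro body reflects my understanding of the i965 helper and should be checked against brw_defines.h; the DEMO_READ_LENGTH field and its bit positions are made up for illustration.

```c
#include <stdint.h>

/* Shift a value into place and mask it to the field's width.
 * Assumed to mirror the i965 helper; verify against brw_defines.h. */
#define SET_FIELD(value, field) \
   (((value) << field ## _SHIFT) & field ## _MASK)

/* Hypothetical 5-bit field occupying bits 8..12 of a dword. */
#define DEMO_READ_LENGTH_SHIFT 8
#define DEMO_READ_LENGTH_MASK  (0x1fu << DEMO_READ_LENGTH_SHIFT)

static uint32_t
demo_pack(uint32_t read_length)
{
   uint32_t dw = 0;
   /* With a bare shift, an oversized value would spill into bits 13 and
    * up; the mask confines it to the field. */
   dw |= SET_FIELD(read_length - 1, DEMO_READ_LENGTH);
   return dw;
}
```

The practical difference from the open-coded shifts is that the field name is written once and an out-of-range value cannot corrupt neighbouring fields, although it is still silently truncated.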


Re: [Mesa-dev] [PATCH] nir/lower_tex_projector: Don't use designated initializers

2015-04-07 Thread Mark Janes

Reviewed-by: Mark Janes mark.a.ja...@intel.com

Jason Ekstrand ja...@jlekstrand.net writes:

 These don't work in MSVC or in older versions of GCC

 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89899
 Cc: Eric Anholt e...@anholt.net
 ---
  src/glsl/nir/nir_lower_tex_projector.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/src/glsl/nir/nir_lower_tex_projector.c 
 b/src/glsl/nir/nir_lower_tex_projector.c
 index 6327b23..6b0e9c3 100644
 --- a/src/glsl/nir/nir_lower_tex_projector.c
 +++ b/src/glsl/nir/nir_lower_tex_projector.c
 @@ -109,7 +109,8 @@ nir_lower_tex_projector_block(nir_block *block, void *void_state)
        /* Now move the later tex sources down the array so that the projector
         * disappears.
         */
 -      nir_src dead = {.is_ssa = false, .ssa = NULL};
 +      nir_src dead;
 +      memset(&dead, 0, sizeof dead);
        nir_instr_rewrite_src(&tex->instr, &tex->src[proj_index].src, dead);
        memmove(&tex->src[proj_index],
                &tex->src[proj_index + 1],
 -- 
 2.3.4
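
The substitution in the patch can be sketched in isolation as follows. The struct is a hypothetical stand-in for nir_src with only the two fields named in the diff; the real type has more members.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct standing in for nir_src. */
struct demo_src {
   bool is_ssa;
   void *ssa;
};

static struct demo_src
demo_dead_src(void)
{
   /* C99 designated-initializer form, rejected by some compilers:
    *    struct demo_src dead = { .is_ssa = false, .ssa = NULL };
    * Portable replacement: declare, then zero the whole object. */
   struct demo_src dead;
   memset(&dead, 0, sizeof dead);
   return dead;
}
```

Zeroing with memset relies on all-bits-zero representing false and a null pointer, which holds on the platforms Mesa targets, and it also clears any fields the initializer did not name.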


Re: [Mesa-dev] [PATCH 1/4] mesa/teximage: use correct extension for accept stencil texture.

On Monday, April 06, 2015 09:44:07 PM Pohjolainen, Topi wrote:
 On Mon, Apr 06, 2015 at 11:37:08AM -0700, Ian Romanick wrote:
  On 04/06/2015 08:33 AM, Pohjolainen, Topi wrote:
   On Sun, Apr 05, 2015 at 08:22:13PM +0300, Pohjolainen, Topi wrote:
   On Sun, Apr 05, 2015 at 08:06:50PM +0300, Pohjolainen, Topi wrote:
   On Sun, Apr 05, 2015 at 08:46:16AM -0400, Ilia Mirkin wrote:
   While this change is correct, the Intel guys will yell at you, because
   they're somehow misusing this in meta for Broadwell, s.t. this will
   cause crashes when blitting stencil. IMHO that's a problem that should
   be fixed in their driver and this can go on, but... it's also not my
   driver that's crashing -- they might feel differently :)
  
   As far as I can tell we only do:
  
  _mesa_TexParameteri(target, GL_DEPTH_STENCIL_TEXTURE_MODE,
  GL_STENCIL_INDEX);
  
   which is supposed to be the right thing to do - we select the stencil to be
   sampled instead of depth. And this won't hit the path below. I made the
   change locally and I'm now running piglit on broadwell.
  
   I noticed that _mesa_base_tex_format() is in turn used in
  
   src/mesa/drivers/common/meta_blit.c
  
   but we shouldn't go there with intel driver ever. On hardware older than
   broadwell we don't use meta and the one used on broadwell and newer
   is found in:
  
   src/mesa/drivers/dri/i965/brw_meta_stencil_blit.c
  
   But let's see what piglit says.
  
   Right you are. This is more subtle, we will hit it when we actually create
   a temporary texture out of the given read renderbuffer. It seems that this
   was hit for the first time when formats were adjusted, and then Jason added
   the conditional using ARB_stencil_texturing (which is not right either).
  
   Really sorry that this is hindering your work now. I'll try to take a look
   at this tomorrow.
   
   So far I can't come up with other things than pure hacks. I'll explain
   a little what happens in the intel stencil meta blit. Like I said, the
   driver creates a temporary texture out of the stencil attachment:
   
      const struct gl_renderbuffer_attachment *att =
         &ctx->ReadBuffer->Attachment[BUFFER_STENCIL];
      struct gl_renderbuffer *rb = att->Renderbuffer;
      struct gl_texture_object *tex_obj;
   
      ...
      if (!_mesa_meta_bind_rb_as_tex_image(ctx, rb, &blit->tempTex, &tex_obj,
                                           &target)) {
   
   
   This gets wound back to the driver, a call to
   intel_bind_renderbuffer_tex_image() which in turn calls the core again.
   
      _mesa_init_teximage_fields(ctx, image,
                                 rb->Width, rb->Height, 1,
                                 0, rb->InternalFormat, rb->Format);
   
   Here rb->InternalFormat is GL_STENCIL_INDEX, which won't be accepted by
   _mesa_base_tex_format() anymore without ARB_texture_stencil8. As most of
   the texture image set-up logic takes place in the core, the boolean state
   flag (brw_context::meta_in_progress) we have in the intel driver is not much
   help. It looks like we would need an additional driver-driven override.
   But I don't like that at all.
  
  On the platforms that use this path, don't we fake DEPTH_STENCIL
  textures by having separate depth and stencil surfaces?  The implication
  being that all of the mechanism that does stencil texturing from
  DEPTH_STENCIL surfaces is the same as we would need to texture from
  STENCIL_INDEX8 surfaces.
  
  Wouldn't it be easier to just enable ARB_texture_stencil8 on those
  platforms?
 
 I'm sure you would know better than me :)

Actually, you're the expert here :)

I think that we can just turn on ARB_texture_stencil8 - I just hadn't
done the core Mesa plumbing.  Why don't we try and do that?
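
To make the failure mode discussed in this thread concrete, here is a
minimal sketch (not the actual Mesa code) of a base-format lookup that
rejects GL_STENCIL_INDEX unless stencil texturing is flagged as enabled.
The names ext_flags and base_tex_format are hypothetical; only the GL enum
values are taken from the OpenGL registry.

```c
#include <stdbool.h>

/* GL enum values as published in the OpenGL registry. */
#define GL_STENCIL_INDEX   0x1901
#define GL_DEPTH_STENCIL   0x84F9
#define GL_STENCIL_INDEX8  0x8D48

/* Hypothetical stand-in for the context's extension flags. */
struct ext_flags {
   bool ARB_texture_stencil8;
};

/* Returns the base format, or -1 if the internal format is not a
 * legal texture format with the given extension flags. */
static int
base_tex_format(const struct ext_flags *ext, int internal_format)
{
   switch (internal_format) {
   case GL_DEPTH_STENCIL:
      return GL_DEPTH_STENCIL;
   case GL_STENCIL_INDEX:
   case GL_STENCIL_INDEX8:
      /* Stencil-only textures are accepted only with the extension. */
      return ext->ARB_texture_stencil8 ? GL_STENCIL_INDEX : -1;
   default:
      return -1;
   }
}
```

With the flag off, a temporary texture created from a stencil renderbuffer
is refused by this lookup, which is the situation the meta blit path runs
into; turning the flag on makes the same call succeed.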



Re: [Mesa-dev] [PATCH] i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats.

2015-04-07 Thread Anuj Phogat

On Mon, Apr 6, 2015 at 5:06 PM, Kenneth Graunke kenn...@whitecape.org
wrote:

 This allows those formats to work with the meta PBO upload path.

 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_surface_formats.c | 8 
  1 file changed, 8 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c
 b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 index 7261c01..7524ad9 100644
 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
 +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 @@ -582,6 +582,14 @@ brw_init_surface_formats(struct brw_context *brw)
case BRW_SURFACEFORMAT_L16_FLOAT:
  render = BRW_SURFACEFORMAT_R16_FLOAT;
  break;
 +  case BRW_SURFACEFORMAT_I8_UNORM:
 +  case BRW_SURFACEFORMAT_L8_UNORM:
 + render = BRW_SURFACEFORMAT_R8_UNORM;
 + break;
 +  case BRW_SURFACEFORMAT_I16_UNORM:
 +  case BRW_SURFACEFORMAT_L16_UNORM:
 + render = BRW_SURFACEFORMAT_R16_UNORM;
 + break;
case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
  /* XRGB is handled as ARGB because the chips in this family
   * cannot render to XRGB targets.  This means that we have to
 --
 2.3.5

 Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
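
As a rough illustration of what the hunk above does, the sketch below maps
intensity/luminance formats to the single-channel red format of the same
width when picking a render target format. The enum and function names are
hypothetical stand-ins; the real driver uses the BRW_SURFACEFORMAT_* tokens
inside brw_init_surface_formats().

```c
/* Hypothetical format tokens standing in for BRW_SURFACEFORMAT_*. */
enum surf_format {
   FMT_I8_UNORM,
   FMT_L8_UNORM,
   FMT_R8_UNORM,
   FMT_I16_UNORM,
   FMT_L16_UNORM,
   FMT_R16_UNORM,
   FMT_R8G8B8A8_UNORM
};

/* Pick the format to render with: I/L formats are assumed to be
 * non-renderable, so fall back to the R format of equal width. */
static enum surf_format
render_format_for(enum surf_format fmt)
{
   switch (fmt) {
   case FMT_I8_UNORM:
   case FMT_L8_UNORM:
      return FMT_R8_UNORM;
   case FMT_I16_UNORM:
   case FMT_L16_UNORM:
      return FMT_R16_UNORM;
   default:
      /* Everything else renders as itself in this sketch. */
      return fmt;
   }
}
```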


[Mesa-dev] [Bug 89899] nir/nir_lower_tex_projector.c:112: error: unknown field ‘ssa’ specified in initializer

2015-04-07 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=89899

Jason Ekstrand ja...@jlekstrand.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #2 from Jason Ekstrand ja...@jlekstrand.net ---
I just pushed a patch that should fix this.  Reopen if it's still a problem.


[Mesa-dev] [PATCH] i915: replace __FUNCTION__ with __func__

Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions are expected to support __func__.
No functional changes.
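
For readers unfamiliar with the difference: __func__ is the C99 predefined
identifier holding the enclosing function's name, while __FUNCTION__ is a
compiler extension with the same meaning. A minimal sketch of the pattern
these patches touch follows; DBG_LOG, current_function_name and set_mask
are hypothetical names, not Mesa's.

```c
#include <stdio.h>
#include <string.h>

/* Debug macro in the style used by the drivers: prefix each message
 * with the name of the calling function via C99 __func__. */
#define DBG_LOG(fmt, ...) \
   fprintf(stderr, "%s: " fmt, __func__, __VA_ARGS__)

/* __func__ behaves like a static const char array local to the
 * function, so returning a pointer to it is valid. */
static const char *
current_function_name(void)
{
   return __func__;
}

static unsigned
set_mask(unsigned mask)
{
   mask = mask & 0xff;
   DBG_LOG("mask 0x%x\n", mask);
   return mask;
}
```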

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/mesa/drivers/dri/i915/i830_state.c |   44 
 src/mesa/drivers/dri/i915/i830_texblend.c  |4 +--
 src/mesa/drivers/dri/i915/i830_texstate.c  |2 +-
 src/mesa/drivers/dri/i915/i915_program.c   |8 ++---
 src/mesa/drivers/dri/i915/i915_state.c |   26 +++---
 src/mesa/drivers/dri/i915/i915_tex_layout.c|4 +--
 src/mesa/drivers/dri/i915/i915_texstate.c  |2 +-
 src/mesa/drivers/dri/i915/i915_vtbl.c  |2 +-
 src/mesa/drivers/dri/i915/intel_blit.c |   10 +++---
 src/mesa/drivers/dri/i915/intel_clear.c|2 +-
 src/mesa/drivers/dri/i915/intel_context.c  |2 +-
 src/mesa/drivers/dri/i915/intel_fbo.c  |8 ++---
 src/mesa/drivers/dri/i915/intel_mipmap_tree.c  |   18 +-
 src/mesa/drivers/dri/i915/intel_pixel_bitmap.c |2 +-
 src/mesa/drivers/dri/i915/intel_pixel_copy.c   |6 ++--
 src/mesa/drivers/dri/i915/intel_pixel_read.c   |   12 +++
 src/mesa/drivers/dri/i915/intel_regions.c  |8 ++---
 src/mesa/drivers/dri/i915/intel_render.c   |2 +-
 src/mesa/drivers/dri/i915/intel_state.c|6 ++--
 src/mesa/drivers/dri/i915/intel_tex.c  |   10 +++---
 src/mesa/drivers/dri/i915/intel_tex_copy.c |4 +--
 src/mesa/drivers/dri/i915/intel_tex_image.c|   18 +-
 src/mesa/drivers/dri/i915/intel_tex_subimage.c |2 +-
 src/mesa/drivers/dri/i915/intel_tris.c |   12 +++
 24 files changed, 107 insertions(+), 107 deletions(-)

diff --git a/src/mesa/drivers/dri/i915/i830_state.c b/src/mesa/drivers/dri/i915/i830_state.c
index 3e379f3..13adf56 100644
--- a/src/mesa/drivers/dri/i915/i830_state.c
+++ b/src/mesa/drivers/dri/i915/i830_state.c
@@ -56,7 +56,7 @@ i830StencilFuncSeparate(struct gl_context * ctx, GLenum face, GLenum func, GLint
 
mask = mask & 0xff;
 
-   DBG("%s : func: %s, ref : 0x%x, mask: 0x%x\n", __FUNCTION__,
+   DBG("%s : func: %s, ref : 0x%x, mask: 0x%x\n", __func__,
_mesa_lookup_enum_by_nr(func), ref, mask);
 
 
@@ -77,7 +77,7 @@ i830StencilMaskSeparate(struct gl_context * ctx, GLenum face, GLuint mask)
 {
struct i830_context *i830 = i830_context(ctx);
 
-   DBG("%s : mask 0x%x\n", __FUNCTION__, mask);
+   DBG("%s : mask 0x%x\n", __func__, mask);

mask = mask & 0xff;
 
@@ -94,7 +94,7 @@ i830StencilOpSeparate(struct gl_context * ctx, GLenum face, GLenum fail, GLenum
struct i830_context *i830 = i830_context(ctx);
int fop, dfop, dpop;
 
-   DBG("%s: fail : %s, zfail: %s, zpass : %s\n", __FUNCTION__,
+   DBG("%s: fail : %s, zfail: %s, zpass : %s\n", __func__,
_mesa_lookup_enum_by_nr(fail),
_mesa_lookup_enum_by_nr(zfail),
_mesa_lookup_enum_by_nr(zpass));
@@ -261,7 +261,7 @@ i830BlendColor(struct gl_context * ctx, const GLfloat color[4])
struct i830_context *i830 = i830_context(ctx);
GLubyte r, g, b, a;
 
-   DBG("%s\n", __FUNCTION__);
+   DBG("%s\n", __func__);

UNCLAMPED_FLOAT_TO_UBYTE(r, color[RCOMP]);
UNCLAMPED_FLOAT_TO_UBYTE(g, color[GCOMP]);
@@ -315,7 +315,7 @@ i830_set_blend_state(struct gl_context * ctx)
   break;
default:
   fprintf(stderr, "[%s:%u] Invalid RGB blend equation (0x%04x).\n",
-  __FUNCTION__, __LINE__, ctx->Color.Blend[0].EquationRGB);
+  __func__, __LINE__, ctx->Color.Blend[0].EquationRGB);
   return;
}
 
@@ -343,7 +343,7 @@ i830_set_blend_state(struct gl_context * ctx)
   break;
default:
   fprintf(stderr, "[%s:%u] Invalid alpha blend equation (0x%04x).\n",
-  __FUNCTION__, __LINE__, ctx->Color.Blend[0].EquationA);
+  __func__, __LINE__, ctx->Color.Blend[0].EquationA);
   return;
}
 
@@ -378,7 +378,7 @@ i830_set_blend_state(struct gl_context * ctx)
if (0) {
   fprintf(stderr,
   "[%s:%u] STATE1: 0x%08x IALPHAB: 0x%08x blend is %sabled\n",
-  __FUNCTION__, __LINE__, i830->state.Ctx[I830_CTXREG_STATE1],
+  __func__, __LINE__, i830->state.Ctx[I830_CTXREG_STATE1],
   i830->state.Ctx[I830_CTXREG_IALPHAB],
   (ctx->Color.BlendEnabled) ? "en" : "dis");
}
@@ -388,7 +388,7 @@ i830_set_blend_state(struct gl_context * ctx)
 static void
 i830BlendEquationSeparate(struct gl_context * ctx, GLenum modeRGB, GLenum modeA)
 {
-   DBG("%s -> %s, %s\n", __FUNCTION__,
+   DBG("%s -> %s, %s\n", __func__,
_mesa_lookup_enum_by_nr(modeRGB),
_mesa_lookup_enum_by_nr(modeA));
 
@@ -402,7 +402,7 @@ static void
 i830BlendFuncSeparate(struct gl_context * ctx, GLenum sfactorRGB,
   GLenum dfactorRGB, GLenum sfactorA, GLenum dfactorA)
 {
-   DBG("%s -> RGB(%s, %s) A(%s, %s)\n", __FUNCTION__,
+   DBG("%s ->

[Mesa-dev] [PATCH] glx: replace __FUNCTION__ with __func__

Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions are expected to support __func__.
No functional changes.

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/glx/apple/apple_glx_log.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glx/apple/apple_glx_log.h b/src/glx/apple/apple_glx_log.h
index 4b1c531..b1a5538 100644
--- a/src/glx/apple/apple_glx_log.h
+++ b/src/glx/apple/apple_glx_log.h
@@ -39,14 +39,14 @@ __printflike(5, 6)
 void _apple_glx_log(int level, const char *file, const char *function,
 int line, const char *fmt, ...);
 #define apple_glx_log(l, f, args ...) \
-_apple_glx_log(l, __FILE__, __FUNCTION__, __LINE__, f, ## args)
+_apple_glx_log(l, __FILE__, __func__, __LINE__, f, ## args)
 
 
 __printflike(5, 0)
 void _apple_glx_vlog(int level, const char *file, const char *function,
  int line, const char *fmt, va_list v);
 #define apple_glx_vlog(l, f, v) \
-_apple_glx_vlog(l, __FILE__, __FUNCTION__, __LINE__, f, v)
+_apple_glx_vlog(l, __FILE__, __func__, __LINE__, f, v)
 
 /* This is just here to help the transition.
  * TODO: Replace calls to apple_glx_diagnostic
-- 
1.7.9.5


[Mesa-dev] [PATCH] i965: replace __FUNCTION__ with __func__

Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions are expected to support __func__.
No functional changes.

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/mesa/drivers/dri/common/utils.c|2 +-
 src/mesa/drivers/dri/i965/brw_blorp.cpp|2 +-
 src/mesa/drivers/dri/i965/brw_blorp_blit.cpp   |2 +-
 src/mesa/drivers/dri/i965/brw_context.c|4 +--
 src/mesa/drivers/dri/i965/brw_draw_upload.c|2 +-
 src/mesa/drivers/dri/i965/brw_state_cache.c|4 +--
 src/mesa/drivers/dri/i965/brw_tex_layout.c |2 +-
 src/mesa/drivers/dri/i965/brw_wm_surface_state.c   |2 +-
 src/mesa/drivers/dri/i965/gen6_surface_state.c |2 +-
 src/mesa/drivers/dri/i965/gen7_wm_surface_state.c  |2 +-
 src/mesa/drivers/dri/i965/gen8_surface_state.c |2 +-
 src/mesa/drivers/dri/i965/intel_blit.c |8 +++---
 src/mesa/drivers/dri/i965/intel_fbo.c  |8 +++---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c  |   30 ++--
 src/mesa/drivers/dri/i965/intel_pixel_bitmap.c |2 +-
 src/mesa/drivers/dri/i965/intel_pixel_copy.c   |6 ++--
 src/mesa/drivers/dri/i965/intel_pixel_draw.c   |   14 -
 src/mesa/drivers/dri/i965/intel_pixel_read.c   |8 +++---
 src/mesa/drivers/dri/i965/intel_screen.c   |4 +--
 src/mesa/drivers/dri/i965/intel_tex.c  |   10 +++
 src/mesa/drivers/dri/i965/intel_tex_copy.c |4 +--
 src/mesa/drivers/dri/i965/intel_tex_image.c|   16 +--
 src/mesa/drivers/dri/i965/intel_tex_subimage.c |6 ++--
 .../dri/i965/test_vec4_register_coalesce.cpp   |2 +-
 24 files changed, 72 insertions(+), 72 deletions(-)

diff --git a/src/mesa/drivers/dri/common/utils.c b/src/mesa/drivers/dri/common/utils.c
index bb22107..70d34e8 100644
--- a/src/mesa/drivers/dri/common/utils.c
+++ b/src/mesa/drivers/dri/common/utils.c
@@ -227,7 +227,7 @@ driCreateConfigs(mesa_format format,
   break;
default:
   fprintf(stderr, "[%s:%u] Unknown framebuffer type %s (%d).\n",
-  __FUNCTION__, __LINE__,
+  __func__, __LINE__,
   _mesa_get_format_name(format), format);
   return NULL;
}
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index df00b77..3b03f75 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -194,7 +194,7 @@ intel_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt,
}
 
DBG("%s %s to mt %p level %d layer %d\n",
-   __FUNCTION__, opname, mt, level, layer);
+   __func__, opname, mt, level, layer);
 
if (brw->gen >= 8) {
   gen8_hiz_exec(brw, mt, level, layer, op);
diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
index 644cb41..d25e201 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp
@@ -78,7 +78,7 @@ brw_blorp_blit_miptrees(struct brw_context *brw,
 
DBG("%s from %dx %s mt %p %d %d (%f,%f) (%f,%f)"
"to %dx %s mt %p %d %d (%f,%f) (%f,%f) (flip %d,%d)\n",
-   __FUNCTION__,
+   __func__,
src_mt->num_samples, _mesa_get_format_name(src_mt->format), src_mt,
src_level, src_layer, src_x0, src_y0, src_x1, src_y1,
dst_mt->num_samples, _mesa_get_format_name(dst_mt->format), dst_mt,
diff --git a/src/mesa/drivers/dri/i965/brw_context.c b/src/mesa/drivers/dri/i965/brw_context.c
index ed6fdff..a63d00b 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -722,7 +722,7 @@ brwCreateContext(gl_api api,
 
struct brw_context *brw = rzalloc(NULL, struct brw_context);
if (!brw) {
-  fprintf(stderr, "%s: failed to alloc context\n", __FUNCTION__);
+  fprintf(stderr, "%s: failed to alloc context\n", __func__);
   *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY;
   return false;
}
@@ -778,7 +778,7 @@ brwCreateContext(gl_api api,
 
if (!_mesa_initialize_context(ctx, api, mesaVis, shareCtx, functions)) {
   *dri_ctx_error = __DRI_CTX_ERROR_NO_MEMORY;
-  fprintf(stderr, "%s: failed to init mesa context\n", __FUNCTION__);
+  fprintf(stderr, "%s: failed to init mesa context\n", __func__);
   intelDestroyContext(driContextPriv);
   return false;
}
diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c
index 52dcb6f..623465f 100644
--- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
+++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
@@ -413,7 +413,7 @@ brw_prepare_vertices(struct brw_context *brw)
}
 
if (0)
-  fprintf(stderr, "%s %d..%d\n", __FUNCTION__, min_index, max_index);
+  fprintf(stderr, "%s %d..%d\n", __func__,

[Mesa-dev] [PATCH] main: replace __FUNCTION__ with __func__

Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions are expected to support __func__.
No functional changes.

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/mesa/main/atifragshader.c  |4 ++--
 src/mesa/main/ffvertex_prog.c  |6 +++---
 src/mesa/main/format_unpack.py |4 ++--
 src/mesa/main/glformats.c  |2 +-
 src/mesa/main/mtypes.h |2 +-
 src/mesa/main/state.c  |2 +-
 6 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/mesa/main/atifragshader.c b/src/mesa/main/atifragshader.c
index 9d967b9..9fc3552 100644
--- a/src/mesa/main/atifragshader.c
+++ b/src/mesa/main/atifragshader.c
@@ -476,7 +476,7 @@ _mesa_PassTexCoordATI(GLuint dst, GLuint coord, GLenum swizzle)
curI->swizzle = swizzle;
 
 #if MESA_DEBUG_ATI_FS
-   _mesa_debug(ctx, "%s(%s, %s, %s)\n", __FUNCTION__,
+   _mesa_debug(ctx, "%s(%s, %s, %s)\n", __func__,
   _mesa_lookup_enum_by_nr(dst), _mesa_lookup_enum_by_nr(coord),
   _mesa_lookup_enum_by_nr(swizzle));
 #endif
@@ -549,7 +549,7 @@ _mesa_SampleMapATI(GLuint dst, GLuint interp, GLenum swizzle)
curI->swizzle = swizzle;
 
 #if MESA_DEBUG_ATI_FS
-   _mesa_debug(ctx, "%s(%s, %s, %s)\n", __FUNCTION__,
+   _mesa_debug(ctx, "%s(%s, %s, %s)\n", __func__,
   _mesa_lookup_enum_by_nr(dst), _mesa_lookup_enum_by_nr(interp),
   _mesa_lookup_enum_by_nr(swizzle));
 #endif
diff --git a/src/mesa/main/ffvertex_prog.c b/src/mesa/main/ffvertex_prog.c
index 395b00e..edf7e33 100644
--- a/src/mesa/main/ffvertex_prog.c
+++ b/src/mesa/main/ffvertex_prog.c
@@ -619,13 +619,13 @@ static void emit_op3fn(struct tnl_program *p,
 
 
 #define emit_op3(p, op, dst, mask, src0, src1, src2) \
-   emit_op3fn(p, op, dst, mask, src0, src1, src2, __FUNCTION__, __LINE__)
+   emit_op3fn(p, op, dst, mask, src0, src1, src2, __func__, __LINE__)
 
 #define emit_op2(p, op, dst, mask, src0, src1) \
-emit_op3fn(p, op, dst, mask, src0, src1, undef, __FUNCTION__, __LINE__)
+emit_op3fn(p, op, dst, mask, src0, src1, undef, __func__, __LINE__)
 
 #define emit_op1(p, op, dst, mask, src0) \
-emit_op3fn(p, op, dst, mask, src0, undef, undef, __FUNCTION__, __LINE__)
+emit_op3fn(p, op, dst, mask, src0, undef, undef, __func__, __LINE__)
 
 
 static struct ureg make_temp( struct tnl_program *p, struct ureg reg )
diff --git a/src/mesa/main/format_unpack.py b/src/mesa/main/format_unpack.py
index 53bdf64..9917548 100644
--- a/src/mesa/main/format_unpack.py
+++ b/src/mesa/main/format_unpack.py
@@ -333,7 +333,7 @@ _mesa_unpack_rgba_row(mesa_format format, GLuint n,
   unpack_float_ycbcr_rev(src, dst, n);
   break;
default:
-  _mesa_problem(NULL, "%s: bad format %s", __FUNCTION__,
+  _mesa_problem(NULL, "%s: bad format %s", __func__,
 _mesa_get_format_name(format));
   return;
}
@@ -402,7 +402,7 @@ _mesa_unpack_uint_rgba_row(mesa_format format, GLuint n,
   break;
 %endfor
default:
-  _mesa_problem(NULL, "%s: bad format %s", __FUNCTION__,
+  _mesa_problem(NULL, "%s: bad format %s", __func__,
 _mesa_get_format_name(format));
   return;
}
diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
index 4e05229..8ced579 100644
--- a/src/mesa/main/glformats.c
+++ b/src/mesa/main/glformats.c
@@ -1393,7 +1393,7 @@ _mesa_base_format_has_channel(GLenum base_format, GLenum pname)
   return GL_FALSE;
default:
   _mesa_warning(NULL, "%s: Unexpected channel token 0x%x\n",
-   __FUNCTION__, pname);
+   __func__, pname);
   return GL_FALSE;
}
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 8e1dba6..ab74489 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4523,7 +4523,7 @@ struct gl_context
 #ifdef DEBUG
 extern int MESA_VERBOSE;
 extern int MESA_DEBUG_FLAGS;
-# define MESA_FUNCTION __FUNCTION__
+# define MESA_FUNCTION __func__
 #else
 # define MESA_VERBOSE 0
 # define MESA_DEBUG_FLAGS 0
diff --git a/src/mesa/main/state.c b/src/mesa/main/state.c
index dadfb3c..ccf83d7 100644
--- a/src/mesa/main/state.c
+++ b/src/mesa/main/state.c
@@ -507,7 +507,7 @@ _mesa_set_varying_vp_inputs( struct gl_context *ctx,
   ctx->FragmentProgram._TexEnvProgram) {
  ctx->NewState |= _NEW_VARYING_VP_INPUTS;
   }
-  /*printf("%s %x\n", __FUNCTION__, varying_inputs);*/
+  /*printf("%s %x\n", __func__, varying_inputs);*/
}
 }
 
-- 
1.7.9.5


Re: [Mesa-dev] [PATCH] i965: Fix depth field setting in surface state for raw buffer on Gen7/8

2015-04-07 Thread Anuj Phogat

On Mon, Apr 6, 2015 at 10:51 PM, Zhenyu Wang zhen...@linux.intel.com wrote:
 On Gen7/8, for the RAW surface format, the depth field (surf[3]) in the
 surface state holds bits [30:21] of the number of entries, unlike other
 surface formats, which use only bits [26:21].

 Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
 ---
  src/mesa/drivers/dri/i965/gen7_wm_surface_state.c | 7 +--
  src/mesa/drivers/dri/i965/gen8_surface_state.c| 7 +--
  2 files changed, 10 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 index d9361d3..18bcb8a 100644
 --- a/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_wm_surface_state.c
 @@ -238,8 +238,11 @@ gen7_emit_buffer_surface_state(struct brw_context *brw,
 surf[1] = (bo ? bo->offset64 : 0) + buffer_offset; /* reloc */
 surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
   SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
 -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
 - (pitch - 1);
 +   if (surface_format == BRW_SURFACEFORMAT_RAW)
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
 +   else
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
 +   surf[3] |= (pitch - 1);

 surf[5] = SET_FIELD(GEN7_MOCS_L3, GEN7_SURFACE_MOCS);

 diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c
 index 0007c95..ba59b05 100644
 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c
 @@ -129,8 +129,11 @@ gen8_emit_buffer_surface_state(struct brw_context *brw,

 surf[2] = SET_FIELD((buffer_size - 1) & 0x7f, GEN7_SURFACE_WIDTH) |
   SET_FIELD(((buffer_size - 1) >> 7) & 0x3fff, GEN7_SURFACE_HEIGHT);
 -   surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH) |
 - (pitch - 1);
 +   if (surface_format == BRW_SURFACEFORMAT_RAW)
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3ff, BRW_SURFACE_DEPTH);
 +   else
 +  surf[3] = SET_FIELD(((buffer_size - 1) >> 21) & 0x3f, BRW_SURFACE_DEPTH);
 +   surf[3] |= (pitch - 1);
 surf[7] = SET_FIELD(HSW_SCS_RED,   GEN7_SURFACE_SCS_R) |
   SET_FIELD(HSW_SCS_GREEN, GEN7_SURFACE_SCS_G) |
   SET_FIELD(HSW_SCS_BLUE,  GEN7_SURFACE_SCS_B) |
 --
 2.1.4


Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
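
The bit layout described in the patch can be sketched on its own, assuming
the split is exactly as the commit message and diff state: width takes bits
[6:0] of (buffer_size - 1), height bits [20:7], and depth either bits
[30:21] (RAW) or bits [26:21] (other formats). The struct and function
names below are hypothetical and the SET_FIELD packing is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical holder for the three size fields of a buffer surface. */
struct buffer_dims {
   uint32_t width;   /* bits [6:0]  of (buffer_size - 1) */
   uint32_t height;  /* bits [20:7] of (buffer_size - 1) */
   uint32_t depth;   /* bits [30:21] for RAW, [26:21] otherwise */
};

static struct buffer_dims
split_buffer_size(uint32_t buffer_size, bool is_raw)
{
   const uint32_t n = buffer_size - 1;
   struct buffer_dims d;

   d.width  = n & 0x7f;
   d.height = (n >> 7) & 0x3fff;
   /* RAW surfaces get a 10-bit depth field, others only 6 bits. */
   d.depth  = (n >> 21) & (is_raw ? 0x3ff : 0x3f);
   return d;
}
```

For a 256 MiB buffer (buffer_size = 1 << 28) the 6-bit mask drops the top
depth bit, while the 10-bit RAW mask keeps it.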

[Mesa-dev] [PATCH] state_tracker: replace __FUNCTION__ with __func__

Consistently just use C99's __func__ everywhere.
The patch was verified with the Microsoft Visual Studio 2013
redistributable package (RTM version number: 18.0.21005.1).
Upcoming MSVC versions are expected to support __func__.
No functional changes.

Signed-off-by: Marius Predut marius.pre...@intel.com
---
 src/mesa/state_tracker/st_atom.c   |2 +-
 src/mesa/state_tracker/st_atom_constbuf.c  |2 +-
 src/mesa/state_tracker/st_cb_clear.c   |2 +-
 src/mesa/state_tracker/st_cb_texture.c |   18 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |2 +-
 src/mesa/state_tracker/st_mesa_to_tgsi.c   |2 +-
 src/mesa/state_tracker/st_program.c|2 +-
 src/mesa/state_tracker/st_texture.c|6 +++---
 8 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom.c b/src/mesa/state_tracker/st_atom.c
index f0fe11f..428f2d9 100644
--- a/src/mesa/state_tracker/st_atom.c
+++ b/src/mesa/state_tracker/st_atom.c
@@ -183,7 +183,7 @@ void st_validate_state( struct st_context *st )
if (state->st == 0)
   return;
 
-   /*printf("%s %x/%x\n", __FUNCTION__, state->mesa, state->st);*/
+   /*printf("%s %x/%x\n", __func__, state->mesa, state->st);*/
 
 #ifdef DEBUG
if (1) {
diff --git a/src/mesa/state_tracker/st_atom_constbuf.c b/src/mesa/state_tracker/st_atom_constbuf.c
index 7984bf7..a54e0d9 100644
--- a/src/mesa/state_tracker/st_atom_constbuf.c
+++ b/src/mesa/state_tracker/st_atom_constbuf.c
@@ -92,7 +92,7 @@ void st_upload_constants( struct st_context *st,
 
   if (ST_DEBUG & DEBUG_CONSTANTS) {
  debug_printf("%s(shader=%d, numParams=%d, stateFlags=0x%x)\n",
-  __FUNCTION__, shader_type, params->NumParameters,
+  __func__, shader_type, params->NumParameters,
   params->StateFlags);
  _mesa_print_parameter_list(params);
   }
diff --git a/src/mesa/state_tracker/st_cb_clear.c b/src/mesa/state_tracker/st_cb_clear.c
index dd81a62..f10e906 100644
--- a/src/mesa/state_tracker/st_cb_clear.c
+++ b/src/mesa/state_tracker/st_cb_clear.c
@@ -247,7 +247,7 @@ clear_with_quad(struct gl_context *ctx, unsigned clear_buffers)
   util_framebuffer_get_num_layers(&st->state.framebuffer);
 
/*
-   printf("%s %s%s%s %f,%f %f,%f\n", __FUNCTION__,
+   printf("%s %s%s%s %f,%f %f,%f\n", __func__,
  color ? "color, " : "",
  depth ? "depth, " : "",
  stencil ? "stencil" : "",
diff --git a/src/mesa/state_tracker/st_cb_texture.c b/src/mesa/state_tracker/st_cb_texture.c
index 5c520b4..6b35d61 100644
--- a/src/mesa/state_tracker/st_cb_texture.c
+++ b/src/mesa/state_tracker/st_cb_texture.c
@@ -123,7 +123,7 @@ gl_target_to_pipe(GLenum target)
 static struct gl_texture_image *
 st_NewTextureImage(struct gl_context * ctx)
 {
-   DBG("%s\n", __FUNCTION__);
+   DBG("%s\n", __func__);
(void) ctx;
return (struct gl_texture_image *) ST_CALLOC_STRUCT(st_texture_image);
 }
@@ -144,7 +144,7 @@ st_NewTextureObject(struct gl_context * ctx, GLuint name, GLenum target)
 {
struct st_texture_object *obj = ST_CALLOC_STRUCT(st_texture_object);
 
-   DBG("%s\n", __FUNCTION__);
+   DBG("%s\n", __func__);
_mesa_initialize_texture_object(ctx, &obj->base, name, target);
 
return &obj->base;
@@ -172,7 +172,7 @@ st_FreeTextureImageBuffer(struct gl_context *ctx,
 {
struct st_texture_image *stImage = st_texture_image(texImage);
 
-   DBG("%s\n", __FUNCTION__);
+   DBG("%s\n", __func__);
 
if (stImage->pt) {
   pipe_resource_reference(&stImage->pt, NULL);
@@ -405,7 +405,7 @@ guess_and_alloc_texture(struct st_context *st,
GLuint ptWidth, ptHeight, ptDepth, ptLayers;
enum pipe_format fmt;
 
-   DBG("%s\n", __FUNCTION__);
+   DBG("%s\n", __func__);
 
assert(!stObj->pt);
 
@@ -473,7 +473,7 @@ guess_and_alloc_texture(struct st_context *st,
 
stObj->lastLevel = lastLevel;
 
-   DBG("%s returning %d\n", __FUNCTION__, (stObj->pt != NULL));
+   DBG("%s returning %d\n", __func__, (stObj->pt != NULL));
 
return stObj->pt != NULL;
 }
@@ -496,7 +496,7 @@ st_AllocTextureImageBuffer(struct gl_context *ctx,
GLuint height = texImage->Height;
GLuint depth = texImage->Depth;
 
-   DBG("%s\n", __FUNCTION__);
+   DBG("%s\n", __func__);
 
assert(!stImage->pt); /* xxx this might be wrong */
 
@@ -1148,7 +1148,7 @@ st_GetTexImage(struct gl_context * ctx,
   }
 
   if (ST_DEBUG & DEBUG_FALLBACK)
- debug_printf("%s: fallback format translation\n", __FUNCTION__);
+ debug_printf("%s: fallback format translation\n", __func__);
 
   dstMesaFormat = _mesa_format_from_format_and_type(format, type);
   dstStride = _mesa_image_row_stride(&ctx->Pack, width, format, type);
@@ -1234,7 +1234,7 @@ fallback_copy_texsubimage(struct gl_context *ctx,
struct pipe_transfer *transfer;
 
if (ST_DEBUG & DEBUG_FALLBACK)
-  debug_printf("%s: fallback processing\n", __FUNCTION__);
+  debug_printf("%s: fallback processing\n", __func__);
 
if (st_fb_orientation(ctx->ReadBuffer) == Y_0_TOP) {

Re: [Mesa-dev] DMA_BUF render targets disabled for intel

2015-04-07 Thread Chad Versace

On Thu 02 Apr 2015, Axel Davy wrote:

Hi,

you may be interested in looking at this related bug report:
https://bugs.freedesktop.org/show_bug.cgi?id=87452#c5

Yours,

Axel Davy

On 02/04/2015 11:58, Volker Vogelhuber wrote :
We currently want to stream OpenGL output to an FPGA that does not provide
an SG controller and should manage the transfers from the CPU memory to its
own hardware. For that reason we want the OpenGL driver (intel baytrail)
to render to a specific memory area within the CPU system. Render to
texture, as is possible e.g. on the PowerVR 530, seems not to be possible, as
GL_TEXTURE_EXTERNAL_OES is not valid for glFramebufferTexture2D and,
in contrast to the PowerVR OpenGL implementation, Mesa seems to prohibit the
use of GL_TEXTURE_2D for textures created by glEGLImageTargetTexture2DOES
(there is a check within Mesa where glEGLImageTargetTexture2DOES's target has
to be equal to the target of the texture = GL_TEXTURE_EXTERNAL_OES
!= GL_TEXTURE_2D).

So the only possible way to render to an EGLImage with memory allocated by
myself seems to be to use glEGLImageTargetRenderbufferStorageOES and bind
this render buffer to the FBO using glFramebufferRenderbuffer.

But for some reason, it seems to be forbidden to use an EGLImage imported
from a dmabuf as a render buffer. At least within
src/mesa/drivers/dri/i965/intel_fbo.c there is a check:

/* Buffers originating from outside are for read-only. */
if (image->dma_buf_imported) {
   _mesa_error(ctx, GL_INVALID_OPERATION,
               "glEGLImageTargetRenderbufferStorage(dma buffers are read-only)");
   return;
}

This prevents me from doing what I wanted to do and I googled a bit.
I found someone else that just removed that check:

https://github.com/kalyankondapally/Chromium-OzoneGBM/blob/master/0010-i965-remove-read-only-restriction-of-imported-buffer.patch

That patch isn't safe for general renderbuffer usage... details below.
(As an aside, Chrome OS also has a similar patch in their Mesa tree. But
it's safe for Chrome OS, at least for now).

and after I did so myself, it just worked as I wanted it to work. I only
wonder why this limitation has been added. Is it just for some pedantic
reasons or is there any good reason why EGLImages imported from dmabuf
descriptors shouldn't be used for render targets?

There is a very good reason. It is not pedantic. Tapani (CC'd) and I are
working on enabling this. See
[https://bugs.freedesktop.org/show_bug.cgi?id=87452#c7] for my
work-in-progress patches.

The reason is that, on Intel chipsets Ivybridge and newer, the i965
driver often expects each color buffer to have an auxiliary metadata
buffer that holds compression information. If the aux buffer does not
exist, i965 will create it. If the metadata buffer and the real color
buffer become unsynchronized (which is *very* likely when using a dma_buf
as renderbuffer storage), you will get corrupt rendering. If you haven't
got corrupt rendering, it's solely due to luck (and that luck is
proportional to the density of cleared pixels in the buffer).

Therefore, i965 needs to be taught to disable aux buffers for
dma_buf-backed storage. Before that happens, you risk corrupted images
if you render to a dma_buf-backed renderbuffer.

If you apply Kalyan's patch on top of my (untested) patches, then
that should safely enable what you're doing with the FPGA. (There may
still be bugs with EGLImage orphaning semantics, but that likely won't
affect you).

[Mesa-dev] [PATCH v2] i965/fs: Always invert predicate of SEL with swapped arguments

2015-04-07 Thread Ian Romanick

From: Ian Romanick ian.d.roman...@intel.com

Commit b616164 added an optimization of b2f generation of a comparison.
It also included an extra optimization for when one of the comparison
values is a constant zero.  The trick was that some value was known to be
zero, so that value could be used in the SEL instruction instead of
potentially loading 0.0 into a register.

This change switched the order of the arguments to the SEL, and, for
some unknown reason, I thought that the predicate should therefore
only be inverted for the == case.  Clearly, it should always be
inverted.

Fixes piglit fs-notEqual-of-expression.shader_test and
fs-equal-of-expression.shader_test.

v2: Don't do the "register already has zero" optimization for the '== 0'
case.  In that case, the register does not have zero when we want to
produce a zero result.
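
The reasoning above can be checked with a small scalar model. This is only
a sketch under a simplifying assumption: SEL is modelled as "pick src0 when
the (possibly inverted) flag is set, otherwise src1", and all function
names are hypothetical rather than taken from the i965 backend.

```c
#include <stdbool.h>

/* Simplified model of a predicated SEL: src0 is chosen when the flag,
 * after optional inversion, is true; otherwise src1 is chosen. */
static float
sel(bool flag, bool predicate_inverse, float src0, float src1)
{
   const bool take_src0 = predicate_inverse ? !flag : flag;
   return take_src0 ? src0 : src1;
}

/* b2f(x != 0.0): the flag is set when x is non-zero.  With the
 * predicate inverted, x itself is selected only when the flag is
 * clear, i.e. when x is already 0.0, so no 0.0 immediate is needed. */
static float
b2f_nequal_zero(float x)
{
   const bool flag = (x != 0.0f);
   return sel(flag, true, x, 1.0f);
}

/* The same trick applied to b2f(x == 0.0) is wrong: when the flag is
 * clear, x is non-zero, so selecting x does not produce 0.0. */
static float
b2f_equal_zero_with_trick(float x)
{
   const bool flag = (x == 0.0f);
   return sel(flag, true, x, 1.0f);
}
```

In this model the != 0 case yields exactly 0.0 or 1.0, while the == 0
variant leaks the non-zero operand into the result, which matches the v2
note about dropping the optimization for '== 0'.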

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89722
Reviewed-by: Kenneth Graunke kenn...@whitecape.org [v1]
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index e6fb0cb..da0a08d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -502,15 +502,15 @@ fs_visitor::try_emit_b2f_of_comparison(ir_expression *ir)
 * and(16) g4<1>D  g2<8,8,1>D  1D
 * and(16) m6<1>D  -g4<8,8,1>D 0x3f80UD
 *
-* When the comparison is either == 0.0 or != 0.0 using the knowledge that
-* the true (or false) case already results in zero would allow better code
-* generation by possibly avoiding a load-immediate instruction.
+* When the comparison is != 0.0 using the knowledge that the false case
+* already results in zero would allow better code generation by possibly
+* avoiding a load-immediate instruction.
 */
ir_expression *cmp = ir->operands[0]->as_expression();
if (cmp == NULL)
   return false;
 
-   if (cmp->operation == ir_binop_equal || cmp->operation == ir_binop_nequal) {
+   if (cmp->operation == ir_binop_nequal) {
   for (unsigned i = 0; i < 2; i++) {
  ir_constant *c = cmp->operands[i]->as_constant();
  if (c == NULL || !c->is_zero())
@@ -538,7 +538,7 @@ fs_visitor::try_emit_b2f_of_comparison(ir_expression *ir)
 
 fs_inst *inst = emit(SEL(this->result, op[i ^ 1], fs_reg(1.0f)));
 inst->predicate = BRW_PREDICATE_NORMAL;
-inst->predicate_inverse = cmp->operation == ir_binop_equal;
+inst->predicate_inverse = true;
 return true;
  }
   }
-- 
2.1.0
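
A minimal sketch of why the inverted predicate works for the '!= 0' case while the zero-reuse trick cannot be applied to '== 0'. This is a simplified scalar model in plain C, not i965 code: sel() and the b2f_* helpers are hypothetical names, and the assumption (taken from the commit message above) is that the flag holds the comparison result and SEL picks its first source when the possibly inverted predicate holds.

```c
#include <stdbool.h>

/* Model of a predicated SEL: returns src0 when the (possibly inverted)
 * predicate holds, src1 otherwise. */
static float sel(bool flag, bool predicate_inverse, float src0, float src1)
{
   bool take_src0 = predicate_inverse ? !flag : flag;
   return take_src0 ? src0 : src1;
}

/* b2f(x != 0.0): when the flag is clear, x itself is already 0.0, so the
 * register can be reused as the "false" result. */
float b2f_nequal_zero(float x)
{
   bool flag = (x != 0.0f);
   return sel(flag, true, x, 1.0f);
}

/* b2f(x == 0.0) with the same trick: when the flag is clear, x is
 * non-zero, so reusing the register yields x rather than 0.0. */
float b2f_equal_zero_reusing_reg(float x)
{
   bool flag = (x == 0.0f);
   return sel(flag, true, x, 1.0f);
}
```

In this model b2f_nequal_zero() returns 0.0 or 1.0 as expected, whereas b2f_equal_zero_reusing_reg(3.5f) returns 3.5, which illustrates why v2 restricts the optimization to ir_binop_nequal.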

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium/ttn: add support for temp arrays

From: Rob Clark robcl...@freedesktop.org

Since the rest of NIR really would rather have these as variables rather
than registers, create a nir_variable per array.  But rather than
completely re-arrange ttn to be variable based rather than register
based, keep the registers.  In the cases where there is a matching var
for the reg, ttn_emit_instruction will append the appropriate intrinsic
to get things back from the shadow reg into the variable.

NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give
an array id.  But those just kinda suck, and should really go away.
AFAICT we don't get those from glsl.  Might be an issue for some other
state tracker.

v2: rework to use load_var/store_var with deref chains

Signed-off-by: Rob Clark robcl...@freedesktop.org
---
 src/gallium/auxiliary/nir/tgsi_to_nir.c | 122 +++-
 1 file changed, 103 insertions(+), 19 deletions(-)

diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c
index da935a4..f4c0bad 100644
--- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
+++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
@@ -44,6 +44,7 @@
 struct ttn_reg_info {
/** nir register containing this TGSI index. */
nir_register *reg;
+   nir_variable *var;
/** Offset (in vec4s) from the start of var for this TGSI index. */
int offset;
 };
@@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c)
 
if (file == TGSI_FILE_TEMPORARY) {
   nir_register *reg;
-  if (c->scan->indirect_files & (1 << file)) {
+  nir_variable *var = NULL;
+
+  if (decl->Declaration.Array) {
+ /* for arrays, the register created just serves as a
+  * shadow register.  We append intrinsic_store_global
+  * after the tgsi instruction is translated to move
+  * back from the shadow register to the variable
+  */
+ var = rzalloc(b->shader, nir_variable);
+
+ var->type = glsl_array_type(glsl_vec4_type(), array_size);
+ var->data.mode = nir_var_global;
+ var->name = ralloc_asprintf(var, "arr_%d", decl->Array.ArrayID);
+
+ exec_list_push_tail(&b->shader->globals, &var->node);
+  }
+
+  for (i = 0; i < array_size; i++) {
  reg = nir_local_reg_create(b->impl);
  reg->num_components = 4;
- reg->num_array_elems = array_size;
-
- for (i = 0; i < array_size; i++) {
-c->temp_regs[decl->Range.First + i].reg = reg;
-c->temp_regs[decl->Range.First + i].offset = i;
- }
-  } else {
- for (i = 0; i < array_size; i++) {
-reg = nir_local_reg_create(b->impl);
-reg->num_components = 4;
-c->temp_regs[decl->Range.First + i].reg = reg;
-c->temp_regs[decl->Range.First + i].offset = 0;
- }
+ c->temp_regs[decl->Range.First + i].reg = reg;
+ c->temp_regs[decl->Range.First + i].var = var;
+ c->temp_regs[decl->Range.First + i].offset = i;
   }
} else if (file == TGSI_FILE_ADDRESS) {
   c->addr_reg = nir_local_reg_create(b->impl);
@@ -245,6 +253,32 @@ ttn_emit_immediate(struct ttn_compile *c)
 static nir_src *
 ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect);
 
+/* generate either a constant or indirect deref chain for accessing an
+ * array variable.
+ */
+static nir_deref_var *
+ttn_array_deref(struct ttn_compile *c, nir_variable *var, unsigned offset,
+struct tgsi_ind_register *indirect)
+{
+   nir_builder *b = &c->build;
+   nir_deref_var *deref = nir_deref_var_create(b->shader, var);
+   nir_deref_array *arr = nir_deref_array_create(b->shader);
+
+   arr->base_offset = offset;
+   arr->deref.type = glsl_get_array_element(var->type);
+
+   if (indirect) {
+  arr->deref_array_type = nir_deref_array_type_indirect;
+  arr->indirect = nir_src_for_reg(c->addr_reg);
+   } else {
+  arr->deref_array_type = nir_deref_array_type_direct;
+   }
+
+   deref->deref.child = &arr->deref;
+
+   return deref;
+}
+
 static nir_src
 ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
struct tgsi_ind_register *indirect)
@@ -256,10 +290,25 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
 
switch (file) {
case TGSI_FILE_TEMPORARY:
-  src.reg.reg = c->temp_regs[index].reg;
-  src.reg.base_offset = c->temp_regs[index].offset;
-  if (indirect)
- src.reg.indirect = ttn_src_for_indirect(c, indirect);
+  if (c->temp_regs[index].var) {
+ unsigned offset = c->temp_regs[index].offset;
+ nir_variable *var = c->temp_regs[index].var;
+ nir_intrinsic_instr *load;
+
+ load = nir_intrinsic_instr_create(b->shader,
+   nir_intrinsic_load_var);
+ load->num_components = 4;
+ load->variables[0] = ttn_array_deref(c, var, offset, indirect);
+
+ nir_ssa_dest_init(&load->instr, &load->dest, 4, NULL);
+

Re: [Mesa-dev] [PATCH] i965: replace FUNCTION with func

2015-04-07 Thread Matt Turner

On Tue, Apr 7, 2015 at 12:05 PM, Marius Predut marius.pre...@intel.com wrote:
 Consistently just use C99's __func__ everywhere.
 The patch was verified with Microsoft Visual studio 2013
 redistributable package(RTM version number: 18.0.21005.1)

Presumably not, since i965 isn't built with MSVC :)

But yeah, the patch looks good. I'll collect all of the __func__
patches and commit them later assuming everything else looks good.

Thanks!

[Mesa-dev] Value Range Propagation in NIR (GSoC)

2015-04-07 Thread Thomas Helland

Hi,

For those who don't know, I've submitted a proposal for this year's GSoC.
I've proposed to implement value range propagation and loop unrolling in
NIR.
Since I'm no expert on compilers, I've read up on some literature:

I started with "Constant propagation with conditional branches" (thanks
Connor).
This paper describes an algorithm, sparse conditional constant
propagation (SCCP), that seems to be the de facto standard in compilers
today.

I also found the paper
"Accurate static branch prediction by value range propagation" (VRP).
This describes a value range propagation implementation based on SCCP.
(This also allows one to set heuristics to calculate educated guesses for
the probability of a certain branch, but that's probably more than we're
interested in.)

There is also a GCC paper (with whatever licensing issues may apply),
"A propagation engine for GCC".
They have a shared engine for doing all propagation passes.
It handles the worklists and the logic to traverse them.
The implementing passes then supply callbacks to define the lattice rules.
The callbacks report back whether the instruction was interesting or not,
and the propagation engine basically handles the rest.
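
A minimal sketch of that split between a shared worklist engine and a pass-supplied callback, written in plain C. Everything here is an assumption made for illustration: struct graph, visit_fn, propagate() and the toy visit_copy pass are hypothetical names and bear no relation to the actual GCC or NIR interfaces. The engine only knows how to queue nodes and re-queue the users of any node whose callback reports a change.

```c
#include <stdbool.h>

#define MAX_NODES 16

/* Toy def-use graph: feeds[a][b] is true when node b reads node a.
 * num_nodes is assumed to be at most MAX_NODES. */
struct graph {
   int num_nodes;
   bool feeds[MAX_NODES][MAX_NODES];
};

/* Pass-supplied callback: re-evaluate node n and return true if its
 * lattice value changed (i.e. the node was "interesting"). */
typedef bool (*visit_fn)(const struct graph *g, int n, void *state);

/* Circular queue that never holds the same node twice. */
struct worklist {
   int items[MAX_NODES];
   bool queued[MAX_NODES];
   int head, count;
};

static void push(struct worklist *w, int n)
{
   if (w->queued[n])
      return;
   w->queued[n] = true;
   w->items[(w->head + w->count) % MAX_NODES] = n;
   w->count++;
}

static int pop(struct worklist *w)
{
   int n = w->items[w->head];
   w->head = (w->head + 1) % MAX_NODES;
   w->count--;
   w->queued[n] = false;
   return n;
}

/* Shared engine: visit every node once, then revisit the users of any
 * node whose value changed.  Returns the number of callback invocations. */
int propagate(const struct graph *g, visit_fn visit, void *state)
{
   struct worklist w = {0};
   int visits = 0;

   for (int n = 0; n < g->num_nodes; n++)
      push(&w, n);

   while (w.count > 0) {
      int n = pop(&w);
      visits++;
      if (!visit(g, n, state))
         continue;
      for (int user = 0; user < g->num_nodes; user++)
         if (g->feeds[n][user])
            push(&w, user);
   }
   return visits;
}

/* Example pass state: which nodes have a known constant value. */
struct const_state {
   bool known[MAX_NODES];
   int value[MAX_NODES];
};

/* Example callback: a node becomes known once any of its inputs is known
 * (a stand-in for copy propagation along single-input chains). */
bool visit_copy(const struct graph *g, int n, void *state)
{
   struct const_state *s = state;

   if (s->known[n])
      return false;
   for (int src = 0; src < g->num_nodes; src++) {
      if (g->feeds[src][n] && s->known[src]) {
         s->known[n] = true;
         s->value[n] = s->value[src];
         return true;
      }
   }
   return false;
}
```

With a chain where node 2 feeds node 1 and node 1 feeds node 0, and only node 2 known up front, the engine visits node 0 too early, then revisits it after node 1 changes. A real pass would swap visit_copy for its own lattice rules while leaving propagate() untouched.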

Maybe that's an interesting solution? Or it might not be worth the hassle?
We already have copy propagation, and with value range propagation
we probably don't want separate constant propagation?
(I'm hoping to write the pass so that it handles both constants and value
ranges.)
The GCC guys have used this engine to get copy propagation that propagates
copies across conditionals; maybe this makes such a solution more
interesting?

Connor: I just remembered you saying something about your freedesktop
git repo, so I poked around some and found that you have already done
some work on VRP based on SCCP? How far did you get?

If we just want to make an SCCP-inspired VRP pass, then Connor has work in
progress.
Finishing that, and loop unrolling, may not be enough work for GSoC?
Or maybe Connor wants to finish it off himself, and I should spend my time
implementing some other pass instead, alongside loop unrolling?

Realising Connor has partially started on this, I thought it was a good
idea to get some feedback and ideas from others (in case I need to change
my proposal).
All suggestions, ideas and opinions are more than welcome.
Fire at will, I'm all ears =)

Regards,
Thomas

[Mesa-dev] [PATCH 6/5] nir: Make nir__instr_create take a nir_shader instead of a void context

Signed-off-by: Jason Ekstrand jason.ekstr...@intel.com

---
 src/glsl/nir/nir.c | 36 ++--
 src/glsl/nir/nir.h | 18 +-
 2 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 1c6b603..c6e5361 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -381,11 +381,11 @@ alu_src_init(nir_alu_src *src)
 }
 
 nir_alu_instr *
-nir_alu_instr_create(void *mem_ctx, nir_op op)
+nir_alu_instr_create(nir_shader *shader, nir_op op)
 {
unsigned num_srcs = nir_op_infos[op].num_inputs;
nir_alu_instr *instr =
-  ralloc_size(mem_ctx,
+  ralloc_size(shader,
   sizeof(nir_alu_instr) + num_srcs * sizeof(nir_alu_src));
 
instr_init(&instr->instr, nir_instr_type_alu);
@@ -398,18 +398,18 @@ nir_alu_instr_create(void *mem_ctx, nir_op op)
 }
 
 nir_jump_instr *
-nir_jump_instr_create(void *mem_ctx, nir_jump_type type)
+nir_jump_instr_create(nir_shader *shader, nir_jump_type type)
 {
-   nir_jump_instr *instr = ralloc(mem_ctx, nir_jump_instr);
+   nir_jump_instr *instr = ralloc(shader, nir_jump_instr);
instr_init(&instr->instr, nir_instr_type_jump);
instr->type = type;
return instr;
 }
 
 nir_load_const_instr *
-nir_load_const_instr_create(void *mem_ctx, unsigned num_components)
+nir_load_const_instr_create(nir_shader *shader, unsigned num_components)
 {
-   nir_load_const_instr *instr = ralloc(mem_ctx, nir_load_const_instr);
+   nir_load_const_instr *instr = ralloc(shader, nir_load_const_instr);
instr_init(&instr->instr, nir_instr_type_load_const);
 
nir_ssa_def_init(&instr->instr, &instr->def, num_components, NULL);
@@ -418,11 +418,11 @@ nir_load_const_instr_create(void *mem_ctx, unsigned num_components)
 }
 
 nir_intrinsic_instr *
-nir_intrinsic_instr_create(void *mem_ctx, nir_intrinsic_op op)
+nir_intrinsic_instr_create(nir_shader *shader, nir_intrinsic_op op)
 {
unsigned num_srcs = nir_intrinsic_infos[op].num_srcs;
nir_intrinsic_instr *instr =
-  ralloc_size(mem_ctx,
+  ralloc_size(shader,
   sizeof(nir_intrinsic_instr) + num_srcs * sizeof(nir_src));
 
instr_init(&instr->instr, nir_instr_type_intrinsic);
@@ -438,9 +438,9 @@ nir_intrinsic_instr_create(void *mem_ctx, nir_intrinsic_op op)
 }
 
 nir_call_instr *
-nir_call_instr_create(void *mem_ctx, nir_function_overload *callee)
+nir_call_instr_create(nir_shader *shader, nir_function_overload *callee)
 {
-   nir_call_instr *instr = ralloc(mem_ctx, nir_call_instr);
+   nir_call_instr *instr = ralloc(shader, nir_call_instr);
instr_init(&instr->instr, nir_instr_type_call);
 
instr->callee = callee;
@@ -452,9 +452,9 @@ nir_call_instr_create(void *mem_ctx, nir_function_overload *callee)
 }
 
 nir_tex_instr *
-nir_tex_instr_create(void *mem_ctx, unsigned num_srcs)
+nir_tex_instr_create(nir_shader *shader, unsigned num_srcs)
 {
-   nir_tex_instr *instr = ralloc(mem_ctx, nir_tex_instr);
+   nir_tex_instr *instr = ralloc(shader, nir_tex_instr);
instr_init(&instr->instr, nir_instr_type_tex);
 
dest_init(&instr->dest);
@@ -472,9 +472,9 @@ nir_tex_instr_create(void *mem_ctx, unsigned num_srcs)
 }
 
 nir_phi_instr *
-nir_phi_instr_create(void *mem_ctx)
+nir_phi_instr_create(nir_shader *shader)
 {
-   nir_phi_instr *instr = ralloc(mem_ctx, nir_phi_instr);
+   nir_phi_instr *instr = ralloc(shader, nir_phi_instr);
instr_init(&instr->instr, nir_instr_type_phi);
 
dest_init(&instr->dest);
@@ -483,9 +483,9 @@ nir_phi_instr_create(void *mem_ctx)
 }
 
 nir_parallel_copy_instr *
-nir_parallel_copy_instr_create(void *mem_ctx)
+nir_parallel_copy_instr_create(nir_shader *shader)
 {
-   nir_parallel_copy_instr *instr = ralloc(mem_ctx, nir_parallel_copy_instr);
+   nir_parallel_copy_instr *instr = ralloc(shader, nir_parallel_copy_instr);
instr_init(&instr->instr, nir_instr_type_parallel_copy);
 
exec_list_make_empty(&instr->entries);
@@ -494,9 +494,9 @@ nir_parallel_copy_instr_create(void *mem_ctx)
 }
 
 nir_ssa_undef_instr *
-nir_ssa_undef_instr_create(void *mem_ctx, unsigned num_components)
+nir_ssa_undef_instr_create(nir_shader *shader, unsigned num_components)
 {
-   nir_ssa_undef_instr *instr = ralloc(mem_ctx, nir_ssa_undef_instr);
+   nir_ssa_undef_instr *instr = ralloc(shader, nir_ssa_undef_instr);
instr_init(&instr->instr, nir_instr_type_ssa_undef);
 
nir_ssa_def_init(&instr->instr, &instr->def, num_components, NULL);
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 0f72301..f9ca0f7 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1480,26 +1480,26 @@ void nir_metadata_require(nir_function_impl *impl, nir_metadata required);
 void nir_metadata_preserve(nir_function_impl *impl, nir_metadata preserved);
 
 /** creates an instruction with default swizzle/writemask/etc. with NULL registers */
-nir_alu_instr *nir_alu_instr_create(void *mem_ctx, nir_op op);
+nir_alu_instr *nir_alu_instr_create(nir_shader *shader, nir_op op);
 
-nir_jump_instr *nir_jump_instr_create(void *mem_ctx,

Re: [Mesa-dev] [PATCH] gallium/ttn: add support for temp arrays

On Tue, Apr 7, 2015 at 1:30 PM, Rob Clark robdcl...@gmail.com wrote:
 From: Rob Clark robcl...@freedesktop.org

 Since the rest of NIR really would rather have these as variables rather
 than registers, create a nir_variable per array.  But rather than
 completely re-arrange ttn to be variable based rather than register
 based, keep the registers.  In the cases where there is a matching var
 for the reg, ttn_emit_instruction will append the appropriate intrinsic
 to get things back from the shadow reg into the variable.

 NOTE: this doesn't quite handle TEMP[ADDR[]] when the DCL doesn't give
 an array id.  But those just kinda suck, and should really go away.
 AFAICT we don't get those from glsl.  Might be an issue for some other
 state tracker.

 v2: rework to use load_var/store_var with deref chains

 Signed-off-by: Rob Clark robcl...@freedesktop.org
 ---
  src/gallium/auxiliary/nir/tgsi_to_nir.c | 122 +++-
  1 file changed, 103 insertions(+), 19 deletions(-)

 diff --git a/src/gallium/auxiliary/nir/tgsi_to_nir.c b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 index da935a4..f4c0bad 100644
 --- a/src/gallium/auxiliary/nir/tgsi_to_nir.c
 +++ b/src/gallium/auxiliary/nir/tgsi_to_nir.c
 @@ -44,6 +44,7 @@
  struct ttn_reg_info {
 /** nir register containing this TGSI index. */
 nir_register *reg;
 +   nir_variable *var;
 /** Offset (in vec4s) from the start of var for this TGSI index. */
 int offset;
  };
 @@ -121,22 +122,29 @@ ttn_emit_declaration(struct ttn_compile *c)

 if (file == TGSI_FILE_TEMPORARY) {
nir_register *reg;
 -  if (c->scan->indirect_files & (1 << file)) {
 +  nir_variable *var = NULL;
 +
 +  if (decl->Declaration.Array) {
 + /* for arrays, the register created just serves as a
 +  * shadow register.  We append intrinsic_store_global
 +  * after the tgsi instruction is translated to move
 +  * back from the shadow register to the variable
 +  */
 + var = rzalloc(b->shader, nir_variable);
 +
 + var->type = glsl_array_type(glsl_vec4_type(), array_size);
 + var->data.mode = nir_var_global;
 + var->name = ralloc_asprintf(var, "arr_%d", decl->Array.ArrayID);
 +
 + exec_list_push_tail(&b->shader->globals, &var->node);
 +  }
 +
 +  for (i = 0; i < array_size; i++) {
   reg = nir_local_reg_create(b->impl);
   reg->num_components = 4;
 - reg->num_array_elems = array_size;
 -
 - for (i = 0; i < array_size; i++) {
 -c->temp_regs[decl->Range.First + i].reg = reg;
 -c->temp_regs[decl->Range.First + i].offset = i;
 - }
 -  } else {
 - for (i = 0; i < array_size; i++) {
 -reg = nir_local_reg_create(b->impl);
 -reg->num_components = 4;
 -c->temp_regs[decl->Range.First + i].reg = reg;
 -c->temp_regs[decl->Range.First + i].offset = 0;
 - }
 + c->temp_regs[decl->Range.First + i].reg = reg;
 + c->temp_regs[decl->Range.First + i].var = var;
 + c->temp_regs[decl->Range.First + i].offset = i;
}
 } else if (file == TGSI_FILE_ADDRESS) {
c->addr_reg = nir_local_reg_create(b->impl);
 @@ -245,6 +253,32 @@ ttn_emit_immediate(struct ttn_compile *c)
  static nir_src *
  ttn_src_for_indirect(struct ttn_compile *c, struct tgsi_ind_register *indirect);

 +/* generate either a constant or indirect deref chain for accessing an
 + * array variable.
 + */
 +static nir_deref_var *
 +ttn_array_deref(struct ttn_compile *c, nir_variable *var, unsigned offset,
 +struct tgsi_ind_register *indirect)
 +{
 +   nir_builder *b = &c->build;
 +   nir_deref_var *deref = nir_deref_var_create(b->shader, var);
 +   nir_deref_array *arr = nir_deref_array_create(b->shader);

As per code Ken just pushed today, deref_var's need to be created with
the instruction as the memory context and all other derefs need to be
created with the parent deref as the memory context.  The validator
will assert-fail if you don't.

Other than that, this looks good to me.  I can't speak for the other
TTN bits but it looks ok.  For whatever it's worth,

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

 +
 +   arr->base_offset = offset;
 +   arr->deref.type = glsl_get_array_element(var->type);
 +
 +   if (indirect) {
 +  arr->deref_array_type = nir_deref_array_type_indirect;
 +  arr->indirect = nir_src_for_reg(c->addr_reg);
 +   } else {
 +  arr->deref_array_type = nir_deref_array_type_direct;
 +   }
 +
 +   deref->deref.child = &arr->deref;
 +
 +   return deref;
 +}
 +
  static nir_src
  ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,
 struct tgsi_ind_register *indirect)
 @@ -256,10 +290,25 @@ ttn_src_for_file_and_index(struct ttn_compile *c, unsigned file, unsigned index,

 switch (file) {
 case TGSI_FILE_TEMPORARY:
 -  src.reg.reg = c->temp_regs[index].reg;
 -  src.reg.base_offset =

Re: [Mesa-dev] [PATCH] glsl: Allow any sort of sampler array indexing with GLSL ES 3.00

2015-04-07 Thread Ian Romanick

On 04/07/2015 03:22 AM, Francisco Jerez wrote:
 Tapani Pälli tapani.pa...@intel.com writes:
 
 From: Kalyan Kondapally kalyan.kondapa...@intel.com

 Dynamic indexing of sampler arrays is prohibited by GLSL ES 3.00.
 Earlier versions allow 'constant-index-expression' indexing, where
 index can contain a loop induction variable.

 Patch allows dynamic indexing for sampler arrays when GLSL ES < 3.00.
 This change makes 'sampler-array-index.frag' parser test in Piglit
 pass + fishgl.com works when running Chrome on OpenGL ES 2.0 backend.

 v2: small change and some more commit message (Tapani)

 Signed-off-by: Kalyan Kondapally kalyan.kondapa...@intel.com
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84225
 
 Looks good, but did you check what happens now if the shader uses actual
 variable indexing (i.e. which lowering cannot turn into a constant) on
 an implementation that doesn't support it?  Hopefully no crashes or
 hangs?

I think we should add a post-link check that no dynamic indexing remains
after all the optimizations are complete.  The intention of the ES2
language was to allow cases where the dynamic indexing could be
optimized away.  This was retracted in ES3 because each optimizer was
differently capable, so a shader that worked on one driver/GPU might
fail on another... even from the same vendor.

Adding the post-link check should prevent the problems that Curro
(rightly) worried about, and it should still allow the WebGL demo to work.

 ---
  src/glsl/ast_array_index.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
 index ecef651..b2609b6 100644
 --- a/src/glsl/ast_array_index.cpp
 +++ b/src/glsl/ast_array_index.cpp
 @@ -226,7 +226,7 @@ _mesa_ast_array_index_to_hir(void *mem_ctx,
 * dynamically uniform expression is undefined.
 */
if (array->type->element_type()->is_sampler()) {
 - if (!state->is_version(130, 100)) {
 + if (!state->is_version(130, 300)) {
  if (state->es_shader) {
 _mesa_glsl_warning(&loc, state,
"sampler arrays indexed with non-constant "

It looks like this is what e3ded7f should have made this code.

Looking at the rest of the surrounding code, I don't think this is quite
right... at the very least, it's not easy to follow.  You can blame me
and Paul for that.  I think this is correct and easier to follow:

   if (!state->is_version(400, 0) && !state->ARB_gpu_shader5_enable) {
      if (state->is_version(130, 300))
         _mesa_glsl_error(&loc, state,
                          "sampler arrays indexed with non-constant "
                          "expressions are forbidden in GLSL %s "
                          "and later",
                          state->es_shader ? "ES 3.00" : "1.30");
      else if (state->es_shader)
         _mesa_glsl_warning(&loc, state,
                            "sampler arrays indexed with non-constant "
                            "expressions are optional in %s and will "
                            "be forbidden in GLSL ES 3.00 and later",
                            state->version_string());
      else
         _mesa_glsl_warning(&loc, state,
                            "sampler arrays indexed with non-constant "
                            "expressions will be forbidden in GLSL "
                            "1.30 and later");
   }

 -- 
 2.1.0




Re: [Mesa-dev] [PATCH] i965: Add the ability to render to I8/L8 and I16/L16 UNORM formats.

2015-04-07 Thread Ian Romanick

On 04/06/2015 05:06 PM, Kenneth Graunke wrote:
 This allows those formats to work with the meta PBO upload path.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org
 ---
  src/mesa/drivers/dri/i965/brw_surface_formats.c | 8 ++++++++
  1 file changed, 8 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 index 7261c01..7524ad9 100644
 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c
 +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c
 @@ -582,6 +582,14 @@ brw_init_surface_formats(struct brw_context *brw)
case BRW_SURFACEFORMAT_L16_FLOAT:
render = BRW_SURFACEFORMAT_R16_FLOAT;
break;
 +  case BRW_SURFACEFORMAT_I8_UNORM:
 +  case BRW_SURFACEFORMAT_L8_UNORM:
 + render = BRW_SURFACEFORMAT_R8_UNORM;
 + break;

I wasn't sure this was correct, so I spent some time digging in the GL
spec.  Table 3.15 on page 179 (page 195 of the PDF) of the OpenGL 3.0
spec shows that this mapping is correct.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

 +  case BRW_SURFACEFORMAT_I16_UNORM:
 +  case BRW_SURFACEFORMAT_L16_UNORM:
 + render = BRW_SURFACEFORMAT_R16_UNORM;
 + break;
case BRW_SURFACEFORMAT_B8G8R8X8_UNORM:
/* XRGB is handled as ARGB because the chips in this family
 * cannot render to XRGB targets.  This means that we have to
 


Re: [Mesa-dev] Value Range Propagation in NIR (GSoC)

2015-04-07 Thread Connor Abbott

Hi Thomas,

Thanks for submitting a proposal! Some comments/answers below.

On Tue, Apr 7, 2015 at 3:34 PM, Thomas Helland
thomashellan...@gmail.com wrote:
 Hi,

 For those who don't know, I've submitted a proposal for this year's GSoC.
 I've proposed to implement value range propagation and loop unrolling in
 NIR.
 Since I'm no expert on compilers, I've read up on some literature:

 I started with "Constant propagation with conditional branches" (thanks
 Connor).
 This paper describes an algorithm, sparse conditional constant
 propagation (SCCP), that seems to be the de facto standard in compilers
 today.

 I also found the paper
 "Accurate static branch prediction by value range propagation" (VRP).
 This describes a value range propagation implementation based on SCCP.
 (This also allows one to set heuristics to calculate educated guesses for
 the probability of a certain branch, but that's probably more than we're
 interested in.)

Thanks for mentioning that... I had forgotten the name of that paper.
You're right in that the branch probability stuff isn't too useful for
us. Also, it raises an important issue about back-edges from phi
nodes; they present a more sophisticated method to handle it, but I
think that for now we can just force back edges to have an infinite
range unless they're constant.
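
A minimal sketch of that conservative rule for phi nodes, in plain C. It is one possible reading of the suggestion above rather than existing NIR code: struct range, phi_range() and the is_back_edge array are hypothetical, and LONG_MIN/LONG_MAX stand in for an infinite range. Sources arriving over a back edge are widened to the full range unless their range is a single constant; the phi result is then the union of all sources.

```c
#include <limits.h>
#include <stdbool.h>

/* Inclusive integer range; LONG_MIN and LONG_MAX stand in for -inf/+inf. */
struct range {
   long lo, hi;
};

static bool range_is_constant(struct range r)
{
   return r.lo == r.hi;
}

/* Smallest range containing both a and b. */
static struct range range_union(struct range a, struct range b)
{
   struct range r;
   r.lo = a.lo < b.lo ? a.lo : b.lo;
   r.hi = a.hi > b.hi ? a.hi : b.hi;
   return r;
}

/* Range of a phi with num_srcs >= 1 sources.  is_back_edge[i] marks the
 * sources that arrive over a loop back edge. */
struct range phi_range(const struct range *srcs, const bool *is_back_edge,
                       int num_srcs)
{
   const struct range full = { LONG_MIN, LONG_MAX };
   struct range result = { 0, 0 };

   for (int i = 0; i < num_srcs; i++) {
      struct range src = srcs[i];

      /* Conservative rule: give up on non-constant back-edge sources. */
      if (is_back_edge[i] && !range_is_constant(src))
         src = full;

      result = (i == 0) ? src : range_union(result, src);
   }
   return result;
}
```

This trades precision for guaranteed termination: a loop counter whose back-edge value is not constant immediately becomes unbounded instead of growing one step per iteration of the analysis.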


 There is also a GCC paper (with whatever licensing issues may apply),
 "A propagation engine for GCC".
 They have a shared engine for doing all propagation passes.
 It handles the worklists and the logic to traverse them.
 The implementing passes then supply callbacks to define the lattice rules.
 The callbacks report back whether the instruction was interesting or not,
 and the propagation engine basically handles the rest.

 Maybe that's an interesting solution? Or it might not be worth the hassle?
 We already have copy propagation, and with value range propagation
 we probably don't want separate constant propagation?
 (I'm hoping to write the pass so that it handles both constants and value
 ranges.)

Yes, copy propagation probably won't be so useful once we have value
range propagation; the former is a special case of the latter. Note
that we have a nifty way of actually doing the constant folding
(nir_constant_expressions.py and nir_constant_expressions.h), which
you should still use if all the inputs are constant.

 The GCC guys have used this engine to get copy propagation that propagates
 copies across conditionals; maybe this makes such a solution more
 interesting?

I'm not so sure how useful such a general framework will be. Constant
propagation that handles back-edges seems interesting, but I'm not
sure it's worth the time to implement something this general as a
first pass.


 Connor: I just remembered you saying something about your freedesktop
 git repo, so I poked around some and found that you have already done
 some work on VRP based on SCCP? How far did you get?

I started on it, but then I realized that the approach I was using was
too cumbersome/complicated so I don't think what I have is too useful.
Feel free to work on it yourself, although Jason and I have discussed
it so we have some ideas of how to do it. I've written a few notes on
this below that you may find useful.

- I have a branch I created while working on VRP that you'll probably
find useful: http://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-worklist
The first two commits are already in master, but the last two should
be useful for implementing SCCP/VRP (although they'll need to be
rebased, obviously).

- There's a comment in the SCCP paper (5.3, Nodes versus Edges) that
says: "An alternative way of implementing this would be to add nodes
to the graph and then associate an ExecutableFlag with each node. An
additional node must be inserted between any node that has more than
one immediate successor and any successor node that has more than one
immediate predecessor." I think this procedure is what's usually
called "splitting critical edges"; in NIR, thanks to the structured
control flow, there are never any critical edges except for one edge
case you don't really have to care about too much (namely, an infinite
loop with one basic block) and therefore you can just use the basic
block worklist that I added in the branch mentioned above, rather than
a worklist of basic block edges as the paper describes.

- The reason my pass was becoming so cumbersome was because I was
trying to solve two problems at once. First, there's actually
propagating the ranges. Then, there's taking into account restrictions
on range due to branch predicates. For example, if I have something
like:

if (x > 0) {
y = max(x, 0);
}

then since the use of x is dominated by the then-branch of the if, x
has to be greater than 0 and we can optimize it. This is a little
contrived, but we have seen things like:

if (foo)
break;

/* ... */

if (foo)
break;

in the wild, where we could get rid of the redundant break using this
analysis by recognizing that since

Re: [Mesa-dev] [PATCH] st/mesa: align cube map arrays layers