date:20160915

[Mesa-dev] V3 Loop unrolling in NIR

2016-09-15 Thread Timothy Arceri

Big thanks to Connor for his feedback on previous versions, and
to Jason for answering all my nir questions.

This series works on ssa defs so for now it's only enabled for
the scalar backend on Gen7+.

V3:
- So called complex loop unrolling has been implemented.
- An instruction limit and rules from the GLSL IR pass to override
 the limit for unrolling have been implemented.
- Lots of other stuff; see individual patches.

total instructions in shared programs: 8488940 -> 8488648 (-0.00%)
instructions in affected programs: 48903 -> 48611 (-0.60%)
helped: 68
HURT: 89

Most of this HURT comes from switching to using
nir_lower_indirect_derefs(). See patch 1 for more details.

total cycles in shared programs: 69787006 -> 69758740 (-0.04%)
cycles in affected programs: 2525708 -> 2497442 (-1.12%)
helped: 900
HURT: 919

total loops in shared programs: 2071 -> 1499 (-27.62%)
loops in affected programs: 687 -> 115 (-83.26%)
helped: 655
HURT: 99

Helped here comes from a number of things. One example is that the
nir pass is better than the GLSL pass at unrolling loops
regardless of which terminator has the lowest limit. We could
easily go further and handle unrolling of loops with complex
terminators, e.g. where the if's then or else blocks contain
instructions. Currently we just bail if they are not empty; I still
need to check if it's worthwhile.
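
To illustrate the "lowest limit" point, here is a minimal sketch of picking the limiting terminator among several loop exits. The struct and function names are hypothetical stand-ins for illustration, not the nir_loop_terminator code from this series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a loop terminator; this is not
 * the nir_loop_terminator struct added by the series. */
struct terminator {
   bool trip_count_known;
   unsigned trip_count;
};

/* Return the index of the terminator with the smallest known trip count,
 * or -1 if no terminator has a known trip count. */
static int
find_limiting_terminator(const struct terminator *terms, size_t count)
{
   int limiting = -1;

   for (size_t i = 0; i < count; i++) {
      if (!terms[i].trip_count_known)
         continue;

      if (limiting < 0 || terms[i].trip_count < terms[limiting].trip_count)
         limiting = (int)i;
   }

   return limiting;
}
```

The position of the terminator in the list does not matter here; only the smallest known trip count does.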

Another reason could be that I've set the instruction limit too
high but that doesn't seem to be the case.

I believe 82/99 of the HURT is from shaders that look something
like this:

  vec2 array[const_size_of_array];
  for (i = 0; i < const_size_of_array; i++) {
     ...  = array[i];

     ... lots of instructions (more than the unroll limit) ...
  }

The GLSL IR pass would force this to unroll as long as const_size_of_array
wasn't greater than 32. However, by the time we get to the nir pass the
arrays have been removed. It seems like this may only be happening for
vectors, but I haven't looked into what is causing it yet.

The other 17 shaders seem to be various corner cases that can be fixed
in follow-up patches.

total spills in shared programs: 2212 -> 2212 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 1891 -> 1891 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   6
GAINED: 32

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 04/10] nir: Add a LCSSA-pass

2016-09-15 Thread Timothy Arceri

From: Thomas Helland 

V2: Do a "depth first search" to convert to LCSSA

V3: Small comment fixup

V4: Rebase, adapt to removal of function overloads

V5: Rebase, adapt to relocation of nir to compiler/nir
Still need to adapt to potential if-uses
Work around nir_validate issue

V6 (Timothy):
 - tidy lcssa and stop leaking memory
 - don't rewrite the src for the lcssa phi node
 - validate lcssa phi srcs to avoid postvalidate assert
 - don't add new phi if one already exists
 - more lcssa phi validation fixes
 - Rather than marking ssa defs inside a loop just mark blocks inside
   a loop. This is simpler and fixes lcssa for intrinsics which do
   not have a destination.
 - don't create LCSSA phis for loops we won't unroll
 - require loop metadata for lcssa pass
 - handle case where the ssa def's use outside the loop is already a phi

lcssa handle case where the ssa def's use outside the loop is by a phi
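
As a rough illustration of the "mark blocks inside a loop" approach, the sketch below decides whether a value needs an LCSSA phi: it must be defined in a loop block and have at least one use in a block outside the loop. The structs and the 64-block bitmask are assumptions made for this example; the actual pass uses a BITSET_WORD array indexed by block index.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a use and an SSA def; only the index of the
 * block each one lives in matters for this check. */
struct use {
   unsigned block_index;
};

struct def {
   unsigned block_index;
   const struct use *uses;
   size_t num_uses;
};

/* Bit i of loop_blocks is set when block i is inside the loop. */
static bool
block_in_loop(uint64_t loop_blocks, unsigned block_index)
{
   return block_index < 64 && ((loop_blocks >> block_index) & 1u);
}

/* A def needs an LCSSA phi when it is defined inside the loop and at
 * least one of its uses sits in a block outside the loop. */
static bool
needs_lcssa_phi(const struct def *d, uint64_t loop_blocks)
{
   if (!block_in_loop(loop_blocks, d->block_index))
      return false;

   for (size_t i = 0; i < d->num_uses; i++) {
      if (!block_in_loop(loop_blocks, d->uses[i].block_index))
         return true;
   }

   return false;
}
```

Marking blocks rather than defs means an instruction without a destination still has a well-defined "inside the loop" answer, since only its block is consulted.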
---
 src/compiler/Makefile.sources   |   1 +
 src/compiler/nir/nir.h  |   5 +
 src/compiler/nir/nir_to_lcssa.c | 227 
 src/compiler/nir/nir_validate.c |  11 +-
 4 files changed, 241 insertions(+), 3 deletions(-)
 create mode 100644 src/compiler/nir/nir_to_lcssa.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 7ed26a9..8ef6080 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -247,6 +247,7 @@ NIR_FILES = \
nir/nir_search_helpers.h \
nir/nir_split_var_copies.c \
nir/nir_sweep.c \
+   nir/nir_to_lcssa.c \
nir/nir_to_ssa.c \
nir/nir_validate.c \
nir/nir_vla.h \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 3a2a13a..eb81955 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1384,6 +1384,8 @@ typedef struct {
struct exec_list srcs; /** < list of nir_phi_src */
 
nir_dest dest;
+
+   bool is_lcssa_phi;
 } nir_phi_instr;
 
 typedef struct {
@@ -2632,6 +2634,9 @@ void nir_convert_to_ssa(nir_shader *shader);
 bool nir_repair_ssa_impl(nir_function_impl *impl);
 bool nir_repair_ssa(nir_shader *shader);
 
+void nir_to_lcssa_impl(nir_function_impl *impl);
+void nir_to_lcssa(nir_shader *shader);
+
 /* If phi_webs_only is true, only convert SSA values involved in phi nodes to
  * registers.  If false, convert all values (even those not involved in a phi
  * node) to registers.
diff --git a/src/compiler/nir/nir_to_lcssa.c b/src/compiler/nir/nir_to_lcssa.c
new file mode 100644
index 000..4011ae0
--- /dev/null
+++ b/src/compiler/nir/nir_to_lcssa.c
@@ -0,0 +1,227 @@
+/*
+ * Copyright © 2015 Thomas Helland
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+/*
+ * This pass converts the ssa-graph into "Loop Closed SSA form". This is
+ * done by placing phi nodes at the exits of the loop for all values
+ * that are used outside the loop. The result is it transforms:
+ *
+ * loop {                    ->  loop {
+ *    ssa2 = ...             ->     ssa2 = ...
+ *    if (cond)              ->     if (cond) {
+ *       break;              ->        break;
+ *    ssa3 = ssa2 * ssa4     ->     }
+ * }                         ->     ssa3 = ssa2 * ssa4
+ * ssa6 = ssa2 + 4           ->  }
+ *                               ssa5 = lcssa_phi(ssa2)
+ *                               ssa6 = ssa5 + 4
+ */
+
+#include "nir.h"
+
+typedef struct {
+   /* The nir_shader we are transforming */
+   nir_shader *shader;
+
+   /* The loop we store information for */
+   nir_loop *loop;
+
+   /* Keep track of which defs are in the loop */
+   BITSET_WORD *is_in_loop;
+
+   /* General purpose bool */
+   bool flag;
+} lcssa_state;
+
+static void
+mark_block_as_in_loop(nir_block *blk, void *state)
+{
+   lcssa_state *state_cast = (lcssa_state *) state;
+   BITSET_SET(state_cast->is_in_loop, blk->index);
+}
+
+static void
+is_block_outs

[Mesa-dev] [PATCH 03/10] nir: add helpers to check if we can unroll loops

2016-09-15 Thread Timothy Arceri

This will be used by the loop unroll and lcssa passes.

V2:
- Check instruction count is not too large for unrolling
- Add helper for complex loop unrolling
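
The size check can be tried in isolation with the standalone sketch below. It mirrors the logic of is_loop_small_enough_to_unroll() from this patch, but the struct is a cut-down stand-in for nir_loop_info and max_iter is passed directly rather than read from the shader options. The value 32 used in the comments is an assumption for illustration only.

```c
#include <stdbool.h>

/* Cut-down, hypothetical stand-in for nir_loop_info. */
struct loop_info {
   unsigned num_instructions;
   unsigned trip_count;
   bool force_unroll;
};

/* Same shape as the helper in the patch: the iteration limit always
 * applies, force_unroll only overrides the instruction budget. */
static bool
small_enough_to_unroll(const struct loop_info *li, unsigned max_iter)
{
   if (li->trip_count > max_iter)
      return false;

   if (li->force_unroll)
      return true;

   /* With max_iter = 32 the budget is 32 * 25 = 800 unrolled
    * instructions. */
   return li->num_instructions * li->trip_count <= max_iter * 25;
}
```

With max_iter = 32, a 10-instruction loop running 32 times (320 instructions unrolled) passes, while a 100-instruction loop running 16 times (1600) is rejected unless force_unroll is set.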
---
 src/compiler/nir/nir.h | 31 +++
 1 file changed, 31 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 49e8cd8..3a2a13a 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2590,6 +2590,37 @@ bool nir_normalize_cubemap_coords(nir_shader *shader);
 
 void nir_live_ssa_defs_impl(nir_function_impl *impl);
 
+static inline bool
+is_loop_small_enough_to_unroll(nir_shader *shader, nir_loop_info *li)
+{
+   unsigned max_iter = shader->options->max_unroll_iterations;
+
+   if (li->trip_count > max_iter)
+  return false;
+
+   if (li->force_unroll)
+  return true;
+
+   bool loop_not_too_large =
+  li->num_instructions * li->trip_count <= max_iter * 25;
+
+   return loop_not_too_large;
+}
+
+static inline bool
+is_complex_loop(nir_shader *shader, nir_loop_info *li)
+{
+   unsigned num_lt = list_length(&li->loop_terminator_list);
+   return is_loop_small_enough_to_unroll(shader, li) && num_lt == 2;
+}
+
+static inline bool
+is_simple_loop(nir_shader *shader, nir_loop_info *li)
+{
+   return li->is_trip_count_known &&
+  is_loop_small_enough_to_unroll(shader, li);
+}
+
 void nir_loop_analyze_impl(nir_function_impl *impl,
nir_variable_mode indirect_mask);
 
-- 
2.7.4


[Mesa-dev] [PATCH 09/10] nir: pass compiler rather than devinfo to functions that call nir_optimize

2016-09-15 Thread Timothy Arceri

Later we will pass compiler to nir_optimize to be used by the loop unroll
pass.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 10 --
 src/mesa/drivers/dri/i965/brw_nir.c   |  7 ---
 src/mesa/drivers/dri/i965/brw_nir.h   |  4 ++--
 src/mesa/drivers/dri/i965/brw_shader.cpp  |  4 ++--
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  5 ++---
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp |  5 ++---
 src/mesa/drivers/dri/i965/brw_vec4_tcs.cpp|  4 ++--
 7 files changed, 18 insertions(+), 21 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 75642d3..779c098 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -6554,8 +6554,7 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data,
char **error_str)
 {
nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
-   shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex,
-  true);
+   shader = brw_nir_apply_sampler_key(shader, compiler, &key->tex, true);
brw_nir_set_default_interpolation(compiler->devinfo, shader,
  key->flat_shade, key->persample_interp);
brw_nir_lower_fs_inputs(shader);
@@ -6563,7 +6562,7 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data,
if (!key->multisample_fbo)
   NIR_PASS_V(shader, demote_sample_qualifiers);
NIR_PASS_V(shader, move_interpolation_to_top);
-   shader = brw_postprocess_nir(shader, compiler->devinfo, true);
+   shader = brw_postprocess_nir(shader, compiler, true);
 
/* key->alpha_test_func means simulating alpha testing via discards,
 * so the shader definitely kills pixels.
@@ -6786,8 +6785,7 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data,
char **error_str)
 {
nir_shader *shader = nir_shader_clone(mem_ctx, src_shader);
-   shader = brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex,
-  true);
+   shader = brw_nir_apply_sampler_key(shader, compiler, &key->tex, true);
brw_nir_lower_cs_shared(shader);
prog_data->base.total_shared += shader->num_shared;
 
@@ -6800,7 +6798,7 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data,
(unsigned)4 * (prog_data->thread_local_id_index + 1));
 
brw_nir_lower_intrinsics(shader, &prog_data->base);
-   shader = brw_postprocess_nir(shader, compiler->devinfo, true);
+   shader = brw_postprocess_nir(shader, compiler, true);
 
prog_data->local_size[0] = shader->info.cs.local_size[0];
prog_data->local_size[1] = shader->info.cs.local_size[1];
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index af646ed..0c15b55 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -481,10 +481,10 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
  * will not work.
  */
 nir_shader *
-brw_postprocess_nir(nir_shader *nir,
-const struct gen_device_info *devinfo,
+brw_postprocess_nir(nir_shader *nir, const struct brw_compiler *compiler,
 bool is_scalar)
 {
+   const struct gen_device_info *devinfo = compiler->devinfo;
bool debug_enabled =
   (INTEL_DEBUG & intel_debug_flag_for_shader_stage(nir->stage));
 
@@ -546,10 +546,11 @@ brw_postprocess_nir(nir_shader *nir,
 
 nir_shader *
 brw_nir_apply_sampler_key(nir_shader *nir,
-  const struct gen_device_info *devinfo,
+  const struct brw_compiler *compiler,
   const struct brw_sampler_prog_key_data *key_tex,
   bool is_scalar)
 {
+   const struct gen_device_info *devinfo = compiler->devinfo;
nir_lower_tex_options tex_options = { 0 };
 
/* Iron Lake and prior require lowering of all rectangle textures */
diff --git a/src/mesa/drivers/dri/i965/brw_nir.h b/src/mesa/drivers/dri/i965/brw_nir.h
index b025d55..4af22fd 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.h
+++ b/src/mesa/drivers/dri/i965/brw_nir.h
@@ -113,7 +113,7 @@ void brw_nir_lower_fs_outputs(nir_shader *nir);
 void brw_nir_lower_cs_shared(nir_shader *nir);
 
 nir_shader *brw_postprocess_nir(nir_shader *nir,
-const struct gen_device_info *devinfo,
+const struct brw_compiler *compiler,
 bool is_scalar);
 
 bool brw_nir_apply_attribute_workarounds(nir_shader *nir,
@@ -125,7 +125,7 @@ bool brw_nir_apply_trig_workarounds(nir_shader *nir);
 void brw_nir_apply_tcs_quads_workaround(nir_shader *nir);
 
 nir_shader *brw_nir_apply_sampler_key(nir_shader *nir,
-  const struct gen_device_info *devinfo,
+  const struct brw_compiler *

[Mesa-dev] [PATCH 05/10] nir: don't count removal of lcssa_phi as progress

2016-09-15 Thread Timothy Arceri

---
 src/compiler/nir/nir_opt_remove_phis.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/nir/nir_opt_remove_phis.c b/src/compiler/nir/nir_opt_remove_phis.c
index acaa6e1..d4344b0 100644
--- a/src/compiler/nir/nir_opt_remove_phis.c
+++ b/src/compiler/nir/nir_opt_remove_phis.c
@@ -73,6 +73,7 @@ remove_phis_block(nir_block *block, nir_builder *b)
  break;
 
   nir_phi_instr *phi = nir_instr_as_phi(instr);
+  bool is_lcssa_phi = phi->is_lcssa_phi;
 
   nir_ssa_def *def = NULL;
   nir_alu_instr *mov = NULL;
@@ -133,7 +134,8 @@ remove_phis_block(nir_block *block, nir_builder *b)
   nir_ssa_def_rewrite_uses(&phi->dest.ssa, nir_src_for_ssa(def));
   nir_instr_remove(instr);
 
-  progress = true;
+  if (!is_lcssa_phi)
+ progress = true;
}
 
return progress;
-- 
2.7.4


[Mesa-dev] [PATCH 02/10] nir: Add a loop analysis pass

2016-09-15 Thread Timothy Arceri

From: Thomas Helland 

This pass detects induction variables and calculates the
trip count of loops to be used for loop unrolling.

I've removed support for float induction values for now, for the
simple reason that they don't appear in my shader-db collection,
and so I don't see it as common enough that we want to pollute the
pass with this in the initial version.
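
For readers unfamiliar with trip count calculation, here is a minimal sketch for the simplest integer case, a loop of the form for (i = init; i < limit; i += step). It is an illustration under that assumption only and is not the algorithm in nir_loop_analyze.c, which works on SSA phis and ALU instructions and handles more comparison forms. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Trip count of: for (i = init; i < limit; i += step).
 * Returns false when the count cannot be determined by this simple rule
 * (non-positive step). */
static bool
trip_count_lt(int32_t init, int32_t limit, int32_t step, uint32_t *out)
{
   if (step <= 0)
      return false;

   if (init >= limit) {
      /* The condition fails before the first iteration. */
      *out = 0;
      return true;
   }

   /* Use 64-bit math so limit - init cannot overflow, then round up. */
   int64_t span = (int64_t)limit - (int64_t)init;
   *out = (uint32_t)((span + step - 1) / step);
   return true;
}
```

For example, init = 0, limit = 10, step = 3 visits i = 0, 3, 6, 9, so the count is 4.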

V2: Rebase, adapt to removal of function overloads

V3: (Timothy Arceri)
 - don't try to find trip count if loop terminator conditional is a phi
 - fix trip count for do-while loops
 - replace conditional type != alu assert with return
 - disable unrolling of loops with continues
 - multiple fixes to memory allocation, stop leaking and don't destroy
   structs we want to use for unrolling.
 - fix iteration count bugs when induction var not on RHS of condition
 - add FIXME for && conditions
 - calculate trip count for unsigned induction/limit vars

V4:
- count instructions in a loop
- set the limiting_terminator even if we can't find the trip count for
 all terminators. This is needed for complex unrolling where we handle
 2 terminators and the trip count is unknown for one of them.
- restructure structs so we don't keep information not required after
 analysis and remove dead fields.
- force unrolling in some cases as per the rules in the GLSL IR pass
---
 src/compiler/Makefile.sources   |2 +
 src/compiler/nir/nir.h  |   36 +-
 src/compiler/nir/nir_loop_analyze.c | 1012 +++
 src/compiler/nir/nir_metadata.c |8 +-
 4 files changed, 1056 insertions(+), 2 deletions(-)
 create mode 100644 src/compiler/nir/nir_loop_analyze.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index f5b4f9c..7ed26a9 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -190,6 +190,8 @@ NIR_FILES = \
nir/nir_intrinsics.c \
nir/nir_intrinsics.h \
nir/nir_liveness.c \
+   nir/nir_loop_analyze.c \
+   nir/nir_loop_analyze.h \
nir/nir_lower_alu_to_scalar.c \
nir/nir_lower_atomics.c \
nir/nir_lower_bitmap.c \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index ff7c422..49e8cd8 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1549,9 +1549,36 @@ nir_if_last_else_node(nir_if *if_stmt)
 }
 
 typedef struct {
+   nir_if *nif;
+
+   nir_instr *conditional_instr;
+
+   struct list_head loop_terminator_link;
+} nir_loop_terminator;
+
+typedef struct {
+   /* Number of instructions in the loop */
+   unsigned num_instructions;
+
+   /* How many times the loop is run (if known) */
+   unsigned trip_count;
+   bool is_trip_count_known;
+
+   /* Unroll the loop regardless of its size */
+   bool force_unroll;
+
+   nir_loop_terminator *limiting_terminator;
+
+   /* A list of loop_terminators terminating this loop. */
+   struct list_head loop_terminator_list;
+} nir_loop_info;
+
+typedef struct {
nir_cf_node cf_node;
 
struct exec_list body; /** < list of nir_cf_node */
+
+   nir_loop_info *info;
 } nir_loop;
 
 static inline nir_cf_node *
@@ -1576,6 +1603,7 @@ typedef enum {
nir_metadata_dominance = 0x2,
nir_metadata_live_ssa_defs = 0x4,
nir_metadata_not_properly_reset = 0x8,
+   nir_metadata_loop_analysis = 0x10,
 } nir_metadata;
 
 typedef struct {
@@ -1758,6 +1786,8 @@ typedef struct nir_shader_compiler_options {
 * information must be inferred from the list of input nir_variables.
 */
bool use_interpolated_input_intrinsics;
+
+   unsigned max_unroll_iterations;
 } nir_shader_compiler_options;
 
 typedef struct nir_shader_info {
@@ -1962,7 +1992,7 @@ nir_loop *nir_loop_create(nir_shader *shader);
 nir_function_impl *nir_cf_node_get_function(nir_cf_node *node);
 
 /** requests that the given pieces of metadata be generated */
-void nir_metadata_require(nir_function_impl *impl, nir_metadata required);
+void nir_metadata_require(nir_function_impl *impl, nir_metadata required, ...);
 /** dirties all but the preserved metadata */
 void nir_metadata_preserve(nir_function_impl *impl, nir_metadata preserved);
 
@@ -2559,6 +2589,10 @@ void nir_lower_double_pack(nir_shader *shader);
 bool nir_normalize_cubemap_coords(nir_shader *shader);
 
 void nir_live_ssa_defs_impl(nir_function_impl *impl);
+
+void nir_loop_analyze_impl(nir_function_impl *impl,
+   nir_variable_mode indirect_mask);
+
 bool nir_ssa_defs_interfere(nir_ssa_def *a, nir_ssa_def *b);
 
 void nir_convert_to_ssa_impl(nir_function_impl *impl);
diff --git a/src/compiler/nir/nir_loop_analyze.c b/src/compiler/nir/nir_loop_analyze.c
new file mode 100644
index 000..6bea9e5
--- /dev/null
+++ b/src/compiler/nir/nir_loop_analyze.c
@@ -0,0 +1,1012 @@
+/*
+ * Copyright © 2015 Thomas Helland
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without re

[Mesa-dev] [PATCH 01/10] i965: use nir_lower_indirect_derefs() for GLSL Gen7+

2016-09-15 Thread Timothy Arceri

This moves the nir_lower_indirect_derefs() call into
brw_preprocess_nir() so that it is called by both OpenGL and Vulkan,
and removes the call to the old GLSL IR pass
lower_variable_index_to_cond_assign() on Gen7+.

We want to do this pass in nir to be able to move loop unrolling
to nir.
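
As a conceptual illustration of what lowering an indirect array access means, the sketch below replaces a variable index with a chain of selects over constant indices. This is plain C with hypothetical function names, not output of either pass; the exact code shape that nir_lower_indirect_derefs() and lower_variable_index_to_cond_assign() generate differs.

```c
/* Indirect read: the index is only known at run time. */
static float
read_indirect(const float a[4], unsigned i)
{
   return a[i];
}

/* Lowered read: every array access uses a constant index and the
 * variable index only feeds comparisons. An out-of-range i yields a[0]
 * here, which is an arbitrary choice for this sketch. */
static float
read_lowered(const float a[4], unsigned i)
{
   float r = a[0];
   r = (i == 1) ? a[1] : r;
   r = (i == 2) ? a[2] : r;
   r = (i == 3) ? a[3] : r;
   return r;
}
```

The extra comparisons and selects give a sense of where the small instruction count increases reported in this patch could come from.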

There is an increase of 1-3 instructions in a small number of shaders,
and 2 Kerbal Space Program shaders that increase by 32 instructions.

Shader-db results BDW:

total instructions in shared programs: 8705873 -> 8706194 (0.00%)
instructions in affected programs: 32515 -> 32836 (0.99%)
helped: 3
HURT: 79

total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
cycles in affected programs: 528104 -> 493460 (-6.56%)
helped: 47
HURT: 37

LOST:   2
GAINED: 0
---
 src/intel/vulkan/anv_pipeline.c| 10 --
 src/mesa/drivers/dri/i965/brw_link.cpp | 26 ++
 src/mesa/drivers/dri/i965/brw_nir.c| 12 
 3 files changed, 26 insertions(+), 22 deletions(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index f96fe22..f292f0b 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -183,16 +183,6 @@ anv_shader_compile_to_nir(struct anv_device *device,
 
nir_shader_gather_info(nir, entry_point->impl);
 
-   nir_variable_mode indirect_mask = 0;
-   if (compiler->glsl_compiler_options[stage].EmitNoIndirectInput)
-  indirect_mask |= nir_var_shader_in;
-   if (compiler->glsl_compiler_options[stage].EmitNoIndirectOutput)
-  indirect_mask |= nir_var_shader_out;
-   if (compiler->glsl_compiler_options[stage].EmitNoIndirectTemp)
-  indirect_mask |= nir_var_local;
-
-   nir_lower_indirect_derefs(nir, indirect_mask);
-
return nir;
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 2b1fa61..41791d4 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -139,18 +139,20 @@ process_glsl_ir(gl_shader_stage stage,
 
do_copy_propagation(shader->ir);
 
-   bool lowered_variable_indexing =
-  lower_variable_index_to_cond_assign((gl_shader_stage)stage,
-  shader->ir,
-  options->EmitNoIndirectInput,
-  options->EmitNoIndirectOutput,
-  options->EmitNoIndirectTemp,
-  options->EmitNoIndirectUniform);
-
-   if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
-  perf_debug("Unsupported form of variable indexing in %s; falling "
- "back to very inefficient code generation\n",
- _mesa_shader_stage_to_abbrev(shader->Stage));
+   if (brw->gen < 7) {
+  bool lowered_variable_indexing =
+ lower_variable_index_to_cond_assign((gl_shader_stage)stage,
+ shader->ir,
+ options->EmitNoIndirectInput,
+ options->EmitNoIndirectOutput,
+ options->EmitNoIndirectTemp,
+ options->EmitNoIndirectUniform);
+
+  if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
+ perf_debug("Unsupported form of variable indexing in %s; falling "
+"back to very inefficient code generation\n",
+_mesa_shader_stage_to_abbrev(shader->Stage));
+  }
}
 
bool progress;
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
index e8dafae..af646ed 100644
--- a/src/mesa/drivers/dri/i965/brw_nir.c
+++ b/src/mesa/drivers/dri/i965/brw_nir.c
@@ -453,6 +453,18 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
/* Lower a bunch of stuff */
OPT_V(nir_lower_var_copies);
 
+   if (compiler->devinfo->gen > 6) {
+  nir_variable_mode indirect_mask = 0;
+  if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectInput)
+ indirect_mask |= nir_var_shader_in;
+  if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectOutput)
+ indirect_mask |= nir_var_shader_out;
+  if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectTemp)
+ indirect_mask |= nir_var_local;
+
+  nir_lower_indirect_derefs(nir, indirect_mask);
+   }
+
/* Get rid of split copies */
nir = nir_optimize(nir, is_scalar);
 
-- 
2.7.4


[Mesa-dev] [PATCH 06/10] nir: create helper for fixing phi srcs when cloning

2016-09-15 Thread Timothy Arceri

This will be useful for fixing phi srcs when cloning a loop body
during loop unrolling.
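
The reason a separate fixup step is needed can be shown with a much smaller model: when a node refers to something defined later, its clone cannot be pointed at the cloned target until that target exists. The sketch below uses a hypothetical node array, with the array index playing the role of the remap table; it is not the nir_clone.c code.

```c
#include <stddef.h>

/* Hypothetical node: src may point at a node later in the array, much as
 * a loop-header phi refers to a value defined further down the body. */
struct node {
   int value;
   const struct node *src;
};

/* Clone count nodes from orig into copy. Every non-NULL src is assumed
 * to point into the orig array. */
static void
clone_nodes(const struct node *orig, struct node *copy, size_t count)
{
   /* First pass: copy each node; src still points at the original. */
   for (size_t i = 0; i < count; i++)
      copy[i] = orig[i];

   /* Fixup pass: all clones exist now, so every src can be remapped to
    * the clone at the same index. */
   for (size_t i = 0; i < count; i++) {
      if (copy[i].src)
         copy[i].src = &copy[copy[i].src - orig];
   }
}
```

Doing the remap in the first pass would fail for node 0 pointing at node 2, because node 2 has not been cloned yet at that point.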
---
 src/compiler/nir/nir_clone.c | 36 +---
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
index 0e397b0..8808333 100644
--- a/src/compiler/nir/nir_clone.c
+++ b/src/compiler/nir/nir_clone.c
@@ -593,6 +593,26 @@ clone_cf_list(clone_state *state, struct exec_list *dst,
}
 }
 
+/* After we've cloned almost everything, we have to walk the list of phi
+ * sources and fix them up.  Thanks to loops, the block and SSA value for a
+ * phi source may not be defined when we first encounter it.  Instead, we
+ * add it to the phi_srcs list and we fix it up here.
+ */
+static void
+fixup_phi_srcs(clone_state *state)
+{
+   list_for_each_entry_safe(nir_phi_src, src, &state->phi_srcs, src.use_link) {
+  src->pred = remap_local(state, src->pred);
+  assert(src->src.is_ssa);
+  src->src.ssa = remap_local(state, src->src.ssa);
+
+  /* Remove from this list and place in the uses of the SSA def */
+  list_del(&src->src.use_link);
+  list_addtail(&src->src.use_link, &src->src.ssa->uses);
+   }
+   assert(list_empty(&state->phi_srcs));
+}
+
 static nir_function_impl *
 clone_function_impl(clone_state *state, const nir_function_impl *fi)
 {
@@ -614,21 +634,7 @@ clone_function_impl(clone_state *state, const nir_function_impl *fi)
 
clone_cf_list(state, &nfi->body, &fi->body);
 
-   /* After we've cloned almost everything, we have to walk the list of phi
-* sources and fix them up.  Thanks to loops, the block and SSA value for a
-* phi source may not be defined when we first encounter it.  Instead, we
-* add it to the phi_srcs list and we fix it up here.
-*/
-   list_for_each_entry_safe(nir_phi_src, src, &state->phi_srcs, src.use_link) {
-  src->pred = remap_local(state, src->pred);
-  assert(src->src.is_ssa);
-  src->src.ssa = remap_local(state, src->src.ssa);
-
-  /* Remove from this list and place in the uses of the SSA def */
-  list_del(&src->src.use_link);
-  list_addtail(&src->src.use_link, &src->src.ssa->uses);
-   }
-   assert(list_empty(&state->phi_srcs));
+   fixup_phi_srcs(state);
 
/* All metadata is invalidated in the cloning process */
nfi->valid_metadata = 0;
-- 
2.7.4


[Mesa-dev] [PATCH 08/10] nir: add a loop unrolling pass

2016-09-15 Thread Timothy Arceri

V2:
- tidy ups suggested by Connor.
- tidy up cloning logic and handle copy propagation
  based on suggestion by Connor.
- use nir_ssa_def_rewrite_uses to fix up lcssa phis
  suggested by Connor.
- add support for complex loop unrolling (two terminators)
- handle case where the ssa def's use outside the loop is already a phi
- support unrolling loops with multiple terminators when trip count
  is known for each terminator
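
To make the two-terminator case concrete, here is a source-level sketch in plain C of a loop with one terminator whose trip count is known (i >= 3) and one that is data dependent (v[i] == key), next to a hand-unrolled equivalent. The function names are hypothetical, and the pass itself works on NIR control flow rather than C source. Each copy of the remaining loop body is nested inside the branch that did not break.

```c
/* Rolled form: two exits, one with a known trip count of 3 and one that
 * depends on the data. */
static int
find_rolled(const int v[3], int key)
{
   int i = 0;
   while (1) {
      if (i >= 3)
         break;
      if (v[i] == key)
         break;
      i++;
   }
   return i;
}

/* Unrolled form: the known terminator disappears, and the rest of each
 * iteration moves inside the branch that continues the loop. */
static int
find_unrolled(const int v[3], int key)
{
   int i = 0;
   if (v[0] != key) {
      i = 1;
      if (v[1] != key) {
         i = 2;
         if (v[2] != key) {
            i = 3;
         }
      }
   }
   return i;
}
```

Both functions return the index of the first match, or 3 when the key is absent.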
---
 src/compiler/Makefile.sources  |   1 +
 src/compiler/nir/nir.h |   2 +
 src/compiler/nir/nir_opt_loop_unroll.c | 820 +
 3 files changed, 823 insertions(+)
 create mode 100644 src/compiler/nir/nir_opt_loop_unroll.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 8ef6080..b3512bb 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -233,6 +233,7 @@ NIR_FILES = \
nir/nir_opt_dead_cf.c \
nir/nir_opt_gcm.c \
nir/nir_opt_global_to_local.c \
+   nir/nir_opt_loop_unroll.c \
nir/nir_opt_peephole_select.c \
nir/nir_opt_remove_phis.c \
nir/nir_opt_undef.c \
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 9887432..0513d81 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2661,6 +2661,8 @@ bool nir_opt_dead_cf(nir_shader *shader);
 
 bool nir_opt_gcm(nir_shader *shader, bool value_number);
 
+bool nir_opt_loop_unroll(nir_shader *shader, nir_variable_mode indirect_mask);
+
 bool nir_opt_peephole_select(nir_shader *shader);
 
 bool nir_opt_remove_phis(nir_shader *shader);
diff --git a/src/compiler/nir/nir_opt_loop_unroll.c b/src/compiler/nir/nir_opt_loop_unroll.c
new file mode 100644
index 000..1de02f6
--- /dev/null
+++ b/src/compiler/nir/nir_opt_loop_unroll.c
@@ -0,0 +1,820 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+#include "nir_control_flow.h"
+
+static void
+extract_loop_body(nir_cf_list *extracted, nir_cf_node *node)
+{
+   nir_cf_node *end = node;
+   while (!nir_cf_node_is_last(end))
+  end = nir_cf_node_next(end);
+
+   nir_cf_extract(extracted, nir_before_cf_node(node),
+  nir_after_cf_node(end));
+}
+
+static void
+clone_list(nir_shader *ns, nir_loop *loop, nir_cf_list *src_cf_list,
+   nir_cf_list *cloned_cf_list, struct hash_table *remap_table)
+{
+   /* Dest list needs to at least have one block */
+   nir_block *nblk = nir_block_create(ns);
+   nblk->cf_node.parent = loop->cf_node.parent;
+   exec_list_push_tail(&cloned_cf_list->list, &nblk->cf_node.node);
+
+   nir_clone_loop_list(&cloned_cf_list->list, &src_cf_list->list,
+   remap_table, ns);
+}
+
+static void
+move_cf_list_into_if(nir_cf_list *lst, nir_cf_node *if_node,
+ nir_cf_node *last_node, bool continue_from_then_branch)
+{
+   nir_if *if_stmt = nir_cf_node_as_if(if_node);
+   if (continue_from_then_branch) {
+  /* Move the rest of the loop inside the then */
+  nir_cf_reinsert(lst, nir_after_cf_node(nir_if_last_then_node(if_stmt)));
+   } else {
+  /* Move the rest of the loop inside the else */
+  nir_cf_reinsert(lst, nir_after_cf_node(nir_if_last_else_node(if_stmt)));
+   }
+
+   /* Remove the break */
+   nir_instr_remove(nir_block_last_instr(nir_cf_node_as_block(last_node)));
+}
+
+static bool
+is_phi_src_phi_from_loop_header(nir_ssa_def *def, nir_ssa_def *src)
+{
+   return def->parent_instr->type == nir_instr_type_phi &&
+  src->parent_instr->type == nir_instr_type_phi &&
+  nir_instr_as_phi(def->parent_instr)->instr.block->index ==
+  nir_instr_as_phi(src->parent_instr)->instr.block->index;
+}
+
+static void
+get_table_of_lcssa_and_loop_term_phis(nir_cf_node *loop,
+  struct hash_table **lcssa_phis,
+

[Mesa-dev] [PATCH 07/10] nir: add helper for cloning loops

2016-09-15 Thread Timothy Arceri

---
 src/compiler/nir/nir.h   |  2 ++
 src/compiler/nir/nir_clone.c | 41 ++---
 2 files changed, 36 insertions(+), 7 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index eb81955..9887432 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2335,6 +2335,8 @@ void nir_print_instr(const nir_instr *instr, FILE *fp);
 
 nir_shader *nir_shader_clone(void *mem_ctx, const nir_shader *s);
 nir_function_impl *nir_function_impl_clone(const nir_function_impl *fi);
+void nir_clone_loop_list(struct exec_list *dst, const struct exec_list *list,
+ struct hash_table *remap_table, nir_shader *ns);
 nir_constant *nir_constant_clone(const nir_constant *c, nir_variable *var);
 nir_variable *nir_variable_clone(const nir_variable *c, nir_shader *shader);
 
diff --git a/src/compiler/nir/nir_clone.c b/src/compiler/nir/nir_clone.c
index 8808333..7afccbb 100644
--- a/src/compiler/nir/nir_clone.c
+++ b/src/compiler/nir/nir_clone.c
@@ -35,6 +35,11 @@ typedef struct {
/* True if we are cloning an entire shader. */
bool global_clone;
 
+   /* This allows us to clone a loop body without having to add srcs from
+* outside the loop to the remap table. This is useful for loop unrolling.
+*/
+   bool allow_remap_fallback;
+
/* maps orig ptr -> cloned ptr: */
struct hash_table *remap_table;
 
@@ -46,11 +51,19 @@ typedef struct {
 } clone_state;
 
 static void
-init_clone_state(clone_state *state, bool global)
+init_clone_state(clone_state *state, struct hash_table *remap_table,
+ bool global, bool allow_remap_fallback)
 {
state->global_clone = global;
-   state->remap_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
-_mesa_key_pointer_equal);
+   state->allow_remap_fallback = allow_remap_fallback;
+
+   if (remap_table) {
+  state->remap_table = remap_table;
+   } else {
+  state->remap_table = _mesa_hash_table_create(NULL, _mesa_hash_pointer,
+   _mesa_key_pointer_equal);
+   }
+
list_inithead(&state->phi_srcs);
 }
 
@@ -72,9 +85,8 @@ _lookup_ptr(clone_state *state, const void *ptr, bool global)
   return (void *)ptr;
 
entry = _mesa_hash_table_search(state->remap_table, ptr);
-   assert(entry && "Failed to find pointer!");
if (!entry)
-  return NULL;
+  return state->allow_remap_fallback ? (void *)ptr : NULL;
 
return entry->data;
 }
@@ -613,6 +625,21 @@ fixup_phi_srcs(clone_state *state)
assert(list_empty(&state->phi_srcs));
 }
 
+void
+nir_clone_loop_list(struct exec_list *dst, const struct exec_list *list,
+struct hash_table *remap_table, nir_shader *ns)
+{
+   clone_state state;
+   init_clone_state(&state, remap_table, false, true);
+
+   /* We use the same shader */
+   state.ns = ns;
+
+   clone_cf_list(&state, dst, list);
+
+   fixup_phi_srcs(&state);
+}
+
 static nir_function_impl *
 clone_function_impl(clone_state *state, const nir_function_impl *fi)
 {
@@ -646,7 +673,7 @@ nir_function_impl *
 nir_function_impl_clone(const nir_function_impl *fi)
 {
clone_state state;
-   init_clone_state(&state, false);
+   init_clone_state(&state, NULL, false, false);
 
/* We use the same shader */
state.ns = fi->function->shader;
@@ -686,7 +713,7 @@ nir_shader *
 nir_shader_clone(void *mem_ctx, const nir_shader *s)
 {
clone_state state;
-   init_clone_state(&state, true);
+   init_clone_state(&state, NULL, true, false);
 
nir_shader *ns = nir_shader_create(mem_ctx, s->stage, s->options);
state.ns = ns;
-- 
2.7.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
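
The fallback added to _lookup_ptr() above can be sketched in isolation. This is only an illustration, assuming a small linear table in place of Mesa's hash_table; remap_table_sketch, remap_add and lookup_ptr_sketch are hypothetical names, not part of the patch. Pointers that were cloned are remapped, and pointers from outside the loop come back unchanged when the fallback is enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REMAP_MAX 8

/* Hypothetical stand-in for one hash table entry. */
struct remap_entry {
   const void *key;   /* original pointer */
   void *data;        /* cloned pointer */
};

/* Hypothetical stand-in for the remap table (linear search, fixed size). */
struct remap_table_sketch {
   struct remap_entry entries[REMAP_MAX];
   size_t count;
};

static void
remap_add(struct remap_table_sketch *t, const void *key, void *data)
{
   assert(t->count < REMAP_MAX);
   t->entries[t->count].key = key;
   t->entries[t->count].data = data;
   t->count++;
}

/* Return the clone for ptr if one is recorded. Otherwise return ptr itself
 * when the fallback is allowed, or NULL when it is not.
 */
static void *
lookup_ptr_sketch(const struct remap_table_sketch *t, const void *ptr,
                  bool allow_remap_fallback)
{
   for (size_t i = 0; i < t->count; i++) {
      if (t->entries[i].key == ptr)
         return t->entries[i].data;
   }

   return allow_remap_fallback ? (void *)ptr : NULL;
}
```

With the fallback enabled, a source defined before the loop needs no table entry: the lookup simply hands back the original pointer, which is what a cloned loop body wants for values it only reads.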

[Mesa-dev] [PATCH 10/10] i965: use nir unrolling for scalar backend Gen7+

2016-09-15 Thread Timothy Arceri

---
 src/compiler/glsl/glsl_parser_extras.cpp | 12 +
 src/mesa/drivers/dri/i965/brw_compiler.c | 42 +++-
 src/mesa/drivers/dri/i965/brw_nir.c  | 23 +
 3 files changed, 55 insertions(+), 22 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
b/src/compiler/glsl/glsl_parser_extras.cpp
index 436ddd0..a5c926a 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -2057,12 +2057,14 @@ do_common_optimization(exec_list *ir, bool linked,
OPT(optimize_split_arrays, ir, linked);
OPT(optimize_redundant_jumps, ir);
 
-   loop_state *ls = analyze_loop_variables(ir);
-   if (ls->loop_found) {
-  OPT(set_loop_controls, ir, ls);
-  OPT(unroll_loops, ir, ls, options);
+   if (options->MaxUnrollIterations != 0) {
+  loop_state *ls = analyze_loop_variables(ir);
+  if (ls->loop_found) {
+ OPT(set_loop_controls, ir, ls);
+ OPT(unroll_loops, ir, ls, options);
+  }
+  delete ls;
}
-   delete ls;
 
 #undef OPT
 
diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c 
b/src/mesa/drivers/dri/i965/brw_compiler.c
index 86b1eaa..523b554 100644
--- a/src/mesa/drivers/dri/i965/brw_compiler.c
+++ b/src/mesa/drivers/dri/i965/brw_compiler.c
@@ -43,18 +43,28 @@
.use_interpolated_input_intrinsics = true, \
.vertex_id_zero_based = true
 
+#define COMMON_SCALAR_OPTIONS \
+   .lower_pack_half_2x16 = true,  \
+   .lower_pack_snorm_2x16 = true, \
+   .lower_pack_snorm_4x8 = true,  \
+   .lower_pack_unorm_2x16 = true, \
+   .lower_pack_unorm_4x8 = true,  \
+   .lower_unpack_half_2x16 = true,\
+   .lower_unpack_snorm_2x16 = true,   \
+   .lower_unpack_snorm_4x8 = true,\
+   .lower_unpack_unorm_2x16 = true,   \
+   .lower_unpack_unorm_4x8 = true \
+
 static const struct nir_shader_compiler_options scalar_nir_options = {
COMMON_OPTIONS,
-   .lower_pack_half_2x16 = true,
-   .lower_pack_snorm_2x16 = true,
-   .lower_pack_snorm_4x8 = true,
-   .lower_pack_unorm_2x16 = true,
-   .lower_pack_unorm_4x8 = true,
-   .lower_unpack_half_2x16 = true,
-   .lower_unpack_snorm_2x16 = true,
-   .lower_unpack_snorm_4x8 = true,
-   .lower_unpack_unorm_2x16 = true,
-   .lower_unpack_unorm_4x8 = true,
+   COMMON_SCALAR_OPTIONS,
+   .max_unroll_iterations = 0,
+};
+
+static const struct nir_shader_compiler_options scalar_nir_options_gen7 = {
+   COMMON_OPTIONS,
+   COMMON_SCALAR_OPTIONS,
+   .max_unroll_iterations = 32,
 };
 
 static const struct nir_shader_compiler_options vector_nir_options = {
@@ -75,6 +85,7 @@ static const struct nir_shader_compiler_options 
vector_nir_options = {
.lower_unpack_unorm_2x16 = true,
.lower_extract_byte = true,
.lower_extract_word = true,
+   .max_unroll_iterations = 0,
 };
 
 static const struct nir_shader_compiler_options vector_nir_options_gen6 = {
@@ -92,6 +103,7 @@ static const struct nir_shader_compiler_options 
vector_nir_options_gen6 = {
.lower_unpack_unorm_2x16 = true,
.lower_extract_byte = true,
.lower_extract_word = true,
+   .max_unroll_iterations = 0,
 };
 
 struct brw_compiler *
@@ -119,7 +131,6 @@ brw_compiler_create(void *mem_ctx, const struct 
gen_device_info *devinfo)
 
/* We want the GLSL compiler to emit code that uses condition codes */
for (int i = 0; i < MESA_SHADER_STAGES; i++) {
-  compiler->glsl_compiler_options[i].MaxUnrollIterations = 32;
   compiler->glsl_compiler_options[i].MaxIfDepth =
  devinfo->gen < 6 ? 16 : UINT_MAX;
 
@@ -140,8 +151,15 @@ brw_compiler_create(void *mem_ctx, const struct 
gen_device_info *devinfo)
  compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
 
   if (is_scalar) {
- compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
+ if (devinfo->gen > 6) {
+compiler->glsl_compiler_options[i].MaxUnrollIterations = 0;
+ } else {
+compiler->glsl_compiler_options[i].MaxUnrollIterations = 32;
+ }
+ compiler->glsl_compiler_options[i].NirOptions =
+devinfo->gen < 7 ? &scalar_nir_options : &scalar_nir_options_gen7;
   } else {
+ compiler->glsl_compiler_options[i].MaxUnrollIterations = 32;
  compiler->glsl_compiler_options[i].NirOptions =
 devinfo->gen < 6 ? &vector_nir_options : &vector_nir_options_gen6;
   }
diff --git a/src/mesa/drivers/dri/i965/brw_nir.c 
b/src/mesa/drivers/dri/i965/brw_nir.c
index 0c15b55..15abb77
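
The effect of the brw_compiler.c changes above on the two unroll limits can be summarised in a small sketch. It assumes nothing beyond what the hunks show; unroll_limits and pick_unroll_limits are hypothetical names used only for illustration, and the real code sets these values on separate option structs.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical pair of limits: one for the GLSL IR pass, one for NIR. */
struct unroll_limits {
   unsigned glsl_max_unroll_iterations;
   unsigned nir_max_unroll_iterations;
};

/* Scalar backend on Gen7+: GLSL IR unrolling off, NIR unrolling at 32.
 * Everything else: GLSL IR unrolling at 32, NIR unrolling off.
 */
static struct unroll_limits
pick_unroll_limits(int gen, bool is_scalar)
{
   struct unroll_limits limits;

   assert(gen > 0);

   if (is_scalar && gen >= 7) {
      limits.glsl_max_unroll_iterations = 0;
      limits.nir_max_unroll_iterations = 32;
   } else {
      limits.glsl_max_unroll_iterations = 32;
      limits.nir_max_unroll_iterations = 0;
   }

   return limits;
}
```

Exactly one of the two passes is active for any given configuration, so a loop is never considered for unrolling twice.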

Re: [Mesa-dev] [PATCH v2] st/vdpau: fix argument type to vlVdpOutputSurfaceDMABuf

2016-09-15 Thread Christian König


On 15.09.2016 at 01:16, Ilia Mirkin wrote:

Signed-off-by: Ilia Mirkin 


Reviewed-by: Christian König .


---

v1 -> v2: adjust typedef in vdpau_dmabuf.h, per Nayan

  src/gallium/include/state_tracker/vdpau_dmabuf.h | 2 +-
  src/gallium/state_trackers/vdpau/output.c| 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/include/state_tracker/vdpau_dmabuf.h 
b/src/gallium/include/state_tracker/vdpau_dmabuf.h
index 886c344..f838c92 100644
--- a/src/gallium/include/state_tracker/vdpau_dmabuf.h
+++ b/src/gallium/include/state_tracker/vdpau_dmabuf.h
@@ -87,7 +87,7 @@ typedef VdpStatus VdpVideoSurfaceDMABuf(
  );
  
  typedef VdpStatus VdpOutputSurfaceDMABuf(

-   VdpVideoSurface   surface,
+   VdpOutputSurface  surface,
 struct VdpSurfaceDMABufDesc * result
  );
  
diff --git a/src/gallium/state_trackers/vdpau/output.c b/src/gallium/state_trackers/vdpau/output.c

index 85751ea..f4d62a3 100644
--- a/src/gallium/state_trackers/vdpau/output.c
+++ b/src/gallium/state_trackers/vdpau/output.c
@@ -773,7 +773,7 @@ struct pipe_resource 
*vlVdpOutputSurfaceGallium(VdpOutputSurface surface)
 return vlsurface->surface->texture;
  }
  
-VdpStatus vlVdpOutputSurfaceDMABuf(VdpVideoSurface surface,

+VdpStatus vlVdpOutputSurfaceDMABuf(VdpOutputSurface surface,
 struct VdpSurfaceDMABufDesc *result)
  {
 vlVdpOutputSurface *vlsurface;




Re: [Mesa-dev] [PATCH] Revert "st/vdpau: use linear layout for output surfaces"

2016-09-15 Thread Christian König


On 15.09.2016 at 06:00, Ilia Mirkin wrote:

On Wed, Sep 14, 2016 at 11:58 PM, Dave Airlie  wrote:

From: Dave Airlie 

This reverts commit d180de35320eafa3df3d76f0e82b332656530126.

This is a radeon specific hack that causes problems on nouveau
when combined with the SHARED flag later. If radeonsi needs a fix
for this, please fix it in the driver.


Actually it isn't radeon specific. Using linear surfaces for this makes 
sense because tiling isn't beneficial and the surfaces can potentially 
be shared with other GPUs using the VDPAU OpenGL interop.


The problem is that I can't actually tell whether a surface will be shared 
with another GPU when it is created. So the driver needs to be 
able to handle this case gracefully and move the surface to GART on 
demand when it is opened by another GPU.



Signed-off-by: Dave Airlie 

Tested-by: Ilia Mirkin 
Cc: "12.0" 


I don't think we have actually tested this with PRIME, and I'm pretty 
sure there are some loose ends which still need to be fixed for radeon 
as well.


So the patch is Acked-by: Christian König  for 
now, but essentially we need to find a better solution to this.


Regards,
Christian.




---
  src/gallium/state_trackers/vdpau/output.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/vdpau/output.c 
b/src/gallium/state_trackers/vdpau/output.c
index 85751ea..09a1517 100644
--- a/src/gallium/state_trackers/vdpau/output.c
+++ b/src/gallium/state_trackers/vdpau/output.c
@@ -82,7 +82,7 @@ vlVdpOutputSurfaceCreate(VdpDevice device,
 res_tmpl.depth0 = 1;
 res_tmpl.array_size = 1;
 res_tmpl.bind = PIPE_BIND_SAMPLER_VIEW | PIPE_BIND_RENDER_TARGET |
-   PIPE_BIND_LINEAR | PIPE_BIND_SHARED;
+   PIPE_BIND_SHARED;
 res_tmpl.usage = PIPE_USAGE_DEFAULT;

 pipe_mutex_lock(dev->mutex);
--
2.5.5


[Mesa-dev] [Bug 94925] Crash in egl_dri3_get_dri_context with Dolphin EGL/X11 in single-core mode

2016-09-15 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94925

Eero Tamminen  changed:

           What    |Removed                             |Added
----------------------------------------------------------------------------
          Component|Drivers/DRI/i965                    |EGL
         QA Contact|intel-3d-bugs@lists.freedesktop.org |mesa-dev@lists.freedesktop.org
           Assignee|i...@freedesktop.org                |mesa-dev@lists.freedesktop.org

--- Comment #2 from Eero Tamminen  ---
(In reply to Link Mauve from comment #0)
> When Dolphin-emu is compiled with EGL (-DUSE_EGL=ON), and set to single core
> mode (Config > General > Enable Dual Core disabled), it crashes in
> egl_dri3_get_dri_context().
> 
> This doesn’t happen in dual core mode (where the emulated GPU is running on
> a different thread than the emulated CPU), or when Dolphin is compiled with
> GLX as its context API, but it happens on both Xorg and Xwayland since both
> are using DRI3.

(In reply to Karol Herbst from comment #1)
> it seems like Mesa doesn't handle this situation right:
> 
> dolphin creates the context on one thread and makes the context current on
> another one.
> 
> with DRI2 you get a white window and with DRI3 mesa simply crashes.

This doesn't seem like an Intel backend issue.  What's the correct component for EGL
context handling issues, EGL or "Mesa core"?

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

[Mesa-dev] [PATCH 1/2] glsl: add subpass image type

2016-09-15 Thread Dave Airlie

From: Dave Airlie 

SPIR-V/Vulkan have a special image type for input attachments
called the subpass type. It has different characteristics than
other image types.

The main one is that it can only be an input image to fragment
shaders, and loads from it are relative to the frag coord.

This adds support for it to the GLSL types. Unfortunately
we've run out of space in the sampler dim in types, so we
need to use another bit.
---
 src/compiler/builtin_type_macros.h |  2 ++
 src/compiler/glsl_types.cpp| 12 
 src/compiler/glsl_types.h  |  5 +++--
 src/compiler/nir/nir.h |  1 +
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/compiler/builtin_type_macros.h 
b/src/compiler/builtin_type_macros.h
index da3f19e..8af0e2a 100644
--- a/src/compiler/builtin_type_macros.h
+++ b/src/compiler/builtin_type_macros.h
@@ -159,6 +159,8 @@ DECL_TYPE(uimageCubeArray, 
GL_UNSIGNED_INT_IMAGE_CUBE_MAP_ARRAY,   GLSL_TYPE
 DECL_TYPE(uimage2DMS,  GL_UNSIGNED_INT_IMAGE_2D_MULTISAMPLE,   
GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_MS, 0, 0, GLSL_TYPE_UINT)
 DECL_TYPE(uimage2DMSArray, GL_UNSIGNED_INT_IMAGE_2D_MULTISAMPLE_ARRAY, 
GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_MS, 0, 1, GLSL_TYPE_UINT)
 
+DECL_TYPE(imageSubpass,0,  
GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_SUBPASS,0, 0, GLSL_TYPE_FLOAT)
+
 DECL_TYPE(atomic_uint, GL_UNSIGNED_INT_ATOMIC_COUNTER, GLSL_TYPE_ATOMIC_UINT, 
1, 1)
 
 STRUCT_TYPE(gl_DepthRangeParameters)
diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
index 641644d..bf72419 100644
--- a/src/compiler/glsl_types.cpp
+++ b/src/compiler/glsl_types.cpp
@@ -674,6 +674,8 @@ glsl_type::get_sampler_instance(enum glsl_sampler_dim dim,
 return error_type;
  else
 return samplerExternalOES_type;
+  case GLSL_SAMPLER_DIM_SUBPASS:
+ return error_type;
   }
case GLSL_TYPE_INT:
   if (shadow)
@@ -701,6 +703,8 @@ glsl_type::get_sampler_instance(enum glsl_sampler_dim dim,
  return (array ? isampler2DMSArray_type : isampler2DMS_type);
   case GLSL_SAMPLER_DIM_EXTERNAL:
  return error_type;
+  case GLSL_SAMPLER_DIM_SUBPASS:
+ return error_type;
   }
case GLSL_TYPE_UINT:
   if (shadow)
@@ -728,6 +732,8 @@ glsl_type::get_sampler_instance(enum glsl_sampler_dim dim,
  return (array ? usampler2DMSArray_type : usampler2DMS_type);
   case GLSL_SAMPLER_DIM_EXTERNAL:
  return error_type;
+  case GLSL_SAMPLER_DIM_SUBPASS:
+ return error_type;
   }
default:
   return error_type;
@@ -740,6 +746,8 @@ const glsl_type *
 glsl_type::get_image_instance(enum glsl_sampler_dim dim,
   bool array, glsl_base_type type)
 {
+   if (dim == GLSL_SAMPLER_DIM_SUBPASS)
+  return imageSubpass_type;
switch (type) {
case GLSL_TYPE_FLOAT:
   switch (dim) {
@@ -764,6 +772,7 @@ glsl_type::get_image_instance(enum glsl_sampler_dim dim,
   case GLSL_SAMPLER_DIM_MS:
  return (array ? image2DMSArray_type : image2DMS_type);
   case GLSL_SAMPLER_DIM_EXTERNAL:
+  case GLSL_SAMPLER_DIM_SUBPASS:
  return error_type;
   }
case GLSL_TYPE_INT:
@@ -789,6 +798,7 @@ glsl_type::get_image_instance(enum glsl_sampler_dim dim,
   case GLSL_SAMPLER_DIM_MS:
  return (array ? iimage2DMSArray_type : iimage2DMS_type);
   case GLSL_SAMPLER_DIM_EXTERNAL:
+  case GLSL_SAMPLER_DIM_SUBPASS:
  return error_type;
   }
case GLSL_TYPE_UINT:
@@ -814,6 +824,7 @@ glsl_type::get_image_instance(enum glsl_sampler_dim dim,
   case GLSL_SAMPLER_DIM_MS:
  return (array ? uimage2DMSArray_type : uimage2DMS_type);
   case GLSL_SAMPLER_DIM_EXTERNAL:
+  case GLSL_SAMPLER_DIM_SUBPASS:
  return error_type;
   }
default:
@@ -1975,6 +1986,7 @@ glsl_type::coordinate_components() const
case GLSL_SAMPLER_DIM_RECT:
case GLSL_SAMPLER_DIM_MS:
case GLSL_SAMPLER_DIM_EXTERNAL:
+   case GLSL_SAMPLER_DIM_SUBPASS:
   size = 2;
   break;
case GLSL_SAMPLER_DIM_3D:
diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
index 7c4827d..b1e2f7a 100644
--- a/src/compiler/glsl_types.h
+++ b/src/compiler/glsl_types.h
@@ -80,7 +80,8 @@ enum glsl_sampler_dim {
GLSL_SAMPLER_DIM_RECT,
GLSL_SAMPLER_DIM_BUF,
GLSL_SAMPLER_DIM_EXTERNAL,
-   GLSL_SAMPLER_DIM_MS
+   GLSL_SAMPLER_DIM_MS,
+   GLSL_SAMPLER_DIM_SUBPASS, /* for vulkan input attachments */
 };
 
 enum glsl_interface_packing {
@@ -127,7 +128,7 @@ struct glsl_type {
GLenum gl_type;
glsl_base_type base_type;
 
-   unsigned sampler_dimensionality:3; /**< \see glsl_sampler_dim */
+   unsigned sampler_dimensionality:4; /**< \see glsl_sampler_dim */
unsigned sampler_shadow:1;
unsigned sampler_array:1;
unsigned sampled_type:2;/**< Type of data returned using this
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
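
Why the bit-field has to grow from 3 to 4 bits can be shown with a short sketch. The enum below is a hypothetical stand-in whose first four entries are assumed (the hunk only shows RECT, BUF, EXTERNAL and MS); the point is that the new value is the ninth entry, 8, which a 3-bit unsigned field cannot hold.

```c
#include <assert.h>

/* Hypothetical mirror of glsl_sampler_dim; the order of the first four
 * entries is an assumption, the last five follow the hunk above.
 */
enum sampler_dim_sketch {
   DIM_1D = 0,
   DIM_2D,
   DIM_3D,
   DIM_CUBE,
   DIM_RECT,
   DIM_BUF,
   DIM_EXTERNAL,
   DIM_MS,        /* 7: largest value that fits in 3 bits */
   DIM_SUBPASS    /* 8: needs a fourth bit */
};

struct type_3bit { unsigned sampler_dimensionality:3; };
struct type_4bit { unsigned sampler_dimensionality:4; };

/* Store dim in a 3-bit field and read it back (values wrap modulo 8). */
static unsigned
store_in_3_bits(unsigned dim)
{
   struct type_3bit t;

   assert(dim <= DIM_SUBPASS);
   t.sampler_dimensionality = dim;
   return t.sampler_dimensionality;
}

/* Store dim in a 4-bit field and read it back. */
static unsigned
store_in_4_bits(unsigned dim)
{
   struct type_4bit t;

   assert(dim <= DIM_SUBPASS);
   t.sampler_dimensionality = dim;
   return t.sampler_dimensionality;
}
```

Under these assumptions a subpass type stored in the old 3-bit field would silently read back as the first enum entry, so widening the field is required rather than cosmetic.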

[Mesa-dev] [PATCH 2/2] spirv: use subpass image type

2016-09-15 Thread Dave Airlie

From: Dave Airlie 

This adds support for the input attachments subpass type
to the SPIRV->NIR pass.
---
 src/compiler/spirv/spirv_to_nir.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/src/compiler/spirv/spirv_to_nir.c 
b/src/compiler/spirv/spirv_to_nir.c
index 7e7a026..45dfe0b 100644
--- a/src/compiler/spirv/spirv_to_nir.c
+++ b/src/compiler/spirv/spirv_to_nir.c
@@ -828,6 +828,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
   case SpvDimCube: dim = GLSL_SAMPLER_DIM_CUBE;  break;
   case SpvDimRect: dim = GLSL_SAMPLER_DIM_RECT;  break;
   case SpvDimBuffer:   dim = GLSL_SAMPLER_DIM_BUF;   break;
+  case SpvDimSubpassData: dim = GLSL_SAMPLER_DIM_SUBPASS; break;
   default:
  unreachable("Invalid SPIR-V Sampler dimension");
   }
@@ -854,7 +855,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
  val->type->type = glsl_sampler_type(dim, is_shadow, is_array,
  glsl_get_base_type(sampled_type));
   } else if (sampled == 2) {
- assert(format);
+ assert((dim == GLSL_SAMPLER_DIM_SUBPASS) || format);
  assert(!is_shadow);
  val->type->type = glsl_image_type(dim, is_array,
glsl_get_base_type(sampled_type));
@@ -1419,6 +1420,7 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp opcode,
   case GLSL_SAMPLER_DIM_2D:
   case GLSL_SAMPLER_DIM_RECT:
   case GLSL_SAMPLER_DIM_MS:
+  case GLSL_SAMPLER_DIM_SUBPASS:
  coord_components = 2;
  break;
   case GLSL_SAMPLER_DIM_3D:
-- 
2.5.5


Re: [Mesa-dev] [PATCH] i965: enable ARB_ES3_2_compatibility on gen8+

2016-09-15 Thread Kenneth Graunke

On Tuesday, September 13, 2016 9:27:49 PM PDT Ilia Mirkin wrote:
> Note that ASTC support is not actually mandated for this extension to be
> exposed.
> 
> Signed-off-by: Ilia Mirkin 
> ---
> 
> Also note that it doesn't seem required for the driver to simultaneously be
> exposing an actual ES 3.2 context. The ext does, however, nominally require
> GL 4.5. I think that can be ignored though.
> 
>  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c 
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index f1ef4f6..fe22d3f 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -400,6 +400,7 @@ intelInitExtensions(struct gl_context *ctx)
>ctx->Extensions.ARB_shader_precision = true;
>ctx->Extensions.ARB_gpu_shader_fp64 = true;
>ctx->Extensions.ARB_vertex_attrib_64bit = true;
> +  ctx->Extensions.ARB_ES3_2_compatibility = true;
>ctx->Extensions.OES_geometry_shader = true;
>ctx->Extensions.OES_texture_cube_map_array = true;
> }
> 

Acked-by: Kenneth Graunke 



[Mesa-dev] [Bug 97804] Later precision statement isn't overriding earlier one

2016-09-15 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97804

Eero Tamminen  changed:

              What             |Removed                           |Added
----------------------------------------------------------------------------
 Attachment #126514 description|Glmark2 shader triggering the bug |Glmark2-es2 Jellyfish shader_test triggering the bug

-- 
You are receiving this mail because:
You are the assignee for the bug.

[Mesa-dev] [PATCH 1/3] mesa: Expose GL_CONTEXT_FLAGS in ES 3.2.

2016-09-15 Thread Kenneth Graunke

Fixes four ES32-CTS.context_flags.* tests.

Signed-off-by: Kenneth Graunke 
---
 src/mesa/main/get_hash_params.py | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index 1f63dc3..b2aef7b 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -620,6 +620,11 @@ descriptor=[
   [ "MULTISAMPLE_LINE_WIDTH_GRANULARITY_ARB", 
"CONTEXT_FLOAT(Const.LineWidthGranularity), extra_ES32" ],
 ]},
 
+{ "apis": ["GL", "GL_CORE", "GLES32"], "params": [
+# GL 3.0 or ES 3.2
+  [ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags), extra_version_30" ],
+]},
+
 # Remaining enums are only in OpenGL
 { "apis": ["GL", "GL_CORE"], "params": [
   [ "ACCUM_RED_BITS", "BUFFER_INT(Visual.accumRedBits), NO_EXTRA" ],
@@ -888,9 +893,6 @@ descriptor=[
 # GL_ARB_color_buffer_float
   [ "RGBA_FLOAT_MODE_ARB", "BUFFER_FIELD(Visual.floatMode, TYPE_BOOLEAN), 
extra_core_ARB_color_buffer_float_and_new_buffers" ],
 
-# GL 3.0
-  [ "CONTEXT_FLAGS", "CONTEXT_INT(Const.ContextFlags), extra_version_30" ],
-
 # GL3.0 / GL_EXT_framebuffer_sRGB
   [ "FRAMEBUFFER_SRGB_EXT", "CONTEXT_BOOL(Color.sRGBEnabled), 
extra_EXT_framebuffer_sRGB" ],
   [ "FRAMEBUFFER_SRGB_CAPABLE_EXT", "BUFFER_INT(Visual.sRGBCapable), 
extra_EXT_framebuffer_sRGB_and_new_buffers" ],
-- 
2.9.3


[Mesa-dev] [PATCH 2/3] glsl: Skip "unsized arrays aren't allowed" check for TCS/TES vars.

2016-09-15 Thread Kenneth Graunke

Fixes ESEXT-CTS.draw_elements_base_vertex_tests.AEP_shader_stages and
ESEXT-CTS.texture_cube_map_array.texture_size_tesselation_con_sh.

Signed-off-by: Kenneth Graunke 
---
 src/compiler/glsl/ast_to_hir.cpp | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 0a23195..90cc924 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -5127,7 +5127,14 @@ ast_declarator_list::hir(exec_list *instructions,
  const glsl_type *const t = (earlier == NULL)
 ? var->type : earlier->type;
 
- if (t->is_unsized_array())
+ /* GL_OES_tessellation_shader allows omitting the array size
+  * for TCS inputs/outputs and TES inputs.  Ignore this check.
+  */
+ bool unsized_ok = state->stage == MESA_SHADER_TESS_CTRL ||
+ (state->stage == MESA_SHADER_TESS_EVAL &&
+  var->data.mode == ir_var_shader_in);
+
+ if (t->is_unsized_array() && !unsized_ok)
 /* Section 10.17 of the GLSL ES 1.00 specification states that
  * unsized array declarations have been removed from the language.
  * Arrays that are sized using an initializer are still explicitly
-- 
2.9.3


[Mesa-dev] [PATCH 3/3] mesa: Move buffers-unmapped earlier in check_valid_to_render().

2016-09-15 Thread Kenneth Graunke

This needs to be above the switch on API, as that can return true
(valid to render) before this error check even had a chance to run.

Fixes ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos,
which worked before commit 72f1566f90c434c7752d8405193eec68d6743246.

Signed-off-by: Kenneth Graunke 
Cc: Mathias Fröhlich 
---
 src/mesa/main/api_validate.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
index b35751e..6cb626a 100644
--- a/src/mesa/main/api_validate.c
+++ b/src/mesa/main/api_validate.c
@@ -45,6 +45,12 @@ check_valid_to_render(struct gl_context *ctx, const char 
*function)
   return false;
}
 
+   if (!_mesa_all_buffers_are_unmapped(ctx->Array.VAO)) {
+  _mesa_error(ctx, GL_INVALID_OPERATION,
+  "%s(vertex buffers are mapped)", function);
+  return false;
+   }
+
switch (ctx->API) {
case API_OPENGLES2:
   /* For ES2, we can draw if we have a vertex program/shader). */
@@ -119,12 +125,6 @@ check_valid_to_render(struct gl_context *ctx, const char 
*function)
   unreachable("Invalid API value in check_valid_to_render()");
}
 
-   if (!_mesa_all_buffers_are_unmapped(ctx->Array.VAO)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  "%s(vertex buffers are mapped)", function);
-  return false;
-   }
-
return true;
 }
 
-- 
2.9.3
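
The ordering problem this patch fixes can be reproduced in a simplified sketch. Everything here is hypothetical (ctx_sketch, the two-value API enum and both functions); the ES2 case is assumed to return straight from the switch, as the commit message describes, which is what let the mapped-buffer check be skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum api_sketch { API_SKETCH_ES2, API_SKETCH_OTHER };

/* Hypothetical, heavily reduced stand-in for struct gl_context. */
struct ctx_sketch {
   enum api_sketch api;
   bool has_vertex_shader;
   bool buffers_mapped;
};

/* Old order: the ES2 case returns before the mapped-buffer check runs. */
static bool
valid_to_render_old(const struct ctx_sketch *ctx)
{
   assert(ctx != NULL);

   switch (ctx->api) {
   case API_SKETCH_ES2:
      return ctx->has_vertex_shader;
   case API_SKETCH_OTHER:
      break;
   }

   if (ctx->buffers_mapped)
      return false;

   return true;
}

/* New order: the mapped-buffer check runs before any per-API early return. */
static bool
valid_to_render_new(const struct ctx_sketch *ctx)
{
   assert(ctx != NULL);

   if (ctx->buffers_mapped)
      return false;

   switch (ctx->api) {
   case API_SKETCH_ES2:
      return ctx->has_vertex_shader;
   case API_SKETCH_OTHER:
      break;
   }

   return true;
}
```

For an ES2 context with a vertex shader and a mapped buffer, the old ordering reports the draw as valid while the new ordering rejects it.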


Re: [Mesa-dev] V3 Loop unrolling in NIR

2016-09-15 Thread Eero Tamminen


Hi,

Have you any plans for supporting partial unrolling?

I.e. if the loop count is too large to be completely unrolled, unroll it 
a few times (so that it still fits into the instruction cache) and then loop that.


E.g. for a loop with 51 rounds, Mesa could unroll it 4 rounds, loop that 
12 times and unroll (or loop) the remaining 3 rounds separately.
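
A minimal sketch of that idea in plain C, not NIR: the loop body is repeated 4 times per trip and a remainder loop handles the leftovers. The names unroll_plan, plan_partial_unroll, sum_rolled and sum_unrolled_by_4 are hypothetical and only illustrate the arithmetic (51 = 4 * 12 + 3).

```c
#include <assert.h>

struct unroll_plan {
   unsigned main_trips;  /* trips through the unrolled loop body */
   unsigned remainder;   /* leftover rounds handled separately */
};

/* Split trip_count rounds into unrolled trips plus a remainder. */
static struct unroll_plan
plan_partial_unroll(unsigned trip_count, unsigned factor)
{
   assert(factor > 0);

   struct unroll_plan plan = { trip_count / factor, trip_count % factor };
   return plan;
}

/* Reference: the original rolled loop. */
static unsigned
sum_rolled(const unsigned *v, unsigned n)
{
   unsigned sum = 0;

   for (unsigned i = 0; i < n; i++)
      sum += v[i];

   return sum;
}

/* The same loop unrolled by 4, with a remainder loop for the leftovers. */
static unsigned
sum_unrolled_by_4(const unsigned *v, unsigned n)
{
   struct unroll_plan plan = plan_partial_unroll(n, 4);
   unsigned sum = 0;
   unsigned i = 0;

   for (unsigned trip = 0; trip < plan.main_trips; trip++, i += 4) {
      sum += v[i];
      sum += v[i + 1];
      sum += v[i + 2];
      sum += v[i + 3];
   }

   for (unsigned r = 0; r < plan.remainder; r++, i++)
      sum += v[i];

   return sum;
}
```

The unrolled version evaluates the loop condition 12 times instead of 48 for the main part, at the cost of a body that is four times larger.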



- Eero

On 15.09.2016 10:03, Timothy Arceri wrote:

Big thanks to Connor for his feedback on previous versions, and
to Jason for answering all my nir questions.

This series works on ssa defs so for now it's only enabled for
the scalar backend on Gen7+.

V3:
- So called complex loop unrolling has been implemented.
- An instruction limit and rules from the GLSL IR pass to override
 the limit for unrolling have been implemented.
- Lots of other stuff see individual patches.

total instructions in shared programs: 8488940 -> 8488648 (-0.00%)
instructions in affected programs: 48903 -> 48611 (-0.60%)
helped: 68
HURT: 89

Most of this HURT comes from switching to using
nir_lower_indirect_derefs(). See patch 1 for more details.

total cycles in shared programs: 69787006 -> 69758740 (-0.04%)
cycles in affected programs: 2525708 -> 2497442 (-1.12%)
helped: 900
HURT: 919

total loops in shared programs: 2071 -> 1499 (-27.62%)
loops in affected programs: 687 -> 115 (-83.26%)
helped: 655
HURT: 99

Helped here comes from a number of things. One example is that the
nir pass is better than the GLSL pass at unrolling loops
regardless of which terminator has the lowest limit. We could
easily go further and handle unrolling of loops with complex
terminators, e.g. where the if's then or else blocks contain
instructions. Currently we just bail if they are not empty; I still
need to check if it's worthwhile.

Another reason could be that I've set the instruction limit too
high but that doesn't seem to be the case.

I believe 82/99 of the HURT is from shaders that look something
like this:

  vec2 array[const_size_of_array];
  for (i = 0; i < const_size_of_array; i++) {
...  = array[i];

... lots of instructions (more than the unroll limit) ...
  }

The GLSL IR pass would force this to unroll as long as const_size_of_array
wasn't greater than 32. However, by the time we get to the nir pass the
arrays have been removed. It seems like this may only be happening for
vectors, but I haven't looked into what is causing it yet.

The other 17 shaders seem to be various corner cases that can be fixed
in follow-up patches.

total spills in shared programs: 2212 -> 2212 (0.00%)
spills in affected programs: 0 -> 0
helped: 0
HURT: 0

total fills in shared programs: 1891 -> 1891 (0.00%)
fills in affected programs: 0 -> 0
helped: 0
HURT: 0

LOST:   6
GAINED: 32


Re: [Mesa-dev] [PATCH] st/vdpau: remove nouveau target

2016-09-15 Thread Emil Velikov

On 15 September 2016 at 02:54, Ilia Mirkin  wrote:
> On Wed, Sep 14, 2016 at 9:52 PM, Ilia Mirkin  wrote:
>> Recent changes have been made to the VDPAU state tracker to make it
>> unusable with nouveau. Don't provide users with an awfully slow
>> "hardware" decoding option.
>>
>> [To preemptively answer the question that will invariably be asked -
>> this is due to the state tracker's use of PIPE_BIND_SHARED, which
>> nouveau uses to force GART placement to make things work with PRIME.
>> However when this is used for output surfaces, which are then blended on
>> (the most common way of implementing an OSD), this results in
>> *incredibly* slow operation.]
>>
>> Signed-off-by: Ilia Mirkin 
>
> Oops, meant to add a CC to mesa-stable, since the breakage was
> introduced in 12.0.
>
Just to double-check: by "breakage" and "unusable" you mean that it's
"awfully slow", correct?

Do you or others have any plans to update the state tracker and/or nouveau
to bring things back on par?

Thanks
Emil

[Mesa-dev] [Bug 94925] Crash in egl_dri3_get_dri_context with Dolphin EGL/X11 in single-core mode

2016-09-15 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=94925

Emil Velikov  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Emil Velikov  ---
The crash should be resolved with the following commit. Feel free to reopen if
that's not the case.

commit d98d6e6269167230d20efdc45d608435a52f25fb
Author: Dave Airlie 
Date:   Mon May 30 08:02:00 2016 +1000

egl/dri3: don't crash on no context.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94925

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.

Re: [Mesa-dev] Possible memory leak with `glxMakeCurrent`?

2016-09-15 Thread Eero Tamminen

Hi,

On 15.09.2016 11:00, Itai wrote:

What I should have said is I have never written anything in Mesa. I
tried making a small C program reproducing the problem, but I couldn't
get the same result (Not entirely sure what the JRE is doing).
I have run valgrind on the sample Java program, which produces a lot of
extraneous data, but I hope there is enough there to find the problem.

On a quick look I would say that the Java program / VM doesn't properly 
get rid of the context(s) it creates at exit.  Additionally, I think 
it's creating a compatibility profile, because the largest allocations are 
from swrast context creation (AFAIK added for compat fallbacks).

At the time the program was stopped, its memory consumption exceeded
1.5GB, only about 200MB of which was accounted for by the JVM (which means
about 1.3GB was in native/mesa code).

I would recommend killing the process with a signal that Valgrind can 
catch but the program doesn't (e.g. SIGXCPU is something most programs don't 
catch :-)).  That way you see the actual run-time leaks instead of being 
confused by issues the program creates in its exit code.  A lot of 
programs don't properly free their resources at exit, and people often 
confuse Valgrind reports of these applications' exit issues with 
run-time leakage.

- Eero

This was run using Mesa 12.0.2-1 on Debian testing. Attached is the
GZipped valgrind output, with hope it could help to find the problem, or
at least write a proper (non-Java) test case

On Thu, Aug 18, 2016 at 6:08 AM, Michel Dänzer <mic...@daenzer.net> wrote:

On 18/08/16 05:39 AM, Itai wrote:
> (Posted initially in mesa-users, but got no reply - the list seems dead.
> Couldn't find any bug report, and sadly not well versed enough in mesa
> to file one myself).

FWIW, there's no need to be versed in Mesa to file a bug report. :)

> Following an investigation of a memory leak with JavaFX on some Linux
> configuration, it looks like there is a possible memory leak when using
> `glxMakeCurrent`.
> Sadly, I myself don't know enough about OpenGL/Mesa to describe it
> fully, but I'm hoping someone here can understand it well enough to make
> a proper bug report.
>
> Here is a link to the discussion on the openjfx-dev list:
> http://mail.openjdk.java.net/pipermail/openjfx-dev/2016-August/019577.html

>
> Here is a forum post describing a non-Java way to reproduce this same
> issue:
> http://www.gamedev.net/topic/679705-glxmakecurrent-slowly-leaks-memory/

>
>
> This was possibly not an issue in older versions of Mesa, as the bug
> does not appear on older Linux installations (I'm using Mesa 11.2.2,
> where the bug is present)

Does it still happen with current Git master? There have been some fixes
in this area recently.

If it still happens, the output of running an affected application in
valgrind --leak-check=full (with debugging symbols available for at
least all Mesa binaries) would be useful.

--
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [ANNOUNCE] mesa 12.0.3

2016-09-15 Thread Emil Velikov

Hi all,

Mesa 12.0.3 is now available.

This is an emergency release addressing a number of regressions across all
devices using the i965 driver.


Emil Velikov (4):
  docs: add sha256 checksums for 12.0.2
  Revert "i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations"
  Update version to 12.0.3
  docs: add release notes for 12.0.3

José Fonseca (1):
  appveyor: Update winflexbison download URL.


git tag: mesa-12.0.3

ftp://ftp.freedesktop.org/pub/mesa/12.0.3/mesa-12.0.3.tar.gz
MD5: 60c5f9897ddc38b46f8144c7366e84ad mesa-12.0.3.tar.gz
SHA1: 3661e2f6b3ff71b7498fa787848959059517e92a mesa-12.0.3.tar.gz
SHA256: 79abcfab3de30dbd416d1582a3cf6b1be308466231488775f1b7bb43be353602 mesa-12.0.3.tar.gz
PGP: ftp://ftp.freedesktop.org/pub/mesa/12.0.3/mesa-12.0.3.tar.gz.sig

ftp://ftp.freedesktop.org/pub/mesa/12.0.3/mesa-12.0.3.tar.xz
MD5: 1113699c714042d8c4df4766be8c57d8 mesa-12.0.3.tar.xz
SHA1: 3c36c91faf4d06a3d5e7bdaf3b2eabb28b21c2a9 mesa-12.0.3.tar.xz
SHA256: 1dc86dd9b51272eee1fad3df65e18cda2e556ef1bc0b6e07cd750b9757f493b1 mesa-12.0.3.tar.xz
PGP: ftp://ftp.freedesktop.org/pub/mesa/12.0.3/mesa-12.0.3.tar.xz.sig

--
-Emil

Re: [Mesa-dev] [PATCH] st/vdpau: remove nouveau target

2016-09-15 Thread Emil Velikov

On 15 September 2016 at 10:56, Emil Velikov  wrote:
> On 15 September 2016 at 02:54, Ilia Mirkin  wrote:
>> On Wed, Sep 14, 2016 at 9:52 PM, Ilia Mirkin  wrote:
>>> Recent changes have been made to the VDPAU state tracker to make it
>>> unusable with nouveau. Don't provide users with an awfully slow
>>> "hardware" decoding option.
>>>
>>> [To preemptively answer the question that will invariably be asked -
>>> this is due to the state tracker's use of PIPE_BIND_SHARED, which
>>> nouveau uses to force GART placement to make things work with PRIME.
>>> However when this is used for output surfaces, which are then blended on
>>> (the most common way of implementing an OSD), this results in
>>> *incredibly* slow operation.]
>>>
>>> Signed-off-by: Ilia Mirkin 
>>
>> Oops, meant to add a CC to mesa-stable, since the breakage was
>> introduced in 12.0.
>>
> Just to double-check - by "breakage" and "unusable" you mean that it's
> "awfully slow" correct ?
>
> Do you/others have any plan to update the state-tracker and/or nouveau
> to bring things back on par ?
>
Scratch this question - I've just noticed the VDPAU patch which makes
this one obsolete.

-Emil

Re: [Mesa-dev] [PATCH] st/va: also honors interlaced preference when providing a video format

2016-09-15 Thread Andy Furniss


Andy Furniss wrote:

Leo Liu wrote:

Hi Andy,

On 09/13/2016 06:22 AM, Andy Furniss wrote:

Zhang, Boyuan wrote:

Hi Leo, Christian and Julien,

I tested the patch with Vaapi Encoding and Transcoding, it seems
working fine. We are using "VAAPI_DISABLE_INTERLACE" env, so
interlaced is always disabled.


Though I notice it will break screen recording scripts for existing
users who previously didn't need the env set but will after this.

Totally untested/thought through, but maybe the env should default to
on?


Agree, can you come up with a patch for that?


OK, but maybe I should test a bit first to see if anything regresses.


It seems true is needed for most gstreamer encodes now, so I guess it 
will need to be set on.


As for regressions I haven't tried ffmpeg encode as it's currently not 
working.


One I found with mpv which I guess is obvious =

mpv --hwdec=vaapi --vo=vaapi foo

works with or without the env, but if foo is interlaced you can't use 
the vavpp de-interlacer with the env set.






Re: [Mesa-dev] [PATCH] st/vdpau: remove nouveau target

2016-09-15 Thread Ilia Mirkin

On Sep 15, 2016 6:41 AM, "Emil Velikov"  wrote:
>
> On 15 September 2016 at 10:56, Emil Velikov wrote:
> > On 15 September 2016 at 02:54, Ilia Mirkin  wrote:
> >> On Wed, Sep 14, 2016 at 9:52 PM, Ilia Mirkin wrote:
> >>> Recent changes have been made to the VDPAU state tracker to make it
> >>> unusable with nouveau. Don't provide users with an awfully slow
> >>> "hardware" decoding option.
> >>>
> >>> [To preemptively answer the question that will invariably be asked -
> >>> this is due to the state tracker's use of PIPE_BIND_SHARED, which
> >>> nouveau uses to force GART placement to make things work with PRIME.
> >>> However when this is used for output surfaces, which are then blended on
> >>> (the most common way of implementing an OSD), this results in
> >>> *incredibly* slow operation.]
> >>>
> >>> Signed-off-by: Ilia Mirkin 
> >>
> >> Oops, meant to add a CC to mesa-stable, since the breakage was
> >> introduced in 12.0.
> >>
> > Just to double-check - by "breakage" and "unusable" you mean that it's
> > "awfully slow" correct ?

Yeah. Nowhere close to able to keep up if blending is enabled, which
happens whenever an osd is displayed.

> >
> > Do you/others have any plan to update the state-tracker and/or nouveau
> > to bring things back on par ?

No plans from me.

> >
> Scratch this question - I've just noticed the VDPAU patch which makes
> this one obsolete.

Either this or a "make it work again" patch needs to land.

  -ilia

Re: [Mesa-dev] [PATCH] clover: Fix build against clang SVN >= r273191

2016-09-15 Thread Timo Aaltonen

On 21.06.2016 02:17, Vedran Miletić wrote:
> setLangDefaults() now requires PreprocessorOptions as an argument.
> ---
>  src/gallium/state_trackers/clover/llvm/invocation.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/gallium/state_trackers/clover/llvm/invocation.cpp b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> index e2cadda..57e 100644
> --- a/src/gallium/state_trackers/clover/llvm/invocation.cpp
> +++ b/src/gallium/state_trackers/clover/llvm/invocation.cpp
> @@ -207,7 +207,7 @@ namespace {
>c.getDiagnosticOpts().ShowCarets = false;
>c.getInvocation().setLangDefaults(c.getLangOpts(), clang::IK_OpenCL,
>  #if HAVE_LLVM >= 0x0309
> -llvm::Triple(triple),
> +llvm::Triple(triple), c.getPreprocessorOpts(),
>  #endif
>  clang::LangStandard::lang_opencl11);
>c.createDiagnostics(
> 

Emil,

Please add this to 12.0, since it's needed for using llvm-3.9 with 12.0.x.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=97542


-- 
t

Re: [Mesa-dev] [PATCH] direct-to-native-GL for GLX clients on Cygwin ("Windows-DRI")

2016-09-15 Thread Emil Velikov

On 18 July 2016 at 15:43, Jon Turney  wrote:
> Structurally, this is very similar to the existing Apple-DRI code, except I
> have chosen to implement this using the __GLXDRIdisplay, etc. vtables (as
> suggested originally in [1]), rather than a maze of ifdefs.  This also means
> that LIBGL_ALWAYS_SOFTWARE and LIBGL_ALWAYS_INDIRECT work as expected.
>
> [1] https://lists.freedesktop.org/archives/mesa-dev/2010-May/000756.html
>
> This adds:
>
> * the Windows-DRI extension protocol headers and the windowsdriproto.pc
> file, for use in building the Windows-DRI extension for the X server
>
> * a Windows-DRI extension helper client library
>
> * a Windows-specific DRI implementation for GLX clients
>
> The server is queried for Windows-DRI extension support on the screen before
> using it (to detect the case where WGL is disabled or can't be activated).
>
> The server is queried for fbconfigID to pixelformatindex mapping, which is
> used to augment glx_config.
>
> The server is queried for a native handle for the drawable (which is of a
> different type for windows, pixmaps and pbuffers), which is used to augment
> __GLXDRIdrawable.
>
> Various GLX extensions are enabled depending on if the equivalent WGL
> extension is available.
>
> Signed-off-by: Jon Turney 
> ---
>  configure.ac  |  10 +-
>  src/glx/Makefile.am   |  14 +
>  src/glx/driwindows_glx.c  | 609 ++
>  src/glx/glxclient.h   |  11 +-
>  src/glx/glxext.c  |  19 ++
>  src/glx/windows/Makefile.am   |  31 ++
>  src/glx/windows/wgl.c | 108 ++
>  src/glx/windows/wgl.h |  44 +++
>  src/glx/windows/windows_drawable.c| 192 +++
>  src/glx/windows/windowsdriconst.h |  45 +++
>  src/glx/windows/windowsdriproto.pc.in |   9 +
>  src/glx/windows/windowsdristr.h   | 152 +
>  src/glx/windows/windowsgl.c   | 403 ++
>  src/glx/windows/windowsgl.h   |  52 +++
>  src/glx/windows/windowsgl_internal.h  |  67 
>  src/glx/windows/xwindowsdri.c | 237 +
>  src/glx/windows/xwindowsdri.h |  59 
>  src/mapi/Makefile.am  |   3 +
>  src/mapi/glapi/gen/Makefile.am|   3 +
>  src/mapi/glapi/glapi.h|   2 +-
>  20 files changed, 2066 insertions(+), 4 deletions(-)
>  create mode 100644 src/glx/driwindows_glx.c
>  create mode 100644 src/glx/windows/Makefile.am
>  create mode 100644 src/glx/windows/wgl.c
>  create mode 100644 src/glx/windows/wgl.h
>  create mode 100644 src/glx/windows/windows_drawable.c
>  create mode 100644 src/glx/windows/windowsdriconst.h
>  create mode 100644 src/glx/windows/windowsdriproto.pc.in
>  create mode 100644 src/glx/windows/windowsdristr.h
>  create mode 100644 src/glx/windows/windowsgl.c
>  create mode 100644 src/glx/windows/windowsgl.h
>  create mode 100644 src/glx/windows/windowsgl_internal.h
>  create mode 100644 src/glx/windows/xwindowsdri.c
>  create mode 100644 src/glx/windows/xwindowsdri.h
>
> diff --git a/configure.ac b/configure.ac
> index 54416b4..9cefc28 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1114,7 +1114,9 @@ fi
>  case "$host_os" in
>  darwin*)
>  dri_platform='apple' ;;
> -gnu*|cygwin*)
> +cygwin*)
> +dri_platform='windows' ;;
> +gnu*)
>  dri_platform='none' ;;
>  *)
>  dri_platform='drm' ;;
> @@ -1130,6 +1132,7 @@ AM_CONDITIONAL(HAVE_DRISW_KMS, test "x$have_drisw_kms" = xyes )
>  AM_CONDITIONAL(HAVE_DRI2, test "x$enable_dri" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
>  AM_CONDITIONAL(HAVE_DRI3, test "x$enable_dri3" = xyes -a "x$dri_platform" = xdrm -a "x$have_libdrm" = xyes )
>  AM_CONDITIONAL(HAVE_APPLEDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xapple )
> +AM_CONDITIONAL(HAVE_WINDOWSDRI, test "x$enable_dri" = xyes -a "x$dri_platform" = xwindows )
>
>  AC_ARG_ENABLE([shared-glapi],
>  [AS_HELP_STRING([--enable-shared-glapi],
> @@ -1394,6 +1397,9 @@ xdri)
>  if test x"$dri_platform" = xapple ; then
>  DEFINES="$DEFINES -DGLX_USE_APPLEGL"
>  fi
> +if test x"$dri_platform" = xwindows ; then
> +DEFINES="$DEFINES -DGLX_USE_WINDOWSGL"
> +fi
>  fi
>
>  # add xf86vidmode if available
> @@ -2744,6 +2750,8 @@ AC_CONFIG_FILES([Makefile
> src/glx/Makefile
> src/glx/apple/Makefile
> src/glx/tests/Makefile
> +   src/glx/windows/Makefile
> +   src/glx/windows/windowsdriproto.pc
So this is where windowsdriproto is. Can I suggest moving that into a
separate package/repo, analogous to xf86driproto/dri2proto/dri3proto?
Having it in ~jturney/ would be fine as a start, until an admin
creates a repo in xorg/proto/.

Other than that, the coding style seems a bit inconsistent. But
considering this is based/copied from the dri codebase (dri1 by the
looks of

[Mesa-dev] [Bug 97804] Later precision statement isn't overriding earlier one

2016-09-15 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97804

Tapani Pälli  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lem...@gmail.com

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH 1/3] glx/glvnd: Don't modify the dummy slot in the dispatch table

2016-09-15 Thread Eric Engestrom

On Wed, Sep 14, 2016 at 02:06:18PM -0400, Adam Jackson wrote:
> Signed-off-by: Adam Jackson 

The series is:
Reviewed-by: Eric Engestrom 

BTW, I feel like these should be CC'ed to stable? I never know when
a fix is stable-worthy.

Cheers,
  Eric

> ---
>  src/glx/glxglvnd.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/src/glx/glxglvnd.c b/src/glx/glxglvnd.c
> index 098304d..2fc9b00 100644
> --- a/src/glx/glxglvnd.c
> +++ b/src/glx/glxglvnd.c
> @@ -50,6 +50,9 @@ static void __glXGLVNDSetDispatchIndex(const GLubyte *procName, int index)
>  {
>  unsigned internalIndex = FindGLXFunction(procName);
>  
> +if (internalIndex == DI_FUNCTION_COUNT)
> +return; /* unknown or static dispatch */
> +
>  __glXDispatchTableIndices[internalIndex] = index;
>  }
>  
> -- 
> 2.9.3
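
To illustrate the guard this patch adds, here is a minimal standalone
sketch in C. It is not the Mesa code: FUNC_COUNT, func_names,
find_function and set_dispatch_index are hypothetical stand-ins for
DI_FUNCTION_COUNT, the GLX function table, FindGLXFunction and
__glXGLVNDSetDispatchIndex. The assumption that the lookup returns the
count value for unknown names, and that this value indexes the dummy
slot, is inferred from the patch subject and the added comment.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical table size; the value FUNC_COUNT doubles as "not found". */
#define FUNC_COUNT 2u

static const char *const func_names[FUNC_COUNT] = { "funcA", "funcB" };

/* One extra entry at the end plays the role of the dummy slot. */
static int dispatch_indices[FUNC_COUNT + 1];

/* Linear lookup; returns FUNC_COUNT when the name is unknown. */
static unsigned
find_function(const char *name)
{
   for (unsigned i = 0; i < FUNC_COUNT; i++) {
      if (strcmp(func_names[i], name) == 0)
         return i;
   }
   return FUNC_COUNT;
}

/* Record a dispatch index, but never write through the sentinel. */
static void
set_dispatch_index(const char *name, int index)
{
   assert(name != NULL);

   unsigned internal = find_function(name);
   if (internal == FUNC_COUNT)
      return; /* unknown name: leave the dummy slot untouched */

   dispatch_indices[internal] = index;
}
```

Calling set_dispatch_index("funcB", 7) stores 7 in slot 1, while a call
with an unknown name returns early and leaves the last slot at its
initial value.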

[Mesa-dev] [PATCH] st/va: Make VAAPI_DISABLE_INTERLACE default true

2016-09-15 Thread Andy Furniss


Since bf901a2
st/va: also honors interlaced preference when providing a video format
existing scripts and most use cases will need true.

Signed-off-by: Andy Furniss 
---
 src/gallium/state_trackers/va/surface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c
index 00df69d..e73e17e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -43,7 +43,7 @@

 #include "va_private.h"

-DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", FALSE);
+DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", TRUE);

 #include 

--
2.7.0
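
For readers unfamiliar with environment-driven options, the effect of
flipping the default can be sketched in plain C. This is only an
illustration under stated assumptions: parse_bool_option and
env_bool_option are hypothetical helpers, not Mesa's
DEBUG_GET_ONCE_BOOL_OPTION, and the set of strings treated as false is
a simplification chosen for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Interpret a raw option string; NULL means "not set", so the default
 * wins. Only "0" and "false" count as false in this simplified sketch. */
static bool
parse_bool_option(const char *value, bool dfault)
{
   if (value == NULL)
      return dfault;
   if (strcmp(value, "0") == 0 || strcmp(value, "false") == 0)
      return false;
   return true;
}

/* Read the option from the environment, falling back to the default. */
static bool
env_bool_option(const char *name, bool dfault)
{
   assert(name != NULL);
   return parse_bool_option(getenv(name), dfault);
}
```

With the default at true, a user who never sets the variable gets the
option enabled, and has to set it to a false value explicitly to get
the previous behaviour back.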

[Mesa-dev] [Bug 97804] Later precision statement isn't overriding earlier one

2016-09-15 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=97804

--- Comment #1 from Eero Tamminen  ---
GL 4.5 GLSL spec:
  https://www.opengl.org/registry/doc/GLSLangSpec.4.50.pdf

Says the same in "4.7.3 Default Precision Qualifiers":
"Multiple precision statements for the same basic type can appear inside the
same scope, with later statements overriding earlier statements within that
scope."

-- 
You are receiving this mail because:
You are the assignee for the bug.

Re: [Mesa-dev] [PATCH] st/va: also honors interlaced preference when providing a video format

2016-09-15 Thread Andy Furniss


Leo Liu wrote:

Hi Andy,

On 09/13/2016 06:22 AM, Andy Furniss wrote:

Zhang, Boyuan wrote:

Hi Leo, Christian and Julien,

I tested the patch with Vaapi Encoding and Transcoding, it seems
working fine. We are using "VAAPI_DISABLE_INTERLACE" env, so
interlaced is always disabled.


Though I notice it will break screen recording scripts for existing
users who previously didn't need the env set but will after this.

Totally untested/thought through, but maybe the env should default to on?


Agree, can you come up with a patch for that?


I sent a patch to the list, patchwork -

https://patchwork.freedesktop.org/patch/110697/


Re: [Mesa-dev] [PATCH] st/va: Make VAAPI_DISABLE_INTERLACE default true

2016-09-15 Thread Leo Liu




On 09/15/2016 10:43 AM, Andy Furniss wrote:

Since bf901a2
st/va: also honors interlaced preference when providing a video format
existing scripts and most use cases will need true.

Signed-off-by: Andy Furniss 
---
 src/gallium/state_trackers/va/surface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c
index 00df69d..e73e17e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -43,7 +43,7 @@

 #include "va_private.h"

-DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", FALSE);
+DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", TRUE);


As mentioned, it'll still override the preferred interlaced
format when this env is not explicitly set.


Not sure this will be okay in other cases. @Julien?

Regards,
Leo



 #include 




[Mesa-dev] [PATCH] mesa: check for no matrix change in _mesa_LoadMatrixf()

2016-09-15 Thread Brian Paul

Some apps issue redundant glLoadMatrixf() calls with the same matrix.
Try to avoid setting dirty state in that situation.

This reduces the number of constant buffer updates by about half in
ET Quake Wars.

Tested with Piglit, ETQW, Sauerbraten, Google Earth, etc.
---
 src/mesa/main/matrix.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/matrix.c b/src/mesa/main/matrix.c
index b30b983..83f081e 100644
--- a/src/mesa/main/matrix.c
+++ b/src/mesa/main/matrix.c
@@ -356,9 +356,11 @@ _mesa_LoadMatrixf( const GLfloat *m )
   m[2], m[6], m[10], m[14],
   m[3], m[7], m[11], m[15]);
 
-   FLUSH_VERTICES(ctx, 0);
-   _math_matrix_loadf( ctx->CurrentStack->Top, m );
-   ctx->NewState |= ctx->CurrentStack->DirtyFlag;
+   if (memcmp(m, ctx->CurrentStack->Top->m, 16 * sizeof(GLfloat)) != 0) {
+  FLUSH_VERTICES(ctx, 0);
+  _math_matrix_loadf( ctx->CurrentStack->Top, m );
+  ctx->NewState |= ctx->CurrentStack->DirtyFlag;
+   }
 }
 
 
-- 
1.9.1
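
The idea behind the patch can be shown in isolation with a small C
sketch. The names here (struct matrix_slot, load_matrix_if_changed) are
hypothetical and stand in for the matrix stack top and its dirty flag;
this is not the Mesa implementation, only the same compare-before-store
pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a matrix stack top plus its dirty bit. */
struct matrix_slot {
   float m[16];
   bool dirty;
};

/* Copy `src` into the slot only when it differs bit-for-bit from the
 * stored matrix. Returns true when an update actually happened. */
static bool
load_matrix_if_changed(struct matrix_slot *slot, const float *src)
{
   assert(slot != NULL && src != NULL);

   if (memcmp(src, slot->m, sizeof(slot->m)) == 0)
      return false; /* redundant load: leave the dirty bit alone */

   memcpy(slot->m, src, sizeof(slot->m));
   slot->dirty = true;
   return true;
}
```

Because memcmp() compares bit patterns, 0.0f and -0.0f count as
different values. That only costs a redundant update in a rare case, so
the comparison errs on the safe side.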


[Mesa-dev] [PATCH] nvc0/ir: fix subops for IMAD

2016-09-15 Thread Samuel Pitoiset

Offset was wrong, it's at bit 8, not 4. Also, uses subr instead
of sub when src2 has neg. Similar to GK110 now.

Signed-off-by: Samuel Pitoiset 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index 47b4e7a..3ed70b5 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -736,9 +736,15 @@ CodeEmitterNVC0::emitUADD(const Instruction *i)
 void
 CodeEmitterNVC0::emitIMAD(const Instruction *i)
 {
+   uint8_t addOp =
+  (i->src(2).mod.neg() << 1) | (i->src(0).mod.neg() ^ i->src(1).mod.neg());
+
assert(i->encSize == 8);
emitForm_A(i, HEX64(2000, 0003));
 
+   assert(addOp != 3);
+   code[1] |= addOp << 8;
+
if (isSignedType(i->dType))
   code[0] |= 1 << 7;
if (isSignedType(i->sType))
@@ -749,10 +755,6 @@ CodeEmitterNVC0::emitIMAD(const Instruction *i)
if (i->flagsDef >= 0) code[1] |= 1 << 16;
if (i->flagsSrc >= 0) code[1] |= 1 << 23;
 
-   if (i->src(2).mod.neg()) code[0] |= 0x10;
-   if (i->src(1).mod.neg() ^
-   i->src(0).mod.neg()) code[0] |= 0x20;
-
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH)
   code[0] |= 1 << 6;
 }
-- 
2.8.0


Re: [Mesa-dev] [PATCH] nvc0/ir: fix subops for IMAD

2016-09-15 Thread Ilia Mirkin

On Thu, Sep 15, 2016 at 12:07 PM, Samuel Pitoiset wrote:
> Offset was wrong, it's at bit 8, not 4. Also, uses subr instead
> of sub when src2 has neg. Similar to GK110 now.
>
> Signed-off-by: Samuel Pitoiset 
> Cc: mesa-sta...@lists.freedesktop.org
> ---
>  src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> index 47b4e7a..3ed70b5 100644
> --- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> +++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
> @@ -736,9 +736,15 @@ CodeEmitterNVC0::emitUADD(const Instruction *i)
>  void
>  CodeEmitterNVC0::emitIMAD(const Instruction *i)
>  {
> +   uint8_t addOp =
> +  (i->src(2).mod.neg() << 1) | (i->src(0).mod.neg() ^ i->src(1).mod.neg());
> +
> assert(i->encSize == 8);
> emitForm_A(i, HEX64(2000, 0003));
>
> +   assert(addOp != 3);
> +   code[1] |= addOp << 8;

code[0], no?

> +
> if (isSignedType(i->dType))
>code[0] |= 1 << 7;
> if (isSignedType(i->sType))
> @@ -749,10 +755,6 @@ CodeEmitterNVC0::emitIMAD(const Instruction *i)
> if (i->flagsDef >= 0) code[1] |= 1 << 16;
> if (i->flagsSrc >= 0) code[1] |= 1 << 23;
>
> -   if (i->src(2).mod.neg()) code[0] |= 0x10;
> -   if (i->src(1).mod.neg() ^
> -   i->src(0).mod.neg()) code[0] |= 0x20;
> -
> if (i->subOp == NV50_IR_SUBOP_MUL_HIGH)
>code[0] |= 1 << 6;
>  }
> --
> 2.8.0
>

Re: [Mesa-dev] [PATCH] nvc0/ir: fix subops for IMAD

2016-09-15 Thread Samuel Pitoiset




On 09/15/2016 06:08 PM, Ilia Mirkin wrote:

On Thu, Sep 15, 2016 at 12:07 PM, Samuel Pitoiset wrote:

Offset was wrong, it's at bit 8, not 4. Also, uses subr instead
of sub when src2 has neg. Similar to GK110 now.

Signed-off-by: Samuel Pitoiset 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index 47b4e7a..3ed70b5 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -736,9 +736,15 @@ CodeEmitterNVC0::emitUADD(const Instruction *i)
 void
 CodeEmitterNVC0::emitIMAD(const Instruction *i)
 {
+   uint8_t addOp =
+  (i->src(2).mod.neg() << 1) | (i->src(0).mod.neg() ^ i->src(1).mod.neg());
+
assert(i->encSize == 8);
emitForm_A(i, HEX64(2000, 0003));

+   assert(addOp != 3);
+   code[1] |= addOp << 8;


code[0], no?


...
correct! I failed at copy&paste.




+
if (isSignedType(i->dType))
   code[0] |= 1 << 7;
if (isSignedType(i->sType))
@@ -749,10 +755,6 @@ CodeEmitterNVC0::emitIMAD(const Instruction *i)
if (i->flagsDef >= 0) code[1] |= 1 << 16;
if (i->flagsSrc >= 0) code[1] |= 1 << 23;

-   if (i->src(2).mod.neg()) code[0] |= 0x10;
-   if (i->src(1).mod.neg() ^
-   i->src(0).mod.neg()) code[0] |= 0x20;
-
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH)
   code[0] |= 1 << 6;
 }
--
2.8.0



--
-Samuel

[Mesa-dev] [PATCH v2] nvc0/ir: fix subops for IMAD

2016-09-15 Thread Samuel Pitoiset

Offset was wrong, it's at bit 8, not 4. Also, uses subr instead
of sub when src2 has neg. Similar to GK110 now.

Signed-off-by: Samuel Pitoiset 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
index d83028c..d8ca6ab 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_emit_nvc0.cpp
@@ -735,9 +735,15 @@ CodeEmitterNVC0::emitUADD(const Instruction *i)
 void
 CodeEmitterNVC0::emitIMAD(const Instruction *i)
 {
+   uint8_t addOp =
+  (i->src(2).mod.neg() << 1) | (i->src(0).mod.neg() ^ i->src(1).mod.neg());
+
assert(i->encSize == 8);
emitForm_A(i, HEX64(2000, 0003));
 
+   assert(addOp != 3);
+   code[0] |= addOp << 8;
+
if (isSignedType(i->dType))
   code[0] |= 1 << 7;
if (isSignedType(i->sType))
@@ -748,10 +754,6 @@ CodeEmitterNVC0::emitIMAD(const Instruction *i)
if (i->flagsDef >= 0) code[1] |= 1 << 16;
if (i->flagsSrc >= 0) code[1] |= 1 << 23;
 
-   if (i->src(2).mod.neg()) code[0] |= 0x10;
-   if (i->src(1).mod.neg() ^
-   i->src(0).mod.neg()) code[0] |= 0x20;
-
if (i->subOp == NV50_IR_SUBOP_MUL_HIGH)
   code[0] |= 1 << 6;
 }
-- 
2.9.3
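
As a rough standalone illustration of the field this patch builds, the
following C sketch computes the 2-bit add-op value from the three
negate flags and ORs it into a 32-bit word at bit 8. The helper names
imad_add_op and imad_encode_add_op are hypothetical, and reading value 1
as "sub" and value 2 as "subr" is an interpretation of the commit
message, not something verified against hardware documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build the 2-bit add-op field from the source negate flags:
 *   bit 0: the product src0 * src1 is negated (exactly one factor is)
 *   bit 1: the addend src2 is negated */
static uint8_t
imad_add_op(bool neg0, bool neg1, bool neg2)
{
   return (uint8_t)(((neg2 ? 1u : 0u) << 1) | ((neg0 != neg1) ? 1u : 0u));
}

/* OR the field into the low instruction word, starting at bit 8. */
static uint32_t
imad_encode_add_op(uint32_t word0, bool neg0, bool neg1, bool neg2)
{
   uint8_t add_op = imad_add_op(neg0, neg1, neg2);

   /* The patch asserts that this combination never reaches the emitter. */
   assert(add_op != 3);

   return word0 | ((uint32_t)add_op << 8);
}
```

Negating both multiplicands cancels out, so imad_add_op(true, true,
false) yields 0, the same as no negation at all.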


Re: [Mesa-dev] [PATCH] st/va: Make VAAPI_DISABLE_INTERLACE default true

2016-09-15 Thread Christian König


Am 15.09.2016 um 16:43 schrieb Andy Furniss:

Since bf901a2
st/va: also honors interlaced preference when providing a video format
existing scripts and most use cases will need true.

Signed-off-by: Andy Furniss 


Reviewed-by: Christian König .


---
 src/gallium/state_trackers/va/surface.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/va/surface.c b/src/gallium/state_trackers/va/surface.c
index 00df69d..e73e17e 100644
--- a/src/gallium/state_trackers/va/surface.c
+++ b/src/gallium/state_trackers/va/surface.c
@@ -43,7 +43,7 @@

 #include "va_private.h"

-DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", FALSE);
+DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE", TRUE);


 #include 




Re: [Mesa-dev] [PATCH] mesa: check for no matrix change in _mesa_LoadMatrixf()

2016-09-15 Thread Kenneth Graunke

On Thursday, September 15, 2016 9:34:50 AM PDT Brian Paul wrote:
> Some apps issue redundant glLoadMatrixf() calls with the same matrix.
> Try to avoid setting dirty state in that situation.
> 
> This reduces the number of constant buffer updates by about half in
> ET Quake Wars.
> 
> Tested with Piglit, ETQW, Sauerbraten, Google Earth, etc.

   ^^^
   Is that a texture compression test suite?
   :)

Reviewed-by: Kenneth Graunke 

> ---
>  src/mesa/main/matrix.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/src/mesa/main/matrix.c b/src/mesa/main/matrix.c
> index b30b983..83f081e 100644
> --- a/src/mesa/main/matrix.c
> +++ b/src/mesa/main/matrix.c
> @@ -356,9 +356,11 @@ _mesa_LoadMatrixf( const GLfloat *m )
>m[2], m[6], m[10], m[14],
>m[3], m[7], m[11], m[15]);
>  
> -   FLUSH_VERTICES(ctx, 0);
> -   _math_matrix_loadf( ctx->CurrentStack->Top, m );
> -   ctx->NewState |= ctx->CurrentStack->DirtyFlag;
> +   if (memcmp(m, ctx->CurrentStack->Top->m, 16 * sizeof(GLfloat)) != 0) {
> +  FLUSH_VERTICES(ctx, 0);
> +  _math_matrix_loadf( ctx->CurrentStack->Top, m );
> +  ctx->NewState |= ctx->CurrentStack->DirtyFlag;
> +   }
>  }



Re: [Mesa-dev] [PATCH] mesa: check for no matrix change in _mesa_LoadMatrixf()

2016-09-15 Thread Charmaine Lee


Looks good.

Reviewed-by: Charmaine Lee 

From: Brian Paul 
Sent: Thursday, September 15, 2016 8:34 AM
To: mesa-dev@lists.freedesktop.org
Cc: Charmaine Lee
Subject: [PATCH] mesa: check for no matrix change in _mesa_LoadMatrixf()

Some apps issue redundant glLoadMatrixf() calls with the same matrix.
Try to avoid setting dirty state in that situation.

This reduces the number of constant buffer updates by about half in
ET Quake Wars.

Tested with Piglit, ETQW, Sauerbraten, Google Earth, etc.
---
 src/mesa/main/matrix.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/matrix.c b/src/mesa/main/matrix.c
index b30b983..83f081e 100644
--- a/src/mesa/main/matrix.c
+++ b/src/mesa/main/matrix.c
@@ -356,9 +356,11 @@ _mesa_LoadMatrixf( const GLfloat *m )
   m[2], m[6], m[10], m[14],
   m[3], m[7], m[11], m[15]);

-   FLUSH_VERTICES(ctx, 0);
-   _math_matrix_loadf( ctx->CurrentStack->Top, m );
-   ctx->NewState |= ctx->CurrentStack->DirtyFlag;
+   if (memcmp(m, ctx->CurrentStack->Top->m, 16 * sizeof(GLfloat)) != 0) {
+  FLUSH_VERTICES(ctx, 0);
+  _math_matrix_loadf( ctx->CurrentStack->Top, m );
+  ctx->NewState |= ctx->CurrentStack->DirtyFlag;
+   }
 }


--
1.9.1


Re: [Mesa-dev] [PATCH 3/3] mesa: Move buffers-unmapped earlier in check_valid_to_render().

2016-09-15 Thread Mathias Fröhlich

Hi,

On Thursday, 15 September 2016 02:10:24 CEST Kenneth Graunke wrote:
> This needs to be above the switch on API, as that can return true
> (valid to render) before this error check even had a chance to run.
> 
> Fixes ESEXT-CTS.draw_elements_base_vertex_tests.invalid_mapped_bos,
> which worked before commit 72f1566f90c434c7752d8405193eec68d6743246.
> 
> Signed-off-by: Kenneth Graunke 
> Cc: Mathias Fröhlich 

Indeed. Thanks for fixing!

Reviewed-by: Mathias Fröhlich 

best
Mathias


> ---
>  src/mesa/main/api_validate.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/main/api_validate.c b/src/mesa/main/api_validate.c
> index b35751e..6cb626a 100644
> --- a/src/mesa/main/api_validate.c
> +++ b/src/mesa/main/api_validate.c
> @@ -45,6 +45,12 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>return false;
> }
>  
> +   if (!_mesa_all_buffers_are_unmapped(ctx->Array.VAO)) {
> +  _mesa_error(ctx, GL_INVALID_OPERATION,
> +  "%s(vertex buffers are mapped)", function);
> +  return false;
> +   }
> +
> switch (ctx->API) {
> case API_OPENGLES2:
>/* For ES2, we can draw if we have a vertex program/shader). */
> @@ -119,12 +125,6 @@ check_valid_to_render(struct gl_context *ctx, const char *function)
>unreachable("Invalid API value in check_valid_to_render()");
> }
>  
> -   if (!_mesa_all_buffers_are_unmapped(ctx->Array.VAO)) {
> -  _mesa_error(ctx, GL_INVALID_OPERATION,
> -  "%s(vertex buffers are mapped)", function);
> -  return false;
> -   }
> -
> return true;
>  }
>  
> 



Re: [Mesa-dev] [PATCH 1/4] i965/fs: Use sample interpolation for interpolateAtCentroid in persample mode

2016-09-15 Thread Kenneth Graunke

On Wednesday, September 14, 2016 10:45:24 AM PDT Jason Ekstrand wrote:
> From the ARB_gpu_shader5 spec:
> 
>The built-in functions interpolateAtCentroid() and interpolateAtSample()
>will sample variables as though they were declared with the "centroid"
>or "sample" qualifiers, respectively.
> 
> When running with persample dispatch forced by the API, we interpolate
> anything that isn't flat as if it's qualified by "sample".  In order to
> keep interpolateAtCentroid() consistent with the "centroid" qualifier, we
> need to make interpolateAtCentroid() do sample interpolation instead.
> Nothing in the GLSL spec guarantees that the result of
> interpolateAtCentroid is uniform across samples in any way, so this is a
> perfectly fine thing to do.
> 
> Fixes 8 of the new dEQP-VK.pipeline.multisample_interpolation.* Vulkan CTS
> tests that specifically validate consistency between the "sample" qualifier
> and interpolateAtSample()
> 
> Signed-off-by: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 26 ++
>  1 file changed, 26 insertions(+)

Series is:
Reviewed-by: Kenneth Graunke 
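
The rule described in the quoted commit message can be sketched roughly as follows. This is only a minimal illustration: the enum and the function name are hypothetical and are not the actual brw_fs.cpp code. It shows that, once per-sample dispatch is forced, a centroid request ends up using sample interpolation like every other non-flat input.

```c
#include <stdbool.h>

/* Hypothetical interpolation modes; not the real i965 enums. */
enum interp_mode {
   INTERP_FLAT,
   INTERP_PIXEL,
   INTERP_CENTROID,
   INTERP_SAMPLE,
};

/*
 * Return the mode that is actually used for a requested mode.  With
 * per-sample dispatch forced, anything that is not flat is interpolated
 * as if it were qualified with "sample", so a centroid request is
 * promoted to sample interpolation too.
 */
enum interp_mode
effective_interp_mode(enum interp_mode requested, bool persample_dispatch)
{
   if (persample_dispatch && requested != INTERP_FLAT)
      return INTERP_SAMPLE;

   return requested;
}
```

Without forced per-sample dispatch the requested mode is returned unchanged, which keeps interpolateAtCentroid() consistent with the "centroid" qualifier in both cases.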



Re: [Mesa-dev] [PATCH 01/10] i965: use nir_lower_indirect_derefs() for GLSL Gen7+

2016-09-15 Thread Jason Ekstrand

On Sep 15, 2016 12:05 AM, "Timothy Arceri" 
wrote:
>
> This moves the nir_lower_indirect_derefs() call into
> brw_preprocess_nir() so that it is called by both OpenGL and Vulkan,
> and removes the call to the old GLSL IR pass
> lower_variable_index_to_cond_assign().
>
> We want to do this pass in nir to be able to move loop unrolling
> to nir.
>
> There is an increase of 1-3 instructions in a small number of shaders,
> and 2 Kerbal Space Program shaders that increase by 32 instructions.
>
> Shader-db results BDW:
>
> total instructions in shared programs: 8705873 -> 8706194 (0.00%)
> instructions in affected programs: 32515 -> 32836 (0.99%)
> helped: 3
> HURT: 79
>
> total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
> cycles in affected programs: 528104 -> 493460 (-6.56%)
> helped: 47
> HURT: 37
>
> LOST:   2
> GAINED: 0
> ---
>  src/intel/vulkan/anv_pipeline.c| 10 --
>  src/mesa/drivers/dri/i965/brw_link.cpp | 26 ++
>  src/mesa/drivers/dri/i965/brw_nir.c| 12 
>  3 files changed, 26 insertions(+), 22 deletions(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index f96fe22..f292f0b 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -183,16 +183,6 @@ anv_shader_compile_to_nir(struct anv_device *device,
>
> nir_shader_gather_info(nir, entry_point->impl);
>
> -   nir_variable_mode indirect_mask = 0;
> -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectInput)
> -  indirect_mask |= nir_var_shader_in;
> -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectOutput)
> -  indirect_mask |= nir_var_shader_out;
> -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectTemp)
> -  indirect_mask |= nir_var_local;
> -
> -   nir_lower_indirect_derefs(nir, indirect_mask);
> -
> return nir;
>  }
>
> diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
> index 2b1fa61..41791d4 100644
> --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> @@ -139,18 +139,20 @@ process_glsl_ir(gl_shader_stage stage,
>
> do_copy_propagation(shader->ir);
>
> -   bool lowered_variable_indexing =
> -      lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> -                                          shader->ir,
> -                                          options->EmitNoIndirectInput,
> -                                          options->EmitNoIndirectOutput,
> -                                          options->EmitNoIndirectTemp,
> -                                          options->EmitNoIndirectUniform);
> -
> -   if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
> -      perf_debug("Unsupported form of variable indexing in %s; falling "
> -                 "back to very inefficient code generation\n",
> -                 _mesa_shader_stage_to_abbrev(shader->Stage));
> +   if (brw->gen < 7) {
> +      bool lowered_variable_indexing =
> +         lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> +                                             shader->ir,
> +                                             options->EmitNoIndirectInput,
> +                                             options->EmitNoIndirectOutput,
> +                                             options->EmitNoIndirectTemp,
> +                                             options->EmitNoIndirectUniform);
> +
> +      if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
> +         perf_debug("Unsupported form of variable indexing in %s; falling "
> +                    "back to very inefficient code generation\n",
> +                    _mesa_shader_stage_to_abbrev(shader->Stage));
> +      }
>     }
>
> bool progress;
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c
> index e8dafae..af646ed 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -453,6 +453,18 @@ brw_preprocess_nir(const struct brw_compiler *compiler, nir_shader *nir)
> /* Lower a bunch of stuff */
> OPT_V(nir_lower_var_copies);
>
> +   if (compiler->devinfo->gen > 6) {

I think you want "> 7" here

> +      nir_variable_mode indirect_mask = 0;
> +      if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectInput)
> +         indirect_mask |= nir_var_shader_in;
> +      if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectOutput)
> +         indirect_mask |= nir_var_shader_out;
> +      if (compiler->glsl_compiler_options[nir->stage].EmitNoIndirectTemp)
> +         indirect_mask |= nir_var_local;
> +
> +      nir_lower_indirect_derefs(nir, indirect_mask);
> +   }
> +
> /* Get rid of split copies */
> nir = nir_optimize(nir, is_scalar);
>
> --
> 2.7.4
>
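
For readers who want the mask logic from the hunk above in isolation, here is a minimal sketch. The enum values and the struct are stand-ins invented for illustration; only the three EmitNoIndirect* field names come from the patch, and nothing here is the real NIR API.

```c
#include <stdbool.h>

/* Stand-ins for the nir_variable_mode bits; the values are arbitrary. */
enum mode_bit {
   MODE_SHADER_IN  = 1 << 0,
   MODE_SHADER_OUT = 1 << 1,
   MODE_LOCAL      = 1 << 2,
};

/* Stand-in for the per-stage compiler options consulted by the patch. */
struct stage_options {
   bool EmitNoIndirectInput;
   bool EmitNoIndirectOutput;
   bool EmitNoIndirectTemp;
};

/*
 * Collect the variable modes whose indirect (variable-index) accesses
 * have to be lowered because the backend cannot handle them directly.
 */
unsigned
build_indirect_mask(const struct stage_options *opts)
{
   unsigned mask = 0;

   if (opts->EmitNoIndirectInput)
      mask |= MODE_SHADER_IN;
   if (opts->EmitNoIndirectOutput)
      mask |= MODE_SHADER_OUT;
   if (opts->EmitNoIndirectTemp)
      mask |= MODE_LOCAL;

   return mask;
}
```

The resulting mask is what the patch passes to nir_lower_indirect_derefs(); a mask of zero means no mode is selected for lowering.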

Re: [Mesa-dev] [PATCH 2/3] glsl: Skip "unsized arrays aren't allowed" check for TCS/TES vars.

2016-09-15 Thread Ilia Mirkin

On Thu, Sep 15, 2016 at 5:10 AM, Kenneth Graunke  wrote:
> Fixes ESEXT-CTS.draw_elements_base_vertex_tests.AEP_shader_stages and
> ESEXT-CTS.texture_cube_map_array.texture_size_tesselation_con_sh.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/compiler/glsl/ast_to_hir.cpp | 9 -
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
> index 0a23195..90cc924 100644
> --- a/src/compiler/glsl/ast_to_hir.cpp
> +++ b/src/compiler/glsl/ast_to_hir.cpp
> @@ -5127,7 +5127,14 @@ ast_declarator_list::hir(exec_list *instructions,
>   const glsl_type *const t = (earlier == NULL)
>  ? var->type : earlier->type;
>
> - if (t->is_unsized_array())
> + /* GL_OES_tessellation_shader allows omitting the array size
> +  * for TCS inputs/outputs and TES inputs.  Ignore this check.
> +  */
> + bool unsized_ok = state->stage == MESA_SHADER_TESS_CTRL ||

Are you sure that only inputs/outputs can make it in here? I couldn't
come to that conclusion from a quick scan of the code...

> + (state->stage == MESA_SHADER_TESS_EVAL &&
> +  var->data.mode == ir_var_shader_in);
> +
> + if (t->is_unsized_array() && !unsized_ok)
>  /* Section 10.17 of the GLSL ES 1.00 specification states that
>   * unsized array declarations have been removed from the 
> language.
>   * Arrays that are sized using an initializer are still 
> explicitly
> --
> 2.9.3
>

Re: [Mesa-dev] [PATCH] Revert "st/vdpau: use linear layout for output surfaces"

2016-09-15 Thread Dave Airlie

On 15 September 2016 at 17:43, Christian König  wrote:
> Am 15.09.2016 um 06:00 schrieb Ilia Mirkin:
>>
>> On Wed, Sep 14, 2016 at 11:58 PM, Dave Airlie  wrote:
>>>
>>> From: Dave Airlie 
>>>
>>> This reverts commit d180de35320eafa3df3d76f0e82b332656530126.
>>>
>>> This is a radeon specific hack that causes problems on nouveau
>>> when combined with the SHARED flag later. If radeonsi needs a fix
>>> for this, please fix it in the driver.
>
>
> Actually it isn't radeon specific. Using linear surfaces for this makes
> sense because tiling isn't beneficial and the surfaces can potentially be
> shared with other GPUs using the VDPAU OpenGL interop.

Who says tiling isn't beneficial, though? On other GPUs it might be; this
still seems like a radeon-centric view to me.

I don't think the API should be dictating that. Possibly the backend should get
more info to decide whether it wants to use LINEAR.

It would be good if you guys could smoke-test this at least, especially if we
Cc it for stable.

Dave.

Re: [Mesa-dev] Problem with RX 480 on Alien: Isolation and Dota 2

2016-09-15 Thread Marek Olšák

On Thu, Sep 15, 2016 at 3:44 AM, Michel Dänzer  wrote:
> On 14/09/16 07:41 PM, Marek Olšák wrote:
>> On Wed, Sep 14, 2016 at 5:26 AM, Michel Dänzer  wrote:
>>> On 14/09/16 02:53 AM, Marek Olšák wrote:

 cmake .. -G Ninja -DCMAKE_INSTALL_PREFIX=/usr/llvm/x86_64-linux-gnu
 -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" -DLLVM_ENABLE_ASSERTIONS=O
   -DCMAKE_BUILD_TYPE=RelWithDebInfo
 -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON \
   -DCMAKE_C_FLAGS_RELWITHDEBINFO="-O2 -g -DNDEBUG
 -fno-omit-frame-pointer" \
   -DCMAKE_CXX_FLAGS_RELWITHDEBINFO="-O2 -g -DNDEBUG
 -fno-omit-frame-pointer".
>>>
>>> FWIW, I recommend enabling assertions, i.e. setting
>>> -DLLVM_ENABLE_ASSERTIONS=1 and removing -DNDEBUG.
>>
>> That should have been:
>>
>> -DLLVM_ENABLE_ASSERTIONS=ON \
>>
>> It was cut when I was copy-pasting it.
>
> Doesn't -DNDEBUG disable assertions anyway though? When was the last
> time an LLVM assertion failed for you? :)

That's a good point. I don't remember. :)

Marek

Re: [Mesa-dev] Problem with RX 480 on Alien: Isolation and Dota 2

2016-09-15 Thread Marek Olšák

On Thu, Sep 15, 2016 at 5:04 AM, Romain Failliot
 wrote:
> 2016-09-13 13:53 GMT-04:00 Marek Olšák :
>> LLVM 32-bit:
>>
>> mkdir -p build32
>> cd build32
>> cmake .. -G Ninja -DCMAKE_INSTALL_PREFIX=/usr/llvm/i386-linux-gnu
>> -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU" -DLLVM_ENABLE_ASSERTIONS=ON
>>   -DCMAKE_BUILD_TYPE=RelWithDebInfo
>> -DLLVM_BUILD_LLVM_DYLIB=ON -DLLVM_LINK_LLVM_DYLIB=ON \
>>   -DCMAKE_C_FLAGS_RELWITHDEBINFO="-O2 -g -DNDEBUG
>> -fno-omit-frame-pointer" \
>>   -DCMAKE_CXX_FLAGS_RELWITHDEBINFO="-O2 -g -DNDEBUG
>> -fno-omit-frame-pointer" \
>>   -DLLVM_BUILD_32_BITS=ON
>
> I have a problem with the 32-bit compilation of llvm.
>
> I get this error:
>
> -- Target triple: x86_64-unknown-linux-gnu
> -- Native target architecture is X86
> -- Threads enabled.
> -- Doxygen disabled.
> -- Sphinx disabled.
> -- Go bindings disabled.
> -- Could NOT find OCaml (missing:  OCAMLFIND OCAML_VERSION OCAML_STDLIB_PATH)
> -- Could NOT find OCaml (missing:  OCAMLFIND OCAML_VERSION OCAML_STDLIB_PATH)
> -- OCaml bindings disabled.
> -- Building with -fPIC
> -- Building 32 bits executables and libraries.
> CMake Error at cmake/modules/HandleLLVMOptions.cmake:469 (message):
>   LLVM requires C++11 support but the '-std=c++11' flag isn't supported.
> Call Stack (most recent call first):
>   CMakeLists.txt:473 (include)
>
>
> -- Configuring incomplete, errors occurred!
>
> I don't know why my LLVM doesn't handle C++11... Any idea?

Update your gcc I guess? Sorry, I don't know much about LLVM build
requirements. It works with gcc 5.4.0.

Marek

Re: [Mesa-dev] [PATCH 2/3] glsl: Skip "unsized arrays aren't allowed" check for TCS/TES vars.

2016-09-15 Thread Kenneth Graunke

On Thursday, September 15, 2016 3:35:27 PM PDT Ilia Mirkin wrote:
> On Thu, Sep 15, 2016 at 5:10 AM, Kenneth Graunke  
> wrote:
> > Fixes ESEXT-CTS.draw_elements_base_vertex_tests.AEP_shader_stages and
> > ESEXT-CTS.texture_cube_map_array.texture_size_tesselation_con_sh.
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/compiler/glsl/ast_to_hir.cpp | 9 -
> >  1 file changed, 8 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/compiler/glsl/ast_to_hir.cpp 
> > b/src/compiler/glsl/ast_to_hir.cpp
> > index 0a23195..90cc924 100644
> > --- a/src/compiler/glsl/ast_to_hir.cpp
> > +++ b/src/compiler/glsl/ast_to_hir.cpp
> > @@ -5127,7 +5127,14 @@ ast_declarator_list::hir(exec_list *instructions,
> >   const glsl_type *const t = (earlier == NULL)
> >  ? var->type : earlier->type;
> >
> > - if (t->is_unsized_array())
> > + /* GL_OES_tessellation_shader allows omitting the array size
> > +  * for TCS inputs/outputs and TES inputs.  Ignore this check.
> > +  */
> > + bool unsized_ok = state->stage == MESA_SHADER_TESS_CTRL ||
> 
> Are you sure that only inputs/outputs can make it in here? I couldn't
> come to that conclusion from a quick scan of the code...

Whoops.  No, other things can get here too.  How about:

 const bool unsized_ok =
    (state->stage == MESA_SHADER_TESS_CTRL &&
     (var->data.mode == ir_var_shader_in ||
      var->data.mode == ir_var_shader_out)) ||
    (state->stage == MESA_SHADER_TESS_EVAL &&
     var->data.mode == ir_var_shader_in);
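
As a standalone illustration of the proposed predicate, a minimal sketch follows. The enums and the function name are hypothetical stand-ins rather than Mesa's real types, and geometry shader inputs are deliberately not covered here.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the shader stage and variable mode enums. */
enum stage_kind {
   STAGE_VERTEX,
   STAGE_TESS_CTRL,
   STAGE_TESS_EVAL,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
};

enum ir_mode {
   IR_MODE_TEMP,
   IR_MODE_UNIFORM,
   IR_MODE_SHADER_IN,
   IR_MODE_SHADER_OUT,
};

/*
 * True when an unsized array declaration should skip the "unsized arrays
 * aren't allowed" check: TCS inputs and outputs, and TES inputs.
 */
bool
unsized_array_ok(enum stage_kind stage, enum ir_mode mode)
{
   return (stage == STAGE_TESS_CTRL &&
           (mode == IR_MODE_SHADER_IN || mode == IR_MODE_SHADER_OUT)) ||
          (stage == STAGE_TESS_EVAL && mode == IR_MODE_SHADER_IN);
}
```

With this shape, a temporary or uniform declared in a tessellation control shader still hits the error path, which is the case the review comment was worried about.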

> > + (state->stage == MESA_SHADER_TESS_EVAL &&
> > +  var->data.mode == ir_var_shader_in);
> > +
> > + if (t->is_unsized_array() && !unsized_ok)
> >  /* Section 10.17 of the GLSL ES 1.00 specification states that
> >   * unsized array declarations have been removed from the 
> > language.
> >   * Arrays that are sized using an initializer are still 
> > explicitly
> > --
> > 2.9.3



Re: [Mesa-dev] [PATCH 2/3] glsl: Skip "unsized arrays aren't allowed" check for TCS/TES vars.

2016-09-15 Thread Ilia Mirkin

On Thu, Sep 15, 2016 at 4:40 PM, Kenneth Graunke  wrote:
> On Thursday, September 15, 2016 3:35:27 PM PDT Ilia Mirkin wrote:
>> On Thu, Sep 15, 2016 at 5:10 AM, Kenneth Graunke  
>> wrote:
>> > Fixes ESEXT-CTS.draw_elements_base_vertex_tests.AEP_shader_stages and
>> > ESEXT-CTS.texture_cube_map_array.texture_size_tesselation_con_sh.
>> >
>> > Signed-off-by: Kenneth Graunke 
>> > ---
>> >  src/compiler/glsl/ast_to_hir.cpp | 9 -
>> >  1 file changed, 8 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/src/compiler/glsl/ast_to_hir.cpp 
>> > b/src/compiler/glsl/ast_to_hir.cpp
>> > index 0a23195..90cc924 100644
>> > --- a/src/compiler/glsl/ast_to_hir.cpp
>> > +++ b/src/compiler/glsl/ast_to_hir.cpp
>> > @@ -5127,7 +5127,14 @@ ast_declarator_list::hir(exec_list *instructions,
>> >   const glsl_type *const t = (earlier == NULL)
>> >  ? var->type : earlier->type;
>> >
>> > - if (t->is_unsized_array())
>> > + /* GL_OES_tessellation_shader allows omitting the array size
>> > +  * for TCS inputs/outputs and TES inputs.  Ignore this check.
>> > +  */
>> > + bool unsized_ok = state->stage == MESA_SHADER_TESS_CTRL ||
>>
>> Are you sure that only inputs/outputs can make it in here? I couldn't
>> come to that conclusion from a quick scan of the code...
>
> Whoops.  No, other things can get here too.  How about:
>
>  const bool unsized_ok =
>     (state->stage == MESA_SHADER_TESS_CTRL &&
>      (var->data.mode == ir_var_shader_in ||
>       var->data.mode == ir_var_shader_out)) ||
>     (state->stage == MESA_SHADER_TESS_EVAL &&
>      var->data.mode == ir_var_shader_in);
>

Sounds good to me. Presumably also throw geometry shaders into that?

"""
All geometry shader input unsized array declarations will be sized by an
earlier input primitive layout qualifier, when present, as per the
following table.
"""

Or are those vars already sized by the time this code is hit? With that
worked out, this is

Reviewed-by: Ilia Mirkin 

Re: [Mesa-dev] [PATCH 09/10] st/vdpau: implement the new DMA-buf based interop

2016-09-15 Thread Marek Olšák

On Thu, Sep 15, 2016 at 5:14 AM, Dave Airlie  wrote:
> On 15 September 2016 at 13:03, Ilia Mirkin  wrote:
>> On Wed, Sep 14, 2016 at 10:15 PM, Michel Dänzer  wrote:
 No, the current impl is pretty radeon-specific (note - it doesn't work
 on nouveau, and no other drivers support the interfaces, so ... it's
 radeon-specific).
>>>
>>> We're getting into semantics here, but since the reason it doesn't work
>>> well with nouveau is a fundamental issue in nouveau (which should also
>>> affect at least DRI3 in general), while you may call it "de facto radeon
>>> specific" if you're so inclined, that doesn't make the implementation
>>> actually radeon specific.
>>
>> No one's reported any issues with DRI3, I use it on my home desktop
>> every day. And VDPAU used to work great until these changes to
>> st/vdpau went in. Prior to those changes in st/vdpau, saying that
>> "shared == gart" was a perfectly reasonable thing to say, since no one
>> tried blending/readback on those surfaces (or at least not enough for
>> it to matter). But now ... poof ... it doesn't work [actually, worse -
>> it works - but can't come close to keeping up with 24fps video].
>>
>> Anyways, I realize this is a losing argument. Interfaces and usages
>> move forward and change over time. This happens to be a change that
>> leaves nouveau behind. As a spare-time contributor, I can't keep up
>> with multiple full timers. I had hoped that there'd be some way to
>> make it all still work, but that doesn't seem to be the case.
>> Unfortunately end users are going to lose out on functionality as a
>> result.
>
> So (a) this is a regression, regressions aren't allowed, so it would
> be good to back the change out until it can be fixed.
>
> The problem is the combo of LINEAR and SHARED means that
> GART placement is most likely, radeon should be doing the same
> in most circumstances.
>
> We should possibly introduce SHARED_OTHER_GPU, maybe,
> and use that throughout the stack where it matters.

The main problem is that nouveau lacks proper memory management, and
buffers stay pinned forever after allocation.

The workaround is to add PIPE_BIND_something, which would do what you
need it to do, and use it where you need to use it. I don't care about
the name as long as it works for nouveau. Does that sound reasonable?

Marek

Re: [Mesa-dev] [PATCH] st/va: Make VAAPI_DISABLE_INTERLACE default true

2016-09-15 Thread Julien Isorce

On 15 September 2016 at 16:02, Leo Liu  wrote:

>
>
> On 09/15/2016 10:43 AM, Andy Furniss wrote:
>
>> Since bf901a2
>> st/va: also honors interlaced preference when providing a video format
>> existing scripts and most use cases will need true.
>>
>> Signed-off-by: Andy Furniss 
>> ---
>>  src/gallium/state_trackers/va/surface.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/state_trackers/va/surface.c
>> b/src/gallium/state_trackers/va/surface.c
>> index 00df69d..e73e17e 100644
>> --- a/src/gallium/state_trackers/va/surface.c
>> +++ b/src/gallium/state_trackers/va/surface.c
>> @@ -43,7 +43,7 @@
>>
>>  #include "va_private.h"
>>
>> -DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE",
>> FALSE);
>> +DEBUG_GET_ONCE_BOOL_OPTION(nointerlace, "VAAPI_DISABLE_INTERLACE",
>> TRUE);
>>
>
> As mentioned, it'll still override the preferred interlaced
> format when this env is not explicitly used.
>
> Not sure this will be okay with other cases. @Julien?
>

Hi,

So the problem is that with radeon, PIPE_VIDEO_CAP_SUPPORTS_INTERLACED
always returns false for encoding but can return true for decoding
(depending on the chipset and the codec). Then, when transcoding, everything
needs to be non-interlaced to avoid an extra copy/conversion or even a clash.
Is that correct?

Should debug_get_option_nointerlace() be moved to
radeon_video.c::rvid_get_video_param?

Other question:

    case PIPE_VIDEO_CAP_SUPPORTS_INTERLACED:
        if (rscreen->family < CHIP_PALM) {
            /* MPEG2 only with shaders and no support for
               interlacing on R6xx style UVD */
            return codec != PIPE_VIDEO_FORMAT_MPEG12 &&
                   rscreen->family > CHIP_RV770;
        } else {
            if (u_reduce_video_profile(profile) == PIPE_VIDEO_FORMAT_HEVC)
                return false; /* The firmware doesn't support interlaced HEVC. */
            return true;
        }
So if it always returned false instead, would it really work on
hardware for which the above code says true?
Because my understanding is that for nvidia hardware this is not a
preference but rather a requirement, but I might be wrong.

In any case, with current upstream code and VAAPI_DISABLE_INTERLACE=1 it
hits "assert(templat->interlaced);" in nouveau_vp3_video_buffer_create. If
I remove the assert, it crashes and can even stall the driver (I just wanted
to check).

Cheers
Julien

> Regards,
> Leo
>
>
>>  #include 
>>
>>
>
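
To make the effect of flipping the default concrete, here is a simplified sketch of a boolean environment option with a default. It is not Mesa's DEBUG_GET_ONCE_BOOL_OPTION implementation: both function names are hypothetical, and the set of strings treated as "false" is an assumption made for this example.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Interpret an option string.  A missing value (NULL) yields the default;
 * a handful of common "false" spellings yield false; anything else is
 * treated as true.  The accepted spellings are an assumption.
 */
bool
parse_bool_option(const char *value, bool dfault)
{
   if (value == NULL)
      return dfault;

   if (strcmp(value, "0") == 0 || strcmp(value, "n") == 0 ||
       strcmp(value, "no") == 0 || strcmp(value, "f") == 0 ||
       strcmp(value, "false") == 0)
      return false;

   return true;
}

/* With the patch applied the default is true, so unset means "disabled". */
bool
nointerlace_option(void)
{
   return parse_bool_option(getenv("VAAPI_DISABLE_INTERLACE"), true);
}
```

The point of the sketch is that the default only matters when the variable is unset; anyone who sets VAAPI_DISABLE_INTERLACE explicitly gets the same behaviour before and after the change.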

[Mesa-dev] [PATCH] st/mesa: only enable MSAA coverage options when we have a MSAA buffer

2016-09-15 Thread Brian Paul

Regardless of whether GL_MULTISAMPLE is enabled (it's enabled by default)
we should not set the alpha_to_coverage or alpha_to_one flags if the
current drawing buffer does not do MSAA.

This fixes the new piglit gl-1.3-alpha_to_coverage_nop test.
---
 src/mesa/state_tracker/st_atom_blend.c | 9 ++---
 src/mesa/state_tracker/st_context.c| 3 ++-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_blend.c b/src/mesa/state_tracker/st_atom_blend.c
index 65de67b..67e 100644
--- a/src/mesa/state_tracker/st_atom_blend.c
+++ b/src/mesa/state_tracker/st_atom_blend.c
@@ -265,9 +265,12 @@ update_blend( struct st_context *st )
 
blend->dither = ctx->Color.DitherFlag;
 
-   if (ctx->Multisample.Enabled) {
-  /* unlike in gallium/d3d10 these operations are only performed
- if msaa is enabled */
+   if (ctx->Multisample.Enabled &&
+   ctx->DrawBuffer &&
+   ctx->DrawBuffer->Visual.sampleBuffers > 0) {
+  /* Unlike in gallium/d3d10 these operations are only performed
+   * if both msaa is enabled and we have a multisample buffer.
+   */
   blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;
   blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne;
}
diff --git a/src/mesa/state_tracker/st_context.c 
b/src/mesa/state_tracker/st_context.c
index ddc11a4..81b3387 100644
--- a/src/mesa/state_tracker/st_context.c
+++ b/src/mesa/state_tracker/st_context.c
@@ -166,7 +166,8 @@ void st_invalidate_state(struct gl_context * ctx, GLbitfield new_state)
struct st_context *st = st_context(ctx);
 
if (new_state & _NEW_BUFFERS) {
-  st->dirty |= ST_NEW_DSA |
+  st->dirty |= ST_NEW_BLEND |
+   ST_NEW_DSA |
ST_NEW_FB_STATE |
ST_NEW_SAMPLE_MASK |
ST_NEW_SAMPLE_SHADING |
-- 
1.9.1
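
The condition added by the patch above can be sketched on its own as follows. The structs are cut-down, hypothetical stand-ins for the GL context and the gallium blend state, so this is an illustration of the rule and not the state tracker code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-ins for the real Mesa structures. */
struct draw_buffer {
   unsigned sample_buffers;      /* > 0 when the buffer is multisampled */
};

struct gl_state {
   bool multisample_enabled;     /* GL_MULTISAMPLE, on by default */
   bool sample_alpha_to_coverage;
   bool sample_alpha_to_one;
   const struct draw_buffer *draw_buffer;
};

struct blend_state {
   bool alpha_to_coverage;
   bool alpha_to_one;
};

/*
 * Only forward the alpha-to-coverage and alpha-to-one settings when
 * multisampling is enabled and the current draw buffer really is a
 * multisample buffer; otherwise leave both flags off.
 */
void
update_blend_msaa(const struct gl_state *ctx, struct blend_state *blend)
{
   blend->alpha_to_coverage = false;
   blend->alpha_to_one = false;

   if (ctx->multisample_enabled &&
       ctx->draw_buffer != NULL &&
       ctx->draw_buffer->sample_buffers > 0) {
      blend->alpha_to_coverage = ctx->sample_alpha_to_coverage;
      blend->alpha_to_one = ctx->sample_alpha_to_one;
   }
}
```

Because the result now depends on the draw buffer, the blend state has to be revalidated when the buffers change, which is why the patch also adds ST_NEW_BLEND to the _NEW_BUFFERS case.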


Re: [Mesa-dev] [PATCH] st/mesa: only enable MSAA coverage options when we have a MSAA buffer

2016-09-15 Thread Roland Scheidegger

Am 15.09.2016 um 23:20 schrieb Brian Paul:
> Regardless of whether GL_MULTISAMPLE is enabled (it's enabled by default)
> we should not set the alpha_to_coverage or alpha_to_one flags if the
> current drawing buffer does not do MSAA.
> 
> This fixes the new piglit gl-1.3-alpha_to_coverage_nop test.
> ---
>  src/mesa/state_tracker/st_atom_blend.c | 9 ++---
>  src/mesa/state_tracker/st_context.c| 3 ++-
>  2 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/src/mesa/state_tracker/st_atom_blend.c 
> b/src/mesa/state_tracker/st_atom_blend.c
> index 65de67b..67e 100644
> --- a/src/mesa/state_tracker/st_atom_blend.c
> +++ b/src/mesa/state_tracker/st_atom_blend.c
> @@ -265,9 +265,12 @@ update_blend( struct st_context *st )
>  
> blend->dither = ctx->Color.DitherFlag;
>  
> -   if (ctx->Multisample.Enabled) {
> -  /* unlike in gallium/d3d10 these operations are only performed
> - if msaa is enabled */
> +   if (ctx->Multisample.Enabled &&
> +   ctx->DrawBuffer &&
> +   ctx->DrawBuffer->Visual.sampleBuffers > 0) {
> +  /* Unlike in gallium/d3d10 these operations are only performed
> +   * if both msaa is enabled and we have a multisample buffer.
> +   */
>blend->alpha_to_coverage = ctx->Multisample.SampleAlphaToCoverage;
>blend->alpha_to_one = ctx->Multisample.SampleAlphaToOne;
> }
> diff --git a/src/mesa/state_tracker/st_context.c 
> b/src/mesa/state_tracker/st_context.c
> index ddc11a4..81b3387 100644
> --- a/src/mesa/state_tracker/st_context.c
> +++ b/src/mesa/state_tracker/st_context.c
> @@ -166,7 +166,8 @@ void st_invalidate_state(struct gl_context * ctx, 
> GLbitfield new_state)
> struct st_context *st = st_context(ctx);
>  
> if (new_state & _NEW_BUFFERS) {
> -  st->dirty |= ST_NEW_DSA |
> +  st->dirty |= ST_NEW_BLEND |
> +   ST_NEW_DSA |
> ST_NEW_FB_STATE |
> ST_NEW_SAMPLE_MASK |
> ST_NEW_SAMPLE_SHADING |
> 

Looks reasonable to me.

Reviewed-by: Roland Scheidegger 

[Mesa-dev] [PATCH] st/mesa: update comment in st_atom_msaa.c

2016-09-15 Thread Brian Paul

The old comment was a copy and paste mistake.  Indent another comment.
---
 src/mesa/state_tracker/st_atom_msaa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/state_tracker/st_atom_msaa.c b/src/mesa/state_tracker/st_atom_msaa.c
index 8442a28..69aea69 100644
--- a/src/mesa/state_tracker/st_atom_msaa.c
+++ b/src/mesa/state_tracker/st_atom_msaa.c
@@ -36,7 +36,7 @@
 #include "util/u_framebuffer.h"
 
 
-/* Second state atom for user clip planes:
+/* Update the sample mask for MSAA.
  */
 static void update_sample_mask( struct st_context *st )
 {
@@ -46,7 +46,7 @@ static void update_sample_mask( struct st_context *st )
unsigned sample_count = util_framebuffer_get_num_samples(framebuffer);
 
if (st->ctx->Multisample.Enabled && sample_count > 1) {
-   /* unlike in gallium/d3d10 the mask is only active if msaa is enabled */
+  /* unlike in gallium/d3d10 the mask is only active if msaa is enabled */
   if (st->ctx->Multisample.SampleCoverage) {
  unsigned nr_bits;
  nr_bits = (unsigned)
-- 
1.9.1


[Mesa-dev] [PATCH] gallium/docs: document alpha_to_coverage and alpha_to_one blend state

2016-09-15 Thread Brian Paul

The gallium interface defines these like DX10.  Note that OpenGL ignores
these options if MSAA is disabled or the dest buffer doesn't support
MSAA.
---
 src/gallium/docs/source/cso/blend.rst | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/gallium/docs/source/cso/blend.rst b/src/gallium/docs/source/cso/blend.rst
index dce999c..7316e5c 100644
--- a/src/gallium/docs/source/cso/blend.rst
+++ b/src/gallium/docs/source/cso/blend.rst
@@ -88,6 +88,18 @@ independent_blend_enable
the first member of the rt array contains valid data.
 rt
Contains the per-rendertarget blend state.
+alpha_to_coverage
+   If enabled, the fragment's alpha value is used to override the fragment's
+   coverage mask.  The coverage mask will be all zeros if the alpha value is
+   zero.  The coverage mask will be all ones if the alpha value is one.
+   Otherwise, the number of bits set in the coverage mask will be proportional
+   to the alpha value.  Note that this step happens regardless of whether
+   multisample is enabled or the destination buffer is multisampled.
+alpha_to_one
+   If enabled, the fragment's alpha value will be set to one.  As with
+   alpha_to_coverage, this step happens regardless of whether multisample
+   is enabled or the destination buffer is multisampled.
+
 
 Per-rendertarget Members
 
-- 
1.9.1
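
One possible reading of the alpha_to_coverage text above, as a minimal sketch. The documentation only says the number of set bits is proportional to alpha; the rounding rule, the choice of the lowest bits, and the function name are assumptions made for this example, and real hardware may dither or pick different samples.

```c
/*
 * Build a coverage mask with a number of set bits proportional to alpha.
 * alpha is clamped to [0, 1] (NaN is treated as 0) and num_samples is
 * capped at 32 so the mask fits in an unsigned 32-bit value.
 */
unsigned
alpha_to_coverage_mask(float alpha, unsigned num_samples)
{
   if (num_samples > 32)
      num_samples = 32;
   if (!(alpha > 0.0f))
      alpha = 0.0f;
   if (alpha > 1.0f)
      alpha = 1.0f;

   /* Round to the nearest whole number of covered samples. */
   unsigned bits = (unsigned)(alpha * (float)num_samples + 0.5f);

   if (bits >= 32)
      return 0xffffffffu;

   return (1u << bits) - 1u;
}
```

An alpha of zero gives an all-zero mask and an alpha of one gives all ones for the given sample count, matching the two fixed points the documentation spells out.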


Re: [Mesa-dev] [PATCH] gallium/docs: document alpha_to_coverage and alpha_to_one blend state

2016-09-15 Thread Ilia Mirkin

What about integer RTs? I had to add a hack in nouveau to make it
disable those when RT0 is an integer. It'd be more convenient if they
were turned off in the first place.

On Thu, Sep 15, 2016 at 5:34 PM, Brian Paul  wrote:
> The gallium interface defines these like DX10.  Note that OpenGL ignores
> these options if MSAA is disabled or the dest buffer doesn't support
> MSAA.
> ---
>  src/gallium/docs/source/cso/blend.rst | 12 
>  1 file changed, 12 insertions(+)
>
> diff --git a/src/gallium/docs/source/cso/blend.rst 
> b/src/gallium/docs/source/cso/blend.rst
> index dce999c..7316e5c 100644
> --- a/src/gallium/docs/source/cso/blend.rst
> +++ b/src/gallium/docs/source/cso/blend.rst
> @@ -88,6 +88,18 @@ independent_blend_enable
> the first member of the rt array contains valid data.
>  rt
> Contains the per-rendertarget blend state.
> +alpha_to_coverage
> +   If enabled, the fragment's alpha value is used to override the fragment's
> +   coverage mask.  The coverage mask will be all zeros if the alpha value is
> +   zero.  The coverage mask will be all ones if the alpha value is one.
> +   Otherwise, the number of bits set in the coverage mask will be 
> proportional
> +   to the alpha value.  Note that this step happens regardless of whether
> +   multisample is enabled or the destination buffer is multisampled.
> +alpha_to_one
> +   If enabled, the fragment's alpha value will be set to one.  As with
> +   alpha_to_coverage, this step happens regardless of whether multisample
> +   is enabled or the destination buffer is multisampled.
> +
>
>  Per-rendertarget Members
>  
> --
> 1.9.1
>

Re: [Mesa-dev] [PATCH] st/mesa: update comment in st_atom_msaa.c

2016-09-15 Thread Roland Scheidegger

Am 15.09.2016 um 23:32 schrieb Brian Paul:
> The old comment was a copy and paste mistake.  Indent another comment.
> ---
>  src/mesa/state_tracker/st_atom_msaa.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/state_tracker/st_atom_msaa.c 
> b/src/mesa/state_tracker/st_atom_msaa.c
> index 8442a28..69aea69 100644
> --- a/src/mesa/state_tracker/st_atom_msaa.c
> +++ b/src/mesa/state_tracker/st_atom_msaa.c
> @@ -36,7 +36,7 @@
>  #include "util/u_framebuffer.h"
>  
>  
> -/* Second state atom for user clip planes:
> +/* Update the sample mask for MSAA.
>   */
>  static void update_sample_mask( struct st_context *st )
>  {
> @@ -46,7 +46,7 @@ static void update_sample_mask( struct st_context *st )
> unsigned sample_count = util_framebuffer_get_num_samples(framebuffer);
>  
> if (st->ctx->Multisample.Enabled && sample_count > 1) {
> -   /* unlike in gallium/d3d10 the mask is only active if msaa is enabled */
> +  /* unlike in gallium/d3d10 the mask is only active if msaa is enabled 
> */
>if (st->ctx->Multisample.SampleCoverage) {
>   unsigned nr_bits;
>   nr_bits = (unsigned)
> 

Reviewed-by: Roland Scheidegger 


[Mesa-dev] [PATCH 08/15] glsl/standalone: Enable GLSL 4.00 through 4.50

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/standalone.cpp | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index a7e6254..6b1c2ce 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -153,6 +153,12 @@ initialize_context(struct gl_context *ctx, gl_api api)
   break;
case 150:
case 330:
+   case 400:
+   case 410:
+   case 420:
+   case 430:
+   case 440:
+   case 450:
   ctx->Const.MaxClipPlanes = 8;
   ctx->Const.MaxDrawBuffers = 8;
   ctx->Const.MinProgramTexelOffset = -8;
@@ -324,6 +330,12 @@ standalone_compile_shader(const struct standalone_options *_options,
case 140:
case 150:
case 330:
+   case 400:
+   case 410:
+   case 420:
+   case 430:
+   case 440:
+   case 450:
   glsl_es = false;
   break;
default:
-- 
2.5.5


[Mesa-dev] [PATCH 01/15] glsl: Fix cut-and-paste bug in hierarchical visitor ir_expression::accept

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

At this point in the code, s must be visit_continue.  If the child
returned visit_stop, visit_stop is the only correct thing to return.

Found by inspection.

Signed-off-by: Ian Romanick 
Cc: mesa-sta...@lists.freedesktop.org
---
 src/compiler/glsl/ir_hv_accept.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/compiler/glsl/ir_hv_accept.cpp b/src/compiler/glsl/ir_hv_accept.cpp
index 213992a..5cc6a34 100644
--- a/src/compiler/glsl/ir_hv_accept.cpp
+++ b/src/compiler/glsl/ir_hv_accept.cpp
@@ -147,7 +147,7 @@ ir_expression::accept(ir_hierarchical_visitor *v)
 goto done;
 
   case visit_stop:
-return s;
+return visit_stop;
   }
}
 
-- 
2.5.5
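
For readers without the tree at hand, the status propagation this patch
fixes can be sketched on a toy visitor.  Everything below (toy_status,
toy_node, toy_visitor, toy_accept, stop_at_visitor) is a hypothetical
stand-in, not Mesa's ir_hierarchical_visitor; it only illustrates why the
child's visit_stop has to be returned explicitly once s is known to be
"continue".

```cpp
#include <cassert>
#include <vector>

/* Toy stand-ins for ir_visitor_status and the hierarchical visitor. */
enum toy_status { toy_continue, toy_continue_with_parent, toy_stop };

struct toy_node {
   int id;
   std::vector<toy_node *> operands;
};

struct toy_visitor {
   virtual ~toy_visitor() {}
   virtual toy_status visit_enter(toy_node *) { return toy_continue; }
   virtual toy_status visit_leave(toy_node *) { return toy_continue; }
};

toy_status toy_accept(toy_node *n, toy_visitor *v)
{
   toy_status s = v->visit_enter(n);
   if (s != toy_continue)
      return (s == toy_continue_with_parent) ? toy_continue : s;

   /* From here on, s is known to be toy_continue. */
   for (toy_node *op : n->operands) {
      switch (toy_accept(op, v)) {
      case toy_continue:
         break;
      case toy_continue_with_parent:
         goto done;
      case toy_stop:
         /* Returning s here would report toy_continue to the caller. */
         return toy_stop;
      }
   }

done:
   return v->visit_leave(n);
}

/* Visitor that records every node it enters and stops at a chosen id. */
struct stop_at_visitor : toy_visitor {
   int stop_id;
   std::vector<int> entered;

   explicit stop_at_visitor(int id) : stop_id(id) {}

   toy_status visit_enter(toy_node *n) override
   {
      entered.push_back(n->id);
      return n->id == stop_id ? toy_stop : toy_continue;
   }
};
```

With a root that has two operands and a visitor that stops at the first
one, toy_accept() on the root reports toy_stop and the second operand is
never entered.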


[Mesa-dev] [PATCH 10/15] glsl/standalone: Optimize dead variable declarations

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

We didn't bother with this in the regular compiler because it doesn't
change the generated code.  In the stand-alone compiler, this can
clutter the output with useless variables.  It's especially bad after
functions are inlined but the foo_retval declarations remain.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/standalone.cpp | 63 
 1 file changed, 63 insertions(+)

diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index c4b6854..f7e1055 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -37,6 +37,7 @@
 #include "standalone_scaffolding.h"
 #include "standalone.h"
 #include "util/string_to_uint_map.h"
+#include "util/set.h"
 
 class add_neg_to_sub_visitor : public ir_hierarchical_visitor {
 public:
@@ -69,6 +70,64 @@ public:
}
 };
 
+class dead_variable_visitor : public ir_hierarchical_visitor {
+public:
+   dead_variable_visitor()
+   {
+  variables = _mesa_set_create(NULL,
+   _mesa_hash_pointer,
+   _mesa_key_pointer_equal);
+   }
+
+   virtual ~dead_variable_visitor()
+   {
+  _mesa_set_destroy(variables, NULL);
+   }
+
+   virtual ir_visitor_status visit(ir_variable *ir)
+   {
+  /* If the variable is auto or temp, add it to the set of variables that
+   * are candidates for removal.
+   */
+  if (ir->data.mode != ir_var_auto && ir->data.mode != ir_var_temporary)
+ return visit_continue;
+
+  _mesa_set_add(variables, ir);
+
+  return visit_continue;
+   }
+
+   virtual ir_visitor_status visit(ir_dereference_variable *ir)
+   {
+  struct set_entry *entry = _mesa_set_search(variables, ir->var);
+
+  /* If a variable is dereferenced at all, remove it from the set of
+   * variables that are candidates for removal.
+   */
+  if (entry != NULL)
+ _mesa_set_remove(variables, entry);
+
+  return visit_continue;
+   }
+
+   void remove_dead_variables()
+   {
+  struct set_entry *entry;
+
+  for (entry = _mesa_set_next_entry(variables, NULL);
+   entry != NULL;
+   entry = _mesa_set_next_entry(variables, entry)) {
+ ir_variable *ir = (ir_variable *) entry->key;
+
+ assert(ir->ir_type == ir_type_variable);
+ ir->remove();
+  }
+   }
+
+private:
+   set *variables;
+};
+
 static const struct standalone_options *options;
 
 static void
@@ -471,6 +530,10 @@ standalone_compile_shader(const struct standalone_options *_options,
  add_neg_to_sub_visitor v;
  visit_list_elements(&v, shader->ir);
 
+ dead_variable_visitor dv;
+ visit_list_elements(&dv, shader->ir);
+ dv.remove_dead_variables();
+
  shader->Program = rzalloc(shader, gl_program);
  init_gl_program(shader->Program, shader->Stage);
   }
-- 
2.5.5
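
The two-step scheme in dead_variable_visitor (collect candidates on
declaration, drop them on dereference, then remove what is left) can be
sketched without Mesa's set or IR types.  The toy_instr stream and
remove_dead_declarations() below are hypothetical names used only for
illustration, and the sketch assumes, as the visitor appears to, that a
declaration is seen before any dereference of the same variable.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

/* Toy instruction stream: each entry declares or dereferences a variable.
 * These types are hypothetical, not Mesa's ir_variable and friends. */
enum toy_kind { toy_declare, toy_deref };

struct toy_instr {
   toy_kind kind;
   std::string var;
};

/* Pass 1 mirrors the two visit() methods: declarations add a candidate,
 * dereferences remove it.  Pass 2 mirrors remove_dead_variables(). */
std::vector<toy_instr>
remove_dead_declarations(const std::vector<toy_instr> &ir)
{
   std::unordered_set<std::string> candidates;

   for (const toy_instr &i : ir) {
      if (i.kind == toy_declare)
         candidates.insert(i.var);
      else
         candidates.erase(i.var);
   }

   std::vector<toy_instr> out;
   for (const toy_instr &i : ir) {
      /* Never dereferenced: drop the declaration. */
      if (i.kind == toy_declare && candidates.count(i.var) != 0)
         continue;

      out.push_back(i);
   }

   return out;
}
```

For a stream that declares "a" and "foo_retval" but only dereferences "a",
the declaration of "foo_retval" is the only entry removed.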


[Mesa-dev] [PATCH 12/15] glsl: Generate strings that are the enum names without the ir_*op_ prefix

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

For many expressions, this is different from the printable name.  The
printable name for ir_binop_add is "+", but we want "add".  This is
needed for ir_builder_print_visitor.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir.h   | 1 +
 src/compiler/glsl/ir_expression_operation.py | 6 ++
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
index a3b1a50..b754ab7 100644
--- a/src/compiler/glsl/ir.h
+++ b/src/compiler/glsl/ir.h
@@ -1364,6 +1364,7 @@ public:
 #include "ir_expression_operation.h"
 
 extern const char *const ir_expression_operation_strings[ir_last_opcode + 1];
+extern const char *const ir_expression_operation_enum_strings[ir_last_opcode + 1];
 
 class ir_expression : public ir_rvalue {
 public:
diff --git a/src/compiler/glsl/ir_expression_operation.py b/src/compiler/glsl/ir_expression_operation.py
index 9aa08d3..58a585b 100644
--- a/src/compiler/glsl/ir_expression_operation.py
+++ b/src/compiler/glsl/ir_expression_operation.py
@@ -707,6 +707,12 @@ const char *const ir_expression_operation_strings[] = {
 % for item in values:
"${item.printable_name}",
 % endfor
+};
+
+const char *const ir_expression_operation_enum_strings[] = {
+% for item in values:
+   "${item.name}",
+% endfor
 };""")
 
constant_template = mako.template.Template("""\
-- 
2.5.5
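
As a rough picture of what the second generated table provides, here is a
hand-written miniature with three opcodes.  The toy_* names are
hypothetical; the real tables are generated by ir_expression_operation.py
and cover every opcode.

```cpp
#include <cassert>
#include <cstring>

/* Hypothetical three-opcode enum standing in for ir_expression_operation. */
enum toy_operation {
   toy_binop_add,
   toy_binop_sub,
   toy_unop_neg,
   toy_last_opcode = toy_unop_neg
};

/* Printable names, as used when dumping IR. */
static const char *const toy_operation_strings[toy_last_opcode + 1] = {
   "+",
   "-",
   "neg",
};

/* Enum names with the toy_*op_ prefix removed, usable as identifiers. */
static const char *const toy_operation_enum_strings[toy_last_opcode + 1] = {
   "add",
   "sub",
   "neg",
};
```

Both tables are indexed by the opcode, so a code generator can pick the
identifier-safe spelling where the printable one would not be valid C++.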


[Mesa-dev] [PATCH 15/15] glsl/standalone: Add the ability to generate ir_builder code

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/main.cpp   |  1 +
 src/compiler/glsl/standalone.cpp | 12 
 src/compiler/glsl/standalone.h   |  1 +
 3 files changed, 14 insertions(+)

diff --git a/src/compiler/glsl/main.cpp b/src/compiler/glsl/main.cpp
index 1e5e0fe..e0d3ab7 100644
--- a/src/compiler/glsl/main.cpp
+++ b/src/compiler/glsl/main.cpp
@@ -42,6 +42,7 @@ const struct option compiler_opts[] = {
{ "dump-ast", no_argument, &options.dump_ast, 1 },
{ "dump-hir", no_argument, &options.dump_hir, 1 },
{ "dump-lir", no_argument, &options.dump_lir, 1 },
+   { "dump-builder", no_argument, &options.dump_builder, 1 },
{ "link", no_argument, &options.do_link,  1 },
{ "just-log", no_argument, &options.just_log, 1 },
{ "version",  required_argument, NULL, 'v' },
diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index 1b2a575..4f80764 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -40,6 +40,7 @@
 #include "util/set.h"
 #include "linker.h"
 #include "glsl_parser_extras.h"
+#include "ir_builder_print_visitor.h"
 
 class add_neg_to_sub_visitor : public ir_hierarchical_visitor {
 public:
@@ -563,6 +564,17 @@ standalone_compile_shader(const struct standalone_options *_options,
  shader->Program = rzalloc(shader, gl_program);
  init_gl_program(shader->Program, shader->Stage);
   }
+
+  if (options->dump_builder) {
+ for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+struct gl_linked_shader *shader = whole_program->_LinkedShaders[i];
+
+if (!shader)
+   continue;
+
+_mesa_print_builder_for_ir(stdout, shader->ir);
+ }
+  }
}
 
return whole_program;
diff --git a/src/compiler/glsl/standalone.h b/src/compiler/glsl/standalone.h
index 648cedb..5029e16 100644
--- a/src/compiler/glsl/standalone.h
+++ b/src/compiler/glsl/standalone.h
@@ -33,6 +33,7 @@ struct standalone_options {
int dump_ast;
int dump_hir;
int dump_lir;
+   int dump_builder;
int do_link;
int just_log;
 };
-- 
2.5.5


[Mesa-dev] [PATCH 02/15] glsl: Update function parameter documentation for do_common_optimization

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

max_unroll_iterations was moved into options a long, long time ago.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/glsl_parser_extras.cpp | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index 0e9bfa7..c1e958a 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -2010,10 +2010,11 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
  *of unused uniforms from being removed.
  *The setting of this flag only matters if
  *\c linked is \c true.
- * \param max_unroll_iterations   Maximum number of loop iterations to be
- *unrolled.  Setting to 0 disables loop
- *unrolling.
  * \param options The driver's preferred shader options.
+ * \param native_integers Selects optimizations that depend on the
+ *implementations supporting integers
+ *natively (as opposed to supporting
+ *integers in floating point registers).
  */
 bool
 do_common_optimization(exec_list *ir, bool linked,
-- 
2.5.5


[Mesa-dev] [PATCH 05/15] glsl: Do some "post link" optimizations on just built-in functions

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

This enables the support added by the previous two patches in
do_common_optimization.  This will be used by the stand-alone compiler.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/glsl_parser_extras.cpp   | 12 
 src/compiler/glsl/ir_optimization.h|  3 ++-
 src/compiler/glsl/linker.cpp   |  3 ++-
 src/compiler/glsl/test_optpass.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_link.cpp |  3 ++-
 src/mesa/main/ff_fragment_shader.cpp   |  2 +-
 src/mesa/program/ir_to_mesa.cpp|  3 ++-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  2 +-
 8 files changed, 19 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index 807bf3f..e2e5cb7 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -1921,7 +1921,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader,
* and reduce later work if the same shader is linked multiple times
*/
   while (do_common_optimization(shader->ir, false, false, options,
-ctx->Const.NativeIntegers))
+ctx->Const.NativeIntegers, false))
  ;
 
   validate_ir_tree(shader->ir);
@@ -2020,7 +2020,8 @@ bool
 do_common_optimization(exec_list *ir, bool linked,
   bool uniform_locations_assigned,
const struct gl_shader_compiler_options *options,
-   bool native_integers)
+   bool native_integers,
+   bool process_builtin_functions)
 {
const bool debug = false;
GLboolean progress = GL_FALSE;
@@ -2041,9 +2042,12 @@ do_common_optimization(exec_list *ir, bool linked,
 
OPT(lower_instructions, ir, SUB_TO_ADD_NEG);
 
+   if (linked || process_builtin_functions) {
+  /* If the shader is not linked, only inline built-in functions. */
+  OPT(do_function_inlining, ir, !linked);
+  OPT(do_dead_functions, ir, !linked);
+   }
if (linked) {
-  OPT(do_function_inlining, ir, false);
-  OPT(do_dead_functions, ir, false);
   OPT(do_structure_splitting, ir);
}
propagate_invariance(ir);
diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 20c17e3..bb97c6e 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -78,7 +78,8 @@ enum lower_packing_builtins_op {
 bool do_common_optimization(exec_list *ir, bool linked,
bool uniform_locations_assigned,
 const struct gl_shader_compiler_options *options,
-bool native_integers);
+bool native_integers,
+bool process_builtin_functions);
 
 bool ir_constant_fold(ir_rvalue **rvalue);
 
diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index f3eece2..2699c1e 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -5031,7 +5031,8 @@ link_shaders(struct gl_context *ctx, struct gl_shader_program *prog)
 
   while (do_common_optimization(prog->_LinkedShaders[i]->ir, true, false,
 &ctx->Const.ShaderCompilerOptions[i],
-ctx->Const.NativeIntegers))
+ctx->Const.NativeIntegers,
+false))
 ;
 
   lower_const_arrays_to_uniforms(prog->_LinkedShaders[i]->ir);
diff --git a/src/compiler/glsl/test_optpass.cpp b/src/compiler/glsl/test_optpass.cpp
index 3659c8b..baa1ee5 100644
--- a/src/compiler/glsl/test_optpass.cpp
+++ b/src/compiler/glsl/test_optpass.cpp
@@ -63,7 +63,7 @@ do_optimization(struct exec_list *ir, const char *optimization,
int int_4;
 
if (sscanf(optimization, "do_common_optimization ( %d ) ", &int_0) == 1) {
-  return do_common_optimization(ir, int_0 != 0, false, options, true);
+  return do_common_optimization(ir, int_0 != 0, false, options, true, false);
} else if (strcmp(optimization, "do_algebraic") == 0) {
   return do_algebraic(ir, true, options);
} else if (strcmp(optimization, "do_constant_folding") == 0) {
diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 2b1fa61..d8f14ef 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -171,7 +171,8 @@ process_glsl_ir(gl_shader_stage stage,
 ) || progress;
 
   progress = do_common_optimization(shader->ir, true, true,
-options, ctx->Const.NativeIntegers) || progress;
+options, ctx->Const.NativeIntegers,
+false) || progress;
} while (progress);
 
validate_ir

[Mesa-dev] [PATCH 09/15] glsl/standalone: Optimize add-of-neg to subtract

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

This just makes the output of the standalone compiler a little more
compact.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/standalone.cpp | 34 ++
 1 file changed, 34 insertions(+)

diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index 6b1c2ce..c4b6854 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -38,6 +38,37 @@
 #include "standalone.h"
 #include "util/string_to_uint_map.h"
 
+class add_neg_to_sub_visitor : public ir_hierarchical_visitor {
+public:
+   add_neg_to_sub_visitor()
+   {
+  /* empty */
+   }
+
+   ir_visitor_status visit_leave(ir_expression *ir)
+   {
+  if (ir->operation != ir_binop_add)
+ return visit_continue;
+
+  for (unsigned i = 0; i < 2; i++) {
+ ir_expression *const op = ir->operands[i]->as_expression();
+
+ if (op != NULL && op->operation == ir_unop_neg) {
+ir->operation = ir_binop_sub;
+
+/* This ensures that -a + b becomes b - a. */
+if (i == 0)
+   ir->operands[0] = ir->operands[1];
+
+ir->operands[1] = op->operands[0];
+break;
+ }
+  }
+
+  return visit_continue;
+   }
+};
+
 static const struct standalone_options *options;
 
 static void
@@ -437,6 +468,9 @@ standalone_compile_shader(const struct standalone_options *_options,
  if (!shader)
 continue;
 
+ add_neg_to_sub_visitor v;
+ visit_list_elements(&v, shader->ir);
+
  shader->Program = rzalloc(shader, gl_program);
  init_gl_program(shader->Program, shader->Stage);
   }
-- 
2.5.5
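
A self-contained sketch of the add-of-neg rewrite, on a hypothetical toy
expression type rather than ir_expression.  It covers both operand orders
described in the patch comment: a + (-b) becomes a - b, and (-a) + b
becomes b - a.  The toy always stores the un-negated value in the
right-hand slot, which is what makes the second case come out as b - a.

```cpp
#include <cassert>

/* Hypothetical miniature of an expression tree; not Mesa's ir_expression. */
enum toy_op { toy_leaf, toy_neg, toy_add, toy_sub };

struct toy_expr {
   toy_op op;
   char name;              /* only meaningful for toy_leaf */
   toy_expr *operands[2];  /* unused slots are null */
};

/* Rewrite a + (-b) to a - b and (-a) + b to b - a, in place. */
void add_neg_to_sub(toy_expr *e)
{
   if (e->op != toy_add)
      return;

   for (unsigned i = 0; i < 2; i++) {
      toy_expr *const operand = e->operands[i];

      if (operand != nullptr && operand->op == toy_neg) {
         e->op = toy_sub;

         /* For (-a) + b, move b into the left slot first. */
         if (i == 0)
            e->operands[0] = e->operands[1];

         /* The un-negated value is always the right-hand side. */
         e->operands[1] = operand->operands[0];
         break;
      }
   }
}
```

Only the first negated operand is folded, so (-a) + (-b) turns into
(-b) - a in this sketch and the remaining negate is left alone.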


[Mesa-dev] [PATCH 00/15] Compile GLSL to ir_builder

2016-09-15 Thread Ian Romanick

This series makes the stand-alone GLSL compiler useful for something.
It adds an option to generate C++ code that uses ir_builder to recreate
the compiled GLSL shader.  I intend to use this for various lowering
code for GL_ARB_gpu_shader_int64 and GL_ARB_gpu_shader_fp64 (on
platforms that don't actually have double precision hardware).  See also
Elie Tournier's "soft double" GSoC project.

As an example, I present some GLSL code that does 64-bit integer
division:

https://people.freedesktop.org/~idr/udivmod64.glsl

and the C++ code generated:

https://people.freedesktop.org/~idr/udivmod64.cpp

This is a little bit of a fib... the code in this series lacks a very
small amount of 64-bit integer support, and the other necessary bits in
Mesa are not yet in master.  A tree with all of that will be available
soon.

The generated code is only ~200 lines compared to ~50 lines of GLSL.
However, I can make changes to the GLSL much more easily than I can to the
big pile of ir_builder code.  Writing ir_builder by hand is a bit like
coding in assembly.

I believe this same technique could be adapted to generate NIR builder
too.  Then we could use the same GLSL source to lower things in NIR that
originated from SPIR-V binaries.

The ideal work flow would be to generate the C++ code while Mesa is
building, like we do with the various lexers and parsers.  However, this
presents the usual compiler bootstrap problems that we have managed to
avoid all these years.  It also adds problems for cross-compiling Mesa.

I don't expect the generated code to change very often... maybe the best
work flow that will actually work is to generate the C++ files by hand
and commit both the C++ and the GLSL to the tree.  That seems pretty
awful, but I'm not sure what we can do that's better.

One other change I've thought about making... the C++ can include an
embedded version of the GLSL, possibly as a comment.  Then we could
detect when the C++ and GLSL didn't match.
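
One possible shape for that check, purely as a sketch: the generator
prepends the GLSL source inside a marked comment, and a small helper
compares the embedded text against the .glsl file.  The marker strings and
both function names are hypothetical, nothing in this series defines them,
and the sketch assumes the GLSL itself contains no "*/" sequence.

```cpp
#include <cassert>
#include <string>

/* Hypothetical markers; nothing in this series defines them. */
static const std::string begin_marker = "/* GLSL-SOURCE-BEGIN\n";
static const std::string end_marker = "GLSL-SOURCE-END */\n";

/* Prepend the GLSL text, wrapped in a comment, to the generated C++. */
std::string embed_glsl(const std::string &glsl, const std::string &cpp)
{
   return begin_marker + glsl + end_marker + cpp;
}

/* Return true if the GLSL embedded in `generated` matches `glsl`. */
bool embedded_glsl_matches(const std::string &generated,
                           const std::string &glsl)
{
   if (generated.compare(0, begin_marker.size(), begin_marker) != 0)
      return false;

   const std::string::size_type start = begin_marker.size();
   const std::string::size_type end = generated.find(end_marker, start);
   if (end == std::string::npos)
      return false;

   /* Compare the text between the markers with the current GLSL. */
   return generated.compare(start, end - start, glsl) == 0;
}
```

Comparing the full text keeps the check trivial; a hash of the GLSL would
work as well and would keep the generated file shorter.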

This does feel a little like "everything old is new again."  We used to
do something similar very early in the compiler development.  We had a
special inline "assembly" mode, and we'd embed the GLSL for built-in
functions in the Mesa binary.  At context creation, we'd bootstrap the
compiler by compiling the built-in functions.

This series is available at:

https://cgit.freedesktop.org/~idr/mesa/log/?h=standalone-ir_builder


[Mesa-dev] [PATCH 11/15] glsl/standalone: Enable par-linking

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

If the user did not request full linking, link the shader with the
built-in functions, inline them, and eliminate them.  Before this,
you'd see all these calls to "dot" and "max" in the output.  This
prevented a lot of expected optimizations and cluttered the output.
With this change the output has some chance of being useful.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/standalone.cpp | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index f7e1055..1b2a575 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -38,6 +38,8 @@
 #include "standalone.h"
 #include "util/string_to_uint_map.h"
 #include "util/set.h"
+#include "linker.h"
+#include "glsl_parser_extras.h"
 
 class add_neg_to_sub_visitor : public ir_hierarchical_visitor {
 public:
@@ -506,10 +508,34 @@ standalone_compile_shader(const struct standalone_options *_options,
   }
}
 
-   if ((status == EXIT_SUCCESS) && options->do_link)  {
+   if (status == EXIT_SUCCESS) {
   _mesa_clear_shader_program_data(whole_program);
 
-  link_shaders(ctx, whole_program);
+  if (options->do_link)  {
+ link_shaders(ctx, whole_program);
+  } else {
+ struct gl_shader *const shader = whole_program->Shaders[0];
+
+ whole_program->LinkStatus = GL_TRUE;
+ whole_program->_LinkedShaders[shader->Stage] =
+link_intrastage_shaders(whole_program /* mem_ctx */,
+ctx,
+whole_program,
+whole_program->Shaders,
+1,
+true);
+
+ struct gl_shader_compiler_options *const compiler_options =
+&ctx->Const.ShaderCompilerOptions[shader->Stage];
+
+ do_common_optimization(whole_program->_LinkedShaders[shader->Stage]->ir,
+false,
+false,
+compiler_options,
+true,
+true);
+  }
+
   status = (whole_program->LinkStatus) ? EXIT_SUCCESS : EXIT_FAILURE;
 
   if (strlen(whole_program->InfoLog) > 0) {
-- 
2.5.5


[Mesa-dev] [PATCH 07/15] glsl/standalone: Use API_OPENGL_CORE if the GLSL version is >= 1.40

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

Otherwise extensions to 1.40 that are only for core profile won't work.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/standalone.cpp | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/compiler/glsl/standalone.cpp b/src/compiler/glsl/standalone.cpp
index d6e6829..a7e6254 100644
--- a/src/compiler/glsl/standalone.cpp
+++ b/src/compiler/glsl/standalone.cpp
@@ -331,7 +331,11 @@ standalone_compile_shader(const struct standalone_options *_options,
   return NULL;
}
 
-   initialize_context(ctx, (glsl_es) ? API_OPENGLES2 : API_OPENGL_COMPAT);
+   if (glsl_es) {
+  initialize_context(ctx, API_OPENGLES2);
+   } else {
+  initialize_context(ctx, options->glsl_version > 130 ? API_OPENGL_CORE : API_OPENGL_COMPAT);
+   }
 
struct gl_shader_program *whole_program;
 
-- 
2.5.5
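
The selection rule in this patch, pulled out into a standalone function
for illustration.  The toy_api enum and choose_api() are hypothetical
stand-ins for Mesa's gl_api values and the inline logic above; the
threshold (> 130, i.e. GLSL 1.40 and later) is taken from the patch.

```cpp
#include <cassert>

/* Hypothetical stand-in for the gl_api values used in this patch. */
enum toy_api { TOY_API_OPENGL_COMPAT, TOY_API_OPENGLES2, TOY_API_OPENGL_CORE };

/* ES shaders get the ES2 API; desktop GLSL after 1.30 gets core profile. */
toy_api choose_api(bool glsl_es, unsigned glsl_version)
{
   if (glsl_es)
      return TOY_API_OPENGLES2;

   return glsl_version > 130 ? TOY_API_OPENGL_CORE : TOY_API_OPENGL_COMPAT;
}
```

So version 130 still selects the compatibility profile, 140 and up select
core, and any ES version selects ES2 regardless of the number.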


[Mesa-dev] [PATCH 04/15] glsl: Modify dead function removal to only operate on built-in functions

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

This will be used in the stand-alone compiler.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/glsl_parser_extras.cpp |  2 +-
 src/compiler/glsl/ir_optimization.h  |  2 +-
 src/compiler/glsl/opt_dead_functions.cpp | 11 +++
 src/compiler/glsl/test_optpass.cpp   |  2 +-
 4 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index 29eba13..807bf3f 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -2043,7 +2043,7 @@ do_common_optimization(exec_list *ir, bool linked,
 
if (linked) {
   OPT(do_function_inlining, ir, false);
-  OPT(do_dead_functions, ir);
+  OPT(do_dead_functions, ir, false);
   OPT(do_structure_splitting, ir);
}
propagate_invariance(ir);
diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 26f13f1..20c17e3 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -100,7 +100,7 @@ void do_dead_builtin_varyings(struct gl_context *ctx,
 bool do_dead_code(exec_list *instructions, bool uniform_locations_assigned);
 bool do_dead_code_local(exec_list *instructions);
 bool do_dead_code_unlinked(exec_list *instructions);
-bool do_dead_functions(exec_list *instructions);
+bool do_dead_functions(exec_list *instructions, bool only_builtins);
 bool opt_flip_matrices(exec_list *instructions);
 bool do_function_inlining(exec_list *instructions, bool only_builtins);
 bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true, bool lower_sub_return = true, bool lower_main_return = false, bool lower_continue = false, bool lower_break = false);
diff --git a/src/compiler/glsl/opt_dead_functions.cpp b/src/compiler/glsl/opt_dead_functions.cpp
index 2e90b65..4b58bc9 100644
--- a/src/compiler/glsl/opt_dead_functions.cpp
+++ b/src/compiler/glsl/opt_dead_functions.cpp
@@ -49,7 +49,8 @@ public:
 
 class ir_dead_functions_visitor : public ir_hierarchical_visitor {
 public:
-   ir_dead_functions_visitor()
+   ir_dead_functions_visitor(bool only_builtins)
+  : only_builtins(only_builtins)
{
   this->mem_ctx = ralloc_context(NULL);
}
@@ -66,6 +67,7 @@ public:
 
/* List of signature_entry */
exec_list signature_list;
+   bool only_builtins;
void *mem_ctx;
 };
 
@@ -94,7 +96,8 @@ ir_dead_functions_visitor::visit_enter(ir_function_signature *ir)
   entry->used = true;
}
 
-
+   if (only_builtins && !ir->is_builtin())
+  entry->used = true;
 
return visit_continue;
 }
@@ -111,9 +114,9 @@ ir_dead_functions_visitor::visit_enter(ir_call *ir)
 }
 
 bool
-do_dead_functions(exec_list *instructions)
+do_dead_functions(exec_list *instructions, bool only_builtins)
 {
-   ir_dead_functions_visitor v;
+   ir_dead_functions_visitor v(only_builtins);
bool progress = false;
 
visit_list_elements(&v, instructions);
diff --git a/src/compiler/glsl/test_optpass.cpp b/src/compiler/glsl/test_optpass.cpp
index 3771a89..3659c8b 100644
--- a/src/compiler/glsl/test_optpass.cpp
+++ b/src/compiler/glsl/test_optpass.cpp
@@ -85,7 +85,7 @@ do_optimization(struct exec_list *ir, const char *optimization,
} else if (strcmp(optimization, "do_dead_code_unlinked") == 0) {
   return do_dead_code_unlinked(ir);
} else if (strcmp(optimization, "do_dead_functions") == 0) {
-  return do_dead_functions(ir);
+  return do_dead_functions(ir, false);
} else if (strcmp(optimization, "do_function_inlining") == 0) {
   return do_function_inlining(ir, false);
} else if (sscanf(optimization,
-- 
2.5.5
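
The effect of the new only_builtins flag can be sketched on a flat list of
functions.  toy_function and remove_dead_functions() are hypothetical;
treating "main" as always used is an assumption about the surrounding
visitor code that this diff does not show.

```cpp
#include <cassert>
#include <string>
#include <vector>

/* Hypothetical summary of a function signature after the visitor ran. */
struct toy_function {
   std::string name;
   bool is_builtin;
   bool called;
};

/* Keep every function that counts as used; drop the rest. */
std::vector<toy_function>
remove_dead_functions(const std::vector<toy_function> &funcs,
                      bool only_builtins)
{
   std::vector<toy_function> kept;

   for (const toy_function &f : funcs) {
      /* Assumption: main is always considered used. */
      bool used = f.called || f.name == "main";

      /* With only_builtins set, user functions are never removed. */
      if (only_builtins && !f.is_builtin)
         used = true;

      if (used)
         kept.push_back(f);
   }

   return kept;
}
```

With only_builtins false, an uncalled user helper is removed along with
uncalled built-ins; with it true, only the uncalled built-ins go away.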


[Mesa-dev] [PATCH 13/15] glsl: Add bit_xor builder

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/ir_builder.cpp | 6 ++
 src/compiler/glsl/ir_builder.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/glsl/ir_builder.cpp b/src/compiler/glsl/ir_builder.cpp
index d68647f..f430100 100644
--- a/src/compiler/glsl/ir_builder.cpp
+++ b/src/compiler/glsl/ir_builder.cpp
@@ -417,6 +417,12 @@ bit_or(operand a, operand b)
 }
 
 ir_expression*
+bit_xor(operand a, operand b)
+{
+   return expr(ir_binop_bit_xor, a, b);
+}
+
+ir_expression*
 lshift(operand a, operand b)
 {
return expr(ir_binop_lshift, a, b);
diff --git a/src/compiler/glsl/ir_builder.h b/src/compiler/glsl/ir_builder.h
index b483ebf..231fbfc 100644
--- a/src/compiler/glsl/ir_builder.h
+++ b/src/compiler/glsl/ir_builder.h
@@ -168,6 +168,7 @@ ir_expression *logic_or(operand a, operand b);
 ir_expression *bit_not(operand a);
 ir_expression *bit_or(operand a, operand b);
 ir_expression *bit_and(operand a, operand b);
+ir_expression *bit_xor(operand a, operand b);
 ir_expression *lshift(operand a, operand b);
 ir_expression *rshift(operand a, operand b);
 
-- 
2.5.5


[Mesa-dev] [PATCH 14/15] glsl: Add a C++ code generator that uses ir_builder to rebuild a program

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

This is only in libstandalone currently because it will only be used in
the stand-alone compiler.

Signed-off-by: Ian Romanick 
---
 src/compiler/Makefile.sources  |   2 +
 src/compiler/glsl/ir_builder_print_visitor.cpp | 753 +
 src/compiler/glsl/ir_builder_print_visitor.h   |  32 ++
 3 files changed, 787 insertions(+)
 create mode 100644 src/compiler/glsl/ir_builder_print_visitor.cpp
 create mode 100644 src/compiler/glsl/ir_builder_print_visitor.h

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index f5b4f9c..88f1264 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -139,6 +139,8 @@ LIBGLSL_FILES = \
 # glsl_compiler
 
 GLSL_COMPILER_CXX_FILES = \
+   glsl/ir_builder_print_visitor.cpp \
+   glsl/ir_builder_print_visitor.h \
glsl/standalone_scaffolding.cpp \
glsl/standalone_scaffolding.h \
glsl/standalone.cpp \
diff --git a/src/compiler/glsl/ir_builder_print_visitor.cpp b/src/compiler/glsl/ir_builder_print_visitor.cpp
new file mode 100644
index 0000000..101a38f
--- /dev/null
+++ b/src/compiler/glsl/ir_builder_print_visitor.cpp
@@ -0,0 +1,753 @@
+/*
+ * Copyright © 2016 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ */
+
+#include "ir.h"
+#include "ir_hierarchical_visitor.h"
+#include "ir_builder_print_visitor.h"
+#include "compiler/glsl_types.h"
+#include "glsl_parser_extras.h"
+#include "main/macros.h"
+#include "util/hash_table.h"
+
+class ir_builder_print_visitor : public ir_hierarchical_visitor {
+public:
+   ir_builder_print_visitor(FILE *f);
+   virtual ~ir_builder_print_visitor();
+
+   void indent(void);
+
+   virtual ir_visitor_status visit(class ir_variable *);
+   virtual ir_visitor_status visit(class ir_dereference_variable *);
+   virtual ir_visitor_status visit(class ir_constant *);
+   virtual ir_visitor_status visit(class ir_loop_jump *);
+
+   virtual ir_visitor_status visit_enter(class ir_if *);
+
+   virtual ir_visitor_status visit_enter(class ir_loop *);
+   virtual ir_visitor_status visit_leave(class ir_loop *);
+
+   virtual ir_visitor_status visit_enter(class ir_function_signature *);
+   virtual ir_visitor_status visit_leave(class ir_function_signature *);
+
+   virtual ir_visitor_status visit_enter(class ir_expression *);
+
+   virtual ir_visitor_status visit_enter(class ir_assignment *);
+   virtual ir_visitor_status visit_leave(class ir_assignment *);
+
+   virtual ir_visitor_status visit_leave(class ir_call *);
+   virtual ir_visitor_status visit_leave(class ir_swizzle *);
+   virtual ir_visitor_status visit_leave(class ir_return *);
+
+private:
+   void print_with_indent(const char *fmt, ...);
+   void print_without_indent(const char *fmt, ...);
+
+   void print_without_declaration(const ir_rvalue *ir);
+   void print_without_declaration(const ir_constant *ir);
+   void print_without_declaration(const ir_dereference_variable *ir);
+   void print_without_declaration(const ir_swizzle *ir);
+   void print_without_declaration(const ir_expression *ir);
+
+   unsigned next_ir_index;
+
+   /**
+* Mapping from ir_instruction * -> index used in the generated C code
+* variable name.
+*/
+   hash_table *index_map;
+
+   FILE *f;
+
+   int indentation;
+};
+
+/* An operand is "simple" if it can be compactly printed on one line.
+ */
+static bool
+is_simple_operand(const ir_rvalue *ir, unsigned depth = 1)
+{
+   if (depth == 0)
+  return false;
+
+   switch (ir->ir_type) {
+   case ir_type_dereference_variable:
+  return true;
+
+   case ir_type_constant: {
+  if (ir->type == glsl_type::uint_type ||
+  ir->type == glsl_type::int_type ||
+  ir->type == glsl_type::float_type ||
+  ir->type == glsl_type::bool_type)
+ return true;
+
+  const ir_constant *const c = (ir_constant *) ir;
+  ir_constant_

[Mesa-dev] [PATCH 03/15] glsl: Modify function inlining to only operate on built-in functions

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

This will be used in the stand-alone compiler.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/glsl_parser_extras.cpp|  2 +-
 src/compiler/glsl/ir_optimization.h |  2 +-
 src/compiler/glsl/opt_function_inlining.cpp | 12 +++-
 src/compiler/glsl/test_optpass.cpp  |  2 +-
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp
index c1e958a..29eba13 100644
--- a/src/compiler/glsl/glsl_parser_extras.cpp
+++ b/src/compiler/glsl/glsl_parser_extras.cpp
@@ -2042,7 +2042,7 @@ do_common_optimization(exec_list *ir, bool linked,
OPT(lower_instructions, ir, SUB_TO_ADD_NEG);
 
if (linked) {
-  OPT(do_function_inlining, ir);
+  OPT(do_function_inlining, ir, false);
   OPT(do_dead_functions, ir);
   OPT(do_structure_splitting, ir);
}
diff --git a/src/compiler/glsl/ir_optimization.h b/src/compiler/glsl/ir_optimization.h
index 3bd6928..26f13f1 100644
--- a/src/compiler/glsl/ir_optimization.h
+++ b/src/compiler/glsl/ir_optimization.h
@@ -102,7 +102,7 @@ bool do_dead_code_local(exec_list *instructions);
 bool do_dead_code_unlinked(exec_list *instructions);
 bool do_dead_functions(exec_list *instructions);
 bool opt_flip_matrices(exec_list *instructions);
-bool do_function_inlining(exec_list *instructions);
+bool do_function_inlining(exec_list *instructions, bool only_builtins);
 bool do_lower_jumps(exec_list *instructions, bool pull_out_jumps = true, bool 
lower_sub_return = true, bool lower_main_return = false, bool lower_continue = 
false, bool lower_break = false);
 bool do_lower_texture_projection(exec_list *instructions);
 bool do_if_simplification(exec_list *instructions);
diff --git a/src/compiler/glsl/opt_function_inlining.cpp 
b/src/compiler/glsl/opt_function_inlining.cpp
index 83534bf..f216696 100644
--- a/src/compiler/glsl/opt_function_inlining.cpp
+++ b/src/compiler/glsl/opt_function_inlining.cpp
@@ -43,9 +43,10 @@ namespace {
 
 class ir_function_inlining_visitor : public ir_hierarchical_visitor {
 public:
-   ir_function_inlining_visitor()
+   ir_function_inlining_visitor(bool only_builtins)
+  : progress(false), only_builtins(only_builtins)
{
-  progress = false;
+  /* empty */
}
 
virtual ~ir_function_inlining_visitor()
@@ -60,14 +61,15 @@ public:
virtual ir_visitor_status visit_enter(ir_swizzle *);
 
bool progress;
+   bool only_builtins;
 };
 
 } /* unnamed namespace */
 
 bool
-do_function_inlining(exec_list *instructions)
+do_function_inlining(exec_list *instructions, bool only_builtins)
 {
-   ir_function_inlining_visitor v;
+   ir_function_inlining_visitor v(only_builtins);
 
v.run(instructions);
 
@@ -246,7 +248,7 @@ ir_function_inlining_visitor::visit_enter(ir_swizzle *ir)
 ir_visitor_status
 ir_function_inlining_visitor::visit_enter(ir_call *ir)
 {
-   if (can_inline(ir)) {
+   if ((!only_builtins || ir->callee->is_builtin()) && can_inline(ir)) {
   ir->generate_inline(ir);
   ir->remove();
   this->progress = true;
diff --git a/src/compiler/glsl/test_optpass.cpp 
b/src/compiler/glsl/test_optpass.cpp
index 852af19..3771a89 100644
--- a/src/compiler/glsl/test_optpass.cpp
+++ b/src/compiler/glsl/test_optpass.cpp
@@ -87,7 +87,7 @@ do_optimization(struct exec_list *ir, const char 
*optimization,
} else if (strcmp(optimization, "do_dead_functions") == 0) {
   return do_dead_functions(ir);
} else if (strcmp(optimization, "do_function_inlining") == 0) {
-  return do_function_inlining(ir);
+  return do_function_inlining(ir, false);
} else if (sscanf(optimization,
  "do_lower_jumps ( %d , %d , %d , %d , %d ) ",
  &int_0, &int_1, &int_2, &int_3, &int_4) == 5) {
-- 
2.5.5
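
The guard added to visit_enter(ir_call *) above can be modelled as a small
stand-alone predicate. This is only a sketch: should_inline() and its
parameters are illustrative names rather than Mesa API; callee_is_builtin
stands in for ir->callee->is_builtin() and inlinable for can_inline(ir).

```c
#include <stdbool.h>

/* Hypothetical model of the condition in the patch:
 *
 *    (!only_builtins || ir->callee->is_builtin()) && can_inline(ir)
 *
 * With only_builtins == false the pass behaves as before; with
 * only_builtins == true, calls to user-defined functions are left alone.
 */
static bool
should_inline(bool only_builtins, bool callee_is_builtin, bool inlinable)
{
   return (!only_builtins || callee_is_builtin) && inlinable;
}
```

Both existing callers in the patch pass false, so their behaviour should be
unchanged; only a caller passing true would see user functions skipped.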

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/15] glsl/linker: Allow link_intrastage_shaders when there is no main()

2016-09-15 Thread Ian Romanick

From: Ian Romanick 

This enables a sort of partial linking.  The primary use for this feature is
resolving built-in functions in the stand-alone compiler.

Signed-off-by: Ian Romanick 
---
 src/compiler/glsl/linker.cpp | 28 +---
 src/compiler/glsl/linker.h   |  9 +
 2 files changed, 26 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp
index 2699c1e..3ec6ea0 100644
--- a/src/compiler/glsl/linker.cpp
+++ b/src/compiler/glsl/linker.cpp
@@ -2137,12 +2137,13 @@ link_cs_input_layout_qualifiers(struct 
gl_shader_program *prog,
  * If this function is supplied a single shader, it is cloned, and the new
  * shader is returned.
  */
-static struct gl_linked_shader *
+struct gl_linked_shader *
 link_intrastage_shaders(void *mem_ctx,
struct gl_context *ctx,
struct gl_shader_program *prog,
struct gl_shader **shader_list,
-   unsigned num_shaders)
+unsigned num_shaders,
+bool allow_missing_main)
 {
struct gl_uniform_block *ubo_blocks = NULL;
struct gl_uniform_block *ssbo_blocks = NULL;
@@ -2221,6 +,9 @@ link_intrastage_shaders(void *mem_ctx,
   }
}
 
+   if (main == NULL && allow_missing_main)
+  main = shader_list[0];
+
if (main == NULL) {
   linker_error(prog, "%s shader lacks `main'\n",
   _mesa_shader_stage_to_string(shader_list[0]->Stage));
@@ -2250,16 +2254,18 @@ link_intrastage_shaders(void *mem_ctx,
/* Move any instructions other than variable declarations or function
 * declarations into main.
 */
-   exec_node *insertion_point =
-  move_non_declarations(linked->ir, (exec_node *) &main_sig->body, false,
-   linked);
+   if (main_sig != NULL) {
+  exec_node *insertion_point =
+ move_non_declarations(linked->ir, (exec_node *) &main_sig->body, 
false,
+   linked);
 
-   for (unsigned i = 0; i < num_shaders; i++) {
-  if (shader_list[i] == main)
-continue;
+  for (unsigned i = 0; i < num_shaders; i++) {
+ if (shader_list[i] == main)
+continue;
 
-  insertion_point = move_non_declarations(shader_list[i]->ir,
- insertion_point, true, linked);
+ insertion_point = move_non_declarations(shader_list[i]->ir,
+ insertion_point, true, 
linked);
+  }
}
 
/* Check if any shader needs built-in functions. */
@@ -4874,7 +4880,7 @@ link_shaders(struct gl_context *ctx, struct 
gl_shader_program *prog)
   if (num_shaders[stage] > 0) {
  gl_linked_shader *const sh =
 link_intrastage_shaders(mem_ctx, ctx, prog, shader_list[stage],
-num_shaders[stage]);
+num_shaders[stage], false);
 
  if (!prog->LinkStatus) {
 if (sh)
diff --git a/src/compiler/glsl/linker.h b/src/compiler/glsl/linker.h
index e1a53d2..fee39b5 100644
--- a/src/compiler/glsl/linker.h
+++ b/src/compiler/glsl/linker.h
@@ -86,6 +86,15 @@ extern void
 link_check_atomic_counter_resources(struct gl_context *ctx,
 struct gl_shader_program *prog);
 
+
+extern struct gl_linked_shader *
+link_intrastage_shaders(void *mem_ctx,
+struct gl_context *ctx,
+struct gl_shader_program *prog,
+struct gl_shader **shader_list,
+unsigned num_shaders,
+bool allow_missing_main);
+
 /**
  * Class for processing all of the leaf fields of a variable that corresponds
  * to a program resource.
-- 
2.5.5


Re: [Mesa-dev] [PATCH] gallium/docs: document alpha_to_coverage and alpha_to_one blend state

2016-09-15 Thread Roland Scheidegger

A good question. I have no idea what d3d10 does there (since a
conversion from alpha to coverage doesn't really make sense as the alpha
values don't actually go from 0 to 1).
I suppose they could be translated away in the state tracker as well
since the combination doesn't make sense (albeit we don't do that for
d3d10 - maybe this behavior isn't tested).

But in any case, this documentation is an improvement.
(alpha_to_one is not possible with d3d10, therefore the semantics could
be different, but it probably makes sense it follows the same rules.)
Reviewed-by: Roland Scheidegger 


On 15.09.2016 at 23:35, Ilia Mirkin wrote:
> What about integer RTs? I had to add a hack in nouveau to make it
> disable those when RT0 is an integer. It'd be more convenient if they
> were turned off in the first place.
> 
> On Thu, Sep 15, 2016 at 5:34 PM, Brian Paul  wrote:
>> The gallium interface defines these like DX10.  Note that OpenGL ignores
>> these options if MSAA is disabled or the dest buffer doesn't support
>> MSAA.
>> ---
>>  src/gallium/docs/source/cso/blend.rst | 12 
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/src/gallium/docs/source/cso/blend.rst 
>> b/src/gallium/docs/source/cso/blend.rst
>> index dce999c..7316e5c 100644
>> --- a/src/gallium/docs/source/cso/blend.rst
>> +++ b/src/gallium/docs/source/cso/blend.rst
>> @@ -88,6 +88,18 @@ independent_blend_enable
>> the first member of the rt array contains valid data.
>>  rt
>> Contains the per-rendertarget blend state.
>> +alpha_to_coverage
>> +   If enabled, the fragment's alpha value is used to override the fragment's
>> +   coverage mask.  The coverage mask will be all zeros if the alpha value is
>> +   zero.  The coverage mask will be all ones if the alpha value is one.
>> +   Otherwise, the number of bits set in the coverage mask will be 
>> proportional
>> +   to the alpha value.  Note that this step happens regardless of whether
>> +   multisample is enabled or the destination buffer is multisampled.
>> +alpha_to_one
>> +   If enabled, the fragment's alpha value will be set to one.  As with
>> +   alpha_to_coverage, this step happens regardless of whether multisample
>> +   is enabled or the destination buffer is multisampled.
>> +
>>
>>  Per-rendertarget Members
>>  
>> --
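
The alpha_to_coverage wording quoted above can be illustrated with a small
sketch. The documentation only says the number of set bits is proportional
to alpha, so the rounding rule and the choice to set the low bits first are
assumptions made here for illustration; real hardware typically dithers the
pattern. alpha_to_coverage_mask() is a hypothetical helper, not a gallium
function.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a coverage mask for num_samples samples
 * (1..32) from a floating-point alpha value.
 *
 *  - alpha <= 0 (or NaN) gives all zeros
 *  - alpha >= 1 gives all ones
 *  - otherwise round(alpha * num_samples) low bits are set (assumed rule)
 */
static uint32_t
alpha_to_coverage_mask(float alpha, unsigned num_samples)
{
   assert(num_samples >= 1 && num_samples <= 32);

   const uint32_t full =
      num_samples == 32 ? UINT32_MAX : (1u << num_samples) - 1u;

   if (!(alpha > 0.0f))
      return 0;
   if (alpha >= 1.0f)
      return full;

   /* Round to the nearest number of covered samples. */
   unsigned bits = (unsigned)(alpha * (float)num_samples + 0.5f);
   if (bits >= num_samples)
      return full;

   return (1u << bits) - 1u;
}
```

A float alpha in [0, 1] is assumed throughout, which is also why the
integer render target case raised above has no obvious meaning here.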


Re: [Mesa-dev] [PATCH 00/15] Compile GLSL to ir_builder

2016-09-15 Thread Ian Romanick

On 09/15/2016 03:12 PM, Ian Romanick wrote:
> This series makes the stand-alone GLSL compiler useful for something.
> It adds an option to generate C++ code that uses ir_builder to recreate
> the compiled GLSL shader.  I intend to use this for various lowering
> code for GL_ARB_gpu_shader_int64 and GL_ARB_gpu_shader_fp64 (on
> platforms that don't actually have double precision hardware).  See also
> Elie Tournier's "soft double" GSoC project.
> 
> As an example, I present some GLSL code that does 64-bit integer
> division:
> 
> https://people.freedesktop.org/~idr/udivmod64.glsl
> 
> and the C++ code generated:
> 
> https://people.freedesktop.org/~idr/udivmod64.cpp
> 
> This is a little bit of a fib... the code in this series lacks a very
> small amount of 64-bit integer support, and the other necessary bits in
> Mesa are not yet in master.  A tree with all of that will be available
> soon.

https://cgit.freedesktop.org/~idr/mesa/log/?h=arb_gpu_shader_int64

radeonsi uses LLVM to do the 64-bit multiply and division lowering, so
ARB_gpu_shader_int64 support could land before this series.  The
rearranging and rebasing should be trivial to make that work.

> The generated code is only ~200 lines compared to ~50 lines of GLSL.
> However, I can make changes to the GLSL much more easily than I can to
> the big pile of ir_builder code.  It's a bit like coding in assembly.
> 
> I believe this same technique could be adapted to generate NIR builder
> too.  Then we could use the same GLSL source to lower things in NIR that
> originated from SPIR-V binaries.
> 
> The ideal work flow would be to generate the C++ code while Mesa is
> building, like we do with the various lexers and parsers.  However, this
> presents the usual compiler bootstrap problems that we have managed to
> avoid all these years.  It also adds problems for cross-compiling Mesa.
> 
> I don't expect the generated code to change very often... maybe the best
> work flow that will actually work is to generate the C++ files by hand
> and commit both the C++ and the GLSL to the tree.  That seems pretty
> awful, but I'm not sure what we can do that's better.
> 
> One other change I've thought about making... the C++ can include an
> embedded version of the GLSL, possibly as a comment.  Then we could
> detect when the C++ and GLSL didn't match.
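
One way the mismatch detection suggested above might look is sketched
below. It is a variation on the idea rather than what the series does:
instead of embedding the full GLSL text, the generated file is assumed to
carry a 32-bit FNV-1a checksum of the GLSL it was generated from. The names
fnv1a_32() and generated_code_is_stale() are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t
fnv1a_32(const char *s)
{
   uint32_t hash = 2166136261u;        /* FNV offset basis */

   for (; *s != '\0'; s++) {
      hash ^= (uint8_t)*s;
      hash *= 16777619u;               /* FNV prime */
   }
   return hash;
}

/* Returns true when the checksum recorded in the generated C++ no longer
 * matches the GLSL source it is supposed to have been generated from.
 */
static bool
generated_code_is_stale(uint32_t embedded_checksum, const char *current_glsl)
{
   return fnv1a_32(current_glsl) != embedded_checksum;
}
```

A check like this could run at build time and only warn, which would avoid
the bootstrap problem of having to run the compiler during the build.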
> 
> This does feel a little like "everything old is new again."  We used to
> do something similar very early in the compiler development.  We had a
> special inline "assembly" mode, and we'd embed the GLSL for built-in
> functions in the Mesa binary.  At context creation, we'd bootstrap the
> compiler by compiling the built-in functions.
> 
> This series is available at:
> 
> https://cgit.freedesktop.org/~idr/mesa/log/?h=standalone-ir_builder
> 

Re: [Mesa-dev] [PATCH 01/10] i965: use nir_lower_indirect_derefs() for GLSL Gen7+

2016-09-15 Thread Timothy Arceri

On Thu, 2016-09-15 at 12:34 -0700, Jason Ekstrand wrote:
> On Sep 15, 2016 12:05 AM, "Timothy Arceri"  com> wrote:
> >
> > This moves the nir_lower_indirect_derefs() call into
> > brw_preprocess_nir() so that it is called by both OpenGL and Vulkan
> > and removes the call to the old GLSL IR pass
> > lower_variable_index_to_cond_assign().
> >
> > We want to do this pass in nir to be able to move loop unrolling
> > to nir.
> >
> > There is an increase of 1-3 instructions in a small number of shaders,
> > and 2 Kerbal Space Program shaders that increase by 32 instructions.
> >
> > Shader-db results BDW:
> >
> > total instructions in shared programs: 8705873 -> 8706194 (0.00%)
> > instructions in affected programs: 32515 -> 32836 (0.99%)
> > helped: 3
> > HURT: 79
> >
> > total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
> > cycles in affected programs: 528104 -> 493460 (-6.56%)
> > helped: 47
> > HURT: 37
> >
> > LOST:   2
> > GAINED: 0
> > ---
> >  src/intel/vulkan/anv_pipeline.c        | 10 --
> >  src/mesa/drivers/dri/i965/brw_link.cpp | 26 ++--
> --
> >  src/mesa/drivers/dri/i965/brw_nir.c    | 12 
> >  3 files changed, 26 insertions(+), 22 deletions(-)
> >
> > diff --git a/src/intel/vulkan/anv_pipeline.c
> b/src/intel/vulkan/anv_pipeline.c
> > index f96fe22..f292f0b 100644
> > --- a/src/intel/vulkan/anv_pipeline.c
> > +++ b/src/intel/vulkan/anv_pipeline.c
> > @@ -183,16 +183,6 @@ anv_shader_compile_to_nir(struct anv_device
> *device,
> >
> >     nir_shader_gather_info(nir, entry_point->impl);
> >
> > -   nir_variable_mode indirect_mask = 0;
> > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectInput)
> > -      indirect_mask |= nir_var_shader_in;
> > -   if (compiler-
> >glsl_compiler_options[stage].EmitNoIndirectOutput)
> > -      indirect_mask |= nir_var_shader_out;
> > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectTemp)
> > -      indirect_mask |= nir_var_local;
> > -
> > -   nir_lower_indirect_derefs(nir, indirect_mask);
> > -
> >     return nir;
> >  }
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp
> b/src/mesa/drivers/dri/i965/brw_link.cpp
> > index 2b1fa61..41791d4 100644
> > --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> > @@ -139,18 +139,20 @@ process_glsl_ir(gl_shader_stage stage,
> >
> >     do_copy_propagation(shader->ir);
> >
> > -   bool lowered_variable_indexing =
> > -      lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> > -                                          shader->ir,
> > -                                          options-
> >EmitNoIndirectInput,
> > -                                          options-
> >EmitNoIndirectOutput,
> > -                                          options-
> >EmitNoIndirectTemp,
> > -                                          options-
> >EmitNoIndirectUniform);
> > -
> > -   if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
> > -      perf_debug("Unsupported form of variable indexing in %s;
> falling "
> > -                 "back to very inefficient code generation\n",
> > -                 _mesa_shader_stage_to_abbrev(shader->Stage));
> > +   if (brw->gen < 7) {
> > +      bool lowered_variable_indexing =
> > +       
>  lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> > +                                             shader->ir,
> > +                                             options-
> >EmitNoIndirectInput,
> > +                                             options-
> >EmitNoIndirectOutput,
> > +                                             options-
> >EmitNoIndirectTemp,
> > +                                             options-
> >EmitNoIndirectUniform);
> > +
> > +      if (unlikely(brw->perf_debug && lowered_variable_indexing))
> {
> > +         perf_debug("Unsupported form of variable indexing in %s;
> falling "
> > +                    "back to very inefficient code generation\n",
> > +                    _mesa_shader_stage_to_abbrev(shader->Stage));
> > +      }
> >     }
> >
> >     bool progress;
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> b/src/mesa/drivers/dri/i965/brw_nir.c
> > index e8dafae..af646ed 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > @@ -453,6 +453,18 @@ brw_preprocess_nir(const struct brw_compiler
> *compiler, nir_shader *nir)
> >     /* Lower a bunch of stuff */
> >     OPT_V(nir_lower_var_copies);
> >
> > +   if (compiler->devinfo->gen > 6) {
> I think you want "> 7" here

It can be used with gen 7 and up. I could change it to >= 7 if that is
easier to parse but I think > 6 is functionally correct.


Re: [Mesa-dev] V3 Loop unrolling in NIR

2016-09-15 Thread Timothy Arceri

On Thu, 2016-09-15 at 12:28 +0300, Eero Tamminen wrote:
> Hi,
> 
> Have you any plans for supporting partial unrolling?

Not currently, no, although it shouldn't be too difficult to add.

> 
> I.e. if the loop count is too large to be completely unrolled, unroll
> it a few times (so that it still fits into the instruction cache) and
> then loop that.
> 
> E.g. for a loop with 51 rounds, Mesa could unroll it 4 rounds, loop
> that 12 times and unroll (or loop) the remaining 3 rounds separately.
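
The 51-round example above can be written out as a source-level sketch. This
only illustrates the transformation being asked about; it is not how the NIR
pass works, and body(), UNROLL_FACTOR and the two sum functions are made-up
names.

```c
/* Hypothetical loop body; any side-effect-free function of i would do. */
static long
body(unsigned i)
{
   return (long)i * (long)i + 1;
}

/* The original loop: one call to body() per round. */
static long
sum_rolled(unsigned rounds)
{
   long acc = 0;

   for (unsigned i = 0; i < rounds; i++)
      acc += body(i);
   return acc;
}

#define UNROLL_FACTOR 4u

/* Partially unrolled: the main loop runs rounds / 4 times with four copies
 * of the body, then a remainder loop handles the last rounds % 4 rounds.
 * For rounds == 51 that is 12 trips of the main loop plus 3 leftover rounds.
 */
static long
sum_partially_unrolled(unsigned rounds)
{
   long acc = 0;
   unsigned i = 0;

   for (; i + UNROLL_FACTOR <= rounds; i += UNROLL_FACTOR) {
      acc += body(i);
      acc += body(i + 1);
      acc += body(i + 2);
      acc += body(i + 3);
   }

   for (; i < rounds; i++)
      acc += body(i);

   return acc;
}
```

The remainder could equally be fully unrolled when the trip count is a
compile-time constant, as it is in the 51-round case.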
> 
> 
>   - Eero
> 
> On 15.09.2016 10:03, Timothy Arceri wrote:
> > 
> > Big thanks to Connor for his feedback on previous versions, and
> > to Jason for answering all my nir questions.
> > 
> > This series works on ssa defs so for now it's only enabled for
> > the scalar backend on Gen7+.
> > 
> > V3:
> > - So called complex loop unrolling has been implemented.
> > - An instruction limit and rules from the GLSL IR pass to override
> >  the limit for unrolling have been implemented.
> > - Lots of other stuff see individual patches.
> > 
> > total instructions in shared programs: 8488940 -> 8488648 (-0.00%)
> > instructions in affected programs: 48903 -> 48611 (-0.60%)
> > helped: 68
> > HURT: 89
> > 
> > Most of this HURT comes from switching to using
> > nir_lower_indirect_derefs(). See patch 1 for more details.
> > 
> > total cycles in shared programs: 69787006 -> 69758740 (-0.04%)
> > cycles in affected programs: 2525708 -> 2497442 (-1.12%)
> > helped: 900
> > HURT: 919
> > 
> > total loops in shared programs: 2071 -> 1499 (-27.62%)
> > loops in affected programs: 687 -> 115 (-83.26%)
> > helped: 655
> > HURT: 99
> > 
> > Helped here comes from a number of things. One example is that the
> > nir pass is better than the GLSL pass at unrolling loops
> > regardless of which terminator has the lowest limit. We could
> > easily go further and handle unrolling of loops with complex
> > terminators, e.g. where the if's then or else blocks contain
> > instructions. Currently we just bail if they are not empty; I still
> > need to check whether it's worthwhile.
> > 
> > Another reason could be that I've set the instruction limit too
> > high but that doesn't seem to be the case.
> > 
> > I believe 82/99 of the HURT is from shaders that look something
> > like this:
> > 
> >   vec2 array[const_size_of_array];
> >   for (i = 0; i < const_size_of_array; i++) {
> > ...  = array[i];
> > 
> > ... lots of instructions (more than the unroll limit) ...
> >   }
> > 
> > The GLSL IR pass would force this to unroll as long as
> > const_size_of_array wasn't greater than 32. However, by the time we
> > get to the nir pass the arrays have been removed. It seems like this
> > may only be happening for vectors, but I haven't looked into what is
> > causing it yet.
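
The interaction between the instruction limit and the forced-unroll rule
described above can be sketched as a decision function. This is a loose
model built only from the description in this thread: should_unroll() and
its parameters are hypothetical, and the exact conditions used by the GLSL
IR and NIR passes may differ.

```c
#include <stdbool.h>

/* Hypothetical unroll decision.
 *
 *  trip_count        known iteration count, 0 if it could not be determined
 *  body_instrs       instructions in one copy of the loop body
 *  array_indexed     body indexes an array with the induction variable
 *                    (the override described above)
 *  max_iterations    e.g. 32, as in max_unroll_iterations
 *  instr_limit       cap on the size of the fully unrolled loop
 */
static bool
should_unroll(unsigned trip_count, unsigned body_instrs, bool array_indexed,
              unsigned max_iterations, unsigned instr_limit)
{
   if (trip_count == 0 || trip_count > max_iterations)
      return false;

   /* The override: unroll regardless of size so the array accesses can
    * become direct.  If the array has already been removed by the time
    * the pass runs, this never fires and the size check below decides.
    */
   if (array_indexed)
      return true;

   return (unsigned long long)trip_count * body_instrs <= instr_limit;
}
```

Under this model the shaders described above would move from the forced
path to the size check, which fits them no longer being unrolled.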
> > 
> > The other 17 shaders seem to be various corner cases that can be fixed
> > in follow-up patches.
> > 
> > total spills in shared programs: 2212 -> 2212 (0.00%)
> > spills in affected programs: 0 -> 0
> > helped: 0
> > HURT: 0
> > 
> > total fills in shared programs: 1891 -> 1891 (0.00%)
> > fills in affected programs: 0 -> 0
> > helped: 0
> > HURT: 0
> > 
> > LOST:   6
> > GAINED: 32
> > 

Re: [Mesa-dev] [PATCH 10/10] i965: use nir unrolling for scalar backend Gen7+

2016-09-15 Thread Connor Abbott

This seems a little dubious... why restrict it to gen7+? And why only
scalar? This pass assumes SSA, but so do many other passes in core NIR
that we also run in nir_optimize(), so that shouldn't be a problem.

On Thu, Sep 15, 2016 at 3:03 AM, Timothy Arceri
 wrote:
> ---
>  src/compiler/glsl/glsl_parser_extras.cpp | 12 +
>  src/mesa/drivers/dri/i965/brw_compiler.c | 42 
> +++-
>  src/mesa/drivers/dri/i965/brw_nir.c  | 23 +
>  3 files changed, 55 insertions(+), 22 deletions(-)
>
> diff --git a/src/compiler/glsl/glsl_parser_extras.cpp 
> b/src/compiler/glsl/glsl_parser_extras.cpp
> index 436ddd0..a5c926a 100644
> --- a/src/compiler/glsl/glsl_parser_extras.cpp
> +++ b/src/compiler/glsl/glsl_parser_extras.cpp
> @@ -2057,12 +2057,14 @@ do_common_optimization(exec_list *ir, bool linked,
> OPT(optimize_split_arrays, ir, linked);
> OPT(optimize_redundant_jumps, ir);
>
> -   loop_state *ls = analyze_loop_variables(ir);
> -   if (ls->loop_found) {
> -  OPT(set_loop_controls, ir, ls);
> -  OPT(unroll_loops, ir, ls, options);
> +   if (options->MaxUnrollIterations != 0) {
> +  loop_state *ls = analyze_loop_variables(ir);
> +  if (ls->loop_found) {
> + OPT(set_loop_controls, ir, ls);
> + OPT(unroll_loops, ir, ls, options);
> +  }
> +  delete ls;
> }
> -   delete ls;
>
>  #undef OPT
>
> diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c 
> b/src/mesa/drivers/dri/i965/brw_compiler.c
> index 86b1eaa..523b554 100644
> --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> @@ -43,18 +43,28 @@
> .use_interpolated_input_intrinsics = true,
>  \
> .vertex_id_zero_based = true
>
> +#define COMMON_SCALAR_OPTIONS
>  \
> +   .lower_pack_half_2x16 = true, 
>  \
> +   .lower_pack_snorm_2x16 = true,
>  \
> +   .lower_pack_snorm_4x8 = true, 
>  \
> +   .lower_pack_unorm_2x16 = true,
>  \
> +   .lower_pack_unorm_4x8 = true, 
>  \
> +   .lower_unpack_half_2x16 = true,   
>  \
> +   .lower_unpack_snorm_2x16 = true,  
>  \
> +   .lower_unpack_snorm_4x8 = true,   
>  \
> +   .lower_unpack_unorm_2x16 = true,  
>  \
> +   .lower_unpack_unorm_4x8 = true
>  \
> +
>  static const struct nir_shader_compiler_options scalar_nir_options = {
> COMMON_OPTIONS,
> -   .lower_pack_half_2x16 = true,
> -   .lower_pack_snorm_2x16 = true,
> -   .lower_pack_snorm_4x8 = true,
> -   .lower_pack_unorm_2x16 = true,
> -   .lower_pack_unorm_4x8 = true,
> -   .lower_unpack_half_2x16 = true,
> -   .lower_unpack_snorm_2x16 = true,
> -   .lower_unpack_snorm_4x8 = true,
> -   .lower_unpack_unorm_2x16 = true,
> -   .lower_unpack_unorm_4x8 = true,
> +   COMMON_SCALAR_OPTIONS,
> +   .max_unroll_iterations = 0,
> +};
> +
> +static const struct nir_shader_compiler_options scalar_nir_options_gen7 = {
> +   COMMON_OPTIONS,
> +   COMMON_SCALAR_OPTIONS,
> +   .max_unroll_iterations = 32,
>  };
>
>  static const struct nir_shader_compiler_options vector_nir_options = {
> @@ -75,6 +85,7 @@ static const struct nir_shader_compiler_options 
> vector_nir_options = {
> .lower_unpack_unorm_2x16 = true,
> .lower_extract_byte = true,
> .lower_extract_word = true,
> +   .max_unroll_iterations = 0,
>  };
>
>  static const struct nir_shader_compiler_options vector_nir_options_gen6 = {
> @@ -92,6 +103,7 @@ static const struct nir_shader_compiler_options 
> vector_nir_options_gen6 = {
> .lower_unpack_unorm_2x16 = true,
> .lower_extract_byte = true,
> .lower_extract_word = true,
> +   .max_unroll_iterations = 0,
>  };
>
>  struct brw_compiler *
> @@ -119,7 +131,6 @@ brw_compiler_create(void *mem_ctx, const struct 
> gen_device_info *devinfo)
>
> /* We want the GLSL compiler to emit code that uses condition codes */
> for (int i = 0; i < MESA_SHADER_STAGES; i++) {
> -  compiler->glsl_compiler_options[i].MaxUnrollIterations = 32;
>compiler->glsl_compiler_options[i].MaxIfDepth =
>   devinfo->gen < 6 ? 16 : UINT_MAX;
>
> @@ -140,8 +151,15 @@ brw_compiler_create(void *mem_ctx, const struct 
> gen_device_info *devinfo)
>   compiler->glsl_compiler_options[i].EmitNoIndirectSampler = true;
>
>if (is_scalar) {
> - compiler->glsl_compiler_options[i].NirOptions = &scalar_nir_options;
> + if (devinfo->gen > 6) {
> +compiler->glsl_compiler_options[i].MaxUnrollIterations = 0;
> + } else {
> +compiler->glsl_compiler_options[i].MaxUnrollIte

Re: [Mesa-dev] [PATCH 01/10] i965: use nir_lower_indirect_derefs() for GLSL Gen7+

2016-09-15 Thread Jason Ekstrand

On Sep 15, 2016 4:31 PM, "Timothy Arceri" 
wrote:
>
> On Thu, 2016-09-15 at 12:34 -0700, Jason Ekstrand wrote:
> > On Sep 15, 2016 12:05 AM, "Timothy Arceri"  > com> wrote:
> > >
> > > This moves the nir_lower_indirect_derefs() call into
> > > brw_preprocess_nir() so that it is called by both OpenGL and Vulkan
> > > and removes the call to the old GLSL IR pass
> > > lower_variable_index_to_cond_assign().
> > >
> > > We want to do this pass in nir to be able to move loop unrolling
> > > to nir.
> > >
> > > There is an increase of 1-3 instructions in a small number of shaders,
> > > and 2 Kerbal Space Program shaders that increase by 32 instructions.
> > >
> > > Shader-db results BDW:
> > >
> > > total instructions in shared programs: 8705873 -> 8706194 (0.00%)
> > > instructions in affected programs: 32515 -> 32836 (0.99%)
> > > helped: 3
> > > HURT: 79
> > >
> > > total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
> > > cycles in affected programs: 528104 -> 493460 (-6.56%)
> > > helped: 47
> > > HURT: 37
> > >
> > > LOST:   2
> > > GAINED: 0
> > > ---
> > >  src/intel/vulkan/anv_pipeline.c| 10 --
> > >  src/mesa/drivers/dri/i965/brw_link.cpp | 26 ++--
> > --
> > >  src/mesa/drivers/dri/i965/brw_nir.c| 12 
> > >  3 files changed, 26 insertions(+), 22 deletions(-)
> > >
> > > diff --git a/src/intel/vulkan/anv_pipeline.c
> > b/src/intel/vulkan/anv_pipeline.c
> > > index f96fe22..f292f0b 100644
> > > --- a/src/intel/vulkan/anv_pipeline.c
> > > +++ b/src/intel/vulkan/anv_pipeline.c
> > > @@ -183,16 +183,6 @@ anv_shader_compile_to_nir(struct anv_device
> > *device,
> > >
> > > nir_shader_gather_info(nir, entry_point->impl);
> > >
> > > -   nir_variable_mode indirect_mask = 0;
> > > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectInput)
> > > -  indirect_mask |= nir_var_shader_in;
> > > -   if (compiler-
> > >glsl_compiler_options[stage].EmitNoIndirectOutput)
> > > -  indirect_mask |= nir_var_shader_out;
> > > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectTemp)
> > > -  indirect_mask |= nir_var_local;
> > > -
> > > -   nir_lower_indirect_derefs(nir, indirect_mask);
> > > -
> > > return nir;
> > >  }
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp
> > b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > index 2b1fa61..41791d4 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> > > +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > @@ -139,18 +139,20 @@ process_glsl_ir(gl_shader_stage stage,
> > >
> > > do_copy_propagation(shader->ir);
> > >
> > > -   bool lowered_variable_indexing =
> > > -  lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> > > -  shader->ir,
> > > -  options-
> > >EmitNoIndirectInput,
> > > -  options-
> > >EmitNoIndirectOutput,
> > > -  options-
> > >EmitNoIndirectTemp,
> > > -  options-
> > >EmitNoIndirectUniform);
> > > -
> > > -   if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
> > > -  perf_debug("Unsupported form of variable indexing in %s;
> > falling "
> > > - "back to very inefficient code generation\n",
> > > - _mesa_shader_stage_to_abbrev(shader->Stage));
> > > +   if (brw->gen < 7) {
> > > +  bool lowered_variable_indexing =
> > > +
> >  lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> > > + shader->ir,
> > > + options-
> > >EmitNoIndirectInput,
> > > + options-
> > >EmitNoIndirectOutput,
> > > + options-
> > >EmitNoIndirectTemp,
> > > + options-
> > >EmitNoIndirectUniform);
> > > +
> > > +  if (unlikely(brw->perf_debug && lowered_variable_indexing))
> > {
> > > + perf_debug("Unsupported form of variable indexing in %s;
> > falling "
> > > +"back to very inefficient code generation\n",
> > > +_mesa_shader_stage_to_abbrev(shader->Stage));
> > > +  }
> > > }
> > >
> > > bool progress;
> > > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> > b/src/mesa/drivers/dri/i965/brw_nir.c
> > > index e8dafae..af646ed 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > > @@ -453,6 +453,18 @@ brw_preprocess_nir(const struct brw_compiler
> > *compiler, nir_shader *nir)
> > > /* Lower a bunch of stuff */
> > > OPT_V(nir_lower_var_copies);
> > >
> > > +   if (compiler->devinfo->gen > 6) {
> > I think you want "> 7" here
>
> It can be used with gen 7 and up. I could change it to >= 7 if that is
> easier to parse but I think > 6 is functionally co

Re: [Mesa-dev] [PATCH 10/10] i965: use nir unrolling for scalar backend Gen7+

2016-09-15 Thread Timothy Arceri

On Thu, 2016-09-15 at 19:49 -0400, Connor Abbott wrote:
> This seems a little dubious... why restrict it to gen7+?

Because if we don't switch to using nir_lower_indirect_derefs() then
lower_variable_index_to_cond_assign() makes a big mess before we get to
unrolling. I was getting piglit regressions when switching to it for gen6
and below; unfortunately I have no hardware to debug it on.

>  And why only
> scalar? This pass assumes SSA, but so do many other passes in core
> NIR
> that we also run in nir_optimize(), so that shouldn't be a problem.

I need to retest to give you the exact reason but there was a lowering
pass that is not called for the vector backend that meant we still had
to deal with nir registers.


> 
> On Thu, Sep 15, 2016 at 3:03 AM, Timothy Arceri
>  wrote:
> > 
> > ---
> >  src/compiler/glsl/glsl_parser_extras.cpp | 12 +
> >  src/mesa/drivers/dri/i965/brw_compiler.c | 42
> > +++-
> >  src/mesa/drivers/dri/i965/brw_nir.c  | 23 +
> >  3 files changed, 55 insertions(+), 22 deletions(-)
> > 
> > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
> > b/src/compiler/glsl/glsl_parser_extras.cpp
> > index 436ddd0..a5c926a 100644
> > --- a/src/compiler/glsl/glsl_parser_extras.cpp
> > +++ b/src/compiler/glsl/glsl_parser_extras.cpp
> > @@ -2057,12 +2057,14 @@ do_common_optimization(exec_list *ir, bool
> > linked,
> > OPT(optimize_split_arrays, ir, linked);
> > OPT(optimize_redundant_jumps, ir);
> > 
> > -   loop_state *ls = analyze_loop_variables(ir);
> > -   if (ls->loop_found) {
> > -  OPT(set_loop_controls, ir, ls);
> > -  OPT(unroll_loops, ir, ls, options);
> > +   if (options->MaxUnrollIterations != 0) {
> > +  loop_state *ls = analyze_loop_variables(ir);
> > +  if (ls->loop_found) {
> > + OPT(set_loop_controls, ir, ls);
> > + OPT(unroll_loops, ir, ls, options);
> > +  }
> > +  delete ls;
> > }
> > -   delete ls;
> > 
> >  #undef OPT
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c
> > b/src/mesa/drivers/dri/i965/brw_compiler.c
> > index 86b1eaa..523b554 100644
> > --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> > +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> > @@ -43,18 +43,28 @@
> > .use_interpolated_input_intrinsics =
> > true, \
> > .vertex_id_zero_based = true
> > 
> > +#define
> > COMMON_SCALAR_OPTIONS  
> >    \
> > +   .lower_pack_half_2x16 =
> > true,  \
> > +   .lower_pack_snorm_2x16 =
> > true, \
> > +   .lower_pack_snorm_4x8 =
> > true,  \
> > +   .lower_pack_unorm_2x16 =
> > true, \
> > +   .lower_pack_unorm_4x8 =
> > true,  \
> > +   .lower_unpack_half_2x16 =
> > true,\
> > +   .lower_unpack_snorm_2x16 =
> > true,   \
> > +   .lower_unpack_snorm_4x8 =
> > true,\
> > +   .lower_unpack_unorm_2x16 =
> > true,   \
> > +   .lower_unpack_unorm_4x8 =
> > true \
> > +
> >  static const struct nir_shader_compiler_options scalar_nir_options
> > = {
> > COMMON_OPTIONS,
> > -   .lower_pack_half_2x16 = true,
> > -   .lower_pack_snorm_2x16 = true,
> > -   .lower_pack_snorm_4x8 = true,
> > -   .lower_pack_unorm_2x16 = true,
> > -   .lower_pack_unorm_4x8 = true,
> > -   .lower_unpack_half_2x16 = true,
> > -   .lower_unpack_snorm_2x16 = true,
> > -   .lower_unpack_snorm_4x8 = true,
> > -   .lower_unpack_unorm_2x16 = true,
> > -   .lower_unpack_unorm_4x8 = true,
> > +   COMMON_SCALAR_OPTIONS,
> > +   .max_unroll_iterations = 0,
> > +};
> > +
> > +static const struct nir_shader_compiler_options
> > scalar_nir_options_gen7 = {
> > +   COMMON_OPTIONS,
> > +   COMMON_SCALAR_OPTIONS,
> > +   .max_unroll_iterations = 32,
> >  };
> > 
> >  static const struct nir_shader_compiler_options vector_nir_options
> > = {
> > @@ -75,6 +85,7 @@ static const struct nir_shader_compiler_options
> > vector_nir_options = {
> > .lower_unpack_unorm_2x16 = true,
> > .lower_extract_byte = true,
> > .lower_extract_word = true,
> > +   .max_unroll_iterations = 0,
> >  };
> > 
> >  static const struct nir_shader_compiler_options
> > vector_nir_options_gen6 = {
> > @@ -92,6 +103,7 @@ static const struct nir_shader_compiler_options
> > vector_nir_options_gen6 = {
> > .lower_unpack_unorm_2x16 = true,
> > .lower_extract_byte = true,
> > .lower_extract_word = true,
> > +   .max_unroll_iterations = 0,
> >  };
> > 
> >  struct brw_compiler *
> > @@ -119,7 +131,6 @@ brw_compiler_create(void *mem_ctx, const struct
> > gen_device_info *devinfo)
> > 
> > /* We want the GL

Re: [Mesa-dev] [PATCH 01/10] i965: use nir_lower_indirect_derefs() for GLSL Gen7+

2016-09-15 Thread Timothy Arceri

On Thu, 2016-09-15 at 17:55 -0700, Jason Ekstrand wrote:
> > On Sep 15, 2016 4:31 PM, "Timothy Arceri"  wrote:
> >
> > On Thu, 2016-09-15 at 12:34 -0700, Jason Ekstrand wrote:
> > > > On Sep 15, 2016 12:05 AM, "Timothy Arceri"  wrote:
> > > >
> > > > This moves the nir_lower_indirect_derefs() call into
> > > > brw_preprocess_nir() so thats is called by both OpenGL and Vulkan
> > > > and removes that call to the old GLSL IR pass
> > > > lower_variable_index_to_cond_assign()
> > > >
> > > > We want to do this pass in nir to be able to move loop unrolling
> > > > to nir.
> > > >
> > > > There is a increase of 1-3 instructions in a small number of shaders,
> > > > and 2 Kerbal Space program shaders that increase by 32 instructions.
> > > >
> > > > Shader-db results BDW:
> > > >
> > > > total instructions in shared programs: 8705873 -> 8706194 (0.00%)
> > > > instructions in affected programs: 32515 -> 32836 (0.99%)
> > > > helped: 3
> > > > HURT: 79
> > > >
> > > > total cycles in shared programs: 74618120 -> 74583476 (-0.05%)
> > > > cycles in affected programs: 528104 -> 493460 (-6.56%)
> > > > helped: 47
> > > > HURT: 37
> > > >
> > > > LOST:   2
> > > > GAINED: 0
> > > > ---
> > > >  src/intel/vulkan/anv_pipeline.c        | 10 --
> > > >  src/mesa/drivers/dri/i965/brw_link.cpp | 26 ++----
> > > >  src/mesa/drivers/dri/i965/brw_nir.c    | 12
> > > >  3 files changed, 26 insertions(+), 22 deletions(-)
> > > >
> > > > diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> > > > index f96fe22..f292f0b 100644
> > > > --- a/src/intel/vulkan/anv_pipeline.c
> > > > +++ b/src/intel/vulkan/anv_pipeline.c
> > > > @@ -183,16 +183,6 @@ anv_shader_compile_to_nir(struct anv_device *device,
> > > >
> > > >     nir_shader_gather_info(nir, entry_point->impl);
> > > >
> > > > -   nir_variable_mode indirect_mask = 0;
> > > > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectInput)
> > > > -      indirect_mask |= nir_var_shader_in;
> > > > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectOutput)
> > > > -      indirect_mask |= nir_var_shader_out;
> > > > -   if (compiler->glsl_compiler_options[stage].EmitNoIndirectTemp)
> > > > -      indirect_mask |= nir_var_local;
> > > > -
> > > > -   nir_lower_indirect_derefs(nir, indirect_mask);
> > > > -
> > > >     return nir;
> > > >  }
> > > >
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > > index 2b1fa61..41791d4 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_link.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_link.cpp
> > > > @@ -139,18 +139,20 @@ process_glsl_ir(gl_shader_stage stage,
> > > >
> > > >     do_copy_propagation(shader->ir);
> > > >
> > > > -   bool lowered_variable_indexing =
> > > > -      lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> > > > -                                          shader->ir,
> > > > -                                          options->EmitNoIndirectInput,
> > > > -                                          options->EmitNoIndirectOutput,
> > > > -                                          options->EmitNoIndirectTemp,
> > > > -                                          options->EmitNoIndirectUniform);
> > > > -
> > > > -   if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
> > > > -      perf_debug("Unsupported form of variable indexing in %s; falling "
> > > > -                 "back to very inefficient code generation\n",
> > > > -                 _mesa_shader_stage_to_abbrev(shader->Stage));
> > > > +   if (brw->gen < 7) {
> > > > +      bool lowered_variable_indexing =
> > > > +         lower_variable_index_to_cond_assign((gl_shader_stage)stage,
> > > > +                                             shader->ir,
> > > > +                                             options->EmitNoIndirectInput,
> > > > +                                             options->EmitNoIndirectOutput,
> > > > +                                             options->EmitNoIndirectTemp,
> > > > +                                             options->EmitNoIndirectUniform);
> > > > +
> > > > +      if (unlikely(brw->perf_debug && lowered_variable_indexing)) {
> > > > +         perf_debug("Unsupported form of variable indexing in %s; falling "
> > > > +                    "back to very inefficient code generation\n",
> > > > +                    _mesa_shader_stage_to_abbrev(shader->Stage));
> > > > +      }
> > > >

Re: [Mesa-dev] [PATCH 10/10] i965: use nir unrolling for scalar backend Gen7+

2016-09-15 Thread Connor Abbott

On Thu, Sep 15, 2016 at 9:06 PM, Timothy Arceri
 wrote:
> On Thu, 2016-09-15 at 19:49 -0400, Connor Abbott wrote:
>> This seems a little dubious... why restrict it to gen7+?
>
> Because if we don't switch to using nir_lower_indirect_derefs() then
> lower_variable_index_to_cond_assign() makes a big mess before we get to
> unrolling. I was getting piglit regressions when switching to for gen6
> and below, unfortunately I have no hardware to debug it on.

Ok, probably should be in the commit message.

>
>>  And why only
>> scalar? This pass assumes SSA, but so do many other passes in core
>> NIR
>> that we also run in nir_optimize(), so that shouldn't be a problem.
>
> I need to retest to give you the exact reason but there was a lowering
> pass that is not called for the vector backend that meant we still had
> to deal with nir registers.

That doesn't quite sound right... we never generate registers in the
frontend (except if they're immediately lowered away), and we don't
lower to registers until pretty late in the process.

>
>
>>
>> On Thu, Sep 15, 2016 at 3:03 AM, Timothy Arceri
>>  wrote:
>> >
>> > ---
>> >  src/compiler/glsl/glsl_parser_extras.cpp | 12 +
>> >  src/mesa/drivers/dri/i965/brw_compiler.c | 42
>> > +++-
>> >  src/mesa/drivers/dri/i965/brw_nir.c  | 23 +
>> >  3 files changed, 55 insertions(+), 22 deletions(-)
>> >
>> > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
>> > b/src/compiler/glsl/glsl_parser_extras.cpp
>> > index 436ddd0..a5c926a 100644
>> > --- a/src/compiler/glsl/glsl_parser_extras.cpp
>> > +++ b/src/compiler/glsl/glsl_parser_extras.cpp
>> > @@ -2057,12 +2057,14 @@ do_common_optimization(exec_list *ir, bool
>> > linked,
>> > OPT(optimize_split_arrays, ir, linked);
>> > OPT(optimize_redundant_jumps, ir);
>> >
>> > -   loop_state *ls = analyze_loop_variables(ir);
>> > -   if (ls->loop_found) {
>> > -  OPT(set_loop_controls, ir, ls);
>> > -  OPT(unroll_loops, ir, ls, options);
>> > +   if (options->MaxUnrollIterations != 0) {
>> > +  loop_state *ls = analyze_loop_variables(ir);
>> > +  if (ls->loop_found) {
>> > + OPT(set_loop_controls, ir, ls);
>> > + OPT(unroll_loops, ir, ls, options);
>> > +  }
>> > +  delete ls;
>> > }
>> > -   delete ls;
>> >
>> >  #undef OPT
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c
>> > b/src/mesa/drivers/dri/i965/brw_compiler.c
>> > index 86b1eaa..523b554 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_compiler.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
>> > @@ -43,18 +43,28 @@
>> > .use_interpolated_input_intrinsics =
>> > true, \
>> > .vertex_id_zero_based = true
>> >
>> > +#define COMMON_SCALAR_OPTIONS             \
>> > +   .lower_pack_half_2x16 = true,          \
>> > +   .lower_pack_snorm_2x16 = true,         \
>> > +   .lower_pack_snorm_4x8 = true,          \
>> > +   .lower_pack_unorm_2x16 = true,         \
>> > +   .lower_pack_unorm_4x8 = true,          \
>> > +   .lower_unpack_half_2x16 = true,        \
>> > +   .lower_unpack_snorm_2x16 = true,       \
>> > +   .lower_unpack_snorm_4x8 = true,        \
>> > +   .lower_unpack_unorm_2x16 = true,       \
>> > +   .lower_unpack_unorm_4x8 = true         \
>> > +
>> >  static const struct nir_shader_compiler_options scalar_nir_options
>> > = {
>> > COMMON_OPTIONS,
>> > -   .lower_pack_half_2x16 = true,
>> > -   .lower_pack_snorm_2x16 = true,
>> > -   .lower_pack_snorm_4x8 = true,
>> > -   .lower_pack_unorm_2x16 = true,
>> > -   .lower_pack_unorm_4x8 = true,
>> > -   .lower_unpack_half_2x16 = true,
>> > -   .lower_unpack_snorm_2x16 = true,
>> > -   .lower_unpack_snorm_4x8 = true,
>> > -   .lower_unpack_unorm_2x16 = true,
>> > -   .lower_unpack_unorm_4x8 = true,
>> > +   COMMON_SCALAR_OPTIONS,
>> > +   .max_unroll_iterations = 0,
>> > +};
>> > +
>> > +static const struct nir_shader_compiler_options
>> > scalar_nir_options_gen7 = {
>> > +   COMMON_OPTIONS,
>> > +   COMMON_SCALAR_OPTIONS,
>> > +   .max_unroll_iterations = 32,
>> >  };
>> >
>> >  static const struct nir_shader_compiler_options vector_nir_options
>> > = {
>> > @@ -75,6 +85,7 @@ static const struct nir_shader_compiler_options
>> > vector_nir_options = {
>> > .lower_unpack_unorm_2x16 = true,
>> > .lower_extract_byte = true,
>> > .lower_extract_word = true,
>> > +   .max_unroll_iterations = 0,
>> >  };
>> >
>> >  static const struct nir_shader_compiler_options
>> > vector_nir_options_gen6 = {
>> > @@ -92,6 +103,7 @@ static const struct nir_sha

Re: [Mesa-dev] [PATCH 10/10] i965: use nir unrolling for scalar backend Gen7+

2016-09-15 Thread Timothy Arceri

On Thu, 2016-09-15 at 21:22 -0400, Connor Abbott wrote:
> On Thu, Sep 15, 2016 at 9:06 PM, Timothy Arceri
>  wrote:
> > 
> > On Thu, 2016-09-15 at 19:49 -0400, Connor Abbott wrote:
> > > 
> > > This seems a little dubious... why restrict it to gen7+?
> > 
> > Because if we don't switch to using nir_lower_indirect_derefs()
> > then
> > lower_variable_index_to_cond_assign() makes a big mess before we
> > get to
> > unrolling. I was getting piglit regressions when switching to for
> > gen6
> > and below, unfortunately I have no hardware to debug it on.
> 
> Ok, probably should be in the commit message.

I've taken another look. The regression is below. This seems to happen
because Gen6 can't handle sampler indirects and requires loops that
contain them to be unrolled (this seems like a bug, since a loop can't
always be unrolled). It seems this was failing in version 2 of my series
because the test has an iteration count of 32 and my limit for unrolling
was < 32. I changed it to li->trip_count > max_iter in version 3.

Test: piglit.spec.!opengl 2_0.max-samplers
Status: fail
Platform/arch:
snb/m64, g965/m64, ilk/m64, g45/m64
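
For illustration only, here is a minimal sketch of the boundary case
described above. The helper names are hypothetical and this is not the code
from the series; only trip_count and max_iter come from the mail. It
compares a strict "< limit" check with a check that bails only when
trip_count > max_iter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helpers; only trip_count and max_iter come from the mail. */

/* Strict check: a loop qualifies only when its trip count is below the
 * limit, so a 32-iteration loop is rejected with a limit of 32. */
static bool can_unroll_strict(unsigned trip_count, unsigned max_iter)
{
   return trip_count < max_iter;
}

/* Inclusive check: bail only when trip_count > max_iter, so a loop that
 * hits the limit exactly still qualifies. */
static bool can_unroll_inclusive(unsigned trip_count, unsigned max_iter)
{
   return !(trip_count > max_iter);
}
```

With a limit of 32, the strict form rejects a loop of exactly 32 iterations
while the inclusive form accepts it, which matches the 32-iteration test
case mentioned above.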

> 
> > 
> > 
> > > 
> > >  And why only
> > > scalar? This pass assumes SSA, but so do many other passes in
> > > core
> > > NIR
> > > that we also run in nir_optimize(), so that shouldn't be a
> > > problem.
> > 
> > I need to retest to give you the exact reason but there was a
> > lowering
> > pass that is not called for the vector backend that meant we still
> > had
> > to deal with nir registers.
> 
> That doesn't quite sound right... we never generate registers in the
> frontend (except if they're immediately lowered away), and we don't
> lower to registers until pretty late in the process.

I could very well be recalling the problem incorrectly. I've just
enabled it for all stages on my Ivy Bridge and run shader-db without
crashing, so it's possible this was caused by a bug I've now resolved.

I'll enable it and push it to Jenkins to be sure. If everything checks
out ok I'll send some new patches to enable nir unrolling everywhere.
 
> 
> > 
> > 
> > 
> > > 
> > > 
> > > On Thu, Sep 15, 2016 at 3:03 AM, Timothy Arceri
> > >  wrote:
> > > > 
> > > > 
> > > > ---
> > > >  src/compiler/glsl/glsl_parser_extras.cpp | 12 +
> > > >  src/mesa/drivers/dri/i965/brw_compiler.c | 42
> > > > +++-
> > > >  src/mesa/drivers/dri/i965/brw_nir.c  | 23 +---
> > > > -
> > > >  3 files changed, 55 insertions(+), 22 deletions(-)
> > > > 
> > > > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
> > > > b/src/compiler/glsl/glsl_parser_extras.cpp
> > > > index 436ddd0..a5c926a 100644
> > > > --- a/src/compiler/glsl/glsl_parser_extras.cpp
> > > > +++ b/src/compiler/glsl/glsl_parser_extras.cpp
> > > > @@ -2057,12 +2057,14 @@ do_common_optimization(exec_list *ir,
> > > > bool
> > > > linked,
> > > > OPT(optimize_split_arrays, ir, linked);
> > > > OPT(optimize_redundant_jumps, ir);
> > > > 
> > > > -   loop_state *ls = analyze_loop_variables(ir);
> > > > -   if (ls->loop_found) {
> > > > -  OPT(set_loop_controls, ir, ls);
> > > > -  OPT(unroll_loops, ir, ls, options);
> > > > +   if (options->MaxUnrollIterations != 0) {
> > > > +  loop_state *ls = analyze_loop_variables(ir);
> > > > +  if (ls->loop_found) {
> > > > + OPT(set_loop_controls, ir, ls);
> > > > + OPT(unroll_loops, ir, ls, options);
> > > > +  }
> > > > +  delete ls;
> > > > }
> > > > -   delete ls;
> > > > 
> > > >  #undef OPT
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > b/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > index 86b1eaa..523b554 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > @@ -43,18 +43,28 @@
> > > > .use_interpolated_input_intrinsics =
> > > > true, \
> > > > .vertex_id_zero_based = true
> > > > 
> > > > +#define
> > > > COMMON_SCALAR_OPTIONS
> > > >    \
> > > > +   .lower_pack_half_2x16 =
> > > > true,  \
> > > > +   .lower_pack_snorm_2x16 =
> > > > true, \
> > > > +   .lower_pack_snorm_4x8 =
> > > > true,  \
> > > > +   .lower_pack_unorm_2x16 =
> > > > true, \
> > > > +   .lower_pack_unorm_4x8 =
> > > > true,  \
> > > > +   .lower_unpack_half_2x16 =
> > > > true,\
> > > > +   .lower_unpack_snorm_2x16 =
> > > > true,   \
> > > > +   .lower_unpack_snorm_4x8 =
> > > > true,\
> > > > +   .lower_unpack_unorm_2x16 =
> > > > true,   \
> > > > +   .lower_unpack_unorm_4x8 =
> > > > true

[Mesa-dev] [PATCH] nir/spirv: Bring back the spirv2nir helper binary

2016-09-15 Thread Jason Ekstrand

This was something that I wrote in the early days of the spirv_to_nir code
but deleted once we had a real driver.  However, in the absence of a
shader_runner equivalent, it's extremely useful for debugging the
spirv_to_nir code so let's bring it back.
---
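
Note (a sketch, not part of this patch): the helper below relies on assert()
for error handling and compares the mmap() result against NULL, whereas
mmap() reports failure with MAP_FAILED. A more defensive way to load the
words might look like the following. The function name map_words is
hypothetical.

```c
#define _POSIX_C_SOURCE 200809L /* expose POSIX declarations under strict -std modes */

#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper, not part of the patch: map a file of 32-bit words
 * read-only.  Returns NULL on any failure; on success *word_count holds
 * the number of 32-bit words in the file. */
static const uint32_t *map_words(const char *path, size_t *word_count)
{
   int fd = open(path, O_RDONLY);
   if (fd < 0)
      return NULL;

   struct stat st;
   if (fstat(fd, &st) != 0 || st.st_size == 0 || st.st_size % 4 != 0) {
      close(fd);
      return NULL;
   }

   /* mmap() signals failure with MAP_FAILED, not NULL. */
   void *map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
   close(fd); /* the mapping stays valid after the descriptor is closed */
   if (map == MAP_FAILED)
      return NULL;

   *word_count = (size_t)st.st_size / 4;
   return map;
}
```

A caller would check the return value for NULL before handing the words and
the word count on to the converter.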
 src/compiler/Makefile.nir.am   | 17 +
 src/compiler/spirv/spirv2nir.c | 55 ++
 2 files changed, 72 insertions(+)
 create mode 100644 src/compiler/spirv/spirv2nir.c

diff --git a/src/compiler/Makefile.nir.am b/src/compiler/Makefile.nir.am
index 9aac214..69ff7b1 100644
--- a/src/compiler/Makefile.nir.am
+++ b/src/compiler/Makefile.nir.am
@@ -53,6 +53,23 @@ nir/nir_opt_algebraic.c: nir/nir_opt_algebraic.py 
nir/nir_algebraic.py
$(MKDIR_GEN)
$(PYTHON_GEN) $(srcdir)/nir/nir_opt_algebraic.py > $@ || ($(RM) $@; 
false)
 
+noinst_PROGRAMS += spirv2nir
+
+spirv2nir_SOURCES = \
+   spirv/spirv2nir.c
+
+spirv2nir_CPPFLAGS =   \
+   $(AM_CPPFLAGS)  \
+   -I$(top_builddir)/src/compiler/nir  \
+   -I$(top_srcdir)/src/compiler/nir\
+   -I$(top_srcdir)/src/compiler/spirv
+
+spirv2nir_LDADD =  \
+   nir/libnir.la   \
+   $(top_builddir)/src/util/libmesautil.la \
+   -lm -lstdc++\
+   $(PTHREAD_LIBS)
+
 
 check_PROGRAMS += nir/tests/control_flow_tests
 
diff --git a/src/compiler/spirv/spirv2nir.c b/src/compiler/spirv/spirv2nir.c
new file mode 100644
index 000..c837186
--- /dev/null
+++ b/src/compiler/spirv/spirv2nir.c
@@ -0,0 +1,55 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Jason Ekstrand (ja...@jlekstrand.net)
+ *
+ */
+
+/*
+ * A simple executable that opens a SPIR-V shader, converts it to NIR, and
+ * dumps out the result.  This should be useful for testing the
+ * spirv_to_nir code.
+ */
+
+#include "spirv/nir_spirv.h"
+
+#include 
+#include 
+#include 
+#include 
+
+int main(int argc, char **argv)
+{
+   int fd = open(argv[1], O_RDONLY);
+   off_t len = lseek(fd, 0, SEEK_END);
+
+   assert(len % 4 == 0);
+   size_t word_count = len / 4;
+
+   const void *map = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
+   assert(map != NULL);
+
+   nir_function *func = spirv_to_nir(map, word_count, NULL, 0,
+ MESA_SHADER_FRAGMENT, "main", NULL);
+   nir_print_shader(func->shader, stderr);
+}
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC 1/2] glsl/list: Add an iteration helper that starts after the given node

2016-09-15 Thread Jason Ekstrand

---
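
Note (a sketch, not part of this patch): the idea behind the new helper can
be shown on a small stand-alone list. All names below are hypothetical and
only mirror the shape of the macro in the diff: iteration starts at the node
following a given one, and the successor is saved before the body runs so
the current node may be unlinked safely.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-in for a sentinel-terminated list; all names here are
 * hypothetical. */
struct node {
   struct node *next, *prev;
   int value;
};

struct list {
   struct node head; /* head sentinel: head.prev == NULL */
   struct node tail; /* tail sentinel: tail.next == NULL */
};

static void list_init(struct list *l)
{
   l->head.prev = NULL;
   l->head.next = &l->tail;
   l->tail.prev = &l->head;
   l->tail.next = NULL;
}

static void list_push_tail(struct list *l, struct node *n)
{
   n->next = &l->tail;
   n->prev = l->tail.prev;
   l->tail.prev->next = n;
   l->tail.prev = n;
}

static void node_remove(struct node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->next = n->prev = NULL;
}

/* Visit every real node after `after` (which must not be the tail
 * sentinel).  `n` may be unlinked inside the body because its successor
 * was saved beforehand. */
#define list_for_each_safe_after(n, after)                      \
   for (struct node *n = (after)->next, *n##_next = n->next;    \
        n##_next != NULL;                                       \
        n = n##_next, n##_next = n->next)

/* Move everything that follows `after` to the tail of `dst`. */
static void move_tail(struct node *after, struct list *dst)
{
   list_for_each_safe_after(n, after) {
      node_remove(n);
      list_push_tail(dst, n);
   }
}
```

Calling move_tail() on the second node of a five-node list leaves two nodes
in the source list and moves the remaining three, in order, to the
destination list.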
 src/compiler/glsl/list.h | 9 +
 1 file changed, 9 insertions(+)

diff --git a/src/compiler/glsl/list.h b/src/compiler/glsl/list.h
index b5b5b36..8371519 100644
--- a/src/compiler/glsl/list.h
+++ b/src/compiler/glsl/list.h
@@ -714,6 +714,15 @@ inline void exec_node::insert_before(exec_list *before)
 __node = __next, __next =  \
exec_node_data(__type, (__next)->__field.next, __field))
 
+#define foreach_list_typed_safe_after(__type, __node, __field, __after)\
+   for (__type * __node =  \
+   exec_node_data(__type, (__after)->__field.next, __field),   \
+   * __next =  \
+   exec_node_data(__type, (__node)->__field.next, __field);\
+(__node)->__field.next != NULL;\
+__node = __next, __next =  \
+   exec_node_data(__type, (__next)->__field.next, __field))
+
 #define foreach_list_typed_reverse_safe(__type, __node, __field, __list)   \
for (__type * __node =  \
exec_node_data(__type, (__list)->tail_sentinel.prev, __field),  \
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH 10/10] i965: use nir unrolling for scalar backend Gen7+

2016-09-15 Thread Timothy Arceri

On Fri, 2016-09-16 at 12:17 +1000, Timothy Arceri wrote:
> On Thu, 2016-09-15 at 21:22 -0400, Connor Abbott wrote:
> > 
> > On Thu, Sep 15, 2016 at 9:06 PM, Timothy Arceri
> >  wrote:
> > > 
> > > 
> > > On Thu, 2016-09-15 at 19:49 -0400, Connor Abbott wrote:
> > > > 
> > > > 
> > > > This seems a little dubious... why restrict it to gen7+?
> > > 
> > > Because if we don't switch to using nir_lower_indirect_derefs()
> > > then
> > > lower_variable_index_to_cond_assign() makes a big mess before we
> > > get to
> > > unrolling. I was getting piglit regressions when switching to for
> > > gen6
> > > and below, unfortunately I have no hardware to debug it on.
> > 
> > Ok, probably should be in the commit message.
> 
> I've taken another look. The regression is below, this seems to
> happen
> because Gen6 can't handle sampler indirects and requires loops that
> contain them to be unrolled (this seems like a bug the loop can't
> always be unrolled). It seems this was failing in my series 2 because
> the test has an interations of 32 and my limit for unrolling was < 32
> I
> changed it to li->trip_count > max_iter in series 3.
> 
> Test: piglit.spec.!opengl 2_0.max-samplers
> Status: fail
> Platform/arch:
>   snb/m64, g965/m64, ilk/m64, g45/m64
> 
> > 
> > 
> > > 
> > > 
> > > 
> > > > 
> > > > 
> > > >  And why only
> > > > scalar? This pass assumes SSA, but so do many other passes in
> > > > core
> > > > NIR
> > > > that we also run in nir_optimize(), so that shouldn't be a
> > > > problem.
> > > 
> > > I need to retest to give you the exact reason but there was a
> > > lowering
> > > pass that is not called for the vector backend that meant we
> > > still
> > > had
> > > to deal with nir registers.
> > 
> > That doesn't quite sound right... we never generate registers in
> > the
> > frontend (except if they're immediately lowered away), and we don't
> > lower to registers until pretty late in the process.
> 
> I could very well be recalling the problem incorrectly. I've just
> enabled it for all stages on my ivy bridge and run shader-db without
> crashing so it's possible this was cause by a bug I've now resolved. 
> 
> I'll enable it and push it to jenkins to be sure. If everything
> checks
> out ok I'll send some new patches to enable nir unrolling everywhere.

It regressed some piglit tests. The problem was that the loop analysis pass
was getting called when it shouldn't be, as Thomas mistakenly used a
decimal value rather than a hex value:

nir_metadata_loop_analysis = 0x16,

So we would call it after we have converted from ssa.
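
To make the failure mode concrete, here is a small stand-alone sketch. The
enum names and the first four values are assumptions for illustration; only
the 0x16 value is taken from above, and 0x10 is presumably what was
intended. Because 0x16 is 0x10 | 0x4 | 0x2, a request for either of the
flags using 0x2 or 0x4 also matches the loop-analysis flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit flags; the names and the first four values are
 * assumptions for this sketch, only 0x16 comes from the mail. */
enum metadata_flags {
   META_BLOCK_INDEX = 0x1,
   META_DOMINANCE   = 0x2,
   META_LIVE_DEFS   = 0x4,
   META_NOT_RESET   = 0x8,
   META_LOOP_BAD    = 0x16, /* decimal 16 with a 0x prefix: 0x10 | 0x4 | 0x2 */
   META_LOOP_GOOD   = 0x10, /* a single, otherwise unused bit */
};

/* True when a request for `required` metadata would also trigger the
 * analysis guarded by `flag`. */
static bool triggers(unsigned required, unsigned flag)
{
   return (required & flag) != 0;
}
```

With the 0x16 value, triggers(META_DOMINANCE, META_LOOP_BAD) is true, so any
pass asking only for the 0x2 flag would also run the loop analysis; with
0x10 it is false.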


>  
> > 
> > 
> > > 
> > > 
> > > 
> > > 
> > > > 
> > > > 
> > > > 
> > > > On Thu, Sep 15, 2016 at 3:03 AM, Timothy Arceri
> > > >  wrote:
> > > > > 
> > > > > 
> > > > > 
> > > > > ---
> > > > >  src/compiler/glsl/glsl_parser_extras.cpp | 12 +
> > > > >  src/mesa/drivers/dri/i965/brw_compiler.c | 42
> > > > > +++-
> > > > >  src/mesa/drivers/dri/i965/brw_nir.c  | 23 +-
> > > > > --
> > > > > -
> > > > >  3 files changed, 55 insertions(+), 22 deletions(-)
> > > > > 
> > > > > diff --git a/src/compiler/glsl/glsl_parser_extras.cpp
> > > > > b/src/compiler/glsl/glsl_parser_extras.cpp
> > > > > index 436ddd0..a5c926a 100644
> > > > > --- a/src/compiler/glsl/glsl_parser_extras.cpp
> > > > > +++ b/src/compiler/glsl/glsl_parser_extras.cpp
> > > > > @@ -2057,12 +2057,14 @@ do_common_optimization(exec_list *ir,
> > > > > bool
> > > > > linked,
> > > > > OPT(optimize_split_arrays, ir, linked);
> > > > > OPT(optimize_redundant_jumps, ir);
> > > > > 
> > > > > -   loop_state *ls = analyze_loop_variables(ir);
> > > > > -   if (ls->loop_found) {
> > > > > -  OPT(set_loop_controls, ir, ls);
> > > > > -  OPT(unroll_loops, ir, ls, options);
> > > > > +   if (options->MaxUnrollIterations != 0) {
> > > > > +  loop_state *ls = analyze_loop_variables(ir);
> > > > > +  if (ls->loop_found) {
> > > > > + OPT(set_loop_controls, ir, ls);
> > > > > + OPT(unroll_loops, ir, ls, options);
> > > > > +  }
> > > > > +  delete ls;
> > > > > }
> > > > > -   delete ls;
> > > > > 
> > > > >  #undef OPT
> > > > > 
> > > > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > > b/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > > index 86b1eaa..523b554 100644
> > > > > --- a/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > > +++ b/src/mesa/drivers/dri/i965/brw_compiler.c
> > > > > @@ -43,18 +43,28 @@
> > > > > .use_interpolated_input_intrinsics =
> > > > > true, \
> > > > > .vertex_id_zero_based = true
> > > > > 
> > > > > +#define
> > > > > COMMON_SCALAR_OPTIONS
> > > > >    \
> > > > > +   .lower_pack_half_2x16 =
> > > > > true,  \
> > > > > +   .lower_pack_snorm_2x16 =
> > > > > true, \
> > > > > +   .lower_pack_snorm_4x8 =
> > > > > true,

[Mesa-dev] [RFC 2/2] nir: When splitting blocks, always put the new block after the old

2016-09-15 Thread Jason Ekstrand

This makes block splitting happen a bit more deterministically.  In
particular, if using nir_builder to build a shader, it means that you can
always save off nir_cursor_current_block and then add another CF node after
that without fear that the nir_block pointer you just saved will get
replaced out from under you.
---
 src/compiler/nir/nir.h  |  2 +
 src/compiler/nir/nir_control_flow.c | 97 +
 2 files changed, 35 insertions(+), 64 deletions(-)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 6f059720..9ab8440 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -1511,6 +1511,8 @@ nir_block_last_instr(nir_block *block)
foreach_list_typed_reverse(nir_instr, instr, node, &(block)->instr_list)
 #define nir_foreach_instr_safe(instr, block) \
foreach_list_typed_safe(nir_instr, instr, node, &(block)->instr_list)
+#define nir_foreach_instr_safe_after(instr, block, after) \
+   foreach_list_typed_safe_after(nir_instr, instr, node, after)
 #define nir_foreach_instr_reverse_safe(instr, block) \
foreach_list_typed_reverse_safe(nir_instr, instr, node, 
&(block)->instr_list)
 
diff --git a/src/compiler/nir/nir_control_flow.c 
b/src/compiler/nir/nir_control_flow.c
index 1ff7a53..7210197 100644
--- a/src/compiler/nir/nir_control_flow.c
+++ b/src/compiler/nir/nir_control_flow.c
@@ -178,60 +178,6 @@ link_block_to_non_block(nir_block *block, nir_cf_node 
*node)
 
 }
 
-/**
- * Replace a block's successor with a different one.
- */
-static void
-replace_successor(nir_block *block, nir_block *old_succ, nir_block *new_succ)
-{
-   if (block->successors[0] == old_succ) {
-  block->successors[0] = new_succ;
-   } else {
-  assert(block->successors[1] == old_succ);
-  block->successors[1] = new_succ;
-   }
-
-   block_remove_pred(old_succ, block);
-   block_add_pred(new_succ, block);
-}
-
-/**
- * Takes a basic block and inserts a new empty basic block before it, making 
its
- * predecessors point to the new block. This essentially splits the block into
- * an empty header and a body so that another non-block CF node can be inserted
- * between the two. Note that this does *not* link the two basic blocks, so
- * some kind of cleanup *must* be performed after this call.
- */
-
-static nir_block *
-split_block_beginning(nir_block *block)
-{
-   nir_block *new_block = nir_block_create(ralloc_parent(block));
-   new_block->cf_node.parent = block->cf_node.parent;
-   exec_node_insert_node_before(&block->cf_node.node, 
&new_block->cf_node.node);
-
-   struct set_entry *entry;
-   set_foreach(block->predecessors, entry) {
-  nir_block *pred = (nir_block *) entry->key;
-  replace_successor(pred, block, new_block);
-   }
-
-   /* Any phi nodes must stay part of the new block, or else their
-* sourcse will be messed up. This will reverse the order of the phi's, but
-* order shouldn't matter.
-*/
-   nir_foreach_instr_safe(instr, block) {
-  if (instr->type != nir_instr_type_phi)
- break;
-
-  exec_node_remove(&instr->node);
-  instr->block = new_block;
-  exec_list_push_head(&new_block->instr_list, &instr->node);
-   }
-
-   return new_block;
-}
-
 static void
 rewrite_phi_preds(nir_block *block, nir_block *old_pred, nir_block *new_pred)
 {
@@ -359,6 +305,28 @@ block_add_normal_succs(nir_block *block)
 }
 
 static nir_block *
+split_block_beginning(nir_block *block)
+{
+   nir_block *new_block = nir_block_create(ralloc_parent(block));
+   new_block->cf_node.parent = block->cf_node.parent;
+   exec_node_insert_after(&block->cf_node.node, &new_block->cf_node.node);
+
+   /* Move everything except the phis to the new block */
+   nir_foreach_instr_safe(cur_instr, block) {
+  if (cur_instr->type == nir_instr_type_phi)
+ continue;
+
+  exec_node_remove(&cur_instr->node);
+  cur_instr->block = new_block;
+  exec_list_push_tail(&new_block->instr_list, &cur_instr->node);
+   }
+
+   move_successors(block, new_block);
+
+   return new_block;
+}
+
+static nir_block *
 split_block_end(nir_block *block)
 {
nir_block *new_block = nir_block_create(ralloc_parent(block));
@@ -381,12 +349,13 @@ static nir_block *
 split_block_before_instr(nir_instr *instr)
 {
assert(instr->type != nir_instr_type_phi);
-   nir_block *new_block = split_block_beginning(instr->block);
+   nir_block *block = instr->block;
 
-   nir_foreach_instr_safe(cur_instr, instr->block) {
-  if (cur_instr == instr)
- break;
+   nir_block *new_block = nir_block_create(ralloc_parent(block));
+   new_block->cf_node.parent = block->cf_node.parent;
+   exec_node_insert_after(&block->cf_node.node, &new_block->cf_node.node);
 
+   nir_foreach_instr_safe_after(cur_instr, instr->block, instr) {
   exec_node_remove(&cur_instr->node);
   cur_instr->block = new_block;
   exec_list_push_tail(&new_block->instr_list, &cur_instr->node);
@@ -409,8 +378,8 @@ split_block_cursor(nir_cursor cursor,
nir_bl

[Mesa-dev] [PATCH 2/2] nir/spirv: Use a nop intrinsic for tagging the ends of blocks

2016-09-15 Thread Jason Ekstrand

Previously, we were saving off the last nir_block in a vtn_block before
moving on so that we could find the nir_block again when it came time to
handle phi sources.  Unfortunately, NIR's control flow modification code is
inconsistent when it comes to how it splits blocks so the block pointer we
saved off may point to a block somewhere else in the shader by the time we
get around to handling phi sources.  In order to get around this, we insert
a nop instruction and use that as the logical end of our block.  Since the
control flow manipulation code respects instructions, the nop will keep
its place like any other instruction and we can easily find the end of our
block when we need it.

Signed-off-by: Jason Ekstrand 
---
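
Note (a toy model, not NIR code): the reasoning above can be sketched with
instructions that record their owning block. All names are hypothetical.
When a block is split and the tail moves to a new block, a saved block
pointer no longer identifies where the end is, but a marker instruction
moves with the tail and still does.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not NIR: every instruction records the block that owns it. */
struct block;

struct instr {
   struct block *block;
   int is_marker; /* stands in for the nop intrinsic */
};

#define MAX_INSTRS 8

struct block {
   struct instr *instrs[MAX_INSTRS];
   size_t count;
};

static void block_append(struct block *b, struct instr *i)
{
   assert(b->count < MAX_INSTRS);
   i->block = b;
   b->instrs[b->count++] = i;
}

/* Split `b` at index `at`: instructions [at, count) move to the empty
 * block `tail`, and their owner pointers are updated. */
static void block_split(struct block *b, size_t at, struct block *tail)
{
   assert(at <= b->count && tail->count == 0);
   for (size_t k = at; k < b->count; k++)
      block_append(tail, b->instrs[k]);
   b->count = at;
}

/* The logical end of the block is wherever the marker lives now. */
static struct block *logical_end(const struct instr *marker)
{
   return marker->block;
}
```

After a split, logical_end(marker) returns the new block that holds the
marker, while a block pointer saved before the split still refers to the
old, now shorter, block.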
 src/compiler/spirv/vtn_cfg.c | 6 --
 src/compiler/spirv/vtn_private.h | 4 ++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
index d9096f4..599ed69 100644
--- a/src/compiler/spirv/vtn_cfg.c
+++ b/src/compiler/spirv/vtn_cfg.c
@@ -518,7 +518,7 @@ vtn_handle_phi_second_pass(struct vtn_builder *b, SpvOp 
opcode,
   struct vtn_block *pred =
  vtn_value(b, w[i + 1], vtn_value_type_block)->block;
 
-  b->nb.cursor = nir_after_block_before_jump(pred->end_block);
+  b->nb.cursor = nir_after_instr(&pred->end_nop->instr);
 
   vtn_local_store(b, src, nir_deref_var_create(b, phi_var));
}
@@ -576,7 +576,9 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head 
*cf_list,
 
  vtn_foreach_instruction(b, block_start, block_end, handler);
 
- block->end_block = nir_cursor_current_block(b->nb.cursor);
+ block->end_nop = nir_intrinsic_instr_create(b->nb.shader,
+ nir_intrinsic_nop);
+ nir_builder_instr_insert(&b->nb, &block->end_nop->instr);
 
  if ((*block->branch & SpvOpCodeMask) == SpvOpReturnValue) {
 struct vtn_ssa_value *src = vtn_ssa_value(b, block->branch[1]);
diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
index 7f5444e..6f34f09 100644
--- a/src/compiler/spirv/vtn_private.h
+++ b/src/compiler/spirv/vtn_private.h
@@ -149,8 +149,8 @@ struct vtn_block {
/** Points to the switch case started by this block (if any) */
struct vtn_case *switch_case;
 
-   /** The last block in this SPIR-V block. */
-   nir_block *end_block;
+   /** Every block ends in a nop intrinsic so that we can find it again */
+   nir_intrinsic_instr *end_nop;
 };
 
 struct vtn_function {
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] nir: Add a nop intrinsic

2016-09-15 Thread Jason Ekstrand

This intrinsic has no destination, no sources, no variables, and can be
eliminated.  In other words, it does nothing and will always get deleted by
dead code elimination.  However, it does provide a quick-and-easy way to
temporarily tag a particular location in a NIR shader.
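
As a rough illustration of why such an instruction is always removable, here
is a self-contained toy version of an intrinsic table. It is not the real
nir_intrinsics.h mechanism: the TOY_ names, the reduced column set (name,
source count, has-destination, flags) and the toy_is_dead() helper are all
assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_CAN_ELIMINATE (1u << 0)
#define TOY_CAN_REORDER   (1u << 1)

/* One row per intrinsic: name, number of sources, has destination, flags. */
#define TOY_INTRINSICS(X)                                        \
   X(nop,       0, false, TOY_CAN_ELIMINATE | TOY_CAN_REORDER)   \
   X(load_var,  0, true,  TOY_CAN_ELIMINATE)                     \
   X(store_var, 1, false, 0u)

/* Expand the list once into an enum of opcodes... */
enum toy_intrinsic {
#define X(name, srcs, has_dest, flags) toy_intrinsic_##name,
   TOY_INTRINSICS(X)
#undef X
   toy_num_intrinsics
};

struct toy_intrinsic_info {
   const char *name;
   unsigned num_srcs;
   bool has_dest;
   unsigned flags;
};

/* ...and once more into a table of per-opcode properties. */
static const struct toy_intrinsic_info toy_infos[toy_num_intrinsics] = {
#define X(name, srcs, has_dest, flags) { #name, srcs, has_dest, flags },
   TOY_INTRINSICS(X)
#undef X
};

/* Simplified dead-code test: an instruction may be removed when it is
 * marked eliminable and nothing consumes a result from it. */
static bool toy_is_dead(enum toy_intrinsic op, unsigned num_uses)
{
   const struct toy_intrinsic_info *info = &toy_infos[op];
   return (info->flags & TOY_CAN_ELIMINATE) != 0 && num_uses == 0;
}
```

Because the toy nop has no destination it can never have uses, so
toy_is_dead(toy_intrinsic_nop, 0) holds as soon as the marker is no longer
needed, whereas a store is never considered dead by this test.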

Signed-off-by: Jason Ekstrand 
---
 src/compiler/nir/nir_intrinsics.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h
index b27a148..1d18466 100644
--- a/src/compiler/nir/nir_intrinsics.h
+++ b/src/compiler/nir/nir_intrinsics.h
@@ -41,6 +41,9 @@
 
 #define ARR(...) { __VA_ARGS__ }
 
+INTRINSIC(nop, 0, ARR(0), false, 0, 0, 0, xx, xx, xx,
+  NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+
 INTRINSIC(load_var, 0, ARR(0), true, 0, 1, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE)
 INTRINSIC(store_var, 1, ARR(0), false, 0, 1, 1, WRMASK, xx, xx, 0)
 INTRINSIC(copy_var, 0, ARR(0), false, 0, 2, 0, xx, xx, xx, 0)
-- 
2.5.0.400.gff86faf


Re: [Mesa-dev] [PATCH 01/10] i965: use nir_lower_indirect_derefs() for GLSL Gen7+

2016-09-15 Thread Kenneth Graunke

On Friday, September 16, 2016 11:10:01 AM PDT Timothy Arceri wrote:
> On Thu, 2016-09-15 at 17:55 -0700, Jason Ekstrand wrote:
> > > On Sep 15, 2016 4:31 PM, "Timothy Arceri"  
> > > wrote:
> > > On Thu, 2016-09-15 at 12:34 -0700, Jason Ekstrand wrote:
> > > > > On Sep 15, 2016 12:05 AM, "Timothy Arceri" 
> > > > >  wrote:
> > > > > +   if (compiler->devinfo->gen > 6) {
> > > > I think you want "> 7" here
> > >
> > > It can be used with gen 7 and up. I could change it to >= 7 if that is
> > > easier to parse but I think > 6 is functionally correct.
> >
> > Hunh? You use GLSL for gen7. Why are we duplicating? Also, why can't
> > this be used on sandy bridge and earlier?
> 
> GLSL IR is used for the non-scalar stages on gen7. See my answers to
> Connor's questions:
> https://lists.freedesktop.org/archives/mesa-dev/2016-September/129108.html

If you're trying to use GLSL IR lowering for vector stages and NIR
lowering for scalar stages, why not just use:

   compiler->scalar_stage[nir->stage]

rather than generation checks?
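
To make the difference concrete, here is a small stand-alone sketch. The
toy_compiler struct, the toy_stage enum and both helper functions are
hypothetical stand-ins, and the configuration used in the note below (a gen7
part with a vector vertex stage and a scalar fragment stage) is an assumption
for illustration, not a statement about any particular hardware setup.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_stage {
   TOY_STAGE_VERTEX,
   TOY_STAGE_FRAGMENT,
   TOY_STAGE_COUNT
};

/* Hypothetical stand-in for the compiler object discussed in the thread. */
struct toy_compiler {
   int gen;                              /* hardware generation */
   bool scalar_stage[TOY_STAGE_COUNT];   /* true if the stage uses the scalar backend */
};

/* Generation check: one answer for every stage on a given generation. */
static bool toy_use_nir_lowering_by_gen(const struct toy_compiler *c)
{
   return c->gen > 6;
}

/* Per-stage check: follows whichever backend the stage actually uses. */
static bool toy_use_nir_lowering_by_stage(const struct toy_compiler *c,
                                          enum toy_stage stage)
{
   return c->scalar_stage[stage];
}
```

With gen set to 7, scalar_stage[TOY_STAGE_VERTEX] false and
scalar_stage[TOY_STAGE_FRAGMENT] true, the generation check answers "yes" for
both stages, while the per-stage check answers "no" for the vertex stage and
"yes" for the fragment stage, which matches the stated intent of using NIR
lowering only for scalar stages.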



Re: [Mesa-dev] [PATCH 2/2] nir/spirv: Use a nop intrinsic for tagging the ends of blocks

2016-09-15 Thread Dave Airlie

On 16 September 2016 at 14:16, Jason Ekstrand  wrote:
> Previously, we were saving off the last nir_block in a vtn_block before
> moving on so that we could find the nir_block again when it came time to
> handle phi sources.  Unfortunately, NIR's control flow modification code is
> inconsistent when it comes to how it splits blocks so the block pointer we
> saved off may point to a block somewhere else in the shader by the time we
> get around to handling phi sources.  In order to get around this, we insert
> a nop instruction and use that as the logical end of our block.  Since the
> control flow manipulation code respects instructions, the nop will keep
> its place like any other instruction and we can easily find the end of our
> block when we need it.
>
> Signed-off-by: Jason Ekstrand 

I'm not sure I'm good enough to review it, but this makes sense after
I looked through it.

It also fixes vkQuake on radv.

Tested-by: Dave Airlie 
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=97233

Dave.

> ---
>  src/compiler/spirv/vtn_cfg.c | 6 ++++--
>  src/compiler/spirv/vtn_private.h | 4 ++--
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c
> index d9096f4..599ed69 100644
> --- a/src/compiler/spirv/vtn_cfg.c
> +++ b/src/compiler/spirv/vtn_cfg.c
> @@ -518,7 +518,7 @@ vtn_handle_phi_second_pass(struct vtn_builder *b, SpvOp opcode,
>struct vtn_block *pred =
>   vtn_value(b, w[i + 1], vtn_value_type_block)->block;
>
> -  b->nb.cursor = nir_after_block_before_jump(pred->end_block);
> +  b->nb.cursor = nir_after_instr(&pred->end_nop->instr);
>
>vtn_local_store(b, src, nir_deref_var_create(b, phi_var));
> }
> @@ -576,7 +576,9 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head *cf_list,
>
>   vtn_foreach_instruction(b, block_start, block_end, handler);
>
> - block->end_block = nir_cursor_current_block(b->nb.cursor);
> + block->end_nop = nir_intrinsic_instr_create(b->nb.shader,
> + nir_intrinsic_nop);
> + nir_builder_instr_insert(&b->nb, &block->end_nop->instr);
>
>   if ((*block->branch & SpvOpCodeMask) == SpvOpReturnValue) {
>  struct vtn_ssa_value *src = vtn_ssa_value(b, block->branch[1]);
> diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h
> index 7f5444e..6f34f09 100644
> --- a/src/compiler/spirv/vtn_private.h
> +++ b/src/compiler/spirv/vtn_private.h
> @@ -149,8 +149,8 @@ struct vtn_block {
> /** Points to the switch case started by this block (if any) */
> struct vtn_case *switch_case;
>
> -   /** The last block in this SPIR-V block. */
> -   nir_block *end_block;
> +   /** Every block ends in a nop intrinsic so that we can find it again */
> +   nir_intrinsic_instr *end_nop;
>  };
>
>  struct vtn_function {
> --
> 2.5.0.400.gff86faf
>

Re: [Mesa-dev] [PATCH 1/2] glsl: add subpass image type

2016-09-15 Thread Jason Ekstrand

On Thu, Sep 15, 2016 at 1:23 AM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> SPIR-V/Vulkan have a special image type for input attachments
> called the subpass type. It has different characteristics than
> other image types.
>
> The main one being it can only be an input image to fragment
> shaders and loads from it are relative to the frag coord.
>
> This adds support for it to the GLSL types. Unfortunately
> we've run out of space in the sampler dim in types, so we
> need to use another bit.
> ---
>  src/compiler/builtin_type_macros.h |  2 ++
>  src/compiler/glsl_types.cpp| 12 
>  src/compiler/glsl_types.h  |  5 +++--
>  src/compiler/nir/nir.h |  1 +
>  4 files changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/builtin_type_macros.h
> b/src/compiler/builtin_type_macros.h
> index da3f19e..8af0e2a 100644
> --- a/src/compiler/builtin_type_macros.h
> +++ b/src/compiler/builtin_type_macros.h
> @@ -159,6 +159,8 @@ DECL_TYPE(uimageCubeArray,
> GL_UNSIGNED_INT_IMAGE_CUBE_MAP_ARRAY,   GLSL_TYPE
>  DECL_TYPE(uimage2DMS,  GL_UNSIGNED_INT_IMAGE_2D_MULTISAMPLE,
>  GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_MS, 0, 0, GLSL_TYPE_UINT)
>  DECL_TYPE(uimage2DMSArray, GL_UNSIGNED_INT_IMAGE_2D_MULTISAMPLE_ARRAY,
> GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_MS, 0, 1, GLSL_TYPE_UINT)
>
> +DECL_TYPE(imageSubpass,0,
> GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_SUBPASS,0, 0, GLSL_TYPE_FLOAT)
>

We should probably call this subpassInput to match the GLSL Vulkan spec.


> +
>  DECL_TYPE(atomic_uint, GL_UNSIGNED_INT_ATOMIC_COUNTER,
> GLSL_TYPE_ATOMIC_UINT, 1, 1)
>
>  STRUCT_TYPE(gl_DepthRangeParameters)
> diff --git a/src/compiler/glsl_types.cpp b/src/compiler/glsl_types.cpp
> index 641644d..bf72419 100644
> --- a/src/compiler/glsl_types.cpp
> +++ b/src/compiler/glsl_types.cpp
> @@ -674,6 +674,8 @@ glsl_type::get_sampler_instance(enum glsl_sampler_dim
> dim,
>  return error_type;
>   else
>  return samplerExternalOES_type;
> +  case GLSL_SAMPLER_DIM_SUBPASS:
> + return error_type;
>}
> case GLSL_TYPE_INT:
>if (shadow)
> @@ -701,6 +703,8 @@ glsl_type::get_sampler_instance(enum glsl_sampler_dim
> dim,
>   return (array ? isampler2DMSArray_type : isampler2DMS_type);
>case GLSL_SAMPLER_DIM_EXTERNAL:
>   return error_type;
> +  case GLSL_SAMPLER_DIM_SUBPASS:
> + return error_type;
>}
> case GLSL_TYPE_UINT:
>if (shadow)
> @@ -728,6 +732,8 @@ glsl_type::get_sampler_instance(enum glsl_sampler_dim
> dim,
>   return (array ? usampler2DMSArray_type : usampler2DMS_type);
>case GLSL_SAMPLER_DIM_EXTERNAL:
>   return error_type;
> +  case GLSL_SAMPLER_DIM_SUBPASS:
> + return error_type;
>}
> default:
>return error_type;
> @@ -740,6 +746,8 @@ const glsl_type *
>  glsl_type::get_image_instance(enum glsl_sampler_dim dim,
>bool array, glsl_base_type type)
>  {
> +   if (dim == GLSL_SAMPLER_DIM_SUBPASS)
> +  return imageSubpass_type;
> switch (type) {
> case GLSL_TYPE_FLOAT:
>switch (dim) {
> @@ -764,6 +772,7 @@ glsl_type::get_image_instance(enum glsl_sampler_dim
> dim,
>case GLSL_SAMPLER_DIM_MS:
>   return (array ? image2DMSArray_type : image2DMS_type);
>case GLSL_SAMPLER_DIM_EXTERNAL:
> +  case GLSL_SAMPLER_DIM_SUBPASS:
>   return error_type;
>}
> case GLSL_TYPE_INT:
> @@ -789,6 +798,7 @@ glsl_type::get_image_instance(enum glsl_sampler_dim
> dim,
>case GLSL_SAMPLER_DIM_MS:
>   return (array ? iimage2DMSArray_type : iimage2DMS_type);
>case GLSL_SAMPLER_DIM_EXTERNAL:
> +  case GLSL_SAMPLER_DIM_SUBPASS:
>   return error_type;
>}
> case GLSL_TYPE_UINT:
> @@ -814,6 +824,7 @@ glsl_type::get_image_instance(enum glsl_sampler_dim
> dim,
>case GLSL_SAMPLER_DIM_MS:
>   return (array ? uimage2DMSArray_type : uimage2DMS_type);
>case GLSL_SAMPLER_DIM_EXTERNAL:
> +  case GLSL_SAMPLER_DIM_SUBPASS:
>   return error_type;
>}
> default:
> @@ -1975,6 +1986,7 @@ glsl_type::coordinate_components() const
> case GLSL_SAMPLER_DIM_RECT:
> case GLSL_SAMPLER_DIM_MS:
> case GLSL_SAMPLER_DIM_EXTERNAL:
> +   case GLSL_SAMPLER_DIM_SUBPASS:
>size = 2;
>break;
> case GLSL_SAMPLER_DIM_3D:
> diff --git a/src/compiler/glsl_types.h b/src/compiler/glsl_types.h
> index 7c4827d..b1e2f7a 100644
> --- a/src/compiler/glsl_types.h
> +++ b/src/compiler/glsl_types.h
> @@ -80,7 +80,8 @@ enum glsl_sampler_dim {
> GLSL_SAMPLER_DIM_RECT,
> GLSL_SAMPLER_DIM_BUF,
> GLSL_SAMPLER_DIM_EXTERNAL,
> -   GLSL_SAMPLER_DIM_MS
> +   GLSL_SAMPLER_DIM_MS,
> +   GLSL_SAMPLER_DIM_SUBPASS, /* for vulkan input attachments */
>  };
>
>  enum glsl_interface_packing {
> @@ -127,7 +128,7 @@ struct glsl_type {
> GLenum gl_type;
> glsl_base_type base_t

Re: [Mesa-dev] [PATCH 2/2] spirv: use subpass image type

2016-09-15 Thread Jason Ekstrand

On Thu, Sep 15, 2016 at 1:23 AM, Dave Airlie  wrote:

> From: Dave Airlie 
>
> This adds support for the input attachments subpass type
> to the SPIRV->NIR pass.
> ---
>  src/compiler/spirv/spirv_to_nir.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/compiler/spirv/spirv_to_nir.c
> b/src/compiler/spirv/spirv_to_nir.c
> index 7e7a026..45dfe0b 100644
> --- a/src/compiler/spirv/spirv_to_nir.c
> +++ b/src/compiler/spirv/spirv_to_nir.c
> @@ -828,6 +828,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
>case SpvDimCube: dim = GLSL_SAMPLER_DIM_CUBE;  break;
>case SpvDimRect: dim = GLSL_SAMPLER_DIM_RECT;  break;
>case SpvDimBuffer:   dim = GLSL_SAMPLER_DIM_BUF;   break;
> +  case SpvDimSubpassData: dim = GLSL_SAMPLER_DIM_SUBPASS; break;
>default:
>   unreachable("Invalid SPIR-V Sampler dimension");
>}
> @@ -854,7 +855,7 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode,
>   val->type->type = glsl_sampler_type(dim, is_shadow, is_array,
>   glsl_get_base_type(sampled_
> type));
>} else if (sampled == 2) {
> - assert(format);
> + assert((dim == GLSL_SAMPLER_DIM_SUBPASS) || format);
>   assert(!is_shadow);
>   val->type->type = glsl_image_type(dim, is_array,
> glsl_get_base_type(sampled_
> type));
> @@ -1419,6 +1420,7 @@ vtn_handle_texture(struct vtn_builder *b, SpvOp
> opcode,
>case GLSL_SAMPLER_DIM_2D:
>case GLSL_SAMPLER_DIM_RECT:
>case GLSL_SAMPLER_DIM_MS:
> +  case GLSL_SAMPLER_DIM_SUBPASS:
>

I don't think this is correct.  Given that they're being handled as storage
images, you should never actually get here.  Probably best to let it fall
through to the unreachable so we catch it if someone ever tries to texture
from a subpassInput.
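
A toy sketch of the shape being suggested, with hypothetical names
(toy_dim, toy_texture_coord_components). The real code would reach
unreachable() in the default case; this version returns -1 instead so the
behaviour can be observed without aborting.

```c
#include <assert.h>

enum toy_dim {
   TOY_DIM_1D,
   TOY_DIM_2D,
   TOY_DIM_3D,
   TOY_DIM_SUBPASS
};

/* Coordinate count for dimensions that may be textured from.
 * TOY_DIM_SUBPASS is deliberately not listed, so it lands in the default
 * case and is reported as an error rather than silently treated as 2D. */
static int toy_texture_coord_components(enum toy_dim dim)
{
   switch (dim) {
   case TOY_DIM_1D:
      return 1;
   case TOY_DIM_2D:
      return 2;
   case TOY_DIM_3D:
      return 3;
   default:
      return -1; /* stands in for unreachable() in this sketch */
   }
}
```

Leaving the subpass dimension out of the case list means any attempt to
texture from it is caught at this point instead of quietly succeeding.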

I left one other comment on the first patch.  With those two fixed,

Reviewed-by: Jason Ekstrand 

This is a lot simpler than I expected and I think it probably is better to
just use storage images and convert to textures in a lowering pass.  If
they ever let you combine a sampler with a subpassInput, we'll have to
rethink things a bit, but I like the way this looks.


>   coord_components = 2;
>   break;
>case GLSL_SAMPLER_DIM_3D:
> --
> 2.5.5
>
>

Re: [Mesa-dev] [PATCH 1/2] glsl: add subpass image type

2016-09-15 Thread Dave Airlie

On 16 September 2016 at 14:47, Jason Ekstrand  wrote:
> On Thu, Sep 15, 2016 at 1:23 AM, Dave Airlie  wrote:
>>
>> From: Dave Airlie 
>>
>> SPIR-V/Vulkan have a special image type for input attachments
>> called the subpass type. It has different characteristics than
>> other image types.
>>
>> The main one being it can only be an input image to fragment
>> shaders and loads from it are relative to the frag coord.
>>
>> This adds support for it to the GLSL types. Unfortunately
>> we've run out of space in the sampler dim in types, so we
>> need to use another bit.
>> ---
>>  src/compiler/builtin_type_macros.h |  2 ++
>>  src/compiler/glsl_types.cpp| 12 
>>  src/compiler/glsl_types.h  |  5 +++--
>>  src/compiler/nir/nir.h |  1 +
>>  4 files changed, 18 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/compiler/builtin_type_macros.h
>> b/src/compiler/builtin_type_macros.h
>> index da3f19e..8af0e2a 100644
>> --- a/src/compiler/builtin_type_macros.h
>> +++ b/src/compiler/builtin_type_macros.h
>> @@ -159,6 +159,8 @@ DECL_TYPE(uimageCubeArray,
>> GL_UNSIGNED_INT_IMAGE_CUBE_MAP_ARRAY,   GLSL_TYPE
>>  DECL_TYPE(uimage2DMS,  GL_UNSIGNED_INT_IMAGE_2D_MULTISAMPLE,
>> GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_MS, 0, 0, GLSL_TYPE_UINT)
>>  DECL_TYPE(uimage2DMSArray, GL_UNSIGNED_INT_IMAGE_2D_MULTISAMPLE_ARRAY,
>> GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_MS, 0, 1, GLSL_TYPE_UINT)
>>
>> +DECL_TYPE(imageSubpass,0,
>> GLSL_TYPE_IMAGE, GLSL_SAMPLER_DIM_SUBPASS,0, 0, GLSL_TYPE_FLOAT)
>
>
> We should probably call this subpassInput to match the GLSL Vulkan spec.

Sounds good, I'll post v2.

Dave.
