Re: [Mesa-dev] [PATCH 00/21] anv: Do cross-stage link optimizations

2017-10-29 Thread Jason Ekstrand

On October 29, 2017 21:34:01 Timothy Arceri  wrote:


On 29/10/17 12:58, Jason Ekstrand wrote:

On Sat, Oct 28, 2017 at 11:36 AM, Jason Ekstrand wrote:

This series adds support for cross-stage optimizations in anv.  There are a
few patches from Jordan's shader cache series in here that I wanted because
they made my life easier.  There are also three patches CCd to stable to
fix a bug in the i965 cross-stage NIR linking which, as a side-effect,
expose a nice brw_nir_link_shaders helper that we can use in anv.  The bulk
of the series, however, is the annoying refactoring of anv_pipeline.c to
let us work with and cache the shaders an entire pipeline at a time instead
of having everything be per-stage.  The patch to actually add the NIR link
optimizations to ANV is almost trivial.

On my thermally throttled (and therefore a bit inconsistent) laptop, this
seems to help the Aztec Ruins benchmark by 2%.


Or not... I'm having trouble reproducing it now.


For what it's worth, RADV had improvements in the following:

Sascha Willems demo results:

  computecullandlod 39 -> 41 fps
  pipelines ~6100 -> ~6200 fps

The biggest improvement is with the component packing enabled:

SaschaWillems Vulkan demo tessellation:

~4300fps -> ~4800fps


Yeah, I tried those out but they were too noisy on my laptop to get good 
data.  I asked Eero to try and get me some better numbers on his perf setup.




Carl Worth (1):
   intel/compiler: add new field for storing program size

Jason Ekstrand (17):
   anv/pipeline: Rework the parameters to populate_wm_prog_key
   anv/pipeline: Add populate_tcs/tes_key helpers
   anv/pipline: Add a helper struct for per-stage info
   anv/pipeline: Populate keys up-front
   anv/pipeline: Hash the entire pipeline in one go
   anv/pipeline: Call anv_pipeline_compile_* in a loop
   anv/pipeline: Pull shader compilation out into a helper.
   anv/pipeline: Drop anv_pipeline_add_compiled_stage
   anv/pipeline: Recompile all shaders if any are missing from the cache
   anv/pipeline: Compile to NIR in compile_graphics
   anv/pipeline: Add a separate "link" stage
   anv/pipeline: Pull most of the anv_pipeline_compile_* into common code
   intel/nir: Add a helper for getting the NoIndirect mask
   intel/nir: Break the linking code into a helper in brw_nir.c
   intel/nir: Use the correct indirect lowering masks in link_shaders
   nir/lower_indirect: Bail early if modes == 0
   anv/pipeline: Do cross-stage linking optimizations

Jordan Justen (3):
   intel/compiler: Add union types for prog_data and prog_key stages
   intel/compiler: Add functions to get prog_data and prog_key sizes for
     a stage
   intel/compiler: Remove final_program_size from brw_compile_*

  src/compiler/nir/nir_lower_indirect_derefs.c |   3 +
  src/intel/blorp/blorp.c                      |  10 +-
  src/intel/blorp/blorp_blit.c                 |   5 +-
  src/intel/blorp/blorp_clear.c                |  15 +-
  src/intel/blorp/blorp_priv.h                 |   6 +-
  src/intel/compiler/brw_compiler.c            |  36 +
  src/intel/compiler/brw_compiler.h            |  34 +-
  src/intel/compiler/brw_fs.cpp                |   6 +-
  src/intel/compiler/brw_nir.c                 |  63 +-
  src/intel/compiler/brw_nir.h                 |   4 +
  src/intel/compiler/brw_shader.cpp            |  12 +-
  src/intel/compiler/brw_vec4.cpp              |   5 +-
  src/intel/compiler/brw_vec4_gs_visitor.cpp   |   8 +-
  src/intel/compiler/brw_vec4_tcs.cpp          |  12 +-
  src/intel/vulkan/anv_pipeline.c              | 971 ++-
  src/intel/vulkan/anv_private.h               |   2 +-
  src/intel/vulkan/genX_pipeline.c             |   2 -
  src/mesa/drivers/dri/i965/brw_cs.c           |   5 +-
  src/mesa/drivers/dri/i965/brw_gs.c           |   5 +-
  src/mesa/drivers/dri/i965/brw_link.cpp       |  38 +-
  src/mesa/drivers/dri/i965/brw_tcs.c          |   5 +-
  src/mesa/drivers/dri/i965/brw_tes.c          |   5 +-
  src/mesa/drivers/dri/i965/brw_vs.c           |  11 +-
  src/mesa/drivers/dri/i965/brw_wm.c           |   5 +-
  24 files changed, 668 insertions(+), 600 deletions(-)

--
2.5.0.400.gff86faf




___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev






Re: [Mesa-dev] [PATCH 00/21] anv: Do cross-stage link optimizations

2017-10-29 Thread Timothy Arceri

On 29/10/17 12:58, Jason Ekstrand wrote:
On Sat, Oct 28, 2017 at 11:36 AM, Jason Ekstrand wrote:


This series adds support for cross-stage optimizations in anv.  There are a
few patches from Jordan's shader cache series in here that I wanted because
they made my life easier.  There are also three patches CCd to stable to
fix a bug in the i965 cross-stage NIR linking which, as a side-effect,
expose a nice brw_nir_link_shaders helper that we can use in anv.  The bulk
of the series, however, is the annoying refactoring of anv_pipeline.c to
let us work with and cache the shaders an entire pipeline at a time instead
of having everything be per-stage.  The patch to actually add the NIR link
optimizations to ANV is almost trivial.

On my thermally throttled (and therefore a bit inconsistent) laptop, this
seems to help the Aztec Ruins benchmark by 2%.


Or not... I'm having trouble reproducing it now.


For what it's worth, RADV had improvements in the following:

Sascha Willems demo results:

 computecullandlod 39 -> 41 fps
 pipelines ~6100 -> ~6200 fps

The biggest improvement is with the component packing enabled:

SaschaWillems Vulkan demo tessellation:

~4300fps -> ~4800fps





[Mesa-dev] [PATCH 9/9] radv: enable nir component packing

2017-10-29 Thread Timothy Arceri
SaschaWillems Vulkan demo tessellation:

~4000fps -> ~4600fps

Reviewed-by: Bas Nieuwenhuizen 
---
 src/amd/vulkan/radv_pipeline.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 322cd7951b2..ec7c2393fc9 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1815,6 +1815,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
last = i;
}
 
+   int prev = -1;
for (unsigned i = 0; i < MESA_SHADER_STAGES; ++i) {
const VkPipelineShaderStageCreateInfo *stage = pStages[i];
 
@@ -1845,6 +1846,11 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
nir_lower_io_to_scalar_early(nir[i], mask);
radv_optimize_nir(nir[i]);
}
+
+   if (prev != -1) {
+   nir_compact_varyings(nir[prev], nir[i], true);
+   }
+   prev = i;
}
 
if (nir[MESA_SHADER_TESS_CTRL]) {
-- 
2.13.6
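
The "prev" bookkeeping in the hunk above pairs each shader stage with the
nearest earlier stage that is actually present, so that the packing helper is
handed a (producer, consumer) pair. Below is a minimal, self-contained sketch
of just that pairing logic. The names example_pair, example_collect_link_pairs
and EXAMPLE_STAGE_COUNT are hypothetical stand-ins for illustration and are
not part of Mesa; the sketch records the pairs instead of calling
nir_compact_varyings().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the number of shader stages; not a Mesa value. */
#define EXAMPLE_STAGE_COUNT 6

struct example_pair {
   int producer;
   int consumer;
};

/* Walk the stage array in pipeline order and record each (producer,
 * consumer) pair of adjacent *present* stages, skipping absent ones.
 * Returns the number of pairs written to `pairs`, which must have room
 * for EXAMPLE_STAGE_COUNT - 1 entries.
 */
static size_t
example_collect_link_pairs(const void *const stages[EXAMPLE_STAGE_COUNT],
                           struct example_pair *pairs)
{
   size_t count = 0;
   int prev = -1; /* index of the last present stage seen so far */

   assert(pairs != NULL);

   for (int i = 0; i < EXAMPLE_STAGE_COUNT; i++) {
      if (!stages[i])
         continue;

      if (prev != -1) {
         /* This is where the patch calls the packing helper. */
         pairs[count].producer = prev;
         pairs[count].consumer = i;
         count++;
      }
      prev = i;
   }

   return count;
}
```

With stages present at indices 0, 2 and 4, the sketch yields the pairs (0, 2)
and (2, 4); the absent stages in between are skipped rather than breaking the
chain.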



[Mesa-dev] [PATCH 6/9] nir: add varying component packing helpers

2017-10-29 Thread Timothy Arceri
v2: update shader info input/output masks when packing components

Reviewed-by: Bas Nieuwenhuizen  (v1)
---
 src/compiler/nir/nir.h |   2 +
 src/compiler/nir/nir_linking_helpers.c | 272 +
 2 files changed, 274 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 095cc6600ad..2b46cefc4f7 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2420,6 +2420,8 @@ void nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
 
 /* Some helpers to do very simple linking */
 bool nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer);
+void nir_compact_varyings(nir_shader *producer, nir_shader *consumer,
+  bool default_to_smooth_interp);
 
 typedef enum {
/* If set, this forces all non-flat fragment shader inputs to be
diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c
index 4d709c1b3c5..f7355af2195 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -173,3 +173,275 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer)
 
return progress;
 }
+
+static uint8_t
+get_interp_type(nir_variable *var, bool default_to_smooth_interp)
+{
+   if (var->data.interpolation != INTERP_MODE_NONE)
+  return var->data.interpolation;
+   else if (default_to_smooth_interp)
+  return INTERP_MODE_SMOOTH;
+   else
+  return INTERP_MODE_NONE;
+}
+
+static void
+get_slot_component_masks_and_interp_types(struct exec_list *var_list,
+  uint8_t *comps, uint8_t *interp_type,
+  gl_shader_stage stage,
+  bool default_to_smooth_interp)
+{
+   nir_foreach_variable_safe(var, var_list) {
+  assert(var->data.location >= 0);
+
+  /* Only remap things that aren't built-ins.
+   * TODO: add TES patch support.
+   */
+  if (var->data.location >= VARYING_SLOT_VAR0 &&
+  var->data.location - VARYING_SLOT_VAR0 < 32) {
+
+ const struct glsl_type *type = var->type;
+ if (nir_is_per_vertex_io(var, stage)) {
+assert(glsl_type_is_array(type));
+type = glsl_get_array_element(type);
+ }
+
+ unsigned location = var->data.location - VARYING_SLOT_VAR0;
+ unsigned elements =
+glsl_get_vector_elements(glsl_without_array(type));
+
+ bool dual_slot = glsl_type_is_dual_slot(glsl_without_array(type));
+ unsigned slots = glsl_count_attribute_slots(type, false);
+ unsigned comps_slot2 = 0;
+ for (unsigned i = 0; i < slots; i++) {
+interp_type[location + i] =
+   get_interp_type(var, default_to_smooth_interp);
+
+if (dual_slot) {
+   if (i & 1) {
+  comps[location + i] |= ((1 << comps_slot2) - 1);
+   } else {
+  unsigned num_comps = 4 - var->data.location_frac;
+  comps_slot2 = (elements * 2) - num_comps;
+
+  /* Assume ARB_enhanced_layouts packing rules for doubles */
+  assert(var->data.location_frac == 0 ||
+ var->data.location_frac == 2);
+  assert(comps_slot2 <= 4);
+
+  comps[location + i] |=
+ ((1 << num_comps) - 1) << var->data.location_frac;
+   }
+} else {
+   comps[location + i] |=
+  ((1 << elements) - 1) << var->data.location_frac;
+}
+ }
+  }
+   }
+}
+
+struct varying_loc
+{
+   uint8_t component;
+   uint32_t location;
+};
+
+static void
+remap_slots_and_components(struct exec_list *var_list, gl_shader_stage stage,
+   struct varying_loc (*remap)[4], uint64_t *slots_used)
+ {
+   /* We don't touch builtins so just copy the bitmask */
+   uint64_t slots_used_tmp =
+  *slots_used & (((uint64_t)1 << (VARYING_SLOT_VAR0 - 1)) - 1);
+
+   nir_foreach_variable(var, var_list) {
+  assert(var->data.location >= 0);
+
+  /* Only remap things that aren't built-ins */
+  if (var->data.location >= VARYING_SLOT_VAR0 &&
+  var->data.location - VARYING_SLOT_VAR0 < 32) {
+ assert(var->data.location - VARYING_SLOT_VAR0 < 32);
+ assert(remap[var->data.location - VARYING_SLOT_VAR0] >= 0);
+
+ unsigned location = var->data.location - VARYING_SLOT_VAR0;
+ struct varying_loc *new_loc = &remap[location][var->data.location_frac];
+ if (new_loc->location) {
+var->data.location = new_loc->location;
+var->data.location_frac = new_loc->component;
+ }
+
+ const struct glsl_type *type = var->type;
+ if (nir_is_per_vertex_io(var, stage)) {
+assert(glsl_type_is_array(type));
+type = 
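
(The archived copy of this patch is cut off here.) The per-slot component
masks computed in get_slot_component_masks_and_interp_types() above come down
to a little bit arithmetic. The following is a minimal sketch of that
arithmetic only, assuming the same rules as the visible hunk; the function
names example_single_slot_mask and example_dual_slot_masks are hypothetical
and do not exist in Mesa.

```c
#include <assert.h>
#include <stdint.h>

/* Mask of the components a non-dual-slot varying occupies within one slot.
 * `elements` is the vector width (1-4) and `location_frac` is the first
 * component used inside the slot.
 */
static uint8_t
example_single_slot_mask(unsigned elements, unsigned location_frac)
{
   assert(elements >= 1 && elements + location_frac <= 4);
   return (uint8_t)(((1u << elements) - 1) << location_frac);
}

/* For a dual-slot (64-bit) vector each element takes two components, so the
 * variable spills from its first slot into a second one. Writes the mask
 * for the first slot to `slot1` and for the second slot to `slot2`.
 */
static void
example_dual_slot_masks(unsigned elements, unsigned location_frac,
                        uint8_t *slot1, uint8_t *slot2)
{
   /* Same packing assumption as the patch: doubles start at 0 or 2. */
   assert(location_frac == 0 || location_frac == 2);

   unsigned num_comps = 4 - location_frac;          /* used in first slot */
   unsigned comps_slot2 = elements * 2 - num_comps; /* left for second slot */
   assert(comps_slot2 <= 4);

   *slot1 = (uint8_t)(((1u << num_comps) - 1) << location_frac);
   *slot2 = (uint8_t)((1u << comps_slot2) - 1);
}
```

For example, a vec3 starting at component 1 gives the mask 0xE, while a dvec3
starting at component 0 fills the first slot (0xF) and the lower two
components of the second slot (0x3).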

[Mesa-dev] [PATCH 3/9] i965: move update_xfb_info() call out of loop

2017-10-29 Thread Timothy Arceri
We can just call it once. A following patch will also introduce
link-time component packing, which modifies the outputs_written
bitmask, so we want to avoid calling update_xfb_info() until after
packing is completed.
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 1a28e63fcae..b6c5362a1ee 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -325,8 +325,6 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 
   infos[stage] = &prog->nir->info;
 
-  update_xfb_info(prog->sh.LinkedTransformFeedback, infos[stage]);
-
   /* Make a pass over the IR to add state references for any built-in
* uniforms that are used.  This has to be done now (during linking).
* Code generation doesn't happen until the first time this shader is
@@ -347,6 +345,11 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
   }
}
 
+   if (shProg->last_vert_prog) {
+  update_xfb_info(shProg->last_vert_prog->sh.LinkedTransformFeedback,
+  &shProg->last_vert_prog->nir->info);
+   }
+
/* The linker tries to dead code eliminate unused varying components,
 * and make sure interfaces match.  But it isn't able to do so in all
 * cases.  So, explicitly make the interfaces match by OR'ing together
-- 
2.13.6



[Mesa-dev] [PATCH 2/9] nir: add varying array splitting pass

2017-10-29 Thread Timothy Arceri
---
 src/compiler/Makefile.sources  |   1 +
 src/compiler/nir/meson.build   |   1 +
 src/compiler/nir/nir.h |   1 +
 src/compiler/nir/nir_lower_io_arrays_to_elements.c | 371 +
 4 files changed, 374 insertions(+)
 create mode 100644 src/compiler/nir/nir_lower_io_arrays_to_elements.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 27cc33ab835..ac9a3c8549c 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -227,6 +227,7 @@ NIR_FILES = \
nir/nir_lower_indirect_derefs.c \
nir/nir_lower_int64.c \
nir/nir_lower_io.c \
+   nir/nir_lower_io_arrays_to_elements.c \
nir/nir_lower_io_to_temporaries.c \
nir/nir_lower_io_to_scalar.c \
nir/nir_lower_io_types.c \
diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index cb88effa628..ab0aa65eb1e 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -114,6 +114,7 @@ files_libnir = files(
   'nir_lower_indirect_derefs.c',
   'nir_lower_int64.c',
   'nir_lower_io.c',
+  'nir_lower_io_arrays_to_elements.c',
   'nir_lower_io_to_temporaries.c',
   'nir_lower_io_to_scalar.c',
   'nir_lower_io_types.c',
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index dd833cf1831..095cc6600ad 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2454,6 +2454,7 @@ bool nir_lower_alu_to_scalar(nir_shader *shader);
 bool nir_lower_load_const_to_scalar(nir_shader *shader);
 bool nir_lower_read_invocation_to_scalar(nir_shader *shader);
 bool nir_lower_phis_to_scalar(nir_shader *shader);
+void nir_lower_io_arrays_to_elements(nir_shader *producer, nir_shader *consumer);
 void nir_lower_io_to_scalar(nir_shader *shader, nir_variable_mode mask);
 void nir_lower_io_to_scalar_early(nir_shader *shader, nir_variable_mode mask);
 
diff --git a/src/compiler/nir/nir_lower_io_arrays_to_elements.c b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
new file mode 100644
index 00000000000..3a8e2dc1933
--- /dev/null
+++ b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
@@ -0,0 +1,371 @@
+/*
+ * Copyright © 2017 Timothy Arceri
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+
+/** @file nir_lower_io_arrays_to_elements.c
+ *
+ * Split arrays/matrices with direct indexing into individual elements. This
+ * will allow optimisation passes to better clean up unused elements.
+ *
+ */
+
+static unsigned
+get_io_offset(nir_builder *b, nir_deref_var *deref, nir_variable *var,
+  unsigned *element_index)
+{
+   nir_deref *tail = &deref->deref;
+
+   /* For per-vertex input arrays (i.e. geometry shader inputs), skip the
+* outermost array index.  Process the rest normally.
+*/
+   if (nir_is_per_vertex_io(var, b->shader->info.stage)) {
+  tail = tail->child;
+   }
+
+   unsigned offset = 0;
+   while (tail->child != NULL) {
+  tail = tail->child;
+
+  if (tail->deref_type == nir_deref_type_array) {
+ nir_deref_array *deref_array = nir_deref_as_array(tail);
+ assert(deref_array->deref_array_type != nir_deref_array_type_indirect);
+
+ unsigned size = glsl_count_attribute_slots(tail->type, false);
+ offset += size * deref_array->base_offset;
+
+ unsigned num_elements = glsl_type_is_array(tail->type) ?
+glsl_get_aoa_size(tail->type) : 1;
+
+ num_elements *= glsl_type_is_matrix(glsl_without_array(tail->type)) ?
+glsl_get_matrix_columns(glsl_without_array(tail->type)) : 1;
+
+ *element_index += num_elements * deref_array->base_offset;
+  } else if (tail->deref_type == nir_deref_type_struct) {
+ /* TODO: we could also add struct splitting support to this pass */
+ break;
+  }
+   }
+
+   return offset;
+}
+
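
(The archived copy of this patch is cut off here.) The element_index
bookkeeping in get_io_offset() above amounts to flattening a chain of direct
array indices into a single element number, so that each element can become
its own variable. Below is a simplified sketch of that flattening, not the
pass itself: example_flat_element_index is a hypothetical name, and the
array shape is passed in explicitly instead of being read from glsl_type.

```c
#include <assert.h>
#include <stddef.h>

/* Flatten a fully direct (constant) multi-dimensional array index into a
 * single element index. dims[k] is the length of dimension k and indices[k]
 * the constant index into it, outermost dimension first. `matrix_columns`
 * is the column count of the innermost element type (1 for non-matrices).
 */
static unsigned
example_flat_element_index(const unsigned *dims, const unsigned *indices,
                           size_t num_dims, unsigned matrix_columns)
{
   unsigned element_index = 0;

   for (size_t k = 0; k < num_dims; k++) {
      assert(indices[k] < dims[k]);

      /* Elements covered by one step of dimension k: the product of all
       * inner dimensions, times the matrix column count.
       */
      unsigned stride = matrix_columns;
      for (size_t j = k + 1; j < num_dims; j++)
         stride *= dims[j];

      element_index += stride * indices[k];
   }

   return element_index;
}
```

For a hypothetical "vec4 a[3][2]", the access a[2][1] maps to element
2 * 2 + 1 = 5; for "mat4 m[3]", m[2] starts at element 2 * 4 = 8.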

[Mesa-dev] [PATCH 8/9] radv: enable nir varying array splitting

2017-10-29 Thread Timothy Arceri
---
 src/amd/vulkan/radv_pipeline.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c25642c9667..322cd7951b2 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1666,6 +1666,9 @@ radv_link_shaders(struct radv_pipeline *pipeline, nir_shader **shaders)
}
 
for (int i = 1; i < shader_count; ++i)  {
+   nir_lower_io_arrays_to_elements(ordered_shaders[i],
+   ordered_shaders[i - 1]);
+
nir_remove_dead_variables(ordered_shaders[i],
  nir_var_shader_out);
nir_remove_dead_variables(ordered_shaders[i - 1],
-- 
2.13.6



[Mesa-dev] [PATCH 7/9] i965: enable varying component packing for BDW+

2017-10-29 Thread Timothy Arceri
shader-db results BDW:

total instructions in shared programs: 13192895 -> 13182437 (-0.08%)
instructions in affected programs: 827145 -> 816687 (-1.26%)
helped: 5199
HURT: 116

total cycles in shared programs: 539249342 -> 539156566 (-0.02%)
cycles in affected programs: 21894552 -> 21801776 (-0.42%)
helped: 10667
HURT: 7196

LOST:   0
GAINED: 17
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 46dbcac8430..782135430cb 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -329,6 +329,7 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
}
 }
 
+   int prev = -1;
for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
   struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
   if (!shader)
@@ -340,6 +341,12 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
   NIR_PASS_V(prog->nir, nir_lower_samplers, shProg);
   NIR_PASS_V(prog->nir, nir_lower_atomics, shProg);
 
+  if (brw->screen->devinfo.gen >= 8 && prev != -1) {
+ nir_compact_varyings(shProg->_LinkedShaders[prev]->Program->nir,
+  prog->nir, ctx->API != API_OPENGL_COMPAT);
+  }
+  prev = stage;
+
   infos[stage] = &prog->nir->info;
 
   /* Make a pass over the IR to add state references for any built-in
-- 
2.13.6



[Mesa-dev] More nir linking optimisations

2017-10-29 Thread Timothy Arceri
This series adds a varying array splitting pass to the component packing
series I sent out previously. This avoids the workaround of calling
gather shader info twice, since we can more easily keep the input/output
bitmasks in sync now that we don't need to worry about partial marking of
arrays.

Remaining improvements include adding a pass to compact varyings into
consecutive slots rather than leaving empty slots when removing dead varyings.

Shader-db results for the series on i965 (BDW):

total instructions in shared programs: 13298718 -> 13191284 (-0.81%)
instructions in affected programs: 2315180 -> 2207746 (-4.64%)
helped: 14956
HURT: 390

total cycles in shared programs: 540151400 -> 539397048 (-0.14%)
cycles in affected programs: 297905258 -> 297150906 (-0.25%)
helped: 25231
HURT: 13033

total loops in shared programs: 3807 -> 3804 (-0.08%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0

total spills in shared programs: 86577 -> 86640 (0.07%)
spills in affected programs: 1380 -> 1443 (4.57%)
helped: 7
HURT: 15

total fills in shared programs: 90871 -> 90946 (0.08%)
fills in affected programs: 1728 -> 1803 (4.34%)
helped: 16
HURT: 9

LOST:   4
GAINED: 15

The spill hurt is all in the Dolphin uber shaders (as are most of the spill
improvements).

Two of the lost programs are SIMD16 programs from CS: GO, lost because 80% of
the shaders get optimised away when we remove dead varying components; these
are also the shaders where the 3 loops go away.

Please review.
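
For readers checking the figures, the percentages in the shader-db summary
above are plain relative changes, (after - before) / before. A small sketch
of that arithmetic follows; example_percent_change is a hypothetical helper
written for illustration and is not part of shader-db.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change from `before` to `after`, in percent. Negative values
 * mean the count went down (an improvement for instructions or cycles).
 */
static double
example_percent_change(uint64_t before, uint64_t after)
{
   assert(before != 0);
   return ((double)after - (double)before) / (double)before * 100.0;
}
```

Applied to the totals above, 13298718 -> 13191284 instructions works out to
roughly -0.81%, and 2315180 -> 2207746 in the affected programs to roughly
-4.64%, matching the reported numbers.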



[Mesa-dev] [PATCH 1/9] nir: add tess patch support to nir_remove_unused_varyings()

2017-10-29 Thread Timothy Arceri
---
 src/compiler/nir/nir_linking_helpers.c | 61 +++---
 1 file changed, 42 insertions(+), 19 deletions(-)

diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c
index 54ba1c85e58..4d709c1b3c5 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -37,10 +37,12 @@
 static uint64_t
 get_variable_io_mask(nir_variable *var, gl_shader_stage stage)
 {
-   /* TODO: add support for tess patches */
-   if (var->data.patch || var->data.location < 0)
+   if (var->data.location < 0)
   return 0;
 
+   unsigned location = var->data.patch ?
+  var->data.location - VARYING_SLOT_PATCH0 : var->data.location;
+
assert(var->data.mode == nir_var_shader_in ||
   var->data.mode == nir_var_shader_out ||
   var->data.mode == nir_var_system_value);
@@ -53,11 +55,11 @@ get_variable_io_mask(nir_variable *var, gl_shader_stage stage)
}
 
unsigned slots = glsl_count_attribute_slots(type, false);
-   return ((1ull << slots) - 1) << var->data.location;
+   return ((1ull << slots) - 1) << location;
 }
 
 static void
-tcs_add_output_reads(nir_shader *shader, uint64_t *read)
+tcs_add_output_reads(nir_shader *shader, uint64_t *read, uint64_t *patches_read)
 {
nir_foreach_function(function, shader) {
   if (function->impl) {
@@ -73,9 +75,15 @@ tcs_add_output_reads(nir_shader *shader, uint64_t *read)
nir_var_shader_out) {
 
   nir_variable *var = intrin_instr->variables[0]->var;
-  read[var->data.location_frac] |=
- get_variable_io_mask(intrin_instr->variables[0]->var,
-  shader->info.stage);
+  if (var->data.patch) {
+ patches_read[var->data.location_frac] |=
+get_variable_io_mask(intrin_instr->variables[0]->var,
+ shader->info.stage);
+  } else {
+ read[var->data.location_frac] |=
+get_variable_io_mask(intrin_instr->variables[0]->var,
+ shader->info.stage);
+  }
}
 }
  }
@@ -85,14 +93,17 @@ tcs_add_output_reads(nir_shader *shader, uint64_t *read)
 
 static bool
 remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
-  uint64_t *used_by_other_stage)
+  uint64_t *used_by_other_stage,
+  uint64_t *used_by_other_stage_patches)
 {
bool progress = false;
+   uint64_t *used;
 
nir_foreach_variable_safe(var, var_list) {
-  /* TODO: add patch support */
   if (var->data.patch)
- continue;
+ used = used_by_other_stage_patches;
+  else
+ used = used_by_other_stage;
 
   if (var->data.location < VARYING_SLOT_VAR0 && var->data.location >= 0)
  continue;
@@ -100,7 +111,7 @@ remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
   if (var->data.always_active_io)
  continue;
 
-  uint64_t other_stage = used_by_other_stage[var->data.location_frac];
+  uint64_t other_stage = used[var->data.location_frac];
 
   if (!(other_stage & get_variable_io_mask(var, shader->info.stage))) {
  /* This one is invalid, make it a global variable instead */
@@ -124,15 +135,26 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer)
assert(consumer->info.stage != MESA_SHADER_VERTEX);
 
uint64_t read[4] = { 0 }, written[4] = { 0 };
+   uint64_t patches_read[4] = { 0 }, patches_written[4] = { 0 };
 
nir_foreach_variable(var, &producer->outputs) {
-  written[var->data.location_frac] |=
- get_variable_io_mask(var, producer->info.stage);
+  if (var->data.patch) {
+ patches_written[var->data.location_frac] |=
+get_variable_io_mask(var, producer->info.stage);
+  } else {
+ written[var->data.location_frac] |=
+get_variable_io_mask(var, producer->info.stage);
+  }
}
 
nir_foreach_variable(var, &consumer->inputs) {
-  read[var->data.location_frac] |=
- get_variable_io_mask(var, consumer->info.stage);
+  if (var->data.patch) {
+ patches_read[var->data.location_frac] |=
+get_variable_io_mask(var, consumer->info.stage);
+  } else {
+ read[var->data.location_frac] |=
+get_variable_io_mask(var, consumer->info.stage);
+  }
}
 
/* Each TCS invocation can read data written by other TCS invocations,
@@ -140,13 +162,14 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer)
 * sure they are not read by the TCS before demoting them to globals.
 */
if (producer->info.stage == MESA_SHADER_TESS_CTRL)
-  tcs_add_output_reads(producer, read);
+  tcs_add_output_reads(producer, read, patches_read);
 
bool 
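
(The archived copy of this patch is cut off here.) The change to
get_variable_io_mask() above rebases patch locations before shifting, and
patch varyings are tracked in their own bitmasks. A minimal sketch of that
mask computation follows, presumably the reason for the rebasing being that
the shift has to stay inside a 64-bit mask. EXAMPLE_SLOT_PATCH0 is a
hypothetical value chosen for illustration (the real VARYING_SLOT_PATCH0 is
defined by Mesa and may differ), and example_io_mask is not a Mesa function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical value for illustration only; not taken from Mesa. */
#define EXAMPLE_SLOT_PATCH0 64

/* Bitmask with one bit per slot the variable occupies. Patch variables are
 * rebased so that the first patch slot maps to bit 0 of a separate mask.
 */
static uint64_t
example_io_mask(int location, bool is_patch, unsigned slots)
{
   if (location < 0)
      return 0;

   unsigned bit = is_patch ? (unsigned)location - EXAMPLE_SLOT_PATCH0
                           : (unsigned)location;

   /* Shifting a 64-bit value by 64 or more is undefined in C. */
   assert(slots >= 1 && bit + slots <= 64);

   /* `slots` contiguous bits, starting at `bit`. */
   uint64_t bits = slots == 64 ? ~0ull : ((1ull << slots) - 1);
   return bits << bit;
}
```

A two-slot variable at location 3 yields 0x18, and a one-slot patch variable
at EXAMPLE_SLOT_PATCH0 + 1 yields 0x2 in the patch mask.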

[Mesa-dev] [PATCH 4/9] i965: enable varying array splitting

2017-10-29 Thread Timothy Arceri
total instructions in shared programs: 13210579 -> 13199325 (-0.09%)
instructions in affected programs: 89043 -> 77789 (-12.64%)
helped: 430
HURT: 0

total cycles in shared programs: 539530190 -> 539493750 (-0.01%)
cycles in affected programs: 584860 -> 548420 (-6.23%)
helped: 437
HURT: 110

total spills in shared programs: 86646 -> 86640 (-0.01%)
spills in affected programs: 6 -> 0
helped: 1
HURT: 0

total fills in shared programs: 90955 -> 90946 (-0.01%)
fills in affected programs: 9 -> 0
helped: 1
HURT: 0
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index b6c5362a1ee..c0e16ae7d5c 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -278,6 +278,8 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 nir_shader *producer = shProg->_LinkedShaders[i]->Program->nir;
 nir_shader *consumer = shProg->_LinkedShaders[next]->Program->nir;
 
+nir_lower_io_arrays_to_elements(producer, consumer);
+
 NIR_PASS_V(producer, nir_remove_dead_variables, nir_var_shader_out);
 NIR_PASS_V(consumer, nir_remove_dead_variables, nir_var_shader_in);
 
-- 
2.13.6



[Mesa-dev] [PATCH 5/9] i965: call nir_lower_io_to_scalar() at link time for BDW and above

2017-10-29 Thread Timothy Arceri
This will allow dead components of varyings to be removed.

BDW shader-db results:

total instructions in shared programs: 13190730 -> 13108459 (-0.62%)
instructions in affected programs: 2110903 -> 2028632 (-3.90%)
helped: 14043
HURT: 486

total cycles in shared programs: 541148990 -> 540544072 (-0.11%)
cycles in affected programs: 290344296 -> 289739378 (-0.21%)
helped: 23418
HURT: 11623

total loops in shared programs: 3923 -> 3920 (-0.08%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0

total spills in shared programs: 85784 -> 85853 (0.08%)
spills in affected programs: 1374 -> 1443 (5.02%)
helped: 6
HURT: 15

total fills in shared programs: 88717 -> 88801 (0.09%)
fills in affected programs: 1719 -> 1803 (4.89%)
helped: 15
HURT: 9

LOST:   3
GAINED: 0

The fills/spills changes were all in the dolphin uber shaders.

I tested enabling this on IVB but the results went in the other
direction.
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 35 --
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index c0e16ae7d5c..46dbcac8430 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -224,6 +224,17 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
unsigned int stage;
struct shader_info *infos[MESA_SHADER_STAGES] = { 0, };
 
+   /* Determine first and last stage. */
+   unsigned first = MESA_SHADER_STAGES;
+   unsigned last = 0;
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+  if (!shProg->_LinkedShaders[i])
+ continue;
+  if (first == MESA_SHADER_STAGES)
+ first = i;
+  last = i;
+   }
+
for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
   struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
   if (!shader)
@@ -251,17 +262,21 @@ brw_link_shader(struct gl_context *ctx, struct 
gl_shader_program *shProg)
 
   prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) stage,
  compiler->scalar_stage[stage]);
-   }
 
-   /* Determine first and last stage. */
-   unsigned first = MESA_SHADER_STAGES;
-   unsigned last = 0;
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-  if (!shProg->_LinkedShaders[i])
- continue;
-  if (first == MESA_SHADER_STAGES)
- first = i;
-  last = i;
+  if (brw->screen->devinfo.gen >= 8) {
+ nir_variable_mode mask = (nir_variable_mode) 0;
+
+ if (stage != first)
+mask = (nir_variable_mode)(mask | nir_var_shader_in);
+
+ if (stage != last)
+mask = (nir_variable_mode)(mask | nir_var_shader_out);
+
+ nir_lower_io_to_scalar_early(prog->nir, mask);
+
+ prog->nir = brw_nir_optimize(prog->nir, compiler,
+  compiler->scalar_stage[stage]);
+  }
}
 
/* Linking the stages in the opposite order (from fragment to vertex)
-- 
2.13.6
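
The stage-dependent mask in the patch above can be illustrated in isolation. The following is a minimal standalone sketch, not the driver code: NUM_STAGES, IO_SHADER_IN, IO_SHADER_OUT, first_stage, last_stage and scalar_io_mask are hypothetical stand-ins for MESA_SHADER_STAGES, the nir_var_shader_in/nir_var_shader_out modes and the loop in brw_link_shader, and the set of linked stages is assumed to be given as a bitmask. It shows only that inputs are scalarized for every stage but the first and outputs for every stage but the last.

```c
#include <assert.h>

/* Hypothetical stand-in for MESA_SHADER_STAGES. */
#define NUM_STAGES 6u

/* Hypothetical stand-ins for nir_var_shader_in / nir_var_shader_out. */
enum { IO_SHADER_IN = 1 << 0, IO_SHADER_OUT = 1 << 1 };

/* Index of the lowest set bit in `present`, or NUM_STAGES if none is set. */
static unsigned
first_stage(unsigned present)
{
   for (unsigned i = 0; i < NUM_STAGES; i++) {
      if (present & (1u << i))
         return i;
   }
   return NUM_STAGES;
}

/* Index of the highest set bit in `present`, or 0 if none is set
 * (the same defaults the patch uses for `first` and `last`). */
static unsigned
last_stage(unsigned present)
{
   unsigned last = 0;
   for (unsigned i = 0; i < NUM_STAGES; i++) {
      if (present & (1u << i))
         last = i;
   }
   return last;
}

/* Inputs are lowered unless this is the first linked stage,
 * outputs unless this is the last one. */
static unsigned
scalar_io_mask(unsigned stage, unsigned first, unsigned last)
{
   assert(stage < NUM_STAGES);
   unsigned mask = 0;
   if (stage != first)
      mask |= IO_SHADER_IN;
   if (stage != last)
      mask |= IO_SHADER_OUT;
   return mask;
}
```

For a program with only stages 0 and 4 linked (present = 0x11), stage 0 gets only the output bit and stage 4 only the input bit.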



Re: [Mesa-dev] [PATCH 08/25] gallium/u_threaded: mark queries flushed only for non-deferred flushes

2017-10-29 Thread Marek Olšák
On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle  wrote:
> From: Nicolai Hähnle 
>
> The driver uses (and must use) the flushed flag of queries as a hint that
> it does not have to check for synchronization with currently queued up
> commands. Deferred flushes do not actually flush queued up commands, so
> we must not set the flushed flag for them.
>
> Found by inspection.
> ---
>  src/gallium/auxiliary/util/u_threaded_context.c | 8 +---
>  src/gallium/auxiliary/util/u_threaded_context.h | 2 +-
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_threaded_context.c 
> b/src/gallium/auxiliary/util/u_threaded_context.c
> index 7e28b87a7ff..24fab7f5cb6 100644
> --- a/src/gallium/auxiliary/util/u_threaded_context.c
> +++ b/src/gallium/auxiliary/util/u_threaded_context.c
> @@ -1783,23 +1783,25 @@ tc_create_video_buffer(struct pipe_context *_pipe,
>   */
>
>  static void
>  tc_flush(struct pipe_context *_pipe, struct pipe_fence_handle **fence,
>   unsigned flags)
>  {
> struct threaded_context *tc = threaded_context(_pipe);
> struct pipe_context *pipe = tc->pipe;
> struct threaded_query *tq, *tmp;
>
> -   LIST_FOR_EACH_ENTRY_SAFE(tq, tmp, &tc->unflushed_queries, head_unflushed) 
> {
> -  tq->flushed = true;
> -  LIST_DEL(&tq->head_unflushed);
> +   if (!(flags & PIPE_FLUSH_DEFERRED)) {

Do we also need to check the ASYNC flag here? Or the top-of-pipe and
bottom-of-pipe flags, which don't have to flush caches if I understand
correctly?

Marek

> +  LIST_FOR_EACH_ENTRY_SAFE(tq, tmp, &tc->unflushed_queries, 
> head_unflushed) {
> + tq->flushed = true;
> + LIST_DEL(&tq->head_unflushed);
> +  }
> }
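
To make the rule in the commit message concrete, here is a small standalone sketch. It assumes nothing about the real gallium types: FLUSH_DEFERRED, struct query, flush_queries and count_flushed_after are hypothetical names, and a plain array replaces the unflushed_queries list. It only models the point under review, that a deferred flush must leave the flushed hint unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for PIPE_FLUSH_DEFERRED. */
#define FLUSH_DEFERRED (1u << 0)

struct query {
   bool flushed;
};

/* Mark queries as flushed only when the flush is not deferred. */
static void
flush_queries(struct query *queries, size_t count, unsigned flags)
{
   assert(queries != NULL || count == 0);

   if (flags & FLUSH_DEFERRED)
      return;

   for (size_t i = 0; i < count; i++)
      queries[i].flushed = true;
}

/* Run one flush over three fresh queries and count how many got marked. */
static size_t
count_flushed_after(unsigned flags)
{
   struct query queries[3] = { { false }, { false }, { false } };
   size_t flushed = 0;

   flush_queries(queries, 3, flags);
   for (size_t i = 0; i < 3; i++) {
      if (queries[i].flushed)
         flushed++;
   }
   return flushed;
}
```

Whether further flags (such as the ASYNC one asked about above) belong in the same check is left open here; the sketch covers the deferred case only.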


Re: [Mesa-dev] [PATCH mesa] git_sha1_gen: create empty file in fallback path

2017-10-29 Thread Dylan Baker
Rb

On October 29, 2017 3:06:28 PM PDT, Eric Engestrom  wrote:
>I missed this part in my conversion; the old stream redirection meant
>the file was always created.
>
>Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103496
>Fixes: 7088622e5fb506b64c90 "buildsys: move file regeneration logic to
>   the script itself"
>Signed-off-by: Eric Engestrom 
>---
> bin/git_sha1_gen.py | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/bin/git_sha1_gen.py b/bin/git_sha1_gen.py
>index 7b9267b59e..68a87e72ec 100755
>--- a/bin/git_sha1_gen.py
>+++ b/bin/git_sha1_gen.py
>@@ -45,3 +45,5 @@ def get_git_sha1():
>                 quit()
>     with open(args.output, 'w') as git_sha1_h:
>         git_sha1_h.write(new_sha1)
>+else:
>+    open(args.output, 'w').close()
>-- 
>Cheers,
>  Eric
>


Re: [Mesa-dev] [PATCH v3 28/34] i965: add cache fallback support using serialized nir

2017-10-29 Thread Jordan Justen
On 2017-10-29 01:11:32, Kenneth Graunke wrote:
> On Sunday, October 22, 2017 1:01:36 PM PDT Jordan Justen wrote:
> > If the i965 gen program cannot be loaded from the cache, then we
> > fall back to using a serialized nir program.
> > 
> > This is based on "i965: add cache fallback support" by Timothy Arceri.
> > Tim's version was written to fall back
> > to compiling from source, and therefore had to be much more complex.
> > After Connor and Jason implemented nir serialization, I was able to
> > rewrite and greatly simplify this patch.
> > 
> > Signed-off-by: Jordan Justen 
> > Acked-by: Timothy Arceri 
> > ---
> >  src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++-
> >  1 file changed, 26 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c 
> > b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > index 503c6c7b499..9af893d40a7 100644
> > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> > @@ -24,6 +24,7 @@
> >  #include "compiler/blob.h"
> >  #include "compiler/glsl/ir_uniform.h"
> >  #include "compiler/glsl/shader_cache.h"
> > +#include "compiler/nir/nir_serialize.h"
> >  #include "main/mtypes.h"
> >  #include "util/disk_cache.h"
> >  #include "util/macros.h"
> > @@ -58,6 +59,27 @@ gen_shader_sha1(struct brw_context *brw, struct 
> > gl_program *prog,
> > _mesa_sha1_compute(manifest, strlen(manifest), out_sha1);
> >  }
> >  
> > +static void
> > +fallback_to_full_recompile(struct brw_context *brw, struct gl_program 
> > *prog,
> 
> It's not exactly a full recompile anymore, maybe rename this to
> recompile_from_nir?  Or fallback_to_partial_recompile?

Good point. I guess eventually we'll recompile from nir, but at this
point we are just restoring the nir program. What about
restore_serialized_nir_shader? Reviewed-by from you with that?

-Jordan

> 
> > +   gl_shader_stage stage)
> > +{
> > +   prog->program_written_to_cache = false;
> > +   if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) {
> > +  fprintf(stderr, "falling back to nir %s.\n",
> > +  _mesa_shader_stage_to_abbrev(prog->info.stage));
> > +   }
> > +
> > +   if (!prog->nir) {
> > +  assert(prog->driver_cache_blob && prog->driver_cache_blob_size > 0);
> > +  const struct nir_shader_compiler_options *options =
> > + brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions;
> > +  struct blob_reader reader;
> > +  blob_reader_init(&reader, prog->driver_cache_blob,
> > +   prog->driver_cache_blob_size);
> > +  prog->nir = nir_deserialize(NULL, options, &reader);
> > +   }
> > +}
> > +
> >  static void
> >  write_blob_program_data(struct blob *binary, const void *program,
> >  size_t program_size,
> > @@ -280,6 +302,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, 
> > gl_shader_stage stage)
> > prog->sh.LinkedTransformFeedback->api_enabled)
> >return false;
> >  
> > +   if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK)
> > +  goto FAIL;
> > +
> > if (prog->sh.data->LinkStatus != linking_skipped)
> >goto FAIL;
> >  
> > @@ -293,7 +318,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, 
> > gl_shader_stage stage)
> > return true;
> >  
> >  FAIL:
> > -   /*FIXME: Fall back and compile from source here. */
> > +   fallback_to_full_recompile(brw, prog, stage);
> > return false;
> >  }
> >  
> > 
> 


[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103505

--- Comment #3 from Matias N. Goldberg  ---
Sounds similar to this bug 99591:
https://bugs.freedesktop.org/show_bug.cgi?id=99591

Try export LD_BIND_NOW=1 before running the Vulkan application. If that doesn't
work, there could be a problem with the config with which LLVM was built.

Are the Vulkan drivers and LLVM provided by your distro, or did you build them
yourself?

Please give more info, like OS, kernel, driver, Mesa version, LLVM version,
etc.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v3 19/43] i965/fs: Support push constants of 16-bit types

2017-10-29 Thread Chema Casanova
On 29/10/17 19:55, Pohjolainen, Topi wrote:
> On Thu, Oct 12, 2017 at 08:38:08PM +0200, Jose Maria Casanova Crespo wrote:
>> We enable the use of 16-bit values in push constants
>> modifying the assign_constant_locations function to work
>> with 16-bit types.
>>
>> The API to access buffers in Vulkan uses multiples of 4 bytes for
>> offsets and sizes. The current accounting of uniforms in 4-byte
>> slots will work for 16-bit values if they are allowed to use 32-bit
>> slots. For that, we replace the division by 4 with a DIV_ROUND_UP, so
>> 2-byte elements will use 1 slot instead of 0.
>>
>> We align the 16-bit locations after assigning the 32-bit
>> ones.
>> ---
>>  src/intel/compiler/brw_fs.cpp | 30 +++---
>>  1 file changed, 23 insertions(+), 7 deletions(-)
>>
>> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
>> index a1d49a63be..8da16145dc 100644
>> --- a/src/intel/compiler/brw_fs.cpp
>> +++ b/src/intel/compiler/brw_fs.cpp
>> @@ -1909,8 +1909,9 @@ set_push_pull_constant_loc(unsigned uniform, int 
>> *chunk_start,
>> if (!contiguous) {
>>/* If bitsize doesn't match the target one, skip it */
>>if (*max_chunk_bitsize != target_bitsize) {
>> - /* FIXME: right now we only support 32 and 64-bit accesses */
>> - assert(*max_chunk_bitsize == 4 || *max_chunk_bitsize == 8);
>> + assert(*max_chunk_bitsize == 4 ||
>> +*max_chunk_bitsize == 8 ||
>> +*max_chunk_bitsize == 2);
>>   *max_chunk_bitsize = 0;
>>   *chunk_start = -1;
>>   return;
>> @@ -1987,8 +1988,9 @@ fs_visitor::assign_constant_locations()
>>   int constant_nr = inst->src[i].nr + inst->src[i].offset / 4;
> 
> Did you test this with, for example, vec4?

CTS has 16bit scalar, vec2 (uint,sint), vec4 (float) and matrix tests
for push constants for compute and graphics pipelines. For vec4 you can try:

dEQP-VK.spirv_assembly.instruction.compute.16bit_storage.push_constant_16_to_32.vector_float

For push constant tests in general there are 42 tests, but vec3 aren't
tested:

dEQP-VK.*16bit_storage.*push_constant.


> I've been toying with a glsl
> lowering pass changing mediump floats into float16. I was curious to know how
> much is needed as you have addressed most of the things from NIR onwards.
> Here I'm seeing offsets 0,2,4,6 which result into 0,0,1,1 when divided by
> four. Don't we need something of this sort in addition?

If I remember correctly, the tests use push constants with
64 16-bit values, so that they fill the minimum value the spec guarantees for
max_push_constants_size, which is 128 bytes. So in the end the generated
intrinsic was:

vec4 16 ssa_4 = intrinsic load_uniform (ssa_3) () (0, 128) /* base=0 */
/* range=128 */

The calculation here counts the number of locations used. The Vulkan API
requires that the offset and the size of a push constant range both be
multiples of 4, so keeping the 4-byte slots was enough to support the
feature. Our changes only adjust the accounting of how many 32-bit
locations are needed, mainly by replacing the divisions by 4 with
DIV_ROUND_UP( , 4) when calculating sizes.
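
As a rough illustration of that accounting (a standalone sketch, not the code in brw_fs.cpp): DIV_ROUND_UP below is written out locally in the usual round-up-division form, and slots_needed and slot_for_offset are hypothetical helper names. It shows why a plain division by 4 gives 0 slots for a single 2-byte element while the rounded-up division gives 1, and how byte offsets 0, 2, 4, 6 map to 4-byte slots 0, 0, 1, 1.

```c
#include <assert.h>

/* Integer division that rounds up; defined locally for this sketch. */
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Hypothetical helper: number of 4-byte slots needed to hold
 * `components` values of `type_size` bytes each. */
static unsigned
slots_needed(unsigned components, unsigned type_size)
{
   assert(type_size == 2 || type_size == 4 || type_size == 8);
   return DIV_ROUND_UP(components * type_size, 4u);
}

/* Hypothetical helper: index of the 4-byte slot containing a byte offset. */
static unsigned
slot_for_offset(unsigned byte_offset)
{
   return byte_offset / 4u;
}
```

With these helpers, the 64 16-bit values from the test occupy slots_needed(64, 2) = 32 slots, that is, the full 128 bytes.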

> commit 1a6d2bf3302f6e4305e383da0f27712dc5c20a67
> Author: Topi Pohjolainen 
> Date:   Sun Oct 29 20:28:03 2017 +0200
> 
> fix alignment of 16-bit uniforms on 32-bit slots
> 
> diff --git a/src/intel/compiler/brw_fs_nir.cpp 
> b/src/intel/compiler/brw_fs_nir.cpp
> index 2f5443958a..586eb9d9ff 100644
> --- a/src/intel/compiler/brw_fs_nir.cpp
> +++ b/src/intel/compiler/brw_fs_nir.cpp
> @@ -4007,7 +4007,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
> nir_intrinsic_instr *instr
>   src.offset = const_offset->u32[0];
>  
>   for (unsigned j = 0; j < instr->num_components; j++) {
> -bld.MOV(offset(dest, bld, j), offset(src, bld, j));
> +const unsigned src_offset =
> +  src.type == BRW_REGISTER_TYPE_HF ? 2 * j : j;
> +
> +bld.MOV(offset(dest, bld, j), offset(src, bld, src_offset));
> 
> 
> 
> Then about the change of using 32-bit slots. This is now unconditional and
> would require revisiting if we wanted to pack 16-bits tighter and possibly
> increase the amount of uniforms that can be pushed. Similarly to Vulkan, in
> GL the core stores uniforms as floats and I think we should keep it that way.

> I added support in the i965 backend to keep track of the types of the
> uniforms and to convert 32-bit presentation to 16-bits on the fly in
> gen6_constant_state.c::brw_param_value(). I don't like it that much but I had
> to start from somewhere.

> My thinking is that we'd want to decouple the storage of the values and the
> packing used in the compiler backend. Ideally keeping the mesa gl core and the
> api working with full 32-bit floats but using tight 16-bit slots in the
> push/pull 

[Mesa-dev] [PATCH mesa] git_sha1_gen: create empty file in fallback path

2017-10-29 Thread Eric Engestrom
I missed this part in my conversion; the old stream redirection meant
the file was always created.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103496
Fixes: 7088622e5fb506b64c90 "buildsys: move file regeneration logic to
   the script itself"
Signed-off-by: Eric Engestrom 
---
 bin/git_sha1_gen.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/bin/git_sha1_gen.py b/bin/git_sha1_gen.py
index 7b9267b59e..68a87e72ec 100755
--- a/bin/git_sha1_gen.py
+++ b/bin/git_sha1_gen.py
@@ -45,3 +45,5 @@ def get_git_sha1():
                 quit()
     with open(args.output, 'w') as git_sha1_h:
         git_sha1_h.write(new_sha1)
+else:
+    open(args.output, 'w').close()
-- 
Cheers,
  Eric



[Mesa-dev] [Bug 102573] fails to build on armel

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102573

--- Comment #10 from Matt Turner  ---
(In reply to Bernd Kuhls from comment #9)

Open a new bug

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 20/21] nir/lower_indirect: Bail early if modes == 0

2017-10-29 Thread Jason Ekstrand

Good point. I'll drop this patch.


On October 29, 2017 05:10:01 Bas Nieuwenhuizen  wrote:


Doesn't the old behavior also lower compact arrays even with modes = 0?



On Sat, Oct 28, 2017 at 8:36 PM, Jason Ekstrand  wrote:

There's no point in walking the program if we're 100% sure we're never going to
actually lower anything.
---
 src/compiler/nir/nir_lower_indirect_derefs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_lower_indirect_derefs.c 
b/src/compiler/nir/nir_lower_indirect_derefs.c

index c949224..f1e060c 100644
--- a/src/compiler/nir/nir_lower_indirect_derefs.c
+++ b/src/compiler/nir/nir_lower_indirect_derefs.c
@@ -202,6 +202,9 @@ nir_lower_indirect_derefs(nir_shader *shader, 
nir_variable_mode modes)

 {
bool progress = false;

+   if (modes == 0)
+  return false;
+
nir_foreach_function(function, shader) {
   if (function->impl)
  progress = lower_indirects_impl(function->impl, modes) || progress;
--
2.5.0.400.gff86faf



[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103507

Felix Schwarz  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |felix.schwarz@oss.schwarz.eu

-- 
You are receiving this mail because:
You are the assignee for the bug.


[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103507

--- Comment #3 from Felix Schwarz  ---
bug 98832 might be a similar issue – that one is about the Radeon HD 6450. (You
have a slightly different model so it might make sense to keep both bugs.)

Maybe you can try to revert this commit:

commit d57c0edfe00d3274b50f91ce3076ed0e82d28782
Author: Alex Deucher 
Date:   Wed Jul 8 14:08:12 2015 -0400

Revert "Revert "drm/radeon: dont switch vt on suspend""

This reverts commit ac9134906b3f5c2b45dc80dab0fee792bd516d52.

We've fixed the underlying problem with cursors, so re-enable
this.

If that fixes it for you I suspect you are hitting the same issue as bug 98832
and bug 99163.

(Btw: You might work around the problem if you just switch to a different
console instead of logout/login.)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


Re: [Mesa-dev] [PATCH v3 19/43] i965/fs: Support push constants of 16-bit types

2017-10-29 Thread Pohjolainen, Topi
On Thu, Oct 12, 2017 at 08:38:08PM +0200, Jose Maria Casanova Crespo wrote:
> We enable the use of 16-bit values in push constants
> modifying the assign_constant_locations function to work
> with 16-bit types.
> 
> The API to access buffers in Vulkan uses multiples of 4 bytes for
> offsets and sizes. The current accounting of uniforms in 4-byte
> slots will work for 16-bit values if they are allowed to use 32-bit
> slots. For that, we replace the division by 4 with a DIV_ROUND_UP, so
> 2-byte elements will use 1 slot instead of 0.
> 
> We align the 16-bit locations after assigning the 32-bit
> ones.
> ---
>  src/intel/compiler/brw_fs.cpp | 30 +++---
>  1 file changed, 23 insertions(+), 7 deletions(-)
> 
> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp
> index a1d49a63be..8da16145dc 100644
> --- a/src/intel/compiler/brw_fs.cpp
> +++ b/src/intel/compiler/brw_fs.cpp
> @@ -1909,8 +1909,9 @@ set_push_pull_constant_loc(unsigned uniform, int 
> *chunk_start,
> if (!contiguous) {
>/* If bitsize doesn't match the target one, skip it */
>if (*max_chunk_bitsize != target_bitsize) {
> - /* FIXME: right now we only support 32 and 64-bit accesses */
> - assert(*max_chunk_bitsize == 4 || *max_chunk_bitsize == 8);
> + assert(*max_chunk_bitsize == 4 ||
> +*max_chunk_bitsize == 8 ||
> +*max_chunk_bitsize == 2);
>   *max_chunk_bitsize = 0;
>   *chunk_start = -1;
>   return;
> @@ -1987,8 +1988,9 @@ fs_visitor::assign_constant_locations()
>   int constant_nr = inst->src[i].nr + inst->src[i].offset / 4;

Did you test this with, for example, vec4? I've been toying with a GLSL
lowering pass changing mediump floats into float16. I was curious to know how
much is needed as you have addressed most of the things from NIR onwards.

Here I'm seeing offsets 0,2,4,6 which result in 0,0,1,1 when divided by
four. Don't we need something of this sort in addition?


commit 1a6d2bf3302f6e4305e383da0f27712dc5c20a67
Author: Topi Pohjolainen 
Date:   Sun Oct 29 20:28:03 2017 +0200

fix alignment of 16-bit uniforms on 32-bit slots

diff --git a/src/intel/compiler/brw_fs_nir.cpp 
b/src/intel/compiler/brw_fs_nir.cpp
index 2f5443958a..586eb9d9ff 100644
--- a/src/intel/compiler/brw_fs_nir.cpp
+++ b/src/intel/compiler/brw_fs_nir.cpp
@@ -4007,7 +4007,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, 
nir_intrinsic_instr *instr
  src.offset = const_offset->u32[0];
 
  for (unsigned j = 0; j < instr->num_components; j++) {
-bld.MOV(offset(dest, bld, j), offset(src, bld, j));
+const unsigned src_offset =
+  src.type == BRW_REGISTER_TYPE_HF ? 2 * j : j;
+
+bld.MOV(offset(dest, bld, j), offset(src, bld, src_offset));



Then about the change to using 32-bit slots. This is now unconditional and
would need revisiting if we wanted to pack 16-bit values tighter and possibly
increase the number of uniforms that can be pushed. Similarly to Vulkan, in
GL the core stores uniforms as floats and I think we should keep it that way.

I added support in the i965 backend to keep track of the types of the
uniforms and to convert the 32-bit representation to 16 bits on the fly in
gen6_constant_state.c::brw_param_value(). I don't like it that much but I had
to start from somewhere.

My thinking is that we'd want to decouple the storage of the values from the
packing used in the compiler backend. Ideally keeping the Mesa GL core and the
API working with full 32-bit floats but using tight 16-bit slots in the
push/pull constant buffers.
This requires quite a few more changes as we have structured
param[]/pull_param[] to work with 32-bit slots.

My current work can be found in:

git://people.freedesktop.org/~tpohjola/mesa 16_bit_gles

>  
>   if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
> -assert(inst->src[2].ud % 4 == 0);
> -unsigned last = constant_nr + (inst->src[2].ud / 4) - 1;
> +assert(type_sz(inst->src[i].type) == 2 ?
> +   (inst->src[2].ud % 2 == 0) : (inst->src[2].ud % 4 == 0));
> +unsigned last = constant_nr + DIV_ROUND_UP(inst->src[2].ud, 4) - 
> 1;
>  assert(last < uniforms);
>  
>  for (unsigned j = constant_nr; j < last; j++) {
> @@ -2000,8 +2002,8 @@ fs_visitor::assign_constant_locations()
>  bitsize_access[last] = MAX2(bitsize_access[last], 
> type_sz(inst->src[i].type));
>   } else {
>  if (constant_nr >= 0 && constant_nr < (int) uniforms) {
> -   int regs_read = inst->components_read(i) *
> -  type_sz(inst->src[i].type) / 4;
> +   int regs_read = DIV_ROUND_UP(inst->components_read(i) *
> +type_sz(inst->src[i].type), 4);
> for (int j = 0; j < regs_read; j++) {
>

[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103505

--- Comment #2 from Valentin Novikov  ---
(In reply to Bas Nieuwenhuizen from comment #1)
> So this seems like mesa 17.2.2?
> 
> Can you get the Vulkan demos (https://github.com/SaschaWillems/Vulkan) and get a
> backtrace of one of the demos?
> 
> The complete dmesg log after failing to run an application would also be
> useful info to have.

dmesg before crash vulkan-smoketest:

https://pastebin.com/9TBSpZHF

after:

https://pastebin.com/bSVS7p3P

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103507

--- Comment #2 from andre35...@yahoo.com ---
Created attachment 135155
  --> https://bugs.freedesktop.org/attachment.cgi?id=135155&action=edit
dmesg output

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103507

--- Comment #1 from andre35...@yahoo.com ---
Created attachment 135154
  --> https://bugs.freedesktop.org/attachment.cgi?id=135154&action=edit
Picture of issue/monitor

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103507

andre35...@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|RGB colors across wake from |Wrong colors on screen when
                   |suspend                     |waking from suspend mode

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103507] RGB colors across wake from suspend

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103507

Bug ID: 103507
   Summary: RGB colors across wake from suspend
   Product: Mesa
   Version: unspecified
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: critical
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: andre35...@yahoo.com
QA Contact: mesa-dev@lists.freedesktop.org

Linux Mint 18.2 
AMD 6570 (open source drivers)
Intel 64bit CPU

Upon resuming the computer from suspend mode, around 3/5 times, the entire
screen/Cinnamon DE is in some weird contrast mode where everything is pink/blue
and hard to read. A logout/login restores the colors back to the way they
should be. I have reported this to the Linux Mint Github page and a mod marked
it as a possible Mesa issue so I am here.

Here is a picture of the issue https://i.imgur.com/q5g4LEV.jpg 
Here is dmesg output if that proves useful: https://pastebin.com/tRwmTM1P

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103506] Max core profile version: 0.0 in the "Drivers/DRI/r300" component of the "Mesa"

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103506

Ilia Mirkin  changed:

           What    |Removed                        |Added
-------------------------------------------------------------------------------------
          Component|Other                          |Drivers/Gallium/r300
         QA Contact|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org
           Assignee|mesa-dev@lists.freedesktop.org |dri-devel@lists.freedesktop.org

--- Comment #1 from Ilia Mirkin  ---
Is there a question somewhere in there? (There is currently no r300 "dri"
driver, only a gallium one, btw.)

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103506] Max core profile version: 0.0 in the "Drivers/DRI/r300" component of the "Mesa"

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103506

Bug ID: 103506
   Summary: Max core profile version: 0.0 in the
"Drivers/DRI/r300" component of the "Mesa"
   Product: Mesa
   Version: unspecified
  Hardware: Other
OS: Linux (All)
Status: NEW
  Severity: critical
  Priority: medium
 Component: Other
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: pythonal...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
Max core profile version: 0.0
Max compat profile version: 2.1
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 2.0
OpenGL version string: 2.1 Mesa 17.1.8
OpenGL shading language version string: 1.20
OpenGL ES profile version string: OpenGL ES 2.0 Mesa 17.1.8
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.


[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103505

Bas Nieuwenhuizen  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Mesa core                   |Drivers/Vulkan/radeon

--- Comment #1 from Bas Nieuwenhuizen  ---
So this seems like mesa 17.2.2?

Can you get the Vulkan demos (https://github.com/SaschaWillems/Vulkan) and get a
backtrace of one of the demos?

The complete dmesg log after failing to run an application would also be useful
info to have.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=103505

Bug ID: 103505
   Summary: RX 480, newest mesa, VULKAN Does not start
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: critical
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: mrmeln...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org

Vulkan-smoketest: segmentation fault (core dumped)

Dota 2: just does not start

vkquake: segmentation fault (core dumped)

vulkaninfo:

https://pastebin.com/SXpb18sP

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.


[Mesa-dev] [Bug 102573] fails to build on armel

2017-10-29 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=102573

--- Comment #9 from Bernd Kuhls  ---
Hi,

this patch breaks building mesa3d 17.2.3 with

Target: powerpc-ctng_e500v2-linux-gnuspe
gcc version 4.7.3 (crosstool-NG hg+-c65fcf8a34b7) 

as reported by buildroot autobuilders:

http://autobuild.buildroot.net/?reason=mesa3d-17.2.3

Quoting
http://autobuild.buildroot.net/results/43d/43d8bf9a1531f4b69e22bfb53b4536d76cf31cbb/build-end.log

/home/peko/autobuild/instance-0/output/host/opt/ext-toolchain/bin/../lib/gcc/powerpc-ctng_e500v2-linux-gnuspe/4.7.3/../../../../powerpc-ctng_e500v2-linux-gnuspe/bin/ld:
cannot find -latomic

Quoting from configure output:

checking whether -latomic is needed... yes
checking whether __sync_add_and_fetch_8 is supported... no

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 20/21] nir/lower_indirect: Bail early if modes == 0

2017-10-29 Thread Bas Nieuwenhuizen
Doesn't the old behavior also lower compact arrays even with modes = 0?



On Sat, Oct 28, 2017 at 8:36 PM, Jason Ekstrand  wrote:
> There's no point in walking the program if we're 100% sure we're never going to
> actually lower anything.
> ---
>  src/compiler/nir/nir_lower_indirect_derefs.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/compiler/nir/nir_lower_indirect_derefs.c 
> b/src/compiler/nir/nir_lower_indirect_derefs.c
> index c949224..f1e060c 100644
> --- a/src/compiler/nir/nir_lower_indirect_derefs.c
> +++ b/src/compiler/nir/nir_lower_indirect_derefs.c
> @@ -202,6 +202,9 @@ nir_lower_indirect_derefs(nir_shader *shader, 
> nir_variable_mode modes)
>  {
> bool progress = false;
>
> +   if (modes == 0)
> +  return false;
> +
> nir_foreach_function(function, shader) {
>if (function->impl)
>   progress = lower_indirects_impl(function->impl, modes) || progress;
> --
> 2.5.0.400.gff86faf
>


Re: [Mesa-dev] [PATCH v3 13/34] glsl/shader_cache: Save and restore serialized nir in gl_program

2017-10-29 Thread Kenneth Graunke
On Sunday, October 22, 2017 1:01:21 PM PDT Jordan Justen wrote:
> v3:
>  * Rename serialized_nir* to driver_cache_blob*. (Tim)
> 
> Signed-off-by: Jordan Justen 
> Reviewed-by: Timothy Arceri 
> ---
>  src/compiler/glsl/shader_cache.cpp | 16 
>  1 file changed, 16 insertions(+)
> 
> diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp
> index ca90cfde350..1d208fb0911 100644
> --- a/src/compiler/glsl/shader_cache.cpp
> +++ b/src/compiler/glsl/shader_cache.cpp
> @@ -1062,6 +1062,14 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader)
> }
>  
> write_shader_parameters(metadata, glprog->Parameters);
> +
> +   assert((glprog->driver_cache_blob == NULL) ==
> +  (glprog->driver_cache_blob_size == 0));
> +   blob_write_uint32(metadata, (uint32_t)glprog->driver_cache_blob_size);
> +   if (glprog->driver_cache_blob_size > 0) {
> +  blob_write_bytes(metadata, glprog->driver_cache_blob,
> +   glprog->driver_cache_blob_size);
> +   }
>  }
>  
>  static void
> @@ -1116,6 +1124,14 @@ read_shader_metadata(struct blob_reader *metadata,
>  
> glprog->Parameters = _mesa_new_parameter_list();
> read_shader_parameters(metadata, glprog->Parameters);
> +
> +   glprog->driver_cache_blob_size = (size_t)blob_read_uint32(metadata);
> +   if (glprog->driver_cache_blob_size > 0) {
> +  glprog->driver_cache_blob =
> + (uint8_t*)ralloc_size(glprog, glprog->driver_cache_blob_size);
> +  blob_copy_bytes(metadata, glprog->driver_cache_blob,
> +  glprog->driver_cache_blob_size);
> +   }

Shouldn't you check for overrun here, and leave things in a consistent
state (passing the assertion above)?

>  }
>  
>  static void
> 
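
One way to address the overrun question raised above, as a rough sketch only. The reader type and helper names below are hypothetical miniatures written for this example and are not Mesa's blob API. The idea is to publish the pointer and size only after both reads have succeeded:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature reader; it only mimics the idea of a sticky
 * "overrun" flag and is not Mesa's struct blob_reader. */
struct mini_reader {
   const uint8_t *cur;
   const uint8_t *end;
   bool overrun;
};

/* Copy size bytes out of the reader, or set overrun and copy nothing. */
static bool
mini_copy_bytes(struct mini_reader *r, void *dest, size_t size)
{
   if (r->overrun || (size_t)(r->end - r->cur) < size) {
      r->overrun = true;
      return false;
   }
   memcpy(dest, r->cur, size);
   r->cur += size;
   return true;
}

/* Read a length-prefixed blob.  On any failure the outputs stay NULL / 0,
 * so (blob == NULL) == (size == 0) still holds afterwards. */
bool
mini_read_cache_blob(struct mini_reader *r, uint8_t **blob, size_t *size)
{
   uint32_t stored_size = 0;

   *blob = NULL;
   *size = 0;

   if (!mini_copy_bytes(r, &stored_size, sizeof(stored_size)))
      return false;
   if (stored_size == 0)
      return true;   /* nothing stored; state is still consistent */

   uint8_t *data = malloc(stored_size);
   if (data == NULL)
      return false;
   if (!mini_copy_bytes(r, data, stored_size)) {
      free(data);    /* truncated input: drop the partial state */
      return false;
   }

   *blob = data;
   *size = stored_size;
   return true;
}
```

A truncated buffer then leaves the caller with a NULL pointer and a zero size, which is the state the assertion on the write side expects.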





Re: [Mesa-dev] [PATCH v3 28/34] i965: add cache fallback support using serialized nir

2017-10-29 Thread Kenneth Graunke
On Sunday, October 22, 2017 1:01:36 PM PDT Jordan Justen wrote:
> If the i965 gen program cannot be loaded from the cache, then we
> fall back to using a serialized nir program.
> 
> This is based on "i965: add cache fallback support" by Timothy Arceri.
> Tim's version was written to fall back to compiling from source, and
> therefore had to be much more complex. After Connor and Jason
> implemented nir serialization, I was able to rewrite and greatly
> simplify this patch.
> 
> Signed-off-by: Jordan Justen 
> Acked-by: Timothy Arceri 
> ---
>  src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> index 503c6c7b499..9af893d40a7 100644
> --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c
> +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> @@ -24,6 +24,7 @@
>  #include "compiler/blob.h"
>  #include "compiler/glsl/ir_uniform.h"
>  #include "compiler/glsl/shader_cache.h"
> +#include "compiler/nir/nir_serialize.h"
>  #include "main/mtypes.h"
>  #include "util/disk_cache.h"
>  #include "util/macros.h"
> @@ -58,6 +59,27 @@ gen_shader_sha1(struct brw_context *brw, struct gl_program *prog,
> _mesa_sha1_compute(manifest, strlen(manifest), out_sha1);
>  }
>  
> +static void
> +fallback_to_full_recompile(struct brw_context *brw, struct gl_program *prog,

It's not exactly a full recompile anymore, maybe rename this to
recompile_from_nir?  Or fallback_to_partial_recompile?

> +   gl_shader_stage stage)
> +{
> +   prog->program_written_to_cache = false;
> +   if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) {
> +  fprintf(stderr, "falling back to nir %s.\n",
> +  _mesa_shader_stage_to_abbrev(prog->info.stage));
> +   }
> +
> +   if (!prog->nir) {
> +  assert(prog->driver_cache_blob && prog->driver_cache_blob_size > 0);
> +  const struct nir_shader_compiler_options *options =
> + brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions;
> +  struct blob_reader reader;
> +  blob_reader_init(&reader, prog->driver_cache_blob,
> +   prog->driver_cache_blob_size);
> +  prog->nir = nir_deserialize(NULL, options, &reader);
> +   }
> +}
> +
>  static void
>  write_blob_program_data(struct blob *binary, const void *program,
>  size_t program_size,
> @@ -280,6 +302,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage)
> prog->sh.LinkedTransformFeedback->api_enabled)
>return false;
>  
> +   if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK)
> +  goto FAIL;
> +
> if (prog->sh.data->LinkStatus != linking_skipped)
>goto FAIL;
>  
> @@ -293,7 +318,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage)
> return true;
>  
>  FAIL:
> -   /*FIXME: Fall back and compile from source here. */
> +   fallback_to_full_recompile(brw, prog, stage);
> return false;
>  }
>  
> 
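
The fallback order described in the commit message can be sketched roughly as follows. All types and names here are hypothetical and exist only for illustration. The third tier is included for completeness; the patch itself asserts that the serialized blob is present.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of what is available for one shader stage. */
struct sketch_cached_stage {
   const uint8_t *native_binary;   /* hardware binary from the disk cache */
   size_t native_binary_size;
   const uint8_t *serialized_ir;   /* serialized IR saved with the metadata */
   size_t serialized_ir_size;
};

enum sketch_source {
   SKETCH_USE_NATIVE_BINARY,    /* cache hit: upload the binary as-is */
   SKETCH_COMPILE_FROM_IR,      /* deserialize the IR, run the backend */
   SKETCH_COMPILE_FROM_SOURCE   /* nothing cached: full compile */
};

/* Pick the cheapest usable starting point for this stage. */
enum sketch_source
sketch_pick_source(const struct sketch_cached_stage *stage)
{
   if (stage->native_binary != NULL && stage->native_binary_size > 0)
      return SKETCH_USE_NATIVE_BINARY;
   if (stage->serialized_ir != NULL && stage->serialized_ir_size > 0)
      return SKETCH_COMPILE_FROM_IR;
   return SKETCH_COMPILE_FROM_SOURCE;
}
```

Starting from serialized IR skips the frontend work, which is why the middle tier is only a partial recompile.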





Re: [Mesa-dev] [PATCH v3 22/34] i965: Add shader cache support for vertex and fragment stages

2017-10-29 Thread Kenneth Graunke
On Sunday, October 22, 2017 1:01:30 PM PDT Jordan Justen wrote:
> From: Timothy Arceri 
> 
> This enables the cache on vertex and fragment shaders only.
> 
> v2:
>  * Use MAYBE_UNUSED. (Matt)
> 
> [jordan.l.jus...@intel.com: reword subject]
> [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
> Signed-off-by: Jordan Justen 

Patches 22-27 are:
Reviewed-by: Kenneth Graunke 




Re: [Mesa-dev] [PATCH v3 21/34] i965: add initial implementation of on disk shader cache

2017-10-29 Thread Kenneth Graunke
On Sunday, October 22, 2017 1:01:29 PM PDT Jordan Justen wrote:
> From: Timothy Arceri 
> 
> This uses the recently-added disk_cache.c to write out the final
> linked binary for vertex and fragment shader programs.
> 
> This is based off the initial implementation done by Carl Worth.
> 
> v2:
>  * Squash 'i965: add image param shader cache support'
>  * Squash 'i965: add shader cache support for pull param pointers'
>  * Substantially simplified by a rework on top of Jason's 2975e4c56a7a.
>  * Rename load_program_data to read_program_data. (Jason)
> 
> v3:
>  * Simplify and align program read/write. (Jason)
> 
> [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
> [jordan.l.jus...@intel.com: brw_shader_cache.c => brw_disk_cache.c]
> [jordan.l.jus...@intel.com: don't map to write program when LLC is present]
> [jordan.l.jus...@intel.com: set program_written_to_cache on read from cache]
> [jordan.l.jus...@intel.com: only try cache when status is linking_skipped]
> [jordan.l.jus...@intel.com: rework based on uniforms rework 2975e4c56a7a]
> [jordan.l.jus...@intel.com: Simplify and align program read/write]
> Signed-off-by: Jordan Justen 
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>  src/mesa/drivers/dri/i965/brw_disk_cache.c | 329 
> +
>  src/mesa/drivers/dri/i965/brw_state.h  |   5 +
>  src/mesa/drivers/dri/i965/meson.build  |   1 +
>  4 files changed, 336 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_disk_cache.c
> 
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index 053d89b81ec..2980cdb3c54 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -14,6 +14,7 @@ i965_FILES = \
>   brw_cs.h \
>   brw_curbe.c \
>   brw_defines.h \
> + brw_disk_cache.c \
>   brw_draw.c \
>   brw_draw.h \
>   brw_draw_upload.c \
> diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> new file mode 100644
> index 000..186cbe83706
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> @@ -0,0 +1,329 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "compiler/blob.h"
> +#include "compiler/glsl/ir_uniform.h"
> +#include "compiler/glsl/shader_cache.h"
> +#include "main/mtypes.h"
> +#include "util/disk_cache.h"
> +#include "util/macros.h"
> +#include "util/mesa-sha1.h"
> +
> +#include "brw_context.h"
> +#include "brw_state.h"
> +#include "brw_vs.h"
> +#include "brw_wm.h"
> +
> +static void
> +gen_shader_sha1(struct brw_context *brw, struct gl_program *prog,
> +gl_shader_stage stage, void *key, unsigned char *out_sha1)
> +{
> +   char sha1_buf[41];
> +   unsigned char sha1[20];
> +   char manifest[256];
> +   int offset = 0;
> +
> +   _mesa_sha1_format(sha1_buf, prog->sh.data->sha1);
> +   offset += snprintf(manifest, sizeof(manifest), "program: %s\n", sha1_buf);
> +
> +   _mesa_sha1_compute(key, brw_prog_key_size(stage), sha1);
> +   _mesa_sha1_format(sha1_buf, sha1);
> +   offset += snprintf(manifest + offset, sizeof(manifest) - offset,
> +  "%s_key: %s\n", _mesa_shader_stage_to_abbrev(stage),
> +  sha1_buf);
> +
> +   _mesa_sha1_compute(manifest, strlen(manifest), out_sha1);
> +}
> +
> +static void
> +write_blob_program_data(struct blob *binary, const void *program,
> +size_t program_size,
> +struct brw_stage_prog_data *prog_data,
> +size_t prog_data_size)
> +{
> +   /* Write program to blob. */
> +   blob_write_uint32(binary, program_size);
> +