Re: [Mesa-dev] [PATCH 00/21] anv: Do cross-stage link optimizations
On October 29, 2017 21:34:01 Timothy Arceri wrote:
> On 29/10/17 12:58, Jason Ekstrand wrote:
> > On Sat, Oct 28, 2017 at 11:36 AM, Jason Ekstrand wrote:
> > > This series adds support for cross-stage optimizations in anv. There
> > > are a few patches from Jordan's shader cache series in here that I
> > > wanted because they made my life easier. There are also three patches
> > > CCd to stable to fix a bug in the i965 cross-stage NIR linking which,
> > > as a side-effect, expose a nice brw_nir_link_shaders helper that we
> > > can use in anv. The bulk of the series, however, is the annoying
> > > refactoring of anv_pipeline.c to let us work with and cache the
> > > shaders an entire pipeline at a time instead of having everything be
> > > per-stage. The patch to actually add the NIR link optimizations to
> > > ANV is almost trivial.
> > >
> > > On my thermally throttled (and therefore a bit inconsistent) laptop,
> > > this seems to help the Aztec Ruins benchmark by 2%.
> >
> > Or not... I'm having trouble reproducing it now.
>
> For what it's worth, RADV had improvements in the following:
>
> Sascha Willems demo results:
>
> computecullandlod 39 -> 41 fps
> pipelines ~6100 -> ~6200 fps
>
> The biggest improvement is with the component packing enabled:
>
> SaschaWillems Vulkan demo tessellation: ~4300fps -> ~4800fps

Yeah, I tried those out but they were too noisy on my laptop to get good data. I asked Eero to try and get me some better numbers on his perf setup.

> > > Carl Worth (1):
> > >   intel/compiler: add new field for storing program size
> > >
> > > Jason Ekstrand (17):
> > >   anv/pipeline: Rework the parameters to populate_wm_prog_key
> > >   anv/pipeline: Add populate_tcs/tes_key helpers
> > >   anv/pipline: Add a helper struct for per-stage info
> > >   anv/pipeline: Populate keys up-front
> > >   anv/pipeline: Hash the entire pipeline in one go
> > >   anv/pipeline: Call anv_pipeline_compile_* in a loop
> > >   anv/pipeline: Pull shader compilation out into a helper.
> > >   anv/pipeline: Drop anv_pipeline_add_compiled_stage
> > >   anv/pipeline: Recompile all shaders if any are missing from the cache
> > >   anv/pipeline: Compile to NIR in compile_graphics
> > >   anv/pipeline: Add a separate "link" stage
> > >   anv/pipeline: Pull most of the anv_pipeline_compile_* into common code
> > >   intel/nir: Add a helper for getting the NoIndirect mask
> > >   intel/nir: Break the linking code into a helper in brw_nir.c
> > >   intel/nir: Use the correct indirect lowering masks in link_shaders
> > >   nir/lower_indirect: Bail early if modes == 0
> > >   anv/pipeline: Do cross-stage linking optimizations
> > >
> > > Jordan Justen (3):
> > >   intel/compiler: Add union types for prog_data and prog_key stages
> > >   intel/compiler: Add functions to get prog_data and prog_key sizes for a stage
> > >   intel/compiler: Remove final_program_size from brw_compile_*
> > >
> > >  src/compiler/nir/nir_lower_indirect_derefs.c |   3 +
> > >  src/intel/blorp/blorp.c                      |  10 +-
> > >  src/intel/blorp/blorp_blit.c                 |   5 +-
> > >  src/intel/blorp/blorp_clear.c                |  15 +-
> > >  src/intel/blorp/blorp_priv.h                 |   6 +-
> > >  src/intel/compiler/brw_compiler.c            |  36 +
> > >  src/intel/compiler/brw_compiler.h            |  34 +-
> > >  src/intel/compiler/brw_fs.cpp                |   6 +-
> > >  src/intel/compiler/brw_nir.c                 |  63 +-
> > >  src/intel/compiler/brw_nir.h                 |   4 +
> > >  src/intel/compiler/brw_shader.cpp            |  12 +-
> > >  src/intel/compiler/brw_vec4.cpp              |   5 +-
> > >  src/intel/compiler/brw_vec4_gs_visitor.cpp   |   8 +-
> > >  src/intel/compiler/brw_vec4_tcs.cpp          |  12 +-
> > >  src/intel/vulkan/anv_pipeline.c              | 971 ++-
> > >  src/intel/vulkan/anv_private.h               |   2 +-
> > >  src/intel/vulkan/genX_pipeline.c             |   2 -
> > >  src/mesa/drivers/dri/i965/brw_cs.c           |   5 +-
> > >  src/mesa/drivers/dri/i965/brw_gs.c           |   5 +-
> > >  src/mesa/drivers/dri/i965/brw_link.cpp       |  38 +-
> > >  src/mesa/drivers/dri/i965/brw_tcs.c          |   5 +-
> > >  src/mesa/drivers/dri/i965/brw_tes.c          |   5 +-
> > >  src/mesa/drivers/dri/i965/brw_vs.c           |  11 +-
> > >  src/mesa/drivers/dri/i965/brw_wm.c           |   5 +-
> > >  24 files changed, 668 insertions(+), 600 deletions(-)
> > >
> > > --
> > > 2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 9/9] radv: enable nir component packing
SaschaWillems Vulkan demo tessellation: ~4000fps -> ~4600fps

Reviewed-by: Bas Nieuwenhuizen
---
 src/amd/vulkan/radv_pipeline.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index 322cd7951b2..ec7c2393fc9 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1815,6 +1815,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
 			last = i;
 	}
 
+	int prev = -1;
 	for (unsigned i = 0; i < MESA_SHADER_STAGES; ++i) {
 		const VkPipelineShaderStageCreateInfo *stage = pStages[i];
 
@@ -1845,6 +1846,11 @@ void radv_create_shaders(struct radv_pipeline *pipeline,
 			nir_lower_io_to_scalar_early(nir[i], mask);
 			radv_optimize_nir(nir[i]);
 		}
+
+		if (prev != -1) {
+			nir_compact_varyings(nir[prev], nir[i], true);
+		}
+		prev = i;
 	}
 
 	if (nir[MESA_SHADER_TESS_CTRL]) {
-- 
2.13.6
[Mesa-dev] [PATCH 6/9] nir: add varying component packing helpers
v2: update shader info input/output masks when packing components

Reviewed-by: Bas Nieuwenhuizen (v1)
---
 src/compiler/nir/nir.h                 |   2 +
 src/compiler/nir/nir_linking_helpers.c | 272 +
 2 files changed, 274 insertions(+)

diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index 095cc6600ad..2b46cefc4f7 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2420,6 +2420,8 @@ void nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
 
 /* Some helpers to do very simple linking */
 bool nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer);
+void nir_compact_varyings(nir_shader *producer, nir_shader *consumer,
+                          bool default_to_smooth_interp);
 
 typedef enum {
    /* If set, this forces all non-flat fragment shader inputs to be
diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c
index 4d709c1b3c5..f7355af2195 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -173,3 +173,275 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer)
 
    return progress;
 }
+
+static uint8_t
+get_interp_type(nir_variable *var, bool default_to_smooth_interp)
+{
+   if (var->data.interpolation != INTERP_MODE_NONE)
+      return var->data.interpolation;
+   else if (default_to_smooth_interp)
+      return INTERP_MODE_SMOOTH;
+   else
+      return INTERP_MODE_NONE;
+}
+
+static void
+get_slot_component_masks_and_interp_types(struct exec_list *var_list,
+                                          uint8_t *comps, uint8_t *interp_type,
+                                          gl_shader_stage stage,
+                                          bool default_to_smooth_interp)
+{
+   nir_foreach_variable_safe(var, var_list) {
+      assert(var->data.location >= 0);
+
+      /* Only remap things that aren't built-ins.
+       * TODO: add TES patch support.
+       */
+      if (var->data.location >= VARYING_SLOT_VAR0 &&
+          var->data.location - VARYING_SLOT_VAR0 < 32) {
+
+         const struct glsl_type *type = var->type;
+         if (nir_is_per_vertex_io(var, stage)) {
+            assert(glsl_type_is_array(type));
+            type = glsl_get_array_element(type);
+         }
+
+         unsigned location = var->data.location - VARYING_SLOT_VAR0;
+         unsigned elements =
+            glsl_get_vector_elements(glsl_without_array(type));
+
+         bool dual_slot = glsl_type_is_dual_slot(glsl_without_array(type));
+         unsigned slots = glsl_count_attribute_slots(type, false);
+         unsigned comps_slot2 = 0;
+         for (unsigned i = 0; i < slots; i++) {
+            interp_type[location + i] =
+               get_interp_type(var, default_to_smooth_interp);
+
+            if (dual_slot) {
+               if (i & 1) {
+                  comps[location + i] |= ((1 << comps_slot2) - 1);
+               } else {
+                  unsigned num_comps = 4 - var->data.location_frac;
+                  comps_slot2 = (elements * 2) - num_comps;
+
+                  /* Assume ARB_enhanced_layouts packing rules for doubles */
+                  assert(var->data.location_frac == 0 ||
+                         var->data.location_frac == 2);
+                  assert(comps_slot2 <= 4);
+
+                  comps[location + i] |=
+                     ((1 << num_comps) - 1) << var->data.location_frac;
+               }
+            } else {
+               comps[location + i] |=
+                  ((1 << elements) - 1) << var->data.location_frac;
+            }
+         }
+      }
+   }
+}
+
+struct varying_loc
+{
+   uint8_t component;
+   uint32_t location;
+};
+
+static void
+remap_slots_and_components(struct exec_list *var_list, gl_shader_stage stage,
+                           struct varying_loc (*remap)[4], uint64_t *slots_used)
+{
+   /* We don't touch builtins so just copy the bitmask */
+   uint64_t slots_used_tmp =
+      *slots_used & (((uint64_t)1 << (VARYING_SLOT_VAR0 - 1)) - 1);
+
+   nir_foreach_variable(var, var_list) {
+      assert(var->data.location >= 0);
+
+      /* Only remap things that aren't built-ins */
+      if (var->data.location >= VARYING_SLOT_VAR0 &&
+          var->data.location - VARYING_SLOT_VAR0 < 32) {
+         assert(var->data.location - VARYING_SLOT_VAR0 < 32);
+         assert(remap[var->data.location - VARYING_SLOT_VAR0] >= 0);
+
+         unsigned location = var->data.location - VARYING_SLOT_VAR0;
+         struct varying_loc *new_loc = &remap[location][var->data.location_frac];
+         if (new_loc->location) {
+            var->data.location = new_loc->location;
+            var->data.location_frac = new_loc->component;
+         }
+
+         const struct glsl_type *type = var->type;
+         if (nir_is_per_vertex_io(var, stage)) {
+            assert(glsl_type_is_array(type));
+            type =
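The per-slot component mask built in the non-dual-slot branch of the patch above can be looked at in isolation. The following is only a minimal sketch of that bit arithmetic, not the Mesa code: `component_mask` and `components_are_free` are hypothetical helpers, and the sketch assumes a slot with four components where `location_frac` is the first component the varying occupies.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of Mesa: returns a 4-bit mask with one bit
 * set for each component of a slot occupied by a varying that has
 * `elements` components and starts at component `location_frac`.
 */
static uint8_t
component_mask(unsigned elements, unsigned location_frac)
{
   /* A slot has four components, so the varying must fit inside it. */
   assert(elements >= 1 && location_frac + elements <= 4);

   return (uint8_t)(((1u << elements) - 1u) << location_frac);
}

/* Hypothetical helper: non-zero if a varying with `elements` components
 * could be placed at `location_frac` without overlapping any component
 * already marked in `used`.
 */
static int
components_are_free(uint8_t used, unsigned elements, unsigned location_frac)
{
   return (used & component_mask(elements, location_frac)) == 0;
}
```

Under these assumptions a vec2 starting at component 1 gives the mask 0x6, and it can share a slot with a float that already occupies component 0 (mask 0x1) because the two masks do not overlap.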
[Mesa-dev] [PATCH 3/9] i965: move update_xfb_info() call out of loop
We can just call it once. A following patch will also introduce link-time component packing, which modifies the outputs_written bit mask, so we want to avoid calling update_xfb_info() until after packing is completed.
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 1a28e63fcae..b6c5362a1ee 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -325,8 +325,6 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 
       infos[stage] = &prog->nir->info;
 
-      update_xfb_info(prog->sh.LinkedTransformFeedback, infos[stage]);
-
       /* Make a pass over the IR to add state references for any built-in
        * uniforms that are used. This has to be done now (during linking).
        * Code generation doesn't happen until the first time this shader is
@@ -347,6 +345,11 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
       }
    }
 
+   if (shProg->last_vert_prog) {
+      update_xfb_info(shProg->last_vert_prog->sh.LinkedTransformFeedback,
+                      &shProg->last_vert_prog->nir->info);
+   }
+
    /* The linker tries to dead code eliminate unused varying components,
     * and make sure interfaces match. But it isn't able to do so in all
     * cases. So, explicitly make the interfaces match by OR'ing together
-- 
2.13.6
[Mesa-dev] [PATCH 2/9] nir: add varying array splitting pass
---
 src/compiler/Makefile.sources                      |   1 +
 src/compiler/nir/meson.build                       |   1 +
 src/compiler/nir/nir.h                             |   1 +
 src/compiler/nir/nir_lower_io_arrays_to_elements.c | 371 +
 4 files changed, 374 insertions(+)
 create mode 100644 src/compiler/nir/nir_lower_io_arrays_to_elements.c

diff --git a/src/compiler/Makefile.sources b/src/compiler/Makefile.sources
index 27cc33ab835..ac9a3c8549c 100644
--- a/src/compiler/Makefile.sources
+++ b/src/compiler/Makefile.sources
@@ -227,6 +227,7 @@ NIR_FILES = \
 	nir/nir_lower_indirect_derefs.c \
 	nir/nir_lower_int64.c \
 	nir/nir_lower_io.c \
+	nir/nir_lower_io_arrays_to_elements.c \
 	nir/nir_lower_io_to_temporaries.c \
 	nir/nir_lower_io_to_scalar.c \
 	nir/nir_lower_io_types.c \
diff --git a/src/compiler/nir/meson.build b/src/compiler/nir/meson.build
index cb88effa628..ab0aa65eb1e 100644
--- a/src/compiler/nir/meson.build
+++ b/src/compiler/nir/meson.build
@@ -114,6 +114,7 @@ files_libnir = files(
   'nir_lower_indirect_derefs.c',
   'nir_lower_int64.c',
   'nir_lower_io.c',
+  'nir_lower_io_arrays_to_elements.c',
   'nir_lower_io_to_temporaries.c',
   'nir_lower_io_to_scalar.c',
   'nir_lower_io_types.c',
diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h
index dd833cf1831..095cc6600ad 100644
--- a/src/compiler/nir/nir.h
+++ b/src/compiler/nir/nir.h
@@ -2454,6 +2454,7 @@ bool nir_lower_alu_to_scalar(nir_shader *shader);
 bool nir_lower_load_const_to_scalar(nir_shader *shader);
 bool nir_lower_read_invocation_to_scalar(nir_shader *shader);
 bool nir_lower_phis_to_scalar(nir_shader *shader);
+void nir_lower_io_arrays_to_elements(nir_shader *producer, nir_shader *consumer);
 void nir_lower_io_to_scalar(nir_shader *shader, nir_variable_mode mask);
 void nir_lower_io_to_scalar_early(nir_shader *shader, nir_variable_mode mask);
 
diff --git a/src/compiler/nir/nir_lower_io_arrays_to_elements.c b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
new file mode 100644
index 00000000000..3a8e2dc1933
--- /dev/null
+++ b/src/compiler/nir/nir_lower_io_arrays_to_elements.c
@@ -0,0 +1,371 @@
+/*
+ * Copyright © 2017 Timothy Arceri
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include "nir.h"
+#include "nir_builder.h"
+
+/** @file nir_lower_io_arrays_to_elements.c
+ *
+ * Split arrays/matrices with direct indexing into individual elements. This
+ * will allow optimisation passes to better clean up unused elements.
+ *
+ */
+
+static unsigned
+get_io_offset(nir_builder *b, nir_deref_var *deref, nir_variable *var,
+              unsigned *element_index)
+{
+   nir_deref *tail = &deref->deref;
+
+   /* For per-vertex input arrays (i.e. geometry shader inputs), skip the
+    * outermost array index. Process the rest normally.
+    */
+   if (nir_is_per_vertex_io(var, b->shader->info.stage)) {
+      tail = tail->child;
+   }
+
+   unsigned offset = 0;
+   while (tail->child != NULL) {
+      tail = tail->child;
+
+      if (tail->deref_type == nir_deref_type_array) {
+         nir_deref_array *deref_array = nir_deref_as_array(tail);
+         assert(deref_array->deref_array_type != nir_deref_array_type_indirect);
+
+         unsigned size = glsl_count_attribute_slots(tail->type, false);
+         offset += size * deref_array->base_offset;
+
+         unsigned num_elements = glsl_type_is_array(tail->type) ?
+            glsl_get_aoa_size(tail->type) : 1;
+
+         num_elements *= glsl_type_is_matrix(glsl_without_array(tail->type)) ?
+            glsl_get_matrix_columns(glsl_without_array(tail->type)) : 1;
+
+         *element_index += num_elements * deref_array->base_offset;
+      } else if (tail->deref_type == nir_deref_type_struct) {
+         /* TODO: we could also add struct splitting support to this pass */
+         break;
+      }
+   }
+
+   return offset;
+}
+
[Mesa-dev] [PATCH 8/9] radv: enable nir varying array splitting
---
 src/amd/vulkan/radv_pipeline.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c
index c25642c9667..322cd7951b2 100644
--- a/src/amd/vulkan/radv_pipeline.c
+++ b/src/amd/vulkan/radv_pipeline.c
@@ -1666,6 +1666,9 @@ radv_link_shaders(struct radv_pipeline *pipeline, nir_shader **shaders)
 	}
 
 	for (int i = 1; i < shader_count; ++i) {
+		nir_lower_io_arrays_to_elements(ordered_shaders[i],
+						ordered_shaders[i - 1]);
+
 		nir_remove_dead_variables(ordered_shaders[i],
 					  nir_var_shader_out);
 		nir_remove_dead_variables(ordered_shaders[i - 1],
-- 
2.13.6
[Mesa-dev] [PATCH 7/9] i965: enable varying component packing for BDW+
shader-db results BDW:

total instructions in shared programs: 13192895 -> 13182437 (-0.08%)
instructions in affected programs: 827145 -> 816687 (-1.26%)
helped: 5199
HURT: 116

total cycles in shared programs: 539249342 -> 539156566 (-0.02%)
cycles in affected programs: 21894552 -> 21801776 (-0.42%)
helped: 10667
HURT: 7196

LOST: 0
GAINED: 17
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index 46dbcac8430..782135430cb 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -329,6 +329,7 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
       }
    }
 
+   int prev = -1;
    for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
       struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
       if (!shader)
@@ -340,6 +341,12 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
       NIR_PASS_V(prog->nir, nir_lower_samplers, shProg);
       NIR_PASS_V(prog->nir, nir_lower_atomics, shProg);
 
+      if (brw->screen->devinfo.gen >= 8 && prev != -1) {
+         nir_compact_varyings(shProg->_LinkedShaders[prev]->Program->nir,
+                              prog->nir, ctx->API != API_OPENGL_COMPAT);
+      }
+      prev = stage;
+
       infos[stage] = &prog->nir->info;
 
       /* Make a pass over the IR to add state references for any built-in
-- 
2.13.6
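The `prev` bookkeeping in the patch above pairs each present stage with the closest earlier present stage, skipping any stages the program does not use. A minimal standalone sketch of that iteration pattern follows; `SKETCH_NUM_STAGES`, `link_pair_fn`, `for_each_linked_pair`, `struct pair_log` and `log_pair` are hypothetical names introduced for illustration, and the callback merely stands in for a producer/consumer pass such as nir_compact_varyings().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage count, for illustration only. */
#define SKETCH_NUM_STAGES 6

/* Hypothetical callback type standing in for a producer/consumer pass. */
typedef void (*link_pair_fn)(int producer, int consumer, void *data);

/* Walk the stages in pipeline order and call `fn` once for every pair of
 * adjacent *present* stages, skipping absent (NULL) stages.
 * Returns the number of pairs visited.
 */
static unsigned
for_each_linked_pair(void *const stages[SKETCH_NUM_STAGES],
                     link_pair_fn fn, void *data)
{
   unsigned pairs = 0;
   int prev = -1;

   for (int i = 0; i < SKETCH_NUM_STAGES; i++) {
      if (stages[i] == NULL)
         continue;

      if (prev != -1) {
         fn(prev, i, data);
         pairs++;
      }
      prev = i;
   }

   return pairs;
}

/* Example callback state: records the visited pairs for inspection. */
struct pair_log {
   int producer[SKETCH_NUM_STAGES];
   int consumer[SKETCH_NUM_STAGES];
   unsigned count;
};

static void
log_pair(int producer, int consumer, void *data)
{
   struct pair_log *log = data;

   assert(log->count < SKETCH_NUM_STAGES);
   log->producer[log->count] = producer;
   log->consumer[log->count] = consumer;
   log->count++;
}
```

With stages present at indices 0, 3 and 4 only, the walk visits the pairs (0, 3) and (3, 4), so a program without the intermediate stages still links its first stage directly to the next one that exists.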
[Mesa-dev] More nir linking optimisations
This series adds a varying array splitting pass to the component packing series I sent out previously. This allows avoiding the workaround of calling gather shader info twice, since we can more easily keep the input/output bitmasks in sync now that we don't need to worry about partial marking of arrays.

Remaining improvements include adding a pass to compact varyings into consecutive slots rather than leaving empty slots when removing dead varyings.

Shader-db results for the series on i965 (BDW):

total instructions in shared programs: 13298718 -> 13191284 (-0.81%)
instructions in affected programs: 2315180 -> 2207746 (-4.64%)
helped: 14956
HURT: 390

total cycles in shared programs: 540151400 -> 539397048 (-0.14%)
cycles in affected programs: 297905258 -> 297150906 (-0.25%)
helped: 25231
HURT: 13033

total loops in shared programs: 3807 -> 3804 (-0.08%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0

total spills in shared programs: 86577 -> 86640 (0.07%)
spills in affected programs: 1380 -> 1443 (4.57%)
helped: 7
HURT: 15

total fills in shared programs: 90871 -> 90946 (0.08%)
fills in affected programs: 1728 -> 1803 (4.34%)
helped: 16
HURT: 9

LOST: 4
GAINED: 15

The spill hurt is all in dolphin uber shaders (as are most of the spill improvements).

Two of the lost programs are SIMD16 programs from CS: GO; they are lost because 80% of the shaders get optimised away when we remove dead varying components. These are also the shaders where the 3 loops go away.

Please review.
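To illustrate why splitting arrays makes the slot bitmasks easier to keep precise, here is a small sketch. It reuses the bit arithmetic that get_variable_io_mask() applies in this series, but `slot_mask`, `used_slots_whole_array` and `used_slots_split_element` are hypothetical helpers, and the sketch assumes each array element occupies exactly one slot.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: mask of `slots` consecutive varying slots starting
 * at `location`.
 */
static uint64_t
slot_mask(unsigned location, unsigned slots)
{
   /* Shifting a 64-bit value by 64 or more is undefined, so stay below it. */
   assert(slots >= 1 && slots < 64 && location + slots <= 64);

   return ((1ull << slots) - 1) << location;
}

/* Slots marked as used when an array varying is tracked as a whole:
 * touching any element marks every slot of the array.
 */
static uint64_t
used_slots_whole_array(unsigned location, unsigned array_len)
{
   return slot_mask(location, array_len);
}

/* Slots marked as used once the array is split into elements and only
 * element `index` is accessed (one slot per element assumed).
 */
static uint64_t
used_slots_split_element(unsigned location, unsigned array_len,
                         unsigned index)
{
   assert(index < array_len);

   return slot_mask(location + index, 1);
}
```

For a four-element array at slot 4, whole-array tracking marks 0xF0 while the split form marks only 0x40 when element 2 is the one in use, which leaves the other three slots visible as dead to later passes.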
[Mesa-dev] [PATCH 1/9] nir: add tess patch support to nir_remove_unused_varyings()
---
 src/compiler/nir/nir_linking_helpers.c | 61 +++--
 1 file changed, 42 insertions(+), 19 deletions(-)

diff --git a/src/compiler/nir/nir_linking_helpers.c b/src/compiler/nir/nir_linking_helpers.c
index 54ba1c85e58..4d709c1b3c5 100644
--- a/src/compiler/nir/nir_linking_helpers.c
+++ b/src/compiler/nir/nir_linking_helpers.c
@@ -37,10 +37,12 @@ static uint64_t
 get_variable_io_mask(nir_variable *var, gl_shader_stage stage)
 {
-   /* TODO: add support for tess patches */
-   if (var->data.patch || var->data.location < 0)
+   if (var->data.location < 0)
       return 0;
 
+   unsigned location = var->data.patch ?
+      var->data.location - VARYING_SLOT_PATCH0 : var->data.location;
+
    assert(var->data.mode == nir_var_shader_in ||
           var->data.mode == nir_var_shader_out ||
           var->data.mode == nir_var_system_value);
 
@@ -53,11 +55,11 @@ get_variable_io_mask(nir_variable *var, gl_shader_stage stage)
    }
 
    unsigned slots = glsl_count_attribute_slots(type, false);
-   return ((1ull << slots) - 1) << var->data.location;
+   return ((1ull << slots) - 1) << location;
 }
 
 static void
-tcs_add_output_reads(nir_shader *shader, uint64_t *read)
+tcs_add_output_reads(nir_shader *shader, uint64_t *read, uint64_t *patches_read)
 {
    nir_foreach_function(function, shader) {
       if (function->impl) {
@@ -73,9 +75,15 @@ tcs_add_output_reads(nir_shader *shader, uint64_t *read)
                       nir_var_shader_out) {
 
                      nir_variable *var = intrin_instr->variables[0]->var;
-                     read[var->data.location_frac] |=
-                        get_variable_io_mask(intrin_instr->variables[0]->var,
-                                             shader->info.stage);
+                     if (var->data.patch) {
+                        patches_read[var->data.location_frac] |=
+                           get_variable_io_mask(intrin_instr->variables[0]->var,
+                                                shader->info.stage);
+                     } else {
+                        read[var->data.location_frac] |=
+                           get_variable_io_mask(intrin_instr->variables[0]->var,
+                                                shader->info.stage);
+                     }
                   }
                }
             }
@@ -85,14 +93,17 @@ tcs_add_output_reads(nir_shader *shader, uint64_t *read)
 
 static bool
 remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
-                      uint64_t *used_by_other_stage)
+                      uint64_t *used_by_other_stage,
+                      uint64_t *used_by_other_stage_patches)
 {
    bool progress = false;
+   uint64_t *used;
 
    nir_foreach_variable_safe(var, var_list) {
-      /* TODO: add patch support */
       if (var->data.patch)
-         continue;
+         used = used_by_other_stage_patches;
+      else
+         used = used_by_other_stage;
 
       if (var->data.location < VARYING_SLOT_VAR0 && var->data.location >= 0)
          continue;
@@ -100,7 +111,7 @@ remove_unused_io_vars(nir_shader *shader, struct exec_list *var_list,
       if (var->data.always_active_io)
          continue;
 
-      uint64_t other_stage = used_by_other_stage[var->data.location_frac];
+      uint64_t other_stage = used[var->data.location_frac];
 
       if (!(other_stage & get_variable_io_mask(var, shader->info.stage))) {
          /* This one is invalid, make it a global variable instead */
@@ -124,15 +135,26 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer)
    assert(consumer->info.stage != MESA_SHADER_VERTEX);
 
    uint64_t read[4] = { 0 }, written[4] = { 0 };
+   uint64_t patches_read[4] = { 0 }, patches_written[4] = { 0 };
 
    nir_foreach_variable(var, &producer->outputs) {
-      written[var->data.location_frac] |=
-         get_variable_io_mask(var, producer->info.stage);
+      if (var->data.patch) {
+         patches_written[var->data.location_frac] |=
+            get_variable_io_mask(var, producer->info.stage);
+      } else {
+         written[var->data.location_frac] |=
+            get_variable_io_mask(var, producer->info.stage);
+      }
    }
 
    nir_foreach_variable(var, &consumer->inputs) {
-      read[var->data.location_frac] |=
-         get_variable_io_mask(var, consumer->info.stage);
+      if (var->data.patch) {
+         patches_read[var->data.location_frac] |=
+            get_variable_io_mask(var, consumer->info.stage);
+      } else {
+         read[var->data.location_frac] |=
+            get_variable_io_mask(var, consumer->info.stage);
+      }
    }
 
    /* Each TCS invocation can read data written by other TCS invocations,
@@ -140,13 +162,14 @@ nir_remove_unused_varyings(nir_shader *producer, nir_shader *consumer)
     * sure they are not read by the TCS before demoting them to globals.
     */
    if (producer->info.stage == MESA_SHADER_TESS_CTRL)
-      tcs_add_output_reads(producer, read);
+      tcs_add_output_reads(producer, read, patches_read);
 
    bool
[Mesa-dev] [PATCH 4/9] i965: enable varying array splitting
total instructions in shared programs: 13210579 -> 13199325 (-0.09%)
instructions in affected programs: 89043 -> 77789 (-12.64%)
helped: 430
HURT: 0

total cycles in shared programs: 539530190 -> 539493750 (-0.01%)
cycles in affected programs: 584860 -> 548420 (-6.23%)
helped: 437
HURT: 110

total spills in shared programs: 86646 -> 86640 (-0.01%)
spills in affected programs: 6 -> 0
helped: 1
HURT: 0

total fills in shared programs: 90955 -> 90946 (-0.01%)
fills in affected programs: 9 -> 0
helped: 1
HURT: 0
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index b6c5362a1ee..c0e16ae7d5c 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -278,6 +278,8 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
          nir_shader *producer = shProg->_LinkedShaders[i]->Program->nir;
          nir_shader *consumer = shProg->_LinkedShaders[next]->Program->nir;
 
+         nir_lower_io_arrays_to_elements(producer, consumer);
+
          NIR_PASS_V(producer, nir_remove_dead_variables, nir_var_shader_out);
          NIR_PASS_V(consumer, nir_remove_dead_variables, nir_var_shader_in);
 
-- 
2.13.6
[Mesa-dev] [PATCH 5/9] i965: call nir_lower_io_to_scalar() at link time for BDW and above
This will allow dead components of varyings to be removed.

BDW shader-db results:

total instructions in shared programs: 13190730 -> 13108459 (-0.62%)
instructions in affected programs: 2110903 -> 2028632 (-3.90%)
helped: 14043
HURT: 486

total cycles in shared programs: 541148990 -> 540544072 (-0.11%)
cycles in affected programs: 290344296 -> 289739378 (-0.21%)
helped: 23418
HURT: 11623

total loops in shared programs: 3923 -> 3920 (-0.08%)
loops in affected programs: 3 -> 0
helped: 3
HURT: 0

total spills in shared programs: 85784 -> 85853 (0.08%)
spills in affected programs: 1374 -> 1443 (5.02%)
helped: 6
HURT: 15

total fills in shared programs: 88717 -> 88801 (0.09%)
fills in affected programs: 1719 -> 1803 (4.89%)
helped: 15
HURT: 9

LOST: 3
GAINED: 0

The fills/spills changes were all in the dolphin uber shaders.

I tested enabling this on IVB but the results went in the other direction.
---
 src/mesa/drivers/dri/i965/brw_link.cpp | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_link.cpp b/src/mesa/drivers/dri/i965/brw_link.cpp
index c0e16ae7d5c..46dbcac8430 100644
--- a/src/mesa/drivers/dri/i965/brw_link.cpp
+++ b/src/mesa/drivers/dri/i965/brw_link.cpp
@@ -224,6 +224,17 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
    unsigned int stage;
    struct shader_info *infos[MESA_SHADER_STAGES] = { 0, };
 
+   /* Determine first and last stage. */
+   unsigned first = MESA_SHADER_STAGES;
+   unsigned last = 0;
+   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
+      if (!shProg->_LinkedShaders[i])
+         continue;
+      if (first == MESA_SHADER_STAGES)
+         first = i;
+      last = i;
+   }
+
    for (stage = 0; stage < ARRAY_SIZE(shProg->_LinkedShaders); stage++) {
       struct gl_linked_shader *shader = shProg->_LinkedShaders[stage];
       if (!shader)
@@ -251,17 +262,21 @@ brw_link_shader(struct gl_context *ctx, struct gl_shader_program *shProg)
 
       prog->nir = brw_create_nir(brw, shProg, prog, (gl_shader_stage) stage,
                                  compiler->scalar_stage[stage]);
-   }
 
-   /* Determine first and last stage. */
-   unsigned first = MESA_SHADER_STAGES;
-   unsigned last = 0;
-   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
-      if (!shProg->_LinkedShaders[i])
-         continue;
-      if (first == MESA_SHADER_STAGES)
-         first = i;
-      last = i;
+      if (brw->screen->devinfo.gen >= 8) {
+         nir_variable_mode mask = (nir_variable_mode) 0;
+
+         if (stage != first)
+            mask = (nir_variable_mode)(mask | nir_var_shader_in);
+
+         if (stage != last)
+            mask = (nir_variable_mode)(mask | nir_var_shader_out);
+
+         nir_lower_io_to_scalar_early(prog->nir, mask);
+
+         prog->nir = brw_nir_optimize(prog->nir, compiler,
+                                      compiler->scalar_stage[stage]);
+      }
    }
 
    /* Linking the stages in the opposite order (from fragment to vertex)
-- 
2.13.6
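The mask selection in the patch above only scalarizes the interfaces that sit between two stages: inputs of every stage but the first, outputs of every stage but the last. A minimal sketch of that selection, assuming hypothetical `SKETCH_IO_*` flags in place of nir_var_shader_in / nir_var_shader_out (the values are illustrative, and `link_time_scalarize_mask` is not a Mesa function):

```c
#include <assert.h>

/* Hypothetical flags standing in for nir_var_shader_in and
 * nir_var_shader_out; the numeric values are illustrative only.
 */
enum sketch_io_mode {
   SKETCH_IO_NONE = 0,
   SKETCH_IO_IN   = 1 << 0,
   SKETCH_IO_OUT  = 1 << 1,
};

/* Returns which IO modes of `stage` take part in cross-stage linking,
 * given the first and last stage present in the program.
 */
static unsigned
link_time_scalarize_mask(unsigned stage, unsigned first, unsigned last)
{
   unsigned mask = SKETCH_IO_NONE;

   assert(first <= stage && stage <= last);

   /* Inputs of the first stage are not fed by another shader stage. */
   if (stage != first)
      mask |= SKETCH_IO_IN;

   /* Outputs of the last stage are not consumed by another shader stage. */
   if (stage != last)
      mask |= SKETCH_IO_OUT;

   return mask;
}
```

A program with a single stage therefore gets an empty mask, while a middle stage gets both flags.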
Re: [Mesa-dev] [PATCH 08/25] gallium/u_threaded: mark queries flushed only for non-deferred flushes
On Sun, Oct 22, 2017 at 9:07 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> The driver uses (and must use) the flushed flag of queries as a hint that
> it does not have to check for synchronization with currently queued up
> commands. Deferred flushes do not actually flush queued up commands, so
> we must not set the flushed flag for them.
>
> Found by inspection.
> ---
>  src/gallium/auxiliary/util/u_threaded_context.c | 8 +++++---
>  src/gallium/auxiliary/util/u_threaded_context.h | 2 +-
>  2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_threaded_context.c b/src/gallium/auxiliary/util/u_threaded_context.c
> index 7e28b87a7ff..24fab7f5cb6 100644
> --- a/src/gallium/auxiliary/util/u_threaded_context.c
> +++ b/src/gallium/auxiliary/util/u_threaded_context.c
> @@ -1783,23 +1783,25 @@ tc_create_video_buffer(struct pipe_context *_pipe,
>   */
>
>  static void
>  tc_flush(struct pipe_context *_pipe, struct pipe_fence_handle **fence,
>           unsigned flags)
>  {
>     struct threaded_context *tc = threaded_context(_pipe);
>     struct pipe_context *pipe = tc->pipe;
>     struct threaded_query *tq, *tmp;
>
> -   LIST_FOR_EACH_ENTRY_SAFE(tq, tmp, &tc->unflushed_queries, head_unflushed) {
> -      tq->flushed = true;
> -      LIST_DEL(&tq->head_unflushed);
> +   if (!(flags & PIPE_FLUSH_DEFERRED)) {

Do we also need to check the ASYNC flag here? Or the top-of-pipe and bottom-of-pipe flags, which don't have to flush caches if I understand correctly?

Marek

> +      LIST_FOR_EACH_ENTRY_SAFE(tq, tmp, &tc->unflushed_queries, head_unflushed) {
> +         tq->flushed = true;
> +         LIST_DEL(&tq->head_unflushed);
> +      }
>     }
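The behaviour the patch is after can be sketched outside of gallium. Everything below is hypothetical (`SKETCH_FLUSH_DEFERRED`, `struct sketch_query`, `struct sketch_context`, `sketch_flush`), the flag value is made up, and a plain singly linked list stands in for the list macros; it only shows that a deferred flush leaves the unflushed list untouched. Whether other flags such as ASYNC need the same treatment is the open question above and is not addressed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag; the real PIPE_FLUSH_DEFERRED is defined by gallium
 * and may have a different value.
 */
#define SKETCH_FLUSH_DEFERRED (1u << 0)

struct sketch_query {
   bool flushed;
   struct sketch_query *next_unflushed;
};

struct sketch_context {
   /* Singly linked list of queries whose commands are still queued. */
   struct sketch_query *unflushed;
};

static void
sketch_flush(struct sketch_context *ctx, unsigned flags)
{
   assert(ctx != NULL);

   /* A deferred flush does not submit the queued commands, so the queries
    * must keep their "not flushed yet" state.
    */
   if (flags & SKETCH_FLUSH_DEFERRED)
      return;

   /* Mark every pending query as flushed and unlink it from the list. */
   struct sketch_query *q = ctx->unflushed;
   while (q != NULL) {
      struct sketch_query *next = q->next_unflushed;

      q->flushed = true;
      q->next_unflushed = NULL;
      q = next;
   }
   ctx->unflushed = NULL;
}
```

Calling `sketch_flush(&ctx, SKETCH_FLUSH_DEFERRED)` leaves every query on the list with `flushed == false`; a later call with `flags == 0` marks them all and empties the list.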
Re: [Mesa-dev] [PATCH mesa] git_sha1_gen: create empty file in fallback path
Rb

On October 29, 2017 3:06:28 PM PDT, Eric Engestrom wrote:
>I missed this part in my conversion, the old stream redirection meant
>the file was always created.
>
>Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103496
>Fixes: 7088622e5fb506b64c90 "buildsys: move file regeneration logic to
>       the script itself"
>Signed-off-by: Eric Engestrom
>---
> bin/git_sha1_gen.py | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/bin/git_sha1_gen.py b/bin/git_sha1_gen.py
>index 7b9267b59e..68a87e72ec 100755
>--- a/bin/git_sha1_gen.py
>+++ b/bin/git_sha1_gen.py
>@@ -45,3 +45,5 @@ def get_git_sha1():
>                     quit()
>         with open(args.output, 'w') as git_sha1_h:
>             git_sha1_h.write(new_sha1)
>+else:
>+    open(args.output, 'w').close()
>--
>Cheers,
>  Eric
Re: [Mesa-dev] [PATCH v3 28/34] i965: add cache fallback support using serialized nir
On 2017-10-29 01:11:32, Kenneth Graunke wrote: > On Sunday, October 22, 2017 1:01:36 PM PDT Jordan Justen wrote: > > If the i965 gen program cannot be loaded from the cache, then we > > fallback to using a serialized nir program. > > > > This is based on "i965: add cache fallback support" by Timothy Arceri > >. Tim's version was written to fallback > > to compiling from source, and therefore had to be much more complex. > > After Connor and Jason implemented nir serialization, I was able to > > rewrite and greatly simplify this patch. > > > > Signed-off-by: Jordan Justen > > Acked-by: Timothy Arceri > > --- > > src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++- > > 1 file changed, 26 insertions(+), 1 deletion(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c > > b/src/mesa/drivers/dri/i965/brw_disk_cache.c > > index 503c6c7b499..9af893d40a7 100644 > > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c > > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c > > @@ -24,6 +24,7 @@ > > #include "compiler/blob.h" > > #include "compiler/glsl/ir_uniform.h" > > #include "compiler/glsl/shader_cache.h" > > +#include "compiler/nir/nir_serialize.h" > > #include "main/mtypes.h" > > #include "util/disk_cache.h" > > #include "util/macros.h" > > @@ -58,6 +59,27 @@ gen_shader_sha1(struct brw_context *brw, struct > > gl_program *prog, > > _mesa_sha1_compute(manifest, strlen(manifest), out_sha1); > > } > > > > +static void > > +fallback_to_full_recompile(struct brw_context *brw, struct gl_program > > *prog, > > It's not exactly a full recompile anymore, maybe rename this to > recompile_from_nir? Or fallback_to_partial_recompile? Good point. I guess eventually we'll recompile from nir, but at this point we are just restoring the nir program. What about restore_serialized_nir_shader? Reviewed-by from you with that? 
-Jordan > > > + gl_shader_stage stage) > > +{ > > + prog->program_written_to_cache = false; > > + if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) { > > + fprintf(stderr, "falling back to nir %s.\n", > > + _mesa_shader_stage_to_abbrev(prog->info.stage)); > > + } > > + > > + if (!prog->nir) { > > + assert(prog->driver_cache_blob && prog->driver_cache_blob_size > 0); > > + const struct nir_shader_compiler_options *options = > > + brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions; > > + struct blob_reader reader; > > + blob_reader_init(, prog->driver_cache_blob, > > + prog->driver_cache_blob_size); > > + prog->nir = nir_deserialize(NULL, options, ); > > + } > > +} > > + > > static void > > write_blob_program_data(struct blob *binary, const void *program, > > size_t program_size, > > @@ -280,6 +302,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, > > gl_shader_stage stage) > > prog->sh.LinkedTransformFeedback->api_enabled) > >return false; > > > > + if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK) > > + goto FAIL; > > + > > if (prog->sh.data->LinkStatus != linking_skipped) > >goto FAIL; > > > > @@ -293,7 +318,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, > > gl_shader_stage stage) > > return true; > > > > FAIL: > > - /*FIXME: Fall back and compile from source here. */ > > + fallback_to_full_recompile(brw, prog, stage); > > return false; > > } > > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start
https://bugs.freedesktop.org/show_bug.cgi?id=103505

--- Comment #3 from Matias N. Goldberg ---
Sounds similar to this bug 99591: https://bugs.freedesktop.org/show_bug.cgi?id=99591

Try export LD_BIND_NOW=1 before running the Vulkan application.

If that doesn't work, there could be a problem with the config with which LLVM was built. Are the Vulkan drivers and LLVM provided by your distro, or did you build them yourself?

Please give more info, like OS, kernel, driver, Mesa version, LLVM version, etc.
Re: [Mesa-dev] [PATCH v3 19/43] i965/fs: Support push constants of 16-bit types
On 29/10/17 19:55, Pohjolainen, Topi wrote: > On Thu, Oct 12, 2017 at 08:38:08PM +0200, Jose Maria Casanova Crespo wrote: >> We enable the use of 16-bit values in push constants >> modifying the assign_constant_locations function to work >> with 16-bit types. >> >> The API to access buffers in Vulkan use multiples of 4-byte for >> offsets and sizes. Current accountability of uniforms based on 4-byte >> slots will work for 16-bit values if they are allowed to use 32-bit >> slots. For that, we replace the division by 4 by a DIV_ROUND_UP, so >> 2-byte elements will use 1 slot instead of 0. >> >> We aligns the 16-bit locations after assigning the 32-bit >> ones. >> --- >> src/intel/compiler/brw_fs.cpp | 30 +++--- >> 1 file changed, 23 insertions(+), 7 deletions(-) >> >> diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp >> index a1d49a63be..8da16145dc 100644 >> --- a/src/intel/compiler/brw_fs.cpp >> +++ b/src/intel/compiler/brw_fs.cpp >> @@ -1909,8 +1909,9 @@ set_push_pull_constant_loc(unsigned uniform, int >> *chunk_start, >> if (!contiguous) { >>/* If bitsize doesn't match the target one, skip it */ >>if (*max_chunk_bitsize != target_bitsize) { >> - /* FIXME: right now we only support 32 and 64-bit accesses */ >> - assert(*max_chunk_bitsize == 4 || *max_chunk_bitsize == 8); >> + assert(*max_chunk_bitsize == 4 || >> +*max_chunk_bitsize == 8 || >> +*max_chunk_bitsize == 2); >> *max_chunk_bitsize = 0; >> *chunk_start = -1; >> return; >> @@ -1987,8 +1988,9 @@ fs_visitor::assign_constant_locations() >> int constant_nr = inst->src[i].nr + inst->src[i].offset / 4; > > Did you test this with, for example, vec4? CTS has 16bit scalar, vec2 (uint,sint), vec4 (float) and matrix tests for push constants for compute and graphics pipelines. 
For vec4 you can try:

dEQP-VK.spirv_assembly.instruction.compute.16bit_storage.push_constant_16_to_32.vector_float

For push constant tests in general there are 42 tests, but vec3 isn't tested:

dEQP-VK.*16bit_storage.*push_constant.

> I've been toying with a glsl
> lowering pass changing mediump floats into float16. I was curious to know how
> much is needed as you have addressed most of the things from NIR onwards.
> Here I'm seeing offsets 0,2,4,6 which result into 0,0,1,1 when divided by
> four. Don't we need something of this sort in addition?

If I remember correctly, the tests exercise push constants with 64 16-bit values, so that they use the minimum max_push_constants_size the spec guarantees, which is 128 bytes. So in the end the generated intrinsic was:

vec4 16 ssa_4 = intrinsic load_uniform (ssa_3) () (0, 128) /* base=0 */ /* range=128 */

The calculation here only counts the number of locations used. Since the Vulkan API requires that the offset and size of a push constant range be multiples of 4, keeping the 4-byte slots was fine for supporting the feature. Our changes just account for the number of 32-bit locations needed, mainly by replacing the divisions by 4 with DIV_ROUND_UP( , 4) when calculating sizes.
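To make the arithmetic in this exchange concrete, here is a small sketch comparing a truncating division by 4 with a round-up division when counting 4-byte slots, plus the offset-to-slot mapping Topi mentions. The macro and function names are invented for this illustration. The macro only has the same shape as a typical round-up division and is defined locally rather than taken from Mesa.

```c
#include <assert.h>

/* Same shape as the usual round-up division macro; defined locally. */
#define SKETCH_DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Number of 4-byte slots for `components` values of `type_size` bytes,
 * using plain integer division. A single 2-byte value yields 0 slots. */
static unsigned
slots_truncating(unsigned components, unsigned type_size)
{
   return components * type_size / 4;
}

/* Same count, rounded up so that a partially used slot is still counted. */
static unsigned
slots_rounded_up(unsigned components, unsigned type_size)
{
   return SKETCH_DIV_ROUND_UP(components * type_size, 4);
}

/* Index of the 4-byte slot that contains a given byte offset. */
static unsigned
slot_of_offset(unsigned byte_offset)
{
   return byte_offset / 4;
}
```

A single 16-bit component gives 0 slots with the truncating form and 1 with the rounded-up form, while byte offsets 0, 2, 4, 6 map to slots 0, 0, 1, 1. Two consecutive 16-bit components therefore share one slot, which is the alignment concern raised in the quoted question.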
> commit 1a6d2bf3302f6e4305e383da0f27712dc5c20a67 > Author: Topi Pohjolainen> Date: Sun Oct 29 20:28:03 2017 +0200 > > fix alignment of 16-bit uniforms on 32-bit slots > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > b/src/intel/compiler/brw_fs_nir.cpp > index 2f5443958a..586eb9d9ff 100644 > --- a/src/intel/compiler/brw_fs_nir.cpp > +++ b/src/intel/compiler/brw_fs_nir.cpp > @@ -4007,7 +4007,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , > nir_intrinsic_instr *instr > src.offset = const_offset->u32[0]; > > for (unsigned j = 0; j < instr->num_components; j++) { > -bld.MOV(offset(dest, bld, j), offset(src, bld, j)); > +const unsigned src_offset = > + src.type == BRW_REGISTER_TYPE_HF ? 2 * j : j; > + > +bld.MOV(offset(dest, bld, j), offset(src, bld, src_offset)); > > > > Then about the change of using 32-bit slots. This is now unconditional and > would require revisiting if we wanted to pack 16-bits tighter and possibly > increase the amount of uniforms that can be pushed. Similarly to Vulkan, in > GL the core stores uniforms as floats and I think we should keep it that way. > I added support in the i965 backend to keep track of the types of the > uniforms and to convert 32-bit presentation to 16-bits on the fly in > gen6_constant_state.c::brw_param_value(). I don't like it that much but I had > to start from somewhere. > My thinking is that we'd want to decouple the storage of the values and the > packing used in the compiler backend. Ideally keeping the mesa gl core and the > api working with full 32-bit floats but using tight 16-bit slots in the > push/pull
[Mesa-dev] [PATCH mesa] git_sha1_gen: create empty file in fallback path
I missed this part in my conversion, the old stream redirection meant
the file was always created.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103496
Fixes: 7088622e5fb506b64c90 "buildsys: move file regeneration logic to
       the script itself"
Signed-off-by: Eric Engestrom
---
 bin/git_sha1_gen.py | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/bin/git_sha1_gen.py b/bin/git_sha1_gen.py
index 7b9267b59e..68a87e72ec 100755
--- a/bin/git_sha1_gen.py
+++ b/bin/git_sha1_gen.py
@@ -45,3 +45,5 @@ def get_git_sha1():
                     quit()
         with open(args.output, 'w') as git_sha1_h:
             git_sha1_h.write(new_sha1)
+else:
+    open(args.output, 'w').close()
--
Cheers,
  Eric
[Mesa-dev] [Bug 102573] fails to build on armel
https://bugs.freedesktop.org/show_bug.cgi?id=102573

--- Comment #10 from Matt Turner ---
(In reply to Bernd Kuhls from comment #9)

Open a new bug
Re: [Mesa-dev] [PATCH 20/21] nir/lower_indirect: Bail early if modes == 0
Good point. I'll drop this patch.

On October 29, 2017 05:10:01 Bas Nieuwenhuizen wrote:

Doesn't the old behavior also lower compact arrays even with modes = 0?

On Sat, Oct 28, 2017 at 8:36 PM, Jason Ekstrand wrote:

There's no point in walking the program if 100% if we're never going to
actually lower anything.
---
 src/compiler/nir/nir_lower_indirect_derefs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/compiler/nir/nir_lower_indirect_derefs.c b/src/compiler/nir/nir_lower_indirect_derefs.c
index c949224..f1e060c 100644
--- a/src/compiler/nir/nir_lower_indirect_derefs.c
+++ b/src/compiler/nir/nir_lower_indirect_derefs.c
@@ -202,6 +202,9 @@ nir_lower_indirect_derefs(nir_shader *shader, nir_variable_mode modes)
 {
    bool progress = false;

+   if (modes == 0)
+      return false;
+
    nir_foreach_function(function, shader) {
       if (function->impl)
          progress = lower_indirects_impl(function->impl, modes) || progress;
--
2.5.0.400.gff86faf
[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode
https://bugs.freedesktop.org/show_bug.cgi?id=103507

Felix Schwarz changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |felix.schwarz@oss.schwarz.eu
[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode
https://bugs.freedesktop.org/show_bug.cgi?id=103507

--- Comment #3 from Felix Schwarz ---
bug 98832 might be a similar issue – that one is about the Radeon HD 6450. (You have a slightly different model so it might make sense to keep both bugs.)

Maybe you can try to revert this commit:

commit d57c0edfe00d3274b50f91ce3076ed0e82d28782
Author: Alex Deucher
Date:   Wed Jul 8 14:08:12 2015 -0400

    Revert "Revert "drm/radeon: dont switch vt on suspend""

    This reverts commit ac9134906b3f5c2b45dc80dab0fee792bd516d52.

    We've fixed the underlying problem with cursors, so re-enable this.

If that fixes it for you I suspect you are hitting the same issue as bug 98832 and bug 99163.

(Btw: You might work around the problem if you just switch to a different console instead of logout/login.)
Re: [Mesa-dev] [PATCH v3 19/43] i965/fs: Support push constants of 16-bit types
On Thu, Oct 12, 2017 at 08:38:08PM +0200, Jose Maria Casanova Crespo wrote: > We enable the use of 16-bit values in push constants > modifying the assign_constant_locations function to work > with 16-bit types. > > The API to access buffers in Vulkan use multiples of 4-byte for > offsets and sizes. Current accountability of uniforms based on 4-byte > slots will work for 16-bit values if they are allowed to use 32-bit > slots. For that, we replace the division by 4 by a DIV_ROUND_UP, so > 2-byte elements will use 1 slot instead of 0. > > We aligns the 16-bit locations after assigning the 32-bit > ones. > --- > src/intel/compiler/brw_fs.cpp | 30 +++--- > 1 file changed, 23 insertions(+), 7 deletions(-) > > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp > index a1d49a63be..8da16145dc 100644 > --- a/src/intel/compiler/brw_fs.cpp > +++ b/src/intel/compiler/brw_fs.cpp > @@ -1909,8 +1909,9 @@ set_push_pull_constant_loc(unsigned uniform, int > *chunk_start, > if (!contiguous) { >/* If bitsize doesn't match the target one, skip it */ >if (*max_chunk_bitsize != target_bitsize) { > - /* FIXME: right now we only support 32 and 64-bit accesses */ > - assert(*max_chunk_bitsize == 4 || *max_chunk_bitsize == 8); > + assert(*max_chunk_bitsize == 4 || > +*max_chunk_bitsize == 8 || > +*max_chunk_bitsize == 2); > *max_chunk_bitsize = 0; > *chunk_start = -1; > return; > @@ -1987,8 +1988,9 @@ fs_visitor::assign_constant_locations() > int constant_nr = inst->src[i].nr + inst->src[i].offset / 4; Did you test this with, for example, vec4? I've been toying with a glsl lowering pass changing mediump floats into float16. I was curious to know how much is needed as you have addressed most of the things from NIR onwards. Here I'm seeing offsets 0,2,4,6 which result into 0,0,1,1 when divided by four. Don't we need something of this sort in addition? 
commit 1a6d2bf3302f6e4305e383da0f27712dc5c20a67 Author: Topi PohjolainenDate: Sun Oct 29 20:28:03 2017 +0200 fix alignment of 16-bit uniforms on 32-bit slots diff --git a/src/intel/compiler/brw_fs_nir.cpp b/src/intel/compiler/brw_fs_nir.cpp index 2f5443958a..586eb9d9ff 100644 --- a/src/intel/compiler/brw_fs_nir.cpp +++ b/src/intel/compiler/brw_fs_nir.cpp @@ -4007,7 +4007,10 @@ fs_visitor::nir_emit_intrinsic(const fs_builder , nir_intrinsic_instr *instr src.offset = const_offset->u32[0]; for (unsigned j = 0; j < instr->num_components; j++) { -bld.MOV(offset(dest, bld, j), offset(src, bld, j)); +const unsigned src_offset = + src.type == BRW_REGISTER_TYPE_HF ? 2 * j : j; + +bld.MOV(offset(dest, bld, j), offset(src, bld, src_offset)); Then about the change of using 32-bit slots. This is now unconditional and would require revisiting if we wanted to pack 16-bits tighter and possibly increase the amount of uniforms that can be pushed. Similarly to Vulkan, in GL the core stores uniforms as floats and I think we should keep it that way. I added support in the i965 backend to keep track of the types of the uniforms and to convert 32-bit presentation to 16-bits on the fly in gen6_constant_state.c::brw_param_value(). I don't like it that much but I had to start from somewhere. My thinking is that we'd want to decouple the storage of the values and the packing used in the compiler backend. Ideally keeping the mesa gl core and the api working with full 32-bit floats but using tight 16-bit slots in the push/pull constant buffers. This requires quite a bit more changes as we have structured param[]/pull_param[] to work with 32-bit slots. My current work can be found in: git://people.freedesktop.org/~tpohjola/mesa 16_bit_gles > > if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) { > -assert(inst->src[2].ud % 4 == 0); > -unsigned last = constant_nr + (inst->src[2].ud / 4) - 1; > +assert(type_sz(inst->src[i].type) == 2 ? 
> + (inst->src[2].ud % 2 == 0) : (inst->src[2].ud % 4 == 0)); > +unsigned last = constant_nr + DIV_ROUND_UP(inst->src[2].ud, 4) - > 1; > assert(last < uniforms); > > for (unsigned j = constant_nr; j < last; j++) { > @@ -2000,8 +2002,8 @@ fs_visitor::assign_constant_locations() > bitsize_access[last] = MAX2(bitsize_access[last], > type_sz(inst->src[i].type)); > } else { > if (constant_nr >= 0 && constant_nr < (int) uniforms) { > - int regs_read = inst->components_read(i) * > - type_sz(inst->src[i].type) / 4; > + int regs_read = DIV_ROUND_UP(inst->components_read(i) * > +type_sz(inst->src[i].type), 4); > for (int j = 0; j < regs_read; j++) { >
[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start
https://bugs.freedesktop.org/show_bug.cgi?id=103505

--- Comment #2 from Valentin Novikov ---
(In reply to Bas Nieuwenhuizen from comment #1)
> So this seems like mesa 17.2.2?
>
> Can you get Vulkan-Demos (https://github.com/SaschaWillems/Vulkan) and get a
> backtrace of one of the demos?
>
> The complete dmesg log after failing to run an application would also be
> useful info to have.

dmesg before crash vulkan-smoketest: https://pastebin.com/9TBSpZHF
after: https://pastebin.com/bSVS7p3P
[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode
https://bugs.freedesktop.org/show_bug.cgi?id=103507

--- Comment #2 from andre35...@yahoo.com ---
Created attachment 135155
  --> https://bugs.freedesktop.org/attachment.cgi?id=135155&action=edit
dmesg output
[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode
https://bugs.freedesktop.org/show_bug.cgi?id=103507

--- Comment #1 from andre35...@yahoo.com ---
Created attachment 135154
  --> https://bugs.freedesktop.org/attachment.cgi?id=135154&action=edit
Picture of issue/monitor
[Mesa-dev] [Bug 103507] Wrong colors on screen when waking from suspend mode
https://bugs.freedesktop.org/show_bug.cgi?id=103507

andre35...@yahoo.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|RGB colors across wake from |Wrong colors on screen when
                   |suspend                     |waking from suspend mode
[Mesa-dev] [Bug 103507] RGB colors across wake from suspend
https://bugs.freedesktop.org/show_bug.cgi?id=103507

            Bug ID: 103507
           Summary: RGB colors across wake from suspend
           Product: Mesa
           Version: unspecified
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: andre35...@yahoo.com
        QA Contact: mesa-dev@lists.freedesktop.org

Linux Mint 18.2
AMD 6570 (open source drivers)
Intel 64bit CPU

Upon resuming the computer from suspend mode, around 3/5 times, the entire screen/Cinnamon DE is in some weird contrast mode where everything is pink/blue and hard to read. A logout/login restores the colors back to the way they should be.

I have reported this to the Linux Mint Github page and a mod marked it as a possible Mesa issue so I am here.

Here is a picture of the issue: https://i.imgur.com/q5g4LEV.jpg

Here is dmesg output if that proves useful: https://pastebin.com/tRwmTM1P
[Mesa-dev] [Bug 103506] Max core profile version: 0.0 in the "Drivers/DRI/r300" component of the "Mesa"
https://bugs.freedesktop.org/show_bug.cgi?id=103506

Ilia Mirkin changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Other                       |Drivers/Gallium/r300
         QA Contact|mesa-dev@lists.freedesktop. |dri-devel@lists.freedesktop
                   |org                         |.org
           Assignee|mesa-dev@lists.freedesktop. |dri-devel@lists.freedesktop
                   |org                         |.org

--- Comment #1 from Ilia Mirkin ---
Is there a question somewhere in there?

(There is currently no r300 "dri" driver, only a gallium one, btw.)
[Mesa-dev] [Bug 103506] Max core profile version: 0.0 in the "Drivers/DRI/r300" component of the "Mesa"
https://bugs.freedesktop.org/show_bug.cgi?id=103506

            Bug ID: 103506
           Summary: Max core profile version: 0.0 in the "Drivers/DRI/r300" component of the "Mesa"
           Product: Mesa
           Version: unspecified
          Hardware: Other
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: Other
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: pythonal...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

server glx version string: 1.4
client glx version string: 1.4
GLX version: 1.4
Max core profile version: 0.0
Max compat profile version: 2.1
Max GLES1 profile version: 1.1
Max GLES[23] profile version: 2.0
OpenGL version string: 2.1 Mesa 17.1.8
OpenGL shading language version string: 1.20
OpenGL ES profile version string: OpenGL ES 2.0 Mesa 17.1.8
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 1.0.16
[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start
https://bugs.freedesktop.org/show_bug.cgi?id=103505

Bas Nieuwenhuizen changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Mesa core                   |Drivers/Vulkan/radeon

--- Comment #1 from Bas Nieuwenhuizen ---
So this seems like mesa 17.2.2?

Can you get Vulkan-Demos (https://github.com/SaschaWillems/Vulkan) and get a backtrace of one of the demos?

The complete dmesg log after failing to run an application would also be useful info to have.
[Mesa-dev] [Bug 103505] RX 480, newest mesa, VULKAN Does not start
https://bugs.freedesktop.org/show_bug.cgi?id=103505

            Bug ID: 103505
           Summary: RX 480, newest mesa, VULKAN Does not start
           Product: Mesa
           Version: git
          Hardware: x86-64 (AMD64)
                OS: Linux (All)
            Status: NEW
          Severity: critical
          Priority: medium
         Component: Mesa core
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: mrmeln...@gmail.com
        QA Contact: mesa-dev@lists.freedesktop.org

Vulkan-smoketest: segmentation fault (core dumped)
Dota 2: just does not start
vkQuake: segmentation fault (core dumped)
vulkaninfo: https://pastebin.com/SXpb18sP
[Mesa-dev] [Bug 102573] fails to build on armel
https://bugs.freedesktop.org/show_bug.cgi?id=102573

--- Comment #9 from Bernd Kuhls ---
Hi,

this patch breaks building mesa3d 17.2.3 with

Target: powerpc-ctng_e500v2-linux-gnuspe
gcc version 4.7.3 (crosstool-NG hg+-c65fcf8a34b7)

as reported by buildroot autobuilders:
http://autobuild.buildroot.net/?reason=mesa3d-17.2.3

Quoting
http://autobuild.buildroot.net/results/43d/43d8bf9a1531f4b69e22bfb53b4536d76cf31cbb/build-end.log

/home/peko/autobuild/instance-0/output/host/opt/ext-toolchain/bin/../lib/gcc/powerpc-ctng_e500v2-linux-gnuspe/4.7.3/../../../../powerpc-ctng_e500v2-linux-gnuspe/bin/ld: cannot find -latomic

Quoting from configure output:

checking whether -latomic is needed... yes
checking whether __sync_add_and_fetch_8 is supported... no
Re: [Mesa-dev] [PATCH 20/21] nir/lower_indirect: Bail early if modes == 0
Doesn't the old behavior also lower compact arrays even with modes = 0?

On Sat, Oct 28, 2017 at 8:36 PM, Jason Ekstrand wrote:
> There's no point in walking the program if 100% if we're never going to
> actually lower anything.
> ---
>  src/compiler/nir/nir_lower_indirect_derefs.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/compiler/nir/nir_lower_indirect_derefs.c
> b/src/compiler/nir/nir_lower_indirect_derefs.c
> index c949224..f1e060c 100644
> --- a/src/compiler/nir/nir_lower_indirect_derefs.c
> +++ b/src/compiler/nir/nir_lower_indirect_derefs.c
> @@ -202,6 +202,9 @@ nir_lower_indirect_derefs(nir_shader *shader,
> nir_variable_mode modes)
>  {
>     bool progress = false;
>
> +   if (modes == 0)
> +      return false;
> +
>     nir_foreach_function(function, shader) {
>        if (function->impl)
>           progress = lower_indirects_impl(function->impl, modes) || progress;
> --
> 2.5.0.400.gff86faf
Re: [Mesa-dev] [PATCH v3 13/34] glsl/shader_cache: Save and restore serialized nir in gl_program
On Sunday, October 22, 2017 1:01:21 PM PDT Jordan Justen wrote:
> v3:
>  * Rename serialized_nir* to driver_cache_blob*. (Tim)
>
> Signed-off-by: Jordan Justen
> Reviewed-by: Timothy Arceri
> ---
>  src/compiler/glsl/shader_cache.cpp | 16
>  1 file changed, 16 insertions(+)
>
> diff --git a/src/compiler/glsl/shader_cache.cpp
> b/src/compiler/glsl/shader_cache.cpp
> index ca90cfde350..1d208fb0911 100644
> --- a/src/compiler/glsl/shader_cache.cpp
> +++ b/src/compiler/glsl/shader_cache.cpp
> @@ -1062,6 +1062,14 @@ write_shader_metadata(struct blob *metadata,
> gl_linked_shader *shader)
>     }
>
>     write_shader_parameters(metadata, glprog->Parameters);
> +
> +   assert((glprog->driver_cache_blob == NULL) ==
> +          (glprog->driver_cache_blob_size == 0));
> +   blob_write_uint32(metadata, (uint32_t)glprog->driver_cache_blob_size);
> +   if (glprog->driver_cache_blob_size > 0) {
> +      blob_write_bytes(metadata, glprog->driver_cache_blob,
> +                       glprog->driver_cache_blob_size);
> +   }
>  }
>
>  static void
> @@ -1116,6 +1124,14 @@ read_shader_metadata(struct blob_reader *metadata,
>
>     glprog->Parameters = _mesa_new_parameter_list();
>     read_shader_parameters(metadata, glprog->Parameters);
> +
> +   glprog->driver_cache_blob_size = (size_t)blob_read_uint32(metadata);
> +   if (glprog->driver_cache_blob_size > 0) {
> +      glprog->driver_cache_blob =
> +         (uint8_t*)ralloc_size(glprog, glprog->driver_cache_blob_size);
> +      blob_copy_bytes(metadata, glprog->driver_cache_blob,
> +                      glprog->driver_cache_blob_size);
> +   }

Shouldn't you check for overrun here, and leave things in a consistent
state (passing the assertion above)?

>  }
>
>  static void
>
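One way to read the review comment is sketched below: read the length, then the payload, and on any short read leave the program with a NULL blob and a zero size so that the "pointer is NULL exactly when size is 0" assertion still holds. This is only an illustration under stated assumptions. The sketch_reader and sketch_prog types and their functions are hypothetical, use malloc rather than ralloc, and are not the Mesa blob API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader; it only mirrors the idea of a sticky overrun flag. */
struct sketch_reader {
   const uint8_t *cur;
   const uint8_t *end;
   bool overrun;
};

/* Copy `size` bytes out of the reader, or set the overrun flag and fail. */
static bool
sketch_read(struct sketch_reader *r, void *dst, size_t size)
{
   if (r->overrun || (size_t)(r->end - r->cur) < size) {
      r->overrun = true;
      return false;
   }
   memcpy(dst, r->cur, size);
   r->cur += size;
   return true;
}

struct sketch_prog {
   uint8_t *blob;
   size_t blob_size;
};

/* Read a 32-bit length followed by that many bytes. On any failure the
 * program is left with blob == NULL and blob_size == 0. */
static bool
sketch_read_blob(struct sketch_reader *r, struct sketch_prog *p)
{
   uint32_t size = 0;
   p->blob = NULL;
   p->blob_size = 0;

   if (!sketch_read(r, &size, sizeof(size)))
      return false;
   if (size == 0)
      return true;

   uint8_t *buf = malloc(size);
   if (!buf)
      return false;
   if (!sketch_read(r, buf, size)) {
      free(buf);
      return false;
   }

   /* Publish pointer and size together, only after the copy succeeded. */
   p->blob = buf;
   p->blob_size = size;
   return true;
}
```

The design choice in the sketch is to assign the pointer and the size together after the copy has succeeded, so a truncated input never leaves a nonzero size paired with a missing or partly filled buffer.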
Re: [Mesa-dev] [PATCH v3 28/34] i965: add cache fallback support using serialized nir
On Sunday, October 22, 2017 1:01:36 PM PDT Jordan Justen wrote: > If the i965 gen program cannot be loaded from the cache, then we > fallback to using a serialized nir program. > > This is based on "i965: add cache fallback support" by Timothy Arceri >. Tim's version was written to fallback > to compiling from source, and therefore had to be much more complex. > After Connor and Jason implemented nir serialization, I was able to > rewrite and greatly simplify this patch. > > Signed-off-by: Jordan Justen > Acked-by: Timothy Arceri > --- > src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++- > 1 file changed, 26 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c > b/src/mesa/drivers/dri/i965/brw_disk_cache.c > index 503c6c7b499..9af893d40a7 100644 > --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c > +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c > @@ -24,6 +24,7 @@ > #include "compiler/blob.h" > #include "compiler/glsl/ir_uniform.h" > #include "compiler/glsl/shader_cache.h" > +#include "compiler/nir/nir_serialize.h" > #include "main/mtypes.h" > #include "util/disk_cache.h" > #include "util/macros.h" > @@ -58,6 +59,27 @@ gen_shader_sha1(struct brw_context *brw, struct gl_program > *prog, > _mesa_sha1_compute(manifest, strlen(manifest), out_sha1); > } > > +static void > +fallback_to_full_recompile(struct brw_context *brw, struct gl_program *prog, It's not exactly a full recompile anymore, maybe rename this to recompile_from_nir? Or fallback_to_partial_recompile? 
> + gl_shader_stage stage) > +{ > + prog->program_written_to_cache = false; > + if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) { > + fprintf(stderr, "falling back to nir %s.\n", > + _mesa_shader_stage_to_abbrev(prog->info.stage)); > + } > + > + if (!prog->nir) { > + assert(prog->driver_cache_blob && prog->driver_cache_blob_size > 0); > + const struct nir_shader_compiler_options *options = > + brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions; > + struct blob_reader reader; > + blob_reader_init(, prog->driver_cache_blob, > + prog->driver_cache_blob_size); > + prog->nir = nir_deserialize(NULL, options, ); > + } > +} > + > static void > write_blob_program_data(struct blob *binary, const void *program, > size_t program_size, > @@ -280,6 +302,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, > gl_shader_stage stage) > prog->sh.LinkedTransformFeedback->api_enabled) >return false; > > + if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK) > + goto FAIL; > + > if (prog->sh.data->LinkStatus != linking_skipped) >goto FAIL; > > @@ -293,7 +318,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, > gl_shader_stage stage) > return true; > > FAIL: > - /*FIXME: Fall back and compile from source here. */ > + fallback_to_full_recompile(brw, prog, stage); > return false; > } > > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 22/34] i965: Add shader cache support for vertex and fragment stages
On Sunday, October 22, 2017 1:01:30 PM PDT Jordan Justen wrote:
> From: Timothy Arceri
>
> This enables the cache on vertex and fragment shaders only.
>
> v2:
>  * Use MAYBE_UNUSED. (Matt)
>
> [jordan.l.jus...@intel.com: reword subject]
> [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
> Signed-off-by: Jordan Justen

Patches 22-27 are:
Reviewed-by: Kenneth Graunke
Re: [Mesa-dev] [PATCH v3 21/34] i965: add initial implementation of on disk shader cache
On Sunday, October 22, 2017 1:01:29 PM PDT Jordan Justen wrote:
> From: Timothy Arceri
>
> This uses the recently-added disk_cache.c to write out the final
> linked binary for vertex and fragment shader programs.
>
> This is based off the initial implementation done by Carl Worth.
>
> v2:
>  * Squash 'i965: add image param shader cache support'
>  * Squash 'i965: add shader cache support for pull param pointers'
>  * Substantially simplified by a rework on top of Jason's 2975e4c56a7a.
>  * Rename load_program_data to read_program_data. (Jason)
>
> v3:
>  * Simplify and align program read/write. (Jason)
>
> [jordan.l.jus...@intel.com: *_cached_program => brw_disk_cache_*_program]
> [jordan.l.jus...@intel.com: brw_shader_cache.c => brw_disk_cache.c]
> [jordan.l.jus...@intel.com: don't map to write program when LLC is present]
> [jordan.l.jus...@intel.com: set program_written_to_cache on read from cache]
> [jordan.l.jus...@intel.com: only try cache when status is linking_skipped]
> [jordan.l.jus...@intel.com: rework based on uniforms rework 2975e4c56a7a]
> [jordan.l.jus...@intel.com: Simplify and align program read/write]
> Signed-off-by: Jordan Justen
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources |   1 +
>  src/mesa/drivers/dri/i965/brw_disk_cache.c | 329 +
>  src/mesa/drivers/dri/i965/brw_state.h      |   5 +
>  src/mesa/drivers/dri/i965/meson.build      |   1 +
>  4 files changed, 336 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_disk_cache.c
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index 053d89b81ec..2980cdb3c54 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -14,6 +14,7 @@ i965_FILES = \
>     brw_cs.h \
>     brw_curbe.c \
>     brw_defines.h \
> +   brw_disk_cache.c \
>     brw_draw.c \
>     brw_draw.h \
>     brw_draw_upload.c \
> diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> new file mode 100644
> index 000..186cbe83706
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c
> @@ -0,0 +1,329 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "compiler/blob.h"
> +#include "compiler/glsl/ir_uniform.h"
> +#include "compiler/glsl/shader_cache.h"
> +#include "main/mtypes.h"
> +#include "util/disk_cache.h"
> +#include "util/macros.h"
> +#include "util/mesa-sha1.h"
> +
> +#include "brw_context.h"
> +#include "brw_state.h"
> +#include "brw_vs.h"
> +#include "brw_wm.h"
> +
> +static void
> +gen_shader_sha1(struct brw_context *brw, struct gl_program *prog,
> +                gl_shader_stage stage, void *key, unsigned char *out_sha1)
> +{
> +   char sha1_buf[41];
> +   unsigned char sha1[20];
> +   char manifest[256];
> +   int offset = 0;
> +
> +   _mesa_sha1_format(sha1_buf, prog->sh.data->sha1);
> +   offset += snprintf(manifest, sizeof(manifest), "program: %s\n", sha1_buf);
> +
> +   _mesa_sha1_compute(key, brw_prog_key_size(stage), sha1);
> +   _mesa_sha1_format(sha1_buf, sha1);
> +   offset += snprintf(manifest + offset, sizeof(manifest) - offset,
> +                      "%s_key: %s\n", _mesa_shader_stage_to_abbrev(stage),
> +                      sha1_buf);
> +
> +   _mesa_sha1_compute(manifest, strlen(manifest), out_sha1);
> +}
> +
> +static void
> +write_blob_program_data(struct blob *binary, const void *program,
> +                        size_t program_size,
> +                        struct brw_stage_prog_data *prog_data,
> +                        size_t prog_data_size)
> +{
> +   /* Write program to blob. */
> +   blob_write_uint32(binary, program_size);
> +