Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On Fri, 2017-10-20 at 13:18 +1100, Timothy Arceri wrote: > > On 20/10/17 03:31, Iago Toral Quiroga wrote: > > The existing code was checking the whole interface variable rather > > than its members, which is not what we want: we want to check > > aliasing for each member in the interface variable. > > > > Surprisingly, there are piglit tests that verify this and were > > passing due to a bug in the existing code: when we were computing > > the last component used by an interface variable we would use > > the 'vector' path and multiply by vector_elements, which is 0 for > > interface variables. This made the loop that checks for aliasing > > be a no-op and not add the interface variable to the list of > > outputs > > so then we would fail to link when we did not see a matching output > > for the same input in the next stage. Since the tests expect a > > linker error to happen, they would pass, but not for the right > > reason. > > > > Unfortunately, the current implementation uses ir_variable > > instances > > to keep track of explicit locations. Since we don't have > > ir_variables instances for individual interface members, we need > > to have a custom struct with the data we need. This struct has > > the ir_variable (which for interface members is the whole > > interface variable), plus the data that we need to validate for > > each aliased location, for now only the base type, which for > > interface members we will take from the appropriate field inside > > the interface variable. > > > > Later patches will expand this custom struct so we can also check > > other requirements for location aliasing, specifically that > > we have matching interpolation and auxiliary storage, that once > > again, we will take from the appropriate field members for the > > interface variables. > > > > Fixes: > > KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations > > > > Fixes: > > tests/spec/arb_separate_shader_objects/execution/layout-location- > > named-block.shader_test -auto -fbo > > > > Fixes (these were passing before but for incorrect reasons): > > tests/spec/arb_enhanced_layouts/linker/block-member- > > locations/named-block-member-location-overlap.shader_test > > tests/spec/arb_enhanced_layouts/linker/block-member- > > locations/named-block-member-mixed-order-overlap.shader_test > > > > --- > > src/compiler/glsl/link_varyings.cpp | 45 > > +++-- > > 1 file changed, 33 insertions(+), 12 deletions(-) > > > > diff --git a/src/compiler/glsl/link_varyings.cpp > > b/src/compiler/glsl/link_varyings.cpp > > index 687bd50e52..5dc2704321 100644 > > --- a/src/compiler/glsl/link_varyings.cpp > > +++ b/src/compiler/glsl/link_varyings.cpp > > @@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable > > *var, gl_shader_stage stage) > > return var->data.location - location_start; > > } > > > > +struct explicit_location_info { > > + ir_variable *var; > > + unsigned base_type; > > +}; > > + > > static bool > > -check_location_aliasing(ir_variable *explicit_locations[][4], > > +check_location_aliasing(struct explicit_location_info > > explicit_locations[][4], > > ir_variable *var, > > unsigned location, > > unsigned component, > > @@ -427,7 +432,7 @@ check_location_aliasing(ir_variable > > *explicit_locations[][4], > > while (location < location_limit) { > > unsigned i = component; > > while (i < last_comp) { > > - if (explicit_locations[location][i] != NULL) { > > + if (explicit_locations[location][i].var != NULL) { > > linker_error(prog, > > "%s shader has multiple outputs > > explicitly " > > "assigned to location %d and component > > %d\n", > > @@ -439,9 +444,9 @@ check_location_aliasing(ir_variable > > *explicit_locations[][4], > > /* Make sure all component at this location have the > > same type. > > */ > > for (unsigned j = 0; j < 4; j++) { > > -if (explicit_locations[location][j] && > > -(explicit_locations[location][j]->type- > > >without_array() > > - ->base_type != type->without_array()->base_type)) > > { > > +if (explicit_locations[location][j].var && > > +explicit_locations[location][j].base_type != > > + type->without_array()->base_type) { > > linker_error(prog, > > "Varyings sharing the same location > > must " > > "have the same underlying numerical > > type. " > > @@ -450,7 +455,9 @@ check_location_aliasing(ir_variable > > *explicit_locations[][4], > > } > > } > > > > - explicit_locations[location][i] = var; > > + explicit_locations[location][i].var = var; > > + explicit_locations[location][i].base_type =
Re: [Mesa-dev] [PATCH] radv: ensure correct outinfo is picked.
On 20/10/17 15:12, Dave Airlie wrote: From: Dave AirlieThis struct used to rely on being in a union, it isn't anymore, so we have to pick the correct outinfo struct now. This should fix a regression since the union became a struct. dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.) Signed-off-by: Dave Airlie --- src/amd/vulkan/radv_pipeline.c | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index c16b5e3..9ffeda8 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -1410,12 +1410,19 @@ static uint32_t si_vgt_gs_mode(struct radv_shader_variant *gs) S_028A40_GS_WRITE_OPTIMIZE(1); } -static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline) +static struct ac_vs_output_info *get_vs_output_info(struct radv_pipeline *pipeline) { - struct radv_shader_variant *vs; - vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : (radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] : pipeline->shaders[MESA_SHADER_VERTEX]); + if (radv_pipeline_has_gs(pipeline)) + return >gs_copy_shader->info.vs.outinfo; + else if (radv_pipeline_has_tess(pipeline)) + return >shaders[MESA_SHADER_TESS_EVAL]->info.tes.outinfo; + else + return >shaders[MESA_SHADER_VERTEX]->info.vs.outinfo; +} - struct ac_vs_output_info *outinfo = >info.vs.outinfo; +static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline) +{ + struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);; extra ; pipeline->graphics.vgt_primitiveid_en = false; pipeline->graphics.vgt_gs_mode = 0; @@ -1430,10 +1437,7 @@ static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline) static void calculate_pa_cl_vs_out_cntl(struct radv_pipeline *pipeline) { - struct radv_shader_variant *vs; - vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : (radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] : pipeline->shaders[MESA_SHADER_VERTEX]); - - struct ac_vs_output_info *outinfo = >info.vs.outinfo; + struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);; extra ; unsigned clip_dist_mask, cull_dist_mask, total_mask; clip_dist_mask = outinfo->clip_dist_mask; @@ -1476,13 +1480,10 @@ static uint32_t offset_to_ps_input(uint32_t offset, bool flat_shade) static void calculate_ps_inputs(struct radv_pipeline *pipeline) { - struct radv_shader_variant *ps, *vs; - struct ac_vs_output_info *outinfo; + struct radv_shader_variant *ps; + struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);; extra ; With those removed: Reviewed-by: Timothy Arceri ps = pipeline->shaders[MESA_SHADER_FRAGMENT]; - vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : (radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] : pipeline->shaders[MESA_SHADER_VERTEX]); - - outinfo = >info.vs.outinfo; unsigned ps_offset = 0; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation
On 09/13/2017 11:43 PM, aravindan.muthuku...@intel.com wrote: > From: Aravindan Muthukumar> > Avoiding the loop which was running with O(n) complexity. > Now the complexity has been reduced to O(1) > > Algorithm calculates the index using matrix method. > Matrix arrangement is as below: > Assuming PAGE_SIZE is 4096. > > 1*4096 2*40963*40964*4096 > 5*4096 6*40967*40968*4096 I think adding one more row to this chart would make it more clear. The two rows shown also follow a simpler pattern, and that made some of the complexity below seem confusing. 10*4096 12*4096 14*4096 16*4096 > ... ... ... ... > ... ... ... ... > ... ... ... max_cache_size > > From this matrix its cleary seen that every row clearly > follows the below way: > ... ... ...n >n+(1/4)n n+(1/2)n n+(3/4)n2n > > Row is calulated as log2(size/PAGE_SIZE) > Column is calculated as converting the difference > between the elements to fit into power size of two > and indexing it. > > Final Index is (row*4)+(col-1) > > Tested with Intel Mesa CI. > > Improves performance of 3d Mark on Broxton. > Analyzed using Compare Perf Analyser: > Average : 201.2 +/- 65.4836 (n=20) Is 201 the improvement or the absolute score? Do not quote absolute scores. > Percentage : 0.705966% +/- 0.229767% (n=20) Eero: Can you reproduce this result on BXT or other platforms? Just curious... > v2: Review comments regarding cosmetics and asserts implemented > > Signed-off-by: Aravindan Muthukumar > Signed-off-by: Kedar Karanje > Reviewed-by: Yogesh Marathe > --- > src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 > -- > 1 file changed, 39 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c > b/src/mesa/drivers/dri/i965/brw_bufmgr.c > index 8017219..8013ccb 100644 > --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c > +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c > @@ -87,6 +87,8 @@ > > #define memclear(s) memset(, 0, sizeof(s)) > > +#define PAGE_SIZE 4096 > + > #define FILE_DEBUG_FLAG DEBUG_BUFMGR > > static inline int > @@ -181,19 +183,45 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t > pitch, uint32_t tiling) > return ALIGN(pitch, tile_width); > } > > +static inline int > +ilog2_round_up(int value) > +{ > + assert(value != 0); > + return 32 - __builtin_clz(value - 1); > +} > + > +/* > + * This function finds the correct bucket fit for the input size. > + * The function works with O(1) complexity when the requested size > + * was queried instead of iterating the size through all the buckets. > + */ > static struct bo_cache_bucket * > bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size) > { > - int i; > + int index = -1; I don't see how execution could get to that test without index being set again, so it should be safe to remove the initialization. > + int row, col = 0; > + int pages, pages_log2; Move the declarations of row, col, pages, and pages_log2 into the else case, initialize them with the only assignment to them, and make them const. Since all of these values, including index, get values derived from size, I believe the should all be unsigned. In that case, remove the index >= 0 test below. > > - for (i = 0; i < bufmgr->num_buckets; i++) { > - struct bo_cache_bucket *bucket = >cache_bucket[i]; > - if (bucket->size >= size) { > - return bucket; > - } > + /* condition for size less than 4*4096 (16KB) page size */ ^ ^ Remove one space here. | Capitalize and add a period at the end of the sentence. > + if(size <= 4 * PAGE_SIZE) { ^ Add a space here. > + index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;; Remove extra trailing semicolon. ^ > + } else { > + /* Number of pages of page size */ > + pages = DIV_ROUND_UP(size, PAGE_SIZE); > + pages_log2 = ilog2_round_up(pages) - 1; Isn't this going to be the same result as _mesa_logbase2(pages)? > + > + /* Finding the row and column of the matrix */ > + row = pages_log2 - 1; > + col = DIV_ROUND_UP((pages - (1 << pages_log2)), > +(1 << (pages_log2 - 2))); ^^ This should | be aligned | here. ---+ Removing some of the extra parenthesis would also make it easier to read. So... I feel like this function doesn't need to special case size < 16k like this. I haven't benchmarked it, but the following is about 30 bytes shorter and avoids the conditional branch. I also think it's easier to understand. const unsigned pages = DIV_ROUND_UP(size, PAGE_SIZE); /* Row
[Mesa-dev] [PATCH] radv: ensure correct outinfo is picked.
From: Dave AirlieThis struct used to rely on being in a union, it isn't anymore, so we have to pick the correct outinfo struct now. This should fix a regression since the union became a struct. dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.) Signed-off-by: Dave Airlie --- src/amd/vulkan/radv_pipeline.c | 27 ++- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index c16b5e3..9ffeda8 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -1410,12 +1410,19 @@ static uint32_t si_vgt_gs_mode(struct radv_shader_variant *gs) S_028A40_GS_WRITE_OPTIMIZE(1); } -static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline) +static struct ac_vs_output_info *get_vs_output_info(struct radv_pipeline *pipeline) { - struct radv_shader_variant *vs; - vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : (radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] : pipeline->shaders[MESA_SHADER_VERTEX]); + if (radv_pipeline_has_gs(pipeline)) + return >gs_copy_shader->info.vs.outinfo; + else if (radv_pipeline_has_tess(pipeline)) + return >shaders[MESA_SHADER_TESS_EVAL]->info.tes.outinfo; + else + return >shaders[MESA_SHADER_VERTEX]->info.vs.outinfo; +} - struct ac_vs_output_info *outinfo = >info.vs.outinfo; +static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline) +{ + struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);; pipeline->graphics.vgt_primitiveid_en = false; pipeline->graphics.vgt_gs_mode = 0; @@ -1430,10 +1437,7 @@ static void calculate_vgt_gs_mode(struct radv_pipeline *pipeline) static void calculate_pa_cl_vs_out_cntl(struct radv_pipeline *pipeline) { - struct radv_shader_variant *vs; - vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : (radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] : pipeline->shaders[MESA_SHADER_VERTEX]); - - struct ac_vs_output_info *outinfo = >info.vs.outinfo; + struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);; unsigned clip_dist_mask, cull_dist_mask, total_mask; clip_dist_mask = outinfo->clip_dist_mask; @@ -1476,13 +1480,10 @@ static uint32_t offset_to_ps_input(uint32_t offset, bool flat_shade) static void calculate_ps_inputs(struct radv_pipeline *pipeline) { - struct radv_shader_variant *ps, *vs; - struct ac_vs_output_info *outinfo; + struct radv_shader_variant *ps; + struct ac_vs_output_info *outinfo = get_vs_output_info(pipeline);; ps = pipeline->shaders[MESA_SHADER_FRAGMENT]; - vs = radv_pipeline_has_gs(pipeline) ? pipeline->gs_copy_shader : (radv_pipeline_has_tess(pipeline) ? pipeline->shaders[MESA_SHADER_TESS_EVAL] : pipeline->shaders[MESA_SHADER_VERTEX]); - - outinfo = >info.vs.outinfo; unsigned ps_offset = 0; -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/4] glsl/linker: outputs in the same location must share auxiliary storage
I only really skimmed over them but these look fine to me 3-4: Reviewed-by: Timothy ArceriThanks! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On 20/10/17 13:36, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 10:18 PM, Timothy Arceriwrote: On 20/10/17 03:31, Iago Toral Quiroga wrote: The existing code was checking the whole interface variable rather than its members, which is not what we want: we want to check aliasing for each member in the interface variable. Surprisingly, there are piglit tests that verify this and were passing due to a bug in the existing code: when we were computing the last component used by an interface variable we would use the 'vector' path and multiply by vector_elements, which is 0 for interface variables. This made the loop that checks for aliasing be a no-op and not add the interface variable to the list of outputs so then we would fail to link when we did not see a matching output for the same input in the next stage. Since the tests expect a linker error to happen, they would pass, but not for the right reason. Unfortunately, the current implementation uses ir_variable instances to keep track of explicit locations. Since we don't have ir_variables instances for individual interface members, we need to have a custom struct with the data we need. This struct has the ir_variable (which for interface members is the whole interface variable), plus the data that we need to validate for each aliased location, for now only the base type, which for interface members we will take from the appropriate field inside the interface variable. Later patches will expand this custom struct so we can also check other requirements for location aliasing, specifically that we have matching interpolation and auxiliary storage, that once again, we will take from the appropriate field members for the interface variables. Fixes: KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations Fixes: tests/spec/arb_separate_shader_objects/execution/layout-location-named-block.shader_test -auto -fbo Fixes (these were passing before but for incorrect reasons): tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test --- src/compiler/glsl/link_varyings.cpp | 45 +++-- 1 file changed, 33 insertions(+), 12 deletions(-) diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp index 687bd50e52..5dc2704321 100644 --- a/src/compiler/glsl/link_varyings.cpp +++ b/src/compiler/glsl/link_varyings.cpp @@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable *var, gl_shader_stage stage) return var->data.location - location_start; } +struct explicit_location_info { + ir_variable *var; + unsigned base_type; +}; + static bool -check_location_aliasing(ir_variable *explicit_locations[][4], +check_location_aliasing(struct explicit_location_info explicit_locations[][4], ir_variable *var, unsigned location, unsigned component, @@ -427,7 +432,7 @@ check_location_aliasing(ir_variable *explicit_locations[][4], while (location < location_limit) { unsigned i = component; while (i < last_comp) { - if (explicit_locations[location][i] != NULL) { + if (explicit_locations[location][i].var != NULL) { linker_error(prog, "%s shader has multiple outputs explicitly " "assigned to location %d and component %d\n", @@ -439,9 +444,9 @@ check_location_aliasing(ir_variable *explicit_locations[][4], /* Make sure all component at this location have the same type. */ for (unsigned j = 0; j < 4; j++) { -if (explicit_locations[location][j] && -(explicit_locations[location][j]->type->without_array() - ->base_type != type->without_array()->base_type)) { +if (explicit_locations[location][j].var && +explicit_locations[location][j].base_type != + type->without_array()->base_type) { linker_error(prog, "Varyings sharing the same location must " "have the same underlying numerical type. " @@ -450,7 +455,9 @@ check_location_aliasing(ir_variable *explicit_locations[][4], } } - explicit_locations[location][i] = var; + explicit_locations[location][i].var = var; + explicit_locations[location][i].base_type = +type->without_array()->base_type; i++; /* We need to do some special handling for doubles as dvec3 and @@ -482,8 +489,8 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx, gl_linked_shader *consumer) { glsl_symbol_table parameters; - ir_variable
Re: [Mesa-dev] [PATCH] glsl: Allow precision mismatch on dead uniform with GLSL ES 1.00 (v3)
Hi Ian, Kenneth, On Wed, Sep 27, 2017 at 2:57 AM, Tomasz Figawrote: > Commit 259fc505454ea6a67aeacf6cdebf1398d9947759 added linker error for > mismatching uniform precision, as required by GLES 3.0 specification and > conformance test-suite. > > Several Android applications, including Forge of Empires, have shaders > which violate this rule, on uniforms that are declared but not used > further in shader code. The problem affects a big number of Android > games, including ones built on top of one of the common 2D graphics > engines and other GLES implementations accept this, which poses a serious > application compatibility issue. > > Starting from GLSL ES 3.0, declarations with conflicting precision > qualifiers are explicitly prohibited. However GLSL ES 1.00 does not > clearly specify the behavior, except that > > "Uniforms are defined to behave as if they are using the same storage in > the vertex and fragment processors and may be implemented this way. > If uniforms are used in both the vertex and fragment shaders, developers > should be warned if the precisions are different. Conversion of > precision should never be implicit." > > The word "used" is not clear in this context and might refer to > 1) declared (same as GLES 3.x) > 2) referred after post-processing, or > 3) linked after all optimizations are done. > > Looking at existing applications, 2) or 3) seems to be widely adopted. > To avoid compatibility issues, turn the error into a warning if GLSL ES > version is lower than 3.0 and the data is dead in at least one of the > shaders. > > v3: > - Add a comment explaining the behavior. > - Fix bad copy/paste in commit message (s/varyings/uniforms). Would you be able to take a look? Ian, I believe your previous NAK was due to the confusing erroneous copy/paste from the freedesktop bug I made in commit message. Looking at last comment from Kenneth there, we were going to go with my patch, but things remained quiet after that. Thanks, Tomasz > v2: > - Change the behavior only for GLSL ES 1.00 shaders. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97532 > Signed-off-by: Tomasz Figa > --- > src/compiler/glsl/linker.cpp | 32 > 1 file changed, 28 insertions(+), 4 deletions(-) > > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp > index f352c5385ca..693c5ae3664 100644 > --- a/src/compiler/glsl/linker.cpp > +++ b/src/compiler/glsl/linker.cpp > @@ -1121,10 +1121,34 @@ cross_validate_globals(struct gl_shader_program *prog, > if (prog->IsES && (prog->data->Version != 310 || > !var->get_interface_type()) && > existing->data.precision != var->data.precision) { > -linker_error(prog, "declarations for %s `%s` have " > - "mismatching precision qualifiers\n", > - mode_string(var), var->name); > -return; > +/* > + * GLSL ES 3.00 is the first version that explicitly requires > + * that uniform declarations have matching precision qualifiers. > + * > + * The only relevant part of GLSL ES 1.00 spec, > + * > + * "If uniforms are used in both the vertex and fragment > shaders, > + * developers should be warned if the precisions are different. > + * Conversion of precision should never be implicit." > + * > + * leaves too wide field for intepretation. However, judging by > + * applications and implementations existing in the wild, it > seems > + * to be widely assumed that declarations alone are not enough to > + * fail the link. > + * > + * Thus, in case of GLSL ES < 3.00, trigger an error only if the > + * uniform is actually referenced in the code of both shaders. > + */ > +if ((existing->data.used && var->data.used) || > prog->data->Version >= 300) { > + linker_error(prog, "declarations for %s `%s` have " > +"mismatching precision qualifiers\n", > +mode_string(var), var->name); > + return; > +} else { > + linker_warning(prog, "declarations for %s `%s` have " > + "mismatching precision qualifiers\n", > + mode_string(var), var->name); > +} > } >} else > variables->add_variable(var); > -- > 2.14.1.992.g2c7b836f3a-goog > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On Thu, Oct 19, 2017 at 10:18 PM, Timothy Arceriwrote: > > > On 20/10/17 03:31, Iago Toral Quiroga wrote: >> >> The existing code was checking the whole interface variable rather >> than its members, which is not what we want: we want to check >> aliasing for each member in the interface variable. >> >> Surprisingly, there are piglit tests that verify this and were >> passing due to a bug in the existing code: when we were computing >> the last component used by an interface variable we would use >> the 'vector' path and multiply by vector_elements, which is 0 for >> interface variables. This made the loop that checks for aliasing >> be a no-op and not add the interface variable to the list of outputs >> so then we would fail to link when we did not see a matching output >> for the same input in the next stage. Since the tests expect a >> linker error to happen, they would pass, but not for the right >> reason. >> >> Unfortunately, the current implementation uses ir_variable instances >> to keep track of explicit locations. Since we don't have >> ir_variables instances for individual interface members, we need >> to have a custom struct with the data we need. This struct has >> the ir_variable (which for interface members is the whole >> interface variable), plus the data that we need to validate for >> each aliased location, for now only the base type, which for >> interface members we will take from the appropriate field inside >> the interface variable. >> >> Later patches will expand this custom struct so we can also check >> other requirements for location aliasing, specifically that >> we have matching interpolation and auxiliary storage, that once >> again, we will take from the appropriate field members for the >> interface variables. >> >> Fixes: >> KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations >> >> Fixes: >> >> tests/spec/arb_separate_shader_objects/execution/layout-location-named-block.shader_test >> -auto -fbo >> >> Fixes (these were passing before but for incorrect reasons): >> >> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test >> >> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test >> >> --- >> src/compiler/glsl/link_varyings.cpp | 45 >> +++-- >> 1 file changed, 33 insertions(+), 12 deletions(-) >> >> diff --git a/src/compiler/glsl/link_varyings.cpp >> b/src/compiler/glsl/link_varyings.cpp >> index 687bd50e52..5dc2704321 100644 >> --- a/src/compiler/glsl/link_varyings.cpp >> +++ b/src/compiler/glsl/link_varyings.cpp >> @@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable *var, >> gl_shader_stage stage) >> return var->data.location - location_start; >> } >> +struct explicit_location_info { >> + ir_variable *var; >> + unsigned base_type; >> +}; >> + >> static bool >> -check_location_aliasing(ir_variable *explicit_locations[][4], >> +check_location_aliasing(struct explicit_location_info >> explicit_locations[][4], >> ir_variable *var, >> unsigned location, >> unsigned component, >> @@ -427,7 +432,7 @@ check_location_aliasing(ir_variable >> *explicit_locations[][4], >> while (location < location_limit) { >> unsigned i = component; >> while (i < last_comp) { >> - if (explicit_locations[location][i] != NULL) { >> + if (explicit_locations[location][i].var != NULL) { >> linker_error(prog, >>"%s shader has multiple outputs explicitly " >>"assigned to location %d and component %d\n", >> @@ -439,9 +444,9 @@ check_location_aliasing(ir_variable >> *explicit_locations[][4], >>/* Make sure all component at this location have the same type. >> */ >>for (unsigned j = 0; j < 4; j++) { >> -if (explicit_locations[location][j] && >> -(explicit_locations[location][j]->type->without_array() >> - ->base_type != type->without_array()->base_type)) { >> +if (explicit_locations[location][j].var && >> +explicit_locations[location][j].base_type != >> + type->without_array()->base_type) { >> linker_error(prog, >> "Varyings sharing the same location must " >> "have the same underlying numerical type. " >> @@ -450,7 +455,9 @@ check_location_aliasing(ir_variable >> *explicit_locations[][4], >> } >>} >> - explicit_locations[location][i] = var; >> + explicit_locations[location][i].var = var; >> + explicit_locations[location][i].base_type = >> +type->without_array()->base_type; >>i++; >> /* We need to do some special handling for
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On 20/10/17 03:31, Iago Toral Quiroga wrote: The existing code was checking the whole interface variable rather than its members, which is not what we want: we want to check aliasing for each member in the interface variable. Surprisingly, there are piglit tests that verify this and were passing due to a bug in the existing code: when we were computing the last component used by an interface variable we would use the 'vector' path and multiply by vector_elements, which is 0 for interface variables. This made the loop that checks for aliasing be a no-op and not add the interface variable to the list of outputs so then we would fail to link when we did not see a matching output for the same input in the next stage. Since the tests expect a linker error to happen, they would pass, but not for the right reason. Unfortunately, the current implementation uses ir_variable instances to keep track of explicit locations. Since we don't have ir_variables instances for individual interface members, we need to have a custom struct with the data we need. This struct has the ir_variable (which for interface members is the whole interface variable), plus the data that we need to validate for each aliased location, for now only the base type, which for interface members we will take from the appropriate field inside the interface variable. Later patches will expand this custom struct so we can also check other requirements for location aliasing, specifically that we have matching interpolation and auxiliary storage, that once again, we will take from the appropriate field members for the interface variables. Fixes: KHR-GL45.enhanced_layouts.varying_block_automatic_member_locations Fixes: tests/spec/arb_separate_shader_objects/execution/layout-location-named-block.shader_test -auto -fbo Fixes (these were passing before but for incorrect reasons): tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-location-overlap.shader_test tests/spec/arb_enhanced_layouts/linker/block-member-locations/named-block-member-mixed-order-overlap.shader_test --- src/compiler/glsl/link_varyings.cpp | 45 +++-- 1 file changed, 33 insertions(+), 12 deletions(-) diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp index 687bd50e52..5dc2704321 100644 --- a/src/compiler/glsl/link_varyings.cpp +++ b/src/compiler/glsl/link_varyings.cpp @@ -403,8 +403,13 @@ compute_variable_location_slot(ir_variable *var, gl_shader_stage stage) return var->data.location - location_start; } +struct explicit_location_info { + ir_variable *var; + unsigned base_type; +}; + static bool -check_location_aliasing(ir_variable *explicit_locations[][4], +check_location_aliasing(struct explicit_location_info explicit_locations[][4], ir_variable *var, unsigned location, unsigned component, @@ -427,7 +432,7 @@ check_location_aliasing(ir_variable *explicit_locations[][4], while (location < location_limit) { unsigned i = component; while (i < last_comp) { - if (explicit_locations[location][i] != NULL) { + if (explicit_locations[location][i].var != NULL) { linker_error(prog, "%s shader has multiple outputs explicitly " "assigned to location %d and component %d\n", @@ -439,9 +444,9 @@ check_location_aliasing(ir_variable *explicit_locations[][4], /* Make sure all component at this location have the same type. */ for (unsigned j = 0; j < 4; j++) { -if (explicit_locations[location][j] && -(explicit_locations[location][j]->type->without_array() - ->base_type != type->without_array()->base_type)) { +if (explicit_locations[location][j].var && +explicit_locations[location][j].base_type != + type->without_array()->base_type) { linker_error(prog, "Varyings sharing the same location must " "have the same underlying numerical type. " @@ -450,7 +455,9 @@ check_location_aliasing(ir_variable *explicit_locations[][4], } } - explicit_locations[location][i] = var; + explicit_locations[location][i].var = var; + explicit_locations[location][i].base_type = +type->without_array()->base_type; i++; /* We need to do some special handling for doubles as dvec3 and @@ -482,8 +489,8 @@ cross_validate_outputs_to_inputs(struct gl_context *ctx, gl_linked_shader *consumer) { glsl_symbol_table parameters; - ir_variable *explicit_locations[MAX_VARYINGS_INCL_PATCH][4] = - { {NULL, NULL} }; + struct explicit_location_info explicit_locations[MAX_VARYINGS_INCL_PATCH][4] = +
Re: [Mesa-dev] [PATCH v2 25/32] i965: add cache fallback support using serialized nir
On 20/10/17 09:13, Timothy Arceri wrote: IMO you should just do this in brw_link() when its cheap. You will be doing the deserialization at draw time here which is not what we want. Can also drop the serialized_nir params if you follow my suggestions in patch 14. I still think you might want to always restore the nir rather than doing this at draw time, but it's not going to be a huge overhead compared to the compile so for now. Acked-by: Timothy ArceriSomething you probably want to do as a follow up is free the serialised NIR after you load it. On 19/10/17 16:32, Jordan Justen wrote: If the i965 gen program cannot be loaded from the cache, then we fallback to using a serialized nir program. This is based on "i965: add cache fallback support" by Timothy Arceri . Tim's version was written to fallback to compiling from source, and therefore had to be much more complex. After Connor and Jason implemented nir serialization, I was able to rewrite and greatly simplify this patch. Signed-off-by: Jordan Justen --- src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c b/src/mesa/drivers/dri/i965/brw_disk_cache.c index d89df846d5..790fad6925 100644 --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c @@ -24,6 +24,7 @@ #include "compiler/blob.h" #include "compiler/glsl/ir_uniform.h" #include "compiler/glsl/shader_cache.h" +#include "compiler/nir/nir_serialize.h" #include "main/mtypes.h" #include "util/disk_cache.h" #include "util/macros.h" @@ -79,6 +80,27 @@ gen_shader_sha1(struct brw_context *brw, struct gl_program *prog, _mesa_sha1_compute(manifest, strlen(manifest), out_sha1); } +static void +fallback_to_full_recompile(struct brw_context *brw, struct gl_program *prog, + gl_shader_stage stage) +{ + prog->program_written_to_cache = false; + if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) { + fprintf(stderr, "falling back to nir %s.\n", + _mesa_shader_stage_to_abbrev(prog->info.stage)); + } + + if (!prog->nir) { + assert(prog->serialized_nir && prog->serialized_nir_size > 0); + const struct nir_shader_compiler_options *options = + brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions; + struct blob_reader reader; + blob_reader_init(, prog->serialized_nir, + prog->serialized_nir_size); + prog->nir = nir_deserialize(NULL, options, ); + } +} + static void read_program_data(struct gl_program *glprog, struct blob_reader *binary, struct brw_stage_prog_data *prog_data, @@ -298,6 +320,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage) prog->sh.LinkedTransformFeedback->api_enabled) return false; + if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK) + goto FAIL; + if (prog->sh.data->LinkStatus != linking_skipped) goto FAIL; @@ -311,7 +336,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage) return true; FAIL: - /*FIXME: Fall back and compile from source here. */ + fallback_to_full_recompile(brw, prog, stage); return false; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 14/32] glsl/shader_cache: Save and restore serialized nir in gl_program
On 20/10/17 09:03, Timothy Arceri wrote: This and the previous patch feels wrong. The glsl shader cache shouldn't be handling writing nir to disk. IMO you should add this functionality to nir_serialize.c (maybe rename is nir_cache.c ??). That way we have continue with the nir is a toolkit theme and we can easily reused the util to write nir to disk in Vulkan which could be useful for reducing compile times. Doing it this way we could deserialize it in brw_link() and serialize it in glsl_to_nir() no need for the extra serialized_nir params. After discussing this on IRC Jordan has changed my mind. Reviewed-by: Timothy ArceriOn 19/10/17 16:32, Jordan Justen wrote: Signed-off-by: Jordan Justen --- src/compiler/glsl/shader_cache.cpp | 16 1 file changed, 16 insertions(+) diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp index ca90cfde35..f43bd6b17e 100644 --- a/src/compiler/glsl/shader_cache.cpp +++ b/src/compiler/glsl/shader_cache.cpp @@ -1062,6 +1062,14 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader) } write_shader_parameters(metadata, glprog->Parameters); + + assert((glprog->serialized_nir == NULL) == + (glprog->serialized_nir_size == 0)); + blob_write_uint32(metadata, (uint32_t)glprog->serialized_nir_size); + if (glprog->serialized_nir_size > 0) { + blob_write_bytes(metadata, glprog->serialized_nir, + glprog->serialized_nir_size); + } } static void @@ -1116,6 +1124,14 @@ read_shader_metadata(struct blob_reader *metadata, glprog->Parameters = _mesa_new_parameter_list(); read_shader_parameters(metadata, glprog->Parameters); + + glprog->serialized_nir_size = (size_t)blob_read_uint32(metadata); + if (glprog->serialized_nir_size > 0) { + glprog->serialized_nir = + (uint8_t*)ralloc_size(glprog, glprog->serialized_nir_size); + blob_copy_bytes(metadata, glprog->serialized_nir, + glprog->serialized_nir_size); + } } static void ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 09/32] glsl_to_nir: Zero nir_variable struct for valgrind & nir_serialize
On Wednesday, October 18, 2017 10:31:57 PM PDT Jordan Justen wrote: > Signed-off-by: Jordan Justen> --- > src/compiler/glsl/glsl_to_nir.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/compiler/glsl/glsl_to_nir.cpp > b/src/compiler/glsl/glsl_to_nir.cpp > index 63694fd41f..1d1085ffbc 100644 > --- a/src/compiler/glsl/glsl_to_nir.cpp > +++ b/src/compiler/glsl/glsl_to_nir.cpp > @@ -311,7 +311,7 @@ nir_visitor::visit(ir_variable *ir) > if (ir->data.mode == ir_var_shader_shared) >return; > > - nir_variable *var = ralloc(shader, nir_variable); > + nir_variable *var = rzalloc(shader, nir_variable); > var->type = ir->type; > var->name = ralloc_strdup(var, ir->name); > > Patches 9-10: Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 08/32] nir: Zero nir_load_const_instr::value for valgrind & nir_serialize
On Wednesday, October 18, 2017 10:31:56 PM PDT Jordan Justen wrote: > Signed-off-by: Jordan Justen> --- > src/compiler/nir/nir.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c > index fe48451694..cbba9c8749 100644 > --- a/src/compiler/nir/nir.c > +++ b/src/compiler/nir/nir.c > @@ -481,6 +481,7 @@ nir_load_const_instr_create(nir_shader *shader, unsigned > num_components, > unsigned bit_size) > { > nir_load_const_instr *instr = ralloc(shader, nir_load_const_instr); > + memset(>value, 0, sizeof(instr->value)); I would just rzalloc here. Preemptively, that patch would get a: Reviewed-by: Kenneth Graunke > instr_init(>instr, nir_instr_type_load_const); > > nir_ssa_def_init(>instr, >def, num_components, bit_size, > NULL); > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 07/32] intel/nir: Zero local index const struct for valgrind & nir_serialize
On Wednesday, October 18, 2017 10:31:55 PM PDT Jordan Justen wrote: > Signed-off-by: Jordan Justen> --- > src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/intel/compiler/brw_nir_lower_cs_intrinsics.c > b/src/intel/compiler/brw_nir_lower_cs_intrinsics.c > index f9322654e7..d27727624c 100644 > --- a/src/intel/compiler/brw_nir_lower_cs_intrinsics.c > +++ b/src/intel/compiler/brw_nir_lower_cs_intrinsics.c > @@ -116,6 +116,7 @@ lower_cs_intrinsics_convert_block(struct > lower_intrinsics_state *state, > nir_ssa_def *local_index = nir_load_local_invocation_index(b); > > nir_const_value uvec3; > + memset(, 0, sizeof(uvec3)); > uvec3.u32[0] = 1; > uvec3.u32[1] = size[0]; > uvec3.u32[2] = size[0] * size[1]; > Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 06/32] nir: Zero local_size const struct for valgrind & nir_serialize
On Wednesday, October 18, 2017 10:31:54 PM PDT Jordan Justen wrote: > Signed-off-by: Jordan Justen> --- > src/compiler/nir/nir_lower_system_values.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/compiler/nir/nir_lower_system_values.c > b/src/compiler/nir/nir_lower_system_values.c > index ba20d3083f..39b1a260bd 100644 > --- a/src/compiler/nir/nir_lower_system_values.c > +++ b/src/compiler/nir/nir_lower_system_values.c > @@ -58,6 +58,7 @@ convert_block(nir_block *block, nir_builder *b) >*/ > > nir_const_value local_size; > + memset(_size, 0, sizeof(local_size)); > local_size.u32[0] = b->shader->info.cs.local_size[0]; > local_size.u32[1] = b->shader->info.cs.local_size[1]; > local_size.u32[2] = b->shader->info.cs.local_size[2]; > Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 05/32] glsl: Zero per_vertex_accumulator::fields for valgrind & nir_serialize
On Wednesday, October 18, 2017 10:31:53 PM PDT Jordan Justen wrote: > Signed-off-by: Jordan Justen> --- > src/compiler/glsl/builtin_variables.cpp | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/compiler/glsl/builtin_variables.cpp > b/src/compiler/glsl/builtin_variables.cpp > index ea2d897cc8..d3cf12475b 100644 > --- a/src/compiler/glsl/builtin_variables.cpp > +++ b/src/compiler/glsl/builtin_variables.cpp > @@ -318,6 +318,7 @@ per_vertex_accumulator::per_vertex_accumulator() > : fields(), > num_fields(0) > { > + memset(fields, 0, sizeof(fields)); > } This shouldn't fix anything, we're calling the fields() constructor, which should value-initialize every element of the fields array, calling the constructor for glsl_struct_type on each element. That constructor is supposed to zero initialize items. Maybe it is missing some of them? FWIW, I trying to figure out the behavior of this by reading the rules here: https://stackoverflow.com/questions/1628311/array-initialisation signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 03/32] nir/intrinsics: Set the correct num_indices for load_output
On Wednesday, October 18, 2017 10:31:51 PM PDT Jordan Justen wrote: > From: Jason Ekstrand> > Cc: mesa-sta...@lists.freedesktop.org > --- > src/compiler/nir/nir_intrinsics.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/compiler/nir/nir_intrinsics.h > b/src/compiler/nir/nir_intrinsics.h > index 0de7080bfa..cefd18be90 100644 > --- a/src/compiler/nir/nir_intrinsics.h > +++ b/src/compiler/nir/nir_intrinsics.h > @@ -434,7 +434,7 @@ INTRINSIC(load_interpolated_input, 2, ARR(2, 1), true, 0, > 0, > /* src[] = { buffer_index, offset }. No const_index */ > LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE) > /* src[] = { offset }. const_index[] = { base, component } */ > -LOAD(output, 1, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE) > +LOAD(output, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE) > /* src[] = { vertex, offset }. const_index[] = { base, component } */ > LOAD(per_vertex_output, 2, 1, BASE, COMPONENT, xx, > NIR_INTRINSIC_CAN_ELIMINATE) > /* src[] = { offset }. const_index[] = { base } */ > Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 02/32] nir: Get rid of nir_shader::stage
On Wednesday, October 18, 2017 10:31:50 PM PDT Jordan Justen wrote: > From: Jason Ekstrand> > It's redundant with nir_shader::info::stage. I hate the extra wordiness here, but removing the redundancy is certainly a reasonable thing to do. So, as much as I dislike this, Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Revert absolute mode for constant buffer pointers.
On Thu, Oct 19, 2017 at 4:18 PM, Kenneth Graunkewrote: > The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2 > registers at context initialization time. Instead, they're inherited > from whatever happened to be running on the GPU prior to first run of a > new context. So, when we started setting these, other contexts in the > system started inheriting our values. Since this controls whether > 3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong > setting is fatal for almost any process which isn't expecting this. > > Unfortunately, VA-API and Beignet don't initialize this (nor does older > Mesa), so they will die horribly if we start doing this. UXA and SNA > don't use any push constants, so they are unaffected. > > The kernel developers are apparently uninterested in making the proto- > context initialize these registers to guarantee deterministic behavior. Could somebody from the kernel team elaborate here? This is obviously broken and undermines the security and containerization that hw contexts are supposed to provide. I'm really curious what the thinking is here... Kristian > Apparently, the "Default Value" of registers in the documentation is a > total lie, and cannot be relied upon by userspace. So, we need to find > a new solution. One would be to patch VA-API and Beignet to initialize > these, then get every distributor to ship those in tandem with the newer > Mesa version. I'm unclear when va-intel-driver releases might happen. > Another option would be to hack Mesa to restore the register back to the > historical default (relative mode) at the end of each batch. This is > also gross, as it has non-zero cost, and is also relying on userspace to > be "polite" to other broken userspace. A large part of the motivation > for contexts was to provide proper process isolation, so we should not > have to do this. But, we're already doing it in some cases, because > our kernel fixes to enforce isolation were reverted. > > In the meantime, I guess we'll just revert this patch and abandon using > the feature. It will lead to fewer pushed UBOs on Broadwell+, which may > lead to lower performance, but I don't have any data on the impact. > > Cc: "17.2" > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774 > --- > src/mesa/drivers/dri/i965/brw_state_upload.c | 24 > src/mesa/drivers/dri/i965/intel_screen.c | 2 +- > 2 files changed, 1 insertion(+), 25 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c > b/src/mesa/drivers/dri/i965/brw_state_upload.c > index 16f44d03bbe..23e4ebda259 100644 > --- a/src/mesa/drivers/dri/i965/brw_state_upload.c > +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c > @@ -101,30 +101,6 @@ brw_upload_initial_gpu_state(struct brw_context *brw) >OUT_BATCH(0); >ADVANCE_BATCH(); > } > - > - /* Set the "CONSTANT_BUFFER Address Offset Disable" bit, so > -* 3DSTATE_CONSTANT_XS buffer 0 is an absolute address. > -* > -* On Gen6-7.5, we use an execbuf parameter to do this for us. > -* However, the kernel ignores that when execlists are in use. > -* Fortunately, we can just write the registers from userspace > -* on Gen8+, and they're context saved/restored. > -*/ > - if (devinfo->gen >= 9) { > - BEGIN_BATCH(3); > - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); > - OUT_BATCH(CS_DEBUG_MODE2); > - OUT_BATCH(REG_MASK(CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) | > -CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE); > - ADVANCE_BATCH(); > - } else if (devinfo->gen == 8) { > - BEGIN_BATCH(3); > - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); > - OUT_BATCH(INSTPM); > - OUT_BATCH(REG_MASK(INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) | > -INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE); > - ADVANCE_BATCH(); > - } > } > > static inline const struct brw_tracked_state * > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > b/src/mesa/drivers/dri/i965/intel_screen.c > index ea04a72e860..776c2898d5b 100644 > --- a/src/mesa/drivers/dri/i965/intel_screen.c > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > @@ -2510,7 +2510,7 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen) > screen->compiler = brw_compiler_create(screen, devinfo); > screen->compiler->shader_debug_log = shader_debug_log_mesa; > screen->compiler->shader_perf_log = shader_perf_log_mesa; > - screen->compiler->constant_buffer_0_is_relative = devinfo->gen < 8; > + screen->compiler->constant_buffer_0_is_relative = true; > screen->compiler->supports_pull_constants = true; > > screen->has_exec_fence = > -- > 2.14.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/fs: Use align1 mode on ternary instructions on Gen10+
Align1 mode offers some nice features over align16, like access to more data types and the ability to use a 16-bit immediate. This patch does not start using any new features. It just emits ternary instructions in align1 mode. --- src/intel/compiler/brw_fs_generator.cpp | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/intel/compiler/brw_fs_generator.cpp b/src/intel/compiler/brw_fs_generator.cpp index 2622a91917..bdf2f916cb 100644 --- a/src/intel/compiler/brw_fs_generator.cpp +++ b/src/intel/compiler/brw_fs_generator.cpp @@ -1729,13 +1729,15 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) case BRW_OPCODE_MAD: assert(devinfo->gen >= 6); -brw_set_default_access_mode(p, BRW_ALIGN_16); + if (devinfo->gen < 10) +brw_set_default_access_mode(p, BRW_ALIGN_16); brw_MAD(p, dst, src[0], src[1], src[2]); break; case BRW_OPCODE_LRP: assert(devinfo->gen >= 6); -brw_set_default_access_mode(p, BRW_ALIGN_16); + if (devinfo->gen < 10) +brw_set_default_access_mode(p, BRW_ALIGN_16); brw_LRP(p, dst, src[0], src[1], src[2]); break; @@ -1833,7 +1835,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) case BRW_OPCODE_BFE: assert(devinfo->gen >= 7); - brw_set_default_access_mode(p, BRW_ALIGN_16); + if (devinfo->gen < 10) +brw_set_default_access_mode(p, BRW_ALIGN_16); brw_BFE(p, dst, src[0], src[1], src[2]); break; @@ -1843,7 +1846,8 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) break; case BRW_OPCODE_BFI2: assert(devinfo->gen >= 7); - brw_set_default_access_mode(p, BRW_ALIGN_16); + if (devinfo->gen < 10) +brw_set_default_access_mode(p, BRW_ALIGN_16); brw_BFI2(p, dst, src[0], src[1], src[2]); break; -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 10/13] i965: Add align1 ternary instruction disassembler support
On Fri, Sep 29, 2017 at 5:11 PM, Scott D Phillipswrote: > Matt Turner writes: > >> --- >> src/intel/compiler/brw_disasm.c | 399 >> +--- >> src/intel/compiler/brw_eu_defines.h | 11 - >> 2 files changed, 322 insertions(+), 88 deletions(-) >> >> diff --git a/src/intel/compiler/brw_disasm.c >> b/src/intel/compiler/brw_disasm.c >> index 3726172e5d..7215735967 100644 >> --- a/src/intel/compiler/brw_disasm.c >> +++ b/src/intel/compiler/brw_disasm.c >> @@ -30,6 +30,7 @@ >> #include "brw_reg.h" >> #include "brw_inst.h" >> #include "brw_eu.h" >> +#include "util/half_float.h" >> >> static bool >> has_jip(const struct gen_device_info *devinfo, enum opcode opcode) >> @@ -237,13 +238,6 @@ static const char *const access_mode[2] = { >> [1] = "align16", >> }; >> >> -static const char *const three_source_reg_encoding[] = { >> - [BRW_3SRC_TYPE_F] = "F", >> - [BRW_3SRC_TYPE_D] = "D", >> - [BRW_3SRC_TYPE_UD] = "UD", >> - [BRW_3SRC_TYPE_DF] = "DF", >> -}; >> - >> static const char *const reg_file[4] = { >> [0] = "A", >> [1] = "g", >> @@ -762,17 +756,17 @@ dest(FILE *file, const struct gen_device_info >> *devinfo, const brw_inst *inst) >> static int >> dest_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst >> *inst) >> { >> + bool is_align1 = brw_inst_3src_access_mode(devinfo, inst) == BRW_ALIGN_1; >> int err = 0; >> + unsigned flags = 0; >> uint32_t reg_file; >> - enum brw_reg_type type = >> - brw_hw_3src_type_to_reg_type(devinfo, >> - brw_inst_3src_a16_dst_hw_type(devinfo, >> inst), >> - 0); >> - unsigned dst_subreg_nr = >> - brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 / >> - brw_reg_type_to_size(type); >> - >> - if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, inst)) >> + unsigned subreg_nr; >> + unsigned hw_type; >> + enum brw_reg_type type; >> + >> + if (is_align1 && brw_inst_3src_a1_dst_reg_file(devinfo, inst)) >> + reg_file = BRW_ARCHITECTURE_REGISTER_FILE; >> + else if (devinfo->gen == 6 && brw_inst_3src_a16_dst_reg_file(devinfo, >> inst)) >>reg_file = BRW_MESSAGE_REGISTER_FILE; >> else >>reg_file = BRW_GENERAL_REGISTER_FILE; >> @@ -780,13 +774,32 @@ dest_3src(FILE *file, const struct gen_device_info >> *devinfo, const brw_inst *ins >> err |= reg(file, reg_file, brw_inst_3src_dst_reg_nr(devinfo, inst)); >> if (err == -1) >>return 0; >> - if (dst_subreg_nr) >> - format(file, ".%u", dst_subreg_nr); >> + >> + if (is_align1) { >> + flags |= IS_ALIGN1; >> + >> + if (brw_inst_3src_a1_exec_type(devinfo, inst) == >> + BRW_ALIGN1_3SRC_EXEC_TYPE_INT) >> + flags |= IS_INTEGER; >> + >> + hw_type = brw_inst_3src_a1_dst_hw_type(devinfo, inst); >> + subreg_nr = brw_inst_3src_a1_dst_subreg_nr(devinfo, inst); >> + } else { >> + hw_type = brw_inst_3src_a16_dst_hw_type(devinfo, inst); >> + subreg_nr = brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4; >> + } >> + type = brw_hw_3src_type_to_reg_type(devinfo, hw_type, flags); >> + subreg_nr /= brw_reg_type_to_size(type); >> + >> + if (subreg_nr) >> + format(file, ".%u", subreg_nr); >> string(file, "<1>"); >> - err |= control(file, "writemask", writemask, >> - brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL); >> - err |= control(file, "dest reg encoding", three_source_reg_encoding, >> - brw_inst_3src_a16_dst_hw_type(devinfo, inst), NULL); >> + >> + if (!is_align1) { >> + err |= control(file, "writemask", writemask, >> + brw_inst_3src_a16_dst_writemask(devinfo, inst), NULL); >> + } >> + string(file, brw_reg_type_to_letters(type)); >> >> return 0; >> } >> @@ -931,36 +944,169 @@ src_da16(FILE *file, >> return err; >> } >> >> +static enum brw_vertical_stride >> +vstride_from_align1_3src_vstride(enum gen10_align1_3src_vertical_stride >> vstride) >> +{ >> + switch (vstride) { >> + case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_0: return BRW_VERTICAL_STRIDE_0; >> + case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_2: return BRW_VERTICAL_STRIDE_2; >> + case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_4: return BRW_VERTICAL_STRIDE_4; >> + case BRW_ALIGN1_3SRC_VERTICAL_STRIDE_8: return BRW_VERTICAL_STRIDE_8; >> + default: >> + unreachable("not reached"); >> + } >> +} >> + >> +static enum brw_horizontal_stride >> +hstride_from_align1_3src_hstride(enum >> gen10_align1_3src_src_horizontal_stride hstride) >> +{ >> + switch (hstride) { >> + case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_0: return >> BRW_HORIZONTAL_STRIDE_0; >> + case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_1: return >> BRW_HORIZONTAL_STRIDE_1; >> + case BRW_ALIGN1_3SRC_SRC_HORIZONTAL_STRIDE_2: return >> BRW_HORIZONTAL_STRIDE_2; >> + case
[Mesa-dev] [PATCH] i965: Add align1 ternary instruction emission support
--- src/intel/compiler/brw_eu_emit.c | 219 +-- 1 file changed, 166 insertions(+), 53 deletions(-) diff --git a/src/intel/compiler/brw_eu_emit.c b/src/intel/compiler/brw_eu_emit.c index 7495a19fd8..4f98860044 100644 --- a/src/intel/compiler/brw_eu_emit.c +++ b/src/intel/compiler/brw_eu_emit.c @@ -678,64 +678,177 @@ brw_alu3(struct brw_codegen *p, unsigned opcode, struct brw_reg dest, gen7_convert_mrf_to_grf(p, ); - assert(brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_16); - - assert(dest.file == BRW_GENERAL_REGISTER_FILE || - dest.file == BRW_MESSAGE_REGISTER_FILE); assert(dest.nr < 128); + assert(src0.nr < 128); + assert(src1.nr < 128); + assert(src2.nr < 128); assert(dest.address_mode == BRW_ADDRESS_DIRECT); - assert(dest.type == BRW_REGISTER_TYPE_F || - dest.type == BRW_REGISTER_TYPE_DF || - dest.type == BRW_REGISTER_TYPE_D || - dest.type == BRW_REGISTER_TYPE_UD); - if (devinfo->gen == 6) { - brw_inst_set_3src_a16_dst_reg_file(devinfo, inst, - dest.file == BRW_MESSAGE_REGISTER_FILE); - } - brw_inst_set_3src_dst_reg_nr(devinfo, inst, dest.nr); - brw_inst_set_3src_a16_dst_subreg_nr(devinfo, inst, dest.subnr / 16); - brw_inst_set_3src_a16_dst_writemask(devinfo, inst, dest.writemask); - - assert(src0.file == BRW_GENERAL_REGISTER_FILE); assert(src0.address_mode == BRW_ADDRESS_DIRECT); - assert(src0.nr < 128); - brw_inst_set_3src_a16_src0_swizzle(devinfo, inst, src0.swizzle); - brw_inst_set_3src_a16_src0_subreg_nr(devinfo, inst, get_3src_subreg_nr(src0)); - brw_inst_set_3src_src0_reg_nr(devinfo, inst, src0.nr); - brw_inst_set_3src_src0_abs(devinfo, inst, src0.abs); - brw_inst_set_3src_src0_negate(devinfo, inst, src0.negate); - brw_inst_set_3src_a16_src0_rep_ctrl(devinfo, inst, - src0.vstride == BRW_VERTICAL_STRIDE_0); - - assert(src1.file == BRW_GENERAL_REGISTER_FILE); assert(src1.address_mode == BRW_ADDRESS_DIRECT); - assert(src1.nr < 128); - brw_inst_set_3src_src1_reg_nr(devinfo, inst, src1.nr); - brw_inst_set_3src_src1_abs(devinfo, inst, src1.abs); - brw_inst_set_3src_src1_negate(devinfo, inst, src1.negate); - brw_inst_set_3src_a16_src1_rep_ctrl(devinfo, inst, - src1.vstride == BRW_VERTICAL_STRIDE_0); - - assert(src2.file == BRW_GENERAL_REGISTER_FILE); assert(src2.address_mode == BRW_ADDRESS_DIRECT); - assert(src2.nr < 128); - brw_inst_set_3src_a16_src2_swizzle(devinfo, inst, src2.swizzle); - brw_inst_set_3src_a16_src2_subreg_nr(devinfo, inst, get_3src_subreg_nr(src2)); - brw_inst_set_3src_src2_reg_nr(devinfo, inst, src2.nr); - brw_inst_set_3src_src2_abs(devinfo, inst, src2.abs); - brw_inst_set_3src_src2_negate(devinfo, inst, src2.negate); - brw_inst_set_3src_a16_src2_rep_ctrl(devinfo, inst, - src2.vstride == BRW_VERTICAL_STRIDE_0); - - if (devinfo->gen >= 7) { - /* Set both the source and destination types based on dest.type, - * ignoring the source register types. The MAD and LRP emitters ensure - * that all four types are float. The BFE and BFI2 emitters, however, - * may send us mixed D and UD types and want us to ignore that and use - * the destination type. - */ - brw_inst_set_3src_a16_src_type(devinfo, inst, dest.type); - brw_inst_set_3src_a16_dst_type(devinfo, inst, dest.type); + + if (brw_inst_access_mode(devinfo, inst) == BRW_ALIGN_1) { + assert(dest.file == BRW_GENERAL_REGISTER_FILE || + dest.file == BRW_ARCHITECTURE_REGISTER_FILE); + + if (dest.file == BRW_ARCHITECTURE_REGISTER_FILE) { + brw_inst_set_3src_a1_dst_reg_file(devinfo, inst, + BRW_ALIGN1_3SRC_ACCUMULATOR); + brw_inst_set_3src_dst_reg_nr(devinfo, inst, BRW_ARF_ACCUMULATOR); + } else { + brw_inst_set_3src_a1_dst_reg_file(devinfo, inst, + BRW_ALIGN1_3SRC_GENERAL_REGISTER_FILE); + brw_inst_set_3src_dst_reg_nr(devinfo, inst, dest.nr); + } + brw_inst_set_3src_a1_dst_subreg_nr(devinfo, inst, dest.subnr / 8); + + brw_inst_set_3src_a1_dst_hstride(devinfo, inst, BRW_ALIGN1_3SRC_DST_HORIZONTAL_STRIDE_1); + + if (brw_reg_type_is_floating_point(dest.type)) { + brw_inst_set_3src_a1_exec_type(devinfo, inst, +BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT); + } else { + brw_inst_set_3src_a1_exec_type(devinfo, inst, +BRW_ALIGN1_3SRC_EXEC_TYPE_INT); + } + + brw_inst_set_3src_a1_dst_type(devinfo, inst, dest.type); + brw_inst_set_3src_a1_src0_type(devinfo, inst, src0.type); + brw_inst_set_3src_a1_src1_type(devinfo, inst, src1.type); + brw_inst_set_3src_a1_src2_type(devinfo, inst, src2.type); + +
[Mesa-dev] [PATCH] i965: Revert absolute mode for constant buffer pointers.
The kernel doesn't initialize the value of the INSTPM or CS_DEBUG_MODE2 registers at context initialization time. Instead, they're inherited from whatever happened to be running on the GPU prior to first run of a new context. So, when we started setting these, other contexts in the system started inheriting our values. Since this controls whether 3DSTATE_CONSTANT_* takes a pointer or an offset, getting the wrong setting is fatal for almost any process which isn't expecting this. Unfortunately, VA-API and Beignet don't initialize this (nor does older Mesa), so they will die horribly if we start doing this. UXA and SNA don't use any push constants, so they are unaffected. The kernel developers are apparently uninterested in making the proto- context initialize these registers to guarantee deterministic behavior. Apparently, the "Default Value" of registers in the documentation is a total lie, and cannot be relied upon by userspace. So, we need to find a new solution. One would be to patch VA-API and Beignet to initialize these, then get every distributor to ship those in tandem with the newer Mesa version. I'm unclear when va-intel-driver releases might happen. Another option would be to hack Mesa to restore the register back to the historical default (relative mode) at the end of each batch. This is also gross, as it has non-zero cost, and is also relying on userspace to be "polite" to other broken userspace. A large part of the motivation for contexts was to provide proper process isolation, so we should not have to do this. But, we're already doing it in some cases, because our kernel fixes to enforce isolation were reverted. In the meantime, I guess we'll just revert this patch and abandon using the feature. It will lead to fewer pushed UBOs on Broadwell+, which may lead to lower performance, but I don't have any data on the impact. Cc: "17.2"Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102774 --- src/mesa/drivers/dri/i965/brw_state_upload.c | 24 src/mesa/drivers/dri/i965/intel_screen.c | 2 +- 2 files changed, 1 insertion(+), 25 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c b/src/mesa/drivers/dri/i965/brw_state_upload.c index 16f44d03bbe..23e4ebda259 100644 --- a/src/mesa/drivers/dri/i965/brw_state_upload.c +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c @@ -101,30 +101,6 @@ brw_upload_initial_gpu_state(struct brw_context *brw) OUT_BATCH(0); ADVANCE_BATCH(); } - - /* Set the "CONSTANT_BUFFER Address Offset Disable" bit, so -* 3DSTATE_CONSTANT_XS buffer 0 is an absolute address. -* -* On Gen6-7.5, we use an execbuf parameter to do this for us. -* However, the kernel ignores that when execlists are in use. -* Fortunately, we can just write the registers from userspace -* on Gen8+, and they're context saved/restored. -*/ - if (devinfo->gen >= 9) { - BEGIN_BATCH(3); - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); - OUT_BATCH(CS_DEBUG_MODE2); - OUT_BATCH(REG_MASK(CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) | -CSDBG2_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE); - ADVANCE_BATCH(); - } else if (devinfo->gen == 8) { - BEGIN_BATCH(3); - OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2)); - OUT_BATCH(INSTPM); - OUT_BATCH(REG_MASK(INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE) | -INSTPM_CONSTANT_BUFFER_ADDRESS_OFFSET_DISABLE); - ADVANCE_BATCH(); - } } static inline const struct brw_tracked_state * diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index ea04a72e860..776c2898d5b 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -2510,7 +2510,7 @@ __DRIconfig **intelInitScreen2(__DRIscreen *dri_screen) screen->compiler = brw_compiler_create(screen, devinfo); screen->compiler->shader_debug_log = shader_debug_log_mesa; screen->compiler->shader_perf_log = shader_perf_log_mesa; - screen->compiler->constant_buffer_0_is_relative = devinfo->gen < 8; + screen->compiler->constant_buffer_0_is_relative = true; screen->compiler->supports_pull_constants = true; screen->has_exec_fence = -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Add align1 ternary instruction-word support
Reviewed-by: Scott D Phillips--- src/intel/compiler/brw_inst.h | 108 ++ 1 file changed, 108 insertions(+) diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h index 1ddee77164..40723f9012 100644 --- a/src/intel/compiler/brw_inst.h +++ b/src/intel/compiler/brw_inst.h @@ -268,6 +268,114 @@ REG_TYPE(src) #undef REG_TYPE /** + * Three-source align1 instructions: + * @{ + */ +/* Reserved 127:126 */ +/* src2_reg_nr same in align16 */ +FC(3src_a1_src2_subreg_nr, 117, 113, devinfo->gen >= 10) +FC(3src_a1_src2_hstride, 112, 111, devinfo->gen >= 10) +/* Reserved 110:109. src2 vstride is an implied parameter */ +FC(3src_a1_src2_hw_type, 108, 106, devinfo->gen >= 10) +/* Reserved 105 */ +/* src1_reg_nr same in align16 */ +FC(3src_a1_src1_subreg_nr, 96, 92, devinfo->gen >= 10) +FC(3src_a1_src1_hstride,91, 90, devinfo->gen >= 10) +FC(3src_a1_src1_vstride,89, 88, devinfo->gen >= 10) +FC(3src_a1_src1_hw_type,87, 85, devinfo->gen >= 10) +/* Reserved 84 */ +/* src0_reg_nr same in align16 */ +FC(3src_a1_src0_subreg_nr, 75, 71, devinfo->gen >= 10) +FC(3src_a1_src0_hstride,70, 69, devinfo->gen >= 10) +FC(3src_a1_src0_vstride,68, 67, devinfo->gen >= 10) +FC(3src_a1_src0_hw_type,66, 64, devinfo->gen >= 10) +/* dst_reg_nr same in align16 */ +FC(3src_a1_dst_subreg_nr, 55, 54, devinfo->gen >= 10) +FC(3src_a1_special_acc, 55, 52, devinfo->gen >= 10) /* aliases dst_subreg_nr */ +/* Reserved 51:50 */ +FC(3src_a1_dst_hstride, 49, 49, devinfo->gen >= 10) +FC(3src_a1_dst_hw_type, 48, 46, devinfo->gen >= 10) +FC(3src_a1_src2_reg_file, 45, 45, devinfo->gen >= 10) +FC(3src_a1_src1_reg_file, 44, 44, devinfo->gen >= 10) +FC(3src_a1_src0_reg_file, 43, 43, devinfo->gen >= 10) +/* Source Modifier fields same in align16 */ +FC(3src_a1_dst_reg_file,36, 36, devinfo->gen >= 10) +FC(3src_a1_exec_type, 35, 35, devinfo->gen >= 10) +/* Fields below this same in align16 */ +/** @} */ + +#define REG_TYPE(reg) \ +static inline void\ +brw_inst_set_3src_a1_##reg##_type(const struct gen_device_info *devinfo, \ + brw_inst *inst, enum brw_reg_type type) \ +{ \ + UNUSED enum gen10_align1_3src_exec_type exec_type =\ + (enum gen10_align1_3src_exec_type) brw_inst_3src_a1_exec_type(devinfo, \ +inst);\ + if (brw_reg_type_is_floating_point(type)) {\ + assert(exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_FLOAT); \ + } else { \ + assert(exec_type == BRW_ALIGN1_3SRC_EXEC_TYPE_INT); \ + } \ + unsigned hw_type = brw_reg_type_to_a1_hw_3src_type(devinfo, type); \ + brw_inst_set_3src_a1_##reg##_hw_type(devinfo, inst, hw_type); \ +} \ + \ +static inline enum brw_reg_type \ +brw_inst_3src_a1_##reg##_type(const struct gen_device_info *devinfo, \ + const brw_inst *inst) \ +{ \ + enum gen10_align1_3src_exec_type exec_type = \ + (enum gen10_align1_3src_exec_type) brw_inst_3src_a1_exec_type(devinfo, \ +inst);\ + unsigned hw_type = brw_inst_3src_a1_##reg##_hw_type(devinfo, inst);\ + return brw_a1_hw_3src_type_to_reg_type(devinfo, hw_type, exec_type); \ +} + +REG_TYPE(dst) +REG_TYPE(src0) +REG_TYPE(src1) +REG_TYPE(src2) +#undef REG_TYPE + +/** + * Three-source align1 instruction immediates: + * @{ + */ +static inline uint16_t +brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo, + const brw_inst *insn) +{ + assert(devinfo->gen >= 10); + return brw_inst_bits(insn, 82, 67); +} + +static inline uint16_t +brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo, + const brw_inst *insn) +{ + assert(devinfo->gen >= 10); + return brw_inst_bits(insn, 124, 109); +} + +static inline void +brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo, + brw_inst *insn, uint16_t value) +{ + assert(devinfo->gen >= 10); + return brw_inst_set_bits(insn, 82, 67, value); +} + +static inline
Re: [Mesa-dev] [PATCH 09/13] i965: Add align1 ternary instruction-word support
On Fri, Sep 29, 2017 at 5:08 PM, Scott D Phillipswrote: > Matt Turner writes: > >> --- >> src/intel/compiler/brw_inst.h | 114 >> ++ >> 1 file changed, 114 insertions(+) >> >> diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h >> index e6169057e3..b9c03fa88f 100644 >> --- a/src/intel/compiler/brw_inst.h >> +++ b/src/intel/compiler/brw_inst.h >> @@ -268,6 +268,120 @@ REG_TYPE(src) >> #undef REG_TYPE >> >> /** >> + * Three-source align1 instructions: >> + * @{ >> + */ >> +/* Reserved 127:126 */ >> +/* src2_reg_nr same in align16 */ >> +FC(3src_a1_src2_subreg_nr, 117, 113, devinfo->gen >= 10) >> +FC(3src_a1_src2_hstride, 112, 111, devinfo->gen >= 10) >> +/* Reserved 110:109. src2 vstride is an implied parameter */ >> +FC(3src_a1_src2_hw_type, 108, 106, devinfo->gen >= 10) >> +/* Reserved 105 */ >> +/* src1_reg_nr same in align16 */ >> +FC(3src_a1_src1_subreg_nr, 96, 92, devinfo->gen >= 10) >> +FC(3src_a1_src1_hstride,91, 90, devinfo->gen >= 10) >> +FC(3src_a1_src1_vstride,89, 88, devinfo->gen >= 10) >> +FC(3src_a1_src1_hw_type,87, 85, devinfo->gen >= 10) >> +/* Reserved 84 */ >> +/* src0_reg_nr same in align16 */ >> +FC(3src_a1_src0_subreg_nr, 75, 71, devinfo->gen >= 10) >> +FC(3src_a1_src0_hstride,70, 69, devinfo->gen >= 10) >> +FC(3src_a1_src0_vstride,68, 67, devinfo->gen >= 10) >> +FC(3src_a1_src0_hw_type,66, 64, devinfo->gen >= 10) >> +/* dst_reg_nr same in align16 */ >> +FC(3src_a1_dst_subreg_nr, 55, 54, devinfo->gen >= 10) >> +FC(3src_a1_special_acc, 55, 52, devinfo->gen >= 10) /* aliases >> dst_subreg_nr */ >> +/* Reserved 51:50 */ >> +FC(3src_a1_dst_hstride, 49, 49, devinfo->gen >= 10) >> +FC(3src_a1_dst_hw_type, 48, 46, devinfo->gen >= 10) >> +FC(3src_a1_src2_reg_file, 45, 45, devinfo->gen >= 10) >> +FC(3src_a1_src1_reg_file, 44, 44, devinfo->gen >= 10) >> +FC(3src_a1_src0_reg_file, 43, 43, devinfo->gen >= 10) >> +/* Source Modifier fields same in align16 */ >> +FC(3src_a1_dst_reg_file,36, 36, devinfo->gen >= 10) >> +FC(3src_a1_exec_type, 35, 35, devinfo->gen >= 10) >> +/* Fields below this same in align16 */ >> +/** @} */ >> + >> +#define REG_TYPE(reg) >> \ >> +static inline void >> \ >> +brw_inst_set_3src_a1_##reg##_type(const struct gen_device_info *devinfo, >> \ >> + brw_inst *inst, enum brw_reg_type type) >> \ >> +{ >> \ >> + enum gen10_align1_3src_exec_type exec_type = >> \ >> + (enum gen10_align1_3src_exec_type) >> brw_inst_3src_a1_exec_type(devinfo, \ >> +inst); >> \ > > add MAYBE_UNUSED on exec_type here to silence a bunch of warnings. Indeed. I didn't see that until I built a non-debug build. Fixed locally. > Patches 4-7,9: > > Reviewed-by: Scott D Phillips Thanks! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Add align1 ternary instruction support to conversion functions
--- src/intel/compiler/brw_disasm.c | 16 ++- src/intel/compiler/brw_inst.h | 4 +- src/intel/compiler/brw_reg_type.c | 99 --- src/intel/compiler/brw_reg_type.h | 16 +-- 4 files changed, 101 insertions(+), 34 deletions(-) diff --git a/src/intel/compiler/brw_disasm.c b/src/intel/compiler/brw_disasm.c index e91961028e..ef7f623ceb 100644 --- a/src/intel/compiler/brw_disasm.c +++ b/src/intel/compiler/brw_disasm.c @@ -764,9 +764,7 @@ dest_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *ins { int err = 0; uint32_t reg_file; - enum brw_reg_type type = - brw_hw_3src_type_to_reg_type(devinfo, - brw_inst_3src_a16_dst_hw_type(devinfo, inst)); + enum brw_reg_type type = brw_inst_3src_a16_dst_type(devinfo, inst); unsigned dst_subreg_nr = brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 / brw_reg_type_to_size(type); @@ -934,9 +932,7 @@ static int src0_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) { int err = 0; - enum brw_reg_type type = - brw_hw_3src_type_to_reg_type(devinfo, - brw_inst_3src_a16_src_hw_type(devinfo, inst)); + enum brw_reg_type type = brw_inst_3src_a16_src_type(devinfo, inst); unsigned src0_subreg_nr = brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 / brw_reg_type_to_size(type); @@ -966,9 +962,7 @@ static int src1_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) { int err = 0; - enum brw_reg_type type = - brw_hw_3src_type_to_reg_type(devinfo, - brw_inst_3src_a16_src_hw_type(devinfo, inst)); + enum brw_reg_type type = brw_inst_3src_a16_src_type(devinfo, inst); unsigned src1_subreg_nr = brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 / brw_reg_type_to_size(type); @@ -999,9 +993,7 @@ static int src2_3src(FILE *file, const struct gen_device_info *devinfo, const brw_inst *inst) { int err = 0; - enum brw_reg_type type = - brw_hw_3src_type_to_reg_type(devinfo, - brw_inst_3src_a16_src_hw_type(devinfo, inst)); + enum brw_reg_type type = brw_inst_3src_a16_src_type(devinfo, inst); unsigned src2_subreg_nr = brw_inst_3src_a16_src2_subreg_nr(devinfo, inst) * 4 / brw_reg_type_to_size(type); diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h index 0cc1a3e911..1ddee77164 100644 --- a/src/intel/compiler/brw_inst.h +++ b/src/intel/compiler/brw_inst.h @@ -251,7 +251,7 @@ static inline void \ brw_inst_set_3src_a16_##reg##_type(const struct gen_device_info *devinfo, \ brw_inst *inst, enum brw_reg_type type)\ { \ - unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type);\ + unsigned hw_type = brw_reg_type_to_a16_hw_3src_type(devinfo, type);\ brw_inst_set_3src_a16_##reg##_hw_type(devinfo, inst, hw_type); \ } \ \ @@ -260,7 +260,7 @@ brw_inst_3src_a16_##reg##_type(const struct gen_device_info *devinfo, \ const brw_inst *inst) \ { \ unsigned hw_type = brw_inst_3src_a16_##reg##_hw_type(devinfo, inst); \ - return brw_hw_3src_type_to_reg_type(devinfo, hw_type); \ + return brw_a16_hw_3src_type_to_reg_type(devinfo, hw_type); \ } REG_TYPE(dst) diff --git a/src/intel/compiler/brw_reg_type.c b/src/intel/compiler/brw_reg_type.c index f5aadf88bb..c73be15f16 100644 --- a/src/intel/compiler/brw_reg_type.c +++ b/src/intel/compiler/brw_reg_type.c @@ -84,20 +84,57 @@ static const struct { * and unsigned doublewords, so a new field is also available in the da3src * struct (part of struct brw_instruction.bits1 in brw_structs.h) to select * dst and shared-src types. + * + * CNL adds support for 3-src instructions in align1 mode, and with it support + * for most register types. */ enum hw_3src_reg_type { GEN7_3SRC_TYPE_F = 0, GEN7_3SRC_TYPE_D = 1, GEN7_3SRC_TYPE_UD = 2, GEN7_3SRC_TYPE_DF = 3, + + /** When ExecutionDatatype is 1: @{ */ + GEN10_ALIGN1_3SRC_REG_TYPE_HF = 0b000, + GEN10_ALIGN1_3SRC_REG_TYPE_F = 0b001, + GEN10_ALIGN1_3SRC_REG_TYPE_DF = 0b010, + /** @} */ + + /** When ExecutionDatatype is 0: @{ */ + GEN10_ALIGN1_3SRC_REG_TYPE_UD = 0b000, + GEN10_ALIGN1_3SRC_REG_TYPE_D = 0b001, + GEN10_ALIGN1_3SRC_REG_TYPE_UW = 0b010, + GEN10_ALIGN1_3SRC_REG_TYPE_W = 0b011, +
Re: [Mesa-dev] [PATCH 12/13] i965/fs: Use align1 mode on ternary instructions on Gen10+
On Fri, Sep 29, 2017 at 5:20 PM, Scott D Phillipswrote: > Matt Turner writes: > >> Align1 mode offers some nice features over align16, like access to more >> data types and the ability to use a 16-bit immediate. This patch does >> not start using any new features. It just emits ternary instructions in >> align1 mode. >> --- >> src/intel/compiler/brw_fs_generator.cpp | 12 >> 1 file changed, 8 insertions(+), 4 deletions(-) >> >> diff --git a/src/intel/compiler/brw_fs_generator.cpp >> b/src/intel/compiler/brw_fs_generator.cpp >> index afaec5c949..03ee26ccd4 100644 >> --- a/src/intel/compiler/brw_fs_generator.cpp >> +++ b/src/intel/compiler/brw_fs_generator.cpp >> @@ -1728,14 +1728,16 @@ fs_generator::generate_code(const cfg_t *cfg, int >> dispatch_width) >> >>case BRW_OPCODE_MAD: >> assert(devinfo->gen >= 6); >> - brw_set_default_access_mode(p, BRW_ALIGN_16); >> + if (devinfo->gen < 10) >> +brw_set_default_access_mode(p, BRW_ALIGN_16); > > Not setting anything gets BRW_ALIGN_1 because its enum value is 0. Maybe > add a comment for that if you want. Actually, it's set at the top of the loop: brw_set_default_access_mode(p, BRW_ALIGN_1); along with some other defaults. This is done, because align1 mode matches well with the scalar mode that the fragment shader runs in (and on Gen8+ all other stages too). >> brw_MAD(p, dst, src[0], src[1], src[2]); >>break; >> >>case BRW_OPCODE_LRP: >> assert(devinfo->gen >= 6); >>brw_set_default_access_mode(p, BRW_ALIGN_16); >> - brw_LRP(p, dst, src[0], src[1], src[2]); >> + if (devinfo->gen < 10) >> +brw_LRP(p, dst, src[0], src[1], src[2]); > > Copy paste error Whoops. Thanks. Fixed locally. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 08/13] i965: Add align1 ternary instruction support to conversion functions
On Fri, Sep 29, 2017 at 5:07 PM, Scott D Phillipswrote: > Matt Turner writes: > >> --- >> src/intel/compiler/brw_disasm.c | 12 --- >> src/intel/compiler/brw_inst.h | 4 +-- >> src/intel/compiler/brw_reg_type.c | 76 >> --- >> src/intel/compiler/brw_reg_type.h | 7 ++-- >> 4 files changed, 79 insertions(+), 20 deletions(-) >> >> diff --git a/src/intel/compiler/brw_disasm.c >> b/src/intel/compiler/brw_disasm.c >> index aab4a65b7d..3726172e5d 100644 >> --- a/src/intel/compiler/brw_disasm.c >> +++ b/src/intel/compiler/brw_disasm.c >> @@ -766,7 +766,8 @@ dest_3src(FILE *file, const struct gen_device_info >> *devinfo, const brw_inst *ins >> uint32_t reg_file; >> enum brw_reg_type type = >>brw_hw_3src_type_to_reg_type(devinfo, >> - brw_inst_3src_a16_dst_hw_type(devinfo, >> inst)); >> + brw_inst_3src_a16_dst_hw_type(devinfo, >> inst), >> + 0); >> unsigned dst_subreg_nr = >>brw_inst_3src_a16_dst_subreg_nr(devinfo, inst) * 4 / >>brw_reg_type_to_size(type); >> @@ -936,7 +937,8 @@ src0_3src(FILE *file, const struct gen_device_info >> *devinfo, const brw_inst *ins >> int err = 0; >> enum brw_reg_type type = >>brw_hw_3src_type_to_reg_type(devinfo, >> - brw_inst_3src_a16_src_hw_type(devinfo, >> inst)); >> + brw_inst_3src_a16_src_hw_type(devinfo, >> inst), >> + 0); >> unsigned src0_subreg_nr = >>brw_inst_3src_a16_src0_subreg_nr(devinfo, inst) * 4 / >>brw_reg_type_to_size(type); >> @@ -968,7 +970,8 @@ src1_3src(FILE *file, const struct gen_device_info >> *devinfo, const brw_inst *ins >> int err = 0; >> enum brw_reg_type type = >>brw_hw_3src_type_to_reg_type(devinfo, >> - brw_inst_3src_a16_src_hw_type(devinfo, >> inst)); >> + brw_inst_3src_a16_src_hw_type(devinfo, >> inst), >> + 0); >> unsigned src1_subreg_nr = >>brw_inst_3src_a16_src1_subreg_nr(devinfo, inst) * 4 / >>brw_reg_type_to_size(type); >> @@ -1001,7 +1004,8 @@ src2_3src(FILE *file, const struct gen_device_info >> *devinfo, const brw_inst *ins >> int err = 0; >> enum brw_reg_type type = >>brw_hw_3src_type_to_reg_type(devinfo, >> - brw_inst_3src_a16_src_hw_type(devinfo, >> inst)); >> + brw_inst_3src_a16_src_hw_type(devinfo, >> inst), >> + 0); >> unsigned src2_subreg_nr = >>brw_inst_3src_a16_src2_subreg_nr(devinfo, inst) * 4 / >>brw_reg_type_to_size(type); >> diff --git a/src/intel/compiler/brw_inst.h b/src/intel/compiler/brw_inst.h >> index 0cc1a3e911..e6169057e3 100644 >> --- a/src/intel/compiler/brw_inst.h >> +++ b/src/intel/compiler/brw_inst.h >> @@ -251,7 +251,7 @@ static inline void >> \ >> brw_inst_set_3src_a16_##reg##_type(const struct gen_device_info *devinfo, >> \ >> brw_inst *inst, enum brw_reg_type type) >> \ >> { >> \ >> - unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type); >> \ >> + unsigned hw_type = brw_reg_type_to_hw_3src_type(devinfo, type, 0); >> \ >> brw_inst_set_3src_a16_##reg##_hw_type(devinfo, inst, hw_type); >> \ >> } >> \ >> >> \ >> @@ -260,7 +260,7 @@ brw_inst_3src_a16_##reg##_type(const struct >> gen_device_info *devinfo, \ >> const brw_inst *inst) >> \ >> { >> \ >> unsigned hw_type = brw_inst_3src_a16_##reg##_hw_type(devinfo, inst); >> \ >> - return brw_hw_3src_type_to_reg_type(devinfo, hw_type); >> \ >> + return brw_hw_3src_type_to_reg_type(devinfo, hw_type, 0); >> \ >> } >> >> REG_TYPE(dst) >> diff --git a/src/intel/compiler/brw_reg_type.c >> b/src/intel/compiler/brw_reg_type.c >> index d65ebaee48..7fb4a1e62a 100644 >> --- a/src/intel/compiler/brw_reg_type.c >> +++ b/src/intel/compiler/brw_reg_type.c >> @@ -84,20 +84,55 @@ static const struct { >> * and unsigned doublewords, so a new field is also available in the da3src >> * struct (part of struct brw_instruction.bits1 in brw_structs.h) to select >> * dst and shared-src types. >> + * >> + * CNL adds support for 3-src instructions in align1
Re: [Mesa-dev] [PATCH v2 16/32] i965: Don't rely on nir for uses_texture_gather
On 20/10/17 09:38, Jordan Justen wrote: On 2017-10-19 15:07:45, Timothy Arceri wrote: Maybe you should just do: prog->nir->info = prog->info; After you restore nir from the cache? We only deserialize from nir if the gen program restore fails. So, hopefully prog->nir will be NULL. IMO we should always restore the NIR. See my comments on 14 and 25. Basically we want to do all the restores at link time to avoid ever falling back at draw time. -Jordan On 19/10/17 16:32, Jordan Justen wrote: When a program is restored from the shader cache, prog->nir will be NULL, but prog->info will be restored. Signed-off-by: Jordan Justen--- src/mesa/drivers/dri/i965/brw_wm.c | 4 ++-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 69d8e61e40..e511f0f70b 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -330,7 +330,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, } /* gather4 for RG32* is broken in multiple ways on Gen7. */ - if (devinfo->gen == 7 && prog->nir->info.uses_texture_gather) { + if (devinfo->gen == 7 && prog->info.uses_texture_gather) { switch (img->InternalFormat) { case GL_RG32I: case GL_RG32UI: { @@ -368,7 +368,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, /* Gen6's gather4 is broken for UINT/SINT; we treat them as * UNORM/FLOAT instead and fix it in the shader. */ - if (devinfo->gen == 6 && prog->nir->info.uses_texture_gather) { + if (devinfo->gen == 6 && prog->info.uses_texture_gather) { key->gen6_gather_wa[s] = gen6_gather_workaround(img->InternalFormat); } diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index f4e9cf48c6..4f454dae44 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -1198,15 +1198,15 @@ brw_update_texture_surfaces(struct brw_context *brw) * allows the surface format to be overriden for only the * gather4 messages. */ if (devinfo->gen < 8) { - if (vs && vs->nir->info.uses_texture_gather) + if (vs && vs->info.uses_texture_gather) update_stage_texture_surfaces(brw, vs, >vs.base, true, 0); - if (tcs && tcs->nir->info.uses_texture_gather) + if (tcs && tcs->info.uses_texture_gather) update_stage_texture_surfaces(brw, tcs, >tcs.base, true, 0); - if (tes && tes->nir->info.uses_texture_gather) + if (tes && tes->info.uses_texture_gather) update_stage_texture_surfaces(brw, tes, >tes.base, true, 0); - if (gs && gs->nir->info.uses_texture_gather) + if (gs && gs->info.uses_texture_gather) update_stage_texture_surfaces(brw, gs, >gs.base, true, 0); - if (fs && fs->nir->info.uses_texture_gather) + if (fs && fs->info.uses_texture_gather) update_stage_texture_surfaces(brw, fs, >wm.base, true, 0); } @@ -1253,7 +1253,7 @@ brw_update_cs_texture_surfaces(struct brw_context *brw) * gather4 messages. */ if (devinfo->gen < 8) { - if (cs && cs->nir->info.uses_texture_gather) + if (cs && cs->info.uses_texture_gather) update_stage_texture_surfaces(brw, cs, >cs.base, true, 0); } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 16/32] i965: Don't rely on nir for uses_texture_gather
On 2017-10-19 15:07:45, Timothy Arceri wrote: > Maybe you should just do: > > prog->nir->info = prog->info; > > After you restore nir from the cache? We only deserialize from nir if the gen program restore fails. So, hopefully prog->nir will be NULL. -Jordan > > On 19/10/17 16:32, Jordan Justen wrote: > > When a program is restored from the shader cache, prog->nir will be > > NULL, but prog->info will be restored. > > > > Signed-off-by: Jordan Justen> > --- > > src/mesa/drivers/dri/i965/brw_wm.c | 4 ++-- > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++-- > > 2 files changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_wm.c > > b/src/mesa/drivers/dri/i965/brw_wm.c > > index 69d8e61e40..e511f0f70b 100644 > > --- a/src/mesa/drivers/dri/i965/brw_wm.c > > +++ b/src/mesa/drivers/dri/i965/brw_wm.c > > @@ -330,7 +330,7 @@ brw_populate_sampler_prog_key_data(struct gl_context > > *ctx, > >} > > > >/* gather4 for RG32* is broken in multiple ways on Gen7. */ > > - if (devinfo->gen == 7 && prog->nir->info.uses_texture_gather) { > > + if (devinfo->gen == 7 && prog->info.uses_texture_gather) { > > switch (img->InternalFormat) { > > case GL_RG32I: > > case GL_RG32UI: { > > @@ -368,7 +368,7 @@ brw_populate_sampler_prog_key_data(struct gl_context > > *ctx, > >/* Gen6's gather4 is broken for UINT/SINT; we treat them as > > * UNORM/FLOAT instead and fix it in the shader. > > */ > > - if (devinfo->gen == 6 && prog->nir->info.uses_texture_gather) { > > + if (devinfo->gen == 6 && prog->info.uses_texture_gather) { > > key->gen6_gather_wa[s] = > > gen6_gather_workaround(img->InternalFormat); > >} > > > > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > index f4e9cf48c6..4f454dae44 100644 > > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > @@ -1198,15 +1198,15 @@ brw_update_texture_surfaces(struct brw_context *brw) > > * allows the surface format to be overriden for only the > > * gather4 messages. */ > > if (devinfo->gen < 8) { > > - if (vs && vs->nir->info.uses_texture_gather) > > + if (vs && vs->info.uses_texture_gather) > >update_stage_texture_surfaces(brw, vs, >vs.base, true, 0); > > - if (tcs && tcs->nir->info.uses_texture_gather) > > + if (tcs && tcs->info.uses_texture_gather) > >update_stage_texture_surfaces(brw, tcs, >tcs.base, true, 0); > > - if (tes && tes->nir->info.uses_texture_gather) > > + if (tes && tes->info.uses_texture_gather) > >update_stage_texture_surfaces(brw, tes, >tes.base, true, 0); > > - if (gs && gs->nir->info.uses_texture_gather) > > + if (gs && gs->info.uses_texture_gather) > >update_stage_texture_surfaces(brw, gs, >gs.base, true, 0); > > - if (fs && fs->nir->info.uses_texture_gather) > > + if (fs && fs->info.uses_texture_gather) > >update_stage_texture_surfaces(brw, fs, >wm.base, true, 0); > > } > > > > @@ -1253,7 +1253,7 @@ brw_update_cs_texture_surfaces(struct brw_context > > *brw) > > * gather4 messages. > > */ > > if (devinfo->gen < 8) { > > - if (cs && cs->nir->info.uses_texture_gather) > > + if (cs && cs->info.uses_texture_gather) > >update_stage_texture_surfaces(brw, cs, >cs.base, true, 0); > > } > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] swr: Rework scratch space allocation
Reviewed-by: Bruce Cherniak> On Oct 19, 2017, at 4:40 PM, George Kyriazis > wrote: > > Remove allocation of > 2kbyte buffers into context memory in > swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers > and shader constants to a scratch space to be used by the upcoming draw.) > > Large shader constant allocations need to be done in the circular scratch > buffer instead of context memory, because their values persist across > render calls. > > Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger > buffers will get too large for the circular scratch space. > > Fixes render issues with CEI Ensight. > --- > src/gallium/drivers/swr/swr_scratch.cpp | 51 ++--- > src/gallium/drivers/swr/swr_screen.cpp | 2 +- > 2 files changed, 23 insertions(+), 30 deletions(-) > > diff --git a/src/gallium/drivers/swr/swr_scratch.cpp > b/src/gallium/drivers/swr/swr_scratch.cpp > index d298a48..c31f3c2 100644 > --- a/src/gallium/drivers/swr/swr_scratch.cpp > +++ b/src/gallium/drivers/swr/swr_scratch.cpp > @@ -28,8 +28,6 @@ > #include "swr_fence_work.h" > #include "api.h" > > -#define SCRATCH_SINGLE_ALLOCATION_LIMIT 2048 > - > void * > swr_copy_to_scratch_space(struct swr_context *ctx, > struct swr_scratch_space *space, > @@ -40,41 +38,36 @@ swr_copy_to_scratch_space(struct swr_context *ctx, >assert(space); >assert(size); > > - if (size >= SCRATCH_SINGLE_ALLOCATION_LIMIT) { > - /* Use per draw SwrAllocDrawContextMemory for larger copies */ > - ptr = ctx->api.pfnSwrAllocDrawContextMemory(ctx->swrContext, size, 4); > - } else { > - /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */ > - unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT; > - > - /* Need to grow space */ > - if (max_size_in_flight > space->current_size) { > - space->current_size = max_size_in_flight; > + /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */ > + unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT; > > - if (space->base) { > -/* defer delete, use aligned-free */ > -struct swr_screen *screen = swr_screen(ctx->pipe.screen); > -swr_fence_work_free(screen->flush_fence, space->base, true); > -space->base = NULL; > - } > + /* Need to grow space */ > + if (max_size_in_flight > space->current_size) { > + space->current_size = max_size_in_flight; > > - if (!space->base) { > -space->base = (uint8_t *)AlignedMalloc(space->current_size, > - sizeof(void *)); > -space->head = (void *)space->base; > - } > + if (space->base) { > + /* defer delete, use aligned-free */ > + struct swr_screen *screen = swr_screen(ctx->pipe.screen); > + swr_fence_work_free(screen->flush_fence, space->base, true); > + space->base = NULL; > } > > - /* Wrap */ > - if (((uint8_t *)space->head + size) > - >= ((uint8_t *)space->base + space->current_size)) { > - space->head = space->base; > + if (!space->base) { > + space->base = (uint8_t *)AlignedMalloc(space->current_size, > +sizeof(void *)); > + space->head = (void *)space->base; > } > + } > > - ptr = space->head; > - space->head = (uint8_t *)space->head + size; > + /* Wrap */ > + if (((uint8_t *)space->head + size) > + >= ((uint8_t *)space->base + space->current_size)) { > + space->head = space->base; >} > > + ptr = space->head; > + space->head = (uint8_t *)space->head + size; > + >/* Copy user_buffer to scratch */ >if (user_buffer) > memcpy(ptr, user_buffer, size); > diff --git a/src/gallium/drivers/swr/swr_screen.cpp > b/src/gallium/drivers/swr/swr_screen.cpp > index 46b3a00..b21c35e 100644 > --- a/src/gallium/drivers/swr/swr_screen.cpp > +++ b/src/gallium/drivers/swr/swr_screen.cpp > @@ -57,7 +57,7 @@ > #define SWR_MAX_TEXTURE_ARRAY_LAYERS 512 /* 8K x 512 / 8K x 8K x 512 */ > > /* Default max client_copy_limit */ > -#define SWR_CLIENT_COPY_LIMIT 32768 > +#define SWR_CLIENT_COPY_LIMIT 8192 > > /* Flag indicates creation of alternate surface, to prevent recursive loop > * in resource creation when msaa_force_enable is set. */ > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On 20/10/17 08:45, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 5:35 PM, Timothy Arceriwrote: On 20/10/17 08:27, Timothy Arceri wrote: On 20/10/17 08:19, Timothy Arceri wrote: On 20/10/17 04:21, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: Will this work with SSO shaders? Presumably the validation still has to happen, but I don't think cross_validate_outputs_to_inputs() will end up getting called. The piglit tests I mention use SSO so it seems to be working for this. See for example: tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- block-member-location-overlap.shader_test Ah great. I'm a little curious how, since I don't see how cross_validate_outputs_to_inputs would get called for SSO shaders. Perhaps I'm looking at old code. Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" - can you try with that? It's just using a pipeline, but both shaders are ending up in it. BTW, my solution to all this was https://patchwork.freedesktop.org/patch/175959/ but Tim hated it, and I didn't have the time to properly respond. Hate is a strong word, the problem is it duplicated some of the checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks should be pulled into a helper/helpers that can also be used by SSO. Oh and your patch was also missing all the component checking logic which we also should be doing for SSO. Moving the checks into helpers will give us these check for free. Thinking about it some more, I don't see how your patch works for SSO. The only place you can validate the SSO pipeline is via _mesa_validate_program_pipeline() since we can't know what the other stages will be at link time. You don't need to validate the pipeline. You need to make sure each individual program links, and the code is called at glLinkShader() time, at which point it will fail the link if the interfaces have things they're not supposed to. Oh right, this is just for validating the outward facing interfaces of a SSO program. I think we have discussed this before :P ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 32/32] disk_cache: Add support for MESA_GLSL_CACHE_TIMESTAMP in debug builds
Maybe add to docs? On 19/10/17 16:32, Jordan Justen wrote: The MESA_GLSL_CACHE_TIMESTAMP environment variable can be set to override the driver timestamp. Usually the driver will specify a hash of their driver build so the cache items become invalid with each driver build. We don't guarantee a stable serialized shader cache format, so changing the timestamp for each build is required for safety. Nevertheless, during debug, making small changes to the driver may be known to be safe. The driver developer can use this variable to keep the timestamp consistent. When debugging issues on an application for which the shader cache greatly lowers the startup time, this can save the developer significant time. Signed-off-by: Jordan Justen--- src/util/disk_cache.c | 12 1 file changed, 12 insertions(+) diff --git a/src/util/disk_cache.c b/src/util/disk_cache.c index fde6e2e097..54f48a8ba5 100644 --- a/src/util/disk_cache.c +++ b/src/util/disk_cache.c @@ -208,6 +208,18 @@ disk_cache_create(const char *gpu_name, const char *timestamp, if (env_var_as_boolean("MESA_GLSL_CACHE_DISABLE", false)) goto fail; +#ifdef DEBUG + /* For debug builds, MESA_GLSL_CACHE_TIMESTAMP can be set to override the +* driver specified timestamp. This will allow small changes to be made to +* the driver without invalidating the cache. Given that this is normally +* unsafe, it is only allowed for debug builds. +*/ + const char *timestamp_override = getenv("MESA_GLSL_CACHE_TIMESTAMP"); + if (timestamp_override) { + timestamp = timestamp_override; + } +#endif + /* Determine path for cache based on the first defined name as follows: * * $MESA_GLSL_CACHE_DIR ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 29/32] disk_cache: Fix issue reading GLSL metadata
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 28/32] glsl/shader_cache: Save fs (BlendSupport) metadata
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 27/32] i965: Initialize sha1 hash of dri config options
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 25/32] i965: add cache fallback support using serialized nir
IMO you should just do this in brw_link() when its cheap. You will be doing the deserialization at draw time here which is not what we want. Can also drop the serialized_nir params if you follow my suggestions in patch 14. On 19/10/17 16:32, Jordan Justen wrote: If the i965 gen program cannot be loaded from the cache, then we fallback to using a serialized nir program. This is based on "i965: add cache fallback support" by Timothy Arceri. Tim's version was written to fallback to compiling from source, and therefore had to be much more complex. After Connor and Jason implemented nir serialization, I was able to rewrite and greatly simplify this patch. Signed-off-by: Jordan Justen --- src/mesa/drivers/dri/i965/brw_disk_cache.c | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_disk_cache.c b/src/mesa/drivers/dri/i965/brw_disk_cache.c index d89df846d5..790fad6925 100644 --- a/src/mesa/drivers/dri/i965/brw_disk_cache.c +++ b/src/mesa/drivers/dri/i965/brw_disk_cache.c @@ -24,6 +24,7 @@ #include "compiler/blob.h" #include "compiler/glsl/ir_uniform.h" #include "compiler/glsl/shader_cache.h" +#include "compiler/nir/nir_serialize.h" #include "main/mtypes.h" #include "util/disk_cache.h" #include "util/macros.h" @@ -79,6 +80,27 @@ gen_shader_sha1(struct brw_context *brw, struct gl_program *prog, _mesa_sha1_compute(manifest, strlen(manifest), out_sha1); } +static void +fallback_to_full_recompile(struct brw_context *brw, struct gl_program *prog, + gl_shader_stage stage) +{ + prog->program_written_to_cache = false; + if (brw->ctx._Shader->Flags & GLSL_CACHE_INFO) { + fprintf(stderr, "falling back to nir %s.\n", + _mesa_shader_stage_to_abbrev(prog->info.stage)); + } + + if (!prog->nir) { + assert(prog->serialized_nir && prog->serialized_nir_size > 0); + const struct nir_shader_compiler_options *options = + brw->ctx.Const.ShaderCompilerOptions[stage].NirOptions; + struct blob_reader reader; + blob_reader_init(, prog->serialized_nir, + prog->serialized_nir_size); + prog->nir = nir_deserialize(NULL, options, ); + } +} + static void read_program_data(struct gl_program *glprog, struct blob_reader *binary, struct brw_stage_prog_data *prog_data, @@ -298,6 +320,9 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage) prog->sh.LinkedTransformFeedback->api_enabled) return false; + if (brw->ctx._Shader->Flags & GLSL_CACHE_FALLBACK) + goto FAIL; + if (prog->sh.data->LinkStatus != linking_skipped) goto FAIL; @@ -311,7 +336,7 @@ brw_disk_cache_upload_program(struct brw_context *brw, gl_shader_stage stage) return true; FAIL: - /*FIXME: Fall back and compile from source here. */ + fallback_to_full_recompile(brw, prog, stage); return false; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 22/32] i965: Add shader cache support for compute
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 16/32] i965: Don't rely on nir for uses_texture_gather
Maybe you should just do: prog->nir->info = prog->info; After you restore nir from the cache? On 19/10/17 16:32, Jordan Justen wrote: When a program is restored from the shader cache, prog->nir will be NULL, but prog->info will be restored. Signed-off-by: Jordan Justen--- src/mesa/drivers/dri/i965/brw_wm.c | 4 ++-- src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 12 ++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index 69d8e61e40..e511f0f70b 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -330,7 +330,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, } /* gather4 for RG32* is broken in multiple ways on Gen7. */ - if (devinfo->gen == 7 && prog->nir->info.uses_texture_gather) { + if (devinfo->gen == 7 && prog->info.uses_texture_gather) { switch (img->InternalFormat) { case GL_RG32I: case GL_RG32UI: { @@ -368,7 +368,7 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, /* Gen6's gather4 is broken for UINT/SINT; we treat them as * UNORM/FLOAT instead and fix it in the shader. */ - if (devinfo->gen == 6 && prog->nir->info.uses_texture_gather) { + if (devinfo->gen == 6 && prog->info.uses_texture_gather) { key->gen6_gather_wa[s] = gen6_gather_workaround(img->InternalFormat); } diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c index f4e9cf48c6..4f454dae44 100644 --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c @@ -1198,15 +1198,15 @@ brw_update_texture_surfaces(struct brw_context *brw) * allows the surface format to be overriden for only the * gather4 messages. */ if (devinfo->gen < 8) { - if (vs && vs->nir->info.uses_texture_gather) + if (vs && vs->info.uses_texture_gather) update_stage_texture_surfaces(brw, vs, >vs.base, true, 0); - if (tcs && tcs->nir->info.uses_texture_gather) + if (tcs && tcs->info.uses_texture_gather) update_stage_texture_surfaces(brw, tcs, >tcs.base, true, 0); - if (tes && tes->nir->info.uses_texture_gather) + if (tes && tes->info.uses_texture_gather) update_stage_texture_surfaces(brw, tes, >tes.base, true, 0); - if (gs && gs->nir->info.uses_texture_gather) + if (gs && gs->info.uses_texture_gather) update_stage_texture_surfaces(brw, gs, >gs.base, true, 0); - if (fs && fs->nir->info.uses_texture_gather) + if (fs && fs->info.uses_texture_gather) update_stage_texture_surfaces(brw, fs, >wm.base, true, 0); } @@ -1253,7 +1253,7 @@ brw_update_cs_texture_surfaces(struct brw_context *brw) * gather4 messages. */ if (devinfo->gen < 8) { - if (cs && cs->nir->info.uses_texture_gather) + if (cs && cs->info.uses_texture_gather) update_stage_texture_surfaces(brw, cs, >cs.base, true, 0); } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 14/32] glsl/shader_cache: Save and restore serialized nir in gl_program
This and the previous patch feels wrong. The glsl shader cache shouldn't be handling writing nir to disk. IMO you should add this functionality to nir_serialize.c (maybe rename is nir_cache.c ??). That way we have continue with the nir is a toolkit theme and we can easily reused the util to write nir to disk in Vulkan which could be useful for reducing compile times. Doing it this way we could deserialize it in brw_link() and serialize it in glsl_to_nir() no need for the extra serialized_nir params. On 19/10/17 16:32, Jordan Justen wrote: Signed-off-by: Jordan Justen--- src/compiler/glsl/shader_cache.cpp | 16 1 file changed, 16 insertions(+) diff --git a/src/compiler/glsl/shader_cache.cpp b/src/compiler/glsl/shader_cache.cpp index ca90cfde35..f43bd6b17e 100644 --- a/src/compiler/glsl/shader_cache.cpp +++ b/src/compiler/glsl/shader_cache.cpp @@ -1062,6 +1062,14 @@ write_shader_metadata(struct blob *metadata, gl_linked_shader *shader) } write_shader_parameters(metadata, glprog->Parameters); + + assert((glprog->serialized_nir == NULL) == + (glprog->serialized_nir_size == 0)); + blob_write_uint32(metadata, (uint32_t)glprog->serialized_nir_size); + if (glprog->serialized_nir_size > 0) { + blob_write_bytes(metadata, glprog->serialized_nir, + glprog->serialized_nir_size); + } } static void @@ -1116,6 +1124,14 @@ read_shader_metadata(struct blob_reader *metadata, glprog->Parameters = _mesa_new_parameter_list(); read_shader_parameters(metadata, glprog->Parameters); + + glprog->serialized_nir_size = (size_t)blob_read_uint32(metadata); + if (glprog->serialized_nir_size > 0) { + glprog->serialized_nir = + (uint8_t*)ralloc_size(glprog, glprog->serialized_nir_size); + blob_copy_bytes(metadata, glprog->serialized_nir, + glprog->serialized_nir_size); + } } static void ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 12/32] nir: Add hooks for testing serialization
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 11/32] nir: add serialization and deserialization
Acked-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] radv: Enable tessellation shaders for GFX9.
It mostly works now. --- src/amd/vulkan/radv_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index 7f306db5c48..125498809ec 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -429,7 +429,7 @@ void radv_GetPhysicalDeviceFeatures( .imageCubeArray = true, .independentBlend = true, .geometryShader = !is_gfx9, - .tessellationShader = !is_gfx9, + .tessellationShader = true, .sampleRateShading= true, .dualSrcBlend = true, .logicOp = true, -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] ac/nir: init full exec mask for merged shaders.
From: Dave AirlieSigned-off-by: Bas Nieuwenhuizen --- src/amd/common/ac_llvm_build.c | 8 src/amd/common/ac_llvm_build.h | 1 + src/amd/common/ac_nir_to_llvm.c | 3 +++ 3 files changed, 12 insertions(+) diff --git a/src/amd/common/ac_llvm_build.c b/src/amd/common/ac_llvm_build.c index 949f181aace..e5cd23e0251 100644 --- a/src/amd/common/ac_llvm_build.c +++ b/src/amd/common/ac_llvm_build.c @@ -1734,3 +1734,11 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ctx, *num_param_exports = exports.num; } } + +void ac_init_exec_full_mask(struct ac_llvm_context *ctx) +{ + LLVMValueRef full_mask = LLVMConstInt(ctx->i64, ~0ull, 0); + ac_build_intrinsic(ctx, + "llvm.amdgcn.init.exec", ctx->voidt, + _mask, 1, AC_FUNC_ATTR_CONVERGENT); +} diff --git a/src/amd/common/ac_llvm_build.h b/src/amd/common/ac_llvm_build.h index f0b5875b423..aa2a2899ab4 100644 --- a/src/amd/common/ac_llvm_build.h +++ b/src/amd/common/ac_llvm_build.h @@ -281,6 +281,7 @@ void ac_optimize_vs_outputs(struct ac_llvm_context *ac, uint8_t *vs_output_param_offset, uint32_t num_outputs, uint8_t *num_param_exports); +void ac_init_exec_full_mask(struct ac_llvm_context *ctx); #ifdef __cplusplus } #endif diff --git a/src/amd/common/ac_nir_to_llvm.c b/src/amd/common/ac_nir_to_llvm.c index 242675654d2..b6a7d43789d 100644 --- a/src/amd/common/ac_nir_to_llvm.c +++ b/src/amd/common/ac_nir_to_llvm.c @@ -6488,6 +6488,9 @@ LLVMModuleRef ac_translate_nir_to_llvm(LLVMTargetMachineRef tm, ctx.abi.load_ssbo = radv_load_ssbo; ctx.abi.load_sampler_desc = radv_get_sampler_desc; + if (shader_count >= 2) + ac_init_exec_full_mask(); + if (ctx.ac.chip_class == GFX9 && shaders[shader_count - 1]->stage == MESA_SHADER_TESS_CTRL) ac_nir_fixup_ls_hs_input_vgprs(); -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 10/32] glsl_to_nir: Zero nir_constant in constant_copy for valgrind & nir_serialize
5-10: Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 04/32] compiler/types: Support [de]serializing void types
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On Thu, Oct 19, 2017 at 5:35 PM, Timothy Arceriwrote: > > > On 20/10/17 08:27, Timothy Arceri wrote: >> >> >> >> On 20/10/17 08:19, Timothy Arceri wrote: >>> >>> On 20/10/17 04:21, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin wrote: > > On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: >> >> On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: >>> >>> Will this work with SSO shaders? Presumably the validation still has >>> to happen, but I don't think cross_validate_outputs_to_inputs() will >>> end up getting called. >> >> >> The piglit tests I mention use SSO so it seems to be working for this. >> See for example: >> >> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- >> block-member-location-overlap.shader_test > > > Ah great. I'm a little curious how, since I don't see how > cross_validate_outputs_to_inputs would get called for SSO shaders. > Perhaps I'm looking at old code. > > Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" > - can you try with that? It's just using a pipeline, but both shaders > are ending up in it. BTW, my solution to all this was https://patchwork.freedesktop.org/patch/175959/ but Tim hated it, and I didn't have the time to properly respond. >>> >>> >>> Hate is a strong word, the problem is it duplicated some of the >>> checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks >>> should be pulled into a helper/helpers that can also be used by SSO. >> >> >> Oh and your patch was also missing all the component checking logic which >> we also should be doing for SSO. Moving the checks into helpers will give us >> these check for free. > > > Thinking about it some more, I don't see how your patch works for SSO. > > The only place you can validate the SSO pipeline is via > _mesa_validate_program_pipeline() since we can't know what the other stages > will be at link time. You don't need to validate the pipeline. You need to make sure each individual program links, and the code is called at glLinkShader() time, at which point it will fail the link if the interfaces have things they're not supposed to. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 03/32] nir/intrinsics: Set the correct num_indices for load_output
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 02/32] nir: Get rid of nir_shader::stage
Acked-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 01/32] glsl: move shader_cache type handling to glsl_types
Reviewed-by: Timothy Arceri___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] swr: Rework scratch space allocation
Remove allocation of > 2kbyte buffers into context memory in swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers and shader constants to a scratch space to be used by the upcoming draw.) Large shader constant allocations need to be done in the circular scratch buffer instead of context memory, because their values persist across render calls. Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger buffers will get too large for the circular scratch space. Fixes render issues with CEI Ensight. --- src/gallium/drivers/swr/swr_scratch.cpp | 51 ++--- src/gallium/drivers/swr/swr_screen.cpp | 2 +- 2 files changed, 23 insertions(+), 30 deletions(-) diff --git a/src/gallium/drivers/swr/swr_scratch.cpp b/src/gallium/drivers/swr/swr_scratch.cpp index d298a48..c31f3c2 100644 --- a/src/gallium/drivers/swr/swr_scratch.cpp +++ b/src/gallium/drivers/swr/swr_scratch.cpp @@ -28,8 +28,6 @@ #include "swr_fence_work.h" #include "api.h" -#define SCRATCH_SINGLE_ALLOCATION_LIMIT 2048 - void * swr_copy_to_scratch_space(struct swr_context *ctx, struct swr_scratch_space *space, @@ -40,41 +38,36 @@ swr_copy_to_scratch_space(struct swr_context *ctx, assert(space); assert(size); - if (size >= SCRATCH_SINGLE_ALLOCATION_LIMIT) { - /* Use per draw SwrAllocDrawContextMemory for larger copies */ - ptr = ctx->api.pfnSwrAllocDrawContextMemory(ctx->swrContext, size, 4); - } else { - /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */ - unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT; - - /* Need to grow space */ - if (max_size_in_flight > space->current_size) { - space->current_size = max_size_in_flight; + /* Allocate enough so that MAX_DRAWS_IN_FLIGHT sets fit. */ + unsigned int max_size_in_flight = size * KNOB_MAX_DRAWS_IN_FLIGHT; - if (space->base) { -/* defer delete, use aligned-free */ -struct swr_screen *screen = swr_screen(ctx->pipe.screen); -swr_fence_work_free(screen->flush_fence, space->base, true); -space->base = NULL; - } + /* Need to grow space */ + if (max_size_in_flight > space->current_size) { + space->current_size = max_size_in_flight; - if (!space->base) { -space->base = (uint8_t *)AlignedMalloc(space->current_size, - sizeof(void *)); -space->head = (void *)space->base; - } + if (space->base) { + /* defer delete, use aligned-free */ + struct swr_screen *screen = swr_screen(ctx->pipe.screen); + swr_fence_work_free(screen->flush_fence, space->base, true); + space->base = NULL; } - /* Wrap */ - if (((uint8_t *)space->head + size) - >= ((uint8_t *)space->base + space->current_size)) { - space->head = space->base; + if (!space->base) { + space->base = (uint8_t *)AlignedMalloc(space->current_size, +sizeof(void *)); + space->head = (void *)space->base; } + } - ptr = space->head; - space->head = (uint8_t *)space->head + size; + /* Wrap */ + if (((uint8_t *)space->head + size) + >= ((uint8_t *)space->base + space->current_size)) { + space->head = space->base; } + ptr = space->head; + space->head = (uint8_t *)space->head + size; + /* Copy user_buffer to scratch */ if (user_buffer) memcpy(ptr, user_buffer, size); diff --git a/src/gallium/drivers/swr/swr_screen.cpp b/src/gallium/drivers/swr/swr_screen.cpp index 46b3a00..b21c35e 100644 --- a/src/gallium/drivers/swr/swr_screen.cpp +++ b/src/gallium/drivers/swr/swr_screen.cpp @@ -57,7 +57,7 @@ #define SWR_MAX_TEXTURE_ARRAY_LAYERS 512 /* 8K x 512 / 8K x 8K x 512 */ /* Default max client_copy_limit */ -#define SWR_CLIENT_COPY_LIMIT 32768 +#define SWR_CLIENT_COPY_LIMIT 8192 /* Flag indicates creation of alternate surface, to prevent recursive loop * in resource creation when msaa_force_enable is set. */ -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On 20/10/17 08:27, Timothy Arceri wrote: On 20/10/17 08:19, Timothy Arceri wrote: On 20/10/17 04:21, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkinwrote: On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: Will this work with SSO shaders? Presumably the validation still has to happen, but I don't think cross_validate_outputs_to_inputs() will end up getting called. The piglit tests I mention use SSO so it seems to be working for this. See for example: tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- block-member-location-overlap.shader_test Ah great. I'm a little curious how, since I don't see how cross_validate_outputs_to_inputs would get called for SSO shaders. Perhaps I'm looking at old code. Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" - can you try with that? It's just using a pipeline, but both shaders are ending up in it. BTW, my solution to all this was https://patchwork.freedesktop.org/patch/175959/ but Tim hated it, and I didn't have the time to properly respond. Hate is a strong word, the problem is it duplicated some of the checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks should be pulled into a helper/helpers that can also be used by SSO. Oh and your patch was also missing all the component checking logic which we also should be doing for SSO. Moving the checks into helpers will give us these check for free. Thinking about it some more, I don't see how your patch works for SSO. The only place you can validate the SSO pipeline is via _mesa_validate_program_pipeline() since we can't know what the other stages will be at link time. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On 20/10/17 08:19, Timothy Arceri wrote: On 20/10/17 04:21, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkinwrote: On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: Will this work with SSO shaders? Presumably the validation still has to happen, but I don't think cross_validate_outputs_to_inputs() will end up getting called. The piglit tests I mention use SSO so it seems to be working for this. See for example: tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- block-member-location-overlap.shader_test Ah great. I'm a little curious how, since I don't see how cross_validate_outputs_to_inputs would get called for SSO shaders. Perhaps I'm looking at old code. Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" - can you try with that? It's just using a pipeline, but both shaders are ending up in it. BTW, my solution to all this was https://patchwork.freedesktop.org/patch/175959/ but Tim hated it, and I didn't have the time to properly respond. Hate is a strong word, the problem is it duplicated some of the checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks should be pulled into a helper/helpers that can also be used by SSO. Oh and your patch was also missing all the component checking logic which we also should be doing for SSO. Moving the checks into helpers will give us these check for free. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On Thu, Oct 19, 2017 at 5:19 PM, Timothy Arceriwrote: > On 20/10/17 04:21, Ilia Mirkin wrote: >> >> On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkin >> wrote: >>> >>> On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: > > Will this work with SSO shaders? Presumably the validation still has > to happen, but I don't think cross_validate_outputs_to_inputs() will > end up getting called. The piglit tests I mention use SSO so it seems to be working for this. See for example: tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- block-member-location-overlap.shader_test >>> >>> >>> Ah great. I'm a little curious how, since I don't see how >>> cross_validate_outputs_to_inputs would get called for SSO shaders. >>> Perhaps I'm looking at old code. >>> >>> Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" >>> - can you try with that? It's just using a pipeline, but both shaders >>> are ending up in it. >> >> >> BTW, my solution to all this was >> >> https://patchwork.freedesktop.org/patch/175959/ >> >> but Tim hated it, and I didn't have the time to properly respond. > > > Hate is a strong word, the problem is it duplicated some of the checks/logic > in cross_validate_outputs_to_inputs() unnecessarily. The checks should be > pulled into a helper/helpers that can also be used by SSO. Once I looked at the other code, I hated the duplication too - they do look sadly similar. But fixing it seemed difficult, since they were counting slightly different things, and I didn't have time to investigate in depth (and continue to not have that time). -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On 20/10/17 04:21, Ilia Mirkin wrote: On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkinwrote: On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: Will this work with SSO shaders? Presumably the validation still has to happen, but I don't think cross_validate_outputs_to_inputs() will end up getting called. The piglit tests I mention use SSO so it seems to be working for this. See for example: tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- block-member-location-overlap.shader_test Ah great. I'm a little curious how, since I don't see how cross_validate_outputs_to_inputs would get called for SSO shaders. Perhaps I'm looking at old code. Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" - can you try with that? It's just using a pipeline, but both shaders are ending up in it. BTW, my solution to all this was https://patchwork.freedesktop.org/patch/175959/ but Tim hated it, and I didn't have the time to properly respond. Hate is a strong word, the problem is it duplicated some of the checks/logic in cross_validate_outputs_to_inputs() unnecessarily. The checks should be pulled into a helper/helpers that can also be used by SSO. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/8] egl: add dri2_egl_surface_free_outdated_buffers_and_update_size() helper (v2)
De-duplicating and then trimming down works for me. On Thu, Oct 19, 2017 at 3:31 AM, Emil Velikovwrote: > On 18 October 2017 at 23:36, Gurchetan Singh > wrote: > >> Then again, I'd suggest keeping that as separate series. These patches > >> started as a way to minimise the duplication we have in drivers/dri2. > > > > I'm fine with dri2_$action_$object. We can modify the existing functions > > later, but I recommend adopting more concise conventions in this > patchset, > > i.e: > > > > dri2_egl_surface_record_buffers_and_update_back_buffer --> > > dri2_set_back_buffer_surface > > dri2_egl_surface_free_outdated_buffers_and_update_size --> > > dri2_fixup_surface > > dri2_egl_surface_update_buffer_age --> dri2_update_age_surface > > dri2_egl_surface_get_image_front --> dri2_get_front_image_surface > > > Sure thing, let's use consistent names with this series. > > >> goal the series is to a) remove a handful of the ifdef spaghetti and > > > > I agree, struct dri2_egl_surface can be refactored. I would advocate a > > solution where the surface (a) has everything a platform needs but > nothing > > else (b) has a minimal amount of duplication. I would like to look at > the > > struct and see if it defines buffers[5], it must mean the platform > > implements get_buffers_with_format for example. If a platform doesn't > > define color_buffers, it means EXT_buffer_age is not used for whatever > > reason. Everything has dri_image_front -- then everything must use the > > image extension. I think this type of self-consistency is useful, from a > > code is documentation point of view. Here's pseudo-code of what I would > > want: > > > I agreed with the goals but I would swap the order/priority - > de-duplicate, then trim down. > By de-duplicating/refactoring one could add support for X/Y fairly > easily. Once all that is done, we can polish ;-) > > I fear that otherwise there will be a lot of unnecessary churn. > > Thanks > Emil > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 5/5] etnaviv: fix implicit conversion warning
Galliums query_type used in APIs is unsigned. Signed-off-by: Christian GmeinerReviewed-by: Wladimir J. van der Laan --- src/gallium/drivers/etnaviv/etnaviv_query.h| 2 +- src/gallium/drivers/etnaviv/etnaviv_query_sw.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/etnaviv/etnaviv_query.h b/src/gallium/drivers/etnaviv/etnaviv_query.h index e099e10f7c..8927266057 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_query.h +++ b/src/gallium/drivers/etnaviv/etnaviv_query.h @@ -44,7 +44,7 @@ struct etna_query_funcs { struct etna_query { const struct etna_query_funcs *funcs; bool active; - int type; + unsigned type; }; static inline struct etna_query * diff --git a/src/gallium/drivers/etnaviv/etnaviv_query_sw.c b/src/gallium/drivers/etnaviv/etnaviv_query_sw.c index ea79467b61..dd9bac3850 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_query_sw.c +++ b/src/gallium/drivers/etnaviv/etnaviv_query_sw.c @@ -42,7 +42,7 @@ etna_sw_destroy_query(struct etna_context *ctx, struct etna_query *q) } static uint64_t -read_counter(struct etna_context *ctx, int type) +read_counter(struct etna_context *ctx, unsigned type) { switch (type) { case PIPE_QUERY_PRIMITIVES_EMITTED: -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/5] etnaviv: add support for occlusion queries
Passes most occlusion query piglits. The following piglits are broken: - spec@arb_occlusion_query@occlusion_query_meta_fragments - spec@arb_occlusion_query@occlusion_query_meta_save - spec@arb_occlusion_query2@render v1 -> v2: - use one sample provider for all occlusion queries tyes - add comment about 'magic' value 0x1DF5E76 Signed-off-by: Christian Gmeiner--- src/gallium/drivers/etnaviv/etnaviv_query_hw.c | 78 ++ 1 file changed, 78 insertions(+) diff --git a/src/gallium/drivers/etnaviv/etnaviv_query_hw.c b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c index a1dead335c..0f3cd7257b 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_query_hw.c +++ b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c @@ -30,9 +30,73 @@ #include "util/u_memory.h" #include "etnaviv_context.h" +#include "etnaviv_debug.h" +#include "etnaviv_emit.h" #include "etnaviv_query_hw.h" #include "etnaviv_screen.h" +/* + * Occlusion Query: + * + * OCCLUSION_COUNTER and OCCLUSION_PREDICATE differ only in how they + * interpret results + */ + +static void +occlusion_start(struct etna_hw_query *hq, struct etna_context *ctx) +{ + struct etna_resource *rsc = etna_resource(hq->prsc); + struct etna_reloc r = { + .bo = rsc->bo, + .flags = ETNA_RELOC_WRITE + }; + + if (hq->samples > 63) { + hq->samples = 63; + BUG("samples overflow"); + } + + r.offset = hq->samples * 8; /* 64bit value */ + + etna_set_state_reloc(ctx->stream, VIVS_GL_OCCLUSION_QUERY_ADDR, ); +} + +static void +occlusion_stop(struct etna_hw_query *hq, struct etna_context *ctx) +{ + /* 0x1DF5E76 is the value used by blob - but any random value will work */ + etna_set_state(ctx->stream, VIVS_GL_OCCLUSION_QUERY_CONTROL, 0x1DF5E76); +} + +static void +occlusion_suspend(struct etna_hw_query *hq, struct etna_context *ctx) +{ + occlusion_stop(hq, ctx); +} + +static void +occlusion_resume(struct etna_hw_query *hq, struct etna_context *ctx) +{ + hq->samples++; + occlusion_start(hq, ctx); +} + +static void +occlusion_result(struct etna_hw_query *hq, void *buf, + union pipe_query_result *result) +{ + uint64_t sum = 0; + uint64_t *ptr = (uint64_t *)buf; + + for (unsigned i = 0; i <= hq->samples; i++) + sum += *(ptr + i); + + if (hq->base.type == PIPE_QUERY_OCCLUSION_COUNTER) + result->u64 = sum; + else + result->b = !!sum; +} + static void etna_hw_destroy_query(struct etna_context *ctx, struct etna_query *q) { @@ -44,6 +108,14 @@ etna_hw_destroy_query(struct etna_context *ctx, struct etna_query *q) FREE(hq); } +static const struct etna_hw_sample_provider occlusion_provider = { + .start = occlusion_start, + .stop = occlusion_stop, + .suspend = occlusion_suspend, + .resume = occlusion_resume, + .result = occlusion_result, +}; + static void realloc_query_bo(struct etna_context *ctx, struct etna_hw_query *hq) { @@ -153,6 +225,12 @@ static inline const struct etna_hw_sample_provider * query_sample_provider(unsigned query_type) { switch (query_type) { + case PIPE_QUERY_OCCLUSION_COUNTER: + /* fallthrough */ + case PIPE_QUERY_OCCLUSION_PREDICATE: + /* fallthrough */ + case PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE: + return _provider; default: return NULL; } -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 4/5] etnaviv: enable occlusion query if GPU supports it
Signed-off-by: Christian GmeinerReviewed-by: Wladimir J. van der Laan --- src/gallium/drivers/etnaviv/etnaviv_screen.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c b/src/gallium/drivers/etnaviv/etnaviv_screen.c index 738605a4f3..009bc73c14 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c @@ -319,8 +319,9 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) /* Timer queries. */ case PIPE_CAP_QUERY_TIME_ELAPSED: - case PIPE_CAP_OCCLUSION_QUERY: return 0; + case PIPE_CAP_OCCLUSION_QUERY: + return VIV_FEATURE(screen, chipMinorFeatures1, HALTI0); case PIPE_CAP_QUERY_TIMESTAMP: return 1; case PIPE_CAP_QUERY_PIPELINE_STATISTICS: -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/5] etnaviv: update headers from rnndb
Update to etna_viv commit 6c9c706. Signed-off-by: Christian GmeinerReviewed-by: Wladimir J. van der Laan --- src/gallium/drivers/etnaviv/hw/cmdstream.xml.h | 36 ++- src/gallium/drivers/etnaviv/hw/common.xml.h| 117 src/gallium/drivers/etnaviv/hw/isa.xml.h | 4 +- src/gallium/drivers/etnaviv/hw/state.xml.h | 197 -- src/gallium/drivers/etnaviv/hw/state_3d.xml.h | 357 - 5 files changed, 622 insertions(+), 89 deletions(-) diff --git a/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h b/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h index f8d76b0105..e12188ea52 100644 --- a/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h +++ b/src/gallium/drivers/etnaviv/hw/cmdstream.xml.h @@ -8,9 +8,9 @@ http://0x04.net/cgit/index.cgi/rules-ng-ng git clone git://0x04.net/rules-ng-ng The rules-ng-ng source files this header was generated from are: -- cmdstream.xml ( 15289 bytes, from 2017-09-29 11:52:39) -- copyright.xml ( 1597 bytes, from 2016-12-08 16:37:56) -- common.xml( 23529 bytes, from 2017-09-29 11:52:39) +- cmdstream.xml ( 16595 bytes, from 2017-10-05 21:20:32) +- copyright.xml ( 1597 bytes, from 2016-11-13 13:46:17) +- common.xml( 26135 bytes, from 2017-10-05 21:20:32) Copyright (C) 2012-2017 by the following authors: - Wladimir J. van der Laan @@ -52,6 +52,8 @@ DEALINGS IN THE SOFTWARE. #define FE_OPCODE_RETURN 0x000b #define FE_OPCODE_DRAW_INSTANCED 0x000c #define FE_OPCODE_CHIP_SELECT 0x000d +#define FE_OPCODE_WAIT_FENCE 0x000f +#define FE_OPCODE_SNAP_PAGES 0x0013 #define PRIMITIVE_TYPE_POINTS 0x0001 #define PRIMITIVE_TYPE_LINES 0x0002 #define PRIMITIVE_TYPE_LINE_STRIP 0x0003 @@ -192,6 +194,9 @@ DEALINGS IN THE SOFTWARE. #define VIV_FE_STALL_TOKEN_TO__MASK0x1f00 #define VIV_FE_STALL_TOKEN_TO__SHIFT 8 #define VIV_FE_STALL_TOKEN_TO(x) (((x) << VIV_FE_STALL_TOKEN_TO__SHIFT) & VIV_FE_STALL_TOKEN_TO__MASK) +#define VIV_FE_STALL_TOKEN_UNK28__MASK 0x3000 +#define VIV_FE_STALL_TOKEN_UNK28__SHIFT28 +#define VIV_FE_STALL_TOKEN_UNK28(x)(((x) << VIV_FE_STALL_TOKEN_UNK28__SHIFT) & VIV_FE_STALL_TOKEN_UNK28__MASK) #define VIV_FE_CALL0x @@ -266,5 +271,30 @@ DEALINGS IN THE SOFTWARE. #define VIV_FE_DRAW_INSTANCED_START_INDEX__SHIFT 0 #define VIV_FE_DRAW_INSTANCED_START_INDEX(x) (((x) << VIV_FE_DRAW_INSTANCED_START_INDEX__SHIFT) & VIV_FE_DRAW_INSTANCED_START_INDEX__MASK) +#define VIV_FE_WAIT_FENCE 0x + +#define VIV_FE_WAIT_FENCE_HEADER 0x +#define VIV_FE_WAIT_FENCE_HEADER_OP__MASK 0xf800 +#define VIV_FE_WAIT_FENCE_HEADER_OP__SHIFT 27 +#define VIV_FE_WAIT_FENCE_HEADER_OP_WAIT_FENCE 0x7800 +#define VIV_FE_WAIT_FENCE_HEADER_UNK16__MASK 0x0003 +#define VIV_FE_WAIT_FENCE_HEADER_UNK16__SHIFT 16 +#define VIV_FE_WAIT_FENCE_HEADER_UNK16(x) (((x) << VIV_FE_WAIT_FENCE_HEADER_UNK16__SHIFT) & VIV_FE_WAIT_FENCE_HEADER_UNK16__MASK) +#define VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__MASK 0x +#define VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__SHIFT 0 +#define VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT(x) (((x) << VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__SHIFT) & VIV_FE_WAIT_FENCE_HEADER_WAITCOUNT__MASK) + +#define VIV_FE_WAIT_FENCE_ADDRESS 0x0004 + +#define VIV_FE_SNAP_PAGES 0x + +#define VIV_FE_SNAP_PAGES_HEADER 0x +#define VIV_FE_SNAP_PAGES_HEADER_OP__MASK 0xf800 +#define VIV_FE_SNAP_PAGES_HEADER_OP__SHIFT 27 +#define VIV_FE_SNAP_PAGES_HEADER_OP_SNAP_PAGES 0x9800 +#define VIV_FE_SNAP_PAGES_HEADER_UNK0__MASK0x001f +#define VIV_FE_SNAP_PAGES_HEADER_UNK0__SHIFT 0 +#define VIV_FE_SNAP_PAGES_HEADER_UNK0(x) (((x) << VIV_FE_SNAP_PAGES_HEADER_UNK0__SHIFT) & VIV_FE_SNAP_PAGES_HEADER_UNK0__MASK) + #endif /* CMDSTREAM_XML */ diff --git a/src/gallium/drivers/etnaviv/hw/common.xml.h b/src/gallium/drivers/etnaviv/hw/common.xml.h index 85c4990b61..57369d741d 100644 --- a/src/gallium/drivers/etnaviv/hw/common.xml.h +++
[Mesa-dev] [PATCH v2 2/5] etnaviv: add basic infrastructure for hw queries
No hardware query is supported yet. v1 -> v2 - removed query_type from strcut etna_hw_sample_provider Signed-off-by: Christian Gmeiner--- src/gallium/drivers/etnaviv/Makefile.sources | 2 + src/gallium/drivers/etnaviv/etnaviv_context.c | 11 ++ src/gallium/drivers/etnaviv/etnaviv_context.h | 3 + src/gallium/drivers/etnaviv/etnaviv_query.c| 3 + src/gallium/drivers/etnaviv/etnaviv_query_hw.c | 185 + src/gallium/drivers/etnaviv/etnaviv_query_hw.h | 88 6 files changed, 292 insertions(+) create mode 100644 src/gallium/drivers/etnaviv/etnaviv_query_hw.c create mode 100644 src/gallium/drivers/etnaviv/etnaviv_query_hw.h diff --git a/src/gallium/drivers/etnaviv/Makefile.sources b/src/gallium/drivers/etnaviv/Makefile.sources index 60275c949d..ea8df807f6 100644 --- a/src/gallium/drivers/etnaviv/Makefile.sources +++ b/src/gallium/drivers/etnaviv/Makefile.sources @@ -27,6 +27,8 @@ C_SOURCES := \ etnaviv_internal.h \ etnaviv_query.c \ etnaviv_query.h \ + etnaviv_query_hw.c \ + etnaviv_query_hw.h \ etnaviv_query_sw.c \ etnaviv_query_sw.h \ etnaviv_rasterizer.c \ diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.c b/src/gallium/drivers/etnaviv/etnaviv_context.c index 67aab6a68c..65c20d2f83 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_context.c +++ b/src/gallium/drivers/etnaviv/etnaviv_context.c @@ -34,6 +34,7 @@ #include "etnaviv_emit.h" #include "etnaviv_fence.h" #include "etnaviv_query.h" +#include "etnaviv_query_hw.h" #include "etnaviv_rasterizer.h" #include "etnaviv_screen.h" #include "etnaviv_shader.h" @@ -260,6 +261,9 @@ etna_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info) if (ctx->sampler_view[i]) resource_read(ctx, ctx->sampler_view[i]->texture); + list_for_each_entry(struct etna_hw_query, hq, >active_hw_queries, node) + resource_written(ctx, hq->prsc); + ctx->stats.prims_emitted += u_reduced_prims_for_vertices(info->mode, info->count); ctx->stats.draw_calls++; @@ -299,10 +303,16 @@ etna_flush(struct pipe_context *pctx, struct pipe_fence_handle **fence, struct etna_context *ctx = etna_context(pctx); int out_fence_fd = -1; + list_for_each_entry(struct etna_hw_query, hq, >active_hw_queries, node) + etna_hw_query_suspend(hq, ctx); + etna_cmd_stream_flush2(ctx->stream, ctx->in_fence_fd, (flags & PIPE_FLUSH_FENCE_FD) ? _fence_fd : NULL); + list_for_each_entry(struct etna_hw_query, hq, >active_hw_queries, node) + etna_hw_query_resume(hq, ctx); + if (fence) *fence = etna_fence_create(pctx, out_fence_fd); } @@ -436,6 +446,7 @@ etna_context_create(struct pipe_screen *pscreen, void *priv, unsigned flags) goto fail; slab_create_child(>transfer_pool, >transfer_pool); + list_inithead(>active_hw_queries); return pctx; diff --git a/src/gallium/drivers/etnaviv/etnaviv_context.h b/src/gallium/drivers/etnaviv/etnaviv_context.h index bf2b265f5e..2903e0963d 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_context.h +++ b/src/gallium/drivers/etnaviv/etnaviv_context.h @@ -180,6 +180,9 @@ struct etna_context { struct pipe_debug_callback debug; int in_fence_fd; + + /* list of active hardware queries */ + struct list_head active_hw_queries; }; static inline struct etna_context * diff --git a/src/gallium/drivers/etnaviv/etnaviv_query.c b/src/gallium/drivers/etnaviv/etnaviv_query.c index a416a7cb0f..9e897cd75a 100644 --- a/src/gallium/drivers/etnaviv/etnaviv_query.c +++ b/src/gallium/drivers/etnaviv/etnaviv_query.c @@ -30,6 +30,7 @@ #include "etnaviv_context.h" #include "etnaviv_query.h" +#include "etnaviv_query_hw.h" #include "etnaviv_query_sw.h" static struct pipe_query * @@ -40,6 +41,8 @@ etna_create_query(struct pipe_context *pctx, unsigned query_type, struct etna_query *q; q = etna_sw_create_query(ctx, query_type); + if (!q) + q = etna_hw_create_query(ctx, query_type); return (struct pipe_query *)q; } diff --git a/src/gallium/drivers/etnaviv/etnaviv_query_hw.c b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c new file mode 100644 index 00..a1dead335c --- /dev/null +++ b/src/gallium/drivers/etnaviv/etnaviv_query_hw.c @@ -0,0 +1,185 @@ +/* + * Copyright (c) 2017 Etnaviv Project + * Copyright (C) 2017 Zodiac Inflight Innovations + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sub license, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this
Re: [Mesa-dev] [PATCH 4/4] meson: Add support for EGL glvnd
Works just fine for me, and patch looks good. Reviewed-by: Lyude PaulOn Wed, 2017-10-18 at 16:55 -0700, Dylan Baker wrote: > Signed-off-by: Dylan Baker > --- > src/egl/meson.build | 46 -- > 1 file changed, 44 insertions(+), 2 deletions(-) > > diff --git a/src/egl/meson.build b/src/egl/meson.build > index ade6810bf91..8ea8a5bbb69 100644 > --- a/src/egl/meson.build > +++ b/src/egl/meson.build > @@ -70,6 +70,34 @@ linux_dmabuf_unstable_v1_client_protocol_h = > custom_target( >command : [prog_wl_scanner, 'client-header', '@INPUT@', '@OUTPUT@'], > ) > > +g_egldispatchstubs_c = custom_target( > + 'g_egldispatchstubs.c', > + input : [ > +'generate/gen_egl_dispatch.py', 'generate/eglFunctionList.py', > +'generate/egl.xml', 'generate/egl_other.xml' > + ], > + output : 'g_egldispatchstubs.c', > + command : [ > +prog_python2, '@INPUT0@', 'source', '@INPUT1@', '@INPUT2@', '@INPUT3@' > + ], > + depend_files : files('generate/genCommon.py'), > + capture : true, > +) > + > +g_egldispatchstubs_h = custom_target( > + 'g_egldispatchstubs.h', > + input : [ > +'generate/gen_egl_dispatch.py', 'generate/eglFunctionList.py', > +'generate/egl.xml', 'generate/egl_other.xml' > + ], > + output : 'g_egldispatchstubs.h', > + command : [ > +prog_python2, '@INPUT0@', 'header', '@INPUT1@', '@INPUT2@', '@INPUT3@' > + ], > + depend_files : files('generate/genCommon.py'), > + capture : true, > +) > + > if with_platform_x11 >files_egl += files('drivers/dri2/platform_x11.c') >if with_dri3 > @@ -107,8 +135,22 @@ if cc.has_function('mincore') >c_args_for_egl += '-DHAVE_MINCORE' > endif > > +if not with_glvnd > + egl_lib_name = 'EGL' > + egl_lib_version = '1.0.0' > +else > + egl_lib_name = 'EGL_mesa' > + egl_lib_version = '0' > + files_egl += [g_egldispatchstubs_h, g_egldispatchstubs_c] > + files_egl += files('main/eglglvnd.c', 'main/egldispatchstubs.c') > + install_data( > +'main/50_mesa.json', > +install_dir : join_paths(get_option('datadir'), 'glvnd', > 'egl_vendor.d') > + ) > +endif > + > libegl = shared_library( > - 'EGL', > + egl_lib_name, >files_egl, >c_args : [ > c_vis_args, > @@ -125,7 +167,7 @@ libegl = shared_library( >link_args : [ld_args_bsymbolic, ld_args_gc_sections], >dependencies : [deps_for_egl, dep_dl, dep_libdrm, dep_clock, dep_thread], >install : true, > - version : '1.0.0', > + version : egl_lib_version, > ) > > pkg.generate( ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] etnaviv: add support for occlusion queries
> There is one difference - how the sum is interpreted - uint64_t vs. bool > value. In general the code Ok in that case it's ok like this, just looked like unnecessary/accidental duplication. Regards, Wladimir ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT
Reviewed-by: Bas NieuwenhuizenOn Thu, Oct 19, 2017 at 8:54 PM, Fredrik Höglund wrote: > The Vulkan specification says: > >"... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_- > PIPE_BIT in the source stage mask will effectively not wait for > any prior commands to complete." > > Signed-off-by: Fredrik Höglund > --- > src/amd/vulkan/radv_cmd_buffer.c | 3 +-- > 1 file changed, 1 insertion(+), 2 deletions(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 147235006fa..4b9d49cd2bd 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -1813,8 +1813,7 @@ static void radv_stage_flush(struct radv_cmd_buffer > *cmd_buffer, > VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT | > VK_PIPELINE_STAGE_ALL_COMMANDS_BIT)) { > cmd_buffer->state.flush_bits |= > RADV_CMD_FLAG_PS_PARTIAL_FLUSH; > - } else if (src_stage_mask & (VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT | > -VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT | > + } else if (src_stage_mask & (VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT | > VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | > VK_PIPELINE_STAGE_VERTEX_SHADER_BIT)) { > cmd_buffer->state.flush_bits |= > RADV_CMD_FLAG_VS_PARTIAL_FLUSH; > -- > 2.11.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] radv: be smarter with descriptors when emitting secondary buffers
On Wed, Oct 11, 2017 at 5:18 PM, Samuel Pitoisetwrote: > If the secondary buffers don't use any descriptors we don't > have to re-emit the ones from the primary command buffer. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 29 - > 1 file changed, 28 insertions(+), 1 deletion(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 22637950c4..69ca16b52d 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -2615,6 +2615,29 @@ void radv_CmdSetStencilReference( > cmd_buffer->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_STENCIL_REFERENCE; > } > > +static bool > +radv_cmd_buffer_has_descriptors(struct radv_cmd_buffer *cmd_buffer) > +{ > + struct radv_cmd_state *state = _buffer->state; > + int i; > + > + for (i = 0; i < MAX_SETS; i++) { > + if (cmd_buffer->state.descriptors[i]) > + return true; > + } This does not work, because of the meta restore stuff resetting to NULL. > + > + if ((state->pipeline && > +state->pipeline->need_indirect_descriptor_sets) || > + (state->compute_pipeline && > +state->compute_pipeline->need_indirect_descriptor_sets)) > + return true; This seems unnecessary to me here. > + > + if (cmd_buffer->push_descriptors.capacity > 0) > + return true; This as well, we have a separate descriptor set per cmd buffer, so what is in there does not matter. Furthermore, if the parent command buffer and child command buffer have different emitted pipelines the user SGPR assignment is different and we need a flush too. > + > + return false; > +} > + > void radv_CmdExecuteCommands( > VkCommandBuffer commandBuffer, > uint32_tcommandBufferCount, > @@ -2675,6 +2698,11 @@ void radv_CmdExecuteCommands( > secondary->state.last_primitive_reset_en; > } > > + /* Only re-emit all descriptors when needed. */ > + if (radv_cmd_buffer_has_descriptors(secondary)) { > + radv_mark_descriptor_sets_dirty(primary); > + } > + > primary->state.last_primitive_reset_index = > secondary->state.last_primitive_reset_index; > primary->state.last_ia_multi_vgt_param = > secondary->state.last_ia_multi_vgt_param; > } > @@ -2684,7 +2712,6 @@ void radv_CmdExecuteCommands( > */ > primary->state.dirty |= RADV_CMD_DIRTY_PIPELINE; > primary->state.dirty |= RADV_CMD_DIRTY_DYNAMIC_ALL; > - radv_mark_descriptor_sets_dirty(primary); > } > > VkResult radv_CreateCommandPool( > -- > 2.14.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 8/9] radv: add radv_emit_shaders_prefetch()
On Tue, Oct 17, 2017 at 11:03 AM, Samuel Pitoisetwrote: > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 38 ++ > 1 file changed, 26 insertions(+), 12 deletions(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index ec4e34966c..e72ef5ffb7 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -606,6 +606,30 @@ radv_emit_shader_prefetch(struct radv_cmd_buffer > *cmd_buffer, > si_cp_dma_prefetch(cmd_buffer, va, shader->code_size); > } > > +static void > +radv_emit_shaders_prefetch(struct radv_cmd_buffer *cmd_buffer, > + struct radv_pipeline *pipeline) > +{ > + radv_emit_shader_prefetch(cmd_buffer, > + pipeline->shaders[MESA_SHADER_VERTEX]); Do this per stage instead of grouping, and also check the vertex shader for NULL. Consequence of the merged shaders on vega. > + > + if (pipeline->shaders[MESA_SHADER_TESS_EVAL]) { > + radv_emit_shader_prefetch(cmd_buffer, > + > pipeline->shaders[MESA_SHADER_TESS_CTRL]); > + radv_emit_shader_prefetch(cmd_buffer, > + > pipeline->shaders[MESA_SHADER_TESS_EVAL]); > + } > + > + if (pipeline->shaders[MESA_SHADER_GEOMETRY]) { > + radv_emit_shader_prefetch(cmd_buffer, > + > pipeline->shaders[MESA_SHADER_GEOMETRY]); > + radv_emit_shader_prefetch(cmd_buffer, > pipeline->gs_copy_shader); > + } > + > + radv_emit_shader_prefetch(cmd_buffer, > + pipeline->shaders[MESA_SHADER_FRAGMENT]); > +} > + > static void > radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer, > struct radv_pipeline *pipeline, > @@ -615,8 +639,6 @@ radv_emit_hw_vs(struct radv_cmd_buffer *cmd_buffer, > uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset; > unsigned export_count; > > - radv_emit_shader_prefetch(cmd_buffer, shader); > - > export_count = MAX2(1, outinfo->param_exports); > radeon_set_context_reg(cmd_buffer->cs, R_0286C4_SPI_VS_OUT_CONFIG, >S_0286C4_VS_EXPORT_COUNT(export_count - 1)); > @@ -662,8 +684,6 @@ radv_emit_hw_es(struct radv_cmd_buffer *cmd_buffer, > { > uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset; > > - radv_emit_shader_prefetch(cmd_buffer, shader); > - > radeon_set_context_reg(cmd_buffer->cs, > R_028AAC_VGT_ESGS_RING_ITEMSIZE, >outinfo->esgs_itemsize / 4); > radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B320_SPI_SHADER_PGM_LO_ES, > 4); > @@ -680,8 +700,6 @@ radv_emit_hw_ls(struct radv_cmd_buffer *cmd_buffer, > uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset; > uint32_t rsrc2 = shader->rsrc2; > > - radv_emit_shader_prefetch(cmd_buffer, shader); > - > radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B520_SPI_SHADER_PGM_LO_LS, > 2); > radeon_emit(cmd_buffer->cs, va >> 8); > radeon_emit(cmd_buffer->cs, va >> 40); > @@ -702,8 +720,6 @@ radv_emit_hw_hs(struct radv_cmd_buffer *cmd_buffer, > { > uint64_t va = radv_buffer_get_va(shader->bo) + shader->bo_offset; > > - radv_emit_shader_prefetch(cmd_buffer, shader); > - > radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B420_SPI_SHADER_PGM_LO_HS, > 4); > radeon_emit(cmd_buffer->cs, va >> 8); > radeon_emit(cmd_buffer->cs, va >> 40); > @@ -835,8 +851,6 @@ radv_emit_geometry_shader(struct radv_cmd_buffer > *cmd_buffer, > > va = radv_buffer_get_va(gs->bo) + gs->bo_offset; > > - radv_emit_shader_prefetch(cmd_buffer, gs); > - > radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B220_SPI_SHADER_PGM_LO_GS, > 4); > radeon_emit(cmd_buffer->cs, va >> 8); > radeon_emit(cmd_buffer->cs, va >> 40); > @@ -875,8 +889,6 @@ radv_emit_fragment_shader(struct radv_cmd_buffer > *cmd_buffer, > ps = pipeline->shaders[MESA_SHADER_FRAGMENT]; > va = radv_buffer_get_va(ps->bo) + ps->bo_offset; > > - radv_emit_shader_prefetch(cmd_buffer, ps); > - > radeon_set_sh_reg_seq(cmd_buffer->cs, R_00B020_SPI_SHADER_PGM_LO_PS, > 4); > radeon_emit(cmd_buffer->cs, va >> 8); > radeon_emit(cmd_buffer->cs, va >> 40); > @@ -953,6 +965,8 @@ radv_emit_graphics_pipeline(struct radv_cmd_buffer > *cmd_buffer) > radv_emit_fragment_shader(cmd_buffer, pipeline); > radv_emit_vgt_vertex_reuse(cmd_buffer, pipeline); > > + radv_emit_shaders_prefetch(cmd_buffer, pipeline); > + > cmd_buffer->scratch_size_needed = > MAX2(cmd_buffer->scratch_size_needed, >
[Mesa-dev] [PATCH] radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT
The Vulkan specification says: "... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_- PIPE_BIT in the source stage mask will effectively not wait for any prior commands to complete." Signed-off-by: Fredrik Höglund--- src/amd/vulkan/radv_cmd_buffer.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/amd/vulkan/radv_cmd_buffer.c b/src/amd/vulkan/radv_cmd_buffer.c index 147235006fa..4b9d49cd2bd 100644 --- a/src/amd/vulkan/radv_cmd_buffer.c +++ b/src/amd/vulkan/radv_cmd_buffer.c @@ -1813,8 +1813,7 @@ static void radv_stage_flush(struct radv_cmd_buffer *cmd_buffer, VK_PIPELINE_STAGE_ALL_GRAPHICS_BIT | VK_PIPELINE_STAGE_ALL_COMMANDS_BIT)) { cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_PS_PARTIAL_FLUSH; - } else if (src_stage_mask & (VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT | -VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT | + } else if (src_stage_mask & (VK_PIPELINE_STAGE_DRAW_INDIRECT_BIT | VK_PIPELINE_STAGE_VERTEX_INPUT_BIT | VK_PIPELINE_STAGE_VERTEX_SHADER_BIT)) { cmd_buffer->state.flush_bits |= RADV_CMD_FLAG_VS_PARTIAL_FLUSH; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radv: copy indirect lowering settings from radeonsi
Reviewed-by: Bas Nieuwenhuizenas well. On Thu, Oct 19, 2017 at 12:27 AM, Timothy Arceri wrote: > It looks the original indirect mask was probably copied from > ANV. > > Sascha Willems demo results: > > tessellation ~4000 -> ~4200 fps > > V2: continue lowering local indirect due to llvm deficiencies. > > Cc: Alex Smith > --- > src/amd/vulkan/radv_shader.c | 27 ++- > 1 file changed, 26 insertions(+), 1 deletion(-) > > diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c > index 055787a705..faba0c50e9 100644 > --- a/src/amd/vulkan/radv_shader.c > +++ b/src/amd/vulkan/radv_shader.c > @@ -238,22 +238,47 @@ radv_shader_compile_to_nir(struct radv_device *device, > NIR_PASS_V(nir, nir_lower_constant_initializers, ~0); > NIR_PASS_V(nir, nir_lower_system_values); > NIR_PASS_V(nir, nir_lower_clip_cull_distance_arrays); > } > > /* Vulkan uses the separate-shader linking model */ > nir->info.separate_shader = true; > > nir_shader_gather_info(nir, entry_point->impl); > > + /* While it would be nice not to have this flag, we are constrained > +* by the reality that LLVM 5.0 doesn't have working VGPR indexing > +* on GFX9. > +*/ > + bool llvm_has_working_vgpr_indexing = > + device->physical_device->rad_info.chip_class <= VI; > + > + /* TODO: Indirect indexing of GS inputs is unimplemented. > +* > +* TCS and TES load inputs directly from LDS or offchip memory, so > +* indirect indexing is trivial. > +*/ > nir_variable_mode indirect_mask = 0; > - indirect_mask |= nir_var_shader_in; > + if (nir->stage == MESA_SHADER_GEOMETRY || > + (nir->stage != MESA_SHADER_TESS_CTRL && > +nir->stage != MESA_SHADER_TESS_EVAL && > +!llvm_has_working_vgpr_indexing)) { > + indirect_mask |= nir_var_shader_in; > + } > + > + /* TODO: We shouldn't need to do this, however LLVM isn't currently > +* smart enough to handle indirects without causing excess spilling > +* causing the gpu to hang. > +* > +* See the following thread for more details of the problem: > +* > https://lists.freedesktop.org/archives/mesa-dev/2017-July/162106.html > +*/ > indirect_mask |= nir_var_local; > > nir_lower_indirect_derefs(nir, indirect_mask); > > static const nir_lower_tex_options tex_options = { > .lower_txp = ~0, > }; > > nir_lower_tex(nir, _options); > > -- > 2.13.6 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Upstream support for FreeSync / Adaptive Sync
On Wed, Oct 18, 2017 at 09:28:01PM +0200, Daniel Vetter wrote: > On Wed, Oct 18, 2017 at 6:59 PM, Michel Dänzerwrote: > > On 18/10/17 12:15 PM, Nicolai Hähnle wrote: > >> On 18.10.2017 10:10, Daniel Vetter wrote: > >>> On Tue, Oct 17, 2017 at 09:01:52PM +0200, Nicolai Hähnle wrote: > On 17.10.2017 19:16, Daniel Vetter wrote: > > On Tue, Oct 17, 2017 at 5:40 PM, Michel Dänzer > > wrote: > >> On 17/10/17 05:04 PM, Daniel Vetter wrote: > >>> On Tue, Oct 17, 2017 at 03:46:24PM +0200, Michel Dänzer wrote: > On 17/10/17 02:22 PM, Daniel Vetter wrote: > > On Tue, Oct 17, 2017 at 12:28:17PM +0200, Michel Dänzer wrote: > >> On 17/10/17 11:34 AM, Nicolai Hähnle wrote: > > > >>> Common sense suggests that there need to be two side to > >>> FreeSync / VESA > >>> Adaptive Sync support: > >>> > >>> 1. Query the display capabilities. This means querying minimum > >>> / maximum > >>> refresh duration, plus possibly a query for when the > >>> earliest/latest > >>> timing of the *next* refresh. > >>> > >>> 2. Signal desired present time. This means passing a target > >>> timer value > >>> instead of a target vblank count, e.g. something like this for > >>> the KMS > >>> interface: > >>> > >>> int drmModePageFlipTarget64(int fd, uint32_t crtc_id, > >>> uint32_t fb_id, > >>> uint32_t flags, void *user_data, > >>> uint64_t target); > >>> > >>> + a flag to indicate whether target is the vblank count or > >>> the > >>> CLOCK_MONOTONIC (?) time in ns. > >> > >> drmModePageFlip(Target) is part of the pre-atomic KMS API, but > >> adapative > >> sync should probably only be supported via the atomic API, > >> presumably > >> via output properties. > > > > +1 > > > > At least now that DC is on track to land properly, and you want > > to do this > > for DC-only anyway there's no reason to pimp the legacy interfaces > > further. And atomic is soo much easier to extend. > > > > The big question imo is where we need to put the flag on the kms > > side, > > since freesync is not just about presenting earlier, but also about > > presenting later. But for backwards compat we can't stretch the > > refresh > > rate by default for everyone, or clients that rely on high > > precision > > timestamps and regular refresh will get a bad surprise. > > The idea described above is that adaptive sync would be used for > flips > with a target timestamp. Apps which don't want to use adaptive sync > wouldn't set a target timestamp. > > > > I think a boolean enable_freesync property is probably what we > > want, which > > enables freesync for as long as it's set. > > The question then becomes under what circumstances the property > is (not) > set. Not sure offhand this will actually solve any problem, or > just push > it somewhere else. > >>> > >>> I thought that's what the driconf switch is for, with a policy of > >>> "please > >>> schedule asap" instead of a specific timestamp. > >> > >> The driconf switch is just for the user's intention to use adaptive > >> sync > >> when possible. A property as you suggest cannot be set by the client > >> directly, because it can't know when adaptive sync can actually be > >> used > >> (only when its window is fullscreen and using page flipping). So the > >> property would have to be set by the X server/driver / Wayland > >> compositor / ... instead. The question is whether such a property is > >> actually needed, or whether the kernel could just enable adaptive sync > >> when there's a flip with a target timestamp, and disable it when > >> there's > >> a flip without a target timestamp, or something like that. > > > > If your adaptive sync also supports extending the vblank beyond the > > nominal limit, then you can't do that with a per-flip flag. Because > > absent of a userspace requesting adaptive sync you must flip at the > > nominal vrefresh rate. So if your userspace is a tad bit late with the > > frame and would like to extend the frame to avoid missing a frame > > entirely it'll be too late by the time the vblank actually gets > > submitted. That's a bit a variation of what Ville brought up about > > what we're going to do when the timestamp was missed by the time all > > the depending fences signalled. > > These are very
Re: [Mesa-dev] [PATCH 3/3] radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it
r-b for the series. MAybe add a fixes tag? On Thu, Oct 19, 2017 at 12:35 PM, Samuel Pitoisetwrote: > Only on CIK and later. We should only update VGT_INDEX_TYPE but > it seems easier to re-emit all the index buffer packets. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 13 +++-- > 1 file changed, 11 insertions(+), 2 deletions(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index b3f0ad0da7..13489b9faf 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -1770,8 +1770,17 @@ radv_cmd_buffer_flush_state(struct radv_cmd_buffer > *cmd_buffer, > if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_FRAMEBUFFER) > radv_emit_framebuffer_state(cmd_buffer); > > - if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_INDEX_BUFFER) > - radv_emit_index_buffer(cmd_buffer); > + if (indexed_draw) { > + if (cmd_buffer->state.dirty & RADV_CMD_DIRTY_INDEX_BUFFER) > + radv_emit_index_buffer(cmd_buffer); > + } else { > + /* On CI and later, non-indexed draws overwrite > VGT_INDEX_TYPE, > +* so the state must be re-emitted before the next indexed > +* draw. > +*/ > + if (cmd_buffer->device->physical_device->rad_info.chip_class > >= CIK) > + cmd_buffer->state.dirty |= > RADV_CMD_DIRTY_INDEX_BUFFER; > + } > > ia_multi_vgt_param = si_get_ia_multi_vgt_param(cmd_buffer, > instanced_draw, indirect_draw, draw_vertex_count); > if (cmd_buffer->state.last_ia_multi_vgt_param != ia_multi_vgt_param) { > -- > 2.14.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: move DB_COUNT_CONTROL initialization to si_emit_config()
r-b On Thu, Oct 19, 2017 at 4:25 PM, Samuel Pitoisetwrote: > CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_cmd_buffer.c | 1 - > src/amd/vulkan/si_cmd_buffer.c | 5 + > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/src/amd/vulkan/radv_cmd_buffer.c > b/src/amd/vulkan/radv_cmd_buffer.c > index 2791f1e096..7cffeec482 100644 > --- a/src/amd/vulkan/radv_cmd_buffer.c > +++ b/src/amd/vulkan/radv_cmd_buffer.c > @@ -2123,7 +2123,6 @@ VkResult radv_BeginCommandBuffer( > switch (cmd_buffer->queue_family_index) { > case RADV_QUEUE_GENERAL: > emit_gfx_buffer_state(cmd_buffer); > - radv_set_db_count_control(cmd_buffer); > break; > case RADV_QUEUE_COMPUTE: > si_init_compute(cmd_buffer); > diff --git a/src/amd/vulkan/si_cmd_buffer.c b/src/amd/vulkan/si_cmd_buffer.c > index 1e8b43d4fa..d24b63a5b5 100644 > --- a/src/amd/vulkan/si_cmd_buffer.c > +++ b/src/amd/vulkan/si_cmd_buffer.c > @@ -528,6 +528,11 @@ si_emit_config(struct radv_physical_device > *physical_device, > radeon_emit(cs, S_028A04_MIN_SIZE(radv_pack_float_12p4(0)) | > S_028A04_MAX_SIZE(radv_pack_float_12p4(8192/2))); > > + if (!physical_device->has_clear_state) { > + radeon_set_context_reg(cs, R_028004_DB_COUNT_CONTROL, > + S_028004_ZPASS_INCREMENT_DISABLE(1)); > + } > + > si_emit_compute(physical_device, cs); > } > > -- > 2.14.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: don't build gallium dri target if gallium is disabled
On Thu, Oct 19, 2017 at 10:32:46AM -0700, Dylan Baker wrote: > Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and > the megadriver install script to attempt to install drivers without any > actual drivers being built. Tested-by: Rafael Antognolli> fixes: 66f97f6640f5316b36177fd1053f0027eb6ec6cc ("meson: build radeonsi") > Reported-by: Rafael Antognolli > Signed-off-by: Dylan Baker > --- > src/gallium/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/gallium/meson.build b/src/gallium/meson.build > index a65b32c658e..97347819d60 100644 > --- a/src/gallium/meson.build > +++ b/src/gallium/meson.build > @@ -67,7 +67,7 @@ subdir('state_trackers/dri') > # TODO: virgl > # TODO: winsys/sw/xlib > # TODO: clover > -if with_dri > +if with_dri and with_gallium >subdir('targets/dri') > endif > # TODO: xlib-glx > -- > 2.14.2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 12/12] anv: Add support for the variablePointers feature
Not to be confused with variablePointersStorageBuffer which is the subset of VK_KHR_variable_pointers required to enable the extension. This means we now have "full" support for variable pointers. --- src/intel/vulkan/anv_device.c | 2 +- src/intel/vulkan/anv_pipeline.c | 5 ++--- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/src/intel/vulkan/anv_device.c b/src/intel/vulkan/anv_device.c index a305afe..256fc41 100644 --- a/src/intel/vulkan/anv_device.c +++ b/src/intel/vulkan/anv_device.c @@ -699,7 +699,7 @@ void anv_GetPhysicalDeviceFeatures2KHR( case VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_VARIABLE_POINTER_FEATURES_KHR: { VkPhysicalDeviceVariablePointerFeaturesKHR *features = (void *)ext; features->variablePointersStorageBuffer = true; - features->variablePointers = false; + features->variablePointers = true; break; } diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c index 92995ee..a200fec 100644 --- a/src/intel/vulkan/anv_pipeline.c +++ b/src/intel/vulkan/anv_pipeline.c @@ -124,6 +124,7 @@ anv_shader_compile_to_nir(struct anv_pipeline *pipeline, } struct spirv_to_nir_options spirv_options = { + .lower_workgroup_access_to_offsets = true, .caps = { .float64 = device->instance->physicalDevice.info.gen >= 8, .int64 = device->instance->physicalDevice.info.gen >= 8, @@ -388,10 +389,8 @@ anv_pipeline_compile(struct anv_pipeline *pipeline, if (stage != MESA_SHADER_COMPUTE) NIR_PASS_V(nir, anv_nir_lower_multiview, pipeline->subpass->view_mask); - if (stage == MESA_SHADER_COMPUTE) { - NIR_PASS_V(nir, brw_nir_lower_cs_shared); + if (stage == MESA_SHADER_COMPUTE) prog_data->total_shared = nir->num_shared; - } nir_shader_gather_info(nir, nir_shader_get_entrypoint(nir)); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/12] spirv: Refactor the base case of offset_pointer_dereference
This makes us key off of !offset instead of !block_index. It also puts the guts inside a switch statement so that we can handle more than just UBOs and SSBOs. --- src/compiler/spirv/vtn_variables.c | 38 +++--- 1 file changed, 23 insertions(+), 15 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 0909af3..6a7cba4 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -161,24 +161,32 @@ vtn_ssa_offset_pointer_dereference(struct vtn_builder *b, idx++; } - if (!block_index) { + if (!offset) { + /* This is the first access chain so we don't have a block index */ + assert(!block_index); + assert(base->var); - if (glsl_type_is_array(type->type)) { - /* We need at least one element in the chain */ - assert(deref_chain->length >= 1); + switch (base->mode) { + case vtn_variable_mode_ubo: + case vtn_variable_mode_ssbo: + if (glsl_type_is_array(type->type)) { +/* We need at least one element in the chain */ +assert(deref_chain->length >= 1); + +nir_ssa_def *desc_arr_idx = + vtn_access_link_as_ssa(b, deref_chain->link[0], 1); +block_index = vtn_variable_resource_index(b, base->var, desc_arr_idx); +type = type->array_element; +idx++; + } else { +block_index = vtn_variable_resource_index(b, base->var, NULL); + } + offset = nir_imm_int(>nb, 0); + break; - nir_ssa_def *desc_arr_idx = -vtn_access_link_as_ssa(b, deref_chain->link[0], 1); - block_index = vtn_variable_resource_index(b, base->var, desc_arr_idx); - type = type->array_element; - idx++; - } else { - block_index = vtn_variable_resource_index(b, base->var, NULL); + default: + unreachable("Invalid offset pointer mode"); } - - /* This is the first access chain so we also need an offset */ - assert(!offset); - offset = nir_imm_int(>nb, 0); } assert(offset); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/12] spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op
--- src/compiler/spirv/spirv_to_nir.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index a2dcbcf..96ecff6 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -2102,7 +2102,7 @@ get_ssbo_nir_atomic_op(SpvOp opcode) } static nir_intrinsic_op -get_shared_nir_atomic_op(SpvOp opcode) +get_var_nir_atomic_op(SpvOp opcode) { switch (opcode) { case SpvOpAtomicLoad: return nir_intrinsic_load_var; @@ -2169,7 +2169,7 @@ vtn_handle_ssbo_or_shared_atomic(struct vtn_builder *b, SpvOp opcode, if (ptr->mode == vtn_variable_mode_workgroup) { nir_deref_var *deref = vtn_pointer_to_deref(b, ptr); const struct glsl_type *deref_type = nir_deref_tail(>deref)->type; - nir_intrinsic_op op = get_shared_nir_atomic_op(opcode); + nir_intrinsic_op op = get_var_nir_atomic_op(opcode); atomic = nir_intrinsic_instr_create(b->nb.shader, op); atomic->variables[0] = nir_deref_var_clone(deref, atomic); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/12] spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index
There is no good reason why we should have the same logic repeated in get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference. If we're a bit more careful about how we do things, we can just use the one function and get rid of the other entirely. This also makes the push constant special case a lot more clear. --- src/compiler/spirv/vtn_variables.c | 56 ++ 1 file changed, 21 insertions(+), 35 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index b135b03..8272c72 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -503,45 +503,27 @@ vtn_local_store(struct vtn_builder *b, struct vtn_ssa_value *src, } } -static nir_ssa_def * -get_vulkan_resource_index(struct vtn_builder *b, struct vtn_pointer *ptr, - struct vtn_type **type, unsigned *chain_idx) -{ - /* Push constants have no explicit binding */ - if (ptr->mode == vtn_variable_mode_push_constant) { - *chain_idx = 0; - *type = ptr->var->type; - return NULL; - } - - if (glsl_type_is_array(ptr->var->type->type)) { - assert(ptr->chain->length > 0); - nir_ssa_def *desc_array_index = - vtn_access_link_as_ssa(b, ptr->chain->link[0], 1); - *chain_idx = 1; - *type = ptr->var->type->array_element; - return vtn_variable_resource_index(b, ptr->var, desc_array_index); - } else { - *chain_idx = 0; - *type = ptr->var->type; - return vtn_variable_resource_index(b, ptr->var, NULL); - } -} - nir_ssa_def * vtn_pointer_to_offset(struct vtn_builder *b, struct vtn_pointer *ptr, nir_ssa_def **index_out, unsigned *end_idx_out) { - if (ptr->offset) { - assert(ptr->block_index); + if (vtn_pointer_uses_ssa_offset(b, ptr)) { + if (!ptr->offset) { + assert(ptr->mode == vtn_variable_mode_workgroup); + struct vtn_access_chain chain = { +.length = 0, + }; + ptr = vtn_ssa_offset_pointer_dereference(b, ptr, ); + } *index_out = ptr->block_index; return ptr->offset; } - unsigned idx = 0; - struct vtn_type *type; - *index_out = get_vulkan_resource_index(b, ptr, , ); + assert(ptr->mode == vtn_variable_mode_push_constant); + *index_out = NULL; + unsigned idx = 0; + struct vtn_type *type = ptr->var->type; nir_ssa_def *offset = nir_imm_int(>nb, 0); for (; idx < ptr->chain->length; idx++) { enum glsl_base_type base_type = glsl_get_base_type(type->type); @@ -1894,15 +1876,19 @@ vtn_handle_variables(struct vtn_builder *b, SpvOp opcode, const uint32_t offset = ptr->var->type->offsets[w[4]]; const uint32_t stride = ptr->var->type->members[w[4]]->stride; - unsigned chain_idx; - struct vtn_type *type; - nir_ssa_def *index = - get_vulkan_resource_index(b, ptr, , _idx); + if (!ptr->block_index) { + assert(ptr->mode == vtn_variable_mode_workgroup); + struct vtn_access_chain chain = { +.length = 0, + }; + ptr = vtn_ssa_offset_pointer_dereference(b, ptr, ); + assert(ptr->block_index); + } nir_intrinsic_instr *instr = nir_intrinsic_instr_create(b->nb.shader, nir_intrinsic_get_buffer_size); - instr->src[0] = nir_src_for_ssa(index); + instr->src[0] = nir_src_for_ssa(ptr->block_index); nir_ssa_dest_init(>instr, >dest, 1, 32, NULL); nir_builder_instr_insert(>nb, >instr); nir_ssa_def *buf_size = >dest.ssa; -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 09/12] spirv: Add theoretical support for single component pointers
Up until now, all pointers have been ivec2s. We're about to add support for pointers to workgroup storage and those are going to be uints. --- src/compiler/spirv/vtn_variables.c | 30 -- 1 file changed, 24 insertions(+), 6 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 8272c72..74204f1 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1500,7 +1500,7 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr) assert(ptr->ptr_type); assert(ptr->ptr_type->type); - if (!ptr->offset || !ptr->block_index) { + if (!ptr->offset) { /* If we don't have an offset then we must be a pointer to the variable * itself. */ @@ -1518,15 +1518,22 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr) ptr = vtn_ssa_offset_pointer_dereference(b, ptr, ); } - assert(ptr->offset && ptr->block_index); - return nir_vec2(>nb, ptr->block_index, ptr->offset); + assert(ptr->offset); + if (ptr->block_index) { + assert(ptr->mode == vtn_variable_mode_ubo || + ptr->mode == vtn_variable_mode_ssbo); + return nir_vec2(>nb, ptr->block_index, ptr->offset); + } else { + unreachable("Invalid pointer"); + return ptr->offset; + } } struct vtn_pointer * vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa, struct vtn_type *ptr_type) { - assert(ssa->num_components == 2 && ssa->bit_size == 32); + assert(ssa->num_components <= 2 && ssa->bit_size == 32); assert(ptr_type->base_type == vtn_base_type_pointer); assert(ptr_type->deref->base_type != vtn_base_type_pointer); /* This pointer type needs to have actual storage */ @@ -1537,8 +1544,19 @@ vtn_pointer_from_ssa(struct vtn_builder *b, nir_ssa_def *ssa, ptr_type, NULL); ptr->type = ptr_type->deref; ptr->ptr_type = ptr_type; - ptr->block_index = nir_channel(>nb, ssa, 0); - ptr->offset = nir_channel(>nb, ssa, 1); + + if (ssa->num_components > 1) { + assert(ssa->num_components == 2); + assert(ptr->mode == vtn_variable_mode_ubo || + ptr->mode == vtn_variable_mode_ssbo); + ptr->block_index = nir_channel(>nb, ssa, 0); + ptr->offset = nir_channel(>nb, ssa, 1); + } else { + assert(ssa->num_components == 1); + unreachable("Invalid pointer"); + ptr->block_index = NULL; + ptr->offset = ssa; + } return ptr; } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/12] spirv: Refactor a couple of pointer query helpers
This commit moves them both into vtn_variables.c towards the top, makes them take a vtn_builder, and replaces a hand-rolled instance of is_external_block with a function call. --- src/compiler/spirv/vtn_private.h | 7 --- src/compiler/spirv/vtn_variables.c | 35 +-- 2 files changed, 21 insertions(+), 21 deletions(-) diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h index 6110b6f..63bebe3 100644 --- a/src/compiler/spirv/vtn_private.h +++ b/src/compiler/spirv/vtn_private.h @@ -372,13 +372,6 @@ struct vtn_pointer { struct nir_ssa_def *offset; }; -static inline bool -vtn_pointer_uses_ssa_offset(struct vtn_pointer *ptr) -{ - return ptr->mode == vtn_variable_mode_ubo || - ptr->mode == vtn_variable_mode_ssbo; -} - struct vtn_variable { enum vtn_variable_mode mode; diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 6a7cba4..b135b03 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -57,6 +57,23 @@ vtn_access_chain_extend(struct vtn_builder *b, struct vtn_access_chain *old, return chain; } +static bool +vtn_pointer_uses_ssa_offset(struct vtn_builder *b, +struct vtn_pointer *ptr) +{ + return ptr->mode == vtn_variable_mode_ubo || + ptr->mode == vtn_variable_mode_ssbo; +} + +static bool +vtn_pointer_is_external_block(struct vtn_builder *b, + struct vtn_pointer *ptr) +{ + return ptr->mode == vtn_variable_mode_ssbo || + ptr->mode == vtn_variable_mode_ubo || + ptr->mode == vtn_variable_mode_push_constant; +} + /* Dereference the given base pointer by the access chain */ static struct vtn_pointer * vtn_access_chain_pointer_dereference(struct vtn_builder *b, @@ -236,7 +253,7 @@ vtn_pointer_dereference(struct vtn_builder *b, struct vtn_pointer *base, struct vtn_access_chain *deref_chain) { - if (vtn_pointer_uses_ssa_offset(base)) { + if (vtn_pointer_uses_ssa_offset(b, base)) { return vtn_ssa_offset_pointer_dereference(b, base, deref_chain); } else { return vtn_access_chain_pointer_dereference(b, base, deref_chain); @@ -873,14 +890,6 @@ vtn_block_store(struct vtn_builder *b, struct vtn_ssa_value *src, 0, 0, dst->chain, chain_idx, dst->type, ); } -static bool -vtn_pointer_is_external_block(struct vtn_pointer *ptr) -{ - return ptr->mode == vtn_variable_mode_ssbo || - ptr->mode == vtn_variable_mode_ubo || - ptr->mode == vtn_variable_mode_push_constant; -} - static void _vtn_variable_load_store(struct vtn_builder *b, bool load, struct vtn_pointer *ptr, @@ -940,7 +949,7 @@ _vtn_variable_load_store(struct vtn_builder *b, bool load, struct vtn_ssa_value * vtn_variable_load(struct vtn_builder *b, struct vtn_pointer *src) { - if (vtn_pointer_is_external_block(src)) { + if (vtn_pointer_is_external_block(b, src)) { return vtn_block_load(b, src); } else { struct vtn_ssa_value *val = NULL; @@ -953,7 +962,7 @@ void vtn_variable_store(struct vtn_builder *b, struct vtn_ssa_value *src, struct vtn_pointer *dest) { - if (vtn_pointer_is_external_block(dest)) { + if (vtn_pointer_is_external_block(b, dest)) { assert(dest->mode == vtn_variable_mode_ssbo); vtn_block_store(b, src, dest); } else { @@ -1761,9 +1770,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val, nir_shader_add_variable(b->shader, var->members[i]); } } else { - assert(var->mode == vtn_variable_mode_ubo || - var->mode == vtn_variable_mode_ssbo || - var->mode == vtn_variable_mode_push_constant); + assert(vtn_pointer_is_external_block(b, val->pointer)); } } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/12] spirv: Add support for lowering workgroup access to offsets
Before, we always left workgroup variables as shared nir_variables and let the driver call nir_lower_io. This adds an option to do the lowering directly in spirv_to_nir. To do this, we implicitly assign the variables a std430 layout and then treat them like a UBO or SSBO and immediately lower all the way to an offset. As a side-effect, the spirv_to_nir pass now handles variable pointers for workgroup variables. --- src/compiler/spirv/nir_spirv.h | 8 +++ src/compiler/spirv/spirv_to_nir.c | 130 + src/compiler/spirv/vtn_private.h | 17 - src/compiler/spirv/vtn_variables.c | 54 +-- 4 files changed, 190 insertions(+), 19 deletions(-) diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h index 234b0ce..58df3e1 100644 --- a/src/compiler/spirv/nir_spirv.h +++ b/src/compiler/spirv/nir_spirv.h @@ -43,6 +43,14 @@ struct nir_spirv_specialization { }; struct spirv_to_nir_options { + /* Whether or not to lower all workgroup variable access to offsets +* up-front. This means you will _shared intrinsics instead of _var +* for workgroup data access. +* +* This is currently required for full variable pointers support. +*/ + bool lower_workgroup_access_to_offsets; + struct { bool float64; bool image_ms_array; diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index 96ecff6..1a612ae 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -729,6 +729,64 @@ translate_image_format(SpvImageFormat format) } } +static struct vtn_type * +vtn_type_layout_std430(struct vtn_builder *b, struct vtn_type *type, + uint32_t *size_out, uint32_t *align_out) +{ + switch (type->base_type) { + case vtn_base_type_scalar: { + uint32_t comp_size = glsl_get_bit_size(type->type) / 8; + *size_out = comp_size; + *align_out = comp_size; + return type; + } + + case vtn_base_type_vector: { + uint32_t comp_size = glsl_get_bit_size(type->type) / 8; + assert(type->length > 0 && type->length <= 4); + unsigned align_comps = type->length == 3 ? 4 : type->length; + *size_out = comp_size * type->length, + *align_out = comp_size * align_comps; + return type; + } + + case vtn_base_type_matrix: + case vtn_base_type_array: { + /* We're going to add an array stride */ + type = vtn_type_copy(b, type); + uint32_t elem_size, elem_align; + type->array_element = vtn_type_layout_std430(b, type->array_element, + _size, _align); + type->stride = vtn_align_u32(elem_size, elem_align); + *size_out = type->stride * type->length; + *align_out = elem_align; + return type; + } + + case vtn_base_type_struct: { + /* We're going to add member offsets */ + type = vtn_type_copy(b, type); + uint32_t offset = 0; + uint32_t align = 0; + for (unsigned i = 0; i < type->length; i++) { + uint32_t mem_size, mem_align; + type->members[i] = vtn_type_layout_std430(b, type->members[i], + _size, _align); + offset = vtn_align_u32(offset, mem_align); + type->offsets[i] = offset; + offset += mem_size; + align = MAX2(align, mem_align); + } + *size_out = offset; + *align_out = align; + return type; + } + + default: + unreachable("Invalid SPIR-V type for std430"); + } +} + static void vtn_handle_type(struct vtn_builder *b, SpvOp opcode, const uint32_t *w, unsigned count) @@ -878,6 +936,19 @@ vtn_handle_type(struct vtn_builder *b, SpvOp opcode, */ val->type->type = glsl_vector_type(GLSL_TYPE_UINT, 2); } + + if (storage_class == SpvStorageClassWorkgroup && + b->options->lower_workgroup_access_to_offsets) { + uint32_t size, align; + val->type->deref = vtn_type_layout_std430(b, val->type->deref, + , ); + val->type->length = size; + val->type->align = align; + /* These can actually be stored to nir_variables and used as SSA + * values so they need a real glsl_type. + */ + val->type->type = glsl_uint_type(); + } break; } @@ -2102,6 +2173,32 @@ get_ssbo_nir_atomic_op(SpvOp opcode) } static nir_intrinsic_op +get_shared_nir_atomic_op(SpvOp opcode) +{ + switch (opcode) { + case SpvOpAtomicLoad: return nir_intrinsic_load_shared; + case SpvOpAtomicStore: return nir_intrinsic_store_shared; +#define OP(S, N) case SpvOp##S: return nir_intrinsic_shared_##N; + OP(AtomicExchange, atomic_exchange) + OP(AtomicCompareExchange, atomic_comp_swap) + OP(AtomicIIncrement, atomic_add) + OP(AtomicIDecrement, atomic_add) + OP(AtomicIAdd,
[Mesa-dev] [PATCH 04/12] spirv: Add a switch statement for the block store opcode
This parallels what we do for vtn_block_load except that we don't yet support anything except SSBO loads through this path. --- src/compiler/spirv/vtn_variables.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 7bfac46..0909af3 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -848,11 +848,20 @@ static void vtn_block_store(struct vtn_builder *b, struct vtn_ssa_value *src, struct vtn_pointer *dst) { + nir_intrinsic_op op; + switch (dst->mode) { + case vtn_variable_mode_ssbo: + op = nir_intrinsic_store_ssbo; + break; + default: + unreachable("Invalid block variable mode"); + } + nir_ssa_def *offset, *index = NULL; unsigned chain_idx; offset = vtn_pointer_to_offset(b, dst, , _idx); - _vtn_block_load_store(b, nir_intrinsic_store_ssbo, false, index, offset, + _vtn_block_load_store(b, op, false, index, offset, 0, 0, dst->chain, chain_idx, dst->type, ); } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 03/12] spirv: Use a dereference instead of vtn_variable_resource_index
This is equivalent and means we don't have resource index code scattered about. --- src/compiler/spirv/vtn_variables.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index a7e6ae0..7bfac46 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1492,11 +1492,9 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr) assert(ptr->ptr_type); assert(ptr->ptr_type->type); - if (ptr->offset && ptr->block_index) { - return nir_vec2(>nb, ptr->block_index, ptr->offset); - } else { - /* If we don't have an offset or block index, then we must be a pointer - * to the variable itself. + if (!ptr->offset || !ptr->block_index) { + /* If we don't have an offset then we must be a pointer to the variable + * itself. */ assert(!ptr->offset && !ptr->block_index); @@ -1506,9 +1504,14 @@ vtn_pointer_to_ssa(struct vtn_builder *b, struct vtn_pointer *ptr) */ assert(ptr->var && ptr->var->type->base_type == vtn_base_type_struct); - return nir_vec2(>nb, vtn_variable_resource_index(b, ptr->var, NULL), - nir_imm_int(>nb, 0)); + struct vtn_access_chain chain = { + .length = 0, + }; + ptr = vtn_ssa_offset_pointer_dereference(b, ptr, ); } + + assert(ptr->offset && ptr->block_index); + return nir_vec2(>nb, ptr->block_index, ptr->offset); } struct vtn_pointer * -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/12] spirv: Only emit functions which are actually used
Instead of emitting absolutely everything, just emit the few functions that are actually referenced in some way by the entrypoint. This should save us quite a bit of time when handed large shader modules containing many entrypoints. --- src/compiler/spirv/spirv_to_nir.c | 29 + src/compiler/spirv/vtn_cfg.c | 2 ++ src/compiler/spirv/vtn_private.h | 3 +++ 3 files changed, 26 insertions(+), 8 deletions(-) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index d22c3dc..e5e12ff 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -1394,8 +1394,11 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode, const uint32_t *w, unsigned count) { struct vtn_type *res_type = vtn_value(b, w[1], vtn_value_type_type)->type; - struct nir_function *callee = - vtn_value(b, w[3], vtn_value_type_function)->func->impl->function; + struct vtn_function *vtn_callee = + vtn_value(b, w[3], vtn_value_type_function)->func; + struct nir_function *callee = vtn_callee->impl->function; + + vtn_callee->referenced = true; nir_call_instr *call = nir_call_instr_create(b->nb.shader, callee); for (unsigned i = 0; i < call->num_params; i++) { @@ -3367,12 +3370,22 @@ spirv_to_nir(const uint32_t *words, size_t word_count, vtn_build_cfg(b, words, word_end); - foreach_list_typed(struct vtn_function, func, node, >functions) { - b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer, - _mesa_key_pointer_equal); - - vtn_function_emit(b, func, vtn_handle_body_instruction); - } + assert(b->entry_point->value_type == vtn_value_type_function); + b->entry_point->func->referenced = true; + + bool progress; + do { + progress = false; + foreach_list_typed(struct vtn_function, func, node, >functions) { + if (func->referenced && !func->emitted) { +b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer, + _mesa_key_pointer_equal); + +vtn_function_emit(b, func, vtn_handle_body_instruction); +progress = true; + } + } + } while (progress); assert(b->entry_point->value_type == vtn_value_type_function); nir_function *entry_point = b->entry_point->func->impl->function; diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c index 13f0221..70bbccb 100644 --- a/src/compiler/spirv/vtn_cfg.c +++ b/src/compiler/spirv/vtn_cfg.c @@ -783,4 +783,6 @@ vtn_function_emit(struct vtn_builder *b, struct vtn_function *func, */ if (b->has_loop_continue) nir_repair_ssa_impl(func->impl); + + func->emitted = true; } diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h index 728c1ff..2551e0b 100644 --- a/src/compiler/spirv/vtn_private.h +++ b/src/compiler/spirv/vtn_private.h @@ -159,6 +159,9 @@ struct vtn_block { struct vtn_function { struct exec_node node; + bool referenced; + bool emitted; + nir_function_impl *impl; struct vtn_block *start_block; -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/12] spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the spirv_to_nir pass beyond what capabilities we support. --- src/amd/vulkan/radv_shader.c | 23 +-- src/compiler/spirv/nir_spirv.h| 26 ++ src/compiler/spirv/spirv_to_nir.c | 10 +- src/compiler/spirv/vtn_private.h | 2 +- src/intel/vulkan/anv_pipeline.c | 20 +++- 5 files changed, 44 insertions(+), 37 deletions(-) diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c index 055787a..7190ca6 100644 --- a/src/amd/vulkan/radv_shader.c +++ b/src/amd/vulkan/radv_shader.c @@ -194,19 +194,22 @@ radv_shader_compile_to_nir(struct radv_device *device, spec_entries[i].data32 = *(const uint32_t *)data; } } - const struct nir_spirv_supported_extensions supported_ext = { - .draw_parameters = true, - .float64 = true, - .image_read_without_format = true, - .image_write_without_format = true, - .tessellation = true, - .int64 = true, - .multiview = true, - .variable_pointers = true, + const struct spirv_to_nir_options spirv_options = { + .caps = { + .draw_parameters = true, + .float64 = true, + .image_read_without_format = true, + .image_write_without_format = true, + .tessellation = true, + .int64 = true, + .multiview = true, + .variable_pointers = true, + }, }; entry_point = spirv_to_nir(spirv, module->size / 4, spec_entries, num_spec_entries, - stage, entrypoint_name, _ext, _options); + stage, entrypoint_name, + _options, _options); nir = entry_point->shader; assert(nir->stage == stage); nir_validate_shader(nir); diff --git a/src/compiler/spirv/nir_spirv.h b/src/compiler/spirv/nir_spirv.h index 83577fb..234b0ce 100644 --- a/src/compiler/spirv/nir_spirv.h +++ b/src/compiler/spirv/nir_spirv.h @@ -42,24 +42,26 @@ struct nir_spirv_specialization { }; }; -struct nir_spirv_supported_extensions { - bool float64; - bool image_ms_array; - bool tessellation; - bool draw_parameters; - bool image_read_without_format; - bool image_write_without_format; - bool int64; - bool multiview; - bool variable_pointers; +struct spirv_to_nir_options { + struct { + bool float64; + bool image_ms_array; + bool tessellation; + bool draw_parameters; + bool image_read_without_format; + bool image_write_without_format; + bool int64; + bool multiview; + bool variable_pointers; + } caps; }; nir_function *spirv_to_nir(const uint32_t *words, size_t word_count, struct nir_spirv_specialization *specializations, unsigned num_specializations, gl_shader_stage stage, const char *entry_point_name, - const struct nir_spirv_supported_extensions *ext, - const nir_shader_compiler_options *options); + const struct spirv_to_nir_options *options, + const nir_shader_compiler_options *nir_options); #ifdef __cplusplus } diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index e5e12ff..a2dcbcf 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -2677,7 +2677,7 @@ stage_for_execution_model(SpvExecutionModel model) } #define spv_check_supported(name, cap) do {\ - if (!(b->ext && b->ext->name)) \ + if (!(b->options && b->options->caps.name)) \ vtn_warn("Unsupported SPIR-V capability: %s", \ spirv_capability_to_string(cap)); \ } while(0) @@ -3317,8 +3317,8 @@ nir_function * spirv_to_nir(const uint32_t *words, size_t word_count, struct nir_spirv_specialization *spec, unsigned num_spec, gl_shader_stage stage, const char *entry_point_name, - const struct nir_spirv_supported_extensions *ext, - const nir_shader_compiler_options *options) + const struct spirv_to_nir_options *options, + const nir_shader_compiler_options *nir_options) { const uint32_t *word_end = words + word_count; @@ -3340,7 +3340,7 @@
[Mesa-dev] [PATCH 01/12] spirv: Drop the impl field from vtn_builder
We have a nir_builder and it has an impl field. --- src/compiler/spirv/spirv_to_nir.c | 9 - src/compiler/spirv/vtn_cfg.c | 2 +- src/compiler/spirv/vtn_private.h | 1 - src/compiler/spirv/vtn_variables.c | 2 +- 4 files changed, 6 insertions(+), 8 deletions(-) diff --git a/src/compiler/spirv/spirv_to_nir.c b/src/compiler/spirv/spirv_to_nir.c index 079ff0f..d22c3dc 100644 --- a/src/compiler/spirv/spirv_to_nir.c +++ b/src/compiler/spirv/spirv_to_nir.c @@ -117,7 +117,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant *constant, load->value = constant->values[0]; - nir_instr_insert_before_cf_list(>impl->body, >instr); + nir_instr_insert_before_cf_list(>nb.impl->body, >instr); val->def = >def; } else { assert(glsl_type_is_matrix(type)); @@ -133,7 +133,7 @@ vtn_const_ssa_value(struct vtn_builder *b, nir_constant *constant, load->value = constant->values[i]; -nir_instr_insert_before_cf_list(>impl->body, >instr); +nir_instr_insert_before_cf_list(>nb.impl->body, >instr); col_val->def = >def; val->elems[i] = col_val; @@ -1410,7 +1410,7 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode, /* Make a temporary to store the argument in */ nir_variable *tmp = -nir_local_variable_create(b->impl, arg_ssa->type, "arg_tmp"); +nir_local_variable_create(b->nb.impl, arg_ssa->type, "arg_tmp"); call->params[i] = nir_deref_var_create(call, tmp); vtn_local_store(b, arg_ssa, call->params[i]); @@ -1420,7 +1420,7 @@ vtn_handle_function_call(struct vtn_builder *b, SpvOp opcode, nir_variable *out_tmp = NULL; assert(res_type->type == callee->return_type); if (!glsl_type_is_void(callee->return_type)) { - out_tmp = nir_local_variable_create(b->impl, callee->return_type, + out_tmp = nir_local_variable_create(b->nb.impl, callee->return_type, "out_tmp"); call->return_deref = nir_deref_var_create(call, out_tmp); } @@ -3368,7 +3368,6 @@ spirv_to_nir(const uint32_t *words, size_t word_count, vtn_build_cfg(b, words, word_end); foreach_list_typed(struct vtn_function, func, node, >functions) { - b->impl = func->impl; b->const_table = _mesa_hash_table_create(b, _mesa_hash_pointer, _mesa_key_pointer_equal); diff --git a/src/compiler/spirv/vtn_cfg.c b/src/compiler/spirv/vtn_cfg.c index 25ff254..13f0221 100644 --- a/src/compiler/spirv/vtn_cfg.c +++ b/src/compiler/spirv/vtn_cfg.c @@ -606,7 +606,7 @@ vtn_emit_cf_list(struct vtn_builder *b, struct list_head *cf_list, if ((*block->branch & SpvOpCodeMask) == SpvOpReturnValue) { struct vtn_ssa_value *src = vtn_ssa_value(b, block->branch[1]); vtn_local_store(b, src, -nir_deref_var_create(b, b->impl->return_var)); +nir_deref_var_create(b, b->nb.impl->return_var)); } if (block->branch_type != vtn_branch_type_none) { diff --git a/src/compiler/spirv/vtn_private.h b/src/compiler/spirv/vtn_private.h index 8458462..728c1ff 100644 --- a/src/compiler/spirv/vtn_private.h +++ b/src/compiler/spirv/vtn_private.h @@ -463,7 +463,6 @@ struct vtn_builder { nir_builder nb; nir_shader *shader; - nir_function_impl *impl; const struct nir_spirv_supported_extensions *ext; struct vtn_block *block; diff --git a/src/compiler/spirv/vtn_variables.c b/src/compiler/spirv/vtn_variables.c index 997b66f..a7e6ae0 100644 --- a/src/compiler/spirv/vtn_variables.c +++ b/src/compiler/spirv/vtn_variables.c @@ -1731,7 +1731,7 @@ vtn_create_variable(struct vtn_builder *b, struct vtn_value *val, if (var->mode == vtn_variable_mode_local) { assert(var->members == NULL && var->var != NULL); - nir_function_impl_add_variable(b->impl, var->var); + nir_function_impl_add_variable(b->nb.impl, var->var); } else if (var->var) { nir_shader_add_variable(b->shader, var->var); } else if (var->members) { -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 00/12] anv: Add support for the variablePointers feature
Not to be confused with variablePointersStorageBuffer which is the subset of VK_KHR_variable_pointers required to enable the extension. This gives us "full" support for variable pointers. The approach chosen here was to do the lowering to _shared intrinsics directly in spirv_to_nir instead of using the _var intrinsics and using nir_lower_io. Pointers with a storage class of Workgroup are given an implicit std430 layout and now go through the same offset pointer paths as UBO and SSBO access. The whole thing really ended up working out rather cleanly. There are some downsides to this approach. One, is that we can't delete unused shared variables post-optimization. Also, the driver may be able to handle better than std430. Both of these can lead to some waisted SLM space. This also means that we can't do any deref-based load/store elimination optimizations on SLM but we didn't really before so that's no great loss; SLM is now exactly as hard to optimize as SSBOs. Connor, Yes, I know that this is not quite the approach you were suggesting on IRC. I considered how we might add some sort of deref intrinsic and I don't see a good way of doing so without rewriting large chunks of NIR. I think that rewrite is probably worth it some day but that day is not today. We people asking for this feature so I really don't want to delay on a major NIR rewrite. Cc: Connor AbbottCc: Chad Versace Cc: Dave Airlie Jason Ekstrand (12): spirv: Drop the impl field from vtn_builder spirv: Only emit functions which are actually used spirv: Use a dereference instead of vtn_variable_resource_index spirv: Add a switch statement for the block store opcode spirv: Refactor the base case of offset_pointer_dereference spirv: Convert the supported_extensions struct to spirv_options spirv: Refactor a couple of pointer query helpers spirv: Use offset_pointer_dereference to instead of get_vulkan_resource_index spirv: Add theoretical support for single component pointers spirv: Rename get_shared_nir_atomic_op to get_var_nir_atomic_op spirv: Add support for lowering workgroup access to offsets anv: Add support for the variablePointers feature src/amd/vulkan/radv_shader.c | 23 ++-- src/compiler/spirv/nir_spirv.h | 34 -- src/compiler/spirv/spirv_to_nir.c | 180 - src/compiler/spirv/vtn_cfg.c | 4 +- src/compiler/spirv/vtn_private.h | 30 +++-- src/compiler/spirv/vtn_variables.c | 229 - src/intel/vulkan/anv_device.c | 2 +- src/intel/vulkan/anv_pipeline.c| 25 ++-- 8 files changed, 372 insertions(+), 155 deletions(-) -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 5/5] etnaviv: fix implicit conversion warning
On Tue, Oct 17, 2017 at 10:38:17PM +0200, Christian Gmeiner wrote: > Galliums query_type used in APIs is unsigned. Reviewed-by: Wladimir J. van der Laan> Signed-off-by: Christian Gmeiner > --- > src/gallium/drivers/etnaviv/etnaviv_query.h| 2 +- > src/gallium/drivers/etnaviv/etnaviv_query_sw.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] etnaviv: add support for occlusion queries
On Tue, Oct 17, 2017 at 10:38:15PM +0200, Christian Gmeiner wrote: > Passes most occlusion query piglits. The following piglits are broken: > - spec@arb_occlusion_query@occlusion_query_meta_fragments > - spec@arb_occlusion_query@occlusion_query_meta_save > - spec@arb_occlusion_query2@render > > Signed-off-by: Christian GmeinerComments inline > +static void > +occlusion_stop(struct etna_hw_query *hq, struct etna_context *ctx) > +{ > + etna_set_state(ctx->stream, VIVS_GL_OCCLUSION_QUERY_CONTROL, 0x1DF5E76); Does the actual value written matter here? If so, it'd make sense to define a constant (or bit definitions in the rnndb). If not it's fine like this, or add a comment that it's just a "random" nonce. > +static const struct etna_hw_sample_provider occlusion_counter = { > + .query_type = PIPE_QUERY_OCCLUSION_COUNTER, > + .start = occlusion_start, > + .stop = occlusion_stop, > + .suspend = occlusion_suspend, > + .resume = occlusion_resume, > + .result = occlusion_counter_result, > +}; > + > +static const struct etna_hw_sample_provider occlusion_predicate = { > + .query_type = PIPE_QUERY_OCCLUSION_PREDICATE, > + .start = occlusion_start, > + .stop = occlusion_stop, > + .suspend = occlusion_suspend, > + .resume = occlusion_resume, > + .result = occlusion_predicate_result, > +}; > + > +static const struct etna_hw_sample_provider occlusion_predicate_conservative > = { > + .query_type = PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE, > + .start = occlusion_start, > + .stop = occlusion_stop, > + .suspend = occlusion_suspend, > + .resume = occlusion_resume, > + .result = occlusion_predicate_result, > +}; Is it intentional that this defines the same fields three times? Why not return the same structure for all three cases? Is this expected to change to specific implementations soon in another patch? Wladimir ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 4/5] etnaviv: enable occlusion query if GPU supports it
On Tue, Oct 17, 2017 at 10:38:16PM +0200, Christian Gmeiner wrote: > Signed-off-by: Christian GmeinerReviewed-by: Wladimir J. van der Laan > --- > src/gallium/drivers/etnaviv/etnaviv_screen.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/etnaviv/etnaviv_screen.c > b/src/gallium/drivers/etnaviv/etnaviv_screen.c > index 738605a4f3..009bc73c14 100644 > --- a/src/gallium/drivers/etnaviv/etnaviv_screen.c > +++ b/src/gallium/drivers/etnaviv/etnaviv_screen.c > @@ -319,8 +319,9 @@ etna_screen_get_param(struct pipe_screen *pscreen, enum > pipe_cap param) > > /* Timer queries. */ > case PIPE_CAP_QUERY_TIME_ELAPSED: > - case PIPE_CAP_OCCLUSION_QUERY: >return 0; > + case PIPE_CAP_OCCLUSION_QUERY: > + return VIV_FEATURE(screen, chipMinorFeatures1, HALTI0); > case PIPE_CAP_QUERY_TIMESTAMP: >return 1; > case PIPE_CAP_QUERY_PIPELINE_STATISTICS: > -- > 2.13.6 > > ___ > etnaviv mailing list > etna...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/etnaviv ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/5] etnaviv: add basic infrastructure for hw queries
On Tue, Oct 17, 2017 at 10:38:14PM +0200, Christian Gmeiner wrote: > No hardware query is supported yet. > > Signed-off-by: Christian GmeinerReviewed-by: Wladimir J. van der Laan ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask
Reviewed-by: Marek OlšákMarek On Thu, Oct 19, 2017 at 6:40 PM, Tim Rowley wrote: > A number of double/int64 operations don't have matching > read and write usage masks, which the fallthrough case of > tgsi_util_get_inst_usage_mask assumes for componentwise > tagged instructions. > > No regressions in llvmpipe piglit; fixes a large number of > swr regressions. > --- > src/gallium/auxiliary/tgsi/tgsi_util.c | 12 > 1 file changed, 12 insertions(+) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c > b/src/gallium/auxiliary/tgsi/tgsi_util.c > index cfce59093c..afe5690ce0 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_util.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c > @@ -230,13 +230,25 @@ tgsi_util_get_inst_usage_mask(const struct > tgsi_full_instruction *inst, >read_mask = TGSI_WRITEMASK_XYZ; >break; > > + case TGSI_OPCODE_DSEQ: > + case TGSI_OPCODE_DSNE: > + case TGSI_OPCODE_DSLT: > + case TGSI_OPCODE_DSGE: > case TGSI_OPCODE_DP4: > case TGSI_OPCODE_PK4B: > case TGSI_OPCODE_PK4UB: > case TGSI_OPCODE_D2F: > + case TGSI_OPCODE_D2I: > + case TGSI_OPCODE_D2U: > case TGSI_OPCODE_I2F: > case TGSI_OPCODE_U2F: > + case TGSI_OPCODE_U64SEQ: > + case TGSI_OPCODE_U64SNE: > + case TGSI_OPCODE_U64SLT: > + case TGSI_OPCODE_U64SGE: > case TGSI_OPCODE_U642F: > + case TGSI_OPCODE_I64SLT: > + case TGSI_OPCODE_I64SGE: > case TGSI_OPCODE_I642F: >read_mask = TGSI_WRITEMASK_XYZW; >break; > -- > 2.11.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] meson: don't build gallium dri target if gallium is disabled
Otherwise -Dgallium-drivers= will cause libmesa_gallium to be built and the megadriver install script to attempt to install drivers without any actual drivers being built. fixes: 66f97f6640f5316b36177fd1053f0027eb6ec6cc ("meson: build radeonsi") Reported-by: Rafael AntognolliSigned-off-by: Dylan Baker --- src/gallium/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/meson.build b/src/gallium/meson.build index a65b32c658e..97347819d60 100644 --- a/src/gallium/meson.build +++ b/src/gallium/meson.build @@ -67,7 +67,7 @@ subdir('state_trackers/dri') # TODO: virgl # TODO: winsys/sw/xlib # TODO: clover -if with_dri +if with_dri and with_gallium subdir('targets/dri') endif # TODO: xlib-glx -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] docs: add documentation for building with meson
Quoting Emil Velikov (2017-10-19 09:44:37) > On 18 October 2017 at 18:27, Dylan Bakerwrote: > > >> > +Note that in meson this defaults to "debug", and not setting it to > >> > +"release" will yield non-optimal performance and binary size > >> Ouch, can we change that? > > > > When I did an informal poll in the Intel cube the universal consensus was > > that > > defaulting to debug was a feature. If the wider community disagrees, yes, > > it can be > > changed. > > > The intent behind is amazing, but it subtly changes behaviour over any > of the existing build systems. > > Past experience suggests that listing such changes in release > notes/elsewhere people do miss it. > Thus as users get annoyed with the optimised build there'll be some > bad noise/publicity. > > Just something to keep in mind. That is good to keep in mind. > > > I'll fix up the other stuff too. > > > Great thanks. > > -Emil signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On Thu, Oct 19, 2017 at 12:45 PM, Ilia Mirkinwrote: > On Thu, Oct 19, 2017 at 12:40 PM, Iago Toral wrote: >> On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: >>> Will this work with SSO shaders? Presumably the validation still has >>> to happen, but I don't think cross_validate_outputs_to_inputs() will >>> end up getting called. >> >> The piglit tests I mention use SSO so it seems to be working for this. >> See for example: >> >> tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- >> block-member-location-overlap.shader_test > > Ah great. I'm a little curious how, since I don't see how > cross_validate_outputs_to_inputs would get called for SSO shaders. > Perhaps I'm looking at old code. > > Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" > - can you try with that? It's just using a pipeline, but both shaders > are ending up in it. BTW, my solution to all this was https://patchwork.freedesktop.org/patch/175959/ but Tim hated it, and I didn't have the time to properly respond. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/4] meson: build libEGL
Quoting Daniel Stone (2017-10-19 07:34:28) > Hi Dylan, > > On 19 October 2017 at 01:55, Dylan Bakerwrote: > > This is based heavily on Daniel Stone's work for the same, rebased on > > master and with a number of TODO's fixed. > > > > This does not implement glvnd (which is coming in a later patch) > > > > Meson builds egl slightly differently than autotools, namely it doesn't > > build an intermediate shared library. It doesn't do this because meson > > doesn't have problems with the name of the library being dynamically > > generated, so the glvnd and non-glvnd code can follow the same path. > > Thanks a million for picking this up, and fixing all the egregious > bugs / gaps / variable name inconsistency! Is this available in a > fully-baked git branch somewhere? https://github.com/dcbaker/mesa submit/meson-egl > > > +linux_dmabuf_unstable_v1_protocol_c = custom_target( > > + 'linux-dmabuf-unstable-v1-protocol.c', > > + input : wayland_dmabuf_xml, > > + output : 'linux-dmabuf-unstable-v1-protocol.c', > > + command : [prog_wl_scanner, 'code', '@INPUT@', '@OUTPUT@'], > > +) > > + > > +linux_dmabuf_unstable_v1_client_protocol_h = custom_target( > > + 'linux-dmabuf-unstable-v1-client-protocol.h', > > + input : wayland_dmabuf_xml, > > + output : 'linux-dmabuf-unstable-v1-client-protocol.h', > > + command : [prog_wl_scanner, 'client-header', '@INPUT@', '@OUTPUT@'], > > +) > > Could you please move these into src/egl/wayland/wayland-drm? They're > not actually wl_drm of course, but I have patches out to use > linux-dmabuf inside Vulkan as well, so having wl_drm and > zwp_linux_dmabuf in the same place where it's just built once seems > like an idea. Then it'd be trivial to hoist that out of src/egl/ > later. Jason and I talked about this, and he thought that the better thing to do would be to make a src/wsi folder that could have things like src/wsi/wayland, src/wsi/x11, etc. and we could move it in there. I like that idea too, would that be okay as follow up work? > > > +libwayland_egl = shared_library( > > + 'wayland-egl', > > + 'wayland-egl.c', > > + c_args : [c_vis_args], > > + link_args : ld_args_gc_sections, > > + version : '1.0.0', > > + install : true, > > ) > > As a drive-by musing, is there a reason c_vis_args isn't part of the > global arguments? I realised after the fact that I'd left it out of > quite a few places in my branch. There are actually some targets that don't have visibility flags set for them. Maybe those are just oversights? > > Other than that, it looks good to me, so: > Reviewed-by: Daniel Stone > > Cheers, > Daniel signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/5] etnaviv: update headers from rnndb
On Tue, Oct 17, 2017 at 10:38:13PM +0200, Christian Gmeiner wrote: > Update to etna_viv commit 6c9c706. > > Signed-off-by: Christian GmeinerReviewed-by: Wladimir J. van der Laan > --- > src/gallium/drivers/etnaviv/hw/cmdstream.xml.h | 36 ++- > src/gallium/drivers/etnaviv/hw/common.xml.h| 117 > src/gallium/drivers/etnaviv/hw/isa.xml.h | 4 +- > src/gallium/drivers/etnaviv/hw/state.xml.h | 197 -- > src/gallium/drivers/etnaviv/hw/state_3d.xml.h | 357 > - > 5 files changed, 622 insertions(+), 89 deletions(-) > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask
Reviewed-by: Roland ScheideggerAlbeit the way those masks are derived looks quite error-prone in general (especially for new opcodes). Am 19.10.2017 um 18:40 schrieb Tim Rowley: > A number of double/int64 operations don't have matching > read and write usage masks, which the fallthrough case of > tgsi_util_get_inst_usage_mask assumes for componentwise > tagged instructions. > > No regressions in llvmpipe piglit; fixes a large number of > swr regressions. > --- > src/gallium/auxiliary/tgsi/tgsi_util.c | 12 > 1 file changed, 12 insertions(+) > > diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c > b/src/gallium/auxiliary/tgsi/tgsi_util.c > index cfce59093c..afe5690ce0 100644 > --- a/src/gallium/auxiliary/tgsi/tgsi_util.c > +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c > @@ -230,13 +230,25 @@ tgsi_util_get_inst_usage_mask(const struct > tgsi_full_instruction *inst, >read_mask = TGSI_WRITEMASK_XYZ; >break; > > + case TGSI_OPCODE_DSEQ: > + case TGSI_OPCODE_DSNE: > + case TGSI_OPCODE_DSLT: > + case TGSI_OPCODE_DSGE: > case TGSI_OPCODE_DP4: > case TGSI_OPCODE_PK4B: > case TGSI_OPCODE_PK4UB: > case TGSI_OPCODE_D2F: > + case TGSI_OPCODE_D2I: > + case TGSI_OPCODE_D2U: > case TGSI_OPCODE_I2F: > case TGSI_OPCODE_U2F: > + case TGSI_OPCODE_U64SEQ: > + case TGSI_OPCODE_U64SNE: > + case TGSI_OPCODE_U64SLT: > + case TGSI_OPCODE_U64SGE: > case TGSI_OPCODE_U642F: > + case TGSI_OPCODE_I64SLT: > + case TGSI_OPCODE_I64SGE: > case TGSI_OPCODE_I642F: >read_mask = TGSI_WRITEMASK_XYZW; >break; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/4] glsl/linker: fix location aliasing checks for interface variables
On Thu, Oct 19, 2017 at 12:40 PM, Iago Toralwrote: > On Thu, 2017-10-19 at 12:37 -0400, Ilia Mirkin wrote: >> Will this work with SSO shaders? Presumably the validation still has >> to happen, but I don't think cross_validate_outputs_to_inputs() will >> end up getting called. > > The piglit tests I mention use SSO so it seems to be working for this. > See for example: > > tests/spec/arb_enhanced_layouts/linker/block-member-locations/named- > block-member-location-overlap.shader_test Ah great. I'm a little curious how, since I don't see how cross_validate_outputs_to_inputs would get called for SSO shaders. Perhaps I'm looking at old code. Oh - because that test doesn't test SSO. It's missing a "SSO ENABLED" - can you try with that? It's just using a pipeline, but both shaders are ending up in it. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] docs: add documentation for building with meson
On 18 October 2017 at 18:27, Dylan Bakerwrote: >> > +Note that in meson this defaults to "debug", and not setting it to >> > +"release" will yield non-optimal performance and binary size >> Ouch, can we change that? > > When I did an informal poll in the Intel cube the universal consensus was that > defaulting to debug was a feature. If the wider community disagrees, yes, it > can be > changed. > The intent behind is amazing, but it subtly changes behaviour over any of the existing build systems. Past experience suggests that listing such changes in release notes/elsewhere people do miss it. Thus as users get annoyed with the optimised build there'll be some bad noise/publicity. Just something to keep in mind. > I'll fix up the other stuff too. > Great thanks. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium: add more exceptions to tgsi_util_get_inst_usage_mask
A number of double/int64 operations don't have matching read and write usage masks, which the fallthrough case of tgsi_util_get_inst_usage_mask assumes for componentwise tagged instructions. No regressions in llvmpipe piglit; fixes a large number of swr regressions. --- src/gallium/auxiliary/tgsi/tgsi_util.c | 12 1 file changed, 12 insertions(+) diff --git a/src/gallium/auxiliary/tgsi/tgsi_util.c b/src/gallium/auxiliary/tgsi/tgsi_util.c index cfce59093c..afe5690ce0 100644 --- a/src/gallium/auxiliary/tgsi/tgsi_util.c +++ b/src/gallium/auxiliary/tgsi/tgsi_util.c @@ -230,13 +230,25 @@ tgsi_util_get_inst_usage_mask(const struct tgsi_full_instruction *inst, read_mask = TGSI_WRITEMASK_XYZ; break; + case TGSI_OPCODE_DSEQ: + case TGSI_OPCODE_DSNE: + case TGSI_OPCODE_DSLT: + case TGSI_OPCODE_DSGE: case TGSI_OPCODE_DP4: case TGSI_OPCODE_PK4B: case TGSI_OPCODE_PK4UB: case TGSI_OPCODE_D2F: + case TGSI_OPCODE_D2I: + case TGSI_OPCODE_D2U: case TGSI_OPCODE_I2F: case TGSI_OPCODE_U2F: + case TGSI_OPCODE_U64SEQ: + case TGSI_OPCODE_U64SNE: + case TGSI_OPCODE_U64SLT: + case TGSI_OPCODE_U64SGE: case TGSI_OPCODE_U642F: + case TGSI_OPCODE_I64SLT: + case TGSI_OPCODE_I64SGE: case TGSI_OPCODE_I642F: read_mask = TGSI_WRITEMASK_XYZW; break; -- 2.11.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 0/7] swr: rasterizer update
Reviewed-by: Bruce Cherniak> On Oct 19, 2017, at 8:12 AM, Tim Rowley wrote: > > Highlights are code cleanups, some more simd16 work (disabled by default), > and tuning for the Intel Xeon Phi architecture. > > Tim Rowley (7): > swr/rast: Minor changes for os-x > swr/rast: Miscellaneous viewport array code changes > swr/rast: Fix indentation > swr/rast: Change DS memory allocation > swr/rast: Widen fetch shader to SIMD16 (disabled for now) > swr/rast: Add api to override draws in flight > swr: knob overrides for Intel Xeon Phi > > src/gallium/drivers/swr/rasterizer/core/api.cpp| 26 +- > src/gallium/drivers/swr/rasterizer/core/api.h | 4 + > src/gallium/drivers/swr/rasterizer/core/binner.cpp | 45 ++- > src/gallium/drivers/swr/rasterizer/core/clip.h | 14 +- > src/gallium/drivers/swr/rasterizer/core/context.h | 2 + > .../drivers/swr/rasterizer/core/frontend.cpp | 26 +- > src/gallium/drivers/swr/rasterizer/core/pa.h | 24 +- > src/gallium/drivers/swr/rasterizer/core/pa_avx.cpp | 4 +- > src/gallium/drivers/swr/rasterizer/core/state.h| 3 +- > .../drivers/swr/rasterizer/core/threads.cpp| 24 +- > .../drivers/swr/rasterizer/jitter/fetch_jit.cpp| 441 - > src/gallium/drivers/swr/swr_context.cpp| 27 ++ > src/gallium/drivers/swr/swr_context.h | 2 + > src/gallium/drivers/swr/swr_loader.cpp | 4 + > src/gallium/drivers/swr/swr_scratch.cpp| 2 +- > src/gallium/drivers/swr/swr_screen.h | 3 + > 16 files changed, 575 insertions(+), 76 deletions(-) > > -- > 2.7.4 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev