Re: [Mesa-dev] [PATCH] nir: Don't print swizzles when there are more than 4 components
On October 27, 2017 18:21:45 Matt Turnerwrote: ... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Ugh... Ideally, I think you'd want without_matrix() because a matrix is just an array as far as interfaces go. That said, I think this probably fine for now. R-b. Tim, if you'd like to fix it for all the "fun" cases, that would he nice. Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or packed shader in/outs.") --- Reproduce with INTEL_DEBUG=vs and generated_tests/spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-float_mat4-position-double_double.shader_test We could probably print better swizzles for dvecs too, which have more than four components. Exercise to the reader :) src/compiler/nir/nir_print.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c index 4b7ad5c6ba..fcc8025346 100644 --- a/src/compiler/nir/nir_print.c +++ b/src/compiler/nir/nir_print.c @@ -457,7 +457,7 @@ print_var_decl(nir_variable *var, print_state *state) switch (var->data.mode) { case nir_var_shader_in: case nir_var_shader_out: - if (num_components != 4 && num_components != 0) { + if (num_components < 4 && num_components != 0) { const char *xyzw = "xyzw"; for (int i = 0; i < num_components; i++) components_local[i + 1] = xyzw[i + var->data.location_frac]; -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir: Don't print swizzles when there are more than 4 components
... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or packed shader in/outs.") --- Reproduce with INTEL_DEBUG=vs and generated_tests/spec/arb_vertex_attrib_64bit/execution/vs_in/vs-input-float_mat4-position-double_double.shader_test We could probably print better swizzles for dvecs too, which have more than four components. Exercise to the reader :) src/compiler/nir/nir_print.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c index 4b7ad5c6ba..fcc8025346 100644 --- a/src/compiler/nir/nir_print.c +++ b/src/compiler/nir/nir_print.c @@ -457,7 +457,7 @@ print_var_decl(nir_variable *var, print_state *state) switch (var->data.mode) { case nir_var_shader_in: case nir_var_shader_out: - if (num_components != 4 && num_components != 0) { + if (num_components < 4 && num_components != 0) { const char *xyzw = "xyzw"; for (int i = 0; i < num_components; i++) components_local[i + 1] = xyzw[i + var->data.location_frac]; -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH v3 14/48] i965/fs: Extend the live ranges of VGRFs which leave loops
I've submitted an alternative to this approach here [1], more along the lines of the v1 patch you borrowed from my jenkins branch. Most of the reasons for that we have discussed already in the office (except for the last point), but FTR: - This patch only works around the problem by tweaking the live intervals, which makes the live intervals inconsistent with the block live sets (the latter are used by some back-end passes that this patch isn't going to help with, like DCE and scheduling). - This patch doesn't fix the same bug on the VEC4 back-end. - I feel that the alternative approach [1] admits more straightforward extension to more exotic control-flow constructs. - It doesn't actually fix the problem for all possible patterns of divergent loop execution. Consider the following GLSL pseudocode: | b = 0; | do { |use(b); |b = non_uniform_condition(); |if (b) | continue; | |stomp_neighboring_channels(); |break; | | } while(true); With this series applied, the live interval of 'b' will be calculated to be between the 'b = 0' and 'continue' statements, which allows stomp_neighboring_channels() to overwrite the contents of 'b' depending on how the register allocator lays things out in the register file. OTOH with [1] applied the live interval calculated for 'b' will be the region between the 'b = 0' and 'break' statements, which overlaps the whole region of divergent control flow as expected. Cheers. [1] https://lists.freedesktop.org/archives/mesa-dev/2017-October/174591.html Jason Ekstrandwrites: > No Shader-db changes. > > Cc: mesa-sta...@lists.freedesktop.org > --- > src/intel/compiler/brw_fs_live_variables.cpp | 55 > > 1 file changed, 55 insertions(+) > > diff --git a/src/intel/compiler/brw_fs_live_variables.cpp > b/src/intel/compiler/brw_fs_live_variables.cpp > index c449672..380060d 100644 > --- a/src/intel/compiler/brw_fs_live_variables.cpp > +++ b/src/intel/compiler/brw_fs_live_variables.cpp > @@ -223,6 +223,61 @@ fs_live_variables::compute_start_end() > } >} > } > + > + /* Due to the explicit way the SIMD data is handled on GEN, we need to be > a > +* bit more careful with live ranges and loops. Consider the following > +* example: > +* > +*vec4 color2; > +*while (1) { > +* vec4 color = texture(); > +* if (...) { > +* color2 = color * 2; > +* break; > +* } > +*} > +*gl_FragColor = color2; > +* > +* In this case, the definition of color2 dominates the use because the > +* loop only has the one exit. This means that the live range interval > for > +* color2 goes from the statement in the if to it's use below the loop. > +* Now suppose that the texture operation has a header register that gets > +* assigned one of the registers used for color2. If the loop condition > is > +* non-uniform and some of the threads will take the and others will > +* continue. In this case, the next pass through the loop, the WE_all > +* setup of the header register will stomp the disabled channels of color2 > +* and corrupt the value. > +* > +* This same problem can occur if you have a mix of 64, 32, and 16-bit > +* registers because the channels do not line up or if you have a SIMD16 > +* program and the first half of one value overlaps the second half of the > +* other. > +* > +* To solve this problem, we take any VGRFs whose live ranges cross the > +* while instruction of a loop and extend their live ranges to the top of > +* the loop. This more accurately models the hardware because the value > in > +* the VGRF needs to be carried through subsequent loop iterations in > order > +* to remain valid when we finally do break. > +*/ > + foreach_block (block, cfg) { > + if (block->end()->opcode != BRW_OPCODE_WHILE) > + continue; > + > + /* This is a WHILE instrution. Find the DO block. */ > + bblock_t *do_block = NULL; > + foreach_list_typed(bblock_link, child_link, link, >children) { > + if (child_link->block->start_ip < block->end_ip) { > +assert(do_block == NULL); > +do_block = child_link->block; > + } > + } > + assert(do_block); > + > + for (int i = 0; i < num_vars; i++) { > + if (start[i] < block->end_ip && end[i] > block->end_ip) > +start[i] = MIN2(start[i], do_block->start_ip); > + } > + } > } > > fs_live_variables::fs_live_variables(fs_visitor *v, const cfg_t *cfg) > -- > 2.5.0.400.gff86faf > > ___ > mesa-stable mailing list > mesa-sta...@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-stable signature.asc Description: PGP signature
Re: [Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering
On Fri, Oct 27, 2017 at 3:16 PM, Nanley Cherywrote: > On Fri, Oct 27, 2017 at 12:52:30PM -0700, Jason Ekstrand wrote: > > On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery > > wrote: > > > > > Only use CCS_E to render to a texture that is CCS_E-compatible with the > > > original texture's miptree (linear) format. This prevents render > > > operations from writing data that can't be decoded with the original > > > miptree format. > > > > > > On Gen10, with the new CCS_E-enabled formats handled, this enables the > > > driver to pass the arb_texture_view-rendering-formats piglit test. > > > > > > Cc: > > > Signed-off-by: Nanley Chery > > > --- > > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 > > > +-- > > > 1 file changed, 26 insertions(+), 2 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > index a850f4d17b..59c57c227b 100644 > > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > > @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct > brw_context > > > *brw, > > > } > > > } > > > > > > +/** > > > + * Return true if the format that will be used to access the miptree > is > > > + * CCS_E-compatible with the miptree's linear/non-sRGB format. > > > + * > > > + * Why use the linear format? Well, although the miptree may be > specified > > > with > > > + * an sRGB format, the usage of that color space/format can be > toggled. > > > Since > > > + * our HW tends to support more linear formats than sRGB ones, we use > this > > > + * format variant for check for CCS_E compatibility. > > > + */ > > > +static bool > > > +format_ccs_e_compat_with_miptree(const struct gen_device_info > *devinfo, > > > + const struct intel_mipmap_tree *mt, > > > + enum isl_format access_format) > > > +{ > > > + assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E); > > > + > > > + mesa_format linear_format = _mesa_get_srgb_format_linear( > mt->format); > > > + enum isl_format isl_format = brw_isl_format_for_mesa_ > > > format(linear_format); > > > + return isl_formats_are_ccs_e_compatible(devinfo, isl_format, > > > access_format); > > > +} > > > + > > > static bool > > > intel_miptree_supports_ccs_e(struct brw_context *brw, > > > const struct intel_mipmap_tree *mt) > > > @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct > brw_context > > > *brw, > > >return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE; > > > > > > case ISL_AUX_USAGE_CCS_E: { > > > - /* If the format supports CCS_E, then we can just use it */ > > > - if (isl_format_supports_ccs_e(>screen->devinfo, > > > render_format)) > > > + /* If the format supports CCS_E and is compatible with the > miptree, > > > + * then we can use it. > > > + */ > > > + if (format_ccs_e_compat_with_miptree(>screen->devinfo, > > > + mt, render_format)) > > > > > > > You don't need the helper if you just use mt->surf.format. That's what > > can_texture_with_ccs does. > > > > > > Isn't using mt->surf.format making the code more restrictive than > necessary? That field may give you the sRGB format of the surface, but > the helper would always give you the linear variant. Therefore the > helper allows for CCS_E to be used in more cases. > You're right. I completely missed that. In that case, we probably want to use your helper for can_texture_with_ccs as well. Maybe two patches: one which adds the helper and makes can_texture_with_ccs use it and a second to add the render target stuff? Or you can keep this one as-is and do can_texture_with_ccs as a follow-on, I don't really care. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] intel/fs: Don't let undefined values prevent copy propagation.
This makes the dataflow propagation logic of the copy propagation pass more intelligent in cases where the destination of a copy is known to be undefined for some incoming CFG edges, building upon the definedness information provided by the last patch. Helps a few programs, and avoids a handful shader-db regressions from the next patch. shader-db results on ILK: total instructions in shared programs: 6541547 -> 6541523 (-0.00%) instructions in affected programs: 360 -> 336 (-6.67%) helped: 8 HURT: 0 LOST: 0 GAINED: 10 shader-db results on BDW: total instructions in shared programs: 8174323 -> 8173882 (-0.01%) instructions in affected programs: 7730 -> 7289 (-5.71%) helped: 5 HURT: 2 LOST: 0 GAINED: 4 shader-db results on SKL: total instructions in shared programs: 8185669 -> 8184598 (-0.01%) instructions in affected programs: 10364 -> 9293 (-10.33%) helped: 5 HURT: 2 LOST: 0 GAINED: 2 --- src/intel/compiler/brw_fs_copy_propagation.cpp | 50 -- 1 file changed, 47 insertions(+), 3 deletions(-) diff --git a/src/intel/compiler/brw_fs_copy_propagation.cpp b/src/intel/compiler/brw_fs_copy_propagation.cpp index cb117396089..5897cff35be 100644 --- a/src/intel/compiler/brw_fs_copy_propagation.cpp +++ b/src/intel/compiler/brw_fs_copy_propagation.cpp @@ -36,9 +36,12 @@ #include "util/bitset.h" #include "brw_fs.h" +#include "brw_fs_live_variables.h" #include "brw_cfg.h" #include "brw_eu.h" +using namespace brw; + namespace { /* avoid conflict with opt_copy_propagation_elements */ struct acp_entry : public exec_node { fs_reg dst; @@ -77,12 +80,19 @@ struct block_data { * course of this block. */ BITSET_WORD *kill; + + /** +* Which entries in the fs_copy_prop_dataflow acp table are guaranteed to +* have a fully uninitialized destination at the end of this block. +*/ + BITSET_WORD *undef; }; class fs_copy_prop_dataflow { public: fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg, + const fs_live_variables *live, exec_list *out_acp[ACP_HASH_SIZE]); void setup_initial_values(); @@ -92,6 +102,7 @@ public: void *mem_ctx; cfg_t *cfg; + const fs_live_variables *live; acp_entry **acp; int num_acp; @@ -102,8 +113,9 @@ public: } /* anonymous namespace */ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg, + const fs_live_variables *live, exec_list *out_acp[ACP_HASH_SIZE]) - : mem_ctx(mem_ctx), cfg(cfg) + : mem_ctx(mem_ctx), cfg(cfg), live(live) { bd = rzalloc_array(mem_ctx, struct block_data, cfg->num_blocks); @@ -124,6 +136,7 @@ fs_copy_prop_dataflow::fs_copy_prop_dataflow(void *mem_ctx, cfg_t *cfg, bd[block->num].liveout = rzalloc_array(bd, BITSET_WORD, bitset_words); bd[block->num].copy = rzalloc_array(bd, BITSET_WORD, bitset_words); bd[block->num].kill = rzalloc_array(bd, BITSET_WORD, bitset_words); + bd[block->num].undef = rzalloc_array(bd, BITSET_WORD, bitset_words); for (int i = 0; i < ACP_HASH_SIZE; i++) { foreach_in_list(acp_entry, entry, _acp[block->num][i]) { @@ -189,6 +202,18 @@ fs_copy_prop_dataflow::setup_initial_values() } } } + + /* Initialize the undef set. */ + foreach_block (block, cfg) { + for (int i = 0; i < num_acp; i++) { + BITSET_SET(bd[block->num].undef, i); + for (unsigned off = 0; off < acp[i]->size_written; off += REG_SIZE) { +if (BITSET_TEST(live->block_data[block->num].defout, +live->var_from_reg(byte_offset(acp[i]->dst, off + BITSET_CLEAR(bd[block->num].undef, i); + } + } + } } /** @@ -229,13 +254,30 @@ fs_copy_prop_dataflow::run() for (int i = 0; i < bitset_words; i++) { const BITSET_WORD old_livein = bd[block->num].livein[i]; +BITSET_WORD livein_from_any_block = 0; bd[block->num].livein[i] = ~0u; foreach_list_typed(bblock_link, parent_link, link, >parents) { bblock_t *parent = parent_link->block; - bd[block->num].livein[i] &= bd[parent->num].liveout[i]; + /* Consider ACP entries with a known-undefined destination to +* be available from the parent. This is valid because we're +* free to set the undefined variable equal to the source of +* the ACP entry without breaking the application's +* expectations, since the variable is undefined. +*/ + bd[block->num].livein[i] &= (bd[parent->num].liveout[i] | +bd[parent->num].undef[i]); + livein_from_any_block |= bd[parent->num].liveout[i]; } +/* Limit to the set of ACP entries
[Mesa-dev] [PATCH 1/3] intel/fs: Restrict live intervals to the subset possibly reachable from any definition.
Currently the liveness analysis pass would extend a live interval up to the top of the program when no unconditional and complete definition of the variable is found that dominates all of its uses. This can lead to a serious performance problem in shaders containing many partial writes, like scalar arithmetic, FP64 and soon FP16 operations. The number of oversize live intervals in such workloads can cause the compilation time of the shader to explode because of the worse than quadratic behavior of the register allocator and scheduler when running out of registers, and it can also cause the running time of the shader to explode due to the amount of spilling it leads to, which is orders of magnitude slower than GRF memory. This patch fixes it by computing the intersection of our current live intervals with the subset of the program that can possibly be reached from any definition of the variable. Extending the storage allocation of the variable beyond that is pretty useless because its value is guaranteed to be undefined at a point that cannot be reached from any definition. No significant change in the running time of shader-db (with 5% statistical significance). shader-db results on IVB: total cycles in shared programs: 61108780 -> 60932856 (-0.29%) cycles in affected programs: 16335482 -> 16159558 (-1.08%) helped: 5121 HURT: 4347 total spills in shared programs: 1309 -> 1288 (-1.60%) spills in affected programs: 249 -> 228 (-8.43%) helped: 3 HURT: 0 total fills in shared programs: 1652 -> 1597 (-3.33%) fills in affected programs: 262 -> 207 (-20.99%) helped: 4 HURT: 0 LOST: 2 GAINED: 209 shader-db results on BDW: total cycles in shared programs: 67617262 -> 67361220 (-0.38%) cycles in affected programs: 23397142 -> 23141100 (-1.09%) helped: 8045 HURT: 6488 total spills in shared programs: 1456 -> 1252 (-14.01%) spills in affected programs: 465 -> 261 (-43.87%) helped: 3 HURT: 0 total fills in shared programs: 1720 -> 1465 (-14.83%) fills in affected programs: 471 -> 216 (-54.14%) helped: 4 HURT: 0 LOST: 2 GAINED: 162 shader-db results on SKL: total cycles in shared programs: 65436248 -> 65245186 (-0.29%) cycles in affected programs: 22560936 -> 22369874 (-0.85%) helped: 8457 HURT: 6247 total spills in shared programs: 437 -> 437 (0.00%) spills in affected programs: 0 -> 0 helped: 0 HURT: 0 total fills in shared programs: 870 -> 854 (-1.84%) fills in affected programs: 16 -> 0 helped: 1 HURT: 0 LOST: 0 GAINED: 107 Reviewed-by: Jason Ekstrand--- src/intel/compiler/brw_fs_live_variables.cpp | 34 src/intel/compiler/brw_fs_live_variables.h | 12 ++ 2 files changed, 42 insertions(+), 4 deletions(-) diff --git a/src/intel/compiler/brw_fs_live_variables.cpp b/src/intel/compiler/brw_fs_live_variables.cpp index c449672a519..059f076fa51 100644 --- a/src/intel/compiler/brw_fs_live_variables.cpp +++ b/src/intel/compiler/brw_fs_live_variables.cpp @@ -83,9 +83,11 @@ fs_live_variables::setup_one_write(struct block_data *bd, fs_inst *inst, /* The def[] bitset marks when an initialization in a block completely * screens off previous updates of that variable (VGRF channel). */ - if (inst->dst.file == VGRF && !inst->is_partial_write()) { - if (!BITSET_TEST(bd->use, var)) + if (inst->dst.file == VGRF) { + if (!inst->is_partial_write() && !BITSET_TEST(bd->use, var)) BITSET_SET(bd->def, var); + + BITSET_SET(bd->defout, var); } } @@ -199,6 +201,28 @@ fs_live_variables::compute_live_variables() } } } + + /* Propagate defin and defout down the CFG to calculate the union of live +* variables potentially defined along any possible control flow path. +*/ + do { + cont = false; + + foreach_block (block, cfg) { + const struct block_data *bd = _data[block->num]; + +foreach_list_typed(bblock_link, child_link, link, >children) { +struct block_data *child_bd = _data[child_link->block->num]; + + for (int i = 0; i < bitset_words; i++) { + const BITSET_WORD new_def = bd->defout[i] & ~child_bd->defin[i]; + child_bd->defin[i] |= new_def; + child_bd->defout[i] |= new_def; + cont |= new_def; + } +} + } + } while (cont); } /** @@ -212,12 +236,12 @@ fs_live_variables::compute_start_end() struct block_data *bd = _data[block->num]; for (int i = 0; i < num_vars; i++) { - if (BITSET_TEST(bd->livein, i)) { + if (BITSET_TEST(bd->livein, i) && BITSET_TEST(bd->defin, i)) { start[i] = MIN2(start[i], block->start_ip); end[i] = MAX2(end[i], block->start_ip); } - if (BITSET_TEST(bd->liveout, i)) { + if (BITSET_TEST(bd->liveout, i) && BITSET_TEST(bd->defout, i)) { start[i] =
[Mesa-dev] [PATCH 3/3] intel/cfg: Represent divergent control flow paths caused by non-uniform loop execution.
This addresses a long-standing back-end compiler bug that could lead to cross-channel data corruption in loops executed non-uniformly. In some cases live variables extending through a loop divergence point (e.g. a non-uniform break) into a convergence point (e.g. the end of the loop) wouldn't be considered live along all physical control flow paths the SIMD thread could possibly have taken in between due to some channels remaining in the loop for additional iterations. This patch fixes the problem by extending the CFG with physical edges that don't exist in the idealized non-vectorized program, but represent valid control flow paths the SIMD EU may take due to the divergence of logical threads. This makes sense because the i965 IR is explicitly SIMD, and it's not uncommon for instructions to have an influence on neighboring channels (e.g. a force_writemask_all header setup), so the behavior of the SIMD thread as a whole needs to be considered. No changes in shader-db. --- src/intel/compiler/brw_cfg.cpp | 75 ++ 1 file changed, 69 insertions(+), 6 deletions(-) diff --git a/src/intel/compiler/brw_cfg.cpp b/src/intel/compiler/brw_cfg.cpp index fad12eec588..600b428a492 100644 --- a/src/intel/compiler/brw_cfg.cpp +++ b/src/intel/compiler/brw_cfg.cpp @@ -98,6 +98,7 @@ ends_block(const backend_instruction *inst) op == BRW_OPCODE_ELSE || op == BRW_OPCODE_CONTINUE || op == BRW_OPCODE_BREAK || + op == BRW_OPCODE_DO || op == BRW_OPCODE_WHILE; } @@ -268,13 +269,57 @@ cfg_t::cfg_t(exec_list *instructions) } cur->instructions.push_tail(inst); + + /* Represent divergent execution of the loop as a pair of alternative + * edges coming out of the DO instruction: For any physical iteration + * of the loop a given logical thread can either start off enabled + * (which is represented as the "next" successor), or disabled (if it + * has reached a non-uniform exit of the loop during a previous + * iteration, which is represented as the "cur_while" successor). + * + * The disabled edge will be taken by the logical thread anytime we + * arrive at the DO instruction through a back-edge coming from a + * conditional exit of the loop where divergent control flow started. + * + * This guarantees that there is a control-flow path from any + * divergence point of the loop into the convergence point + * (immediately past the WHILE instruction) such that it overlaps the + * whole IP region of divergent control flow (potentially the whole + * loop) *and* doesn't imply the execution of any instructions part + * of the loop (since the corresponding execution mask bit will be + * disabled for a diverging thread). + * + * This way we make sure that any variables that are live throughout + * the region of divergence for an inactive logical thread are also + * considered to interfere with any other variables assigned by + * active logical threads within the same physical region of the + * program, since otherwise we would risk cross-channel data + * corruption. + */ + next = new_block(); + cur->add_successor(mem_ctx, next); + cur->add_successor(mem_ctx, cur_while); + set_next_block(, next, ip); break; case BRW_OPCODE_CONTINUE: cur->instructions.push_tail(inst); + /* A conditional CONTINUE may start a region of divergent control + * flow until the start of the next loop iteration (*not* until the + * end of the loop which is why the successor is not the top-level + * divergence point at cur_do). The live interval of any variable + * extending through a CONTINUE edge is guaranteed to overlap the + * whole region of divergent execution, because any variable live-out + * at the CONTINUE instruction will also be live-in at the top of the + * loop, and therefore also live-out at the bottom-most point of the + * loop which is reachable from the top (since a control flow path + * exists from a definition of the variable through this CONTINUE + * instruction, the top of the loop, the (reachable) bottom of the + * loop, the top of the loop again, into a use of the variable). + */ assert(cur_do != NULL); -cur->add_successor(mem_ctx, cur_do); + cur->add_successor(mem_ctx, cur_do->next()); next = new_block(); if (inst->predicate) @@ -286,8 +331,18 @@ cfg_t::cfg_t(exec_list *instructions) case BRW_OPCODE_BREAK: cur->instructions.push_tail(inst); - assert(cur_while != NULL); -cur->add_successor(mem_ctx, cur_while); + /* A conditional BREAK
Re: [Mesa-dev] [PATCH v2] wayland-drm: static inline wayland_drm_buffer_get
On 24 October 2017 at 17:14, Emil Velikovwrote: > From: Emil Velikov > > The function is effectively a direct function call into > libwayland-server.so. > > Thus GBM no longer depends on the wayland-drm static library, making the > build more straight forward. And the resulting binary is a bit smaller. > > Note: we need to move struct wayland_drm_callbacks further up, > otherwise we'll get an error since the type is incomplete. > > v2: Rebase, beef-up commit message, update meson, move struct > wayland_drm_callbacks. > > Cc: Dylan Baker > Cc: Daniel Stone > Signed-off-by: Emil Velikov > Reviewed-by: Daniel Stone (v1) > --- > Dylan can you check the meson bits? Can one say to meson, build object X > while only using the depA CFLAGS? It seems to me that it currently links > against depA even when you don't want it to. > Dylan, can you please take a look at the meson bit? -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] clover/llvm: Drop support for LLVM < 3.9.
Vedran Miletićwrites: > v2: remove/inline compat stuff > > Reviewed-by: Francisco Jerez > --- > .../state_trackers/clover/llvm/codegen/native.cpp | 8 +- > src/gallium/state_trackers/clover/llvm/compat.hpp | 109 > + > .../state_trackers/clover/llvm/invocation.cpp | 22 +++-- > .../state_trackers/clover/llvm/metadata.hpp| 30 +- > 4 files changed, 20 insertions(+), 149 deletions(-) > > diff --git a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp > b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp > index 12c83a92b6..38028597a3 100644 > --- a/src/gallium/state_trackers/clover/llvm/codegen/native.cpp > +++ b/src/gallium/state_trackers/clover/llvm/codegen/native.cpp > @@ -114,7 +114,7 @@ namespace { > >std::unique_ptr tm { > t->createTargetMachine(target.triple, target.cpu, "", {}, > -compat::default_reloc_model, > +::llvm::None, > compat::default_code_model, > ::llvm::CodeGenOpt::Default) }; >if (!tm) > @@ -124,11 +124,11 @@ namespace { >::llvm::SmallVector data; > >{ > - compat::pass_manager pm; > + ::llvm::legacy::PassManager pm; > ::llvm::raw_svector_ostream os { data }; > - compat::raw_ostream_to_emit_file fos(os); > + ::llvm::raw_svector_ostream (os); > This is just a reference to the other local variable above. Mind cleaning it up? > - mod.setDataLayout(compat::get_data_layout(*tm)); > + mod.setDataLayout(tm->createDataLayout()); > tm->Options.MCOptions.AsmVerbose = > (ft == TargetMachine::CGFT_AssemblyFile); > > diff --git a/src/gallium/state_trackers/clover/llvm/compat.hpp > b/src/gallium/state_trackers/clover/llvm/compat.hpp > index f8b56516d5..ce3a29f7d5 100644 > --- a/src/gallium/state_trackers/clover/llvm/compat.hpp > +++ b/src/gallium/state_trackers/clover/llvm/compat.hpp > @@ -36,6 +36,8 @@ > > #include "util/algorithm.hpp" > > +#include > +#include > #include > #include > #include > @@ -46,16 +48,6 @@ > #include > #endif > > -#if HAVE_LLVM >= 0x0307 > -#include > -#include > -#else > -#include > -#include > -#include > -#include > -#endif > - > #include > #include > #include > @@ -63,12 +55,6 @@ > namespace clover { > namespace llvm { >namespace compat { > -#if HAVE_LLVM >= 0x0307 > - typedef ::llvm::TargetLibraryInfoImpl target_library_info; > -#else > - typedef ::llvm::TargetLibraryInfo target_library_info; > -#endif > - > #if HAVE_LLVM >= 0x0500 > const auto lang_as_offset = 0; > const clang::InputKind ik_opencl = clang::InputKind::OpenCL; > @@ -77,19 +63,6 @@ namespace clover { > const clang::InputKind ik_opencl = clang::IK_OpenCL; > #endif > > - inline void > - set_lang_defaults(clang::CompilerInvocation , > - clang::LangOptions , clang::InputKind ik, > - const ::llvm::Triple , > - clang::PreprocessorOptions , > - clang::LangStandard::Kind std) { > -#if HAVE_LLVM >= 0x0309 > -inv.setLangDefaults(lopts, ik, t, ppopts, std); > -#else > -inv.setLangDefaults(lopts, ik, std); > -#endif > - } > - > inline void > add_link_bitcode_file(clang::CodeGenOptions , > const std::string ) { > @@ -100,78 +73,8 @@ namespace clover { > F.PropagateAttrs = true; > F.LinkFlags = ::llvm::Linker::Flags::None; > opts.LinkBitcodeFiles.emplace_back(F); > -#elif HAVE_LLVM >= 0x0308 > -opts.LinkBitcodeFiles.emplace_back(::llvm::Linker::Flags::None, > path); > -#else > -opts.LinkBitcodeFile = path; > -#endif > - } > - > -#if HAVE_LLVM >= 0x0307 > - typedef ::llvm::legacy::PassManager pass_manager; > -#else > - typedef ::llvm::PassManager pass_manager; > -#endif > - > - inline void > - add_data_layout_pass(pass_manager ) { > -#if HAVE_LLVM < 0x0307 > -pm.add(new ::llvm::DataLayoutPass()); > -#endif > - } > - > - inline void > - add_internalize_pass(pass_manager , > - const std::vector ) { > -#if HAVE_LLVM >= 0x0309 > -pm.add(::llvm::createInternalizePass( > - [=](const ::llvm::GlobalValue ) { > - return std::find(names.begin(), names.end(), > - gv.getName()) != names.end(); > - })); > -#else > -pm.add(::llvm::createInternalizePass(std::vector( > - map(std::mem_fn(::string::data), names; > -#endif > - } > - > -
Re: [Mesa-dev] [PATCH 1/7] util: move futex helpers into futex.h
On 23/10/17 05:33, Nicolai Hähnle wrote: From: Nicolai Hähnle--- src/util/Makefile.sources | 1 + src/util/futex.h | 51 +++ src/util/meson.build | 1 + src/util/simple_mtx.h | 20 +-- 4 files changed, 54 insertions(+), 19 deletions(-) create mode 100644 src/util/futex.h diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources index c7f6516a992..407b2825624 100644 --- a/src/util/Makefile.sources +++ b/src/util/Makefile.sources @@ -6,20 +6,21 @@ MESA_UTIL_FILES := \ build_id.h \ crc32.c \ crc32.h \ debug.c \ debug.h \ disk_cache.c \ disk_cache.h \ format_r11g11b10f.h \ format_rgb9e5.h \ format_srgb.h \ + futex.h \ half_float.c \ half_float.h \ hash_table.c \ hash_table.h \ list.h \ macros.h \ mesa-sha1.c \ mesa-sha1.h \ sha1/sha1.c \ sha1/sha1.h \ diff --git a/src/util/futex.h b/src/util/futex.h new file mode 100644 index 000..a79daaf209b --- /dev/null +++ b/src/util/futex.h @@ -0,0 +1,51 @@ +/* + * Copyright © 2015 Intel + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#ifndef UTIL_FUTEX_H +#define UTIL_FUTEX_H + +#if defined(HAVE_FUTEX) + +#include +#include +#include +#include +#include +#include + +static inline long sys_futex(void *addr1, int op, int val1, struct timespec *timeout, void *addr2, int val3) +{ + return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3); +} + +static inline int futex_wake(uint32_t *addr) { + return sys_futex(addr, FUTEX_WAKE, 1, NULL, NULL, 0); +} + +static inline int futex_wait(uint32_t *addr, int32_t value) { These have the same style mistakes as in the fast mtx patch i.e the { needs to be on the next line. It looks like this series is written on top of that patch but I haven't yet got a rb/ack for it yet. + return sys_futex(addr, FUTEX_WAIT, value, NULL, NULL, 0); +} + +#endif + +#endif /* UTIL_FUTEX_H */ diff --git a/src/util/meson.build b/src/util/meson.build index c9cb3e861e9..0940e4d12a7 100644 --- a/src/util/meson.build +++ b/src/util/meson.build @@ -30,20 +30,21 @@ files_mesa_util = files( 'build_id.h', 'crc32.c', 'crc32.h', 'debug.c', 'debug.h', 'disk_cache.c', 'disk_cache.h', 'format_r11g11b10f.h', 'format_rgb9e5.h', 'format_srgb.h', + 'futex.h', 'half_float.c', 'half_float.h', 'hash_table.c', 'hash_table.h', 'list.h', 'macros.h', 'mesa-sha1.c', 'mesa-sha1.h', 'sha1/sha1.c', 'sha1/sha1.h', diff --git a/src/util/simple_mtx.h b/src/util/simple_mtx.h index d23a4303c80..0c2602d03b6 100644 --- a/src/util/simple_mtx.h +++ b/src/util/simple_mtx.h @@ -14,20 +14,21 @@ * * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS * IN THE SOFTWARE. */ +#include "util/futex.h" #include "util/u_thread.h" #if defined(__GNUC__) && defined(HAVE_FUTEX) /* mtx_t - Fast, simple mutex * * While modern pthread mutexes are very fast (implemented using futex), they * still incur a call to an external DSO and overhead of the generality and * features of pthread mutexes. Most mutexes in mesa only needs lock/unlock, * and the idea here is that we can inline the atomic operation and make the @@ -48,39 +49,20 @@ * A fast mutex only supports lock/unlock, can't
Re: [Mesa-dev] [PATCH V2 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
On Fri, Oct 27, 2017 at 4:00 PM, Kenneth Graunkewrote: > On Friday, October 27, 2017 3:17:20 PM PDT Anuj Phogat wrote: >> V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE >> >> Signed-off-by: Anuj Phogat >> Cc: Kenneth Graunke >> --- >> src/mesa/drivers/dri/i965/genX_state_upload.c | 13 - >> 1 file changed, 12 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c >> b/src/mesa/drivers/dri/i965/genX_state_upload.c >> index 4ccfd48919..25240f01d9 100644 >> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c >> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c >> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) >> sf.SmoothPointEnable = true; >> #endif >> >> +#if GEN_GEN == 10 >> + /* _NEW_BUFFERS >> + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. >> + */ >> + const bool multisampled_fbo = >> + _mesa_geometric_samples(ctx->DrawBuffer) > 1; >> + if (multisampled_fbo) >> + sf.SmoothPointEnable = false; >> +#endif >> + >> #if GEN_IS_G4X || GEN_GEN >= 5 >>sf.AALineDistanceMode = AALINEDISTANCE_TRUE; >> #endif >> @@ -1681,7 +1691,8 @@ static const struct brw_tracked_state genX(sf_state) = >> { >> _NEW_POINT | >> _NEW_PROGRAM | >> (GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0) | >> - (GEN_GEN <= 7 ? _NEW_BUFFERS | _NEW_POLYGON : 0), >> + (GEN_GEN <= 7 ? _NEW_POLYGON : 0) | >> + (GEN_GEN <= 7 || GEN_GEN == 10 ? _NEW_BUFFERS : 0), > > Let's leave the GEN <= 7 bit as is and just add > > (GEN_GEN == 10 ? _NEW_BUFFERS : 0) Done. Thanks. > > With that, both are > Reviewed-by: Kenneth Graunke ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
On Friday, October 27, 2017 3:17:20 PM PDT Anuj Phogat wrote: > V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE > > Signed-off-by: Anuj Phogat> Cc: Kenneth Graunke > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 13 - > 1 file changed, 12 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index 4ccfd48919..25240f01d9 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) > sf.SmoothPointEnable = true; > #endif > > +#if GEN_GEN == 10 > + /* _NEW_BUFFERS > + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. > + */ > + const bool multisampled_fbo = > + _mesa_geometric_samples(ctx->DrawBuffer) > 1; > + if (multisampled_fbo) > + sf.SmoothPointEnable = false; > +#endif > + > #if GEN_IS_G4X || GEN_GEN >= 5 >sf.AALineDistanceMode = AALINEDISTANCE_TRUE; > #endif > @@ -1681,7 +1691,8 @@ static const struct brw_tracked_state genX(sf_state) = { > _NEW_POINT | > _NEW_PROGRAM | > (GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0) | > - (GEN_GEN <= 7 ? _NEW_BUFFERS | _NEW_POLYGON : 0), > + (GEN_GEN <= 7 ? _NEW_POLYGON : 0) | > + (GEN_GEN <= 7 || GEN_GEN == 10 ? _NEW_BUFFERS : 0), Let's leave the GEN <= 7 bit as is and just add (GEN_GEN == 10 ? _NEW_BUFFERS : 0) With that, both are Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] scons: fix OSMesa driver build
Looks alright too. Reviewed-by: Roland ScheideggerAm 28.10.2017 um 00:35 schrieb Brian Paul: > Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path" > --- > src/mesa/drivers/osmesa/SConscript | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mesa/drivers/osmesa/SConscript > b/src/mesa/drivers/osmesa/SConscript > index 7280314..9391dc3 100644 > --- a/src/mesa/drivers/osmesa/SConscript > +++ b/src/mesa/drivers/osmesa/SConscript > @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [ > '#src/mapi', > '#src/mesa', > Dir('../../../mapi'), # src/mapi build path for python-generated GL API > files/headers > +Dir('../../../mapi/glapi'), # src/mapi/glapi build path > ]) > > env.Prepend(LIBS = [ > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] scons: fix OSMesa driver build
Ah, I see that you fixed that already, Reviewed-by: Dylan BakerQuoting Brian Paul (2017-10-27 15:35:09) > Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path" > --- > src/mesa/drivers/osmesa/SConscript | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mesa/drivers/osmesa/SConscript > b/src/mesa/drivers/osmesa/SConscript > index 7280314..9391dc3 100644 > --- a/src/mesa/drivers/osmesa/SConscript > +++ b/src/mesa/drivers/osmesa/SConscript > @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [ > '#src/mapi', > '#src/mesa', > Dir('../../../mapi'), # src/mapi build path for python-generated GL API > files/headers > +Dir('../../../mapi/glapi'), # src/mapi/glapi build path > ]) > > env.Prepend(LIBS = [ > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] scons: fix OSMesa driver build
I sent almost the same patch at exactly the same time. I also needed to add this path to src/map/glapi/Sconscript to get scons building. Maybe that's unrelated? Dylan Quoting Brian Paul (2017-10-27 15:35:09) > Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path" > --- > src/mesa/drivers/osmesa/SConscript | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mesa/drivers/osmesa/SConscript > b/src/mesa/drivers/osmesa/SConscript > index 7280314..9391dc3 100644 > --- a/src/mesa/drivers/osmesa/SConscript > +++ b/src/mesa/drivers/osmesa/SConscript > @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [ > '#src/mapi', > '#src/mesa', > Dir('../../../mapi'), # src/mapi build path for python-generated GL API > files/headers > +Dir('../../../mapi/glapi'), # src/mapi/glapi build path > ]) > > env.Prepend(LIBS = [ > -- > 1.9.1 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] scons: fix OSMesa driver build
Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path" --- src/mesa/drivers/osmesa/SConscript | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/osmesa/SConscript b/src/mesa/drivers/osmesa/SConscript index 7280314..9391dc3 100644 --- a/src/mesa/drivers/osmesa/SConscript +++ b/src/mesa/drivers/osmesa/SConscript @@ -7,6 +7,7 @@ env.Prepend(CPPPATH = [ '#src/mapi', '#src/mesa', Dir('../../../mapi'), # src/mapi build path for python-generated GL API files/headers +Dir('../../../mapi/glapi'), # src/mapi/glapi build path ]) env.Prepend(LIBS = [ -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [AppVeyor] mesa master #5970 completed
Build mesa 5970 completed Commit 05592cebd4 by Brian Paul on 10/27/2017 9:14 PM: scons: fix scons build to find generated glapitable.h\n\nFixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path"\n\nReviewed-by: Roland ScheideggerConfigure your notification preferences ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] scons: Fix glapi include path changes
fixes: 5daed06da2c0aaa4 ("osmesa: Include generated headers without path") cc: Mark JanesSigned-off-by: Dylan Baker --- src/mapi/glapi/SConscript | 1 + src/mesa/drivers/osmesa/SConscript | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript index 994778a8fb9..11f2b6dddfd 100644 --- a/src/mapi/glapi/SConscript +++ b/src/mapi/glapi/SConscript @@ -31,6 +31,7 @@ env.Append(CPPPATH = [ '#/src/mapi', '#/src/mesa', Dir('..'), # src/mapi build path +Dir('.'), # src/mapi/glapi build path ]) glapi_sources = [ diff --git a/src/mesa/drivers/osmesa/SConscript b/src/mesa/drivers/osmesa/SConscript index 728031446e5..43da571f085 100644 --- a/src/mesa/drivers/osmesa/SConscript +++ b/src/mesa/drivers/osmesa/SConscript @@ -6,7 +6,9 @@ env.Prepend(CPPPATH = [ '#src', '#src/mapi', '#src/mesa', -Dir('../../../mapi'), # src/mapi build path for python-generated GL API files/headers +# src/mapi{/glapi} build path for python-generated GL API files/headers +Dir('../../../mapi'), +Dir('../../../mapi/glapi'), ]) env.Prepend(LIBS = [ -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2 2/2] i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1
V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE. Fix indentation. Signed-off-by: Anuj PhogatCc: Kenneth Graunke Cc: Matt Turner --- src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index 25240f01d9..65ec30833c 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -4363,6 +4363,16 @@ genX(upload_raster)(struct brw_context *brw) /* _NEW_LINE */ raster.AntialiasingEnable = ctx->Line.SmoothFlag; +#if GEN_GEN == 10 + /* _NEW_BUFFERS + * Antialiasing Enable bit MUST not be set when NUM_MULTISAMPLES > 1. + */ + const bool multisampled_fbo = + _mesa_geometric_samples(ctx->DrawBuffer) > 1; + if (multisampled_fbo) + raster.AntialiasingEnable = false; +#endif + /* _NEW_SCISSOR */ raster.ScissorRectangleEnable = ctx->Scissor.EnableFlags; -- 2.13.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3.5] intel/compiler: Add union types for prog_data and prog_key stages
On Thu, Oct 26, 2017 at 11:14 PM, Jordan Justenwrote: > Signed-off-by: Jordan Justen > Reviewed-by: Jason Ekstrand > Cc: Jason Ekstrand > Cc: Kenneth Graunke > --- > > * Add comment (Ken) > * No typedef (Jason) > > src/intel/compiler/brw_compiler.h | 20 > 1 file changed, 20 insertions(+) > > diff --git a/src/intel/compiler/brw_compiler.h b/src/intel/compiler/brw_ > compiler.h > index 701b4a80bf1..6ad89171ce4 100644 > --- a/src/intel/compiler/brw_compiler.h > +++ b/src/intel/compiler/brw_compiler.h > @@ -403,6 +403,16 @@ struct brw_cs_prog_key { > struct brw_sampler_prog_key_data tex; > }; > > +/* brw_any_prog_key is any of the keys that map to an API stage */ > +union brw_any_prog_key { > + struct brw_vs_prog_key vs; > + struct brw_tcs_prog_key tcs; > + struct brw_tes_prog_key tes; > + struct brw_gs_prog_key gs; > + struct brw_wm_prog_key wm; > + struct brw_cs_prog_key cs; > +}; > + > /* > * Image metadata structure as laid out in the shader parameter > * buffer. Entries have to be 16B-aligned for the vec4 back-end to be > @@ -1066,6 +1076,16 @@ struct brw_clip_prog_data { > uint32_t total_grf; > }; > > +/* brw_any_prog_data is prog_data for any stage that maps to an API stage > */ > +union brw_any_prog_data { > For my usage, it'd be nice if you also throw stage_prog_data in here. > + struct brw_vs_prog_data vs; > + struct brw_tcs_prog_data tcs; > + struct brw_tes_prog_data tes; > + struct brw_gs_prog_data gs; > + struct brw_wm_prog_data wm; > + struct brw_cs_prog_data cs; > +}; > + > #define DEFINE_PROG_DATA_DOWNCAST(stage) \ > static inline struct brw_##stage##_prog_data * \ > brw_##stage##_prog_data(struct brw_stage_prog_data *prog_data) \ > -- > 2.15.0.rc2 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 1/2] intel/isl: Disable some gen10 CCS_E formats for now
On Fri, Oct 27, 2017 at 12:50:45PM -0700, Jason Ekstrand wrote: > Thanks for doing this in isl_format_supports_ccs_e instead of changing the > table! > > Reviewed-by: Jason Ekstrand> Thank you for the review! > On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery > wrote: > > > CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2 > > formats with CCS_E. None of these formats fit within the current > > blorp_copy framework so disable them until support is added. > > > > Signed-off-by: Nanley Chery > > Cc: Jason Ekstrand > > --- > > > > Jason, do you think we modify blorp instead of moving forward with this > > patch? In any case, I've sent this to the list to add context to my next > > patch. > > > > src/intel/isl/isl_format.c | 24 > > 1 file changed, 24 insertions(+) > > > > diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c > > index fba3ac5e1a..03c591071b 100644 > > --- a/src/intel/isl/isl_format.c > > +++ b/src/intel/isl/isl_format.c > > @@ -536,6 +536,30 @@ isl_format_supports_ccs_e(const struct > > gen_device_info *devinfo, > > if (!format_info[format].exists) > >return false; > > > > + /* For simplicity, only report that a format supports CCS_E if blorp > > can > > +* perform bit-for-bit copies with an image of that format while > > compressed. > > +* This allows ISL users to avoid having to resolve the image before > > +* performing such a copy. We may want to change this behavior in the > > +* future. > > +* > > +* R11G11B10_FLOAT has no equivalent UINT format. Given how blorp_copy > > +* currently works, bit-for-bit copy operations are not possible > > without an > > +* intermediate resolve. > > +*/ > > + if (format == ISL_FORMAT_R11G11B10_FLOAT) > > + return false; > > + > > + /* blorp_copy currently doesn't support formats with different > > bit-widths > > +* per-channel. Until that support is added, report that these formats > > don't > > +* support CCS_E. FIXME: Add support for these formats. > > +*/ > > + if (format == ISL_FORMAT_B10G10R10A2_UNORM || > > + format == ISL_FORMAT_B10G10R10A2_UNORM_SRGB || > > + format == ISL_FORMAT_R10G10B10A2_UNORM || > > + format == ISL_FORMAT_R10G10B10A2_UINT) { > > + return false; > > + } > > + > > return format_gen(devinfo) >= format_info[format].ccs_e; > > } > > > > -- > > 2.14.3 > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
V2: Use _NEW_BUFFERS in place of _NEW_MULTISAMPLE Signed-off-by: Anuj PhogatCc: Kenneth Graunke --- src/mesa/drivers/dri/i965/genX_state_upload.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index 4ccfd48919..25240f01d9 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) sf.SmoothPointEnable = true; #endif +#if GEN_GEN == 10 + /* _NEW_BUFFERS + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. + */ + const bool multisampled_fbo = + _mesa_geometric_samples(ctx->DrawBuffer) > 1; + if (multisampled_fbo) + sf.SmoothPointEnable = false; +#endif + #if GEN_IS_G4X || GEN_GEN >= 5 sf.AALineDistanceMode = AALINEDISTANCE_TRUE; #endif @@ -1681,7 +1691,8 @@ static const struct brw_tracked_state genX(sf_state) = { _NEW_POINT | _NEW_PROGRAM | (GEN_GEN >= 6 ? _NEW_MULTISAMPLE : 0) | - (GEN_GEN <= 7 ? _NEW_BUFFERS | _NEW_POLYGON : 0), + (GEN_GEN <= 7 ? _NEW_POLYGON : 0) | + (GEN_GEN <= 7 || GEN_GEN == 10 ? _NEW_BUFFERS : 0), .brw = BRW_NEW_BLORP | BRW_NEW_VUE_MAP_GEOM_OUT | (GEN_GEN <= 5 ? BRW_NEW_BATCH | -- 2.13.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering
On Fri, Oct 27, 2017 at 12:52:30PM -0700, Jason Ekstrand wrote: > On Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery> wrote: > > > Only use CCS_E to render to a texture that is CCS_E-compatible with the > > original texture's miptree (linear) format. This prevents render > > operations from writing data that can't be decoded with the original > > miptree format. > > > > On Gen10, with the new CCS_E-enabled formats handled, this enables the > > driver to pass the arb_texture_view-rendering-formats piglit test. > > > > Cc: > > Signed-off-by: Nanley Chery > > --- > > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 > > +-- > > 1 file changed, 26 insertions(+), 2 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > index a850f4d17b..59c57c227b 100644 > > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > > @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct brw_context > > *brw, > > } > > } > > > > +/** > > + * Return true if the format that will be used to access the miptree is > > + * CCS_E-compatible with the miptree's linear/non-sRGB format. > > + * > > + * Why use the linear format? Well, although the miptree may be specified > > with > > + * an sRGB format, the usage of that color space/format can be toggled. > > Since > > + * our HW tends to support more linear formats than sRGB ones, we use this > > + * format variant for check for CCS_E compatibility. > > + */ > > +static bool > > +format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo, > > + const struct intel_mipmap_tree *mt, > > + enum isl_format access_format) > > +{ > > + assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E); > > + > > + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); > > + enum isl_format isl_format = brw_isl_format_for_mesa_ > > format(linear_format); > > + return isl_formats_are_ccs_e_compatible(devinfo, isl_format, > > access_format); > > +} > > + > > static bool > > intel_miptree_supports_ccs_e(struct brw_context *brw, > > const struct intel_mipmap_tree *mt) > > @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct brw_context > > *brw, > >return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE; > > > > case ISL_AUX_USAGE_CCS_E: { > > - /* If the format supports CCS_E, then we can just use it */ > > - if (isl_format_supports_ccs_e(>screen->devinfo, > > render_format)) > > + /* If the format supports CCS_E and is compatible with the miptree, > > + * then we can use it. > > + */ > > + if (format_ccs_e_compat_with_miptree(>screen->devinfo, > > + mt, render_format)) > > > > You don't need the helper if you just use mt->surf.format. That's what > can_texture_with_ccs does. > > Isn't using mt->surf.format making the code more restrictive than necessary? That field may give you the sRGB format of the surface, but the helper would always give you the linear variant. Therefore the helper allows for CCS_E to be used in more cases. > > return ISL_AUX_USAGE_CCS_E; > > > >/* Otherwise, we have to fall back to CCS_D */ > > -- > > 2.14.3 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] st/mesa: guard sampler views changes with a mutex
For the series: Reviewed-by: Marek OlšákMarek On Sun, Oct 22, 2017 at 8:39 PM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > Some locking is unfortunately required, because well-formed GL programs > can have multiple threads racing to access the same texture, e.g.: two > threads/contexts rendering from the same texture, or one thread destroying > a context while the other is rendering from or modifying a texture. > > Since even the simple mutex caused noticable slowdowns in the piglit > drawoverhead micro-benchmark, this patch uses a slightly more involved > approach to keep locks out of the fast path: > > - the initial lookup of sampler views happens without taking a lock > - a per-texture lock is only taken when we have to modify the sampler > view(s) > - since each thread mostly operates only on the entry corresponding to > its context, the main issue is re-allocation of the sampler view array > when it needs to be grown, but the old copy is not freed > > Old copies of the sampler views array are kept around in a linked list > until the entire texture object is deleted. The total memory wasted > in this way is roughly equal to the size of the current sampler views > array. > > Fixes non-deterministic memory corruption in some > dEQP-EGL.functional.sharing.gles2.multithread.* tests, e.g. > dEQP-EGL.functional.sharing.gles2.multithread.simple.images.texture_source.create_texture_render > --- > src/mesa/state_tracker/st_atom_sampler.c | 21 +-- > src/mesa/state_tracker/st_cb_texture.c | 12 ++ > src/mesa/state_tracker/st_sampler_view.c | 270 > ++- > src/mesa/state_tracker/st_sampler_view.h | 3 + > src/mesa/state_tracker/st_texture.h | 40 - > 5 files changed, 250 insertions(+), 96 deletions(-) > > diff --git a/src/mesa/state_tracker/st_atom_sampler.c > b/src/mesa/state_tracker/st_atom_sampler.c > index ff3f49fa4e4..6d83f963f0a 100644 > --- a/src/mesa/state_tracker/st_atom_sampler.c > +++ b/src/mesa/state_tracker/st_atom_sampler.c > @@ -36,20 +36,21 @@ > #include "main/mtypes.h" > #include "main/glformats.h" > #include "main/samplerobj.h" > #include "main/teximage.h" > #include "main/texobj.h" > > #include "st_context.h" > #include "st_cb_texture.h" > #include "st_format.h" > #include "st_atom.h" > +#include "st_sampler_view.h" > #include "st_texture.h" > #include "pipe/p_context.h" > #include "pipe/p_defines.h" > > #include "cso_cache/cso_context.h" > > #include "util/u_format.h" > > > /** > @@ -157,40 +158,32 @@ st_convert_sampler(const struct st_context *st, > (sampler->wrap_s | sampler->wrap_t | sampler->wrap_r) & 0x1 && > (msamp->BorderColor.ui[0] || > msamp->BorderColor.ui[1] || > msamp->BorderColor.ui[2] || > msamp->BorderColor.ui[3])) { >const GLboolean is_integer = texobj->_IsIntegerFormat; >GLenum texBaseFormat = _mesa_base_tex_image(texobj)->_BaseFormat; > >if (st->apply_texture_swizzle_to_border_color) { > const struct st_texture_object *stobj = > st_texture_object_const(texobj); > - const struct pipe_sampler_view *sv = NULL; > - > - /* Just search for the first used view. We can do this because the > -swizzle is per-texture, not per context. */ > /* XXX: clean that up to not use the sampler view at all */ > - for (unsigned i = 0; i < stobj->num_sampler_views; ++i) { > -if (stobj->sampler_views[i].view) { > - sv = stobj->sampler_views[i].view; > - break; > -} > - } > + const struct st_sampler_view *sv = > st_texture_get_current_sampler_view(st, stobj); > > if (sv) { > +struct pipe_sampler_view *view = sv->view; > union pipe_color_union tmp; > const unsigned char swz[4] = > { > - sv->swizzle_r, > - sv->swizzle_g, > - sv->swizzle_b, > - sv->swizzle_a, > + view->swizzle_r, > + view->swizzle_g, > + view->swizzle_b, > + view->swizzle_a, > }; > > st_translate_color(>BorderColor, , > texBaseFormat, is_integer); > > util_format_apply_color_swizzle(>border_color, > , swz, is_integer); > } else { > st_translate_color(>BorderColor, > >border_color, > diff --git a/src/mesa/state_tracker/st_cb_texture.c > b/src/mesa/state_tracker/st_cb_texture.c > index 3b7a3b5e985..7766273381b 100644 > --- a/src/mesa/state_tracker/st_cb_texture.c > +++ b/src/mesa/state_tracker/st_cb_texture.c > @@ -144,40 +144,52 @@ st_DeleteTextureImage(struct gl_context * ctx, struct > gl_texture_image *img) > /* nothing special (yet) for st_texture_image */ >
Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
On Fri, Oct 27, 2017 at 2:50 PM, Matt Turnerwrote: > On Fri, Oct 27, 2017 at 10:50 AM, Anuj Phogat wrote: >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c >> b/src/mesa/drivers/dri/i965/genX_state_upload.c >> index 4ccfd48919..b6e800aa90 100644 >> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c >> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c >> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) >> sf.SmoothPointEnable = true; >> #endif >> >> +#if GEN_GEN == 10 >> + /* _NEW_MULTISAMPLE >> + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. >> + */ >> + const bool multisampled_fbo = >> + _mesa_geometric_samples(ctx->DrawBuffer) > 1; > > Indent this line. Fixed locally. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
On Fri, Oct 27, 2017 at 2:44 PM, Kenneth Graunkewrote: > On Friday, October 27, 2017 10:50:23 AM PDT Anuj Phogat wrote: >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ >> 1 file changed, 10 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c >> b/src/mesa/drivers/dri/i965/genX_state_upload.c >> index 4ccfd48919..b6e800aa90 100644 >> --- a/src/mesa/drivers/dri/i965/genX_state_upload.c >> +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c >> @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) >> sf.SmoothPointEnable = true; >> #endif >> >> +#if GEN_GEN == 10 >> + /* _NEW_MULTISAMPLE >> + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. >> + */ >> + const bool multisampled_fbo = >> + _mesa_geometric_samples(ctx->DrawBuffer) > 1; >> + if (multisampled_fbo) >> + sf.SmoothPointEnable = false; >> +#endif >> + >> #if GEN_IS_G4X || GEN_GEN >= 5 >>sf.AALineDistanceMode = AALINEDISTANCE_TRUE; >> #endif >> > > In both patches...accessing ctx->DrawBuffer requires _NEW_BUFFERS, but > doesn't need _NEW_MULTISAMPLE. Right. Fixed locally. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] scons: fix scons build to find generated glapitable.h
Am 27.10.2017 um 23:16 schrieb Brian Paul: > Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path" > --- > src/mapi/glapi/SConscript | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript > index 994778a..08fed6c 100644 > --- a/src/mapi/glapi/SConscript > +++ b/src/mapi/glapi/SConscript > @@ -30,6 +30,7 @@ env.Append(CPPPATH = [ > '#/src', > '#/src/mapi', > '#/src/mesa', > +Dir('.'), # src/mapi/glapi build path > Dir('..'), # src/mapi build path > ]) > > Makes sense to me. Reviewed-by: Roland Scheidegger___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
On Fri, Oct 27, 2017 at 10:50 AM, Anuj Phogatwrote: > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index 4ccfd48919..b6e800aa90 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) > sf.SmoothPointEnable = true; > #endif > > +#if GEN_GEN == 10 > + /* _NEW_MULTISAMPLE > + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. > + */ > + const bool multisampled_fbo = > + _mesa_geometric_samples(ctx->DrawBuffer) > 1; Indent this line. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
On Friday, October 27, 2017 10:50:23 AM PDT Anuj Phogat wrote: > Signed-off-by: Anuj Phogat> --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index 4ccfd48919..b6e800aa90 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) > sf.SmoothPointEnable = true; > #endif > > +#if GEN_GEN == 10 > + /* _NEW_MULTISAMPLE > + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. > + */ > + const bool multisampled_fbo = > + _mesa_geometric_samples(ctx->DrawBuffer) > 1; > + if (multisampled_fbo) > + sf.SmoothPointEnable = false; > +#endif > + > #if GEN_IS_G4X || GEN_GEN >= 5 >sf.AALineDistanceMode = AALINEDISTANCE_TRUE; > #endif > In both patches...accessing ctx->DrawBuffer requires _NEW_BUFFERS, but doesn't need _NEW_MULTISAMPLE. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium; s/unsigned/enum pipe_prim_type/
Reviewed-by: Roland ScheideggerAm 27.10.2017 um 23:16 schrieb Brian Paul: > In the vbuf_render::set_primitive() functions. > --- > src/gallium/auxiliary/draw/draw_vbuf.h | 3 ++- > src/gallium/drivers/i915/i915_prim_vbuf.c| 2 +- > src/gallium/drivers/llvmpipe/lp_setup_vbuf.c | 2 +- > src/gallium/drivers/nouveau/nv30/nv30_draw.c | 2 +- > src/gallium/drivers/r300/r300_render.c | 2 +- > src/gallium/drivers/softpipe/sp_prim_vbuf.c | 4 ++-- > 6 files changed, 8 insertions(+), 7 deletions(-) > > diff --git a/src/gallium/auxiliary/draw/draw_vbuf.h > b/src/gallium/auxiliary/draw/draw_vbuf.h > index 194796b..8faccda 100644 > --- a/src/gallium/auxiliary/draw/draw_vbuf.h > +++ b/src/gallium/auxiliary/draw/draw_vbuf.h > @@ -38,6 +38,7 @@ > > > #include "pipe/p_compiler.h" > +#include "pipe/p_defines.h" > > > struct pipe_rasterizer_state; > @@ -96,7 +97,7 @@ struct vbuf_render { > * the discretion of the driver, for the benefit of the passthrough > * path. > */ > - void (*set_primitive)( struct vbuf_render *, unsigned prim ); > + void (*set_primitive)( struct vbuf_render *, enum pipe_prim_type prim ); > > /** > * Draw indexed primitives. Note that indices are ushort. The driver > diff --git a/src/gallium/drivers/i915/i915_prim_vbuf.c > b/src/gallium/drivers/i915/i915_prim_vbuf.c > index c0ba23b..8f2f5c1 100644 > --- a/src/gallium/drivers/i915/i915_prim_vbuf.c > +++ b/src/gallium/drivers/i915/i915_prim_vbuf.c > @@ -324,7 +324,7 @@ i915_vbuf_ensure_index_bounds(struct vbuf_render *render, > > static void > i915_vbuf_render_set_primitive(struct vbuf_render *render, > - unsigned prim) > + enum pipe_prim_type prim) > { > struct i915_vbuf_render *i915_render = i915_vbuf_render(render); > i915_render->prim = prim; > diff --git a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c > b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c > index d7fa9fd..28a48d4 100644 > --- a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c > +++ b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c > @@ -115,7 +115,7 @@ lp_setup_unmap_vertices(struct vbuf_render *vbr, > > > static void > -lp_setup_set_primitive(struct vbuf_render *vbr, unsigned prim) > +lp_setup_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim) > { > lp_setup_context(vbr)->prim = prim; > } > diff --git a/src/gallium/drivers/nouveau/nv30/nv30_draw.c > b/src/gallium/drivers/nouveau/nv30/nv30_draw.c > index 4c587fc..798ec14 100644 > --- a/src/gallium/drivers/nouveau/nv30/nv30_draw.c > +++ b/src/gallium/drivers/nouveau/nv30/nv30_draw.c > @@ -111,7 +111,7 @@ nv30_render_unmap_vertices(struct vbuf_render *render, > } > > static void > -nv30_render_set_primitive(struct vbuf_render *render, unsigned prim) > +nv30_render_set_primitive(struct vbuf_render *render, enum pipe_prim_type > prim) > { > struct nv30_render *r = nv30_render(render); > > diff --git a/src/gallium/drivers/r300/r300_render.c > b/src/gallium/drivers/r300/r300_render.c > index 4cae766..9397aae 100644 > --- a/src/gallium/drivers/r300/r300_render.c > +++ b/src/gallium/drivers/r300/r300_render.c > @@ -964,7 +964,7 @@ static void r300_render_release_vertices(struct > vbuf_render* render) > } > > static void r300_render_set_primitive(struct vbuf_render* render, > - unsigned prim) > + enum pipe_prim_type prim) > { > struct r300_render* r300render = r300_render(render); > > diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c > b/src/gallium/drivers/softpipe/sp_prim_vbuf.c > index 95d1ac1..1ce04a2 100644 > --- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c > +++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c > @@ -60,7 +60,7 @@ struct softpipe_vbuf_render > struct softpipe_context *softpipe; > struct setup_context *setup; > > - uint prim; > + enum pipe_prim_type prim; > uint vertex_size; > uint nr_vertices; > uint vertex_buffer_size; > @@ -133,7 +133,7 @@ sp_vbuf_unmap_vertices(struct vbuf_render *vbr, > > > static void > -sp_vbuf_set_primitive(struct vbuf_render *vbr, unsigned prim) > +sp_vbuf_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim) > { > struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr); > struct setup_context *setup_ctx = cvbr->setup; > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation
On 10/24/2017 02:42 AM, Muthukumar, Aravindan wrote: >> -Original Message- >> From: Ian Romanick [mailto:i...@freedesktop.org] >> Sent: Friday, October 20, 2017 9:51 AM >> To: Muthukumar, Aravindan; mesa- >> d...@lists.freedesktop.org >> Cc: J Karanje, Kedar ; Tamminen, Eero T >> >> Subject: Re: [Mesa-dev] [PATCH v2] i965 : optimized bucket index calculation >> >> On 09/13/2017 11:43 PM, aravindan.muthuku...@intel.com wrote: >>> From: Aravindan Muthukumar >>> >>> Avoiding the loop which was running with O(n) complexity. >>> Now the complexity has been reduced to O(1) >>> >>> Algorithm calculates the index using matrix method. >>> Matrix arrangement is as below: >>> Assuming PAGE_SIZE is 4096. >>> >>> 1*4096 2*40963*40964*4096 >>> 5*4096 6*40967*40968*4096 >> >> I think adding one more row to this chart would make it more clear. The two >> rows shown also follow a simpler pattern, and that made some of the >> complexity below seem confusing. >> >>10*4096 12*4096 14*4096 16*4096 >> >>> ... ... ... ... >>> ... ... ... ... >>> ... ... ... max_cache_size >>> >>> From this matrix its cleary seen that every row >>clearly >> >>> follows the below way: >>> ... ... ...n >>>n+(1/4)n n+(1/2)n n+(3/4)n2n >>> >>> Row is calulated as log2(size/PAGE_SIZE) Column is calculated as >>> converting the difference between the elements to fit into power size >>> of two and indexing it. >>> >>> Final Index is (row*4)+(col-1) >>> >>> Tested with Intel Mesa CI. >>> >>> Improves performance of 3d Mark on Broxton. >>> Analyzed using Compare Perf Analyser: >>> Average : 201.2 +/- 65.4836 (n=20) >> >> Is 201 the improvement or the absolute score? Do not quote absolute scores. >> >>> Percentage : 0.705966% +/- 0.229767% (n=20) >> >> Eero: Can you reproduce this result on BXT or other platforms? Just >> curious... >> >>> v2: Review comments regarding cosmetics and asserts implemented >>> >>> Signed-off-by: Aravindan Muthukumar >>> Signed-off-by: Kedar Karanje >>> Reviewed-by: Yogesh Marathe >>> --- >>> src/mesa/drivers/dri/i965/brw_bufmgr.c | 46 >>> -- >>> 1 file changed, 39 insertions(+), 7 deletions(-) >>> >>> diff --git a/src/mesa/drivers/dri/i965/brw_bufmgr.c >>> b/src/mesa/drivers/dri/i965/brw_bufmgr.c >>> index 8017219..8013ccb 100644 >>> --- a/src/mesa/drivers/dri/i965/brw_bufmgr.c >>> +++ b/src/mesa/drivers/dri/i965/brw_bufmgr.c >>> @@ -87,6 +87,8 @@ >>> >>> #define memclear(s) memset(, 0, sizeof(s)) >>> >>> +#define PAGE_SIZE 4096 >>> + >>> #define FILE_DEBUG_FLAG DEBUG_BUFMGR >>> >>> static inline int >>> @@ -181,19 +183,45 @@ bo_tile_pitch(struct brw_bufmgr *bufmgr, uint32_t >> pitch, uint32_t tiling) >>> return ALIGN(pitch, tile_width); >>> } >>> >>> +static inline int >>> +ilog2_round_up(int value) >>> +{ >>> + assert(value != 0); >>> + return 32 - __builtin_clz(value - 1); } >>> + >>> +/* >>> + * This function finds the correct bucket fit for the input size. >>> + * The function works with O(1) complexity when the requested size >>> + * was queried instead of iterating the size through all the buckets. >>> + */ >>> static struct bo_cache_bucket * >>> bucket_for_size(struct brw_bufmgr *bufmgr, uint64_t size) { >>> - int i; >>> + int index = -1; >> >> I don't see how execution could get to that test without index being set >> again, >> so it should be safe to remove the initialization. >> > > We will skip this initialization since it doesn’t impact the index result. > > >>> + int row, col = 0; >>> + int pages, pages_log2; >> >> Move the declarations of row, col, pages, and pages_log2 into the else case, >> initialize them with the only assignment to them, and make them const. >> >> Since all of these values, including index, get values derived from size, I >> believe >> the should all be unsigned. In that case, remove the index >= 0 test below. >> > > We will move as well as initialize the above variables in the respective else > part. > >>> >>> - for (i = 0; i < bufmgr->num_buckets; i++) { >>> - struct bo_cache_bucket *bucket = >cache_bucket[i]; >>> - if (bucket->size >= size) { >>> - return bucket; >>> - } >>> + /* condition for size less than 4*4096 (16KB) page size */ >> ^ ^ Remove one space here. >> | >> Capitalize and add a period at the end of the sentence. >> >>> + if(size <= 4 * PAGE_SIZE) { >> ^ Add a space here. >> >>> + index = DIV_ROUND_UP(size, PAGE_SIZE) - 1;; >> Remove extra trailing semicolon. ^ >> >>> + } else { >>> +
[Mesa-dev] [PATCH] scons: fix scons build to find generated glapitable.h
Fixes: ea53d9a8eb5d4b2 "glapi: include generated headers without path" --- src/mapi/glapi/SConscript | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mapi/glapi/SConscript b/src/mapi/glapi/SConscript index 994778a..08fed6c 100644 --- a/src/mapi/glapi/SConscript +++ b/src/mapi/glapi/SConscript @@ -30,6 +30,7 @@ env.Append(CPPPATH = [ '#/src', '#/src/mapi', '#/src/mesa', +Dir('.'), # src/mapi/glapi build path Dir('..'), # src/mapi build path ]) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] gallium; s/unsigned/enum pipe_prim_type/
In the vbuf_render::set_primitive() functions. --- src/gallium/auxiliary/draw/draw_vbuf.h | 3 ++- src/gallium/drivers/i915/i915_prim_vbuf.c| 2 +- src/gallium/drivers/llvmpipe/lp_setup_vbuf.c | 2 +- src/gallium/drivers/nouveau/nv30/nv30_draw.c | 2 +- src/gallium/drivers/r300/r300_render.c | 2 +- src/gallium/drivers/softpipe/sp_prim_vbuf.c | 4 ++-- 6 files changed, 8 insertions(+), 7 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_vbuf.h b/src/gallium/auxiliary/draw/draw_vbuf.h index 194796b..8faccda 100644 --- a/src/gallium/auxiliary/draw/draw_vbuf.h +++ b/src/gallium/auxiliary/draw/draw_vbuf.h @@ -38,6 +38,7 @@ #include "pipe/p_compiler.h" +#include "pipe/p_defines.h" struct pipe_rasterizer_state; @@ -96,7 +97,7 @@ struct vbuf_render { * the discretion of the driver, for the benefit of the passthrough * path. */ - void (*set_primitive)( struct vbuf_render *, unsigned prim ); + void (*set_primitive)( struct vbuf_render *, enum pipe_prim_type prim ); /** * Draw indexed primitives. Note that indices are ushort. The driver diff --git a/src/gallium/drivers/i915/i915_prim_vbuf.c b/src/gallium/drivers/i915/i915_prim_vbuf.c index c0ba23b..8f2f5c1 100644 --- a/src/gallium/drivers/i915/i915_prim_vbuf.c +++ b/src/gallium/drivers/i915/i915_prim_vbuf.c @@ -324,7 +324,7 @@ i915_vbuf_ensure_index_bounds(struct vbuf_render *render, static void i915_vbuf_render_set_primitive(struct vbuf_render *render, - unsigned prim) + enum pipe_prim_type prim) { struct i915_vbuf_render *i915_render = i915_vbuf_render(render); i915_render->prim = prim; diff --git a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c index d7fa9fd..28a48d4 100644 --- a/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c +++ b/src/gallium/drivers/llvmpipe/lp_setup_vbuf.c @@ -115,7 +115,7 @@ lp_setup_unmap_vertices(struct vbuf_render *vbr, static void -lp_setup_set_primitive(struct vbuf_render *vbr, unsigned prim) +lp_setup_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim) { lp_setup_context(vbr)->prim = prim; } diff --git a/src/gallium/drivers/nouveau/nv30/nv30_draw.c b/src/gallium/drivers/nouveau/nv30/nv30_draw.c index 4c587fc..798ec14 100644 --- a/src/gallium/drivers/nouveau/nv30/nv30_draw.c +++ b/src/gallium/drivers/nouveau/nv30/nv30_draw.c @@ -111,7 +111,7 @@ nv30_render_unmap_vertices(struct vbuf_render *render, } static void -nv30_render_set_primitive(struct vbuf_render *render, unsigned prim) +nv30_render_set_primitive(struct vbuf_render *render, enum pipe_prim_type prim) { struct nv30_render *r = nv30_render(render); diff --git a/src/gallium/drivers/r300/r300_render.c b/src/gallium/drivers/r300/r300_render.c index 4cae766..9397aae 100644 --- a/src/gallium/drivers/r300/r300_render.c +++ b/src/gallium/drivers/r300/r300_render.c @@ -964,7 +964,7 @@ static void r300_render_release_vertices(struct vbuf_render* render) } static void r300_render_set_primitive(struct vbuf_render* render, - unsigned prim) + enum pipe_prim_type prim) { struct r300_render* r300render = r300_render(render); diff --git a/src/gallium/drivers/softpipe/sp_prim_vbuf.c b/src/gallium/drivers/softpipe/sp_prim_vbuf.c index 95d1ac1..1ce04a2 100644 --- a/src/gallium/drivers/softpipe/sp_prim_vbuf.c +++ b/src/gallium/drivers/softpipe/sp_prim_vbuf.c @@ -60,7 +60,7 @@ struct softpipe_vbuf_render struct softpipe_context *softpipe; struct setup_context *setup; - uint prim; + enum pipe_prim_type prim; uint vertex_size; uint nr_vertices; uint vertex_buffer_size; @@ -133,7 +133,7 @@ sp_vbuf_unmap_vertices(struct vbuf_render *vbr, static void -sp_vbuf_set_primitive(struct vbuf_render *vbr, unsigned prim) +sp_vbuf_set_primitive(struct vbuf_render *vbr, enum pipe_prim_type prim) { struct softpipe_vbuf_render *cvbr = softpipe_vbuf_render(vbr); struct setup_context *setup_ctx = cvbr->setup; -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Android: move drivers' symlinks to /vendor (v2)
Having moved gallium_dri.so library to /vendor/lib/dri also symlinks need to be coherently created using TARGET_OUT_VENDOR instead of TARGET_OUT or all non Intel drivers will not be loaded with Android N and earlier, thus causing SurfaceFlinger SIGABRT (v2) simplification of post install command Fixes: c3f75d483c ("Android: move libraries to /vendor") Cc: 17.3--- src/gallium/targets/dri/Android.mk | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/gallium/targets/dri/Android.mk b/src/gallium/targets/dri/Android.mk index 61b65769ff..bd3613388f 100644 --- a/src/gallium/targets/dri/Android.mk +++ b/src/gallium/targets/dri/Android.mk @@ -70,8 +70,9 @@ LOCAL_SHARED_LIBRARIES += $(sort $(GALLIUM_SHARED_LIBS)) ifneq ($(filter 5 6 7, $(MESA_ANDROID_MAJOR_VERSION)),) LOCAL_POST_INSTALL_CMD := \ $(foreach l, lib $(if $(filter true,$(TARGET_IS_64_BIT)),lib64), \ - mkdir -p $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \ - $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \ + $(eval MESA_DRI_MODULE_PATH := $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)) \ + mkdir -p $(MESA_DRI_MODULE_PATH); \ + $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so $(MESA_DRI_MODULE_PATH)/$(d)_dri.so;) \ ) else LOCAL_MODULE_SYMLINKS := $(foreach d, $(GALLIUM_TARGET_DRIVERS), $(d)_dri.so) -- 2.14.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering
On Fri, Oct 27, 2017 at 12:24 PM, Nanley Cherywrote: > Only use CCS_E to render to a texture that is CCS_E-compatible with the > original texture's miptree (linear) format. This prevents render > operations from writing data that can't be decoded with the original > miptree format. > > On Gen10, with the new CCS_E-enabled formats handled, this enables the > driver to pass the arb_texture_view-rendering-formats piglit test. > > Cc: > Signed-off-by: Nanley Chery > --- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 > +-- > 1 file changed, 26 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > index a850f4d17b..59c57c227b 100644 > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct brw_context > *brw, > } > } > > +/** > + * Return true if the format that will be used to access the miptree is > + * CCS_E-compatible with the miptree's linear/non-sRGB format. > + * > + * Why use the linear format? Well, although the miptree may be specified > with > + * an sRGB format, the usage of that color space/format can be toggled. > Since > + * our HW tends to support more linear formats than sRGB ones, we use this > + * format variant for check for CCS_E compatibility. > + */ > +static bool > +format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo, > + const struct intel_mipmap_tree *mt, > + enum isl_format access_format) > +{ > + assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E); > + > + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); > + enum isl_format isl_format = brw_isl_format_for_mesa_ > format(linear_format); > + return isl_formats_are_ccs_e_compatible(devinfo, isl_format, > access_format); > +} > + > static bool > intel_miptree_supports_ccs_e(struct brw_context *brw, > const struct intel_mipmap_tree *mt) > @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct brw_context > *brw, >return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE; > > case ISL_AUX_USAGE_CCS_E: { > - /* If the format supports CCS_E, then we can just use it */ > - if (isl_format_supports_ccs_e(>screen->devinfo, > render_format)) > + /* If the format supports CCS_E and is compatible with the miptree, > + * then we can use it. > + */ > + if (format_ccs_e_compat_with_miptree(>screen->devinfo, > + mt, render_format)) > You don't need the helper if you just use mt->surf.format. That's what can_texture_with_ccs does. > return ISL_AUX_USAGE_CCS_E; > >/* Otherwise, we have to fall back to CCS_D */ > -- > 2.14.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] intel: common: silence compiler warning
Reviewed-by: Jordan JustenOn 2017-10-27 09:44:14, Lionel Landwerlin wrote: > Signed-off-by: Lionel Landwerlin > --- > src/intel/common/gen_decoder.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c > index 395ff02908a..55e7305117c 100644 > --- a/src/intel/common/gen_decoder.c > +++ b/src/intel/common/gen_decoder.c > @@ -556,7 +556,7 @@ gen_spec_load(const struct gen_device_info *devinfo) > { > struct parser_context ctx; > void *buf; > - uint8_t *text_data; > + uint8_t *text_data = NULL; > uint32_t text_offset = 0, text_length = 0, total_length; > uint32_t gen_10 = devinfo_to_gen(devinfo); > > -- > 2.15.0.rc2 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC 1/2] intel/isl: Disable some gen10 CCS_E formats for now
Thanks for doing this in isl_format_supports_ccs_e instead of changing the table! Reviewed-by: Jason EkstrandOn Fri, Oct 27, 2017 at 12:24 PM, Nanley Chery wrote: > CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2 > formats with CCS_E. None of these formats fit within the current > blorp_copy framework so disable them until support is added. > > Signed-off-by: Nanley Chery > Cc: Jason Ekstrand > --- > > Jason, do you think we modify blorp instead of moving forward with this > patch? In any case, I've sent this to the list to add context to my next > patch. > > src/intel/isl/isl_format.c | 24 > 1 file changed, 24 insertions(+) > > diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c > index fba3ac5e1a..03c591071b 100644 > --- a/src/intel/isl/isl_format.c > +++ b/src/intel/isl/isl_format.c > @@ -536,6 +536,30 @@ isl_format_supports_ccs_e(const struct > gen_device_info *devinfo, > if (!format_info[format].exists) >return false; > > + /* For simplicity, only report that a format supports CCS_E if blorp > can > +* perform bit-for-bit copies with an image of that format while > compressed. > +* This allows ISL users to avoid having to resolve the image before > +* performing such a copy. We may want to change this behavior in the > +* future. > +* > +* R11G11B10_FLOAT has no equivalent UINT format. Given how blorp_copy > +* currently works, bit-for-bit copy operations are not possible > without an > +* intermediate resolve. > +*/ > + if (format == ISL_FORMAT_R11G11B10_FLOAT) > + return false; > + > + /* blorp_copy currently doesn't support formats with different > bit-widths > +* per-channel. Until that support is added, report that these formats > don't > +* support CCS_E. FIXME: Add support for these formats. > +*/ > + if (format == ISL_FORMAT_B10G10R10A2_UNORM || > + format == ISL_FORMAT_B10G10R10A2_UNORM_SRGB || > + format == ISL_FORMAT_R10G10B10A2_UNORM || > + format == ISL_FORMAT_R10G10B10A2_UINT) { > + return false; > + } > + > return format_gen(devinfo) >= format_info[format].ccs_e; > } > > -- > 2.14.3 > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] intel/genxml: Fix decoding of groups with fields smaller than a DWord.
On 2017-10-25 21:17:13, Kenneth Graunke wrote: > Groups containing fields smaller than a DWord were not being decoded > correctly. For example: > > > > > > gen_field_iterator_next would properly walk over each element of the > array, incrementing group_iter, and calling iter_group_offset_bits() > to advance to the proper DWord. However, the code to print the actual > values only considered iter->field->start/end, which are 0 and 3 in the > above example. So it would always fetch bits 3:0 of the current DWord > when printing values, instead of advancing to each element of the array, > printing bits 0-3, 4-7, 8-11, and so on. > > To fix this, we add new iter->start/end tracking, which properly > advances for each instance of a group's field. > > Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING, > with a patch to convert it to use an array of bitfields (the example > above). > > This also fixes the decoding of 3DSTATE_SBE's "Attribute Active > Component Format" fields. > --- > src/intel/common/gen_decoder.c | 24 ++-- > src/intel/common/gen_decoder.h | 2 ++ > 2 files changed, 16 insertions(+), 10 deletions(-) > > diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c > index 85880143f00..1bf69ac4f94 100644 > --- a/src/intel/common/gen_decoder.c > +++ b/src/intel/common/gen_decoder.c > @@ -843,8 +843,12 @@ iter_advance_field(struct gen_field_iterator *iter) > strncpy(iter->name, iter->field->name, sizeof(iter->name)); > else >memset(iter->name, 0, sizeof(iter->name)); > - iter->dword = iter_group_offset_bits(iter, iter->group_iter) / 32 + > - iter->field->start / 32; > + > + int group_member_offset = iter_group_offset_bits(iter, iter->group_iter); > + > + iter->start = group_member_offset + iter->field->start; > + iter->end = group_member_offset + iter->field->end; > + iter->dword = iter->start / 32; > iter->struct_desc = NULL; > > return true; > @@ -861,7 +865,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter) > if (!iter_advance_field(iter)) >return false; > > - if ((iter->field->end - iter->field->start) > 32) > + if ((iter->end - iter->start) > 32) >v.qw = ((uint64_t) iter->p[iter->dword+1] << 32) | > iter->p[iter->dword]; > else >v.qw = iter->p[iter->dword]; > @@ -871,13 +875,13 @@ gen_field_iterator_next(struct gen_field_iterator *iter) > switch (iter->field->type.kind) { > case GEN_TYPE_UNKNOWN: > case GEN_TYPE_INT: { > - uint64_t value = field(v.qw, iter->field->start, iter->field->end); > + uint64_t value = field(v.qw, iter->start, iter->end); >snprintf(iter->value, sizeof(iter->value), "%"PRId64, value); >enum_name = gen_get_enum_name(>field->inline_enum, value); >break; > } > case GEN_TYPE_UINT: { > - uint64_t value = field(v.qw, iter->field->start, iter->field->end); > + uint64_t value = field(v.qw, iter->start, iter->end); >snprintf(iter->value, sizeof(iter->value), "%"PRIu64, value); >enum_name = gen_get_enum_name(>field->inline_enum, value); >break; > @@ -886,7 +890,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter) >const char *true_string = > iter->print_colors ? "\e[0;35mtrue\e[0m" : "true"; >snprintf(iter->value, sizeof(iter->value), "%s", > - field(v.qw, iter->field->start, iter->field->end) ? > + field(v.qw, iter->start, iter->end) ? > true_string : "false"); >break; > } > @@ -896,7 +900,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter) > case GEN_TYPE_ADDRESS: > case GEN_TYPE_OFFSET: >snprintf(iter->value, sizeof(iter->value), "0x%08"PRIx64, > - field_address(v.qw, iter->field->start, iter->field->end)); > + field_address(v.qw, iter->start, iter->end)); >break; > case GEN_TYPE_STRUCT: >snprintf(iter->value, sizeof(iter->value), "", > @@ -907,8 +911,8 @@ gen_field_iterator_next(struct gen_field_iterator *iter) >break; > case GEN_TYPE_UFIXED: >snprintf(iter->value, sizeof(iter->value), "%f", > - (float) field(v.qw, iter->field->start, > - iter->field->end) / (1 << iter->field->type.f)); > + (float) field(v.qw, iter->start, > + iter->end) / (1 << iter->field->type.f)); This line break seems odd. Could 'iter->end) /' move to the previous line? Either way, Reviewed-by: Jordan Justen>break; > case GEN_TYPE_SFIXED: >/* FIXME: Sign extend extracted field. */ > @@ -917,7 +921,7 @@ gen_field_iterator_next(struct gen_field_iterator *iter) > case GEN_TYPE_MBO: > break; > case GEN_TYPE_ENUM: { > - uint64_t value = field(v.qw, iter->field->start, iter->field->end); > + uint64_t value =
Re: [Mesa-dev] [PATCH v3 31/48] intel/cs: Re-run final NIR optimizations for each SIMD size
On Fri, Oct 27, 2017 at 3:03 AM, Iago Toralwrote: > This should be squashed into the previous commit > Oops... Fixed. > On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > > With the advent of SPIR-V subgroup operations, compute shaders will > > have > > to be slightly different depending on the SIMD size at which they > > execute. In order to allow us to do dispatch-width specific things > > in > > NIR, we re-run the final NIR stages for each sIMD width. > > > > As a side-effect of this change, we start using ralloc on fs_visitor > > so > > we need to add DECLARE_RALLOC_OPERATORS to fs_visitor. > > --- > > src/intel/compiler/brw_fs.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/src/intel/compiler/brw_fs.h > > b/src/intel/compiler/brw_fs.h > > index d3ab385..9ff06b6 100644 > > --- a/src/intel/compiler/brw_fs.h > > +++ b/src/intel/compiler/brw_fs.h > > @@ -60,7 +60,7 @@ offset(const fs_reg , const brw::fs_builder > > , unsigned delta) > > class fs_visitor : public backend_shader > > { > > public: > > - DECLARE_RALLOC_CXX_OPERATORS(fs_reg) > > + DECLARE_RALLOC_CXX_OPERATORS(fs_visitor) > > > > fs_visitor(const struct brw_compiler *compiler, void *log_data, > >void *mem_ctx, > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 38/48] intel/fs: Don't use automatic exec size
On Fri, Oct 27, 2017 at 4:47 AM, Iago Toralwrote: > On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > > The automatic exec size inference can accidentally mess things up if > > we're not careful. For instance, if we have > > > > add(4)g38.2<4>Dg38.1<8,2,4>Dg38.2<8,2,4>D > > > > then the destination register will end up having a width of 2 with a > > horizontal stride of 4 and a vertical stride of 8. The EU emit code > > sees the width of 2 and decides that we really wanted an exec size of > > 2 > > which doesn't do what we wanted. > > Right :-/ > > I have to say that this change makes me a little nervous, mostly > because it doesn't look easy to identify all the cases where we were > relying on automatic execsizes to fix things up for us... since this is > not as easy as to look for places where we use 'vec1' or something like > that. How did you get the list of things that needed explicit sizes? > Lots of grep :) If I recall correctly, I searched for EXECUTE_1, vec1, EXECUTE_2, and vec2. > Also, both commits before this address cases of exec_size = 1, but we > rely on automatic exec sizes for exec_size = 2 as well, I guess we have > none of these? > > Anyway, I guess Jenkins would have caught at least most omissions so > maybe I am being too paranoid. That's my hope as well. :-) Durring the debugging of this chunk of the series, Jenkins found quite a few errors. Once Jenkins was happy, I did one more pass of grep to be sure. --Jason > > --- > > src/intel/compiler/brw_fs_generator.cpp | 12 +--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs_generator.cpp > > b/src/intel/compiler/brw_fs_generator.cpp > > index 8322be1..46f9a33 100644 > > --- a/src/intel/compiler/brw_fs_generator.cpp > > +++ b/src/intel/compiler/brw_fs_generator.cpp > > @@ -190,6 +190,12 @@ fs_generator::fs_generator(const struct > > brw_compiler *compiler, void *log_data, > > { > > p = rzalloc(mem_ctx, struct brw_codegen); > > brw_init_codegen(devinfo, p, mem_ctx); > > + > > + /* In the FS code generator, we are very careful to ensure that > > we always > > +* set the right execution size so we don't need the EU code to > > "help" us > > +* by trying to infer it. Sometimes, it infers the wrong thing. > > +*/ > > + p->automatic_exec_sizes = false; > > } > > > > fs_generator::~fs_generator() > > @@ -395,17 +401,17 @@ fs_generator::generate_fb_write(fs_inst *inst, > > struct brw_reg payload) > >struct brw_reg v1_null_ud = vec1(retype(brw_null_reg(), > > BRW_REGISTER_TYPE_UD)); > > > >/* Check runtime bit to detect if we have to send AA data or > > not */ > > - brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > >brw_push_insn_state(p); > > - brw_inst_set_exec_size(p->devinfo, brw_last_inst, > > BRW_EXECUTE_1); > > + brw_set_default_compression_control(p, BRW_COMPRESSION_NONE); > > + brw_set_default_exec_size(p, BRW_EXECUTE_1); > >brw_AND(p, > >v1_null_ud, > >retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_UD), > >brw_imm_ud(1<<26)); > >brw_inst_set_cond_modifier(p->devinfo, brw_last_inst, > > BRW_CONDITIONAL_NZ); > > - brw_pop_insn_state(p); > > > >int jmp = brw_JMPI(p, brw_imm_ud(0), BRW_PREDICATE_NORMAL) - > > p->store; > > + brw_pop_insn_state(p); > >{ > > /* Don't send AA data */ > > fire_fb_write(inst, offset(payload, 1), implied_header, > > inst->mlen-1); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 23/48] intel/fs: Assign constant locations if they haven't been assigned
On Fri, Oct 27, 2017 at 12:35 AM, Iago Toralwrote: > This sounds good to me, but I guess it is not really fixing anything, > right? I ask because the subject claims that this patch does something > that the original code was already supposed to be doing. > This patch is a bit of an artifact of history. I needed it at one point in the development of the series but I think it may have ended up not mattering in the end. I still think it's a nice clean-up. --Jason > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote: > > Before, we bailing in assign_constant_locations based on the minimum > > dispatch size. The more direct thing to do is simply to check for > > whether or not we have constant locations and bail if we do. For > > nir_setup_uniforms, it's completely safe to do it multiple times > > because > > we just copy a value from the NIR shader. > > --- > > src/intel/compiler/brw_fs.cpp | 4 +++- > > src/intel/compiler/brw_fs_nir.cpp | 5 - > > 2 files changed, 7 insertions(+), 2 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs.cpp > > b/src/intel/compiler/brw_fs.cpp > > index 52079d3..75139fd 100644 > > --- a/src/intel/compiler/brw_fs.cpp > > +++ b/src/intel/compiler/brw_fs.cpp > > @@ -1956,8 +1956,10 @@ void > > fs_visitor::assign_constant_locations() > > { > > /* Only the first compile gets to decide on locations. */ > > - if (dispatch_width != min_dispatch_width) > > + if (push_constant_loc) { > > + assert(pull_constant_loc); > >return; > > + } > > > > bool is_live[uniforms]; > > memset(is_live, 0, sizeof(is_live)); > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > b/src/intel/compiler/brw_fs_nir.cpp > > index 7556576..05efee3 100644 > > --- a/src/intel/compiler/brw_fs_nir.cpp > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > @@ -81,8 +81,11 @@ fs_visitor::nir_setup_outputs() > > void > > fs_visitor::nir_setup_uniforms() > > { > > - if (dispatch_width != min_dispatch_width) > > + /* Only the first compile gets to set up uniforms. */ > > + if (push_constant_loc) { > > + assert(pull_constant_loc); > >return; > > + } > > > > uniforms = nir->num_uniforms / 4; > > } > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 29/48] intel/cs: Rework the way thread local ID is handled
On Fri, Oct 27, 2017 at 2:11 AM, Iago Toralwrote: > On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > > Previously, brw_nir_lower_intrinsics added the param and then emitted > > a > > load_uniform intrinsic to load it directly. This commit switches > > things > > over to use a specific NIR intrinsic for the thread id. The one > > thing I > > don't like about this approach is that we have to copy > > thread_local_id > > over to the new visitor in import_uniforms. > > It is not clear to me why you are doing this... why do you like this > better? > For compute shaders, the SPIR-V subgroups stuff has a gl_subgroupId system value which subgroup in the dispatch you are. That information is basically the same as the thread_local_id only off by a factor of the SIMD size. It's fairly arbitrary, but I figured we might as well switch over to pushing the value that's defined in SPIR-V. > > --- > > src/compiler/nir/nir_intrinsics.h| 3 ++ > > src/intel/compiler/brw_fs.cpp| 4 +- > > src/intel/compiler/brw_fs.h | 1 + > > src/intel/compiler/brw_fs_nir.cpp| 14 +++ > > src/intel/compiler/brw_nir.h | 3 +- > > src/intel/compiler/brw_nir_lower_cs_intrinsics.c | 53 +- > > -- > > 6 files changed, 32 insertions(+), 46 deletions(-) > > > > diff --git a/src/compiler/nir/nir_intrinsics.h > > b/src/compiler/nir/nir_intrinsics.h > > index cefd18b..47022dd 100644 > > --- a/src/compiler/nir/nir_intrinsics.h > > +++ b/src/compiler/nir/nir_intrinsics.h > > @@ -364,6 +364,9 @@ SYSTEM_VALUE(blend_const_color_a_float, 1, 0, xx, > > xx, xx) > > SYSTEM_VALUE(blend_const_color_rgba_unorm, 1, 0, xx, xx, xx) > > SYSTEM_VALUE(blend_const_color__unorm, 1, 0, xx, xx, xx) > > > > +/* Intel specific system values */ > > +SYSTEM_VALUE(intel_thread_local_id, 1, 0, xx, xx, xx) > > + > > /** > > * Barycentric coordinate intrinsics. > > * > > diff --git a/src/intel/compiler/brw_fs.cpp > > b/src/intel/compiler/brw_fs.cpp > > index 2acd838..c0d4c05 100644 > > --- a/src/intel/compiler/brw_fs.cpp > > +++ b/src/intel/compiler/brw_fs.cpp > > @@ -996,6 +996,7 @@ fs_visitor::import_uniforms(fs_visitor *v) > > this->push_constant_loc = v->push_constant_loc; > > this->pull_constant_loc = v->pull_constant_loc; > > this->uniforms = v->uniforms; > > + this->thread_local_id = v->thread_local_id; > > } > > > > void > > @@ -6781,8 +6782,7 @@ brw_compile_cs(const struct brw_compiler > > *compiler, void *log_data, > > { > > nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); > > shader = brw_nir_apply_sampler_key(shader, compiler, >tex, > > true); > > - > > - brw_nir_lower_cs_intrinsics(shader, prog_data); > > + brw_nir_lower_cs_intrinsics(shader); > > shader = brw_postprocess_nir(shader, compiler, true); > > > > prog_data->local_size[0] = shader->info.cs.local_size[0]; > > diff --git a/src/intel/compiler/brw_fs.h > > b/src/intel/compiler/brw_fs.h > > index da32593..f51a4d8 100644 > > --- a/src/intel/compiler/brw_fs.h > > +++ b/src/intel/compiler/brw_fs.h > > @@ -315,6 +315,7 @@ public: > > */ > > int *push_constant_loc; > > > > + fs_reg thread_local_id; > > fs_reg frag_depth; > > fs_reg frag_stencil; > > fs_reg sample_mask; > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > b/src/intel/compiler/brw_fs_nir.cpp > > index 05efee3..fdc6fc6 100644 > > --- a/src/intel/compiler/brw_fs_nir.cpp > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > @@ -88,6 +88,16 @@ fs_visitor::nir_setup_uniforms() > > } > > > > uniforms = nir->num_uniforms / 4; > > + > > + if (stage == MESA_SHADER_COMPUTE) { > > + /* Add a uniform for the thread local id. It must be the last > > uniform > > + * on the list. > > + */ > > + assert(uniforms == prog_data->nr_params); > > + uint32_t *param = brw_stage_prog_data_add_params(prog_data, > > 1); > > + *param = BRW_PARAM_BUILTIN_THREAD_LOCAL_ID; > > + thread_local_id = fs_reg(UNIFORM, uniforms++, > > BRW_REGISTER_TYPE_UD); > > + } > > } > > > > static bool > > @@ -3409,6 +3419,10 @@ fs_visitor::nir_emit_cs_intrinsic(const > > fs_builder , > >cs_prog_data->uses_barrier = true; > >break; > > > > + case nir_intrinsic_load_intel_thread_local_id: > > + bld.MOV(retype(dest, BRW_REGISTER_TYPE_UD), thread_local_id); > > + break; > > + > > case nir_intrinsic_load_local_invocation_id: > > case nir_intrinsic_load_work_group_id: { > >gl_system_value sv = nir_system_value_from_intrinsic(instr- > > >intrinsic); > > diff --git a/src/intel/compiler/brw_nir.h > > b/src/intel/compiler/brw_nir.h > > index 1493b74..3e40712 100644 > > --- a/src/intel/compiler/brw_nir.h > > +++ b/src/intel/compiler/brw_nir.h > > @@ -95,8 +95,7 @@ void brw_nir_analyze_boolean_resolves(nir_shader > > *nir); > > nir_shader *brw_preprocess_nir(const struct
Re: [Mesa-dev] [PATCH v3 25/48] intel/cs: Drop max_dispatch_width checks from compile_cs
On Fri, Oct 27, 2017 at 1:18 AM, Iago Toralwrote: > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote: > > The only things that adjust fs_visitor::max_dispatch_width are render > > target writes which don't happen in compute shaders so they're > > pointless. > > --- > > src/intel/compiler/brw_fs.cpp | 6 ++ > > 1 file changed, 2 insertions(+), 4 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs.cpp > > b/src/intel/compiler/brw_fs.cpp > > index a23366b..4c362ba 100644 > > --- a/src/intel/compiler/brw_fs.cpp > > +++ b/src/intel/compiler/brw_fs.cpp > > Maybe add an assert before this to check that max_dispatch_width is >= > 32 as expected here? > Done. > > @@ -6818,8 +6818,7 @@ brw_compile_cs(const struct brw_compiler > > *compiler, void *log_data, > > NULL, /* Never used in core profile */ > > shader, 16, shader_time_index); > > if (likely(!(INTEL_DEBUG & DEBUG_NO16)) && > > - !fail_msg && v8.max_dispatch_width >= 16 && > > - min_dispatch_width <= 16) { > > + !fail_msg && min_dispatch_width <= 16) { > >/* Try a SIMD16 compile */ > >if (min_dispatch_width <= 8) > > v16.import_uniforms(); > > @@ -6843,8 +6842,7 @@ brw_compile_cs(const struct brw_compiler > > *compiler, void *log_data, > > fs_visitor v32(compiler, log_data, mem_ctx, key, _data- > > >base, > > NULL, /* Never used in core profile */ > > shader, 32, shader_time_index); > > - if (!fail_msg && v8.max_dispatch_width >= 32 && > > - (min_dispatch_width > 16 || (INTEL_DEBUG & DEBUG_DO32))) { > > + if (!fail_msg && (min_dispatch_width > 16 || ( > > Maybe use unlikely() with (INTEL_DEBUG & DEBUG_DO32)? > That doesn't really go along with this change. I've been meaning to rework DO32 and I'm happy to fix it as part of that. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 1/3] glsl_parser_extra: Add utility to copy symbols between symbol tables
From: Eduardo Lima MitevSome symbols gathered in the symbols table during parsing are needed later for the compile and link stages, so they are moved along the process. Currently, only functions and non-temporary variables are copied between symbol tables. However, the built-in gl_PerVertex interface blocks are also needed during the linking stage (the last step), to match re-declared blocks of inter-stage shaders. This patch adds a new utility function that will factorize current code that copies functions and variables between two symbol tables, and in addition will copy explicitly declared gl_PerVertex blocks too. The function will be used in a subsequent patch. v2 (Neil Roberts): Allow the src symbol table to be NULL and explicitly copy the gl_PerVertex symbols in case they are not referenced in the exec_list. Signed-off-by: Eduardo Lima Mitev Signed-off-by: Neil Roberts --- src/compiler/glsl/glsl_parser_extras.cpp | 43 src/compiler/glsl/glsl_parser_extras.h | 5 2 files changed, 48 insertions(+) diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp index 51d835b..ec4603d 100644 --- a/src/compiler/glsl/glsl_parser_extras.cpp +++ b/src/compiler/glsl/glsl_parser_extras.cpp @@ -1863,6 +1863,49 @@ set_shader_inout_layout(struct gl_shader *shader, shader->bound_image = state->bound_image_specified; } +/* src can be NULL if only the symbols found in the exec_list should be + * copied + */ +void +_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir, + struct glsl_symbol_table *src, + struct glsl_symbol_table *dest) +{ + foreach_in_list (ir_instruction, ir, shader_ir) { + switch (ir->ir_type) { + case ir_type_function: + dest->add_function((ir_function *) ir); + break; + case ir_type_variable: { + ir_variable *const var = (ir_variable *) ir; + + if (var->data.mode != ir_var_temporary) +dest->add_variable(var); + break; + } + default: + break; + } + } + + if (src != NULL) { + /* Explicitly copy the gl_PerVertex interface definitions because these + * are needed to check they are the same during the interstage link. + * They can’t necessarily be found via the exec_list because the members + * might not be referenced. The GL spec still requires that they match + * in that case. + */ + const glsl_type *iface = + src->get_interface("gl_PerVertex", ir_var_shader_in); + if (iface) + dest->add_interface(iface->name, iface, ir_var_shader_in); + + iface = src->get_interface("gl_PerVertex", ir_var_shader_out); + if (iface) + dest->add_interface(iface->name, iface, ir_var_shader_out); + } +} + extern "C" { static void diff --git a/src/compiler/glsl/glsl_parser_extras.h b/src/compiler/glsl/glsl_parser_extras.h index fb35813..2e98bc7 100644 --- a/src/compiler/glsl/glsl_parser_extras.h +++ b/src/compiler/glsl/glsl_parser_extras.h @@ -948,6 +948,11 @@ extern int glcpp_preprocess(void *ctx, const char **shader, char **info_log, extern void _mesa_destroy_shader_compiler(void); extern void _mesa_destroy_shader_compiler_caches(void); +extern void +_mesa_glsl_copy_symbols_from_table(struct exec_list *shader_ir, + struct glsl_symbol_table *src, + struct glsl_symbol_table *dest); + #ifdef __cplusplus } #endif -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 3/3] glsl/linker: Check that re-declared, inter-shader built-in blocks match
From: Eduardo Lima Mitev>From GLSL 4.5 spec, section "7.1 Built-In Language Variables", page 130 of the PDF states: "If multiple shaders using members of a built-in block belonging to the same interface are linked together in the same program, they must all redeclare the built-in block in the same way, as described in section 4.3.9 “Interface Blocks” for interface-block matching, or a link-time error will result." Fixes: * GL45-CTS.CommonBugs.CommonBug_PerVertexValidation v2 (Neil Roberts): Explicitly look for gl_PerVertex in the symbol tables instead of waiting to find a variable in the interface. Signed-off-by: Eduardo Lima Mitev Signed-off-by: Neil Roberts --- src/compiler/glsl/link_interface_blocks.cpp | 29 + 1 file changed, 29 insertions(+) diff --git a/src/compiler/glsl/link_interface_blocks.cpp b/src/compiler/glsl/link_interface_blocks.cpp index 7037c77..510d4f7 100644 --- a/src/compiler/glsl/link_interface_blocks.cpp +++ b/src/compiler/glsl/link_interface_blocks.cpp @@ -364,6 +364,35 @@ validate_interstage_inout_blocks(struct gl_shader_program *prog, consumer->Stage != MESA_SHADER_FRAGMENT) || consumer->Stage == MESA_SHADER_GEOMETRY; + /* Check that block re-declarations of gl_PerVertex are compatible +* across shaders: From OpenGL Shading Language 4.5, section +* "7.1 Built-In Language Variables", page 130 of the PDF: +* +*"If multiple shaders using members of a built-in block belonging +* to the same interface are linked together in the same program, +* they must all redeclare the built-in block in the same way, as +* described in section 4.3.9 “Interface Blocks” for interface-block +* matching, or a link-time error will result." +* +* This is done explicitly outside of iterating the member variable +* declarations because it is possible that the variables are not used and +* so they would have been optimised out. +*/ + const glsl_type *consumer_iface = + consumer->symbols->get_interface("gl_PerVertex", + ir_var_shader_in); + + const glsl_type *producer_iface = + producer->symbols->get_interface("gl_PerVertex", + ir_var_shader_out); + + if (producer_iface && consumer_iface && + interstage_member_mismatch(prog, consumer_iface, producer_iface)) { + linker_error(prog, "Incompatible or missing gl_PerVertex re-declaration " + "in consecutive shaders"); + return; + } + /* Add output interfaces from the producer to the symbol table. */ foreach_in_list(ir_instruction, node, producer->ir) { ir_variable *var = node->as_variable(); -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 2/3] glsl: Use the utility function to copy symbols between symbol tables
From: Eduardo Lima MitevThis effectively factorizes a couple of similar routines. v2 (Neil Roberts): Non-trivial rebase on master Signed-off-by: Eduardo Lima Mitev Signed-off-by: Neil Roberts --- src/compiler/glsl/glsl_parser_extras.cpp | 25 +++-- src/compiler/glsl/linker.cpp | 16 +++- 2 files changed, 10 insertions(+), 31 deletions(-) diff --git a/src/compiler/glsl/glsl_parser_extras.cpp b/src/compiler/glsl/glsl_parser_extras.cpp index ec4603d..1fea4d6 100644 --- a/src/compiler/glsl/glsl_parser_extras.cpp +++ b/src/compiler/glsl/glsl_parser_extras.cpp @@ -1979,6 +1979,7 @@ do_late_parsing_checks(struct _mesa_glsl_parse_state *state) static void opt_shader_and_create_symbol_table(struct gl_context *ctx, + struct glsl_symbol_table *source_symbols, struct gl_shader *shader) { assert(shader->CompileStatus != compile_failure && @@ -2036,22 +2037,8 @@ opt_shader_and_create_symbol_table(struct gl_context *ctx, * We don't have to worry about types or interface-types here because those * are fly-weights that are looked up by glsl_type. */ - foreach_in_list (ir_instruction, ir, shader->ir) { - switch (ir->ir_type) { - case ir_type_function: - shader->symbols->add_function((ir_function *) ir); - break; - case ir_type_variable: { - ir_variable *const var = (ir_variable *) ir; - - if (var->data.mode != ir_var_temporary) -shader->symbols->add_variable(var); - break; - } - default: - break; - } - } + _mesa_glsl_copy_symbols_from_table(shader->ir, source_symbols, + shader->symbols); } void @@ -2088,7 +2075,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader, return; if (shader->CompileStatus == compiled_no_opts) { - opt_shader_and_create_symbol_table(ctx, shader); + opt_shader_and_create_symbol_table(ctx, +NULL, /* source_symbols */ +shader); shader->CompileStatus = compile_success; return; } @@ -2149,7 +2138,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct gl_shader *shader, lower_subroutine(shader->ir, state); if (!ctx->Cache || force_recompile) - opt_shader_and_create_symbol_table(ctx, shader); + opt_shader_and_create_symbol_table(ctx, state->symbols, shader); else { reparent_ir(shader->ir, shader->ir); shader->CompileStatus = compiled_no_opts; diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp index f827b68..0045291 100644 --- a/src/compiler/glsl/linker.cpp +++ b/src/compiler/glsl/linker.cpp @@ -1255,21 +1255,11 @@ interstage_cross_validate_uniform_blocks(struct gl_shader_program *prog, * Populates a shaders symbol table with all global declarations */ static void -populate_symbol_table(gl_linked_shader *sh) +populate_symbol_table(gl_linked_shader *sh, glsl_symbol_table *symbols) { sh->symbols = new(sh) glsl_symbol_table; - foreach_in_list(ir_instruction, inst, sh->ir) { - ir_variable *var; - ir_function *func; - - if ((func = inst->as_function()) != NULL) { - sh->symbols->add_function(func); - } else if ((var = inst->as_variable()) != NULL) { - if (var->data.mode != ir_var_temporary) -sh->symbols->add_variable(var); - } - } + _mesa_glsl_copy_symbols_from_table(sh->ir, symbols, sh->symbols); } @@ -2288,7 +2278,7 @@ link_intrastage_shaders(void *mem_ctx, link_bindless_layout_qualifiers(prog, shader_list, num_shaders); - populate_symbol_table(linked); + populate_symbol_table(linked, shader_list[0]->symbols); /* The pointer to the main function in the final linked shader (i.e., the * copy of the original shader that contained the main function). -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v2 0/3] Fix for PerVertexValidation CTS test
Hi, Here is a v2 of Eduardo’s patches to fix the PerVertexValidation CTS test. They were originally posted here: https://lists.freedesktop.org/archives/mesa-dev/2017-March/146667.html The patches needed a non-trivial rebase. The previous discussion was left with a note saying that there are more cases that should fail but that weren’t handled by the patches and aren’t checked by the CTS test. Ie, if the consuming shader doesn’t reference any of the members of gl_PerVertex then it wouldn’t find the block redefinition. This v2 addresses that by directly fetching the glsl_type for the gl_PerVertex block from the symbol table rather waiting for an instruction which references it. I’ve tested it with Piglit and the public CTS repo and it doesn’t cause any regressions, and of course it fixes the CTS test. It might be nice to add a Piglit or CTS test to check this missing case. - Neil Eduardo Lima Mitev (3): glsl_parser_extra: Add utility to copy symbols between symbol tables glsl: Use the utility function to copy symbols between symbol tables glsl/linker: Check that re-declared, inter-shader built-in blocks match src/compiler/glsl/glsl_parser_extras.cpp| 68 + src/compiler/glsl/glsl_parser_extras.h | 5 +++ src/compiler/glsl/link_interface_blocks.cpp | 29 src/compiler/glsl/linker.cpp| 16 ++- 4 files changed, 87 insertions(+), 31 deletions(-) -- 2.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 20/48] intel/fs: Protect opt_algebraic from OOB BROADCAST indices
On Fri, Oct 27, 2017 at 12:09 AM, Iago Toralwrote: > > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote: > > --- > > src/intel/compiler/brw_fs.cpp | 10 -- > > 1 file changed, 8 insertions(+), 2 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs.cpp > > b/src/intel/compiler/brw_fs.cpp > > index 1c4351b..52079d3 100644 > > --- a/src/intel/compiler/brw_fs.cpp > > +++ b/src/intel/compiler/brw_fs.cpp > > @@ -2416,8 +2416,14 @@ fs_visitor::opt_algebraic() > > progress = true; > > } else if (inst->src[1].file == IMM) { > > inst->opcode = BRW_OPCODE_MOV; > > -inst->src[0] = component(inst->src[0], > > - inst->src[1].ud); > > +/* It's possible that the selected component will be too > > large and > > + * overflow the register. If this happens and we some > > how manage > > + * to constant fold it in and get here, it would cause > > an assert > > + * in component() below. Instead, just let it wrap > > around if it > > + * goes over exec_size. > > + */ > > component() is really a horiz_offset() call which is in turn a > byte_offset() call, which doesn't assert on anything other than invalid > register files. I guess you mean that the byte offset computed by the > component() call below can later lead to hitting assertions as we > attempt to read outside the allocated space for the vgrf, right? > Yes. Nothing asserts here, we just end up reading way past the end of the VGRF and the validator which checks whether or not we keep all our reads in-bounds throws a fit. > My question is whether this is supposed to happen at all, it seems like > we should not be emitting broadcast operations like this at all since > they are invalid and here we are instead papering over that bug. > This can happen if someone does a read_invocation from SPIR-V or GLSL that has an invocation index that is too large. We're allowed to give them garbage values, but asserting isn't nice. I've updated the comment to: /* It's possible that the selected component will be too large and * overflow the register. This can happen if someone does a * readInvocation() from GLSL or SPIR-V and provides an OOB * invocationIndex. If this happens and we some how manage * to constant fold it in and get here, then component() may cause * us to start reading outside of the VGRF which will lead to an * assert later. Instead, just let it wrap around if it goes over * exec_size. */ --Jason ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [RFC 1/2] intel/isl: Disable some gen10 CCS_E formats for now
CannonLake additionally supports R11G11B10_FLOAT and four 10-10-10-2 formats with CCS_E. None of these formats fit within the current blorp_copy framework so disable them until support is added. Signed-off-by: Nanley CheryCc: Jason Ekstrand --- Jason, do you think we modify blorp instead of moving forward with this patch? In any case, I've sent this to the list to add context to my next patch. src/intel/isl/isl_format.c | 24 1 file changed, 24 insertions(+) diff --git a/src/intel/isl/isl_format.c b/src/intel/isl/isl_format.c index fba3ac5e1a..03c591071b 100644 --- a/src/intel/isl/isl_format.c +++ b/src/intel/isl/isl_format.c @@ -536,6 +536,30 @@ isl_format_supports_ccs_e(const struct gen_device_info *devinfo, if (!format_info[format].exists) return false; + /* For simplicity, only report that a format supports CCS_E if blorp can +* perform bit-for-bit copies with an image of that format while compressed. +* This allows ISL users to avoid having to resolve the image before +* performing such a copy. We may want to change this behavior in the +* future. +* +* R11G11B10_FLOAT has no equivalent UINT format. Given how blorp_copy +* currently works, bit-for-bit copy operations are not possible without an +* intermediate resolve. +*/ + if (format == ISL_FORMAT_R11G11B10_FLOAT) + return false; + + /* blorp_copy currently doesn't support formats with different bit-widths +* per-channel. Until that support is added, report that these formats don't +* support CCS_E. FIXME: Add support for these formats. +*/ + if (format == ISL_FORMAT_B10G10R10A2_UNORM || + format == ISL_FORMAT_B10G10R10A2_UNORM_SRGB || + format == ISL_FORMAT_R10G10B10A2_UNORM || + format == ISL_FORMAT_R10G10B10A2_UINT) { + return false; + } + return format_gen(devinfo) >= format_info[format].ccs_e; } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965: Check CCS_E compatibility for texture view rendering
Only use CCS_E to render to a texture that is CCS_E-compatible with the original texture's miptree (linear) format. This prevents render operations from writing data that can't be decoded with the original miptree format. On Gen10, with the new CCS_E-enabled formats handled, this enables the driver to pass the arb_texture_view-rendering-formats piglit test. Cc:Signed-off-by: Nanley Chery --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 28 +-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index a850f4d17b..59c57c227b 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -241,6 +241,27 @@ intel_miptree_supports_hiz(const struct brw_context *brw, } } +/** + * Return true if the format that will be used to access the miptree is + * CCS_E-compatible with the miptree's linear/non-sRGB format. + * + * Why use the linear format? Well, although the miptree may be specified with + * an sRGB format, the usage of that color space/format can be toggled. Since + * our HW tends to support more linear formats than sRGB ones, we use this + * format variant for check for CCS_E compatibility. + */ +static bool +format_ccs_e_compat_with_miptree(const struct gen_device_info *devinfo, + const struct intel_mipmap_tree *mt, + enum isl_format access_format) +{ + assert(mt->aux_usage == ISL_AUX_USAGE_CCS_E); + + mesa_format linear_format = _mesa_get_srgb_format_linear(mt->format); + enum isl_format isl_format = brw_isl_format_for_mesa_format(linear_format); + return isl_formats_are_ccs_e_compatible(devinfo, isl_format, access_format); +} + static bool intel_miptree_supports_ccs_e(struct brw_context *brw, const struct intel_mipmap_tree *mt) @@ -2662,8 +2683,11 @@ intel_miptree_render_aux_usage(struct brw_context *brw, return mt->mcs_buf ? ISL_AUX_USAGE_CCS_D : ISL_AUX_USAGE_NONE; case ISL_AUX_USAGE_CCS_E: { - /* If the format supports CCS_E, then we can just use it */ - if (isl_format_supports_ccs_e(>screen->devinfo, render_format)) + /* If the format supports CCS_E and is compatible with the miptree, + * then we can use it. + */ + if (format_ccs_e_compat_with_miptree(>screen->devinfo, + mt, render_format)) return ISL_AUX_USAGE_CCS_E; /* Otherwise, we have to fall back to CCS_D */ -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 19/48] i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src
On Thu, Oct 26, 2017 at 11:53 PM, Iago Toralwrote: > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote: > > --- > > src/intel/compiler/brw_fs_nir.cpp | 33 +-- > > -- > > 1 file changed, 21 insertions(+), 12 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > b/src/intel/compiler/brw_fs_nir.cpp > > index e008e2e..a441f57 100644 > > --- a/src/intel/compiler/brw_fs_nir.cpp > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > @@ -1441,11 +1441,19 @@ fs_visitor::get_nir_src(const nir_src ) > > src.reg.base_offset * src.reg.reg- > > >num_components); > > } > > > > - /* to avoid floating-point denorm flushing problems, set the type > > by > > -* default to D - instructions that need floating point semantics > > will set > > -* this to F if they need to > > -*/ > > - return retype(reg, BRW_REGISTER_TYPE_D); > > + if (nir_src_bit_size(src) == 64 && devinfo->gen == 7) { > > + /* The only 64-bit type available on gen7 is DF, so use that. > > */ > > + reg.type = BRW_REGISTER_TYPE_DF; > > + } else { > > + /* To avoid floating-point denorm flushing problems, set the > > type by > > + * default to an integer type - instructions that need > > floating point > > + * semantics will set this to F if they need to > > + */ > > + reg.type = brw_reg_type_from_bit_size(nir_src_bit_size(src), > > +BRW_REGISTER_TYPE_D); > > + } > > + > > + return reg; > > } > > > > /** > > @@ -1455,6 +1463,10 @@ fs_reg > > fs_visitor::get_nir_src_imm(const nir_src ) > > { > > nir_const_value *val = nir_src_as_const_value(src); > > + /* This function shouldn't be called on anything which can even > > +* possibly be 64 bits as it can't do what it claims. > > +*/ > > What would be wrong with something like this? > > if (nir_src_bit_size(src) == 32) >return val ? fs_reg(brw_imm_d(val->i32[0])) : get_nir_src(src); > else >return val ? fs_reg(brw_imm_df(val->f64[0])) : get_nir_src(src); > Because double immediates only really work on BDW+ and I didn't want someone to call this function and get tripped up by that. > > + assert(nir_src_bit_size(src) == 32); > > return val ? fs_reg(brw_imm_d(val->i32[0])) : get_nir_src(src); > > } > > > > @@ -2648,8 +2660,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const > > fs_builder , > > */ > > unsigned channel = iter * 2 + i; > > fs_reg dest = shuffle_64bit_data_for_32bit_write(bld, > > - retype(offset(value, bld, 2 * channel), > > BRW_REGISTER_TYPE_DF), > > - 1); > > + offset(value, bld, channel), 1); > > > > srcs[header_regs + (i + first_component) * 2] = dest; > > srcs[header_regs + (i + first_component) * 2 + 1] = > > @@ -3505,8 +3516,7 @@ fs_visitor::nir_emit_cs_intrinsic(const > > fs_builder , > >if (nir_src_bit_size(instr->src[0]) == 64) { > > type_size = 8; > > val_reg = shuffle_64bit_data_for_32bit_write(bld, > > -retype(val_reg, BRW_REGISTER_TYPE_DF), > > -instr->num_components); > > +val_reg, instr->num_components); > >} > > > >unsigned type_slots = type_size / 4; > > @@ -4005,8 +4015,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder > > , nir_intrinsic_instr *instr > >if (nir_src_bit_size(instr->src[0]) == 64) { > > type_size = 8; > > val_reg = shuffle_64bit_data_for_32bit_write(bld, > > -retype(val_reg, BRW_REGISTER_TYPE_DF), > > -instr->num_components); > > +val_reg, instr->num_components); > >} > > > >unsigned type_slots = type_size / 4; > > @@ -4063,7 +4072,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder > > , nir_intrinsic_instr *instr > >unsigned first_component = nir_intrinsic_component(instr); > >if (nir_src_bit_size(instr->src[0]) == 64) { > > fs_reg tmp = shuffle_64bit_data_for_32bit_write(bld, > > -retype(src, BRW_REGISTER_TYPE_DF), num_components); > > +src, num_components); > > src = tmp; > > num_components *= 2; > >} > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 18/48] i965/fs/nir: Minor refactor of store_output
On Thu, Oct 26, 2017 at 11:35 PM, Iago Toralwrote: > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote: > > Stop retyping the output of shuffle_64bit_data_for_32bit_write. It's > > always BRW_REGISTER_TYPE_D which is perfectly fine for writing out. > > Also, when we change get_nir_src to return something with a 64-bit > > type > > for 64-bit values, the retyping will not be at all what we > > want. Also, > > retyping the output based on src.type before we whack it back to 32 > > bits > > is a problem because the output is always 32 bits. > > --- > > src/intel/compiler/brw_fs_nir.cpp | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/src/intel/compiler/brw_fs_nir.cpp > > b/src/intel/compiler/brw_fs_nir.cpp > > index 5bcdb1a..e008e2e 100644 > > --- a/src/intel/compiler/brw_fs_nir.cpp > > +++ b/src/intel/compiler/brw_fs_nir.cpp > > @@ -4058,18 +4058,18 @@ fs_visitor::nir_emit_intrinsic(const > > fs_builder , nir_intrinsic_instr *instr > > > >nir_const_value *const_offset = nir_src_as_const_value(instr- > > >src[1]); > >assert(const_offset && "Indirect output stores not allowed"); > > - fs_reg new_dest = retype(offset(outputs[instr- > > >const_index[0]], bld, > > - 4 * const_offset->u32[0]), > > src.type); > > > >unsigned num_components = instr->num_components; > >unsigned first_component = nir_intrinsic_component(instr); > >if (nir_src_bit_size(instr->src[0]) == 64) { > > fs_reg tmp = shuffle_64bit_data_for_32bit_write(bld, > > retype(src, BRW_REGISTER_TYPE_DF), num_components); > > - src = retype(tmp, src.type); > > + src = tmp; > > Maybe just make this: > > src = suffle_64bit_data_for_32bit_write(...) ? > Fixed locally. > > num_components *= 2; > >} > > > > + fs_reg new_dest = retype(offset(outputs[instr- > > >const_index[0]], bld, > > + 4 * const_offset->u32[0]), > > src.type); > >for (unsigned j = 0; j < num_components; j++) { > > bld.MOV(offset(new_dest, bld, j + first_component), > > offset(src, bld, j)); > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: declare an empty xcb dependency
Well, I went to push this, apparently this was in one of the osmesa patches as an unrelated change.. :/ So it's fixed. Dylan Quoting Dylan Baker (2017-10-27 10:25:33) > This is needed in cases where xcb isn't actually needed as a dependency, > but may still be included somewhere. > > cc: Erik Faye-Lund> cc: Eric Engestrom > Signed-off-by: Dylan Baker > --- > meson.build | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/meson.build b/meson.build > index 875f9d4d294..102b75c2320 100644 > --- a/meson.build > +++ b/meson.build > @@ -740,6 +740,7 @@ dep_xext = [] > dep_xdamage = [] > dep_xfixes = [] > dep_x11_xcb = [] > +dep_xcb = [] > dep_xcb_glx = [] > dep_xcb_dri2 = [] > dep_xcb_dri3 = [] > -- > 2.14.2 > signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Android: move drivers' symlinks to /vendor
2017-10-27 14:28 GMT+02:00 Rob Herring: > On Fri, Oct 27, 2017 at 6:41 AM, Emil Velikov > wrote: > > On 26 October 2017 at 23:48, Mauro Rossi wrote: > >> Having moved gallium_dri.so library to /vendor/lib/dri > >> also symlinks need to be coherently created using TARGET_OUT_VENDOR > insted of TARGET_OUT > >> or all non Intel drivers will not be loaded with Android N and earlier, > >> thus causing SurfaceFlinger SIGABRT > >> > >> Fixes: c3f75d483c ("Android: move libraries to /vendor") > >> > >> Cc: 17.3 > >> --- > >> src/gallium/targets/dri/Android.mk | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > > Reviewed-by: Rob Herring > > >> > >> diff --git a/src/gallium/targets/dri/Android.mk > b/src/gallium/targets/dri/Android.mk > >> index 61b65769ff..3fa86a2d56 100644 > >> --- a/src/gallium/targets/dri/Android.mk > >> +++ b/src/gallium/targets/dri/Android.mk > >> @@ -70,8 +70,8 @@ LOCAL_SHARED_LIBRARIES += $(sort > $(GALLIUM_SHARED_LIBS)) > >> ifneq ($(filter 5 6 7, $(MESA_ANDROID_MAJOR_VERSION)),) > >> LOCAL_POST_INSTALL_CMD := \ > >> $(foreach l, lib $(if $(filter true,$(TARGET_IS_64_BIT)),lib64), > \ > >> - mkdir -p $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \ > >> - $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so > $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \ > >> + mkdir -p $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH); > \ > >> + $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so > $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \ > > Can we fold the long path into a variable and then reuse it? > > This code will be around for a bit, so it might be worth it. > > > > foo=$(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH) > > mkdir -p $(foo) > > $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so > > $(foo)/$(d)_dri.so;) > > > > -Emil > > *Please use better variable name than foo > > bar? > ...and the winner is ... MESA_DRI_MODULE_PATH $(eval MESA_DRI_MODULE_PATH := $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)) Sending tested v2 patch soon Mauro ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: declare an empty xcb dependency
On Fri, Oct 27, 2017 at 7:25 PM, Dylan Bakerwrote: > This is needed in cases where xcb isn't actually needed as a dependency, > but may still be included somewhere. > > cc: Erik Faye-Lund > cc: Eric Engestrom > Signed-off-by: Dylan Baker Tested-by: Erik Faye-Lund ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [AppVeyor] mesa master #5967 failed
Build mesa 5967 failed Commit f121a669c7 by Dylan Baker on 10/24/2017 10:52 PM: meson: build gallium based osmesa\n\nThis has been tested with the osdemo from mesa-demos\n\nv2: - Add SELinux dependency\n- fix typo GALLIUM_LLVM -> GALLIUM_LLVMPIPE\n\nSigned-off-by: Dylan Baker\nReviewed-by: Eric Engestrom Configure your notification preferences ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] meson: Add threads dependencies to glsl_compiler executable
Fixes compiling the optional standalone glsl compiler. Reported-by: DrNick (on irc) Signed-off-by: Dylan Baker--- This is not compiled by default, but can be built by: meson build ninja -C build src/compiler/glsl/glsl_compiler src/compiler/glsl/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/compiler/glsl/meson.build b/src/compiler/glsl/meson.build index 76fcafb9910..aa0e7153f42 100644 --- a/src/compiler/glsl/meson.build +++ b/src/compiler/glsl/meson.build @@ -223,7 +223,7 @@ glsl_compiler = executable( 'main.cpp', c_args : [c_vis_args, c_msvc_compat_args, no_override_init_args], cpp_args : [cpp_vis_args, cpp_msvc_compat_args], - dependencies : [dep_clock], + dependencies : [dep_clock, dep_thread], include_directories : [inc_common], link_with : [libglsl_standalone], build_by_default : false, -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Mesa 17.2.4 release candidate
On Fri, Oct 27, 2017 at 1:43 PM, Andres Gomezwrote: > Rejected (6) > > > Ilia Mirkin (1): > glsl: fix derived cs variables > > Reason: Commit is too big for stable at this point. The issue it fixes in regular compute shaders is slightly difficult to hit (but there are piglits that do now), however the issue it hits with ARB_compute_variable_group_size is fairly trivial to encounter. It seems silly to put out releases with known bugs when a fix is easily available and apply-able, with negligible risk of messing things up. Note that this all only affects nouveau and radeonsi, as those are the only drivers that make use of the lowering. -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 9/9] meson: build gallium based osmesa
Quoting Eric Engestrom (2017-10-27 02:28:05) > On Thursday, 2017-10-26 13:55:35 -0700, Dylan Baker wrote: > > Quoting Eric Engestrom (2017-10-26 02:40:20) > > > On Wednesday, 2017-10-25 15:58:23 -0700, Dylan Baker wrote: > > > > +if with_shared_glapi > > > > + osmesa_link_with += libglapi > > > > +endif > > > > +if with_ld_version_script > > > > + osmesa_link_args += [ > > > > +'-Wl,--version-script', join_paths(meson.current_source_dir(), > > > > 'osmesa.sym') > > > > > > files('osmesa.sym') ? > > > I'm not sure if it would work here though, since it expects a string. > > > /me hopes meson converts the file_array to a list of its filenames > > > > > > > files() doesn't work here. Maybe it makes sense to extend meson to support > > that? > > We're not the first ones to think about this :) > There's already an issue open for this exact case [1], which doesn't really > have a conclusion. Might be worth chiming in, if only to ping them? > > [1] https://github.com/mesonbuild/meson/issues/1592 > I kinda hacked something up real quick. I don't think it would be hard to implement, although I started wondering about it. Part of what makes File so special is that it always works no matter what directory you're in. I don't think that th right thing is to be able to us them in string.format, but to be able to pass them as a linker or compiler argument, and have that do the right thing with them. I'll look into that more. Dylan signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1
On Fri, Oct 27, 2017 at 10:50 AM, Anuj Phogatwrote: > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c > b/src/mesa/drivers/dri/i965/genX_state_upload.c > index b6e800aa90..2fb6420cee 100644 > --- a/src/mesa/drivers/dri/i965/genX_state_upload.c > +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c > @@ -4362,6 +4362,16 @@ genX(upload_raster)(struct brw_context *brw) >/* _NEW_LINE */ >raster.AntialiasingEnable = ctx->Line.SmoothFlag; > > +#if GEN_GEN == 10 > + /* _NEW_LINE | _NEW_MULTISAMPLE _NEW_LINE is not required here. Fixed locally. > + * Antialiasing Enable bit MUST not be set when NUM_MULTISAMPLES > 1. > + */ > + const bool multisampled_fbo = > + _mesa_geometric_samples(ctx->DrawBuffer) > 1; > + if (multisampled_fbo) > + raster.AntialiasingEnable = false; > +#endif > + >/* _NEW_SCISSOR */ >raster.ScissorRectangleEnable = ctx->Scissor.EnableFlags; > > -- > 2.13.5 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: declare an empty xcb dependency
On Friday, 2017-10-27 10:25:33 -0700, Dylan Baker wrote: > This is needed in cases where xcb isn't actually needed as a dependency, > but may still be included somewhere. > > cc: Erik Faye-Lund> cc: Eric Engestrom better indeed :) Reviewed-by: Eric Engestrom > Signed-off-by: Dylan Baker > --- > meson.build | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/meson.build b/meson.build > index 875f9d4d294..102b75c2320 100644 > --- a/meson.build > +++ b/meson.build > @@ -740,6 +740,7 @@ dep_xext = [] > dep_xdamage = [] > dep_xfixes = [] > dep_x11_xcb = [] > +dep_xcb = [] > dep_xcb_glx = [] > dep_xcb_dri2 = [] > dep_xcb_dri3 = [] > -- > 2.14.2 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] i965/gen10: Don't set Antialiasing Enable in 3DSTATE_RASTER if num_samples > 1
Signed-off-by: Anuj Phogat--- src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index b6e800aa90..2fb6420cee 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -4362,6 +4362,16 @@ genX(upload_raster)(struct brw_context *brw) /* _NEW_LINE */ raster.AntialiasingEnable = ctx->Line.SmoothFlag; +#if GEN_GEN == 10 + /* _NEW_LINE | _NEW_MULTISAMPLE + * Antialiasing Enable bit MUST not be set when NUM_MULTISAMPLES > 1. + */ + const bool multisampled_fbo = + _mesa_geometric_samples(ctx->DrawBuffer) > 1; + if (multisampled_fbo) + raster.AntialiasingEnable = false; +#endif + /* _NEW_SCISSOR */ raster.ScissorRectangleEnable = ctx->Scissor.EnableFlags; -- 2.13.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965/gen10: Don't set Smooth Point Enable in 3DSTATE_SF if num_samples > 1
Signed-off-by: Anuj Phogat--- src/mesa/drivers/dri/i965/genX_state_upload.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/mesa/drivers/dri/i965/genX_state_upload.c b/src/mesa/drivers/dri/i965/genX_state_upload.c index 4ccfd48919..b6e800aa90 100644 --- a/src/mesa/drivers/dri/i965/genX_state_upload.c +++ b/src/mesa/drivers/dri/i965/genX_state_upload.c @@ -1626,6 +1626,16 @@ genX(upload_sf)(struct brw_context *brw) sf.SmoothPointEnable = true; #endif +#if GEN_GEN == 10 + /* _NEW_MULTISAMPLE + * Smooth Point Enable bit MUST not be set when NUM_MULTISAMPLES > 1. + */ + const bool multisampled_fbo = + _mesa_geometric_samples(ctx->DrawBuffer) > 1; + if (multisampled_fbo) + sf.SmoothPointEnable = false; +#endif + #if GEN_IS_G4X || GEN_GEN >= 5 sf.AALineDistanceMode = AALINEDISTANCE_TRUE; #endif -- 2.13.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radv: Disallow indirect outputs for GS on GFX9 as well.
On Fri, Oct 27, 2017 at 5:03 PM, Andres Gomezwrote: > Bas, this patch looks like it should have been marked as fixing > 6ce550453f1 instead of e38685cc62e (?). Well my reasoning was that the bug got "visible" when we enabled Vega. The patch you refer to fixed part of it, but not all of it, and then this patch fixes the rest. > > In any case, I was wondering whether it would be interesting to bring > them both to the 17.2 stable queue and whether we would also want > Timothy's preceding patch: > > 087e010b2b3dd83a539f97203909d6c43b5da87c radv: copy indirect lowering > settings from radeonsi Yes, that would be reasonable. Timothy's patch is an optimization though, so I'd be happy to send a backport that only generates the variable needed for the other two if you'd prefer that. > > Let me know what you think. > > On Sun, 2017-10-22 at 18:43 +0200, Bas Nieuwenhuizen wrote: >> Since it also uses the output vector before writing to memory. >> >> Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' >> --- >> src/amd/vulkan/radv_shader.c | 4 +--- >> 1 file changed, 1 insertion(+), 3 deletions(-) >> >> diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c >> index 07e68d6032b..6176a2e590d 100644 >> --- a/src/amd/vulkan/radv_shader.c >> +++ b/src/amd/vulkan/radv_shader.c >> @@ -265,9 +265,7 @@ radv_shader_compile_to_nir(struct radv_device *device, >> indirect_mask |= nir_var_shader_in; >> } >> if (!llvm_has_working_vgpr_indexing && >> - (nir->info.stage == MESA_SHADER_VERTEX || >> - nir->info.stage == MESA_SHADER_TESS_EVAL || >> - nir->info.stage == MESA_SHADER_FRAGMENT)) >> + nir->info.stage != MESA_SHADER_TESS_CTRL) >> indirect_mask |= nir_var_shader_out; >> >> /* TODO: We shouldn't need to do this, however LLVM isn't currently > -- > Br, > > Andres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Mesa 17.2.4 release candidate
Hello list, The candidate for the Mesa 17.2.4 is now available. Currently we have: - 21 queued - 10 nominated (outstanding) - and 6 rejected patches In the current queue we have: In Mesa Core we have included a change to prevent KOTOR from breaking when in combination with the ATI fragment shader extension. Additionally, NIR has also received a correction. Mesa's state tracker has gotten a patch to avoid leaks in certain situations like when resizing a window. Intel drivers have received several fixes. The compiler has gotten a couple, while anv also received one. i965 got a patch to avoid VA-API, Beignet and other contexts in the system to break when in combination with previous versions of Mesa 17.2.x. AMD's compiler has received a couple of fixes while radv has also received another couple, including one to avoid a hang due to an overflow on huge textures. Broadcom's vc4 has corrected a problem when compiling with Android's clang. clover has also seen fixed a problem compiling after a specific clang revision. Vulkan's WSI has gotten plugged a memory leak for X11. Take a look at section "Mesa stable queue" for more information. Testing reports/general approval Any testing reports (or general approval of the state of the branch) will be greatly appreciated. The plan is to have 17.2.4 next Sunday (29th of October), around or shortly after 18:00 GMT. If you have any questions or suggestions - be that about the current patch queue or otherwise, please go ahead. Trivial merge conflicts --- commit 1eb4cbc934afd72ea45521eeac763e43564d972b Author: Dave Airlieradv/image: bump all the offset to uint64_t. (cherry picked from commit 35c66f3e40177a97d74e614e2a324a03e2149c73) commit cbc081b8711a519dd9aa371debbf68aa0add4076 Author: Kenneth Graunke i965: Revert absolute mode for constant buffer pointers. (cherry picked from commit 013d33122028f2492da90a03ae4bc1dab84c3ee9) commit ce725baa7c90420b46e16087a4f201a7e62b23e5 Author: Matthew Nicholls ac/nir: generate correct instruction for atomic min/max on unsigned images (cherry picked from commit 27a0b24bf238342031e0709584e4d71ab228f1ec) commit 74f19032341748ad257a0c2527624b661e2d2e6f Author: Jason Ekstrand anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir (cherry picked from commit 279f8fb69cf68d05287e14f60cf67fc025643bc4) Cheers, Andres Mesa stable queue - Nominated (10) = Jason Ekstrand (4): spirv: Claim support for the simple memory model i965/blorp: Use blorp_to_isl_format for src_isl_format in blit_miptrees i965/blorp: Use more temporary isl_format variables i965/miptree: Take an isl_format in render_aux_usage Kenneth Graunke (1): mesa: Accept GL_BACK in get_fb0_attachment with ARB_ES3_1_compatibility. Leo Liu (1): radeon/video: add gfx9 offsets when rejoin the video surface Marek Olšák (2): radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI ac/surface/gfx9: don't allow DCC for the smallest mipmap levels Nicolai Hähnle (1): amd/common/gfx9: workaround DCC corruption more conservatively Tapani Pälli (1): i965: unref push_const_bo in intelDestroyContext Queued (21) === Andres Gomez (7): cherry-ignore: configure.ac: rework llvm detection and handling cherry-ignore: glsl: fix derived cs variables cherry-ignore: added 17.3 nominations. cherry-ignore: radv: Don't use vgpr indexing for outputs on GFX9. cherry-ignore: radv: Disallow indirect outputs for GS on GFX9 as well. cherry-ignore: mesa/bufferobj: don't double negate the range cherry-ignore: broadcom/vc5: Propagate vc4 aliasing fix to vc5. Bas Nieuwenhuizen (1): ac/nir: Fix nir_texop_lod on GFX for 1D arrays. Dave Airlie (1): radv/image: bump all the offset to uint64_t. Henri Verbeet (1): vulkan/wsi: Free the event in x11_manage_fifo_queues(). Jan Vesely (1): clover: Fix compilation after clang r315871 Jason Ekstrand (4): nir/intrinsics: Set the correct num_indices for load_output intel/fs: Handle flag read/write aliasing in needs_src_copy anv/pipeline: Call nir_lower_system_valaues after brw_preprocess_nir intel/eu: Use EXECUTE_1 for JMPI Kenneth Graunke (1): i965: Revert absolute mode for constant buffer pointers. Marek Olšák (1): Revert "mesa: fix texture updates for ATI_fragment_shader" Matthew Nicholls (1): ac/nir: generate correct instruction for atomic min/max on unsigned images Michel Dänzer (1): st/mesa: Initialize textures array in st_framebuffer_validate Squashed with st/osmesa: include u_inlines.h for pipe_resource_reference Samuel Pitoiset (1): radv: add the draw count buffer to the list of
Re: [Mesa-dev] [PATCH] meson: Add a dependency on nir_opcodes_h for freedreno
Reviewed-by: Mark JanesDylan Baker writes: > This fixes a race condition in the build. > > cc: Rob Clark > Signed-off-by: Dylan Baker > --- > src/gallium/drivers/freedreno/meson.build | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/gallium/drivers/freedreno/meson.build > b/src/gallium/drivers/freedreno/meson.build > index 028be54ebac..6f7cf2b3bd4 100644 > --- a/src/gallium/drivers/freedreno/meson.build > +++ b/src/gallium/drivers/freedreno/meson.build > @@ -200,7 +200,7 @@ freedreno_includes = [ > > libfreedreno = static_library( >'freedreno', > - [files_libfreedreno, ir3_nir_trig_c], > + [files_libfreedreno, ir3_nir_trig_c, nir_opcodes_h], >include_directories : freedreno_includes, >c_args : [c_vis_args], >cpp_args : [cpp_vis_args], > -- > 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa] meson: wire up egl/android
Whooo! Thanks for doing this! Quoting Eric Engestrom (2017-10-27 07:40:17) > Cc: Rob Herring> Cc: Tomasz Figa > Signed-off-by: Eric Engestrom > --- > Completely untested! > It's a step in the right direction though; doesn't hurt non-android, > and gets android closer to building on meson :) > --- > meson.build | 13 +++-- > src/egl/meson.build | 5 - > 2 files changed, 15 insertions(+), 3 deletions(-) > > diff --git a/meson.build b/meson.build > index 875f9d4d294d1911f239..761c33f4651ab37ab7b6 100644 > --- a/meson.build > +++ b/meson.build > @@ -152,7 +152,7 @@ endif > # TODO: other OSes > with_dri_platform = 'drm' > > -# TODO: android platform > +with_platform_android = false > with_platform_wayland = false > with_platform_x11 = false > with_platform_drm = false > @@ -161,6 +161,7 @@ egl_native_platform = '' > _platforms = get_option('platforms') > if _platforms != '' >_split = _platforms.split(',') > + with_platform_android = _split.contains('android') >with_platform_x11 = _split.contains('x11') >with_platform_wayland = _split.contains('wayland') >with_platform_drm = _split.contains('drm') > @@ -252,7 +253,7 @@ if _vulkan_drivers != '' >with_intel_vk = _split.contains('intel') >with_amd_vk = _split.contains('amd') >with_any_vk = with_amd_vk or with_intel_vk > - if not (with_platform_x11 or with_platform_wayland) > + if not (with_platform_x11 or with_platform_wayland or > with_platform_android) > error('Vulkan requires at least one platform (x11, wayland)') >endif > endif > @@ -330,6 +331,14 @@ endif > if with_platform_surfaceless >pre_args += '-DHAVE_SURFACELESS_PLATFORM' > endif > +if with_platform_android > + dep_android = [ > + dependency('cutils'), > + dependency('hardware'), > + dependency('sync'), > + ] The indent looks off here, it looks like 4 space instead of 2, and the closing brace should be dedented. Otherwise this looks good to me, though it would be great if one of the ChromeOS guys could look at it (since I think that android builds are always done with android.mk, and this would only be for the ChromeOS ARC++ container) Reviewed-by: Dylan Baker > + pre_args += '-DHAVE_ANDROID_PLATFORM' > +endif > > prog_python2 = find_program('python2') > has_mako = run_command(prog_python2, '-c', 'import mako') > diff --git a/src/egl/meson.build b/src/egl/meson.build > index ea7ae06761f75c00a40c..cc51671f9d8f24708405 100644 > --- a/src/egl/meson.build > +++ b/src/egl/meson.build > @@ -129,7 +129,10 @@ if with_platform_wayland > 'wayland/wayland-egl', 'wayland/wayland-drm', >) > endif > -# TODO: android > +if with_platform_android > + deps_for_egl += dep_android > + files_egl += files('drivers/dri2/platform_android.c') > +endif > > # TODO: glvnd > > -- > Cheers, > Eric > signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: drop vulkan if no drivers are built
Yeah, I kinda got that feeling. Your approach seems better. On Oct 27, 2017 19:22, "Dylan Baker"wrote: This just papers over the actual problem, that dep_xcb isn't declared as an empty list with the other glx dependencies. Dylan Quoting Erik Faye-Lund (2017-10-27 04:55:25) > This avoids the following build-error when building with emtpy > vulkan-drivers and without glx=dri: > > Meson encountered an error in file src/vulkan/wsi/meson.build, line 30, > column 2: > Unknown variable "dep_xcb". > > Signed-off-by: Erik Faye-Lund > --- > src/meson.build | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/meson.build b/src/meson.build > index 9b1b0ae594..4b00ab910c 100644 > --- a/src/meson.build > +++ b/src/meson.build > @@ -47,7 +47,9 @@ subdir('mapi') > # TODO: osmesa > subdir('compiler') > subdir('egl/wayland/wayland-drm') > -subdir('vulkan') > +if with_any_vk > + subdir('vulkan') > +endif > subdir('amd') > if with_gallium_vc4 >subdir('broadcom') > -- > 2.11.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] meson: declare an empty xcb dependency
This is needed in cases where xcb isn't actually needed as a dependency, but may still be included somewhere. cc: Erik Faye-Lundcc: Eric Engestrom Signed-off-by: Dylan Baker --- meson.build | 1 + 1 file changed, 1 insertion(+) diff --git a/meson.build b/meson.build index 875f9d4d294..102b75c2320 100644 --- a/meson.build +++ b/meson.build @@ -740,6 +740,7 @@ dep_xext = [] dep_xdamage = [] dep_xfixes = [] dep_x11_xcb = [] +dep_xcb = [] dep_xcb_glx = [] dep_xcb_dri2 = [] dep_xcb_dri3 = [] -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v5 00/10] new series of Mesa for Tizen
On Thursday, October 26, 2017 11:16:06 AM PDT Eric Anholt wrote: > Gwan-gyeong Munwrites: > > > Hi, > > > > These Patch v5 series modified with new helper function series [1]. > > > > These series only have mesa for tizen feature. > > > > [1] https://patchwork.freedesktop.org/series/32577/ > > Rather than have another giant pile of window system code in the tree, > I'd like to see a serious discussion of why you aren't using the > existing wayland and gbm platforms and what you would need from them. > The TPL stuff looks like abstraction for the sake of abstraction, to me. I agree with Eric. I've never understood why this is necessary. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: drop vulkan if no drivers are built
This just papers over the actual problem, that dep_xcb isn't declared as an empty list with the other glx dependencies. Dylan Quoting Erik Faye-Lund (2017-10-27 04:55:25) > This avoids the following build-error when building with emtpy > vulkan-drivers and without glx=dri: > > Meson encountered an error in file src/vulkan/wsi/meson.build, line 30, > column 2: > Unknown variable "dep_xcb". > > Signed-off-by: Erik Faye-Lund> --- > src/meson.build | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/meson.build b/src/meson.build > index 9b1b0ae594..4b00ab910c 100644 > --- a/src/meson.build > +++ b/src/meson.build > @@ -47,7 +47,9 @@ subdir('mapi') > # TODO: osmesa > subdir('compiler') > subdir('egl/wayland/wayland-drm') > -subdir('vulkan') > +if with_any_vk > + subdir('vulkan') > +endif > subdir('amd') > if with_gallium_vc4 >subdir('broadcom') > -- > 2.11.0 > signature.asc Description: signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] meson: Add a dependency on nir_opcodes_h for freedreno
This fixes a race condition in the build. cc: Rob ClarkSigned-off-by: Dylan Baker --- src/gallium/drivers/freedreno/meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/freedreno/meson.build b/src/gallium/drivers/freedreno/meson.build index 028be54ebac..6f7cf2b3bd4 100644 --- a/src/gallium/drivers/freedreno/meson.build +++ b/src/gallium/drivers/freedreno/meson.build @@ -200,7 +200,7 @@ freedreno_includes = [ libfreedreno = static_library( 'freedreno', - [files_libfreedreno, ir3_nir_trig_c], + [files_libfreedreno, ir3_nir_trig_c, nir_opcodes_h], include_directories : freedreno_includes, c_args : [c_vis_args], cpp_args : [cpp_vis_args], -- 2.14.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] util: hashtable: make hashing prototypes match
On Friday, October 27, 2017 9:43:45 AM PDT Lionel Landwerlin wrote: > It seems nobody's using the string hashing function. If you try to > pass it directly to the hashtable creation function, you'll get > compiler warning for non matching prototypes. Let's make them match. > > Signed-off-by: Lionel Landwerlin> --- > src/util/hash_table.c | 3 ++- > src/util/hash_table.h | 2 +- > 2 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/src/util/hash_table.c b/src/util/hash_table.c > index 1bda2149b95..b7421a0144c 100644 > --- a/src/util/hash_table.c > +++ b/src/util/hash_table.c > @@ -476,9 +476,10 @@ _mesa_hash_data(const void *data, size_t size) > > /** FNV-1a string hash implementation */ > uint32_t > -_mesa_hash_string(const char *key) > +_mesa_hash_string(const void *_key) > { > uint32_t hash = _mesa_fnv32_1a_offset_bias; > + const char *key = _key; > > while (*key != 0) { >hash = _mesa_fnv32_1a_accumulate(hash, *key); > diff --git a/src/util/hash_table.h b/src/util/hash_table.h > index cf939130fcf..d3e0758b265 100644 > --- a/src/util/hash_table.h > +++ b/src/util/hash_table.h > @@ -94,7 +94,7 @@ _mesa_hash_table_random_entry(struct hash_table *ht, >bool (*predicate)(struct hash_entry *entry)); > > uint32_t _mesa_hash_data(const void *data, size_t size); > -uint32_t _mesa_hash_string(const char *key); > +uint32_t _mesa_hash_string(const void *key); > bool _mesa_key_string_equal(const void *a, const void *b); > bool _mesa_key_pointer_equal(const void *a, const void *b); > > Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: disable depth writes when depth test is not enabled
On 10/27/2017 06:49 PM, Marek Olšák wrote: The hw disables depth writes automatically. Okay, good to know. Thanks! Marek On Fri, Oct 27, 2017 at 6:27 PM, Samuel Pitoisetwrote: Found by inspection. Signed-off-by: Samuel Pitoiset --- src/amd/vulkan/radv_pipeline.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index c25642c966..f5ebcda883 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -741,8 +741,14 @@ radv_pipeline_init_depth_stencil_state(struct radv_pipeline *pipeline, bool has_stencil_attachment = vk_format_is_stencil(attachment->format); if (has_depth_attachment) { + /* Depth writes are always disabled when depthTestEnable is +* VK_FALSE. +*/ + bool z_write_enable = + vkds->depthTestEnable && vkds->depthWriteEnable; + ds->db_depth_control = S_028800_Z_ENABLE(vkds->depthTestEnable ? 1 : 0) | - S_028800_Z_WRITE_ENABLE(vkds->depthWriteEnable ? 1 : 0) | + S_028800_Z_WRITE_ENABLE(z_write_enable ? 1 : 0) | S_028800_ZFUNC(vkds->depthCompareOp) | S_028800_DEPTH_BOUNDS_ENABLE(vkds->depthBoundsTestEnable ? 1 : 0); } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radv: disable depth writes when depth test is not enabled
The hw disables depth writes automatically. Marek On Fri, Oct 27, 2017 at 6:27 PM, Samuel Pitoisetwrote: > Found by inspection. > > Signed-off-by: Samuel Pitoiset > --- > src/amd/vulkan/radv_pipeline.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c > index c25642c966..f5ebcda883 100644 > --- a/src/amd/vulkan/radv_pipeline.c > +++ b/src/amd/vulkan/radv_pipeline.c > @@ -741,8 +741,14 @@ radv_pipeline_init_depth_stencil_state(struct > radv_pipeline *pipeline, > bool has_stencil_attachment = > vk_format_is_stencil(attachment->format); > > if (has_depth_attachment) { > + /* Depth writes are always disabled when depthTestEnable is > +* VK_FALSE. > +*/ > + bool z_write_enable = > + vkds->depthTestEnable && vkds->depthWriteEnable; > + > ds->db_depth_control = > S_028800_Z_ENABLE(vkds->depthTestEnable ? 1 : 0) | > - > S_028800_Z_WRITE_ENABLE(vkds->depthWriteEnable ? 1 : 0) | > + S_028800_Z_WRITE_ENABLE(z_write_enable > ? 1 : 0) | >S_028800_ZFUNC(vkds->depthCompareOp) | > > S_028800_DEPTH_BOUNDS_ENABLE(vkds->depthBoundsTestEnable ? 1 : 0); > } > -- > 2.14.3 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] util: hashtable: make hashing prototypes match
It seems nobody's using the string hashing function. If you try to pass it directly to the hashtable creation function, you'll get compiler warning for non matching prototypes. Let's make them match. Signed-off-by: Lionel Landwerlin--- src/util/hash_table.c | 3 ++- src/util/hash_table.h | 2 +- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/util/hash_table.c b/src/util/hash_table.c index 1bda2149b95..b7421a0144c 100644 --- a/src/util/hash_table.c +++ b/src/util/hash_table.c @@ -476,9 +476,10 @@ _mesa_hash_data(const void *data, size_t size) /** FNV-1a string hash implementation */ uint32_t -_mesa_hash_string(const char *key) +_mesa_hash_string(const void *_key) { uint32_t hash = _mesa_fnv32_1a_offset_bias; + const char *key = _key; while (*key != 0) { hash = _mesa_fnv32_1a_accumulate(hash, *key); diff --git a/src/util/hash_table.h b/src/util/hash_table.h index cf939130fcf..d3e0758b265 100644 --- a/src/util/hash_table.h +++ b/src/util/hash_table.h @@ -94,7 +94,7 @@ _mesa_hash_table_random_entry(struct hash_table *ht, bool (*predicate)(struct hash_entry *entry)); uint32_t _mesa_hash_data(const void *data, size_t size); -uint32_t _mesa_hash_string(const char *key); +uint32_t _mesa_hash_string(const void *key); bool _mesa_key_string_equal(const void *a, const void *b); bool _mesa_key_pointer_equal(const void *a, const void *b); -- 2.15.0.rc2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] intel: common: silence compiler warning
Signed-off-by: Lionel Landwerlin--- src/intel/common/gen_decoder.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/common/gen_decoder.c b/src/intel/common/gen_decoder.c index 395ff02908a..55e7305117c 100644 --- a/src/intel/common/gen_decoder.c +++ b/src/intel/common/gen_decoder.c @@ -556,7 +556,7 @@ gen_spec_load(const struct gen_device_info *devinfo) { struct parser_context ctx; void *buf; - uint8_t *text_data; + uint8_t *text_data = NULL; uint32_t text_offset = 0, text_length = 0, total_length; uint32_t gen_10 = devinfo_to_gen(devinfo); -- 2.15.0.rc2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] util: move futex helpers into futex.h
For patches 1-4, 6, 7, assuming you fix the double unlock in patch 7: Reviewed-by: Marek OlšákMarek On Sun, Oct 22, 2017 at 8:33 PM, Nicolai Hähnle wrote: > From: Nicolai Hähnle > > --- > src/util/Makefile.sources | 1 + > src/util/futex.h | 51 > +++ > src/util/meson.build | 1 + > src/util/simple_mtx.h | 20 +-- > 4 files changed, 54 insertions(+), 19 deletions(-) > create mode 100644 src/util/futex.h > > diff --git a/src/util/Makefile.sources b/src/util/Makefile.sources > index c7f6516a992..407b2825624 100644 > --- a/src/util/Makefile.sources > +++ b/src/util/Makefile.sources > @@ -6,20 +6,21 @@ MESA_UTIL_FILES := \ > build_id.h \ > crc32.c \ > crc32.h \ > debug.c \ > debug.h \ > disk_cache.c \ > disk_cache.h \ > format_r11g11b10f.h \ > format_rgb9e5.h \ > format_srgb.h \ > + futex.h \ > half_float.c \ > half_float.h \ > hash_table.c \ > hash_table.h \ > list.h \ > macros.h \ > mesa-sha1.c \ > mesa-sha1.h \ > sha1/sha1.c \ > sha1/sha1.h \ > diff --git a/src/util/futex.h b/src/util/futex.h > new file mode 100644 > index 000..a79daaf209b > --- /dev/null > +++ b/src/util/futex.h > @@ -0,0 +1,51 @@ > +/* > + * Copyright © 2015 Intel > + * > + * Permission is hereby granted, free of charge, to any person obtaining a > + * copy of this software and associated documentation files (the "Software"), > + * to deal in the Software without restriction, including without limitation > + * the rights to use, copy, modify, merge, publish, distribute, sublicense, > + * and/or sell copies of the Software, and to permit persons to whom the > + * Software is furnished to do so, subject to the following conditions: > + * > + * The above copyright notice and this permission notice (including the next > + * paragraph) shall be included in all copies or substantial portions of the > + * Software. > + * > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > + * IN THE SOFTWARE. > + */ > + > +#ifndef UTIL_FUTEX_H > +#define UTIL_FUTEX_H > + > +#if defined(HAVE_FUTEX) > + > +#include > +#include > +#include > +#include > +#include > +#include > + > +static inline long sys_futex(void *addr1, int op, int val1, struct timespec > *timeout, void *addr2, int val3) > +{ > + return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3); > +} > + > +static inline int futex_wake(uint32_t *addr) { > + return sys_futex(addr, FUTEX_WAKE, 1, NULL, NULL, 0); > +} > + > +static inline int futex_wait(uint32_t *addr, int32_t value) { > + return sys_futex(addr, FUTEX_WAIT, value, NULL, NULL, 0); > +} > + > +#endif > + > +#endif /* UTIL_FUTEX_H */ > diff --git a/src/util/meson.build b/src/util/meson.build > index c9cb3e861e9..0940e4d12a7 100644 > --- a/src/util/meson.build > +++ b/src/util/meson.build > @@ -30,20 +30,21 @@ files_mesa_util = files( >'build_id.h', >'crc32.c', >'crc32.h', >'debug.c', >'debug.h', >'disk_cache.c', >'disk_cache.h', >'format_r11g11b10f.h', >'format_rgb9e5.h', >'format_srgb.h', > + 'futex.h', >'half_float.c', >'half_float.h', >'hash_table.c', >'hash_table.h', >'list.h', >'macros.h', >'mesa-sha1.c', >'mesa-sha1.h', >'sha1/sha1.c', >'sha1/sha1.h', > diff --git a/src/util/simple_mtx.h b/src/util/simple_mtx.h > index d23a4303c80..0c2602d03b6 100644 > --- a/src/util/simple_mtx.h > +++ b/src/util/simple_mtx.h > @@ -14,20 +14,21 @@ > * > * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR > * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, > * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL > * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER > * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING > * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > DEALINGS > * IN THE SOFTWARE. > */ > > +#include "util/futex.h" > #include "util/u_thread.h" > > #if defined(__GNUC__) && defined(HAVE_FUTEX) > > /* mtx_t - Fast, simple mutex > * > * While modern pthread mutexes are very fast (implemented using futex), they > * still incur a call to an external DSO and overhead of the generality and > * features of pthread mutexes. Most
[Mesa-dev] [PATCH] radv: disable depth writes when depth test is not enabled
Found by inspection. Signed-off-by: Samuel Pitoiset--- src/amd/vulkan/radv_pipeline.c | 8 +++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index c25642c966..f5ebcda883 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -741,8 +741,14 @@ radv_pipeline_init_depth_stencil_state(struct radv_pipeline *pipeline, bool has_stencil_attachment = vk_format_is_stencil(attachment->format); if (has_depth_attachment) { + /* Depth writes are always disabled when depthTestEnable is +* VK_FALSE. +*/ + bool z_write_enable = + vkds->depthTestEnable && vkds->depthWriteEnable; + ds->db_depth_control = S_028800_Z_ENABLE(vkds->depthTestEnable ? 1 : 0) | - S_028800_Z_WRITE_ENABLE(vkds->depthWriteEnable ? 1 : 0) | + S_028800_Z_WRITE_ENABLE(z_write_enable ? 1 : 0) | S_028800_ZFUNC(vkds->depthCompareOp) | S_028800_DEPTH_BOUNDS_ENABLE(vkds->depthBoundsTestEnable ? 1 : 0); } -- 2.14.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] svga: Use __asm__ instead of asm
On 10/27/2017 05:21 AM, Emil Velikov wrote: On 27 October 2017 at 00:57, Dylan Bakerwrote: Which allows the code to be compiled with c99 instead of gnu99. A little history. This code is guarded by #ifdef __GNUC__, so it's only compiled with autotools on *nix, SCons with MSVC wont hit that code. However, meson is going to build both MSVC and GCC/Clang paths. As such it makes sense to not have to override the std for gcc/clang, but ensure that it's not set to gnu99 when building with MSVC when there's a straightforward code change that allows removing the need for gnu99. I'm afraid that most of the buildsystem details are off. Patch makes sense regardless :-) With a more generic commit message (one example below), the commit is Reviewed-by: Emil Velikov Replace the GNU specific keyword asm with __asm_. This allows us to remove the explicit request for GNU extensions aka -std=gnu99 Sounds good. I tested with MinGW too. Reviewed-by: Brian Paul Tested-by: Brian Paul ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] meson: drop vulkan if no drivers are built
On Friday, 2017-10-27 13:55:25 +0200, Erik Faye-Lund wrote: > This avoids the following build-error when building with emtpy > vulkan-drivers and without glx=dri: > > Meson encountered an error in file src/vulkan/wsi/meson.build, line 30, > column 2: > Unknown variable "dep_xcb". > > Signed-off-by: Erik Faye-LundReviewed-by: Eric Engestrom > --- > src/meson.build | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/src/meson.build b/src/meson.build > index 9b1b0ae594..4b00ab910c 100644 > --- a/src/meson.build > +++ b/src/meson.build > @@ -47,7 +47,9 @@ subdir('mapi') > # TODO: osmesa > subdir('compiler') > subdir('egl/wayland/wayland-drm') > -subdir('vulkan') > +if with_any_vk > + subdir('vulkan') > +endif > subdir('amd') > if with_gallium_vc4 >subdir('broadcom') > -- > 2.11.0 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types
On Fri, Oct 27, 2017 at 11:07 AM, Juan A. Suarez Romerowrote: > On Fri, 2017-10-27 at 10:27 -0400, Ilia Mirkin wrote: >> On Fri, Oct 27, 2017 at 10:03 AM, Juan A. Suarez Romero >> wrote: >> > This patch is mostly a patch done by Ilia Mirkin. >> > >> > It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. >> > >> > CC: Ilia Mirkin >> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 >> > Signed-off-by: Juan A. Suarez Romero >> > --- >> > src/compiler/glsl/linker.cpp | 36 >> > 1 file changed, 36 insertions(+) >> > >> > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp >> > index f827b68555..7305b7740b 100644 >> > --- a/src/compiler/glsl/linker.cpp >> > +++ b/src/compiler/glsl/linker.cpp >> > @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx, >> >return true; >> > } >> > >> > + case GLSL_TYPE_ARRAY: { >> > + /* The ARB_program_interface_query spec says: >> > + * >> > + * "For an active variable declared as an array of basic types, >> > a >> > + * single entry will be generated, with its name string formed >> > by >> > + * concatenating the name of the array and the string "[0]"." >> > + * >> > + * "For an active variable declared as an array of an aggregate >> > data >> > + * type (structures or arrays), a separate entry will be >> > generated >> > + * for each active array element, unless noted immediately >> > below. >> > + * The name of each entry is formed by concatenating the name >> > of >> > + * the array, the "[" character, an integer identifying the >> > element >> > + * number, and the "]" character. These enumeration rules are >> > + * applied recursively, treating each enumerated array element >> > as a >> > + * separate active variable." >> > + */ >> >> So if you have a TCS or TES or GS with >> >> in struct { vec4 foo; } bar[4]; >> >> Wouldn't this add bar[0].foo, bar[1].foo, bar[2].foo, and bar[3].foo? > > According to the spec, yes. Unless I'm missing something... > > > Looking for other examples, in > > https://www.khronos.org/opengl/wiki/Program_Introspection#Arrays > > it shows a similar example, and also do as you say. > >> With the latter ones getting bogus locations? What is it supposed to >> do in this case? > > Why it would get bogus locations? Because it'll do elem_location += stride every time, but they each should get the same location. > > >> >> > + const struct glsl_type *array_type = type->fields.array; >> > + if (array_type->base_type == GLSL_TYPE_STRUCT || >> > + array_type->base_type == GLSL_TYPE_ARRAY) { >> > + unsigned elem_location = location; >> > + unsigned stride = array_type->count_attribute_slots(false); >> > + for (unsigned i = 0; i < type->length; i++) { >> > +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i); >> > +if (!add_shader_variable(ctx, shProg, resource_set, >> > + stage_mask, programInterface, >> > + var, elem, array_type, >> > + use_implicit_location, elem_location, >> > + outermost_struct_type)) >> > + return false; >> > +elem_location += stride; >> > + } >> > + return true; >> > + } >> > + /* fallthrough */ >> > + } >> > + >> > default: { >> >/* The ARB_program_interface_query spec says: >> > * >> > -- >> > 2.13.6 >> > >> >> ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types
On Fri, 2017-10-27 at 10:27 -0400, Ilia Mirkin wrote: > On Fri, Oct 27, 2017 at 10:03 AM, Juan A. Suarez Romero >wrote: > > This patch is mostly a patch done by Ilia Mirkin. > > > > It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. > > > > CC: Ilia Mirkin > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 > > Signed-off-by: Juan A. Suarez Romero > > --- > > src/compiler/glsl/linker.cpp | 36 > > 1 file changed, 36 insertions(+) > > > > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp > > index f827b68555..7305b7740b 100644 > > --- a/src/compiler/glsl/linker.cpp > > +++ b/src/compiler/glsl/linker.cpp > > @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx, > >return true; > > } > > > > + case GLSL_TYPE_ARRAY: { > > + /* The ARB_program_interface_query spec says: > > + * > > + * "For an active variable declared as an array of basic types, a > > + * single entry will be generated, with its name string formed > > by > > + * concatenating the name of the array and the string "[0]"." > > + * > > + * "For an active variable declared as an array of an aggregate > > data > > + * type (structures or arrays), a separate entry will be > > generated > > + * for each active array element, unless noted immediately > > below. > > + * The name of each entry is formed by concatenating the name of > > + * the array, the "[" character, an integer identifying the > > element > > + * number, and the "]" character. These enumeration rules are > > + * applied recursively, treating each enumerated array element > > as a > > + * separate active variable." > > + */ > > So if you have a TCS or TES or GS with > > in struct { vec4 foo; } bar[4]; > > Wouldn't this add bar[0].foo, bar[1].foo, bar[2].foo, and bar[3].foo? According to the spec, yes. Unless I'm missing something... Looking for other examples, in https://www.khronos.org/opengl/wiki/Program_Introspection#Arrays it shows a similar example, and also do as you say. > With the latter ones getting bogus locations? What is it supposed to > do in this case? Why it would get bogus locations? > > > + const struct glsl_type *array_type = type->fields.array; > > + if (array_type->base_type == GLSL_TYPE_STRUCT || > > + array_type->base_type == GLSL_TYPE_ARRAY) { > > + unsigned elem_location = location; > > + unsigned stride = array_type->count_attribute_slots(false); > > + for (unsigned i = 0; i < type->length; i++) { > > +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i); > > +if (!add_shader_variable(ctx, shProg, resource_set, > > + stage_mask, programInterface, > > + var, elem, array_type, > > + use_implicit_location, elem_location, > > + outermost_struct_type)) > > + return false; > > +elem_location += stride; > > + } > > + return true; > > + } > > + /* fallthrough */ > > + } > > + > > default: { > >/* The ARB_program_interface_query spec says: > > * > > -- > > 2.13.6 > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] radv: Disallow indirect outputs for GS on GFX9 as well.
Bas, this patch looks like it should have been marked as fixing 6ce550453f1 instead of e38685cc62e (?). In any case, I was wondering whether it would be interesting to bring them both to the 17.2 stable queue and whether we would also want Timothy's preceding patch: 087e010b2b3dd83a539f97203909d6c43b5da87c radv: copy indirect lowering settings from radeonsi Let me know what you think. On Sun, 2017-10-22 at 18:43 +0200, Bas Nieuwenhuizen wrote: > Since it also uses the output vector before writing to memory. > > Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' > --- > src/amd/vulkan/radv_shader.c | 4 +--- > 1 file changed, 1 insertion(+), 3 deletions(-) > > diff --git a/src/amd/vulkan/radv_shader.c b/src/amd/vulkan/radv_shader.c > index 07e68d6032b..6176a2e590d 100644 > --- a/src/amd/vulkan/radv_shader.c > +++ b/src/amd/vulkan/radv_shader.c > @@ -265,9 +265,7 @@ radv_shader_compile_to_nir(struct radv_device *device, > indirect_mask |= nir_var_shader_in; > } > if (!llvm_has_working_vgpr_indexing && > - (nir->info.stage == MESA_SHADER_VERTEX || > - nir->info.stage == MESA_SHADER_TESS_EVAL || > - nir->info.stage == MESA_SHADER_FRAGMENT)) > + nir->info.stage != MESA_SHADER_TESS_CTRL) > indirect_mask |= nir_var_shader_out; > > /* TODO: We shouldn't need to do this, however LLVM isn't currently -- Br, Andres ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH mesa] meson: wire up egl/android
Cc: Rob HerringCc: Tomasz Figa Signed-off-by: Eric Engestrom --- Completely untested! It's a step in the right direction though; doesn't hurt non-android, and gets android closer to building on meson :) --- meson.build | 13 +++-- src/egl/meson.build | 5 - 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/meson.build b/meson.build index 875f9d4d294d1911f239..761c33f4651ab37ab7b6 100644 --- a/meson.build +++ b/meson.build @@ -152,7 +152,7 @@ endif # TODO: other OSes with_dri_platform = 'drm' -# TODO: android platform +with_platform_android = false with_platform_wayland = false with_platform_x11 = false with_platform_drm = false @@ -161,6 +161,7 @@ egl_native_platform = '' _platforms = get_option('platforms') if _platforms != '' _split = _platforms.split(',') + with_platform_android = _split.contains('android') with_platform_x11 = _split.contains('x11') with_platform_wayland = _split.contains('wayland') with_platform_drm = _split.contains('drm') @@ -252,7 +253,7 @@ if _vulkan_drivers != '' with_intel_vk = _split.contains('intel') with_amd_vk = _split.contains('amd') with_any_vk = with_amd_vk or with_intel_vk - if not (with_platform_x11 or with_platform_wayland) + if not (with_platform_x11 or with_platform_wayland or with_platform_android) error('Vulkan requires at least one platform (x11, wayland)') endif endif @@ -330,6 +331,14 @@ endif if with_platform_surfaceless pre_args += '-DHAVE_SURFACELESS_PLATFORM' endif +if with_platform_android + dep_android = [ + dependency('cutils'), + dependency('hardware'), + dependency('sync'), + ] + pre_args += '-DHAVE_ANDROID_PLATFORM' +endif prog_python2 = find_program('python2') has_mako = run_command(prog_python2, '-c', 'import mako') diff --git a/src/egl/meson.build b/src/egl/meson.build index ea7ae06761f75c00a40c..cc51671f9d8f24708405 100644 --- a/src/egl/meson.build +++ b/src/egl/meson.build @@ -129,7 +129,10 @@ if with_platform_wayland 'wayland/wayland-egl', 'wayland/wayland-drm', ) endif -# TODO: android +if with_platform_android + deps_for_egl += dep_android + files_egl += files('drivers/dri2/platform_android.c') +endif # TODO: glvnd -- Cheers, Eric ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH mesa 1/2] buildsys: move file regeneration logic to the script itself
On Thursday, 2017-10-26 10:10:19 -0700, Dylan Baker wrote: > I have a few tiny nits, but otherwise this seems fine: > Reviewed-by: Dylan Baker> > Quoting Eric Engestrom (2017-10-26 08:40:56) > > Suggested-by: Jordan Justen > > Signed-off-by: Eric Engestrom > > --- > > bin/git_sha1_gen.py | 13 - > > src/Makefile.am | 15 --- > > src/SConscript | 22 ++ > > src/mesa/Android.libmesa_git_sha1.mk | 4 ++-- > > 4 files changed, 24 insertions(+), 30 deletions(-) > > > > diff --git a/bin/git_sha1_gen.py b/bin/git_sha1_gen.py > > index c75dba101acf4ffc85fb..7b9267b59e9d1d897cc4 100755 > > --- a/bin/git_sha1_gen.py > > +++ b/bin/git_sha1_gen.py > > @@ -6,6 +6,7 @@ > > """ > > > > > > +import argparse > > import os > > import os.path > > import subprocess > > @@ -27,10 +28,20 @@ def get_git_sha1(): > > git_sha1 = '' > > return git_sha1 > > > > +parser = argparse.ArgumentParser() > > +parser.add_argument('--output', help='File to write the #define in', > > +required=True) > > +args = parser.parse_args() > > > > git_sha1 = os.environ.get('MESA_GIT_SHA1_OVERRIDE', get_git_sha1())[:10] > > if git_sha1: > > git_sha1_h_in_path = os.path.join(os.path.dirname(sys.argv[0]), > > '..', 'src', 'git_sha1.h.in') > > with open(git_sha1_h_in_path , 'r') as git_sha1_h_in: > > -sys.stdout.write(git_sha1_h_in.read().replace('@VCS_TAG@', > > git_sha1)) > > +new_sha1 = git_sha1_h_in.read().replace('@VCS_TAG@', git_sha1) > > +if os.path.isfile(args.output): > > +with open(args.output, 'r') as git_sha1_h: > > +if git_sha1_h.read() == new_sha1: > > +quit() > > +with open(args.output, 'w') as git_sha1_h: > > +git_sha1_h.write(new_sha1) > > diff --git a/src/Makefile.am b/src/Makefile.am > > index 5ef2d4f55eacc470a7a9..1de4fca6a1216e7662b1 100644 > > --- a/src/Makefile.am > > +++ b/src/Makefile.am > > @@ -19,17 +19,10 @@ > > # FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER > > DEALINGS > > # IN THE SOFTWARE. > > > > -.PHONY: git_sha1.h.tmp > > -git_sha1.h.tmp: > > - @$(PYTHON2) $(top_srcdir)/bin/git_sha1_gen.py > $@ > > - > > -git_sha1.h: git_sha1.h.tmp > > - @echo "updating git_sha1.h" > > - @if ! cmp -s git_sha1.h.tmp git_sha1.h; then \ > > - mv git_sha1.h.tmp git_sha1.h ;\ > > - else \ > > - rm git_sha1.h.tmp ;\ > > - fi > > +.PHONY: git_sha1.h > > +git_sha1.h: $(top_srcdir)/src/git_sha1.h.in > > + @echo "updating $@" > > + @$(PYTHON2) $(top_srcdir)/bin/git_sha1_gen.py --output $@ > > > > BUILT_SOURCES = git_sha1.h > > CLEANFILES = $(BUILT_SOURCES) > > diff --git a/src/SConscript b/src/SConscript > > index a277e8b79254588e8fd4..820c3c01289e7544a57e 100644 > > --- a/src/SConscript > > +++ b/src/SConscript > > @@ -24,22 +24,12 @@ def write_git_sha1_h_file(filename): > > to retrieve the git hashid and write the header file. An empty file > > will be created if anything goes wrong.""" > > > > -tempfile = "git_sha1.h.tmp" > > -with open(tempfile, "w") as f: > > -args = [ python_cmd, Dir('#').abspath + '/bin/git_sha1_gen.py' ] > > -try: > > -subprocess.Popen(args, stdout=f).wait() > > -except: > > -print("Warning: exception in write_git_sha1_h_file()") > > -return > > - > > -if not os.path.exists(filename) or not filecmp.cmp(tempfile, filename): > > -# The filename does not exist or it's different from the new file, > > -# so replace old file with new. > > -if os.path.exists(filename): > > -os.remove(filename) > > -os.rename(tempfile, filename) > > -return > > +args = [ python_cmd, Dir('#').abspath + '/bin/git_sha1_gen.py', > > '--output', filename] > > space after the opening brace but none before the closing. I don't konw what > the > scons style is, but you should probably be consistent scons style is to have spaces inside [], which I added back now. thanks for spotting that :) > > > +try: > > +subprocess.Popen(args).wait() > > how about subprocess.call() instead of Popen().wait()? indeed, Popen() is not needed anymore. replaced with call(), tested and pushed :) > > > +except: > > +print("Warning: exception in write_git_sha1_h_file()") > > +return > > > > > > # Create the git_sha1.h header file > > diff --git a/src/mesa/Android.libmesa_git_sha1.mk > > b/src/mesa/Android.libmesa_git_sha1.mk > > index ddb30c428af739ee9fe5..d27923074dd289a115de 100644 > > --- a/src/mesa/Android.libmesa_git_sha1.mk > > +++ b/src/mesa/Android.libmesa_git_sha1.mk > > @@ -43,10 +43,10 @@ $(intermediates)/dummy.c: > > > > LOCAL_GENERATED_SOURCES +=
Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak
On 27 October 2017 at 14:53, Lionel Landwerlinwrote: > On 27/10/17 14:46, Emil Velikov wrote: >> >> On 27 October 2017 at 13:55, Tapani Pälli wrote: >>> >>> Patch uses mem_ctx for allocation to ensure param array gets freed >>> later, in blorp clear case this happens in blorp_params_get_clear_kernel. >>> >>> ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of >>> 193 >>> ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299) >>> ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121) >>> ==6164==by 0x130189F1: fs_visitor::assign_constant_locations() >>> (brw_fs.cpp:2095) >>> ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715) >>> ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) >>> (brw_fs.cpp:6229) >>> ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570) >>> ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194) >>> ==6164==by 0x130D384B: blorp_params_get_clear_kernel >>> (blorp_clear.c:79) >>> ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332) >>> ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261) >>> ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326) >>> ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297) >>> >>> Signed-off-by: Tapani Pälli >>> --- >>> src/intel/compiler/brw_fs.cpp | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/src/intel/compiler/brw_fs.cpp >>> b/src/intel/compiler/brw_fs.cpp >>> index 4616529abc..6b27c38be7 100644 >>> --- a/src/intel/compiler/brw_fs.cpp >>> +++ b/src/intel/compiler/brw_fs.cpp >>> @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations() >>> */ >>> uint32_t *param = stage_prog_data->param; >>> stage_prog_data->nr_params = num_push_constants; >>> - stage_prog_data->param = ralloc_array(NULL, uint32_t, >>> num_push_constants); >>> + stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, >>> num_push_constants); >> >> I'm likely having a dull moment: >> isn't param/stage_prog_data->param freed (via ralloc_free) at the end >> of the function? > > > The old one is freed, here we create a new one. I'm a genius, indeed is. Thanks for Lionel. -Emil Note to self: Don't skip afternoon coffee :-\ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types
On Fri, Oct 27, 2017 at 10:03 AM, Juan A. Suarez Romerowrote: > This patch is mostly a patch done by Ilia Mirkin. > > It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. > > CC: Ilia Mirkin > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 > Signed-off-by: Juan A. Suarez Romero > --- > src/compiler/glsl/linker.cpp | 36 > 1 file changed, 36 insertions(+) > > diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp > index f827b68555..7305b7740b 100644 > --- a/src/compiler/glsl/linker.cpp > +++ b/src/compiler/glsl/linker.cpp > @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx, >return true; > } > > + case GLSL_TYPE_ARRAY: { > + /* The ARB_program_interface_query spec says: > + * > + * "For an active variable declared as an array of basic types, a > + * single entry will be generated, with its name string formed by > + * concatenating the name of the array and the string "[0]"." > + * > + * "For an active variable declared as an array of an aggregate > data > + * type (structures or arrays), a separate entry will be generated > + * for each active array element, unless noted immediately below. > + * The name of each entry is formed by concatenating the name of > + * the array, the "[" character, an integer identifying the > element > + * number, and the "]" character. These enumeration rules are > + * applied recursively, treating each enumerated array element as > a > + * separate active variable." > + */ So if you have a TCS or TES or GS with in struct { vec4 foo; } bar[4]; Wouldn't this add bar[0].foo, bar[1].foo, bar[2].foo, and bar[3].foo? With the latter ones getting bogus locations? What is it supposed to do in this case? > + const struct glsl_type *array_type = type->fields.array; > + if (array_type->base_type == GLSL_TYPE_STRUCT || > + array_type->base_type == GLSL_TYPE_ARRAY) { > + unsigned elem_location = location; > + unsigned stride = array_type->count_attribute_slots(false); > + for (unsigned i = 0; i < type->length; i++) { > +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i); > +if (!add_shader_variable(ctx, shProg, resource_set, > + stage_mask, programInterface, > + var, elem, array_type, > + use_implicit_location, elem_location, > + outermost_struct_type)) > + return false; > +elem_location += stride; > + } > + return true; > + } > + /* fallthrough */ > + } > + > default: { >/* The ARB_program_interface_query spec says: > * > -- > 2.13.6 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: add varying resources for arrays of complex types
This patch is mostly a patch done by Ilia Mirkin. It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. CC: Ilia MirkinBugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 Signed-off-by: Juan A. Suarez Romero --- src/compiler/glsl/linker.cpp | 36 1 file changed, 36 insertions(+) diff --git a/src/compiler/glsl/linker.cpp b/src/compiler/glsl/linker.cpp index f827b68555..7305b7740b 100644 --- a/src/compiler/glsl/linker.cpp +++ b/src/compiler/glsl/linker.cpp @@ -3882,6 +3882,42 @@ add_shader_variable(const struct gl_context *ctx, return true; } + case GLSL_TYPE_ARRAY: { + /* The ARB_program_interface_query spec says: + * + * "For an active variable declared as an array of basic types, a + * single entry will be generated, with its name string formed by + * concatenating the name of the array and the string "[0]"." + * + * "For an active variable declared as an array of an aggregate data + * type (structures or arrays), a separate entry will be generated + * for each active array element, unless noted immediately below. + * The name of each entry is formed by concatenating the name of + * the array, the "[" character, an integer identifying the element + * number, and the "]" character. These enumeration rules are + * applied recursively, treating each enumerated array element as a + * separate active variable." + */ + const struct glsl_type *array_type = type->fields.array; + if (array_type->base_type == GLSL_TYPE_STRUCT || + array_type->base_type == GLSL_TYPE_ARRAY) { + unsigned elem_location = location; + unsigned stride = array_type->count_attribute_slots(false); + for (unsigned i = 0; i < type->length; i++) { +char *elem = ralloc_asprintf(shProg, "%s[%d]", name, i); +if (!add_shader_variable(ctx, shProg, resource_set, + stage_mask, programInterface, + var, elem, array_type, + use_implicit_location, elem_location, + outermost_struct_type)) + return false; +elem_location += stride; + } + return true; + } + /* fallthrough */ + } + default: { /* The ARB_program_interface_query spec says: * -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak
On 27/10/17 14:46, Emil Velikov wrote: On 27 October 2017 at 13:55, Tapani Pälliwrote: Patch uses mem_ctx for allocation to ensure param array gets freed later, in blorp clear case this happens in blorp_params_get_clear_kernel. ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193 ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299) ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121) ==6164==by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095) ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715) ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229) ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570) ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194) ==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79) ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332) ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261) ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326) ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297) Signed-off-by: Tapani Pälli --- src/intel/compiler/brw_fs.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 4616529abc..6b27c38be7 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations() */ uint32_t *param = stage_prog_data->param; stage_prog_data->nr_params = num_push_constants; - stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants); + stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, num_push_constants); I'm likely having a dull moment: isn't param/stage_prog_data->param freed (via ralloc_free) at the end of the function? The old one is freed, here we create a new one. -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak
On 27 October 2017 at 13:55, Tapani Pälliwrote: > Patch uses mem_ctx for allocation to ensure param array gets freed > later, in blorp clear case this happens in blorp_params_get_clear_kernel. > > ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193 > ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299) > ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121) > ==6164==by 0x130189F1: fs_visitor::assign_constant_locations() > (brw_fs.cpp:2095) > ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715) > ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229) > ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570) > ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194) > ==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79) > ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332) > ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261) > ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326) > ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297) > > Signed-off-by: Tapani Pälli > --- > src/intel/compiler/brw_fs.cpp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp > index 4616529abc..6b27c38be7 100644 > --- a/src/intel/compiler/brw_fs.cpp > +++ b/src/intel/compiler/brw_fs.cpp > @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations() > */ > uint32_t *param = stage_prog_data->param; > stage_prog_data->nr_params = num_push_constants; > - stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants); > + stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, > num_push_constants); I'm likely having a dull moment: isn't param/stage_prog_data->param freed (via ralloc_free) at the end of the function? -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH v3] radv: Implement VK_AMD_shader_info
This allows an app to query shader statistics and get a disassembly of a shader. RenderDoc git has support for it, so this allows you to view shader disassembly from a capture. When this extension is enabled on a device (or when tracing), we now disable pipeline caching, since we don't get the shader debug info when we retrieve cached shaders. v2: Improvements to resource usage reporting v3: Disassembly string must be null terminated (string_buffer's length does not include the terminator) Signed-off-by: Alex Smith--- src/amd/vulkan/radv_device.c | 9 ++ src/amd/vulkan/radv_extensions.py| 1 + src/amd/vulkan/radv_pipeline.c | 2 +- src/amd/vulkan/radv_pipeline_cache.c | 11 ++- src/amd/vulkan/radv_private.h| 3 + src/amd/vulkan/radv_shader.c | 179 +-- 6 files changed, 170 insertions(+), 35 deletions(-) diff --git a/src/amd/vulkan/radv_device.c b/src/amd/vulkan/radv_device.c index d25e9c97ba..0c2f6fa631 100644 --- a/src/amd/vulkan/radv_device.c +++ b/src/amd/vulkan/radv_device.c @@ -944,10 +944,15 @@ VkResult radv_CreateDevice( VkResult result; struct radv_device *device; + bool keep_shader_info = false; + for (uint32_t i = 0; i < pCreateInfo->enabledExtensionCount; i++) { const char *ext_name = pCreateInfo->ppEnabledExtensionNames[i]; if (!radv_physical_device_extension_supported(physical_device, ext_name)) return vk_error(VK_ERROR_EXTENSION_NOT_PRESENT); + + if (strcmp(ext_name, VK_AMD_SHADER_INFO_EXTENSION_NAME) == 0) + keep_shader_info = true; } /* Check enabled features */ @@ -1041,10 +1046,14 @@ VkResult radv_CreateDevice( device->physical_device->rad_info.max_se >= 2; if (getenv("RADV_TRACE_FILE")) { + keep_shader_info = true; + if (!radv_init_trace(device)) goto fail; } + device->keep_shader_info = keep_shader_info; + result = radv_device_init_meta(device); if (result != VK_SUCCESS) goto fail; diff --git a/src/amd/vulkan/radv_extensions.py b/src/amd/vulkan/radv_extensions.py index dfeb2880fc..eeb679d65a 100644 --- a/src/amd/vulkan/radv_extensions.py +++ b/src/amd/vulkan/radv_extensions.py @@ -81,6 +81,7 @@ EXTENSIONS = [ Extension('VK_EXT_global_priority', 1, 'device->rad_info.has_ctx_priority'), Extension('VK_AMD_draw_indirect_count', 1, True), Extension('VK_AMD_rasterization_order', 1, 'device->rad_info.chip_class >= VI && device->rad_info.max_se >= 2'), +Extension('VK_AMD_shader_info', 1, True), ] class VkVersion: diff --git a/src/amd/vulkan/radv_pipeline.c b/src/amd/vulkan/radv_pipeline.c index c25642c966..c6d9debc7e 100644 --- a/src/amd/vulkan/radv_pipeline.c +++ b/src/amd/vulkan/radv_pipeline.c @@ -1942,7 +1942,7 @@ void radv_create_shaders(struct radv_pipeline *pipeline, for (int i = 0; i < MESA_SHADER_STAGES; ++i) { free(codes[i]); - if (modules[i] && !pipeline->device->trace_bo) + if (modules[i] && !pipeline->device->keep_shader_info) ralloc_free(nir[i]); } diff --git a/src/amd/vulkan/radv_pipeline_cache.c b/src/amd/vulkan/radv_pipeline_cache.c index 5dee114749..441e2257d5 100644 --- a/src/amd/vulkan/radv_pipeline_cache.c +++ b/src/amd/vulkan/radv_pipeline_cache.c @@ -62,9 +62,11 @@ radv_pipeline_cache_init(struct radv_pipeline_cache *cache, cache->hash_table = malloc(byte_size); /* We don't consider allocation failure fatal, we just start with a 0-sized -* cache. */ +* cache. Disable caching when we want to keep shader debug info, since +* we don't get the debug info on cached shaders. */ if (cache->hash_table == NULL || - (device->instance->debug_flags & RADV_DEBUG_NO_CACHE)) + (device->instance->debug_flags & RADV_DEBUG_NO_CACHE) || + device->keep_shader_info) cache->table_size = 0; else memset(cache->hash_table, 0, byte_size); @@ -186,8 +188,11 @@ radv_create_shader_variants_from_pipeline_cache(struct radv_device *device, entry = radv_pipeline_cache_search_unlocked(cache, sha1); if (!entry) { + /* Again, don't cache when we want debug info, since this isn't +* present in the cache. */ if (!device->physical_device->disk_cache || - (device->instance->debug_flags & RADV_DEBUG_NO_CACHE)) { + (device->instance->debug_flags & RADV_DEBUG_NO_CACHE) || + device->keep_shader_info) { pthread_mutex_unlock(>mutex); return false; } diff --git
Re: [Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak
Reviewed-by: Lionel LandwerlinOn 27/10/17 13:55, Tapani Pälli wrote: Patch uses mem_ctx for allocation to ensure param array gets freed later, in blorp clear case this happens in blorp_params_get_clear_kernel. ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193 ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299) ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121) ==6164==by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095) ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715) ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229) ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570) ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194) ==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79) ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332) ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261) ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326) ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297) Signed-off-by: Tapani Pälli --- src/intel/compiler/brw_fs.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 4616529abc..6b27c38be7 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations() */ uint32_t *param = stage_prog_data->param; stage_prog_data->nr_params = num_push_constants; - stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants); + stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, num_push_constants); if (num_pull_constants > 0) { stage_prog_data->nr_pull_params = num_pull_constants; stage_prog_data->pull_param = ralloc_array(NULL, uint32_t, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: fix blorp stage_prog_data->param leak
Patch uses mem_ctx for allocation to ensure param array gets freed later, in blorp clear case this happens in blorp_params_get_clear_kernel. ==6164== 48 bytes in 1 blocks are definitely lost in loss record 61 of 193 ==6164==at 0x4C2EB6B: malloc (vg_replace_malloc.c:299) ==6164==by 0x12E31C6C: ralloc_size (ralloc.c:121) ==6164==by 0x130189F1: fs_visitor::assign_constant_locations() (brw_fs.cpp:2095) ==6164==by 0x13022D32: fs_visitor::optimize() (brw_fs.cpp:5715) ==6164==by 0x13024D5A: fs_visitor::run_fs(bool, bool) (brw_fs.cpp:6229) ==6164==by 0x1302549A: brw_compile_fs (brw_fs.cpp:6570) ==6164==by 0x130C4B07: blorp_compile_fs (blorp.c:194) ==6164==by 0x130D384B: blorp_params_get_clear_kernel (blorp_clear.c:79) ==6164==by 0x130D3C56: blorp_fast_clear (blorp_clear.c:332) ==6164==by 0x12EFA439: do_single_blorp_clear (brw_blorp.c:1261) ==6164==by 0x12EFC4AF: brw_blorp_clear_color (brw_blorp.c:1326) ==6164==by 0x12EFF72B: brw_clear (brw_clear.c:297) Signed-off-by: Tapani Pälli--- src/intel/compiler/brw_fs.cpp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/intel/compiler/brw_fs.cpp b/src/intel/compiler/brw_fs.cpp index 4616529abc..6b27c38be7 100644 --- a/src/intel/compiler/brw_fs.cpp +++ b/src/intel/compiler/brw_fs.cpp @@ -2092,7 +2092,7 @@ fs_visitor::assign_constant_locations() */ uint32_t *param = stage_prog_data->param; stage_prog_data->nr_params = num_push_constants; - stage_prog_data->param = ralloc_array(NULL, uint32_t, num_push_constants); + stage_prog_data->param = ralloc_array(mem_ctx, uint32_t, num_push_constants); if (num_pull_constants > 0) { stage_prog_data->nr_pull_params = num_pull_constants; stage_prog_data->pull_param = ralloc_array(NULL, uint32_t, -- 2.13.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE
On Fri, Oct 27, 2017 at 5:18 AM, Alejandro Piñeirowrote: > From the spec: >"IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the > resource when used as an image textures is returned in > . This is equivalent to calling GetTexParameter" > > So we would need to return None for any target not supported by > GetTexParameter. By mistake, we were using the target check for > GetTexLevelParameter. > --- > src/mesa/main/formatquery.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c > index 77c7faa2251..39c628039b8 100644 > --- a/src/mesa/main/formatquery.c > +++ b/src/mesa/main/formatquery.c > @@ -1430,7 +1430,13 @@ _mesa_GetInternalformativ(GLenum target, GLenum > internalformat, GLenum pname, >if (!_mesa_has_ARB_shader_image_load_store(ctx)) > goto end; > > - if (!_mesa_legal_get_tex_level_parameter_target(ctx, target, true)) > + /* As pointed by the spec quote below, this pname query should return > + * the same value that GetTexParameter. So if the target is not valid > + * for GetTexParameter we return the unsupported value. The check below > + * is the same target check used by GetTextParameter. GetTexParameter > + */ > + int targetIndex = _mesa_tex_target_to_index(ctx, target); > + if (targetIndex < 0 || targetIndex == TEXTURE_BUFFER_INDEX) > goto end; > >/* From spec: "Equivalent to calling GetTexParameter with set > -- > 2.11.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Android: egl: add dependency on libnativewindow
On 26 October 2017 at 20:18, Rob Herringwrote: > system/window.h is no longer available by default and is part of > libnativewindow, so add it to the shared libraries. It has to be conditional > because the library is only present in O and later. > > Really, we should only be depending on vndk/window.h now, but that's only > in O and changing would be pretty invasive. > > Signed-off-by: Rob Herring > --- > src/egl/Android.mk | 4 > 1 file changed, 4 insertions(+) > > diff --git a/src/egl/Android.mk b/src/egl/Android.mk > index 2de842ca4172..11818694f4f8 100644 > --- a/src/egl/Android.mk > +++ b/src/egl/Android.mk > @@ -60,6 +60,10 @@ LOCAL_SHARED_LIBRARIES := \ > libgralloc_drm \ > libsync > > +ifeq ($(filter $(MESA_ANDROID_MAJOR_VERSION), 4 5 6 7),) Didn't we remove Android 4.x from the supported list? Apart from that Reviewed-by: Emil Velikov -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] formatquery: use correct target check for IMAGE_FORMAT_COMPATIBILITY_TYPE
Reviewed-by: Marek OlšákMarek On Fri, Oct 27, 2017 at 11:18 AM, Alejandro Piñeiro wrote: > From the spec: >"IMAGE_FORMAT_COMPATIBILITY_TYPE: The matching criteria use for the > resource when used as an image textures is returned in > . This is equivalent to calling GetTexParameter" > > So we would need to return None for any target not supported by > GetTexParameter. By mistake, we were using the target check for > GetTexLevelParameter. > --- > src/mesa/main/formatquery.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/formatquery.c b/src/mesa/main/formatquery.c > index 77c7faa2251..39c628039b8 100644 > --- a/src/mesa/main/formatquery.c > +++ b/src/mesa/main/formatquery.c > @@ -1430,7 +1430,13 @@ _mesa_GetInternalformativ(GLenum target, GLenum > internalformat, GLenum pname, >if (!_mesa_has_ARB_shader_image_load_store(ctx)) > goto end; > > - if (!_mesa_legal_get_tex_level_parameter_target(ctx, target, true)) > + /* As pointed by the spec quote below, this pname query should return > + * the same value that GetTexParameter. So if the target is not valid > + * for GetTexParameter we return the unsupported value. The check below > + * is the same target check used by GetTextParameter. > + */ > + int targetIndex = _mesa_tex_target_to_index(ctx, target); > + if (targetIndex < 0 || targetIndex == TEXTURE_BUFFER_INDEX) > goto end; > >/* From spec: "Equivalent to calling GetTexParameter with set > -- > 2.11.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Android: move drivers' symlinks to /vendor
On Fri, Oct 27, 2017 at 6:41 AM, Emil Velikovwrote: > On 26 October 2017 at 23:48, Mauro Rossi wrote: >> Having moved gallium_dri.so library to /vendor/lib/dri >> also symlinks need to be coherently created using TARGET_OUT_VENDOR insted >> of TARGET_OUT >> or all non Intel drivers will not be loaded with Android N and earlier, >> thus causing SurfaceFlinger SIGABRT >> >> Fixes: c3f75d483c ("Android: move libraries to /vendor") >> >> Cc: 17.3 >> --- >> src/gallium/targets/dri/Android.mk | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) Reviewed-by: Rob Herring >> >> diff --git a/src/gallium/targets/dri/Android.mk >> b/src/gallium/targets/dri/Android.mk >> index 61b65769ff..3fa86a2d56 100644 >> --- a/src/gallium/targets/dri/Android.mk >> +++ b/src/gallium/targets/dri/Android.mk >> @@ -70,8 +70,8 @@ LOCAL_SHARED_LIBRARIES += $(sort $(GALLIUM_SHARED_LIBS)) >> ifneq ($(filter 5 6 7, $(MESA_ANDROID_MAJOR_VERSION)),) >> LOCAL_POST_INSTALL_CMD := \ >> $(foreach l, lib $(if $(filter true,$(TARGET_IS_64_BIT)),lib64), \ >> - mkdir -p $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \ >> - $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so >> $(TARGET_OUT)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \ >> + mkdir -p $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH); \ >> + $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so >> $(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH)/$(d)_dri.so;) \ > Can we fold the long path into a variable and then reuse it? > This code will be around for a bit, so it might be worth it. > > foo=$(TARGET_OUT_VENDOR)/$(l)/$(MESA_DRI_MODULE_REL_PATH) > mkdir -p $(foo) > $(foreach d, $(GALLIUM_TARGET_DRIVERS), ln -sf gallium_dri.so > $(foo)/$(d)_dri.so;) > > -Emil > *Please use better variable name than foo bar? ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 35/48] intel/eu: Make automatic exec sizes a configurable option
On Wed, 2017-10-25 at 16:26 -0700, Jason Ekstrand wrote: > We have had a feature in codegen for some time that tries to > automatically infer the execution size of an instruction from the > width > of its destination. For things such as fixed function GS, clipper, > and > SF programs, this is very useful because they tend to have lots of > hand-rolled register setup and trying to specify the exec size all > the > time would be prohibitive. For things that come from a higher-level > IR, > however, it's easier to just set the right size all the time and the > automatic exec sizes can, in fact, cause problems. This commit makes > it > optional while enabling it by default. > --- > src/intel/compiler/brw_eu.c | 1 + > src/intel/compiler/brw_eu.h | 10 ++ > src/intel/compiler/brw_eu_emit.c | 32 ++-- > > 3 files changed, 29 insertions(+), 14 deletions(-) > > diff --git a/src/intel/compiler/brw_eu.c > b/src/intel/compiler/brw_eu.c > index b0bdc38..bc297a2 100644 > --- a/src/intel/compiler/brw_eu.c > +++ b/src/intel/compiler/brw_eu.c > @@ -296,6 +296,7 @@ brw_init_codegen(const struct gen_device_info > *devinfo, > memset(p, 0, sizeof(*p)); > > p->devinfo = devinfo; > + p->automatic_exec_sizes = true; > /* > * Set the initial instruction store array size to 1024, if found > that > * isn't enough, then it will double the store size at > brw_next_insn() > diff --git a/src/intel/compiler/brw_eu.h > b/src/intel/compiler/brw_eu.h > index 8e597b2..8abebeb 100644 > --- a/src/intel/compiler/brw_eu.h > +++ b/src/intel/compiler/brw_eu.h > @@ -65,6 +65,16 @@ struct brw_codegen { > bool compressed_stack[BRW_EU_MAX_INSN_STACK]; > brw_inst *current; > > + /** Whether or not the user wants automatic exec sizes > +* > +* If true, codegen will try to automatically infer the exec size > of an > +* instruction from the width of the destination register. If > false, it > +* will take whatever is set by brw_set_default_exec_size > verbatim. > +* > +* This is set to true by default in brw_init_codegen. > +*/ > + bool automatic_exec_sizes; > + > bool single_program_flow; > const struct gen_device_info *devinfo; > > diff --git a/src/intel/compiler/brw_eu_emit.c > b/src/intel/compiler/brw_eu_emit.c > index fae74cf..902914f 100644 > --- a/src/intel/compiler/brw_eu_emit.c > +++ b/src/intel/compiler/brw_eu_emit.c > @@ -141,22 +141,26 @@ brw_set_dest(struct brw_codegen *p, brw_inst > *inst, struct brw_reg dest) > > /* Generators should set a default exec_size of either 8 (SIMD4x2 > or SIMD8) > * or 16 (SIMD16), as that's normally correct. However, when > dealing with > -* small registers, we automatically reduce it to match the > register size. > -* > -* In platforms that support fp64 we can emit instructions with a > width of > -* 4 that need two SIMD8 registers and an exec_size of 8 or 16. > In these > -* cases we need to make sure that these instructions have their > exec sizes > -* set properly when they are emitted and we can't rely on this > code to fix > -* it. > +* small registers, it can be useful for us toautomatically to automatically > reduce it to > +* match the register size. > */ > - bool fix_exec_size; > - if (devinfo->gen >= 6) > - fix_exec_size = dest.width < BRW_EXECUTE_4; > - else > - fix_exec_size = dest.width < BRW_EXECUTE_8; > + if (p->automatic_exec_sizes) { > + /* > + * In platforms that support fp64 we can emit instructions > with a width > + * of 4 that need two SIMD8 registers and an exec_size of 8 or > 16. In > + * these cases we need to make sure that these instructions > have their > + * exec sizes set properly when they are emitted and we can't > rely on > + * this code to fix it. > + */ > + bool fix_exec_size; > + if (devinfo->gen >= 6) > + fix_exec_size = dest.width < BRW_EXECUTE_4; > + else > + fix_exec_size = dest.width < BRW_EXECUTE_8; > > - if (fix_exec_size) > - brw_inst_set_exec_size(devinfo, inst, dest.width); > + if (fix_exec_size) > + brw_inst_set_exec_size(devinfo, inst, dest.width); > + } > } > > void ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 00/48] nir, intel: Prerequisites for subgroups
I dropped a few more comments on patches 18-20, 23, 25, 29-31, 34 and 38 but nothing major in general. I have doubts that patch 20 isn't working around some other bug and 38 scares me a little bit but I guess if Jenkins is happy I shouldn't worry too much. Otherwise, patches 16-38 are: Reviewed-by: Iago Toral Quiroga(patch 15 already has your Rb) Iago On Thu, 2017-10-26 at 14:15 +0200, Iago Toral wrote: > I left a few minor comments in patches 1, 2, 8 and 14. Otherwise > patches 1-2, 4-5 and 7-14 (3 and 6 already have Rb) are: > > Reviewed-by: Iago Toral Quiroga > > I feel like patches 10, 11 could maybe use another extra review if > there is someone who wants to do it, since I am not very familiar > with > how all the indirect addressing stuff works and the restrictions of > the > hardware that affect this. > > Patch 14 looks good, although the part where you locate the DO block > for a matching WHILE could probably use the review of someone else > more > familiar with the CFG code than me. > > Iago > > On Wed, 2017-10-25 at 16:25 -0700, Jason Ekstrand wrote: > > This series is a third respin of my subgroups prerequisites series > > that > > that I sent out a few weeks ago. Not a whole lot has changed but > > there are > > some new patches. Primarily, > > > > 1) Some patches which were reviewed by Matt and Lionel were pushed > > and are > > no longer in the series. Thanks guys! > > > > 2) I've applied R-B tags from various people for patches which are > > reviewed but depend on still unreviewed patches. > > > > 3) A few patches to fix little-core. In particular, the extra > > little-core > > EU restrictions cause problems for BROADCAST, MOV_INDIRECT, and > > integer > > MUL. > > > > This series can be found on fd.o nere: > > > > https://cgit.freedesktop.org/~jekstrand/mesa/log/?h=review/subgroup > > -p > > rereqs-v3 > > > > Happy reviewing! > > > > > > Cc: Matt Turner > > Cc: Francisco Jerez > > Cc: Connor Abbott > > > > Francisco Jerez (1): > > intel/fs: Restrict live intervals to the subset possibly > > reachable > > from any definition. > > > > Jason Ekstrand (47): > > intel/fs: Pass builders instead of blocks into emit_[un]zip > > intel/fs: Be more explicit about our placement of [un]zip > > intel/fs: Use ANY/ALL32 predicates in SIMD32 > > intel/fs: Don't stomp f0.1 in SIMD16 ballot > > intel/fs: Use an explicit D type for vote any/all/eq intrinsics > > intel/fs: Use a pair of 1-wide MOVs instead of SEL for any/all > > intel/compiler: Add some restrictions to MOV_INDIRECT and > > BROADCAST > > intel/eu: Just modify the offset in brw_broadcast > > intel/eu/reg: Add a subscript() helper > > intel/eu: Fix broadcast instruction for 64-bit values on little- > > core > > intel/fs: Fix MOV_INDIRECT for 64-bit values on little-core > > intel/fs: Fix integer multiplication lowering for src/dst hazards > > intel/fs: Use the original destination region for int MUL > > lowering > > i965/fs: Extend the live ranges of VGRFs which leave loops > > i965/fs/nir: Simplify 64-bit store_output > > i965/fs: Return a fs_reg from shuffle_64bit_data_for_32bit_write > > i965/fs/nir: Minor refactor of store_output > > i965/fs/nir: Don't stomp 64-bit values to D in get_nir_src > > intel/fs: Protect opt_algebraic from OOB BROADCAST indices > > intel/fs: Uniformize the index in readInvocation > > intel/fs: Retype dest to match value in read[First]Invocation > > intel/fs: Assign constant locations if they haven't been assigned > > intel/fs: Remove min_dispatch_width from fs_visitor > > intel/cs: Drop max_dispatch_width checks from compile_cs > > intel/cs: Stop setting dispatch_grf_start_reg > > intel/cs: Ignore runtime_check_aads_emit for CS > > intel/fs: Mark 64-bit values as being contiguous > > intel/cs: Rework the way thread local ID is handled > > intel/cs: Re-run final NIR optimizations for each SIMD size > > intel/cs: Re-run final NIR optimizations for each SIMD size > > intel/cs: Push subgroup ID instead of base thread ID > > intel/compiler/fs: Set up subgroup invocation as a system value > > intel/fs: Rework zero-length URB write handling > > intel/eu: Make automatic exec sizes a configurable option > > intel/eu: Explicitly set EXECUTE_1 where needed > > intel/fs: Explicitly set EXECUTE_1 where needed > > intel/fs: Don't use automatic exec size inference > > nir: Add a new subgroups lowering pass > > nir: Add a ssa_dest_init_for_type helper > > nir: Make ballot intrinsics variable-size > > nir/lower_system_values: Lower SUBGROUP_*_MASK based on type > > nir/lower_subgroups: Lower ballot intrinsics to the specified bit > > size > > nir,intel/compiler: Use a fixed subgroup size > > spirv: Add a vtn_constant_value helper > > spirv: Rework barriers > > nir: Validate base types on