[Mesa-dev] [PATCH] nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers
If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV. v2: remove the check for source source modifiers from is_move() (Jason) --- src/glsl/nir/nir_opt_copy_propagate.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/src/glsl/nir/nir_opt_copy_propagate.c b/src/glsl/nir/nir_opt_copy_propagate.c index 7d8bdd7..2611069 100644 --- a/src/glsl/nir/nir_opt_copy_propagate.c +++ b/src/glsl/nir/nir_opt_copy_propagate.c @@ -41,11 +41,6 @@ static bool is_move(nir_alu_instr *instr) if (instr->dest.saturate) return false; - /* we handle modifiers in a separate pass */ - - if (instr->src[0].abs || instr->src[0].negate) - return false; - if (!instr->src[0].src.is_ssa) return false; @@ -65,9 +60,13 @@ static bool is_vec(nir_alu_instr *instr) } static bool -is_swizzleless_move(nir_alu_instr *instr) +is_simple_move(nir_alu_instr *instr) { if (is_move(instr)) { + /* We handle modifiers in a separate pass */ + if (instr->src[0].negate || instr->src[0].abs) + return false; + for (unsigned i = 0; i < 4; i++) { if (!((instr->dest.write_mask >> i) & 1)) break; @@ -81,6 +80,10 @@ is_swizzleless_move(nir_alu_instr *instr) if (instr->src[i].swizzle[0] != i) return false; + /* We handle modifiers in a separate pass */ + if (instr->src[i].negate || instr->src[i].abs) +return false; + if (def == NULL) { def = instr->src[i].src.ssa; } else if (instr->src[i].src.ssa != def) { @@ -107,7 +110,7 @@ copy_prop_src(nir_src *src, nir_instr *parent_instr, nir_if *parent_if) return false; nir_alu_instr *alu_instr = nir_instr_as_alu(src_instr); - if (!is_swizzleless_move(alu_instr)) + if (!is_simple_move(alu_instr)) return false; /* Don't let copy propagation land us with a phi that has more -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: set precision qualifier on interface block members
On 12/11/15 08:13, Tapani Pälli wrote: > > > On 11/12/2015 09:11 AM, Samuel Iglesias Gonsálvez wrote: >> Reviewed-by: Samuel Iglesias Gonsálvez >> >> Are you planning to merge it to patch 5/7 or keep it standalone? > > Yeah, I'll merge this to patch 5/7, otherwise there are some failures > between patches when bisecting. > OK, thanks! Sam >> Sam >> >> On 12/11/15 07:57, Tapani Pälli wrote: >>> Signed-off-by: Tapani Pälli >>> --- >>> src/glsl/ast_to_hir.cpp | 1 + >>> 1 file changed, 1 insertion(+) >>> >>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp >>> index 2fd9c5b..51ea183 100644 >>> --- a/src/glsl/ast_to_hir.cpp >>> +++ b/src/glsl/ast_to_hir.cpp >>> @@ -6161,6 +6161,7 @@ >>> ast_process_structure_or_interface_block(exec_list *instructions, >>>fields[i].centroid = qual->flags.q.centroid ? 1 : 0; >>>fields[i].sample = qual->flags.q.sample ? 1 : 0; >>>fields[i].patch = qual->flags.q.patch ? 1 : 0; >>> + fields[i].precision = qual->precision; >>> >>>/* From Section 4.4.2.3 (Geometry Outputs) of the GLSL >>> 4.50 spec: >>> * >>> > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: set precision qualifier on interface block members
On 11/12/2015 09:11 AM, Samuel Iglesias Gonsálvez wrote: Reviewed-by: Samuel Iglesias Gonsálvez Are you planning to merge it to patch 5/7 or keep it standalone? Yeah, I'll merge this to patch 5/7, otherwise there are some failures between patches when bisecting. Sam On 12/11/15 07:57, Tapani Pälli wrote: Signed-off-by: Tapani Pälli --- src/glsl/ast_to_hir.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 2fd9c5b..51ea183 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -6161,6 +6161,7 @@ ast_process_structure_or_interface_block(exec_list *instructions, fields[i].centroid = qual->flags.q.centroid ? 1 : 0; fields[i].sample = qual->flags.q.sample ? 1 : 0; fields[i].patch = qual->flags.q.patch ? 1 : 0; + fields[i].precision = qual->precision; /* From Section 4.4.2.3 (Geometry Outputs) of the GLSL 4.50 spec: * ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: set precision qualifier on interface block members
Reviewed-by: Samuel Iglesias Gonsálvez Are you planning to merge it to patch 5/7 or keep it standalone? Sam On 12/11/15 07:57, Tapani Pälli wrote: > Signed-off-by: Tapani Pälli > --- > src/glsl/ast_to_hir.cpp | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index 2fd9c5b..51ea183 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -6161,6 +6161,7 @@ ast_process_structure_or_interface_block(exec_list > *instructions, > fields[i].centroid = qual->flags.q.centroid ? 1 : 0; > fields[i].sample = qual->flags.q.sample ? 1 : 0; > fields[i].patch = qual->flags.q.patch ? 1 : 0; > + fields[i].precision = qual->precision; > > /* From Section 4.4.2.3 (Geometry Outputs) of the GLSL 4.50 spec: >* > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: set precision qualifier on interface block members
Signed-off-by: Tapani Pälli --- src/glsl/ast_to_hir.cpp | 1 + 1 file changed, 1 insertion(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 2fd9c5b..51ea183 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -6161,6 +6161,7 @@ ast_process_structure_or_interface_block(exec_list *instructions, fields[i].centroid = qual->flags.q.centroid ? 1 : 0; fields[i].sample = qual->flags.q.sample ? 1 : 0; fields[i].patch = qual->flags.q.patch ? 1 : 0; + fields[i].precision = qual->precision; /* From Section 4.4.2.3 (Geometry Outputs) of the GLSL 4.50 spec: * -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] additional change to precision patches
Hi; Here's one additional change (can be considered as patch 5.1/7) that was missing from the precision patches. Tapani Pälli (1): glsl: set precision qualifier on interface block members src/glsl/ast_to_hir.cpp | 1 + 1 file changed, 1 insertion(+) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: simplify validation of illegal layouts on block members
From: Timothy Arceri We already give the location of the qualifier so there is no need to list all the identifiers in the error message. Also not calling the binding validation function will make things much simpler when adding compile time constant support as we wont need to resolve the binding value. --- src/glsl/ast_to_hir.cpp | 12 1 file changed, 4 insertions(+), 8 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 9d341e8..79bd4e7 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -5896,18 +5896,14 @@ ast_process_structure_or_interface_block(exec_list *instructions, const struct ast_type_qualifier *const qual = & decl_list->type->qualifier; - if (qual->flags.q.explicit_binding) -validate_binding_qualifier(state, &loc, decl_type, qual); - if (qual->flags.q.std140 || qual->flags.q.std430 || qual->flags.q.packed || - qual->flags.q.shared) { + qual->flags.q.shared || + qual->flags.q.explicit_binding) { _mesa_glsl_error(&loc, state, - "uniform/shader storage block layout qualifiers " - "std140, std430, packed, and shared can only be " - "applied to uniform/shader storage blocks, not " - "members"); + "this layout qualifier can only be applied to " + "uniform/shader storage blocks, not members"); } if (qual->flags.q.constant) { -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 3/3] glsl: Use array deref for access to vector components
On Wed, Nov 11, 2015 at 8:47 PM, Ilia Mirkin wrote: > On Nov 11, 2015 9:10 PM, "Matt Turner" wrote: >> >> On Thu, Nov 5, 2015 at 9:44 PM, Kristian Høgsberg Kristensen >> wrote: >> > diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp >> > index e4e4a3f..5584470 100644 >> > --- a/src/glsl/ast_function.cpp >> > +++ b/src/glsl/ast_function.cpp >> > @@ -376,12 +368,8 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, >> > const glsl_type *formal_type, >> > >> > ir_rvalue *lhs = actual; >> > if (expr != NULL && expr->operation == ir_binop_vector_extract) { >> > - rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert, >> > - expr->operands[0]->type, >> > - >> > expr->operands[0]->clone(mem_ctx, NULL), >> > - rhs, >> > - >> > expr->operands[1]->clone(mem_ctx, NULL)); >> > - lhs = expr->operands[0]->clone(mem_ctx, NULL); >> > + lhs == new(mem_ctx) >> > ir_dereference_array(expr->operands[0]->clone(mem_ctx, NULL), >> > + >> > expr->operands[1]->clone(mem_ctx, NULL)); >> >> >> I'm getting >> >> ../../../mesa/src/glsl/ast_function.cpp: In function ‘void >> fix_parameter(void*, ir_rvalue*, const glsl_type*, exec_list*, >> exec_list*, bool)’: >> ../../../mesa/src/glsl/ast_function.cpp:372:88: warning: right operand >> of comma operator has no effect [-Wunused-value] >> >> expr->operands[1]->clone(mem_ctx, NULL)); >> >> from this, but I can't interpret the warning. >> > > Probably one equal sign too many... Wow! How it this working?! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 3/3] glsl: Use array deref for access to vector components
On Nov 11, 2015 9:10 PM, "Matt Turner" wrote: > > On Thu, Nov 5, 2015 at 9:44 PM, Kristian Høgsberg Kristensen > wrote: > > diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp > > index e4e4a3f..5584470 100644 > > --- a/src/glsl/ast_function.cpp > > +++ b/src/glsl/ast_function.cpp > > @@ -376,12 +368,8 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type, > > > > ir_rvalue *lhs = actual; > > if (expr != NULL && expr->operation == ir_binop_vector_extract) { > > - rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert, > > - expr->operands[0]->type, > > - expr->operands[0]->clone(mem_ctx, NULL), > > - rhs, > > - expr->operands[1]->clone(mem_ctx, NULL)); > > - lhs = expr->operands[0]->clone(mem_ctx, NULL); > > + lhs == new(mem_ctx) ir_dereference_array(expr->operands[0]->clone(mem_ctx, NULL), > > + expr->operands[1]->clone(mem_ctx, NULL)); > > > I'm getting > > ../../../mesa/src/glsl/ast_function.cpp: In function ‘void > fix_parameter(void*, ir_rvalue*, const glsl_type*, exec_list*, > exec_list*, bool)’: > ../../../mesa/src/glsl/ast_function.cpp:372:88: warning: right operand > of comma operator has no effect [-Wunused-value] > > expr->operands[1]->clone(mem_ctx, NULL)); > > from this, but I can't interpret the warning. > Probably one equal sign too many... -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 3/3] glsl: Use array deref for access to vector components
On Thu, Nov 5, 2015 at 9:44 PM, Kristian Høgsberg Kristensen wrote: > diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp > index e4e4a3f..5584470 100644 > --- a/src/glsl/ast_function.cpp > +++ b/src/glsl/ast_function.cpp > @@ -376,12 +368,8 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, const > glsl_type *formal_type, > > ir_rvalue *lhs = actual; > if (expr != NULL && expr->operation == ir_binop_vector_extract) { > - rhs = new(mem_ctx) ir_expression(ir_triop_vector_insert, > - expr->operands[0]->type, > - expr->operands[0]->clone(mem_ctx, > NULL), > - rhs, > - expr->operands[1]->clone(mem_ctx, > NULL)); > - lhs = expr->operands[0]->clone(mem_ctx, NULL); > + lhs == new(mem_ctx) > ir_dereference_array(expr->operands[0]->clone(mem_ctx, NULL), > + > expr->operands[1]->clone(mem_ctx, NULL)); I'm getting ../../../mesa/src/glsl/ast_function.cpp: In function ‘void fix_parameter(void*, ir_rvalue*, const glsl_type*, exec_list*, exec_list*, bool)’: ../../../mesa/src/glsl/ast_function.cpp:372:88: warning: right operand of comma operator has no effect [-Wunused-value] expr->operands[1]->clone(mem_ctx, NULL)); from this, but I can't interpret the warning. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 18/24] i965: Combine register file field.
On Tue, Nov 3, 2015 at 10:49 AM, Emil Velikov wrote: > On 3 November 2015 at 18:10, Matt Turner wrote: >> On Tue, Nov 3, 2015 at 8:07 AM, Emil Velikov >> wrote: >>> On 3 November 2015 at 00:29, Matt Turner wrote: >>> index 6eeafd5..3d2b051 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -423,7 +423,6 @@ fs_reg::fs_reg(uint8_t vf0, uint8_t vf1, uint8_t vf2, uint8_t vf3) fs_reg::fs_reg(struct brw_reg reg) : backend_reg(reg) { - this->file = (enum register_file)reg.file; >>> Should we fold the remaining this->file = foo into the backend_reg() ctors ? >> >> Not necessary. The only remaining file field after this commit is in >> brw_reg. So the backend_reg(reg) constructor is already doing that. > I might be missing a patch but I think that the default > fs/src/dst_reg() ctors still have it ? > Either way it's not a show stopper. I checked, and without removing init() and initializing backend_reg() from fs_reg/src_reg/dst_reg, there's no way to do this. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/11] i965/nir: Do texture rectangle lowering in NIR
On Wed, Nov 11, 2015 at 5:23 PM, Jason Ekstrand wrote: > On older hardware (Iron Lake and below), we can't support texture rectangle > natively. Sandy Bridge through Haswell can support it but don't support > the GL_CLAMP wrap mode natively. It isn't until Broadwell that GL_CLAMP is > supported together with GL_TEXTURE_RECTANGLE in hardware. In the cases > where it isn't supported, we have to fake it by dividing by the texture > size. > > Previously, we had a rescale_texcoord function added a uniform to hold the > texture coordinate and used that to rescale/clamp the texture coordinates. > For a while now, nir_lower_tex has been able to lower texture rectangle to > a textureSize and a regular texture2D operation. This series makes i965 > use the nir_lower_tex path instead. Incidentally, this fixes texture > rectangle support in vertex and geometry shaders on Haswell and below. > (The backend lowering was only ever done in the FS backend.) > > Since this is the first time we're doing any sort of shader variants in > NIR, the first several passes add the infastructure to do so. Two of these > patches are from Ken, two are from Rob, and one (nir_clone itself) is my > rendition but heavily based on what Rob did only with less hashing. Once again, git-send-email failed to send one of the patches (nir_clone). You can find the whole thing on my freedesktop cgit: http://cgit.freedesktop.org/~jekstrand/mesa/log/?h=wip/i965-nir-variants > Jason Ekstrand (7): > nir: support to clone shaders > i965/nir: Split shader optimization and lowering into three satages > i965: Move postprocess_nir to codegen time > nir/lower_tex: Report progress > nir/lower_tex: Set the dest_type for txs instructions > i965/fs: Don't allow SINT32 as a return type for resinfo > i965: Use nir_lower_tex for texture coordinate lowering > > Kenneth Graunke (2): > i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes. > i965/nir: Validate that NIR passes call nir_metadata_preserve(). > > Rob Clark (2): > nir: remove nir_variable::max_ifc_array_access > nir: add array length field > > src/glsl/Makefile.sources | 1 + > src/glsl/nir/glsl_to_nir.cpp | 14 +- > src/glsl/nir/nir.c| 8 + > src/glsl/nir/nir.h| 27 +- > src/glsl/nir/nir_clone.c | 671 > ++ > src/glsl/nir/nir_lower_tex.c | 20 +- > src/glsl/nir/nir_metadata.c | 36 ++ > src/mesa/drivers/dri/i965/brw_fs.cpp | 13 +- > src/mesa/drivers/dri/i965/brw_fs.h| 3 - > src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 10 +- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 +- > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 125 > src/mesa/drivers/dri/i965/brw_nir.c | 268 + > src/mesa/drivers/dri/i965/brw_nir.h | 15 + > src/mesa/drivers/dri/i965/brw_vec4.cpp| 7 +- > src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 8 +- > 16 files changed, 966 insertions(+), 264 deletions(-) > create mode 100644 src/glsl/nir/nir_clone.c > > -- > 2.5.0.400.gff86faf > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 10/11] i965/fs: Don't allow SINT32 as a return type for resinfo
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 10 +- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 974219f..dad541b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -680,7 +680,15 @@ fs_generator::generate_tex(fs_inst *inst, struct brw_reg dst, struct brw_reg src switch (dst.type) { case BRW_REGISTER_TYPE_D: - return_format = BRW_SAMPLER_RETURN_FORMAT_SINT32; + /* SINT32 isn't actually allowed for TXS. This isn't explicitly stated + * in the PRM, but the i965 PRM explicitly lists UINT32 and FLOAT32 as + * being valid for resinfo but not SINT32 (Vol. 4 Section 4.8.1.1). + * Emperical testing has also verified this. + */ + if (inst->opcode == SHADER_OPCODE_TXS) + return_format = BRW_SAMPLER_RETURN_FORMAT_UINT32; + else + return_format = BRW_SAMPLER_RETURN_FORMAT_SINT32; break; case BRW_REGISTER_TYPE_UD: return_format = BRW_SAMPLER_RETURN_FORMAT_UINT32; -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 11/11] i965: Use nir_lower_tex for texture coordinate lowering
Previously, we had a rescale_texcoords helper in the FS backend for handling rescaling of texture coordinates. Now that we can do variants in NIR, we can use nir_lower_tex to do the rescaling for us. This allows us to delete the i965-specific code and gives us proper TEXTURE_RECTANGLE and GL_CLAMP handling in vertex and geometry shaders. --- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 + src/mesa/drivers/dri/i965/brw_fs.h| 3 - src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 125 -- src/mesa/drivers/dri/i965/brw_nir.c | 23 src/mesa/drivers/dri/i965/brw_nir.h | 6 ++ src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 + src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 2 + 8 files changed, 36 insertions(+), 131 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index b8713ab..c56cafe 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -5468,6 +5468,7 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data, char **error_str) { nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true); brw_postprocess_nir(shader, compiler->devinfo, true); /* key->alpha_test_func means simulating alpha testing via discards, @@ -5628,6 +5629,7 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data, char **error_str) { nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + brw_nir_apply_sampler_key(shader, compiler->devinfo, &key->tex, true); brw_postprocess_nir(shader, compiler->devinfo, true); prog_data->local_size[0] = shader->info.cs.local_size[0]; diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 2dfcab1..8a181d7 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -217,8 +217,6 @@ public: void emit_interpolation_setup_gen4(); void emit_interpolation_setup_gen6(); void compute_sample_position(fs_reg dst, fs_reg int_sample_pos); - fs_reg rescale_texcoord(fs_reg coordinate, int coord_components, - bool is_rect, uint32_t sampler); void emit_texture(ir_texture_opcode op, const glsl_type *dest_type, fs_reg coordinate, int components, @@ -229,7 +227,6 @@ public: fs_reg mcs, int gather_component, bool is_cube_array, - bool is_rect, uint32_t sampler, fs_reg sampler_reg); fs_reg emit_mcs_fetch(const fs_reg &coordinate, unsigned components, diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 02b9f5b..3d83d7c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2411,8 +2411,6 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr) int gather_component = instr->component; - bool is_rect = instr->sampler_dim == GLSL_SAMPLER_DIM_RECT; - bool is_cube_array = instr->sampler_dim == GLSL_SAMPLER_DIM_CUBE && instr->is_array; @@ -2549,7 +2547,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld, nir_tex_instr *instr) emit_texture(op, dest_type, coordinate, instr->coord_components, shadow_comparitor, lod, lod2, lod_components, sample_index, tex_offset, mcs, gather_component, -is_cube_array, is_rect, sampler, sampler_reg); +is_cube_array, sampler, sampler_reg); fs_reg dest = get_nir_dest(instr->dest); dest.type = this->result.type; diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 213c912..faf304c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -79,122 +79,6 @@ fs_visitor::emit_vs_system_value(int location) return reg; } -fs_reg -fs_visitor::rescale_texcoord(fs_reg coordinate, int coord_components, - bool is_rect, uint32_t sampler) -{ - bool needs_gl_clamp = true; - fs_reg scale_x, scale_y; - - /* The 965 requires the EU to do the normalization of GL rectangle -* texture coordinates. We use the program parameter state -* tracking to get the scaling factor. -*/ - if (is_rect && - (devinfo->gen < 6 || -(devinfo->gen >= 6 && (key_tex->gl_clamp_mask[0] & (1 << sampler) || - key_tex->gl_clamp_mask[1] & (1 << sampler) { - struct gl_program_parameter_list *params = prog->Parameters; - - - /* FINISHME: We're failing to recompile our programs wh
[Mesa-dev] [PATCH 07/11] i965: Move postprocess_nir to codegen time
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 11 +-- src/mesa/drivers/dri/i965/brw_nir.c | 1 - src/mesa/drivers/dri/i965/brw_vec4.cpp| 5 - src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 6 +- 4 files changed, 18 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index ad94fa4..b8713ab 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -43,6 +43,7 @@ #include "brw_wm.h" #include "brw_fs.h" #include "brw_cs.h" +#include "brw_nir.h" #include "brw_vec4_gs_visitor.h" #include "brw_cfg.h" #include "brw_dead_control_flow.h" @@ -5459,13 +5460,16 @@ brw_compile_fs(const struct brw_compiler *compiler, void *log_data, void *mem_ctx, const struct brw_wm_prog_key *key, struct brw_wm_prog_data *prog_data, - const nir_shader *shader, + const nir_shader *src_shader, struct gl_program *prog, int shader_time_index8, int shader_time_index16, bool use_rep_send, unsigned *final_assembly_size, char **error_str) { + nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + brw_postprocess_nir(shader, compiler->devinfo, true); + /* key->alpha_test_func means simulating alpha testing via discards, * so the shader definitely kills pixels. */ @@ -5618,11 +5622,14 @@ brw_compile_cs(const struct brw_compiler *compiler, void *log_data, void *mem_ctx, const struct brw_cs_prog_key *key, struct brw_cs_prog_data *prog_data, - const nir_shader *shader, + const nir_shader *src_shader, int shader_time_index, unsigned *final_assembly_size, char **error_str) { + nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + brw_postprocess_nir(shader, compiler->devinfo, true); + prog_data->local_size[0] = shader->info.cs.local_size[0]; prog_data->local_size[1] = shader->info.cs.local_size[1]; prog_data->local_size[2] = shader->info.cs.local_size[2]; diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 21c2648..693b9cd 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -391,7 +391,6 @@ brw_create_nir(struct brw_context *brw, brw_preprocess_nir(nir, is_scalar); brw_lower_nir(nir, devinfo, shader_prog, is_scalar); - brw_postprocess_nir(nir, devinfo, is_scalar); return nir; } diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 8350a02..9f75bb6 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -2028,13 +2028,16 @@ brw_compile_vs(const struct brw_compiler *compiler, void *log_data, void *mem_ctx, const struct brw_vs_prog_key *key, struct brw_vs_prog_data *prog_data, - const nir_shader *shader, + const nir_shader *src_shader, gl_clip_plane *clip_planes, bool use_legacy_snorm_formula, int shader_time_index, unsigned *final_assembly_size, char **error_str) { + nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + brw_postprocess_nir(shader, compiler->devinfo, compiler->scalar_vs); + const unsigned *assembly = NULL; unsigned nr_attributes = _mesa_bitcount_64(prog_data->inputs_read); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index 49c1083..92b15d9 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -30,6 +30,7 @@ #include "brw_vec4_gs_visitor.h" #include "gen6_gs_visitor.h" #include "brw_fs.h" +#include "brw_nir.h" namespace brw { @@ -604,7 +605,7 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data, void *mem_ctx, const struct brw_gs_prog_key *key, struct brw_gs_prog_data *prog_data, - const nir_shader *shader, + const nir_shader *src_shader, struct gl_shader_program *shader_prog, int shader_time_index, unsigned *final_assembly_size, @@ -614,6 +615,9 @@ brw_compile_gs(const struct brw_compiler *compiler, void *log_data, memset(&c, 0, sizeof(c)); c.key = *key; + nir_shader *shader = nir_shader_clone(mem_ctx, src_shader); + brw_postprocess_nir(shader, compiler->devinfo, compiler->scalar_gs); + prog_data->include_primitive_id = (shader->info.inputs_read & VARYING_BIT_PRIMITIVE_ID) != 0; -- 2.5.0.400.gff86faf ___ mesa-dev maili
[Mesa-dev] [PATCH 09/11] nir/lower_tex: Set the dest_type for txs instructions
--- src/glsl/nir/nir_lower_tex.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/glsl/nir/nir_lower_tex.c b/src/glsl/nir/nir_lower_tex.c index 21ed103..6dea837 100644 --- a/src/glsl/nir/nir_lower_tex.c +++ b/src/glsl/nir/nir_lower_tex.c @@ -134,6 +134,7 @@ get_texture_size(nir_builder *b, nir_tex_instr *tex) txs->op = nir_texop_txs; txs->sampler_dim = GLSL_SAMPLER_DIM_RECT; txs->sampler_index = tex->sampler_index; + txs->dest_type = nir_type_int; /* only single src, the lod: */ txs->src[0].src = nir_src_for_ssa(nir_imm_int(b, 0)); -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 08/11] nir/lower_tex: Report progress
--- src/glsl/nir/nir.h | 2 +- src/glsl/nir/nir_lower_tex.c | 19 +++ 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 41125b1..2299ece 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1981,7 +1981,7 @@ typedef struct nir_lower_tex_options { unsigned saturate_r; } nir_lower_tex_options; -void nir_lower_tex(nir_shader *shader, +bool nir_lower_tex(nir_shader *shader, const nir_lower_tex_options *options); void nir_lower_idiv(nir_shader *shader); diff --git a/src/glsl/nir/nir_lower_tex.c b/src/glsl/nir/nir_lower_tex.c index 8aaa48a..21ed103 100644 --- a/src/glsl/nir/nir_lower_tex.c +++ b/src/glsl/nir/nir_lower_tex.c @@ -41,6 +41,7 @@ typedef struct { nir_builder b; const nir_lower_tex_options *options; + bool progress; } lower_tex_state; static void @@ -239,15 +240,21 @@ nir_lower_tex_block(nir_block *block, void *void_state) /* If we are clamping any coords, we must lower projector first * as clamping happens *after* projection: */ - if (lower_txp || sat_mask) + if (lower_txp || sat_mask) { project_src(b, tex); + state->progress = true; + } if ((tex->sampler_dim == GLSL_SAMPLER_DIM_RECT) && - state->options->lower_rect) + state->options->lower_rect) { lower_rect(b, tex); + state->progress = true; + } - if (sat_mask) + if (sat_mask) { saturate_src(b, tex, sat_mask); + state->progress = true; + } } return true; @@ -264,13 +271,17 @@ nir_lower_tex_impl(nir_function_impl *impl, lower_tex_state *state) nir_metadata_dominance); } -void +bool nir_lower_tex(nir_shader *shader, const nir_lower_tex_options *options) { lower_tex_state state; state.options = options; + state.progress = false; + nir_foreach_overload(shader, overload) { if (overload->impl) nir_lower_tex_impl(overload->impl, &state); } + + return state.progress; } -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 06/11] i965/nir: Split shader optimization and lowering into three satages
At the moment, brw_create_nir just calls the three stages in sequence so there's not much difference. Soon, however, we will want to start doing variants in NIR at which point the postprocessing step will have to move from shader create time to codegen time. --- src/mesa/drivers/dri/i965/brw_nir.c | 129 +--- src/mesa/drivers/dri/i965/brw_nir.h | 9 +++ 2 files changed, 100 insertions(+), 38 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index 7826729..21c2648 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -221,43 +221,33 @@ nir_optimize(nir_shader *nir, bool is_scalar) } while (progress); } -nir_shader * -brw_create_nir(struct brw_context *brw, - const struct gl_shader_program *shader_prog, - const struct gl_program *prog, - gl_shader_stage stage, - bool is_scalar) +/* Does some simple lowering and runs the standard suite of optimizations + * + * This is intended to be called more-or-less directly after you get the + * shader out of GLSL or some other source. While it is geared towards i965, + * it is not at all generator-specific except for the is_scalar flag. Even + * there, it is safe to call with is_scalar = false for a shader that is + * intended for the FS backend as long as nir_optimize is called again with + * is_scalar = true to scalarize everything prior to code gen. + */ +void +brw_preprocess_nir(nir_shader *nir, bool is_scalar) { - struct gl_context *ctx = &brw->ctx; - const struct brw_device_info *devinfo = brw->intelScreen->devinfo; - const nir_shader_compiler_options *options = - ctx->Const.ShaderCompilerOptions[stage].NirOptions; - static const nir_lower_tex_options tex_options = { - .lower_txp = ~0, - }; - bool debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage); - bool progress = false; - nir_shader *nir; - - /* First, lower the GLSL IR or Mesa IR to NIR */ - if (shader_prog) { - nir = glsl_to_nir(shader_prog, stage, options); - } else { - nir = prog_to_nir(prog, options); - OPT_V(nir_convert_to_ssa); /* turn registers into SSA */ - } - nir_validate_shader(nir); + bool progress; /* Written by OPT and OPT_V */ + (void)progress; - if (stage == MESA_SHADER_GEOMETRY) { + if (nir->stage == MESA_SHADER_GEOMETRY) OPT(nir_lower_gs_intrinsics); - } - OPT(nir_lower_global_vars_to_local); + static const nir_lower_tex_options tex_options = { + .lower_txp = ~0, + }; OPT_V(nir_lower_tex, &tex_options); - OPT(nir_normalize_cubemap_coords); + OPT(nir_lower_global_vars_to_local); + OPT(nir_split_var_copies); nir_optimize(nir, is_scalar); @@ -268,6 +258,25 @@ brw_create_nir(struct brw_context *brw, /* Get rid of split copies */ nir_optimize(nir, is_scalar); + OPT(nir_remove_dead_variables); +} + +/* Lowers inputs, outputs, uniforms, and samplers for i965 + * + * This function does all of the standard lowering prior to post-processing. + * The lowering done is highly gen, stage, and backend-specific. The + * shader_prog parameter is optional and is used only for lowering sampler + * derefs and atomics for GLSL shaders. + */ +void +brw_lower_nir(nir_shader *nir, + const struct brw_device_info *devinfo, + const struct gl_shader_program *shader_prog, + bool is_scalar) +{ + bool progress; /* Written by OPT and OPT_V */ + (void)progress; + OPT_V(brw_nir_lower_inputs, devinfo, is_scalar); OPT_V(brw_nir_lower_outputs, is_scalar); nir_assign_var_locations(&nir->uniforms, @@ -275,8 +284,6 @@ brw_create_nir(struct brw_context *brw, is_scalar ? type_size_scalar : type_size_vec4); OPT_V(nir_lower_io, -1, is_scalar ? type_size_scalar : type_size_vec4); - OPT(nir_remove_dead_variables); - if (shader_prog) { OPT_V(nir_lower_samplers, shader_prog); } @@ -288,8 +295,27 @@ brw_create_nir(struct brw_context *brw, } nir_optimize(nir, is_scalar); +} + +/* Prepare the given shader for codegen + * + * This function is intended to be called right before going into the actual + * backend and is highly backend-specific. Also, once this function has been + * called on a shader, it will no longer be in SSA form so most optimizations + * will not work. + */ +void +brw_postprocess_nir(nir_shader *nir, +const struct brw_device_info *devinfo, +bool is_scalar) +{ + bool debug_enabled = + (INTEL_DEBUG & intel_debug_flag_for_shader_stage(nir->stage)); + + bool progress; /* Written by OPT and OPT_V */ + (void)progress; - if (brw->gen >= 6) { + if (devinfo->gen >= 6) { /* Try and fuse multiply-adds */ OPT(brw_nir_opt_peephole_ffma); } @@ -310,7 +336,7 @@ brw_create_nir(struct brw_context *brw, } fprin
[Mesa-dev] [PATCH 03/11] i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes.
From: Kenneth Graunke OPT() is the normal macro for passes that return booleans, while OPT_V() is a variant that works for passes that don't properly report progress. (Such passes should be fixed to return a boolean, eventually.) These macros take care of calling nir_validate_shader() and setting progress appropriately. In the future, it would be easy to add shader dumping similar to INTEL_DEBUG=optimizer by extending the macro. v2 (Jason Ekstrand): - Fix an unused variable warning Signed-off-by: Kenneth Graunke Reviewed-by: Jason Ekstrand --- src/mesa/drivers/dri/i965/brw_nir.c | 131 1 file changed, 59 insertions(+), 72 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index fe5cad4..b19f969 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -56,8 +56,9 @@ remap_vs_attrs(nir_block *block, void *closure) } static void -brw_nir_lower_inputs(const struct brw_device_info *devinfo, - nir_shader *nir, bool is_scalar) +brw_nir_lower_inputs(nir_shader *nir, + const struct brw_device_info *devinfo, + bool is_scalar) { switch (nir->stage) { case MESA_SHADER_VERTEX: @@ -170,46 +171,49 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar) } } +#define _OPT(do_pass) (({ \ + bool this_progress = true; \ + do_pass\ + nir_validate_shader(nir); \ + this_progress; \ +})) + +#define OPT(pass, ...) _OPT( \ + this_progress = pass(nir ,##__VA_ARGS__); \ + progress = progress || this_progress; \ +) + +#define OPT_V(pass, ...) _OPT( \ + pass(nir, ##__VA_ARGS__); \ +) + static void nir_optimize(nir_shader *nir, bool is_scalar) { bool progress; do { progress = false; - nir_lower_vars_to_ssa(nir); - nir_validate_shader(nir); + OPT_V(nir_lower_vars_to_ssa); if (is_scalar) { - nir_lower_alu_to_scalar(nir); - nir_validate_shader(nir); + OPT_V(nir_lower_alu_to_scalar); } - progress |= nir_copy_prop(nir); - nir_validate_shader(nir); + OPT(nir_copy_prop); if (is_scalar) { - nir_lower_phis_to_scalar(nir); - nir_validate_shader(nir); + OPT_V(nir_lower_phis_to_scalar); } - progress |= nir_copy_prop(nir); - nir_validate_shader(nir); - progress |= nir_opt_dce(nir); - nir_validate_shader(nir); - progress |= nir_opt_cse(nir); - nir_validate_shader(nir); - progress |= nir_opt_peephole_select(nir); - nir_validate_shader(nir); - progress |= nir_opt_algebraic(nir); - nir_validate_shader(nir); - progress |= nir_opt_constant_folding(nir); - nir_validate_shader(nir); - progress |= nir_opt_dead_cf(nir); - nir_validate_shader(nir); - progress |= nir_opt_remove_phis(nir); - nir_validate_shader(nir); - progress |= nir_opt_undef(nir); - nir_validate_shader(nir); + OPT(nir_copy_prop); + OPT(nir_opt_dce); + OPT(nir_opt_cse); + OPT(nir_opt_peephole_select); + OPT(nir_opt_algebraic); + OPT(nir_opt_constant_folding); + OPT(nir_opt_dead_cf); + OPT(nir_opt_remove_phis); + OPT(nir_opt_undef); } while (progress); } @@ -228,6 +232,7 @@ brw_create_nir(struct brw_context *brw, .lower_txp = ~0, }; bool debug_enabled = INTEL_DEBUG & intel_debug_flag_for_shader_stage(stage); + bool progress = false; nir_shader *nir; /* First, lower the GLSL IR or Mesa IR to NIR */ @@ -235,80 +240,63 @@ brw_create_nir(struct brw_context *brw, nir = glsl_to_nir(shader_prog, stage, options); } else { nir = prog_to_nir(prog, options); - nir_convert_to_ssa(nir); /* turn registers into SSA */ + OPT_V(nir_convert_to_ssa); /* turn registers into SSA */ } nir_validate_shader(nir); if (stage == MESA_SHADER_GEOMETRY) { - nir_lower_gs_intrinsics(nir); - nir_validate_shader(nir); + OPT(nir_lower_gs_intrinsics); } - nir_lower_global_vars_to_local(nir); - nir_validate_shader(nir); + OPT(nir_lower_global_vars_to_local); - nir_lower_tex(nir, &tex_options); - nir_validate_shader(nir); + OPT_V(nir_lower_tex, &tex_options); - nir_normalize_cubemap_coords(nir); - nir_validate_shader(nir); + OPT(nir_normalize_cubemap_coords); - nir_split_var_copies(nir); - nir_validate_shader(nir); + OPT(nir_split_var_copies); nir_optimize(nir, is_scalar); /* Lower a bunch of stuff */ - nir_lower_var_copies(nir); - nir_validate_shader(nir); + OPT_V(nir_lower_var_copies); /* Get rid of split copies */ nir_optimize(nir, is_scalar); - brw_nir_lower_inputs(devinfo, nir, is_scalar); - brw_nir_lower_outputs(nir, is_scalar); + OPT_V(brw_nir_lower_inputs, devinfo, is_scalar); + OPT_V(brw_nir_lower_outputs, is_scalar
[Mesa-dev] [PATCH 04/11] i965/nir: Validate that NIR passes call nir_metadata_preserve().
From: Kenneth Graunke Failing to call nir_metadata_preserve() can have nasty consequences: some pass breaks dominance information, but leaves it marked as valid, causing some subsequent pass to go haywire and probably crash. This pass adds a simple validation mechanism to ensure passes handle this properly. We add a new bogus metadata flag that isn't used for anything in particular, set it before each pass, and ensure it *isn't* still set after the pass. nir_metadata_preserve will reset the flag, so correct passes will work, and bad passes will assert fail. (I would have made these functions static inline, but nir.h is included in C++, so we can't bit-or enums without lots of casting...) Thanks to Dylan Baker for the idea. Signed-off-by: Kenneth Graunke Reviewed-by: Jason Ekstrand --- src/glsl/nir/nir.h | 5 + src/glsl/nir/nir_metadata.c | 36 src/mesa/drivers/dri/i965/brw_nir.c | 10 +++--- 3 files changed, 48 insertions(+), 3 deletions(-) diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index f99af4e..4f4e946 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -1312,6 +1312,7 @@ typedef enum { nir_metadata_block_index = 0x1, nir_metadata_dominance = 0x2, nir_metadata_live_ssa_defs = 0x4, + nir_metadata_not_properly_reset = 0x8, } nir_metadata; typedef struct { @@ -1886,8 +1887,12 @@ void nir_print_instr(const nir_instr *instr, FILE *fp); #ifdef DEBUG void nir_validate_shader(nir_shader *shader); +void nir_metadata_set_validation_flag(nir_shader *shader); +void nir_metadata_check_validation_flag(nir_shader *shader); #else static inline void nir_validate_shader(nir_shader *shader) { (void) shader; } +static inline void nir_metadata_set_validation_flag(nir_shader *shader) { (void) shader; } +static inline void nir_metadata_check_validation_flag(nir_shader *shader) { (void) shader; } #endif /* DEBUG */ void nir_calc_dominance_impl(nir_function_impl *impl); diff --git a/src/glsl/nir/nir_metadata.c b/src/glsl/nir/nir_metadata.c index 6de981f..d5324b3 100644 --- a/src/glsl/nir/nir_metadata.c +++ b/src/glsl/nir/nir_metadata.c @@ -52,3 +52,39 @@ nir_metadata_preserve(nir_function_impl *impl, nir_metadata preserved) { impl->valid_metadata &= preserved; } + +#ifdef DEBUG +/** + * Make sure passes properly invalidate metadata (part 1). + * + * Call this before running a pass to set a bogus metadata flag, which will + * only be preserved if the pass forgets to call nir_metadata_preserve(). + */ +void +nir_metadata_set_validation_flag(nir_shader *shader) +{ + nir_foreach_overload(shader, overload) { + if (overload->impl) { + overload->impl->valid_metadata |= nir_metadata_not_properly_reset; + } + } +} + +/** + * Make sure passes properly invalidate metadata (part 2). + * + * Call this after a pass makes progress to verify that the bogus metadata set by + * the earlier function was properly thrown away. Note that passes may not call + * nir_metadata_preserve() if they don't actually make any changes at all. + */ +void +nir_metadata_check_validation_flag(nir_shader *shader) +{ + nir_foreach_overload(shader, overload) { + if (overload->impl) { + assert(!(overload->impl->valid_metadata & + nir_metadata_not_properly_reset)); + } + } +} +#endif diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index b19f969..7826729 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -178,9 +178,13 @@ brw_nir_lower_outputs(nir_shader *nir, bool is_scalar) this_progress; \ })) -#define OPT(pass, ...) _OPT( \ - this_progress = pass(nir ,##__VA_ARGS__); \ - progress = progress || this_progress; \ +#define OPT(pass, ...) _OPT( \ + nir_metadata_set_validation_flag(nir); \ + this_progress = pass(nir ,##__VA_ARGS__); \ + if (this_progress) {\ + progress = true; \ + nir_metadata_check_validation_flag(nir); \ + } \ ) #define OPT_V(pass, ...) _OPT( \ -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 02/11] nir: add array length field
From: Rob Clark This will simplify things somewhat in clone. Signed-off-by: Rob Clark Reviewed-by: Jason Ekstrand --- src/glsl/nir/glsl_to_nir.cpp | 5 + src/glsl/nir/nir.h | 5 + 2 files changed, 10 insertions(+) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index 8e53e22..13fa987 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -241,6 +241,8 @@ constant_copy(ir_constant *ir, void *mem_ctx) unsigned total_elems = ir->type->components(); unsigned i; + + ret->num_elements = 0; switch (ir->type->base_type) { case GLSL_TYPE_UINT: for (i = 0; i < total_elems; i++) @@ -265,6 +267,8 @@ constant_copy(ir_constant *ir, void *mem_ctx) case GLSL_TYPE_STRUCT: ret->elements = ralloc_array(mem_ctx, nir_constant *, ir->type->length); + ret->num_elements = ir->type->length; + i = 0; foreach_in_list(ir_constant, field, &ir->components) { ret->elements[i] = constant_copy(field, mem_ctx); @@ -275,6 +279,7 @@ constant_copy(ir_constant *ir, void *mem_ctx) case GLSL_TYPE_ARRAY: ret->elements = ralloc_array(mem_ctx, nir_constant *, ir->type->length); + ret->num_elements = ir->type->length; for (i = 0; i < ir->type->length; i++) ret->elements[i] = constant_copy(ir->array_elements[i], mem_ctx); diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 6ffa60b..f99af4e 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -111,6 +111,11 @@ typedef struct nir_constant { */ union nir_constant_data value; + /* we could get this from the var->type but makes clone *much* easier to +* not have to care about the type. +*/ + unsigned num_elements; + /* Array elements / Structure Fields */ struct nir_constant **elements; } nir_constant; -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 00/11] i965/nir: Do texture rectangle lowering in NIR
On older hardware (Iron Lake and below), we can't support texture rectangle natively. Sandy Bridge through Haswell can support it but don't support the GL_CLAMP wrap mode natively. It isn't until Broadwell that GL_CLAMP is supported together with GL_TEXTURE_RECTANGLE in hardware. In the cases where it isn't supported, we have to fake it by dividing by the texture size. Previously, we had a rescale_texcoord function added a uniform to hold the texture coordinate and used that to rescale/clamp the texture coordinates. For a while now, nir_lower_tex has been able to lower texture rectangle to a textureSize and a regular texture2D operation. This series makes i965 use the nir_lower_tex path instead. Incidentally, this fixes texture rectangle support in vertex and geometry shaders on Haswell and below. (The backend lowering was only ever done in the FS backend.) Since this is the first time we're doing any sort of shader variants in NIR, the first several passes add the infastructure to do so. Two of these patches are from Ken, two are from Rob, and one (nir_clone itself) is my rendition but heavily based on what Rob did only with less hashing. Jason Ekstrand (7): nir: support to clone shaders i965/nir: Split shader optimization and lowering into three satages i965: Move postprocess_nir to codegen time nir/lower_tex: Report progress nir/lower_tex: Set the dest_type for txs instructions i965/fs: Don't allow SINT32 as a return type for resinfo i965: Use nir_lower_tex for texture coordinate lowering Kenneth Graunke (2): i965/nir: Add OPT() and OPT_V() macros for invoking NIR passes. i965/nir: Validate that NIR passes call nir_metadata_preserve(). Rob Clark (2): nir: remove nir_variable::max_ifc_array_access nir: add array length field src/glsl/Makefile.sources | 1 + src/glsl/nir/glsl_to_nir.cpp | 14 +- src/glsl/nir/nir.c| 8 + src/glsl/nir/nir.h| 27 +- src/glsl/nir/nir_clone.c | 671 ++ src/glsl/nir/nir_lower_tex.c | 20 +- src/glsl/nir/nir_metadata.c | 36 ++ src/mesa/drivers/dri/i965/brw_fs.cpp | 13 +- src/mesa/drivers/dri/i965/brw_fs.h| 3 - src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 10 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 4 +- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 125 src/mesa/drivers/dri/i965/brw_nir.c | 268 + src/mesa/drivers/dri/i965/brw_nir.h | 15 + src/mesa/drivers/dri/i965/brw_vec4.cpp| 7 +- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 8 +- 16 files changed, 966 insertions(+), 264 deletions(-) create mode 100644 src/glsl/nir/nir_clone.c -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/11] nir: remove nir_variable::max_ifc_array_access
From: Rob Clark No users. Signed-off-by: Rob Clark Reviewed-by: Jason Ekstrand --- src/glsl/nir/glsl_to_nir.cpp | 9 - src/glsl/nir/nir.h | 13 - 2 files changed, 22 deletions(-) diff --git a/src/glsl/nir/glsl_to_nir.cpp b/src/glsl/nir/glsl_to_nir.cpp index b10d192..8e53e22 100644 --- a/src/glsl/nir/glsl_to_nir.cpp +++ b/src/glsl/nir/glsl_to_nir.cpp @@ -294,15 +294,6 @@ nir_visitor::visit(ir_variable *ir) var->type = ir->type; var->name = ralloc_strdup(var, ir->name); - if (ir->is_interface_instance() && ir->get_max_ifc_array_access() != NULL) { - unsigned size = ir->get_interface_type()->length; - var->max_ifc_array_access = ralloc_array(var, unsigned, size); - memcpy(var->max_ifc_array_access, ir->get_max_ifc_array_access(), - size * sizeof(unsigned)); - } else { - var->max_ifc_array_access = NULL; - } - var->data.read_only = ir->data.read_only; var->data.centroid = ir->data.centroid; var->data.sample = ir->data.sample; diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 4ed2cbd..6ffa60b 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -147,19 +147,6 @@ typedef struct { */ char *name; - /** -* For variables which satisfy the is_interface_instance() predicate, this -* points to an array of integers such that if the ith member of the -* interface block is an array, max_ifc_array_access[i] is the maximum -* array element of that member that has been accessed. If the ith member -* of the interface block is not an array, max_ifc_array_access[i] is -* unused. -* -* For variables whose type is not an interface block, this pointer is -* NULL. -*/ - unsigned *max_ifc_array_access; - struct nir_variable_data { /** -- 2.5.0.400.gff86faf ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 6/9] i965: Add annotation_insert_error() and support for printing errors.
On Friday, November 06, 2015 05:19:41 PM Matt Turner wrote: > Will allow annotations to contain error messages (indicating an > instruction violates a rule for instance) that are printed after the > disassembly of the block. > --- > src/mesa/drivers/dri/i965/intel_asm_annotation.c | 71 > +--- > src/mesa/drivers/dri/i965/intel_asm_annotation.h | 7 +++ > 2 files changed, 71 insertions(+), 7 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c > b/src/mesa/drivers/dri/i965/intel_asm_annotation.c > index fe9d80a..52878fd 100644 > --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c > +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c > @@ -69,6 +69,10 @@ dump_assembly(void *assembly, int num_annotations, struct > annotation *annotation > >brw_disassemble(devinfo, assembly, start_offset, end_offset, stderr); > > + if (annotation[i].error) { > + fputs(annotation[i].error, stderr); > + } > + >if (annotation[i].block_end) { > fprintf(stderr, " END B%d", annotation[i].block_end->num); > foreach_list_typed(struct bblock_link, successor_link, link, > @@ -82,25 +86,34 @@ dump_assembly(void *assembly, int num_annotations, struct > annotation *annotation > fprintf(stderr, "\n"); > } > > -void annotate(const struct brw_device_info *devinfo, > - struct annotation_info *annotation, const struct cfg_t *cfg, > - struct backend_instruction *inst, unsigned offset) > +static bool > +annotation_array_ensure_space(struct annotation_info *annotation) > { > - if (annotation->mem_ctx == NULL) > - annotation->mem_ctx = ralloc_context(NULL); > - > if (annotation->ann_size <= annotation->ann_count) { >int old_size = annotation->ann_size; >annotation->ann_size = MAX2(1024, annotation->ann_size * 2); >annotation->ann = reralloc(annotation->mem_ctx, annotation->ann, > struct annotation, annotation->ann_size); >if (!annotation->ann) > - return; > + return false; > >memset(annotation->ann + old_size, 0, > (annotation->ann_size - old_size) * sizeof(struct annotation)); > } > > + return true; > +} > + > +void annotate(const struct brw_device_info *devinfo, > + struct annotation_info *annotation, const struct cfg_t *cfg, > + struct backend_instruction *inst, unsigned offset) > +{ > + if (annotation->mem_ctx == NULL) > + annotation->mem_ctx = ralloc_context(NULL); > + > + if (!annotation_array_ensure_space(annotation)) > + return; > + > struct annotation *ann = &annotation->ann[annotation->ann_count++]; > ann->offset = offset; > if ((INTEL_DEBUG & DEBUG_ANNOTATION) != 0) { > @@ -156,3 +169,47 @@ annotation_finalize(struct annotation_info *annotation, > } > annotation->ann[annotation->ann_count].offset = next_inst_offset; > } > + > +void > +annotation_insert_error(struct annotation_info *annotation, unsigned offset, > +const char *error) > +{ > + struct annotation *ann; > + > + if (!annotation->ann_count) > + return; > + > + /* We may have to split an annotation, so ensure we have enough space > +* allocated for that case up front. > +*/ > + if (!annotation_array_ensure_space(annotation)) > + return; > + > + for (int i = 0; i < annotation->ann_count; i++) { > + struct annotation *cur = &annotation->ann[i]; > + struct annotation *next = &annotation->ann[i + 1]; > + ann = cur; > + > + if (next->offset <= offset) > + continue; > + > + if (offset + sizeof(brw_inst) != next->offset) { > + memmove(next, cur, > + (annotation->ann_count - i + 2) * sizeof(struct > annotation)); > + cur->error = NULL; > + cur->error_length = 0; > + cur->block_end = NULL; > + next->offset = offset + sizeof(brw_inst); > + next->block_start = NULL; > + annotation->ann_count++; > + } > + break; > + } > + > + assume(ann != NULL); > + > + if (ann->error) > + ralloc_strcat(&ann->error, error); > + else > + ann->error = ralloc_strdup(annotation->mem_ctx, error); > +} > diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.h > b/src/mesa/drivers/dri/i965/intel_asm_annotation.h > index 6c72326..662a4b4 100644 > --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.h > +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.h > @@ -37,6 +37,9 @@ struct cfg_t; > struct annotation { > int offset; > > + size_t error_length; > + char *error; > + > /* Pointers to the basic block in the CFG if the instruction group starts > * or ends a basic block. > */ > @@ -69,6 +72,10 @@ annotate(const struct brw_device_info *devinfo, > void > annotation_finalize(struct annotation_info *annotation, unsigned offset); > > +void > +annotation_insert_error(s
Re: [Mesa-dev] [PATCH 12/24] i965: Initialize registers' file to BAD_FILE.
On Wed, Nov 11, 2015 at 4:08 PM, Kenneth Graunke wrote: > Actually, your earlier statement: > "if file == BAD_FILE, no other fields mean anything." > suggests that we should change fs_reg() to simply set BAD_FILE, and > not bother initializing the other fields. That would eliminate one > of the redundant memsets, and give us valgrind errors if we used > an uninitialized value. If we do that, though, we should make > equals() return true for two BAD_FILE registers. That's a really good idea. I'll give that a try. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/24] i965: Initialize registers' file to BAD_FILE.
On Wednesday, November 11, 2015 03:12:11 PM Matt Turner wrote: > On Wed, Nov 11, 2015 at 3:05 PM, Kenneth Graunke > wrote: > > On Wednesday, November 11, 2015 01:07:24 PM Matt Turner wrote: > >> On Wed, Nov 11, 2015 at 12:46 PM, Kenneth Graunke > >> wrote: > >> > On Monday, November 02, 2015 04:29:22 PM Matt Turner wrote: > >> >> The test (file == BAD_FILE) works on registers for which the constructor > >> >> has not run because BAD_FILE is zero. The next commit will move > >> >> BAD_FILE in the enum so that it's no longer zero. > >> >> --- > >> >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 +- > >> >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++ > >> >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 9 + > >> >> 3 files changed, 21 insertions(+), 1 deletion(-) > >> >> > >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> >> index 7eeff93..611347c 100644 > >> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> >> @@ -260,6 +260,10 @@ void > >> >> fs_visitor::nir_emit_system_values() > >> >> { > >> >> nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX); > >> >> + for (unsigned i = 0; i < SYSTEM_VALUE_MAX; i++) { > >> >> + nir_system_values[i].file = BAD_FILE; > >> > > >> > How about we do this instead: > >> > > >> >nir_system_values[i] = fs_reg(); > >> > > >> > That way, they're properly constructed using the default constructor, > >> > which would not only set BAD_FILE, but properly initialize everything, > >> > so we don't have to revisit this if we make other changes in fs_reg(). > >> > >> Is it worth is? The function this code exists in is the thing that > >> initializes the system values. And, of course if file == BAD_FILE, no > >> other fields mean anything. Neither of those are likely to change. > > > > Yes. It's the correct thing to do. We don't want partially > > initialized objects. That way lies madness. > > Fine. But I think that calloc()ing storage, memsetting it to zero, and > then memsetting it to zero again before initializing it is madness. It's a bit overkill, sure. The fact that ralloc_array uses calloc internally is an implementation detail, though - we shouldn't rely on it memsetting the memory to zero. (rzalloc_array is guaranteed to have those semantics.) But, putting away ralloc for a moment, you get two memsets even with ordinary C++. For example, if you did: fs_reg array_of_regs[16]; for (int i = 0; i < 16; i++) array_of_regs[i] = fs_reg(i); C++ would automatically call the fs_reg() default constructor on all the fs_regs in the array (1 memset per element). Then we'd call the fs_reg constructor when creating fs_reg(i), doing a second memset per element. Similarly, using C++'s heap allocator: fs_reg array_of_regs[] = new [] fs_reg; would also call the fs_reg() default constructor for each element, doing a memset-per-element. Simply malloc'ing memory for objects and beginning to use them may work, and even save a bit of code, but it's awful practice in C++. Constructors need to be called. Actually, your earlier statement: "if file == BAD_FILE, no other fields mean anything." suggests that we should change fs_reg() to simply set BAD_FILE, and not bother initializing the other fields. That would eliminate one of the redundant memsets, and give us valgrind errors if we used an uninitialized value. If we do that, though, we should make equals() return true for two BAD_FILE registers. --Ken signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 00/24] i965: Refactor register classes
On Monday, November 02, 2015 04:29:10 PM Matt Turner wrote: > backend_reg (from which fs_reg, src_reg, and dst_reg inherit) includes a > brw_reg that's used for "hardware regs" -- precolored registers or > architecture > registers. This leads to properties like source modifiers, the register type, > swizzles, and writemasks being duplicated between the derived classes and the > brw_reg and of course often being out of sync. > > This series removes the "fixed_hw_reg" field from backend_reg by just making > backend_reg inherit from brw_reg, and then removes fields duplicated in the > derived classes. In the process, it gets rid of HW_REG. > > This in turn simplifies a lot of code -- no longer do you have to check a > number of subfields if file == HW_REG. > > The last few patches begin some clean ups -- since the base of our register > classes is now brw_reg we don't need to do as many conversions. I've only > handled immediates so far and more is planned, but the series is growing large > and is a lot of churn already. > > The sizes of the register classes all shrink by 8 bytes: > >backend_reg 20 -> 12 >fs_reg40 -> 32 >src_reg 32 -> 24? >dst_reg 32 -> 24? > > The remaining fields in the classes are > >backend_reg: reg_offset >fs_reg: reladdr, subreg_offset, stride >src_reg reladdr >dst_reg reladdr Assuming you address my and Emil's feedback, the series is: Reviewed-by: Kenneth Graunke This is an invasive enough refactor that I believe running the assembly diffing tool would be worthwhile. Nice work :) I like the result. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/24] i965: Initialize registers' file to BAD_FILE.
On Wed, Nov 11, 2015 at 3:05 PM, Kenneth Graunke wrote: > On Wednesday, November 11, 2015 01:07:24 PM Matt Turner wrote: >> On Wed, Nov 11, 2015 at 12:46 PM, Kenneth Graunke >> wrote: >> > On Monday, November 02, 2015 04:29:22 PM Matt Turner wrote: >> >> The test (file == BAD_FILE) works on registers for which the constructor >> >> has not run because BAD_FILE is zero. The next commit will move >> >> BAD_FILE in the enum so that it's no longer zero. >> >> --- >> >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 +- >> >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++ >> >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 9 + >> >> 3 files changed, 21 insertions(+), 1 deletion(-) >> >> >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> index 7eeff93..611347c 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> @@ -260,6 +260,10 @@ void >> >> fs_visitor::nir_emit_system_values() >> >> { >> >> nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX); >> >> + for (unsigned i = 0; i < SYSTEM_VALUE_MAX; i++) { >> >> + nir_system_values[i].file = BAD_FILE; >> > >> > How about we do this instead: >> > >> >nir_system_values[i] = fs_reg(); >> > >> > That way, they're properly constructed using the default constructor, >> > which would not only set BAD_FILE, but properly initialize everything, >> > so we don't have to revisit this if we make other changes in fs_reg(). >> >> Is it worth is? The function this code exists in is the thing that >> initializes the system values. And, of course if file == BAD_FILE, no >> other fields mean anything. Neither of those are likely to change. > > Yes. It's the correct thing to do. We don't want partially > initialized objects. That way lies madness. Fine. But I think that calloc()ing storage, memsetting it to zero, and then memsetting it to zero again before initializing it is madness. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 17/24] i965: Replace HW_REG with ARF/GRF.
On Monday, November 02, 2015 04:29:27 PM Matt Turner wrote: > HW_REGs are (were!) kind of awful. If the file was HW_REG, you had to > look at different fields for type, abs, negate, writemask, swizzle, and > a second file. They also caused annoying problems like immediate sources > being considered scheduling barriers (commit 6148e94e2) and other such > nonsense. > > Instead use ARF/GRF/MRF for fixed registers in those files. Eee. I thought we were going to use FIXED_GRF, at least in the short-term, as we have several years of history (and possibly outstanding patches) that say "GRF" to mean "VGRF". At least for a while, until people learn to type VGRF... signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/24] i965: Initialize registers' file to BAD_FILE.
On Wednesday, November 11, 2015 01:07:24 PM Matt Turner wrote: > On Wed, Nov 11, 2015 at 12:46 PM, Kenneth Graunke > wrote: > > On Monday, November 02, 2015 04:29:22 PM Matt Turner wrote: > >> The test (file == BAD_FILE) works on registers for which the constructor > >> has not run because BAD_FILE is zero. The next commit will move > >> BAD_FILE in the enum so that it's no longer zero. > >> --- > >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 +- > >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++ > >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 9 + > >> 3 files changed, 21 insertions(+), 1 deletion(-) > >> > >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> index 7eeff93..611347c 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > >> @@ -260,6 +260,10 @@ void > >> fs_visitor::nir_emit_system_values() > >> { > >> nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX); > >> + for (unsigned i = 0; i < SYSTEM_VALUE_MAX; i++) { > >> + nir_system_values[i].file = BAD_FILE; > > > > How about we do this instead: > > > >nir_system_values[i] = fs_reg(); > > > > That way, they're properly constructed using the default constructor, > > which would not only set BAD_FILE, but properly initialize everything, > > so we don't have to revisit this if we make other changes in fs_reg(). > > Is it worth is? The function this code exists in is the thing that > initializes the system values. And, of course if file == BAD_FILE, no > other fields mean anything. Neither of those are likely to change. Yes. It's the correct thing to do. We don't want partially initialized objects. That way lies madness. signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 10/18] mesa: In helpers, only check driver capability for meta
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Make API context and version checks done by the helper functions pass > unconditionally while meta is in progress. This transparently makes > extension checks solely dependent on struct gl_extensions while in meta. > > v2. Use 8-bit wide datatype instead GLuint. > > Signed-off-by: Nanley Chery > --- > src/mesa/drivers/common/meta.c | 11 +++ > src/mesa/drivers/common/meta.h | 1 + > src/mesa/main/extensions.h | 2 +- > src/mesa/main/mtypes.h | 6 ++ > src/mesa/main/version.c| 1 + > 5 files changed, 20 insertions(+), 1 deletion(-) > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 10e1586..15964dc 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -3802,6 +3802,12 @@ struct gl_extensions > const GLubyte *String; > /** Number of supported extensions */ > GLuint Count; > + /** > +* The context version which extension helper functions compare against. > +* By default, the value is equal to ctx->Version. This changes to ~0, Hanging comma. Was that intentional? > +* while meta is in progress. > +*/ > + GLubyte Version; > }; I approve of this slick version hack for meta. With or without the hanging comma, this patch is Reviewed-by: Chad Versace I'm done with review today. I'll resume tomorrow. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600: initialised PGM_RESOURCES_2 for ES/GS
On Wed, 11 Nov 2015 23:42:18 +0100, Dave Airlie wrote: From: Dave Airlie This fixes the corruption on rendering that we are seeing in certain geometry shaders. Specifically, this fixes https://bugs.freedesktop.org/show_bug.cgi?id=91780 and probably others Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/evergreen_state.c | 4 src/gallium/drivers/r600/evergreend.h | 2 ++ 2 files changed, 6 insertions(+) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index c6702a9..a3bbbcc 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -2362,6 +2362,8 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx) r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0); /* to avoid GPU doing any preloading of constant from random address */ @@ -2801,6 +2803,8 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx) r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); nitpick: separate macros for SINGLE_ROUND for each register r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0); /* to avoid GPU doing any preloading of constant from random address */ diff --git a/src/gallium/drivers/r600/evergreend.h b/src/gallium/drivers/r600/evergreend.h index 937ffcb..cf8906c 100644 --- a/src/gallium/drivers/r600/evergreend.h +++ b/src/gallium/drivers/r600/evergreend.h @@ -1497,6 +1497,7 @@ #define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28) #define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1) #define C_028878_UNCACHED_FIRST_INST 0xEFFF +#define R_02887C_SQ_PGM_RESOURCES_2_GS 0x02887C #define R_028890_SQ_PGM_RESOURCES_ES 0x028890 #define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0) @@ -1511,6 +1512,7 @@ #define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28) #define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1) #define C_028890_UNCACHED_FIRST_INST 0xEFFF +#define R_028894_SQ_PGM_RESOURCES_2_ES 0x028894 #define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864 #define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0) Tested / Reviewed-by: Glenn Kennard ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 09/18] mesa/extensions: Prefix global struct and extension type
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Rename the following types and variables: > * struct extension -> struct mesa_extension, > like the mesa_format type. > * extension_table -> _mesa_extension_table, > like the _mesa_extension_override_{enables,disables} structs. > > Suggested-by: Marek Olšák > Suggested-by: Chad Versace > Signed-off-by: Nanley Chery > --- > src/mesa/main/extensions.c | 40 > src/mesa/main/extensions.h | 6 +++--- > 2 files changed, 23 insertions(+), 23 deletions(-) I again diffed glxinfo against master, and the diff still looks good. Patch 9 is Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 08/18] mesa: Generate a helper function for each extension
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Generate functions which determine if an extension is supported in the > current context. Initially, enums were going to be explicitly used with > _mesa_extension_supported(). The idea to embed the function and enums > into generated helper functions was suggested by Kristian Høgsberg. > > For performance, the function body no longer uses > _mesa_extension_supported() and, as suggested by Chad Versace, the > functions are also declared static inline. > > v2. Place function qualifiers on seperate line (Chad) > > Signed-off-by: Nanley Chery > --- > src/mesa/main/context.h| 1 + > src/mesa/main/extensions.c | 23 +-- > src/mesa/main/extensions.h | 41 + > 3 files changed, 43 insertions(+), 22 deletions(-) > > diff --git a/src/mesa/main/context.h b/src/mesa/main/context.h > index 1e7a12c..4798b1f 100644 > --- a/src/mesa/main/context.h > +++ b/src/mesa/main/context.h > @@ -50,6 +50,7 @@ > > > #include "imports.h" > +#include "extensions.h" > #include "mtypes.h" > #include "vbo/vbo.h" > > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c > index c7609be..7ef79e5 100644 > --- a/src/mesa/main/extensions.c > +++ b/src/mesa/main/extensions.c > @@ -42,27 +42,6 @@ struct gl_extensions _mesa_extension_override_disables; > static char *extra_extensions = NULL; > static char *cant_disable_extensions = NULL; > > -/** > - * \brief An element of the \c extension_table. > - */ > -struct extension { > - /** Name of extension, such as "GL_ARB_depth_clamp". */ > - const char *name; > - > - /** Offset (in bytes) of the corresponding member in struct > gl_extensions. */ > - size_t offset; > - > - /** Minimum version the extension requires for the given API > -* (see gl_api defined in mtypes.h). The value is equal to: > -* 10 * major_version + minor_version > -*/ > - uint8_t version[API_OPENGL_LAST + 1]; > - > - /** Year the extension was proposed or approved. Used to sort the > -* extension string chronologically. */ > - uint16_t year; > -}; > - > > /** > * Given a member \c x of struct gl_extensions, return offset of > @@ -74,7 +53,7 @@ struct extension { > /** > * \brief Table of supported OpenGL extensions for all API's. > */ > -static const struct extension extension_table[] = { > +const struct extension extension_table[] = { > #define EXT(name_str, driver_cap, gll_ver, glc_ver, gles_ver, gles2_ver, > ) \ > { .name = "GL_" #name_str, .offset = o(driver_cap), \ >.version = { \ > diff --git a/src/mesa/main/extensions.h b/src/mesa/main/extensions.h > index 595512a..30abe02 100644 > --- a/src/mesa/main/extensions.h > +++ b/src/mesa/main/extensions.h > @@ -55,6 +55,47 @@ _mesa_get_extension_count(struct gl_context *ctx); > extern const GLubyte * > _mesa_get_enabled_extension(struct gl_context *ctx, GLuint index); > > + > +/** > + * \brief An element of the \c extension_table. > + */ > +struct extension { > + /** Name of extension, such as "GL_ARB_depth_clamp". */ > + const char *name; > + > + /** Offset (in bytes) of the corresponding member in struct > gl_extensions. */ > + size_t offset; > + > + /** Minimum version the extension requires for the given API > +* (see gl_api defined in mtypes.h). The value is equal to: > +* 10 * major_version + minor_version > +*/ > + uint8_t version[API_OPENGL_LAST + 1]; > + > + /** Year the extension was proposed or approved. Used to sort the > +* extension string chronologically. */ > + uint16_t year; > +} extern const extension_table[]; > + > + > +/* Generate enums for the functions below */ > +enum { > +#define EXT(name_str, ...) MESA_EXTENSION_##name_str, > +#include "extensions_table.h" > +#undef EXT > +}; > + > + > +/** Checks if the context suports a user-facing extension */ > +#define EXT(name_str, driver_cap, ...) \ > +static inline bool \ > +_mesa_has_##name_str(const struct gl_context *ctx) { \ > + return ctx->Extensions.driver_cap && (ctx->Version >= \ > + extension_table[MESA_EXTENSION_##name_str].version[ctx->API]); \ > +} > +#include "extensions_table.h" > +#undef EXT One small nit. The function _mesa_has_FOO should be formatted like all other functions, with the opening brace on its own line. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600: initialised PGM_RESOURCES_2 for ES/GS
On Wed, Nov 11, 2015 at 5:42 PM, Dave Airlie wrote: > From: Dave Airlie > > This fixes the corruption on rendering that we are seeing in > certain geometry shaders. > > Signed-off-by: Dave Airlie Reviewed-by: Alex Deucher > --- > src/gallium/drivers/r600/evergreen_state.c | 4 > src/gallium/drivers/r600/evergreend.h | 2 ++ > 2 files changed, 6 insertions(+) > > diff --git a/src/gallium/drivers/r600/evergreen_state.c > b/src/gallium/drivers/r600/evergreen_state.c > index c6702a9..a3bbbcc 100644 > --- a/src/gallium/drivers/r600/evergreen_state.c > +++ b/src/gallium/drivers/r600/evergreen_state.c > @@ -2362,6 +2362,8 @@ static void cayman_init_atom_start_cs(struct > r600_context *rctx) > > r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, > S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, > S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > + r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, > S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > + r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, > S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0); > > /* to avoid GPU doing any preloading of constant from random address > */ > @@ -2801,6 +2803,8 @@ void evergreen_init_atom_start_cs(struct r600_context > *rctx) > > r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, > S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, > S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > + r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, > S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > + r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, > S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); > r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0); > > /* to avoid GPU doing any preloading of constant from random address > */ > diff --git a/src/gallium/drivers/r600/evergreend.h > b/src/gallium/drivers/r600/evergreend.h > index 937ffcb..cf8906c 100644 > --- a/src/gallium/drivers/r600/evergreend.h > +++ b/src/gallium/drivers/r600/evergreend.h > @@ -1497,6 +1497,7 @@ > #define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28) > #define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1) > #define C_028878_UNCACHED_FIRST_INST 0xEFFF > +#define R_02887C_SQ_PGM_RESOURCES_2_GS 0x02887C > > #define R_028890_SQ_PGM_RESOURCES_ES 0x028890 > #define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0) > @@ -1511,6 +1512,7 @@ > #define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28) > #define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1) > #define C_028890_UNCACHED_FIRST_INST 0xEFFF > +#define R_028894_SQ_PGM_RESOURCES_2_ES 0x028894 > > #define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864 > #define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0) > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 08/18] mesa: Generate a helper function for each extension
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Generate functions which determine if an extension is supported in the > current context. Initially, enums were going to be explicitly used with > _mesa_extension_supported(). The idea to embed the function and enums > into generated helper functions was suggested by Kristian Høgsberg. > > For performance, the function body no longer uses > _mesa_extension_supported() and, as suggested by Chad Versace, the > functions are also declared static inline. > > v2. Place function qualifiers on seperate line (Chad) > > Signed-off-by: Nanley Chery > --- > src/mesa/main/context.h| 1 + > src/mesa/main/extensions.c | 23 +-- > src/mesa/main/extensions.h | 41 + > 3 files changed, 43 insertions(+), 22 deletions(-) Patch 8 is Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 07/18] mesa/extensions: Replace extension::api_set with ::version
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > The api_set field has no users outside of _mesa_extension_supported(). > Remove it and allow the version field to take its place. > > The brunt of the transformation was performed with the following vim commands: > s/\(GL [^,]\+\),\s*\d*,\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1, GLL, GLC\2\3/g > s/\(GLL [^,]\+\)\,\s*\d*/\1, GLL/g > s/\(GLC [^,]\+\)\(,\s*\d*\),\s*\d*\(,\s*\d*\)\(,\s*\d*\)/\1\2, GLC\3\4/g > s/\( ES1[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4, ES1/g > s/\( > ES2[^,]*\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\)\(,\s*\(\w\|\d\)\+\),\s*\d*/\1\2\4\6, > ES2/g > > Signed-off-by: Nanley Chery > --- > src/mesa/main/extensions.c | 21 +- > src/mesa/main/extensions_table.h | 636 > --- > 2 files changed, 326 insertions(+), 331 deletions(-) Patch 7 is Reviewed-by: Chad Versace I again tested glxinfo output against master, and there was no difference. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] [v2] i965/skl: Add fast color clear infrastructure
On Wed, Nov 11, 2015 at 2:06 PM, Ben Widawsky wrote: > Patch was originally called: > i965/skl: Enable fast color clears on SKL > > Skylake introduces some differences in the way that fast clears are programmed > and in the restrictions for using fast clears. Since some of these are > non-obvious, and fast clears are currently disabled globally, we can enable > the > simple stuff here and leave the weirder stuff and separately reviewable work. > > Based on a patch originally from Kristian. > > Note that within this patch the change in scaling factors could be achieved > with > this hunk instead. I've opted to keep things more like how the docs describe > it > however. > --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c > @@ -150,9 +150,13 @@ intel_get_non_msrt_mcs_alignment(struct brw_context *brw, >/* In release builds, fall through */ > case I915_TILING_Y: >*width_px = 32 / mt->cpp; > - *height = 4; > + if (brw->gen >= 9) > + *height = 2; > + else > + *height = 4; I can't git am this patch, presumably because you have a patch inside the commit summary. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 06/18] mesa/extensions: Use _mesa_extension_supported()
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Replace open-coded checks for extension support with > _mesa_extension_supported(). > > Signed-off-by: Nanley Chery > --- > src/mesa/main/extensions.c | 54 > > src/mesa/main/extensions_table.h | 6 ++--- > 2 files changed, 14 insertions(+), 46 deletions(-) I diffed the output of glxinfo between master@0260620 and nchery/ext_save-v2@69a93a0 (this patch), and saw no significant diff. So the series looks good so far. Patches 5 and 6 are Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600: initialised PGM_RESOURCES_2 for ES/GS
From: Dave Airlie This fixes the corruption on rendering that we are seeing in certain geometry shaders. Signed-off-by: Dave Airlie --- src/gallium/drivers/r600/evergreen_state.c | 4 src/gallium/drivers/r600/evergreend.h | 2 ++ 2 files changed, 6 insertions(+) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index c6702a9..a3bbbcc 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -2362,6 +2362,8 @@ static void cayman_init_atom_start_cs(struct r600_context *rctx) r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0); /* to avoid GPU doing any preloading of constant from random address */ @@ -2801,6 +2803,8 @@ void evergreen_init_atom_start_cs(struct r600_context *rctx) r600_store_context_reg(cb, R_028848_SQ_PGM_RESOURCES_2_PS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_028864_SQ_PGM_RESOURCES_2_VS, S_028864_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_02887C_SQ_PGM_RESOURCES_2_GS, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); + r600_store_context_reg(cb, R_028894_SQ_PGM_RESOURCES_2_ES, S_028848_SINGLE_ROUND(V_SQ_ROUND_NEAREST_EVEN)); r600_store_context_reg(cb, R_0288A8_SQ_PGM_RESOURCES_FS, 0); /* to avoid GPU doing any preloading of constant from random address */ diff --git a/src/gallium/drivers/r600/evergreend.h b/src/gallium/drivers/r600/evergreend.h index 937ffcb..cf8906c 100644 --- a/src/gallium/drivers/r600/evergreend.h +++ b/src/gallium/drivers/r600/evergreend.h @@ -1497,6 +1497,7 @@ #define S_028878_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28) #define G_028878_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1) #define C_028878_UNCACHED_FIRST_INST 0xEFFF +#define R_02887C_SQ_PGM_RESOURCES_2_GS 0x02887C #define R_028890_SQ_PGM_RESOURCES_ES 0x028890 #define S_028890_NUM_GPRS(x) (((x) & 0xFF) << 0) @@ -1511,6 +1512,7 @@ #define S_028890_UNCACHED_FIRST_INST(x) (((x) & 0x1) << 28) #define G_028890_UNCACHED_FIRST_INST(x) (((x) >> 28) & 0x1) #define C_028890_UNCACHED_FIRST_INST 0xEFFF +#define R_028894_SQ_PGM_RESOURCES_2_ES 0x028894 #define R_028864_SQ_PGM_RESOURCES_2_VS 0x028864 #define S_028864_SINGLE_ROUND(x) (((x) & 0x3) << 0) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 03/18] mesa/extensions: Move entries entries to seperate file
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > With this infrastructure set in place, we can now reuse the entries to > generate useful code. > > v2. Add the new file into Makefile.sources (Emil) > > Signed-off-by: Nanley Chery > --- > src/mesa/Makefile.sources| 1 + > src/mesa/main/extensions.c | 321 > +-- > src/mesa/main/extensions_table.h | 320 ++ > 3 files changed, 322 insertions(+), 320 deletions(-) > create mode 100644 src/mesa/main/extensions_table.h Patch 3 is Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 04/18] mesa/extensions: Add extension::version
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Enable limiting advertised extension support by context version with > finer granularity. > > v2. Use uint*t type for version and note the expected values (Emil) > Use an 8-bit wide datatype. > Reformat macro for better readability (Chad) > > Signed-off-by: Nanley Chery > --- > src/mesa/main/extensions.c | 18 +- > src/mesa/main/extensions_table.h | 626 > +++ > 2 files changed, 329 insertions(+), 315 deletions(-) Please note in the commit message the preparatory nature of the patch. Namely, that (1) extension::version is nowhere used yet and (2) extension::version[*] is set to 0 everywhere. With that little extra in the commit message, patch 4 is Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 02/18] mesa/extensions: Wrap array entries in macros
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Now that we're using macros, remove the redundant text from each entry. > > Remove comments between the entries to make editing easier and separate > the sections with blank lines. Structure the EXT macros in a way that > helps reviewers verify that no meaning has been altered. > > Signed-off-by: Nanley Chery > --- > src/mesa/main/extensions.c | 645 > +++-- > 1 file changed, 323 insertions(+), 322 deletions(-) > > diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c > index 389bbb0..fb4d791 100644 > --- a/src/mesa/main/extensions.c > +++ b/src/mesa/main/extensions.c > @@ -83,328 +83,329 @@ struct extension { > * \brief Table of supported OpenGL extensions for all API's. > */ > static const struct extension extension_table[] = { > - /* ARB Extensions */ > - { "GL_ARB_ES2_compatibility", o(ARB_ES2_compatibility), > GL, 2009 }, > - { "GL_ARB_ES3_compatibility", o(ARB_ES3_compatibility), > GL, 2012 }, [...] > - { "GL_SGIS_texture_lod",o(dummy_true), > GLL,1997 }, > - { "GL_SUN_multi_draw_arrays", o(dummy_true), > GLL,1999 }, > +#define EXT(name_str, driver_cap, api_flags, ) \ > +{ .name = "GL_" #name_str, .offset = o(driver_cap), .api_set = > api_flags, .year = }, > +EXT(ARB_ES2_compatibility , ARB_ES2_compatibility > , GL , 2009) > +EXT(ARB_ES3_compatibility , ARB_ES3_compatibility > , GL , 2012) [...] Each EXT line is an array entry. Please indent it as such, just like the old array entries that the patch replaces. With some indentation, this patch is Reviewed-by: Chad Versace ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] i965: Add lossless compression to surface format table
On Wed, Nov 11, 2015 at 02:10:57PM -0800, Ben Widawsky wrote: > This subject used to say Add lossless compression to surface format TO table > > somehow, "to" got dropped. It's fixed locally. > Ignore this, subject looks fine to me and I'm an idiot. [snip] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 01/18] mesa/extensions: Remove array sentinel
On Wed, Nov 11, 2015 at 2:07 PM, Chad Versace wrote: > On Fri 30 Oct 2015, Nanley Chery wrote: > > From: Nanley Chery > > > > Simplify future updates to the extension struct array by removing > > the sentinel. > > > > Signed-off-by: Nanley Chery > > Patch 1 is > Reviewed-by: Chad Versace > > Thanks. > Since your series is long and risks suffering from rebase conflicts, > I suggest you push the little cleanup patches (like this one) as soon as > you're satisfied with the review. > > Sounds good. > Also, could you push a branch for v2 of your series? And put v2 in its > name? That would help me review it since it no longer applies cleanly to > master. > Sure, the branch is located here: http://cgit.freedesktop.org/~nchery/mesa/log/?h=ext_safev2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/7] i965: Add lossless compression to surface format table
This subject used to say Add lossless compression to surface format TO table somehow, "to" got dropped. It's fixed locally. On Wed, Nov 11, 2015 at 02:06:16PM -0800, Ben Widawsky wrote: > Background: Prior to Skylake and since Ivybridge Intel hardware has had the > ability to use a MCS (Multisample Control Surface) as auxiliary data in > "compression" operations on the surface. This reduces memory bandwidth. This > hardware was either used for MSAA compression, and fast clear operations. On > Gen8, a similar mechanism exists to allow the hiz buffer to be sampled from, > and > therefore this feature is sometimes referred to more generally as "AUX > buffers". > > Skylake adds the ability to have the display engine directly source compressed > surfaces on top of the ability to sample from them. Inference dictates that > enabling this display features adding a restriction to the formats which could > actually be compressed. The current set of surfaces seems to be a subset as > compared to previous gens (see the next patch). Also, if I had to guess I > would > guess that future gens add support for more surface formats. To make handling > this a bit easier to read, and more future proof, the support for this is > moved > into the surface formats table. > > Along with the modifications to the table, a helper function is also provided > to > determine if a surface is CCS compatible. Because fast clears are currently > disabled on SKL, we can plumb the helper all the way through here, and not > actually have anything break. > > The logic in the table works a bit differently than the other columns in the > table and therefore deserves a small mention. For most other features, the GEN > which began implementing it is set, and it is assumed future gens also support > this. For this feature, GEN9 actually eliminates support for certain formats. > We > could use this column to determine support for the similar feature on older > generation hardware. Aside from that being an error prone task which is > unrelated to enabling this on GEN9, it becomes somewhat tricky to implement > because of the fact that surface format support diminishes. You'd probably > want > another column to cleanly implement it. > > Requested-by: Chad Versace > Requested-by: Neil Roberts > Signed-off-by: Ben Widawsky > --- > src/mesa/drivers/dri/i965/brw_context.h | 2 + > src/mesa/drivers/dri/i965/brw_surface_formats.c | 527 > +--- > src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 + > 3 files changed, 285 insertions(+), 251 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > b/src/mesa/drivers/dri/i965/brw_context.h > index 4b2db61..6284c18 100644 > --- a/src/mesa/drivers/dri/i965/brw_context.h > +++ b/src/mesa/drivers/dri/i965/brw_context.h > @@ -1465,6 +1465,8 @@ void brw_upload_image_surfaces(struct brw_context *brw, > /* brw_surface_formats.c */ > bool brw_render_target_supported(struct brw_context *brw, > struct gl_renderbuffer *rb); > +bool brw_losslessly_compressible_format(struct brw_context *brw, > +uint32_t brw_format); > uint32_t brw_depth_format(struct brw_context *brw, mesa_format format); > mesa_format brw_lower_mesa_image_format(const struct brw_device_info > *devinfo, > mesa_format format); > diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c > b/src/mesa/drivers/dri/i965/brw_surface_formats.c > index 97fff60..a7cdc13 100644 > --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c > +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c > @@ -39,14 +39,15 @@ struct surface_format_info { > int input_vb; > int streamed_output_vb; > int color_processing; > + int lossless_compression_support; > const char *name; > }; > > /* This macro allows us to write the table almost as it appears in the PRM, > * while restructuring it to turn it into the C code we want. > */ > -#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, sf) \ > - [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, so, > color, #sf}, > +#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, ccs, sf) \ > + [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, so, > color, ccs, #sf}, > > #define Y 0 > #define x 999 > @@ -74,6 +75,7 @@ struct surface_format_info { > * VB- Input Vertex Buffer > * SO- Steamed Output Vertex Buffers (transform feedback) > * color - Color Processing > + * ccs - Lossless Compression Support (gen9+ only) > * sf- Surface Format > * > * See page 88 of the Sandybridge PRM VOL4_Part1 PDF. > @@ -84,257 +86,258 @@ struct surface_format_info { > * - VOL2_Part1 section 2.5.11 Format Conversion (vertex fetch). > * - VOL4_Part1 section 2.12.2.1.2 Sampler Output Channel Mapping. > * - VOL4_Part1 section 3.9.11 Render Target Write. > + * - Render Ta
Re: [Mesa-dev] [PATCH v2 01/18] mesa/extensions: Remove array sentinel
On Fri 30 Oct 2015, Nanley Chery wrote: > From: Nanley Chery > > Simplify future updates to the extension struct array by removing > the sentinel. > > Signed-off-by: Nanley Chery Patch 1 is Reviewed-by: Chad Versace Since your series is long and risks suffering from rebase conflicts, I suggest you push the little cleanup patches (like this one) as soon as you're satisfied with the review. Also, could you push a branch for v2 of your series? And put v2 in its name? That would help me review it since it no longer applies cleanly to master. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/7] [v2] i965/meta/gen9: Individually fast clear color attachments
The impetus for this patch comes from a seemingly benign statement within the spec (quoted within the patch). For me, this patch was at some point critical for getting stable piglit results (though this did not seem to be the case on a branch Chad was working on). It is very important for clearing multiple color buffer attachments and can be observed in the following piglit tests: spec/arb_framebuffer_object/fbo-drawbuffers-none glclear spec/ext_framebuffer_multisample/blit-multiple-render-targets 0 v2: Doing the framebuffer binding only once (Chad) Directly use the renderbuffers from the mt (Chad) Cc: Chad Versace Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 94 + 1 file changed, 81 insertions(+), 13 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index eac92d4..97444d7 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -428,6 +428,71 @@ use_rectlist(struct brw_context *brw, bool enable) brw->ctx.NewDriverState |= BRW_NEW_FRAGMENT_PROGRAM; } +/** + * Individually fast clear each color buffer attachment. On previous gens this + * isn't required. The motivation for this comes from one line (which seems to + * be specific to SKL+). The list item is in section titled _MCS Buffer for + * Render Target(s)_ + * + * "Since only one RT is bound with a clear pass, only one RT can be cleared + * at a time. To clear multiple RTs, multiple clear passes are required." + * + * The code follows the same idea as the resolve code which creates a fake FBO + * to avoid interfering with too much of the GL state. + */ +static void +fast_clear_attachments(struct brw_context *brw, + struct gl_framebuffer *fb, + uint32_t fast_clear_buffers, + struct rect fast_clear_rect) +{ + assert(brw->gen >= 9); + struct gl_context *ctx = &brw->ctx; + const GLuint old_fb = ctx->DrawBuffer->Name; + GLuint fbo; + + _mesa_GenFramebuffers(1, &fbo); + _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fbo); + _mesa_DrawBuffer(GL_COLOR_ATTACHMENT0); + + brw_fast_clear_init(brw); + use_rectlist(brw, true); + brw_bind_rep_write_shader(brw, (float *) fast_clear_color); + + /* SKL+ also has a resolve mode for compressed render targets and thus more +* bits to let us select the type of resolve. For fast clear resolves, it +* turns out we can use the same value as pre-SKL though. +*/ + set_fast_clear_op(brw, GEN7_PS_RENDER_TARGET_FAST_CLEAR_ENABLE); + + for (unsigned buf = 0; buf < fb->_NumColorDrawBuffers; buf++) { + struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[buf]; + struct intel_renderbuffer *irb = intel_renderbuffer(rb); + int index = fb->_ColorDrawBufferIndexes[buf]; + + if ((fast_clear_buffers & (1 << index)) == 0) + continue; + + + _mesa_framebuffer_renderbuffer(ctx, ctx->DrawBuffer, + GL_COLOR_ATTACHMENT0, rb, + "meta fast clear (per-attachment)"); + + brw_draw_rectlist(ctx, &fast_clear_rect, MAX2(1, fb->MaxNumLayers)); + + /* Now set the mcs we cleared to INTEL_FAST_CLEAR_STATE_CLEAR so we'll + * resolve them eventually. + */ + irb->mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR; + } + + set_fast_clear_op(brw, 0); + use_rectlist(brw, false); + + _mesa_DeleteFramebuffers(1, &fbo); + _mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, old_fb); +} + bool brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb, GLbitfield buffers, bool partial_clear) @@ -603,12 +668,27 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb, use_rectlist(brw, true); layers = MAX2(1, fb->MaxNumLayers); - if (fast_clear_buffers) { + + if (brw->gen >= 9 && fast_clear_buffers) { + fast_clear_attachments(brw, fb, fast_clear_buffers, fast_clear_rect); + } else if (fast_clear_buffers) { _mesa_meta_drawbuffers_from_bitfield(fast_clear_buffers); brw_bind_rep_write_shader(brw, (float *) fast_clear_color); set_fast_clear_op(brw, GEN7_PS_RENDER_TARGET_FAST_CLEAR_ENABLE); brw_draw_rectlist(ctx, &fast_clear_rect, layers); set_fast_clear_op(brw, 0); + + /* Now set the mcs we cleared to INTEL_FAST_CLEAR_STATE_CLEAR so we'll + * resolve them eventually. + */ + for (unsigned buf = 0; buf < fb->_NumColorDrawBuffers; buf++) { + struct gl_renderbuffer *rb = fb->_ColorDrawBuffers[buf]; + struct intel_renderbuffer *irb = intel_renderbuffer(rb); + int index = fb->_ColorDrawBufferIndexes[buf]; + + if ((1 << index) & fast_clear_buffers) +irb->mt->fast_clear_state = INTEL_FAST_CLEAR_STATE_CLEAR; + } } if (rep_clear_buffers) { @@ -617,18 +697,6 @@
[Mesa-dev] [PATCH 7/7] [v2] i965/gen9: Support fast clears for 32b float
SKL supports the ability to do fast clears and resolves of 32b RGBA as both integer and floats. This patch only enables float color clears because we haven't yet enabled integer color clears, (HW support for that was added in BDW). Two formats are explicitly disabled because they fail piglit tests, LUMINANCE16F and INTENSITY16F. There is some question about the validity of sampling from these surfaces for all gens, however, there seem to be no other failures, so I'd prefer to leave tackling that for a separate series. v2: Just reject the two failing types. Cc: Neil Roberts --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 12 ++-- src/mesa/drivers/dri/i965/gen8_surface_state.c | 8 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index 67dd22d..a024d02 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -360,10 +360,18 @@ is_color_fast_clear_compatible(struct brw_context *brw, } for (int i = 0; i < 4; i++) { - if (color->f[i] != 0.0f && color->f[i] != 1.0f && - _mesa_format_has_color_component(format, i)) { + if (!_mesa_format_has_color_component(format, i)) { + continue; + } + + if (brw->gen < 9 && + color->f[i] != 0.0f && color->f[i] != 1.0f) { return false; } + + if (format == MESA_FORMAT_L_FLOAT16 || + format == MESA_FORMAT_I_FLOAT16) + return false; } return true; } diff --git a/src/mesa/drivers/dri/i965/gen8_surface_state.c b/src/mesa/drivers/dri/i965/gen8_surface_state.c index 8fe480c..2381c83 100644 --- a/src/mesa/drivers/dri/i965/gen8_surface_state.c +++ b/src/mesa/drivers/dri/i965/gen8_surface_state.c @@ -188,14 +188,6 @@ gen8_emit_fast_clear_color(struct brw_context *brw, uint32_t *surf) { if (brw->gen >= 9) { -#define check_fast_clear_val(x) \ - assert(mt->gen9_fast_clear_color.f[x] == 0.0 || \ - mt->gen9_fast_clear_color.f[x] == 1.0) - check_fast_clear_val(0); - check_fast_clear_val(1); - check_fast_clear_val(2); - check_fast_clear_val(3); -#undef check_fast_clear_val surf[12] = mt->gen9_fast_clear_color.ui[0]; surf[13] = mt->gen9_fast_clear_color.ui[1]; surf[14] = mt->gen9_fast_clear_color.ui[2]; -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/7] Revert "i965/gen9: Enable rep clears on gen9"
This reverts commit 8a0c85b25853decb4a110b6d36d79c4f095d437b. It's not a strict revert because I don't want to bring back the gen < 9 check at this point in time. Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 5 - 1 file changed, 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index 97444d7..67dd22d 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -535,11 +535,6 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb, if (irb->mt->fast_clear_state == INTEL_FAST_CLEAR_STATE_NO_MCS) clear_type = REP_CLEAR; - if (brw->gen >= 9 && clear_type == FAST_CLEAR) { - perf_debug("fast MCS clears are disabled on gen9"); - clear_type = REP_CLEAR; - } - /* We can't do scissored fast clears because of the restrictions on the * fast clear rectangle size. */ -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/7] i965: Add lossless compression to surface format table
Background: Prior to Skylake and since Ivybridge Intel hardware has had the ability to use a MCS (Multisample Control Surface) as auxiliary data in "compression" operations on the surface. This reduces memory bandwidth. This hardware was either used for MSAA compression, and fast clear operations. On Gen8, a similar mechanism exists to allow the hiz buffer to be sampled from, and therefore this feature is sometimes referred to more generally as "AUX buffers". Skylake adds the ability to have the display engine directly source compressed surfaces on top of the ability to sample from them. Inference dictates that enabling this display features adding a restriction to the formats which could actually be compressed. The current set of surfaces seems to be a subset as compared to previous gens (see the next patch). Also, if I had to guess I would guess that future gens add support for more surface formats. To make handling this a bit easier to read, and more future proof, the support for this is moved into the surface formats table. Along with the modifications to the table, a helper function is also provided to determine if a surface is CCS compatible. Because fast clears are currently disabled on SKL, we can plumb the helper all the way through here, and not actually have anything break. The logic in the table works a bit differently than the other columns in the table and therefore deserves a small mention. For most other features, the GEN which began implementing it is set, and it is assumed future gens also support this. For this feature, GEN9 actually eliminates support for certain formats. We could use this column to determine support for the similar feature on older generation hardware. Aside from that being an error prone task which is unrelated to enabling this on GEN9, it becomes somewhat tricky to implement because of the fact that surface format support diminishes. You'd probably want another column to cleanly implement it. Requested-by: Chad Versace Requested-by: Neil Roberts Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/brw_context.h | 2 + src/mesa/drivers/dri/i965/brw_surface_formats.c | 527 +--- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 7 + 3 files changed, 285 insertions(+), 251 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_context.h b/src/mesa/drivers/dri/i965/brw_context.h index 4b2db61..6284c18 100644 --- a/src/mesa/drivers/dri/i965/brw_context.h +++ b/src/mesa/drivers/dri/i965/brw_context.h @@ -1465,6 +1465,8 @@ void brw_upload_image_surfaces(struct brw_context *brw, /* brw_surface_formats.c */ bool brw_render_target_supported(struct brw_context *brw, struct gl_renderbuffer *rb); +bool brw_losslessly_compressible_format(struct brw_context *brw, +uint32_t brw_format); uint32_t brw_depth_format(struct brw_context *brw, mesa_format format); mesa_format brw_lower_mesa_image_format(const struct brw_device_info *devinfo, mesa_format format); diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c index 97fff60..a7cdc13 100644 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c @@ -39,14 +39,15 @@ struct surface_format_info { int input_vb; int streamed_output_vb; int color_processing; + int lossless_compression_support; const char *name; }; /* This macro allows us to write the table almost as it appears in the PRM, * while restructuring it to turn it into the C code we want. */ -#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, sf) \ - [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, so, color, #sf}, +#define SF(sampl, filt, shad, ck, rt, ab, vb, so, color, ccs, sf) \ + [BRW_SURFACEFORMAT_##sf] = { true, sampl, filt, shad, ck, rt, ab, vb, so, color, ccs, #sf}, #define Y 0 #define x 999 @@ -74,6 +75,7 @@ struct surface_format_info { * VB- Input Vertex Buffer * SO- Steamed Output Vertex Buffers (transform feedback) * color - Color Processing + * ccs - Lossless Compression Support (gen9+ only) * sf- Surface Format * * See page 88 of the Sandybridge PRM VOL4_Part1 PDF. @@ -84,257 +86,258 @@ struct surface_format_info { * - VOL2_Part1 section 2.5.11 Format Conversion (vertex fetch). * - VOL4_Part1 section 2.12.2.1.2 Sampler Output Channel Mapping. * - VOL4_Part1 section 3.9.11 Render Target Write. + * - Render Target Surface Types [SKL+] */ const struct surface_format_info surface_formats[] = { -/* smpl filt shad CK RT AB VB SO color */ - SF( Y, 50, x, x, Y, Y, Y, Y, x, R32G32B32A32_FLOAT) - SF( Y, x, x, x, Y, x, Y, Y, x, R32G32B32A32_SINT) - SF( Y, x, x, x, Y, x, Y, Y, x, R32G32B32A32_UINT) - SF( x, x, x, x, x, x, Y, x, x, R32G32B32A32_UNORM) - SF( x, x,
[Mesa-dev] [PATCH 1/7] [v2] i965/skl: Add fast color clear infrastructure
Patch was originally called: i965/skl: Enable fast color clears on SKL Skylake introduces some differences in the way that fast clears are programmed and in the restrictions for using fast clears. Since some of these are non-obvious, and fast clears are currently disabled globally, we can enable the simple stuff here and leave the weirder stuff and separately reviewable work. Based on a patch originally from Kristian. Note that within this patch the change in scaling factors could be achieved with this hunk instead. I've opted to keep things more like how the docs describe it however. --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -150,9 +150,13 @@ intel_get_non_msrt_mcs_alignment(struct brw_context *brw, /* In release builds, fall through */ case I915_TILING_Y: *width_px = 32 / mt->cpp; - *height = 4; + if (brw->gen >= 9) + *height = 2; + else + *height = 4; v2: Add braces for the multiline (Matt + Chad) Comment updates (requested by Chad) Modified commit message Commit message from Chad explaining the MCS height change (Chad) Cc: Chad Versace Signed-off-by: Ben Widawsky Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 55 ++--- src/mesa/drivers/dri/i965/gen8_surface_state.c | 16 ++- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 17 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 13 -- 4 files changed, 81 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index 69fe7b4..eac92d4 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -204,7 +204,7 @@ brw_draw_rectlist(struct gl_context *ctx, struct rect *rect, int num_instances) } static void -get_fast_clear_rect(struct gl_framebuffer *fb, +get_fast_clear_rect(struct brw_context *brw, struct gl_framebuffer *fb, struct intel_renderbuffer *irb, struct rect *rect) { unsigned int x_align, y_align; @@ -228,7 +228,14 @@ get_fast_clear_rect(struct gl_framebuffer *fb, */ intel_get_non_msrt_mcs_alignment(irb->mt, &x_align, &y_align); x_align *= 16; - y_align *= 32; + + /* SKL+ line alignment requirement for Y-tiled are half those of the prior + * generations. + */ + if (brw->gen >= 9) + y_align *= 16; + else + y_align *= 32; /* From the Ivy Bridge PRM, Vol2 Part1 11.7 "MCS Buffer for Render * Target(s)", beneath the "Fast Color Clear" bullet (p327): @@ -265,8 +272,10 @@ get_fast_clear_rect(struct gl_framebuffer *fb, * terms of (width,height) of the RT. * * MSAA Width of Clear Rect Height of Clear Rect + * 2X Ceil(1/8*width) Ceil(1/2*height) * 4X Ceil(1/8*width) Ceil(1/2*height) * 8X Ceil(1/2*width) Ceil(1/2*height) + * 16X widthCeil(1/2*height) * * The text "with upper left co-ordinate to coincide with actual * rectangle being cleared" is a little confusing--it seems to imply @@ -289,6 +298,9 @@ get_fast_clear_rect(struct gl_framebuffer *fb, case 8: x_scaledown = 2; break; + case 16: + x_scaledown = 1; + break; default: unreachable("Unexpected sample count for fast clear"); } @@ -358,18 +370,25 @@ is_color_fast_clear_compatible(struct brw_context *brw, /** * Convert the given color to a bitfield suitable for ORing into DWORD 7 of - * SURFACE_STATE. + * SURFACE_STATE (DWORD 12-15 on SKL+). */ -static uint32_t -compute_fast_clear_color_bits(const union gl_color_union *color) +static void +set_fast_clear_color(struct brw_context *brw, + struct intel_mipmap_tree *mt, + const union gl_color_union *color) { - uint32_t bits = 0; - for (int i = 0; i < 4; i++) { - /* Testing for non-0 works for integer and float colors */ - if (color->f[i] != 0.0f) - bits |= 1 << (GEN7_SURFACE_CLEAR_COLOR_SHIFT + (3 - i)); + if (brw->gen >= 9) { + mt->gen9_fast_clear_color = *color; + } else { + mt->fast_clear_color_value = 0; + for (int i = 0; i < 4; i++) { + /* Testing for non-0 works for integer and float colors */ + if (color->f[i] != 0.0f) { + mt->fast_clear_color_value |= +1 << (GEN7_SURFACE_CLEAR_COLOR_SHIFT + (3 - i)); + } + } } - return bits; } static const uint32_t fast_clear_color[4] = { ~0, ~0, ~0, ~0 }; @@ -504,8 +523,7 @@ brw_meta_fast_clear(struct brw_context *brw, struct gl_framebuffer *fb, switch (clear_type) { case FAST_CLEAR: - irb->mt->fast_clear_color_value = -compute_fast_clear_color_bits(&ctx->Color
[Mesa-dev] [PATCH 0/7] [v2] Gen9 MCS buffers
These patches represent the remaining patches for enabling fast color clears on GEN9+. Mostly they address feedback from Chad. There is one spot where I am waiting for him to tell me what he wants for a comment [i965/meta/gen9: Individually fast clear color attachments]. Hopefully we can get that addressed without needing a resend just for that. Thanks to Chad, and Neil for the thorough feedback (and Kristian for helping me debug the implementation on some of that feedback). Aside from the removed and rejected patches, there is one new patch where I add the ccs column to the surface formats table. See that patch for more details, but in short, this column is for gen9+ ccs and not for the older thing which resembled ccs functionality. patches 2 (which is new), 3, 4, and 7 are the ones missing review (please correct me if someone reviewed any of those and I missed it somehow). This branch may be found here: http://cgit.freedesktop.org/~bwidawsk/mesa/log/?h=skl-fast-clear NOTE: These patches "regress" piglit.spec.arb_shader_image_load_store.execution.load-from-cleared-image because of this bug: https://bugs.freedesktop.org/show_bug.cgi?id=92849 Cc: Chad Versace Ben Widawsky (7): i965/skl: Add fast color clear infrastructure i965: Add lossless compression to surface format table i965/skl: skip fast clears for certain surface formats i965/meta/gen9: Individually fast clear color attachments Revert "i965/gen9: Disable MCS for 1x color surfaces" Revert "i965/gen9: Enable rep clears on gen9" i965/gen9: Support fast clears for 32b float src/mesa/drivers/dri/i965/brw_context.h | 2 + src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 166 ++-- src/mesa/drivers/dri/i965/brw_surface_formats.c | 527 +--- src/mesa/drivers/dri/i965/gen8_surface_state.c | 15 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 32 +- src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 13 +- 6 files changed, 454 insertions(+), 301 deletions(-) -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/7] [v2] i965/skl: skip fast clears for certain surface formats
Some of the information originally in this commit message is now in the patch before this. SKL adds compressible render targets and as a result mutates some of the programming for fast clears and resolves. There is a new internal surface type called the CCS. The old AUX_MCS bit becomes AUX_CCS_D. "The Auxiliary surface is a CCS (Color Control Surface) with compression disabled or an MCS with compression enabled, depending on number of multisamples. MCS (Multisample Control Surface) is a special type of CCS." The formats which are supported are defined in the table titled "Render Target Surface Types [SKL+]". There is no PRM yet to reference. The previously implemented helper function already does the right thing provided the table is correct. v2: Use better English in commit message (Matt) s/compressable/compressible/ (Matt) Don't compare bools to true (Matt) Use the helper function and don't increase the context size - this is mostly implemented in the patch just before this (Chad, Neil) Remove an "invalid" assert (Chad) Fix assertion to check num_samples > 1, instead of num_samples (Chad) Cc: Chad Versace Cc: Neil Roberts Signed-off-by: Ben Widawsky --- src/mesa/drivers/dri/i965/brw_surface_formats.c | 52 - src/mesa/drivers/dri/i965/gen8_surface_state.c | 7 +++- 2 files changed, 31 insertions(+), 28 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_surface_formats.c b/src/mesa/drivers/dri/i965/brw_surface_formats.c index a7cdc13..a527f2f 100644 --- a/src/mesa/drivers/dri/i965/brw_surface_formats.c +++ b/src/mesa/drivers/dri/i965/brw_surface_formats.c @@ -90,9 +90,9 @@ struct surface_format_info { */ const struct surface_format_info surface_formats[] = { /* smpl filt shad CK RT AB VB SO color ccs */ - SF( Y, 50, x, x, Y, Y, Y, Y, x,x, R32G32B32A32_FLOAT) - SF( Y, x, x, x, Y, x, Y, Y, x,x, R32G32B32A32_SINT) - SF( Y, x, x, x, Y, x, Y, Y, x,x, R32G32B32A32_UINT) + SF( Y, 50, x, x, Y, Y, Y, Y, x, 90, R32G32B32A32_FLOAT) + SF( Y, x, x, x, Y, x, Y, Y, x, 90, R32G32B32A32_SINT) + SF( Y, x, x, x, Y, x, Y, Y, x, 90, R32G32B32A32_UINT) SF( x, x, x, x, x, x, Y, x, x,x, R32G32B32A32_UNORM) SF( x, x, x, x, x, x, Y, x, x,x, R32G32B32A32_SNORM) SF( x, x, x, x, x, x, Y, x, x,x, R64G64_FLOAT) @@ -109,15 +109,15 @@ const struct surface_format_info surface_formats[] = { SF( x, x, x, x, x, x, Y, x, x,x, R32G32B32_SSCALED) SF( x, x, x, x, x, x, Y, x, x,x, R32G32B32_USCALED) SF( x, x, x, x, x, x, x, x, x,x, R32G32B32_SFIXED) - SF( Y, Y, x, x, Y, 45, Y, x, 60,x, R16G16B16A16_UNORM) - SF( Y, Y, x, x, Y, 60, Y, x, x,x, R16G16B16A16_SNORM) - SF( Y, x, x, x, Y, x, Y, x, x,x, R16G16B16A16_SINT) - SF( Y, x, x, x, Y, x, Y, x, x,x, R16G16B16A16_UINT) - SF( Y, Y, x, x, Y, Y, Y, x, x,x, R16G16B16A16_FLOAT) - SF( Y, 50, x, x, Y, Y, Y, Y, x,x, R32G32_FLOAT) + SF( Y, Y, x, x, Y, 45, Y, x, 60, 90, R16G16B16A16_UNORM) + SF( Y, Y, x, x, Y, 60, Y, x, x, 90, R16G16B16A16_SNORM) + SF( Y, x, x, x, Y, x, Y, x, x, 90, R16G16B16A16_SINT) + SF( Y, x, x, x, Y, x, Y, x, x, 90, R16G16B16A16_UINT) + SF( Y, Y, x, x, Y, Y, Y, x, x, 90, R16G16B16A16_FLOAT) + SF( Y, 50, x, x, Y, Y, Y, Y, x, 90, R32G32_FLOAT) SF( Y, 70, x, x, Y, Y, Y, Y, x,x, R32G32_FLOAT_LD) - SF( Y, x, x, x, Y, x, Y, Y, x,x, R32G32_SINT) - SF( Y, x, x, x, Y, x, Y, Y, x,x, R32G32_UINT) + SF( Y, x, x, x, Y, x, Y, Y, x, 90, R32G32_SINT) + SF( Y, x, x, x, Y, x, Y, Y, x, 90, R32G32_UINT) SF( Y, 50, Y, x, x, x, x, x, x,x, R32_FLOAT_X8X24_TYPELESS) SF( Y, x, x, x, x, x, x, x, x,x, X32_TYPELESS_G8X24_UINT) SF( Y, 50, x, x, x, x, x, x, x,x, L32A32_FLOAT) @@ -125,7 +125,7 @@ const struct surface_format_info surface_formats[] = { SF( x, x, x, x, x, x, Y, x, x,x, R32G32_SNORM) SF( x, x, x, x, x, x, Y, x, x,x, R64_FLOAT) SF( Y, Y, x, x, x, x, x, x, x,x, R16G16B16X16_UNORM) - SF( Y, Y, x, x, x, x, x, x, x,x, R16G16B16X16_FLOAT) + SF( Y, Y, x, x, x, x, x, x, x, 90, R16G16B16X16_FLOAT) SF( Y, 50, x, x, x, x, x, x, x,x, A32X32_FLOAT) SF( Y, 50, x, x, x, x, x, x, x,x, L32X32_FLOAT) SF( Y, 50, x, x, x, x, x, x, x,x, I32X32_FLOAT) @@ -135,29 +135,29 @@ const struct surface_format_info surface_formats[] = { SF( x, x, x, x, x, x, Y, x, x,x, R32G32_USCALED) SF( x, x, x, x, x, x, x, x, x,x, R32G32_SFIXED) SF( x, x, x, x, x, x, x, x, x,x, R64_PASSTHRU) - SF( Y, Y, x, Y, Y, Y, Y, x,
[Mesa-dev] [PATCH 5/7] Revert "i965/gen9: Disable MCS for 1x color surfaces"
This reverts commit dcd59a9e322edeea74187bcad65a8e56c0bfaaa2. Reviewed-by: Neil Roberts --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 8 1 file changed, 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 4f6848f..285c3f7 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -208,14 +208,6 @@ intel_miptree_supports_non_msrt_fast_clear(struct brw_context *brw, if (brw->gen < 7) return false; - if (brw->gen >= 9) { - /* FINISHME: Enable singlesample fast MCS clears on SKL after all GPU - * FINISHME: hangs are resolved. - */ - perf_debug("singlesample fast MCS clears disabled on gen9"); - return false; - } - if (mt->disable_aux_buffers) return false; -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4] gallium/hud: control visibility at startup and runtime.
For documentation purposes, we discussed things in IRC and should be reflected in v6. -- Jimmy On Sat, Nov 7, 2015 at 10:49 PM, Ilia Mirkin wrote: > On Sat, Nov 7, 2015 at 11:42 PM, Jimmy Berry wrote: >> - env GALLIUM_HUD_VISIBLE: control default visibility >> - env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal >> --- >> >> per imirkin's comments: >> - long to unsigned >> - ignore signal value of 0 rather than always printing error message >> (oversight) >> >> docs/envvars.html | 6 ++ >> src/gallium/auxiliary/hud/hud_context.c | 28 >> 2 files changed, 34 insertions(+) >> >> diff --git a/docs/envvars.html b/docs/envvars.html >> index bdfe999..530bbb7 100644 >> --- a/docs/envvars.html >> +++ b/docs/envvars.html >> @@ -179,6 +179,12 @@ Mesa EGL supports different sets of environment >> variables. See the >> GALLIUM_HUD - draws various information on the screen, like framerate, >> cpu load, driver statistics, performance counters, etc. >> Set GALLIUM_HUD=help and run e.g. glxgears for more info. >> +GALLIUM_HUD_VISIBLE - control default visibility, defaults to true. >> +GALLIUM_HUD_TOGGLE_SIGNAL - toggle visibility via user specified signal. >> +Especially useful to toggle hud at specific points of application and >> +disable for unencumbered viewing the rest of the time. For example, set >> +GALLIUM_HUD_VISIBLE to false and GALLIUM_HUD_SIGNAL_TOGGLE to 10 >> (SIGUSR1). >> +Use kill -10 to toggle the hud as desired. >> GALLIUM_LOG_FILE - specifies a file for logging all errors, warnings, >> etc. >> rather than stderr. >> GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium environment >> diff --git a/src/gallium/auxiliary/hud/hud_context.c >> b/src/gallium/auxiliary/hud/hud_context.c >> index ffe30b8..f9dfcc7 100644 >> --- a/src/gallium/auxiliary/hud/hud_context.c >> +++ b/src/gallium/auxiliary/hud/hud_context.c >> @@ -33,6 +33,7 @@ >> * Set GALLIUM_HUD=help for more info. >> */ >> >> +#include >> #include >> >> #include "hud/hud_context.h" >> @@ -51,6 +52,8 @@ >> #include "tgsi/tgsi_text.h" >> #include "tgsi/tgsi_dump.h" >> >> +/* Control the visibility of all HUD contexts */ >> +static boolean huds_visible = TRUE; >> >> struct hud_context { >> struct pipe_context *pipe; >> @@ -95,6 +98,11 @@ struct hud_context { >> } text, bg, whitelines; >> }; >> >> +static void >> +signal_visible_handler(int sig, siginfo_t *siginfo, void *context) >> +{ >> + huds_visible = !huds_visible; >> +} >> >> static void >> hud_draw_colored_prims(struct hud_context *hud, unsigned prim, >> @@ -441,6 +449,9 @@ hud_draw(struct hud_context *hud, struct pipe_resource >> *tex) >> struct hud_pane *pane; >> struct hud_graph *gr; >> >> + if (!huds_visible) >> + return; >> + >> hud->fb_width = tex->width0; >> hud->fb_height = tex->height0; >> hud->constants.two_div_fb_width = 2.0f / hud->fb_width; >> @@ -1125,6 +1136,10 @@ hud_create(struct pipe_context *pipe, struct >> cso_context *cso) >> struct pipe_sampler_view view_templ; >> unsigned i; >> const char *env = debug_get_option("GALLIUM_HUD", NULL); >> + unsigned signo = debug_get_num_option("GALLIUM_HUD_TOGGLE_SIGNAL", 0); >> + boolean sig_handled = FALSE; >> + struct sigaction action = {}; >> + huds_visible = debug_get_bool_option("GALLIUM_HUD_VISIBLE", TRUE); >> >> if (!env || !*env) >>return NULL; >> @@ -1267,6 +1282,19 @@ hud_create(struct pipe_context *pipe, struct >> cso_context *cso) >> >> LIST_INITHEAD(&hud->pane_list); >> >> + /* setup sig handler once for all hud contexts */ >> + if (!sig_handled && signo != 0) { >> + action.sa_sigaction = &signal_visible_handler; >> + action.sa_flags = SA_SIGINFO; >> + >> + if (signo >= NSIG) >> + fprintf(stderr, "gallium_hud: invalid signal %u\n", signo); >> + else if (sigaction(signo, &action, NULL) < 0) >> + fprintf(stderr, "gallium_hud: unable to set handler for signal >> %u\n", signo); >> + >> + sig_handled = TRUE; > > sig_handled is scoped to this function, so this will have fairly > minimal effect. Did you mean to make it a global / pthread_once_t type > of thing? > >> + } >> + >> hud_parse_env_var(hud, env); >> return hud; >> } >> -- >> 2.6.2 >> >> ___ >> mesa-dev mailing list >> mesa-dev@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallium/hud: document GALLIUM_HUD_PERIOD in envvars.html.
Do I need to do something to else to get this committed? Not 100% on process. -- Jimmy On Wed, Nov 4, 2015 at 2:32 AM, Samuel Pitoiset wrote: > Reviewed-by: Samuel Pitoiset > > > > On 11/04/2015 06:24 AM, Jimmy Berry wrote: >> >> --- >> docs/envvars.html | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/docs/envvars.html b/docs/envvars.html >> index bdfe999..173c941 100644 >> --- a/docs/envvars.html >> +++ b/docs/envvars.html >> @@ -179,6 +179,8 @@ Mesa EGL supports different sets of environment >> variables. See the >> GALLIUM_HUD - draws various information on the screen, like >> framerate, >> cpu load, driver statistics, performance counters, etc. >> Set GALLIUM_HUD=help and run e.g. glxgears for more info. >> +GALLIUM_HUD_PERIOD - sets the hud update rate in seconds (float). Use >> zero >> +to update every frame. The default period is 1/2 second. >> GALLIUM_LOG_FILE - specifies a file for logging all errors, >> warnings, etc. >> rather than stderr. >> GALLIUM_PRINT_OPTIONS - if non-zero, print all the Gallium >> environment >> > > -- > -Samuel ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11 November 2015 at 19:51, Brian Paul wrote: > On 11/11/2015 11:38 AM, Emil Velikov wrote: >> >> On 11 November 2015 at 18:25, Thomas Hellstrom >> wrote: >>> >>> On 11/11/2015 07:07 PM, Brian Paul wrote: On 11/11/2015 10:44 AM, Emil Velikov wrote: > > On 11 November 2015 at 16:48, Brian Paul wrote: >> >> On 11/11/2015 08:44 AM, Emil Velikov wrote: > > >>> >>> I have seen similar type of documents in the past, most of which >>> going >>> out of date very quickly due to distribution changes and/or others. >>> Wondering how you'll feel about "check your distro and add svga to >>> the >>> gallium-drivers array" style of instructions ? >> >> >> >> I'm afraid I don't quite understand what you're saying there. Can you >> elaborate? >> >> > Rather than walking through the requirements, configure and make/make > install steps, just forward people to the distro specific wiki on "how > to build mesa/kernel" and explicitly mention the differences: > mesa: > - XA must be enabled: --enable-xa > - svga must be listed in the gallium drivers: > --with-gallium-drivers=svga... > > kernel: >- Set DRM_VMWGFX > > others... I guess I've never seen those wikis. I'd have to search for them, but I really don't have the time now. We actually have an in-house shell script that installs all the pre-req packages, pulls the git trees, builds and installs for a variety of guest OSes. But it has some VMware-specific stuff that I'd have to trim out before making public. > > Related: does the upstream [1] vmwgfx module work well when combined > with upstream core drm across different versions ? Considering how > well Thomas is handling upstreaming shouldn't the module from the > kernel be recommended ? Either should be fine at this point but the build instructions cover the case of one having an older distro that may not have the GL3-enabled kernel module already. >>> >>> The upstream[1] vmwgfx module should work well with any linux kernel >>> dating back to 2.6.32 unless the distro has changed the kernel API from >>> the base version. It ships with builtin stripped drm and ttm to handle >>> compatibility issues, and is intended for people (mostly including >>> ourselves and our QA team) that want to try out new features without >>> installing a completely new kernel. >>> >> Ok seems that my point is too subtle, so I'll try from another angle. >> >> The wiki instructions say "nuke he vmwgfx.ko module" and implicitly >> "keep drm.ko". If we ignore the unlikely cases where either one and/or >> both is built-in, we can have a case where the new vmwgfx is build >> against core drm from the upstream, yet the downstream drm module >> is/gets loaded. As core drm often goes through various changes, you >> can see how bad things are likely to happen. > > > Well, the above-mentioned build script doesn't touch drm.ko and works on > about 14 different versions of Ubuntu, Mint, Fedora, RHEL, etc. so I don't > think we've ever seen that conflict. But if someone's doing their own > kernel/graphics builds/installs, who knows. If it comes up, we'll just have > to address it. > Ouch... I see what's happening here. You're not using any of the kernel core drm/ttm/foo - you're just static linking the local ones into vmwgfx.ko. This will explain why the lack of issues. Well played guys ! Cheers, Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/11/2015 08:51 PM, Brian Paul wrote: > On 11/11/2015 11:38 AM, Emil Velikov wrote: >> On 11 November 2015 at 18:25, Thomas Hellstrom >> wrote: >>> On 11/11/2015 07:07 PM, Brian Paul wrote: On 11/11/2015 10:44 AM, Emil Velikov wrote: > On 11 November 2015 at 16:48, Brian Paul wrote: >> On 11/11/2015 08:44 AM, Emil Velikov wrote: > >>> >>> I have seen similar type of documents in the past, most of which >>> going >>> out of date very quickly due to distribution changes and/or others. >>> Wondering how you'll feel about "check your distro and add svga >>> to the >>> gallium-drivers array" style of instructions ? >> >> >> I'm afraid I don't quite understand what you're saying there. >> Can you >> elaborate? >> >> > Rather than walking through the requirements, configure and make/make > install steps, just forward people to the distro specific wiki on > "how > to build mesa/kernel" and explicitly mention the differences: > mesa: > - XA must be enabled: --enable-xa > - svga must be listed in the gallium drivers: > --with-gallium-drivers=svga... > > kernel: >- Set DRM_VMWGFX > > others... I guess I've never seen those wikis. I'd have to search for them, but I really don't have the time now. We actually have an in-house shell script that installs all the pre-req packages, pulls the git trees, builds and installs for a variety of guest OSes. But it has some VMware-specific stuff that I'd have to trim out before making public. > > Related: does the upstream [1] vmwgfx module work well when combined > with upstream core drm across different versions ? Considering how > well Thomas is handling upstreaming shouldn't the module from the > kernel be recommended ? Either should be fine at this point but the build instructions cover the case of one having an older distro that may not have the GL3-enabled kernel module already. >>> >>> The upstream[1] vmwgfx module should work well with any linux kernel >>> dating back to 2.6.32 unless the distro has changed the kernel API from >>> the base version. It ships with builtin stripped drm and ttm to handle >>> compatibility issues, and is intended for people (mostly including >>> ourselves and our QA team) that want to try out new features without >>> installing a completely new kernel. >>> >> Ok seems that my point is too subtle, so I'll try from another angle. >> >> The wiki instructions say "nuke he vmwgfx.ko module" and implicitly >> "keep drm.ko". If we ignore the unlikely cases where either one and/or >> both is built-in, we can have a case where the new vmwgfx is build >> against core drm from the upstream, yet the downstream drm module >> is/gets loaded. As core drm often goes through various changes, you >> can see how bad things are likely to happen. > Actually they are unlikely to happen since this scenario was taken into account when the standalone vmwgfx module was designed, and unless we have a bug it should be safe (I've seen two duplicate TTM symbols exported in 3.19, but that's cosmetic only until another TTM aware driver is loaded, which is currently not likely). When built, only internal drm / ttm headers are used, no internal drm or ttm symbols are exported (except as stated above) and all drm and ttm references are internally resolved except two, namely those that allow binding to core drm's sysfs, so that the driver's sysfs interface shows up where it should; among other things for Mir compatibility. It's true that if these two single DRM functions change their interface, we need to handle that, but that's a small interface to watch over; The driver is simply designed to interoperate with core drm and TTM. /Thomas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/24] i965/fs: Set stride = 1 for vector immediate types.
On Wed, Nov 11, 2015 at 1:02 PM, Kenneth Graunke wrote: > On Monday, November 02, 2015 04:29:25 PM Matt Turner wrote: >> The generator asserts that this is true (and presumably it's useful in >> some optimization passes?) and the VF fs_reg constructors did this (by >> virtue of the fact that it doesn't override what init() does). >> >> In the next commit, calling this constructor with brw_imm_* will >> generate an IMM file register rather than a HW_REG, making this change >> necessary to avoid breakage with existing uses of brw_imm_v(). >> --- >> src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++ >> 1 file changed, 6 insertions(+) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp >> b/src/mesa/drivers/dri/i965/brw_fs.cpp >> index 91eaf61..92a9437 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp >> @@ -427,6 +427,12 @@ fs_reg::fs_reg(struct brw_reg reg) : >> this->subreg_offset = 0; >> this->reladdr = NULL; >> this->stride = 1; >> + if (this->file == IMM && >> + (this->type != BRW_REGISTER_TYPE_V && >> +this->type != BRW_REGISTER_TYPE_UV && >> +this->type != BRW_REGISTER_TYPE_VF)) { >> + this->stride = 0; >> + } >> } >> >> bool >> > > It's confusing that your subject says "Set stride to 1 for vector > immediate types" yet your code sets stride to 0. Right, subject fail. > I would suggest renaming the patch to > > "i965/fs: Set stride correctly for immediates in fs_reg(brw_reg)" > > and adding some text to the commit message like: > > "The fs_reg() constructors for immediates set stride to 0, except for > vector-immediates, which set stride to 1. This patch makes the fs_reg > constructor that takes a brw_reg do likewise, so that stride is set > correctly for cases such as fs_reg(brw_imm_v(...))." Sure, sounds good. > Regardless, > Reviewed-by: Kenneth Graunke ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/24] i965: Initialize registers' file to BAD_FILE.
On Wed, Nov 11, 2015 at 12:46 PM, Kenneth Graunke wrote: > On Monday, November 02, 2015 04:29:22 PM Matt Turner wrote: >> The test (file == BAD_FILE) works on registers for which the constructor >> has not run because BAD_FILE is zero. The next commit will move >> BAD_FILE in the enum so that it's no longer zero. >> --- >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 +- >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++ >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 9 + >> 3 files changed, 21 insertions(+), 1 deletion(-) >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> index 7eeff93..611347c 100644 >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> @@ -260,6 +260,10 @@ void >> fs_visitor::nir_emit_system_values() >> { >> nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX); >> + for (unsigned i = 0; i < SYSTEM_VALUE_MAX; i++) { >> + nir_system_values[i].file = BAD_FILE; > > How about we do this instead: > >nir_system_values[i] = fs_reg(); > > That way, they're properly constructed using the default constructor, > which would not only set BAD_FILE, but properly initialize everything, > so we don't have to revisit this if we make other changes in fs_reg(). Is it worth is? The function this code exists in is the thing that initializes the system values. And, of course if file == BAD_FILE, no other fields mean anything. Neither of those are likely to change. > Similarly below. > > That patch would get a: > Reviewed-by: Kenneth Graunke ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/24] i965/fs: Set stride = 1 for vector immediate types.
On Monday, November 02, 2015 04:29:25 PM Matt Turner wrote: > The generator asserts that this is true (and presumably it's useful in > some optimization passes?) and the VF fs_reg constructors did this (by > virtue of the fact that it doesn't override what init() does). > > In the next commit, calling this constructor with brw_imm_* will > generate an IMM file register rather than a HW_REG, making this change > necessary to avoid breakage with existing uses of brw_imm_v(). Also, the next commit does not do that. Presumably some later commit does...(hooray, rebasing!) signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/24] i965/fs: Set stride = 1 for vector immediate types.
On Monday, November 02, 2015 04:29:25 PM Matt Turner wrote: > The generator asserts that this is true (and presumably it's useful in > some optimization passes?) and the VF fs_reg constructors did this (by > virtue of the fact that it doesn't override what init() does). > > In the next commit, calling this constructor with brw_imm_* will > generate an IMM file register rather than a HW_REG, making this change > necessary to avoid breakage with existing uses of brw_imm_v(). > --- > src/mesa/drivers/dri/i965/brw_fs.cpp | 6 ++ > 1 file changed, 6 insertions(+) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index 91eaf61..92a9437 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp > @@ -427,6 +427,12 @@ fs_reg::fs_reg(struct brw_reg reg) : > this->subreg_offset = 0; > this->reladdr = NULL; > this->stride = 1; > + if (this->file == IMM && > + (this->type != BRW_REGISTER_TYPE_V && > +this->type != BRW_REGISTER_TYPE_UV && > +this->type != BRW_REGISTER_TYPE_VF)) { > + this->stride = 0; > + } > } > > bool > It's confusing that your subject says "Set stride to 1 for vector immediate types" yet your code sets stride to 0. I would suggest renaming the patch to "i965/fs: Set stride correctly for immediates in fs_reg(brw_reg)" and adding some text to the commit message like: "The fs_reg() constructors for immediates set stride to 0, except for vector-immediates, which set stride to 1. This patch makes the fs_reg constructor that takes a brw_reg do likewise, so that stride is set correctly for cases such as fs_reg(brw_imm_v(...))." Regardless, Reviewed-by: Kenneth Graunke signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 12/24] i965: Initialize registers' file to BAD_FILE.
On Monday, November 02, 2015 04:29:22 PM Matt Turner wrote: > The test (file == BAD_FILE) works on registers for which the constructor > has not run because BAD_FILE is zero. The next commit will move > BAD_FILE in the enum so that it's no longer zero. > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 10 +- > src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 +++ > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 9 + > 3 files changed, 21 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index 7eeff93..611347c 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -260,6 +260,10 @@ void > fs_visitor::nir_emit_system_values() > { > nir_system_values = ralloc_array(mem_ctx, fs_reg, SYSTEM_VALUE_MAX); > + for (unsigned i = 0; i < SYSTEM_VALUE_MAX; i++) { > + nir_system_values[i].file = BAD_FILE; How about we do this instead: nir_system_values[i] = fs_reg(); That way, they're properly constructed using the default constructor, which would not only set BAD_FILE, but properly initialize everything, so we don't have to revisit this if we make other changes in fs_reg(). Similarly below. That patch would get a: Reviewed-by: Kenneth Graunke > + } > + > nir_foreach_overload(nir, overload) { >assert(strcmp(overload->function->name, "main") == 0); >assert(overload->impl); > @@ -270,7 +274,11 @@ fs_visitor::nir_emit_system_values() > void > fs_visitor::nir_emit_impl(nir_function_impl *impl) > { > - nir_locals = reralloc(mem_ctx, nir_locals, fs_reg, impl->reg_alloc); > + nir_locals = ralloc_array(mem_ctx, fs_reg, impl->reg_alloc); > + for (unsigned i = 0; i < impl->reg_alloc; i++) { > + nir_locals[i].file = BAD_FILE; > + } > + > foreach_list_typed(nir_register, reg, node, &impl->registers) { >unsigned array_elems = > reg->num_array_elems == 0 ? 1 : reg->num_array_elems; > diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > index 1e46f9a..ef6e19a 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp > @@ -1157,6 +1157,9 @@ fs_visitor::init() > memset(&this->payload, 0, sizeof(this->payload)); > memset(this->outputs, 0, sizeof(this->outputs)); > memset(this->output_components, 0, sizeof(this->output_components)); > + for (unsigned i = 0; i < ARRAY_SIZE(this->outputs); i++) { > + this->outputs[i].file = BAD_FILE; > + } > this->source_depth_to_render_target = false; > this->runtime_check_aads_emit = false; > this->first_non_payload_grf = 0; > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > index 8ca8ddb..bdb3d02 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -106,6 +106,9 @@ void > vec4_visitor::nir_setup_system_values() > { > nir_system_values = ralloc_array(mem_ctx, dst_reg, SYSTEM_VALUE_MAX); > + for (unsigned i = 0; i < SYSTEM_VALUE_MAX; i++) { > + nir_system_values[i].file = BAD_FILE; > + } > > nir_foreach_overload(nir, overload) { >assert(strcmp(overload->function->name, "main") == 0); > @@ -118,6 +121,9 @@ void > vec4_visitor::nir_setup_inputs() > { > nir_inputs = ralloc_array(mem_ctx, src_reg, nir->num_inputs); > + for (unsigned i = 0; i < nir->num_inputs; i++) { > + nir_inputs[i].file = BAD_FILE; > + } > > nir_foreach_variable(var, &nir->inputs) { >int offset = var->data.driver_location; > @@ -148,6 +154,9 @@ void > vec4_visitor::nir_emit_impl(nir_function_impl *impl) > { > nir_locals = ralloc_array(mem_ctx, dst_reg, impl->reg_alloc); > + for (unsigned i = 0; i < impl->reg_alloc; i++) { > + nir_locals[i].file = BAD_FILE; > + } > > foreach_list_typed(nir_register, reg, node, &impl->registers) { >unsigned array_elems = > signature.asc Description: This is a digitally signed message part. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH 00/40] Rework/consolidate the pipe-loader business
On Mon, Nov 9, 2015 at 8:39 AM, Rob Clark wrote: > On Mon, Nov 9, 2015 at 8:30 AM, Emil Velikov wrote: >> On 30 October 2015 at 17:57, Emil Velikov wrote: >>> On 19 October 2015 at 18:41, Emil Velikov wrote: On 19 October 2015 at 17:07, Brian Paul wrote: >>> > > I'm not too familiar with this code or these changes but I'm wondering how > much of chance there is of this breaking any driver/target builds. > > For example, is there a chance of breaking the VMware driver or Windows > builds? I don't have time to test this series here ATM, but I guess I > could > later this week. > Afaics the Windows builds are unaffected. On the svga/vmwgfx front, (of the top of my head) I'd say - 1-2% chance that things have regressed (due to git rebase fallouts, as spotted with the missing winsys->destroy). But in all means please do give them a bash. These can wait a week or so (but hopefully less than a month). >>> Hi Brian, >>> I suspect you did not have the time to test the series, have you ? As >>> we've got some patches that conflict with this, I'm wondering if I >>> should respin things (+ drop the sw_winsys rework as mentioned) or if >>> I should chill for another week or so. >>> >>> >>> Rob, >>> Considering how unhappy you were with the with the current state of >>> things (bth neither was I, but you can see why I haven't bothered >>> earlier), can you please take a look and/or test. You will still need >>> the "WIP: gallium: introduce load_pipe_screen()" on top, for Android. >>> >>> The lot can be found in branch pipe-loader-redux at >>> https://github.com/evelikov/Mesa/ >>> >> Guys anyone ? >> >> I'm not asking for anyone to review the patches, although a ACK/NACK >> and/or test on your platforms will be greatly appreciated. > > yeah, I'm interested in this patchset.. sorry, been busy on other > things so didn't yet have a chance to rebase my android build > (although I am also interested in moving drm_gralloc into mesa tree).. > btw, not sure if I'll really be getting time to play w/ android builds before next release branch.. but imho this patchset is moving in the right direction, so if it doesn't break any normal use cases, I'd say to go ahead and push w/ my ack-by. If we at least get it in before branch point, and some small tweaks are needed later for android build, we can always cherry-pick those for the release branch BR, -R ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 9/9] i965: Check accumulator restrictions.
On Monday, November 09, 2015 12:05:34 PM Matt Turner wrote: > On Tue, Nov 3, 2015 at 11:53 PM, Kenneth Graunke > wrote: > > On Wednesday, October 21, 2015 03:58:17 PM Matt Turner wrote: > >> --- > >> src/mesa/drivers/dri/i965/brw_eu_validate.c | 244 > >> > >> 1 file changed, 244 insertions(+) > >> > >> diff --git a/src/mesa/drivers/dri/i965/brw_eu_validate.c > >> b/src/mesa/drivers/dri/i965/brw_eu_validate.c > >> index eb57962..3d16f90 100644 > >> --- a/src/mesa/drivers/dri/i965/brw_eu_validate.c > >> +++ b/src/mesa/drivers/dri/i965/brw_eu_validate.c > >> @@ -54,6 +54,16 @@ cat(struct string *dest, const struct string src) > >>} \ > >> } while(0) > >> > >> +#define CHECK(func) \ > >> + do { \ > >> + struct string __msg = func; \ > >> + if (__msg.str) { \ > >> + cat(&error_msg, __msg); \ > >> + free(__msg.str); \ > >> + valid = false; \ > >> + } \ > >> + } while (0) > >> + > >> static bool > >> src0_is_null(const struct brw_device_info *devinfo, const brw_inst *inst) > >> { > >> @@ -68,6 +78,42 @@ src1_is_null(const struct brw_device_info *devinfo, > >> const brw_inst *inst) > >>brw_inst_src1_da_reg_nr(devinfo, inst) == BRW_ARF_NULL; > >> } > >> > >> +static bool > >> +dst_is_accumulator(const struct brw_device_info *devinfo, const brw_inst > >> *inst) > >> +{ > >> + return brw_inst_dst_reg_file(devinfo, inst) == > >> BRW_ARCHITECTURE_REGISTER_FILE && > >> + brw_inst_dst_da_reg_nr(devinfo, inst) == BRW_ARF_ACCUMULATOR; > >> +} > >> + > >> +static bool > >> +src0_is_accumulator(const struct brw_device_info *devinfo, const brw_inst > >> *inst) > >> +{ > >> + return brw_inst_src0_reg_file(devinfo, inst) == > >> BRW_ARCHITECTURE_REGISTER_FILE && > >> + brw_inst_src0_da_reg_nr(devinfo, inst) == BRW_ARF_ACCUMULATOR; > >> +} > >> + > >> +static bool > >> +src1_is_accumulator(const struct brw_device_info *devinfo, const brw_inst > >> *inst) > >> +{ > >> + return brw_inst_src1_reg_file(devinfo, inst) == > >> BRW_ARCHITECTURE_REGISTER_FILE && > >> + brw_inst_src1_da_reg_nr(devinfo, inst) == BRW_ARF_ACCUMULATOR; > >> +} > >> + > >> +static bool > >> +is_integer(enum brw_reg_type type) > >> +{ > >> + return type == BRW_REGISTER_TYPE_UD || > >> + type == BRW_REGISTER_TYPE_D || > >> + type == BRW_REGISTER_TYPE_UW || > >> + type == BRW_REGISTER_TYPE_W || > >> + type == BRW_REGISTER_TYPE_UB || > >> + type == BRW_REGISTER_TYPE_B || > >> + type == BRW_REGISTER_TYPE_V || > >> + type == BRW_REGISTER_TYPE_UV || > >> + type == BRW_REGISTER_TYPE_UQ || > >> + type == BRW_REGISTER_TYPE_Q; > >> +} > >> + > >> enum gen { > >> GEN4 = (1 << 0), > >> GEN45 = (1 << 1), > >> @@ -83,40 +129,66 @@ enum gen { > >> #define GEN_GE(gen) (~((gen) - 1) | gen) > >> #define GEN_LE(gen) (((gen) - 1) | gen) > >> > >> +enum acc { > >> + ACC_NO_RESTRICTIONS = 0, > >> + ACC_GEN_DEPENDENT = (1 << 0), > >> + ACC_NO_EXPLICIT_SOURCE = (1 << 1), > >> + ACC_NO_EXPLICIT_DESTINATION = (1 << 2), > >> + ACC_NO_IMPLICIT_DESTINATION = (1 << 3), > >> + ACC_NO_DESTINATION = ACC_NO_EXPLICIT_DESTINATION | > >> +ACC_NO_IMPLICIT_DESTINATION, > >> + ACC_NO_ACCESS = ACC_NO_EXPLICIT_SOURCE | > >> + ACC_NO_DESTINATION, > >> + ACC_NO_SOURCE_MODIFIER = (1 << 4), > >> + ACC_NO_INTEGER_SOURCE = (1 << 5), > >> + ACC_IMPLICIT_WRITE_REQUIRED = (1 << 6), > >> + ACC_NOT_BOTH_SOURCE_AND_DESTINATION = (1 << 7), > >> +}; > >> + > >> struct inst_info { > >> enum gen gen; > >> + enum acc acc; > >> }; > >> > >> static const struct inst_info inst_info[128] = { > >> [BRW_OPCODE_ILLEGAL] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NO_ACCESS, > >> }, > >> [BRW_OPCODE_MOV] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NOT_BOTH_SOURCE_AND_DESTINATION, > >> }, > >> [BRW_OPCODE_SEL] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_GEN_DEPENDENT, > >> }, > >> [BRW_OPCODE_MOVI] = { > >>.gen = GEN_GE(GEN45), > >> + .acc = ACC_NO_EXPLICIT_SOURCE, > >> }, > >> [BRW_OPCODE_NOT] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NO_SOURCE_MODIFIER, > >> }, > >> [BRW_OPCODE_AND] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NO_SOURCE_MODIFIER, > >> }, > >> [BRW_OPCODE_OR] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NO_SOURCE_MODIFIER, > >> }, > >> [BRW_OPCODE_XOR] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NO_SOURCE_MODIFIER, > >> }, > >> [BRW_OPCODE_SHR] = { > >>.gen = GEN_ALL, > >> }, > >> [BRW_OPCODE_SHL] = { > >>.gen = GEN_ALL, > >> + .acc = ACC_NO_DESTINATION, > >> }, > >> /* BRW_OPCODE_DIM / BRW_OPCODE_SMOV */ > >> /* Reserved - 11 */ > >> @@ -126,63 +198,81 @@ static const struct i
[Mesa-dev] [PATCH v2 1/3] i965: Introduce a MOV_INDIRECT opcode.
The geometry and tessellation control shader stages both read from multiple URB entries (one per vertex). The thread payload contains several URB handles which reference these separate memory segments. In GLSL, these inputs are represented as per-vertex arrays; the outermost array index selects which vertex's inputs to read. This array index does not necessarily need to be constant. To handle that, we need to use indirect addressing on GRFs to select which of the thread payload registers has the appropriate URB handle. (This is before we can even think about applying the pull model!) This patch introduces a new opcode which performs a MOV from a source using VxH indirect addressing (which allows each of the 8 SIMD channels to select distinct data.) Based on a patch by Jason Ekstrand. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_defines.h| 9 +++ src/mesa/drivers/dri/i965/brw_fs.cpp | 24 ++ src/mesa/drivers/dri/i965/brw_fs.h | 5 src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 1 + src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 34 ++ src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++ 6 files changed, 75 insertions(+) Jason, I omitted a few things from your patch: - UNIFORM handling in regs_read() - didn't want to ship it without testing it So you'll need to add this back in when you start using it. - Gen7 base_offset munging - on IRC you suggested dropping imm_offset, so it didn't appear to actually do anything. Feel free to put it back in if you need it, or switch your code to not use base_offset. But we're using regs_read() now rather than RA hackery, and using a register with a HW_REG file instead of an IMM offset, so that should be a lot more reusable. diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 99a3a2d..6e1cfed 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1268,6 +1268,15 @@ enum opcode { * Calculate the high 32-bits of a 32x32 multiply. */ SHADER_OPCODE_MULH, + + /** +* A MOV that uses VxH indirect addressing. +* +* Source 0: A register to start from (HW_REG). +* Source 1: An indirect offset (in bytes, UD GRF). +* Source 2: The maximum value of the indirect offset (in bytes, UD IMM). +*/ + SHADER_OPCODE_MOV_INDIRECT, }; enum brw_urb_write_flags { diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index be712e5..bf8a4a6 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -841,6 +841,30 @@ fs_inst::regs_read(int arg) const case SHADER_OPCODE_BARRIER: return 1; + case SHADER_OPCODE_MOV_INDIRECT: + if (arg == 0) { + assert(src[2].file == IMM); + unsigned max_indirect = src[2].fixed_hw_reg.dw1.ud; + + if (src[0].file == HW_REG) { +assert(src[0].file == HW_REG); + +unsigned regs = DIV_ROUND_UP(max_indirect, REG_SIZE); + +/* If the start of the region is not register aligned, then + * we only read part of the register at the beginning, but + * overlap into another register at the end. So add 1. + */ +if (src[0].fixed_hw_reg.subnr) + regs++; + +return regs; + } else { +assert(!"Invalid register file"); + } + } + break; + default: if (is_tex() && arg == 0 && src[0].file == GRF) return mlen; diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 8a93b56..c599cde 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -526,6 +526,11 @@ private: struct brw_reg offset, struct brw_reg value); + void generate_mov_indirect(fs_inst *inst, + struct brw_reg dst, + struct brw_reg reg, + struct brw_reg indirect_byte_offset); + bool patch_discard_jumps_to_fb_writes(); const struct brw_compiler *compiler; diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp index 3a28c8d..ffb5f87 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp @@ -78,6 +78,7 @@ is_expression(const fs_visitor *v, const fs_inst *const inst) case FS_OPCODE_LINTERP: case SHADER_OPCODE_FIND_LIVE_CHANNEL: case SHADER_OPCODE_BROADCAST: + case SHADER_OPCODE_MOV_INDIRECT: return true; case SHADER_OPCODE_RCP: case SHADER_OPCODE_RSQ: diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 974219f..97a85bb 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cp
[Mesa-dev] [PATCH v2 3/3] i965: Allow indirect GS input indexing in the scalar backend.
This allows arbitrary non-constant indices on GS input arrays, both for the vertex index, and any array offsets beyond that. All indirects are handled via the pull model. We could potentially handle indirect addressing of pushed data as well, but it would add additional code complexity, and we usually have to pull inputs anyway due to the sheer volume of input data. Plus, marking pushed inputs as live due to indirect addressing could exacerbate register pressure problems pretty badly. We'd need to be careful. v2: Use updated MOV_INDIRECT opcode. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs.cpp | 17 src/mesa/drivers/dri/i965/brw_fs.h | 3 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 130 --- src/mesa/drivers/dri/i965/brw_shader.cpp | 3 + 4 files changed, 107 insertions(+), 46 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 9396cf2..730e837 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1684,24 +1684,7 @@ fs_visitor::assign_gs_urb_setup() first_non_payload_grf += 8 * vue_prog_data->urb_read_length * nir->info.gs.vertices_in; - const unsigned first_icp_handle = payload.num_regs - - (vue_prog_data->include_vue_handles ? nir->info.gs.vertices_in : 0); - foreach_block_and_inst(block, fs_inst, inst, cfg) { - /* Lower URB_READ_SIMD8 opcodes into real messages. */ - if (inst->opcode == SHADER_OPCODE_URB_READ_SIMD8) { - assert(inst->src[0].file == IMM); - inst->src[0] = retype(brw_vec8_grf(first_icp_handle + -inst->src[0].fixed_hw_reg.dw1.ud, -0), BRW_REGISTER_TYPE_UD); - /* for now, assume constant - we can do per-slot offsets later */ - assert(inst->src[1].file == IMM); - inst->offset = inst->src[1].fixed_hw_reg.dw1.ud; - inst->src[1] = fs_reg(); - inst->mlen = 1; - inst->base_mrf = -1; - } - /* Rewrite all ATTR file references to HW_REGs. */ convert_attr_sources_to_hw_regs(inst); } diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index c599cde..613aa89 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -302,7 +302,8 @@ public: unsigned stream_id); void emit_gs_thread_end(); void emit_gs_input_load(const fs_reg &dst, const nir_src &vertex_src, - unsigned offset, unsigned num_components); + const fs_reg &indirect_offset, unsigned imm_offset, + unsigned num_components); void emit_cs_terminate(); fs_reg *emit_cs_local_invocation_id_setup(); fs_reg *emit_cs_work_group_id_setup(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 52d5ad1..79675a3 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1543,42 +1543,113 @@ fs_visitor::emit_gs_vertex(const nir_src &vertex_count_nir_src, void fs_visitor::emit_gs_input_load(const fs_reg &dst, const nir_src &vertex_src, - unsigned input_offset, + const fs_reg &indirect_offset, + unsigned imm_offset, unsigned num_components) { - const brw_vue_prog_data *vue_prog_data = (const brw_vue_prog_data *) prog_data; - const unsigned vertex = nir_src_as_const_value(vertex_src)->u[0]; + struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) prog_data; - const unsigned array_stride = vue_prog_data->urb_read_length * 8; + /* Offset 0 is the VUE header, which contains VARYING_SLOT_LAYER [.y], +* VARYING_SLOT_VIEWPORT [.z], and VARYING_SLOT_PSIZ [.w]. Only +* gl_PointSize is available as a GS input, however, so it must be that. +*/ + const bool is_point_size = + indirect_offset.file == BAD_FILE && imm_offset == 0; + + nir_const_value *vertex_const = nir_src_as_const_value(vertex_src); + const unsigned push_reg_count = gs_prog_data->base.urb_read_length * 8; + + if (indirect_offset.file == BAD_FILE && vertex_const != NULL && + 4 * imm_offset < push_reg_count) { + imm_offset = 4 * imm_offset + vertex_const->u[0] * push_reg_count; + /* This input was pushed into registers. */ + if (is_point_size) { + /* gl_PointSize comes in .w */ + bld.MOV(dst, fs_reg(ATTR, imm_offset + 3, dst.type)); + } else { + for (unsigned i = 0; i < num_components; i++) { +bld.MOV(offset(dst, bld, i), +fs_reg(ATTR, imm_offset + i, dst.type)); + } + } + } else { + /* Resort to the pull model. Ensure the VUE handles are
[Mesa-dev] [PATCH v2 2/3] i965: Add a SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT opcode.
We need to use per-slot offsets when there's non-uniform indexing, as each SIMD channel could have a different index. We want to use them for any non-constant index (even if uniform), as it lives in the message header instead of the descriptor, allowing us to set offsets in GRFs rather than immediates. Signed-off-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_defines.h| 7 ++- src/mesa/drivers/dri/i965/brw_fs.cpp | 2 ++ src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 src/mesa/drivers/dri/i965/brw_shader.cpp | 2 ++ 4 files changed, 10 insertions(+), 5 deletions(-) Unchanged from v1. diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 6e1cfed..176da83 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1031,13 +1031,10 @@ enum opcode { SHADER_OPCODE_GEN7_SCRATCH_READ, /** -* Gen8+ SIMD8 URB Read message. -* -* Source 0: The header register, containing URB handles (g1). -* -* Currently only supports constant offsets, in inst->offset. +* Gen8+ SIMD8 URB Read messages. */ SHADER_OPCODE_URB_READ_SIMD8, + SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, SHADER_OPCODE_URB_WRITE_SIMD8, SHADER_OPCODE_URB_WRITE_SIMD8_PER_SLOT, diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index bf8a4a6..9396cf2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -284,6 +284,7 @@ fs_inst::is_send_from_grf() const case SHADER_OPCODE_URB_WRITE_SIMD8_MASKED: case SHADER_OPCODE_URB_WRITE_SIMD8_MASKED_PER_SLOT: case SHADER_OPCODE_URB_READ_SIMD8: + case SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT: return true; case FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD: return src[1].file == GRF; @@ -810,6 +811,7 @@ fs_inst::regs_read(int arg) const case SHADER_OPCODE_URB_WRITE_SIMD8_MASKED: case SHADER_OPCODE_URB_WRITE_SIMD8_MASKED_PER_SLOT: case SHADER_OPCODE_URB_READ_SIMD8: + case SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT: case SHADER_OPCODE_UNTYPED_ATOMIC: case SHADER_OPCODE_UNTYPED_SURFACE_READ: case SHADER_OPCODE_UNTYPED_SURFACE_WRITE: diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 97a85bb..48e350f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -413,6 +413,9 @@ fs_generator::generate_urb_read(fs_inst *inst, brw_inst_set_sfid(p->devinfo, send, BRW_SFID_URB); brw_inst_set_urb_opcode(p->devinfo, send, GEN8_URB_OPCODE_SIMD8_READ); + if (inst->opcode == SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT) + brw_inst_set_urb_per_slot_offset(p->devinfo, send, true); + brw_inst_set_mlen(p->devinfo, send, inst->mlen); brw_inst_set_rlen(p->devinfo, send, inst->regs_written); brw_inst_set_header_present(p->devinfo, send, true); @@ -2105,6 +2108,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width) break; case SHADER_OPCODE_URB_READ_SIMD8: + case SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT: generate_urb_read(inst, dst, src[0]); break; diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp index 9550a62..e64adb1 100644 --- a/src/mesa/drivers/dri/i965/brw_shader.cpp +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp @@ -429,6 +429,8 @@ brw_instruction_name(enum opcode op) return "gen8_urb_write_simd8_masked_per_slot"; case SHADER_OPCODE_URB_READ_SIMD8: return "urb_read_simd8"; + case SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT: + return "urb_read_simd8_per_slot"; case SHADER_OPCODE_FIND_LIVE_CHANNEL: return "find_live_channel"; -- 2.6.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/11/2015 11:38 AM, Emil Velikov wrote: On 11 November 2015 at 18:25, Thomas Hellstrom wrote: On 11/11/2015 07:07 PM, Brian Paul wrote: On 11/11/2015 10:44 AM, Emil Velikov wrote: On 11 November 2015 at 16:48, Brian Paul wrote: On 11/11/2015 08:44 AM, Emil Velikov wrote: I have seen similar type of documents in the past, most of which going out of date very quickly due to distribution changes and/or others. Wondering how you'll feel about "check your distro and add svga to the gallium-drivers array" style of instructions ? I'm afraid I don't quite understand what you're saying there. Can you elaborate? Rather than walking through the requirements, configure and make/make install steps, just forward people to the distro specific wiki on "how to build mesa/kernel" and explicitly mention the differences: mesa: - XA must be enabled: --enable-xa - svga must be listed in the gallium drivers: --with-gallium-drivers=svga... kernel: - Set DRM_VMWGFX others... I guess I've never seen those wikis. I'd have to search for them, but I really don't have the time now. We actually have an in-house shell script that installs all the pre-req packages, pulls the git trees, builds and installs for a variety of guest OSes. But it has some VMware-specific stuff that I'd have to trim out before making public. Related: does the upstream [1] vmwgfx module work well when combined with upstream core drm across different versions ? Considering how well Thomas is handling upstreaming shouldn't the module from the kernel be recommended ? Either should be fine at this point but the build instructions cover the case of one having an older distro that may not have the GL3-enabled kernel module already. The upstream[1] vmwgfx module should work well with any linux kernel dating back to 2.6.32 unless the distro has changed the kernel API from the base version. It ships with builtin stripped drm and ttm to handle compatibility issues, and is intended for people (mostly including ourselves and our QA team) that want to try out new features without installing a completely new kernel. Ok seems that my point is too subtle, so I'll try from another angle. The wiki instructions say "nuke he vmwgfx.ko module" and implicitly "keep drm.ko". If we ignore the unlikely cases where either one and/or both is built-in, we can have a case where the new vmwgfx is build against core drm from the upstream, yet the downstream drm module is/gets loaded. As core drm often goes through various changes, you can see how bad things are likely to happen. Well, the above-mentioned build script doesn't touch drm.ko and works on about 14 different versions of Ubuntu, Mint, Fedora, RHEL, etc. so I don't think we've ever seen that conflict. But if someone's doing their own kernel/graphics builds/installs, who knows. If it comes up, we'll just have to address it. TL;DR: If using vmwgfx.ko from upstream one should also pick drm.ko ? I haven't done so. Cheers, Emil Note: I've not experienced this, although I had the pleasure of dealing with similar issues. Props to your colleague who updated the start up scripts to honour the existing vmmon & co modules. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 92860] [radeonsi][bisected] st/mesa: implement ARB_copy_image - Corruption in ARK Survival Evolved
https://bugs.freedesktop.org/show_bug.cgi?id=92860 Ilia Mirkin changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from Ilia Mirkin --- Fix pushed. commit 912babba7bf1abd3caa49f6372d581ae1afe7e84 Author: Ilia Mirkin Date: Sun Nov 8 04:46:38 2015 -0500 mesa/copyimage: allow width/height to not be multiples of block -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11 November 2015 at 18:25, Thomas Hellstrom wrote: > On 11/11/2015 07:07 PM, Brian Paul wrote: >> On 11/11/2015 10:44 AM, Emil Velikov wrote: >>> On 11 November 2015 at 16:48, Brian Paul wrote: On 11/11/2015 08:44 AM, Emil Velikov wrote: >>> > > I have seen similar type of documents in the past, most of which going > out of date very quickly due to distribution changes and/or others. > Wondering how you'll feel about "check your distro and add svga to the > gallium-drivers array" style of instructions ? I'm afraid I don't quite understand what you're saying there. Can you elaborate? >>> Rather than walking through the requirements, configure and make/make >>> install steps, just forward people to the distro specific wiki on "how >>> to build mesa/kernel" and explicitly mention the differences: >>> mesa: >>> - XA must be enabled: --enable-xa >>> - svga must be listed in the gallium drivers: >>> --with-gallium-drivers=svga... >>> >>> kernel: >>> - Set DRM_VMWGFX >>> >>> others... >> >> I guess I've never seen those wikis. I'd have to search for them, but >> I really don't have the time now. >> >> We actually have an in-house shell script that installs all the >> pre-req packages, pulls the git trees, builds and installs for a >> variety of guest OSes. But it has some VMware-specific stuff that I'd >> have to trim out before making public. >> >> >>> >>> Related: does the upstream [1] vmwgfx module work well when combined >>> with upstream core drm across different versions ? Considering how >>> well Thomas is handling upstreaming shouldn't the module from the >>> kernel be recommended ? >> >> Either should be fine at this point but the build instructions cover >> the case of one having an older distro that may not have the >> GL3-enabled kernel module already. >> > > The upstream[1] vmwgfx module should work well with any linux kernel > dating back to 2.6.32 unless the distro has changed the kernel API from > the base version. It ships with builtin stripped drm and ttm to handle > compatibility issues, and is intended for people (mostly including > ourselves and our QA team) that want to try out new features without > installing a completely new kernel. > Ok seems that my point is too subtle, so I'll try from another angle. The wiki instructions say "nuke he vmwgfx.ko module" and implicitly "keep drm.ko". If we ignore the unlikely cases where either one and/or both is built-in, we can have a case where the new vmwgfx is build against core drm from the upstream, yet the downstream drm module is/gets loaded. As core drm often goes through various changes, you can see how bad things are likely to happen. TL;DR: If using vmwgfx.ko from upstream one should also pick drm.ko ? Cheers, Emil Note: I've not experienced this, although I had the pleasure of dealing with similar issues. Props to your colleague who updated the start up scripts to honour the existing vmmon & co modules. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/11/2015 07:07 PM, Brian Paul wrote: > On 11/11/2015 10:44 AM, Emil Velikov wrote: >> On 11 November 2015 at 16:48, Brian Paul wrote: >>> On 11/11/2015 08:44 AM, Emil Velikov wrote: >> I have seen similar type of documents in the past, most of which going out of date very quickly due to distribution changes and/or others. Wondering how you'll feel about "check your distro and add svga to the gallium-drivers array" style of instructions ? >>> >>> >>> I'm afraid I don't quite understand what you're saying there. Can you >>> elaborate? >>> >>> >> Rather than walking through the requirements, configure and make/make >> install steps, just forward people to the distro specific wiki on "how >> to build mesa/kernel" and explicitly mention the differences: >> mesa: >> - XA must be enabled: --enable-xa >> - svga must be listed in the gallium drivers: >> --with-gallium-drivers=svga... >> >> kernel: >> - Set DRM_VMWGFX >> >> others... > > I guess I've never seen those wikis. I'd have to search for them, but > I really don't have the time now. > > We actually have an in-house shell script that installs all the > pre-req packages, pulls the git trees, builds and installs for a > variety of guest OSes. But it has some VMware-specific stuff that I'd > have to trim out before making public. > > >> >> Related: does the upstream [1] vmwgfx module work well when combined >> with upstream core drm across different versions ? Considering how >> well Thomas is handling upstreaming shouldn't the module from the >> kernel be recommended ? > > Either should be fine at this point but the build instructions cover > the case of one having an older distro that may not have the > GL3-enabled kernel module already. > The upstream[1] vmwgfx module should work well with any linux kernel dating back to 2.6.32 unless the distro has changed the kernel API from the base version. It ships with builtin stripped drm and ttm to handle compatibility issues, and is intended for people (mostly including ourselves and our QA team) that want to try out new features without installing a completely new kernel. /Thomas ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/11/2015 10:44 AM, Emil Velikov wrote: On 11 November 2015 at 16:48, Brian Paul wrote: On 11/11/2015 08:44 AM, Emil Velikov wrote: I have seen similar type of documents in the past, most of which going out of date very quickly due to distribution changes and/or others. Wondering how you'll feel about "check your distro and add svga to the gallium-drivers array" style of instructions ? I'm afraid I don't quite understand what you're saying there. Can you elaborate? Rather than walking through the requirements, configure and make/make install steps, just forward people to the distro specific wiki on "how to build mesa/kernel" and explicitly mention the differences: mesa: - XA must be enabled: --enable-xa - svga must be listed in the gallium drivers: --with-gallium-drivers=svga... kernel: - Set DRM_VMWGFX others... I guess I've never seen those wikis. I'd have to search for them, but I really don't have the time now. We actually have an in-house shell script that installs all the pre-req packages, pulls the git trees, builds and installs for a variety of guest OSes. But it has some VMware-specific stuff that I'd have to trim out before making public. Related: does the upstream [1] vmwgfx module work well when combined with upstream core drm across different versions ? Considering how well Thomas is handling upstreaming shouldn't the module from the kernel be recommended ? Either should be fine at this point but the build instructions cover the case of one having an older distro that may not have the GL3-enabled kernel module already. For example some of us had nasty experiences where versions of vmware player/workstation ships/builds/uses kernel modules which "clash" with the ones already bundled in the kernel package. With "clash" - there is no guarantee whether the upstream or downstream module gets loaded, and due difference in the symbols provided one does encounter "function_foo() error Invalid argument" type of messages, and ultimately things just not working. I don't think I've ever had much trouble with that. The host-side Linux kernel modules aren't really my area so I can't say much about that. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/09/2015 09:24 PM, Valera Rozuvan wrote: On Tue, Nov 10, 2015 at 4:13 AM, Brian Paul wrote: After running depmod, you probably need to update the initramfs with: 'sudo update-initramfs -u' -Brian Hi Brian. First of all, thank you for your reply. I have tried your suggestion on my working setup, and also doing everything again from scratch. Basically, I get the same result - everything goes smoothly, but in the end OpenGL is version 2.1. I even tried to run a simple OpenGL 3.x program, and it crashes (simple OpenGL 2.1 program runs fine). Can you please spend an hour and try the instructions from https://urldefense.proofpoint.com/v2/url?u=http-3A__www.mesa3d.org_vmware-2Dguest.html&d=BQIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=f2dfCFlDZuGGllk3HIqWXUyWbl6sPO1sA7SlX-79UG8&s=emo24XVRAaQOlitBEbGNBjqB348c25cFStvxj413YcU&e= on a clean VMware Workstation 12 Player for Windows 64-bit (host), and Ubuntu 15.10 (guest)? It would be very awesome if you can get OpenGL 3.x running in VMware 12, and update the instructions. I am sure there is some critical piece missing. Thank you! = ) I went through the instructions again and updated the Mesa configure command: cd $TOP/mesa ./autogen.sh --prefix=/usr --libdir=${LIBDIR} --with-gallium-drivers=svga --with-dri-drivers=swrast --enable-xa --disable-dri3 --enable-glx-tls Things worked for me here. Can you send me your vmware.log file from the VM (off-list)? Maybe something's wrong host-side. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11 November 2015 at 16:48, Brian Paul wrote: > On 11/11/2015 08:44 AM, Emil Velikov wrote: >> >> I have seen similar type of documents in the past, most of which going >> out of date very quickly due to distribution changes and/or others. >> Wondering how you'll feel about "check your distro and add svga to the >> gallium-drivers array" style of instructions ? > > > I'm afraid I don't quite understand what you're saying there. Can you > elaborate? > > Rather than walking through the requirements, configure and make/make install steps, just forward people to the distro specific wiki on "how to build mesa/kernel" and explicitly mention the differences: mesa: - XA must be enabled: --enable-xa - svga must be listed in the gallium drivers: --with-gallium-drivers=svga... kernel: - Set DRM_VMWGFX others... Related: does the upstream [1] vmwgfx module work well when combined with upstream core drm across different versions ? Considering how well Thomas is handling upstreaming shouldn't the module from the kernel be recommended ? For example some of us had nasty experiences where versions of vmware player/workstation ships/builds/uses kernel modules which "clash" with the ones already bundled in the kernel package. With "clash" - there is no guarantee whether the upstream or downstream module gets loaded, and due difference in the symbols provided one does encounter "function_foo() error Invalid argument" type of messages, and ultimately things just not working. Thanks Emil [1] http://cgit.freedesktop.org/mesa/vmwgfx/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/11/2015 09:58 AM, Ilia Mirkin wrote: On Wed, Nov 11, 2015 at 11:48 AM, Brian Paul wrote: I think there is a hunk missing about --enable-texture-float for ARB_texture_float (and ultimately GL 3.0). N/A; that option defines the TEXTURE_FLOAT_ENABLED symbol which is only tested in _mesa_enable_sw_extensions(). boolean util_format_is_supported(enum pipe_format format, unsigned bind) { if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) { return FALSE; } #ifndef TEXTURE_FLOAT_ENABLED if ((bind & PIPE_BIND_RENDER_TARGET) && format != PIPE_FORMAT_R9G9B9E5_FLOAT && format != PIPE_FORMAT_R11G11B10_FLOAT && util_format_is_float(format)) { return FALSE; } #endif return TRUE; } Oops, I git-grep'd only src/mesa/. But in any case, we don't use util_format_is_supported() in the VMware driver and --enable-texture-float is not needed to get GL 3.3. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On Wed, Nov 11, 2015 at 11:58 AM, Ilia Mirkin wrote: > On Wed, Nov 11, 2015 at 11:48 AM, Brian Paul wrote: >>> I think there is a hunk missing about --enable-texture-float for >>> ARB_texture_float (and ultimately GL 3.0). >> >> >> N/A; that option defines the TEXTURE_FLOAT_ENABLED symbol which is only >> tested in _mesa_enable_sw_extensions(). > > boolean > util_format_is_supported(enum pipe_format format, unsigned bind) > { >if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) { > return FALSE; >} > > #ifndef TEXTURE_FLOAT_ENABLED >if ((bind & PIPE_BIND_RENDER_TARGET) && >format != PIPE_FORMAT_R9G9B9E5_FLOAT && >format != PIPE_FORMAT_R11G11B10_FLOAT && >util_format_is_float(format)) { > return FALSE; >} > #endif > >return TRUE; > } Which is, amusingly, only used by the various hw drivers: src/gallium/drivers/freedreno/a2xx/fd2_screen.c: !util_format_is_supported(format, usage)) { src/gallium/drivers/freedreno/a3xx/fd3_screen.c: !util_format_is_supported(format, usage)) { src/gallium/drivers/freedreno/a4xx/fd4_screen.c: !util_format_is_supported(format, usage)) { src/gallium/drivers/i915/i915_screen.c: if (!util_format_is_supported(format, tex_usage)) src/gallium/drivers/ilo/ilo_screen.c: if (!util_format_is_supported(format, bindings)) src/gallium/drivers/nouveau/nv30/nv30_screen.c: if (!util_format_is_supported(format, bindings)) { src/gallium/drivers/nouveau/nv50/nv50_screen.c: if (!util_format_is_supported(format, bindings)) src/gallium/drivers/nouveau/nvc0/nvc0_screen.c: if (!util_format_is_supported(format, bindings)) src/gallium/drivers/r300/r300_screen.c:if (!util_format_is_supported(format, usage)) src/gallium/drivers/r600/evergreen_state.c: if (!util_format_is_supported(format, usage)) src/gallium/drivers/r600/r600_state.c: if (!util_format_is_supported(format, usage)) src/gallium/drivers/radeonsi/si_state.c:if (!util_format_is_supported(format, usage)) src/gallium/drivers/vc4/vc4_screen.c: !util_format_is_supported(format, usage)) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On Wed, Nov 11, 2015 at 11:48 AM, Brian Paul wrote: >> I think there is a hunk missing about --enable-texture-float for >> ARB_texture_float (and ultimately GL 3.0). > > > N/A; that option defines the TEXTURE_FLOAT_ENABLED symbol which is only > tested in _mesa_enable_sw_extensions(). boolean util_format_is_supported(enum pipe_format format, unsigned bind) { if (util_format_is_s3tc(format) && !util_format_s3tc_enabled) { return FALSE; } #ifndef TEXTURE_FLOAT_ENABLED if ((bind & PIPE_BIND_RENDER_TARGET) && format != PIPE_FORMAT_R9G9B9E5_FLOAT && format != PIPE_FORMAT_R11G11B10_FLOAT && util_format_is_float(format)) { return FALSE; } #endif return TRUE; } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
On 11/11/2015 08:44 AM, Emil Velikov wrote: Hi Brian, On 10 November 2015 at 16:48, Brian Paul wrote: On 11/09/2015 09:24 PM, Valera Rozuvan wrote: On Tue, Nov 10, 2015 at 4:13 AM, Brian Paul wrote: After running depmod, you probably need to update the initramfs with: 'sudo update-initramfs -u' -Brian Hi Brian. First of all, thank you for your reply. I have tried your suggestion on my working setup, and also doing everything again from scratch. Basically, I get the same result - everything goes smoothly, but in the end OpenGL is version 2.1. I even tried to run a simple OpenGL 3.x program, and it crashes (simple OpenGL 2.1 program runs fine). Can you please spend an hour and try the instructions from https://urldefense.proofpoint.com/v2/url?u=http-3A__www.mesa3d.org_vmware-2Dguest.html&d=BQIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=f2dfCFlDZuGGllk3HIqWXUyWbl6sPO1sA7SlX-79UG8&s=emo24XVRAaQOlitBEbGNBjqB348c25cFStvxj413YcU&e= on a clean VMware Workstation 12 Player for Windows 64-bit (host), and Ubuntu 15.10 (guest)? It would be very awesome if you can get OpenGL 3.x running in VMware 12, and update the instructions. I am sure there is some critical piece missing. Thank you! = ) I've reviewed and updated the instructions at https://urldefense.proofpoint.com/v2/url?u=http-3A__mesa3d.org_vmware-2Dguest.html&d=BQIFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=8UjWNV8TUt02VRSItAMguiKXASEh2kzlITp9wZQKoBU&s=sf309mmdWTsjAQi6L5RoKDbJP4xmIpTkAu8KmEa0kLo&e= In particular, the --libdir option is different for Ubuntu and I've updated some info about installing the kernel module. Let me know how that goes. I think there is a hunk missing about --enable-texture-float for ARB_texture_float (and ultimately GL 3.0). N/A; that option defines the TEXTURE_FLOAT_ENABLED symbol which is only tested in _mesa_enable_sw_extensions(). I have seen similar type of documents in the past, most of which going out of date very quickly due to distribution changes and/or others. Wondering how you'll feel about "check your distro and add svga to the gallium-drivers array" style of instructions ? I'm afraid I don't quite understand what you're saying there. Can you elaborate? I'm not volunteering or suggesting that one has to rewrite them, mostly curious. Can you please sync the changes between mesa3d.org and the git repo. Assuming only vmware-guest.html differs yet I've haven't checked the rest bth. I put the tentative updates on the website so that Valera could access them easily and test. Once they're confirmed to work, I'll commit the changes into the git tree. I figured that'd be easier for Valera to use than a patch against the html code. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE
On Wed, Nov 11, 2015 at 7:07 AM, Samuel Iglesias Gonsálvez wrote: > FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message. > > This patch adjusts the number of registers written by the opcode > following what the PRM spec says about the number of registers written > by the SIMD8 and SIMD16's writeback messages for sampler messages. Thanks for catching this! I've fixed this up a couple of other places recently. > Signed-off-by: Samuel Iglesias Gonsálvez > Cc: tapani.pa...@intel.com > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 17 +++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index 02b9f5b..61c9f2e 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -2297,16 +2297,29 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, > nir_intrinsic_instr *instr >fs_reg source = fs_reg(0); > >int mlen = 1 * reg_width; > + > + /* A resinfo's sampler message is used to get the buffer size. > + * The SIMD8's writeback message consists of four registers and > + * SIMD16's writeback message consists of 8 destination registers > + * (two per each component), although we are only interested on the > + * first component, where resinfo returns the buffer size for > + * SURFTYPE_BUFFER. > + */ > + int regs_written = 4 * mlen; >fs_reg src_payload = fs_reg(GRF, alloc.allocate(mlen), >BRW_REGISTER_TYPE_UD); >bld.LOAD_PAYLOAD(src_payload, &source, 1, 0); > - > + fs_reg buffer_size = fs_reg(GRF, alloc.allocate(regs_written), > + BRW_REGISTER_TYPE_UD); >const unsigned index = prog_data->binding_table.ssbo_start + > ssbo_index; > - fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, dest, > + fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, buffer_size, > src_payload, fs_reg(index)); >inst->header_size = 0; >inst->mlen = mlen; > + inst->regs_written = regs_written; >bld.emit(inst); > + dest.type = buffer_size.type; > + bld.MOV(dest, buffer_size); You could just do "bld.MOV(retype(dest, buffer_size.type), buffer_size)" and save a line. Other than that, Reviewed-by: Jason Ekstrand > >brw_mark_surface_used(prog_data, index); >break; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers
On Wed, Nov 11, 2015 at 12:20 AM, Iago Toral Quiroga wrote: > If a source operand in a MOV has source modifiers, then we cannot > copy-propagate it from the parent instruction and remove the MOV. > --- > > I noticed this while debugging some regressions introduced with the fp64 > code. Basically, I had code similar to this: > > vec4 ssa1 = intrincisc1 (...) (...) > vec2 ssa2 = vec2 ssa1 -ssa1.y > vec4 ssa3 = intrinsic2 (ssa2) (...) > > that would be turned into this by copy propagation: > > vec4 ssa1 = some intrincisc > vec4 ssa2 = some intrinsic (ssa1) > > which is obviously not correct. This was happening because > is_swizzleless_move checked that the MOV/vecN operation that generates > the value we want to copy-propagate does not incorporate swizzling, but > it ignored the case where it also added source modifiers, in which case > we can't copy-propagate either. > > Of course, now that we have made vecN operations unsigned again, that example > can't happen because lower_to_source_mods won't produce things like that, but > I figured that the patch would still make sense, since it fixes a case where > copy-propagation won't work as intended, even if we are not currently > triggering > it (at least not with vecN operations). > > src/glsl/nir/nir_opt_copy_propagate.c | 10 -- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/src/glsl/nir/nir_opt_copy_propagate.c > b/src/glsl/nir/nir_opt_copy_propagate.c > index 7d8bdd7..7caa4b7 100644 > --- a/src/glsl/nir/nir_opt_copy_propagate.c > +++ b/src/glsl/nir/nir_opt_copy_propagate.c > @@ -65,9 +65,12 @@ static bool is_vec(nir_alu_instr *instr) > } > > static bool > -is_swizzleless_move(nir_alu_instr *instr) > +is_simple_move(nir_alu_instr *instr) > { > if (is_move(instr)) { > + if (instr->src[0].negate || instr->src[0].abs) > + return false; We already do this in is_move() but it might be best to move it here... > + >for (unsigned i = 0; i < 4; i++) { > if (!((instr->dest.write_mask >> i) & 1)) > break; > @@ -81,6 +84,9 @@ is_swizzleless_move(nir_alu_instr *instr) > if (instr->src[i].swizzle[0] != i) > return false; > > + if (instr->src[i].negate || instr->src[i].abs) > +return false; We should either move the one in is_move or we should put this into is_vec. I think I'd be more of a fan of the former. --Jason > + > if (def == NULL) { > def = instr->src[i].src.ssa; > } else if (instr->src[i].src.ssa != def) { > @@ -107,7 +113,7 @@ copy_prop_src(nir_src *src, nir_instr *parent_instr, > nir_if *parent_if) >return false; > > nir_alu_instr *alu_instr = nir_instr_as_alu(src_instr); > - if (!is_swizzleless_move(alu_instr)) > + if (!is_simple_move(alu_instr)) >return false; > > /* Don't let copy propagation land us with a phi that has more > -- > 1.9.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] texgetimage: consolidate 1D array handling code.
Just two minor nits below. On 11/10/2015 07:34 PM, Dave Airlie wrote: From: Dave Airlie This should fix the getteximage-depth test that currently asserts. I was hitting problem with virgl as well in this area. This moves the 1D array handling code to a single place. Signed-off-by: Dave Airlie --- src/mesa/main/texgetimage.c | 24 +--- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/src/mesa/main/texgetimage.c b/src/mesa/main/texgetimage.c index 945890a..200d1ca 100644 --- a/src/mesa/main/texgetimage.c +++ b/src/mesa/main/texgetimage.c @@ -88,12 +88,6 @@ get_tex_depth(struct gl_context *ctx, GLuint dimensions, return; } - if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) { - depth = height; - height = 1; - } - - assert(zoffset + depth <= texImage->Depth); for (img = 0; img < depth; img++) { GLubyte *srcMap; GLint srcRowStride; @@ -141,7 +135,6 @@ get_tex_depth_stencil(struct gl_context *ctx, GLuint dimensions, assert(type == GL_UNSIGNED_INT_24_8 || type == GL_FLOAT_32_UNSIGNED_INT_24_8_REV); - assert(zoffset + depth <= texImage->Depth); for (img = 0; img < depth; img++) { GLubyte *srcMap; GLint rowstride; @@ -233,7 +226,6 @@ get_tex_ycbcr(struct gl_context *ctx, GLuint dimensions, { GLint img, row; - assert(zoffset + depth <= texImage->Depth); for (img = 0; img < depth; img++) { GLubyte *srcMap; GLint rowstride; @@ -431,13 +423,6 @@ get_tex_rgba_uncompressed(struct gl_context *ctx, GLuint dimensions, bool needsRebase; void *rgba = NULL; - if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) { - depth = height; - height = 1; - zoffset = yoffset; - yoffset = 0; - } - /* Depending on the base format involved we may need to apply a rebase * transform (for example: if we download to a Luminance format we want * G=0 and B=0). @@ -737,6 +722,15 @@ _mesa_GetTexSubImage_sw(struct gl_context *ctx, pixels = ADD_POINTERS(buf, pixels); } I'd probably put a comment on this conditional saying something like /* for all array textures, the Z axis selects the layer */ + if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) { + depth = height; + height = 1; + zoffset = yoffset; + yoffset = 0; + assert(zoffset + depth <= texImage->Height); + } else + assert(zoffset + depth <= texImage->Depth); I like seeing braces on the else clause to match the if clause. Reviewed-by: Brian Paul + if (get_tex_memcpy(ctx, xoffset, yoffset, zoffset, width, height, depth, format, type, pixels, texImage)) { /* all done */ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Can't get OpenGL 3.x inside VMware Workstation 12 (Ubuntu guest)
Hi Brian, On 10 November 2015 at 16:48, Brian Paul wrote: > On 11/09/2015 09:24 PM, Valera Rozuvan wrote: >> >> On Tue, Nov 10, 2015 at 4:13 AM, Brian Paul wrote: >>> >>> After running depmod, you probably need to update the initramfs with: >>> 'sudo update-initramfs -u' >>> >>> -Brian >> >> >> Hi Brian. First of all, thank you for your reply. I have tried your >> suggestion on my working setup, and also doing everything again from >> scratch. Basically, I get the same result - everything goes smoothly, >> but in the end OpenGL is version 2.1. I even tried to run a simple >> OpenGL 3.x program, and it crashes (simple OpenGL 2.1 program runs >> fine). >> >> Can you please spend an hour and try the instructions from >> >> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.mesa3d.org_vmware-2Dguest.html&d=BQIBaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8&m=f2dfCFlDZuGGllk3HIqWXUyWbl6sPO1sA7SlX-79UG8&s=emo24XVRAaQOlitBEbGNBjqB348c25cFStvxj413YcU&e= >> on a clean VMware Workstation >> 12 Player for Windows 64-bit (host), and Ubuntu 15.10 (guest)? It >> would be very awesome if you can get OpenGL 3.x running in VMware 12, >> and update the instructions. I am sure there is some critical piece >> missing. >> >> Thank you! = ) > > > I've reviewed and updated the instructions at > http://mesa3d.org/vmware-guest.html > > In particular, the --libdir option is different for Ubuntu and I've updated > some info about installing the kernel module. > > Let me know how that goes. > I think there is a hunk missing about --enable-texture-float for ARB_texture_float (and ultimately GL 3.0). I have seen similar type of documents in the past, most of which going out of date very quickly due to distribution changes and/or others. Wondering how you'll feel about "check your distro and add svga to the gallium-drivers array" style of instructions ? I'm not volunteering or suggesting that one has to rewrite them, mostly curious. Can you please sync the changes between mesa3d.org and the git repo. Assuming only vmware-guest.html differs yet I've haven't checked the rest bth. Thanks Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE
Nice! This fixes 12 GLES 3.1 CTS tests. > -Original Message- > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > Behalf Of Samuel Iglesias Gonsálvez > Sent: Wednesday, November 11, 2015 4:07 PM > To: mesa-dev@lists.freedesktop.org > Subject: [Mesa-dev] [PATCH] i965/fs/nir: fix the number of register written > by FS_OPCODE_GET_BUFFER_SIZE > > FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler > message. > > This patch adjusts the number of registers written by the opcode following > what the PRM spec says about the number of registers written by the SIMD8 > and SIMD16's writeback messages for sampler messages. > > Signed-off-by: Samuel Iglesias Gonsálvez > Cc: tapani.pa...@intel.com > --- > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 17 +++-- > 1 file changed, 15 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index 02b9f5b..61c9f2e 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -2297,16 +2297,29 @@ fs_visitor::nir_emit_intrinsic(const fs_builder > &bld, nir_intrinsic_instr *instr >fs_reg source = fs_reg(0); > >int mlen = 1 * reg_width; > + > + /* A resinfo's sampler message is used to get the buffer size. > + * The SIMD8's writeback message consists of four registers and > + * SIMD16's writeback message consists of 8 destination registers > + * (two per each component), although we are only interested on the > + * first component, where resinfo returns the buffer size for > + * SURFTYPE_BUFFER. > + */ > + int regs_written = 4 * mlen; >fs_reg src_payload = fs_reg(GRF, alloc.allocate(mlen), >BRW_REGISTER_TYPE_UD); >bld.LOAD_PAYLOAD(src_payload, &source, 1, 0); > - > + fs_reg buffer_size = fs_reg(GRF, alloc.allocate(regs_written), > + BRW_REGISTER_TYPE_UD); >const unsigned index = prog_data->binding_table.ssbo_start + > ssbo_index; > - fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, dest, > + fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, buffer_size, > src_payload, fs_reg(index)); >inst->header_size = 0; >inst->mlen = mlen; > + inst->regs_written = regs_written; >bld.emit(inst); > + dest.type = buffer_size.type; > + bld.MOV(dest, buffer_size); > >brw_mark_surface_used(prog_data, index); >break; > -- > 2.5.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/fs/nir: fix the number of register written by FS_OPCODE_GET_BUFFER_SIZE
FS_OPCODE_GET_BUFFER_SIZE is calculated with a resinfo's sampler message. This patch adjusts the number of registers written by the opcode following what the PRM spec says about the number of registers written by the SIMD8 and SIMD16's writeback messages for sampler messages. Signed-off-by: Samuel Iglesias Gonsálvez Cc: tapani.pa...@intel.com --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 17 +++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 02b9f5b..61c9f2e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2297,16 +2297,29 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr fs_reg source = fs_reg(0); int mlen = 1 * reg_width; + + /* A resinfo's sampler message is used to get the buffer size. + * The SIMD8's writeback message consists of four registers and + * SIMD16's writeback message consists of 8 destination registers + * (two per each component), although we are only interested on the + * first component, where resinfo returns the buffer size for + * SURFTYPE_BUFFER. + */ + int regs_written = 4 * mlen; fs_reg src_payload = fs_reg(GRF, alloc.allocate(mlen), BRW_REGISTER_TYPE_UD); bld.LOAD_PAYLOAD(src_payload, &source, 1, 0); - + fs_reg buffer_size = fs_reg(GRF, alloc.allocate(regs_written), + BRW_REGISTER_TYPE_UD); const unsigned index = prog_data->binding_table.ssbo_start + ssbo_index; - fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, dest, + fs_inst *inst = bld.emit(FS_OPCODE_GET_BUFFER_SIZE, buffer_size, src_payload, fs_reg(index)); inst->header_size = 0; inst->mlen = mlen; + inst->regs_written = regs_written; bld.emit(inst); + dest.type = buffer_size.type; + bld.MOV(dest, buffer_size); brw_mark_surface_used(prog_data, index); break; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v5 5/7] glsl: Add precision information to ir_variable
Reviewed-by: Samuel Iglesias Gonsálvez On 06/11/15 13:03, Tapani Pälli wrote: > From: Iago Toral Quiroga > > We will need this later on when we implement proper support for > precision qualifiers in the drivers and also to do link time checks for > uniforms as indicated by the spec. > > This patch also adds compile-time checks for variables without precision > information (currently, Mesa only checks that a default precision is set > for floats in fragment shaders). > > As indicated by Ian, the addition of the precision information to > ir_variable has been done using a bitfield and pahole to identify an > available hole so that memory requirements for ir_variable stay the > same. > > v2 (Ian): > - Avoid if-ladders by defining arrays of supported sampler names and > indexing > into them with type->sampler_array + 2 * type->sampler_shadow > - Make the code that selects the precision qualifier to use an utility > function > - Fix a typo > > v3 (Tapani): > - rebased > - squashed in "Precision qualifiers are not allowed on structs" > - fixed select_gles_precision for sampler arrays > - fixed precision_qualifier_allowed for arrays of structs > > v4 (Tapani): > - add atomic_uint handling > - do not allow precision qualifier on images > (issues reported by Marta) > > v5 (Tapani): > - support precision qualifier on image types > --- > src/glsl/ast_to_hir.cpp | 296 > > src/glsl/ir.h | 13 ++ > src/glsl/nir/glsl_types.cpp | 4 + > src/glsl/nir/glsl_types.h | 11 ++ > 4 files changed, 301 insertions(+), 23 deletions(-) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index b6d662b..1240615 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -2189,10 +2189,10 @@ precision_qualifier_allowed(const glsl_type *type) > * From this, we infer that GLSL 1.30 (and later) should allow precision > * qualifiers on sampler types just like float and integer types. > */ > - return type->is_float() > + return (type->is_float() > || type->is_integer() > - || type->is_record() > - || type->contains_opaque(); > + || type->contains_opaque()) > + && !type->without_array()->is_record(); > } > > const glsl_type * > @@ -2210,31 +2210,268 @@ ast_type_specifier::glsl_type(const char **name, > return type; > } > > -const glsl_type * > -ast_fully_specified_type::glsl_type(const char **name, > -struct _mesa_glsl_parse_state *state) > const > +/** > + * From the OpenGL ES 3.0 spec, 4.5.4 Default Precision Qualifiers: > + * > + * "The precision statement > + * > + *precision precision-qualifier type; > + * > + * can be used to establish a default precision qualifier. The type field > can > + * be either int or float or any of the sampler types, (...) If type is > float, > + * the directive applies to non-precision-qualified floating point type > + * (scalar, vector, and matrix) declarations. If type is int, the directive > + * applies to all non-precision-qualified integer type (scalar, vector, > signed, > + * and unsigned) declarations." > + * > + * We use the symbol table to keep the values of the default precisions for > + * each 'type' in each scope and we use the 'type' string from the precision > + * statement as key in the symbol table. When we want to retrieve the default > + * precision associated with a given glsl_type we need to know the type > string > + * associated with it. This is what this function returns. > + */ > +static const char * > +get_type_name_for_precision_qualifier(const glsl_type *type) > { > - const struct glsl_type *type = this->specifier->glsl_type(name, state); > - > - if (type == NULL) > - return NULL; > + switch (type->base_type) { > + case GLSL_TYPE_FLOAT: > + return "float"; > + case GLSL_TYPE_UINT: > + case GLSL_TYPE_INT: > + return "int"; > + case GLSL_TYPE_ATOMIC_UINT: > + return "atomic_uint"; > + case GLSL_TYPE_IMAGE: > + /* fallthrough */ > + case GLSL_TYPE_SAMPLER: { > + const unsigned type_idx = > + type->sampler_array + 2 * type->sampler_shadow; > + const unsigned offset = type->base_type == GLSL_TYPE_SAMPLER ? 0 : 4; > + assert(type_idx < 4); > + switch (type->sampler_type) { > + case GLSL_TYPE_FLOAT: > + switch (type->sampler_dimensionality) { > + case GLSL_SAMPLER_DIM_1D: { > +assert(type->base_type == GLSL_TYPE_SAMPLER); > +static const char *const names[4] = { > + "sampler1D", "sampler1DArray", > + "sampler1DShadow", "sampler1DArrayShadow" > +}; > +return names[type_idx]; > + } > + case GLSL_SAMPLER_DIM_2D: { > +static const char *const names[8] = { > + "sampler2D", "sampler2DArray", > + "sampler2DShadow", "sampler2
Re: [Mesa-dev] [PATCH 4/7] glsl: Move the definition of precision_qualifier_allowed
You can either move the function or add the function forward declaration. In any case, Reviewed-by: Samuel Iglesias Gonsálvez Sam On 05/11/15 12:33, Tapani Pälli wrote: > From: Iago Toral Quiroga > > We will need this to build later patches > --- > src/glsl/ast_to_hir.cpp | 71 > - > 1 file changed, 35 insertions(+), 36 deletions(-) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index d20be0b..b6d662b 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -2159,6 +2159,41 @@ process_array_type(YYLTYPE *loc, const glsl_type *base, > return array_type; > } > > +static bool > +precision_qualifier_allowed(const glsl_type *type) > +{ > + /* Precision qualifiers apply to floating point, integer and opaque > +* types. > +* > +* Section 4.5.2 (Precision Qualifiers) of the GLSL 1.30 spec says: > +*"Any floating point or any integer declaration can have the type > +*preceded by one of these precision qualifiers [...] Literal > +*constants do not have precision qualifiers. Neither do Boolean > +*variables. > +* > +* Section 4.5 (Precision and Precision Qualifiers) of the GLSL 1.30 > +* spec also says: > +* > +* "Precision qualifiers are added for code portability with OpenGL > +* ES, not for functionality. They have the same syntax as in OpenGL > +* ES." > +* > +* Section 8 (Built-In Functions) of the GLSL ES 1.00 spec says: > +* > +* "uniform lowp sampler2D sampler; > +* highp vec2 coord; > +* ... > +* lowp vec4 col = texture2D (sampler, coord); > +*// texture2D returns lowp" > +* > +* From this, we infer that GLSL 1.30 (and later) should allow precision > +* qualifiers on sampler types just like float and integer types. > +*/ > + return type->is_float() > + || type->is_integer() > + || type->is_record() > + || type->contains_opaque(); > +} > > const glsl_type * > ast_type_specifier::glsl_type(const char **name, > @@ -3645,42 +3680,6 @@ validate_identifier(const char *identifier, YYLTYPE > loc, > } > } > > -static bool > -precision_qualifier_allowed(const glsl_type *type) > -{ > - /* Precision qualifiers apply to floating point, integer and opaque > -* types. > -* > -* Section 4.5.2 (Precision Qualifiers) of the GLSL 1.30 spec says: > -*"Any floating point or any integer declaration can have the type > -*preceded by one of these precision qualifiers [...] Literal > -*constants do not have precision qualifiers. Neither do Boolean > -*variables. > -* > -* Section 4.5 (Precision and Precision Qualifiers) of the GLSL 1.30 > -* spec also says: > -* > -* "Precision qualifiers are added for code portability with OpenGL > -* ES, not for functionality. They have the same syntax as in OpenGL > -* ES." > -* > -* Section 8 (Built-In Functions) of the GLSL ES 1.00 spec says: > -* > -* "uniform lowp sampler2D sampler; > -* highp vec2 coord; > -* ... > -* lowp vec4 col = texture2D (sampler, coord); > -*// texture2D returns lowp" > -* > -* From this, we infer that GLSL 1.30 (and later) should allow precision > -* qualifiers on sampler types just like float and integer types. > -*/ > - return type->is_float() > - || type->is_integer() > - || type->is_record() > - || type->contains_opaque(); > -} > - > ir_rvalue * > ast_declarator_list::hir(exec_list *instructions, > struct _mesa_glsl_parse_state *state) > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/7] glsl: Add user-defined default precision qualifiers to the symbol table
Reviewed-by: Samuel Iglesias Gonsálvez On 05/11/15 12:33, Tapani Pälli wrote: > From: Iago Toral Quiroga > > Notice that the spec requires that a default precision has been set for every > type used by a shader that can use a precision qualifier and does not have a > predefined precision, however, at the moment, Mesa only checks this for floats > in the fragment shader. This is probably because the GLSL ES 1.0 specs > mentions > this case specifically, but GLSL ES 3.0 clarifies that the same applies to > other types: > > "The fragment language has no default precision qualifier for floating point > types. Hence for float, floating point vector and matrix variable > declarations, either the declaration must include a precision qualifier or > the default float precision must have been previously declared. Similarly, > there is no default precision qualifier for the following sampler types in > either the vertex or fragment language: > > sampler3D; > samplerCubeShadow; > sampler2DShadow; > sampler2DArray; > sampler2DArrayShadow; > isampler2D; > isampler3D; > isamplerCube; > isampler2DArray; > usampler2D; > usampler3D; > usamplerCube; > usampler2DArray;" > > we will fix this in a later patch. > --- > src/glsl/ast_to_hir.cpp | 29 ++--- > 1 file changed, 10 insertions(+), 19 deletions(-) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index 0306530..d20be0b 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -2184,11 +2184,15 @@ ast_fully_specified_type::glsl_type(const char **name, > if (type == NULL) >return NULL; > > + /* The fragment language does not define a default precision value > +* for float types, so check that one is defined if the type declaration > +* isn't providing one explictly. > +*/ > if (type->base_type == GLSL_TYPE_FLOAT > && state->es_shader > && state->stage == MESA_SHADER_FRAGMENT > && this->qualifier.precision == ast_precision_none > - && state->symbols->get_variable("#default precision") == NULL) { > + && state->symbols->get_default_precision_qualifier("float") == > ast_precision_none) { >YYLTYPE loc = this->get_location(); >_mesa_glsl_error(&loc, state, > "no precision specified this scope for type `%s'", > @@ -5749,20 +5753,10 @@ ast_type_specifier::hir(exec_list *instructions, > return NULL; >} > > - if (type->base_type == GLSL_TYPE_FLOAT > - && state->es_shader > - && state->stage == MESA_SHADER_FRAGMENT) { > + if (state->es_shader) { > /* Section 4.5.3 (Default Precision Qualifiers) of the GLSL ES 1.00 >* spec says: >* > - * "The fragment language has no default precision qualifier for > - * floating point types." > - * > - * As a result, we have to track whether or not default precision > has > - * been specified for float in GLSL ES fragment shaders. > - * > - * Earlier in that same section, the spec says: > - * >* "Non-precision qualified declarations will use the precision >* qualifier specified in the most recent precision statement >* that is still in scope. The precision statement has the same > @@ -5775,16 +5769,13 @@ ast_type_specifier::hir(exec_list *instructions, >* overriding earlier statements within that scope." >* >* Default precision specifications follow the same scope rules as > - * variables. So, we can track the state of the default float > - * precision in the symbol table, and the rules will just work. > This > + * variables. So, we can track the state of the default precision > + * qualifiers in the symbol table, and the rules will just work. > This >* is a slight abuse of the symbol table, but it has the semantics >* that we want. >*/ > - ir_variable *const junk = > -new(state) ir_variable(type, "#default precision", > - ir_var_auto); > - > - state->symbols->add_variable(junk); > + state->symbols->add_default_precision_qualifier(this->type_name, > + > this->default_precision); >} > >/* FINISHME: Translate precision statements into IR. */ > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/7] glsl: Add API to put default precision qualifiers in the symbol table
Reviewed-by: Samuel Iglesias Gonsálvez On 05/11/15 12:33, Tapani Pälli wrote: > From: Iago Toral Quiroga > > These have scoping rules that match the ones defined for other things such > as variables, so we want them in the symbol table. > --- > src/glsl/glsl_symbol_table.cpp | 24 > src/glsl/glsl_symbol_table.h | 2 ++ > 2 files changed, 26 insertions(+) > > diff --git a/src/glsl/glsl_symbol_table.cpp b/src/glsl/glsl_symbol_table.cpp > index 536f0a3..6c682ac 100644 > --- a/src/glsl/glsl_symbol_table.cpp > +++ b/src/glsl/glsl_symbol_table.cpp > @@ -23,6 +23,7 @@ > */ > > #include "glsl_symbol_table.h" > +#include "ast.h" > > class symbol_table_entry { > public: > @@ -201,6 +202,20 @@ bool glsl_symbol_table::add_function(ir_function *f) > return _mesa_symbol_table_add_symbol(table, -1, f->name, entry) == 0; > } > > +bool glsl_symbol_table::add_default_precision_qualifier(const char > *type_name, > +int precision) > +{ > + char *name = ralloc_asprintf(mem_ctx, "#default_precision_%s", type_name); > + > + ast_type_specifier *default_specifier = new(mem_ctx) > ast_type_specifier(name); > + default_specifier->default_precision = precision; > + > + symbol_table_entry *entry = > + new(mem_ctx) symbol_table_entry(default_specifier); > + > + return _mesa_symbol_table_add_symbol(table, -1, name, entry) == 0; > +} > + > void glsl_symbol_table::add_global_function(ir_function *f) > { > symbol_table_entry *entry = new(mem_ctx) symbol_table_entry(f); > @@ -234,6 +249,15 @@ ir_function *glsl_symbol_table::get_function(const char > *name) > return entry != NULL ? entry->f : NULL; > } > > +int glsl_symbol_table::get_default_precision_qualifier(const char *type_name) > +{ > + char *name = ralloc_asprintf(mem_ctx, "#default_precision_%s", type_name); > + symbol_table_entry *entry = get_entry(name); > + if (!entry) > + return ast_precision_none; > + return entry->a->default_precision; > +} > + > symbol_table_entry *glsl_symbol_table::get_entry(const char *name) > { > return (symbol_table_entry *) > diff --git a/src/glsl/glsl_symbol_table.h b/src/glsl/glsl_symbol_table.h > index e32b88b..5d654e5 100644 > --- a/src/glsl/glsl_symbol_table.h > +++ b/src/glsl/glsl_symbol_table.h > @@ -72,6 +72,7 @@ struct glsl_symbol_table { > bool add_function(ir_function *f); > bool add_interface(const char *name, const glsl_type *i, >enum ir_variable_mode mode); > + bool add_default_precision_qualifier(const char *type_name, int > precision); > /*@}*/ > > /** > @@ -88,6 +89,7 @@ struct glsl_symbol_table { > ir_function *get_function(const char *name); > const glsl_type *get_interface(const char *name, >enum ir_variable_mode mode); > + int get_default_precision_qualifier(const char *type_name); > /*@}*/ > > /** > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH][mesa-demos] configure.ac: fix AC_WITH(glut) so that --without-glut works
Currently if --without-glut is used on a system that has the GLUT libraries installed, GLUT is used regardless. Change the logic so that GLUT is searched for if and only if GLUT is requested. Signed-off-by: Ross Burton --- configure.ac | 26 +++--- 1 file changed, 15 insertions(+), 11 deletions(-) diff --git a/configure.ac b/configure.ac index c4ee12b..9445424 100644 --- a/configure.ac +++ b/configure.ac @@ -67,21 +67,25 @@ DEMO_CFLAGS="$DEMO_CFLAGS $GL_CFLAGS" DEMO_LIBS="$DEMO_LIBS $GL_LIBS" dnl Check for GLUT -GLUT_CFLAGS="" -GLUT_LIBS=-lglut -glut_enabled=yes +glut_enabled=no AC_ARG_WITH([glut], [AS_HELP_STRING([--with-glut=DIR], [glut install directory])], [GLUT_CFLAGS="-I$withval/include" -GLUT_LIBS="-L$withval/lib -lglut"]) -AC_CHECK_HEADER([GL/glut.h], - [], - [glut_enabled=no]) -AC_CHECK_LIB([glut], - [glutInit], - [], - [glut_enabled=no]) +GLUT_LIBS="-L$withval/lib -lglut"], + [GLUT_CFLAGS="" +GLUT_LIBS="-lglut"] +) +AS_IF([test "x$with_glut" != xno], + [AC_CHECK_HEADER([GL/glut.h], + [], + [glut_enabled=no]) + AC_CHECK_LIB([glut], +[glutInit], +[], +[glut_enabled=no]) + glut_enabled=yes +]) dnl Check for FreeGLUT 2.6 or later AC_EGREP_HEADER([glutInitContextProfile], -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Mesa 11.0.5
Mesa 11.0.5 is now available. With this release we have some driver patches for i965 and nouveau, a couple of llvm 3.7 related fixes and a some bugfixes in the VA state-tracker. Additionally we have a few new PCI ids for i965 and radeonsi. The list of exported OSMesa symbols under Windows has been updated. Alex Deucher (1): radeon/uvd: don't expose HEVC on old UVD hw (v3) Ben Widawsky (1): i965/skl: Add GT4 PCI IDs Emil Velikov (5): docs: add sha256 checksums for 11.0.4 cherry-ignore: ignore a possible wrong nomination Revert "mesa/glformats: Undo code changes from _mesa_base_tex_format() move" Update version to 11.0.5 docs: add release notes for 11.0.5 Emmanuel Gil Peyrot (1): gbm.h: Add a missing stddef.h include for size_t. Eric Anholt (1): vc4: When the create ioctl fails, free our cache and try again. Ian Romanick (1): i965: Fix is-renderable check in intel_image_target_renderbuffer_storage Ilia Mirkin (3): nvc0: respect edgeflag attribute width nouveau: set MaxDrawBuffers to the same value as MaxColorAttachments nouveau: relax fence emit space assert Ivan Kalvachev (1): r600g: Fix special negative immediate constants when using ABS modifier. Jason Ekstrand (2): nir/lower_vec_to_movs: Pass the shader around directly nir: Report progress from lower_vec_to_movs(). Jose Fonseca (2): gallivm: Translate all util_cpu_caps bits to LLVM attributes. gallivm: Explicitly disable unsupported CPU features. Julien Isorce (4): st/va: pass picture desc to begin and decode nvc0: fix crash when nv50_miptree_from_handle fails st/va: do not destroy old buffer when new one failed st/va: add more errors checks in vlVaBufferSetNumElements and vlVaMapBuffer Kenneth Graunke (6): i965: Fix missing BRW_NEW_*_PROG_DATA flagging caused by cache reuse. nir: Report progress from nir_split_var_copies(). nir: Properly invalidate metadata in nir_split_var_copies(). nir: Properly invalidate metadata in nir_opt_copy_prop(). nir: Properly invalidate metadata in nir_lower_vec_to_movs(). nir: Properly invalidate metadata in nir_opt_remove_phis(). Marek Olšák (1): radeonsi: add register definitions for Stoney Nanley Chery (1): mesa/glformats: Undo code changes from _mesa_base_tex_format() move Nicolai Hähnle (1): st/mesa: fix mipmap generation for immutable textures with incomplete pyramids Nigel Stewart (1): osmesa: Expose GL entry points for Windows build via DEF file. Roland Scheidegger (1): gallivm: disable f16c when not using AVX Samuel Li (2): radeonsi: add support for Stoney asics (v3) radeonsi: add Stoney pci ids git tag: mesa-11.0.5 ftp://ftp.freedesktop.org/pub/mesa/11.0.5/mesa-11.0.5.tar.gz MD5: 04679fb7af8d0647898229033eb8d754 mesa-11.0.5.tar.gz SHA1: 2a0bd13915d67ba1aba65f0a9873b96151581836 mesa-11.0.5.tar.gz SHA256: 8495ef5c06f7f726452462b7d408a5b40048373ff908f2283a3b4d1f49b45ee6 mesa-11.0.5.tar.gz PGP: ftp://ftp.freedesktop.org/pub/mesa/11.0.5/mesa-11.0.5.tar.gz.sig ftp://ftp.freedesktop.org/pub/mesa/11.0.5/mesa-11.0.5.tar.xz MD5: b71b5e6c437cd8c85a61a476ab840f9f mesa-11.0.5.tar.xz SHA1: 0d015c7b2041f06503144f71fbc432e78706bc79 mesa-11.0.5.tar.xz SHA256: 9c255a2a6695fcc6ef4a279e1df0aeaf417dc142f39ee59dfb533d80494bb67a mesa-11.0.5.tar.xz PGP: ftp://ftp.freedesktop.org/pub/mesa/11.0.5/mesa-11.0.5.tar.xz.sig -- -Emil signature.asc Description: OpenPGP digital signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] nir/copy_propagate: do not copy-propagate MOV srcs with source modifiers
If a source operand in a MOV has source modifiers, then we cannot copy-propagate it from the parent instruction and remove the MOV. --- I noticed this while debugging some regressions introduced with the fp64 code. Basically, I had code similar to this: vec4 ssa1 = intrincisc1 (...) (...) vec2 ssa2 = vec2 ssa1 -ssa1.y vec4 ssa3 = intrinsic2 (ssa2) (...) that would be turned into this by copy propagation: vec4 ssa1 = some intrincisc vec4 ssa2 = some intrinsic (ssa1) which is obviously not correct. This was happening because is_swizzleless_move checked that the MOV/vecN operation that generates the value we want to copy-propagate does not incorporate swizzling, but it ignored the case where it also added source modifiers, in which case we can't copy-propagate either. Of course, now that we have made vecN operations unsigned again, that example can't happen because lower_to_source_mods won't produce things like that, but I figured that the patch would still make sense, since it fixes a case where copy-propagation won't work as intended, even if we are not currently triggering it (at least not with vecN operations). src/glsl/nir/nir_opt_copy_propagate.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/glsl/nir/nir_opt_copy_propagate.c b/src/glsl/nir/nir_opt_copy_propagate.c index 7d8bdd7..7caa4b7 100644 --- a/src/glsl/nir/nir_opt_copy_propagate.c +++ b/src/glsl/nir/nir_opt_copy_propagate.c @@ -65,9 +65,12 @@ static bool is_vec(nir_alu_instr *instr) } static bool -is_swizzleless_move(nir_alu_instr *instr) +is_simple_move(nir_alu_instr *instr) { if (is_move(instr)) { + if (instr->src[0].negate || instr->src[0].abs) + return false; + for (unsigned i = 0; i < 4; i++) { if (!((instr->dest.write_mask >> i) & 1)) break; @@ -81,6 +84,9 @@ is_swizzleless_move(nir_alu_instr *instr) if (instr->src[i].swizzle[0] != i) return false; + if (instr->src[i].negate || instr->src[i].abs) +return false; + if (def == NULL) { def = instr->src[i].src.ssa; } else if (instr->src[i].src.ssa != def) { @@ -107,7 +113,7 @@ copy_prop_src(nir_src *src, nir_instr *parent_instr, nir_if *parent_if) return false; nir_alu_instr *alu_instr = nir_instr_as_alu(src_instr); - if (!is_swizzleless_move(alu_instr)) + if (!is_simple_move(alu_instr)) return false; /* Don't let copy propagation land us with a phi that has more -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev