Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled
On 15/06/2016 03:04, Roland Scheidegger wrote:
> On 15.06.2016 at 01:08, Axel Davy wrote:
>> On 15/06/2016 00:21, Roland Scheidegger wrote:
>>> On 14.06.2016 at 23:33, Axel Davy wrote:
>>>> diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h
>>>> index 396f563..7dce80a 100644
>>>> --- a/src/gallium/include/pipe/p_state.h
>>>> +++ b/src/gallium/include/pipe/p_state.h
>>>> @@ -139,6 +139,13 @@ struct pipe_rasterizer_state
>>>>     unsigned clip_halfz:1;
>>>>
>>>>     /**
>>>> +    * When true do not scale offset_units and use same rules for unorm and
>>>> +    * float depth buffers (D3D9). When false use GL/D3D1X behaviour.
>>>> +    * This depends on PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.
>>>> +    */
>>>> +   unsigned offset_units_unscaled;
>>>> +
>>>> +   /**
>>>>      * Enable bits for clipping half-spaces.
>>>>      * This applies to both user clip planes and shader clip distances.
>>>>      * Note that if the bound shader exports any clip distances, these
>>>
>>> I don't like this. Generally, for unorm formats, you can easily enough
>>> translate this from d3d9 to gl (or d3d10) rules (but yes, obviously
>>> it's going to be format dependent). (With one big caveat: in general
>>> not all gl drivers agree on the minimum resolvable difference. It
>>> might range from 2^-22 to 2^-24 for 24bit unorm depth, for instance,
>>> and I don't think it's quite consistent across gallium drivers
>>> either.)
>>> You are right though that for float depth the formula is different,
>>> and you can't translate it. But do you really need float depth buffer
>>> support? AFAIK no d3d9 app really depends on it, everything can fall
>>> back to d24.
>>>
>>> Roland
>>
>> Hi,
>>
>> That's true, float depth buffers do not seem to be widely used in
>> d3d9. The two float depth buffers available in d3d9, as far as I know,
>> are D32F_LOCKABLE and D24FS8.
>> We can see the support for those and other depth buffers here (note
>> that these are mainly old cards):
>> http://zp.amsnet.pl/cdragan/query.php?dxversion=9&feature=formats&featuregroup=selected&adaptergroup=all&featureselected[]=45&featureselected[]=44&featureselected[]=41&featureselected[]=42&featureselected[]=43&featureselected[]=40&featureselected[]=39&featureselected[]=46&resource=SURFACE&usage=DEPTHSTENCIL&orientation=horizontal
>>
>> It is likely not a requirement for any game to support these formats.
>>
>> We could ignore these formats, and add to gallium a way to get the
>> minimum resolvable difference per depth buffer format from drivers.
>> We considered this option. That said, the driver is the best place to
>> know about the minimum resolvable difference, and we chose to let the
>> driver do the scaling instead of doing it in the state tracker based
>> on some driver query.
>>
>> As for floating point depth buffer behaviour, I understand that it may
>> be harder to implement for some drivers than for others. That doesn't
>> seem, however, a reason to drop floating depth buffer support in
>> Gallium Nine.
>> D32F_LOCKABLE is particularly useful for debugging: being lockable, it
>> can be used to show the depth buffer content after some draw calls for
>> d3d on windows, and compare with nine. And some apps may use it for
>> some particular effects.
>>
>> I'd be ok if we make the float depth buffer part of
>> offset_units_unscaled optional, given how rarely the combination of
>> float depth buffers and depth bias must be used. However, if hw can
>> do it, I see no reason why we wouldn't support the capability?
>
> On second look, it doesn't really look too bad (and fwiw we actually
> could probably put it to use here if we'd support it in llvmpipe).
> Albeit,
>    unsigned offset_units_unscaled;
> needs to be
>    unsigned offset_units_unscaled:1;

Good catch, this was the reason I had put it in this place of the
structure, but somehow forgot the :1 ...

> I'm just very sceptical when it comes to capabilities solely to the
> benefit of fringe state trackers (and everything not st/mesa counts
> here). It usually means driver authors aren't going to bother. And you
> probably can't implement it in all drivers yourselves even if the hw
> could do it.
>
> That said, I'm ok with this if there are no objections from others.
>
> Roland

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
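The translation Roland describes for unorm depth formats can be sketched in a few lines of C. This is only an illustration, not Mesa or Gallium Nine code: the function name is hypothetical, and it assumes the minimum resolvable difference r is exactly 2^-n for an n-bit unorm depth buffer, which the thread notes is not consistent across drivers (2^-22 to 2^-24 for 24-bit depth). GL adds r * units to the depth value, while D3D9 adds the bias unscaled, so the equivalent GL value is units = bias / r.

```c
#include <assert.h>

/* Hypothetical helper, not part of Mesa: convert an unscaled D3D9 depth
 * bias into GL-style polygon offset units for an n-bit unorm depth buffer.
 * Assumption: the minimum resolvable difference is r = 2^-n.
 */
static double
d3d9_bias_to_gl_units(double d3d9_bias, unsigned unorm_depth_bits)
{
   assert(unorm_depth_bits > 0 && unorm_depth_bits < 32);

   /* 1 / r, i.e. 2^n, computed exactly as an integer first */
   double inv_r = (double)(1u << unorm_depth_bits);

   return d3d9_bias * inv_r;
}
```

No such conversion exists for float depth buffers, because the GL formula there depends on the exponent of the depth values in the primitive; that is the case the proposed offset_units_unscaled bit is meant to cover.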
Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2
On 14/06/16 18:47, Martin Peres wrote:
> On 14/06/16 17:58, Juha-Pekka Heikkila wrote:
>> Here is a fixed version of this ralloc set. This time I got to run it
>> on many different machines thanks to Mark Janes. No regressions showed
>> up on the different gen hw. On my IVB I've also been running many
>> different traces with Apitrace with Valgrind running in the
>> background, and Valgrind seemed happy with my changes.
>>
>> As a performance test I did shader-db compile runs 10 times and
>> compared the timing results against what Mesa master does on my IVB.
>> To my surprise this does bring a reasonable gain which also seems to
>> be repeatable: on my IVB shader compile time is around 5% faster with
>> these changes.
>
> On my SKL gt2, I only get a 0.35% improvement (10 runs also). Ministat
> says that there is no difference proven at 95% confidence and 0.35% at
> 90%.
>
> Adding 100 more runs overnight, we'll see what we get in the end.

With n=110:

Difference at 95.0% confidence
	-0.133665% +/- 0.0945651%
	(Student's t, pooled s = 0.447829)

Martin
[Mesa-dev] [PATCH 14/18] nir/glsl: add double packing support to vs and fs
---
 src/compiler/glsl/link_varyings.cpp | 16 +++++++++++++---
 src/compiler/nir/nir_lower_io.c     | 16 ++++++++++++++++
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
index 22dc2d8..7c0d93a 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -1992,10 +1992,11 @@ set_num_packed_components(struct gl_shader *shader, ir_variable_mode io_mode,
              var->type->without_array()->is_matrix())
             continue;
 
+         unsigned dfrac = var->type->without_array()->is_double() ? 2 : 1;
          if (var->type->is_array()) {
             const glsl_type *type = get_varying_type(var, shader->Stage);
             unsigned array_components = type->without_array()->vector_elements +
-               var->data.location_frac;
+               var->data.location_frac / dfrac;
             assert(type->arrays_of_arrays_size() + idx <=
                    ARRAY_SIZE(num_components));
             for (unsigned i = idx; i < type->arrays_of_arrays_size(); i++) {
@@ -2003,7 +2004,7 @@ set_num_packed_components(struct gl_shader *shader, ir_variable_mode io_mode,
             }
          } else {
             unsigned comps = var->type->vector_elements +
-               var->data.location_frac;
+               var->data.location_frac / dfrac;
             num_components[idx] = MAX2(comps, num_components[idx]);
          }
       }
@@ -2031,7 +2032,16 @@ set_num_packed_components(struct gl_shader *shader, ir_variable_mode io_mode,
                c = MAX2(c, num_components[i]);
             }
          } else {
-            c = num_components[idx];
+            /* Handle special case of packing dvec3 with a double. The only
+             * valid scenario is packing a double in the 4th component of the
+             * double vector.
+             */
+            if (var->type->is_double() && var->type->vector_elements == 3 &&
+                num_components[idx+1] == 2) {
+               c = 4;
+            } else {
+               c = num_components[idx];
+            }
          }
          var->data.num_packed_components = c;
       }
diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c
index b966348..5566c83 100644
--- a/src/compiler/nir/nir_lower_io.c
+++ b/src/compiler/nir/nir_lower_io.c
@@ -104,6 +104,22 @@ nir_assign_var_locations(struct exec_list *var_list, unsigned *size,
          if (locations[idx][var->data.index] == -1) {
             var->data.driver_location = location;
             locations[idx][var->data.index] = location;
+
+            /* A dvec3 can be packed with a double we need special handling
+             * for this as we are packing across two locations.
+             */
+            if (glsl_get_base_type(var->type) == GLSL_TYPE_DOUBLE &&
+                glsl_get_vector_elements(var->type) == 3) {
+               /* Hack around type_size functions that expect vectors to be
+                * padded out to vec4.
+                */
+               unsigned dsize = type_size(glsl_double_type());
+               unsigned offset =
+                  dsize == type_size(glsl_float_type()) ? dsize : dsize * 2;
+
+               locations[idx + 1][var->data.index] = location + offset;
+            }
+
             location += type_size(var->type) +
                calc_type_size_offset(var->data.num_packed_components,
                                      var->type, type_size);
-- 
2.5.5
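The dfrac adjustment in the link_varyings.cpp hunks above can be read as follows: location_frac counts 32-bit components, while vector_elements counts elements of the variable's own type, so for doubles the component offset is halved before the two are added. A minimal sketch of that arithmetic, with a hypothetical function name and plain parameters standing in for the ir_variable fields:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the component count computed in set_num_packed_components().
 * Not Mesa code: the function and its parameters are stand-ins for
 * var->type->vector_elements, var->data.location_frac and is_double().
 */
static unsigned
packed_components(unsigned vector_elements, unsigned location_frac,
                  bool is_double)
{
   /* A double spans two 32-bit components, so scale the offset down. */
   unsigned dfrac = is_double ? 2 : 1;

   return vector_elements + location_frac / dfrac;
}
```

For example, a vec2 declared at component 2 yields 4, while a single double at component 2 yields 2, since it sits in the second 64-bit position of the slot.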
[Mesa-dev] [PATCH 11/18] i965: add indirect packing support to gs load inputs
Reviewed-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 4eaf5ea..75737c1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2135,14 +2135,26 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
          } else {
             /* Indirect indexing - use per-slot offsets as well. */
             const fs_reg srcs[] = { icp_handle, indirect_offset };
+            unsigned read_components = num_components + first_component;
+            fs_reg tmp = bld.vgrf(dst.type, read_components);
             fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2);
             bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0);
-
-            inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp_dst, payload);
+            if (first_component != 0) {
+               inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
+                               payload);
+               inst->regs_written = read_components;
+               for (unsigned i = 0; i < num_components; i++) {
+                  bld.MOV(offset(tmp_dst, bld, i),
+                          offset(tmp, bld, i + first_component));
+               }
+            } else {
+               inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp_dst,
+                               payload);
+               inst->regs_written = num_components * type_sz(tmp_dst.type) / 4;
+            }
             inst->offset = base_offset;
             inst->base_mrf = -1;
             inst->mlen = 2;
-            inst->regs_written = num_components * type_sz(tmp_dst.type) / 4;
          }
 
          if (type_sz(dst.type) == 8) {
-- 
2.5.5
[Mesa-dev] [PATCH 17/18] i965: enable ARB_enhanced_layouts for gen8+
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 5be4787..d61692d 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -388,6 +388,7 @@ intelInitExtensions(struct gl_context *ctx)
    }
 
    if (brw->gen >= 8) {
+      ctx->Extensions.ARB_enhanced_layouts = true;
       ctx->Extensions.ARB_shader_precision = true;
       ctx->Extensions.ARB_stencil_texturing = true;
       ctx->Extensions.ARB_texture_stencil8 = true;
-- 
2.5.5
[Mesa-dev] [PATCH 08/18] nir: add glsl_dvec_type() helper
---
 src/compiler/nir_types.cpp | 6 ++++++
 src/compiler/nir_types.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 4ea7a2f..835d53b 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -257,6 +257,12 @@ glsl_vec_type(unsigned n)
 }
 
 const glsl_type *
+glsl_dvec_type(unsigned n)
+{
+   return glsl_type::dvec(n);
+}
+
+const glsl_type *
 glsl_vec4_type(void)
 {
    return glsl_type::vec4_type;
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index 7d9917f..f7147a9 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -118,6 +118,7 @@ bool glsl_sampler_type_is_array(const struct glsl_type *type);
 const struct glsl_type *glsl_void_type(void);
 const struct glsl_type *glsl_float_type(void);
 const struct glsl_type *glsl_vec_type(unsigned n);
+const struct glsl_type *glsl_dvec_type(unsigned n);
 const struct glsl_type *glsl_vec4_type(void);
 const struct glsl_type *glsl_int_type(void);
 const struct glsl_type *glsl_uint_type(void);
-- 
2.5.5
[Mesa-dev] [PATCH 13/18] nir: add glsl_double_type() helper
Reviewed-by: Kenneth Graunke
---
 src/compiler/nir_types.cpp | 6 ++++++
 src/compiler/nir_types.h   | 1 +
 2 files changed, 7 insertions(+)

diff --git a/src/compiler/nir_types.cpp b/src/compiler/nir_types.cpp
index 835d53b..f694a84 100644
--- a/src/compiler/nir_types.cpp
+++ b/src/compiler/nir_types.cpp
@@ -251,6 +251,12 @@ glsl_float_type(void)
 }
 
 const glsl_type *
+glsl_double_type(void)
+{
+   return glsl_type::double_type;
+}
+
+const glsl_type *
 glsl_vec_type(unsigned n)
 {
    return glsl_type::vec(n);
diff --git a/src/compiler/nir_types.h b/src/compiler/nir_types.h
index f7147a9..6b4f646 100644
--- a/src/compiler/nir_types.h
+++ b/src/compiler/nir_types.h
@@ -117,6 +117,7 @@ bool glsl_sampler_type_is_array(const struct glsl_type *type);
 
 const struct glsl_type *glsl_void_type(void);
 const struct glsl_type *glsl_float_type(void);
+const struct glsl_type *glsl_double_type(void);
 const struct glsl_type *glsl_vec_type(unsigned n);
 const struct glsl_type *glsl_dvec_type(unsigned n);
 const struct glsl_type *glsl_vec4_type(void);
-- 
2.5.5
[Mesa-dev] [PATCH 07/18] i965: add component packing support for tcs
Reviewed-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 6033e5e..587549f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2680,6 +2680,9 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
       fs_reg tmp =
          fs_reg(VGRF, alloc.allocate(2 * iter_components), value.type);
 
+      unsigned first_component = nir_intrinsic_component(instr);
+      mask = mask << first_component;
+
       for (unsigned iter = 0; iter < num_iterations; iter++) {
          if (!is_64bit && mask != WRITEMASK_XYZW) {
             srcs[header_regs++] = brw_imm_ud(mask << 16);
@@ -2717,11 +2720,12 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
          }
 
          for (unsigned i = 0; i < iter_components; i++) {
-            if (!(mask & (1 << i)))
+            if (!(mask & (1 << (i + first_component))))
                continue;
 
             if (!is_64bit) {
-               srcs[header_regs + i] = offset(value, bld, BRW_GET_SWZ(swiz, i));
+               srcs[header_regs + i + first_component] =
+                  offset(value, bld, BRW_GET_SWZ(swiz, i));
             } else {
                /* We need to shuffle the 64-bit data to match the layout
                 * expected by our 32-bit URB write messages. We use a temporary
@@ -2744,7 +2748,8 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld,
          }
 
          unsigned mlen =
-            header_regs + (is_64bit ? 2 * iter_components : iter_components);
+            header_regs + (is_64bit ? 2 * iter_components : iter_components) +
+            first_component;
          fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, mlen);
          bld.LOAD_PAYLOAD(payload, srcs, mlen, header_regs);
 
-- 
2.5.5
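The store path in this patch does two things with the component offset: it shifts the write mask so the enabled channels line up with the varying's position inside the vec4 slot, and it grows the message length by the same amount. A small sketch of both steps for the 32-bit case, assuming a four-bit channel mask; the helper names are hypothetical and the WRITEMASK_XYZW value is defined locally for the example:

```c
#include <assert.h>

/* Assumed for this sketch: one enable bit per channel of a vec4 slot. */
#define WRITEMASK_XYZW 0xfu

/* Move a variable-relative write mask to its slot-relative position. */
static unsigned
shift_writemask(unsigned mask, unsigned first_component)
{
   assert(first_component < 4);

   unsigned shifted = mask << first_component;

   /* A packed varying has to stay inside its vec4 slot. */
   assert((shifted & ~WRITEMASK_XYZW) == 0);
   return shifted;
}

/* Message length for 32-bit data: header registers, the written
 * components, plus the skipped leading components of the slot.
 */
static unsigned
payload_length(unsigned header_regs, unsigned iter_components,
               unsigned first_component)
{
   return header_regs + iter_components + first_component;
}
```

A vec2 stored at component 2 has a variable-relative mask of 0x3 (xy), which becomes 0xc (zw) in the slot, and sources land at header_regs + i + first_component in the payload.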
[Mesa-dev] [PATCH 16/18] i965: add double packing support to tess stages
Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 27 ++- 1 file changed, 18 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 9f890ca..bd37a51 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2407,8 +2407,10 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, */ unsigned num_iterations = 1; unsigned num_components = instr->num_components; + unsigned first_component = nir_intrinsic_component(instr); fs_reg orig_dst = dst; if (type_sz(dst.type) == 8) { + first_component = first_component / 2; if (instr->num_components > 2) { num_iterations = 2; num_components = 2; @@ -2418,7 +2420,6 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, dst = tmp; } - unsigned first_component = nir_intrinsic_component(instr); for (unsigned iter = 0; iter < num_iterations; iter++) { if (indirect_offset.file == BAD_FILE) { /* Constant indexing - use global offset. */ @@ -2459,7 +2460,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, inst->mlen = 2; } inst->regs_written = -(num_components * type_sz(dst.type) / 4) + first_component; +((num_components + first_component) * type_sz(dst.type) / 4); /* If we are reading 64-bit data using 32-bit read messages we need * build proper 64-bit data elements by shuffling the low and high @@ -2720,9 +2721,13 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, */ unsigned num_iterations = 1; unsigned iter_components = num_components; - if (is_64bit && instr->num_components > 2) { - num_iterations = 2; - iter_components = 2; + unsigned first_component = nir_intrinsic_component(instr); + if (is_64bit) { + first_component = first_component / 2; + if (instr->num_components > 2) { +num_iterations = 2; +iter_components = 2; + } } /* 64-bit data needs to me shuffled before we can write it to the URB. 
@@ -2732,7 +2737,6 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, fs_reg tmp = fs_reg(VGRF, alloc.allocate(2 * iter_components), value.type); - unsigned first_component = nir_intrinsic_component(instr); mask = mask << first_component; for (unsigned iter = 0; iter < num_iterations; iter++) { @@ -2794,14 +2798,15 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, unsigned idx = 2 * i; bld.MOV(dest, offset(tmp, bld, idx)); bld.MOV(offset(dest, bld, 1), offset(tmp, bld, idx + 1)); - srcs[header_regs + idx] = dest; - srcs[header_regs + idx + 1] = offset(dest, bld, 1); + srcs[header_regs + idx + first_component * 2] = dest; + srcs[header_regs + idx + 1 + first_component * 2] = + offset(dest, bld, 1); } } unsigned mlen = header_regs + (is_64bit ? 2 * iter_components : iter_components) + -first_component; +(is_64bit ? 2 * first_component : first_component); fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, mlen); bld.LOAD_PAYLOAD(payload, srcs, mlen, header_regs); @@ -2898,6 +2903,10 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld, unsigned imm_offset = instr->const_index[0]; unsigned first_component = nir_intrinsic_component(instr); + if (type_sz(dest.type) == 8) { + first_component = first_component / 2; + } + fs_inst *inst; if (indirect_offset.file == BAD_FILE) { /* Arbitrarily only push up to 32 vec4 slots worth of data, -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
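The 64-bit handling in this patch comes down to a unit conversion: nir_intrinsic_component() is given in 32-bit components, so for 8-byte types it is halved to count doubles, and regs_written is then computed from the total number of elements read, scaled by the element size in 32-bit registers. A sketch of that arithmetic with a hypothetical helper name:

```c
#include <assert.h>

/* Sketch of the regs_written computation after this patch. Not Mesa code:
 * the parameters stand in for instr->num_components (per iteration),
 * nir_intrinsic_component(instr) and type_sz(dst.type).
 */
static unsigned
regs_written_for_load(unsigned num_components, unsigned component_32bit,
                      unsigned type_size_bytes)
{
   unsigned first_component = component_32bit;

   /* For doubles, count the offset in 64-bit elements. */
   if (type_size_bytes == 8)
      first_component /= 2;

   return (num_components + first_component) * type_size_bytes / 4;
}
```

For 32-bit types this gives the same result as the earlier expression (num_components * type_sz / 4) + first_component; only the 64-bit case changes, because each skipped double occupies two registers.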
[Mesa-dev] [PATCH 06/18] i965: add component packing support for tes
Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 38 +++- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 6d695f1..6033e5e 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2405,10 +2405,21 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, dst = tmp; } + unsigned first_component = nir_intrinsic_component(instr); for (unsigned iter = 0; iter < num_iterations; iter++) { if (indirect_offset.file == BAD_FILE) { /* Constant indexing - use global offset. */ -inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle); +if (first_component != 0) { + unsigned read_components = num_components + first_component; + fs_reg tmp = bld.vgrf(dst.type, read_components); + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, icp_handle); + for (unsigned i = 0; i < num_components; i++) { + bld.MOV(offset(dst, bld, i), + offset(tmp, bld, i + first_component)); + } +} else { + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, icp_handle); +} inst->offset = imm_offset; inst->mlen = 1; inst->base_mrf = -1; @@ -2423,7 +2434,8 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, inst->base_mrf = -1; inst->mlen = 2; } - inst->regs_written = num_components * type_sz(dst.type) / 4; + inst->regs_written = +(num_components * type_sz(dst.type) / 4) + first_component; /* If we are reading 64-bit data using 32-bit read messages we need * build proper 64-bit data elements by shuffling the low and high @@ -2827,6 +2839,7 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld, case nir_intrinsic_load_per_vertex_input: { fs_reg indirect_offset = get_indirect_offset(instr); unsigned imm_offset = instr->const_index[0]; + unsigned first_component = nir_intrinsic_component(instr); fs_inst *inst; if (indirect_offset.file == BAD_FILE) { @@ -2837,7 +2850,8 @@ fs_visitor::nir_emit_tes_intrinsic(const 
fs_builder &bld, if (imm_offset < max_push_slots) { fs_reg src = fs_reg(ATTR, imm_offset / 2, dest.type); for (int i = 0; i < instr->num_components; i++) { - unsigned comp = 16 / type_sz(dest.type) * (imm_offset % 2) + i; + unsigned comp = 16 / type_sz(dest.type) * (imm_offset % 2) + + i + first_component; bld.MOV(offset(dest, bld, i), component(src, comp)); } tes_prog_data->base.urb_read_length = @@ -2851,11 +2865,25 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld, fs_reg patch_handle = bld.vgrf(BRW_REGISTER_TYPE_UD, 1); bld.LOAD_PAYLOAD(patch_handle, srcs, ARRAY_SIZE(srcs), 0); -inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dest, patch_handle); +if (first_component != 0) { + unsigned read_components = + instr->num_components + first_component; + fs_reg tmp = bld.vgrf(dest.type, read_components); + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, + patch_handle); + inst->regs_written = read_components; + for (unsigned i = 0; i < instr->num_components; i++) { + bld.MOV(offset(dest, bld, i), + offset(tmp, bld, i + first_component)); + } +} else { + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dest, + patch_handle); + inst->regs_written = instr->num_components; +} inst->mlen = 1; inst->offset = imm_offset; inst->base_mrf = -1; -inst->regs_written = instr->num_components; } } else { /* Indirect indexing - use per-slot offsets as well. */ -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
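The load paths in this patch share one pattern: when first_component is non-zero, read num_components + first_component values from the start of the slot into a temporary, then move only the wanted components into the destination. A toy model of that pattern on plain arrays, where the array copy stands in for the URB read and the second loop for the bld.MOV() calls (names are hypothetical):

```c
#include <assert.h>

enum { SLOT_COMPONENTS = 4 };

/* Toy model of the read-then-move pattern; not Mesa code. */
static void
load_packed(const float slot[SLOT_COMPONENTS], unsigned first_component,
            unsigned num_components, float *dst)
{
   assert(first_component + num_components <= SLOT_COMPONENTS);

   float tmp[SLOT_COMPONENTS];
   unsigned read_components = num_components + first_component;

   /* Stands in for the URB read into the temporary register. */
   for (unsigned i = 0; i < read_components; i++)
      tmp[i] = slot[i];

   /* Stands in for the MOVs that drop the leading components. */
   for (unsigned i = 0; i < num_components; i++)
      dst[i] = tmp[i + first_component];
}
```

The read always starts at component 0 of the slot, which is why the temporary and regs_written have to account for the skipped leading components as well.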
[Mesa-dev] [PATCH 18/18] docs: mark ARB_enhanced_layouts as DONE for i965
---
 docs/GL3.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 0204695..b0573c8 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -193,11 +193,11 @@ GL 4.4, GLSL 4.40:
   GL_MAX_VERTEX_ATTRIB_STRIDE                           DONE (all drivers)
   GL_ARB_buffer_storage                                 DONE (i965, nv50, nvc0, r600, radeonsi)
   GL_ARB_clear_texture                                  DONE (i965, nv50, nvc0)
-  GL_ARB_enhanced_layouts                               in progress (Timothy)
+  GL_ARB_enhanced_layouts                               DONE (i965)
   - compile-time constant expressions                   DONE
   - explicit byte offsets for blocks                    DONE
   - forced alignment within blocks                      DONE
-  - specified vec4-slot component numbers               in progress
+  - specified vec4-slot component numbers               DONE (i965)
   - specified transform/feedback layout                 DONE
   - input/output block locations                        DONE
   GL_ARB_multi_bind                                     DONE (all drivers)
-- 
2.5.5
[Mesa-dev] [PATCH 12/18] i965: add component packing support for load_output intrinsics
--- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 38 +++- 1 file changed, 33 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 75737c1..c18e7b6 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2507,6 +2507,7 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, case nir_intrinsic_load_per_vertex_output: { fs_reg indirect_offset = get_indirect_offset(instr); unsigned imm_offset = instr->const_index[0]; + unsigned first_component = nir_intrinsic_component(instr); fs_inst *inst; if (indirect_offset.file == BAD_FILE) { @@ -2590,11 +2591,25 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, } bld.LOAD_PAYLOAD(dst, srcs, num_components, 0); } else { -inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, patch_handle); +if (first_component != 0) { + unsigned read_components = + instr->num_components + first_component; + fs_reg tmp = bld.vgrf(dst.type, read_components); + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, + patch_handle); + inst->regs_written = read_components; + for (unsigned i = 0; i < instr->num_components; i++) { + bld.MOV(offset(dst, bld, i), + offset(tmp, bld, i + first_component)); + } +} else { + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, dst, + patch_handle); + inst->regs_written = instr->num_components; +} inst->offset = imm_offset; inst->mlen = 1; inst->base_mrf = -1; -inst->regs_written = instr->num_components; } } else { /* Indirect indexing - use per-slot offsets as well. 
*/ @@ -2604,12 +2619,25 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, }; fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0); - - inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload); + if (first_component != 0) { +unsigned read_components = + instr->num_components + first_component; +fs_reg tmp = bld.vgrf(dst.type, read_components); +inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp, +payload); +inst->regs_written = read_components; +for (unsigned i = 0; i < instr->num_components; i++) { + bld.MOV(offset(dst, bld, i), + offset(tmp, bld, i + first_component)); +} + } else { +inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, +payload); +inst->regs_written = instr->num_components; + } inst->offset = imm_offset; inst->mlen = 2; inst->base_mrf = -1; - inst->regs_written = instr->num_components; } break; } -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 15/18] i965: add double packing support to gs inputs
Reviewed-by: Kenneth Graunke
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index c18e7b6..9f890ca 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -2110,6 +2110,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
       }
       fs_reg tmp = fs_reg(VGRF, alloc.allocate(4), dst.type);
       tmp_dst = tmp;
+      first_component = first_component / 2;
    }
 
    for (unsigned iter = 0; iter < num_iterations; iter++) {
@@ -2119,7 +2120,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
             unsigned read_components = num_components + first_component;
             fs_reg tmp = bld.vgrf(dst.type, read_components);
             inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, icp_handle);
-            inst->regs_written = read_components;
+            inst->regs_written = read_components * type_sz(tmp_dst.type) / 4;
             for (unsigned i = 0; i < num_components; i++) {
                bld.MOV(offset(tmp_dst, bld, i),
                        offset(tmp, bld, i + first_component));
@@ -2142,7 +2143,7 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst,
             if (first_component != 0) {
                inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp,
                                payload);
-               inst->regs_written = read_components;
+               inst->regs_written = read_components * type_sz(tmp_dst.type) / 4;
                for (unsigned i = 0; i < num_components; i++) {
                   bld.MOV(offset(tmp_dst, bld, i),
                           offset(tmp, bld, i + first_component));
-- 
2.5.5
[Mesa-dev] [PATCH 10/18] i965: add indirect packing support for tcs and tes
Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 33 1 file changed, 29 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 587549f..4eaf5ea 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -2428,8 +2428,19 @@ fs_visitor::nir_emit_tcs_intrinsic(const fs_builder &bld, const fs_reg srcs[] = { icp_handle, indirect_offset }; fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0); - -inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, payload); +if (first_component != 0) { + unsigned read_components = num_components + first_component; + fs_reg tmp = bld.vgrf(dst.type, read_components); + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp, + payload); + for (unsigned i = 0; i < num_components; i++) { + bld.MOV(offset(dst, bld, i), + offset(tmp, bld, i + first_component)); + } +} else { + inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dst, + payload); +} inst->offset = imm_offset; inst->base_mrf = -1; inst->mlen = 2; @@ -2899,11 +2910,25 @@ fs_visitor::nir_emit_tes_intrinsic(const fs_builder &bld, fs_reg payload = bld.vgrf(BRW_REGISTER_TYPE_UD, 2); bld.LOAD_PAYLOAD(payload, srcs, ARRAY_SIZE(srcs), 0); - inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dest, payload); + if (first_component != 0) { +unsigned read_components = +instr->num_components + first_component; +fs_reg tmp = bld.vgrf(dest.type, read_components); +inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, tmp, +payload); +inst->regs_written = read_components; +for (unsigned i = 0; i < instr->num_components; i++) { + bld.MOV(offset(dest, bld, i), + offset(tmp, bld, i + first_component)); +} + } else { +inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8_PER_SLOT, dest, +payload); +inst->regs_written = instr->num_components; + } inst->mlen = 2; inst->offset = imm_offset; 
inst->base_mrf = -1; - inst->regs_written = instr->num_components; } break; } -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 01/18] nir: add new intrinsic field for storing component offset
This offset is used for packing. Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir.h| 6 ++ src/compiler/nir/nir_intrinsics.h | 12 ++-- src/compiler/nir/nir_lower_io.c | 8 src/compiler/nir/nir_print.c | 3 +++ 4 files changed, 23 insertions(+), 6 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index ec7b0c7..d5e4733 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -987,6 +987,11 @@ typedef enum { */ NIR_INTRINSIC_BINDING = 7, + /** +* Component offset. +*/ + NIR_INTRINSIC_COMPONENT = 8, + NIR_INTRINSIC_NUM_INDEX_FLAGS, } nir_intrinsic_index_flag; @@ -1053,6 +1058,7 @@ INTRINSIC_IDX_ACCESSORS(ucp_id, UCP_ID, unsigned) INTRINSIC_IDX_ACCESSORS(range, RANGE, unsigned) INTRINSIC_IDX_ACCESSORS(desc_set, DESC_SET, unsigned) INTRINSIC_IDX_ACCESSORS(binding, BINDING, unsigned) +INTRINSIC_IDX_ACCESSORS(component, COMPONENT, unsigned) /** * \group texture information diff --git a/src/compiler/nir/nir_intrinsics.h b/src/compiler/nir/nir_intrinsics.h index 6f86c9f..19df191 100644 --- a/src/compiler/nir/nir_intrinsics.h +++ b/src/compiler/nir/nir_intrinsics.h @@ -336,15 +336,15 @@ LOAD(uniform, 1, 2, BASE, RANGE, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC /* src[] = { buffer_index, offset }. No const_index */ LOAD(ubo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) /* src[] = { offset }. const_index[] = { base } */ -LOAD(input, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +LOAD(input, 1, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) /* src[] = { vertex, offset }. const_index[] = { base } */ -LOAD(per_vertex_input, 2, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) +LOAD(per_vertex_input, 2, 2, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER) /* src[] = { buffer_index, offset }. No const_index */ LOAD(ssbo, 2, 0, xx, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE) /* src[] = { offset }. 
const_index[] = { base } */ -LOAD(output, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE) +LOAD(output, 1, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE) /* src[] = { vertex, offset }. const_index[] = { base } */ -LOAD(per_vertex_output, 2, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE) +LOAD(per_vertex_output, 2, 1, BASE, COMPONENT, xx, NIR_INTRINSIC_CAN_ELIMINATE) /* src[] = { offset }. const_index[] = { base } */ LOAD(shared, 1, 1, BASE, xx, xx, NIR_INTRINSIC_CAN_ELIMINATE) /* src[] = { offset }. const_index[] = { base, range } */ @@ -362,9 +362,9 @@ LOAD(push_constant, 1, 2, BASE, RANGE, xx, INTRINSIC(store_##name, srcs, ARR(0, 1, 1, 1), false, 0, 0, num_indices, idx0, idx1, idx2, flags) /* src[] = { value, offset }. const_index[] = { base, write_mask } */ -STORE(output, 2, 2, BASE, WRMASK, xx, 0) +STORE(output, 2, 3, BASE, WRMASK, COMPONENT, 0) /* src[] = { value, vertex, offset }. const_index[] = { base, write_mask } */ -STORE(per_vertex_output, 3, 2, BASE, WRMASK, xx, 0) +STORE(per_vertex_output, 3, 3, BASE, WRMASK, COMPONENT, 0) /* src[] = { value, block_index, offset }. const_index[] = { write_mask } */ STORE(ssbo, 3, 1, WRMASK, xx, xx, 0) /* src[] = { value, offset }. 
const_index[] = { base, write_mask } */ diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c index a839924..72f1b05 100644 --- a/src/compiler/nir/nir_lower_io.c +++ b/src/compiler/nir/nir_lower_io.c @@ -274,6 +274,10 @@ nir_lower_io_block(nir_block *block, nir_intrinsic_set_base(load, intrin->variables[0]->var->data.driver_location); + if (mode == nir_var_shader_in || mode == nir_var_shader_out) { +nir_intrinsic_set_component(load, + intrin->variables[0]->var->data.location_frac); + } if (load->intrinsic == nir_intrinsic_load_uniform) { nir_intrinsic_set_range(load, @@ -322,6 +326,10 @@ nir_lower_io_block(nir_block *block, nir_intrinsic_set_base(store, intrin->variables[0]->var->data.driver_location); + if (mode == nir_var_shader_out) { +nir_intrinsic_set_component(store, + intrin->variables[0]->var->data.location_frac); + } nir_intrinsic_set_write_mask(store, nir_intrinsic_write_mask(intrin)); if (per_vertex) diff --git a/src/compiler/nir/nir_print.c b/src/compiler/nir/nir_print.c index 36176ec..bca8a35 100644 --- a/src/compiler/nir/nir_print.c +++ b/src/compiler/nir/nir_print.c @@ -570,6 +570,7 @@ print_intrinsic_instr(nir_intrinsic_instr *instr, print_state *state) [NIR_INTRINSIC_RANGE] = "range", [NIR_INTRINSIC_DESC_SET] = "desc-set", [NIR_INTRINSIC_BINDING] = "binding", + [NIR_INTRINSIC_COMPONENT] = "component", }; for (unsigned idx = 1; idx < NIR_INTRINSIC_NUM_INDEX_FLAGS; idx++) { if (!info->index_map[id
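For readers following along, the const_index lookup that the new COMPONENT flag plugs into can be sketched roughly as below. This is a reduced illustration, not the NIR source: the enum, struct and function names are hypothetical, and it assumes the convention visible in the print loop above, where a zero entry in index_map means "this intrinsic has no such index" and a non-zero entry is the const_index slot plus one.

```c
#include <assert.h>

/* Hypothetical, reduced set of index flags; 0 is reserved for "not present". */
enum index_flag {
   IDX_BASE = 1,
   IDX_WRMASK = 2,
   IDX_COMPONENT = 3,
   IDX_NUM_FLAGS
};

/* Per-opcode description: index_map[flag] holds (slot + 1), or 0 if unused. */
struct intrinsic_info {
   unsigned num_indices;
   unsigned index_map[IDX_NUM_FLAGS];
};

struct intrinsic_instr {
   const struct intrinsic_info *info;
   int const_index[3];
};

/* Sketch of a store_output-like opcode: { base, write_mask, component }. */
static const struct intrinsic_info store_output_info = {
   .num_indices = 3,
   .index_map = { [IDX_BASE] = 1, [IDX_WRMASK] = 2, [IDX_COMPONENT] = 3 },
};

static int has_index(const struct intrinsic_instr *instr, enum index_flag flag)
{
   return instr->info->index_map[flag] != 0;
}

static void set_index(struct intrinsic_instr *instr, enum index_flag flag,
                      int value)
{
   assert(has_index(instr, flag));
   instr->const_index[instr->info->index_map[flag] - 1] = value;
}

static int get_index(const struct intrinsic_instr *instr, enum index_flag flag)
{
   assert(has_index(instr, flag));
   return instr->const_index[instr->info->index_map[flag] - 1];
}
```

With this layout, setting the component on a store lands in the third const_index slot, which is why the STORE() entries above bump num_indices from 2 to 3.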
[Mesa-dev] [PATCH 03/18] glsl/nir: add new num_packed_components field
This will be used to store the total number of components used at this location when packing via ARB_enhanced_layouts. --- src/compiler/glsl/glsl_to_nir.cpp | 1 + src/compiler/glsl/ir.h | 5 +++ src/compiler/glsl/link_varyings.cpp | 74 - src/compiler/glsl/linker.cpp| 2 + src/compiler/glsl/linker.h | 4 ++ src/compiler/nir/nir.h | 5 +++ 6 files changed, 89 insertions(+), 2 deletions(-) diff --git a/src/compiler/glsl/glsl_to_nir.cpp b/src/compiler/glsl/glsl_to_nir.cpp index daf237e..0663c69 100644 --- a/src/compiler/glsl/glsl_to_nir.cpp +++ b/src/compiler/glsl/glsl_to_nir.cpp @@ -375,6 +375,7 @@ nir_visitor::visit(ir_variable *ir) var->data.explicit_binding = ir->data.explicit_binding; var->data.has_initializer = ir->data.has_initializer; var->data.location_frac = ir->data.location_frac; + var->data.num_packed_components = ir->data.num_packed_components; switch (ir->data.depth_layout) { case ir_depth_layout_none: diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h index 3629356..4248e62 100644 --- a/src/compiler/glsl/ir.h +++ b/src/compiler/glsl/ir.h @@ -763,6 +763,11 @@ public: unsigned location_frac:2; /** + * The total number of components packed into this location. + */ + unsigned num_packed_components:4; + + /** * Layout of the matrix. Uses glsl_matrix_layout values. 
*/ unsigned matrix_layout:2; diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp index 534393a..22dc2d8 100644 --- a/src/compiler/glsl/link_varyings.cpp +++ b/src/compiler/glsl/link_varyings.cpp @@ -1972,6 +1972,70 @@ reserved_varying_slot(struct gl_shader *stage, ir_variable_mode io_mode) return slots; } +void +set_num_packed_components(struct gl_shader *shader, ir_variable_mode io_mode, + unsigned base_offset) +{ + /* Find the max number of components used at this location */ + unsigned num_components[MAX_VARYINGS_INCL_PATCH] = { 0 }; + + foreach_in_list(ir_instruction, node, shader->ir) { + ir_variable *const var = node->as_variable(); + + if (var == NULL || var->data.mode != io_mode || + !var->data.explicit_location) + continue; + + int idx = var->data.location - base_offset; + if (idx < 0 || idx >= MAX_VARYINGS_INCL_PATCH || + var->type->without_array()->is_record() || + var->type->without_array()->is_matrix()) + continue; + + if (var->type->is_array()) { + const glsl_type *type = get_varying_type(var, shader->Stage); + unsigned array_components = type->without_array()->vector_elements + +var->data.location_frac; + assert(type->arrays_of_arrays_size() + idx <= +ARRAY_SIZE(num_components)); + for (unsigned i = idx; i < type->arrays_of_arrays_size(); i++) { +num_components[i] = MAX2(array_components, num_components[i]); + } + } else { + unsigned comps = var->type->vector_elements + +var->data.location_frac; + num_components[idx] = MAX2(comps, num_components[idx]); + } + } + + foreach_in_list(ir_instruction, node, shader->ir) { + ir_variable *const var = node->as_variable(); + + if (var == NULL || var->data.mode != io_mode || + !var->data.explicit_location) + continue; + + int idx = var->data.location - base_offset; + if (idx < 0 || idx >= MAX_VARYINGS_INCL_PATCH || + var->type->without_array()->is_record() || + var->type->without_array()->is_matrix()) + continue; + + /* For arrays we need to check all elements in order to find 
the max + * number of components used. + */ + unsigned c = 0; + if (var->type->is_array()) { + const glsl_type *type = get_varying_type(var, shader->Stage); + for (unsigned i = idx; i < type->arrays_of_arrays_size(); i++) { +c = MAX2(c, num_components[i]); + } + } else { + c = num_components[idx]; + } + var->data.num_packed_components = c; + } +} /** * Assign locations for all variables that are produced in one pipeline stage @@ -2087,11 +2151,17 @@ assign_varying_locations(struct gl_context *ctx, * 4. Mark input variables in the consumer that do not have locations as *not being inputs. This lets the optimizer eliminate them. */ - if (consumer) + if (consumer) { canonicalize_shader_io(consumer->ir, ir_var_shader_in); + set_num_packed_components(consumer, ir_var_shader_in, +VARYING_SLOT_VAR0); + } - if (producer) + if (producer) { canonicalize_shader_io(producer->ir, ir_var_shader_out); + set_num_packed_components(producer, ir_var_shader_out, +VARYING_SLOT_VAR0); + } if (consumer) linke
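The two passes in set_num_packed_components() can be sketched in plain C as follows. This is a simplified stand-in, not the linker code: struct packed_var, MAX_SLOTS and the function name are hypothetical, slots are assumed to be already relative to the base offset, and arrays are modelled as covering slots slot .. slot + num_slots - 1, which is my reading of the intent rather than a copy of the loop bounds in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8 /* hypothetical stand-in for MAX_VARYINGS_INCL_PATCH */

/* Hypothetical, reduced view of a varying with an explicit location. */
struct packed_var {
   unsigned slot;                  /* location minus the base offset */
   unsigned location_frac;         /* first component used in the slot */
   unsigned vector_elements;       /* components of the element type */
   unsigned num_slots;             /* 1 for non-arrays, else array length */
   unsigned num_packed_components; /* result, filled in by pass 2 */
};

static unsigned max_u(unsigned a, unsigned b)
{
   return a > b ? a : b;
}

static void sketch_set_num_packed_components(struct packed_var *vars,
                                             size_t count)
{
   unsigned per_slot[MAX_SLOTS] = { 0 };

   /* Pass 1: record the highest component count seen in each slot. */
   for (size_t v = 0; v < count; v++) {
      unsigned comps = vars[v].vector_elements + vars[v].location_frac;
      assert(vars[v].slot + vars[v].num_slots <= MAX_SLOTS);
      for (unsigned i = 0; i < vars[v].num_slots; i++) {
         unsigned s = vars[v].slot + i;
         per_slot[s] = max_u(per_slot[s], comps);
      }
   }

   /* Pass 2: each variable takes the max over all slots it covers. */
   for (size_t v = 0; v < count; v++) {
      unsigned c = 0;
      for (unsigned i = 0; i < vars[v].num_slots; i++)
         c = max_u(c, per_slot[vars[v].slot + i]);
      vars[v].num_packed_components = c;
   }
}
```

A float at component 0 and a vec2 at component 2 of the same slot would both end up with num_packed_components == 4 under this sketch.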
[Mesa-dev] [PATCH 09/18] i965: add support for packing arrays
Here we add a new helper function calc_type_size_offset() to help calculate the size of a varying once packing is taken into account. --- src/compiler/nir/nir_lower_io.c | 55 +++-- 1 file changed, 48 insertions(+), 7 deletions(-) diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c index c25790a..b966348 100644 --- a/src/compiler/nir/nir_lower_io.c +++ b/src/compiler/nir/nir_lower_io.c @@ -41,6 +41,36 @@ struct lower_io_state { nir_variable_mode modes; }; +/** + * Calculates the offset for a type by allowing for other components that are + * packed into the same location. + */ +static unsigned +calc_type_size_offset(unsigned num_packed_components, + const struct glsl_type *type, + int (*type_size)(const struct glsl_type *)) +{ + unsigned base_size; + const struct glsl_type *wa = glsl_without_array(type); + int comp_diff = num_packed_components - glsl_get_vector_elements(wa); + + /* If there is no difference in component sizes or the type_size function +* being used treats everything as a vec4 return. +*/ + if (comp_diff <= 0 || + type_size(glsl_float_type()) == type_size(glsl_double_type())) + return 0; + + if (glsl_get_base_type(wa) == GLSL_TYPE_DOUBLE) { + base_size = type_size(glsl_dvec_type(comp_diff)); + } else { + base_size = type_size(glsl_vec_type(comp_diff)); + } + + return glsl_type_is_array(type) ? 
base_size * glsl_get_aoa_size(type) : + base_size; +} + void nir_assign_var_locations(struct exec_list *var_list, unsigned *size, unsigned base_offset, @@ -74,13 +104,17 @@ nir_assign_var_locations(struct exec_list *var_list, unsigned *size, if (locations[idx][var->data.index] == -1) { var->data.driver_location = location; locations[idx][var->data.index] = location; -location += type_size(var->type); +location += type_size(var->type) + + calc_type_size_offset(var->data.num_packed_components, + var->type, type_size); } else { var->data.driver_location = locations[idx][var->data.index]; } } else { var->data.driver_location = location; - location += type_size(var->type); + location += type_size(var->type) + +calc_type_size_offset(var->data.num_packed_components, var->type, + type_size); } } @@ -113,7 +147,8 @@ is_per_vertex_output(struct lower_io_state *state, nir_variable *var) static nir_ssa_def * get_io_offset(nir_builder *b, nir_deref_var *deref, nir_ssa_def **vertex_index, - int (*type_size)(const struct glsl_type *)) + int (*type_size)(const struct glsl_type *), + unsigned num_packed_components) { nir_deref *tail = &deref->deref; @@ -141,7 +176,9 @@ get_io_offset(nir_builder *b, nir_deref_var *deref, if (tail->deref_type == nir_deref_type_array) { nir_deref_array *deref_array = nir_deref_as_array(tail); - unsigned size = type_size(tail->type); + unsigned size = type_size(tail->type) + +calc_type_size_offset(num_packed_components, tail->type, + type_size); offset = nir_iadd(b, offset, nir_imm_int(b, size * deref_array->base_offset)); @@ -289,7 +326,9 @@ nir_lower_io_block(nir_block *block, offset = get_io_offset(b, intrin->variables[0], per_vertex ? &vertex_index : NULL, -state->type_size); +state->type_size, +intrin->variables[0]->var-> + data.num_packed_components); nir_intrinsic_instr *load = nir_intrinsic_instr_create(state->mem_ctx, @@ -339,7 +378,9 @@ nir_lower_io_block(nir_block *block, offset = get_io_offset(b, intrin->variables[0], per_vertex ? 
&vertex_index : NULL, -state->type_size); +state->type_size, +intrin->variables[0]->var-> + data.num_packed_components); nir_intrinsic_instr *store = nir_intrinsic_instr_create(state->mem_ctx, @@ -381,7 +422,7 @@ nir_lower_io_block(nir_block *block, nir_ssa_def *offset; offset = get_io_offset(b, intrin->variables[0], -NULL, state->type_size); +NULL, state->type_size, 0); nir_intrinsic_instr *atomic = nir_intrinsic_instr_create(state->mem_ctx, -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop
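As a rough illustration of what calc_type_size_offset() adds to a varying's size, here is a standalone sketch. It assumes a scalar type_size (one unit per 32-bit component, two per double component) and leaves out the "everything is a vec4" early return; struct simple_type and all function names are hypothetical and only mirror the shape of the helper above.

```c
#include <assert.h>

/* Hypothetical, reduced type description. */
struct simple_type {
   unsigned vector_elements; /* 1..4 */
   unsigned array_len;       /* 0 means "not an array" */
   int is_double;
};

/* Scalar sizing: one unit per float component, two per double component. */
static unsigned scalar_size(unsigned components, int is_double)
{
   return components * (is_double ? 2u : 1u);
}

/* Size of the type on its own, ignoring packing. */
static unsigned own_size(const struct simple_type *t)
{
   unsigned elem = scalar_size(t->vector_elements, t->is_double);
   return t->array_len ? elem * t->array_len : elem;
}

/* Extra space needed for the other components packed into the location. */
static unsigned packing_padding(unsigned num_packed_components,
                                const struct simple_type *t)
{
   assert(t->vector_elements >= 1 && t->vector_elements <= 4);
   int comp_diff = (int)num_packed_components - (int)t->vector_elements;

   if (comp_diff <= 0)
      return 0;

   unsigned base = scalar_size((unsigned)comp_diff, t->is_double);
   return t->array_len ? base * t->array_len : base;
}

/* What the location counter advances by in this sketch. */
static unsigned packed_size(unsigned num_packed_components,
                            const struct simple_type *t)
{
   return own_size(t) + packing_padding(num_packed_components, t);
}
```

For a float[3] that shares its locations with other components up to a total of four, the sketch yields 3 + 9 = 12 units, i.e. three full slots.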
[Mesa-dev] [PATCH 04/18] i965: enable component packing for vs and fs
--- src/mesa/drivers/dri/i965/brw_fs.cpp | 20 src/mesa/drivers/dri/i965/brw_fs.h | 5 +++-- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 29 - 3 files changed, 35 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 8774f25..1fdb654 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -1127,7 +1127,8 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name, const glsl_type *type, glsl_interp_qualifier interpolation_mode, int *location, bool mod_centroid, - bool mod_sample) + bool mod_sample, + unsigned num_packed_components) { assert(stage == MESA_SHADER_FRAGMENT); brw_wm_prog_data *prog_data = (brw_wm_prog_data*) this->prog_data; @@ -1149,22 +1150,26 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name, for (unsigned i = 0; i < length; i++) { emit_general_interpolation(attr, name, elem_type, interpolation_mode, -location, mod_centroid, mod_sample); +location, mod_centroid, mod_sample, +num_packed_components); } } else if (type->is_record()) { for (unsigned i = 0; i < type->length; i++) { const glsl_type *field_type = type->fields.structure[i].type; emit_general_interpolation(attr, name, field_type, interpolation_mode, -location, mod_centroid, mod_sample); +location, mod_centroid, mod_sample, +num_packed_components); } } else { assert(type->is_scalar() || type->is_vector()); + unsigned num_components = num_packed_components ? + num_packed_components : type->vector_elements; if (prog_data->urb_setup[*location] == -1) { /* If there's no incoming setup data for this slot, don't * emit interpolation for it. */ - *attr = offset(*attr, bld, type->vector_elements); + *attr = offset(*attr, bld, num_components); (*location)++; return; } @@ -1176,7 +1181,6 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name, * handed us defined values in only the constant offset * field of the setup reg. 
*/ - unsigned vector_elements = type->vector_elements; /* Data starts at suboffet 3 in 32-bit units (12 bytes), so it is not * 64-bit aligned and the current implementation fails to read the @@ -1184,10 +1188,10 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name, * read it as vector of floats with twice the number of components. */ if (attr->type == BRW_REGISTER_TYPE_DF) { -vector_elements *= 2; +num_components *= 2; attr->type = BRW_REGISTER_TYPE_F; } - for (unsigned int i = 0; i < vector_elements; i++) { + for (unsigned int i = 0; i < num_components; i++) { struct brw_reg interp = interp_reg(*location, i); interp = suboffset(interp, 3); interp.type = attr->type; @@ -1196,7 +1200,7 @@ fs_visitor::emit_general_interpolation(fs_reg *attr, const char *name, } } else { /* Smooth/noperspective interpolation case. */ - for (unsigned int i = 0; i < type->vector_elements; i++) { + for (unsigned int i = 0; i < num_components; i++) { struct brw_reg interp = interp_reg(*location, i); if (devinfo->needs_unlit_centroid_workaround && mod_centroid) { /* Get the pixel/sample mask into f0 so that we know diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 4237197..fc85206 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -181,7 +181,7 @@ public: const glsl_type *type, glsl_interp_qualifier interpolation_mode, int *location, bool mod_centroid, - bool mod_sample); + bool mod_sample, unsigned num_components); fs_reg *emit_vs_system_value(int location); void emit_interpolation_setup_gen4(); void emit_interpolation_setup_gen6(); @@ -200,7 +200,8 @@ public: void emit_nir_code(); void nir_setup_inputs(); void nir_setup_single_output_varying(fs_reg *reg, const glsl_type *type, -unsigned
[Mesa-dev] [PATCH 05/18] i965: add component packing support for gs
Reviewed-by: Kenneth Graunke --- src/mesa/drivers/dri/i965/brw_fs.h | 2 +- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 22 ++ 2 files changed, 19 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index fc85206..0c72802 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -266,7 +266,7 @@ public: void emit_gs_thread_end(); void emit_gs_input_load(const fs_reg &dst, const nir_src &vertex_src, unsigned base_offset, const nir_src &offset_src, - unsigned num_components); + unsigned num_components, unsigned first_component); void emit_cs_terminate(); fs_reg *emit_cs_work_group_id_setup(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index b90cc8b..6d695f1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -1980,7 +1980,8 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst, const nir_src &vertex_src, unsigned base_offset, const nir_src &offset_src, - unsigned num_components) + unsigned num_components, + unsigned first_component) { struct brw_gs_prog_data *gs_prog_data = (struct brw_gs_prog_data *) prog_data; @@ -2114,11 +2115,23 @@ fs_visitor::emit_gs_input_load(const fs_reg &dst, for (unsigned iter = 0; iter < num_iterations; iter++) { if (offset_const) { /* Constant indexing - use global offset. 
*/ - inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp_dst, icp_handle); + if (first_component != 0) { +unsigned read_components = num_components + first_component; +fs_reg tmp = bld.vgrf(dst.type, read_components); +inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp, icp_handle); +inst->regs_written = read_components; +for (unsigned i = 0; i < num_components; i++) { + bld.MOV(offset(tmp_dst, bld, i), + offset(tmp, bld, i + first_component)); +} + } else { +inst = bld.emit(SHADER_OPCODE_URB_READ_SIMD8, tmp_dst, +icp_handle); +inst->regs_written = num_components * type_sz(tmp_dst.type) / 4; + } inst->offset = base_offset + offset_const->u32[0]; inst->base_mrf = -1; inst->mlen = 1; - inst->regs_written = num_components * type_sz(tmp_dst.type) / 4; } else { /* Indirect indexing - use per-slot offsets as well. */ const fs_reg srcs[] = { icp_handle, indirect_offset }; @@ -2891,7 +2904,8 @@ fs_visitor::nir_emit_gs_intrinsic(const fs_builder &bld, case nir_intrinsic_load_per_vertex_input: emit_gs_input_load(dest, instr->src[0], instr->const_index[0], - instr->src[1], instr->num_components); + instr->src[1], instr->num_components, + nir_intrinsic_component(instr)); break; case nir_intrinsic_emit_vertex_with_counter: -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
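The constant-offset path above boils down to "read from the start of the slot, then copy out the components you actually want". A minimal CPU-side sketch of that idea follows; the function name and SLOT_COMPONENTS are hypothetical, and the memcpy merely stands in for the URB read into the temporary register.

```c
#include <assert.h>
#include <string.h>

#define SLOT_COMPONENTS 4 /* hypothetical: one vec4-sized slot */

/*
 * Copy num_components values starting at first_component out of a slot.
 * As in the patch, the read always starts at component 0 and covers
 * num_components + first_component values; the wanted ones are then moved.
 */
static void load_packed_input(const float slot[SLOT_COMPONENTS],
                              unsigned first_component,
                              unsigned num_components,
                              float *dst)
{
   unsigned read_components = num_components + first_component;
   float tmp[SLOT_COMPONENTS];

   assert(read_components <= SLOT_COMPONENTS);

   /* Stand-in for the URB read into a temporary. */
   memcpy(tmp, slot, read_components * sizeof(float));

   /* Stand-in for the per-component MOVs. */
   for (unsigned i = 0; i < num_components; i++)
      dst[i] = tmp[i + first_component];
}
```

When first_component is 0 the copy is an identity, which matches the patch keeping the direct read into tmp_dst for that case.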
[Mesa-dev] [PATCH 02/18] nir: use the same driver location for packed varyings
Reviewed-by: Kenneth Graunke --- src/compiler/nir/nir.h| 4 ++-- src/compiler/nir/nir_lower_io.c | 28 ++-- src/mesa/drivers/dri/i965/brw_nir.c | 8 +--- src/mesa/drivers/dri/i965/brw_program.c | 4 ++-- src/mesa/state_tracker/st_glsl_to_nir.cpp | 3 +++ 5 files changed, 38 insertions(+), 9 deletions(-) diff --git a/src/compiler/nir/nir.h b/src/compiler/nir/nir.h index d5e4733..4ade03a 100644 --- a/src/compiler/nir/nir.h +++ b/src/compiler/nir/nir.h @@ -2310,8 +2310,8 @@ void nir_lower_io_to_temporaries(nir_shader *shader, nir_function *entrypoint, void nir_shader_gather_info(nir_shader *shader, nir_function_impl *entrypoint); -void nir_assign_var_locations(struct exec_list *var_list, - unsigned *size, +void nir_assign_var_locations(struct exec_list *var_list, unsigned *size, + unsigned base_offset, int (*type_size)(const struct glsl_type *)); void nir_lower_io(nir_shader *shader, diff --git a/src/compiler/nir/nir_lower_io.c b/src/compiler/nir/nir_lower_io.c index 72f1b05..c25790a 100644 --- a/src/compiler/nir/nir_lower_io.c +++ b/src/compiler/nir/nir_lower_io.c @@ -43,10 +43,18 @@ struct lower_io_state { void nir_assign_var_locations(struct exec_list *var_list, unsigned *size, + unsigned base_offset, int (*type_size)(const struct glsl_type *)) { unsigned location = 0; + /* There are 32 regular and 32 patch varyings allowed */ + int locations[64][2]; + for (unsigned i = 0; i < 64; i++) { + for (unsigned j = 0; j < 2; j++) + locations[i][j] = -1; + } + nir_foreach_variable(var, var_list) { /* * UBO's have their own address spaces, so don't count them towards the @@ -56,8 +64,24 @@ nir_assign_var_locations(struct exec_list *var_list, unsigned *size, var->interface_type != NULL) continue; - var->data.driver_location = location; - location += type_size(var->type); + /* Make sure we give the same location to varyings packed with + * ARB_enhanced_layouts. 
+ */ + int idx = var->data.location - base_offset; + if (base_offset && idx >= 0) { + assert(idx < ARRAY_SIZE(locations)); + + if (locations[idx][var->data.index] == -1) { +var->data.driver_location = location; +locations[idx][var->data.index] = location; +location += type_size(var->type); + } else { +var->data.driver_location = locations[idx][var->data.index]; + } + } else { + var->data.driver_location = location; + location += type_size(var->type); + } } *size = location; diff --git a/src/mesa/drivers/dri/i965/brw_nir.c b/src/mesa/drivers/dri/i965/brw_nir.c index d8cf12d..6c3e1d1 100644 --- a/src/mesa/drivers/dri/i965/brw_nir.c +++ b/src/mesa/drivers/dri/i965/brw_nir.c @@ -282,7 +282,8 @@ brw_nir_lower_tes_inputs(nir_shader *nir, const struct brw_vue_map *vue_map) void brw_nir_lower_fs_inputs(nir_shader *nir) { - nir_assign_var_locations(&nir->inputs, &nir->num_inputs, type_size_scalar); + nir_assign_var_locations(&nir->inputs, &nir->num_inputs, VARYING_SLOT_VAR0, +type_size_scalar); nir_lower_io(nir, nir_var_shader_in, type_size_scalar); } @@ -292,6 +293,7 @@ brw_nir_lower_vue_outputs(nir_shader *nir, { if (is_scalar) { nir_assign_var_locations(&nir->outputs, &nir->num_outputs, + VARYING_SLOT_VAR0, type_size_scalar); nir_lower_io(nir, nir_var_shader_out, type_size_scalar); } else { @@ -330,14 +332,14 @@ void brw_nir_lower_fs_outputs(nir_shader *nir) { nir_assign_var_locations(&nir->outputs, &nir->num_outputs, -type_size_scalar); +FRAG_RESULT_DATA0, type_size_scalar); nir_lower_io(nir, nir_var_shader_out, type_size_scalar); } void brw_nir_lower_cs_shared(nir_shader *nir) { - nir_assign_var_locations(&nir->shared, &nir->num_shared, + nir_assign_var_locations(&nir->shared, &nir->num_shared, 0, type_size_scalar_bytes); nir_lower_io(nir, nir_var_shared, type_size_scalar_bytes); } diff --git a/src/mesa/drivers/dri/i965/brw_program.c b/src/mesa/drivers/dri/i965/brw_program.c index a1a8116..2eec7fc 100644 --- a/src/mesa/drivers/dri/i965/brw_program.c +++ 
b/src/mesa/drivers/dri/i965/brw_program.c @@ -51,11 +51,11 @@ static void brw_nir_lower_uniforms(nir_shader *nir, bool is_scalar) { if (is_scalar) { - nir_assign_var_locations(&nir->uniforms, &nir->num_uniforms, + nir_assign_var_locations(&nir->uniforms, &nir->num_uniforms, 0, type_size_scalar_bytes); nir_lower_io(nir, nir_var_uniform, type_size_scalar_bytes); } else { - nir_as
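The location sharing in nir_assign_var_locations() can be sketched outside of NIR like this. struct io_var and the function name are hypothetical, sizes are passed in directly instead of going through a type_size callback, and a base_offset of 0 keeps the old purely sequential behaviour, as in the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCATIONS 64 /* 32 regular + 32 patch varyings, as in the patch */

/* Hypothetical, reduced view of a shader input/output variable. */
struct io_var {
   int location;             /* API-level location */
   unsigned index;           /* 0 or 1 */
   unsigned size;            /* what type_size() would return */
   unsigned driver_location; /* result */
};

static unsigned assign_driver_locations(struct io_var *vars, size_t count,
                                        int base_offset)
{
   int seen[MAX_LOCATIONS][2];
   unsigned next = 0;

   for (unsigned i = 0; i < MAX_LOCATIONS; i++)
      for (unsigned j = 0; j < 2; j++)
         seen[i][j] = -1;

   for (size_t v = 0; v < count; v++) {
      int idx = vars[v].location - base_offset;

      if (base_offset && idx >= 0) {
         assert(idx < MAX_LOCATIONS && vars[v].index < 2);

         if (seen[idx][vars[v].index] == -1) {
            /* First variable at this location: allocate new space. */
            vars[v].driver_location = next;
            seen[idx][vars[v].index] = (int)next;
            next += vars[v].size;
         } else {
            /* Packed with an earlier variable: reuse its driver location. */
            vars[v].driver_location = (unsigned)seen[idx][vars[v].index];
         }
      } else {
         vars[v].driver_location = next;
         next += vars[v].size;
      }
   }

   return next; /* total size, what *size receives in the patch */
}
```

Note that in this sketch, as in this patch on its own, a second variable at an already-seen location does not grow the total size.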
[Mesa-dev] V3 ARB_enhanced_layouts packing support for i965 Gen8+
V3:
- Rewrite patch 9 (add support for packing arrays) to not add hacks to the type_size() functions.
- Add packing support for the load_output intrinsics (patch 12)
- Add glsl_dvec_type() helper (patch 8)

V2:
- validation fixes patches 1-2
- added support for packing doubles now that explicit location fixes have landed.
- fix various issues with intel debug output with new COMPONENT const index.

This adds component packing support for Gen8+.

Series can be found in my component_packing_backend6 branch:
https://github.com/tarceri/Mesa_arrays_of_arrays.git
Re: [Mesa-dev] [PATCH] i965: remove type_size_vec4_times_4()
On Tuesday, June 14, 2016 4:53:22 PM PDT Timothy Arceri wrote: > type_size_vec4_times_4() was introduced as a fix in 8dcf807cb43383 > however since 3810c1561 we can just use type_size_scalar() and > get the actual number of outputs we need. > > Cc: Kenneth Graunke > --- > Hi Ken, > > I'm looking into the other suggestions you made on IRC so this may all just > go away but seems like a good idea to clean this up in the meantime. > > src/mesa/drivers/dri/i965/brw_fs.cpp | 13 - > src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 2 +- > src/mesa/drivers/dri/i965/brw_nir.c | 4 ++-- > src/mesa/drivers/dri/i965/brw_shader.h | 1 - > 4 files changed, 3 insertions(+), 17 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp > b/src/mesa/drivers/dri/i965/brw_fs.cpp > index 0347b0a..8774f25 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp > @@ -505,19 +505,6 @@ type_size_scalar(const struct glsl_type *type) > return 0; > } > > -/** > - * Returns the number of scalar components needed to store type, assuming > - * that vectors are padded out to vec4. > - * > - * This has the packing rules of type_size_vec4(), but counts components > - * similar to type_size_scalar(). > - */ > -extern "C" int > -type_size_vec4_times_4(const struct glsl_type *type) > -{ > - return 4 * type_size_vec4(type); > -} > - > /* Attribute arrays are loaded as one vec4 per element (or matrix column), > * except for double-precision types, which are loaded as one dvec4. 
> */ > diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > index a956f9d..b811953 100644 > --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp > @@ -108,7 +108,7 @@ fs_visitor::nir_setup_single_output_varying(fs_reg *reg, >for (unsigned count = 0; count < num_elements; count += 4) { > this->outputs[*location] = *reg; > this->output_components[*location] = MIN2(4, num_elements - count); > - *reg = offset(*reg, bld, 4); > + *reg = offset(*reg, bld, this->output_components[*location]); > (*location)++; >} > } > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c > b/src/mesa/drivers/dri/i965/brw_nir.c > index e01f160..d8cf12d 100644 > --- a/src/mesa/drivers/dri/i965/brw_nir.c > +++ b/src/mesa/drivers/dri/i965/brw_nir.c > @@ -292,8 +292,8 @@ brw_nir_lower_vue_outputs(nir_shader *nir, > { > if (is_scalar) { >nir_assign_var_locations(&nir->outputs, &nir->num_outputs, > - type_size_vec4_times_4); > - nir_lower_io(nir, nir_var_shader_out, type_size_vec4_times_4); > + type_size_scalar); > + nir_lower_io(nir, nir_var_shader_out, type_size_scalar); > } else { >nir_foreach_variable(var, &nir->outputs) > var->data.driver_location = var->data.location; > diff --git a/src/mesa/drivers/dri/i965/brw_shader.h > b/src/mesa/drivers/dri/i965/brw_shader.h > index 656dc89..9300f20 100644 > --- a/src/mesa/drivers/dri/i965/brw_shader.h > +++ b/src/mesa/drivers/dri/i965/brw_shader.h > @@ -294,7 +294,6 @@ struct gl_shader *brw_new_shader(struct gl_context *ctx, > GLuint name, GLuint typ > int type_size_scalar(const struct glsl_type *type); > int type_size_vec4(const struct glsl_type *type); > int type_size_dvec4(const struct glsl_type *type); > -int type_size_vec4_times_4(const struct glsl_type *type); > int type_size_vs_input(const struct glsl_type *type); > > unsigned tesslevel_outer_components(GLenum tes_primitive_mode); > I was skeptical, but this looks correct. 
This only applies to shadowed outputs, and just controls the packing within the fs_reg we allocate for those outputs. The URB layout remains the same. It appears that we only needed this prior to the commit you referenced because the old code was buggy. Now that it's fixed, it doesn't matter. I think this is fine, then. Presumably you've run it through Jenkins and everything was happy? Reviewed-by: Kenneth Graunke
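To make the packing point concrete, here is a small sketch of the slot-splitting loop from nir_setup_single_output_varying() with the register advancing by the real component count. struct output_layout and layout_output() are hypothetical names; only the "up to four components per location" loop shape is taken from the quoted patch.

```c
#include <assert.h>

#define MAX_SLOTS 4 /* hypothetical upper bound for this sketch */

struct output_layout {
   unsigned num_slots;             /* locations consumed */
   unsigned components[MAX_SLOTS]; /* components stored per location */
   unsigned reg_components;        /* how far the register was advanced */
};

static struct output_layout layout_output(unsigned num_elements)
{
   struct output_layout out = { 0, { 0 }, 0 };

   for (unsigned count = 0; count < num_elements; count += 4) {
      unsigned remaining = num_elements - count;
      unsigned comps = remaining < 4 ? remaining : 4;

      assert(out.num_slots < MAX_SLOTS);
      out.components[out.num_slots++] = comps;

      /* Advance by what was actually used, not by a padded vec4. */
      out.reg_components += comps;
   }

   return out;
}
```

For six scalar components this gives two locations holding 4 and 2 components and a register footprint of 6, where padding every location to a vec4 would have used 8.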
Re: [Mesa-dev] [PATCH 07/11] mesa: Fix incorrect "see also" comments
On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> Signed-off-by: Ian Romanick

Reviewed-by: Timothy Arceri

> ---
>  src/compiler/glsl/ir.h | 2 +-
>  src/mesa/main/mtypes.h | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
> index 3629356..cd17f69 100644
> --- a/src/compiler/glsl/ir.h
> +++ b/src/compiler/glsl/ir.h
> @@ -679,7 +679,7 @@ public:
>     /**
>      * Interpolation mode for shader inputs / outputs
>      *
> -    * \sa ir_variable_interpolation
> +    * \sa glsl_interp_qualifier
>      */
>     unsigned interpolation:2;
>
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 471d41d..88702cb 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -2615,7 +2615,7 @@ struct gl_shader_variable
>     /**
>      * Interpolation mode for shader inputs / outputs
>      *
> -    * \sa ir_variable_interpolation
> +    * \sa glsl_interp_qualifier
>      */
>     unsigned interpolation:2;
>
Re: [Mesa-dev] [PATCH 06/11] mesa: Silence unused parameter warning
On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> main/pipelineobj.c: In function ‘delete_pipelineobj_cb’:
> main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter]
>  delete_pipelineobj_cb(GLuint id, void *data, void *userData)
>                               ^
>
> Signed-off-by: Ian Romanick
> ---
>  src/mesa/main/pipelineobj.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index 9ecbcc9..8483752 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -107,7 +107,7 @@ _mesa_init_pipeline(struct gl_context *ctx)
>   * Callback for deleting a pipeline object.  Called by _mesa_HashDeleteAll().
>   */
>  static void
> -delete_pipelineobj_cb(GLuint id, void *data, void *userData)
> +delete_pipelineobj_cb(UNUSED GLuint id, void *data, void *userData)

It doesn't look like this has been used in core mesa before, but as long as others are ok with it:

Reviewed-by: Timothy Arceri

>  {
>     struct gl_pipeline_object *obj = (struct gl_pipeline_object *) data;
>     struct gl_context *ctx = (struct gl_context *) userData;
Re: [Mesa-dev] [PATCH 05/11] glsl: Don't monkey about with the interpolation modes
On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote: > From: Ian Romanick > > Previously we'd munge the interpolation mode so that later checks in > the > GLSL linker would pass. The caused problems for similar checks in > SSO > IO validation. Instead, make the check smarter, use the same check > in > both places, and don't modify the interpolation mode. > > Signed-off-by: Ian Romanick > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 > Cc: "12.0" > Cc: Gregory Hainaut > Cc: Ilia Mirkin > --- > src/compiler/glsl/ast_to_hir.cpp| 11 -- > src/compiler/glsl/link_varyings.cpp | 41 > + > src/compiler/glsl/link_varyings.h | 7 +++ > src/mesa/main/shader_query.cpp | 6 +- > 4 files changed, 49 insertions(+), 16 deletions(-) > > diff --git a/src/compiler/glsl/ast_to_hir.cpp > b/src/compiler/glsl/ast_to_hir.cpp > index 7da734c..d675dfa 100644 > --- a/src/compiler/glsl/ast_to_hir.cpp > +++ b/src/compiler/glsl/ast_to_hir.cpp > @@ -2991,17 +2991,6 @@ interpret_interpolation_qualifier(const struct > ast_type_qualifier *qual, > interpolation = INTERP_QUALIFIER_NOPERSPECTIVE; > else if (qual->flags.q.smooth) > interpolation = INTERP_QUALIFIER_SMOOTH; > - else if (state->es_shader && > -((mode == ir_var_shader_in && > - state->stage != MESA_SHADER_VERTEX) || > - (mode == ir_var_shader_out && > - state->stage != MESA_SHADER_FRAGMENT))) > - /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec > says: > - * > - *"When no interpolation qualifier is present, smooth > interpolation > - *is used." 
> - */ > - interpolation = INTERP_QUALIFIER_SMOOTH; > else > interpolation = INTERP_QUALIFIER_NONE; > > diff --git a/src/compiler/glsl/link_varyings.cpp > b/src/compiler/glsl/link_varyings.cpp > index 534393a..54491fc 100644 > --- a/src/compiler/glsl/link_varyings.cpp > +++ b/src/compiler/glsl/link_varyings.cpp > @@ -201,6 +201,37 @@ anonymous_struct_type_matches(const glsl_type > *output_type, > to_match->record_compare(output_type); > } > > +bool > +interpolation_compatible(gl_shader_stage producer_stage, > + gl_shader_stage consumer_stage, > + enum glsl_interp_qualifier producer_interp, > + enum glsl_interp_qualifier consumer_interp, > + bool is_builtin_variable) > +{ > + if (producer_interp == consumer_interp) > + return true; > + > + if (is_builtin_variable) > + return false; > + > + /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says: > +* > +*When no interpolation qualifier is present, smooth > interpolation is > +*used. > +*/ Note last time I was looking at this I couldn't find this text in the desktop spec so I don't think the following code can be applied to desktop gl. > + if (producer_stage == MESA_SHADER_VERTEX && > + producer_interp == INTERP_QUALIFIER_NONE && > + consumer_interp == INTERP_QUALIFIER_SMOOTH) > + return true; > + > + if (consumer_stage == MESA_SHADER_FRAGMENT && > + consumer_interp == INTERP_QUALIFIER_NONE && > + producer_interp == INTERP_QUALIFIER_SMOOTH) > + return true; Are you sure this is enough? What about a fragment shader with smooth and a geom shader with none? That shouldn't that return true also? > + > + return false; > +} > + > /** > * Validate the types and qualifiers of an output from one stage > against the > * matching input to another stage. > @@ -329,8 +360,11 @@ cross_validate_types_and_qualifiers(struct > gl_shader_program *prog, > * qualifiers of variables of the same name do not match. 
> * > */ > - if (input->data.interpolation != output->data.interpolation && > - prog->Version < 440) { > + if (prog->Version < 440 && > + !interpolation_compatible(producer_stage, consumer_stage, > + glsl_interp_qualifier(output- > >data.interpolation), > + glsl_interp_qualifier(input- > >data.interpolation), > + is_gl_identifier(output->name))) { > linker_error(prog, > "%s shader output `%s' specifies %s " > "interpolation qualifier, " > @@ -1371,8 +1405,7 @@ varying_matches::record(ir_variable > *producer_var, ir_variable *consumer_var) > (producer_var->type->contains_integer() || > producer_var->type->contains_double()); > > - if (needs_flat_qualifier || > - (consumer_stage != -1 && consumer_stage != > MESA_SHADER_FRAGMENT)) { > + if (needs_flat_qualifier) { > /* Since this varying is not being consumed by the fragment > shader, its > * interpolation type varying cannot possibly affect > rendering. > * Also, this variable is non-flat and is (or contains
Re: [Mesa-dev] [PATCH 04/11] glsl: Pack integer and double varyings as flat even if interpolation mode is none
On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> Signed-off-by: Ian Romanick
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
> Cc: "12.0"
> Cc: Gregory Hainaut
> Cc: Ilia Mirkin
> ---

I guess we might also want to update
varying_matches::compute_packing_class() to make the most of this.

>  src/compiler/glsl/lower_packed_varyings.cpp | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/src/compiler/glsl/lower_packed_varyings.cpp b/src/compiler/glsl/lower_packed_varyings.cpp
> index 130b8f6..ae36c1c 100644
> --- a/src/compiler/glsl/lower_packed_varyings.cpp
> +++ b/src/compiler/glsl/lower_packed_varyings.cpp
> @@ -273,11 +273,11 @@ lower_packed_varyings_visitor::run(struct gl_shader *shader)
>           continue;
>
>        /* This lowering pass is only capable of packing floats and ints
> -       * together when their interpolation mode is "flat".  Therefore, to be
> -       * safe, caller should ensure that integral varyings always use flat
> -       * interpolation, even when this is not required by GLSL.
> +       * together when their interpolation mode is "flat".  Treat integers as
> +       * being flat when the interpolation mode is none.
>         */
>        assert(var->data.interpolation == INTERP_QUALIFIER_FLAT ||
> +             var->data.interpolation == INTERP_QUALIFIER_NONE ||
>               !var->type->contains_integer());
>
>        /* Clone the variable for program resource list before
> @@ -607,7 +607,9 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
>     if (this->packed_varyings[slot] == NULL) {
>        char *packed_name = ralloc_asprintf(this->mem_ctx, "packed:%s", name);
>        const glsl_type *packed_type;
> -      if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT)
> +      if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT ||
> +          unpacked_var->type->contains_integer() ||
> +          unpacked_var->type->contains_double())
>           packed_type = glsl_type::ivec4_type;
>        else
>           packed_type = glsl_type::vec4_type;
> @@ -627,7 +629,8 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
>        packed_var->data.centroid = unpacked_var->data.centroid;
>        packed_var->data.sample = unpacked_var->data.sample;
>        packed_var->data.patch = unpacked_var->data.patch;
> -      packed_var->data.interpolation = unpacked_var->data.interpolation;
> +      packed_var->data.interpolation = packed_type == glsl_type::ivec4_type
> +         ? unsigned(INTERP_QUALIFIER_FLAT) : unpacked_var->data.interpolation;
>        packed_var->data.location = location;
>        packed_var->data.precision = unpacked_var->data.precision;
>        packed_var->data.always_active_io = unpacked_var->data.always_active_io;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/11] mesa: Strip arrayness from interface block names in some IO validation
On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote: > From: Ian Romanick > > Outputs from the vertex shader need to be able to match > per-vertex-arrayed inputs of later stages. Acomplish this by > stripping > one level of arrayness from the names and types of outputs going to a > per-vertex-arrayed stage. > > Signed-off-by: Ian Romanick > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358 > Cc: "12.0" > Cc: Gregory Hainaut > Cc: Ilia Mirkin > --- > src/mesa/main/shader_query.cpp | 98 > ++ > 1 file changed, 90 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/main/shader_query.cpp > b/src/mesa/main/shader_query.cpp > index 5956ce4..b2e53fb 100644 > --- a/src/mesa/main/shader_query.cpp > +++ b/src/mesa/main/shader_query.cpp > @@ -1385,13 +1385,24 @@ _mesa_get_program_resourceiv(struct > gl_shader_program *shProg, > > static bool > validate_io(struct gl_shader_program *producer, > -struct gl_shader_program *consumer) > +struct gl_shader_program *consumer, > +gl_shader_stage producer_stage, > +gl_shader_stage consumer_stage) > { > if (producer == consumer) > return true; > > + const bool nonarray_stage_to_array_stage = > + producer_stage == MESA_SHADER_VERTEX && > + (consumer_stage == MESA_SHADER_GEOMETRY || > + consumer_stage == MESA_SHADER_TESS_CTRL || > + consumer_stage == MESA_SHADER_TESS_EVAL); TESS_EVAL->GEOM ? 
> + > bool valid = true; > > + void *name_buffer = NULL; > + size_t name_buffer_size = 0; > + > gl_shader_variable const **outputs = > (gl_shader_variable const **) calloc(producer- > >NumProgramResourceList, > sizeof(gl_shader_variable > *)); > @@ -1463,11 +1474,52 @@ validate_io(struct gl_shader_program > *producer, > } > } > } else { > + char *consumer_name = consumer_var->name; > + > + if (nonarray_stage_to_array_stage && > + consumer_var->interface_type != NULL && > + consumer_var->interface_type->is_array() && > + !is_gl_identifier(consumer_var->name)) { > +const size_t name_len = strlen(consumer_var->name); > + > +if (name_len >= name_buffer_size) { > + free(name_buffer); > + > + name_buffer_size = name_len + 1; > + name_buffer = malloc(name_buffer_size); > + if (name_buffer == NULL) { > + valid = false; > + goto out; > + } > +} > + > +consumer_name = (char *) name_buffer; > + > +char *s = strchr(consumer_var->name, '['); > +if (s == NULL) { > + valid = false; > + goto out; > +} > + > +char *t = strchr(s, ']'); > +if (t == NULL) { > + valid = false; > + goto out; > +} > + > +assert(t[1] == '.' || t[1] == '['); > + > +const ptrdiff_t base_name_len = s - consumer_var->name; > + > +memcpy(consumer_name, consumer_var->name, > base_name_len); > +strcpy(consumer_name + base_name_len, t + 1); > + } > + > for (unsigned j = 0; j < num_outputs; j++) { > const gl_shader_variable *const var = outputs[j]; > > if (!var->explicit_location && > -strcmp(consumer_var->name, var->name) == 0) { > +strcmp(consumer_name, var->name) == 0) { > producer_var = var; > match_index = j; > break; > @@ -1529,25 +1581,53 @@ validate_io(struct gl_shader_program > *producer, > * Note that location mismatches are detected by the loops > above that > * find the producer variable that goes with the consumer > variable. 
> */ > - if (producer_var->type != consumer_var->type || > - producer_var->interpolation != consumer_var->interpolation > || > - producer_var->precision != consumer_var->precision) { > + if (nonarray_stage_to_array_stage) { > + if (!consumer_var->type->is_array() || > + consumer_var->type->fields.array != producer_var->type) > { > +valid = false; > +goto out; > + } > + > + if (consumer_var->interface_type != NULL) { > +if (!consumer_var->interface_type->is_array() || > +consumer_var->interface_type->fields.array != > producer_var->interface_type) { > + valid = false; > + goto out; > +} > + } else if (producer_var->interface_type != NULL) { > +valid = false; > +goto out; > + } > + } else { > + if (producer_var->type != consumer_var->type) { > +valid = false; > +goto out; > + } > + > + if (produ
Re: [Mesa-dev] [PATCH 02/11] mesa: If validation fails in a debug context just emit a debug message
On Tue, 2016-06-14 at 19:01 -0700, Ian Romanick wrote:
> From: Ian Romanick
>
> There are quite a few pipelines that desktop applications (including a
> bunch of piglit tests) can expect to have run but don't meet the GLES
> requirements.  Instead of failing validation, just emit a debug message.
>
> Signed-off-by: Ian Romanick
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
> Cc: "12.0"
> Cc: Gregory Hainaut
> Cc: Ilia Mirkin

Patches 1-2 are:

Reviewed-by: Timothy Arceri

> ---
>  src/mesa/main/pipelineobj.c | 17 +++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
> index 5a46cfe..9ecbcc9 100644
> --- a/src/mesa/main/pipelineobj.c
> +++ b/src/mesa/main/pipelineobj.c
> @@ -929,8 +929,21 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
>      * application has created a debug context.
>      */
>     if ((_mesa_is_gles(ctx) || (ctx->Const.ContextFlags & GL_CONTEXT_FLAG_DEBUG_BIT)) &&
> -       !_mesa_validate_pipeline_io(pipe))
> -      return GL_FALSE;
> +       !_mesa_validate_pipeline_io(pipe)) {
> +      if (_mesa_is_gles(ctx))
> +         return GL_FALSE;
> +
> +      static GLuint msg_id = 0;
> +
> +      _mesa_gl_debug(ctx, &msg_id,
> +                     MESA_DEBUG_SOURCE_API,
> +                     MESA_DEBUG_TYPE_PORTABILITY,
> +                     MESA_DEBUG_SEVERITY_MEDIUM,
> +                     "glValidateProgramPipeline: pipeline %u does not meet "
> +                     "strict OpenGL ES 3.1 requirements and may not be "
> +                     "portable across desktop hardware\n",
> +                     pipe->Name);
> +   }
>
>     pipe->Validated = GL_TRUE;
>     return GL_TRUE;
[Mesa-dev] [PATCH 04/11] glsl: Pack integer and double varyings as flat even if interpolation mode is none
From: Ian Romanick

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0"
Cc: Gregory Hainaut
Cc: Ilia Mirkin
---
 src/compiler/glsl/lower_packed_varyings.cpp | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/compiler/glsl/lower_packed_varyings.cpp b/src/compiler/glsl/lower_packed_varyings.cpp
index 130b8f6..ae36c1c 100644
--- a/src/compiler/glsl/lower_packed_varyings.cpp
+++ b/src/compiler/glsl/lower_packed_varyings.cpp
@@ -273,11 +273,11 @@ lower_packed_varyings_visitor::run(struct gl_shader *shader)
          continue;

       /* This lowering pass is only capable of packing floats and ints
-       * together when their interpolation mode is "flat".  Therefore, to be
-       * safe, caller should ensure that integral varyings always use flat
-       * interpolation, even when this is not required by GLSL.
+       * together when their interpolation mode is "flat".  Treat integers as
+       * being flat when the interpolation mode is none.
        */
       assert(var->data.interpolation == INTERP_QUALIFIER_FLAT ||
+             var->data.interpolation == INTERP_QUALIFIER_NONE ||
              !var->type->contains_integer());

       /* Clone the variable for program resource list before
@@ -607,7 +607,9 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
    if (this->packed_varyings[slot] == NULL) {
       char *packed_name = ralloc_asprintf(this->mem_ctx, "packed:%s", name);
       const glsl_type *packed_type;
-      if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT)
+      if (unpacked_var->data.interpolation == INTERP_QUALIFIER_FLAT ||
+          unpacked_var->type->contains_integer() ||
+          unpacked_var->type->contains_double())
          packed_type = glsl_type::ivec4_type;
       else
          packed_type = glsl_type::vec4_type;
@@ -627,7 +629,8 @@ lower_packed_varyings_visitor::get_packed_varying_deref(
       packed_var->data.centroid = unpacked_var->data.centroid;
       packed_var->data.sample = unpacked_var->data.sample;
       packed_var->data.patch = unpacked_var->data.patch;
-      packed_var->data.interpolation = unpacked_var->data.interpolation;
+      packed_var->data.interpolation = packed_type == glsl_type::ivec4_type
+         ? unsigned(INTERP_QUALIFIER_FLAT) : unpacked_var->data.interpolation;
       packed_var->data.location = location;
       packed_var->data.precision = unpacked_var->data.precision;
       packed_var->data.always_active_io = unpacked_var->data.always_active_io;
--
2.5.5
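For anyone skimming the packing change above, here is a minimal standalone sketch of the selection rule it introduces. The names `Varying`, `PackedSlot` and `choose_packed_slot` are hypothetical stand-ins invented for illustration; they are not Mesa types, and the real pass works on `ir_variable` and `glsl_type`.

```cpp
#include <cassert>

// Stand-in enums and structs; the real code uses glsl_interp_qualifier,
// ir_variable and glsl_type. Everything here is hypothetical.
enum class Interp { None, Smooth, Flat, NoPerspective };
enum class PackedType { Vec4, IVec4 };

struct Varying {
   Interp interpolation;
   bool contains_integer;
   bool contains_double;
};

struct PackedSlot {
   PackedType type;
   Interp interpolation;
};

// Mirror of the rule in the patch: a varying is packed into an integer
// vector if it is flat OR carries integer/double data, and an integer
// vector is always marked flat regardless of the declared qualifier.
PackedSlot choose_packed_slot(const Varying &v)
{
   const bool as_int = v.interpolation == Interp::Flat ||
                       v.contains_integer || v.contains_double;

   PackedSlot slot;
   slot.type = as_int ? PackedType::IVec4 : PackedType::Vec4;
   slot.interpolation = as_int ? Interp::Flat : v.interpolation;
   return slot;
}
```

Under these assumptions an integer varying declared with no qualifier ends up in an ivec4 slot marked flat, while a smooth float varying keeps both its vec4 slot and its declared qualifier.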
[Mesa-dev] [PATCH 03/11] mesa: Strip arrayness from interface block names in some IO validation
From: Ian Romanick

Outputs from the vertex shader need to be able to match
per-vertex-arrayed inputs of later stages.  Accomplish this by stripping
one level of arrayness from the names and types of outputs going to a
per-vertex-arrayed stage.

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0"
Cc: Gregory Hainaut
Cc: Ilia Mirkin
---
 src/mesa/main/shader_query.cpp | 98 ++
 1 file changed, 90 insertions(+), 8 deletions(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 5956ce4..b2e53fb 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -1385,13 +1385,24 @@ _mesa_get_program_resourceiv(struct gl_shader_program *shProg,

 static bool
 validate_io(struct gl_shader_program *producer,
-            struct gl_shader_program *consumer)
+            struct gl_shader_program *consumer,
+            gl_shader_stage producer_stage,
+            gl_shader_stage consumer_stage)
 {
    if (producer == consumer)
       return true;

+   const bool nonarray_stage_to_array_stage =
+      producer_stage == MESA_SHADER_VERTEX &&
+      (consumer_stage == MESA_SHADER_GEOMETRY ||
+       consumer_stage == MESA_SHADER_TESS_CTRL ||
+       consumer_stage == MESA_SHADER_TESS_EVAL);
+
    bool valid = true;

+   void *name_buffer = NULL;
+   size_t name_buffer_size = 0;
+
    gl_shader_variable const **outputs =
       (gl_shader_variable const **) calloc(producer->NumProgramResourceList,
                                            sizeof(gl_shader_variable *));
@@ -1463,11 +1474,52 @@ validate_io(struct gl_shader_program *producer,
             }
          }
       } else {
+         char *consumer_name = consumer_var->name;
+
+         if (nonarray_stage_to_array_stage &&
+             consumer_var->interface_type != NULL &&
+             consumer_var->interface_type->is_array() &&
+             !is_gl_identifier(consumer_var->name)) {
+            const size_t name_len = strlen(consumer_var->name);
+
+            if (name_len >= name_buffer_size) {
+               free(name_buffer);
+
+               name_buffer_size = name_len + 1;
+               name_buffer = malloc(name_buffer_size);
+               if (name_buffer == NULL) {
+                  valid = false;
+                  goto out;
+               }
+            }
+
+            consumer_name = (char *) name_buffer;
+
+            char *s = strchr(consumer_var->name, '[');
+            if (s == NULL) {
+               valid = false;
+               goto out;
+            }
+
+            char *t = strchr(s, ']');
+            if (t == NULL) {
+               valid = false;
+               goto out;
+            }
+
+            assert(t[1] == '.' || t[1] == '[');
+
+            const ptrdiff_t base_name_len = s - consumer_var->name;
+
+            memcpy(consumer_name, consumer_var->name, base_name_len);
+            strcpy(consumer_name + base_name_len, t + 1);
+         }
+
          for (unsigned j = 0; j < num_outputs; j++) {
             const gl_shader_variable *const var = outputs[j];

             if (!var->explicit_location &&
-                strcmp(consumer_var->name, var->name) == 0) {
+                strcmp(consumer_name, var->name) == 0) {
                producer_var = var;
                match_index = j;
                break;
@@ -1529,25 +1581,53 @@ validate_io(struct gl_shader_program *producer,
        * Note that location mismatches are detected by the loops above that
        * find the producer variable that goes with the consumer variable.
        */
-      if (producer_var->type != consumer_var->type ||
-          producer_var->interpolation != consumer_var->interpolation ||
-          producer_var->precision != consumer_var->precision) {
+      if (nonarray_stage_to_array_stage) {
+         if (!consumer_var->type->is_array() ||
+             consumer_var->type->fields.array != producer_var->type) {
+            valid = false;
+            goto out;
+         }
+
+         if (consumer_var->interface_type != NULL) {
+            if (!consumer_var->interface_type->is_array() ||
+                consumer_var->interface_type->fields.array != producer_var->interface_type) {
+               valid = false;
+               goto out;
+            }
+         } else if (producer_var->interface_type != NULL) {
+            valid = false;
+            goto out;
+         }
+      } else {
+         if (producer_var->type != consumer_var->type) {
+            valid = false;
+            goto out;
+         }
+
+         if (producer_var->interface_type != consumer_var->interface_type) {
+            valid = false;
+            goto out;
+         }
+      }
+
+      if (producer_var->interpolation != consumer_var->interpolation) {
          valid = false;
          goto out;
       }

-      if (producer_var->outermost_struct_type != consumer_var->outermost_struct_type) {
+      if (pro
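The name handling in this patch (find the first `[`, find the matching `]`, splice the two halves together) may be easier to follow in isolation. The following is only a sketch using `std::string`; `strip_outer_array_subscript` is a hypothetical helper invented for illustration, whereas the patch itself works on raw `char` buffers inside `validate_io`.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not Mesa code: remove the first "[...]" subscript
// from a resource name, e.g. "Block[0].color" -> "Block.color".
// Returns false when the name has no well-formed subscript, which
// corresponds to the "valid = false; goto out;" paths in the patch.
bool strip_outer_array_subscript(const std::string &name, std::string &out)
{
   const std::string::size_type open = name.find('[');
   if (open == std::string::npos)
      return false;

   const std::string::size_type close = name.find(']', open);
   if (close == std::string::npos)
      return false;

   // Keep everything before '[' and everything after ']'.
   out = name.substr(0, open) + name.substr(close + 1);
   return true;
}
```

Only the outermost subscript is removed, so a name with two array levels keeps its second one. That is consistent with the commit message's "one level of arrayness".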
[Mesa-dev] [PATCH 11/11] i965: Delete redundant extension enables
From: Ian Romanick

A nearly identical block already exists in the gen >= 6 block above.

Signed-off-by: Ian Romanick
---
 src/mesa/drivers/dri/i965/intel_extensions.c | 9 -
 1 file changed, 9 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
index 5be4787..b55fed2 100644
--- a/src/mesa/drivers/dri/i965/intel_extensions.c
+++ b/src/mesa/drivers/dri/i965/intel_extensions.c
@@ -360,15 +360,6 @@ intelInitExtensions(struct gl_context *ctx)
          if (brw->intelScreen->cmd_parser_version >= 2)
             brw->predicate.supported = true;
       }
-
-      /* Only enable this in core profile because other parts of Mesa behave
-       * slightly differently when the extension is enabled.
-       */
-      if (ctx->API == API_OPENGL_CORE) {
-         ctx->Extensions.ARB_viewport_array = true;
-         ctx->Extensions.AMD_vertex_shader_viewport_index = true;
-         ctx->Extensions.ARB_shader_subroutine = true;
-      }
    }

    if (brw->gen >= 8 || brw->is_haswell || brw->is_baytrail) {
--
2.5.5
[Mesa-dev] [PATCH 08/11] docs: Update GL3.txt for OpenGL ES on i965-ish hardware
From: Ian Romanick Signed-off-by: Ian Romanick --- docs/GL3.txt | 28 +++- 1 file changed, 15 insertions(+), 13 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 0204695..dedea1a 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -222,25 +222,26 @@ GL 4.5, GLSL 4.50: GL_EXT_shader_integer_mix DONE (all drivers that support GLSL) These are the extensions cherry-picked to make GLES 3.1 -GLES3.1, GLSL ES 3.1 -- all DONE: nvc0, radeonsi +GLES3.1, GLSL ES 3.1 -- all DONE: i965/gen8+, nvc0, radeonsi + GL_ARB_arrays_of_arrays DONE (all drivers that support GLSL 1.30) - GL_ARB_compute_shader DONE (i965, softpipe) - GL_ARB_draw_indirect DONE (i965, r600, llvmpipe, softpipe, swr) + GL_ARB_compute_shader DONE (i965/gen7+, softpipe) + GL_ARB_draw_indirect DONE (i965/gen7+, r600, llvmpipe, softpipe, swr) GL_ARB_explicit_uniform_location DONE (all drivers that support GLSL) - GL_ARB_framebuffer_no_attachments DONE (i965, r600, softpipe) + GL_ARB_framebuffer_no_attachments DONE (i965/gen7+, r600, softpipe) GL_ARB_program_interface_queryDONE (all drivers) - GL_ARB_shader_atomic_counters DONE (i965, softpipe) - GL_ARB_shader_image_load_storeDONE (i965, softpipe) - GL_ARB_shader_image_size DONE (i965, softpipe) - GL_ARB_shader_storage_buffer_object DONE (i965, softpipe) + GL_ARB_shader_atomic_counters DONE (i965/gen7+, softpipe) + GL_ARB_shader_image_load_storeDONE (i965/gen7+, softpipe) + GL_ARB_shader_image_size DONE (i965/gen7+, softpipe) + GL_ARB_shader_storage_buffer_object DONE (i965/gen7+, softpipe) GL_ARB_shading_language_packing DONE (all drivers) GL_ARB_separate_shader_objectsDONE (all drivers) - GL_ARB_stencil_texturing DONE (i965/gen8+, nv50, r600, llvmpipe, softpipe, swr) - GL_ARB_texture_multisample (Multisample textures) DONE (i965, nv50, r600, llvmpipe, softpipe) + GL_ARB_stencil_texturing DONE (nv50, r600, llvmpipe, softpipe, swr) + GL_ARB_texture_multisample (Multisample textures) DONE (i965/gen7+, nv50, r600, llvmpipe, softpipe) 
GL_ARB_texture_storage_multisampleDONE (all drivers that support GL_ARB_texture_multisample) GL_ARB_vertex_attrib_binding DONE (all drivers) - GS5 Enhanced textureGatherDONE (i965, r600) - GS5 Packing/bitfield/conversion functions DONE (i965, r600) + GS5 Enhanced textureGatherDONE (i965/gen7+, r600) + GS5 Packing/bitfield/conversion functions DONE (i965/gen6+, r600) GL_EXT_shader_integer_mix DONE (all drivers that support GLSL) Additional functionality not covered above: @@ -249,7 +250,8 @@ GLES3.1, GLSL ES 3.1 -- all DONE: nvc0, radeonsi glGetBooleani_v - restrict to GLES enums gl_HelperInvocation support DONE (i965, r600) -GLES3.2, GLSL ES 3.2 +GLES3.2, GLSL ES 3.2: + GL_EXT_color_buffer_float DONE (all drivers) GL_KHR_blend_equation_advancednot started GL_KHR_debug DONE (all drivers) -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 05/11] glsl: Don't monkey about with the interpolation modes
From: Ian Romanick

Previously we'd munge the interpolation mode so that later checks in the
GLSL linker would pass.  This caused problems for similar checks in SSO
IO validation.  Instead, make the check smarter, use the same check in
both places, and don't modify the interpolation mode.

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0"
Cc: Gregory Hainaut
Cc: Ilia Mirkin
---
 src/compiler/glsl/ast_to_hir.cpp    | 11 --
 src/compiler/glsl/link_varyings.cpp | 41 +
 src/compiler/glsl/link_varyings.h   | 7 +++
 src/mesa/main/shader_query.cpp      | 6 +-
 4 files changed, 49 insertions(+), 16 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index 7da734c..d675dfa 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2991,17 +2991,6 @@ interpret_interpolation_qualifier(const struct ast_type_qualifier *qual,
       interpolation = INTERP_QUALIFIER_NOPERSPECTIVE;
    else if (qual->flags.q.smooth)
       interpolation = INTERP_QUALIFIER_SMOOTH;
-   else if (state->es_shader &&
-            ((mode == ir_var_shader_in &&
-              state->stage != MESA_SHADER_VERTEX) ||
-             (mode == ir_var_shader_out &&
-              state->stage != MESA_SHADER_FRAGMENT)))
-      /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says:
-       *
-       *    "When no interpolation qualifier is present, smooth interpolation
-       *    is used."
-       */
-      interpolation = INTERP_QUALIFIER_SMOOTH;
    else
       interpolation = INTERP_QUALIFIER_NONE;

diff --git a/src/compiler/glsl/link_varyings.cpp b/src/compiler/glsl/link_varyings.cpp
index 534393a..54491fc 100644
--- a/src/compiler/glsl/link_varyings.cpp
+++ b/src/compiler/glsl/link_varyings.cpp
@@ -201,6 +201,37 @@ anonymous_struct_type_matches(const glsl_type *output_type,
       to_match->record_compare(output_type);
 }

+bool
+interpolation_compatible(gl_shader_stage producer_stage,
+                         gl_shader_stage consumer_stage,
+                         enum glsl_interp_qualifier producer_interp,
+                         enum glsl_interp_qualifier consumer_interp,
+                         bool is_builtin_variable)
+{
+   if (producer_interp == consumer_interp)
+      return true;
+
+   if (is_builtin_variable)
+      return false;
+
+   /* Section 4.3.9 (Interpolation) of the GLSL ES 3.00 spec says:
+    *
+    *    When no interpolation qualifier is present, smooth interpolation is
+    *    used.
+    */
+   if (producer_stage == MESA_SHADER_VERTEX &&
+       producer_interp == INTERP_QUALIFIER_NONE &&
+       consumer_interp == INTERP_QUALIFIER_SMOOTH)
+      return true;
+
+   if (consumer_stage == MESA_SHADER_FRAGMENT &&
+       consumer_interp == INTERP_QUALIFIER_NONE &&
+       producer_interp == INTERP_QUALIFIER_SMOOTH)
+      return true;
+
+   return false;
+}
+
 /**
  * Validate the types and qualifiers of an output from one stage against the
  * matching input to another stage.
@@ -329,8 +360,11 @@ cross_validate_types_and_qualifiers(struct gl_shader_program *prog,
     *    qualifiers of variables of the same name do not match.
     *
     */
-   if (input->data.interpolation != output->data.interpolation &&
-       prog->Version < 440) {
+   if (prog->Version < 440 &&
+       !interpolation_compatible(producer_stage, consumer_stage,
+                                 glsl_interp_qualifier(output->data.interpolation),
+                                 glsl_interp_qualifier(input->data.interpolation),
+                                 is_gl_identifier(output->name))) {
       linker_error(prog,
                    "%s shader output `%s' specifies %s "
                    "interpolation qualifier, "
@@ -1371,8 +1405,7 @@ varying_matches::record(ir_variable *producer_var, ir_variable *consumer_var)
       (producer_var->type->contains_integer() ||
        producer_var->type->contains_double());

-   if (needs_flat_qualifier ||
-       (consumer_stage != -1 && consumer_stage != MESA_SHADER_FRAGMENT)) {
+   if (needs_flat_qualifier) {
       /* Since this varying is not being consumed by the fragment shader, its
        * interpolation type varying cannot possibly affect rendering.
        * Also, this variable is non-flat and is (or contains) an integer
diff --git a/src/compiler/glsl/link_varyings.h b/src/compiler/glsl/link_varyings.h
index 39e9070..6a98c0f 100644
--- a/src/compiler/glsl/link_varyings.h
+++ b/src/compiler/glsl/link_varyings.h
@@ -338,4 +338,11 @@ check_against_input_limit(struct gl_context *ctx,
                           gl_shader *consumer,
                           unsigned num_explicit_locations);

+bool
+interpolation_compatible(gl_shader_stage producer_stage,
+                         gl_shader_stage consumer_stage,
+                         enum glsl_interp_qualifier producer_interp,
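To make it easier to reason about which stage and qualifier combinations pass, here is a standalone copy of the `interpolation_compatible` logic from this patch. The `Stage` and `Interp` enums are stand-ins invented for the sketch (the real code uses `gl_shader_stage` and `glsl_interp_qualifier`); the branch structure is copied from the diff as posted.

```cpp
#include <cassert>

// Stand-in enums; hypothetical replacements for gl_shader_stage and
// glsl_interp_qualifier so the sketch compiles on its own.
enum class Stage { Vertex, TessCtrl, TessEval, Geometry, Fragment };
enum class Interp { None, Smooth, Flat, NoPerspective };

bool interpolation_compatible(Stage producer_stage, Stage consumer_stage,
                              Interp producer_interp, Interp consumer_interp,
                              bool is_builtin_variable)
{
   // Identical qualifiers always match.
   if (producer_interp == consumer_interp)
      return true;

   // Built-in variables get no leniency.
   if (is_builtin_variable)
      return false;

   // "None" on a vertex shader output is treated as smooth.
   if (producer_stage == Stage::Vertex &&
       producer_interp == Interp::None &&
       consumer_interp == Interp::Smooth)
      return true;

   // "None" on a fragment shader input is treated as smooth.
   if (consumer_stage == Stage::Fragment &&
       consumer_interp == Interp::None &&
       producer_interp == Interp::Smooth)
      return true;

   return false;
}
```

With the logic written this way, a vertex output with no qualifier matches a `smooth` fragment input, and a `smooth` geometry output matches a fragment input with no qualifier. A geometry output with no qualifier feeding a `smooth` fragment input falls through to `return false`, since neither special case covers a non-vertex producer with `None`.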
[Mesa-dev] [PATCH 10/11] docs: Add extensions not part of any GL or GL ES version
From: Ian Romanick Based loosely on patches submitted ages ago by Thomas Helland. Signed-off-by: Ian Romanick --- docs/GL3.txt | 56 1 file changed, 56 insertions(+) diff --git a/docs/GL3.txt b/docs/GL3.txt index 0deeaa1..b0966a2 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -275,5 +275,61 @@ GLES3.2, GLSL ES 3.2: GL_OES_texture_stencil8 DONE (all drivers that support GL_ARB_texture_stencil8) GL_OES_texture_storage_multisample_2d_array DONE (all drivers that support GL_ARB_texture_multisample) +Khronos, ARB, and OES extensions that are not part of any OpenGL or OpenGL ES version: + + GL_ARB_bindless_texture not started + GL_ARB_cl_event not started + GL_ARB_compute_variable_group_sizenot started + GL_ARB_ES3_2_compatibilitynot started + GL_ARB_fragment_shader_interlock not started + GL_ARB_gpu_shader_int64 started (airlied for core and Gallium, idr for i965) + GL_ARB_indirect_parametersDONE (core only?) + GL_ARB_parallel_shader_compilenot started, but Chia-I Wu did some related work in 2014 + GL_ARB_pipeline_statistics_query DONE (i965, nvc0, radeonsi, softpipe, swr) + GL_ARB_post_depth_coveragenot started + GL_ARB_robustness_isolation not started + GL_ARB_sample_locations not started + GL_ARB_seamless_cubemap_per_texture DONE (i965, nvc0, radeonsi, r600, softpipe, swr) + GL_ARB_shader_atomic_counter_ops DONE (some Gallium drivers?) 
+ GL_ARB_shader_ballot not started + GL_ARB_shader_clock DONE (i965/gen7+) + GL_ARB_shader_draw_parameters DONE (i965, nvc0) + GL_ARB_shader_group_vote started (Ilia for nvc0, Matt for i965) + GL_ARB_shader_stencil_export DONE (i965/gen9+, radeonsi, softpipe, llvmpipe, swr) + GL_ARB_shader_viewport_layer_arraynot started + GL_ARB_sparse_buffer not started + GL_ARB_sparse_texture2not started + GL_ARB_sparse_texture_clamp not started + GL_ARB_sparse_texture not started + GL_ARB_texture_filter_minmax not started + GL_ARB_transform_feedback_overflow_query not started + GL_KHR_blend_equation_advanced_coherent not started + GL_KHR_no_error not started + GL_KHR_texture_compression_astc_hdr DONE (core only) + GL_KHR_texture_compression_astc_sliced_3d not started + GL_OES_depth_texture_cube_map DONE (all drivers that support GLSL 1.30+) + GL_OES_EGL_image DONE (all drivers) + GL_OES_EGL_image_external_essl3 not started + GL_OES_required_internalformatnot started - GLES2 extension based on OpenGL ES 3.0 feature + GL_OES_surfaceless_contextDONE (all drivers) + GL_OES_texture_compression_astc DONE (core only) + GL_OES_texture_float DONE (i965) + GL_OES_texture_float_linear DONE (i965) + GL_OES_texture_half_float DONE (i965) + GL_OES_texture_half_float_linear DONE (i965) + GL_OES_texture_view not started - based on GL_ARB_texture_view + GLX_ARB_context_flush_control not started + GLX_ARB_robustness_application_isolation not started + GLX_ARB_robustness_share_group_isolation not started + +The following extensions are not part of any OpenGL or OpenGL ES version, and +we DO NOT WANT implementations of these extensions for Mesa. 
+ + GL_ARB_geometry_shader4 Superseded by GL 3.2 geometry shaders + GL_ARB_matrix_palette Superseded by GL_ARB_vertex_program + GL_ARB_shading_language_include Not interesting + GL_ARB_shadow_ambient Superseded by GL_ARB_fragment_program + GL_ARB_vertex_blend Superseded by GL_ARB_vertex_program + More info about these features and the work involved can be found at http://dri.freedesktop.org/wiki/MissingFunctionality -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freede
[Mesa-dev] [PATCH 02/11] mesa: If validation fails in a debug context just emit a debug message
From: Ian Romanick

There are quite a few pipelines that desktop applications (including a
bunch of piglit tests) can expect to have run but don't meet the GLES
requirements.  Instead of failing validation, just emit a debug message.

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0"
Cc: Gregory Hainaut
Cc: Ilia Mirkin
---
 src/mesa/main/pipelineobj.c | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 5a46cfe..9ecbcc9 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -929,8 +929,21 @@ _mesa_validate_program_pipeline(struct gl_context* ctx,
     * application has created a debug context.
     */
    if ((_mesa_is_gles(ctx) || (ctx->Const.ContextFlags & GL_CONTEXT_FLAG_DEBUG_BIT)) &&
-       !_mesa_validate_pipeline_io(pipe))
-      return GL_FALSE;
+       !_mesa_validate_pipeline_io(pipe)) {
+      if (_mesa_is_gles(ctx))
+         return GL_FALSE;
+
+      static GLuint msg_id = 0;
+
+      _mesa_gl_debug(ctx, &msg_id,
+                     MESA_DEBUG_SOURCE_API,
+                     MESA_DEBUG_TYPE_PORTABILITY,
+                     MESA_DEBUG_SEVERITY_MEDIUM,
+                     "glValidateProgramPipeline: pipeline %u does not meet "
+                     "strict OpenGL ES 3.1 requirements and may not be "
+                     "portable across desktop hardware\n",
+                     pipe->Name);
+   }

    pipe->Validated = GL_TRUE;
    return GL_TRUE;
--
2.5.5
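The resulting policy has three outcomes depending on the context type, which the following sketch tries to make explicit. `Context` and `validate_pipeline` are hypothetical names invented here, and the IO check result is passed in as a plain `bool` rather than computed lazily as `_mesa_validate_pipeline_io(pipe)` is in the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the bits of gl_context the patch consults.
struct Context {
   bool is_gles;
   bool debug_context;
   std::vector<std::string> debug_log;   // stand-in for _mesa_gl_debug() output
};

// Mirrors the control flow of the patched block: strict IO validation is a
// hard failure on GLES, a portability warning in a desktop debug context,
// and ignored entirely in a plain desktop context.
bool validate_pipeline(Context &ctx, bool io_valid)
{
   if ((ctx.is_gles || ctx.debug_context) && !io_valid) {
      if (ctx.is_gles)
         return false;

      ctx.debug_log.push_back(
         "pipeline does not meet strict OpenGL ES 3.1 requirements");
   }

   return true;
}
```

In this model a GLES context with mismatched IO fails validation, a desktop debug context succeeds but records one message, and a non-debug desktop context succeeds silently.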
[Mesa-dev] [PATCH 09/11] docs: Update GL3.txt for OpenGL 4.0 on i965-ish hardware
From: Ian Romanick Signed-off-by: Ian Romanick --- docs/GL3.txt | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index dedea1a..0deeaa1 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -107,11 +107,11 @@ GL 3.3, GLSL 3.30 --- all DONE: i965, nv50, nvc0, r600, radeonsi, llvmpipe, soft GL_ARB_vertex_type_2_10_10_10_rev DONE (swr) -GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi +GL 4.0, GLSL 4.00 --- all DONE: i965/gen8+, nvc0, r600, radeonsi - GL_ARB_draw_buffers_blend DONE (i965, nv50, llvmpipe, softpipe, swr) - GL_ARB_draw_indirect DONE (i965, llvmpipe, softpipe, swr) - GL_ARB_gpu_shader5DONE (i965) + GL_ARB_draw_buffers_blend DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr) + GL_ARB_draw_indirect DONE (i965/gen7+, llvmpipe, softpipe, swr) + GL_ARB_gpu_shader5DONE (i965/gen7+) - 'precise' qualifier DONE - Dynamically uniform sampler array indices DONE (softpipe) - Dynamically uniform UBO array indices DONE () @@ -124,16 +124,16 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, r600, radeonsi - Enhanced per-sample shading DONE () - Interpolation functions DONE () - New overload resolution rules DONE - GL_ARB_gpu_shader_fp64DONE (i965/gen8+, llvmpipe, softpipe) - GL_ARB_sample_shading DONE (i965, nv50) - GL_ARB_shader_subroutine DONE (i965, nv50, llvmpipe, softpipe, swr) - GL_ARB_tessellation_shaderDONE (i965) - GL_ARB_texture_buffer_object_rgb32DONE (i965, llvmpipe, softpipe, swr) - GL_ARB_texture_cube_map_array DONE (i965, nv50, llvmpipe, softpipe) - GL_ARB_texture_gather DONE (i965, nv50, llvmpipe, softpipe, swr) + GL_ARB_gpu_shader_fp64DONE (llvmpipe, softpipe) + GL_ARB_sample_shading DONE (i965/gen6+, nv50) + GL_ARB_shader_subroutine DONE (i965/gen6+, nv50, llvmpipe, softpipe, swr) + GL_ARB_tessellation_shaderDONE (i965/gen7+) + GL_ARB_texture_buffer_object_rgb32DONE (i965/gen6+, llvmpipe, softpipe) + GL_ARB_texture_cube_map_array DONE (i965/gen6+, nv50, llvmpipe, softpipe) + GL_ARB_texture_gather DONE (i965/gen6+, 
nv50, llvmpipe, softpipe, swr) GL_ARB_texture_query_lod DONE (i965, nv50, softpipe) - GL_ARB_transform_feedback2DONE (i965, nv50, llvmpipe, softpipe, swr) - GL_ARB_transform_feedback3DONE (i965, nv50, llvmpipe, softpipe, swr) + GL_ARB_transform_feedback2DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr) + GL_ARB_transform_feedback3DONE (i965/gen7+, nv50, llvmpipe, softpipe, swr) GL 4.1, GLSL 4.10 --- all DONE: nvc0, r600, radeonsi -- 2.5.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 07/11] mesa: Fix incorrect "see also" comments
From: Ian Romanick

Signed-off-by: Ian Romanick
---
 src/compiler/glsl/ir.h | 2 +-
 src/mesa/main/mtypes.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/compiler/glsl/ir.h b/src/compiler/glsl/ir.h
index 3629356..cd17f69 100644
--- a/src/compiler/glsl/ir.h
+++ b/src/compiler/glsl/ir.h
@@ -679,7 +679,7 @@ public:
       /**
        * Interpolation mode for shader inputs / outputs
        *
-       * \sa ir_variable_interpolation
+       * \sa glsl_interp_qualifier
        */
       unsigned interpolation:2;

diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 471d41d..88702cb 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -2615,7 +2615,7 @@ struct gl_shader_variable
    /**
     * Interpolation mode for shader inputs / outputs
     *
-    * \sa ir_variable_interpolation
+    * \sa glsl_interp_qualifier
     */
    unsigned interpolation:2;

--
2.5.5
[Mesa-dev] [PATCH 01/11] glsl: Always strip arrayness in precision_qualifier_allowed
From: Ian Romanick

Previously some callers of precision_qualifier_allowed would strip the
arrayness from the type and some would not. As a result, some places
would not notice that float[6], for example, needed a precision
qualifier.

Fixes the new piglit test no-default-float-array-precision.frag.

Signed-off-by: Ian Romanick
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0"
Cc: Gregory Hainaut
Cc: Ilia Mirkin
---
 src/compiler/glsl/ast_to_hir.cpp | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/compiler/glsl/ast_to_hir.cpp b/src/compiler/glsl/ast_to_hir.cpp
index ea32924..7da734c 100644
--- a/src/compiler/glsl/ast_to_hir.cpp
+++ b/src/compiler/glsl/ast_to_hir.cpp
@@ -2278,10 +2278,10 @@ precision_qualifier_allowed(const glsl_type *type)
     * From this, we infer that GLSL 1.30 (and later) should allow precision
     * qualifiers on sampler types just like float and integer types.
     */
-   return (type->is_float()
-       || type->is_integer()
-       || type->contains_opaque())
-      && !type->without_array()->is_record();
+   const glsl_type *const t = type->without_array();
+
+   return (t->is_float() || t->is_integer() || t->contains_opaque()) &&
+          !t->is_record();
 }
 
 const glsl_type *
@@ -4994,13 +4994,8 @@ ast_declarator_list::hir(exec_list *instructions,
             state->check_precision_qualifiers_allowed(&loc);
          }
 
-
-         /* If a precision qualifier is allowed on a type, it is allowed on
-          * an array of that type.
-          */
-         if (!(this->type->qualifier.precision == ast_precision_none
-               || precision_qualifier_allowed(var->type->without_array()))) {
-
+         if (this->type->qualifier.precision != ast_precision_none &&
+             !precision_qualifier_allowed(var->type)) {
             _mesa_glsl_error(&loc, state,
                              "precision qualifiers apply only to floating point"
                              ", integer and opaque types");
-- 
2.5.5
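To illustrate why moving the array stripping into the predicate helps, here is a minimal C sketch. The `toy_type` struct and the `toy_*` helpers are hypothetical stand-ins invented for this example; they are not Mesa's `glsl_type` API, and the real predicate also accepts opaque types and rejects records.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for glsl_type; not Mesa's real API. */
enum toy_base { TOY_FLOAT, TOY_INT, TOY_BOOL, TOY_STRUCT, TOY_ARRAY };

struct toy_type {
   enum toy_base base;
   const struct toy_type *element;   /* only used when base == TOY_ARRAY */
};

/* Peel off every array level, e.g. float[6][2] -> float. */
static const struct toy_type *
toy_without_array(const struct toy_type *t)
{
   while (t->base == TOY_ARRAY)
      t = t->element;
   return t;
}

/* The predicate strips arrayness itself, so no caller can forget to do it. */
static bool
toy_precision_qualifier_allowed(const struct toy_type *type)
{
   const struct toy_type *const t = toy_without_array(type);

   return t->base == TOY_FLOAT || t->base == TOY_INT;
}
```

With this shape, a caller that passes `float[6]` directly gets the same answer as one that passes `float`, which is the inconsistency the commit message describes.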
[Mesa-dev] [PATCH 06/11] mesa: Silence unused parameter warning
From: Ian Romanick

main/pipelineobj.c: In function ‘delete_pipelineobj_cb’:
main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter]
 delete_pipelineobj_cb(GLuint id, void *data, void *userData)
                              ^

Signed-off-by: Ian Romanick
---
 src/mesa/main/pipelineobj.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/pipelineobj.c b/src/mesa/main/pipelineobj.c
index 9ecbcc9..8483752 100644
--- a/src/mesa/main/pipelineobj.c
+++ b/src/mesa/main/pipelineobj.c
@@ -107,7 +107,7 @@ _mesa_init_pipeline(struct gl_context *ctx)
  * Callback for deleting a pipeline object. Called by _mesa_HashDeleteAll().
  */
 static void
-delete_pipelineobj_cb(GLuint id, void *data, void *userData)
+delete_pipelineobj_cb(UNUSED GLuint id, void *data, void *userData)
 {
    struct gl_pipeline_object *obj = (struct gl_pipeline_object *) data;
    struct gl_context *ctx = (struct gl_context *) userData;
-- 
2.5.5
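For readers unfamiliar with the idiom, the following is a small sketch of how an "unused" annotation silences this warning while keeping a callback signature intact. The `TOY_UNUSED` macro, the callback type, and the walker are assumptions made up for this example; Mesa supplies its own `UNUSED` macro, whose exact definition is not shown in this patch.

```c
/* Assumed definition for illustration only; Mesa provides its own UNUSED. */
#if defined(__GNUC__) || defined(__clang__)
#define TOY_UNUSED __attribute__((unused))
#else
#define TOY_UNUSED
#endif

/* Callback signature fixed by a (hypothetical) table walker: the key is
 * handed to every callback, even when a given callback has no use for it. */
typedef void (*toy_walk_cb)(unsigned id, void *data, void *user_data);

/* Marking 'id' as unused keeps -Wunused-parameter quiet without changing
 * the signature the walker expects. */
static void
toy_count_cb(TOY_UNUSED unsigned id, void *data, void *user_data)
{
   (void) data;                 /* the portable way to ignore a parameter */
   ++*(int *) user_data;
}

/* Invoke the callback n times, as a hash-table walk would. */
static int
toy_count_entries(unsigned n)
{
   int count = 0;
   toy_walk_cb cb = toy_count_cb;

   for (unsigned i = 0; i < n; i++)
      cb(i, (void *) 0, &count);
   return count;
}
```

The `(void) data;` cast is the compiler-independent alternative; an attribute-based macro is shorter at the declaration site but depends on compiler support.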
Re: [Mesa-dev] [PATCH] anv: Fix a harmless overflow warning
On Jun 14, 2016 4:23 PM, "Chad Versace" wrote:
>
> anv_pipeline_binding::index is a uint8_t, but some code assigned to it
> UINT16_MAX.
> ---
>  src/intel/vulkan/anv_pipeline.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index 60b7c6b..b41e11e 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -664,7 +664,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
>        rt_bindings[0] = (struct anv_pipeline_binding) {
>           .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
>           .binding = 0,
> -         .index = UINT16_MAX,
> +         .index = UINT8_MAX,

I believe we have a descriptive #define specifically for render targets.
Probably better to use that.

>        };
>        num_rts = 1;
>     }
> --
> 2.9.0.rc2
Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled
Am 15.06.2016 um 01:08 schrieb Axel Davy: > On 15/06/2016 00:21, Roland Scheidegger wrote: >> Am 14.06.2016 um 23:33 schrieb Axel Davy: >>> diff --git a/src/gallium/include/pipe/p_state.h >>> b/src/gallium/include/pipe/p_state.h >>> index 396f563..7dce80a 100644 >>> --- a/src/gallium/include/pipe/p_state.h >>> +++ b/src/gallium/include/pipe/p_state.h >>> @@ -139,6 +139,13 @@ struct pipe_rasterizer_state >>> unsigned clip_halfz:1; >>>/** >>> +* When true do not scale offset_units and use same rules for >>> unorm and >>> +* float depth buffers (D3D9). When false use GL/D3D1X behaviour. >>> +* This depends on PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. >>> +*/ >>> + unsigned offset_units_unscaled; >>> + >>> + /** >>> * Enable bits for clipping half-spaces. >>> * This applies to both user clip planes and shader clip distances. >>> * Note that if the bound shader exports any clip distances, these >>> >> I don't like this. Generally, for unorm formats, you can easily enough >> translate this from d3d9 to gl (or d3d10) rules (but yes, obviously it's >> going to be format dependent). (With one big caveat, in general not all >> gl drivers think the minimum resolvable difference is the same, that >> might range from 2^-22 to 2^-24 for 24bit unorm depth for instance, and >> I don't think it's quite consistent with gallium drivers neither). >> >> You are right though for float depth the formula is different, and you >> can't translate it. But do you really need float depth buffer support? >> AFAIK no d3d9 app really depends on it, everything can fall back to d24. >> >> Roland >> > Hi, > > > That's true float depth buffer do not seem to be widely used in d3d9. > > The two float depth buffers available in d3d9, as far as I know, are > D32F_LOCKABLE and D24FS8. 
> > We can see the support for those and other depth buffers here (note that > these are mainly old cards): > > http://zp.amsnet.pl/cdragan/query.php?dxversion=9&feature=formats&featuregroup=selected&adaptergroup=all&featureselected[]=45&featureselected[]=44&featureselected[]=41&featureselected[]=42&featureselected[]=43&featureselected[]=40&featureselected[]=39&featureselected[]=46&resource=SURFACE&usage=DEPTHSTENCIL&orientation=horizontal > > > It is likely not a requirement for any game to support these formats. > > > We could ignore these formats, and add to gallium a way to get the > minimum resolvable difference per depth buffer format from drivers. We > considered this option. > > > That said, the driver is the best location to know about the minimum > resolvable difference, and we made the choice to let the driver do the > scaling instead of doing it based on some driver query in the state > tracker. > > As for floating point depth buffers behaviour, I understand for some > drivers it may be harder than for others to implement. > > That doesn't seem however a reason to drop floating depth buffer support > in Gallium Nine. D32F_LOCKABLE is particularly useful for debugging, > being lockable, it can be used to show depth buffer content after some > draw calls for d3d on windows, and compare with nine. And some apps may > use it for some particular effects. > > I'd be ok if we make the float depth buffer part of > offset_units_unscaled optional given how rare the combination float > depth buffers + depth bias must be used. However if hw can do it, I see > no reason why we wouldn't support the capability? On second look, it doesn't really look too bad (and fwiw we actually could probably put it to use here if we'd support it in llvmpipe). 
Albeit,
unsigned offset_units_unscaled;
needs to be
unsigned offset_units_unscaled:1;

I'm just very sceptical when it comes to capabilities solely to the
benefit of fringe state trackers (and everything not st/mesa counts
here). It usually means driver authors aren't going to bother. And you
probably can't implement it in all drivers yourselves even if the hw
could do it.
That said, I'm ok with this if there's no objections from others.

Roland
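A quick way to see what the missing `:1` costs is to compare struct sizes. The two structs below are hypothetical, cut-down layouts written for this example, not the real `pipe_rasterizer_state`; bit-field packing is implementation-defined, so the sizes are what common ABIs produce rather than a guarantee.

```c
#include <stddef.h>

/* Hypothetical, cut-down layouts; not the real pipe_rasterizer_state. */
struct toy_rast_full_word {
   unsigned clip_halfz:1;
   unsigned offset_units_unscaled;     /* missing ":1": takes a whole unsigned */
   unsigned clip_plane_enable:8;
};

struct toy_rast_bitfield {
   unsigned clip_halfz:1;
   unsigned offset_units_unscaled:1;   /* packs next to its neighbours */
   unsigned clip_plane_enable:8;
};

static size_t
toy_size_full_word(void)
{
   return sizeof(struct toy_rast_full_word);
}

static size_t
toy_size_bitfield(void)
{
   return sizeof(struct toy_rast_bitfield);
}
```

A full-width member in the middle of a run of bit-fields also splits the run, so the fields on either side can no longer share one storage unit. For a state struct that is presumably hashed or compared as raw bytes, keeping it compact is a reasonable goal.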
Re: [Mesa-dev] [PATCH] anv: Fix a harmless overflow warning
On Tue, Jun 14, 2016 at 4:22 PM, Chad Versace wrote:
> anv_pipeline_binding::index is a uint8_t, but some code assigned to it
> UINT16_MAX.
> ---
>  src/intel/vulkan/anv_pipeline.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
> index 60b7c6b..b41e11e 100644
> --- a/src/intel/vulkan/anv_pipeline.c
> +++ b/src/intel/vulkan/anv_pipeline.c
> @@ -664,7 +664,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
>        rt_bindings[0] = (struct anv_pipeline_binding) {
>           .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
>           .binding = 0,
> -         .index = UINT16_MAX,
> +         .index = UINT8_MAX,
>        };
>        num_rts = 1;
>     }
> --
> 2.9.0.rc2

Reviewed-by: Anuj Phogat
[Mesa-dev] [PATCH] anv: Fix a harmless overflow warning
anv_pipeline_binding::index is a uint8_t, but some code assigned to it
UINT16_MAX.
---
 src/intel/vulkan/anv_pipeline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/intel/vulkan/anv_pipeline.c b/src/intel/vulkan/anv_pipeline.c
index 60b7c6b..b41e11e 100644
--- a/src/intel/vulkan/anv_pipeline.c
+++ b/src/intel/vulkan/anv_pipeline.c
@@ -664,7 +664,7 @@ anv_pipeline_compile_fs(struct anv_pipeline *pipeline,
       rt_bindings[0] = (struct anv_pipeline_binding) {
          .set = ANV_DESCRIPTOR_SET_COLOR_ATTACHMENTS,
          .binding = 0,
-         .index = UINT16_MAX,
+         .index = UINT8_MAX,
       };
       num_rts = 1;
    }
-- 
2.9.0.rc2
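The reason the warning is harmless can be shown with a few lines of C: converting to an unsigned 8-bit type keeps only the low 8 bits, so `UINT16_MAX` ends up stored as `UINT8_MAX` anyway. The `toy_binding` struct and helper are hypothetical names for this sketch, not the anv definitions.

```c
#include <stdint.h>

/* Hypothetical stand-in for a binding struct with an 8-bit index field. */
struct toy_binding {
   uint8_t index;
};

/* Store a value through an explicit cast, which does the same thing as the
 * implicit conversion the compiler warned about: unsigned narrowing is
 * defined as reduction modulo 256, i.e. it keeps the low 8 bits. */
static uint8_t
toy_store_index(uint32_t value)
{
   struct toy_binding b = { .index = (uint8_t) value };

   return b.index;
}
```

Passing `UINT16_MAX` (0xFFFF) or `UINT8_MAX` (0xFF) yields the same stored value, 0xFF, so the patch changes what the source says without changing what the field holds.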
Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled
On 15/06/2016 00:21, Roland Scheidegger wrote: Am 14.06.2016 um 23:33 schrieb Axel Davy: diff --git a/src/gallium/include/pipe/p_state.h b/src/gallium/include/pipe/p_state.h index 396f563..7dce80a 100644 --- a/src/gallium/include/pipe/p_state.h +++ b/src/gallium/include/pipe/p_state.h @@ -139,6 +139,13 @@ struct pipe_rasterizer_state unsigned clip_halfz:1; /** +* When true do not scale offset_units and use same rules for unorm and +* float depth buffers (D3D9). When false use GL/D3D1X behaviour. +* This depends on PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. +*/ + unsigned offset_units_unscaled; + + /** * Enable bits for clipping half-spaces. * This applies to both user clip planes and shader clip distances. * Note that if the bound shader exports any clip distances, these I don't like this. Generally, for unorm formats, you can easily enough translate this from d3d9 to gl (or d3d10) rules (but yes, obviously it's going to be format dependent). (With one big caveat, in general not all gl drivers think the minimum resolvable difference is the same, that might range from 2^-22 to 2^-24 for 24bit unorm depth for instance, and I don't think it's quite consistent with gallium drivers neither). You are right though for float depth the formula is different, and you can't translate it. But do you really need float depth buffer support? AFAIK no d3d9 app really depends on it, everything can fall back to d24. Roland Hi, That's true float depth buffer do not seem to be widely used in d3d9. The two float depth buffers available in d3d9, as far as I know, are D32F_LOCKABLE and D24FS8. 
We can see the support for those and other depth buffers here (note that these are mainly old cards): http://zp.amsnet.pl/cdragan/query.php?dxversion=9&feature=formats&featuregroup=selected&adaptergroup=all&featureselected%5B%5D=45&featureselected%5B%5D=44&featureselected%5B%5D=41&featureselected%5B%5D=42&featureselected%5B%5D=43&featureselected%5B%5D=40&featureselected%5B%5D=39&featureselected%5B%5D=46&resource=SURFACE&usage=DEPTHSTENCIL&orientation=horizontal It is likely not a requirement for any game to support these formats. We could ignore these formats, and add to gallium a way to get the minimum resolvable difference per depth buffer format from drivers. We considered this option. That said, the driver is the best location to know about the minimum resolvable difference, and we made the choice to let the driver do the scaling instead of doing it based on some driver query in the state tracker. As for floating point depth buffers behaviour, I understand for some drivers it may be harder than for others to implement. That doesn't seem however a reason to drop floating depth buffer support in Gallium Nine. D32F_LOCKABLE is particularly useful for debugging, being lockable, it can be used to show depth buffer content after some draw calls for d3d on windows, and compare with nine. And some apps may use it for some particular effects. I'd be ok if we make the float depth buffer part of offset_units_unscaled optional given how rare the combination float depth buffers + depth bias must be used. However if hw can do it, I see no reason why we wouldn't support the capability? Yours, Axel Davy ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled
Am 14.06.2016 um 23:33 schrieb Axel Davy: > D3D9 has a different behaviour for depth bias. > > For OGL/D3D1X, the depth bias unit is the > minimal resolvable value for the depth buffer, > which depends on the format (and has different > behaviour for float depth buffers). > > For D3D9, the depth bias unit is 1.0f. > > Signed-off-by: Axel Davy > --- > src/gallium/docs/source/cso/rasterizer.rst | 6 ++ > src/gallium/docs/source/screen.rst | 2 ++ > src/gallium/drivers/freedreno/freedreno_screen.c | 1 + > src/gallium/drivers/i915/i915_screen.c | 1 + > src/gallium/drivers/ilo/ilo_screen.c | 1 + > src/gallium/drivers/llvmpipe/lp_screen.c | 1 + > src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + > src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + > src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + > src/gallium/drivers/r300/r300_screen.c | 1 + > src/gallium/drivers/r600/r600_pipe.c | 1 + > src/gallium/drivers/radeonsi/si_pipe.c | 1 + > src/gallium/drivers/softpipe/sp_screen.c | 1 + > src/gallium/drivers/svga/svga_screen.c | 1 + > src/gallium/drivers/swr/swr_screen.cpp | 1 + > src/gallium/drivers/vc4/vc4_screen.c | 1 + > src/gallium/drivers/virgl/virgl_screen.c | 1 + > src/gallium/include/pipe/p_defines.h | 1 + > src/gallium/include/pipe/p_state.h | 7 +++ > 19 files changed, 31 insertions(+) > > diff --git a/src/gallium/docs/source/cso/rasterizer.rst > b/src/gallium/docs/source/cso/rasterizer.rst > index 8d473b8..616e451 100644 > --- a/src/gallium/docs/source/cso/rasterizer.rst > +++ b/src/gallium/docs/source/cso/rasterizer.rst > @@ -127,6 +127,12 @@ offset_tri > > offset_units > Specifies the polygon offset bias > +offset_units_unscaled > +Specifies the unit of the polygon offset bias. If false, use the > +GL/D3D1X behaviour. If true, offset_units is a floating point offset > +which isn't scaled (D3D9). Note that GL/D3D1X behaviour has different > +formula whether the depth buffer is unorm or float, which is not > +the case for D3D9. 
> offset_scale > Specifies the polygon offset scale > offset_clamp > diff --git a/src/gallium/docs/source/screen.rst > b/src/gallium/docs/source/screen.rst > index 920da42..9c26604 100644 > --- a/src/gallium/docs/source/screen.rst > +++ b/src/gallium/docs/source/screen.rst > @@ -340,6 +340,8 @@ The integer capabilities: >extension and thus implements proper support for culling planes. > * ``PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES``: Whether primitive restart is >supported for patch primitives. > +* ``PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED``: If true, the driver implements > support > + for ``pipe_rasterizer_state::offset_units_unscaled``. > > > .. _pipe_capf: > diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c > b/src/gallium/drivers/freedreno/freedreno_screen.c > index ad15aab..ed61456 100644 > --- a/src/gallium/drivers/freedreno/freedreno_screen.c > +++ b/src/gallium/drivers/freedreno/freedreno_screen.c > @@ -262,6 +262,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum > pipe_cap param) > case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: > case PIPE_CAP_CULL_DISTANCE: > case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: > + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: > return 0; > > case PIPE_CAP_MAX_VIEWPORTS: > diff --git a/src/gallium/drivers/i915/i915_screen.c > b/src/gallium/drivers/i915/i915_screen.c > index c0e06e5..ea451e6 100644 > --- a/src/gallium/drivers/i915/i915_screen.c > +++ b/src/gallium/drivers/i915/i915_screen.c > @@ -273,6 +273,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap > cap) > case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: > case PIPE_CAP_CULL_DISTANCE: > case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: > + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: >return 0; > > case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: > diff --git a/src/gallium/drivers/ilo/ilo_screen.c > b/src/gallium/drivers/ilo/ilo_screen.c > index c847a90..c9b8d81 100644 > --- a/src/gallium/drivers/ilo/ilo_screen.c > +++ 
b/src/gallium/drivers/ilo/ilo_screen.c > @@ -502,6 +502,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap > param) > case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: > case PIPE_CAP_CULL_DISTANCE: > case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: > + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: >return 0; > > case PIPE_CAP_VENDOR_ID: > diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c > b/src/gallium/drivers/llvmpipe/lp_screen.c > index 5fc4427..f9217d6 100644 > --- a/src/gallium/drivers/llvmpipe/lp_screen.c > +++ b/src/gallium/drivers/llvmpipe/lp_screen.c > @@ -329,6 +329,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum > pipe_cap param) > case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT: >
[Mesa-dev] [PATCH 7/8] r600, r600g: Implement POLYGON_OFFSET_UNITS_UNSCALED
Empirical tests show that the polygon offset behaviour is entirely determined by the content of the PA_SU_POLY_OFFSET states, and not by the depth buffer format bound. PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL (ie setting the unorm depth bias behaviour with a scale of 2^0 = 1.0f) gives the unscaled behaviour. Signed-off-by: Axel Davy --- src/gallium/drivers/r600/evergreen_state.c | 39 +++- src/gallium/drivers/r600/r600_pipe.c | 2 +- src/gallium/drivers/r600/r600_pipe.h | 2 ++ src/gallium/drivers/r600/r600_state.c| 35 + src/gallium/drivers/r600/r600_state_common.c | 4 ++- 5 files changed, 46 insertions(+), 36 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 9346ae9..e041842 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -493,6 +493,7 @@ static void *evergreen_create_rs_state(struct pipe_context *ctx, rs->offset_units = state->offset_units; rs->offset_scale = state->offset_scale * 16.0f; rs->offset_enable = state->offset_point || state->offset_line || state->offset_tri; + rs->offset_units_unscaled = state->offset_units_unscaled; if (state->point_size_per_vertex) { psize_min = util_get_min_point_size(state); @@ -1661,24 +1662,26 @@ static void evergreen_emit_polygon_offset(struct r600_context *rctx, struct r600 float offset_scale = state->offset_scale; uint32_t pa_su_poly_offset_db_fmt_cntl = 0; - switch (state->zs_format) { - case PIPE_FORMAT_Z24X8_UNORM: - case PIPE_FORMAT_Z24_UNORM_S8_UINT: - case PIPE_FORMAT_X8Z24_UNORM: - case PIPE_FORMAT_S8_UINT_Z24_UNORM: - offset_units *= 2.0f; - pa_su_poly_offset_db_fmt_cntl = - S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24); - break; - case PIPE_FORMAT_Z16_UNORM: - offset_units *= 4.0f; - pa_su_poly_offset_db_fmt_cntl = - S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16); - break; - default: - 
pa_su_poly_offset_db_fmt_cntl = - S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) | - S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1); + if (!state->offset_units_unscaled) { + switch (state->zs_format) { + case PIPE_FORMAT_Z24X8_UNORM: + case PIPE_FORMAT_Z24_UNORM_S8_UINT: + case PIPE_FORMAT_X8Z24_UNORM: + case PIPE_FORMAT_S8_UINT_Z24_UNORM: + offset_units *= 2.0f; + pa_su_poly_offset_db_fmt_cntl = + S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24); + break; + case PIPE_FORMAT_Z16_UNORM: + offset_units *= 4.0f; + pa_su_poly_offset_db_fmt_cntl = + S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16); + break; + default: + pa_su_poly_offset_db_fmt_cntl = + S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) | + S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1); + } } radeon_set_context_reg_seq(cs, R_028B80_PA_SU_POLY_OFFSET_FRONT_SCALE, 4); diff --git a/src/gallium/drivers/r600/r600_pipe.c b/src/gallium/drivers/r600/r600_pipe.c index db4fd1b..d9fffe9 100644 --- a/src/gallium/drivers/r600/r600_pipe.c +++ b/src/gallium/drivers/r600/r600_pipe.c @@ -282,6 +282,7 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param) case PIPE_CAP_SURFACE_REINTERPRET_BLOCKS: case PIPE_CAP_QUERY_MEMORY_INFO: case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT: + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: return 1; case PIPE_CAP_DEVICE_RESET_STATUS_QUERY: @@ -368,7 +369,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum pipe_cap param) case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: case PIPE_CAP_CULL_DISTANCE: case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: - case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: return 0; case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS: diff --git a/src/gallium/drivers/r600/r600_pipe.h b/src/gallium/drivers/r600/r600_pipe.h index 9677bb6..0dd538b 100644 --- a/src/gallium/drivers/r600/r600_pipe.h +++ b/src/gallium/drivers/r600/r600_pipe.h @@ -273,6 +273,7 @@ struct r600_rasterizer_state { float offset_units; float offset_scale; booloffset_enable; + bool
[Mesa-dev] [PATCH 5/8] radeon: Remove useless pa_su_poly_offset_db_fmt_cntl
pa_su_poly_offset_db_fmt_cntl usages were removed in previous patches.

Signed-off-by: Axel Davy
---
 src/gallium/drivers/radeon/r600_pipe_common.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h b/src/gallium/drivers/radeon/r600_pipe_common.h
index 8072833..14edeea 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -315,7 +315,6 @@ struct r600_surface {
 	unsigned db_htile_surface;
 	unsigned db_htile_data_base;
 	unsigned db_preload_control;	/* EG and later */
-	unsigned pa_su_poly_offset_db_fmt_cntl;
 };
 
 struct r600_common_screen {
-- 
2.8.3
[Mesa-dev] [PATCH 8/8] st/nine: Use offset_units_unscaled
offset_units_unscaled enables proper support for depth bias for gallium nine. Use it if available. Solves issues with some games using depth bias. For example: https://github.com/iXit/Mesa-3D/issues/220 Signed-off-by: Axel Davy --- src/gallium/state_trackers/nine/device9.c| 1 + src/gallium/state_trackers/nine/device9.h| 1 + src/gallium/state_trackers/nine/nine_pipe.c | 18 +- src/gallium/state_trackers/nine/nine_pipe.h | 2 +- src/gallium/state_trackers/nine/nine_state.c | 2 +- 5 files changed, 13 insertions(+), 11 deletions(-) diff --git a/src/gallium/state_trackers/nine/device9.c b/src/gallium/state_trackers/nine/device9.c index f510af7..98636fd 100644 --- a/src/gallium/state_trackers/nine/device9.c +++ b/src/gallium/state_trackers/nine/device9.c @@ -427,6 +427,7 @@ NineDevice9_ctor( struct NineDevice9 *This, This->driver_caps.window_space_position_support = GET_PCAP(TGSI_VS_WINDOW_SPACE_POSITION); This->driver_caps.vs_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_VERTEX, PIPE_SHADER_CAP_INTEGERS); This->driver_caps.ps_integer = pScreen->get_shader_param(pScreen, PIPE_SHADER_FRAGMENT, PIPE_SHADER_CAP_INTEGERS); +This->driver_caps.offset_units_unscaled = GET_PCAP(POLYGON_OFFSET_UNITS_UNSCALED); nine_ff_init(This); /* initialize fixed function code */ diff --git a/src/gallium/state_trackers/nine/device9.h b/src/gallium/state_trackers/nine/device9.h index 73a43cf..d584a35 100644 --- a/src/gallium/state_trackers/nine/device9.h +++ b/src/gallium/state_trackers/nine/device9.h @@ -121,6 +121,7 @@ struct NineDevice9 boolean window_space_position_support; boolean vs_integer; boolean ps_integer; +boolean offset_units_unscaled; } driver_caps; struct { diff --git a/src/gallium/state_trackers/nine/nine_pipe.c b/src/gallium/state_trackers/nine/nine_pipe.c index e3f9717..ea7dc16 100644 --- a/src/gallium/state_trackers/nine/nine_pipe.c +++ b/src/gallium/state_trackers/nine/nine_pipe.c @@ -70,7 +70,9 @@ nine_convert_dsa_state(struct pipe_depth_stencil_alpha_state 
*dsa_state, } void -nine_convert_rasterizer_state(struct pipe_rasterizer_state *rast_state, const DWORD *rs) +nine_convert_rasterizer_state(struct NineDevice9 *device, + struct pipe_rasterizer_state *rast_state, + const DWORD *rs) { struct pipe_rasterizer_state rast; @@ -120,14 +122,12 @@ nine_convert_rasterizer_state(struct pipe_rasterizer_state *rast_state, const DW /* offset_units has the ogl/d3d11 meaning. * d3d9: offset = scale * dz + bias * ogl/d3d11: offset = scale * dz + r * bias - * with r implementation dependant and is supposed to be - * the smallest value the depth buffer format can hold. - * In practice on current and past hw it seems to be 2^-23 - * for all formats except float formats where it varies depending - * on the content. - * For now use 1 << 23, but in the future perhaps add a way in gallium - * to get r for the format or get the gallium behaviour */ -rast.offset_units = asfloat(rs[D3DRS_DEPTHBIAS]) * (float)(1 << 23); + * with r implementation dependent (+ different formula for float depth + * buffers). r=2^-23 is often the right value for gallium drivers. + * If possible, use offset_units_unscaled, which gives the d3d9 + * behaviour, else scale by 1 << 23 */ +rast.offset_units = asfloat(rs[D3DRS_DEPTHBIAS]) * (device->driver_caps.offset_units_unscaled ? 
1.0f : (float)(1 << 23)); +rast.offset_units_unscaled = device->driver_caps.offset_units_unscaled; rast.offset_scale = asfloat(rs[D3DRS_SLOPESCALEDEPTHBIAS]); /* rast.offset_clamp = 0.0f; */ diff --git a/src/gallium/state_trackers/nine/nine_pipe.h b/src/gallium/state_trackers/nine/nine_pipe.h index 4d2bc92..fe8e910 100644 --- a/src/gallium/state_trackers/nine/nine_pipe.h +++ b/src/gallium/state_trackers/nine/nine_pipe.h @@ -38,7 +38,7 @@ extern const enum pipe_format nine_d3d9_to_pipe_format_map[120]; extern const D3DFORMAT nine_pipe_to_d3d9_format_map[PIPE_FORMAT_COUNT]; void nine_convert_dsa_state(struct pipe_depth_stencil_alpha_state *, const DWORD *); -void nine_convert_rasterizer_state(struct pipe_rasterizer_state *, const DWORD *); +void nine_convert_rasterizer_state(struct NineDevice9 *, struct pipe_rasterizer_state *, const DWORD *); void nine_convert_blend_state(struct pipe_blend_state *, const DWORD *); void nine_convert_sampler_state(struct cso_context *, int idx, const DWORD *); diff --git a/src/gallium/state_trackers/nine/nine_state.c b/src/gallium/state_trackers/nine/nine_state.c index f0b3d0d..3aa8906 100644 --- a/src/gallium/state_trackers/nine/nine_state.c +++ b/src/gallium/state_trackers/nine/nine_state.c @@ -74,7 +74,7 @@ prepare_dsa(struct NineDevice9 *device) static inline void prepare_rasterizer(struct NineDevice9 *device) { -nine_convert_rasterizer_state(&device-
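The two bias conventions in the comment above can be compared numerically with a short sketch. The function names are hypothetical, the code models only the unorm formula quoted in the patch (offset = scale * dz + r * bias), and `r` is passed in explicitly because, as the thread notes, drivers do not all agree on it.

```c
#include <stdbool.h>

/* d3d9:      offset = scale * dz + bias
 * ogl/d3d11: offset = scale * dz + r * bias   (unorm depth buffers)
 * 'r' is the minimum resolvable difference; 2^-23 is the value the patch
 * assumes when the driver lacks the unscaled mode. */

/* Value a D3D9 state tracker would hand to the driver as offset_units. */
static float
toy_offset_units(float d3d9_bias, bool driver_has_unscaled)
{
   return d3d9_bias * (driver_has_unscaled ? 1.0f : (float)(1 << 23));
}

/* Bias the rasterizer ends up applying, for a driver whose r is 'r'. */
static float
toy_applied_offset(float offset_units, bool unscaled,
                   float slope_scale, float dz, float r)
{
   float units_term = unscaled ? offset_units : r * offset_units;

   return slope_scale * dz + units_term;
}
```

With r = 2^-23 the scaled fallback reproduces the D3D9 bias; with a driver that uses r = 2^-24 the same fallback applies only half the requested bias, whereas the unscaled path gives the D3D9 value regardless of r. That mismatch is the motivation the patch gives for preferring `offset_units_unscaled` when the cap is available.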
[Mesa-dev] [PATCH 3/8] r600: Emit poly_offset states together
Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states. This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED. Signed-off-by: Axel Davy --- src/gallium/drivers/r600/r600_state.c | 35 --- 1 file changed, 12 insertions(+), 23 deletions(-) diff --git a/src/gallium/drivers/r600/r600_state.c b/src/gallium/drivers/r600/r600_state.c index cf7f0b3..edb1491 100644 --- a/src/gallium/drivers/r600/r600_state.c +++ b/src/gallium/drivers/r600/r600_state.c @@ -254,16 +254,24 @@ static void r600_emit_polygon_offset(struct r600_context *rctx, struct r600_atom struct r600_poly_offset_state *state = (struct r600_poly_offset_state*)a; float offset_units = state->offset_units; float offset_scale = state->offset_scale; + uint32_t pa_su_poly_offset_db_fmt_cntl = 0; switch (state->zs_format) { case PIPE_FORMAT_Z24X8_UNORM: case PIPE_FORMAT_Z24_UNORM_S8_UINT: offset_units *= 2.0f; + pa_su_poly_offset_db_fmt_cntl = + S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24); break; case PIPE_FORMAT_Z16_UNORM: offset_units *= 4.0f; + pa_su_poly_offset_db_fmt_cntl = + S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16); break; - default:; + default: + pa_su_poly_offset_db_fmt_cntl = + S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) | + S_028DF8_POLY_OFFSET_DB_IS_FLOAT_FMT(1); } radeon_set_context_reg_seq(cs, R_028E00_PA_SU_POLY_OFFSET_FRONT_SCALE, 4); @@ -271,6 +279,9 @@ static void r600_emit_polygon_offset(struct r600_context *rctx, struct r600_atom radeon_emit(cs, fui(offset_units)); radeon_emit(cs, fui(offset_scale)); radeon_emit(cs, fui(offset_units)); + + radeon_set_context_reg(cs, R_028DF8_PA_SU_POLY_OFFSET_DB_FMT_CNTL, + pa_su_poly_offset_db_fmt_cntl); } static uint32_t r600_get_blend_control(const struct pipe_blend_state *state, unsigned i) @@ -1059,25 +1070,6 @@ static void r600_init_depth_surface(struct r600_context *rctx, surf->db_depth_size = S_028000_PITCH_TILE_MAX(pitch) | S_028000_SLICE_TILE_MAX(slice); surf->db_prefetch_limit = 
(rtex->surface.level[level].nblk_y / 8) - 1; - switch (surf->base.format) { - case PIPE_FORMAT_Z24X8_UNORM: - case PIPE_FORMAT_Z24_UNORM_S8_UINT: - surf->pa_su_poly_offset_db_fmt_cntl = - S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24); - break; - case PIPE_FORMAT_Z32_FLOAT: - case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT: - surf->pa_su_poly_offset_db_fmt_cntl = - S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) | - S_028DF8_POLY_OFFSET_DB_IS_FLOAT_FMT(1); - break; - case PIPE_FORMAT_Z16_UNORM: - surf->pa_su_poly_offset_db_fmt_cntl = - S_028DF8_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16); - break; - default:; - } - /* use htile only for first level */ if (rtex->htile_buffer && !level) { surf->db_htile_data_base = 0; @@ -1457,9 +1449,6 @@ static void r600_emit_framebuffer_state(struct r600_context *rctx, struct r600_a RADEON_PRIO_DEPTH_BUFFER_MSAA : RADEON_PRIO_DEPTH_BUFFER); - radeon_set_context_reg(cs, R_028DF8_PA_SU_POLY_OFFSET_DB_FMT_CNTL, - surf->pa_su_poly_offset_db_fmt_cntl); - radeon_set_context_reg_seq(cs, R_028000_DB_DEPTH_SIZE, 2); radeon_emit(cs, surf->db_depth_size); /* R_028000_DB_DEPTH_SIZE */ radeon_emit(cs, surf->db_depth_view); /* R_028004_DB_DEPTH_VIEW */ -- 2.8.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] radeonsi: Implement POLYGON_OFFSET_UNITS_UNSCALED
Empirical tests show that the polygon offset behaviour is entirely
determined by the content of the PA_SU_POLY_OFFSET states, and not by
the depth buffer format bound.

PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon
offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL
(i.e. setting the unorm depth bias behaviour with a scale of
2^0 = 1.0f) gives the unscaled behaviour.

Signed-off-by: Axel Davy
---
 src/gallium/drivers/radeonsi/si_pipe.c  |  2 +-
 src/gallium/drivers/radeonsi/si_state.c | 32 ++--
 2 files changed, 19 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c
index 99c0349..430cca2 100644
--- a/src/gallium/drivers/radeonsi/si_pipe.c
+++ b/src/gallium/drivers/radeonsi/si_pipe.c
@@ -343,6 +343,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
 	case PIPE_CAP_TGSI_PACK_HALF_FLOAT:
 	case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT:
 	case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR:
+	case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
 		return 1;
 
 	case PIPE_CAP_RESOURCE_FROM_USER_MEMORY:
@@ -400,7 +401,6 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param)
 	case PIPE_CAP_QUERY_BUFFER_OBJECT:
 	case PIPE_CAP_CULL_DISTANCE:
 	case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES:
-	case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED:
 		return 0;
 
 	case PIPE_CAP_MAX_SHADER_PATCH_VARYINGS:
diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 06c65be..82b643a 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -812,20 +812,24 @@ static void *si_create_rs_state(struct pipe_context *ctx,
 		float offset_scale = state->offset_scale * 16.0f;
 		uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
-		switch (i) {
-		case 0: /* 16-bit zbuffer */
-			offset_units *= 4.0f;
-			pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
-			break;
-		case 1: /* 24-bit zbuffer */
-			offset_units *= 2.0f;
-			pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
-			break;
-		case 2: /* 32-bit zbuffer */
-			offset_units *= 1.0f;
-			pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
-							S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-			break;
+		if (!state->offset_units_unscaled) {
+			switch (i) {
+			case 0: /* 16-bit zbuffer */
+				offset_units *= 4.0f;
+				pa_su_poly_offset_db_fmt_cntl =
+					S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
+				break;
+			case 1: /* 24-bit zbuffer */
+				offset_units *= 2.0f;
+				pa_su_poly_offset_db_fmt_cntl =
+					S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
+				break;
+			case 2: /* 32-bit zbuffer */
+				offset_units *= 1.0f;
+				pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
+								S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
+				break;
+			}
 		}
 
 		si_pm4_set_reg(pm4, R_028B80_PA_SU_POLY_OFFSET_FRONT_SCALE,
--
2.8.3
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] r600g: Emit poly_offset states together
Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with the other poly_offset states.
This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.

Signed-off-by: Axel Davy
---
 src/gallium/drivers/r600/evergreen_state.c | 36 ++
 1 file changed, 12 insertions(+), 24 deletions(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 1ac8914..9346ae9 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1223,27 +1223,6 @@ static void evergreen_init_depth_surface(struct r600_context *rctx,
 	surf->db_depth_slice = S_02805C_SLICE_TILE_MAX(levelinfo->nblk_x *
 						       levelinfo->nblk_y / 64 - 1);
 
-	switch (surf->base.format) {
-	case PIPE_FORMAT_Z24X8_UNORM:
-	case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-	case PIPE_FORMAT_X8Z24_UNORM:
-	case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-		surf->pa_su_poly_offset_db_fmt_cntl =
-			S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
-		break;
-	case PIPE_FORMAT_Z32_FLOAT:
-	case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-		surf->pa_su_poly_offset_db_fmt_cntl =
-			S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
-			S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-		break;
-	case PIPE_FORMAT_Z16_UNORM:
-		surf->pa_su_poly_offset_db_fmt_cntl =
-			S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
-		break;
-	default:;
-	}
-
 	if (rtex->surface.flags & RADEON_SURF_SBUFFER) {
 		uint64_t stencil_offset;
 		unsigned stile_split = rtex->surface.stencil_tile_split;
@@ -1628,8 +1607,6 @@ static void evergreen_emit_framebuffer_state(struct r600_context *rctx, struct r
 					      RADEON_PRIO_DEPTH_BUFFER_MSAA :
 					      RADEON_PRIO_DEPTH_BUFFER);
 
-		radeon_set_context_reg(cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
-				       zb->pa_su_poly_offset_db_fmt_cntl);
 		radeon_set_context_reg(cs, R_028008_DB_DEPTH_VIEW, zb->db_depth_view);
 
 		radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 8);
@@ -1682,6 +1659,7 @@ static void evergreen_emit_polygon_offset(struct r600_context *rctx, struct r600
 	struct r600_poly_offset_state *state = (struct r600_poly_offset_state*)a;
 	float offset_units = state->offset_units;
 	float offset_scale = state->offset_scale;
+	uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
 	switch (state->zs_format) {
 	case PIPE_FORMAT_Z24X8_UNORM:
@@ -1689,11 +1667,18 @@ static void evergreen_emit_polygon_offset(struct r600_context *rctx, struct r600
 	case PIPE_FORMAT_X8Z24_UNORM:
 	case PIPE_FORMAT_S8_UINT_Z24_UNORM:
 		offset_units *= 2.0f;
+		pa_su_poly_offset_db_fmt_cntl =
+			S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-24);
 		break;
 	case PIPE_FORMAT_Z16_UNORM:
 		offset_units *= 4.0f;
+		pa_su_poly_offset_db_fmt_cntl =
+			S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-16);
 		break;
-	default:;
+	default:
+		pa_su_poly_offset_db_fmt_cntl =
+			S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS((char)-23) |
+			S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
 	}
 
 	radeon_set_context_reg_seq(cs, R_028B80_PA_SU_POLY_OFFSET_FRONT_SCALE, 4);
@@ -1701,6 +1686,9 @@ static void evergreen_emit_polygon_offset(struct r600_context *rctx, struct r600
 	radeon_emit(cs, fui(offset_units));
 	radeon_emit(cs, fui(offset_scale));
 	radeon_emit(cs, fui(offset_units));
+
+	radeon_set_context_reg(cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
+			       pa_su_poly_offset_db_fmt_cntl);
 }
 
 static void evergreen_emit_cb_misc_state(struct r600_context *rctx, struct r600_atom *atom)
--
2.8.3
[Mesa-dev] [PATCH 2/8] radeonsi: Emit poly_offset states together
Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with rasterizer poly_offset states.
This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.

Signed-off-by: Axel Davy
---
 src/gallium/drivers/radeonsi/si_state.c | 31 ---
 1 file changed, 8 insertions(+), 23 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c
index 14520ca..06c65be 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -810,16 +810,21 @@ static void *si_create_rs_state(struct pipe_context *ctx,
 		struct si_pm4_state *pm4 = &rs->pm4_poly_offset[i];
 		float offset_units = state->offset_units;
 		float offset_scale = state->offset_scale * 16.0f;
+		uint32_t pa_su_poly_offset_db_fmt_cntl = 0;
 
 		switch (i) {
 		case 0: /* 16-bit zbuffer */
 			offset_units *= 4.0f;
+			pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
 			break;
 		case 1: /* 24-bit zbuffer */
 			offset_units *= 2.0f;
+			pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
 			break;
 		case 2: /* 32-bit zbuffer */
 			offset_units *= 1.0f;
+			pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
+							S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
 			break;
 		}
 
@@ -831,6 +836,8 @@ static void *si_create_rs_state(struct pipe_context *ctx,
 			       fui(offset_scale));
 		si_pm4_set_reg(pm4, R_028B8C_PA_SU_POLY_OFFSET_BACK_OFFSET,
 			       fui(offset_units));
+		si_pm4_set_reg(pm4, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
+			       pa_su_poly_offset_db_fmt_cntl);
 	}
 
 	return rs;
@@ -2097,26 +2104,7 @@ static void si_init_depth_surface(struct si_context *sctx,
 	unsigned format;
 	uint32_t z_info, s_info, db_depth_info;
 	uint64_t z_offs, s_offs;
-	uint32_t db_htile_data_base, db_htile_surface, pa_su_poly_offset_db_fmt_cntl = 0;
-
-	switch (sctx->framebuffer.state.zsbuf->texture->format) {
-	case PIPE_FORMAT_S8_UINT_Z24_UNORM:
-	case PIPE_FORMAT_X8Z24_UNORM:
-	case PIPE_FORMAT_Z24X8_UNORM:
-	case PIPE_FORMAT_Z24_UNORM_S8_UINT:
-		pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-24);
-		break;
-	case PIPE_FORMAT_Z32_FLOAT:
-	case PIPE_FORMAT_Z32_FLOAT_S8X24_UINT:
-		pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-23) |
-						S_028B78_POLY_OFFSET_DB_IS_FLOAT_FMT(1);
-		break;
-	case PIPE_FORMAT_Z16_UNORM:
-		pa_su_poly_offset_db_fmt_cntl = S_028B78_POLY_OFFSET_NEG_NUM_DB_BITS(-16);
-		break;
-	default:
-		assert(0);
-	}
+	uint32_t db_htile_data_base, db_htile_surface;
 
 	format = si_translate_dbformat(rtex->resource.b.b.format);
 
@@ -2213,7 +2201,6 @@ static void si_init_depth_surface(struct si_context *sctx,
 	surf->db_depth_slice = S_02805C_SLICE_TILE_MAX((levelinfo->nblk_x *
 							levelinfo->nblk_y) / 64 - 1);
 	surf->db_htile_surface = db_htile_surface;
-	surf->pa_su_poly_offset_db_fmt_cntl = pa_su_poly_offset_db_fmt_cntl;
 
 	surf->depth_initialized = true;
 }
@@ -2514,8 +2501,6 @@ static void si_emit_framebuffer_state(struct si_context *sctx, struct r600_atom
 		radeon_emit(cs, fui(rtex->depth_clear_value));	/* R_02802C_DB_DEPTH_CLEAR */
 
 		radeon_set_context_reg(cs, R_028ABC_DB_HTILE_SURFACE, zb->db_htile_surface);
-		radeon_set_context_reg(cs, R_028B78_PA_SU_POLY_OFFSET_DB_FMT_CNTL,
-				       zb->pa_su_poly_offset_db_fmt_cntl);
 	} else if (sctx->framebuffer.dirty_zsbuf) {
 		radeon_set_context_reg_seq(cs, R_028040_DB_Z_INFO, 2);
 		radeon_emit(cs, S_028040_FORMAT(V_028040_Z_INVALID)); /* R_028040_DB_Z_INFO */
--
2.8.3
[Mesa-dev] [PATCH 1/8] gallium: Add a cap for offset_units_unscaled
D3D9 has a different behaviour for depth bias. For OGL/D3D1X, the depth bias unit is the minimal resolvable value for the depth buffer, which depends on the format (and has different behaviour for float depth buffers). For D3D9, the depth bias unit is 1.0f. Signed-off-by: Axel Davy --- src/gallium/docs/source/cso/rasterizer.rst | 6 ++ src/gallium/docs/source/screen.rst | 2 ++ src/gallium/drivers/freedreno/freedreno_screen.c | 1 + src/gallium/drivers/i915/i915_screen.c | 1 + src/gallium/drivers/ilo/ilo_screen.c | 1 + src/gallium/drivers/llvmpipe/lp_screen.c | 1 + src/gallium/drivers/nouveau/nv30/nv30_screen.c | 1 + src/gallium/drivers/nouveau/nv50/nv50_screen.c | 1 + src/gallium/drivers/nouveau/nvc0/nvc0_screen.c | 1 + src/gallium/drivers/r300/r300_screen.c | 1 + src/gallium/drivers/r600/r600_pipe.c | 1 + src/gallium/drivers/radeonsi/si_pipe.c | 1 + src/gallium/drivers/softpipe/sp_screen.c | 1 + src/gallium/drivers/svga/svga_screen.c | 1 + src/gallium/drivers/swr/swr_screen.cpp | 1 + src/gallium/drivers/vc4/vc4_screen.c | 1 + src/gallium/drivers/virgl/virgl_screen.c | 1 + src/gallium/include/pipe/p_defines.h | 1 + src/gallium/include/pipe/p_state.h | 7 +++ 19 files changed, 31 insertions(+) diff --git a/src/gallium/docs/source/cso/rasterizer.rst b/src/gallium/docs/source/cso/rasterizer.rst index 8d473b8..616e451 100644 --- a/src/gallium/docs/source/cso/rasterizer.rst +++ b/src/gallium/docs/source/cso/rasterizer.rst @@ -127,6 +127,12 @@ offset_tri offset_units Specifies the polygon offset bias +offset_units_unscaled +Specifies the unit of the polygon offset bias. If false, use the +GL/D3D1X behaviour. If true, offset_units is a floating point offset +which isn't scaled (D3D9). Note that GL/D3D1X behaviour has different +formula whether the depth buffer is unorm or float, which is not +the case for D3D9. 
offset_scale Specifies the polygon offset scale offset_clamp diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst index 920da42..9c26604 100644 --- a/src/gallium/docs/source/screen.rst +++ b/src/gallium/docs/source/screen.rst @@ -340,6 +340,8 @@ The integer capabilities: extension and thus implements proper support for culling planes. * ``PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES``: Whether primitive restart is supported for patch primitives. +* ``PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED``: If true, the driver implements support + for ``pipe_rasterizer_state::offset_units_unscaled``. .. _pipe_capf: diff --git a/src/gallium/drivers/freedreno/freedreno_screen.c b/src/gallium/drivers/freedreno/freedreno_screen.c index ad15aab..ed61456 100644 --- a/src/gallium/drivers/freedreno/freedreno_screen.c +++ b/src/gallium/drivers/freedreno/freedreno_screen.c @@ -262,6 +262,7 @@ fd_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param) case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: case PIPE_CAP_CULL_DISTANCE: case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: return 0; case PIPE_CAP_MAX_VIEWPORTS: diff --git a/src/gallium/drivers/i915/i915_screen.c b/src/gallium/drivers/i915/i915_screen.c index c0e06e5..ea451e6 100644 --- a/src/gallium/drivers/i915/i915_screen.c +++ b/src/gallium/drivers/i915/i915_screen.c @@ -273,6 +273,7 @@ i915_get_param(struct pipe_screen *screen, enum pipe_cap cap) case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: case PIPE_CAP_CULL_DISTANCE: case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: return 0; case PIPE_CAP_MAX_DUAL_SOURCE_RENDER_TARGETS: diff --git a/src/gallium/drivers/ilo/ilo_screen.c b/src/gallium/drivers/ilo/ilo_screen.c index c847a90..c9b8d81 100644 --- a/src/gallium/drivers/ilo/ilo_screen.c +++ b/src/gallium/drivers/ilo/ilo_screen.c @@ -502,6 +502,7 @@ ilo_get_param(struct pipe_screen *screen, enum pipe_cap param) case 
PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: case PIPE_CAP_CULL_DISTANCE: case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: return 0; case PIPE_CAP_VENDOR_ID: diff --git a/src/gallium/drivers/llvmpipe/lp_screen.c b/src/gallium/drivers/llvmpipe/lp_screen.c index 5fc4427..f9217d6 100644 --- a/src/gallium/drivers/llvmpipe/lp_screen.c +++ b/src/gallium/drivers/llvmpipe/lp_screen.c @@ -329,6 +329,7 @@ llvmpipe_get_param(struct pipe_screen *screen, enum pipe_cap param) case PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENT: case PIPE_CAP_ROBUST_BUFFER_ACCESS_BEHAVIOR: case PIPE_CAP_PRIMITIVE_RESTART_FOR_PATCHES: + case PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED: return 0; } /* should only get here on unhandled cases */ diff --git a/src/gallium/drivers/nouveau/
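To make the difference described in the commit message of this patch concrete, here is a minimal sketch in C. It is not Mesa code: the struct and function names are hypothetical, and the unorm unit of 2^-bits is an assumption, since drivers do not all agree on the minimum resolvable difference. The float-buffer rule follows the usual GL/D3D1X definition, r = 2^(e - 23), with e the exponent of the largest depth value in the primitive.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical mirror of the three rasterizer fields discussed in the patch. */
struct depth_bias_params {
    float offset_units;
    float offset_scale;
    bool  offset_units_unscaled;  /* true: D3D9 rule, false: GL/D3D1X rule */
};

/* Assumed minimum resolvable difference for an n-bit unorm depth buffer. */
static float unorm_min_resolvable(int bits)
{
    return ldexpf(1.0f, -bits);
}

/* GL/D3D1X rule for a 32-bit float depth buffer: r = 2^(e - 23), where e is
 * the binary exponent of max_z (assumes max_z > 0). */
static float float_min_resolvable(float max_z)
{
    int e;
    frexpf(max_z, &e);               /* max_z = f * 2^e with f in [0.5, 1) */
    return ldexpf(1.0f, (e - 1) - 23);
}

/* Bias added to the fragment depth.  max_slope is the polygon's maximum depth
 * slope and r the format-dependent unit.  With offset_units_unscaled set, the
 * unit is 1.0f and r is ignored, which is the D3D9 behaviour. */
static float depth_bias(const struct depth_bias_params *p, float max_slope, float r)
{
    float unit = p->offset_units_unscaled ? 1.0f : r;
    return max_slope * p->offset_scale + unit * p->offset_units;
}
```

With offset_units = 2 on a 24-bit unorm buffer, the scaled rule gives a bias of 2 * 2^-24, while the unscaled rule would add 2.0 directly. This is why a D3D9 state tracker passes small floating-point values in offset_units when the flag is set.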
[Mesa-dev] [PATCH] vc4: fix vc4_resource_from_handle() stride calculation
The expected stride calculation is completely wrong. It should
ultimately be multiplying cpp and width rather than dividing. The width
also needs to be aligned to the tiling width first before converting to
stride bytes.

The whole stride check here is possibly pointless. Any buffers which
were allocated outside of vc4 may have strides with larger alignment
requirements.

Signed-off-by: Rob Herring
---
 src/gallium/drivers/vc4/vc4_resource.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/vc4/vc4_resource.c b/src/gallium/drivers/vc4/vc4_resource.c
index 20f137a..aabe593 100644
--- a/src/gallium/drivers/vc4/vc4_resource.c
+++ b/src/gallium/drivers/vc4/vc4_resource.c
@@ -534,8 +534,8 @@ vc4_resource_from_handle(struct pipe_screen *pscreen,
         struct vc4_resource *rsc = vc4_resource_setup(pscreen, tmpl);
         struct pipe_resource *prsc = &rsc->base.b;
         struct vc4_resource_slice *slice = &rsc->slices[0];
-        uint32_t expected_stride = align(prsc->width0 / rsc->cpp,
-                                         vc4_utile_width(rsc->cpp));
+        uint32_t expected_stride =
+            align(prsc->width0, vc4_utile_width(rsc->cpp)) * rsc->cpp;
 
         if (!rsc)
                 return NULL;
--
2.7.4
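The arithmetic behind this fix can be checked in isolation. The sketch below is a standalone approximation, not the driver code: align_pot() and utile_width() are hypothetical stand-ins for Mesa's align() and vc4_utile_width(), and the tile widths in the table are an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t align_pot(uint32_t v, uint32_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical stand-in for vc4_utile_width(): tile width in pixels for a
 * given number of bytes per pixel (cpp).  The values are assumed. */
static uint32_t utile_width(uint32_t cpp)
{
    switch (cpp) {
    case 1:
    case 2:
        return 8;
    case 4:
        return 4;
    case 8:
        return 2;
    default:
        return 1;  /* unknown cpp: no extra alignment */
    }
}

/* Old calculation: divides the width by cpp, so the result is not in bytes. */
static uint32_t old_expected_stride(uint32_t width, uint32_t cpp)
{
    return align_pot(width / cpp, utile_width(cpp));
}

/* Fixed calculation: align the pixel width to the tile width, then convert
 * the aligned width to bytes. */
static uint32_t new_expected_stride(uint32_t width, uint32_t cpp)
{
    return align_pot(width, utile_width(cpp)) * cpp;
}
```

For a 30-pixel-wide surface at 4 bytes per pixel, the old expression yields 8 while the fixed one yields 128 bytes (32 aligned pixels times 4), which is what a stride in bytes should look like.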
Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks
On Tue, 2016-06-14 at 22:44 +0200, Jakob Sinclair wrote:
> On 2016-06-14 20:39, Jan Vesely wrote:
> > I really disagree here. The conditions check whether swizzle is
> > between X and W (as in, only X,Y,Z,W are allowed). The fact that
> > X maps to 0 is irrelevant. removing the checks impairs readability
> > of the code because the lower bound is now inferred (by being 0)
> > rather than explicit.
> >
> > the same comment applies to your v2.
> >
> > Jan
>
> Thanks for the input. Now when I think about it again this is
> probably a bad change.
> Didn't think about the lower bound. So this patch should probably
> not be pushed.

It'd still be nice to have them fixed. Those lines produce
-Wtype-limits ("comparison always false due to limited range of data
types") warnings, which are rather useful in other cases, since
type-limits issues can lead to dead-code elimination (DCE) of useful
code with aggressive optimization. Not sure what the best way to do it
is, though.

Jan

--
Jan Vesely
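For illustration only, one common idiom that keeps both bounds named without comparing an unsigned value against zero is a single range check on the offset from the lower bound. This is not something the thread settled on, and the enum below is a hypothetical encoding; the only property it shares with the code under discussion is that X maps to 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical swizzle encoding; only "X maps to 0" is taken from the thread. */
enum swizzle { SWZ_X = 0, SWZ_Y, SWZ_Z, SWZ_W, SWZ_ZERO, SWZ_ONE };

/* Writing `swz >= SWZ_X && swz <= SWZ_W` on an unsigned value makes the first
 * comparison trivially true, which is what -Wtype-limits reports.  Subtracting
 * the lower bound and comparing once against the width of the range keeps
 * both SWZ_X and SWZ_W visible: values below SWZ_X would wrap around to large
 * unsigned numbers and fail the check. */
static bool swizzle_is_xyzw(unsigned swz)
{
    return swz - (unsigned)SWZ_X <= (unsigned)(SWZ_W - SWZ_X);
}
```

Whether this reads better than the original pair of comparisons is a matter of taste; it mainly shows that the warning can be avoided without dropping the explicit lower bound.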
[Mesa-dev] [PATCH 4/5] radeon/vce: sort cpb by ref list for VAAPI encode
Signed-off-by: Boyuan Zhang
---
 src/gallium/drivers/radeon/radeon_vce.c | 52 +++--
 1 file changed, 49 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_vce.c b/src/gallium/drivers/radeon/radeon_vce.c
index 549d999..0ff07eb 100644
--- a/src/gallium/drivers/radeon/radeon_vce.c
+++ b/src/gallium/drivers/radeon/radeon_vce.c
@@ -139,6 +139,48 @@ static void sort_cpb(struct rvce_encoder *enc)
 	}
 }
 
+/**
+ * sort l0 and l1 based on reference picture list
+ */
+static void sort_cpb_by_ref_list(struct rvce_encoder *enc)
+{
+	struct rvce_cpb_slot *i, *l0 = NULL, *l1 = NULL;
+	struct list_head *current = &enc->cpb_slots;
+
+	for (int j = 0 ; j < 32 ; j++) {
+		if ((enc->pic.ref_pic_list_0[j] == 0x) &&
+			(enc->pic.ref_pic_list_1[j] == 0x))
+			break;
+		LIST_FOR_EACH_ENTRY(i, &enc->cpb_slots, list) {
+			if (i->frame_num == enc->pic.ref_pic_list_0[j])
+				l0 = i;
+
+			if (i->frame_num == enc->pic.ref_pic_list_1[j])
+				l1 = i;
+
+			if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_P &&
+				l0)
+				break;
+
+			if (enc->pic.picture_type == PIPE_H264_ENC_PICTURE_TYPE_B &&
+				l0 && l1)
+				break;
+		}
+
+		if (l0) {
+			LIST_DEL(&l0->list);
+			LIST_ADD(&l0->list, current);
+			current = current->next;
+		}
+
+		if (l1) {
+			LIST_DEL(&l1->list);
+			LIST_ADD(&l1->list, current);
+			current = current->next;
+		}
+	}
+}
+
 static void get_rate_control_param(struct rvce_encoder *enc, struct pipe_h264_enc_picture_desc *pic)
 {
 	enc->pic.rc.rc_method = pic->rate_ctrl.rate_ctrl_method;
@@ -444,9 +486,13 @@ static void rvce_begin_frame(struct pipe_video_codec *encoder,
 	if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_IDR)
 		reset_cpb(enc);
 	else if (pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_P ||
-		 pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B)
-		sort_cpb(enc);
-
+		 pic->picture_type == PIPE_H264_ENC_PICTURE_TYPE_B) {
+		if (pic->has_ref_pic_list)
+			sort_cpb_by_ref_list(enc);
+		else
+			sort_cpb(enc);
+	}
+
 	if (!enc->stream_handle) {
 		struct rvid_buffer fb;
 		enc->stream_handle = rvid_alloc_stream_handle();
--
1.9.1
[Mesa-dev] [PATCH 1/5] radeon/vce: add vce structures
Signed-off-by: Boyuan Zhang
---
 src/gallium/drivers/radeon/radeon_vce.h | 297
 1 file changed, 297 insertions(+)

diff --git a/src/gallium/drivers/radeon/radeon_vce.h b/src/gallium/drivers/radeon/radeon_vce.h
index e438148..da61285 100644
--- a/src/gallium/drivers/radeon/radeon_vce.h
+++ b/src/gallium/drivers/radeon/radeon_vce.h
@@ -65,6 +65,303 @@ struct rvce_cpb_slot {
 	unsigned	pic_order_cnt;
 };
 
+struct rvce_rate_control {
+	uint32_t	rc_method;
+	uint32_t	target_bitrate;
+	uint32_t	peak_bitrate;
+	uint32_t	frame_rate_num;
+	uint32_t	gop_size;
+	uint32_t	quant_i_frames;
+	uint32_t	quant_p_frames;
+	uint32_t	quant_b_frames;
+	uint32_t	vbv_buffer_size;
+	uint32_t	frame_rate_den;
+	uint32_t	vbv_buf_lv;
+	uint32_t	max_au_size;
+	uint32_t	qp_initial_mode;
+	uint32_t	target_bits_picture;
+	uint32_t	peak_bits_picture_integer;
+	uint32_t	peak_bits_picture_fraction;
+	uint32_t	min_qp;
+	uint32_t	max_qp;
+	uint32_t	skip_frame_enable;
+	uint32_t	fill_data_enable;
+	uint32_t	enforce_hrd;
+	uint32_t	b_pics_delta_qp;
+	uint32_t	ref_b_pics_delta_qp;
+	uint32_t	rc_reinit_disable;
+	uint32_t	enc_lcvbr_init_qp_flag;
+	uint32_t	lcvbrsatd_based_nonlinear_bit_budget_flag;
+};
+
+struct rvce_motion_estimation {
+	uint32_t	enc_ime_decimation_search;
+	uint32_t	motion_est_half_pixel;
+	uint32_t	motion_est_quarter_pixel;
+	uint32_t	disable_favor_pmv_point;
+	uint32_t	force_zero_point_center;
+	uint32_t	lsmvert;
+	uint32_t	enc_search_range_x;
+	uint32_t	enc_search_range_y;
+	uint32_t	enc_search1_range_x;
+	uint32_t	enc_search1_range_y;
+	uint32_t	disable_16x16_frame1;
+	uint32_t	disable_satd;
+	uint32_t	enable_amd;
+	uint32_t	enc_disable_sub_mode;
+	uint32_t	enc_ime_skip_x;
+	uint32_t	enc_ime_skip_y;
+	uint32_t	enc_en_ime_overw_dis_subm;
+	uint32_t	enc_ime_overw_dis_subm_no;
+	uint32_t	enc_ime2_search_range_x;
+	uint32_t	enc_ime2_search_range_y;
+	uint32_t	parallel_mode_speedup_enable;
+	uint32_t	fme0_enc_disable_sub_mode;
+	uint32_t	fme1_enc_disable_sub_mode;
+	uint32_t	ime_sw_speedup_enable;
+};
+
+struct rvce_pic_control {
+	uint32_t	enc_use_constrained_intra_pred;
+	uint32_t	enc_cabac_enable;
+	uint32_t	enc_cabac_idc;
+	uint32_t	enc_loop_filter_disable;
+	int32_t		enc_lf_beta_offset;
+	int32_t		enc_lf_alpha_c0_offset;
+	uint32_t	enc_crop_left_offset;
+	uint32_t	enc_crop_right_offset;
+	uint32_t	enc_crop_top_offset;
+	uint32_t	enc_crop_bottom_offset;
+	uint32_t	enc_num_mbs_per_slice;
+	uint32_t	enc_intra_refresh_num_mbs_per_slot;
+	uint32_t	enc_force_intra_refresh;
+	uint32_t	enc_force_imb_period;
+	uint32_t	enc_pic_order_cnt_type;
+	uint32_t	log2_max_pic_order_cnt_lsb_minus4;
+	uint32_t	enc_sps_id;
+	uint32_t	enc_pps_id;
+	uint32_t	enc_constraint_set_flags;
+	uint32_t	enc_b_pic_pattern;
+	uint32_t	weight_pred_mode_b_picture;
+	uint32_t	enc_number_of_reference_frames;
+	uint32_t	enc_max_num_ref_frames;
+	uint32_t	enc_num_default_active_ref_l0;
+	uint32_t	enc_num_default_active_ref_l1;
+	uint32_t	enc_slice_mode;
+	uint32_t	enc_max_slice_size;
+};
+
+struct rvce_task_info {
+	uint32_t	offset_of_next_task_info;
+	uint32_t	task_operation;
+	uint32_t	reference_picture_dependency;
+	uint32_t	collocate_flag_dependency;
+	uint32_t	feedback_index;
+	uint32_t
Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks
On 2016-06-14 20:39, Jan Vesely wrote:
> I really disagree here. The conditions check whether swizzle is
> between X and W (as in, only X,Y,Z,W are allowed). The fact that
> X maps to 0 is irrelevant. removing the checks impairs readability
> of the code because the lower bound is now inferred (by being 0)
> rather than explicit.
>
> the same comment applies to your v2.
>
> Jan

Thanks for the input. Now when I think about it again this is probably
a bad change.
Didn't think about the lower bound. So this patch should probably not
be pushed.

--
Mvh Jakob Sinclair
[Mesa-dev] [PATCH 3/5] radeon/vce: use vce structures for encoding
Signed-off-by: Boyuan Zhang --- src/gallium/drivers/radeon/radeon_vce.c| 180 ++- src/gallium/drivers/radeon/radeon_vce.h| 2 +- src/gallium/drivers/radeon/radeon_vce_40_2_2.c | 425 + src/gallium/drivers/radeon/radeon_vce_50.c | 183 ++- src/gallium/drivers/radeon/radeon_vce_52.c | 171 +- 5 files changed, 603 insertions(+), 358 deletions(-) diff --git a/src/gallium/drivers/radeon/radeon_vce.c b/src/gallium/drivers/radeon/radeon_vce.c index e16e0cf..549d999 100644 --- a/src/gallium/drivers/radeon/radeon_vce.c +++ b/src/gallium/drivers/radeon/radeon_vce.c @@ -139,6 +139,176 @@ static void sort_cpb(struct rvce_encoder *enc) } } +static void get_rate_control_param(struct rvce_encoder *enc, struct pipe_h264_enc_picture_desc *pic) +{ + enc->pic.rc.rc_method = pic->rate_ctrl.rate_ctrl_method; + enc->pic.rc.target_bitrate = pic->rate_ctrl.target_bitrate; + enc->pic.rc.peak_bitrate = pic->rate_ctrl.peak_bitrate; + enc->pic.rc.quant_i_frames = pic->quant_i_frames; + enc->pic.rc.quant_p_frames = pic->quant_p_frames; + enc->pic.rc.quant_b_frames = pic->quant_b_frames; + enc->pic.rc.gop_size = pic->gop_size; + enc->pic.rc.frame_rate_num = pic->rate_ctrl.frame_rate_num; + enc->pic.rc.frame_rate_den = pic->rate_ctrl.frame_rate_den; + enc->pic.rc.max_qp = 51; + + if (pic->enable_low_level_control == true) { + enc->pic.rc.vbv_buffer_size = 2000; + if (pic->rate_ctrl.frame_rate_num == 0) + enc->pic.rc.frame_rate_num = 30; + if (pic->rate_ctrl.frame_rate_den == 0) + enc->pic.rc.frame_rate_den = 1; + enc->pic.rc.vbv_buf_lv = 48; + enc->pic.rc.fill_data_enable = 1; + enc->pic.rc.enforce_hrd = 1; + enc->pic.rc.target_bits_picture = enc->pic.rc.target_bitrate / enc->pic.rc.frame_rate_num; + enc->pic.rc.peak_bits_picture_integer = enc->pic.rc.peak_bitrate / enc->pic.rc.frame_rate_num; + enc->pic.rc.peak_bits_picture_fraction = 0; + } else { + enc->pic.rc.vbv_buffer_size = pic->rate_ctrl.vbv_buffer_size; + enc->pic.rc.vbv_buf_lv = 0; + enc->pic.rc.fill_data_enable = 0; + 
enc->pic.rc.enforce_hrd = 0; + enc->pic.rc.target_bits_picture = pic->rate_ctrl.target_bits_picture; + enc->pic.rc.peak_bits_picture_integer = pic->rate_ctrl.peak_bits_picture_integer; + enc->pic.rc.peak_bits_picture_fraction = pic->rate_ctrl.peak_bits_picture_fraction; + } +} + +static void get_motion_estimation_param(struct rvce_encoder *enc, struct pipe_h264_enc_picture_desc *pic) +{ + if (pic->enable_low_level_control == true) { + enc->pic.me.motion_est_quarter_pixel = 0x0001; + enc->pic.me.enc_disable_sub_mode = 0x0078; + enc->pic.me.lsmvert = 0x0002; + enc->pic.me.enc_en_ime_overw_dis_subm = 0x0001; + enc->pic.me.enc_ime_overw_dis_subm_no = 0x0001; + enc->pic.me.enc_ime2_search_range_x = 0x0004; + enc->pic.me.enc_ime2_search_range_y = 0x0004; + enc->pic.me.enc_ime_decimation_search = 0x0001; + enc->pic.me.motion_est_half_pixel = 0x0001; + enc->pic.me.enc_search_range_x = 0x0010; + enc->pic.me.enc_search_range_y = 0x0010; + enc->pic.me.enc_search1_range_x = 0x0010; + enc->pic.me.enc_search1_range_y = 0x0010; + } else { + enc->pic.me.motion_est_quarter_pixel = 0x; + enc->pic.me.enc_disable_sub_mode = 0x00fe; + enc->pic.me.lsmvert = 0x; + enc->pic.me.enc_en_ime_overw_dis_subm = 0x; + enc->pic.me.enc_ime_overw_dis_subm_no = 0x; + enc->pic.me.enc_ime2_search_range_x = 0x0001; + enc->pic.me.enc_ime2_search_range_y = 0x0001; + enc->pic.me.enc_ime_decimation_search = 0x0001; + enc->pic.me.motion_est_half_pixel = 0x0001; + enc->pic.me.enc_search_range_x = 0x0010; + enc->pic.me.enc_search_range_y = 0x0010; + enc->pic.me.enc_search1_range_x = 0x0010; + enc->pic.me.enc_search1_range_y = 0x0010; + } +} + +static void get_pic_control_param(struct rvce_encoder *enc, struct pipe_h264_enc_picture_desc *pic) +{ + unsigned encNumMBsPerSlice; + encNumMBsPerSlice = align(enc->base.width, 16) / 16; + encNumMBsPerSlice *= align(enc->base.height, 16) / 16; + enc->pic.pc.enc_crop_right_offset = (align(enc->base.width, 16) - enc->base.width) >> 1; + enc->pic.pc.enc_crop_bottom_offset 
= (align(enc->base.height, 16) - enc
[Mesa-dev] [PATCH 2/5] vl: add parameters for VAAPI encode
Signed-off-by: Boyuan Zhang
---
 src/gallium/include/pipe/p_video_state.h | 13 +
 1 file changed, 13 insertions(+)

diff --git a/src/gallium/include/pipe/p_video_state.h b/src/gallium/include/pipe/p_video_state.h
index d353be6..d519d17 100644
--- a/src/gallium/include/pipe/p_video_state.h
+++ b/src/gallium/include/pipe/p_video_state.h
@@ -131,6 +131,7 @@ enum pipe_h264_enc_rate_control_method
 struct pipe_picture_desc
 {
    enum pipe_video_profile profile;
+   enum pipe_video_entrypoint entry_point;
 };
 
 struct pipe_quant_matrix
@@ -369,11 +370,23 @@ struct pipe_h264_enc_picture_desc
 
    enum pipe_h264_enc_picture_type picture_type;
    unsigned frame_num;
+   unsigned frame_num_cnt;
+   unsigned p_remain;
+   unsigned i_remain;
+   unsigned idr_pic_id;
+   unsigned gop_cnt;
    unsigned pic_order_cnt;
    unsigned ref_idx_l0;
    unsigned ref_idx_l1;
+   unsigned gop_size;
 
    bool not_referenced;
+   bool is_idr;
+   bool has_ref_pic_list;
+   bool enable_low_level_control;
+   unsigned int ref_pic_list_0[32];
+   unsigned int ref_pic_list_1[32];
+   unsigned int frame_idx[32];
 };
 
 struct pipe_h265_sps
--
1.9.1
[Mesa-dev] [PATCH 5/5] st/va: enable h264 VAAPI encode
Signed-off-by: Boyuan Zhang --- src/gallium/state_trackers/va/buffer.c | 6 ++ src/gallium/state_trackers/va/config.c | 104 +++--- src/gallium/state_trackers/va/context.c| 72 - src/gallium/state_trackers/va/image.c | 126 +++--- src/gallium/state_trackers/va/picture.c| 165 - src/gallium/state_trackers/va/surface.c| 16 ++- src/gallium/state_trackers/va/va_private.h | 9 ++ 7 files changed, 441 insertions(+), 57 deletions(-) diff --git a/src/gallium/state_trackers/va/buffer.c b/src/gallium/state_trackers/va/buffer.c index 7d3167b..dfcebbe 100644 --- a/src/gallium/state_trackers/va/buffer.c +++ b/src/gallium/state_trackers/va/buffer.c @@ -133,6 +133,12 @@ vlVaMapBuffer(VADriverContextP ctx, VABufferID buf_id, void **pbuff) if (!buf->derived_surface.transfer || !*pbuff) return VA_STATUS_ERROR_INVALID_BUFFER; + if (buf->type == VAEncCodedBufferType) { + ((VACodedBufferSegment*)buf->data)->buf = *pbuff; + ((VACodedBufferSegment*)buf->data)->size = buf->coded_size; + ((VACodedBufferSegment*)buf->data)->next = NULL; + *pbuff = buf->data; + } } else { pipe_mutex_unlock(drv->mutex); *pbuff = buf->data; diff --git a/src/gallium/state_trackers/va/config.c b/src/gallium/state_trackers/va/config.c index 9ca0aa8..04d214d 100644 --- a/src/gallium/state_trackers/va/config.c +++ b/src/gallium/state_trackers/va/config.c @@ -34,6 +34,8 @@ #include "va_private.h" +#include "util/u_handle_table.h" + DEBUG_GET_ONCE_BOOL_OPTION(mpeg4, "VAAPI_MPEG4_ENABLED", false) VAStatus @@ -72,6 +74,7 @@ vlVaQueryConfigEntrypoints(VADriverContextP ctx, VAProfile profile, { struct pipe_screen *pscreen; enum pipe_video_profile p; + int va_status = VA_STATUS_ERROR_UNSUPPORTED_PROFILE; if (!ctx) return VA_STATUS_ERROR_INVALID_CONTEXT; @@ -88,12 +91,18 @@ vlVaQueryConfigEntrypoints(VADriverContextP ctx, VAProfile profile, return VA_STATUS_ERROR_UNSUPPORTED_PROFILE; pscreen = VL_VA_PSCREEN(ctx); - if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED)) - return 
VA_STATUS_ERROR_UNSUPPORTED_PROFILE; - - entrypoint_list[(*num_entrypoints)++] = VAEntrypointVLD; + if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED)) { + entrypoint_list[(*num_entrypoints)++] = VAEntrypointVLD; + va_status = VA_STATUS_SUCCESS; + } + if (pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_ENCODE, PIPE_VIDEO_CAP_SUPPORTED) && + p == PIPE_VIDEO_PROFILE_MPEG4_AVC_BASELINE) { + entrypoint_list[(*num_entrypoints)++] = VAEntrypointEncSlice; + entrypoint_list[(*num_entrypoints)++] = VAEntrypointEncPicture; + va_status = VA_STATUS_SUCCESS; + } - return VA_STATUS_SUCCESS; + return va_status; } VAStatus @@ -112,7 +121,7 @@ vlVaGetConfigAttributes(VADriverContextP ctx, VAProfile profile, VAEntrypoint en value = VA_RT_FORMAT_YUV420; break; case VAConfigAttribRateControl: - value = VA_RC_NONE; + value = VA_RC_CQP | VA_RC_CBR; break; default: value = VA_ATTRIB_NOT_SUPPORTED; @@ -128,14 +137,27 @@ VAStatus vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, VAEntrypoint entrypoint, VAConfigAttrib *attrib_list, int num_attribs, VAConfigID *config_id) { + vlVaDriver *drv; + vlVaConfig *config; struct pipe_screen *pscreen; enum pipe_video_profile p; if (!ctx) return VA_STATUS_ERROR_INVALID_CONTEXT; + drv = VL_VA_DRIVER(ctx); + + if (!drv) + return VA_STATUS_ERROR_INVALID_CONTEXT; + + config = CALLOC(1, sizeof(vlVaConfig)); + if (!config) + return VA_STATUS_ERROR_ALLOCATION_FAILED; + if (profile == VAProfileNone && entrypoint == VAEntrypointVideoProc) { - *config_id = PIPE_VIDEO_PROFILE_UNKNOWN; + config->entrypoint = VAEntrypointVideoProc; + config->profile = PIPE_VIDEO_PROFILE_UNKNOWN; + *config_id = handle_table_add(drv->htab, config); return VA_STATUS_SUCCESS; } @@ -144,13 +166,36 @@ vlVaCreateConfig(VADriverContextP ctx, VAProfile profile, VAEntrypoint entrypoin return VA_STATUS_ERROR_UNSUPPORTED_PROFILE; pscreen = VL_VA_PSCREEN(ctx); - if (!pscreen->get_video_param(pscreen, p, 
PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED)) - return VA_STATUS_ERROR_UNSUPPORTED_PROFILE; - - if (entrypoint != VAEntrypointVLD) + if (entrypoint == VAEntrypointVLD) { + if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_BITSTREAM, PIPE_VIDEO_CAP_SUPPORTED)) + return VA_STATUS_ERROR_UNSUPPORTED_PROFILE; + } + else if (entrypoint == VAEntrypointEncSlice) { + if (!pscreen->get_video_param(pscreen, p, PIPE_VIDEO_ENTRYPOINT_ENCODE, PIPE_VIDEO_CAP_SUPPORTED)) + retu
Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2
Rob Clark writes:

> I (and I expect Eric too) would appreciate it if you went ahead and
> replaced the current use of non-"z" versions in code that you can't
> test w/ the "z" versions. That way we can switch over to non-zero'ing
> on our own time, rather than getting a surprise next time we
> pull/rebase
>
> I think it's only a couple spots in freedreno, and pre-emptive r-b for
> that change ;-)

I've checked vc4, and all the calls should be fine already.
[Mesa-dev] [Bug 96517] [llvmpipe] piglit arb_uniform_buffer_object-rendering-dsa regression
https://bugs.freedesktop.org/show_bug.cgi?id=96517

Roland Scheidegger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED

--- Comment #5 from Roland Scheidegger ---
Fixed by f4184d5450c12e107d3e41ae29e5927c75543259.

--
You are receiving this mail because:
You are the assignee for the bug.
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH] i965/fs: Fix regs_written for SIMD-lowered instructions some more.
Iago Toral writes:

> On Fri, 2016-06-10 at 22:39 -0700, Francisco Jerez wrote:
>> ISTR having suggested this during review of the recent FP64 changes to
>> the SIMD lowering pass, but it doesn't look like it was taken into
>> account in the end. Using the fs_reg::component_size helper instead
>> of this open-coded variant makes sure that the stride is taken into
>> account correctly. Fixes at least the following piglit tests with
>> spilling forced on (since otherwise regs_written would be calculated
>> incorrectly and the spilling code would be rather confused about how
>> much data needs to be spilled):
>
> Yes, you had suggested it but we forgot about it until a few days ago
> when I was tracking down a similar bug and came up with this same patch.
> I was about to send it for review together with other fixes for BSW this
> week, sorry if that caused you trouble...
>

No worries, it was no trouble at all.

> Iago
>
>> spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-shader
>> spec.arb_gpu_shader_fp64.shader_storage.layout-std140-fp64-mixed-shader
>>
>> Cc:
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs.cpp | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> index 104c20b..0347b0a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
>> @@ -5261,9 +5261,9 @@ fs_visitor::lower_simd_width()
>>              split_inst.src[j] = emit_unzip(lbld, block, inst, j);
>>
>>           split_inst.dst = emit_zip(lbld, block, inst);
>> -         split_inst.regs_written =
>> -            DIV_ROUND_UP(type_sz(inst->dst.type) * dst_size *
>> lower_width,
>> -                         REG_SIZE);
>> +         split_inst.regs_written = DIV_ROUND_UP(
>> +            split_inst.dst.component_size(lower_width) * dst_size,
>> +            REG_SIZE);
>>
>>           lbld.emit(split_inst);
>>        }
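To make the stride argument in the patch above concrete, here is a small standalone sketch in C. It is not the i965 code: the helper names, the 32-byte REG_SIZE and the simplified component_size() model (width times stride times type size, with a minimum of one element) are assumptions made for illustration only.

```c
#include <assert.h>

/* Assumed register size in bytes; illustrative only. */
#define REG_SIZE 32u
#define DIV_ROUND_UP(a, b) (((a) + (b) - 1) / (b))

/* Open-coded variant from the removed lines: ignores the stride. */
static unsigned
regs_written_open_coded(unsigned type_size, unsigned dst_size, unsigned width)
{
   return DIV_ROUND_UP(type_size * dst_size * width, REG_SIZE);
}

/* Hypothetical stand-in for a component-size helper: bytes covered by one
 * component of `width` channels laid out with the given stride. */
static unsigned
component_size(unsigned type_size, unsigned stride, unsigned width)
{
   unsigned elems = width * stride;

   if (elems < 1)
      elems = 1;
   return elems * type_size;
}

/* Stride-aware variant, shaped like the added lines of the patch. */
static unsigned
regs_written_strided(unsigned type_size, unsigned stride,
                     unsigned dst_size, unsigned width)
{
   return DIV_ROUND_UP(component_size(type_size, stride, width) * dst_size,
                       REG_SIZE);
}
```

With a 4-byte type, 8 channels and a stride of 2, the open-coded form counts one register while the stride-aware form counts two, which is the kind of undercount that would confuse spilling.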
Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks
On Tue, 2016-06-14 at 20:19 +0200, Jakob Sinclair wrote:
> On 2016-06-13 12:02, Nicolai Hähnle wrote:
> >
> > Meh. This is the kind of thing where Coverity should perhaps just
> > shut up :/
>
> I do agree with you that Coverity should perhaps shut up about this
> kinda thing but I couldn't see a reason to have these checks in the
> code. They really didn't contribute to my understanding of the code.

I really disagree here. The conditions check whether swizzle is between
X and W (as in, only X,Y,Z,W are allowed). The fact that X maps to 0 is
irrelevant. Removing the checks impairs readability of the code because
the lower bound is now inferred (by being 0) rather than explicit. The
same comment applies to your v2.

Jan

> Although I may be missing something important here.
>
> > Anyway...
> > I think for consistency, you should also remove the
> > '- PIPE_SWIZZLE_X' here, similar to the first hunk. With that
> > changed,
>
> Forgot about that one. I agree with this change.
>
> > Reviewed-by: Nicolai Hähnle
>
> Thanks!

--
Jan Vesely
[Mesa-dev] [PATCH] V2 radeon: remove unnecessary checks
PIPE_SWIZZLE_X is always 0 and desc->swizzle is an unsigned char, meaning
that desc->swizzle can never be smaller than PIPE_SWIZZLE_X. Removing
these checks doesn't change the code path at all because they would
always give the same result. Issue discovered by Coverity.

V2: Removed "- PIPE_SWIZZLE_X" for more consistency.

CID: 1337954
Signed-off-by: Jakob Sinclair
---
 src/gallium/drivers/radeon/r600_texture.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c
index 32347f2..3a56f9f 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1754,10 +1754,9 @@ static void vi_get_fast_clear_parameters(enum pipe_format surface_format,
                return;

        for (i = 0; i < 4; ++i) {
-               int index = desc->swizzle[i] - PIPE_SWIZZLE_X;
+               int index = desc->swizzle[i];

-               if (desc->swizzle[i] < PIPE_SWIZZLE_X ||
-                   desc->swizzle[i] > PIPE_SWIZZLE_W)
+               if (desc->swizzle[i] > PIPE_SWIZZLE_W)
                        continue;

                if (util_format_is_pure_sint(surface_format)) {
@@ -1782,8 +1781,7 @@ static void vi_get_fast_clear_parameters(enum pipe_format surface_format,

        for (int i = 0; i < 4; ++i)
                if (values[i] != main_value &&
-                   desc->swizzle[i] - PIPE_SWIZZLE_X != extra_channel &&
-                   desc->swizzle[i] >= PIPE_SWIZZLE_X &&
+                   desc->swizzle[i] != extra_channel &&
                    desc->swizzle[i] <= PIPE_SWIZZLE_W)
                        return;

--
2.8.3
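The claim in the commit message can be checked with a small standalone sketch. The enum below is a hypothetical mirror of the swizzle values (X = 0 through W = 3, followed by two other selectors); it is assumed for illustration and is not copied from the gallium headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the swizzle selectors; only X == 0 matters here. */
enum { SWZ_X = 0, SWZ_Y, SWZ_Z, SWZ_W, SWZ_0, SWZ_1 };

/* Explicit range check, as in the code before the patch. */
static bool
in_xyzw_explicit(unsigned char s)
{
   return s >= SWZ_X && s <= SWZ_W;
}

/* Upper bound only, as in the code after the patch. */
static bool
in_xyzw_upper_only(unsigned char s)
{
   return s <= SWZ_W;
}

/* Compare both forms over every possible unsigned char value. */
static bool
checks_agree(void)
{
   unsigned v;

   for (v = 0; v <= 255; v++) {
      if (in_xyzw_explicit((unsigned char)v) !=
          in_xyzw_upper_only((unsigned char)v))
         return false;
   }
   return true;
}
```

Both forms agree for all 256 inputs, so the behaviour is unchanged. Whether the explicit lower bound is still worth keeping for readability is the separate point raised in the review thread.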
Re: [Mesa-dev] [PATCH] radeon: remove unnecessary checks
On 2016-06-13 12:02, Nicolai Hähnle wrote:
> Meh. This is the kind of thing where Coverity should perhaps just shut
> up :/

I do agree with you that Coverity should perhaps shut up about this kinda
thing but I couldn't see a reason to have these checks in the code. They
really didn't contribute to my understanding of the code. Although I may
be missing something important here.

> Anyway...
> I think for consistency, you should also remove the '- PIPE_SWIZZLE_X'
> here, similar to the first hunk. With that changed,

Forgot about that one. I agree with this change.

> Reviewed-by: Nicolai Hähnle

Thanks!

--
Mvh Jakob Sinclair
Re: [Mesa-dev] [PATCH 3/3] radeonsi: fix undefined left-shift into sign bit
Reviewed-by: Marek Olšák

Marek

On Tue, Jun 14, 2016 at 4:37 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> ---
>  src/gallium/drivers/radeonsi/cik_sdma.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeonsi/cik_sdma.c
> b/src/gallium/drivers/radeonsi/cik_sdma.c
> index d8ec2a3..a36bbce 100644
> --- a/src/gallium/drivers/radeonsi/cik_sdma.c
> +++ b/src/gallium/drivers/radeonsi/cik_sdma.c
> @@ -370,12 +370,13 @@ static bool cik_sdma_copy_texture(struct si_context
> *sctx,
>             copy_height <= (1 << 14) &&
>             copy_depth <= (1 << 11)) {
>                 struct radeon_winsys_cs *cs = sctx->b.dma.cs;
> +               uint32_t direction = linear == rdst ? 1u << 31 : 0;
>
>                 r600_need_dma_space(&sctx->b, 14, &rdst->resource,
> &rsrc->resource);
>
>                 radeon_emit(cs, CIK_SDMA_PACKET(CIK_SDMA_OPCODE_COPY,
>
> CIK_SDMA_COPY_SUB_OPCODE_TILED_SUB_WINDOW, 0) |
> -                                       ((linear == rdst) << 31));
> +                                       direction);
>                 radeon_emit(cs, tiled_address);
>                 radeon_emit(cs, tiled_address >> 32);
>                 radeon_emit(cs, tiled_x | (tiled_y << 16));
> --
> 2.7.4
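For readers wondering why the patch above matters: a comparison such as (linear == rdst) has type int in C, and shifting the int value 1 left by 31 moves a bit into the sign position of a 32-bit int, which the C standard leaves undefined. Shifting an unsigned value is well defined. The sketch below shows the unsigned form; the function names and the opcode_bits parameter are made up for illustration and are not the radeonsi API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build bit 31 from an unsigned constant, so no signed shift is involved. */
static uint32_t
direction_bit(bool linear_is_dst)
{
   return linear_is_dst ? UINT32_C(1) << 31 : 0;
}

/* Hypothetical packet header: some opcode bits OR'ed with the direction. */
static uint32_t
pack_header(uint32_t opcode_bits, bool linear_is_dst)
{
   return opcode_bits | direction_bit(linear_is_dst);
}
```

The resulting bit pattern is the one the original expression was evidently meant to produce; only the way it is computed changes.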
Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean
On Tue, Jun 14, 2016 at 6:24 PM, Ilia Mirkin wrote:
> On Tue, Jun 14, 2016 at 12:19 PM, Nicolai Hähnle wrote:
>> On 14.06.2016 17:57, Rob Clark wrote:
>>>
>>> From: Rob Clark
>>>
>>> s/bool/boolean/ to make it match the other APIs.
>>
>>
>> Please no. C has finally grown a proper bool type, we should use it where
>> possible. If anything, make the patch go in the other direction.
>
> FWIW I've eradicated boolean from nouveau except for the gallium API
> interfaces. Would definitely be in favor of flipping those to bool.

I'm in favor of bool too and would like to see boolean/TRUE/FALSE
disappear from radeon-specific code at least.

Marek
Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
On Tue, Jun 14, 2016 at 12:30 PM, Nicolai Hähnle wrote:
> On 14.06.2016 18:02, Ilia Mirkin wrote:
>>
>> Can you explain the motivation behind this change? I'm adding a
>> ->set_window_rectangles thing which also takes multiple parameters.
>> What's the advantage of stuffing things into a struct first?
>
>
> FWIW, I tend to be mildly supportive of changes like this. At least, the
> other extreme where functions grow multiple bool or int parameters over time
> is much worse. But in this particular case, changing this around might be
> too eager.

I'd have to think about how it would work to deal w/ variants that
have params not wrapped in a struct. It at least sounds annoying, and
I tended to think the benefits of using a struct were enough of a
justification to change this. (Plus there are not many usages of this
API yet, so seemed like the perfect time to cleanup.)

> Perhaps teaching the script to deal with slightly more complicated cases
> will help elsewhere, too.

*maybe*, but I can't think of anything.. right now it is only the
sampler_view and stream_output_target state that I handle "manually"..
but those are also kind of different from the rest since they are
already refcnt'd. And I figured it was easier to just deal w/ those
manually than implement a 3rd type of state (CSO vs Param) in
rsq_state.py..

BR,
-R

> Nicolai
>
>>
>>   -ilia
>>
>> On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark wrote:
>>>
>>> From: Rob Clark
>>>
>>> The reset of the state APIs take state structs, rather than inline
>>> parameters (with the exception of a couple which just amount to a single
>>> uint).
>>>
>>> This makes the API more regular and simplifies autogeneration of the
>>> gallium state related APIs.
>>> >>> Signed-off-by: Rob Clark >>> --- >>> src/gallium/drivers/ddebug/dd_context.c | 9 - >>> src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 7 +++ >>> src/gallium/drivers/r600/evergreen_state.c| 7 +++ >>> src/gallium/drivers/radeonsi/si_state.c | 7 +++ >>> src/gallium/drivers/trace/tr_context.c| 9 - >>> src/gallium/include/pipe/p_context.h | 4 ++-- >>> src/gallium/include/pipe/p_state.h| 8 >>> src/mesa/state_tracker/st_atom_tess.c | 13 ++--- >>> 8 files changed, 37 insertions(+), 27 deletions(-) >>> >>> diff --git a/src/gallium/drivers/ddebug/dd_context.c >>> b/src/gallium/drivers/ddebug/dd_context.c >>> index 0f8ef18..06b7c91 100644 >>> --- a/src/gallium/drivers/ddebug/dd_context.c >>> +++ b/src/gallium/drivers/ddebug/dd_context.c >>> @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context >>> *_pipe, >>> } >>> >>> static void dd_context_set_tess_state(struct pipe_context *_pipe, >>> - const float >>> default_outer_level[4], >>> - const float >>> default_inner_level[2]) >>> + const struct pipe_tess_state >>> *state) >>> { >>> struct dd_context *dctx = dd_context(_pipe); >>> struct pipe_context *pipe = dctx->pipe; >>> >>> - memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) >>> * 4); >>> - memcpy(dctx->tess_default_levels+4, default_inner_level, >>> sizeof(float) * 2); >>> - pipe->set_tess_state(pipe, default_outer_level, default_inner_level); >>> + memcpy(dctx->tess_default_levels, state->default_outer_level, >>> sizeof(float) * 4); >>> + memcpy(dctx->tess_default_levels+4, state->default_inner_level, >>> sizeof(float) * 2); >>> + pipe->set_tess_state(pipe, state); >>> } >>> >>> >>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >>> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >>> index 92161ec..a9c1830 100644 >>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >>> @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context >>> *pipe, >>> >>> 
static void >>> nvc0_set_tess_state(struct pipe_context *pipe, >>> -const float default_tess_outer[4], >>> -const float default_tess_inner[2]) >>> +const struct pipe_tess_state *state) >>> { >>> struct nvc0_context *nvc0 = nvc0_context(pipe); >>> >>> - memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * >>> sizeof(float)); >>> - memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * >>> sizeof(float)); >>> + memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * >>> sizeof(float)); >>> + memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * >>> sizeof(float)); >>> nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR; >>> } >>> >>> diff --git a/src/gallium/drivers/r600/evergreen_state.c >>> b/src/gallium/drivers/r600/evergreen_state.c >>> index 1ac8914..2a424f5 100644 >>> --- a/src/gallium/drivers/r600/evergreen_state.c >>> +++ b/src/gallium/drivers/r600/evergreen_state.c >>> @@ -3569,13 +3569,12
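As a rough illustration of the API shape being discussed, the sketch below passes the tessellation defaults in one struct instead of two array parameters. The field names follow the quoted ddebug hunk (default_outer_level, default_inner_level); the struct and context types themselves are simplified stand-ins assumed for this example, not the gallium definitions.

```c
#include <assert.h>
#include <string.h>

/* Simplified stand-in for the state struct introduced by the patch. */
struct tess_state {
   float default_outer_level[4];
   float default_inner_level[2];
};

/* Hypothetical driver context that caches the six default levels. */
struct tess_context {
   float tess_default_levels[6];
};

/* One struct pointer replaces the two separate array parameters. */
static void
set_tess_state(struct tess_context *ctx, const struct tess_state *state)
{
   memcpy(ctx->tess_default_levels, state->default_outer_level,
          sizeof(state->default_outer_level));
   memcpy(ctx->tess_default_levels + 4, state->default_inner_level,
          sizeof(state->default_inner_level));
}
```

Using sizeof on the struct members also avoids repeating the 4 and 2 element counts at every call site, which is one small side benefit of the struct form.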
Re: [Mesa-dev] [PATCH v2] swr: automake: don't ship LLVM version specific generated sources
On 14 June 2016 at 18:06, Rowley, Timothy O wrote:
>
>> On Jun 13, 2016, at 8:03 PM, Rowley, Timothy O
>> wrote:
>>
>> A clean tree build works with this version, but distcheck fails:
>>
>> ...
>> rm -f config.status config.cache config.log configure.lineno
>> config.status.lineno
>> rm -f Makefile
>> ERROR: files left in build directory after distclean:
>> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.cpp
>> ./src/gallium/drivers/swr/rasterizer/jitter/builder_x86.cpp
>> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.h
>> make[1]: *** [distcleancheck] Error 1
>> make[1]: Leaving directory
>> `/home/torowley/work/mesa-opt/mesa-12.1.0-devel/_build'
>> make: *** [distcheck] Error 1
>>
>> Not sure how builder_x86.cpp managed to change its status.
>
> To answer my own question: the reason for builder_x86.cpp being regenerated
> is because of its dependency on builder_gen.h (through builder.h).
>

I thought that one was mentioned in the big comment in the patch.
Perhaps my wording could be improved - any suggestions ?

And yes, due to the missing dependency the file will be (re)generated
at a later stage thus we'll need to add yet another workaround for
that. Just listing the whole lot in CLEANFILES should be enough. Feel
free to give it a try, or I'll do it at some point later on today.

Thanks,
Emil
Re: [Mesa-dev] [PATCH (backport)] radeonsi: mark buffer texture range valid for shader images
Reviewed-by: Marek Olšák

Marek

On Tue, Jun 14, 2016 at 6:00 PM, Nicolai Hähnle wrote:
> From: Nicolai Hähnle
>
> When a shader image view into a buffer texture can be written to, the buffer's
> valid range must be updated, or subsequent transfers may incorrectly skip
> synchronization.
>
> This fixes a bug that was exposed in Xephyr by PBO acceleration for
> glReadPixels,
> reported by Michel Dänzer.
>
> Cc: Michel Dänzer
> Cc: 12.0
> Reviewed-by: Marek Olšák
>
> Back-ported from commit a64c7cd2bac33a3a2bf908b5ef538dff03b93b73:
> - include util/u_format.h
> - code was extracted to si_set_shader_image in master, move it back
>
> Signed-off-by: Nicolai Hähnle
> --
>  src/gallium/drivers/radeonsi/si_descriptors.c | 24
>  1 file changed, 24 insertions(+)
>
> diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c
> b/src/gallium/drivers/radeonsi/si_descriptors.c
> index 855b79e..e8ce87b 100644
> --- a/src/gallium/drivers/radeonsi/si_descriptors.c
> +++ b/src/gallium/drivers/radeonsi/si_descriptors.c
> @@ -60,6 +60,7 @@
>  #include "si_shader.h"
>  #include "sid.h"
>
> +#include "util/u_format.h"
>  #include "util/u_math.h"
>  #include "util/u_memory.h"
>  #include "util/u_suballoc.h"
> @@ -471,6 +472,23 @@ si_disable_shader_image(struct si_images_info *images,
> unsigned slot)
>  }
>
>  static void
> +si_mark_image_range_valid(struct pipe_image_view *view)
> +{
> +       struct r600_resource *res = (struct r600_resource *)view->resource;
> +       const struct util_format_description *desc;
> +       unsigned stride;
> +
> +       assert(res && res->b.b.target == PIPE_BUFFER);
> +
> +       desc = util_format_description(view->format);
> +       stride = desc->block.bits / 8;
> +
> +       util_range_add(&res->valid_buffer_range,
> +                      stride * (view->u.buf.first_element),
> +                      stride * (view->u.buf.last_element + 1));
> +}
> +
> +static void
>  si_set_shader_images(struct pipe_context *pipe, unsigned shader,
>                      unsigned start_slot, unsigned count,
>                      struct pipe_image_view *views)
> @@ -502,6 +520,9 @@ si_set_shader_images(struct pipe_context *pipe, unsigned
> shader,
>                                     RADEON_USAGE_READWRITE);
>
>                 if (res->b.b.target == PIPE_BUFFER) {
> +                       if (views[i].access & PIPE_IMAGE_ACCESS_WRITE)
> +                               si_mark_image_range_valid(&views[i]);
> +
>                         si_make_buffer_descriptor(screen, res,
>                                                   views[i].format,
>
> views[i].u.buf.first_element,
> @@ -1297,6 +1318,9 @@ static void si_invalidate_buffer(struct pipe_context
> *ctx, struct pipe_resource
>                         unsigned i = u_bit_scan(&mask);
>
>                         if (images->views[i].resource == buf) {
> +                               if (images->views[i].access &
> PIPE_IMAGE_ACCESS_WRITE)
> +
> si_mark_image_range_valid(&images->views[i]);
> +
>                                 si_desc_reset_buffer_offset(
>                                         ctx, images->desc.list + i * 8 + 4,
>                                         old_va, buf);
> --
> 2.7.4
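The range bookkeeping in the patch above can be modelled in a few lines. This is a minimal sketch, assuming a valid range is tracked as a half-open byte interval that only ever grows; the byte_range type and its helpers are hypothetical names, not the util_range API.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical valid-range tracker: bytes in [start, end) may hold data. */
struct byte_range {
   unsigned start;
   unsigned end;
};

/* An empty range is encoded so that the first add always wins. */
static void
byte_range_set_empty(struct byte_range *r)
{
   r->start = UINT_MAX;
   r->end = 0;
}

/* Grow the tracked range to cover [start, end). */
static void
byte_range_add(struct byte_range *r, unsigned start, unsigned end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Convert an inclusive element range to bytes, as the patch does with
 * stride * first_element and stride * (last_element + 1). */
static void
mark_elements_valid(struct byte_range *r, unsigned bits_per_element,
                    unsigned first_element, unsigned last_element)
{
   unsigned stride = bits_per_element / 8;

   byte_range_add(r, stride * first_element, stride * (last_element + 1));
}
```

For example, elements 2 through 5 of a 128-bit format cover bytes 32 up to 96. The "+ 1" is needed because the element range is inclusive while the byte range end is exclusive.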
Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2
I (and I expect Eric too) would appreciate it if you went ahead and
replaced the current use of non-"z" versions in code that you can't
test w/ the "z" versions. That way we can switch over to non-zero'ing
on our own time, rather than getting a surprise next time we
pull/rebase

I think it's only a couple spots in freedreno, and pre-emptive r-b for
that change ;-)

BR,
-R

On Tue, Jun 14, 2016 at 11:07 AM, Ilia Mirkin wrote:
> I assume you've only tested this with i965? ralloc is also used by
> st/mesa, freedreno, and vc4. Should probably try to coordinate with
> the responsible developers before making the big switch.
>
>   -ilia
>
> On Tue, Jun 14, 2016 at 10:58 AM, Juha-Pekka Heikkila
> wrote:
>> Here is fixed version of this ralloc set. Now I got to run this on many
>> different machines thanks to Mark Janes. There didn't show up any
>> regressions on different gen hw. On my IVB I've been running also many
>> different traces with Apitrace while having Valgrind running on background
>> but Valgrind did seem to be happy with my changes.
>>
>> As a performance test I did shader-db compile runs 10 times and compare
>> timing results against what Mesa master does on my IVB. To my surprise this
>> does bring reasonable gain which also seem to be repeatable, on my IVB
>> shader compile time is around 5% faster with these changes.
>>
>> /Juha-Pekka
>>
>> Juha-Pekka Heikkila (7):
>>   glsl: Fix reading of uninitialized memory
>>   util: use rzalloc instead on ralloc in _mesa_hash_table_create()
>>   util: use rzalloc instead on ralloc in _mesa_set_create(()
>>   nir: zero allocated memory where needed
>>   i965/vec4: zero allocated memory where needed
>>   i965/fs: fill allocated memory with zeros where needed
>>   util: Fix ralloc to use malloc instead of calloc
>>
>>  src/compiler/glsl/ast_to_hir.cpp                     |  2 +-
>>  src/compiler/glsl/glcpp/glcpp-parse.y                |  4 +-
>>  src/compiler/glsl/link_uniform_blocks.cpp            |  2 +-
>>  src/compiler/glsl_types.cpp                          |  2 +-
>>  src/compiler/nir/nir.c                               |  6 +--
>>  src/compiler/nir/nir_opt_dce.c                       |  2 +-
>>  src/compiler/nir/nir_phi_builder.c                   |  2 +-
>>  src/compiler/nir/nir_search.c                        |  2 +-
>>  src/compiler/nir/nir_to_ssa.c                        |  2 +-
>>  src/compiler/nir/nir_worklist.c                      |  2 +-
>>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp     |  2 +-
>>  .../dri/i965/brw_fs_dead_code_eliminate.cpp          |  4 +-
>>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp        |  4 +-
>>  src/util/hash_table.c                                |  2 +-
>>  src/util/ralloc.c                                    | 49 +++---
>>  src/util/ralloc.h                                    |  2 +-
>>  src/util/set.c                                       |  2 +-
>>  17 files changed, 54 insertions(+), 37 deletions(-)
>>
>> --
>> 1.9.1
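The concern in this thread is that callers may silently rely on an allocator handing back zeroed memory. The sketch below uses plain malloc and calloc to show the two safe patterns once the base allocator stops zeroing; the wrapper names and the table struct are hypothetical and only stand in for the ralloc/rzalloc split being discussed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object whose fields must start out as zero/NULL. */
struct table {
   void *entries;
   unsigned size;
   unsigned used;
};

/* Stand-in for a non-zeroing allocator: contents are indeterminate. */
static void *
alloc_uninit(size_t size)
{
   return malloc(size);
}

/* Stand-in for a zeroing allocator. */
static void *
alloc_zeroed(size_t size)
{
   return calloc(1, size);
}

/* Pattern 1: keep relying on zeroed memory, but ask for it explicitly. */
static struct table *
table_create_zeroed(void)
{
   return alloc_zeroed(sizeof(struct table));
}

/* Pattern 2: use the faster allocator and initialize every field. */
static struct table *
table_create_explicit(void)
{
   struct table *t = alloc_uninit(sizeof(*t));

   if (!t)
      return NULL;
   t->entries = NULL;
   t->size = 0;
   t->used = 0;
   return t;
}
```

Code that can't be tested against a non-zeroing allocator is safest on pattern 1, which is what the request to switch untested callers to the "z" variants amounts to.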
Re: [Mesa-dev] [PATCH v2] swr: automake: don't ship LLVM version specific generated sources
> On Jun 13, 2016, at 8:03 PM, Rowley, Timothy O
> wrote:
>
> A clean tree build works with this version, but distcheck fails:
>
> ...
> rm -f config.status config.cache config.log configure.lineno
> config.status.lineno
> rm -f Makefile
> ERROR: files left in build directory after distclean:
> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.cpp
> ./src/gallium/drivers/swr/rasterizer/jitter/builder_x86.cpp
> ./src/gallium/drivers/swr/rasterizer/jitter/builder_gen.h
> make[1]: *** [distcleancheck] Error 1
> make[1]: Leaving directory
> `/home/torowley/work/mesa-opt/mesa-12.1.0-devel/_build'
> make: *** [distcheck] Error 1
>
> Not sure how builder_x86.cpp managed to change its status.

To answer my own question: the reason for builder_x86.cpp being
regenerated is because of its dependency on builder_gen.h (through
builder.h).

>> On Jun 13, 2016, at 6:46 PM, Emil Velikov wrote:
>>
>> From: Emil Velikov
>>
>> Otherwise things will fail to build, if the builder is using another
>> version of LLVM.
>>
>> v2: annotate all the dependencies of builder_gen.h
>>
>> Cc: "12.0"
>> Cc: Tim Rowley
>> Cc: Chuck Atkins
>> Reported-by: Chuck Atkins
>> Signed-off-by: Emil Velikov
>> ---
>> Unlike v1, this ones seems to work. Please give it a try and let me know
>> how it fares on your end.
>>
>> Thanks
>> Emil
>> ---
>>  src/gallium/drivers/swr/Makefile.am | 37
>> +++--
>>  1 file changed, 35 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/gallium/drivers/swr/Makefile.am
>> b/src/gallium/drivers/swr/Makefile.am
>> index 8151e4a..63dadbf 100644
>> --- a/src/gallium/drivers/swr/Makefile.am
>> +++ b/src/gallium/drivers/swr/Makefile.am
>> @@ -52,8 +52,6 @@ BUILT_SOURCES = \
>>         rasterizer/scripts/gen_knobs.cpp \
>>         rasterizer/scripts/gen_knobs.h \
>>         rasterizer/jitter/state_llvm.h \
>> -       rasterizer/jitter/builder_gen.h \
>> -       rasterizer/jitter/builder_gen.cpp \
>>         rasterizer/jitter/builder_x86.h \
>>         rasterizer/jitter/builder_x86.cpp
>>
>> @@ -122,6 +120,23 @@ COMMON_LDFLAGS = \
>>         $(NO_UNDEFINED) \
>>         $(LLVM_LDFLAGS)
>>
>> +
>> +# XXX: As we cannot use BUILT_SOURCES (the files will end up in the dist
>> +# tarball) just annotate the dependency directly.
>> +# As the single direct user of builder_gen.h is a header (builder.h) trace
>> all
>> +# the translusive users (one that use the latter header).
>> +#
>> +# Note: one should really clean the includes a bit, according to Tim there's
>> +# only 4 users of the builder_gen methods/API.
>> +rasterizer/jitter/blend_jit.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder_gen.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder_x86.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/builder_misc.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/fetch_jit.cpp: rasterizer/jitter/builder_gen.h
>> +rasterizer/jitter/streamout_jit.cpp: rasterizer/jitter/builder_gen.h
>> +swr_shader.cpp: rasterizer/jitter/builder_gen.h
>> +
>>  lib_LTLIBRARIES = libswrAVX.la libswrAVX2.la
>>
>>  libswrAVX_la_CXXFLAGS = \
>> @@ -132,6 +147,15 @@ libswrAVX_la_CXXFLAGS = \
>>  libswrAVX_la_SOURCES = \
>>         $(COMMON_SOURCES)
>>
>> +# XXX: Don't ship these generated sources for now, since they are specific
>> +# to the LLVM version they are generated from. Thus a release tarball
>> +# containing the said files, generated against eg. LLVM 3.8 will fail to
>> build
>> +# on systems with other versions of LLVM eg. 3.7 or 3.6.
>> +# Move these back to BUILT_SOURCES once that is resolved.
>> +nodist_libswrAVX_la_SOURCES = \
>> +       rasterizer/jitter/builder_gen.h \
>> +       rasterizer/jitter/builder_gen.cpp
>> +
>>  libswrAVX_la_LIBADD = \
>>         $(COMMON_LIBADD)
>>
>> @@ -146,6 +170,15 @@ libswrAVX2_la_CXXFLAGS = \
>>  libswrAVX2_la_SOURCES = \
>>         $(COMMON_SOURCES)
>>
>> +# XXX: Don't ship these generated sources for now, since they are specific
>> +# to the LLVM version they are generated from. Thus a release tarball
>> +# containing the said files, generated against eg. LLVM 3.8 will fail to
>> build
>> +# on systems with other versions of LLVM eg. 3.7 or 3.6.
>> +# Move these back to BUILT_SOURCES once that is resolved.
>> +nodist_libswrAVX2_la_SOURCES = \
>> +       rasterizer/jitter/builder_gen.h \
>> +       rasterizer/jitter/builder_gen.cpp
>> +
>>  libswrAVX2_la_LIBADD = \
>>         $(COMMON_LIBADD)
>>
>> --
>> 2.8.2
>>
>
[Mesa-dev] [PATCH 2/2] winsys/radeon: use the common job queue for multithreaded command submission v2
From: Marek Olšák v2: fixup after renaming to util_queue_fence --- src/gallium/winsys/radeon/drm/radeon_drm_cs.c | 22 src/gallium/winsys/radeon/drm/radeon_drm_cs.h | 4 +- src/gallium/winsys/radeon/drm/radeon_drm_winsys.c | 63 ++- src/gallium/winsys/radeon/drm/radeon_drm_winsys.h | 12 ++--- 4 files changed, 19 insertions(+), 82 deletions(-) diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c index e9ab53d..9552bd5 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.c @@ -177,7 +177,7 @@ radeon_drm_cs_create(struct radeon_winsys_ctx *ctx, if (!cs) { return NULL; } -pipe_semaphore_init(&cs->flush_completed, 1); +util_queue_fence_init(&cs->flush_completed); cs->ws = ws; cs->flush_cs = flush; @@ -427,8 +427,9 @@ static unsigned radeon_drm_cs_get_buffer_list(struct radeon_winsys_cs *rcs, return cs->csc->crelocs; } -void radeon_drm_cs_emit_ioctl_oneshot(struct radeon_drm_cs *cs, struct radeon_cs_context *csc) +void radeon_drm_cs_emit_ioctl_oneshot(void *job) { +struct radeon_cs_context *csc = ((struct radeon_drm_cs*)job)->cst; unsigned i; int r; @@ -463,11 +464,9 @@ void radeon_drm_cs_sync_flush(struct radeon_winsys_cs *rcs) { struct radeon_drm_cs *cs = radeon_drm_cs(rcs); -/* Wait for any pending ioctl to complete. */ -if (cs->ws->thread) { -pipe_semaphore_wait(&cs->flush_completed); -pipe_semaphore_signal(&cs->flush_completed); -} +/* Wait for any pending ioctl of this CS to complete. 
*/ +if (util_queue_is_initialized(&cs->ws->cs_queue)) +util_queue_job_wait(&cs->flush_completed); } DEBUG_GET_ONCE_BOOL_OPTION(noop, "RADEON_NOOP", FALSE) @@ -586,13 +585,12 @@ static void radeon_drm_cs_flush(struct radeon_winsys_cs *rcs, break; } -if (cs->ws->thread) { -pipe_semaphore_wait(&cs->flush_completed); -radeon_drm_ws_queue_cs(cs->ws, cs); +if (util_queue_is_initialized(&cs->ws->cs_queue)) { +util_queue_add_job(&cs->ws->cs_queue, cs, &cs->flush_completed); if (!(flags & RADEON_FLUSH_ASYNC)) radeon_drm_cs_sync_flush(rcs); } else { -radeon_drm_cs_emit_ioctl_oneshot(cs, cs->cst); +radeon_drm_cs_emit_ioctl_oneshot(cs); } } else { radeon_cs_context_cleanup(cs->cst); @@ -610,7 +608,7 @@ static void radeon_drm_cs_destroy(struct radeon_winsys_cs *rcs) struct radeon_drm_cs *cs = radeon_drm_cs(rcs); radeon_drm_cs_sync_flush(rcs); -pipe_semaphore_destroy(&cs->flush_completed); +util_queue_fence_destroy(&cs->flush_completed); radeon_cs_context_cleanup(&cs->csc1); radeon_cs_context_cleanup(&cs->csc2); p_atomic_dec(&cs->ws->num_cs); diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.h b/src/gallium/winsys/radeon/drm/radeon_drm_cs.h index 8056e72..a5f243d 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.h +++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.h @@ -78,7 +78,7 @@ struct radeon_drm_cs { void (*flush_cs)(void *ctx, unsigned flags, struct pipe_fence_handle **fence); void *flush_data; -pipe_semaphore flush_completed; +struct util_queue_fence flush_completed; }; int radeon_lookup_buffer(struct radeon_cs_context *csc, struct radeon_bo *bo); @@ -122,6 +122,6 @@ radeon_bo_is_referenced_by_any_cs(struct radeon_bo *bo) void radeon_drm_cs_sync_flush(struct radeon_winsys_cs *rcs); void radeon_drm_cs_init_functions(struct radeon_drm_winsys *ws); -void radeon_drm_cs_emit_ioctl_oneshot(struct radeon_drm_cs *cs, struct radeon_cs_context *csc); +void radeon_drm_cs_emit_ioctl_oneshot(void *job); #endif diff --git 
a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c index 5c85c8f..1f296f4 100644 --- a/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c +++ b/src/gallium/winsys/radeon/drm/radeon_drm_winsys.c @@ -534,16 +534,11 @@ static void radeon_winsys_destroy(struct radeon_winsys *rws) { struct radeon_drm_winsys *ws = (struct radeon_drm_winsys*)rws; -if (ws->thread) { -ws->kill_thread = 1; -pipe_semaphore_signal(&ws->cs_queued); -pipe_thread_wait(ws->thread); -} -pipe_semaphore_destroy(&ws->cs_queued); +if (util_queue_is_initialized(&ws->cs_queue)) +util_queue_destroy(&ws->cs_queue); pipe_mutex_destroy(ws->hyperz_owner_mutex); pipe_mutex_destroy(ws->cmask_owner_mutex); -pipe_mutex_destroy(ws->cs_stack_lock); pb_cache_deinit(&ws->bo_cache); @@ -686,55 +681,7 @@ static int compare_fd(void *key1, void *key2) stat1.st_rdev != stat2.st_rdev; } -void radeon_drm_ws_queue_cs(struct radeon_drm_winsys *ws, struct radeon_drm_cs *cs) -{ -retry: -pipe_mutex_lock(ws->cs_stack_lock); -
[Mesa-dev] [PATCH 1/2] gallium/util: import the multithreaded job queue from amdgpu winsys (v2)
From: Marek Olšák

v2: rename the event to util_queue_fence
---
 src/gallium/auxiliary/Makefile.sources        |   2 +
 src/gallium/auxiliary/util/u_queue.c          | 129 ++
 src/gallium/auxiliary/util/u_queue.h          |  80
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.c     |  23 ++---
 src/gallium/winsys/amdgpu/drm/amdgpu_cs.h     |   4 +-
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.c |  63 +
 src/gallium/winsys/amdgpu/drm/amdgpu_winsys.h |  11 +--
 7 files changed, 229 insertions(+), 83 deletions(-)
 create mode 100644 src/gallium/auxiliary/util/u_queue.c
 create mode 100644 src/gallium/auxiliary/util/u_queue.h

diff --git a/src/gallium/auxiliary/Makefile.sources b/src/gallium/auxiliary/Makefile.sources
index 7b3853e..ab58358 100644
--- a/src/gallium/auxiliary/Makefile.sources
+++ b/src/gallium/auxiliary/Makefile.sources
@@ -274,6 +274,8 @@ C_SOURCES := \
 	util/u_pstipple.c \
 	util/u_pstipple.h \
 	util/u_pwr8.h \
+	util/u_queue.c \
+	util/u_queue.h \
 	util/u_range.h \
 	util/u_rect.h \
 	util/u_resource.c \
diff --git a/src/gallium/auxiliary/util/u_queue.c b/src/gallium/auxiliary/util/u_queue.c
new file mode 100644
index 000..8e58414
--- /dev/null
+++ b/src/gallium/auxiliary/util/u_queue.c
@@ -0,0 +1,129 @@
+/*
+ * Copyright © 2016 Advanced Micro Devices, Inc.
+ * All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining
+ * a copy of this software and associated documentation files (the
+ * "Software"), to deal in the Software without restriction, including
+ * without limitation the rights to use, copy, modify, merge, publish,
+ * distribute, sub license, and/or sell copies of the Software, and to
+ * permit persons to whom the Software is furnished to do so, subject to
+ * the following conditions:
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
+ * OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS, AUTHORS
+ * AND/OR ITS SUPPLIERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE
+ * USE OR OTHER DEALINGS IN THE SOFTWARE.
+ *
+ * The above copyright notice and this permission notice (including the
+ * next paragraph) shall be included in all copies or substantial portions
+ * of the Software.
+ */
+
+#include "u_queue.h"
+
+static PIPE_THREAD_ROUTINE(util_queue_thread_func, param)
+{
+   struct util_queue *queue = (struct util_queue*)param;
+   unsigned i;
+
+   while (1) {
+      struct util_queue_job job;
+
+      pipe_semaphore_wait(&queue->queued);
+      if (queue->kill_thread)
+         break;
+
+      pipe_mutex_lock(queue->lock);
+      job = queue->jobs[0];
+      for (i = 1; i < queue->num_jobs; i++)
+         queue->jobs[i - 1] = queue->jobs[i];
+      queue->jobs[--queue->num_jobs].job = NULL;
+      pipe_mutex_unlock(queue->lock);
+
+      pipe_semaphore_signal(&queue->has_space);
+
+      if (job.job) {
+         queue->execute_job(job.job);
+         pipe_semaphore_signal(&job.fence->done);
+      }
+   }
+
+   /* signal remaining jobs before terminating */
+   pipe_mutex_lock(queue->lock);
+   for (i = 0; i < queue->num_jobs; i++) {
+      pipe_semaphore_signal(&queue->jobs[i].fence->done);
+      queue->jobs[i].job = NULL;
+   }
+   queue->num_jobs = 0;
+   pipe_mutex_unlock(queue->lock);
+   return 0;
+}
+
+void
+util_queue_init(struct util_queue *queue,
+                void (*execute_job)(void *))
+{
+   memset(queue, 0, sizeof(*queue));
+   queue->execute_job = execute_job;
+   pipe_mutex_init(queue->lock);
+   pipe_semaphore_init(&queue->has_space, ARRAY_SIZE(queue->jobs));
+   pipe_semaphore_init(&queue->queued, 0);
+   queue->thread = pipe_thread_create(util_queue_thread_func, queue);
+}
+
+void
+util_queue_destroy(struct util_queue *queue)
+{
+   queue->kill_thread = 1;
+   pipe_semaphore_signal(&queue->queued);
+   pipe_thread_wait(queue->thread);
+   pipe_semaphore_destroy(&queue->has_space);
+   pipe_semaphore_destroy(&queue->queued);
+   pipe_mutex_destroy(queue->lock);
+}
+
+void
+util_queue_fence_init(struct util_queue_fence *fence)
+{
+   pipe_semaphore_init(&fence->done, 1);
+}
+
+void
+util_queue_fence_destroy(struct util_queue_fence *fence)
+{
+   pipe_semaphore_destroy(&fence->done);
+}
+
+void
+util_queue_add_job(struct util_queue *queue,
+                   void *job,
+                   struct util_queue_fence *fence)
+{
+   /* Set the semaphore to "busy". */
+   pipe_semaphore_wait(&fence->done);
+
+   /* if the queue is full, wait until there is space */
+   pipe_semaphore_wait(&queue->has_space);
+
+   pipe_mutex_lock(queue->lock);
+   assert(queue->num_jobs < ARRAY_SIZE(queue->jobs));
+   queue->jobs[queue->num_jobs].job = j
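[Editorial sketch, not part of the patch.] The scheme in the u_queue.c hunk above uses a `has_space` semaphore counting free slots, a `queued` semaphore counting pending jobs, a mutex guarding the job array, and a per-job fence that is a binary semaphore. It can be approximated with plain POSIX primitives roughly as below. All `sketch_*` names are made up for illustration, and shutdown is handled differently from the patch: here the worker drains every queued job and treats a wake-up with an empty queue as the stop signal, whereas the patch checks a `kill_thread` flag and signals the fences of jobs it did not run.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>

#define SKETCH_QUEUE_SLOTS 8

/* Binary semaphore: 1 = idle/signalled, 0 = a job is in flight. */
struct sketch_fence { sem_t done; };

struct sketch_job { void *job; struct sketch_fence *fence; };

struct sketch_queue {
   pthread_mutex_t lock;   /* guards jobs[] and num_jobs */
   sem_t has_space;        /* counts free slots */
   sem_t queued;           /* counts pending jobs, plus one for shutdown */
   pthread_t thread;
   unsigned num_jobs;
   struct sketch_job jobs[SKETCH_QUEUE_SLOTS];
   void (*execute_job)(void *job);
};

static void *sketch_queue_thread(void *param)
{
   struct sketch_queue *q = param;

   for (;;) {
      struct sketch_job job;

      sem_wait(&q->queued);
      pthread_mutex_lock(&q->lock);
      if (q->num_jobs == 0) {
         /* Woken with nothing queued: this was the shutdown signal. */
         pthread_mutex_unlock(&q->lock);
         break;
      }
      /* Pop the oldest job and shift the rest down, as the patch does. */
      job = q->jobs[0];
      memmove(&q->jobs[0], &q->jobs[1],
              (q->num_jobs - 1) * sizeof(q->jobs[0]));
      q->num_jobs--;
      pthread_mutex_unlock(&q->lock);

      sem_post(&q->has_space);
      q->execute_job(job.job);
      sem_post(&job.fence->done);   /* mark the fence signalled again */
   }
   return NULL;
}

static void sketch_queue_init(struct sketch_queue *q,
                              void (*execute_job)(void *))
{
   memset(q, 0, sizeof(*q));
   q->execute_job = execute_job;
   pthread_mutex_init(&q->lock, NULL);
   sem_init(&q->has_space, 0, SKETCH_QUEUE_SLOTS);
   sem_init(&q->queued, 0, 0);
   pthread_create(&q->thread, NULL, sketch_queue_thread, q);
}

static void sketch_queue_destroy(struct sketch_queue *q)
{
   sem_post(&q->queued);   /* extra wake-up without a job means "stop" */
   pthread_join(q->thread, NULL);
   sem_destroy(&q->has_space);
   sem_destroy(&q->queued);
   pthread_mutex_destroy(&q->lock);
}

static void sketch_fence_init(struct sketch_fence *f)
{
   sem_init(&f->done, 0, 1);
}

/* Block until the job attached to this fence has finished. */
static void sketch_fence_wait(struct sketch_fence *f)
{
   sem_wait(&f->done);
   sem_post(&f->done);
}

static void sketch_queue_add_job(struct sketch_queue *q, void *job,
                                 struct sketch_fence *fence)
{
   sem_wait(&fence->done);     /* set the fence to "busy" */
   sem_wait(&q->has_space);    /* block while the queue is full */

   pthread_mutex_lock(&q->lock);
   assert(q->num_jobs < SKETCH_QUEUE_SLOTS);
   q->jobs[q->num_jobs].job = job;
   q->jobs[q->num_jobs].fence = fence;
   q->num_jobs++;
   pthread_mutex_unlock(&q->lock);

   sem_post(&q->queued);
}

/* Example job callback: treats the job pointer as an int counter. */
static void sketch_increment(void *job)
{
   ++*(int *)job;
}
```

A caller would initialise the queue with `sketch_increment`, add a job together with a fence, and call `sketch_fence_wait()` before reading the result. Unnamed POSIX semaphores (`sem_init`) are assumed to be available, which holds on Linux but not on every platform.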
Re: [Mesa-dev] [PATCH 06/10] gallium/util: add util_copy_index_buffer() helper
On Tue, Jun 14, 2016 at 12:32 PM, Nicolai Hähnle wrote: > On 14.06.2016 17:57, Rob Clark wrote: >> >> From: Rob Clark >> >> Note there was previously a util_set_index_buffer() which was only used >> by svga. Replace this. >> >> (The util_copy_* naming is more consistent with other u_inlines/ >> u_framebuffer helpers) > > > Looks like you're changing semantics in a few places: memcpy is replaced by > util_copy_index_buffer, which does reference counting. I'll double check, but I think the replaced memcpy's are only in drivers which do not support the non-user_buffer case. (Pretty sure the memcpy approach would have been completely broken otherwise.) BR, -R > Nicolai > > >> >> Signed-off-by: Rob Clark >> --- >> src/gallium/auxiliary/util/u_helpers.c | 15 --- >> src/gallium/auxiliary/util/u_helpers.h | 3 --- >> src/gallium/auxiliary/util/u_inlines.h | 17 + >> src/gallium/drivers/freedreno/freedreno_state.c | 11 +-- >> src/gallium/drivers/i915/i915_state.c | 6 +- >> src/gallium/drivers/ilo/ilo_state.c | 10 +- >> src/gallium/drivers/llvmpipe/lp_state_vertex.c | 6 +- >> src/gallium/drivers/nouveau/nv30/nv30_state.c | 11 +-- >> src/gallium/drivers/r300/r300_state.c | 8 +--- >> src/gallium/drivers/r600/r600_state_common.c| 5 + >> src/gallium/drivers/radeonsi/si_state.c | 6 +- >> src/gallium/drivers/softpipe/sp_state_vertex.c | 6 +- >> src/gallium/drivers/svga/svga_pipe_vertex.c | 2 +- >> src/gallium/drivers/swr/swr_state.cpp | 7 +-- >> src/gallium/drivers/vc4/vc4_state.c | 11 +-- >> src/gallium/drivers/virgl/virgl_context.c | 8 +--- >> 16 files changed, 30 insertions(+), 102 deletions(-) >> >> diff --git a/src/gallium/auxiliary/util/u_helpers.c >> b/src/gallium/auxiliary/util/u_helpers.c >> index 09020b0..117a51b 100644 >> --- a/src/gallium/auxiliary/util/u_helpers.c >> +++ b/src/gallium/auxiliary/util/u_helpers.c >> @@ -94,18 +94,3 @@ void util_set_vertex_buffers_count(struct >> pipe_vertex_buffer *dst, >> >> *dst_count = util_last_bit(enabled_buffers); >> } >> - >> - >> 
-void >> -util_set_index_buffer(struct pipe_index_buffer *dst, >> - const struct pipe_index_buffer *src) >> -{ >> - if (src) { >> - pipe_resource_reference(&dst->buffer, src->buffer); >> - memcpy(dst, src, sizeof(*dst)); >> - } >> - else { >> - pipe_resource_reference(&dst->buffer, NULL); >> - memset(dst, 0, sizeof(*dst)); >> - } >> -} >> diff --git a/src/gallium/auxiliary/util/u_helpers.h >> b/src/gallium/auxiliary/util/u_helpers.h >> index a9a53e4..9804163 100644 >> --- a/src/gallium/auxiliary/util/u_helpers.h >> +++ b/src/gallium/auxiliary/util/u_helpers.h >> @@ -44,9 +44,6 @@ void util_set_vertex_buffers_count(struct >> pipe_vertex_buffer *dst, >> const struct pipe_vertex_buffer *src, >> unsigned start_slot, unsigned count); >> >> -void util_set_index_buffer(struct pipe_index_buffer *dst, >> - const struct pipe_index_buffer *src); >> - >> #ifdef __cplusplus >> } >> #endif >> diff --git a/src/gallium/auxiliary/util/u_inlines.h >> b/src/gallium/auxiliary/util/u_inlines.h >> index 207e2aa..78125c8 100644 >> --- a/src/gallium/auxiliary/util/u_inlines.h >> +++ b/src/gallium/auxiliary/util/u_inlines.h >> @@ -623,6 +623,23 @@ util_copy_constant_buffer(struct pipe_constant_buffer >> *dst, >> } >> >> static inline void >> +util_copy_index_buffer(struct pipe_index_buffer *dst, >> + const struct pipe_index_buffer *src) >> +{ >> + if (src) { >> + dst->index_size = src->index_size; >> + dst->offset = src->offset; >> + pipe_resource_reference(&dst->buffer, src->buffer); >> + dst->user_buffer = src->user_buffer; >> + } else { >> + dst->index_size = 0; >> + dst->offset = 0; >> + pipe_resource_reference(&dst->buffer, NULL); >> + dst->user_buffer = NULL; >> + } >> +} >> + >> +static inline void >> util_copy_image_view(struct pipe_image_view *dst, >>const struct pipe_image_view *src) >> { >> diff --git a/src/gallium/drivers/freedreno/freedreno_state.c >> b/src/gallium/drivers/freedreno/freedreno_state.c >> index 53ea39b..688975f 100644 >> --- 
a/src/gallium/drivers/freedreno/freedreno_state.c >> +++ b/src/gallium/drivers/freedreno/freedreno_state.c >> @@ -207,16 +207,7 @@ fd_set_index_buffer(struct pipe_context *pctx, >> const struct pipe_index_buffer *ib) >> { >> struct fd_context *ctx = fd_context(pctx); >> - >> - if (ib) { >> - pipe_resource_reference(&ctx->indexbuf.buffer, >> ib->buffer); >> - ctx->indexbuf.index_size = ib->index_size; >> - ctx->indexbuf.offset = ib->offset; >> -
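[Editorial sketch, not part of the thread.] The semantic difference Nicolai points out, a plain memcpy versus a copy that takes a reference, can be shown with a toy reference count. The `sketch_*` types below are hypothetical stand-ins for `pipe_resource` and `pipe_index_buffer`, and the reference helper only mimics what `pipe_resource_reference()` is understood to do (take a reference on the new pointer, drop the one on the old); a real implementation would also destroy the resource when the count reaches zero.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_resource { int refcount; };

struct sketch_index_buffer {
   unsigned index_size;
   unsigned offset;
   struct sketch_resource *buffer;
   const void *user_buffer;
};

/* Point *dst at src, adjusting both reference counts. */
static void sketch_resource_reference(struct sketch_resource **dst,
                                      struct sketch_resource *src)
{
   if (*dst == src)
      return;
   if (src)
      src->refcount++;
   if (*dst)
      (*dst)->refcount--;   /* real code would free at zero */
   *dst = src;
}

/* Same shape as util_copy_index_buffer() in the patch. */
static void sketch_copy_index_buffer(struct sketch_index_buffer *dst,
                                     const struct sketch_index_buffer *src)
{
   if (src) {
      dst->index_size = src->index_size;
      dst->offset = src->offset;
      sketch_resource_reference(&dst->buffer, src->buffer);
      dst->user_buffer = src->user_buffer;
   } else {
      dst->index_size = 0;
      dst->offset = 0;
      sketch_resource_reference(&dst->buffer, NULL);
      dst->user_buffer = NULL;
   }
}

/* The old driver pattern: copies the pointer but takes no reference. */
static void sketch_memcpy_index_buffer(struct sketch_index_buffer *dst,
                                       const struct sketch_index_buffer *src)
{
   memcpy(dst, src, sizeof(*dst));
}
```

With the memcpy variant the destination holds a pointer it does not own, so the resource may go away underneath it. That is harmless only when `buffer` is always NULL, i.e. in the user_buffer-only case Rob describes.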
Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean
On Tue, Jun 14, 2016 at 12:24 PM, Ilia Mirkin wrote:
> On Tue, Jun 14, 2016 at 12:19 PM, Nicolai Hähnle wrote:
>> On 14.06.2016 17:57, Rob Clark wrote:
>>>
>>> From: Rob Clark
>>>
>>> s/bool/boolean/ to make it match the other APIs.
>>
>>
>> Please no. C has finally grown a proper bool type, we should use it where
>> possible. If anything, make the patch go in the other direction.
>
> FWIW I've eradicated boolean from nouveau except for the gallium API
> interfaces. Would definitely be in favor of flipping those to bool.

ok, I don't mind going the other direction.. I just picked 'boolean' since that is what most of the query APIs already used.

(and in fact this patch isn't strictly required for the rsq_state.py stuff.. the inconsistency just bothered me ;-))

BR,
-R

> Cheers,
>
> -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
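[Editorial sketch, not part of the thread.] One concrete reason to prefer C99 `bool` over an integer typedef: conversion to `bool` normalises any non-zero value to 1, while a narrow integer type silently truncates. The example assumes gallium's `boolean` is an `unsigned char` typedef; `legacy_boolean` below is a stand-in name for it, not the real type.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for an integer-typedef boolean (assumed unsigned char). */
typedef unsigned char legacy_boolean;

/* Truncates: bits above the low 8 are lost in the conversion. */
static legacy_boolean has_flag_legacy(unsigned flags, unsigned mask)
{
   return (legacy_boolean)(flags & mask);
}

/* Normalises: any non-zero value becomes true. */
static bool has_flag_bool(unsigned flags, unsigned mask)
{
   return flags & mask;
}
```

With `mask = 0x100` the legacy version reports the flag as unset even though it is set, while the `bool` version returns `true`.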
Re: [Mesa-dev] [PATCH 06/10] gallium/util: add util_copy_index_buffer() helper
On 14.06.2016 17:57, Rob Clark wrote: From: Rob Clark Note there was previously a util_set_index_buffer() which was only used by svga. Replace this. (The util_copy_* naming is more consistent with other u_inlines/ u_framebuffer helpers) Looks like you're changing semantics in a few places: memcpy is replaced by util_copy_index_buffer, which does reference counting. Nicolai Signed-off-by: Rob Clark --- src/gallium/auxiliary/util/u_helpers.c | 15 --- src/gallium/auxiliary/util/u_helpers.h | 3 --- src/gallium/auxiliary/util/u_inlines.h | 17 + src/gallium/drivers/freedreno/freedreno_state.c | 11 +-- src/gallium/drivers/i915/i915_state.c | 6 +- src/gallium/drivers/ilo/ilo_state.c | 10 +- src/gallium/drivers/llvmpipe/lp_state_vertex.c | 6 +- src/gallium/drivers/nouveau/nv30/nv30_state.c | 11 +-- src/gallium/drivers/r300/r300_state.c | 8 +--- src/gallium/drivers/r600/r600_state_common.c| 5 + src/gallium/drivers/radeonsi/si_state.c | 6 +- src/gallium/drivers/softpipe/sp_state_vertex.c | 6 +- src/gallium/drivers/svga/svga_pipe_vertex.c | 2 +- src/gallium/drivers/swr/swr_state.cpp | 7 +-- src/gallium/drivers/vc4/vc4_state.c | 11 +-- src/gallium/drivers/virgl/virgl_context.c | 8 +--- 16 files changed, 30 insertions(+), 102 deletions(-) diff --git a/src/gallium/auxiliary/util/u_helpers.c b/src/gallium/auxiliary/util/u_helpers.c index 09020b0..117a51b 100644 --- a/src/gallium/auxiliary/util/u_helpers.c +++ b/src/gallium/auxiliary/util/u_helpers.c @@ -94,18 +94,3 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer *dst, *dst_count = util_last_bit(enabled_buffers); } - - -void -util_set_index_buffer(struct pipe_index_buffer *dst, - const struct pipe_index_buffer *src) -{ - if (src) { - pipe_resource_reference(&dst->buffer, src->buffer); - memcpy(dst, src, sizeof(*dst)); - } - else { - pipe_resource_reference(&dst->buffer, NULL); - memset(dst, 0, sizeof(*dst)); - } -} diff --git a/src/gallium/auxiliary/util/u_helpers.h b/src/gallium/auxiliary/util/u_helpers.h index 
a9a53e4..9804163 100644 --- a/src/gallium/auxiliary/util/u_helpers.h +++ b/src/gallium/auxiliary/util/u_helpers.h @@ -44,9 +44,6 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer *dst, const struct pipe_vertex_buffer *src, unsigned start_slot, unsigned count); -void util_set_index_buffer(struct pipe_index_buffer *dst, - const struct pipe_index_buffer *src); - #ifdef __cplusplus } #endif diff --git a/src/gallium/auxiliary/util/u_inlines.h b/src/gallium/auxiliary/util/u_inlines.h index 207e2aa..78125c8 100644 --- a/src/gallium/auxiliary/util/u_inlines.h +++ b/src/gallium/auxiliary/util/u_inlines.h @@ -623,6 +623,23 @@ util_copy_constant_buffer(struct pipe_constant_buffer *dst, } static inline void +util_copy_index_buffer(struct pipe_index_buffer *dst, + const struct pipe_index_buffer *src) +{ + if (src) { + dst->index_size = src->index_size; + dst->offset = src->offset; + pipe_resource_reference(&dst->buffer, src->buffer); + dst->user_buffer = src->user_buffer; + } else { + dst->index_size = 0; + dst->offset = 0; + pipe_resource_reference(&dst->buffer, NULL); + dst->user_buffer = NULL; + } +} + +static inline void util_copy_image_view(struct pipe_image_view *dst, const struct pipe_image_view *src) { diff --git a/src/gallium/drivers/freedreno/freedreno_state.c b/src/gallium/drivers/freedreno/freedreno_state.c index 53ea39b..688975f 100644 --- a/src/gallium/drivers/freedreno/freedreno_state.c +++ b/src/gallium/drivers/freedreno/freedreno_state.c @@ -207,16 +207,7 @@ fd_set_index_buffer(struct pipe_context *pctx, const struct pipe_index_buffer *ib) { struct fd_context *ctx = fd_context(pctx); - - if (ib) { - pipe_resource_reference(&ctx->indexbuf.buffer, ib->buffer); - ctx->indexbuf.index_size = ib->index_size; - ctx->indexbuf.offset = ib->offset; - ctx->indexbuf.user_buffer = ib->user_buffer; - } else { - pipe_resource_reference(&ctx->indexbuf.buffer, NULL); - } - + util_copy_index_buffer(&ctx->indexbuf, ib); ctx->dirty |= FD_DIRTY_INDEXBUF; } diff --git 
a/src/gallium/drivers/i915/i915_state.c b/src/gallium/drivers/i915/i915_state.c index 2efa14e..dbd711f 100644 --- a/src/gallium/drivers/i915/i915_state.c +++ b/src/gallium/drivers/i915/i915_state.c @@ -1063,11 +1063,7 @@ static void i915_set_index_buffer(struct pipe_context *pipe, const struct pipe_index_
Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
On 14.06.2016 18:02, Ilia Mirkin wrote: Can you explain the motivation behind this change? I'm adding a ->set_window_rectangles thing which also takes multiple parameters. What's the advantage of stuffing things into a struct first? FWIW, I tend to be mildly supportive of changes like this. At least, the other extreme where functions grow multiple bool or int parameters over time is much worse. But in this particular case, changing this around might be too eager. Perhaps teaching the script to deal with slightly more complicated cases will help elsewhere, too. Nicolai -ilia On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark wrote: From: Rob Clark The rest of the state APIs take state structs, rather than inline parameters (with the exception of a couple which just amount to a single uint). This makes the API more regular and simplifies autogeneration of the gallium state related APIs. Signed-off-by: Rob Clark --- src/gallium/drivers/ddebug/dd_context.c | 9 - src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 7 +++ src/gallium/drivers/r600/evergreen_state.c| 7 +++ src/gallium/drivers/radeonsi/si_state.c | 7 +++ src/gallium/drivers/trace/tr_context.c| 9 - src/gallium/include/pipe/p_context.h | 4 ++-- src/gallium/include/pipe/p_state.h| 8 src/mesa/state_tracker/st_atom_tess.c | 13 ++--- 8 files changed, 37 insertions(+), 27 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index 0f8ef18..06b7c91 100644 --- a/src/gallium/drivers/ddebug/dd_context.c +++ b/src/gallium/drivers/ddebug/dd_context.c @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context *_pipe, } static void dd_context_set_tess_state(struct pipe_context *_pipe, - const float default_outer_level[4], - const float default_inner_level[2]) + const struct pipe_tess_state *state) { struct dd_context *dctx = dd_context(_pipe); struct pipe_context *pipe = dctx->pipe; - memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4); - 
memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 2); - pipe->set_tess_state(pipe, default_outer_level, default_inner_level); + memcpy(dctx->tess_default_levels, state->default_outer_level, sizeof(float) * 4); + memcpy(dctx->tess_default_levels+4, state->default_inner_level, sizeof(float) * 2); + pipe->set_tess_state(pipe, state); } diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index 92161ec..a9c1830 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe, static void nvc0_set_tess_state(struct pipe_context *pipe, -const float default_tess_outer[4], -const float default_tess_inner[2]) +const struct pipe_tess_state *state) { struct nvc0_context *nvc0 = nvc0_context(pipe); - memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float)); - memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float)); + memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * sizeof(float)); + memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * sizeof(float)); nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR; } diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 1ac8914..2a424f5 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -3569,13 +3569,12 @@ fallback: } static void evergreen_set_tess_state(struct pipe_context *ctx, -const float default_outer_level[4], -const float default_inner_level[2]) +const struct pipe_tess_state *state) { struct r600_context *rctx = (struct r600_context *)ctx; - memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4); - memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2); + memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * 4); + memcpy(rctx->tess_state+4, 
state->default_inner_level, sizeof(float) * 2); rctx->tess_state_dirty = true; } diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 0c52eee..6ef3fe5 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -3238,15 +3238,14 @@ static void si_set_index_buffer(struct pipe_context *ctx, */ static void si_set_tess_state(struct pipe_context *ctx, -
Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
On Tue, Jun 14, 2016 at 12:13 PM, Ilia Mirkin wrote:
> [trimming cc's because mesa-dev hates them]
>
> On Tue, Jun 14, 2016 at 12:09 PM, Rob Clark wrote:
>> On Tue, Jun 14, 2016 at 12:02 PM, Ilia Mirkin wrote:
>>> Can you explain the motivation behind this change? I'm adding a
>>> ->set_window_rectangles thing which also takes multiple parameters.
>>> What's the advantage of stuffing things into a struct first?
>>
>> consistency with the other pipe->set_xyz APIs, and adding support for
>> it in rsq would then be a one line addition on rsq_state.py rather
>> than writing a bunch of code by hand ;-)
>
> Sounds like it should be easy to extend your new script to handle
> different argument types easily, no? I don't know that the "this new
> script I wrote doesn't handle this situation well, so let's change a
> bunch of code" logic is the right one...

maybe.. in the best case it makes the script more complicated. And, IMHO, it is nice when the API is more consistent. Plus having a struct gives other drivers a convenient way to store the information. So I think it makes sense on its own, even ignoring the script.

BR,
-R

> -ilia
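[Editorial sketch, not part of the thread.] Rob's point that a struct "gives other drivers a convenient way to store the information" can be illustrated as follows. The field names mirror the ones used in the dd_context hunk of the patch (`default_outer_level`, `default_inner_level`), but the struct itself, the context type and the dirty flag are hypothetical; the real definition lives in p_state.h and is not shown in this thread.

```c
#include <assert.h>

/* Hypothetical state struct; field names follow the dd_context hunk. */
struct example_tess_state {
   float default_outer_level[4];
   float default_inner_level[2];
};

#define EXAMPLE_DIRTY_TESS 0x1u   /* made-up dirty bit */

/* Hypothetical driver context that keeps a copy of the state. */
struct example_context {
   struct example_tess_state tess;
   unsigned dirty;
};

/* With a struct parameter, storing the state is a single assignment
 * instead of two memcpy calls with hand-written sizes. */
static void example_set_tess_state(struct example_context *ctx,
                                   const struct example_tess_state *state)
{
   assert(ctx && state);
   ctx->tess = *state;
   ctx->dirty |= EXAMPLE_DIRTY_TESS;
}
```

The struct assignment copies both arrays, so the array sizes are written once, in the type, rather than repeated in every driver.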
Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean
On Tue, Jun 14, 2016 at 12:19 PM, Nicolai Hähnle wrote:
> On 14.06.2016 17:57, Rob Clark wrote:
>>
>> From: Rob Clark
>>
>> s/bool/boolean/ to make it match the other APIs.
>
>
> Please no. C has finally grown a proper bool type, we should use it where
> possible. If anything, make the patch go in the other direction.

FWIW I've eradicated boolean from nouveau except for the gallium API interfaces. Would definitely be in favor of flipping those to bool.

Cheers,

-ilia
Re: [Mesa-dev] [PATCH 04/10] gallium: make image_view const
Patches 2-4: Reviewed-by: Nicolai Hähnle On 14.06.2016 17:57, Rob Clark wrote: From: Rob Clark Signed-off-by: Rob Clark --- src/gallium/drivers/ddebug/dd_context.c | 2 +- src/gallium/drivers/ilo/ilo_state.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 4 ++-- src/gallium/drivers/radeonsi/si_descriptors.c | 6 +++--- src/gallium/drivers/softpipe/sp_state_image.c | 2 +- src/gallium/drivers/trace/tr_context.c| 2 +- src/gallium/include/pipe/p_context.h | 2 +- 7 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index 64b16f6..f72fd2f 100644 --- a/src/gallium/drivers/ddebug/dd_context.c +++ b/src/gallium/drivers/ddebug/dd_context.c @@ -490,7 +490,7 @@ dd_context_set_sampler_views(struct pipe_context *_pipe, unsigned shader, static void dd_context_set_shader_images(struct pipe_context *_pipe, unsigned shader, unsigned start, unsigned num, - struct pipe_image_view *views) + const struct pipe_image_view *views) { struct dd_context *dctx = dd_context(_pipe); struct pipe_context *pipe = dctx->pipe; diff --git a/src/gallium/drivers/ilo/ilo_state.c b/src/gallium/drivers/ilo/ilo_state.c index 53a5aca..4f1002e 100644 --- a/src/gallium/drivers/ilo/ilo_state.c +++ b/src/gallium/drivers/ilo/ilo_state.c @@ -1851,7 +1851,7 @@ ilo_set_sampler_views(struct pipe_context *pipe, unsigned shader, static void ilo_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned count, - struct pipe_image_view *views) + const struct pipe_image_view *views) { #if 0 struct ilo_state_vector *vec = &ilo_context(pipe)->state_vector; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index a0e01bd..0bd756f 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c @@ -1233,7 +1233,7 @@ nvc0_set_compute_resources(struct pipe_context *pipe, static bool 
nvc0_bind_images_range(struct nvc0_context *nvc0, const unsigned s, unsigned start, unsigned nr, - struct pipe_image_view *pimages) + const struct pipe_image_view *pimages) { const unsigned end = start + nr; unsigned mask = 0; @@ -1301,7 +1301,7 @@ nvc0_bind_images_range(struct nvc0_context *nvc0, const unsigned s, static void nvc0_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned nr, - struct pipe_image_view *images) + const struct pipe_image_view *images) { const unsigned s = nvc0_shader_stage(shader); if (!nvc0_bind_images_range(nvc0_context(pipe), s, start, nr, images)) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 55686e8..e95556b 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -560,7 +560,7 @@ si_disable_shader_image(struct si_context *ctx, unsigned shader, unsigned slot) } static void -si_mark_image_range_valid(struct pipe_image_view *view) +si_mark_image_range_valid(const struct pipe_image_view *view) { struct r600_resource *res = (struct r600_resource *)view->resource; const struct util_format_description *desc; @@ -578,7 +578,7 @@ si_mark_image_range_valid(struct pipe_image_view *view) static void si_set_shader_image(struct si_context *ctx, unsigned shader, - unsigned slot, struct pipe_image_view *view) + unsigned slot, const struct pipe_image_view *view) { struct si_screen *screen = ctx->screen; struct si_images_info *images = &ctx->images[shader]; @@ -674,7 +674,7 @@ static void si_set_shader_image(struct si_context *ctx, static void si_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start_slot, unsigned count, -struct pipe_image_view *views) +const struct pipe_image_view *views) { struct si_context *ctx = (struct si_context *)pipe; unsigned i, slot; diff --git a/src/gallium/drivers/softpipe/sp_state_image.c b/src/gallium/drivers/softpipe/sp_state_image.c index 
81bb7ca..553a76a 100644 --- a/src/gallium/drivers/softpipe/sp_state_image.c +++ b/src/gallium/drivers/softpipe/sp_state_image.c @@ -30,7 +30,7 @@ static void softpipe_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned num, -
Re: [Mesa-dev] [PATCH 0/7] Fix ralloc/rzalloc usage v2
On 14.06.2016 17:07, Ilia Mirkin wrote:
> I assume you've only tested this with i965? ralloc is also used by
> st/mesa, freedreno, and vc4. Should probably try to coordinate with the
> responsible developers before making the big switch.

In st_glsl_to_tgsi.c, there is one ralloc(mem_ctx, function_entry) that doesn't initialize all members. That should probably get the rzalloc treatment. The other uses in that file look fine to me.

The use in st_nir_lower_builtin.c looks problematic as well.

Nicolai

> -ilia
>
> On Tue, Jun 14, 2016 at 10:58 AM, Juha-Pekka Heikkila wrote:
>> Here is a fixed version of this ralloc set. This time I got to run it
>> on many different machines, thanks to Mark Janes. No regressions
>> showed up on the different hardware generations. On my IVB I have also
>> been running many different traces with Apitrace with Valgrind running
>> in the background, and Valgrind seemed happy with my changes.
>>
>> As a performance test I did shader-db compile runs 10 times and
>> compared the timing results against what Mesa master does on my IVB.
>> To my surprise this brings a reasonable gain which also seems to be
>> repeatable: on my IVB shader compile time is around 5% faster with
>> these changes.
>>
>> /Juha-Pekka
>>
>> Juha-Pekka Heikkila (7):
>>   glsl: Fix reading of uninitialized memory
>>   util: use rzalloc instead on ralloc in _mesa_hash_table_create()
>>   util: use rzalloc instead on ralloc in _mesa_set_create(()
>>   nir: zero allocated memory where needed
>>   i965/vec4: zero allocated memory where needed
>>   i965/fs: fill allocated memory with zeros where needed
>>   util: Fix ralloc to use malloc instead of calloc
>>
>>  src/compiler/glsl/ast_to_hir.cpp | 2 +-
>>  src/compiler/glsl/glcpp/glcpp-parse.y | 4 +-
>>  src/compiler/glsl/link_uniform_blocks.cpp | 2 +-
>>  src/compiler/glsl_types.cpp | 2 +-
>>  src/compiler/nir/nir.c | 6 +--
>>  src/compiler/nir/nir_opt_dce.c | 2 +-
>>  src/compiler/nir/nir_phi_builder.c | 2 +-
>>  src/compiler/nir/nir_search.c | 2 +-
>>  src/compiler/nir/nir_to_ssa.c | 2 +-
>>  src/compiler/nir/nir_worklist.c | 2 +-
>>  .../drivers/dri/i965/brw_fs_copy_propagation.cpp | 2 +-
>>  .../dri/i965/brw_fs_dead_code_eliminate.cpp | 4 +-
>>  .../dri/i965/brw_vec4_dead_code_eliminate.cpp | 4 +-
>>  src/util/hash_table.c | 2 +-
>>  src/util/ralloc.c | 49 +++---
>>  src/util/ralloc.h | 2 +-
>>  src/util/set.c | 2 +-
>>  17 files changed, 54 insertions(+), 37 deletions(-)
>>
>> --
>> 1.9.1
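[Editorial sketch, not part of the thread.] The hazard Nicolai describes, an allocation whose members are not all initialised once the allocator stops zeroing memory, is the usual malloc versus calloc distinction. The struct below is a hypothetical stand-in, not the real `function_entry` from st_glsl_to_tgsi.c, and the helpers use the C library directly rather than ralloc/rzalloc.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical entry type; only `name` is set explicitly by the creator. */
struct example_entry {
   const char *name;
   unsigned sig_id;
   int is_defined;
};

/* Zeroing allocation (the rzalloc-style behaviour): fields the caller
 * does not set are guaranteed to read as zero. */
static struct example_entry *example_entry_create_zeroed(const char *name)
{
   struct example_entry *e = calloc(1, sizeof(*e));
   if (!e)
      return NULL;
   e->name = name;
   return e;
}

/* Non-zeroing allocation (what ralloc becomes after the series): every
 * field must be initialised by hand before it is read, otherwise the
 * read is of indeterminate memory. */
static struct example_entry *example_entry_create_explicit(const char *name)
{
   struct example_entry *e = malloc(sizeof(*e));
   if (!e)
      return NULL;
   e->name = name;
   e->sig_id = 0;
   e->is_defined = 0;
   return e;
}
```

Call sites that relied on implicit zeroing need one of these two treatments; Nicolai's suggestion for the st_glsl_to_tgsi.c case corresponds to the first.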
Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
On Tue, Jun 14, 2016 at 12:02 PM, Ilia Mirkin wrote: > Can you explain the motivation behind this change? I'm adding a > ->set_window_rectangles thing which also takes multiple parameters. > What's the advantage of stuffing things into a struct first? consistency with the other pipe->set_xyz APIs, and adding support for it in rsq would then be a one line addition on rsq_state.py rather than writing a bunch of code by hand ;-) BR, -R > -ilia > > On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark wrote: >> From: Rob Clark >> >> The rest of the state APIs take state structs, rather than inline >> parameters (with the exception of a couple which just amount to a single >> uint). >> >> This makes the API more regular and simplifies autogeneration of the >> gallium state related APIs. >> >> Signed-off-by: Rob Clark >> --- >> src/gallium/drivers/ddebug/dd_context.c | 9 - >> src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 7 +++ >> src/gallium/drivers/r600/evergreen_state.c| 7 +++ >> src/gallium/drivers/radeonsi/si_state.c | 7 +++ >> src/gallium/drivers/trace/tr_context.c| 9 - >> src/gallium/include/pipe/p_context.h | 4 ++-- >> src/gallium/include/pipe/p_state.h| 8 >> src/mesa/state_tracker/st_atom_tess.c | 13 ++--- >> 8 files changed, 37 insertions(+), 27 deletions(-) >> >> diff --git a/src/gallium/drivers/ddebug/dd_context.c >> b/src/gallium/drivers/ddebug/dd_context.c >> index 0f8ef18..06b7c91 100644 >> --- a/src/gallium/drivers/ddebug/dd_context.c >> +++ b/src/gallium/drivers/ddebug/dd_context.c >> @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context >> *_pipe, >> } >> >> static void dd_context_set_tess_state(struct pipe_context *_pipe, >> - const float default_outer_level[4], >> - const float default_inner_level[2]) >> + const struct pipe_tess_state *state) >> { >> struct dd_context *dctx = dd_context(_pipe); >> struct pipe_context *pipe = dctx->pipe; >> >> - memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * >> 4); >> - 
memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * >> 2); >> - pipe->set_tess_state(pipe, default_outer_level, default_inner_level); >> + memcpy(dctx->tess_default_levels, state->default_outer_level, >> sizeof(float) * 4); >> + memcpy(dctx->tess_default_levels+4, state->default_inner_level, >> sizeof(float) * 2); >> + pipe->set_tess_state(pipe, state); >> } >> >> >> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >> index 92161ec..a9c1830 100644 >> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c >> @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe, >> >> static void >> nvc0_set_tess_state(struct pipe_context *pipe, >> -const float default_tess_outer[4], >> -const float default_tess_inner[2]) >> +const struct pipe_tess_state *state) >> { >> struct nvc0_context *nvc0 = nvc0_context(pipe); >> >> - memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float)); >> - memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float)); >> + memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * >> sizeof(float)); >> + memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * >> sizeof(float)); >> nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR; >> } >> >> diff --git a/src/gallium/drivers/r600/evergreen_state.c >> b/src/gallium/drivers/r600/evergreen_state.c >> index 1ac8914..2a424f5 100644 >> --- a/src/gallium/drivers/r600/evergreen_state.c >> +++ b/src/gallium/drivers/r600/evergreen_state.c >> @@ -3569,13 +3569,12 @@ fallback: >> } >> >> static void evergreen_set_tess_state(struct pipe_context *ctx, >> -const float default_outer_level[4], >> -const float default_inner_level[2]) >> +const struct pipe_tess_state *state) >> { >> struct r600_context *rctx = (struct r600_context *)ctx; >> >> - memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4); >> - memcpy(rctx->tess_state+4, 
default_inner_level, sizeof(float) * 2); >> + memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * >> 4); >> + memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) >> * 2); >> rctx->tess_state_dirty = true; >> } >> >> diff --git a/src/gallium/drivers/radeonsi/si_state.c >> b/src/gallium/drivers/radeonsi/si_state.c >> index 0c52eee..6ef3fe5 100644 >> --- a/src/gallium/drivers/radeonsi/si_state.c >> +++ b/src/gallium/drivers/radeonsi/si_state.c >> @@ -3238,15 +3238,14 @@ stati
Re: [Mesa-dev] [PATCH 00/10] gallium: resequencer layer
On 06/14/2016 10:07 AM, Rob Clark wrote:
> bleh, seems like max-cc's is still too low on mesa-dev, and some of the
> patches didn't get through. You can also find them here:
>
> https://github.com/freedreno/mesa/commits/wip-rsq

I don't see a way to raise the max-cc's in the mailman interface. I approved your pending messages.

-Brian
Re: [Mesa-dev] [PATCH v2] glsl: reuse main extension table to appropriate restrict extensions
On Mon, Jun 13, 2016 at 11:43:57PM -0400, Ilia Mirkin wrote:
> Previously we were only restricting based on ES/non-ES-ness and whether
> the overall enable bit had been flipped on. However we have been adding
> more fine-grained restrictions, such as based on compat profiles, as
> well as specific ES versions. Most of the time this doesn't matter, but
> it can create awkward situations and duplication of logic.
>
> Here we separate the main extension table into a separate object file,
> linked to the glsl compiler, which makes use of it with a custom
> function which takes the ES-ness of the shader into account (thus
> allowing desktop shaders to properly use ES extensions that would
> otherwise have been disallowed.)
>
> The effect of this change should be nil in most cases.
>
> Signed-off-by: Ilia Mirkin
> ---
>
> v1 -> v2:
> - use a final enum to obtain number of extensions
> - move calculation of the gl version to be once per shader, for better reuse
> - bake GL version into the "supported_versions" struct
> - while we're at it, fix supported_versions size, it was off by 1 since ES 3.20
>   "support" was added.
>

Looks all good to me :)

Reviewed-by: Eric Engestrom
Re: [Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean
On 14.06.2016 17:57, Rob Clark wrote: From: Rob Clark s/bool/boolean/ to make it match the other APIs. Please no. C has finally grown a proper bool type, we should use it where possible. If anything, make the patch go in the other direction. Nicolai Signed-off-by: Rob Clark --- src/gallium/drivers/freedreno/freedreno_query.c | 2 +- src/gallium/drivers/i915/i915_query.c | 2 +- src/gallium/drivers/ilo/ilo_query.c | 2 +- src/gallium/drivers/llvmpipe/lp_query.c | 2 +- src/gallium/drivers/noop/noop_pipe.c| 2 +- src/gallium/drivers/nouveau/nv30/nv30_query.c | 2 +- src/gallium/drivers/nouveau/nv50/nv50_query.c | 2 +- src/gallium/drivers/r300/r300_query.c | 4 ++-- src/gallium/drivers/radeon/r600_query.c | 2 +- src/gallium/drivers/rbug/rbug_context.c | 4 ++-- src/gallium/drivers/softpipe/sp_query.c | 2 +- src/gallium/drivers/svga/svga_pipe_query.c | 2 +- src/gallium/drivers/swr/swr_query.cpp | 2 +- src/gallium/drivers/vc4/vc4_query.c | 2 +- src/gallium/drivers/virgl/virgl_query.c | 4 ++-- src/gallium/include/pipe/p_context.h| 2 +- 16 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/gallium/drivers/freedreno/freedreno_query.c b/src/gallium/drivers/freedreno/freedreno_query.c index 18e0c79..fc50076 100644 --- a/src/gallium/drivers/freedreno/freedreno_query.c +++ b/src/gallium/drivers/freedreno/freedreno_query.c @@ -66,7 +66,7 @@ fd_begin_query(struct pipe_context *pctx, struct pipe_query *pq) return q->funcs->begin_query(fd_context(pctx), q); } -static bool +static boolean fd_end_query(struct pipe_context *pctx, struct pipe_query *pq) { struct fd_query *q = fd_query(pq); diff --git a/src/gallium/drivers/i915/i915_query.c b/src/gallium/drivers/i915/i915_query.c index d6015a6..9d5569a 100644 --- a/src/gallium/drivers/i915/i915_query.c +++ b/src/gallium/drivers/i915/i915_query.c @@ -60,7 +60,7 @@ static boolean i915_begin_query(struct pipe_context *ctx, return true; } -static bool i915_end_query(struct pipe_context *ctx, struct pipe_query *query) +static boolean 
i915_end_query(struct pipe_context *ctx, struct pipe_query *query) { return true; } diff --git a/src/gallium/drivers/ilo/ilo_query.c b/src/gallium/drivers/ilo/ilo_query.c index 3088c96..98c3f6d 100644 --- a/src/gallium/drivers/ilo/ilo_query.c +++ b/src/gallium/drivers/ilo/ilo_query.c @@ -128,7 +128,7 @@ ilo_begin_query(struct pipe_context *pipe, struct pipe_query *query) return true; } -static bool +static boolean ilo_end_query(struct pipe_context *pipe, struct pipe_query *query) { struct ilo_query *q = ilo_query(query); diff --git a/src/gallium/drivers/llvmpipe/lp_query.c b/src/gallium/drivers/llvmpipe/lp_query.c index d5ed656..ffc4f56c 100644 --- a/src/gallium/drivers/llvmpipe/lp_query.c +++ b/src/gallium/drivers/llvmpipe/lp_query.c @@ -239,7 +239,7 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct pipe_query *q) } -static bool +static boolean llvmpipe_end_query(struct pipe_context *pipe, struct pipe_query *q) { struct llvmpipe_context *llvmpipe = llvmpipe_context( pipe ); diff --git a/src/gallium/drivers/noop/noop_pipe.c b/src/gallium/drivers/noop/noop_pipe.c index 99e5f1a..e58507b 100644 --- a/src/gallium/drivers/noop/noop_pipe.c +++ b/src/gallium/drivers/noop/noop_pipe.c @@ -63,7 +63,7 @@ static boolean noop_begin_query(struct pipe_context *ctx, struct pipe_query *que return true; } -static bool noop_end_query(struct pipe_context *ctx, struct pipe_query *query) +static boolean noop_end_query(struct pipe_context *ctx, struct pipe_query *query) { return true; } diff --git a/src/gallium/drivers/nouveau/nv30/nv30_query.c b/src/gallium/drivers/nouveau/nv30/nv30_query.c index aa9a12f..0f4d9b4 100644 --- a/src/gallium/drivers/nouveau/nv30/nv30_query.c +++ b/src/gallium/drivers/nouveau/nv30/nv30_query.c @@ -175,7 +175,7 @@ nv30_query_begin(struct pipe_context *pipe, struct pipe_query *pq) return true; } -static bool +static boolean nv30_query_end(struct pipe_context *pipe, struct pipe_query *pq) { struct nv30_context *nv30 = nv30_context(pipe); diff --git 
a/src/gallium/drivers/nouveau/nv50/nv50_query.c b/src/gallium/drivers/nouveau/nv50/nv50_query.c index 9a1397a..9124946 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c +++ b/src/gallium/drivers/nouveau/nv50/nv50_query.c @@ -54,7 +54,7 @@ nv50_begin_query(struct pipe_context *pipe, struct pipe_query *pq) return q->funcs->begin_query(nv50_context(pipe), q); } -static bool +static boolean nv50_end_query(struct pipe_context *pipe, struct pipe_query *pq) { struct nv50_query *q = nv50_query(pq); diff --git a/src/gallium/drivers/r300/r300_query.c b/src/gallium/drivers/r300/r300_query.c index 79e2198..1428d03 100644 --- a/src/gallium/drivers/r300/r300_query.c +++ b/src/gallium/driv
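For anyone following the bool-versus-boolean point in the reply above, here is a minimal sketch of one practical difference. It assumes gallium's `boolean` is a plain `unsigned char` typedef (that is how I remember p_compiler.h, so treat it as an assumption); the two helper names are made up for illustration and are not part of Mesa.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: stand-in for gallium's typedef, believed to be unsigned char. */
typedef unsigned char boolean;

/* The sketch only makes sense if the typedef is a single byte. */
static_assert(sizeof(boolean) == 1, "boolean is expected to be one byte");

/* Hypothetical helper returning a C99 bool: conversion to bool yields
 * true for ANY non-zero value, so no bits are lost. */
static bool has_flags_bool(unsigned flags)
{
   return flags;
}

/* Same helper with the typedef'd type: the value is reduced modulo 256
 * (with the usual 8-bit char), so flags == 0x100 comes back as 0. */
static boolean has_flags_boolean(unsigned flags)
{
   return (boolean)flags;
}
```

Calling both helpers with `0x100` shows the difference: the `bool` version reports true, the `unsigned char` version silently reports 0 unless the caller remembers to write `!!flags` or `flags != 0`.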
Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
[trimming cc's because mesa-dev hates them]

On Tue, Jun 14, 2016 at 12:09 PM, Rob Clark wrote:
> On Tue, Jun 14, 2016 at 12:02 PM, Ilia Mirkin wrote:
>> Can you explain the motivation behind this change? I'm adding a
>> ->set_window_rectangles thing which also takes multiple parameters.
>> What's the advantage of stuffing things into a struct first?
>
> consistency with the other pipe->set_xyz APIs, and adding support for
> it in rsq would then be a one line addition on rsq_state.py rather
> than writing a bunch of code by hand ;-)

Sounds like it should be easy to extend your new script to handle different argument types, no? I don't know that the "this new script I wrote doesn't handle this situation well, so let's change a bunch of code" logic is the right one...

-ilia
[Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
From: Rob Clark The rest of the state APIs take state structs, rather than inline parameters (with the exception of a couple which just amount to a single uint). This makes the API more regular and simplifies autogeneration of the gallium state-related APIs. Signed-off-by: Rob Clark --- src/gallium/drivers/ddebug/dd_context.c | 9 - src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 7 +++ src/gallium/drivers/r600/evergreen_state.c| 7 +++ src/gallium/drivers/radeonsi/si_state.c | 7 +++ src/gallium/drivers/trace/tr_context.c| 9 - src/gallium/include/pipe/p_context.h | 4 ++-- src/gallium/include/pipe/p_state.h| 8 src/mesa/state_tracker/st_atom_tess.c | 13 ++--- 8 files changed, 37 insertions(+), 27 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index 0f8ef18..06b7c91 100644 --- a/src/gallium/drivers/ddebug/dd_context.c +++ b/src/gallium/drivers/ddebug/dd_context.c @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context *_pipe, } static void dd_context_set_tess_state(struct pipe_context *_pipe, - const float default_outer_level[4], - const float default_inner_level[2]) + const struct pipe_tess_state *state) { struct dd_context *dctx = dd_context(_pipe); struct pipe_context *pipe = dctx->pipe; - memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4); - memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * 2); - pipe->set_tess_state(pipe, default_outer_level, default_inner_level); + memcpy(dctx->tess_default_levels, state->default_outer_level, sizeof(float) * 4); + memcpy(dctx->tess_default_levels+4, state->default_inner_level, sizeof(float) * 2); + pipe->set_tess_state(pipe, state); } diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index 92161ec..a9c1830 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c @@ -1001,13 +1001,12 @@ 
nvc0_set_viewport_states(struct pipe_context *pipe, static void nvc0_set_tess_state(struct pipe_context *pipe, -const float default_tess_outer[4], -const float default_tess_inner[2]) +const struct pipe_tess_state *state) { struct nvc0_context *nvc0 = nvc0_context(pipe); - memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float)); - memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float)); + memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * sizeof(float)); + memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * sizeof(float)); nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR; } diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c index 1ac8914..2a424f5 100644 --- a/src/gallium/drivers/r600/evergreen_state.c +++ b/src/gallium/drivers/r600/evergreen_state.c @@ -3569,13 +3569,12 @@ fallback: } static void evergreen_set_tess_state(struct pipe_context *ctx, -const float default_outer_level[4], -const float default_inner_level[2]) +const struct pipe_tess_state *state) { struct r600_context *rctx = (struct r600_context *)ctx; - memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4); - memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2); + memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * 4); + memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) * 2); rctx->tess_state_dirty = true; } diff --git a/src/gallium/drivers/radeonsi/si_state.c b/src/gallium/drivers/radeonsi/si_state.c index 0c52eee..6ef3fe5 100644 --- a/src/gallium/drivers/radeonsi/si_state.c +++ b/src/gallium/drivers/radeonsi/si_state.c @@ -3238,15 +3238,14 @@ static void si_set_index_buffer(struct pipe_context *ctx, */ static void si_set_tess_state(struct pipe_context *ctx, - const float default_outer_level[4], - const float default_inner_level[2]) + const struct pipe_tess_state *state) { struct si_context *sctx = (struct si_context *)ctx; struct 
pipe_constant_buffer cb; float array[8]; - memcpy(array, default_outer_level, sizeof(float) * 4); - memcpy(array+4, default_inner_level, sizeof(float) * 2); + memcpy(array, state->default_outer_level, sizeof(float) * 4); + memcpy(array+4, state->default_inner_level, sizeof(float) * 2); cb.buffer = NULL; cb.user_buffer = NULL; diff --git a/src/gallium/dr
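To make the shape of the change above easier to see outside the diff, here is a minimal self-contained sketch of the struct-based setter. The `pipe_tess_state` layout is an assumption inferred from the `state->default_outer_level` and `state->default_inner_level` accesses in the driver hunks (the p_state.h hunk is not visible in this message), and `demo_context` / `demo_set_tess_state` are hypothetical stand-ins for a driver context and its callback, not real Mesa names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed layout, inferred from the field accesses in the driver hunks. */
struct pipe_tess_state {
   float default_outer_level[4];
   float default_inner_level[2];
};

/* Hypothetical stand-in for a driver context (not a Mesa type). */
struct demo_context {
   float tess_default_levels[6]; /* 4 outer levels followed by 2 inner */
   int tess_state_dirty;
};

/* Sketch of a setter in the new style: one const state struct instead of
 * two inline float arrays, mirroring what the dd/r600 hunks do. */
static void demo_set_tess_state(struct demo_context *ctx,
                                const struct pipe_tess_state *state)
{
   assert(ctx != NULL && state != NULL);

   memcpy(ctx->tess_default_levels, state->default_outer_level,
          sizeof(float) * 4);
   memcpy(ctx->tess_default_levels + 4, state->default_inner_level,
          sizeof(float) * 2);
   ctx->tess_state_dirty = 1;
}
```

With every setter taking a single `const struct ... *` argument, a wrapper or generator only needs to know the struct type to copy and forward the state, which seems to be the regularity the commit message is after.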
Re: [Mesa-dev] [PATCH 00/10] gallium: resequencer layer
bleh, seems like max-cc's is still too low on mesa-dev, and some of the patches didn't get through. You can also find them here:

https://github.com/freedreno/mesa/commits/wip-rsq

BR,
-R

On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark wrote:
> From: Rob Clark
>
> So, I know there were a couple concerns voiced over the idea of
> re-ordering rendering in a gallium shim pipe driver layer. For
> me, the main concern was whether the overhead of an extra layer,
> queueing and replaying state updates, draws, etc, would be
> prohibitive. So I implemented it enough that I could do some
> benchmarking ;-)
>
> The first 9 patches are just some general API cleanups, which I
> found to be convenient (since the resequencer layer is generating
> most of the state handling with python + mako, so the cleanups to
> improve consistency help minimize the state which required special
> handling). But regardless of the outcome of the resequencer
> layer, I think these patches make sense on their own.
>
> (Note: auto-generating some of the other wrapper layers might be
> an interesting future cleanup.. at least it should be trivial
> for noop ;-))
>
> As far as overhead, I've been benchmarking (mostly glmark2 + stk +
> gfxbench), and in the current state (without actually having the
> dependency tracking implemented) it doesn't seem to cause more
> than a couple percent overhead. From here on out, the remaining
> overhead added to implement the dependency tracking and
> re-ordering would be the same as the additional overhead required
> to implement it in the driver backend.
>
> And a couple percent overhead is small compared to the expected
> gains for games which benefit.. ie. 8MiB for 1080p rgb frame,
> avoiding copying that from tile to memory and back once or twice
> quickly dwarfs an extra copy of some 10's of kb of state.. and
> even more so for (for ex.) f32f32f32f32 intermediate buffers.
>
> Queries are still missing, but I expect what would be required
> to implement it is the same as the logic that would be needed in
> the driver backend otherwise.
>
> Basically, the only concern I have, compared to the approach of
> implementing the dependency tracking in each driver backend is
> pipe_constant_buffer::user_buffer. Currently both freedreno and
> vc4 want non-UBO constant buffers to be emitted in cmdstream.
> In the adreno case, it looks like a3xx/a4xx should also support
> the non-user_buffer case, although in fact this appears to be
> broken (at least on a4xx) and I've never seen blob driver use
> this. At the moment I'm doing a hack in freedreno to map the
> backing fd_bo and then memcpy it into cmdstream. Which is a
> bit silly (since it is a write-combine buffer I'm copying from).
> But in glmark I had trouble even measuring the overhead of this
> extra copy. Although possibly I need to find something to
> measure which emits more non-UBO constant state.
>
> btw, if someone has some requests for benchmarks to try (provided
> they are available for arm/linux) I'd be happy to try some other
> things.
>
> The plus side of doing this in a separate layer is that we only
> implement the dependency tracking and resource shadowing once,
> instead of both in vc4 and freedreno (and who knows, maybe
> someday someone gets around to writing a lima gallium driver).
> Plus, I envision this to be something that mesa/st wraps the
> pipe_screen with if driconf tells it to, and pscreen->rsq_funcs
> is populated (we at least need a callback to know if resource
> is still busy). This way we can turn it on for games/apps that
> are known to benefit, and leave it off with zero additional
> overhead for better written things (or rather, things written
> with tilers in mind).
>
>
> Rob Clark (10):
>   gallium: cleanup set_tess_state
>   gallium: make shader_buffers const
>   gallium: make constant_buffer const
>   gallium: make image_view const
>   gallium: change end_query() to return boolean
>   gallium/util: add util_copy_index_buffer() helper
>   gallium/util: add util_copy_shader_buffer() helper
>   gallium/util: add util_copy_vertex_buffer helper
>   gallium/util: make util_copy_framebuffer_state(src=NULL) work
>   RFC: gallium: add resequencer driver (INCOMPLETE)
>
>  configure.ac | 1 +
>  src/gallium/auxiliary/util/u_framebuffer.c | 37 +-
>  src/gallium/auxiliary/util/u_helpers.c | 15 -
>  src/gallium/auxiliary/util/u_helpers.h | 3 -
>  src/gallium/auxiliary/util/u_inlines.h | 49 ++
>  src/gallium/drivers/ddebug/dd_context.c | 15 +-
>  src/gallium/drivers/freedreno/freedreno_query.c | 2 +-
>  src/gallium/drivers/freedreno/freedreno_state.c | 13 +-
>  src/gallium/drivers/i915/i915_query.c | 2 +-
>  src/gallium/drivers/i915/i915_state.c | 8 +-
>  src/gallium/drivers/ilo/ilo_query.c | 2 +-
>  src/gallium/drivers/ilo/ilo_state.c | 14 +-
>  src/gallium/
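The bandwidth estimate in the quoted cover letter ("8MiB for 1080p rgb frame", and more for f32f32f32f32 buffers) can be checked with a few lines of arithmetic. This is only a sketch under stated assumptions: "rgb" is taken to mean a 4-byte-per-pixel format such as RGBA8/RGBX8, f32f32f32f32 is taken as 16 bytes per pixel, and `frame_bytes` is a hypothetical helper, not anything from Mesa.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: bytes needed for one tightly packed color buffer.
 * Ignores padding, tiling alignment and MSAA, which a real driver would
 * have to account for. */
static size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel)
{
   assert(bytes_per_pixel > 0);
   return width * height * bytes_per_pixel;
}
```

Under those assumptions, `frame_bytes(1920, 1080, 4)` is 8,294,400 bytes (about 7.9 MiB), matching the "8MiB" figure, and `frame_bytes(1920, 1080, 16)` is 33,177,600 bytes (about 31.6 MiB) for a four-channel float32 intermediate buffer. Either is far larger than the tens of kilobytes of state the extra layer copies.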
[Mesa-dev] [PATCH 10/10] RFC: gallium: add resequencer driver (INCOMPLETE)
From: Rob Clark NOTE: the mako templates turned out to be a bit more hairy than expected.. maybe they would be better split out, or maybe there is something that could be done more simply. It more or less is my first time doing much with mako. But, I have changed how the state tracking / emit / replay works a few times as I went, and making a couple line change in a template and regenerating is much nicer than manually refactoring the code ;-) Note that since it might be easier to see how things work from the resulting code: rsq_state.h -> http://hastebin.com/erupecivab.c rsq_state.c -> http://hastebin.com/cokenawanu.c --- configure.ac | 1 + src/gallium/drivers/resequencer/.gitignore | 2 + src/gallium/drivers/resequencer/Makefile.am| 44 ++ src/gallium/drivers/resequencer/Makefile.sources | 23 + src/gallium/drivers/resequencer/rsq_batch.c| 144 + src/gallium/drivers/resequencer/rsq_batch.h| 71 +++ src/gallium/drivers/resequencer/rsq_context.c | 457 src/gallium/drivers/resequencer/rsq_context.h | 84 +++ src/gallium/drivers/resequencer/rsq_draw.c | 230 src/gallium/drivers/resequencer/rsq_draw.h | 40 ++ src/gallium/drivers/resequencer/rsq_fence.c| 48 ++ src/gallium/drivers/resequencer/rsq_fence.h| 43 ++ src/gallium/drivers/resequencer/rsq_public.h | 68 +++ src/gallium/drivers/resequencer/rsq_query.c| 148 + src/gallium/drivers/resequencer/rsq_query.h| 32 ++ src/gallium/drivers/resequencer/rsq_resource.c | 222 src/gallium/drivers/resequencer/rsq_resource.h | 60 ++ src/gallium/drivers/resequencer/rsq_screen.c | 186 +++ src/gallium/drivers/resequencer/rsq_screen.h | 50 ++ src/gallium/drivers/resequencer/rsq_state.py | 607 + .../drivers/resequencer/rsq_state_helpers.h| 219 src/gallium/drivers/resequencer/rsq_surface.c | 107 src/gallium/drivers/resequencer/rsq_surface.h | 72 +++ 23 files changed, 2958 insertions(+) create mode 100644 src/gallium/drivers/resequencer/.gitignore create mode 100644 src/gallium/drivers/resequencer/Makefile.am create mode 100644 
src/gallium/drivers/resequencer/Makefile.sources create mode 100644 src/gallium/drivers/resequencer/rsq_batch.c create mode 100644 src/gallium/drivers/resequencer/rsq_batch.h create mode 100644 src/gallium/drivers/resequencer/rsq_context.c create mode 100644 src/gallium/drivers/resequencer/rsq_context.h create mode 100644 src/gallium/drivers/resequencer/rsq_draw.c create mode 100644 src/gallium/drivers/resequencer/rsq_draw.h create mode 100644 src/gallium/drivers/resequencer/rsq_fence.c create mode 100644 src/gallium/drivers/resequencer/rsq_fence.h create mode 100644 src/gallium/drivers/resequencer/rsq_public.h create mode 100644 src/gallium/drivers/resequencer/rsq_query.c create mode 100644 src/gallium/drivers/resequencer/rsq_query.h create mode 100644 src/gallium/drivers/resequencer/rsq_resource.c create mode 100644 src/gallium/drivers/resequencer/rsq_resource.h create mode 100644 src/gallium/drivers/resequencer/rsq_screen.c create mode 100644 src/gallium/drivers/resequencer/rsq_screen.h create mode 100644 src/gallium/drivers/resequencer/rsq_state.py create mode 100644 src/gallium/drivers/resequencer/rsq_state_helpers.h create mode 100644 src/gallium/drivers/resequencer/rsq_surface.c create mode 100644 src/gallium/drivers/resequencer/rsq_surface.h diff --git a/configure.ac b/configure.ac index c492e15..0dbfc32 100644 --- a/configure.ac +++ b/configure.ac @@ -2644,6 +2644,7 @@ AC_CONFIG_FILES([Makefile src/gallium/drivers/llvmpipe/Makefile src/gallium/drivers/noop/Makefile src/gallium/drivers/nouveau/Makefile + src/gallium/drivers/resequencer/Makefile src/gallium/drivers/r300/Makefile src/gallium/drivers/r600/Makefile src/gallium/drivers/radeon/Makefile diff --git a/src/gallium/drivers/resequencer/.gitignore b/src/gallium/drivers/resequencer/.gitignore new file mode 100644 index 000..c827305 --- /dev/null +++ b/src/gallium/drivers/resequencer/.gitignore @@ -0,0 +1,2 @@ +rsq_state.c +rsq_state.h diff --git a/src/gallium/drivers/resequencer/Makefile.am 
b/src/gallium/drivers/resequencer/Makefile.am new file mode 100644 index 000..503aa98 --- /dev/null +++ b/src/gallium/drivers/resequencer/Makefile.am @@ -0,0 +1,44 @@ +# Copyright © 2016 Red Hat. +# +# Permission is hereby granted, free of charge, to any person obtaining a +# copy of this software and associated documentation files (the "Software"), +# to deal in the Software without restriction, including without limitation +# the rights to use, copy, modify, merge, publish, distribute, sublicense, +# and/or sell copies of the Software, and to permit persons to whom the
[Mesa-dev] [PATCH 05/10] gallium: change end_query() to return boolean
From: Rob Clark s/bool/boolean/ to make it match the other APIs. Signed-off-by: Rob Clark --- src/gallium/drivers/freedreno/freedreno_query.c | 2 +- src/gallium/drivers/i915/i915_query.c | 2 +- src/gallium/drivers/ilo/ilo_query.c | 2 +- src/gallium/drivers/llvmpipe/lp_query.c | 2 +- src/gallium/drivers/noop/noop_pipe.c| 2 +- src/gallium/drivers/nouveau/nv30/nv30_query.c | 2 +- src/gallium/drivers/nouveau/nv50/nv50_query.c | 2 +- src/gallium/drivers/r300/r300_query.c | 4 ++-- src/gallium/drivers/radeon/r600_query.c | 2 +- src/gallium/drivers/rbug/rbug_context.c | 4 ++-- src/gallium/drivers/softpipe/sp_query.c | 2 +- src/gallium/drivers/svga/svga_pipe_query.c | 2 +- src/gallium/drivers/swr/swr_query.cpp | 2 +- src/gallium/drivers/vc4/vc4_query.c | 2 +- src/gallium/drivers/virgl/virgl_query.c | 4 ++-- src/gallium/include/pipe/p_context.h| 2 +- 16 files changed, 19 insertions(+), 19 deletions(-) diff --git a/src/gallium/drivers/freedreno/freedreno_query.c b/src/gallium/drivers/freedreno/freedreno_query.c index 18e0c79..fc50076 100644 --- a/src/gallium/drivers/freedreno/freedreno_query.c +++ b/src/gallium/drivers/freedreno/freedreno_query.c @@ -66,7 +66,7 @@ fd_begin_query(struct pipe_context *pctx, struct pipe_query *pq) return q->funcs->begin_query(fd_context(pctx), q); } -static bool +static boolean fd_end_query(struct pipe_context *pctx, struct pipe_query *pq) { struct fd_query *q = fd_query(pq); diff --git a/src/gallium/drivers/i915/i915_query.c b/src/gallium/drivers/i915/i915_query.c index d6015a6..9d5569a 100644 --- a/src/gallium/drivers/i915/i915_query.c +++ b/src/gallium/drivers/i915/i915_query.c @@ -60,7 +60,7 @@ static boolean i915_begin_query(struct pipe_context *ctx, return true; } -static bool i915_end_query(struct pipe_context *ctx, struct pipe_query *query) +static boolean i915_end_query(struct pipe_context *ctx, struct pipe_query *query) { return true; } diff --git a/src/gallium/drivers/ilo/ilo_query.c b/src/gallium/drivers/ilo/ilo_query.c index 
3088c96..98c3f6d 100644 --- a/src/gallium/drivers/ilo/ilo_query.c +++ b/src/gallium/drivers/ilo/ilo_query.c @@ -128,7 +128,7 @@ ilo_begin_query(struct pipe_context *pipe, struct pipe_query *query) return true; } -static bool +static boolean ilo_end_query(struct pipe_context *pipe, struct pipe_query *query) { struct ilo_query *q = ilo_query(query); diff --git a/src/gallium/drivers/llvmpipe/lp_query.c b/src/gallium/drivers/llvmpipe/lp_query.c index d5ed656..ffc4f56c 100644 --- a/src/gallium/drivers/llvmpipe/lp_query.c +++ b/src/gallium/drivers/llvmpipe/lp_query.c @@ -239,7 +239,7 @@ llvmpipe_begin_query(struct pipe_context *pipe, struct pipe_query *q) } -static bool +static boolean llvmpipe_end_query(struct pipe_context *pipe, struct pipe_query *q) { struct llvmpipe_context *llvmpipe = llvmpipe_context( pipe ); diff --git a/src/gallium/drivers/noop/noop_pipe.c b/src/gallium/drivers/noop/noop_pipe.c index 99e5f1a..e58507b 100644 --- a/src/gallium/drivers/noop/noop_pipe.c +++ b/src/gallium/drivers/noop/noop_pipe.c @@ -63,7 +63,7 @@ static boolean noop_begin_query(struct pipe_context *ctx, struct pipe_query *que return true; } -static bool noop_end_query(struct pipe_context *ctx, struct pipe_query *query) +static boolean noop_end_query(struct pipe_context *ctx, struct pipe_query *query) { return true; } diff --git a/src/gallium/drivers/nouveau/nv30/nv30_query.c b/src/gallium/drivers/nouveau/nv30/nv30_query.c index aa9a12f..0f4d9b4 100644 --- a/src/gallium/drivers/nouveau/nv30/nv30_query.c +++ b/src/gallium/drivers/nouveau/nv30/nv30_query.c @@ -175,7 +175,7 @@ nv30_query_begin(struct pipe_context *pipe, struct pipe_query *pq) return true; } -static bool +static boolean nv30_query_end(struct pipe_context *pipe, struct pipe_query *pq) { struct nv30_context *nv30 = nv30_context(pipe); diff --git a/src/gallium/drivers/nouveau/nv50/nv50_query.c b/src/gallium/drivers/nouveau/nv50/nv50_query.c index 9a1397a..9124946 100644 --- a/src/gallium/drivers/nouveau/nv50/nv50_query.c +++ 
b/src/gallium/drivers/nouveau/nv50/nv50_query.c @@ -54,7 +54,7 @@ nv50_begin_query(struct pipe_context *pipe, struct pipe_query *pq) return q->funcs->begin_query(nv50_context(pipe), q); } -static bool +static boolean nv50_end_query(struct pipe_context *pipe, struct pipe_query *pq) { struct nv50_query *q = nv50_query(pq); diff --git a/src/gallium/drivers/r300/r300_query.c b/src/gallium/drivers/r300/r300_query.c index 79e2198..1428d03 100644 --- a/src/gallium/drivers/r300/r300_query.c +++ b/src/gallium/drivers/r300/r300_query.c @@ -112,8 +112,8 @@ void r300_stop_query(struct r300_context *r300) r300->query_current = NULL; } -static bool r300_end_query(struct pipe_context* pipe, - struct pipe_query* query)
[Mesa-dev] [PATCH 02/10] gallium: make shader_buffers const
From: Rob Clark Be consistent with the rest of the "set_xyz" state interfaces. Signed-off-by: Rob Clark --- src/gallium/drivers/ddebug/dd_context.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 6 +++--- src/gallium/drivers/radeonsi/si_descriptors.c | 4 ++-- src/gallium/drivers/softpipe/sp_state_image.c | 2 +- src/gallium/drivers/trace/tr_context.c| 2 +- src/gallium/include/pipe/p_context.h | 2 +- 6 files changed, 9 insertions(+), 9 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index 06b7c91..07c46dd 100644 --- a/src/gallium/drivers/ddebug/dd_context.c +++ b/src/gallium/drivers/ddebug/dd_context.c @@ -503,7 +503,7 @@ dd_context_set_shader_images(struct pipe_context *_pipe, unsigned shader, static void dd_context_set_shader_buffers(struct pipe_context *_pipe, unsigned shader, unsigned start, unsigned num_buffers, - struct pipe_shader_buffer *buffers) + const struct pipe_shader_buffer *buffers) { struct dd_context *dctx = dd_context(_pipe); struct pipe_context *pipe = dctx->pipe; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index a9c1830..d10a88d 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c @@ -1315,8 +1315,8 @@ nvc0_set_shader_images(struct pipe_context *pipe, unsigned shader, static bool nvc0_bind_buffers_range(struct nvc0_context *nvc0, const unsigned t, - unsigned start, unsigned nr, - struct pipe_shader_buffer *pbuffers) +unsigned start, unsigned nr, +const struct pipe_shader_buffer *pbuffers) { const unsigned end = start + nr; unsigned mask = 0; @@ -1366,7 +1366,7 @@ static void nvc0_set_shader_buffers(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned nr, -struct pipe_shader_buffer *buffers) +const struct pipe_shader_buffer *buffers) { const unsigned s = nvc0_shader_stage(shader); if (!nvc0_bind_buffers_range(nvc0_context(pipe), s, start, nr, 
buffers)) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 2d780e6..5ad251f 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -1040,7 +1040,7 @@ si_shader_buffer_descriptors(struct si_context *sctx, unsigned shader) static void si_set_shader_buffers(struct pipe_context *ctx, unsigned shader, unsigned start_slot, unsigned count, - struct pipe_shader_buffer *sbuffers) + const struct pipe_shader_buffer *sbuffers) { struct si_context *sctx = (struct si_context *)ctx; struct si_buffer_resources *buffers = &sctx->shader_buffers[shader]; @@ -1050,7 +1050,7 @@ static void si_set_shader_buffers(struct pipe_context *ctx, unsigned shader, assert(start_slot + count <= SI_NUM_SHADER_BUFFERS); for (i = 0; i < count; ++i) { - struct pipe_shader_buffer *sbuffer = sbuffers ? &sbuffers[i] : NULL; + const struct pipe_shader_buffer *sbuffer = sbuffers ? &sbuffers[i] : NULL; struct r600_resource *buf; unsigned slot = start_slot + i; uint32_t *desc = descs->list + slot * 4; diff --git a/src/gallium/drivers/softpipe/sp_state_image.c b/src/gallium/drivers/softpipe/sp_state_image.c index b1810d3..81bb7ca 100644 --- a/src/gallium/drivers/softpipe/sp_state_image.c +++ b/src/gallium/drivers/softpipe/sp_state_image.c @@ -56,7 +56,7 @@ static void softpipe_set_shader_buffers(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned num, -struct pipe_shader_buffer *buffers) +const struct pipe_shader_buffer *buffers) { struct softpipe_context *softpipe = softpipe_context(pipe); unsigned i; diff --git a/src/gallium/drivers/trace/tr_context.c b/src/gallium/drivers/trace/tr_context.c index 0dd07e9..041a47c 100644 --- a/src/gallium/drivers/trace/tr_context.c +++ b/src/gallium/drivers/trace/tr_context.c @@ -1670,7 +1670,7 @@ trace_context_set_tess_state(struct pipe_context *_context, static void trace_context_set_shader_buffers(struct pipe_context *_context, 
unsigned shader, unsigned start, unsigned nr, - struct pipe_shader_buffer *buffers)
[Mesa-dev] [PATCH 03/10] gallium: make constant_buffer const
From: Rob Clark Signed-off-by: Rob Clark --- src/gallium/drivers/ddebug/dd_context.c | 2 +- src/gallium/drivers/freedreno/freedreno_state.c | 2 +- src/gallium/drivers/i915/i915_state.c | 2 +- src/gallium/drivers/ilo/ilo_state.c | 2 +- src/gallium/drivers/llvmpipe/lp_state_fs.c | 2 +- src/gallium/drivers/noop/noop_state.c | 2 +- src/gallium/drivers/nouveau/nv30/nv30_state.c | 2 +- src/gallium/drivers/nouveau/nv50/nv50_state.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 2 +- src/gallium/drivers/r300/r300_state.c | 2 +- src/gallium/drivers/r600/r600_state_common.c| 2 +- src/gallium/drivers/radeonsi/si_descriptors.c | 6 +++--- src/gallium/drivers/radeonsi/si_state.h | 3 +-- src/gallium/drivers/rbug/rbug_context.c | 2 +- src/gallium/drivers/softpipe/sp_state_shader.c | 2 +- src/gallium/drivers/svga/svga_pipe_constants.c | 2 +- src/gallium/drivers/swr/swr_state.cpp | 2 +- src/gallium/drivers/trace/tr_context.c | 2 +- src/gallium/drivers/vc4/vc4_state.c | 2 +- src/gallium/drivers/virgl/virgl_context.c | 2 +- src/gallium/include/pipe/p_context.h| 2 +- 21 files changed, 23 insertions(+), 24 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index 07c46dd..64b16f6 100644 --- a/src/gallium/drivers/ddebug/dd_context.c +++ b/src/gallium/drivers/ddebug/dd_context.c @@ -343,7 +343,7 @@ DD_IMM_STATE(polygon_stipple, const struct pipe_poly_stipple, *state, state) static void dd_context_set_constant_buffer(struct pipe_context *_pipe, uint shader, uint index, - struct pipe_constant_buffer *constant_buffer) + const struct pipe_constant_buffer *constant_buffer) { struct dd_context *dctx = dd_context(_pipe); struct pipe_context *pipe = dctx->pipe; diff --git a/src/gallium/drivers/freedreno/freedreno_state.c b/src/gallium/drivers/freedreno/freedreno_state.c index e4df909..53ea39b 100644 --- a/src/gallium/drivers/freedreno/freedreno_state.c +++ b/src/gallium/drivers/freedreno/freedreno_state.c @@ -89,7 +89,7 @@ 
fd_set_sample_mask(struct pipe_context *pctx, unsigned sample_mask) */ static void fd_set_constant_buffer(struct pipe_context *pctx, uint shader, uint index, - struct pipe_constant_buffer *cb) + const struct pipe_constant_buffer *cb) { struct fd_context *ctx = fd_context(pctx); struct fd_constbuf_stateobj *so = &ctx->constbuf[shader]; diff --git a/src/gallium/drivers/i915/i915_state.c b/src/gallium/drivers/i915/i915_state.c index 8fa2f42..2efa14e 100644 --- a/src/gallium/drivers/i915/i915_state.c +++ b/src/gallium/drivers/i915/i915_state.c @@ -675,7 +675,7 @@ static void i915_delete_vs_state(struct pipe_context *pipe, void *shader) static void i915_set_constant_buffer(struct pipe_context *pipe, uint shader, uint index, - struct pipe_constant_buffer *cb) + const struct pipe_constant_buffer *cb) { struct i915_context *i915 = i915_context(pipe); struct pipe_resource *buf = cb ? cb->buffer : NULL; diff --git a/src/gallium/drivers/ilo/ilo_state.c b/src/gallium/drivers/ilo/ilo_state.c index 37234ec..53a5aca 100644 --- a/src/gallium/drivers/ilo/ilo_state.c +++ b/src/gallium/drivers/ilo/ilo_state.c @@ -1536,7 +1536,7 @@ ilo_set_clip_state(struct pipe_context *pipe, static void ilo_set_constant_buffer(struct pipe_context *pipe, uint shader, uint index, -struct pipe_constant_buffer *buf) +const struct pipe_constant_buffer *buf) { const struct ilo_dev *dev = ilo_context(pipe)->dev; struct ilo_state_vector *vec = &ilo_context(pipe)->state_vector; diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c b/src/gallium/drivers/llvmpipe/lp_state_fs.c index 7dceff7..429b082 100644 --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c @@ -2836,7 +2836,7 @@ llvmpipe_delete_fs_state(struct pipe_context *pipe, void *fs) static void llvmpipe_set_constant_buffer(struct pipe_context *pipe, uint shader, uint index, - struct pipe_constant_buffer *cb) + const struct pipe_constant_buffer *cb) { struct llvmpipe_context *llvmpipe = 
llvmpipe_context(pipe); struct pipe_resource *constants = cb ? cb->buffer : NULL; diff --git a/src/gallium/drivers/noop/noop_state.c b/src/gallium/drivers/noop/noop_state.c index fe5b5e4..0ddffa2 100644 --- a/src/gallium/drivers/noop/noop_state.c +++ b/src/gallium/drivers/noop/noop_state.c @@ -176,7 +176,7 @@ static void noop_set_framebuffer_state(struct pipe_context *ctx, static void noop_set_cons
[Mesa-dev] [PATCH 04/10] gallium: make image_view const
From: Rob Clark Signed-off-by: Rob Clark --- src/gallium/drivers/ddebug/dd_context.c | 2 +- src/gallium/drivers/ilo/ilo_state.c | 2 +- src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 4 ++-- src/gallium/drivers/radeonsi/si_descriptors.c | 6 +++--- src/gallium/drivers/softpipe/sp_state_image.c | 2 +- src/gallium/drivers/trace/tr_context.c| 2 +- src/gallium/include/pipe/p_context.h | 2 +- 7 files changed, 10 insertions(+), 10 deletions(-) diff --git a/src/gallium/drivers/ddebug/dd_context.c b/src/gallium/drivers/ddebug/dd_context.c index 64b16f6..f72fd2f 100644 --- a/src/gallium/drivers/ddebug/dd_context.c +++ b/src/gallium/drivers/ddebug/dd_context.c @@ -490,7 +490,7 @@ dd_context_set_sampler_views(struct pipe_context *_pipe, unsigned shader, static void dd_context_set_shader_images(struct pipe_context *_pipe, unsigned shader, unsigned start, unsigned num, - struct pipe_image_view *views) + const struct pipe_image_view *views) { struct dd_context *dctx = dd_context(_pipe); struct pipe_context *pipe = dctx->pipe; diff --git a/src/gallium/drivers/ilo/ilo_state.c b/src/gallium/drivers/ilo/ilo_state.c index 53a5aca..4f1002e 100644 --- a/src/gallium/drivers/ilo/ilo_state.c +++ b/src/gallium/drivers/ilo/ilo_state.c @@ -1851,7 +1851,7 @@ ilo_set_sampler_views(struct pipe_context *pipe, unsigned shader, static void ilo_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned count, - struct pipe_image_view *views) + const struct pipe_image_view *views) { #if 0 struct ilo_state_vector *vec = &ilo_context(pipe)->state_vector; diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c index a0e01bd..0bd756f 100644 --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c @@ -1233,7 +1233,7 @@ nvc0_set_compute_resources(struct pipe_context *pipe, static bool nvc0_bind_images_range(struct nvc0_context *nvc0, const unsigned s, unsigned start, unsigned nr, - 
struct pipe_image_view *pimages) + const struct pipe_image_view *pimages) { const unsigned end = start + nr; unsigned mask = 0; @@ -1301,7 +1301,7 @@ nvc0_bind_images_range(struct nvc0_context *nvc0, const unsigned s, static void nvc0_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned nr, - struct pipe_image_view *images) + const struct pipe_image_view *images) { const unsigned s = nvc0_shader_stage(shader); if (!nvc0_bind_images_range(nvc0_context(pipe), s, start, nr, images)) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 55686e8..e95556b 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -560,7 +560,7 @@ si_disable_shader_image(struct si_context *ctx, unsigned shader, unsigned slot) } static void -si_mark_image_range_valid(struct pipe_image_view *view) +si_mark_image_range_valid(const struct pipe_image_view *view) { struct r600_resource *res = (struct r600_resource *)view->resource; const struct util_format_description *desc; @@ -578,7 +578,7 @@ si_mark_image_range_valid(struct pipe_image_view *view) static void si_set_shader_image(struct si_context *ctx, unsigned shader, - unsigned slot, struct pipe_image_view *view) + unsigned slot, const struct pipe_image_view *view) { struct si_screen *screen = ctx->screen; struct si_images_info *images = &ctx->images[shader]; @@ -674,7 +674,7 @@ static void si_set_shader_image(struct si_context *ctx, static void si_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start_slot, unsigned count, -struct pipe_image_view *views) +const struct pipe_image_view *views) { struct si_context *ctx = (struct si_context *)pipe; unsigned i, slot; diff --git a/src/gallium/drivers/softpipe/sp_state_image.c b/src/gallium/drivers/softpipe/sp_state_image.c index 81bb7ca..553a76a 100644 --- a/src/gallium/drivers/softpipe/sp_state_image.c +++ 
b/src/gallium/drivers/softpipe/sp_state_image.c @@ -30,7 +30,7 @@ static void softpipe_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start, unsigned num, - struct pipe_image_view *images) + const struct pi
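For readers skimming the diff, here is a minimal sketch of what the const qualifier gives callers. The types and the function name below (struct image_view, struct ctx, set_shader_images) are hypothetical stand-ins, not the gallium definitions. The callee can only read the views it is handed, so it copies them into its own state, and the caller is free to pass read-only data.

```c
#include <assert.h>
#include <string.h>

#define MAX_IMAGES 8

/* Hypothetical stand-in for struct pipe_image_view. */
struct image_view {
   unsigned format;
   unsigned access;
};

/* Hypothetical stand-in for a driver context. */
struct ctx {
   struct image_view images[MAX_IMAGES];
   unsigned enabled_mask;
};

/* Takes the views as const: the callee may only read them, so it copies
 * what it needs into its own state. A NULL array unbinds the range. */
static void
set_shader_images(struct ctx *ctx, unsigned start, unsigned count,
                  const struct image_view *views)
{
   unsigned i;

   assert(start + count <= MAX_IMAGES);

   for (i = 0; i < count; i++) {
      unsigned slot = start + i;

      if (views) {
         ctx->images[slot] = views[i];
         ctx->enabled_mask |= 1u << slot;
      } else {
         memset(&ctx->images[slot], 0, sizeof(ctx->images[slot]));
         ctx->enabled_mask &= ~(1u << slot);
      }
   }
}
```

With this signature a caller can keep its view array in static const storage and the compiler will reject any attempt by the callee to write through the pointer.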
Re: [Mesa-dev] [PATCH 01/10] gallium: cleanup set_tess_state
Can you explain the motivation behind this change? I'm adding a ->set_window_rectangles thing which also takes multiple parameters. What's the advantage of stuffing things into a struct first? -ilia On Tue, Jun 14, 2016 at 11:57 AM, Rob Clark wrote: > From: Rob Clark > > The rest of the state APIs take state structs, rather than inline > parameters (with the exception of a couple which just amount to a single > uint). > > This makes the API more regular and simplifies autogeneration of the > gallium state related APIs. > > Signed-off-by: Rob Clark > --- > src/gallium/drivers/ddebug/dd_context.c | 9 - > src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 7 +++ > src/gallium/drivers/r600/evergreen_state.c| 7 +++ > src/gallium/drivers/radeonsi/si_state.c | 7 +++ > src/gallium/drivers/trace/tr_context.c| 9 - > src/gallium/include/pipe/p_context.h | 4 ++-- > src/gallium/include/pipe/p_state.h| 8 > src/mesa/state_tracker/st_atom_tess.c | 13 ++--- > 8 files changed, 37 insertions(+), 27 deletions(-) > > diff --git a/src/gallium/drivers/ddebug/dd_context.c > b/src/gallium/drivers/ddebug/dd_context.c > index 0f8ef18..06b7c91 100644 > --- a/src/gallium/drivers/ddebug/dd_context.c > +++ b/src/gallium/drivers/ddebug/dd_context.c > @@ -380,15 +380,14 @@ dd_context_set_viewport_states(struct pipe_context > *_pipe, > } > > static void dd_context_set_tess_state(struct pipe_context *_pipe, > - const float default_outer_level[4], > - const float default_inner_level[2]) > + const struct pipe_tess_state *state) > { > struct dd_context *dctx = dd_context(_pipe); > struct pipe_context *pipe = dctx->pipe; > > - memcpy(dctx->tess_default_levels, default_outer_level, sizeof(float) * 4); > - memcpy(dctx->tess_default_levels+4, default_inner_level, sizeof(float) * > 2); > - pipe->set_tess_state(pipe, default_outer_level, default_inner_level); > + memcpy(dctx->tess_default_levels, state->default_outer_level, > sizeof(float) * 4); > + memcpy(dctx->tess_default_levels+4,
state->default_inner_level, > sizeof(float) * 2); > + pipe->set_tess_state(pipe, state); > } > > > diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > index 92161ec..a9c1830 100644 > --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c > @@ -1001,13 +1001,12 @@ nvc0_set_viewport_states(struct pipe_context *pipe, > > static void > nvc0_set_tess_state(struct pipe_context *pipe, > -const float default_tess_outer[4], > -const float default_tess_inner[2]) > +const struct pipe_tess_state *state) > { > struct nvc0_context *nvc0 = nvc0_context(pipe); > > - memcpy(nvc0->default_tess_outer, default_tess_outer, 4 * sizeof(float)); > - memcpy(nvc0->default_tess_inner, default_tess_inner, 2 * sizeof(float)); > + memcpy(nvc0->default_tess_outer, state->default_tess_outer, 4 * > sizeof(float)); > + memcpy(nvc0->default_tess_inner, state->default_tess_inner, 2 * > sizeof(float)); > nvc0->dirty_3d |= NVC0_NEW_3D_TESSFACTOR; > } > > diff --git a/src/gallium/drivers/r600/evergreen_state.c > b/src/gallium/drivers/r600/evergreen_state.c > index 1ac8914..2a424f5 100644 > --- a/src/gallium/drivers/r600/evergreen_state.c > +++ b/src/gallium/drivers/r600/evergreen_state.c > @@ -3569,13 +3569,12 @@ fallback: > } > > static void evergreen_set_tess_state(struct pipe_context *ctx, > -const float default_outer_level[4], > -const float default_inner_level[2]) > +const struct pipe_tess_state *state) > { > struct r600_context *rctx = (struct r600_context *)ctx; > > - memcpy(rctx->tess_state, default_outer_level, sizeof(float) * 4); > - memcpy(rctx->tess_state+4, default_inner_level, sizeof(float) * 2); > + memcpy(rctx->tess_state, state->default_outer_level, sizeof(float) * > 4); > + memcpy(rctx->tess_state+4, state->default_inner_level, sizeof(float) > * 2); > rctx->tess_state_dirty = true; > } > > diff --git a/src/gallium/drivers/radeonsi/si_state.c > 
b/src/gallium/drivers/radeonsi/si_state.c > index 0c52eee..6ef3fe5 100644 > --- a/src/gallium/drivers/radeonsi/si_state.c > +++ b/src/gallium/drivers/radeonsi/si_state.c > @@ -3238,15 +3238,14 @@ static void si_set_index_buffer(struct pipe_context > *ctx, > */ > > static void si_set_tess_state(struct pipe_context *ctx, > - const float default_outer_level[4], > - const float default_inner_level[2]) > + const struct pipe_tess_state *state) > { > struct si_contex
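To make the shape of the proposed API concrete, here is a small sketch of the struct-based call. The struct layout is an assumption: the field names are taken from the ddebug and r600 hunks quoted above, but the p_state.h hunk is not visible in the quoted text. The context type and function name are hypothetical and only mirror the tess_state+4 pattern from the r600 hunk.

```c
#include <assert.h>
#include <string.h>

/* Assumed layout: field names follow the ddebug/r600 hunks above; the
 * actual p_state.h definition is not visible in the quoted text. */
struct tess_state_sketch {
   float default_outer_level[4];
   float default_inner_level[2];
};

/* Hypothetical driver context that stores all six levels contiguously,
 * mirroring the tess_state+4 pattern in the r600 hunk. */
struct drv_ctx_sketch {
   float tess_state[6];
   int tess_state_dirty;
};

/* One state struct in, one dirty flag out: the same calling shape as the
 * other set_*_state entry points that take a struct pointer. */
static void
set_tess_state_sketch(struct drv_ctx_sketch *ctx,
                      const struct tess_state_sketch *state)
{
   memcpy(ctx->tess_state, state->default_outer_level, sizeof(float) * 4);
   memcpy(ctx->tess_state + 4, state->default_inner_level, sizeof(float) * 2);
   ctx->tess_state_dirty = 1;
}
```

Whether this regularity is worth the extra struct is exactly the question raised in the reply; the sketch only shows what the call looks like, not which side of that argument is right.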
[Mesa-dev] [PATCH 06/10] gallium/util: add util_copy_index_buffer() helper
From: Rob Clark Note there was previously a util_set_index_buffer() which was only used by svga. Replace this. (The util_copy_* naming is more consistent with other u_inlines/ u_framebuffer helpers) Signed-off-by: Rob Clark --- src/gallium/auxiliary/util/u_helpers.c | 15 --- src/gallium/auxiliary/util/u_helpers.h | 3 --- src/gallium/auxiliary/util/u_inlines.h | 17 + src/gallium/drivers/freedreno/freedreno_state.c | 11 +-- src/gallium/drivers/i915/i915_state.c | 6 +- src/gallium/drivers/ilo/ilo_state.c | 10 +- src/gallium/drivers/llvmpipe/lp_state_vertex.c | 6 +- src/gallium/drivers/nouveau/nv30/nv30_state.c | 11 +-- src/gallium/drivers/r300/r300_state.c | 8 +--- src/gallium/drivers/r600/r600_state_common.c| 5 + src/gallium/drivers/radeonsi/si_state.c | 6 +- src/gallium/drivers/softpipe/sp_state_vertex.c | 6 +- src/gallium/drivers/svga/svga_pipe_vertex.c | 2 +- src/gallium/drivers/swr/swr_state.cpp | 7 +-- src/gallium/drivers/vc4/vc4_state.c | 11 +-- src/gallium/drivers/virgl/virgl_context.c | 8 +--- 16 files changed, 30 insertions(+), 102 deletions(-) diff --git a/src/gallium/auxiliary/util/u_helpers.c b/src/gallium/auxiliary/util/u_helpers.c index 09020b0..117a51b 100644 --- a/src/gallium/auxiliary/util/u_helpers.c +++ b/src/gallium/auxiliary/util/u_helpers.c @@ -94,18 +94,3 @@ void util_set_vertex_buffers_count(struct pipe_vertex_buffer *dst, *dst_count = util_last_bit(enabled_buffers); } - - -void -util_set_index_buffer(struct pipe_index_buffer *dst, - const struct pipe_index_buffer *src) -{ - if (src) { - pipe_resource_reference(&dst->buffer, src->buffer); - memcpy(dst, src, sizeof(*dst)); - } - else { - pipe_resource_reference(&dst->buffer, NULL); - memset(dst, 0, sizeof(*dst)); - } -} diff --git a/src/gallium/auxiliary/util/u_helpers.h b/src/gallium/auxiliary/util/u_helpers.h index a9a53e4..9804163 100644 --- a/src/gallium/auxiliary/util/u_helpers.h +++ b/src/gallium/auxiliary/util/u_helpers.h @@ -44,9 +44,6 @@ void util_set_vertex_buffers_count(struct 
pipe_vertex_buffer *dst, const struct pipe_vertex_buffer *src, unsigned start_slot, unsigned count); -void util_set_index_buffer(struct pipe_index_buffer *dst, - const struct pipe_index_buffer *src); - #ifdef __cplusplus } #endif diff --git a/src/gallium/auxiliary/util/u_inlines.h b/src/gallium/auxiliary/util/u_inlines.h index 207e2aa..78125c8 100644 --- a/src/gallium/auxiliary/util/u_inlines.h +++ b/src/gallium/auxiliary/util/u_inlines.h @@ -623,6 +623,23 @@ util_copy_constant_buffer(struct pipe_constant_buffer *dst, } static inline void +util_copy_index_buffer(struct pipe_index_buffer *dst, + const struct pipe_index_buffer *src) +{ + if (src) { + dst->index_size = src->index_size; + dst->offset = src->offset; + pipe_resource_reference(&dst->buffer, src->buffer); + dst->user_buffer = src->user_buffer; + } else { + dst->index_size = 0; + dst->offset = 0; + pipe_resource_reference(&dst->buffer, NULL); + dst->user_buffer = NULL; + } +} + +static inline void util_copy_image_view(struct pipe_image_view *dst, const struct pipe_image_view *src) { diff --git a/src/gallium/drivers/freedreno/freedreno_state.c b/src/gallium/drivers/freedreno/freedreno_state.c index 53ea39b..688975f 100644 --- a/src/gallium/drivers/freedreno/freedreno_state.c +++ b/src/gallium/drivers/freedreno/freedreno_state.c @@ -207,16 +207,7 @@ fd_set_index_buffer(struct pipe_context *pctx, const struct pipe_index_buffer *ib) { struct fd_context *ctx = fd_context(pctx); - - if (ib) { - pipe_resource_reference(&ctx->indexbuf.buffer, ib->buffer); - ctx->indexbuf.index_size = ib->index_size; - ctx->indexbuf.offset = ib->offset; - ctx->indexbuf.user_buffer = ib->user_buffer; - } else { - pipe_resource_reference(&ctx->indexbuf.buffer, NULL); - } - + util_copy_index_buffer(&ctx->indexbuf, ib); ctx->dirty |= FD_DIRTY_INDEXBUF; } diff --git a/src/gallium/drivers/i915/i915_state.c b/src/gallium/drivers/i915/i915_state.c index 2efa14e..dbd711f 100644 --- a/src/gallium/drivers/i915/i915_state.c +++ 
b/src/gallium/drivers/i915/i915_state.c @@ -1063,11 +1063,7 @@ static void i915_set_index_buffer(struct pipe_context *pipe, const struct pipe_index_buffer *ib) { struct i915_context *i915 = i915_context(pipe); - - if (ib) - memcpy(&i915->index_buffer, ib, sizeof(i915->index_buffer)); - else - memset(&i915->index_buffer, 0, sizeof(i915->index
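A self-contained sketch of the reference handling behind util_copy_index_buffer() may help when reading the per-driver hunks. struct res and res_reference() are simplified, hypothetical stand-ins for pipe_resource and pipe_resource_reference() (the stand-in never frees anything), while the index-buffer field names follow the u_inlines.h hunk above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted pipe_resource. */
struct res {
   int refcount;
};

/* Simplified stand-in for pipe_resource_reference(): point *dst at src,
 * taking a reference on src and dropping the one held on the old *dst.
 * Unlike the real helper, this never destroys a resource. */
static void
res_reference(struct res **dst, struct res *src)
{
   if (*dst != src) {
      if (src)
         src->refcount++;
      if (*dst)
         (*dst)->refcount--;
      *dst = src;
   }
}

/* Field names follow the util_copy_index_buffer() hunk. */
struct index_buffer {
   unsigned index_size;
   unsigned offset;
   struct res *buffer;
   const void *user_buffer;
};

/* Copy src into dst while keeping the reference counts balanced;
 * a NULL src clears dst and releases its buffer. */
static void
copy_index_buffer(struct index_buffer *dst, const struct index_buffer *src)
{
   if (src) {
      dst->index_size = src->index_size;
      dst->offset = src->offset;
      res_reference(&dst->buffer, src->buffer);
      dst->user_buffer = src->user_buffer;
   } else {
      dst->index_size = 0;
      dst->offset = 0;
      res_reference(&dst->buffer, NULL);
      dst->user_buffer = NULL;
   }
}
```

Compared with this, a plain memcpy of the struct, as in the removed i915 lines, copies the buffer pointer without touching the count, and the removed freedreno else-branch left index_size, offset and user_buffer at their old values.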
[Mesa-dev] [PATCH (backport)] radeonsi: mark buffer texture range valid for shader images
From: Nicolai Hähnle When a shader image view into a buffer texture can be written to, the buffer's valid range must be updated, or subsequent transfers may incorrectly skip synchronization. This fixes a bug that was exposed in Xephyr by PBO acceleration for glReadPixels, reported by Michel Dänzer. Cc: Michel Dänzer Cc: 12.0 Reviewed-by: Marek Olšák Back-ported from commit a64c7cd2bac33a3a2bf908b5ef538dff03b93b73: - include util/u_format.h - code was extracted to si_set_shader_image in master, move it back Signed-off-by: Nicolai Hähnle -- src/gallium/drivers/radeonsi/si_descriptors.c | 24 1 file changed, 24 insertions(+) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 855b79e..e8ce87b 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -60,6 +60,7 @@ #include "si_shader.h" #include "sid.h" +#include "util/u_format.h" #include "util/u_math.h" #include "util/u_memory.h" #include "util/u_suballoc.h" @@ -471,6 +472,23 @@ si_disable_shader_image(struct si_images_info *images, unsigned slot) } static void +si_mark_image_range_valid(struct pipe_image_view *view) +{ + struct r600_resource *res = (struct r600_resource *)view->resource; + const struct util_format_description *desc; + unsigned stride; + + assert(res && res->b.b.target == PIPE_BUFFER); + + desc = util_format_description(view->format); + stride = desc->block.bits / 8; + + util_range_add(&res->valid_buffer_range, + stride * (view->u.buf.first_element), + stride * (view->u.buf.last_element + 1)); +} + +static void si_set_shader_images(struct pipe_context *pipe, unsigned shader, unsigned start_slot, unsigned count, struct pipe_image_view *views) @@ -502,6 +520,9 @@ si_set_shader_images(struct pipe_context *pipe, unsigned shader, RADEON_USAGE_READWRITE); if (res->b.b.target == PIPE_BUFFER) { + if (views[i].access & PIPE_IMAGE_ACCESS_WRITE) + si_mark_image_range_valid(&views[i]); + 
si_make_buffer_descriptor(screen, res, views[i].format, views[i].u.buf.first_element, @@ -1297,6 +1318,9 @@ static void si_invalidate_buffer(struct pipe_context *ctx, struct pipe_resource unsigned i = u_bit_scan(&mask); if (images->views[i].resource == buf) { + if (images->views[i].access & PIPE_IMAGE_ACCESS_WRITE) + si_mark_image_range_valid(&images->views[i]); + si_desc_reset_buffer_offset( ctx, images->desc.list + i * 8 + 4, old_va, buf); -- 2.7.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
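The byte-range arithmetic in si_mark_image_range_valid() can be checked in isolation. The sketch below uses a hypothetical struct byte_range in place of util_range and assumes util_range_add() simply grows the range to the union of old and new; only the stride and first/last element computation is taken from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for util_range: a half-open byte range
 * [start, end); start > end means empty. */
struct byte_range {
   unsigned start;
   unsigned end;
};

static void
range_set_empty(struct byte_range *r)
{
   r->start = UINT_MAX;
   r->end = 0;
}

/* Grow the range so that it also covers [start, end). */
static void
range_add(struct byte_range *r, unsigned start, unsigned end)
{
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}

/* Same arithmetic as si_mark_image_range_valid(): elements are
 * block_bits / 8 bytes wide and last_element is inclusive. */
static void
mark_elements_valid(struct byte_range *r, unsigned block_bits,
                    unsigned first_element, unsigned last_element)
{
   unsigned stride = block_bits / 8;

   range_add(r, stride * first_element, stride * (last_element + 1));
}
```

For example, a view with a 128-bit format covering elements 4 through 7 marks bytes 64 up to (not including) 128 as valid, because last_element is inclusive and the end of the range is therefore stride * (last_element + 1).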
[Mesa-dev] [PATCH 08/10] gallium/util: add util_copy_vertex_buffer helper
From: Rob Clark Signed-off-by: Rob Clark --- src/gallium/auxiliary/util/u_inlines.h | 17 + 1 file changed, 17 insertions(+) diff --git a/src/gallium/auxiliary/util/u_inlines.h b/src/gallium/auxiliary/util/u_inlines.h index ebaf368..93171d9 100644 --- a/src/gallium/auxiliary/util/u_inlines.h +++ b/src/gallium/auxiliary/util/u_inlines.h @@ -671,6 +671,23 @@ util_copy_image_view(struct pipe_image_view *dst, } } +static inline void +util_copy_vertex_buffer(struct pipe_vertex_buffer *dst, +const struct pipe_vertex_buffer *src) +{ + if (src) { + dst->stride = src->stride; + dst->buffer_offset = src->buffer_offset; + pipe_resource_reference(&dst->buffer, src->buffer); + dst->user_buffer = src->user_buffer; + } else { + dst->stride = 0; + dst->buffer_offset = 0; + pipe_resource_reference(&dst->buffer, NULL); + dst->user_buffer = NULL; + } +} + static inline unsigned util_max_layer(const struct pipe_resource *r, unsigned level) { -- 2.5.5