[Mesa-dev] [PATCH] i965/fs: Gen4: Zero out extra coordinates when using shadow compare
Fixes broken rendering if these MRFs contained anything other than zero. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes chr...@ijw.co.nz --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 964ad40..c3eecfa 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -868,7 +868,12 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, fs_reg coordinate, emit(MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate)); coordinate.reg_offset++; } - /* gen4's SIMD8 sampler always has the slots for u,v,r present. */ + + /* gen4's SIMD8 sampler always has the slots for u,v,r present. + * the unused slots must be zeroed.*/ + for (int i = ir-coordinate-type-vector_elements; i3; i++) { + emit(MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f))); + } mlen += 3; if (ir-op == ir_tex) { -- 1.8.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Gen4: Zero out extra coordinates when using shadow compare
Sorry, saw the bogus comment style just after I sent this out. -- Chris On Tue, Aug 27, 2013 at 9:45 PM, Chris Forbes chr...@ijw.co.nz wrote: Fixes broken rendering if these MRFs contained anything other than zero. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes chr...@ijw.co.nz --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 964ad40..c3eecfa 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -868,7 +868,12 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, fs_reg coordinate, emit(MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate)); coordinate.reg_offset++; } - /* gen4's SIMD8 sampler always has the slots for u,v,r present. */ + + /* gen4's SIMD8 sampler always has the slots for u,v,r present. + * the unused slots must be zeroed.*/ + for (int i = ir-coordinate-type-vector_elements; i3; i++) { + emit(MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f))); + } mlen += 3; if (ir-op == ir_tex) { -- 1.8.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] tgsi dump and parsing
- Original Message - On 08/26/2013 02:38 AM, Dave Airlie wrote: Hi TGSI guys mostly :-) So I'm wondering how circular and perfect tgsi-text-tgsi roundabouts should be, Ideally, they should be consistent. currently the TGSI dump code uses .4f in one place, which makes things like 1e6 not make it across the divide, I was thinking of dumping immediates in 32-bit hex format so we know for definite what happens on the other side, Yeah, I think we'd have to dump floats as hex to always preserve their value. However, I'd like to maintain the readability of float immediates in dumps. Maybe they could be displayed as a comment. Something like: IMM[0] FLT32 { 0x, 0x, 0x, 0x } # 1.0, 3.0, 2.0, 4.0 If you use %.9g instead of %.4f then floating point numbers will be preserved without loss of precision. Jose I've been thinking of maybe adding a debug option to softpipe to dump to text and read it back, to see what other regression lie in wait. Sounds OK to me but it could be a TGSI debug option that would test dumping + reassembling a shader whenever tgsi_dump() is called. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 68262] [build error with wayland 1.0.5] wayland-drm.c:110:3: error: implicit declaration of function 'wl_resource_create'
https://bugs.freedesktop.org/show_bug.cgi?id=68262 --- Comment #2 from Fabio Pedretti fabio@libero.it --- Confirming the same error with 1.1.0. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 04/15] mesa: Implement glPushDebugGroup and glPopDebugGroup
On 08/26/2013 07:17 PM, Timothy Arceri wrote: V2: fixed spelling typo in comment Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/errors.c | 275 - src/mesa/main/errors.h | 5 + src/mesa/main/mtypes.h | 5 +- 3 files changed, 214 insertions(+), 71 deletions(-) The updated patches look fine. Thanks. Reviewed-by: Brian Paul bri...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/15] mesa: Add a clone function to mesa hash
On 08/26/2013 07:10 PM, Timothy Arceri wrote: On 27/08/13 00:51, Brian Paul wrote: On 08/26/2013 04:43 AM, Timothy Arceri wrote: Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/hash.c | 26 ++ src/mesa/main/hash.h |3 +++ 2 files changed, 29 insertions(+) diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c index 6591af9..8dde8b1 100644 --- a/src/mesa/main/hash.c +++ b/src/mesa/main/hash.c @@ -302,6 +302,32 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table, /** + * Clone all entries in a hash table, into a new table. + * + * \param table the hash table to clone + */ +struct _mesa_HashTable * +_mesa_HashClone(struct _mesa_HashTable *table) Can that be const qualified? The cloned tables are to be edited after we clone them. Basically we just want a copy of whats on the top of the stack when we do a push but then the we want to be able to be able to add more ids to this copy if we want to. I have a piglit test for push/popDebugGroup that shows what I mean. I will try to polish this up and submit it later today. Or I'm I miss understanding what you are suggesting? I'm simply asking if the 'table' function parameter can be const-qualified. Normally, when you clone an object you don't modify the original object, so I'd assume that it could be const. -Brian ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 68262] [build error with wayland 1.2.0] wayland-drm.c:110:3: error: implicit declaration of function 'wl_resource_create'
https://bugs.freedesktop.org/show_bug.cgi?id=68262 Fabio Pedretti fabio@libero.it changed: What|Removed |Added Summary|[build error with wayland |[build error with wayland |1.0.5] wayland-drm.c:110:3: |1.2.0] wayland-drm.c:110:3: |error: implicit declaration |error: implicit declaration |of function |of function |'wl_resource_create'|'wl_resource_create' -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600g/compute: Fix bug in compute memory pool
From: Tom Stellard thomas.stell...@amd.com When adding a new buffer to the beginning of the memory pool, we were accidentally deleting the buffer that was first in the buffer list. This was caused by a bug in the memory pool's linked list implementation. --- src/gallium/drivers/r600/compute_memory_pool.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index 454af90..4846bfe 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -337,14 +337,9 @@ void compute_memory_finalize_pending(struct compute_memory_pool* pool, } } else { /* Add item to the front of the list */ - item-next = pool-item_list-next; - if (pool-item_list-next) { - pool-item_list-next-prev = item; - } + item-next = pool-item_list; item-prev = pool-item_list-prev; - if (pool-item_list-prev) { - pool-item_list-prev-next = item; - } + pool-item_list-prev = item; pool-item_list = item; } } -- 1.7.11.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls
On 08/21/2013 05:57 PM, Dominik Behr wrote: Fixes a bug where if an uniform array is passed to a function the accesses to the array are not propagated so later all but the first vector of the uniform array are removed in parcel_out_uniform_storage resulting in broken shaders and out of bounds access to arrays in brw::vec4_visitor::pack_uniform_registers. Signed-off-by: Dominik Behr db...@chromium.org --- src/glsl/link_functions.cpp | 26 ++ 1 file changed, 26 insertions(+) diff --git a/src/glsl/link_functions.cpp b/src/glsl/link_functions.cpp index 6b3e154..32e2012 100644 --- a/src/glsl/link_functions.cpp +++ b/src/glsl/link_functions.cpp @@ -173,6 +173,32 @@ public: return visit_continue; } + virtual ir_visitor_status visit_leave(ir_call *ir) + { + /* traverse list of function parameters, and for array parameters + propagate max_array_access, do it when leaving the node + so the childen would propagate their array accesses first */ children I also think this comment should be expanded. On the first read of this code, it wasn't obvious to me why this was necessary. It's worth mentioning that the important case is when the reference to the array at the call site does not reference a specific element. In that case the access range of the from the callee is propagated back to the array passed in. + exec_list_iterator sig_param_iter = ir-callee-parameters.iterator(); + exec_list_iterator param_iter = ir-actual_parameters.iterator(); While iterators are a fine design pattern, the implementation of them in the compiler front-end is garbage (and I made that implementation). We stopped writing new code to use the iterators years ago. There is other code that does a similar double-walk of the signature parameters and actual parameters lists. I think the best example is in lower_clip_distance_visitor::visit_leave(ir_call *ir) in lower_clip_distance.cpp. + while (param_iter.has_next()) { + ir_variable *sig_param = (ir_variable *) sig_param_iter.get(); + ir_rvalue *param = (ir_rvalue *) param_iter.get(); + assert(sig_param); + assert(param); + if (sig_param-type-is_array()) { +ir_dereference_variable *deref = param-as_dereference_variable(); +if (deref deref-var deref-var-type-is_array()) { + if (sig_param-max_array_access deref-var-max_array_access) { + deref-var-max_array_access = sig_param-max_array_access; + } This should end up looking more similar to the code in call_link_visitor::visit(ir_dereference_variable *ir) around line 200 (without your patch). var-max_array_access = MAX2(var-max_array_access, ir-var-max_array_access); +} + } + sig_param_iter.next(); + param_iter.next(); + } + return visit_continue; + } + virtual ir_visitor_status visit(ir_dereference_variable *ir) { if (hash_table_find(locals, ir-var) == NULL) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Gen4: Zero out extra coordinates when using shadow compare
On 08/27/2013 02:47 AM, Chris Forbes wrote: Sorry, saw the bogus comment style just after I sent this out. -- Chris On Tue, Aug 27, 2013 at 9:45 PM, Chris Forbes chr...@ijw.co.nz wrote: Fixes broken rendering if these MRFs contained anything other than zero. NOTE: This is a candidate for stable branches. Signed-off-by: Chris Forbes chr...@ijw.co.nz --- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 964ad40..c3eecfa 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -868,7 +868,12 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, fs_reg coordinate, emit(MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate)); coordinate.reg_offset++; } - /* gen4's SIMD8 sampler always has the slots for u,v,r present. */ + + /* gen4's SIMD8 sampler always has the slots for u,v,r present. + * the unused slots must be zeroed.*/ + for (int i = ir-coordinate-type-vector_elements; i3; i++) { Another instance of forgetting to do this? :( i 3 please (with spaces). Either way, Reviewed-by: Kenneth Graunke kenn...@whitecape.org + emit(MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f))); + } mlen += 3; if (ir-op == ir_tex) { -- 1.8.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] update wayland requirement
On 08/12/2013 04:48 AM, Fabio Pedretti wrote: Since 8d29b52 wayland 1.2.0 is required. Thanks. Committed to master. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 68262] [build error with wayland 1.2.0] wayland-drm.c:110:3: error: implicit declaration of function 'wl_resource_create'
https://bugs.freedesktop.org/show_bug.cgi?id=68262 Fabio Pedretti fabio@libero.it changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Fabio Pedretti fabio@libero.it --- 1.2.0 is now required: http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa3905423e398e1ba36502ae91339d1303acf77f -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: Allow precision qualifiers for sampler types
GLSL 1.30 doesn't allow precision qualifiers on sampler types, but in GLSL ES, sampler types are also allowed. This seems like an oversight (since the intention of including these in GLSL 1.30 is to allow compatibility with ES shaders). Currently, Mesa allows default precision qualifiers to be set for sampler types in GLSL (commit d5948f2). This patch makes it follow GLSL ES rules and also allow declaring sampler variables with a precision qualifier in GLSL. This fixes a shader compilation error in Khronos OpenGL conformance test depth_texture_mipmap. Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/glsl/ast_to_hir.cpp | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 192130a..b3d6d8c 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -3131,8 +3131,8 @@ ast_declarator_list::hir(exec_list *instructions, state-check_precision_qualifiers_allowed(loc); } - - /* Precision qualifiers only apply to floating point and integer types. + /* Precision qualifiers apply to floating point, integer and sampler + * types. * * From section 4.5.2 of the GLSL 1.30 spec: *Any floating point or any integer declaration can have the type @@ -3144,20 +3144,24 @@ ast_declarator_list::hir(exec_list *instructions, * * From page 87 of the GLSL ES spec: *RESOLUTION: Allow sampler types to take a precision qualifier. + * + * GLSL 1.30 doesn't allow precision qualifiers on sampler types, but + * this seems like an oversight (since the intention of including these + * in GLSL 1.30 is to allow compatibility with ES shaders). So we allow + * int, float, and all sampler types regardless of GLSL version. */ if (this-type-qualifier.precision != ast_precision_none !var-type-is_float() !var-type-is_integer() !var-type-is_record() - !(var-type-is_sampler() state-es_shader) + !(var-type-is_sampler()) !(var-type-is_array() (var-type-fields.array-is_float() || var-type-fields.array-is_integer( { _mesa_glsl_error(loc, state, precision qualifiers apply only to floating point - %s types, state-es_shader ? , integer, and sampler - : and integer); + , integer and sampler types); } /* From page 17 (page 23 of the PDF) of the GLSL 1.20 spec: -- 1.8.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned opcode; int i, j, k, r = 0; int next_pos_base = 60, next_param_base = 0; + int max_color_exports = MAX2(key.nr_cbufs, 1); /* Declarations used by llvm code */ bool use_llvm = false; bool indirect_gprs; @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, radeon_llvm_ctx.face_gpr = ctx.face_gpr; radeon_llvm_ctx.r600_inputs = ctx.shader-input; radeon_llvm_ctx.r600_outputs = ctx.shader-output; - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1); + radeon_llvm_ctx.color_buffer_count = max_color_exports; radeon_llvm_ctx.chip_class = ctx.bc-chip_class; radeon_llvm_ctx.fs_color_all = shader-fs_write_all (rscreen-chip_class = EVERGREEN); radeon_llvm_ctx.stream_outputs = so; @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, case TGSI_PROCESSOR_FRAGMENT: if (shader-output[i].name == TGSI_SEMANTIC_COLOR) { /* never export more colors than the number of CBs */ - if (shader-output[i].sid = key.nr_cbufs) { + if (shader-output[i].sid = max_color_exports) { /* skip export */ j--; continue; @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL; shader-nr_ps_color_exports++; if (shader-fs_write_all (rscreen-chip_class = EVERGREEN)) { - for (k = 1; k key.nr_cbufs; k++) { + for (k = 1; k max_color_exports; k++) { j++; memset(output[j], 0, sizeof(struct r600_bytecode_output)); output[j].gpr = shader-output[i].gpr; -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: Allow precision qualifiers for sampler types
On 08/27/2013 10:45 AM, Anuj Phogat wrote: GLSL 1.30 doesn't allow precision qualifiers on sampler types, but in GLSL ES, sampler types are also allowed. This seems like an oversight (since the intention of including these in GLSL 1.30 is to allow compatibility with ES shaders). Currently, Mesa allows default precision qualifiers to be set for sampler types in GLSL (commit d5948f2). This patch makes it follow GLSL ES rules and also allow declaring sampler variables with a precision qualifier in GLSL. I think our current behavior is incorrect even in the ES case. GLSL ES 3.30 and desktop GLSL 4.40 say the following in section 4.5.3 (Precision Qualifiers): Any floating point or any integer declaration can have the type preceded by one of these precision qualifiers... The also both say the following in section 4.5.4 (Default Precision Qualifiers): The precision statement...can be used to establish a default precision qualifier. The type field can be either int or float or any of the sampler types... So I believe precision mediump sampler2D; should be legal in all versions, but uniform mediump sampler2D s; should not. Which syntax is the test using? This fixes a shader compilation error in Khronos OpenGL conformance test depth_texture_mipmap. Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/glsl/ast_to_hir.cpp | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 192130a..b3d6d8c 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -3131,8 +3131,8 @@ ast_declarator_list::hir(exec_list *instructions, state-check_precision_qualifiers_allowed(loc); } - - /* Precision qualifiers only apply to floating point and integer types. + /* Precision qualifiers apply to floating point, integer and sampler + * types. * * From section 4.5.2 of the GLSL 1.30 spec: *Any floating point or any integer declaration can have the type @@ -3144,20 +3144,24 @@ ast_declarator_list::hir(exec_list *instructions, * * From page 87 of the GLSL ES spec: *RESOLUTION: Allow sampler types to take a precision qualifier. + * + * GLSL 1.30 doesn't allow precision qualifiers on sampler types, but + * this seems like an oversight (since the intention of including these + * in GLSL 1.30 is to allow compatibility with ES shaders). So we allow + * int, float, and all sampler types regardless of GLSL version. */ if (this-type-qualifier.precision != ast_precision_none !var-type-is_float() !var-type-is_integer() !var-type-is_record() - !(var-type-is_sampler() state-es_shader) + !(var-type-is_sampler()) !(var-type-is_array() (var-type-fields.array-is_float() || var-type-fields.array-is_integer( { _mesa_glsl_error(loc, state, precision qualifiers apply only to floating point - %s types, state-es_shader ? , integer, and sampler - : and integer); + , integer and sampler types); } /* From page 17 (page 23 of the PDF) of the GLSL 1.20 spec: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned opcode; int i, j, k, r = 0; int next_pos_base = 60, next_param_base = 0; + int max_color_exports = MAX2(key.nr_cbufs, 1); /* Declarations used by llvm code */ bool use_llvm = false; bool indirect_gprs; @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, radeon_llvm_ctx.face_gpr = ctx.face_gpr; radeon_llvm_ctx.r600_inputs = ctx.shader-input; radeon_llvm_ctx.r600_outputs = ctx.shader-output; - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1); + radeon_llvm_ctx.color_buffer_count = max_color_exports; radeon_llvm_ctx.chip_class = ctx.bc-chip_class; radeon_llvm_ctx.fs_color_all = shader-fs_write_all (rscreen-chip_class = EVERGREEN); radeon_llvm_ctx.stream_outputs = so; @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, case TGSI_PROCESSOR_FRAGMENT: if (shader-output[i].name == TGSI_SEMANTIC_COLOR) { /* never export more colors than the number of CBs */ - if (shader-output[i].sid = key.nr_cbufs) { + if (shader-output[i].sid = max_color_exports) { /* skip export */ j--; continue; @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL; shader-nr_ps_color_exports++; if (shader-fs_write_all (rscreen-chip_class = EVERGREEN)) { - for (k = 1; k key.nr_cbufs; k++) { + for (k = 1; k max_color_exports; k++) { j++; memset(output[j], 0, sizeof(struct r600_bytecode_output)); output[j].gpr = shader-output[i].gpr; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: add flexible shader descriptor management and use it for sampler views
On Mon, Aug 26, 2013 at 8:59 PM, Marko Ristola marko.rist...@kolumbus.fi wrote: Hi 15.08.2013 13:54, Marek Olšák wrote: On Thu, Aug 15, 2013 at 10:27 AM, Christian König deathsim...@vodafone.de wrote: Am 15.08.2013 05:25, schrieb Marek Olšák: (This should be applied before MSAA, which will need to be rebased.) It moves all sampler view descriptors to a buffer. It supports partial resource updates and it can also unbind resources (required for FMASK texturing). The buffer contains all sampler view descriptors for one shader stage, represented as an array. On top of that, there are N arrays in the buffer, which are used to emulate context registers as implemented by the previous ASICs (each array is a context). This uses the RCU synchronization approach to avoid read-after-write hazards as discussed in the thread: radeonsi: add FMASK texture binding slots and resource setup CP DMA is used to clear the descriptors at context initialization and to copy the descriptors from one context to the next. IMPORTANT: 128 resource contexts are needed, 64 doesn't work. If I set SH_KCACHE_ACTION_ENA before every draw call, only 2 contexts are needed. I don't have an explanation for this. --- The idea itself looks really good to me, but we should probably also move the all resources and samplers to the new model and then rip out the code that stores them directly into the IB. I'd like MSAA to land first, but yes, the plan is to eventually move all resources and samplers to the new model. +/* Emit a CP DMA packet to do a copy from one buffer to another. + * The size must fit in bits [20:0]. Notes: + * + * 1) Set sync to true if you want the 3D engine to wait until CP DMA is done. + * + * 2) Set raw_hazard_wait to true if the source data was used as a destination + *in a previous CP DMA packet. It's for preventing a read-after-write hazard + *between two CP DMA packets. + */ +static void si_emit_cp_dma_copy_buffer(struct r600_context *rctx, + uint64_t dst_va, uint64_t src_va, + unsigned size, + bool sync, bool raw_hazard_wait) +{ + struct radeon_winsys_cs *cs = rctx-cs; + uint32_t sync_flag = sync ? PKT3_CP_DMA_CP_SYNC : 0; + uint32_t raw_wait = raw_hazard_wait ? PKT3_CP_DMA_CMD_RAW_WAIT : 0; + + assert(size); + assert((size ((121)-1)) == size); + + cs-buf[cs-cdw++] = PKT3(PKT3_CP_DMA, 4, 0); + cs-buf[cs-cdw++] = src_va;/* SRC_ADDR_LO [31:0] */ + cs-buf[cs-cdw++] = sync_flag | ((src_va 32) 0xff); /* CP_SYNC [31] | SRC_ADDR_HI [7:0] */ + cs-buf[cs-cdw++] = dst_va;/* DST_ADDR_LO [31:0] */ + cs-buf[cs-cdw++] = (dst_va 32) 0xff; /* DST_ADDR_HI [7:0] */ + cs-buf[cs-cdw++] = size | raw_wait; /* COMMAND [29:22] | BYTE_COUNT [20:0] */ +} + +/* Emit a CP DMA packet to clear a buffer. The size must fit in bits [20:0]. */ +static void si_emit_cp_dma_clear_buffer(struct r600_context *rctx, + uint64_t dst_va, unsigned size, + uint32_t clear_value, + bool sync, bool raw_hazard_wait) +{ + struct radeon_winsys_cs *cs = rctx-cs; + uint32_t sync_flag = sync ? PKT3_CP_DMA_CP_SYNC : 0; + uint32_t raw_wait = raw_hazard_wait ? PKT3_CP_DMA_CMD_RAW_WAIT : 0; + + assert(size); + assert((size ((121)-1)) == size); + + cs-buf[cs-cdw++] = PKT3(PKT3_CP_DMA, 4, 0); + cs-buf[cs-cdw++] = clear_value; /* DATA [31:0] */ + cs-buf[cs-cdw++] = sync_flag | PKT3_CP_DMA_SRC_SEL(2); /* CP_SYNC [31] | SRC_SEL[30:29] */ + cs-buf[cs-cdw++] = dst_va;/* DST_ADDR_LO [31:0] */ + cs-buf[cs-cdw++] = (dst_va 32) 0xff; /* DST_ADDR_HI [7:0] */ + cs-buf[cs-cdw++] = size | raw_wait; /* COMMAND [29:22] | BYTE_COUNT [20:0] */ +} Can we use some kind of macro or inline function instead of cs-buf[cs-cdw++] ? That should help of we need to port that over to a different CS mechanism. How about this? static INLINE void r600_write_value(struct radeon_winsys_cs *cs, unsigned value) { cs-buf[cs-cdw++] = value; } And IIRC the CP DMA is identical on all chipset generation (maybe excluding early R6xx, but I'm not 100% sure of that), so it might be a good idea to start sharing code again by putting this under src/gallium/drivers/radeon/radeon_cp_dma.c. Not necessary now, but more as a general idea. What do you think? I agree. CP DMA is indeed identical on all chipsets. The copying is supported since R600 and the clearing is supported since Evergreen. Maybe you already thought: One way to emulate clearing is to copy with CP DMA from a constant cleared
Re: [Mesa-dev] [PATCH] glsl: Allow precision qualifiers for sampler types
On Tue, Aug 27, 2013 at 11:53 AM, Ian Romanick i...@freedesktop.org wrote: On 08/27/2013 10:45 AM, Anuj Phogat wrote: GLSL 1.30 doesn't allow precision qualifiers on sampler types, but in GLSL ES, sampler types are also allowed. This seems like an oversight (since the intention of including these in GLSL 1.30 is to allow compatibility with ES shaders). Currently, Mesa allows default precision qualifiers to be set for sampler types in GLSL (commit d5948f2). This patch makes it follow GLSL ES rules and also allow declaring sampler variables with a precision qualifier in GLSL. I think our current behavior is incorrect even in the ES case. GLSL ES 3.30 You mean to say GLSL ES 3.00? and desktop GLSL 4.40 say the following in section 4.5.3 (Precision Qualifiers): Any floating point or any integer declaration can have the type preceded by one of these precision qualifiers... Yes, samplers are now allowed in GLSL 4.4. They were not in GLSL 4.3. The also both say the following in section 4.5.4 (Default Precision Qualifiers): The precision statement...can be used to establish a default precision qualifier. The type field can be either int or float or any of the sampler types... So I believe precision mediump sampler2D; should be legal in all versions, but uniform mediump sampler2D s; should not. Yes, there is no clear statement in GLSL spec which allows: uniform mediump sampler2D s; Which syntax is the test using? test uses: uniform mediump sampler2D s; I haven't yet tested if it is accepted by NVIDIA. This fixes a shader compilation error in Khronos OpenGL conformance test depth_texture_mipmap. Signed-off-by: Anuj Phogat anuj.pho...@gmail.com --- src/glsl/ast_to_hir.cpp | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 192130a..b3d6d8c 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -3131,8 +3131,8 @@ ast_declarator_list::hir(exec_list *instructions, state-check_precision_qualifiers_allowed(loc); } - - /* Precision qualifiers only apply to floating point and integer types. + /* Precision qualifiers apply to floating point, integer and sampler + * types. * * From section 4.5.2 of the GLSL 1.30 spec: *Any floating point or any integer declaration can have the type @@ -3144,20 +3144,24 @@ ast_declarator_list::hir(exec_list *instructions, * * From page 87 of the GLSL ES spec: *RESOLUTION: Allow sampler types to take a precision qualifier. + * + * GLSL 1.30 doesn't allow precision qualifiers on sampler types, but + * this seems like an oversight (since the intention of including these + * in GLSL 1.30 is to allow compatibility with ES shaders). So we allow + * int, float, and all sampler types regardless of GLSL version. */ if (this-type-qualifier.precision != ast_precision_none !var-type-is_float() !var-type-is_integer() !var-type-is_record() - !(var-type-is_sampler() state-es_shader) + !(var-type-is_sampler()) !(var-type-is_array() (var-type-fields.array-is_float() || var-type-fields.array-is_integer( { _mesa_glsl_error(loc, state, precision qualifiers apply only to floating point - %s types, state-es_shader ? , integer, and sampler - : and integer); + , integer and sampler types); } /* From page 17 (page 23 of the PDF) of the GLSL 1.20 spec: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
Shader variants are BAD, BAD, BAD. Have you ever played an AAA game with a Mesa driver that likes to compile shader variants on first use? It's HORRIBLE. What the patch does is probably the right solution. At least alpha-test state changes don't cause shader recompilation and re-binding, which also negatively affects performance. Ideally we shouldn't depend on the framebuffer state at all, but we need to emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we should always be fine with key.nr_cbufs forced to 8 for any shader without that property. I expect app developers to do the right thing and not write outputs they don't need. Marek On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned opcode; int i, j, k, r = 0; int next_pos_base = 60, next_param_base = 0; + int max_color_exports = MAX2(key.nr_cbufs, 1); /* Declarations used by llvm code */ bool use_llvm = false; bool indirect_gprs; @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, radeon_llvm_ctx.face_gpr = ctx.face_gpr; radeon_llvm_ctx.r600_inputs = ctx.shader-input; radeon_llvm_ctx.r600_outputs = ctx.shader-output; - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1); + radeon_llvm_ctx.color_buffer_count = max_color_exports; radeon_llvm_ctx.chip_class = ctx.bc-chip_class; radeon_llvm_ctx.fs_color_all = shader-fs_write_all (rscreen-chip_class = EVERGREEN); radeon_llvm_ctx.stream_outputs = so; @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, case TGSI_PROCESSOR_FRAGMENT: if (shader-output[i].name == TGSI_SEMANTIC_COLOR) { /* never export more colors than the number of CBs */ - if (shader-output[i].sid = key.nr_cbufs) { + if (shader-output[i].sid = max_color_exports) { /* skip export */ j--; continue; @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL; shader-nr_ps_color_exports++; if (shader-fs_write_all (rscreen-chip_class = EVERGREEN)) { - for (k = 1; k key.nr_cbufs; k++) { + for (k = 1; k max_color_exports; k++) { j++; memset(output[j], 0, sizeof(struct r600_bytecode_output)); output[j].gpr = shader-output[i].gpr; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g/compute: Don't flush the cs in pipe_context::launch_grid()
Reviewed-by: Marek Olšák marek.ol...@amd.com Marek On Tue, Aug 27, 2013 at 3:05 AM, Tom Stellard t...@stellard.net wrote: From: Tom Stellard thomas.stell...@amd.com This is the state tracker's responsibility. --- src/gallium/drivers/r600/evergreen_compute.c | 11 +-- 1 file changed, 1 insertion(+), 10 deletions(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index 9b2bae3..4fa3d2f 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -473,6 +473,7 @@ static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout, R600_CONTEXT_INV_VERTEX_CACHE | R600_CONTEXT_INV_TEX_CACHE; r600_flush_emit(ctx); + ctx-flags = 0; #if 0 COMPUTE_DBG(ctx-screen, cdw: %i\n, cs-cdw); @@ -481,16 +482,6 @@ static void compute_emit_cs(struct r600_context *ctx, const uint *block_layout, } #endif - flush_flags = RADEON_FLUSH_ASYNC | RADEON_FLUSH_COMPUTE; - if (ctx-keep_tiling_flags) { - flush_flags |= RADEON_FLUSH_KEEP_TILING_FLAGS; - } - - ctx-ws-cs_flush(ctx-rings.gfx.cs, flush_flags, ctx-screen-cs_count++); - - ctx-flags = 0; - - COMPUTE_DBG(ctx-screen, shader started\n); } -- 1.7.11.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 50482] git mesa fails to build
https://bugs.freedesktop.org/show_bug.cgi?id=50482 Henry henry_rosario_gonza...@hotmail.com changed: What|Removed |Added Resolution|FIXED |WORKSFORME -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls
Hi, There was a lot of code everywhere that used iterators so it is not obvious for a new person that the iterators are evil. And they do abstract the implementation of the storage. Also, I'd rather use ir_instruction and as_*() calls instead of exec_node and explicit cast. I am not sure about MAX2, is it guaranteed to be converted to branchless conditional assignment? On Tue, Aug 27, 2013 at 8:26 AM, Ian Romanick i...@freedesktop.org wrote: On 08/21/2013 05:57 PM, Dominik Behr wrote: Fixes a bug where if an uniform array is passed to a function the accesses to the array are not propagated so later all but the first vector of the uniform array are removed in parcel_out_uniform_storage resulting in broken shaders and out of bounds access to arrays in brw::vec4_visitor::pack_**uniform_registers. Signed-off-by: Dominik Behr db...@chromium.org --- src/glsl/link_functions.cpp | 26 ++ 1 file changed, 26 insertions(+) diff --git a/src/glsl/link_functions.cpp b/src/glsl/link_functions.cpp index 6b3e154..32e2012 100644 --- a/src/glsl/link_functions.cpp +++ b/src/glsl/link_functions.cpp @@ -173,6 +173,32 @@ public: return visit_continue; } + virtual ir_visitor_status visit_leave(ir_call *ir) + { + /* traverse list of function parameters, and for array parameters + propagate max_array_access, do it when leaving the node + so the childen would propagate their array accesses first */ children I also think this comment should be expanded. On the first read of this code, it wasn't obvious to me why this was necessary. It's worth mentioning that the important case is when the reference to the array at the call site does not reference a specific element. In that case the access range of the from the callee is propagated back to the array passed in. + exec_list_iterator sig_param_iter = ir-callee-parameters.** iterator(); + exec_list_iterator param_iter = ir-actual_parameters.** iterator(); While iterators are a fine design pattern, the implementation of them in the compiler front-end is garbage (and I made that implementation). We stopped writing new code to use the iterators years ago. There is other code that does a similar double-walk of the signature parameters and actual parameters lists. I think the best example is in lower_clip_distance_visitor::**visit_leave(ir_call *ir) in lower_clip_distance.cpp. + while (param_iter.has_next()) { + ir_variable *sig_param = (ir_variable *) sig_param_iter.get(); + ir_rvalue *param = (ir_rvalue *) param_iter.get(); + assert(sig_param); + assert(param); + if (sig_param-type-is_array()) { +ir_dereference_variable *deref = param-as_dereference_** variable(); +if (deref deref-var deref-var-type-is_array()) { + if (sig_param-max_array_access deref-var-max_array_access) { + deref-var-max_array_access = sig_param-max_array_access; + } This should end up looking more similar to the code in call_link_visitor::visit(ir_**dereference_variable *ir) around line 200 (without your patch). var-max_array_access = MAX2(var-max_array_access, ir-var-max_array_access); +} + } + sig_param_iter.next(); + param_iter.next(); + } + return visit_continue; + } + virtual ir_visitor_status visit(ir_dereference_variable *ir) { if (hash_table_find(locals, ir-var) == NULL) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.
On 26 August 2013 17:49, Ian Romanick i...@freedesktop.org wrote: On 08/23/2013 02:02 PM, Paul Berry wrote: On 23 August 2013 13:19, Matt Turner matts...@gmail.com mailto:matts...@gmail.com wrote: On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com mailto:stereotype441@gmail.**com stereotype...@gmail.com wrote: On 22 August 2013 16:08, Matt Turner matts...@gmail.com mailto:matts...@gmail.com wrote: --- src/glsl/builtins/ir/frexp.ir http://frexp.ir | 25 + src/glsl/builtins/ir/ldexp.ir http://ldexp.ir | 25 + src/glsl/builtins/profiles/**ARB_gpu_shader5.glsl | 10 ++ 3 files changed, 60 insertions(+) create mode 100644 src/glsl/builtins/ir/frexp.ir http://frexp.ir create mode 100644 src/glsl/builtins/ir/ldexp.ir http://ldexp.ir diff --git a/src/glsl/builtins/ir/frexp.**ir http://frexp.ir http://frexp.ir b/src/glsl/builtins/ir/frexp.**ir http://frexp.ir http://frexp.ir new file mode 100644 index 000..a514994 --- /dev/null +++ b/src/glsl/builtins/ir/frexp.**ir http://frexp.ir http://frexp.ir @@ -0,0 +1,25 @@ +((function frexp + (signature float + (parameters + (declare (in) float x) + (declare (out) int exp)) + ((return (expression float frexp (var_ref x) (var_ref exp) Having an ir_expression that writes to one of its parameters is going to break assumptions in a lot of our optimization passes. I'm concerned that that may be a problem we have to solve anyway. While our hardware doesn't support an frexp instruction (like e.g., AMD does) and we could probably do what you suggest, we do have instructions that correspond directly to the uaddCarry() and usubBorrow() built-ins in this same extension. They return a value and also have an out parameter. genUType uaddCarry(genUType x, genUType y, out genUType carry); genUType usubBorrow(genUType x, genUType y, out genUType borrow); We could probably avoid the problem you describe by lowering them, but it's feeling increasingly distasteful. Your code would make a good piglit test. I'll do some experiments. Hmm, interesting. The way LLVM solves this problem, as I understand it, is through so-called intrinsic functions (http://llvm.org/docs/LangRef.**html#intrinsic-functionshttp://llvm.org/docs/LangRef.html#intrinsic-functions). I wonder if we should start doing that in Mesa. Briefly, here is what it would look like, using uaddCarry as an example: 1. First we do an inefficient implementation of uaddCarry in terms of existing GLSL functions, much like you did for frexp in your frexp_to_arith lowering pass, except that we do it in src/glsl/builtins/glsl/**uaddCarry.glsl, so it's a little easier to review :). Optimization passes already deal with function out parameters properly, and function inlining automatically splices in the proper code during linking. 2. For back-ends that don't have an efficient native way to do uaddCarry, we're done. The uaddCarry function works as is. 3. For back-ends that do have an efficient way to do uaddCarry, we add a mechanism to allow the back-end to tell the linker: don't inline the definition of this built-in. Just leave it as an ir_call because I have my own special implementation of it*. I had thought about solving this in a slightly different way, but there are a couple potential tricky bits. Provide an implementation of the built-in function in the GLSL library. float frexp(float x, out int exponent) { return __intrinsic_frexp(x, exponent); } Provide a default implementation of the intrinsic elsewhere. Allow drivers to supply an alternate library with custom versions of the intrinsics. Since the GLSL library's frexp is the same in either case, the problem Paul identifies below should be avoided. The tricky bit, and the problem we always come to when talking about intrinsics is dealing with constant expressions. That doesn't (shouldn't?) apply to this case because of the out parameter, but it may apply to other cases. Yeah, good point about constant expressions. With my proposal, that could be addressed by having the constant expression evaluator always recurse into the GLSL implementation, regardless of whether the function is an intrinsic (this should be fine, since the only reason for the intrinsic version of the function to be used is to take advantage of efficient instructions in the GPU). I confess that I don't understand the rest of your proposal as well as I would like. Maybe the three of us should discuss it in person next time we're in the office. Right now an application could do: float foo[packUnorm2x16(vec2(1,0))]; If
Re: [Mesa-dev] [PATCH 03/15] i965/fs: Add support for translating ir_triop_fma into MAD.
On 26 August 2013 22:00, Matt Turner matts...@gmail.com wrote: Thanks Paul. I've placed this patch as 2.5/15 in the series. i965/fs: Assert that ir_expressions are usable by 3-src instructions MAD will be generated directly from ir_triop_fma, so this assertion checks that all ir_expressions are usable. Reviewed-by: Paul Berry stereotype...@gmail.com --- diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp index 27887d6..a02d92d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp @@ -336,6 +336,7 @@ fs_visitor::visit(ir_expression *ir) ir-operands[operand]-print(); printf(\n); } + assert(this-result.is_valid_3src()); I spent a while trying to think of a commet that could go above the assert, to help someone who stumbles across it in the future and wonders why the assertion is here rather than in the switch statement below, but I couldn't really think of anything :( op[operand] = this-result; /* Matrix expression operands should have been broken down to vector ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
On 08/27/2013 11:00 PM, Roland Scheidegger wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. I thought about performance, but my main concern now is to avoid serious regressions after enabling sb, we can try to improve it later. Even if we won't emit this color export, we'll have fake export (with all color components masked) instead, and I'm not sure whether it's cheaper. Possibly hardware can see that there is no actual memory write, but benchmarks are needed to prove it. Also there is another possible improvement for exports - sometimes we need to export depth/stencil but no colors, probably we can get rid of fake color export as well in such cases. Anyway, this also needs additional testing/benchmarking. Vadim Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned opcode; int i, j, k, r = 0; int next_pos_base = 60, next_param_base = 0; + int max_color_exports = MAX2(key.nr_cbufs, 1); /* Declarations used by llvm code */ bool use_llvm = false; bool indirect_gprs; @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, radeon_llvm_ctx.face_gpr = ctx.face_gpr; radeon_llvm_ctx.r600_inputs = ctx.shader-input; radeon_llvm_ctx.r600_outputs = ctx.shader-output; - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1); + radeon_llvm_ctx.color_buffer_count = max_color_exports; radeon_llvm_ctx.chip_class = ctx.bc-chip_class; radeon_llvm_ctx.fs_color_all = shader-fs_write_all (rscreen-chip_class = EVERGREEN); radeon_llvm_ctx.stream_outputs = so; @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, case TGSI_PROCESSOR_FRAGMENT: if (shader-output[i].name == TGSI_SEMANTIC_COLOR) { /* never export more colors than the number of CBs */ - if (shader-output[i].sid = key.nr_cbufs) { + if (shader-output[i].sid = max_color_exports) { /* skip export */ j--; continue; @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, output[j].type = V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL; shader-nr_ps_color_exports++; if (shader-fs_write_all (rscreen-chip_class = EVERGREEN)) { - for (k = 1; k key.nr_cbufs; k++) { + for (k = 1; k max_color_exports; k++) { j++; memset(output[j], 0, sizeof(struct r600_bytecode_output)); output[j].gpr = shader-output[i].gpr; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
On 08/28/2013 12:43 AM, Marek Olšák wrote: Shader variants are BAD, BAD, BAD. Have you ever played an AAA game with a Mesa driver that likes to compile shader variants on first use? It's HORRIBLE. I don't think that shader variants are bad, but it's definitely bad when we are compiling variants that are never used. Currently glxgears compiles 18 ps/vs shaders. In my branch with initial GS support [1] I switched handling of the shaders to deferred compilation, that is, shaders are compiled only before the actual draw. I found later that it's not really required for GS, but IIRC this change results in only 5 shaders being compiled for glxgears instead of 18. It seems most of the useless variants are results of state changes between creation of the shader state (initial compilation) and actual draw call. I had some concerns about increased overhead with those changes, and it's actually noticeable with drawoverhead demo, but I didn't see any regressions with a few real apps that I tested, e.g. glxgears even showed slightly better performance with these changes. Probably I also implemented it in a not very optimal way (I was mostly concentrated on GS support) and the overhead can be reduced. One more thing is duplicate shaders, I've analyzed shader dumps from Unigine Heaven 3.0 some time ago and found that from about 320 compiled shaders, only about 180 (50%) were unique, others were duplicates (detected by comparing the bytecode dumps for them in an automated way), maybe they had different shader keys (which still resulted in the same bytecode), but I suspect duplicate pipe shaders were also involved. Unfortunately I didn't have a time to investigate it more thoroughly since then. So my point is that we don't really need to eliminate shader variants, first we need to eliminate compilation of unused variants and duplicate shaders. Also we might want to consider offloading of the compilation to separate thread(s) and caching of shader binaries between runs. Vadim [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders What the patch does is probably the right solution. At least alpha-test state changes don't cause shader recompilation and re-binding, which also negatively affects performance. Ideally we shouldn't depend on the framebuffer state at all, but we need to emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we should always be fine with key.nr_cbufs forced to 8 for any shader without that property. I expect app developers to do the right thing and not write outputs they don't need. Marek On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned opcode; int i, j, k, r = 0; int next_pos_base = 60, next_param_base = 0; + int max_color_exports = MAX2(key.nr_cbufs, 1); /* Declarations used by llvm code */ bool use_llvm = false; bool indirect_gprs; @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, radeon_llvm_ctx.face_gpr = ctx.face_gpr; radeon_llvm_ctx.r600_inputs = ctx.shader-input; radeon_llvm_ctx.r600_outputs = ctx.shader-output; - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1); + radeon_llvm_ctx.color_buffer_count = max_color_exports; radeon_llvm_ctx.chip_class = ctx.bc-chip_class; radeon_llvm_ctx.fs_color_all =
Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.
On Mon, Aug 26, 2013 at 5:49 PM, Ian Romanick i...@freedesktop.org wrote: On 08/23/2013 02:02 PM, Paul Berry wrote: On 23 August 2013 13:19, Matt Turner matts...@gmail.com mailto:matts...@gmail.com wrote: On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com mailto:stereotype...@gmail.com wrote: On 22 August 2013 16:08, Matt Turner matts...@gmail.com mailto:matts...@gmail.com wrote: --- src/glsl/builtins/ir/frexp.ir http://frexp.ir | 25 + src/glsl/builtins/ir/ldexp.ir http://ldexp.ir | 25 + src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++ 3 files changed, 60 insertions(+) create mode 100644 src/glsl/builtins/ir/frexp.ir http://frexp.ir create mode 100644 src/glsl/builtins/ir/ldexp.ir http://ldexp.ir diff --git a/src/glsl/builtins/ir/frexp.ir http://frexp.ir b/src/glsl/builtins/ir/frexp.ir http://frexp.ir new file mode 100644 index 000..a514994 --- /dev/null +++ b/src/glsl/builtins/ir/frexp.ir http://frexp.ir @@ -0,0 +1,25 @@ +((function frexp + (signature float + (parameters + (declare (in) float x) + (declare (out) int exp)) + ((return (expression float frexp (var_ref x) (var_ref exp) Having an ir_expression that writes to one of its parameters is going to break assumptions in a lot of our optimization passes. I'm concerned that that may be a problem we have to solve anyway. While our hardware doesn't support an frexp instruction (like e.g., AMD does) and we could probably do what you suggest, we do have instructions that correspond directly to the uaddCarry() and usubBorrow() built-ins in this same extension. They return a value and also have an out parameter. genUType uaddCarry(genUType x, genUType y, out genUType carry); genUType usubBorrow(genUType x, genUType y, out genUType borrow); We could probably avoid the problem you describe by lowering them, but it's feeling increasingly distasteful. Your code would make a good piglit test. I'll do some experiments. Hmm, interesting. The way LLVM solves this problem, as I understand it, is through so-called intrinsic functions (http://llvm.org/docs/LangRef.html#intrinsic-functions). I wonder if we should start doing that in Mesa. Briefly, here is what it would look like, using uaddCarry as an example: 1. First we do an inefficient implementation of uaddCarry in terms of existing GLSL functions, much like you did for frexp in your frexp_to_arith lowering pass, except that we do it in src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review :). Optimization passes already deal with function out parameters properly, and function inlining automatically splices in the proper code during linking. 2. For back-ends that don't have an efficient native way to do uaddCarry, we're done. The uaddCarry function works as is. 3. For back-ends that do have an efficient way to do uaddCarry, we add a mechanism to allow the back-end to tell the linker: don't inline the definition of this built-in. Just leave it as an ir_call because I have my own special implementation of it*. I had thought about solving this in a slightly different way, but there are a couple potential tricky bits. Provide an implementation of the built-in function in the GLSL library. float frexp(float x, out int exponent) { return __intrinsic_frexp(x, exponent); } Provide a default implementation of the intrinsic elsewhere. Allow drivers to supply an alternate library with custom versions of the intrinsics. Since the GLSL library's frexp is the same in either case, the problem Paul identifies below should be avoided. The tricky bit, and the problem we always come to when talking about intrinsics is dealing with constant expressions. That doesn't (shouldn't?) apply to this case because of the out parameter, but it may apply to other cases. Maybe this is a problem in the general case, but I think the only thing we'd want to use intrinsics for at the moment are exactly the things you can't consider to be constant expressions -- because of the multiple outputs. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [RFC PATCH] i965: Allow C++ type safety in the use of enum brw_urb_write_flags.
Francisco Jerez curroje...@riseup.net writes: Ian Romanick i...@freedesktop.org writes: On 08/26/2013 01:10 PM, Francisco Jerez wrote: [...] The thing is that defining bitwise operators separately for each enum type, as this patch and the macro solution do, doesn't stop the compiler From using the corresponding built-in integer operators when it doesn't find a match among the user-defined ones. That means that if we have two bitfield enumerants from two different disjoint types e.g. SEASON_OF_THE_YEAR_SUMMER and a CPU_ARCHITECTURE_I386, the compiler is still going to accept expressions like SEASON_OF_THE_YEAR_SUMMER | CPU_ARCHITECTURE_I386, which might not be what had been expected if the BRW_CXX_ENUM_OPS macro was used in an attempt to improve the code's type safety. This sounds insane. If there are no operator overloads, the compiler rejects: enum foo f(enum foo a, enum foo b) { return a | b; } Here both a and b would be implicitly converted to integers which are then or'ed together using the built-in '|' operator yielding an integer. Using the result as return value of your enum foo function fails because in C++ integers are not implicitly convertible to enum types. Then we add operloads: enum foo operator|(enum foo, enum foo); enum bar operator|(enum bar, enum bar); And now the compiler will accept: unsigned f(enum foo a, enum bar b) { return a | b; } That's accepted because both a and b are converted to integers as before, giving an integer as result, which can be implicitly converted to your unsigned return type just fine. It's easy to avoid that though by defining a general overload: templatetypename T, typename S T operator|(T, S); for any bitfield types T and S that includes a static assertion within its definition to make sure that both types are indeed the same. In your example above the compiler would find an exact match during overload resolution (our operator|foo, bar(foo, bar);) so it wouldn't have to bother with implicit argument conversions. The instantiation of that template would fail thanks to the static assertion requiring both T and S to be equal. I hope this makes the idea a bit clearer. That can't be right. Am I missing something? Or am I reinforcing my point about not being ready for this level of C++ ninjitsu... This is way overcomplicated. I think we're doing just fine with the status quo of #defines for our bitfields with namespaced names. pgptvgKWfB3KF.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965: Silece usused variable warning in release build
Use `(void) success;` to silence this warning: i965/brw_vs.c:481:12: warning: unused variable 'success' [-Wunused-variable] bool success = do_vs_prog(brw, ctx-Shader.CurrentVertexProgram, Signed-off-by: Chad Versace chad.vers...@linux.intel.com --- src/mesa/drivers/dri/i965/brw_vs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index c4e016b..e0adc1d 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -480,7 +480,7 @@ static void brw_upload_vs_prog(struct brw_context *brw) brw-vs.prog_offset, brw-vs.prog_data)) { bool success = do_vs_prog(brw, ctx-Shader.CurrentVertexProgram, vp, key); - + (void) success; assert(success); } if (memcmp(brw-vs.prog_data-base.vue_map, brw-vue_map_geom_out, -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls
On Tue, Aug 27, 2013 at 2:03 PM, Dominik Behr db...@chromium.org wrote: I am not sure about MAX2, is it guaranteed to be converted to branchless conditional assignment? I don't think there's a guarantee, no. Using MAX2 just makes the code easier to immediately understand. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] mesa/main: Check for 0 size draws after validation.
On 18 July 2013 10:30, Matt Turner matts...@gmail.com wrote: On Sat, May 25, 2013 at 4:33 AM, Fabian Bieler fabianbie...@fastmail.fm wrote: When validating draw parameters move check for 0 draw count last (drawing with count 0 is not an error), so that other parameters (e.g.: the primitive type) are validated and the correct errors (if applicable) are generated. From the OpenGL 3.3 spec page 33 (page 48 of the PDF): [Regarding DrawArraysOneInstance, in terms of which other draw operations are defined:] If count is negative, an INVALID_VALUE error is generated. This patch also changes the bahavior of MultiDrawElements to perform the draw operation if some primitive's index counts are zero. Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm --- Was this committed? I don't see it. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev I've just committed it. Thanks for the reminder, Matt! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/8] intel: Add a batch flush between front-buffer downsample and X protocol.
This was already happening because blorp happens to flush at the end of every call, but we have been talking about removing that at some point, and this would surely get overlooked. v2 (Kenneth Graunke): Rebase on latest master. Note that we did remove the other flush, and this change actually did get overlooked! Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/intel_context.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/mesa/drivers/dri/i965/intel_context.c b/src/mesa/drivers/dri/i965/intel_context.c index 9a089bf..0f1639f 100644 --- a/src/mesa/drivers/dri/i965/intel_context.c +++ b/src/mesa/drivers/dri/i965/intel_context.c @@ -144,6 +144,7 @@ intel_flush_front(struct gl_context *ctx) * performance. */ intel_resolve_for_dri2_flush(brw, driDrawable); + intel_batchbuffer_flush(brw); screen-dri2.loader-flushFrontBuffer(driDrawable, driDrawable-loaderPrivate); -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/8] intel: Reuse intel_glFlush().
v2 (Kenneth Graunke): Rebase on latest master. Reviewed-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/intel_context.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_context.c b/src/mesa/drivers/dri/i965/intel_context.c index 0f1639f..4f96989 100644 --- a/src/mesa/drivers/dri/i965/intel_context.c +++ b/src/mesa/drivers/dri/i965/intel_context.c @@ -366,8 +366,7 @@ intelFinish(struct gl_context * ctx) { struct brw_context *brw = brw_context(ctx); - intel_batchbuffer_flush(brw); - intel_flush_front(ctx); + intel_glFlush(ctx); if (brw-batch.last_bo) drm_intel_bo_wait_rendering(brw-batch.last_bo); -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/8] i965: Directly call intel_batchbuffer_flush() after i915 split.
intel_flush() now did nothing except call through (and intel_batchbuffer_flush() does the no-op check, too!) --- src/mesa/drivers/dri/i965/gen6_blorp.cpp | 4 +--- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 +- src/mesa/drivers/dri/i965/intel_context.c| 17 - src/mesa/drivers/dri/i965/intel_context.h| 3 --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 3 +-- src/mesa/drivers/dri/i965/intel_pixel_copy.c | 3 ++- src/mesa/drivers/dri/i965/intel_syncobj.c| 2 +- 7 files changed, 10 insertions(+), 24 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp index 1c85921..da523e5 100644 --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp @@ -49,12 +49,10 @@ void gen6_blorp_emit_batch_head(struct brw_context *brw, const brw_blorp_params *params) { - struct gl_context *ctx = brw-ctx; - /* To ensure that the batch contains only the resolve, flush the batch * before beginning and after finishing emitting the resolve packets. */ - intel_flush(ctx); + intel_batchbuffer_flush(brw); } diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c b/src/mesa/drivers/dri/i965/intel_buffer_objects.c index 663cc29..21d3727 100644 --- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c +++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c @@ -291,7 +291,7 @@ intel_bufferobj_map_range(struct gl_context * ctx, } else { perf_debug(Stalling on the GPU for mapping a busy buffer object\n); - intel_flush(ctx); + intel_batchbuffer_flush(brw); } } else if (drm_intel_bo_busy(intel_obj-buffer) (access GL_MAP_INVALIDATE_BUFFER_BIT)) { diff --git a/src/mesa/drivers/dri/i965/intel_context.c b/src/mesa/drivers/dri/i965/intel_context.c index 37c1770..9a089bf 100644 --- a/src/mesa/drivers/dri/i965/intel_context.c +++ b/src/mesa/drivers/dri/i965/intel_context.c @@ -349,21 +349,12 @@ intelInvalidateState(struct gl_context * ctx, GLuint new_state) brw-NewGLState |= new_state; } -void -_intel_flush(struct gl_context *ctx, const char *file, int line) -{ - struct brw_context *brw = brw_context(ctx); - - if (brw-batch.used) - _intel_batchbuffer_flush(brw, file, line); -} - static void intel_glFlush(struct gl_context *ctx) { struct brw_context *brw = brw_context(ctx); - intel_flush(ctx); + intel_batchbuffer_flush(brw); intel_flush_front(ctx); if (brw-is_front_buffer_rendering) brw-need_throttle = true; @@ -374,7 +365,7 @@ intelFinish(struct gl_context * ctx) { struct brw_context *brw = brw_context(ctx); - intel_flush(ctx); + intel_batchbuffer_flush(brw); intel_flush_front(ctx); if (brw-batch.last_bo) @@ -816,7 +807,7 @@ intel_query_dri2_buffers(struct brw_context *brw, * query, we need to make sure all the pending drawing has landed in the * real front buffer. */ - intel_flush(brw-ctx); + intel_batchbuffer_flush(brw); intel_flush_front(brw-ctx); attachments[i++] = __DRI_BUFFER_FRONT_LEFT; @@ -828,7 +819,7 @@ intel_query_dri2_buffers(struct brw_context *brw, * So before doing the query, make sure all the pending drawing has * landed in the real front buffer. */ - intel_flush(brw-ctx); + intel_batchbuffer_flush(brw); intel_flush_front(brw-ctx); } diff --git a/src/mesa/drivers/dri/i965/intel_context.h b/src/mesa/drivers/dri/i965/intel_context.h index 734c57c..f35dafa 100644 --- a/src/mesa/drivers/dri/i965/intel_context.h +++ b/src/mesa/drivers/dri/i965/intel_context.h @@ -250,9 +250,6 @@ extern bool intelInitContext(struct brw_context *brw, unsigned *dri_ctx_error); extern void intelFinish(struct gl_context * ctx); -extern void _intel_flush(struct gl_context * ctx, const char *file, int line); - -#define intel_flush(ctx) _intel_flush(ctx, __FILE__, __LINE__) extern void intelInitDriverFunctions(struct dd_function_table *functions); diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 950ef57..f8cf96f 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1626,7 +1626,6 @@ intel_miptree_upsample(struct brw_context *brw, void * intel_miptree_map_raw(struct brw_context *brw, struct intel_mipmap_tree *mt) { - struct gl_context *ctx = brw-ctx; /* CPU accesses to color buffers don't understand fast color clears, so * resolve any pending fast color clears before we map. */ @@ -1640,7 +1639,7 @@ intel_miptree_map_raw(struct brw_context *brw, struct intel_mipmap_tree *mt) } } - intel_flush(ctx); + intel_batchbuffer_flush(brw); if (mt-region-tiling != I915_TILING_NONE)
[Mesa-dev] [PATCH 3/8] intel: Add support for the new flush_with_flags extension.
This gives us more information about why we're flushing that we can use for handling our throttling. v2 (Kenneth Graunke): Rebase on latest master, add missing FLUSH_VERTICES and FLUSH_CURRENT, which fixes a regression in Glean's polygonOffset test. v3 (anholt): Drop FLUSH_CURRENT -- FLUSH_VERTICES is what we need, which is get any queued prims out of VBO and into the driver, not update ctx-Current so we can read it with the CPU. Also drop batch-used check, which intel_batchbuffer_flush() does anyway. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- src/mesa/drivers/dri/i965/intel_screen.c | 46 +--- 1 file changed, 36 insertions(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_screen.c b/src/mesa/drivers/dri/i965/intel_screen.c index 9b3c31a..0580d6f 100644 --- a/src/mesa/drivers/dri/i965/intel_screen.c +++ b/src/mesa/drivers/dri/i965/intel_screen.c @@ -151,29 +151,55 @@ static const __DRItexBufferExtension intelTexBufferExtension = { }; static void -intelDRI2Flush(__DRIdrawable *drawable) +intel_dri2_flush_with_flags(__DRIcontext *cPriv, +__DRIdrawable *dPriv, +unsigned flags, +enum __DRI2throttleReason reason) { - GET_CURRENT_CONTEXT(ctx); - struct brw_context *brw = brw_context(ctx); - if (brw == NULL) + struct brw_context *brw = cPriv-driverPrivate; + + if (!brw) return; - intel_resolve_for_dri2_flush(brw, drawable); - brw-need_throttle = true; + struct gl_context *ctx = brw-ctx; + + FLUSH_VERTICES(ctx, 0); - if (brw-batch.used) - intel_batchbuffer_flush(brw); + if (flags __DRI2_FLUSH_DRAWABLE) + intel_resolve_for_dri2_flush(brw, dPriv); + + if (reason == __DRI2_THROTTLE_SWAPBUFFER || + reason == __DRI2_THROTTLE_FLUSHFRONT) { + brw-need_throttle = true; + } + + intel_batchbuffer_flush(brw); if (INTEL_DEBUG DEBUG_AUB) { aub_dump_bmp(ctx); } } +/** + * Provides compatibility with loaders that only support the older (version + * 1-3) flush interface. + * + * That includes libGL up to Mesa 9.0, and the X Server at least up to 1.13. + */ +static void +intel_dri2_flush(__DRIdrawable *drawable) +{ + intel_dri2_flush_with_flags(drawable-driContextPriv, drawable, + __DRI2_FLUSH_DRAWABLE, + __DRI2_THROTTLE_SWAPBUFFER); +} + static const struct __DRI2flushExtensionRec intelFlushExtension = { -.base = { __DRI2_FLUSH, 3 }, +.base = { __DRI2_FLUSH, 4 }, -.flush = intelDRI2Flush, +.flush = intel_dri2_flush, .invalidate = dri2InvalidateDrawable, +.flush_with_flags = intel_dri2_flush_with_flags, }; static struct intel_image_format intel_image_formats[] = { -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 7/8] i965: Add missing state reset at the end of blorp.
These are things that happen to be occurring because of the batch flush at the start of the blorp op (which exists to prevent batch space or aperture space overflow), but the intention was for this sequence of state resets at the end of blorp to be everything necessary for the next draw call. Found when debugging the next commit, by comparing brw_new_batch() and intel_batchbuffer_reset() to brw_blorp_exec(). --- src/mesa/drivers/dri/i965/brw_blorp.cpp | 2 ++ src/mesa/drivers/dri/i965/intel_batchbuffer.c | 10 +- src/mesa/drivers/dri/i965/intel_batchbuffer.h | 1 + 3 files changed, 8 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp index 20ea09e..1576ff2 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp @@ -214,6 +214,8 @@ brw_blorp_exec(struct brw_context *brw, const brw_blorp_params *params) brw-state.dirty.cache = ~0; brw-state_batch_count = 0; brw-batch.need_workaround_flush = true; + brw-ib.type = -1; + intel_batchbuffer_clear_cache(brw); /* Flush the sampler cache so any texturing from the destination is * coherent. diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c b/src/mesa/drivers/dri/i965/intel_batchbuffer.c index 5604829..0aa2551 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c @@ -41,8 +41,8 @@ struct cached_batch_item { uint16_t size; }; -static void -clear_cache(struct brw_context *brw) +void +intel_batchbuffer_clear_cache(struct brw_context *brw) { struct cached_batch_item *item = brw-batch.cached_items; @@ -85,7 +85,7 @@ intel_batchbuffer_reset(struct brw_context *brw) } brw-batch.last_bo = brw-batch.bo; - clear_cache(brw); + intel_batchbuffer_clear_cache(brw); brw-batch.bo = drm_intel_bo_alloc(brw-bufmgr, batchbuffer, BATCH_SZ, 4096); @@ -118,7 +118,7 @@ intel_batchbuffer_reset_to_saved(struct brw_context *brw) /* Cached batch state is dead, since we just cleared some unknown part of the * batchbuffer. Assume that the caller resets any other state necessary. */ - clear_cache(brw); + intel_batchbuffer_clear_cache(brw); } void @@ -128,7 +128,7 @@ intel_batchbuffer_free(struct brw_context *brw) drm_intel_bo_unreference(brw-batch.last_bo); drm_intel_bo_unreference(brw-batch.bo); drm_intel_bo_unreference(brw-batch.workaround_bo); - clear_cache(brw); + intel_batchbuffer_clear_cache(brw); } static void diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h index 15a9ca1..d46f48e 100644 --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h @@ -28,6 +28,7 @@ void intel_batchbuffer_init(struct brw_context *brw); void intel_batchbuffer_free(struct brw_context *brw); void intel_batchbuffer_save_state(struct brw_context *brw); void intel_batchbuffer_reset_to_saved(struct brw_context *brw); +void intel_batchbuffer_clear_cache(struct brw_context *brw); int _intel_batchbuffer_flush(struct brw_context *brw, const char *file, int line); -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 6/8] i965: Drop extra flush when calling intel_miptree_map_raw().
The code that got replaced with map_raw didn't do the flush, but now map_raw() is responsible for it and we don't have to worry about it. --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index 5f8041f..2f5e04f 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1735,7 +1735,6 @@ intel_miptree_map_blit(struct brw_context *brw, goto fail; } - intel_batchbuffer_flush(brw); map-ptr = intel_miptree_map_raw(brw, map-mt); DBG(%s: %d,%d %dx%d from mt %p (%s) %d,%d = %p/%d\n, __FUNCTION__, -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/8] i965: Avoid flushing the batch for every blorp op.
This brings over the batch-wrap-prevention and aperture space checking code from the normal brw_draw.c path, so that we don't need to flush the batch every time. There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't high enough up in the state emit sequences -- before, we implicitly had one at the batch flush before any state was emitted, so Mesa's workaround emits didn't really matter. Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32) Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90). Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88) Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126) No statistically significant performance difference on unigine tropics (n=10) No statistically significant performance difference on openarena (n=755) The two apps that are hurt happen to include stalls on busy buffer objects, so I think this is an effect of missing out on an opportune flush. --- src/mesa/drivers/dri/i965/brw_blorp.cpp | 50 src/mesa/drivers/dri/i965/brw_blorp.h| 4 --- src/mesa/drivers/dri/i965/gen6_blorp.cpp | 12 src/mesa/drivers/dri/i965/gen7_blorp.cpp | 1 - 4 files changed, 50 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp b/src/mesa/drivers/dri/i965/brw_blorp.cpp index 1576ff2..c566d1d 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp @@ -21,6 +21,7 @@ * IN THE SOFTWARE. */ +#include errno.h #include intel_batchbuffer.h #include intel_fbo.h @@ -191,6 +192,26 @@ intel_hiz_exec(struct brw_context *brw, struct intel_mipmap_tree *mt, void brw_blorp_exec(struct brw_context *brw, const brw_blorp_params *params) { + struct gl_context *ctx = brw-ctx; + uint32_t estimated_max_batch_usage = 1500; + bool check_aperture_failed_once = false; + + /* Flush the sampler and render caches. We definitely need to flush the +* sampler cache so that we get updated contents from the render cache for +* the glBlitFramebuffer() source. Also, we are sometimes warned in the +* docs to flush the cache between reinterpretations of the same surface +* data with different formats, which blorp does for stencil and depth +* data. +*/ + intel_batchbuffer_emit_mi_flush(brw); + +retry: + intel_batchbuffer_require_space(brw, estimated_max_batch_usage, false); + intel_batchbuffer_save_state(brw); + drm_intel_bo *saved_bo = brw-batch.bo; + uint32_t saved_used = brw-batch.used; + uint32_t saved_state_batch_offset = brw-batch.state_batch_offset; + switch (brw-gen) { case 6: gen6_blorp_exec(brw, params); @@ -204,6 +225,35 @@ brw_blorp_exec(struct brw_context *brw, const brw_blorp_params *params) break; } + /* Make sure we didn't wrap the batch unintentionally, and make sure we +* reserved enough space that a wrap will never happen. +*/ + assert(brw-batch.bo == saved_bo); + assert((brw-batch.used - saved_used) * 4 + + (saved_state_batch_offset - brw-batch.state_batch_offset) + estimated_max_batch_usage); + /* Shut up compiler warnings on release build */ + (void)saved_bo; + (void)saved_used; + (void)saved_state_batch_offset; + + /* Check if the blorp op we just did would make our batch likely to fail to +* map all the BOs into the GPU at batch exec time later. If so, flush the +* batch and try again with nothing else in the batch. +*/ + if (dri_bufmgr_check_aperture_space(brw-batch.bo, 1)) { + if (!check_aperture_failed_once) { + check_aperture_failed_once = true; + intel_batchbuffer_reset_to_saved(brw); + intel_batchbuffer_flush(brw); + goto retry; + } else { + int ret = intel_batchbuffer_flush(brw); + WARN_ONCE(ret == -ENOSPC, + i965: blorp emit exceeded available aperture space\n); + } + } + if (unlikely(brw-always_flush_batch)) intel_batchbuffer_flush(brw); diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h b/src/mesa/drivers/dri/i965/brw_blorp.h index dceb4fc..e03e27f 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp.h +++ b/src/mesa/drivers/dri/i965/brw_blorp.h @@ -370,10 +370,6 @@ void gen6_blorp_init(struct brw_context *brw); void -gen6_blorp_emit_batch_head(struct brw_context *brw, - const brw_blorp_params *params); - -void gen6_blorp_emit_state_base_address(struct brw_context *brw, const brw_blorp_params *params); diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp b/src/mesa/drivers/dri/i965/gen6_blorp.cpp index da523e5..87b1d2b 100644 --- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp +++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp @@ -45,17 +45,6 @@ * sizeof(float)) /** \} */ -void -gen6_blorp_emit_batch_head(struct brw_context *brw, -
[Mesa-dev] [PATCH 5/8] i965: Make a slight distinction in perf debug for BOs versus miptrees.
--- src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c index f8cf96f..5f8041f 100644 --- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c +++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c @@ -1635,7 +1635,7 @@ intel_miptree_map_raw(struct brw_context *brw, struct intel_mipmap_tree *mt) if (unlikely(INTEL_DEBUG DEBUG_PERF)) { if (drm_intel_bo_busy(bo)) { - perf_debug(Mapping a busy BO, causing a stall on the GPU.\n); + perf_debug(Mapping a busy miptree, causing a stall on the GPU.\n); } } -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 10/15] mesa: Implement KHR_debug ObjectLabel functions
On Tuesday 27 August 2013, Timothy Arceri wrote: V2: fixed indentation of comment Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/objectlabel.c | 277 src/mesa/main/objectlabel.h | 61 ++ 2 files changed, 338 insertions(+) create mode 100644 src/mesa/main/objectlabel.c create mode 100644 src/mesa/main/objectlabel.h diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c new file mode 100644 index 000..78f9b33 --- /dev/null +++ b/src/mesa/main/objectlabel.c @@ -0,0 +1,277 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (C) 2013 Timothy Arceri All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + + +#include arrayobj.h +#include bufferobj.h +#include context.h +#include dlist.h +#include enums.h +#include fbobject.h +#include objectlabel.h +#include queryobj.h +#include samplerobj.h +#include shaderobj.h +#include syncobj.h +#include texobj.h +#include transformfeedback.h + + +/** + * Helper for _mesa_ObjectLabel() and _mesa_ObjectPtrLabel(). + */ +static void +set_label(struct gl_context *ctx, char **labelPtr, const char *label, + int length, const char *caller) +{ + if (*labelPtr) { + /* free old label string */ + free(*labelPtr); You should set *labelPtr to NULL here, since the code below is not guaranteed to assign it. + } + + if (label) { + /* set new label string */ + + if (length = 0) { + /* explicit length */ + *labelPtr = (char *) malloc(length); + if (*labelPtr) { +memcpy(*labelPtr, label, length); + } The length given by the client is not required to include a terminating null-character. + } + else { + /* null-terminated string */ + int len = strlen(label); + if (len = MAX_LABEL_LENGTH) { +/* An INVALID_VALUE error is generated if the number of characters + * in label, excluding the null terminator when length is + * negative, is not less than the value of MAX_LABEL_LENGTH. + */ +_mesa_error(ctx, GL_INVALID_VALUE, +%s(length=%d, which is not less than +GL_MAX_LABEL_LENGTH=%d), caller, length, +MAX_LABEL_LENGTH); This error should also be generated when the client specifies an explicit length that exceeds MAX_LABEL_LENGTH. Fredrik ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
Am 27.08.2013 23:52, schrieb Vadim Girlin: On 08/28/2013 12:43 AM, Marek Olšák wrote: Shader variants are BAD, BAD, BAD. Have you ever played an AAA game with a Mesa driver that likes to compile shader variants on first use? It's HORRIBLE. I don't think that shader variants are bad, but it's definitely bad when we are compiling variants that are never used. Currently glxgears compiles 18 ps/vs shaders. In my branch with initial GS support [1] I switched handling of the shaders to deferred compilation, that is, shaders are compiled only before the actual draw. I found later that it's not really required for GS, but IIRC this change results in only 5 shaders being compiled for glxgears instead of 18. It seems most of the useless variants are results of state changes between creation of the shader state (initial compilation) and actual draw call. I had some concerns about increased overhead with those changes, and it's actually noticeable with drawoverhead demo, but I didn't see any regressions with a few real apps that I tested, e.g. glxgears even showed slightly better performance with these changes. Probably I also implemented it in a not very optimal way (I was mostly concentrated on GS support) and the overhead can be reduced. One more thing is duplicate shaders, I've analyzed shader dumps from Unigine Heaven 3.0 some time ago and found that from about 320 compiled shaders, only about 180 (50%) were unique, others were duplicates (detected by comparing the bytecode dumps for them in an automated way), maybe they had different shader keys (which still resulted in the same bytecode), but I suspect duplicate pipe shaders were also involved. Unfortunately I didn't have a time to investigate it more thoroughly since then. So my point is that we don't really need to eliminate shader variants, first we need to eliminate compilation of unused variants and duplicate shaders. Also we might want to consider offloading of the compilation to separate thread(s) and caching of shader binaries between runs. Hmm ok that seems a way more complicated problem than I thought :-). Compile early and you might compile variants you will never use, compile late and the delay might be noticeable. I just thought it might be unlikely you'd actually need two variants - e.g. some depth exporting shader is probably unlikely to use alpha test. But ok I guess it shouldn't write color in this case, so even then it might never be worth bothering. Was just a random idea ;-). Roland Vadim [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders What the patch does is probably the right solution. At least alpha-test state changes don't cause shader recompilation and re-binding, which also negatively affects performance. Ideally we shouldn't depend on the framebuffer state at all, but we need to emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we should always be fine with key.nr_cbufs forced to 8 for any shader without that property. I expect app developers to do the right thing and not write outputs they don't need. Marek On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned opcode; int i, j, k, r = 0; int next_pos_base = 60, next_param_base = 0; + int max_color_exports = MAX2(key.nr_cbufs, 1); /* Declarations used by llvm code */ bool use_llvm = false;
[Mesa-dev] [Bug 60737] In GLSL ES, a missing FS precision qualifier does not generate an error
https://bugs.freedesktop.org/show_bug.cgi?id=60737 Ian Romanick i...@freedesktop.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #3 from Ian Romanick i...@freedesktop.org --- This should be fixed by: commit cabd45773b58d6aa48202da1cdd8cf1a6b856c53 Author: Ian Romanick ian.d.roman...@intel.com Date: Fri Aug 9 15:17:18 2013 -0700 glsl: Track existence of default float precision in GLSL ES fragment shaders This is required by the spec, and it's a bit tricky because the default precision is scoped. As a result, I'm slightly abusing the symbol table. Fixes piglit no-default-float-precision.frag tests and the piglit default-precision-nested-scope-0[1234].frag tests that are currently on the piglit mailing list for review. On IRC I got confirmation from cwabbot that ARM (Mali T6xx and T400) enforces this requirement and from kusma that NVIDIA (Tegra2) enforces this requirement. We should be safe from regressing shipping applications. Signed-off-by: Ian Romanick ian.d.roman...@intel.com Reviewed-by: Kenneth Graunke kenn...@whitecape.org Cc: 9.2 mesa-sta...@lists.freedesktop.org -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
First, you won't really see any significant continual difference in frame rate no matter how many shader variants you have unless you are very CPU-bound. The problem is shader compilation on the first use, that's where you get a big hiccup. Try Skyrim for example: You have to first look around and see every object that's around you and get unpleasant stuttering before you can actually go on and play the game. Yes, this also Wine's fault that it compiles shaders on the first use too, but we don't have to be as bad as Wine, do we? Valve also reported shader recompilations on the first use being a serious issue with open source drivers. Marek On Tue, Aug 27, 2013 at 11:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On 08/28/2013 12:43 AM, Marek Olšák wrote: Shader variants are BAD, BAD, BAD. Have you ever played an AAA game with a Mesa driver that likes to compile shader variants on first use? It's HORRIBLE. I don't think that shader variants are bad, but it's definitely bad when we are compiling variants that are never used. Currently glxgears compiles 18 ps/vs shaders. In my branch with initial GS support [1] I switched handling of the shaders to deferred compilation, that is, shaders are compiled only before the actual draw. I found later that it's not really required for GS, but IIRC this change results in only 5 shaders being compiled for glxgears instead of 18. It seems most of the useless variants are results of state changes between creation of the shader state (initial compilation) and actual draw call. I had some concerns about increased overhead with those changes, and it's actually noticeable with drawoverhead demo, but I didn't see any regressions with a few real apps that I tested, e.g. glxgears even showed slightly better performance with these changes. Probably I also implemented it in a not very optimal way (I was mostly concentrated on GS support) and the overhead can be reduced. One more thing is duplicate shaders, I've analyzed shader dumps from Unigine Heaven 3.0 some time ago and found that from about 320 compiled shaders, only about 180 (50%) were unique, others were duplicates (detected by comparing the bytecode dumps for them in an automated way), maybe they had different shader keys (which still resulted in the same bytecode), but I suspect duplicate pipe shaders were also involved. Unfortunately I didn't have a time to investigate it more thoroughly since then. So my point is that we don't really need to eliminate shader variants, first we need to eliminate compilation of unused variants and duplicate shaders. Also we might want to consider offloading of the compilation to separate thread(s) and caching of shader binaries between runs. Vadim [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders What the patch does is probably the right solution. At least alpha-test state changes don't cause shader recompilation and re-binding, which also negatively affects performance. Ideally we shouldn't depend on the framebuffer state at all, but we need to emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we should always be fine with key.nr_cbufs forced to 8 for any shader without that property. I expect app developers to do the right thing and not write outputs they don't need. Marek On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H instructions are not placed in the same TEX clause with corresponding SAMPLE_G. src/gallium/drivers/r600/r600_shader.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/r600/r600_shader.c b/src/gallium/drivers/r600/r600_shader.c index 300b5c4..f7eab76 100644 --- a/src/gallium/drivers/r600/r600_shader.c +++ b/src/gallium/drivers/r600/r600_shader.c @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen *rscreen, unsigned
[Mesa-dev] [PATCH] radeonsi: flush TC at the end of CS
TC can be used for writing and therefore should be flushed. --- src/gallium/drivers/radeonsi/r600_hw_context.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 5631bdb..1fe6814 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -167,7 +167,7 @@ void si_need_cs_space(struct r600_context *ctx, unsigned num_dw, } } -static void r600_flush_framebuffer(struct r600_context *ctx) +static void r600_flush_caches(struct r600_context *ctx) { struct si_pm4_state *pm4; @@ -188,7 +188,9 @@ static void r600_flush_framebuffer(struct r600_context *ctx) S_0085F0_CB6_DEST_BASE_ENA(1) | S_0085F0_CB7_DEST_BASE_ENA(1) | S_0085F0_DB_ACTION_ENA(1) | - S_0085F0_DB_DEST_BASE_ENA(1)); + S_0085F0_DB_DEST_BASE_ENA(1) | + S_0085F0_TC_ACTION_ENA(1) | + S_0085F0_TCL1_ACTION_ENA(1)); si_cmd_flush_and_inv_cb_meta(pm4); si_pm4_emit(ctx, pm4); @@ -223,7 +225,7 @@ void si_context_flush(struct r600_context *ctx, unsigned flags) } #endif - r600_flush_framebuffer(ctx); + r600_flush_caches(ctx); /* partial flush is needed to avoid lockups on some chips with user fences */ cs-buf[cs-cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0); -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: flush TC at the end of CS
Please disregard this patch. I didn't mean to send this one. Marek On Wed, Aug 28, 2013 at 1:00 AM, Marek Olšák mar...@gmail.com wrote: TC can be used for writing and therefore should be flushed. --- src/gallium/drivers/radeonsi/r600_hw_context.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c b/src/gallium/drivers/radeonsi/r600_hw_context.c index 5631bdb..1fe6814 100644 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c @@ -167,7 +167,7 @@ void si_need_cs_space(struct r600_context *ctx, unsigned num_dw, } } -static void r600_flush_framebuffer(struct r600_context *ctx) +static void r600_flush_caches(struct r600_context *ctx) { struct si_pm4_state *pm4; @@ -188,7 +188,9 @@ static void r600_flush_framebuffer(struct r600_context *ctx) S_0085F0_CB6_DEST_BASE_ENA(1) | S_0085F0_CB7_DEST_BASE_ENA(1) | S_0085F0_DB_ACTION_ENA(1) | - S_0085F0_DB_DEST_BASE_ENA(1)); + S_0085F0_DB_DEST_BASE_ENA(1) | + S_0085F0_TC_ACTION_ENA(1) | + S_0085F0_TCL1_ACTION_ENA(1)); si_cmd_flush_and_inv_cb_meta(pm4); si_pm4_emit(ctx, pm4); @@ -223,7 +225,7 @@ void si_context_flush(struct r600_context *ctx, unsigned flags) } #endif - r600_flush_framebuffer(ctx); + r600_flush_caches(ctx); /* partial flush is needed to avoid lockups on some chips with user fences */ cs-buf[cs-cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0); -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] draw: fix segfaults with aaline and aapoint stages disabled
There are drivers not using these optional stages. Broken by a3ae5dc7dd5c2f8893f86a920247e690e550ebd4. Cc: mesa-sta...@lists.freedesktop.org --- src/gallium/auxiliary/draw/draw_context.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/draw/draw_context.c b/src/gallium/auxiliary/draw/draw_context.c index d1fac0c..641dd82 100644 --- a/src/gallium/auxiliary/draw/draw_context.c +++ b/src/gallium/auxiliary/draw/draw_context.c @@ -564,8 +564,10 @@ draw_prepare_shader_outputs(struct draw_context *draw) draw_remove_extra_vertex_attribs(draw); draw_prim_assembler_prepare_outputs(draw-ia); draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled); - draw_aapoint_prepare_outputs(draw, draw-pipeline.aapoint); - draw_aaline_prepare_outputs(draw, draw-pipeline.aaline); + if (draw-pipeline.aapoint) + draw_aapoint_prepare_outputs(draw, draw-pipeline.aapoint); + if (draw-pipeline.aaline) + draw_aaline_prepare_outputs(draw, draw-pipeline.aaline); } /** -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
On 08/28/2013 02:28 AM, Roland Scheidegger wrote: Am 27.08.2013 23:52, schrieb Vadim Girlin: On 08/28/2013 12:43 AM, Marek Olšák wrote: Shader variants are BAD, BAD, BAD. Have you ever played an AAA game with a Mesa driver that likes to compile shader variants on first use? It's HORRIBLE. I don't think that shader variants are bad, but it's definitely bad when we are compiling variants that are never used. Currently glxgears compiles 18 ps/vs shaders. In my branch with initial GS support [1] I switched handling of the shaders to deferred compilation, that is, shaders are compiled only before the actual draw. I found later that it's not really required for GS, but IIRC this change results in only 5 shaders being compiled for glxgears instead of 18. It seems most of the useless variants are results of state changes between creation of the shader state (initial compilation) and actual draw call. I had some concerns about increased overhead with those changes, and it's actually noticeable with drawoverhead demo, but I didn't see any regressions with a few real apps that I tested, e.g. glxgears even showed slightly better performance with these changes. Probably I also implemented it in a not very optimal way (I was mostly concentrated on GS support) and the overhead can be reduced. One more thing is duplicate shaders, I've analyzed shader dumps from Unigine Heaven 3.0 some time ago and found that from about 320 compiled shaders, only about 180 (50%) were unique, others were duplicates (detected by comparing the bytecode dumps for them in an automated way), maybe they had different shader keys (which still resulted in the same bytecode), but I suspect duplicate pipe shaders were also involved. Unfortunately I didn't have a time to investigate it more thoroughly since then. So my point is that we don't really need to eliminate shader variants, first we need to eliminate compilation of unused variants and duplicate shaders. Also we might want to consider offloading of the compilation to separate thread(s) and caching of shader binaries between runs. Hmm ok that seems a way more complicated problem than I thought :-). Compile early and you might compile variants you will never use, compile late and the delay might be noticeable. Compilation of unused variants is not bad if they are compiled at the game/level loading time, I think many apps are trying to compile shaders early to avoid freezes during gameplay. But trying to compile early in the driver doesn't make sense currently because it's already too late anyway, if I'm not missing something, it's deferred in mesa or state tracker. Otherwise probably it would be preferable for the driver to precompile variants that are likely to be used (but only if we really can do it early, at the loading time when shaders are created by the app). I just thought it might be unlikely you'd actually need two variants - e.g. some depth exporting shader is probably unlikely to use alpha test. But ok I guess it shouldn't write color in this case, so even then it might never be worth bothering. Was just a random idea ;-). I think it's a good idea that just needs some benchmarking to make sure that it can provide any real benefits. I only wanted to say that it can be done separately from this fix. Vadim Roland Vadim [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders What the patch does is probably the right solution. At least alpha-test state changes don't cause shader recompilation and re-binding, which also negatively affects performance. Ideally we shouldn't depend on the framebuffer state at all, but we need to emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we should always be fine with key.nr_cbufs forced to 8 for any shader without that property. I expect app developers to do the right thing and not write outputs they don't need. Marek On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and regressions cased by reordering of exports in sb) With this patch, there are no regressions with default+sb vs default. There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels, AFAICS it's a problem with llvm
[Mesa-dev] Mesa 9.2 released
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Mesa 9.2 has been released! Mesa 9.2 is a feature release that includes many updates and enhancements. The full list is available in the release notes file in docs/relnotes/9.2.html. The tag in the GIT repository for Mesa 9.2 is 'mesa-9.2'. Mesa 9.2 is available for download at ftp://freedesktop.org/pub/mesa/9.2/ md5sums: 4f93c6475ec656fc1f7b93aeffc9b6c4 MesaLib-9.2.0.tar.gz 4185b6aae890bc62a964f4b24cc1aca8 MesaLib-9.2.0.tar.bz2 3bc5339bc98b9c3ffd14e3a8eca4 MesaLib-9.2.0.zip I have verified building from the .tar.bz2 file by doing: tar -xjf MesaLib-9.2.tar.bz2 cd Mesa-9.2 ./configure make -j6 I have also verified that I pushed the tag. REALLY! -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.13 (GNU/Linux) iEYEARECAAYFAlIdN4cACgkQX1gOwKyEAw9DYgCfa1ovOpuqnN/wefXyz5t5YVUD 6lwAnjPsrnTkm+lzpO3P/KUutLihS5lP =J6kN -END PGP SIGNATURE- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Initialize inout_offset parameter to brw_search_cache().
On Wed, Jul 24, 2013 at 10:02:22AM -0700, Kenneth Graunke wrote: On 07/24/2013 09:33 AM, Paul Berry wrote: Two callers of brw_search_cache() weren't initializing that function's inout_offset parameter: brw_blorp_const_color_params::get_wm_prog() and brw_blorp_const_color_params::get_wm_prog(). That's a benign problem, since the only effect of not initializing inout_offset prior to calling brw_search_cache() is that the bit corresponding to cache_id in brw-state.dirty.cache may not be set reliably. This is ok, since the cache_id's used by brw_blorp_const_color_params::get_wm_prog() and brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are not used. However, failing to initialize this parameter causes valgrind to complain. So let's go ahead and fix it to reduce valgrind noise. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779 Can we have this commit in the 9.2 branch? It was added in the release blocker bug but never cherry-picked for 9.2. Thanks! -- Damien ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/8] i965: Directly call intel_batchbuffer_flush() after i915 split.
On 08/27/2013 03:21 PM, Eric Anholt wrote: intel_flush() now did nothing except call through (and intel_batchbuffer_flush() does the no-op check, too!) --- src/mesa/drivers/dri/i965/gen6_blorp.cpp | 4 +--- src/mesa/drivers/dri/i965/intel_buffer_objects.c | 2 +- src/mesa/drivers/dri/i965/intel_context.c| 17 - src/mesa/drivers/dri/i965/intel_context.h| 3 --- src/mesa/drivers/dri/i965/intel_mipmap_tree.c| 3 +-- src/mesa/drivers/dri/i965/intel_pixel_copy.c | 3 ++- src/mesa/drivers/dri/i965/intel_syncobj.c| 2 +- 7 files changed, 10 insertions(+), 24 deletions(-) Series is: Reviewed-by: Kenneth Graunke kenn...@whitecape.org When running glxgears with INTEL_DEBUG=bat (and batch decoding turned off) on master, I see this: intel_context.c:366: Batchbuffer flush with 5388b (pkt) + 3424b (state) = 8812b (26.9%) intel_screen.c:165: Batchbuffer flush with 748b (pkt) + 448b (state) = 1196b (3.6%) gen6_blorp.cpp:57: Batchbuffer flush with 16b (pkt) +0b (state) = 16b (0.0%) gen6_blorp.cpp:57: Batchbuffer flush with 740b (pkt) + 160b (state) = 900b (2.7%) ..repeating.. Which is pretty ridiculous...a 16 byte batch? Really? And we have to switch to the kernel to execbuffer that. The others aren't much better. With this series, we get a single batch per frame: intel_screen.c:176: Batchbuffer flush with 6940b (pkt) + 4096b (state) = 11036b (33.7%) ..repeating.. Clearly much better. Every other application I've looked at sees similar benefits - much better batch utilization, much fewer tiny batches. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs
On 08/28/2013 02:59 AM, Marek Olšák wrote: First, you won't really see any significant continual difference in frame rate no matter how many shader variants you have unless you are very CPU-bound. The problem is shader compilation on the first use, that's where you get a big hiccup. Try Skyrim for example: You have to first look around and see every object that's around you and get unpleasant stuttering before you can actually go on and play the game. Yes, this also Wine's fault that it compiles shaders on the first use too, but we don't have to be as bad as Wine, do we? Valve also reported shader recompilations on the first use being a serious issue with open source drivers. I perfectly understand that deferred compilation is exactly the problem that makes the games freeze due to shader compilation on first use when something new appears on the screen, but I don't think we can solve this problem in the *driver* by trying to compile early, because AFAICS currently the shaders are passed to the driver too late anyway, and this happens not only with wine. E.g. when I run Heaven in a window with MESA_GLSL=dump R600_DEBUG=ps,vs, so that I can see Heaven's window and console output at the same time, what I see is that most of GL dumps happen while Heaven shows splash screen with loading progress, but most of the driver's dumps appear on the first frame and few more times during benchmark. It looks like compilation is deferred somewhere in the stack before the driver, or am I missing something? Vadim Marek On Tue, Aug 27, 2013 at 11:52 PM, Vadim Girlin vadimgir...@gmail.com wrote: On 08/28/2013 12:43 AM, Marek Olšák wrote: Shader variants are BAD, BAD, BAD. Have you ever played an AAA game with a Mesa driver that likes to compile shader variants on first use? It's HORRIBLE. I don't think that shader variants are bad, but it's definitely bad when we are compiling variants that are never used. Currently glxgears compiles 18 ps/vs shaders. In my branch with initial GS support [1] I switched handling of the shaders to deferred compilation, that is, shaders are compiled only before the actual draw. I found later that it's not really required for GS, but IIRC this change results in only 5 shaders being compiled for glxgears instead of 18. It seems most of the useless variants are results of state changes between creation of the shader state (initial compilation) and actual draw call. I had some concerns about increased overhead with those changes, and it's actually noticeable with drawoverhead demo, but I didn't see any regressions with a few real apps that I tested, e.g. glxgears even showed slightly better performance with these changes. Probably I also implemented it in a not very optimal way (I was mostly concentrated on GS support) and the overhead can be reduced. One more thing is duplicate shaders, I've analyzed shader dumps from Unigine Heaven 3.0 some time ago and found that from about 320 compiled shaders, only about 180 (50%) were unique, others were duplicates (detected by comparing the bytecode dumps for them in an automated way), maybe they had different shader keys (which still resulted in the same bytecode), but I suspect duplicate pipe shaders were also involved. Unfortunately I didn't have a time to investigate it more thoroughly since then. So my point is that we don't really need to eliminate shader variants, first we need to eliminate compilation of unused variants and duplicate shaders. Also we might want to consider offloading of the compilation to separate thread(s) and caching of shader binaries between runs. Vadim [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders What the patch does is probably the right solution. At least alpha-test state changes don't cause shader recompilation and re-binding, which also negatively affects performance. Ideally we shouldn't depend on the framebuffer state at all, but we need to emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we should always be fine with key.nr_cbufs forced to 8 for any shader without that property. I expect app developers to do the right thing and not write outputs they don't need. Marek On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote: Not that I'm qualified to review r600 code, but couldn't you create different shader variants depending on whether you need alpha test? At least I would assume shader exports aren't free. Roland Am 27.08.2013 19:56, schrieb Vadim Girlin: We need to export at least one color if the shader writes it, even when nr_cbufs==0. Signed-off-by: Vadim Girlin vadimgir...@gmail.com --- Tested on evergreen with multiple combinations of backends - no regressions, fixes some tests: default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff llvm - fixes about 25 tests related to depth/stencil llvm+sb- fixes
Re: [Mesa-dev] [PATCH] i965: Silece usused variable warning in release build
On 08/27/2013 03:11 PM, Chad Versace wrote: Use `(void) success;` to silence this warning: i965/brw_vs.c:481:12: warning: unused variable 'success' [-Wunused-variable] bool success = do_vs_prog(brw, ctx-Shader.CurrentVertexProgram, Signed-off-by: Chad Versace chad.vers...@linux.intel.com Typo in commit title (Silece - Silence), otherwise, this is at least better than it was. Reviewed-by: Kenneth Graunke kenn...@whitecape.org We might want to come up with a better strategy for failing to compile, but...someday. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Initialize inout_offset parameter to brw_search_cache().
On 27 August 2013 16:51, Damien Lespiau damien.lesp...@intel.com wrote: On Wed, Jul 24, 2013 at 10:02:22AM -0700, Kenneth Graunke wrote: On 07/24/2013 09:33 AM, Paul Berry wrote: Two callers of brw_search_cache() weren't initializing that function's inout_offset parameter: brw_blorp_const_color_params::get_wm_prog() and brw_blorp_const_color_params::get_wm_prog(). That's a benign problem, since the only effect of not initializing inout_offset prior to calling brw_search_cache() is that the bit corresponding to cache_id in brw-state.dirty.cache may not be set reliably. This is ok, since the cache_id's used by brw_blorp_const_color_params::get_wm_prog() and brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are not used. However, failing to initialize this parameter causes valgrind to complain. So let's go ahead and fix it to reduce valgrind noise. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779 Can we have this commit in the 9.2 branch? It was added in the release blocker bug but never cherry-picked for 9.2. Thanks! -- Damien Is there a reason for wanting this cherry-picked to 9.2 other than to follow procedure? As the commit message notes it's a benign problem--all it does is cause false positives from valgrind. I'd rather not mess with 9.2 if there's not going to be any benefit to users. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 50482] git mesa fails to build
https://bugs.freedesktop.org/show_bug.cgi?id=50482 Jos van Wolput wol...@onsneteindhoven.nl changed: What|Removed |Added OS|Solaris |Linux (All) -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH V2 10/15] mesa: Implement KHR_debug ObjectLabel functions
On 28/08/13 08:25, Fredrik Höglund wrote: On Tuesday 27 August 2013, Timothy Arceri wrote: V2: fixed indentation of comment Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/objectlabel.c | 277 src/mesa/main/objectlabel.h | 61 ++ 2 files changed, 338 insertions(+) create mode 100644 src/mesa/main/objectlabel.c create mode 100644 src/mesa/main/objectlabel.h diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c new file mode 100644 index 000..78f9b33 --- /dev/null +++ b/src/mesa/main/objectlabel.c @@ -0,0 +1,277 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (C) 2013 Timothy Arceri All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + + +#include arrayobj.h +#include bufferobj.h +#include context.h +#include dlist.h +#include enums.h +#include fbobject.h +#include objectlabel.h +#include queryobj.h +#include samplerobj.h +#include shaderobj.h +#include syncobj.h +#include texobj.h +#include transformfeedback.h + + +/** + * Helper for _mesa_ObjectLabel() and _mesa_ObjectPtrLabel(). + */ +static void +set_label(struct gl_context *ctx, char **labelPtr, const char *label, + int length, const char *caller) +{ + if (*labelPtr) { + /* free old label string */ + free(*labelPtr); You should set *labelPtr to NULL here, since the code below is not guaranteed to assign it. Ok I'll create an updated patch with this. + } + + if (label) { + /* set new label string */ + + if (length = 0) { + /* explicit length */ + *labelPtr = (char *) malloc(length); + if (*labelPtr) { +memcpy(*labelPtr, label, length); + } The length given by the client is not required to include a terminating null-character. Correct. There is no requirement for a terminating null-charater in this code so I'm not sure what your trying to say. The null-charater is applied in the GetObjectLabel call where the client is required to provide bufsize which includes the null-terminator. + } + else { + /* null-terminated string */ + int len = strlen(label); + if (len = MAX_LABEL_LENGTH) { +/* An INVALID_VALUE error is generated if the number of characters + * in label, excluding the null terminator when length is + * negative, is not less than the value of MAX_LABEL_LENGTH. + */ +_mesa_error(ctx, GL_INVALID_VALUE, +%s(length=%d, which is not less than +GL_MAX_LABEL_LENGTH=%d), caller, length, +MAX_LABEL_LENGTH); This error should also be generated when the client specifies an explicit length that exceeds MAX_LABEL_LENGTH. The comment in this code is from the spec. The spec only ever says to generate this error when length is negative. Fredrik ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V2 03/15] mesa: Add a clone function to mesa hash
V2: const qualify table parameter Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/hash.c | 28 src/mesa/main/hash.h |3 +++ 2 files changed, 31 insertions(+) diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c index 6591af9..b31fd48 100644 --- a/src/mesa/main/hash.c +++ b/src/mesa/main/hash.c @@ -302,6 +302,34 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table, /** + * Clone all entries in a hash table, into a new table. + * + * \param table the hash table to clone + */ +struct _mesa_HashTable * +_mesa_HashClone(const struct _mesa_HashTable *table) +{ + /* cast-away const */ + struct _mesa_HashTable *table2 = (struct _mesa_HashTable *) table; + struct hash_entry *entry; + struct _mesa_HashTable *clonetable; + + ASSERT(table); + _glthread_LOCK_MUTEX(table2-Mutex); + + clonetable = _mesa_NewHashTable(); + assert(clonetable); + hash_table_foreach(table-ht, entry) { + _mesa_HashInsert(clonetable, (GLint)(uintptr_t)entry-key, entry-data); + } + + _glthread_UNLOCK_MUTEX(table2-Mutex); + + return clonetable; +} + + +/** * Walk over all entries in a hash table, calling callback function for each. * Note: we use a separate mutex in this function to avoid a recursive * locking deadlock (in case the callback calls _mesa_HashRemove()) and to diff --git a/src/mesa/main/hash.h b/src/mesa/main/hash.h index 142d284..b34f328 100644 --- a/src/mesa/main/hash.h +++ b/src/mesa/main/hash.h @@ -50,6 +50,9 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table, void (*callback)(GLuint key, void *data, void *userData), void *userData); +extern struct _mesa_HashTable * +_mesa_HashClone(const struct _mesa_HashTable *table); + extern void _mesa_HashWalk(const struct _mesa_HashTable *table, void (*callback)(GLuint key, void *data, void *userData), -- 1.7.9.5 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Initialize inout_offset parameter to brw_search_cache().
On 08/27/2013 06:21 PM, Paul Berry wrote: On 27 August 2013 16:51, Damien Lespiau damien.lesp...@intel.com wrote: On Wed, Jul 24, 2013 at 10:02:22AM -0700, Kenneth Graunke wrote: On 07/24/2013 09:33 AM, Paul Berry wrote: Two callers of brw_search_cache() weren't initializing that function's inout_offset parameter: brw_blorp_const_color_params::get_wm_prog() and brw_blorp_const_color_params::get_wm_prog(). That's a benign problem, since the only effect of not initializing inout_offset prior to calling brw_search_cache() is that the bit corresponding to cache_id in brw-state.dirty.cache may not be set reliably. This is ok, since the cache_id's used by brw_blorp_const_color_params::get_wm_prog() and brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are not used. However, failing to initialize this parameter causes valgrind to complain. So let's go ahead and fix it to reduce valgrind noise. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779 Can we have this commit in the 9.2 branch? It was added in the release blocker bug but never cherry-picked for 9.2. Thanks! -- Damien Is there a reason for wanting this cherry-picked to 9.2 other than to follow procedure? As the commit message notes it's a benign problem--all it does is cause false positives from valgrind. I'd rather not mess with 9.2 if there's not going to be any benefit to users. People working on GL programs want to be able to use valgrind to debug their own code. They don't want to see driver issues - not only is it confusing, it also means extra things to skip over every time they run valgrind on their program. So, I'm strongly in favor of cherry-picking it. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glx: make the interval of LIBGL_SHOW_FPS adjustable
LIBGL_SHOW_FPS=1 makes GLX print FPS every second while other values do nothing. Extend it so that LIBGL_SHOW_FPS=N will print the FPS every N seconds. --- src/glx/dri2_glx.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c index c54edac..54fc21c 100644 --- a/src/glx/dri2_glx.c +++ b/src/glx/dri2_glx.c @@ -95,7 +95,7 @@ struct dri2_screen { void *driver; int fd; - Bool show_fps; + int show_fps_interval; }; struct dri2_context @@ -764,6 +764,8 @@ unsigned dri2GetSwapEventType(Display* dpy, XID drawable) static void show_fps(struct dri2_drawable *draw) { + const int interval = + ((struct dri2_screen *) draw-base.psc)-show_fps_interval; struct timeval tv; uint64_t current_time; @@ -772,7 +774,7 @@ static void show_fps(struct dri2_drawable *draw) draw-frames++; - if (draw-previous_time + 100 = current_time) { + if (draw-previous_time + interval * 100 = current_time) { if (draw-previous_time) { fprintf(stderr, libGL: FPS = %.1f\n, ((uint64_t)draw-frames * 100) / @@ -859,7 +861,7 @@ dri2SwapBuffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor, target_msc, divisor, remainder); } -if (psc-show_fps) { +if (psc-show_fps_interval) { show_fps(priv); } @@ -1283,7 +1285,9 @@ dri2CreateScreen(int screen, struct glx_display * priv) free(deviceName); tmp = getenv(LIBGL_SHOW_FPS); - psc-show_fps = tmp strcmp(tmp, 1) == 0; + psc-show_fps_interval = (tmp) ? atoi(tmp) : 0; + if (psc-show_fps_interval 0) + psc-show_fps_interval = 0; return psc-base; -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls
Hi, I am looking at the code in lower_clip_distance.cpp ir_visitor_status lower_clip_distance_visitor::visit_leave(ir_call *ir) { void *ctx = ralloc_parent(ir); const exec_node *formal_param_node = ir-callee-parameters.head; const exec_node *actual_param_node = ir-actual_parameters.head; while (!actual_param_node-is_tail_sentinel()) { ir_variable *formal_param = (ir_variable *) formal_param_node; ir_rvalue *actual_param = (ir_rvalue *) actual_param_node; /* Advance formal_param_node and actual_param_node now so that we can * safely replace actual_param with another node, if necessary, below. */ formal_param_node = formal_param_node-next; actual_param_node = actual_param_node-next; is_tail_sentinel() is true when next == NULL. this means this loop will not be executed for the last formal param in the list. Is this by design? In my case I need to visit all function parameters. I think it should by simply while (actual_param_node) { Also, what is the point of exec_node::get_head() and exec_node::get_next() accessor methods if they are not used? On Tue, Aug 27, 2013 at 3:15 PM, Matt Turner matts...@gmail.com wrote: On Tue, Aug 27, 2013 at 2:03 PM, Dominik Behr db...@chromium.org wrote: I am not sure about MAX2, is it guaranteed to be converted to branchless conditional assignment? I don't think there's a guarantee, no. Using MAX2 just makes the code easier to immediately understand. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH V3 10/15] mesa: Implement KHR_debug ObjectLabel functions
V3: make sure to add null terminator when setting label, generate error when the client specifies an explicit length that exceeds MAX_LABEL_LENGTH, set label pointer to NULL when freed, and output correct length in MAX_LABEL_LENGTH error message. V2: fixed indentation of comment Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au --- src/mesa/main/objectlabel.c | 282 +++ src/mesa/main/objectlabel.h | 61 ++ 2 files changed, 343 insertions(+) create mode 100644 src/mesa/main/objectlabel.c create mode 100644 src/mesa/main/objectlabel.h diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c new file mode 100644 index 000..0edb728 --- /dev/null +++ b/src/mesa/main/objectlabel.c @@ -0,0 +1,282 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (C) 2013 Timothy Arceri All Rights Reserved. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + + +#include arrayobj.h +#include bufferobj.h +#include context.h +#include dlist.h +#include enums.h +#include fbobject.h +#include objectlabel.h +#include queryobj.h +#include samplerobj.h +#include shaderobj.h +#include syncobj.h +#include texobj.h +#include transformfeedback.h + + +/** + * Helper for _mesa_ObjectLabel() and _mesa_ObjectPtrLabel(). + */ +static void +set_label(struct gl_context *ctx, char **labelPtr, const char *label, + int length, const char *caller) +{ + if (*labelPtr) { + /* free old label string */ + free(*labelPtr); + *labelPtr = NULL; + } + + /* set new label string */ + if (label) { + if (length = 0) { + if (length = MAX_LABEL_LENGTH) +_mesa_error(ctx, GL_INVALID_VALUE, +%s(length=%d, which is not less than +GL_MAX_LABEL_LENGTH=%d), caller, length, +MAX_LABEL_LENGTH); + + /* explicit length */ + *labelPtr = (char *) malloc(length+1); + if (*labelPtr) { +memcpy(*labelPtr, label, length); +/* length is not required to include the null terminator so + * add one just in case + */ +(*labelPtr)[length] = '\0'; + } + } + else { + int len = strlen(label); + if (len = MAX_LABEL_LENGTH) +_mesa_error(ctx, GL_INVALID_VALUE, +%s(label length=%d, which is not less than +GL_MAX_LABEL_LENGTH=%d), caller, len, +MAX_LABEL_LENGTH); + + /* null-terminated string */ + *labelPtr = _mesa_strdup(label); + } + } +} + +/** + * Helper for _mesa_GetObjectLabel() and _mesa_GetObjectPtrLabel(). + */ +static void +copy_label(char **labelPtr, char *label, int *length, int bufSize) +{ + int labelLen = 0; + + if (*labelPtr) + labelLen = strlen(*labelPtr); + + if (label) { + if (bufSize = labelLen) + labelLen = bufSize-1; + + memcpy(label, *labelPtr, labelLen); + label[labelLen] = '\0'; + } + + if (length) + *length = labelLen; +} + +/** + * Helper for _mesa_ObjectLabel() and _mesa_GetObjectLabel(). + */ +static char ** +get_label_pointer(struct gl_context *ctx, GLenum identifier, GLuint name, + const char *caller) +{ + char **labelPtr = NULL; + + switch (identifier) { + case GL_BUFFER: + { + struct gl_buffer_object *bufObj = _mesa_lookup_bufferobj(ctx, name); + if (bufObj) +labelPtr = bufObj-Label; + } + break; + case GL_SHADER: + { + struct gl_shader *shader = _mesa_lookup_shader(ctx, name); + if (shader) +labelPtr = shader-Label; + } + break; + case GL_PROGRAM: + { + struct gl_shader_program *program = +_mesa_lookup_shader_program(ctx, name); + if (program) +labelPtr = program-Label; + } + break; + case GL_VERTEX_ARRAY: + { +
Re: [Mesa-dev] [PATCH V2 10/15] mesa: Implement KHR_debug ObjectLabel functions
+ } + + if (label) { + /* set new label string */ + + if (length = 0) { + /* explicit length */ + *labelPtr = (char *) malloc(length); + if (*labelPtr) { +memcpy(*labelPtr, label, length); + } The length given by the client is not required to include a terminating null-character. Ok I see what you mean now. Fixed in V3 of the patch I just sent out. + } + else { + /* null-terminated string */ + int len = strlen(label); + if (len = MAX_LABEL_LENGTH) { +/* An INVALID_VALUE error is generated if the number of characters + * in label, excluding the null terminator when length is + * negative, is not less than the value of MAX_LABEL_LENGTH. + */ +_mesa_error(ctx, GL_INVALID_VALUE, +%s(length=%d, which is not less than +GL_MAX_LABEL_LENGTH=%d), caller, length, +MAX_LABEL_LENGTH); This error should also be generated when the client specifies an explicit length that exceeds MAX_LABEL_LENGTH. Yes you are right. The wording in the spec was confusing me a little I had to read it a couple of times before it made sense. Again fixed in V3 of the patch I just sent out. Fredrik Thanks for reviewing Fredrik Tim ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] tgsi dump and parsing
IMM[0] FLT32 { 0x, 0x, 0x, 0x } # 1.0, 3.0, 2.0, 4.0 If you use %.9g instead of %.4f then floating point numbers will be preserved without loss of precision. I see a -nan in my tests that doesn't get reparsed so I expect hex is still better. oops to list as well this time, sorry. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] tgsi dump and parsing
On Wed, Aug 28, 2013 at 3:32 PM, Dave Airlie airl...@gmail.com wrote: IMM[0] FLT32 { 0x, 0x, 0x, 0x } # 1.0, 3.0, 2.0, 4.0 If you use %.9g instead of %.4f then floating point numbers will be preserved without loss of precision. I see a -nan in my tests that doesn't get reparsed so I expect hex is still better. oops to list as well this time, sorry. Just in case you are wondering its tests/shaders/glsl-const-builtin-inversesqrt.shader_test and tests/shaders/glsl-const-builtin-normalize.shader_test that throw up the -nan in the dumps. Dave. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev