Re: [Mesa-dev] [PATCH 09/16] i965/fs: Apply conditional mod specially to split MAD/LRP.
On Monday, January 19, 2015 03:31:08 PM Matt Turner wrote:
> Otherwise we'll apply the conditional mod to only one of SIMD8
> instructions and trigger an assertion.
>
> NoDDClr/NoDDChk have the same problem but we never apply those to these
> instructions, so I'm leaving them for a later time.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 20
>  1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> index ab848f1..f35da71 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
> @@ -1656,10 +1656,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>           brw_set_default_access_mode(p, BRW_ALIGN_16);
>           if (dispatch_width == 16 && brw->gen < 8 && !brw->is_haswell) {
>              brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> -            brw_MAD(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
> +            brw_inst *f = brw_MAD(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
>              brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
> -            brw_MAD(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
> +            brw_inst *s = brw_MAD(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
>              brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
> +
> +            if (inst->conditional_mod) {
> +               brw_inst_set_cond_modifier(brw, f, inst->conditional_mod);
> +               brw_inst_set_cond_modifier(brw, s, inst->conditional_mod);
> +               inst->conditional_mod = BRW_CONDITIONAL_NONE;

Having the generator mutate the incoming IR feels dirty to me. Honestly,
it should be const...we've never changed it until now.

I see what you're trying to accomplish - bypassing the assertion failure
about conditional_mod set with more than one instruction. Maybe add a
"bool multiple_instructions_allowed" flag, set it to false before the
switch, set it true here, and check it later to skip the assert?

Seems ugly, but not as bad as mutating the IR.

I think a better solution (after this series lands!) would be to generate
two MADs/LRPs at the fs_visitor level, and just emit a single instruction
for each at the generator level. We should have the infrastructure now and
it'd let us schedule them.

> +            }
>           } else {
>              brw_MAD(p, dst, src[0], src[1], src[2]);
>           }
> @@ -1671,10 +1677,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
>           brw_set_default_access_mode(p, BRW_ALIGN_16);
>           if (dispatch_width == 16 && brw->gen < 8 && !brw->is_haswell) {
>              brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
> -            brw_LRP(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
> +            brw_inst *f = brw_LRP(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
>              brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
> -            brw_LRP(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
> +            brw_inst *s = brw_LRP(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
>              brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
> +
> +            if (inst->conditional_mod) {
> +               brw_inst_set_cond_modifier(brw, f, inst->conditional_mod);
> +               brw_inst_set_cond_modifier(brw, s, inst->conditional_mod);
> +               inst->conditional_mod = BRW_CONDITIONAL_NONE;
> +            }
>           } else {
>              brw_LRP(p, dst, src[0], src[1], src[2]);
>           }

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
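The flag-based alternative suggested in the review could look roughly like the sketch below. It is a stand-alone model, not driver code: `IrInst`, `HwInst`, `CondMod` and `generate()` are hypothetical stand-ins for fs_inst, brw_inst, the conditional-mod enum and the generator loop. The point is only that the IR instruction is taken by const reference and never written to, while a local flag records that emitting two instructions was intentional.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins; none of these are the real i965 types.
enum CondMod { COND_NONE = 0, COND_NZ = 2 };

struct IrInst {                 // plays the role of fs_inst
    bool is_3src;               // MAD/LRP-like instruction
    CondMod conditional_mod;
};

struct HwInst {                 // plays the role of brw_inst
    CondMod cond_modifier = COND_NONE;
};

// Emits one or two hardware instructions for a single IR instruction.
// The IR is const: the conditional mod is copied, never cleared.
std::vector<HwInst> generate(const IrInst &inst, bool split_simd16)
{
    std::vector<HwInst> out;
    bool multiple_instructions_allowed = false;

    if (inst.is_3src && split_simd16) {
        // Two SIMD8 halves: both must carry the conditional mod.
        out.resize(2);
        for (HwInst &hw : out)
            hw.cond_modifier = inst.conditional_mod;
        multiple_instructions_allowed = true;
    } else {
        out.resize(1);
        out[0].cond_modifier = inst.conditional_mod;
    }

    // Models the generator's sanity check, skipped when the split was deliberate.
    if (inst.conditional_mod != COND_NONE && !multiple_instructions_allowed)
        assert(out.size() == 1);

    return out;
}
```

Calling `generate()` on a split MAD yields two hardware instructions that both carry the modifier, and the caller's `IrInst` still reports its original `conditional_mod` afterwards.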
Re: [Mesa-dev] [PATCH 07/16] i965: Add is_3src() to backend_instruction.
On Monday, January 19, 2015 03:31:06 PM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp                | 10 ++
>  src/mesa/drivers/dri/i965/brw_shader.h                  |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp |  6 +-
>  3 files changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index cbdf976..c6fead7 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -678,6 +678,16 @@ backend_reg::is_accumulator() const
>  }
>
>  bool
> +backend_instruction::is_3src() const
> +{
> +   return opcode == BRW_OPCODE_LRP ||
> +          opcode == BRW_OPCODE_MAD ||
> +          opcode == BRW_OPCODE_BFE ||
> +          opcode == BRW_OPCODE_BFI2 ||
> +          opcode == BRW_OPCODE_CSEL;

Pah, manual listings of things :)

Let's do even better:

   return opcode < 128 && opcode_descs[op].nsrc == 3;

That would get
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
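For readers unfamiliar with the table-driven form, here is a minimal self-contained sketch of the idea. The opcode numbers, the `OpcodeDesc` layout and the `opcode_descs()` accessor are made up for illustration; only the shape of the check (bounds test against 128 hardware opcodes, then a source-count lookup) follows the suggestion above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical opcode numbering; the real values come from the hardware ISA.
enum Opcode { OP_MOV = 1, OP_ADD = 2, OP_MAD = 3, OP_LRP = 4, OP_VIRTUAL = 200 };

struct OpcodeDesc {
    const char *name;
    int nsrc;       // number of source operands
};

static const std::size_t NUM_HW_OPCODES = 128;

// Builds the descriptor table once; unlisted entries stay {nullptr, 0}.
static const std::array<OpcodeDesc, NUM_HW_OPCODES> &opcode_descs()
{
    static const std::array<OpcodeDesc, NUM_HW_OPCODES> table = [] {
        std::array<OpcodeDesc, NUM_HW_OPCODES> t{};
        t[OP_MOV] = {"mov", 1};
        t[OP_ADD] = {"add", 2};
        t[OP_MAD] = {"mad", 3};
        t[OP_LRP] = {"lrp", 3};
        return t;
    }();
    return table;
}

// Virtual opcodes (>= 128 here) have no table entry, so they are rejected
// before indexing.
bool is_3src(int opcode)
{
    return opcode >= 0 &&
           static_cast<std::size_t>(opcode) < NUM_HW_OPCODES &&
           opcode_descs()[opcode].nsrc == 3;
}
```

The advantage over the manual list is that a newly added three-source opcode only needs a table entry; the predicate itself does not change.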
Re: [Mesa-dev] [PATCH 05/16] i965: Add backend_instruction::can_do_cmod().
On Monday, January 19, 2015 03:31:04 PM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_shader.cpp | 45
>  src/mesa/drivers/dri/i965/brw_shader.h   |  1 +
>  2 files changed, 46 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.cpp b/src/mesa/drivers/dri/i965/brw_shader.cpp
> index d76134b..cbdf976 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_shader.cpp
> @@ -790,6 +790,51 @@ backend_instruction::can_do_saturate() const
>  }
>
>  bool
> +backend_instruction::can_do_cmod() const
> +{
> +   switch (opcode) {
> +   case BRW_OPCODE_ADD:
> +   case BRW_OPCODE_ADDC:
> +   case BRW_OPCODE_AND:
> +   case BRW_OPCODE_ASR:
> +   case BRW_OPCODE_AVG:
> +   case BRW_OPCODE_CMP:
> +   case BRW_OPCODE_CMPN:
> +   case BRW_OPCODE_DP2:
> +   case BRW_OPCODE_DP3:
> +   case BRW_OPCODE_DP4:
> +   case BRW_OPCODE_DPH:
> +   case BRW_OPCODE_F16TO32:
> +   case BRW_OPCODE_F32TO16:
> +   case BRW_OPCODE_FRC:
> +   case BRW_OPCODE_LINE:
> +   case BRW_OPCODE_LRP:
> +   case BRW_OPCODE_LZD:
> +   case BRW_OPCODE_MAC:
> +   case BRW_OPCODE_MACH:
> +   case BRW_OPCODE_MAD:
> +   case BRW_OPCODE_MOV:
> +   case BRW_OPCODE_MUL:
> +   case BRW_OPCODE_NOT:
> +   case BRW_OPCODE_OR:
> +   case BRW_OPCODE_PLN:
> +   case BRW_OPCODE_RNDD:
> +   case BRW_OPCODE_RNDE:
> +   case BRW_OPCODE_RNDU:
> +   case BRW_OPCODE_RNDZ:
> +   case BRW_OPCODE_SAD2:
> +   case BRW_OPCODE_SADA2:
> +   case BRW_OPCODE_SHL:
> +   case BRW_OPCODE_SHR:
> +   case BRW_OPCODE_SUBB:
> +   case BRW_OPCODE_XOR:
> +      return true;

I checked this function against the Gen4 and Gen8 docs, and I spied a few
missing opcodes:

- DIM (only exists on Haswell - I guess we don't have a #define yet)
- CSEL (only exists on Gen8+)
- SEL on Gen6+
- IF/WHILE on Sandybridge only

But, I suppose the conditional modifier behaves differently for SEL, CSEL,
IF, and WHILE, so it probably makes sense to omit them. Maybe add a comment
explaining SEL's absence?

Patches 1-5 are:
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

> +   default:
> +      return false;
> +   }
> +}
> +
> +bool
>  backend_instruction::reads_accumulator_implicitly() const
>  {
>     switch (opcode) {
> diff --git a/src/mesa/drivers/dri/i965/brw_shader.h b/src/mesa/drivers/dri/i965/brw_shader.h
> index 233e224..54d770e 100644
> --- a/src/mesa/drivers/dri/i965/brw_shader.h
> +++ b/src/mesa/drivers/dri/i965/brw_shader.h
> @@ -87,6 +87,7 @@ struct backend_instruction : public exec_node {
>     bool is_control_flow() const;
>     bool can_do_source_mods() const;
>     bool can_do_saturate() const;
> +   bool can_do_cmod() const;
>     bool reads_accumulator_implicitly() const;
>     bool writes_accumulator_implicitly(struct brw_context *brw) const;
Re: [Mesa-dev] [PATCH 08/16] i965/fs: Add a pass to fixup 3-src instructions that have a null dest.
On Monday, January 19, 2015 03:31:07 PM Matt Turner wrote:
> 3-src instructions can only have GRF/MRF destinations. It's really
> difficult to deal with that restriction in dead code elimination (that
> wants to give instructions null destinations to show that their result
> isn't used) while allowing 3-src instructions to have conditional mod,
> so don't, and just give them a destination before register allocation.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 14 ++
>  src/mesa/drivers/dri/i965/brw_fs.h   |  1 +
>  2 files changed, 15 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 35639de..73d722e 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3605,6 +3605,18 @@ fs_visitor::optimize()
>  }
>

Doxygen comments above functions are nice:

/**
 * Three source instruction must have a GRF/MRF destination register.
 * ARF NULL is not allowed. Fix that up by allocating a temporary GRF.
 */

With that added, this gets
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>

(I think you could drop the comment in the function, but either way's fine)

>  void
> +fs_visitor::fixup_3src_null_dest()
> +{
> +   foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
> +      if (inst->is_3src() && inst->dst.is_null()) {
> +         /* 3-src instructions can only have GRF/MRF destination. */
> +         inst->dst = fs_reg(GRF, virtual_grf_alloc(dispatch_width / 8),
> +                            inst->dst.type);
> +      }
> +   }
> +}
> +
> +void
>  fs_visitor::allocate_registers()
>  {
>     bool allocated_without_spills;
> @@ -3701,6 +3713,7 @@ fs_visitor::run_vs()
>     assign_curb_setup();
>     assign_vs_urb_setup();
>
> +   fixup_3src_null_dest();
>     allocate_registers();
>
>     return !failed;
> @@ -3780,6 +3793,7 @@ fs_visitor::run_fs()
>     assign_curb_setup();
>     assign_urb_setup();
>
> +   fixup_3src_null_dest();
>     allocate_registers();
>
>     if (failed)
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 2aa58eb..9c125a6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -423,6 +423,7 @@ public:
>     void setup_payload_gen4();
>     void setup_payload_gen6();
>     void setup_vs_payload();
> +   void fixup_3src_null_dest();
>     void assign_curb_setup();
>     void calculate_urb_setup();
>     void assign_urb_setup();
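As a rough illustration of what such a fixup pass does, the sketch below walks a flat instruction list and hands a fresh temporary register to every three-source instruction whose destination is null. `RegFile`, `Reg`, `Inst` and the `next_virtual_grf` counter are assumptions made for this example; the real pass iterates the CFG and allocates through the visitor's own virtual GRF allocator.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified IR types for illustration only.
enum RegFile { FILE_NULL, FILE_GRF, FILE_MRF };

struct Reg {
    RegFile file;
    int nr;
};

struct Inst {
    int num_sources;
    Reg dst;
};

// Gives every three-source instruction with a null destination a fresh
// temporary GRF, so later stages never see a 3-src op writing to null.
// Returns the number of instructions that were patched.
int fixup_3src_null_dest(std::vector<Inst> &insts, int &next_virtual_grf)
{
    int fixed = 0;
    for (Inst &inst : insts) {
        if (inst.num_sources == 3 && inst.dst.file == FILE_NULL) {
            inst.dst = Reg{FILE_GRF, next_virtual_grf++};
            ++fixed;
        }
    }
    return fixed;
}
```

Running it just before register allocation, as the patch does, means dead code elimination is still free to null out destinations earlier in the pipeline.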
[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #20 from Eero Tamminen <eero.t.tammi...@intel.com> ---
Are you still getting INTEL_DEBUG=perf output from them?

- if you're still getting recompile messages, re-check you have latest Mesa
- if there are no perf warnings, check that:
  * your dmesg doesn't have any suspicious warnings
  * top output doesn't show things to be CPU limited and you having some
    background CPU / X hog occasionally stalling things for the foreground
    app (You need another machine to monitor this when running things at
    fullscreen)

-- 
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 01/11] mesa: Returns a GL_INVALID_VALUE error on several glGet* APIs when max length is negative
On Tue, Jan 20, 2015 at 3:54 AM, Ian Romanick <i...@freedesktop.org> wrote:
> On 01/19/2015 04:35 AM, Erik Faye-Lund wrote:
>> On Mon, Jan 19, 2015 at 1:31 PM, Erik Faye-Lund <kusmab...@gmail.com> wrote:
>>> On Mon, Jan 19, 2015 at 12:32 PM, Eduardo Lima Mitev <el...@igalia.com> wrote:
>>>> The manual page for glGetAttachedShaders, glGetShaderSource,
>>>> glGetActiveUniform and glGetActiveAttrib state that a GL_INVALID_VALUE
>>>> is returned if the maximum length argument is less than zero. For
>>>> reference, see:
>>>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetAttachedShaders.xml,
>>>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetAttachedShaders.xhtml,
>>>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetShaderSource.xml,
>>>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetShaderSource.xhtml,
>>>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetActiveUniform.xml,
>>>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetActiveUniform.xhtml,
>>>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetActiveAttrib.xml,
>>>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetActiveAttrib.xhtml.
>>>>
>>>> This fixes 4 dEQP tests:
>>>> * dEQP-GLES3.functional.negative_api.state.get_attached_shaders
>>>> * dEQP-GLES3.functional.negative_api.state.get_shader_source
>>>> * dEQP-GLES3.functional.negative_api.state.get_active_uniform
>>>> * dEQP-GLES3.functional.negative_api.state.get_active_attrib
>>>
>>> These tests are about GLES3, but I cannot find such behavior specified
>>> in the OpenGL ES 3.0 specification for GetAttachedShaders nor
>>> GetProgramBinary. I stopped checking after those two, because I felt I
>>> saw a pattern ;)
>>>
>>> The man pages you linked to are for desktop-OpenGL, which *does*
>>> specify such an error; OpenGL 4.5 spec says "An INVALID_VALUE error is
>>> generated if maxCount is negative" about GetAttachedShaders (and a
>>> similar for GetProgramBinary()).
>>>
>>> However, the GLES 3 man pages also list such an error:
>>> https://www.khronos.org/opengles/sdk/docs/man3/html/glGetAttachedShaders.xhtml
>>>
>>> Generally speaking, the manual pages are not considered to dictate
>>> behavior, only to be a programmer convenience. And they have
>>> historically been full of errors.
>>>
>>> However, since desktop GL *does* specify these errors *and* the
>>> man-pages document them, this does look like a spec-error for GLES 3
>>> to me. But I think we should get a clarification from Khronos before
>>> assuming so, otherwise we won't be in conformance.
>>>
>>> Ian, any thoughts?
>>
>> By the way, this error is also properly defined for OpenGL ES 3.1, so
>> I'm feeling even more confident that it's a spec-bug in OpenGL ES 3.0.
>
> There's a general statement in the "Errors" section that passing a
> negative value for a GLsizei or GLsizeiptr is always an INVALID_VALUE
> error. That exists in all specs back to OpenGL 1.0. :) Since it's in a
> different place, it is often overlooked (like when we wrote these
> functions in Mesa).

Thanks a lot for pointing that out. Yeah, you're right, this is not
missing from the GLES 3.0 spec, section 2.5 says:

"Several error generation conditions are implicit in the description of
every GL command:
...
* If a negative number is provided where an argument of type sizei or
  sizeiptr is specified, the error INVALID_VALUE is generated."

So yeah, worry withdrawn!
Re: [Mesa-dev] [PATCH 10/16] i965/fs: Eliminate null-dst instructions without side-effects.
On Monday, January 19, 2015 03:31:09 PM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 11 +++
>  1 file changed, 11 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> index 81be4de..d66808b 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
> @@ -85,6 +85,17 @@ fs_visitor::dead_code_eliminate()
>              }
>           }
>
> +         if ((inst->opcode != BRW_OPCODE_IF &&
> +              inst->opcode != BRW_OPCODE_WHILE) &&
> +             inst->dst.is_null() &&
> +             !inst->has_side_effects() &&
> +             !inst->writes_flag() &&
> +             !inst->writes_accumulator) {
> +            inst->opcode = BRW_OPCODE_NOP;
> +            progress = true;
> +            continue;
> +         }
> +
>           if (inst->dst.file == GRF) {
>              if (!inst->is_partial_write()) {
>                 int var = live_intervals->var_from_reg(inst->dst);

Seems like these should be handled too...

- BRW_OPCODE_ELSE
- FS_OPCODE_DISCARD_JUMP
- FS_OPCODE_PLACEHOLDER_HALT
- SHADER_OPCODE_SHADER_TIME_ADD
- SHADER_OPCODE_GEN4_SCRATCH_READ
- SHADER_OPCODE_GEN4_SCRATCH_WRITE
- SHADER_OPCODE_GEN7_SCRATCH_READ

Maybe some of these should be added to has_side_effects()?

I'm kind of surprised you didn't see regressions in Piglit...maybe I'm
missing something.
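One way to read the review comment is that the opcode exclusions belong inside has_side_effects() rather than in the elimination condition. The sketch below models that arrangement with invented types: the `Opcode` values, the `Inst` fields and the exact set of opcodes treated as side-effecting are assumptions for illustration, not the driver's actual classification.

```cpp
#include <cassert>

// Hypothetical opcode subset; names only loosely mirror the driver's.
enum Opcode {
    OP_ADD, OP_CMP, OP_IF, OP_ELSE, OP_WHILE,
    OP_DISCARD_JUMP, OP_SCRATCH_WRITE
};

struct Inst {
    Opcode opcode;
    bool dst_is_null;
    bool writes_flag;
    bool writes_accumulator;
};

// Control flow and memory writes matter even when no register result is
// produced, so they are classified here instead of at each call site.
bool has_side_effects(const Inst &inst)
{
    switch (inst.opcode) {
    case OP_IF:
    case OP_ELSE:
    case OP_WHILE:
    case OP_DISCARD_JUMP:
    case OP_SCRATCH_WRITE:
        return true;
    default:
        return false;
    }
}

// A null-destination instruction is dead only if nothing else observes it.
bool can_eliminate(const Inst &inst)
{
    return inst.dst_is_null &&
           !has_side_effects(inst) &&
           !inst.writes_flag &&
           !inst.writes_accumulator;
}
```

With this split, the dead-code pass needs no opcode list of its own, and a CMP that writes the flag register survives because `writes_flag` is set.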
[Mesa-dev] [Bug 88618] opengles2: fix building without X11
https://bugs.freedesktop.org/show_bug.cgi?id=88618

            Bug ID: 88618
           Summary: opengles2: fix building without X11
           Product: Mesa
           Version: git
          Hardware: Other
                OS: All
            Status: NEW
          Severity: normal
          Priority: medium
         Component: Demos
          Assignee: mesa-dev@lists.freedesktop.org
          Reporter: m.olbr...@pengutronix.de
        QA Contact: mesa-dev@lists.freedesktop.org

Created attachment 112525
  --> https://bugs.freedesktop.org/attachment.cgi?id=112525&action=edit
patch

es2_info, es2gears_x11 and es2tri require X11, so don't build them if X11
is disabled.

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 06/16] i965: Add a brw_invert_cmod() function.
On Monday, January 19, 2015 03:31:05 PM Matt Turner wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_eu.c | 22 ++
>  src/mesa/drivers/dri/i965/brw_eu.h |  1 +
>  2 files changed, 23 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu.c b/src/mesa/drivers/dri/i965/brw_eu.c
> index 9905972..9977eed 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu.c
> @@ -88,6 +88,28 @@ brw_swap_cmod(uint32_t cmod)
>     }
>  }
>
> +/* Returns the corresponding inverted conditional mod. */
> +enum brw_conditional_mod
> +brw_invert_cmod(enum brw_conditional_mod cmod)
> +{
> +   switch (cmod) {
> +   case BRW_CONDITIONAL_Z:
> +      return BRW_CONDITIONAL_NZ;
> +   case BRW_CONDITIONAL_NZ:
> +      return BRW_CONDITIONAL_Z;
> +   case BRW_CONDITIONAL_G:
> +      return BRW_CONDITIONAL_LE;
> +   case BRW_CONDITIONAL_GE:
> +      return BRW_CONDITIONAL_L;
> +   case BRW_CONDITIONAL_L:
> +      return BRW_CONDITIONAL_GE;
> +   case BRW_CONDITIONAL_LE:
> +      return BRW_CONDITIONAL_G;
> +   default:
> +      return BRW_CONDITIONAL_NONE;
> +   }
> +}

Heh, I thought this looked familiar...apparently I wrote one too :)
http://lists.freedesktop.org/archives/mesa-dev/2014-August/066127.html

I wasn't sure whether "invert" meant "flip direction" or "negate
condition" until I read the code. How about calling it brw_negate_cmod
instead?

/* Returns a conditional modifier that negates the condition. */
enum brw_conditional_mod
brw_negate_cmod(uint32_t cmod)
{
   ...
}

Either way is fine.
Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
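The difference between negating a condition and swapping its operands is easy to get wrong, so here is a small stand-alone sketch of both. The `CondMod` enum is a hypothetical stand-in for the driver's conditional-mod values, and `swap_cmod` is written from the usual meaning of "swap the operands" rather than copied from brw_swap_cmod.

```cpp
#include <cassert>

// Hypothetical conditional modifiers: Z/NZ test the result against zero,
// the others compare src0 with src1.
enum CondMod { COND_NONE, COND_Z, COND_NZ, COND_G, COND_GE, COND_L, COND_LE };

// Logical negation: the returned condition is true exactly when the input
// condition is false (not (a > b) is a <= b).
CondMod negate_cmod(CondMod cmod)
{
    switch (cmod) {
    case COND_Z:  return COND_NZ;
    case COND_NZ: return COND_Z;
    case COND_G:  return COND_LE;
    case COND_GE: return COND_L;
    case COND_L:  return COND_GE;
    case COND_LE: return COND_G;
    default:      return COND_NONE;
    }
}

// Operand swap: the returned condition gives the same answer once src0 and
// src1 trade places (a > b is b < a). Equality tests are unchanged.
CondMod swap_cmod(CondMod cmod)
{
    switch (cmod) {
    case COND_G:  return COND_L;
    case COND_GE: return COND_LE;
    case COND_L:  return COND_G;
    case COND_LE: return COND_GE;
    default:      return cmod;
    }
}
```

Both mappings are involutions, but they disagree on every ordering comparison: negating `G` gives `LE`, while swapping it gives `L`. That is the ambiguity the rename is meant to remove.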
Re: [Mesa-dev] [PATCH 07/11] glsl: error out on empty declarations
On Mon, 2015-01-19 at 19:03 -0800, Ian Romanick wrote:
> On 01/19/2015 04:52 AM, Erik Faye-Lund wrote:
>> On Mon, Jan 19, 2015 at 12:32 PM, Eduardo Lima Mitev <el...@igalia.com> wrote:
>>> From: Iago Toral Quiroga <ito...@igalia.com>
>>>
>>> So far we have only been emitting a warning.
>>>
>>> Fixes the following 2 dEQP tests:
>>> dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_vertex
>>> dEQP-GLES3.functional.shaders.arrays.invalid.empty_declaration_without_var_name_fragment
>>> ---
>>>  src/glsl/ast_to_hir.cpp | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>>> index c52e4af..5ed8b0e 100644
>>> --- a/src/glsl/ast_to_hir.cpp
>>> +++ b/src/glsl/ast_to_hir.cpp
>>> @@ -3406,7 +3406,7 @@ ast_declarator_list::hir(exec_list *instructions,
>>>                    "lowp"
>>>                 };
>>>
>>> -               _mesa_glsl_warning(&loc, state,
>>> +               _mesa_glsl_error(&loc, state,
>>>                                    "empty declaration with precision qualifier, "
>>>                                    "to set the default precision, use "
>>>                                    "`precision %s %s;'",
>>> @@ -3414,7 +3414,7 @@ ast_declarator_list::hir(exec_list *instructions,
>>>                                    type_name);
>>>              }
>>>           } else if (this->type->specifier->structure == NULL) {
>>> -            _mesa_glsl_warning(&loc, state, "empty declaration");
>>> +            _mesa_glsl_error(&loc, state, "empty declaration");
>>>           }
>>>        }
>>
>> OpenGL ES 3.1 specifies this as allowed, as far as I can tell:
>>
>> Section 4.11:
>> "Empty declarations are allowed. E.g.
>>
>> int;                // No effect
>> struct S {int x;};  // Defines a struct S"
>>
>> OpenGL ES 3.0 seems to not say, which generally should be interpreted
>> as allowed. In fact, no such error is listed in section 10 of the spec,
>> which is the list of errors required to detect.

We included some text in the cover letter of this series about this
topic. I think that GL ES3 explicitly allows it but you have to go to the
chapter "Shading Language Grammar" (page 106) to see it. There they have
this:

declaration:
   various rules here
   type_qualifier SEMICOLON

>> So I think these dEQP-tests are bogus.
>
> I think Erik is right. I seem to recall submitting a Khronos bug about
> this (long, long ago), and the decision was that this was silly but
> valid.

Agreed, let's drop this patch.

Iago
Re: [Mesa-dev] [PATCH 01/11] mesa: Returns a GL_INVALID_VALUE error on several glGet* APIs when max length is negative
Thank you Erik and Ian for taking a look at the patch.

On 01/20/2015 03:52 AM, Ian Romanick wrote:
> On 01/19/2015 03:32 AM, Eduardo Lima Mitev wrote:
>> The manual page for glGetAttachedShaders, glGetShaderSource,
>> glGetActiveUniform and glGetActiveAttrib state that a GL_INVALID_VALUE
>> is returned if the maximum length argument is less than zero. For
>> reference, see:
>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetAttachedShaders.xml,
>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetAttachedShaders.xhtml,
>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetShaderSource.xml,
>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetShaderSource.xhtml,
>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetActiveUniform.xml,
>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetActiveUniform.xhtml,
>> https://www.opengl.org/sdk/docs/man3/xhtml/glGetActiveAttrib.xml,
>> https://www.khronos.org/opengles/sdk/docs/man31/html/glGetActiveAttrib.xhtml.
>
> It's probably easier to quote section 2.3.1 (Errors):
>
>    "If a negative number is provided where an argument of type sizei
>    or sizeiptr is specified, an INVALID_VALUE error is generated."

Indeed. I failed to find the relevant text in the specs (only looked at
GLES3.0 and GL3.3 though), so I decided to finally quote the man pages
even if I knew it is not a good practice. Thought it was better than
nothing.

> I think we should be consistent (as much as possible) with the order
> errors are generated. Each function below will check things in a
> different order. get_attached_shaders can even generate multiple errors
> (one in _mesa_lookup_shader_program_err and one for maxCount < 0).
>
> If you change all three to check max*, then call
> _mesa_lookup_shader_program_err, the patch is

Agree. I will rework the patch to make it more consistent as you suggest.

Thanks again!
Eduardo
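The ordering agreed on above (validate the size argument first, then look the object up, and stop at the first failure) can be sketched as follows. Everything here is a simplified model: `Context`, `record_error()` and `get_attached_shaders()` are hypothetical and do not reproduce Mesa's internal API; the `errors_raised` counter exists only so the example can show that a bad call reports one error rather than two.

```cpp
#include <cassert>
#include <map>

// Hypothetical error codes and context; not Mesa's real types.
enum GlError { ERR_NONE, ERR_INVALID_VALUE };

struct Context {
    GlError error = ERR_NONE;
    int errors_raised = 0;              // for demonstration only
    std::map<unsigned, int> programs;   // program name -> attached shader count
};

// Keeps the first error, as GL error state does until it is queried.
static void record_error(Context &ctx, GlError e)
{
    ++ctx.errors_raised;
    if (ctx.error == ERR_NONE)
        ctx.error = e;
}

// Checks max_count before touching the program, and returns early, so each
// failing call raises exactly one error in a predictable order.
// Returns how many shader names would be written.
int get_attached_shaders(Context &ctx, unsigned program, int max_count)
{
    if (max_count < 0) {
        record_error(ctx, ERR_INVALID_VALUE);
        return 0;
    }

    std::map<unsigned, int>::const_iterator it = ctx.programs.find(program);
    if (it == ctx.programs.end()) {
        record_error(ctx, ERR_INVALID_VALUE);
        return 0;
    }

    return it->second < max_count ? it->second : max_count;
}
```

A call with both a negative count and an unknown program name therefore raises a single error, which is the consistency the review asks for across the three entry points.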
[Mesa-dev] [PATCH 09/11 v2] mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3
GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX
(2.8.1 Transferring Array Elements, page 26) which is not currently
possible to query using glGet*() funcs.

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat
---
 src/mesa/main/get_hash_params.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index c487e98..c79ca45 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -343,6 +343,7 @@ descriptor=[

 # GL_ARB_ES3_compatibility
   [ "MAX_ELEMENT_INDEX", "CONTEXT_INT64(Const.MaxElementIndex)", "extra_ARB_ES3_compatibility_api_es3"],
+  [ "PRIMITIVE_RESTART_FIXED_INDEX", "CONTEXT_BOOL(Array.PrimitiveRestartFixedIndex)", "extra_ARB_ES3_compatibility_api_es3" ],

 # GL_ARB_fragment_shader
   [ "MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxUniformComponents)", "extra_ARB_fragment_shader" ],
-- 
2.1.3
Re: [Mesa-dev] [PATCH 08/11] glsl: GLSL ES identifiers cannot exceed 1024 characters
On Mon, 2015-01-19 at 19:25 -0800, Ian Romanick wrote:
> On 01/19/2015 04:41 AM, Erik Faye-Lund wrote:
>> On Mon, Jan 19, 2015 at 12:32 PM, Eduardo Lima Mitev <el...@igalia.com> wrote:
>>> From: Iago Toral Quiroga <ito...@igalia.com>
>>>
>>> Fixes the following 2 dEQP tests:
>>> dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex
>>> dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment
>>> ---
>>>  src/glsl/glsl_parser.yy | 7 +++
>>>  1 file changed, 7 insertions(+)
>>>
>>> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
>>> index 7fb8c38..165419c 100644
>>> --- a/src/glsl/glsl_parser.yy
>>> +++ b/src/glsl/glsl_parser.yy
>>> @@ -362,6 +362,13 @@ any_identifier:
>>>     IDENTIFIER
>>>     | TYPE_IDENTIFIER
>>>     | NEW_IDENTIFIER
>>> +   {
>>> +      if (state->es_shader && strlen($1) > 1024) {
>>> +         _mesa_glsl_error(& @1, state,
>>> +                          "Identifier `%s' exceeds "
>>> +                          "1024 characters", $1);
>>> +      }
>>> +   }
>>
>> The limit of 1024 characters for identifiers is only specified for
>> OpenGL ES Shading Language versions 3.00 and up, not for version 1.00
>> as OpenGL ES 2.0 uses. Perhaps a spec bug?
>
> If I had to wager a guess... I'd guess that the limitation was added
> because shipping ES2 implementations already had such a limit (or just
> had bugs with very large identifiers). Assuming that's the case, this
> check should be safe universally.

Using huge identifiers increases memory requirements for the compiler (we
copy variables and duplicate identifiers all the time during the
compilation process), so I guess this is intended to force developers
into more ES friendly programming habits and avoid situations where the
compiler just runs out of memory in the middle of it.

> However, checking here defeats the purpose of having such a limitation.
> It seems like we should do this in the lexer... before rallocing a copy
> of the large string.

My guess is that this requirement is more related to the compilation
process than the parser but doing this in the lexer does not harm and can
help in some cases. I'll update the patch accordingly.

Iago
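The benefit of checking in the lexer is that the length is known from the token itself, so an oversized identifier can be rejected before any copy is allocated. The sketch below illustrates that with an invented helper; `lex_identifier()`, its parameters and `MAX_ES_IDENTIFIER_LENGTH` are hypothetical names, and a `std::string` stands in for the ralloc'd copy the real lexer would make.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Limit taken from the thread; the constant name is made up for this sketch.
static const std::size_t MAX_ES_IDENTIFIER_LENGTH = 1024;

// Models a lexer action for an identifier token. The token text is given as
// pointer + length (as a scanner would provide), and is only duplicated into
// *out when it passes the length check. Returns false on rejection.
bool lex_identifier(const char *text, std::size_t length, bool es_shader,
                    std::string *out)
{
    if (es_shader && length > MAX_ES_IDENTIFIER_LENGTH)
        return false;           // report the error; no copy has been made

    out->assign(text, length);  // stand-in for the allocation of the copy
    return true;
}
```

An identifier of exactly 1024 characters is still accepted, and the limit is applied only when `es_shader` is set, mirroring the condition in the parser patch.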
Re: [Mesa-dev] [PATCH 10/11] glsl: Improve precision of mod(x,y)
On Mon, 2015-01-19 at 19:10 -0800, Ian Romanick wrote:
> On 01/19/2015 03:32 AM, Eduardo Lima Mitev wrote:
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement
>> mod(x,y) as y * fract(x/y). This implementation has a down side though:
>> it introduces precision errors due to the fract() operation. Even worse,
>> since the result of fract() is multiplied by y, the larger y gets the
>> larger the precision error we produce, so for large enough numbers the
>> precision loss is significant. Some examples on i965:
>>
>> Operation                          Precision error
>> -----------------------------------------------------
>> mod(-1.951171875, 1.9980468750)    0.000447
>> mod(121.57, 13.29)                 0.023842
>> mod(3769.12, 321.99)               0.762939
>> mod(3769.12, 1321.99)              0.0001220703
>> mod(-987654.125, 123456.984375)    0.0160663128
>> mod( 987654.125, 123456.984375)    0.031250
>>
>> This patch replaces the current lowering pass with a different one
>> (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL
>> man pages:
>>
>> mod(x,y) = x - y * floor(x/y)
>>
>> This implementation eliminates the precision errors at the expense of
>> an additional add instruction on some systems. On systems that can do
>> negate with multiply-add in a single operation this new implementation
>> would come at no additional cost.
>>
>> Fixes the following 16 dEQP tests:
>> dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
>> dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
>> ---
>>  src/glsl/README                                | 2 +-
>>  src/glsl/ir_optimization.h                     | 2 +-
>>  src/glsl/lower_instructions.cpp                | 49 --
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 2 +-
>>  src/mesa/drivers/dri/i965/brw_shader.cpp       | 2 +-
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +-
>>  src/mesa/program/ir_to_mesa.cpp                | 4 +--
>>  src/mesa/state_tracker/st_glsl_to_tgsi.cpp     | 2 +-
>>  8 files changed, 31 insertions(+), 34 deletions(-)
>>
>> diff --git a/src/glsl/README b/src/glsl/README
>> index 2f93f12..bfcf69f 100644
>> --- a/src/glsl/README
>> +++ b/src/glsl/README
>> @@ -187,7 +187,7 @@ You may also need to update the backends if they will see the new expr type:
>>
>>  You can then use the new expression from builtins (if all backends
>>  would rather see it), or scan the IR and convert to use your new
>> -expression type (see ir_mod_to_fract, for example).
>> +expression type (see ir_mod_to_floor, for example).
>>
>>  Q: How is memory management handled in the compiler?
>>
>> diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
>> index 34e0b4b..912d910 100644
>> --- a/src/glsl/ir_optimization.h
>> +++ b/src/glsl/ir_optimization.h
>> @@ -34,7 +34,7 @@
>>  #define EXP_TO_EXP2        0x04
>>  #define POW_TO_EXP2        0x08
>>  #define LOG_TO_LOG2        0x10
>> -#define MOD_TO_FRACT       0x20
>> +#define MOD_TO_FLOOR       0x20
>>  #define INT_DIV_TO_MUL_RCP 0x40
>>  #define BITFIELD_INSERT_TO_BFM_BFI 0x80
>>  #define LDEXP_TO_ARITH     0x100
>> diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
>> index 6842853..b23c24d 100644
>> --- a/src/glsl/lower_instructions.cpp
>> +++ b/src/glsl/lower_instructions.cpp
>> @@ -36,7 +36,7 @@
>>   * - EXP_TO_EXP2
>>   * - POW_TO_EXP2
>>   * - LOG_TO_LOG2
>> - * - MOD_TO_FRACT
>> + * - MOD_TO_FLOOR
>>   * - LDEXP_TO_ARITH
>>   * - BITFIELD_INSERT_TO_BFM_BFI
>>   * - CARRY_TO_ARITH
>> @@ -77,14 +77,17 @@
>>   * Many older GPUs don't have an x**y instruction. For these GPUs, convert
>>   * x**y to 2**(y * log2(x)).
>>   *
>> - * MOD_TO_FRACT:
>> + * MOD_TO_FLOOR:
>>   * -------------
>> - * Breaks an ir_binop_mod expression down to (op1 * fract(op0 / op1))
>> + * Breaks an ir_binop_mod expression down to (op0 - op1 * floor(op0 / op1))
>>   *
>>   * Many GPUs don't have a MOD instruction (945 and 965 included), and
>>   * if we have to break it down like this anyway, it gives an
>>   * opportunity to do things like constant fold the (1.0 / op1) easily.
>>   *
>> + * Note: before we used to implement this as op1 * fract(op0 / op1) but this
>> + * implementation had significant precision errors.
>> + *
>>   * LDEXP_TO_ARITH:
>>   * -------------
>>   * Converts ir_binop_ldexp to arithmetic and bit operations.
>> @@ -136,7 +139,7 @@ private:
>>     void sub_to_add_neg(ir_expression *);
>>     void div_to_mul_rcp(ir_expression *);
>>     void int_div_to_mul_rcp(ir_expression *);
>> -   void mod_to_fract(ir_expression *);
>> +   void mod_to_floor(ir_expression *);
>>     void exp_to_exp2(ir_expression *);
>>     void pow_to_exp2(ir_expression *);
>>     void log_to_log2(ir_expression *);
>> @@ -276,22 +279,15 @@ lower_instructions_visitor::log_to_log2(ir_expression *ir)
>>  }
>>
>>  void
>> -lower_instructions_visitor::mod_to_fract(ir_expression *ir)
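The two lowerings discussed in the commit message can be written side by side as plain scalar functions. This is only a CPU-side sketch of the arithmetic: `mod_fract` and `mod_floor` are names chosen for the example, and the error figures quoted for i965 come from the GPU's fract() and will not necessarily reproduce on a CPU.

```cpp
#include <cassert>
#include <cmath>

// Old lowering: mod(x, y) = y * fract(x / y).
// Any error in the fractional part is scaled up by y.
float mod_fract(float x, float y)
{
    float q = x / y;
    float fract = q - std::floor(q);
    return y * fract;
}

// New lowering: mod(x, y) = x - y * floor(x / y).
// floor() yields an integer-valued quotient; the remaining work is one
// multiply and one subtract (a single multiply-add where negation is free).
float mod_floor(float x, float y)
{
    return x - y * std::floor(x / y);
}
```

For small, exactly representable inputs such as mod(5.5, 2.0) both forms agree; they diverge when x / y has a fractional part that cannot be represented exactly and y is large, which is the case the table in the commit message illustrates.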
Re: [Mesa-dev] [PATCH 02/11] glsl: Add link time checks for GLSL precision qualifiers
On Mon, 2015-01-19 at 19:39 -0800, Ian Romanick wrote:
> On 01/19/2015 03:32 AM, Eduardo Lima Mitev wrote:
>> From: Iago Toral Quiroga <ito...@igalia.com>
>>
>> Currently, we only consider precision qualifiers at compile-time. This patch
>> adds precision information to ir_variable so we can also do link time checks.
>> Specifically, from the GLSL ES3 spec, 4.5.3 Precision Qualifiers:
>>
>> "The same uniform declared in different shaders that are linked together
>>  must have the same precision qualification."
>>
>> Notice that this patch will check the above also for GLSL ES globals that are
>> not uniforms. This is not explicitly stated in the spec, but seems to be
>> the only consistent choice: since we can only have one definition of a global
>> all its declarations should be identical, including precision qualifiers.
>
> Probably not. In ES you can only have one compilation unit per stage,
> so you can never have such a conflict.
>
>> No piglit regressions.
>>
>> Fixes the following 4 dEQP tests:
>> dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_1
>> dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_2
>> dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_3
>> dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_4
>> ---
>>  src/glsl/ast_to_hir.cpp | 12
>>  src/glsl/glsl_types.cpp |  4
>>  src/glsl/glsl_types.h   | 13
>>  src/glsl/ir.h           | 15
>>  src/glsl/linker.cpp     | 48
>>  5 files changed, 84 insertions(+), 8 deletions(-)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 811a955..2b67e14 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -2460,6 +2460,10 @@ apply_type_qualifier_to_variable(const struct ast_type_qualifier *qual,
>>     if (qual->flags.q.sample)
>>        var->data.sample = 1;
>>
>> +   /* Precision qualifiers do not hold any meaning in Desktop GLSL */
>> +   if (state->es_shader && qual->precision)
>> +      var->data.precision = qual->precision;
>> +
>>     if (state->stage == MESA_SHADER_GEOMETRY &&
>>         qual->flags.q.out && qual->flags.q.stream) {
>>        var->data.stream = qual->stream;
>> @@ -5213,6 +5217,10 @@ ast_process_structure_or_interface_block(exec_list *instructions,
>>           fields[i].centroid = qual->flags.q.centroid ? 1 : 0;
>>           fields[i].sample = qual->flags.q.sample ? 1 : 0;
>>
>> +         /* Precision qualifiers do not hold any meaning in Desktop GLSL */
>> +         fields[i].precision = state->es_shader ? qual->precision :
>> +                                                  GLSL_PRECISION_NONE;
>> +
>>           /* Only save explicitly defined streams in block's field */
>>           fields[i].stream = qual->flags.q.explicit_stream ? qual->stream : -1;
>> @@ -5688,6 +5696,10 @@ ast_interface_block::hir(exec_list *instructions,
>>           var->data.sample = fields[i].sample;
>>           var->init_interface_type(block_type);
>>
>> +         /* Precision qualifiers do not hold any meaning in Desktop GLSL */
>> +         var->data.precision = state->es_shader ? fields[i].precision :
>> +                                                  GLSL_PRECISION_NONE;
>> +
>>           if (fields[i].matrix_layout == GLSL_MATRIX_LAYOUT_INHERITED) {
>>              var->data.matrix_layout = matrix_layout == GLSL_MATRIX_LAYOUT_INHERITED
>>                 ? GLSL_MATRIX_LAYOUT_COLUMN_MAJOR : matrix_layout;
>> diff --git a/src/glsl/glsl_types.cpp b/src/glsl/glsl_types.cpp
>> index b4223f4..e35c2aa 100644
>> --- a/src/glsl/glsl_types.cpp
>> +++ b/src/glsl/glsl_types.cpp
>> @@ -122,6 +122,7 @@ glsl_type::glsl_type(const glsl_struct_field *fields, unsigned num_fields,
>>        this->fields.structure[i].centroid = fields[i].centroid;
>>        this->fields.structure[i].sample = fields[i].sample;
>>        this->fields.structure[i].matrix_layout = fields[i].matrix_layout;
>> +      this->fields.structure[i].precision = fields[i].precision;
>>     }
>>
>>     mtx_unlock(&glsl_type::mutex);
>> @@ -668,6 +669,9 @@ glsl_type::record_compare(const glsl_type *b) const
>>        if (this->fields.structure[i].sample
>>            != b->fields.structure[i].sample)
>>           return false;
>> +      if (this->fields.structure[i].precision
>> +          != b->fields.structure[i].precision)
>> +         return false;
>>     }
>>
>>     return true;
>> diff --git a/src/glsl/glsl_types.h b/src/glsl/glsl_types.h
>> index 441015c..2405006 100644
>> --- a/src/glsl/glsl_types.h
>> +++ b/src/glsl/glsl_types.h
>> @@ -100,6 +100,13 @@ enum glsl_matrix_layout {
>>     GLSL_MATRIX_LAYOUT_ROW_MAJOR
>>  };
>>
>> +enum {
>> +   GLSL_PRECISION_NONE = 0,
>> +   GLSL_PRECISION_HIGH,
>> +   GLSL_PRECISION_MEDIUM,
>> +   GLSL_PRECISION_LOW
>> +};
>> +
>>  #ifdef __cplusplus
>>  #include "GL/gl.h"
>>  #include "util/ralloc.h"
>> @@ -117,6 +124,7 @@ struct glsl_type {
>>      * and \c GLSL_TYPE_UINT are valid.
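For readers following the thread, here is a minimal sketch of the kind of link-time check the commit message describes: the same uniform seen in two shaders must carry the same precision. It is not the linker.cpp code from the patch (that hunk is not shown above). The names `glsl_precision_sketch`, `uniform_decl` and `find_precision_conflict` are made up for illustration; only the four enumerator names come from the patch.

```cpp
#include <map>
#include <string>
#include <vector>

// Enumerators as added by the patch; the type name is hypothetical
// (the patch leaves the enum anonymous).
enum glsl_precision_sketch {
   GLSL_PRECISION_NONE = 0,
   GLSL_PRECISION_HIGH,
   GLSL_PRECISION_MEDIUM,
   GLSL_PRECISION_LOW
};

// Hypothetical stand-in for the parts of ir_variable the check needs.
struct uniform_decl {
   std::string name;
   glsl_precision_sketch precision;
};

// Returns the name of the first uniform that is declared with different
// precisions in two shaders, or an empty string if there is no conflict.
std::string
find_precision_conflict(const std::vector<std::vector<uniform_decl>> &shaders)
{
   std::map<std::string, glsl_precision_sketch> seen;

   for (const std::vector<uniform_decl> &shader : shaders) {
      for (const uniform_decl &u : shader) {
         auto it = seen.find(u.name);
         if (it == seen.end())
            seen.emplace(u.name, u.precision);   // first declaration wins
         else if (it->second != u.precision)
            return u.name;                       // mismatch across shaders
      }
   }
   return std::string();
}
```

A real linker would report an error for the returned name instead of handing it back to the caller.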
Re: [Mesa-dev] [PATCH 02/16] glsl: Add a foreach_in_list_reverse_safe macro.
Shouldn't the subject say "exec_list: Add a foreach_in_list_reverse_safe macro."?

On Mon, Jan 19, 2015 at 6:31 PM, Matt Turner <matts...@gmail.com> wrote:
> ---
>  src/glsl/list.h | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/glsl/list.h b/src/glsl/list.h
> index 330c17e..85368a4 100644
> --- a/src/glsl/list.h
> +++ b/src/glsl/list.h
> @@ -636,6 +636,12 @@ inline void exec_node::insert_before(exec_list *before)
>          __next != NULL;                                            \
>          __node = __next, __next = (__type *)__next->next)
>
> +#define foreach_in_list_reverse_safe(__type, __node, __list)       \
> +   for (__type *__node = (__type *)(__list)->tail_pred,            \
> +               *__prev = (__type *)__node->prev;                   \
> +        __prev != NULL;                                            \
> +        __node = __prev, __prev = (__type *)__prev->prev)
> +
>  #define foreach_in_list_use_after(__type, __inst, __list) \
>     __type *(__inst);                                      \
>     for ((__inst) = (__type *)(__list)->head;              \
> --
> 2.0.4
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
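To illustrate why the macro fetches the predecessor before running the loop body, here is a small self-contained sketch. It assumes a simplified list with two separate sentinel nodes, which is not how Mesa's exec_list is laid out (exec_list overlaps its sentinels); `node`, `list`, `FOREACH_REVERSE_SAFE` and the helper functions are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified stand-in for exec_node.
struct node {
   node *next = nullptr;
   node *prev = nullptr;
   int value = 0;

   // Unlink this node; it must currently be on a list.
   void remove()
   {
      assert(next != nullptr && prev != nullptr);
      next->prev = prev;
      prev->next = next;
      next = nullptr;
      prev = nullptr;
   }
};

struct list {
   node head_sentinel;   // head_sentinel.prev stays NULL
   node tail_sentinel;   // tail_sentinel.next stays NULL

   list()
   {
      head_sentinel.next = &tail_sentinel;
      tail_sentinel.prev = &head_sentinel;
   }

   list(const list &) = delete;
   list &operator=(const list &) = delete;

   void push_tail(node *n)
   {
      n->next = &tail_sentinel;
      n->prev = tail_sentinel.prev;
      tail_sentinel.prev->next = n;
      tail_sentinel.prev = n;
   }
};

// Same shape as foreach_in_list_reverse_safe: the predecessor is saved
// before the body runs, so the body may remove the current node.
#define FOREACH_REVERSE_SAFE(n, l)                                \
   for (node *n = (l)->tail_sentinel.prev, *n##_prev = n->prev;   \
        n##_prev != nullptr;                                      \
        n = n##_prev, n##_prev = n##_prev->prev)

// Walk the list backwards, removing even values; returns the visit order.
std::vector<int> remove_even_in_reverse(list *l)
{
   std::vector<int> visited;
   FOREACH_REVERSE_SAFE(n, l) {
      visited.push_back(n->value);
      if (n->value % 2 == 0)
         n->remove();
   }
   return visited;
}

// Collect the remaining values front to back.
std::vector<int> to_vector(const list *l)
{
   std::vector<int> out;
   for (const node *n = l->head_sentinel.next; n->next != nullptr; n = n->next)
      out.push_back(n->value);
   return out;
}
```

The loop stops when the saved predecessor is NULL, which happens once the current node is the head sentinel, so an empty list runs zero iterations.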
Re: [Mesa-dev] [PATCH] radeonsi: Enable VGPR spilling for all shader types v3
The problem with CPDMA (DMA_DATA and WRITE_DATA) is that the ordering
of flushes must be correct. First, partial flushes must be done, so
that the shaders are idle. Then you can use CP DMA to update the
binary. After that, ICACHE should be invalidated. This is similar to
how si_clear_buffer and si_copy_buffer work. However, the biggest
issue is that it doesn't map very well to our state management
mechanisms, so it must be done separately (just like si_clear_buffer).

Doing it on the CPU just seems a lot easier. The problem with mapping
VRAM can be avoided by keeping a CPU copy of the binary from the
beginning. We would only need a CPU copy of those shaders that use the
scratch buffer. Then, you wouldn't have to read VRAM at all, which
would make it even simpler.

Marek

On Tue, Jan 20, 2015 at 4:14 AM, Michel Dänzer <mic...@daenzer.net> wrote:
> On 20.01.2015 00:07, Marek Olšák wrote:
>> 3) Since the CP runs in parallel with the graphics engine, shaders can
>> be busy while WRITE_DATA is changing their code. (no good can come out
>> of that) Therefore, the partial flushes (= partial WAIT_UNTIL) must be
>> emitted before the WRITE_DATA packet.
>>
>> The other thing is that the code is missing checking if there is
>> enough space in the CS for the WRITE_DATA packets.
>>
>> In order to simplify everything, I recommend this solution:
>> - Forget about partial flushes, forget about WRITE_DATA.
>> - If the shader relocs need updating, copy the current shader bo to a
>>   new shader bo using the CPU (map the current shader bo with
>>   read+unsynchronized flags).
>> - Update the relocs using the CPU in the new shader bo.
>> - Then replace the current shader bo with the new one and mark the
>>   whole shader state as dirty.
>
> Doing this with the CPU has some drawbacks though: Accessing a BO in
> VRAM with the CPU forces it into the first 256MB of VRAM, which could
> stall the graphics pipeline if the BO wasn't there before.
>
> I think basically the same scheme of allocating new BOs for reloc
> patching should be doable with CPDMA, which should remove the need to
> worry about flushing things in particular orders.
>
> It's probably fine to do it with the CPU first and with CPDMA (also for
> the initial shader code write) as an optimization later.
>
> --
> Earthling Michel Dänzer | http://www.amd.com
> Libre software enthusiast | Mesa and X developer
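The CPU-side scheme discussed in this message (keep a CPU copy of the binary, patch the relocs there, publish the result as a new buffer, mark the state dirty) could be sketched roughly as below. This is only an illustration under stated assumptions: `shader_sketch`, its fields and `update_scratch_relocs` are hypothetical, a `std::vector` stands in for a buffer object, and no radeonsi or winsys API is used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins; none of these are radeonsi types.
struct shader_sketch {
   std::vector<std::uint8_t> cpu_copy;      // binary kept on the CPU from the start
   std::vector<std::size_t> reloc_offsets;  // byte offsets of 32-bit words to patch
   std::vector<std::uint8_t> bo;            // pretend GPU buffer object
   bool dirty = false;
};

// Patch every reloc in the CPU copy, then publish a fresh "bo" instead of
// writing into the one the GPU may still be executing from.
void update_scratch_relocs(shader_sketch &s, std::uint32_t scratch_value)
{
   for (std::size_t offset : s.reloc_offsets) {
      assert(offset + sizeof(scratch_value) <= s.cpu_copy.size());
      std::memcpy(&s.cpu_copy[offset], &scratch_value, sizeof(scratch_value));
   }

   std::vector<std::uint8_t> new_bo(s.cpu_copy);  // "allocate" and upload
   s.bo.swap(new_bo);   // old contents are dropped when new_bo goes out of scope
   s.dirty = true;      // the whole shader state must be re-emitted
}
```

Because the old buffer is never written, no flush ordering is needed for the patching itself, and VRAM is never read back.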
[Mesa-dev] [PATCH v2] glsl: GLSL ES identifiers cannot exceed 1024 characters
v2 (Ian Romanick)
- Move the check to the lexer before rallocing a copy of the large string.

Fixes the following 2 dEQP tests:
dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex
dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment
---
 src/glsl/glsl_lexer.ll | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll
index 57c46be..48ba463 100644
--- a/src/glsl/glsl_lexer.ll
+++ b/src/glsl/glsl_lexer.ll
@@ -544,7 +544,13 @@ subroutine KEYWORD(0, 300, 0, 0, SUBROUTINE);
 [_a-zA-Z][_a-zA-Z0-9]* {
                            struct _mesa_glsl_parse_state *state = yyextra;
                            void *ctx = state;
-                           yylval->identifier = ralloc_strdup(ctx, yytext);
+                           if (state->es_shader && strlen(yytext) > 1024) {
+                              _mesa_glsl_error(yylloc, state,
+                                               "Identifier `%s' exceeds 1024 characters",
+                                               yytext);
+                           } else {
+                              yylval->identifier = ralloc_strdup(ctx, yytext);
+                           }
                            return classify_identifier(state, yytext);
                         }

--
1.9.1
Re: [Mesa-dev] [xf86-video-ati] dri2: Enable BufferAge support
On Mon, Jan 19, 2015 at 6:00 AM, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> For BufferAge support, we just have to guarantee that we were not using
> the DRI2Buffer->flags field for anything else (i.e. it is always 0) and
> then to make sure that we exchange the flags field after buffer
> exchanges. radeon does not support TripleBuffering so we do not have to
> worry about preserving the flags when injecting the third buffer.
>
> Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
> Cc: Dave Airlie <airl...@redhat.com>
> Cc: Jerome Glisse <jgli...@redhat.com>
> Cc: Alex Deucher <alexander.deuc...@amd.com>
> Cc: Michel Dänzer <michel.daen...@amd.com>

Reviewed-by: Alex Deucher <alexander.deuc...@amd.com>

> ---
>  src/radeon_dri2.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/radeon_dri2.c b/src/radeon_dri2.c
> index 0fbe96c..091cd06 100644
> --- a/src/radeon_dri2.c
> +++ b/src/radeon_dri2.c
> @@ -764,6 +764,11 @@ radeon_dri2_exchange_buffers(DrawablePtr draw, DRI2BufferPtr front, DRI2BufferPt
>      front->name = back->name;
>      back->name = tmp;
>
> +    /* Swap flags so BufferAge works */
> +    tmp = front->flags;
> +    front->flags = back->flags;
> +    back->flags = tmp;
> +
>      /* Swap pixmap bos */
>      front_bo = radeon_get_pixmap_bo(front_priv->pixmap);
>      back_bo = radeon_get_pixmap_bo(back_priv->pixmap);
> @@ -1647,6 +1652,11 @@ radeon_dri2_screen_init(ScreenPtr pScreen)
>      dri2_info.CopyRegion2 = radeon_dri2_copy_region2;
>  #endif
>
> +#if DRI2INFOREC_VERSION >= 10
> +    dri2_info.version = 10;
> +    dri2_info.bufferAge = 1;
> +#endif
> +
>      info->dri2.enabled = DRI2ScreenInit(pScreen, &dri2_info);
>      return info->dri2.enabled;
>  }
> --
> 2.1.4
>
> ___
> xorg-de...@lists.x.org: X.Org development
> Archives: http://lists.x.org/archives/xorg-devel
> Info: http://lists.x.org/mailman/listinfo/xorg-devel
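The first hunk boils down to "whatever travels with a buffer must be swapped along with its name". A tiny sketch of that idea, assuming a cut-down buffer struct (`buffer_sketch` is hypothetical; the real DRI2Buffer has more fields and the driver also swaps the pixmap BOs):

```cpp
#include <cassert>
#include <utility>

// Hypothetical cut-down buffer: only the two fields the hunk touches.
struct buffer_sketch {
   unsigned name;
   unsigned flags;
};

// Exchange front and back so that each name keeps its own flags value.
void exchange_buffers(buffer_sketch *front, buffer_sketch *back)
{
   assert(front != nullptr && back != nullptr);
   std::swap(front->name, back->name);
   std::swap(front->flags, back->flags);
}
```

If only `name` were swapped, the flags would end up attached to the wrong buffer after every exchange.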
[Mesa-dev] [Bug 88275] [865G] Intel OpenGL rendering isn't starting
https://bugs.freedesktop.org/show_bug.cgi?id=88275

Eugene <ken20...@ukr.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[865G] Intel OpenGL         |[865G] Intel OpenGL
                   |rendering not starting      |rendering isn't starting

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [Bug 82668] Can't set int attributes to certain values on 32-bit
https://bugs.freedesktop.org/show_bug.cgi?id=82668

marius predut <marius.pre...@intel.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|mesa-dev@lists.freedesktop. |marius.pre...@intel.com
                   |org                         |

--- Comment #5 from marius predut <marius.pre...@intel.com> ---
A patch that fixes this defect was sent to mesa-dev@lists.freedesktop.org

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH v2] glsl: Improve precision of mod(x,y)
Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement
mod(x,y) as y * fract(x/y). This implementation has a down side though:
it introduces precision errors due to the fract() operation. Even worse,
since the result of fract() is multiplied by y, the larger y gets the
larger the precision error we produce, so for large enough numbers the
precision loss is significant. Some examples on i965:

Operation                          Precision error
-----------------------------------------------------
mod(-1.951171875, 1.9980468750)    0.000447
mod(121.57, 13.29)                 0.023842
mod(3769.12, 321.99)               0.762939
mod(3769.12, 1321.99)              0.0001220703
mod(-987654.125, 123456.984375)    0.0160663128
mod( 987654.125, 123456.984375)    0.031250

This patch replaces the current lowering pass with a different one
(MOD_TO_FLOOR) that follows the recommended implementation in the GLSL
man pages:

mod(x,y) = x - y * floor(x/y)

This implementation eliminates the precision errors at the expense of
an additional add instruction on some systems. On systems that can do
negate with multiply-add in a single operation this new implementation
would come at no additional cost.

v2 (Ian Romanick)
- Do not clone operands because when they are expressions we would be
  duplicating them and that can lead to suboptimal code.

Fixes the following 16 dEQP tests:
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
---
 src/glsl/README                                |  2 +-
 src/glsl/ir_optimization.h                     |  2 +-
 src/glsl/lower_instructions.cpp                | 65 +++---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   |  2 +-
 src/mesa/drivers/dri/i965/brw_shader.cpp       |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  2 +-
 src/mesa/program/ir_to_mesa.cpp                |  4 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp     |  2 +-
 8 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/src/glsl/README b/src/glsl/README
index 2f93f12..bfcf69f 100644
--- a/src/glsl/README
+++ b/src/glsl/README
@@ -187,7 +187,7 @@ You may also need to update the backends if they will see the new expr type:

 You can then use the new expression from builtins (if all backends
 would rather see it), or scan the IR and convert to use your new
-expression type (see ir_mod_to_fract, for example).
+expression type (see ir_mod_to_floor, for example).

 Q: How is memory management handled in the compiler?

diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index 34e0b4b..912d910 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -34,7 +34,7 @@
 #define EXP_TO_EXP2        0x04
 #define POW_TO_EXP2        0x08
 #define LOG_TO_LOG2        0x10
-#define MOD_TO_FRACT       0x20
+#define MOD_TO_FLOOR       0x20
 #define INT_DIV_TO_MUL_RCP 0x40
 #define BITFIELD_INSERT_TO_BFM_BFI 0x80
 #define LDEXP_TO_ARITH     0x100
diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
index 6842853..09afe55 100644
--- a/src/glsl/lower_instructions.cpp
+++ b/src/glsl/lower_instructions.cpp
@@ -36,7 +36,7 @@
  * - EXP_TO_EXP2
  * - POW_TO_EXP2
  * - LOG_TO_LOG2
- * - MOD_TO_FRACT
+ * - MOD_TO_FLOOR
  * - LDEXP_TO_ARITH
  * - BITFIELD_INSERT_TO_BFM_BFI
  * - CARRY_TO_ARITH
@@ -77,14 +77,17 @@
  * Many older GPUs don't have an x**y instruction.  For these GPUs, convert
  * x**y to 2**(y * log2(x)).
  *
- * MOD_TO_FRACT:
+ * MOD_TO_FLOOR:
  * -------------
- * Breaks an ir_binop_mod expression down to (op1 * fract(op0 / op1))
+ * Breaks an ir_binop_mod expression down to (op0 - op1 * floor(op0 / op1))
  *
  * Many GPUs don't have a MOD instruction (945 and 965 included), and
  * if we have to break it down like this anyway, it gives an
  * opportunity to do things like constant fold the (1.0 / op1) easily.
  *
+ * Note: before we used to implement this as op1 * fract(op / op1) but this
+ * implementation had significant precision errors.
+ *
  * LDEXP_TO_ARITH:
  * -------------
  * Converts ir_binop_ldexp to arithmetic and bit operations.
@@ -136,7 +139,7 @@ private:
    void sub_to_add_neg(ir_expression *);
    void div_to_mul_rcp(ir_expression *);
    void int_div_to_mul_rcp(ir_expression *);
-   void mod_to_fract(ir_expression *);
+   void mod_to_floor(ir_expression *);
    void exp_to_exp2(ir_expression *);
    void pow_to_exp2(ir_expression *);
    void log_to_log2(ir_expression *);
@@ -276,22 +279,29 @@ lower_instructions_visitor::log_to_log2(ir_expression *ir)
 }

 void
-lower_instructions_visitor::mod_to_fract(ir_expression *ir)
+lower_instructions_visitor::mod_to_floor(ir_expression *ir)
 {
-   ir_variable *temp = new(ir) ir_variable(ir->operands[1]->type, "mod_b",
-                                           ir_var_temporary);
-   this->base_ir->insert_before(temp);
-
-   ir_assignment *const assign =
-
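The two lowerings named in this patch can be compared on the CPU with a few lines of single-precision arithmetic. This is only a sketch: `mod_via_fract` and `mod_via_floor` are hypothetical names, and CPU float results will not reproduce the i965 error figures from the commit message, since the hardware computes the division and fract differently.

```cpp
#include <cassert>
#include <cmath>

// Old lowering (MOD_TO_FRACT): y * fract(x / y).  Any rounding error in the
// quotient ends up in fract() and is then scaled by y.
float mod_via_fract(float x, float y)
{
   assert(y != 0.0f);
   float q = x / y;
   float fract = q - std::floor(q);
   return y * fract;
}

// New lowering (MOD_TO_FLOOR): x - y * floor(x / y).  The quotient is only
// used to pick an integer, so its rounding error is not multiplied by y.
float mod_via_floor(float x, float y)
{
   assert(y != 0.0f);
   return x - y * std::floor(x / y);
}
```

Both functions agree exactly on simple inputs such as mod(7, 2); they start to differ once x / y is not exactly representable, and the difference grows with y as the commit message explains.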
[Mesa-dev] [Bug 67676] Transparent windows no longer work
https://bugs.freedesktop.org/show_bug.cgi?id=67676

Daniel Stone <dan...@fooishbar.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |e...@anholt.net

--- Comment #9 from Daniel Stone <dan...@fooishbar.org> ---
Eric, Chad, any thoughts on the above? Jonny, did you have the Mesa
implementation as well?

--
You are receiving this mail because:
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH] glsl: do not allow interface block to have name already taken
Ian,

That's what the -2 variation added by that commit does.

- Chris

On Wed, Jan 21, 2015 at 7:29 AM, Ian Romanick <i...@freedesktop.org> wrote:
> On 01/19/2015 10:55 PM, Tapani Pälli wrote:
>> Fixes currently failing Piglit case
>>    interface-blocks-name-reused-globally.vert
>>
>> Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
>> ---
>>  src/glsl/ast_to_hir.cpp | 17 ++++++++++++++++-
>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 811a955..13ddb00 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -5443,9 +5443,24 @@ ast_interface_block::hir(exec_list *instructions,
>>
>>     state->struct_specifier_depth--;
>>
>> -   if (!redeclaring_per_vertex)
>> +   if (!redeclaring_per_vertex) {
>> +      ir_variable *var;
>
> This is C++, so you can combine the declaration with the assignment
> below. I guess you've been bit by the C89 requirements elsewhere in
> Mesa. :)
>
>>        validate_identifier(this->block_name, loc, state);
>>
>> +      /* From section 4.3.9 ("Interface Blocks") of the GLSL 4.50 spec:
>> +       *
>> +       *     "Block names have no other use within a shader beyond interface
>> +       *     matching; it is a compile-time error to use a block name at global
>> +       *     scope for anything other than as a block name."
>> +       */
>> +      var = state->symbols->get_variable(this->block_name);
>> +      if (var && !var->type->is_interface()) {
>> +         _mesa_glsl_error(&loc, state, "Block name `%s' is "
>> +                          "already used in the scope.",
>> +                          this->block_name);
>> +      }
>
> This fixes the previously mentioned test case, but what about a test like
>
>     out block {
>         vec4 a;
>     } inst;
>
>     vec4 block;
>
> Looking at Chris's patch (piglit 14165586) that adds
> interface-blocks-name-reused-globally.vert, I don't see a test like
> this. Does that test already exist, and I just missed it?
>
>> +   }
>> +
>>     const glsl_type *earlier_per_vertex = NULL;
>>     if (redeclaring_per_vertex) {
>>        /* Find the previous declaration of gl_PerVertex.  If we're redeclaring
Re: [Mesa-dev] [PATCH 11/11] mesa: fix error value in GetFramebufferAttachmentParameteriv for OpenGL ES 3.0
On 01/20/2015 10:57 AM, Anuj Phogat wrote:
> On Mon, Jan 19, 2015 at 7:23 PM, Ian Romanick <i...@freedesktop.org> wrote:
>> On 01/19/2015 03:32 AM, Eduardo Lima Mitev wrote:
>>> From: Samuel Iglesias Gonsalvez <sigles...@igalia.com>
>>>
>>> Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec:
>>>
>>>  "If the default framebuffer is bound to target, then attachment must be
>>>   BACK, identifying the color buffer; DEPTH, identifying the depth buffer;
>>>   or STENCIL, identifying the stencil buffer."
>>>
>>> But the spec doesn't state which kind of error to return.
>>>
>>> GL_EXT_draw_buffers, section "Errors" says the following:
>>>
>>>  "The INVALID_ENUM error is generated by
>>>   GetFramebufferAttachmentParameteriv if the attachment parameter is not
>>>   one of the values listed in Table 4.x."
>>>
>>> Due to the ambiguity of OpenGL ES 3.0 spec on this regard and because
>>> GL_EXT_draw_buffers returns INVALID_ENUM in a related case, then change
>>> the returned error to INVALID_ENUM.
>>
>> This is another section 2.3.1 (Errors) case:
>>
>>     "If a command that requires an enumerated value is passed a
>>     symbolic constant that is not one of those specified as
>>     allowable for that command, an INVALID_ENUM error is generated."
>>
>> The ES3.1 spec is also specific on this point:
>>
>>     "An INVALID_ENUM error is generated by any combinations of
>>     framebuffer type and pname not described above."
>>
>> I'm not sure why we were generating INVALID_OPERATION previously. This
>> came from
>>
>> commit 2f2801f876a4c637273bd3ddefb8a5b7a840e604
>> Author: Anuj Phogat <anuj.pho...@gmail.com>
>> Date:   Tue Dec 11 20:08:13 2012 -0800
>>
>>     mesa: Fix GL error generation in
>>     _mesa_GetFramebufferAttachmentParameteriv()
>>
>>     This allows query on default framebuffer in
>>     glGetFramebufferAttachmentParameteriv() for gles3. Fixes unexpected
>>     GL errors in gles3 conformance test case:
>>     framebuffer_blit_functionality_multisampled_to_singlesampled_blit
>>
>>     V2: Use _mesa_is_gles3() check to restrict allowed attachment types
>>         to specific APIs.
>>
>>     Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
>>     Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
>>     Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
>>
>> So... I wonder if
>> framebuffer_blit_functionality_multisampled_to_singlesampled_blit checks
>> for INVALID_OPERATION here. Either that or the error was just
>> copy-and-pasted from the previous case. Anuj, do you recall?
>
> framebuffer_blit_functionality_multisampled_to_singlesampled_blit
> continued passing if we change the error to GL_INVALID_ENUM. Seems
> like I copy pasted the gles 2.0 error from previous case :(.

Okay. Thanks for the info. This patch is

Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>

>>> More info:
>>>
>>> http://lists.freedesktop.org/archives/mesa-dev/2015-January/074447.html
>>>
>>> Fixes:
>>>
>>> dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo
>>>
>>> Signed-off-by: Samuel Iglesias Gonsalvez <sigles...@igalia.com>
>>> ---
>>>  src/mesa/main/fbobject.c | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
>>> index 80dc353..99ecee8 100644
>>> --- a/src/mesa/main/fbobject.c
>>> +++ b/src/mesa/main/fbobject.c
>>> @@ -2790,7 +2790,7 @@ _mesa_GetFramebufferAttachmentParameteriv(GLenum target, GLenum attachment,
>>>        if (_mesa_is_gles3(ctx) && attachment != GL_BACK &&
>>>            attachment != GL_DEPTH && attachment != GL_STENCIL) {
>>> -         _mesa_error(ctx, GL_INVALID_OPERATION,
>>> +         _mesa_error(ctx, GL_INVALID_ENUM,
>>>                       "glGetFramebufferAttachmentParameteriv(attachment)");
>>>           return;
>>>        }
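The rule the thread converges on (on GLES3 with the default framebuffer bound, any attachment other than BACK, DEPTH or STENCIL is an INVALID_ENUM error) can be written as a small pure function. The sketch below uses hypothetical enum types instead of real GL constants so that it stays self-contained; it is not the fbobject.c code.

```cpp
// Hypothetical stand-ins for GL error codes and attachment enums.
enum class gl_error_sketch { none, invalid_enum, invalid_operation };
enum class attachment_sketch { back, depth, stencil, color_attachment0 };

// Error to raise when querying an attachment of the default framebuffer.
gl_error_sketch
check_default_fbo_attachment(bool is_gles3, attachment_sketch attachment)
{
   if (is_gles3 &&
       attachment != attachment_sketch::back &&
       attachment != attachment_sketch::depth &&
       attachment != attachment_sketch::stencil) {
      // A disallowed enumerant, so INVALID_ENUM, not INVALID_OPERATION.
      return gl_error_sketch::invalid_enum;
   }
   return gl_error_sketch::none;
}
```

Only the GLES3 branch is modelled; what other APIs accept for the default framebuffer is outside this sketch.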
Re: [Mesa-dev] [PATCH] glsl: do not allow interface block to have name already taken
On 01/19/2015 10:55 PM, Tapani Pälli wrote:
> Fixes currently failing Piglit case
>    interface-blocks-name-reused-globally.vert
>
> Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
> ---
>  src/glsl/ast_to_hir.cpp | 17 ++++++++++++++++-
>  1 file changed, 16 insertions(+), 1 deletion(-)
>
> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
> index 811a955..13ddb00 100644
> --- a/src/glsl/ast_to_hir.cpp
> +++ b/src/glsl/ast_to_hir.cpp
> @@ -5443,9 +5443,24 @@ ast_interface_block::hir(exec_list *instructions,
>
>     state->struct_specifier_depth--;
>
> -   if (!redeclaring_per_vertex)
> +   if (!redeclaring_per_vertex) {
> +      ir_variable *var;

This is C++, so you can combine the declaration with the assignment
below. I guess you've been bit by the C89 requirements elsewhere in
Mesa. :)

>        validate_identifier(this->block_name, loc, state);
>
> +      /* From section 4.3.9 ("Interface Blocks") of the GLSL 4.50 spec:
> +       *
> +       *     "Block names have no other use within a shader beyond interface
> +       *     matching; it is a compile-time error to use a block name at global
> +       *     scope for anything other than as a block name."
> +       */
> +      var = state->symbols->get_variable(this->block_name);
> +      if (var && !var->type->is_interface()) {
> +         _mesa_glsl_error(&loc, state, "Block name `%s' is "
> +                          "already used in the scope.",
> +                          this->block_name);
> +      }

This fixes the previously mentioned test case, but what about a test like

    out block {
        vec4 a;
    } inst;

    vec4 block;

Looking at Chris's patch (piglit 14165586) that adds
interface-blocks-name-reused-globally.vert, I don't see a test like
this. Does that test already exist, and I just missed it?

> +   }
> +
>     const glsl_type *earlier_per_vertex = NULL;
>     if (redeclaring_per_vertex) {
>        /* Find the previous declaration of gl_PerVertex.  If we're redeclaring
Re: [Mesa-dev] [PATCH 11/11] mesa: fix error value in GetFramebufferAttachmentParameteriv for OpenGL ES 3.0
On Mon, Jan 19, 2015 at 7:23 PM, Ian Romanick <i...@freedesktop.org> wrote:
> On 01/19/2015 03:32 AM, Eduardo Lima Mitev wrote:
>> From: Samuel Iglesias Gonsalvez <sigles...@igalia.com>
>>
>> Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec:
>>
>>  "If the default framebuffer is bound to target, then attachment must be
>>   BACK, identifying the color buffer; DEPTH, identifying the depth buffer;
>>   or STENCIL, identifying the stencil buffer."
>>
>> But the spec doesn't state which kind of error to return.
>>
>> GL_EXT_draw_buffers, section "Errors" says the following:
>>
>>  "The INVALID_ENUM error is generated by
>>   GetFramebufferAttachmentParameteriv if the attachment parameter is not
>>   one of the values listed in Table 4.x."
>>
>> Due to the ambiguity of OpenGL ES 3.0 spec on this regard and because
>> GL_EXT_draw_buffers returns INVALID_ENUM in a related case, then change
>> the returned error to INVALID_ENUM.
>
> This is another section 2.3.1 (Errors) case:
>
>     "If a command that requires an enumerated value is passed a
>     symbolic constant that is not one of those specified as
>     allowable for that command, an INVALID_ENUM error is generated."
>
> The ES3.1 spec is also specific on this point:
>
>     "An INVALID_ENUM error is generated by any combinations of
>     framebuffer type and pname not described above."
>
> I'm not sure why we were generating INVALID_OPERATION previously. This
> came from
>
> commit 2f2801f876a4c637273bd3ddefb8a5b7a840e604
> Author: Anuj Phogat <anuj.pho...@gmail.com>
> Date:   Tue Dec 11 20:08:13 2012 -0800
>
>     mesa: Fix GL error generation in
>     _mesa_GetFramebufferAttachmentParameteriv()
>
>     This allows query on default framebuffer in
>     glGetFramebufferAttachmentParameteriv() for gles3. Fixes unexpected
>     GL errors in gles3 conformance test case:
>     framebuffer_blit_functionality_multisampled_to_singlesampled_blit
>
>     V2: Use _mesa_is_gles3() check to restrict allowed attachment types
>         to specific APIs.
>
>     Signed-off-by: Anuj Phogat <anuj.pho...@gmail.com>
>     Reviewed-by: Kenneth Graunke <kenn...@whitecape.org>
>     Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>
>
> So... I wonder if
> framebuffer_blit_functionality_multisampled_to_singlesampled_blit checks
> for INVALID_OPERATION here. Either that or the error was just
> copy-and-pasted from the previous case. Anuj, do you recall?

framebuffer_blit_functionality_multisampled_to_singlesampled_blit
continued passing if we change the error to GL_INVALID_ENUM. Seems
like I copy pasted the gles 2.0 error from previous case :(.

>> More info:
>>
>> http://lists.freedesktop.org/archives/mesa-dev/2015-January/074447.html
>>
>> Fixes:
>>
>> dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo
>>
>> Signed-off-by: Samuel Iglesias Gonsalvez <sigles...@igalia.com>
>> ---
>>  src/mesa/main/fbobject.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
>> index 80dc353..99ecee8 100644
>> --- a/src/mesa/main/fbobject.c
>> +++ b/src/mesa/main/fbobject.c
>> @@ -2790,7 +2790,7 @@ _mesa_GetFramebufferAttachmentParameteriv(GLenum target, GLenum attachment,
>>        if (_mesa_is_gles3(ctx) && attachment != GL_BACK &&
>>            attachment != GL_DEPTH && attachment != GL_STENCIL) {
>> -         _mesa_error(ctx, GL_INVALID_OPERATION,
>> +         _mesa_error(ctx, GL_INVALID_ENUM,
>>                       "glGetFramebufferAttachmentParameteriv(attachment)");
>>           return;
>>        }
Re: [Mesa-dev] [PATCH] glsl: do not allow interface block to have name already taken
On 01/20/2015 10:42 AM, Chris Forbes wrote:
> Ian,
>
> That's what the -2 variation added by that commit does.

Of course. :) Not sure how I overlooked that.

> - Chris
>
> On Wed, Jan 21, 2015 at 7:29 AM, Ian Romanick <i...@freedesktop.org> wrote:
>> On 01/19/2015 10:55 PM, Tapani Pälli wrote:
>>> Fixes currently failing Piglit case
>>>    interface-blocks-name-reused-globally.vert
>>>
>>> Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
>>> ---
>>>  src/glsl/ast_to_hir.cpp | 17 ++++++++++++++++-
>>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>>> index 811a955..13ddb00 100644
>>> --- a/src/glsl/ast_to_hir.cpp
>>> +++ b/src/glsl/ast_to_hir.cpp
>>> @@ -5443,9 +5443,24 @@ ast_interface_block::hir(exec_list *instructions,
>>>
>>>     state->struct_specifier_depth--;
>>>
>>> -   if (!redeclaring_per_vertex)
>>> +   if (!redeclaring_per_vertex) {
>>> +      ir_variable *var;
>>
>> This is C++, so you can combine the declaration with the assignment
>> below. I guess you've been bit by the C89 requirements elsewhere in
>> Mesa. :)
>>
>>>        validate_identifier(this->block_name, loc, state);
>>>
>>> +      /* From section 4.3.9 ("Interface Blocks") of the GLSL 4.50 spec:
>>> +       *
>>> +       *     "Block names have no other use within a shader beyond interface
>>> +       *     matching; it is a compile-time error to use a block name at global
>>> +       *     scope for anything other than as a block name."
>>> +       */
>>> +      var = state->symbols->get_variable(this->block_name);
>>> +      if (var && !var->type->is_interface()) {
>>> +         _mesa_glsl_error(&loc, state, "Block name `%s' is "
>>> +                          "already used in the scope.",
>>> +                          this->block_name);
>>> +      }
>>
>> This fixes the previously mentioned test case, but what about a test like
>>
>>     out block {
>>         vec4 a;
>>     } inst;
>>
>>     vec4 block;
>>
>> Looking at Chris's patch (piglit 14165586) that adds
>> interface-blocks-name-reused-globally.vert, I don't see a test like
>> this. Does that test already exist, and I just missed it?

Since interface-blocks-name-reused-globally-2.vert already passes, this
patch is

Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>

>>> +   }
>>> +
>>>     const glsl_type *earlier_per_vertex = NULL;
>>>     if (redeclaring_per_vertex) {
>>>        /* Find the previous declaration of gl_PerVertex.  If we're redeclaring
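The check reviewed in this thread is a symbol-table lookup followed by a type test. A minimal sketch, assuming a toy symbol table (`symbol_sketch`, `symbol_table_sketch` and `block_name_already_taken` are hypothetical; Mesa's glsl_symbol_table and ir_variable are not used):

```cpp
#include <map>
#include <string>

// Hypothetical symbol entry: all the check needs is "is this an interface?".
struct symbol_sketch {
   bool is_interface;
};

// Hypothetical global-scope symbol table.
struct symbol_table_sketch {
   std::map<std::string, symbol_sketch> variables;

   // Returns NULL when the name is not declared.
   const symbol_sketch *get_variable(const std::string &name) const
   {
      auto it = variables.find(name);
      return it == variables.end() ? nullptr : &it->second;
   }
};

// Mirrors the patch's condition: the name exists and is not an interface.
bool block_name_already_taken(const symbol_table_sketch &symbols,
                              const std::string &block_name)
{
   const symbol_sketch *var = symbols.get_variable(block_name);
   return var != nullptr && !var->is_interface;
}
```

This covers a block declared after a same-named variable. The opposite order that Ian asks about is exercised by the separate -2 test mentioned in the replies, which is reported as already passing.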
Re: [Mesa-dev] [PATCH] nv50, nvc0: Report correct caps for pixel center integer
I've done some tests. After it was confirmed that the 0.5 subtraction
done in gallium nine is correct, I tried setting the rasterizer's
half_pixel_center to 1; this way, with the 0.5 subtraction, rendering is
fine, as it is on radeon.

With half_pixel_center set to 0 instead, the default behavior in shaders
seems to be fragcoord with integer pixel centers.

So I'm guessing it is more like a D3D/OGL switch than an on/off setting,
and it switches between the default behaviors of the two APIs.

2015-01-20 0:46 GMT+01:00 Ilia Mirkin <imir...@alum.mit.edu>:
> Nope. But feel free to do a piglit run to prove/disprove. The fragcoord
> origin tests should start failing with your patch.
>
> On Jan 19, 2015 3:41 PM, Tiziano Bacocco <tizb...@gmail.com> wrote:
>> Could it be that it affects also fragcoord? because also on rnndb it
>> says that
>>
>> 2015-01-20 0:17 GMT+01:00 Ilia Mirkin <imir...@alum.mit.edu>:
>>> Oops, dropped cc.
>>>
>>> On Jan 19, 2015 3:15 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>>>> Nope, that's for something else. This has to do with whether gl
>>>> fragcoord is at the integer or half integer coord. The other is a
>>>> rasterizer setting.
>>>>
>>>> On Jan 19, 2015 3:06 PM, Tiziano Bacocco <tizb...@gmail.com> wrote:
>>>>> On the commit
>>>>> http://cgit.freedesktop.org/mesa/mesa/commit/?id=aafd13027a38d5a2ab5d80019b282b8233c15a8a
>>>>> , the part where the supported caps are added was missing
>>>>>
>>>>> nv50,nvc0: Report correct caps for pixel center integer
>>>>>
>>>>> Signed-off-by: Tiziano Bacocco <tizb...@gmail.com>
>>>>>
>>>>> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_screen.c b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>>>> index da237f9..d6b8d34 100644
>>>>> --- a/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>>>> +++ b/src/gallium/drivers/nouveau/nv50/nv50_screen.c
>>>>> @@ -157,6 +157,7 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>>>>     case PIPE_CAP_INDEP_BLEND_ENABLE:
>>>>>     case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
>>>>>     case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
>>>>> +   case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
>>>>>     case PIPE_CAP_PRIMITIVE_RESTART:
>>>>>     case PIPE_CAP_TGSI_INSTANCEID:
>>>>>     case PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR:
>>>>> @@ -190,7 +191,6 @@ nv50_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>>>>     /* unsupported caps */
>>>>>     case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
>>>>>     case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
>>>>> -   case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
>>>>>     case PIPE_CAP_SHADER_STENCIL_EXPORT:
>>>>>     case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
>>>>>     case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
>>>>> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>>>> index 1d7caf8..24eb4d7 100644
>>>>> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>>>> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_screen.c
>>>>> @@ -150,6 +150,7 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>>>>     case PIPE_CAP_INDEP_BLEND_FUNC:
>>>>>     case PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT:
>>>>>     case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER:
>>>>> +   case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
>>>>>     case PIPE_CAP_PRIMITIVE_RESTART:
>>>>>     case PIPE_CAP_TGSI_INSTANCEID:
>>>>>     case PIPE_CAP_VERTEX_ELEMENT_INSTANCE_DIVISOR:
>>>>> @@ -180,7 +181,6 @@ nvc0_screen_get_param(struct pipe_screen *pscreen, enum pipe_cap param)
>>>>>
>>>>>     /* unsupported caps */
>>>>>     case PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT:
>>>>> -   case PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER:
>>>>>     case PIPE_CAP_SHADER_STENCIL_EXPORT:
>>>>>     case PIPE_CAP_TGSI_CAN_COMPACT_CONSTANTS:
>>>>>     case PIPE_CAP_VERTEX_BUFFER_OFFSET_4BYTE_ALIGNED_ONLY:
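For reference, the two fragment-coordinate conventions that the caps in this thread refer to differ only by a half-pixel offset. The sketch below shows that difference and nothing more: it says nothing about the separate rasterizer half_pixel_center setting discussed above, and `vec2_sketch` and `frag_coord` are hypothetical names.

```cpp
// Hypothetical 2D vector type for the example.
struct vec2_sketch {
   float x;
   float y;
};

// Window-space coordinate reported for the pixel in column px, row py.
// Half-integer centers put the first pixel at (0.5, 0.5); integer centers
// put it at (0.0, 0.0).
vec2_sketch frag_coord(int px, int py, bool pixel_center_integer)
{
   float offset = pixel_center_integer ? 0.0f : 0.5f;
   return { px + offset, py + offset };
}
```

A state tracker that needs the convention a driver does not advertise has to apply this offset itself in the shader, which is why it matters which of the two caps a driver reports.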
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

Alberto Salvia Novella <es204904...@gmail.com> changed:

           What    |Removed                      |Added
----------------------------------------------------------------------------
          Component|Driver/AMDgpu                |EGL
           Assignee|xorg-driver-...@lists.x.org  |mesa-dev@lists.freedesktop.org
            Product|xorg                         |Mesa
         QA Contact|xorg-t...@lists.x.org        |mesa-dev@lists.freedesktop.org

--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Re: [Mesa-dev] [PATCH 09/16] i965/fs: Apply conditional mod specially to split MAD/LRP.
On Tue, Jan 20, 2015 at 12:32 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
On Monday, January 19, 2015 03:31:08 PM Matt Turner wrote:

Otherwise we'll apply the conditional mod to only one of SIMD8
instructions and trigger an assertion.

NoDDClr/NoDDChk have the same problem but we never apply those to these
instructions, so I'm leaving them for a later time.
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index ab848f1..f35da71 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1656,10 +1656,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
          brw_set_default_access_mode(p, BRW_ALIGN_16);
          if (dispatch_width == 16 && brw->gen < 8 && !brw->is_haswell) {
             brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
-            brw_MAD(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
+            brw_inst *f = brw_MAD(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
             brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
-            brw_MAD(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
+            brw_inst *s = brw_MAD(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
             brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+
+            if (inst->conditional_mod) {
+               brw_inst_set_cond_modifier(brw, f, inst->conditional_mod);
+               brw_inst_set_cond_modifier(brw, s, inst->conditional_mod);
+               inst->conditional_mod = BRW_CONDITIONAL_NONE;

Having the generator mutate the incoming IR feels dirty to me. Honestly, it
should be const...we've never changed it until now.

I see what you're trying to accomplish - bypassing the assertion failure
about conditional_mod set with more than one instruction.
Maybe add a bool multiple_instructions_allowed flag, set it to false before
the switch, set it true here, and check it later to skip the assert? Seems
ugly, but not as bad as mutating the IR.

Sure.

I think a better solution (after this series lands!) would be to generate two
MADs/LRPs at the fs_visitor level, and just emit a single instruction for each
at the generator level. We should have the infrastructure now and it'd let us
schedule them.

I really don't think adding that complication is a good idea.
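For reference, the flag idea could look roughly like the sketch below. This is not the i965 generator: ir_inst, hw_inst, generate() and the COND_* values are made-up stand-ins for fs_inst, brw_inst and the real generator switch, and only the multiple_instructions_allowed name comes from the suggestion above. The point is just that the IR can be taken by const reference while the split path still gets the modifier onto both halves.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names exist in Mesa.
enum cond_mod { COND_NONE = 0, COND_GE = 4 };

struct ir_inst {             // plays the role of fs_inst (the incoming IR)
   bool is_split_mad;        // SIMD16 MAD that must be emitted as two SIMD8 halves
   cond_mod conditional_mod;
};

struct hw_inst {             // plays the role of brw_inst (generated code)
   cond_mod cond_modifier = COND_NONE;
};

// Emits one or two hardware instructions for `inst` without modifying it.
// Returns how many entries of `out` were written.
int generate(const ir_inst &inst, hw_inst out[2])
{
   bool multiple_instructions_allowed = false;   // reset before the "switch"
   int count = 1;

   if (inst.is_split_mad) {
      // Split path: both halves get the modifier directly.
      out[0].cond_modifier = inst.conditional_mod;
      out[1].cond_modifier = inst.conditional_mod;
      count = 2;
      multiple_instructions_allowed = true;
   }

   if (inst.conditional_mod != COND_NONE && !multiple_instructions_allowed) {
      // Generic path: exactly one instruction may carry the modifier.
      assert(count == 1);
      out[0].cond_modifier = inst.conditional_mod;
   }
   return count;
}
```

Because generate() only reads inst, the caller's conditional_mod is unchanged afterwards, which is the property being asked for.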
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

--- Comment #11 from Alberto Salvia Novella <es204904...@gmail.com> ---
Does fglrx depend on Mesa? How can I figure out if this is a bug in Mesa?
Re: [Mesa-dev] [PATCH 10/16] i965/fs: Eliminate null-dst instructions without side-effects.
On Tue, Jan 20, 2015 at 12:53 AM, Kenneth Graunke <kenn...@whitecape.org> wrote:
On Monday, January 19, 2015 03:31:09 PM Matt Turner wrote:
---
 src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
index 81be4de..d66808b 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
@@ -85,6 +85,17 @@ fs_visitor::dead_code_eliminate()
             }
          }
 
+         if ((inst->opcode != BRW_OPCODE_IF &&
+              inst->opcode != BRW_OPCODE_WHILE) &&
+             inst->dst.is_null() &&
+             !inst->has_side_effects() &&
+             !inst->writes_flag() &&
+             !inst->writes_accumulator) {
+            inst->opcode = BRW_OPCODE_NOP;
+            progress = true;
+            continue;
+         }
+
          if (inst->dst.file == GRF) {
             if (!inst->is_partial_write()) {
                int var = live_intervals->var_from_reg(inst->dst);

Seems like these should be handled too...

- BRW_OPCODE_ELSE
- FS_OPCODE_DISCARD_JUMP
- FS_OPCODE_PLACEHOLDER_HALT
- SHADER_OPCODE_SHADER_TIME_ADD

These have BAD_FILE destinations.

- SHADER_OPCODE_GEN4_SCRATCH_READ
- SHADER_OPCODE_GEN4_SCRATCH_WRITE
- SHADER_OPCODE_GEN7_SCRATCH_READ

The READs have non-null destinations (they have to return the data
somewhere). And we only emit SCRATCH_* from spilling registers as part of
register allocation. We can't ever call dead code elimination after we've
assigned registers. (Not only do we not, but it couldn't work)

Maybe some of these should be added to has_side_effects()?

I'm kind of surprised you didn't see regressions in Piglit...maybe I'm
missing something.
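To make the condition under discussion easier to poke at, here is a small standalone model of it. Everything in it is an assumption made for illustration: the inst struct, the OP_* opcodes and both function names are invented, and the real pass works on fs_inst inside fs_visitor::dead_code_eliminate(). The extra OP_NOP check is not in the patch; it is added here only so that a second run reports no progress.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified opcodes and instruction record.
enum opcode { OP_NOP, OP_ADD, OP_CMP, OP_IF, OP_WHILE, OP_SEND };

struct inst {
   opcode op;
   bool dst_is_null;
   bool has_side_effects;
   bool writes_flag;
   bool writes_accumulator;
};

// Mirrors the shape of the condition in the patch: control flow is kept,
// and a null-destination instruction is dead only if nothing else observes it.
bool removable_null_dst(const inst &i)
{
   return i.op != OP_IF && i.op != OP_WHILE &&
          i.dst_is_null &&
          !i.has_side_effects &&
          !i.writes_flag &&
          !i.writes_accumulator;
}

// Turns every removable instruction into a NOP; returns true if any changed.
bool eliminate_null_dst(std::vector<inst> &insts)
{
   bool progress = false;
   for (inst &i : insts) {
      if (i.op != OP_NOP && removable_null_dst(i)) {
         i.op = OP_NOP;
         progress = true;
      }
   }
   return progress;
}
```

In this model an opcode such as ELSE would only survive if it is either excluded by opcode or never has a null destination, which is the question raised in the review.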
Re: [Mesa-dev] [PATCH 09/11 v2] mesa: Allow querying for GL_PRIMITIVE_RESTART_FIXED_INDEX under GLES 3
On 01/20/2015 04:58 AM, Eduardo Lima Mitev wrote:

GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX
(2.8.1 Transferring Array Elements, page 26) which is not currently possible
to query using glGet*() funcs.

Fixes 4 dEQP tests:
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat

Reviewed-by: Ian Romanick <ian.d.roman...@intel.com>

---
 src/mesa/main/get_hash_params.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py
index c487e98..c79ca45 100644
--- a/src/mesa/main/get_hash_params.py
+++ b/src/mesa/main/get_hash_params.py
@@ -343,6 +343,7 @@ descriptor=[
 # GL_ARB_ES3_compatibility
   [ "MAX_ELEMENT_INDEX", "CONTEXT_INT64(Const.MaxElementIndex)", "extra_ARB_ES3_compatibility_api_es3"],
+  [ "PRIMITIVE_RESTART_FIXED_INDEX", "CONTEXT_BOOL(Array.PrimitiveRestartFixedIndex)", "extra_ARB_ES3_compatibility_api_es3" ],
 
 # GL_ARB_fragment_shader
   [ "MAX_FRAGMENT_UNIFORM_COMPONENTS_ARB", "CONTEXT_INT(Const.Program[MESA_SHADER_FRAGMENT].MaxUniformComponents)", "extra_ARB_fragment_shader" ],
Re: [Mesa-dev] [PATCH 02/16] glsl: Add a foreach_in_list_reverse_safe macro.
On Tue, Jan 20, 2015 at 6:16 AM, Connor Abbott <cwabbo...@gmail.com> wrote:

Shouldn't the subject say "exec_list: Add a foreach_in_list_reverse_safe
macro."?

Meh? Shortlog on this file shows glsl2, exec_list, glsl, glsl/list.
Re: [Mesa-dev] [mesa 8/9] glx/dri2: Move the wait after SwapBuffers into the next GetBuffers
On 01/19/2015 03:00 AM, Chris Wilson wrote:

As the SBC from the reply from SwapBuffers is not used immediately and can be
easily determined by counting the number of SwapBuffers requests made by the
client, we can defer the synchronisation point to the pending GetBuffers round
trip. (We force the invalidation event in order to require the GetBuffers
round trip - we know that all X servers will send the invalidation after
SwapBuffers and that invalidation will arrive before the SwapBuffers reply, so
no extra roundtrips are forced.)

This is a pretty big change in behavior. How much testing has it received?
I'm nervous that applications that work fine today could misbehave in subtle
ways.

How much work would it be to add a way to disable the new behavior, perhaps
via an environment variable, at run-time? That would make it much easier for
users to determine whether this change was responsible for a problem in some
game we don't have.

Additional comments in-line below.

An important side-effect is demonstrated in the next patch where we can detect
when the buffers are stale before querying properties.

Signed-off-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
 src/glx/dri2_glx.c | 73 --
 1 file changed, 38 insertions(+), 35 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 462d560..0577804 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -93,6 +93,10 @@ struct dri2_drawable
    int have_back;
    int have_fake_front;
    int swap_interval;
+   int swap_pending;
+
+   xcb_dri2_swap_buffers_cookie_t swap_buffers_cookie;
+   int64_t last_swap_sbc;
 
    uint64_t previous_time;
    unsigned frames;
@@ -778,50 +782,51 @@ static void show_fps(struct dri2_drawable *draw)
    }
 }
 
-static int64_t
-dri2XcbSwapBuffers(Display *dpy,
-                   __GLXDRIdrawable *pdraw,
+static void
+dri2XcbSwapBuffers(struct dri2_drawable *priv,
                    int64_t target_msc,
                    int64_t divisor,
                    int64_t remainder)
 {
-   xcb_dri2_swap_buffers_cookie_t swap_buffers_cookie;
-   xcb_dri2_swap_buffers_reply_t *swap_buffers_reply;
+   xcb_connection_t *c = XGetXCBConnection(priv->base.psc->dpy);
    uint32_t target_msc_hi, target_msc_lo;
    uint32_t divisor_hi, divisor_lo;
    uint32_t remainder_hi, remainder_lo;
-   int64_t ret = 0;
-   xcb_connection_t *c = XGetXCBConnection(dpy);
 
    split_counter(target_msc, &target_msc_hi, &target_msc_lo);
    split_counter(divisor, &divisor_hi, &divisor_lo);
    split_counter(remainder, &remainder_hi, &remainder_lo);
 
-   swap_buffers_cookie =
-      xcb_dri2_swap_buffers_unchecked(c, pdraw->xDrawable,
+   priv->swap_buffers_cookie =
+      xcb_dri2_swap_buffers_unchecked(c, priv->base.xDrawable,
                                       target_msc_hi, target_msc_lo,
                                       divisor_hi, divisor_lo,
                                       remainder_hi, remainder_lo);
+   xcb_flush(c);
+   priv->swap_pending++;
 
-   /* Immediately wait on the swapbuffers reply. If we didn't, we'd have
-    * to do so some time before reusing a (non-pageflipped) backbuffer.
-    * Otherwise, the new rendering could get ahead of the X Server's
-    * dispatch of the swapbuffer and you'd display garbage.
-    *
-    * We use XSync() first to reap the invalidate events through the event
-    * filter, to ensure that the next drawing doesn't use an invalidated
-    * buffer.
-    */
-   XSync(dpy, False);
+   /* Force a synchronous completion prior to the next rendering */
+   dri2InvalidateBuffers(priv->base.psc->dpy, priv->base.xDrawable);
+}
+
+static void dri2XcbSwapBuffersComplete(struct dri2_drawable *priv)
+{
+   xcb_dri2_swap_buffers_reply_t *swap_buffers_reply;
+
+   if (!priv->swap_pending)
+      return;
+
+   priv->swap_pending = 0;

Is this actually right? dri2XcbSwapBuffers increments the count, do we know
the single reply will be for all pending swaps?

    swap_buffers_reply =
-      xcb_dri2_swap_buffers_reply(c, swap_buffers_cookie, NULL);
-   if (swap_buffers_reply) {
-      ret = merge_counter(swap_buffers_reply->swap_hi,
-                          swap_buffers_reply->swap_lo);
-      free(swap_buffers_reply);
-   }
-   return ret;
+      xcb_dri2_swap_buffers_reply(XGetXCBConnection(priv->base.psc->dpy),
+                                  priv->swap_buffers_cookie, NULL);
+   if (swap_buffers_reply == NULL)
+      return;
+
+   priv->last_swap_sbc = merge_counter(swap_buffers_reply->swap_hi,
+                                       swap_buffers_reply->swap_lo);
+   free(swap_buffers_reply);
 }
 
 static int64_t
@@ -833,11 +838,10 @@ dri2SwapBuffers(__GLXDRIdrawable *pdraw, int64_t target_msc, int64_t divisor,
    struct dri2_screen *psc = (struct dri2_screen *) priv->base.psc;
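The question about swap_pending can be made concrete with a toy model of the bookkeeping in the patch. This is only a sketch under loud assumptions: fake_server is an invented stand-in for the X server and XCB reply queue, and it does not model the forced invalidation, which may well prevent two swaps from being outstanding in practice. The field names swap_pending, swap_buffers_cookie and last_swap_sbc are taken from the patch; everything else is hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Toy stand-in for the server side: every swap gets a cookie and a reply
// carrying the new swap buffer count (SBC).
struct fake_server {
   uint32_t next_cookie = 1;
   int64_t sbc = 0;
   std::map<uint32_t, int64_t> replies;   // replies not yet collected

   uint32_t swap_buffers() {
      uint32_t cookie = next_cookie++;
      replies[cookie] = ++sbc;
      return cookie;
   }

   bool take_reply(uint32_t cookie, int64_t *out) {
      auto it = replies.find(cookie);
      if (it == replies.end())
         return false;
      *out = it->second;
      replies.erase(it);
      return true;
   }
};

// Same three fields the patch adds to struct dri2_drawable.
struct drawable {
   int swap_pending = 0;
   uint32_t swap_buffers_cookie = 0;
   int64_t last_swap_sbc = 0;
};

void swap(fake_server &srv, drawable &d)
{
   d.swap_buffers_cookie = srv.swap_buffers();   // overwrites any earlier cookie
   d.swap_pending++;
}

void swap_complete(fake_server &srv, drawable &d)
{
   if (!d.swap_pending)
      return;
   d.swap_pending = 0;

   int64_t sbc;
   if (srv.take_reply(d.swap_buffers_cookie, &sbc))
      d.last_swap_sbc = sbc;
}
```

In this model, two swaps followed by one swap_complete() leave last_swap_sbc at the newest value, but the reply to the first swap is never read. Whether that situation can arise in the real code is exactly what the review comment is asking.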
Re: [Mesa-dev] [PATCH 4/4] i965: Implemente a tiled fast-path for glReadPixels and glGetTexImage
On Tue, Jan 13, 2015 at 10:44 AM, Chad Versace <chad.vers...@intel.com> wrote:
On 01/12/2015 10:22 AM, Jason Ekstrand wrote:

From: Sisinty Sasmita Patra <sisinty.pa...@intel.com>

Added intel_readpixels_tiled_memcpy and intel_gettexsubimage_tiled_memcpy
functions. These are the fast paths for glReadPixels and glGetTexImage.

On chrome, using the RoboHornet 2D Canvas toDataURL test, this patch cuts
amount of time spent in glReadPixels by more than half and reduces the time
of the entire test by 10%.

v2: Jason Ekstrand <jason.ekstr...@intel.com>
 - Refactor to make the functions look more like the old
   intel_tex_subimage_tiled_memcpy
 - Don't export the readpixels_tiled_memcpy function
 - Fix some pointer arithmetic bugs in partial image downloads (using
   ReadPixels with a non-zero x or y offset)
 - Fix a bug when ReadPixels is performed on an FBO wrapping a texture
   miplevel other than zero.

v3: Jason Ekstrand <jason.ekstr...@intel.com>
 - Better documentation for the *_tiled_memcpy functions
 - Add target restrictions for renderbuffers wrapping textures

v4: Jason Ekstrand <jason.ekstr...@intel.com>
 - Only check the return value of brw_bo_map for error and not bo->virtual

Signed-off-by: Jason Ekstrand <jason.ekstr...@intel.com>
---
 src/mesa/drivers/dri/i965/intel_pixel_read.c | 142 ++-
 src/mesa/drivers/dri/i965/intel_tex.h        |   9 ++
 src/mesa/drivers/dri/i965/intel_tex_image.c  | 139 +-
 3 files changed, 287 insertions(+), 3 deletions(-)

+/**
+ * \brief A fast path for glReadPixels
+ *
+ * This fast path is taken when the source format is BGRA, RGBA,
+ * A or L and when the texture memory is X- or Y-tiled. It downloads
+ * the source data by directly mapping the memory without a GTT fence.
+ * This then needs to be de-tiled on the CPU before presenting the data to
+ * the user in the linear fashion.
+ *
+ * This is a performance win over the conventional texture download path.
+ * In the conventional texture download path, the texture is either mapped
+ * through the GTT or copied to a linear buffer with the blitter before
+ * handing off to a software path. This allows us to avoid round-tripping
+ * through the GPU (in the case where we would be blitting) and do only a
+ * single copy operation.
+ */
+static bool
+intel_readpixels_tiled_memcpy(struct gl_context * ctx,
+                              GLint xoffset, GLint yoffset,
+                              GLsizei width, GLsizei height,
+                              GLenum format, GLenum type,
+                              GLvoid * pixels,
+                              const struct gl_pixelstore_attrib *pack)
+{

[snip]

+
+   if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp))
+      return false;

The use of intel_get_memcpy here is surprising and relies on a lucky
coincidence, if I understand this function correctly (which maybe I don't).
Whether the driver is copying data *to* or *from* the miptree,
intel_get_memcpy is given the same parameter values. In other words, the
'tiledFormat' and 'format' parameters of intel_get_memcpy are symmetric.

Please add a comment somewhere (maybe in intel_get_memcpy's Doxygen, maybe
somewhere else is better) that documents that symmetry.

Sure, I can do that. Yes, the symmetry is a bit subtle and relies on the
fact that the only two copy functions that are returned are memcpy and one
for bgra, both of which are symmetric. Well, more to the point, they are
their own inverse. I'll add a comment to intel_get_memcpy to that effect.

In particular, I'll add the following comment to get_memcpy:

 * The only two possible copy functions which are ever returned are a
 * direct memcpy and a RGBA <-> BGRA copy function. Since RGBA -> BGRA and
 * BGRA -> RGBA are exactly the same operation (and memcpy is obviously
 * symmetric), it doesn't matter whether the copy is from the tiled image
 * to the untiled or vice versa. The copy function required is the same in
 * either case so this function can be used.

If you don't mind, I'll squash that in to the commit where we add the
tiled_to_linear paths.
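The "own inverse" point can be checked with a few lines of standalone code. This is a sketch, not the driver's copy routine: swap_rb_copy is a made-up name, and it assumes 4 bytes per pixel with the red and blue channels in byte positions 0 and 2.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `bytes` of 4-byte pixels from src to dst while swapping channels
// 0 and 2, i.e. RGBA -> BGRA or BGRA -> RGBA (the same byte shuffle).
void *swap_rb_copy(void *dst, const void *src, size_t bytes)
{
   const uint8_t *s = static_cast<const uint8_t *>(src);
   uint8_t *d = static_cast<uint8_t *>(dst);

   assert(bytes % 4 == 0);
   for (size_t i = 0; i < bytes; i += 4) {
      d[i + 0] = s[i + 2];
      d[i + 1] = s[i + 1];
      d[i + 2] = s[i + 0];
      d[i + 3] = s[i + 3];
   }
   return dst;
}
```

Applying swap_rb_copy twice gives back the original bytes, so the same function pointer works whether the tiled image is the source or the destination of the copy.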
+/**
+ * \brief A fast path for glGetTexImage.
+ *
+ * This fast path is taken when the source format is BGRA, RGBA,
+ * A or L and when the texture memory is X- or Y-tiled. It downloads
+ * the source data by directly mapping the memory without a GTT fence.
+ * This then needs to be de-tiled on the CPU before presenting the data to
+ * the user in the linear fashion.
+ *
+ * This is a performance win over the conventional texture download path.
+ * In the conventional texture download path, the texture is either mapped
+ * through the GTT or copied to a linear buffer with the blitter before
+ * handing off to a software path. This allows us to avoid round-tripping
+ * through the
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

--- Comment #13 from Alberto Salvia Novella <es204904...@gmail.com> ---
So why is this happening also with the proprietary driver?

And about upgrading using the PPA, I have just done it. If in three weeks the
problem doesn't reproduce, I will report it as gone in git version.
Re: [Mesa-dev] [PATCH 12/16] i965/fs: Add unit tests for cmod propagation pass.
I had comments on 11 and 15, but 12-14 are
Reviewed-by: Jason Ekstrand <jason.ekstr...@intel.com>

On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner <matts...@gmail.com> wrote:
---
 src/mesa/drivers/dri/i965/Makefile.am              |   7 +
 .../drivers/dri/i965/test_fs_cmod_propagation.cpp  | 271 +
 2 files changed, 278 insertions(+)
 create mode 100644 src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp

diff --git a/src/mesa/drivers/dri/i965/Makefile.am b/src/mesa/drivers/dri/i965/Makefile.am
index 0420507..b74c7d7 100644
--- a/src/mesa/drivers/dri/i965/Makefile.am
+++ b/src/mesa/drivers/dri/i965/Makefile.am
@@ -52,6 +52,7 @@ TEST_LIBS = \
         ../common/libdri_test_stubs.la
 
 TESTS = \
+        test_fs_cmod_propagation \
         test_eu_compact \
         test_vf_float_conversions \
         test_vec4_copy_propagation \
@@ -59,6 +60,12 @@ TESTS = \
 
 check_PROGRAMS = $(TESTS)
 
+test_fs_cmod_propagation_SOURCES = \
+        test_fs_cmod_propagation.cpp
+test_fs_cmod_propagation_LDADD = \
+        $(TEST_LIBS) \
+        $(top_builddir)/src/gtest/libgtest.la
+
 test_vf_float_conversions_SOURCES = \
         test_vf_float_conversions.cpp
 test_vf_float_conversions_LDADD = \
diff --git a/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
new file mode 100644
index 000..daac9e6
--- /dev/null
+++ b/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
@@ -0,0 +1,271 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ */
+
+#include <gtest/gtest.h>
+#include "brw_fs.h"
+#include "brw_cfg.h"
+#include "program/program.h"
+
+class cmod_propagation_test : public ::testing::Test {
+   virtual void SetUp();
+
+public:
+   struct brw_context *brw;
+   struct gl_context *ctx;
+   struct brw_wm_prog_data *prog_data;
+   struct gl_shader_program *shader_prog;
+   struct brw_fragment_program *fp;
+   fs_visitor *v;
+};
+
+class cmod_propagation_fs_visitor : public fs_visitor
+{
+public:
+   cmod_propagation_fs_visitor(struct brw_context *brw,
+                               struct brw_wm_prog_data *prog_data,
+                               struct gl_shader_program *shader_prog)
+      : fs_visitor(brw, NULL, NULL, prog_data, shader_prog, NULL, 8) {}
+};
+
+
+void cmod_propagation_test::SetUp()
+{
+   brw = (struct brw_context *)calloc(1, sizeof(*brw));
+   ctx = &brw->ctx;
+
+   fp = ralloc(NULL, struct brw_fragment_program);
+   prog_data = ralloc(NULL, struct brw_wm_prog_data);
+   shader_prog = ralloc(NULL, struct gl_shader_program);
+
+   v = new cmod_propagation_fs_visitor(brw, prog_data, shader_prog);
+
+   _mesa_init_fragment_program(ctx, &fp->program, GL_FRAGMENT_SHADER, 0);
+
+   brw->gen = 4;
+}
+
+static fs_inst *
+instruction(bblock_t *block, int num)
+{
+   fs_inst *inst = (fs_inst *)block->start();
+   for (int i = 0; i < num; i++) {
+      inst = (fs_inst *)inst->next;
+   }
+   return inst;
+}
+
+static bool
+cmod_propagation(fs_visitor *v)
+{
+   const bool print = false;
+
+   if (print) {
+      fprintf(stderr, "= Before =\n");
+      v->cfg->dump(v);
+   }
+
+   bool ret = v->opt_cmod_propagation();
+
+   if (print) {
+      fprintf(stderr, "\n= After =\n");
+      v->cfg->dump(v);
+   }
+
+   return ret;
+}
+
+TEST_F(cmod_propagation_test, basic)
+{
+   fs_reg dest(v, glsl_type::float_type);
+   fs_reg src0(v, glsl_type::float_type);
+   fs_reg src1(v, glsl_type::float_type);
+   fs_reg zero(0.0f);
+   v->emit(BRW_OPCODE_ADD, dest, src0, src1);
+   v->emit(BRW_OPCODE_CMP, v->reg_null_f, dest, zero)
+      ->conditional_mod = BRW_CONDITIONAL_GE;
+
+   /* = Before =
+    *
+    * 0: add(8)        dest  src0  src1
+    * 1: cmp.ge.f0(8)  null  dest  0.0f
+    *
+    * = After =
+    * 0:
Re: [Mesa-dev] [PATCH 04/16] i965/cfg: Add a foreach_block_reverse macro.
Patches 1, 3, and 4 are also
Reviewed-by: Jason Ekstrand <jason.ekstr...@intel.com>

On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner <matts...@gmail.com> wrote:
---
 src/mesa/drivers/dri/i965/brw_cfg.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_cfg.h b/src/mesa/drivers/dri/i965/brw_cfg.h
index 3b1bd16..0b60fec 100644
--- a/src/mesa/drivers/dri/i965/brw_cfg.h
+++ b/src/mesa/drivers/dri/i965/brw_cfg.h
@@ -297,6 +297,9 @@ struct cfg_t {
 #define foreach_block(__block, __cfg) \
    foreach_list_typed (bblock_t, __block, link, &(__cfg)->block_list)
 
+#define foreach_block_reverse(__block, __cfg) \
+   foreach_list_typed_reverse (bblock_t, __block, link, &(__cfg)->block_list)
+
 #define foreach_block_safe(__block, __cfg) \
    foreach_list_typed_safe (bblock_t, __block, link, &(__cfg)->block_list)
 
--
2.0.4
[Mesa-dev] [PATCH] nir: Add a nir_foreach_phi_src helper macro
---
 src/glsl/nir/nir.c                     | 4 ++--
 src/glsl/nir/nir.h                     | 3 +++
 src/glsl/nir/nir_from_ssa.c            | 4 ++--
 src/glsl/nir/nir_live_variables.c      | 2 +-
 src/glsl/nir/nir_opt_cse.c             | 4 ++--
 src/glsl/nir/nir_opt_peephole_select.c | 2 +-
 src/glsl/nir/nir_print.c               | 2 +-
 src/glsl/nir/nir_to_ssa.c              | 2 +-
 src/glsl/nir/nir_validate.c            | 2 +-
 9 files changed, 14 insertions(+), 11 deletions(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 81dec1c..89e21fd 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -731,7 +731,7 @@ rewrite_phi_preds(nir_block *block, nir_block *old_pred, nir_block *new_pred)
          break;
 
       nir_phi_instr *phi = nir_instr_as_phi(instr);
-      foreach_list_typed_safe(nir_phi_src, src, node, &phi->srcs) {
+      nir_foreach_phi_src(phi, src) {
          if (src->pred == old_pred) {
             src->pred = new_pred;
             break;
@@ -1585,7 +1585,7 @@ visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb,
 static bool
 visit_phi_src(nir_phi_instr *instr, nir_foreach_src_cb cb, void *state)
 {
-   foreach_list_typed(nir_phi_src, src, node, &instr->srcs) {
+   nir_foreach_phi_src(instr, src) {
       if (!visit_src(&src->src, cb, state))
          return false;
    }
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 8861809..f31d0e0 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -990,6 +990,9 @@ typedef struct {
    nir_src src;
 } nir_phi_src;
 
+#define nir_foreach_phi_src(phi, entry) \
+   foreach_list_typed(nir_phi_src, entry, node, &(phi)->srcs)
+
 typedef struct {
    nir_instr instr;
 
diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
index 0258699..9728b99 100644
--- a/src/glsl/nir/nir_from_ssa.c
+++ b/src/glsl/nir/nir_from_ssa.c
@@ -343,7 +343,7 @@ isolate_phi_nodes_block(nir_block *block, void *void_state)
       nir_phi_instr *phi = nir_instr_as_phi(instr);
       assert(phi->dest.is_ssa);
-      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
+      nir_foreach_phi_src(phi, src) {
          nir_parallel_copy_instr *pcopy =
             get_parallel_copy_at_end_of_block(src->pred);
          assert(pcopy);
@@ -412,7 +412,7 @@ coalesce_phi_nodes_block(nir_block *block, void *void_state)
       assert(phi->dest.is_ssa);
       merge_node *dest_node = get_merge_node(&phi->dest.ssa, state);
 
-      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
+      nir_foreach_phi_src(phi, src) {
          assert(src->src.is_ssa);
          merge_node *src_node = get_merge_node(src->src.ssa, state);
          if (src_node->set != dest_node->set)
diff --git a/src/glsl/nir/nir_live_variables.c b/src/glsl/nir/nir_live_variables.c
index f110c5e..7402dc0 100644
--- a/src/glsl/nir/nir_live_variables.c
+++ b/src/glsl/nir/nir_live_variables.c
@@ -147,7 +147,7 @@ propagate_across_edge(nir_block *pred, nir_block *succ,
          break;
       nir_phi_instr *phi = nir_instr_as_phi(instr);
 
-      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
+      nir_foreach_phi_src(phi, src) {
          if (src->pred == pred) {
             set_src_live(&src->src, live);
             break;
diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
index e7dba1d..89d78c8 100644
--- a/src/glsl/nir/nir_opt_cse.c
+++ b/src/glsl/nir/nir_opt_cse.c
@@ -99,8 +99,8 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
       if (phi1->instr.block != phi2->instr.block)
          return false;
 
-      foreach_list_typed(nir_phi_src, src1, node, &phi1->srcs) {
-         foreach_list_typed(nir_phi_src, src2, node, &phi2->srcs) {
+      nir_foreach_phi_src(phi1, src1) {
+         nir_foreach_phi_src(phi2, src2) {
             if (src1->pred == src2->pred) {
                if (!nir_srcs_equal(src1->src, src2->src))
                   return false;
diff --git a/src/glsl/nir/nir_opt_peephole_select.c b/src/glsl/nir/nir_opt_peephole_select.c
index 3e8c938..5d2f5d6 100644
--- a/src/glsl/nir/nir_opt_peephole_select.c
+++ b/src/glsl/nir/nir_opt_peephole_select.c
@@ -140,7 +140,7 @@ nir_opt_peephole_select_block(nir_block *block, void *void_state)
       memset(sel->src[0].swizzle, 0, sizeof sel->src[0].swizzle);
 
       assert(exec_list_length(&phi->srcs) == 2);
-      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
+      nir_foreach_phi_src(phi, src) {
          assert(src->pred == then_block || src->pred == else_block);
          assert(src->src.is_ssa);
 
diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
index 84bb979..1a50ae9 100644
--- a/src/glsl/nir/nir_print.c
+++ b/src/glsl/nir/nir_print.c
@@ -543,7 +543,7 @@ print_phi_instr(nir_phi_instr *instr, FILE *fp)
 {
    print_dest(&instr->dest, fp);
    fprintf(fp, " = phi ");
-   foreach_list_typed(nir_phi_src, src, node, &instr->srcs) {
+   nir_foreach_phi_src(instr, src) {
       if (&src->node != exec_list_get_head(&instr->srcs))
          fprintf(fp, ", ");
 
diff --git
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Tue, Jan 20, 2015 at 3:09 PM, Matt Turner <matts...@gmail.com> wrote:
On Tue, Jan 20, 2015 at 2:58 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner <matts...@gmail.com> wrote:

For some reason, we occasionally write the flag register with a MOV.NZ
instruction:

   add(8)          g25<1>F   -g6<0,1,0>F   g15<8,8,1>F
   cmp.l.f0(8)     g26<1>D   g25<8,8,1>F   0F
   mov.nz.f0(8)    null      g26<8,8,1>D

A MOV.NZ instruction on the result of a CMP is like comparing for equality
with true in C. It's useless. Removing it allows us to generate:

   add.l.f0(8)     null      -g6<0,1,0>F   g15<8,8,1>F

total instructions in shared programs: 5955701 -> 5951657 (-0.07%)
instructions in affected programs:     302910 -> 298866 (-1.34%)
GAINED:                                1
LOST:                                  0
---
 .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   | 23 ++--
 .../drivers/dri/i965/test_fs_cmod_propagation.cpp  | 32 ++
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
index b521350..dd89512 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
@@ -57,12 +57,20 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
    foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
       ip--;
 
-      if (inst->opcode != BRW_OPCODE_CMP ||
+      if ((inst->opcode != BRW_OPCODE_CMP &&
+           inst->opcode != BRW_OPCODE_MOV) ||
           inst->predicate != BRW_PREDICATE_NONE ||
           !inst->dst.is_null() ||
           inst->src[0].file != GRF ||
-          inst->src[0].abs ||
-          !inst->src[1].is_zero())
+          inst->src[0].abs)
+         continue;
+
+      if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
+         continue;
+
+      if (inst->opcode == BRW_OPCODE_MOV &&
+          (inst->conditional_mod != BRW_CONDITIONAL_NZ ||
+           inst->src[0].negate))

I think negate is ok here. I'm not 100% sure on the semantics of mov.nz,
but if it's a != 0 then negation shouldn't matter. If it only considers the
bottom bit then negation shouldn't matter there either.

The instruction mov.nz.f0 null src0 sets f0 if src0 != 0. Hmm, you're right.
Since we're only allowing NZ conditional modifiers we can also allow
negation. I don't think we'll ever generate that, but okay. I'll remove the
inst->src[0].negate check.

Sure we will. When we do older gens in NIR, we'll emit one of those after
every cmp. Still have to deal with the and though...
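The negation argument is easy to sanity-check outside the compiler. The sketch below assumes, as stated in the thread, that mov.nz sets the flag exactly when the (possibly negated) source is non-zero; mov_nz_flag and mov_nz_flag_f are invented names that model only the flag result, not the hardware.

```cpp
#include <cassert>
#include <cstdint>

// Flag result of a modelled `mov.nz.f0 null, (-)src0` with an integer source.
bool mov_nz_flag(int32_t src0, bool negate)
{
   // Negate in 64 bits so that INT32_MIN does not overflow in this model.
   int64_t value = negate ? -static_cast<int64_t>(src0)
                          : static_cast<int64_t>(src0);
   return value != 0;
}

// Same model for a float source; -0.0f compares equal to 0.0f.
bool mov_nz_flag_f(float src0, bool negate)
{
   float value = negate ? -src0 : src0;
   return value != 0.0f;
}
```

For every input the result is the same with and without the negate modifier, which is why dropping the src[0].negate restriction is safe as long as only the NZ conditional modifier is accepted.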
Re: [Mesa-dev] [PATCH] nir: Add a nir_foreach_phi_src helper macro
Assuming you grepped for uses of foreach_list* with nir_phi_src and made sure there were no more,

Reviewed-by: Connor Abbott cwabbott02gmail.com

On Tue, Jan 20, 2015 at 7:34 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> ---
>  src/glsl/nir/nir.c                     | 4 ++--
>  src/glsl/nir/nir.h                     | 3 +++
>  src/glsl/nir/nir_from_ssa.c            | 4 ++--
>  src/glsl/nir/nir_live_variables.c      | 2 +-
>  src/glsl/nir/nir_opt_cse.c             | 4 ++--
>  src/glsl/nir/nir_opt_peephole_select.c | 2 +-
>  src/glsl/nir/nir_print.c               | 2 +-
>  src/glsl/nir/nir_to_ssa.c              | 2 +-
>  src/glsl/nir/nir_validate.c            | 2 +-
>  9 files changed, 14 insertions(+), 11 deletions(-)
>
> diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
> index 81dec1c..89e21fd 100644
> --- a/src/glsl/nir/nir.c
> +++ b/src/glsl/nir/nir.c
> @@ -731,7 +731,7 @@ rewrite_phi_preds(nir_block *block, nir_block *old_pred, nir_block *new_pred)
>           break;
>
>        nir_phi_instr *phi = nir_instr_as_phi(instr);
> -      foreach_list_typed_safe(nir_phi_src, src, node, &phi->srcs) {
> +      nir_foreach_phi_src(phi, src) {
>           if (src->pred == old_pred) {
>              src->pred = new_pred;
>              break;
> @@ -1585,7 +1585,7 @@ visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb,
>  static bool
>  visit_phi_src(nir_phi_instr *instr, nir_foreach_src_cb cb, void *state)
>  {
> -   foreach_list_typed(nir_phi_src, src, node, &instr->srcs) {
> +   nir_foreach_phi_src(instr, src) {
>        if (!visit_src(&src->src, cb, state))
>           return false;
>     }
> diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
> index 8861809..f31d0e0 100644
> --- a/src/glsl/nir/nir.h
> +++ b/src/glsl/nir/nir.h
> @@ -990,6 +990,9 @@ typedef struct {
>     nir_src src;
>  } nir_phi_src;
>
> +#define nir_foreach_phi_src(phi, entry) \
> +   foreach_list_typed(nir_phi_src, entry, node, &(phi)->srcs)
> +
>  typedef struct {
>     nir_instr instr;
>
> diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
> index 0258699..9728b99 100644
> --- a/src/glsl/nir/nir_from_ssa.c
> +++ b/src/glsl/nir/nir_from_ssa.c
> @@ -343,7 +343,7 @@ isolate_phi_nodes_block(nir_block *block, void *void_state)
>
>        nir_phi_instr *phi = nir_instr_as_phi(instr);
>        assert(phi->dest.is_ssa);
> -      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
> +      nir_foreach_phi_src(phi, src) {
>           nir_parallel_copy_instr *pcopy =
>              get_parallel_copy_at_end_of_block(src->pred);
>           assert(pcopy);
> @@ -412,7 +412,7 @@ coalesce_phi_nodes_block(nir_block *block, void *void_state)
>        assert(phi->dest.is_ssa);
>        merge_node *dest_node = get_merge_node(&phi->dest.ssa, state);
>
> -      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
> +      nir_foreach_phi_src(phi, src) {
>           assert(src->src.is_ssa);
>           merge_node *src_node = get_merge_node(src->src.ssa, state);
>           if (src_node->set != dest_node->set)
> diff --git a/src/glsl/nir/nir_live_variables.c b/src/glsl/nir/nir_live_variables.c
> index f110c5e..7402dc0 100644
> --- a/src/glsl/nir/nir_live_variables.c
> +++ b/src/glsl/nir/nir_live_variables.c
> @@ -147,7 +147,7 @@ propagate_across_edge(nir_block *pred, nir_block *succ,
>           break;
>        nir_phi_instr *phi = nir_instr_as_phi(instr);
>
> -      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
> +      nir_foreach_phi_src(phi, src) {
>           if (src->pred == pred) {
>              set_src_live(&src->src, live);
>              break;
> diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c
> index e7dba1d..89d78c8 100644
> --- a/src/glsl/nir/nir_opt_cse.c
> +++ b/src/glsl/nir/nir_opt_cse.c
> @@ -99,8 +99,8 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2)
>        if (phi1->instr.block != phi2->instr.block)
>           return false;
>
> -      foreach_list_typed(nir_phi_src, src1, node, &phi1->srcs) {
> -         foreach_list_typed(nir_phi_src, src2, node, &phi2->srcs) {
> +      nir_foreach_phi_src(phi1, src1) {
> +         nir_foreach_phi_src(phi2, src2) {
>              if (src1->pred == src2->pred) {
>                 if (!nir_srcs_equal(src1->src, src2->src))
>                    return false;
> diff --git a/src/glsl/nir/nir_opt_peephole_select.c b/src/glsl/nir/nir_opt_peephole_select.c
> index 3e8c938..5d2f5d6 100644
> --- a/src/glsl/nir/nir_opt_peephole_select.c
> +++ b/src/glsl/nir/nir_opt_peephole_select.c
> @@ -140,7 +140,7 @@ nir_opt_peephole_select_block(nir_block *block, void *void_state)
>        memset(sel->src[0].swizzle, 0, sizeof sel->src[0].swizzle);
>
>        assert(exec_list_length(&phi->srcs) == 2);
> -      foreach_list_typed(nir_phi_src, src, node, &phi->srcs) {
> +      nir_foreach_phi_src(phi, src) {
>           assert(src->pred == then_block || src->pred == else_block);
>           assert(src->src.is_ssa);
>
> diff --git a/src/glsl/nir/nir_print.c b/src/glsl/nir/nir_print.c
> index 84bb979..1a50ae9 100644
> --- a/src/glsl/nir/nir_print.c
> +++
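The idea behind the helper is simply to hide the element type, the link field name and the list member behind one macro, so call sites only name the phi and the loop variable. A minimal standalone sketch of that pattern follows. It assumes a toy singly linked intrusive list; toy_node, toy_list, toy_foreach_typed and toy_foreach_phi_src are hypothetical names made up for illustration and are not Mesa's exec_list or NIR API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal intrusive list, not Mesa's exec_list. */
struct toy_node { struct toy_node *next; };
struct toy_list { struct toy_node *head; };

/* Recover the enclosing struct from a pointer to its embedded link. */
#define toy_container_of(ptr, type, member) \
   ((type *)((char *)(ptr) - offsetof(type, member)))

/* Generic typed iteration: the caller has to spell out type, field and list. */
#define toy_foreach_typed(type, var, member, list)                          \
   for (type *var = (list)->head                                            \
           ? toy_container_of((list)->head, type, member) : NULL;           \
        var != NULL;                                                        \
        var = var->member.next                                              \
           ? toy_container_of(var->member.next, type, member) : NULL)

struct toy_phi_src { int pred; struct toy_node node; };
struct toy_phi { struct toy_list srcs; };

/* The wrapper: call sites only mention the phi and the loop variable. */
#define toy_foreach_phi_src(phi, entry) \
   toy_foreach_typed(struct toy_phi_src, entry, node, &(phi)->srcs)

/* Returns true if any source of the phi comes from predecessor `pred`. */
static bool
toy_phi_has_pred(struct toy_phi *phi, int pred)
{
   toy_foreach_phi_src(phi, src) {
      if (src->pred == pred)
         return true;
   }
   return false;
}
```

Because the wrapper takes the phi itself, a later change to how sources are stored would only need to touch the macro, not every loop.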
Re: [Mesa-dev] [PATCH 0/5] NIR opcodes and constant folding
Connor,

You mentioned commit access. I'll try and poke someone, but if you resend with changes, I can get it pushed.

--Jason

On Tue, Jan 20, 2015 at 11:45 AM, Jason Ekstrand ja...@jlekstrand.net wrote:
> There's still some cleanups needed for 2/5. The rest is
>
> Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
>
> On Fri, Jan 16, 2015 at 1:53 PM, Connor Abbott cwabbo...@gmail.com wrote:
>> Oh, and I forgot... the series is also available at
>>
>> https://github.com/cwabbott0/mesa nir-opcodes-cleanup
>>
>> On Fri, Jan 16, 2015 at 4:46 PM, Connor Abbott cwabbo...@gmail.com wrote:
>>> Hi,
>>>
>>> This is a series I had floating around a while. The idea is to have all the opcode stuff, including constant folding, derived from a single Python file. I've cleaned it up a little by using {}-style Python formatting instead of the pile of text-replacement and regular expressions we had before for getting the constant expressions to a state where they could be compiled as C code.
>>>
>>> Connor Abbott (5):
>>>   nir: add generated file to .gitignore
>>>   nir: use Python to autogenerate opcode information
>>>   nir: add new constant folding infrastructure
>>>   nir/constant_folding: use the new constant folding infrastructure
>>>   nir/lower_vars_to_ssa: fix a bug with boolean constants
>>>
>>>  src/glsl/Makefile.am                     |  23 +-
>>>  src/glsl/Makefile.sources                |   7 +-
>>>  src/glsl/nir/.gitignore                  |   4 +
>>>  src/glsl/nir/nir.h                       |   9 -
>>>  src/glsl/nir/nir_constant_expressions.h  |  32 ++
>>>  src/glsl/nir/nir_constant_expressions.py | 320 +
>>>  src/glsl/nir/nir_lower_vars_to_ssa.c     |   2 +-
>>>  src/glsl/nir/nir_opcodes.c               |  46 ---
>>>  src/glsl/nir/nir_opcodes.h               | 366
>>>  src/glsl/nir/nir_opcodes.py              | 567 +++
>>>  src/glsl/nir/nir_opcodes_c.py            |  56 +++
>>>  src/glsl/nir/nir_opcodes_h.py            |  39 +++
>>>  src/glsl/nir/nir_opt_constant_folding.c  | 179 ++
>>>  13 files changed, 1066 insertions(+), 584 deletions(-)
>>>  create mode 100644 src/glsl/nir/.gitignore
>>>  create mode 100644 src/glsl/nir/nir_constant_expressions.h
>>>  create mode 100644 src/glsl/nir/nir_constant_expressions.py
>>>  delete mode 100644 src/glsl/nir/nir_opcodes.c
>>>  delete mode 100644 src/glsl/nir/nir_opcodes.h
>>>  create mode 100644 src/glsl/nir/nir_opcodes.py
>>>  create mode 100644 src/glsl/nir/nir_opcodes_c.py
>>>  create mode 100644 src/glsl/nir/nir_opcodes_h.py
>>>
>>> --
>>> 2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/16] i965: Convert CMP.GE -(abs)reg 0 -> CMP.Z reg 0.
On Mon, Jan 19, 2015 at 8:08 PM, Ian Romanick i...@freedesktop.org wrote:
> On 01/19/2015 03:31 PM, Matt Turner wrote:
>> total instructions in shared programs: 5952059 -> 5951603 (-0.01%)
>> instructions in affected programs:     138812 -> 138356 (-0.33%)
>> GAINED:                                1
>> LOST:                                  0
>
> Does the "glsl: Rewrite (-abs(x) >= 0) as (x == 0)" patch in my bool-optimizations-v2 tree achieve the same thing?

I have a memory of writing that patch at some point in the past and discovering that it wasn't able to get rid of everything because of annoying tree grafting problems.
Re: [Mesa-dev] [PATCH 11/16] i965/fs: Add pass to propagate conditional modifiers.
On Tue, Jan 20, 2015 at 2:56 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
>> +         if (scan_inst->dst.file == GRF &&
>> +             scan_inst->dst.reg == inst->src[0].reg &&
>> +             scan_inst->dst.reg_offset == inst->src[0].reg_offset &&
>> +             !scan_inst->is_partial_write()) {
>
> I don't think this is quite the right condition. We want to do the replacement under these conditions but if it's a partial write we want to break without replacement. In other words, we want to break whenever something *may* have touched it and only consider it as a substitute for the CMP if they overwrote it entirely.

Ah, yes. I'll start the block with

   if (scan_inst->is_partial_write())
      break;

Thanks.
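The rule agreed on here (stop at the first instruction that may have touched the register, and only substitute when it overwrote the register entirely) can be sketched outside the driver. This is a rough illustration under stated assumptions: toy_inst and find_cmod_candidate are hypothetical names, a register is reduced to a single integer, and the flag-register checks of the real pass are left out.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction record; not the real fs_inst. */
struct toy_inst {
   int dst_reg;         /* register written, or -1 for a null destination */
   bool partial_write;  /* true if only part of dst_reg is overwritten */
};

/*
 * Scan backwards from the instruction just before insts[cmp_idx], looking
 * for the producer of src_reg. Returns the index of an instruction that
 * could absorb the comparison, or -1 if the scan has to give up.
 */
static int
find_cmod_candidate(const struct toy_inst *insts, int cmp_idx, int src_reg)
{
   for (int i = cmp_idx - 1; i >= 0; i--) {
      if (insts[i].dst_reg != src_reg)
         continue;      /* did not touch the register, keep scanning */

      if (insts[i].partial_write)
         return -1;     /* may have touched it: stop without replacing */

      return i;         /* overwrote it entirely: usable as a substitute */
   }
   return -1;
}
```

The important property is that a partial write ends the scan instead of being skipped, so an older full write further up is never picked by mistake.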
Re: [Mesa-dev] [PATCH 02/16] glsl: Add a foreach_in_list_reverse_safe macro.
On Tue, Jan 20, 2015 at 1:15 PM, Matt Turner matts...@gmail.com wrote:
> On Tue, Jan 20, 2015 at 6:16 AM, Connor Abbott cwabbo...@gmail.com wrote:
>> Shouldn't the subject say "exec_list: Add a foreach_in_list_reverse_safe macro."?
>
> Meh? Shortlog on this file shows glsl2, exec_list, glsl, glsl/list.

Well, given that this stuff is used outside of src/glsl, it would make more sense to use something that at least has "list" in the name so people can separate it from stuff that's purely specific to GLSL IR.
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

--- Comment #14 from Alberto Salvia Novella es204904...@gmail.com ---
I can confirm upgrading from the PPA didn't solve the problem :)

Graphics hanged again just after finishing the latest comment.
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
> For some reason, we occasionally write the flag register with a MOV.NZ instruction:
>
>    add(8)        g25<1>F   -g6<0,1,0>F   g15<8,8,1>F
>    cmp.l.f0(8)   g26<1>D   g25<8,8,1>F   0F
>    mov.nz.f0(8)  null      g26<8,8,1>D
>
> A MOV.NZ instruction on the result of a CMP is like comparing for equality with true in C. It's useless. Removing it allows us to generate:
>
>    add.l.f0(8)   null      -g6<0,1,0>F   g15<8,8,1>F
>
> total instructions in shared programs: 5955701 -> 5951657 (-0.07%)
> instructions in affected programs:     302910 -> 298866 (-1.34%)
> GAINED:                                1
> LOST:                                  0
> ---
>  .../drivers/dri/i965/brw_fs_cmod_propagation.cpp  | 23 ++--
>  .../drivers/dri/i965/test_fs_cmod_propagation.cpp | 32 ++
>  2 files changed, 52 insertions(+), 3 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> index b521350..dd89512 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> @@ -57,12 +57,20 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
>     foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
>        ip--;
>
> -      if (inst->opcode != BRW_OPCODE_CMP ||
> +      if ((inst->opcode != BRW_OPCODE_CMP &&
> +           inst->opcode != BRW_OPCODE_MOV) ||
>            inst->predicate != BRW_PREDICATE_NONE ||
>            !inst->dst.is_null() ||
>            inst->src[0].file != GRF ||
> -          inst->src[0].abs ||
> -          !inst->src[1].is_zero())
> +          inst->src[0].abs)
> +         continue;
> +
> +      if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
> +         continue;
> +
> +      if (inst->opcode == BRW_OPCODE_MOV &&
> +          (inst->conditional_mod != BRW_CONDITIONAL_NZ ||
> +           inst->src[0].negate))

I think negate is ok here. I'm not 100% sure on the semantics of mov.nz, but if it's a != 0 then negation shouldn't matter. If it only considers the bottom bit then negation shouldn't matter there either.

>           continue;
>
>        bool read_flag = false;
> @@ -72,6 +80,15 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
>               scan_inst->dst.reg == inst->src[0].reg &&
>               scan_inst->dst.reg_offset == inst->src[0].reg_offset &&
>               !scan_inst->is_partial_write()) {
> +            if (inst->opcode == BRW_OPCODE_MOV) {
> +               if (!scan_inst->writes_flag())
> +                  break;
> +
> +               inst->remove(block);
> +               progress = true;
> +               break;
> +            }
> +
>              enum brw_conditional_mod cond =
>                 inst->src[0].negate ? brw_invert_cmod(inst->conditional_mod)
>                                     : inst->conditional_mod;
> diff --git a/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
> index 15f685e..9541597 100644
> --- a/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/test_fs_cmod_propagation.cpp
> @@ -343,3 +343,35 @@ TEST_F(cmod_propagation_test, negate)
>     EXPECT_EQ(BRW_OPCODE_ADD, instruction(block0, 0)->opcode);
>     EXPECT_EQ(BRW_CONDITIONAL_L, instruction(block0, 0)->conditional_mod);
>  }
> +
> +TEST_F(cmod_propagation_test, movnz)
> +{
> +   fs_reg dest(v, glsl_type::float_type);
> +   fs_reg src0(v, glsl_type::float_type);
> +   fs_reg src1(v, glsl_type::float_type);
> +   v->emit(BRW_OPCODE_CMP, dest, src0, src1)
> +      ->conditional_mod = BRW_CONDITIONAL_GE;
> +   v->emit(BRW_OPCODE_MOV, v->reg_null_f, dest)
> +      ->conditional_mod = BRW_CONDITIONAL_NZ;
> +
> +   /* = Before =
> +    *
> +    * 0: cmp.ge.f0(8)  dest src0 src1
> +    * 1: mov.nz.f0(8)  null dest
> +    *
> +    * = After =
> +    * 0: cmp.ge.f0(8)  dest src0 src1
> +    */
> +
> +   v->calculate_cfg();
> +   bblock_t *block0 = v->cfg->blocks[0];
> +
> +   EXPECT_EQ(0, block0->start_ip);
> +   EXPECT_EQ(1, block0->end_ip);
> +
> +   EXPECT_TRUE(cmod_propagation(v));
> +   EXPECT_EQ(0, block0->start_ip);
> +   EXPECT_EQ(0, block0->end_ip);
> +   EXPECT_EQ(BRW_OPCODE_CMP, instruction(block0, 0)->opcode);
> +   EXPECT_EQ(BRW_CONDITIONAL_GE, instruction(block0, 0)->conditional_mod);
> +}
> --
> 2.0.4
Re: [Mesa-dev] [PATCH] egl/dri2: implement platform_null.
On Tue, Jan 20, 2015 at 2:25 PM, Haixia Shi h...@chromium.org wrote:
> diff --git a/src/egl/drivers/dri2/platform_null.c b/src/egl/drivers/dri2/platform_null.c
> new file mode 100644
> index 000..9c59809
> --- /dev/null
> +++ b/src/egl/drivers/dri2/platform_null.c
> @@ -0,0 +1,156 @@
> +/*
> + * Copyright (c) 2014 The Chromium OS Authors. All rights reserved.
> + * Use of this source code is governed by a BSD-style license that can be
> + * found in the LICENSE file.
> + */

Could you contribute this under the same MIT license that the rest of Mesa uses?
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Tue, Jan 20, 2015 at 2:58 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
>> @@ -57,12 +57,20 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
>> +      if (inst->opcode == BRW_OPCODE_CMP && !inst->src[1].is_zero())
>> +         continue;
>> +
>> +      if (inst->opcode == BRW_OPCODE_MOV &&
>> +          (inst->conditional_mod != BRW_CONDITIONAL_NZ ||
>> +           inst->src[0].negate))
>
> I think negate is ok here. I'm not 100% sure on the semantics of mov.nz, but if it's a != 0 then negation shouldn't matter. If it only considers the bottom bit then negation shouldn't matter there either.

The instruction

   mov.nz.f0 null src0

sets f0 if src0 != 0.

Hmm, you're right. Since we're only allowing NZ conditional modifiers we can also allow negation. I don't think we'll ever generate that, but okay.

I'll remove the inst->src[0].negate check.
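Both halves of the argument above (a != 0 test is unaffected by negation, and so is a test of the bottom bit) are easy to check in isolation. The sketch below is only a model: nz_float, nz_bits, negate_bits and low_bit are hypothetical helpers, integer negation is assumed to be two's complement, and none of this is driver code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-channel model of "mov.nz.f0 null src0": the flag is src0 != 0. */
static bool nz_float(float x)    { return x != 0.0f; }
static bool nz_bits(uint32_t x)  { return x != 0u; }

/* Two's-complement negate on raw bits; unsigned arithmetic avoids overflow. */
static uint32_t negate_bits(uint32_t x) { return 0u - x; }

/* The alternative reading: only the bottom bit decides the flag. */
static bool low_bit(uint32_t x)  { return (x & 1u) != 0u; }
```

For floats, -0.0f and 0.0f both compare equal to zero, so the float case holds as well; for integers, x and -x always have the same parity, which covers the bottom-bit reading.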
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Tue, Jan 20, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
> On Tue, Jan 20, 2015 at 3:17 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
>> On Tue, Jan 20, 2015 at 3:09 PM, Matt Turner matts...@gmail.com wrote:
>>> On Tue, Jan 20, 2015 at 2:58 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
>>>> On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
>>>>> @@ -57,12 +57,20 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
>>>>> +      if (inst->opcode == BRW_OPCODE_MOV &&
>>>>> +          (inst->conditional_mod != BRW_CONDITIONAL_NZ ||
>>>>> +           inst->src[0].negate))
>>>>
>>>> I think negate is ok here. I'm not 100% sure on the semantics of mov.nz, but if it's a != 0 then negation shouldn't matter. If it only considers the bottom bit then negation shouldn't matter there either.
>>>
>>> The instruction
>>>
>>>    mov.nz.f0 null src0
>>>
>>> sets f0 if src0 != 0.
>>>
>>> Hmm, you're right. Since we're only allowing NZ conditional modifiers we can also allow negation. I don't think we'll ever generate that, but okay.
>>>
>>> I'll remove the inst->src[0].negate check.
>>
>> Sure we will. When we do older gens in NIR, we'll emit one of those after every cmp. Still have to deal with the AND though...
>
> Emitting it after every comparison isn't what you want. We emit it from resolve_bool_comparison() before we need the integer representation of a bool for things like b2f. NIR -> FS should behave the same way.

Except typeless... We need some sort of assurance that the result of a NIR comparison is always 0 or ~0. We may be able to pull similar stunts or maybe do the cleanup in NIR, but I'm not sure if we can or not. It's going to be a bit interesting.
Re: [Mesa-dev] [PATCH 10/16] i965/fs: Eliminate null-dst instructions without side-effects.
> That makes sense. However, in the vec4 backend, we do emit scratch reads and writes prior to optimization - in move_grf_array_access_to_scratch(). I've been meaning to port that over to the scalar backend. At which point, SHADER_OPCODE_GEN4_SCRATCH_WRITE would be a problem.

Sort of off-topic, but it would be really nice if you could do this. A lot of the code Jason wrote for NIR is about dealing with indirect array accesses, which currently get lowered to if-ladders in GLSL which makes all of it dead code. It would be nice to be able to bugfix + test it.
Re: [Mesa-dev] [PATCH 11/16] i965/fs: Add pass to propagate conditional modifiers.
On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
> total instructions in shared programs: 5974160 -> 5959463 (-0.25%)
> instructions in affected programs:     1743737 -> 1729040 (-0.84%)
> GAINED:                                0
> LOST:                                  12
> ---
>  src/mesa/drivers/dri/i965/Makefile.sources         |  1 +
>  src/mesa/drivers/dri/i965/brw_fs.cpp               |  1 +
>  src/mesa/drivers/dri/i965/brw_fs.h                 |  1 +
>  .../drivers/dri/i965/brw_fs_cmod_propagation.cpp   | 97 ++
>  4 files changed, 100 insertions(+)
>  create mode 100644 src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
>
> diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources
> index 3b72955..da48455 100644
> --- a/src/mesa/drivers/dri/i965/Makefile.sources
> +++ b/src/mesa/drivers/dri/i965/Makefile.sources
> @@ -39,6 +39,7 @@ i965_FILES = \
>      brw_ff_gs_emit.c \
>      brw_ff_gs.h \
>      brw_fs_channel_expressions.cpp \
> +    brw_fs_cmod_propagation.cpp \
>      brw_fs_copy_propagation.cpp \
>      brw_fs.cpp \
>      brw_fs_cse.cpp \
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index 73d722e..994d457 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -3581,6 +3581,7 @@ fs_visitor::optimize()
>        OPT(opt_cse);
>        OPT(opt_copy_propagate);
>        OPT(opt_peephole_predicated_break);
> +      OPT(opt_cmod_propagation);
>        OPT(dead_code_eliminate);
>        OPT(opt_peephole_sel);
>        OPT(dead_control_flow_eliminate, this);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
> index 9c125a6..e1bc7d7 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.h
> +++ b/src/mesa/drivers/dri/i965/brw_fs.h
> @@ -539,6 +539,7 @@ public:
>     bool opt_peephole_sel();
>     bool opt_peephole_predicated_break();
>     bool opt_saturate_propagation();
> +   bool opt_cmod_propagation();
>     void emit_bool_to_cond_code(ir_rvalue *condition);
>     void emit_if_gen6(ir_if *ir);
>     void emit_unspill(bblock_t *block, fs_inst *inst, fs_reg reg,
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> new file mode 100644
> index 000..5ba2fd6
> --- /dev/null
> +++ b/src/mesa/drivers/dri/i965/brw_fs_cmod_propagation.cpp
> @@ -0,0 +1,97 @@
> +/*
> + * Copyright © 2014 Intel Corporation
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice (including the next
> + * paragraph) shall be included in all copies or substantial portions of the
> + * Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
> + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
> + * IN THE SOFTWARE.
> + */
> +
> +#include "brw_fs.h"
> +#include "brw_fs_live_variables.h"
> +#include "brw_cfg.h"
> +
> +/** @file brw_fs_cmod_propagation.cpp
> + *
> + * Implements a pass that propagates the conditional modifier from a CMP x 0.0
> + * instruction into the instruction that generated x. For instance, in this
> + * sequence
> + *
> + *    add(8)          g70<1>F    g69<8,8,1>F    4096F
> + *    cmp.ge.f0(8)    null       g70<8,8,1>F    0F
> + *
> + * we can do the comparison as part of the ADD instruction directly:
> + *
> + *    add.ge.f0(8)    g70<1>F    g69<8,8,1>F    4096F
> + */
> +
> +static bool
> +opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
> +{
> +   bool progress = false;
> +   int ip = block->end_ip + 1;
> +
> +   foreach_inst_in_block_reverse_safe(fs_inst, inst, block) {
> +      ip--;
> +
> +      if (inst->opcode != BRW_OPCODE_CMP ||
> +          inst->predicate != BRW_PREDICATE_NONE ||
> +          !inst->dst.is_null() ||
> +          inst->src[0].file != GRF ||
> +          inst->src[0].abs ||
> +          inst->src[0].negate ||
> +          !inst->src[1].is_zero())
> +         continue;
> +
> +      foreach_inst_in_block_reverse_starting_from(fs_inst, scan_inst, inst,
> +                                                  block) {
> +         if (scan_inst->dst.file == GRF &&
> +             scan_inst->dst.reg == inst->src[0].reg &&
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Tue, Jan 20, 2015 at 3:17 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> On Tue, Jan 20, 2015 at 3:09 PM, Matt Turner matts...@gmail.com wrote:
>> On Tue, Jan 20, 2015 at 2:58 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
>>> On Mon, Jan 19, 2015 at 3:31 PM, Matt Turner matts...@gmail.com wrote:
>>>> @@ -57,12 +57,20 @@ opt_cmod_propagation_local(fs_visitor *v, bblock_t *block)
>>>> +      if (inst->opcode == BRW_OPCODE_MOV &&
>>>> +          (inst->conditional_mod != BRW_CONDITIONAL_NZ ||
>>>> +           inst->src[0].negate))
>>>
>>> I think negate is ok here. I'm not 100% sure on the semantics of mov.nz, but if it's a != 0 then negation shouldn't matter. If it only considers the bottom bit then negation shouldn't matter there either.
>>
>> The instruction
>>
>>    mov.nz.f0 null src0
>>
>> sets f0 if src0 != 0.
>>
>> Hmm, you're right. Since we're only allowing NZ conditional modifiers we can also allow negation. I don't think we'll ever generate that, but okay.
>>
>> I'll remove the inst->src[0].negate check.
>
> Sure we will. When we do older gens in NIR, we'll emit one of those after every cmp. Still have to deal with the AND though...

Emitting it after every comparison isn't what you want. We emit it from resolve_bool_comparison() before we need the integer representation of a bool for things like b2f. NIR -> FS should behave the same way.
[Mesa-dev] [PATCH 09/16] i965/fs: Apply conditional mod specially to split MAD/LRP.
Otherwise we'll apply the conditional mod to only one of SIMD8 instructions and trigger an assertion.

NoDDClr/NoDDChk have the same problem but we never apply those to these instructions, so I'm leaving them for a later time.
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 24
 1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index ab848f1..5b39c88 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -1583,6 +1583,7 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
    foreach_block_and_inst (block, fs_inst, inst, cfg) {
       struct brw_reg src[3], dst;
       unsigned int last_insn_offset = p->next_insn_offset;
+      bool multiple_instructions_emitted = false;

       if (unlikely(debug_flag))
          annotate(brw, annotation, cfg, inst, p->next_insn_offset);
@@ -1656,10 +1657,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
          brw_set_default_access_mode(p, BRW_ALIGN_16);
          if (dispatch_width == 16 && brw->gen < 8 && !brw->is_haswell) {
             brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
-            brw_MAD(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
+            brw_inst *f = brw_MAD(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
             brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
-            brw_MAD(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
+            brw_inst *s = brw_MAD(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
             brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+
+            if (inst->conditional_mod) {
+               brw_inst_set_cond_modifier(brw, f, inst->conditional_mod);
+               brw_inst_set_cond_modifier(brw, s, inst->conditional_mod);
+               multiple_instructions_emitted = true;
+            }
          } else {
             brw_MAD(p, dst, src[0], src[1], src[2]);
          }
@@ -1671,10 +1678,16 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
          brw_set_default_access_mode(p, BRW_ALIGN_16);
          if (dispatch_width == 16 && brw->gen < 8 && !brw->is_haswell) {
             brw_set_default_compression_control(p, BRW_COMPRESSION_NONE);
-            brw_LRP(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
+            brw_inst *f = brw_LRP(p, firsthalf(dst), firsthalf(src[0]), firsthalf(src[1]), firsthalf(src[2]));
             brw_set_default_compression_control(p, BRW_COMPRESSION_2NDHALF);
-            brw_LRP(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
+            brw_inst *s = brw_LRP(p, sechalf(dst), sechalf(src[0]), sechalf(src[1]), sechalf(src[2]));
             brw_set_default_compression_control(p, BRW_COMPRESSION_COMPRESSED);
+
+            if (inst->conditional_mod) {
+               brw_inst_set_cond_modifier(brw, f, inst->conditional_mod);
+               brw_inst_set_cond_modifier(brw, s, inst->conditional_mod);
+               multiple_instructions_emitted = true;
+            }
          } else {
             brw_LRP(p, dst, src[0], src[1], src[2]);
          }
@@ -2048,6 +2061,9 @@ fs_generator::generate_code(const cfg_t *cfg, int dispatch_width)
          unreachable("Should be lowered by lower_load_payload()");
       }

+      if (multiple_instructions_emitted)
+         continue;
+
       if (inst->no_dd_clear || inst->no_dd_check || inst->conditional_mod) {
          assert(p->next_insn_offset == last_insn_offset + 16 ||
                 !"conditional_mod, no_dd_check, or no_dd_clear set for IR "
--
2.0.4
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Tue, Jan 20, 2015 at 3:51 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
> Except typeless... We need some sort of assurance that the result of a NIR comparison is always 0 or ~0.

Help me understand how this is a different situation from what we have today?

Let's take for instance a vec2 == vec2 comparison. On Gen4/5 we generate two CMPs, an AND to combine their results, and a single resolving AND.NZ with 1 to generate the flag. Look at generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-vec2-vec2-using-if.shader_test for example.

Why should NIR necessitate doing any of that differently?
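The Gen4/5 sequence described above (two CMPs, one AND, one resolving AND.NZ with 1) can be modelled in plain C. This sketch rests on an assumption that is not spelled out in the mail: that only the low bit of a comparison result is meaningful on those parts, which is why the final AND with 1 is needed. toy_cmp_eq and toy_vec2_equal are hypothetical names, and the junk upper bits are injected on purpose.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy model of a comparison whose result is only defined in bit 0.
 * The remaining bits are filled with caller-supplied junk.
 */
static uint32_t
toy_cmp_eq(float a, float b, uint32_t junk)
{
   return (junk & ~1u) | (a == b ? 1u : 0u);
}

/* vec2 == vec2: two compares, one combine, one resolve. */
static bool
toy_vec2_equal(const float a[2], const float b[2])
{
   uint32_t x = toy_cmp_eq(a[0], b[0], 0xdeadbeefu);   /* CMP */
   uint32_t y = toy_cmp_eq(a[1], b[1], 0x12345678u);   /* CMP */
   uint32_t both = x & y;                              /* AND */
   return (both & 1u) != 0u;                           /* AND.NZ with 1 */
}
```

Only one resolve is needed for the whole expression because AND preserves the meaning of bit 0, so the junk bits never have to be cleaned up after each individual comparison.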
[Mesa-dev] [mesa-dev][PATCH] Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros. No functional changes, only bug fixed.
From: Marius Predut <marius.pre...@intel.com>

On 32-bit builds, floating-point operations use the x86 FPU registers instead of SSE, so reinterpreting an integer as a float can give unexpected results (floats may be modified when they are written to memory).

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=82668

Reviewed-by: Roberts, Neil S <neil.s.robe...@intel.com>
---
 src/mesa/main/context.c       |  2 +-
 src/mesa/main/macros.h        | 29 ++-
 src/mesa/vbo/vbo_attrib_tmp.h | 43 +
 src/mesa/vbo/vbo_exec.h       |  3 ++-
 src/mesa/vbo/vbo_exec_api.c   | 25
 src/mesa/vbo/vbo_exec_eval.c  | 22 -
 src/mesa/vbo/vbo_save_api.c   | 10 +-
 7 files changed, 78 insertions(+), 56 deletions(-)

diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 400c158..3007491 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -656,7 +656,7 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
    consts->MaxSamples = 0;

    /* GLSL default if NativeIntegers == FALSE */
-   consts->UniformBooleanTrue = FLT_AS_UINT(1.0f);
+   consts->UniformBooleanTrue = 1;

    /* GL_ARB_sync */
    consts->MaxServerWaitTimeout = 0x1fff7fffULL;
diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
index cd5f2d6..4d245e1 100644
--- a/src/mesa/main/macros.h
+++ b/src/mesa/main/macros.h
@@ -170,27 +170,6 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256];
         ub = ((GLubyte) F_TO_I((f) * 255.0F))
 #endif

-static inline GLfloat INT_AS_FLT(GLint i)
-{
-   fi_type tmp;
-   tmp.i = i;
-   return tmp.f;
-}
-
-static inline GLfloat UINT_AS_FLT(GLuint u)
-{
-   fi_type tmp;
-   tmp.u = u;
-   return tmp.f;
-}
-
-static inline unsigned FLT_AS_UINT(float f)
-{
-   fi_type tmp;
-   tmp.f = f;
-   return tmp.u;
-}
-
 /**
  * Convert a floating point value to an unsigned fixed point value.
  *
@@ -625,15 +604,11 @@ COPY_CLEAN_4V_TYPE_AS_FLOAT(GLfloat dst[4], int sz, const GLfloat src[4],
 {
    switch (type) {
    case GL_FLOAT:
-      ASSIGN_4V(dst, 0, 0, 0, 1);
+      ASSIGN_4V(dst, 0.0f, 0.0f, 0.0f, 1.0f);
       break;
    case GL_INT:
-      ASSIGN_4V(dst, INT_AS_FLT(0), INT_AS_FLT(0),
-                     INT_AS_FLT(0), INT_AS_FLT(1));
-      break;
    case GL_UNSIGNED_INT:
-      ASSIGN_4V(dst, UINT_AS_FLT(0), UINT_AS_FLT(0),
-                     UINT_AS_FLT(0), UINT_AS_FLT(1));
+      ASSIGN_4V(dst, 0, 0, 0, 1);
       break;
    default:
       ASSIGN_4V(dst, 0.0f, 0.0f, 0.0f, 1.0f); /* silence warnings */
diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
index ec66934..a853cb1 100644
--- a/src/mesa/vbo/vbo_attrib_tmp.h
+++ b/src/mesa/vbo/vbo_attrib_tmp.h
@@ -28,6 +28,41 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
 #include "util/u_format_r11g11b10f.h"
 #include "main/varray.h"
+#include "program/prog_parameter.h"
+
+
+static union gl_constant_value UINT_AS_UNION(GLuint u)
+{
+   union gl_constant_value tmp;
+   tmp.u = u;
+   return tmp;
+}
+
+static inline union gl_constant_value INT_AS_UNION(GLint i)
+{
+   union gl_constant_value tmp;
+   tmp.i = i;
+   return tmp;
+}
+
+static inline union gl_constant_value FLOAT_AS_UNION(GLfloat f)
+{
+   union gl_constant_value tmp;
+   tmp.f = f;
+   return tmp;
+}
+
+/* ATTR */
+#define ATTR( A, N, T, V0, V1, V2, V3 ) ATTR_##T((A), (N), (T), (V0), (V1), (V2), (V3))
+
+#define ATTR_GL_UNSIGNED_INT( A, N, T, V0, V1, V2, V3 ) \
+    ATTR_UNION(A, N, T, UINT_AS_UNION(V0), UINT_AS_UNION(V1), UINT_AS_UNION(V2), UINT_AS_UNION(V3))
+#define ATTR_GL_INT( A, N, T, V0, V1, V2, V3 ) \
+    ATTR_UNION(A, N, T, INT_AS_UNION(V0), INT_AS_UNION(V1), INT_AS_UNION(V2), INT_AS_UNION(V3))
+#define ATTR_GL_FLOAT( A, N, T, V0, V1, V2, V3 ) \
+    ATTR_UNION(A, N, T, FLOAT_AS_UNION(V0), FLOAT_AS_UNION(V1), FLOAT_AS_UNION(V2), FLOAT_AS_UNION(V3))
+
+
 /* float */
 #define ATTR1FV( A, V ) ATTR( A, 1, GL_FLOAT, (V)[0], 0, 0, 1 )
 #define ATTR2FV( A, V ) ATTR( A, 2, GL_FLOAT, (V)[0], (V)[1], 0, 1 )
@@ -41,8 +76,8 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.

 /* int */
 #define ATTRI( A, N, X, Y, Z, W) ATTR( A, N, GL_INT, \
-                                       INT_AS_FLT(X), INT_AS_FLT(Y), \
-                                       INT_AS_FLT(Z), INT_AS_FLT(W) )
+                                       X, Y, \
+                                       Z, W )

 #define ATTR2IV( A, V ) ATTRI( A, 2, (V)[0], (V)[1], 0, 1 )
 #define ATTR3IV( A, V ) ATTRI( A, 3, (V)[0], (V)[1], (V)[2], 1 )
@@ -56,8 +91,8 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.

 /* uint */
 #define ATTRUI( A, N, X, Y, Z, W) ATTR( A, N, GL_UNSIGNED_INT, \
-                                        UINT_AS_FLT(X), UINT_AS_FLT(Y), \
-                                        UINT_AS_FLT(Z), UINT_AS_FLT(W) )
+                                        X, Y, \
+                                        Z, W )

 #define ATTR2UIV( A, V ) ATTRUI( A, 2, (V)[0],
Re: [Mesa-dev] [PATCH] radeonsi: Enable VGPR spilling for all shader types v3
On 20.01.2015 22:39, Marek Olšák wrote: The problem with CPDMA (DMA_DATA and WRITE_DATA) is that the ordering of flushes must be correct. First, partial flushes must be done, so that the shaders are idle. That's only necessary when reusing a single BO for the shader code, not when allocating a new BO when the relocations change, right? Then you can use CP DMA to update the binary. After that, ICACHE should be invalidated. ICACHE has to be invalidated when writing with the CPU as well, right? The problem with mapping VRAM can be avoided by keeping a CPU copy of the binary from the beginning. We would only need a CPU copy of those shaders that use the scratch buffer. Then, you wouldn't have to read VRAM at all, which would make it even simpler. Right, but CPU writes to the new BO in VRAM could cause stalls anyway. Anyway, let's do it with the CPU first and maybe using CPDMA as an optimization later. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer
Re: [Mesa-dev] [PATCH 05/11] mesa: Validate internal format and format type first to provide accurate error code
On Mon, Jan 19, 2015 at 3:32 AM, Eduardo Lima Mitev <el...@igalia.com> wrote:
> The specification states that glTexImage2D and glTexImage3D should return
> GL_INVALID_VALUE if the internal format is invalid, and GL_INVALID_ENUM if
> the format type is invalid. However, the current error check only considers
> the combination of format, type and internal format, which returns a
> GL_INVALID_OPERATION error when invalid.

I did a quick search in the ES 3.0.4 spec but couldn't find the reference. Could you point me to the reference in the spec and maybe add it as a comment in the code?

> Fixes 2 dEQP tests:
> * dEQP-GLES3.functional.negative_api.texture.teximage2d
> * dEQP-GLES3.functional.negative_api.texture.teximage3d
> ---
>  src/mesa/main/glformats.c | 114 ++
>  1 file changed, 114 insertions(+)
>
> diff --git a/src/mesa/main/glformats.c b/src/mesa/main/glformats.c
> index 06f9aaf..5cec90d 100644
> --- a/src/mesa/main/glformats.c
> +++ b/src/mesa/main/glformats.c
> @@ -2012,6 +2012,112 @@ _mesa_es_error_check_format_and_type(GLenum format, GLenum type,
>     return type_valid ? GL_NO_ERROR : GL_INVALID_OPERATION;
>  }
>
> +/**
> + * Check that internal format is a valid enum for OpenGL ES 3.
> + * \return TRUE if valid, FALSE otherwise.
> + */
> +static GLboolean
> +_mesa_es3_is_valid_internal_format(GLenum internalFormat)
> +{
> +   switch (internalFormat) {
> +   case GL_RGB:
> +   case GL_RGBA:
> +   case GL_LUMINANCE_ALPHA:
> +   case GL_LUMINANCE:
> +   case GL_ALPHA:
> +   case GL_R8:
> +   case GL_R8_SNORM:
> +   case GL_R16F:
> +   case GL_R32F:
> +   case GL_R8UI:
> +   case GL_R8I:
> +   case GL_R16UI:
> +   case GL_R16I:
> +   case GL_R32UI:
> +   case GL_R32I:
> +   case GL_RG8:
> +   case GL_RG8_SNORM:
> +   case GL_RG16F:
> +   case GL_RG32F:
> +   case GL_RG8UI:
> +   case GL_RG8I:
> +   case GL_RG16UI:
> +   case GL_RG16I:
> +   case GL_RG32UI:
> +   case GL_RG32I:
> +   case GL_RGB8:
> +   case GL_SRGB8:
> +   case GL_RGB565:
> +   case GL_RGB8_SNORM:
> +   case GL_R11F_G11F_B10F:
> +   case GL_RGB9_E5:
> +   case GL_RGB16F:
> +   case GL_RGB32F:
> +   case GL_RGB8UI:
> +   case GL_RGB8I:
> +   case GL_RGB16UI:
> +   case GL_RGB16I:
> +   case GL_RGB32UI:
> +   case GL_RGB32I:
> +   case GL_RGBA8:
> +   case GL_SRGB8_ALPHA8:
> +   case GL_RGBA8_SNORM:
> +   case GL_RGB5_A1:
> +   case GL_RGBA4:
> +   case GL_RGB10_A2:
> +   case GL_RGBA16F:
> +   case GL_RGBA32F:
> +   case GL_RGBA8UI:
> +   case GL_RGBA8I:
> +   case GL_RGB10_A2UI:
> +   case GL_RGBA16UI:
> +   case GL_RGBA16I:
> +   case GL_RGBA32I:
> +   case GL_RGBA32UI:
> +   case GL_DEPTH_COMPONENT16:
> +   case GL_DEPTH_COMPONENT24:
> +   case GL_DEPTH_COMPONENT32F:
> +   case GL_DEPTH24_STENCIL8:
> +   case GL_DEPTH32F_STENCIL8:
> +      break;
> +   default:
> +      return GL_FALSE;
> +   }
> +
> +   return GL_TRUE;
> +}
> +
> +/**
> + * Check that format type is a valid enum for OpenGL ES 3.
> + * \return TRUE if valid, FALSE otherwise.
> + */
> +static GLboolean
> +_mesa_es3_is_valid_format_type(GLenum type)
> +{
> +   switch (type) {
> +   case GL_UNSIGNED_BYTE:
> +   case GL_UNSIGNED_SHORT_5_6_5:
> +   case GL_UNSIGNED_SHORT_4_4_4_4:
> +   case GL_UNSIGNED_SHORT_5_5_5_1:
> +   case GL_BYTE:
> +   case GL_HALF_FLOAT:
> +   case GL_FLOAT:
> +   case GL_UNSIGNED_SHORT:
> +   case GL_SHORT:
> +   case GL_UNSIGNED_INT:
> +   case GL_INT:
> +   case GL_UNSIGNED_INT_10F_11F_11F_REV:
> +   case GL_UNSIGNED_INT_5_9_9_9_REV:
> +   case GL_UNSIGNED_INT_2_10_10_10_REV:
> +   case GL_UNSIGNED_INT_24_8:
> +   case GL_FLOAT_32_UNSIGNED_INT_24_8_REV:
> +      break;
> +   default:
> +      return GL_FALSE;
> +   }
> +
> +   return GL_TRUE;
> +}
>
>  /**
>   * Do error checking of format/type combinations for OpenGL ES 3
> @@ -2022,6 +2128,14 @@ GLenum
>  _mesa_es3_error_check_format_and_type(GLenum format, GLenum type,
>                                        GLenum internalFormat)
>  {
> +   if (!_mesa_es3_is_valid_format_type(type)) {
> +      return GL_INVALID_ENUM;
> +   }
> +
> +   if (!_mesa_es3_is_valid_internal_format(internalFormat)) {
> +      return GL_INVALID_VALUE;
> +   }
> +
>     switch (format) {
>     case GL_RGBA:
>        switch (type) {
> --
> 2.1.3
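The check order the patch proposes (type first, then internal format, then the combination) can be sketched in isolation. This is only an illustration of the ordering under discussion, not Mesa code and not a statement about what the spec requires: the `sketch_*` names, the token enum and the tiny allow-lists are all hypothetical stand-ins for the real GL enums and tables.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for GL error codes. */
enum sketch_error {
   SKETCH_NO_ERROR,
   SKETCH_INVALID_ENUM,
   SKETCH_INVALID_VALUE,
   SKETCH_INVALID_OPERATION
};

/* Hypothetical stand-ins for a few GL tokens. */
enum sketch_token {
   TOK_RGBA,                 /* format */
   TOK_RGBA8,                /* internal format */
   TOK_RGB565,               /* internal format */
   TOK_UNSIGNED_BYTE,        /* type */
   TOK_UNSIGNED_SHORT_5_6_5, /* type */
   TOK_BOGUS                 /* not a valid token of any kind */
};

static bool sketch_is_valid_type(enum sketch_token type)
{
   return type == TOK_UNSIGNED_BYTE || type == TOK_UNSIGNED_SHORT_5_6_5;
}

static bool sketch_is_valid_internal_format(enum sketch_token ifmt)
{
   return ifmt == TOK_RGBA8 || ifmt == TOK_RGB565;
}

/* Each enum is validated on its own before the combination is checked,
 * so a bad type or internal format no longer falls through to
 * INVALID_OPERATION. */
static enum sketch_error
sketch_check(enum sketch_token format, enum sketch_token type,
             enum sketch_token internal_format)
{
   if (!sketch_is_valid_type(type))
      return SKETCH_INVALID_ENUM;

   if (!sketch_is_valid_internal_format(internal_format))
      return SKETCH_INVALID_VALUE;

   /* Only one valid combination exists in this toy table. */
   if (format == TOK_RGBA && type == TOK_UNSIGNED_BYTE &&
       internal_format == TOK_RGBA8)
      return SKETCH_NO_ERROR;

   return SKETCH_INVALID_OPERATION;
}
```

With this ordering, a call whose type and internal format are each valid but do not belong together still reports the combination error, which is the behaviour the existing code already had.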
Re: [Mesa-dev] [mesa-dev][PATCH] Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros. No functional changes, only bug fixed.
On 01/20/2015 07:30 AM, marius.pre...@intel.com wrote:
> From: Marius Predut <marius.pre...@intel.com>
>
> On 32-bit builds, floating-point operations use the x86 FPU registers
> instead of SSE, so reinterpreting an integer as a float can give
> unexpected results (floats may be modified when they are written to
> memory).
>
> This fixes https://bugs.freedesktop.org/show_bug.cgi?id=82668
>
> Reviewed-by: Roberts, Neil S <neil.s.robe...@intel.com>

This should be formatted as:

Reviewed-by: Neil Roberts <neil.s.robe...@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82668

> ---
>  src/mesa/main/context.c       |  2 +-
>  src/mesa/main/macros.h        | 29 ++-
>  src/mesa/vbo/vbo_attrib_tmp.h | 43 +
>  src/mesa/vbo/vbo_exec.h       |  3 ++-
>  src/mesa/vbo/vbo_exec_api.c   | 25
>  src/mesa/vbo/vbo_exec_eval.c  | 22 -
>  src/mesa/vbo/vbo_save_api.c   | 10 +-
>  7 files changed, 78 insertions(+), 56 deletions(-)
>
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index 400c158..3007491 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -656,7 +656,7 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
>     consts->MaxSamples = 0;
>
>     /* GLSL default if NativeIntegers == FALSE */
> -   consts->UniformBooleanTrue = FLT_AS_UINT(1.0f);
> +   consts->UniformBooleanTrue = 1;

As Jason mentioned, this hunk must be dropped.

>
>     /* GL_ARB_sync */
>     consts->MaxServerWaitTimeout = 0x1fff7fffULL;
> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
> index cd5f2d6..4d245e1 100644
> --- a/src/mesa/main/macros.h
> +++ b/src/mesa/main/macros.h
> @@ -170,27 +170,6 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256];
>          ub = ((GLubyte) F_TO_I((f) * 255.0F))
>  #endif
>
> -static inline GLfloat INT_AS_FLT(GLint i)
> -{
> -   fi_type tmp;
> -   tmp.i = i;
> -   return tmp.f;
> -}
> -
> -static inline GLfloat UINT_AS_FLT(GLuint u)
> -{
> -   fi_type tmp;
> -   tmp.u = u;
> -   return tmp.f;
> -}
> -
> -static inline unsigned FLT_AS_UINT(float f)
> -{
> -   fi_type tmp;
> -   tmp.f = f;
> -   return tmp.u;
> -}
> -
>  /**
>   * Convert a floating point value to an unsigned fixed point value.
>   *
> @@ -625,15 +604,11 @@ COPY_CLEAN_4V_TYPE_AS_FLOAT(GLfloat dst[4], int sz, const GLfloat src[4],
>  {
>     switch (type) {
>     case GL_FLOAT:
> -      ASSIGN_4V(dst, 0, 0, 0, 1);
> +      ASSIGN_4V(dst, 0.0f, 0.0f, 0.0f, 1.0f);
>        break;
>     case GL_INT:
> -      ASSIGN_4V(dst, INT_AS_FLT(0), INT_AS_FLT(0),
> -                     INT_AS_FLT(0), INT_AS_FLT(1));
> -      break;
>     case GL_UNSIGNED_INT:
> -      ASSIGN_4V(dst, UINT_AS_FLT(0), UINT_AS_FLT(0),
> -                     UINT_AS_FLT(0), UINT_AS_FLT(1));
> +      ASSIGN_4V(dst, 0, 0, 0, 1);
>        break;

I'm having trouble understanding how this is correct. This makes all three cases the same. They all assign float values {0, 0, 0, 1} to dst. Code later in the function (not shown in the patch) then copies possibly integer or unsigned values into some of the components. You then end up with a mix of integer values and floating point values.

It seems like this function should take two gl_constant_value as parameters instead of GLfloat[4].

>     default:
>        ASSIGN_4V(dst, 0.0f, 0.0f, 0.0f, 1.0f); /* silence warnings */
> diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
> index ec66934..a853cb1 100644
> --- a/src/mesa/vbo/vbo_attrib_tmp.h
> +++ b/src/mesa/vbo/vbo_attrib_tmp.h
> @@ -28,6 +28,41 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
>  #include "util/u_format_r11g11b10f.h"
>  #include "main/varray.h"
> +#include "program/prog_parameter.h"
> +
> +
> +static union gl_constant_value UINT_AS_UNION(GLuint u)
> +{
> +   union gl_constant_value tmp;
> +   tmp.u = u;
> +   return tmp;
> +}
> +
> +static inline union gl_constant_value INT_AS_UNION(GLint i)
> +{
> +   union gl_constant_value tmp;
> +   tmp.i = i;
> +   return tmp;
> +}
> +
> +static inline union gl_constant_value FLOAT_AS_UNION(GLfloat f)
> +{
> +   union gl_constant_value tmp;
> +   tmp.f = f;
> +   return tmp;
> +}
> +
> +/* ATTR */
> +#define ATTR( A, N, T, V0, V1, V2, V3 ) ATTR_##T((A), (N), (T), (V0), (V1), (V2), (V3))
> +
> +#define ATTR_GL_UNSIGNED_INT( A, N, T, V0, V1, V2, V3 ) \
> +    ATTR_UNION(A, N, T, UINT_AS_UNION(V0), UINT_AS_UNION(V1), UINT_AS_UNION(V2), UINT_AS_UNION(V3))
> +#define ATTR_GL_INT( A, N, T, V0, V1, V2, V3 ) \
> +    ATTR_UNION(A, N, T, INT_AS_UNION(V0), INT_AS_UNION(V1), INT_AS_UNION(V2), INT_AS_UNION(V3))
> +#define ATTR_GL_FLOAT( A, N, T, V0, V1, V2, V3 ) \
> +    ATTR_UNION(A, N, T, FLOAT_AS_UNION(V0), FLOAT_AS_UNION(V1), FLOAT_AS_UNION(V2), FLOAT_AS_UNION(V3))
> +
> +
>  /* float */
>  #define ATTR1FV( A, V ) ATTR( A, 1, GL_FLOAT, (V)[0], 0, 0, 1 )
>  #define ATTR2FV( A, V ) ATTR( A, 2, GL_FLOAT, (V)[0], (V)[1], 0, 1 )
> @@ -41,8 +76,8 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
>
>  /* int */
>  #define ATTRI( A, N, X,
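The concern about COPY_CLEAN_4V_TYPE_AS_FLOAT can be made concrete with a small sketch of what "take a union instead of GLfloat[4]" might look like. Everything here is an assumption for illustration: `union sketch_value` merely mirrors the three members the patch uses from `gl_constant_value`, the `SKETCH_*` type tags and `sketch_clean_4v` are hypothetical, and the expected bit patterns assume IEEE 754 single precision.

```c
#include <stdint.h>

/* Hypothetical mirror of the f/i/u members used by the patch. */
union sketch_value {
   float f;
   int32_t i;
   uint32_t u;
};

enum sketch_type { SKETCH_FLOAT, SKETCH_INT, SKETCH_UINT };

/* Fill dst with the typed default (0, 0, 0, 1). Each component is written
 * through the member that matches the attribute type, so an integer 1 is
 * never converted to a float and no value passes through a floating-point
 * register just to be stored. */
static void sketch_clean_4v(union sketch_value dst[4], enum sketch_type type)
{
   int k;

   switch (type) {
   case SKETCH_INT:
      for (k = 0; k < 3; k++)
         dst[k].i = 0;
      dst[3].i = 1;
      break;
   case SKETCH_UINT:
      for (k = 0; k < 3; k++)
         dst[k].u = 0u;
      dst[3].u = 1u;
      break;
   case SKETCH_FLOAT:
   default:
      for (k = 0; k < 3; k++)
         dst[k].f = 0.0f;
      dst[3].f = 1.0f;
      break;
   }
}
```

Under these assumptions the W component holds the bits 0x00000001 for the integer types and 0x3f800000 for float, which is the distinction that is lost when all three cases assign the same float literals.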
Re: [Mesa-dev] [PATCH 15/16] i965/fs: Add support for removing MOV.NZ instructions.
On Tue, Jan 20, 2015 at 4:02 PM, Matt Turner <matts...@gmail.com> wrote:
> On Tue, Jan 20, 2015 at 3:51 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote:
>> Except typeless... We need some sort of assurance that the result of
>> a NIR comparison is always 0 or ~0.
>
> Help me understand how this is a different situation from what we have today?
>
> Let's take for instance a vec2 == vec2 comparison. On Gen4/5 we
> generate two CMPs, an AND to combine their results, and a single
> resolving AND.NZ with 1 to generate the flag. Look at
> generated_tests/spec/glsl-1.10/execution/built-in-functions/fs-op-eq-vec2-vec2-using-if.shader_test
> for example.
>
> Why should NIR necessitate doing any of that differently?

Because we don't have a concept of boolean. Instead, a == b is supposed to produce ~0 if equal and 0 if not. On older gens, we have to clean up by doing -((a == b) & 1). If we're doing a vector operation, we have to take the final result and AND things together so we get -((a.x == b.x) & 1) & -((a.y == b.y) & 1). Of course, we can propagate through the AND and just get -(((a.x == b.x) & (a.y == b.y)) & 1), which is what we want to do.

Right now, NIR doesn't do anything interesting like transforming b2f into an AND with 0x3f800000 but, as per the boolean convention in NIR, it could.
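The 0/~0 convention described above can be checked with a few lines of C. This is only a sketch: the `sketch_*` helpers are hypothetical, the "raw" inputs model a comparison result in which only the low bit is meaningful (as on the older gens mentioned above), and the b2f trick assumes IEEE 754 single precision, where 1.0f has the bit pattern 0x3f800000.

```c
#include <stdint.h>
#include <string.h>

/* Resolve a raw comparison result whose only meaningful bit is bit 0
 * into the 0 / ~0 convention: -(raw & 1). */
static uint32_t sketch_resolve_bool(uint32_t raw)
{
   return 0u - (raw & 1u);
}

/* vec2 equality: AND the raw per-component results first, then resolve
 * once. This is the propagated form -(((x) & (y)) & 1). */
static uint32_t sketch_vec2_equal(uint32_t raw_x, uint32_t raw_y)
{
   return sketch_resolve_bool(raw_x & raw_y);
}

/* b2f as a single AND: ~0 & 0x3f800000 is the bit pattern of 1.0f,
 * and 0 & 0x3f800000 is the bit pattern of 0.0f. */
static float sketch_b2f(uint32_t b)
{
   uint32_t bits = b & 0x3f800000u;
   float f;

   memcpy(&f, &bits, sizeof f);
   return f;
}
```

The AND in `sketch_b2f` is only correct if the input really is 0 or ~0, which is why the assurance asked for at the top of the thread matters.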
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

--- Comment #17 from Alberto Salvia Novella <es204904...@gmail.com> ---
Wait, comment 2 says further steps are needed... I'm testing now.
Re: [Mesa-dev] [PATCH 03/11] i965: Fix negate with unsigned integers
On Mon, Jan 19, 2015 at 3:32 AM, Eduardo Lima Mitev <el...@igalia.com> wrote:
> From: Iago Toral Quiroga <ito...@igalia.com>
>
> For code such as:
>
> uint tmp1 = uint(in0);
> uint tmp2 = -tmp1;
> float out0 = float(tmp2);
>
> We produce code like:
> mov(8) g5<1>.xF -g9<4,4,1>.xUD
>
> which does not produce correct results. This code produces the
> results we would expect if tmp1 and tmp2 were signed integers instead.
>
> It seems that a similar problem was detected and addressed when using
> negations with unsigned integers as part of conditionals, but it looks
> like the problem has a wider impact than that.
>
> This patch fixes the problem by preventing copy-propagation of
> negated UD registers in all scenarios, not only in conditionals.
>
> Fixes the following 8 dEQP tests:
>
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uint_vertex
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uint_fragment
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec2_vertex
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec2_fragment
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec3_vertex
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec3_fragment
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec4_vertex
> dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec4_fragment
>
> Note:
> For some reason the mediump and lowp versions of these tests still fail,
> but I am not sure why, since the code we generate now seems correct (in
> fact, it is the same as for the highp versions). These tests would need
> further investigation.
> ---
>  src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp   | 9 ++---
>  src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 9 -
>  2 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> index 70f417f..5dd7255 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
> @@ -302,9 +302,12 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
>         (entry->dst.reg_offset + entry->regs_written) * 32)
>        return false;
>
> -   /* See resolve_ud_negate() and comment in brw_fs_emit.cpp. */
> -   if (inst->conditional_mod &&
> -       inst->src[arg].type == BRW_REGISTER_TYPE_UD &&
> +   /* we can't generally copy-propagate UD negations because we
> +    * can end up accessing the resulting values as signed integers
> +    * instead. See also resolve_ud_negate() and comment in
> +    * fs_generator::generate_code.
> +    */
> +   if (inst->src[arg].type == BRW_REGISTER_TYPE_UD &&
>         entry->src.negate)
>        return false;
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> index 9e47dd9..562ecb7 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> @@ -318,12 +318,11 @@ try_copy_propagate(struct brw_context *brw, vec4_instruction *inst,
>     if (inst->is_send_from_grf())
>        return false;
>
> -   /* We can't copy-propagate a UD negation into a condmod
> -    * instruction, because the condmod ends up looking at the 33-bit
> -    * signed accumulator value instead of the 32-bit value we wanted
> +   /* we can't generally copy-propagate UD negations because we
> +    * end up accessing the resulting values as signed integers
> +    * instead. See also resolve_ud_negate().
>      */
> -   if (inst->conditional_mod &&
> -       value.negate &&
> +   if (value.negate &&
>         value.type == BRW_REGISTER_TYPE_UD)
>        return false;
>
> --
> 2.1.3

Changes look good to me. Verified that the patch now generates correct code for the example in the commit message.

Reviewed-by: Anuj Phogat <anuj.pho...@gmail.com>
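The difference between the two interpretations in the commit message can be reproduced on the CPU. The sketch below is not driver code; the two `sketch_*` helpers are hypothetical and simply model "negate in 32-bit unsigned arithmetic, then convert" versus "treat the source as signed while negating and converting", which is the result the commit message says the propagated negation produced.

```c
#include <stdint.h>

/* What GLSL asks for: the negation wraps modulo 2^32 and the wrapped
 * unsigned value is then converted to float. */
static float sketch_negate_as_unsigned(uint32_t x)
{
   uint32_t n = 0u - x;
   float f = (float)n;

   return f;
}

/* The unintended result: the value is treated as a signed integer while
 * it is negated and converted. Only meaningful here for small x
 * (x <= INT32_MAX), which is all this sketch exercises. */
static float sketch_negate_as_signed(uint32_t x)
{
   int32_t n = -(int32_t)x;
   float f = (float)n;

   return f;
}
```

For an input of 1 the unsigned path yields 4294967295 converted to float, which rounds to 4294967296.0f, while the signed path yields -1.0f.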
[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #23 from Michel Dänzer <mic...@daenzer.net> ---
(In reply to Stéphane Travostino from comment #21)
> I'm trying to reproduce the same thing with Intel, but it's V-synced and
> can't manage to have it run more than 59 FPS.

Have you tried 'vblank_mode=0 glxgears'?
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

Alberto Salvia Novella <es204904...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO

--- Comment #18 from Alberto Salvia Novella <es204904...@gmail.com> ---
I will test this for a while.
Re: [Mesa-dev] [PATCH 4/5] radeonsi: Re-enable LLVM IR dumps
On 16.01.2015 09:48, Tom Stellard wrote:
> This was inadvertently disabled by
> 761e36b4caab4e8e09a4c2b1409a825902fc7d2c.
> ---
>  src/gallium/drivers/radeon/radeon_llvm_emit.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> index b98afb2..0f9dbab 100644
> --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
> +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
> @@ -162,7 +162,6 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_shader_binary *binar
>  		strncpy(cpu, gpu_family, CPU_STRING_LEN);
>  		memset(fs, 0, sizeof(fs));
>  		if (dump) {
> -			LLVMDumpModule(M);
>  			strncpy(fs, "+DumpCode", FS_STRING_LEN);
>  		}
>  		tm = LLVMCreateTargetMachine(target, triple, cpu, fs,
> @@ -170,6 +169,9 @@ unsigned radeon_llvm_compile(LLVMModuleRef M, struct radeon_shader_binary *binar
>  		                  LLVMCodeModelDefault);
>  		dispose_tm = true;
>  	}
> +	if (dump) {
> +		LLVMDumpModule(M);
> +	}
>  	/* Setup Diagnostic Handler*/
>  	llvm_ctx = LLVMGetModuleContext(M);

Reviewed-and-Tested-by: Michel Dänzer <michel.daen...@amd.com>

Please push this patch now, regardless of what happens with the rest of the series.

-- 
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #25 from Michel Dänzer <mic...@daenzer.net> ---
(In reply to Stéphane Travostino from comment #24)
> yes I confirm I can reproduce the same FPS drops with glxgears, Intel and
> vsync disabled.

So, it seems like something is causing the performance of your system as a whole to degrade at regular intervals. Does top show any additional CPU load while performance is degraded? Or does something like iotop or vmstat show I/O during those times? Does it also affect pure CPU applications, e.g. audio or video encoding / transcoding?
Re: [Mesa-dev] [mesa-dev][PATCH] Remove UINT_AS_FLT, INT_AS_FLT, FLOAT_AS_FLT macros. No functional changes, only bug fixed.
On Tue, Jan 20, 2015 at 7:30 AM, marius.pre...@intel.com wrote:
> From: Marius Predut <marius.pre...@intel.com>
>
> On 32-bit builds, floating-point operations use the x86 FPU registers
> instead of SSE, so reinterpreting an integer as a float can give
> unexpected results (floats may be modified when they are written to
> memory).
>
> This fixes https://bugs.freedesktop.org/show_bug.cgi?id=82668
>
> Reviewed-by: Roberts, Neil S <neil.s.robe...@intel.com>
> ---
>  src/mesa/main/context.c       |  2 +-
>  src/mesa/main/macros.h        | 29 ++-
>  src/mesa/vbo/vbo_attrib_tmp.h | 43 +
>  src/mesa/vbo/vbo_exec.h       |  3 ++-
>  src/mesa/vbo/vbo_exec_api.c   | 25
>  src/mesa/vbo/vbo_exec_eval.c  | 22 -
>  src/mesa/vbo/vbo_save_api.c   | 10 +-
>  7 files changed, 78 insertions(+), 56 deletions(-)
>
> diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
> index 400c158..3007491 100644
> --- a/src/mesa/main/context.c
> +++ b/src/mesa/main/context.c
> @@ -656,7 +656,7 @@ _mesa_init_constants(struct gl_constants *consts, gl_api api)
>     consts->MaxSamples = 0;
>
>     /* GLSL default if NativeIntegers == FALSE */
> -   consts->UniformBooleanTrue = FLT_AS_UINT(1.0f);
> +   consts->UniformBooleanTrue = 1;

This doesn't do what you think it does. FLT_AS_UINT(1.0f) and 1 are very different values. We need to leave the above alone, as it's the value passed in as "true" to uniforms on hardware that can only handle floating-point values.

I haven't looked very thoroughly at the rest of the patch, but I didn't see anything wrong with it either.
--Jason

>
>     /* GL_ARB_sync */
>     consts->MaxServerWaitTimeout = 0x1fff7fffULL;
> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
> index cd5f2d6..4d245e1 100644
> --- a/src/mesa/main/macros.h
> +++ b/src/mesa/main/macros.h
> @@ -170,27 +170,6 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256];
>          ub = ((GLubyte) F_TO_I((f) * 255.0F))
>  #endif
>
> -static inline GLfloat INT_AS_FLT(GLint i)
> -{
> -   fi_type tmp;
> -   tmp.i = i;
> -   return tmp.f;
> -}
> -
> -static inline GLfloat UINT_AS_FLT(GLuint u)
> -{
> -   fi_type tmp;
> -   tmp.u = u;
> -   return tmp.f;
> -}
> -
> -static inline unsigned FLT_AS_UINT(float f)
> -{
> -   fi_type tmp;
> -   tmp.f = f;
> -   return tmp.u;
> -}
> -
>  /**
>   * Convert a floating point value to an unsigned fixed point value.
>   *
> @@ -625,15 +604,11 @@ COPY_CLEAN_4V_TYPE_AS_FLOAT(GLfloat dst[4], int sz, const GLfloat src[4],
>  {
>     switch (type) {
>     case GL_FLOAT:
> -      ASSIGN_4V(dst, 0, 0, 0, 1);
> +      ASSIGN_4V(dst, 0.0f, 0.0f, 0.0f, 1.0f);
>        break;
>     case GL_INT:
> -      ASSIGN_4V(dst, INT_AS_FLT(0), INT_AS_FLT(0),
> -                     INT_AS_FLT(0), INT_AS_FLT(1));
> -      break;
>     case GL_UNSIGNED_INT:
> -      ASSIGN_4V(dst, UINT_AS_FLT(0), UINT_AS_FLT(0),
> -                     UINT_AS_FLT(0), UINT_AS_FLT(1));
> +      ASSIGN_4V(dst, 0, 0, 0, 1);
>        break;
>     default:
>        ASSIGN_4V(dst, 0.0f, 0.0f, 0.0f, 1.0f); /* silence warnings */
> diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
> index ec66934..a853cb1 100644
> --- a/src/mesa/vbo/vbo_attrib_tmp.h
> +++ b/src/mesa/vbo/vbo_attrib_tmp.h
> @@ -28,6 +28,41 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
>  #include "util/u_format_r11g11b10f.h"
>  #include "main/varray.h"
> +#include "program/prog_parameter.h"
> +
> +
> +static union gl_constant_value UINT_AS_UNION(GLuint u)
> +{
> +   union gl_constant_value tmp;
> +   tmp.u = u;
> +   return tmp;
> +}
> +
> +static inline union gl_constant_value INT_AS_UNION(GLint i)
> +{
> +   union gl_constant_value tmp;
> +   tmp.i = i;
> +   return tmp;
> +}
> +
> +static inline union gl_constant_value FLOAT_AS_UNION(GLfloat f)
> +{
> +   union gl_constant_value tmp;
> +   tmp.f = f;
> +   return tmp;
> +}
> +
> +/* ATTR */
> +#define ATTR( A, N, T, V0, V1, V2, V3 ) ATTR_##T((A), (N), (T), (V0), (V1), (V2), (V3))
> +
> +#define ATTR_GL_UNSIGNED_INT( A, N, T, V0, V1, V2, V3 ) \
> +    ATTR_UNION(A, N, T, UINT_AS_UNION(V0), UINT_AS_UNION(V1), UINT_AS_UNION(V2), UINT_AS_UNION(V3))
> +#define ATTR_GL_INT( A, N, T, V0, V1, V2, V3 ) \
> +    ATTR_UNION(A, N, T, INT_AS_UNION(V0), INT_AS_UNION(V1), INT_AS_UNION(V2), INT_AS_UNION(V3))
> +#define ATTR_GL_FLOAT( A, N, T, V0, V1, V2, V3 ) \
> +    ATTR_UNION(A, N, T, FLOAT_AS_UNION(V0), FLOAT_AS_UNION(V1), FLOAT_AS_UNION(V2), FLOAT_AS_UNION(V3))
> +
> +
>  /* float */
>  #define ATTR1FV( A, V ) ATTR( A, 1, GL_FLOAT, (V)[0], 0, 0, 1 )
>  #define ATTR2FV( A, V ) ATTR( A, 2, GL_FLOAT, (V)[0], (V)[1], 0, 1 )
> @@ -41,8 +76,8 @@ USE OR OTHER DEALINGS IN THE SOFTWARE.
>
>  /* int */
>  #define ATTRI( A, N, X, Y, Z, W) ATTR( A, N, GL_INT, \
> -                                       INT_AS_FLT(X), INT_AS_FLT(Y), \
> -                                       INT_AS_FLT(Z), INT_AS_FLT(W) )
> +                                       X, Y, \
> +                                       Z, W )
>
>  #define
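Why FLT_AS_UINT(1.0f) and 1 are "very different values" is easy to see by looking at the bits. The sketch below uses memcpy rather than Mesa's fi_type union, and the two `sketch_*` helper names are hypothetical; it assumes IEEE 754 single precision.

```c
#include <stdint.h>
#include <string.h>

/* Return the bit pattern of a float, in the spirit of FLT_AS_UINT. */
static uint32_t sketch_float_bits(float f)
{
   uint32_t u;

   memcpy(&u, &f, sizeof u);
   return u;
}

/* Reinterpret a bit pattern as a float (the opposite direction). */
static float sketch_bits_to_float(uint32_t u)
{
   float f;

   memcpy(&f, &u, sizeof f);
   return f;
}
```

The float 1.0f has the bit pattern 0x3f800000, so storing the integer 1 in UniformBooleanTrue would make float-only hardware see the tiny denormal that the bits 0x00000001 encode instead of 1.0.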
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

--- Comment #16 from Alberto Salvia Novella <es204904...@gmail.com> ---
The Synaptic package manager says there isn't any fglrx package installed, and following instructions at http://askubuntu.com/questions/78675/how-do-i-remove-the-fglrx-drivers-after-ive-installed-them-by-hand says the same.
[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #24 from Stéphane Travostino <stephane.travost...@gmail.com> ---
Thanks Michel,

yes I confirm I can reproduce the same FPS drops with glxgears, Intel and vsync disabled.

No FPS HUD for Intel, but I can see the FPS numbers change between ~5.6k and 1.6k every 10 seconds on average.
Re: [Mesa-dev] [PATCH] glsl: do not allow interface block to have name already taken
On 01/20/2015 08:29 PM, Ian Romanick wrote:
> On 01/19/2015 10:55 PM, Tapani Pälli wrote:
>> Fixes currently failing Piglit case
>>    interface-blocks-name-reused-globally.vert
>>
>> Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
>> ---
>>  src/glsl/ast_to_hir.cpp | 17 -
>>  1 file changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
>> index 811a955..13ddb00 100644
>> --- a/src/glsl/ast_to_hir.cpp
>> +++ b/src/glsl/ast_to_hir.cpp
>> @@ -5443,9 +5443,24 @@ ast_interface_block::hir(exec_list *instructions,
>>
>>     state->struct_specifier_depth--;
>>
>> -   if (!redeclaring_per_vertex)
>> +   if (!redeclaring_per_vertex) {
>> +      ir_variable *var;
>
> This is C++, so you can combine the declaration with the assignment
> below. I guess you've been bit by the C89 requirements elsewhere in
> Mesa. :)

True :) Thanks, I'll send a v2

>>        validate_identifier(this->block_name, loc, state);
>>
>> +      /* From section 4.3.9 (Interface Blocks) of the GLSL 4.50 spec:
>> +       *
>> +       *     Block names have no other use within a shader beyond interface
>> +       *     matching; it is a compile-time error to use a block name at global
>> +       *     scope for anything other than as a block name.
>> +       */
>> +      var = state->symbols->get_variable(this->block_name);
>> +      if (var && !var->type->is_interface()) {
>> +         _mesa_glsl_error(&loc, state, "Block name `%s' is "
>> +                          "already used in the scope.",
>> +                          this->block_name);
>> +      }
>
> This fixes the previously mentioned test case, but what about a test like
>
>     out block {
>         vec4 a;
>     } inst;
>
>     vec4 block;
>
> Looking at Chris's patch (piglit 14165586) that adds
> interface-blocks-name-reused-globally.vert, I don't see a test like
> this. Does that test already exist, and I just missed it?
>
>> +   }
>> +
>>     const glsl_type *earlier_per_vertex = NULL;
>>     if (redeclaring_per_vertex) {
>>        /* Find the previous declaration of gl_PerVertex.  If we're redeclaring
[Mesa-dev] [PATCH v2] glsl: do not allow interface block to have name already taken
Fixes currently failing Piglit case
   interface-blocks-name-reused-globally.vert

v2: combine var declaration with assignment (Ian)

Signed-off-by: Tapani Pälli <tapani.pa...@intel.com>
---
 src/glsl/ast_to_hir.cpp | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 811a955..1ba29f7 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -5443,9 +5443,23 @@ ast_interface_block::hir(exec_list *instructions,

    state->struct_specifier_depth--;

-   if (!redeclaring_per_vertex)
+   if (!redeclaring_per_vertex) {
       validate_identifier(this->block_name, loc, state);

+      /* From section 4.3.9 (Interface Blocks) of the GLSL 4.50 spec:
+       *
+       *     Block names have no other use within a shader beyond interface
+       *     matching; it is a compile-time error to use a block name at global
+       *     scope for anything other than as a block name.
+       */
+      ir_variable *var = state->symbols->get_variable(this->block_name);
+      if (var && !var->type->is_interface()) {
+         _mesa_glsl_error(&loc, state, "Block name `%s' is "
+                          "already used in the scope.",
+                          this->block_name);
+      }
+   }
+
    const glsl_type *earlier_per_vertex = NULL;
    if (redeclaring_per_vertex) {
       /* Find the previous declaration of gl_PerVertex.  If we're redeclaring
--
2.1.0
Re: [Mesa-dev] Implement GLX_EXT_buffer_age for DRI2
On 19 January 2015 at 21:00, Chris Wilson <ch...@chris-wilson.co.uk> wrote:
> In order to support GLX_EXT_buffer_age in DRI2, we need to pass back the
> last swap buffer count that the back buffer was defined for. For
> simplicity, we can reuse an existing field in the DRI2GetBuffers reply
> that is not used by current drivers, the flags. Since we change the
> interpretation of this flag, we also declare the semantic change with a
> DRI2 parameter and depend upon the DDX to enable the change responsibly
> (which is just a matter of reviewing whether the flags field has ever
> been used for a non-zero value).

This is just missing the "why": why do we need to add this to DRI2 when we have DRI3/Present, which should be solving it?

Doesn't DRI3 already do this?

Dave.
[Mesa-dev] [PATCH] egl/dri2: implement platform_null.
Try the render node first and use it if available. Otherwise fall back to normal nodes.

Signed-off-by: Haixia Shi <h...@chromium.org>
---
 src/egl/drivers/dri2/Makefile.am     |   5 ++
 src/egl/drivers/dri2/egl_dri2.c      |  11 ++-
 src/egl/drivers/dri2/egl_dri2.h      |   3 +
 src/egl/drivers/dri2/platform_null.c | 156 +++
 4 files changed, 173 insertions(+), 2 deletions(-)
 create mode 100644 src/egl/drivers/dri2/platform_null.c

diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am
index 79a40e8..14b2d60 100644
--- a/src/egl/drivers/dri2/Makefile.am
+++ b/src/egl/drivers/dri2/Makefile.am
@@ -64,3 +64,8 @@ if HAVE_EGL_PLATFORM_DRM
 libegl_dri2_la_SOURCES += platform_drm.c
 AM_CFLAGS += -DHAVE_DRM_PLATFORM
 endif
+
+if HAVE_EGL_PLATFORM_NULL
+libegl_dri2_la_SOURCES += platform_null.c
+AM_CFLAGS += -DHAVE_NULL_PLATFORM
+endif
diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
index 86e5f24..fd72233 100644
--- a/src/egl/drivers/dri2/egl_dri2.c
+++ b/src/egl/drivers/dri2/egl_dri2.c
@@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
       return EGL_FALSE;

    switch (disp->Platform) {
+#ifdef HAVE_NULL_PLATFORM
+   case _EGL_PLATFORM_NULL:
+      if (disp->Options.TestOnly)
+         return EGL_TRUE;
+      return dri2_initialize_null(drv, disp);
+#endif
+
 #ifdef HAVE_X11_PLATFORM
    case _EGL_PLATFORM_X11:
       if (disp->Options.TestOnly)
@@ -1571,7 +1578,7 @@ dri2_create_wayland_buffer_from_image(_EGLDriver *drv, _EGLDisplay *dpy,
    return dri2_dpy->vtbl->create_wayland_buffer_from_image(drv, dpy, img);
 }

-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
 static EGLBoolean
 dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs)
 {
@@ -1829,7 +1836,7 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp,
    case EGL_WAYLAND_BUFFER_WL:
       return dri2_create_image_wayland_wl_buffer(disp, ctx, buffer, attr_list);
 #endif
-#ifdef HAVE_DRM_PLATFORM
+#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM)
    case EGL_LINUX_DMA_BUF_EXT:
       return dri2_create_image_dma_buf(disp, ctx, buffer, attr_list);
 #endif
diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h
index 9efe1f7..e206424 100644
--- a/src/egl/drivers/dri2/egl_dri2.h
+++ b/src/egl/drivers/dri2/egl_dri2.h
@@ -332,6 +332,9 @@ dri2_initialize_wayland(_EGLDriver *drv, _EGLDisplay *disp);
 EGLBoolean
 dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp);

+EGLBoolean
+dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp);
+
 void
 dri2_flush_drawable_for_swapbuffers(_EGLDisplay *disp, _EGLSurface *draw);

diff --git a/src/egl/drivers/dri2/platform_null.c b/src/egl/drivers/dri2/platform_null.c
new file mode 100644
index 0000000..9c59809
--- /dev/null
+++ b/src/egl/drivers/dri2/platform_null.c
@@ -0,0 +1,156 @@
+/*
+ * Copyright (c) 2014 The Chromium OS Authors. All rights reserved.
+ * Use of this source code is governed by a BSD-style license that can be
+ * found in the LICENSE file.
+ */
+
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <xf86drm.h>
+#include <dlfcn.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+
+#include "egl_dri2.h"
+#include "egl_dri2_fallbacks.h"
+#include "loader.h"
+
+static struct dri2_egl_display_vtbl dri2_null_display_vtbl = {
+   .create_pixmap_surface = dri2_fallback_create_pixmap_surface,
+   .create_image = dri2_create_image_khr,
+   .swap_interval = dri2_fallback_swap_interval,
+   .swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage,
+   .swap_buffers_region = dri2_fallback_swap_buffers_region,
+   .post_sub_buffer = dri2_fallback_post_sub_buffer,
+   .copy_buffers = dri2_fallback_copy_buffers,
+   .query_buffer_age = dri2_fallback_query_buffer_age,
+   .create_wayland_buffer_from_image = dri2_fallback_create_wayland_buffer_from_image,
+   .get_sync_values = dri2_fallback_get_sync_values,
+};
+
+static void
+null_flush_front_buffer(__DRIdrawable *driDrawable, void *loaderPrivate)
+{
+}
+
+static __DRIbuffer *
+null_get_buffers_with_format(__DRIdrawable * driDrawable,
+                             int *width, int *height,
+                             unsigned int *attachments, int count,
+                             int *out_count, void *loaderPrivate)
+{
+   struct dri2_egl_surface *dri2_surf = loaderPrivate;
+   struct dri2_egl_display *dri2_dpy =
+      dri2_egl_display(dri2_surf->base.Resource.Display);
+
+   dri2_surf->buffer_count = 1;
+   if (width)
+      *width = dri2_surf->base.Width;
+   if (height)
+      *height = dri2_surf->base.Height;
+   *out_count = dri2_surf->buffer_count;;
+   return dri2_surf->buffers;
+}
+
+static const char* node_path_fmt_card = "/dev/dri/card%d";
+static const char* node_path_fmt_render = "/dev/dri/renderD%d";
+
+EGLBoolean
+dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp)
+{
+   struct dri2_egl_display
[Mesa-dev] [PATCH] egl/dri2: implement platform_null.
Try the render node first and use it if available. Otherwise fall back to normal nodes. Signed-off-by: Haixia Shi h...@chromium.org --- src/egl/drivers/dri2/Makefile.am | 5 + src/egl/drivers/dri2/egl_dri2.c | 11 ++- src/egl/drivers/dri2/egl_dri2.h | 3 + src/egl/drivers/dri2/platform_null.c | 178 +++ 4 files changed, 195 insertions(+), 2 deletions(-) create mode 100644 src/egl/drivers/dri2/platform_null.c diff --git a/src/egl/drivers/dri2/Makefile.am b/src/egl/drivers/dri2/Makefile.am index 79a40e8..14b2d60 100644 --- a/src/egl/drivers/dri2/Makefile.am +++ b/src/egl/drivers/dri2/Makefile.am @@ -64,3 +64,8 @@ if HAVE_EGL_PLATFORM_DRM libegl_dri2_la_SOURCES += platform_drm.c AM_CFLAGS += -DHAVE_DRM_PLATFORM endif + +if HAVE_EGL_PLATFORM_NULL +libegl_dri2_la_SOURCES += platform_null.c +AM_CFLAGS += -DHAVE_NULL_PLATFORM +endif diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c index 86e5f24..fd72233 100644 --- a/src/egl/drivers/dri2/egl_dri2.c +++ b/src/egl/drivers/dri2/egl_dri2.c @@ -632,6 +632,13 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp) return EGL_FALSE; switch (disp-Platform) { +#ifdef HAVE_NULL_PLATFORM + case _EGL_PLATFORM_NULL: + if (disp-Options.TestOnly) + return EGL_TRUE; + return dri2_initialize_null(drv, disp); +#endif + #ifdef HAVE_X11_PLATFORM case _EGL_PLATFORM_X11: if (disp-Options.TestOnly) @@ -1571,7 +1578,7 @@ dri2_create_wayland_buffer_from_image(_EGLDriver *drv, _EGLDisplay *dpy, return dri2_dpy-vtbl-create_wayland_buffer_from_image(drv, dpy, img); } -#ifdef HAVE_DRM_PLATFORM +#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM) static EGLBoolean dri2_check_dma_buf_attribs(const _EGLImageAttribs *attrs) { @@ -1829,7 +1836,7 @@ dri2_create_image_khr(_EGLDriver *drv, _EGLDisplay *disp, case EGL_WAYLAND_BUFFER_WL: return dri2_create_image_wayland_wl_buffer(disp, ctx, buffer, attr_list); #endif -#ifdef HAVE_DRM_PLATFORM +#if defined(HAVE_DRM_PLATFORM) || defined(HAVE_NULL_PLATFORM) case 
EGL_LINUX_DMA_BUF_EXT: return dri2_create_image_dma_buf(disp, ctx, buffer, attr_list); #endif diff --git a/src/egl/drivers/dri2/egl_dri2.h b/src/egl/drivers/dri2/egl_dri2.h index 9efe1f7..e206424 100644 --- a/src/egl/drivers/dri2/egl_dri2.h +++ b/src/egl/drivers/dri2/egl_dri2.h @@ -332,6 +332,9 @@ dri2_initialize_wayland(_EGLDriver *drv, _EGLDisplay *disp); EGLBoolean dri2_initialize_android(_EGLDriver *drv, _EGLDisplay *disp); +EGLBoolean +dri2_initialize_null(_EGLDriver *drv, _EGLDisplay *disp); + void dri2_flush_drawable_for_swapbuffers(_EGLDisplay *disp, _EGLSurface *draw); diff --git a/src/egl/drivers/dri2/platform_null.c b/src/egl/drivers/dri2/platform_null.c new file mode 100644 index 000..4f0b18f --- /dev/null +++ b/src/egl/drivers/dri2/platform_null.c @@ -0,0 +1,178 @@ +/* + * Mesa 3-D graphics library + * + * Copyright (c) 2014 The Chromium OS Authors. All rights reserved. + * + * Based on platform_x11, which has + * + * Copyright © 2011 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included + * in all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER + * DEALINGS IN THE SOFTWARE. + */ + +#include stdlib.h +#include stdio.h +#include string.h +#include xf86drm.h +#include dlfcn.h +#include sys/types.h +#include sys/stat.h +#include fcntl.h +#include unistd.h + +#include egl_dri2.h +#include egl_dri2_fallbacks.h +#include loader.h + +static struct dri2_egl_display_vtbl dri2_null_display_vtbl = { + .create_pixmap_surface = dri2_fallback_create_pixmap_surface, + .create_image = dri2_create_image_khr, + .swap_interval = dri2_fallback_swap_interval, + .swap_buffers_with_damage = dri2_fallback_swap_buffers_with_damage, + .swap_buffers_region = dri2_fallback_swap_buffers_region, + .post_sub_buffer = dri2_fallback_post_sub_buffer, + .copy_buffers = dri2_fallback_copy_buffers, + .query_buffer_age = dri2_fallback_query_buffer_age, +
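The probing order from the commit message (render node first, legacy card node as the fallback) can be sketched as follows. This is an illustration only, not the code from the patch: `open_first_node`, `probe_fn`, the fake probes, and the scanned minor ranges are assumptions; only the two `/dev/dri/...` path formats come from the patch. The open step is passed in as a callback so the ordering can be checked without real devices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical callback: returns a non-negative descriptor on success and
 * -1 on failure. A real caller would wrap open(2) here. */
typedef int (*probe_fn)(const char *path);

/* Assumptions for this sketch: render minors start at 128, card minors at 0,
 * and only a small fixed range of minors is scanned. */
enum { RENDER_BASE = 128, CARD_BASE = 0, NODE_LIMIT = 16 };

/* Format the device path for one minor number, using the two path formats
 * that appear in the patch. */
static void
format_node_path(char *out, size_t len, int is_render, int minor)
{
   int n = is_render ? snprintf(out, len, "/dev/dri/renderD%d", minor)
                     : snprintf(out, len, "/dev/dri/card%d", minor);
   assert(n > 0 && (size_t)n < len);
   (void)n;
}

/* Try every node of one kind; return the first descriptor that opens. */
static int
try_nodes(int is_render, probe_fn probe, char *out, size_t len)
{
   int base = is_render ? RENDER_BASE : CARD_BASE;

   for (int i = 0; i < NODE_LIMIT; i++) {
      format_node_path(out, len, is_render, base + i);
      int fd = probe(out);
      if (fd >= 0)
         return fd;
   }
   return -1;
}

/* Render nodes first, legacy card nodes as the fallback. On success, out
 * holds the path that was opened. */
static int
open_first_node(probe_fn probe, char *out, size_t len)
{
   int fd = try_nodes(1, probe, out, len);
   if (fd < 0)
      fd = try_nodes(0, probe, out, len);
   return fd;
}

/* Fake probes standing in for real devices. */
static int
probe_card0_only(const char *path)
{
   return strcmp(path, "/dev/dri/card0") == 0 ? 3 : -1;
}

static int
probe_render_and_card(const char *path)
{
   if (strcmp(path, "/dev/dri/renderD128") == 0)
      return 7;
   return strcmp(path, "/dev/dri/card0") == 0 ? 3 : -1;
}
```

With `probe_card0_only` the render scan fails and the card node is used; with `probe_render_and_card` the render node wins even though a card node is also present.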
Re: [Mesa-dev] [PATCH 03/11] i965: Fix negate with unsigned integers
On Tue, 2015-01-20 at 19:06 -0800, Anuj Phogat wrote:
On Mon, Jan 19, 2015 at 3:32 AM, Eduardo Lima Mitev el...@igalia.com wrote:
From: Iago Toral Quiroga ito...@igalia.com

For code such as:

uint tmp1 = uint(in0);
uint tmp2 = -tmp1;
float out0 = float(tmp2);

We produce code like:

mov(8) g5<1>.xF -g9<4,4,1>.xUD

which does not produce correct results. This code produces the results we would expect if tmp1 and tmp2 were signed integers instead.

It seems that a similar problem was detected and addressed when using negations with unsigned integers as part of conditionals, but it looks like the problem has a wider impact than that.

This patch fixes the problem by preventing copy-propagation of negated UD registers in all scenarios, not only in conditionals.

Fixes the following 8 dEQP tests:

dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uint_vertex
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uint_fragment
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec2_vertex
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec2_fragment
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec3_vertex
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec3_fragment
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec4_vertex
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.highp_uvec4_fragment

Note: For some reason the mediump and lowp versions of these tests still fail but I am not sure about the reason for that since the code we generate now seems correct (in fact, is the same as for the highp versions). These tests would need further investigation.

Thanks for testing and reviewing, Anuj. I re-tested it too and realized this also fixes the mediump and lowp versions of the tests, so I'll update the commit log accordingly.
Iago

---
 src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp   | 9 ++++++---
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 9 ++++-----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
index 70f417f..5dd7255 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_copy_propagation.cpp
@@ -302,9 +302,12 @@ fs_visitor::try_copy_propagate(fs_inst *inst, int arg, acp_entry *entry)
        (entry->dst.reg_offset + entry->regs_written) * 32)
       return false;
 
-   /* See resolve_ud_negate() and comment in brw_fs_emit.cpp. */
-   if (inst->conditional_mod &&
-       inst->src[arg].type == BRW_REGISTER_TYPE_UD &&
+   /* we can't generally copy-propagate UD negations because we
+    * can end up accessing the resulting values as signed integers
+    * instead. See also resolve_ud_negate() and comment in
+    * fs_generator::generate_code.
+    */
+   if (inst->src[arg].type == BRW_REGISTER_TYPE_UD &&
        entry->src.negate)
       return false;
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 9e47dd9..562ecb7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -318,12 +318,11 @@ try_copy_propagate(struct brw_context *brw, vec4_instruction *inst,
    if (inst->is_send_from_grf())
       return false;
 
-   /* We can't copy-propagate a UD negation into a condmod
-    * instruction, because the condmod ends up looking at the 33-bit
-    * signed accumulator value instead of the 32-bit value we wanted
+   /* we can't generally copy-propagate UD negations because we
+    * end up accessing the resulting values as signed integers
+    * instead. See also resolve_ud_negate().
     */
-   if (inst->conditional_mod &&
-       value.negate &&
+   if (value.negate &&
        value.type == BRW_REGISTER_TYPE_UD)
       return false;
 
--
2.1.3

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Changes look good to me. Verified that patch now generates correct code for the example in the comment.

Reviewed-by: Anuj Phogat anuj.pho...@gmail.com
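The GLSL snippet from the commit message can be mirrored on the CPU to show why reading a negated unsigned value as signed gives the wrong float. This is a minimal sketch in plain C, not driver code; both function names are made up for illustration, and the "signed" variant only models the symptom the commit message describes (results as if tmp1 and tmp2 were signed).

```c
#include <assert.h>
#include <stdint.h>

/* What the shader asks for: negate in 32-bit unsigned arithmetic (wraps
 * modulo 2^32), then convert the unsigned result to float. */
static float
negate_then_convert_unsigned(uint32_t in0)
{
   uint32_t tmp2 = 0u - in0;          /* e.g. 5 becomes 4294967291 */
   return (float)tmp2;
}

/* The symptom from the commit message: the same bits negated and read back
 * as a signed integer before the conversion to float. */
static float
negate_then_convert_signed(uint32_t in0)
{
   assert(in0 <= (uint32_t)INT32_MAX); /* keep the signed negate well defined */
   int32_t tmp2 = -(int32_t)in0;       /* e.g. 5 becomes -5 */
   return (float)tmp2;
}
```

For an input of 5 the unsigned path yields a value near 4.29e9 (4294967291 rounds to 2^32 as a float), while the signed path yields -5.0, which is the kind of mismatch the dEQP tests catch.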
[Mesa-dev] [Bug 88467] nir.c:140: error: ‘nir_src’ has no member named ‘ssa’
https://bugs.freedesktop.org/show_bug.cgi?id=88467

Vinson Lee v...@freedesktop.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |airl...@freedesktop.org
         QA Contact|                            |mesa-dev@lists.freedesktop.org

--
You are receiving this mail because:
You are the QA Contact for the bug.
Re: [Mesa-dev] [PATCH] nir: Add a nir_foreach_phi_src helper macro
On Tue, Jan 20, 2015 at 4:37 PM, Connor Abbott cwabbo...@gmail.com wrote: Assuming you grepped for uses of foreach_list* with nir_phi_src and made sure there were no more, I did Reviewed-by: Connor Abbott cwabbott02gmail.com thanks --Jason On Tue, Jan 20, 2015 at 7:34 PM, Jason Ekstrand ja...@jlekstrand.net wrote: --- src/glsl/nir/nir.c | 4 ++-- src/glsl/nir/nir.h | 3 +++ src/glsl/nir/nir_from_ssa.c| 4 ++-- src/glsl/nir/nir_live_variables.c | 2 +- src/glsl/nir/nir_opt_cse.c | 4 ++-- src/glsl/nir/nir_opt_peephole_select.c | 2 +- src/glsl/nir/nir_print.c | 2 +- src/glsl/nir/nir_to_ssa.c | 2 +- src/glsl/nir/nir_validate.c| 2 +- 9 files changed, 14 insertions(+), 11 deletions(-) diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c index 81dec1c..89e21fd 100644 --- a/src/glsl/nir/nir.c +++ b/src/glsl/nir/nir.c @@ -731,7 +731,7 @@ rewrite_phi_preds(nir_block *block, nir_block *old_pred, nir_block *new_pred) break; nir_phi_instr *phi = nir_instr_as_phi(instr); - foreach_list_typed_safe(nir_phi_src, src, node, phi-srcs) { + nir_foreach_phi_src(phi, src) { if (src-pred == old_pred) { src-pred = new_pred; break; @@ -1585,7 +1585,7 @@ visit_load_const_src(nir_load_const_instr *instr, nir_foreach_src_cb cb, static bool visit_phi_src(nir_phi_instr *instr, nir_foreach_src_cb cb, void *state) { - foreach_list_typed(nir_phi_src, src, node, instr-srcs) { + nir_foreach_phi_src(instr, src) { if (!visit_src(src-src, cb, state)) return false; } diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h index 8861809..f31d0e0 100644 --- a/src/glsl/nir/nir.h +++ b/src/glsl/nir/nir.h @@ -990,6 +990,9 @@ typedef struct { nir_src src; } nir_phi_src; +#define nir_foreach_phi_src(phi, entry) \ + foreach_list_typed(nir_phi_src, entry, node, (phi)-srcs) + typedef struct { nir_instr instr; diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c index 0258699..9728b99 100644 --- a/src/glsl/nir/nir_from_ssa.c +++ b/src/glsl/nir/nir_from_ssa.c @@ -343,7 +343,7 @@ 
isolate_phi_nodes_block(nir_block *block, void *void_state) nir_phi_instr *phi = nir_instr_as_phi(instr); assert(phi-dest.is_ssa); - foreach_list_typed(nir_phi_src, src, node, phi-srcs) { + nir_foreach_phi_src(phi, src) { nir_parallel_copy_instr *pcopy = get_parallel_copy_at_end_of_block(src-pred); assert(pcopy); @@ -412,7 +412,7 @@ coalesce_phi_nodes_block(nir_block *block, void *void_state) assert(phi-dest.is_ssa); merge_node *dest_node = get_merge_node(phi-dest.ssa, state); - foreach_list_typed(nir_phi_src, src, node, phi-srcs) { + nir_foreach_phi_src(phi, src) { assert(src-src.is_ssa); merge_node *src_node = get_merge_node(src-src.ssa, state); if (src_node-set != dest_node-set) diff --git a/src/glsl/nir/nir_live_variables.c b/src/glsl/nir/nir_live_variables.c index f110c5e..7402dc0 100644 --- a/src/glsl/nir/nir_live_variables.c +++ b/src/glsl/nir/nir_live_variables.c @@ -147,7 +147,7 @@ propagate_across_edge(nir_block *pred, nir_block *succ, break; nir_phi_instr *phi = nir_instr_as_phi(instr); - foreach_list_typed(nir_phi_src, src, node, phi-srcs) { + nir_foreach_phi_src(phi, src) { if (src-pred == pred) { set_src_live(src-src, live); break; diff --git a/src/glsl/nir/nir_opt_cse.c b/src/glsl/nir/nir_opt_cse.c index e7dba1d..89d78c8 100644 --- a/src/glsl/nir/nir_opt_cse.c +++ b/src/glsl/nir/nir_opt_cse.c @@ -99,8 +99,8 @@ nir_instrs_equal(nir_instr *instr1, nir_instr *instr2) if (phi1-instr.block != phi2-instr.block) return false; - foreach_list_typed(nir_phi_src, src1, node, phi1-srcs) { - foreach_list_typed(nir_phi_src, src2, node, phi2-srcs) { + nir_foreach_phi_src(phi1, src1) { + nir_foreach_phi_src(phi2, src2) { if (src1-pred == src2-pred) { if (!nir_srcs_equal(src1-src, src2-src)) return false; diff --git a/src/glsl/nir/nir_opt_peephole_select.c b/src/glsl/nir/nir_opt_peephole_select.c index 3e8c938..5d2f5d6 100644 --- a/src/glsl/nir/nir_opt_peephole_select.c +++ b/src/glsl/nir/nir_opt_peephole_select.c @@ -140,7 +140,7 @@ 
nir_opt_peephole_select_block(nir_block *block, void *void_state) memset(sel-src[0].swizzle, 0, sizeof sel-src[0].swizzle); assert(exec_list_length(phi-srcs) == 2); - foreach_list_typed(nir_phi_src, src, node, phi-srcs) { + nir_foreach_phi_src(phi, src) { assert(src-pred == then_block || src-pred
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536 --- Comment #15 from Timothy Arceri t_arc...@yahoo.com.au --- The open and closed drivers currently don't play nice together. Hopefully one day that will change. But to me it looks like you haven't fully removed fglrx and this is causing you problems.
Re: [Mesa-dev] [PATCH v2] glsl: Improve precision of mod(x,y)
On 01/20/2015 08:09 AM, Iago Toral Quiroga wrote:
Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a down side though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965:

Operation                           Precision error
-----------------------------------------------------
mod(-1.951171875, 1.9980468750)     0.0000000447
mod(121.57, 13.29)                  0.0000023842
mod(3769.12, 321.99)                0.0000762939
mod(3769.12, 1321.99)               0.0001220703
mod(-987654.125, 123456.984375)     0.0160663128
mod( 987654.125, 123456.984375)     0.0312500000

This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages:

mod(x,y) = x - y * floor(x/y)

This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost.

v2 (Ian Romanick)
- Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code.
Fixes the following 16 dEQP tests: dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_* dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_* Reviewed-by: Ian Romanick ian.d.roman...@intel.com --- src/glsl/README| 2 +- src/glsl/ir_optimization.h | 2 +- src/glsl/lower_instructions.cpp| 65 +++--- src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 2 +- src/mesa/drivers/dri/i965/brw_shader.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +- src/mesa/program/ir_to_mesa.cpp| 4 +- src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 +- 8 files changed, 47 insertions(+), 34 deletions(-) diff --git a/src/glsl/README b/src/glsl/README index 2f93f12..bfcf69f 100644 --- a/src/glsl/README +++ b/src/glsl/README @@ -187,7 +187,7 @@ You may also need to update the backends if they will see the new expr type: You can then use the new expression from builtins (if all backends would rather see it), or scan the IR and convert to use your new -expression type (see ir_mod_to_fract, for example). +expression type (see ir_mod_to_floor, for example). Q: How is memory management handled in the compiler? diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h index 34e0b4b..912d910 100644 --- a/src/glsl/ir_optimization.h +++ b/src/glsl/ir_optimization.h @@ -34,7 +34,7 @@ #define EXP_TO_EXP20x04 #define POW_TO_EXP20x08 #define LOG_TO_LOG20x10 -#define MOD_TO_FRACT 0x20 +#define MOD_TO_FLOOR 0x20 #define INT_DIV_TO_MUL_RCP 0x40 #define BITFIELD_INSERT_TO_BFM_BFI 0x80 #define LDEXP_TO_ARITH 0x100 diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp index 6842853..09afe55 100644 --- a/src/glsl/lower_instructions.cpp +++ b/src/glsl/lower_instructions.cpp @@ -36,7 +36,7 @@ * - EXP_TO_EXP2 * - POW_TO_EXP2 * - LOG_TO_LOG2 - * - MOD_TO_FRACT + * - MOD_TO_FLOOR * - LDEXP_TO_ARITH * - BITFIELD_INSERT_TO_BFM_BFI * - CARRY_TO_ARITH @@ -77,14 +77,17 @@ * Many older GPUs don't have an x**y instruction. 
For these GPUs, convert * x**y to 2**(y * log2(x)). * - * MOD_TO_FRACT: + * MOD_TO_FLOOR: * - - * Breaks an ir_binop_mod expression down to (op1 * fract(op0 / op1)) + * Breaks an ir_binop_mod expression down to (op0 - op1 * floor(op0 / op1)) * * Many GPUs don't have a MOD instruction (945 and 965 included), and * if we have to break it down like this anyway, it gives an * opportunity to do things like constant fold the (1.0 / op1) easily. * + * Note: before we used to implement this as op1 * fract(op / op1) but this + * implementation had significant precision errors. + * * LDEXP_TO_ARITH: * - * Converts ir_binop_ldexp to arithmetic and bit operations. @@ -136,7 +139,7 @@ private: void sub_to_add_neg(ir_expression *); void div_to_mul_rcp(ir_expression *); void int_div_to_mul_rcp(ir_expression *); - void mod_to_fract(ir_expression *); + void mod_to_floor(ir_expression *); void exp_to_exp2(ir_expression *); void pow_to_exp2(ir_expression *); void log_to_log2(ir_expression *); @@ -276,22 +279,29 @@ lower_instructions_visitor::log_to_log2(ir_expression *ir) } void -lower_instructions_visitor::mod_to_fract(ir_expression *ir)
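The two lowerings discussed in this patch can be written side by side in plain C. This is only a sketch of the algebra, not the IR-building code from lower_instructions.cpp: the function names and the `floor_f` helper are made up for illustration, and CPU IEEE arithmetic will not reproduce the i965 error figures quoted in the commit message, which depend on the hardware's own math.

```c
#include <assert.h>

/* Stand-in for floor(), valid only for |v| well inside the range of long.
 * Used here so the sketch does not depend on libm. */
static float
floor_f(float v)
{
   assert(v > -1.0e9f && v < 1.0e9f);
   float t = (float)(long)v;          /* truncates toward zero */
   return (t > v) ? t - 1.0f : t;     /* step down for negative fractions */
}

/* Old lowering (MOD_TO_FRACT): op1 * fract(op0 / op1), where
 * fract(v) = v - floor(v). Any error in the fraction is scaled by y. */
static float
mod_via_fract(float x, float y)
{
   assert(y != 0.0f);
   float q = x / y;
   return y * (q - floor_f(q));
}

/* New lowering (MOD_TO_FLOOR): op0 - op1 * floor(op0 / op1), the form
 * recommended in the GLSL man pages. */
static float
mod_via_floor(float x, float y)
{
   assert(y != 0.0f);
   return x - y * floor_f(x / y);
}
```

Both forms give a result with the sign of y, so mod(-1, 3) is 2 rather than -1. In the floor form the quotient is rounded to an integer before it is multiplied by y, which is why the fractional rounding error is no longer amplified.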
[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #21 from Stéphane Travostino stephane.travost...@gmail.com ---
(In reply to Hohahiu from comment #19)
Maybe a wild guess, but what are the temperatures of the GPUs in your laptop? Is this an overheating issue?

No, but GOOD NEWS: I've managed to reproduce the same problem with glxgears + RADEON.

Here's a self-explanatory screenshot with Gallium FPS HUD enabled:
https://dl.dropboxusercontent.com/u/64733/Screenshot%20from%202015-01-20%2020%3A25%3A42.png

I'm trying to reproduce the same thing with Intel, but it's V-synced and I can't manage to have it run at more than 59 FPS.

One bizarre thing I've noticed is that no matter the complexity of the game, the fan speed is relatively slow compared to Windows, where any AAA game makes my laptop sound like a jet engine. I'd empirically say it's running at 50%, where 0% is normal operation and 100% is jet-fighter loud.

Hope this helps.
Re: [Mesa-dev] [mesa 9/9] glx/dri2: Implement getBufferAge
On 01/19/2015 03:00 AM, Chris Wilson wrote: Within the DRI2GetBuffers return packet there is a 4-byte field that is currently unused by any driver, i.e. flags. With the co-operation of a suitably modified X server, we can pass the last SBC on which the buffer was defined (i.e. the last SwapBuffers for which it was used) and 0 if it is fresh (with a slight loss of precision). We can then compare the flags field of the DRIBuffer against the current swap buffers count and so compute the age of the back buffer (thus satisfying GLX_EXT_buffer_age). As we reuse a driver specific field within the DRI2GetBuffers packet, we first query whether the X/DDX are ready to supply the new value using a DRI2GetParam request. Another caveat is that we need to complete the SwapBuffers/GetBuffers roundtrip before reporting the back buffer age so that it tallies against the buffer used for rendering. As with all things X, there is a race between the query and the rendering where the buffer may be invalidated by the server. However, for the primary usecase (that of a compositing manager), the DRI2Drawable is only accessible to a single client mitigating the impact of the issue. 
Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk --- configure.ac | 2 +- src/glx/dri2_glx.c | 65 ++ 2 files changed, 66 insertions(+), 1 deletion(-) diff --git a/configure.ac b/configure.ac index 870435c..ca1da86 100644 --- a/configure.ac +++ b/configure.ac @@ -65,7 +65,7 @@ LIBDRM_INTEL_REQUIRED=2.4.52 LIBDRM_NVVIEUX_REQUIRED=2.4.33 LIBDRM_NOUVEAU_REQUIRED=2.4.33 libdrm = 2.4.41 LIBDRM_FREEDRENO_REQUIRED=2.4.57 -DRI2PROTO_REQUIRED=2.6 +DRI2PROTO_REQUIRED=2.9 DRI3PROTO_REQUIRED=1.0 PRESENTPROTO_REQUIRED=1.0 LIBUDEV_REQUIRED=151 diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c index 0577804..b43f115 100644 --- a/src/glx/dri2_glx.c +++ b/src/glx/dri2_glx.c @@ -917,6 +917,67 @@ dri2GetBuffersWithFormat(__DRIdrawable * driDrawable, } static int +dri2HasBufferAge(int screen, struct glx_display * priv) +{ + const struct dri2_display *const pdp = +(struct dri2_display *)priv-dri2Display; + CARD64 value; + + if (pdp-driMajor = 1 pdp-driMinor 4) +return 0; + + value = 0; + if (!DRI2GetParam(priv-dpy, RootWindow(priv-dpy, screen), + DRI2ParamXHasBufferAge, value)) + return 0; + + return value; +} + +static int +dri2GetBufferAge(__GLXDRIdrawable *pdraw) +{ + struct dri2_drawable *priv = (struct dri2_drawable *) pdraw; + int i, age = 0; + + if (priv-swap_pending) { +unsigned int attachments[5]; I see other callers that have attachments of at least 8 (although it appears that intel_query_dri2_buffers only needs 2). Could we at least get an assertion or something that priv-bufferCount = ARRAY_SIZE(attachments)? A (hypothetical) driver doing stereo rendering with separate, DDX managed, depth and stencil buffers would need 6. A (again, hypothetical) driver with AUX buffers could need... more. 
+DRI2Buffer *buffers; + +for (i = 0; i priv-bufferCount; i++) +attachments[i] = priv-buffers[i].attachment; + +buffers = DRI2GetBuffers(priv-base.psc-dpy, priv-base.xDrawable, + priv-width, priv-height, + attachments, i, i); Most drivers prefer DRI2GetBuffersWithFormat, and some drivers only use DRI2GetBuffersWithFormat. Is mixing DRI2GetBuffersWithFormat and DRI2GetBuffers going to cause problems or unexpected behavior changes? +if (buffers == NULL) +return 0; + +process_buffers(priv, buffers, i); +free(buffers); + +dri2XcbSwapBuffersComplete(priv); + } + + if (!priv-have_back) + return 0; + + for (i = 0; i priv-bufferCount; i++) { +if (priv-buffers[i].attachment != __DRI_BUFFER_BACK_LEFT) +continue; + +if (priv-buffers[i].flags == 0) +continue; + +age = priv-last_swap_sbc - priv-buffers[i].flags + 1; +if (age 0) +age = 0; I was going to comment that this looked like it calculated age wrong when the buffers had different ages. Then I realized that age should only be calculated once. I think this would be more obvious if the body of the loop were: if (priv-buffers[i].attachment == __DRI_BUFFER_BACK_LEFT priv-buffers[i].flags != 0) { age = priv-last_swap_sbc - priv-buffers[i].flags + 1; if (age 0) age = 0; break; } I also just noticed that your patches are mixing tabs and spaces (use spaces only) and are using a mix of 3-space and 8-space (maybe?) indents (use 3 spaces only). + } + + return age; +} + +static int dri2SetSwapInterval(__GLXDRIdrawable *pdraw, int interval) { xcb_connection_t *c =
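The age computation under review, written with the single-test loop body Ian suggests, can be sketched on its own. The struct, the enum values, and the function name below are hypothetical stand-ins for the DRI2 types in the patch; only the formula (last swap SBC minus the buffer's flags, plus one, clamped at zero) is taken from the quoted code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DRI2 attachment tokens. */
enum { ATTACH_FRONT_LEFT = 0, ATTACH_BACK_LEFT = 1 };

/* Hypothetical reduced buffer record. */
struct buffer {
   unsigned attachment;
   unsigned flags;      /* last SBC that defined the buffer, 0 if fresh */
};

/* Age of the back-left buffer: 0 if it is missing or fresh, otherwise the
 * number of swaps since it was last defined, counting the current one. */
static int
back_buffer_age(const struct buffer *buffers, size_t count,
                int64_t last_swap_sbc)
{
   int64_t age = 0;

   assert(buffers != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (buffers[i].attachment == ATTACH_BACK_LEFT &&
          buffers[i].flags != 0) {
         age = last_swap_sbc - (int64_t)buffers[i].flags + 1;
         if (age < 0)
            age = 0;
         break;   /* only one back-left buffer is expected */
      }
   }

   return (int)age;
}
```

The early `break` makes it explicit that the age is computed at most once, which is the readability point raised in the review.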
[Mesa-dev] [Bug 87886] constant fps drops with Intel and Radeon
https://bugs.freedesktop.org/show_bug.cgi?id=87886

--- Comment #22 from Stéphane Travostino stephane.travost...@gmail.com ---
(In reply to Eero Tamminen from comment #20)
Are you still getting INTEL_DEBUG=perf output from them?
- if you're still getting recompile messages, re-check you have latest Mesa
- if there are no perf warnings, check that:
  * your dmesg doesn't have any suspicious warnings
  * top output doesn't show things to be CPU limited and you having some background CPU / X hog occasionally stalling things for the foreground app (You need another machine to monitor this when running things at fullscreen)

No warnings, no dmesg warnings whatsoever.

Following up my latest update, here's my dmesg after running glxgears:
http://pastebin.com/0zvVCXdB

There are multiple power state switches as I ran glxgears w/PRIME multiple times in a row.
Re: [Mesa-dev] [mesa 9/9] glx/dri2: Implement getBufferAge
On 01/20/2015 12:35 PM, Ian Romanick wrote:
On 01/19/2015 03:00 AM, Chris Wilson wrote:
+      DRI2Buffer *buffers;
+
+      for (i = 0; i < priv->bufferCount; i++)
+         attachments[i] = priv->buffers[i].attachment;
+
+      buffers = DRI2GetBuffers(priv->base.psc->dpy, priv->base.xDrawable,
+                               &priv->width, &priv->height,
+                               attachments, i, &i);

Most drivers prefer DRI2GetBuffersWithFormat, and some drivers only use DRI2GetBuffersWithFormat. Is mixing DRI2GetBuffersWithFormat and DRI2GetBuffers going to cause problems or unexpected behavior changes?

Okay... I hadn't seen the server change until after I sent this review. I sent some comments there. This should be fine, but can we get a comment that it relies on DRI2GetBuffers re-using the format from the previous DRI2GetBuffersWithFormat? That way the next person to look at the code won't make the same mistake I made.
Re: [Mesa-dev] [xorg 1/3] dri2: Allow GetBuffers to match any format
On 01/19/2015 03:00 AM, Chris Wilson wrote:
Since the introduction of DRI2GetBuffersWithFormat, the old DRI2GetBuffers interface would always recreate all buffers all the time as it was no longer agnostic to the format value being set by the DDXes. This causes an issue with clients intermixing the two requests, rendering any sharing or caching of buffers (e.g. for triple buffering) void.

Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk
---
 hw/xfree86/dri2/dri2.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/hw/xfree86/dri2/dri2.c b/hw/xfree86/dri2/dri2.c
index 43a1899..f9f594d 100644
--- a/hw/xfree86/dri2/dri2.c
+++ b/hw/xfree86/dri2/dri2.c
@@ -464,14 +464,16 @@ find_attachment(DRI2DrawablePtr pPriv, unsigned attachment)
 static Bool
 allocate_or_reuse_buffer(DrawablePtr pDraw, DRI2ScreenPtr ds,
                          DRI2DrawablePtr pPriv,
-                         unsigned int attachment, unsigned int format,
+                         unsigned int attachment,
+                         int has_format, unsigned int format,
                          int dimensions_match, DRI2BufferPtr * buffer)
 {
     int old_buf = find_attachment(pPriv, attachment);
 
     if ((old_buf < 0)
         || attachment == DRI2BufferFrontLeft
-        || !dimensions_match || (pPriv->buffers[old_buf]->format != format)) {
+        || !dimensions_match
+        || (has_format && pPriv->buffers[old_buf]->format != format)) {
         *buffer = create_buffer(ds, pDraw, attachment, format);

Shouldn't the create_buffer change if !has_format? If !has_format and, say, !dimensions_match, create_buffer will get format = 0 when it should get format = pPriv->buffers[old_buf]->format. Right?

Another alternative would be to have the caller always pass a format: either the format supplied in the protocol or the format of the old buffer. That might be more messy. Dunno.

         return TRUE;
 
@@ -549,7 +551,8 @@ do_get_buffers(DrawablePtr pDraw, int *width, int *height,
         const unsigned format = (has_format) ? *(attachments++) : 0;
 
         if (allocate_or_reuse_buffer(pDraw, ds, pPriv, attachment,
-                                     format, dimensions_match, &buffers[i]))
+                                     has_format, format, dimensions_match,
+                                     &buffers[i]))
             buffers_changed = 1;
 
         if (buffers[i] == NULL)
@@ -584,7 +587,7 @@ do_get_buffers(DrawablePtr pDraw, int *width, int *height,
 
     if (need_real_front > 0) {
         if (allocate_or_reuse_buffer(pDraw, ds, pPriv, DRI2BufferFrontLeft,
-                                     front_format, dimensions_match,
+                                     has_format, front_format, dimensions_match,
                                      &buffers[i]))
             buffers_changed = 1;
 
@@ -595,7 +598,7 @@ do_get_buffers(DrawablePtr pDraw, int *width, int *height,
 
     if (need_fake_front > 0) {
         if (allocate_or_reuse_buffer(pDraw, ds, pPriv, DRI2BufferFakeFrontLeft,
-                                     front_format, dimensions_match,
+                                     has_format, front_format, dimensions_match,
                                      &buffers[i]))
             buffers_changed = 1;
 
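The reuse decision in this patch, together with the alternative Ian raises (creating the replacement buffer with the old buffer's format when the request carries none), can be modelled as a small pure function. This is a sketch under assumptions, not the server code: `needs_new_buffer`, its parameters, and the `ATTACHMENT_FRONT_LEFT` constant are hypothetical, and the fallback to the old format is the reviewer's suggestion rather than something the posted patch does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for DRI2BufferFrontLeft. */
enum { ATTACHMENT_FRONT_LEFT = 0 };

/* Returns true if a new buffer must be created, and stores in
 * *create_format the format it should be created with. */
static bool
needs_new_buffer(bool have_old, unsigned old_format, unsigned attachment,
                 bool has_format, unsigned format, bool dimensions_match,
                 unsigned *create_format)
{
   assert(create_format != NULL);

   /* Reviewer's suggestion: without a client-supplied format, keep the old
    * buffer's format instead of falling back to 0. */
   if (has_format)
      *create_format = format;
   else
      *create_format = have_old ? old_format : 0;

   /* The condition from the patch: a format mismatch only forces a new
    * buffer when the request actually carries a format. */
   return !have_old
       || attachment == ATTACHMENT_FRONT_LEFT
       || !dimensions_match
       || (has_format && old_format != format);
}
```

A plain DRI2GetBuffers request (no format) against an existing, correctly sized back buffer therefore reuses it, and a resize without a format recreates the buffer with the previous format.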
Re: [Mesa-dev] [PATCH 10/16] i965/fs: Eliminate null-dst instructions without side-effects.
On Tuesday, January 20, 2015 10:13:09 AM Matt Turner wrote:
> On Tue, Jan 20, 2015 at 12:53 AM, Kenneth Graunke kenn...@whitecape.org wrote:
>> On Monday, January 19, 2015 03:31:09 PM Matt Turner wrote:
>>> ---
>>>  src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp | 11 +++
>>>  1 file changed, 11 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
>>> index 81be4de..d66808b 100644
>>> --- a/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
>>> +++ b/src/mesa/drivers/dri/i965/brw_fs_dead_code_eliminate.cpp
>>> @@ -85,6 +85,17 @@ fs_visitor::dead_code_eliminate()
>>>              }
>>>           }
>>>
>>> +         if ((inst->opcode != BRW_OPCODE_IF &&
>>> +              inst->opcode != BRW_OPCODE_WHILE) &&
>>> +             inst->dst.is_null() &&
>>> +             !inst->has_side_effects() &&
>>> +             !inst->writes_flag() &&
>>> +             !inst->writes_accumulator) {
>>> +            inst->opcode = BRW_OPCODE_NOP;
>>> +            progress = true;
>>> +            continue;
>>> +         }
>>> +
>>>           if (inst->dst.file == GRF) {
>>>              if (!inst->is_partial_write()) {
>>>                 int var = live_intervals->var_from_reg(inst->dst);
>>
>> Seems like these should be handled too...
>>
>> - BRW_OPCODE_ELSE
>> - FS_OPCODE_DISCARD_JUMP
>> - FS_OPCODE_PLACEHOLDER_HALT
>> - SHADER_OPCODE_SHADER_TIME_ADD
>
> These have BAD_FILE destinations.

Oh, my mistake - I confused BAD_FILE and ARF NULL for a moment.
You're right, of course.

>> - SHADER_OPCODE_GEN4_SCRATCH_READ
>> - SHADER_OPCODE_GEN4_SCRATCH_WRITE
>> - SHADER_OPCODE_GEN7_SCRATCH_READ
>
> The READs have non-null destinations (they have to return the data
> somewhere).

Right. I was thinking about scratch, and just pasted all the
opcodes...oops.

> And we only emit SCRATCH_* from spilling registers as part of register
> allocation. We can't ever call dead code elimination after we've
> assigned registers. (Not only do we not, but it couldn't work)

That makes sense. However, in the vec4 backend, we do emit scratch reads
and writes prior to optimization - in move_grf_array_access_to_scratch().
I've been meaning to port that over to the scalar backend. At which
point, SHADER_OPCODE_GEN4_SCRATCH_WRITE would be a problem.
But it's not yet, so,
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
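
The rule discussed in this thread (an instruction with a null destination can be turned into a NOP unless it is control flow, has side effects, or writes the flag or accumulator) can be sketched in plain C as follows. This is a simplified illustration under stated assumptions, not the i965 backend: struct sketch_inst, the SK_* opcodes and sketch_eliminate_null_dst() are hypothetical, and the early skip of existing NOPs is an addition made here so that a second run reports no progress.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes; the real backend has many more. */
enum sketch_opcode { SK_NOP, SK_ADD, SK_CMP, SK_IF, SK_WHILE, SK_SEND };

/* Hypothetical instruction record carrying only the tested properties. */
struct sketch_inst {
    enum sketch_opcode opcode;
    bool dst_is_null;
    bool has_side_effects;
    bool writes_flag;
    bool writes_accumulator;
};

/* Turns removable null-destination instructions into NOPs.
 * Returns true if anything changed. */
static bool
sketch_eliminate_null_dst(struct sketch_inst *insts, size_t count)
{
    bool progress = false;

    for (size_t i = 0; i < count; i++) {
        struct sketch_inst *inst = &insts[i];

        /* Assumption of this sketch: already-dead slots are skipped. */
        if (inst->opcode == SK_NOP)
            continue;

        if (inst->opcode != SK_IF && inst->opcode != SK_WHILE &&
            inst->dst_is_null &&
            !inst->has_side_effects &&
            !inst->writes_flag &&
            !inst->writes_accumulator) {
            inst->opcode = SK_NOP;
            progress = true;
        }
    }
    return progress;
}
```

A scratch write emitted before optimization would need has_side_effects set in this model; otherwise the pass would delete it, which is the future problem pointed out in the thread.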
Re: [Mesa-dev] [dri2proto] Declare DRI2ParamXHasBufferAge
On 01/19/2015 03:00 AM, Chris Wilson wrote:
> In order for X/DDX to reuse a driver specific field of the
> DRI2GetBuffers reply, we need to declare the change in semantics. To
> indicate that the flags field now continues the last swap buffers count
> instead, we introduce the has-buffer-age parameter.
>
> Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

> ---
>  configure.ac  |  2 +-
>  dri2proto.h   |  2 ++
>  dri2proto.txt | 11 ---
>  3 files changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/configure.ac b/configure.ac
> index 5fadf56..9f4c4a0 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -1,5 +1,5 @@
>  AC_PREREQ([2.60])
> -AC_INIT([DRI2Proto], [2.8], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg])
> +AC_INIT([DRI2Proto], [2.9], [https://bugs.freedesktop.org/enter_bug.cgi?product=xorg])
>  AM_INIT_AUTOMAKE([foreign dist-bzip2])
>
>  # Require xorg-macros: XORG_DEFAULT_OPTIONS
> diff --git a/dri2proto.h b/dri2proto.h
> index 128b807..086dc96 100644
> --- a/dri2proto.h
> +++ b/dri2proto.h
> @@ -340,6 +340,8 @@ typedef struct {
>  } xDRI2GetParamReq;
>  #define sz_xDRI2GetParamReq 12
>
> +#define DRI2ParamXHasBufferAge 0
> +
>  typedef struct {
>      BYTE type; /*X_Reply*/
>      BOOL is_param_recognized;
> diff --git a/dri2proto.txt b/dri2proto.txt
> index 9921301..9daa58e 100644
> --- a/dri2proto.txt
> +++ b/dri2proto.txt
> @@ -454,9 +454,14 @@ The name of this extension is DRI2.
>      the screen associated with 'drawable'.
>
>      Parameter names in which the value of the most significant byte is
> -    0 are reserved for the X server. Currently, no such parameter names
> -    are defined. (When any such names are defined, they will be defined in
> -    this extension specification and its associated headers).
> +    0 are reserved for the X server. The complete list of known parameter
> +    names for the X server are:
> +
> +    0 - DRI2ParamXHasBufferAge
> +
> +        Query whether the X server and DDX support passing the
> +        buffers last swap buffer count in the flags field of
> +        the DRI2GetBuffers reply.
>
>      Parameter names in which the byte's value is 1 are reserved for the
>      DDX. Such names are private to each driver and shall be defined in
>      the
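
The namespace rule quoted from dri2proto.txt (most significant byte 0 for the X server, 1 for the DDX) could be checked on the client side roughly as below. This is only a sketch: the enum and sketch_classify_param() are hypothetical names, and anything beyond the two byte values mentioned in the quoted text is reported as "other" because the excerpt does not say how those names are assigned.

```c
#include <stdint.h>

/* Hypothetical classification of a DRI2 parameter name. */
enum sketch_param_owner {
    SKETCH_OWNER_XSERVER, /* most significant byte is 0 */
    SKETCH_OWNER_DDX,     /* most significant byte is 1 */
    SKETCH_OWNER_OTHER    /* not covered by the quoted text */
};

/* Classifies a 32-bit parameter name by its most significant byte. */
static enum sketch_param_owner
sketch_classify_param(uint32_t param)
{
    switch (param >> 24) {
    case 0:
        return SKETCH_OWNER_XSERVER;
    case 1:
        return SKETCH_OWNER_DDX;
    default:
        return SKETCH_OWNER_OTHER;
    }
}
```

Under this reading DRI2ParamXHasBufferAge, defined as 0 in the patch, falls in the X server range.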
Re: [Mesa-dev] [mesa 7/9] glx/dri2: Add DRI2GetParam()
On 01/19/2015 03:00 AM, Chris Wilson wrote:
> Available since the inclusion of dri2proto 1.4 is a DRI2 request to
> query and set certain parameters about the X/DDX configuration. This
> implements the getter request.
>
> Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

This patch is
Reviewed-by: Ian Romanick ian.d.roman...@intel.com

> ---
>  src/glx/dri2.c | 29 +
>  src/glx/dri2.h |  4
>  2 files changed, 33 insertions(+)
>
> diff --git a/src/glx/dri2.c b/src/glx/dri2.c
> index cc6c164..6d9403e 100644
> --- a/src/glx/dri2.c
> +++ b/src/glx/dri2.c
> @@ -546,4 +546,33 @@ DRI2CopyRegion(Display * dpy, XID drawable, XserverRegion region,
>     SyncHandle();
>  }
>
> +Bool
> +DRI2GetParam(Display * dpy, XID drawable, CARD32 param, CARD64 *value)
> +{
> +   XExtDisplayInfo *info = DRI2FindDisplay(dpy);
> +   xDRI2GetParamReply rep;
> +   xDRI2GetParamReq *req;
> +
> +   XextCheckExtension(dpy, info, dri2ExtensionName, False);
> +
> +   LockDisplay(dpy);
> +   GetReq(DRI2GetParam, req);
> +   req->reqType = info->codes->major_opcode;
> +   req->dri2ReqType = X_DRI2GetParam;
> +   req->drawable = drawable;
> +   req->param = param;
> +
> +   if (!_XReply(dpy, (xReply *) &rep, 0, xFalse)) {
> +      UnlockDisplay(dpy);
> +      SyncHandle();
> +      return False;
> +   }
> +
> +   *value = (CARD64)rep.value_hi << 32 | rep.value_lo;
> +   UnlockDisplay(dpy);
> +   SyncHandle();
> +
> +   return rep.is_param_recognized;
> +}
> +
>  #endif /* GLX_DIRECT_RENDERING */
> diff --git a/src/glx/dri2.h b/src/glx/dri2.h
> index 4be5bf8..a5b23f0 100644
> --- a/src/glx/dri2.h
> +++ b/src/glx/dri2.h
> @@ -88,4 +88,8 @@ DRI2CopyRegion(Display * dpy, XID drawable, XserverRegion region,
>                 CARD32 dest, CARD32 src);
>
> +extern Bool
> +DRI2GetParam(Display * dpy, XID drawable,
> +             CARD32 param, CARD64 *value);
> +
>  #endif
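
The reply in this patch carries the parameter value as two 32-bit halves that are joined into one 64-bit value. A minimal stand-alone sketch of that step, using standard C integer types instead of the X protocol's CARD32/CARD64 and a hypothetical helper name, might look like this:

```c
#include <stdint.h>

/* Joins the high and low 32-bit halves of a reply into one 64-bit value.
 * sketch_combine_halves() is a name made up for this illustration. */
static uint64_t
sketch_combine_halves(uint32_t value_hi, uint32_t value_lo)
{
    /* Widen before shifting: shifting a 32-bit operand by 32 bits is
     * undefined behaviour in C. */
    return ((uint64_t)value_hi << 32) | value_lo;
}
```

The cast has to apply to the high half before the shift, which is also how the expression in the patch parses, since a cast binds tighter than the shift operator.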
Re: [Mesa-dev] [PATCH 3/5] nir: add new constant folding infrastructure
On Mon, Jan 19, 2015 at 5:01 PM, Connor Abbott cwabbo...@gmail.com wrote:
> On Mon, Jan 19, 2015 at 4:04 PM, Jason Ekstrand ja...@jlekstrand.net wrote:
>> I've got some specific comments below, but I want to make some more
>> general comments here. I like this in principle: having all the opcodes
>> self-documenting is wonderful. However, I'm not terribly happy with the
>> way it worked out. A lot of the codegen stuff is very confusing and
>> it's not at all obvious what's going on. I'll give it some thought and
>> see if I can come up with a good way to clean it up.
>>
>> On Jan 16, 2015 3:46 PM, Connor Abbott cwabbo...@gmail.com wrote:
>>> Add a required field to the Opcode class, const_expr, that contains an
>>> expression or statement that computes the result of the opcode given
>>> known constant inputs. Then take those const_expr's and expand them
>>> into a function that takes an opcode and an array of constant inputs
>>> and spits out the constant result. This means that when adding
>>> opcodes, there's one less place to update, and almost all the opcodes
>>> are self-documenting since the information on how to compute the
>>> result is right next to the definition.
>>>
>>> The helper functions in nir_constant_expressions.c were taken from
>>> ir_constant_expressions.cpp.
>>>
>>> v2: use Python formatting and get rid of regex's
>>>
>>> Signed-off-by: Connor Abbott cwabbo...@gmail.com
>>> ---
>>>  src/glsl/Makefile.am                     |  10 +-
>>>  src/glsl/Makefile.sources                |   3 +-
>>>  src/glsl/nir/.gitignore                  |   1 +
>>>  src/glsl/nir/nir_constant_expressions.h  |  32 ++
>>>  src/glsl/nir/nir_constant_expressions.py | 320 ++
>>>  src/glsl/nir/nir_opcodes.py              | 562 +--
>>>  6 files changed, 740 insertions(+), 188 deletions(-)
>>>  create mode 100644 src/glsl/nir/nir_constant_expressions.h
>>>  create mode 100644 src/glsl/nir/nir_constant_expressions.py
>>>
>>> diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
>>> index b2fe16a..51036b7 100644
>>> --- a/src/glsl/Makefile.am
>>> +++ b/src/glsl/Makefile.am
>>> @@ -210,7 +210,8 @@ BUILT_SOURCES = \
>>>      glcpp/glcpp-lex.c \
>>>      nir/nir_opt_algebraic.c \
>>>      nir/nir_opcodes.h \
>>> -    nir/nir_opcodes.c
>>> +    nir/nir_opcodes.c \
>>> +    nir/nir_constant_expressions.c
>>>  CLEANFILES = \
>>>      glcpp/glcpp-parse.h \
>>>      glsl_parser.h \
>>> @@ -236,3 +237,10 @@ nir/nir_opcodes.c: nir/nir_opcodes.py nir/nir_opcodes_c.py
>>>      $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_opcodes_c.py > $@
>>>
>>>  nir/nir.h: nir/nir_opcodes.h
>>> +
>>> +nir/nir_constant_expressions.c: nir/nir_opcodes.py nir/nir_constant_expressions.py nir/nir_constant_expressions.h
>>> +    $(AM_V_GEN)set -e; \
>>> +    $(MKDIR_P) `dirname $@`; \
>>> +    $(PYTHON2) $(PYTHON_FLAGS) $(srcdir)/nir/nir_constant_expressions.py > $@.tmp; \
>>> +    mv $@.tmp $@;
>>> +
>>> diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
>>> index 03b4f2e..9dd1a56 100644
>>> --- a/src/glsl/Makefile.sources
>>> +++ b/src/glsl/Makefile.sources
>>> @@ -16,7 +16,8 @@ LIBGLCPP_GENERATED_FILES = \
>>>  NIR_GENERATED_FILES = \
>>>      $(GLSL_BUILDDIR)/nir/nir_opt_algebraic.c \
>>>      $(GLSL_BUILDDIR)/nir/nir_opcodes.h \
>>> -    $(GLSL_BUILDDIR)/nir/nir_opcodes.c
>>> +    $(GLSL_BUILDDIR)/nir/nir_opcodes.c \
>>> +    $(GLSL_BUILDDIR)/nir/nir_constant_expressions.c
>>>
>>>  NIR_FILES = \
>>>      $(GLSL_SRCDIR)/nir/nir.c \
>>> diff --git a/src/glsl/nir/.gitignore b/src/glsl/nir/.gitignore
>>> index 4c28193..261f64f 100644
>>> --- a/src/glsl/nir/.gitignore
>>> +++ b/src/glsl/nir/.gitignore
>>> @@ -1,3 +1,4 @@
>>>  nir_opt_algebraic.c
>>>  nir_opcodes.c
>>>  nir_opcodes.h
>>> +nir_constant_expressions.c
>>> diff --git a/src/glsl/nir/nir_constant_expressions.h b/src/glsl/nir/nir_constant_expressions.h
>>> new file mode 100644
>>> index 000..4ca09be
>>> --- /dev/null
>>> +++ b/src/glsl/nir/nir_constant_expressions.h
>>> @@ -0,0 +1,32 @@
>>> +/*
>>> + * Copyright © 2014 Connor Abbott
>>> + *
>>> + * Permission is hereby granted, free of charge, to any person obtaining a
>>> + * copy of this software and associated documentation files (the "Software"),
>>> + * to deal in the Software without restriction, including without limitation
>>> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
>>> + * and/or sell copies of the Software, and to permit persons to whom the
>>> + * Software is furnished to do so, subject to the following conditions:
>>> + *
>>> + * The above copyright notice and this permission notice (including the next
>>> + * paragraph) shall be included in all copies or substantial portions of the
>>> + * Software.
>>> + *
>>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>>> + * IMPLIED,
Re: [Mesa-dev] [PATCH 0/5] NIR opcodes and constant folding
There are still some cleanups needed for 2/5. The rest is
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Fri, Jan 16, 2015 at 1:53 PM, Connor Abbott cwabbo...@gmail.com wrote:
> Oh, and I forgot... the series is also available at
> https://github.com/cwabbott0/mesa nir-opcodes-cleanup
>
> On Fri, Jan 16, 2015 at 4:46 PM, Connor Abbott cwabbo...@gmail.com wrote:
>> Hi,
>>
>> This is a series I had floating around for a while. The idea is to
>> have all the opcode stuff, including constant folding, derived from a
>> single Python file. I've cleaned it up a little by using {}-style
>> Python formatting instead of the pile of text-replacement and regular
>> expressions we had before for getting the constant expressions to a
>> state where they could be compiled as C code.
>>
>> Connor Abbott (5):
>>   nir: add generated file to .gitignore
>>   nir: use Python to autogenerate opcode information
>>   nir: add new constant folding infrastructure
>>   nir/constant_folding: use the new constant folding infrastructure
>>   nir/lower_vars_to_ssa: fix a bug with boolean constants
>>
>>  src/glsl/Makefile.am                     |  23 +-
>>  src/glsl/Makefile.sources                |   7 +-
>>  src/glsl/nir/.gitignore                  |   4 +
>>  src/glsl/nir/nir.h                       |   9 -
>>  src/glsl/nir/nir_constant_expressions.h  |  32 ++
>>  src/glsl/nir/nir_constant_expressions.py | 320 +
>>  src/glsl/nir/nir_lower_vars_to_ssa.c     |   2 +-
>>  src/glsl/nir/nir_opcodes.c               |  46 ---
>>  src/glsl/nir/nir_opcodes.h               | 366
>>  src/glsl/nir/nir_opcodes.py              | 567 +++
>>  src/glsl/nir/nir_opcodes_c.py            |  56 +++
>>  src/glsl/nir/nir_opcodes_h.py            |  39 +++
>>  src/glsl/nir/nir_opt_constant_folding.c  | 179 ++
>>  13 files changed, 1066 insertions(+), 584 deletions(-)
>>  create mode 100644 src/glsl/nir/.gitignore
>>  create mode 100644 src/glsl/nir/nir_constant_expressions.h
>>  create mode 100644 src/glsl/nir/nir_constant_expressions.py
>>  delete mode 100644 src/glsl/nir/nir_opcodes.c
>>  delete mode 100644 src/glsl/nir/nir_opcodes.h
>>  create mode 100644 src/glsl/nir/nir_opcodes.py
>>  create mode 100644 src/glsl/nir/nir_opcodes_c.py
>>  create mode 100644 src/glsl/nir/nir_opcodes_h.py
>>
>> --
>> 2.1.0
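
The series describes generating "a function that takes an opcode and an array of constant inputs and spits out the constant result". A hand-written C sketch of what such a function could look like for three toy opcodes is given below. It is an illustration only: the opcode names, sketch_num_inputs and sketch_eval_const() are hypothetical, the arithmetic is plain 32-bit unsigned, and nothing here is taken from the generated nir_constant_expressions.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy opcode set. */
enum sketch_op {
    SKETCH_OP_IADD,
    SKETCH_OP_IMUL,
    SKETCH_OP_INEG,
    SKETCH_OP_COUNT
};

/* Number of inputs per opcode, kept next to the opcode list. */
static const unsigned sketch_num_inputs[SKETCH_OP_COUNT] = { 2, 2, 1 };

/* Evaluates one opcode on constant inputs.  Returns false if the opcode is
 * unknown or the number of inputs does not match. */
static bool
sketch_eval_const(enum sketch_op op, const uint32_t *src, unsigned num_src,
                  uint32_t *result)
{
    if ((unsigned)op >= SKETCH_OP_COUNT || num_src != sketch_num_inputs[op])
        return false;

    switch (op) {
    case SKETCH_OP_IADD:
        *result = src[0] + src[1];   /* wraps modulo 2^32 */
        return true;
    case SKETCH_OP_IMUL:
        *result = src[0] * src[1];   /* wraps modulo 2^32 */
        return true;
    case SKETCH_OP_INEG:
        *result = 0u - src[0];       /* two's complement negation */
        return true;
    default:
        return false;
    }
}
```

In the approach described by the series, each case body would come from the const_expr string attached to the opcode definition in the Python file, so the switch is written out by the generator rather than by hand.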
[Mesa-dev] [Bug 88536] AMD graphics hardware hangs with an homogeneous coloured screen or blank screen, and with chirp coming from the graphics card
https://bugs.freedesktop.org/show_bug.cgi?id=88536

--- Comment #12 from Timothy Arceri t_arc...@yahoo.com.au ---
(In reply to Alberto Salvia Novella from comment #11)
> Does fglrx depend on Mesa?

No.

> How can I figure out if this is a bug in Mesa?

Make sure you have completely removed fglrx (there are many guides on
how to do this in Ubuntu, just google it) and then try using this PPA
with updated drivers:
https://launchpad.net/~oibaf/+archive/ubuntu/graphics-drivers