Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.
>> For OpenGL ES, I propose a simpler solution:
>> - don't touch ARB_texture_float at all
>> - add OES_texture_float to gl_extensions
>> - add OES_texture_float_linear to gl_extensions
>> - define OES_texture_half_float as o(OES_texture_float)
>> - define OES_texture_half_float_linear as o(OES_texture_float_linear)
>>
>> Then, drivers can enable the extensions as they see fit.

>That sounds like a happy medium. It seems like we could use
>ARB_texture_float as the enable for OES_texture_float, but I'm not
>crying over one extra flag.

I think it is actually the most unhappy medium. The patch as-is enables
floating point textures in GLES2 on hardware targets without affecting any
DRI drivers (or the Gallium state tracker). That was the original purpose
of the patch.

On one side: having 4 separate booleans gives complete resolution to the
situation for floating point textures. Having just 2 means that some
resolution is provided, but it is not complete; it will need to be
revisited and will leave whoever made or pushed the patch embarrassed about
not dotting the i's and crossing the t's. Additionally, whether we add 2 or
4 and leave ARB_texture_float, we are still left with the situation that
the booleans are not orthogonal. Also, what does ARB_texture_float support
then mean? What contract is it satisfying that mesa/main can rely upon?
Going further, what happens when/if we want to add support for
GL_ARB_ES2_compatibility and also expose OES extensions (as NVIDIA does)?
[I admit exposing OES extensions in a non-ES context sounds gross, but the
whole point of ES2_compatibility is to make ports from GLES to GL almost a
no-op, so the OES extensions should come too].

>It will mean that a bunch of extension checks in the code will need to
>be expanded.
>
>We'll probably also want a negative test that verifies an error is
>generated for glTexParameteri(..., GL_LINEAR_MIPMAP_LINEAR) when
>OES_texture_float_linear (or OES_texture_half_float_linear) is not
>supported.

This is the other reason why I do not want to go down the multiple
booleans route initially: the patch would then touch much more code, and
the all-or-nothing approach avoids all sorts of additional ickiness. Let's
put the patch in as-is (because from the point of view of mesa/main it
looks correct) and then do a subsequent patch, after some discussion, to
support situations like the r300 partial floating point texture support.

-Kevin

> Marek
>
> On Mon, May 19, 2014 at 8:34 AM, Rogovin, Kevin
> wrote:
>> Hi,
>>
>> Each of the four extensions are right now set to be advertised if and only
>> if a GL context would advertise GL_ARB_texture_float:
>>
>> { "GL_OES_texture_float", o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_half_float", o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_float_linear", o(ARB_texture_float), ES2, 2005 },
>> { "GL_OES_texture_half_float_linear", o(ARB_texture_float), ES2, 2005 },
>>
>> From my interpretation of ARB_texture_float, that extension requires both
>> 16-bit and 32-bit textures and ability to filter linearly such textures. Did
>> I misunderstand the specification? If I got the specification correct, then
>> the r300 should not be advertising any of the extensions for otherwise it
>> would be advertising GL_ARB_texture_float.
>>
>> However, the r300 does give an example of ability to support some of the OES
>> extensions but not all. Previously Matt asked if there an example or need
>> and I thought not. It turns out I was wrong and there is a need at least for
>> the r300. Supporting that granularity is going to be a bigger patch since it
>> would require changing the data structure struct gl_extensions to have four
>> entries and in turn additional logic to combine them to
>> GL_ARB_texture_float. The correct and more work way to do it would be to
>> remove ARB_texture_float from gl_extension, add a GLboolean for each of the
>> 4 OES extensions, change each driver to correctly fill them and then
>> additional logic in creating extension string(s) to check if each of the 4
>> OES extensions are TRUE then to advertise GL_ARB_texture_float; we could
>> also instead just add the 4 OES booleans and have additional logic in
>> mesa/main to set them each to TRUE if ARB_texture_float is true. The latter
>> solution though easier is less clean and begging for trouble later.
>> Regardless, let's first get this patch as-is into Mesa, then do the
>> "right" thing to allow a backend to support a subset of the OES extensions
>> without needing to support the ARB extension.
>>
>> -Kevin
>>
>>
>> From: Marek Olšák [mar...@gmail.com]
>> Sent: Friday, May 16, 2014 4:33 PM
>> To: Rogovin, Kevin
>> Cc: mesa-dev@lists.freedesktop.org
>> Subject
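
For reference, the two-flag proposal quoted at the top of this message could be sketched roughly as below. This is only an illustration under assumptions: `struct sketch_extensions`, `sketch_table` and `sketch_is_advertised` are hypothetical names, the struct is a cut-down stand-in rather than Mesa's real `struct gl_extensions`, and the API and year columns of the real table are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for Mesa's struct gl_extensions. */
struct sketch_extensions {
   unsigned char OES_texture_float;
   unsigned char OES_texture_float_linear;
};

struct sketch_extension_entry {
   const char *name;
   size_t offset;   /* byte offset of the enable flag in the struct */
};

#define o(x) offsetof(struct sketch_extensions, x)

/* The half-float names reuse the two float flags, as in the proposal. */
static const struct sketch_extension_entry sketch_table[] = {
   { "GL_OES_texture_float",             o(OES_texture_float) },
   { "GL_OES_texture_half_float",        o(OES_texture_float) },
   { "GL_OES_texture_float_linear",      o(OES_texture_float_linear) },
   { "GL_OES_texture_half_float_linear", o(OES_texture_float_linear) },
};

static bool
sketch_is_advertised(const struct sketch_extensions *ext, const char *name)
{
   const unsigned char *base = (const unsigned char *) ext;
   size_t i;

   assert(ext != NULL && name != NULL);
   for (i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
      if (strcmp(sketch_table[i].name, name) == 0)
         return base[sketch_table[i].offset] != 0;
   }
   return false;   /* unknown names are never advertised */
}
```

With only two flags, a driver that sets `OES_texture_float` but not `OES_texture_float_linear` advertises both unfiltered names and neither `_linear` name; it cannot express 32-bit support without 16-bit support, which is the granularity concern raised above.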
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
On 05/19/2014 03:52 PM, Chris Forbes wrote:
> If you're going to do that, you'd really want to add draw buffer count
> to the cache key (and i guess this might be the point where you
> convert the blit shader cache to be a hashtable), to avoid recompiling
> all the time if the app does two blits with the same target but
> different draw buffer counts.
>
> This all seems like a huge amount of extra machinery to avoid using
> gl_FragColor and having the backend just take care of it, though. What
> do we actually gain from this?

One thing that's bothered me about our blit code...the integer RT support
is rather sketchy. Eric pointed out that it ought to work: we interpret the
integer source buffer as float, copy those bits to gl_FragColor - which
takes a float - and then write the bits out as if the destination were
float. It should preserve the bits, and filtering should be off...

One annoying thing is that there's no int/uint equivalent to
gl_FragColor...so if you want to write to all the render targets, you have
to do something like this. (Or, we'd have to add something to the
language...)

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS
On 05/19/2014 08:26 PM, Ian Romanick wrote: On 04/09/2014 02:56 AM, Tapani Pälli wrote: Patch adds new implementation dependent value required by the GL_ARB_explicit_uniform_location extension. Default value for user assignable locations is calculated as sum of MaxUniformComponents for each stage. Signed-off-by: Tapani Pälli --- src/mesa/main/context.c | 10 +- src/mesa/main/get.c | 1 + src/mesa/main/get_hash_params.py | 1 + src/mesa/main/mtypes.h | 5 + src/mesa/main/tests/enum_strings.cpp | 1 + 5 files changed, 17 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c index 860ae86..8b77df1 100644 --- a/src/mesa/main/context.c +++ b/src/mesa/main/context.c @@ -610,8 +610,16 @@ _mesa_init_constants(struct gl_context *ctx) ctx->Const.MaxUniformBlockSize = 16384; ctx->Const.UniformBufferOffsetAlignment = 1; - for (i = 0; i < MESA_SHADER_STAGES; i++) + /* GL_ARB_explicit_uniform_location, initial value calculated +* as sum of MaxUniformComponents for each stage. +*/ + ctx->Const.MaxUserAssignableUniformLocations = 0; + + for (i = 0; i < MESA_SHADER_STAGES; i++) { init_program_limits(ctx, i, &ctx->Const.Program[i]); + ctx->Const.MaxUserAssignableUniformLocations += + ctx->Const.Program[i].MaxUniformComponents; + } This is just going to set ctx->Const.MaxUserAssignableUniformLocations to 4 * 4 * MAX_UNIFORMS, and that's probably not what we want. Maybe just set 4 * MAX_UNIFORMS with a comment saying it's, "MAX_UNIFORMS for each possible shader stage." There should be much more locations than number of uniforms though (?) MAX_UNIFORMS refers to count of available vec4 uniforms, each of these should have 4 locations available. Also, value from the above formula nicely matches with binary drivers so IMO it shouldn't be 'too much'. 
ctx->Const.MaxProgramMatrices = MAX_PROGRAM_MATRICES; ctx->Const.MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH; diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c index 6d95790..8b50441 100644 --- a/src/mesa/main/get.c +++ b/src/mesa/main/get.c @@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array); EXTRA_EXT(ARB_compute_shader); EXTRA_EXT(ARB_gpu_shader5); EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5); +EXTRA_EXT(ARB_explicit_uniform_location); static const int extra_ARB_color_buffer_float_or_glcore[] = { diff --git a/src/mesa/main/get_hash_params.py b/src/mesa/main/get_hash_params.py index 06d0bba..5709d42 100644 --- a/src/mesa/main/get_hash_params.py +++ b/src/mesa/main/get_hash_params.py @@ -474,6 +474,7 @@ descriptor=[ [ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING), NO_EXTRA" ], [ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH), NO_EXTRA" ], [ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE), NO_EXTRA" ], + [ "MAX_UNIFORM_LOCATIONS", "CONTEXT_INT(Const.MaxUserAssignableUniformLocations), NO_EXTRA" ], Ditto on Petri's comment. 
[ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth), NO_EXTRA" ], [ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst), NO_EXTRA" ], [ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes), NO_EXTRA" ], diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index 7ac6bbe..fefbe06 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -3311,6 +3311,11 @@ struct gl_constants GLuint UniformBufferOffsetAlignment; /** @} */ + /** +* GL_ARB_explicit_uniform_location +*/ + GLuint MaxUserAssignableUniformLocations; + /** GL_ARB_geometry_shader4 */ GLuint MaxGeometryOutputVertices; GLuint MaxGeometryTotalOutputComponents; diff --git a/src/mesa/main/tests/enum_strings.cpp b/src/mesa/main/tests/enum_strings.cpp index 3795700..298ff6a 100644 --- a/src/mesa/main/tests/enum_strings.cpp +++ b/src/mesa/main/tests/enum_strings.cpp @@ -787,6 +787,7 @@ const struct enum_info everything[] = { { 0x8256, "GL_RESET_NOTIFICATION_STRATEGY_ARB" }, { 0x8257, "GL_PROGRAM_BINARY_RETRIEVABLE_HINT" }, { 0x8261, "GL_NO_RESET_NOTIFICATION_ARB" }, + { 0x826E, "GL_MAX_UNIFORM_LOCATIONS" }, { 0x82DF, "GL_TEXTURE_IMMUTABLE_LEVELS" }, { 0x8362, "GL_UNSIGNED_BYTE_2_3_3_REV" }, { 0x8363, "GL_UNSIGNED_SHORT_5_6_5" }, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/10] Enable GL_ARB_explicit_uniform_location in the drivers.
On 05/19/2014 08:21 PM, Ian Romanick wrote:
> Either this patch should:
>
>  - Delete the extension enable flag
>  - Change the table in extensions.c to use dummy_true
>
> or
>
> The next patch needs to not say "all drivers that support GLSL".
>
> I think we should just enable it everywhere.

OK, I was following the way how GL_ARB_explicit_attrib_location was
enabled. That one is still only for "all drivers that support GLSL" and you
really need GLSL to be able to use attributes or uniforms. I can enable it
everywhere via dummy_true.

> On 04/09/2014 02:56 AM, Tapani Pälli wrote:
>> Signed-off-by: Tapani Pälli
>> ---
>>  src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
>>  src/mesa/state_tracker/st_extensions.c       | 1 +
>>  2 files changed, 2 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c b/src/mesa/drivers/dri/i965/intel_extensions.c
>> index 15fcd30..f8abf98 100644
>> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
>> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
>> @@ -170,6 +170,7 @@ intelInitExtensions(struct gl_context *ctx)
>>     ctx->Extensions.ARB_draw_instanced = true;
>>     ctx->Extensions.ARB_ES2_compatibility = true;
>>     ctx->Extensions.ARB_explicit_attrib_location = true;
>> +   ctx->Extensions.ARB_explicit_uniform_location = true;
>>     ctx->Extensions.ARB_fragment_coord_conventions = true;
>>     ctx->Extensions.ARB_fragment_program = true;
>>     ctx->Extensions.ARB_fragment_program_shadow = true;
>> diff --git a/src/mesa/state_tracker/st_extensions.c b/src/mesa/state_tracker/st_extensions.c
>> index 3e1e45d..5b11e7b 100644
>> --- a/src/mesa/state_tracker/st_extensions.c
>> +++ b/src/mesa/state_tracker/st_extensions.c
>> @@ -534,6 +534,7 @@ void st_init_extensions(struct st_context *st)
>>     ctx->Extensions.ARB_ES2_compatibility = GL_TRUE;
>>     ctx->Extensions.ARB_draw_elements_base_vertex = GL_TRUE;
>>     ctx->Extensions.ARB_explicit_attrib_location = GL_TRUE;
>> +   ctx->Extensions.ARB_explicit_uniform_location = GL_TRUE;
>>     ctx->Extensions.ARB_fragment_coord_conventions = GL_TRUE;
>>     ctx->Extensions.ARB_fragment_program = GL_TRUE;
>>     ctx->Extensions.ARB_fragment_shader = GL_TRUE;
Re: [Mesa-dev] [PATCH 08/10] glsl: parser changes for GL_ARB_explicit_uniform_location
On 05/19/2014 08:18 PM, Ian Romanick wrote: On 04/09/2014 02:56 AM, Tapani Pälli wrote: Patch adds a preprocessor define for the extension and stores explicit location data for uniforms during AST->HIR conversion. It also sets layout token to be available when having the extension in place. Signed-off-by: Tapani Pälli --- src/glsl/ast_to_hir.cpp | 37 + src/glsl/glcpp/glcpp-parse.y | 3 +++ src/glsl/glsl_lexer.ll| 1 + src/glsl/glsl_parser_extras.h | 14 ++ 4 files changed, 55 insertions(+) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 8d55ee3..7431ad7 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -2170,6 +2170,43 @@ validate_explicit_location(const struct ast_type_qualifier *qual, { bool fail = false; + /* Checks for GL_ARB_explicit_uniform_location. */ + if (qual->flags.q.uniform) { + Extra blank line. oops + if (!state->check_explicit_uniform_location_allowed(loc, var)) + return; + + const struct gl_context *const ctx = state->ctx; + unsigned max_loc = qual->location + var->type->component_slots() - 1; I think that over counts for this purpose, and we can blame confusing nomenclature. component_slots for a mat4 is 4, so a mat4 uniform counts 4*4 against the GL_MAX_VERTEX_UNIFORM_COMPONENTS limit. However, it only has one "location" (as returned by glGetUniformLocation), so it only counts 1 against the GL_MAX_UNIFORM_LOCATIONS limit. I see, I was considering structs and arrays when writing this part and forgot about matrix. I assume matrix is the only special case here though? Everything else gets correct location count value via component_slots(). + + /* ARB_explicit_uniform_location specification states: + * + * "The explicitly defined locations and the generated locations + * must be in the range of 0 to MAX_UNIFORM_LOCATIONS minus one." + * + * "Valid locations for default-block uniform variable locations + * are in the range of 0 to the implementation-defined maximum + * number of uniform locations." 
+ */ + if (qual->location < 0) { + _mesa_glsl_error(loc, state, + "explicit location < 0 for uniform %s", var->name); + return; + } + + if (max_loc >= ctx->Const.MaxUserAssignableUniformLocations) { + _mesa_glsl_error(loc, state, "location qualifier for uniform %s " + ">= MAX_UNIFORM_LOCATIONS (%u)", + var->name, + ctx->Const.MaxUserAssignableUniformLocations); + return; + } + + var->data.explicit_location = true; + var->data.location = qual->location; + return; + } + /* Between GL_ARB_explicit_attrib_location an * GL_ARB_separate_shader_objects, the inputs and outputs of any shader * stage can be assigned explicit locations. The checking here associates diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index f28d853..6d42138 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -2087,6 +2087,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio if (extensions->ARB_explicit_attrib_location) add_builtin_define(parser, "GL_ARB_explicit_attrib_location", 1); + if (extensions->ARB_explicit_uniform_location) +add_builtin_define(parser, "GL_ARB_explicit_uniform_location", 1); + if (extensions->ARB_shader_texture_lod) add_builtin_define(parser, "GL_ARB_shader_texture_lod", 1); diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll index 7602351..83f0b6d 100644 --- a/src/glsl/glsl_lexer.ll +++ b/src/glsl/glsl_lexer.ll @@ -393,6 +393,7 @@ layout { || yyextra->AMD_conservative_depth_enable || yyextra->ARB_conservative_depth_enable || yyextra->ARB_explicit_attrib_location_enable + || yyextra->ARB_explicit_uniform_location_enable || yyextra->has_separate_shader_objects() || yyextra->ARB_uniform_buffer_object_enable || yyextra->ARB_fragment_coord_conventions_enable diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h index c53c583..20879a0 100644 --- a/src/glsl/glsl_parser_extras.h +++ b/src/glsl/glsl_parser_extras.h @@ -152,6 +152,20 @@ struct _mesa_glsl_parse_state { 
return true; } + bool check_explicit_uniform_location_allowed(YYLTYPE *locp, +const ir_variable *var) + { + /* Requires OpenGL 3.3 or ARB_explicit_attrib_location. */ + if (ctx->Version < 33 && !ctx->Extensions.ARB_explicit_attrib_location) { + _mesa_glsl_error(locp, this,
[Mesa-dev] ARB_sso layout() + other qualifiers
Hi Ian,

When I was writing the `precise` support I found some error cases in the
GLSL parser where we reject combinations of layout() with invariant /
interpolation / etc qualifiers. This seems to be consistent with the GLSL
1.50 grammar (or, at least, admits all the examples that were given in
various GLSL specs and extension specs), but I don't think it works any
more with SSO, since you'd want to be able to do rendezvous-by-location on
qualified variables.

The body of the ARB_sso spec doesn't clearly make the changes required to
allow this, but various parts of the spec hint at it being possible, with
the most obvious being in the resolution of issue 13:

    13. How are interpolation modifiers handled for separate shader
        programs?

        RESOLVED: GLSL only provides interpolation modifiers for user-
        defined varyings. These modifiers can be used in conjunction with
        the layout location qualifiers specified in this extension. Such
        modifiers must match.

I propose relaxing the rules in type_qualifier as follows:

* If neither GLSL 4.20 nor ARB_shading_language_420pack is supported, then
  require layout qualifiers to precede any other qualifiers; continue to
  disallow multiple layout qualifiers.
* Remove all other error generation for combining layout with invariant /
  interpolation / (with my other patches) precise.

I think this retains all the useful current behavior, and will accept all
the examples I've seen.

-- Chris
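
As a rough illustration of the two proposed rules, the check could look something like the sketch below. Everything here is hypothetical: the enum, the function name and the flat array of qualifier kinds are invented for the example and do not reflect how the real grammar in the GLSL parser is structured. The sketch also assumes that nothing is rejected when GLSL 4.20 or ARB_shading_language_420pack is available.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical qualifier kinds, listed in the order they were written. */
enum sketch_qualifier_kind {
   SKETCH_QUAL_LAYOUT,
   SKETCH_QUAL_INVARIANT,
   SKETCH_QUAL_INTERPOLATION,
   SKETCH_QUAL_PRECISE,
   SKETCH_QUAL_STORAGE
};

/*
 * Returns true if the qualifier sequence is acceptable under the proposal.
 * Without 4.20 / 420pack, a layout qualifier is only allowed in the first
 * position, which covers both "layout precedes everything else" and
 * "at most one layout qualifier".
 */
static bool
sketch_qualifier_order_ok(const enum sketch_qualifier_kind *quals,
                          unsigned count, bool has_420pack)
{
   unsigned i;

   assert(quals != NULL || count == 0);
   if (has_420pack)
      return true;

   for (i = 1; i < count; i++) {
      if (quals[i] == SKETCH_QUAL_LAYOUT)
         return false;
   }
   return true;
}
```

Under this sketch, "layout(...) flat" style sequences pass, while "invariant layout(...)" and a repeated layout() are rejected unless the 420pack path is taken.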
Re: [Mesa-dev] [PATCH 06/10] mesa: support inactive uniforms in glUniform* functions
On 05/19/2014 08:09 PM, Ian Romanick wrote:
> On 04/09/2014 02:56 AM, Tapani Pälli wrote:
>> Support inactive uniforms that have explicit location set in
>> glUniform* functions.
>>
>> Signed-off-by: Tapani Pälli
>> ---
>>  src/mesa/main/uniform_query.cpp | 15 +++
>>  1 file changed, 15 insertions(+)
>>
>> diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp
>> index 5f1af08..e33800a 100644
>> --- a/src/mesa/main/uniform_query.cpp
>> +++ b/src/mesa/main/uniform_query.cpp
>> @@ -253,6 +253,21 @@ validate_uniform_parameters(struct gl_context *ctx,
>>        return false;
>>     }
>>
>> +   /* If the driver storage pointer in remap table is -1, we ignore silently.
>> +    *
>> +    * GL_ARB_explicit_uniform_location spec says:
>> +    *     "What happens if Uniform* is called with an explicitly defined
>> +    *     uniform location, but that uniform is deemed inactive by the
>> +    *     linker?
>> +    *
>> +    *     RESOLVED: The call is ignored for inactive uniform variables and
>> +    *     no error is generated."
>> +    *
>> +    */
>> +   if (ctx->Extensions.ARB_explicit_uniform_location &&
>> +       shProg->UniformRemapTable[location] == (gl_uniform_storage *) -1)
>> +      return false;
>> +
>
> Do we actually need to check
> ctx->Extensions.ARB_explicit_uniform_location? It seems like
> UniformRemapTable will only have -1 in it for that case, right?

Yes, the extension check can be removed.

>>     _mesa_uniform_split_location_offset(shProg, location, loc, array_index);
>>
>>     if (shProg->UniformStorage[*loc].array_elements == 0 && count > 1) {
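
A standalone sketch of the sentinel check discussed above, with the extension test dropped as agreed. All names are hypothetical stand-ins (`sketch_uniform_storage` is not Mesa's `gl_uniform_storage`), and treating a NULL table entry as invalid is an assumption of the sketch rather than something stated in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-uniform storage record. */
struct sketch_uniform_storage {
   int dummy;
};

/* Sentinel for "explicit location reserved, but the uniform is inactive". */
#define SKETCH_INACTIVE_UNIFORM_EXPLICIT_LOCATION \
   ((struct sketch_uniform_storage *) (intptr_t) -1)

enum sketch_uniform_status {
   SKETCH_UNIFORM_ACTIVE,    /* proceed with the glUniform* call */
   SKETCH_UNIFORM_IGNORED,   /* inactive: drop the call, no error */
   SKETCH_UNIFORM_INVALID    /* out of range or unassigned location */
};

static enum sketch_uniform_status
sketch_classify_location(struct sketch_uniform_storage *const *remap_table,
                         unsigned table_size, unsigned location)
{
   assert(remap_table != NULL);

   if (location >= table_size || remap_table[location] == NULL)
      return SKETCH_UNIFORM_INVALID;

   /* The sentinel alone is enough; no extension flag is consulted. */
   if (remap_table[location] == SKETCH_INACTIVE_UNIFORM_EXPLICIT_LOCATION)
      return SKETCH_UNIFORM_IGNORED;

   return SKETCH_UNIFORM_ACTIVE;
}
```

Keeping "ignored" distinct from "invalid" mirrors the spec resolution quoted in the patch: a call on an inactive uniform is dropped silently, while a bad location would still be reported by the caller.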
Re: [Mesa-dev] [PATCH 05/10] glsl/linker: assign explicit uniform locations
On 05/19/2014 08:07 PM, Ian Romanick wrote: On 04/09/2014 02:56 AM, Tapani Pälli wrote: Patch refactors the existing uniform processing so explicit locations are taken in to account during variable processing. These locations are temporarily stored in gl_uniform_storage before actual locations are set. The 'remap_location' variable in gl_uniform_storage is changed to be signed so that we can use 0 as a valid explicit location and '-1' as identifier that no explicit location has been defined. When locations are set, UniformRemapTable is first populated with uniforms that have explicit location set (inactive and actives ones), rest are put after explicit location slots. Signed-off-by: Tapani Pälli --- src/glsl/ir_uniform.h | 5 +++-- src/glsl/link_uniforms.cpp | 56 +- 2 files changed, 54 insertions(+), 7 deletions(-) diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h index 3508509..9dc4a8e 100644 --- a/src/glsl/ir_uniform.h +++ b/src/glsl/ir_uniform.h @@ -181,9 +181,10 @@ struct gl_uniform_storage { /** * The 'base location' for this uniform in the uniform remap table. For -* arrays this is the first element in the array. +* arrays this is the first element in the array. It needs to be signed +* so that we can use 0 as valid location and -1 as initial value */ - unsigned remap_location; + int remap_location; You could use ~0u instead of -1, right? A #define for the magic value will also help. Sure, I can move to using ~0u. Should be enough to never become a problem. 
}; #ifdef __cplusplus diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp index 29dc0b1..0f99082 100644 --- a/src/glsl/link_uniforms.cpp +++ b/src/glsl/link_uniforms.cpp @@ -387,6 +387,9 @@ public: void set_and_process(struct gl_shader_program *prog, ir_variable *var) { + current_var = var; + field_counter = 0; + ubo_block_index = -1; if (var->is_in_uniform_block()) { if (var->is_interface_instance() && var->type->is_array()) { @@ -543,6 +546,22 @@ private: return; } + /* Assign explicit locations. */ + if (current_var->data.explicit_location) { + /* Set sequential locations for struct fields. */ + if (current_var->type->is_record()) { I think you can accomplish the same thing with record_type != NULL. ok, I can change +const unsigned entries = MAX2(1, this->uniforms[id].array_elements); +this->uniforms[id].remap_location = + current_var->data.location + field_counter; + field_counter += entries; Weird indentation. will fix + } else { +this->uniforms[id].remap_location = current_var->data.location; + } + } else { + /* Initialize to -1 to indicate that no explicit location is set */ + this->uniforms[id].remap_location = -1; + } + this->uniforms[id].name = ralloc_strdup(this->uniforms, name); this->uniforms[id].type = base_type; this->uniforms[id].initialized = 0; @@ -598,6 +617,17 @@ public: gl_texture_index targets[MAX_SAMPLERS]; /** +* Current variable being processed. +*/ + ir_variable *current_var; + + /** +* Field counter is used to take care that uniform structures +* with explicit locations get sequential locations. +*/ + unsigned field_counter; + + /** * Mask of samplers used by the current shader stage. 
*/ unsigned shader_samplers_used; @@ -799,10 +829,6 @@ link_assign_uniform_locations(struct gl_shader_program *prog) prog->UniformStorage = NULL; prog->NumUserUniformStorage = 0; - ralloc_free(prog->UniformRemapTable); - prog->UniformRemapTable = NULL; - prog->NumUniformRemapTable = 0; - if (prog->UniformHash != NULL) { prog->UniformHash->clear(); } else { @@ -915,9 +941,29 @@ link_assign_uniform_locations(struct gl_shader_program *prog) sizeof(prog->_LinkedShaders[i]->SamplerTargets)); } - /* Build the uniform remap table that is used to set/get uniform locations */ + /* Reserve all the explicit locations of the active uniforms. */ + for (unsigned i = 0; i < num_user_uniforms; i++) { + if (uniforms[i].remap_location != -1) { + /* How many new entries for this uniform? */ + const unsigned entries = MAX2(1, uniforms[i].array_elements); + + /* Set remap table entries point to correct gl_uniform_storage. */ + for (unsigned j = 0; j < entries; j++) { +unsigned element_loc = uniforms[i].remap_location + j; +assert(prog->UniformRemapTable[element_loc] == + (gl_uniform_storage *) -1); +prog->UniformRemapTable[element_loc] = &uniforms[i]; + } + } + } + + /* Reserve locations for rest of the uniforms. */ for (unsigned
Re: [Mesa-dev] [PATCH 04/10] glsl/linker: initialize explicit uniform locations
On 05/19/2014 07:51 PM, Ian Romanick wrote: On 04/09/2014 02:56 AM, Tapani Pälli wrote: Patch initializes the UniformRemapTable for explicit locations. This needs to happen before optimizations to make sure all inactive uniforms get their explicit locations correctly. Signed-off-by: Tapani Pälli --- src/glsl/linker.cpp | 99 + 1 file changed, 99 insertions(+) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 7c194a2..1b4cb63 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -74,6 +74,7 @@ #include "link_varyings.h" #include "ir_optimization.h" #include "ir_rvalue_visitor.h" +#include "ir_uniform.h" extern "C" { #include "main/shaderobj.h" @@ -2089,6 +2090,100 @@ check_image_resources(struct gl_context *ctx, struct gl_shader_program *prog) linker_error(prog, "Too many combined image uniforms and fragment outputs"); } + +/** + * Initializes explicit location slots point to -1 for a variable, + * checks for overlaps between other uniforms using explicit locations. + */ +static bool +reserve_explicit_locations(struct gl_shader_program *prog, + string_to_uint_map *map, ir_variable *var) +{ + unsigned max_loc = var->data.location + var->type->component_slots() - 1; + + /* Resize remap table if locations do not fit in the current one. */ + if (max_loc + 1 > prog->NumUniformRemapTable) { + prog->UniformRemapTable = + reralloc(prog, prog->UniformRemapTable, + gl_uniform_storage *, + max_loc + 1); + prog->NumUniformRemapTable = max_loc + 1; + } + + for (unsigned i = 0; i < var->type->component_slots(); i++) { You should check the code that gets generated for this. I suspect this will translate to a call to component_slots per iteration of the loop. Maybe just call it once above (since it is also used to calculate max_loc). OK, will change. + unsigned loc = var->data.location + i; + + /* Check if location is already used. */ + if (prog->UniformRemapTable[loc] == (gl_uniform_storage *) -1) { So... 
-1 means that an inactive uniform has that location explicitly assigned? I'm inferring that from comments in the next patch. Maybe we should have a descriptive #define #define INACTIVE_UNIFORM_EXPLICIT_LOCATION ((gl_uniform_storage *) -1) Yep, makes it more easier to read. + + /* Possibly same uniform from a different stage, this is ok. */ + unsigned hash_loc; + if (map->get(hash_loc, var->name) && hash_loc == loc - i) + continue; + + /* ARB_explicit_uniform_location specification states: + * + * "No two default-block uniform variables in the program can have + * the same location, even if they are unused, otherwise a compiler + * or linker error will be generated." + */ + linker_error(prog, "location qualifier " + "for uniform %s " + "overlaps previously used location", + var->name); Minor nit (which you can take or leave). I usually like to have fewer breaks in strings. I would have split this up as: linker_error(prog, "location qualifier for uniform %s overlaps " "previously used location", var->name); ok + return false; + } + + prog->UniformRemapTable[loc] = (gl_uniform_storage *) -1; + } + + /* Note, base location used for arrays. */ + map->put(var->data.location, var->name); + + return true; +} + +/** + * Check and reserve all explicit uniform locations, called before + * any optimizations happen to handle also inactive uniforms and + * inactive array elements that may get trimmed away. + */ +static void +check_explicit_uniform_locations(struct gl_context *ctx, + struct gl_shader_program *prog) +{ + if (!ctx->Extensions.ARB_explicit_uniform_location) + return; + + /* This map is used to detect if overlapping explicit locations +* occur with the same uniform (from different stage) or a different one. 
+*/ + string_to_uint_map *uniform_map = new string_to_uint_map; + + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { + struct gl_shader *sh = prog->_LinkedShaders[i]; + + if (!sh) + continue; + + foreach_list(node, sh->ir) { + ir_variable *var = ((ir_instruction *)node)->as_variable(); + if ((var && var->data.mode == ir_var_uniform) && + var->data.explicit_location) { +if (!reserve_explicit_locations(prog, uniform_map, var)) + return; + +/* Initialize locations that were allocated but left unused. */ +for (unsigned i = 0; i < prog->NumUniformRemapTable; i++) + if (prog->UniformRemapTable[i] != (gl_uniform_storage *) -1) + prog->U
[Mesa-dev] [PATCH] tgsi: add GS_INVOCATIONS to property names array
In commit 4be146b1, I neglected to add the new property to the strings
array. This leads to the string '(null)' being printed instead when
converting a GS shader to text.

Signed-off-by: Ilia Mirkin
Cc: "10.2"
---
 src/gallium/auxiliary/tgsi/tgsi_strings.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c b/src/gallium/auxiliary/tgsi/tgsi_strings.c
index 5b6e47f..34dec4f 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
@@ -120,7 +120,8 @@ const char *tgsi_property_names[TGSI_PROPERTY_COUNT] =
    "FS_COORD_PIXEL_CENTER",
    "FS_COLOR0_WRITES_ALL_CBUFS",
    "FS_DEPTH_LAYOUT",
-   "VS_PROHIBIT_UCPS"
+   "VS_PROHIBIT_UCPS",
+   "GS_INVOCATIONS",
 };

 const char *tgsi_type_names[5] =
-- 
1.8.5.5
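
The bug above is possible because an array declared with an explicit size silently zero-fills any missing trailing entries. One way to catch that at build time, shown here only as a sketch with hypothetical names (this is not what the patch does, and the enum is a two-entry stand-in for the real property list), is to leave the array unsized and check its length against the enum count:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical two-entry stand-in for the TGSI property enum. */
enum sketch_property {
   SKETCH_PROPERTY_VS_PROHIBIT_UCPS,
   SKETCH_PROPERTY_GS_INVOCATIONS,
   SKETCH_PROPERTY_COUNT
};

/* Unsized on purpose, so a forgotten entry changes the element count. */
static const char *const sketch_property_names[] = {
   "VS_PROHIBIT_UCPS",
   "GS_INVOCATIONS",
};

/* Compile-time check: a negative array size is an error if they differ. */
typedef char sketch_names_in_sync[
   (sizeof(sketch_property_names) / sizeof(sketch_property_names[0])
    == SKETCH_PROPERTY_COUNT) ? 1 : -1];

static const char *
sketch_property_name(unsigned property)
{
   if (property >= SKETCH_PROPERTY_COUNT)
      return "UNKNOWN";
   assert(sketch_property_names[property] != NULL);
   return sketch_property_names[property];
}
```

With this arrangement, adding an enum value without a matching string fails the build instead of printing '(null)' at run time.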
Re: [Mesa-dev] [PATCH 16/23] targets/vdpau: convert to static and shared pipe-drivers
On Mon, May 19, 2014 at 11:57:58PM +0100, Emil Velikov wrote: > On 18/05/14 12:22, Jonathan Gray wrote: > [snip] > > > > Currently I run my autotools builds like this: > > > > export LDFLAGS=-L/usr/local/lib > > export CPPFLAGS="-I/usr/local/include -I/usr/local/include/libelf" > > export AUTOMAKE_VERSION=1.12 > > export AUTOCONF_VERSION=2.69 > > export LEX=/usr/local/bin/gflex > > ./autogen.sh \ > > --with-gallium-drivers=r300,r600,radeonsi,swrast,nouveau \ > > --with-dri-drivers=i915,i965,r200,radeon,swrast \ > > --disable-silent-rules \ > > --enable-r600-llvm-compiler --enable-gallium-llvm \ > > --disable-llvm-shared-libs \ > > --enable-gles1 --enable-gles2 \ > > --enable-shared-glapi \ > > --disable-osmesa \ > > --enable-debug \ > > --enable-gbm \ > > --with-egl-platforms="x11,drm" \ > > --prefix=/usr/mesa > > > I'm a bit curious how xenocara's CVS is going to handle the symlinks when > building dri/radeon, dri/r200 and st/dri (all gallium dri drivers). AFAICS it > will fail miserably :\ > If interested you can rework the former two and effectively drop a handful > symbol redefinition, shed off some code and size off the classic dri. I'm > planning to address the st/dri case after this series is merged. I'm not really sure what xenocara has to do with the autotools build? As said before xenocara uses it's own set of makefiles, ie http://www.openbsd.org/cgi-bin/cvsweb/xenocara/lib/libGL/dri/radeon/Makefile?rev=HEAD;content-type=text%2Fplain http://www.openbsd.org/cgi-bin/cvsweb/xenocara/lib/libGL/dri/r600g/Makefile?rev=HEAD;content-type=text%2Fplain with seperate directories for libglapi libGL libEGL libGLESv1_CM libGLESv2 that refer to the source in http://www.openbsd.org/cgi-bin/cvsweb/xenocara/dist/Mesa/ > > >From the above configure one cannot determine if you're building vdpau. > Current code enables the vdpau targets if the vdpau package is available. Can > you confirm if this is the case or not ? 
My autotools builds are not done on a system with vdpau installed. The resulting target list from configure here looks like:

Gallium: yes
Target dirs: dri-nouveau dri-swrast r300/dri r600/dri radeonsi/dri
Winsys dirs: nouveau/drm radeon/drm sw sw/dri sw/xlib
Driver dirs: galahad identity llvmpipe noop nouveau r300 r600 radeonsi rbug softpipe trace
Trackers dirs: dri

The build does not seem to reference gallium/state_trackers/vdpau but does build mesa/main/vdpau.c and mesa/state_tracker/st_vdpau.c.

It would be nice to have the possibility of building the gallium vdpau code in the future, however.

___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
On Mon, May 19, 2014 at 3:52 PM, Chris Forbes wrote:
> If you're going to do that, you'd really want to add draw buffer count
> to the cache key (and i guess this might be the point where you
> convert the blit shader cache to be a hashtable), to avoid recompiling
> all the time if the app does two blits with the same target but
> different draw buffer counts.
>
> This all seems like a huge amount of extra machinery to avoid using
> gl_FragColor and having the backend just take care of it, though. What
> do we actually gain from this?
>
Right, it doesn't look worth doing. I was avoiding 'gl_FragColor' just because it's deprecated in GLSL 130. Using 'gl_FragColor' here will work perfectly fine. However, it seems it won't work in the fragment shader for MSAA blits, because the MSAA blit shader uses non-vec4 output types. Although blitting to multiple multisample buffers is not a common use case, we'd have a similar shader recompilation problem there when the draw buffer count changes. For now, I'll go ahead and make changes to use 'gl_FragColor' for non-multisample blits.

>
> On Tue, May 20, 2014 at 10:22 AM, Anuj Phogat wrote:
>> On Mon, May 19, 2014 at 3:12 PM, Chris Forbes wrote:
>>> On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat wrote:
>>>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>>>     struct blit_shader *shader = choose_blit_shader(target, table);
>>>>     const char *vs_input, *vs_output, *fs_input, *fs_output;
>>>>     const char *vs_preprocess = "", *fs_preprocess = "";
>>>> -   const char *fs_output_decl = "";
>>>> +   const char *fs_output_decl = "", *for_loop = "";
>>>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>>>
>>> You can't depend on the number of bound draw buffers here. These
>>> shaders get generated on first use, and cached for the life of the
>>> context.
>> Nice catch. I'll add a condition to recompile the shader if number of
>> draw buffers change.
Re: [Mesa-dev] glsl: ideas how to improve dead code elimination?
On Mon, May 19, 2014 at 10:56 AM, Aras Pranckevicius wrote: > Hi, > > When Mesa's GLSL compiler is faced with a code like this: > > // vec4 packednormal exists > vec3 normal; > normal.xy = packednormal.wy * 2.0 - 1.0; > normal.z = sqrt(1.0 - dot(normal.xy, normal.xy)); > // now do not use the "normal" at all anywhere > > Then the dead code elimination pass will not be able to eliminate the > "normal" variable, nor anything that lead to it (possibly sampling textures > into packed normal, etc.). > > This is because variable refcounting visitor sees "normal" as having four > references, but only two assignments, and can not consider it dead. Even if > these two references are from assignment to normal.z where both LHS & RHS > reference the same variable. > > Any ideas on how to improve this? > > > If the original code was doing something like this, then dead code > elimination is able to "properly" eliminate this whole thing: > > // vec4 packednormal exists > vec3 normal; > vec2 nxy = packednormal.wy * 2.0 - 1.0; > float nz = sqrt(1.0 - dot(nxy, nxy)); > normal.xy = nxy; > normal.z = nz; > // now do not use the "normal" at all anywhere Eric is working on a better GLSL IR dead code elimination pass. I'm not sure of the current status. It's in his tree: git://people.freedesktop.org/~anholt/mesa deadcode ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] loader: allow alternative methods for PCI identification.
On 15/05/14 05:39, Gary Wong wrote: > loader_get_pci_id_for_fd() and loader_get_device_name_for_fd() now attempt > all available strategies to identify the hardware, instead of conditionally > compiling in a single test. The existing libudev and DRM approaches have > been retained, and another simple alternative of looking up the answer in > the /sys filesystem (available on Linux) is added. > > This should assist Linux systems that mount /sys but do not include > libudev (Android?), give Mesa a fighting chance of running on systems > where libudev is uninstalled/inaccessible/broken at runtime, and provides > a hook where non-Linux systems (BSD?) could implement their own PCI > identification. > Hi Gary, Are you trying to get mesa working under GNU Hurd ? IIRC Jonathan is able to get mesa working under OpenBSD and I would expect other non-linux platforms to just work(tm). Although with that said I may have broken Android (not sure if autohell detects is as a linux platform). As you can notice I'm not a huge fan of adding yet another way of retrieving the device/driver name although I would not object if you're willing to split this patch a bit, have the option off by default and fix bugs if/when they pop up :) Cheers, Emil > Signed-off-by: Gary Wong > --- > configure.ac| 51 > src/loader/loader.c | 173 > +++- > 2 files changed, 195 insertions(+), 29 deletions(-) > > diff --git a/configure.ac b/configure.ac > index d3e96de..fe572cd 100644 > --- a/configure.ac > +++ b/configure.ac > @@ -818,13 +818,31 @@ fi > > case "$host_os" in > linux*) > -need_libudev=yes ;; > +need_pci_id=yes ;; > *) > -need_libudev=no ;; > +need_pci_id=no ;; > esac > > -PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED], > - have_libudev=yes, have_libudev=no) > +AC_ARG_ENABLE([libudev], > +[AS_HELP_STRING([--disable-libudev], > +[disable libudev PCI identification @<:@default=enabled on supported > platforms@:>@])], > +[attempt_libudev="$enableval"], > +[attempt_libudev=yes] > +) > + > 
+if test "x$attempt_libudev" = "xyes"; then > +PKG_CHECK_MODULES([LIBUDEV], [libudev >= $LIBUDEV_REQUIRED], > + have_libudev=yes, have_libudev=no) > +else > +have_libudev=no > +fi > + > +AC_ARG_ENABLE([sysfs], > +[AS_HELP_STRING([--disable-sysfs], > +[disable /sys PCI identification @<:@default=enabled on supported > platforms@:>@])], > +[have_sysfs="$enableval"], > +[have_sysfs=yes] > +) > > if test "x$enable_dri" = xyes; then > if test "$enable_static" = yes; then > @@ -910,8 +928,15 @@ xyesno) > ;; > esac > > +have_pci_id=no > if test "$have_libudev" = yes; then > DEFINES="$DEFINES -DHAVE_LIBUDEV" > +have_pci_id=yes > +fi > + > +if test "$have_sysfs" = yes; then > +DEFINES="$DEFINES -DHAVE_SYSFS" > +have_pci_id=yes > fi > > # This is outside the case (above) so that it is invoked even for non-GLX > @@ -1013,8 +1038,8 @@ if test "x$enable_dri" = xyes; then > DEFINES="$DEFINES -DHAVE_DRI3" > fi > > -if test "x$have_libudev" != xyes; then > -AC_MSG_ERROR([libudev-dev required for building DRI]) > +if test "x$have_pci_id" != xyes; then > +AC_MSG_ERROR([libudev-dev or sysfs required for building DRI]) > fi > > case "$host_cpu" in > @@ -1183,8 +1208,8 @@ if test "x$enable_gbm" = xauto; then > esac > fi > if test "x$enable_gbm" = xyes; then > -if test "x$need_libudev$have_libudev" = xyesno; then > -AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED]) > +if test "x$need_pci_id$have_pci_id" = xyesno; then > +AC_MSG_ERROR([gbm requires udev >= $LIBUDEV_REQUIRED or sysfs]) > fi > > if test "x$enable_dri" = xyes; then > @@ -1202,7 +1227,7 @@ if test "x$enable_gbm" = xyes; then > fi > fi > AM_CONDITIONAL(HAVE_GBM, test "x$enable_gbm" = xyes) > -if test "x$need_libudev" = xyes; then > +if test "x$need_pci_id$have_libudev" = xyesyes; then > GBM_PC_REQ_PRIV="libudev >= $LIBUDEV_REQUIRED" > else > GBM_PC_REQ_PRIV="" > @@ -1491,9 +1516,9 @@ for plat in $egl_platforms; do > ;; > esac > > -case "$plat$need_libudev$have_libudev" in > +case "$plat$need_pci_id$have_pci_id" in > 
waylandyesno|drmyesno) > -AC_MSG_ERROR([cannot build $plat platform without udev > >= $LIBUDEV_REQUIRED]) ;; > +AC_MSG_ERROR([cannot build $plat platform without udev > >= $LIBUDEV_REQUIRED or sysfs]) ;; > esac > done > > @@ -1766,8 +1791,8 @@ gallium_require_llvm() { > > gallium_require_drm_loader() { > if test "x$enable_gallium_loader" = xyes; then > -if test "x$need_libudev$have_libudev" = xyesno; then > -AC_MSG_ERROR([Gallium drm loader requires libudev
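The idea in the commit message (try every available identification strategy in turn instead of compiling in a single one) could be sketched roughly as follows. This is not the code from the patch, whose loader.c hunk is cut off above. The strategy signature, the function names and the stubbed IDs are all hypothetical and only illustrate the fallback chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strategy signature: return 1 and fill in the IDs on success,
 * return 0 if this strategy cannot identify the device. */
typedef int (*pci_id_strategy)(int fd, int *vendor_id, int *chip_id);

/* Stub standing in for a libudev lookup that is unavailable at runtime. */
static int
try_libudev_stub(int fd, int *vendor_id, int *chip_id)
{
   (void)fd; (void)vendor_id; (void)chip_id;
   return 0;
}

/* Stub standing in for a /sys lookup; the IDs are made-up example values. */
static int
try_sysfs_stub(int fd, int *vendor_id, int *chip_id)
{
   (void)fd;
   *vendor_id = 0x1234;
   *chip_id = 0x5678;
   return 1;
}

/* Try each strategy in order and stop at the first one that succeeds. */
static int
get_pci_id(int fd, const pci_id_strategy *strategies, size_t count,
           int *vendor_id, int *chip_id)
{
   assert(strategies != NULL);
   for (size_t i = 0; i < count; i++) {
      if (strategies[i](fd, vendor_id, chip_id))
         return 1;
   }
   return 0;   /* every strategy failed */
}
```

A caller would pass an array such as `{ try_libudev_stub, try_sysfs_stub }`; the order of the array is the order of preference, and a platform-specific lookup could be appended without touching the loop.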
Re: [Mesa-dev] [PATCH 16/23] targets/vdpau: convert to static and shared pipe-drivers
On 18/05/14 12:22, Jonathan Gray wrote: [snip] > > Currently I run my autotools builds like this: > > export LDFLAGS=-L/usr/local/lib > export CPPFLAGS="-I/usr/local/include -I/usr/local/include/libelf" > export AUTOMAKE_VERSION=1.12 > export AUTOCONF_VERSION=2.69 > export LEX=/usr/local/bin/gflex > ./autogen.sh \ > --with-gallium-drivers=r300,r600,radeonsi,swrast,nouveau \ > --with-dri-drivers=i915,i965,r200,radeon,swrast \ > --disable-silent-rules \ > --enable-r600-llvm-compiler --enable-gallium-llvm \ > --disable-llvm-shared-libs \ > --enable-gles1 --enable-gles2 \ > --enable-shared-glapi \ > --disable-osmesa \ > --enable-debug \ > --enable-gbm \ > --with-egl-platforms="x11,drm" \ > --prefix=/usr/mesa > I'm a bit curious how xenocara's CVS is going to handle the symlinks when building dri/radeon, dri/r200 and st/dri (all gallium dri drivers). AFAICS it will fail miserably :\ If interested you can rework the former two and effectively drop a handful symbol redefinition, shed off some code and size off the classic dri. I'm planning to address the st/dri case after this series is merged. >From the above configure one cannot determine if you're building vdpau. Current code enables the vdpau targets if the vdpau package is available. Can you confirm if this is the case or not ? Cheers, Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
If you're going to do that, you'd really want to add draw buffer count to the cache key (and I guess this might be the point where you convert the blit shader cache to be a hashtable), to avoid recompiling all the time if the app does two blits with the same target but different draw buffer counts.

This all seems like a huge amount of extra machinery to avoid using gl_FragColor and having the backend just take care of it, though. What do we actually gain from this?

On Tue, May 20, 2014 at 10:22 AM, Anuj Phogat wrote:
> On Mon, May 19, 2014 at 3:12 PM, Chris Forbes wrote:
>> On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat wrote:
>>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>>     struct blit_shader *shader = choose_blit_shader(target, table);
>>>     const char *vs_input, *vs_output, *fs_input, *fs_output;
>>>     const char *vs_preprocess = "", *fs_preprocess = "";
>>> -   const char *fs_output_decl = "";
>>> +   const char *fs_output_decl = "", *for_loop = "";
>>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>>
>> You can't depend on the number of bound draw buffers here. These
>> shaders get generated on first use, and cached for the life of the
>> context.
> Nice catch. I'll add a condition to recompile the shader if number of
> draw buffers change.
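For illustration, putting the draw buffer count into the cache key might look like the sketch below. It is not Mesa's blit shader table: the struct names, the fixed-size array (a linear search rather than the hashtable mentioned above) and the fake program handles are assumptions made to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CACHED_SHADERS 16

/* Hypothetical cache key: a shader is reused only if both fields match. */
struct blit_shader_key {
   int target;           /* stand-in for the GLenum texture target */
   int draw_buf_count;   /* number of color draw buffers */
};

struct blit_shader_entry {
   struct blit_shader_key key;
   unsigned program;     /* stand-in for the linked program name */
};

struct blit_shader_cache {
   struct blit_shader_entry entries[MAX_CACHED_SHADERS];
   size_t count;
   unsigned compiles;    /* number of cache misses so far */
};

/* Look up (target, draw_buf_count); "compile" only on a miss. */
static unsigned
get_blit_program(struct blit_shader_cache *cache, int target,
                 int draw_buf_count)
{
   assert(cache != NULL);
   for (size_t i = 0; i < cache->count; i++) {
      const struct blit_shader_key *k = &cache->entries[i].key;
      if (k->target == target && k->draw_buf_count == draw_buf_count)
         return cache->entries[i].program;
   }

   /* Cache miss: pretend to compile and remember the result. */
   assert(cache->count < MAX_CACHED_SHADERS);
   cache->compiles++;
   struct blit_shader_entry *e = &cache->entries[cache->count++];
   e->key.target = target;
   e->key.draw_buf_count = draw_buf_count;
   e->program = cache->compiles;   /* fake program handle */
   return e->program;
}
```

With this key, an application that alternates between one and four draw buffers on the same target pays for two compiles in total instead of recompiling on every switch.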
[Mesa-dev] [Bug 78914] [llvmpipe] Front/Backfaces do not cover the same pixels when rasterized
https://bugs.freedesktop.org/show_bug.cgi?id=78914

Roland Scheidegger changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Front/Backfaces do not      |[llvmpipe] Front/Backfaces
                   |cover the same pixels when  |do not cover the same
                   |rasterized                  |pixels when rasterized
          Component|Mesa core                   |Other

--- Comment #1 from Roland Scheidegger ---
So, in order to get front and backface tris, you draw essentially the same tri twice, but once you draw index 0,1,2 and once you draw 0,2,1?

I could see this getting different results for interpolated attributes (in fact I know it will happen...). I am not actually sure it's guaranteed to get the same results, this is very tricky to get right (the reason is the interpolation / interpolation setup is not quite symmetric wrt all triangle corners, the float math can give different results). Though this should only affect interpolated attribute values, not rasterization itself (which happens with fixed point math). If it actually rasterizes different pixels this is a bug.

Hence if you could provide some minimal test case that would be great. This only affects llvmpipe right?

-- You are receiving this mail because: You are the assignee for the bug.
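The distinction drawn in the comment can be illustrated with a small sketch. It is not llvmpipe's rasterizer: it uses plain integer coordinates, tests only the strict interior (no fill rule for pixels exactly on an edge), and all names are made up. It shows that with exact integer edge functions, swapping two vertices only negates each edge value, so coverage cannot change, whereas float sums can depend on evaluation order.

```c
#include <assert.h>
#include <stdint.h>

struct ivec2 { int32_t x, y; };

/* Twice the signed area of triangle (a, b, p), in exact integer math. */
static int64_t
edge(struct ivec2 a, struct ivec2 b, struct ivec2 p)
{
   return (int64_t)(b.x - a.x) * (p.y - a.y) -
          (int64_t)(b.y - a.y) * (p.x - a.x);
}

/* Strict interior test that accepts either winding order. */
static int
covers(struct ivec2 v0, struct ivec2 v1, struct ivec2 v2, struct ivec2 p)
{
   int64_t e0 = edge(v0, v1, p);
   int64_t e1 = edge(v1, v2, p);
   int64_t e2 = edge(v2, v0, p);
   return (e0 > 0 && e1 > 0 && e2 > 0) || (e0 < 0 && e1 < 0 && e2 < 0);
}

/* Returns 1 if windings 0,1,2 and 0,2,1 cover the same points of a
 * size x size grid. */
static int
same_coverage(struct ivec2 v0, struct ivec2 v1, struct ivec2 v2, int size)
{
   for (int y = 0; y < size; y++) {
      for (int x = 0; x < size; x++) {
         struct ivec2 p = { x, y };
         if (covers(v0, v1, v2, p) != covers(v0, v2, v1, p))
            return 0;
      }
   }
   return 1;
}

/* Float addition is not associative, so the order in which terms are
 * combined can change the result. Returns 1 if the two orders differ. */
static int
float_sum_depends_on_order(void)
{
   volatile float a = 1e20f, b = -1e20f, c = 1.0f;
   volatile float left = (a + b) + c;    /* 0 + 1 */
   volatile float right = a + (b + c);   /* c is absorbed by b */
   return left != right;
}
```

A real rasterizer additionally has to apply its fill rule consistently for both windings, which is where a bug of the kind reported here could hide.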
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
On Mon, May 19, 2014 at 3:12 PM, Chris Forbes wrote:
> On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat wrote:
>> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>>     struct blit_shader *shader = choose_blit_shader(target, table);
>>     const char *vs_input, *vs_output, *fs_input, *fs_output;
>>     const char *vs_preprocess = "", *fs_preprocess = "";
>> -   const char *fs_output_decl = "";
>> +   const char *fs_output_decl = "", *for_loop = "";
>> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>
> You can't depend on the number of bound draw buffers here. These
> shaders get generated on first use, and cached for the life of the
> context.
Nice catch. I'll add a condition to recompile the shader if the number of draw buffers changes.
Re: [Mesa-dev] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
On Mon, May 19, 2014 at 2:45 PM, Matt Turner wrote: > On Mon, May 19, 2014 at 1:20 PM, Anuj Phogat wrote: >> _mesa_meta_setup_blit_shader() currently generates a fragment shader >> which, irrespective of the number of draw buffers, writes the color >> to only one output variable. Current shader rely on an undefined >> behavior and possibly works by chance. >> >> From OpenGL 4.0 spec, page 256: >> "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a >>set of draw buffers into which the single fragment color defined by >>gl_FragColor is written. If a fragment shader writes to gl_FragData, >>or a user-defined varying out variable, DrawBuffers specifies a set >>of draw buffers into which each of the multiple output colors defined >>by these variables are separately written. If a fragment shader writes >>to none of gl_FragColor, gl_FragData, nor any user defined varying out >>variables, the values of the fragment colors following shader execution >>are undefined, and may differ for each fragment color." >> >> OpenGL 4.4 spec, page 463, added an additional line in this section: >> "If some, but not all user-defined output variables are written, the >>values of fragment colors corresponding to unwritten variables are >>similarly undefined." 
>> >> Cc: >> Signed-off-by: Anuj Phogat >> --- >> src/mesa/drivers/common/meta.c | 23 ++- >> 1 file changed, 18 insertions(+), 5 deletions(-) >> >> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c >> index 87609b4..4897cd9 100644 >> --- a/src/mesa/drivers/common/meta.c >> +++ b/src/mesa/drivers/common/meta.c >> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, >> struct blit_shader *shader = choose_blit_shader(target, table); >> const char *vs_input, *vs_output, *fs_input, *fs_output; >> const char *vs_preprocess = "", *fs_preprocess = ""; >> - const char *fs_output_decl = ""; >> + const char *fs_output_decl = "", *for_loop = ""; >> + const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers; >> >> if (ctx->Const.GLSLVersion < 130) { >>vs_input = "attribute"; >> @@ -255,12 +256,23 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, >>fs_preprocess = "#extension GL_EXT_texture_array : enable"; >>fs_output = "gl_FragColor"; >> } else { >> - vs_preprocess = fs_preprocess = "#version 130"; >> + vs_preprocess = "#version 130"; >>vs_input = fs_input = "in"; >>vs_output = "out"; >> - fs_output = "out_color"; >> - fs_output_decl = "out vec4 out_color;"; >>shader->func = "texture"; >> + if (draw_buf_count > 1) { >> + fs_preprocess = ralloc_asprintf(mem_ctx, >> + "#version 130\n" >> + "#define NUM_DRAW_BUFS %d", >> + draw_buf_count); >> + fs_output = "out_color[i]"; >> + fs_output_decl = "out vec4 out_color[NUM_DRAW_BUFS];"; >> + for_loop = " for (int i = 0; i < NUM_DRAW_BUFS; i++)\n "; >> + } else { >> + fs_preprocess = "#version 130"; >> + fs_output = "out_color"; >> + fs_output_decl = "out vec4 out_color;"; >> + } > > It's safe to emit a loop with only one iterations. The compiler will > happily optimize that (it's going to unroll all of these loops > anyway). Emitting GLSL code for the for loop unconditionally seems > like it would clean this up some. 
>
I wasn't sure if that'll generate extra instructions for one iteration. I'll clean it up before pushing.

> With the comments for these two patches addressed, both of these are
>
> Reviewed-by: Matt Turner
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
On Tue, May 20, 2014 at 8:20 AM, Anuj Phogat wrote:
> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>     struct blit_shader *shader = choose_blit_shader(target, table);
>     const char *vs_input, *vs_output, *fs_input, *fs_output;
>     const char *vs_preprocess = "", *fs_preprocess = "";
> -   const char *fs_output_decl = "";
> +   const char *fs_output_decl = "", *for_loop = "";
> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;

You can't depend on the number of bound draw buffers here. These shaders get generated on first use, and cached for the life of the context.
Re: [Mesa-dev] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
On Mon, May 19, 2014 at 1:20 PM, Anuj Phogat wrote: > _mesa_meta_setup_blit_shader() currently generates a fragment shader > which, irrespective of the number of draw buffers, writes the color > to only one output variable. Current shader rely on an undefined > behavior and possibly works by chance. > > From OpenGL 4.0 spec, page 256: > "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a >set of draw buffers into which the single fragment color defined by >gl_FragColor is written. If a fragment shader writes to gl_FragData, >or a user-defined varying out variable, DrawBuffers specifies a set >of draw buffers into which each of the multiple output colors defined >by these variables are separately written. If a fragment shader writes >to none of gl_FragColor, gl_FragData, nor any user defined varying out >variables, the values of the fragment colors following shader execution >are undefined, and may differ for each fragment color." > > OpenGL 4.4 spec, page 463, added an additional line in this section: > "If some, but not all user-defined output variables are written, the >values of fragment colors corresponding to unwritten variables are >similarly undefined." 
>
> Cc:
> Signed-off-by: Anuj Phogat
> ---
>  src/mesa/drivers/common/meta.c | 23 ++-
>  1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
> index 87609b4..4897cd9 100644
> --- a/src/mesa/drivers/common/meta.c
> +++ b/src/mesa/drivers/common/meta.c
> @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>     struct blit_shader *shader = choose_blit_shader(target, table);
>     const char *vs_input, *vs_output, *fs_input, *fs_output;
>     const char *vs_preprocess = "", *fs_preprocess = "";
> -   const char *fs_output_decl = "";
> +   const char *fs_output_decl = "", *for_loop = "";
> +   const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers;
>
>     if (ctx->Const.GLSLVersion < 130) {
>        vs_input = "attribute";
> @@ -255,12 +256,23 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
>        fs_preprocess = "#extension GL_EXT_texture_array : enable";
>        fs_output = "gl_FragColor";
>     } else {
> -      vs_preprocess = fs_preprocess = "#version 130";
> +      vs_preprocess = "#version 130";
>        vs_input = fs_input = "in";
>        vs_output = "out";
> -      fs_output = "out_color";
> -      fs_output_decl = "out vec4 out_color;";
>        shader->func = "texture";
> +      if (draw_buf_count > 1) {
> +         fs_preprocess = ralloc_asprintf(mem_ctx,
> +                                         "#version 130\n"
> +                                         "#define NUM_DRAW_BUFS %d",
> +                                         draw_buf_count);
> +         fs_output = "out_color[i]";
> +         fs_output_decl = "out vec4 out_color[NUM_DRAW_BUFS];";
> +         for_loop = " for (int i = 0; i < NUM_DRAW_BUFS; i++)\n ";
> +      } else {
> +         fs_preprocess = "#version 130";
> +         fs_output = "out_color";
> +         fs_output_decl = "out vec4 out_color;";
> +      }

It's safe to emit a loop with only one iteration. The compiler will happily optimize that (it's going to unroll all of these loops anyway). Emitting GLSL code for the for loop unconditionally seems like it would clean this up some.

With the comments for these two patches addressed, both of these are

Reviewed-by: Matt Turner
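A rough sketch of what emitting the loop unconditionally could look like is given below. It is not the revised patch: it uses plain snprintf instead of ralloc_asprintf, and the function name build_blit_fs, the hard-coded sampler2D type and the texCoords.xy lookup are assumptions chosen to keep the example standalone.

```c
#include <assert.h>
#include <stdio.h>

/* Writes a GLSL 1.30 blit fragment shader into buf. The loop over the
 * outputs is emitted even when draw_buf_count is 1, so there is only one
 * code path. Returns what snprintf returns (the length needed). */
static int
build_blit_fs(char *buf, size_t size, int draw_buf_count)
{
   assert(buf != NULL);
   assert(draw_buf_count >= 1);
   return snprintf(buf, size,
                   "#version 130\n"
                   "#define NUM_DRAW_BUFS %d\n"
                   "uniform sampler2D texSampler;\n"
                   "in vec4 texCoords;\n"
                   "out vec4 out_color[NUM_DRAW_BUFS];\n"
                   "void main()\n"
                   "{\n"
                   "   vec4 color = texture(texSampler, texCoords.xy);\n"
                   "   for (int i = 0; i < NUM_DRAW_BUFS; i++)\n"
                   "      out_color[i] = color;\n"
                   "   gl_FragDepth = color.x;\n"
                   "}\n",
                   draw_buf_count);
}
```

The generated text differs between buffer counts only in the NUM_DRAW_BUFS value, which removes the if/else on draw_buf_count from the patch. The caching concern raised elsewhere in the thread still applies, since the count is baked into the source.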
Re: [Mesa-dev] [PATCH 1/2] meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code
On Mon, May 19, 2014 at 1:20 PM, Anuj Phogat wrote: > Cc: > Signed-off-by: Anuj Phogat > --- > src/mesa/drivers/common/meta.c | 97 > +++--- > 1 file changed, 44 insertions(+), 53 deletions(-) > > diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c > index 3ef3f79..87609b4 100644 > --- a/src/mesa/drivers/common/meta.c > +++ b/src/mesa/drivers/common/meta.c > @@ -242,10 +242,26 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, > GLenum target, > struct blit_shader_table *table) > { > - const char *vs_source; > - char *fs_source; > + char *vs_source, *fs_source; > void *const mem_ctx = ralloc_context(NULL); > struct blit_shader *shader = choose_blit_shader(target, table); > + const char *vs_input, *vs_output, *fs_input, *fs_output; > + const char *vs_preprocess = "", *fs_preprocess = ""; > + const char *fs_output_decl = ""; > + > + if (ctx->Const.GLSLVersion < 130) { > + vs_input = "attribute"; > + vs_output = fs_input = "varying"; > + fs_preprocess = "#extension GL_EXT_texture_array : enable"; > + fs_output = "gl_FragColor"; > + } else { > + vs_preprocess = fs_preprocess = "#version 130"; > + vs_input = fs_input = "in"; > + vs_output = "out"; > + fs_output = "out_color"; vs_output means "vertex shader output keyword" but fs_output means "fragment shader output variable". Maybe change s/fs_output/fs_output_var/? > + fs_output_decl = "out vec4 out_color;"; > + shader->func = "texture"; > + } This block would be clearer if we assigned the same variables in the same order. Instead of initializing variables, I'd set them in both blocks. Multiple assignments on the same line also make it less obviously correct. 
> > assert(shader != NULL); > > @@ -254,57 +270,32 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, >return; > } > > - if (ctx->Const.GLSLVersion < 130) { > - vs_source = > - "attribute vec2 position;\n" > - "attribute vec4 textureCoords;\n" > - "varying vec4 texCoords;\n" > - "void main()\n" > - "{\n" > - " texCoords = textureCoords;\n" > - " gl_Position = vec4(position, 0.0, 1.0);\n" > - "}\n"; > - > - fs_source = ralloc_asprintf(mem_ctx, > - "#extension GL_EXT_texture_array : > enable\n" > - "#extension GL_ARB_texture_cube_map_array: > enable\n" > - "uniform %s texSampler;\n" > - "varying vec4 texCoords;\n" > - "void main()\n" > - "{\n" > - " gl_FragColor = %s(texSampler, %s);\n" > - " gl_FragDepth = gl_FragColor.x;\n" > - "}\n", > - shader->type, > - shader->func, shader->texcoords); > - } > - else { > - vs_source = ralloc_asprintf(mem_ctx, > - "#version 130\n" > - "in vec2 position;\n" > - "in vec4 textureCoords;\n" > - "out vec4 texCoords;\n" > - "void main()\n" > - "{\n" > - " texCoords = textureCoords;\n" > - " gl_Position = vec4(position, 0.0, > 1.0);\n" > - "}\n"); > - fs_source = ralloc_asprintf(mem_ctx, > - "#version 130\n" > - "#extension GL_ARB_texture_cube_map_array: > enable\n" > - "uniform %s texSampler;\n" > - "in vec4 texCoords;\n" > - "out vec4 out_color;\n" > - "\n" > - "void main()\n" > - "{\n" > - " out_color = texture(texSampler, %s);\n" > - " gl_FragDepth = out_color.x;\n" > - "}\n", > - shader->type, > - shader->texcoords); > - } > - > + vs_source = ralloc_asprintf(mem_ctx, > +"%s\n" > +"%s vec2 position;\n" > +"%s vec4 textureCoords;\n" > +"%s vec4 texCoords;\n" > +"void main()\n" > +"{\n" > +" texCoords = textureCoords;\n" > +" gl_Position = vec4(position, 0.0, 1.0);\n" > +"}\n", > +vs_preprocess, vs_input,
Re: [Mesa-dev] [Mesa-stable] [PATCH 4/4] meta: Avoid _swrast_BlitFramebuffer in the meta CopyTexSubImage code.
Series is: Reviewed-by: Chris Forbes On Mon, May 19, 2014 at 6:12 PM, Kenneth Graunke wrote: > This is a replacement for bd44ac8b5ca08016bb064b37edaec95eccfdbcd5 > that should actually work. > > Fixes Piglit's copyteximage-border on swrast, as well as one of > es3conform's packed_pixels_pixelstore test. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78546 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705 > Signed-off-by: Kenneth Graunke > Cc: "10.2" > --- > src/mesa/drivers/common/meta.c | 12 ++-- > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c > index f90d5bd..b194b6e 100644 > --- a/src/mesa/drivers/common/meta.c > +++ b/src/mesa/drivers/common/meta.c > @@ -2860,13 +2860,13 @@ copytexsubimage_using_blit_framebuffer(struct > gl_context *ctx, GLuint dims, > * are too strict for CopyTexImage. We know meta will be fine with format > * changes. > */ > - _mesa_meta_and_swrast_BlitFramebuffer(ctx, x, y, > - x + width, y + height, > - xoffset, yoffset, > - xoffset + width, yoffset + height, > - mask, GL_NEAREST); > + mask = _mesa_meta_BlitFramebuffer(ctx, x, y, > + x + width, y + height, > + xoffset, yoffset, > + xoffset + width, yoffset + height, > + mask, GL_NEAREST); > ctx->Meta->Blit.no_ctsi_fallback = false; > - success = true; > + success = mask == 0x0; > > out: > _mesa_lock_texture(ctx, texObj); > -- > 1.9.2 > > ___ > mesa-stable mailing list > mesa-sta...@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-stable ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] meta: Write color values in to 'out' variables for all the draw buffers
_mesa_meta_setup_blit_shader() currently generates a fragment shader which, irrespective of the number of draw buffers, writes the color to only one output variable. Current shader rely on an undefined behavior and possibly works by chance. >From OpenGL 4.0 spec, page 256: "If a fragment shader writes to gl_FragColor, DrawBuffers specifies a set of draw buffers into which the single fragment color defined by gl_FragColor is written. If a fragment shader writes to gl_FragData, or a user-defined varying out variable, DrawBuffers specifies a set of draw buffers into which each of the multiple output colors defined by these variables are separately written. If a fragment shader writes to none of gl_FragColor, gl_FragData, nor any user defined varying out variables, the values of the fragment colors following shader execution are undefined, and may differ for each fragment color." OpenGL 4.4 spec, page 463, added an additional line in this section: "If some, but not all user-defined output variables are written, the values of fragment colors corresponding to unwritten variables are similarly undefined." 
Cc: Signed-off-by: Anuj Phogat --- src/mesa/drivers/common/meta.c | 23 ++- 1 file changed, 18 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 87609b4..4897cd9 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -247,7 +247,8 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, struct blit_shader *shader = choose_blit_shader(target, table); const char *vs_input, *vs_output, *fs_input, *fs_output; const char *vs_preprocess = "", *fs_preprocess = ""; - const char *fs_output_decl = ""; + const char *fs_output_decl = "", *for_loop = ""; + const int draw_buf_count = ctx->DrawBuffer->_NumColorDrawBuffers; if (ctx->Const.GLSLVersion < 130) { vs_input = "attribute"; @@ -255,12 +256,23 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, fs_preprocess = "#extension GL_EXT_texture_array : enable"; fs_output = "gl_FragColor"; } else { - vs_preprocess = fs_preprocess = "#version 130"; + vs_preprocess = "#version 130"; vs_input = fs_input = "in"; vs_output = "out"; - fs_output = "out_color"; - fs_output_decl = "out vec4 out_color;"; shader->func = "texture"; + if (draw_buf_count > 1) { + fs_preprocess = ralloc_asprintf(mem_ctx, + "#version 130\n" + "#define NUM_DRAW_BUFS %d", + draw_buf_count); + fs_output = "out_color[i]"; + fs_output_decl = "out vec4 out_color[NUM_DRAW_BUFS];"; + for_loop = " for (int i = 0; i < NUM_DRAW_BUFS; i++)\n "; + } else { + fs_preprocess = "#version 130"; + fs_output = "out_color"; + fs_output_decl = "out vec4 out_color;"; + } } assert(shader != NULL); @@ -291,11 +303,12 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, "void main()\n" "{\n" " vec4 color = %s(texSampler, %s);\n" +"%s" " %s = color;\n" " gl_FragDepth = color.x;\n" "}\n", fs_preprocess, shader->type, fs_input, fs_output_decl, -shader->func, shader->texcoords, fs_output); +shader->func, shader->texcoords, for_loop, fs_output); _mesa_meta_compile_and_link_program(ctx, vs_source, 
fs_source, ralloc_asprintf(mem_ctx, "%s blit", -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] meta: Refactor _mesa_meta_setup_blit_shader() to avoid duplicate shader code
Cc: Signed-off-by: Anuj Phogat --- src/mesa/drivers/common/meta.c | 97 +++--- 1 file changed, 44 insertions(+), 53 deletions(-) diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c index 3ef3f79..87609b4 100644 --- a/src/mesa/drivers/common/meta.c +++ b/src/mesa/drivers/common/meta.c @@ -242,10 +242,26 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, GLenum target, struct blit_shader_table *table) { - const char *vs_source; - char *fs_source; + char *vs_source, *fs_source; void *const mem_ctx = ralloc_context(NULL); struct blit_shader *shader = choose_blit_shader(target, table); + const char *vs_input, *vs_output, *fs_input, *fs_output; + const char *vs_preprocess = "", *fs_preprocess = ""; + const char *fs_output_decl = ""; + + if (ctx->Const.GLSLVersion < 130) { + vs_input = "attribute"; + vs_output = fs_input = "varying"; + fs_preprocess = "#extension GL_EXT_texture_array : enable"; + fs_output = "gl_FragColor"; + } else { + vs_preprocess = fs_preprocess = "#version 130"; + vs_input = fs_input = "in"; + vs_output = "out"; + fs_output = "out_color"; + fs_output_decl = "out vec4 out_color;"; + shader->func = "texture"; + } assert(shader != NULL); @@ -254,57 +270,32 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx, return; } - if (ctx->Const.GLSLVersion < 130) { - vs_source = - "attribute vec2 position;\n" - "attribute vec4 textureCoords;\n" - "varying vec4 texCoords;\n" - "void main()\n" - "{\n" - " texCoords = textureCoords;\n" - " gl_Position = vec4(position, 0.0, 1.0);\n" - "}\n"; - - fs_source = ralloc_asprintf(mem_ctx, - "#extension GL_EXT_texture_array : enable\n" - "#extension GL_ARB_texture_cube_map_array: enable\n" - "uniform %s texSampler;\n" - "varying vec4 texCoords;\n" - "void main()\n" - "{\n" - " gl_FragColor = %s(texSampler, %s);\n" - " gl_FragDepth = gl_FragColor.x;\n" - "}\n", - shader->type, - shader->func, shader->texcoords); - } - else { - vs_source = ralloc_asprintf(mem_ctx, - "#version 130\n" - "in vec2 
position;\n" - "in vec4 textureCoords;\n" - "out vec4 texCoords;\n" - "void main()\n" - "{\n" - " texCoords = textureCoords;\n" - " gl_Position = vec4(position, 0.0, 1.0);\n" - "}\n"); - fs_source = ralloc_asprintf(mem_ctx, - "#version 130\n" - "#extension GL_ARB_texture_cube_map_array: enable\n" - "uniform %s texSampler;\n" - "in vec4 texCoords;\n" - "out vec4 out_color;\n" - "\n" - "void main()\n" - "{\n" - " out_color = texture(texSampler, %s);\n" - " gl_FragDepth = out_color.x;\n" - "}\n", - shader->type, - shader->texcoords); - } - + vs_source = ralloc_asprintf(mem_ctx, +"%s\n" +"%s vec2 position;\n" +"%s vec4 textureCoords;\n" +"%s vec4 texCoords;\n" +"void main()\n" +"{\n" +" texCoords = textureCoords;\n" +" gl_Position = vec4(position, 0.0, 1.0);\n" +"}\n", +vs_preprocess, vs_input, vs_input, vs_output); + + fs_source = ralloc_asprintf(mem_ctx, +"%s\n" +"#extension GL_ARB_texture_cube_map_array: enable\n" +"uniform %s texSampler;\n" +"%s vec4 texCoords;\n" +"%s\n" +"void main()\n" +"{\n" +" vec4 color = %s(texSampler, %s);\n" +" %s = color;\n" +" gl_FragDepth = color.x;\n" +"}\n", +fs_preprocess, shader->type, fs_input, fs_output_decl, +shader->func, shader->texcoords, fs_output); _mesa_
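The idea behind the refactor, one shader template with the version-specific keywords substituted in, can be sketched in plain C. This is only an illustration under stated assumptions: `build_blit_vs` is a hypothetical helper, and it uses `snprintf()` where the patch uses Mesa's `ralloc_asprintf()`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not Mesa code: build the blit vertex shader from a
 * single template, choosing the storage qualifiers by GLSL version.
 * Returns what snprintf() returns, i.e. the length the full string needs. */
int
build_blit_vs(char *buf, size_t size, int glsl_version)
{
   const char *preprocess = "";
   const char *input, *output;

   assert(buf != NULL && size > 0);

   if (glsl_version < 130) {
      /* Pre-1.30 GLSL uses attribute/varying. */
      input = "attribute";
      output = "varying";
   } else {
      /* GLSL 1.30 and later use in/out and need a #version line. */
      preprocess = "#version 130";
      input = "in";
      output = "out";
   }

   return snprintf(buf, size,
                   "%s\n"
                   "%s vec2 position;\n"
                   "%s vec4 textureCoords;\n"
                   "%s vec4 texCoords;\n"
                   "void main()\n"
                   "{\n"
                   "   texCoords = textureCoords;\n"
                   "   gl_Position = vec4(position, 0.0, 1.0);\n"
                   "}\n",
                   preprocess, input, input, output);
}
```

With an empty preprocess string the generated source simply starts with a blank line, which is why both paths can share the leading "%s\n".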
Re: [Mesa-dev] [Mesa-stable] [PATCH 2/4] meta: Drop unnecessary early returns in _mesa_meta_BlitFramebuffer.
Looks good.

Reviewed-by: Courtney Goeltzenleuchter

On Mon, May 19, 2014 at 12:12 AM, Kenneth Graunke wrote:

> These aren't necessary - all of the following code is predicated on mask
> being non-zero, so no code will get executed anyway.
>
> Signed-off-by: Kenneth Graunke
> Cc: "10.2"
> ---
>  src/mesa/drivers/common/meta_blit.c | 8
>  1 file changed, 8 deletions(-)
>
> diff --git a/src/mesa/drivers/common/meta_blit.c
> b/src/mesa/drivers/common/meta_blit.c
> index beb1ea5..bd6118b 100644
> --- a/src/mesa/drivers/common/meta_blit.c
> +++ b/src/mesa/drivers/common/meta_blit.c
> @@ -705,10 +705,6 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx,
>                                 filter, dstFlipX, dstFlipY,
>                                 use_glsl_version, false)) {
>           mask &= ~GL_COLOR_BUFFER_BIT;
> -         if (mask == 0x0) {
> -            _mesa_meta_end(ctx);
> -            return;
> -         }
>        }
>     }
>
> @@ -718,10 +714,6 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx,
>                                 filter, dstFlipX, dstFlipY,
>                                 use_glsl_version, true)) {
>           mask &= ~GL_DEPTH_BUFFER_BIT;
> -         if (mask == 0x0) {
> -            _mesa_meta_end(ctx);
> -            return;
> -         }
>        }
>     }
>
> --
> 1.9.2
>
> ___
> mesa-stable mailing list
> mesa-sta...@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-stable
>

--
Courtney Goeltzenleuchter
LunarG

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] loader: allow alternative methods for PCI identification.
On Wed, May 14, 2014 at 10:39:05PM -0600, Gary Wong wrote:
> loader_get_pci_id_for_fd() and loader_get_device_name_for_fd() now attempt
> all available strategies to identify the hardware, instead of conditionally
> compiling in a single test. The existing libudev and DRM approaches have
> been retained, and another simple alternative of looking up the answer in
> the /sys filesystem (available on Linux) is added.

Hi folks,

Any feedback on this patch? I'd like to push it to master if there are no
objections.

Thanks,
Gary.
--
Gary Wong g...@gnu.org http://www.cs.utah.edu/~gtw/
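For readers who have not seen the patch itself, the "attempt all available strategies" approach described in the quoted text could look roughly like the sketch below. Everything here is an assumption for illustration: the `pci_id_strategy` type, the two stand-in strategies and `get_pci_id_for_fd` are hypothetical names, not the loader's actual functions, and the IDs are arbitrary example values.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical strategy signature: returns 1 and fills the IDs on success,
 * 0 if this method cannot identify the device. */
typedef int (*pci_id_strategy)(int fd, int *vendor_id, int *chip_id);

/* Stand-in for a method that is compiled in but fails at run time. */
static int
strategy_unavailable(int fd, int *vendor_id, int *chip_id)
{
   (void)fd; (void)vendor_id; (void)chip_id;
   return 0;
}

/* Stand-in for a method that succeeds; the IDs are arbitrary examples. */
static int
strategy_fixed_answer(int fd, int *vendor_id, int *chip_id)
{
   (void)fd;
   *vendor_id = 0x8086;
   *chip_id = 0x0166;
   return 1;
}

/* Try each strategy in order and stop at the first one that succeeds. */
int
get_pci_id_for_fd(int fd, const pci_id_strategy *strategies, size_t count,
                  int *vendor_id, int *chip_id)
{
   assert(vendor_id != NULL && chip_id != NULL);

   for (size_t i = 0; i < count; i++) {
      if (strategies[i] && strategies[i](fd, vendor_id, chip_id))
         return 1;
   }
   return 0;
}
```

The point of the table is that adding another lookup method becomes a one-line change, rather than another preprocessor branch that excludes the others.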
[Mesa-dev] [Bug 78771] egl not works on 10.0.x and 10.1.x with black screen
https://bugs.freedesktop.org/show_bug.cgi?id=78771

U. Artie Eoff changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WORKSFORME

--- Comment #10 from U. Artie Eoff ---
Works for me...

wayland (1.4) 1.4.0-0-g4b4cd00
  --disable-documentation --disable-static

drm (master) libdrm-2.4.52-0-g46d451c
  --enable-static=yes --enable-udev --enable-libkms
  --disable-nouveau-experimental-api --disable-radeon --disable-nouveau
  --enable-exynos-experimental-api

mesa (10.1) mesa-10.1.2-0-gbde3135
  --enable-gles1 --enable-gles2 --with-egl-platforms=drm,wayland
  --disable-glx --enable-shared-glapi --enable-texture-float --enable-gbm
  --enable-gallium-llvm --with-dri-drivers=i915,i965,swrast
  --with-gallium-drivers=swrast,svga

cairo (1.12) 1.12.16-0-g8e11a42
  --with-pic --enable-fc --enable-ft --enable-egl --enable-glesv2
  --enable-ps --enable-pdf --enable-script --enable-svg --enable-tee
  --disable-xlib --disable-xcb --disable-gtk-doc --disable-static

weston (1.4) 1.4.0-0-g1811312
  --disable-static --disable-setuid-install --enable-simple-clients
  --enable-clients --disable-libunwind --disable-xwayland
  --disable-xwayland-test --disable-x11-compositor --disable-rpi-compositor
  --enable-demo-clients-install

...I tested 32-bit on the NDiS-166, yet I still don't see any issues.
Perhaps your issue is caused by one of the custom IVI patches... which is
beyond scope here.

--
You are receiving this mail because:
You are the QA Contact for the bug.
[Mesa-dev] [PATCH 20/23] i965/fs: Loop over instruction lists and generate code.
Small code reduction. Will let us move the program header code into a common place in generate_assembly(). --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 56 ++--- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 46 +--- 2 files changed, 42 insertions(+), 60 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 914fb29..bae39c1 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1825,45 +1825,35 @@ fs_generator::generate_assembly(exec_list *simd8_instructions, assert(simd8_instructions || simd16_instructions); const struct gl_program *prog = fp ? &fp->Base : NULL; + exec_list *instructions[] = { simd8_instructions, simd16_instructions }; - if (simd8_instructions) { - struct annotation *annotation; - int num_annotations; + for (unsigned i = 0; i < ARRAY_SIZE(instructions); i++) { + if (instructions[i]) { + if (i == 1) { +/* align to 64 byte boundary. */ +while (p->next_insn_offset % 64) { + brw_NOP(p); +} - dispatch_width = 8; - generate_code(simd8_instructions, &num_annotations, &annotation); - brw_compact_instructions(p, 0, num_annotations, annotation); +/* Save off the start of this SIMD16 program */ +prog_data->prog_offset_16 = p->next_insn_offset; - if (unlikely(debug_flag)) { - dump_assembly(p->store, num_annotations, annotation, brw, prog, - brw_disassemble); - ralloc_free(annotation); - } - } - - if (simd16_instructions) { - /* align to 64 byte boundary. 
*/ - while (p->next_insn_offset % 64) { - brw_NOP(p); - } - - /* Save off the start of this SIMD16 program */ - prog_data->prog_offset_16 = p->next_insn_offset; - - brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED); +brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED); + } - struct annotation *annotation; - int num_annotations; + struct annotation *annotation; + int num_annotations; - dispatch_width = 16; - generate_code(simd16_instructions, &num_annotations, &annotation); - brw_compact_instructions(p, prog_data->prog_offset_16, - num_annotations, annotation); + dispatch_width = (i + 1) * 8; + generate_code(instructions[i], &num_annotations, &annotation); + brw_compact_instructions(p, prog_data->prog_offset_16, + num_annotations, annotation); - if (unlikely(debug_flag)) { - dump_assembly(p->store, num_annotations, annotation, brw, prog, - brw_disassemble); - ralloc_free(annotation); + if (unlikely(debug_flag)) { +dump_assembly(p->store, num_annotations, annotation, brw, prog, + brw_disassemble); +ralloc_free(annotation); + } } } diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp index 272f668..f498cd5 100644 --- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp @@ -1314,38 +1314,30 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions, { assert(simd8_instructions || simd16_instructions); - if (simd8_instructions) { - struct annotation *annotation; - int num_annotations; + exec_list *instructions[] = { simd8_instructions, simd16_instructions }; - dispatch_width = 8; - generate_code(simd8_instructions, &num_annotations, &annotation); + for (unsigned i = 0; i < ARRAY_SIZE(instructions); i++) { + if (instructions[i]) { + if (i == 1) { +/* Align to a 64-byte boundary. 
*/ +while (next_inst_offset % 64) + NOP(); - if (unlikely(INTEL_DEBUG & DEBUG_WM)) { - dump_assembly(store, num_annotations, annotation, brw, prog, - gen8_disassemble); - ralloc_free(annotation); - } - } - - if (simd16_instructions) { - /* Align to a 64-byte boundary. */ - while (next_inst_offset % 64) - NOP(); - - /* Save off the start of this SIMD16 program */ - prog_data->prog_offset_16 = next_inst_offset; +/* Save off the start of this SIMD16 program */ +prog_data->prog_offset_16 = next_inst_offset; + } - struct annotation *annotation; - int num_annotations; + struct annotation *annotation; + int num_annotations; - dispatch_width = 16; - generate_code(simd16_instructions, &num_annotations, &annotation); + dispatch_width = (i + 1) * 8; + generate_code(instructions[i], &num_annotations, &annotation); - if (unlik
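The control flow this patch introduces (walk both instruction lists in one loop, pad to a 64-byte boundary before the SIMD16 program, derive the dispatch width from the index) can be modelled without any driver state. The sketch below is an assumption-laden toy: `layout_programs` and its size-based inputs are hypothetical, and a zero size stands in for a NULL instruction list.

```c
#include <assert.h>

#define NUM_WIDTHS 2

/* Hypothetical model, not driver code: sizes_in_bytes[i] == 0 means there
 * is no instruction list for that width. Returns the total program size. */
unsigned
layout_programs(const unsigned sizes_in_bytes[NUM_WIDTHS],
                unsigned dispatch_widths[NUM_WIDTHS],
                unsigned *prog_offset_16)
{
   unsigned offset = 0;

   *prog_offset_16 = 0;

   for (unsigned i = 0; i < NUM_WIDTHS; i++) {
      dispatch_widths[i] = 0;
      if (sizes_in_bytes[i] == 0)
         continue;

      if (i == 1) {
         /* Padding uses 16-byte NOPs, so the loop only terminates when the
          * offset is already 16-byte aligned. */
         assert(offset % 16 == 0);
         while (offset % 64)
            offset += 16;

         /* Save off the start of the SIMD16 program. */
         *prog_offset_16 = offset;
      }

      dispatch_widths[i] = (i + 1) * 8;   /* index 0 -> 8, index 1 -> 16 */
      offset += sizes_in_bytes[i];
   }

   return offset;
}
```

For example, a 48-byte SIMD8 program followed by a 32-byte SIMD16 program places the SIMD16 code at offset 64 and ends at offset 96.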
[Mesa-dev] [PATCH 09/23] i965/gen8/fs: Print disassembly after compaction.
--- src/mesa/drivers/dri/i965/brw_fs.h | 3 +- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 138 +++- 2 files changed, 65 insertions(+), 76 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index d26b972..1390895 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -740,7 +740,8 @@ public: unsigned *assembly_size); private: - void generate_code(exec_list *instructions); + void generate_code(exec_list *instructions, int *num_annotations, + struct annotation **annotation); void generate_fb_write(fs_inst *inst); void generate_linterp(fs_inst *inst, struct brw_reg dst, struct brw_reg *src); diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp index 9df5b73..7e90ee6 100644 --- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp @@ -883,12 +883,9 @@ gen8_fs_generator::generate_untyped_surface_read(fs_inst *ir, } void -gen8_fs_generator::generate_code(exec_list *instructions) +gen8_fs_generator::generate_code(exec_list *instructions, int *num_annotations, + struct annotation **annotation) { - int last_native_inst_offset = next_inst_offset; - const char *last_annotation_string = NULL; - const void *last_annotation_ir = NULL; - if (unlikely(INTEL_DEBUG & DEBUG_WM)) { if (prog) { fprintf(stderr, @@ -905,53 +902,52 @@ gen8_fs_generator::generate_code(exec_list *instructions) } } + int block_num = 0; + int ann_num = 0; + int ann_size = 1024; cfg_t *cfg = NULL; - if (unlikely(INTEL_DEBUG & DEBUG_WM)) + struct annotation *ann = NULL; + + if (unlikely(INTEL_DEBUG & DEBUG_WM)) { cfg = new(mem_ctx) cfg_t(instructions); + ann = rzalloc_array(NULL, struct annotation, ann_size); + } foreach_list(node, instructions) { fs_inst *ir = (fs_inst *) node; struct brw_reg src[3], dst; if (unlikely(INTEL_DEBUG & DEBUG_WM)) { - foreach_list(node, &cfg->block_list) { -bblock_link *link = (bblock_link 
*)node; -bblock_t *block = link->block; - -if (block->start == ir) { - fprintf(stderr, " START B%d", block->block_num); - foreach_list(predecessor_node, &block->parents) { - bblock_link *predecessor_link = - (bblock_link *)predecessor_node; - bblock_t *predecessor_block = predecessor_link->block; - fprintf(stderr, " <-B%d", predecessor_block->block_num); - } - fprintf(stderr, "\n"); -} + if (ann_num == ann_size) { +ann_size *= 2; +ann = reralloc(NULL, ann, struct annotation, ann_size); } - if (last_annotation_ir != ir->ir) { -last_annotation_ir = ir->ir; -if (last_annotation_ir) { - fprintf(stderr, " "); - if (prog) { - ((ir_instruction *) ir->ir)->fprint(stderr); - } else if (prog) { - const prog_instruction *fpi; - fpi = (const prog_instruction *) ir->ir; - fprintf(stderr, "%d: ", (int)(fpi - prog->Instructions)); - _mesa_fprint_instruction_opt(stderr, - fpi, - 0, PROG_PRINT_DEBUG, NULL); - } - fprintf(stderr, "\n"); -} + ann[ann_num].offset = next_inst_offset; + ann[ann_num].ir = ir->ir; + ann[ann_num].annotation = ir->annotation; + + if (cfg->blocks[block_num]->start == ir) { +ann[ann_num].block_start = cfg->blocks[block_num]; } - if (last_annotation_string != ir->annotation) { -last_annotation_string = ir->annotation; -if (last_annotation_string) - fprintf(stderr, " %s\n", last_annotation_string); + + /* There is no hardware DO instruction on Gen6+, so since DO always + * starts a basic block, we need to set the .block_start of the next + * instruction's annotation with a pointer to the bblock started by + * the DO. + * + * There's also only complication from emitting an annotation without + * a corresponding hardware instruction to disassemble. + */ + if (brw->gen >= 6 && ir->opcode == BRW_OPCODE_DO) { +ann_num--; } + + if (cfg->blocks[block_num]->end == ir) { +ann[ann_num].block_end = cfg->blocks[block_num]; +block_num++; + } + ann_num++; } for (unsigned int i = 0; i < 3; i++) { @@ -1295,44 +
[Mesa-dev] [PATCH 16/23] i965: Emit 0.0:F sources with type VF instead.
Number of compacted instructions: 817752 -> 827404 (1.18%)
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 16
 1 file changed, 16 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index d8efa01..1810233 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -357,6 +357,22 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
       } else {
          insn->bits1.da1.src1_reg_type = BRW_HW_REG_TYPE_UD;
       }
+
+      /* Compacted instructions only have 12-bits (plus 1 for the other 20)
+       * for immediate values. Presumably the hardware engineers realized
+       * that the only useful floating-point value that could be represented
+       * in this format is 0.0, which can also be represented as a VF-typed
+       * immediate, so they gave us the previously mentioned mapping on IVB+.
+       *
+       * Strangely, we do have a mapping for imm:f in src1, so we don't need
+       * to do this there.
+       *
+       * If we see a 0.0:F, change the type to VF so that it can be compacted.
+       */
+      if (insn->bits3.ud == 0x0 &&
+          insn->bits1.da1.src0_reg_type == BRW_HW_REG_TYPE_F) {
+         insn->bits1.da1.src0_reg_type = BRW_HW_REG_IMM_TYPE_VF;
+      }
    } else {
--
1.8.3.2
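The retyping rule in this patch depends only on the immediate's bit pattern and its declared type, so it can be shown in isolation. The following is a minimal sketch, assuming a 32-bit IEEE 754 `float`; `struct imm_src`, `make_float_imm` and `retype_zero_float` are hypothetical stand-ins for the `brw_instruction` bitfields, not the driver's types.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the immediate-source fields. */
enum imm_type { IMM_TYPE_F, IMM_TYPE_VF };

struct imm_src {
   enum imm_type type;
   uint32_t bits;      /* raw 32-bit immediate payload */
};

/* Store a float immediate by copying its bit pattern (assumes a 32-bit
 * float, which is an assumption of this sketch). */
struct imm_src
make_float_imm(float value)
{
   struct imm_src src;

   src.type = IMM_TYPE_F;
   memcpy(&src.bits, &value, sizeof(src.bits));
   return src;
}

/* If the immediate is exactly +0.0 typed as F, retype it as VF. The payload
 * stays all-zero, which the commit message says also reads as 0.0 in VF. */
void
retype_zero_float(struct imm_src *src)
{
   assert(src != NULL);

   if (src->bits == 0x0 && src->type == IMM_TYPE_F)
      src->type = IMM_TYPE_VF;
}
```

Note that the check is on the bit pattern, so on IEEE 754 systems a negative zero (sign bit set) is left typed as F.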
[Mesa-dev] [PATCH 23/23] i965/gen8: Print number of instructions directly.
---
 src/mesa/drivers/dri/i965/gen8_fs_generator.cpp   | 5 +
 src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 3 +++
 2 files changed, 8 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
index 0ac00f9..90743ee 100644
--- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp
@@ -1313,10 +1313,13 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions,
 
          struct annotation *annotation;
          int num_annotations;
+         int start_offset = next_inst_offset;
 
          dispatch_width = (i + 1) * 8;
          generate_code(instructions[i], &num_annotations, &annotation);
 
+         int before_size = next_inst_offset - start_offset;
+
          if (unlikely(INTEL_DEBUG & DEBUG_WM)) {
             if (this->prog) {
                fprintf(stderr,
@@ -1331,6 +1334,8 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions,
                fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n",
                        dispatch_width);
             }
+            fprintf(stderr, "SIMD%d shader: %d instructions.\n",
+                    dispatch_width, before_size / 16);
             dump_assembly(store, num_annotations, annotation, brw, prog,
                           gen8_disassemble);
             ralloc_free(annotation);
diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
index 9f19a0a..3447ebf 100644
--- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp
@@ -944,6 +944,8 @@ gen8_vec4_generator::generate_assembly(exec_list *instructions,
    default_state.exec_size = BRW_EXECUTE_8;
    generate_code(instructions, &num_annotations, &annotation);
 
+   int before_size = next_inst_offset;
+
    if (unlikely(debug_flag)) {
       if (shader_prog) {
          fprintf(stderr, "Native code for %s vertex shader %d:\n",
@@ -952,6 +954,7 @@ gen8_vec4_generator::generate_assembly(exec_list *instructions,
       } else {
          fprintf(stderr, "Native code for vertex program %d:\n", prog->Id);
       }
+      fprintf(stderr, "vec4 shader: %d instructions.\n", before_size / 16);
       dump_assembly(store, num_annotations, annotation, brw, prog,
                     gen8_disassemble);
       ralloc_free(annotation);
--
1.8.3.2
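The instruction count printed by this patch is just the byte span of the generated code divided by 16, the size the patch assumes for one native instruction. A minimal sketch of that arithmetic follows; `count_native_instructions` and `format_shader_summary` are hypothetical helper names introduced for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Size in bytes the patch divides by for one native instruction. */
#define NATIVE_INSN_BYTES 16

/* Number of instructions emitted between two byte offsets. */
int
count_native_instructions(int start_offset, int end_offset)
{
   assert(end_offset >= start_offset);
   return (end_offset - start_offset) / NATIVE_INSN_BYTES;
}

/* Format the same summary line the fragment shader path prints.
 * Returns what snprintf() returns. */
int
format_shader_summary(char *buf, size_t size, int dispatch_width,
                      int start_offset, int end_offset)
{
   return snprintf(buf, size, "SIMD%d shader: %d instructions.\n",
                   dispatch_width,
                   count_native_instructions(start_offset, end_offset));
}
```

The fragment shader hunk subtracts `start_offset` because the SIMD16 program does not begin at offset 0, while the vec4 hunk divides `next_inst_offset` directly.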
[Mesa-dev] [PATCH 21/23] i965: Print shader header in generate_assembly().
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 29 ++- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 17 ++--- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 29 ++- src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 17 ++--- 4 files changed, 40 insertions(+), 52 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index bae39c1..f70e7b2 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1325,22 +1325,6 @@ void fs_generator::generate_code(exec_list *instructions, int *num_annotations, struct annotation **annotation) { - if (unlikely(debug_flag)) { - if (prog) { - fprintf(stderr, - "Native code for %s fragment shader %d (SIMD%d dispatch):\n", - prog->Label ? prog->Label : "unnamed", - prog->Name, dispatch_width); - } else if (fp) { - fprintf(stderr, - "Native code for fragment program %d (SIMD%d dispatch):\n", - fp->Base.Id, dispatch_width); - } else { - fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n", - dispatch_width); - } - } - int block_num = 0; int ann_num = 0; int ann_size = 1024; @@ -1850,6 +1834,19 @@ fs_generator::generate_assembly(exec_list *simd8_instructions, num_annotations, annotation); if (unlikely(debug_flag)) { +if (this->prog) { + fprintf(stderr, + "Native code for %s fragment shader %d (SIMD%d dispatch):\n", + this->prog->Label ? 
this->prog->Label : "unnamed", + this->prog->Name, dispatch_width); +} else if (fp) { + fprintf(stderr, + "Native code for fragment program %d (SIMD%d dispatch):\n", + fp->Base.Id, dispatch_width); +} else { + fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n", + dispatch_width); +} dump_assembly(p->store, num_annotations, annotation, brw, prog, brw_disassemble); ralloc_free(annotation); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 5980aad..819ed10 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -1264,16 +1264,6 @@ void vec4_generator::generate_code(exec_list *instructions, int *num_annotations, struct annotation **annotation) { - if (unlikely(debug_flag)) { - if (shader_prog) { - fprintf(stderr, "Native code for %s vertex shader %d:\n", - shader_prog->Label ? shader_prog->Label : "unnamed", - shader_prog->Name); - } else { - fprintf(stderr, "Native code for vertex program %d:\n", prog->Id); - } - } - int block_num = 0; int ann_num = 0; int ann_size = 1024; @@ -1378,6 +1368,13 @@ vec4_generator::generate_assembly(exec_list *instructions, brw_compact_instructions(p, 0, num_annotations, annotation); if (unlikely(debug_flag)) { + if (shader_prog) { + fprintf(stderr, "Native code for %s vertex shader %d:\n", + shader_prog->Label ? 
shader_prog->Label : "unnamed", + shader_prog->Name); + } else { + fprintf(stderr, "Native code for vertex program %d:\n", prog->Id); + } dump_assembly(p->store, num_annotations, annotation, brw, prog, brw_disassemble); ralloc_free(annotation); diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp index f498cd5..0ac00f9 100644 --- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp @@ -886,22 +886,6 @@ void gen8_fs_generator::generate_code(exec_list *instructions, int *num_annotations, struct annotation **annotation) { - if (unlikely(INTEL_DEBUG & DEBUG_WM)) { - if (prog) { - fprintf(stderr, - "Native code for %s fragment shader %d (SIMD%d dispatch):\n", -shader_prog->Label ? shader_prog->Label : "unnamed", -shader_prog->Name, dispatch_width); - } else if (fp) { - fprintf(stderr, - "Native code for fragment program %d (SIMD%d dispatch):\n", - prog->Id, dispatch_width); - } else { - fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n", - dispatch_width); - } - } - int block_num = 0; int ann_num = 0; int ann_size = 1
[Mesa-dev] [PATCH 04/23] i965/fs+blorp: Remove left over dump_file arguments.
Were used by the blorp unit test programs. --- src/mesa/drivers/dri/i965/brw_blorp_blit.cpp| 20 src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 4 ++-- src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h | 2 +- src/mesa/drivers/dri/i965/brw_fs.h | 5 ++--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 13 ++--- 5 files changed, 15 insertions(+), 29 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp index 3da6388..118af27 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp @@ -519,8 +519,7 @@ public: brw_blorp_blit_program(struct brw_context *brw, const brw_blorp_blit_prog_key *key, bool debug_flag); - const GLuint *compile(struct brw_context *brw, GLuint *program_size, - FILE *dump_file = stderr); + const GLuint *compile(struct brw_context *brw, GLuint *program_size); brw_blorp_prog_data prog_data; @@ -634,8 +633,7 @@ brw_blorp_blit_program::brw_blorp_blit_program( const GLuint * brw_blorp_blit_program::compile(struct brw_context *brw, -GLuint *program_size, -FILE *dump_file) +GLuint *program_size) { /* Sanity checks */ if (key->dst_tiled_w && key->rt_samples > 0) { @@ -790,7 +788,7 @@ brw_blorp_blit_program::compile(struct brw_context *brw, */ render_target_write(); - return get_program(program_size, dump_file); + return get_program(program_size); } void @@ -2146,7 +2144,7 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context *brw, brw_blorp_blit_program prog(brw, &this->wm_prog_key, INTEL_DEBUG & DEBUG_BLORP); GLuint program_size; - const GLuint *program = prog.compile(brw, &program_size, stderr); + const GLuint *program = prog.compile(brw, &program_size); brw_upload_cache(&brw->cache, BRW_BLORP_BLIT_PROG, &this->wm_prog_key, sizeof(this->wm_prog_key), program, program_size, @@ -2155,13 +2153,3 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context *brw, } return prog_offset; } - -void -brw_blorp_blit_test_compile(struct brw_context *brw, 
-const brw_blorp_blit_prog_key *key, -FILE *out) -{ - GLuint program_size; - brw_blorp_blit_program prog(brw, key, true /* debug_flag */); - prog.compile(brw, &program_size, out); -} diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp index 4910b6c..33fa606 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp @@ -41,9 +41,9 @@ brw_blorp_eu_emitter::~brw_blorp_eu_emitter() } const unsigned * -brw_blorp_eu_emitter::get_program(unsigned *program_size, FILE *dump_file) +brw_blorp_eu_emitter::get_program(unsigned *program_size) { - return generator.generate_assembly(NULL, &insts, program_size, dump_file); + return generator.generate_assembly(NULL, &insts, program_size); } /** diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h index 8a93f05..bc927fe 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h @@ -33,7 +33,7 @@ protected: explicit brw_blorp_eu_emitter(struct brw_context *brw, bool debug_flag); ~brw_blorp_eu_emitter(); - const unsigned *get_program(unsigned *program_size, FILE *dump_file); + const unsigned *get_program(unsigned *program_size); void emit_kill_if_outside_rect(const struct brw_reg &x, const struct brw_reg &y, diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 7a87aed..8acad2f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -608,11 +608,10 @@ public: const unsigned *generate_assembly(exec_list *simd8_instructions, exec_list *simd16_instructions, - unsigned *assembly_size, - FILE *dump_file = NULL); + unsigned *assembly_size); private: - void generate_code(exec_list *instructions, FILE *dump_file); + void generate_code(exec_list *instructions); void generate_fb_write(fs_inst *inst); void generate_blorp_fb_write(fs_inst *inst); void 
generate_pixel_xy(struct brw_reg dst, bool is_x); diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 878b0e0..bf3f32c 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1321,7 +1321,7 @@ fs_ge
[Mesa-dev] [PATCH 08/23] i965/vec4: Print disassembly after compaction.
--- src/mesa/drivers/dri/i965/brw_vec4.h | 4 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 109 +-- 2 files changed, 66 insertions(+), 47 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index a86972a..3a1eb12 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -36,6 +36,7 @@ extern "C" { #include "brw_context.h" #include "brw_eu.h" +#include "intel_asm_printer.h" #ifdef __cplusplus }; /* extern "C" */ @@ -650,7 +651,8 @@ public: const unsigned *generate_assembly(exec_list *insts, unsigned *asm_size); private: - void generate_code(exec_list *instructions); + void generate_code(exec_list *instructions, int *num_annotations, + struct annotation **annotation); void generate_vec4_instruction(vec4_instruction *inst, struct brw_reg dst, struct brw_reg *src); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index a91bfe7..2176de4 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -21,6 +21,7 @@ */ #include "brw_vec4.h" +#include "brw_cfg.h" extern "C" { #include "brw_eu.h" @@ -1260,12 +1261,9 @@ vec4_generator::generate_vec4_instruction(vec4_instruction *instruction, } void -vec4_generator::generate_code(exec_list *instructions) +vec4_generator::generate_code(exec_list *instructions, int *num_annotations, + struct annotation **annotation) { - int last_native_insn_offset = 0; - const char *last_annotation_string = NULL; - const void *last_annotation_ir = NULL; - if (unlikely(debug_flag)) { if (shader_prog) { fprintf(stderr, "Native code for %s vertex shader %d:\n", @@ -1276,32 +1274,52 @@ vec4_generator::generate_code(exec_list *instructions) } } + int block_num = 0; + int ann_num = 0; + int ann_size = 1024; + cfg_t *cfg = NULL; + struct annotation *ann = NULL; + + if (unlikely(debug_flag)) { + cfg = new(mem_ctx) cfg_t(instructions); + ann 
= rzalloc_array(NULL, struct annotation, ann_size); + } + foreach_list(node, instructions) { vec4_instruction *inst = (vec4_instruction *)node; struct brw_reg src[3], dst; if (unlikely(debug_flag)) { -if (last_annotation_ir != inst->ir) { - last_annotation_ir = inst->ir; - if (last_annotation_ir) { - fprintf(stderr, " "); - if (shader_prog) { - ((ir_instruction *) last_annotation_ir)->fprint(stderr); - } else { - const prog_instruction *vpi; - vpi = (const prog_instruction *) inst->ir; - fprintf(stderr, "%d: ", (int)(vpi - prog->Instructions)); - _mesa_fprint_instruction_opt(stderr, vpi, 0, - PROG_PRINT_DEBUG, NULL); - } - fprintf(stderr, "\n"); - } -} -if (last_annotation_string != inst->annotation) { - last_annotation_string = inst->annotation; - if (last_annotation_string) - fprintf(stderr, " %s\n", last_annotation_string); -} + if (ann_num == ann_size) { +ann_size *= 2; +ann = reralloc(NULL, ann, struct annotation, ann_size); + } + + ann[ann_num].offset = p->next_insn_offset; + ann[ann_num].ir = inst->ir; + ann[ann_num].annotation = inst->annotation; + + if (cfg->blocks[block_num]->start == inst) { +ann[ann_num].block_start = cfg->blocks[block_num]; + } + + /* There is no hardware DO instruction on Gen6+, so since DO always + * starts a basic block, we need to set the .block_start of the next + * instruction's annotation with a pointer to the bblock started by + * the DO. + * + * There's also only complication from emitting an annotation without + * a corresponding hardware instruction to disassemble. 
+ */ + if (brw->gen >= 6 && inst->opcode == BRW_OPCODE_DO) { +ann_num--; + } + + if (cfg->blocks[block_num]->end == inst) { +ann[ann_num].block_end = cfg->blocks[block_num]; +block_num++; + } + ann_num++; } for (unsigned int i = 0; i < 3; i++) { @@ -1332,38 +1350,37 @@ vec4_generator::generate_code(exec_list *instructions) if (inst->no_dd_check) last->header.dependency_control |= BRW_DEPENDENCY_NOTCHECKED; } - - if (unlikely(debug_flag)) { -brw_disassemble(brw, p->store, -last_native_insn_offset, p->next_insn_offset, stderr); - } - - last_nativ
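The annotation bookkeeping added here relies on an array that doubles in size whenever it fills up. A standalone sketch of that growth pattern is given below, with the caveat that it swaps in `realloc()` for Mesa's `rzalloc_array()`/`reralloc()` and starts from a small capacity instead of 1024; the struct layout and `annotation_list_append` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down annotation record. */
struct annotation {
   int offset;            /* byte offset of the instruction */
   const char *text;      /* optional annotation string */
};

struct annotation_list {
   struct annotation *items;
   int count;
   int capacity;
};

/* Append one record, doubling the capacity when the array is full.
 * Returns 1 on success, 0 if the allocation failed. */
int
annotation_list_append(struct annotation_list *list, int offset,
                       const char *text)
{
   assert(list != NULL);

   if (list->count == list->capacity) {
      int new_capacity = list->capacity ? list->capacity * 2 : 4;
      struct annotation *grown =
         realloc(list->items, (size_t)new_capacity * sizeof(*grown));

      if (grown == NULL)
         return 0;

      list->items = grown;
      list->capacity = new_capacity;
   }

   list->items[list->count].offset = offset;
   list->items[list->count].text = text;
   list->count++;
   return 1;
}
```

Doubling keeps the number of reallocations logarithmic in the number of instructions, which matters little for debug output but keeps the code simple.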
[Mesa-dev] [PATCH 13/23] i965: Use next_offset() in instruction compaction code.
---
 src/mesa/drivers/dri/i965/brw_eu_compact.c | 20 +++-
 1 file changed, 3 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c
index 40d1fc2..f6f055f 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_compact.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c
@@ -765,11 +765,7 @@ brw_compact_instructions(struct brw_compile *p, int start_offset,
          break;
       }
 
-      if (insn->header.cmpt_control) {
-         offset += 8;
-      } else {
-         offset += 16;
-      }
+      offset = next_offset(store, offset);
    }
 
    /* p->nr_insn is counting the number of uncompacted instructions still, so
@@ -792,22 +788,12 @@ brw_compact_instructions(struct brw_compile *p, int start_offset,
          while (start_offset + old_ip[offset / 8] * 8 != annotation[i].offset) {
             assert(start_offset + old_ip[offset / 8] * 8 <
                    annotation[i].offset);
-            struct brw_instruction *insn = store + offset;
-            if (insn->header.cmpt_control) {
-               offset += 8;
-            } else {
-               offset += 16;
-            }
+            offset = next_offset(store, offset);
          }
 
          annotation[i].offset = start_offset + offset;
 
-         struct brw_instruction *insn = store + offset;
-         if (insn->header.cmpt_control) {
-            offset += 8;
-         } else {
-            offset += 16;
-         }
+         offset = next_offset(store, offset);
       }
 
       annotation[num_annotations].offset = p->next_insn_offset;
--
1.8.3.2
[Mesa-dev] [PATCH 07/23] i965/fs: Print disassembly after compaction.
--- src/mesa/drivers/dri/i965/brw_fs.h | 4 +- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 156 - 2 files changed, 77 insertions(+), 83 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 111e994..d26b972 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -46,6 +46,7 @@ extern "C" { #include "brw_eu.h" #include "brw_wm.h" #include "brw_shader.h" +#include "intel_asm_printer.h" } #include "gen8_generator.h" #include "glsl/glsl_types.h" @@ -611,7 +612,8 @@ public: unsigned *assembly_size); private: - void generate_code(exec_list *instructions); + void generate_code(exec_list *instructions, int *num_annotations, + struct annotation **annotation); void generate_fb_write(fs_inst *inst); void generate_blorp_fb_write(fs_inst *inst); void generate_pixel_xy(struct brw_reg dst, bool is_x); diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 132d5cd..b0b3b56 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1322,12 +1322,9 @@ fs_generator::generate_untyped_surface_read(fs_inst *inst, struct brw_reg dst, } void -fs_generator::generate_code(exec_list *instructions) +fs_generator::generate_code(exec_list *instructions, int *num_annotations, +struct annotation **annotation) { - int last_native_insn_offset = p->next_insn_offset; - const char *last_annotation_string = NULL; - const void *last_annotation_ir = NULL; - if (unlikely(debug_flag)) { if (prog) { fprintf(stderr, @@ -1344,54 +1341,52 @@ fs_generator::generate_code(exec_list *instructions) } } + int block_num = 0; + int ann_num = 0; + int ann_size = 1024; cfg_t *cfg = NULL; - if (unlikely(debug_flag)) + struct annotation *ann = NULL; + + if (unlikely(debug_flag)) { cfg = new(mem_ctx) cfg_t(instructions); + ann = rzalloc_array(NULL, struct annotation, ann_size); + } foreach_list(node, instructions) { 
fs_inst *inst = (fs_inst *)node; struct brw_reg src[3], dst; if (unlikely(debug_flag)) { -foreach_list(node, &cfg->block_list) { - bblock_link *link = (bblock_link *)node; - bblock_t *block = link->block; - - if (block->start == inst) { - fprintf(stderr, " START B%d", block->block_num); - foreach_list(predecessor_node, &block->parents) { - bblock_link *predecessor_link = -(bblock_link *)predecessor_node; - bblock_t *predecessor_block = predecessor_link->block; - fprintf(stderr, " <-B%d", predecessor_block->block_num); - } - fprintf(stderr, "\n"); - } -} + if (ann_num == ann_size) { +ann_size *= 2; +ann = reralloc(NULL, ann, struct annotation, ann_size); + } -if (last_annotation_ir != inst->ir) { - last_annotation_ir = inst->ir; - if (last_annotation_ir) { - fprintf(stderr, " "); - if (prog) - ((ir_instruction *)inst->ir)->fprint(stderr); - else { - const prog_instruction *fpi; - fpi = (const prog_instruction *)inst->ir; - fprintf(stderr, "%d: ", - (int)(fpi - (fp ? fp->Base.Instructions : 0))); - _mesa_fprint_instruction_opt(stderr, - fpi, - 0, PROG_PRINT_DEBUG, NULL); - } - fprintf(stderr, "\n"); - } -} -if (last_annotation_string != inst->annotation) { - last_annotation_string = inst->annotation; - if (last_annotation_string) - fprintf(stderr, " %s\n", last_annotation_string); -} + ann[ann_num].offset = p->next_insn_offset; + ann[ann_num].ir = inst->ir; + ann[ann_num].annotation = inst->annotation; + + if (cfg->blocks[block_num]->start == inst) { +ann[ann_num].block_start = cfg->blocks[block_num]; + } + + /* There is no hardware DO instruction on Gen6+, so since DO always + * starts a basic block, we need to set the .block_start of the next + * instruction's annotation with a pointer to the bblock started by + * the DO. + * + * There's also only complication from emitting an annotation without + * a corresponding hardware instruction to disassemble. + */ + if (brw->gen >= 6 && inst->opcode == BRW_OPCODE_DO) { +ann_num--; + } + + if (cfg->blocks[block_num]->e
[Mesa-dev] [PATCH 19/23] i965/fs: Use next_insn_offset rather than nr_insn.
--- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 ++-- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 872b5a4..914fb29 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1843,12 +1843,12 @@ fs_generator::generate_assembly(exec_list *simd8_instructions, if (simd16_instructions) { /* align to 64 byte boundary. */ - while ((p->nr_insn * sizeof(struct brw_instruction)) % 64) { + while (p->next_insn_offset % 64) { brw_NOP(p); } /* Save off the start of this SIMD16 program */ - prog_data->prog_offset_16 = p->nr_insn * sizeof(struct brw_instruction); + prog_data->prog_offset_16 = p->next_insn_offset; brw_set_compression_control(p, BRW_COMPRESSION_COMPRESSED); diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp index 9011bff..272f668 100644 --- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp @@ -1330,11 +1330,11 @@ gen8_fs_generator::generate_assembly(exec_list *simd8_instructions, if (simd16_instructions) { /* Align to a 64-byte boundary. */ - while ((nr_inst * sizeof(gen8_instruction)) % 64) + while (next_inst_offset % 64) NOP(); /* Save off the start of this SIMD16 program */ - prog_data->prog_offset_16 = nr_inst * sizeof(gen8_instruction); + prog_data->prog_offset_16 = next_inst_offset; struct annotation *annotation; int num_annotations; -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
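A note on why the byte offset is the safer quantity here: once the stream can contain compacted 8-byte instructions, nr_insn * sizeof(struct brw_instruction) overstates the program size (ten instructions with four compacted occupy 6*16 + 4*8 = 128 bytes, not 160). A minimal sketch of the padding loop follows. The helper name and constants are hypothetical, not driver code, and it assumes the offset is already 16-byte aligned when padding starts, since only full-size NOPs are added.

```c
#include <assert.h>
#include <stddef.h>

enum {
   SKETCH_NATIVE_INSN_BYTES = 16,   /* size of an uncompacted instruction */
   SKETCH_SIMD16_ALIGN_BYTES = 64   /* required alignment of the SIMD16 program */
};

/* Hypothetical helper: pad a byte offset up to the next 64-byte boundary by
 * "emitting" full-size NOPs, and report how many NOPs were needed.
 */
static size_t
sketch_pad_to_simd16_start(size_t next_insn_offset, unsigned *nops_emitted)
{
   /* Assumption of this sketch: full-size NOPs can only reach a 64-byte
    * boundary from a 16-byte-aligned starting offset.
    */
   assert(next_insn_offset % SKETCH_NATIVE_INSN_BYTES == 0);

   unsigned nops = 0;
   while (next_insn_offset % SKETCH_SIMD16_ALIGN_BYTES) {
      next_insn_offset += SKETCH_NATIVE_INSN_BYTES;   /* stands in for brw_NOP(p) */
      nops++;
   }

   *nops_emitted = nops;
   return next_insn_offset;
}
```

The returned offset is what would be stored as prog_offset_16 in the patch above.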
[Mesa-dev] [PATCH 11/23] i965: Rename next_ip() -> next_offset().
That we were comparing its return value with offsets should have been a clue. :) Make it take a void *store in preparation for making the function useful elsewhere. --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 63 + 1 file changed, 33 insertions(+), 30 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 1ebd7a9..a357d5d 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -2383,31 +2383,32 @@ void brw_urb_WRITE(struct brw_compile *p, } static int -next_ip(struct brw_compile *p, int ip) +next_offset(void *store, int offset) { - struct brw_instruction *insn = (void *)p->store + ip; + struct brw_instruction *insn = (void *)store + offset; if (insn->header.cmpt_control) - return ip + 8; + return offset + 8; else - return ip + 16; + return offset + 16; } static int -brw_find_next_block_end(struct brw_compile *p, int start) +brw_find_next_block_end(struct brw_compile *p, int start_offset) { - int ip; + int offset; void *store = p->store; - for (ip = next_ip(p, start); ip < p->next_insn_offset; ip = next_ip(p, ip)) { - struct brw_instruction *insn = store + ip; + for (offset = next_offset(store, start_offset); offset < p->next_insn_offset; +offset = next_offset(store, offset)) { + struct brw_instruction *insn = store + offset; switch (insn->header.opcode) { case BRW_OPCODE_ENDIF: case BRW_OPCODE_ELSE: case BRW_OPCODE_WHILE: case BRW_OPCODE_HALT: -return ip; +return offset; } } @@ -2419,28 +2420,29 @@ brw_find_next_block_end(struct brw_compile *p, int start) * instruction. */ static int -brw_find_loop_end(struct brw_compile *p, int start) +brw_find_loop_end(struct brw_compile *p, int start_offset) { struct brw_context *brw = p->brw; - int ip; + int offset; int scale = 8; void *store = p->store; /* Always start after the instruction (such as a WHILE) we're trying to fix * up. 
*/ - for (ip = next_ip(p, start); ip < p->next_insn_offset; ip = next_ip(p, ip)) { - struct brw_instruction *insn = store + ip; + for (offset = next_offset(store, start_offset); offset < p->next_insn_offset; +offset = next_offset(store, offset)) { + struct brw_instruction *insn = store + offset; if (insn->header.opcode == BRW_OPCODE_WHILE) { int jip = brw->gen == 6 ? insn->bits1.branch_gen6.jump_count : insn->bits3.break_cont.jip; -if (ip + jip * scale <= start) - return ip; +if (offset + jip * scale <= start_offset) + return offset; } } assert(!"not reached"); - return start; + return start_offset; } /* After program generation, go back and update the UIP and JIP of @@ -2450,15 +2452,16 @@ void brw_set_uip_jip(struct brw_compile *p) { struct brw_context *brw = p->brw; - int ip; + int offset; int scale = 8; void *store = p->store; if (brw->gen < 6) return; - for (ip = 0; ip < p->next_insn_offset; ip = next_ip(p, ip)) { - struct brw_instruction *insn = store + ip; + for (offset = 0; offset < p->next_insn_offset; +offset = next_offset(store, offset)) { + struct brw_instruction *insn = store + offset; if (insn->header.cmpt_control) { /* Fixups for compacted BREAK/CONTINUE not supported yet. */ @@ -2468,31 +2471,31 @@ brw_set_uip_jip(struct brw_compile *p) continue; } - int block_end_ip = brw_find_next_block_end(p, ip); + int block_end_offset = brw_find_next_block_end(p, offset); switch (insn->header.opcode) { case BRW_OPCODE_BREAK: - assert(block_end_ip != 0); -insn->bits3.break_cont.jip = (block_end_ip - ip) / scale; + assert(block_end_offset != 0); +insn->bits3.break_cont.jip = (block_end_offset - offset) / scale; /* Gen7 UIP points to WHILE; Gen6 points just after it */ insn->bits3.break_cont.uip = - (brw_find_loop_end(p, ip) - ip + + (brw_find_loop_end(p, offset) - offset + (brw->gen == 6 ? 
16 : 0)) / scale; break; case BRW_OPCODE_CONTINUE: - assert(block_end_ip != 0); -insn->bits3.break_cont.jip = (block_end_ip - ip) / scale; + assert(block_end_offset != 0); +insn->bits3.break_cont.jip = (block_end_offset - offset) / scale; insn->bits3.break_cont.uip = -(brw_find_loop_end(p, ip) - ip) / scale; +(brw_find_loop_end(p, offset) - offset) / scale; assert(insn->bits3.break_cont.uip != 0); assert(insn->bits3.break_cont.jip != 0); break; case BRW_OPCODE_ENDIF: - if (block_end_ip == 0) + if (block_end_offset == 0) insn->bits3.break_cont.jip = 2;
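For readers following the rename, the arithmetic in brw_set_uip_jip() can be sketched in isolation: JIP and UIP are byte distances divided by a scale of 8, and on Gen6 the UIP of a BREAK lands one full instruction (16 bytes) past the WHILE. The function names below are hypothetical and only mirror the expressions visible in the diff; they are not part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

enum { SKETCH_JUMP_SCALE = 8 };   /* jump fields count 8-byte units */

/* Hypothetical helper: JIP of a BREAK/CONTINUE at byte 'offset' whose
 * enclosing block ends at byte 'block_end_offset'.
 */
static int
sketch_break_jip(int offset, int block_end_offset)
{
   assert(block_end_offset > offset);
   assert((block_end_offset - offset) % SKETCH_JUMP_SCALE == 0);
   return (block_end_offset - offset) / SKETCH_JUMP_SCALE;
}

/* Hypothetical helper: UIP of a BREAK. Per the comment in the patch, Gen7
 * points at the WHILE and Gen6 points just after it.
 */
static int
sketch_break_uip(int offset, int loop_end_offset, bool is_gen6)
{
   assert(loop_end_offset > offset);
   return (loop_end_offset - offset + (is_gen6 ? 16 : 0)) / SKETCH_JUMP_SCALE;
}
```

With a BREAK at byte 32, a block end at byte 64 and the WHILE at byte 96, this gives JIP 4, and UIP 8 on Gen7 or 10 on Gen6.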
[Mesa-dev] [PATCH 22/23] i965: Emit compaction stats without walking the assembly.
The instruction count does not include padding NOPs, but the compaction stats do. --- src/mesa/drivers/dri/i965/brw_eu_compact.c | 19 --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 8 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 7 +++ 3 files changed, 15 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c index f40ba04..0560367 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_compact.c +++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c @@ -841,23 +841,4 @@ brw_compact_instructions(struct brw_compile *p, int start_offset, annotation[num_annotations].offset = p->next_insn_offset; } - - if (0) { - fprintf(stderr, "dumping compacted program\n"); - brw_disassemble(brw, store, 0, p->next_insn_offset - start_offset, stderr); - - int cmp = 0; - for (offset = 0; offset < p->next_insn_offset - start_offset;) { - struct brw_instruction *insn = store + offset; - - if (insn->header.cmpt_control) { -offset += 8; -cmp++; - } else { -offset += 16; - } - } - fprintf(stderr, "%db/%db saved (%d%%)\n", cmp * 8, offset + cmp * 8, - cmp * 8 * 100 / (offset + cmp * 8)); - } } diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index f70e7b2..4b2245b 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1827,11 +1827,15 @@ fs_generator::generate_assembly(exec_list *simd8_instructions, struct annotation *annotation; int num_annotations; + int start_offset = p->next_insn_offset; dispatch_width = (i + 1) * 8; generate_code(instructions[i], &num_annotations, &annotation); + + int before_size = p->next_insn_offset - start_offset; brw_compact_instructions(p, prog_data->prog_offset_16, num_annotations, annotation); + int after_size = p->next_insn_offset - start_offset; if (unlikely(debug_flag)) { if (this->prog) { @@ -1847,6 +1851,10 @@ fs_generator::generate_assembly(exec_list 
*simd8_instructions, fprintf(stderr, "Native code for blorp program (SIMD%d dispatch):\n", dispatch_width); } +fprintf(stderr, "SIMD%d shader: %d instructions. Compacted %d to %d" +" bytes (%.0f%%)\n", +dispatch_width, before_size / 16, before_size, after_size, +100.0f * (before_size - after_size) / before_size); dump_assembly(p->store, num_annotations, annotation, brw, prog, brw_disassemble); ralloc_free(annotation); diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 819ed10..affcc90 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -1365,7 +1365,10 @@ vec4_generator::generate_assembly(exec_list *instructions, brw_set_access_mode(p, BRW_ALIGN_16); generate_code(instructions, &num_annotations, &annotation); + + int before_size = p->next_insn_offset; brw_compact_instructions(p, 0, num_annotations, annotation); + int after_size = p->next_insn_offset; if (unlikely(debug_flag)) { if (shader_prog) { @@ -1375,6 +1378,10 @@ vec4_generator::generate_assembly(exec_list *instructions, } else { fprintf(stderr, "Native code for vertex program %d:\n", prog->Id); } + fprintf(stderr, "vec4 shader: %d instructions. Compacted %d to %d" + " bytes (%.0f%%)\n", + before_size / 16, before_size, after_size, + 100.0f * (before_size - after_size) / before_size); dump_assembly(p->store, num_annotations, annotation, brw, prog, brw_disassemble); ralloc_free(annotation); -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
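The numbers in the new fprintf come straight from two byte sizes, so the arithmetic can be checked on its own. The sketch below is not driver code: the struct and function names are hypothetical, and it assumes before_size is nonzero (the fprintf in the patch divides by before_size without a guard, so an empty program would need separate handling).

```c
#include <assert.h>

/* Hypothetical container for the values printed by the patch. */
struct sketch_compaction_stats {
   int instructions;      /* before_size / 16: count of uncompacted instructions */
   int bytes_saved;       /* before_size - after_size */
   double percent_saved;  /* share of the original size that was removed */
};

static struct sketch_compaction_stats
sketch_compute_stats(int before_size, int after_size)
{
   /* Assumption of this sketch: a non-empty program that did not grow. */
   assert(before_size > 0);
   assert(after_size >= 0 && after_size <= before_size);

   struct sketch_compaction_stats s;
   s.instructions = before_size / 16;
   s.bytes_saved = before_size - after_size;
   s.percent_saved = 100.0 * s.bytes_saved / before_size;
   return s;
}
```

For example, 160 bytes compacted to 128 bytes is 10 instructions with 20% saved.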
[Mesa-dev] [PATCH 15/23] i965: Emit ARF:UD for non-present src1 on Gen6+.
Enables the next commits to compact more instructions. --- src/mesa/drivers/dri/i965/brw_eu_emit.c | 28 ++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c index 38d327a..d8efa01 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c @@ -329,10 +329,34 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn, if (reg.file == BRW_IMMEDIATE_VALUE) { insn->bits3.ud = reg.dw1.ud; - /* Required to set some fields in src1 as well: + /* The Bspec's section titled "Non-present Operands" claims that if src0 + * is an immediate that src1's type must be the same as that of src0. + * + * The SNB+ DataTypeIndex instruction compaction tables contain mappings + * that do not follow this rule. E.g., from the IVB/HSW table: + * + * DataTypeIndex 18-Bit Mapping Mapped Meaning + *3 0010101101 r:f | i:vf | a:ud | <1> | dir | + * + * And from the SNB table: + * + * DataTypeIndex 18-Bit Mapping Mapped Meaning + *8 0010001100 a:w | i:w | a:ud | <1> | dir | + * + * Neither of these cause warnings from the simulator when used, + * compacted or otherwise. In fact, all compaction mappings that have an + * immediate in src0 use a:ud for src1. + * + * The GM45 instruction compaction tables do not contain mapped meanings + * so it's not clear whether it has the restriction. We'll assume it was + * lifted on SNB. (FINISHME: decode the GM45 tables and check.) */ insn->bits1.da1.src1_reg_file = 0; /* arf */ - insn->bits1.da1.src1_reg_type = insn->bits1.da1.src0_reg_type; + if (brw->gen < 6) { + insn->bits1.da1.src1_reg_type = insn->bits1.da1.src0_reg_type; + } else { + insn->bits1.da1.src1_reg_type = BRW_HW_REG_TYPE_UD; + } } else { -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 18/23] i965: Print IR annotations only with INTEL_DEBUG=annotation.
Running shader-db without INTEL_DEBUG=annotation reduces the runtime from ~90 to ~80 seconds on my machine. It also reduces the disk space consumed by the .out files from 660 MB (676 on disk) to 343 MB (358 on disk). --- src/mesa/drivers/dri/i965/brw_fs_generator.cpp| 6 -- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 6 -- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 6 -- src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 6 -- src/mesa/drivers/dri/i965/intel_debug.c | 1 + src/mesa/drivers/dri/i965/intel_debug.h | 1 + 6 files changed, 18 insertions(+), 8 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index b0b3b56..872b5a4 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1363,8 +1363,10 @@ fs_generator::generate_code(exec_list *instructions, int *num_annotations, } ann[ann_num].offset = p->next_insn_offset; - ann[ann_num].ir = inst->ir; - ann[ann_num].annotation = inst->annotation; + if (INTEL_DEBUG & DEBUG_ANNOTATION) { +ann[ann_num].ir = inst->ir; +ann[ann_num].annotation = inst->annotation; + } if (cfg->blocks[block_num]->start == inst) { ann[ann_num].block_start = cfg->blocks[block_num]; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp index 2176de4..5980aad 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp @@ -1296,8 +1296,10 @@ vec4_generator::generate_code(exec_list *instructions, int *num_annotations, } ann[ann_num].offset = p->next_insn_offset; - ann[ann_num].ir = inst->ir; - ann[ann_num].annotation = inst->annotation; + if (INTEL_DEBUG & DEBUG_ANNOTATION) { +ann[ann_num].ir = inst->ir; +ann[ann_num].annotation = inst->annotation; + } if (cfg->blocks[block_num]->start == inst) { ann[ann_num].block_start = cfg->blocks[block_num]; diff --git 
a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp index 7e90ee6..9011bff 100644 --- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp @@ -924,8 +924,10 @@ gen8_fs_generator::generate_code(exec_list *instructions, int *num_annotations, } ann[ann_num].offset = next_inst_offset; - ann[ann_num].ir = ir->ir; - ann[ann_num].annotation = ir->annotation; + if (INTEL_DEBUG & DEBUG_ANNOTATION) { +ann[ann_num].ir = ir->ir; +ann[ann_num].annotation = ir->annotation; + } if (cfg->blocks[block_num]->start == ir) { ann[ann_num].block_start = cfg->blocks[block_num]; diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp index 5470f87..4aeaf89 100644 --- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp @@ -878,8 +878,10 @@ gen8_vec4_generator::generate_code(exec_list *instructions, } ann[ann_num].offset = next_inst_offset; - ann[ann_num].ir = ir->ir; - ann[ann_num].annotation = ir->annotation; + if (INTEL_DEBUG & DEBUG_ANNOTATION) { +ann[ann_num].ir = ir->ir; +ann[ann_num].annotation = ir->annotation; + } if (cfg->blocks[block_num]->start == ir) { ann[ann_num].block_start = cfg->blocks[block_num]; diff --git a/src/mesa/drivers/dri/i965/intel_debug.c b/src/mesa/drivers/dri/i965/intel_debug.c index 621a571..64d2c61 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.c +++ b/src/mesa/drivers/dri/i965/intel_debug.c @@ -64,6 +64,7 @@ static const struct dri_debug_control debug_control[] = { { "no16", DEBUG_NO16 }, { "blorp", DEBUG_BLORP }, { "nodualobj", DEBUG_NO_DUAL_OBJECT_GS }, + { "annotation", DEBUG_ANNOTATION }, { NULL,0 } }; diff --git a/src/mesa/drivers/dri/i965/intel_debug.h b/src/mesa/drivers/dri/i965/intel_debug.h index 6402cec..49cc584 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.h +++ b/src/mesa/drivers/dri/i965/intel_debug.h @@ -60,6 +60,7 @@ extern 
uint64_t INTEL_DEBUG; #define DEBUG_NO16 0x20000000 #define DEBUG_VUE 0x40000000 #define DEBUG_NO_DUAL_OBJECT_GS 0x80000000 +#define DEBUG_ANNOTATION 0x100000000 #ifdef HAVE_ANDROID_PLATFORM #define LOG_TAG "INTEL-MESA" -- 1.8.3.2
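Since INTEL_DEBUG is a uint64_t, a flag can live above bit 31, but then every test of it has to stay in 64-bit arithmetic. The sketch below illustrates that with a tiny name-to-flag table in the style of debug_control[]. All identifiers and bit positions are hypothetical, and the comma-separated parser is an assumption of the sketch rather than Mesa's actual debug-string parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical flag values; the second one deliberately sits above bit 31. */
#define SKETCH_DEBUG_BLORP       (UINT64_C(1) << 28)
#define SKETCH_DEBUG_ANNOTATION  (UINT64_C(1) << 32)

struct sketch_debug_control {
   const char *name;
   uint64_t flag;
};

static const struct sketch_debug_control sketch_controls[] = {
   { "blorp",      SKETCH_DEBUG_BLORP },
   { "annotation", SKETCH_DEBUG_ANNOTATION },
   { NULL,         0 }
};

/* Parse a comma-separated list such as "blorp,annotation" into a mask. */
static uint64_t
sketch_parse_debug(const char *env)
{
   uint64_t flags = 0;

   while (env && *env) {
      size_t len = strcspn(env, ",");   /* length of the current token */

      for (const struct sketch_debug_control *c = sketch_controls; c->name; c++) {
         assert(c->flag != 0);          /* every named entry carries a flag */
         if (strlen(c->name) == len && strncmp(c->name, env, len) == 0)
            flags |= c->flag;
      }

      env += len;
      if (*env == ',')
         env++;
   }

   return flags;
}

/* Compare against zero in 64 bits; storing the masked value in an int first
 * would drop bit 32.
 */
static bool
sketch_want_annotations(uint64_t flags)
{
   return (flags & SKETCH_DEBUG_ANNOTATION) != 0;
}
```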
[Mesa-dev] [PATCH 12/23] i965: Move next_offset() to brw_eu.h for use elsewhere.
Also perform arithmetic on char* rather than void* since the latter is a
GNU C extension not available in C++.
---
 src/mesa/drivers/dri/i965/brw_eu.h      | 12 ++++++++++++
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 11 -----------
 2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h
index 8ce31a1..3c89365 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -424,6 +424,18 @@ void brw_debug_compact_uncompact(struct brw_context *brw,
                                  struct brw_instruction *orig,
                                  struct brw_instruction *uncompacted);
 
+static inline int
+next_offset(void *store, int offset)
+{
+   struct brw_instruction *insn =
+      (struct brw_instruction *)((char *)store + offset);
+
+   if (insn->header.cmpt_control)
+      return offset + 8;
+   else
+      return offset + 16;
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index a357d5d..38d327a 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2383,17 +2383,6 @@ void brw_urb_WRITE(struct brw_compile *p,
 }
 
 static int
-next_offset(void *store, int offset)
-{
-   struct brw_instruction *insn = (void *)store + offset;
-
-   if (insn->header.cmpt_control)
-      return offset + 8;
-   else
-      return offset + 16;
-}
-
-static int
 brw_find_next_block_end(struct brw_compile *p, int start_offset)
 {
    int offset;
-- 
1.8.3.2
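To illustrate how a helper like next_offset() gets used once it is shared, here is a minimal sketch of walking a mixed stream of 8-byte and 16-byte instructions. It is not the driver's code: the instruction layout is a stand-in in which a nonzero first byte marks a compacted instruction (the real helper reads header.cmpt_control), and all names are hypothetical. The pointer arithmetic is done on unsigned char *, which is valid in both C and C++, unlike arithmetic on void *.

```c
#include <assert.h>

enum {
   SKETCH_COMPACT_BYTES = 8,
   SKETCH_NATIVE_BYTES = 16
};

/* Stand-in for insn->header.cmpt_control: in this sketch a nonzero first
 * byte marks a compacted instruction.
 */
static int
sketch_is_compacted(const void *store, int offset)
{
   const unsigned char *insn = (const unsigned char *)store + offset;
   return insn[0] != 0;
}

/* Byte offset of the instruction following the one at 'offset'. */
static int
sketch_next_offset(const void *store, int offset)
{
   assert(offset >= 0);
   return offset + (sketch_is_compacted(store, offset) ? SKETCH_COMPACT_BYTES
                                                       : SKETCH_NATIVE_BYTES);
}

/* Walk the whole program and count compacted instructions. */
static int
sketch_count_compacted(const void *store, int end_offset)
{
   int compacted = 0;

   for (int offset = 0; offset < end_offset;
        offset = sketch_next_offset(store, offset)) {
      if (sketch_is_compacted(store, offset))
         compacted++;
   }

   return compacted;
}
```

A 48-byte buffer holding native, compacted, compacted, native instructions is walked in four steps at offsets 0, 16, 24 and 32.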
[Mesa-dev] [PATCH 10/23] i965/gen8/vec4: Print disassembly after compaction.
--- src/mesa/drivers/dri/i965/brw_vec4.h | 3 +- src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp | 103 +- 2 files changed, 63 insertions(+), 43 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 3a1eb12..a3fa42f 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -753,7 +753,8 @@ public: const unsigned *generate_assembly(exec_list *insts, unsigned *asm_size); private: - void generate_code(exec_list *instructions); + void generate_code(exec_list *instructions, int *num_annotations, + struct annotation **annotation); void generate_vec4_instruction(vec4_instruction *inst, struct brw_reg dst, struct brw_reg *src); diff --git a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp index e53fd35..5470f87 100644 --- a/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_vec4_generator.cpp @@ -22,6 +22,7 @@ */ #include "brw_vec4.h" +#include "brw_cfg.h" extern "C" { #include "brw_eu.h" @@ -841,12 +842,10 @@ gen8_vec4_generator::generate_vec4_instruction(vec4_instruction *instruction, } void -gen8_vec4_generator::generate_code(exec_list *instructions) +gen8_vec4_generator::generate_code(exec_list *instructions, + int *num_annotations, + struct annotation **annotation) { - int last_native_inst_offset = 0; - const char *last_annotation_string = NULL; - const void *last_annotation_ir = NULL; - if (unlikely(debug_flag)) { if (shader_prog) { fprintf(stderr, "Native code for %s vertex shader %d:\n", @@ -857,32 +856,52 @@ gen8_vec4_generator::generate_code(exec_list *instructions) } } + int block_num = 0; + int ann_num = 0; + int ann_size = 1024; + cfg_t *cfg = NULL; + struct annotation *ann = NULL; + + if (unlikely(debug_flag)) { + cfg = new(mem_ctx) cfg_t(instructions); + ann = rzalloc_array(NULL, struct annotation, ann_size); + } + foreach_list(node, instructions) { vec4_instruction *ir = 
(vec4_instruction *) node; struct brw_reg src[3], dst; if (unlikely(debug_flag)) { - if (last_annotation_ir != ir->ir) { -last_annotation_ir = ir->ir; -if (last_annotation_ir) { - fprintf(stderr, " "); - if (shader_prog) { - ((ir_instruction *) last_annotation_ir)->fprint(stderr); - } else { - const prog_instruction *vpi; - vpi = (const prog_instruction *) ir->ir; - fprintf(stderr, "%d: ", (int)(vpi - prog->Instructions)); - _mesa_fprint_instruction_opt(stderr, vpi, 0, - PROG_PRINT_DEBUG, NULL); - } - fprintf(stderr, "\n"); -} + if (ann_num == ann_size) { +ann_size *= 2; +ann = reralloc(NULL, ann, struct annotation, ann_size); + } + + ann[ann_num].offset = next_inst_offset; + ann[ann_num].ir = ir->ir; + ann[ann_num].annotation = ir->annotation; + + if (cfg->blocks[block_num]->start == ir) { +ann[ann_num].block_start = cfg->blocks[block_num]; } - if (last_annotation_string != ir->annotation) { -last_annotation_string = ir->annotation; -if (last_annotation_string) - fprintf(stderr, " %s\n", last_annotation_string); + + /* There is no hardware DO instruction on Gen6+, so since DO always + * starts a basic block, we need to set the .block_start of the next + * instruction's annotation with a pointer to the bblock started by + * the DO. + * + * There's also only complication from emitting an annotation without + * a corresponding hardware instruction to disassemble. 
+ */ + if (brw->gen >= 6 && ir->opcode == BRW_OPCODE_DO) { +ann_num--; } + + if (cfg->blocks[block_num]->end == ir) { +ann[ann_num].block_end = cfg->blocks[block_num]; +block_num++; + } + ann_num++; } for (unsigned int i = 0; i < 3; i++) { @@ -908,37 +927,37 @@ gen8_vec4_generator::generate_code(exec_list *instructions) gen8_set_no_dd_clear(last, ir->no_dd_clear); gen8_set_no_dd_check(last, ir->no_dd_check); } - - if (unlikely(debug_flag)) { - gen8_disassemble(brw, store, last_native_inst_offset, next_inst_offset, stderr); - } - - last_native_inst_offset = next_inst_offset; - } - - if (unlikely(debug_flag)) { - fprintf(stderr, "\n"); } patch_jump_targets(); - /* OK
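The annotation array in this patch grows by doubling when ann_num reaches ann_size. A minimal sketch of that pattern follows, using plain realloc() from the C library in place of Mesa's ralloc helpers. The struct and function names are hypothetical. Zeroing the newly added tail is a choice made in this sketch so that fields which are only set conditionally (block_start and block_end in the patch) start out as NULL in grown entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down annotation record. */
struct sketch_annotation {
   int offset;         /* byte offset of the instruction being annotated */
   const char *note;   /* stands in for the ir/annotation pointers */
};

struct sketch_annotation_list {
   struct sketch_annotation *ann;
   int num;    /* entries in use */
   int size;   /* entries allocated */
};

/* Append one record, doubling the allocation when it is full.
 * Returns 1 on success and 0 if the allocation failed.
 */
static int
sketch_annotation_append(struct sketch_annotation_list *list,
                         int offset, const char *note)
{
   assert(list->num <= list->size);

   if (list->num == list->size) {
      int new_size = list->size ? list->size * 2 : 4;
      struct sketch_annotation *grown =
         realloc(list->ann, (size_t)new_size * sizeof(*grown));
      if (!grown)
         return 0;

      /* Zero only the new tail; existing entries are preserved by realloc. */
      memset(grown + list->size, 0,
             (size_t)(new_size - list->size) * sizeof(*grown));
      list->ann = grown;
      list->size = new_size;
   }

   list->ann[list->num].offset = offset;
   list->ann[list->num].note = note;
   list->num++;
   return 1;
}
```

Starting from an empty list, ten appends grow the allocation 4, 8, 16; the caller releases it with free(list.ann).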
[Mesa-dev] [PATCH 17/23] i965: Switch types D->UD when possible to allow compaction.
Number of compacted instructions: 827404 -> 833045 (0.68%)
---
 src/mesa/drivers/dri/i965/brw_eu_emit.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 1810233..ab00d7c 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -295,6 +295,16 @@ validate_reg(struct brw_instruction *insn, struct brw_reg reg)
    /* 10. Check destination issues. */
 }
 
+static bool
+is_compactable_immediate(unsigned imm)
+{
+   /* We get the low 12 bits as-is. */
+   imm &= ~0xfff;
+
+   /* We get one bit replicated through the top 20 bits. */
+   return imm == 0 || imm == 0xfffff000;
+}
+
 void
 brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
              struct brw_reg reg)
@@ -373,6 +383,16 @@ brw_set_src0(struct brw_compile *p, struct brw_instruction *insn,
           insn->bits1.da1.src0_reg_type == BRW_HW_REG_TYPE_F) {
          insn->bits1.da1.src0_reg_type = BRW_HW_REG_IMM_TYPE_VF;
       }
+
+      /* There are no mappings for dst:d | i:d, so if the immediate is suitable
+       * set the types to :UD so the instruction can be compacted.
+       */
+      if (is_compactable_immediate(insn->bits3.ud) &&
+          insn->bits1.da1.src0_reg_type == BRW_HW_REG_TYPE_D &&
+          insn->bits1.da1.dest_reg_type == BRW_HW_REG_TYPE_D) {
+         insn->bits1.da1.src0_reg_type = BRW_HW_REG_TYPE_UD;
+         insn->bits1.da1.dest_reg_type = BRW_HW_REG_TYPE_UD;
+      }
    }
    else
    {
-- 
1.8.3.2
[Mesa-dev] [PATCH 14/23] i965: Support compacted instructions with immediate sources.
Note the weirdness with src1 subregs. The compacted immediate fields are uncompacted to bits [127:96] and the high five bits of the subreg mapping maps to bits [100:96]. Number of compacted instructions: 790085 -> 817752 (3.50%) --- src/mesa/drivers/dri/i965/brw_eu_compact.c | 83 +++--- 1 file changed, 63 insertions(+), 20 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c index f6f055f..f40ba04 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_compact.c +++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c @@ -373,13 +373,16 @@ set_datatype_index(struct brw_compact_instruction *dst, static bool set_subreg_index(struct brw_compact_instruction *dst, - struct brw_instruction *src) + struct brw_instruction *src, + bool is_immediate) { uint16_t uncompacted = 0; uncompacted |= src->bits1.da1.dest_subreg_nr << 0; uncompacted |= src->bits2.da1.src0_subreg_nr << 5; - uncompacted |= src->bits3.da1.src1_subreg_nr << 10; + + if (!is_immediate) + uncompacted |= src->bits3.da1.src1_subreg_nr << 10; for (int i = 0; i < 32; i++) { if (subreg_table[i] == uncompacted) { @@ -424,20 +427,40 @@ set_src0_index(struct brw_compact_instruction *dst, static bool set_src1_index(struct brw_compact_instruction *dst, - struct brw_instruction *src) + struct brw_instruction *src, bool is_immediate) { - uint16_t compacted, uncompacted = 0; + if (is_immediate) { + dst->dw1.src1_index = (src->bits3.ud >> 8) & 0x1f; + } else { + uint16_t compacted, uncompacted; - uncompacted |= (src->bits3.ud >> 13) & 0xfff; + uncompacted = (src->bits3.ud >> 13) & 0xfff; - if (!get_src_index(uncompacted, &compacted)) - return false; + if (!get_src_index(uncompacted, &compacted)) + return false; - dst->dw1.src1_index = compacted; + dst->dw1.src1_index = compacted; + } return true; } +/* Compacted instructions have 12-bits for immediate sources, and a 13th bit + * that's replicated through the high 20 bits. 
+ * + * Effectively this means we get 12-bit integers, 0.0f, and some limited uses + * of packed vectors as compactable immediates. + */ +static bool +is_compactable_immediate(unsigned imm) +{ + /* We get the low 12 bits as-is. */ + imm &= ~0xfff; + + /* We get one bit replicated through the top 20 bits. */ + return imm == 0 || imm == 0xf000; +} + /** * Tries to compact instruction src into dst. * @@ -464,10 +487,11 @@ brw_try_compact_instruction(struct brw_compile *p, return false; } - /* FINISHME: immediates */ - if (src->bits1.da1.src0_reg_file == BRW_IMMEDIATE_VALUE || - src->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE) + bool is_immediate = src->bits1.da1.src0_reg_file == BRW_IMMEDIATE_VALUE || + src->bits1.da1.src1_reg_file == BRW_IMMEDIATE_VALUE; + if (is_immediate && !is_compactable_immediate(src->bits3.ud)) { return false; + } memset(&temp, 0, sizeof(temp)); @@ -477,7 +501,7 @@ brw_try_compact_instruction(struct brw_compile *p, return false; if (!set_datatype_index(&temp, src)) return false; - if (!set_subreg_index(&temp, src)) + if (!set_subreg_index(&temp, src, is_immediate)) return false; temp.dw0.acc_wr_control = src->header.acc_wr_control; temp.dw0.conditionalmod = src->header.destreg__conditionalmod; @@ -486,11 +510,15 @@ brw_try_compact_instruction(struct brw_compile *p, temp.dw0.cmpt_ctrl = 1; if (!set_src0_index(&temp, src)) return false; - if (!set_src1_index(&temp, src)) + if (!set_src1_index(&temp, src, is_immediate)) return false; temp.dw1.dst_reg_nr = src->bits1.da1.dest_reg_nr; temp.dw1.src0_reg_nr = src->bits2.da1.src0_reg_nr; - temp.dw1.src1_reg_nr = src->bits3.da1.src1_reg_nr; + if (is_immediate) { + temp.dw1.src1_reg_nr = src->bits3.ud & 0xff; + } else { + temp.dw1.src1_reg_nr = src->bits3.da1.src1_reg_nr; + } *dst = temp; @@ -547,11 +575,17 @@ set_uncompacted_src0(struct brw_instruction *dst, static void set_uncompacted_src1(struct brw_instruction *dst, - struct brw_compact_instruction *src) + struct brw_compact_instruction *src, 
bool is_immediate) { - uint16_t uncompacted = src_index_table[src->dw1.src1_index]; - - dst->bits3.ud |= uncompacted << 13; + if (is_immediate) { + signed high5 = src->dw1.src1_index; + /* Replicate top bit of src1_index into high 20 bits of the immediate. */ + dst->bits3.ud = (high5 << 27) >> 19; + } else { + uint16_t uncompacted = src_index_table[src->dw1.src1_index]; + + dst->bits3.ud |= uncompacted << 13; + } } void @@ -566,16 +600,25 @@ brw_uncompact_instruction(struct brw_context *brw, set_uncompacted_control(brw, dst, src); set_uncompacted_datatype(dst, src
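The immediate handling in this patch amounts to a 13-bit sign-extended encoding: the low 8 bits go into src1_reg_nr, the next 5 bits into src1_index, and the top stored bit is replicated upward on uncompaction. A round-trip sketch follows. The names are hypothetical and this is not the driver's code. It assumes the second constant in is_compactable_immediate() is 0xfffff000, which is what the comment about the top 20 bits implies; the shorter constant shown in the archived text of this message looks like extraction damage. It also sign-extends with an explicit mask rather than the (high5 << 27) >> 19 idiom in the patch, which depends on how the compiler shifts signed values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the upper 20 bits are all zeros or all ones, i.e. the value
 * survives being stored as a 13-bit sign-extended quantity.
 */
static bool
sketch_is_compactable_immediate(uint32_t imm)
{
   imm &= ~UINT32_C(0xfff);                        /* low 12 bits are kept as-is */
   return imm == 0 || imm == UINT32_C(0xfffff000); /* assumed constant, see above */
}

/* Hypothetical view of the two compact fields that carry the immediate. */
struct sketch_compact_imm {
   uint8_t src1_reg_nr;   /* immediate bits [7:0] */
   uint8_t src1_index;    /* immediate bits [12:8] */
};

static struct sketch_compact_imm
sketch_pack_immediate(uint32_t imm)
{
   assert(sketch_is_compactable_immediate(imm));

   struct sketch_compact_imm c;
   c.src1_reg_nr = imm & 0xff;
   c.src1_index = (imm >> 8) & 0x1f;
   return c;
}

static uint32_t
sketch_unpack_immediate(struct sketch_compact_imm c)
{
   uint32_t imm = ((uint32_t)c.src1_index << 8) | c.src1_reg_nr;

   /* Bit 12 is the top stored bit; replicate it through bits [31:13]. */
   if (imm & UINT32_C(0x1000))
      imm |= UINT32_C(0xffffe000);

   return imm;
}
```

Under these assumptions small positive and negative integers and 0.0f (all-zero bits) round-trip, while 1.0f (0x3f800000) is rejected.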
[Mesa-dev] [PATCH 06/23] i965/fs: Make patch_discard_jumps_to_fb_writes return bool.
... to tell us whether it emitted any code. Will be used to determine whether we need to skip an annotation for it. --- src/mesa/drivers/dri/i965/brw_fs.h | 4 ++-- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 5 +++-- src/mesa/drivers/dri/i965/gen8_fs_generator.cpp | 5 +++-- 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 8acad2f..111e994 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -696,7 +696,7 @@ private: struct brw_reg dst, struct brw_reg surf_index); - void patch_discard_jumps_to_fb_writes(); + bool patch_discard_jumps_to_fb_writes(); struct brw_context *brw; struct gl_context *ctx; @@ -788,7 +788,7 @@ private: struct brw_reg surf_index); void generate_discard_jump(fs_inst *ir); - void patch_discard_jumps_to_fb_writes(); + bool patch_discard_jumps_to_fb_writes(); const struct brw_wm_prog_key *const key; struct brw_wm_prog_data *prog_data; diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index 0fcf527..132d5cd 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -59,11 +59,11 @@ fs_generator::~fs_generator() { } -void +bool fs_generator::patch_discard_jumps_to_fb_writes() { if (brw->gen < 6 || this->discard_halt_patches.is_empty()) - return; + return false; /* There is a somewhat strange undocumented requirement of using * HALT, according to the simulator. 
If some channel has HALTed to @@ -92,6 +92,7 @@ fs_generator::patch_discard_jumps_to_fb_writes() } this->discard_halt_patches.make_empty(); + return true; } void diff --git a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp index 294ce46..9df5b73 100644 --- a/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/gen8_fs_generator.cpp @@ -639,11 +639,11 @@ gen8_fs_generator::generate_discard_jump(fs_inst *ir) HALT(); } -void +bool gen8_fs_generator::patch_discard_jumps_to_fb_writes() { if (discard_halt_patches.is_empty()) - return; + return false; /* There is a somewhat strange undocumented requirement of using * HALT, according to the simulator. If some channel has HALTed to @@ -672,6 +672,7 @@ gen8_fs_generator::patch_discard_jumps_to_fb_writes() } this->discard_halt_patches.make_empty(); + return true; } /** -- 1.8.3.2
[Mesa-dev] [PATCH 03/23] i965/fs: Don't hardcode DEBUG_WM in generic fs code.
Similar to Paul's commit e9fa3a944 except brw_fs_generator's debug_flag is for DEBUG_WM and DEBUG_BLORP. --- src/mesa/drivers/dri/i965/brw_blorp_blit.cpp| 13 +++-- src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp | 17 - src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h | 2 +- src/mesa/drivers/dri/i965/brw_fs.cpp| 3 ++- src/mesa/drivers/dri/i965/brw_fs.h | 4 +++- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 16 +--- 6 files changed, 26 insertions(+), 29 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp index fe75100..3da6388 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit.cpp @@ -517,7 +517,7 @@ class brw_blorp_blit_program : public brw_blorp_eu_emitter { public: brw_blorp_blit_program(struct brw_context *brw, - const brw_blorp_blit_prog_key *key); + const brw_blorp_blit_prog_key *key, bool debug_flag); const GLuint *compile(struct brw_context *brw, GLuint *program_size, FILE *dump_file = stderr); @@ -624,8 +624,9 @@ private: brw_blorp_blit_program::brw_blorp_blit_program( struct brw_context *brw, - const brw_blorp_blit_prog_key *key) - : brw_blorp_eu_emitter(brw), + const brw_blorp_blit_prog_key *key, + bool debug_flag) + : brw_blorp_eu_emitter(brw, debug_flag), brw(brw), key(key) { @@ -2142,7 +2143,8 @@ brw_blorp_blit_params::get_wm_prog(struct brw_context *brw, if (!brw_search_cache(&brw->cache, BRW_BLORP_BLIT_PROG, &this->wm_prog_key, sizeof(this->wm_prog_key), &prog_offset, prog_data)) { - brw_blorp_blit_program prog(brw, &this->wm_prog_key); + brw_blorp_blit_program prog(brw, &this->wm_prog_key, + INTEL_DEBUG & DEBUG_BLORP); GLuint program_size; const GLuint *program = prog.compile(brw, &program_size, stderr); brw_upload_cache(&brw->cache, BRW_BLORP_BLIT_PROG, @@ -2160,7 +2162,6 @@ brw_blorp_blit_test_compile(struct brw_context *brw, FILE *out) { GLuint program_size; - brw_blorp_blit_program prog(brw, key); - INTEL_DEBUG |= DEBUG_BLORP; + 
brw_blorp_blit_program prog(brw, key, true /* debug_flag */); prog.compile(brw, &program_size, out); } diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp index 3549173..4910b6c 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.cpp @@ -25,12 +25,13 @@ #include "brw_blorp_blit_eu.h" #include "brw_blorp.h" -brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw) +brw_blorp_eu_emitter::brw_blorp_eu_emitter(struct brw_context *brw, + bool debug_flag) : mem_ctx(ralloc_context(NULL)), generator(brw, mem_ctx, rzalloc(mem_ctx, struct brw_wm_prog_key), rzalloc(mem_ctx, struct brw_wm_prog_data), - NULL, NULL, false) + NULL, NULL, false, debug_flag) { } @@ -42,17 +43,7 @@ brw_blorp_eu_emitter::~brw_blorp_eu_emitter() const unsigned * brw_blorp_eu_emitter::get_program(unsigned *program_size, FILE *dump_file) { - const unsigned *res; - - if (unlikely(INTEL_DEBUG & DEBUG_BLORP)) { - fprintf(stderr, "Native code for BLORP blit:\n"); - res = generator.generate_assembly(NULL, &insts, program_size, dump_file); - fprintf(stderr, "\n"); - } else { - res = generator.generate_assembly(NULL, &insts, program_size); - } - - return res; + return generator.generate_assembly(NULL, &insts, program_size, dump_file); } /** diff --git a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h index e68f925..8a93f05 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h +++ b/src/mesa/drivers/dri/i965/brw_blorp_blit_eu.h @@ -30,7 +30,7 @@ class brw_blorp_eu_emitter { protected: - explicit brw_blorp_eu_emitter(struct brw_context *brw); + explicit brw_blorp_eu_emitter(struct brw_context *brw, bool debug_flag); ~brw_blorp_eu_emitter(); const unsigned *get_program(unsigned *program_size, FILE *dump_file); diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 606a160..0c9aeeb 100644 --- 
a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -3166,7 +3166,8 @@ brw_wm_fs_emit(struct brw_context *brw, assembly = g.generate_assembly(&v.instructions, simd16_instructions, final_assembly_size); } else { - fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src); + fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src, +
[Mesa-dev] [PATCH 01/23] i965/cfg: Make DO instruction begin a basic block.
The DO instruction doesn't exist on Gen6+. Before this commit, DO always ended a basic block, so if it also happened to start one (e.g., a while loop inside an if statement), the block containing only the DO would actually contain no hardware instructions. Pre-Gen6's WHILE instruction jumps to the instruction following the DO, so strictly speaking we won't be modeling that properly, but I claim there is actually no functional difference. This will simplify an upcoming change where we want to mark the first hardware instruction in the loop as beginning a block, and the last instruction before the loop as ending one. --- src/mesa/drivers/dri/i965/brw_cfg.cpp | 21 - 1 file changed, 12 insertions(+), 9 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_cfg.cpp b/src/mesa/drivers/dri/i965/brw_cfg.cpp index a806714..6bf99f1 100644 --- a/src/mesa/drivers/dri/i965/brw_cfg.cpp +++ b/src/mesa/drivers/dri/i965/brw_cfg.cpp @@ -98,7 +98,7 @@ cfg_t::cfg_t(exec_list *instructions) bblock_t *cur_if = NULL;/**< BB ending with IF. */ bblock_t *cur_else = NULL; /**< BB ending with ELSE. */ bblock_t *cur_endif = NULL; /**< BB starting with ENDIF. */ - bblock_t *cur_do = NULL;/**< BB ending with DO. */ + bblock_t *cur_do = NULL;/**< BB starting with DO. */ bblock_t *cur_while = NULL; /**< BB immediately following WHILE. */ exec_list if_stack, else_stack, do_stack, while_stack; bblock_t *next; @@ -205,15 +205,18 @@ cfg_t::cfg_t(exec_list *instructions) */ cur_while = new_block(); -/* Set up our immediately following block, full of "then" - * instructions. - */ -next = new_block(); -next->start = (backend_instruction *)inst->next; -cur->add_successor(mem_ctx, next); -cur_do = next; + if (cur->start == inst) { +/* New block was just created; use it. 
*/ +cur_do = cur; + } else { +cur_do = new_block(); +cur_do->start = inst; -set_next_block(&cur, next, ip); +cur->end = (backend_instruction *)inst->prev; +cur->add_successor(mem_ctx, cur_do); + +set_next_block(&cur, cur_do, ip - 1); + } break; case BRW_OPCODE_CONTINUE: -- 1.8.3.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
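The block-splitting rule described in this patch can be sketched on a toy instruction list. This is only an illustration under simplifying assumptions: `Op`, `Block` and `build_blocks` are hypothetical names, successor edges are omitted, and IF/ENDIF handling is reduced to "IF ends a block".

```cpp
#include <cassert>
#include <vector>

enum class Op { MOV, IF, ENDIF, DO, WHILE };

// Hypothetical block record: inclusive instruction index range.
struct Block { int start; int end; };

// Splits a linear instruction list so that every DO starts a block
// (rather than ending one) and every IF or WHILE ends one.
inline std::vector<Block> build_blocks(const std::vector<Op> &insts) {
    std::vector<Block> blocks;
    int start = 0;
    const int n = static_cast<int>(insts.size());
    for (int ip = 0; ip < n; ip++) {
        if (insts[ip] == Op::DO && ip > start) {
            // Close the current block just before the DO.
            blocks.push_back({start, ip - 1});
            start = ip;
        }
        // When ip == start, the DO reuses the block that was just opened.
        if (insts[ip] == Op::IF || insts[ip] == Op::WHILE) {
            blocks.push_back({start, ip});
            start = ip + 1;
        }
    }
    if (start < n)
        blocks.push_back({start, n - 1});
    return blocks;
}
```

For a loop directly inside an if, the DO shares a block with the loop body, so no block consists of the DO alone.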
[Mesa-dev] [PATCH 02/23] i965: Pass in start_offset to brw_compact_instructions().
Lets us avoid recompacting the SIMD8 instructions when we compact the SIMD16 program. --- src/mesa/drivers/dri/i965/brw_blorp_clear.cpp| 2 +- src/mesa/drivers/dri/i965/brw_clip.c | 2 +- src/mesa/drivers/dri/i965/brw_eu.h | 2 +- src/mesa/drivers/dri/i965/brw_eu_compact.c | 18 +- src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 ++-- src/mesa/drivers/dri/i965/brw_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_sf.c | 2 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 2 +- 8 files changed, 17 insertions(+), 17 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp index 28c01c4..4b2c667 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp @@ -490,7 +490,7 @@ brw_blorp_const_color_program::compile(struct brw_context *brw, fprintf(stderr, "\n"); } - brw_compact_instructions(&func); + brw_compact_instructions(&func, 0); return brw_get_program(&func, program_size); } diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index 11f0b69..57c49f0 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -110,7 +110,7 @@ static void compile_clip_prog( struct brw_context *brw, return; } - brw_compact_instructions(&c.func); + brw_compact_instructions(&c.func, 0); /* get the program */ diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 51d5214..65008a0 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -410,7 +410,7 @@ uint32_t brw_swap_cmod(uint32_t cmod); /* brw_eu_compact.c */ void brw_init_compaction_tables(struct brw_context *brw); -void brw_compact_instructions(struct brw_compile *p); +void brw_compact_instructions(struct brw_compile *p, int start_offset); void brw_uncompact_instruction(struct brw_context *brw, struct brw_instruction *dst, struct brw_compact_instruction *src); diff --git 
a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c index c85bc89..c3a2ec3 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_compact.c +++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c @@ -661,18 +661,18 @@ brw_init_compaction_tables(struct brw_context *brw) } void -brw_compact_instructions(struct brw_compile *p) +brw_compact_instructions(struct brw_compile *p, int start_offset) { struct brw_context *brw = p->brw; - void *store = p->store; + void *store = p->store + start_offset / 16; /* For an instruction at byte offset 8*i before compaction, this is the number * of compacted instructions that preceded it. */ - int compacted_counts[p->next_insn_offset / 8]; + int compacted_counts[(p->next_insn_offset - start_offset) / 8]; /* For an instruction at byte offset 8*i after compaction, this is the * 8-byte offset it was at before compaction. */ - int old_ip[p->next_insn_offset / 8]; + int old_ip[(p->next_insn_offset - start_offset) / 8]; if (brw->gen < 6) return; @@ -680,7 +680,7 @@ brw_compact_instructions(struct brw_compile *p) int src_offset; int offset = 0; int compacted_count = 0; - for (src_offset = 0; src_offset < p->nr_insn * 16;) { + for (src_offset = 0; src_offset < p->next_insn_offset - start_offset;) { struct brw_instruction *src = store + src_offset; void *dst = store + offset; @@ -734,8 +734,8 @@ brw_compact_instructions(struct brw_compile *p) } /* Fix up control flow offsets. 
*/ - p->next_insn_offset = offset; - for (offset = 0; offset < p->next_insn_offset;) { + p->next_insn_offset = start_offset + offset; + for (offset = 0; offset < p->next_insn_offset - start_offset;) { struct brw_instruction *insn = store + offset; int this_old_ip = old_ip[offset / 8]; int this_compacted_count = compacted_counts[this_old_ip]; @@ -786,10 +786,10 @@ brw_compact_instructions(struct brw_compile *p) if (0) { fprintf(stderr, "dumping compacted program\n"); - brw_disassemble(brw, p->store, 0, p->next_insn_offset, stderr); + brw_disassemble(brw, store, 0, p->next_insn_offset - start_offset, stderr); int cmp = 0; - for (offset = 0; offset < p->next_insn_offset;) { + for (offset = 0; offset < p->next_insn_offset - start_offset;) { struct brw_instruction *insn = store + offset; if (insn->header.cmpt_control) { diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp index c61cc5c..9518e72 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp @@ -1852,7 +1852,7 @@ fs_generator::generate_assembly(exec_list *simd8_inst
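The idea of compacting only from a given start offset can be sketched with a toy model. `ToyInsn` and `compact_from` are hypothetical names for this example only; the sketch assumes 16-byte instructions that shrink to 8 bytes when compacted and ignores jump fix-ups entirely.

```cpp
#include <cassert>
#include <vector>

// Hypothetical instruction model: 16 bytes uncompacted, 8 bytes compacted.
struct ToyInsn {
    bool can_compact;  // whether a compact encoding exists
    bool compacted;    // whether it has already been compacted
    int size() const { return compacted ? 8 : 16; }
};

// Compacts only instructions located at or after start_offset (in bytes)
// and returns the new total program size in bytes. Instructions before
// start_offset are left exactly as they are.
inline int compact_from(std::vector<ToyInsn> &prog, int start_offset) {
    int offset = 0;
    for (ToyInsn &insn : prog) {
        if (offset >= start_offset && insn.can_compact)
            insn.compacted = true;
        offset += insn.size();
    }
    return offset;
}
```

A caller would compact the SIMD8 part with a start offset of 0, append the SIMD16 instructions, and then pass the size returned by the first call as the start offset of the second call.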
[Mesa-dev] [PATCH 05/23] i965: Add annotation data structure and support code.
Will be used to print disassembly after jump targets are set and instructions are compacted, while still retaining higher-level IR annotations and basic block information. An array of 'struct annotation' will live along side the generated assembly. The generators will populate the array with their IR annotations, and basic block pointers if the instructions began or ended a basic block pointer. We'll then update the instruction offset when we compact instructions and then using the annotations print the disassembly. --- src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/brw_blorp_clear.cpp| 2 +- src/mesa/drivers/dri/i965/brw_clip.c | 2 +- src/mesa/drivers/dri/i965/brw_eu.h | 4 +- src/mesa/drivers/dri/i965/brw_eu_compact.c | 31 - src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 4 +- src/mesa/drivers/dri/i965/brw_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_sf.c | 2 +- src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 2 +- src/mesa/drivers/dri/i965/intel_asm_printer.c| 89 src/mesa/drivers/dri/i965/intel_asm_printer.h| 53 ++ 11 files changed, 183 insertions(+), 9 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/intel_asm_printer.c create mode 100644 src/mesa/drivers/dri/i965/intel_asm_printer.h diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index 5fc90b5..2570059 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -3,6 +3,7 @@ i965_INCLUDES = \ $(MESA_TOP)/src/mesa/drivers/dri/intel i965_FILES = \ + intel_asm_printer.c \ intel_batchbuffer.c \ intel_blit.c \ intel_buffer_objects.c \ diff --git a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp index 4b2c667..ea0065a 100644 --- a/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp +++ b/src/mesa/drivers/dri/i965/brw_blorp_clear.cpp @@ -490,7 +490,7 @@ brw_blorp_const_color_program::compile(struct brw_context *brw, fprintf(stderr, "\n"); } - 
brw_compact_instructions(&func, 0); + brw_compact_instructions(&func, 0, 0, NULL); return brw_get_program(&func, program_size); } diff --git a/src/mesa/drivers/dri/i965/brw_clip.c b/src/mesa/drivers/dri/i965/brw_clip.c index 57c49f0..536c085 100644 --- a/src/mesa/drivers/dri/i965/brw_clip.c +++ b/src/mesa/drivers/dri/i965/brw_clip.c @@ -110,7 +110,7 @@ static void compile_clip_prog( struct brw_context *brw, return; } - brw_compact_instructions(&c.func, 0); + brw_compact_instructions(&c.func, 0, 0, NULL); /* get the program */ diff --git a/src/mesa/drivers/dri/i965/brw_eu.h b/src/mesa/drivers/dri/i965/brw_eu.h index 65008a0..8ce31a1 100644 --- a/src/mesa/drivers/dri/i965/brw_eu.h +++ b/src/mesa/drivers/dri/i965/brw_eu.h @@ -37,6 +37,7 @@ #include "brw_structs.h" #include "brw_defines.h" #include "brw_reg.h" +#include "intel_asm_printer.h" #include "program/prog_instruction.h" #ifdef __cplusplus @@ -410,7 +411,8 @@ uint32_t brw_swap_cmod(uint32_t cmod); /* brw_eu_compact.c */ void brw_init_compaction_tables(struct brw_context *brw); -void brw_compact_instructions(struct brw_compile *p, int start_offset); +void brw_compact_instructions(struct brw_compile *p, int start_offset, + int num_annotations, struct annotation *annotation); void brw_uncompact_instruction(struct brw_context *brw, struct brw_instruction *dst, struct brw_compact_instruction *src); diff --git a/src/mesa/drivers/dri/i965/brw_eu_compact.c b/src/mesa/drivers/dri/i965/brw_eu_compact.c index c3a2ec3..40d1fc2 100644 --- a/src/mesa/drivers/dri/i965/brw_eu_compact.c +++ b/src/mesa/drivers/dri/i965/brw_eu_compact.c @@ -39,6 +39,7 @@ #include "brw_context.h" #include "brw_eu.h" +#include "intel_asm_printer.h" static const uint32_t gen6_control_index_table[32] = { 0b0, @@ -661,7 +662,8 @@ brw_init_compaction_tables(struct brw_context *brw) } void -brw_compact_instructions(struct brw_compile *p, int start_offset) +brw_compact_instructions(struct brw_compile *p, int start_offset, + int num_annotations, struct 
annotation *annotation) { struct brw_context *brw = p->brw; void *store = p->store + start_offset / 16; @@ -784,6 +786,33 @@ brw_compact_instructions(struct brw_compile *p, int start_offset) } p->nr_insn = p->next_insn_offset / 16; + /* Update the instruction offsets for each annotation. */ + if (annotation) { + for (int offset = 0, i = 0; i < num_annotations; i++) { + while (start_offset + old_ip[offset / 8] * 8 != annotation[i].offset) { +assert(start_offset + old_ip[offset / 8] * 8 < + annotation[i].offset); +
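Updating annotation offsets after compaction can be sketched roughly as follows. This is not the loop from the patch: `ToyAnnotation` and `update_annotations` are hypothetical, and the sketch assumes that annotations are sorted by offset and always point at instruction boundaries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical annotation record holding only a byte offset.
struct ToyAnnotation { int offset; };

// old_sizes[i] and new_sizes[i] give the size in bytes of instruction i
// before and after compaction. Each annotation that pointed at the old
// offset of an instruction is moved to that instruction's new offset.
inline void update_annotations(std::vector<ToyAnnotation> &annotations,
                               const std::vector<int> &old_sizes,
                               const std::vector<int> &new_sizes) {
    std::size_t a = 0;
    int old_off = 0, new_off = 0;
    for (std::size_t i = 0; i < old_sizes.size() && a < annotations.size(); i++) {
        while (a < annotations.size() && annotations[a].offset == old_off) {
            annotations[a].offset = new_off;
            a++;
        }
        old_off += old_sizes[i];
        new_off += new_sizes[i];
    }
}
```

Walking both offset sequences in lockstep keeps the update linear in the number of instructions plus annotations.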
[Mesa-dev] [PATCH 00/23] i965: Instruction compaction improvements.
Available from git://people.freedesktop.org/~mattst88/mesa compaction

Highlights
- Print disassembly after instruction compaction, while still having control-flow graph information and higher-level IR annotations.
- Three improvements to instruction compaction increase the number of compacted instructions by 5.4%.
- Adds INTEL_DEBUG=annotation to control printing of higher-level IR annotations. Significantly reduces the time and space requirements of shader-db.
- Print instruction counts and compaction stats with each shader. This will let shader-db parse them directly, rather than counting instructions.
Re: [Mesa-dev] [PATCH] docs: update the prerequisites section
LGTM Reviewed-by: Ian Romanick On 05/19/2014 07:17 AM, Brian Paul wrote: > SCons is required for Windows. Add links to flex/bison for Windows. > Reorder items and improve formatting. > --- > docs/install.html | 15 --- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/docs/install.html b/docs/install.html > index 5061ede..f12425f 100644 > --- a/docs/install.html > +++ b/docs/install.html > @@ -34,16 +34,25 @@ > > 1.1 General > > +http://www.python.org/";>Python - Python is required. > +Version 2.6.4 or later should work. > + > + > +http://www.scons.org/";>SCons is required for building on > +Windows and optional for Linux (it's an alternative to autoconf/automake.) > + > + > lex / yacc - for building the GLSL compiler. > + > + > On Linux systems, flex and bison are used. > Versions 2.5.35 and 2.4.1, respectively, (or later) should work. > > > On Windows with MinGW, install flex and bison with: > mingw-get install msys-flex msys-bison > - > -python - Python is needed for building the Gallium components. > -Version 2.6.4 or later should work. > +For MSVC on Windows, you can find flex/bison programs on the > +ftp://ftp.freedesktop.org/pub/mesa/windows-utils/";>Mesa ftp > site. > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/4] Revert "i965: Don't _swrast_BlitFramebuffer when doing CopyTexSubImage."
Thanks for the quick fix. :) Series is Reviewed-by: Ian Romanick On 05/18/2014 11:12 PM, Kenneth Graunke wrote: > This reverts commit bd44ac8b5ca08016bb064b37edaec95eccfdbcd5. > > Fixes: > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78842 > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78843 > > Re-breaks: > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77705 > but that will be fixed properly in a few commits. > > Cc: "10.2" > --- > src/mesa/drivers/common/meta_blit.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/drivers/common/meta_blit.c > b/src/mesa/drivers/common/meta_blit.c > index e5a0a9a..beb1ea5 100644 > --- a/src/mesa/drivers/common/meta_blit.c > +++ b/src/mesa/drivers/common/meta_blit.c > @@ -732,7 +732,7 @@ _mesa_meta_BlitFramebuffer(struct gl_context *ctx, > _mesa_meta_end(ctx); > > fallback: > - if (mask && !ctx->Meta->Blit.no_ctsi_fallback) { > + if (mask) { >_swrast_BlitFramebuffer(ctx, srcX0, srcY0, srcX1, srcY1, >dstX0, dstY0, dstX1, dstY1, mask, filter); > } > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.
On 05/19/2014 06:39 AM, Marek Olšák wrote: > You are complicating it. If we followed the specification to the > letter, the driver would have to advertise OpenGL 1.1 instead of 2.1. > > The fact r300 cannot filter floating-point textures is documented by > the vendor and game developers (especially those who targeted D3D9) > knew about it. > > For OpenGL ES, I propose a simpler solution: > - don't touch ARB_texture_float at all > - add OES_texture_float to gl_extensions > - add OES_texture_float_linear to gl_extensions > - define OES_texture_half_float as o(OES_texture_float) > - define OES_texture_half_float_linear as o(OES_texture_float_linear) > > Then, drivers can enable the extensions as they see fit. That sounds like a happy medium. It seems like we could use ARB_texture_float as the enable for OES_texture_float, but I'm not crying over one extra flag. It will mean that a bunch of extension checks in the code will need to be expanded. We'll probably also want a negative test that verifies an error is generated for glTexParameteri(..., GL_LINEAR_MIPMAP_LINEAR) when OES_texture_float_linear (or OES_texture_half_float_linear) is not supported. > Marek > > On Mon, May 19, 2014 at 8:34 AM, Rogovin, Kevin > wrote: >> Hi, >> >> Each of the four extensions are right now set to be advertised if and only >> if a GL context would advertise GL_ARB_texture_float: >> >> { "GL_OES_texture_float", o(ARB_texture_float), >>ES2,2005 }, >> { "GL_OES_texture_half_float", o(ARB_texture_float), >>ES2,2005 }, >> { "GL_OES_texture_float_linear",o(ARB_texture_float), >>ES2,2005 }, >> { "GL_OES_texture_half_float_linear", o(ARB_texture_float), >>ES2,2005 }, >> >> From my interpretation of ARB_texture_float, that extension requires both >> 16-bit and 32-bit textures and ability to filter linearly such textures. Did >> I misunderstand the specification? 
If I got the specification correct, then >> the r300 should not be advertising any of the extensions for otherwise it >> would be advertising GL_ARB_texture_float. >> >> However, the r300 does give an example of ability to support some of the OES >> extensions but not all. Previously Matt asked if there an example or need >> and I thought not. It turns out I was wrong and there is a need atleast for >> the r300. Supporting that granularity is going to be a bigger patch since it >> would require changing the data structure struct gl_extensions to have four >> entries and in turn additional logic to combine them to >> GL_ARB_texture_float. The correct and more work way to do it would be to >> remove ARB_texture_float from gl_extension, add a GLboolean for each of the >> 4 OES extensions, change each driver to correctly fill them and then >> additional logic in creating extension string(s) to check if each of the 4 >> OES extensions are TRUE then to advertise GL_ARB_texture_float; we could >> also instead just add the 4 OES booleans and have additional logic in >> mesa/main to set them each to TRUE if ARB_texture_float is true. The latter >> solution though easier is less clean a! nd begging for trouble later. Regardless, lets first get this patch as-is into Mesa, then do the "right" thing to allow a backend to support a subset of the OES extensions without needing to support the ARB extension. >> >> -Kevin >> >> >> >> >> From: Marek Olšák [mar...@gmail.com] >> Sent: Friday, May 16, 2014 4:33 PM >> To: Rogovin, Kevin >> Cc: mesa-dev@lists.freedesktop.org >> Subject: Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and >> GL_OES_texture_half_float. >> >> Sorry, I meant the linear filtering extensions. >> >> Marek >> >> On Fri, May 16, 2014 at 3:31 PM, Marek Olšák wrote: >>> Hi Kevin, >>> >>> r300g doesn't support filtering of floating-point textures, so the >>> extension shouldn't be advertised there. 
>>> >>> Marek >>> >>> On Wed, May 7, 2014 at 1:18 PM, Kevin Rogovin >>> wrote: Add support for GLES2 extensions for floating point and half floating point textures (GL_OES_texture_float, GL_OES_texture_half_float, GL_OES_texture_float_linear and GL_OES_texture_half_float_linear). --- src/mesa/main/extensions.c | 12 +- src/mesa/main/glformats.c | 25 src/mesa/main/pack.c | 17 + src/mesa/main/teximage.c | 59 ++ 4 files changed, 112 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c index c2ff7e3..e39f65e 100644 --- a/src/mesa/main/extensions.c +++ b/src/mesa/main/extensions.c @@ -301,7 +301,17 @@ static const struct extension extension_table[] = { { "GL_OES_t
[Mesa-dev] glsl: ideas how to improve dead code elimination?
Hi,

When Mesa's GLSL compiler is faced with code like this:

    // vec4 packednormal exists
    vec3 normal;
    normal.xy = packednormal.wy * 2.0 - 1.0;
    normal.z = sqrt(1.0 - dot(normal.xy, normal.xy));
    // now do not use the "normal" at all anywhere

the dead code elimination pass is not able to eliminate the "normal" variable, nor anything that led to it (possibly sampling textures into packed normal, etc.). This is because the variable refcounting visitor sees "normal" as having four references but only two assignments, and so cannot consider it dead, even though the two extra references come from the assignment to normal.z, where both the LHS and the RHS reference the same variable.

Any ideas on how to improve this? If the original code were written like this instead, dead code elimination would be able to "properly" eliminate the whole thing:

    // vec4 packednormal exists
    vec3 normal;
    vec2 nxy = packednormal.wy * 2.0 - 1.0;
    float nz = sqrt(1.0 - dot(nxy, nxy));
    normal.xy = nxy;
    normal.z = nz;
    // now do not use the "normal" at all anywhere

--
Aras Pranckevičius
work: http://unity3d.com
home: http://aras-p.info
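To make the counting problem concrete, here is a small model of it. It is a sketch, not Mesa's visitor: `Assign`, `dead_by_counts` and `dead_ignoring_self_reads` are hypothetical names, variables are tracked as whole names without swizzles, and the second function is only one possible refinement, valid under the assumption that the variable is not a shader output and the expressions have no side effects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified model of an assignment: one written variable
// and the list of variables read on the right-hand side.
struct Assign {
    std::string lhs;
    std::vector<std::string> rhs_reads;
};

// Count-based check as described in the post: every mention is a
// reference, and the variable is dead only if all references are writes.
inline bool dead_by_counts(const std::vector<Assign> &prog,
                           const std::string &var) {
    int referenced = 0, assigned = 0;
    for (const Assign &a : prog) {
        if (a.lhs == var) { referenced++; assigned++; }
        for (const std::string &r : a.rhs_reads)
            if (r == var) referenced++;
    }
    return referenced == assigned;
}

// One possible refinement (an assumption, not Mesa's implementation):
// ignore reads that only feed a write to the same variable.
inline bool dead_ignoring_self_reads(const std::vector<Assign> &prog,
                                     const std::string &var) {
    for (const Assign &a : prog) {
        if (a.lhs == var)
            continue;               // reads here only flow back into var
        for (const std::string &r : a.rhs_reads)
            if (r == var) return false;
    }
    return true;
}
```

Modelling the first snippet as two assignments to "normal", the second of which reads "normal" twice, gives four references against two assignments, so the count-based check keeps the variable alive while the refined check does not.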
Re: [Mesa-dev] [PATCH 03/10] mesa: add new enum MAX_UNIFORM_LOCATIONS
On 04/09/2014 02:56 AM, Tapani Pälli wrote: > Patch adds new implementation dependent value required by the > GL_ARB_explicit_uniform_location extension. Default value for user > assignable locations is calculated as sum of MaxUniformComponents > for each stage. > > Signed-off-by: Tapani Pälli > --- > src/mesa/main/context.c | 10 +- > src/mesa/main/get.c | 1 + > src/mesa/main/get_hash_params.py | 1 + > src/mesa/main/mtypes.h | 5 + > src/mesa/main/tests/enum_strings.cpp | 1 + > 5 files changed, 17 insertions(+), 1 deletion(-) > > diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c > index 860ae86..8b77df1 100644 > --- a/src/mesa/main/context.c > +++ b/src/mesa/main/context.c > @@ -610,8 +610,16 @@ _mesa_init_constants(struct gl_context *ctx) > ctx->Const.MaxUniformBlockSize = 16384; > ctx->Const.UniformBufferOffsetAlignment = 1; > > - for (i = 0; i < MESA_SHADER_STAGES; i++) > + /* GL_ARB_explicit_uniform_location, initial value calculated > +* as sum of MaxUniformComponents for each stage. > +*/ > + ctx->Const.MaxUserAssignableUniformLocations = 0; > + > + for (i = 0; i < MESA_SHADER_STAGES; i++) { >init_program_limits(ctx, i, &ctx->Const.Program[i]); > + ctx->Const.MaxUserAssignableUniformLocations += > + ctx->Const.Program[i].MaxUniformComponents; > + } This is just going to set ctx->Const.MaxUserAssignableUniformLocations to 4 * 4 * MAX_UNIFORMS, and that's probably not what we want. Maybe just set 4 * MAX_UNIFORMS with a comment saying it's, "MAX_UNIFORMS for each possible shader stage." 
> ctx->Const.MaxProgramMatrices = MAX_PROGRAM_MATRICES; > ctx->Const.MaxProgramMatrixStackDepth = MAX_PROGRAM_MATRIX_STACK_DEPTH; > diff --git a/src/mesa/main/get.c b/src/mesa/main/get.c > index 6d95790..8b50441 100644 > --- a/src/mesa/main/get.c > +++ b/src/mesa/main/get.c > @@ -395,6 +395,7 @@ EXTRA_EXT(ARB_viewport_array); > EXTRA_EXT(ARB_compute_shader); > EXTRA_EXT(ARB_gpu_shader5); > EXTRA_EXT2(ARB_transform_feedback3, ARB_gpu_shader5); > +EXTRA_EXT(ARB_explicit_uniform_location); > > static const int > extra_ARB_color_buffer_float_or_glcore[] = { > diff --git a/src/mesa/main/get_hash_params.py > b/src/mesa/main/get_hash_params.py > index 06d0bba..5709d42 100644 > --- a/src/mesa/main/get_hash_params.py > +++ b/src/mesa/main/get_hash_params.py > @@ -474,6 +474,7 @@ descriptor=[ >[ "MAX_LIST_NESTING", "CONST(MAX_LIST_NESTING), NO_EXTRA" ], >[ "MAX_NAME_STACK_DEPTH", "CONST(MAX_NAME_STACK_DEPTH), NO_EXTRA" ], >[ "MAX_PIXEL_MAP_TABLE", "CONST(MAX_PIXEL_MAP_TABLE), NO_EXTRA" ], > + [ "MAX_UNIFORM_LOCATIONS", > "CONTEXT_INT(Const.MaxUserAssignableUniformLocations), NO_EXTRA" ], Ditto on Petri's comment. 
>[ "NAME_STACK_DEPTH", "CONTEXT_INT(Select.NameStackDepth), NO_EXTRA" ], >[ "PACK_LSB_FIRST", "CONTEXT_BOOL(Pack.LsbFirst), NO_EXTRA" ], >[ "PACK_SWAP_BYTES", "CONTEXT_BOOL(Pack.SwapBytes), NO_EXTRA" ], > diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h > index 7ac6bbe..fefbe06 100644 > --- a/src/mesa/main/mtypes.h > +++ b/src/mesa/main/mtypes.h > @@ -3311,6 +3311,11 @@ struct gl_constants > GLuint UniformBufferOffsetAlignment; > /** @} */ > > + /** > +* GL_ARB_explicit_uniform_location > +*/ > + GLuint MaxUserAssignableUniformLocations; > + > /** GL_ARB_geometry_shader4 */ > GLuint MaxGeometryOutputVertices; > GLuint MaxGeometryTotalOutputComponents; > diff --git a/src/mesa/main/tests/enum_strings.cpp > b/src/mesa/main/tests/enum_strings.cpp > index 3795700..298ff6a 100644 > --- a/src/mesa/main/tests/enum_strings.cpp > +++ b/src/mesa/main/tests/enum_strings.cpp > @@ -787,6 +787,7 @@ const struct enum_info everything[] = { > { 0x8256, "GL_RESET_NOTIFICATION_STRATEGY_ARB" }, > { 0x8257, "GL_PROGRAM_BINARY_RETRIEVABLE_HINT" }, > { 0x8261, "GL_NO_RESET_NOTIFICATION_ARB" }, > + { 0x826E, "GL_MAX_UNIFORM_LOCATIONS" }, > { 0x82DF, "GL_TEXTURE_IMMUTABLE_LEVELS" }, > { 0x8362, "GL_UNSIGNED_BYTE_2_3_3_REV" }, > { 0x8363, "GL_UNSIGNED_SHORT_5_6_5" }, > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
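The review comment about the summed limit can be checked with a little arithmetic. The constants below are assumptions chosen for illustration (in particular the value 4096 and the four shader stages); the real values live in Mesa's headers and may differ.

```cpp
#include <cassert>

// Assumed values for illustration only.
constexpr unsigned kMaxUniforms = 4096;        // stand-in for MAX_UNIFORMS
constexpr unsigned kShaderStages = 4;          // stand-in for MESA_SHADER_STAGES
constexpr unsigned kComponentsPerUniform = 4;  // one vec4 per uniform slot

// What the loop in the patch computes: the sum of per-stage
// MaxUniformComponents, assuming each stage defaults to 4 * MAX_UNIFORMS.
constexpr unsigned summed_components() {
    return kShaderStages * kComponentsPerUniform * kMaxUniforms;
}

// What the review suggests instead: MAX_UNIFORMS for each possible stage.
constexpr unsigned suggested_locations() {
    return kShaderStages * kMaxUniforms;
}
```

Under these assumptions the summed value is four times larger than the suggested one, because it counts components rather than locations.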
Re: [Mesa-dev] [PATCH 00/10] GL_ARB_explicit_uniform_location v2
Patches 1, 2, and 7 are Reviewed-by: Ian Romanick I sent out comments for the rest. On 04/09/2014 02:56 AM, Tapani Pälli wrote: > Hi; > > Patches implement the extension, no Piglit regressions and all the tests > for the extension pass. Location initialization and assignment is done > like Ian suggested, this removed quite a bit of code since now there is > no need to store inactive uniforms temporarily. > > Here's a branch with the patches: > http://cgit.freedesktop.org/~tpalli/mesa/log/?h=exp_uniform_loc_v2 > > // Tapani > > > Tapani Pälli (10): > glapi: add GL_ARB_explicit_uniform_location > mesa: add enable bit for ARB_explicit_uniform_location > mesa: add new enum MAX_UNIFORM_LOCATIONS > glsl/linker: initialize explicit uniform locations > glsl/linker: assign explicit uniform locations > mesa: support inactive uniforms in glUniform* functions > glsl: add enable bit for ARB_explicit_uniform_location > glsl: parser changes for GL_ARB_explicit_uniform_location > Enable GL_ARB_explicit_uniform_location in the drivers. > docs: update ARB_explicit_uniform_location status > > docs/GL3.txt | 2 +- > src/glsl/ast_to_hir.cpp | 37 +++ > src/glsl/glcpp/glcpp-parse.y | 3 + > src/glsl/glsl_lexer.ll | 1 + > src/glsl/glsl_parser_extras.cpp | 1 + > src/glsl/glsl_parser_extras.h| 16 + > src/glsl/ir_uniform.h| 5 +- > src/glsl/link_uniforms.cpp | 56 ++-- > src/glsl/linker.cpp | 99 > > src/mapi/glapi/gen/gl_API.xml| 6 ++ > src/mesa/drivers/dri/i965/intel_extensions.c | 1 + > src/mesa/main/context.c | 10 ++- > src/mesa/main/extensions.c | 1 + > src/mesa/main/get.c | 1 + > src/mesa/main/get_hash_params.py | 1 + > src/mesa/main/mtypes.h | 6 ++ > src/mesa/main/tests/enum_strings.cpp | 1 + > src/mesa/main/uniform_query.cpp | 15 + > src/mesa/state_tracker/st_extensions.c | 1 + > 19 files changed, 254 insertions(+), 9 deletions(-) > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 09/10] Enable GL_ARB_explicit_uniform_location in the drivers.
Either this patch should:

 - Delete the extension enable flag
 - Change the table in extensions.c to use dummy_true

or the next patch needs to not say "all drivers that support GLSL".

I think we should just enable it everywhere.

On 04/09/2014 02:56 AM, Tapani Pälli wrote:
> Signed-off-by: Tapani Pälli
> ---
> src/mesa/drivers/dri/i965/intel_extensions.c | 1 +
> src/mesa/state_tracker/st_extensions.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
> b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 15fcd30..f8abf98 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> @@ -170,6 +170,7 @@ intelInitExtensions(struct gl_context *ctx)
> ctx->Extensions.ARB_draw_instanced = true;
> ctx->Extensions.ARB_ES2_compatibility = true;
> ctx->Extensions.ARB_explicit_attrib_location = true;
> + ctx->Extensions.ARB_explicit_uniform_location = true;
> ctx->Extensions.ARB_fragment_coord_conventions = true;
> ctx->Extensions.ARB_fragment_program = true;
> ctx->Extensions.ARB_fragment_program_shadow = true;
> diff --git a/src/mesa/state_tracker/st_extensions.c
> b/src/mesa/state_tracker/st_extensions.c
> index 3e1e45d..5b11e7b 100644
> --- a/src/mesa/state_tracker/st_extensions.c
> +++ b/src/mesa/state_tracker/st_extensions.c
> @@ -534,6 +534,7 @@ void st_init_extensions(struct st_context *st)
> ctx->Extensions.ARB_ES2_compatibility = GL_TRUE;
> ctx->Extensions.ARB_draw_elements_base_vertex = GL_TRUE;
> ctx->Extensions.ARB_explicit_attrib_location = GL_TRUE;
> + ctx->Extensions.ARB_explicit_uniform_location = GL_TRUE;
> ctx->Extensions.ARB_fragment_coord_conventions = GL_TRUE;
> ctx->Extensions.ARB_fragment_program = GL_TRUE;
> ctx->Extensions.ARB_fragment_shader = GL_TRUE;
Re: [Mesa-dev] [PATCH 08/10] glsl: parser changes for GL_ARB_explicit_uniform_location
On 04/09/2014 02:56 AM, Tapani Pälli wrote: > Patch adds a preprocessor define for the extension and stores explicit > location data for uniforms during AST->HIR conversion. It also sets > layout token to be available when having the extension in place. > > Signed-off-by: Tapani Pälli > --- > src/glsl/ast_to_hir.cpp | 37 + > src/glsl/glcpp/glcpp-parse.y | 3 +++ > src/glsl/glsl_lexer.ll| 1 + > src/glsl/glsl_parser_extras.h | 14 ++ > 4 files changed, 55 insertions(+) > > diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp > index 8d55ee3..7431ad7 100644 > --- a/src/glsl/ast_to_hir.cpp > +++ b/src/glsl/ast_to_hir.cpp > @@ -2170,6 +2170,43 @@ validate_explicit_location(const struct > ast_type_qualifier *qual, > { > bool fail = false; > > + /* Checks for GL_ARB_explicit_uniform_location. */ > + if (qual->flags.q.uniform) { > + Extra blank line. > + if (!state->check_explicit_uniform_location_allowed(loc, var)) > + return; > + > + const struct gl_context *const ctx = state->ctx; > + unsigned max_loc = qual->location + var->type->component_slots() - 1; I think that over counts for this purpose, and we can blame confusing nomenclature. component_slots for a mat4 is 4, so a mat4 uniform counts 4*4 against the GL_MAX_VERTEX_UNIFORM_COMPONENTS limit. However, it only has one "location" (as returned by glGetUniformLocation), so it only counts 1 against the GL_MAX_UNIFORM_LOCATIONS limit. > + > + /* ARB_explicit_uniform_location specification states: > + * > + * "The explicitly defined locations and the generated locations > + * must be in the range of 0 to MAX_UNIFORM_LOCATIONS minus one." > + * > + * "Valid locations for default-block uniform variable locations > + * are in the range of 0 to the implementation-defined maximum > + * number of uniform locations." 
> + */ > + if (qual->location < 0) { > + _mesa_glsl_error(loc, state, > + "explicit location < 0 for uniform %s", var->name); > + return; > + } > + > + if (max_loc >= ctx->Const.MaxUserAssignableUniformLocations) { > + _mesa_glsl_error(loc, state, "location qualifier for uniform %s " > + ">= MAX_UNIFORM_LOCATIONS (%u)", > + var->name, > + ctx->Const.MaxUserAssignableUniformLocations); > + return; > + } > + > + var->data.explicit_location = true; > + var->data.location = qual->location; > + return; > + } > + > /* Between GL_ARB_explicit_attrib_location an > * GL_ARB_separate_shader_objects, the inputs and outputs of any shader > * stage can be assigned explicit locations. The checking here associates > diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y > index f28d853..6d42138 100644 > --- a/src/glsl/glcpp/glcpp-parse.y > +++ b/src/glsl/glcpp/glcpp-parse.y > @@ -2087,6 +2087,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t > *parser, intmax_t versio > if (extensions->ARB_explicit_attrib_location) >add_builtin_define(parser, "GL_ARB_explicit_attrib_location", > 1); > > + if (extensions->ARB_explicit_uniform_location) > + add_builtin_define(parser, "GL_ARB_explicit_uniform_location", > 1); > + > if (extensions->ARB_shader_texture_lod) >add_builtin_define(parser, "GL_ARB_shader_texture_lod", 1); > > diff --git a/src/glsl/glsl_lexer.ll b/src/glsl/glsl_lexer.ll > index 7602351..83f0b6d 100644 > --- a/src/glsl/glsl_lexer.ll > +++ b/src/glsl/glsl_lexer.ll > @@ -393,6 +393,7 @@ layout{ > || yyextra->AMD_conservative_depth_enable > || yyextra->ARB_conservative_depth_enable > || yyextra->ARB_explicit_attrib_location_enable > + || yyextra->ARB_explicit_uniform_location_enable >|| yyextra->has_separate_shader_objects() > || yyextra->ARB_uniform_buffer_object_enable > || yyextra->ARB_fragment_coord_conventions_enable > diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h > index c53c583..20879a0 100644 > --- 
a/src/glsl/glsl_parser_extras.h > +++ b/src/glsl/glsl_parser_extras.h > @@ -152,6 +152,20 @@ struct _mesa_glsl_parse_state { >return true; > } > > + bool check_explicit_uniform_location_allowed(YYLTYPE *locp, > +const ir_variable *var) > + { > + /* Requires OpenGL 3.3 or ARB_explicit_attrib_location. */ > + if (ctx->Version < 33 && > !ctx->Extensions.ARB_explicit_attrib_location) { > + _mesa_glsl_error(locp, this, "%s explicit location requires " > + "GL_ARB_explicit_attrib_location extension " >
Re: [Mesa-dev] [PATCH 05/10] glsl/linker: assign explicit uniform locations
On 04/09/2014 02:56 AM, Tapani Pälli wrote: > Patch refactors the existing uniform processing so explicit locations > are taken in to account during variable processing. These locations > are temporarily stored in gl_uniform_storage before actual locations > are set. > > The 'remap_location' variable in gl_uniform_storage is changed to be > signed so that we can use 0 as a valid explicit location and '-1' as > identifier that no explicit location has been defined. > > When locations are set, UniformRemapTable is first populated with > uniforms that have explicit location set (inactive and actives ones), > rest are put after explicit location slots. > > Signed-off-by: Tapani Pälli > --- > src/glsl/ir_uniform.h | 5 +++-- > src/glsl/link_uniforms.cpp | 56 > +- > 2 files changed, 54 insertions(+), 7 deletions(-) > > diff --git a/src/glsl/ir_uniform.h b/src/glsl/ir_uniform.h > index 3508509..9dc4a8e 100644 > --- a/src/glsl/ir_uniform.h > +++ b/src/glsl/ir_uniform.h > @@ -181,9 +181,10 @@ struct gl_uniform_storage { > > /** > * The 'base location' for this uniform in the uniform remap table. For > -* arrays this is the first element in the array. > +* arrays this is the first element in the array. It needs to be signed > +* so that we can use 0 as valid location and -1 as initial value > */ > - unsigned remap_location; > + int remap_location; You could use ~0u instead of -1, right? A #define for the magic value will also help. 
> }; > > #ifdef __cplusplus > diff --git a/src/glsl/link_uniforms.cpp b/src/glsl/link_uniforms.cpp > index 29dc0b1..0f99082 100644 > --- a/src/glsl/link_uniforms.cpp > +++ b/src/glsl/link_uniforms.cpp > @@ -387,6 +387,9 @@ public: > void set_and_process(struct gl_shader_program *prog, > ir_variable *var) > { > + current_var = var; > + field_counter = 0; > + >ubo_block_index = -1; >if (var->is_in_uniform_block()) { > if (var->is_interface_instance() && var->type->is_array()) { > @@ -543,6 +546,22 @@ private: > return; >} > > + /* Assign explicit locations. */ > + if (current_var->data.explicit_location) { > + /* Set sequential locations for struct fields. */ > + if (current_var->type->is_record()) { I think you can accomplish the same thing with record_type != NULL. > +const unsigned entries = MAX2(1, > this->uniforms[id].array_elements); > +this->uniforms[id].remap_location = > + current_var->data.location + field_counter; > + field_counter += entries; Weird indentation. > + } else { > +this->uniforms[id].remap_location = current_var->data.location; > + } > + } else { > + /* Initialize to -1 to indicate that no explicit location is set */ > + this->uniforms[id].remap_location = -1; > + } > + >this->uniforms[id].name = ralloc_strdup(this->uniforms, name); >this->uniforms[id].type = base_type; >this->uniforms[id].initialized = 0; > @@ -598,6 +617,17 @@ public: > gl_texture_index targets[MAX_SAMPLERS]; > > /** > +* Current variable being processed. > +*/ > + ir_variable *current_var; > + > + /** > +* Field counter is used to take care that uniform structures > +* with explicit locations get sequential locations. > +*/ > + unsigned field_counter; > + > + /** > * Mask of samplers used by the current shader stage. 
> */ > unsigned shader_samplers_used; > @@ -799,10 +829,6 @@ link_assign_uniform_locations(struct gl_shader_program > *prog) > prog->UniformStorage = NULL; > prog->NumUserUniformStorage = 0; > > - ralloc_free(prog->UniformRemapTable); > - prog->UniformRemapTable = NULL; > - prog->NumUniformRemapTable = 0; > - > if (prog->UniformHash != NULL) { >prog->UniformHash->clear(); > } else { > @@ -915,9 +941,29 @@ link_assign_uniform_locations(struct gl_shader_program > *prog) > sizeof(prog->_LinkedShaders[i]->SamplerTargets)); > } > > - /* Build the uniform remap table that is used to set/get uniform > locations */ > + /* Reserve all the explicit locations of the active uniforms. */ > + for (unsigned i = 0; i < num_user_uniforms; i++) { > + if (uniforms[i].remap_location != -1) { > + /* How many new entries for this uniform? */ > + const unsigned entries = MAX2(1, uniforms[i].array_elements); > + > + /* Set remap table entries point to correct gl_uniform_storage. */ > + for (unsigned j = 0; j < entries; j++) { > +unsigned element_loc = uniforms[i].remap_location + j; > +assert(prog->UniformRemapTable[element_loc] == > + (gl_uniform_storage *) -1); > +prog->UniformRemapTable[element_loc] = &uniforms[i]; > + } > + } > + } > + > + /* Reserve
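The review suggestion above (keep remap_location unsigned, use ~0u as the "no explicit location" marker, and give the magic value a name) can be sketched roughly as follows. The macro name UNMAPPED_UNIFORM_LOC, the trimmed-down struct and both helper functions are assumptions made for illustration; they are not taken from the patch. The second helper mirrors the field_counter logic quoted above, where each struct field consumes MAX2(1, array_elements) sequential locations.

```c
#include <stdbool.h>

/* Hypothetical name for the "no explicit location" marker. */
#define UNMAPPED_UNIFORM_LOC (~0u)

/* Trimmed-down stand-in for gl_uniform_storage (illustration only). */
struct uniform_storage_sketch {
   unsigned array_elements;   /* 0 for non-array uniforms */
   unsigned remap_location;   /* stays unsigned; ~0u means "not set" */
};

/* True when an explicit location has been recorded for this uniform. */
static bool
has_explicit_location(const struct uniform_storage_sketch *u)
{
   return u->remap_location != UNMAPPED_UNIFORM_LOC;
}

/* Give one struct field its location and advance the running counter
 * by the number of locations the field consumes. */
static void
assign_field_location(struct uniform_storage_sketch *u,
                      unsigned base_location, unsigned *field_counter)
{
   const unsigned entries = u->array_elements > 1 ? u->array_elements : 1;

   u->remap_location = base_location + *field_counter;
   *field_counter += entries;
}
```

With a base location of 10 and three fields (a scalar, a 4-element array, a scalar), the fields end up at 10, 11 and 15, and the counter finishes at 6.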
Re: [Mesa-dev] [PATCH 06/10] mesa: support inactive uniforms in glUniform* functions
On 04/09/2014 02:56 AM, Tapani Pälli wrote: > Support inactive uniforms that have explicit location set in > glUniform* functions. > > Signed-off-by: Tapani Pälli > --- > src/mesa/main/uniform_query.cpp | 15 +++ > 1 file changed, 15 insertions(+) > > diff --git a/src/mesa/main/uniform_query.cpp b/src/mesa/main/uniform_query.cpp > index 5f1af08..e33800a 100644 > --- a/src/mesa/main/uniform_query.cpp > +++ b/src/mesa/main/uniform_query.cpp > @@ -253,6 +253,21 @@ validate_uniform_parameters(struct gl_context *ctx, >return false; > } > > + /* If the driver storage pointer in remap table is -1, we ignore silently. > +* > +* GL_ARB_explicit_uniform_location spec says: > +* "What happens if Uniform* is called with an explicitly defined > +* uniform location, but that uniform is deemed inactive by the > +* linker? > +* > +* RESOLVED: The call is ignored for inactive uniform variables and > +* no error is generated." > +* > +*/ > + if (ctx->Extensions.ARB_explicit_uniform_location && > + shProg->UniformRemapTable[location] == (gl_uniform_storage *) -1) > + return false; > + Do we actually need to check ctx->Extensions.ARB_explicit_uniform_location? It seems like UniformRemapTable will only have -1 in it for that case, right? > _mesa_uniform_split_location_offset(shProg, location, loc, array_index); > > if (shProg->UniformStorage[*loc].array_elements == 0 && count > 1) { > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
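A minimal sketch of the "ignore silently" check discussed above, assuming the remap table marks reserved-but-inactive locations with a (pointer) -1 sentinel. The struct, the enum and classify_location() are hypothetical stand-ins written for this illustration; only the idea of a named sentinel comes from the review comments in this series. Treating out-of-table locations as invalid is a simplification for the sketch, not a statement about what validate_uniform_parameters does.

```c
#include <stddef.h>

/* Stand-in for gl_uniform_storage (illustration only). */
struct uniform_storage_sketch {
   unsigned array_elements;
};

/* Named sentinel for a location reserved by an inactive uniform. */
#define INACTIVE_UNIFORM_EXPLICIT_LOCATION \
   ((struct uniform_storage_sketch *) -1)

enum lookup_result {
   LOOKUP_ACTIVE,    /* the caller may go on and write the data */
   LOOKUP_IGNORED,   /* inactive uniform: no write, no error */
   LOOKUP_INVALID    /* location outside the table */
};

static enum lookup_result
classify_location(struct uniform_storage_sketch *const *remap_table,
                  unsigned table_size, int location)
{
   if (location < 0 || (unsigned) location >= table_size)
      return LOOKUP_INVALID;

   /* No separate extension check here: in this sketch the sentinel can
    * only be present if explicit locations were reserved at link time. */
   if (remap_table[location] == INACTIVE_UNIFORM_EXPLICIT_LOCATION)
      return LOOKUP_IGNORED;

   return LOOKUP_ACTIVE;
}
```

Keeping the sentinel behind one macro means the linker side and the glUniform* side cannot drift apart on what "-1" means.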
Re: [Mesa-dev] [PATCH 04/10] glsl/linker: initialize explicit uniform locations
On 04/09/2014 02:56 AM, Tapani Pälli wrote: > Patch initializes the UniformRemapTable for explicit locations. This > needs to happen before optimizations to make sure all inactive uniforms > get their explicit locations correctly. > > Signed-off-by: Tapani Pälli > --- > src/glsl/linker.cpp | 99 > + > 1 file changed, 99 insertions(+) > > diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp > index 7c194a2..1b4cb63 100644 > --- a/src/glsl/linker.cpp > +++ b/src/glsl/linker.cpp > @@ -74,6 +74,7 @@ > #include "link_varyings.h" > #include "ir_optimization.h" > #include "ir_rvalue_visitor.h" > +#include "ir_uniform.h" > > extern "C" { > #include "main/shaderobj.h" > @@ -2089,6 +2090,100 @@ check_image_resources(struct gl_context *ctx, struct > gl_shader_program *prog) >linker_error(prog, "Too many combined image uniforms and fragment > outputs"); > } > > + > +/** > + * Initializes explicit location slots point to -1 for a variable, > + * checks for overlaps between other uniforms using explicit locations. > + */ > +static bool > +reserve_explicit_locations(struct gl_shader_program *prog, > + string_to_uint_map *map, ir_variable *var) > +{ > + unsigned max_loc = var->data.location + var->type->component_slots() - 1; > + > + /* Resize remap table if locations do not fit in the current one. */ > + if (max_loc + 1 > prog->NumUniformRemapTable) { > + prog->UniformRemapTable = > + reralloc(prog, prog->UniformRemapTable, > + gl_uniform_storage *, > + max_loc + 1); > + prog->NumUniformRemapTable = max_loc + 1; > + } > + > + for (unsigned i = 0; i < var->type->component_slots(); i++) { You should check the code that gets generated for this. I suspect this will translate to a call to component_slots per iteration of the loop. Maybe just call it once above (since it is also used to calculate max_loc). > + unsigned loc = var->data.location + i; > + > + /* Check if location is already used. */ > + if (prog->UniformRemapTable[loc] == (gl_uniform_storage *) -1) { So... 
-1 means that an inactive uniform has that location explicitly assigned? I'm inferring that from comments in the next patch. Maybe we should have a descriptive #define #define INACTIVE_UNIFORM_EXPLICIT_LOCATION ((gl_uniform_storage *) -1) > + > + /* Possibly same uniform from a different stage, this is ok. */ > + unsigned hash_loc; > + if (map->get(hash_loc, var->name) && hash_loc == loc - i) > + continue; > + > + /* ARB_explicit_uniform_location specification states: > + * > + * "No two default-block uniform variables in the program can > have > + * the same location, even if they are unused, otherwise a > compiler > + * or linker error will be generated." > + */ > + linker_error(prog, "location qualifier " > + "for uniform %s " > + "overlaps previously used location", > + var->name); Minor nit (which you can take or leave). I usually like to have fewer breaks in strings. I would have split this up as: linker_error(prog, "location qualifier for uniform %s overlaps " "previously used location", var->name); > + return false; > + } > + > + prog->UniformRemapTable[loc] = (gl_uniform_storage *) -1; > + } > + > + /* Note, base location used for arrays. */ > + map->put(var->data.location, var->name); > + > + return true; > +} > + > +/** > + * Check and reserve all explicit uniform locations, called before > + * any optimizations happen to handle also inactive uniforms and > + * inactive array elements that may get trimmed away. > + */ > +static void > +check_explicit_uniform_locations(struct gl_context *ctx, > + struct gl_shader_program *prog) > +{ > + if (!ctx->Extensions.ARB_explicit_uniform_location) > + return; > + > + /* This map is used to detect if overlapping explicit locations > +* occur with the same uniform (from different stage) or a different one. 
> +*/ > + string_to_uint_map *uniform_map = new string_to_uint_map; > + > + for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) { > + struct gl_shader *sh = prog->_LinkedShaders[i]; > + > + if (!sh) > + continue; > + > + foreach_list(node, sh->ir) { > + ir_variable *var = ((ir_instruction *)node)->as_variable(); > + if ((var && var->data.mode == ir_var_uniform) && > + var->data.explicit_location) { > +if (!reserve_explicit_locations(prog, uniform_map, var)) > + return; > + > +/* Initialize locations that were allocated but left unused. */ > +for (unsigned i = 0; i < prog->NumUniformRemapTable; i++)
[Mesa-dev] [PATCH 3/7] r600g/compute: Add more NULL checks
In this case, NULL checks are added to compute_memory_grow_pool, so it returns -1 when it fails. This makes it necessary to handle such cases in compute_memory_finalize_pending when it needs to grow the pool --- src/gallium/drivers/r600/compute_memory_pool.c | 30 -- src/gallium/drivers/r600/compute_memory_pool.h | 6 -- 2 files changed, 28 insertions(+), 8 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index 7143545..e959a6d 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -160,9 +160,10 @@ struct compute_memory_item* compute_memory_postalloc_chunk( } /** - * Reallocates pool, conserves data + * Reallocates pool, conserves data. + * @returns -1 if it fails, 0 otherwise */ -void compute_memory_grow_pool(struct compute_memory_pool* pool, +int compute_memory_grow_pool(struct compute_memory_pool* pool, struct pipe_context * pipe, int new_size_in_dw) { COMPUTE_DBG(pool->screen, "* compute_memory_grow_pool() " @@ -173,6 +174,8 @@ void compute_memory_grow_pool(struct compute_memory_pool* pool, if (!pool->bo) { compute_memory_pool_init(pool, MAX2(new_size_in_dw, 1024 * 16)); + if (pool->shadow == NULL) + return -1; } else { new_size_in_dw += 1024 - (new_size_in_dw % 1024); @@ -181,6 +184,9 @@ void compute_memory_grow_pool(struct compute_memory_pool* pool, compute_memory_shadow(pool, pipe, 1); pool->shadow = realloc(pool->shadow, new_size_in_dw*4); + if (pool->shadow == NULL) + return -1; + pool->size_in_dw = new_size_in_dw; pool->screen->b.b.resource_destroy( (struct pipe_screen *)pool->screen, @@ -190,6 +196,8 @@ void compute_memory_grow_pool(struct compute_memory_pool* pool, pool->size_in_dw * 4); compute_memory_shadow(pool, pipe, 0); } + + return 0; } /** @@ -213,8 +221,9 @@ void compute_memory_shadow(struct compute_memory_pool* pool, /** * Allocates pending allocations in the pool + * @returns -1 if it fails, 0 otherwise */ -void
compute_memory_finalize_pending(struct compute_memory_pool* pool, +int compute_memory_finalize_pending(struct compute_memory_pool* pool, struct pipe_context * pipe) { struct compute_memory_item *pending_list = NULL, *end_p = NULL; @@ -225,6 +234,8 @@ void compute_memory_finalize_pending(struct compute_memory_pool* pool, int64_t start_in_dw = 0; + int err = 0; + COMPUTE_DBG(pool->screen, "* compute_memory_finalize_pending()\n"); for (item = pool->item_list; item; item = item->next) { @@ -292,7 +303,9 @@ void compute_memory_finalize_pending(struct compute_memory_pool* pool, * they aren't contiguous, so it will be impossible to allocate Item D. */ if (pool->size_in_dw < allocated+unallocated) { - compute_memory_grow_pool(pool, pipe, allocated+unallocated); + err = compute_memory_grow_pool(pool, pipe, allocated+unallocated); + if (err == -1) + return -1; } /* Loop through all the pending items, allocate space for them and @@ -309,17 +322,20 @@ void compute_memory_finalize_pending(struct compute_memory_pool* pool, need += 1024 - (need % 1024); if (need > 0) { - compute_memory_grow_pool(pool, + err = compute_memory_grow_pool(pool, pipe, pool->size_in_dw + need); } else { need = pool->size_in_dw / 10; need += 1024 - (need % 1024); - compute_memory_grow_pool(pool, + err = compute_memory_grow_pool(pool, pipe, pool->size_in_dw + need); } + + if (err == -1) + return -1; } COMPUTE_DBG(pool->screen, " + Found space for Item %p id = %u " "start_in_dw = %u (%u bytes) size_in_dw = %u (%u bytes)\n", @@ -355,6 +371,8 @@ void compute_memory_finalize_pending(struct compute_memory_pool* pool, allocated += item->size_in_dw; } + + return 0; } diff --git a/src/gallium/drivers/r600/compute_memory_pool.h b/src/gallium/drivers/r600/compute_memory_pool.h index 3777e3f..e61c003 100644 --- a/src/gallium/drivers/r600
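One detail of the realloc() hunk above may be worth a second look: the result is stored straight into pool->shadow, so if realloc() fails the old buffer is no longer reachable through the pool when -1 is returned. A possible variant, sketched here with a hypothetical pool_sketch struct and grow_shadow() helper (neither exists in r600g), keeps the old pointer until the new one is known to be valid. Whether this matters in practice for the driver is a judgment call for the patch author.

```c
#include <stdint.h>
#include <stdlib.h>

/* Cut-down stand-in for compute_memory_pool (illustration only). */
struct pool_sketch {
   uint32_t *shadow;
   int64_t size_in_dw;
};

/* Grows the shadow buffer to new_size_in_dw dwords.
 * Returns -1 on failure and leaves the pool untouched, 0 otherwise. */
static int
grow_shadow(struct pool_sketch *pool, int64_t new_size_in_dw)
{
   uint32_t *new_shadow;

   if (new_size_in_dw <= 0)
      return -1;

   new_shadow = realloc(pool->shadow, (size_t) new_size_in_dw * 4);
   if (new_shadow == NULL)
      return -1;   /* pool->shadow still points at the old buffer */

   pool->shadow = new_shadow;
   pool->size_in_dw = new_size_in_dw;
   return 0;
}
```

On failure the caller still owns a usable (smaller) buffer and can free it or retry later.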
[Mesa-dev] [PATCH 7/7] r600g/compute: Use %u as the unsigned format
This fixes an issue when running the cl-program-bitcoin-phatk piglit test, where some of the inputs are shown as negative values --- src/gallium/drivers/r600/evergreen_compute.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/gallium/drivers/r600/evergreen_compute.c b/src/gallium/drivers/r600/evergreen_compute.c index 701bb5c..a2abf15 100644 --- a/src/gallium/drivers/r600/evergreen_compute.c +++ b/src/gallium/drivers/r600/evergreen_compute.c @@ -323,7 +323,7 @@ void evergreen_compute_upload_input( memcpy(kernel_parameters_start, input, shader->input_size); for (i = 0; i < (input_size / 4); i++) { - COMPUTE_DBG(ctx->screen, "input %i : %i\n", i, + COMPUTE_DBG(ctx->screen, "input %i : %u\n", i, ((unsigned*)num_work_groups_start)[i]); } -- 1.9.3
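As a small standalone illustration of the format change, the helper below formats one dword the same way as the fixed debug line: the index with %i and the value with %u. format_input_dword() is a hypothetical name used only for this sketch, and the example assumes a 32-bit unsigned int. With the old %i conversion, a dword with the top bit set would typically be shown as a negative number on two's-complement platforms, which is what made the piglit output confusing.

```c
#include <stdio.h>

/* Formats "input <index> : <value>" with the value printed as unsigned.
 * Returns what snprintf() returns. */
static int
format_input_dword(char *buf, size_t len, int index, unsigned value)
{
   return snprintf(buf, len, "input %i : %u", index, value);
}
```

Called with index 0 and value 0xFFFFFFFFu, the buffer holds "input 0 : 4294967295".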
[Mesa-dev] [PATCH 5/7] r600g/compute: Cleanup of compute_memory_pool.h
Removed compute_memory_defrag declaration because it seems to be unimplemented. I think that this function would have been the one that solves the problem with fragmentation that compute_memory_finalize_pending has. Also removed comments that are already at compute_memory_pool.c --- src/gallium/drivers/r600/compute_memory_pool.h | 15 --- 1 file changed, 15 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.h b/src/gallium/drivers/r600/compute_memory_pool.h index e61c003..c711c59 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.h +++ b/src/gallium/drivers/r600/compute_memory_pool.h @@ -64,32 +64,17 @@ int64_t compute_memory_prealloc_chunk(struct compute_memory_pool* pool, int64_t struct compute_memory_item* compute_memory_postalloc_chunk(struct compute_memory_pool* pool, int64_t start_in_dw); ///search for the chunk where we can link our new chunk after it -/** - * reallocates pool, conserves data - * @returns -1 if it fails, 0 otherwise - */ int compute_memory_grow_pool(struct compute_memory_pool* pool, struct pipe_context * pipe, int new_size_in_dw); -/** - * Copy pool from device to host, or host to device - */ void compute_memory_shadow(struct compute_memory_pool* pool, struct pipe_context * pipe, int device_to_host); -/** - * Allocates pending allocations in the pool - * @returns -1 if it fails, 0 otherwise - */ int compute_memory_finalize_pending(struct compute_memory_pool* pool, struct pipe_context * pipe); -void compute_memory_defrag(struct compute_memory_pool* pool); ///Defragment the memory pool, always heavy memory usage void compute_memory_free(struct compute_memory_pool* pool, int64_t id); struct compute_memory_item* compute_memory_alloc(struct compute_memory_pool* pool, int64_t size_in_dw); ///Creates pending allocations -/** - * Transfer data host<->device, offset and size is in bytes - */ void compute_memory_transfer(struct compute_memory_pool* pool, struct pipe_context * pipe, int device_to_host, struct 
compute_memory_item* chunk, void* data, -- 1.9.3
[Mesa-dev] [PATCH 6/7] r600g/compute: align items correctly
Now, items whose size is a multiple of 1024 dw won't leave 1024 dw between themselves and the following item. The remaining cases are left as they were --- src/gallium/drivers/r600/compute_memory_pool.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index 01851ad..2050f28 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -30,6 +30,7 @@ #include "util/u_transfer.h" #include "util/u_surface.h" #include "util/u_pack_color.h" +#include "util/u_math.h" #include "util/u_memory.h" #include "util/u_inlines.h" #include "util/u_framebuffer.h" @@ -41,6 +42,7 @@ #include "evergreen_compute_internal.h" #include +#define ITEM_ALIGNMENT 1024 /** * Creates a new pool */ @@ -112,8 +114,7 @@ int64_t compute_memory_prealloc_chunk( return last_end; } - last_end = item->start_in_dw + item->size_in_dw; - last_end += (1024 - last_end % 1024); + last_end = item->start_in_dw + align(item->size_in_dw, ITEM_ALIGNMENT); } } @@ -177,7 +178,7 @@ int compute_memory_grow_pool(struct compute_memory_pool* pool, if (pool->shadow == NULL) return -1; } else { - new_size_in_dw += 1024 - (new_size_in_dw % 1024); + new_size_in_dw = align(new_size_in_dw, ITEM_ALIGNMENT); COMPUTE_DBG(pool->screen, " Aligned size = %d (%d bytes)\n", new_size_in_dw, new_size_in_dw * 4); @@ -323,7 +324,7 @@ int compute_memory_finalize_pending(struct compute_memory_pool* pool, need = pool->size_in_dw / 10; } - need += 1024 - (need % 1024); + need = align(need, ITEM_ALIGNMENT); err = compute_memory_grow_pool(pool, pipe, -- 1.9.3
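The difference between the old expression and an align-style helper can be seen in a few lines. round_up_old() reproduces the arithmetic removed by the patch; align_sketch() is a local stand-in written for this example, since the body of align() from util/u_math.h is not shown in the patch. The stand-in assumes a non-negative value and a power-of-two alignment.

```c
#include <stdint.h>

#define ITEM_ALIGNMENT 1024

/* Old arithmetic: always moves to the NEXT multiple, so a value that is
 * already a multiple of 1024 gains a full extra 1024. */
static int64_t
round_up_old(int64_t value)
{
   return value + (ITEM_ALIGNMENT - value % ITEM_ALIGNMENT);
}

/* Stand-in for align(): rounds up to a multiple of alignment and leaves
 * already aligned values unchanged. Assumes value >= 0 and that
 * alignment is a power of two. */
static int64_t
align_sketch(int64_t value, int64_t alignment)
{
   return (value + alignment - 1) & ~(alignment - 1);
}
```

For 512 both give 1024; for 1024 the old form gives 2048 while the align-style form keeps 1024, which is the padding this patch removes.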
[Mesa-dev] [PATCH 4/7] r600g/compute: Tidy a bit compute_memory_finalize_pending
Explanation of the changes, as requested by Tom Stellard: Let's take need after it is calculated as item->size_in_dw+2048 - (pool->size_in_dw - allocated). BEFORE: If need is positive or 0: we calculate need += 1024 - (need % 1024), which is like rounding up to the next multiple of 1024; for example, 0 goes to 1024, 512 goes to 1024 as well, 1025 goes to 2048 and so on. So now need is always positive; we do compute_memory_grow_pool, check its output and continue. If need is negative: we calculate need += 1024 - (need % 1024); in this case we will have negative numbers, and if need is [-1024:-1] 0, so now we take the else, recalculate need as need = pool->size_in_dw / 10 and need += 1024 - (need % 1024), we do compute_memory_grow_pool, check its output and continue. AFTER: If need is positive or 0: we skip the if, calculate need += 1024 - (need % 1024), do compute_memory_grow_pool, check its output and continue. If need is negative: we enter the if, and need is now pool->size_in_dw / 10. Now we calculate need += 1024 - (need % 1024), do compute_memory_grow_pool, check its output and continue.
--- src/gallium/drivers/r600/compute_memory_pool.c | 19 +++ 1 file changed, 7 insertions(+), 12 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index e959a6d..01851ad 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -319,21 +319,16 @@ int compute_memory_finalize_pending(struct compute_memory_pool* pool, int64_t need = item->size_in_dw+2048 - (pool->size_in_dw - allocated); - need += 1024 - (need % 1024); - - if (need > 0) { - err = compute_memory_grow_pool(pool, - pipe, - pool->size_in_dw + need); - } - else { + if (need < 0) { need = pool->size_in_dw / 10; - need += 1024 - (need % 1024); - err = compute_memory_grow_pool(pool, - pipe, - pool->size_in_dw + need); } + need += 1024 - (need % 1024); + + err = compute_memory_grow_pool(pool, + pipe, + pool->size_in_dw + need); + if (err == -1) return -1; } -- 1.9.3
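The BEFORE/AFTER description can be checked with a small standalone model of the two code paths. The function names below are hypothetical; each returns the amount that would be added to pool->size_in_dw in the grow call. The model relies on C99 semantics, where % truncates toward zero, so a negative need yields a remainder in (-1024, 0].

```c
#include <stdint.h>

/* need += 1024 - (need % 1024), as written in the patch. */
static int64_t
round_next_1024(int64_t need)
{
   return need + (1024 - need % 1024);
}

/* Model of the code before the patch: round first, then branch. */
static int64_t
grow_amount_before(int64_t need, int64_t pool_size_in_dw)
{
   need = round_next_1024(need);
   if (need > 0)
      return need;

   need = pool_size_in_dw / 10;
   return round_next_1024(need);
}

/* Model of the code after the patch: branch first, then round once. */
static int64_t
grow_amount_after(int64_t need, int64_t pool_size_in_dw)
{
   if (need < 0)
      need = pool_size_in_dw / 10;

   return round_next_1024(need);
}
```

As far as this model can tell, the two versions agree for need >= 0 and for need <= -1024. For need in [-1023:-1] the old code rounds up to 1024 and grows by that, while the new code falls back to pool->size_in_dw / 10 (2048 after rounding for a 16384 dw pool). That may well be the intended behaviour, but it seems worth confirming against the explanation above.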
[Mesa-dev] [PATCH 2/7] r600g/compute: Adding checks for NULL after CALLOC
--- src/gallium/drivers/r600/compute_memory_pool.c | 8 1 file changed, 8 insertions(+) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index ccbb211..7143545 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -49,6 +49,8 @@ struct compute_memory_pool* compute_memory_pool_new( { struct compute_memory_pool* pool = (struct compute_memory_pool*) CALLOC(sizeof(struct compute_memory_pool), 1); + if (pool == NULL) + return NULL; COMPUTE_DBG(rscreen, "* compute_memory_pool_new()\n"); @@ -64,6 +66,9 @@ static void compute_memory_pool_init(struct compute_memory_pool * pool, initial_size_in_dw); pool->shadow = (uint32_t*)CALLOC(initial_size_in_dw, 4); + if (pool->shadow == NULL) + return; + pool->next_id = 1; pool->size_in_dw = initial_size_in_dw; pool->bo = (struct r600_resource*)r600_compute_buffer_alloc_vram(pool->screen, @@ -400,6 +405,9 @@ struct compute_memory_item* compute_memory_alloc( new_item = (struct compute_memory_item *) CALLOC(sizeof(struct compute_memory_item), 1); + if (new_item == NULL) + return NULL; + new_item->size_in_dw = size_in_dw; new_item->start_in_dw = -1; /* mark pending */ new_item->id = pool->next_id++; -- 1.9.3
[Mesa-dev] [PATCH 0/7] r600g/compute: Some cleanup patches
Hi, First, I should introduce myself (at least more formally than just sending some patches). My name is Bruno Jiménez, I'm studying physics at the University of Zaragoza (Spain) and I am participating in this year's Google Summer of Code, where I will try to improve the compute_memory_pool, solve an annoying bug related to mappings, and do anything else that I can. These patches are a small first cleanup. I sent the first five some time ago, but they weren't pushed. The sixth fixes the alignment for items whose size is a multiple of 1024 dw, and the last one corrects the format specifier used for an unsigned value. Thanks! Bruno Bruno Jiménez (7): r600g/compute: Fixing a typo and some indentation r600g/compute: Adding checks for NULL after CALLOC r600g/compute: Add more NULL checks r600g/compute: Tidy a bit compute_memory_finalize_pending r600g/compute: Cleanup of compute_memory_pool.h r600g/compute: align items correctly r600g/compute: Use %u as the unsigned format src/gallium/drivers/r600/compute_memory_pool.c | 64 +- src/gallium/drivers/r600/compute_memory_pool.h | 17 +-- src/gallium/drivers/r600/evergreen_compute.c | 2 +- 3 files changed, 46 insertions(+), 37 deletions(-) -- 1.9.3
[Mesa-dev] [PATCH 1/7] r600g/compute: Fixing a typo and some indentation
--- src/gallium/drivers/r600/compute_memory_pool.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index 2f0d4c8..ccbb211 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -263,7 +263,7 @@ void compute_memory_finalize_pending(struct compute_memory_pool* pool, unallocated += item->size_in_dw+1024; } else { - /* The item is not pendng, so update the amount of space + /* The item is not pending, so update the amount of space * that has already been allocated. */ allocated += item->size_in_dw; } @@ -451,7 +451,7 @@ void compute_memory_transfer( map = pipe->transfer_map(pipe, gart, 0, PIPE_TRANSFER_READ, &(struct pipe_box) { .width = aligned_size * 4, .height = 1, .depth = 1 }, &xfer); -assert(xfer); + assert(xfer); assert(map); memcpy(data, map + internal_offset, size); pipe->transfer_unmap(pipe, xfer); -- 1.9.3
Re: [Mesa-dev] [PATCH] llvmpipe: do IR counting for shader cache management after optimization.
Looks good to me. Jose - Original Message - > From: Roland Scheidegger > > 2ea923cf571235dfe573c35c3f0d90f632bd86d8 had the side effect of IR counting > now being done after IR optimization instead of before. Some quick analysis > shows that there's roughly 1.5 times more IR instructions before optimization > than after, hence the effective shader cache size got quite a bit smaller. > Could counter this with an increase of the instruction limit but it probably > makes more sense to count them after optimizations, so move that code. > --- > src/gallium/auxiliary/gallivm/lp_bld_type.c | 20 +++- > src/gallium/auxiliary/gallivm/lp_bld_type.h | 2 +- > src/gallium/drivers/llvmpipe/lp_state_fs.c | 4 ++-- > 3 files changed, 22 insertions(+), 4 deletions(-) > > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_type.c > b/src/gallium/auxiliary/gallivm/lp_bld_type.c > index 9b25e15..5a80199 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_type.c > +++ b/src/gallium/auxiliary/gallivm/lp_bld_type.c > @@ -394,7 +394,7 @@ lp_build_context_init(struct lp_build_context *bld, > /** > * Count the number of instructions in a function. > */ > -unsigned > +static unsigned > lp_build_count_instructions(LLVMValueRef function) > { > unsigned num_instrs = 0; > @@ -414,3 +414,21 @@ lp_build_count_instructions(LLVMValueRef function) > > return num_instrs; > } > + > + > +/** > + * Count the number of instructions in a module. 
> + */ > +unsigned > +lp_build_count_ir_module(LLVMModuleRef module) > +{ > + LLVMValueRef func; > + unsigned num_instrs = 0; > + > + func = LLVMGetFirstFunction(module); > + while (func) { > + num_instrs += lp_build_count_instructions(func); > + func = LLVMGetNextFunction(func); > + } > + return num_instrs; > +} > diff --git a/src/gallium/auxiliary/gallivm/lp_bld_type.h > b/src/gallium/auxiliary/gallivm/lp_bld_type.h > index d0b490b..191cf92 100644 > --- a/src/gallium/auxiliary/gallivm/lp_bld_type.h > +++ b/src/gallium/auxiliary/gallivm/lp_bld_type.h > @@ -447,7 +447,7 @@ lp_build_context_init(struct lp_build_context *bld, > > > unsigned > -lp_build_count_instructions(LLVMValueRef function); > +lp_build_count_ir_module(LLVMModuleRef module); > > > #endif /* !LP_BLD_TYPE_H */ > diff --git a/src/gallium/drivers/llvmpipe/lp_state_fs.c > b/src/gallium/drivers/llvmpipe/lp_state_fs.c > index 4872e0d..0b74d15 100644 > --- a/src/gallium/drivers/llvmpipe/lp_state_fs.c > +++ b/src/gallium/drivers/llvmpipe/lp_state_fs.c > @@ -2438,8 +2438,6 @@ generate_fragment(struct llvmpipe_context *lp, > LLVMBuildRetVoid(builder); > > gallivm_verify_function(gallivm, function); > - > - variant->nr_instrs += lp_build_count_instructions(function); > } > > > @@ -2629,6 +2627,8 @@ generate_variant(struct llvmpipe_context *lp, > > gallivm_compile_module(variant->gallivm); > > + variant->nr_instrs += lp_build_count_ir_module(variant->gallivm->module); > + > if (variant->function[RAST_EDGE_TEST]) { >variant->jit_function[RAST_EDGE_TEST] = (lp_jit_frag_func) > gallivm_jit_function(variant->gallivm, > -- > 1.9.1 > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] targets/opencl: Fix (static) linking with LLVM
Michel Dänzer wrote on 19.05.2014 04:12:
> On 18.05.2014 18:37, Kai Wasserbäch wrote:
>>
>> And instead of just not starting, my X starts crashing, whenever
>> libGL fails to load a (32 bit) driver.
>
> FWIW, some potential alternatives for avoiding the X crashes:
>
> With current xserver Git master, you can pass the -iglx parameter to
> Xorg to prohibit GLX indirect rendering.
>
> Or just make sure the 32-bit swrast_dri.so works.

Thanks a lot for those pointers. I think my swrast failed because it had picked up a newer SO_VERSION as well, which would bring me back to static linking.

Kind regards,
Kai Wasserbäch
[Mesa-dev] [PATCH] targets/opencl: Fix (static) linking with LLVM (v2)
Without this, I get linking failures when linking statically. Static linking is more or less required for me, because otherwise Steam and applications using the Steam runtime regularly fail: my LLVM was compiled and linked against a newer libgcc_s, libstdc++, etc. and uses features from those newer versions. And instead of Steam just not starting, my X starts crashing whenever libGL fails to load a (32 bit) driver. Since I hate crashes of X and I don't think Valve/Steam will behave like a proper distribution soon (rebuilds versus current Debian Testing, since they base their Steam OS off that), I need a radeonsi which carries its own LLVM within and doesn't care about what the runtime sets. This means linking Mesa statically.

v1 → v2: Move logic to configure.ac

Signed-off-by: Kai Wasserbäch
---
Dear Emil,
I hope this is the right place for adding the two additional modules. If you accept this patch, please push it for me, I don't have commit access.
Cheers,
Kai

 configure.ac | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/configure.ac b/configure.ac
index 4e4d761..b4920ba 100644
--- a/configure.ac
+++ b/configure.ac
@@ -1658,6 +1658,13 @@ if test "x$enable_gallium_llvm" = xyes; then
 if $LLVM_CONFIG --components | grep -qw 'option'; then
 LLVM_COMPONENTS="${LLVM_COMPONENTS} option"
 fi
+# Current OpenCL/Clover and LLVM 3.5 require ObjCARCOpts and ProfileData
+if $LLVM_CONFIG --components | grep -qw 'objcarcopts'; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} objcarcopts"
+fi
+if $LLVM_CONFIG --components | grep -qw 'profiledata'; then
+LLVM_COMPONENTS="${LLVM_COMPONENTS} profiledata"
+fi
 fi
 DEFINES="${DEFINES} -DHAVE_LLVM=0x0$LLVM_VERSION_INT -DLLVM_VERSION_PATCH=$LLVM_VERSION_PATCH"
 MESA_LLVM=1
--
2.0.0.rc2
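The configure.ac hunk above only appends a component when it shows up as a whole word in the `llvm-config --components` output. As a rough illustration of that whole-word test (and why a plain substring match would be wrong), here is a small C sketch. The function name is hypothetical, and it simplifies `grep -w` by treating only spaces and newlines as word separators.

```c
#include <stdbool.h>
#include <string.h>

/* Sketch only: returns true if `name` occurs in `list` as a complete
 * space- or newline-separated token. A simplification of `grep -qw`,
 * which also treats other non-word characters as boundaries. */
static bool
has_component(const char *list, const char *name)
{
   size_t n = strlen(name);
   const char *p = list;

   if (n == 0)
      return false;

   while ((p = strstr(p, name)) != NULL) {
      bool starts = (p == list) || p[-1] == ' ' || p[-1] == '\n';
      bool ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\n';
      if (starts && ends)
         return true;
      p++;   /* keep scanning after a partial match */
   }
   return false;
}
```

For example, `has_component("core option objcarcopts", "option")` matches, while `has_component("core optional", "option")` does not, so an older LLVM that lacks a component is simply left alone.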
[Mesa-dev] [Bug 78914] New: Front/Backfaces do not cover the same pixels when rasterized
https://bugs.freedesktop.org/show_bug.cgi?id=78914

Priority: medium
Bug ID: 78914
Assignee: mesa-dev@lists.freedesktop.org
Summary: Front/Backfaces do not cover the same pixels when rasterized
Severity: normal
Classification: Unclassified
OS: All
Reporter: florianl...@gmail.com
Hardware: Other
Status: NEW
Version: 10.1
Component: Mesa core
Product: Mesa

When trying to run my GLSL raycaster with Mesa/llvmpipe, I noticed artifacts caused by wrong ray start/end positions. I render the ray start/end positions as boxes, where the front faces give the start position and the back faces give the end position. The problem seems to be that the front and back faces do not cover exactly the same pixels in the framebuffer, even when they share the same edges. On NVidia/ATI cards with the native driver, the back faces cover the same pixels as the front faces.

This can be reproduced by rendering a triangle with culling turned off and blending turned on. When rotating the triangle, one can see the artifacts on the borders of the triangle.

--
You are receiving this mail because:
You are the assignee for the bug.
[Mesa-dev] [PATCH] docs: update the prerequisites section
SCons is required for Windows. Add links to flex/bison for Windows. Reorder items and improve formatting. --- docs/install.html | 15 --- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/docs/install.html b/docs/install.html index 5061ede..f12425f 100644 --- a/docs/install.html +++ b/docs/install.html @@ -34,16 +34,25 @@ 1.1 General +http://www.python.org/";>Python - Python is required. +Version 2.6.4 or later should work. + + +http://www.scons.org/";>SCons is required for building on +Windows and optional for Linux (it's an alternative to autoconf/automake.) + + lex / yacc - for building the GLSL compiler. + + On Linux systems, flex and bison are used. Versions 2.5.35 and 2.4.1, respectively, (or later) should work. On Windows with MinGW, install flex and bison with: mingw-get install msys-flex msys-bison - -python - Python is needed for building the Gallium components. -Version 2.6.4 or later should work. +For MSVC on Windows, you can find flex/bison programs on the +ftp://ftp.freedesktop.org/pub/mesa/windows-utils/";>Mesa ftp site. -- 1.7.10.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.
You are complicating it. If we followed the specification to the letter, the driver would have to advertise OpenGL 1.1 instead of 2.1. The fact that r300 cannot filter floating-point textures is documented by the vendor, and game developers (especially those who targeted D3D9) knew about it.

For OpenGL ES, I propose a simpler solution:
- don't touch ARB_texture_float at all
- add OES_texture_float to gl_extensions
- add OES_texture_float_linear to gl_extensions
- define OES_texture_half_float as o(OES_texture_float)
- define OES_texture_half_float_linear as o(OES_texture_float_linear)

Then, drivers can enable the extensions as they see fit.

Marek

On Mon, May 19, 2014 at 8:34 AM, Rogovin, Kevin wrote:
> Hi,
>
> Each of the four extensions is right now set to be advertised if and only
> if a GL context would advertise GL_ARB_texture_float:
>
> { "GL_OES_texture_float", o(ARB_texture_float), ES2, 2005 },
> { "GL_OES_texture_half_float", o(ARB_texture_float), ES2, 2005 },
> { "GL_OES_texture_float_linear", o(ARB_texture_float), ES2, 2005 },
> { "GL_OES_texture_half_float_linear", o(ARB_texture_float), ES2, 2005 },
>
> From my interpretation of ARB_texture_float, that extension requires both
> 16-bit and 32-bit textures and the ability to filter such textures linearly.
> Did I misunderstand the specification? If I got the specification correct,
> then the r300 should not be advertising any of the extensions, for otherwise
> it would be advertising GL_ARB_texture_float.
>
> However, the r300 does give an example of the ability to support some of the
> OES extensions but not all. Previously Matt asked if there was an example or
> need, and I thought not. It turns out I was wrong and there is a need, at
> least for the r300. Supporting that granularity is going to be a bigger patch
> since it would require changing the data structure struct gl_extensions to
> have four entries and in turn additional logic to combine them to
> GL_ARB_texture_float.
> The correct, if more laborious, way to do it would be to remove
> ARB_texture_float from gl_extensions, add a GLboolean for each of the 4 OES
> extensions, change each driver to fill them correctly, and then add logic
> when creating the extension string(s) to advertise GL_ARB_texture_float only
> if all 4 OES booleans are TRUE. We could instead just add the 4 OES booleans
> and have additional logic in mesa/main set each of them to TRUE if
> ARB_texture_float is true. The latter solution, though easier, is less clean
> and begging for trouble later. Regardless, let's first get this patch as-is
> into Mesa, then do the "right" thing to allow a backend to support a subset
> of the OES extensions without needing to support the ARB extension.
>
> -Kevin
>
> From: Marek Olšák [mar...@gmail.com]
> Sent: Friday, May 16, 2014 4:33 PM
> To: Rogovin, Kevin
> Cc: mesa-dev@lists.freedesktop.org
> Subject: Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and
> GL_OES_texture_half_float.
>
> Sorry, I meant the linear filtering extensions.
>
> Marek
>
> On Fri, May 16, 2014 at 3:31 PM, Marek Olšák wrote:
>> Hi Kevin,
>>
>> r300g doesn't support filtering of floating-point textures, so the
>> extension shouldn't be advertised there.
>>
>> Marek
>>
>> On Wed, May 7, 2014 at 1:18 PM, Kevin Rogovin wrote:
>>> Add support for GLES2 extensions for floating point and half
>>> floating point textures (GL_OES_texture_float, GL_OES_texture_half_float,
>>> GL_OES_texture_float_linear and GL_OES_texture_half_float_linear).
>>> >>> --- >>> src/mesa/main/extensions.c | 12 +- >>> src/mesa/main/glformats.c | 25 >>> src/mesa/main/pack.c | 17 + >>> src/mesa/main/teximage.c | 59 >>> ++ >>> 4 files changed, 112 insertions(+), 1 deletion(-) >>> >>> diff --git a/src/mesa/main/extensions.c b/src/mesa/main/extensions.c >>> index c2ff7e3..e39f65e 100644 >>> --- a/src/mesa/main/extensions.c >>> +++ b/src/mesa/main/extensions.c >>> @@ -301,7 +301,17 @@ static const struct extension extension_table[] = { >>> { "GL_OES_texture_mirrored_repeat", o(dummy_true), >>> ES1, 2005 }, >>> { "GL_OES_texture_npot", >>> o(ARB_texture_non_power_of_two), ES1 | ES2, 2005 }, >>> { "GL_OES_vertex_array_object", o(dummy_true), >>> ES1 | ES2, 2010 }, >>> - >>> + /* >>> +* TODO: >>> +* - rather than have an all or nothing approach for floating point >>> textures, >>> +*allow for driver to specify what parts of floating point texture >>> functionality >>> +*is suppo
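Marek's two-flag proposal in the message above can be sketched in a few lines of C. This is a cut-down illustration, not Mesa's code: `struct ext_flags`, `ext_table` and `ext_advertised` are hypothetical stand-ins for `struct gl_extensions`, `extension_table` and the extension-string logic, and only the `o()` offset idiom and the extension names come from the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down stand-in for Mesa's struct gl_extensions. */
struct ext_flags {
   bool OES_texture_float;
   bool OES_texture_float_linear;
};

struct ext_entry {
   const char *name;
   size_t offset;   /* offset of the controlling flag in struct ext_flags */
};

#define o(x) offsetof(struct ext_flags, x)

/* The half-float entries reuse the float flags, as proposed in the thread. */
static const struct ext_entry ext_table[] = {
   { "GL_OES_texture_float",             o(OES_texture_float) },
   { "GL_OES_texture_half_float",        o(OES_texture_float) },
   { "GL_OES_texture_float_linear",      o(OES_texture_float_linear) },
   { "GL_OES_texture_half_float_linear", o(OES_texture_float_linear) },
};

/* Returns true if the driver-set flag behind `name` is enabled. */
static bool
ext_advertised(const struct ext_flags *flags, const char *name)
{
   size_t i;

   for (i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
      if (strcmp(ext_table[i].name, name) == 0)
         return *(const bool *)((const char *)flags + ext_table[i].offset);
   }
   return false;
}
```

Under this layout, a driver in r300's position would set `OES_texture_float` and leave `OES_texture_float_linear` unset, which exposes both texture-format extensions but neither filtering extension. The trade-off Kevin raises remains: two flags cannot express a driver that supports 32-bit but not 16-bit float textures, or the reverse.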
[Mesa-dev] [Bug 78892] New: configure: error: Could not find clang internal header stddef.h in /usr/lib64/llvm/clang/3.4 Use --with-clang-libdir to specify the correct path to the clang libraries.
https://bugs.freedesktop.org/show_bug.cgi?id=78892 Priority: medium Bug ID: 78892 Assignee: mesa-dev@lists.freedesktop.org Summary: configure: error: Could not find clang internal header stddef.h in /usr/lib64/llvm/clang/3.4 Use --with-clang-libdir to specify the correct path to the clang libraries. Severity: blocker Classification: Unclassified OS: Linux (All) Reporter: v...@freedesktop.org Hardware: x86-64 (AMD64) Status: NEW Version: git Component: Other Product: Mesa mesa: 9e74de884a0595e577ebdfb7c7c13f4fd4d4dff5 (master 10.3.0-devel) configure error on Fedora 21. stddef.h is located at /usr/lib/clang/3.4/include/stddef.h. $ ./autogen.sh --enable-opencl [...] checking for llvm-config... /usr/bin/llvm-config configure: error: Could not find clang internal header stddef.h in /usr/lib64/llvm/clang/3.4 Use --with-clang-libdir to specify the correct path to the clang libraries. -- You are receiving this mail because: You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] define GL_OES_standard_derivatives if extension is supported
From: Kevin Rogovin Define the macro GL_OES_standard_derivatives as 1 if the extension GL_OES_standard_derivatives is supported. --- src/glsl/glcpp/glcpp-parse.y | 2 ++ 1 file changed, 2 insertions(+) diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y index 9887583..83f6f46 100644 --- a/src/glsl/glcpp/glcpp-parse.y +++ b/src/glsl/glcpp/glcpp-parse.y @@ -2067,6 +2067,8 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t *parser, intmax_t versio if (extensions != NULL) { if (extensions->OES_EGL_image_external) add_builtin_define(parser, "GL_OES_EGL_image_external", 1); + if (extensions->OES_standard_derivatives) + add_builtin_define(parser, "GL_OES_standard_derivatives", 1); } } else { add_builtin_define(parser, "GL_ARB_draw_buffers", 1); -- 1.8.1.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
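The effect of the patch above is that an ES shader can test `#ifdef GL_OES_standard_derivatives` and get a truthful answer. A minimal C sketch of the same "define the macro only when the flag is set" pattern follows; the struct and function names are hypothetical and merely mimic the shape of the glcpp code, they are not Mesa's API.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical subset of the extension flags the preprocessor consults. */
struct es_ext_sketch {
   bool OES_EGL_image_external;
   bool OES_standard_derivatives;
};

/* Writes the names of the macros that would be predefined to 1 into `out`
 * (at most `cap` entries) and returns how many were written. A NULL `ext`
 * means no extension information is available, so nothing is defined. */
static size_t
collect_builtin_defines(const struct es_ext_sketch *ext,
                        const char **out, size_t cap)
{
   size_t n = 0;

   if (ext == NULL)
      return 0;

   if (ext->OES_EGL_image_external && n < cap)
      out[n++] = "GL_OES_EGL_image_external";
   if (ext->OES_standard_derivatives && n < cap)
      out[n++] = "GL_OES_standard_derivatives";

   return n;
}
```

A driver that sets only `OES_standard_derivatives` would yield exactly one predefined macro, `GL_OES_standard_derivatives`.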
[Mesa-dev] [Bug 78888] test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
https://bugs.freedesktop.org/show_bug.cgi?id=78888

Vinson Lee changed:

What       |Removed |Added
Status     |NEW     |RESOLVED
Resolution |---     |FIXED

--- Comment #1 from Vinson Lee ---
commit 9e74de884a0595e577ebdfb7c7c13f4fd4d4dff5
Author: Vinson Lee
Date: Mon May 19 00:39:12 2014 -0700

i965: Rename brw_disasm to brw_disassemble_inst.

Fixes build error introduced with commit 4b04152db055babb8b06929a0c9ebea5c7f4fb92.

CC test_eu_compact.o
test_eu_compact.c: In function ‘test_compact_instruction’:
test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
brw_disasm(stderr, &src, brw->gen, false);
^

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78888
Signed-off-by: Vinson Lee
Re: [Mesa-dev] [PATCH] i965: rename brw_disasm to brw_disassemble_inst in test_eu_compact
On Mon, May 19, 2014 at 10:31:56AM +0300, Tapani Pälli wrote:
> (forgotten from commit 4b04152d)
>
> Signed-off-by: Tapani Pälli
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=7
> ---
> src/mesa/drivers/dri/i965/test_eu_compact.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Reviewed-by: Topi Pohjolainen

>
> diff --git a/src/mesa/drivers/dri/i965/test_eu_compact.c
> b/src/mesa/drivers/dri/i965/test_eu_compact.c
> index 8713918..231487d 100644
> --- a/src/mesa/drivers/dri/i965/test_eu_compact.c
> +++ b/src/mesa/drivers/dri/i965/test_eu_compact.c
> @@ -51,7 +51,7 @@ test_compact_instruction(struct brw_compile *p, struct brw_instruction src)
>    if (memcmp(&unchanged, &dst, sizeof(dst))) {
>    fprintf(stderr, "Failed to compact, but dst changed\n");
>    fprintf(stderr, " Instruction: ");
> - brw_disasm(stderr, &src, brw->gen, false);
> + brw_disassemble_inst(stderr, &src, brw->gen, false);
>    return false;
>    }
> }
> --
> 1.8.3.1
[Mesa-dev] [PATCH] i965: rename brw_disasm to brw_disassemble_inst in test_eu_compact
(forgotten from commit 4b04152d) Signed-off-by: Tapani Pälli Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=7 --- src/mesa/drivers/dri/i965/test_eu_compact.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/test_eu_compact.c b/src/mesa/drivers/dri/i965/test_eu_compact.c index 8713918..231487d 100644 --- a/src/mesa/drivers/dri/i965/test_eu_compact.c +++ b/src/mesa/drivers/dri/i965/test_eu_compact.c @@ -51,7 +51,7 @@ test_compact_instruction(struct brw_compile *p, struct brw_instruction src) if (memcmp(&unchanged, &dst, sizeof(dst))) { fprintf(stderr, "Failed to compact, but dst changed\n"); fprintf(stderr, " Instruction: "); -brw_disasm(stderr, &src, brw->gen, false); +brw_disassemble_inst(stderr, &src, brw->gen, false); return false; } } -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Expose GL_OES_texture_float and GL_OES_texture_half_float.
Hi,

> It should be possible to adapt some of the existing float texture tests
> to run on ES mode without too much effort.

Oh dear, the test makes the GL API convert between 16 and 32 bit float formats. It also does not appear to test filtering. Would it be prudent to write 4 tests, one for each extension, or to fold them into one?

-Kevin
[Mesa-dev] [Bug 78888] New: test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
https://bugs.freedesktop.org/show_bug.cgi?id=78888

Priority: medium
Bug ID: 78888
Keywords: regression
CC: kenn...@whitecape.org
Assignee: mesa-dev@lists.freedesktop.org
Summary: test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
Severity: blocker
Classification: Unclassified
OS: Linux (All)
Reporter: v...@freedesktop.org
Hardware: x86-64 (AMD64)
Status: NEW
Version: git
Component: Mesa core
Product: Mesa

mesa: 13edd5f6160fce73369afbbf937b5e7ef646a4cc (master 10.3.0-devel)

$ make check
[...]
CC test_eu_compact.o
test_eu_compact.c: In function ‘test_compact_instruction’:
test_eu_compact.c:54:3: error: implicit declaration of function ‘brw_disasm’ [-Werror=implicit-function-declaration]
brw_disasm(stderr, &src, brw->gen, false);
^
[Mesa-dev] [PATCH] i965/fbo: Only try stencil meta blits on gen >= 8
I don't have an ILK at hand but the fix should be trivial. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78872 Cc: "10.2" Signed-off-by: Topi Pohjolainen --- src/mesa/drivers/dri/i965/intel_fbo.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/intel_fbo.c b/src/mesa/drivers/dri/i965/intel_fbo.c index 5ff4263..6c99de9 100644 --- a/src/mesa/drivers/dri/i965/intel_fbo.c +++ b/src/mesa/drivers/dri/i965/intel_fbo.c @@ -865,6 +865,8 @@ intel_blit_framebuffer(struct gl_context *ctx, GLint dstX0, GLint dstY0, GLint dstX1, GLint dstY1, GLbitfield mask, GLenum filter) { + struct brw_context *brw = brw_context(ctx); + /* Page 679 of OpenGL 4.4 spec says: *"Added BlitFramebuffer to commands affected by conditional rendering in * section 10.10 (Bug 9562)." @@ -872,14 +874,14 @@ intel_blit_framebuffer(struct gl_context *ctx, if (!_mesa_check_conditional_render(ctx)) return; - mask = brw_blorp_framebuffer(brw_context(ctx), + mask = brw_blorp_framebuffer(brw, srcX0, srcY0, srcX1, srcY1, dstX0, dstY0, dstX1, dstY1, mask, filter); if (mask == 0x0) return; - if (mask & GL_STENCIL_BUFFER_BIT) { + if (brw->gen >= 8 && (mask & GL_STENCIL_BUFFER_BIT)) { brw_meta_fbo_stencil_blit(brw_context(ctx), srcX0, srcY0, srcX1, srcY1, dstX0, dstY0, dstX1, dstY1); -- 1.8.3.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
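The gating added by the patch above boils down to one predicate: after blorp has handled what it can, the meta stencil blit is tried only if the stencil bit is still pending and the hardware generation is at least 8. A minimal sketch of that predicate, with a hypothetical function name and a locally defined constant that mirrors the value of `GL_STENCIL_BUFFER_BIT`:

```c
#include <stdbool.h>

/* Same value as GL_STENCIL_BUFFER_BIT; defined locally so the sketch does
 * not depend on the GL headers. */
#define SKETCH_STENCIL_BUFFER_BIT 0x00000400u

/* Sketch only: `remaining_mask` is the set of buffer bits that earlier
 * blit paths left unhandled, `gen` is the hardware generation. */
static bool
use_meta_stencil_blit(int gen, unsigned remaining_mask)
{
   return gen >= 8 && (remaining_mask & SKETCH_STENCIL_BUFFER_BIT) != 0;
}
```

On an older generation (for instance gen 5, which is what ILK refers to) the predicate is false even when the stencil bit is pending, so the blit falls through to whatever path follows in `intel_blit_framebuffer`.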
[Mesa-dev] [Bug 78773] Doom3 BFG doesnt start
https://bugs.freedesktop.org/show_bug.cgi?id=78773

--- Comment #4 from Tapani Pälli ---
Jan, could you please add the full log output from Doom 3 when you run it?
[Mesa-dev] [Bug 78773] Doom3 BFG doesnt start
https://bugs.freedesktop.org/show_bug.cgi?id=78773

Tapani Pälli changed:

What |Removed |Added
CC   |        |lem...@gmail.com

--- Comment #3 from Tapani Pälli ---
Is there anything different in the BFG edition other than more content, or is the engine the same? I've just tested regular Doom3 with today's Mesa master on Intel IVB and everything works fine; the log also says "...using GL_ARB_multitexture".
Re: [Mesa-dev] [PATCH 05/12] gallivm: Use LLVM global context.
On 19.05.2014 15:03, Mathias Fröhlich wrote:
>
> I tried to get my local llvm install again to a point where I can see
> backtrace information, but still failed to get valgrind/massif to print
> these nice backtraces. All of the llvm addresses are not resolved so far.

You may want to try some or all of these parameters for LLVM's configure:

'--enable-optimized'
'--with-optimize-option=-fno-omit-frame-pointer -O2 [...]'
'--enable-assertions'
'--enable-debug-runtime'
'--enable-debug-symbols'
'CC=gcc'
'CXX=g++'

--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer