Re: [Mesa-dev] [PATCH 01/20] mesa: fix and simplify resource query for arrays
On Thu, 2015-07-30 at 14:21 +0300, Tapani Pälli wrote: On 07/30/2015 12:26 PM, Timothy Arceri wrote: On Thu, 2015-07-30 at 11:07 +0300, Tapani Pälli wrote: On 07/30/2015 08:56 AM, Tapani Pälli wrote: On 07/30/2015 08:50 AM, Timothy Arceri wrote: On 30 July 2015 3:27:41 pm AEST, Tapani Pälli tapani.pa...@intel.com wrote: On 07/30/2015 08:25 AM, Tapani Pälli wrote: Hi Timothy; Would it be OK for you to wait a bit with these 3 first patches? I'm currently going through failing PIQ tests and I have a 1 line fix to valid_program_resource_index_name() that together with your patch glsl: set stage flag for structs and arrays in resource list will fix all of the failing ES31 -CTS.program_interface_query.uniform tests. Frankly I don't see the value of such a big refactor necessary for this. The value is in removing almost 200 lines of complicated code trying to solve the same problem in different ways. How its this not a good thing? Also once this is done we can also work towards removing UniformHash from Mesa. Sure, I agree that removing lots of code is a good thing. I just don't quite yet understand how some of this will work, many of the removed checks were there for a good reason (faling tests!). Just to make sure, did you run Piglit and conformance without regressions? Anyway, I will read through your changes and think about this more. Replying myself, no regressions observed :) Yep :) Tested as you can see. I didn't expect to fix anything when I made the change it was just a tidy up before getting the AoA tests to pass. The fixes were a nice bonus. Yep, it seems that parse_program_resource_name() tackles the cases. I will do some further testing, particularly on the 'gl_VertexIDMESA' special case and review the patches during today or tomorrow. Yeah parse_program_resource_name() is a bit more robust than the other functions and by using it we get the index for free :) The old code required the gl_VertexIDMESA special case because it was comparing the name with the the ir var name in get_matching_index(). _mesa_program_resource_name() already handles the special case for us so we can just use the name it returns in _mesa_program_resource_find_name() for our compares. Yep, I gdb stepped these through to make sure :) Sorry about the paranoia before, this has been a bit fragile area and would not like to break it. Not a problem. I would rather you be paranoid as it means I'm less likely to break something. Hopefully this is a step towards being less fragile. Woops, make it ES31-CTS.program_interface_query.uniform-types, not ES31-CTS.program_interface_query.uniform. It looks to me that most tests fail just because of not figuring out the correct index for the resource and that can be fixed by a valid_program_resource_index_name change. On 07/29/2015 04:56 PM, Timothy Arceri wrote: This removes the need for multiple functions designed to validate an array subscript and replaces them with a call to a single function. The change also means that validation is now only done once and the index is retrived at the same time, as a result the getUniformLocation code can be simplified saving an extra hash table lookup (and yet another validation call). This chage also fixes some tests in: ES31-CTS.program_interface_query.uniform V3: rebase on subroutines, and move the resource index array == 0 check into _mesa_GetProgramResourceIndex() to simplify things further V2: Fix bounds checks for program input/output, split unrelated comment fix and _mesa_get_uniform_location() removal into their own patch. Cc: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/program_resource.c | 14 ++-- src/mesa/main/shader_query.cpp | 174 +-- src/mesa/main/shaderapi.c| 2 +- src/mesa/main/shaderapi.h| 3 +- src/mesa/main/uniforms.c | 5 +- 5 files changed, 106 insertions(+), 92 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index 641ef22..fdbd5b3 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -229,6 +229,7 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface,
Re: [Mesa-dev] [PATCH v2 6/6] i965: Add a debug option for spilling everything in vec4 code
On Thu, 2015-07-30 at 15:58 +0300, Francisco Jerez wrote: Iago Toral Quiroga ito...@igalia.com writes: --- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- src/mesa/drivers/dri/i965/intel_debug.c | 3 ++- src/mesa/drivers/dri/i965/intel_debug.h | 5 +++-- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp index f25f2ec..714248a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp @@ -634,7 +634,7 @@ fs_visitor::assign_regs(bool allow_spilling) } /* Debug of register spilling: Go spill everything. */ - if (unlikely(INTEL_DEBUG DEBUG_SPILL)) { + if (unlikely(INTEL_DEBUG DEBUG_SPILL_FS)) { int reg = choose_spill_reg(g); if (reg != -1) { diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 53270fb..6cf5ede 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1814,7 +1814,7 @@ vec4_visitor::run(gl_clip_plane *clip_planes) setup_payload(); - if (false) { + if (unlikely(INTEL_DEBUG DEBUG_SPILL_VEC4)) { /* Debug of register spilling: Go spill everything. */ const int grf_count = alloc.count; float spill_costs[alloc.count]; diff --git a/src/mesa/drivers/dri/i965/intel_debug.c b/src/mesa/drivers/dri/i965/intel_debug.c index a077731..8d34349 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.c +++ b/src/mesa/drivers/dri/i965/intel_debug.c @@ -69,7 +69,8 @@ static const struct dri_debug_control debug_control[] = { { ann, DEBUG_ANNOTATION }, { no8, DEBUG_NO8 }, { vec4vs, DEBUG_VEC4VS }, - { spill, DEBUG_SPILL }, + { spill_frag, DEBUG_SPILL_FS }, How about we call this spill_fs instead? The flag doesn't only affect fragment shaders, AFAICT it will cause all programs compiled with the FS back-end [F for fast ;)] to spill everything. With that fixed: that was my first choice, but if we do that it seems that driParseDebugString will also mark INTEL_DEBUG=fs as enabled. It seems as if this function checks if any of the string options is present in the provided string to enable them, so we can't really use an option name where any substring of it is included as a separate option :-( Reviewed-by: Francisco Jerez curroje...@riseup.net + { spill_vec4, DEBUG_SPILL_VEC4 }, { cs, DEBUG_CS }, { NULL,0 } }; diff --git a/src/mesa/drivers/dri/i965/intel_debug.h b/src/mesa/drivers/dri/i965/intel_debug.h index 4689492..b7d0c82 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.h +++ b/src/mesa/drivers/dri/i965/intel_debug.h @@ -64,8 +64,9 @@ extern uint64_t INTEL_DEBUG; #define DEBUG_ANNOTATION (1ull 28) #define DEBUG_NO8 (1ull 29) #define DEBUG_VEC4VS (1ull 30) -#define DEBUG_SPILL (1ull 31) -#define DEBUG_CS (1ull 32) +#define DEBUG_SPILL_FS(1ull 31) +#define DEBUG_SPILL_VEC4 (1ull 32) +#define DEBUG_CS (1ull 33) #ifdef HAVE_ANDROID_PLATFORM #define LOG_TAG INTEL-MESA -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 5/6] i965/vec4: Adjust spilling cost for consecutive instructions
Iago Toral Quiroga ito...@igalia.com writes: Previous patches made it so that we do not need to unspill the same vgrf with every instruction as long as these instructions come right after the register was spilled or unspilled. This means that actually spilling the register is now cheaper in these scenarios, so adjust the spilling cost calculation accordingly. --- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 61 +++--- 1 file changed, 55 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index 9a69fac..652a68c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -269,6 +269,21 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) { float loop_scale = 1.0; + /* If a spilled register is written (or unspilled) and then immediately read, +* we will avoid to emit scratch reads for that register for as long as +* consecutive instructions keep reading that register. This makes spilling +* the register cheaper, so we need to account for this here. Since any +* instruction can only have at most 3 src operands we can only have up to +* 3 cached registers (i.e. registers that we do not need to unspill if they +* are read in a consecutive instruction) at any given time. +* +* We keep two lists, cached[0] has the registers that are cached before +* the current instruction, and cached[1] has the registers that are +* cached after the current instruction. We copy cached[1] into cached[0] +* after processing each instruction. +*/ + int cached[2][3] = { -1, -1, -1, -1, -1, -1 }; + Wouldn't it be easier to drop the array and just compare the sources of each instruction with the previous one? for (unsigned i = 0; i this-alloc.count; i++) { spill_costs[i] = 0.0; no_spill[i] = alloc.sizes[i] != 1; @@ -279,18 +294,52 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) * loops run 10 times. */ foreach_block_and_inst(block, vec4_instruction, inst, cfg) { + memset(cached[1], -1, sizeof(cached[1])); + for (unsigned int i = 0; i 3; i++) { - if (inst-src[i].file == GRF) { - spill_costs[inst-src[i].reg] += loop_scale; - assert(!inst-src[i].reladdr); - } + if (inst-src[i].file == GRF) { +/* Don't increase spilling cost if src[i] isn't going + * to be unspilled + */ +int slot; +for (slot = 0; slot 3; slot++) { + if (inst-src[i].reg == cached[0][slot]) I guess this should take the reg_offset and writemask of the cached register into account. I suggest you use backend_reg::in_range() to check if there is overlap. + break; +} +if (slot == 3) { + /* Not cached, we will unspill src[i] */ + spill_costs[inst-src[i].reg] += loop_scale; + assert(!inst-src[i].reladdr); +} else { + /* It is cached. Since it is read in this instruction it +* stays cached for the next. +*/ + cached[1][slot] = inst-src[i].reg; +} + } } if (inst-dst.file == GRF) { - spill_costs[inst-dst.reg] += loop_scale; - assert(!inst-dst.reladdr); + int slot; + /* Mark this reg as cached because if the next instruction reads it + * we won't unspill it. + */ + for (slot = 0; slot 3; slot++) { +if (cached[1][slot] == -1) { + cached[1][slot] = inst-dst.reg; + break; +} + } + assert(slot 3); + spill_costs[inst-dst.reg] += loop_scale; + assert(!inst-dst.reladdr); } + /* Cached registers after this instruction are the cached registers + * for the next instruction + */ + memcpy(cached[0], cached[1], sizeof(cached[0])); + switch (inst-opcode) { case BRW_OPCODE_DO: -- 1.9.1 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] gallivm: Fix GCC unused-variable warning.
On 07/29/2015 09:19 PM, Vinson Lee wrote: lp_bld_tgsi_soa.c: In function 'lp_emit_immediate_soa': lp_bld_tgsi_soa.c:3065:18: warning: unused variable 'size' [-Wunused-variable] const uint size = imm-Immediate.NrTokens - 1; ^ Signed-off-by: Vinson Lee v...@freedesktop.org --- src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c index 55f606f..fae604e 100644 --- a/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c +++ b/src/gallium/auxiliary/gallivm/lp_bld_tgsi_soa.c @@ -3062,8 +3062,7 @@ void lp_emit_immediate_soa( } else { /* simply copy the immediate values into the next immediates[] slot */ unsigned i; - const uint size = imm-Immediate.NrTokens - 1; - assert(size = 4); + assert(imm-Immediate.NrTokens - 1 = 4); assert(bld-num_immediates LP_MAX_INLINED_IMMEDIATES); for(i = 0; i 4; ++i ) Reviewed-by: Brian Paul bri...@vmware.com ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i965/fs: Fix regression with SIMD8 VS since b5f1a48e234d47b24df38cb562cffb8941d43795.
With num_direct_uniforms == 0 there's no space allocated in the param_size array for the one block of direct uniforms -- On the FS stage this would be a harmless no-op because it would simply re-set one of the param_size entries allocated for the sampler units to zero, but on the VS stage it has been reported to cause memory corruption followed by a crash -- Surprising how a full piglit run on Gen8 didn't catch it. Reported-by: Marta Lofstedt marta.lofst...@intel.com --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 722e4e7..9cb7b0d 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -182,7 +182,8 @@ fs_visitor::nir_setup_uniforms(nir_shader *shader) /* We split the uniform register file in half. The first half is * entirely direct uniforms. The second half is indirect. */ - param_size[0] = num_direct_uniforms; + if (num_direct_uniforms 0) + param_size[0] = num_direct_uniforms; if (shader-num_uniforms num_direct_uniforms) param_size[num_direct_uniforms] = shader-num_uniforms - num_direct_uniforms; -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 13.5/17 v2] glsl: Add precision information to ir_variable
We will need this later on when we implement proper support for precision qualifiers in the drivers and also to do link time checks for uniforms as indicated by the spec. This patch also adds compile-time checks for variables without precision information (currently, Mesa only checks that a default precision is set for floats in fragment shaders). As indicated by Ian, the addition of the precision information to ir_variable has been done using a bitfield and pahole to identify an available hole so that memory requirements for ir_variable stay the same. v2 (Ian): - Avoid if-ladders by defining arrays of supported sampler names and indexing into them with type-sampler_array + 2 * type-sampler_shadow - Make the code that selects the precision qualifier to use an utility function - Fix a typo --- src/glsl/ast_to_hir.cpp | 270 src/glsl/glsl_types.cpp | 4 + src/glsl/glsl_types.h | 12 +++ src/glsl/ir.h | 13 +++ 4 files changed, 279 insertions(+), 20 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 3ececb7..6311f40 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -2044,31 +2044,242 @@ ast_type_specifier::glsl_type(const char **name, return type; } -const glsl_type * -ast_fully_specified_type::glsl_type(const char **name, -struct _mesa_glsl_parse_state *state) const +/** + * From the OpenGL ES 3.0 spec, 4.5.4 Default Precision Qualifiers: + * + * The precision statement + * + *precision precision-qualifier type; + * + * can be used to establish a default precision qualifier. The type field can + * be either int or float or any of the sampler types, (...) If type is float, + * the directive applies to non-precision-qualified floating point type + * (scalar, vector, and matrix) declarations. If type is int, the directive + * applies to all non-precision-qualified integer type (scalar, vector, signed, + * and unsigned) declarations. + * + * We use the symbol table to keep the values of the default precisions for + * each 'type' in each scope and we use the 'type' string from the precision + * statement as key in the symbol table. When we want to retrieve the default + * precision associated with a given glsl_type we need to know the type string + * associated with it. This is what this function returns. + */ +static const char * +get_type_name_for_precision_qualifier(const glsl_type *type) { - const struct glsl_type *type = this-specifier-glsl_type(name, state); - - if (type == NULL) - return NULL; + switch (type-base_type) { + case GLSL_TYPE_FLOAT: + return float; + case GLSL_TYPE_UINT: + case GLSL_TYPE_INT: + return int; + case GLSL_TYPE_SAMPLER: { + const unsigned type_idx = + type-sampler_array + 2 * type-sampler_shadow; + assert(type_idx 4); + switch (type-sampler_type) { + case GLSL_TYPE_FLOAT: + switch (type-sampler_dimensionality) { + case GLSL_SAMPLER_DIM_1D: { +static const char *const names[4] = { + sampler1D, sampler1DArray, + sampler1DShadow, sampler1DArrayShadow +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_2D: { +static const char *const names[4] = { + sampler2D, sampler2DArray, + sampler2DShadow, sampler2DArrayShadow +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_3D: { +static const char *const names[4] = { + sampler3D, NULL, NULL, NULL +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_CUBE: { +static const char *const names[4] = { + samplerCube, samplerCubeArray, + samplerCubeShadow, samplerCubeArrayShadow +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_MS: { +static const char *const names[4] = { + sampler2DMS, sampler2DMSArray, NULL, NULL +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_RECT: { +static const char *const names[4] = { + samplerRect, NULL, samplerRectShadow, NULL +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_BUF: { +static const char *const names[4] = { + samplerBuffer, NULL, NULL, NULL +}; +return names[type_idx]; + } + case GLSL_SAMPLER_DIM_EXTERNAL: { +static const char *const names[4] = { + samplerExternalOES, NULL, NULL, NULL +}; +return names[type_idx]; + } + default: +unreachable(Unsupported sampler dimensionality); + } /* sampler float dimensionality */ + break; + case
[Mesa-dev] [PATCH 13/17 v2] glsl: Move the definition of precision_qualifier_allowed
We will need this to build later patches --- src/glsl/ast_to_hir.cpp | 71 - 1 file changed, 35 insertions(+), 36 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 789b2bc..3ececb7 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -1993,6 +1993,41 @@ process_array_type(YYLTYPE *loc, const glsl_type *base, return array_type; } +static bool +precision_qualifier_allowed(const glsl_type *type) +{ + /* Precision qualifiers apply to floating point, integer and sampler +* types. +* +* Section 4.5.2 (Precision Qualifiers) of the GLSL 1.30 spec says: +*Any floating point or any integer declaration can have the type +*preceded by one of these precision qualifiers [...] Literal +*constants do not have precision qualifiers. Neither do Boolean +*variables. +* +* Section 4.5 (Precision and Precision Qualifiers) of the GLSL 1.30 +* spec also says: +* +* Precision qualifiers are added for code portability with OpenGL +* ES, not for functionality. They have the same syntax as in OpenGL +* ES. +* +* Section 8 (Built-In Functions) of the GLSL ES 1.00 spec says: +* +* uniform lowp sampler2D sampler; +* highp vec2 coord; +* ... +* lowp vec4 col = texture2D (sampler, coord); +*// texture2D returns lowp +* +* From this, we infer that GLSL 1.30 (and later) should allow precision +* qualifiers on sampler types just like float and integer types. +*/ + return type-is_float() + || type-is_integer() + || type-is_record() + || type-is_sampler(); +} const glsl_type * ast_type_specifier::glsl_type(const char **name, @@ -3381,42 +3416,6 @@ validate_identifier(const char *identifier, YYLTYPE loc, } } -static bool -precision_qualifier_allowed(const glsl_type *type) -{ - /* Precision qualifiers apply to floating point, integer and sampler -* types. -* -* Section 4.5.2 (Precision Qualifiers) of the GLSL 1.30 spec says: -*Any floating point or any integer declaration can have the type -*preceded by one of these precision qualifiers [...] Literal -*constants do not have precision qualifiers. Neither do Boolean -*variables. -* -* Section 4.5 (Precision and Precision Qualifiers) of the GLSL 1.30 -* spec also says: -* -* Precision qualifiers are added for code portability with OpenGL -* ES, not for functionality. They have the same syntax as in OpenGL -* ES. -* -* Section 8 (Built-In Functions) of the GLSL ES 1.00 spec says: -* -* uniform lowp sampler2D sampler; -* highp vec2 coord; -* ... -* lowp vec4 col = texture2D (sampler, coord); -*// texture2D returns lowp -* -* From this, we infer that GLSL 1.30 (and later) should allow precision -* qualifiers on sampler types just like float and integer types. -*/ - return type-is_float() - || type-is_integer() - || type-is_record() - || type-is_sampler(); -} - ir_rvalue * ast_declarator_list::hir(exec_list *instructions, struct _mesa_glsl_parse_state *state) -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Allow setting GL_TEXTURE_COMPARE_MODE on a sampler object without ARB_shadow support (v2).
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Hi, Thanks for the quick feedback! I'll resend the patch with the requested fixes shortly. Stefan Am 2015-07-30 um 15:30 schrieb Brian Paul: Can you try to shorten the subject line to 75 chars or less? It should also start with mesa:. Perhaps: mesa: disable texture compare mode error checking for Wine On 07/30/2015 07:04 AM, Stefan Dösinger wrote: Version 2: Return GL_FALSE if ARB_shadow is unsupported instead of pretending to store the value as Please word wrap that line to 75 chars or less. suggested by Brian Paul. This fixes a GL error warning on r200 in Wine. The GL_ARB_sampler_objects extension does not specify a dependency on GL_ARB_shadow or GL_ARB_depth_texture for this value. Just set the value and don't do anything else. It won't matter without a depth texture being assigned anyway. This comment doesn't reflect what the code is really doing. The code is not setting any values. --- src/mesa/main/samplerobj.c | 28 ++-- 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c index 32180fb1..f5d2077 100644 --- a/src/mesa/main/samplerobj.c +++ b/src/mesa/main/samplerobj.c @@ -634,8 +634,12 @@ static GLuint set_sampler_compare_mode(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { + /* No state change, no error. GL_ARB_sampler_objects does not + * have a dependency on GL_ARB_shadow for these settings. They + * don't have any effect without shadow textures anyway. +* This matters for Wine on old GPUs, e.g. r200. */ Please put closing */ on next line. Actually, the comment could be simplified: /* If GL_ARB_shadow is not supported, don't report an error. The * sampler object extension spec isn't clear on this extension interaction. * Silences errors with Wine on older GPUs such as R200. */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareMode == param) return GL_FALSE; @@ -655,8 +659,12 @@ static GLuint set_sampler_compare_func(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { + /* No state change, no error. GL_ARB_sampler_objects does not + * have a dependency on GL_ARB_shadow for these settings. They + * don't have any effect without shadow textures anyway. +* This matters for Wine on old GPUs, e.g. r200. */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareFunc == param) return GL_FALSE; @@ -1342,13 +1350,9 @@ _mesa_GetSamplerParameteriv(GLuint sampler, GLenum pname, GLint *params) *params = IROUND(sampObj-LodBias); break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1431,13 +1435,9 @@ _mesa_GetSamplerParameterfv(GLuint sampler, GLenum pname, GLfloat *params) *params = sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1510,13 +1510,9 @@ _mesa_GetSamplerParameterIiv(GLuint sampler, GLenum pname, GLint *params) *params = (GLint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1589,13 +1585,9 @@ _mesa_GetSamplerParameterIuiv(GLuint sampler, GLenum pname, GLuint *params) *params = (GLuint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: -BEGIN PGP SIGNATURE- Version: GnuPG v2 iQIcBAEBAgAGBQJVuisUAAoJEN0/YqbEcdMwwl4P/Awl2F5yQyO0gmxzwwTOj4Cx 3yReuJmJj++rJkP8ZnYOD4ATOSwqfVFvwRvV5i2s/UhtqO5DDmS65IJ92qVLbgzH As7QnUdoIFGlSB5vZv9KljQJhRFhrllod6xKYWGY8VtE4f8imri6vs6/2TperDHv tkLF50UJX/JWo96aQDD+PuMPF02EcvpN1AJIvbqhswtm30T5a2QNRiiXFcEdoecQ KbKKhKdoivLZ57ltyeeI2aCBVWm/EfyqX+wS4aRYLWlj8e2dQVHfPqeVcw8vfNPB lIF8qDjzuX5kgy6KrtKbKKcMLMIVyzs+pWJAuCBkOlbvxEIm8zldlyh5ClGquU/o BBa45KtKotzHnddmKdDl7MQBh9J1zSygGGbiHCUi64JbtAi6Kco4qmI7RRKl0dVa
Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written
Iago Toral Quiroga ito...@igalia.com writes: When we have code such as this: mov vgrf1.0.x:F, vgrf2.:F mov vgrf3.0.x:F, vgrf1.:F ... mov vgrf3.0.x:F, vgrf1.:F And vgrf1 is chosen for spilling, we can emit this: mov vgrf1.0.x:F, vgrf2.:F gen4_scratch_write hw_reg0:F, vgrf1.:D, 22D mov vgrf3.0.x:F, vgrf1.:F ... gen4_scratch_read vgrf4.0.x:F, 22D mov vgrf3.0.x:F, vgrf4.:F Instead of this: mov vgrf1.0.x:F, vgrf2.:F gen4_scratch_write hw_reg0:F, vgrf1.:D, 22D gen4_scratch_read vgrf4.0.x:F, 22D mov vgrf3.0.x:F, vgrf4.:F ... gen4_scratch_read vgrf5.0.x:F, 22D mov vgrf3.0.x:F, vgrf5.:F And save one scratch read while still preserving the benefits of spilling the register. In general, we avoid emitting scratch reads for as long as the next instruction keeps reading the spilled register. This should not harm the benefit of spilling the register because gains for register allocation only come when we have chunks of program code where the register is alive but not really used (because these are the points where we could effectively use that register for another purpose if we spilled it), so as long as consecutive instructions use that register we can avoid the scratch reads without losing anything. --- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 37 +- 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index cff5406..fd56dae 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -340,11 +340,43 @@ vec4_visitor::spill_reg(int spill_reg_nr) unsigned int spill_offset = last_scratch++; /* Generate spill/unspill instructions for the objects being spilled. */ + vec4_instruction *spill_write_inst = NULL; foreach_block_and_inst(block, vec4_instruction, inst, cfg) { + /* We don't spill registers used for scratch */ + if (inst-opcode == SHADER_OPCODE_GEN4_SCRATCH_READ || + inst-opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE) + continue; + int scratch_reg = -1; + bool spill_reg_was_read = false; for (unsigned int i = 0; i 3; i++) { if (inst-src[i].file == GRF inst-src[i].reg == spill_reg_nr) { -if (scratch_reg == -1) { +if (!spill_reg_was_read) { + spill_reg_was_read = (!inst-predicate || + inst-opcode == BRW_OPCODE_SEL); +} + +/* If we are reading the spilled register right after writing + * to it we can skip the scratch read and use directly the + * register we used as source for the scratch write. For this + * to work we must check that: + * + * 1) The write is inconditional, that is, it is not predicated or + *it is a SEL. + * 2) All the channels that we read have been written in that + *last write instruction. + * + * We keep doing this for as long as the next instruction + * keeps reading the spilled register and break as soon as we + * find an instruction that doesn't. + */ +if (spill_write_inst +(!spill_write_inst-predicate || + spill_write_inst-opcode == BRW_OPCODE_SEL) +((brw_mask_for_swizzle(inst-src[i].swizzle) + ~spill_write_inst-dst.writemask) == 0)) { + scratch_reg = spill_write_inst-dst.reg; +} else if (scratch_reg == -1) { One suggestion: You could factor out the rather complex caching logic into a separate function (e.g. 'bool can_reuse_scratch_for_source(const vec4_instruction *, unsigned i, unsigned scratch_reg)'). The function would simply compare scratch_reg with the sources of the current instruction (up to src) and the sources and destination of the previous non-scratch_read/write instruction. If there's a match it would check that the regioning is compatible with the i-th source and return true in that case. This would have several benefits: - It would keep ::spill_reg() simple and the caching heuristic factored out. The only thing spill_reg() would still need to take care of is keep track of the last spilling temporary (e.g. as you're doing it now with the scratch_reg variable) and pass it to can_reuse_scratch_for_source(), but it would no longer be necessary to reset it to -1 when reuse is not possible. - It would allow you to use a single implementation of the caching policy to implement both spill_reg() and evaluate_spill_costs(), what would make sure that things don't go out of sync in the likely case that we want to change the caching heuristic in the future.
Re: [Mesa-dev] [PATCH v2 5/6] i965/vec4: Adjust spilling cost for consecutive instructions
Francisco Jerez curroje...@riseup.net writes: Iago Toral Quiroga ito...@igalia.com writes: Previous patches made it so that we do not need to unspill the same vgrf with every instruction as long as these instructions come right after the register was spilled or unspilled. This means that actually spilling the register is now cheaper in these scenarios, so adjust the spilling cost calculation accordingly. --- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 61 +++--- 1 file changed, 55 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index 9a69fac..652a68c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -269,6 +269,21 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) { float loop_scale = 1.0; + /* If a spilled register is written (or unspilled) and then immediately read, +* we will avoid to emit scratch reads for that register for as long as +* consecutive instructions keep reading that register. This makes spilling +* the register cheaper, so we need to account for this here. Since any +* instruction can only have at most 3 src operands we can only have up to +* 3 cached registers (i.e. registers that we do not need to unspill if they +* are read in a consecutive instruction) at any given time. +* +* We keep two lists, cached[0] has the registers that are cached before +* the current instruction, and cached[1] has the registers that are +* cached after the current instruction. We copy cached[1] into cached[0] +* after processing each instruction. +*/ + int cached[2][3] = { -1, -1, -1, -1, -1, -1 }; + Wouldn't it be easier to drop the array and just compare the sources of each instruction with the previous one? for (unsigned i = 0; i this-alloc.count; i++) { spill_costs[i] = 0.0; no_spill[i] = alloc.sizes[i] != 1; @@ -279,18 +294,52 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) * loops run 10 times. */ foreach_block_and_inst(block, vec4_instruction, inst, cfg) { + memset(cached[1], -1, sizeof(cached[1])); + for (unsigned int i = 0; i 3; i++) { - if (inst-src[i].file == GRF) { -spill_costs[inst-src[i].reg] += loop_scale; -assert(!inst-src[i].reladdr); - } + if (inst-src[i].file == GRF) { +/* Don't increase spilling cost if src[i] isn't going + * to be unspilled + */ +int slot; +for (slot = 0; slot 3; slot++) { + if (inst-src[i].reg == cached[0][slot]) I guess this should take the reg_offset and writemask of the cached register into account. I suggest you use backend_reg::in_range() to check if there is overlap. It also doesn't look like this handles the case in which the same register is unspilled several times for the same instruction correctly, as you've implemented it in PATCH 1. + break; +} +if (slot == 3) { + /* Not cached, we will unspill src[i] */ + spill_costs[inst-src[i].reg] += loop_scale; + assert(!inst-src[i].reladdr); +} else { + /* It is cached. Since it is read in this instruction it +* stays cached for the next. +*/ + cached[1][slot] = inst-src[i].reg; +} + } } if (inst-dst.file == GRF) { - spill_costs[inst-dst.reg] += loop_scale; - assert(!inst-dst.reladdr); + int slot; + /* Mark this reg as cached because if the next instruction reads it + * we won't unspill it. + */ + for (slot = 0; slot 3; slot++) { +if (cached[1][slot] == -1) { + cached[1][slot] = inst-dst.reg; + break; +} + } + assert(slot 3); + spill_costs[inst-dst.reg] += loop_scale; + assert(!inst-dst.reladdr); } + /* Cached registers after this instruction are the cached registers + * for the next instruction + */ + memcpy(cached[0], cached[1], sizeof(cached[0])); + switch (inst-opcode) { case BRW_OPCODE_DO: -- 1.9.1 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] Allow setting GL_TEXTURE_COMPARE_MODE on a sampler object without ARB_shadow support (v2).
Version 2: Return GL_FALSE if ARB_shadow is unsupported instead of pretending to store the value as suggested by Brian Paul. This fixes a GL error warning on r200 in Wine. The GL_ARB_sampler_objects extension does not specify a dependency on GL_ARB_shadow or GL_ARB_depth_texture for this value. Just set the value and don't do anything else. It won't matter without a depth texture being assigned anyway. --- src/mesa/main/samplerobj.c | 28 ++-- 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c index 32180fb1..f5d2077 100644 --- a/src/mesa/main/samplerobj.c +++ b/src/mesa/main/samplerobj.c @@ -634,8 +634,12 @@ static GLuint set_sampler_compare_mode(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { + /* No state change, no error. GL_ARB_sampler_objects does not +* have a dependency on GL_ARB_shadow for these settings. They +* don't have any effect without shadow textures anyway. +* This matters for Wine on old GPUs, e.g. r200. */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareMode == param) return GL_FALSE; @@ -655,8 +659,12 @@ static GLuint set_sampler_compare_func(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { + /* No state change, no error. GL_ARB_sampler_objects does not +* have a dependency on GL_ARB_shadow for these settings. They +* don't have any effect without shadow textures anyway. +* This matters for Wine on old GPUs, e.g. r200. */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareFunc == param) return GL_FALSE; @@ -1342,13 +1350,9 @@ _mesa_GetSamplerParameteriv(GLuint sampler, GLenum pname, GLint *params) *params = IROUND(sampObj-LodBias); break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1431,13 +1435,9 @@ _mesa_GetSamplerParameterfv(GLuint sampler, GLenum pname, GLfloat *params) *params = sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1510,13 +1510,9 @@ _mesa_GetSamplerParameterIiv(GLuint sampler, GLenum pname, GLint *params) *params = (GLint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1589,13 +1585,9 @@ _mesa_GetSamplerParameterIuiv(GLuint sampler, GLenum pname, GLuint *params) *params = (GLuint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/6] i965/vec4: Remove checks for reladdr when checking for spillable registers
Iago Toral Quiroga ito...@igalia.com writes: In theory, GRF array access should have been moved to scratch by the time we got here, so this should never happen. A full piglit run forcing spilling of all registers seems to confirm this. The FS backend does not seem to check for this either. --- src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp | 6 ++ 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index a9bf0d8..cff5406 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -282,15 +282,13 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) for (unsigned int i = 0; i 3; i++) { if (inst-src[i].file == GRF) { spill_costs[inst-src[i].reg] += loop_scale; -if (inst-src[i].reladdr) - no_spill[inst-src[i].reg] = true; + assert(!inst-src[i].reladdr); } } if (inst-dst.file == GRF) { spill_costs[inst-dst.reg] += loop_scale; - if (inst-dst.reladdr) -no_spill[inst-dst.reg] = true; I guess the intention behind these checks was to support unlowered indirect addressing of GRFs sometime in the future. Is there some reason why you want to remove them? That said, I don't think there would be any reason to force indirectly addressed VGRFs not to spill even if we implemented them, because if the VGRF is larger than one physical register it will already have been marked as no-spill by the previous loop at the top of this function, and if they are exactly one physical register the reladdr value is guaranteed to be zero so the indirect addressing is a no-op. How about we get rid of these checks as you have done, but leave no assertions behind? + assert(!inst-dst.reladdr); } switch (inst-opcode) { -- 1.9.1 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: Remove GL_ARB_sampler_object depth compare error checking.
Version 3: Simplify the code comment, word wrap commit description. Version 2: Return GL_FALSE if ARB_shadow is unsupported instead of pretending to store the value as suggested by Brian Paul. This fixes a GL error warning on r200 in Wine. The GL_ARB_sampler_objects extension does not specify a dependency on GL_ARB_shadow or GL_ARB_depth_texture for setting the depth texture compare mode and function. Silently ignore attempts to change these settings. They won't matter without a depth texture being assigned anyway. --- src/mesa/main/samplerobj.c | 28 ++-- 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c index 32180fb1..01cf584 100644 --- a/src/mesa/main/samplerobj.c +++ b/src/mesa/main/samplerobj.c @@ -634,8 +634,12 @@ static GLuint set_sampler_compare_mode(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { +/* If GL_ARB_shadow is not supported, don't report an error. The + * sampler object extension spec isn't clear on this extension interaction. + * Silences errors with Wine on older GPUs such as R200. + */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareMode == param) return GL_FALSE; @@ -655,8 +659,12 @@ static GLuint set_sampler_compare_func(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { +/* If GL_ARB_shadow is not supported, don't report an error. The + * sampler object extension spec isn't clear on this extension interaction. + * Silences errors with Wine on older GPUs such as R200. + */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareFunc == param) return GL_FALSE; @@ -1342,13 +1350,9 @@ _mesa_GetSamplerParameteriv(GLuint sampler, GLenum pname, GLint *params) *params = IROUND(sampObj-LodBias); break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1431,13 +1435,9 @@ _mesa_GetSamplerParameterfv(GLuint sampler, GLenum pname, GLfloat *params) *params = sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1510,13 +1510,9 @@ _mesa_GetSamplerParameterIiv(GLuint sampler, GLenum pname, GLint *params) *params = (GLint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1589,13 +1585,9 @@ _mesa_GetSamplerParameterIuiv(GLuint sampler, GLenum pname, GLuint *params) *params = (GLuint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: -- 2.3.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] V3 ARB_arrays_of_arrays GLSL ES
On Thu, 2015-07-30 at 14:16 +0300, Tapani Pälli wrote: patches 1,2,3 I'm going to push these so they get as much testing as possible. and patches for AoA size calculation 11,12,16 Reviewed-by: Tapani Pälli tapani.pa...@intel.com (will review some more later) That would be great thanks. On 07/29/2015 04:56 PM, Timothy Arceri wrote: If useful the series is in the 'gles8' branch of the repo here: https://github.com/tarceri/Mesa_arrays_of_arrays.git Changes in v3: - Rebased on the tessellation and subroutine changes - Atomic counter AoA now working including indirect indexing (previously I had broken piglit tests that were always passing as long as the shader compiled) - Included fix for atomic counters with binding other than 0 - Split the program interface query chages for easier review Changes in v2: - Reworked how uniform AoA are handled which simplified things greatly. - Transform feedback and program interface query now working with AoA. - Speed up glsl optimisation pass via improvement to dead code elimination. - Some smaller reviewed patches pushed. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 6/6] i965: Add a debug option for spilling everything in vec4 code
Iago Toral Quiroga ito...@igalia.com writes: --- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- src/mesa/drivers/dri/i965/intel_debug.c | 3 ++- src/mesa/drivers/dri/i965/intel_debug.h | 5 +++-- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp index f25f2ec..714248a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp @@ -634,7 +634,7 @@ fs_visitor::assign_regs(bool allow_spilling) } /* Debug of register spilling: Go spill everything. */ - if (unlikely(INTEL_DEBUG DEBUG_SPILL)) { + if (unlikely(INTEL_DEBUG DEBUG_SPILL_FS)) { int reg = choose_spill_reg(g); if (reg != -1) { diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 53270fb..6cf5ede 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1814,7 +1814,7 @@ vec4_visitor::run(gl_clip_plane *clip_planes) setup_payload(); - if (false) { + if (unlikely(INTEL_DEBUG DEBUG_SPILL_VEC4)) { /* Debug of register spilling: Go spill everything. */ const int grf_count = alloc.count; float spill_costs[alloc.count]; diff --git a/src/mesa/drivers/dri/i965/intel_debug.c b/src/mesa/drivers/dri/i965/intel_debug.c index a077731..8d34349 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.c +++ b/src/mesa/drivers/dri/i965/intel_debug.c @@ -69,7 +69,8 @@ static const struct dri_debug_control debug_control[] = { { ann, DEBUG_ANNOTATION }, { no8, DEBUG_NO8 }, { vec4vs, DEBUG_VEC4VS }, - { spill, DEBUG_SPILL }, + { spill_frag, DEBUG_SPILL_FS }, How about we call this spill_fs instead? The flag doesn't only affect fragment shaders, AFAICT it will cause all programs compiled with the FS back-end [F for fast ;)] to spill everything. With that fixed: Reviewed-by: Francisco Jerez curroje...@riseup.net + { spill_vec4, DEBUG_SPILL_VEC4 }, { cs, DEBUG_CS }, { NULL,0 } }; diff --git a/src/mesa/drivers/dri/i965/intel_debug.h b/src/mesa/drivers/dri/i965/intel_debug.h index 4689492..b7d0c82 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.h +++ b/src/mesa/drivers/dri/i965/intel_debug.h @@ -64,8 +64,9 @@ extern uint64_t INTEL_DEBUG; #define DEBUG_ANNOTATION (1ull 28) #define DEBUG_NO8 (1ull 29) #define DEBUG_VEC4VS (1ull 30) -#define DEBUG_SPILL (1ull 31) -#define DEBUG_CS (1ull 32) +#define DEBUG_SPILL_FS(1ull 31) +#define DEBUG_SPILL_VEC4 (1ull 32) +#define DEBUG_CS (1ull 33) #ifdef HAVE_ANDROID_PLATFORM #define LOG_TAG INTEL-MESA -- 1.9.1 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 6/6] i965: Add a debug option for spilling everything in vec4 code
Iago Toral ito...@igalia.com writes: On Thu, 2015-07-30 at 15:58 +0300, Francisco Jerez wrote: Iago Toral Quiroga ito...@igalia.com writes: --- src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 2 +- src/mesa/drivers/dri/i965/brw_vec4.cpp| 2 +- src/mesa/drivers/dri/i965/intel_debug.c | 3 ++- src/mesa/drivers/dri/i965/intel_debug.h | 5 +++-- 4 files changed, 7 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp index f25f2ec..714248a 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp @@ -634,7 +634,7 @@ fs_visitor::assign_regs(bool allow_spilling) } /* Debug of register spilling: Go spill everything. */ - if (unlikely(INTEL_DEBUG DEBUG_SPILL)) { + if (unlikely(INTEL_DEBUG DEBUG_SPILL_FS)) { int reg = choose_spill_reg(g); if (reg != -1) { diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 53270fb..6cf5ede 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1814,7 +1814,7 @@ vec4_visitor::run(gl_clip_plane *clip_planes) setup_payload(); - if (false) { + if (unlikely(INTEL_DEBUG DEBUG_SPILL_VEC4)) { /* Debug of register spilling: Go spill everything. */ const int grf_count = alloc.count; float spill_costs[alloc.count]; diff --git a/src/mesa/drivers/dri/i965/intel_debug.c b/src/mesa/drivers/dri/i965/intel_debug.c index a077731..8d34349 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.c +++ b/src/mesa/drivers/dri/i965/intel_debug.c @@ -69,7 +69,8 @@ static const struct dri_debug_control debug_control[] = { { ann, DEBUG_ANNOTATION }, { no8, DEBUG_NO8 }, { vec4vs, DEBUG_VEC4VS }, - { spill, DEBUG_SPILL }, + { spill_frag, DEBUG_SPILL_FS }, How about we call this spill_fs instead? The flag doesn't only affect fragment shaders, AFAICT it will cause all programs compiled with the FS back-end [F for fast ;)] to spill everything. With that fixed: that was my first choice, but if we do that it seems that driParseDebugString will also mark INTEL_DEBUG=fs as enabled. It seems as if this function checks if any of the string options is present in the provided string to enable them, so we can't really use an option name where any substring of it is included as a separate option :-( Oh man... That sounds seriously broken... Reviewed-by: Francisco Jerez curroje...@riseup.net + { spill_vec4, DEBUG_SPILL_VEC4 }, { cs, DEBUG_CS }, { NULL,0 } }; diff --git a/src/mesa/drivers/dri/i965/intel_debug.h b/src/mesa/drivers/dri/i965/intel_debug.h index 4689492..b7d0c82 100644 --- a/src/mesa/drivers/dri/i965/intel_debug.h +++ b/src/mesa/drivers/dri/i965/intel_debug.h @@ -64,8 +64,9 @@ extern uint64_t INTEL_DEBUG; #define DEBUG_ANNOTATION (1ull 28) #define DEBUG_NO8 (1ull 29) #define DEBUG_VEC4VS (1ull 30) -#define DEBUG_SPILL (1ull 31) -#define DEBUG_CS (1ull 32) +#define DEBUG_SPILL_FS(1ull 31) +#define DEBUG_SPILL_VEC4 (1ull 32) +#define DEBUG_CS (1ull 33) #ifdef HAVE_ANDROID_PLATFORM #define LOG_TAG INTEL-MESA -- 1.9.1 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 5/6] i965/vec4: Adjust spilling cost for consecutive instructions
Francisco Jerez curroje...@riseup.net writes: Iago Toral Quiroga ito...@igalia.com writes: Previous patches made it so that we do not need to unspill the same vgrf with every instruction as long as these instructions come right after the register was spilled or unspilled. This means that actually spilling the register is now cheaper in these scenarios, so adjust the spilling cost calculation accordingly. --- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 61 +++--- 1 file changed, 55 insertions(+), 6 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index 9a69fac..652a68c 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -269,6 +269,21 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) { float loop_scale = 1.0; + /* If a spilled register is written (or unspilled) and then immediately read, +* we will avoid to emit scratch reads for that register for as long as +* consecutive instructions keep reading that register. This makes spilling +* the register cheaper, so we need to account for this here. Since any +* instruction can only have at most 3 src operands we can only have up to +* 3 cached registers (i.e. registers that we do not need to unspill if they +* are read in a consecutive instruction) at any given time. +* +* We keep two lists, cached[0] has the registers that are cached before +* the current instruction, and cached[1] has the registers that are +* cached after the current instruction. We copy cached[1] into cached[0] +* after processing each instruction. +*/ + int cached[2][3] = { -1, -1, -1, -1, -1, -1 }; + Wouldn't it be easier to drop the array and just compare the sources of each instruction with the previous one? for (unsigned i = 0; i this-alloc.count; i++) { spill_costs[i] = 0.0; no_spill[i] = alloc.sizes[i] != 1; @@ -279,18 +294,52 @@ vec4_visitor::evaluate_spill_costs(float *spill_costs, bool *no_spill) * loops run 10 times. */ foreach_block_and_inst(block, vec4_instruction, inst, cfg) { + memset(cached[1], -1, sizeof(cached[1])); + for (unsigned int i = 0; i 3; i++) { - if (inst-src[i].file == GRF) { -spill_costs[inst-src[i].reg] += loop_scale; -assert(!inst-src[i].reladdr); - } + if (inst-src[i].file == GRF) { +/* Don't increase spilling cost if src[i] isn't going + * to be unspilled + */ +int slot; +for (slot = 0; slot 3; slot++) { + if (inst-src[i].reg == cached[0][slot]) I guess this should take the reg_offset and writemask of the cached register into account. I suggest you use backend_reg::in_range() to check if there is overlap. Er, it's of course not necessary to check reg_offset because it's guaranteed to be zero, but checking writemask is AFAICT -- Using backend_reg::in_range() may improve readability either way. + break; +} +if (slot == 3) { + /* Not cached, we will unspill src[i] */ + spill_costs[inst-src[i].reg] += loop_scale; + assert(!inst-src[i].reladdr); +} else { + /* It is cached. Since it is read in this instruction it +* stays cached for the next. +*/ + cached[1][slot] = inst-src[i].reg; +} + } } if (inst-dst.file == GRF) { - spill_costs[inst-dst.reg] += loop_scale; - assert(!inst-dst.reladdr); + int slot; + /* Mark this reg as cached because if the next instruction reads it + * we won't unspill it. + */ + for (slot = 0; slot 3; slot++) { +if (cached[1][slot] == -1) { + cached[1][slot] = inst-dst.reg; + break; +} + } + assert(slot 3); + spill_costs[inst-dst.reg] += loop_scale; + assert(!inst-dst.reladdr); } + /* Cached registers after this instruction are the cached registers + * for the next instruction + */ + memcpy(cached[0], cached[1], sizeof(cached[0])); + switch (inst-opcode) { case BRW_OPCODE_DO: -- 1.9.1 signature.asc Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Allow setting GL_TEXTURE_COMPARE_MODE on a sampler object without ARB_shadow support (v2).
Can you try to shorten the subject line to 75 chars or less? It should also start with mesa:. Perhaps: mesa: disable texture compare mode error checking for Wine On 07/30/2015 07:04 AM, Stefan Dösinger wrote: Version 2: Return GL_FALSE if ARB_shadow is unsupported instead of pretending to store the value as Please word wrap that line to 75 chars or less. suggested by Brian Paul. This fixes a GL error warning on r200 in Wine. The GL_ARB_sampler_objects extension does not specify a dependency on GL_ARB_shadow or GL_ARB_depth_texture for this value. Just set the value and don't do anything else. It won't matter without a depth texture being assigned anyway. This comment doesn't reflect what the code is really doing. The code is not setting any values. --- src/mesa/main/samplerobj.c | 28 ++-- 1 file changed, 10 insertions(+), 18 deletions(-) diff --git a/src/mesa/main/samplerobj.c b/src/mesa/main/samplerobj.c index 32180fb1..f5d2077 100644 --- a/src/mesa/main/samplerobj.c +++ b/src/mesa/main/samplerobj.c @@ -634,8 +634,12 @@ static GLuint set_sampler_compare_mode(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { + /* No state change, no error. GL_ARB_sampler_objects does not +* have a dependency on GL_ARB_shadow for these settings. They +* don't have any effect without shadow textures anyway. +* This matters for Wine on old GPUs, e.g. r200. */ Please put closing */ on next line. Actually, the comment could be simplified: /* If GL_ARB_shadow is not supported, don't report an error. The * sampler object extension spec isn't clear on this extension interaction. * Silences errors with Wine on older GPUs such as R200. */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareMode == param) return GL_FALSE; @@ -655,8 +659,12 @@ static GLuint set_sampler_compare_func(struct gl_context *ctx, struct gl_sampler_object *samp, GLint param) { + /* No state change, no error. GL_ARB_sampler_objects does not +* have a dependency on GL_ARB_shadow for these settings. They +* don't have any effect without shadow textures anyway. +* This matters for Wine on old GPUs, e.g. r200. */ if (!ctx-Extensions.ARB_shadow) - return INVALID_PNAME; + return GL_FALSE; if (samp-CompareFunc == param) return GL_FALSE; @@ -1342,13 +1350,9 @@ _mesa_GetSamplerParameteriv(GLuint sampler, GLenum pname, GLint *params) *params = IROUND(sampObj-LodBias); break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1431,13 +1435,9 @@ _mesa_GetSamplerParameterfv(GLuint sampler, GLenum pname, GLfloat *params) *params = sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = (GLfloat) sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1510,13 +1510,9 @@ _mesa_GetSamplerParameterIiv(GLuint sampler, GLenum pname, GLint *params) *params = (GLint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: @@ -1589,13 +1585,9 @@ _mesa_GetSamplerParameterIuiv(GLuint sampler, GLenum pname, GLuint *params) *params = (GLuint) sampObj-LodBias; break; case GL_TEXTURE_COMPARE_MODE: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareMode; break; case GL_TEXTURE_COMPARE_FUNC: - if (!ctx-Extensions.ARB_shadow) - goto invalid_pname; *params = sampObj-CompareFunc; break; case GL_TEXTURE_MAX_ANISOTROPY_EXT: ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/9] i965: Trivial formatting changes in brw_wm.c
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_wm.c | 84 +++--- 1 file changed, 41 insertions(+), 43 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_wm.c b/src/mesa/drivers/dri/i965/brw_wm.c index b590b17..9b90a7c 100644 --- a/src/mesa/drivers/dri/i965/brw_wm.c +++ b/src/mesa/drivers/dri/i965/brw_wm.c @@ -1,34 +1,28 @@ /* - Copyright (C) Intel Corp. 2006. All Rights Reserved. - Intel funded Tungsten Graphics to - develop this 3D driver. - - Permission is hereby granted, free of charge, to any person obtaining - a copy of this software and associated documentation files (the - Software), to deal in the Software without restriction, including - without limitation the rights to use, copy, modify, merge, publish, - distribute, sublicense, and/or sell copies of the Software, and to - permit persons to whom the Software is furnished to do so, subject to - the following conditions: - - The above copyright notice and this permission notice (including the - next paragraph) shall be included in all copies or substantial - portions of the Software. - - THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, - EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF - MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. - IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE - LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION - OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION - WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. - - **/ - /* - * Authors: - * Keith Whitwell kei...@vmware.com - */ - + * Copyright (C) Intel Corp. 2006. All Rights Reserved. + * Intel funded Tungsten Graphics to + * develop this 3D driver. + * + * Permission is hereby granted, free of charge, to any person obtaining + * a copy of this software and associated documentation files (the + * Software), to deal in the Software without restriction, including + * without limitation the rights to use, copy, modify, merge, publish, + * distribute, sublicense, and/or sell copies of the Software, and to + * permit persons to whom the Software is furnished to do so, subject to + * the following conditions: + * + * The above copyright notice and this permission notice (including the + * next paragraph) shall be included in all copies or substantial + * portions of the Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, + * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. + * IN NO EVENT SHALL THE COPYRIGHT OWNER(S) AND/OR ITS SUPPLIERS BE + * LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION + * OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION + * WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + */ #include brw_context.h #include brw_wm.h #include brw_state.h @@ -350,13 +344,15 @@ static uint8_t gen6_gather_workaround(GLenum internalformat) { switch (internalformat) { - case GL_R8I: return WA_SIGN | WA_8BIT; - case GL_R8UI: return WA_8BIT; - case GL_R16I: return WA_SIGN | WA_16BIT; - case GL_R16UI: return WA_16BIT; - /* note that even though GL_R32I and GL_R32UI have format overrides - * in the surface state, there is no shader w/a required */ - default: return 0; + case GL_R8I: return WA_SIGN | WA_8BIT; + case GL_R8UI: return WA_8BIT; + case GL_R16I: return WA_SIGN | WA_16BIT; + case GL_R16UI: return WA_16BIT; + default: + /* Note that even though GL_R32I and GL_R32UI have format overrides in + * the surface state, there is no shader w/a required. + */ + return 0; } } @@ -403,8 +399,9 @@ brw_populate_sampler_prog_key_data(struct gl_context *ctx, key-gl_clamp_mask[2] |= 1 s; } - /* gather4's channel select for green from RG32F is broken; - * requires a shader w/a on IVB; fixable with just SCS on HSW. */ + /* gather4's channel select for green from RG32F is broken; requires + * a shader w/a on IVB; fixable with just SCS on HSW. + */ if (brw-gen == 7 !brw-is_haswell prog-UsesGather) { if (img-InternalFormat == GL_RG32F) key-gather_channel_quirk_mask |= 1 s; @@ -453,13 +450,13 @@ brw_wm_state_dirty (struct brw_context *brw) BRW_NEW_VUE_MAP_GEOM_OUT); } -static void brw_wm_populate_key( struct brw_context *brw, -struct brw_wm_prog_key *key ) +static void +brw_wm_populate_key(struct brw_context *brw, struct brw_wm_prog_key *key) { struct gl_context *ctx = brw-ctx; /* BRW_NEW_FRAGMENT_PROGRAM */ const struct
[Mesa-dev] [PATCH 6/9] i965: Trivial formatting changes in gen7_vs_state.c
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/gen7_vs_state.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c b/src/mesa/drivers/dri/i965/gen7_vs_state.c index 4b17d06..00bc6f2 100644 --- a/src/mesa/drivers/dri/i965/gen7_vs_state.c +++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c @@ -62,6 +62,7 @@ gen7_upload_constant_state(struct brw_context *brw, OUT_BATCH(active ? stage_state-push_const_size : 0); OUT_BATCH(0); } + /* Pointer to the constant buffer. Covered by the set of state flags * from gen6_prepare_wm_contants */ @@ -95,15 +96,14 @@ gen7_upload_constant_state(struct brw_context *brw, ADVANCE_BATCH(); - /* On SKL+ the new constants don't take effect until the next corresponding - * 3DSTATE_BINDING_TABLE_POINTER_* command is parsed so we need to ensure - * that is sent - */ + /* On SKL+ the new constants don't take effect until the next corresponding +* 3DSTATE_BINDING_TABLE_POINTER_* command is parsed so we need to ensure +* that is sent +*/ if (brw-gen = 9) brw-ctx.NewDriverState |= BRW_NEW_SURFACES; } - static void upload_vs_state(struct brw_context *brw) { -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 9/9] glsl: Add constuctors for the common cases of glsl_struct_field
From: Ian Romanick ian.d.roman...@intel.com Fixes a giant pile of GCC warnings: builtin_types.cpp:60:1: warning: missing initializer for member 'glsl_struct_field::stream' [-Wmissing-field-initializers] I had to add a default constructor because a non-default constructor was added. Otherwise the only constructor would be the one with parameters, and all the plases like glsl_struct_field foo; would fail to compile. I wanted to do this in two patches. All of the initializers of glsl_struct_field structures had to be converted to use the constructor because C++ apparently forces you to do one or the other: builtin_types.cpp:61:1: error: could not convert '{glsl_type::float_type, near, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0, -1}' from 'brace-enclosed initializer list' to 'glsl_struct_field' Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Francisco Jerez curroje...@riseup.net Perhaps there's some other C++ way that I don't know. --- src/glsl/builtin_types.cpp | 74 +++--- src/glsl/glsl_types.h | 13 +++ src/glsl/tests/general_ir_test.cpp | 12 ++- src/glsl/tests/varyings_test.cpp | 10 +- 4 files changed, 53 insertions(+), 56 deletions(-) diff --git a/src/glsl/builtin_types.cpp b/src/glsl/builtin_types.cpp index d92e2eb..ffbc5e6 100644 --- a/src/glsl/builtin_types.cpp +++ b/src/glsl/builtin_types.cpp @@ -54,64 +54,64 @@ glsl_type::_struct_##NAME##_type; static const struct glsl_struct_field gl_DepthRangeParameters_fields[] = { - { glsl_type::float_type, near, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, far, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, diff, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, + glsl_struct_field(glsl_type::float_type, near), + glsl_struct_field(glsl_type::float_type, far), + glsl_struct_field(glsl_type::float_type, diff), }; static const struct glsl_struct_field gl_PointParameters_fields[] = { - { glsl_type::float_type, size, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, sizeMin, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, sizeMax, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, fadeThresholdSize, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, distanceConstantAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, distanceLinearAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, distanceQuadraticAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, + glsl_struct_field(glsl_type::float_type, size), + glsl_struct_field(glsl_type::float_type, sizeMin), + glsl_struct_field(glsl_type::float_type, sizeMax), + glsl_struct_field(glsl_type::float_type, fadeThresholdSize), + glsl_struct_field(glsl_type::float_type, distanceConstantAttenuation), + glsl_struct_field(glsl_type::float_type, distanceLinearAttenuation), + glsl_struct_field(glsl_type::float_type, distanceQuadraticAttenuation), }; static const struct glsl_struct_field gl_MaterialParameters_fields[] = { - { glsl_type::vec4_type, emission, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, ambient, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, diffuse, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, specular, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, shininess, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, + glsl_struct_field(glsl_type::vec4_type, emission), + glsl_struct_field(glsl_type::vec4_type, ambient), + glsl_struct_field(glsl_type::vec4_type, diffuse), + glsl_struct_field(glsl_type::vec4_type, specular), + glsl_struct_field(glsl_type::float_type, shininess), }; static const struct glsl_struct_field gl_LightSourceParameters_fields[] = { - { glsl_type::vec4_type, ambient, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, diffuse, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, specular, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, position, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, halfVector, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec3_type, spotDirection, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, spotExponent, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, spotCutoff, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, spotCosCutoff, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, constantAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, linearAttenuation, -1, 0,
Re: [Mesa-dev] [PATCH 9/9] glsl: Add constuctors for the common cases of glsl_struct_field
Ian Romanick i...@freedesktop.org writes: From: Ian Romanick ian.d.roman...@intel.com Fixes a giant pile of GCC warnings: builtin_types.cpp:60:1: warning: missing initializer for member 'glsl_struct_field::stream' [-Wmissing-field-initializers] I had to add a default constructor because a non-default constructor was added. Otherwise the only constructor would be the one with parameters, and all the plases like glsl_struct_field foo; would fail to compile. I wanted to do this in two patches. All of the initializers of glsl_struct_field structures had to be converted to use the constructor because C++ apparently forces you to do one or the other: builtin_types.cpp:61:1: error: could not convert '{glsl_type::float_type, near, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0, -1}' from 'brace-enclosed initializer list' to 'glsl_struct_field' Signed-off-by: Ian Romanick ian.d.roman...@intel.com Cc: Francisco Jerez curroje...@riseup.net Perhaps there's some other C++ way that I don't know. There is a way for an initializer list to be interpreted as a constructor call, but it's C++11 :). --- src/glsl/builtin_types.cpp | 74 +++--- src/glsl/glsl_types.h | 13 +++ src/glsl/tests/general_ir_test.cpp | 12 ++- src/glsl/tests/varyings_test.cpp | 10 +- 4 files changed, 53 insertions(+), 56 deletions(-) diff --git a/src/glsl/builtin_types.cpp b/src/glsl/builtin_types.cpp index d92e2eb..ffbc5e6 100644 --- a/src/glsl/builtin_types.cpp +++ b/src/glsl/builtin_types.cpp @@ -54,64 +54,64 @@ glsl_type::_struct_##NAME##_type; static const struct glsl_struct_field gl_DepthRangeParameters_fields[] = { - { glsl_type::float_type, near, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, far, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, diff, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, + glsl_struct_field(glsl_type::float_type, near), + glsl_struct_field(glsl_type::float_type, far), + glsl_struct_field(glsl_type::float_type, diff), }; static const struct glsl_struct_field gl_PointParameters_fields[] = { - { glsl_type::float_type, size, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, sizeMin, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, sizeMax, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, fadeThresholdSize, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, distanceConstantAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, distanceLinearAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, distanceQuadraticAttenuation, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, + glsl_struct_field(glsl_type::float_type, size), + glsl_struct_field(glsl_type::float_type, sizeMin), + glsl_struct_field(glsl_type::float_type, sizeMax), + glsl_struct_field(glsl_type::float_type, fadeThresholdSize), + glsl_struct_field(glsl_type::float_type, distanceConstantAttenuation), + glsl_struct_field(glsl_type::float_type, distanceLinearAttenuation), + glsl_struct_field(glsl_type::float_type, distanceQuadraticAttenuation), }; static const struct glsl_struct_field gl_MaterialParameters_fields[] = { - { glsl_type::vec4_type, emission, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, ambient, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, diffuse, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, specular, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, shininess, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, + glsl_struct_field(glsl_type::vec4_type, emission), + glsl_struct_field(glsl_type::vec4_type, ambient), + glsl_struct_field(glsl_type::vec4_type, diffuse), + glsl_struct_field(glsl_type::vec4_type, specular), + glsl_struct_field(glsl_type::float_type, shininess), }; static const struct glsl_struct_field gl_LightSourceParameters_fields[] = { - { glsl_type::vec4_type, ambient, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, diffuse, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, specular, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, position, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec4_type, halfVector, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::vec3_type, spotDirection, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, spotExponent, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 }, - { glsl_type::float_type, spotCutoff, -1, 0, 0, 0, GLSL_MATRIX_LAYOUT_INHERITED, 0 },
[Mesa-dev] [PATCH] includes/GL: remove duplicated extension declarations from glx.h
All three of GLX_NV_float_buffer, GLX_EXT_texture_from_pixmap and GLX_MESA_query_renderer have been in glxext.h for a while now. As such we can drop this workaround/hack from the header. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- Not to mention that glxext.h honours GLX_GLXEXT_PROTOTYPES and provides the relevant typedefs and/or function declarations when appropriate. -Emil include/GL/glx.h | 88 1 file changed, 88 deletions(-) diff --git a/include/GL/glx.h b/include/GL/glx.h index 78f5052..d612b79 100644 --- a/include/GL/glx.h +++ b/include/GL/glx.h @@ -371,14 +371,6 @@ extern Bool glXDrawableAttribARB(Display *dpy, GLXDrawable draw, const int *attr /* * Remove this when glxext.h is updated. */ -#ifndef GLX_NV_float_buffer -#define GLX_NV_float_buffer 1 - -#define GLX_FLOAT_COMPONENTS_NV 0x20B0 - -#endif /* GLX_NV_float_buffer */ - - /* * #?. GLX_MESA_swap_frame_usage @@ -415,86 +407,6 @@ typedef int (*PFNGLXGETSWAPINTERVALMESAPROC)(void); #endif /* GLX_MESA_swap_control */ - -/* - * #?. GLX_EXT_texture_from_pixmap - * XXX not finished? - */ -#ifndef GLX_EXT_texture_from_pixmap -#define GLX_EXT_texture_from_pixmap 1 - -#define GLX_BIND_TO_TEXTURE_RGB_EXT0x20D0 -#define GLX_BIND_TO_TEXTURE_RGBA_EXT 0x20D1 -#define GLX_BIND_TO_MIPMAP_TEXTURE_EXT 0x20D2 -#define GLX_BIND_TO_TEXTURE_TARGETS_EXT0x20D3 -#define GLX_Y_INVERTED_EXT 0x20D4 - -#define GLX_TEXTURE_FORMAT_EXT 0x20D5 -#define GLX_TEXTURE_TARGET_EXT 0x20D6 -#define GLX_MIPMAP_TEXTURE_EXT 0x20D7 - -#define GLX_TEXTURE_FORMAT_NONE_EXT0x20D8 -#define GLX_TEXTURE_FORMAT_RGB_EXT 0x20D9 -#define GLX_TEXTURE_FORMAT_RGBA_EXT0x20DA - -#define GLX_TEXTURE_1D_BIT_EXT 0x0001 -#define GLX_TEXTURE_2D_BIT_EXT 0x0002 -#define GLX_TEXTURE_RECTANGLE_BIT_EXT 0x0004 - -#define GLX_TEXTURE_1D_EXT 0x20DB -#define GLX_TEXTURE_2D_EXT 0x20DC -#define GLX_TEXTURE_RECTANGLE_EXT 0x20DD - -#define GLX_FRONT_LEFT_EXT 0x20DE -#define GLX_FRONT_RIGHT_EXT0x20DF -#define GLX_BACK_LEFT_EXT 0x20E0 -#define GLX_BACK_RIGHT_EXT 0x20E1 -#define GLX_FRONT_EXT GLX_FRONT_LEFT_EXT -#define GLX_BACK_EXT GLX_BACK_LEFT_EXT -#define GLX_AUX0_EXT 0x20E2 -#define GLX_AUX1_EXT 0x20E3 -#define GLX_AUX2_EXT 0x20E4 -#define GLX_AUX3_EXT 0x20E5 -#define GLX_AUX4_EXT 0x20E6 -#define GLX_AUX5_EXT 0x20E7 -#define GLX_AUX6_EXT 0x20E8 -#define GLX_AUX7_EXT 0x20E9 -#define GLX_AUX8_EXT 0x20EA -#define GLX_AUX9_EXT 0x20EB - -extern void glXBindTexImageEXT(Display *dpy, GLXDrawable drawable, int buffer, const int *attrib_list); -extern void glXReleaseTexImageEXT(Display *dpy, GLXDrawable drawable, int buffer); - -#endif /* GLX_EXT_texture_from_pixmap */ - - -#ifndef GLX_MESA_query_renderer -#define GLX_MESA_query_renderer 1 - -#define GLX_RENDERER_VENDOR_ID_MESA 0x8183 -#define GLX_RENDERER_DEVICE_ID_MESA 0x8184 -#define GLX_RENDERER_VERSION_MESA0x8185 -#define GLX_RENDERER_ACCELERATED_MESA0x8186 -#define GLX_RENDERER_VIDEO_MEMORY_MESA 0x8187 -#define GLX_RENDERER_UNIFIED_MEMORY_ARCHITECTURE_MESA0x8188 -#define GLX_RENDERER_PREFERRED_PROFILE_MESA 0x8189 -#define GLX_RENDERER_OPENGL_CORE_PROFILE_VERSION_MESA0x818A -#define GLX_RENDERER_OPENGL_COMPATIBILITY_PROFILE_VERSION_MESA0x818B -#define GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA 0x818C -#define GLX_RENDERER_OPENGL_ES2_PROFILE_VERSION_MESA 0x818D -#define GLX_RENDERER_ID_MESA 0x818E - -Bool glXQueryRendererIntegerMESA(Display *dpy, int screen, int renderer, int attribute, unsigned int *value); -Bool glXQueryCurrentRendererIntegerMESA(int attribute, unsigned int *value); -const char *glXQueryRendererStringMESA(Display *dpy, int screen, int renderer, int attribute); -const char *glXQueryCurrentRendererStringMESA(int attribute); - -typedef Bool (*PFNGLXQUERYRENDERERINTEGERMESAPROC) (Display *dpy, int screen, int renderer, int attribute, unsigned int *value); -typedef Bool (*PFNGLXQUERYCURRENTRENDERERINTEGERMESAPROC) (int attribute, unsigned int *value); -typedef const char *(*PFNGLXQUERYRENDERERSTRINGMESAPROC) (Display *dpy, int screen, int renderer, int attribute); -typedef const char *(*PFNGLXQUERYCURRENTRENDERERSTRINGMESAPROC) (int attribute); -#endif /* GLX_MESA_query_renderer */ - /*** Should these go here, or in another header? */ /* ** GLX Events -- 2.4.5
[Mesa-dev] [PATCH 2/3] mesa: save which transform feedback buffer is associated with which stream
From: Marek Olšák marek.ol...@amd.com --- src/glsl/link_varyings.cpp | 1 + src/mesa/main/mtypes.h | 5 + 2 files changed, 6 insertions(+) diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp index 1c52ff3..f7a7b8c 100644 --- a/src/glsl/link_varyings.cpp +++ b/src/glsl/link_varyings.cpp @@ -572,6 +572,7 @@ tfeedback_decl::store(struct gl_context *ctx, struct gl_shader_program *prog, info-Outputs[info-NumOutputs].DstOffset = info-BufferStride[buffer]; ++info-NumOutputs; info-BufferStride[buffer] += output_size; + info-BufferStream[buffer] = this-stream_id; num_components -= output_size; location++; location_frac = 0; diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h index a252bf4..4e00fb6 100644 --- a/src/mesa/main/mtypes.h +++ b/src/mesa/main/mtypes.h @@ -1868,6 +1868,11 @@ struct gl_transform_feedback_info * multiple transform feedback outputs in the same buffer. */ unsigned BufferStride[MAX_FEEDBACK_BUFFERS]; + + /** +* Which transform feedback stream this buffer binding is associated with. +*/ + unsigned BufferStream[MAX_FEEDBACK_BUFFERS]; }; -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] st/mesa: implement DrawTransformFeedbackStream
From: Marek Olšák marek.ol...@amd.com --- src/mesa/state_tracker/st_cb_xformfb.c | 58 ++ src/mesa/state_tracker/st_cb_xformfb.h | 2 +- src/mesa/state_tracker/st_draw.c | 2 +- 3 files changed, 33 insertions(+), 29 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c index 07c118e..85a8620 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.c +++ b/src/mesa/state_tracker/st_cb_xformfb.c @@ -54,9 +54,9 @@ struct st_transform_feedback_object { struct pipe_stream_output_target *targets[PIPE_MAX_SO_BUFFERS]; /* This encapsulates the count that can be used as a source for draw_vbo. -* It contains a stream output target from the last call of -* EndTransformFeedback. */ - struct pipe_stream_output_target *draw_count; +* It contains stream output targets from the last call of +* EndTransformFeedback for each stream. */ + struct pipe_stream_output_target *draw_count[MAX_VERTEX_STREAMS]; }; static inline struct st_transform_feedback_object * @@ -88,7 +88,8 @@ st_delete_transform_feedback(struct gl_context *ctx, st_transform_feedback_object(obj); unsigned i; - pipe_so_target_reference(sobj-draw_count, NULL); + for (i = 0; i ARRAY_SIZE(sobj-draw_count); i++) + pipe_so_target_reference(sobj-draw_count[i], NULL); /* Unreference targets. */ for (i = 0; i sobj-num_targets; i++) { @@ -123,9 +124,12 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum mode, struct st_buffer_object *bo = st_buffer_object(sobj-base.Buffers[i]); if (bo bo-buffer) { + unsigned stream = +obj-shader_program-LinkedTransformFeedback.BufferStream[i]; + /* Check whether we need to recreate the target. */ if (!sobj-targets[i] || - sobj-targets[i] == sobj-draw_count || + sobj-targets[i] == sobj-draw_count[stream] || sobj-targets[i]-buffer != bo-buffer || sobj-targets[i]-buffer_offset != sobj-base.Offset[i] || sobj-targets[i]-buffer_size != sobj-base.Size[i]) { @@ -178,24 +182,6 @@ st_resume_transform_feedback(struct gl_context *ctx, } -static struct pipe_stream_output_target * -st_transform_feedback_get_draw_target(struct gl_transform_feedback_object *obj) -{ - struct st_transform_feedback_object *sobj = - st_transform_feedback_object(obj); - unsigned i; - - for (i = 0; i ARRAY_SIZE(sobj-targets); i++) { - if (sobj-targets[i]) { - return sobj-targets[i]; - } - } - - assert(0); - return NULL; -} - - static void st_end_transform_feedback(struct gl_context *ctx, struct gl_transform_feedback_object *obj) @@ -203,22 +189,40 @@ st_end_transform_feedback(struct gl_context *ctx, struct st_context *st = st_context(ctx); struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); + unsigned i; cso_set_stream_outputs(st-cso_context, 0, NULL, NULL); - pipe_so_target_reference(sobj-draw_count, -st_transform_feedback_get_draw_target(obj)); + /* The next call to glDrawTransformFeedbackStream should use the vertex +* count from the last call to glEndTransformFeedback. +* Therefore, save the targets for each stream. +* +* NULL means the vertex counter is 0 (initial state). +*/ + for (i = 0; i ARRAY_SIZE(sobj-draw_count); i++) + pipe_so_target_reference(sobj-draw_count[i], NULL); + + for (i = 0; i ARRAY_SIZE(sobj-targets); i++) { + unsigned stream = + obj-shader_program-LinkedTransformFeedback.BufferStream[i]; + + /* Is it not boudn or already set for this stream? */ + if (!sobj-targets[i] || sobj-draw_count[stream]) + continue; + + pipe_so_target_reference(sobj-draw_count[stream], sobj-targets[i]); + } } void st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, -struct pipe_draw_info *out) +unsigned stream, struct pipe_draw_info *out) { struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); - out-count_from_stream_output = sobj-draw_count; + out-count_from_stream_output = sobj-draw_count[stream]; } diff --git a/src/mesa/state_tracker/st_cb_xformfb.h b/src/mesa/state_tracker/st_cb_xformfb.h index 998c418..16f5176 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.h +++ b/src/mesa/state_tracker/st_cb_xformfb.h @@ -40,7 +40,7 @@ st_init_xformfb_functions(struct dd_function_table *functions); extern void st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, -struct pipe_draw_info *out); +unsigned stream, struct pipe_draw_info *out); #endif /* ST_CB_XFORMFB_H */ diff --git a/src/mesa/state_tracker/st_draw.c
[Mesa-dev] [PATCH 1/3] vbo: pass the stream from DrawTransformFeedbackStream to drivers
From: Marek Olšák marek.ol...@amd.com --- src/mesa/drivers/dri/i965/brw_draw.c | 3 ++- src/mesa/drivers/dri/i965/brw_draw.h | 3 ++- src/mesa/drivers/dri/i965/brw_meta_fast_clear.c | 2 +- src/mesa/drivers/dri/i965/brw_primitive_restart.c | 3 ++- src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c | 7 +-- src/mesa/state_tracker/st_cb_rasterpos.c | 2 +- src/mesa/state_tracker/st_draw.c | 1 + src/mesa/state_tracker/st_draw.h | 2 ++ src/mesa/state_tracker/st_draw_feedback.c | 1 + src/mesa/tnl/t_draw.c | 1 + src/mesa/tnl/tnl.h| 3 ++- src/mesa/vbo/vbo.h| 3 ++- src/mesa/vbo/vbo_exec_array.c | 20 ++-- src/mesa/vbo/vbo_exec_draw.c | 2 +- src/mesa/vbo/vbo_primitive_restart.c | 4 ++-- src/mesa/vbo/vbo_rebase.c | 2 +- src/mesa/vbo/vbo_save_draw.c | 2 +- src/mesa/vbo/vbo_split_copy.c | 2 +- src/mesa/vbo/vbo_split_inplace.c | 2 +- 19 files changed, 39 insertions(+), 26 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index b5ef276..efc4f40 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -561,6 +561,7 @@ void brw_draw_prims( struct gl_context *ctx, GLuint min_index, GLuint max_index, struct gl_transform_feedback_object *unused_tfb_object, + unsigned stream, struct gl_buffer_object *indirect ) { struct brw_context *brw = brw_context(ctx); @@ -586,7 +587,7 @@ void brw_draw_prims( struct gl_context *ctx, _swsetup_Wakeup(ctx); _tnl_wakeup(ctx); _tnl_draw_prims(ctx, prims, nr_prims, ib, - index_bounds_valid, min_index, max_index, NULL, NULL); + index_bounds_valid, min_index, max_index, NULL, 0, NULL); return; } diff --git a/src/mesa/drivers/dri/i965/brw_draw.h b/src/mesa/drivers/dri/i965/brw_draw.h index fc83dcd..f994726 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.h +++ b/src/mesa/drivers/dri/i965/brw_draw.h @@ -34,7 +34,7 @@ struct brw_context; -void brw_draw_prims( struct gl_context *ctx, +void brw_draw_prims(struct gl_context *ctx, const struct _mesa_prim *prims, GLuint nr_prims, const struct _mesa_index_buffer *ib, @@ -42,6 +42,7 @@ void brw_draw_prims( struct gl_context *ctx, GLuint min_index, GLuint max_index, struct gl_transform_feedback_object *unused_tfb_object, + unsigned stream, struct gl_buffer_object *indirect ); void brw_draw_init( struct brw_context *brw ); diff --git a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c index e7e8df5..f5ecbb5 100644 --- a/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c +++ b/src/mesa/drivers/dri/i965/brw_meta_fast_clear.c @@ -200,7 +200,7 @@ brw_draw_rectlist(struct gl_context *ctx, struct rect *rect, int num_instances) brw_draw_prims(ctx, prim, 1, NULL, GL_TRUE, start, start + count - 1, - NULL, NULL); + NULL, 0, NULL); } static void diff --git a/src/mesa/drivers/dri/i965/brw_primitive_restart.c b/src/mesa/drivers/dri/i965/brw_primitive_restart.c index 2c7a7e8..6ed79d7 100644 --- a/src/mesa/drivers/dri/i965/brw_primitive_restart.c +++ b/src/mesa/drivers/dri/i965/brw_primitive_restart.c @@ -161,7 +161,8 @@ brw_handle_primitive_restart(struct gl_context *ctx, /* Cut index should work for primitive restart, so use it */ brw-prim_restart.enable_cut_index = true; - brw_draw_prims(ctx, prims, nr_prims, ib, GL_FALSE, -1, -1, NULL, indirect); + brw_draw_prims(ctx, prims, nr_prims, ib, GL_FALSE, -1, -1, NULL, 0, + indirect); brw-prim_restart.enable_cut_index = false; } else { /* Not all the primitive draw modes are supported by the cut index, diff --git a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c index c85acec..a3fbad0 100644 --- a/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c +++ b/src/mesa/drivers/dri/nouveau/nouveau_vbo_t.c @@ -223,6 +223,7 @@ TAG(vbo_render_prims)(struct gl_context *ctx, GLboolean index_bounds_valid, GLuint min_index, GLuint max_index, struct gl_transform_feedback_object *tfb_vertcount, + unsigned stream, struct gl_buffer_object *indirect); static GLboolean @@ -455,6 +456,7 @@ TAG(vbo_render_prims)(struct
Re: [Mesa-dev] [PATCH] includes/GL: remove duplicated extension declarations from glx.h
On 07/30/2015 08:22 AM, Emil Velikov wrote: All three of GLX_NV_float_buffer, GLX_EXT_texture_from_pixmap and GLX_MESA_query_renderer have been in glxext.h for a while now. As such we can drop this workaround/hack from the header. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- Not to mention that glxext.h honours GLX_GLXEXT_PROTOTYPES and provides the relevant typedefs and/or function declarations when appropriate. -Emil include/GL/glx.h | 88 1 file changed, 88 deletions(-) diff --git a/include/GL/glx.h b/include/GL/glx.h index 78f5052..d612b79 100644 --- a/include/GL/glx.h +++ b/include/GL/glx.h @@ -371,14 +371,6 @@ extern Bool glXDrawableAttribARB(Display *dpy, GLXDrawable draw, const int *attr /* * Remove this when glxext.h is updated. */ -#ifndef GLX_NV_float_buffer -#define GLX_NV_float_buffer 1 - -#define GLX_FLOAT_COMPONENTS_NV 0x20B0 - -#endif /* GLX_NV_float_buffer */ - - /* * #?. GLX_MESA_swap_frame_usage @@ -415,86 +407,6 @@ typedef int (*PFNGLXGETSWAPINTERVALMESAPROC)(void); #endif /* GLX_MESA_swap_control */ - -/* - * #?. GLX_EXT_texture_from_pixmap - * XXX not finished? - */ -#ifndef GLX_EXT_texture_from_pixmap -#define GLX_EXT_texture_from_pixmap 1 - -#define GLX_BIND_TO_TEXTURE_RGB_EXT0x20D0 -#define GLX_BIND_TO_TEXTURE_RGBA_EXT 0x20D1 -#define GLX_BIND_TO_MIPMAP_TEXTURE_EXT 0x20D2 -#define GLX_BIND_TO_TEXTURE_TARGETS_EXT0x20D3 -#define GLX_Y_INVERTED_EXT 0x20D4 - -#define GLX_TEXTURE_FORMAT_EXT 0x20D5 -#define GLX_TEXTURE_TARGET_EXT 0x20D6 -#define GLX_MIPMAP_TEXTURE_EXT 0x20D7 - -#define GLX_TEXTURE_FORMAT_NONE_EXT0x20D8 -#define GLX_TEXTURE_FORMAT_RGB_EXT 0x20D9 -#define GLX_TEXTURE_FORMAT_RGBA_EXT0x20DA - -#define GLX_TEXTURE_1D_BIT_EXT 0x0001 -#define GLX_TEXTURE_2D_BIT_EXT 0x0002 -#define GLX_TEXTURE_RECTANGLE_BIT_EXT 0x0004 - -#define GLX_TEXTURE_1D_EXT 0x20DB -#define GLX_TEXTURE_2D_EXT 0x20DC -#define GLX_TEXTURE_RECTANGLE_EXT 0x20DD - -#define GLX_FRONT_LEFT_EXT 0x20DE -#define GLX_FRONT_RIGHT_EXT0x20DF -#define GLX_BACK_LEFT_EXT 0x20E0 -#define GLX_BACK_RIGHT_EXT 0x20E1 -#define GLX_FRONT_EXT GLX_FRONT_LEFT_EXT -#define GLX_BACK_EXT GLX_BACK_LEFT_EXT -#define GLX_AUX0_EXT 0x20E2 -#define GLX_AUX1_EXT 0x20E3 -#define GLX_AUX2_EXT 0x20E4 -#define GLX_AUX3_EXT 0x20E5 -#define GLX_AUX4_EXT 0x20E6 -#define GLX_AUX5_EXT 0x20E7 -#define GLX_AUX6_EXT 0x20E8 -#define GLX_AUX7_EXT 0x20E9 -#define GLX_AUX8_EXT 0x20EA -#define GLX_AUX9_EXT 0x20EB - -extern void glXBindTexImageEXT(Display *dpy, GLXDrawable drawable, int buffer, const int *attrib_list); -extern void glXReleaseTexImageEXT(Display *dpy, GLXDrawable drawable, int buffer); - -#endif /* GLX_EXT_texture_from_pixmap */ - - -#ifndef GLX_MESA_query_renderer -#define GLX_MESA_query_renderer 1 - -#define GLX_RENDERER_VENDOR_ID_MESA 0x8183 -#define GLX_RENDERER_DEVICE_ID_MESA 0x8184 -#define GLX_RENDERER_VERSION_MESA0x8185 -#define GLX_RENDERER_ACCELERATED_MESA0x8186 -#define GLX_RENDERER_VIDEO_MEMORY_MESA 0x8187 -#define GLX_RENDERER_UNIFIED_MEMORY_ARCHITECTURE_MESA0x8188 -#define GLX_RENDERER_PREFERRED_PROFILE_MESA 0x8189 -#define GLX_RENDERER_OPENGL_CORE_PROFILE_VERSION_MESA0x818A -#define GLX_RENDERER_OPENGL_COMPATIBILITY_PROFILE_VERSION_MESA0x818B -#define GLX_RENDERER_OPENGL_ES_PROFILE_VERSION_MESA 0x818C -#define GLX_RENDERER_OPENGL_ES2_PROFILE_VERSION_MESA 0x818D -#define GLX_RENDERER_ID_MESA 0x818E - -Bool glXQueryRendererIntegerMESA(Display *dpy, int screen, int renderer, int attribute, unsigned int *value); -Bool glXQueryCurrentRendererIntegerMESA(int attribute, unsigned int *value); -const char *glXQueryRendererStringMESA(Display *dpy, int screen, int renderer, int attribute); -const char *glXQueryCurrentRendererStringMESA(int attribute); - -typedef Bool (*PFNGLXQUERYRENDERERINTEGERMESAPROC) (Display *dpy, int screen, int renderer, int attribute, unsigned int *value); -typedef Bool (*PFNGLXQUERYCURRENTRENDERERINTEGERMESAPROC) (int attribute, unsigned int *value); -typedef const char *(*PFNGLXQUERYRENDERERSTRINGMESAPROC) (Display *dpy, int screen, int renderer, int attribute); -typedef const char *(*PFNGLXQUERYCURRENTRENDERERSTRINGMESAPROC) (int attribute); -#endif /* GLX_MESA_query_renderer */ - /*** Should these go here, or in another header? */ /*
Re: [Mesa-dev] [PATCH] includes/GL: remove duplicated extension declarations from glx.h
On 30 July 2015 at 15:22, Emil Velikov emil.l.veli...@gmail.com wrote: All three of GLX_NV_float_buffer, GLX_EXT_texture_from_pixmap and GLX_MESA_query_renderer have been in glxext.h for a while now. As such we can drop this workaround/hack from the header. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- Not to mention that glxext.h honours GLX_GLXEXT_PROTOTYPES and provides the relevant typedefs and/or function declarations when appropriate. -Emil include/GL/glx.h | 88 1 file changed, 88 deletions(-) diff --git a/include/GL/glx.h b/include/GL/glx.h index 78f5052..d612b79 100644 --- a/include/GL/glx.h +++ b/include/GL/glx.h @@ -371,14 +371,6 @@ extern Bool glXDrawableAttribARB(Display *dpy, GLXDrawable draw, const int *attr /* * Remove this when glxext.h is updated. */ Seems like I should drop this comment as well. On a related note: there are a few more extensions that never made it (i.e. are missing from the Khronos public spec page) but are still around. Namely GLX_NV_vertex_array_range - Shipped with nvidia's glxext.h, but unused in the binary driver (version 352.21). - Referenced in GLX_MESA_agp_offset - GLX_NV_vertex_array_range is required GLX_ARB_render_texture - No hints in nvidia headers/binary. - Referenced in GLX_EXT_texture_from_pixmap - Adapted spec language from draft version of GLX_ARB_render_texture - The Khronos meeting minutes state GLX_ARB_render_texture hasn't been kept up to date, but there's no reason it can't be completed. [1] GLX_MESA_swap_frame_usage - No hints in nvidia headers/binary. - No reference to in the public specs. GLX_MESA_swap_control - No hints in nvidia headers/binary. - Referenced by GLX_EXT_swap_control, Based on GLX_MESA_swap_control version 1.1 I'm not suggesting that we should nuke these, but curious if they are or may have been used at some point. -Emil [1] https://www.opengl.org/archives/about/arb/meeting_notes/notes/meeting_note_2001-06-12.html ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [Bug 91020] Mesa's demo / tools won't compile since EGL changes
https://bugs.freedesktop.org/show_bug.cgi?id=91020 --- Comment #5 from Emil Velikov emil.l.veli...@gmail.com --- (In reply to George Diamantopoulos from comment #4) Created attachment 117106 [details] [review] Fix build against EGL implementations that don't support the EGL_MESA_screen_surface extension There is a patch by Frank Binns (posted on mesa-dev) since 2012 on this... It worked for me, so I'm attaching it here While this should work and has been the approach used so far in mesa-demos, I'm having second doubts how wise of an idea it is. In this particular example, this will result in a empty demo/program which won't do anything when run on a EGL_MESA_screen_surface capable driver. I.e. even if your system is perfectly fine, the binary build by your distro/elsewhere with the missing definition will be useless. May I suggest that we add the relevant defines, after including the *GL headers. This way the binary will work on compatible systems, and (should) bail out if the extension is not supported. One could also fix the other similar cases, while at it ;-) Git grep shows some ~120 lines :-P -- You are receiving this mail because: You are the QA Contact for the bug. You are the assignee for the bug. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/9] Mostly trivial clean ups
All but the last of these patches have been sitting in one tree or another for quite some time. All of these files were touched as part of other work, but that work is stalled a bit. At Ken's suggestion, I'm sending out the trivial bits just to prune my trees a bit. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 3/6] i965/vec4: Don't emit scratch reads for a spilled register we have just written
Francisco Jerez curroje...@riseup.net writes: Iago Toral Quiroga ito...@igalia.com writes: When we have code such as this: mov vgrf1.0.x:F, vgrf2.:F mov vgrf3.0.x:F, vgrf1.:F ... mov vgrf3.0.x:F, vgrf1.:F And vgrf1 is chosen for spilling, we can emit this: mov vgrf1.0.x:F, vgrf2.:F gen4_scratch_write hw_reg0:F, vgrf1.:D, 22D mov vgrf3.0.x:F, vgrf1.:F ... gen4_scratch_read vgrf4.0.x:F, 22D mov vgrf3.0.x:F, vgrf4.:F Instead of this: mov vgrf1.0.x:F, vgrf2.:F gen4_scratch_write hw_reg0:F, vgrf1.:D, 22D gen4_scratch_read vgrf4.0.x:F, 22D mov vgrf3.0.x:F, vgrf4.:F ... gen4_scratch_read vgrf5.0.x:F, 22D mov vgrf3.0.x:F, vgrf5.:F And save one scratch read while still preserving the benefits of spilling the register. In general, we avoid emitting scratch reads for as long as the next instruction keeps reading the spilled register. This should not harm the benefit of spilling the register because gains for register allocation only come when we have chunks of program code where the register is alive but not really used (because these are the points where we could effectively use that register for another purpose if we spilled it), so as long as consecutive instructions use that register we can avoid the scratch reads without losing anything. --- .../drivers/dri/i965/brw_vec4_reg_allocate.cpp | 37 +- 1 file changed, 36 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp index cff5406..fd56dae 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_reg_allocate.cpp @@ -340,11 +340,43 @@ vec4_visitor::spill_reg(int spill_reg_nr) unsigned int spill_offset = last_scratch++; /* Generate spill/unspill instructions for the objects being spilled. */ + vec4_instruction *spill_write_inst = NULL; foreach_block_and_inst(block, vec4_instruction, inst, cfg) { + /* We don't spill registers used for scratch */ + if (inst-opcode == SHADER_OPCODE_GEN4_SCRATCH_READ || + inst-opcode == SHADER_OPCODE_GEN4_SCRATCH_WRITE) + continue; + int scratch_reg = -1; + bool spill_reg_was_read = false; for (unsigned int i = 0; i 3; i++) { if (inst-src[i].file == GRF inst-src[i].reg == spill_reg_nr) { -if (scratch_reg == -1) { +if (!spill_reg_was_read) { + spill_reg_was_read = (!inst-predicate || + inst-opcode == BRW_OPCODE_SEL); +} + +/* If we are reading the spilled register right after writing + * to it we can skip the scratch read and use directly the + * register we used as source for the scratch write. For this + * to work we must check that: + * + * 1) The write is inconditional, that is, it is not predicated or + *it is a SEL. + * 2) All the channels that we read have been written in that + *last write instruction. + * + * We keep doing this for as long as the next instruction + * keeps reading the spilled register and break as soon as we + * find an instruction that doesn't. + */ +if (spill_write_inst +(!spill_write_inst-predicate || + spill_write_inst-opcode == BRW_OPCODE_SEL) +((brw_mask_for_swizzle(inst-src[i].swizzle) + ~spill_write_inst-dst.writemask) == 0)) { + scratch_reg = spill_write_inst-dst.reg; +} else if (scratch_reg == -1) { One suggestion: You could factor out the rather complex caching logic into a separate function (e.g. 'bool can_reuse_scratch_for_source(const vec4_instruction *, unsigned i, unsigned scratch_reg)'). The function would simply compare scratch_reg with the sources of the current instruction (up to src) and the sources and destination of the previous Huh, of course up to i is what I intended to write. non-scratch_read/write instruction. If there's a match it would check that the regioning is compatible with the i-th source and return true in that case. This would have several benefits: - It would keep ::spill_reg() simple and the caching heuristic factored out. The only thing spill_reg() would still need to take care of is keep track of the last spilling temporary (e.g. as you're doing it now with the scratch_reg variable) and pass it to can_reuse_scratch_for_source(), but it would no longer be necessary to reset it to -1 when reuse is not possible. - It would allow you to use a single implementation of the caching policy to implement both spill_reg() and evaluate_spill_costs(), what would make sure that things don't go
Re: [Mesa-dev] [PATCH 0/9] Mostly trivial clean ups
Patches 1-8 are Acked-by: Jason Ekstrand jason.ekstr...@intel.com Thanks for cleaning that stuff up. On Thu, Jul 30, 2015 at 7:14 AM, Ian Romanick i...@freedesktop.org wrote: All but the last of these patches have been sitting in one tree or another for quite some time. All of these files were touched as part of other work, but that work is stalled a bit. At Ken's suggestion, I'm sending out the trivial bits just to prune my trees a bit. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines
On PNV platform, for 1 pixel line thickness or less, the general anti-aliasing algorithm gives up, and a garbage line is generated. Setting a Line Width of 0.0 specifies the rasterization of the thinnest (one-pixel-wide), non-antialiased lines. Lines rendered with zero Line Width are rasterized using Grid Intersection Quantization rules as specified by bspec G45: Volume 2: 3D/Media, 7.3.13.1 Zero-Width (Cosmetic) Line Rasterization section. This patch follow the same rules as patches fixing the https://bugs.freedesktop.org/show_bug.cgi?id=28832 bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367 Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/drivers/dri/i915/i915_state.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/mesa/drivers/dri/i915/i915_state.c b/src/mesa/drivers/dri/i915/i915_state.c index 5f10b84..6cd342c 100644 --- a/src/mesa/drivers/dri/i915/i915_state.c +++ b/src/mesa/drivers/dri/i915/i915_state.c @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat widthf) width = (int) (widthf * 2); width = CLAMP(width, 1, 0xf); + + if (ctx-Line.Width 1.5 || widthf 1.5) { +/* For 1 pixel line thickness or less, the general + * anti-aliasing algorithm gives up, and a garbage line is + * generated. Setting a Line Width of 0.0 specifies the + * rasterization of the thinnest (one-pixel-wide), + * non-antialiased lines. + * + * Lines rendered with zero Line Width are rasterized using + * Grid Intersection Quantization rules as specified by + * bspec G45: Volume 2: 3D/Media, + * 7.3.13.1 Zero-Width (Cosmetic) Line Rasterization section + */ +width = 0; + } lis4 |= width S4_LINE_WIDTH_SHIFT; if (lis4 != i915-state.Ctx[I915_CTXREG_LIS4]) { -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/9] i965: Trivial formatting changes in brw_draw.c
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_draw.c | 102 +-- 1 file changed, 51 insertions(+), 51 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw.c b/src/mesa/drivers/dri/i965/brw_draw.c index b5ef276..9113d0f 100644 --- a/src/mesa/drivers/dri/i965/brw_draw.c +++ b/src/mesa/drivers/dri/i965/brw_draw.c @@ -104,8 +104,8 @@ get_hw_prim_for_gl_prim(int mode) * programs be immune to the active primitive (ie. cope with all * possibilities). That may not be realistic however. */ -static void brw_set_prim(struct brw_context *brw, - const struct _mesa_prim *prim) +static void +brw_set_prim(struct brw_context *brw, const struct _mesa_prim *prim) { struct gl_context *ctx = brw-ctx; uint32_t hw_prim = get_hw_prim_for_gl_prim(prim-mode); @@ -138,15 +138,12 @@ static void brw_set_prim(struct brw_context *brw, } } -static void gen6_set_prim(struct brw_context *brw, - const struct _mesa_prim *prim) +static void +gen6_set_prim(struct brw_context *brw, const struct _mesa_prim *prim) { - uint32_t hw_prim; - DBG(PRIM: %s\n, _mesa_enum_to_string(prim-mode)); - hw_prim = get_hw_prim_for_gl_prim(prim-mode); - + const uint32_t hw_prim = get_hw_prim_for_gl_prim(prim-mode); if (hw_prim != brw-primitive) { brw-primitive = hw_prim; brw-ctx.NewDriverState |= BRW_NEW_PRIMITIVE; @@ -162,7 +159,8 @@ static void gen6_set_prim(struct brw_context *brw, * quads so that those dangling vertices won't get drawn when we convert to * trifans/tristrips. */ -static GLuint trim(GLenum prim, GLuint length) +static GLuint +trim(GLenum prim, GLuint length) { if (prim == GL_QUAD_STRIP) return length 3 ? (length - length % 2) : 0; @@ -173,14 +171,14 @@ static GLuint trim(GLenum prim, GLuint length) } -static void brw_emit_prim(struct brw_context *brw, - const struct _mesa_prim *prim, - uint32_t hw_prim) +static void +brw_emit_prim(struct brw_context *brw, + const struct _mesa_prim *prim, + uint32_t hw_prim) { int verts_per_instance; int vertex_access_type; int indirect_flag; - int predicate_enable; DBG(PRIM: %s %d %d\n, _mesa_enum_to_string(prim-mode), prim-start, prim-count); @@ -216,9 +214,8 @@ static void brw_emit_prim(struct brw_context *brw, * and missed flushes of the render cache as it heads to other parts of * the besides the draw code. */ - if (brw-always_flush_cache) { + if (brw-always_flush_cache) brw_emit_mi_flush(brw); - } /* If indirect, emit a bunch of loads from the indirect BO. */ if (prim-is_indirect) { @@ -256,18 +253,16 @@ static void brw_emit_prim(struct brw_context *brw, OUT_BATCH(0); ADVANCE_BATCH(); } - } - else { + } else { indirect_flag = 0; } BEGIN_BATCH(brw-gen = 7 ? 7 : 6); if (brw-gen = 7) { - if (brw-predicate.state == BRW_PREDICATE_STATE_USE_BIT) - predicate_enable = GEN7_3DPRIM_PREDICATE_ENABLE; - else - predicate_enable = 0; + const int predicate_enable = + (brw-predicate.state == BRW_PREDICATE_STATE_USE_BIT) + ? GEN7_3DPRIM_PREDICATE_ENABLE : 0; OUT_BATCH(CMD_3D_PRIM 16 | (7 - 2) | indirect_flag | predicate_enable); OUT_BATCH(hw_prim | vertex_access_type); @@ -283,14 +278,14 @@ static void brw_emit_prim(struct brw_context *brw, OUT_BATCH(base_vertex_location); ADVANCE_BATCH(); - if (brw-always_flush_cache) { + if (brw-always_flush_cache) brw_emit_mi_flush(brw); - } } -static void brw_merge_inputs( struct brw_context *brw, - const struct gl_client_array *arrays[]) +static void +brw_merge_inputs(struct brw_context *brw, + const struct gl_client_array *arrays[]) { const struct gl_context *ctx = brw-ctx; GLuint i; @@ -359,7 +354,8 @@ static void brw_merge_inputs( struct brw_context *brw, * Also mark any render targets which will be textured as needing a render * cache flush. */ -static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw) +static void +brw_postdraw_set_buffers_need_resolve(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; struct gl_framebuffer *fb = ctx-DrawBuffer; @@ -399,21 +395,22 @@ static void brw_postdraw_set_buffers_need_resolve(struct brw_context *brw) /* May fail if out of video memory for texture or vbo upload, or on * fallback conditions. */ -static void brw_try_draw_prims( struct gl_context *ctx, -const struct gl_client_array *arrays[], -const struct _mesa_prim *prims, -GLuint nr_prims, -const struct _mesa_index_buffer
[Mesa-dev] [PATCH 7/9] i965: Remove extern declaration for nonexistent state atom
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_state.h | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h index 2eff1b5..a99bbbd 100644 --- a/src/mesa/drivers/dri/i965/brw_state.h +++ b/src/mesa/drivers/dri/i965/brw_state.h @@ -121,7 +121,6 @@ extern const struct brw_tracked_state gen6_wm_state; extern const struct brw_tracked_state gen7_depthbuffer; extern const struct brw_tracked_state gen7_clip_state; extern const struct brw_tracked_state gen7_disable_stages; -extern const struct brw_tracked_state gen7_gs_push_constants; extern const struct brw_tracked_state gen7_gs_state; extern const struct brw_tracked_state gen7_ps_state; extern const struct brw_tracked_state gen7_push_constant_space; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 4/9] i965: Trivial formatting changes in brw_misc_state.c
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_misc_state.c | 49 ++ 1 file changed, 23 insertions(+), 26 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_misc_state.c b/src/mesa/drivers/dri/i965/brw_misc_state.c index 16b0ed2..e9d9467 100644 --- a/src/mesa/drivers/dri/i965/brw_misc_state.c +++ b/src/mesa/drivers/dri/i965/brw_misc_state.c @@ -44,7 +44,8 @@ #include main/glformats.h /* Constant single cliprect for framebuffer object or DRI2 drawing */ -static void upload_drawing_rect(struct brw_context *brw) +static void +upload_drawing_rect(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; const struct gl_framebuffer *fb = ctx-DrawBuffer; @@ -73,7 +74,8 @@ const struct brw_tracked_state brw_drawing_rect = { * The state pointers in this packet are all relative to the general state * base address set by CMD_STATE_BASE_ADDRESS, which is 0. */ -static void upload_pipelined_state_pointers(struct brw_context *brw ) +static void +upload_pipelined_state_pointers(struct brw_context *brw) { if (brw-gen == 5) { /* Need to flush before changing clip max threads for errata. */ @@ -104,7 +106,8 @@ static void upload_pipelined_state_pointers(struct brw_context *brw ) brw-ctx.NewDriverState |= BRW_NEW_PSP; } -static void upload_psp_urb_cbs(struct brw_context *brw ) +static void +upload_psp_urb_cbs(struct brw_context *brw) { upload_pipelined_state_pointers(brw); brw_upload_urb_fence(brw); @@ -700,13 +703,11 @@ const struct brw_tracked_state brw_depthbuffer = { .emit = brw_emit_depthbuffer, }; - - -/*** +/** * Polygon stipple packet */ - -static void upload_polygon_stipple(struct brw_context *brw) +static void +upload_polygon_stipple(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; GLuint i; @@ -728,8 +729,7 @@ static void upload_polygon_stipple(struct brw_context *brw) if (_mesa_is_winsys_fbo(ctx-DrawBuffer)) { for (i = 0; i 32; i++) OUT_BATCH(ctx-PolygonStipple[31 - i]); /* invert */ - } - else { + } else { for (i = 0; i 32; i++) OUT_BATCH(ctx-PolygonStipple[i]); } @@ -745,12 +745,11 @@ const struct brw_tracked_state brw_polygon_stipple = { .emit = upload_polygon_stipple }; - -/*** +/** * Polygon stipple offset packet */ - -static void upload_polygon_stipple_offset(struct brw_context *brw) +static void +upload_polygon_stipple_offset(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; @@ -785,10 +784,11 @@ const struct brw_tracked_state brw_polygon_stipple_offset = { .emit = upload_polygon_stipple_offset }; -/** +/** * AA Line parameters */ -static void upload_aa_line_parameters(struct brw_context *brw) +static void +upload_aa_line_parameters(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; @@ -815,11 +815,11 @@ const struct brw_tracked_state brw_aa_line_parameters = { .emit = upload_aa_line_parameters }; -/*** +/** * Line stipple packet */ - -static void upload_line_stipple(struct brw_context *brw) +static void +upload_line_stipple(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; GLfloat tmp; @@ -837,8 +837,7 @@ static void upload_line_stipple(struct brw_context *brw) tmp = 1.0f / ctx-Line.StippleFactor; tmpi = tmp * (116); OUT_BATCH(tmpi 15 | ctx-Line.StippleFactor); - } - else { + } else { /* in U1.13 */ tmp = 1.0f / ctx-Line.StippleFactor; tmpi = tmp * (113); @@ -856,7 +855,6 @@ const struct brw_tracked_state brw_line_stipple = { .emit = upload_line_stipple }; - void brw_emit_select_pipeline(struct brw_context *brw, enum brw_pipeline pipeline) { @@ -872,11 +870,9 @@ brw_emit_select_pipeline(struct brw_context *brw, enum brw_pipeline pipeline) ADVANCE_BATCH(); } - -/*** +/** * Misc invariant state packets */ - void brw_upload_invariant_state(struct brw_context *brw) { @@ -930,7 +926,8 @@ const struct brw_tracked_state brw_invariant_state = { * surface state objects, but not the surfaces that the surface state * objects point to. */ -static void upload_state_base_address( struct brw_context *brw ) +static void +upload_state_base_address(struct brw_context *brw) { /* FINISHME: According to section 3.6.1 STATE_BASE_ADDRESS of * vol1a of the G45 PRM, MI_FLUSH with the ISC invalidate should be -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org
[Mesa-dev] [PATCH 5/9] i965: Trivial formatting changes in gen6_multisample_state.c
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/gen6_multisample_state.c | 7 ++- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen6_multisample_state.c b/src/mesa/drivers/dri/i965/gen6_multisample_state.c index cf1421e..8444c0c 100644 --- a/src/mesa/drivers/dri/i965/gen6_multisample_state.c +++ b/src/mesa/drivers/dri/i965/gen6_multisample_state.c @@ -143,7 +143,6 @@ gen6_emit_3dstate_multisample(struct brw_context *brw, ADVANCE_BATCH(); } - unsigned gen6_determine_sample_mask(struct brw_context *brw) { @@ -176,7 +175,6 @@ gen6_determine_sample_mask(struct brw_context *brw) } } - /** * 3DSTATE_SAMPLE_MASK */ @@ -189,15 +187,14 @@ gen6_emit_3dstate_sample_mask(struct brw_context *brw, unsigned mask) ADVANCE_BATCH(); } - -static void upload_multisample_state(struct brw_context *brw) +static void +upload_multisample_state(struct brw_context *brw) { /* BRW_NEW_NUM_SAMPLES */ gen6_emit_3dstate_multisample(brw, brw-num_samples); gen6_emit_3dstate_sample_mask(brw, gen6_determine_sample_mask(brw)); } - const struct brw_tracked_state gen6_multisample_state = { .dirty = { .mesa = _NEW_MULTISAMPLE, -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 8/9] i965: Make gen7_upload_ps_state static
From: Ian Romanick ian.d.roman...@intel.com It is only ever called from within the same file. Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_state.h | 9 - src/mesa/drivers/dri/i965/gen7_wm_state.c | 2 +- 2 files changed, 1 insertion(+), 10 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_state.h b/src/mesa/drivers/dri/i965/brw_state.h index a99bbbd..cf3a96c 100644 --- a/src/mesa/drivers/dri/i965/brw_state.h +++ b/src/mesa/drivers/dri/i965/brw_state.h @@ -266,15 +266,6 @@ void brw_update_renderbuffer_surfaces(struct brw_context *brw, uint32_t render_target_start, uint32_t *surf_offset); -/* gen7_wm_state.c */ -void -gen7_upload_ps_state(struct brw_context *brw, - const struct gl_fragment_program *fp, - const struct brw_stage_state *stage_state, - const struct brw_wm_prog_data *prog_data, - bool enable_dual_src_blend, unsigned sample_mask, - unsigned fast_clear_op); - /* gen7_wm_surface_state.c */ uint32_t gen7_surface_tiling_mode(uint32_t tiling); uint32_t gen7_surface_msaa_bits(unsigned num_samples, enum intel_msaa_layout l); diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c b/src/mesa/drivers/dri/i965/gen7_wm_state.c index ea11ae8..d7be58d 100644 --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c @@ -127,7 +127,7 @@ const struct brw_tracked_state gen7_wm_state = { .emit = upload_wm_state, }; -void +static void gen7_upload_ps_state(struct brw_context *brw, const struct gl_fragment_program *fp, const struct brw_stage_state *stage_state, -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/9] i965: Trivial formatting changes in brw_draw_upload.c
From: Ian Romanick ian.d.roman...@intel.com Signed-off-by: Ian Romanick ian.d.roman...@intel.com --- src/mesa/drivers/dri/i965/brw_draw_upload.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c b/src/mesa/drivers/dri/i965/brw_draw_upload.c index c6dd69b..cbfd585 100644 --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c @@ -649,7 +649,8 @@ emit_vertex_buffer_state(struct brw_context *brw, } #define EMIT_VERTEX_BUFFER_STATE(...) __map = emit_vertex_buffer_state(__VA_ARGS__, __map) -static void brw_emit_vertices(struct brw_context *brw) +static void +brw_emit_vertices(struct brw_context *brw) { GLuint i; @@ -859,7 +860,8 @@ const struct brw_tracked_state brw_vertices = { .emit = brw_emit_vertices, }; -static void brw_upload_indices(struct brw_context *brw) +static void +brw_upload_indices(struct brw_context *brw) { struct gl_context *ctx = brw-ctx; const struct _mesa_index_buffer *index_buffer = brw-ib.ib; @@ -939,7 +941,8 @@ const struct brw_tracked_state brw_indices = { .emit = brw_upload_indices, }; -static void brw_emit_index_buffer(struct brw_context *brw) +static void +brw_emit_index_buffer(struct brw_context *brw) { const struct _mesa_index_buffer *index_buffer = brw-ib.ib; GLuint cut_index_setting; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 0/3] Fix glDrawTransformFeedbackStream
This wasn't implemented. Piglit tests have yet to be written. Hopefully, they will just materialize themselves. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines
Was sent 2 month ago, still not review/upstream -Original Message- From: Predut, Marius Sent: Thursday, July 30, 2015 7:04 PM To: mesa-dev@lists.freedesktop.org Cc: Predut, Marius Subject: [Mesa-dev][PATCH] i915/aa: fixing anti-aliasing bug for thinnest width lines On PNV platform, for 1 pixel line thickness or less, the general anti-aliasing algorithm gives up, and a garbage line is generated. Setting a Line Width of 0.0 specifies the rasterization of the thinnest (one-pixel-wide), non-antialiased lines. Lines rendered with zero Line Width are rasterized using Grid Intersection Quantization rules as specified by bspec G45: Volume 2: 3D/Media, 7.3.13.1 Zero-Width (Cosmetic) Line Rasterization section. This patch follow the same rules as patches fixing the https://bugs.freedesktop.org/show_bug.cgi?id=28832 bug. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90367 Signed-off-by: Marius Predut marius.pre...@intel.com --- src/mesa/drivers/dri/i915/i915_state.c | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/mesa/drivers/dri/i915/i915_state.c b/src/mesa/drivers/dri/i915/i915_state.c index 5f10b84..6cd342c 100644 --- a/src/mesa/drivers/dri/i915/i915_state.c +++ b/src/mesa/drivers/dri/i915/i915_state.c @@ -599,6 +599,21 @@ i915LineWidth(struct gl_context * ctx, GLfloat widthf) width = (int) (widthf * 2); width = CLAMP(width, 1, 0xf); + + if (ctx-Line.Width 1.5 || widthf 1.5) { +/* For 1 pixel line thickness or less, the general + * anti-aliasing algorithm gives up, and a garbage line is + * generated. Setting a Line Width of 0.0 specifies the + * rasterization of the thinnest (one-pixel-wide), + * non-antialiased lines. + * + * Lines rendered with zero Line Width are rasterized using + * Grid Intersection Quantization rules as specified by + * bspec G45: Volume 2: 3D/Media, + * 7.3.13.1 Zero-Width (Cosmetic) Line Rasterization section + */ +width = 0; + } lis4 |= width S4_LINE_WIDTH_SHIFT; if (lis4 != i915-state.Ctx[I915_CTXREG_LIS4]) { -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 00/78] i965: A new vec4 backend based on NIR
On Jul 27, 2015 3:39 PM, Jason Ekstrand ja...@jlekstrand.net wrote: On Mon, Jul 27, 2015 at 2:07 PM, Eduardo Lima Mitev el...@igalia.com wrote: On 07/25/2015 03:08 AM, Jason Ekstrand wrote: Alright, I got through it again... I asked for a few trivial changes on a few of the patches. With those fixed, everything except patch 65 and 66 are Thanks! we have fixed most of the patches already. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com While the requested changes on the texturing patches are not complicated, I would like to see the updated version of those two patches before I give a formal R-B. Sure, we will send new versions of patch 65 and 66 as soon as we have them ready and tested. There is one caveat to the above review: Something is badly broken on HSW. I'm not sure what, but one of the patches badly breaks HSW even in the non-NIR case. We need to figure out what that is before pushing anything. That said, please don't re-send the full series; just figure out where the bug is, fix it, and re-send the one patch. We have done extensive testing today, trying to find something strange that explains this breakage you are experiencing. Most of us have HSW laptops so it is our normal development environments. However, we have not been able to reproduce any fail that would suggest our branch breaks something. Using current master as baseline, I have tested both nir-vec4 and vec4_visitor against it (piglit and dEQP), and we see no regressions other that the one we described in the cover letter. Please, could you be more specific about the symptoms of this breakage and give us any possible info that would help us reproduce it locally? Maybe details of your system, kernel and gcc version; anything that might be different to a standard linux system. I'm not sure what happened there. I may have had a bad rebase or something. I pulled your branch, rebased on master, and tried it again. Now it seems to be ok. On Broadwell and Cherryview, however, it's not so ok. It looks like you may have accidentally broken scalar VS or something like that. I can't even run glxgears on BDW without hitting an assert. If your dev machine is a HSW, you can use INTEL_DEVID_OVERRIDE to tell it to compile shaders as if it's a BDW and figure out why it's crashing. The patch titled nir/nir_lower_io: Add vec4 support makes glxgears render black on BDW. It seems like a later patch is causing the crash. In any case, you should be able to use INTEL_DEVID_OVERRIDE and compare the shader results. --Jason Any progress on this? Thanks again for the review and patience. Eduardo --Jason On Thu, Jul 23, 2015 at 3:16 AM, Eduardo Lima Mitev el...@igalia.com wrote: Hi, This is the second version of the series that adds a new vec4 backend for i965 based on NIR. The first series was sent some weeks ago [1] and went through review by Jason and Kenneth (thanks a lot!). This series is a result of addressing issues detected during that feedback. This series also adds support for the NIR-vec4 pass on geometry shaders and on ARB_vertex_programs. Both supports were work-in-progress by the time we sent the first version, and are now completed. The patch-sets for GS and ARB_vertex_program were added at the end of the series. Like the first version, we tested the backend on gen6 and gen7 (specifically, SNB, IVB and HSW); and with both piglit and dEQP. No regressions observed with piglit. The issues related with register spilling that we had in the first version have been fixed. With dEQP, we observe a regression that is causing the following 3 tests to fail: dEQP-GLES3.functional.shaders.loops.for_dynamic_iterations.vector_counter_vertex dEQP-GLES3.functional.shaders.loops.while_dynamic_iterations.vector_counter_vertex dEQP-GLES3.functional.shaders.loops.do_while_dynamic_iterations.vector_counter_vertex We have a patch that solves those specific cases, but we would like to discuss it independently, before including it in the series. We believe these regressions should not block the review of the series. As usual, a git tree is available at https://github.com/Igalia/mesa/tree/nir-vec4-v2 to ease testing. cheers, Eduardo [1] http://lists.freedesktop.org/archives/mesa-dev/2015-June/087448.html Alejandro Piñeiro (9): i965/vec4: Redefine make_reg_for_system_value() to allow reuse in NIR-vec4 pass i965/nir/vec4: Add setup for system values i965/nir/vec4: Implement intrinsics that load system values i965/nir/vec4: Implement atomic counter intrinsics (read, inc and dec) i965/nir: Disable alu_to_scalar pass on non-scalar shaders i965/vec4: Change vec4_visitor::swizzle_result() method to allow reuse i965/vec4: Add a new dst_reg constructor accepting a brw_reg_type i965/ir/vec4: Refactor visit(ir_texture *ir) i965/nir/vec4: Add implementation of
[Mesa-dev] Properly exposing limits with gallium
So I'm trying to get these things to line up, especially for nvc0. Here are the limits exposed by the blob drivers: http://people.freedesktop.org/~imirkin/glxinfo/glxinfo.html#v=Vendor and they reflect what the hardware is capable. More specifically, I need GL_MAX_VERTEX_OUTPUT_COMPONENTS = 128 GL_MAX_VARYING_FLOATS_ARB = 124 GL_MAX_FRAGMENT_INPUT_COMPONENTS = 128 GL_MAX_GEOMETRY_INPUT_COMPONENTS = 128 GL_MAX_GEOMETRY_OUTPUT_COMPONENTS = 128 What would the proper way to expose this be? Should we always just say MAX_VARYING_FLOATS = MAX_VERTEX_OUTPUT_COMPONENTS - 4? Or a separate PIPE_SHADER_CAP for the maxvaryingfloats? -ilia ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965: Extract push constant state to a new file
Recommended-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/gen6_constant_state.c | 189 src/mesa/drivers/dri/i965/gen6_vs_state.c | 88 --- src/mesa/drivers/dri/i965/gen7_vs_state.c | 75 -- 4 files changed, 190 insertions(+), 163 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/gen6_constant_state.c diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index e861c2c..1c87280 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -143,6 +143,7 @@ i965_FILES = \ gen6_blorp.h \ gen6_cc.c \ gen6_clip_state.c \ +gen6_constant_state.c \ gen6_depth_state.c \ gen6_depthstencil.c \ gen6_gs_state.c \ diff --git a/src/mesa/drivers/dri/i965/gen6_constant_state.c b/src/mesa/drivers/dri/i965/gen6_constant_state.c new file mode 100644 index 000..9d59d12 --- /dev/null +++ b/src/mesa/drivers/dri/i965/gen6_constant_state.c @@ -0,0 +1,189 @@ +/* + * Copyright © 2015 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +#include brw_context.h +#include brw_state.h +#include brw_defines.h +#include intel_batchbuffer.h +#include glsl/glsl_parser_extras.h + +void +gen7_upload_constant_state(struct brw_context *brw, + const struct brw_stage_state *stage_state, + bool active, unsigned opcode) +{ + uint32_t mocs = brw-gen 8 ? GEN7_MOCS_L3 : 0; + + /* Disable if the shader stage is inactive or there are no push constants. */ + active = active stage_state-push_const_size != 0; + + int dwords = brw-gen = 8 ? 11 : 7; + BEGIN_BATCH(dwords); + OUT_BATCH(opcode 16 | (dwords - 2)); + + /* Workaround for SKL+ (we use option #2 until we have a need for more +* constant buffers). This comes from the documentation for 3DSTATE_CONSTANT_* +* +* The driver must ensure The following case does not occur without a flush +* to the 3D engine: 3DSTATE_CONSTANT_* with buffer 3 read length equal to +* zero committed followed by a 3DSTATE_CONSTANT_* with buffer 0 read length +* not equal to zero committed. Possible ways to avoid this condition +* include: +* 1. always force buffer 3 to have a non zero read length +* 2. always force buffer 0 to a zero read length +*/ + if (brw-gen = 9 active) { + OUT_BATCH(0); + OUT_BATCH(stage_state-push_const_size); + } else { + OUT_BATCH(active ? stage_state-push_const_size : 0); + OUT_BATCH(0); + } + /* Pointer to the constant buffer. Covered by the set of state flags +* from gen6_prepare_wm_contants +*/ + if (brw-gen = 9 active) { + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + /* XXX: When using buffers other than 0, you need to specify the + * graphics virtual address regardless of INSPM/debug bits + */ + OUT_RELOC64(brw-batch.bo, I915_GEM_DOMAIN_RENDER, 0, + stage_state-push_const_offset); + OUT_BATCH(0); + OUT_BATCH(0); + } else if (brw-gen= 8) { + OUT_BATCH(active ? (stage_state-push_const_offset | mocs) : 0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + } else { + OUT_BATCH(active ? (stage_state-push_const_offset | mocs) : 0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + } + + ADVANCE_BATCH(); + + /* On SKL+ the new constants don't take effect until the next corresponding +* 3DSTATE_BINDING_TABLE_POINTER_* command is parsed so we need to ensure +* that is sent +*/ + if (brw-gen = 9) +
[Mesa-dev] [PATCH 2/2] i965: Consolidate push constant buffer setup
This patch replaces the previous patch listed in references. References: http://patchwork.freedesktop.org/patch/54099/ Recommended-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/gen6_constant_state.c | 104 ++-- 1 file changed, 62 insertions(+), 42 deletions(-) diff --git a/src/mesa/drivers/dri/i965/gen6_constant_state.c b/src/mesa/drivers/dri/i965/gen6_constant_state.c index 9d59d12..cde39c5 100644 --- a/src/mesa/drivers/dri/i965/gen6_constant_state.c +++ b/src/mesa/drivers/dri/i965/gen6_constant_state.c @@ -27,20 +27,71 @@ #include intel_batchbuffer.h #include glsl/glsl_parser_extras.h +static inline void +emit_3dstate_constant(struct brw_context *brw, + uint32_t opcode, + uint16_t read_length_0, + uint16_t read_length_1, + uint16_t read_length_2, + uint16_t read_length_3, + uint32_t ptr_0, + uint64_t ptr_1, + uint64_t ptr_2, + uint64_t ptr_3) +{ +#define OUT_RELOC_NULL(x) if (x != 0) { \ + OUT_RELOC(brw-batch.bo, I915_GEM_DOMAIN_RENDER, 0, x); } else { OUT_BATCH(x); } +#define OUT_RELOC64_NULL(x) if (x != 0) { \ + OUT_RELOC64(brw-batch.bo, I915_GEM_DOMAIN_RENDER, 0, x); \ +} else { OUT_BATCH(x); OUT_BATCH(x); } + + if (brw-gen = 8) { + BEGIN_BATCH(11); + OUT_BATCH(opcode 16 | (11 - 2)); + OUT_BATCH(read_length_0| read_length_1 16); + OUT_BATCH(read_length_2| read_length_3 16); + + OUT_BATCH(ptr_0); + OUT_BATCH(0); + /* XXX: When using buffers other than 0, you need to specify the + * graphics virtual address regardless of INSTPM/debug bits + */ + OUT_RELOC64_NULL(ptr_1); + OUT_RELOC64_NULL(ptr_2); + OUT_RELOC64_NULL(ptr_3); + + ADVANCE_BATCH(); + } else if (brw-gen = 6) { + BEGIN_BATCH(7); + OUT_BATCH(opcode 16 | (7 - 2)); + OUT_BATCH(read_length_0 | read_length_1 16); + OUT_BATCH(read_length_2 | read_length_3 16); + OUT_BATCH(ptr_0 | GEN7_MOCS_L3); + OUT_RELOC_NULL(ptr_1); + OUT_RELOC_NULL(ptr_2); + OUT_RELOC_NULL(ptr_3); + ADVANCE_BATCH(); + } else { + unreachable(unhandled gen in emit_3dstate_constant); + } + + +#undef OUT_RELOC64_NULL +#undef OUT_RELOC_NULL +} + void gen7_upload_constant_state(struct brw_context *brw, const struct brw_stage_state *stage_state, bool active, unsigned opcode) { - uint32_t mocs = brw-gen 8 ? GEN7_MOCS_L3 : 0; /* Disable if the shader stage is inactive or there are no push constants. */ active = active stage_state-push_const_size != 0; - int dwords = brw-gen = 8 ? 11 : 7; - BEGIN_BATCH(dwords); - OUT_BATCH(opcode 16 | (dwords - 2)); - + if (!active) { + emit_3dstate_constant(brw, opcode, 0, 0, 0, 0, 0, 0, 0, 0); + } else if (brw-gen = 9) { /* Workaround for SKL+ (we use option #2 until we have a need for more * constant buffers). This comes from the documentation for 3DSTATE_CONSTANT_* * @@ -52,45 +103,14 @@ gen7_upload_constant_state(struct brw_context *brw, * 1. always force buffer 3 to have a non zero read length * 2. always force buffer 0 to a zero read length */ - if (brw-gen = 9 active) { - OUT_BATCH(0); - OUT_BATCH(stage_state-push_const_size); + emit_3dstate_constant(brw, opcode, +0, 0, stage_state-push_const_size, 0, +0, 0, stage_state-push_const_offset, 0); } else { - OUT_BATCH(active ? stage_state-push_const_size : 0); - OUT_BATCH(0); + emit_3dstate_constant(brw, opcode, +stage_state-push_const_size, 0, 0, 0, +0, 0, stage_state-push_const_offset, 0); } - /* Pointer to the constant buffer. Covered by the set of state flags -* from gen6_prepare_wm_contants -*/ - if (brw-gen = 9 active) { - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - /* XXX: When using buffers other than 0, you need to specify the - * graphics virtual address regardless of INSPM/debug bits - */ - OUT_RELOC64(brw-batch.bo, I915_GEM_DOMAIN_RENDER, 0, - stage_state-push_const_offset); - OUT_BATCH(0); - OUT_BATCH(0); - } else if (brw-gen= 8) { - OUT_BATCH(active ? (stage_state-push_const_offset | mocs) : 0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - } else { - OUT_BATCH(active ? (stage_state-push_const_offset | mocs) : 0); - OUT_BATCH(0); - OUT_BATCH(0); - OUT_BATCH(0); - } - - ADVANCE_BATCH(); /* On SKL+ the new constants don't take effect until the next
Re: [Mesa-dev] [PATCH 00/21] NIR control flow modification
On Tuesday, July 21, 2015 07:54:14 PM Connor Abbott wrote: Back when I was getting NIR up and running, I created a bunch of functions that dealt with modifying control flow; they allowed you to remove control flow nodes (if's, loops, and basic blocks) as well as insert newly-created control flow nodes at certain points. The insertion API's worked well enough for the usecase we had, which was building up NIR while translating from something else, and the removal API found some use in e.g. the select peephole. But there are various things we want to do, some of which are becoming more and more urgent, which involve doing more invasive modifications to the IR. To be able to do these things, this series adds an API which is a little lower-level than the existing one, but a lot less dangerous than manipulating things directly. It uses most of the same internal machinery as the existing functions, but I needed to rework it extensively to fix issues that were usually obscure edges in the old code that become more important in the new code. For more detail on the API itself and how to use it, see patch 20. In addition, there are a few peripheral things that I did which are related to the core objective of the series: - The control flow code was entangled with the rest of the code in nir.c, and it was about to get larger, so I pulled it out into its own file (nir_control_flow.c) and gave it its own header (nir_control_flow.h). - The new API includes an extract function (for details see patch 20) which needs to take begin and end points. There were going to be many variants necessary for completeness, and even more for convenience, so I decided to do something about it by creating a cursor structure. A cursor is a small structure (8 bytes on a 32-bit system, 16 bytes on a 64-bit system) that represents some point where something can be inserted (before or after an instruction or before or after a block). There are helpers for constructing cursors in all the relevant places, and convenience helpers for creating cursors before/after a control flow node and before/after a control flow list. That way, the need for trivial nir_insert_thing_before/after_other_thing wrappers is gone, and it's easy to add new convenience helpers for cursors which will work with every API that uses a cursor. We probably want to use cursors to simplify instruction insertion/deletion and the builder, but I deferred that till later. The series was tested with an updated version of my dead control flow elimination series, which I'm going to post shortly. The structure of the series is as follows: patch 01: adds a lot more checking of block predecessors and successors in the validator, which is useful for checking the results of control flow modification. patch 02-03: prepares for splitting off the control flow code by getting rid of as many dependencies as possible. patch 04: moves the control flow code off to its own file. patch 05-15: fix many different problems with the current control flow code which will become more important when the new API is added. patch 16-19: introduce the cursor helper, and make the old control flow insertion API use it. patch 20: actually adds the new API. patch 21: removes the old, buggy implementation of nir_cf_node_remove() and makes it a small wrapper around the new API instead. The series can also be found at: git://people.freedesktop.org/~cwabbott0/mesa nir-control-flow-mod Connor Connor Abbott (21): nir/validate: check successors/predecessors more carefully nir: inline block_add_pred() a few places nir: make cleanup_cf_node() not use remove_defs_uses() nir: move control flow modification to its own file nir/cf: add insert_phi_undef() helper nir: add nir_foreach_phi_src_safe() nir/cf: add remove_phi_src() helper nir/cf: split up and improve nir_handle_remove_jumps() nir/cf: handle phi nodes better in split_block_beginning() nir/cf: add block_ends_in_jump() nir/cf: handle jumps in split_block_end() nir/cf: handle jumps better in stitch_blocks() nir/cf: remove uses of SSA definitions that are being deleted Patches 1-13 (the above) are: Reviewed-by: Kenneth Graunke kenn...@whitecape.org I haven't gotten to the rest, and I'm not sure if I will in the next week...sorry :( Hopefully someone else can take a look and you can land this. nir/cf: clean up jumps when cleaning up CF nodes nir/cf: fix link_blocks() when there are no successors nir/cf: add a cursor structure nir/cf: add split_block_before_instr() nir/cf: add split_block_cursor() nir/cf: use a cursor for inserting control flow nir/cf: add new control modification API's nir/cf: reimplement nir_cf_node_remove() using the new API src/gallium/auxiliary/nir/tgsi_to_nir.c | 1 + src/glsl/Makefile.sources | 3 + src/glsl/nir/glsl_to_nir.cpp| 1 +
[Mesa-dev] [PATCH] i965/fs: Add a very basic validation pass
Currently the validation pass only validates that regs_read and regs_written are consistent with the sizes of VGRF's. We can add more as we find it to be useful. --- src/mesa/drivers/dri/i965/Makefile.sources| 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 9 src/mesa/drivers/dri/i965/brw_fs.h| 1 + src/mesa/drivers/dri/i965/brw_fs_validate.cpp | 59 +++ 4 files changed, 70 insertions(+) create mode 100644 src/mesa/drivers/dri/i965/brw_fs_validate.cpp diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index e861c2c..26d19b8 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -62,6 +62,7 @@ i965_FILES = \ brw_fs_sel_peephole.cpp \ brw_fs_surface_builder.cpp \ brw_fs_surface_builder.h \ + brw_fs_validate.cpp \ brw_fs_vector_splitting.cpp \ brw_fs_visitor.cpp \ brw_gs.c \ diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 15fe364..7127991 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -4664,6 +4664,9 @@ fs_visitor::calculate_register_pressure() void fs_visitor::optimize() { + /* Start by validating the shader we currently have. */ + validate(); + /* bld is the common builder object pointing at the end of the program we * used to translate it into i965 IR. For the optimization and lowering * passes coming next, any code added after the end of the program without @@ -4683,6 +4686,8 @@ fs_visitor::optimize() assign_constant_locations(); demote_pull_constants(); + validate(); + #define OPT(pass, args...) ({ \ pass_num++; \ bool this_progress = pass(args); \ @@ -4695,6 +4700,8 @@ fs_visitor::optimize() backend_shader::dump_instructions(filename); \ } \ \ + validate(); \ +\ progress = progress || this_progress; \ this_progress;\ }) @@ -4756,6 +4763,8 @@ fs_visitor::optimize() OPT(lower_integer_multiplication); lower_uniform_pull_constant_loads(); + + validate(); } /** diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 4749c47..e532e79 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -153,6 +153,7 @@ public: void invalidate_live_intervals(); void calculate_live_intervals(); void calculate_register_pressure(); + void validate(); bool opt_algebraic(); bool opt_redundant_discard_jumps(); bool opt_cse(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_validate.cpp b/src/mesa/drivers/dri/i965/brw_fs_validate.cpp new file mode 100644 index 000..cd65a58 --- /dev/null +++ b/src/mesa/drivers/dri/i965/brw_fs_validate.cpp @@ -0,0 +1,59 @@ +/* + * Copyright © 2012 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +/** @file brw_fs_validate.cpp + * + * Implements a pass that validates various invariants of the IR. The current + * pass only validates that GRF's uses are sane. More can be added later. + */ + +#include brw_fs.h +#include brw_cfg.h + +#define fsv_assert(cond) ({ \ + if (!(cond)) { \ + fprintf(stderr, ASSERT: FS validation failed!\n); \ + v-dump_instruction(inst, stderr); \ + fprintf(stderr, %s:%d: %s\n, __FILE__,
[Mesa-dev] XDC2015 - Travel sponsorship
Hello everyone, We are 1.5 months away from XDC and 20 days away from the proposals deadline[1]! If you did not manage to secure funding from your company but still think you could benefit the community by giving a talk, we encourage you to send an email to the board of X.Org with your talk proposal and we will review it in a timely fashion. Looking forward to seeing a lot of you soon, in Toronto! Martin, on behalf of the board of directors of X.Org [1] http://www.x.org/wiki/Other/Press/CFP2015/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/fs: Add a very basic validation pass
On Thu, Jul 30, 2015 at 3:42 PM, Jason Ekstrand ja...@jlekstrand.net wrote: Currently the validation pass only validates that regs_read and regs_written are consistent with the sizes of VGRF's. We can add more as we find it to be useful. --- src/mesa/drivers/dri/i965/Makefile.sources| 1 + src/mesa/drivers/dri/i965/brw_fs.cpp | 9 src/mesa/drivers/dri/i965/brw_fs.h| 1 + src/mesa/drivers/dri/i965/brw_fs_validate.cpp | 59 +++ 4 files changed, 70 insertions(+) create mode 100644 src/mesa/drivers/dri/i965/brw_fs_validate.cpp diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index e861c2c..26d19b8 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -62,6 +62,7 @@ i965_FILES = \ brw_fs_sel_peephole.cpp \ brw_fs_surface_builder.cpp \ brw_fs_surface_builder.h \ + brw_fs_validate.cpp \ brw_fs_vector_splitting.cpp \ brw_fs_visitor.cpp \ brw_gs.c \ diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp b/src/mesa/drivers/dri/i965/brw_fs.cpp index 15fe364..7127991 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp @@ -4664,6 +4664,9 @@ fs_visitor::calculate_register_pressure() void fs_visitor::optimize() { + /* Start by validating the shader we currently have. */ + validate(); + /* bld is the common builder object pointing at the end of the program we * used to translate it into i965 IR. For the optimization and lowering * passes coming next, any code added after the end of the program without @@ -4683,6 +4686,8 @@ fs_visitor::optimize() assign_constant_locations(); demote_pull_constants(); + validate(); + #define OPT(pass, args...) ({ \ pass_num++; \ bool this_progress = pass(args); \ @@ -4695,6 +4700,8 @@ fs_visitor::optimize() backend_shader::dump_instructions(filename); \ } \ \ + validate(); \ +\ progress = progress || this_progress; \ this_progress;\ }) @@ -4756,6 +4763,8 @@ fs_visitor::optimize() OPT(lower_integer_multiplication); lower_uniform_pull_constant_loads(); + + validate(); } /** diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h index 4749c47..e532e79 100644 --- a/src/mesa/drivers/dri/i965/brw_fs.h +++ b/src/mesa/drivers/dri/i965/brw_fs.h @@ -153,6 +153,7 @@ public: void invalidate_live_intervals(); void calculate_live_intervals(); void calculate_register_pressure(); + void validate(); bool opt_algebraic(); bool opt_redundant_discard_jumps(); bool opt_cse(); diff --git a/src/mesa/drivers/dri/i965/brw_fs_validate.cpp b/src/mesa/drivers/dri/i965/brw_fs_validate.cpp new file mode 100644 index 000..cd65a58 --- /dev/null +++ b/src/mesa/drivers/dri/i965/brw_fs_validate.cpp @@ -0,0 +1,59 @@ +/* + * Copyright © 2012 Intel Corporation 2015 + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + */ + +/** @file brw_fs_validate.cpp + * + * Implements a pass that validates various invariants of the IR. The current + * pass only validates that GRF's uses are sane. More can be added later. + */ + +#include brw_fs.h +#include brw_cfg.h +
Re: [Mesa-dev] [PATCH 10/20] i965/fs: Implement image load, store and atomic.
On Thu, Jul 23, 2015 at 4:38 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: This all looks correct as far as I can tell. However, I'm very concerned about the number of checks such as has_matching_typed_format() that are built-in to the compiler (via surface_builder) where we then go on to do something that is highly dependant on state setup doing the exact same check (not via surface_builder) and doing the right thing. Another example would be the interplay with has_split_bit_layout. How do we know these won't get out of sync? They cannot get out of sync because the policy which surface format to use for a given GLSL format is shared between the compiler and the state-setup code, see brw_lower_mesa_image_format() in i965: Implement surface state set-up for shader images.. I came back to this today and looked at it again. I think my main objection is that all of these metadata functions work in terms of a devinfo and a format and then immediately call brw_lower_mesa_image_format. Why don't they work instead in terms comparing the lowered format to the nominal format? Then it would be a lot more obvious that you're doing a transformation from one format to another. As is, it's a bunch of magic what do I have to do next queries which, while in the same namespace, are in a different file. It would also be more efficient because you would only call brw_lower_mesa_image_format once. Does that make more sense? --Jason Thanks. On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: v2: Drop VEC4 suport. v3: Rebase. --- .../drivers/dri/i965/brw_fs_surface_builder.cpp| 216 + src/mesa/drivers/dri/i965/brw_fs_surface_builder.h | 17 ++ 2 files changed, 233 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp index ea1c4aa..46b449f 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp @@ -587,3 +587,219 @@ namespace { } } } + +namespace brw { + namespace image_access { + /** + * Load a vector from a surface of the given format and dimensionality + * at the given coordinates. + */ + fs_reg + emit_image_load(const fs_builder bld, + const fs_reg image, const fs_reg addr, + mesa_format format, unsigned dims) + { + using namespace image_format_info; + using namespace image_format_conversion; + using namespace image_validity; + using namespace image_coordinates; + using namespace surface_access; + const brw_device_info *devinfo = bld.shader-devinfo; + const mesa_format lower_format = +brw_lower_mesa_image_format(devinfo, format); + fs_reg tmp; + + if (has_matching_typed_format(devinfo, format)) { +/* Hopefully we get here most of the time... */ +tmp = emit_typed_read(bld, image, addr, dims, + _mesa_format_num_components(lower_format)); + } else { +/* Untyped surface reads return 32 bits of the surface per + * component, without any sort of unpacking or type conversion, + */ +const unsigned size = _mesa_get_format_bytes(format) / 4; + +/* they don't properly handle out of bounds access, so we have to + * check manually if the coordinates are valid and predicate the + * surface read on the result, + */ +const brw_predicate pred = + emit_bounds_check(bld, image, addr, dims); + +/* and they don't know about surface coordinates, we need to + * convert them to a raw memory offset. + */ +const fs_reg laddr = emit_address_calculation(bld, image, addr, dims); + +tmp = emit_untyped_read(bld, image, laddr, 1, size, pred); + +/* An out of bounds surface access should give zero as result. */ +for (unsigned c = 0; c 4; ++c) + set_predicate(pred, bld.SEL(offset(tmp, bld, c), + offset(tmp, bld, c), fs_reg(0))); + } + + /* Set the register type to D instead of UD if the data type is + * represented as a signed integer in memory so that sign extension + * is handled correctly by unpack. + */ + if (needs_sign_extension(format)) +tmp = retype(tmp, BRW_REGISTER_TYPE_D); + + if (!has_supported_bit_layout(devinfo, format)) { +/* Unpack individual vector components from the bitfield if the + * hardware is unable to do it for us. + */ +if
[Mesa-dev] [PATCH] r600, compute: force tiling on 2D and 3D texture compute resources
To circumvent a problem occuring when LINEAR_ALIGNED array mode is selected on a TEXTURE_2D RAT. This configuration causes MEM_RAT STORE_TYPED to write to incorrect locations. --- src/gallium/drivers/radeon/r600_texture.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/src/gallium/drivers/radeon/r600_texture.c b/src/gallium/drivers/radeon/r600_texture.c index 8169e05..8979a98 100644 --- a/src/gallium/drivers/radeon/r600_texture.c +++ b/src/gallium/drivers/radeon/r600_texture.c @@ -706,6 +706,7 @@ static unsigned r600_choose_tiling(struct r600_common_screen *rscreen, const struct pipe_resource *templ) { const struct util_format_description *desc = util_format_description(templ-format); + bool force_tiling = templ-flags R600_RESOURCE_FLAG_FORCE_TILING; /* MSAA resources must be 2D tiled. */ if (templ-nr_samples 1) @@ -715,10 +716,15 @@ static unsigned r600_choose_tiling(struct r600_common_screen *rscreen, if (templ-flags R600_RESOURCE_FLAG_TRANSFER) return RADEON_SURF_MODE_LINEAR_ALIGNED; + /* Force tiling on TEXTURE_2D and TEXTURE_3D compute resources. */ + if ((templ-target == PIPE_TEXTURE_2D || +templ-target == PIPE_TEXTURE_3D) + (templ-bind PIPE_BIND_COMPUTE_RESOURCE)) + force_tiling = true; + /* Handle common candidates for the linear mode. * Compressed textures must always be tiled. */ - if (!(templ-flags R600_RESOURCE_FLAG_FORCE_TILING) - !util_format_is_compressed(templ-format)) { + if (!force_tiling !util_format_is_compressed(templ-format)) { /* Not everything can be linear, so we cannot enforce it * for all textures. */ if ((rscreen-debug_flags DBG_NO_TILING) -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Extract push constant state to a new file
On Thu, Jul 30, 2015 at 2:31 PM, Ben Widawsky benjamin.widaw...@intel.com wrote: Recommended-by: Kenneth Graunke kenn...@whitecape.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/Makefile.sources | 1 + src/mesa/drivers/dri/i965/gen6_constant_state.c | 189 src/mesa/drivers/dri/i965/gen6_vs_state.c | 88 --- src/mesa/drivers/dri/i965/gen7_vs_state.c | 75 -- 4 files changed, 190 insertions(+), 163 deletions(-) create mode 100644 src/mesa/drivers/dri/i965/gen6_constant_state.c diff --git a/src/mesa/drivers/dri/i965/Makefile.sources b/src/mesa/drivers/dri/i965/Makefile.sources index e861c2c..1c87280 100644 --- a/src/mesa/drivers/dri/i965/Makefile.sources +++ b/src/mesa/drivers/dri/i965/Makefile.sources @@ -143,6 +143,7 @@ i965_FILES = \ gen6_blorp.h \ gen6_cc.c \ gen6_clip_state.c \ +gen6_constant_state.c \ Use tabs in Makefiles. gen6_depth_state.c \ gen6_depthstencil.c \ gen6_gs_state.c \ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] st/mesa: implement DrawTransformFeedbackStream
On 07/30/2015 04:26 PM, Marek Olšák wrote: From: Marek Olšák marek.ol...@amd.com --- src/mesa/state_tracker/st_cb_xformfb.c | 58 ++ src/mesa/state_tracker/st_cb_xformfb.h | 2 +- src/mesa/state_tracker/st_draw.c | 2 +- 3 files changed, 33 insertions(+), 29 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c index 07c118e..85a8620 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.c +++ b/src/mesa/state_tracker/st_cb_xformfb.c @@ -54,9 +54,9 @@ struct st_transform_feedback_object { struct pipe_stream_output_target *targets[PIPE_MAX_SO_BUFFERS]; /* This encapsulates the count that can be used as a source for draw_vbo. -* It contains a stream output target from the last call of -* EndTransformFeedback. */ - struct pipe_stream_output_target *draw_count; +* It contains stream output targets from the last call of +* EndTransformFeedback for each stream. */ + struct pipe_stream_output_target *draw_count[MAX_VERTEX_STREAMS]; }; static inline struct st_transform_feedback_object * @@ -88,7 +88,8 @@ st_delete_transform_feedback(struct gl_context *ctx, st_transform_feedback_object(obj); unsigned i; - pipe_so_target_reference(sobj-draw_count, NULL); + for (i = 0; i ARRAY_SIZE(sobj-draw_count); i++) + pipe_so_target_reference(sobj-draw_count[i], NULL); /* Unreference targets. */ for (i = 0; i sobj-num_targets; i++) { @@ -123,9 +124,12 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum mode, struct st_buffer_object *bo = st_buffer_object(sobj-base.Buffers[i]); if (bo bo-buffer) { + unsigned stream = +obj-shader_program-LinkedTransformFeedback.BufferStream[i]; + /* Check whether we need to recreate the target. */ if (!sobj-targets[i] || - sobj-targets[i] == sobj-draw_count || + sobj-targets[i] == sobj-draw_count[stream] || sobj-targets[i]-buffer != bo-buffer || sobj-targets[i]-buffer_offset != sobj-base.Offset[i] || sobj-targets[i]-buffer_size != sobj-base.Size[i]) { @@ -178,24 +182,6 @@ st_resume_transform_feedback(struct gl_context *ctx, } -static struct pipe_stream_output_target * -st_transform_feedback_get_draw_target(struct gl_transform_feedback_object *obj) -{ - struct st_transform_feedback_object *sobj = - st_transform_feedback_object(obj); - unsigned i; - - for (i = 0; i ARRAY_SIZE(sobj-targets); i++) { - if (sobj-targets[i]) { - return sobj-targets[i]; - } - } - - assert(0); - return NULL; -} - - static void st_end_transform_feedback(struct gl_context *ctx, struct gl_transform_feedback_object *obj) @@ -203,22 +189,40 @@ st_end_transform_feedback(struct gl_context *ctx, struct st_context *st = st_context(ctx); struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); + unsigned i; cso_set_stream_outputs(st-cso_context, 0, NULL, NULL); - pipe_so_target_reference(sobj-draw_count, -st_transform_feedback_get_draw_target(obj)); + /* The next call to glDrawTransformFeedbackStream should use the vertex +* count from the last call to glEndTransformFeedback. +* Therefore, save the targets for each stream. +* +* NULL means the vertex counter is 0 (initial state). +*/ + for (i = 0; i ARRAY_SIZE(sobj-draw_count); i++) + pipe_so_target_reference(sobj-draw_count[i], NULL); + + for (i = 0; i ARRAY_SIZE(sobj-targets); i++) { + unsigned stream = + obj-shader_program-LinkedTransformFeedback.BufferStream[i]; + + /* Is it not boudn or already set for this stream? */ s/boudn/bound I assume this is a typo error. :) + if (!sobj-targets[i] || sobj-draw_count[stream]) + continue; + + pipe_so_target_reference(sobj-draw_count[stream], sobj-targets[i]); + } } void st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, -struct pipe_draw_info *out) +unsigned stream, struct pipe_draw_info *out) { struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); - out-count_from_stream_output = sobj-draw_count; + out-count_from_stream_output = sobj-draw_count[stream]; } diff --git a/src/mesa/state_tracker/st_cb_xformfb.h b/src/mesa/state_tracker/st_cb_xformfb.h index 998c418..16f5176 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.h +++ b/src/mesa/state_tracker/st_cb_xformfb.h @@ -40,7 +40,7 @@ st_init_xformfb_functions(struct dd_function_table *functions); extern void st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, -struct pipe_draw_info *out); +
Re: [Mesa-dev] i965 implementation of the ARB_shader_image_load_store built-ins. (v4)
Curro, What are we still wainting on for the image_load_store extension? I think I've given you R-B's on all but one or two of the compiler patches. Is the state setup stuff reviewed? Is there anything else that needs review? --Jason On Thu, Jul 23, 2015 at 6:58 AM, Francisco Jerez curroje...@riseup.net wrote: Jason Ekstrand ja...@jlekstrand.net writes: *whew*, I've made it through the entire series... Thanks! On Tue, Jul 21, 2015 at 9:38 AM, Francisco Jerez curroje...@riseup.net wrote: Another resend of the i965 compiler-related changes for ARB_shader_image_load_store, reworked to make use of the SIMD lowering infrastructure introduced in a previous series [1]. For a self-contained branch in testable form see [2]. [1] http://lists.freedesktop.org/archives/mesa-dev/2015-July/089009.html [2] http://cgit.freedesktop.org/~currojerez/mesa/log/?h=image-load-store-lower [PATCH 01/20] i965/fs: Define logical typed and untyped surface opcodes. [PATCH 02/20] i965/fs: Hook up SIMD lowering to unroll surface instructions of unsupported width. [PATCH 03/20] i965/fs: Implement lowering of logical surface instructions. [PATCH 04/20] i965/fs: Handle zero-size allocations in fs_builder::vgrf(). [PATCH 05/20] i965/fs: Import surface message builder helper functions. [PATCH 06/20] i965/fs: Import image access validity checks. The above are all Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com [PATCH 07/20] i965/fs: Import image memory offset calculation code. As I said in the e-mail, this needs a *lot* more comments. I'll re-try the review once you've actually explained what it's doing. [PATCH 08/20] i965/fs: Import image format metadata queries. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com [PATCH 09/20] i965/fs: Import image format conversion primitives. I had some minor concerns here. [PATCH 10/20] i965/fs: Implement image load, store and atomic. This looks pretty good. However, I had some concerns about things getting out-of-sync with state setup that I'd like to see addressed. [PATCH 11/20] i965/fs: Revisit NIR atomic counter intrinsic translation. [PATCH 12/20] i965/fs: Drop unused untyped surface read and atomic emit methods. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com [PATCH 13/20] i965: Teach type_size() about the size of an image uniform. [PATCH 14/20] i965: Implement logic to set up and upload an image uniform. I'm not particularly familiar with how mesa's uniform setup works. Ken? [PATCH 15/20] i965/fs: Don't overwrite fs_visitor::uniforms and ::param_size during the SIMD16 run. [PATCH 16/20] i965/fs: Execute nir_setup_uniforms, _inputs and _outputs unconditionally. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com [PATCH 17/20] i965/fs: Handle image uniforms in NIR programs. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com [PATCH 18/20] i965/fs: Translate image load, store and atomic NIR intrinsics. I had some comments on this one regarding encapsulation that I'd like to see addressed. [PATCH 19/20] i965/fs: Translate memory barrier NIR intrinsics. Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com [PATCH 20/20] i965: Expose ARB_shader_image_load_store. I'm going to wait until the rest of it is reviewed to add mine to that little patch. :-) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] clover: fix mapping of 2 and 3 dimenisional regions of images
Mapping tiled textures requires 2 or 3 dimensional region information. --- src/gallium/state_trackers/clover/api/transfer.cpp | 33 +++--- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp b/src/gallium/state_trackers/clover/api/transfer.cpp index fdb9405..f21819f 100644 --- a/src/gallium/state_trackers/clover/api/transfer.cpp +++ b/src/gallium/state_trackers/clover/api/transfer.cpp @@ -204,9 +204,22 @@ namespace { struct _map { static mapping get(command_queue q, T obj, cl_map_flags flags, - size_t offset, size_t size) { + const vector_t origin, const vector_t region, + const vector_t pitch) { + size_t offset = dot(pitch, origin); + size_t region_size = size(pitch, region); return { q, obj-resource(q), flags, true, - {{ offset }}, {{ size, 1, 1 }} }; + {{ offset }}, {{ region_size, 1, 1 }}}; + } + }; + + template + struct _mapimage * { + static mapping + get(command_queue q, image *obj, cl_map_flags flags, + const vector_t origin, const vector_t region, + const vector_t pitch) { + return { q, obj-resource(q), flags, true, origin, region }; } }; @@ -214,8 +227,9 @@ namespace { struct _mapvoid * { static void * get(command_queue q, void *obj, cl_map_flags flags, - size_t offset, size_t size) { - return (char *)obj + offset; + const vector_t origin, const vector_t region, + const vector_t pitch) { + return (char *)obj + dot(pitch, origin); } }; @@ -223,8 +237,9 @@ namespace { struct _mapconst void * { static const void * get(command_queue q, const void *obj, cl_map_flags flags, - size_t offset, size_t size) { - return (const char *)obj + offset; + const vector_t origin, const vector_t region, + const vector_t pitch) { + return (char *)obj + dot(pitch, origin); } }; @@ -240,11 +255,9 @@ namespace { const vector_t region) { return [=, q](event ) { auto dst = _mapT::get(q, dst_obj, CL_MAP_WRITE, - dot(dst_pitch, dst_orig), - size(dst_pitch, region)); + dst_orig, region, dst_pitch); auto src = _mapS::get(q, src_obj, CL_MAP_READ, - dot(src_pitch, src_orig), - size(src_pitch, region)); + src_orig, region, src_pitch); vector_t v = {}; for (v[2] = 0; v[2] region[2]; ++v[2]) { -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/2] clover: allow driver to override transfer pitch
The driver may set the pitch of a 2d or 3d mapping. --- src/gallium/state_trackers/clover/api/transfer.cpp | 13 - src/gallium/state_trackers/clover/core/resource.hpp | 8 2 files changed, 16 insertions(+), 5 deletions(-) diff --git a/src/gallium/state_trackers/clover/api/transfer.cpp b/src/gallium/state_trackers/clover/api/transfer.cpp index f21819f..8bd4aa0 100644 --- a/src/gallium/state_trackers/clover/api/transfer.cpp +++ b/src/gallium/state_trackers/clover/api/transfer.cpp @@ -218,8 +218,11 @@ namespace { static mapping get(command_queue q, image *obj, cl_map_flags flags, const vector_t origin, const vector_t region, - const vector_t pitch) { - return { q, obj-resource(q), flags, true, origin, region }; + vector_t pitch) { + mapping res{ q, obj-resource(q), flags, true, origin, region }; + pitch[1] = res.get_row_pitch(); + pitch[2] = res.get_slice_pitch(); + return res; } }; @@ -250,10 +253,10 @@ namespace { templatetypename T, typename S std::functionvoid (event ) soft_copy_op(command_queue q, -T dst_obj, const vector_t dst_orig, const vector_t dst_pitch, -S src_obj, const vector_t src_orig, const vector_t src_pitch, +T dst_obj, const vector_t dst_orig, vector_t dst_pitch, +S src_obj, const vector_t src_orig, vector_t src_pitch, const vector_t region) { - return [=, q](event ) { + return [=, q](event ) mutable { auto dst = _mapT::get(q, dst_obj, CL_MAP_WRITE, dst_orig, region, dst_pitch); auto src = _mapS::get(q, src_obj, CL_MAP_READ, diff --git a/src/gallium/state_trackers/clover/core/resource.hpp b/src/gallium/state_trackers/clover/core/resource.hpp index 9993dcb..90f74fd 100644 --- a/src/gallium/state_trackers/clover/core/resource.hpp +++ b/src/gallium/state_trackers/clover/core/resource.hpp @@ -122,6 +122,14 @@ namespace clover { return (T *)p; } + unsigned get_row_pitch() const { + return pxfer-stride; + } + + unsigned get_slice_pitch() const { + return pxfer-layer_stride; + } + private: pipe_context *pctx; pipe_transfer *pxfer; -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/skl: Emit new 3DSTATE_VF_COMPONENT_PACKING
Update: I no longer thing we should merge this patch. A note in the programming has been introduced stating that we should not program this state packet without enabling at least one component. On Wed, Jul 01, 2015 at 04:06:54PM -0700, Ben Widawsky wrote: We don't yet have a use for this state, but initializing it to known values is always considered wise. In general NULL state can probably go in the misc state upload, I only put it here because I assume it might be useful at some point. Signed-off-by: Ben Widawsky b...@bwidawsk.net --- I've had this patch sitting around for almost 3 months now. I believe we like to initialize new fields as a general rule of thumb. --- src/mesa/drivers/dri/i965/brw_defines.h | 2 ++ src/mesa/drivers/dri/i965/gen8_draw_upload.c | 29 2 files changed, 31 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_defines.h b/src/mesa/drivers/dri/i965/brw_defines.h index 66b9abc..444d974 100644 --- a/src/mesa/drivers/dri/i965/brw_defines.h +++ b/src/mesa/drivers/dri/i965/brw_defines.h @@ -1713,6 +1713,8 @@ enum brw_message_target { #define _3DSTATE_VF_TOPOLOGY0x784b /* GEN8+ */ +#define _3DSTATE_VF_COMPONENT_PACKING 0x7855 /* GEN9+ */ + #define _3DSTATE_WM_CHROMAKEY0x784c /* GEN8+ */ #define _3DSTATE_URB_VS 0x7830 /* GEN7+ */ diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/drivers/dri/i965/gen8_draw_upload.c index 1af90ec..f6e7fc8 100644 --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c @@ -35,6 +35,33 @@ #include intel_batchbuffer.h #include intel_buffer_objects.h +/** + * Emits a null component packing state. + * + * Paraphrasing the docs: This command is used to specify which 32-bit + * components are enabled to be stored in the URB, and which are disabled. + * Disabling all four components for a given Vertex Element will result in no + * data stored for that element. Note that any insertion of SGVs + * (3DSTATE_VF_SGVS) is performed before the packing operation. + * + * FINISHME: Component packing is probably only useful for SIMD8 VS thread + * execution. When enabled, the correct bit must be set in 3DSTATE_VF. + */ +static void +gen9_emit_component_packing(struct brw_context *brw) +{ + if (brw-gen 9) + return; + + BEGIN_BATCH(5); + OUT_BATCH(_3DSTATE_VF_COMPONENT_PACKING 16 | (5 - 2)); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + OUT_BATCH(0); + ADVANCE_BATCH(); +} + static void gen8_emit_vertices(struct brw_context *brw) { @@ -44,6 +71,8 @@ gen8_emit_vertices(struct brw_context *brw) brw_prepare_vertices(brw); brw_prepare_shader_draw_parameters(brw); + gen9_emit_component_packing(brw); + if (brw-vs.prog_data-uses_vertexid || brw-vs.prog_data-uses_instanceid) { unsigned vue = brw-vb.nr_enabled; -- 2.4.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev -- Ben Widawsky, Intel Open Source Technology Center ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCHv2 18/20] i965/fs: Translate image load, store and atomic NIR intrinsics.
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com On Thu, Jul 23, 2015 at 10:31 AM, Francisco Jerez curroje...@riseup.net wrote: v2: Move array coordinate workaround into the surface builder. --- src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 106 +++ 1 file changed, 106 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp index 805f782..318b600 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp @@ -24,6 +24,7 @@ #include glsl/ir.h #include glsl/ir_optimization.h #include glsl/nir/glsl_to_nir.h +#include main/shaderimage.h #include program/prog_to_nir.h #include brw_fs.h #include brw_fs_surface_builder.h @@ -1246,6 +1247,55 @@ fs_visitor::emit_percomp(const fs_builder bld, const fs_inst inst, } } +/** + * Get the matching channel register datatype for an image intrinsic of the + * specified GLSL image type. + */ +static brw_reg_type +get_image_base_type(const glsl_type *type) +{ + switch ((glsl_base_type)type-sampler_type) { + case GLSL_TYPE_UINT: + return BRW_REGISTER_TYPE_UD; + case GLSL_TYPE_INT: + return BRW_REGISTER_TYPE_D; + case GLSL_TYPE_FLOAT: + return BRW_REGISTER_TYPE_F; + default: + unreachable(Not reached.); + } +} + +/** + * Get the appropriate atomic op for an image atomic intrinsic. + */ +static unsigned +get_image_atomic_op(nir_intrinsic_op op, const glsl_type *type) +{ + switch (op) { + case nir_intrinsic_image_atomic_add: + return BRW_AOP_ADD; + case nir_intrinsic_image_atomic_min: + return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ? + BRW_AOP_IMIN : BRW_AOP_UMIN); + case nir_intrinsic_image_atomic_max: + return (get_image_base_type(type) == BRW_REGISTER_TYPE_D ? + BRW_AOP_IMAX : BRW_AOP_UMAX); + case nir_intrinsic_image_atomic_and: + return BRW_AOP_AND; + case nir_intrinsic_image_atomic_or: + return BRW_AOP_OR; + case nir_intrinsic_image_atomic_xor: + return BRW_AOP_XOR; + case nir_intrinsic_image_atomic_exchange: + return BRW_AOP_MOV; + case nir_intrinsic_image_atomic_comp_swap: + return BRW_AOP_CMPWR; + default: + unreachable(Not reachable.); + } +} + void fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr *instr) { @@ -1320,6 +1370,62 @@ fs_visitor::nir_emit_intrinsic(const fs_builder bld, nir_intrinsic_instr *instr break; } + case nir_intrinsic_image_load: + case nir_intrinsic_image_store: + case nir_intrinsic_image_atomic_add: + case nir_intrinsic_image_atomic_min: + case nir_intrinsic_image_atomic_max: + case nir_intrinsic_image_atomic_and: + case nir_intrinsic_image_atomic_or: + case nir_intrinsic_image_atomic_xor: + case nir_intrinsic_image_atomic_exchange: + case nir_intrinsic_image_atomic_comp_swap: { + using namespace image_access; + + /* Get the referenced image variable and type. */ + const nir_variable *var = instr-variables[0]-var; + const glsl_type *type = var-type-without_array(); + const brw_reg_type base_type = get_image_base_type(type); + + /* Get some metadata from the image intrinsic. */ + const nir_intrinsic_info *info = nir_intrinsic_infos[instr-intrinsic]; + const unsigned arr_dims = type-sampler_array ? 1 : 0; + const unsigned surf_dims = type-coordinate_components() - arr_dims; + const mesa_format format = + (var-data.image.write_only ? MESA_FORMAT_NONE : + _mesa_get_shader_image_format(var-data.image.format)); + + /* Get the arguments of the image intrinsic. */ + const fs_reg image = get_nir_image_deref(instr-variables[0]); + const fs_reg addr = retype(get_nir_src(instr-src[0]), + BRW_REGISTER_TYPE_UD); + const fs_reg src0 = (info-num_srcs = 3 ? + retype(get_nir_src(instr-src[2]), base_type) : + fs_reg()); + const fs_reg src1 = (info-num_srcs = 4 ? + retype(get_nir_src(instr-src[3]), base_type) : + fs_reg()); + fs_reg tmp; + + /* Emit an image load, store or atomic op. */ + if (instr-intrinsic == nir_intrinsic_image_load) + tmp = emit_image_load(bld, image, addr, surf_dims, arr_dims, format); + + else if (instr-intrinsic == nir_intrinsic_image_store) + emit_image_store(bld, image, addr, src0, surf_dims, arr_dims, format); + + else + tmp = emit_image_atomic(bld, image, addr, src0, src1, + surf_dims, arr_dims, info-dest_components, + get_image_atomic_op(instr-intrinsic, type)); + + /* Assign the result. */ + for (unsigned c = 0; c
Re: [Mesa-dev] [PATCHv4 09/20] i965/fs: Import image format conversion primitives.
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com On Thu, Jul 23, 2015 at 10:30 AM, Francisco Jerez curroje...@riseup.net wrote: Define bitfield packing, unpacking and type conversion operations in terms of which the image format conversion code will be implemented. These don't directly know about image formats: The packing and unpacking functions take a 4-tuple of bit shifts and a 4-tuple of bit widths as arguments, determining the bitfield position of each component. Most of the remaining functions perform integer, fixed point normalized, and floating point type conversions, mapping between a target type with per-component bit widths given by a parameter and a matching native representation of the same type. v2: Drop VEC4 suport. v3: Rebase. v4: Fix clamping of negative floats in the unsigned case of emit_convert_to_scaled(). --- .../drivers/dri/i965/brw_fs_surface_builder.cpp| 265 + 1 file changed, 265 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp index e6a3115..367a991 100644 --- a/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp +++ b/src/mesa/drivers/dri/i965/brw_fs_surface_builder.cpp @@ -375,4 +375,269 @@ namespace { return dst; } } + + namespace image_format_conversion { + using image_format_info::color_u; + + namespace { + /** + * Maximum representable value in an unsigned integer with the given + * number of bits. + */ + inline unsigned + scale(unsigned n) + { +return (1 n) - 1; + } + } + + /** + * Pack the vector \p src in a bitfield given the per-component bit + * shifts and widths. Note that bitfield components are not allowed to + * cross 32-bit boundaries. + */ + fs_reg + emit_pack(const fs_builder bld, const fs_reg src, +const color_u shifts, const color_u widths) + { + const fs_reg dst = bld.vgrf(BRW_REGISTER_TYPE_UD, 4); + bool seen[4] = {}; + + for (unsigned c = 0; c 4; ++c) { +if (widths[c]) { + const fs_reg tmp = bld.vgrf(BRW_REGISTER_TYPE_UD); + + /* Shift each component left to the correct bitfield position. */ + bld.SHL(tmp, offset(src, bld, c), fs_reg(shifts[c] % 32)); + + /* Add everything up. */ + if (seen[shifts[c] / 32]) { + bld.OR(offset(dst, bld, shifts[c] / 32), + offset(dst, bld, shifts[c] / 32), tmp); + } else { + bld.MOV(offset(dst, bld, shifts[c] / 32), tmp); + seen[shifts[c] / 32] = true; + } +} + } + + return dst; + } + + /** + * Unpack a vector from the bitfield \p src given the per-component bit + * shifts and widths. Note that bitfield components are not allowed to + * cross 32-bit boundaries. + */ + fs_reg + emit_unpack(const fs_builder bld, const fs_reg src, + const color_u shifts, const color_u widths) + { + const fs_reg dst = bld.vgrf(src.type, 4); + + for (unsigned c = 0; c 4; ++c) { +if (widths[c]) { + /* Shift left to discard the most significant bits. */ + bld.SHL(offset(dst, bld, c), + offset(src, bld, shifts[c] / 32), + fs_reg(32 - shifts[c] % 32 - widths[c])); + + /* Shift back to the least significant bits using an arithmetic +* shift to get sign extension on signed types. +*/ + bld.ASR(offset(dst, bld, c), + offset(dst, bld, c), fs_reg(32 - widths[c])); +} + } + + return dst; + } + + /** + * Convert an integer vector into another integer vector of the + * specified bit widths, properly handling overflow. + */ + fs_reg + emit_convert_to_integer(const fs_builder bld, const fs_reg src, + const color_u widths, bool is_signed) + { + const unsigned s = (is_signed ? 1 : 0); + const fs_reg dst = bld.vgrf( +is_signed ? BRW_REGISTER_TYPE_D : BRW_REGISTER_TYPE_UD, 4); + assert(src.type == dst.type); + + for (unsigned c = 0; c 4; ++c) { +if (widths[c]) { + /* Clamp to the maximum value. */ + bld.emit_minmax(offset(dst, bld, c), offset(src, bld, c), + fs_reg((int)scale(widths[c] - s)), + BRW_CONDITIONAL_L); + + /* Clamp to the minimum value. */ + if (is_signed) +
Re: [Mesa-dev] [PATCH] i965/skl: Add production thread counts and URB size
On 2015-07-29 12:35:24, Ben Widawsky wrote: This patch adjusts the SKL values to the best known values we have. It also adds the missing HS/DS/CS fields. To support this patch I needed to add some default values to BXT. Those values are just minimal values which we can use for enabling. Note to backporter: You can drop the BXT change (second hunk) entirely. Instead can you split this into 2 patches? One for skl, meant for master and stable, and one with the placeholder numbers for bxt meant just for master. If split: Reviewed-by: Jordan Justen jordan.l.jus...@intel.com Cc: mesa-sta...@lists.freedesktop.org Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/brw_device_info.c | 20 +++- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index 6fe6ea2..066dac6 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -308,13 +308,18 @@ static const struct brw_device_info brw_device_info_chv = { .has_llc = true, \ .has_pln = true, \ .supports_simd16_3src = true,\ - .max_vs_threads = 280, \ - .max_gs_threads = 256, \ - .max_wm_threads = 408, \ + .max_vs_threads = 336, \ + .max_hs_threads = 336, \ + .max_ds_threads = 336, \ + .max_gs_threads = 336, \ + .max_wm_threads = 64 * 6,\ + .max_cs_threads = 56,\ .urb = { \ - .size = 128, \ + .size = 192, \ .min_vs_entries = 64, \ - .max_vs_entries = 1664, \ + .max_vs_entries = 1856, \ + .max_hs_entries = 672,\ + .max_ds_entries = 1120, \ .max_gs_entries = 640,\ } @@ -341,12 +346,17 @@ static const struct brw_device_info brw_device_info_bxt = { .gt = 1, .has_llc = false, .max_vs_threads = 112, + .max_hs_threads = 112, + .max_ds_threads = 112, .max_gs_threads = 112, .max_wm_threads = 32, + .max_cs_threads = 28, .urb = { .size = 64, .min_vs_entries = 34, .max_vs_entries = 640, + .max_hs_entries = 80, + .max_ds_entries = 80, .max_gs_entries = 256, } }; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 2/3] i965/bxt: Use more conservative thread counts
Since we really do not know what may occur in the future, pick a more conservative value for thread counts until we know better what values are correct. As far as I can tell, the old values will work fine, but some of the registers seem to indicate that going even lower is possible and the purpose of having early support is to enable as many configurations that can possibly exist (we can trim things down after platforms begin shipping later). Cc: Jordan Justen jordan.l.jus...@intel.com Signed-off-by: Ben Widawsky b...@bwidawsk.net --- src/mesa/drivers/dri/i965/brw_device_info.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index f313b2b..75d1ecb 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -340,8 +340,10 @@ static const struct brw_device_info brw_device_info_bxt = { .is_broxton = 1, .gt = 1, .has_llc = false, - .max_vs_threads = 112, - .max_gs_threads = 112, + + /* XXX: These are preliminary thread counts and URB sizes. */ + .max_vs_threads = 56, + .max_gs_threads = 56, .max_wm_threads = 32, .urb = { .size = 64, -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600, compute: force tiling on 2D and 3D texture compute resources
On 31.07.2015 07:06, Zoltan Gilian wrote: To circumvent a problem occuring when LINEAR_ALIGNED array mode is selected on a TEXTURE_2D RAT. This configuration causes MEM_RAT STORE_TYPED to write to incorrect locations. [...] @@ -715,10 +716,15 @@ static unsigned r600_choose_tiling(struct r600_common_screen *rscreen, if (templ-flags R600_RESOURCE_FLAG_TRANSFER) return RADEON_SURF_MODE_LINEAR_ALIGNED; + /* Force tiling on TEXTURE_2D and TEXTURE_3D compute resources. */ + if ((templ-target == PIPE_TEXTURE_2D || + templ-target == PIPE_TEXTURE_3D) + (templ-bind PIPE_BIND_COMPUTE_RESOURCE)) + force_tiling = true; This should test (templ-bind PIPE_BIND_COMPUTE_RESOURCE) first. Also, does the same restriction apply to SI and newer GPUs? -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/3] [v2] i965/skl: Add production thread counts and URB size
This patch adjusts the SKL values to the best known values we have. v2: Remove HS/DS/CS fields. Adding this makes most sense to add to the GEN9_FEATURES macro, however, doing that would require updating BXT values, and Jordan requested I not do that. Conveniently, this request makes a lot of sense wrt to stable backport as HS, and DS do not even exist there. Cc: mesa-sta...@lists.freedesktop.org Signed-off-by: Ben Widawsky b...@bwidawsk.net Reviewed-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_device_info.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index 2c5d778..f313b2b 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -308,13 +308,13 @@ static const struct brw_device_info brw_device_info_chv = { .has_llc = true, \ .has_pln = true, \ .supports_simd16_3src = true,\ - .max_vs_threads = 280, \ - .max_gs_threads = 256, \ - .max_wm_threads = 408, \ + .max_vs_threads = 336, \ + .max_gs_threads = 336, \ + .max_wm_threads = 64 * 6,\ .urb = { \ - .size = 128, \ + .size = 192, \ .min_vs_entries = 64, \ - .max_vs_entries = 1664, \ + .max_vs_entries = 1856, \ .max_gs_entries = 640,\ } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 3/3] i965/gen9: Add hs, ds, and cs thread + urb info
For SKL: These are the production values. For BXT: These are low estimates to enable platforms. This patch was originally part of i965/skl: Add production thread counts and URB size but was split out at Jordan's request (which I found to be reasonable). Note on stable inclusion: 10.6 does not care about hs, and ds. It does care about cs, but since Jordan was the one that asked me to extract it, I'll leave it up to him to deal with a backport to stable is required. Signed-off-by: Ben Widawsky b...@bwidawsk.net Reviewed-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_device_info.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index 75d1ecb..fcc243e 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -310,11 +310,16 @@ static const struct brw_device_info brw_device_info_chv = { .supports_simd16_3src = true,\ .max_vs_threads = 336, \ .max_gs_threads = 336, \ + .max_hs_threads = 336, \ + .max_ds_threads = 336, \ .max_wm_threads = 64 * 6,\ + .max_cs_threads = 56,\ .urb = { \ .size = 192, \ .min_vs_entries = 64, \ .max_vs_entries = 1856, \ + .max_hs_entries = 672,\ + .max_ds_entries = 1120, \ .max_gs_entries = 640,\ } @@ -343,12 +348,17 @@ static const struct brw_device_info brw_device_info_bxt = { /* XXX: These are preliminary thread counts and URB sizes. */ .max_vs_threads = 56, + .max_hs_threads = 56, + .max_ds_threads = 56, .max_gs_threads = 56, .max_wm_threads = 32, + .max_cs_threads = 28, .urb = { .size = 64, .min_vs_entries = 34, .max_vs_entries = 640, + .max_hs_entries = 80, + .max_ds_entries = 80, .max_gs_entries = 256, } }; -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] r600: Remove assert that not general in some stream instr's
seems like one I wrote already.. http://lists.freedesktop.org/archives/mesa-dev/2015-July/088404.html Dave. On 31 July 2015 at 12:42, Edward O'Callaghan eocallag...@alterapraxis.com wrote: Don't trigger assert on some stream emit instructions. Signed-off-by: Edward O'Callaghan eocallag...@alterapraxis.com --- src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp index 8c2cd14..48d56ac 100644 --- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp +++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp @@ -761,8 +761,6 @@ void bc_finalizer::finalize_cf(cf_node* c) { mask |= (1 chan); } - assert(reg = 0 mask); - if (reg = 0) update_ngpr(reg); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: copy *8_SNORM bits exactly in resource_copy_region
On 31.07.2015 01:29, Marek Olšák wrote: From: Marek Olšák marek.ol...@amd.com Disabling the FP16 mode didn't help. If needed, we can use this trick for blits too, but not for scaled blits. + 4 piglits --- src/gallium/drivers/radeonsi/si_blit.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c index c892623..61ca2a8 100644 --- a/src/gallium/drivers/radeonsi/si_blit.c +++ b/src/gallium/drivers/radeonsi/si_blit.c @@ -522,7 +522,9 @@ void si_resource_copy_region(struct pipe_context *ctx, src_box = sbox; src_force_level = src_level; - } else if (!util_blitter_is_copy_supported(sctx-blitter, dst, src)) { + } else if (!util_blitter_is_copy_supported(sctx-blitter, dst, src) || +/* also *8_SNORM has precision issues, use UNORM instead */ +util_format_is_snorm(src-format)) { if (util_format_is_subsampled_422(src-format)) { src_templ.format = PIPE_FORMAT_R8G8B8A8_UINT; dst_templ.format = PIPE_FORMAT_R8G8B8A8_UINT; Reviewed-by: Michel Dänzer michel.daen...@amd.com -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] glsl: when generating out/inout parameter fixups, do indexing before the call
If the indexing expression involves anything modified during the call, the fixup code to copy back after the call would use the new values. This fixes the cases where the first expression node encountered is ir_binop_vector_extract, fixing the piglits: * shaders@out-parameter-indexing@vs-inout-index-inout-mat2-row * shaders@out-parameter-indexing@vs-inout-index-inout-vec4 * shaders@out-parameter-indexing@vs-inout-index-inout-vec4-array-element Further changes are needed for other expression types. Signed-off-by: Chris Forbes chr...@ijw.co.nz Cc: Ben Widawsky b...@bwidawsk.net --- src/glsl/ast_function.cpp | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp index 803edf5..e7147dd 100644 --- a/src/glsl/ast_function.cpp +++ b/src/glsl/ast_function.cpp @@ -299,6 +299,21 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type, before_instructions-push_tail(tmp); + if (expr != NULL expr-operation == ir_binop_vector_extract) { + /* We're going to have to emit a matching insert after the call. + * evaluate the indexing expression before the call, because it + * may reference things that change during the call. + */ + ir_variable *index_tmp = new(mem_ctx) ir_variable(expr-operands[1]-type, +inout_index_tmp, ir_var_temporary); + before_instructions-push_tail(index_tmp); + before_instructions-push_tail( +new(mem_ctx) ir_assignment( + new(mem_ctx) ir_dereference_variable(index_tmp), + expr-operands[1]-clone(mem_ctx, NULL))); + expr-operands[1] = new(mem_ctx) ir_dereference_variable(index_tmp); + } + /* If the parameter is an inout parameter, copy the value of the actual * parameter to the new temporary. Note that no type conversion is allowed * here because inout parameters must match types exactly. -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] glsl: when generating out/inout parameter fixups, do indexing before the call
On Fri, Jul 31, 2015 at 04:37:16PM +1200, Chris Forbes wrote: If the indexing expression involves anything modified during the call, the fixup code to copy back after the call would use the new values. This fixes the cases where the first expression node encountered is ir_binop_vector_extract, fixing the piglits: * shaders@out-parameter-indexing@vs-inout-index-inout-mat2-row * shaders@out-parameter-indexing@vs-inout-index-inout-vec4 * shaders@out-parameter-indexing@vs-inout-index-inout-vec4-array-element Further changes are needed for other expression types. Signed-off-by: Chris Forbes chr...@ijw.co.nz Cc: Ben Widawsky b...@bwidawsk.net Tested-by: Ben Widawsky b...@bwidawsk.net I'm not familiar with this code. I can review it if needed, but I'd prefer not to if there is someone better equipped. --- src/glsl/ast_function.cpp | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp index 803edf5..e7147dd 100644 --- a/src/glsl/ast_function.cpp +++ b/src/glsl/ast_function.cpp @@ -299,6 +299,21 @@ fix_parameter(void *mem_ctx, ir_rvalue *actual, const glsl_type *formal_type, before_instructions-push_tail(tmp); + if (expr != NULL expr-operation == ir_binop_vector_extract) { + /* We're going to have to emit a matching insert after the call. + * evaluate the indexing expression before the call, because it + * may reference things that change during the call. + */ + ir_variable *index_tmp = new(mem_ctx) ir_variable(expr-operands[1]-type, +inout_index_tmp, ir_var_temporary); + before_instructions-push_tail(index_tmp); + before_instructions-push_tail( +new(mem_ctx) ir_assignment( + new(mem_ctx) ir_dereference_variable(index_tmp), + expr-operands[1]-clone(mem_ctx, NULL))); + expr-operands[1] = new(mem_ctx) ir_dereference_variable(index_tmp); + } + /* If the parameter is an inout parameter, copy the value of the actual * parameter to the new temporary. Note that no type conversion is allowed * here because inout parameters must match types exactly. -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/3] [v2] i965/skl: Add production thread counts and URB size
Series Reviewed-by: Jordan Justen jordan.l.jus...@intel.com On 2015-07-30 20:16:20, Ben Widawsky wrote: This patch adjusts the SKL values to the best known values we have. v2: Remove HS/DS/CS fields. Adding this makes most sense to add to the GEN9_FEATURES macro, however, doing that would require updating BXT values, and Jordan requested I not do that. Conveniently, this request makes a lot of sense wrt to stable backport as HS, and DS do not even exist there. Cc: mesa-sta...@lists.freedesktop.org Signed-off-by: Ben Widawsky b...@bwidawsk.net Reviewed-by: Jordan Justen jordan.l.jus...@intel.com --- src/mesa/drivers/dri/i965/brw_device_info.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_device_info.c b/src/mesa/drivers/dri/i965/brw_device_info.c index 2c5d778..f313b2b 100644 --- a/src/mesa/drivers/dri/i965/brw_device_info.c +++ b/src/mesa/drivers/dri/i965/brw_device_info.c @@ -308,13 +308,13 @@ static const struct brw_device_info brw_device_info_chv = { .has_llc = true, \ .has_pln = true, \ .supports_simd16_3src = true,\ - .max_vs_threads = 280, \ - .max_gs_threads = 256, \ - .max_wm_threads = 408, \ + .max_vs_threads = 336, \ + .max_gs_threads = 336, \ + .max_wm_threads = 64 * 6,\ .urb = { \ - .size = 128, \ + .size = 192, \ .min_vs_entries = 64, \ - .max_vs_entries = 1664, \ + .max_vs_entries = 1856, \ .max_gs_entries = 640,\ } -- 2.5.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 05/23] glsl: Fail to link if inter-stage input/outputs are not assigned to stream 0
On Wed, 2015-07-29 at 21:58 +0200, Marek Olšák wrote: Hi, Where does the spec say we should fail to link? I don't see such a statement there. I have reviewed ARB_gpu_shader5 and I don't see any specific mentions to what should be done in this particular case. That said, isn't this the logical thing to do? It is a programming error to link an FS input to a GS output bound to a non-zero stream and at best the program would have undefined behavior if the FS input is used. Hiding this from the developer silently does not seem to be a good idea in any case, whatever the developer was trying to accomplish he is doing it wrong. It looks like varyings with stream 0 should not be linked with the fragment shader. How is this better? FWIW, the proprietary nVidia driver also fails to link in this case with this error: output 'var_name' is associated with an input with a non-zero stream, which is not allowed Iago Marek On Wed, Jun 18, 2014 at 11:51 AM, Iago Toral Quiroga ito...@igalia.com wrote: Outputs that are linked to inputs in the next stage must be output to stream 0, otherwise we should fail to link. --- src/glsl/link_varyings.cpp | 8 1 file changed, 8 insertions(+) diff --git a/src/glsl/link_varyings.cpp b/src/glsl/link_varyings.cpp index 9725a43..3b20594 100644 --- a/src/glsl/link_varyings.cpp +++ b/src/glsl/link_varyings.cpp @@ -1345,6 +1345,14 @@ assign_varying_locations(struct gl_context *ctx, if (input_var || (prog-SeparateShader consumer == NULL)) { matches.record(output_var, input_var); } + + /* Only stream 0 outputs can be consumed in the next stage */ + if (input_var output_var-data.stream != 0) { +linker_error(prog, output %s is assigned to stream=%d but + is linked to an input, which requires stream=0, + output_var-name, output_var-data.stream); +return false; + } } } else { /* If there's no producer stage, then this must be a separable program. -- 1.9.1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa/formats: add some formats from GL3.3
The spec only says we should support packing to and unpacking from 332, 565, , 5551, but it doesn't specify any INTERNAL formats for those, so no new formats need to be added to Mesa. I guess the packing formats are only supported for convenience. Marek On Thu, Jul 30, 2015 at 3:48 AM, Dave Airlie airl...@gmail.com wrote: From: Dave Airlie airl...@redhat.com GL3.3 added GL_ARB_texture_rgb10_a2ui, which specifies a lot more things than just rgb10/a2ui. While playing with ogl conform one of the tests must attempted all valid formats for GL3.3 and hits the unreachable here. This adds the first chunk of formats that hit the assert. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/formats.c | 63 +++ src/mesa/main/formats.csv | 12 + src/mesa/main/formats.h | 12 + src/mesa/main/glformats.c | 24 ++ 4 files changed, 111 insertions(+) diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c index baeb1bf..872f18b 100644 --- a/src/mesa/main/formats.c +++ b/src/mesa/main/formats.c @@ -1008,6 +1008,8 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_B5G6R5_UNORM: case MESA_FORMAT_R5G6B5_UNORM: + case MESA_FORMAT_B5G6R5_UINT: + case MESA_FORMAT_R5G6B5_UINT: *datatype = GL_UNSIGNED_SHORT_5_6_5; *comps = 3; return; @@ -1015,6 +1017,8 @@ _mesa_format_to_type_and_comps(mesa_format format, case MESA_FORMAT_B4G4R4A4_UNORM: case MESA_FORMAT_A4R4G4B4_UNORM: case MESA_FORMAT_B4G4R4X4_UNORM: + case MESA_FORMAT_B4G4R4A4_UINT: + case MESA_FORMAT_A4R4G4B4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; @@ -1022,6 +1026,8 @@ _mesa_format_to_type_and_comps(mesa_format format, case MESA_FORMAT_B5G5R5A1_UNORM: case MESA_FORMAT_A1R5G5B5_UNORM: case MESA_FORMAT_B5G5R5X1_UNORM: + case MESA_FORMAT_B5G5R5A1_UINT: + case MESA_FORMAT_A1R5G5B5_UINT: *datatype = GL_UNSIGNED_SHORT_1_5_5_5_REV; *comps = 4; return; @@ -1032,6 +1038,7 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_A1B5G5R5_UNORM: + case MESA_FORMAT_A1B5G5R5_UINT: *datatype = GL_UNSIGNED_SHORT_5_5_5_1; *comps = 4; return; @@ -1066,19 +1073,23 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_R3G3B2_UNORM: + case MESA_FORMAT_R3G3B2_UINT: *datatype = GL_UNSIGNED_BYTE_2_3_3_REV; *comps = 3; return; case MESA_FORMAT_A4B4G4R4_UNORM: + case MESA_FORMAT_A4B4G4R4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; case MESA_FORMAT_R4G4B4A4_UNORM: + case MESA_FORMAT_R4G4B4A4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; case MESA_FORMAT_R5G5B5A1_UNORM: + case MESA_FORMAT_R5G5B5A1_UINT: *datatype = GL_UNSIGNED_SHORT_1_5_5_5_REV; *comps = 4; return; @@ -1094,6 +1105,7 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_B2G3R3_UNORM: + case MESA_FORMAT_B2G3R3_UINT: *datatype = GL_UNSIGNED_BYTE_3_3_2; *comps = 3; return; @@ -2123,6 +2135,57 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format, type == GL_UNSIGNED_INT_2_10_10_10_REV !swapBytes); + case MESA_FORMAT_B5G6R5_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_SHORT_5_6_5; + + case MESA_FORMAT_R5G6B5_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_SHORT_5_6_5_REV; + + case MESA_FORMAT_B2G3R3_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_BYTE_3_3_2; + + case MESA_FORMAT_R3G3B2_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_BYTE_2_3_3_REV; + + case MESA_FORMAT_A4B4G4R4_UINT: + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4 !swapBytes) + return GL_TRUE; + + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV swapBytes) + return GL_TRUE; + return GL_FALSE; + + case MESA_FORMAT_R4G4B4A4_UINT: + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV !swapBytes) + return GL_TRUE; + + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4 swapBytes) + return GL_TRUE; + + return GL_FALSE; + + case MESA_FORMAT_B4G4R4A4_UINT: + return format == GL_BGRA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV + !swapBytes; + + case MESA_FORMAT_A4R4G4B4_UINT: + return GL_FALSE; + + case MESA_FORMAT_A1B5G5R5_UINT: + return format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_5_5_5_1 + !swapBytes; + + case MESA_FORMAT_B5G5R5A1_UINT: + return format ==
Re: [Mesa-dev] [PATCH] mesa/st/xformfb: drop assert(0) that can get hit from API
If glDrawTransformFeedback is called with no recorded vertices, the whole draw call should be dropped, because vertex count == 0, right? Marek On Thu, Jul 30, 2015 at 2:52 AM, Dave Airlie airl...@gmail.com wrote: From: Dave Airlie airl...@redhat.com The API can hit this, as the ogl confirm suite showed, I've sent a piglit test to also demonstrate hitting it. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_cb_xformfb.c | 1 - 1 file changed, 1 deletion(-) diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c index 07c118e..925fbd4 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.c +++ b/src/mesa/state_tracker/st_cb_xformfb.c @@ -191,7 +191,6 @@ st_transform_feedback_get_draw_target(struct gl_transform_feedback_object *obj) } } - assert(0); return NULL; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa/formats: add some formats from GL3.3
That said, it looks like the packing code has changed a lot since I last worked with it, so I don't know. Marek On Thu, Jul 30, 2015 at 10:06 AM, Marek Olšák mar...@gmail.com wrote: The spec only says we should support packing to and unpacking from 332, 565, , 5551, but it doesn't specify any INTERNAL formats for those, so no new formats need to be added to Mesa. I guess the packing formats are only supported for convenience. Marek On Thu, Jul 30, 2015 at 3:48 AM, Dave Airlie airl...@gmail.com wrote: From: Dave Airlie airl...@redhat.com GL3.3 added GL_ARB_texture_rgb10_a2ui, which specifies a lot more things than just rgb10/a2ui. While playing with ogl conform one of the tests must attempted all valid formats for GL3.3 and hits the unreachable here. This adds the first chunk of formats that hit the assert. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/formats.c | 63 +++ src/mesa/main/formats.csv | 12 + src/mesa/main/formats.h | 12 + src/mesa/main/glformats.c | 24 ++ 4 files changed, 111 insertions(+) diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c index baeb1bf..872f18b 100644 --- a/src/mesa/main/formats.c +++ b/src/mesa/main/formats.c @@ -1008,6 +1008,8 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_B5G6R5_UNORM: case MESA_FORMAT_R5G6B5_UNORM: + case MESA_FORMAT_B5G6R5_UINT: + case MESA_FORMAT_R5G6B5_UINT: *datatype = GL_UNSIGNED_SHORT_5_6_5; *comps = 3; return; @@ -1015,6 +1017,8 @@ _mesa_format_to_type_and_comps(mesa_format format, case MESA_FORMAT_B4G4R4A4_UNORM: case MESA_FORMAT_A4R4G4B4_UNORM: case MESA_FORMAT_B4G4R4X4_UNORM: + case MESA_FORMAT_B4G4R4A4_UINT: + case MESA_FORMAT_A4R4G4B4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; @@ -1022,6 +1026,8 @@ _mesa_format_to_type_and_comps(mesa_format format, case MESA_FORMAT_B5G5R5A1_UNORM: case MESA_FORMAT_A1R5G5B5_UNORM: case MESA_FORMAT_B5G5R5X1_UNORM: + case MESA_FORMAT_B5G5R5A1_UINT: + case MESA_FORMAT_A1R5G5B5_UINT: *datatype = GL_UNSIGNED_SHORT_1_5_5_5_REV; *comps = 4; return; @@ -1032,6 +1038,7 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_A1B5G5R5_UNORM: + case MESA_FORMAT_A1B5G5R5_UINT: *datatype = GL_UNSIGNED_SHORT_5_5_5_1; *comps = 4; return; @@ -1066,19 +1073,23 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_R3G3B2_UNORM: + case MESA_FORMAT_R3G3B2_UINT: *datatype = GL_UNSIGNED_BYTE_2_3_3_REV; *comps = 3; return; case MESA_FORMAT_A4B4G4R4_UNORM: + case MESA_FORMAT_A4B4G4R4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; case MESA_FORMAT_R4G4B4A4_UNORM: + case MESA_FORMAT_R4G4B4A4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; case MESA_FORMAT_R5G5B5A1_UNORM: + case MESA_FORMAT_R5G5B5A1_UINT: *datatype = GL_UNSIGNED_SHORT_1_5_5_5_REV; *comps = 4; return; @@ -1094,6 +1105,7 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_B2G3R3_UNORM: + case MESA_FORMAT_B2G3R3_UINT: *datatype = GL_UNSIGNED_BYTE_3_3_2; *comps = 3; return; @@ -2123,6 +2135,57 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format, type == GL_UNSIGNED_INT_2_10_10_10_REV !swapBytes); + case MESA_FORMAT_B5G6R5_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_SHORT_5_6_5; + + case MESA_FORMAT_R5G6B5_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_SHORT_5_6_5_REV; + + case MESA_FORMAT_B2G3R3_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_BYTE_3_3_2; + + case MESA_FORMAT_R3G3B2_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_BYTE_2_3_3_REV; + + case MESA_FORMAT_A4B4G4R4_UINT: + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4 !swapBytes) + return GL_TRUE; + + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV swapBytes) + return GL_TRUE; + return GL_FALSE; + + case MESA_FORMAT_R4G4B4A4_UINT: + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV !swapBytes) + return GL_TRUE; + + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4 swapBytes) + return GL_TRUE; + + return GL_FALSE; + + case MESA_FORMAT_B4G4R4A4_UINT: + return format == GL_BGRA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV + !swapBytes; + + case MESA_FORMAT_A4R4G4B4_UINT: + return GL_FALSE; + + case
Re: [Mesa-dev] [PATCH v2 05/23] glsl: Fail to link if inter-stage input/outputs are not assigned to stream 0
On Thu, Jul 30, 2015 at 8:49 AM, Iago Toral ito...@igalia.com wrote: On Wed, 2015-07-29 at 21:58 +0200, Marek Olšák wrote: Hi, Where does the spec say we should fail to link? I don't see such a statement there. I have reviewed ARB_gpu_shader5 and I don't see any specific mentions to what should be done in this particular case. That said, isn't this the logical thing to do? It is a programming error to link an FS input to a GS output bound to a non-zero stream and at best the program would have undefined behavior if the FS input is used. Hiding this from the developer silently does not seem to be a good idea in any case, whatever the developer was trying to accomplish he is doing it wrong. It looks like varyings with stream 0 should not be linked with the fragment shader. How is this better? The spec simply says that varyings with stream 0 are not passed to the rasterizer, which means the matching fragment shader inputs should be uninitialized, but it should still be possible to capture the varyings with transform feedback. That's how I understand the spec. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] radeonsi: add GS multiple streams support (v2)
Reviewed-by: Marek Olšák marek.ol...@amd.com Marek On Thu, Jul 30, 2015 at 2:06 AM, Dave Airlie airl...@gmail.com wrote: From: Dave Airlie airl...@redhat.com This is the final piece for ARB_gpu_shader5, The code is based on the r600 code from Glenn Kennard, and myself. While developing this, I'm not 100% sure of all the calculations made in the GS registers, this is why the max_stream is worked out there and used to limit the changes in registers. Otherwise my initial attempts either regressed GS texelFetch tests or primitive-id-restart. The current code has no regressions in piglit. This commit doesn't enable ARB_gpu_shader5, since that just bumps the glsl level to 4.00, so I'll just do a separate patch for 4.10. v1.1: fix bug introduced in rebase. v2: Address Marek's review comments, remove my llvm stream code for simpler C, move gsvs_ring and gs_next_vertex to arrays. Signed-off-by: Dave Airlie airl...@redhat.com --- src/gallium/drivers/radeonsi/si_descriptors.c | 4 +- src/gallium/drivers/radeonsi/si_pipe.c | 2 +- src/gallium/drivers/radeonsi/si_shader.c| 74 +++- src/gallium/drivers/radeonsi/si_state.c | 4 -- src/gallium/drivers/radeonsi/si_state.h | 7 ++- src/gallium/drivers/radeonsi/si_state_shaders.c | 75 +++-- 6 files changed, 127 insertions(+), 39 deletions(-) diff --git a/src/gallium/drivers/radeonsi/si_descriptors.c b/src/gallium/drivers/radeonsi/si_descriptors.c index 2e2a35b..14bb6e1 100644 --- a/src/gallium/drivers/radeonsi/si_descriptors.c +++ b/src/gallium/drivers/radeonsi/si_descriptors.c @@ -724,7 +724,7 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot, struct pipe_resource *buffer, unsigned stride, unsigned num_records, bool add_tid, bool swizzle, - unsigned element_size, unsigned index_stride) + unsigned element_size, unsigned index_stride, uint64_t offset) { struct si_context *sctx = (struct si_context *)ctx; struct si_buffer_resources *buffers = sctx-rw_buffers[shader]; @@ -741,7 +741,7 @@ void si_set_ring_buffer(struct pipe_context *ctx, uint shader, uint slot, if (buffer) { uint64_t va; - va = r600_resource(buffer)-gpu_address; + va = r600_resource(buffer)-gpu_address + offset; switch (element_size) { default: diff --git a/src/gallium/drivers/radeonsi/si_pipe.c b/src/gallium/drivers/radeonsi/si_pipe.c index 808b9bc..a120282 100644 --- a/src/gallium/drivers/radeonsi/si_pipe.c +++ b/src/gallium/drivers/radeonsi/si_pipe.c @@ -316,7 +316,7 @@ static int si_get_param(struct pipe_screen* pscreen, enum pipe_cap param) case PIPE_CAP_MAX_GEOMETRY_TOTAL_OUTPUT_COMPONENTS: return 4095; case PIPE_CAP_MAX_VERTEX_STREAMS: - return 1; + return 4; case PIPE_CAP_MAX_VERTEX_ATTRIB_STRIDE: return 2048; diff --git a/src/gallium/drivers/radeonsi/si_shader.c b/src/gallium/drivers/radeonsi/si_shader.c index fa31f73..d8bab87 100644 --- a/src/gallium/drivers/radeonsi/si_shader.c +++ b/src/gallium/drivers/radeonsi/si_shader.c @@ -31,6 +31,7 @@ #include gallivm/lp_bld_intr.h #include gallivm/lp_bld_logic.h #include gallivm/lp_bld_arit.h +#include gallivm/lp_bld_bitarit.h #include gallivm/lp_bld_flow.h #include radeon/r600_cs.h #include radeon/radeon_llvm.h @@ -87,8 +88,8 @@ struct si_shader_context LLVMValueRef samplers[SI_NUM_SAMPLER_STATES]; LLVMValueRef so_buffers[4]; LLVMValueRef esgs_ring; - LLVMValueRef gsvs_ring; - LLVMValueRef gs_next_vertex; + LLVMValueRef gsvs_ring[4]; + LLVMValueRef gs_next_vertex[4]; }; static struct si_shader_context * si_shader_context( @@ -1576,6 +1577,9 @@ static void si_llvm_emit_streamout(struct si_shader_context *shader, LLVMValueRef can_emit = LLVMBuildICmp(builder, LLVMIntULT, tid, so_vtx_count, ); + LLVMValueRef stream_id = + unpack_param(shader, shader-param_streamout_config, 24, 2); + /* Emit the streamout code conditionally. This actually avoids * out-of-bounds buffer access. The hw tells us via the SGPR * (so_vtx_count) which threads are allowed to emit streamout data. */ @@ -1615,7 +1619,9 @@ static void si_llvm_emit_streamout(struct si_shader_context *shader, unsigned reg = so-output[i].register_index; unsigned start = so-output[i].start_component; unsigned num_comps = so-output[i].num_components; + unsigned stream = so-output[i].stream; LLVMValueRef out[4]; + struct
Re: [Mesa-dev] [PATCH 01/20] mesa: fix and simplify resource query for arrays
On 07/30/2015 08:56 AM, Tapani Pälli wrote: On 07/30/2015 08:50 AM, Timothy Arceri wrote: On 30 July 2015 3:27:41 pm AEST, Tapani Pälli tapani.pa...@intel.com wrote: On 07/30/2015 08:25 AM, Tapani Pälli wrote: Hi Timothy; Would it be OK for you to wait a bit with these 3 first patches? I'm currently going through failing PIQ tests and I have a 1 line fix to valid_program_resource_index_name() that together with your patch glsl: set stage flag for structs and arrays in resource list will fix all of the failing ES31-CTS.program_interface_query.uniform tests. Frankly I don't see the value of such a big refactor necessary for this. The value is in removing almost 200 lines of complicated code trying to solve the same problem in different ways. How its this not a good thing? Also once this is done we can also work towards removing UniformHash from Mesa. Sure, I agree that removing lots of code is a good thing. I just don't quite yet understand how some of this will work, many of the removed checks were there for a good reason (faling tests!). Just to make sure, did you run Piglit and conformance without regressions? Anyway, I will read through your changes and think about this more. Replying myself, no regressions observed :) Yep, it seems that parse_program_resource_name() tackles the cases. I will do some further testing, particularly on the 'gl_VertexIDMESA' special case and review the patches during today or tomorrow. Woops, make it ES31-CTS.program_interface_query.uniform-types, not ES31-CTS.program_interface_query.uniform. It looks to me that most tests fail just because of not figuring out the correct index for the resource and that can be fixed by a valid_program_resource_index_name change. On 07/29/2015 04:56 PM, Timothy Arceri wrote: This removes the need for multiple functions designed to validate an array subscript and replaces them with a call to a single function. The change also means that validation is now only done once and the index is retrived at the same time, as a result the getUniformLocation code can be simplified saving an extra hash table lookup (and yet another validation call). This chage also fixes some tests in: ES31-CTS.program_interface_query.uniform V3: rebase on subroutines, and move the resource index array == 0 check into _mesa_GetProgramResourceIndex() to simplify things further V2: Fix bounds checks for program input/output, split unrelated comment fix and _mesa_get_uniform_location() removal into their own patch. Cc: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/program_resource.c | 14 ++-- src/mesa/main/shader_query.cpp | 174 +-- src/mesa/main/shaderapi.c| 2 +- src/mesa/main/shaderapi.h| 3 +- src/mesa/main/uniforms.c | 5 +- 5 files changed, 106 insertions(+), 92 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index 641ef22..fdbd5b3 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -229,6 +229,7 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, const GLchar *name) { GET_CURRENT_CONTEXT(ctx); + unsigned array_index = 0; struct gl_program_resource *res; struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, @@ -268,13 +269,10 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, case GL_PROGRAM_OUTPUT: case GL_UNIFORM: case GL_TRANSFORM_FEEDBACK_VARYING: - /* Validate name syntax for array variables */ - if (!valid_program_resource_index_name(name)) - return GL_INVALID_INDEX; - /* fall-through */ case GL_UNIFORM_BLOCK: - res = _mesa_program_resource_find_name(shProg, programInterface, name); - if (!res) + res = _mesa_program_resource_find_name(shProg, programInterface, name, + array_index); + if (!res || array_index 0) return GL_INVALID_INDEX; return _mesa_program_resource_index(shProg, res); @@ -403,7 +401,7 @@ _mesa_GetProgramResourceLocation(GLuint program, GLenum programInterface, struct gl_shader_program *shProg = lookup_linked_program(program, glGetProgramResourceLocation); - if (!shProg || !name || invalid_array_element_syntax(name)) + if (!shProg || !name) return -1; /* Validate programInterface. */ @@ -453,7 +451,7 @@ _mesa_GetProgramResourceLocationIndex(GLuint program, GLenum programInterface, struct gl_shader_program *shProg = lookup_linked_program(program, glGetProgramResourceLocationIndex); - if (!shProg || !name || invalid_array_element_syntax(name)) + if (!shProg || !name) return -1; /* From the GL_ARB_program_interface_query spec: diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp index 3e52a09..dd11b81
Re: [Mesa-dev] [PATCH v2 05/23] glsl: Fail to link if inter-stage input/outputs are not assigned to stream 0
On Thu, 2015-07-30 at 09:43 +0200, Marek Olšák wrote: On Thu, Jul 30, 2015 at 8:49 AM, Iago Toral ito...@igalia.com wrote: On Wed, 2015-07-29 at 21:58 +0200, Marek Olšák wrote: Hi, Where does the spec say we should fail to link? I don't see such a statement there. I have reviewed ARB_gpu_shader5 and I don't see any specific mentions to what should be done in this particular case. That said, isn't this the logical thing to do? It is a programming error to link an FS input to a GS output bound to a non-zero stream and at best the program would have undefined behavior if the FS input is used. Hiding this from the developer silently does not seem to be a good idea in any case, whatever the developer was trying to accomplish he is doing it wrong. It looks like varyings with stream 0 should not be linked with the fragment shader. How is this better? The spec simply says that varyings with stream 0 are not passed to the rasterizer, which means the matching fragment shader inputs should be uninitialized, but it should still be possible to capture the varyings with transform feedback. That's how I understand the spec. FWIW, I have tested this in the proprietary nVidia driver and the result is the same, it fails to link even if that GS output is captured by TF. My interpretation of the spec is that since GS outputs to stream 0 are not passed down the pipeline they simply do not exist in the eyes of the FS, that is, I see this situation as the same in which we declare an input in the FS that is not declared as output in the GS. But it is true that the spec does not address this situation explicitly, so I think both interpretations could be valid. I still think that failing to link is better though. If we report a link failure the developer knows what is going on and the fix is trivial, otherwise they will run into incorrect rendering, they will have to figure out what is going and eventually fix the code anyway... Since at least nVidia proprietary is failing to link as well in these scenarios I guess our chances of running into shaders that we fail to link for this reason and were expected to link properly are pretty small too. Iago ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/17] glsl: Add precision information to ir_variable
On Wed, 2015-07-29 at 15:16 -0700, Ian Romanick wrote: On 07/29/2015 07:01 AM, Samuel Iglesias Gonsalvez wrote: From: Iago Toral Quiroga ito...@igalia.com We will need this later on when we implement proper support for precision qualifiers in the drivers and also to do link time checks for uniforms as indicated by the spec. This patch also adds compile-time checks for variables without precision information (currently, Mesa only checks that a default precision is set for floats in fragment shaders). As indicated by Ian, the addition of the precision information to ir_variable has been done using a bitfield and pahole to identify an available hole so that memory requirements for ir_variable stay the same. --- src/glsl/ast_to_hir.cpp | 316 +++- src/glsl/glsl_types.cpp | 4 + src/glsl/glsl_types.h | 12 ++ src/glsl/ir.h | 13 ++ 4 files changed, 288 insertions(+), 57 deletions(-) diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp index 789b2bc..8b170c2 100644 --- a/src/glsl/ast_to_hir.cpp +++ b/src/glsl/ast_to_hir.cpp @@ -1993,6 +1993,41 @@ process_array_type(YYLTYPE *loc, const glsl_type *base, return array_type; } +static bool +precision_qualifier_allowed(const glsl_type *type) This function is just moved up from below? I would have been tempted to put that in a separate patch to make it more obvious that there no changes. *shrug* +{ + /* Precision qualifiers apply to floating point, integer and sampler +* types. +* +* Section 4.5.2 (Precision Qualifiers) of the GLSL 1.30 spec says: +*Any floating point or any integer declaration can have the type +*preceded by one of these precision qualifiers [...] Literal +*constants do not have precision qualifiers. Neither do Boolean +*variables. +* +* Section 4.5 (Precision and Precision Qualifiers) of the GLSL 1.30 +* spec also says: +* +* Precision qualifiers are added for code portability with OpenGL +* ES, not for functionality. They have the same syntax as in OpenGL +* ES. +* +* Section 8 (Built-In Functions) of the GLSL ES 1.00 spec says: +* +* uniform lowp sampler2D sampler; +* highp vec2 coord; +* ... +* lowp vec4 col = texture2D (sampler, coord); +*// texture2D returns lowp +* +* From this, we infer that GLSL 1.30 (and later) should allow precision +* qualifiers on sampler types just like float and integer types. +*/ + return type-is_float() + || type-is_integer() + || type-is_record() + || type-is_sampler(); +} const glsl_type * ast_type_specifier::glsl_type(const char **name, @@ -2009,31 +2044,172 @@ ast_type_specifier::glsl_type(const char **name, return type; } +/** + * From the OpenGL ES 3.0 spec, 4.5.4 Default Precision Qualifiers: + * + * The precision statement + * + *precision precision-qualifier type; + * + * can be used to establish a default precision qualifier. The type field can + * be either int or float or any of the sampler types, (...) If type is float, + * the directive applies to non-precision-qualified floating point type + * (scalar, vector, and matrix) declarations. If type is int, the directive + * applies to all non-precision-qualified integer type (scalar, vector, signed, + * and unsigned) declarations. + * + * We use the symbol table to keep the values of the default precisions for + * each 'type' in each scope and we use the 'type' string from the precision + * statement as key in the symbol table. When we want to retrieve the default + * precision associated with a given glsl_type we need to know the type string + * associated with it. This is what this function returns. + */ +static const char * +get_type_name_for_precision_qualifier(const glsl_type *type) +{ + switch (type-base_type) { + case GLSL_TYPE_FLOAT: + return float; + case GLSL_TYPE_UINT: + case GLSL_TYPE_INT: + return int; + case GLSL_TYPE_SAMPLER: { + bool array = type-sampler_array; + bool shadow = type-sampler_shadow; + switch (type-sampler_type) { + case GLSL_TYPE_FLOAT: + switch (type-sampler_dimensionality) { + case GLSL_SAMPLER_DIM_1D: +if (!array !shadow) + return sampler1D; +if (array !shadow) + return sampler1DArray; +if (!array shadow) + return sampler1DShadow; +return sampler1DArrayShadow; + case GLSL_SAMPLER_DIM_2D: +if (!array !shadow) + return sampler2D; +if (array !shadow) +
Re: [Mesa-dev] [PATCH] includes/GL: remove duplicated extension declarations from glx.h
On 07/30/2015 07:50 AM, Emil Velikov wrote: On 30 July 2015 at 15:22, Emil Velikov emil.l.veli...@gmail.com wrote: All three of GLX_NV_float_buffer, GLX_EXT_texture_from_pixmap and GLX_MESA_query_renderer have been in glxext.h for a while now. As such we can drop this workaround/hack from the header. Signed-off-by: Emil Velikov emil.l.veli...@gmail.com --- Not to mention that glxext.h honours GLX_GLXEXT_PROTOTYPES and provides the relevant typedefs and/or function declarations when appropriate. -Emil include/GL/glx.h | 88 1 file changed, 88 deletions(-) diff --git a/include/GL/glx.h b/include/GL/glx.h index 78f5052..d612b79 100644 --- a/include/GL/glx.h +++ b/include/GL/glx.h @@ -371,14 +371,6 @@ extern Bool glXDrawableAttribARB(Display *dpy, GLXDrawable draw, const int *attr /* * Remove this when glxext.h is updated. */ Seems like I should drop this comment as well. On a related note: there are a few more extensions that never made it (i.e. are missing from the Khronos public spec page) but are still around. Namely GLX_NV_vertex_array_range - Shipped with nvidia's glxext.h, but unused in the binary driver (version 352.21). - Referenced in GLX_MESA_agp_offset - GLX_NV_vertex_array_range is required GLX_ARB_render_texture - No hints in nvidia headers/binary. - Referenced in GLX_EXT_texture_from_pixmap - Adapted spec language from draft version of GLX_ARB_render_texture - The Khronos meeting minutes state GLX_ARB_render_texture hasn't been kept up to date, but there's no reason it can't be completed. [1] GLX_MESA_swap_frame_usage - No hints in nvidia headers/binary. - No reference to in the public specs. GLX_MESA_swap_control - No hints in nvidia headers/binary. - Referenced by GLX_EXT_swap_control, Based on GLX_MESA_swap_control version 1.1 I think both of these have specs in the Mesa tree. I'm not suggesting that we should nuke these, but curious if they are or may have been used at some point. -Emil [1] https://www.opengl.org/archives/about/arb/meeting_notes/notes/meeting_note_2001-06-12.html ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] radeonsi: copy *8_SNORM bits exactly in resource_copy_region
From: Marek Olšák marek.ol...@amd.com Disabling the FP16 mode didn't help. If needed, we can use this trick for blits too, but not for scaled blits. + 4 piglits --- src/gallium/drivers/radeonsi/si_blit.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/gallium/drivers/radeonsi/si_blit.c b/src/gallium/drivers/radeonsi/si_blit.c index c892623..61ca2a8 100644 --- a/src/gallium/drivers/radeonsi/si_blit.c +++ b/src/gallium/drivers/radeonsi/si_blit.c @@ -522,7 +522,9 @@ void si_resource_copy_region(struct pipe_context *ctx, src_box = sbox; src_force_level = src_level; - } else if (!util_blitter_is_copy_supported(sctx-blitter, dst, src)) { + } else if (!util_blitter_is_copy_supported(sctx-blitter, dst, src) || + /* also *8_SNORM has precision issues, use UNORM instead */ + util_format_is_snorm(src-format)) { if (util_format_is_subsampled_422(src-format)) { src_templ.format = PIPE_FORMAT_R8G8B8A8_UINT; dst_templ.format = PIPE_FORMAT_R8G8B8A8_UINT; -- 2.1.4 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] docs: trivial cleanup of GL3.txt, remove redundant radeonsi entries.
Follow-up to 1b2b0e42ce47bfd1fcb5513ed2c23b9bb7a5a5b8 Signed-off-by: Kai Wasserbäch k...@dev.carbon-project.org --- Please commit this for me, if this is accepted. Thanks. docs/GL3.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/GL3.txt b/docs/GL3.txt index 0220f24..8124383 100644 --- a/docs/GL3.txt +++ b/docs/GL3.txt @@ -96,7 +96,7 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, radeonsi GL_ARB_draw_buffers_blendDONE (i965, nv50, r600, llvmpipe, softpipe) GL_ARB_draw_indirect DONE (i965, r600, llvmpipe, softpipe) - GL_ARB_gpu_shader5 DONE (i965, radeonsi) + GL_ARB_gpu_shader5 DONE (i965) - 'precise' qualifierDONE - Dynamically uniform sampler array indices DONE (r600, softpipe) - Dynamically uniform UBO array indices DONE (r600) @@ -110,7 +110,7 @@ GL 4.0, GLSL 4.00 --- all DONE: nvc0, radeonsi - Interpolation functionsDONE (r600) - New overload resolution rules DONE GL_ARB_gpu_shader_fp64 DONE (llvmpipe, softpipe) - GL_ARB_sample_shadingDONE (i965, nv50, r600, radeonsi) + GL_ARB_sample_shadingDONE (i965, nv50, r600) GL_ARB_shader_subroutine DONE (i965, nv50, r600, llvmpipe, softpipe) GL_ARB_tessellation_shader DONE () GL_ARB_texture_buffer_object_rgb32 DONE (i965, r600, llvmpipe, softpipe) -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] arrays of structures fix
Hi; This together with Timothy's set stage flag for structs and arrays in resource list fixes following failing conformance test: ES31-CTS.program_interface_query.uniform-types Tapani Pälli (1): mesa: support arrays of structures in GetProgramResourceIndex src/mesa/main/program_resource.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] mesa: support arrays of structures in GetProgramResourceIndex
This fixes a lot of failing tests in: ES31-CTS.program_interface_query.uniform-types Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/program_resource.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index 641ef22..6c0a923 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -212,13 +212,16 @@ valid_program_resource_index_name(const GLchar *name) { const char *array = strstr(name, [); const char *close = strrchr(name, ']'); + const char *after = close ? close + 1 : NULL; /* Not array, no need for the check. */ if (!array) return true; - /* Last array index has to be zero. */ - if (!close || *--close != '0') + /* Last array index has to be zero, for the exception when +* dealing with arrays of structures. +*/ + if (!close || (*--close != '0' *after != '.')) return false; return true; -- 2.1.0 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] Delete duplicate function is_power_of_two() and use _mesa_is_pow_two()
On 30.07.2015 01:27, Anuj Phogat wrote: On Wed, Jul 29, 2015 at 12:20 AM, Iago Toral ito...@igalia.com wrote: Funny, we had 3 instances of the same function with the same implementation :), there is still util_is_power_of_two in Gallium btw. Yes. I wasn't sure if we can use main/imports.h in gallium. We can't, that's what src/util/ is for. -- Earthling Michel Dänzer | http://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 2/2] mesa: Fix printf format specifier warn of the ptrdiff_t
That's correct. We should use the inttypes.h PRI*PTR specifiers for portability. Unfortunately there's no such format specificier for size_t or ssize_t, but I believe intptr_t should match on all platforms we support. Jose On 30/07/15 00:53, Ilia Mirkin wrote: I assume that you mean the C99 spec? I don't think MSVC supports %t or %z :( On Wed, Jul 29, 2015 at 7:46 PM, Edward O'Callaghan eocallag...@alterapraxis.com wrote: See §7.19.6.1, paragraph 7 of the ISO C specification. Signed-off-by: Edward O'Callaghan eocallag...@alterapraxis.com --- src/mesa/main/bufferobj.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c index 9425b09..453b864 100644 --- a/src/mesa/main/bufferobj.c +++ b/src/mesa/main/bufferobj.c @@ -2372,7 +2372,7 @@ _mesa_map_buffer_range(struct gl_context *ctx, if (offset + length bufObj-Size) { _mesa_error(ctx, GL_INVALID_VALUE, - %s(offset %ld + length %ld buffer_size %ld), func, + %s(offset %td + length %td buffer_size %td), func, offset, length, bufObj-Size); return NULL; } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] r600: Fix some printf format warns pretaining to PRIi64
Hi, I don't know if you will believe it or not, but I have just written a very similar patch to yours :) Anyway, yours have come first so have my Reviewed-by: Bruno Jiménez brunoji...@gmail.com Note: I have no permissions in the git repo, you should ask someone to push this patch for you. - Bruno On Thu, 2015-07-30 at 09:46 +1000, Edward O'Callaghan wrote: Signed-off-by: Edward O'Callaghan eocallag...@alterapraxis.com --- src/gallium/drivers/r600/compute_memory_pool.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/gallium/drivers/r600/compute_memory_pool.c b/src/gallium/drivers/r600/compute_memory_pool.c index 413aa3d..7c5113e 100644 --- a/src/gallium/drivers/r600/compute_memory_pool.c +++ b/src/gallium/drivers/r600/compute_memory_pool.c @@ -120,7 +120,7 @@ int64_t compute_memory_prealloc_chunk( assert(size_in_dw = pool-size_in_dw); - COMPUTE_DBG(pool-screen, * compute_memory_prealloc_chunk() size_in_dw = %ld\n, + COMPUTE_DBG(pool-screen, * compute_memory_prealloc_chunk() size_in_dw = %PRIi64\n, size_in_dw); LIST_FOR_EACH_ENTRY(item, pool-item_list, link) { @@ -151,7 +151,7 @@ struct list_head *compute_memory_postalloc_chunk( struct compute_memory_item *next; struct list_head *next_link; - COMPUTE_DBG(pool-screen, * compute_memory_postalloc_chunck() start_in_dw = %ld\n, + COMPUTE_DBG(pool-screen, * compute_memory_postalloc_chunck() start_in_dw = %PRIi64\n, start_in_dw); /* Check if we can insert it in the front of the list */ @@ -568,7 +568,7 @@ void compute_memory_free(struct compute_memory_pool* pool, int64_t id) struct pipe_screen *screen = (struct pipe_screen *)pool -screen; struct pipe_resource *res; - COMPUTE_DBG(pool-screen, * compute_memory_free() id + %ld \n, id); + COMPUTE_DBG(pool-screen, * compute_memory_free() id + %PRIi64 \n, id); LIST_FOR_EACH_ENTRY_SAFE(item, next, pool-item_list, link) { @@ -628,7 +628,7 @@ struct compute_memory_item* compute_memory_alloc( { struct compute_memory_item *new_item = NULL; - COMPUTE_DBG(pool-screen, * compute_memory_alloc() size_in_dw = %ld (%ld bytes)\n, + COMPUTE_DBG(pool-screen, * compute_memory_alloc() size_in_dw = %PRIi64 (%PRIi64 bytes)\n, size_in_dw, 4 * size_in_dw); new_item = (struct compute_memory_item *) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] dri: set the __DRI_API_OPENGL bit based on max gl compat version
On 30 July 2015 at 10:27, Frank Binns frank.bi...@imgtec.com wrote: This matches similar behaviour for the __DRI_API_OPENGL_CORE bit. Signed-off-by: Frank Binns frank.bi...@imgtec.com --- src/mesa/drivers/dri/common/dri_util.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/mesa/drivers/dri/common/dri_util.c b/src/mesa/drivers/dri/common/dri_util.c index 884a7e0..d35ac26 100644 --- a/src/mesa/drivers/dri/common/dri_util.c +++ b/src/mesa/drivers/dri/common/dri_util.c @@ -163,7 +163,9 @@ driCreateNewScreen2(int scrn, int fd, } } -psp-api_mask = (1 __DRI_API_OPENGL); +psp-api_mask = 0; +if (psp-max_gl_compat_version 0) + psp-api_mask |= (1 __DRI_API_OPENGL); It's almost as if there is a DRI module that doesn't do OpenGL but GLES alone (cough, cough). But seriously, the series looks great imho, and is Reviewed-by: Emil Velikov emil.l.veli...@gmail.com I do wonder if we should pull patch#1 (and #2) for the stable branch ? -Emil ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa/formats: add some formats from GL3.3
Hmm I guess we can't convert it to something else when we don't have a matching MESA_FORMAT? Roland Am 30.07.2015 um 03:48 schrieb Dave Airlie: From: Dave Airlie airl...@redhat.com GL3.3 added GL_ARB_texture_rgb10_a2ui, which specifies a lot more things than just rgb10/a2ui. While playing with ogl conform one of the tests must attempted all valid formats for GL3.3 and hits the unreachable here. This adds the first chunk of formats that hit the assert. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/main/formats.c | 63 +++ src/mesa/main/formats.csv | 12 + src/mesa/main/formats.h | 12 + src/mesa/main/glformats.c | 24 ++ 4 files changed, 111 insertions(+) diff --git a/src/mesa/main/formats.c b/src/mesa/main/formats.c index baeb1bf..872f18b 100644 --- a/src/mesa/main/formats.c +++ b/src/mesa/main/formats.c @@ -1008,6 +1008,8 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_B5G6R5_UNORM: case MESA_FORMAT_R5G6B5_UNORM: + case MESA_FORMAT_B5G6R5_UINT: + case MESA_FORMAT_R5G6B5_UINT: *datatype = GL_UNSIGNED_SHORT_5_6_5; *comps = 3; return; @@ -1015,6 +1017,8 @@ _mesa_format_to_type_and_comps(mesa_format format, case MESA_FORMAT_B4G4R4A4_UNORM: case MESA_FORMAT_A4R4G4B4_UNORM: case MESA_FORMAT_B4G4R4X4_UNORM: + case MESA_FORMAT_B4G4R4A4_UINT: + case MESA_FORMAT_A4R4G4B4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; @@ -1022,6 +1026,8 @@ _mesa_format_to_type_and_comps(mesa_format format, case MESA_FORMAT_B5G5R5A1_UNORM: case MESA_FORMAT_A1R5G5B5_UNORM: case MESA_FORMAT_B5G5R5X1_UNORM: + case MESA_FORMAT_B5G5R5A1_UINT: + case MESA_FORMAT_A1R5G5B5_UINT: *datatype = GL_UNSIGNED_SHORT_1_5_5_5_REV; *comps = 4; return; @@ -1032,6 +1038,7 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_A1B5G5R5_UNORM: + case MESA_FORMAT_A1B5G5R5_UINT: *datatype = GL_UNSIGNED_SHORT_5_5_5_1; *comps = 4; return; @@ -1066,19 +1073,23 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_R3G3B2_UNORM: + case MESA_FORMAT_R3G3B2_UINT: *datatype = GL_UNSIGNED_BYTE_2_3_3_REV; *comps = 3; return; case MESA_FORMAT_A4B4G4R4_UNORM: + case MESA_FORMAT_A4B4G4R4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; case MESA_FORMAT_R4G4B4A4_UNORM: + case MESA_FORMAT_R4G4B4A4_UINT: *datatype = GL_UNSIGNED_SHORT_4_4_4_4; *comps = 4; return; case MESA_FORMAT_R5G5B5A1_UNORM: + case MESA_FORMAT_R5G5B5A1_UINT: *datatype = GL_UNSIGNED_SHORT_1_5_5_5_REV; *comps = 4; return; @@ -1094,6 +1105,7 @@ _mesa_format_to_type_and_comps(mesa_format format, return; case MESA_FORMAT_B2G3R3_UNORM: + case MESA_FORMAT_B2G3R3_UINT: *datatype = GL_UNSIGNED_BYTE_3_3_2; *comps = 3; return; @@ -2123,6 +2135,57 @@ _mesa_format_matches_format_and_type(mesa_format mesa_format, type == GL_UNSIGNED_INT_2_10_10_10_REV !swapBytes); + case MESA_FORMAT_B5G6R5_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_SHORT_5_6_5; + + case MESA_FORMAT_R5G6B5_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_SHORT_5_6_5_REV; + + case MESA_FORMAT_B2G3R3_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_BYTE_3_3_2; + + case MESA_FORMAT_R3G3B2_UINT: + return format == GL_RGB_INTEGER type == GL_UNSIGNED_BYTE_2_3_3_REV; + + case MESA_FORMAT_A4B4G4R4_UINT: + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4 !swapBytes) + return GL_TRUE; + + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV swapBytes) + return GL_TRUE; + return GL_FALSE; + + case MESA_FORMAT_R4G4B4A4_UINT: + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV !swapBytes) + return GL_TRUE; + + if (format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4 swapBytes) + return GL_TRUE; + + return GL_FALSE; + + case MESA_FORMAT_B4G4R4A4_UINT: + return format == GL_BGRA_INTEGER type == GL_UNSIGNED_SHORT_4_4_4_4_REV + !swapBytes; + + case MESA_FORMAT_A4R4G4B4_UINT: + return GL_FALSE; + + case MESA_FORMAT_A1B5G5R5_UINT: + return format == GL_RGBA_INTEGER type == GL_UNSIGNED_SHORT_5_5_5_1 + !swapBytes; + + case MESA_FORMAT_B5G5R5A1_UINT: + return format == GL_BGRA_INTEGER type == GL_UNSIGNED_SHORT_1_5_5_5_REV + !swapBytes; + + case MESA_FORMAT_A1R5G5B5_UINT: + return format == GL_BGRA_INTEGER type ==
[Mesa-dev] [PATCH] clover: handle setKernelArg errors
--- src/gallium/state_trackers/clover/core/kernel.cpp | 15 +++ 1 file changed, 15 insertions(+) diff --git a/src/gallium/state_trackers/clover/core/kernel.cpp b/src/gallium/state_trackers/clover/core/kernel.cpp index a23cd2b..820a80a 100644 --- a/src/gallium/state_trackers/clover/core/kernel.cpp +++ b/src/gallium/state_trackers/clover/core/kernel.cpp @@ -366,6 +366,9 @@ kernel::scalar_argument::scalar_argument(size_t size) : size(size) { void kernel::scalar_argument::set(size_t size, const void *value) { + if (!value) + throw error(CL_INVALID_ARG_VALUE); + if (size != this-size) throw error(CL_INVALID_ARG_SIZE); @@ -434,6 +437,9 @@ kernel::local_argument::set(size_t size, const void *value) { if (value) throw error(CL_INVALID_ARG_VALUE); + if (!size) + throw error(CL_INVALID_ARG_SIZE); + _storage = size; _set = true; } @@ -493,6 +499,9 @@ kernel::constant_argument::unbind(exec_context ctx) { void kernel::image_rd_argument::set(size_t size, const void *value) { + if (!value) + throw error(CL_INVALID_ARG_VALUE); + if (size != sizeof(cl_mem)) throw error(CL_INVALID_ARG_SIZE); @@ -521,6 +530,9 @@ kernel::image_rd_argument::unbind(exec_context ctx) { void kernel::image_wr_argument::set(size_t size, const void *value) { + if (!value) + throw error(CL_INVALID_ARG_VALUE); + if (size != sizeof(cl_mem)) throw error(CL_INVALID_ARG_SIZE); @@ -549,6 +561,9 @@ kernel::image_wr_argument::unbind(exec_context ctx) { void kernel::sampler_argument::set(size_t size, const void *value) { + if (!value) + throw error(CL_INVALID_SAMPLER); + if (size != sizeof(cl_sampler)) throw error(CL_INVALID_ARG_SIZE); -- 2.4.6 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/3] st/mesa: implement DrawTransformFeedbackStream
On 31 July 2015 at 00:26, Marek Olšák mar...@gmail.com wrote: From: Marek Olšák marek.ol...@amd.com once you rebase onto my last patch, these look fine. Reviewed-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_cb_xformfb.c | 58 ++ src/mesa/state_tracker/st_cb_xformfb.h | 2 +- src/mesa/state_tracker/st_draw.c | 2 +- 3 files changed, 33 insertions(+), 29 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c index 07c118e..85a8620 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.c +++ b/src/mesa/state_tracker/st_cb_xformfb.c @@ -54,9 +54,9 @@ struct st_transform_feedback_object { struct pipe_stream_output_target *targets[PIPE_MAX_SO_BUFFERS]; /* This encapsulates the count that can be used as a source for draw_vbo. -* It contains a stream output target from the last call of -* EndTransformFeedback. */ - struct pipe_stream_output_target *draw_count; +* It contains stream output targets from the last call of +* EndTransformFeedback for each stream. */ + struct pipe_stream_output_target *draw_count[MAX_VERTEX_STREAMS]; }; static inline struct st_transform_feedback_object * @@ -88,7 +88,8 @@ st_delete_transform_feedback(struct gl_context *ctx, st_transform_feedback_object(obj); unsigned i; - pipe_so_target_reference(sobj-draw_count, NULL); + for (i = 0; i ARRAY_SIZE(sobj-draw_count); i++) + pipe_so_target_reference(sobj-draw_count[i], NULL); /* Unreference targets. */ for (i = 0; i sobj-num_targets; i++) { @@ -123,9 +124,12 @@ st_begin_transform_feedback(struct gl_context *ctx, GLenum mode, struct st_buffer_object *bo = st_buffer_object(sobj-base.Buffers[i]); if (bo bo-buffer) { + unsigned stream = +obj-shader_program-LinkedTransformFeedback.BufferStream[i]; + /* Check whether we need to recreate the target. */ if (!sobj-targets[i] || - sobj-targets[i] == sobj-draw_count || + sobj-targets[i] == sobj-draw_count[stream] || sobj-targets[i]-buffer != bo-buffer || sobj-targets[i]-buffer_offset != sobj-base.Offset[i] || sobj-targets[i]-buffer_size != sobj-base.Size[i]) { @@ -178,24 +182,6 @@ st_resume_transform_feedback(struct gl_context *ctx, } -static struct pipe_stream_output_target * -st_transform_feedback_get_draw_target(struct gl_transform_feedback_object *obj) -{ - struct st_transform_feedback_object *sobj = - st_transform_feedback_object(obj); - unsigned i; - - for (i = 0; i ARRAY_SIZE(sobj-targets); i++) { - if (sobj-targets[i]) { - return sobj-targets[i]; - } - } - - assert(0); - return NULL; -} - - static void st_end_transform_feedback(struct gl_context *ctx, struct gl_transform_feedback_object *obj) @@ -203,22 +189,40 @@ st_end_transform_feedback(struct gl_context *ctx, struct st_context *st = st_context(ctx); struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); + unsigned i; cso_set_stream_outputs(st-cso_context, 0, NULL, NULL); - pipe_so_target_reference(sobj-draw_count, -st_transform_feedback_get_draw_target(obj)); + /* The next call to glDrawTransformFeedbackStream should use the vertex +* count from the last call to glEndTransformFeedback. +* Therefore, save the targets for each stream. +* +* NULL means the vertex counter is 0 (initial state). +*/ + for (i = 0; i ARRAY_SIZE(sobj-draw_count); i++) + pipe_so_target_reference(sobj-draw_count[i], NULL); + + for (i = 0; i ARRAY_SIZE(sobj-targets); i++) { + unsigned stream = + obj-shader_program-LinkedTransformFeedback.BufferStream[i]; + + /* Is it not boudn or already set for this stream? */ + if (!sobj-targets[i] || sobj-draw_count[stream]) + continue; + + pipe_so_target_reference(sobj-draw_count[stream], sobj-targets[i]); + } } void st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, -struct pipe_draw_info *out) +unsigned stream, struct pipe_draw_info *out) { struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); - out-count_from_stream_output = sobj-draw_count; + out-count_from_stream_output = sobj-draw_count[stream]; } diff --git a/src/mesa/state_tracker/st_cb_xformfb.h b/src/mesa/state_tracker/st_cb_xformfb.h index 998c418..16f5176 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.h +++ b/src/mesa/state_tracker/st_cb_xformfb.h @@ -40,7 +40,7 @@ st_init_xformfb_functions(struct dd_function_table *functions); extern void st_transform_feedback_draw_init(struct
[Mesa-dev] [PATCH v2 2/2]] mesa/teximage: accept ASTC formats for 3D texture specification
From: Nanley Chery nanley.g.ch...@intel.com The ASTC spec was revised as follows: Revision 2, April 28, 2015 - added CompressedTex{Sub,}Image3D to commands accepting ASTC format tokens in the New Tokens section [...]. Support only exists in the HDR submode: Add a second new column 3D Tex. which is empty for all non-ASTC formats. If only the LDR profile is supported by the implementation, this column is also empty for all ASTC formats. If both the LDR and HDR profiles are supported only, this column is checked for all ASTC formats. LDR-only systems should generate an INVALID_OPERATION error when attempting to call CompressedTexImage3D with the TEXTURE_3D target. Signed-off-by: Nanley Chery nanley.g.ch...@intel.com --- The original patch was missing the GL error modification for LDR-only systems. src/mesa/main/teximage.c | 50 +++- 1 file changed, 49 insertions(+), 1 deletion(-) diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c index 949eef0..3a4db63 100644 --- a/src/mesa/main/teximage.c +++ b/src/mesa/main/teximage.c @@ -1819,6 +1819,35 @@ _mesa_target_can_be_compressed(const struct gl_context *ctx, GLenum target, case GL_COMPRESSED_RGB_BPTC_SIGNED_FLOAT: case GL_COMPRESSED_RGB_BPTC_UNSIGNED_FLOAT: return ctx-Extensions.ARB_texture_compression_bptc; + case GL_COMPRESSED_RGBA_ASTC_4x4_KHR: + case GL_COMPRESSED_RGBA_ASTC_5x4_KHR: + case GL_COMPRESSED_RGBA_ASTC_5x5_KHR: + case GL_COMPRESSED_RGBA_ASTC_6x5_KHR: + case GL_COMPRESSED_RGBA_ASTC_6x6_KHR: + case GL_COMPRESSED_RGBA_ASTC_8x5_KHR: + case GL_COMPRESSED_RGBA_ASTC_8x6_KHR: + case GL_COMPRESSED_RGBA_ASTC_8x8_KHR: + case GL_COMPRESSED_RGBA_ASTC_10x5_KHR: + case GL_COMPRESSED_RGBA_ASTC_10x6_KHR: + case GL_COMPRESSED_RGBA_ASTC_10x8_KHR: + case GL_COMPRESSED_RGBA_ASTC_10x10_KHR: + case GL_COMPRESSED_RGBA_ASTC_12x10_KHR: + case GL_COMPRESSED_RGBA_ASTC_12x12_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x4_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_5x5_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x5_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_6x6_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x5_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x6_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_8x8_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x5_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x6_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x8_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_10x10_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x10_KHR: + case GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR: + return ctx-Extensions.KHR_texture_compression_astc_hdr; default: return GL_FALSE; } @@ -2296,6 +2325,16 @@ texture_error_check( struct gl_context *ctx, return GL_FALSE; } +static inline bool +is_astc_format(const struct gl_context *ctx, GLenum internalFormat) +{ + return ctx-Extensions.KHR_texture_compression_astc_ldr + ((internalFormat = GL_COMPRESSED_RGBA_ASTC_4x4_KHR + internalFormat = GL_COMPRESSED_RGBA_ASTC_12x12_KHR) || + (internalFormat = GL_COMPRESSED_SRGB8_ALPHA8_ASTC_4x4_KHR + internalFormat = GL_COMPRESSED_SRGB8_ALPHA8_ASTC_12x12_KHR)); +} + /** * Error checking for glCompressedTexImage[123]D(). @@ -2324,7 +2363,16 @@ compressed_texture_error_check(struct gl_context *ctx, GLint dimensions, * CompressedTexImage3D will generate an INVALID_OPERATION error if * target is not TEXTURE_2D_ARRAY. */ - error = _mesa_is_desktop_gl(ctx) ? GL_INVALID_ENUM : GL_INVALID_OPERATION; + /* From the KHR_texture_compression_astc_hdr spec: + * + * 'An INVALID_OPERATION error is generated by CompressedTexImage3D + * if internalformat is TEXTURE_CUBE_MAP_ARRAY and the + * Cube Map Array column of table 8.19 is *not* checked, or if + * internalformat is TEXTURE_3D and the 3D Tex. column of table + * 8.19 is *not* checked' + */ + error = _mesa_is_desktop_gl(ctx) !is_astc_format(ctx, internalFormat) ? + GL_INVALID_ENUM : GL_INVALID_OPERATION; goto error; } -- 2.4.2 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 (part2) 55/56] glsl: Consider active all elements of a shared/std140 block array
Please, ignore this patch, we have recently sent an updated version to the mailing list. The new version is available in: http://lists.freedesktop.org/archives/mesa-dev/2015-July/090108.html Regards. On mar, 2015-07-14 at 09:46 +0200, Iago Toral Quiroga wrote: From: Antia Puentes apuen...@igalia.com Commmit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' blocks or block members) considers active 'shared' and 'std140' uniform blocks and uniform block arrays but did not include the block array elements. It was possible to have an active uniform block array without any elements marked as used, making the assertion ((b-num_array_elements 0) == b-type-is_array()) in link_uniform_blocks fail. Fixes the following 5 dEQP tests: * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18 * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24 * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19 * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49 * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36 --- src/glsl/link_uniform_block_active_visitor.cpp | 23 +++ 1 file changed, 23 insertions(+) diff --git a/src/glsl/link_uniform_block_active_visitor.cpp b/src/glsl/link_uniform_block_active_visitor.cpp index 5102947..37418c2 100644 --- a/src/glsl/link_uniform_block_active_visitor.cpp +++ b/src/glsl/link_uniform_block_active_visitor.cpp @@ -106,6 +106,22 @@ link_uniform_block_active_visitor::visit(ir_variable *var) assert(b-num_array_elements == 0); assert(b-array_elements == NULL); assert(b-type != NULL); + assert(!b-type-is_array() || b-has_instance_name); + + /* For uniform block arrays declared with a shared or std140 layout +* qualifier, mark all its instances as used. +*/ + if (b-type-is_array() b-type-length 0) { + b-num_array_elements = b-type-length; + b-array_elements = reralloc(this-mem_ctx, + b-array_elements, + unsigned, + b-num_array_elements); + + for (unsigned i = 0; i b-num_array_elements; i++) { + b-array_elements[i] = i; + } + } return visit_continue; } @@ -147,6 +163,13 @@ link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir) assert((b-num_array_elements == 0) == (b-array_elements == NULL)); assert(b-type != NULL); + /* If the block array was declared with a shared or std140 layout qualifier, +* all its instances have been already marked as used (see +* link_uniform_block_active_visitor::visit(ir_variable *) function). +*/ + if (var-type-interface_packing == GLSL_INTERFACE_PACKING_PACKED) + return visit_continue; + ir_constant *c = ir-array_index-as_constant(); if (c) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] r600: Remove assert that not general in some stream instr's
Don't trigger assert on some stream emit instructions. Signed-off-by: Edward O'Callaghan eocallag...@alterapraxis.com --- src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 2 -- 1 file changed, 2 deletions(-) diff --git a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp index 8c2cd14..48d56ac 100644 --- a/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp +++ b/src/gallium/drivers/r600/sb/sb_bc_finalize.cpp @@ -761,8 +761,6 @@ void bc_finalizer::finalize_cf(cf_node* c) { mask |= (1 chan); } - assert(reg = 0 mask); - if (reg = 0) update_ngpr(reg); -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH/RFC] Minor assert fix
I believe this assert isn't vaild in generality of possible stream emit instructions out the sb backend finalizer. Edward O'Callaghan (1): r600: Remove assert that not general in some stream instr's src/gallium/drivers/r600/sb/sb_bc_finalize.cpp | 2 -- 1 file changed, 2 deletions(-) -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 16/17] glsl: Add link time checks for GLSL precision qualifiers
On Wed, 2015-07-29 at 15:21 -0700, Ian Romanick wrote: On 07/29/2015 07:01 AM, Samuel Iglesias Gonsalvez wrote: From: Iago Toral Quiroga ito...@igalia.com Currently, we only consider precision qualifiers at compile-time. This patch adds precision information to ir_variable so we can also do link time checks. Specifically, from the GLSL ES3 spec, 4.5.3 Precision Qualifiers: The same uniform declared in different shaders that are linked together must have the same precision qualification. Notice that this patch will check the above also for GLSL ES globals that are not uniforms. This is not explicitly stated in the spec, but seems to be the only consistent choice since we can only have one definition of a global all its declarations should be identical, including precision qualifiers. That's not right. Global variables from different stages that are not inputs/outputs or uniforms are distinct... they don't even have to be the same type. ES shaders only allow a single compliation unit per stage, so we don't have to worry about inter-stage globals. Ugh, sorry, the commit log does not make a good job at explaining the situation. This patch does not produce a linker error for globals that are not uniforms, I only meant to say that for globals *in interface blocks*, for which we are producing a linker error in the case of type mismatches, precision will also be considered to decide if the types mismatch. Sorry for being so imprecise in the description, I'll fix the commit log. I guess with this clarification there are no issues with this, right? Iago These checks don't affect desktop GLSL shaders because we ignore precision information in this case (all variables have precision GLSL_PRECISION_NONE). Fixes the following 5 dEQP tests: dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_1 dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_2 dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_3 dEQP-GLES3.functional.shaders.linkage.uniform.struct.precision_conflict_4 dEQP-GLES3.functional.shaders.linkage.uniform.block.precision_mismatch --- src/glsl/linker.cpp | 34 +- 1 file changed, 33 insertions(+), 1 deletion(-) diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp index 12b7780..fd68f43 100644 --- a/src/glsl/linker.cpp +++ b/src/glsl/linker.cpp @@ -958,13 +958,22 @@ cross_validate_globals(struct gl_shader_program *prog, if (var-type-is_record() existing-type-is_record() existing-type-record_compare(var-type)) { existing-type = var-type; - } else { + } else if (strcmp(var-type-name, existing-type-name)) { linker_error(prog, %s `%s' declared as type `%s' and type `%s'\n, mode_string(var), var-name, var-type-name, existing-type-name); return; + } else { + /* The global is declared with the same type name but the type + * declarations mismatch (e.g. the same struct type name, but + * the actual struct declarations mismatch). + */ + linker_error(prog, %s `%s' declared with mismatching definitions + of type `%s'\n, + mode_string(var), var-name, var-type-name); + return; } } } @@ -1121,6 +1130,29 @@ cross_validate_globals(struct gl_shader_program *prog, mode_string(var), var-name); return; } +/* From the GLSL ES3 spec, 4.5.3 Precision qualifiers: + * + * The same uniform declared in different shaders that are linked + * together must have the same precision qualification. + * + * In the GLSL ES2 spec this was resolved in the issue amendments + * (10.3 Precision Qualifiers). The GLSL ES1 spec overlooked this, + * but seems like an obvious error since we can only have one + * consistent definition of a global. + * + * The desktop GLSL spec does not include this reference + * because precision qualifiers are ignored. We will never + * hit this scenario in desktop GLSL though because we always set + * the precision of variables to GLSL_PRECISION_NONE. + */ +if (var-data.mode == ir_var_uniform) { + if (existing-data.precision != var-data.precision)
Re: [Mesa-dev] [PATCH] glsl: Initialize parse-state in constructor of lower_subroutine.
Reviewed-by: Samuel Iglesias Gonsálvez sigles...@igalia.com On Wed, 2015-07-29 at 10:53 -0700, Matt Turner wrote: Static analysis tools don't like partial object initializations. --- src/glsl/lower_subroutine.cpp | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/src/glsl/lower_subroutine.cpp b/src/glsl/lower_subroutine.cpp index e45ccfe..b29912a 100644 --- a/src/glsl/lower_subroutine.cpp +++ b/src/glsl/lower_subroutine.cpp @@ -37,7 +37,8 @@ namespace { class lower_subroutine_visitor : public ir_hierarchical_visitor { public: - lower_subroutine_visitor() + lower_subroutine_visitor(struct _mesa_glsl_parse_state *state) + : state(state) { this-progress = false; } @@ -52,8 +53,7 @@ public: bool lower_subroutine(exec_list *instructions, struct _mesa_glsl_parse_state *state) { - lower_subroutine_visitor v; - v.state = state; + lower_subroutine_visitor v(state); visit_list_elements(v, instructions); return v.progress; } signature.asc Description: This is a digitally signed message part ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH] st/mesa: fail to draw instead of asserting in transform feedback
if we get a request to take the count from feedback, but there is no buffer to take it from, just draw nothing instead of asserting. This fixes this assert killing the ogl conform, and a piglit test I've sent. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_cb_xformfb.c | 6 -- src/mesa/state_tracker/st_cb_xformfb.h | 2 +- src/mesa/state_tracker/st_draw.c | 3 ++- 3 files changed, 7 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c index 07c118e..0708e68 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.c +++ b/src/mesa/state_tracker/st_cb_xformfb.c @@ -191,7 +191,6 @@ st_transform_feedback_get_draw_target(struct gl_transform_feedback_object *obj) } } - assert(0); return NULL; } @@ -211,14 +210,17 @@ st_end_transform_feedback(struct gl_context *ctx, } -void +bool st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, struct pipe_draw_info *out) { struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); + if (sobj-draw_count == NULL) + return false; out-count_from_stream_output = sobj-draw_count; + return true; } diff --git a/src/mesa/state_tracker/st_cb_xformfb.h b/src/mesa/state_tracker/st_cb_xformfb.h index 998c418..fb50ded 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.h +++ b/src/mesa/state_tracker/st_cb_xformfb.h @@ -38,7 +38,7 @@ struct pipe_draw_info; extern void st_init_xformfb_functions(struct dd_function_table *functions); -extern void +extern bool st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, struct pipe_draw_info *out); diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c index 66b2f83..bae8096 100644 --- a/src/mesa/state_tracker/st_draw.c +++ b/src/mesa/state_tracker/st_draw.c @@ -242,7 +242,8 @@ st_draw_vbo(struct gl_context *ctx, /* Transform feedback drawing is always non-indexed. */ /* Set info.count_from_stream_output. */ if (tfb_vertcount) { - st_transform_feedback_draw_init(tfb_vertcount, info); + if (st_transform_feedback_draw_init(tfb_vertcount, info) == false) +return; } } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] st/mesa: fail to draw instead of asserting in transform feedback
It's not a failure, so it can't _fail_ to draw. It should be interpreted such that the drawing succeeds with the vertex count given by the transform feedback object, which is 0. Reviewed-by: Marek Olšák marek.ol...@amd.com Marek On Thu, Jul 30, 2015 at 12:46 PM, Dave Airlie airl...@gmail.com wrote: if we get a request to take the count from feedback, but there is no buffer to take it from, just draw nothing instead of asserting. This fixes this assert killing the ogl conform, and a piglit test I've sent. Signed-off-by: Dave Airlie airl...@redhat.com --- src/mesa/state_tracker/st_cb_xformfb.c | 6 -- src/mesa/state_tracker/st_cb_xformfb.h | 2 +- src/mesa/state_tracker/st_draw.c | 3 ++- 3 files changed, 7 insertions(+), 4 deletions(-) diff --git a/src/mesa/state_tracker/st_cb_xformfb.c b/src/mesa/state_tracker/st_cb_xformfb.c index 07c118e..0708e68 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.c +++ b/src/mesa/state_tracker/st_cb_xformfb.c @@ -191,7 +191,6 @@ st_transform_feedback_get_draw_target(struct gl_transform_feedback_object *obj) } } - assert(0); return NULL; } @@ -211,14 +210,17 @@ st_end_transform_feedback(struct gl_context *ctx, } -void +bool st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, struct pipe_draw_info *out) { struct st_transform_feedback_object *sobj = st_transform_feedback_object(obj); + if (sobj-draw_count == NULL) + return false; out-count_from_stream_output = sobj-draw_count; + return true; } diff --git a/src/mesa/state_tracker/st_cb_xformfb.h b/src/mesa/state_tracker/st_cb_xformfb.h index 998c418..fb50ded 100644 --- a/src/mesa/state_tracker/st_cb_xformfb.h +++ b/src/mesa/state_tracker/st_cb_xformfb.h @@ -38,7 +38,7 @@ struct pipe_draw_info; extern void st_init_xformfb_functions(struct dd_function_table *functions); -extern void +extern bool st_transform_feedback_draw_init(struct gl_transform_feedback_object *obj, struct pipe_draw_info *out); diff --git a/src/mesa/state_tracker/st_draw.c b/src/mesa/state_tracker/st_draw.c index 66b2f83..bae8096 100644 --- a/src/mesa/state_tracker/st_draw.c +++ b/src/mesa/state_tracker/st_draw.c @@ -242,7 +242,8 @@ st_draw_vbo(struct gl_context *ctx, /* Transform feedback drawing is always non-indexed. */ /* Set info.count_from_stream_output. */ if (tfb_vertcount) { - st_transform_feedback_draw_init(tfb_vertcount, info); + if (st_transform_feedback_draw_init(tfb_vertcount, info) == false) +return; } } -- 2.4.3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: support arrays of structures in GetProgramResourceIndex
On Thu, 2015-07-30 at 19:40 +1000, Timothy Arceri wrote: On Thu, 2015-07-30 at 09:44 +0300, Tapani Pälli wrote: This fixes a lot of failing tests in: ES31-CTS.program_interface_query.uniform-types Signed-off-by: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/program_resource.c | 7 +-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index 641ef22..6c0a923 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -212,13 +212,16 @@ valid_program_resource_index_name(const GLchar *name) { const char *array = strstr(name, [); const char *close = strrchr(name, ']'); + const char *after = close ? close + 1 : NULL; /* Not array, no need for the check. */ if (!array) return true; - /* Last array index has to be zero. */ - if (!close || *--close != '0') + /* Last array index has to be zero, for the exception when +* dealing with arrays of structures. +*/ + if (!close || (*--close != '0' *after != '.')) return false; return true; It this intended to go in the stable branch? Hit send too early. If this is intended to land in the stable branch also then: Reviewed-by: Timothy Arceri t_arc...@yahoo.com.au also we will need to Cc stable on both patches. Otherwise I'd rather just land my cleanup unless you see any issues with it. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 01/20] mesa: fix and simplify resource query for arrays
On Thu, 2015-07-30 at 11:07 +0300, Tapani Pälli wrote: On 07/30/2015 08:56 AM, Tapani Pälli wrote: On 07/30/2015 08:50 AM, Timothy Arceri wrote: On 30 July 2015 3:27:41 pm AEST, Tapani Pälli tapani.pa...@intel.com wrote: On 07/30/2015 08:25 AM, Tapani Pälli wrote: Hi Timothy; Would it be OK for you to wait a bit with these 3 first patches? I'm currently going through failing PIQ tests and I have a 1 line fix to valid_program_resource_index_name() that together with your patch glsl: set stage flag for structs and arrays in resource list will fix all of the failing ES31-CTS.program_interface_query.uniform tests. Frankly I don't see the value of such a big refactor necessary for this. The value is in removing almost 200 lines of complicated code trying to solve the same problem in different ways. How its this not a good thing? Also once this is done we can also work towards removing UniformHash from Mesa. Sure, I agree that removing lots of code is a good thing. I just don't quite yet understand how some of this will work, many of the removed checks were there for a good reason (faling tests!). Just to make sure, did you run Piglit and conformance without regressions? Anyway, I will read through your changes and think about this more. Replying myself, no regressions observed :) Yep :) Tested as you can see. I didn't expect to fix anything when I made the change it was just a tidy up before getting the AoA tests to pass. The fixes were a nice bonus. Yep, it seems that parse_program_resource_name() tackles the cases. I will do some further testing, particularly on the 'gl_VertexIDMESA' special case and review the patches during today or tomorrow. Yeah parse_program_resource_name() is a bit more robust than the other functions and by using it we get the index for free :) The old code required the gl_VertexIDMESA special case because it was comparing the name with the the ir var name in get_matching_index(). _mesa_program_resource_name() already handles the special case for us so we can just use the name it returns in _mesa_program_resource_find_name() for our compares. Woops, make it ES31-CTS.program_interface_query.uniform-types, not ES31-CTS.program_interface_query.uniform. It looks to me that most tests fail just because of not figuring out the correct index for the resource and that can be fixed by a valid_program_resource_index_name change. On 07/29/2015 04:56 PM, Timothy Arceri wrote: This removes the need for multiple functions designed to validate an array subscript and replaces them with a call to a single function. The change also means that validation is now only done once and the index is retrived at the same time, as a result the getUniformLocation code can be simplified saving an extra hash table lookup (and yet another validation call). This chage also fixes some tests in: ES31-CTS.program_interface_query.uniform V3: rebase on subroutines, and move the resource index array == 0 check into _mesa_GetProgramResourceIndex() to simplify things further V2: Fix bounds checks for program input/output, split unrelated comment fix and _mesa_get_uniform_location() removal into their own patch. Cc: Tapani Pälli tapani.pa...@intel.com --- src/mesa/main/program_resource.c | 14 ++-- src/mesa/main/shader_query.cpp | 174 +-- src/mesa/main/shaderapi.c| 2 +- src/mesa/main/shaderapi.h| 3 +- src/mesa/main/uniforms.c | 5 +- 5 files changed, 106 insertions(+), 92 deletions(-) diff --git a/src/mesa/main/program_resource.c b/src/mesa/main/program_resource.c index 641ef22..fdbd5b3 100644 --- a/src/mesa/main/program_resource.c +++ b/src/mesa/main/program_resource.c @@ -229,6 +229,7 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, const GLchar *name) { GET_CURRENT_CONTEXT(ctx); + unsigned array_index = 0; struct gl_program_resource *res; struct gl_shader_program *shProg = _mesa_lookup_shader_program_err(ctx, program, @@ -268,13 +269,10 @@ _mesa_GetProgramResourceIndex(GLuint program, GLenum programInterface, case GL_PROGRAM_OUTPUT: case GL_UNIFORM: case GL_TRANSFORM_FEEDBACK_VARYING: - /* Validate name syntax for array variables */ - if (!valid_program_resource_index_name(name)) - return GL_INVALID_INDEX; - /* fall-through */ case GL_UNIFORM_BLOCK: - res =