Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds
Reviewed-by: Antia Puentes On jue, 2017-01-26 at 00:47 -0500, Ilia Mirkin wrote: > When a texture is immutable, we can't tack on extra levels > after-the-fact like we could with glTexImage. So check against that > level limit and return an error if it's surpassed. > > The spec is a little unclear in that it says to check if "level is > not a > supported texture level", however that is never fully defined. > > This fixes: > GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_level_number > > Signed-off-by: Ilia Mirkin > --- > > v1 -> v2: use NumLevels instead of _MaxLevel. > > Not sure why this isn't showing up as failing in the Intel CI, but it > was > definitely failing here. > > src/mesa/main/fbobject.c | 22 ++ > 1 file changed, 14 insertions(+), 8 deletions(-) > > diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c > index 6934805..6e86248 100644 > --- a/src/mesa/main/fbobject.c > +++ b/src/mesa/main/fbobject.c > @@ -3128,16 +3128,22 @@ check_layer(struct gl_context *ctx, GLenum > target, GLint layer, > * \return true if no errors, false if errors > */ > static bool > -check_level(struct gl_context *ctx, GLenum target, GLint level, > -const char *caller) > +check_level(struct gl_context *ctx, const struct gl_texture_object > *texObj, > +GLint level, const char *caller) > { > if ((level < 0) || > - (level >= _mesa_max_texture_levels(ctx, target))) { > + (level >= _mesa_max_texture_levels(ctx, texObj->Target))) { > _mesa_error(ctx, GL_INVALID_VALUE, > "%s(invalid level %d)", caller, level); > return false; > } > > + if (texObj->Immutable && level >= texObj->NumLevels) { > + _mesa_error(ctx, GL_INVALID_VALUE, > + "%s(level %u >= %u)", caller, level, texObj- > >NumLevels); > + return false; > + } > + > return true; > } > > @@ -3260,7 +3266,7 @@ framebuffer_texture_with_dims(int dims, GLenum > target, > if ((dims == 3) && !check_layer(ctx, texObj->Target, layer, > caller)) > return; > > - if (!check_level(ctx, textarget, level, caller)) > + if (!check_level(ctx, texObj, level, caller)) > return; > } > > @@ -3328,7 +3334,7 @@ _mesa_FramebufferTextureLayer(GLenum target, > GLenum attachment, > if (!check_layer(ctx, texObj->Target, layer, func)) > return; > > - if (!check_level(ctx, texObj->Target, level, func)) > + if (!check_level(ctx, texObj, level, func)) > return; > > if (texObj->Target == GL_TEXTURE_CUBE_MAP) { > @@ -3370,7 +3376,7 @@ _mesa_NamedFramebufferTextureLayer(GLuint > framebuffer, GLenum attachment, > if (!check_layer(ctx, texObj->Target, layer, func)) > return; > > - if (!check_level(ctx, texObj->Target, level, func)) > + if (!check_level(ctx, texObj, level, func)) > return; > > if (texObj->Target == GL_TEXTURE_CUBE_MAP) { > @@ -3419,7 +3425,7 @@ _mesa_FramebufferTexture(GLenum target, GLenum > attachment, > if (!check_layered_texture_target(ctx, texObj->Target, func, > &layered)) > return; > > - if (!check_level(ctx, texObj->Target, level, func)) > + if (!check_level(ctx, texObj, level, func)) > return; > } > > @@ -3459,7 +3465,7 @@ _mesa_NamedFramebufferTexture(GLuint > framebuffer, GLenum attachment, > &layered)) > return; > > - if (!check_level(ctx, texObj->Target, level, func)) > + if (!check_level(ctx, texObj, level, func)) > return; > } > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds
On mié, 2017-02-01 at 19:18 +0100, Antía Puentes wrote: > On mié, 2017-02-01 at 00:37 -0500, Ilia Mirkin wrote: > > On Tue, Jan 31, 2017 at 2:55 PM, Antía Puentes wrote: > > > > > > There was an update in the OpenGL ES 3.2 specification (November 3, > > > 2016) addressing this: > > > > > > - In section "9.2.8 Attaching Texture Images to a Framebuffer", > > > FramebufferTexture2D (page 241) and FramebufferTextureLayer (page 242) > > > descriptions: > > > > > > " specifies the mipmap level of the texture image to be attached > > > to the framebuffer, and must satisfy the following conditions: > > > > > > • If texture refers to an immutable-format texture, level must be > > > greater than or equal to zero and smaller than the value of > > > TEXTURE_IMMUTABLE_LEVELS for texture." > > > > > > https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15946 > > Interesting, OK. And the test that's failing is a "glesext" test, even > > though it's being run in a GL context. Does that bug indicate whether > > such a change also applies to desktop GL? (I don't have access to the > > bug tracker.) > It does not mention anything about desktop GL. I have filled a new bug > in OpenGL 4.5 asking for clarification: > https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16214 Update on the spec bug [*]. It was agreed to adopt the OpenGL ES 3.2 behaviour in OpenGL, that is to return GL_INVALID_VALUE when attempting to attach a non-existent level of an immutable texture to an FBO. [*] https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16214#c1 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Clamp GetUniformuiv values to be >= 0.
On lun, 2016-12-12 at 10:43 +0100, Nicolai Hähnle wrote: > On 12.12.2016 00:25, Kenneth Graunke wrote: > > > > Section 2.2.2 (Data Conversions For State Query Commands) of the > > OpenGL 4.5 October 24th 2016 specification says: > > > > "If a command returning unsigned integer data is called, such as > > GetSamplerParameterIuiv, negative values are clamped to zero." > > > > Fixes GL44-CTS.gpu_shader_fp64.state_query. > > > > Signed-off-by: Kenneth Graunke > > --- > > src/mesa/main/uniform_query.cpp | 48 > > + > > 1 file changed, 39 insertions(+), 9 deletions(-) > > > > Hey Nicolai, > > > > I wrote a similar patch a while back, but never got around to > > sending it, > > since I realized that the gl45release branch expects our current > > behavior, > > and the change to make the CTS expect clamping is only on the > > master branch. > > > > Apparently I made some additional changes, compared to yours. I > > figured > > I'd send this along and let you see if you think any of my extra > > changes > > are still necessary. If so, feel free to fold them into your > > patch. > > > > I also think we need to fix several other glGet* commands...it's > > just that > > this is the only one currently tested. A bunch work because the > > values > > returned can't be negative. > I think your patch is a strict superset of what mine does and should > be > used instead. I do have one comment below, with that fixed it has my > R-b. This patch was never pushed, was it? and GL45-CTS.gpu_shader_fp64.state_query fails in the new vk-gl-cts repository because it expects these negative values to be clamped. > There is the more general question of how to cope with those cases > where > the CTS requires non-standard behavior. I think we should insist on > doing the right thing in Mesa, and push for changes to the CTS. > > Until quite recently, I've been occupied by radeonsi- and > Gallium-specific bugs, but that's changing and I'm looking into > using > CTS master rather than back-porting fixes to the dead gl45release > branch > (hence this patch). > > > > > > > --Ken > > > > diff --git a/src/mesa/main/uniform_query.cpp > > b/src/mesa/main/uniform_query.cpp > > index db700df..f05a29f 100644 > > --- a/src/mesa/main/uniform_query.cpp > > +++ b/src/mesa/main/uniform_query.cpp > > @@ -347,14 +347,10 @@ _mesa_get_uniform(struct gl_context *ctx, > > GLuint program, GLint location, > > * just memcpy the data. If the types are not compatible, > > perform a > > * slower convert-and-copy process. > > */ > > - if (returnType == uni->type->base_type > > - || ((returnType == GLSL_TYPE_INT > > - || returnType == GLSL_TYPE_UINT) > > - && > > - (uni->type->base_type == GLSL_TYPE_INT > > - || uni->type->base_type == GLSL_TYPE_UINT > > - || uni->type->base_type == GLSL_TYPE_SAMPLER > > - || uni->type->base_type == GLSL_TYPE_IMAGE))) { > > + if (returnType == uni->type->base_type || > > + ((returnType == GLSL_TYPE_INT || returnType == > > GLSL_TYPE_UINT) && > > + (uni->type->base_type == GLSL_TYPE_SAMPLER || > > +uni->type->base_type == GLSL_TYPE_IMAGE))) { > > memcpy(paramsOut, src, bytes); > > } else { > > union gl_constant_value *const dst = > > @@ -422,7 +418,6 @@ _mesa_get_uniform(struct gl_context *ctx, > > GLuint program, GLint location, > > } > > break; > > case GLSL_TYPE_INT: > > - case GLSL_TYPE_UINT: > > switch (uni->type->base_type) { > > case GLSL_TYPE_FLOAT: > > /* While the GL 3.2 core spec doesn't explicitly > > @@ -447,6 +442,9 @@ _mesa_get_uniform(struct gl_context *ctx, > > GLuint program, GLint location, > > case GLSL_TYPE_BOOL: > > dst[didx].i = src[sidx].i ? 1 : 0; > > break; > > + case GLSL_TYPE_UINT: > > + dst[didx].i = src[sidx].i; > I think this should be > > dst[didx].i = MIN2(src[sidx].u, INT_MAX); > > Cheers, > Nicolai > > > > > + break; > > case GLSL_TYPE_DOUBLE: { > > double tmp; > > memcpy(&tmp, &src[sidx].f, sizeof(tmp)); > > @@ -458,6 +456,38 @@ _mesa_get_uniform(struct gl_context *ctx, > > GLuint program, GLint location, > > break; > > } > > break; > > +case GLSL_TYPE_UINT: > > + switch (uni->type->base_type) { > > + case GLSL_TYPE_FLOAT: > > + /* The spec isn't terribly clear how to handle > > negative > > + * values with an unsigned return type. > > + * > > + * GL 4.5 section 2.2.2 ("Data Conversions for > > State > > + * Query Commands") says: > > + * > > + * "If a value is so large in magnit
Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds
Hi Ilia, On mié, 2017-02-01 at 00:37 -0500, Ilia Mirkin wrote: > On Tue, Jan 31, 2017 at 2:55 PM, Antía Puentes wrote: > > > > There was an update in the OpenGL ES 3.2 specification (November 3, > > 2016) addressing this: > > > > - In section "9.2.8 Attaching Texture Images to a Framebuffer", > > FramebufferTexture2D (page 241) and FramebufferTextureLayer (page 242) > > descriptions: > > > > " specifies the mipmap level of the texture image to be attached > > to the framebuffer, and must satisfy the following conditions: > > > > • If texture refers to an immutable-format texture, level must be > > greater than or equal to zero and smaller than the value of > > TEXTURE_IMMUTABLE_LEVELS for texture." > > > > https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15946 > Interesting, OK. And the test that's failing is a "glesext" test, even > though it's being run in a GL context. Does that bug indicate whether > such a change also applies to desktop GL? (I don't have access to the > bug tracker.) It does not mention anything about desktop GL. I have filled a new bug in OpenGL 4.5 asking for clarification: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16214 For what it is worth, I have a patch similar to yours but using ImmutableLevels in the condition check instead of NumLevels: (texObj->Immutable && level >= texObj->ImmutableLevels) ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds
On mar, 2017-01-31 at 12:09 +0100, Nicolai Hähnle wrote: > On 30.01.2017 19:09, Ilia Mirkin wrote: > > > > On Mon, Jan 30, 2017 at 1:06 PM, Ilia Mirkin wrote: > > > > > > On Mon, Jan 30, 2017 at 12:26 PM, Nicolai Hähnle wrote: > > > > > > > > On 30.01.2017 18:23, Ilia Mirkin wrote: > > > > > > > > > > > > > > > On Mon, Jan 30, 2017 at 4:36 AM, Ilia Mirkin wrote: > > > > > > > > > > > > > > > > > > On Mon, Jan 30, 2017 at 4:33 AM, Nicolai Hähnle > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > On 26.01.2017 06:47, Ilia Mirkin wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > When a texture is immutable, we can't tack on extra levels > > > > > > > > after-the-fact like we could with glTexImage. So check against > > > > > > > > that > > > > > > > > level limit and return an error if it's surpassed. > > > > > > > > > > > > > > > > The spec is a little unclear in that it says to check if "level > > > > > > > > is not > > > > > > > > a > > > > > > > > supported texture level", however that is never fully defined. > > > > > > > > > > > > > > > > This fixes: > > > > > > > > GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_level_number > > > > > > > > > > > > > > > > Signed-off-by: Ilia Mirkin > > > > > > > > --- > > > > > > > > > > > > > > > > v1 -> v2: use NumLevels instead of _MaxLevel. > > > > > > > > > > > > > > > > Not sure why this isn't showing up as failing in the Intel CI, > > > > > > > > but it > > > > > > > > was > > > > > > > > definitely failing here. > > > > > > > > > > > > > > > > > > > > > Maybe the Intel CI is running the GLCTS based on the last 4.5 > > > > > > > release, > > > > > > > and I > > > > > > > guess you're running off what's been published on Github? The > > > > > > > GLCTS on > > > > > > > Github has a bunch of new and possibly broken tests, and may > > > > > > > still have > > > > > > > a > > > > > > > number of regressions as well (since a lot of code was moved > > > > > > > around). > > > > > > > > > > > > > > Can you point out which specific place of the spec you're talking > > > > > > > about > > > > > > > in > > > > > > > your comment? > > > > > > > > > > > > One of the errors listed for glFramebufferTexture is: > > > > > > > > > > > > """ > > > > > > An INVALID_VALUE error is generated if texture is not zero and is > > > > > > not the > > > > > > name of a texture object, or if level is not a supported texture > > > > > > level > > > > > > for texture > > > > > > """ > > > > > > > > > > Curiously for glFramebufferTexture1D/2D/3D, it also says: > > > > > > > > > > """ > > > > > If textarget is TEXTURE_RECTANGLE or TEXTURE_2D_MULTISAMPLE, then > > > > > level must be zero. If textarget is TEXTURE_3D, then level must be > > > > > greater than or equal to zero and less than or equal to log2 of the > > > > > value of MAX_3D_TEXTURE_- SIZE. If textarget is one of the cube map > > > > > face targets from table 8.19, then level must be greater than or equal > > > > > to zero and less than or equal to log2 of the value of > > > > > MAX_CUBE_MAP_TEXTURE_SIZE. For all other values of textarget, level > > > > > must be greater than or equal to zero and no larger than log2 of the > > > > > value of MAX_- TEXTURE_SIZE. > > > > > """ > > > > > > > > > > which matches the current code. I guess this patch is withdrawn... > > > > > > > > What a coincidence, I've been staring at this for a while now and been > > > > getting increasingly confused. Which function does the CTS actually test > > > > (and possibly fail incorrectly)? Maybe it's time to open an issue > > > > against > > > > the CTS. > > > https://github.com/KhronosGroup/VK-GL-CTS/blob/c9921995d8d360bd34d8672194d7c095bb376f82/external/openglcts/modules/glesext/geometry_shader/esextcGeometryShaderLayeredFBO.cpp#L1062 > > A thought occurred to me... whereas glFramebufferTexture1D/etc talk > > about binding points, glFramebufferTexture talks about a specific > > texture object. When are the bindings resolved? at > > glFramebufferTexture2D time, or at draw time? If the latter, then the > > spec has no choice but to just check the maxima for the binding point > > ones... > The problem is that in all cases, the error table lists INVALID_ENUM > when "level is not a supported texture level for $foo", but it's not > explicitly stated how "supported texture level" is defined for > FramebufferTexture. > > However, I did a bit more searching, and there's language below > FramebufferTextureLayer which matches that of FramebufferTexture1D/2D/3D: > > "If texture is a three-dimensional texture, then level must be greater > than or equal to zero and less than or equal to log 2 of the value of > MAX_3D_TEXTURE_SIZE. If texture is a two-dimensional array texture, then > level must be greater than or equal to zero and no larger than log 2 of > the value of MAX_TEXTURE_SIZE." > > It would be odd to have different requirements for FramebufferTexture > and Framebuffer
Re: [Mesa-dev] [PATCH] i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs
On lun, 2016-10-31 at 14:51 -0700, Kenneth Graunke wrote: > On Monday, October 31, 2016 6:22:43 PM PDT Antia Puentes wrote: > > > > The emission of vertex attributes corresponding to dvec3 and dvec4 > > vertex shader input variables was not correct when the > > passed > > to the VertexAttribL* commands was <= 2. > > > > This was because we were using the vertex array size when emitting > > vertices > > to decide if we uploaded a 64-bit floating point attribute as 1 > > slot (128-bits) > > for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This > > caused problems > > when mapping the input variables to registers because, for deciding > > which > > registers contain the values uploaded for a certain variable, we > > use the size > > and type given to the variable in the shader, so we will be > > assigning 256-bits > > to dvec3/4 variables, even if we only uploaded 128-bits for them, > > which happened > > when the vertex array size was <= 2. > > > > The patch uses the shader information to only emit as 128-bits > > those 64-bit floating > > point variables that were declared as double or dvec2 in the vertex > > shader. Dvec3 and > > dvec4 variables will be always uploaded as 256-bits, independently > > of the given > > to the VertexAttribL* command. > > > > From the ARB_vertex_attrib_64bit specification: > > > > "For the 64-bit double precision types listed in Table X.1, no > > default > > attribute values are provided if the values of the vertex > > attribute variable > > are specified with fewer components than required for the > > attribute > > variable. For example, the fourth component of a variable of > > type dvec4 > > will be undefined if specified using VertexAttribL3dv or using > > a vertex > > array specified with VertexAttribLPointer and a size of three." > > > > We are filling these unspecified components with zeros, which > > coincidentally is > > also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1 > > expects. > > > > Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287 > > --- > > src/mesa/drivers/dri/i965/brw_compiler.h | 1 + > > src/mesa/drivers/dri/i965/brw_context.h | 2 +- > > src/mesa/drivers/dri/i965/brw_draw_upload.c | 4 +++- > > src/mesa/drivers/dri/i965/brw_vs.c | 1 + > > src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 -- > > -- > > 5 files changed, 21 insertions(+), 22 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h > > b/src/mesa/drivers/dri/i965/brw_compiler.h > > index 819c7d6..c2400f9 100644 > > --- a/src/mesa/drivers/dri/i965/brw_compiler.h > > +++ b/src/mesa/drivers/dri/i965/brw_compiler.h > > @@ -641,6 +641,7 @@ struct brw_vs_prog_data { > > struct brw_vue_prog_data base; > > > > GLbitfield64 inputs_read; > > + GLbitfield64 double_inputs_read; > > > > unsigned nr_attributes; > > unsigned nr_attribute_slots; > > diff --git a/src/mesa/drivers/dri/i965/brw_context.h > > b/src/mesa/drivers/dri/i965/brw_context.h > > index 308ba99..310372a 100644 > > --- a/src/mesa/drivers/dri/i965/brw_context.h > > +++ b/src/mesa/drivers/dri/i965/brw_context.h > > @@ -535,7 +535,7 @@ struct brw_vertex_element { > > const struct gl_vertex_array *glarray; > > > > int buffer; > > - > > + bool is_dual_slot; > > /** Offset of the first element within the buffer object */ > > unsigned int offset; > > }; > > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c > > b/src/mesa/drivers/dri/i965/brw_draw_upload.c > > index da13e7a..19c8065 100644 > > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c > > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c > > @@ -472,7 +472,9 @@ brw_prepare_vertices(struct brw_context *brw) > > while (vs_inputs) { > > GLuint index = ffsll(vs_inputs) - 1; > > struct brw_vertex_element *input = &brw->vb.inputs[index]; > > - > > + input->is_dual_slot = > > + (brw->gen >= 8 && _mesa_bitcount_64(vs_prog_data- > > >double_inputs_read & > > + BITFIELD64_BIT(index) > > )); > Bitcount of a single bit? Couldn't you do: > > input->is_dual_slot = brw->gen >= 8 && > (vs_prog_data->double_inputs_read & BITFIELD64_BIT(index)) > != 0; Will do. Thanks for the review! > With that fixed, this is: > Reviewed-by: Kenneth Graunke > > Thanks for tracking this down and fixing it! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v4] i965: Fix calculation of the image height at start level
Ping. This patch fixes a significant amount of CTS tests and it will be nice to have it reviewed. On sáb, 2016-09-03 at 03:04 +0200, Antia Puentes wrote: > - Fixes CTS tests: > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-float > * GL44-CTS.shader_image_size.advanced-nonMS-cs-int > * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint > * GL44-CTS.shader_image_size.advanced-nonMS-gs-float > * GL44-CTS.shader_image_size.advanced-nonMS-gs-int > * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint > * GL44-CTS.shader_image_size.advanced-nonMS-tes-float > * GL44-CTS.shader_image_size.advanced-nonMS-tes-int > * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint > * GL44-CTS.shader_image_size.advanced-nonMS-vs-float > * GL44-CTS.shader_image_size.advanced-nonMS-vs-int > * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint > > v1: (written by Dave Airlie) Always shift height images for levels. > Fixed the CTS test. > > v2: Only shift height if the texture is not an 1D_ARRAY, > it fixes assertion in GL44-CTS.texture_view.gettexparameter > due to the original patch (Antia). > > v3: Remove the loop. Do not shift height either for 1D textures. > Use an explicit switch and add an assertion (levels == 0) for > multisampled textures (Jason). > > v4: Rectangle textures can not have levels either (Ilia Mirkin). > > Signed-off-by: Dave Airlie > Signed-off-by: Antia Puentes > --- > src/mesa/drivers/dri/i965/intel_tex_image.c | 27 > +-- > 1 file changed, 21 insertions(+), 6 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c > b/src/mesa/drivers/dri/i965/intel_tex_image.c > index 7affe08..65962eb 100644 > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c > @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct > brw_context *brw, > DBG("%s\n", __func__); > > /* Figure out image dimensions at start level. */ > - for (i = intelImage->base.Base.Level; i > 0; i--) { > - width <<= 1; > - if (height != 1) > - height <<= 1; > - if (intelObj->base.Target == GL_TEXTURE_3D) > - depth <<= 1; > + switch(intelObj->base.Target) { > + case GL_TEXTURE_2D_MULTISAMPLE: > + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: > + case GL_TEXTURE_RECTANGLE: > + assert(intelImage->base.Base.Level == 0); > + break; > + case GL_TEXTURE_3D: > + depth <<= intelImage->base.Base.Level; > + /* Fall through */ > + case GL_TEXTURE_2D: > + case GL_TEXTURE_2D_ARRAY: > + case GL_TEXTURE_CUBE_MAP: > + case GL_TEXTURE_CUBE_MAP_ARRAY: > + height <<= intelImage->base.Base.Level; > + /* Fall through */ > + case GL_TEXTURE_1D: > + case GL_TEXTURE_1D_ARRAY: > + width <<= intelImage->base.Base.Level; > + break; > + default: > + unreachable("Unexpected target"); > } > > /* Guess a reasonable value for lastLevel. This is probably > going___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of the image height at start level
On sáb, 2016-09-03 at 01:20 +0200, Antía Puentes wrote: > On vie, 2016-09-02 at 17:49 -0400, Ilia Mirkin wrote: > > > > On Fri, Sep 2, 2016 at 5:40 PM, Antia Puentes > > wrote: > > > > > > > > > - Fixes CTS tests: > > > > > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-float > > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-int > > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint > > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-float > > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-int > > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint > > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-float > > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-int > > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint > > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-float > > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-int > > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint > > > > > > v1: (written by Dave Airlie) Always shift height images for > > > levels. > > > Fixed the CTS tests. > > > > > > v2: Only shift height if the texture is not an 1D_ARRAY, > > > it fixes assertion in GL44-CTS.texture_view.gettexparameter > > > due to the original patch (Antia). > > > > > > v3: Remove the loop. Do not shift height either for 1D textures. > > > Use an explicit switch and add an assertion (levels == 0) for > > > multisampled textures (Jason). > > > > > > Signed-off-by: Dave Airlie > > > Signed-off-by: Antia Puentes > > > --- > > > src/mesa/drivers/dri/i965/intel_tex_image.c | 27 > > > +-- > > > 1 file changed, 21 insertions(+), 6 deletions(-) > > > > > > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c > > > b/src/mesa/drivers/dri/i965/intel_tex_image.c > > > index 7affe08..cfcbf3c 100644 > > > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c > > > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c > > > @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct > > > brw_context *brw, > > > DBG("%s\n", __func__); > > > > > > /* Figure out image dimensions at start level. */ > > > - for (i = intelImage->base.Base.Level; i > 0; i--) { > > > - width <<= 1; > > > - if (height != 1) > > > - height <<= 1; > > > - if (intelObj->base.Target == GL_TEXTURE_3D) > > > - depth <<= 1; > > > + switch(intelObj->base.Target) { > > > + case GL_TEXTURE_2D_MULTISAMPLE: > > > + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: > > > + assert(intelImage->base.Base.Level == 0); > > > + break; > > > + case GL_TEXTURE_3D: > > > + depth <<= intelImage->base.Base.Level; > > > + /* Fall through */ > > > + case GL_TEXTURE_2D: > > > + case GL_TEXTURE_2D_ARRAY: > > > + case GL_TEXTURE_RECTANGLE: > > FWIW, GL_TEXTURE_RECTANGLE can't have levels either. IMO you should > > move it to the section above with the MS targets. > True, thanks!. Apart from that, I think I should remove: > > > + case GL_TEXTURE_CUBE_MAP: > and list the cube map faces instead because this is called from > glTexImage*. I will send a new version of the patch after running the > tests. Sorry for the noise. Ok, I was wrong. 500 crashes in Piglit certified it. > > > > > > + case GL_TEXTURE_CUBE_MAP_ARRAY: > > > + height <<= intelImage->base.Base.Level; > > > + /* Fall through */ > > > + case GL_TEXTURE_1D: > > > + case GL_TEXTURE_1D_ARRAY: > > > + width <<= intelImage->base.Base.Level; > > > + break; > > > + default: > > > + unreachable("Unexpected target"); > > > } > > > > > > /* Guess a reasonable value for lastLevel. This is probably > > > going > > > -- > > > 2.7.4 > > > > > > ___ > > > mesa-dev mailing list > > > mesa-dev@lists.freedesktop.org > > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of the image height at start level
On vie, 2016-09-02 at 17:49 -0400, Ilia Mirkin wrote: > On Fri, Sep 2, 2016 at 5:40 PM, Antia Puentes > wrote: > > > > - Fixes CTS tests: > > > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-float > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-int > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-float > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-int > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-float > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-int > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-float > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-int > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint > > > > v1: (written by Dave Airlie) Always shift height images for levels. > > Fixed the CTS tests. > > > > v2: Only shift height if the texture is not an 1D_ARRAY, > > it fixes assertion in GL44-CTS.texture_view.gettexparameter > > due to the original patch (Antia). > > > > v3: Remove the loop. Do not shift height either for 1D textures. > > Use an explicit switch and add an assertion (levels == 0) for > > multisampled textures (Jason). > > > > Signed-off-by: Dave Airlie > > Signed-off-by: Antia Puentes > > --- > > src/mesa/drivers/dri/i965/intel_tex_image.c | 27 > > +-- > > 1 file changed, 21 insertions(+), 6 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c > > b/src/mesa/drivers/dri/i965/intel_tex_image.c > > index 7affe08..cfcbf3c 100644 > > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c > > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c > > @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct > > brw_context *brw, > > DBG("%s\n", __func__); > > > > /* Figure out image dimensions at start level. */ > > - for (i = intelImage->base.Base.Level; i > 0; i--) { > > - width <<= 1; > > - if (height != 1) > > - height <<= 1; > > - if (intelObj->base.Target == GL_TEXTURE_3D) > > - depth <<= 1; > > + switch(intelObj->base.Target) { > > + case GL_TEXTURE_2D_MULTISAMPLE: > > + case GL_TEXTURE_2D_MULTISAMPLE_ARRAY: > > + assert(intelImage->base.Base.Level == 0); > > + break; > > + case GL_TEXTURE_3D: > > + depth <<= intelImage->base.Base.Level; > > + /* Fall through */ > > + case GL_TEXTURE_2D: > > + case GL_TEXTURE_2D_ARRAY: > > + case GL_TEXTURE_RECTANGLE: > FWIW, GL_TEXTURE_RECTANGLE can't have levels either. IMO you should > move it to the section above with the MS targets. True, thanks!. Apart from that, I think I should remove: > > + case GL_TEXTURE_CUBE_MAP: and list the cube map faces instead because this is called from glTexImage*. I will send a new version of the patch after running the tests. Sorry for the noise. > > + case GL_TEXTURE_CUBE_MAP_ARRAY: > > + height <<= intelImage->base.Base.Level; > > + /* Fall through */ > > + case GL_TEXTURE_1D: > > + case GL_TEXTURE_1D_ARRAY: > > + width <<= intelImage->base.Base.Level; > > + break; > > + default: > > + unreachable("Unexpected target"); > > } > > > > /* Guess a reasonable value for lastLevel. This is probably > > going > > -- > > 2.7.4 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: report bound buffer size not underlying buffer size for image size (v2)
Hi Dave, The "GL44-CTS.shader_image_size.advanced-nonMS-fs-int" test fails for me with current master (which contains this patch), I have tested it both in a Broadwell and a Skylake machine. I recall that you sent to the mailing list a second patch "i965: don't fail to shift height images for levels." related to the test. Is this patch or a new version of it still needed?. I remember that as it is it regressed GL44-CTS.texture_view.gettexparameter, raising the assertion: "glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion `height0 = 1' failed." On a side note, as we are taking about CTS, just remind that Andres has reviewed the "mesa/subroutines: start adding per-context subroutine index support" series that you sent, which is not pushed yet. Regards. On mar, 2016-08-23 at 11:10 +1000, Dave Airlie wrote: > ping, esp Igalia CTS ppl. > > Dave. > > On 27 May 2016 at 15:11, Dave Airlie wrote: > > > > From: Dave Airlie > > > > This seems to make sense, the image is bound to a subset of the > > buffer > > so the image size should be from the bound size not the underlying > > object. > > > > This fixes: > > GL44-CTS.shader_image_size.advanced-nonMS-fs-int > > > > v2: get mininum of the two values, same as we write to the hw. > > > > Signed-off-by: Dave Airlie > > --- > > src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > index b73d5d5..ee0b9c2 100644 > > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c > > @@ -1227,10 +1227,11 @@ update_buffer_image_param(struct > > brw_context *brw, > > struct brw_image_param *param) > > { > > struct gl_buffer_object *obj = u->TexObj->BufferObject; > > - > > + uint32_t size; > > update_default_image_param(brw, u, surface_idx, param); > > > > - param->size[0] = obj->Size / _mesa_get_format_bytes(u- > > >_ActualFormat); > > + size = MIN2((uint32_t)u->TexObj->BufferSize, obj->Size); > > + param->size[0] = size / _mesa_get_format_bytes(u- > > >_ActualFormat); > > param->stride[0] = _mesa_get_format_bytes(u->_ActualFormat); > > } > > > > -- > > 2.5.5 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.
The patch looks good to me. Reviewed-by: Antia Puentes On lun, 2016-08-15 at 23:50 -0700, Kenneth Graunke wrote: > GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers: > > 0.0, if E == 0 and M == 0, > 2^-14 * (M / 32), if E == 0 and M != 0, > 2^(E-15) * (1 + M/32),if 0 < E < 31, > INF, if E == 31 and M == 0, or > NaN, if E == 31 and M != 0, > > In the second case (E == 0 and M != 0), we were multiplying the > mantissa > by 2^-20, when we should have been multiplying by 2^-19 (which is > 2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32). > > The previous section defines the formula for 11-bit numbers, which > is: > > 2^-14 * (M / 64), if E == 0 and M != 0, > > In other words, we had accidentally copy and pasted the 11-bit code > to the 10-bit case, and neglected to change the exponent. > > Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles > when run with surface dimensions of 1536x1152 or 1920x1080. > > References: https://code.google.com/p/chrome-os-partner/issues/detail > ?id=56244 > Signed-off-by: Kenneth Graunke > Reviewed-by: Stephane Marchesin > --- > src/util/format_r11g11b10f.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/util/format_r11g11b10f.h > b/src/util/format_r11g11b10f.h > index c9e967c..f6cd4ac 100644 > --- a/src/util/format_r11g11b10f.h > +++ b/src/util/format_r11g11b10f.h > @@ -191,7 +191,7 @@ static inline float uf10_to_f32(uint16_t val) > > if (exponent == 0) { > if (mantissa != 0) { > - const float scale = 1.0 / (1 << 20); > + const float scale = 1.0 / (1 << 19); > f32.f = scale * mantissa; > } > } else if (exponent == 31) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] mesa/compute: move compute checks around for tests.
Hi! I have taken a look to the GL44-CTS.compute_shader.api-indirect test and I agree with Piñeiro that this test is not correct. A call to glUseProgram to set the active program is missing in the test. The test passes invalid values as parameters to glDispatchComputeIndirect like, glDispatchComputeIndirect(-2); and checks if the error returned by the the implementation is the one expected. However, as the test does not set the active program, Mesa returns the INVALID_OPERATION ("no active compute shader") error instead of the one expected by the test but, this can not be considered as a nonconformity because implementations can choose which error generate when several error conditions are met. For that reason the test is invalid unless it sets the active program. Adding the missing glUseProgram call to the test, Mesa returns the expected errors and the test passes in i965. Notice that the lack of an active compute shader when calling to glDispatchComputeIndirect is already tested by a different test: GL44-CTS.compute_shader.api-no-active-program. On jue, 2016-05-05 at 09:51 +0200, Alejandro Piñeiro wrote: > So as far as I understand, on that test there is no active program > and > indirect length is wrong, and fails because it was expecting the > second > error. Is that right? > > Unless Im wrong, when the OpenGL spec specifies the Error cases, it > doesn't specify any kind of priority (which error should be raised > first > if more of one condition happens). > > Im somewhat biased to think that it is a problem with the test. What > would happen if a new test is written to check that INVALID_OPERATION > is > generated when no active program, and it uses a wrong length? IMHO, > if > the test is setting two wrong conditions, it should check for the two > possible errors. > > On 05/05/16 02:41, Dave Airlie wrote: > > > > From: Dave Airlie > > > > This fixes GL43-CTS.compute_shader.api-indirect > > which tests the length/4 before anything else. > > > > Signed-off-by: Dave Airlie > > --- > > src/mesa/main/api_validate.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/src/mesa/main/api_validate.c > > b/src/mesa/main/api_validate.c > > index 688408f..d455f19 100644 > > --- a/src/mesa/main/api_validate.c > > +++ b/src/mesa/main/api_validate.c > > @@ -1109,9 +1109,6 @@ valid_dispatch_indirect(struct gl_context > > *ctx, > > { > > GLintptr end = (GLintptr)indirect + size; > > > > - if (!check_valid_to_compute(ctx, name)) > > - return GL_FALSE; > > - > > /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute > > Shaders: > > * > > * "An INVALID_VALUE error is generated if indirect is negative > > or is not a > > @@ -1153,6 +1150,9 @@ valid_dispatch_indirect(struct gl_context > > *ctx, > > return GL_FALSE; > > } > > > > + if (!check_valid_to_compute(ctx, name)) > > + return GL_FALSE; > > + > > return GL_TRUE; > > } > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/copyimage: report INVALID_VALUE for missing cube array
On jue, 2016-06-02 at 14:25 +1000, Dave Airlie wrote: > From: Dave Airlie > > The specs says INVALID_VALUE for exceeding dimensions, > which is really what is happening here. > > This fixes: > GL45-CTS.copy_image.non_existent_mipmap > > Signed-off-by: Dave Airlie > --- > src/mesa/main/copyimage.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c > index 6aa6bcb..7e5df61 100644 > --- a/src/mesa/main/copyimage.c > +++ b/src/mesa/main/copyimage.c > @@ -180,7 +180,7 @@ prepare_target(struct gl_context *ctx, GLuint > name, GLenum target, > for (i = 0; i < depth; i++) { > if (!texObj->Image[z+i][level]) { > /* missing cube face */ > - _mesa_error(ctx, GL_INVALID_OPERATION, > + _mesa_error(ctx, GL_INVALID_VALUE, > "glCopyImageSubData(missing cube face)"); I would say "... missing cube face" in the commit message as this is the error you are returning here. With this change, Reviewed-by: Antia Puentes > return false; > } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] mesa/shader_query: add missing subroutines cases
LGTM: Reviewed-by: Antia Puentes On vie, 2016-05-06 at 10:22 +1000, Dave Airlie wrote: > From: Dave Airlie > > ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types. > > Fixes: > GL43-CTS.program_interface_query.subroutines-vertex > GL43-CTS.program_interface_query.subroutines-tess-control > GL43-CTS.program_interface_query.subroutines-tess-eval > GL43-CTS.program_interface_query.subroutines-geometry > GL43-CTS.program_interface_query.subroutines-fragment > GL43-CTS.program_interface_query.subroutines-compute > > Signed-off-by: Dave Airlie > --- > src/mesa/main/shader_query.cpp | 13 + > 1 file changed, 13 insertions(+) > > diff --git a/src/mesa/main/shader_query.cpp > b/src/mesa/main/shader_query.cpp > index 020990a..c8c3df4 100644 > --- a/src/mesa/main/shader_query.cpp > +++ b/src/mesa/main/shader_query.cpp > @@ -1125,6 +1125,13 @@ _mesa_program_resource_prop(struct > gl_shader_program *shProg, > switch (res->Type) { > case GL_UNIFORM: > case GL_BUFFER_VARIABLE: > + case GL_VERTEX_SUBROUTINE_UNIFORM: > + case GL_GEOMETRY_SUBROUTINE_UNIFORM: > + case GL_FRAGMENT_SUBROUTINE_UNIFORM: > + case GL_COMPUTE_SUBROUTINE_UNIFORM: > + case GL_TESS_CONTROL_SUBROUTINE_UNIFORM: > + case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM: > + > /* Test if a buffer variable is an array or an unsized > array. > * Unsized arrays return zero as array size. > */ > @@ -1207,6 +1214,12 @@ _mesa_program_resource_prop(struct > gl_shader_program *shProg, > case GL_LOCATION: > switch (res->Type) { > case GL_UNIFORM: > + case GL_VERTEX_SUBROUTINE_UNIFORM: > + case GL_GEOMETRY_SUBROUTINE_UNIFORM: > + case GL_FRAGMENT_SUBROUTINE_UNIFORM: > + case GL_COMPUTE_SUBROUTINE_UNIFORM: > + case GL_TESS_CONTROL_SUBROUTINE_UNIFORM: > + case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM: > case GL_PROGRAM_INPUT: > case GL_PROGRAM_OUTPUT: > *val = program_resource_location(shProg, res, ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: fix block iterator to not forget fxn->end_block
Hi Rob, On jue, 2016-05-05 at 14:40 -0400, Rob Clark wrote: > From: Rob Clark > > With the switch to new block iterator macro, we silently stopped > iterating over the end-block. Which caused nir_index_blocks() to not > index the end-block. Resulting in funny looking nir_print's like: > > impl main { > block block_0: > /* preds: */ > intrinsic copy_var () (gl_FragColor@out-temp, gl_Color) () > intrinsic copy_var () (gl_FragColor, gl_FragColor@out-temp) () > /* succs: block_0 */ > block block_0: > } I had noticed this as well. The patch looks good to me: Reviewed-by: Antia Puentes > I don't *think* there are any more serious consequences of not > iterating > the end_block, but it's probably also a good idea to not subtly > change > behavior compared to the old fxn-callback based iterator. > > Signed-off-by: Rob Clark > --- > src/compiler/nir/nir.c | 19 ++- > 1 file changed, 14 insertions(+), 5 deletions(-) > > diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c > index 867a43c..1f51b31 100644 > --- a/src/compiler/nir/nir.c > +++ b/src/compiler/nir/nir.c > @@ -1512,12 +1512,18 @@ nir_block_cf_tree_next(nir_block *block) > return NULL; > } > > - nir_cf_node *cf_next = nir_cf_node_next(&block->cf_node); > + nir_cf_node *parent = block->cf_node.parent; > + nir_cf_node *cf_next; > + > + if ((parent->type == nir_cf_node_function) && > + (block == nir_cf_node_as_function(parent)->end_block)) > + cf_next = NULL; > + else > + cf_next = nir_cf_node_next(&block->cf_node); > + > if (cf_next) > return nir_cf_node_cf_tree_first(cf_next); > > - nir_cf_node *parent = block->cf_node.parent; > - > switch (parent->type) { > case nir_cf_node_if: { > /* Are we at the end of the if? Go to the beginning of the > else */ > @@ -1532,9 +1538,12 @@ nir_block_cf_tree_next(nir_block *block) > case nir_cf_node_loop: > return nir_cf_node_as_block(nir_cf_node_next(parent)); > > - case nir_cf_node_function: > + case nir_cf_node_function: { > + nir_function_impl *func = nir_cf_node_as_function(parent); > + if (func->end_block != block) > + return func->end_block; > return NULL; > - > + } > default: > unreachable("unknown cf node type"); > } ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/15] i965: Configure how to store *64*PASSTHRU vertex components
Hi Kenneth, thanks for reviewing. On mié, 2016-05-04 at 03:36 -0700, Kenneth Graunke wrote: > On Thursday, April 28, 2016 1:40:32 PM PDT Antia Puentes wrote: > > > > From the Broadwell specification, structure VERTEX_ELEMENT_STATE > > description: > > > > "When SourceElementFormat is set to one of the *64*_PASSTHRU > > formats, 64-bit components are stored in the URB without any > > conversion. In this case, vertex elements must be written as 128 > > or 256 bits, with VFCOMP_STORE_0 being used to pad the output > > as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red > component into > > > > the URB, Component 1 must be specified as VFCOMP_STORE_0 (with > > Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit > > vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0 > > in order to output a 256-bit vertex element. Likewise, use of > > R64G64B64_PASSTHRU requires Component 3 to be specified as > VFCOMP_STORE_0 > > > > in order to output a 256-bit vertex element." > > > > Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for > > dvec3 and dvec4 vertex elements. > > > > Signed-off-by: Juan A. Suarez Romero > > Signed-off-by: Antia Puentes > > --- > > src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 + > +++ > > > > 1 file changed, 34 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/ > drivers/dri/i965/gen8_draw_upload.c > > > > index fe5ed35..14f7091 100644 > > --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c > > +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c > > @@ -217,6 +217,40 @@ gen8_emit_vertices(struct brw_context *brw) > > break; > > } > > > > + /* From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE): > > + * > > + * "When SourceElementFormat is set to one of the *64*_PASSTHRU > > + * formats, 64-bit components are stored in the URB without any > > + * conversion. In this case, vertex elements must be written as > 128 > > > > + * or 256 bits, with VFCOMP_STORE_0 being used to pad the output > > + * as required. E.g., if R64_PASSTHRU is used to copy a 64-bit > Red > > > > + * component into the URB, Component 1 must be specified as > > + * VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE) > > + * in order to output a 128-bit vertex element, or Components > > 1-3 > must > > > > + * be specified as VFCOMP_STORE_0 in order to output a 256-bit > vertex > > > > + * element. Likewise, use of R64G64B64_PASSTHRU requires > Component 3 > > > > + * to be specified as VFCOMP_STORE_0 in order to output a > > 256-bit > vertex > > > > + * element." > > + */ > The above comment seems to indicate that R64 needs component 1 set to > STORE_0, and that looks to be missing. I'm guessing you need to add: > > > > > + if (input->glarray->Doubles) { > > + switch (input->glarray->Size) { > > + case 0: > comp0 = BRW_VE1_COMPONENT_STORE_0; > /* fallthrough */ > > > > > + case 1: > comp1 = BRW_VE1_COMPONENT_STORE_0; > /* fallthrough */ > I have not added them because comp0 and comp1 are initialized in the code immediately above the switch I added. That already existing code sets the default values for the components: switch (input->glarray->Size) { case 0: comp0 = BRW_VE1_COMPONENT_STORE_0; case 1: comp1 = BRW_VE1_COMPONENT_STORE_0; case 2: comp2 = BRW_VE1_COMPONENT_STORE_0; case 3: comp3 = input->glarray->Integer ? BRW_VE1_COMPONENT_STORE_1_INT : BRW_VE1_COMPONENT_STORE_1_FLT; break; } In my switch I just override the values that are not right for the double-precision floating point cases. > > + case 2: > > +/* Use 128-bits instead of 256-bits to write double and dvec2 > > + * vertex elements. > > + */ > > +comp2 = BRW_VE1_COMPONENT_NOSTORE; > > +comp3 = BRW_VE1_COMPONENT_NOSTORE; > > +break; > > + case 3: > > +/* Pad the output using VFCOMP_STORE_0 as suggested > > + * by the BDW PRM. > > + */ > > +comp3 = BRW_VE1_COMPONENT_STORE_0; > > + } > > + } > > + > > OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) | > > GEN6_VE0_VALID | > > (format << BRW_VE0_FORMAT_SHIFT) | > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 02/15] i965: Configure how to store *64*PASSTHRU vertex components
On jue, 2016-04-28 at 15:16 +0200, Ian Romanick wrote: > On 04/28/2016 01:40 PM, Antia Puentes wrote: > > From the Broadwell specification, structure VERTEX_ELEMENT_STATE > > description: > > > >"When SourceElementFormat is set to one of the *64*_PASSTHRU > > formats, 64-bit components are stored in the URB without any > > conversion. In this case, vertex elements must be written as > > 128 > > or 256 bits, with VFCOMP_STORE_0 being used to pad the output > > as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red > > component into > > the URB, Component 1 must be specified as VFCOMP_STORE_0 (with > > Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128 > > -bit > > vertex element, or Components 1-3 must be specified as > > VFCOMP_STORE_0 > > in order to output a 256-bit vertex element. Likewise, use of > > R64G64B64_PASSTHRU requires Component 3 to be specified as > > VFCOMP_STORE_0 > > in order to output a 256-bit vertex element." > > > > Uses 128-bits to write double and dvec2 vertex elements, and 256 > > -bits for > > dvec3 and dvec4 vertex elements. > > > > Signed-off-by: Juan A. Suarez Romero > > Signed-off-by: Antia Puentes > > --- > > src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 > > > > 1 file changed, 34 insertions(+) > > > > diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c > > b/src/mesa/drivers/dri/i965/gen8_draw_upload.c > > index fe5ed35..14f7091 100644 > > --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c > > +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c > > @@ -217,6 +217,40 @@ gen8_emit_vertices(struct brw_context *brw) > > break; > >} > > > > + /* From the BDW PRM, Volume 2d, page 586 > > (VERTEX_ELEMENT_STATE): > > + * > > + * "When SourceElementFormat is set to one of the > > *64*_PASSTHRU > > + * formats, 64-bit components are stored in the URB > > without any > > + * conversion. In this case, vertex elements must be > > written as 128 > > + * or 256 bits, with VFCOMP_STORE_0 being used to pad > > the output > > + * as required. E.g., if R64_PASSTHRU is used to copy a > > 64-bit Red > > + * component into the URB, Component 1 must be specified > > as > > + * VFCOMP_STORE_0 (with Components 2,3 set to > > VFCOMP_NOSTORE) > > + * in order to output a 128-bit vertex element, or > > Components 1-3 must > > + * be specified as VFCOMP_STORE_0 in order to output a > > 256-bit vertex > > + * element. Likewise, use of R64G64B64_PASSTHRU requires > > Component 3 > > + * to be specified as VFCOMP_STORE_0 in order to output > > a 256-bit vertex > > + * element." > > + */ > > + if (input->glarray->Doubles) { > > + switch (input->glarray->Size) { > > + case 0: > > + case 1: > > + case 2: > > +/* Use 128-bits instead of 256-bits to write double > > and dvec2 > > + * vertex elements. > > + */ > > +comp2 = BRW_VE1_COMPONENT_NOSTORE; > > +comp3 = BRW_VE1_COMPONENT_NOSTORE; > > +break; > > + case 3: > > +/* Pad the output using VFCOMP_STORE_0 as suggested > > + * by the BDW PRM. > > + */ > > +comp3 = BRW_VE1_COMPONENT_STORE_0; > > I think this would be better as > > if (input->glarray->Size < 3) { > ... > } else { > ... > } It should be then: if (input->glarray->Size < 3) { ... } else if (input->glarray->Size == 3) { ... } because input->glarray->Size can be 4. > At the very least... there should be a "break" at the end of the 3 > case. > It's not strictly necessary, but my eye keeps noticing that it's not > there. > > > + } > > + } > > + > >OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) | > > GEN6_VE0_VALID | > > (format << BRW_VE0_FORMAT_SHIFT) | > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 15/15] i965: Expose OpenGL 4.1 for gen8+
On jue, 2016-04-28 at 15:11 +0200, Ian Romanick wrote: > I think we should bump all the way to 4.2 because all of the features > specific to 4.2 have been done for quite some time. True. > On 04/28/2016 01:40 PM, Antia Puentes wrote: > > From: Alejandro Piñeiro > > > > ARB_vertex_attrib_64bit was the only feature missing. > > --- > > src/mesa/drivers/dri/i965/intel_extensions.c | 2 +- > > src/mesa/drivers/dri/i965/intel_screen.c | 2 +- > > 2 files changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c > > b/src/mesa/drivers/dri/i965/intel_extensions.c > > index d3905d0..0361d42 100644 > > --- a/src/mesa/drivers/dri/i965/intel_extensions.c > > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c > > @@ -268,7 +268,7 @@ intelInitExtensions(struct gl_context *ctx) > > ctx->Extensions.OES_texture_half_float_linear = true; > > > > if (brw->gen >= 8) > > - ctx->Const.GLSLVersion = 400; > > + ctx->Const.GLSLVersion = 410; > > else if (brw->gen >= 6) > >ctx->Const.GLSLVersion = 330; > > else > > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c > > b/src/mesa/drivers/dri/i965/intel_screen.c > > index 6f685b1..f2ec450 100644 > > --- a/src/mesa/drivers/dri/i965/intel_screen.c > > +++ b/src/mesa/drivers/dri/i965/intel_screen.c > > @@ -1386,7 +1386,7 @@ set_max_gl_versions(struct intel_screen > > *screen) > > switch (screen->devinfo->gen) { > > case 9: > > case 8: > > - psp->max_gl_core_version = 40; > > + psp->max_gl_core_version = 41; > >psp->max_gl_compat_version = 30; > >psp->max_gl_es1_version = 11; > >psp->max_gl_es2_version = 31; > > > > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] nir: Add missing break into switch in construct_value()
On vie, 2016-04-22 at 09:45 +0300, Juha-Pekka Heikkila wrote: > There seemed to be missing one break in nested switchcases. > > Signed-off-by: Juha-Pekka Heikkila > --- > src/compiler/nir/nir_search.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/src/compiler/nir/nir_search.c > b/src/compiler/nir/nir_search.c > index dc53a90..df80c0f 100644 > --- a/src/compiler/nir/nir_search.c > +++ b/src/compiler/nir/nir_search.c > @@ -521,6 +521,7 @@ construct_value(const nir_search_value *value, > default: > unreachable("unknown bit size"); > } > + break; > >case nir_type_bool32: > load->value.u32[0] = c->data.u; Reviewed-by: Antia Puentes ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Patchwork admin rights
On vie, 2015-11-13 at 13:26 +0100, Samuel Iglesias Gonsálvez wrote: > Hello, > > I would like to have admin permissions to Mesa and Piglit projects in > patchwork [0] to change the status of patches that are mine but they > are not assigned to me. As Samuel, I would like to have admin permissions in patchwork for Mesa and Piglit, for the same reasons that he commented. Thanks. > I saw in previous emails than just asking for it here is enough. If I > need to create a bug report in bugzilla, just tell me to do so. > > Thanks, > > Sam > > [0] http://patchwork.freedesktop.org/ ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Intent to work on support for ARB_internalformat_query2
Hi! Chris, On mar, 2015-10-27 at 21:45 +1300, Chris Forbes wrote: > Presumably ARB_internalformat_query2? Eduardo meant ARB_internalformat_query2, indeed. > On Tue, Oct 27, 2015 at 9:31 PM, Eduardo Lima Mitev > wrote: > Hello, > > This is an announcement that a few folks at Igalia team are > planning to > work on adding support for EXT_internalformat_query2 extension > to Mesa. > > If somebody had started work on this already, or has any input > that is > relevant to the implementation, we would be very thankful to > hear about it. > > I just filed a bug [1] to track progress. Feel free to add > your comments > there too. > > That's all! > > cheers, > Eduardo > > [1] https://bugs.freedesktop.org/show_bug.cgi?id=92687 > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Plan to work on opt_peephole_sel optimization for vec4
On lun, 2015-10-12 at 15:55 +0300, Francisco Jerez wrote: > Antía Puentes writes: > > On dom, 2015-10-11 at 10:35 -0700, Matt Turner wrote: > >> I would expect big improvements in the vec4 backend from making its > >> copy propagation pass global. > > > > I will be working on global copy propagation then. > > > > Heh... I suggest you hold yourself from working on this for a while. > > While working on some fragment discard improvements last week I noticed > that a seemingly harmless change in the CFG could cause shitloads of > shader-db regressions (>8k). The reason is that a bunch of optimization > passes of the compiler back-end are fully local (including VEC4 copy > propagation) and become ineffective anytime your dataflow happens to > traverse edges of the CFG. > > Instead of reimplementing the intricate dataflow propagation logic (e.g. > what fs_copy_prop_dataflow does) in every pass (or rather every pass > that needs to consider two or more instructions at once), I've given a > shot at implementing use-def chains in the back-end (didn't have a > bigger hammer at hand). Use-def chains should simplify this sort of > optimization passes massively and allow them to work globally without > any additional effort: Instead of looping back the instruction list to > find out whether the last instruction that wrote a register had some > specific form (c.f. cmod_propagation, saturate_propagation, > compute_to_mrf, which are all local and have O(n^2) complexity), you > just look-up the use entry for the instruction you're interested in and > look at its immediate neighbours in the use-def graph, which represent > the instruction(s) that generated the variable. > > Use-def chains can be implemented trivially on an SSA IR so as an added > benefit it should become easier to port optimization passes to SSA once > they've been rewritten in terms of use-def chains. > > I've got an algorithm to construct the use-def graph basically working > on the FS back-end (I'm not sharing it publicly yet though because in > its current form it would likely offend some people's sensibilities -- I > could send you a branch in private though if you're interested). I > haven't started porting other optimization passes to use it yet, but > it's next on my to-do list. Hi Francisco, that sounds great. Do you intend to do the same for vec4? Otherwise, I can try to port your work to vec4 once you are done with FS. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Plan to work on opt_peephole_sel optimization for vec4
Hi Matt, thanks for your suggestions. On dom, 2015-10-11 at 10:35 -0700, Matt Turner wrote: > I don't believe it's valuable to port the opt_peephole_sel() pass to > the vec4 backend. With NIR (since NIR essentially performs the same > optimization), the opt_peephole_sel() pass only improves code > generation on Gen < 7 (where we use MRFs) in cases like > > (+f0) if > mov m1, ... > mov m2, ... > send ..., m1 > else > mov m1, ... > mov m2, ... > send ..., m1 > endif > > where the two sends are texture operations using the same coordinate > but different samplers. On Gen7+ where send-from-GRF, the source > payload for the two sends will be different registers and > opt_peephole_sel() won't handle it. > > That said, we still do send-from-MRF on all platforms in the vec4 > backend, but texturing is sufficiently rare in vertex shaders that I > don't believe there's much to be gained. Ok. > I believe the most valuable optimization missing from the vec4 backend > is global copy propagation (it currently only has local copy > propagation). *Lots* of shaders would benefit from this even before we > switched to NIR, and even more after. > > While both fs/vec4 backends have opt_copy_propagate(), the fs > backend's implementation is much better. It first performs local (that > is, on each basic block separately) copy propagation. Second, it uses > the values available at the end of each block to generate an analysis > of the dataflow, and finally it reruns local copy propagation but this > time with more data available. > > I would expect big improvements in the vec4 backend from making its > copy propagation pass global. I will be working on global copy propagation then. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Plan to work on opt_peephole_sel optimization for vec4
As there are quite a lot of movement related to vec4 optimizations and I would like to avoid overlapping, I am commenting here that I intend to work on a version of the fs_visitor::opt_peephole_sel() for vec4. Suggestions about other FS optimizations worth porting to the vec4 backend are welcome. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v5 70/70] glsl: Mark as active all elements of shared/std140 block arrays
Hi! Timothy, thanks for your review. Seeing your patch in: http://lists.freedesktop.org/archives/mesa-dev/2015-September/095070.html The condition in my patch: (var->type->interface_packing != GLSL_INTERFACE_PACKING_PACKED) should also be changed to: (var->get_interface_type()->interface_packing != GLSL_INTERFACE_PACKING_PACKED) Right? On mié, 2015-09-23 at 23:44 +1000, Timothy Arceri wrote: > On Wed, 2015-09-23 at 12:25 +0200, Samuel Iglesias Gonsálvez wrote: > > > > On 21/09/15 13:11, Samuel Iglesias Gonsálvez wrote: > > > On 21/09/15 09:41, Tapani Pälli wrote: > > > > Seems like a nice fix, takes ES3 CTS failures from 116 to 64 on > > > > Haswell > > > > (most failing tests are with ubos). > > > > > > > > Tested-by: Tapani Pälli > > > > > > > > > > OK thanks! > > > > > > > This is individual patch not related to just SSBOs, maybe this > > > > could be > > > > pushed separately from the rest? > > > > > > > > > > Yes, this patch can be pushed separately from the rest of patches > > > of the > > > series: we just need an R-b, as usual. > > > > > > We are going to discuss the proper solution with Timothy [0], as he > > > found that we are not covering one case. > > > > > > > Timothy has sent a patch fixing the packed case [0] and he developed > > a > > piglit test for it [1]. > > > > We are going to wait for an R-b of Antía's patch before pushing it. > > I sent a reply to this same email saying almost the same thing but it > seems to have gone missing. > > Anyway I also sent my r-b, this patch looks good to me. > > Reviewed-by: Timothy Arceri > > > > > > Sam > > > > [0] > > http://lists.freedesktop.org/archives/mesa-dev/2015-September/095070. > > html > > [1] http://lists.freedesktop.org/archives/piglit/2015-September/01724 > > 7.html > > > > > Sam > > > > > > [0] https://bugs.freedesktop.org/show_bug.cgi?id=83508 > > > > > > > > > > // Tapani > > > > > > > > On 09/10/2015 04:36 PM, Iago Toral Quiroga wrote: > > > > > From: Antia Puentes > > > > > > > > > > Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' > > > > > blocks > > > > > or block members) considered as active 'shared' and 'std140' > > > > > uniform > > > > > blocks and uniform block arrays, but did not include the block > > > > > array > > > > > elements. Because of that, it was possible to have an active > > > > > uniform > > > > > block array without any elements marked as used, making the > > > > > assertion > > > > > ((b->num_array_elements > 0) == b->type->is_array()) > > > > > in link_uniform_blocks() fail. > > > > > > > > > > Fixes the following 5 dEQP tests: > > > > > > > > > > * dEQP > > > > > -GLES3.functional.ubo.random.nested_structs_instance_arrays.18 > > > > > * dEQP > > > > > -GLES3.functional.ubo.random.nested_structs_instance_arrays.24 > > > > > * > > > > > dEQP > > > > > -GLES3.functional.ubo.random.nested_structs_arrays_instance_arr > > > > > ays.19 > > > > > * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49 > > > > > * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36 > > > > > > > > > > Fixes bugzilla: > > > > > https://bugs.freedesktop.org/show_bug.cgi?id=83508 > > > > > --- > > > > > src/glsl/link_uniform_block_active_visitor.cpp | 23 > > > > > +++ > > > > > 1 file changed, 23 insertions(+) > > > > > > > > > > diff --git a/src/glsl/link_uniform_block_active_visitor.cpp > > > > > b/src/glsl/link_uniform_block_active_visitor.cpp > > > > > index 5102947..fbe79de 100644 > > > > > --- a/src/glsl/link_uniform_block_active_visitor.cpp > > > > > +++ b/src/glsl/link_uniform_block_active_visitor.cpp > > > > > @@ -106,6 +106,22 @@ > > > > > link_uniform_block_active_visitor::visit(ir_variable *var) > > > > > assert(b->num_array_elements == 0); > > > > > assert(b->array_elements == NULL); > > > > > assert(b->type != NULL); > > > > > + assert(!b->type->is_array() || b->has_instance_name); > > > > > + > > > > > + /* For uniform block arrays declared with a shared or > > > > > std140 layout > > > > > +* qualifier, mark all its instances as used. > > > > > +*/ > > > > > + if (b->type->is_array() && b->type->length > 0) { > > > > > + b->num_array_elements = b->type->length; > > > > > + b->array_elements = reralloc(this->mem_ctx, > > > > > + b->array_elements, > > > > > + unsigned, > > > > > + b->num_array_elements); > > > > > + > > > > > + for (unsigned i = 0; i < b->num_array_elements; i++) { > > > > > + b->array_elements[i] = i; > > > > > + } > > > > > + } > > > > > > > > > > return visit_continue; > > > > > } > > > > > @@ -147,6 +163,13 @@ > > > > > link_uniform_block_active_visitor::visit_enter(ir_dereference_a > > > > > rray *ir) > > > > > assert((b->num_array_elements == 0) == (b->array_elements > > > > > == NULL)); > > > > > assert(b->type != NULL); > > > > >
Re: [Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed
On mar, 2015-09-22 at 11:22 +0200, Antía Puentes wrote: > I realized that if writemasking is also unsupported I should probably > update the patch to avoid coalescing if the dst_writemask parameter is > different from XYWZ. The condition in "vec4_instruction::can_reswizzle" > would be something like: > > /* Gen6 MATH instructions can not execute in align16 mode, so swizzles >* or writemasking are not allowed. */ >if (devinfo->gen == 6 && is_math() && >swizzle != BRW_SWIZZLE_XYZW && >dst_writemask != WRITEMASK_XYWZ) > return false; Ups, I meant: (swizzle != BRW_SWIZZLE_XYZW || dst_writemask != WRITEMASK_XYWZ) Sorry for the noise. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed
On dom, 2015-09-20 at 11:11 -0700, Matt Turner wrote: > On Sun, Sep 20, 2015 at 8:48 AM, Antia Puentes wrote: > > Math operations in SandyBridge do not support source swizzling > > I can't find any documentation to support this claim, but I remember > that SNB math must be in align1 mode so it can't do swizzles or > writemasking (see commit e14cc504). I realized that if writemasking is also unsupported I should probably update the patch to avoid coalescing if the dst_writemask parameter is different from XYWZ. The condition in "vec4_instruction::can_reswizzle" would be something like: /* Gen6 MATH instructions can not execute in align16 mode, so swizzles * or writemasking are not allowed. */ if (devinfo->gen == 6 && is_math() && swizzle != BRW_SWIZZLE_XYZW && dst_writemask != WRITEMASK_XYWZ) return false; The writemask check is needed because the "vec4_instruction::reswizzle" method, apart from modifying the swizzle, also updates the intruction's writemask to the AND between the original writemask and the writemask of the instruction we are trying to remove. I will run Piglit and dEQP and send the updated patch. > But, the documentation actually says "The supported regioning modes > for math instruction are align16, align1 with the following > restrictions: ..." > > Would be nice if we could actually find a PRM citation. > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033 > > --- > > src/mesa/drivers/dri/i965/brw_ir_vec4.h | 3 ++- > > src/mesa/drivers/dri/i965/brw_vec4.cpp | 11 +-- > > 2 files changed, 11 insertions(+), 3 deletions(-) > > > > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > index 966a410..a48bb68 100644 > > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h > > @@ -175,7 +175,8 @@ public: > > > > bool is_send_from_grf(); > > unsigned regs_read(unsigned arg) const; > > - bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask); > > + bool can_reswizzle(const struct brw_device_info *devinfo, int > > dst_writemask, > > + int swizzle, int swizzle_mask); > > void reswizzle(int dst_writemask, int swizzle); > > bool can_do_source_mods(const struct brw_device_info *devinfo); > > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > index ed49cd3..d7192e4 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp > > @@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control() > > } > > > > bool > > -vec4_instruction::can_reswizzle(int dst_writemask, > > +vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo, > > +int dst_writemask, > > int swizzle, > > int swizzle_mask) > > { > > + > > Extra new line. > > > + /* gen6 math instructions can not manage source swizzles */ > > If we can find documentation, we should update this comment. If not, > let's change it to read > > /* Gen6 MATH instructions can not execute in align16 mode, so swizzles > are not allowed. */ > > (linewrapped as appropriate) > > With those changes, > > Reviewed-by: Matt Turner > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 2/2] i965/vec4_nir: Load constants as integers
Hi! Jason, On lun, 2015-08-24 at 10:26 -0700, Jason Ekstrand wrote: > Could you please debase this patch (probably rewrite)? I *think* it > should fix https://bugs.freedesktop.org/show_bug.cgi?id=91716. What > were the shader-db results for it? This V2 of the patch is already rebased/rewritten taking into account the changes made in load_const to use fewer moves, sorry for not having said it explicitly. Because for loading the constants as integers we necessarily need to fix the saturation errors, we would be having Piglit regressions in: "shaders/glsl-clamp-vertex-color.shader_test" and "arb_color_buffer_float-render" tests otherwise, I have applied both patches to collect the shader-db results. Comparing master (commit 529acab22a3e21e0ed0c5243675aec6c0ee27e8f) and the version obtained by applying patches "i965/vec4: Fix saturation errors when coalescing registers" and "i965/vec4_nir: Load constants as integers", we have: * Shader-db results for vec4 on i965 (Haswell): total instructions in shared programs: 6402199 -> 6402200 (0.00%) instructions in affected programs: 81 -> 82 (1.23%) helped:0 HURT: 1 GAINED:0 LOST: 0 Printing the vec4 instructions generated by the hurt program, we can see that this extra instruction is the result of reaching a program where the register coalescing is not possible because saturation is required and the instruction we want to coalesce into is not a MOV (if it were a MOV, we could safely coalesce using the type of the final register, instead of the type of the register to be removed as it is currently done). I am detailing only the pseudo-code for the component x for brevity, the rest of the components would be equivalent: In the master branch, the following vec4 instructions: and vgrf8.0.x:D, vgrf7.:D, 1065353216U mov vgrf9.0.x:D, vgrf8.:D mov.sat m4:F, vgrf9.xyzw:F are reduced by opt_register_coalesce to: and.sat m4.x:D, vgrf7.:D, 1065353216U making the saturation modifier a no-op (type D). With the patches applied, opt_register_coalesce does not coalesce vgrf9, we have an extra mov but the saturation is effective: and vgrf9.0.x:D, vgrf7.:D, 1065353216U mov.sat m4:F, vgrf9.xyzw:F > On Aug 24, 2015 4:51 AM, "Antia Puentes" wrote: > Loads constants using integer as their register type, this is > done > for consistency with the FS backend. > --- > src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > > index 632e409..23b2fab 100644 > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > @@ -456,7 +456,7 @@ void > vec4_visitor::nir_emit_load_const(nir_load_const_instr > *instr) > { > dst_reg reg = dst_reg(GRF, alloc.allocate(1)); > - reg.type = BRW_REGISTER_TYPE_F; > + reg.type = BRW_REGISTER_TYPE_D; > > > unsigned remaining = > brw_writemask_for_size(instr->def.num_components); > > @@ -477,7 +477,7 @@ > vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) >} > >reg.writemask = writemask; > - emit(MOV(reg, src_reg(instr->value.f[i]))); > + emit(MOV(reg, src_reg(instr->value.i[i]))); > > >remaining &= ~writemask; > } > -- > 2.1.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/mesa-dev > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Fix saturation when coalescing registers in NIR-vec4
Hi! On vie, 2015-08-14 at 13:56 +0200, Antia Puentes wrote: > With this fix, we can load the constants in NIR-vec4 as integers, > as it is done in the FS backend. > > This is related to: > http://lists.freedesktop.org/archives/mesa-dev/2015-July/089899.html I've been on holidays last week, as far I can read in: http://lists.freedesktop.org/archives/mesa-dev/2015-August/092382.html Matt Turner is working on a new version of the vec4 register coalescing. This makes me wonder if it makes sense to consider now the patches I am sending, because first, the code is going to change soon, and second, it could be that Matt is already tackling the saturation errors in the new register coalesce version. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 00/78] i965: A new vec4 backend based on NIR
Hi! Jason, On lun, 2015-08-03 at 17:25 -0700, Jason Ekstrand wrote: > On Mon, Aug 3, 2015 at 9:51 AM, Jason Ekstrand wrote: > > I also pushed a version to our CI system that switched the > > INTEL_USE_NIR default to true and things look pretty good over-all. > > On ILK (gen5) and 965 (gen 4), there were a bunch of tests that failed > > most of which seemed to be focussed around either variable indexing or > > comparison operations. It looks like it's probably the same problem > > (or problems) on both gens, so once we get one fixed, the other will > > most likely follow. On gen8, it looks like integers are broken. > > > > I don't know if any of you have the hardware to solve those problems. > > If not, someone from the Oregon team can look into it. > > I went ahead and looked at some of these problems. I just sent a > short patch series that fixes all of the regressions vs. non-NIR on > BDW+ and ILK-. Some things I noticed while I was at it: > > 1) You implemented nir_op_bany but not nir_op_ball. Why is this? Do > we not have any piglit tests that hit nir_op_ball? I did not implement nir_op_ball because, AFAICS, it is never emitted. Again, AFAICS, the equivalent IR operation is also missing, probably because the "all()" builtin function is implemented using nir_binop_logic_ands. I can add it if you think it makes sense though. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v3 (part2) 55/56] glsl: Consider active all elements of a shared/std140 block array
Please, ignore this patch, we have recently sent an updated version to the mailing list. The new version is available in: http://lists.freedesktop.org/archives/mesa-dev/2015-July/090108.html Regards. On mar, 2015-07-14 at 09:46 +0200, Iago Toral Quiroga wrote: > From: Antia Puentes > > Commmit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140' > blocks or block members) considers active 'shared' and 'std140' > uniform blocks and uniform block arrays but did not include the > block array elements. It was possible to have an active uniform > block array without any elements marked as used, making the > assertion ((b->num_array_elements > 0) == b->type->is_array()) > in link_uniform_blocks fail. > > Fixes the following 5 dEQP tests: > > * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18 > * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24 > * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19 > * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49 > * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36 > --- > src/glsl/link_uniform_block_active_visitor.cpp | 23 +++ > 1 file changed, 23 insertions(+) > > diff --git a/src/glsl/link_uniform_block_active_visitor.cpp > b/src/glsl/link_uniform_block_active_visitor.cpp > index 5102947..37418c2 100644 > --- a/src/glsl/link_uniform_block_active_visitor.cpp > +++ b/src/glsl/link_uniform_block_active_visitor.cpp > @@ -106,6 +106,22 @@ link_uniform_block_active_visitor::visit(ir_variable > *var) > assert(b->num_array_elements == 0); > assert(b->array_elements == NULL); > assert(b->type != NULL); > + assert(!b->type->is_array() || b->has_instance_name); > + > + /* For uniform block arrays declared with a shared or std140 layout > +* qualifier, mark all its instances as used. > +*/ > + if (b->type->is_array() && b->type->length > 0) { > + b->num_array_elements = b->type->length; > + b->array_elements = reralloc(this->mem_ctx, > + b->array_elements, > + unsigned, > + b->num_array_elements); > + > + for (unsigned i = 0; i < b->num_array_elements; i++) { > + b->array_elements[i] = i; > + } > + } > > return visit_continue; > } > @@ -147,6 +163,13 @@ > link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir) > assert((b->num_array_elements == 0) == (b->array_elements == NULL)); > assert(b->type != NULL); > > + /* If the block array was declared with a shared or std140 layout > qualifier, > +* all its instances have been already marked as used (see > +* link_uniform_block_active_visitor::visit(ir_variable *) function). > +*/ > + if (var->type->interface_packing == GLSL_INTERFACE_PACKING_PACKED) > + return visit_continue; > + > ir_constant *c = ir->array_index->as_constant(); > > if (c) { ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH v2 12/78] i965/nir/vec4: Implement load_const intrinsic
Hi! Jason, On jue, 2015-07-23 at 15:46 -0700, Jason Ekstrand wrote: > > @@ -332,7 +334,22 @@ vec4_visitor::nir_emit_instr(nir_instr *instr) > > void > > vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr) > > { > > - /* @TODO: Not yet implemented */ > > + dst_reg reg = dst_reg(GRF, alloc.allocate(1)); > > + reg.type = BRW_REGISTER_TYPE_F; > > + > > + /* @FIXME: consider emitting vector operations to save some MOVs in > > +* cases where the components are representable in 8 bits. > > +* By now, we emit a MOV for each component. > > +*/ > > + for (unsigned i = 0; i < instr->def.num_components; ++i) { > > + reg.writemask = 1 << i; > > + emit(MOV(reg, src_reg(instr->value.f[i]))); > > In the FS, we use integers here. I don't know that using floats is a > problem, but we should probably be consistent. We are using floats as the default type when loading constants, because otherwise we have a number of regressions in Piglit that happen because clamping is not done when it is required. This lack of clamping is related to the vec4_visitor:opt_register_coalesce optimization relaying on the constants being loaded with their real type, this is in effect how the IR-vec4 pass works. Let me explain it with an example. Imagine that we use integers as the default type in load_const as you suggest, and we have in our shader's code the following instruction: gl_FrontColor = vec4(-2, -1, 0.5, 3); This is translated into the following vec4 instructions: /* Load constant */ mov vgrf10.0.x:D, -1073741824D mov vgrf10.0.y:D, -1082130432D mov vgrf10.0.z:D, 1056964608D mov vgrf10.0.w:D, 1077936128D /* Store the value */ mov.sat m4:F, vgrf10.xyzw:F The error comes when that bunch of instructions is translated by the opt_register_coalesce as: mov.sat m4.x:D, -1073741824D mov.sat m4.y:D, -1082130432D mov.sat m4.z:D, 1056964608D mov.sat m4.w:D, 1077936128D As you can see, although the instruction saturation is set, nothing will be done because the operands are integers. I see a number of ways to fix this: 1) Provide the constant's type information in nir_load_const_instr and set the real type when loading the constant. 2) Set the type always to float and rely on receiving the correct value for the instruction->saturation, so the clamping is only done when needed. 3) Modify the opt_register_coalesce optimization to preserve the types of the instruction containing the final destination, if the instruction saturation is true. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 39/78] i965/nir/vec4: Add swizzle utility method for vector ops
Hi Jason, On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote: > On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev wrote: > > From: Antia Puentes > > > > For operations that have a predefined operand size > 0, defined in > > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the > > components from outside the source vector. However, the driver > > expects those components to have a swizzle value equal to the swizzle > > of the component in (number of vector elements - 1). For example, > > > >for a vec2 operation with an identity swizzle (.xy): > > - (NIR -> XYXX, driver ->X) > > > >for a vec3 operation with swizzle (.zxy) > > - (NIR-> ZXYX, driver -> ZXYY) > > Why is this needed. Was there some bug you ran into regarding this or > are you just trying to make the generated code match? If the > component isn't used, then it seems to me like the swizzle shouldn't > matter. NIR may give you bogus swizzles outside of the enabled > channels but it may do that regardless. I'm also not seeing how > composing swizzles fixes anything. There were several bugs in our NIR->vec4 code path related to NIR returning bogus swizzles outside the components of a vector. When doing an "any", "equal" or "notEqual" operation on vectors, the result of the instruction "cmp" is predicated using "all4h" or "any4h" depending on the operation. The results were incorrect for vectors with less than 4 elements because: (1) I was not setting the writemask of the "cmp" operation according to the vector size and (2) NIR returns bogus swizzles for components outside the vector. As a consequence, we had spurious component-wise comparisons that affected the result of the predication. As composing the operand swizzle with the identity swizzle-per-size means to repeat the value of the swizzle of the last vector-element on the components outside the vector, the error mentioned above does not happen because to repeat the last (inside-boundaries) component-wise comparison for the components outside the vector does not affect the result of the predication. Other way to fix the bugs above is to set the writemask of the "cmp" operation to the appropriate mask according to the vector size, thus forcing the "cmp" instruction to compare only the components inside the vector boundaries. As the vec4_visitor does not set the writemask of the "cmp" instruction because it relies on the operand having the correct swizzle outside the vector boundaries, for consistency with the existing behaviour, I decided to do the same and compose the swizzle (notice that the composing is done in the vec4_visitor when visiting an ir_swizzle, while in NIR we only set the swizzle for the elements inside the vector). I can remove this function if you prefer setting the writemask over composing the swizzles. I remember another regression related to bogus swizzles in unpackHalf2x16, for that function the operand is an uint, however in the vec4_visitor::emit_unpack_half_2x16 function we have: tmp_dst.writemask = WRITEMASK_Y; emit(SHR(tmp_dst, src0, src_reg(16u))); This was not buggy in the old vec4_visitor because the src0's swizzle was correct for the component Y (src0's swizzle would be either , the identity swizzle for size == 1, or the result of composing the swizzle with in case the parent IR node was an ir_swizzle). In the case of NIR, as the swizzle can be bogus for all the components except the first one (size == 1), we need to compose it. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 33/78] i965/nir/vec4: Implement float-related functions
Hi Jason, thanks for the review. On mar, 2015-06-30 at 14:10 -0700, Jason Ekstrand wrote: > First off, this needs a different commit message. "float-related > functions" isn't particularly descriptive. How about "various > rounding functions" because these really are all "rounding modes". True, I will change it. > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > > index 12fdb48..57685bb 100644 > > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp > > @@ -855,6 +855,46 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr) ... > > + case nir_op_fceil: { > > + /* Get the number of components of the source to create a > > + * temporary register with the appropriate glsl type. > > + */ > > + int num_components = instr->src[0].src.is_ssa ? > > + instr->src[0].src.ssa->num_components : > > + instr->src[0].src.reg.reg->num_components; > > + assert(num_components > 0); > > + > > + src_reg tmp = src_reg(this, glsl_type::float_type, num_components); > > We should be able to just do 1 register because each register is already a > vec4. Right. We only need the number of components to set the swizzle. > > + tmp.swizzle = brw_swizzle_for_size(num_components); ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 3/5] glcpp: Allow arithmetic integer expressions in #line
First, sorry for the late answer, I somehow missed your replies (I was not in CC). On mar, 2015-06-09 at 10:59 -0700, Ian Romanick wrote: > On 06/09/2015 10:40 AM, Carl Worth wrote: > > On Tue, Jun 09 2015, Ian Romanick wrote: > >>> From section 3.4 ("Preprocessor") of the GLSL ES 3.00 specification: > >>> "#line must have, after macro substitution, one of the following forms: > >>> #line line > >>> #line line source-string-number > >>> where line and source-string-number are constant integral > >>> expressions." > > ... > >>> From section 4.3.3 ("Constant Expressions") of the same specification: > >>> "A constant integral expression is a constant expression that evaluates > >>> to a scalar signed or unsigned integer." > > > > Yes. That's an extremely unfortunate piece of the specification. > > > > This, together with unary operators introduces inherent ambiguity into > > the grammar. Just think about things like: I forgot to mention in the patch's commit message that, because of the ambiguity of the grammar, I made the assumption that one (or more blanks) that are not part of an expression between parentheses, act as parameter separators. Then, for the examples mentioned in the thread the output would be: #line 2-1+5 -> #line 6 #line 2 -1+5 -> #line 2 4 #line 2-1 +5 -> #line 1 5 #line 2-1+5 3 -> #line 6 3 #line 2 -1+5 3 -> compilation error #line 2-1 +5 3 -> compilation error #line 3 +3 -> #line 3 3 #line 3 (+3) -> #line 3 3 And for the parentheses the behavior is: #line (2 -1)+5 -> #line 6 #line 3 (4+1)-1 -> #line 3 4 #line (3) ((4+1) -1) -> #line 3 4 #line 3 (4+1) -1 -> compilation error > The spec was supposed to get updated to say that parsing is greedy, so > we at least know what those should do. I say "supposed to" instead of > "was" because I don't know for sure that it was updated. I am afraid that greedy parsing is not what this patch implements, as #line 3 +3 4 will not be evaluated as #line 6 4, but will raise an error instead. > > But I'll also take a look at this patch. Thanks for bringing it to my > > attention, Ian. ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7
Hi!, On vie, 2015-03-13 at 10:42 -0700, Matt Turner wrote: > On Fri, Mar 13, 2015 at 3:41 AM, Antia Puentes wrote: > > IvyBridge and Haswell PRM say that the JIP should be emitted > > with type W but we were using UD. The previous implementation > > did not show adverse effects, however changing the type to > > D caused a GPU hang, see bug 84557; IMHO it is safer to > > follow the specification thoroughly. > > --- > > The change seems fine to me, though changing the type to D never > seemed like the thing that was actually causing the hang. It seemed > much more likely that allowing the ELSE and ENDIF instructions to be > compacted violated some assumption in our code and we misaligned some > jump target by 8-bytes. > > I should investigate more. > > I'd probably remove the "however changing the type to D caused a GPU > hang, see bug 84557" comment from the commit. I will remove the comment. > Reviewed-by: Matt Turner Thanks for reviewing! ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] About sRGB encoding/decoding while blitting
Hi! I have been looking into the dEQP test failures that deal with blitting between sRGB and linear color spaces, and vice versa (in total 37 tests). OpenGL ES 3.0 and OpenGL 4.4 specifications clearly state that linearization should happen when reading from sRGB buffers, and sRGB encoding should be done when writing to sRGB buffers during the blit operation. In the case OpenGL 4.4 specification this encoding/decoding is dependent on the value of the GL_FRAMEBUFFER_SRGB enable flag; OpenGL ES 3.0 acts as if this setting was always enabled. - From section 4.3.3 ("Copying Pixels") of the OpenGL ES 3.0 specification (page 198): "When values are taken from the read buffer, if the value of FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING for the framebuffer attachment corresponding to the read buffer is SRGB (see section 6.1.13), the red, green, and blue components are converted from the non-linear sRGB color space according to equation 3.24. When values are written to the draw buffers, blit operations bypass the fragment pipeline. The only fragment operations which affect a blit are the pixel ownership test, the scissor test, and sRGB conversion (see section 4.1.8). Color, depth, and stencil masks (see section 4.2.2) are ignored." - From section 18.3.1 ("Blitting Pixel Rectangles") of the OpenGL 4.4 core specification (page 485): "When values are taken from the read buffer, if FRAMEBUFFER_SRGB is enabled and the value of FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING for the framebuffer attachment corresponding to the read buffer is SRGB (see section 9.2.3), the red, green, and blue components are converted from the non-linear sRGB color space according to equation 8.14. When values are written to the draw buffers, blit operations bypass most of the fragment pipeline. The only fragment operations which affect a blit are the pixel ownership test, the scissor test, and sRGB conversion (see section 17.3.9). Color, depth, and stencil masks (see section 17.4.2) are ignored." However, the implementation in Mesa preserves the underlying binary representation of the pixels, regardless the color space. This is related to the fact that the OpenGL 4.3 specification was not consistent regarding blits involving sRGB buffers; for more information see [1]. Apart from this, it was suggested in some comments in the code (specifically comments [1] and [2]), that existing games and applications rely on the current behaviour, not expecting sRGB encoding/decoding to be done. In summary, the current implementation does not respect the OpenGL 4.4 and OpenGL ES 3.0 specs, however, it seems to be the implementation that some applications / games expect, so it is unclear to me if we want to re-implement this according to what the specs dictate or leave it as it is now. Any thoughts? Regards. [1] Piglit: comments in tests/spec/arb_framebuffer_srgb/blit.c source file and the log of the commit abd189966d39648c00f3204c58fef217e94a8703 [2] Mesa: comments in src/mesa/drivers/common/meta_blit.c source file ("blitframebuffer_texture" function). ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev