Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds

2017-03-24 Thread Antía Puentes
Reviewed-by: Antia Puentes 


On jue, 2017-01-26 at 00:47 -0500, Ilia Mirkin wrote:
> When a texture is immutable, we can't tack on extra levels
> after-the-fact like we could with glTexImage. So check against that
> level limit and return an error if it's surpassed.
> 
> The spec is a little unclear in that it says to check if "level is
> not a
> supported texture level", however that is never fully defined.
> 
> This fixes:
> GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_level_number
> 
> Signed-off-by: Ilia Mirkin 
> ---
> 
> v1 -> v2: use NumLevels instead of _MaxLevel.
> 
> Not sure why this isn't showing up as failing in the Intel CI, but it
> was
> definitely failing here.
> 
>  src/mesa/main/fbobject.c | 22 ++
>  1 file changed, 14 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/main/fbobject.c b/src/mesa/main/fbobject.c
> index 6934805..6e86248 100644
> --- a/src/mesa/main/fbobject.c
> +++ b/src/mesa/main/fbobject.c
> @@ -3128,16 +3128,22 @@ check_layer(struct gl_context *ctx, GLenum
> target, GLint layer,
>   * \return true if no errors, false if errors
>   */
>  static bool
> -check_level(struct gl_context *ctx, GLenum target, GLint level,
> -const char *caller)
> +check_level(struct gl_context *ctx, const struct gl_texture_object
> *texObj,
> +GLint level, const char *caller)
>  {
> if ((level < 0) ||
> -   (level >= _mesa_max_texture_levels(ctx, target))) {
> +   (level >= _mesa_max_texture_levels(ctx, texObj->Target))) {
>    _mesa_error(ctx, GL_INVALID_VALUE,
>    "%s(invalid level %d)", caller, level);
>    return false;
> }
>  
> +   if (texObj->Immutable && level >= texObj->NumLevels) {
> +  _mesa_error(ctx, GL_INVALID_VALUE,
> +  "%s(level %u >= %u)", caller, level, texObj-
> >NumLevels);
> +  return false;
> +   }
> +
> return true;
>  }
>  
> @@ -3260,7 +3266,7 @@ framebuffer_texture_with_dims(int dims, GLenum
> target,
>    if ((dims == 3) && !check_layer(ctx, texObj->Target, layer,
> caller))
>   return;
>  
> -  if (!check_level(ctx, textarget, level, caller))
> +  if (!check_level(ctx, texObj, level, caller))
>   return;
> }
>  
> @@ -3328,7 +3334,7 @@ _mesa_FramebufferTextureLayer(GLenum target,
> GLenum attachment,
>    if (!check_layer(ctx, texObj->Target, layer, func))
>   return;
>  
> -  if (!check_level(ctx, texObj->Target, level, func))
> +  if (!check_level(ctx, texObj, level, func))
>   return;
>  
>    if (texObj->Target == GL_TEXTURE_CUBE_MAP) {
> @@ -3370,7 +3376,7 @@ _mesa_NamedFramebufferTextureLayer(GLuint
> framebuffer, GLenum attachment,
>    if (!check_layer(ctx, texObj->Target, layer, func))
>   return;
>  
> -  if (!check_level(ctx, texObj->Target, level, func))
> +  if (!check_level(ctx, texObj, level, func))
>   return;
>  
>    if (texObj->Target == GL_TEXTURE_CUBE_MAP) {
> @@ -3419,7 +3425,7 @@ _mesa_FramebufferTexture(GLenum target, GLenum
> attachment,
>    if (!check_layered_texture_target(ctx, texObj->Target, func,
> &layered))
>   return;
>  
> -  if (!check_level(ctx, texObj->Target, level, func))
> +  if (!check_level(ctx, texObj, level, func))
>   return;
> }
>  
> @@ -3459,7 +3465,7 @@ _mesa_NamedFramebufferTexture(GLuint
> framebuffer, GLenum attachment,
>  &layered))
>   return;
>  
> -  if (!check_level(ctx, texObj->Target, level, func))
> +  if (!check_level(ctx, texObj, level, func))
>   return;
> }
>  
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds

2017-03-24 Thread Antía Puentes
On mié, 2017-02-01 at 19:18 +0100, Antía Puentes wrote:
> On mié, 2017-02-01 at 00:37 -0500, Ilia Mirkin wrote:
> > On Tue, Jan 31, 2017 at 2:55 PM, Antía Puentes  wrote:
> > > 
> > > There was an update in the OpenGL ES 3.2 specification (November 3,
> > > 2016) addressing this:
> > > 
> > > - In section "9.2.8 Attaching Texture Images to a Framebuffer",
> > > FramebufferTexture2D (page 241) and FramebufferTextureLayer (page 242)
> > > descriptions:
> > > 
> > > " specifies the mipmap level of the texture image to be attached
> > > to the framebuffer, and must satisfy the following conditions:
> > > 
> > > • If texture refers to an immutable-format texture, level must be
> > > greater than or equal to zero and smaller than the value of
> > > TEXTURE_IMMUTABLE_LEVELS for texture."
> > > 
> > > https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15946
> > Interesting, OK. And the test that's failing is a "glesext" test, even
> > though it's being run in a GL context. Does that bug indicate whether
> > such a change also applies to desktop GL? (I don't have access to the
> > bug tracker.)

> It does not mention anything about desktop GL. I have filled a new bug
> in OpenGL 4.5 asking for clarification:
> https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16214

Update on the spec bug [*]. It was agreed to adopt the OpenGL ES 3.2
behaviour in OpenGL, that is to return GL_INVALID_VALUE when attempting
to attach a non-existent level of an immutable texture to an FBO.

[*] https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16214#c1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Clamp GetUniformuiv values to be >= 0.

2017-02-16 Thread Antía Puentes
On lun, 2016-12-12 at 10:43 +0100, Nicolai Hähnle wrote:
> On 12.12.2016 00:25, Kenneth Graunke wrote:
> > 
> > Section 2.2.2 (Data Conversions For State Query Commands) of the
> > OpenGL 4.5 October 24th 2016 specification says:
> > 
> > "If a command returning unsigned integer data is called, such as
> >  GetSamplerParameterIuiv, negative values are clamped to zero."
> > 
> > Fixes GL44-CTS.gpu_shader_fp64.state_query.
> > 
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/main/uniform_query.cpp | 48
> > +
> >  1 file changed, 39 insertions(+), 9 deletions(-)
> > 
> > Hey Nicolai,
> > 
> > I wrote a similar patch a while back, but never got around to
> > sending it,
> > since I realized that the gl45release branch expects our current
> > behavior,
> > and the change to make the CTS expect clamping is only on the
> > master branch.
> > 
> > Apparently I made some additional changes, compared to yours.  I
> > figured
> > I'd send this along and let you see if you think any of my extra
> > changes
> > are still necessary.  If so, feel free to fold them into your
> > patch.
> > 
> > I also think we need to fix several other glGet* commands...it's
> > just that
> > this is the only one currently tested.  A bunch work because the
> > values
> > returned can't be negative.
> I think your patch is a strict superset of what mine does and should
> be 
> used instead. I do have one comment below, with that fixed it has my
> R-b.

This patch was never pushed, was it? and GL45-CTS.gpu_shader_fp64.state_query
fails in the new vk-gl-cts repository because it expects these negative
values to be clamped.

> There is the more general question of how to cope with those cases
> where 
> the CTS requires non-standard behavior. I think we should insist on 
> doing the right thing in Mesa, and push for changes to the CTS.
> 
> Until quite recently, I've been occupied by radeonsi- and 
> Gallium-specific bugs, but that's changing and I'm looking into
> using 
> CTS master rather than back-porting fixes to the dead gl45release
> branch 
> (hence this patch).
> 
> > 
> > 
> >  --Ken
> > 
> > diff --git a/src/mesa/main/uniform_query.cpp
> > b/src/mesa/main/uniform_query.cpp
> > index db700df..f05a29f 100644
> > --- a/src/mesa/main/uniform_query.cpp
> > +++ b/src/mesa/main/uniform_query.cpp
> > @@ -347,14 +347,10 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> > * just memcpy the data.  If the types are not compatible,
> > perform a
> > * slower convert-and-copy process.
> > */
> > -  if (returnType == uni->type->base_type
> > -     || ((returnType == GLSL_TYPE_INT
> > -      || returnType == GLSL_TYPE_UINT)
> > -     &&
> > -     (uni->type->base_type == GLSL_TYPE_INT
> > -      || uni->type->base_type == GLSL_TYPE_UINT
> > -   || uni->type->base_type == GLSL_TYPE_SAMPLER
> > -   || uni->type->base_type == GLSL_TYPE_IMAGE))) {
> > +  if (returnType == uni->type->base_type ||
> > +  ((returnType == GLSL_TYPE_INT || returnType ==
> > GLSL_TYPE_UINT) &&
> > +   (uni->type->base_type == GLSL_TYPE_SAMPLER ||
> > +uni->type->base_type == GLSL_TYPE_IMAGE))) {
> >      memcpy(paramsOut, src, bytes);
> >    } else {
> >      union gl_constant_value *const dst =
> > @@ -422,7 +418,6 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> >        }
> >        break;
> >     case GLSL_TYPE_INT:
> > -   case GLSL_TYPE_UINT:
> >        switch (uni->type->base_type) {
> >        case GLSL_TYPE_FLOAT:
> >       /* While the GL 3.2 core spec doesn't explicitly
> > @@ -447,6 +442,9 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> >        case GLSL_TYPE_BOOL:
> >       dst[didx].i = src[sidx].i ? 1 : 0;
> >       break;
> > +   case GLSL_TYPE_UINT:
> > +  dst[didx].i = src[sidx].i;
> I think this should be
> 
> dst[didx].i = MIN2(src[sidx].u, INT_MAX);
> 
> Cheers,
> Nicolai
> 
> > 
> > +  break;
> > case GLSL_TYPE_DOUBLE: {
> >    double tmp;
> >    memcpy(&tmp, &src[sidx].f, sizeof(tmp));
> > @@ -458,6 +456,38 @@ _mesa_get_uniform(struct gl_context *ctx,
> > GLuint program, GLint location,
> >       break;
> >        }
> >        break;
> > +case GLSL_TYPE_UINT:
> > +   switch (uni->type->base_type) {
> > +   case GLSL_TYPE_FLOAT:
> > +  /* The spec isn't terribly clear how to handle
> > negative
> > +   * values with an unsigned return type.
> > +   *
> > +   * GL 4.5 section 2.2.2 ("Data Conversions for
> > State
> > +   * Query Commands") says:
> > +   *
> > +   * "If a value is so large in magnit

Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds

2017-02-01 Thread Antía Puentes
Hi Ilia,

On mié, 2017-02-01 at 00:37 -0500, Ilia Mirkin wrote:
> On Tue, Jan 31, 2017 at 2:55 PM, Antía Puentes  wrote:
> > 
> > There was an update in the OpenGL ES 3.2 specification (November 3,
> > 2016) addressing this:
> > 
> > - In section "9.2.8 Attaching Texture Images to a Framebuffer",
> > FramebufferTexture2D (page 241) and FramebufferTextureLayer (page 242)
> > descriptions:
> > 
> > " specifies the mipmap level of the texture image to be attached
> > to the framebuffer, and must satisfy the following conditions:
> > 
> > • If texture refers to an immutable-format texture, level must be
> > greater than or equal to zero and smaller than the value of
> > TEXTURE_IMMUTABLE_LEVELS for texture."
> > 
> > https://cvs.khronos.org/bugzilla/show_bug.cgi?id=15946
> Interesting, OK. And the test that's failing is a "glesext" test, even
> though it's being run in a GL context. Does that bug indicate whether
> such a change also applies to desktop GL? (I don't have access to the
> bug tracker.)

It does not mention anything about desktop GL. I have filled a new bug
in OpenGL 4.5 asking for clarification:
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=16214

For what it is worth, I have a patch similar to yours but using
ImmutableLevels in the condition check instead of NumLevels:

(texObj->Immutable && level >= texObj->ImmutableLevels)


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 1/2] mesa: make glFramebuffer* check immutable texture level bounds

2017-01-31 Thread Antía Puentes
On mar, 2017-01-31 at 12:09 +0100, Nicolai Hähnle wrote:
> On 30.01.2017 19:09, Ilia Mirkin wrote:
> > 
> > On Mon, Jan 30, 2017 at 1:06 PM, Ilia Mirkin  wrote:
> > > 
> > > On Mon, Jan 30, 2017 at 12:26 PM, Nicolai Hähnle  wrote:
> > > > 
> > > > On 30.01.2017 18:23, Ilia Mirkin wrote:
> > > > > 
> > > > > 
> > > > > On Mon, Jan 30, 2017 at 4:36 AM, Ilia Mirkin  wrote:
> > > > > > 
> > > > > > 
> > > > > > On Mon, Jan 30, 2017 at 4:33 AM, Nicolai Hähnle 
> > > > > > wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 26.01.2017 06:47, Ilia Mirkin wrote:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > When a texture is immutable, we can't tack on extra levels
> > > > > > > > after-the-fact like we could with glTexImage. So check against 
> > > > > > > > that
> > > > > > > > level limit and return an error if it's surpassed.
> > > > > > > > 
> > > > > > > > The spec is a little unclear in that it says to check if "level 
> > > > > > > > is not
> > > > > > > > a
> > > > > > > > supported texture level", however that is never fully defined.
> > > > > > > > 
> > > > > > > > This fixes:
> > > > > > > > GL45-CTS.geometry_shader.layered_fbo.fb_texture_invalid_level_number
> > > > > > > > 
> > > > > > > > Signed-off-by: Ilia Mirkin 
> > > > > > > > ---
> > > > > > > > 
> > > > > > > > v1 -> v2: use NumLevels instead of _MaxLevel.
> > > > > > > > 
> > > > > > > > Not sure why this isn't showing up as failing in the Intel CI, 
> > > > > > > > but it
> > > > > > > > was
> > > > > > > > definitely failing here.
> > > > > > > 
> > > > > > > 
> > > > > > > Maybe the Intel CI is running the GLCTS based on the last 4.5 
> > > > > > > release,
> > > > > > > and I
> > > > > > > guess you're running off what's been published on Github? The 
> > > > > > > GLCTS on
> > > > > > > Github has a bunch of new and possibly broken tests, and may 
> > > > > > > still have
> > > > > > > a
> > > > > > > number of regressions as well (since a lot of code was moved 
> > > > > > > around).
> > > > > > > 
> > > > > > > Can you point out which specific place of the spec you're talking 
> > > > > > > about
> > > > > > > in
> > > > > > > your comment?
> > > > > > 
> > > > > > One of the errors listed for glFramebufferTexture is:
> > > > > > 
> > > > > > """
> > > > > > An INVALID_VALUE error is generated if texture is not zero and is 
> > > > > > not the
> > > > > > name of a texture object, or if level is not a supported texture 
> > > > > > level
> > > > > > for texture
> > > > > > """
> > > > > 
> > > > > Curiously for glFramebufferTexture1D/2D/3D, it also says:
> > > > > 
> > > > > """
> > > > > If textarget is TEXTURE_RECTANGLE or TEXTURE_2D_MULTISAMPLE, then
> > > > > level must be zero. If textarget is TEXTURE_3D, then level must be
> > > > > greater than or equal to zero and less than or equal to log2 of the
> > > > > value of MAX_3D_TEXTURE_- SIZE. If textarget is one of the cube map
> > > > > face targets from table 8.19, then level must be greater than or equal
> > > > > to zero and less than or equal to log2 of the value of
> > > > > MAX_CUBE_MAP_TEXTURE_SIZE. For all other values of textarget, level
> > > > > must be greater than or equal to zero and no larger than log2 of the
> > > > > value of MAX_- TEXTURE_SIZE.
> > > > > """
> > > > > 
> > > > > which matches the current code. I guess this patch is withdrawn...
> > > > 
> > > > What a coincidence, I've been staring at this for a while now and been
> > > > getting increasingly confused. Which function does the CTS actually test
> > > > (and possibly fail incorrectly)? Maybe it's time to open an issue 
> > > > against
> > > > the CTS.
> > > https://github.com/KhronosGroup/VK-GL-CTS/blob/c9921995d8d360bd34d8672194d7c095bb376f82/external/openglcts/modules/glesext/geometry_shader/esextcGeometryShaderLayeredFBO.cpp#L1062
> > A thought occurred to me... whereas glFramebufferTexture1D/etc talk
> > about binding points, glFramebufferTexture talks about a specific
> > texture object. When are the bindings resolved? at
> > glFramebufferTexture2D time, or at draw time? If the latter, then the
> > spec has no choice but to just check the maxima for the binding point
> > ones...
> The problem is that in all cases, the error table lists INVALID_ENUM 
> when "level is not a supported texture level for $foo", but it's not 
> explicitly stated how "supported texture level" is defined for 
> FramebufferTexture.
> 
> However, I did a bit more searching, and there's language below 
> FramebufferTextureLayer which matches that of FramebufferTexture1D/2D/3D:
> 
> "If texture is a three-dimensional texture, then level must be greater 
> than or equal to zero and less than or equal to log 2 of the value of 
> MAX_3D_TEXTURE_SIZE. If texture is a two-dimensional array texture, then 
> level must be greater than or equal to zero and no larger than log 2 of 
> the value of MAX_TEXTURE_SIZE."
> 
> It would be odd to have different requirements for FramebufferTexture 
> and Framebuffer

Re: [Mesa-dev] [PATCH] i965/gen8: Fix vertex attrib upload for dvec3/4 shader inputs

2016-11-01 Thread Antía Puentes
On lun, 2016-10-31 at 14:51 -0700, Kenneth Graunke wrote:
> On Monday, October 31, 2016 6:22:43 PM PDT Antia Puentes wrote:
> > 
> > The emission of vertex attributes corresponding to dvec3 and dvec4
> > vertex shader input variables was not correct when the 
> > passed
> > to the VertexAttribL* commands was <= 2.
> > 
> > This was because we were using the vertex array size when emitting
> > vertices
> > to decide if we uploaded a 64-bit floating point attribute as 1
> > slot (128-bits)
> > for sizes 1 and 2, or 2 slots (256-bits) for sizes 3 and 4. This
> > caused problems
> > when mapping the input variables to registers because, for deciding
> > which
> > registers contain the values uploaded for a certain variable, we
> > use the size
> > and type given to the variable in the shader, so we will be
> > assigning 256-bits
> > to dvec3/4 variables, even if we only uploaded 128-bits for them,
> > which happened
> > when the vertex array size was <= 2.
> > 
> > The patch uses the shader information to only emit as 128-bits
> > those 64-bit floating
> > point variables that were declared as double or dvec2 in the vertex
> > shader. Dvec3 and
> > dvec4 variables will be always uploaded as 256-bits, independently
> > of the  given
> > to the VertexAttribL* command.
> > 
> > From the ARB_vertex_attrib_64bit specification:
> > 
> >    "For the 64-bit double precision types listed in Table X.1, no
> > default
> > attribute values are provided if the values of the vertex
> > attribute variable
> > are specified with fewer components than required for the
> > attribute
> > variable. For example, the fourth component of a variable of
> > type dvec4
> > will be undefined if specified using VertexAttribL3dv or using
> > a vertex
> > array specified with VertexAttribLPointer and a size of three."
> > 
> > We are filling these unspecified components with zeros, which
> > coincidentally is
> > also what the GL44-CTS.vertex_attrib_binding.basic-inputL-case1
> > expects.
> > 
> > Fixes: GL44-CTS.vertex_attrib_binding.basic-inputL-case1 test
> > 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97287
> > ---
> >  src/mesa/drivers/dri/i965/brw_compiler.h |  1 +
> >  src/mesa/drivers/dri/i965/brw_context.h  |  2 +-
> >  src/mesa/drivers/dri/i965/brw_draw_upload.c  |  4 +++-
> >  src/mesa/drivers/dri/i965/brw_vs.c   |  1 +
> >  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 35 --
> > --
> >  5 files changed, 21 insertions(+), 22 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_compiler.h
> > b/src/mesa/drivers/dri/i965/brw_compiler.h
> > index 819c7d6..c2400f9 100644
> > --- a/src/mesa/drivers/dri/i965/brw_compiler.h
> > +++ b/src/mesa/drivers/dri/i965/brw_compiler.h
> > @@ -641,6 +641,7 @@ struct brw_vs_prog_data {
> > struct brw_vue_prog_data base;
> >  
> > GLbitfield64 inputs_read;
> > +   GLbitfield64 double_inputs_read;
> >  
> > unsigned nr_attributes;
> > unsigned nr_attribute_slots;
> > diff --git a/src/mesa/drivers/dri/i965/brw_context.h
> > b/src/mesa/drivers/dri/i965/brw_context.h
> > index 308ba99..310372a 100644
> > --- a/src/mesa/drivers/dri/i965/brw_context.h
> > +++ b/src/mesa/drivers/dri/i965/brw_context.h
> > @@ -535,7 +535,7 @@ struct brw_vertex_element {
> > const struct gl_vertex_array *glarray;
> >  
> > int buffer;
> > -
> > +   bool is_dual_slot;
> > /** Offset of the first element within the buffer object */
> > unsigned int offset;
> >  };
> > diff --git a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > index da13e7a..19c8065 100644
> > --- a/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > +++ b/src/mesa/drivers/dri/i965/brw_draw_upload.c
> > @@ -472,7 +472,9 @@ brw_prepare_vertices(struct brw_context *brw)
> > while (vs_inputs) {
> >    GLuint index = ffsll(vs_inputs) - 1;
> >    struct brw_vertex_element *input = &brw->vb.inputs[index];
> > -
> > +  input->is_dual_slot =
> > + (brw->gen >= 8 && _mesa_bitcount_64(vs_prog_data-
> > >double_inputs_read &
> > + BITFIELD64_BIT(index)
> > ));
> Bitcount of a single bit?  Couldn't you do:
> 
>   input->is_dual_slot = brw->gen >= 8 &&
>  (vs_prog_data->double_inputs_read & BITFIELD64_BIT(index))
> != 0;

Will do. Thanks for the review!

> With that fixed, this is:
> Reviewed-by: Kenneth Graunke 
> 
> Thanks for tracking this down and fixing it!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4] i965: Fix calculation of the image height at start level

2016-09-08 Thread Antía Puentes
Ping. This patch fixes a significant amount of CTS tests and
it will be nice to have it reviewed.
On sáb, 2016-09-03 at 03:04 +0200, Antia Puentes wrote:
> - Fixes CTS tests:
> 
> * GL44-CTS.shader_image_size.advanced-nonMS-cs-float
> * GL44-CTS.shader_image_size.advanced-nonMS-cs-int
> * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint
> * GL44-CTS.shader_image_size.advanced-nonMS-gs-float
> * GL44-CTS.shader_image_size.advanced-nonMS-gs-int
> * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint
> * GL44-CTS.shader_image_size.advanced-nonMS-tes-float
> * GL44-CTS.shader_image_size.advanced-nonMS-tes-int
> * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint
> * GL44-CTS.shader_image_size.advanced-nonMS-vs-float
> * GL44-CTS.shader_image_size.advanced-nonMS-vs-int
> * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint
> 
> v1: (written by Dave Airlie) Always shift height images for levels.
> Fixed the CTS test.
> 
> v2: Only shift height if the texture is not an 1D_ARRAY,
> it fixes assertion in GL44-CTS.texture_view.gettexparameter
> due to the original patch (Antia).
> 
> v3: Remove the loop. Do not shift height either for 1D textures.
> Use an explicit switch and add an assertion (levels == 0) for
> multisampled textures (Jason).
> 
> v4: Rectangle textures can not have levels either (Ilia Mirkin).
> 
> Signed-off-by: Dave Airlie 
> Signed-off-by: Antia Puentes 
> ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c | 27
> +--
>  1 file changed, 21 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 7affe08..65962eb 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct
> brw_context *brw,
> DBG("%s\n", __func__);
>  
> /* Figure out image dimensions at start level. */
> -   for (i = intelImage->base.Base.Level; i > 0; i--) {
> -  width <<= 1;
> -  if (height != 1)
> - height <<= 1;
> -  if (intelObj->base.Target == GL_TEXTURE_3D)
> - depth <<= 1;
> +   switch(intelObj->base.Target) {
> +   case GL_TEXTURE_2D_MULTISAMPLE:
> +   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
> +   case GL_TEXTURE_RECTANGLE:
> +  assert(intelImage->base.Base.Level == 0);
> +  break;
> +   case GL_TEXTURE_3D:
> +  depth <<= intelImage->base.Base.Level;
> +  /* Fall through */
> +   case GL_TEXTURE_2D:
> +   case GL_TEXTURE_2D_ARRAY:
> +   case GL_TEXTURE_CUBE_MAP:
> +   case GL_TEXTURE_CUBE_MAP_ARRAY:
> +  height <<= intelImage->base.Base.Level;
> +  /* Fall through */
> +   case GL_TEXTURE_1D:
> +   case GL_TEXTURE_1D_ARRAY:
> +  width <<= intelImage->base.Base.Level;
> +  break;
> +   default:
> +  unreachable("Unexpected target");
> }
>  
> /* Guess a reasonable value for lastLevel.  This is probably
> going___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of the image height at start level

2016-09-02 Thread Antía Puentes
On sáb, 2016-09-03 at 01:20 +0200, Antía Puentes wrote:
> On vie, 2016-09-02 at 17:49 -0400, Ilia Mirkin wrote:
> > 
> > On Fri, Sep 2, 2016 at 5:40 PM, Antia Puentes 
> > wrote:
> > > 
> > > 
> > > - Fixes CTS tests:
> > > 
> > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-float
> > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-int
> > > * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint
> > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-float
> > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-int
> > > * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint
> > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-float
> > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-int
> > > * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint
> > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-float
> > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-int
> > > * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint
> > > 
> > > v1: (written by Dave Airlie) Always shift height images for
> > > levels.
> > > Fixed the CTS tests.
> > > 
> > > v2: Only shift height if the texture is not an 1D_ARRAY,
> > > it fixes assertion in GL44-CTS.texture_view.gettexparameter
> > > due to the original patch (Antia).
> > > 
> > > v3: Remove the loop. Do not shift height either for 1D textures.
> > > Use an explicit switch and add an assertion (levels == 0) for
> > > multisampled textures (Jason).
> > > 
> > > Signed-off-by: Dave Airlie 
> > > Signed-off-by: Antia Puentes 
> > > ---
> > >  src/mesa/drivers/dri/i965/intel_tex_image.c | 27
> > > +--
> > >  1 file changed, 21 insertions(+), 6 deletions(-)
> > > 
> > > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> > > b/src/mesa/drivers/dri/i965/intel_tex_image.c
> > > index 7affe08..cfcbf3c 100644
> > > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> > > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> > > @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct
> > > brw_context *brw,
> > > DBG("%s\n", __func__);
> > > 
> > > /* Figure out image dimensions at start level. */
> > > -   for (i = intelImage->base.Base.Level; i > 0; i--) {
> > > -  width <<= 1;
> > > -  if (height != 1)
> > > - height <<= 1;
> > > -  if (intelObj->base.Target == GL_TEXTURE_3D)
> > > - depth <<= 1;
> > > +   switch(intelObj->base.Target) {
> > > +   case GL_TEXTURE_2D_MULTISAMPLE:
> > > +   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
> > > +  assert(intelImage->base.Base.Level == 0);
> > > +  break;
> > > +   case GL_TEXTURE_3D:
> > > +  depth <<= intelImage->base.Base.Level;
> > > +  /* Fall through */
> > > +   case GL_TEXTURE_2D:
> > > +   case GL_TEXTURE_2D_ARRAY:
> > > +   case GL_TEXTURE_RECTANGLE:
> > FWIW, GL_TEXTURE_RECTANGLE can't have levels either. IMO you should
> > move it to the section above with the MS targets.

> True, thanks!. Apart from that, I think I should remove:
> > > +   case GL_TEXTURE_CUBE_MAP:
> and list the cube map faces instead because this is called from
> glTexImage*. I will send a new version of the patch after running the
> tests. Sorry for the noise.

Ok, I was wrong. 500 crashes in Piglit certified it.

> > > 
> > > +   case GL_TEXTURE_CUBE_MAP_ARRAY:
> > > +  height <<= intelImage->base.Base.Level;
> > > +  /* Fall through */
> > > +   case GL_TEXTURE_1D:
> > > +   case GL_TEXTURE_1D_ARRAY:
> > > +  width <<= intelImage->base.Base.Level;
> > > +  break;
> > > +   default:
> > > +  unreachable("Unexpected target");
> > > }
> > > 
> > > /* Guess a reasonable value for lastLevel.  This is probably
> > > going
> > > --
> > > 2.7.4
> > > 
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3] i965: Fix calculation of the image height at start level

2016-09-02 Thread Antía Puentes
On vie, 2016-09-02 at 17:49 -0400, Ilia Mirkin wrote:
> On Fri, Sep 2, 2016 at 5:40 PM, Antia Puentes 
> wrote:
> > 
> > - Fixes CTS tests:
> > 
> > * GL44-CTS.shader_image_size.advanced-nonMS-cs-float
> > * GL44-CTS.shader_image_size.advanced-nonMS-cs-int
> > * GL44-CTS.shader_image_size.advanced-nonMS-cs-uint
> > * GL44-CTS.shader_image_size.advanced-nonMS-gs-float
> > * GL44-CTS.shader_image_size.advanced-nonMS-gs-int
> > * GL44-CTS.shader_image_size.advanced-nonMS-gs-uint
> > * GL44-CTS.shader_image_size.advanced-nonMS-tes-float
> > * GL44-CTS.shader_image_size.advanced-nonMS-tes-int
> > * GL44-CTS.shader_image_size.advanced-nonMS-tes-uint
> > * GL44-CTS.shader_image_size.advanced-nonMS-vs-float
> > * GL44-CTS.shader_image_size.advanced-nonMS-vs-int
> > * GL44-CTS.shader_image_size.advanced-nonMS-vs-uint
> > 
> > v1: (written by Dave Airlie) Always shift height images for levels.
> > Fixed the CTS tests.
> > 
> > v2: Only shift height if the texture is not an 1D_ARRAY,
> > it fixes assertion in GL44-CTS.texture_view.gettexparameter
> > due to the original patch (Antia).
> > 
> > v3: Remove the loop. Do not shift height either for 1D textures.
> > Use an explicit switch and add an assertion (levels == 0) for
> > multisampled textures (Jason).
> > 
> > Signed-off-by: Dave Airlie 
> > Signed-off-by: Antia Puentes 
> > ---
> >  src/mesa/drivers/dri/i965/intel_tex_image.c | 27
> > +--
> >  1 file changed, 21 insertions(+), 6 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c
> > b/src/mesa/drivers/dri/i965/intel_tex_image.c
> > index 7affe08..cfcbf3c 100644
> > --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> > +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> > @@ -47,12 +47,27 @@ intel_miptree_create_for_teximage(struct
> > brw_context *brw,
> > DBG("%s\n", __func__);
> > 
> > /* Figure out image dimensions at start level. */
> > -   for (i = intelImage->base.Base.Level; i > 0; i--) {
> > -  width <<= 1;
> > -  if (height != 1)
> > - height <<= 1;
> > -  if (intelObj->base.Target == GL_TEXTURE_3D)
> > - depth <<= 1;
> > +   switch(intelObj->base.Target) {
> > +   case GL_TEXTURE_2D_MULTISAMPLE:
> > +   case GL_TEXTURE_2D_MULTISAMPLE_ARRAY:
> > +  assert(intelImage->base.Base.Level == 0);
> > +  break;
> > +   case GL_TEXTURE_3D:
> > +  depth <<= intelImage->base.Base.Level;
> > +  /* Fall through */
> > +   case GL_TEXTURE_2D:
> > +   case GL_TEXTURE_2D_ARRAY:
> > +   case GL_TEXTURE_RECTANGLE:
> FWIW, GL_TEXTURE_RECTANGLE can't have levels either. IMO you should
> move it to the section above with the MS targets.

True, thanks!. Apart from that, I think I should remove:

> > +   case GL_TEXTURE_CUBE_MAP:

and list the cube map faces instead because this is called from
glTexImage*. I will send a new version of the patch after running the
tests. Sorry for the noise.

> > +   case GL_TEXTURE_CUBE_MAP_ARRAY:
> > +  height <<= intelImage->base.Base.Level;
> > +  /* Fall through */
> > +   case GL_TEXTURE_1D:
> > +   case GL_TEXTURE_1D_ARRAY:
> > +  width <<= intelImage->base.Base.Level;
> > +  break;
> > +   default:
> > +  unreachable("Unexpected target");
> > }
> > 
> > /* Guess a reasonable value for lastLevel.  This is probably
> > going
> > --
> > 2.7.4
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: report bound buffer size not underlying buffer size for image size (v2)

2016-08-23 Thread Antía Puentes
Hi Dave,

The "GL44-CTS.shader_image_size.advanced-nonMS-fs-int" test fails for
me with current master (which contains this patch), I have tested it
both in a Broadwell and a Skylake machine.

I recall that you sent to the mailing list a second patch "i965: don't
fail to shift height images for levels." related to the test. Is this
patch or a new version of it still needed?. I remember that as it is it
regressed GL44-CTS.texture_view.gettexparameter, raising the assertion:
"glcts: intel_mipmap_tree.c:368: intel_miptree_create_layout: Assertion
`height0 = 1' failed."

On a side note, as we are taking about CTS, just remind that Andres has
reviewed the "mesa/subroutines: start adding per-context subroutine
index support" series that you sent, which is not pushed yet.

Regards.

On mar, 2016-08-23 at 11:10 +1000, Dave Airlie wrote:
> ping, esp Igalia CTS ppl.
> 
> Dave.
> 
> On 27 May 2016 at 15:11, Dave Airlie  wrote:
> > 
> > From: Dave Airlie 
> > 
> > This seems to make sense, the image is bound to a subset of the
> > buffer
> > so the image size should be from the bound size not the underlying
> > object.
> > 
> > This fixes:
> > GL44-CTS.shader_image_size.advanced-nonMS-fs-int
> > 
> > v2: get mininum of the two values, same as we write to the hw.
> > 
> > Signed-off-by: Dave Airlie 
> > ---
> >  src/mesa/drivers/dri/i965/brw_wm_surface_state.c | 5 +++--
> >  1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > index b73d5d5..ee0b9c2 100644
> > --- a/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > +++ b/src/mesa/drivers/dri/i965/brw_wm_surface_state.c
> > @@ -1227,10 +1227,11 @@ update_buffer_image_param(struct
> > brw_context *brw,
> >    struct brw_image_param *param)
> >  {
> > struct gl_buffer_object *obj = u->TexObj->BufferObject;
> > -
> > +   uint32_t size;
> > update_default_image_param(brw, u, surface_idx, param);
> > 
> > -   param->size[0] = obj->Size / _mesa_get_format_bytes(u-
> > >_ActualFormat);
> > +   size = MIN2((uint32_t)u->TexObj->BufferSize, obj->Size);
> > +   param->size[0] = size / _mesa_get_format_bytes(u-
> > >_ActualFormat);
> > param->stride[0] = _mesa_get_format_bytes(u->_ActualFormat);
> >  }
> > 
> > --
> > 2.5.5
> > 
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Fix uf10_to_f32() scale factor in the E == 0 and M != 0 case.

2016-08-16 Thread Antía Puentes
The patch looks good to me.

Reviewed-by: Antia Puentes 

On lun, 2016-08-15 at 23:50 -0700, Kenneth Graunke wrote:
> GL_EXT_packed_float, 2.1.B Unsigned 10-Bit Floating-Point Numbers:
> 
> 0.0,  if E == 0 and M == 0,
> 2^-14 * (M / 32), if E == 0 and M != 0,
> 2^(E-15) * (1 + M/32),if 0 < E < 31,
> INF,  if E == 31 and M == 0, or
> NaN,  if E == 31 and M != 0,
> 
> In the second case (E == 0 and M != 0), we were multiplying the
> mantissa
> by 2^-20, when we should have been multiplying by 2^-19 (which is
> 2^(-14 + -5), or 2^-14 * 2^-5, or 2^-14 / 32).
> 
> The previous section defines the formula for 11-bit numbers, which
> is:
> 
> 2^-14 * (M / 64), if E == 0 and M != 0,
> 
> In other words, we had accidentally copy and pasted the 11-bit code
> to the 10-bit case, and neglected to change the exponent.
> 
> Fixes dEQP-GLES3.functional.pbo.renderbuffer.r11f_g11f_b10f_triangles
> when run with surface dimensions of 1536x1152 or 1920x1080.
> 
> References: https://code.google.com/p/chrome-os-partner/issues/detail
> ?id=56244
> Signed-off-by: Kenneth Graunke 
> Reviewed-by: Stephane Marchesin 
> ---
>  src/util/format_r11g11b10f.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/util/format_r11g11b10f.h
> b/src/util/format_r11g11b10f.h
> index c9e967c..f6cd4ac 100644
> --- a/src/util/format_r11g11b10f.h
> +++ b/src/util/format_r11g11b10f.h
> @@ -191,7 +191,7 @@ static inline float uf10_to_f32(uint16_t val)
>  
> if (exponent == 0) {
>    if (mantissa != 0) {
> - const float scale = 1.0 / (1 << 20);
> + const float scale = 1.0 / (1 << 19);
>   f32.f = scale * mantissa;
>    }
> } else if (exponent == 31) {
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa/compute: move compute checks around for tests.

2016-08-05 Thread Antía Puentes
Hi!

I have taken a look to the GL44-CTS.compute_shader.api-indirect test
and I agree with Piñeiro that this test is not correct. A call to
glUseProgram to set the active program is missing in the test.

The test passes invalid values as parameters to
glDispatchComputeIndirect
like,
glDispatchComputeIndirect(-2);

and checks if the error returned by the the implementation is the one
expected.

However, as the test does not set the active program, Mesa returns the
INVALID_OPERATION ("no active compute shader") error instead of the one
expected by the test but, this can not be considered as a nonconformity
because implementations can choose which error generate when several
error conditions are met. For that reason the test is invalid unless it
sets the active program.

Adding the missing glUseProgram call to the test, Mesa returns the
expected errors and the test passes in i965. Notice that the lack of an
active compute shader when calling to glDispatchComputeIndirect is
already tested by a different test:
GL44-CTS.compute_shader.api-no-active-program.

On jue, 2016-05-05 at 09:51 +0200, Alejandro Piñeiro wrote:
> So as far as I understand, on that test there is no active program
> and
> indirect length is wrong, and fails because it was expecting the
> second
> error. Is that right?
> 
> Unless Im wrong, when the OpenGL spec specifies the Error cases, it
> doesn't specify any kind of priority (which error should be raised
> first
> if more of one condition happens).
> 
> Im somewhat biased to think that it is a problem with the test. What
> would happen if a new test is written to check that INVALID_OPERATION
> is
> generated when no active program, and it uses a wrong length? IMHO,
> if
> the test is setting two wrong conditions, it should check for the two
> possible errors.
> 
> On 05/05/16 02:41, Dave Airlie wrote:
> > 
> > From: Dave Airlie 
> > 
> > This fixes GL43-CTS.compute_shader.api-indirect
> > which tests the length/4 before anything else.
> > 
> > Signed-off-by: Dave Airlie 
> > ---
> >  src/mesa/main/api_validate.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/src/mesa/main/api_validate.c
> > b/src/mesa/main/api_validate.c
> > index 688408f..d455f19 100644
> > --- a/src/mesa/main/api_validate.c
> > +++ b/src/mesa/main/api_validate.c
> > @@ -1109,9 +1109,6 @@ valid_dispatch_indirect(struct gl_context
> > *ctx,
> >  {
> > GLintptr end = (GLintptr)indirect + size;
> >  
> > -   if (!check_valid_to_compute(ctx, name))
> > -  return GL_FALSE;
> > -
> > /* From the OpenGL 4.3 Core Specification, Chapter 19, Compute
> > Shaders:
> >  *
> >  * "An INVALID_VALUE error is generated if indirect is negative
> > or is not a
> > @@ -1153,6 +1150,9 @@ valid_dispatch_indirect(struct gl_context
> > *ctx,
> >    return GL_FALSE;
> > }
> >  
> > +   if (!check_valid_to_compute(ctx, name))
> > +  return GL_FALSE;
> > +
> > return GL_TRUE;
> >  }
> >  
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/copyimage: report INVALID_VALUE for missing cube array

2016-06-02 Thread Antía Puentes
On jue, 2016-06-02 at 14:25 +1000, Dave Airlie wrote:
> From: Dave Airlie 
> 
> The specs says INVALID_VALUE for exceeding dimensions,
> which is really what is happening here.
> 
> This fixes:
> GL45-CTS.copy_image.non_existent_mipmap
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/mesa/main/copyimage.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/copyimage.c b/src/mesa/main/copyimage.c
> index 6aa6bcb..7e5df61 100644
> --- a/src/mesa/main/copyimage.c
> +++ b/src/mesa/main/copyimage.c
> @@ -180,7 +180,7 @@ prepare_target(struct gl_context *ctx, GLuint
> name, GLenum target,
>   for (i = 0; i < depth; i++) {
>  if (!texObj->Image[z+i][level]) {
> /* missing cube face */
> -   _mesa_error(ctx, GL_INVALID_OPERATION,
> +   _mesa_error(ctx, GL_INVALID_VALUE,
> "glCopyImageSubData(missing cube face)");

I would say "... missing cube face" in the commit message as this is
the error you are returning here. With this change,

Reviewed-by: Antia Puentes 

> return false;
>  }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa/shader_query: add missing subroutines cases

2016-05-08 Thread Antía Puentes
LGTM:

Reviewed-by: Antia Puentes 


On vie, 2016-05-06 at 10:22 +1000, Dave Airlie wrote:
> From: Dave Airlie 
> 

> ARRAY_SIZE and LOCATION should accept the SUBROUTINE_UNIFORM types.
> 
> Fixes:
> GL43-CTS.program_interface_query.subroutines-vertex
> GL43-CTS.program_interface_query.subroutines-tess-control
> GL43-CTS.program_interface_query.subroutines-tess-eval
> GL43-CTS.program_interface_query.subroutines-geometry
> GL43-CTS.program_interface_query.subroutines-fragment
> GL43-CTS.program_interface_query.subroutines-compute
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/mesa/main/shader_query.cpp | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/src/mesa/main/shader_query.cpp
> b/src/mesa/main/shader_query.cpp
> index 020990a..c8c3df4 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -1125,6 +1125,13 @@ _mesa_program_resource_prop(struct
> gl_shader_program *shProg,
>    switch (res->Type) {
>    case GL_UNIFORM:
>    case GL_BUFFER_VARIABLE:
> +  case GL_VERTEX_SUBROUTINE_UNIFORM:
> +  case GL_GEOMETRY_SUBROUTINE_UNIFORM:
> +  case GL_FRAGMENT_SUBROUTINE_UNIFORM:
> +  case GL_COMPUTE_SUBROUTINE_UNIFORM:
> +  case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
> +  case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
> +
>   /* Test if a buffer variable is an array or an unsized
> array.
>    * Unsized arrays return zero as array size.
>    */
> @@ -1207,6 +1214,12 @@ _mesa_program_resource_prop(struct
> gl_shader_program *shProg,
> case GL_LOCATION:
>    switch (res->Type) {
>    case GL_UNIFORM:
> +  case GL_VERTEX_SUBROUTINE_UNIFORM:
> +  case GL_GEOMETRY_SUBROUTINE_UNIFORM:
> +  case GL_FRAGMENT_SUBROUTINE_UNIFORM:
> +  case GL_COMPUTE_SUBROUTINE_UNIFORM:
> +  case GL_TESS_CONTROL_SUBROUTINE_UNIFORM:
> +  case GL_TESS_EVALUATION_SUBROUTINE_UNIFORM:
>    case GL_PROGRAM_INPUT:
>    case GL_PROGRAM_OUTPUT:
>   *val = program_resource_location(shProg, res,
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: fix block iterator to not forget fxn->end_block

2016-05-08 Thread Antía Puentes
Hi Rob,

On jue, 2016-05-05 at 14:40 -0400, Rob Clark wrote:
> From: Rob Clark 
> 
> With the switch to new block iterator macro, we silently stopped
> iterating over the end-block.  Which caused nir_index_blocks() to not
> index the end-block.  Resulting in funny looking nir_print's like:
> 
> impl main {
>   block block_0:
>   /* preds: */
>   intrinsic copy_var () (gl_FragColor@out-temp, gl_Color) ()
>   intrinsic copy_var () (gl_FragColor, gl_FragColor@out-temp) ()
>   /* succs: block_0 */
>   block block_0:
> }

I had noticed this as well. The patch looks good to me:

Reviewed-by: Antia Puentes 

> I don't *think* there are any more serious consequences of not
> iterating
> the end_block, but it's probably also a good idea to not subtly
> change
> behavior compared to the old fxn-callback based iterator.
> 
> Signed-off-by: Rob Clark 
> ---
>  src/compiler/nir/nir.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/src/compiler/nir/nir.c b/src/compiler/nir/nir.c
> index 867a43c..1f51b31 100644
> --- a/src/compiler/nir/nir.c
> +++ b/src/compiler/nir/nir.c
> @@ -1512,12 +1512,18 @@ nir_block_cf_tree_next(nir_block *block)
>    return NULL;
> }
>  
> -   nir_cf_node *cf_next = nir_cf_node_next(&block->cf_node);
> +   nir_cf_node *parent = block->cf_node.parent;
> +   nir_cf_node *cf_next;
> +
> +   if ((parent->type == nir_cf_node_function) &&
> +   (block == nir_cf_node_as_function(parent)->end_block))
> +  cf_next = NULL;
> +   else
> +  cf_next = nir_cf_node_next(&block->cf_node);
> +
> if (cf_next)
>    return nir_cf_node_cf_tree_first(cf_next);
>  
> -   nir_cf_node *parent = block->cf_node.parent;
> -
> switch (parent->type) {
> case nir_cf_node_if: {
>    /* Are we at the end of the if? Go to the beginning of the
> else */
> @@ -1532,9 +1538,12 @@ nir_block_cf_tree_next(nir_block *block)
> case nir_cf_node_loop:
>    return nir_cf_node_as_block(nir_cf_node_next(parent));
>  
> -   case nir_cf_node_function:
> +   case nir_cf_node_function: {
> +  nir_function_impl *func = nir_cf_node_as_function(parent);
> +  if (func->end_block != block)
> + return func->end_block;
>    return NULL;
> -
> +   }
> default:
>    unreachable("unknown cf node type");
> }
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/15] i965: Configure how to store *64*PASSTHRU vertex components

2016-05-04 Thread Antía Puentes
Hi Kenneth,

thanks for reviewing.

On mié, 2016-05-04 at 03:36 -0700, Kenneth Graunke wrote:
> On Thursday, April 28, 2016 1:40:32 PM PDT Antia Puentes wrote:
> > 
> > From the Broadwell specification, structure VERTEX_ELEMENT_STATE
> > description:
> > 
> >    "When SourceElementFormat is set to one of the *64*_PASSTHRU
> > formats,  64-bit components are stored in the URB without any
> > conversion. In this case, vertex elements must be written as 128
> > or 256 bits, with VFCOMP_STORE_0 being used to pad the output
> > as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red 
> component into
> > 
> > the URB, Component 1 must be specified as VFCOMP_STORE_0 (with
> > Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128-bit
> > vertex element, or Components 1-3 must be specified as VFCOMP_STORE_0
> > in order to output a 256-bit vertex element. Likewise, use of
> > R64G64B64_PASSTHRU requires Component 3 to be specified as 
> VFCOMP_STORE_0
> > 
> > in order to output a 256-bit vertex element."
> > 
> > Uses 128-bits to write double and dvec2 vertex elements, and 256-bits for
> > dvec3 and dvec4 vertex elements.
> > 
> > Signed-off-by: Juan A. Suarez Romero 
> > Signed-off-by: Antia Puentes 
> > ---
> >  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34 +
> +++
> > 
> >  1 file changed, 34 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c b/src/mesa/
> drivers/dri/i965/gen8_draw_upload.c
> > 
> > index fe5ed35..14f7091 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > @@ -217,6 +217,40 @@ gen8_emit_vertices(struct brw_context *brw)
> >   break;
> >    }
> >  
> > +  /* From the BDW PRM, Volume 2d, page 586 (VERTEX_ELEMENT_STATE):
> > +   *
> > +   * "When SourceElementFormat is set to one of the *64*_PASSTHRU
> > +   * formats, 64-bit components are stored in the URB without any
> > +   * conversion. In this case, vertex elements must be written as 
> 128
> > 
> > +   * or 256 bits, with VFCOMP_STORE_0 being used to pad the output
> > +   * as required. E.g., if R64_PASSTHRU is used to copy a 64-bit 
> Red
> > 
> > +   * component into the URB, Component 1 must be specified as
> > +   * VFCOMP_STORE_0 (with Components 2,3 set to VFCOMP_NOSTORE)
> > +   * in order to output a 128-bit vertex element, or Components 
> > 1-3 
> must
> > 
> > +   * be specified as VFCOMP_STORE_0 in order to output a 256-bit 
> vertex
> > 
> > +   * element. Likewise, use of R64G64B64_PASSTHRU requires 
> Component 3
> > 
> > +   * to be specified as VFCOMP_STORE_0 in order to output a 
> > 256-bit 
> vertex
> > 
> > +   * element."
> > +   */
> The above comment seems to indicate that R64 needs component 1 set to
> STORE_0, and that looks to be missing.  I'm guessing you need to add:
> 
> > 
> > +  if (input->glarray->Doubles) {
> > + switch (input->glarray->Size) {
> > + case 0:
>    comp0 = BRW_VE1_COMPONENT_STORE_0;
>    /* fallthrough */
> 
> > 
> > + case 1:
>    comp1 = BRW_VE1_COMPONENT_STORE_0;
>    /* fallthrough */
> 

I have not added them because comp0 and comp1 are initialized in the
code immediately above the switch I added. That already existing code
sets the default values for the components:

switch (input->glarray->Size) {
  case 0: comp0 = BRW_VE1_COMPONENT_STORE_0;
  case 1: comp1 = BRW_VE1_COMPONENT_STORE_0;
  case 2: comp2 = BRW_VE1_COMPONENT_STORE_0;
  case 3: comp3 = input->glarray->Integer ?
   BRW_VE1_COMPONENT_STORE_1_INT
 : BRW_VE1_COMPONENT_STORE_1_FLT;
 break;
  }

In my switch I just override the values that are not right for the
double-precision floating point cases.

> > + case 2:
> > +/*  Use 128-bits instead of 256-bits to write double and dvec2
> > + *  vertex elements.
> > + */
> > +comp2 = BRW_VE1_COMPONENT_NOSTORE;
> > +comp3 = BRW_VE1_COMPONENT_NOSTORE;
> > +break;
> > + case 3:
> > +/* Pad the output using VFCOMP_STORE_0 as suggested
> > + * by the BDW PRM.
> > + */
> > +comp3 = BRW_VE1_COMPONENT_STORE_0;
> > + }
> > +  }
> > +
> >    OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) |
> >  GEN6_VE0_VALID |
> >  (format << BRW_VE0_FORMAT_SHIFT) |
> > 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 02/15] i965: Configure how to store *64*PASSTHRU vertex components

2016-04-29 Thread Antía Puentes
On jue, 2016-04-28 at 15:16 +0200, Ian Romanick wrote:
> On 04/28/2016 01:40 PM, Antia Puentes wrote:
> > From the Broadwell specification, structure VERTEX_ELEMENT_STATE
> > description:
> > 
> >"When SourceElementFormat is set to one of the *64*_PASSTHRU
> > formats,  64-bit components are stored in the URB without any
> > conversion. In this case, vertex elements must be written as
> > 128
> > or 256 bits, with VFCOMP_STORE_0 being used to pad the output
> > as required. E.g., if R64_PASSTHRU is used to copy a 64-bit Red
> > component into
> > the URB, Component 1 must be specified as VFCOMP_STORE_0 (with
> > Components 2,3 set to VFCOMP_NOSTORE) in order to output a 128
> > -bit
> > vertex element, or Components 1-3 must be specified as
> > VFCOMP_STORE_0
> > in order to output a 256-bit vertex element. Likewise, use of
> > R64G64B64_PASSTHRU requires Component 3 to be specified as
> > VFCOMP_STORE_0
> > in order to output a 256-bit vertex element."
> > 
> > Uses 128-bits to write double and dvec2 vertex elements, and 256
> > -bits for
> > dvec3 and dvec4 vertex elements.
> > 
> > Signed-off-by: Juan A. Suarez Romero 
> > Signed-off-by: Antia Puentes 
> > ---
> >  src/mesa/drivers/dri/i965/gen8_draw_upload.c | 34
> > 
> >  1 file changed, 34 insertions(+)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > index fe5ed35..14f7091 100644
> > --- a/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > +++ b/src/mesa/drivers/dri/i965/gen8_draw_upload.c
> > @@ -217,6 +217,40 @@ gen8_emit_vertices(struct brw_context *brw)
> >   break;
> >}
> >  
> > +  /* From the BDW PRM, Volume 2d, page 586
> > (VERTEX_ELEMENT_STATE):
> > +   *
> > +   * "When SourceElementFormat is set to one of the
> > *64*_PASSTHRU
> > +   * formats, 64-bit components are stored in the URB
> > without any
> > +   * conversion. In this case, vertex elements must be
> > written as 128
> > +   * or 256 bits, with VFCOMP_STORE_0 being used to pad
> > the output
> > +   * as required. E.g., if R64_PASSTHRU is used to copy a
> > 64-bit Red
> > +   * component into the URB, Component 1 must be specified
> > as
> > +   * VFCOMP_STORE_0 (with Components 2,3 set to
> > VFCOMP_NOSTORE)
> > +   * in order to output a 128-bit vertex element, or
> > Components 1-3 must
> > +   * be specified as VFCOMP_STORE_0 in order to output a
> > 256-bit vertex
> > +   * element. Likewise, use of R64G64B64_PASSTHRU requires
> > Component 3
> > +   * to be specified as VFCOMP_STORE_0 in order to output
> > a 256-bit vertex
> > +   * element."
> > +   */
> > +  if (input->glarray->Doubles) {
> > + switch (input->glarray->Size) {
> > + case 0:
> > + case 1:
> > + case 2:
> > +/*  Use 128-bits instead of 256-bits to write double
> > and dvec2
> > + *  vertex elements.
> > + */
> > +comp2 = BRW_VE1_COMPONENT_NOSTORE;
> > +comp3 = BRW_VE1_COMPONENT_NOSTORE;
> > +break;
> > + case 3:
> > +/* Pad the output using VFCOMP_STORE_0 as suggested
> > + * by the BDW PRM.
> > + */
> > +comp3 = BRW_VE1_COMPONENT_STORE_0;
> 
> I think this would be better as
> 
>   if (input->glarray->Size < 3) {
>  ...
>   } else {
>  ...
>   }

It should be then:

if (input->glarray->Size < 3) {
...
} else if (input->glarray->Size == 3) {
...
}

because input->glarray->Size can be 4.


> At the very least... there should be a "break" at the end of the 3
> case.
>  It's not strictly necessary, but my eye keeps noticing that it's not
> there.
> 
> > + }
> > +  }
> > +
> >OUT_BATCH((input->buffer << GEN6_VE0_INDEX_SHIFT) |
> >  GEN6_VE0_VALID |
> >  (format << BRW_VE0_FORMAT_SHIFT) |
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 15/15] i965: Expose OpenGL 4.1 for gen8+

2016-04-28 Thread Antía Puentes
On jue, 2016-04-28 at 15:11 +0200, Ian Romanick wrote:
> I think we should bump all the way to 4.2 because all of the features
> specific to 4.2 have been done for quite some time.

True.


> On 04/28/2016 01:40 PM, Antia Puentes wrote:
> > From: Alejandro Piñeiro 
> > 
> > ARB_vertex_attrib_64bit was the only feature missing.
> > ---
> >  src/mesa/drivers/dri/i965/intel_extensions.c | 2 +-
> >  src/mesa/drivers/dri/i965/intel_screen.c | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
> > b/src/mesa/drivers/dri/i965/intel_extensions.c
> > index d3905d0..0361d42 100644
> > --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> > +++ b/src/mesa/drivers/dri/i965/intel_extensions.c
> > @@ -268,7 +268,7 @@ intelInitExtensions(struct gl_context *ctx)
> > ctx->Extensions.OES_texture_half_float_linear = true;
> >  
> > if (brw->gen >= 8)
> > -  ctx->Const.GLSLVersion = 400;
> > +  ctx->Const.GLSLVersion = 410;
> > else if (brw->gen >= 6)
> >ctx->Const.GLSLVersion = 330;
> > else
> > diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
> > b/src/mesa/drivers/dri/i965/intel_screen.c
> > index 6f685b1..f2ec450 100644
> > --- a/src/mesa/drivers/dri/i965/intel_screen.c
> > +++ b/src/mesa/drivers/dri/i965/intel_screen.c
> > @@ -1386,7 +1386,7 @@ set_max_gl_versions(struct intel_screen
> > *screen)
> > switch (screen->devinfo->gen) {
> > case 9:
> > case 8:
> > -  psp->max_gl_core_version = 40;
> > +  psp->max_gl_core_version = 41;
> >psp->max_gl_compat_version = 30;
> >psp->max_gl_es1_version = 11;
> >psp->max_gl_es2_version = 31;
> > 
> 
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Add missing break into switch in construct_value()

2016-04-22 Thread Antía Puentes
On vie, 2016-04-22 at 09:45 +0300, Juha-Pekka Heikkila wrote:
> There seemed to be missing one break in nested switchcases.
> 
> Signed-off-by: Juha-Pekka Heikkila 
> ---
>  src/compiler/nir/nir_search.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/src/compiler/nir/nir_search.c
> b/src/compiler/nir/nir_search.c
> index dc53a90..df80c0f 100644
> --- a/src/compiler/nir/nir_search.c
> +++ b/src/compiler/nir/nir_search.c
> @@ -521,6 +521,7 @@ construct_value(const nir_search_value *value,
>   default:
>  unreachable("unknown bit size");
>   }
> + break;
>  
>case nir_type_bool32:
>   load->value.u32[0] = c->data.u;


Reviewed-by: Antia Puentes 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Patchwork admin rights

2015-11-13 Thread Antía Puentes
On vie, 2015-11-13 at 13:26 +0100, Samuel Iglesias Gonsálvez wrote:
> Hello,
> 
> I would like to have admin permissions to Mesa and Piglit projects in
> patchwork [0] to change the status of patches that are mine but they
> are not assigned to me.

As Samuel, I would like to have admin permissions in patchwork for Mesa
and Piglit,
for the same reasons that he commented.

Thanks.

> I saw in previous emails than just asking for it here is enough. If I
> need to create a bug report in bugzilla, just tell me to do so.
> 
> Thanks,
> 
> Sam
> 
> [0] http://patchwork.freedesktop.org/


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Intent to work on support for ARB_internalformat_query2

2015-10-27 Thread Antía Puentes
Hi! Chris,

On mar, 2015-10-27 at 21:45 +1300, Chris Forbes wrote:
> Presumably ARB_internalformat_query2?

Eduardo meant ARB_internalformat_query2, indeed.

> On Tue, Oct 27, 2015 at 9:31 PM, Eduardo Lima Mitev 
> wrote:
> Hello,
> 
> This is an announcement that a few folks at Igalia team are
> planning to
> work on adding support for EXT_internalformat_query2 extension
> to Mesa.
> 
> If somebody had started work on this already, or has any input
> that is
> relevant to the implementation, we would be very thankful to
> hear about it.
> 
> I just filed a bug [1] to track progress. Feel free to add
> your comments
> there too.
> 
> That's all!
> 
> cheers,
> Eduardo
> 
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=92687
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Plan to work on opt_peephole_sel optimization for vec4

2015-10-13 Thread Antía Puentes
On lun, 2015-10-12 at 15:55 +0300, Francisco Jerez wrote:
> Antía Puentes  writes:
> > On dom, 2015-10-11 at 10:35 -0700, Matt Turner wrote:
> >> I would expect big improvements in the vec4 backend from making its
> >> copy propagation pass global.
> >
> > I will be working on global copy propagation then.
> >
> 
> Heh...  I suggest you hold yourself from working on this for a while.
> 
> While working on some fragment discard improvements last week I noticed
> that a seemingly harmless change in the CFG could cause shitloads of
> shader-db regressions (>8k).  The reason is that a bunch of optimization
> passes of the compiler back-end are fully local (including VEC4 copy
> propagation) and become ineffective anytime your dataflow happens to
> traverse edges of the CFG.
> 
> Instead of reimplementing the intricate dataflow propagation logic (e.g.
> what fs_copy_prop_dataflow does) in every pass (or rather every pass
> that needs to consider two or more instructions at once), I've given a
> shot at implementing use-def chains in the back-end (didn't have a
> bigger hammer at hand).  Use-def chains should simplify this sort of
> optimization passes massively and allow them to work globally without
> any additional effort: Instead of looping back the instruction list to
> find out whether the last instruction that wrote a register had some
> specific form (c.f. cmod_propagation, saturate_propagation,
> compute_to_mrf, which are all local and have O(n^2) complexity), you
> just look-up the use entry for the instruction you're interested in and
> look at its immediate neighbours in the use-def graph, which represent
> the instruction(s) that generated the variable.
> 
> Use-def chains can be implemented trivially on an SSA IR so as an added
> benefit it should become easier to port optimization passes to SSA once
> they've been rewritten in terms of use-def chains.
> 
> I've got an algorithm to construct the use-def graph basically working
> on the FS back-end (I'm not sharing it publicly yet though because in
> its current form it would likely offend some people's sensibilities -- I
> could send you a branch in private though if you're interested).  I
> haven't started porting other optimization passes to use it yet, but
> it's next on my to-do list.

Hi Francisco,

that sounds great. Do you intend to do the same for vec4? Otherwise, I
can try to port your work to vec4 once you are done with FS.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Plan to work on opt_peephole_sel optimization for vec4

2015-10-12 Thread Antía Puentes
Hi Matt,

thanks for your suggestions.

On dom, 2015-10-11 at 10:35 -0700, Matt Turner wrote:
> I don't believe it's valuable to port the opt_peephole_sel() pass to
> the vec4 backend. With NIR (since NIR essentially performs the same
> optimization), the opt_peephole_sel() pass only improves code
> generation on Gen < 7 (where we use MRFs) in cases like
> 
> (+f0) if
> mov m1, ...
> mov m2, ...
> send ..., m1
> else
> mov m1, ...
> mov m2, ...
> send ..., m1
> endif
> 
> where the two sends are texture operations using the same coordinate
> but different samplers. On Gen7+ where send-from-GRF, the source
> payload for the two sends will be different registers and
> opt_peephole_sel() won't handle it.
> 
> That said, we still do send-from-MRF on all platforms in the vec4
> backend, but texturing is sufficiently rare in vertex shaders that I
> don't believe there's much to be gained.

Ok.

> I believe the most valuable optimization missing from the vec4 backend
> is global copy propagation (it currently only has local copy
> propagation). *Lots* of shaders would benefit from this even before we
> switched to NIR, and even more after.
> 
> While both fs/vec4 backends have opt_copy_propagate(), the fs
> backend's implementation is much better. It first performs local (that
> is, on each basic block separately) copy propagation. Second, it uses
> the values available at the end of each block to generate an analysis
> of the dataflow, and finally it reruns local copy propagation but this
> time with more data available.
> 
> I would expect big improvements in the vec4 backend from making its
> copy propagation pass global.

I will be working on global copy propagation then.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Plan to work on opt_peephole_sel optimization for vec4

2015-10-11 Thread Antía Puentes
As there are quite a lot of movement related to vec4 optimizations and I
would like to avoid overlapping, I am commenting here that I intend to
work on a version of the fs_visitor::opt_peephole_sel() for vec4.

Suggestions about other FS optimizations worth porting to the vec4
backend are welcome.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v5 70/70] glsl: Mark as active all elements of shared/std140 block arrays

2015-09-23 Thread Antía Puentes
Hi! Timothy,
thanks for your review.

Seeing your patch in:
http://lists.freedesktop.org/archives/mesa-dev/2015-September/095070.html

The condition in my patch:
(var->type->interface_packing != GLSL_INTERFACE_PACKING_PACKED)
should also be changed to:
(var->get_interface_type()->interface_packing !=
GLSL_INTERFACE_PACKING_PACKED)

Right?

On mié, 2015-09-23 at 23:44 +1000, Timothy Arceri wrote:
> On Wed, 2015-09-23 at 12:25 +0200, Samuel Iglesias Gonsálvez wrote:
> > 
> > On 21/09/15 13:11, Samuel Iglesias Gonsálvez wrote:
> > > On 21/09/15 09:41, Tapani Pälli wrote:
> > > > Seems like a nice fix, takes ES3 CTS failures from 116 to 64 on
> > > > Haswell
> > > > (most failing tests are with ubos).
> > > > 
> > > > Tested-by: Tapani Pälli 
> > > > 
> > > 
> > > OK thanks!
> > > 
> > > > This is individual patch not related to just SSBOs, maybe this
> > > > could be
> > > > pushed separately from the rest?
> > > > 
> > > 
> > > Yes, this patch can be pushed separately from the rest of patches
> > > of the
> > > series: we just need an R-b, as usual.
> > > 
> > > We are going to discuss the proper solution with Timothy [0], as he
> > > found that we are not covering one case.
> > > 
> > 
> > Timothy has sent a patch fixing the packed case [0] and he developed
> > a
> > piglit test for it [1].
> > 
> > We are going to wait for an R-b of Antía's patch before pushing it.
> 
> I sent a reply to this same email saying almost the same thing but it
> seems to have gone missing.
> 
> Anyway I also sent my r-b, this patch looks good to me.
> 
> Reviewed-by: Timothy Arceri 
> 
> 
> > 
> > Sam
> > 
> > [0]
> > http://lists.freedesktop.org/archives/mesa-dev/2015-September/095070.
> > html
> > [1] http://lists.freedesktop.org/archives/piglit/2015-September/01724
> > 7.html
> > 
> > > Sam
> > > 
> > > [0] https://bugs.freedesktop.org/show_bug.cgi?id=83508
> > > 
> > > 
> > > > // Tapani
> > > > 
> > > > On 09/10/2015 04:36 PM, Iago Toral Quiroga wrote:
> > > > > From: Antia Puentes 
> > > > > 
> > > > > Commit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140'
> > > > > blocks
> > > > > or block members) considered as active 'shared' and 'std140'
> > > > > uniform
> > > > > blocks and uniform block arrays, but did not include the block
> > > > > array
> > > > > elements. Because of that, it was possible to have an active
> > > > > uniform
> > > > > block array without any elements marked as used, making the
> > > > > assertion
> > > > > ((b->num_array_elements > 0) == b->type->is_array())
> > > > > in link_uniform_blocks() fail.
> > > > > 
> > > > > Fixes the following 5 dEQP tests:
> > > > > 
> > > > >   * dEQP
> > > > > -GLES3.functional.ubo.random.nested_structs_instance_arrays.18
> > > > >   * dEQP
> > > > > -GLES3.functional.ubo.random.nested_structs_instance_arrays.24
> > > > >   *
> > > > > dEQP
> > > > > -GLES3.functional.ubo.random.nested_structs_arrays_instance_arr
> > > > > ays.19
> > > > >   * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
> > > > >   * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36
> > > > > 
> > > > > Fixes bugzilla: 
> > > > > https://bugs.freedesktop.org/show_bug.cgi?id=83508
> > > > > ---
> > > > >   src/glsl/link_uniform_block_active_visitor.cpp | 23
> > > > > +++
> > > > >   1 file changed, 23 insertions(+)
> > > > > 
> > > > > diff --git a/src/glsl/link_uniform_block_active_visitor.cpp
> > > > > b/src/glsl/link_uniform_block_active_visitor.cpp
> > > > > index 5102947..fbe79de 100644
> > > > > --- a/src/glsl/link_uniform_block_active_visitor.cpp
> > > > > +++ b/src/glsl/link_uniform_block_active_visitor.cpp
> > > > > @@ -106,6 +106,22 @@
> > > > > link_uniform_block_active_visitor::visit(ir_variable *var)
> > > > >  assert(b->num_array_elements == 0);
> > > > >  assert(b->array_elements == NULL);
> > > > >  assert(b->type != NULL);
> > > > > +   assert(!b->type->is_array() || b->has_instance_name);
> > > > > +
> > > > > +   /* For uniform block arrays declared with a shared or
> > > > > std140 layout
> > > > > +* qualifier, mark all its instances as used.
> > > > > +*/
> > > > > +   if (b->type->is_array() && b->type->length > 0) {
> > > > > +  b->num_array_elements = b->type->length;
> > > > > +  b->array_elements = reralloc(this->mem_ctx,
> > > > > +   b->array_elements,
> > > > > +   unsigned,
> > > > > +   b->num_array_elements);
> > > > > +
> > > > > +  for (unsigned i = 0; i < b->num_array_elements; i++) {
> > > > > + b->array_elements[i] = i;
> > > > > +  }
> > > > > +   }
> > > > > 
> > > > >  return visit_continue;
> > > > >   }
> > > > > @@ -147,6 +163,13 @@
> > > > > link_uniform_block_active_visitor::visit_enter(ir_dereference_a
> > > > > rray *ir)
> > > > >  assert((b->num_array_elements == 0) == (b->array_elements
> > > > > == NULL));
> > > > >  assert(b->type != NULL);
> > > > > 

Re: [Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed

2015-09-22 Thread Antía Puentes
On mar, 2015-09-22 at 11:22 +0200, Antía Puentes wrote:
> I realized that if writemasking is also unsupported I should probably
> update the patch to avoid coalescing if the dst_writemask parameter is
> different from XYWZ. The condition in "vec4_instruction::can_reswizzle"
> would be something like:
> 
>   /* Gen6 MATH instructions can not execute in align16 mode, so swizzles
>* or writemasking are not allowed. */
>if (devinfo->gen == 6 && is_math() &&
>swizzle != BRW_SWIZZLE_XYZW &&
>dst_writemask != WRITEMASK_XYWZ)
>   return false;

Ups, I meant: (swizzle != BRW_SWIZZLE_XYZW || dst_writemask !=
WRITEMASK_XYWZ)
Sorry for the noise.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/vec4: Don't coalesce registers in gen6 math ops if reswizzling needed

2015-09-22 Thread Antía Puentes
On dom, 2015-09-20 at 11:11 -0700, Matt Turner wrote:
> On Sun, Sep 20, 2015 at 8:48 AM, Antia Puentes  wrote:
> > Math operations in SandyBridge do not support source swizzling
> 
> I can't find any documentation to support this claim, but I remember
> that SNB math must be in align1 mode so it can't do swizzles or
> writemasking (see commit e14cc504).

I realized that if writemasking is also unsupported I should probably
update the patch to avoid coalescing if the dst_writemask parameter is
different from XYWZ. The condition in "vec4_instruction::can_reswizzle"
would be something like:

  /* Gen6 MATH instructions can not execute in align16 mode, so swizzles
   * or writemasking are not allowed. */
   if (devinfo->gen == 6 && is_math() &&
   swizzle != BRW_SWIZZLE_XYZW &&
   dst_writemask != WRITEMASK_XYWZ)
  return false;

The writemask check is needed because the "vec4_instruction::reswizzle"
method, apart from modifying the swizzle, also updates the intruction's
writemask to the AND between the original writemask and the writemask of
the instruction we are trying to remove.

I will run Piglit and dEQP and send the updated patch.

> But, the documentation actually says "The supported regioning modes
> for math instruction are align16, align1 with the following
> restrictions: ..."
> 
> Would be nice if we could actually find a PRM citation.
> 
> > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033
> > ---
> >  src/mesa/drivers/dri/i965/brw_ir_vec4.h |  3 ++-
> >  src/mesa/drivers/dri/i965/brw_vec4.cpp  | 11 +--
> >  2 files changed, 11 insertions(+), 3 deletions(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_ir_vec4.h 
> > b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > index 966a410..a48bb68 100644
> > --- a/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > +++ b/src/mesa/drivers/dri/i965/brw_ir_vec4.h
> > @@ -175,7 +175,8 @@ public:
> >
> > bool is_send_from_grf();
> > unsigned regs_read(unsigned arg) const;
> > -   bool can_reswizzle(int dst_writemask, int swizzle, int swizzle_mask);
> > +   bool can_reswizzle(const struct brw_device_info *devinfo, int 
> > dst_writemask,
> > +  int swizzle, int swizzle_mask);
> > void reswizzle(int dst_writemask, int swizzle);
> > bool can_do_source_mods(const struct brw_device_info *devinfo);
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > index ed49cd3..d7192e4 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
> > @@ -941,10 +941,17 @@ vec4_visitor::opt_set_dependency_control()
> >  }
> >
> >  bool
> > -vec4_instruction::can_reswizzle(int dst_writemask,
> > +vec4_instruction::can_reswizzle(const struct brw_device_info *devinfo,
> > +int dst_writemask,
> >  int swizzle,
> >  int swizzle_mask)
> >  {
> > +
> 
> Extra new line.
> 
> > +   /* gen6 math instructions can not manage source swizzles */
> 
> If we can find documentation, we should update this comment. If not,
> let's change it to read
> 
> /* Gen6 MATH instructions can not execute in align16 mode, so swizzles
> are not allowed. */
> 
> (linewrapped as appropriate)
> 
> With those changes,
> 
> Reviewed-by: Matt Turner 
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 2/2] i965/vec4_nir: Load constants as integers

2015-08-25 Thread Antía Puentes
Hi! Jason,

On lun, 2015-08-24 at 10:26 -0700, Jason Ekstrand wrote:
> Could you please debase this patch (probably rewrite)? I *think* it
> should fix https://bugs.freedesktop.org/show_bug.cgi?id=91716.  What
> were the shader-db results for it?

This V2 of the patch is already rebased/rewritten taking into account
the changes made in load_const to use fewer moves, sorry for not having
said it explicitly.

Because for loading the constants as integers we necessarily need to fix
the saturation errors, we would be having Piglit regressions in:
"shaders/glsl-clamp-vertex-color.shader_test" and
"arb_color_buffer_float-render" tests otherwise, I have applied both
patches to collect the shader-db results.
  
Comparing master (commit 529acab22a3e21e0ed0c5243675aec6c0ee27e8f) and
the version obtained by applying patches "i965/vec4: Fix saturation
errors when coalescing registers" and "i965/vec4_nir: Load constants as
integers", we have:

* Shader-db results for vec4 on i965 (Haswell):

total instructions in shared programs: 6402199 -> 6402200 (0.00%)
instructions in affected programs: 81 -> 82 (1.23%)
helped:0
HURT:  1
GAINED:0
LOST:  0

Printing the vec4 instructions generated by the hurt program, we can see
that this extra instruction is the result of reaching a program where
the register coalescing is not possible because saturation is required
and the instruction we want to coalesce into is not a MOV (if it were a
MOV, we could safely coalesce using the type of the final register,
instead of the type of the register to be removed as it is currently
done). I am detailing only the pseudo-code for the component x for
brevity, the rest of the components would be equivalent:

In the master branch, the following vec4 instructions:
  and vgrf8.0.x:D, vgrf7.:D, 1065353216U
  mov vgrf9.0.x:D, vgrf8.:D
  mov.sat m4:F, vgrf9.xyzw:F

are reduced by opt_register_coalesce to: 
  and.sat m4.x:D, vgrf7.:D, 1065353216U
making the saturation modifier a no-op (type D).

With the patches applied, opt_register_coalesce does not coalesce vgrf9,
we have an extra mov but the saturation is effective:
  and vgrf9.0.x:D, vgrf7.:D, 1065353216U
  mov.sat m4:F, vgrf9.xyzw:F

> On Aug 24, 2015 4:51 AM, "Antia Puentes"  wrote:
> Loads constants using integer as their register type, this is
> done
> for consistency with the FS backend.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> 
> index 632e409..23b2fab 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -456,7 +456,7 @@ void
>  vec4_visitor::nir_emit_load_const(nir_load_const_instr
> *instr)
>  {
> dst_reg reg = dst_reg(GRF, alloc.allocate(1));
> -   reg.type =  BRW_REGISTER_TYPE_F;
> +   reg.type =  BRW_REGISTER_TYPE_D;
> 
> 
> unsigned remaining =
> brw_writemask_for_size(instr->def.num_components);
> 
> @@ -477,7 +477,7 @@
> vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
>}
> 
>reg.writemask = writemask;
> -  emit(MOV(reg, src_reg(instr->value.f[i])));
> +  emit(MOV(reg, src_reg(instr->value.i[i])));
> 
> 
>remaining &= ~writemask;
> }
> --
> 2.1.0
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Fix saturation when coalescing registers in NIR-vec4

2015-08-24 Thread Antía Puentes
Hi!

On vie, 2015-08-14 at 13:56 +0200, Antia Puentes wrote:
> With this fix, we can load the constants in NIR-vec4 as integers,
> as it is done in the FS backend.
> 
> This is related to:
> http://lists.freedesktop.org/archives/mesa-dev/2015-July/089899.html

I've been on holidays last week, as far I can read in:
http://lists.freedesktop.org/archives/mesa-dev/2015-August/092382.html
Matt Turner is working on a new version of the vec4 register coalescing.
 
This makes me wonder if it makes sense to consider now the patches I am
sending, because first, the code is going to change soon, and second, it
could be that Matt is already tackling the saturation errors in the new
register coalesce version.



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 00/78] i965: A new vec4 backend based on NIR

2015-08-04 Thread Antía Puentes
Hi! Jason,

On lun, 2015-08-03 at 17:25 -0700, Jason Ekstrand wrote:
> On Mon, Aug 3, 2015 at 9:51 AM, Jason Ekstrand  wrote:

> > I also pushed a version to our CI system that switched the
> > INTEL_USE_NIR default to true and things look pretty good over-all.
> > On ILK (gen5) and 965 (gen 4), there were a bunch of tests that failed
> > most of which seemed to be focussed around either variable indexing or
> > comparison operations.  It looks like it's probably the same problem
> > (or problems) on both gens, so once we get one fixed, the other will
> > most likely follow.  On gen8, it looks like integers are broken.
> >
> > I don't know if any of you have the hardware to solve those problems.
> > If not, someone from the Oregon team can look into it.
> 
> I went ahead and looked at some of these problems.  I just sent a
> short patch series that fixes all of the regressions vs. non-NIR on
> BDW+ and ILK-.  Some things I noticed while I was at it:
> 
> 1) You implemented nir_op_bany but not nir_op_ball.  Why is this?  Do
> we not have any piglit tests that hit nir_op_ball?

I did not implement nir_op_ball because, AFAICS, it is never emitted.
Again, AFAICS, the equivalent IR operation is also missing, probably
because the "all()" builtin function is implemented using
nir_binop_logic_ands. I can add it if you think it makes sense though.







___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v3 (part2) 55/56] glsl: Consider active all elements of a shared/std140 block array

2015-07-30 Thread Antía Puentes
Please, ignore this patch, we have recently sent an updated version to
the mailing list.
The new version is available in:
http://lists.freedesktop.org/archives/mesa-dev/2015-July/090108.html

Regards.

On mar, 2015-07-14 at 09:46 +0200, Iago Toral Quiroga wrote:
> From: Antia Puentes 
> 
> Commmit 1ca25ab (glsl: Do not eliminate 'shared' or 'std140'
> blocks or block members) considers active 'shared' and 'std140'
> uniform blocks and uniform block arrays but did not include the
> block array elements. It was possible to have an active uniform
> block array without any elements marked as used, making the
> assertion ((b->num_array_elements > 0) == b->type->is_array())
> in link_uniform_blocks fail.
> 
> Fixes the following 5 dEQP tests:
> 
>  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.18
>  * dEQP-GLES3.functional.ubo.random.nested_structs_instance_arrays.24
>  * dEQP-GLES3.functional.ubo.random.nested_structs_arrays_instance_arrays.19
>  * dEQP-GLES3.functional.ubo.random.all_per_block_buffers.49
>  * dEQP-GLES3.functional.ubo.random.all_shared_buffer.36
> ---
>  src/glsl/link_uniform_block_active_visitor.cpp | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/src/glsl/link_uniform_block_active_visitor.cpp 
> b/src/glsl/link_uniform_block_active_visitor.cpp
> index 5102947..37418c2 100644
> --- a/src/glsl/link_uniform_block_active_visitor.cpp
> +++ b/src/glsl/link_uniform_block_active_visitor.cpp
> @@ -106,6 +106,22 @@ link_uniform_block_active_visitor::visit(ir_variable 
> *var)
> assert(b->num_array_elements == 0);
> assert(b->array_elements == NULL);
> assert(b->type != NULL);
> +   assert(!b->type->is_array() || b->has_instance_name);
> +
> +   /* For uniform block arrays declared with a shared or std140 layout
> +* qualifier, mark all its instances as used.
> +*/
> +   if (b->type->is_array() && b->type->length > 0) {
> +  b->num_array_elements = b->type->length;
> +  b->array_elements = reralloc(this->mem_ctx,
> +   b->array_elements,
> +   unsigned,
> +   b->num_array_elements);
> +
> +  for (unsigned i = 0; i < b->num_array_elements; i++) {
> + b->array_elements[i] = i;
> +  }
> +   }
>  
> return visit_continue;
>  }
> @@ -147,6 +163,13 @@ 
> link_uniform_block_active_visitor::visit_enter(ir_dereference_array *ir)
> assert((b->num_array_elements == 0) == (b->array_elements == NULL));
> assert(b->type != NULL);
>  
> +   /* If the block array was declared with a shared or std140 layout 
> qualifier,
> +* all its instances have been already marked as used (see
> +* link_uniform_block_active_visitor::visit(ir_variable *) function).
> +*/
> +   if (var->type->interface_packing == GLSL_INTERFACE_PACKING_PACKED)
> +  return visit_continue;
> +
> ir_constant *c = ir->array_index->as_constant();
>  
> if (c) {


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 12/78] i965/nir/vec4: Implement load_const intrinsic

2015-07-27 Thread Antía Puentes
Hi! Jason,

On jue, 2015-07-23 at 15:46 -0700, Jason Ekstrand wrote:

> > @@ -332,7 +334,22 @@ vec4_visitor::nir_emit_instr(nir_instr *instr)
> >  void
> >  vec4_visitor::nir_emit_load_const(nir_load_const_instr *instr)
> >  {
> > -   /* @TODO: Not yet implemented */
> > +   dst_reg reg = dst_reg(GRF, alloc.allocate(1));
> > +   reg.type =  BRW_REGISTER_TYPE_F;
> > +
> > +   /* @FIXME: consider emitting vector operations to save some MOVs in
> > +* cases where the components are representable in 8 bits.
> > +* By now, we emit a MOV for each component.
> > +*/
> > +   for (unsigned i = 0; i < instr->def.num_components; ++i) {
> > +  reg.writemask = 1 << i;
> > +  emit(MOV(reg, src_reg(instr->value.f[i])));
> 
> In the FS, we use integers here.  I don't know that using floats is a
> problem, but we should probably be consistent.

We are using floats as the default type when loading constants, because
otherwise we have a number of regressions in Piglit that happen because
clamping is not done when it is required. This lack of clamping is
related to the vec4_visitor:opt_register_coalesce optimization relaying
on the constants being loaded with their real type, this is in effect
how the IR-vec4 pass works.

Let me explain it with an example. Imagine that we use integers as the
default type in load_const as you suggest, and we have in our shader's
code the following instruction:

gl_FrontColor = vec4(-2, -1, 0.5, 3);

This is translated into the following vec4 instructions:

/* Load constant */
mov vgrf10.0.x:D, -1073741824D
mov vgrf10.0.y:D, -1082130432D
mov vgrf10.0.z:D, 1056964608D
mov vgrf10.0.w:D, 1077936128D

/* Store the value */
mov.sat m4:F, vgrf10.xyzw:F

The error comes when that bunch of instructions is translated by the
opt_register_coalesce as:
mov.sat m4.x:D, -1073741824D
mov.sat m4.y:D, -1082130432D
mov.sat m4.z:D, 1056964608D
mov.sat m4.w:D, 1077936128D

As you can see, although the instruction saturation is set, nothing will
be done because the operands are integers. I see a number of ways to fix
this:

1) Provide the constant's type information in nir_load_const_instr and
set the real type when loading the constant.

2) Set the type always to float and rely on receiving the correct value
for the instruction->saturation, so the clamping is only done when
needed.

3) Modify the opt_register_coalesce optimization to preserve the types
of the instruction containing the final destination, if the instruction
saturation is true.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 39/78] i965/nir/vec4: Add swizzle utility method for vector ops

2015-07-03 Thread Antía Puentes
Hi Jason,

On mar, 2015-06-30 at 14:18 -0700, Jason Ekstrand wrote:
> On Fri, Jun 26, 2015 at 1:06 AM, Eduardo Lima Mitev  wrote:
> > From: Antia Puentes 
> >
> > For operations that have a predefined operand size > 0, defined in
> > glsl/nir/nir_opcodes.c, NIR returns a swizzle containing zeros in the
> > components from outside the source vector. However, the driver
> > expects those components to have a swizzle value equal to the swizzle
> > of the component in (number of vector elements - 1). For example,
> >
> >for a vec2 operation with an identity swizzle (.xy):
> >   - (NIR -> XYXX, driver ->X)
> >
> >for a vec3 operation with swizzle (.zxy)
> >   - (NIR-> ZXYX, driver -> ZXYY)
> 
> Why is this needed.  Was there some bug you ran into regarding this or
> are you just trying to make the generated code match?  If the
> component isn't used, then it seems to me like the swizzle shouldn't
> matter.  NIR may give you bogus swizzles outside of the enabled
> channels but it may do that regardless.  I'm also not seeing how
> composing swizzles fixes anything.

There were several bugs in our NIR->vec4 code path related to NIR
returning bogus swizzles outside the components of a vector. When doing
an "any", "equal" or "notEqual" operation on vectors, the result of the
instruction "cmp" is predicated using "all4h" or "any4h" depending on
the operation. The results were incorrect for vectors with less than 4
elements because: (1) I was not setting the writemask of the "cmp"
operation according to the vector size and (2) NIR returns bogus
swizzles for components outside the vector. As a consequence, we had
spurious component-wise comparisons that affected the result of the
predication.

As composing the operand swizzle with the identity swizzle-per-size
means to repeat the value of the swizzle of the last vector-element on
the components outside the vector, the error mentioned above does not
happen because to repeat the last (inside-boundaries) component-wise
comparison for the components outside the vector does not affect the
result of the predication.

Other way to fix the bugs above is to set the writemask of the "cmp"
operation to the appropriate mask according to the vector size, thus
forcing the "cmp" instruction to compare only the components inside the
vector boundaries. As the vec4_visitor does not set the writemask of the
"cmp" instruction because it relies on the operand having the correct
swizzle outside the vector boundaries, for consistency with the existing
behaviour, I decided to do the same and compose the swizzle (notice that
the composing is done in the vec4_visitor when visiting an ir_swizzle,
while in NIR we only set the swizzle for the elements inside the
vector). I can remove this function if you prefer setting the writemask
over composing the swizzles. 

I remember another regression related to bogus swizzles in
unpackHalf2x16, for that function the operand is an uint, however in the
vec4_visitor::emit_unpack_half_2x16 function we have:
tmp_dst.writemask = WRITEMASK_Y;
emit(SHR(tmp_dst, src0, src_reg(16u)));
This was not buggy in the old vec4_visitor because the src0's swizzle
was correct for the component Y (src0's swizzle would be either ,
the identity swizzle for size == 1, or the result of composing the
swizzle with  in case the parent IR node was an ir_swizzle). In the
case of NIR, as the swizzle can be bogus for all the components except
the first one (size == 1), we need to compose it.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 33/78] i965/nir/vec4: Implement float-related functions

2015-07-01 Thread Antía Puentes
Hi Jason,

thanks for the review.

On mar, 2015-06-30 at 14:10 -0700, Jason Ekstrand wrote:
> First off, this needs a different commit message.  "float-related
> functions" isn't particularly descriptive.  How about "various
> rounding functions" because these really are all "rounding modes".

True, I will change it.

> > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
> > b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > index 12fdb48..57685bb 100644
> > --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> > @@ -855,6 +855,46 @@ vec4_visitor::nir_emit_alu(nir_alu_instr *instr)

...
> > +   case nir_op_fceil: {
> > +  /* Get the number of components of the source to create a
> > +   * temporary register with the appropriate glsl type.
> > +   */
> > +  int num_components = instr->src[0].src.is_ssa ?
> > + instr->src[0].src.ssa->num_components :
> > + instr->src[0].src.reg.reg->num_components;
> > +  assert(num_components > 0);
> > +
> > +  src_reg tmp = src_reg(this, glsl_type::float_type, num_components);
> 
> We should be able to just do 1 register because each register is already a 
> vec4.

Right. We only need the number of components to set the swizzle.

> > +  tmp.swizzle = brw_swizzle_for_size(num_components);


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/5] glcpp: Allow arithmetic integer expressions in #line

2015-06-22 Thread Antía Puentes
First, sorry for the late answer, I somehow missed your replies (I was
not in CC).

On mar, 2015-06-09 at 10:59 -0700, Ian Romanick wrote:
> On 06/09/2015 10:40 AM, Carl Worth wrote:
> > On Tue, Jun 09 2015, Ian Romanick wrote:
> >>> From section 3.4 ("Preprocessor") of the GLSL ES 3.00 specification:
> >>> "#line must have, after macro substitution, one of the following forms:
> >>>   #line line
> >>>   #line line source-string-number
> >>> where line and source-string-number are constant integral
> >>> expressions."
> > ...
> >>> From section 4.3.3 ("Constant Expressions") of the same specification:
> >>> "A constant integral expression is a constant expression that evaluates
> >>> to a scalar signed or unsigned integer."
> > 
> > Yes. That's an extremely unfortunate piece of the specification.
> > 
> > This, together with unary operators introduces inherent ambiguity into
> > the grammar. Just think about things like:

I forgot to mention in the patch's commit message that, because of the
ambiguity of the grammar, I made the assumption that one (or more
blanks) that are not part of an expression between parentheses, act as
parameter separators. Then, for the examples mentioned in the thread the
output would be:

#line 2-1+5 -> #line 6
#line 2 -1+5 -> #line 2 4
#line 2-1 +5 -> #line 1 5
#line 2-1+5 3 -> #line 6 3
#line 2 -1+5 3 -> compilation  error
#line 2-1 +5 3 -> compilation error

#line 3 +3 -> #line 3 3
#line 3 (+3) -> #line 3 3

And for the parentheses the behavior is:

#line (2  -1)+5 -> #line 6
#line 3 (4+1)-1 -> #line 3 4
#line (3) ((4+1) -1) -> #line 3 4
#line 3 (4+1) -1 -> compilation error

> The spec was supposed to get updated to say that parsing is greedy, so
> we at least know what those should do.  I say "supposed to" instead of
> "was" because I don't know for sure that it was updated.

I am afraid that greedy parsing is not what this patch implements, as
#line 3 +3 4 will not be evaluated as #line 6 4, but will raise an error
instead.

> > But I'll also take a look at this patch. Thanks for bringing it to my
> > attention, Ian.


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Emit IF/ELSE/ENDIF/WHILE JIP with type W on Gen7

2015-03-16 Thread Antía Puentes
Hi!,

On vie, 2015-03-13 at 10:42 -0700, Matt Turner wrote:
> On Fri, Mar 13, 2015 at 3:41 AM, Antia Puentes  wrote:
> > IvyBridge and Haswell PRM say that the JIP should be emitted
> > with type W but we were using UD. The previous implementation
> > did not show adverse effects, however changing the type to
> > D caused a GPU hang, see bug 84557; IMHO it is safer to
> > follow the specification thoroughly.
> > ---
> 
> The change seems fine to me, though changing the type to D never
> seemed like the thing that was actually causing the hang. It seemed
> much more likely that allowing the ELSE and ENDIF instructions to be
> compacted violated some assumption in our code and we misaligned some
> jump target by 8-bytes.
> 
> I should investigate more.
> 
> I'd probably remove the "however changing the type to D caused a GPU
> hang, see bug 84557" comment from the commit.

I will remove the comment.

> Reviewed-by: Matt Turner 

Thanks for reviewing!


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] About sRGB encoding/decoding while blitting

2015-02-24 Thread Antía Puentes
Hi!

I have been looking into the dEQP test failures that deal with blitting
between sRGB and linear color spaces, and vice versa (in total 37
tests).

OpenGL ES 3.0 and OpenGL 4.4 specifications clearly state that
linearization should happen when reading from sRGB buffers, and sRGB
encoding should be done when writing to sRGB buffers during the blit
operation. In the case OpenGL 4.4 specification this encoding/decoding
is dependent on the value of the GL_FRAMEBUFFER_SRGB enable flag; OpenGL
ES 3.0 acts as if this setting was always enabled.

 - From section 4.3.3 ("Copying Pixels") of the OpenGL ES 3.0
specification (page 198):

"When values are taken from the read buffer, if the value of
FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING for the framebuffer
attachment corresponding to the read buffer is SRGB (see section
6.1.13), the red, green, and blue components are converted from
the non-linear sRGB color space according to equation 3.24.

When values are written to the draw buffers, blit operations
bypass the fragment pipeline. The only fragment operations which
affect a blit are the pixel ownership test, the scissor test,
and sRGB conversion (see section 4.1.8). Color, depth, and
stencil masks (see section 4.2.2) are ignored."

 - From section 18.3.1 ("Blitting Pixel Rectangles") of the OpenGL 4.4
core specification (page 485):

"When values are taken from the read buffer, if FRAMEBUFFER_SRGB
is enabled and the value of
FRAMEBUFFER_ATTACHMENT_COLOR_ENCODING for the framebuffer
attachment corresponding to the read buffer is SRGB (see section
9.2.3), the red, green, and blue components are converted from
the non-linear sRGB color space according to equation 8.14.

When values are written to the draw buffers, blit operations
bypass most of the fragment pipeline. The only fragment
operations which affect a blit are the pixel ownership test, the
scissor test, and sRGB conversion (see section 17.3.9). Color,
depth, and stencil masks (see section 17.4.2) are ignored."

However, the implementation in Mesa preserves the underlying binary
representation of the pixels,
regardless the color space. This is related to the fact that the OpenGL
4.3 specification was not consistent regarding blits involving sRGB
buffers; for more information see [1]. Apart from this, it was suggested
in some comments in the code (specifically comments [1] and [2]), that
existing games and applications rely on the current behaviour, not
expecting sRGB encoding/decoding to be done.

In summary, the current implementation does not respect the OpenGL 4.4
and OpenGL ES 3.0 specs, however, it seems to be the implementation that
some applications / games expect, so it is unclear to me if we want to
re-implement this according to what the specs dictate or leave it as it
is now. Any thoughts?

Regards.


[1] Piglit: comments in tests/spec/arb_framebuffer_srgb/blit.c source
file and the log of the commit abd189966d39648c00f3204c58fef217e94a8703

[2] Mesa: comments in src/mesa/drivers/common/meta_blit.c source file
("blitframebuffer_texture" function).


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev