date:20151209

Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4

2015-12-09 Thread Jason Ekstrand

On Dec 9, 2015 11:47 PM, "Iago Toral"  wrote:
>
> On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote:
> >
> > On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" 
> > wrote:
> > >
> > > This code in brw_set_dest adjusts the execution size of any
> > instruction
> > > with a dst.width < 8. However, we don't want to do this with
> > instructions
> > > operating on doubles, since these will have a width of 4, but still
> > > need an execution size of 8 (for SIMD8). Unfortunately, we can't
> > just check
> > > the size of the operands involved to detect if we are doing an
> > operation on
> > > doubles, because we can have instructions that do operations on
> > double
> > > operands interpreted as UD, operating on any of its 2 32-bit
> > components.
> > >
> > > Previous commits have made it so we never emit instructions with a
> > horizontal
> > > width of 4 that don't have the correct execution size set for
> > gen7/gen8, so
> > > we can skip it in this case, avoiding the conflicts with fp64
> > requirements.
> > >
> > > Expanding the same fix to other hardware generations requires many
> > more
> > > changes but since we are not targetting fp64 support on them
> > > wer don't really care for now.
> > > ---
> > >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +-
> > >  1 file changed, 13 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > index 78f2c8c..50a8771 100644
> > > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst
> > *inst, struct brw_reg dest)
> > > /* Generators should set a default exec_size of either 8
> > (SIMD4x2 or SIMD8)
> > >  * or 16 (SIMD16), as that's normally correct.  However, when
> > dealing with
> > >  * small registers, we automatically reduce it to match the
> > register size.
> > > +*
> > > +* In platforms that support fp64 we can emit instructions with
> > a width of
> > > +* 4 that need two SIMD8 registers and an exec_size of 8 or 16.
> > In these
> > > +* cases we need to make sure that these instructions have their
> > exec sizes
> > > +* set properly when they are emitted and we can't rely on this
> > code to fix
> > > +* it.
> > >  */
> > > -   if (dest.width < BRW_EXECUTE_8)
> > > +   bool fix_exec_size;
> > > +   if (devinfo->gen == 7 || devinfo->gen == 8)
> >
> > If we're doing to take this approach, we definitely want to make it
> > gen > 6 or something so we include future gens.  Really gen > 4 is
> > probably doable since the only real problem is the legacy clipping
> > code.
>
> Strips and fans is also a problem, but it is certainly doable if we want
> to do it.

Yeah, my primary point is that we should make it as little of an edge-case
as possible.  We could go back to at least gen6 and we should go forward.
That said, it'll take a little testing from the Intel side.

> Iago
>
>
> > > +  fix_exec_size = dest.width < BRW_EXECUTE_4;
> > > +   else
> > > +  fix_exec_size = dest.width < BRW_EXECUTE_8;
> > > +
> > > +   if (fix_exec_size)
> > >brw_inst_set_exec_size(devinfo, inst, dest.width);
> > >  }
> > >
> > > --
> > > 2.1.4
> > >
> > > ___
> > > mesa-dev mailing list
> > > mesa-dev@lists.freedesktop.org
> > > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> >
> >
>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4

2015-12-09 Thread Iago Toral

On Wed, 2015-12-09 at 08:10 -0800, Jason Ekstrand wrote:
> 
> On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga" 
> wrote:
> >
> > This code in brw_set_dest adjusts the execution size of any
> instruction
> > with a dst.width < 8. However, we don't want to do this with
> instructions
> > operating on doubles, since these will have a width of 4, but still
> > need an execution size of 8 (for SIMD8). Unfortunately, we can't
> just check
> > the size of the operands involved to detect if we are doing an
> operation on
> > doubles, because we can have instructions that do operations on
> double
> > operands interpreted as UD, operating on any of its 2 32-bit
> components.
> >
> > Previous commits have made it so we never emit instructions with a
> horizontal
> > width of 4 that don't have the correct execution size set for
> gen7/gen8, so
> > we can skip it in this case, avoiding the conflicts with fp64
> requirements.
> >
> > Expanding the same fix to other hardware generations requires many
> more
> > changes but since we are not targetting fp64 support on them
> > wer don't really care for now.
> > ---
> >  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > index 78f2c8c..50a8771 100644
> > --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> > @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst
> *inst, struct brw_reg dest)
> > /* Generators should set a default exec_size of either 8
> (SIMD4x2 or SIMD8)
> >  * or 16 (SIMD16), as that's normally correct.  However, when
> dealing with
> >  * small registers, we automatically reduce it to match the
> register size.
> > +*
> > +* In platforms that support fp64 we can emit instructions with
> a width of
> > +* 4 that need two SIMD8 registers and an exec_size of 8 or 16.
> In these
> > +* cases we need to make sure that these instructions have their
> exec sizes
> > +* set properly when they are emitted and we can't rely on this
> code to fix
> > +* it.
> >  */
> > -   if (dest.width < BRW_EXECUTE_8)
> > +   bool fix_exec_size;
> > +   if (devinfo->gen == 7 || devinfo->gen == 8)
> 
> If we're doing to take this approach, we definitely want to make it
> gen > 6 or something so we include future gens.  Really gen > 4 is
> probably doable since the only real problem is the legacy clipping
> code.

Strips and fans is also a problem, but it is certainly doable if we want
to do it.

Iago


> > +  fix_exec_size = dest.width < BRW_EXECUTE_4;
> > +   else
> > +  fix_exec_size = dest.width < BRW_EXECUTE_8;
> > +
> > +   if (fix_exec_size)
> >brw_inst_set_exec_size(devinfo, inst, dest.width);
> >  }
> >
> > --
> > 2.1.4
> >
> > ___
> > mesa-dev mailing list
> > mesa-dev@lists.freedesktop.org
> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 
> 


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa/shader: return correct attribute location for double matrix arrays

2015-12-09 Thread Tapani Pälli



On 12/10/2015 05:41 AM, Dave Airlie wrote:

From: Dave Airlie 

If we have a dmat2[4], then dmat2[0] is at 17, dmat2[1] at 19,
dmat2[2] at 21 etc. The old code was returning 17,18,19.

I think this code is also wrong for float matricies as well.

This partly fixes:
GL41-CTS.vertex_attrib_64bit.limits_test
---
  src/mesa/main/shader_query.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 5d15006..faaf08c 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -858,7 +858,7 @@ program_resource_location(struct gl_shader_program *shProg,
&& array_index >= RESOURCE_VAR(res)->type->length) {
   return -1;
}
-  return RESOURCE_VAR(res)->data.location + array_index - 
VERT_ATTRIB_GENERIC0;
+  return RESOURCE_VAR(res)->data.location + (array_index * 
RESOURCE_VAR(res)->type->without_array()->matrix_columns) - VERT_ATTRIB_GENERIC0;


There are some lines that exceed 80 chars in this file but this seems 
way too long (?) You could put this case in braces and have a temporary 
for glsl_type or maybe matrix cols to make it fit. Otherwise seems 
correct to me. Looks like this has been broken for quite a long time, 
even before program interface query was introduced.


Reviewed-by: Tapani Pälli 


 case GL_PROGRAM_OUTPUT:
/* If the output is an array, fail if the index is out of bounds. */
if (array_index > 0



// Tapani
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 44/44] docs: Add ARB_compute_shader to 11.2.0 release notes

2015-12-09 Thread Kristian Høgsberg

On Tue, Dec 01, 2015 at 12:20:02AM -0800, Jordan Justen wrote:
> Signed-off-by: Jordan Justen 
> Cc: Iago Toral Quiroga 

For the series:

Reviewed-by: Kristian Høgsberg 

Admittedly, light review on patches 17-25 but Iago covered that and
the overall refactoring looks sounds.

For the new shared load/store intrinsics, we'll want to avoid the
_indirect versions, but that may be better done as an update to Jasons
series.

Kristian

> ---
>  docs/relnotes/11.2.0.html | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/docs/relnotes/11.2.0.html b/docs/relnotes/11.2.0.html
> index c9c0c90..eb53e81 100644
> --- a/docs/relnotes/11.2.0.html
> +++ b/docs/relnotes/11.2.0.html
> @@ -45,6 +45,7 @@ Note: some of the new features are only available with 
> certain drivers.
>  
>  
>  GL_ARB_base_instance on freedreno/a4xx
> +GL_ARB_compute_shader on i965
>  GL_ARB_texture_buffer_object_rgb32 on freedreno/a4xx
>  GL_ARB_texture_buffer_range on freedreno/a4xx
>  GL_ARB_texture_query_lod on freedreno/a4xx
> -- 
> 2.6.2
> 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT

2015-12-09 Thread Jason Ekstrand

On Wed, Dec 9, 2015 at 9:36 PM, Jason Ekstrand  wrote:
> This is an initial implementation of the MOV_INDIRECT opcode in the vec4
> backend.  Unfortunately, I haven't had a chance to test it in the wild yet,
> but I think review would still be good.  In particular, the approach I took
> to handling swizzles.
>
> Unfortunately, the only indirect MOV instructions you can use in align16
> mode have a uniform indirect.  This means that, in order to do an indirect
> MOV, we need to either do two movs or use align1 mode.  The problem with
> two MOVs is that, in order to force first/second half, you have to disable
> writemasking so you might as well be in align1.  In align1 mode, we have
> two options for indirects:  We could use height of 2 and make the hardware
> grab two sets of 4 consecutive dwords for us or we could use a UV
> immediate to add 0, 4, 8, and 12 to the four channels.  The second method
> only works easily on SNB+ because we don't have UV immediates on ILK and
> previous.  However, the first method (use a height of 2) may have
> interesting hardware implications if any of those sets of 4 dwords ever
> crosses a register boundary.  I didn't want to count software always giving
> us vec4-aligned offsets, so I decided on using the UV immediate.
>
> Ok, so swizzles.  One of the other advantages of adding a UV immediate is
> that we can make the immediate be whatever we want.  It doesn't have to be
> (0, 4, 8, 12); it could be anything.  So, I used it to implement swizzling.

Correction: This patch doesn't support swizzling.  But we could if we
wanted.  Do we want?

> Do we want swizzling?  Does it make sense to load a swizzled value starting
> at an arbitrary offset?  Does it make sense to support swizzling but not
> writemasking?  I don't know.  Thoughts?
>
> ---
>  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 45 
> 
>  1 file changed, 45 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> index c3426dd..71a7f63 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
> @@ -1052,6 +1052,48 @@ generate_set_simd4x2_header_gen9(struct brw_codegen *p,
>  }
>
>  static void
> +generate_mov_indirect(struct brw_codegen *p,
> +  vec4_instruction *inst,
> +  struct brw_reg dst, struct brw_reg reg,
> +  struct brw_reg indirect, struct brw_reg length)
> +{
> +   assert(indirect.type == BRW_REGISTER_TYPE_UD);
> +
> +   unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr * (REG_SIZE / 2);
> +
> +   /* This instruction acts in align1 mode */
> +   assert(inst->force_writemask_all || reg.writemask == 0xf);
> +
> +   brw_push_insn_state(p);
> +   brw_set_default_access_mode(p, BRW_ALIGN_1);
> +   brw_set_default_mask_control(p, BRW_MASK_DISABLE);
> +
> +   struct brw_reg addr = vec2(brw_address_reg(0));
> +
> +   /* We need to move the indirect value into the address register.  In order
> +* to make things make some sense, we want to respect at least the X
> +* component of the swizzle.  In order to do that, we need to convert the
> +* subnr (probably 0) to an align1 subnr and add in the swizzle.  We then
> +* use a region of <8,4,0>:uw to pick off the first 2 bytes of the 
> indirect
> +* and splat it out to all four channels of the given half of a0.
> +*/
> +   assert(brw_is_single_value_swizzle(indirect.swizzle));
> +   indirect.subnr = (indirect.subnr * 4 + BRW_GET_SWZ(indirect.swizzle, 0)) 
> * 2;
> +   indirect = stride(retype(indirect, BRW_REGISTER_TYPE_UW), 8, 4, 0);
> +
> +   brw_ADD(p, addr, indirect, brw_imm_uw(imm_byte_offset));
> +
> +   /* Use a <4,1> region Vx1 region*/
> +   struct brw_reg src = brw_VxH_indirect(0, 0);
> +   src.width = BRW_WIDTH_4;
> +   src.hstride = BRW_HORIZONTAL_STRIDE_1;
> +
> +   brw_MOV(p, dst, retype(src, reg.type));
> +
> +   brw_pop_insn_state(p);
> +}
> +
> +static void
>  generate_code(struct brw_codegen *p,
>const struct brw_compiler *compiler,
>void *log_data,
> @@ -1538,6 +1580,9 @@ generate_code(struct brw_codegen *p,
>   break;
>}
>
> +  case SHADER_OPCODE_MOV_INDIRECT:
> + generate_mov_indirect(p, inst, dst, src[0], src[1], src[2]);
> +
>default:
>   unreachable("Unsupported opcode");
>}
> --
> 2.5.0.400.gff86faf
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC] i965/vec4: Add support for SHADER_OPCODE_MOV_INDIRECT

2015-12-09 Thread Jason Ekstrand

This is an initial implementation of the MOV_INDIRECT opcode in the vec4
backend.  Unfortunately, I haven't had a chance to test it in the wild yet,
but I think review would still be good.  In particular, the approach I took
to handling swizzles.

Unfortunately, the only indirect MOV instructions you can use in align16
mode have a uniform indirect.  This means that, in order to do an indirect
MOV, we need to either do two movs or use align1 mode.  The problem with
two MOVs is that, in order to force first/second half, you have to disable
writemasking so you might as well be in align1.  In align1 mode, we have
two options for indirects:  We could use height of 2 and make the hardware
grab two sets of 4 consecutive dwords for us or we could use a UV
immediate to add 0, 4, 8, and 12 to the four channels.  The second method
only works easily on SNB+ because we don't have UV immediates on ILK and
previous.  However, the first method (use a height of 2) may have
interesting hardware implications if any of those sets of 4 dwords ever
crosses a register boundary.  I didn't want to count software always giving
us vec4-aligned offsets, so I decided on using the UV immediate.

Ok, so swizzles.  One of the other advantages of adding a UV immediate is
that we can make the immediate be whatever we want.  It doesn't have to be
(0, 4, 8, 12); it could be anything.  So, I used it to implement swizzling.
Do we want swizzling?  Does it make sense to load a swizzled value starting
at an arbitrary offset?  Does it make sense to support swizzling but not
writemasking?  I don't know.  Thoughts?

---
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp | 45 
 1 file changed, 45 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
index c3426dd..71a7f63 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_generator.cpp
@@ -1052,6 +1052,48 @@ generate_set_simd4x2_header_gen9(struct brw_codegen *p,
 }
 
 static void
+generate_mov_indirect(struct brw_codegen *p,
+  vec4_instruction *inst,
+  struct brw_reg dst, struct brw_reg reg,
+  struct brw_reg indirect, struct brw_reg length)
+{
+   assert(indirect.type == BRW_REGISTER_TYPE_UD);
+
+   unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr * (REG_SIZE / 2);
+
+   /* This instruction acts in align1 mode */
+   assert(inst->force_writemask_all || reg.writemask == 0xf);
+
+   brw_push_insn_state(p);
+   brw_set_default_access_mode(p, BRW_ALIGN_1);
+   brw_set_default_mask_control(p, BRW_MASK_DISABLE);
+
+   struct brw_reg addr = vec2(brw_address_reg(0));
+
+   /* We need to move the indirect value into the address register.  In order
+* to make things make some sense, we want to respect at least the X
+* component of the swizzle.  In order to do that, we need to convert the
+* subnr (probably 0) to an align1 subnr and add in the swizzle.  We then
+* use a region of <8,4,0>:uw to pick off the first 2 bytes of the indirect
+* and splat it out to all four channels of the given half of a0.
+*/
+   assert(brw_is_single_value_swizzle(indirect.swizzle));
+   indirect.subnr = (indirect.subnr * 4 + BRW_GET_SWZ(indirect.swizzle, 0)) * 
2;
+   indirect = stride(retype(indirect, BRW_REGISTER_TYPE_UW), 8, 4, 0);
+
+   brw_ADD(p, addr, indirect, brw_imm_uw(imm_byte_offset));
+
+   /* Use a <4,1> region Vx1 region*/
+   struct brw_reg src = brw_VxH_indirect(0, 0);
+   src.width = BRW_WIDTH_4;
+   src.hstride = BRW_HORIZONTAL_STRIDE_1;
+
+   brw_MOV(p, dst, retype(src, reg.type));
+
+   brw_pop_insn_state(p);
+}
+
+static void
 generate_code(struct brw_codegen *p,
   const struct brw_compiler *compiler,
   void *log_data,
@@ -1538,6 +1580,9 @@ generate_code(struct brw_codegen *p,
  break;
   }
 
+  case SHADER_OPCODE_MOV_INDIRECT:
+ generate_mov_indirect(p, inst, dst, src[0], src[1], src[2]);
+
   default:
  unreachable("Unsupported opcode");
   }
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa/shader: return correct attribute location for double matrix arrays

2015-12-09 Thread Timothy Arceri

On Thu, 2015-12-10 at 13:41 +1000, Dave Airlie wrote:
> From: Dave Airlie 
> 
> If we have a dmat2[4], then dmat2[0] is at 17, dmat2[1] at 19,
> dmat2[2] at 21 etc. The old code was returning 17,18,19.
> 
> I think this code is also wrong for float matricies as well.

Would be good to have a piglit for the float case.

Reviewed-by: Timothy Arceri 

> 
> This partly fixes:
> GL41-CTS.vertex_attrib_64bit.limits_test
> ---
>  src/mesa/main/shader_query.cpp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/shader_query.cpp
> b/src/mesa/main/shader_query.cpp
> index 5d15006..faaf08c 100644
> --- a/src/mesa/main/shader_query.cpp
> +++ b/src/mesa/main/shader_query.cpp
> @@ -858,7 +858,7 @@ program_resource_location(struct
> gl_shader_program *shProg,
>&& array_index >= RESOURCE_VAR(res)->type->length) {
>   return -1;
>}
> -  return RESOURCE_VAR(res)->data.location + array_index -
> VERT_ATTRIB_GENERIC0;
> +  return RESOURCE_VAR(res)->data.location + (array_index *
> RESOURCE_VAR(res)->type->without_array()->matrix_columns) -
> VERT_ATTRIB_GENERIC0;
> case GL_PROGRAM_OUTPUT:
>/* If the output is an array, fail if the index is out of
> bounds. */
>if (array_index > 0
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 03/15] i965/fs: Use MOV_INDIRECT for all indirect uniform loads

2015-12-09 Thread Matt Turner

On Wed, Dec 9, 2015 at 8:23 PM, Jason Ekstrand  wrote:
> Instead of using reladdr, this commit changes the FS backend to emit a
> MOV_INDIRECT whenever we need an indirect uniform load.  We also have to
> rework some of the other bits of the backend to handle this new form of
> uniform load.  The obvious change is that demote_pull_constants now acts
> more like a lowering pass when it hits a MOV_INDIRECT.
> ---
>  src/mesa/drivers/dri/i965/brw_fs.cpp | 72 
> +++-
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 53 ++-
>  2 files changed, 86 insertions(+), 39 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
> b/src/mesa/drivers/dri/i965/brw_fs.cpp
> index bf446d2..7cc03c5 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
> @@ -1945,8 +1945,8 @@ fs_visitor::assign_constant_locations()
>   if (inst->src[i].file != UNIFORM)
>  continue;
>
> - if (inst->src[i].reladdr) {
> -int uniform = inst->src[i].nr;
> + if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
> +int uniform = inst->src[0].nr;
>
>  /* If this array isn't already present in the pull constant 
> buffer,
>   * add it.
> @@ -2028,49 +2028,63 @@ fs_visitor::assign_constant_locations()
>  void
>  fs_visitor::demote_pull_constants()
>  {
> -   foreach_block_and_inst (block, fs_inst, inst, cfg) {
> +   const unsigned index = 
> stage_prog_data->binding_table.pull_constants_start;
> +
> +   foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
> +  /* Set up the annotation tracking for new generated instructions. */
> +  const fs_builder ibld(this, block, inst);
> +
>for (int i = 0; i < inst->sources; i++) {
>  if (inst->src[i].file != UNIFORM)
> continue;
>
> - int pull_index;
> + /* We'll handle this case later */
> + if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0)
> +continue;
> +
>   unsigned location = inst->src[i].nr + inst->src[i].reg_offset;
> - if (location >= uniforms) /* Out of bounds access */
> -pull_index = -1;
> - else
> -pull_index = pull_constant_loc[location];
> + if (location >= uniforms)
> +continue; /* Out of bounds access */
> +
> + int pull_index = pull_constant_loc[location];
>
>   if (pull_index == -1)
> continue;
>
> - /* Set up the annotation tracking for new generated instructions. */
> - const fs_builder ibld(this, block, inst);
> - const unsigned index = 
> stage_prog_data->binding_table.pull_constants_start;
> - fs_reg dst = vgrf(glsl_type::float_type);
> -
>   assert(inst->src[i].stride == 0);
>
> - /* Generate a pull load into dst. */
> - if (inst->src[i].reladdr) {
> -VARYING_PULL_CONSTANT_LOAD(ibld, dst,
> -   brw_imm_ud(index),
> -   *inst->src[i].reladdr,
> -   pull_index * 4);
> -inst->src[i].reladdr = NULL;
> -inst->src[i].stride = 1;
> - } else {
> -const fs_builder ubld = ibld.exec_all().group(8, 0);
> -struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & 
> ~15);
> -ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
> -  dst, brw_imm_ud(index), offset);
> -inst->src[i].set_smear(pull_index & 3);
> - }
> - brw_mark_surface_used(prog_data, index);
> + fs_reg dst = vgrf(glsl_type::float_type);
> + const fs_builder ubld = ibld.exec_all().group(8, 0);
> + struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & 
> ~15);
> + ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
> +   dst, brw_imm_ud(index), offset);
>
>   /* Rewrite the instruction to use the temporary VGRF. */
>   inst->src[i].file = VGRF;
>   inst->src[i].nr = dst.nr;
>   inst->src[i].reg_offset = 0;
> + inst->src[i].set_smear(pull_index & 3);
> +
> + brw_mark_surface_used(prog_data, index);
> +  }
> +
> +  if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT &&
> +  inst->src[0].file == UNIFORM) {
> +
> + unsigned location = inst->src[0].nr + inst->src[0].reg_offset;
> + if (location >= uniforms)
> +continue; /* Out of bounds access */
> +
> + int pull_index = pull_constant_loc[location];
> + assert(pull_index >= 0); /* This had better be pull */
> +
> + VARYING_PULL_CONSTANT_LOAD(ibld, inst->dst,
> +brw_imm_ud(index),
> +inst->src[1],
> +pull_index * 4);
> + inst->remove(block);
> +
> + brw_mark

[Mesa-dev] [PATCH] draw: fix clipping with linear interpolated values and gl_ClipVertex

2015-12-09 Thread sroland

From: Roland Scheidegger 

Discovered this when working on other clip code, apparently didn't work
correctly - the combination of linear interpolated values and using
gl_ClipVertex produced wrong values (failing all such combinations
in piglits glsl-1.30 interpolation tests).
Use the pre-clip-pos values when determining the interpolation factor to
fix this.
Unfortunately I have no idea what I'm doing here really, but it fixes all
these failures in piglit (all interpolation-noperspective-XXX-vertex, 10
tests in total). Albeit piglit coverage of clipping isn't great, so hopefully
someone can confirm this actually makes sense, and wouldn't cause failures
elsewhere...
---
 src/gallium/auxiliary/draw/draw_pipe_clip.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_pipe_clip.c 
b/src/gallium/auxiliary/draw/draw_pipe_clip.c
index f2b56b0..7f22eef 100644
--- a/src/gallium/auxiliary/draw/draw_pipe_clip.c
+++ b/src/gallium/auxiliary/draw/draw_pipe_clip.c
@@ -192,11 +192,11 @@ static void interp(const struct clip_stage *clip,
   t_nopersp = t;
   /* find either in.x != out.x or in.y != out.y */
   for (k = 0; k < 2; k++) {
- if (in->clip[k] != out->clip[k]) {
+ if (in->pre_clip_pos[k] != out->pre_clip_pos[k]) {
 /* do divide by W, then compute linear interpolation factor */
-float in_coord = in->clip[k] / in->clip[3];
-float out_coord = out->clip[k] / out->clip[3];
-float dst_coord = dst->clip[k] / dst->clip[3];
+float in_coord = in->pre_clip_pos[k] / in->pre_clip_pos[3];
+float out_coord = out->pre_clip_pos[k] / out->pre_clip_pos[3];
+float dst_coord = dst->pre_clip_pos[k] / dst->pre_clip_pos[3];
 t_nopersp = (dst_coord - out_coord) / (in_coord - out_coord);
 break;
  }
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v3 05/12] nir/lower_io: Get rid of load/store_foo_indirect

2015-12-09 Thread Jason Ekstrand

On Tue, Dec 8, 2015 at 6:25 PM, Kenneth Graunke  wrote:
> On Tuesday, December 08, 2015 01:46:22 PM Jason Ekstrand wrote:
>> diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
>> index f64ac69..a2723d5 100644
>> --- a/src/glsl/nir/nir_lower_io.c
>> +++ b/src/glsl/nir/nir_lower_io.c
>> @@ -333,18 +311,18 @@ nir_lower_io(nir_shader *shader, nir_variable_mode 
>> mode,
>>   * Return the indirect source for a load/store indirect intrinsic.
>
> This comment could use updating.  How about:
>
> /**
>  * Return the offset soruce for a load/store intrinsic.
>  */

Good call.  Fixed locally.
--Jason

>>   */
>>  nir_src *
>> -nir_get_io_indirect_src(nir_intrinsic_instr *instr)
>> +nir_get_io_offset_src(nir_intrinsic_instr *instr)
>>  {
>> switch (instr->intrinsic) {
>> -   case nir_intrinsic_load_input_indirect:
>> -   case nir_intrinsic_load_output_indirect:
>> -   case nir_intrinsic_load_uniform_indirect:
>> +   case nir_intrinsic_load_input:
>> +   case nir_intrinsic_load_output:
>> +   case nir_intrinsic_load_uniform:
>>return &instr->src[0];
>> -   case nir_intrinsic_load_per_vertex_input_indirect:
>> -   case nir_intrinsic_load_per_vertex_output_indirect:
>> -   case nir_intrinsic_store_output_indirect:
>> +   case nir_intrinsic_load_per_vertex_input:
>> +   case nir_intrinsic_load_per_vertex_output:
>> +   case nir_intrinsic_store_output:
>>return &instr->src[1];
>> -   case nir_intrinsic_store_per_vertex_output_indirect:
>> +   case nir_intrinsic_store_per_vertex_output:
>>return &instr->src[2];
>> default:
>>return NULL;
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 02/15] i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD

2015-12-09 Thread Jason Ekstrand

This is to prevent shader-db regressions from D <-> UD conversions in deref
add+mul chains caused by using MOV_INDIRECT.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 5e8acec..bf446d2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -174,7 +174,7 @@ fs_visitor::VARYING_PULL_CONSTANT_LOAD(const fs_builder 
&bld,
 * CSE can later notice that those loads are all the same and eliminate
 * the redundant ones.
 */
-   fs_reg vec4_offset = vgrf(glsl_type::int_type);
+   fs_reg vec4_offset = vgrf(glsl_type::uint_type);
bld.ADD(vec4_offset, varying_offset, brw_imm_ud(const_offset & ~0xf));
 
int scale = 1;
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 05/15] i965/fs: Stop relying on param_size in assign_constant_locations

2015-12-09 Thread Jason Ekstrand

Now that we have MOV_INDIRECT opcodes, we have all of the size information
we need directly in the opcode.  With a little restructuring of the
algorithm used in assign_constant_locations we don't need param_size
anymore.  The big thing to watch out for now, however, is that you can have
two ranges overlap where neither contains the other.  In order to deal with
this, we make the first pass just flag what needs pulling and handle
assigning pull constant locations until later.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 44 ++--
 1 file changed, 17 insertions(+), 27 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 786c5fb..1add656 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1920,14 +1920,12 @@ fs_visitor::assign_constant_locations()
if (dispatch_width != 8)
   return;
 
-   unsigned int num_pull_constants = 0;
-
-   pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
-   memset(pull_constant_loc, -1, sizeof(pull_constant_loc[0]) * uniforms);
-
bool is_live[uniforms];
memset(is_live, 0, sizeof(is_live));
 
+   bool needs_pull[uniforms];
+   memset(needs_pull, 0, sizeof(is_live));
+
/* First, we walk through the instructions and do two things:
 *
 *  1) Figure out which uniforms are live.
@@ -1943,20 +1941,15 @@ fs_visitor::assign_constant_locations()
  if (inst->src[i].file != UNIFORM)
 continue;
 
- if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
-int uniform = inst->src[0].nr;
+ int constant_nr = inst->src[i].nr + inst->src[i].reg_offset;
 
-/* If this array isn't already present in the pull constant buffer,
- * add it.
- */
-if (pull_constant_loc[uniform] == -1) {
-   assert(param_size[uniform]);
-   for (int j = 0; j < param_size[uniform]; j++)
-  pull_constant_loc[uniform + j] = num_pull_constants++;
+ if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
+for (unsigned j = 0; j < inst->src[2].ud / 4; j++) {
+   is_live[constant_nr + j] = true;
+   needs_pull[constant_nr + j] = true;
 }
  } else {
 /* Mark the the one accessed uniform as live */
-int constant_nr = inst->src[i].nr + inst->src[i].reg_offset;
 if (constant_nr >= 0 && constant_nr < (int) uniforms)
is_live[constant_nr] = true;
  }
@@ -1973,26 +1966,23 @@ fs_visitor::assign_constant_locations()
 */
unsigned int max_push_components = 16 * 8;
unsigned int num_push_constants = 0;
+   unsigned int num_pull_constants = 0;
 
push_constant_loc = ralloc_array(mem_ctx, int, uniforms);
+   pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
 
for (unsigned int i = 0; i < uniforms; i++) {
-  if (!is_live[i] || pull_constant_loc[i] != -1) {
- /* This UNIFORM register is either dead, or has already been demoted
-  * to a pull const.  Mark it as no longer living in the param[] array.
-  */
- push_constant_loc[i] = -1;
+  push_constant_loc[i] = -1;
+  pull_constant_loc[i] = -1;
+
+  if (!is_live[i])
  continue;
-  }
 
-  if (num_push_constants < max_push_components) {
- /* Retain as a push constant.  Record the location in the params[]
-  * array.
-  */
+  if (!needs_pull[i] && num_push_constants < max_push_components) {
+ /* Retain as a push constant */
  push_constant_loc[i] = num_push_constants++;
   } else {
- /* Demote to a pull constant. */
- push_constant_loc[i] = -1;
+ /* We have to pull it */
  pull_constant_loc[i] = num_pull_constants++;
   }
}
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 10/15] i965/fs: Add support for doing MOV_INDIRECT on uniforms

2015-12-09 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 9b06ed2..de33c1d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -848,7 +848,10 @@ fs_inst::regs_read(int arg) const
  assert(src[2].file == IMM);
  unsigned region_length = src[2].ud;
 
- if (src[0].file == FIXED_GRF) {
+ if (src[0].file == UNIFORM) {
+assert(region_length % 4 == 0);
+return region_length / 4;
+ } else if (src[0].file == FIXED_GRF) {
 /* If the start of the region is not register aligned, then
  * there's some portion of the register that's technically
  * unread at the beginning.
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 04/15] i965/fs: Get rid of reladdr

2015-12-09 Thread Jason Ekstrand

We aren't using it anymore.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 7 +--
 src/mesa/drivers/dri/i965/brw_ir_fs.h | 5 +
 2 files changed, 2 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 7cc03c5..786c5fb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -433,7 +433,6 @@ fs_reg::fs_reg(struct ::brw_reg reg) :
 {
this->reg_offset = 0;
this->subreg_offset = 0;
-   this->reladdr = NULL;
this->stride = 1;
if (this->file == IMM &&
(this->type != BRW_REGISTER_TYPE_V &&
@@ -448,7 +447,6 @@ fs_reg::equals(const fs_reg &r) const
 {
return (this->backend_reg::equals(r) &&
subreg_offset == r.subreg_offset &&
-   !reladdr && !r.reladdr &&
stride == r.stride);
 }
 
@@ -4716,9 +4714,7 @@ fs_visitor::dump_instruction(backend_instruction 
*be_inst, FILE *file)
  break;
   case UNIFORM:
  fprintf(file, "u%d", inst->src[i].nr + inst->src[i].reg_offset);
- if (inst->src[i].reladdr) {
-fprintf(file, "+reladdr");
- } else if (inst->src[i].subreg_offset) {
+ if (inst->src[i].subreg_offset) {
 fprintf(file, "+%d.%d", inst->src[i].reg_offset,
 inst->src[i].subreg_offset);
  }
@@ -4829,7 +4825,6 @@ fs_visitor::get_instruction_generating_reg(fs_inst *start,
 {
if (end == start ||
end->is_partial_write() ||
-   reg.reladdr ||
!reg.equals(end->dst)) {
   return NULL;
} else {
diff --git a/src/mesa/drivers/dri/i965/brw_ir_fs.h 
b/src/mesa/drivers/dri/i965/brw_ir_fs.h
index c3eec2e..e4f20f4 100644
--- a/src/mesa/drivers/dri/i965/brw_ir_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_ir_fs.h
@@ -58,8 +58,6 @@ public:
 */
int subreg_offset;
 
-   fs_reg *reladdr;
-
/** Register region horizontal stride */
uint8_t stride;
 };
@@ -136,8 +134,7 @@ component(fs_reg reg, unsigned idx)
 static inline bool
 is_uniform(const fs_reg ®)
 {
-   return (reg.stride == 0 || reg.is_null()) &&
-  (!reg.reladdr || is_uniform(*reg.reladdr));
+   return (reg.stride == 0 || reg.is_null());
 }
 
 /**
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 08/15] i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push constants

2015-12-09 Thread Jason Ekstrand

This commit moves us to an instruction based model rather than a
register-based model for indirects.  This is more accurate anyway as we
have to emit instructions to resolve the reladdr.  It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.

One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants.  This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4.h   |  3 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 10 +--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 86 --
 4 files changed, 50 insertions(+), 51 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index a697bdf..e4a405b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -775,7 +775,7 @@ vec4_visitor::move_push_constants_to_pull_constants()
 dst_reg temp = dst_reg(this, glsl_type::vec4_type);
 
 emit_pull_constant_load(block, inst, temp, inst->src[i],
-pull_constant_loc[uniform]);
+pull_constant_loc[uniform], src_reg());
 
 inst->src[i].file = temp.file;
  inst->src[i].nr = temp.nr;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index f2e5ce1..e6d6c82 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -293,7 +293,8 @@ public:
void emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
dst_reg dst,
src_reg orig_src,
-   int base_offset);
+   int base_offset,
+src_reg indirect);
void emit_pull_constant_load_reg(dst_reg dst,
 src_reg surf_index,
 src_reg offset,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index f965b39..58b6612 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -673,12 +673,14 @@ vec4_visitor::nir_emit_intrinsic(nir_intrinsic_instr 
*instr)
  /* Offsets are in bytes but they should always be multiples of 16 */
  assert(const_offset->u[0] % 16 == 0);
  src.reg_offset = const_offset->u[0] / 16;
+
+ emit(MOV(dest, src));
   } else {
- src_reg tmp = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_D, 1);
- src.reladdr = new(mem_ctx) src_reg(tmp);
-  }
+ src_reg indirect = get_nir_src(instr->src[0], BRW_REGISTER_TYPE_UD, 
1);
 
-  emit(MOV(dest, src));
+ emit(SHADER_OPCODE_MOV_INDIRECT, dest, src,
+  indirect, brw_imm_ud(instr->const_index[1]));
+  }
   break;
}
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 7712d34..e7ab536 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1641,16 +1641,16 @@ vec4_visitor::move_grf_array_access_to_scratch()
 void
 vec4_visitor::emit_pull_constant_load(bblock_t *block, vec4_instruction *inst,
  dst_reg temp, src_reg orig_src,
- int base_offset)
+ int base_offset, src_reg indirect)
 {
int reg_offset = base_offset + orig_src.reg_offset;
const unsigned index = prog_data->base.binding_table.pull_constants_start;
 
src_reg offset;
-   if (orig_src.reladdr) {
+   if (indirect.file != BAD_FILE) {
   offset = src_reg(this, glsl_type::int_type);
 
-  emit_before(block, inst, ADD(dst_reg(offset), *orig_src.reladdr,
+  emit_before(block, inst, ADD(dst_reg(offset), indirect,
brw_imm_d(reg_offset * 16)));
} else if (devinfo->gen >= 8) {
   /* Store the offset in a GRF so we can send-from-GRF. */
@@ -1685,59 +1685,55 @@ 
vec4_visitor::move_uniform_array_access_to_pull_constants()
 {
int pull_constant_loc[this->uniforms];
memset(pull_constant_loc, -1, sizeof(pull_constant_loc));
-   bool nested_reladdr;
 
-   /* Walk through and find array access of uniforms.  Put a copy of that
-* uniform in the pull constant buffer.
-*
-* Note that we don't move constant-indexed accesses to arrays.  No
-* testing has been done of the performance impact of this choice.
+   /* First, walk through the instructions and determine which things need to
+* be pulled.  We mark something as needing to bye pulled by setting
+* pull_constant_loc to 0.
 */
-   do {
-  nested_r

[Mesa-dev] [PATCH 03/15] i965/fs: Use MOV_INDIRECT for all indirect uniform loads

2015-12-09 Thread Jason Ekstrand

Instead of using reladdr, this commit changes the FS backend to emit a
MOV_INDIRECT whenever we need an indirect uniform load.  We also have to
rework some of the other bits of the backend to handle this new form of
uniform load.  The obvious change is that demote_pull_constants now acts
more like a lowering pass when it hits a MOV_INDIRECT.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 72 +++-
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 53 ++-
 2 files changed, 86 insertions(+), 39 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index bf446d2..7cc03c5 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1945,8 +1945,8 @@ fs_visitor::assign_constant_locations()
  if (inst->src[i].file != UNIFORM)
 continue;
 
- if (inst->src[i].reladdr) {
-int uniform = inst->src[i].nr;
+ if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
+int uniform = inst->src[0].nr;
 
 /* If this array isn't already present in the pull constant buffer,
  * add it.
@@ -2028,49 +2028,63 @@ fs_visitor::assign_constant_locations()
 void
 fs_visitor::demote_pull_constants()
 {
-   foreach_block_and_inst (block, fs_inst, inst, cfg) {
+   const unsigned index = stage_prog_data->binding_table.pull_constants_start;
+
+   foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
+  /* Set up the annotation tracking for new generated instructions. */
+  const fs_builder ibld(this, block, inst);
+
   for (int i = 0; i < inst->sources; i++) {
 if (inst->src[i].file != UNIFORM)
continue;
 
- int pull_index;
+ /* We'll handle this case later */
+ if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0)
+continue;
+
  unsigned location = inst->src[i].nr + inst->src[i].reg_offset;
- if (location >= uniforms) /* Out of bounds access */
-pull_index = -1;
- else
-pull_index = pull_constant_loc[location];
+ if (location >= uniforms)
+continue; /* Out of bounds access */
+
+ int pull_index = pull_constant_loc[location];
 
  if (pull_index == -1)
continue;
 
- /* Set up the annotation tracking for new generated instructions. */
- const fs_builder ibld(this, block, inst);
- const unsigned index = 
stage_prog_data->binding_table.pull_constants_start;
- fs_reg dst = vgrf(glsl_type::float_type);
-
  assert(inst->src[i].stride == 0);
 
- /* Generate a pull load into dst. */
- if (inst->src[i].reladdr) {
-VARYING_PULL_CONSTANT_LOAD(ibld, dst,
-   brw_imm_ud(index),
-   *inst->src[i].reladdr,
-   pull_index * 4);
-inst->src[i].reladdr = NULL;
-inst->src[i].stride = 1;
- } else {
-const fs_builder ubld = ibld.exec_all().group(8, 0);
-struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & 
~15);
-ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
-  dst, brw_imm_ud(index), offset);
-inst->src[i].set_smear(pull_index & 3);
- }
- brw_mark_surface_used(prog_data, index);
+ fs_reg dst = vgrf(glsl_type::float_type);
+ const fs_builder ubld = ibld.exec_all().group(8, 0);
+ struct brw_reg offset = brw_imm_ud((unsigned)(pull_index * 4) & ~15);
+ ubld.emit(FS_OPCODE_UNIFORM_PULL_CONSTANT_LOAD,
+   dst, brw_imm_ud(index), offset);
 
  /* Rewrite the instruction to use the temporary VGRF. */
  inst->src[i].file = VGRF;
  inst->src[i].nr = dst.nr;
  inst->src[i].reg_offset = 0;
+ inst->src[i].set_smear(pull_index & 3);
+
+ brw_mark_surface_used(prog_data, index);
+  }
+
+  if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT &&
+  inst->src[0].file == UNIFORM) {
+
+ unsigned location = inst->src[0].nr + inst->src[0].reg_offset;
+ if (location >= uniforms)
+continue; /* Out of bounds access */
+
+ int pull_index = pull_constant_loc[location];
+ assert(pull_index >= 0); /* This had better be pull */
+
+ VARYING_PULL_CONSTANT_LOAD(ibld, inst->dst,
+brw_imm_ud(index),
+inst->src[1],
+pull_index * 4);
+ inst->remove(block);
+
+ brw_mark_surface_used(prog_data, index);
   }
}
invalidate_live_intervals();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 15d9b1c..bf239c3 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i

[Mesa-dev] [PATCH 01/15] nir: Add another index to load_uniform to specify the range read

2015-12-09 Thread Jason Ekstrand

---
 src/glsl/nir/nir_intrinsics.h | 7 +--
 src/glsl/nir/nir_lower_io.c   | 5 +
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_intrinsics.h b/src/glsl/nir/nir_intrinsics.h
index 63df21e..c329a82 100644
--- a/src/glsl/nir/nir_intrinsics.h
+++ b/src/glsl/nir/nir_intrinsics.h
@@ -236,6 +236,9 @@ SYSTEM_VALUE(helper_invocation, 1, 0)
  * of the start of the variable being loaded and and the offset source is a
  * offset into that variable.
  *
+ * Uniform load operations have a second index that specifies the size of the
+ * variable being loaded.  If const_index[1] == 0, then the size is unknown.
+ *
  * Some load operations such as UBO/SSBO load and per_vertex loads take an
  * additional source to specify which UBO/SSBO/vertex to load from.
  *
@@ -248,8 +251,8 @@ SYSTEM_VALUE(helper_invocation, 1, 0)
 #define LOAD(name, srcs, indices, flags) \
INTRINSIC(load_##name, srcs, ARR(1, 1, 1, 1), true, 0, 0, indices, flags)
 
-/* src[] = { offset }. const_index[] = { base } */
-LOAD(uniform, 1, 1, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
+/* src[] = { offset }. const_index[] = { base, size } */
+LOAD(uniform, 1, 2, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { buffer_index, offset }. No const_index */
 LOAD(ubo, 2, 0, NIR_INTRINSIC_CAN_ELIMINATE | NIR_INTRINSIC_CAN_REORDER)
 /* src[] = { offset }. const_index[0] = { base } */
diff --git a/src/glsl/nir/nir_lower_io.c b/src/glsl/nir/nir_lower_io.c
index a2723d5..b73ceec 100644
--- a/src/glsl/nir/nir_lower_io.c
+++ b/src/glsl/nir/nir_lower_io.c
@@ -216,6 +216,11 @@ nir_lower_io_block(nir_block *block, void *void_state)
  load->const_index[0] =
 intrin->variables[0]->var->data.driver_location;
 
+ if (load->intrinsic == nir_intrinsic_load_uniform) {
+load->const_index[1] =
+   state->type_size(intrin->variables[0]->var->type);
+ }
+
  if (per_vertex)
 load->src[0] = nir_src_for_ssa(vertex_index);
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 11/15] i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions

2015-12-09 Thread Jason Ekstrand

It should work fine without it and the visitor can set it if it wants.
---
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index c25da07..d86eee1 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -366,7 +366,6 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
assert(inst->exec_size == 8 || devinfo->gen >= 8);
 
brw_MOV(p, addr, indirect_byte_offset);
-   brw_inst_set_mask_control(devinfo, brw_last_inst, BRW_MASK_DISABLE);
brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
 }
 
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 07/15] i965/vec4: Inline get_pull_constant_offset

2015-12-09 Thread Jason Ekstrand

It's not really doing enough anymore to justify a helper function.
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  2 --
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 37 ++
 2 files changed, 14 insertions(+), 25 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index ae5bf69..f2e5ce1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -284,8 +284,6 @@ public:
 
src_reg get_scratch_offset(bblock_t *block, vec4_instruction *inst,
  src_reg *reladdr, int reg_offset);
-   src_reg get_pull_constant_offset(bblock_t *block, vec4_instruction *inst,
-   src_reg *reladdr, int reg_offset);
void emit_scratch_read(bblock_t *block, vec4_instruction *inst,
  dst_reg dst,
  src_reg orig_src,
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 443d0eb..7712d34 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1464,27 +1464,6 @@ vec4_visitor::get_scratch_offset(bblock_t *block, 
vec4_instruction *inst,
}
 }
 
-src_reg
-vec4_visitor::get_pull_constant_offset(bblock_t * block, vec4_instruction 
*inst,
-  src_reg *reladdr, int reg_offset)
-{
-   if (reladdr) {
-  src_reg index = src_reg(this, glsl_type::int_type);
-
-  emit_before(block, inst, ADD(dst_reg(index), *reladdr,
-   brw_imm_d(reg_offset * 16)));
-
-  return index;
-   } else if (devinfo->gen >= 8) {
-  /* Store the offset in a GRF so we can send-from-GRF. */
-  src_reg offset = src_reg(this, glsl_type::int_type);
-  emit_before(block, inst, MOV(dst_reg(offset), brw_imm_d(reg_offset * 
16)));
-  return offset;
-   } else {
-  return brw_imm_d(reg_offset * 16);
-   }
-}
-
 /**
  * Emits an instruction before @inst to load the value named by @orig_src
  * from scratch space at @base_offset to @temp.
@@ -1666,8 +1645,20 @@ vec4_visitor::emit_pull_constant_load(bblock_t *block, 
vec4_instruction *inst,
 {
int reg_offset = base_offset + orig_src.reg_offset;
const unsigned index = prog_data->base.binding_table.pull_constants_start;
-   src_reg offset = get_pull_constant_offset(block, inst, orig_src.reladdr,
- reg_offset);
+
+   src_reg offset;
+   if (orig_src.reladdr) {
+  offset = src_reg(this, glsl_type::int_type);
+
+  emit_before(block, inst, ADD(dst_reg(offset), *orig_src.reladdr,
+   brw_imm_d(reg_offset * 16)));
+   } else if (devinfo->gen >= 8) {
+  /* Store the offset in a GRF so we can send-from-GRF. */
+  offset = src_reg(this, glsl_type::int_type);
+  emit_before(block, inst, MOV(dst_reg(offset), brw_imm_d(reg_offset * 
16)));
+   } else {
+  offset = brw_imm_d(reg_offset * 16);
+   }
 
emit_pull_constant_load_reg(temp,
brw_imm_ud(index),
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 09/15] i965/vec4: Get rid of the uniform_size array

2015-12-09 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  8 
 src/mesa/drivers/dri/i965/brw_vec4.h  |  2 --
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp|  9 -
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 11 ---
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  1 -
 5 files changed, 31 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index e4a405b..1304e23 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -466,11 +466,6 @@ vec4_visitor::split_uniform_registers()
 inst->src[i].reg_offset = 0;
   }
}
-
-   /* Update that everything is now vector-sized. */
-   for (int i = 0; i < this->uniforms; i++) {
-  this->uniform_size[i] = 1;
-   }
 }
 
 void
@@ -528,7 +523,6 @@ vec4_visitor::pack_uniform_registers()
 * push constants.
 */
for (int src = 0; src < uniforms; src++) {
-  assert(src < uniform_array_size);
   int size = chans_used[src];
 
   if (size == 0)
@@ -1588,8 +1582,6 @@ vec4_visitor::setup_uniforms(int reg)
 * matter what, or the GPU would hang.
 */
if (devinfo->gen < 6 && this->uniforms == 0) {
-  assert(this->uniforms < this->uniform_array_size);
-
   stage_prog_data->param =
  reralloc(NULL, stage_prog_data->param, const gl_constant_value *, 4);
   for (unsigned int i = 0; i < 4; i++) {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index e6d6c82..0dc04ea 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -115,8 +115,6 @@ public:
 */
dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int *uniform_size;
-   int uniform_array_size; /*< Size of the uniform_size array */
int uniforms;
 
src_reg shader_start_time;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
index 58b6612..bafc9a5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
@@ -118,15 +118,6 @@ void
 vec4_visitor::nir_setup_uniforms()
 {
uniforms = nir->num_uniforms / 16;
-
-   nir_foreach_variable(var, &nir->uniforms) {
-  /* UBO's and atomics don't take up space in the uniform file */
-  if (var->interface_type != NULL || var->type->contains_atomic())
- continue;
-
-  if (type_size_vec4(var->type) > 0)
- uniform_size[var->data.driver_location / 16] = 
type_size_vec4(var->type);
-   }
 }
 
 void
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index e7ab536..138db7e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1786,17 +1786,6 @@ vec4_visitor::vec4_visitor(const struct brw_compiler 
*compiler,
this->max_grf = devinfo->gen >= 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
 
this->uniforms = 0;
-
-   /* Initialize uniform_array_size to at least 1 because pre-gen6 VS requires
-* at least one. See setup_uniforms() in brw_vec4.cpp.
-*/
-   this->uniform_array_size = 1;
-   if (prog_data) {
-  this->uniform_array_size =
- MAX2(DIV_ROUND_UP(stage_prog_data->nr_params, 4), 1);
-   }
-
-   this->uniform_size = rzalloc_array(mem_ctx, int, this->uniform_array_size);
 }
 
 vec4_visitor::~vec4_visitor()
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
index fd8be7d..205323c 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp
@@ -261,7 +261,6 @@ void
 vec4_vs_visitor::setup_uniform_clipplane_values()
 {
for (int i = 0; i < key->nr_userclip_plane_consts; ++i) {
-  assert(this->uniforms < uniform_array_size);
   this->userplane[i] = dst_reg(UNIFORM, this->uniforms);
   this->userplane[i].type = BRW_REGISTER_TYPE_F;
   for (int j = 0; j < 4; ++j) {
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 15/15] i965/fs: Push small uniform arrays

2015-12-09 Thread Jason Ekstrand

Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations.  The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing.  Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up.  To do this,
we use an algorithm similar to the on in split_virtual_grfs.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 76 +---
 1 file changed, 53 insertions(+), 23 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index f3cf129..98f8336 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1911,9 +1911,7 @@ fs_visitor::compact_virtual_grfs()
  * maximum number of fragment shader uniform components (64).  If
  * there are too many of these, they'd fill up all of register space.
  * So, this will push some of them out to the pull constant buffer and
- * update the program to load them.  We also use pull constants for all
- * indirect constant loads because we don't support indirect accesses in
- * registers yet.
+ * update the program to load them.
  */
 void
 fs_visitor::assign_constant_locations()
@@ -1925,15 +1923,18 @@ fs_visitor::assign_constant_locations()
bool is_live[uniforms];
memset(is_live, 0, sizeof(is_live));
 
-   bool needs_pull[uniforms];
-   memset(needs_pull, 0, sizeof(is_live));
+   /* For each uniform slot, a value of true indicates that the given slot and
+* the next slot must remain contiguous.  This is used to keep us from
+* splitting arrays apart.
+*/
+   bool contiguous[uniforms];
+   memset(contiguous, 0, sizeof(contiguous));
 
/* First, we walk through the instructions and do two things:
 *
 *  1) Figure out which uniforms are live.
 *
-*  2) Find all indirect access of uniform arrays and flag them as needing
-* to go into the pull constant buffer.
+*  2) Mark any indirectly used ranges of registers as contiguous.
 *
 * Note that we don't move constant-indexed accesses to arrays.  No
 * testing has been done of the performance impact of this choice.
@@ -1946,12 +1947,16 @@ fs_visitor::assign_constant_locations()
  int constant_nr = inst->src[i].nr + inst->src[i].reg_offset;
 
  if (inst->opcode == SHADER_OPCODE_MOV_INDIRECT && i == 0) {
-for (unsigned j = 0; j < inst->src[2].ud / 4; j++) {
-   is_live[constant_nr + j] = true;
-   needs_pull[constant_nr + j] = true;
+assert(inst->src[2].ud % 4 == 0);
+unsigned last = constant_nr + (inst->src[2].ud / 4) - 1;
+assert(last < uniforms);
+
+for (unsigned j = constant_nr; j < last; j++) {
+   is_live[j] = true;
+   contiguous[j] = true;
 }
+is_live[last] = true;
  } else {
-/* Mark the the one accessed uniform as live */
 if (constant_nr >= 0 && constant_nr < (int) uniforms)
is_live[constant_nr] = true;
  }
@@ -1966,26 +1971,49 @@ fs_visitor::assign_constant_locations()
 * If changing this value, note the limitation about total_regs in
 * brw_curbe.c.
 */
-   unsigned int max_push_components = 16 * 8;
+   const unsigned int max_push_components = 16 * 8;
+
+   /* We push small arrays, but no bigger than 16 floats.  This is big enough
+* for a vec4 but hopefully not large enough to push out other stuff.  We
+* should probably use a better heuristic at some point.
+*/
+   const unsigned int max_chunk_size = 16;
+
unsigned int num_push_constants = 0;
unsigned int num_pull_constants = 0;
 
push_constant_loc = ralloc_array(mem_ctx, int, uniforms);
pull_constant_loc = ralloc_array(mem_ctx, int, uniforms);
 
-   for (unsigned int i = 0; i < uniforms; i++) {
-  push_constant_loc[i] = -1;
-  pull_constant_loc[i] = -1;
+   int chunk_start = -1;
+   for (unsigned u = 0; u < uniforms; u++) {
+  push_constant_loc[u] = -1;
+  pull_constant_loc[u] = -1;
 
-  if (!is_live[i])
+  if (!is_live[u])
  continue;
 
-  if (!needs_pull[i] && num_push_constants < max_push_components) {
- /* Retain as a push constant */
- push_constant_loc[i] = num_push_constants++;
-  } else {
- /* We have to pull it */
- pull_constant_loc[i] = num_pull_constants++;
+  /* This is the first live uniform in the chunk */
+  if (chunk_start < 0)
+ chunk_start = u;
+
+  /* If this element does not need to be contiguous with the next, we
+   * split at this point and everthing between chunk_start and u forms a
+   * single chunk.
+   */
+  if (!contiguous[u]) {
+ unsigned chunk_size = u - chunk_start + 1;
+
+ if (n

[Mesa-dev] [PATCH 12/15] i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr

2015-12-09 Thread Jason Ekstrand

The subnr field is in bytes so we don't need to multiply by type_sz.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index de33c1d..9eaf8d0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -865,7 +865,7 @@ fs_inst::regs_read(int arg) const
  * unread portion at the beginning.
  */
 if (src[0].subnr)
-   region_length += src[0].subnr * type_sz(src[0].type);
+   region_length += src[0].subnr;
 
 return DIV_ROUND_UP(region_length, REG_SIZE);
  } else {
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 06/15] i965/fs: Get rid of the param_size array

2015-12-09 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 1 -
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 --
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 9 -
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 3 ---
 4 files changed, 15 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 1add656..9b06ed2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -1018,7 +1018,6 @@ fs_visitor::import_uniforms(fs_visitor *v)
this->push_constant_loc = v->push_constant_loc;
this->pull_constant_loc = v->pull_constant_loc;
this->uniforms = v->uniforms;
-   this->param_size = v->param_size;
 }
 
 fs_reg *
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index b55589f..b42c49d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -311,8 +311,6 @@ public:
struct brw_stage_prog_data *prog_data;
struct gl_program *prog;
 
-   int *param_size;
-
int *virtual_grf_start;
int *virtual_grf_end;
brw::fs_live_variables *live_intervals;
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index bf239c3..22323b4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -174,15 +174,6 @@ fs_visitor::nir_setup_uniforms()
   return;
 
uniforms = nir->num_uniforms / 4;
-
-   nir_foreach_variable(var, &nir->uniforms) {
-  /* UBO's and atomics don't take up space in the uniform file */
-  if (var->interface_type != NULL || var->type->contains_atomic())
- continue;
-
-  if (type_size_scalar(var->type) > 0)
- param_size[var->data.driver_location / 4] = 
type_size_scalar(var->type);
-   }
 }
 
 static bool
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 68f2548..5616963 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -1011,9 +1011,6 @@ fs_visitor::init()
 
this->spilled_any_registers = false;
this->do_dual_src = false;
-
-   if (dispatch_width == 8)
-  this->param_size = rzalloc_array(mem_ctx, int, 
stage_prog_data->nr_params);
 }
 
 fs_visitor::~fs_visitor()
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 14/15] i965/fs: Rename demote_pull_constants to lower_constant_loads

2015-12-09 Thread Jason Ekstrand

---
 src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs.h   | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index a2ec03e..f3cf129 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2016,7 +2016,7 @@ fs_visitor::assign_constant_locations()
  * or VARYING_PULL_CONSTANT_LOAD instructions which load values into VGRFs.
  */
 void
-fs_visitor::demote_pull_constants()
+fs_visitor::lower_constant_loads()
 {
const unsigned index = stage_prog_data->binding_table.pull_constants_start;
 
@@ -5033,7 +5033,7 @@ fs_visitor::optimize()
bld = fs_builder(this, 64);
 
assign_constant_locations();
-   demote_pull_constants();
+   lower_constant_loads();
 
validate();
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index b42c49d..eb6ecb2 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -136,7 +136,7 @@ public:
void split_virtual_grfs();
bool compact_virtual_grfs();
void assign_constant_locations();
-   void demote_pull_constants();
+   void lower_constant_loads();
void invalidate_live_intervals();
void calculate_live_intervals();
void calculate_register_pressure();
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 13/15] i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware

2015-12-09 Thread Jason Ekstrand

While we're at it, we also add support for the possibility that the
indirect is, in fact, a constant.  This shouldn't happen in the common case
(if it does, that means NIR failed to constant-fold something), but it's
possible so we should handle it.
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  4 ++
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp | 51 +++---
 2 files changed, 42 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index 9eaf8d0..a2ec03e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -4424,6 +4424,10 @@ get_lowered_simd_width(const struct brw_device_info 
*devinfo,
case SHADER_OPCODE_TYPED_SURFACE_WRITE_LOGICAL:
   return 8;
 
+   case SHADER_OPCODE_MOV_INDIRECT:
+  /* Prior to Broadwell, we only have 8 address subregisters */
+  return devinfo->gen < 8 ? 8 : inst->exec_size;
+
default:
   return inst->exec_size;
}
diff --git a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
index d86eee1..7fa6d84 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_generator.cpp
@@ -351,22 +351,47 @@ fs_generator::generate_mov_indirect(fs_inst *inst,
 
unsigned imm_byte_offset = reg.nr * REG_SIZE + reg.subnr;
 
-   /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
-   struct brw_reg addr = vec8(brw_address_reg(0));
+   if (indirect_byte_offset.file == BRW_IMMEDIATE_VALUE) {
+  imm_byte_offset += indirect_byte_offset.ud;
 
-   /* The destination stride of an instruction (in bytes) must be greater
-* than or equal to the size of the rest of the instruction.  Since the
-* address register is of type UW, we can't use a D-type instruction.
-* In order to get around this, re re-type to UW and use a stride.
-*/
-   indirect_byte_offset =
-  retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
+  reg.nr = imm_byte_offset / REG_SIZE;
+  reg.subnr = imm_byte_offset % REG_SIZE;
+  brw_MOV(p, dst, reg);
+   } else {
+  /* Prior to Broadwell, there are only 8 address registers. */
+  assert(inst->exec_size == 8 || devinfo->gen >= 8);
 
-   /* Prior to Broadwell, there are only 8 address registers. */
-   assert(inst->exec_size == 8 || devinfo->gen >= 8);
+  /* We use VxH indirect addressing, clobbering a0.0 through a0.7. */
+  struct brw_reg addr = vec8(brw_address_reg(0));
 
-   brw_MOV(p, addr, indirect_byte_offset);
-   brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), dst.type));
+  /* The destination stride of an instruction (in bytes) must be greater
+   * than or equal to the size of the rest of the instruction.  Since the
+   * address register is of type UW, we can't use a D-type instruction.
+   * In order to get around this, re re-type to UW and use a stride.
+   */
+  indirect_byte_offset =
+ retype(spread(indirect_byte_offset, 2), BRW_REGISTER_TYPE_UW);
+
+  if (devinfo->gen < 8) {
+ /* Prior to broadwell, we have a restriction that the bottom 5 bits
+  * of the base offset and the bottom 5 bits of the indirect must add
+  * to less than 32.  In other words, the hardware needs to be able to
+  * add the bottom five bits of the two to get the subnumber and add
+  * the next 7 bits of each to get the actual register number.  Since
+  * the indirect may cause us to cross a register boundary, this makes
+  * it almost useless.  We could try and do something clever where we
+  * use a actual base offset if base_offset % 32 == 0 but that would
+  * mean we were generating different code depending on the base
+  * offset.  Instead, for the sake of consistency, we'll just do the
+  * add ourselves.
+  */
+ brw_ADD(p, addr, indirect_byte_offset, brw_imm_uw(imm_byte_offset));
+ brw_MOV(p, dst, retype(brw_VxH_indirect(0, 0), dst.type));
+  } else {
+ brw_MOV(p, addr, indirect_byte_offset);
+ brw_MOV(p, dst, retype(brw_VxH_indirect(0, imm_byte_offset), 
dst.type));
+  }
+   }
 }
 
 void
-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 00/15] i965: Rework uniform handling in the back-end

2015-12-09 Thread Jason Ekstrand

This series is kind-of two series in one.  First, we rework the way we do
uniform handling all throughout both backends.  In particular, we stop
using reladdr and, instead, start using MOV_INDIRECT.  Doing this allows us
some nice simplifications:  First, we no longer have the recursive reladdr
problem; it's gone by design.  Second, we can now get rid of the extra
arrays of uniform sizes that we've been carying around everywhere.  The
fact that they exist has bothered me for a while, but no more!

The second half of the series does a bunch of cleanups on the MOV_INDIRECT
instruction in the FS and then starts using it to do indirect push
constants.

Jason Ekstrand (15):
  nir: Add another index to load_uniform to specify the range read
  i965/fs: Use UD type for offsets in VARYING_PULL_CONSTANT_LOAD
  i965/fs: Use MOV_INDIRECT for all indirect uniform loads
  i965/fs: Get rid of reladdr
  i965/fs: Stop relying on param_size in assign_constant_locations
  i965/fs: Get rid of the param_size array
  i965/vec4: Inline get_pull_constant_offset
  i965/vec4: Use MOV_INDIRECT instead of reladdr for indirect push
constants
  i965/vec4: Get rid of the uniform_size array
  i965/fs: Add support for doing MOV_INDIRECT on uniforms
  i965/fs: Don't force MASK_DISABLE on INDIRECT_MOV instructions
  i965/fs: Fix regs_read() for MOV_INDIRECT with a non-zero subnr
  i965/fs: Add support for MOV_INDIRECT on pre-Broadwell hardware
  i965/fs: Rename demote_pull_constants to lower_constant_loads
  i965/fs: Push small uniform arrays

 src/glsl/nir/nir_intrinsics.h |   7 +-
 src/glsl/nir/nir_lower_io.c   |   5 +
 src/mesa/drivers/dri/i965/brw_fs.cpp  | 189 +-
 src/mesa/drivers/dri/i965/brw_fs.h|   4 +-
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp|  52 --
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp  |  62 ---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp  |   3 -
 src/mesa/drivers/dri/i965/brw_ir_fs.h |   5 +-
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  10 +-
 src/mesa/drivers/dri/i965/brw_vec4.h  |   7 +-
 src/mesa/drivers/dri/i965/brw_vec4_nir.cpp|  19 +--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 130 ++-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |   1 -
 13 files changed, 267 insertions(+), 227 deletions(-)

-- 
2.5.0.400.gff86faf

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa/shader: return correct attribute location for double matrix arrays

2015-12-09 Thread Dave Airlie

From: Dave Airlie 

If we have a dmat2[4], then dmat2[0] is at 17, dmat2[1] at 19,
dmat2[2] at 21 etc. The old code was returning 17,18,19.

I think this code is also wrong for float matricies as well.

This partly fixes:
GL41-CTS.vertex_attrib_64bit.limits_test
---
 src/mesa/main/shader_query.cpp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/shader_query.cpp b/src/mesa/main/shader_query.cpp
index 5d15006..faaf08c 100644
--- a/src/mesa/main/shader_query.cpp
+++ b/src/mesa/main/shader_query.cpp
@@ -858,7 +858,7 @@ program_resource_location(struct gl_shader_program *shProg,
   && array_index >= RESOURCE_VAR(res)->type->length) {
  return -1;
   }
-  return RESOURCE_VAR(res)->data.location + array_index - 
VERT_ATTRIB_GENERIC0;
+  return RESOURCE_VAR(res)->data.location + (array_index * 
RESOURCE_VAR(res)->type->without_array()->matrix_columns) - 
VERT_ATTRIB_GENERIC0;
case GL_PROGRAM_OUTPUT:
   /* If the output is an array, fail if the index is out of bounds. */
   if (array_index > 0
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeonsi: don't call of u_prims_for_vertices for patches and rectangles

2015-12-09 Thread Michel Dänzer

On 10.12.2015 06:58, Marek Olšák wrote:
> From: Marek Olšák 
> 
> Both caused a crash due to a division by zero in that function.
> This is an alternative fix.
> 
> Cc: 11.0 11.1 
> ---
>  src/gallium/drivers/radeonsi/si_state_draw.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
> b/src/gallium/drivers/radeonsi/si_state_draw.c
> index ee84a1f..e550011 100644
> --- a/src/gallium/drivers/radeonsi/si_state_draw.c
> +++ b/src/gallium/drivers/radeonsi/si_state_draw.c
> @@ -216,6 +216,18 @@ static void si_emit_derived_tess_state(struct si_context 
> *sctx,
>   radeon_emit(cs, tcs_out_layout | (num_tcs_output_cp << 26));
>  }
>  
> +static unsigned si_num_prims_for_vertices(const struct pipe_draw_info *info)
> +{
> + switch (info->mode) {
> + case PIPE_PRIM_PATCHES:
> + return info->count / info->vertices_per_patch;
> + case R600_PRIM_RECTANGLE_LIST:
> + return info->count / 3;
> + default:
> + return u_prims_for_vertices(info->mode, info->count);
> + }
> +}

I don't suppose it makes sense to handle PIPE_PRIM_PATCHES in
u_prims_for_vertices? Either way,

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] winsys/radeon: clear the buffer cache on allocation failure and try again

2015-12-09 Thread Michel Dänzer

On 10.12.2015 06:58, Marek Olšák wrote:
> From: Marek Olšák 

The series is

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] radeonsi: also print hexadecimal values for register fields in the IB parser

2015-12-09 Thread Michel Dänzer

On 10.12.2015 07:54, Marek Olšák wrote:
> From: Marek Olšák 
> 
> ---
>  src/gallium/drivers/radeonsi/si_debug.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
> b/src/gallium/drivers/radeonsi/si_debug.c
> index cce665e..034acf5 100644
> --- a/src/gallium/drivers/radeonsi/si_debug.c
> +++ b/src/gallium/drivers/radeonsi/si_debug.c
> @@ -61,13 +61,16 @@ static void print_spaces(FILE *f, unsigned num)
>  static void print_value(FILE *file, uint32_t value, int bits)
>  {
>   /* Guess if it's int or float */
> - if (value <= (1 << 15))
> - fprintf(file, "%u\n", value);
> - else {
> + if (value <= (1 << 15)) {
> + if (value <= 9)
> + fprintf(file, "%u\n", value);
> + else
> + fprintf(file, "%u (0x%0*x)\n", value, bits / 4, value);
> + } else {
>   float f = uif(value);
>  
>   if (fabs(f) < 10 && f*10 == floor(f*10))
> - fprintf(file, "%.1ff\n", f);
> + fprintf(file, "%.1ff (0x%0*x)\n", f, bits / 4, value);
>   else
>   /* Don't print more leading zeros than there are bits. 
> */
>   fprintf(file, "0x%0*x\n", bits / 4, value);
> 

Reviewed-by: Michel Dänzer 


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/3] mesa: docs: i965: Use correct doxygen groupings syntax

2015-12-09 Thread Ian Romanick

On 12/07/2015 12:18 PM, Sarah Sharp wrote:
> When reading the source code, it's useful to indicate that a group of
> fields in a struct are related in someway. The convention in Mesa seems
> to be:
> 
> struct foo {
> /**
>  * Related fields description
>  * @{
>  */
> int bar;
> char baz;
> */@}

  /*@}*/

> long qux;
> }
> 
> However, the doxygen syntax for grouping is:
> 
> struct foo {
> /**
>  * @defgroup group_name Related fields description

There are 193 uses of \name, but only one use @defgroup.  Assuming that
\name is also correct, and
https://www.stack.nl/~dimitri/doxygen/manual/grouping.html#memgroup
suggests that it is, we should stick with that.

If it's not correct... can a sed job convert the \name uses to \defgroup?

>  * @{
>  */
> int bar;
> char baz;
> */@}

  /*@}*/

> long qux;
> }
> 
> https://www.stack.nl/~dimitri/doxygen/manual/grouping.html
> 
> Without the group name definition, the fields don't get properly
> grouped. Instead, the group description is applied to the first field.
> You can see this in the current doxygen build, since there are no links
> to groups (doxygen calls them modules).
> 
> Fix this. Use the brw_ prefix for the group name, since I'm pretty sure
> group names are global.
> 
> Signed-off-by: Sarah Sharp 
> ---
>  src/mesa/drivers/dri/i965/brw_device_info.h | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_device_info.h 
> b/src/mesa/drivers/dri/i965/brw_device_info.h
> index 4911c23..89c1501 100644
> --- a/src/mesa/drivers/dri/i965/brw_device_info.h
> +++ b/src/mesa/drivers/dri/i965/brw_device_info.h
> @@ -49,8 +49,8 @@ struct brw_device_info
> bool has_resource_streamer;
>  
> /**
> -* Quirks:
> -*  @{
> +* @defgroup brw_quirks Hardware quirks
> +* @{
>  */
> bool has_negative_rhw_bug;
>  
> @@ -62,11 +62,11 @@ struct brw_device_info
>  * fragment shader instructions.
>  */
> bool needs_unlit_centroid_workaround;
> -   /** @} */
> +   /* @} */
>  
> /**
> -* GPU Limits:
> -*  @{
> +* @defgroup brw_gpu_limits GPU hardware limitations
> +* @{
>  */
> unsigned max_vs_threads;
> unsigned max_hs_threads;
> @@ -83,7 +83,7 @@ struct brw_device_info
>unsigned max_ds_entries;
>unsigned max_gs_entries;
> } urb;
> -   /** @} */
> +   /* @} */
>  };
>  
>  const struct brw_device_info *brw_get_device_info(int devid);
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/3] mesa: docs: Add link to planet.freedesktop.org

2015-12-09 Thread Ian Romanick

This patch is

Reviewed-by: Ian Romanick 

On 12/07/2015 12:18 PM, Sarah Sharp wrote:
> The freedesktop.org blog feeds aren't mentioned on either mesa3d.org or
> any of the graphics project wikis (including the DRI wiki) on
> freedeskop.org.  Fix that by linking to it from the sidebar.
> 
> Signed-off-by: Sarah Sharp 
> ---
>  docs/contents.html | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/docs/contents.html b/docs/contents.html
> index 6612cbe..a683b07 100644
> --- a/docs/contents.html
> +++ b/docs/contents.html
> @@ -90,6 +90,7 @@
>  http://www.opengl.org"; target="_parent">OpenGL website
>  http://dri.freedesktop.org"; target="_parent">DRI website
>  http://www.freedesktop.org"; target="_parent">freedesktop.org
> +http://planet.freedesktop.org"; target="_parent">Developer 
> blogs
>  
>  
>  Hosted by:
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] gallium/ddebug: add GALLIUM_DDEBUG_SKIP option

2015-12-09 Thread Nicolai Hähnle

From: Nicolai Hähnle 

When we know that hangs occur only very late in a reproducible run (e.g.
apitrace), we can save a lot of debugging time by skipping the flush and hang
detection for earlier draw calls.
---
 src/gallium/drivers/ddebug/dd_draw.c   | 39 +-
 src/gallium/drivers/ddebug/dd_pipe.h   |  3 +++
 src/gallium/drivers/ddebug/dd_screen.c |  9 
 3 files changed, 36 insertions(+), 15 deletions(-)

diff --git a/src/gallium/drivers/ddebug/dd_draw.c 
b/src/gallium/drivers/ddebug/dd_draw.c
index b443c5b..0778099 100644
--- a/src/gallium/drivers/ddebug/dd_draw.c
+++ b/src/gallium/drivers/ddebug/dd_draw.c
@@ -588,8 +588,11 @@ dd_context_flush(struct pipe_context *_pipe,
 static void
 dd_before_draw(struct dd_context *dctx)
 {
-   if (dd_screen(dctx->base.screen)->mode == DD_DETECT_HANGS &&
-   !dd_screen(dctx->base.screen)->no_flush)
+   struct dd_screen *dscreen = dd_screen(dctx->base.screen);
+
+   if (dscreen->mode == DD_DETECT_HANGS &&
+   !dscreen->no_flush &&
+   dctx->num_draw_calls >= dscreen->skip_count)
   dd_flush_and_handle_hang(dctx, NULL, 0,
"GPU hang most likely caused by internal "
"driver commands");
@@ -598,22 +601,28 @@ dd_before_draw(struct dd_context *dctx)
 static void
 dd_after_draw(struct dd_context *dctx, struct dd_call *call)
 {
-   switch (dd_screen(dctx->base.screen)->mode) {
-   case DD_DETECT_HANGS:
-  if (!dd_screen(dctx->base.screen)->no_flush &&
-  dd_flush_and_check_hang(dctx, NULL, 0)) {
- dd_dump_call(dctx, call, PIPE_DEBUG_DEVICE_IS_HUNG);
+   struct dd_screen *dscreen = dd_screen(dctx->base.screen);
 
- /* Terminate the process to prevent future hangs. */
- dd_kill_process();
+   if (dctx->num_draw_calls >= dscreen->skip_count) {
+  switch (dscreen->mode) {
+  case DD_DETECT_HANGS:
+ if (!dscreen->no_flush &&
+dd_flush_and_check_hang(dctx, NULL, 0)) {
+dd_dump_call(dctx, call, PIPE_DEBUG_DEVICE_IS_HUNG);
+
+/* Terminate the process to prevent future hangs. */
+dd_kill_process();
+ }
+ break;
+  case DD_DUMP_ALL_CALLS:
+ dd_dump_call(dctx, call, 0);
+ break;
+  default:
+ assert(0);
   }
-  break;
-   case DD_DUMP_ALL_CALLS:
-  dd_dump_call(dctx, call, 0);
-  break;
-   default:
-  assert(0);
}
+
+   ++dctx->num_draw_calls;
 }
 
 static void
diff --git a/src/gallium/drivers/ddebug/dd_pipe.h 
b/src/gallium/drivers/ddebug/dd_pipe.h
index 34f5920..a045518 100644
--- a/src/gallium/drivers/ddebug/dd_pipe.h
+++ b/src/gallium/drivers/ddebug/dd_pipe.h
@@ -45,6 +45,7 @@ struct dd_screen
unsigned timeout_ms;
enum dd_mode mode;
bool no_flush;
+   unsigned skip_count;
 };
 
 struct dd_query
@@ -110,6 +111,8 @@ struct dd_context
struct pipe_scissor_state scissors[PIPE_MAX_VIEWPORTS];
struct pipe_viewport_state viewports[PIPE_MAX_VIEWPORTS];
float tess_default_levels[6];
+
+   unsigned num_draw_calls;
 };
 
 
diff --git a/src/gallium/drivers/ddebug/dd_screen.c 
b/src/gallium/drivers/ddebug/dd_screen.c
index a776580..2716845 100644
--- a/src/gallium/drivers/ddebug/dd_screen.c
+++ b/src/gallium/drivers/ddebug/dd_screen.c
@@ -290,6 +290,9 @@ ddebug_screen_create(struct pipe_screen *screen)
   puts("$HOME/"DD_DIR"/ when a hang is detected.");
   puts("If 'noflush' is specified, only detect hangs in pipe->flush.");
   puts("");
+  puts("  GALLIUM_DDEBUG_SKIP=[count]");
+  puts("Skip flush and hang detection for the given initial number of 
draw calls.");
+  puts("");
   exit(0);
}
 
@@ -349,5 +352,11 @@ ddebug_screen_create(struct pipe_screen *screen)
   assert(0);
}
 
+   dscreen->skip_count = debug_get_num_option("GALLIUM_DDEBUG_SKIP", 0);
+   if (dscreen->skip_count > 0) {
+  fprintf(stderr, "Gallium debugger skipping the first %u draw calls.\n",
+  dscreen->skip_count);
+   }
+
return &dscreen->base;
 }
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] gallium/ddebug: regularly log the total number of draw calls

2015-12-09 Thread Nicolai Hähnle

From: Nicolai Hähnle 

This helps in the use of GALLIUM_DDEBUG_SKIP: first run a target application
with skip set to a very large number and note how many draw calls happen
before the bug. Then re-run, skipping the corresponding number of calls.
Despite the additional run, this can still be much faster than not skipping
anything.
---
 src/gallium/drivers/ddebug/dd_draw.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/gallium/drivers/ddebug/dd_draw.c 
b/src/gallium/drivers/ddebug/dd_draw.c
index 0778099..0d7ee9a 100644
--- a/src/gallium/drivers/ddebug/dd_draw.c
+++ b/src/gallium/drivers/ddebug/dd_draw.c
@@ -623,6 +623,9 @@ dd_after_draw(struct dd_context *dctx, struct dd_call *call)
}
 
++dctx->num_draw_calls;
+   if (dscreen->skip_count && dctx->num_draw_calls % 1 == 0)
+  fprintf(stderr, "Gallium debugger reached %u draw calls.\n",
+  dctx->num_draw_calls);
 }
 
 static void
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa/varray: set double arrays to non-normalised.

2015-12-09 Thread Ian Romanick

...and it matches what we do for single precision.

Reviewed-by: Ian Romanick 

Presumably this should also be a candidate for 11.0 and 11.1?

On 12/09/2015 04:37 PM, Dave Airlie wrote:
> From: Dave Airlie 
> 
> Doesn't have any effect in practice I don't think, but
> CTS reads back using GetVertexAttrib.
> 
> This fixes: GL41-CTS.vertex_attrib_64bit.get_vertex_attrib
> 
> Signed-off-by: Dave Airlie 
> ---
>  src/mesa/main/varray.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
> index 58f376b..c71e16a 100644
> --- a/src/mesa/main/varray.c
> +++ b/src/mesa/main/varray.c
> @@ -776,7 +776,7 @@ _mesa_VertexAttribLPointer(GLuint index, GLint size, 
> GLenum type,
>  
> update_array(ctx, "glVertexAttribLPointer", VERT_ATTRIB_GENERIC(index),
>  legalTypes, 1, 4,
> -size, type, stride, GL_TRUE, GL_FALSE, GL_TRUE, ptr);
> +size, type, stride, GL_FALSE, GL_FALSE, GL_TRUE, ptr);
>  }
>  
>  
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 8/8] glsl: only divide left components when it is a dual slot double.

2015-12-09 Thread Ian Romanick

On 12/09/2015 04:07 AM, Timothy Arceri wrote:
> On Wed, 2015-12-09 at 16:06 +1000, Dave Airlie wrote:
>> From: Dave Airlie 
>>
>> Signed-off-by: Dave Airlie 
>> ---
>>  src/glsl/lower_packed_varyings.cpp | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/src/glsl/lower_packed_varyings.cpp
>> b/src/glsl/lower_packed_varyings.cpp
>> index 037c27d..ec9af62 100644
>> --- a/src/glsl/lower_packed_varyings.cpp
>> +++ b/src/glsl/lower_packed_varyings.cpp
>> @@ -472,7 +472,7 @@
>> lower_packed_varyings_visitor::lower_rvalue(ir_rvalue *rvalue,
>>char right_swizzle_name[4] = { 0, 0, 0, 0 };
>>  
>>left_components = 4 - fine_location % 4;
>> -  if (rvalue->type->is_double()) {
>> +  if (rvalue->type->is_dual_slot_double()) {
> 
> The subject line says what the change is but there is no explanation on
> why it was made. Can you add more detail to the comment?
> 
> To me the existing code *seems* correct as all doubles take up twice
> the amount of components, why would we only divide by 2 when its a dual
> slot double? 

Yeah... it seems like after this change it might try to pack a double in
the W component of a vec3.

>>   /* We might actually end up with 0 left components! */
>>   left_components /= 2;
>>}
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa/varray: set double arrays to non-normalised.

2015-12-09 Thread Dave Airlie

From: Dave Airlie 

Doesn't have any effect in practice I don't think, but
CTS reads back using GetVertexAttrib.

This fixes: GL41-CTS.vertex_attrib_64bit.get_vertex_attrib

Signed-off-by: Dave Airlie 
---
 src/mesa/main/varray.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/varray.c b/src/mesa/main/varray.c
index 58f376b..c71e16a 100644
--- a/src/mesa/main/varray.c
+++ b/src/mesa/main/varray.c
@@ -776,7 +776,7 @@ _mesa_VertexAttribLPointer(GLuint index, GLint size, GLenum 
type,
 
update_array(ctx, "glVertexAttribLPointer", VERT_ATTRIB_GENERIC(index),
 legalTypes, 1, 4,
-size, type, stride, GL_TRUE, GL_FALSE, GL_TRUE, ptr);
+size, type, stride, GL_FALSE, GL_FALSE, GL_TRUE, ptr);
 }
 
 
-- 
2.5.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression

2015-12-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #20 from nerdopol...@verizon.net ---
And now all the examples from qtbase/examples/opengl are working as well even
after exporting LIBGL_ALWAYS_SOFTWARE as well with mesa master.
including contextinfo, qopenglwidget, qopenglwindow, and threadedqopenglwindow

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.

2015-12-09 Thread Jason Ekstrand

On Wed, Dec 9, 2015 at 12:08 PM, Kenneth Graunke  wrote:
> On Wednesday, December 09, 2015 08:03:25 AM Jason Ekstrand wrote:
>> On Dec 9, 2015 2:51 AM, "Kenneth Graunke"  wrote:
>> >
>> > My tessellation branch has two additional remap functions.  I don't want
>> > to replicate this logic there.
>> >
>> > Signed-off-by: Kenneth Graunke 
>> > ---
>> >  src/mesa/drivers/dri/i965/brw_nir.c | 78
>> -
>> >  1 file changed, 50 insertions(+), 28 deletions(-)
>> >
>> > Hey Jason,
>> >
>> > If you like this patch, and haven't yet merged your NIR input reworks,
>> > feel free to just squash it into your changes.  Or, we can land it
>> > separately after your changes.  It's up to you.
>> >
>> > Separating this out allows me to reuse this in my new tessellation input
>> > and output remapping functions, and also means we don't need to add
>> structs
>> > for the remap functions...we can just pass the builder, or inputs_read, or
>> > the VUE map...and not have to pack multiple things together.
>>
>> Sure.  It does make things simpler.
>>
>> >  --Ken
>> >
>> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
>> b/src/mesa/drivers/dri/i965/brw_nir.c
>> > index 14ad172..105a175 100644
>> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
>> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
>> > @@ -27,15 +27,19 @@
>> >  #include "glsl/nir/nir_builder.h"
>> >  #include "program/prog_to_nir.h"
>> >
>> > -struct remap_vs_attrs_state {
>> > -   nir_builder b;
>> > -   uint64_t inputs_read;
>> > -};
>> > -
>> > +/**
>> > + * In many cases, we just add the base and offset together, so there's no
>> > + * reason to keep them separate.  Sometimes, combining them is essential:
>> > + * if a shader only accesses part of a compound variable (such as a
>> matrix
>> > + * or array), the variable's base may not actually exist in the VUE map.
>> > + *
>> > + * This pass adds constant offsets to instr->const_index[0], and resets
>> > + * the offset source to 0.  Non-constant offsets remain unchanged.
>> > + */
>> >  static bool
>> > -remap_vs_attrs(nir_block *block, void *void_state)
>> > +add_const_offset_to_base(nir_block *block, void *closure)
>> >  {
>> > -   struct remap_vs_attrs_state *state = void_state;
>> > +   nir_builder *b = closure;
>> >
>> > nir_foreach_instr_safe(block, instr) {
>> >if (instr->type != nir_instr_type_intrinsic)
>> > @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state)
>> >
>> >nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>> >
>> > +  if (intrin->intrinsic == nir_intrinsic_load_input ||
>> > +  intrin->intrinsic == nir_intrinsic_load_per_vertex_input ||
>> > +  intrin->intrinsic == nir_intrinsic_load_output ||
>> > +  intrin->intrinsic == nir_intrinsic_load_per_vertex_output ||
>> > +  intrin->intrinsic == nir_intrinsic_store_output ||
>> > +  intrin->intrinsic == nir_intrinsic_store_per_vertex_output) {
>>
>> This seems a bit scortched-earth. It would be nice if the caller had a bit
>> more control.
>
> We could always add "do input"/"do output" options once we actually need
> them.  It means adding the structs back, but that's fine.
>
> I suppose for now we could ignore output intrinsics here.
>
>> > + nir_src *offset = nir_get_io_offset_src(intrin);
>> > + nir_const_value *const_offset = nir_src_as_const_value(*offset);
>> > +
>> > + if (const_offset) {
>> > +intrin->const_index[0] += const_offset->u[0];
>> > +b->cursor = nir_before_instr(&intrin->instr);
>> > +nir_instr_rewrite_src(&intrin->instr, offset,
>> > +  nir_src_for_ssa(nir_imm_int(b, 0)));
>> > + }
>>
>> Else???  It seems that you don't want to run this pass if you think you'll
>> ever hit an indirect.  I guess it's harmless to just do this for all direct
>> things in our driver, but it doesn't sit well.
>
> Else do nothing.   The problem I'm trying to avoid with this logic is
> that inputs/outputs which take up multiple slots and are directly
> accessed may only have slots assigned for the sub-components that are
> actually used.  So, I can't use the base offset, and need to move the
> base to the slot that's actually used.
>
> If something is accessed indirectly, we assign all of its slots, so
> using the base location is fine.

Ok, that makes sense.

>> > +  }
>> > +   }
>> > +   return true;
>> > +
>> > +}
>> > +
>> > +static bool
>> > +remap_vs_attrs(nir_block *block, void *closure)
>> > +{
>> > +   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
>> > +
>> > +   nir_foreach_instr(block, instr) {
>> > +  if (instr->type != nir_instr_type_intrinsic)
>> > + continue;
>> > +
>> > +  nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>> > +
>> >if (intrin->intrinsic == nir_intrinsic_load_input) {
>> >   /* Attributes come in a contiguous block, ordered by their
>> >

Re: [Mesa-dev] [PATCH] mesa: fix ID usage for buffer warnings

2015-12-09 Thread Ilia Mirkin

Reviewed-by: Ilia Mirkin 

On Wed, Dec 9, 2015 at 6:02 PM, Brian Paul  wrote:
> We need a different ID pointer for each call site.
> ---
>  src/mesa/main/bufferobj.c | 18 --
>  1 file changed, 12 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index 6bc1b5e..e0639c8 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -60,16 +60,16 @@
>
>  /**
>   * Helper to warn of possible performance issues, such as frequently
> - * updating a buffer created with GL_STATIC_DRAW.
> + * updating a buffer created with GL_STATIC_DRAW.  Called via the macro
> + * below.
>   */
>  static void
> -buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
> +buffer_usage_warning(struct gl_context *ctx, GLuint *id, const char *fmt, 
> ...)
>  {
> va_list args;
> -   GLuint msg_id = 0;
>
> va_start(args, fmt);
> -   _mesa_gl_vdebug(ctx, &msg_id,
> +   _mesa_gl_vdebug(ctx, id,
> MESA_DEBUG_SOURCE_API,
> MESA_DEBUG_TYPE_PERFORMANCE,
> MESA_DEBUG_SEVERITY_MEDIUM,
> @@ -77,6 +77,12 @@ buffer_usage_warning(struct gl_context *ctx, const char 
> *fmt, ...)
> va_end(args);
>  }
>
> +#define BUFFER_USAGE_WARNING(CTX, FMT, ...) \
> +   do { \
> +  static GLuint id = 0; \
> +  buffer_usage_warning(CTX, &id, FMT, ##__VA_ARGS__); \
> +   } while (0)
> +
>
>  /**
>   * Used as a placeholder for buffer objects between glGenBuffers() and
> @@ -1713,7 +1719,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct 
> gl_buffer_object *bufObj,
>/* If the application declared the buffer as static draw/copy or stream
> * draw, it should not be frequently modified with glBufferSubData.
> */
> -  buffer_usage_warning(ctx,
> +  BUFFER_USAGE_WARNING(ctx,
> "using %s(buffer %u, offset %u, size %u) to "
> "update a %s buffer",
> func, bufObj->Name, offset, size,
> @@ -2432,7 +2438,7 @@ _mesa_map_buffer_range(struct gl_context *ctx,
>if ((bufObj->Usage == GL_STATIC_DRAW ||
> bufObj->Usage == GL_STATIC_COPY) &&
>bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) {
> - buffer_usage_warning(ctx,
> + BUFFER_USAGE_WARNING(ctx,
>"using %s(buffer %u, offset %u, length %u) to "
>"update a %s buffer",
>func, bufObj->Name, offset, length,
> --
> 1.9.1
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] mesa: fix ID usage for buffer warnings

2015-12-09 Thread Brian Paul

We need a different ID pointer for each call site.
---
 src/mesa/main/bufferobj.c | 18 --
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 6bc1b5e..e0639c8 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -60,16 +60,16 @@
 
 /**
  * Helper to warn of possible performance issues, such as frequently
- * updating a buffer created with GL_STATIC_DRAW.
+ * updating a buffer created with GL_STATIC_DRAW.  Called via the macro
+ * below.
  */
 static void
-buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
+buffer_usage_warning(struct gl_context *ctx, GLuint *id, const char *fmt, ...)
 {
va_list args;
-   GLuint msg_id = 0;
 
va_start(args, fmt);
-   _mesa_gl_vdebug(ctx, &msg_id,
+   _mesa_gl_vdebug(ctx, id,
MESA_DEBUG_SOURCE_API,
MESA_DEBUG_TYPE_PERFORMANCE,
MESA_DEBUG_SEVERITY_MEDIUM,
@@ -77,6 +77,12 @@ buffer_usage_warning(struct gl_context *ctx, const char 
*fmt, ...)
va_end(args);
 }
 
+#define BUFFER_USAGE_WARNING(CTX, FMT, ...) \
+   do { \
+  static GLuint id = 0; \
+  buffer_usage_warning(CTX, &id, FMT, ##__VA_ARGS__); \
+   } while (0)
+
 
 /**
  * Used as a placeholder for buffer objects between glGenBuffers() and
@@ -1713,7 +1719,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct 
gl_buffer_object *bufObj,
   /* If the application declared the buffer as static draw/copy or stream
* draw, it should not be frequently modified with glBufferSubData.
*/
-  buffer_usage_warning(ctx,
+  BUFFER_USAGE_WARNING(ctx,
"using %s(buffer %u, offset %u, size %u) to "
"update a %s buffer",
func, bufObj->Name, offset, size,
@@ -2432,7 +2438,7 @@ _mesa_map_buffer_range(struct gl_context *ctx,
   if ((bufObj->Usage == GL_STATIC_DRAW ||
bufObj->Usage == GL_STATIC_COPY) &&
   bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) {
- buffer_usage_warning(ctx,
+ BUFFER_USAGE_WARNING(ctx,
   "using %s(buffer %u, offset %u, length %u) to "
   "update a %s buffer",
   func, bufObj->Name, offset, length,
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeonsi: also print hexadecimal values for register fields in the IB parser

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/drivers/radeonsi/si_debug.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/si_debug.c 
b/src/gallium/drivers/radeonsi/si_debug.c
index cce665e..034acf5 100644
--- a/src/gallium/drivers/radeonsi/si_debug.c
+++ b/src/gallium/drivers/radeonsi/si_debug.c
@@ -61,13 +61,16 @@ static void print_spaces(FILE *f, unsigned num)
 static void print_value(FILE *file, uint32_t value, int bits)
 {
/* Guess if it's int or float */
-   if (value <= (1 << 15))
-   fprintf(file, "%u\n", value);
-   else {
+   if (value <= (1 << 15)) {
+   if (value <= 9)
+   fprintf(file, "%u\n", value);
+   else
+   fprintf(file, "%u (0x%0*x)\n", value, bits / 4, value);
+   } else {
float f = uif(value);
 
if (fabs(f) < 10 && f*10 == floor(f*10))
-   fprintf(file, "%.1ff\n", f);
+   fprintf(file, "%.1ff (0x%0*x)\n", f, bits / 4, value);
else
/* Don't print more leading zeros than there are bits. 
*/
fprintf(file, "0x%0*x\n", bits / 4, value);
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH v2] nv50, nvc0: optimize coherent buffer checking at draw time

2015-12-09 Thread Ilia Mirkin

On Wed, Dec 9, 2015 at 5:40 PM, Samuel Pitoiset
 wrote:
> Instead of iterating over all the buffer resources looking for coherent
> buffers, we keep track of a context-wide count. This will save some
> iterations (and CPU cycles) in 99.99% case because usually coherent
> buffers are not so used.
>
> Changes from v2:
>  - forgot to apply some changes for nv50 (texture/vertex bufs)
>
> Signed-off-by: Samuel Pitoiset 
> ---
>  src/gallium/drivers/nouveau/nv50/nv50_context.h |  3 ++
>  src/gallium/drivers/nouveau/nv50/nv50_state.c   | 19 +++
>  src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 44 
> +++--
>  src/gallium/drivers/nouveau/nvc0/nvc0_context.h |  3 ++
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 26 +++
>  src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +--
>  6 files changed, 64 insertions(+), 72 deletions(-)
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
> b/src/gallium/drivers/nouveau/nv50/nv50_context.h
> index 2cebcd9..712d00e 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
> @@ -134,9 +134,11 @@ struct nv50_context {
> struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS];
> uint16_t constbuf_dirty[3];
> uint16_t constbuf_valid[3];
> +   uint16_t constbuf_coherent[3];
>
> struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
> unsigned num_vtxbufs;
> +   uint32_t vtxbufs_coherent;
> struct pipe_index_buffer idxbuf;
> uint32_t vbo_fifo; /* bitmask of vertex elements to be pushed to FIFO */
> uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */
> @@ -148,6 +150,7 @@ struct nv50_context {
>
> struct pipe_sampler_view *textures[3][PIPE_MAX_SAMPLERS];
> unsigned num_textures[3];
> +   uint32_t textures_coherent[3];
> struct nv50_tsc_entry *samplers[3][PIPE_MAX_SAMPLERS];
> unsigned num_samplers[3];
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> index fd7c7cd..b6e5c75 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
> @@ -661,9 +661,16 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
> int s,
> assert(nr <= PIPE_MAX_SAMPLERS);
> for (i = 0; i < nr; ++i) {
>struct nv50_tic_entry *old = nv50_tic_entry(nv50->textures[s][i]);
> +  struct pipe_resource *res = views[i]->texture;

I'm moderately sure either views[i] or texture can be null. [Same for nvc0.]

>if (old)
>   nv50_screen_tic_unlock(nv50->screen, old);
>
> +  if (res->target == PIPE_BUFFER &&
> +  (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
> + nv50->textures_coherent[s] |= 1 << i;
> +  else
> + nv50->textures_coherent[s] &= ~(1 << i);
> +
>pipe_sampler_view_reference(&nv50->textures[s][i], views[i]);
> }
>
> @@ -852,8 +859,13 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint 
> shader, uint index,
>nv50->constbuf[s][i].offset = cb->buffer_offset;
>nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 
> 0x1);
>nv50->constbuf_valid[s] |= 1 << i;
> +  if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
> + nv50->constbuf_coherent[s] |= 1 << i;
> +  else
> + nv50->constbuf_coherent[s] &= ~(1 << i);

You also need to clear it out in the user case.

> } else {
>nv50->constbuf_valid[s] &= ~(1 << i);
> +  nv50->constbuf_coherent[s] &= ~(1 << i);
> }
> nv50->constbuf_dirty[s] |= 1 << i;
>
> @@ -1000,6 +1012,7 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
> if (!vb) {
>nv50->vbo_user &= ~(((1ull << count) - 1) << start_slot);
>nv50->vbo_constant &= ~(((1ull << count) - 1) << start_slot);
> +  nv50->vtxbufs_coherent &= ~(((1ull << count) - 1) << start_slot);
>return;
> }
>
> @@ -1012,9 +1025,15 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
>  nv50->vbo_constant |= 1 << dst_index;
>   else
>  nv50->vbo_constant &= ~(1 << dst_index);
> + nv50->vtxbufs_coherent &= ~(1 << dst_index);
>} else {
>   nv50->vbo_user &= ~(1 << dst_index);
>   nv50->vbo_constant &= ~(1 << dst_index);
> +
> + if (vb[i].buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)

I wonder if vb[i].buffer might be null here... Not sure if that's
allowed or not...

> +nv50->vtxbufs_coherent |= (1 << dst_index);
> + else
> +nv50->vtxbufs_coherent &= ~(1 << dst_index);
>}
> }
>
> diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c 
> b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
> index 85878d5..b6ba803 100644
> --- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
> +++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
> @@ -761,8 +761,7 @@ nv50_draw_vbo(struct pipe_context *pipe, const

[Mesa-dev] [PATCH v2] nv50, nvc0: optimize coherent buffer checking at draw time

2015-12-09 Thread Samuel Pitoiset

Instead of iterating over all the buffer resources looking for coherent
buffers, we keep track of a context-wide count. This will save some
iterations (and CPU cycles) in 99.99% case because usually coherent
buffers are not so used.

Changes from v2:
 - forgot to apply some changes for nv50 (texture/vertex bufs)

Signed-off-by: Samuel Pitoiset 
---
 src/gallium/drivers/nouveau/nv50/nv50_context.h |  3 ++
 src/gallium/drivers/nouveau/nv50/nv50_state.c   | 19 +++
 src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 44 +++--
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h |  3 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 26 +++
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +--
 6 files changed, 64 insertions(+), 72 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
b/src/gallium/drivers/nouveau/nv50/nv50_context.h
index 2cebcd9..712d00e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
@@ -134,9 +134,11 @@ struct nv50_context {
struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS];
uint16_t constbuf_dirty[3];
uint16_t constbuf_valid[3];
+   uint16_t constbuf_coherent[3];
 
struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
unsigned num_vtxbufs;
+   uint32_t vtxbufs_coherent;
struct pipe_index_buffer idxbuf;
uint32_t vbo_fifo; /* bitmask of vertex elements to be pushed to FIFO */
uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */
@@ -148,6 +150,7 @@ struct nv50_context {
 
struct pipe_sampler_view *textures[3][PIPE_MAX_SAMPLERS];
unsigned num_textures[3];
+   uint32_t textures_coherent[3];
struct nv50_tsc_entry *samplers[3][PIPE_MAX_SAMPLERS];
unsigned num_samplers[3];
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index fd7c7cd..b6e5c75 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -661,9 +661,16 @@ nv50_stage_set_sampler_views(struct nv50_context *nv50, 
int s,
assert(nr <= PIPE_MAX_SAMPLERS);
for (i = 0; i < nr; ++i) {
   struct nv50_tic_entry *old = nv50_tic_entry(nv50->textures[s][i]);
+  struct pipe_resource *res = views[i]->texture;
   if (old)
  nv50_screen_tic_unlock(nv50->screen, old);
 
+  if (res->target == PIPE_BUFFER &&
+  (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT))
+ nv50->textures_coherent[s] |= 1 << i;
+  else
+ nv50->textures_coherent[s] &= ~(1 << i);
+
   pipe_sampler_view_reference(&nv50->textures[s][i], views[i]);
}
 
@@ -852,8 +859,13 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint 
shader, uint index,
   nv50->constbuf[s][i].offset = cb->buffer_offset;
   nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 0x1);
   nv50->constbuf_valid[s] |= 1 << i;
+  if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
+ nv50->constbuf_coherent[s] |= 1 << i;
+  else
+ nv50->constbuf_coherent[s] &= ~(1 << i);
} else {
   nv50->constbuf_valid[s] &= ~(1 << i);
+  nv50->constbuf_coherent[s] &= ~(1 << i);
}
nv50->constbuf_dirty[s] |= 1 << i;
 
@@ -1000,6 +1012,7 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
if (!vb) {
   nv50->vbo_user &= ~(((1ull << count) - 1) << start_slot);
   nv50->vbo_constant &= ~(((1ull << count) - 1) << start_slot);
+  nv50->vtxbufs_coherent &= ~(((1ull << count) - 1) << start_slot);
   return;
}
 
@@ -1012,9 +1025,15 @@ nv50_set_vertex_buffers(struct pipe_context *pipe,
 nv50->vbo_constant |= 1 << dst_index;
  else
 nv50->vbo_constant &= ~(1 << dst_index);
+ nv50->vtxbufs_coherent &= ~(1 << dst_index);
   } else {
  nv50->vbo_user &= ~(1 << dst_index);
  nv50->vbo_constant &= ~(1 << dst_index);
+
+ if (vb[i].buffer->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
+nv50->vtxbufs_coherent |= (1 << dst_index);
+ else
+nv50->vtxbufs_coherent &= ~(1 << dst_index);
   }
}
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c 
b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
index 85878d5..b6ba803 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
@@ -761,8 +761,7 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 {
struct nv50_context *nv50 = nv50_context(pipe);
struct nouveau_pushbuf *push = nv50->base.pushbuf;
-   bool tex_dirty = false;
-   int i, s;
+   int s;
 
/* NOTE: caller must ensure that (min_index + index_bias) is >= 0 */
nv50->vb_elt_first = info->min_index + info->index_bias;
@@ -791,27 +790,9 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 
push->kick_notify = nv50_draw_vbo_kick_notify;

[Mesa-dev] [PATCH] radeonsi: implement RB+ for Stoney (v2)

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

v2: fix dual source blending
---
 src/gallium/drivers/radeon/r600_pipe_common.c |   1 +
 src/gallium/drivers/radeon/r600_pipe_common.h |   3 +
 src/gallium/drivers/radeon/r600_texture.c |   6 +
 src/gallium/drivers/radeonsi/si_state.c   | 159 +-
 src/gallium/drivers/radeonsi/sid.h|   3 +
 5 files changed, 170 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeon/r600_pipe_common.c 
b/src/gallium/drivers/radeon/r600_pipe_common.c
index 8899ba4..ba541ac 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.c
+++ b/src/gallium/drivers/radeon/r600_pipe_common.c
@@ -375,6 +375,7 @@ static const struct debug_named_value 
common_debug_options[] = {
{ "check_vm", DBG_CHECK_VM, "Check VM faults and dump debug info." },
{ "nodcc", DBG_NO_DCC, "Disable DCC." },
{ "nodccclear", DBG_NO_DCC_CLEAR, "Disable DCC fast clear." },
+   { "norbplus", DBG_NO_RB_PLUS, "Disable RB+ on Stoney." },
 
DEBUG_NAMED_VALUE_END /* must be last */
 };
diff --git a/src/gallium/drivers/radeon/r600_pipe_common.h 
b/src/gallium/drivers/radeon/r600_pipe_common.h
index 8c6c0c3..dd23ed5 100644
--- a/src/gallium/drivers/radeon/r600_pipe_common.h
+++ b/src/gallium/drivers/radeon/r600_pipe_common.h
@@ -86,6 +86,7 @@
 #define DBG_CHECK_VM   (1llu << 42)
 #define DBG_NO_DCC (1llu << 43)
 #define DBG_NO_DCC_CLEAR   (1llu << 44)
+#define DBG_NO_RB_PLUS (1llu << 45)
 
 #define R600_MAP_BUFFER_ALIGNMENT 64
 
@@ -250,6 +251,8 @@ struct r600_surface {
unsigned cb_color_fmask_slice;  /* EG and later */
unsigned cb_color_cmask;/* CB_COLORn_TILE (r600 only) */
unsigned cb_color_mask; /* R600 only */
+   unsigned sx_ps_downconvert; /* Stoney only */
+   unsigned sx_blend_opt_epsilon;  /* Stoney only */
struct r600_resource *cb_buffer_fmask; /* Used for FMASK relocations. 
R600 only */
struct r600_resource *cb_buffer_cmask; /* Used for CMASK relocations. 
R600 only */
 
diff --git a/src/gallium/drivers/radeon/r600_texture.c 
b/src/gallium/drivers/radeon/r600_texture.c
index 774722f..8c145e5 100644
--- a/src/gallium/drivers/radeon/r600_texture.c
+++ b/src/gallium/drivers/radeon/r600_texture.c
@@ -1393,6 +1393,7 @@ void evergreen_do_fast_color_clear(struct 
r600_common_context *rctx,
return;
 
for (i = 0; i < fb->nr_cbufs; i++) {
+   struct r600_surface *surf;
struct r600_texture *tex;
unsigned clear_bit = PIPE_CLEAR_COLOR0 << i;
 
@@ -1403,6 +1404,7 @@ void evergreen_do_fast_color_clear(struct 
r600_common_context *rctx,
if (!(*buffers & clear_bit))
continue;
 
+   surf = (struct r600_surface *)fb->cbufs[i];
tex = (struct r600_texture *)fb->cbufs[i]->texture;
 
/* 128-bit formats are unusupported */
@@ -1449,6 +1451,10 @@ void evergreen_do_fast_color_clear(struct 
r600_common_context *rctx,
if (clear_words_needed)
tex->dirty_level_mask |= 1 << 
fb->cbufs[i]->u.tex.level;
} else {
+   /* RB+ doesn't work with CMASK fast clear. */
+   if (surf->sx_ps_downconvert)
+   continue;
+
/* ensure CMASK is enabled */
r600_texture_alloc_cmask_separate(rctx->screen, tex);
if (tex->cmask.size == 0) {
diff --git a/src/gallium/drivers/radeonsi/si_state.c 
b/src/gallium/drivers/radeonsi/si_state.c
index 2ebfa1c..dcf4a7b 100644
--- a/src/gallium/drivers/radeonsi/si_state.c
+++ b/src/gallium/drivers/radeonsi/si_state.c
@@ -347,10 +347,54 @@ static uint32_t si_translate_blend_factor(int blend_fact)
return 0;
 }
 
+static uint32_t si_translate_blend_opt_function(int blend_func)
+{
+   switch (blend_func) {
+   case PIPE_BLEND_ADD:
+   return V_028760_OPT_COMB_ADD;
+   case PIPE_BLEND_SUBTRACT:
+   return V_028760_OPT_COMB_SUBTRACT;
+   case PIPE_BLEND_REVERSE_SUBTRACT:
+   return V_028760_OPT_COMB_REVSUBTRACT;
+   case PIPE_BLEND_MIN:
+   return V_028760_OPT_COMB_MIN;
+   case PIPE_BLEND_MAX:
+   return V_028760_OPT_COMB_MAX;
+   default:
+   return V_028760_OPT_COMB_BLEND_DISABLED;
+   }
+}
+
+static uint32_t si_translate_blend_opt_factor(int blend_fact, bool is_alpha)
+{
+   switch (blend_fact) {
+   case PIPE_BLENDFACTOR_ZERO:
+   return V_028760_BLEND_OPT_PRESERVE_NONE_IGNORE_ALL;
+   case PIPE_BLENDFACTOR_ONE:
+   return V_028760_BLEND_OPT_PRESERVE_ALL_IGNORE_NONE;
+   case PIPE_BLENDFACTOR_SRC_COLOR:
+   return is_alpha ? V_028760_BLEND_OPT_PRESERVE_A1_IGNORE_A0
+   : V_028760_BLEND_OPT_PRESERVE_C1_IGNORE_C0;
+

Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output

2015-12-09 Thread Ilia Mirkin

On Wed, Dec 9, 2015 at 5:23 PM, Brian Paul  wrote:
> On 12/09/2015 11:43 AM, Ilia Mirkin wrote:
>>
>> On Mon, Dec 7, 2015 at 8:42 PM, Brian Paul  wrote:
>>>
>>> When a buffer is created with GL_STATIC_DRAW, its contents should not
>>> be changed frequently.  But that's exactly what one application I'm
>>> debugging does.  This patch adds code to try to detect inefficient
>>> buffer use in a couple places.  The GL_ARB_debug_output mechanism is
>>> used to report the issue.
>>>
>>> NVIDIA's driver detects these sort of things too.
>>>
>>> Other types of inefficient buffer use could also be detected in the
>>> future.
>>> ---
>>>   src/mesa/main/bufferobj.c | 55
>>> +++
>>>   src/mesa/main/mtypes.h|  4 
>>>   2 files changed, 59 insertions(+)
>>>
>>> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
>>> index f985982..6bc1b5e 100644
>>> --- a/src/mesa/main/bufferobj.c
>>> +++ b/src/mesa/main/bufferobj.c
>>> @@ -51,6 +51,34 @@
>>>
>>>
>>>   /**
>>> + * We count the number of buffer modification calls to check for
>>> + * inefficient buffer use.  This is the number of such calls before we
>>> + * issue a warning.
>>> + */
>>> +#define BUFFER_WARNING_CALL_COUNT 4
>>> +
>>> +
>>> +/**
>>> + * Helper to warn of possible performance issues, such as frequently
>>> + * updating a buffer created with GL_STATIC_DRAW.
>>> + */
>>> +static void
>>> +buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
>>> +{
>>> +   va_list args;
>>> +   GLuint msg_id = 0;
>>
>>
>> This needs to be wrapped in a macro, with a 'static' id (at each macro
>> invocation), otherwise a fresh id will get generated each time this is
>> called, which is presumably not desirable. Same as what I did with
>> pipe_debug_message/_pipe_debug_message.
>
>
> Is the macro required, or can I just pass a pointer to a static GLuint from
> each call site?  I took a quick stab at the macro but I'm having trouble
> with the varargs stuff.

The macro is not required... but you can copy the one I used for
pipe_debug_message. Passing in a different static variable from each
callsite would work too -- that's precisely what the macro would be
doing in the first place.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output

2015-12-09 Thread Brian Paul


On 12/09/2015 11:43 AM, Ilia Mirkin wrote:

On Mon, Dec 7, 2015 at 8:42 PM, Brian Paul  wrote:

When a buffer is created with GL_STATIC_DRAW, its contents should not
be changed frequently.  But that's exactly what one application I'm
debugging does.  This patch adds code to try to detect inefficient
buffer use in a couple places.  The GL_ARB_debug_output mechanism is
used to report the issue.

NVIDIA's driver detects these sort of things too.

Other types of inefficient buffer use could also be detected in the
future.
---
  src/mesa/main/bufferobj.c | 55 +++
  src/mesa/main/mtypes.h|  4 
  2 files changed, 59 insertions(+)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index f985982..6bc1b5e 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -51,6 +51,34 @@


  /**
+ * We count the number of buffer modification calls to check for
+ * inefficient buffer use.  This is the number of such calls before we
+ * issue a warning.
+ */
+#define BUFFER_WARNING_CALL_COUNT 4
+
+
+/**
+ * Helper to warn of possible performance issues, such as frequently
+ * updating a buffer created with GL_STATIC_DRAW.
+ */
+static void
+buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
+{
+   va_list args;
+   GLuint msg_id = 0;


This needs to be wrapped in a macro, with a 'static' id (at each macro
invocation), otherwise a fresh id will get generated each time this is
called, which is presumably not desirable. Same as what I did with
pipe_debug_message/_pipe_debug_message.


Is the macro required, or can I just pass a pointer to a static GLuint 
from each call site?  I took a quick stab at the macro but I'm having 
trouble with the varargs stuff.


-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports

2015-12-09 Thread Roland Scheidegger

Am 09.12.2015 um 19:59 schrieb eocallag...@alterapraxis.com:
> Roland,
> 
> I could not due to ml size limit or something, it just bounces hence the
> pull request.
Due to size limit? Didn't look that big to me...

Otherwise, looks good to me, though the second commit should mention it
actually enables the extension.

Roland


> Cheers,
> Edward.
> 
> On 2015-12-10 02:38, Roland Scheidegger wrote:
>> Am 09.12.2015 um 05:16 schrieb Edward O'Callaghan:
>>> This fixes my initial attempt so that piglit now passes 14/14. Thanks
>>> to a couple of tips from Roland in the previous patch I was able to
>>> fix the remaining issue. This should be golden now.
>>>
>>
>> Great that you got it working!
>> Please send the patches to the ml.
>>
>> Roland
> 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] st/mesa: fix GLSL uniform updates for glBitmap & glDrawPixels

2015-12-09 Thread Marek Olšák

On Mon, Dec 7, 2015 at 5:36 PM, Brian Paul  wrote:
> On 12/06/2015 04:34 PM, Marek Olšák wrote:
>>
>> From: Marek Olšák 
>>
>> Spotted by luck. The GLSL uniform storage is only associated once
>> in LinkShader and can't be reallocated afterwards, because that would
>> break the association.
>> ---
>>   src/mesa/state_tracker/st_cb_bitmap.c  |  6 +-
>>   src/mesa/state_tracker/st_cb_drawpixels.c  |  6 --
>>   src/mesa/state_tracker/st_glsl_to_tgsi.cpp |  6 ++
>>   src/mesa/state_tracker/st_program.c| 17 -
>>   src/mesa/state_tracker/st_program.h|  1 -
>>   5 files changed, 15 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/mesa/state_tracker/st_cb_bitmap.c
>> b/src/mesa/state_tracker/st_cb_bitmap.c
>> index cbc6845..a4a48a6 100644
>> --- a/src/mesa/state_tracker/st_cb_bitmap.c
>> +++ b/src/mesa/state_tracker/st_cb_bitmap.c
>> @@ -287,7 +287,8 @@ draw_bitmap_quad(struct gl_context *ctx, GLint x,
>> GLint y, GLfloat z,
>> GLfloat colorSave[4];
>> COPY_4V(colorSave, ctx->Current.Attrib[VERT_ATTRIB_COLOR0]);
>> COPY_4V(ctx->Current.Attrib[VERT_ATTRIB_COLOR0], color);
>> -  st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT);
>> +  st_upload_constants(st, st->fp->Base.Base.Parameters,
>> +  PIPE_SHADER_FRAGMENT);
>> COPY_4V(ctx->Current.Attrib[VERT_ATTRIB_COLOR0], colorSave);
>>  }
>>
>> @@ -404,6 +405,9 @@ draw_bitmap_quad(struct gl_context *ctx, GLint x,
>> GLint y, GLfloat z,
>>  cso_restore_stream_outputs(cso);
>>
>>  pipe_resource_reference(&vbuf, NULL);
>> +
>> +   /* We uploaded modified constants, need to invalidate them. */
>> +   st->dirty.mesa |= _NEW_PROGRAM_CONSTANTS;
>>   }
>>
>>
>> diff --git a/src/mesa/state_tracker/st_cb_drawpixels.c
>> b/src/mesa/state_tracker/st_cb_drawpixels.c
>> index 262ad80..e295f54 100644
>> --- a/src/mesa/state_tracker/st_cb_drawpixels.c
>> +++ b/src/mesa/state_tracker/st_cb_drawpixels.c
>> @@ -1109,9 +1109,6 @@ st_DrawPixels(struct gl_context *ctx, GLint x, GLint
>> y,
>>
>> st->pixel_xfer.pixelmap_sampler_view);
>>num_sampler_view++;
>> }
>> -
>> -  /* update fragment program constants */
>> -  st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT);
>>  }
>>
>>  /* Put glDrawPixels image into a texture */
>> @@ -1462,9 +1459,6 @@ st_CopyPixels(struct gl_context *ctx, GLint srcx,
>> GLint srcy,
>>
>> st->pixel_xfer.pixelmap_sampler_view);
>>num_sampler_view++;
>> }
>> -
>> -  /* update fragment program constants */
>> -  st_upload_constants(st, fpv->parameters, PIPE_SHADER_FRAGMENT);
>>  }
>>  else {
>> assert(type == GL_DEPTH);
>> diff --git a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> index 40c7725..a32c4cf 100644
>> --- a/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> +++ b/src/mesa/state_tracker/st_glsl_to_tgsi.cpp
>> @@ -5640,6 +5640,12 @@ get_mesa_program(struct gl_context *ctx,
>>
>>  _mesa_reference_program(ctx, &shader->Program, prog);
>>
>> +   /* Avoid reallocation of the program parameter list, because the
>> uniform
>> +* storage is only associated with the original parameter list.
>> +* This should be enough for Bitmap and DrawPixels constants.
>> +*/
>> +   _mesa_reserve_parameter_storage(prog->Parameters, 8);
>> +
>>  /* This has to be done last.  Any operation the can cause
>>   * prog->ParameterValues to get reallocated (e.g., anything that adds
>> a
>>   * program constant) has to happen before creating this linkage.
>> diff --git a/src/mesa/state_tracker/st_program.c
>> b/src/mesa/state_tracker/st_program.c
>> index 75ccaf2..39c54c2 100644
>> --- a/src/mesa/state_tracker/st_program.c
>> +++ b/src/mesa/state_tracker/st_program.c
>> @@ -112,8 +112,6 @@ delete_fp_variant(struct st_context *st, struct
>> st_fp_variant *fpv)
>>   {
>>  if (fpv->driver_shader)
>> cso_delete_fragment_shader(st->cso_context, fpv->driver_shader);
>> -   if (fpv->parameters)
>> -  _mesa_free_parameter_list(fpv->parameters);
>>  free(fpv);
>>   }
>>
>> @@ -914,8 +912,6 @@ st_create_fp_variant(struct st_context *st,
>>if (tgsi.tokens != stfp->tgsi.tokens)
>>   tgsi_free_tokens(tgsi.tokens);
>>tgsi.tokens = tokens;
>> - variant->parameters =
>> -_mesa_clone_parameter_list(stfp->Base.Base.Parameters);
>> } else
>>fprintf(stderr, "mesa: cannot create a shader for glBitmap\n");
>>  }
>> @@ -924,6 +920,7 @@ st_create_fp_variant(struct st_context *st,
>>  if (key->drawpixels) {
>> const struct tgsi_token *tokens;
>> unsigned scale_const = 0, bias_const = 0, texcoord_const = 0;
>> +  struct gl_program_parameter_list *params =
>> stfp->Base.Base.Parameters;
>>
>> /* Find the first unused slot. */
>> variant->drawpix_sampler = ffs(~stfp->Base.Base.S

[Mesa-dev] [PATCH 2/4] winsys/amdgpu: clear the buffer cache on allocation failure and try again

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 41efbcb..674482e 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -494,8 +494,13 @@ amdgpu_bo_create(struct radeon_winsys *rws,
 
/* Create a new one. */
bo = amdgpu_create_bo(ws, size, alignment, usage, domain, flags);
-   if (!bo)
-  return NULL;
+   if (!bo) {
+  /* Clear the cache and try again. */
+  pb_cache_release_all_buffers(&ws->bo_cache);
+  bo = amdgpu_create_bo(ws, size, alignment, usage, domain, flags);
+  if (!bo)
+ return NULL;
+   }
 
bo->use_reusable_pool = use_reusable_pool;
return &bo->base;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/4] winsys/radeon: clear the buffer cache on mmap failure and try again

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index a5f8aeb..5ba01b9 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -361,9 +361,16 @@ void *radeon_bo_do_map(struct radeon_bo *bo)
 ptr = os_mmap(0, args.size, PROT_READ|PROT_WRITE, MAP_SHARED,
bo->rws->fd, args.addr_ptr);
 if (ptr == MAP_FAILED) {
-pipe_mutex_unlock(bo->map_mutex);
-fprintf(stderr, "radeon: mmap failed, errno: %i\n", errno);
-return NULL;
+/* Clear the cache and try again. */
+pb_cache_release_all_buffers(&bo->rws->bo_cache);
+
+ptr = os_mmap(0, args.size, PROT_READ|PROT_WRITE, MAP_SHARED,
+  bo->rws->fd, args.addr_ptr);
+if (ptr == MAP_FAILED) {
+pipe_mutex_unlock(bo->map_mutex);
+fprintf(stderr, "radeon: mmap failed, errno: %i\n", errno);
+return NULL;
+}
 }
 bo->ptr = ptr;
 bo->map_count = 1;
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/4] winsys/amdgpu: clear the buffer cache on mmap failure and try again

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/winsys/amdgpu/drm/amdgpu_bo.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c 
b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
index 674482e..adea376 100644
--- a/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
+++ b/src/gallium/winsys/amdgpu/drm/amdgpu_bo.c
@@ -229,6 +229,11 @@ static void *amdgpu_bo_map(struct pb_buffer *buf,
return bo->user_ptr;
 
r = amdgpu_bo_cpu_map(bo->bo, &cpu);
+   if (r) {
+  /* Clear the cache and try again. */
+  pb_cache_release_all_buffers(&bo->ws->bo_cache);
+  r = amdgpu_bo_cpu_map(bo->bo, &cpu);
+   }
return r ? NULL : cpu;
 }
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/4] winsys/radeon: clear the buffer cache on allocation failure and try again

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

---
 src/gallium/winsys/radeon/drm/radeon_drm_bo.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c 
b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
index 3fd233c..a5f8aeb 100644
--- a/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
+++ b/src/gallium/winsys/radeon/drm/radeon_drm_bo.c
@@ -766,8 +766,13 @@ radeon_winsys_bo_create(struct radeon_winsys *rws,
 }
 
 bo = radeon_create_bo(ws, size, alignment, usage, domain, flags);
-if (!bo)
-return NULL;
+if (!bo) {
+/* Clear the cache and try again. */
+pb_cache_release_all_buffers(&ws->bo_cache);
+bo = radeon_create_bo(ws, size, alignment, usage, domain, flags);
+if (!bo)
+return NULL;
+}
 
 bo->use_reusable_pool = use_reusable_pool;
 
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] radeonsi: don't call of u_prims_for_vertices for patches and rectangles

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

Both caused a crash due to a division by zero in that function.
This is an alternative fix.

Cc: 11.0 11.1 
---
 src/gallium/drivers/radeonsi/si_state_draw.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..e550011 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -216,6 +216,18 @@ static void si_emit_derived_tess_state(struct si_context 
*sctx,
radeon_emit(cs, tcs_out_layout | (num_tcs_output_cp << 26));
 }
 
+static unsigned si_num_prims_for_vertices(const struct pipe_draw_info *info)
+{
+   switch (info->mode) {
+   case PIPE_PRIM_PATCHES:
+   return info->count / info->vertices_per_patch;
+   case R600_PRIM_RECTANGLE_LIST:
+   return info->count / 3;
+   default:
+   return u_prims_for_vertices(info->mode, info->count);
+   }
+}
+
 static unsigned si_get_ia_multi_vgt_param(struct si_context *sctx,
  const struct pipe_draw_info *info,
  unsigned num_patches)
@@ -320,7 +332,7 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context 
*sctx,
if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
(info->indirect ||
 (info->instance_count > 1 &&
- u_prims_for_vertices(info->mode, info->count) <= 1)))
+ si_num_prims_for_vertices(info) <= 1)))
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
 
return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/aux../util: Make u_prims_for_vertices() safe

2015-12-09 Thread Marek Olšák

Pushed, thanks.

Marek

On Wed, Dec 9, 2015 at 10:07 AM, Edward O'Callaghan
 wrote:
> Let us avoid trapping in hardware from a SIGFPE and instead
> assert on a zero divisor.
>
> Hint: This can occur if a PIPE_PRIM_? is not handled in
>   u_prim_vertex_count() that results in ' info ' not
>   being initialized in the expected manner.
>
> Further, we also fix a possibly NULL pointer dereference
> from ' info ' being NULL from a u_prim_vertex_count() call.
>
> Signed-off-by: Edward O'Callaghan 
> ---
>  src/gallium/auxiliary/util/u_prim.h | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_prim.h 
> b/src/gallium/auxiliary/util/u_prim.h
> index 3668015..a09c315 100644
> --- a/src/gallium/auxiliary/util/u_prim.h
> +++ b/src/gallium/auxiliary/util/u_prim.h
> @@ -145,6 +145,9 @@ u_prims_for_vertices(unsigned prim, unsigned num)
>  {
> const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
>
> +   assert(info);
> +   assert(info->incr != 0);
> +
> if (num < info->min)
>return 0;
>
> --
> 2.5.0
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 15/26] i965: Add Gen7+ tessellation engine state (3DSTATE_TE).

2015-12-09 Thread Jordan Justen

On 2015-12-02 16:15:56, Kenneth Graunke wrote:
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen7_te_state.c | 36 
> ---
>  1 file changed, 28 insertions(+), 8 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen7_te_state.c 
> b/src/mesa/drivers/dri/i965/gen7_te_state.c
> index 95a5e98..2650fa5 100644
> --- a/src/mesa/drivers/dri/i965/gen7_te_state.c
> +++ b/src/mesa/drivers/dri/i965/gen7_te_state.c
> @@ -29,19 +29,39 @@
>  static void
>  upload_te_state(struct brw_context *brw)
>  {
> -   /* Disable the TE */
> -   BEGIN_BATCH(4);
> -   OUT_BATCH(_3DSTATE_TE << 16 | (4 - 2));
> -   OUT_BATCH(0);
> -   OUT_BATCH(0);
> -   OUT_BATCH(0);
> -   ADVANCE_BATCH();
> +   /* BRW_NEW_TESS_EVAL_PROGRAM */
> +   bool active = brw->tess_eval_program;
> +   if (active)
> +  assert(brw->tess_ctrl_program);
> +
> +   const struct brw_tes_prog_data *tes_prog_data = brw->tes.prog_data;
> +
> +   if (active) {
> +  BEGIN_BATCH(4);
> +  OUT_BATCH(_3DSTATE_TE << 16 | (4 - 2));
> +  OUT_BATCH((tes_prog_data->partitioning << GEN7_TE_PARTITIONING_SHIFT) |
> +(tes_prog_data->output_topology << 
> GEN7_TE_OUTPUT_TOPOLOGY_SHIFT) |
> +(tes_prog_data->domain << GEN7_TE_DOMAIN_SHIFT) |

Looks like we don't currently have the masks for SET_FIELD, but maybe
we should add them?

Reviewed-by: Jordan Justen 

> +GEN7_TE_ENABLE);
> +  OUT_BATCH_F(63.0);
> +  OUT_BATCH_F(64.0);
> +  ADVANCE_BATCH();
> +   } else {
> +  BEGIN_BATCH(4);
> +  OUT_BATCH(_3DSTATE_TE << 16 | (4 - 2));
> +  OUT_BATCH(0);
> +  OUT_BATCH_F(0);
> +  OUT_BATCH_F(0);
> +  ADVANCE_BATCH();
> +   }
>  }
>  
>  const struct brw_tracked_state gen7_te_state = {
> .dirty = {
>.mesa  = 0,
> -  .brw   = BRW_NEW_CONTEXT,
> +  .brw   = BRW_NEW_CONTEXT |
> +   BRW_NEW_TES_PROG_DATA |
> +   BRW_NEW_TESS_EVAL_PROGRAM,
> },
> .emit = upload_te_state,
>  };
> -- 
> 2.6.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 14/26] i965: Add Gen8+ tessellation evaluation shader state (3DSTATE_DS).

2015-12-09 Thread Jordan Justen

On 2015-12-02 16:15:55, Kenneth Graunke wrote:
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/gen8_ds_state.c | 66 
> +++
>  1 file changed, 59 insertions(+), 7 deletions(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/gen8_ds_state.c 
> b/src/mesa/drivers/dri/i965/gen8_ds_state.c
> index 4ce4ab3..a79e8aa 100644
> --- a/src/mesa/drivers/dri/i965/gen8_ds_state.c
> +++ b/src/mesa/drivers/dri/i965/gen8_ds_state.c
> @@ -29,19 +29,71 @@
>  static void
>  gen8_upload_ds_state(struct brw_context *brw)
>  {
> -   /* Disable the DS Unit */
> -   int ds_pkt_len = brw->gen >= 9 ? 11 : 9;
> -   BEGIN_BATCH(ds_pkt_len);
> -   OUT_BATCH(_3DSTATE_DS << 16 | (ds_pkt_len - 2));
> -   for (int i = 0; i < ds_pkt_len - 1; i++)
> +   struct gl_context *ctx = &brw->ctx;
> +   const struct brw_stage_state *stage_state = &brw->tes.base;
> +   /* BRW_NEW_TESS_EVAL_PROGRAM */
> +   bool active = brw->tess_eval_program;
> +   assert(!active || brw->tess_ctrl_program);
> +
> +   /* BRW_NEW_TES_PROG_DATA */
> +   const struct brw_tes_prog_data *tes_prog_data = brw->tes.prog_data;
> +   const struct brw_vue_prog_data *vue_prog_data = &tes_prog_data->base;
> +   const struct brw_stage_prog_data *prog_data = &vue_prog_data->base;
> +
> +   if (active) {
> +  BEGIN_BATCH(9);
> +  OUT_BATCH(_3DSTATE_DS << 16 | (9 - 2));
> +  OUT_BATCH(stage_state->prog_offset);
> +  OUT_BATCH(0);
> +  OUT_BATCH(SET_FIELD(DIV_ROUND_UP(stage_state->sampler_count, 4),
> +  GEN7_DS_SAMPLER_COUNT) |
> +SET_FIELD(prog_data->binding_table.size_bytes / 4,
> +  GEN7_DS_BINDING_TABLE_ENTRY_COUNT));
> +  if (prog_data->total_scratch) {
> + OUT_RELOC64(stage_state->scratch_bo,
> + I915_GEM_DOMAIN_RENDER, I915_GEM_DOMAIN_RENDER,
> + ffs(prog_data->total_scratch) - 11);
> +  } else {
> + OUT_BATCH(0);
> + OUT_BATCH(0);
> +  }
> +  OUT_BATCH(SET_FIELD(prog_data->dispatch_grf_start_reg,
> +  GEN7_DS_DISPATCH_START_GRF) |
> +SET_FIELD(vue_prog_data->urb_read_length,
> +  GEN7_DS_URB_READ_LENGTH));
> +
> +  OUT_BATCH(GEN7_DS_ENABLE |
> +GEN7_DS_STATISTICS_ENABLE |
> +(brw->max_ds_threads - 1) << HSW_DS_MAX_THREADS_SHIFT |

SET_FIELD?

Reviewed-by: Jordan Justen 

> +(vue_prog_data->dispatch_mode == DISPATCH_MODE_SIMD8 ?
> + GEN7_DS_SIMD8_DISPATCH_ENABLE : 0) |
> +(tes_prog_data->domain == BRW_TESS_DOMAIN_TRI ?
> + GEN7_DS_COMPUTE_W_COORDINATE_ENABLE : 0));
> +  OUT_BATCH(SET_FIELD(ctx->Transform.ClipPlanesEnabled,
> +  GEN8_DS_USER_CLIP_DISTANCE));
> +  ADVANCE_BATCH();
> +   } else {
> +  BEGIN_BATCH(9);
> +  OUT_BATCH(_3DSTATE_DS << 16 | (9 - 2));
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
> +  OUT_BATCH(0);
>OUT_BATCH(0);
> -   ADVANCE_BATCH();
> +  ADVANCE_BATCH();
> +   }
> +   brw->tes.enabled = active;
>  }
>  
>  const struct brw_tracked_state gen8_ds_state = {
> .dirty = {
>.mesa  = 0,
> -  .brw   = BRW_NEW_CONTEXT,
> +  .brw   = BRW_NEW_BATCH |
> +   BRW_NEW_TESS_EVAL_PROGRAM |
> +   BRW_NEW_TES_PROG_DATA,
> },
> .emit = gen8_upload_ds_state,
>  };
> -- 
> 2.6.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [ANNOUNCE] mesa-demos 8.3.0

2015-12-09 Thread Andreas Boll

This new mesa-demos release fixes the build issue against mesa 10.6 (Bug #91643)
and picks up the latest glxinfo changes. For the misc changes see below.

Andreas.


Adam Jackson (1):
  glxinfo: Add support for GLX_MESA_query_renderer

Andreas Boll (3):
  demos: add missing binaries to .gitignore
  util: Remove unused glstate.[ch]
  demos: Bump version to 8.3.0 for release

Awais Belal (1):
  sharedtex_mt: fix rendering thread hang

Brian Paul (21):
  wglinfo: query and report multisample information
  glxinfo/wglinfo: move argument parsing into common code
  wglinfo: Add support for reporting core profile info
  wglinfo: print swap method in print_visual_attribs_verbose()
  wglinfo: adjust column spacing for pixel format info
  glxinfo/wglinfo: query/print GL_ARB_texture_multisample limits
  glxinfo/wglinfo: reverse order of the gl_versions[] array
  geom-stipple-lines: use VBO, specify number of verts on cmd line
  geom-wide-lines: use VBO, specify number of verts on cmd line
  line-sample: a new test that draws a sample of wide/stipple/smooth lines
  line-sample: use GL_LINE_STRIP instead of GL_LINE_LOOP
  glxinfo/wglinfo: add brief (-B) output mode
  glxinfo: pass the options object to print_screen_info()
  wglinfo: pass the options object to print_screen_info()
  demos: flush stdout after printing frame rate
  quad-offset-unfilled: fix GL_POLYGON_OFFSET_FILL/LINE mistake
  util: add fflush() in ValidateShaderProgram()
  tri-tex-stipple: trivial test of texturing with stippling
  line-sample: add flat/smooth and blend toggles
  rubberband: select line width with 1..4 keys
  rubberband: add keyboard option to test non-white drawing color

Dave Airlie (1):
  glxinfo: add 4.5 as a valid version

Dongwon Kim (1):
  torus.c: Lighting effect is distorted when object is scaled up/down

Jose Fonseca (3):
  wgl: Remove unnecessary #pragmas.
  line-sample: Ensure GL_ALIASED_LINE_WIDTH_RANGE is defined on Windows.
  xdemos/corender: Remove.

José Fonseca (5):
  cmake: Define HAVE_FREEGLUT when glutInitContextProfile symbol is present.
  cmake: Don't use gcc specific warnings with g++.
  tests,trival,fp,vp: Rename errno with errnum.
  wgl: Ensure PIXELFORMATDESCRIPTOR members are zeroed.
  s/Tungsten Graphics/VMware/

Julien Isorce (1):
  glxinfo: fix segfault when core profile is unavailable

Marc Dietrich (1):
  glxinfo: fix direct rendering context in glxinfo

Marek Olšák (2):
  eglinfo: print client extensions
  glxinfo: fix printing core profile extensions

Matt Turner (1):
  egl: Remove demos using EGL_MESA_screen_surface.

Michael Olbrich (1):
  opengles2: fix building without X11

Nathan Kidd (1):
  glxgears_pixmap: destroy correct pixmap id

Ross Burton (1):
  configure.ac: fix AC_WITH(glut) so that --without-glut works

git tag: mesa-demos-8.3.0

ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.bz2
MD5:  628e75c23c17394f11a316c36f8e4164  mesa-demos-8.3.0.tar.bz2
SHA1: 468a8f24938ab07e2e31828cf961515371d45b56  mesa-demos-8.3.0.tar.bz2
SHA256: c173154bbd0d5fb53d732471984def42fb1b14ac85fcb834138fb9518b3e0bef  
mesa-demos-8.3.0.tar.bz2
PGP:  
ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.bz2.sig

ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.gz
MD5:  f42510108d9401ad1b4ee67957cc73d7  mesa-demos-8.3.0.tar.gz
SHA1: 2d46ef2d83030c978ba032eac14bf94007bd4a84  mesa-demos-8.3.0.tar.gz
SHA256: 6127c5511e63447b28c2df735739de06c5d221f68e7671cc0ee446e605e92357  
mesa-demos-8.3.0.tar.gz
PGP:  ftp://ftp.freedesktop.org/pub/mesa/demos/8.3.0/mesa-demos-8.3.0.tar.gz.sig



signature.asc
Description: Digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 12/26] i965: Add tessellation shader sampler support.

2015-12-09 Thread Jordan Justen

12 & 13 Reviewed-by: Jordan Justen 

On 2015-12-02 16:15:53, Kenneth Graunke wrote:
> Based on code by Chris Forbes and Fabian Bieler.
> 
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_context.h   |  2 +-
>  src/mesa/drivers/dri/i965/brw_sampler_state.c | 46 
> +++
>  src/mesa/drivers/dri/i965/brw_state.h |  2 ++
>  src/mesa/drivers/dri/i965/brw_state_upload.c  |  2 ++
>  4 files changed, 51 insertions(+), 1 deletion(-)
> 
> diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
> b/src/mesa/drivers/dri/i965/brw_context.h
> index 8de0e6f..c440c7d 100644
> --- a/src/mesa/drivers/dri/i965/brw_context.h
> +++ b/src/mesa/drivers/dri/i965/brw_context.h
> @@ -1228,7 +1228,7 @@ struct brw_context
> } perfmon;
>  
> int num_atoms[BRW_NUM_PIPELINES];
> -   const struct brw_tracked_state render_atoms[71];
> +   const struct brw_tracked_state render_atoms[73];
> const struct brw_tracked_state compute_atoms[9];
>  
> /* If (INTEL_DEBUG & DEBUG_BATCH) */
> diff --git a/src/mesa/drivers/dri/i965/brw_sampler_state.c 
> b/src/mesa/drivers/dri/i965/brw_sampler_state.c
> index 6d73444..3f29e2f 100644
> --- a/src/mesa/drivers/dri/i965/brw_sampler_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_sampler_state.c
> @@ -55,6 +55,8 @@ gen7_emit_sampler_state_pointers_xs(struct brw_context *brw,
>  {
> static const uint16_t packet_headers[] = {
>[MESA_SHADER_VERTEX] = _3DSTATE_SAMPLER_STATE_POINTERS_VS,
> +  [MESA_SHADER_TESS_CTRL] = _3DSTATE_SAMPLER_STATE_POINTERS_HS,
> +  [MESA_SHADER_TESS_EVAL] = _3DSTATE_SAMPLER_STATE_POINTERS_DS,
>[MESA_SHADER_GEOMETRY] = _3DSTATE_SAMPLER_STATE_POINTERS_GS,
>[MESA_SHADER_FRAGMENT] = _3DSTATE_SAMPLER_STATE_POINTERS_PS,
> };
> @@ -647,3 +649,47 @@ const struct brw_tracked_state brw_gs_samplers = {
> },
> .emit = brw_upload_gs_samplers,
>  };
> +
> +
> +static void
> +brw_upload_tcs_samplers(struct brw_context *brw)
> +{
> +   /* BRW_NEW_TESS_CTRL_PROGRAM */
> +   struct gl_program *tcs = (struct gl_program *) brw->tess_ctrl_program;
> +   if (!tcs)
> +  return;
> +
> +   brw_upload_sampler_state_table(brw, tcs, &brw->tcs.base);
> +}
> +
> +
> +const struct brw_tracked_state brw_tcs_samplers = {
> +   .dirty = {
> +  .mesa = _NEW_TEXTURE,
> +  .brw = BRW_NEW_BATCH |
> + BRW_NEW_TESS_CTRL_PROGRAM,
> +   },
> +   .emit = brw_upload_tcs_samplers,
> +};
> +
> +
> +static void
> +brw_upload_tes_samplers(struct brw_context *brw)
> +{
> +   /* BRW_NEW_TESS_EVAL_PROGRAM */
> +   struct gl_program *tes = (struct gl_program *) brw->tess_eval_program;
> +   if (!tes)
> +  return;
> +
> +   brw_upload_sampler_state_table(brw, tes, &brw->tes.base);
> +}
> +
> +
> +const struct brw_tracked_state brw_tes_samplers = {
> +   .dirty = {
> +  .mesa = _NEW_TEXTURE,
> +  .brw = BRW_NEW_BATCH |
> + BRW_NEW_TESS_EVAL_PROGRAM,
> +   },
> +   .emit = brw_upload_tes_samplers,
> +};
> diff --git a/src/mesa/drivers/dri/i965/brw_state.h 
> b/src/mesa/drivers/dri/i965/brw_state.h
> index 129de30..a332b92 100644
> --- a/src/mesa/drivers/dri/i965/brw_state.h
> +++ b/src/mesa/drivers/dri/i965/brw_state.h
> @@ -72,6 +72,8 @@ extern const struct brw_tracked_state 
> brw_state_base_address;
>  extern const struct brw_tracked_state brw_urb_fence;
>  extern const struct brw_tracked_state brw_vs_prog;
>  extern const struct brw_tracked_state brw_vs_samplers;
> +extern const struct brw_tracked_state brw_tcs_samplers;
> +extern const struct brw_tracked_state brw_tes_samplers;
>  extern const struct brw_tracked_state brw_gs_samplers;
>  extern const struct brw_tracked_state brw_vs_ubo_surfaces;
>  extern const struct brw_tracked_state brw_vs_abo_surfaces;
> diff --git a/src/mesa/drivers/dri/i965/brw_state_upload.c 
> b/src/mesa/drivers/dri/i965/brw_state_upload.c
> index ba66886..e61fa6a 100644
> --- a/src/mesa/drivers/dri/i965/brw_state_upload.c
> +++ b/src/mesa/drivers/dri/i965/brw_state_upload.c
> @@ -322,6 +322,8 @@ static const struct brw_tracked_state 
> *gen8_render_atoms[] =
>  
> &brw_fs_samplers,
> &brw_vs_samplers,
> +   &brw_tcs_samplers,
> +   &brw_tes_samplers,
> &brw_gs_samplers,
> &gen8_multisample_state,
>  
> -- 
> 2.6.2
> 
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] Mesa 11.0.7

2015-12-09 Thread Emil Velikov

Mesa 11.0.7 is now available.

This release brings substantial amount of fixes in meta (affecting i965),
some driver fixes for i965, nouveau, r600 and llvm. The video encoding
for Stoney has been disabled, as it isn't working properly.

There are also build fixes for DragonFly and other *BSD platforms,


Chris Wilson (1):
  meta: Compute correct buffer size with SkipRows/SkipPixels

Daniel Stone (1):
  egl/wayland: Ignore rects from SwapBuffersWithDamage

Dave Airlie (4):
  texgetimage: consolidate 1D array handling code.
  r600: geometry shader gsvs itemsize workaround
  r600: rv670 use at least 16es/gs threads
  r600: workaround empty geom shader.

Emil Velikov (5):
  docs: add sha256 checksums for 11.0.6
  get-pick-list.sh: Require explicit "11.0" for nominating stable patches
  mesa; add get-extra-pick-list.sh script into bin/
  Update version to 11.0.7
  docs: add release notes for 11.0.7

François Tigeot (1):
  xmlconfig: Add support for DragonFly

Ian Romanick (22):
  mesa: Make bind_vertex_buffer avilable outside varray.c
  mesa: Refactor update_array_format to make 
_mesa_update_array_format_public
  mesa: Refactor enable_vertex_array_attrib to make 
_mesa_enable_vertex_array_attrib
  i965: Pass brw_context instead of gl_context to brw_draw_rectlist
  i965: Use DSA functions for VBOs in brw_meta_fast_clear
  i965: Use internal functions for buffer object access
  i965: Don't pollute the buffer object namespace in brw_meta_fast_clear
  meta: Use DSA functions for PBO in create_texture_for_pbo
  meta: Use _mesa_NamedBufferData and _mesa_NamedBufferSubData for users of 
_mesa_meta_setup_vertex_objects
  i965: Use _mesa_NamedBufferSubData for users of 
_mesa_meta_setup_vertex_objects
  meta: Don't leave the VBO bound after _mesa_meta_setup_vertex_objects
  meta: Track VBO using gl_buffer_object instead of GL API object handle
  meta: Use DSA functions for VBOs in _mesa_meta_setup_vertex_objects
  meta: Use internal functions for buffer object and VAO access
  meta: Don't pollute the buffer object namespace in 
_mesa_meta_setup_vertex_objects
  meta: Partially convert _mesa_meta_DrawTex to DSA
  meta: Track VBO using gl_buffer_object instead of GL API object handle in 
_mesa_meta_DrawTex
  meta: Use internal functions for buffer object and VAO access in 
_mesa_meta_DrawTex
  meta: Don't pollute the buffer object namespace in _mesa_meta_DrawTex
  meta/TexSubImage: Don't pollute the buffer object namespace
  meta/generate_mipmap: Don't leak the framebuffer object
  glsl: Fix off-by-one error in array size check assertion

Ilia Mirkin (7):
  nvc0/ir: actually emit AFETCH on kepler
  nir: fix typo in idiv lowering, causing large-udiv-udiv failures
  nouveau: use the buffer usage to determine placement when no binding
  nv50,nvc0: properly handle buffer storage invalidation on dsa buffer
  nv50/ir: fix (un)spilling of 3-wide results
  mesa: support GL_RED/GL_RG in ES2 contexts when driver support exists
  nvc0/ir: start offset at texBindBase for txq, like regular texturing

Jonathan Gray (1):
  automake: fix some occurrences of hardcoded -ldl and -lpthread

Leo Liu (1):
  radeon/vce: disable Stoney VCE for 11.0

Marta Lofstedt (1):
  gles2: Update gl2ext.h to revision: 32120

Oded Gabbay (1):
  llvmpipe: disable VSX in ppc due to LLVM PPC bug


git tag: mesa-11.0.7

ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.gz
MD5: 73e2e04d02ab2cdddc635fcb261bde13  mesa-11.0.7.tar.gz
SHA1: ba017f72c5cfe08140b8b0c1a98035f76e19abae  mesa-11.0.7.tar.gz
SHA256: 07c27004ff68b288097d17b2faa7bdf15ec73c96b7e6c9835266e544adf0a62f  
mesa-11.0.7.tar.gz
PGP: ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.gz.sig

ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.xz
MD5: 5bb515d4b0931b7a9e1bfec3da73f10f  mesa-11.0.7.tar.xz
SHA1: 7e96868bf104673509e30846ccb6f641231e8c5a  mesa-11.0.7.tar.xz
SHA256: e7e90a332ede6c8fd08eff90786a3fd1605a4e62ebf3a9b514047838194538cb  
mesa-11.0.7.tar.xz
PGP: ftp://ftp.freedesktop.org/pub/mesa/11.0.7/mesa-11.0.7.tar.xz.sig

--
Emil



signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] nvc0: optimize coherent buffer checking at draw time

2015-12-09 Thread Samuel Pitoiset

Instead of iterating over all the buffer resources looking for coherent
buffers, we keep track of a context-wide count. This will save some
iterations (and CPU cycles) in 99.99% case because usually coherent
buffers are not so used.

Signed-off-by: Samuel Pitoiset 
---

I didn't test the patch, but I will run a full piglit tonight.

 src/gallium/drivers/nouveau/nv50/nv50_context.h |  1 +
 src/gallium/drivers/nouveau/nv50/nv50_state.c   |  5 +++
 src/gallium/drivers/nouveau/nv50/nv50_vbo.c | 22 ++---
 src/gallium/drivers/nouveau/nvc0/nvc0_context.h |  3 ++
 src/gallium/drivers/nouveau/nvc0/nvc0_state.c   | 26 
 src/gallium/drivers/nouveau/nvc0/nvc0_vbo.c | 41 +
 6 files changed, 45 insertions(+), 53 deletions(-)

diff --git a/src/gallium/drivers/nouveau/nv50/nv50_context.h 
b/src/gallium/drivers/nouveau/nv50/nv50_context.h
index 2cebcd9..dd5b5e3 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_context.h
+++ b/src/gallium/drivers/nouveau/nv50/nv50_context.h
@@ -134,6 +134,7 @@ struct nv50_context {
struct nv50_constbuf constbuf[3][NV50_MAX_PIPE_CONSTBUFS];
uint16_t constbuf_dirty[3];
uint16_t constbuf_valid[3];
+   uint16_t constbuf_coherent[3];
 
struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
unsigned num_vtxbufs;
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_state.c 
b/src/gallium/drivers/nouveau/nv50/nv50_state.c
index fd7c7cd..c08a63e 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_state.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_state.c
@@ -852,8 +852,13 @@ nv50_set_constant_buffer(struct pipe_context *pipe, uint 
shader, uint index,
   nv50->constbuf[s][i].offset = cb->buffer_offset;
   nv50->constbuf[s][i].size = MIN2(align(cb->buffer_size, 0x100), 0x1);
   nv50->constbuf_valid[s] |= 1 << i;
+  if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
+ nv50->constbuf_coherent[s] |= 1 << i;
+  else
+ nv50->constbuf_coherent[s] &= ~(1 << i);
} else {
   nv50->constbuf_valid[s] &= ~(1 << i);
+  nv50->constbuf_coherent[s] &= ~(1 << i);
}
nv50->constbuf_dirty[s] |= 1 << i;
 
diff --git a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c 
b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
index 85878d5..4ca8f11 100644
--- a/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
+++ b/src/gallium/drivers/nouveau/nv50/nv50_vbo.c
@@ -791,27 +791,9 @@ nv50_draw_vbo(struct pipe_context *pipe, const struct 
pipe_draw_info *info)
 
push->kick_notify = nv50_draw_vbo_kick_notify;
 
-   /* TODO: Instead of iterating over all the buffer resources looking for
-* coherent buffers, keep track of a context-wide count.
-*/
for (s = 0; s < 3 && !nv50->cb_dirty; ++s) {
-  uint32_t valid = nv50->constbuf_valid[s];
-
-  while (valid && !nv50->cb_dirty) {
- const unsigned i = ffs(valid) - 1;
- struct pipe_resource *res;
-
- valid &= ~(1 << i);
- if (nv50->constbuf[s][i].user)
-continue;
-
- res = nv50->constbuf[s][i].u.buf;
- if (!res)
-continue;
-
- if (res->flags & PIPE_RESOURCE_FLAG_MAP_COHERENT)
-nv50->cb_dirty = true;
-  }
+  if (nv50->constbuf_coherent[s])
+ nv50->cb_dirty = true;
}
 
/* If there are any coherent constbufs, flush the cache */
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h 
b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
index 39b73ec..1219548 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_context.h
@@ -134,10 +134,12 @@ struct nvc0_context {
struct nvc0_constbuf constbuf[6][NVC0_MAX_PIPE_CONSTBUFS];
uint16_t constbuf_dirty[6];
uint16_t constbuf_valid[6];
+   uint16_t constbuf_coherent[6];
bool cb_dirty;
 
struct pipe_vertex_buffer vtxbuf[PIPE_MAX_ATTRIBS];
unsigned num_vtxbufs;
+   uint32_t vtxbufs_coherent;
struct pipe_index_buffer idxbuf;
uint32_t constant_vbos;
uint32_t vbo_user; /* bitmask of vertex buffers pointing to user memory */
@@ -149,6 +151,7 @@ struct nvc0_context {
struct pipe_sampler_view *textures[6][PIPE_MAX_SAMPLERS];
unsigned num_textures[6];
uint32_t textures_dirty[6];
+   uint32_t textures_coherent[6];
struct nv50_tsc_entry *samplers[6][PIPE_MAX_SAMPLERS];
unsigned num_samplers[6];
uint16_t samplers_dirty[6];
diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
index 5da0ea8..b498b5d 100644
--- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
+++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
@@ -549,11 +549,18 @@ nvc0_stage_set_sampler_views(struct nvc0_context *nvc0, 
int s,
 
for (i = 0; i < nr; ++i) {
   struct nv50_tic_entry *old = nv50_tic_entry(nvc0->textures[s][i]);
+  struct pipe_resource *res = views[i]->texture;
 
   if (views[i] == nvc0->textures[s][i])
  continue;
   nvc0->textures_dirty[s] |= 1 <<

Re: [Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.

2015-12-09 Thread Kenneth Graunke

On Wednesday, December 09, 2015 08:03:25 AM Jason Ekstrand wrote:
> On Dec 9, 2015 2:51 AM, "Kenneth Graunke"  wrote:
> >
> > My tessellation branch has two additional remap functions.  I don't want
> > to replicate this logic there.
> >
> > Signed-off-by: Kenneth Graunke 
> > ---
> >  src/mesa/drivers/dri/i965/brw_nir.c | 78
> -
> >  1 file changed, 50 insertions(+), 28 deletions(-)
> >
> > Hey Jason,
> >
> > If you like this patch, and haven't yet merged your NIR input reworks,
> > feel free to just squash it into your changes.  Or, we can land it
> > separately after your changes.  It's up to you.
> >
> > Separating this out allows me to reuse this in my new tessellation input
> > and output remapping functions, and also means we don't need to add
> structs
> > for the remap functions...we can just pass the builder, or inputs_read, or
> > the VUE map...and not have to pack multiple things together.
> 
> Sure.  It does make things simpler.
> 
> >  --Ken
> >
> > diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
> b/src/mesa/drivers/dri/i965/brw_nir.c
> > index 14ad172..105a175 100644
> > --- a/src/mesa/drivers/dri/i965/brw_nir.c
> > +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> > @@ -27,15 +27,19 @@
> >  #include "glsl/nir/nir_builder.h"
> >  #include "program/prog_to_nir.h"
> >
> > -struct remap_vs_attrs_state {
> > -   nir_builder b;
> > -   uint64_t inputs_read;
> > -};
> > -
> > +/**
> > + * In many cases, we just add the base and offset together, so there's no
> > + * reason to keep them separate.  Sometimes, combining them is essential:
> > + * if a shader only accesses part of a compound variable (such as a
> matrix
> > + * or array), the variable's base may not actually exist in the VUE map.
> > + *
> > + * This pass adds constant offsets to instr->const_index[0], and resets
> > + * the offset source to 0.  Non-constant offsets remain unchanged.
> > + */
> >  static bool
> > -remap_vs_attrs(nir_block *block, void *void_state)
> > +add_const_offset_to_base(nir_block *block, void *closure)
> >  {
> > -   struct remap_vs_attrs_state *state = void_state;
> > +   nir_builder *b = closure;
> >
> > nir_foreach_instr_safe(block, instr) {
> >if (instr->type != nir_instr_type_intrinsic)
> > @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state)
> >
> >nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> >
> > +  if (intrin->intrinsic == nir_intrinsic_load_input ||
> > +  intrin->intrinsic == nir_intrinsic_load_per_vertex_input ||
> > +  intrin->intrinsic == nir_intrinsic_load_output ||
> > +  intrin->intrinsic == nir_intrinsic_load_per_vertex_output ||
> > +  intrin->intrinsic == nir_intrinsic_store_output ||
> > +  intrin->intrinsic == nir_intrinsic_store_per_vertex_output) {
> 
> This seems a bit scortched-earth. It would be nice if the caller had a bit
> more control.

We could always add "do input"/"do output" options once we actually need
them.  It means adding the structs back, but that's fine.

I suppose for now we could ignore output intrinsics here.

> > + nir_src *offset = nir_get_io_offset_src(intrin);
> > + nir_const_value *const_offset = nir_src_as_const_value(*offset);
> > +
> > + if (const_offset) {
> > +intrin->const_index[0] += const_offset->u[0];
> > +b->cursor = nir_before_instr(&intrin->instr);
> > +nir_instr_rewrite_src(&intrin->instr, offset,
> > +  nir_src_for_ssa(nir_imm_int(b, 0)));
> > + }
> 
> Else???  It seems that you don't want to run this pass if you think you'll
> ever hit an indirect.  I guess it's harmless to just do this for all direct
> things in our driver, but it doesn't sit well.

Else do nothing.   The problem I'm trying to avoid with this logic is
that inputs/outputs which take up multiple slots and are directly
accessed may only have slots assigned for the sub-components that are
actually used.  So, I can't use the base offset, and need to move the
base to the slot that's actually used.

If something is accessed indirectly, we assign all of its slots, so
using the base location is fine.

> > +  }
> > +   }
> > +   return true;
> > +
> > +}
> > +
> > +static bool
> > +remap_vs_attrs(nir_block *block, void *closure)
> > +{
> > +   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
> > +
> > +   nir_foreach_instr(block, instr) {
> > +  if (instr->type != nir_instr_type_intrinsic)
> > + continue;
> > +
> > +  nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> > +
> >if (intrin->intrinsic == nir_intrinsic_load_input) {
> >   /* Attributes come in a contiguous block, ordered by their
> >* gl_vert_attrib value.  That means we can compute the slot
> >* number for an attribute by masking out the enabled attributes
> >* before it and counting the bits.
> >*/

Re: [Mesa-dev] [PATCH 3/8] glsl: use dual slot helper in the linker code.

2015-12-09 Thread Dave Airlie

On 9 December 2015 at 21:39, Timothy Arceri  wrote:
> On Wed, 2015-12-09 at 16:06 +1000, Dave Airlie wrote:
>> From: Dave Airlie 
>>
>> Signed-off-by: Dave Airlie 
>
> Great timing :) I was going to have to look into fixing this stuff for
> enhanced layouts.
>
> Patches 1 & 2 are:
> Reviewed-by: Timothy Arceri 
>
> I have a question about this patch. If these doubles only take up a
> single attribute then why do we even bother with this test? The spec
> says its optional and your fixing the counting up in later patches so
> what does it do thats useful?

It's complicated.

ARB_gpu_shader_fp64 passes doubles between shaders, dual slots ones
take two locations. So a dvec3[2] will consume locations 17/18, 19/20.
The limits
are in number of locations.

ARB_vertex_attrib_64bit allows doubles as vertex inputs. Dual slot ones
take up a single location, however they consume two slots with respect
to the hw limits. So a dvec3[2] will consume 17, 18. However it will take
4 slots against the vertex attrib limits.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 3/5] i965: Check base format to determine whether to use tiled memcpy

2015-12-09 Thread Jason Ekstrand

Reviewed-by: Jason Ekstrand 

On Thu, Nov 19, 2015 at 7:25 AM, Neil Roberts  wrote:
> The tiled memcpy doesn't work for copying from RGBX to RGBA because it
> doesn't override the alpha component to 1.0. Commit 2cebaac479d4 added
> a check to disable it for RGBX formats by looking at the TexFormat.
> However a lot of the rest of the code base is written with the
> assumption that an RGBA texture can be used internally to implement a
> GL_RGB texture. If that is done then this check breaks. This patch
> makes it instead check the base format of the texture which I think
> more directly matches the intention.
>
> Cc: Jason Ekstrand 
> ---
>  src/mesa/drivers/dri/i965/intel_pixel_read.c | 7 ---
>  src/mesa/drivers/dri/i965/intel_tex_image.c  | 7 ---
>  2 files changed, 8 insertions(+), 6 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/intel_pixel_read.c 
> b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> index 9bcbbd1..c8aef65 100644
> --- a/src/mesa/drivers/dri/i965/intel_pixel_read.c
> +++ b/src/mesa/drivers/dri/i965/intel_pixel_read.c
> @@ -135,10 +135,11 @@ intel_readpixels_tiled_memcpy(struct gl_context * ctx,
>return false;
>
> /* We can't handle copying from RGBX or BGRX because the tiled_memcpy
> -* function doesn't set the last channel to 1.
> +* function doesn't set the last channel to 1. Note this checks BaseFormat
> +* rather than TexFormat in case the RGBX format is being simulated with 
> an
> +* RGBA format.
>  */
> -   if (rb->Format == MESA_FORMAT_B8G8R8X8_UNORM ||
> -   rb->Format == MESA_FORMAT_R8G8B8X8_UNORM)
> +   if (rb->_BaseFormat == GL_RGB)
>return false;
>
> if (!intel_get_memcpy(rb->Format, format, type, &mem_copy, &cpp,
> diff --git a/src/mesa/drivers/dri/i965/intel_tex_image.c 
> b/src/mesa/drivers/dri/i965/intel_tex_image.c
> index 34b91e8..e3710da7 100644
> --- a/src/mesa/drivers/dri/i965/intel_tex_image.c
> +++ b/src/mesa/drivers/dri/i965/intel_tex_image.c
> @@ -399,10 +399,11 @@ intel_gettexsubimage_tiled_memcpy(struct gl_context 
> *ctx,
>return false;
>
> /* We can't handle copying from RGBX or BGRX because the tiled_memcpy
> -* function doesn't set the last channel to 1.
> +* function doesn't set the last channel to 1. Note this checks BaseFormat
> +* rather than TexFormat in case the RGBX format is being simulated with 
> an
> +* RGBA format.
>  */
> -   if (texImage->TexFormat == MESA_FORMAT_B8G8R8X8_UNORM ||
> -   texImage->TexFormat == MESA_FORMAT_R8G8B8X8_UNORM)
> +   if (texImage->_BaseFormat == GL_RGB)
>return false;
>
> if (!intel_get_memcpy(texImage->TexFormat, format, type, &mem_copy, &cpp,
> --
> 1.9.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] util/u_helpers: return correct number of bound buffers

2015-12-09 Thread Ilia Mirkin

On Wed, Dec 9, 2015 at 2:01 PM, Patrick Rudolph  wrote:
> Ok, first of all bind some buffers:
>
> pipe->set_vertex_buffers(pipe, 0, 1, &vtxbuf);
> pipe->set_vertex_buffers(pipe, 1, 1, &vtxbuf);
> pipe->set_vertex_buffers(pipe, 2, 1, &vtxbuf);
>
> num_vtxbufs is now 3 as it should be.
>
> Now you are unbinding buffers, one after another starting at slot 0:
> pipe->set_vertex_buffers(pipe, 0, 1, NULL);
>
> enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7;
> util_set_vertex_buffers_mask(...);
> enabled_buffers = 0x6;
> num_vtxbufs = util_last_bit(enabled_buffers) = 3
>
>
> pipe->set_vertex_buffers(pipe, 1, 1, NULL);
>
> enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7;
> util_set_vertex_buffers_mask(...);
> enabled_buffers = 0x5;
> num_vtxbufs = util_last_bit(enabled_buffers) = 3
>
>
> pipe->set_vertex_buffers(pipe, 2, 1, NULL);
>
> enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7;
> util_set_vertex_buffers_mask(...);
> enabled_buffers = 0x3;
> num_vtxbufs = util_last_bit(enabled_buffers) = 2
>
> There are no buffers bound any more, but num_vtxbufs is now 2 instead of
> 0.
>
> There would be no problem if you start at slot 2 going down to 0.
> There would be no problem if you unbind all buffers at once.
>
> I hope this clarifies the problem.

Right, thank you, yes it does. With a slightly fixed commit log
including a bit more justification, this is also

Reviewed-by: Ilia Mirkin 

Basically something mentioning that the current algorithm does not
account for pre-existing holes in the buffer list.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 90821] Segfault when calling glViewport on surfaceless EGL context without bound FBO

2015-12-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=90821

Nanley Chery  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #5 from Nanley Chery  ---


*** This bug has been marked as a duplicate of bug 93257 ***

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] nvc0: fix use after free of pipe_resource

2015-12-09 Thread Ilia Mirkin

I pushed a slightly modified version of this:

http://cgit.freedesktop.org/mesa/mesa/commit/?id=432a798cf5c7fab18a3e32d4073840df7d0d37cb

Thanks for the patch! I hope this will resolve some weird crashes
people have seen with various buffers being null unexpectedly.

On Sun, Dec 6, 2015 at 4:11 AM, Patrick Rudolph  wrote:
> Always reset the vertex bufctx to make sure there's no pointer to
> an already freed pipe_resource left after unbinding buffers.
> Fixes use after free crash in nvc0_bufctx_fence().
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
>
> Signed-off-by: Patrick Rudolph 
> ---
>  src/gallium/drivers/nouveau/nvc0/nvc0_state.c | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c 
> b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> index 5dce5f0..2aa90c9 100644
> --- a/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> +++ b/src/gallium/drivers/nouveau/nvc0/nvc0_state.c
> @@ -1000,12 +1000,16 @@ nvc0_set_vertex_buffers(struct pipe_context *pipe,
>  struct nvc0_context *nvc0 = nvc0_context(pipe);
>  unsigned i;
>
> +if (nvc0->num_vtxbufs)
> +nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_VTX);
> +
>  util_set_vertex_buffers_count(nvc0->vtxbuf, &nvc0->num_vtxbufs, vb,
>start_slot, count);
>
>  if (!vb) {
> nvc0->vbo_user &= ~(((1ull << count) - 1) << start_slot);
> nvc0->constant_vbos &= ~(((1ull << count) - 1) << start_slot);
> +   nvc0->dirty |= NVC0_NEW_ARRAYS;
> return;
>  }
>
> @@ -1025,7 +1029,6 @@ nvc0_set_vertex_buffers(struct pipe_context *pipe,
>  }
>
>  nvc0->dirty |= NVC0_NEW_ARRAYS;
> -nouveau_bufctx_reset(nvc0->bufctx_3d, NVC0_BIND_VTX);
>  }
>
>  static void
> --
> 2.4.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.

2015-12-09 Thread Ilia Mirkin

On Wed, Dec 9, 2015 at 11:23 AM, Ilia Mirkin  wrote:
> On Wed, Dec 9, 2015 at 11:18 AM, Deve  wrote:
>> This patch indeed seems to not have a sense. I just added it to the bug
>> report as a suggestion that it works for me after this modification. Emil
>> Velikov said that I should send it to the mailing list.
>>
>> Here is how it works in Supertuxkart:
>> We create rtt with following parameters:
>>
>> DepthStencilTexture = generateRTT(res, GL_DEPTH24_STENCIL8,
>> GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8);
>>
>> Then, during rendering scene, we do:
>>
>> glEnable(GL_FRAMEBUFFER_SRGB);
>> glBindFramebuffer(GL_FRAMEBUFFER, 0);
>
> OK, so this is the "winsys" framebuffer (GL has some term for it,
> sorry, I don't remember what it is... perhaps it's even winsys). This
> is created based on parameters of your selected GLX visual. For
> example, when I run glxinfo, I see (on nouveau; the list on intel will
> be different but comparable):
>
> 480 GLX Visuals
> visual  x   bf lv rg d st  colorbuffer  sr ax dp st accumbuffer  ms  cav
>   id dep cl sp  sz l  ci b ro  r  g  b  a F gb bf th cl  r  g  b  a ns b eat
> 
> 0x021 24 tc  0  32  0 r  y .   8  8  8  8 .  .  0 24  8  0  0  0  0  0 0 None
> 0x022 24 dc  0  32  0 r  y .   8  8  8  8 .  .  0 24  8  0  0  0  0  0 0 None
> ...
> 0x343 24 tc  0  32  0 r  . .   8  8  8  8 .  s  0  0  0  0  0  0  0  0 0 None
> 0x344 24 tc  0  32  0 r  . .   8  8  8  8 .  s  0  0  0 16 16 16 16  0 0 Slow
> 0x345 24 tc  0  32  0 r  y .   8  8  8  8 .  s  0  0  0  0  0  0  0  0 0 None
> 0x346 24 tc  0  32  0 r  y .   8  8  8  8 .  s  0  0  0 16 16 16 16  0 0 Slow
>
> Note how some of them have srgb, others don't (and have various
> differences in their various other properties). EGL has something
> similar I believe, but tbh I don't remember the specifics. That's the
> mesaVis->sRGBCapable bit below. If you need an sRGB-capable visual,
> are you sure you're picking one? This would be somewhere well before
> any actual rendering takes place.
>
> If you're not sure whether you need an sRGB-capable visual or not, try
> making sure you pick a *non-srgb* visual in a working configuration
> and see if it breaks.

Right, so looking at your code with in more detail, you probably just
require a stencil visual, which, in combination with this patch, makes
you skip this one.

However I noticed that i965/hsw and probably others don't expose *any*
sRGB-capable GLX visuals/fb configs! Pretty sure that's not good...
but GLX, sRGB, and visuals are not exactly my strong point, hopefully
someone a bit more clued in can comment.

Cheers,

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] gallium/util: handle patches in u_prims_for_vertices to fix a radeonsi crash

2015-12-09 Thread eocallaghan


On 2015-12-10 01:47, Marek Olšák wrote:

From: Marek Olšák 

I guess the crash was because of divison by zero.

Cc: 11.0 11.1 
---
 src/gallium/auxiliary/util/u_prim.h  | 17 +
 src/gallium/drivers/radeonsi/si_state_draw.c |  3 ++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_prim.h
b/src/gallium/auxiliary/util/u_prim.h
index 3668015..4926af6 100644
--- a/src/gallium/auxiliary/util/u_prim.h
+++ b/src/gallium/auxiliary/util/u_prim.h
@@ -141,14 +141,23 @@ u_prim_vertex_count(unsigned prim)
  * For polygons, return the number of triangles.
  */
 static inline unsigned
-u_prims_for_vertices(unsigned prim, unsigned num)
+u_prims_for_vertices(unsigned prim, unsigned num, unsigned 
vertices_per_patch)

 {
-   const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
+   struct u_prim_vertex_count info;

-   if (num < info->min)
+   if (prim == PIPE_PRIM_PATCHES)
+  info.min = info.incr = vertices_per_patch;
+   else if (prim < PIPE_PRIM_MAX)


We already do this check in u_prim_vertex_count() and if out-of-bounds 
we
returned a NULL. Perhaps it would be better avoid this extra else-if 
branch
here and just in the else branch, make the call and then assert on the 
NULL.



+  info = *u_prim_vertex_count(prim);
+   else {
+  assert(!"invalid prim type");
+  return 0;
+   }
+
+   if (num < info.min)
   return 0;


Well convolving this with my previous patch,
http://lists.freedesktop.org/archives/mesa-dev/2015-December/102729.html
I think we should still have an  assert(info.incr != 0);  here.



-   return 1 + ((num - info->min) / info->incr);
+   return 1 + ((num - info.min) / info.incr);
 }

 static inline boolean u_validate_pipe_prim( unsigned pipe_prim, 
unsigned nr )

diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..4ac9d0a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -320,7 +320,8 @@ static unsigned si_get_ia_multi_vgt_param(struct
si_context *sctx,
if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
(info->indirect ||
 (info->instance_count > 1 &&
- u_prims_for_vertices(info->mode, info->count) <= 1)))
+ u_prims_for_vertices(info->mode, info->count,
+  info->vertices_per_patch) <= 1)))
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;

return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] util/u_helpers: return correct number of bound buffers

2015-12-09 Thread Patrick Rudolph

Ok, first of all bind some buffers:

pipe->set_vertex_buffers(pipe, 0, 1, &vtxbuf);
pipe->set_vertex_buffers(pipe, 1, 1, &vtxbuf);
pipe->set_vertex_buffers(pipe, 2, 1, &vtxbuf);

num_vtxbufs is now 3 as it should be.

Now you are unbinding buffers, one after another starting at slot 0:
pipe->set_vertex_buffers(pipe, 0, 1, NULL);

enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7;
util_set_vertex_buffers_mask(...);
enabled_buffers = 0x6;
num_vtxbufs = util_last_bit(enabled_buffers) = 3


pipe->set_vertex_buffers(pipe, 1, 1, NULL);

enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7;
util_set_vertex_buffers_mask(...);
enabled_buffers = 0x5;
num_vtxbufs = util_last_bit(enabled_buffers) = 3


pipe->set_vertex_buffers(pipe, 2, 1, NULL);

enabled_buffers = (1ull << num_vtxbufs) - 1 = 0x7;
util_set_vertex_buffers_mask(...);
enabled_buffers = 0x3;
num_vtxbufs = util_last_bit(enabled_buffers) = 2

There are no buffers bound any more, but num_vtxbufs is now 2 instead of
0.

There would be no problem if you start at slot 2 going down to 0.
There would be no problem if you unbind all buffers at once.

I hope this clarifies the problem.

Kind Regards,
Patrick

On 2015-12-09 07:10 PM, Ilia Mirkin wrote:
> I'm probably just being dense... can you provide an exact sequence of
> calls that would cause this logic to fail? Seems like it should work
> as-is...
> 
> On Sun, Dec 6, 2015 at 4:12 AM, Patrick Rudolph  wrote:
>> In case a state tracker unbinds every slot by a seperate
>> pipe->set_vertex_buffers() call, starting from slot zero, the number
>> of bound buffers would not reach zero at all. Unbinding all buffers
>> at once or starting at the top-most slot results in correct behaviour.
>>
>> Calculating the correct number of bound buffers fixes a NULL pointer
>> dereference in nvc0_validate_vertex_buffers_shared().
>>
>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
>>
>> Signed-off-by: Patrick Rudolph 
>> ---
>>  src/gallium/auxiliary/util/u_helpers.c | 8 +++-
>>  1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/gallium/auxiliary/util/u_helpers.c 
>> b/src/gallium/auxiliary/util/u_helpers.c
>> index 09619c1..09020b0 100644
>> --- a/src/gallium/auxiliary/util/u_helpers.c
>> +++ b/src/gallium/auxiliary/util/u_helpers.c
>> @@ -81,7 +81,13 @@ void util_set_vertex_buffers_count(struct 
>> pipe_vertex_buffer *dst,
>> const struct pipe_vertex_buffer *src,
>> unsigned start_slot, unsigned count)
>>  {
>> -   uint32_t enabled_buffers = (1ull << *dst_count) - 1;
>> +   unsigned i;
>> +   uint32_t enabled_buffers = 0;
>> +
>> +   for (i = 0; i < *dst_count; i++) {
>> +  if (dst[i].buffer || dst[i].user_buffer)
>> + enabled_buffers |= (1ull << i);
>> +   }
>>
>> util_set_vertex_buffers_mask(dst, &enabled_buffers, src, start_slot,
>>  count);
>> --
>> 2.4.3
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports

2015-12-09 Thread eocallaghan


Roland,

I could not due to ml size limit or something, it just bounces hence the 
pull request.


Cheers,
Edward.

On 2015-12-10 02:38, Roland Scheidegger wrote:

Am 09.12.2015 um 05:16 schrieb Edward O'Callaghan:

This fixes my initial attempt so that piglit now passes 14/14. Thanks
to a couple of tips from Roland in the previous patch I was able to
fix the remaining issue. This should be golden now.



Great that you got it working!
Please send the patches to the ml.

Roland


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] mesa: detect inefficient buffer use and report through debug output

2015-12-09 Thread Ilia Mirkin

On Mon, Dec 7, 2015 at 8:42 PM, Brian Paul  wrote:
> When a buffer is created with GL_STATIC_DRAW, its contents should not
> be changed frequently.  But that's exactly what one application I'm
> debugging does.  This patch adds code to try to detect inefficient
> buffer use in a couple places.  The GL_ARB_debug_output mechanism is
> used to report the issue.
>
> NVIDIA's driver detects these sort of things too.
>
> Other types of inefficient buffer use could also be detected in the
> future.
> ---
>  src/mesa/main/bufferobj.c | 55 
> +++
>  src/mesa/main/mtypes.h|  4 
>  2 files changed, 59 insertions(+)
>
> diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
> index f985982..6bc1b5e 100644
> --- a/src/mesa/main/bufferobj.c
> +++ b/src/mesa/main/bufferobj.c
> @@ -51,6 +51,34 @@
>
>
>  /**
> + * We count the number of buffer modification calls to check for
> + * inefficient buffer use.  This is the number of such calls before we
> + * issue a warning.
> + */
> +#define BUFFER_WARNING_CALL_COUNT 4
> +
> +
> +/**
> + * Helper to warn of possible performance issues, such as frequently
> + * updating a buffer created with GL_STATIC_DRAW.
> + */
> +static void
> +buffer_usage_warning(struct gl_context *ctx, const char *fmt, ...)
> +{
> +   va_list args;
> +   GLuint msg_id = 0;

This needs to be wrapped in a macro, with a 'static' id (at each macro
invocation), otherwise a fresh id will get generated each time this is
called, which is presumably not desirable. Same as what I did with
pipe_debug_message/_pipe_debug_message.

[I know you already pushed this... that's how I noticed... but should
still get fixed.]

  -ilia

> +
> +   va_start(args, fmt);
> +   _mesa_gl_vdebug(ctx, &msg_id,
> +   MESA_DEBUG_SOURCE_API,
> +   MESA_DEBUG_TYPE_PERFORMANCE,
> +   MESA_DEBUG_SEVERITY_MEDIUM,
> +   fmt, args);
> +   va_end(args);
> +}
> +
> +
> +/**
>   * Used as a placeholder for buffer objects between glGenBuffers() and
>   * glBindBuffer() so that glIsBuffer() can work correctly.
>   */
> @@ -1677,6 +1705,21 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct 
> gl_buffer_object *bufObj,
> if (size == 0)
>return;
>
> +   bufObj->NumSubDataCalls++;
> +
> +   if ((bufObj->Usage == GL_STATIC_DRAW ||
> +bufObj->Usage == GL_STATIC_COPY) &&
> +   bufObj->NumSubDataCalls >= BUFFER_WARNING_CALL_COUNT) {
> +  /* If the application declared the buffer as static draw/copy or stream
> +   * draw, it should not be frequently modified with glBufferSubData.
> +   */
> +  buffer_usage_warning(ctx,
> +   "using %s(buffer %u, offset %u, size %u) to "
> +   "update a %s buffer",
> +   func, bufObj->Name, offset, size,
> +   _mesa_enum_to_string(bufObj->Usage));
> +   }
> +
> bufObj->Written = GL_TRUE;
>
> assert(ctx->Driver.BufferSubData);
> @@ -2384,6 +2427,18 @@ _mesa_map_buffer_range(struct gl_context *ctx,
>return NULL;
> }
>
> +   if (access & GL_MAP_WRITE_BIT) {
> +  bufObj->NumMapBufferWriteCalls++;
> +  if ((bufObj->Usage == GL_STATIC_DRAW ||
> +   bufObj->Usage == GL_STATIC_COPY) &&
> +  bufObj->NumMapBufferWriteCalls >= BUFFER_WARNING_CALL_COUNT) {
> + buffer_usage_warning(ctx,
> +  "using %s(buffer %u, offset %u, length %u) to "
> +  "update a %s buffer",
> +  func, bufObj->Name, offset, length,
> +  _mesa_enum_to_string(bufObj->Usage));
> +  }
> +   }
>
> assert(ctx->Driver.MapBufferRange);
> map = ctx->Driver.MapBufferRange(ctx, offset, length, access, bufObj,
> diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
> index 1eb1e21..de54169 100644
> --- a/src/mesa/main/mtypes.h
> +++ b/src/mesa/main/mtypes.h
> @@ -1275,6 +1275,10 @@ struct gl_buffer_object
> GLboolean Immutable; /**< GL_ARB_buffer_storage */
> gl_buffer_usage UsageHistory; /**< How has this buffer been used so far? 
> */
>
> +   /** Counters used for buffer usage warnings */
> +   GLuint NumSubDataCalls;
> +   GLuint NumMapBufferWriteCalls;
> +
> struct gl_buffer_mapping Mappings[MAP_COUNT];
>  };
>
> --
> 1.9.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH 1/5] i965/eu: set correct execution size in brw_NOP

2015-12-09 Thread Matt Turner

On Wed, Dec 9, 2015 at 4:15 AM, Iago Toral Quiroga  wrote:
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
> b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index f8c0f80..9543d5e 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -1256,6 +1256,7 @@ brw_F16TO32(struct brw_codegen *p, struct brw_reg dst, 
> struct brw_reg src)
>  void brw_NOP(struct brw_codegen *p)
>  {
> brw_inst *insn = next_insn(p, BRW_OPCODE_NOP);
> +   brw_inst_set_exec_size(p->devinfo, insn, BRW_EXECUTE_4);

I don't follow this change. Was this implicitly set before?

At least in newer documentation, NOP is defined to have nearly all
fields 0 which would mean execution size must be 1.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-09 Thread Matt Turner

On Tue, Dec 8, 2015 at 9:37 PM, Jonathan Gray  wrote:
> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
> still get optimised out but now there is at least one SSE4.1 instruction
> generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
> got optimised out the configure test would incorrectly pass when the
> compiler supported the intrinsics and the assembler didn't support the
> instructions.
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
> Signed-off-by: Jonathan Gray 
> Cc: "11.0 11.1" 
> ---
>  configure.ac | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/configure.ac b/configure.ac
> index 260934d..1d82e47 100644
> --- a/configure.ac
> +++ b/configure.ac
> @@ -384,7 +384,7 @@ CFLAGS="$SSE41_CFLAGS $CFLAGS"
>  AC_COMPILE_IFELSE([AC_LANG_SOURCE([[
>  #include 
>  int main () {
> -__m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
> +volatile __m128i a = _mm_set1_epi32 (0), b = _mm_set1_epi32 (0), c;
>  c = _mm_max_epu32(a, b);
>  return 0;

I would have extracted an int from the result of _mm_max_epu32 and
returned that instead of 0.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 2/2] util/u_helpers: return correct number of bound buffers

2015-12-09 Thread Ilia Mirkin

I'm probably just being dense... can you provide an exact sequence of
calls that would cause this logic to fail? Seems like it should work
as-is...

On Sun, Dec 6, 2015 at 4:12 AM, Patrick Rudolph  wrote:
> In case a state tracker unbinds every slot by a seperate
> pipe->set_vertex_buffers() call, starting from slot zero, the number
> of bound buffers would not reach zero at all. Unbinding all buffers
> at once or starting at the top-most slot results in correct behaviour.
>
> Calculating the correct number of bound buffers fixes a NULL pointer
> dereference in nvc0_validate_vertex_buffers_shared().
>
> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93004
>
> Signed-off-by: Patrick Rudolph 
> ---
>  src/gallium/auxiliary/util/u_helpers.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/auxiliary/util/u_helpers.c 
> b/src/gallium/auxiliary/util/u_helpers.c
> index 09619c1..09020b0 100644
> --- a/src/gallium/auxiliary/util/u_helpers.c
> +++ b/src/gallium/auxiliary/util/u_helpers.c
> @@ -81,7 +81,13 @@ void util_set_vertex_buffers_count(struct 
> pipe_vertex_buffer *dst,
> const struct pipe_vertex_buffer *src,
> unsigned start_slot, unsigned count)
>  {
> -   uint32_t enabled_buffers = (1ull << *dst_count) - 1;
> +   unsigned i;
> +   uint32_t enabled_buffers = 0;
> +
> +   for (i = 0; i < *dst_count; i++) {
> +  if (dst[i].buffer || dst[i].user_buffer)
> + enabled_buffers |= (1ull << i);
> +   }
>
> util_set_vertex_buffers_mask(dst, &enabled_buffers, src, start_slot,
>  count);
> --
> 2.4.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] glapi: Build gl_gentable.c only on Darwin

2015-12-09 Thread Jeremy Huddleston Sequoia

The general concept of this change seems fine to me.

Given the desire to keep glapi as similar as possible across platforms, would 
it be better to just move this into glx/apple rather than leaving it in glapi?

> On Dec 9, 2015, at 09:07, Emil Velikov  wrote:
> 
> On 9 December 2015 at 14:11, Andreas Boll  wrote:
>> Removes the public symbol _glapi_create_table_from_handle from
>> libGL.so.1 on all plattforms except Darwin.
>> 
> typo -> platforms
> 
>> Since the symbol is not used on other plattforms it makes sense to
> ditto
> 
>> build gl_gentable.c only on Darwin.
>> 
> Ideally we'll keep the dispatch as close to identical across all
> platforms (i.e. we'll nuke this), although for now this will do.
> 
> Out of curiosity is there any noticeable difference in the build times?
> 
> ...
>> XXX If we still want to distribute gl_gentable.c in the release tarball
>> we could drop the changes in src/mapi/glapi/gen/Makefile.am
>> 
> Yes please. We want to ship all the generated sources, regardless of
> the platform they're used.
> 
> ...
>> index 2da8f7d..25ea44a 100644
>> --- a/src/mapi/glapi/gen/Makefile.am
>> +++ b/src/mapi/glapi/gen/Makefile.am
>> @@ -27,8 +27,11 @@ MESA_GLAPI_OUTPUTS = \
>>$(MESA_GLAPI_DIR)/glapi_mapi_tmp.h \
>>$(MESA_GLAPI_DIR)/glprocs.h \
>>$(MESA_GLAPI_DIR)/glapitemp.h \
>> -   $(MESA_GLAPI_DIR)/glapitable.h \
>> -   $(MESA_GLAPI_DIR)/glapi_gentable.c
>> +   $(MESA_GLAPI_DIR)/glapitable.h
>> +
>> +if HAVE_APPLEDRI
>> +MESA_GLAPI_OUTPUTS += $(MESA_GLAPI_DIR)/glapi_gentable.c
>> +endif
>> 
>> MESA_GLAPI_ASM_OUTPUTS =
>> if HAVE_X86_ASM
>> @@ -88,8 +91,11 @@ XORG_GLAPI_DIR = $(XORG_BASE)/glx
>> XORG_GLAPI_OUTPUTS = \
>>$(XORG_GLAPI_DIR)/glprocs.h \
>>$(XORG_GLAPI_DIR)/glapitable.h \
>> -   $(XORG_GLAPI_DIR)/dispatch.h \
>> +   $(XORG_GLAPI_DIR)/dispatch.h
>> +
>> +if HAVE_APPLEDRI
>>$(XORG_GLAPI_DIR)/glapi_gentable.c
> Erm missing XORG_GLAPI_OUTPUTS += ?
> 
> Afaict even with the above makefile changes the file should still be
> in the in the tarball. Am I missing something ?
> 
> Would be great if Jeremy (or someone else) has the chance to test
> this, in case we've missing something.
> 
> -Emil



smime.p7s
Description: S/MIME cryptographic signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] glapi: Build gl_gentable.c only on Darwin

2015-12-09 Thread Emil Velikov

On 9 December 2015 at 14:11, Andreas Boll  wrote:
> Removes the public symbol _glapi_create_table_from_handle from
> libGL.so.1 on all plattforms except Darwin.
>
typo -> platforms

> Since the symbol is not used on other plattforms it makes sense to
ditto

> build gl_gentable.c only on Darwin.
>
Ideally we'll keep the dispatch as close to identical across all
platforms (i.e. we'll nuke this), although for now this will do.

Out of curiosity is there any noticeable difference in the build times?

...
> XXX If we still want to distribute gl_gentable.c in the release tarball
> we could drop the changes in src/mapi/glapi/gen/Makefile.am
>
Yes please. We want to ship all the generated sources, regardless of
the platform they're used.

...
> index 2da8f7d..25ea44a 100644
> --- a/src/mapi/glapi/gen/Makefile.am
> +++ b/src/mapi/glapi/gen/Makefile.am
> @@ -27,8 +27,11 @@ MESA_GLAPI_OUTPUTS = \
> $(MESA_GLAPI_DIR)/glapi_mapi_tmp.h \
> $(MESA_GLAPI_DIR)/glprocs.h \
> $(MESA_GLAPI_DIR)/glapitemp.h \
> -   $(MESA_GLAPI_DIR)/glapitable.h \
> -   $(MESA_GLAPI_DIR)/glapi_gentable.c
> +   $(MESA_GLAPI_DIR)/glapitable.h
> +
> +if HAVE_APPLEDRI
> +MESA_GLAPI_OUTPUTS += $(MESA_GLAPI_DIR)/glapi_gentable.c
> +endif
>
>  MESA_GLAPI_ASM_OUTPUTS =
>  if HAVE_X86_ASM
> @@ -88,8 +91,11 @@ XORG_GLAPI_DIR = $(XORG_BASE)/glx
>  XORG_GLAPI_OUTPUTS = \
> $(XORG_GLAPI_DIR)/glprocs.h \
> $(XORG_GLAPI_DIR)/glapitable.h \
> -   $(XORG_GLAPI_DIR)/dispatch.h \
> +   $(XORG_GLAPI_DIR)/dispatch.h
> +
> +if HAVE_APPLEDRI
> $(XORG_GLAPI_DIR)/glapi_gentable.c
Erm missing XORG_GLAPI_OUTPUTS += ?

Afaict even with the above makefile changes the file should still be
in the in the tarball. Am I missing something ?

Would be great if Jeremy (or someone else) has the chance to test
this, in case we've missing something.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/4] glsl: Fix a typo in a comment

2015-12-09 Thread Brian Paul


On 12/09/2015 09:20 AM, Andreas Boll wrote:

s/suports/supports/

Signed-off-by: Andreas Boll 
---
  src/glsl/glsl_parser_extras.h | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 6bded3e..a4bda77 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -97,7 +97,7 @@ struct _mesa_glsl_parse_state {
  * supports the feature.
  *
  * \param required_glsl_es_version is the GLSL ES version that is required
-* to support the feature, or 0 if no version of GLSL ES suports the
+* to support the feature, or 0 if no version of GLSL ES supports the
  * feature.
  */
 bool is_version(unsigned required_glsl_version,



For all 4, Reviewed-by: Brian Paul 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH 1/2] glx: Fix a typo in a comment

2015-12-09 Thread Brian Paul


On 12/09/2015 09:29 AM, Andreas Boll wrote:

s/suports/supports/

Signed-off-by: Andreas Boll 
---
Found two more "suports" typos.
I could squash all patches together if that's preferred.

  src/glx/dri2_glx.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 27ea952..651915a 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1289,7 +1289,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
__glXEnableDirectExtension(&psc->base, "GLX_OML_sync_control");
 }

-   /* DRI2 suports SubBuffer through DRI2CopyRegion, so it's always
+   /* DRI2 supports SubBuffer through DRI2CopyRegion, so it's always
  * available.*/
 psp->copySubBuffer = dri2CopySubBuffer;
 __glXEnableDirectExtension(&psc->base, "GLX_MESA_copy_sub_buffer");



For both, Reviewed-by: Brian Paul 

I'd say you don't really have to get reviews for fixing comment typos 
though.


-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/2] mesa: Fix a typo in a comment

2015-12-09 Thread Andreas Boll

s/suports/supports/

Signed-off-by: Andreas Boll 
---
 src/mesa/main/extensions.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/main/extensions.h b/src/mesa/main/extensions.h
index 1615e1c..b5e0350 100644
--- a/src/mesa/main/extensions.h
+++ b/src/mesa/main/extensions.h
@@ -88,7 +88,7 @@ enum {
 };
 
 
-/** Checks if the context suports a user-facing extension */
+/** Checks if the context supports a user-facing extension */
 #define EXT(name_str, driver_cap, ...) \
 static inline bool \
 _mesa_has_##name_str(const struct gl_context *ctx) \
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/2] glx: Fix a typo in a comment

2015-12-09 Thread Andreas Boll

s/suports/supports/

Signed-off-by: Andreas Boll 
---
Found two more "suports" typos.
I could squash all patches together if that's preferred.

 src/glx/dri2_glx.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index 27ea952..651915a 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -1289,7 +1289,7 @@ dri2CreateScreen(int screen, struct glx_display * priv)
   __glXEnableDirectExtension(&psc->base, "GLX_OML_sync_control");
}
 
-   /* DRI2 suports SubBuffer through DRI2CopyRegion, so it's always
+   /* DRI2 supports SubBuffer through DRI2CopyRegion, so it's always
 * available.*/
psp->copySubBuffer = dri2CopySubBuffer;
__glXEnableDirectExtension(&psc->base, "GLX_MESA_copy_sub_buffer");
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.

2015-12-09 Thread Ilia Mirkin

On Wed, Dec 9, 2015 at 11:18 AM, Deve  wrote:
> This patch indeed seems to not have a sense. I just added it to the bug
> report as a suggestion that it works for me after this modification. Emil
> Velikov said that I should send it to the mailing list.
>
> Here is how it works in Supertuxkart:
> We create rtt with following parameters:
>
> DepthStencilTexture = generateRTT(res, GL_DEPTH24_STENCIL8,
> GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8);
>
> Then, during rendering scene, we do:
>
> glEnable(GL_FRAMEBUFFER_SRGB);
> glBindFramebuffer(GL_FRAMEBUFFER, 0);

OK, so this is the "winsys" framebuffer (GL has some term for it,
sorry, I don't remember what it is... perhaps it's even winsys). This
is created based on parameters of your selected GLX visual. For
example, when I run glxinfo, I see (on nouveau; the list on intel will
be different but comparable):

480 GLX Visuals
visual  x   bf lv rg d st  colorbuffer  sr ax dp st accumbuffer  ms  cav
  id dep cl sp  sz l  ci b ro  r  g  b  a F gb bf th cl  r  g  b  a ns b eat

0x021 24 tc  0  32  0 r  y .   8  8  8  8 .  .  0 24  8  0  0  0  0  0 0 None
0x022 24 dc  0  32  0 r  y .   8  8  8  8 .  .  0 24  8  0  0  0  0  0 0 None
...
0x343 24 tc  0  32  0 r  . .   8  8  8  8 .  s  0  0  0  0  0  0  0  0 0 None
0x344 24 tc  0  32  0 r  . .   8  8  8  8 .  s  0  0  0 16 16 16 16  0 0 Slow
0x345 24 tc  0  32  0 r  y .   8  8  8  8 .  s  0  0  0  0  0  0  0  0 0 None
0x346 24 tc  0  32  0 r  y .   8  8  8  8 .  s  0  0  0 16 16 16 16  0 0 Slow

Note how some of them have srgb, others don't (and have various
differences in their various other properties). EGL has something
similar I believe, but tbh I don't remember the specifics. That's the
mesaVis->sRGBCapable bit below. If you need an sRGB-capable visual,
are you sure you're picking one? This would be somewhere well before
any actual rendering takes place.

If you're not sure whether you need an sRGB-capable visual or not, try
making sure you pick a *non-srgb* visual in a working configuration
and see if it breaks.

Cheers,

  -ilia

> (...)
> render();
> (...)
> glDisable(GL_FRAMEBUFFER_SRGB);
>
> It looks that glEnable(GL_FRAMEBUFFER_SRGB) doesn't work anymore. It's
> because of following lines in intel_screen.c in intelCreateBuffer()
> function:
>
>if (mesaVis->redBits == 5)
>   rgbFormat = MESA_FORMAT_B5G6R5_UNORM;
>else if (mesaVis->sRGBCapable)
>   rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
>else if (mesaVis->alphaBits == 0)
>   rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM;
>else {
>   rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
>   fb->Visual.sRGBCapable = true;
>}
>
> Previously MESA_FORMAT_B8G8R8X8_UNORM was not available, and thus
> MESA_FORMAT_B8G8R8A8_UNORM was handled as last case (using
> MESA_FORMAT_B8G8R8A8_SRGB format). Now it uses MESA_FORMAT_B8G8R8X8_UNORM
> format.
>
> Any ideas how it should be handled?
>
> Regards,
> Deve
>
> W dniu 09.12.2015 o 03:00, Ilia Mirkin pisze:
>
>> On Mon, Dec 7, 2015 at 5:32 PM, Dawid Gan  wrote:
>>>
>>> This format has been added in commit:
>>> 28090b30dd6b5977de085f48c620574214b6b4ba
>>> But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format.
>>> It was causing the screen in Supertuxkart to be darker than expected,
>>> see:
>>> https://bugs.freedesktop.org/show_bug.cgi?id=92759
>>>
>>> Cc: Boyan Ding 
>>> Cc: "11.0 11.1" 
>>> Fixes: 28090b30dd6 "i965: Add XRGB format to
>>> intel_screen_make_configs"
>>> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
>>> ---
>>>   src/mesa/drivers/dri/i965/intel_screen.c | 9 +
>>>   1 file changed, 9 insertions(+)
>>>
>>> diff --git a/src/mesa/drivers/dri/i965/intel_screen.c
>>> b/src/mesa/drivers/dri/i965/intel_screen.c
>>> index cc90efe..75d5a65 100644
>>> --- a/src/mesa/drivers/dri/i965/intel_screen.c
>>> +++ b/src/mesa/drivers/dri/i965/intel_screen.c
>>> @@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
>>>stencil_bits[2] = 8;
>>>num_depth_stencil_bits = 3;
>>>}
>>> +  } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
>>> + depth_bits[1] = 24;
>>> + stencil_bits[1] = 0;
>>
>>
>> Why would you want depth without stencil when using BGRX? I don't see
>> how the two are connected... Are you sure you're picking the right
>> visual?
>>
>>-ilia
>>
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression

2015-12-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #19 from nerdopol...@verizon.net ---
I didn't see any thing in the changelog for 'egl' that looked like it might be
a fix... Not 100% sure though

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 2/4] mesa: Fix typos in print messages

2015-12-09 Thread Andreas Boll

s/inconsistant/inconsistent/
s/occurences/occurrences/

Signed-off-by: Andreas Boll 
---
 src/mesa/main/teximage.c  | 2 +-
 src/mesa/main/transformfeedback.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/main/teximage.c b/src/mesa/main/teximage.c
index 60fc7cc..73b3318 100644
--- a/src/mesa/main/teximage.c
+++ b/src/mesa/main/teximage.c
@@ -2028,7 +2028,7 @@ compressed_texture_error_check(struct gl_context *ctx, 
GLint dimensions,
* if  is not consistent with the format, dimensions, and
* contents of the specified image.
*/
-  reason = "imageSize inconsistant with width/height/format";
+  reason = "imageSize inconsistent with width/height/format";
   error = GL_INVALID_VALUE;
   goto error;
}
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 103011c..976b268 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -861,7 +861,7 @@ _mesa_TransformFeedbackVaryings(GLuint program, GLsizei 
count,
  if (buffers > ctx->Const.MaxTransformFeedbackBuffers) {
 _mesa_error(ctx, GL_INVALID_OPERATION,
 "glTransformFeedbackVaryings(too many gl_NextBuffer "
-"occurences)");
+"occurrences)");
 return;
  }
   } else {
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 1/4] glsl: Fix a typo in a comment

2015-12-09 Thread Andreas Boll

s/suports/supports/

Signed-off-by: Andreas Boll 
---
 src/glsl/glsl_parser_extras.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
index 6bded3e..a4bda77 100644
--- a/src/glsl/glsl_parser_extras.h
+++ b/src/glsl/glsl_parser_extras.h
@@ -97,7 +97,7 @@ struct _mesa_glsl_parse_state {
 * supports the feature.
 *
 * \param required_glsl_es_version is the GLSL ES version that is required
-* to support the feature, or 0 if no version of GLSL ES suports the
+* to support the feature, or 0 if no version of GLSL ES supports the
 * feature.
 */
bool is_version(unsigned required_glsl_version,
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 4/4] st/osmesa: Fix a typo in a comment

2015-12-09 Thread Andreas Boll

s/suport/support/

Signed-off-by: Andreas Boll 
---
 src/gallium/state_trackers/osmesa/osmesa.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/state_trackers/osmesa/osmesa.c 
b/src/gallium/state_trackers/osmesa/osmesa.c
index 0285cb0..0f27ba8 100644
--- a/src/gallium/state_trackers/osmesa/osmesa.c
+++ b/src/gallium/state_trackers/osmesa/osmesa.c
@@ -32,7 +32,7 @@
  * may be set to "softpipe" or "llvmpipe" to override.
  *
  * With softpipe we could render directly into the user's buffer by using a
- * display target resource.  However, softpipe doesn't suport "upside-down"
+ * display target resource.  However, softpipe doesn't support "upside-down"
  * rendering which would be needed for the OSMESA_Y_UP=TRUE case.
  *
  * With llvmpipe we could only render directly into the user's buffer when its
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH 3/4] meta: Fix a typo in a print message

2015-12-09 Thread Andreas Boll

s/Unkown/Unknown/

Signed-off-by: Andreas Boll 
---
 src/mesa/drivers/common/meta_blit.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index c5faf61..4dbf0a7 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -325,7 +325,7 @@ setup_glsl_msaa_blit_shader(struct gl_context *ctx,
   }
   break;
default:
-  _mesa_problem(ctx, "Unkown texture target %s\n",
+  _mesa_problem(ctx, "Unknown texture target %s\n",
 _mesa_enum_to_string(target));
   shader_index = BLIT_2X_MSAA_SHADER_2D_MULTISAMPLE_RESOLVE;
}
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: handle stencil_bits parameter for MESA_FORMAT_B8G8R8X8_UNORM format.

2015-12-09 Thread Deve

This patch indeed seems to not have a sense. I just added it to the bug 
report as a suggestion that it works for me after this modification. 
Emil Velikov said that I should send it to the mailing list.


Here is how it works in Supertuxkart:
We create rtt with following parameters:

DepthStencilTexture = generateRTT(res, GL_DEPTH24_STENCIL8, 
GL_DEPTH_STENCIL, GL_UNSIGNED_INT_24_8);


Then, during rendering scene, we do:

glEnable(GL_FRAMEBUFFER_SRGB);
glBindFramebuffer(GL_FRAMEBUFFER, 0);
(...)
render();
(...)
glDisable(GL_FRAMEBUFFER_SRGB);

It looks that glEnable(GL_FRAMEBUFFER_SRGB) doesn't work anymore. It's 
because of following lines in intel_screen.c in intelCreateBuffer() 
function:


   if (mesaVis->redBits == 5)
  rgbFormat = MESA_FORMAT_B5G6R5_UNORM;
   else if (mesaVis->sRGBCapable)
  rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
   else if (mesaVis->alphaBits == 0)
  rgbFormat = MESA_FORMAT_B8G8R8X8_UNORM;
   else {
  rgbFormat = MESA_FORMAT_B8G8R8A8_SRGB;
  fb->Visual.sRGBCapable = true;
   }

Previously MESA_FORMAT_B8G8R8X8_UNORM was not available, and thus 
MESA_FORMAT_B8G8R8A8_UNORM was handled as last case (using 
MESA_FORMAT_B8G8R8A8_SRGB format). Now it uses 
MESA_FORMAT_B8G8R8X8_UNORM format.


Any ideas how it should be handled?

Regards,
Deve

W dniu 09.12.2015 o 03:00, Ilia Mirkin pisze:

On Mon, Dec 7, 2015 at 5:32 PM, Dawid Gan  wrote:

This format has been added in commit:
28090b30dd6b5977de085f48c620574214b6b4ba
But it was handled in the same way as MESA_FORMAT_B8G8R8A8_UNORM format.
It was causing the screen in Supertuxkart to be darker than expected, see:
https://bugs.freedesktop.org/show_bug.cgi?id=92759

Cc: Boyan Ding 
Cc: "11.0 11.1" 
Fixes: 28090b30dd6 "i965: Add XRGB format to intel_screen_make_configs"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92759
---
  src/mesa/drivers/dri/i965/intel_screen.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index cc90efe..75d5a65 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -1237,6 +1237,9 @@ intel_screen_make_configs(__DRIscreen *dri_screen)
   stencil_bits[2] = 8;
   num_depth_stencil_bits = 3;
   }
+  } else if (formats[i] == MESA_FORMAT_B8G8R8X8_UNORM) {
+ depth_bits[1] = 24;
+ stencil_bits[1] = 0;


Why would you want depth without stencil when using BGRX? I don't see
how the two are connected... Are you sure you're picking the right
visual?

   -ilia


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] Fix locking of GLsync objects

2015-12-09 Thread Steinar H. Gunderson

On Wed, Dec 09, 2015 at 10:35:25AM +, Emil Velikov wrote:
>> I was told that it's easier for people to review my patch if it comes in via
>> email than being stuck in the bug tracker; FWIW, this is for bug 120238.
> Which bugtracker it this ? bugs.fd.o does not like the number
> mentioned. Please add the full URL to the commit message with a
> Bugzilla: tag.

I mixed up some bug numbers, sorry.

https://bugs.freedesktop.org/show_bug.cgi?id=92757

>> GLsync objects had a race condition when used from multiple threads
>> (which is the main point of the extension, really); it could be
>> validated as a sync object at the beginning of the function, and then
>> deleted by another thread before use, causing crashes. Fix this by
>> changing all casts from GLsync to struct gl_sync_object to a new
>> function _mesa_get_sync() that validates and increases the refcount.
> Might be worth keeping _mesa_ref_sync_object(), even if it's an inline
> wrapper around the above. As things get a bit confusing - foo_get vs
> foo_unref.

What about _mesa_get_and_ref_sync()?

> Alternatively one could even throw the locking (+extra checks) into
> the validate, use it in _mesa_IsSync(), while using the ref/unref
> combo elsewhere and drop the "amount" argument from unref.

I agree the amount argument is a bit icky. :-)

> Please mention if this commit fixes a certain game/program.

The motivating program is unreleased for now, but it will be released
(under GPLv3) in February.

> Can you also add the following tag. This way it'll be harder to miss
> the patch when picking things for the stable branch(es).
> 
> Cc: "11.0 11.1" 

Sure. (I'll send an updated patch when we have agreement on the issues
above.)

/* Steinar */
-- 
Homepage: https://www.sesse.net/
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC PATCH 5/5] i965: Skip execution size adjustment for instructions of width 4

2015-12-09 Thread Jason Ekstrand

On Dec 9, 2015 4:16 AM, "Iago Toral Quiroga"  wrote:
>
> This code in brw_set_dest adjusts the execution size of any instruction
> with a dst.width < 8. However, we don't want to do this with instructions
> operating on doubles, since these will have a width of 4, but still
> need an execution size of 8 (for SIMD8). Unfortunately, we can't just
check
> the size of the operands involved to detect if we are doing an operation
on
> doubles, because we can have instructions that do operations on double
> operands interpreted as UD, operating on any of its 2 32-bit components.
>
> Previous commits have made it so we never emit instructions with a
horizontal
> width of 4 that don't have the correct execution size set for gen7/gen8,
so
> we can skip it in this case, avoiding the conflicts with fp64
requirements.
>
> Expanding the same fix to other hardware generations requires many more
> changes but since we are not targetting fp64 support on them
> wer don't really care for now.
> ---
>  src/mesa/drivers/dri/i965/brw_eu_emit.c | 14 +-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> index 78f2c8c..50a8771 100644
> --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
> +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
> @@ -202,8 +202,20 @@ brw_set_dest(struct brw_codegen *p, brw_inst *inst,
struct brw_reg dest)
> /* Generators should set a default exec_size of either 8 (SIMD4x2 or
SIMD8)
>  * or 16 (SIMD16), as that's normally correct.  However, when dealing
with
>  * small registers, we automatically reduce it to match the register
size.
> +*
> +* In platforms that support fp64 we can emit instructions with a
width of
> +* 4 that need two SIMD8 registers and an exec_size of 8 or 16. In
these
> +* cases we need to make sure that these instructions have their exec
sizes
> +* set properly when they are emitted and we can't rely on this code
to fix
> +* it.
>  */
> -   if (dest.width < BRW_EXECUTE_8)
> +   bool fix_exec_size;
> +   if (devinfo->gen == 7 || devinfo->gen == 8)

If we're doing to take this approach, we definitely want to make it gen > 6
or something so we include future gens.  Really gen > 4 is probably doable
since the only real problem is the legacy clipping code.

> +  fix_exec_size = dest.width < BRW_EXECUTE_4;
> +   else
> +  fix_exec_size = dest.width < BRW_EXECUTE_8;
> +
> +   if (fix_exec_size)
>brw_inst_set_exec_size(devinfo, inst, dest.width);
>  }
>
> --
> 2.1.4
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH] i965: Separate base offset/constant offset combining from remapping.

2015-12-09 Thread Jason Ekstrand

On Dec 9, 2015 2:51 AM, "Kenneth Graunke"  wrote:
>
> My tessellation branch has two additional remap functions.  I don't want
> to replicate this logic there.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/mesa/drivers/dri/i965/brw_nir.c | 78
-
>  1 file changed, 50 insertions(+), 28 deletions(-)
>
> Hey Jason,
>
> If you like this patch, and haven't yet merged your NIR input reworks,
> feel free to just squash it into your changes.  Or, we can land it
> separately after your changes.  It's up to you.
>
> Separating this out allows me to reuse this in my new tessellation input
> and output remapping functions, and also means we don't need to add
structs
> for the remap functions...we can just pass the builder, or inputs_read, or
> the VUE map...and not have to pack multiple things together.

Sure.  It does make things simpler.

>  --Ken
>
> diff --git a/src/mesa/drivers/dri/i965/brw_nir.c
b/src/mesa/drivers/dri/i965/brw_nir.c
> index 14ad172..105a175 100644
> --- a/src/mesa/drivers/dri/i965/brw_nir.c
> +++ b/src/mesa/drivers/dri/i965/brw_nir.c
> @@ -27,15 +27,19 @@
>  #include "glsl/nir/nir_builder.h"
>  #include "program/prog_to_nir.h"
>
> -struct remap_vs_attrs_state {
> -   nir_builder b;
> -   uint64_t inputs_read;
> -};
> -
> +/**
> + * In many cases, we just add the base and offset together, so there's no
> + * reason to keep them separate.  Sometimes, combining them is essential:
> + * if a shader only accesses part of a compound variable (such as a
matrix
> + * or array), the variable's base may not actually exist in the VUE map.
> + *
> + * This pass adds constant offsets to instr->const_index[0], and resets
> + * the offset source to 0.  Non-constant offsets remain unchanged.
> + */
>  static bool
> -remap_vs_attrs(nir_block *block, void *void_state)
> +add_const_offset_to_base(nir_block *block, void *closure)
>  {
> -   struct remap_vs_attrs_state *state = void_state;
> +   nir_builder *b = closure;
>
> nir_foreach_instr_safe(block, instr) {
>if (instr->type != nir_instr_type_intrinsic)
> @@ -43,30 +47,48 @@ remap_vs_attrs(nir_block *block, void *void_state)
>
>nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
>
> +  if (intrin->intrinsic == nir_intrinsic_load_input ||
> +  intrin->intrinsic == nir_intrinsic_load_per_vertex_input ||
> +  intrin->intrinsic == nir_intrinsic_load_output ||
> +  intrin->intrinsic == nir_intrinsic_load_per_vertex_output ||
> +  intrin->intrinsic == nir_intrinsic_store_output ||
> +  intrin->intrinsic == nir_intrinsic_store_per_vertex_output) {

This seems a bit scortched-earth. It would be nice if the caller had a bit
more control.

> + nir_src *offset = nir_get_io_offset_src(intrin);
> + nir_const_value *const_offset = nir_src_as_const_value(*offset);
> +
> + if (const_offset) {
> +intrin->const_index[0] += const_offset->u[0];
> +b->cursor = nir_before_instr(&intrin->instr);
> +nir_instr_rewrite_src(&intrin->instr, offset,
> +  nir_src_for_ssa(nir_imm_int(b, 0)));
> + }

Else???  It seems that you don't want to run this pass if you think you'll
ever hit an indirect.  I guess it's harmless to just do this for all direct
things in our driver, but it doesn't sit well.

> +  }
> +   }
> +   return true;
> +
> +}
> +
> +static bool
> +remap_vs_attrs(nir_block *block, void *closure)
> +{
> +   GLbitfield64 inputs_read = *((GLbitfield64 *) closure);
> +
> +   nir_foreach_instr(block, instr) {
> +  if (instr->type != nir_instr_type_intrinsic)
> + continue;
> +
> +  nir_intrinsic_instr *intrin = nir_instr_as_intrinsic(instr);
> +
>if (intrin->intrinsic == nir_intrinsic_load_input) {
>   /* Attributes come in a contiguous block, ordered by their
>* gl_vert_attrib value.  That means we can compute the slot
>* number for an attribute by masking out the enabled attributes
>* before it and counting the bits.
>*/
> - nir_const_value *const_offset =
nir_src_as_const_value(intrin->src[0]);
> -
> - /* We set EmitNoIndirect for VS inputs, so there are no
indirects. */
> - assert(const_offset);
> -
> - int attr = intrin->const_index[0] + const_offset->u[0];
> - int slot = _mesa_bitcount_64(state->inputs_read &
> -  BITFIELD64_MASK(attr));
> + int attr = intrin->const_index[0];
> + int slot = _mesa_bitcount_64(inputs_read &
BITFIELD64_MASK(attr));
>
> - /* The NIR -> FS pass will just add the base and offset
together, so
> -  * there's no reason to keep them separate.  Just put it all in
> -  * const_index[0] and set the offset src[0] to load_const(0).
> -  */
>   intrin->const_index[0] = 4 * slot;
> -
> - state->b.cursor = nir_before_instr(&intrin->instr);

Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound

2015-12-09 Thread Emil Velikov

On 9 December 2015 at 14:57, Neil Roberts  wrote:
> Emil Velikov  writes:
>
>> Worth throwing in 11.0 as well ?
>
> Yeah, that would probably be sensible.
>
>>> if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
>>> -  dri2InvalidateDrawable(driContext->driDrawablePriv);
>>> -  dri2InvalidateDrawable(driContext->driReadablePriv);
>>> +  if (driContext->driDrawablePriv)
>>> + dri2InvalidateDrawable(driContext->driDrawablePriv);
>>> +  if (driContext->driReadablePriv)
>>> + dri2InvalidateDrawable(driContext->driReadablePriv);
>>
>>  Afaict i915 could use an identical fix ?
>
> Yes, I think you're right. However I don't have any way of testing it so
> I feel a bit uncomfortable touching i915 driver.
>
Considering it's a null-deref fix one can just move the check in
dri2InvalidateDrawable, which would spare going through i915, i965,
radeon... ;-)

Just throwing it out there, it's up-to whichever route you want to take.
-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] softpipe: V.2 implement some support for multiple viewports

2015-12-09 Thread Roland Scheidegger

Am 09.12.2015 um 05:16 schrieb Edward O'Callaghan:
> This fixes my initial attempt so that piglit now passes 14/14. Thanks
> to a couple of tips from Roland in the previous patch I was able to
> fix the remaining issue. This should be golden now.
> 

Great that you got it working!
Please send the patches to the ml.

Roland


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [PATCH shader-db 1/3] split-to-files: deal with minimum versions, other shader types

2015-12-09 Thread Eero Tamminen


Hi,

On 11/09/2015 08:47 PM, Matt Turner wrote:

On Mon, Nov 9, 2015 at 10:46 AM, Ilia Mirkin  wrote:

I used this script in conjunction with ST_DUMP_SHADERS. What other way is there?


Some local hack and we should probably finish and upstream.


Did anything happen with this?

I had to rewrite split-to-files because it didn't output all (ARB) 
shaders and it picks the wrong one [1] when application re-uses same 
program numbers.


[1] It picked first one, although I think almost always the last one 
will be most interesting.  Last one will also allow easily dumping 
different shader sets (that re-use same program numbers) from the same 
application.


RFC patches for my changes are attached.


- Eero


>From c52f02ff664af269fa5268627624fe94c647ad37 Mon Sep 17 00:00:00 2001
From: Eero Tamminen 
Date: Wed, 9 Dec 2015 16:29:43 +0200
Subject: [PATCH 1/2] Rewrite split-to-files.py to fix it

This rewrite improves on previous version in following ways:

* Improve recognization of shader end.

* Remove extra lines after shader ends (in normal shaders anything
  after last line with '}' that closes main(), and in ARB shaders, lines
  after END).

* Optimize parsing by using compiled regexps.

* Calculate (md5) hashes for normalized (single line comments and white
  space removed) shader contents and identify duplicate shaders with
  those

* If program gets a new shader, output the latest one.  It should be
  more relevant one.  It also allows dumping different shader sets
  e.g. shaders for startup / game menu vs. actual game play, just by
  running application further before killing it.

* When application replaces ARB shaders, continue instead of claiming
  to be done & exiting.  Same program numbers can be used if
  application removes previous programs.

* Tell user which shaders were duplicates and which were replaced by
  which shaders.

* Remove duplicate programs based on shader stage hashes (of their
  normalized sources) and tell user about this.

* Output shader stage sources in 3D pipeline order.

* Give ARB shaders different file name from normal shaders.
---
 split-to-files.py |  409 +++--
 1 file changed, 306 insertions(+), 103 deletions(-)

diff --git a/split-to-files.py b/split-to-files.py
index 151681e..7150622 100755
--- a/split-to-files.py
+++ b/split-to-files.py
@@ -2,122 +2,300 @@
 
 import re
 import os
+import hashlib
 import argparse
 
 
+class ShaderBase:
+def __init__(self, prog, stage):
+self.lines = []
+self.progid = prog # latest
+self.programs = {self.progid: True} # all
+self.stage = stage
+self.hash = None
+self.hashed_len = 0
+self.done = False
+self.replaced = False
+# filled by subclasses
+self.shadernum = 0
+self.req_start = None
+self.req_end = None
+self.warn = None
+
+def append_line(self, line):
+assert not self.done
+self.lines.append(line)
+
+def is_finished(self, line):
+return False
+
+def get_source(self):
+assert self.done
+return "\n".join(self.lines) + "\n"
+
+def add_program(self, dup):
+self.programs[dup.progid] = True
+
+def del_program(self, dup):
+del(self.programs[dup.progid])
+
+def get_hash(self):
+if self.hash:
+return self.hash
+assert self.done
+
+# source without single line comments & whitespace
+normalized = []
+for line in self.lines:
+offset = line.find("//")
+if offset >= 0:
+line = line[:offset]
+# Python2: line = line.translate(None, " \t")
+line = line.translate({' ': None, '\t': None})
+if line:
+normalized.append(line)
+normalized = "".join(normalized).encode()
+
+# create hash for normalized source
+md5 = hashlib.md5()
+self.hashed_len = len(normalized)
+md5.update(normalized)
+self.hash = md5.hexdigest()
+return self.hash
+
+def check_conflict(self, dup):
+assert self.done and dup.done
+if self.hash == dup.hash and self.hashed_len != dup.hashed_len:
+print("ERROR: hash collision with %s" % dup.get_info())
+exit(-1)
+if dup.stage != dup.stage:
+# same shader for different stage, this isn't handled correctly at the moment
+# all code assumes that each hash/shader represents just one shader stage
+print("ERROR: duplicate is for different shader stage (%s)" % dup.stage)
+exit(-1)
+
+def get_info(self):
+assert None # must be subclassed
+
+def show_info(self):
+print(self.get_info())
+
+
+class ShaderARB(ShaderBase):
+def __init__(self, prog, stage):
+ShaderBase.__init__(self, prog, stage)
+self.progid = "%s-ARB_%s" % (prog, stage)
+
+self.req_start = "GL_ARB

Re: [Mesa-dev] [Mesa-stable] [PATCH] i965: Fix crash when calling glViewport with no surface bound

2015-12-09 Thread Neil Roberts

Emil Velikov  writes:

> Worth throwing in 11.0 as well ?

Yeah, that would probably be sensible.

>> if (_mesa_is_winsys_fbo(ctx->DrawBuffer)) {
>> -  dri2InvalidateDrawable(driContext->driDrawablePriv);
>> -  dri2InvalidateDrawable(driContext->driReadablePriv);
>> +  if (driContext->driDrawablePriv)
>> + dri2InvalidateDrawable(driContext->driDrawablePriv);
>> +  if (driContext->driReadablePriv)
>> + dri2InvalidateDrawable(driContext->driReadablePriv);
>
>  Afaict i915 could use an identical fix ?

Yes, I think you're right. However I don't have any way of testing it so
I feel a bit uncomfortable touching i915 driver.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [Bug 91888] EGL Wayland software rendering no longer work after regression

2015-12-09 Thread bugzilla-daemon

https://bugs.freedesktop.org/show_bug.cgi?id=91888

--- Comment #18 from Daniel Stone  ---
How about 'hooray, it's fixed'? :)

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [PATCH] gallium/util: handle patches in u_prims_for_vertices to fix a radeonsi crash

2015-12-09 Thread Marek Olšák

From: Marek Olšák 

I guess the crash was because of divison by zero.

Cc: 11.0 11.1 
---
 src/gallium/auxiliary/util/u_prim.h  | 17 +
 src/gallium/drivers/radeonsi/si_state_draw.c |  3 ++-
 2 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/src/gallium/auxiliary/util/u_prim.h 
b/src/gallium/auxiliary/util/u_prim.h
index 3668015..4926af6 100644
--- a/src/gallium/auxiliary/util/u_prim.h
+++ b/src/gallium/auxiliary/util/u_prim.h
@@ -141,14 +141,23 @@ u_prim_vertex_count(unsigned prim)
  * For polygons, return the number of triangles.
  */
 static inline unsigned
-u_prims_for_vertices(unsigned prim, unsigned num)
+u_prims_for_vertices(unsigned prim, unsigned num, unsigned vertices_per_patch)
 {
-   const struct u_prim_vertex_count *info = u_prim_vertex_count(prim);
+   struct u_prim_vertex_count info;
 
-   if (num < info->min)
+   if (prim == PIPE_PRIM_PATCHES)
+  info.min = info.incr = vertices_per_patch;
+   else if (prim < PIPE_PRIM_MAX)
+  info = *u_prim_vertex_count(prim);
+   else {
+  assert(!"invalid prim type");
+  return 0;
+   }
+
+   if (num < info.min)
   return 0;
 
-   return 1 + ((num - info->min) / info->incr);
+   return 1 + ((num - info.min) / info.incr);
 }
 
 static inline boolean u_validate_pipe_prim( unsigned pipe_prim, unsigned nr )
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index ee84a1f..4ac9d0a 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -320,7 +320,8 @@ static unsigned si_get_ia_multi_vgt_param(struct si_context 
*sctx,
if (sctx->b.screen->info.max_se >= 2 && ia_switch_on_eoi &&
(info->indirect ||
 (info->instance_count > 1 &&
- u_prims_for_vertices(info->mode, info->count) <= 1)))
+ u_prims_for_vertices(info->mode, info->count,
+  info->vertices_per_patch) <= 1)))
sctx->b.flags |= SI_CONTEXT_VGT_FLUSH;
 
return S_028AA8_SWITCH_ON_EOP(ia_switch_on_eop) |
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

[Mesa-dev] [RFC] glapi: Build gl_gentable.c only on Darwin

2015-12-09 Thread Andreas Boll

Removes the public symbol _glapi_create_table_from_handle from
libGL.so.1 on all plattforms except Darwin.

Since the symbol is not used on other plattforms it makes sense to
build gl_gentable.c only on Darwin.

A little bit of history:

_glapi_create_table_from_handle was introduced in

commit 85937f4c0d4a78d3a11e3c1fa6148640f2a9ad7b
Author: Jeremy Huddleston 
Date:   Thu Jun 9 16:59:49 2011 -0700

glapi: Add API that can create a _glapi_table from a dlfcn handle

Example usage:

void *handle = dlopen(opengl_library_path, RTLD_LOCAL);
struct _glapi_table *disp = _glapi_create_table_from_handle(handle,
"gl");

Signed-off-by: Jeremy Huddleston 

and the only user in mesa was added in

commit f35913b96e743c5014e99220b1a1c5532a894d69
Author: Jeremy Huddleston 
Date:   Thu Jun 9 17:29:51 2011 -0700

apple: Use _glapi_create_table_from_handle to initialize our
dispatch table

Signed-off-by: Jeremy Huddleston 

gl_gentable.py was also used for XQuartz in xserver 1.11 - 1.14.

Cc: Jeremy Huddleston 
Signed-off-by: Andreas Boll 
---
XXX If we still want to distribute gl_gentable.c in the release tarball
we could drop the changes in src/mapi/glapi/gen/Makefile.am

 src/mapi/Makefile.am   |  6 +-
 src/mapi/glapi/gen/Makefile.am | 12 +---
 src/mapi/glapi/glapi.h |  2 ++
 3 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/src/mapi/Makefile.am b/src/mapi/Makefile.am
index 307e05d..ddd3daa 100644
--- a/src/mapi/Makefile.am
+++ b/src/mapi/Makefile.am
@@ -106,12 +106,16 @@ if HAVE_SPARC_ASM
 GLAPI_ASM_SOURCES = glapi/glapi_sparc.S
 endif
 
-glapi_libglapi_la_SOURCES = glapi/glapi_gentable.c
+glapi_libglapi_la_SOURCES =
 glapi_libglapi_la_CPPFLAGS = \
$(AM_CPPFLAGS) \
-I$(top_srcdir)/src/mapi/glapi \
-I$(top_srcdir)/src/mesa
 
+if HAVE_APPLEDRI
+glapi_libglapi_la_SOURCES += glapi/glapi_gentable.c
+endif
+
 if HAVE_SHARED_GLAPI
 glapi_libglapi_la_SOURCES += $(MAPI_BRIDGE_FILES) glapi/glapi_mapi_tmp.h
 glapi_libglapi_la_CPPFLAGS += \
diff --git a/src/mapi/glapi/gen/Makefile.am b/src/mapi/glapi/gen/Makefile.am
index 2da8f7d..25ea44a 100644
--- a/src/mapi/glapi/gen/Makefile.am
+++ b/src/mapi/glapi/gen/Makefile.am
@@ -27,8 +27,11 @@ MESA_GLAPI_OUTPUTS = \
$(MESA_GLAPI_DIR)/glapi_mapi_tmp.h \
$(MESA_GLAPI_DIR)/glprocs.h \
$(MESA_GLAPI_DIR)/glapitemp.h \
-   $(MESA_GLAPI_DIR)/glapitable.h \
-   $(MESA_GLAPI_DIR)/glapi_gentable.c
+   $(MESA_GLAPI_DIR)/glapitable.h
+
+if HAVE_APPLEDRI
+MESA_GLAPI_OUTPUTS += $(MESA_GLAPI_DIR)/glapi_gentable.c
+endif
 
 MESA_GLAPI_ASM_OUTPUTS =
 if HAVE_X86_ASM
@@ -88,8 +91,11 @@ XORG_GLAPI_DIR = $(XORG_BASE)/glx
 XORG_GLAPI_OUTPUTS = \
$(XORG_GLAPI_DIR)/glprocs.h \
$(XORG_GLAPI_DIR)/glapitable.h \
-   $(XORG_GLAPI_DIR)/dispatch.h \
+   $(XORG_GLAPI_DIR)/dispatch.h
+
+if HAVE_APPLEDRI
$(XORG_GLAPI_DIR)/glapi_gentable.c
+endif
 
 XORG_OUTPUTS = \
$(XORG_GLAPI_OUTPUTS) \
diff --git a/src/mapi/glapi/glapi.h b/src/mapi/glapi/glapi.h
index f269b17..3593c88 100644
--- a/src/mapi/glapi/glapi.h
+++ b/src/mapi/glapi/glapi.h
@@ -158,8 +158,10 @@ _GLAPI_EXPORT const char *
 _glapi_get_proc_name(unsigned int offset);
 
 
+#ifdef GLX_USE_APPLEGL
 _GLAPI_EXPORT struct _glapi_table *
 _glapi_create_table_from_handle(void *handle, const char *symbol_prefix);
+#endif
 
 
 _GLAPI_EXPORT void
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Mesa-stable] [PATCH] configure.ac: fix test for SSE4.1 assembler support

2015-12-09 Thread Oded Gabbay

On Wed, Dec 9, 2015 at 2:34 PM, Jonathan Gray  wrote:
> On Wed, Dec 09, 2015 at 01:39:30PM +0200, Oded Gabbay wrote:
>> On Wed, Dec 9, 2015 at 1:09 PM, Emil Velikov  
>> wrote:
>> > On 9 December 2015 at 05:37, Jonathan Gray  wrote:
>> >> Change the __m128i variables to be volatile so gcc 4.9 won't optimise
>> >> all of them out with -O1 or greater.  The _mm_set1_epi32/pinsrd calls
>> >> still get optimised out but now there is at least one SSE4.1 instruction
>> >> generated via _mm_max_epu32/pmaxud.  When all of the sse4.1 instructions
>> >> got optimised out the configure test would incorrectly pass when the
>> >> compiler supported the intrinsics and the assembler didn't support the
>> >> instructions.
>> >>
>> > Must admit that I was not expecting that one. Looks like pixman (the
>> > inspiration for this check) is missing volatile as well. Does that one
>> > build/run fine on OpenBSD ?
>> >
>> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91806
>> >> Signed-off-by: Jonathan Gray 
>> >> Cc: "11.0 11.1" 
>> > Reviewed-by: Emil Velikov 
>> >
>> > I'll pick this in a couple of days (barring any objections).
>> >
>> > Thanks
>> > Emil
>>
>> Adding pixman ML.
>> I must admit ignorance on this one.
>> I looked at configure.ac of pixman and I don't see any SSE4.1
>> reference, and AFAIK, we don't use those instructions (only SSE2 and
>> SSSE3).
>> Is the above patch relevant for those as well ? because the tests in
>> configure.ac does *not* contain volatile.
>>
>> Oded
>
> It looks like this is indeed a problem in pixman as well with at least
> gcc 4.2 and 4.9.  Running the pixman sse2 test on amd64 I only
> see xmm register use and movdqa in the generated assembly with -O0.
>
> The pixman configure tests passes on OpenBSD but the toolchain
> recognises sse2 instructions just not sse 4.1.
>
> Introducing volatile to the pixman test stops the xmm/movdqa
> use from being optimised out.

I adapted the patch to pixman's configure.ac and sent it to the ML.
Jonathan, thanks for the patch.

Oded
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [RFC] i965: Resolve color for all active shader images in intel_update_state().

2015-12-09 Thread Francisco Jerez

Kristian Høgsberg  writes:

> On Sat, Sep 5, 2015 at 11:30 AM, Jordan Justen
>  wrote:
>> From: Francisco Jerez 
>>
>> Fixes 
>> arb_shader_image_load_store/execution/load-from-cleared-image.shader_test
>>
>> Cc: Chris Wilson 
>> Cc: Jason Ekstrand 
>> Tested-by: Jordan Justen 
>
> This patch is required for correct behavior and looks straightforward
> and correct to me. Let's fix the bug and optimize the CPU performance
> regression (if there is one) later.
>
> Reviewed-by: Kristian Høgsberg 

Thanks, pushed.

Chris, feel free to open a bug report and add me to the CC list if you
can still reproduce a regression on master.

>
>> ---
>>  RE: i965: Perform an explicit flush after doing _mesa_meta_pbo_TexSubImage
>>
>>  curro has some concerns about potential perf impact by this and
>>  wanted it to be checked on small-core w/CPU bound apps.
>>  Unfortunately, he is on vacation now.
>>
>>  src/mesa/drivers/dri/i965/brw_context.c | 18 ++
>>  1 file changed, 18 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
>> b/src/mesa/drivers/dri/i965/brw_context.c
>> index f0ed891..a274a43 100644
>> --- a/src/mesa/drivers/dri/i965/brw_context.c
>> +++ b/src/mesa/drivers/dri/i965/brw_context.c
>> @@ -189,6 +189,24 @@ intel_update_state(struct gl_context * ctx, GLuint 
>> new_state)
>>brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
>> }
>>
>> +   /* Resolve color for each active shader image. */
>> +   for (unsigned i = 0; i < MESA_SHADER_STAGES; i++) {
>> +  const struct gl_shader *shader = ctx->_Shader->CurrentProgram[i] ?
>> + ctx->_Shader->CurrentProgram[i]->_LinkedShaders[i] : NULL;
>> +
>> +  if (unlikely(shader && shader->NumImages)) {
>> + for (unsigned j = 0; j < shader->NumImages; j++) {
>> +struct gl_image_unit *u = 
>> &ctx->ImageUnits[shader->ImageUnits[j]];
>> +tex_obj = intel_texture_object(u->TexObj);
>> +
>> +if (tex_obj && tex_obj->mt) {
>> +   intel_miptree_resolve_color(brw, tex_obj->mt);
>> +   brw_render_cache_set_check_flush(brw, tex_obj->mt->bo);
>> +}
>> + }
>> +  }
>> +   }
>> +
>> _mesa_lock_context_textures(ctx);
>>  }
>>
>> --
>> 2.5.0
>>
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev

1 2 >

1 - 100 of 130 matches

Mail list logo