Re: [Mesa-dev] [PATCH 09/23] main: Cosmetic changes to BufferSubData infrastructure.

2015-02-20 Thread Martin Peres

Be more precise in the commit message. I propose:
report more explicit error messages in the BufferSubData infrastructure

Other than that:
Reviewed-by: Martin Peres martin.pe...@linux.intel.com

On 12/02/2015 04:05, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 10 +++---
  src/mesa/main/bufferobj.h |  4 ++--
  2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 4f89748..fc01d02 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -267,13 +267,17 @@ buffer_object_subdata_range_good(struct gl_context *ctx,
  
 if (mappedRange) {

if (bufferobj_range_mapped(bufObj, offset, size)) {
- _mesa_error(ctx, GL_INVALID_OPERATION, %s, caller);
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ %s(range is mapped without persistent bit),
+ caller);
   return false;
}
 }
 else {
if (_mesa_bufferobj_mapped(bufObj, MAP_USER)) {
- _mesa_error(ctx, GL_INVALID_OPERATION, %s, caller);
+ _mesa_error(ctx, GL_INVALID_OPERATION,
+ %s(buffer is mapped without persistent bit),
+ caller);
   return false;
}
 }
@@ -1613,7 +1617,7 @@ _mesa_buffer_sub_data(struct gl_context *ctx, struct 
gl_buffer_object *bufObj,
 bufObj-Written = GL_TRUE;
  
 ASSERT(ctx-Driver.BufferSubData);

-   ctx-Driver.BufferSubData( ctx, offset, size, data, bufObj );
+   ctx-Driver.BufferSubData(ctx, offset, size, data, bufObj);
  }
  
  void GLAPIENTRY

diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index 889bbb1..d15ad00 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -190,8 +190,8 @@ _mesa_NamedBufferData(GLuint buffer, GLsizeiptr size,
const GLvoid *data, GLenum usage);
  
  void GLAPIENTRY

-_mesa_BufferSubData(GLenum target, GLintptrARB offset,
-GLsizeiptrARB size, const GLvoid * data);
+_mesa_BufferSubData(GLenum target, GLintptr offset,
+GLsizeiptr size, const GLvoid *data);
  
  void GLAPIENTRY

  _mesa_NamedBufferSubData(GLuint buffer, GLintptr offset,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Set pixel/sample mask for compute shaders atomic ops

2015-02-20 Thread Francisco Jerez
Jordan Justen jordan.l.jus...@intel.com writes:

 On 2015-02-19 21:40:37, Ben Widawsky wrote:
 On Thu, Feb 19, 2015 at 03:42:05PM -0800, Jordan Justen wrote:
  For fragment programs, we pull this mask from the payload header. The same
  mask doesn't exist for compute shaders, so we set all bits to enabled.
  
  Note: this mask is ANDed with the execution mask, so some channels may not 
  end
  up issuing the atomic operation.
  
  Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
  Cc: Ben Widawsky b...@bwidawsk.net
  Cc: Francisco Jerez curroje...@riseup.net
 
 Just add to the commit message that this is needed specifically because 
 compute
 is invoked as SIMD16 (and perhaps reference the other commits?) and it's:
 Reviewed-by: Ben Widawsky b...@bwidawsk.net

 Good idea. I'll add those.

 Sorry it advance...
 we may as well just go for 0x in case we ever support SIMD32.

 I had been setting all 32-bits previously. I mentioned to you that I
 thought this was needed for SIMD32. I wanted to double check it before
 sending the patch out. I think I found the field for IVB in the PRM:

 IVB Vol 4 Part 1 3.9.9.9 Message Header
 Pixel/Sample Mask

 ...and it looks like it is only 16-bits. Maybe Francisco can confirm
 that I got it right.

 I couldn't find this same information in the HSW PRMs.

 I'm not sure what that means for SIMD32.

Yeah, it's only 16 bits.  For SIMD32 it means that, *sigh*, we'll have
to split up the message in several SIMD16 chunks (my image load store
branch has to do something similar to do typed surface operations in
SIMD16 mode because there are only 8-wide variants of those messages).
To initialize the header we would just copy the first or second 16-bit
half of the sample mask, and set the quarter control bits appropriately
for each half so the execution mask is taken into account correctly.

With Ben's and Matt's suggestions this patch is:
Reviewed-by: Francisco Jerez curroje...@riseup.net


 -Jordan

  ---
   While it's fresh in our minds. :)
  
   This seems to work for gen7  gen8 CS. For CS simd16, we need the
   0x change, but it seems to work fine for simd8 as well.
  
   I also tested gen8 (simd8vs), and there were no piglit regressions.
  
   src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 8 
   1 file changed, 4 insertions(+), 4 deletions(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
  b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  index 24cc118..960a0aa 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
  @@ -2998,9 +2998,9 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, 
  unsigned surf_index,
  * mask sent in the header to compute the actual set of channels 
  that execute
  * the atomic operation.
  */
  -  assert(stage == MESA_SHADER_VERTEX);
  +  assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
 emit(MOV(component(sources[0], 7),
  -   brw_imm_ud(0xff)))-force_writemask_all = true;
  +   brw_imm_ud(0x)))-force_writemask_all = true;
  }
  length++;
   
  @@ -3061,9 +3061,9 @@ fs_visitor::emit_untyped_surface_read(unsigned 
  surf_index, fs_reg dst,
  * mask sent in the header to compute the actual set of channels 
  that execute
  * the atomic operation.
  */
  -  assert(stage == MESA_SHADER_VERTEX);
  +  assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
 emit(MOV(component(sources[0], 7),
  -   brw_imm_ud(0xff)))-force_writemask_all = true;
  +   brw_imm_ud(0x)))-force_writemask_all = true;
  }
   
  /* Set the surface read offset. */
  -- 
  2.1.4
  


pgpppOi6R6AAi.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: add double support for packing varyings

2015-02-20 Thread Ilia Mirkin
On Fri, Feb 20, 2015 at 4:10 AM, Ilia Mirkin imir...@alum.mit.edu wrote:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---

 This works with a modified varying-packing test for everything except
 dvec4 array which crashes in st/mesa somewhere. Still working on that
 one. The IR generated here kinda stinks, but I couldn't get an
 assignment with a swizzle to work.

Ugh, looks like this is a no-go for GS due to the temp vars. But this
change gets it mostly working. I also had to switch doubles to
*always* getting packed, even for dvec2 and dvec4... I think there's
some variable storage/linking-related logic which doesn't take the
double size into account properly.

  -ilia


  src/glsl/lower_packed_varyings.cpp | 88 
 +-
  1 file changed, 67 insertions(+), 21 deletions(-)

 diff --git a/src/glsl/lower_packed_varyings.cpp 
 b/src/glsl/lower_packed_varyings.cpp
 index 5e844c7..f9e3e85 100644
 --- a/src/glsl/lower_packed_varyings.cpp
 +++ b/src/glsl/lower_packed_varyings.cpp
 @@ -146,7 +146,11 @@

  #include glsl_symbol_table.h
  #include ir.h
 +#include ir_builder.h
  #include ir_optimization.h
 +#include program/prog_instruction.h
 +
 +using namespace ir_builder;

  namespace {

 @@ -168,8 +172,8 @@ public:
 void run(exec_list *instructions);

  private:
 -   ir_assignment *bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
 -   ir_assignment *bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
 +   void bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
 +   void bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
 unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
   ir_variable *unpacked_var, const char *name,
   bool gs_input_toplevel, unsigned vertex_index);
 @@ -274,6 +278,7 @@ lower_packed_varyings_visitor::run(exec_list 
 *instructions)
 }
  }

 +#define SWIZZLE_ZWZW MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, 
 SWIZZLE_W)

  /**
   * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
 @@ -281,7 +286,7 @@ lower_packed_varyings_visitor::run(exec_list 
 *instructions)
   *
   * This function is called when packing varyings.
   */
 -ir_assignment *
 +void
  lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
 ir_rvalue *rhs)
  {
 @@ -300,12 +305,28 @@ 
 lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
   rhs = new(this-mem_ctx)
  ir_expression(ir_unop_bitcast_f2i, lhs-type, rhs);
   break;
 +  case GLSL_TYPE_DOUBLE:
 + assert(rhs-type-vector_elements = 2);
 + if (rhs-type-vector_elements == 2) {
 +ir_variable *t = new(mem_ctx) ir_variable(lhs-type, pack, 
 ir_var_temporary);
 +
 +assert(lhs-type-vector_elements == 4);
 +this-out_instructions-push_tail(t);
 +this-out_instructions-push_tail(
 +  assign(t, u2i(expr(ir_unop_unpack_double_2x32, 
 swizzle_x(rhs-clone(mem_ctx, NULL, 0x3));
 +this-out_instructions-push_tail(
 +  assign(t,  u2i(expr(ir_unop_unpack_double_2x32, 
 swizzle_y(rhs))), 0xc));
 +rhs = deref(t).val;
 + } else {
 +rhs = u2i(expr(ir_unop_unpack_double_2x32, rhs));
 + }
 + break;
default:
   assert(!Unexpected type conversion while lowering varyings);
   break;
}
 }
 -   return new(this-mem_ctx) ir_assignment(lhs, rhs);
 +   this-out_instructions-push_tail(new (this-mem_ctx) ir_assignment(lhs, 
 rhs));
  }


 @@ -315,7 +336,7 @@ 
 lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
   *
   * This function is called when unpacking varyings.
   */
 -ir_assignment *
 +void
  lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
   ir_rvalue *rhs)
  {
 @@ -334,12 +355,27 @@ 
 lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
   rhs = new(this-mem_ctx)
  ir_expression(ir_unop_bitcast_i2f, lhs-type, rhs);
   break;
 +  case GLSL_TYPE_DOUBLE:
 + assert(lhs-type-vector_elements = 2);
 + if (lhs-type-vector_elements == 2) {
 +ir_variable *t = new(mem_ctx) ir_variable(lhs-type, unpack, 
 ir_var_temporary);
 +assert(rhs-type-vector_elements == 4);
 +this-out_instructions-push_tail(t);
 +this-out_instructions-push_tail(
 +  assign(t, expr(ir_unop_pack_double_2x32, 
 i2u(swizzle_xy(rhs-clone(mem_ctx, NULL, 0x1));
 +this-out_instructions-push_tail(
 +  assign(t, expr(ir_unop_pack_double_2x32, 
 i2u(swizzle(rhs-clone(mem_ctx, NULL), SWIZZLE_ZWZW, 2))), 0x2));
 +rhs = deref(t).val;
 + } else {
 +rhs = expr(ir_unop_pack_double_2x32, i2u(rhs));
 + }
 + break;

Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Kenneth Graunke
On Thursday, February 19, 2015 10:48:08 PM Ben Widawsky wrote:
 When under dispatch_width=16 the previous code would allocate 2 registers for
 the payload when only one is needed. This manifested itself through bugs on 
 SKL
 which needs to mess with this instruction.
 
 Ken though this might impact shader-db, but apparently it doesn't
 
 Cc: Kenneth Graunke kenn...@whitecape.org
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
 Signed-off-by: Ben Widawsky b...@bwidawsk.net
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index c46e1d7..24125cc 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
   assert(const_offset_reg.file == IMM 
  const_offset_reg.type == BRW_REGISTER_TYPE_UD);
   const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
 - fs_reg payload = vgrf(glsl_type::uint_type);
 + fs_reg payload = fs_reg(GRF, alloc.allocate(1));
  
   /* We have to use a message header on Skylake to get SIMD4x2 mode.
* Reserve space for the register.
 

Reviewed-by: Kenneth Graunke kenn...@whitecape.org


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 08/23] main: Add entry point for NamedBufferSubData.

2015-02-20 Thread Martin Peres

On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Ian Romanick
- Remove _mesa from name of static software fallback buffer_sub_data.
- Remove mappedRange from _mesa_buffer_sub_data.
- Removed some cosmetic changes to a separate commit.
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |   7 ++
  src/mesa/main/bufferobj.c  | 129 +++--
  src/mesa/main/bufferobj.h  |   9 ++
  src/mesa/main/tests/dispatch_sanity.cpp|   1 +
  4 files changed, 97 insertions(+), 49 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 7779262..6d70b8e 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -28,6 +28,13 @@
param name=usage type=GLenum /
 /function
  
+   function name=NamedBufferSubData offset=assign

+  param name=buffer type=GLuint /
+  param name=offset type=GLintptr /
+  param name=size type=GLsizeiptr /
+  param name=data type=const GLvoid * /
+   /function
+
 !-- Texture object functions --
  
 function name=CreateTextures offset=assign

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index ac8eed1..4f89748 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -227,67 +227,58 @@ bufferobj_range_mapped(const struct gl_buffer_object *obj,
   * \c glClearBufferSubData.
   *
   * \param ctx GL context.
- * \param target  Buffer object target on which to operate.
+ * \param bufObj  The buffer object.
   * \param offset  Offset of the first byte of the subdata range.
   * \param sizeSize, in bytes, of the subdata range.
   * \param mappedRange  If true, checks if an overlapping range is mapped.
   * If false, checks if buffer is mapped.
- * \param errorNoBuffer  Error code if no buffer is bound to target.
   * \param caller  Name of calling function for recording errors.
- * \return   A pointer to the buffer object bound to \c target in the
- *   specified context or \c NULL if any of the parameter or state
- *   conditions are invalid.
+ * \return   false if error, true otherwise
   *
   * \sa glBufferSubDataARB, glGetBufferSubDataARB, glClearBufferSubData
   */
-static struct gl_buffer_object *
-buffer_object_subdata_range_good(struct gl_context * ctx, GLenum target,
- GLintptrARB offset, GLsizeiptrARB size,
- bool mappedRange, GLenum errorNoBuffer,
- const char *caller)
+static bool
+buffer_object_subdata_range_good(struct gl_context *ctx,
+ struct gl_buffer_object *bufObj,
+ GLintptr offset, GLsizeiptr size,
+ bool mappedRange, const char *caller)
  {
-   struct gl_buffer_object *bufObj;
-
 if (size  0) {
_mesa_error(ctx, GL_INVALID_VALUE, %s(size  0), caller);
-  return NULL;
+  return false;
 }
  
 if (offset  0) {

_mesa_error(ctx, GL_INVALID_VALUE, %s(offset  0), caller);
-  return NULL;
+  return false;
 }
  
-   bufObj = get_buffer(ctx, caller, target, errorNoBuffer);

-   if (!bufObj)
-  return NULL;
-
 if (offset + size  bufObj-Size) {
_mesa_error(ctx, GL_INVALID_VALUE,
%s(offset %lu + size %lu  buffer size %lu), caller,
(unsigned long) offset,
(unsigned long) size,
(unsigned long) bufObj-Size);
-  return NULL;
+  return false;
 }
  
 if (bufObj-Mappings[MAP_USER].AccessFlags  GL_MAP_PERSISTENT_BIT)

-  return bufObj;
+  return true;
  
 if (mappedRange) {

if (bufferobj_range_mapped(bufObj, offset, size)) {
   _mesa_error(ctx, GL_INVALID_OPERATION, %s, caller);
- return NULL;
+ return false;
}
 }
 else {
if (_mesa_bufferobj_mapped(bufObj, MAP_USER)) {
   _mesa_error(ctx, GL_INVALID_OPERATION, %s, caller);
- return NULL;
+ return false;
}
 }
  
-   return bufObj;

+   return true;
  }
  
  
@@ -602,9 +593,9 @@ buffer_data_fallback(struct gl_context *ctx, GLenum target, GLsizeiptr size,

   * \sa glBufferSubDataARB, dd_function_table::BufferSubData.
   */
  static void
-_mesa_buffer_subdata( struct gl_context *ctx, GLintptrARB offset,
- GLsizeiptrARB size, const GLvoid * data,
- struct gl_buffer_object * bufObj )
+buffer_sub_data_fallback(struct gl_context *ctx, GLintptr offset,
+ GLsizeiptr size, const GLvoid *data,
+ struct gl_buffer_object *bufObj)
  {
 (void) ctx;
  
@@ -1113,7 +1104,7 @@ _mesa_init_buffer_object_functions(struct dd_function_table *driver)

 driver-NewBufferObject = _mesa_new_buffer_object;
 

[Mesa-dev] [PATCH] glsl: add double support for packing varyings

2015-02-20 Thread Ilia Mirkin
Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

This works with a modified varying-packing test for everything except
dvec4 array which crashes in st/mesa somewhere. Still working on that
one. The IR generated here kinda stinks, but I couldn't get an
assignment with a swizzle to work.

 src/glsl/lower_packed_varyings.cpp | 88 +-
 1 file changed, 67 insertions(+), 21 deletions(-)

diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index 5e844c7..f9e3e85 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -146,7 +146,11 @@
 
 #include glsl_symbol_table.h
 #include ir.h
+#include ir_builder.h
 #include ir_optimization.h
+#include program/prog_instruction.h
+
+using namespace ir_builder;
 
 namespace {
 
@@ -168,8 +172,8 @@ public:
void run(exec_list *instructions);
 
 private:
-   ir_assignment *bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
-   ir_assignment *bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
  ir_variable *unpacked_var, const char *name,
  bool gs_input_toplevel, unsigned vertex_index);
@@ -274,6 +278,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
}
 }
 
+#define SWIZZLE_ZWZW MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, SWIZZLE_W)
 
 /**
  * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
@@ -281,7 +286,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
  *
  * This function is called when packing varyings.
  */
-ir_assignment *
+void
 lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
ir_rvalue *rhs)
 {
@@ -300,12 +305,28 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  rhs = new(this-mem_ctx)
 ir_expression(ir_unop_bitcast_f2i, lhs-type, rhs);
  break;
+  case GLSL_TYPE_DOUBLE:
+ assert(rhs-type-vector_elements = 2);
+ if (rhs-type-vector_elements == 2) {
+ir_variable *t = new(mem_ctx) ir_variable(lhs-type, pack, 
ir_var_temporary);
+
+assert(lhs-type-vector_elements == 4);
+this-out_instructions-push_tail(t);
+this-out_instructions-push_tail(
+  assign(t, u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_x(rhs-clone(mem_ctx, NULL, 0x3));
+this-out_instructions-push_tail(
+  assign(t,  u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_y(rhs))), 0xc));
+rhs = deref(t).val;
+ } else {
+rhs = u2i(expr(ir_unop_unpack_double_2x32, rhs));
+ }
+ break;
   default:
  assert(!Unexpected type conversion while lowering varyings);
  break;
   }
}
-   return new(this-mem_ctx) ir_assignment(lhs, rhs);
+   this-out_instructions-push_tail(new (this-mem_ctx) ir_assignment(lhs, 
rhs));
 }
 
 
@@ -315,7 +336,7 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  *
  * This function is called when unpacking varyings.
  */
-ir_assignment *
+void
 lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
  ir_rvalue *rhs)
 {
@@ -334,12 +355,27 @@ 
lower_packed_varyings_visitor::bitwise_assign_unpack(ir_rvalue *lhs,
  rhs = new(this-mem_ctx)
 ir_expression(ir_unop_bitcast_i2f, lhs-type, rhs);
  break;
+  case GLSL_TYPE_DOUBLE:
+ assert(lhs-type-vector_elements = 2);
+ if (lhs-type-vector_elements == 2) {
+ir_variable *t = new(mem_ctx) ir_variable(lhs-type, unpack, 
ir_var_temporary);
+assert(rhs-type-vector_elements == 4);
+this-out_instructions-push_tail(t);
+this-out_instructions-push_tail(
+  assign(t, expr(ir_unop_pack_double_2x32, 
i2u(swizzle_xy(rhs-clone(mem_ctx, NULL, 0x1));
+this-out_instructions-push_tail(
+  assign(t, expr(ir_unop_pack_double_2x32, 
i2u(swizzle(rhs-clone(mem_ctx, NULL), SWIZZLE_ZWZW, 2))), 0x2));
+rhs = deref(t).val;
+ } else {
+rhs = expr(ir_unop_pack_double_2x32, i2u(rhs));
+ }
+ break;
   default:
  assert(!Unexpected type conversion while lowering varyings);
  break;
   }
}
-   return new(this-mem_ctx) ir_assignment(lhs, rhs);
+   this-out_instructions-push_tail(new(this-mem_ctx) ir_assignment(lhs, 
rhs));
 }
 
 
@@ -372,6 +408,7 @@ lower_packed_varyings_visitor::lower_rvalue(ir_rvalue 
*rvalue,
 bool gs_input_toplevel,
 unsigned vertex_index)
 {
+   unsigned dmul = 

Re: [Mesa-dev] [PATCH 23/23] main: Cosmetic changes to GetBufferSubData.

2015-02-20 Thread Martin Peres

Please squash this commit with the one introducing GetBufferSubData.

It should be pretty easy to do with git rebase -i :)

On 12/02/2015 04:06, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 2 +-
  src/mesa/main/bufferobj.h | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 0272704..38d8b5a 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1668,7 +1668,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptr offset,
 }
  
 ASSERT(ctx-Driver.GetBufferSubData);

-   ctx-Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
+   ctx-Driver.GetBufferSubData(ctx, offset, size, data, bufObj);
  }
  
  void GLAPIENTRY

diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index feeaa8b..b5d73ae 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -230,8 +230,8 @@ _mesa_NamedBufferSubData(GLuint buffer, GLintptr offset,
   GLsizeiptr size, const GLvoid *data);
  
  void GLAPIENTRY

-_mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
-   GLsizeiptrARB size, void * data);
+_mesa_GetBufferSubData(GLenum target, GLintptr offset,
+   GLsizeiptr size, GLvoid *data);
  
  void GLAPIENTRY

  _mesa_GetNamedBufferSubData(GLuint buffer, GLintptr offset,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/23] main: Add entry point for CopyNamedBufferSubData.

2015-02-20 Thread Martin Peres

Looks good to me.

Reviewed-by: Martin Peres martin.pe...@linux.intel.com

On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: remove _mesa in front of static software fallback.
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  8 +++
  src/mesa/main/bufferobj.c  | 99 +-
  src/mesa/main/bufferobj.h  | 12 
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  4 files changed, 87 insertions(+), 33 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 6d70b8e..042b2a8 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -35,6 +35,14 @@
param name=data type=const GLvoid * /
 /function
  
+   function name=CopyNamedBufferSubData offset=assign

+  param name=readBuffer type=GLuint /
+  param name=writeBuffer type=GLuint /
+  param name=readOffset type=GLintptr /
+  param name=writeOffset type=GLintptr /
+  param name=size type=GLsizeiptr /
+   /function
+
 !-- Texture object functions --
  
 function name=CreateTextures offset=assign

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index fc01d02..7225b64 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -761,11 +761,11 @@ _mesa_buffer_unmap(struct gl_context *ctx, struct 
gl_buffer_object *bufObj,
   * Called via glCopyBufferSubData().
   */
  static void
-_mesa_copy_buffer_subdata(struct gl_context *ctx,
-  struct gl_buffer_object *src,
-  struct gl_buffer_object *dst,
-  GLintptr readOffset, GLintptr writeOffset,
-  GLsizeiptr size)
+copy_buffer_sub_data_fallback(struct gl_context *ctx,
+  struct gl_buffer_object *src,
+  struct gl_buffer_object *dst,
+  GLintptr readOffset, GLintptr writeOffset,
+  GLsizeiptr size)
  {
 GLubyte *srcPtr, *dstPtr;
  
@@ -1120,7 +1120,7 @@ _mesa_init_buffer_object_functions(struct dd_function_table *driver)

 driver-FlushMappedBufferRange = _mesa_buffer_flush_mapped_range;
  
 /* GL_ARB_copy_buffer */

-   driver-CopyBufferSubData = _mesa_copy_buffer_subdata;
+   driver-CopyBufferSubData = copy_buffer_sub_data_fallback;
  }
  
  
@@ -2101,65 +2101,54 @@ _mesa_GetBufferPointerv(GLenum target, GLenum pname, GLvoid **params)

  }
  
  
-void GLAPIENTRY

-_mesa_CopyBufferSubData(GLenum readTarget, GLenum writeTarget,
-GLintptr readOffset, GLintptr writeOffset,
-GLsizeiptr size)
+void
+_mesa_copy_buffer_sub_data(struct gl_context *ctx,
+   struct gl_buffer_object *src,
+   struct gl_buffer_object *dst,
+   GLintptr readOffset, GLintptr writeOffset,
+   GLsizeiptr size, const char *func)
  {
-   GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object *src, *dst;
-
-   src = get_buffer(ctx, glCopyBufferSubData, readTarget,
-GL_INVALID_OPERATION);
-   if (!src)
-  return;
-
-   dst = get_buffer(ctx, glCopyBufferSubData, writeTarget,
-GL_INVALID_OPERATION);
-   if (!dst)
-  return;
-
 if (_mesa_check_disallowed_mapping(src)) {
_mesa_error(ctx, GL_INVALID_OPERATION,
-  glCopyBufferSubData(readBuffer is mapped));
+  %s(readBuffer is mapped), func);
return;
 }
  
 if (_mesa_check_disallowed_mapping(dst)) {

_mesa_error(ctx, GL_INVALID_OPERATION,
-  glCopyBufferSubData(writeBuffer is mapped));
+  %s(writeBuffer is mapped), func);
return;
 }
  
 if (readOffset  0) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  glCopyBufferSubData(readOffset = %d), (int) readOffset);
+  %s(readOffset %d  0), func, (int) readOffset);
return;
 }
  
 if (writeOffset  0) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  glCopyBufferSubData(writeOffset = %d), (int) writeOffset);
+  %s(writeOffset %d  0), func, (int) writeOffset);
return;
 }
  
 if (size  0) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  glCopyBufferSubData(writeOffset = %d), (int) size);
+  %s(size %d  0), func, (int) size);
return;
 }
  
 if (readOffset + size  src-Size) {

_mesa_error(ctx, GL_INVALID_VALUE,
-  glCopyBufferSubData(readOffset + size = %d),
-  (int) (readOffset + size));
+  %s(readOffset %d + size %d  src_buffer_size %d), func,
+  (int) readOffset, (int) size, (int) src-Size);
return;
 }
  
 if (writeOffset + size  dst-Size) {


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 I'm still a little pensive.  But

 Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

Thanks.

 Now for a little aside.  I have come to the conclusion that I made a grave
 mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
 just subclassed fs_inst for load_payload.  The problem is that we need to
 snag a bunch of information for the sources when we create the
 load_payload.  In particular, we need to know the width of the source so
 that we know how much space it consumes in the payload and we need to know
 the information required to properly re-create the mov such as
 force_sechalf and force_writemask_all.  Really, in order to do things
 properly, we need to gather this information *before* we do any
 optimizations.  The nasty pile of code that you're editing together with
 the effective_width parameter is a lame attempt to capture/reconstruct
 this information.  Really, we should just subclass, capture the information
 up-front, and do it properly.

Yes, absolutely, this lowering pass is a real mess.  There are four more
unreviewed patches in the mailing list fixing bugs of the metadata
guessing of lower_load_payload() [1][2][3][4], you may want to take a
look at them.  There are more bugs I'm aware of but it didn't seem
practical to fix them.

That said, I don't think it would be worth subclassing fs_inst.

My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
into a series of MOVs with force_writemask_all enabled in all cases,
simply rely on the optimizer to eliminate those where possible.  Then
get rid of the metadata and effective_width guessing.  Instead agree on
immediates and uniforms being exec_size-wide by convention
(i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
then prevent constant propagation from propagating an immediate load
into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
program, and maybe just run constant propagation after lowering so we
can be sure those cases are cleaned up properly where register coalesce
isn't enough.

[1] http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
[2] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
[3] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
[4] http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html


 --Jason

 On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand ja...@jlekstrand.net
 wrote:



 On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez 
 curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez 
 curroje...@riseup.net
   wrote:
  
   Hey Matt,
  
   Matt Turner matts...@gmail.com writes:
  
On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
  curroje...@riseup.net
   wrote:
MRFs cannot be read from anyway so they cannot possibly be a
 valid
source of LOAD_PAYLOAD.
---
   
The function only seems to test inst-dst.file == MRF. I don't
 see any
code for handling MRF sources. What am I missing?
  
   That test is for handling MRF sources -- More precisely, it's
   collecting the writemask and half flags for MRF writes, which can
 only
   possibly be useful if we're going to use them later on to read
 something
   out of an MRF into a payload, which we shouldn't be doing in the
 first
   place.
  
   Aside from simplifying the function somewhat, that allows us to
 drop the
   16 register gap reserved for MRFs at register offset zero, what will
   allow us to drop the vgrf_to_reg[] offset calculation completely
 (also
   in split_virtual_grfs()) in a future patch (not sent for review
 yet).
  
  
   No, we do read from MRF's sort-of...  Send messages have an implicit
  read
   from an MRF.
 
  Heh, and that's pretty much the only way you read from it.
 
   This was written precicely so that we could use LOAD_PAYLOAD
   to build MRF payloads.  We do on pre-GEN6.
  
  I'm aware, but you don't need any of this meta-data to LOAD_PAYLOAD
  *into* an MRF, and LOAD_PAYLOAD with an MRF as source should be illegal
  anyway.
 
 
  And no one is using it that way.  All of the metadata checks you are
  deleting are checks on the *destination*.
 

 Didn't you write this code yourself?  The only use for the collected
 metadata is initializing the instruction flags of the MOVs subsequent
 LOAD_PAYLOAD instructions are lowered to, based on the metadata already
 collected for its source registers, which can never be MRFs, so the
 metadata you collect from MRF writes is never actually used.


 Right... I misred something initially.  Yes, we should never be tracking
 MRF's as a source of a LOAD_PAYLOAD.  I'll give it a better look a bit
 later, but it looks better.



pgpjBO09kHb1i.pgp
Description: PGP signature

[Mesa-dev] [PATCH] mesa: Adds missing error condition in _mesa_check_sample_count()

2015-02-20 Thread Eduardo Lima Mitev
This corrects a trivial error introduced in commit
19252fee46b835cb4f6b1cce18d7737d62b64a2e. That patch was merged recently
and omits one condition (that 'samples' is greater than zero) in one of
the error checks. That error will definitely cause regressions.
---
 src/mesa/main/multisample.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
index b696de9..50a8a11 100644
--- a/src/mesa/main/multisample.c
+++ b/src/mesa/main/multisample.c
@@ -155,7 +155,8 @@ _mesa_check_sample_count(struct gl_context *ctx, GLenum 
target,
 * If internalformat is a signed or unsigned integer format and samples
 * is greater than zero, then the error INVALID_OPERATION is generated.
 */
-   if (_mesa_is_gles3(ctx)  _mesa_is_enum_format_integer(internalFormat)) {
+   if (_mesa_is_gles3(ctx)  _mesa_is_enum_format_integer(internalFormat)
+samples  0) {
   return GL_INVALID_OPERATION;
}
 
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/23] main: Refactor ClearBuffer[Sub]Data.

2015-02-20 Thread Martin Peres

On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Jason Ekstrand
- Split refactor of clear buffer sub data from addition of DSA entry
  points.
---
  src/mesa/main/bufferobj.c| 125 ---
  src/mesa/main/bufferobj.h|  19 ++--
  src/mesa/state_tracker/st_cb_bufferobjects.c |   4 +-
  3 files changed, 69 insertions(+), 79 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 7225b64..b8fa917 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct gl_context *ctx, 
GLintptrARB offset,
   * dd_function_table::ClearBufferSubData.
   */
  void
-_mesa_buffer_clear_subdata(struct gl_context *ctx,
-   GLintptr offset, GLsizeiptr size,
-   const GLvoid *clearValue,
-   GLsizeiptr clearValueSize,
-   struct gl_buffer_object *bufObj)
+_mesa_ClearBufferSubData_sw(struct gl_context *ctx,


Interesting choice of naming the function as a mix of camel case and 
underscores.


Since it is an internal function, shouldn't it only have underscores?

Other than that:

Reviewed-by: Martin Peres martin.pe...@linux.intel.com

+GLintptr offset, GLsizeiptr size,
+const GLvoid *clearValue,
+GLsizeiptr clearValueSize,
+struct gl_buffer_object *bufObj)
  {
 GLsizeiptr i;
 GLubyte *dest;
@@ -1113,7 +1113,7 @@ _mesa_init_buffer_object_functions(struct 
dd_function_table *driver)
 driver-UnmapBuffer = _mesa_buffer_unmap;
  
 /* GL_ARB_clear_buffer_object */

-   driver-ClearBufferSubData = _mesa_buffer_clear_subdata;
+   driver-ClearBufferSubData = _mesa_ClearBufferSubData_sw;
  
 /* GL_ARB_map_buffer_range */

 driver-MapBufferRange = _mesa_buffer_map_range;
@@ -1671,57 +1671,77 @@ _mesa_GetBufferSubData(GLenum target, GLintptr offset,
 ctx-Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
  }
  
-

-void GLAPIENTRY
-_mesa_ClearBufferData(GLenum target, GLenum internalformat, GLenum format,
-  GLenum type, const GLvoid* data)
+/**
+ * \param subdata   true if caller is *SubData, false if *Data
+ */
+void
+_mesa_clear_buffer_sub_data(struct gl_context *ctx,
+struct gl_buffer_object *bufObj,
+GLenum internalformat,
+GLintptr offset, GLsizeiptr size,
+GLenum format, GLenum type,
+const GLvoid *data,
+const char *func, bool subdata)
  {
-   GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object* bufObj;
 mesa_format mesaFormat;
 GLubyte clearValue[MAX_PIXEL_BYTES];
 GLsizeiptr clearValueSize;
  
-   bufObj = get_buffer(ctx, glClearBufferData, target, GL_INVALID_VALUE);

-   if (!bufObj) {
-  return;
-   }
-
-   if (_mesa_check_disallowed_mapping(bufObj)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  glClearBufferData(buffer currently mapped));
+   /* This checks for disallowed mappings. */
+   if (!buffer_object_subdata_range_good(ctx, bufObj, offset, size,
+ subdata, func)) {
return;
 }
  
 mesaFormat = validate_clear_buffer_format(ctx, internalformat,

- format, type,
- glClearBufferData);
+ format, type, func);
+
 if (mesaFormat == MESA_FORMAT_NONE) {
return;
 }
  
 clearValueSize = _mesa_get_format_bytes(mesaFormat);

-   if (bufObj-Size % clearValueSize != 0) {
+   if (offset % clearValueSize != 0 || size % clearValueSize != 0) {
_mesa_error(ctx, GL_INVALID_VALUE,
-  glClearBufferData(size is not a multiple of 
-  internalformat size));
+  %s(offset or size is not a multiple of 
+  internalformat size), func);
return;
 }
  
 if (data == NULL) {

/* clear to zeros, per the spec */
-  ctx-Driver.ClearBufferSubData(ctx, 0, bufObj-Size,
- NULL, clearValueSize, bufObj);
+  if (size  0) {
+ ctx-Driver.ClearBufferSubData(ctx, offset, size,
+NULL, clearValueSize, bufObj);
+  }
return;
 }
  
 if (!convert_clear_buffer_data(ctx, mesaFormat, clearValue,

-  format, type, data, glClearBufferData)) {
+  format, type, data, func)) {
return;
 }
  
-   ctx-Driver.ClearBufferSubData(ctx, 0, bufObj-Size,

-  clearValue, clearValueSize, bufObj);
+   if (size  0) {
+  

Re: [Mesa-dev] [PATCH] glsl: add lowering for double divide to rcp/mul

2015-02-20 Thread Roland Scheidegger
Am 19.02.2015 um 23:47 schrieb Dave Airlie:
 From: Dave Airlie airl...@redhat.com
 
 It looks like no hw does div anyways, so we should just
 lower at the GLSL level.
 
 Signed-off-by: Dave Airlie airl...@redhat.com
 ---
  src/glsl/lower_instructions.cpp | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)
 
 diff --git a/src/glsl/lower_instructions.cpp b/src/glsl/lower_instructions.cpp
 index e8a69e7..ac6715b 100644
 --- a/src/glsl/lower_instructions.cpp
 +++ b/src/glsl/lower_instructions.cpp
 @@ -199,7 +199,7 @@ lower_instructions_visitor::sub_to_add_neg(ir_expression 
 *ir)
  void
  lower_instructions_visitor::div_to_mul_rcp(ir_expression *ir)
  {
 -   assert(ir-operands[1]-type-is_float());
 +   assert(ir-operands[1]-type-is_float() || 
 ir-operands[1]-type-is_double());
  
 /* New expression for the 1.0 / op1 */
 ir_rvalue *expr;
 @@ -327,7 +327,7 @@ lower_instructions_visitor::mod_to_floor(ir_expression 
 *ir)
 /* Don't generate new IR that would need to be lowered in an additional
  * pass.
  */
 -   if (lowering(DIV_TO_MUL_RCP)  ir-type-is_float())
 +   if (lowering(DIV_TO_MUL_RCP)  (ir-type-is_float() || 
 ir-type-is_double()))
div_to_mul_rcp(div_expr);
  
 ir_expression *const floor_expr =
 @@ -1014,7 +1014,7 @@ lower_instructions_visitor::visit_leave(ir_expression 
 *ir)
 case ir_binop_div:
if (ir-operands[1]-type-is_integer()  
 lowering(INT_DIV_TO_MUL_RCP))
int_div_to_mul_rcp(ir);
 -  else if (ir-operands[1]-type-is_float()  lowering(DIV_TO_MUL_RCP))
 +  else if ((ir-operands[1]-type-is_float() || 
 ir-operands[1]-type-is_double()) lowering(DIV_TO_MUL_RCP))
div_to_mul_rcp(ir);
break;
  
 

FWIW I suspect lowering ddiv doesn't really cut it always. Can be done
later though. (d3d11 in fact has ddiv as an extended double feature
not just the ordinary double feature, though along with dfma and drcp,
which makes me wonder how you'd do a division with the unextended
version if you don't even have rcp... In any case ddiv is required to be
precise to 0.5 ULP if supported, drcp only 1.0 ULP.)

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/23] main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data.

2015-02-20 Thread Martin Peres

Please squash this in the previous commit (with git rebase -i).

On 12/02/2015 04:05, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 4 ++--
  src/mesa/main/bufferobj.h | 4 ++--
  2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index bd21c8a..88230d6 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1765,10 +1765,10 @@ void GLAPIENTRY
  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
   GLintptr offset, GLsizeiptr size,
   GLenum format, GLenum type,
- const GLvoid* data)
+ const GLvoid *data)
  {
 GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object* bufObj;
+   struct gl_buffer_object *bufObj;
  
 bufObj = get_buffer(ctx, glClearBufferSubData, target, GL_INVALID_VALUE);

 if (!bufObj)
diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index 5254727..2a66444 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -220,7 +220,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
  void GLAPIENTRY
  _mesa_ClearBufferData(GLenum target, GLenum internalformat,
GLenum format, GLenum type,
-  const GLvoid * data);
+  const GLvoid *data);
  
  void GLAPIENTRY

  _mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
@@ -231,7 +231,7 @@ void GLAPIENTRY
  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
   GLintptr offset, GLsizeiptr size,
   GLenum format, GLenum type,
- const GLvoid * data);
+ const GLvoid *data);
  
  void GLAPIENTRY

  _mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/23] main: Add entry points for ClearNamedBuffer[Sub]Data.

2015-02-20 Thread Martin Peres
It could have been split in two, but as no changes affect any 
already-running code,

I do not see any good reason to spend time on splitting this.

Reviewed-by: Martin Peres martin.pe...@linux.intel.com

On 12/02/2015 04:05, Laura Ekstrand wrote:

---
  src/mapi/glapi/gen/ARB_direct_state_access.xml | 18 +
  src/mesa/main/bufferobj.c  | 35 ++
  src/mesa/main/bufferobj.h  | 11 
  src/mesa/main/tests/dispatch_sanity.cpp|  2 ++
  4 files changed, 66 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 042b2a8..ce4017b 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -43,6 +43,24 @@
param name=size type=GLsizeiptr /
 /function
  
+   function name=ClearNamedBufferData offset=assign

+  param name=buffer type=GLuint /
+  param name=internalformat type=GLenum /
+  param name=format type=GLenum /
+  param name=type type=GLenum /
+  param name=data type=const GLvoid * /
+   /function
+
+   function name=ClearNamedBufferSubData offset=assign
+  param name=buffer type=GLuint /
+  param name=internalformat type=GLenum /
+  param name=offset type=GLintptr /
+  param name=size type=GLsizeiptr /
+  param name=format type=GLenum /
+  param name=type type=GLenum /
+  param name=data type=const GLvoid * /
+   /function
+
 !-- Texture object functions --
  
 function name=CreateTextures offset=assign

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index b8fa917..bd21c8a 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1744,6 +1744,22 @@ _mesa_ClearBufferData(GLenum target, GLenum 
internalformat, GLenum format,
 glClearBufferData, false);
  }
  
+void GLAPIENTRY

+_mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
+   GLenum format, GLenum type, const GLvoid *data)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_buffer_object *bufObj;
+
+   bufObj = _mesa_lookup_bufferobj_err(ctx, buffer, glClearNamedBufferData);
+   if (!bufObj)
+  return;
+
+   _mesa_clear_buffer_sub_data(ctx, bufObj, internalformat, 0, bufObj-Size,
+   format, type, data,
+   glClearNamedBufferData, false);
+}
+
  
  void GLAPIENTRY

  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
@@ -1763,6 +1779,25 @@ _mesa_ClearBufferSubData(GLenum target, GLenum 
internalformat,
 glClearBufferSubData, true);
  }
  
+void GLAPIENTRY

+_mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,
+  GLintptr offset, GLsizeiptr size,
+  GLenum format, GLenum type,
+  const GLvoid *data)
+{
+   GET_CURRENT_CONTEXT(ctx);
+   struct gl_buffer_object *bufObj;
+
+   bufObj = _mesa_lookup_bufferobj_err(ctx, buffer,
+   glClearNamedBufferSubData);
+   if (!bufObj)
+  return;
+
+   _mesa_clear_buffer_sub_data(ctx, bufObj, internalformat, offset, size,
+   format, type, data,
+   glClearNamedBufferSubData, true);
+}
+
  
  void * GLAPIENTRY

  _mesa_MapBuffer(GLenum target, GLenum access)
diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index 5911270..5254727 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -223,11 +223,22 @@ _mesa_ClearBufferData(GLenum target, GLenum 
internalformat,
const GLvoid * data);
  
  void GLAPIENTRY

+_mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
+   GLenum format, GLenum type,
+   const GLvoid *data);
+
+void GLAPIENTRY
  _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
   GLintptr offset, GLsizeiptr size,
   GLenum format, GLenum type,
   const GLvoid * data);
  
+void GLAPIENTRY

+_mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,
+  GLintptr offset, GLsizeiptr size,
+  GLenum format, GLenum type,
+  const GLvoid *data);
+
  void * GLAPIENTRY
  _mesa_MapBuffer(GLenum target, GLenum access);
  
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp b/src/mesa/main/tests/dispatch_sanity.cpp

index aa8e352..4bd3e10 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -960,6 +960,8 @@ const struct function gl_core_functions_possible[] = {
 { glNamedBufferData, 45, -1 },
 { glNamedBufferSubData, 45, -1 },
 { glCopyNamedBufferSubData, 45, -1 },
+   { glClearNamedBufferData, 45, 

Re: [Mesa-dev] [PATCH 1/6] gallium: add some more double opcodes to avoid unnecessary lowering

2015-02-20 Thread Roland Scheidegger
Just some minor nits.

Am 20.02.2015 um 00:52 schrieb Ilia Mirkin:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
  src/gallium/auxiliary/tgsi/tgsi_info.c |  5 
  src/gallium/docs/source/tgsi.rst   | 39 
 ++
  src/gallium/include/pipe/p_shader_tokens.h |  7 +-
  3 files changed, 50 insertions(+), 1 deletion(-)
 
 diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
 b/src/gallium/auxiliary/tgsi/tgsi_info.c
 index d04f9da..4d838fd 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
 @@ -257,6 +257,11 @@ static const struct tgsi_opcode_info 
 opcode_info[TGSI_OPCODE_LAST] =
 { 1, 1, 0, 0, 0, 0, COMP, D2U, TGSI_OPCODE_D2U },
 { 1, 1, 0, 0, 0, 0, COMP, U2D, TGSI_OPCODE_U2D },
 { 1, 1, 0, 0 ,0, 0, COMP, DRSQ, TGSI_OPCODE_DRSQ },
 +   { 1, 1, 0, 0, 0, 0, COMP, DTRUNC, TGSI_OPCODE_DTRUNC },
 +   { 1, 1, 0, 0, 0, 0, COMP, DCEIL, TGSI_OPCODE_DCEIL },
 +   { 1, 1, 0, 0, 0, 0, COMP, DFLR, TGSI_OPCODE_DFLR },
 +   { 1, 1, 0, 0, 0, 0, COMP, DROUND, TGSI_OPCODE_DROUND },
 +   { 1, 1, 0, 0, 0, 0, COMP, DSSG, TGSI_OPCODE_DSSG },
  };
  
  const struct tgsi_opcode_info *
 diff --git a/src/gallium/docs/source/tgsi.rst 
 b/src/gallium/docs/source/tgsi.rst
 index e20af79..15f1e9f 100644
 --- a/src/gallium/docs/source/tgsi.rst
 +++ b/src/gallium/docs/source/tgsi.rst
 @@ -1861,6 +1861,45 @@ two-component vectors with doubled precision in each 
 component.
  
dst.zw = src.zw - \lfloor src.zw\rfloor
  
 +.. opcode:: DTRUNC - Truncate
 +
 +.. math::
 +
 +  dst.xy = trunc(src.xy)
 +
 +  dst.zw = trunc(src.zw)
 +
 +.. opcode:: DCEIL - Ceiling
 +
 +.. math::
 +
 +  dst.xy = \lceil src.xy\rceil
 +
 +  dst.zw = \lceil src.zw\rceil
 +
 +.. opcode:: DFLR - Floor
 +
 +.. math::
 +
 +  dst.xy = \lfloor src.xy\rfloor
 +
 +  dst.zw = \lfloor src.zw\rfloor
 +
 +.. opcode:: DROUND - Fraction
 +
 +.. math::
 +
 +  dst.xy = round(src.xy)
 +
 +  dst.zw = round(src.zw)
 +
 +.. opcode:: DSSG - Set Sign
 +
 +.. math::
 +
 +  dst.xy = (src.xy  0) ? 1 : (src.xy  0) ? -1 : 0
 +
 +  dst.zw = (src.zw  0) ? 1 : (src.zw  0) ? -1 : 0
I think this would be more obvious written as 1.0/-1.0/0.0 (same goes
for the non-double version, of course).
(As a side note, I'm wondering if this actually has defined NaN
behavior, the glsl language sort of implies a number has to be exactly
one of these 3 cases, and doesn't give an answer what should be returned
for a NaN - maybe a NaN?)

  
  .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components
  
 diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
 b/src/gallium/include/pipe/p_shader_tokens.h
 index fc41cc9..95ac590 100644
 --- a/src/gallium/include/pipe/p_shader_tokens.h
 +++ b/src/gallium/include/pipe/p_shader_tokens.h
 @@ -519,7 +519,12 @@ struct tgsi_property_data {
  #define TGSI_OPCODE_D2U 215
  #define TGSI_OPCODE_U2D 216
  #define TGSI_OPCODE_DRSQ217 /* eg, cayman also has DRSQ */
 -#define TGSI_OPCODE_LAST218
 +#define TGSI_OPCODE_DTRUNC  218 /* nvc0 */
 +#define TGSI_OPCODE_DCEIL   219 /* nvc0 */
 +#define TGSI_OPCODE_DFLR220 /* nvc0 */
 +#define TGSI_OPCODE_DROUND  221 /* nvc0 */
 +#define TGSI_OPCODE_DSSG222
 +#define TGSI_OPCODE_LAST223

I don't think these hw-specific references (nvc0) really fit in there -
the same can be said about some existing ones (cayman...)

  #define TGSI_SAT_NONE0  /* do not saturate */
  #define TGSI_SAT_ZERO_ONE1  /* clamp to [0,1] */
 

Otherwise looks good to me.

Roland

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/23] main: Refactor ClearBuffer[Sub]Data.

2015-02-20 Thread Martin Peres

On 20/02/2015 20:03, Laura Ekstrand wrote:

This naming convention tries to match the corresponding DD table entry.

There's a thread discussing the naming convention for external 
software fallback functions.  Feel free to add your input to the 
discussion there :)


Ack, no strong opinion here :)



On Fri, Feb 20, 2015 at 6:18 AM, Martin Peres martin.pe...@free.fr 
mailto:martin.pe...@free.fr wrote:


On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Jason Ekstrand
- Split refactor of clear buffer sub data from addition of
DSA entry
  points.
---
  src/mesa/main/bufferobj.c| 125
---
  src/mesa/main/bufferobj.h|  19 ++--
  src/mesa/state_tracker/st_cb_bufferobjects.c |   4 +-
  3 files changed, 69 insertions(+), 79 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 7225b64..b8fa917 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct
gl_context *ctx, GLintptrARB offset,
   * dd_function_table::ClearBufferSubData.
   */
  void
-_mesa_buffer_clear_subdata(struct gl_context *ctx,
-   GLintptr offset, GLsizeiptr size,
-   const GLvoid *clearValue,
-   GLsizeiptr clearValueSize,
-   struct gl_buffer_object *bufObj)
+_mesa_ClearBufferSubData_sw(struct gl_context *ctx,


Interesting choice of naming the function as a mix of camel case
and underscores.

Since it is an internal function, shouldn't it only have underscores?

Other than that:

Reviewed-by: Martin Peres martin.pe...@linux.intel.com
mailto:martin.pe...@linux.intel.com

+GLintptr offset, GLsizeiptr size,
+const GLvoid *clearValue,
+GLsizeiptr clearValueSize,
+struct gl_buffer_object *bufObj)
  {
 GLsizeiptr i;
 GLubyte *dest;
@@ -1113,7 +1113,7 @@
_mesa_init_buffer_object_functions(struct dd_function_table
*driver)
 driver-UnmapBuffer = _mesa_buffer_unmap;
   /* GL_ARB_clear_buffer_object */
-   driver-ClearBufferSubData = _mesa_buffer_clear_subdata;
+   driver-ClearBufferSubData = _mesa_ClearBufferSubData_sw;
   /* GL_ARB_map_buffer_range */
 driver-MapBufferRange = _mesa_buffer_map_range;
@@ -1671,57 +1671,77 @@ _mesa_GetBufferSubData(GLenum target,
GLintptr offset,
 ctx-Driver.GetBufferSubData( ctx, offset, size, data,
bufObj );
  }
  -
-void GLAPIENTRY
-_mesa_ClearBufferData(GLenum target, GLenum internalformat,
GLenum format,
-  GLenum type, const GLvoid* data)
+/**
+ * \param subdata   true if caller is *SubData, false if *Data
+ */
+void
+_mesa_clear_buffer_sub_data(struct gl_context *ctx,
+struct gl_buffer_object *bufObj,
+GLenum internalformat,
+GLintptr offset, GLsizeiptr size,
+GLenum format, GLenum type,
+const GLvoid *data,
+const char *func, bool subdata)
  {
-   GET_CURRENT_CONTEXT(ctx);
-   struct gl_buffer_object* bufObj;
 mesa_format mesaFormat;
 GLubyte clearValue[MAX_PIXEL_BYTES];
 GLsizeiptr clearValueSize;
  -   bufObj = get_buffer(ctx, glClearBufferData, target,
GL_INVALID_VALUE);
-   if (!bufObj) {
-  return;
-   }
-
-   if (_mesa_check_disallowed_mapping(bufObj)) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  glClearBufferData(buffer currently mapped));
+   /* This checks for disallowed mappings. */
+   if (!buffer_object_subdata_range_good(ctx, bufObj, offset,
size,
+ subdata, func)) {
return;
 }
   mesaFormat = validate_clear_buffer_format(ctx,
internalformat,
- format, type,
-  glClearBufferData);
+ format, type, func);
+
 if (mesaFormat == MESA_FORMAT_NONE) {
return;
 }
   clearValueSize = _mesa_get_format_bytes(mesaFormat);
-   if (bufObj-Size % 

Re: [Mesa-dev] [PATCH 08/23] main: Add entry point for NamedBufferSubData.

2015-02-20 Thread Martin Peres

On 20/02/2015 22:17, Laura Ekstrand wrote:
That would make the diff easier to read, but it doesn't make sense to 
lay out the functions in that order in the file.  GetBufferSubData is 
a download function and shouldn't be part of the upload function 
group in the file.


Fair enough! That was a minor nitpick anyway and I did not expect you to 
fix it, hence why I gave my R-b ;)




On Fri, Feb 20, 2015 at 2:28 AM, Martin Peres martin.pe...@free.fr 
mailto:martin.pe...@free.fr wrote:


On 12/02/2015 04:05, Laura Ekstrand wrote:

v2: review by Ian Romanick
- Remove _mesa from name of static software fallback
buffer_sub_data.
- Remove mappedRange from _mesa_buffer_sub_data.
- Removed some cosmetic changes to a separate commit.
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |   7 ++
  src/mesa/main/bufferobj.c  | 129
+++--
  src/mesa/main/bufferobj.h  |   9 ++
  src/mesa/main/tests/dispatch_sanity.cpp|   1 +
  4 files changed, 97 insertions(+), 49 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 7779262..6d70b8e 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -28,6 +28,13 @@
param name=usage type=GLenum /
 /function
  +   function name=NamedBufferSubData offset=assign
+  param name=buffer type=GLuint /
+  param name=offset type=GLintptr /
+  param name=size type=GLsizeiptr /
+  param name=data type=const GLvoid * /
+   /function
+
 !-- Texture object functions --
   function name=CreateTextures offset=assign
diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index ac8eed1..4f89748 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -227,67 +227,58 @@ bufferobj_range_mapped(const struct
gl_buffer_object *obj,
   * \c glClearBufferSubData.
   *
   * \param ctx GL context.
- * \param target  Buffer object target on which to operate.
+ * \param bufObj  The buffer object.
   * \param offset  Offset of the first byte of the subdata range.
   * \param sizeSize, in bytes, of the subdata range.
   * \param mappedRange  If true, checks if an overlapping
range is mapped.
   * If false, checks if buffer is mapped.
- * \param errorNoBuffer  Error code if no buffer is bound to
target.
   * \param caller  Name of calling function for recording errors.
- * \return   A pointer to the buffer object bound to \c
target in the
- *   specified context or \c NULL if any of the
parameter or state
- *   conditions are invalid.
+ * \return   false if error, true otherwise
   *
   * \sa glBufferSubDataARB, glGetBufferSubDataARB,
glClearBufferSubData
   */
-static struct gl_buffer_object *
-buffer_object_subdata_range_good(struct gl_context * ctx,
GLenum target,
- GLintptrARB offset,
GLsizeiptrARB size,
- bool mappedRange, GLenum
errorNoBuffer,
- const char *caller)
+static bool
+buffer_object_subdata_range_good(struct gl_context *ctx,
+ struct gl_buffer_object *bufObj,
+ GLintptr offset, GLsizeiptr
size,
+ bool mappedRange, const char
*caller)
  {
-   struct gl_buffer_object *bufObj;
-
 if (size  0) {
_mesa_error(ctx, GL_INVALID_VALUE, %s(size  0),
caller);
-  return NULL;
+  return false;
 }
   if (offset  0) {
_mesa_error(ctx, GL_INVALID_VALUE, %s(offset  0),
caller);
-  return NULL;
+  return false;
 }
  -   bufObj = get_buffer(ctx, caller, target, errorNoBuffer);
-   if (!bufObj)
-  return NULL;
-
 if (offset + size  bufObj-Size) {
_mesa_error(ctx, GL_INVALID_VALUE,
%s(offset %lu + size %lu  buffer size
%lu), caller,
(unsigned long) offset,
(unsigned long) size,
(unsigned long) bufObj-Size);
-  return NULL;
+  return false;
 }
  

[Mesa-dev] [PATCH] mesa: Have configure define NDEBUG, not mtypes.h.

2015-02-20 Thread Matt Turner
mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
defined. Confusing and bizarre that you don't get NDEBUG if you don't
include mtypes.h.

... which is just what happened in commit bef38f62e.

Let's let configure define this for us if not using --enable-debug.
---
 configure.ac   | 2 ++
 src/mesa/main/mtypes.h | 3 ---
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/configure.ac b/configure.ac
index beb7a7d..5fbb7bc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -370,6 +370,8 @@ if test x$enable_debug = xyes; then
 CXXFLAGS=$CXXFLAGS -O0
 fi
 fi
+else
+   DEFINES=$DEFINES -DNDEBUG
 fi
 
 dnl
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index 05b5a81..6e99773 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -4512,9 +4512,6 @@ extern int MESA_DEBUG_FLAGS;
 # define MESA_VERBOSE 0
 # define MESA_DEBUG_FLAGS 0
 # define MESA_FUNCTION a function
-# ifndef NDEBUG
-#  define NDEBUG
-# endif
 #endif
 
 
-- 
2.0.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Fix the Mesa build without -DDEBUG.

2015-02-20 Thread Matt Turner
Regardless of where DEBUG/NDEBUG is defined, it seems like we should
do this anyway.

Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

2015-02-20 Thread Matt Turner
Reviewed-by: Matt Turner matts...@gmail.com

It might be a good idea to make the fs_reg(struct brw_reg) explicit to
prevent this kind of mistake from happening.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/4] common: Correct texture initialization in create_texture_for_pbo.

2015-02-20 Thread Laura Ekstrand
Solves bugs related to the driver not setting up the texture miptree
correctly, leading to faulty PBO uploads and downloads.
---
 src/mesa/drivers/common/meta_tex_subimage.c | 13 +
 src/mesa/drivers/dri/i965/intel_tex.c   |  3 ++-
 src/mesa/main/dd.h  |  1 +
 3 files changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index 68c8273..f4f7716 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -51,7 +51,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
 {
uint32_t pbo_format;
GLenum internal_format;
-   unsigned row_stride;
+   unsigned row_stride, image_stride;
struct gl_buffer_object *buffer_obj;
struct gl_texture_object *tex_obj;
struct gl_texture_image *tex_image;
@@ -74,6 +74,8 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
pixels = _mesa_image_address3d(packing, pixels,
   width, height, format, type, 0, 0, 0);
row_stride = _mesa_image_row_stride(packing, width, format, type);
+   image_stride = _mesa_image_image_stride(packing, width, height, format,
+   type);
 
if (_mesa_is_bufferobj(packing-BufferObj)) {
   *tmp_pbo = 0;
@@ -89,7 +91,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
*/
   _mesa_BindBuffer(pbo_target, *tmp_pbo);
 
-  _mesa_BufferData(pbo_target, row_stride * height, pixels, 
GL_STREAM_DRAW);
+  _mesa_BufferData(pbo_target, image_stride * depth, pixels, 
GL_STREAM_DRAW);
 
   buffer_obj = ctx-Unpack.BufferObj;
   pixels = NULL;
@@ -99,9 +101,11 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
 
_mesa_GenTextures(1, tmp_tex);
tex_obj = _mesa_lookup_texture(ctx, *tmp_tex);
-   tex_obj-Target = depth  1 ? GL_TEXTURE_2D_ARRAY : GL_TEXTURE_2D;
-   tex_obj-Immutable = GL_TRUE;
_mesa_initialize_texture_object(ctx, tex_obj, *tmp_tex, GL_TEXTURE_2D);
+   /* This must be set after _mesa_initialize_texture_object, not before. */
+   tex_obj-Immutable = GL_TRUE;
+   /* This is required for interactions with ARB_texture_view. */
+   tex_obj-NumLayers = 1;
 
internal_format = _mesa_get_format_base_format(pbo_format);
 
@@ -114,6 +118,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
  buffer_obj,
  (intptr_t)pixels,
  row_stride,
+ image_stride,
  read_only)) {
   _mesa_DeleteTextures(1, tmp_tex);
   _mesa_DeleteBuffers(1, tmp_pbo);
diff --git a/src/mesa/drivers/dri/i965/intel_tex.c 
b/src/mesa/drivers/dri/i965/intel_tex.c
index 2d3009a..3a0c09a 100644
--- a/src/mesa/drivers/dri/i965/intel_tex.c
+++ b/src/mesa/drivers/dri/i965/intel_tex.c
@@ -305,6 +305,7 @@ intel_set_texture_storage_for_buffer_object(struct 
gl_context *ctx,
 struct gl_buffer_object 
*buffer_obj,
 uint32_t buffer_offset,
 uint32_t row_stride,
+uint32_t image_stride,
 bool read_only)
 {
struct brw_context *brw = brw_context(ctx);
@@ -334,7 +335,7 @@ intel_set_texture_storage_for_buffer_object(struct 
gl_context *ctx,
 
drm_intel_bo *bo = intel_bufferobj_buffer(brw, intel_buffer_obj,
  buffer_offset,
- row_stride * image-Height);
+ image_stride * image-Depth);
intel_texobj-mt =
   intel_miptree_create_for_bo(brw, bo,
   image-TexFormat,
diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
index ec8662b..9de08e2 100644
--- a/src/mesa/main/dd.h
+++ b/src/mesa/main/dd.h
@@ -429,6 +429,7 @@ struct dd_function_table {
 struct gl_buffer_object *bufferObj,
 uint32_t buffer_offset,
 uint32_t row_stride,
+uint32_t image_stride,
 bool read_only);
 
/**
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/4] i965: Force miptrees for BOs to have all slices in each lod.

2015-02-20 Thread Laura Ekstrand
Textures made expressly for internal buffer objects shouldn't have extra
padding around them, but should be densely packed.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0e3888f..b46532d 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -724,7 +724,7 @@ intel_miptree_create_for_bo(struct brw_context *brw,
mt = intel_miptree_create_layout(brw, target, format,
 0, 0,
 width, height, depth,
-true, 0, false);
+true, 0, true);
if (!mt) {
   free(mt);
   return mt;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/4] common: Fix PBOs for 1D_ARRAY.

2015-02-20 Thread Laura Ekstrand
Corrects the way that _mesa_meta_pbo_TexSubImage and
_mesa_meta_pbo_GetTexSubImage handle 1D_ARRAY textures.  Fixes a failure in
the Piglit arb_direct_state_access/gettextureimage-targets test.
---
 src/mesa/drivers/common/meta_tex_subimage.c | 62 +
 1 file changed, 36 insertions(+), 26 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index ee3295b..24a72ba 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -141,7 +141,7 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
struct gl_texture_image *pbo_tex_image;
GLenum status;
bool success = false;
-   int z;
+   int z, iters;
 
/* XXX: This should probably be passed in from somewhere */
const char *where = _mesa_meta_pbo_TexSubImage;
@@ -189,12 +189,6 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
_mesa_BindFramebuffer(GL_READ_FRAMEBUFFER, fbos[0]);
_mesa_BindFramebuffer(GL_DRAW_FRAMEBUFFER, fbos[1]);
 
-   if (tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) {
-  assert(depth == 1);
-  depth = height;
-  height = 1;
-   }
-
_mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
  pbo_tex_image, 0);
/* If this passes on the first layer it should pass on the others */
@@ -218,7 +212,10 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
   GL_COLOR_BUFFER_BIT, GL_NEAREST))
   goto fail;
 
-   for (z = 1; z  depth; z++) {
+   iters = tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY ?
+   height : depth;
+
+   for (z = 1; z  iters; z++) {
   _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
 pbo_tex_image, z);
   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
@@ -226,11 +223,18 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
 
   _mesa_update_state(ctx);
 
-  _mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
- 0, z * height, width, (z + 1) * height,
- xoffset, yoffset,
- xoffset + width, yoffset + height,
- GL_COLOR_BUFFER_BIT, GL_NEAREST);
+  if (tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY)
+ _mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
+0, z, width, z + 1,
+xoffset, yoffset,
+xoffset + width, yoffset + 1,
+GL_COLOR_BUFFER_BIT, GL_NEAREST);
+  else
+ _mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
+0, z * height, width, (z + 1) * height,
+xoffset, yoffset,
+xoffset + width, yoffset + height,
+GL_COLOR_BUFFER_BIT, GL_NEAREST);
}
 
success = true;
@@ -257,7 +261,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
struct gl_texture_image *pbo_tex_image;
GLenum status;
bool success = false;
-   int z;
+   int z, iters;
 
/* XXX: This should probably be passed in from somewhere */
const char *where = _mesa_meta_pbo_GetTexSubImage;
@@ -299,12 +303,6 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
 
_mesa_GenFramebuffers(2, fbos);
 
-   if (tex_image  tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY) {
-  assert(depth == 1);
-  depth = height;
-  height = 1;
-   }
-
/* If we were given a texture, bind it to the read framebuffer.  If not,
 * we're doing a ReadPixels and we should just use whatever framebuffer
 * the client has bound.
@@ -338,7 +336,12 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
   GL_COLOR_BUFFER_BIT, GL_NEAREST))
   goto fail;
 
-   for (z = 1; z  depth; z++) {
+   if (tex_image  tex_image-TexObject-Target == GL_TEXTURE_1D_ARRAY)
+  iters = height;
+   else
+  iters = depth;
+
+   for (z = 1; z  iters; z++) {
   _mesa_meta_bind_fbo_image(GL_READ_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
 tex_image, zoffset + z);
   _mesa_meta_bind_fbo_image(GL_DRAW_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
@@ -346,11 +349,18 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
 
   _mesa_update_state(ctx);
 
-  _mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
- xoffset, yoffset,
- xoffset + width, yoffset + height,
- 0, z * height, width, (z + 1) * height,
- 

[Mesa-dev] [PATCH 2/4] common: Correct PBO 2D_ARRAY handling.

2015-02-20 Thread Laura Ekstrand
Changes PBO uploads and downloads to use a tall (height * depth) 2D texture
for blitting.  This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY
textures are not properly uploaded and downloaded.
---
 src/mesa/drivers/common/meta_tex_subimage.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/meta_tex_subimage.c 
b/src/mesa/drivers/common/meta_tex_subimage.c
index f4f7716..ee3295b 100644
--- a/src/mesa/drivers/common/meta_tex_subimage.c
+++ b/src/mesa/drivers/common/meta_tex_subimage.c
@@ -110,7 +110,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool 
create_pbo,
internal_format = _mesa_get_format_base_format(pbo_format);
 
tex_image = _mesa_get_tex_image(ctx, tex_obj, tex_obj-Target, 0);
-   _mesa_init_teximage_fields(ctx, tex_image, width, height, depth,
+   _mesa_init_teximage_fields(ctx, tex_image, width, height * depth, 1,
   0, internal_format, pbo_format);
 
read_only = pbo_target == GL_PIXEL_UNPACK_BUFFER;
@@ -227,7 +227,7 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx, GLuint 
dims,
   _mesa_update_state(ctx);
 
   _mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
- 0, 0, width, height,
+ 0, z * height, width, (z + 1) * height,
  xoffset, yoffset,
  xoffset + width, yoffset + height,
  GL_COLOR_BUFFER_BIT, GL_NEAREST);
@@ -349,7 +349,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx, 
GLuint dims,
   _mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
  xoffset, yoffset,
  xoffset + width, yoffset + height,
- 0, 0, width, height,
+ 0, z * height, width, (z + 1) * height,
  GL_COLOR_BUFFER_BIT, GL_NEAREST);
}
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/4] v2: Fix PBO uploads/downloads.

2015-02-20 Thread Laura Ekstrand
This is a more robust set of patches to fix Meta PBO texture uploads and
downloads.  This fixes bugs related to array PBOs that were found using the
new set of getteximage-targets tests.

Laura Ekstrand (4):
  common: Correct texture initialization in create_texture_for_pbo.
  common: Correct PBO 2D_ARRAY handling.
  common: Fix PBOs for 1D_ARRAY.
  i965: Force miptrees for BOs to have all slices in each lod.

 src/mesa/drivers/common/meta_tex_subimage.c   | 77 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c |  2 +-
 src/mesa/drivers/dri/i965/intel_tex.c |  3 +-
 src/mesa/main/dd.h|  1 +
 4 files changed, 50 insertions(+), 33 deletions(-)

-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] i965/skl: Layout 3D textures the same as array textures

2015-02-20 Thread Neil Roberts
On Gen9+ the 3D textures use the same mipmap layout as 2D array
textures.
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 0e2841f..57922e9 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -224,6 +224,9 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 
   width  = minify(width, 1);
   height = minify(height, 1);
+
+  if (mt-target == GL_TEXTURE_3D)
+ depth = minify(depth, 1);
}
 }
 
@@ -263,7 +266,7 @@ brw_miptree_layout_texture_array(struct brw_context *brw,
   if (mt-compressed)
  img_height /= mt-align_h;
 
-  for (int q = 0; q  mt-physical_depth0; q++) {
+  for (int q = 0; q  mt-level[level].depth; q++) {
  if (mt-array_layout == ALL_SLICES_AT_EACH_LOD) {
 intel_miptree_set_image_offset(mt, level, q, 0, q * img_height);
  } else {
@@ -368,7 +371,10 @@ brw_miptree_layout(struct brw_context *brw, struct 
intel_mipmap_tree *mt)
   break;
 
case GL_TEXTURE_3D:
-  brw_miptree_layout_texture_3d(brw, mt);
+  if (brw-gen = 9)
+ brw_miptree_layout_texture_array(brw, mt);
+  else
+ brw_miptree_layout_texture_3d(brw, mt);
   break;
 
case GL_TEXTURE_1D_ARRAY:
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] i965/skl: Layout a 1D miptree horizontally

2015-02-20 Thread Neil Roberts
On Gen9+ the 1D miptree is laid out with all of the mipmap levels in a
horizontal line.
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c | 62 +-
 1 file changed, 60 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 57922e9..851742b 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -158,6 +158,36 @@ intel_vertical_texture_alignment_unit(struct brw_context 
*brw,
 }
 
 static void
+gen9_miptree_layout_1d(struct intel_mipmap_tree *mt)
+{
+   unsigned x = 0;
+   unsigned width = mt-physical_width0;
+   unsigned depth = mt-physical_depth0; /* number of array layers. */
+
+   /* When this layout is used the horizontal alignment is fixed at 64 and the
+* hardware ignores the value given in the surface state
+*/
+   const unsigned int align_w = 64;
+
+   mt-total_height = mt-physical_height0;
+   mt-total_width = 0;
+
+   for (unsigned level = mt-first_level; level = mt-last_level; level++) {
+  unsigned img_width;
+
+  intel_miptree_set_level_info(mt, level, x, 0, depth);
+
+  img_width = ALIGN(width, align_w);
+
+  mt-total_width = MAX2(mt-total_width, x + img_width);
+
+  x += img_width;
+
+  width = minify(width, 1);
+   }
+}
+
+static void
 brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 {
unsigned x = 0;
@@ -242,12 +272,34 @@ align_cube(struct intel_mipmap_tree *mt)
   mt-total_height += 2;
 }
 
+static bool
+use_linear_1d_layout(struct brw_context *brw,
+ struct intel_mipmap_tree *mt)
+{
+   /* On Gen9+ the mipmap levels of a 1D surface are all laid out in a
+* horizontal line. This isn't done for depth/stencil buffers however
+* because those will be using a tiled layout
+*/
+   if (brw-gen = 9 
+   (mt-target == GL_TEXTURE_1D ||
+mt-target == GL_TEXTURE_1D_ARRAY)) {
+  GLenum base_format = _mesa_get_format_base_format(mt-format);
+
+  if (base_format != GL_DEPTH_COMPONENT 
+  base_format != GL_DEPTH_STENCIL)
+ return true;
+   }
+
+   return false;
+}
+
 static void
 brw_miptree_layout_texture_array(struct brw_context *brw,
 struct intel_mipmap_tree *mt)
 {
int h0, h1;
unsigned height = mt-physical_height0;
+   bool layout_1d = use_linear_1d_layout(brw, mt);
 
h0 = ALIGN(mt-physical_height0, mt-align_h);
h1 = ALIGN(minify(mt-physical_height0, 1), mt-align_h);
@@ -258,7 +310,10 @@ brw_miptree_layout_texture_array(struct brw_context *brw,
 
int physical_qpitch = mt-compressed ? mt-qpitch / 4 : mt-qpitch;
 
-   brw_miptree_layout_2d(mt);
+   if (layout_1d)
+  gen9_miptree_layout_1d(mt);
+   else
+  brw_miptree_layout_2d(mt);
 
for (unsigned level = mt-first_level; level = mt-last_level; level++) {
   unsigned img_height;
@@ -392,7 +447,10 @@ brw_miptree_layout(struct brw_context *brw, struct 
intel_mipmap_tree *mt)
  break;
   case INTEL_MSAA_LAYOUT_NONE:
   case INTEL_MSAA_LAYOUT_IMS:
- brw_miptree_layout_2d(mt);
+ if (use_linear_1d_layout(brw, mt))
+gen9_miptree_layout_1d(mt);
+ else
+brw_miptree_layout_2d(mt);
  break;
   }
   break;
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] i965/skl: Align compressed textures to four times the block size

2015-02-20 Thread Neil Roberts
On Skylake it is possible to choose your own alignment values for
compressed textures but they are expressed as a multiple of the block
size. The minimum alignment value we can use is 4 so we effectively
have to align to 4 times the block size. This patch makes it initially
set mt-align_[wh] to the large alignment value and then later divides
it by the block size so that it can be uploaded as part of the surface
state.
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c | 31 ++
 1 file changed, 27 insertions(+), 4 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index d89a04e..1fe2c26 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -73,7 +73,16 @@ intel_horizontal_texture_alignment_unit(struct brw_context 
*brw,
 */
   unsigned int i, j;
   _mesa_get_format_block_size(format, i, j);
-  return i;
+
+  /* On Gen9+ we can pick our own alignment for compressed textures but it
+   * has to be a multiple of the block size. The minimum alignment we can
+   * pick is 4 so we effectively have to align to 4 times the block
+   * size
+   */
+  if (brw-gen = 9)
+ return i * 4;
+  else
+ return i;
 }
 
if (format == MESA_FORMAT_S_UINT8)
@@ -113,7 +122,8 @@ intel_vertical_texture_alignment_unit(struct brw_context 
*brw,
 * the SURFACE_STATE Surface Vertical Alignment field.
 */
if (_mesa_is_format_compressed(format))
-  return 4;
+  /* See comment above for the horizontal alignment */
+  return brw-gen = 9 ? 16 : 4;
 
if (format == MESA_FORMAT_S_UINT8)
   return brw-gen = 7 ? 8 : 4;
@@ -195,6 +205,9 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
unsigned width = mt-physical_width0;
unsigned height = mt-physical_height0;
unsigned depth = mt-physical_depth0; /* number of array layers. */
+   unsigned int bw, bh;
+
+   _mesa_get_format_block_size(mt-format, bw, bh);
 
mt-total_width = mt-physical_width0;
 
@@ -212,7 +225,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 
if (mt-compressed) {
   mip1_width = ALIGN(minify(mt-physical_width0, 1), mt-align_w) +
- ALIGN(minify(mt-physical_width0, 2), mt-align_w);
+ ALIGN(minify(mt-physical_width0, 2), bw);
} else {
   mip1_width = ALIGN(minify(mt-physical_width0, 1), mt-align_w) +
  minify(mt-physical_width0, 2);
@@ -232,7 +245,7 @@ brw_miptree_layout_2d(struct intel_mipmap_tree *mt)
 
   img_height = ALIGN(height, mt-align_h);
   if (mt-compressed)
-img_height /= mt-align_h;
+img_height /= bh;
 
   if (mt-array_layout == ALL_SLICES_AT_EACH_LOD) {
  /* Compact arrays with separated miplevels */
@@ -481,5 +494,15 @@ brw_miptree_layout(struct brw_context *brw, struct 
intel_mipmap_tree *mt)
}
DBG(%s: %dx%dx%d\n, __FUNCTION__,
mt-total_width, mt-total_height, mt-cpp);
+
+   /* On Gen9+ the alignment values are expressed in multiples of the block
+* size
+*/
+   if (brw-gen = 9) {
+  unsigned int i, j;
+  _mesa_get_format_block_size(mt-format, i, j);
+  mt-align_w /= i;
+  mt-align_h /= j;
+   }
 }
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 0/6] i965/skl: Texture layout fixes

2015-02-20 Thread Neil Roberts
Here is a v2 of my patch series to fix 1D textures on Skylake. It's
now turned into just some general fixes and also helps with 3D
textures. It fixes 110 piglit tests but sadly seems to cause 3
regressions. It might not be worth landing until I can work out what
the regressions are so I guess I'm just posting it now in case
anyone's interested to look at it anyway.

I've uploaded the piglit html summary of the changes here:

http://busydoingnothing.co.uk/skl-qpitch-patches/changes.html

- Neil

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] i965: Don't force x-tiling for 16-bpp formats on Gen7

2015-02-20 Thread Neil Roberts
Sandybridge doesn't support y-tiling for surface formats with 16 or
more bpp. There was previously an override to explicitly allow this
for Gen7. However, this restriction is also removed in Gen8+ so we
should use y-tiling there too.

This is important to do for Skylake which doesn't support x-tiling for
3D surfaces.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 0e3888f..994670a 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -516,9 +516,9 @@ intel_miptree_choose_tiling(struct brw_context *brw,
 * NOTE: 128BPE Format Color Buffer ( render target ) MUST be either TileX
 *  or Linear.
 * 128 bits per pixel translates to 16 bytes per pixel.  This is necessary
-* all the way back to 965, but is explicitly permitted on Gen7.
+* all the way back to 965, but is explicitly permitted on Gen7+.
 */
-   if (brw-gen != 7  mt-cpp = 16)
+   if (brw-gen  7  mt-cpp = 16)
   return I915_TILING_X;
 
/* From the Ivy Bridge PRM, Vol4 Part1 2.12.2.1 (SURFACE_STATE for most
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] i965/skl: Don't use ALL_SLICES_AT_EACH_LOD

2015-02-20 Thread Neil Roberts
The render surface state command for Skylake doesn't have the surface
array spacing bit so I don't think it's possible to select this
layout. This avoids a kernel panic when running the piglit test below:

ext_framebuffer_multisample-no-color 8 stencil single

However the test still fails so there may be something else wrong as
well. The test was not causing a kernel panic before the patch to fix
the qpitch.

I think it's also not possible to select this layout on Gen8 so it may
need to be changed to only be used on Gen7.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 26 --
 1 file changed, 16 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 994670a..018e16b 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -371,19 +371,25 @@ intel_miptree_create_layout(struct brw_context *brw,
   }
}
 
-   /* Set array_layout to ALL_SLICES_AT_EACH_LOD when gen7+ array_spacing_lod0
-* can be used. array_spacing_lod0 is only used for non-IMS MSAA surfaces.
+   /* Set array_layout to ALL_SLICES_AT_EACH_LOD when array_spacing_lod0
+* can be used. array_spacing_lod0 is only used for non-IMS MSAA surfaces
+* on Gen 7 and 8.
 * TODO: can we use it elsewhere?
+* TODO: does this actually work on Gen 8?
 */
-   switch (mt-msaa_layout) {
-   case INTEL_MSAA_LAYOUT_NONE:
-   case INTEL_MSAA_LAYOUT_IMS:
+   if (brw-gen = 9) {
   mt-array_layout = ALL_LOD_IN_EACH_SLICE;
-  break;
-   case INTEL_MSAA_LAYOUT_UMS:
-   case INTEL_MSAA_LAYOUT_CMS:
-  mt-array_layout = ALL_SLICES_AT_EACH_LOD;
-  break;
+   } else {
+  switch (mt-msaa_layout) {
+  case INTEL_MSAA_LAYOUT_NONE:
+  case INTEL_MSAA_LAYOUT_IMS:
+ mt-array_layout = ALL_LOD_IN_EACH_SLICE;
+ break;
+  case INTEL_MSAA_LAYOUT_UMS:
+  case INTEL_MSAA_LAYOUT_CMS:
+ mt-array_layout = ALL_SLICES_AT_EACH_LOD;
+ break;
+  }
}
 
if (target == GL_TEXTURE_CUBE_MAP) {
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] nir: Add a couple of simplifications of csel operations.

2015-02-20 Thread Matt Turner
The series is

Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] drivers/x11: add gallium include dirs to Makefile.am

2015-02-20 Thread Eric Anholt
Matt Turner matts...@gmail.com writes:

 On Fri, Feb 20, 2015 at 12:20 PM, Brian Paul bri...@vmware.com wrote:
 Fixes xlib driver build after e8c5cbfd921680c.
 ---
  src/mesa/drivers/x11/Makefile.am | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/mesa/drivers/x11/Makefile.am 
 b/src/mesa/drivers/x11/Makefile.am
 index 76a7cd4..c0596f8 100644
 --- a/src/mesa/drivers/x11/Makefile.am
 +++ b/src/mesa/drivers/x11/Makefile.am
 @@ -30,6 +30,8 @@ AM_CPPFLAGS = \
 -I$(top_srcdir)/src/mapi \
 -I$(top_srcdir)/src/mesa \
 -I$(top_srcdir)/src \
 +   -I$(top_srcdir)/src/gallium/include \
 +   -I$(top_srcdir)/src/gallium/auxiliary \

 I really dislike including these directories in a bunch of non-Gallium
 places. I should have said so more strongly when the patch went by.

 I'm not sure whether I want to spend my time to fix this the right way or 
 what.

 I guess just go ahead and commit this until something better comes along.

I'm sympathetic, it's just the feeling of if I can't get incremental
changes towards sharing, we'll never get anywhere and we'll be stuck
with this code duplication we've had for the last 5 years.

p_compiler.h looks like something we should be able to get out into
util/, at which point we could drop these include dirs.  It's on my list
still, but I do really want to land my NIR work.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Have configure define NDEBUG, not mtypes.h.

2015-02-20 Thread Kenneth Graunke
On Friday, February 20, 2015 12:44:53 PM Matt Turner wrote:
 mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
 defined. Confusing and bizarre that you don't get NDEBUG if you don't
 include mtypes.h.
 
 ... which is just what happened in commit bef38f62e.
 
 Let's let configure define this for us if not using --enable-debug.
 ---
  configure.ac   | 2 ++
  src/mesa/main/mtypes.h | 3 ---
  2 files changed, 2 insertions(+), 3 deletions(-)
 
 diff --git a/configure.ac b/configure.ac
 index beb7a7d..5fbb7bc 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -370,6 +370,8 @@ if test x$enable_debug = xyes; then
  CXXFLAGS=$CXXFLAGS -O0
  fi
  fi
 +else
 +   DEFINES=$DEFINES -DNDEBUG
  fi
  
  dnl
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index 05b5a81..6e99773 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -4512,9 +4512,6 @@ extern int MESA_DEBUG_FLAGS;
  # define MESA_VERBOSE 0
  # define MESA_DEBUG_FLAGS 0
  # define MESA_FUNCTION a function
 -# ifndef NDEBUG
 -#  define NDEBUG
 -# endif
  #endif
  
  
 

It looks like scons/gallium.py already defines DEBUG and NDEBUG:

cppdefines = []
if env['build'] in ('debug', 'checked'):
cppdefines += ['DEBUG']
else:
cppdefines += ['NDEBUG']

So this may not even break the SCons build.  If it does, there's a
fairly clear solution there.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] nir: Fix the Mesa build without -DDEBUG.

2015-02-20 Thread Kenneth Graunke
With -DDEBUG -UNDEBUG, this assert uses reg_state::stack_size, which
doesn't exist, breaking the build:

assert(state-states[index].index  state-states[index].stack_size);

Switch it to ifndef NDEBUG, so the field will exist if the assertion
actually generates code.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/nir/nir_to_ssa.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_to_ssa.c b/src/glsl/nir/nir_to_ssa.c
index dbe1699..47cf453 100644
--- a/src/glsl/nir/nir_to_ssa.c
+++ b/src/glsl/nir/nir_to_ssa.c
@@ -134,7 +134,7 @@ typedef struct {
nir_ssa_def **stack;
int index;
unsigned num_defs; /**  used to add indices to debug names */
-#ifdef DEBUG
+#ifndef NDEBUG
unsigned stack_size;
 #endif
 } reg_state;
@@ -489,7 +489,7 @@ init_rewrite_state(nir_function_impl *impl, rewrite_state 
*state)
  state-states[reg-index].stack = ralloc_array(state-states,
 nir_ssa_def *,
 stack_size);
-#ifdef DEBUG
+#ifndef NDEBUG
  state-states[reg-index].stack_size = stack_size;
 #endif
  state-states[reg-index].index = -1;
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Eric Anholt
Matt Turner matts...@gmail.com writes:

 On Fri, Feb 20, 2015 at 11:54 AM, Eric Anholt e...@anholt.net wrote:
 I wanted patch #1 to land, so I took a look at this one :)

 Thanks! :)

 Matt Turner matts...@gmail.com writes:
 +   if (brw-gen = 6) {
 +  /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
 +  fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
 +
 +  /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
 +   *
 +   *or(8)  tmp.12W  g0.00,1,0W  0x3f80W
 +   *and(8) dst1Dtmp8,8,1D   0xbf80D
 +   *
 +   * and negate g0.00,1,0W for (gl_FrontFacing ? -1.0 : 1.0).
 +   */
 +
 +  if (value1-f[0] == -1.0f) {
 + g0.negate = true;
 +  }

 Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
 still going to get 0 after negation, right?

 That's a good question. I'm not sure. The bits below it are

 13: Source Depth Present to Render Target.
 12: oMask to Render Target
 11: Source0 Alpha Present to RenderTarget.
 8:6: Starting Sample Pair Index

 BDW adds some additional fields as well.

 The others are ignored. It's not clear to me that at least one of
 the defined bits is guaranteed to be zero. It's no guarantee or
 anything, but FWIW without realizing it we were depending on some bit
 being non-zero for the frontfacing optimizations that used ASR as well
 (commits d1c43ed4, 19c6617a) and haven't seen any problems from it. So
 if there's a problem... we're not making it worse in this commit...

 The simulator seems to set some bits in the ignored fields, but I
 don't have any explanation for that, nor is that evidence that we can
 rely on the hardware to do the same.

 Or maybe I'm just wrong and some bit is guaranteed to be set?

A This negation looks like it's safe in practice, because bits 0:4 will
surely be TRIANGLES comment seems fine with me.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

2015-02-20 Thread Jordan Justen
The brw_imm_ud will yield a HW_REG which then will introduce a barrier
for certain optimization opportunities.

No piglit regressions seen with gen8 (simd8vs).

Suggested-by: Matt Turner matts...@gmail.com
Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
Cc: Matt Turner matts...@gmail.com
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index fa7d32c..b1b75821 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -3016,7 +3016,7 @@ fs_visitor::emit_untyped_atomic(unsigned atomic_op, 
unsigned surf_index,
*/
   assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
   emit(MOV(component(sources[0], 7),
-   brw_imm_ud(0x)))-force_writemask_all = true;
+   fs_reg(0x)))-force_writemask_all = true;
}
length++;
 
@@ -3079,7 +3079,7 @@ fs_visitor::emit_untyped_surface_read(unsigned 
surf_index, fs_reg dst,
*/
   assert(stage == MESA_SHADER_VERTEX || stage == MESA_SHADER_COMPUTE);
   emit(MOV(component(sources[0], 7),
-   brw_imm_ud(0x)))-force_writemask_all = true;
+   fs_reg(0x)))-force_writemask_all = true;
}
 
/* Set the surface read offset. */
-- 
2.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] nir: Compilation error in nir/nir_to_ssa.c

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:23 PM, Dieter Nützel die...@nuetzel-hh.de wrote:
 make[3]: Entering directory '/opt/mesa/src/glsl'
   CC   nir/nir_to_ssa.lo
 In file included from nir/../glsl_types.h:30:0,
  from nir/nir_types.h:32,
  from nir/nir.h:36,
  from nir/nir_to_ssa.c:28:
 nir/nir_to_ssa.c: In function 'rewrite_def_forwards':
 nir/nir_to_ssa.c:226:60: error: 'reg_state' has no member named 'stack_size'
 assert(state-states[index].index  state-states[index].stack_size);
 ^
 Makefile:1602: recipe for target 'nir/nir_to_ssa.lo' failed
 make[3]: *** [nir/nir_to_ssa.lo] Error 1

 It's only defined for DEBUG.

 -Dieter

Yep. There are two patches on the list to fix this.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] drivers/x11: add gallium include dirs to Makefile.am

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 12:20 PM, Brian Paul bri...@vmware.com wrote:
 Fixes xlib driver build after e8c5cbfd921680c.
 ---
  src/mesa/drivers/x11/Makefile.am | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/src/mesa/drivers/x11/Makefile.am 
 b/src/mesa/drivers/x11/Makefile.am
 index 76a7cd4..c0596f8 100644
 --- a/src/mesa/drivers/x11/Makefile.am
 +++ b/src/mesa/drivers/x11/Makefile.am
 @@ -30,6 +30,8 @@ AM_CPPFLAGS = \
 -I$(top_srcdir)/src/mapi \
 -I$(top_srcdir)/src/mesa \
 -I$(top_srcdir)/src \
 +   -I$(top_srcdir)/src/gallium/include \
 +   -I$(top_srcdir)/src/gallium/auxiliary \

I really dislike including these directories in a bunch of non-Gallium
places. I should have said so more strongly when the patch went by.

I'm not sure whether I want to spend my time to fix this the right way or what.

I guess just go ahead and commit this until something better comes along.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:41 PM, Eric Anholt e...@anholt.net wrote:
 Or maybe I'm just wrong and some bit is guaranteed to be set?

 A This negation looks like it's safe in practice, because bits 0:4 will
 surely be TRIANGLES comment seems fine with me.

Thanks, will do. R-b?

I realized I was looking at Vol4/Part1 which described the render
target write message header, which much the same description but not
the actual incoming thread dispatch payload. Vol2/Part1 has the info I
want.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/6] i965/skl: Fix the qpitch value

2015-02-20 Thread Neil Roberts
On Skylake the qpitch value is uploaded as part of the surface state
so we don't need to add the extra rows that are done for other
generations. However for 3D textures it needs to be aligned to the
tile height. Unlike previous generations the qpitch is measured as a
multiple of the block size for compressed surfaces. When the
horizontal mipmap layout is used for 1D textures then the qpitch is
measured in pixels instead of rows.
---
 src/mesa/drivers/dri/i965/brw_tex_layout.c| 44 +--
 src/mesa/drivers/dri/i965/intel_mipmap_tree.h | 10 --
 2 files changed, 41 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_tex_layout.c 
b/src/mesa/drivers/dri/i965/brw_tex_layout.c
index 851742b..d89a04e 100644
--- a/src/mesa/drivers/dri/i965/brw_tex_layout.c
+++ b/src/mesa/drivers/dri/i965/brw_tex_layout.c
@@ -297,24 +297,48 @@ static void
 brw_miptree_layout_texture_array(struct brw_context *brw,
 struct intel_mipmap_tree *mt)
 {
-   int h0, h1;
unsigned height = mt-physical_height0;
bool layout_1d = use_linear_1d_layout(brw, mt);
-
-   h0 = ALIGN(mt-physical_height0, mt-align_h);
-   h1 = ALIGN(minify(mt-physical_height0, 1), mt-align_h);
-   if (mt-array_layout == ALL_SLICES_AT_EACH_LOD)
-  mt-qpitch = h0;
-   else
-  mt-qpitch = (h0 + h1 + (brw-gen = 7 ? 12 : 11) * mt-align_h);
-
-   int physical_qpitch = mt-compressed ? mt-qpitch / 4 : mt-qpitch;
+   int physical_qpitch;
 
if (layout_1d)
   gen9_miptree_layout_1d(mt);
else
   brw_miptree_layout_2d(mt);
 
+   if (layout_1d) {
+  physical_qpitch = mt-align_h;
+  /* When using the horizontal layout the qpitch is measured in pixels */
+  mt-qpitch = physical_qpitch * mt-total_width;
+   } else if (brw-gen = 9) {
+  /* On Gen9 we can pick whatever qpitch we like as long as it's aligned
+   * to the vertical alignment so we don't need to add any extra rows.
+   */
+  mt-qpitch = mt-total_height;
+
+  /* However 3D textures need to be aligned to the tile height. At this
+   * point we don't know which tiling will be used so let's just align it
+   * to 32
+   */
+  if (mt-target == GL_TEXTURE_3D)
+ mt-qpitch = ALIGN(mt-qpitch, 32);
+
+  /* Unlike previous generations the qpitch is now a multiple of the
+   * compressed block size so physical_qpitch matches mt-qpitch.
+   */
+  physical_qpitch = mt-qpitch;
+   } else {
+  int h0 = ALIGN(mt-physical_height0, mt-align_h);
+  int h1 = ALIGN(minify(mt-physical_height0, 1), mt-align_h);
+
+  if (mt-array_layout == ALL_SLICES_AT_EACH_LOD)
+ mt-qpitch = h0;
+  else
+ mt-qpitch = (h0 + h1 + (brw-gen = 7 ? 12 : 11) * mt-align_h);
+
+  physical_qpitch = mt-compressed ? mt-qpitch / 4 : mt-qpitch;
+   }
+
for (unsigned level = mt-first_level; level = mt-last_level; level++) {
   unsigned img_height;
   img_height = ALIGN(height, mt-align_h);
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
index ee9cf1e..247e5b8 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.h
@@ -380,10 +380,14 @@ struct intel_mipmap_tree
enum miptree_array_layout array_layout;
 
/**
-* The distance in rows between array slices in an uncompressed surface.
+* The distance in between array slices.
 *
-* For compressed surfaces, slices are stored closer together physically;
-* the real distance is (qpitch / block height).
+* The value is the one that is sent in the surface state. The actual
+* meaning depends on certain criteria. Usually it is simply the number of
+* uncompressed rows between each slice. However on Gen9+ for compressed
+* surfaces it is the number of blocks. For 1D surfaces that have the
+* mipmap tree stored horizontally it is the number of pixels between each
+* slice.
 */
uint32_t qpitch;
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Have configure define NDEBUG, not mtypes.h.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 12:44 PM, Matt Turner matts...@gmail.com wrote:
 mtypes.h had been defining NDEBUG (used by assert) if DEBUG was not
 defined. Confusing and bizarre that you don't get NDEBUG if you don't
 include mtypes.h.

 ... which is just what happened in commit bef38f62e.

 Let's let configure define this for us if not using --enable-debug.
 ---

Scons will probably need similar treatment.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 23/23] main: Cosmetic changes to GetBufferSubData.

2015-02-20 Thread Martin Peres

On 20/02/2015 19:58, Laura Ekstrand wrote:
Ian has asked that this not be squashed.  In fact, they were the same 
commit last time around on the ML.


Ok. Sorry for the noise then :s



On Fri, Feb 20, 2015 at 12:29 AM, Martin Peres martin.pe...@free.fr 
mailto:martin.pe...@free.fr wrote:


Please squash this commit with the one introducing GetBufferSubData.

It should be pretty easy to do with git rebase -i :)


On 12/02/2015 04:06, Laura Ekstrand wrote:

---
  src/mesa/main/bufferobj.c | 2 +-
  src/mesa/main/bufferobj.h | 4 ++--
  2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 0272704..38d8b5a 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -1668,7 +1668,7 @@ _mesa_GetBufferSubData(GLenum target,
GLintptr offset,
 }
   ASSERT(ctx-Driver.GetBufferSubData);
-   ctx-Driver.GetBufferSubData( ctx, offset, size, data,
bufObj );
+   ctx-Driver.GetBufferSubData(ctx, offset, size, data, bufObj);
  }
void GLAPIENTRY
diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
index feeaa8b..b5d73ae 100644
--- a/src/mesa/main/bufferobj.h
+++ b/src/mesa/main/bufferobj.h
@@ -230,8 +230,8 @@ _mesa_NamedBufferSubData(GLuint buffer,
GLintptr offset,
   GLsizeiptr size, const GLvoid *data);
void GLAPIENTRY
-_mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
-   GLsizeiptrARB size, void * data);
+_mesa_GetBufferSubData(GLenum target, GLintptr offset,
+   GLsizeiptr size, GLvoid *data);
void GLAPIENTRY
  _mesa_GetNamedBufferSubData(GLuint buffer, GLintptr offset,





___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] nir: Compilation error in nir/nir_to_ssa.c

2015-02-20 Thread Dieter Nützel

make[3]: Entering directory '/opt/mesa/src/glsl'
  CC   nir/nir_to_ssa.lo
In file included from nir/../glsl_types.h:30:0,
 from nir/nir_types.h:32,
 from nir/nir.h:36,
 from nir/nir_to_ssa.c:28:
nir/nir_to_ssa.c: In function 'rewrite_def_forwards':
nir/nir_to_ssa.c:226:60: error: 'reg_state' has no member named 
'stack_size'
assert(state-states[index].index  
state-states[index].stack_size);

^
Makefile:1602: recipe for target 'nir/nir_to_ssa.lo' failed
make[3]: *** [nir/nir_to_ssa.lo] Error 1

It's only defined for DEBUG.

-Dieter
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 89245] [bisected] Drop dependency on mtypes.h for core NIR patch broke Gallium builds

2015-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89245

Bug ID: 89245
   Summary: [bisected] Drop dependency on mtypes.h for core NIR
patch broke Gallium builds
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: All
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: b.bel...@gmail.com
QA Contact: mesa-dev@lists.freedesktop.org
CC: e...@anholt.net

The patch nir: Drop dependency on mtypes.h for core NIR. (commit
bef38f62e026) broke my Gallium (r600g) build.


I build with:
./autogen.sh --with-gallium-drivers=r600 --with-dri-drivers=
--enable-texture-float --disable-dri3 --disable-r600-llvm-compiler
--disable-gallium-llvm CFLAGS=-O2 -march=native CXXFLAGS=-O2 -march=native
--libdir=/usr/local/lib64 --prefix=/usr/local


Here is the error:
[...]
  CC   nir/nir_split_var_copies.lo
  CC   nir/nir_to_ssa.lo
  CC   nir/nir_validate.lo
  CC   nir/nir_worklist.lo
In file included from nir/../glsl_types.h:30:0,
 from nir/nir_types.h:32,
 from nir/nir.h:36,
 from nir/nir_to_ssa.c:28:
nir/nir_to_ssa.c: In function 'rewrite_def_forwards':
nir/nir_to_ssa.c:226:60: error: 'reg_state' has no member named 'stack_size'
assert(state-states[index].index  state-states[index].stack_size);
^
Makefile:1602: recipe for target 'nir/nir_to_ssa.lo' failed
make[3]: *** [nir/nir_to_ssa.lo] Error 1
make[3]: *** Attente des tâches non terminées
make[3]: Leaving directory '/home/benjamin/MESA/mesa/src/glsl'
Makefile:1209: recipe for target 'all' failed
make[2]: *** [all] Error 2
make[2]: Leaving directory '/home/benjamin/MESA/mesa/src/glsl'
Makefile:658: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/benjamin/MESA/mesa/src'
Makefile:601: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Use fs_reg for CS/VS atomics pixel mask immediate data

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:14 PM, Jordan Justen
jordan.l.jus...@intel.com wrote:
 The brw_imm_ud will yield a HW_REG which then will introduce a barrier
 for certain optimization opportunities.

 No piglit regressions seen with gen8 (simd8vs).

 Suggested-by: Matt Turner matts...@gmail.com
 Signed-off-by: Jordan Justen jordan.l.jus...@intel.com
 Cc: Matt Turner matts...@gmail.com

By the way, there are some more in gen6_gs_visitor.cpp if you'd like
to clean those up as well.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] nir: Allow nir_opt_algebraic to see booleanness through , ||, !.

2015-02-20 Thread Connor Abbott
On Fri, Feb 20, 2015 at 3:04 PM, Eric Anholt e...@anholt.net wrote:
 We have some useful optimizations to drop things like 'ine a, 0' on a
 boolean argument, but if 'a' came from logical operations on bools, it
 couldn't tell.  These kinds of constructs appear as a result of TGSI-NIR
 quite frequently (at least with if flattening), so being a little more
 aggressive in detecting booleans can pay off.

 vc4 results:
 total instructions in shared programs: 40207 - 39881 (-0.81%)
 instructions in affected programs: 6677 - 6351 (-4.88%)
 ---
  src/glsl/nir/nir_search.c | 29 -
  1 file changed, 28 insertions(+), 1 deletion(-)

 diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
 index 4671931..d12b6ab 100644
 --- a/src/glsl/nir/nir_search.c
 +++ b/src/glsl/nir/nir_search.c
 @@ -39,6 +39,32 @@ match_expression(const nir_search_expression *expr, 
 nir_alu_instr *instr,

  static const uint8_t identity_swizzle[] = { 0, 1, 2, 3 };

 +static bool alu_instr_is_bool(nir_alu_instr *instr);
 +
 +static bool
 +src_is_bool(nir_src src)
 +{
 +   if (!src.is_ssa)
 +  return false;
 +   if (src.ssa-parent_instr-type != nir_instr_type_alu)
 +  return false;
 +   return alu_instr_is_bool((nir_alu_instr *)src.ssa-parent_instr);
 +}
 +
 +static bool
 +alu_instr_is_bool(nir_alu_instr *instr)
 +{
 +   switch (instr-op) {
 +   case nir_op_iand:
 +   case nir_op_ior:

Add nir_op_ixor here?

 +  return src_is_bool(instr-src[0].src)  
 src_is_bool(instr-src[1].src);
 +   case nir_op_inot:
 +  return src_is_bool(instr-src[0].src);
 +   default:
 +  return nir_op_infos[instr-op].output_type == nir_type_bool;
 +   }
 +}
 +
  static bool
  match_value(const nir_search_value *value, nir_alu_instr *instr, unsigned 
 src,
  unsigned num_components, const uint8_t *swizzle,
 @@ -89,7 +115,8 @@ match_value(const nir_search_value *value, nir_alu_instr 
 *instr, unsigned src,
  nir_alu_instr *src_alu =
 nir_instr_as_alu(instr-src[src].src.ssa-parent_instr);

 -if (nir_op_infos[src_alu-op].output_type != var-type)
 +if (nir_op_infos[src_alu-op].output_type != var-type 
 +!(var-type == nir_type_bool  alu_instr_is_bool(src_alu)))
 return false;
   }

 --
 2.1.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 89245] [bisected] Drop dependency on mtypes.h for core NIR patch broke Gallium builds

2015-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89245

Benjamin Bellec b.bel...@gmail.com changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 CC||b.bel...@gmail.com
 Resolution|--- |FIXED

--- Comment #1 from Benjamin Bellec b.bel...@gmail.com ---
Fixed by commit b6393d70402a60c124c1884d8d0cc1dc6a9b4ca5 and/or
b6393d70402a60c124c1884d8d0cc1dc6a9b4ca5

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] i965: Introduce the BROADCAST pseudo-opcode.

2015-02-20 Thread Ian Romanick
I want to be sure I understand what this series is doing and why it's
necessary.

Sampler arrays and UBO arrays can, with some restrictions, be indexed
dynamically.  The primary restriction is that any active SIMD channels
that access the array must access the same index.  When you get to a
particular access, not all SIMD channels may be active.  This is the
essence of what the GLSL spec calls dynamically uniform.

The implication is that the compiler is allowed to pick the index from
any active SIMD channel.  If they're not all the same, the shader author
should have followed the rules.  Currently i965 picks the index from a
SIMD channel that may or may not be active (due to other, non-uniform
flow control).  Correct?

This patch series attempts to fix that by first introducing some opcodes
to get a value from one of the active SIMD channels, then using those
opcodes to pick an array index from one of the active SIMD channels.
Correct?

On 02/20/2015 11:48 AM, Francisco Jerez wrote:
 The BROADCAST instruction picks the channel from its first source
 given by an index passed in as second source.  This will be used in
 situations where all channels from the same SIMD thread have to agree
 on the value of something, e.g. a surface binding table index.
 ---
  src/mesa/drivers/dri/i965/brw_defines.h  |  6 ++
  src/mesa/drivers/dri/i965/brw_eu.h   |  6 ++
  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 77 
 
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 ++
  src/mesa/drivers/dri/i965/brw_shader.cpp |  3 +
  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  4 ++
  6 files changed, 100 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 17c27dd..d4930e3 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -911,6 +911,12 @@ enum opcode {
  
 SHADER_OPCODE_URB_WRITE_SIMD8,
  
 +   /**
 +* Pick the channel from its first source register given by the index
 +* specified as second source.  Useful for variable indexing of surfaces.
 +*/
 +   SHADER_OPCODE_BROADCAST,
 +
 VEC4_OPCODE_MOV_BYTES,
 VEC4_OPCODE_PACK_BYTES,
 VEC4_OPCODE_UNPACK_UNIFORM,
 diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
 b/src/mesa/drivers/dri/i965/brw_eu.h
 index a94ea42..2505480 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu.h
 +++ b/src/mesa/drivers/dri/i965/brw_eu.h
 @@ -413,6 +413,12 @@ brw_pixel_interpolator_query(struct brw_compile *p,
   unsigned msg_length,
   unsigned response_length);
  
 +void
 +brw_broadcast(struct brw_compile *p,
 +  struct brw_reg dst,
 +  struct brw_reg src,
 +  struct brw_reg idx);
 +
  /***
   * brw_eu_util.c:
   */
 diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
 b/src/mesa/drivers/dri/i965/brw_eu_emit.c
 index 1d6fd67..d7e3995 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
 +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
 @@ -2854,6 +2854,83 @@ brw_pixel_interpolator_query(struct brw_compile *p,
 brw_inst_set_pi_message_data(brw, insn, data);
  }
  
 +void
 +brw_broadcast(struct brw_compile *p,
 +  struct brw_reg dst,
 +  struct brw_reg src,
 +  struct brw_reg idx)
 +{
 +   const struct brw_context *brw = p-brw;
 +   const bool align1 = (brw_inst_access_mode(brw, p-current) == 
 BRW_ALIGN_1);
 +   brw_inst *inst;
 +
 +   assert(src.file == BRW_GENERAL_REGISTER_FILE 
 +  src.address_mode == BRW_ADDRESS_DIRECT);
 +
 +   if ((src.vstride == 0  (src.hstride == 0 || !align1)) ||
 +   idx.file == BRW_IMMEDIATE_VALUE) {
 +  /* Trivial, the source is already uniform or the index is a constant.
 +   * We will typically not get here if the optimizer is doing its job, 
 but
 +   * asserting would be mean.
 +   */
 +  const unsigned i = (idx.file == BRW_IMMEDIATE_VALUE ? idx.dw1.ud : 0);
 +  brw_MOV(p, dst,
 +  (align1 ? stride(suboffset(src, i), 0, 1, 0) :
 +   stride(suboffset(src, 4 * i), 0, 4, 1)));
 +
 +   } else {
 +  if (align1) {
 + const struct brw_reg addr =
 +retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
 + const unsigned offset = src.nr * REG_SIZE + src.subnr;
 + /* Limit in bytes of the signed indirect addressing immediate. */
 + const unsigned limit = 512;
 +
 + brw_push_insn_state(p);
 + brw_set_default_mask_control(p, BRW_MASK_DISABLE);
 + brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
 +
 + /* Take into account the component size and horizontal stride. */
 + assert(src.vstride == src.hstride + src.width);
 + brw_SHL(p, addr, vec1(idx),
 + brw_imm_ud(_mesa_logbase2(type_sz(src.type)) +
 +   

Re: [Mesa-dev] [PATCH 06/16] main: Added entry point for glTransformFeedbackBufferRange

2015-02-20 Thread Laura Ekstrand
On Mon, Feb 16, 2015 at 6:13 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 v2: review from Laura Ekstrand
 - use the refactored code to lookup the objects
 - improve some error messages
 - factor out the gl method name computation
 - better handle the spec differences between the DSA and non-DSA cases
 - quote the spec a little more

 v3: review from Laura Ekstrand
 - use the new name of _mesa_lookup_bufferobj_err
 - swap the comments around the offset and size checks

 v4: review from Laura Ekstrand
 - add more spec quotes
 - properly fix the comments around the offset and size checks

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  8 +++
  src/mesa/main/bufferobj.c  |  3 +-
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  src/mesa/main/transformfeedback.c  | 98
 +-
  src/mesa/main/transformfeedback.h  |  6 +-
  5 files changed, 97 insertions(+), 19 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 35d6906..b3c090f 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -20,6 +20,14 @@
param name=buffer type=GLuint /
 /function

 +   function name=TransformFeedbackBufferRange offset=assign
 +  param name=xfb type=GLuint /
 +  param name=index type=GLuint /
 +  param name=buffer type=GLuint /
 +  param name=offset type=GLintptr /
 +  param name=size type=GLsizeiptr /
 +   /function
 +
 !-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 86532ea..7558e17 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -3548,7 +3548,8 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
 case GL_TRANSFORM_FEEDBACK_BUFFER:
_mesa_bind_buffer_range_transform_feedback(ctx,

 ctx-TransformFeedback.CurrentObject,
 - index, bufObj, offset,
 size);
 + index, bufObj, offset,
 size,
 + false);
return;
 case GL_UNIFORM_BUFFER:
bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
 diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
 b/src/mesa/main/tests/dispatch_sanity.cpp
 index 183755f..87f7d6f 100644
 --- a/src/mesa/main/tests/dispatch_sanity.cpp
 +++ b/src/mesa/main/tests/dispatch_sanity.cpp
 @@ -957,6 +957,7 @@ const struct function gl_core_functions_possible[] = {
 /* GL_ARB_direct_state_access */
 { glCreateTransformFeedbacks, 45, -1 },
 { glTransformFeedbackBufferBase, 45, -1 },
 +   { glTransformFeedbackBufferRange, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 },
 diff --git a/src/mesa/main/transformfeedback.c
 b/src/mesa/main/transformfeedback.c
 index d932943..2dded21 100644
 --- a/src/mesa/main/transformfeedback.c
 +++ b/src/mesa/main/transformfeedback.c
 @@ -541,7 +541,8 @@ bind_buffer_range(struct gl_context *ctx,
  /**
   * Specify a buffer object to receive transform feedback results.  Plus,
   * specify the starting offset to place the results, and max size.
 - * Called from the glBindBufferRange() function.
 + * Called from the glBindBufferRange() and glTransformFeedbackBufferRange
 + * functions.
   */
  void
  _mesa_bind_buffer_range_transform_feedback(struct gl_context *ctx,
 @@ -549,35 +550,74 @@ _mesa_bind_buffer_range_transform_feedback(struct
 gl_context *ctx,
 GLuint index,
 struct gl_buffer_object
 *bufObj,
 GLintptr offset,
 -   GLsizeiptr size)
 +   GLsizeiptr size,
 +   bool dsa)
  {
 +   const char *gl_methd_name;
 +   if (dsa)
 +  gl_methd_name = glTransformFeedbackBufferRange;
 +   else
 +  gl_methd_name = glBindBufferRange;
 +
 +
 if (obj-Active) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION,
 -  glBindBufferRange(transform feedback active));
 +  _mesa_error(ctx, GL_INVALID_OPERATION, %s(transform feedback
 active),
 +  gl_methd_name);
return;
 }

 if (index = ctx-Const.MaxTransformFeedbackBuffers) {
 -  _mesa_error(ctx, GL_INVALID_VALUE, glBindBufferRange(index=%d 
 -  out of bounds), index);
 +  /* OpenGL 4.5 core profile, 6.1, pdf page 82: An INVALID_VALUE
 error is
 +   * generated if index is greater than or equal to the number of
 binding
 +   * points for transform feedback, as described in section 6.7.1.

Again,  would be 

Re: [Mesa-dev] [PATCH 2/4] common: Correct PBO 2D_ARRAY handling.

2015-02-20 Thread Jason Ekstrand
This is mostly correct and it's a good solution.  The only problem is that
it doesn't handle the skipRows packing property properly.  This property
tells the driver the stride (in rows) between image planes in the data.
Most of the places where depth is used below, we should be using the stride
(in rows) between images.  Look at the _mesa_get_image_stride familiy of
functions to see exactly how to handle this.

On Fri, Feb 20, 2015 at 4:30 PM, Laura Ekstrand la...@jlekstrand.net
wrote:

 Changes PBO uploads and downloads to use a tall (height * depth) 2D texture
 for blitting.  This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY
 textures are not properly uploaded and downloaded.
 ---
  src/mesa/drivers/common/meta_tex_subimage.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index f4f7716..ee3295b 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -110,7 +110,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,
 internal_format = _mesa_get_format_base_format(pbo_format);

 tex_image = _mesa_get_tex_image(ctx, tex_obj, tex_obj-Target, 0);
 -   _mesa_init_teximage_fields(ctx, tex_image, width, height, depth,
 +   _mesa_init_teximage_fields(ctx, tex_image, width, height * depth, 1,
0, internal_format, pbo_format);


 read_only = pbo_target == GL_PIXEL_UNPACK_BUFFER;
 @@ -227,7 +227,7 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx,
 GLuint dims,
_mesa_update_state(ctx);

_mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
 - 0, 0, width, height,
 + 0, z * height, width, (z + 1) * height,
   xoffset, yoffset,
   xoffset + width, yoffset + height,
   GL_COLOR_BUFFER_BIT, GL_NEAREST);
 @@ -349,7 +349,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx,
 GLuint dims,
_mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
   xoffset, yoffset,
   xoffset + width, yoffset + height,
 - 0, 0, width, height,
 + 0, z * height, width, (z + 1) * height,
   GL_COLOR_BUFFER_BIT, GL_NEAREST);
 }

 --
 2.1.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] i965: Introduce the BROADCAST pseudo-opcode.

2015-02-20 Thread Francisco Jerez
Ian Romanick i...@freedesktop.org writes:

 I want to be sure I understand what this series is doing and why it's
 necessary.

 Sampler arrays and UBO arrays can, with some restrictions, be indexed
 dynamically.  The primary restriction is that any active SIMD channels
 that access the array must access the same index.  When you get to a
 particular access, not all SIMD channels may be active.  This is the
 essence of what the GLSL spec calls dynamically uniform.

 The implication is that the compiler is allowed to pick the index from
 any active SIMD channel.  If they're not all the same, the shader author
 should have followed the rules.  Currently i965 picks the index from a
 SIMD channel that may or may not be active (due to other, non-uniform
 flow control).  Correct?

Yeah, the problem is that right now we incorrectly assume that the first
channel of the indexing expression is always live, if it's not we end up
using garbage for the binding table index.

 This patch series attempts to fix that by first introducing some opcodes
 to get a value from one of the active SIMD channels, then using those
 opcodes to pick an array index from one of the active SIMD channels.
 Correct?

Yeah, that's right.  FIND_LIVE_CHANNEL returns the index of an arbitrary
live channel and BROADCAST uses indirect register addressing to fetch
the value from that channel and write it to the destination.  The same
opcodes will be used for dynamically uniform indexing of image arrays
too.

 On 02/20/2015 11:48 AM, Francisco Jerez wrote:
 The BROADCAST instruction picks the channel from its first source
 given by an index passed in as second source.  This will be used in
 situations where all channels from the same SIMD thread have to agree
 on the value of something, e.g. a surface binding table index.
 ---
  src/mesa/drivers/dri/i965/brw_defines.h  |  6 ++
  src/mesa/drivers/dri/i965/brw_eu.h   |  6 ++
  src/mesa/drivers/dri/i965/brw_eu_emit.c  | 77 
 
  src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 ++
  src/mesa/drivers/dri/i965/brw_shader.cpp |  3 +
  src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  4 ++
  6 files changed, 100 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
 b/src/mesa/drivers/dri/i965/brw_defines.h
 index 17c27dd..d4930e3 100644
 --- a/src/mesa/drivers/dri/i965/brw_defines.h
 +++ b/src/mesa/drivers/dri/i965/brw_defines.h
 @@ -911,6 +911,12 @@ enum opcode {
  
 SHADER_OPCODE_URB_WRITE_SIMD8,
  
 +   /**
 +* Pick the channel from its first source register given by the index
 +* specified as second source.  Useful for variable indexing of surfaces.
 +*/
 +   SHADER_OPCODE_BROADCAST,
 +
 VEC4_OPCODE_MOV_BYTES,
 VEC4_OPCODE_PACK_BYTES,
 VEC4_OPCODE_UNPACK_UNIFORM,
 diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
 b/src/mesa/drivers/dri/i965/brw_eu.h
 index a94ea42..2505480 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu.h
 +++ b/src/mesa/drivers/dri/i965/brw_eu.h
 @@ -413,6 +413,12 @@ brw_pixel_interpolator_query(struct brw_compile *p,
   unsigned msg_length,
   unsigned response_length);
  
 +void
 +brw_broadcast(struct brw_compile *p,
 +  struct brw_reg dst,
 +  struct brw_reg src,
 +  struct brw_reg idx);
 +
  /***
   * brw_eu_util.c:
   */
 diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
 b/src/mesa/drivers/dri/i965/brw_eu_emit.c
 index 1d6fd67..d7e3995 100644
 --- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
 +++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
 @@ -2854,6 +2854,83 @@ brw_pixel_interpolator_query(struct brw_compile *p,
 brw_inst_set_pi_message_data(brw, insn, data);
  }
  
 +void
 +brw_broadcast(struct brw_compile *p,
 +  struct brw_reg dst,
 +  struct brw_reg src,
 +  struct brw_reg idx)
 +{
 +   const struct brw_context *brw = p-brw;
 +   const bool align1 = (brw_inst_access_mode(brw, p-current) == 
 BRW_ALIGN_1);
 +   brw_inst *inst;
 +
 +   assert(src.file == BRW_GENERAL_REGISTER_FILE 
 +  src.address_mode == BRW_ADDRESS_DIRECT);
 +
 +   if ((src.vstride == 0  (src.hstride == 0 || !align1)) ||
 +   idx.file == BRW_IMMEDIATE_VALUE) {
 +  /* Trivial, the source is already uniform or the index is a constant.
 +   * We will typically not get here if the optimizer is doing its job, 
 but
 +   * asserting would be mean.
 +   */
 +  const unsigned i = (idx.file == BRW_IMMEDIATE_VALUE ? idx.dw1.ud : 0);
 +  brw_MOV(p, dst,
 +  (align1 ? stride(suboffset(src, i), 0, 1, 0) :
 +   stride(suboffset(src, 4 * i), 0, 4, 1)));
 +
 +   } else {
 +  if (align1) {
 + const struct brw_reg addr =
 +retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
 + const unsigned offset = src.nr * REG_SIZE + src.subnr;
 

Re: [Mesa-dev] [PATCH 05/16] main: Added entry point for glTransformFeedbackBufferBase

2015-02-20 Thread Laura Ekstrand
You still seem to be confused about the names of internal functions.  If a
function exists outside of its class, (ie. is extern-ed from a *.h file),
it should have the _mesa in front.  If a function exists purely inside of
a *.c file, it should be static and not have _mesa.

Since _mesa_lookup_transform_feedback_object in transformfeedback.h is used
outside of transformfeedback.[c|h], it should have the _mesa in front and
not be static.  In this patch, you renamed it to
lookup_transform_feedback_object, which is incorrect.

On the other hand, the names that you gave to
lookup_transform_feedback_object_err and
lookup_transform_feedback_bufferobj_err are just fine because these
functions are static and never called outside transformfeedback.c.

On Mon, Feb 16, 2015 at 6:13 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 v2: Review from Laura Ekstrand
 - give more helpful error messages
 - factor the lookup code for the xfb and objBuf
 - replace some already-existing tabs with spaces
 - add comments to explain the cases where xfb == 0 or buffer == 0
 - fix the condition for binding the transform buffer or not

 v3: Review from Laura Ekstrand
 - rename _mesa_lookup_bufferobj_err to
   _mesa_lookup_transform_feedback_bufferobj_err and make it static to
 avoid a
   future conflict
 - make _mesa_lookup_transform_feedback_object_err static

 v4: Review from Laura Ekstrand
 - add the pdf page number when quoting the spec
 - rename some of the symbols to follow the public/private conventions

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |   6 +
  src/mesa/main/bufferobj.c  |   9 +-
  src/mesa/main/objectlabel.c|   2 +-
  src/mesa/main/tests/dispatch_sanity.cpp|   1 +
  src/mesa/main/transformfeedback.c  | 146
 +++--
  src/mesa/main/transformfeedback.h  |  13 ++-
  src/mesa/vbo/vbo_exec_array.c  |   8 +-
  7 files changed, 139 insertions(+), 46 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 15b00c2..35d6906 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -14,6 +14,12 @@
param name=ids type=GLuint * /
 /function

 +   function name=TransformFeedbackBufferBase offset=assign
 +  param name=xfb type=GLuint /
 +  param name=index type=GLuint /
 +  param name=buffer type=GLuint /
 +   /function
 +
 !-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 0c1ce98..86532ea 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -3546,8 +3546,9 @@ _mesa_BindBufferRange(GLenum target, GLuint index,

 switch (target) {
 case GL_TRANSFORM_FEEDBACK_BUFFER:
 -  _mesa_bind_buffer_range_transform_feedback(ctx, index, bufObj,
 -offset, size);
 +  _mesa_bind_buffer_range_transform_feedback(ctx,
 +
  ctx-TransformFeedback.CurrentObject,
 + index, bufObj, offset,
 size);
return;
 case GL_UNIFORM_BUFFER:
bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
 @@ -3611,7 +3612,9 @@ _mesa_BindBufferBase(GLenum target, GLuint index,
 GLuint buffer)

 switch (target) {
 case GL_TRANSFORM_FEEDBACK_BUFFER:
 -  _mesa_bind_buffer_base_transform_feedback(ctx, index, bufObj);
 +  _mesa_bind_buffer_base_transform_feedback(ctx,
 +
 ctx-TransformFeedback.CurrentObject,
 +index, bufObj, false);
return;
 case GL_UNIFORM_BUFFER:
bind_buffer_base_uniform_buffer(ctx, index, bufObj);
 diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c
 index 78df96b..19a7e59 100644
 --- a/src/mesa/main/objectlabel.c
 +++ b/src/mesa/main/objectlabel.c
 @@ -170,7 +170,7 @@ get_label_pointer(struct gl_context *ctx, GLenum
 identifier, GLuint name,
 case GL_TRANSFORM_FEEDBACK:
{
   struct gl_transform_feedback_object *tfo =
 -_mesa_lookup_transform_feedback_object(ctx, name);
 +lookup_transform_feedback_object(ctx, name);
   if (tfo)
  labelPtr = tfo-Label;
}
 diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
 b/src/mesa/main/tests/dispatch_sanity.cpp
 index bf320bf..183755f 100644
 --- a/src/mesa/main/tests/dispatch_sanity.cpp
 +++ b/src/mesa/main/tests/dispatch_sanity.cpp
 @@ -956,6 +956,7 @@ const struct function gl_core_functions_possible[] = {

 /* GL_ARB_direct_state_access */
 { glCreateTransformFeedbacks, 45, -1 },
 +   { glTransformFeedbackBufferBase, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 

Re: [Mesa-dev] [PATCH v5] mesa: use fi_type in vertex attribute code

2015-02-20 Thread Brian Paul

On 02/19/2015 11:42 AM, marius.pre...@intel.com wrote:

From: Marius Predut marius.pre...@intel.com

For 32-bit builds, floating point operations use x86 FPU registers,
not SSE registers.  If we're actually storing an integer in a float
variable, the value might get modified when written to memory.  This
patch changes the VBO code to use the fi_type (float/int union) to
store/copy vertex attributes.

Also, this can improve performance on x86 because moving floats with
integer registers instead of FP registers is faster.

Neil Roberts review:
- include changes on all places that are storing attribute values.
- check with and without -O3 compiler flag.
Brian Paul review:
- use fi_type type instead gl_constant_value type
- fix a bunch of nit-picks.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82668
Signed-off-by: Marius Predut marius.pre...@intel.com


Looks good.  Thanks.

Unfortunately, there's a few new compiler warnings after applying the 
patch.  I'll try to fix those up before I push the patch.  I'll also do 
some piglit testing...


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] glCompressedTex(Sub)Image(2,3) on mapped PBOs

2015-02-20 Thread Eduardo Lima Mitev
Hello,

While working on the following dEQP failing tests I ran into an issue
that I think deserves clarification from more experienced people:

dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target

All above tests fail because a CompressedTex(Sub)Image(2,3)D operation
is performed while there is a pixel buffer object bound to
GL_PIXEL_UNPACK_BUFFER and that pixel buffer object is currently mapped.
The tests expect this situation to throw an GL_INVALID_OPERATION but no
error is returned on i965.

In the spec GLES 3.1 spec, page 57, there is this: (and similar wording
in GL 4.5 spec, page 76)

6.3.2 Effects of Mapping Buffers on Other GL Commands

Any GL command which attempts to read from, write to, or change the
state of a buffer object may generate an INVALID_OPERATION error if all
or part of the buffer object is mapped. However, only commands which
explicitly describe this error are required to do so. If an error is not
generated, using such commands to perform invalid reads, writes, or
state changes will have undefined results and may result in GL
interruption or termination.

There is no further restriction in gl(Compressed)Tex(Sub)Image(2,3)D
that makes throwing an error mandatory, hence I suspect these tests
might be wrong.


However, I found that glTex(Sub)Image(2,3)D do emits an
GL_INVALID_OPERATION error when executed while a mapped PBO is bound.
But the check is performed in driver code (as opposed to API entry
points). (Concretely, in i965/intel_tex_image.c::intelTexImage() and
i965/intel_tex_subimage.c::intelTexSubImage() which both call
common/meta_tex_subimage.c::_mesa_meta_pbo_TexSubImage()).

So, the question is:

Shouldn't these operations on compressed texture images have the same
restriction as their non-compressed counter-parts?

Or is there a reason why this restriction is enforced only on
non-compressed images?

Also, shouldn't this restriction be moved to common API entry points, or
are there drivers that do write to PBOs while they are mapped?

cheers,
Eduardo
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] Mesa (master): tgsi: add support for flt64 constants

2015-02-20 Thread Brian Paul

On 02/19/2015 03:53 PM, Dave Airlie wrote:

Module: Mesa
Branch: master
Commit: fa43e0443e206740e219d45abefee65bdb2c3ecb
URL:
https://urldefense.proofpoint.com/v2/url?u=http-3A__cgit.freedesktop.org_mesa_mesa_commit_-3Fid-3Dfa43e0443e206740e219d45abefee65bdb2c3ecbd=AwIGaQc=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEsr=T0t4QG7chq2ZwJo6wilkFznRSFy-8uDKartPGbomVj8m=wliriAj2Oy285HDmqyh5ciHQpkhmeNqaYMAXn6TDVgos=QfqfqTGO4IS5Ib7ukIrfSIR3EGdcXECY38l_K3GlYloe=

Author: Dave Airlie airl...@redhat.com
Date:   Wed Aug 27 09:56:14 2014 +1000

tgsi: add support for flt64 constants

These act like flt32 except they take up two slots, and you
can only add 2 x flt64 constants in one slot.

The main reason they are different is we don't want to match half a flt64
constants against a flt32 constant in the matching code, we need to make
sure we treat both parts of the flt64 as an single structure.

Cleaned up printing/parsing by Ilia Mirkin imir...@alum.mit.edu

Reviewed-by: Ilia Mirkin imir...@alum.mit.edu
Signed-off-by: Dave Airlie airl...@redhat.com

---

  src/gallium/auxiliary/tgsi/tgsi_dump.c |8 +++
  src/gallium/auxiliary/tgsi/tgsi_parse.c|1 +
  src/gallium/auxiliary/tgsi/tgsi_strings.c  |5 +-
  src/gallium/auxiliary/tgsi/tgsi_strings.h  |2 +-
  src/gallium/auxiliary/tgsi/tgsi_text.c |   22 
  src/gallium/auxiliary/tgsi/tgsi_ureg.c |   75 ++--
  src/gallium/auxiliary/tgsi/tgsi_ureg.h |5 ++
  src/gallium/include/pipe/p_shader_tokens.h |1 +
  8 files changed, 113 insertions(+), 6 deletions(-)



[...]


diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index f965b01..5069d13 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -232,6 +232,24 @@ static boolean parse_float( const char **pcur, float *val )
 return TRUE;
  }

+static boolean parse_double( const char **pcur, uint32_t *val0, uint32_t *val1)
+{
+   const char *cur = *pcur;
+   union {
+  double dval;
+  uint32_t uval[2];
+   } v;
+
+   v.dval = strtod(cur, pcur);
+   if (*pcur == cur)
+  return FALSE;
+
+   *val0 = v.uval[0];
+   *val1 = v.uval[1];
+
+   return TRUE;
+}


Hi Dave, I'm seeing a new compiler warning in that code:

tgsi/tgsi_text.c: In function 'parse_double':
tgsi/tgsi_text.c:243:4: warning: passing argument 2 of 'strtod' from 
incompatible pointer type [enabled by default]

v.dval = strtod(cur, pcur);
^
In file included from ../../../src/gallium/include/pipe/p_compiler.h:36:0,
 from ./os/os_misc.h:38,
 from ./util/u_debug.h:42,
 from tgsi/tgsi_text.c:28:
/usr/include/stdlib.h:164:15: note: expected 'char ** restrict' but 
argument is of type 'const char **'

 extern double strtod (const char *__restrict __nptr,
   ^

Can you clean that up?

gcc 4.8.2

-Brian



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Copy-propagate vecN operations that are actually moves

2015-02-20 Thread Jason Ekstrand
On Thu, Feb 19, 2015 at 11:03 PM, Matt Turner matts...@gmail.com wrote:

 On Thu, Feb 19, 2015 at 11:01 PM, Connor Abbott cwabbo...@gmail.com
 wrote:
  I agree with Ken that the regressions are small enough, and it seems
  they're mostly stuff we can prevent by being smarter when doing the
  sel peephole, so it seems like the cleanup that will probably help
  other passes is worth it.

 So, usually we do that as a preparatory patch. Why aren't we doing that
 here?


Do what in a preparatory patch?  Fix up the sel peephole to be able to
handle if (foo) bar = baz;?  Sure, I can put that patch together.


 NIR instruction counts is not the metric we care about.


No, but cleaning things up means that we can do other optimizations
better.  Also, in each of those cases, the non-ssa NIR code was better it
was just less cleanable by the backend.  We need to work on that, but I
don't think it's an indicator of a problem.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] mesa: Add gallium include dirs to more parts of the tree.

2015-02-20 Thread Jose Fonseca

On 13/02/15 00:49, Eric Anholt wrote:

Jose Fonseca jfons...@vmware.com writes:


Thanks for doing this. I appreciate it.

I have no objection with the series.  I'm happy to see more reuse.  We
can always move things around later, and it will be much easier when
then are less entangled/duplicated.

We'll need to update SCons include paths too.  If you have a git repos
with your series that I can pull from I'll give it a go tomorrow and
provide a patch for it.


gallium-includes of my tree.



Sorry it took this long to come back to on this.  I was sick for a 
couple days then had to rush to catch up.


I've tried to build your branch, and it looks like with your v2: Try to 
patch up the scons bits. update everything just works.



Apart of one minor comment (in a seperate email), the series looks good:

Reviewed-by: Jose Fonseca jfons...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 4/7] mesa: Make bitset.h not rely on Mesa-specific types and functions.

2015-02-20 Thread Jose Fonseca

On 12/02/15 00:48, Eric Anholt wrote:

Note that we can't use u_math.h's align() because it's a function instead
of a macro, while BITSET_DECLARE needs a constant expression for nouveau's
usage in global declarations.
---
  src/mesa/main/bitset.h | 7 +++
  1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/mesa/main/bitset.h b/src/mesa/main/bitset.h
index 2558da4..6f9bed9 100644
--- a/src/mesa/main/bitset.h
+++ b/src/mesa/main/bitset.h
@@ -31,19 +31,18 @@
  #ifndef BITSET_H
  #define BITSET_H

-#include imports.h
-#include macros.h
+#include util/u_math.h

  /
   * generic bitset implementation
   */

-#define BITSET_WORD GLuint
+#define BITSET_WORD unsigned int
  #define BITSET_WORDBITS (sizeof (BITSET_WORD) * 8)

  /* bitset declarations
   */
-#define BITSET_WORDS(bits) (ALIGN(bits, BITSET_WORDBITS) / BITSET_WORDBITS)
+#define BITSET_WORDS(bits) ((bits + BITSET_WORDBITS - 1) / BITSET_WORDBITS)


bits - (bits)

just in case...


  #define BITSET_DECLARE(name, bits) BITSET_WORD name[BITSET_WORDS(bits)]

  /* bitset operations



With it

Reviewed-by: Jose Fonseca jfons...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] glCompressedTex(Sub)Image(2,3) on mapped PBOs

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 8:13 AM, Eduardo Lima Mitev el...@igalia.com
wrote:

 Hello,

 While working on the following dEQP failing tests I ran into an issue
 that I think deserves clarification from more experienced people:


 dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target

 dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target

 dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target

 dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target

 All above tests fail because a CompressedTex(Sub)Image(2,3)D operation
 is performed while there is a pixel buffer object bound to
 GL_PIXEL_UNPACK_BUFFER and that pixel buffer object is currently mapped.
 The tests expect this situation to throw an GL_INVALID_OPERATION but no
 error is returned on i965.

 In the spec GLES 3.1 spec, page 57, there is this: (and similar wording
 in GL 4.5 spec, page 76)

 6.3.2 Effects of Mapping Buffers on Other GL Commands

 Any GL command which attempts to read from, write to, or change the
 state of a buffer object may generate an INVALID_OPERATION error if all
 or part of the buffer object is mapped. However, only commands which
 explicitly describe this error are required to do so. If an error is not
 generated, using such commands to perform invalid reads, writes, or
 state changes will have undefined results and may result in GL
 interruption or termination.

 There is no further restriction in gl(Compressed)Tex(Sub)Image(2,3)D
 that makes throwing an error mandatory, hence I suspect these tests
 might be wrong.


 However, I found that glTex(Sub)Image(2,3)D do emits an
 GL_INVALID_OPERATION error when executed while a mapped PBO is bound.
 But the check is performed in driver code (as opposed to API entry
 points). (Concretely, in i965/intel_tex_image.c::intelTexImage() and
 i965/intel_tex_subimage.c::intelTexSubImage() which both call
 common/meta_tex_subimage.c::_mesa_meta_pbo_TexSubImage()).

 So, the question is:

 Shouldn't these operations on compressed texture images have the same
 restriction as their non-compressed counter-parts?

 Or is there a reason why this restriction is enforced only on
 non-compressed images?

 Also, shouldn't this restriction be moved to common API entry points, or
 are there drivers that do write to PBOs while they are mapped?


IMHO, that's exactly what we need to do.  I've been toying with writing a
patch to do just that.  Right now, we're leaving checking if the buffer is
mapped and checking if it has enough size up to the device-specific driver
backend.  This is, in general, not how the rest of mesa does error-checking
so it seems very odd.  Also, it means that if a driver misses the check it
causes a GL API failure not just a texture upload error.  To make it worse,
they all check it by calling the same function inside mesa/main so there's
really no excuse.  I haven't seen any good reason why we aren't just
calling that in the error-checking section of the entry-point.

--Jason


 cheers,
 Eduardo
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  I'm still a little pensive.  But
 
  Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
 
 Thanks.

  Now for a little aside.  I have come to the conclusion that I made a
 grave
  mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
  just subclassed fs_inst for load_payload.  The problem is that we need to
  snag a bunch of information for the sources when we create the
  load_payload.  In particular, we need to know the width of the source so
  that we know how much space it consumes in the payload and we need to
 know
  the information required to properly re-create the mov such as
  force_sechalf and force_writemask_all.  Really, in order to do things
  properly, we need to gather this information *before* we do any
  optimizations.  The nasty pile of code that you're editing together with
  the effective_width parameter is a lame attempt to capture/reconstruct
  this information.  Really, we should just subclass, capture the
 information
  up-front, and do it properly.
 
 Yes, absolutely, this lowering pass is a real mess.  There are four more
 unreviewed patches in the mailing list fixing bugs of the metadata
 guessing of lower_load_payload() [1][2][3][4], you may want to take a
 look at them.  There are more bugs I'm aware of but it didn't seem
 practical to fix them.


Yeah, Matt pointed me at those.  I'll give them a look later today.


 That said, I don't think it would be worth subclassing fs_inst.

 My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
 into a series of MOVs with force_writemask_all enabled in all cases,
 simply rely on the optimizer to eliminate those where possible.  Then
 get rid of the metadata and effective_width guessing.  Instead agree on
 immediates and uniforms being exec_size-wide by convention
 (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
 then prevent constant propagation from propagating an immediate load
 into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
 program, and maybe just run constant propagation after lowering so we
 can be sure those cases are cleaned up properly where register coalesce
 isn't enough.


First off, simply setting force_writemask_all isn't an option.  Consider
the following scenario:

a = foo;
if (bar) {
   b = 7;
} else {
   use(a);
}
use(b);

If use(a) is the last use of the variable a, then the live ranges of a
and b don't actually over-lap and we can assign a and b to the same
register.  However, if force_writemask_all is set on b, it will destroy the
value in a before its last use.  Right now, we don't actually do this
because our liveness analysis pass flattens everything so it will think
that b and a over-lap even though they don't.  However, if we ever decide
to make the liveness information more accurate, having a pile of
force_writemask_all instructions will be a major problem.  So, while it is
*technically* safe for now, it's a really bad idea in the long-term.

Regarding the other suggestion of just requiring width == exec_size for
immediates and uniforms, that seems pretty reasonable to me.  I'd like to
know what it will do to optimizations, but it seems ok initially.  I'm
still a bigger fan of just subclassing and stashing everything we need to
know up-front.  If we do it right, the only things that will actually need
to know about the subclass are the function for creating a LOAD_PAYLOAD and
lower_load_payloads.

--Jason



 [1]
 http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
 [2]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
 [3]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
 [4]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076098.html


  --Jason
 
  On Thu, Feb 19, 2015 at 1:53 PM, Jason Ekstrand ja...@jlekstrand.net
  wrote:
 
 
 
  On Thu, Feb 19, 2015 at 1:25 PM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Thu, Feb 19, 2015 at 12:13 PM, Francisco Jerez 
  curroje...@riseup.net
   wrote:
  
   Jason Ekstrand ja...@jlekstrand.net writes:
  
On Fri, Feb 6, 2015 at 4:01 PM, Francisco Jerez 
  curroje...@riseup.net
wrote:
   
Hey Matt,
   
Matt Turner matts...@gmail.com writes:
   
 On Fri, Feb 6, 2015 at 6:42 AM, Francisco Jerez 
   curroje...@riseup.net
wrote:
 MRFs cannot be read from anyway so they cannot possibly be a
  valid
 source of LOAD_PAYLOAD.
 ---

 The function only seems to test inst-dst.file == MRF. I don't
  see any
 code for handling MRF sources. What am I missing?
   
That test is for handling MRF sources -- More precisely, it's
collecting the writemask and half flags for MRF writes, which can
  only
possibly be useful if we're going to use them later on to read
  something

Re: [Mesa-dev] [PATCH] nir: Copy-propagate vecN operations that are actually moves

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 9:23 AM, Jason Ekstrand ja...@jlekstrand.net wrote:


 On Thu, Feb 19, 2015 at 11:03 PM, Matt Turner matts...@gmail.com wrote:

 On Thu, Feb 19, 2015 at 11:01 PM, Connor Abbott cwabbo...@gmail.com
 wrote:
  I agree with Ken that the regressions are small enough, and it seems
  they're mostly stuff we can prevent by being smarter when doing the
  sel peephole, so it seems like the cleanup that will probably help
  other passes is worth it.

 So, usually we do that as a preparatory patch. Why aren't we doing that
 here?


 Do what in a preparatory patch?  Fix up the sel peephole to be able to
 handle if (foo) bar = baz;?  Sure, I can put that patch together.

I thought that this would get turned into a conditional select as a
side-effect of going out of SSA anyway (assuming baz was
unconditionally set before the if)? Are these shaders not setting baz
unconditionally?


 NIR instruction counts is not the metric we care about.


 No, but cleaning things up means that we can do other optimizations better.
 Also, in each of those cases, the non-ssa NIR code was better it was just
 less cleanable by the backend.  We need to work on that, but I don't think
 it's an indicator of a problem.
 --Jason

Okay, that seems fine. I wouldn't bother updating the SEL peephole. It
sounds like a number of the regressions we have are related to it not
handling something so we might need to do a larger evaluation.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/5] nir: Add a couple of simplifications of csel operations.

2015-02-20 Thread Connor Abbott
Aside from one minor suggestion for patch 2, this series is

Reviewed-by: Connor Abbott cwabbo...@gmail.com

On Fri, Feb 20, 2015 at 3:28 PM, Matt Turner matts...@gmail.com wrote:
 The series is

 Reviewed-by: Matt Turner matts...@gmail.com
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 89260] macros.h:34:25: fatal error: util/u_math.h: No such file or directory

2015-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=89260

Bug ID: 89260
   Summary: macros.h:34:25: fatal error: util/u_math.h: No such
file or directory
   Product: Mesa
   Version: git
  Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: NEW
  Severity: normal
  Priority: medium
 Component: Mesa core
  Assignee: mesa-dev@lists.freedesktop.org
  Reporter: fabio@libero.it
QA Contact: mesa-dev@lists.freedesktop.org
CC: e...@anholt.net

After changes in the latest 12h osmesa fails to build:
  CC   osmesa.lo
In file included from ../../../../../../src/mesa/drivers/osmesa/osmesa.c:44:0:
../../../../../../src/mesa/main/macros.h:34:25: fatal error: util/u_math.h: No
such file or directory
 #include util/u_math.h
 ^
compilation terminated.

Full log including configure options here:
https://launchpadlibrarian.net/198345895/buildlog_ubuntu-vivid-i386.mesa_10.6~git1502210730.9fe818~gd~v_BUILDING.txt.gz

-- 
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Ben Widawsky
On Fri, Feb 20, 2015 at 03:34:21AM -0800, Kenneth Graunke wrote:
 On Thursday, February 19, 2015 10:48:08 PM Ben Widawsky wrote:
  When under dispatch_width=16 the previous code would allocate 2 registers 
  for
  the payload when only one is needed. This manifested itself through bugs on 
  SKL
  which needs to mess with this instruction.
  
  Ken though this might impact shader-db, but apparently it doesn't
  
  Cc: Kenneth Graunke kenn...@whitecape.org
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
  Signed-off-by: Ben Widawsky b...@bwidawsk.net
  ---
   src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)
  
  diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
  b/src/mesa/drivers/dri/i965/brw_fs.cpp
  index c46e1d7..24125cc 100644
  --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
  +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
  @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
assert(const_offset_reg.file == IMM 
   const_offset_reg.type == BRW_REGISTER_TYPE_UD);
const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
  - fs_reg payload = vgrf(glsl_type::uint_type);
  + fs_reg payload = fs_reg(GRF, alloc.allocate(1));
   
/* We have to use a message header on Skylake to get SIMD4x2 mode.
 * Reserve space for the register.
  
 
 Reviewed-by: Kenneth Graunke kenn...@whitecape.org

Given that this seems to be required to get SKL stable, do I want to Cc 1.5 too?

-- 
Ben Widawsky, Intel Open Source Technology Center
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/16] main: fix the validation of the number of samples

2015-02-20 Thread Laura Ekstrand
Please provide a page number and a section title in your spec comment.

Thanks.

On Mon, Feb 16, 2015 at 6:13 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 Maybe this should be the job of the dispatch layer.

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mesa/main/multisample.c | 9 +
  1 file changed, 9 insertions(+)

 diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
 index 1f3fa0c..a0a659b 100644
 --- a/src/mesa/main/multisample.c
 +++ b/src/mesa/main/multisample.c
 @@ -150,6 +150,15 @@ GLenum
  _mesa_check_sample_count(struct gl_context *ctx, GLenum target,
   GLenum internalFormat, GLsizei samples)
  {
 +   /* From the OpenGL core 3.0 spec, section 2.5:
 +*
 +* If a negative number is provided where an argument of type sizei or
 +* sizeiptr is specified, the error INVALID VALUE is generated.
 +*/
 +   if (samples  0) {
 +  return GL_INVALID_VALUE;
 +   }
 +
 /* If ARB_internalformat_query is supported, then treat its highest
  * returned sample count as the absolute maximum for this format; it is
  * allowed to exceed MAX_SAMPLES.
 --
 2.3.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Copy-propagate vecN operations that are actually moves

2015-02-20 Thread Jason Ekstrand
On Feb 20, 2015 9:27 AM, Matt Turner matts...@gmail.com wrote:

 On Fri, Feb 20, 2015 at 9:23 AM, Jason Ekstrand ja...@jlekstrand.net
wrote:
 
 
  On Thu, Feb 19, 2015 at 11:03 PM, Matt Turner matts...@gmail.com
wrote:
 
  On Thu, Feb 19, 2015 at 11:01 PM, Connor Abbott cwabbo...@gmail.com
  wrote:
   I agree with Ken that the regressions are small enough, and it seems
   they're mostly stuff we can prevent by being smarter when doing the
   sel peephole, so it seems like the cleanup that will probably help
   other passes is worth it.
 
  So, usually we do that as a preparatory patch. Why aren't we doing that
  here?
 
 
  Do what in a preparatory patch?  Fix up the sel peephole to be able to
  handle if (foo) bar = baz;?  Sure, I can put that patch together.

 I thought that this would get turned into a conditional select as a
 side-effect of going out of SSA anyway (assuming baz was
 unconditionally set before the if)? Are these shaders not setting baz
 unconditionally?

The NIR peephole only triggers if the only instructions in the if or else
are moves.  In this particular case, it was a uniform load.  Before the
change, it was lowered to moves in both sides of the if and, since it ended
up being a push constant, got peepholed.  After the change, it was lowered
to moves on only one side of the if (which is theoretically better) and the
select didn't trigger.

 
  NIR instruction counts is not the metric we care about.
 
 
  No, but cleaning things up means that we can do other optimizations
better.
  Also, in each of those cases, the non-ssa NIR code was better it was
just
  less cleanable by the backend.  We need to work on that, but I don't
think
  it's an indicator of a problem.
  --Jason

 Okay, that seems fine. I wouldn't bother updating the SEL peephole. It
 sounds like a number of the regressions we have are related to it not
 handling something so we might need to do a larger evaluation.

Yeah, it just needs to handle one-sided it's better.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/23] main: Refactor ClearBuffer[Sub]Data.

2015-02-20 Thread Laura Ekstrand
This naming convention tries to match the corresponding DD table entry.

There's a thread discussing the naming convention for external software
fallback functions.  Feel free to add your input to the discussion there :)

On Fri, Feb 20, 2015 at 6:18 AM, Martin Peres martin.pe...@free.fr wrote:

 On 12/02/2015 04:05, Laura Ekstrand wrote:

 v2: review by Jason Ekstrand
 - Split refactor of clear buffer sub data from addition of DSA entry
   points.
 ---
   src/mesa/main/bufferobj.c| 125
 ---
   src/mesa/main/bufferobj.h|  19 ++--
   src/mesa/state_tracker/st_cb_bufferobjects.c |   4 +-
   3 files changed, 69 insertions(+), 79 deletions(-)

 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 7225b64..b8fa917 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct gl_context *ctx,
 GLintptrARB offset,
* dd_function_table::ClearBufferSubData.
*/
   void
 -_mesa_buffer_clear_subdata(struct gl_context *ctx,
 -   GLintptr offset, GLsizeiptr size,
 -   const GLvoid *clearValue,
 -   GLsizeiptr clearValueSize,
 -   struct gl_buffer_object *bufObj)
 +_mesa_ClearBufferSubData_sw(struct gl_context *ctx,


 Interesting choice of naming the function as a mix of camel case and
 underscores.

 Since it is an internal function, shouldn't it only have underscores?

 Other than that:

 Reviewed-by: Martin Peres martin.pe...@linux.intel.com

  +GLintptr offset, GLsizeiptr size,
 +const GLvoid *clearValue,
 +GLsizeiptr clearValueSize,
 +struct gl_buffer_object *bufObj)
   {
  GLsizeiptr i;
  GLubyte *dest;
 @@ -1113,7 +1113,7 @@ _mesa_init_buffer_object_functions(struct
 dd_function_table *driver)
  driver-UnmapBuffer = _mesa_buffer_unmap;
/* GL_ARB_clear_buffer_object */
 -   driver-ClearBufferSubData = _mesa_buffer_clear_subdata;
 +   driver-ClearBufferSubData = _mesa_ClearBufferSubData_sw;
/* GL_ARB_map_buffer_range */
  driver-MapBufferRange = _mesa_buffer_map_range;
 @@ -1671,57 +1671,77 @@ _mesa_GetBufferSubData(GLenum target, GLintptr
 offset,
  ctx-Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
   }
   -
 -void GLAPIENTRY
 -_mesa_ClearBufferData(GLenum target, GLenum internalformat, GLenum
 format,
 -  GLenum type, const GLvoid* data)
 +/**
 + * \param subdata   true if caller is *SubData, false if *Data
 + */
 +void
 +_mesa_clear_buffer_sub_data(struct gl_context *ctx,
 +struct gl_buffer_object *bufObj,
 +GLenum internalformat,
 +GLintptr offset, GLsizeiptr size,
 +GLenum format, GLenum type,
 +const GLvoid *data,
 +const char *func, bool subdata)
   {
 -   GET_CURRENT_CONTEXT(ctx);
 -   struct gl_buffer_object* bufObj;
  mesa_format mesaFormat;
  GLubyte clearValue[MAX_PIXEL_BYTES];
  GLsizeiptr clearValueSize;
   -   bufObj = get_buffer(ctx, glClearBufferData, target,
 GL_INVALID_VALUE);
 -   if (!bufObj) {
 -  return;
 -   }
 -
 -   if (_mesa_check_disallowed_mapping(bufObj)) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION,
 -  glClearBufferData(buffer currently mapped));
 +   /* This checks for disallowed mappings. */
 +   if (!buffer_object_subdata_range_good(ctx, bufObj, offset, size,
 + subdata, func)) {
 return;
  }
mesaFormat = validate_clear_buffer_format(ctx, internalformat,
 - format, type,
 - glClearBufferData);
 + format, type, func);
 +
  if (mesaFormat == MESA_FORMAT_NONE) {
 return;
  }
clearValueSize = _mesa_get_format_bytes(mesaFormat);
 -   if (bufObj-Size % clearValueSize != 0) {
 +   if (offset % clearValueSize != 0 || size % clearValueSize != 0) {
 _mesa_error(ctx, GL_INVALID_VALUE,
 -  glClearBufferData(size is not a multiple of 
 -  internalformat size));
 +  %s(offset or size is not a multiple of 
 +  internalformat size), func);
 return;
  }
if (data == NULL) {
 /* clear to zeros, per the spec */
 -  ctx-Driver.ClearBufferSubData(ctx, 0, bufObj-Size,
 - NULL, clearValueSize, bufObj);
 +  if (size  0) {
 + ctx-Driver.ClearBufferSubData(ctx, offset, size,
 +NULL, clearValueSize, bufObj);
 +  }
 return;
  }
if 

Re: [Mesa-dev] [PATCH] svga: add missing padding to SVGA3dSize

2015-02-20 Thread Sinclair Yeh
On Fri, Feb 20, 2015 at 09:22:20AM +, Van Der Wath, DanielX J wrote:
 From: Daniel van der Wath danielx.j.van.der.w...@intel.com
 
 The kernel side equivalent of struct SVGA3dSize (struct drm_vmw_size) has an
 extra padding word that SVGA3dSize lacks. This was causing data to be written
 past the end of size in vmw_drm_surface_from_handle(), corrupting other data

The drm_vmw_* types are used to exchange data with the VMW DRM so none of the
DRM IOCTL functions, e.g. drmCommandWriteRead(), should be using the SVGA3d*
types.

In vmw_drm_surface_from_handle(), size is of type struct drm_vmw_size,
and is being used here: rep-size_addr = (unsigned long)size, to call
drmCommandWriteRead().  So there shouldn't be a user/kernel mismatch here.

At which point during this function do you see handle being over written?
Also, which version of MESA are you seeing this on?

I'll see if I can see this on my end.

 and in this case leading to Weston being unable to render anything on screen.
 ---
  src/gallium/drivers/svga/include/svga3d_types.h | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/src/gallium/drivers/svga/include/svga3d_types.h 
 b/src/gallium/drivers/svga/include/svga3d_types.h
 index fc4a6b9..3ce6814 100644
 --- a/src/gallium/drivers/svga/include/svga3d_types.h
 +++ b/src/gallium/drivers/svga/include/svga3d_types.h
 @@ -1280,6 +1280,7 @@ struct {
 uint32   width;
 uint32   height;
 uint32   depth;
 +   uint32   pad64;
  }
  #include vmware_pack_end.h
  SVGA3dSize;
 -- 
 1.7.11.7
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 23/23] main: Cosmetic changes to GetBufferSubData.

2015-02-20 Thread Laura Ekstrand
Ian has asked that this not be squashed.  In fact, they were the same
commit last time around on the ML.

On Fri, Feb 20, 2015 at 12:29 AM, Martin Peres martin.pe...@free.fr wrote:

 Please squash this commit with the one introducing GetBufferSubData.

 It should be pretty easy to do with git rebase -i :)


 On 12/02/2015 04:06, Laura Ekstrand wrote:

 ---
   src/mesa/main/bufferobj.c | 2 +-
   src/mesa/main/bufferobj.h | 4 ++--
   2 files changed, 3 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 0272704..38d8b5a 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -1668,7 +1668,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptr
 offset,
  }
ASSERT(ctx-Driver.GetBufferSubData);
 -   ctx-Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
 +   ctx-Driver.GetBufferSubData(ctx, offset, size, data, bufObj);
   }
 void GLAPIENTRY
 diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
 index feeaa8b..b5d73ae 100644
 --- a/src/mesa/main/bufferobj.h
 +++ b/src/mesa/main/bufferobj.h
 @@ -230,8 +230,8 @@ _mesa_NamedBufferSubData(GLuint buffer, GLintptr
 offset,
GLsizeiptr size, const GLvoid *data);
 void GLAPIENTRY
 -_mesa_GetBufferSubData(GLenum target, GLintptrARB offset,
 -   GLsizeiptrARB size, void * data);
 +_mesa_GetBufferSubData(GLenum target, GLintptr offset,
 +   GLsizeiptr size, GLvoid *data);
 void GLAPIENTRY
   _mesa_GetNamedBufferSubData(GLuint buffer, GLintptr offset,



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  I'm still a little pensive.  But
 
  Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
 
 Thanks.

  Now for a little aside.  I have come to the conclusion that I made a
 grave
  mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should have
  just subclassed fs_inst for load_payload.  The problem is that we need to
  snag a bunch of information for the sources when we create the
  load_payload.  In particular, we need to know the width of the source so
  that we know how much space it consumes in the payload and we need to
 know
  the information required to properly re-create the mov such as
  force_sechalf and force_writemask_all.  Really, in order to do things
  properly, we need to gather this information *before* we do any
  optimizations.  The nasty pile of code that you're editing together with
  the effective_width parameter is a lame attempt to capture/reconstruct
  this information.  Really, we should just subclass, capture the
 information
  up-front, and do it properly.
 
 Yes, absolutely, this lowering pass is a real mess.  There are four more
 unreviewed patches in the mailing list fixing bugs of the metadata
 guessing of lower_load_payload() [1][2][3][4], you may want to take a
 look at them.  There are more bugs I'm aware of but it didn't seem
 practical to fix them.


 Yeah, Matt pointed me at those.  I'll give them a look later today.


 That said, I don't think it would be worth subclassing fs_inst.

 My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
 into a series of MOVs with force_writemask_all enabled in all cases,
 simply rely on the optimizer to eliminate those where possible.  Then
 get rid of the metadata and effective_width guessing.  Instead agree on
 immediates and uniforms being exec_size-wide by convention
 (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
 then prevent constant propagation from propagating an immediate load
 into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
 program, and maybe just run constant propagation after lowering so we
 can be sure those cases are cleaned up properly where register coalesce
 isn't enough.


 First off, simply setting force_writemask_all isn't an option.  Consider
 the following scenario:

 a = foo;
 if (bar) {
b = 7;
 } else {
use(a);
 }
 use(b);

 If use(a) is the last use of the variable a, then the live ranges of a
 and b don't actually over-lap and we can assign a and b to the same
 register.  However, if force_writemask_all is set on b, it will destroy the
 value in a before its last use.  Right now, we don't actually do this
 because our liveness analysis pass flattens everything so it will think
 that b and a over-lap even though they don't.  However, if we ever decide
 to make the liveness information more accurate, having a pile of
 force_writemask_all instructions will be a major problem.  So, while it is
 *technically* safe for now, it's a really bad idea in the long-term.

The thing is we *will* have to deal with that scenario.  Building a
message payload inherently involves crossing channel boundaries (because
of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
say it's a bug in the liveness analysis pass if it wouldn't consider the
liveness interval of a and b to overlap in your example if the
assignment of b had force_writemask_all set.

A reasonable compromise might be to leave it up to the caller whether to
set the force_writemask_all and force_sechalf flags or not.  I.e. just
copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
instructions.  That would still allow reducing the liveness intervals in
cases where we know that the payload respects channel boundaries.

Tracking the flag information per-register in cases where the same
payload has well- and ill-behaved values seems rather premature and
rather useless to me because the optimizer is likely to be able to get
rid of redundant copies with force_writemask_all -- register coalesce
is doing this already AFAIK, maybe by accident.

 Regarding the other suggestion of just requiring width == exec_size for
 immediates and uniforms, that seems pretty reasonable to me.  I'd like to
 know what it will do to optimizations, but it seems ok initially.  I'm
 still a bigger fan of just subclassing and stashing everything we need to
 know up-front.  If we do it right, the only things that will actually need
 to know about the subclass are the function for creating a LOAD_PAYLOAD and
 lower_load_payloads.

 --Jason



 [1]
 http://lists.freedesktop.org/archives/mesa-dev/2015-January/074614.html
 [2]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076094.html
 [3]
 http://lists.freedesktop.org/archives/mesa-dev/2015-February/076097.html
 [4]
 

Re: [Mesa-dev] [PATCH 04/16] main: Added entry point for glCreateTransformFeedbacks

2015-02-20 Thread Laura Ekstrand
Looks good to me

Reviewed-by: Laura Ekstrand la...@jlekstrand.net

On Mon, Feb 16, 2015 at 6:13 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 v2: Review from Laura Ekstrand
 - generate the name of the gl method once
 - shorten some lines to stay in the 78 chars limit

 v3: Review from Fredrik Höglund fred...@kde.org
 - rename gl_mthd_name to func
 - set EverBound in _mesa_create_transform_feedbacks in the dsa case

 v4:
 - rename _mesa_create_transform_feedbacks to create_transform_feedbacks and
   make it static

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 +++
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  src/mesa/main/transformfeedback.c  | 67
 --
  src/mesa/main/transformfeedback.h  |  3 ++
  4 files changed, 63 insertions(+), 15 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 2fe1638..15b00c2 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -7,6 +7,13 @@
 enum name=QUERY_TARGETvalue=0x82EA/
 enum name=TEXTURE_BINDING value=0x82EB/

 +   !-- Transform Feedback object functions --
 +
 +  function name=CreateTransformFeedbacks offset=assign
 +  param name=n type=GLsizei /
 +  param name=ids type=GLuint * /
 +   /function
 +
 !-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
 b/src/mesa/main/tests/dispatch_sanity.cpp
 index 1f1a3a8..bf320bf 100644
 --- a/src/mesa/main/tests/dispatch_sanity.cpp
 +++ b/src/mesa/main/tests/dispatch_sanity.cpp
 @@ -955,6 +955,7 @@ const struct function gl_core_functions_possible[] = {
 { glClipControl, 45, -1 },

 /* GL_ARB_direct_state_access */
 +   { glCreateTransformFeedbacks, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 },
 diff --git a/src/mesa/main/transformfeedback.c
 b/src/mesa/main/transformfeedback.c
 index a737463..10bb6a1 100644
 --- a/src/mesa/main/transformfeedback.c
 +++ b/src/mesa/main/transformfeedback.c
 @@ -825,25 +825,24 @@ _mesa_lookup_transform_feedback_object(struct
 gl_context *ctx, GLuint name)
   _mesa_HashLookup(ctx-TransformFeedback.Objects, name);
  }

 -
 -/**
 - * Create new transform feedback objects.   Transform feedback objects
 - * encapsulate the state related to transform feedback to allow quickly
 - * switching state (and drawing the results, below).
 - * Part of GL_ARB_transform_feedback2.
 - */
 -void GLAPIENTRY
 -_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
 +static void
 +create_transform_feedbacks(struct gl_context *ctx, GLsizei n, GLuint *ids,
 +   bool dsa)
  {
 GLuint first;
 -   GET_CURRENT_CONTEXT(ctx);
 +   const char* func;
 +
 +   if (dsa)
 +  func = glCreateTransformFeedbacks;
 +   else
 +  func = glGenTransformFeedbacks;

 if (n  0) {
 -  _mesa_error(ctx, GL_INVALID_VALUE, glGenTransformFeedbacks(n 
 0));
 +  _mesa_error(ctx, GL_INVALID_VALUE, %s(n  0), func);
return;
 }

 -   if (!names)
 +   if (!ids)
return;

 /* we don't need contiguous IDs, but this might be faster */
 @@ -854,18 +853,56 @@ _mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
   struct gl_transform_feedback_object *obj
  = ctx-Driver.NewTransformFeedback(ctx, first + i);
   if (!obj) {
 -_mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
 +_mesa_error(ctx, GL_OUT_OF_MEMORY, %s, func);
  return;
   }
 - names[i] = first + i;
 + ids[i] = first + i;
   _mesa_HashInsert(ctx-TransformFeedback.Objects, first + i, obj);
 + if (dsa) {
 +/* this is normally done at bind time in the non-dsa case */
 +obj-EverBound = GL_TRUE;
 + }
}
 }
 else {
 -  _mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
 +  _mesa_error(ctx, GL_OUT_OF_MEMORY, %s, func);
 }
  }

 +/**
 + * Create new transform feedback objects.   Transform feedback objects
 + * encapsulate the state related to transform feedback to allow quickly
 + * switching state (and drawing the results, below).
 + * Part of GL_ARB_transform_feedback2.
 + */
 +void GLAPIENTRY
 +_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
 +{
 +   GET_CURRENT_CONTEXT(ctx);
 +
 +   /* GenTransformFeedbacks should just reserve the object names that a
 +* subsequent call to BindTransformFeedback should actively create. For
 +* the sake of simplicity, we reserve the names and create the objects
 +* straight away.
 +*/
 +
 +   create_transform_feedbacks(ctx, n, names, false);
 +}
 +
 +/**
 + * Create new transform feedback objects.   Transform feedback objects
 + * 

Re: [Mesa-dev] [PATCH] i965/skl: Use 1 register for uniform pull constant payload

2015-02-20 Thread Kenneth Graunke
On Friday, February 20, 2015 10:33:18 AM Ben Widawsky wrote:
 On Fri, Feb 20, 2015 at 03:34:21AM -0800, Kenneth Graunke wrote:
  On Thursday, February 19, 2015 10:48:08 PM Ben Widawsky wrote:
   When under dispatch_width=16 the previous code would allocate 2 registers 
   for
   the payload when only one is needed. This manifested itself through bugs 
   on SKL
   which needs to mess with this instruction.
   
   Ken though this might impact shader-db, but apparently it doesn't
   
   Cc: Kenneth Graunke kenn...@whitecape.org
   Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
   Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
   Signed-off-by: Ben Widawsky b...@bwidawsk.net
   ---
src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
   
   diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
   b/src/mesa/drivers/dri/i965/brw_fs.cpp
   index c46e1d7..24125cc 100644
   --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
   +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
   @@ -3062,7 +3062,7 @@ fs_visitor::lower_uniform_pull_constant_loads()
 assert(const_offset_reg.file == IMM 
const_offset_reg.type == BRW_REGISTER_TYPE_UD);
 const_offset_reg.fixed_hw_reg.dw1.ud /= 4;
   - fs_reg payload = vgrf(glsl_type::uint_type);
   + fs_reg payload = fs_reg(GRF, alloc.allocate(1));

 /* We have to use a message header on Skylake to get SIMD4x2 
   mode.
  * Reserve space for the register.
   
  
  Reviewed-by: Kenneth Graunke kenn...@whitecape.org
 
 Given that this seems to be required to get SKL stable, do I want to Cc 1.5 
 too?

That seems reasonable to me.  It likely won't apply (or build), since I
don't think 10.5 has alloc.allocate().  To be nice to Emil, I'm guessing
the right thing to do is probably to cherry-pick it to 10.5, change
alloc.allocate(1) to virtual_grf_alloc(1), and send that patch to
mesa-sta...@lists.freedesktop.org marking it as a backport.


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/16] main: fix the validation of the number of samples

2015-02-20 Thread Ilia Mirkin
On Mon, Feb 16, 2015 at 9:13 AM, Martin Peres
martin.pe...@linux.intel.com wrote:
 Maybe this should be the job of the dispatch layer.

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mesa/main/multisample.c | 9 +
  1 file changed, 9 insertions(+)

 diff --git a/src/mesa/main/multisample.c b/src/mesa/main/multisample.c
 index 1f3fa0c..a0a659b 100644
 --- a/src/mesa/main/multisample.c
 +++ b/src/mesa/main/multisample.c
 @@ -150,6 +150,15 @@ GLenum
  _mesa_check_sample_count(struct gl_context *ctx, GLenum target,
   GLenum internalFormat, GLsizei samples)
  {
 +   /* From the OpenGL core 3.0 spec, section 2.5:

I think this is a bit of a contradiction... there is no OpenGL 3.0
core spec. Core didn't become a thing until 3.1 afaik.

 +*
 +* If a negative number is provided where an argument of type sizei or
 +* sizeiptr is specified, the error INVALID VALUE is generated.
 +*/
 +   if (samples  0) {
 +  return GL_INVALID_VALUE;
 +   }
 +
 /* If ARB_internalformat_query is supported, then treat its highest
  * returned sample count as the absolute maximum for this format; it is
  * allowed to exceed MAX_SAMPLES.
 --
 2.3.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the source
 so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


Yes, I'm aware of that.  However, the part of that register that has to
squash everything is usually only a couple of registers while the entire
payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
squash everything if we can.  All that being said, our liveness/register
allocation can't handle this and making register allocation handle parts of
registers interfering but not other bits is going to be a real pain.  Maybe
not even worth it.  If you'd rather force_writemask_all everything, I'll
sign off on that for now.  I just wanted to point out that it may not be a
good permanent solution.


 A reasonable compromise might be to leave it up to the caller whether to
 set the force_writemask_all and force_sechalf flags or not.  I.e. just
 copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
 instructions.  That would still allow reducing the liveness intervals in
 cases where we know that the payload respects channel boundaries.

 Tracking the flag information per-register in cases where the same
 payload has well- and ill-behaved values seems rather premature and
 rather useless to me because the optimizer is likely to be able to get
 rid of redundant copies with force_writemask_all -- register coalesce
 is 

Re: [Mesa-dev] glCompressedTex(Sub)Image(2,3) on mapped PBOs

2015-02-20 Thread Ian Romanick
On 02/20/2015 08:13 AM, Eduardo Lima Mitev wrote:
 Hello,
 
 While working on the following dEQP failing tests I ran into an issue
 that I think deserves clarification from more experienced people:
 
 dEQP-GLES3.functional.negative_api.texture.compressedteximage2d_invalid_buffer_target
 dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage2d_invalid_buffer_target
 dEQP-GLES3.functional.negative_api.texture.compressedteximage3d_invalid_buffer_target
 dEQP-GLES3.functional.negative_api.texture.compressedtexsubimage3d_invalid_buffer_target
 
 All above tests fail because a CompressedTex(Sub)Image(2,3)D operation
 is performed while there is a pixel buffer object bound to
 GL_PIXEL_UNPACK_BUFFER and that pixel buffer object is currently mapped.
 The tests expect this situation to throw an GL_INVALID_OPERATION but no
 error is returned on i965.
 
 In the spec GLES 3.1 spec, page 57, there is this: (and similar wording
 in GL 4.5 spec, page 76)
 
 6.3.2 Effects of Mapping Buffers on Other GL Commands
 
 Any GL command which attempts to read from, write to, or change the
 state of a buffer object may generate an INVALID_OPERATION error if all
 or part of the buffer object is mapped. However, only commands which
 explicitly describe this error are required to do so. If an error is not
 generated, using such commands to perform invalid reads, writes, or
 state changes will have undefined results and may result in GL
 interruption or termination.

This language is intentionally loose.  It is often perfectly safe to
have a portion of a buffer mapped while a GL command operates on a
different part of the buffer.  Determining whether a particular
operation is safe can be expensive.  Imagine trying to determine whether
a glDrawElements command will source data from a mapped region of a
vertex buffer object.  Many vendors objected to having to put expensive
(or cheap) checks in performance critical paths.

I don't think an of the TexSubImage commands are performance critical
enough to warrant excluding the test.  I believe some benchmarks use
some of the TexSubImage commands, so it's probably worth measuring them
on, say, Baytrail with and without the checks.  My expectation is that
there's enough CPU wasted in those paths that the checks will be lost in
the noise.  We won't know until we measure.

 There is no further restriction in gl(Compressed)Tex(Sub)Image(2,3)D
 that makes throwing an error mandatory, hence I suspect these tests
 might be wrong.

I'm also inclined to think they are wrong... or at least overly strict.

Do you know of any commands that require this error be generated?

 However, I found that glTex(Sub)Image(2,3)D do emits an
 GL_INVALID_OPERATION error when executed while a mapped PBO is bound.
 But the check is performed in driver code (as opposed to API entry
 points). (Concretely, in i965/intel_tex_image.c::intelTexImage() and
 i965/intel_tex_subimage.c::intelTexSubImage() which both call
 common/meta_tex_subimage.c::_mesa_meta_pbo_TexSubImage()).
 
 So, the question is:
 
 Shouldn't these operations on compressed texture images have the same
 restriction as their non-compressed counter-parts?
 
 Or is there a reason why this restriction is enforced only on
 non-compressed images?
 
 Also, shouldn't this restriction be moved to common API entry points, or
 are there drivers that do write to PBOs while they are mapped?

Looking at that function, I find it odd that either of the errors it
generates are generated there.  We should either do these checks in a
single, common place, or we should not do them at all.

 cheers,
 Eduardo
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] i965/fs: Less broken handling of force_writemask_all in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Sat, Jan 17, 2015 at 6:04 PM, Francisco Jerez curroje...@riseup.net
wrote:

 It's perfectly fine to read the second half of a register written with
 force_writemask_all from a first half MOV instruction or vice versa, and
 lower_load_payload shouldn't mark the whole MOV as belonging to the second
 half in that case.  Replicate the same metadata to both halves of the
 destination when writemasking is disabled.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 20 +---
  1 file changed, 13 insertions(+), 7 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 4a61943..d585a67 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3059,9 +3059,11 @@ fs_visitor::lower_load_payload()
}

if (inst-dst.file == MRF || inst-dst.file == GRF) {
 - bool force_sechalf = inst-force_sechalf;
 + bool force_sechalf = inst-force_sechalf 
 +  !inst-force_writemask_all;
   bool toggle_sechalf = inst-dst.width == 16 
 -   type_sz(inst-dst.type) == 4;
 +   type_sz(inst-dst.type) == 4 
 +   !inst-force_writemask_all;
   for (int i = 0; i  inst-regs_written; ++i) {
  metadata[dst_reg + i].written = true;
  metadata[dst_reg + i].force_sechalf = force_sechalf;
 @@ -3104,11 +3106,15 @@ fs_visitor::lower_load_payload()
mov-force_writemask_all =
 metadata[src_reg].force_writemask_all;
metadata[dst_reg] = metadata[src_reg];
if (dst.width * type_sz(dst.type)  32) {
 - assert((!metadata[src_reg].written ||
 - !metadata[src_reg].force_sechalf) 
 -(!metadata[src_reg + 1].written ||
 - metadata[src_reg + 1].force_sechalf));
 - metadata[dst_reg + 1] = metadata[src_reg + 1];
 + if (metadata[src_reg].force_writemask_all) {
 +metadata[dst_reg + 1] = metadata[src_reg];
 + } else {
 +assert((!metadata[src_reg].written ||
 +!metadata[src_reg].force_sechalf) 
 +   (!metadata[src_reg + 1].written ||
 +metadata[src_reg + 1].force_sechalf));
 +metadata[dst_reg + 1] = metadata[src_reg + 1];
 + }
}
 } else {
metadata[dst_reg].force_writemask_all = false;
 --
 2.1.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/32] i965/fs: Fix lower_load_payload() not to use an incorrect half for immediates and uniforms.

2015-02-20 Thread Jason Ekstrand
Yeah... More proof that the lower_load_payload code is plenty bogus...

Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez curroje...@riseup.net
wrote:

 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 8 
  1 file changed, 8 insertions(+)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 8da1f47..e2ebf7e 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3116,6 +3116,14 @@ fs_visitor::lower_load_payload()
  inst-src[i].reg_offset;
mov-force_sechalf = metadata[src_reg].force_sechalf;
mov-force_writemask_all =
 metadata[src_reg].force_writemask_all;
 +   } else {
 +  /* We don't have any useful metadata for immediates or
 +   * uniforms.  Assume that any of the channels of the
 +   * destination may be used.
 +   */
 +  assert(inst-src[i].file == IMM ||
 + inst-src[i].file == UNIFORM);
 +  mov-force_writemask_all = true;
 }

 if (dst.file == GRF) {
 --
 2.1.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/6] vbo: fix an unitialized-variable warning

2015-02-20 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

It looks like a bug to me.

Cc: 10.5 10.4 10.3 mesa-sta...@lists.freedesktop.org
---
 src/mesa/vbo/vbo_attrib_tmp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/vbo/vbo_attrib_tmp.h b/src/mesa/vbo/vbo_attrib_tmp.h
index ec66934..0c44540 100644
--- a/src/mesa/vbo/vbo_attrib_tmp.h
+++ b/src/mesa/vbo/vbo_attrib_tmp.h
@@ -210,6 +210,7 @@ static inline float conv_i2_to_norm_float(const struct 
gl_context *ctx, int i2)
   }\
} else if ((type) == GL_UNSIGNED_INT_10F_11F_11F_REV) { \
   float res[4];\
+  res[3] = 1;   \
   r11g11b10f_to_float3((arg), res);\
   ATTR##val##FV((attr), res);  \
} else  \
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
One more comment here...  This particularly regards your plan of separating
it into things that match the destination and other things and not copy
prop uniforms or immediates into the other things.  There is another case
we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write messages
are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual (r0,
r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect SIMD16
load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
those platforms, there is a special type of MOV-to-MRF that can move from
(gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
now.  However, it doesn't follow the standard rules.
--Jason

On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand ja...@jlekstrand.net
wrote:



 On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the
 source so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four
 more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of
 the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever
 decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


 Yes, I'm aware of that.  However, the part of that register that has to
 squash everything is usually only a couple of registers while the entire
 payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
 squash everything if we can.  All that being said, our liveness/register
 allocation can't handle this and making register allocation handle parts of
 registers interfering but not other bits is going to be a real pain.  

[Mesa-dev] [Bug 67676] Transparent windows no longer work

2015-02-20 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=67676

Chad Versace chad.vers...@intel.com changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|mesa-dev@lists.freedesktop. |jonny.l...@collabora.co.uk
   |org |

--- Comment #14 from Chad Versace chad.vers...@intel.com ---
n reply to Jonny Lamb from comment #13)
 Created attachment 113563 [details]
 version three

Comments:

- Good call on specifying pre-multiplied alpha.

- I think you should remove the paragraph

To determine if the EGL implementation supports this extension, clients
should query the EGL_EXTENSIONS string of the current active display.

  because that's the normal way of querying extensions and needs no
  explanation. I found that paragraph confusing because it made me ask Why is
  the extension telling me this? Is there some magic in this extension that
I've
  failed to see that warrants this explanation?. The platform extension
  specification that you used as a template explicitly describes how to perform
  the extension query because its query mechanism differs from 95% of existing
  EGL extensions.

- You should remove the New Behavior section. Normally, the fine details of
  an extension specification are defined in the Additions to the EGL x.y
  specification section as a diff against the EGL x.y spec. This extension is
no
  different. A few, special extensions, though, define behavior that should
*not*
  be added to the EGL spec, such as the platform extensions. That's what the
New
  Behavior section should be used for, and this alpha extension defines no
such
  outside-of-the-spec behavior.

- The paragraph

To obtain an EGL window surface with a meaningful alpha channel, use an
EGLConfig with EGL_TRANSPARENT_TYPE set to EGL_TRANSPARENT_ALPHA_MESA.

  is still valuable, despite belonging in the New Behavior section. The
  proper location for it is the Overview section.

- Regarding issue #1:

RESOLUTION: Yes. Non-window surfaces created with
EGL_TRANSPARENT_TYPE set to EGL_TRANSPARENT_ALPHA_MESA will
generate an error.

  What error does it generate? If we refer to this paragraph of Section 3.5.1
as precedent,

If config does not support rendering to windows (the EGL_SURFACE_TYPE
attribute does not contain EGL_WINDOW_BIT ), an EGL_BAD_MATCH error is
generated.

  then the resolution text can be clarified as:

RESOLUTION: Attempting to create a non-window surface with a config in
which EGL_TRANSPARENT_TYPE is EGL_TRANSPARENT_ALPHA_MESA, then an
EGL_BAD_MATCH error is generated.


- Other than the above (mostly sylistic) issues, the extension looks good to
  me.  Thanks for writing it! When you submit to mesa-dev, please CC me.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2] glsl: add double support for packing varyings

2015-02-20 Thread Ilia Mirkin
Doubles are always packed, but a single double will never cross a slot
boundary -- single slots can still be wasted in some situations.

Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
---

Packing *all* doubles is a bit of a hack... we shouldn't need to do it
for dvec2/dvec4. And yet, here we are. This means that setting
explicit locations on doubles is also unlikely to go well. But this
should be enough to get it mostly going.

v1 - v2:
 - split out variables into a separate list so that it can work for GS

 src/glsl/lower_packed_varyings.cpp | 117 -
 1 file changed, 90 insertions(+), 27 deletions(-)

diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index 5e844c7..2c9a1c4 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -146,7 +146,11 @@
 
 #include glsl_symbol_table.h
 #include ir.h
+#include ir_builder.h
 #include ir_optimization.h
+#include program/prog_instruction.h
+
+using namespace ir_builder;
 
 namespace {
 
@@ -163,13 +167,14 @@ public:
lower_packed_varyings_visitor(void *mem_ctx, unsigned locations_used,
  ir_variable_mode mode,
  unsigned gs_input_vertices,
- exec_list *out_instructions);
+ exec_list *out_instructions,
+ exec_list *out_variables);
 
void run(exec_list *instructions);
 
 private:
-   ir_assignment *bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
-   ir_assignment *bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_pack(ir_rvalue *lhs, ir_rvalue *rhs);
+   void bitwise_assign_unpack(ir_rvalue *lhs, ir_rvalue *rhs);
unsigned lower_rvalue(ir_rvalue *rvalue, unsigned fine_location,
  ir_variable *unpacked_var, const char *name,
  bool gs_input_toplevel, unsigned vertex_index);
@@ -221,13 +226,19 @@ private:
 * appropriate place in the shader once the visitor has finished running.
 */
exec_list *out_instructions;
+
+   /**
+* Exec list into which the visitor should insert any new variables.
+*/
+   exec_list *out_variables;
 };
 
 } /* anonymous namespace */
 
 lower_packed_varyings_visitor::lower_packed_varyings_visitor(
   void *mem_ctx, unsigned locations_used, ir_variable_mode mode,
-  unsigned gs_input_vertices, exec_list *out_instructions)
+  unsigned gs_input_vertices, exec_list *out_instructions,
+  exec_list *out_variables)
: mem_ctx(mem_ctx),
  locations_used(locations_used),
  packed_varyings((ir_variable **)
@@ -235,7 +246,8 @@ 
lower_packed_varyings_visitor::lower_packed_varyings_visitor(
 locations_used)),
  mode(mode),
  gs_input_vertices(gs_input_vertices),
- out_instructions(out_instructions)
+ out_instructions(out_instructions),
+ out_variables(out_variables)
 {
 }
 
@@ -274,6 +286,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
}
 }
 
+#define SWIZZLE_ZWZW MAKE_SWIZZLE4(SWIZZLE_Z, SWIZZLE_W, SWIZZLE_Z, SWIZZLE_W)
 
 /**
  * Make an ir_assignment from \c rhs to \c lhs, performing appropriate
@@ -281,7 +294,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
  *
  * This function is called when packing varyings.
  */
-ir_assignment *
+void
 lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
ir_rvalue *rhs)
 {
@@ -300,12 +313,28 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  rhs = new(this-mem_ctx)
 ir_expression(ir_unop_bitcast_f2i, lhs-type, rhs);
  break;
+  case GLSL_TYPE_DOUBLE:
+ assert(rhs-type-vector_elements = 2);
+ if (rhs-type-vector_elements == 2) {
+ir_variable *t = new(mem_ctx) ir_variable(lhs-type, pack, 
ir_var_temporary);
+
+assert(lhs-type-vector_elements == 4);
+this-out_variables-push_tail(t);
+this-out_instructions-push_tail(
+  assign(t, u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_x(rhs-clone(mem_ctx, NULL, 0x3));
+this-out_instructions-push_tail(
+  assign(t,  u2i(expr(ir_unop_unpack_double_2x32, 
swizzle_y(rhs))), 0xc));
+rhs = deref(t).val;
+ } else {
+rhs = u2i(expr(ir_unop_unpack_double_2x32, rhs));
+ }
+ break;
   default:
  assert(!Unexpected type conversion while lowering varyings);
  break;
   }
}
-   return new(this-mem_ctx) ir_assignment(lhs, rhs);
+   this-out_instructions-push_tail(new (this-mem_ctx) ir_assignment(lhs, 
rhs));
 }
 
 
@@ -315,7 +344,7 @@ 
lower_packed_varyings_visitor::bitwise_assign_pack(ir_rvalue *lhs,
  *
  * This function is called when unpacking varyings.
  */
-ir_assignment *
+void
 

Re: [Mesa-dev] [PATCH 1/6] gallium: add some more double opcodes to avoid unnecessary lowering

2015-02-20 Thread Ilia Mirkin
On Fri, Feb 20, 2015 at 9:37 AM, Roland Scheidegger srol...@vmware.com wrote:
 Just some minor nits.

 Am 20.02.2015 um 00:52 schrieb Ilia Mirkin:
 Signed-off-by: Ilia Mirkin imir...@alum.mit.edu
 ---
  src/gallium/auxiliary/tgsi/tgsi_info.c |  5 
  src/gallium/docs/source/tgsi.rst   | 39 
 ++
  src/gallium/include/pipe/p_shader_tokens.h |  7 +-
  3 files changed, 50 insertions(+), 1 deletion(-)

 diff --git a/src/gallium/auxiliary/tgsi/tgsi_info.c 
 b/src/gallium/auxiliary/tgsi/tgsi_info.c
 index d04f9da..4d838fd 100644
 --- a/src/gallium/auxiliary/tgsi/tgsi_info.c
 +++ b/src/gallium/auxiliary/tgsi/tgsi_info.c
 @@ -257,6 +257,11 @@ static const struct tgsi_opcode_info 
 opcode_info[TGSI_OPCODE_LAST] =
 { 1, 1, 0, 0, 0, 0, COMP, D2U, TGSI_OPCODE_D2U },
 { 1, 1, 0, 0, 0, 0, COMP, U2D, TGSI_OPCODE_U2D },
 { 1, 1, 0, 0 ,0, 0, COMP, DRSQ, TGSI_OPCODE_DRSQ },
 +   { 1, 1, 0, 0, 0, 0, COMP, DTRUNC, TGSI_OPCODE_DTRUNC },
 +   { 1, 1, 0, 0, 0, 0, COMP, DCEIL, TGSI_OPCODE_DCEIL },
 +   { 1, 1, 0, 0, 0, 0, COMP, DFLR, TGSI_OPCODE_DFLR },
 +   { 1, 1, 0, 0, 0, 0, COMP, DROUND, TGSI_OPCODE_DROUND },
 +   { 1, 1, 0, 0, 0, 0, COMP, DSSG, TGSI_OPCODE_DSSG },
  };

  const struct tgsi_opcode_info *
 diff --git a/src/gallium/docs/source/tgsi.rst 
 b/src/gallium/docs/source/tgsi.rst
 index e20af79..15f1e9f 100644
 --- a/src/gallium/docs/source/tgsi.rst
 +++ b/src/gallium/docs/source/tgsi.rst
 @@ -1861,6 +1861,45 @@ two-component vectors with doubled precision in each 
 component.

dst.zw = src.zw - \lfloor src.zw\rfloor

 +.. opcode:: DTRUNC - Truncate
 +
 +.. math::
 +
 +  dst.xy = trunc(src.xy)
 +
 +  dst.zw = trunc(src.zw)
 +
 +.. opcode:: DCEIL - Ceiling
 +
 +.. math::
 +
 +  dst.xy = \lceil src.xy\rceil
 +
 +  dst.zw = \lceil src.zw\rceil
 +
 +.. opcode:: DFLR - Floor
 +
 +.. math::
 +
 +  dst.xy = \lfloor src.xy\rfloor
 +
 +  dst.zw = \lfloor src.zw\rfloor
 +
 +.. opcode:: DROUND - Fraction
 +
 +.. math::
 +
 +  dst.xy = round(src.xy)
 +
 +  dst.zw = round(src.zw)
 +
 +.. opcode:: DSSG - Set Sign
 +
 +.. math::
 +
 +  dst.xy = (src.xy  0) ? 1 : (src.xy  0) ? -1 : 0
 +
 +  dst.zw = (src.zw  0) ? 1 : (src.zw  0) ? -1 : 0
 I think this would be more obvious written as 1.0/-1.0/0.0 (same goes
 for the non-double version, of course).

Yep, that was Dave's comment too. I've fixed it here, left SSG alone.

 (As a side note, I'm wondering if this actually has defined NaN
 behavior, the glsl language sort of implies a number has to be exactly
 one of these 3 cases, and doesn't give an answer what should be returned
 for a NaN - maybe a NaN?)

I'll leave that for another day :)



  .. opcode:: DFRACEXP - Convert Number to Fractional and Integral Components

 diff --git a/src/gallium/include/pipe/p_shader_tokens.h 
 b/src/gallium/include/pipe/p_shader_tokens.h
 index fc41cc9..95ac590 100644
 --- a/src/gallium/include/pipe/p_shader_tokens.h
 +++ b/src/gallium/include/pipe/p_shader_tokens.h
 @@ -519,7 +519,12 @@ struct tgsi_property_data {
  #define TGSI_OPCODE_D2U 215
  #define TGSI_OPCODE_U2D 216
  #define TGSI_OPCODE_DRSQ217 /* eg, cayman also has DRSQ */
 -#define TGSI_OPCODE_LAST218
 +#define TGSI_OPCODE_DTRUNC  218 /* nvc0 */
 +#define TGSI_OPCODE_DCEIL   219 /* nvc0 */
 +#define TGSI_OPCODE_DFLR220 /* nvc0 */
 +#define TGSI_OPCODE_DROUND  221 /* nvc0 */
 +#define TGSI_OPCODE_DSSG222
 +#define TGSI_OPCODE_LAST223

 I don't think these hw-specific references (nvc0) really fit in there -
 the same can be said about some existing ones (cayman...)

I went with the flow... started by Dave, I believe. Happy to remove
them all too. Dave -- thoughts?

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 11/32] i965/fs: Fix lower_load_payload() to take into account non-zero reg_offset.

2015-02-20 Thread Jason Ekstrand
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com

On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez curroje...@riseup.net
wrote:

 Fixes metadata guess when instructions in the program specify a
 destination register with non-zero reg_offset and when the payload of
 a LOAD_PAYLOAD spans several registers.
 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index eccae06..163aa41 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3070,7 +3070,7 @@ fs_visitor::lower_load_payload()

 foreach_block_and_inst_safe (block, fs_inst, inst, cfg) {
if (inst-dst.file == GRF) {
 - const int dst_reg = vgrf_to_reg[inst-dst.reg];
 + const int dst_reg = vgrf_to_reg[inst-dst.reg] +
 inst-dst.reg_offset;
   bool force_sechalf = inst-force_sechalf 
!inst-force_writemask_all;
   bool toggle_sechalf = inst-dst.width == 16 
 @@ -3119,7 +3119,7 @@ fs_visitor::lower_load_payload()
 }

 if (dst.file == GRF) {
 -  const int dst_reg = vgrf_to_reg[dst.reg];
 +  const int dst_reg = vgrf_to_reg[dst.reg] +
 dst.reg_offset;
const bool force_writemask = mov-force_writemask_all;
metadata[dst_reg].force_writemask_all = force_writemask;
metadata[dst_reg].force_sechalf = mov-force_sechalf;
 --
 2.1.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] gallium/sw/kms: don't redefine DEBUG

2015-02-20 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index ed34dfa..781818a 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -53,9 +53,9 @@
 #include state_tracker/drm_driver.h
 
 #if 0
-#define DEBUG(msg, ...) fprintf(stderr, msg, __VA_ARGS__)
+#define DEBUG_PRINT(msg, ...) fprintf(stderr, msg, __VA_ARGS__)
 #else
-#define DEBUG(msg, ...)
+#define DEBUG_PRINT(msg, ...)
 #endif
 
 struct sw_winsys;
@@ -145,7 +145,7 @@ kms_sw_displaytarget_create(struct sw_winsys *ws,
 
list_add(kms_sw_dt-link, kms_sw-bo_list);
 
-   DEBUG(KMS-DEBUG: created buffer %u (size %u)\n, kms_sw_dt-handle, 
kms_sw_dt-size);
+   DEBUG_PRINT(KMS-DEBUG: created buffer %u (size %u)\n, kms_sw_dt-handle, 
kms_sw_dt-size);
 
*stride = kms_sw_dt-stride;
return (struct sw_displaytarget *)kms_sw_dt;
@@ -177,7 +177,7 @@ kms_sw_displaytarget_destroy(struct sw_winsys *ws,
 
list_del(kms_sw_dt-link);
 
-   DEBUG(KMS-DEBUG: destroyed buffer %u\n, kms_sw_dt-handle);
+   DEBUG_PRINT(KMS-DEBUG: destroyed buffer %u\n, kms_sw_dt-handle);
 
FREE(kms_sw_dt);
 }
@@ -205,7 +205,7 @@ kms_sw_displaytarget_map(struct sw_winsys *ws,
if (kms_sw_dt-mapped == MAP_FAILED)
   return NULL;
 
-   DEBUG(KMS-DEBUG: mapped buffer %u (size %u) at %p\n,
+   DEBUG_PRINT(KMS-DEBUG: mapped buffer %u (size %u) at %p\n,
  kms_sw_dt-handle, kms_sw_dt-size, kms_sw_dt-mapped);
 
return kms_sw_dt-mapped;
@@ -249,7 +249,7 @@ kms_sw_displaytarget_unmap(struct sw_winsys *ws,
 {
struct kms_sw_displaytarget *kms_sw_dt = kms_sw_displaytarget(dt);
 
-   DEBUG(KMS-DEBUG: unmapped buffer %u (was %p)\n, kms_sw_dt-handle, 
kms_sw_dt-mapped);
+   DEBUG_PRINT(KMS-DEBUG: unmapped buffer %u (was %p)\n, kms_sw_dt-handle, 
kms_sw_dt-mapped);
 
munmap(kms_sw_dt-mapped, kms_sw_dt-size);
kms_sw_dt-mapped = NULL;
@@ -283,7 +283,7 @@ kms_sw_displaytarget_from_handle(struct sw_winsys *ws,
  if (kms_sw_dt-handle == whandle-handle) {
 kms_sw_dt-ref_count++;
 
-DEBUG(KMS-DEBUG: imported buffer %u (size %u)\n, 
kms_sw_dt-handle, kms_sw_dt-size);
+DEBUG_PRINT(KMS-DEBUG: imported buffer %u (size %u)\n, 
kms_sw_dt-handle, kms_sw_dt-size);
 
 *stride = kms_sw_dt-stride;
 return (struct sw_displaytarget *)kms_sw_dt;
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] gallivm: fix uninitialized-variable warnings

2015-02-20 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/auxiliary/gallivm/lp_bld_init.c   | 2 +-
 src/gallium/auxiliary/gallivm/lp_bld_sample.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index b9593de..6133883 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -509,7 +509,7 @@ void
 gallivm_compile_module(struct gallivm_state *gallivm)
 {
LLVMValueRef func;
-   int64_t time_begin;
+   int64_t time_begin = 0;
 
assert(!gallivm-compiled);
 
diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index 093c8fc..7d18ee5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -221,7 +221,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
struct lp_build_context *coord_bld = bld-coord_bld;
struct lp_build_context *rho_bld = bld-lodf_bld;
const unsigned dims = bld-dims;
-   LLVMValueRef ddx_ddy[2];
+   LLVMValueRef ddx_ddy[2] = {NULL};
LLVMBuilderRef builder = bld-gallivm-builder;
LLVMTypeRef i32t = LLVMInt32TypeInContext(bld-gallivm-context);
LLVMValueRef index0 = LLVMConstInt(i32t, 0, 0);
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/6] gallium/sw/kms: fix a type-mismatch warning

2015-02-20 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c 
b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
index 781818a..ce3de78 100644
--- a/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
+++ b/src/gallium/winsys/sw/kms-dri/kms_dri_sw_winsys.c
@@ -313,7 +313,7 @@ kms_sw_displaytarget_get_handle(struct sw_winsys *winsys,
   return TRUE;
case DRM_API_HANDLE_TYPE_FD:
   if (!drmPrimeHandleToFD(kms_sw-fd, kms_sw_dt-handle,
- DRM_CLOEXEC, whandle-handle)) {
+ DRM_CLOEXEC, (int*)whandle-handle)) {
  whandle-stride = kms_sw_dt-stride;
  return TRUE;
   }
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] tgsi: fix type-mismatch warning

2015-02-20 Thread Marek Olšák
From: Marek Olšák marek.ol...@amd.com

---
 src/gallium/auxiliary/tgsi/tgsi_text.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/tgsi/tgsi_text.c 
b/src/gallium/auxiliary/tgsi/tgsi_text.c
index 5069d13..b6b3585 100644
--- a/src/gallium/auxiliary/tgsi/tgsi_text.c
+++ b/src/gallium/auxiliary/tgsi/tgsi_text.c
@@ -240,7 +240,7 @@ static boolean parse_double( const char **pcur, uint32_t 
*val0, uint32_t *val1)
   uint32_t uval[2];
} v;
 
-   v.dval = strtod(cur, pcur);
+   v.dval = strtod(cur, (char**)pcur);
if (*pcur == cur)
   return FALSE;
 
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] gallivm: fix uninitialized-variable warnings

2015-02-20 Thread Brian Paul

On 02/20/2015 12:22 PM, Marek Olšák wrote:

From: Marek Olšák marek.ol...@amd.com

---
  src/gallium/auxiliary/gallivm/lp_bld_init.c   | 2 +-
  src/gallium/auxiliary/gallivm/lp_bld_sample.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_init.c 
b/src/gallium/auxiliary/gallivm/lp_bld_init.c
index b9593de..6133883 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_init.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_init.c
@@ -509,7 +509,7 @@ void
  gallivm_compile_module(struct gallivm_state *gallivm)
  {
 LLVMValueRef func;
-   int64_t time_begin;
+   int64_t time_begin = 0;

 assert(!gallivm-compiled);

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_sample.c 
b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
index 093c8fc..7d18ee5 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_sample.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_sample.c
@@ -221,7 +221,7 @@ lp_build_rho(struct lp_build_sample_context *bld,
 struct lp_build_context *coord_bld = bld-coord_bld;
 struct lp_build_context *rho_bld = bld-lodf_bld;
 const unsigned dims = bld-dims;
-   LLVMValueRef ddx_ddy[2];
+   LLVMValueRef ddx_ddy[2] = {NULL};
 LLVMBuilderRef builder = bld-gallivm-builder;
 LLVMTypeRef i32t = LLVMInt32TypeInContext(bld-gallivm-context);
 LLVMValueRef index0 = LLVMConstInt(i32t, 0, 0);



For the series, Reviewed-by: Brian Paul bri...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 13/23] main: Minor whitespace fixes in ClearNamedBuffer[Sub]Data.

2015-02-20 Thread Laura Ekstrand
Again, Ian requested that this be a separate commit.

On Fri, Feb 20, 2015 at 6:22 AM, Martin Peres martin.pe...@free.fr wrote:

 Please squash this in the previous commit (with git rebase -i).


 On 12/02/2015 04:05, Laura Ekstrand wrote:

 ---
   src/mesa/main/bufferobj.c | 4 ++--
   src/mesa/main/bufferobj.h | 4 ++--
   2 files changed, 4 insertions(+), 4 deletions(-)

 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index bd21c8a..88230d6 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -1765,10 +1765,10 @@ void GLAPIENTRY
   _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
GLintptr offset, GLsizeiptr size,
GLenum format, GLenum type,
 - const GLvoid* data)
 + const GLvoid *data)
   {
  GET_CURRENT_CONTEXT(ctx);
 -   struct gl_buffer_object* bufObj;
 +   struct gl_buffer_object *bufObj;
bufObj = get_buffer(ctx, glClearBufferSubData, target,
 GL_INVALID_VALUE);
  if (!bufObj)
 diff --git a/src/mesa/main/bufferobj.h b/src/mesa/main/bufferobj.h
 index 5254727..2a66444 100644
 --- a/src/mesa/main/bufferobj.h
 +++ b/src/mesa/main/bufferobj.h
 @@ -220,7 +220,7 @@ _mesa_GetBufferSubData(GLenum target, GLintptrARB
 offset,
   void GLAPIENTRY
   _mesa_ClearBufferData(GLenum target, GLenum internalformat,
 GLenum format, GLenum type,
 -  const GLvoid * data);
 +  const GLvoid *data);
 void GLAPIENTRY
   _mesa_ClearNamedBufferData(GLuint buffer, GLenum internalformat,
 @@ -231,7 +231,7 @@ void GLAPIENTRY
   _mesa_ClearBufferSubData(GLenum target, GLenum internalformat,
GLintptr offset, GLsizeiptr size,
GLenum format, GLenum type,
 - const GLvoid * data);
 + const GLvoid *data);
 void GLAPIENTRY
   _mesa_ClearNamedBufferSubData(GLuint buffer, GLenum internalformat,



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/32] i965/fs: Fix lower_load_payload() to take into account stride in the metadata guess.

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 Where are you getting a destination stride in a load_payload?  The whole
 point of load_payload was to remove partial writes so this seems to go
 against everything it's intended for.
 --Jason

Yeah, maybe...  This case was triggered by some of my image_load_store
code that ends up using byte types with stride=4 for some image formats.


 On Fri, Feb 6, 2015 at 9:42 AM, Francisco Jerez curroje...@riseup.net
 wrote:

 ---
  src/mesa/drivers/dri/i965/brw_fs.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index 163aa41..8da1f47 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3074,7 +3074,7 @@ fs_visitor::lower_load_payload()
   bool force_sechalf = inst-force_sechalf 
!inst-force_writemask_all;
   bool toggle_sechalf = inst-dst.width == 16 
 -   type_sz(inst-dst.type) == 4 
 +   type_sz(inst-dst.type) * inst-dst.stride
 == 4 
 !inst-force_writemask_all;
   for (int i = 0; i  inst-regs_written; ++i) {
  metadata[dst_reg + i].written = true;
 --
 2.1.3

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev



pgpsLmQeSeN85.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/4] common: Correct texture initialization in create_texture_for_pbo.

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 4:30 PM, Laura Ekstrand la...@jlekstrand.net
wrote:

 Solves bugs related to the driver not setting up the texture miptree
 correctly, leading to faulty PBO uploads and downloads.
 ---
  src/mesa/drivers/common/meta_tex_subimage.c | 13 +
  src/mesa/drivers/dri/i965/intel_tex.c   |  3 ++-
  src/mesa/main/dd.h  |  1 +
  3 files changed, 12 insertions(+), 5 deletions(-)

 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index 68c8273..f4f7716 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -51,7 +51,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,
  {
 uint32_t pbo_format;
 GLenum internal_format;
 -   unsigned row_stride;
 +   unsigned row_stride, image_stride;
 struct gl_buffer_object *buffer_obj;
 struct gl_texture_object *tex_obj;
 struct gl_texture_image *tex_image;
 @@ -74,6 +74,8 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,
 pixels = _mesa_image_address3d(packing, pixels,
width, height, format, type, 0, 0, 0);
 row_stride = _mesa_image_row_stride(packing, width, format, type);
 +   image_stride = _mesa_image_image_stride(packing, width, height, format,
 +   type);

 if (_mesa_is_bufferobj(packing-BufferObj)) {
*tmp_pbo = 0;
 @@ -89,7 +91,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,
 */
_mesa_BindBuffer(pbo_target, *tmp_pbo);

 -  _mesa_BufferData(pbo_target, row_stride * height, pixels,
 GL_STREAM_DRAW);
 +  _mesa_BufferData(pbo_target, image_stride * depth, pixels,
 GL_STREAM_DRAW);

buffer_obj = ctx-Unpack.BufferObj;
pixels = NULL;
 @@ -99,9 +101,11 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,

 _mesa_GenTextures(1, tmp_tex);
 tex_obj = _mesa_lookup_texture(ctx, *tmp_tex);
 -   tex_obj-Target = depth  1 ? GL_TEXTURE_2D_ARRAY : GL_TEXTURE_2D;
 -   tex_obj-Immutable = GL_TRUE;
 _mesa_initialize_texture_object(ctx, tex_obj, *tmp_tex, GL_TEXTURE_2D);
 +   /* This must be set after _mesa_initialize_texture_object, not before.
 */
 +   tex_obj-Immutable = GL_TRUE;
 +   /* This is required for interactions with ARB_texture_view. */
 +   tex_obj-NumLayers = 1;

 internal_format = _mesa_get_format_base_format(pbo_format);

 @@ -114,6 +118,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,
   buffer_obj,
   (intptr_t)pixels,
   row_stride,
 + image_stride,
   read_only)) {
_mesa_DeleteTextures(1, tmp_tex);
_mesa_DeleteBuffers(1, tmp_pbo);
 diff --git a/src/mesa/drivers/dri/i965/intel_tex.c
 b/src/mesa/drivers/dri/i965/intel_tex.c
 index 2d3009a..3a0c09a 100644
 --- a/src/mesa/drivers/dri/i965/intel_tex.c
 +++ b/src/mesa/drivers/dri/i965/intel_tex.c
 @@ -305,6 +305,7 @@ intel_set_texture_storage_for_buffer_object(struct
 gl_context *ctx,
  struct gl_buffer_object
 *buffer_obj,
  uint32_t buffer_offset,
  uint32_t row_stride,
 +uint32_t image_stride,
  bool read_only)
  {
 struct brw_context *brw = brw_context(ctx);
 @@ -334,7 +335,7 @@ intel_set_texture_storage_for_buffer_object(struct
 gl_context *ctx,

 drm_intel_bo *bo = intel_bufferobj_buffer(brw, intel_buffer_obj,
   buffer_offset,
 - row_stride * image-Height);
 + image_stride * image-Depth);
 intel_texobj-mt =
intel_miptree_create_for_bo(brw, bo,
image-TexFormat,
 diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
 index ec8662b..9de08e2 100644
 --- a/src/mesa/main/dd.h
 +++ b/src/mesa/main/dd.h
 @@ -429,6 +429,7 @@ struct dd_function_table {
  struct gl_buffer_object
 *bufferObj,
  uint32_t buffer_offset,
  uint32_t row_stride,
 +uint32_t image_stride,
  bool read_only);


Given that the next patch fixes it to only use 2D textures, we shouldn't
need the changes to the DD table.  In fact, we should take all the stuff
about array textures out of the DD table entry, comment, and
implementation.  However, 

Re: [Mesa-dev] [PATCH 2/4] common: Correct PBO 2D_ARRAY handling.

2015-02-20 Thread Laura Ekstrand
There needs to be some corresponding Piglit test to guarantee that skip
pixels is really a problem here.  Otherwise, I'm flying blind trying to fix
it.

On Fri, Feb 20, 2015 at 5:04 PM, Jason Ekstrand ja...@jlekstrand.net
wrote:

 This is mostly correct and it's a good solution.  The only problem is that
 it doesn't handle the skipRows packing property properly.  This property
 tells the driver the stride (in rows) between image planes in the data.
 Most of the places where depth is used below, we should be using the stride
 (in rows) between images.  Look at the _mesa_get_image_stride familiy of
 functions to see exactly how to handle this.

 On Fri, Feb 20, 2015 at 4:30 PM, Laura Ekstrand la...@jlekstrand.net
 wrote:

 Changes PBO uploads and downloads to use a tall (height * depth) 2D
 texture
 for blitting.  This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY
 textures are not properly uploaded and downloaded.
 ---
  src/mesa/drivers/common/meta_tex_subimage.c | 6 +++---
  1 file changed, 3 insertions(+), 3 deletions(-)

 diff --git a/src/mesa/drivers/common/meta_tex_subimage.c
 b/src/mesa/drivers/common/meta_tex_subimage.c
 index f4f7716..ee3295b 100644
 --- a/src/mesa/drivers/common/meta_tex_subimage.c
 +++ b/src/mesa/drivers/common/meta_tex_subimage.c
 @@ -110,7 +110,7 @@ create_texture_for_pbo(struct gl_context *ctx, bool
 create_pbo,
 internal_format = _mesa_get_format_base_format(pbo_format);

 tex_image = _mesa_get_tex_image(ctx, tex_obj, tex_obj-Target, 0);
 -   _mesa_init_teximage_fields(ctx, tex_image, width, height, depth,
 +   _mesa_init_teximage_fields(ctx, tex_image, width, height * depth, 1,
0, internal_format, pbo_format);


 read_only = pbo_target == GL_PIXEL_UNPACK_BUFFER;
 @@ -227,7 +227,7 @@ _mesa_meta_pbo_TexSubImage(struct gl_context *ctx,
 GLuint dims,
_mesa_update_state(ctx);

_mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
 - 0, 0, width, height,
 + 0, z * height, width, (z + 1) * height,
   xoffset, yoffset,
   xoffset + width, yoffset + height,
   GL_COLOR_BUFFER_BIT, GL_NEAREST);
 @@ -349,7 +349,7 @@ _mesa_meta_pbo_GetTexSubImage(struct gl_context *ctx,
 GLuint dims,
_mesa_meta_BlitFramebuffer(ctx, ctx-ReadBuffer, ctx-DrawBuffer,
   xoffset, yoffset,
   xoffset + width, yoffset + height,
 - 0, 0, width, height,
 + 0, z * height, width, (z + 1) * height,
   GL_COLOR_BUFFER_BIT, GL_NEAREST);
 }

 --
 2.1.0

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the source
 so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


 Yes, I'm aware of that.  However, the part of that register that has to
 squash everything is usually only a couple of registers while the entire
 payload may be up to 13 (if I remmeber correctly).  We'd rather not have to
 squash everything if we can.  All that being said, our liveness/register
 allocation can't handle this and making register allocation handle parts of
 registers interfering but not other bits is going to be a real pain.  Maybe
 not even worth it.  If you'd rather force_writemask_all everything, I'll
 sign off on that for now.  I just wanted to point out that it may not be a
 good permanent solution.


 A reasonable compromise might be to leave it up to the caller whether to
 set the force_writemask_all and force_sechalf flags or not.  I.e. just
 copy the same flags set on the LOAD_PAYLOAD instruction to the MOV
 instructions.  That would still allow reducing the liveness intervals in
 cases where we know that the payload respects channel boundaries.

 Tracking the flag information per-register in cases where the same
 payload has well- and ill-behaved values seems rather premature and
 rather useless to me because the optimizer is likely to be able to get
 rid of redundant 

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Francisco Jerez
Jason Ekstrand ja...@jlekstrand.net writes:

 One more comment here...  This particularly regards your plan of separating
 it into things that match the destination and other things and not copy
 prop uniforms or immediates into the other things.  There is another case
 we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write messages
 are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual (r0,
 r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect SIMD16
 load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
 those platforms, there is a special type of MOV-to-MRF that can move from
 (gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
 now.  However, it doesn't follow the standard rules.
 --Jason

I don't see why it couldn't be handled in roughly the same way you do it
now?  Recognize when src[i + 4] is the same 8-wide register as src[i]
shifted by 8 and emit a COMPR4 copy in that case?

 On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand ja...@jlekstrand.net
 wrote:



 On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   I'm still a little pensive.  But
  
   Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
  
  Thanks.
 
   Now for a little aside.  I have come to the conclusion that I made a
  grave
   mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I should
 have
   just subclassed fs_inst for load_payload.  The problem is that we
 need to
   snag a bunch of information for the sources when we create the
   load_payload.  In particular, we need to know the width of the
 source so
   that we know how much space it consumes in the payload and we need to
  know
   the information required to properly re-create the mov such as
   force_sechalf and force_writemask_all.  Really, in order to do things
   properly, we need to gather this information *before* we do any
   optimizations.  The nasty pile of code that you're editing together
 with
   the effective_width parameter is a lame attempt to
 capture/reconstruct
   this information.  Really, we should just subclass, capture the
  information
   up-front, and do it properly.
  
  Yes, absolutely, this lowering pass is a real mess.  There are four
 more
  unreviewed patches in the mailing list fixing bugs of the metadata
  guessing of lower_load_payload() [1][2][3][4], you may want to take a
  look at them.  There are more bugs I'm aware of but it didn't seem
  practical to fix them.
 
 
  Yeah, Matt pointed me at those.  I'll give them a look later today.
 
 
  That said, I don't think it would be worth subclassing fs_inst.
 
  My suggestion would have been to keep it simple and lower LOAD_PAYLOAD
  into a series of MOVs with force_writemask_all enabled in all cases,
  simply rely on the optimizer to eliminate those where possible.  Then
  get rid of the metadata and effective_width guessing.  Instead agree on
  immediates and uniforms being exec_size-wide by convention
  (i.e. LOAD_PAYLOAD's exec_size rather than the original instruction's),
  then prevent constant propagation from propagating an immediate load
  into a LOAD_PAYLOAD if it would lead to a change in the semantics of
 the
  program, and maybe just run constant propagation after lowering so we
  can be sure those cases are cleaned up properly where register coalesce
  isn't enough.
 
 
  First off, simply setting force_writemask_all isn't an option.  Consider
  the following scenario:
 
  a = foo;
  if (bar) {
 b = 7;
  } else {
 use(a);
  }
  use(b);
 
  If use(a) is the last use of the variable a, then the live ranges of a
  and b don't actually over-lap and we can assign a and b to the same
  register.  However, if force_writemask_all is set on b, it will destroy
 the
  value in a before its last use.  Right now, we don't actually do this
  because our liveness analysis pass flattens everything so it will think
  that b and a over-lap even though they don't.  However, if we ever
 decide
  to make the liveness information more accurate, having a pile of
  force_writemask_all instructions will be a major problem.  So, while it
 is
  *technically* safe for now, it's a really bad idea in the long-term.
 
 The thing is we *will* have to deal with that scenario.  Building a
 message payload inherently involves crossing channel boundaries (because
 of headers, unsupported SIMD modes by some shared functions, etc.).  I'd
 say it's a bug in the liveness analysis pass if it wouldn't consider the
 liveness interval of a and b to overlap in your example if the
 assignment of b had force_writemask_all set.


 Yes, I'm aware of that.  However, the part of that register that has to
 squash everything is usually only a couple of registers while the entire
 payload may be up to 13 (if I remmeber 

[Mesa-dev] [PATCH 7/7] i965: Fix variable indexing of sampler arrays under non-uniform control flow.

2015-02-20 Thread Francisco Jerez
ARB_gpu_shader5 requires sampler array indexing expressions to be
dynamically uniform, this however doesn't have any implications on the
control flow that leads to the evaluation of that expression being
uniform.  Use emit_uniformize() to obtain an arbitrary live value from
the binding table index calculation instead of assuming that the first
channel is always live.

Fixes the following Piglit test cases:
  
arb_gpu_shader5/execution/sampler_array_indexing/fs-nonuniform-control-flow.shader_test
  
arb_gpu_shader5/execution/sampler_array_indexing/vs-nonuniform-control-flow.shader_test

part of the series:
  http://lists.freedesktop.org/archives/piglit/2015-February/014615.html
---
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 4 ++--
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 5 +++--
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index ad77752..030ce24 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -1563,8 +1563,8 @@ fs_visitor::nir_emit_texture(nir_tex_instr *instr)
 
  /* Emit code to evaluate the actual indexing expression */
  sampler_reg = vgrf(glsl_type::uint_type);
- emit(ADD(sampler_reg, src, fs_reg(sampler)))
- -force_writemask_all = true;
+ emit(ADD(sampler_reg, src, fs_reg(sampler)));
+ emit_uniformize(sampler_reg, sampler_reg);
  break;
   }
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index dee6a6d..d7cad9f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2166,8 +2166,9 @@ fs_visitor::visit(ir_texture *ir)
   /* Emit code to evaluate the actual indexing expression */
   nonconst_sampler_index-accept(this);
   fs_reg temp = vgrf(glsl_type::uint_type);
-  emit(ADD(temp, this-result, fs_reg(sampler)))
--force_writemask_all = true;
+  emit(ADD(temp, this-result, fs_reg(sampler)));
+  emit_uniformize(temp, temp);
+
   sampler_reg = temp;
} else {
   /* Single sampler, or constant array index; the indexing expression
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index d5848d2..c260381 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2497,8 +2497,9 @@ vec4_visitor::visit(ir_texture *ir)
   /* Emit code to evaluate the actual indexing expression */
   nonconst_sampler_index-accept(this);
   dst_reg temp(this, glsl_type::uint_type);
-  emit(ADD(temp, this-result, src_reg(sampler)))
- -force_writemask_all = true;
+  emit(ADD(temp, this-result, src_reg(sampler)));
+  emit_uniformize(temp, src_reg(temp));
+
   sampler_reg = src_reg(temp);
} else {
   /* Single sampler, or constant array index; the indexing expression
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/7] i965: Define helper function to copy an arbitrary live component from some register.

2015-02-20 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.h |  2 ++
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 12 
 src/mesa/drivers/dri/i965/brw_vec4.h   |  3 +++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++
 4 files changed, 28 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 6e5fa1d..df6b825 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -306,6 +306,8 @@ public:
  const fs_reg a);
void emit_minmax(enum brw_conditional_mod conditionalmod, const fs_reg dst,
 const fs_reg src0, const fs_reg src1);
+   /** Copy any live channel from \p src to the first channel of \p dst. */
+   void emit_uniformize(const fs_reg dst, const fs_reg src);
bool try_emit_saturate(ir_expression *ir);
bool try_emit_line(ir_expression *ir);
bool try_emit_mad(ir_expression *ir);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4b48f2d..6f1ff20 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -337,6 +337,18 @@ fs_visitor::emit_minmax(enum brw_conditional_mod 
conditionalmod, const fs_reg d
}
 }
 
+void
+fs_visitor::emit_uniformize(const fs_reg dst, const fs_reg src)
+{
+   const fs_reg chan_index = vgrf(glsl_type::uint_type);
+
+   emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, component(chan_index, 0))
+  -force_writemask_all = true;
+   emit(SHADER_OPCODE_BROADCAST, component(dst, 0),
+src, component(chan_index, 0))
+  -force_writemask_all = true;
+}
+
 bool
 fs_visitor::try_emit_saturate(ir_expression *ir)
 {
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 7a92d59..adb5fe4 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -298,6 +298,9 @@ public:
void emit_lrp(const dst_reg dst,
  const src_reg x, const src_reg y, const src_reg a);
 
+   /** Copy any live channel from \p src to the first channel of \p dst. */
+   void emit_uniformize(const dst_reg dst, const src_reg src);
+
void emit_block_move(dst_reg *dst, src_reg *src,
 const struct glsl_type *type, brw_predicate predicate);
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index f6f589d..4515fc7 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1304,6 +1304,17 @@ vec4_visitor::emit_lrp(const dst_reg dst,
 }
 
 void
+vec4_visitor::emit_uniformize(const dst_reg dst, const src_reg src)
+{
+   const src_reg chan_index(this, glsl_type::uint_type);
+
+   emit(SHADER_OPCODE_FIND_LIVE_CHANNEL, dst_reg(chan_index))
+  -force_writemask_all = true;
+   emit(SHADER_OPCODE_BROADCAST, dst, src, chan_index)
+  -force_writemask_all = true;
+}
+
+void
 vec4_visitor::visit(ir_expression *ir)
 {
unsigned int operand;
-- 
2.1.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] i965: Introduce the BROADCAST pseudo-opcode.

2015-02-20 Thread Francisco Jerez
The BROADCAST instruction picks the channel from its first source
given by an index passed in as second source.  This will be used in
situations where all channels from the same SIMD thread have to agree
on the value of something, e.g. a surface binding table index.
---
 src/mesa/drivers/dri/i965/brw_defines.h  |  6 ++
 src/mesa/drivers/dri/i965/brw_eu.h   |  6 ++
 src/mesa/drivers/dri/i965/brw_eu_emit.c  | 77 
 src/mesa/drivers/dri/i965/brw_fs_generator.cpp   |  4 ++
 src/mesa/drivers/dri/i965/brw_shader.cpp |  3 +
 src/mesa/drivers/dri/i965/brw_vec4_generator.cpp |  4 ++
 6 files changed, 100 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_defines.h 
b/src/mesa/drivers/dri/i965/brw_defines.h
index 17c27dd..d4930e3 100644
--- a/src/mesa/drivers/dri/i965/brw_defines.h
+++ b/src/mesa/drivers/dri/i965/brw_defines.h
@@ -911,6 +911,12 @@ enum opcode {
 
SHADER_OPCODE_URB_WRITE_SIMD8,
 
+   /**
+* Pick the channel from its first source register given by the index
+* specified as second source.  Useful for variable indexing of surfaces.
+*/
+   SHADER_OPCODE_BROADCAST,
+
VEC4_OPCODE_MOV_BYTES,
VEC4_OPCODE_PACK_BYTES,
VEC4_OPCODE_UNPACK_UNIFORM,
diff --git a/src/mesa/drivers/dri/i965/brw_eu.h 
b/src/mesa/drivers/dri/i965/brw_eu.h
index a94ea42..2505480 100644
--- a/src/mesa/drivers/dri/i965/brw_eu.h
+++ b/src/mesa/drivers/dri/i965/brw_eu.h
@@ -413,6 +413,12 @@ brw_pixel_interpolator_query(struct brw_compile *p,
  unsigned msg_length,
  unsigned response_length);
 
+void
+brw_broadcast(struct brw_compile *p,
+  struct brw_reg dst,
+  struct brw_reg src,
+  struct brw_reg idx);
+
 /***
  * brw_eu_util.c:
  */
diff --git a/src/mesa/drivers/dri/i965/brw_eu_emit.c 
b/src/mesa/drivers/dri/i965/brw_eu_emit.c
index 1d6fd67..d7e3995 100644
--- a/src/mesa/drivers/dri/i965/brw_eu_emit.c
+++ b/src/mesa/drivers/dri/i965/brw_eu_emit.c
@@ -2854,6 +2854,83 @@ brw_pixel_interpolator_query(struct brw_compile *p,
brw_inst_set_pi_message_data(brw, insn, data);
 }
 
+void
+brw_broadcast(struct brw_compile *p,
+  struct brw_reg dst,
+  struct brw_reg src,
+  struct brw_reg idx)
+{
+   const struct brw_context *brw = p-brw;
+   const bool align1 = (brw_inst_access_mode(brw, p-current) == BRW_ALIGN_1);
+   brw_inst *inst;
+
+   assert(src.file == BRW_GENERAL_REGISTER_FILE 
+  src.address_mode == BRW_ADDRESS_DIRECT);
+
+   if ((src.vstride == 0  (src.hstride == 0 || !align1)) ||
+   idx.file == BRW_IMMEDIATE_VALUE) {
+  /* Trivial, the source is already uniform or the index is a constant.
+   * We will typically not get here if the optimizer is doing its job, but
+   * asserting would be mean.
+   */
+  const unsigned i = (idx.file == BRW_IMMEDIATE_VALUE ? idx.dw1.ud : 0);
+  brw_MOV(p, dst,
+  (align1 ? stride(suboffset(src, i), 0, 1, 0) :
+   stride(suboffset(src, 4 * i), 0, 4, 1)));
+
+   } else {
+  if (align1) {
+ const struct brw_reg addr =
+retype(brw_address_reg(0), BRW_REGISTER_TYPE_UD);
+ const unsigned offset = src.nr * REG_SIZE + src.subnr;
+ /* Limit in bytes of the signed indirect addressing immediate. */
+ const unsigned limit = 512;
+
+ brw_push_insn_state(p);
+ brw_set_default_mask_control(p, BRW_MASK_DISABLE);
+ brw_set_default_predicate_control(p, BRW_PREDICATE_NONE);
+
+ /* Take into account the component size and horizontal stride. */
+ assert(src.vstride == src.hstride + src.width);
+ brw_SHL(p, addr, vec1(idx),
+ brw_imm_ud(_mesa_logbase2(type_sz(src.type)) +
+src.hstride - 1));
+
+ /* We can only address up to limit bytes using the indirect
+  * addressing immediate, account for the difference if the source
+  * register is above this limit.
+  */
+ if (offset = limit)
+brw_ADD(p, addr, addr, brw_imm_ud(offset - offset % limit));
+
+ brw_pop_insn_state(p);
+
+ /* Use indirect addressing to fetch the specified component. */
+ brw_MOV(p, dst,
+ retype(brw_vec1_indirect(addr.subnr, offset % limit),
+src.type));
+
+  } else {
+ /* In SIMD4x2 mode the index can be either zero or one, replicate it
+  * to all bits of a flag register,
+  */
+ inst = brw_MOV(p,
+brw_null_reg(),
+stride(brw_swizzle1(idx, 0), 0, 4, 1));
+ brw_inst_set_pred_control(brw, inst, BRW_PREDICATE_NONE);
+ brw_inst_set_cond_modifier(brw, inst, BRW_CONDITIONAL_NZ);
+ brw_inst_set_flag_reg_nr(brw, inst, 1);
+
+ /* and use predicated 

Re: [Mesa-dev] [PATCH 10/32] i965/fs: Remove logic to keep track of MRF metadata in lower_load_payload().

2015-02-20 Thread Jason Ekstrand
On Fri, Feb 20, 2015 at 2:43 PM, Francisco Jerez curroje...@riseup.net
wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  One more comment here...  This particularly regards your plan of
 separating
  it into things that match the destination and other things and not
 copy
  prop uniforms or immediates into the other things.  There is another
 case
  we need to handle.  On older gens (SNB maybe?) the SIMD16 FB write
 messages
  are in the order (r0, g0, b0, a0, r1, g1, b1, a1) instead of the usual
 (r0,
  r1, g0, g1, b0, b1, a0, a1).  This isn't going to work if we expect
 SIMD16
  load_payload instructions to be lowerable to a bunch of SIMD16 MOVs.  On
  those platforms, there is a special type of MOV-to-MRF that can move from
  (gN, gN+1) to (mK, mK+4) and we handle that in lower_load_payload right
  now.  However, it doesn't follow the standard rules.
  --Jason
 
 I don't see why it couldn't be handled in roughly the same way you do it
 now?  Recognize when src[i + 4] is the same 8-wide register as src[i]
 shifted by 8 and emit a COMPR4 copy in that case?


Sure.  I haven't thought about it hard enough to prove it can't be done.
Just wanted to make sure you were aware of that particular monkey-wrench.
--Jason


  On Fri, Feb 20, 2015 at 2:10 PM, Jason Ekstrand ja...@jlekstrand.net
  wrote:
 
 
 
  On Fri, Feb 20, 2015 at 1:09 PM, Francisco Jerez curroje...@riseup.net
 
  wrote:
 
  Jason Ekstrand ja...@jlekstrand.net writes:
 
   On Fri, Feb 20, 2015 at 4:11 AM, Francisco Jerez 
 curroje...@riseup.net
  
   wrote:
  
   Jason Ekstrand ja...@jlekstrand.net writes:
  
I'm still a little pensive.  But
   
Reviewed-by: Jason Ekstrand jason.ekstr...@intel.com
   
   Thanks.
  
Now for a little aside.  I have come to the conclusion that I
 made a
   grave
mistake when I did the LOAD_PAYLOAD stuff.  In retrospect, I
 should
  have
just subclassed fs_inst for load_payload.  The problem is that we
  need to
snag a bunch of information for the sources when we create the
load_payload.  In particular, we need to know the width of the
  source so
that we know how much space it consumes in the payload and we
 need to
   know
the information required to properly re-create the mov such as
force_sechalf and force_writemask_all.  Really, in order to do
 things
properly, we need to gather this information *before* we do any
optimizations.  The nasty pile of code that you're editing
 together
  with
the effective_width parameter is a lame attempt to
  capture/reconstruct
this information.  Really, we should just subclass, capture the
   information
up-front, and do it properly.
   
   Yes, absolutely, this lowering pass is a real mess.  There are four
  more
   unreviewed patches in the mailing list fixing bugs of the metadata
   guessing of lower_load_payload() [1][2][3][4], you may want to take
 a
   look at them.  There are more bugs I'm aware of but it didn't seem
   practical to fix them.
  
  
   Yeah, Matt pointed me at those.  I'll give them a look later today.
  
  
   That said, I don't think it would be worth subclassing fs_inst.
  
   My suggestion would have been to keep it simple and lower
 LOAD_PAYLOAD
   into a series of MOVs with force_writemask_all enabled in all cases,
   simply rely on the optimizer to eliminate those where possible.
 Then
   get rid of the metadata and effective_width guessing.  Instead
 agree on
   immediates and uniforms being exec_size-wide by convention
   (i.e. LOAD_PAYLOAD's exec_size rather than the original
 instruction's),
   then prevent constant propagation from propagating an immediate load
   into a LOAD_PAYLOAD if it would lead to a change in the semantics of
  the
   program, and maybe just run constant propagation after lowering so
 we
   can be sure those cases are cleaned up properly where register
 coalesce
   isn't enough.
  
  
   First off, simply setting force_writemask_all isn't an option.
 Consider
   the following scenario:
  
   a = foo;
   if (bar) {
  b = 7;
   } else {
  use(a);
   }
   use(b);
  
   If use(a) is the last use of the variable a, then the live ranges
 of a
   and b don't actually over-lap and we can assign a and b to the same
   register.  However, if force_writemask_all is set on b, it will
 destroy
  the
   value in a before its last use.  Right now, we don't actually do this
   because our liveness analysis pass flattens everything so it will
 think
   that b and a over-lap even though they don't.  However, if we ever
  decide
   to make the liveness information more accurate, having a pile of
   force_writemask_all instructions will be a major problem.  So, while
 it
  is
   *technically* safe for now, it's a really bad idea in the long-term.
  
  The thing is we *will* have to deal with that scenario.  Building a
  message payload inherently involves crossing channel boundaries
 (because
  of headers, unsupported SIMD modes by some shared functions, etc.).
 I'd
  

[Mesa-dev] [PATCH 4/7] i965: Perform basic optimizations on the FIND_LIVE_CHANNEL opcode.

2015-02-20 Thread Francisco Jerez
---
 src/mesa/drivers/dri/i965/brw_fs.cpp   | 49 ++
 src/mesa/drivers/dri/i965/brw_fs.h |  1 +
 src/mesa/drivers/dri/i965/brw_fs_cse.cpp   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4.cpp | 41 +
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_cse.cpp |  1 +
 6 files changed, 94 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
index d567c2b..4537900 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
@@ -2734,6 +2734,54 @@ fs_visitor::compute_to_mrf()
 }
 
 /**
+ * Eliminate FIND_LIVE_CHANNEL instructions occurring outside any control
+ * flow.  We could probably do better here with some form of divergence
+ * analysis.
+ */
+bool
+fs_visitor::eliminate_find_live_channel()
+{
+   bool progress = false;
+   unsigned depth = 0;
+
+   foreach_block_and_inst_safe(block, fs_inst, inst, cfg) {
+  switch (inst-opcode) {
+  case BRW_OPCODE_IF:
+  case BRW_OPCODE_DO:
+ depth++;
+ break;
+
+  case BRW_OPCODE_ENDIF:
+  case BRW_OPCODE_WHILE:
+ depth--;
+ break;
+
+  case FS_OPCODE_DISCARD_JUMP:
+ /* This can potentially make control flow non-uniform until the end
+  * of the program.
+  */
+ depth++;
+ break;
+
+  case SHADER_OPCODE_FIND_LIVE_CHANNEL:
+ if (depth == 0) {
+inst-opcode = BRW_OPCODE_MOV;
+inst-src[0] = fs_reg(0);
+inst-sources = 1;
+inst-force_writemask_all = true;
+progress = true;
+ }
+ break;
+
+  default:
+ break;
+  }
+   }
+
+   return progress;
+}
+
+/**
  * Once we've generated code, try to convert normal FS_OPCODE_FB_WRITE
  * instructions to FS_OPCODE_REP_FB_WRITE.
  */
@@ -3678,6 +3726,7 @@ fs_visitor::optimize()
   OPT(opt_saturate_propagation);
   OPT(register_coalesce);
   OPT(compute_to_mrf);
+  OPT(eliminate_find_live_channel);
 
   OPT(compact_virtual_grfs);
} while (progress);
diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 9375412..6e5fa1d 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -228,6 +228,7 @@ public:
bool opt_register_renaming();
bool register_coalesce();
bool compute_to_mrf();
+   bool eliminate_find_live_channel();
bool dead_code_eliminate();
bool remove_duplicate_mrf_writes();
bool virtual_grf_interferes(int a, int b);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
index bbe0747..04a073e 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
@@ -89,6 +89,7 @@ is_expression(const fs_inst *const inst)
case FS_OPCODE_VARYING_PULL_CONSTANT_LOAD:
case FS_OPCODE_CINTERP:
case FS_OPCODE_LINTERP:
+   case SHADER_OPCODE_FIND_LIVE_CHANNEL:
case SHADER_OPCODE_BROADCAST:
   return true;
case SHADER_OPCODE_RCP:
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 601b8c1..219c05e 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1250,6 +1250,46 @@ vec4_visitor::opt_register_coalesce()
 }
 
 /**
+ * Eliminate FIND_LIVE_CHANNEL instructions occurring outside any control
+ * flow.  We could probably do better here with some form of divergence
+ * analysis.
+ */
+bool
+vec4_visitor::eliminate_find_live_channel()
+{
+   bool progress = false;
+   unsigned depth = 0;
+
+   foreach_block_and_inst_safe(block, vec4_instruction, inst, cfg) {
+  switch (inst-opcode) {
+  case BRW_OPCODE_IF:
+  case BRW_OPCODE_DO:
+ depth++;
+ break;
+
+  case BRW_OPCODE_ENDIF:
+  case BRW_OPCODE_WHILE:
+ depth--;
+ break;
+
+  case SHADER_OPCODE_FIND_LIVE_CHANNEL:
+ if (depth == 0) {
+inst-opcode = BRW_OPCODE_MOV;
+inst-src[0] = src_reg(0);
+inst-force_writemask_all = true;
+progress = true;
+ }
+ break;
+
+  default:
+ break;
+  }
+   }
+
+   return progress;
+}
+
+/**
  * Splits virtual GRFs requesting more than one contiguous physical register.
  *
  * We initially create large virtual GRFs for temporary structures, arrays,
@@ -1880,6 +1920,7 @@ vec4_visitor::run()
   OPT(opt_cse);
   OPT(opt_algebraic);
   OPT(opt_register_coalesce);
+  OPT(eliminate_find_live_channel);
} while (progress);
 
pass_num = 0;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index a24f843..7a92d59 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -207,6 +207,7 @@ public:
bool opt_cse();
bool opt_algebraic();
bool 

  1   2   >