[Mesa-dev] [PATCH] doc: break down ARB_direct_state_access in GL3.txt

2015-01-29 Thread Martin Peres
A student was wondering what was going on + I started working on it too.

CC: Laura Ekstrand la...@jlekstrand.net
Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 docs/GL3.txt | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index 648f5ac..2dc1404 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -191,7 +191,16 @@ GL 4.5, GLSL 4.50:
   GL_ARB_conditional_render_inverted   DONE (i965, nv50, nvc0, 
llvmpipe, softpipe)
   GL_ARB_cull_distance not started
   GL_ARB_derivative_controlDONE (i965, nv50, nvc0, 
r600)
-  GL_ARB_direct_state_access   started (Laura Ekstrand)
+  GL_ARB_direct_state_access   started
+  - Transform Feedback object  started (Martin Peres)
+  - Buffer object  started (Laura Ekstrand)
+  - Framebuffer object started (Laura Ekstrand)
+  - Renderbuffer objectnot started
+  - Texture object started (Laura Ekstrand)
+  - Vertex array objectnot started
+  - Sampler object not started
+  - Program Pipeline objectnot started
+  - Query object   started (Martin Peres)
   GL_ARB_get_texture_sub_image started (Brian Paul)
   GL_ARB_shader_texture_image_samples  not started
   GL_ARB_texture_barrier   DONE (nv50, nvc0, r300, 
r600, radeonsi)
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Add new fast mtx_t mutex type for basic use cases

2015-01-29 Thread Kristian Høgsberg
On Thu, Jan 29, 2015 at 6:36 AM, Emil Velikov emil.l.veli...@gmail.com wrote:
 On 28/01/15 05:08, Kristian Høgsberg wrote:
 While modern pthread mutexes are very fast, they still incur a call to an
 external DSO and overhead of the generality and features of pthread mutexes.
 Most mutexes in mesa only needs lock/unlock, and the idea here is that we can
 inline the atomic operation and make the fast case just two intructions.
 Mutexes are subtle and finicky to implement, so we carefully copy the
 implementation from Ulrich Dreppers well-written and well-reviewed paper:

   Futexes Are Tricky
   http://www.akkadia.org/drepper/futex.pdf

 We implement mutex3, which gives us a mutex that has no syscalls on
 uncontended lock or unlock.  Further, the uncontended case boils down to a
 cmpxchg and an untaken branch and the uncontended unlock is just a locked 
 decr
 and an untaken branch.  We use __builtin_expect() to indicate that contention
 is unlikely so that gcc will put the contention code out of the main code
 flow.

 A fast mutex only supports lock/unlock, can't be recursive or used with
 condition variables.  We keep the pthread mutex implementation around as
 full_mtx_t for the few places where we use condition variables or recursive
 locking.  For platforms or compilers where futex and atomics aren't 
 available,
 mtx_t falls back to the pthread mutex.

 The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
 applications.  Most CPU bound cases are helped and some of our internal
 bind_buffer_object heavy benchmarks gain up to 10%.

 Hi Kristian,

 Can I humbly ask that you split this into two patches - one that
 introduces the new functions/struct and another one that uses them ?
 This way it'll be easier if/when things go crazy.

 Also the patch seems to wonder between posix and win32
 + typedef full_mtx_t mtx_t;

 and
 + typedef mtx_t fast_mtx_t;

 Looks like a left over from the should I rename XX variables to fast*
 or just one to full* moment :)

Yeah, that's how it progressed :)  At first I called it fast_mtx_t and
planned on replacing simple uses of mtx_t one by one. Jordan suggested
that it'd be easier to make the regular mutex fast and then rename the
couple of places that use more feature than we provide.  I just
botched the windows side when I did that.  However, given that the
patch now reimplements mtx_t, there's really no good way to split up
introducing the fast mutex and making use of it in multiple patches.
I guess we can add full_mtx_t in a separate patch, use it in a second
patch and then finally reimplement mtx_t as a fast mutex.  How does
that sound?

Kristian
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965/skl: Force a BINDING_TABLE_POINTER_* command after push constant command

2015-01-29 Thread Neil Roberts
According to the SkyLake bspec the 3DSTATE_CONSTANT_* commands only take
effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_* command. This
patch just makes it set the BRW_NEW_SURFACES state when uploading the push
constants to ensure the binding tables will be updated.

This fixes the fbo-blending-formats Piglit test and possibly others.

Reviewed-by: Kristian Høgsberg k...@bitplanet.net
---
 src/mesa/drivers/dri/i965/gen7_vs_state.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c 
b/src/mesa/drivers/dri/i965/gen7_vs_state.c
index 404dd20..c55d6ca 100644
--- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
+++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c
@@ -60,6 +60,13 @@ gen7_upload_constant_state(struct brw_context *brw,
}
 
ADVANCE_BATCH();
+
+  /* On SkyLake+ the new constants don't take effect until the next
+   * corresponding 3DSTATE_BINDING_TABLE_POINTER_* command is parsed so we
+   * need to ensure that is sent
+   */
+   if (brw-gen = 9)
+  brw-state.dirty.brw |= BRW_NEW_SURFACES;
 }
 
 
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Add a better PRM citation for the IMS dimension mangling.

2015-01-29 Thread Kenneth Graunke
Paul originally had to reverse engineer these formulas based on the
description about how the sampler works.  The description here is not
the easiest to follow - especially given that it's from the Sandybridge
era, when the hardware only did 4x multisampling.

Jordan and I recently found another part of the documentation where they
simply state that IMS dimensions must be adjusted by a set of formulas.
Quoting this section provides an easy to follow explanation for the
code, including 2x/4x/8x/16x.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 26e137d..9e79f6e 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -290,7 +290,28 @@ intel_miptree_create_layout(struct brw_context *brw,
   /* Adjust width/height/depth for MSAA */
   mt-msaa_layout = compute_msaa_layout(brw, format, mt-target);
   if (mt-msaa_layout == INTEL_MSAA_LAYOUT_IMS) {
- /* In the Sandy Bridge PRM, volume 4, part 1, page 31, it says:
+ /* From the Ivybridge PRM, Volume 1, Part 1, page 108:
+  * If the surface is multisampled and it is a depth or stencil
+  *  surface or Multisampled Surface StorageFormat in SURFACE_STATE is
+  *  MSFMT_DEPTH_STENCIL, WL and HL must be adjusted as follows before
+  *  proceeding:
+  *
+  *  ++
+  *  | Num Multisamples |W_l = |H_l = |
+  *  ++
+  *  | 2| ceiling(W_l / 2) * 4 | H_l (no adjustment)  |
+  *  | 4| ceiling(W_l / 2) * 4 | ceiling(H_l / 2) * 4 |
+  *  | 8| ceiling(W_l / 2) * 8 | ceiling(H_l / 2) * 4 |
+  *  |16| ceiling(W_l / 2) * 8 | ceiling(H_l / 2) * 8 |
+  *  ++
+  * 
+  *
+  * Note that MSFMT_DEPTH_STENCIL just means the IMS (interleaved)
+  * format rather than UMS/CMS (array slices).  The Sandybridge PRM,
+  * Volume 1, Part 1, Page 111 has the same formula for 4x MSAA.
+  *
+  * Another more complicated explanation for these adjustments comes
+  * from the Sandybridge PRM, volume 4, part 1, page 31:
   *
   * Any of the other messages (sample*, LOD, load4) used with a
   *  (4x) multisampled surface will in-effect sample a surface with
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 2/3] i965/fs: Use NIR's scalarizing abilities and stop handling vectors

2015-01-29 Thread Kenneth Graunke
On Wednesday, January 28, 2015 04:27:23 PM Jason Ekstrand wrote:
 On Wed, Jan 28, 2015 at 4:02 PM, Matt Turner matts...@gmail.com wrote:
 
  On Tue, Jan 27, 2015 at 5:31 PM, Jason Ekstrand ja...@jlekstrand.net
  wrote:
   diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
  b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
   index de0d780..217fe09 100644
   @@ -689,9 +689,9 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
  
  struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
  result.type);
  
   -  emit_percomp(MUL(acc, op[0], op[1]), instr-dest.write_mask);
   -  emit_percomp(MACH(reg_null_d, op[0], op[1]),
  instr-dest.write_mask);
   -  emit_percomp(MOV(result, fs_reg(acc)), instr-dest.write_mask);
 
  Bah. How the hell did the old code pass piglit?
 
 
 I think channel expressions saved it.
 
 
 
   +  emit(MUL(acc, op[0], op[1]));
   +  emit(MACH(reg_null_d, op[0], op[1]));
   +  emit(MOV(result, fs_reg(acc)));
  break;
   }
  
   @@ -773,72 +767,38 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
   case nir_op_ball_fequal3:
   case nir_op_ball_iequal3:
   case nir_op_ball_fequal4:
   -   case nir_op_ball_iequal4: {
   -  unsigned num_components = nir_op_infos[instr-op].input_sizes[0];
   -  fs_reg temp = vgrf(num_components);
   -  emit_percomp(CMP(temp, op[0], op[1], BRW_CONDITIONAL_Z),
   -   (1  num_components) - 1);
   -  emit_reduction(BRW_OPCODE_AND, result, temp, num_components);
   -  break;
   -   }
   -
   +   case nir_op_ball_iequal4:
 
  We can save it for later, but it might be interesting to let the fs
  backend get the vector comparisons directly, if that's possible.
 
  See https://bugs.freedesktop.org/show_bug.cgi?id=77456
 
 
 We can.  We just have to add a flag to turn that off in the
 nir_lower_alu_to_scalar pass.
 
 
 
   @@ -867,83 +827,67 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
  unreachable(not reached: should be handled by ldexp_to_arith());
  
   case nir_op_fsqrt:
   -  emit_math_percomp(SHADER_OPCODE_SQRT, result, op[0],
   -instr-dest.write_mask, instr-dest.saturate);
   +  emit_math(SHADER_OPCODE_SQRT, result, op[0])
   +   -saturate = instr-dest.saturate;
 
  I don't really like the -saturate = ... as a single statement. I know
  I'm guilty of using it in a unit test recently.
 
 
 Yeah, I'm not sure what to do there.  I really don't like having to assign
 it to a fs_inst variable and then do the saturate.  I guess we could.  Or
 we could add something to emit_math.  Thoughts?

In the brw_eu_emit.c code I worked around this by creating a brw_last_inst
macro that always refers to the most recently emitted instruction.

Still not 100% crazy about it, but it's been pretty handy.  *shrug*.
It's an idea, anyway.

I'm not too opposed to just creating fs_inst * temporaries.

--Ken

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] main: Added entry point for glCreateTransformFeedbacks

2015-01-29 Thread Fredrik Höglund
On Thursday 29 January 2015, Martin Peres wrote:
 v2: Review from Laura Ekstrand
 - generate the name of the gl method once
 - shorten some lines to stay in the 78 chars limit
 
 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 +++
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  src/mesa/main/transformfeedback.c  | 63 
 --
  src/mesa/main/transformfeedback.h  |  6 +++
  4 files changed, 62 insertions(+), 15 deletions(-)
 
 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 2fe1638..15b00c2 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -7,6 +7,13 @@
 enum name=QUERY_TARGETvalue=0x82EA/
 enum name=TEXTURE_BINDING value=0x82EB/
  
 +   !-- Transform Feedback object functions --
 +
 +  function name=CreateTransformFeedbacks offset=assign
 +  param name=n type=GLsizei /
 +  param name=ids type=GLuint * /
 +   /function
 +
 !-- Texture object functions --
  
 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
 b/src/mesa/main/tests/dispatch_sanity.cpp
 index ee4db45..5e7f7ae 100644
 --- a/src/mesa/main/tests/dispatch_sanity.cpp
 +++ b/src/mesa/main/tests/dispatch_sanity.cpp
 @@ -955,6 +955,7 @@ const struct function gl_core_functions_possible[] = {
 { glClipControl, 45, -1 },
  
 /* GL_ARB_direct_state_access */
 +   { glCreateTransformFeedbacks, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 },
 diff --git a/src/mesa/main/transformfeedback.c 
 b/src/mesa/main/transformfeedback.c
 index a737463..b0fca48 100644
 --- a/src/mesa/main/transformfeedback.c
 +++ b/src/mesa/main/transformfeedback.c
 @@ -825,25 +825,24 @@ _mesa_lookup_transform_feedback_object(struct 
 gl_context *ctx, GLuint name)
   _mesa_HashLookup(ctx-TransformFeedback.Objects, name);
  }
  
 -
 -/**
 - * Create new transform feedback objects.   Transform feedback objects
 - * encapsulate the state related to transform feedback to allow quickly
 - * switching state (and drawing the results, below).
 - * Part of GL_ARB_transform_feedback2.
 - */
 -void GLAPIENTRY
 -_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
 +void
 +_mesa_create_transform_feedbacks(struct gl_context *ctx, GLsizei n,
 + GLuint *ids, bool dsa)
  {
 GLuint first;
 -   GET_CURRENT_CONTEXT(ctx);
 +   const char* gl_mthd_name;

I would prefer it if you used a simpler name for this variable, such
as func.

 +   if (dsa)
 +  gl_mthd_name = glCreateTransformFeedbacks;
 +   else
 +  gl_mthd_name = glGenTransformFeedbacks;
  
 if (n  0) {
 -  _mesa_error(ctx, GL_INVALID_VALUE, glGenTransformFeedbacks(n  0));
 +  _mesa_error(ctx, GL_INVALID_VALUE, %s(n  0), gl_mthd_name);
return;
 }
  
 -   if (!names)
 +   if (!ids)
return;
  
 /* we don't need contiguous IDs, but this might be faster */
 @@ -854,18 +853,52 @@ _mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
   struct gl_transform_feedback_object *obj
  = ctx-Driver.NewTransformFeedback(ctx, first + i);
   if (!obj) {
 -_mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
 +_mesa_error(ctx, GL_OUT_OF_MEMORY, %s, gl_mthd_name);
  return;
   }
 - names[i] = first + i;
 + ids[i] = first + i;
   _mesa_HashInsert(ctx-TransformFeedback.Objects, first + i, obj);

You have to set EverBound to true when dsa is true. Otherwise
glIsTransformFeedback() will return false when it's passed an ID that
was created by glCreateTransformFeedbacks().

}
 }
 else {
 -  _mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
 +  _mesa_error(ctx, GL_OUT_OF_MEMORY, %s, gl_mthd_name);
 }
  }
  
 +/**
 + * Create new transform feedback objects.   Transform feedback objects
 + * encapsulate the state related to transform feedback to allow quickly
 + * switching state (and drawing the results, below).
 + * Part of GL_ARB_transform_feedback2.
 + */
 +void GLAPIENTRY
 +_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
 +{
 +   GET_CURRENT_CONTEXT(ctx);
 +
 +   /* GenTransformFeedbacks should just reserve the object names that a
 +* subsequent call to BindTransformFeedback should actively create. For
 +* the sake of simplicity, we reserve the names and create the objects
 +* straight away.
 +*/
 +   _mesa_create_transform_feedbacks(ctx, n, names, false);
 +}
 +
 +/**
 + * Create new transform feedback objects.   Transform feedback objects
 + * encapsulate the state related to transform feedback to allow quickly
 + * switching state (and drawing the results, below).
 + * Part of GL_ARB_direct_state_access.
 + */

[Mesa-dev] [PATCH 5/6] main: Added entry point for glGetTransformFeedbacki_v

2015-01-29 Thread Martin Peres
v2: Review from Laura Ekstrand
- use the transform feedback object lookup wrapper

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 +++
 src/mesa/main/tests/dispatch_sanity.cpp|  1 +
 src/mesa/main/transformfeedback.c  | 29 ++
 src/mesa/main/transformfeedback.h  |  4 
 4 files changed, 41 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 1ac4386..935e088 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -34,6 +34,13 @@
   param name=param type=GLint * /
/function
 
+   function name=GetTransformFeedbacki_v offset=assign
+  param name=xfb type=GLuint /
+  param name=pname type=GLenum /
+  param name=index type=GLuint /
+  param name=param type=GLint * /
+   /function
+
!-- Texture object functions --
 
function name=CreateTextures offset=assign
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 888d34a..4b92699 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -959,6 +959,7 @@ const struct function gl_core_functions_possible[] = {
{ glTransformFeedbackBufferBase, 45, -1 },
{ glTransformFeedbackBufferRange, 45, -1 },
{ glGetTransformFeedbackiv, 45, -1 },
+   { glGetTransformFeedbacki_v, 45, -1 },
{ glCreateTextures, 45, -1 },
{ glTextureStorage1D, 45, -1 },
{ glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index ae3e1c0..1ffdc16 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -1220,3 +1220,32 @@ _mesa_GetTransformFeedbackiv(GLuint xfb, GLenum pname, 
GLint *param)
glGetTransformFeedbackiv(pname=%i), pname);
 }
 }
+
+extern void GLAPIENTRY
+_mesa_GetTransformFeedbacki_v(GLuint xfb, GLenum pname, GLuint index,
+  GLint *param)
+{
+   struct gl_transform_feedback_object *obj;
+   GET_CURRENT_CONTEXT(ctx);
+
+   obj = _mesa_lookup_transform_feedback_object_err(ctx, xfb,
+   
glGetTransformFeedbacki_v);
+   if(!obj) {
+  return;
+   }
+
+   if (index = ctx-Const.MaxTransformFeedbackBuffers) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  glGetTransformFeedbacki_v(index=%i), index);
+  return;
+   }
+
+   switch(pname) {
+   case GL_TRANSFORM_FEEDBACK_BUFFER_BINDING:
+  *param = obj-BufferNames[index];
+  break;
+   default:
+  _mesa_error(ctx, GL_INVALID_ENUM,
+  glGetTransformFeedbacki_v(pname=%i), pname);
+   }
+}
diff --git a/src/mesa/main/transformfeedback.h 
b/src/mesa/main/transformfeedback.h
index 4bc7c0c..2c31b22 100644
--- a/src/mesa/main/transformfeedback.h
+++ b/src/mesa/main/transformfeedback.h
@@ -169,4 +169,8 @@ _mesa_TransformFeedbackBufferRange(GLuint xfb, GLuint 
index, GLuint buffer,
 extern void GLAPIENTRY
 _mesa_GetTransformFeedbackiv(GLuint xfb, GLenum pname, GLint *param);
 
+extern void GLAPIENTRY
+_mesa_GetTransformFeedbacki_v(GLuint xfb, GLenum pname, GLuint index,
+  GLint *param);
+
 #endif /* TRANSFORM_FEEDBACK_H */
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/6] main: Added entry point for glGetTransformFeedbackiv

2015-01-29 Thread Martin Peres
v2: Review from Laura Ekstrand
- use the transform feedback object lookup wrapper

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mapi/glapi/gen/ARB_direct_state_access.xml |  6 ++
 src/mesa/main/tests/dispatch_sanity.cpp|  1 +
 src/mesa/main/transformfeedback.c  | 25 +
 src/mesa/main/transformfeedback.h  |  3 +++
 4 files changed, 35 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index b3c090f..1ac4386 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -28,6 +28,12 @@
   param name=size type=GLsizeiptr /
/function
 
+   function name=GetTransformFeedbackiv offset=assign
+  param name=xfb type=GLuint /
+  param name=pname type=GLenum /
+  param name=param type=GLint * /
+   /function
+
!-- Texture object functions --
 
function name=CreateTextures offset=assign
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 3b34f7b..888d34a 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -958,6 +958,7 @@ const struct function gl_core_functions_possible[] = {
{ glCreateTransformFeedbacks, 45, -1 },
{ glTransformFeedbackBufferBase, 45, -1 },
{ glTransformFeedbackBufferRange, 45, -1 },
+   { glGetTransformFeedbackiv, 45, -1 },
{ glCreateTextures, 45, -1 },
{ glTextureStorage1D, 45, -1 },
{ glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 7564b6f..ae3e1c0 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -1195,3 +1195,28 @@ _mesa_ResumeTransformFeedback(void)
assert(ctx-Driver.ResumeTransformFeedback);
ctx-Driver.ResumeTransformFeedback(ctx, obj);
 }
+
+extern void GLAPIENTRY
+_mesa_GetTransformFeedbackiv(GLuint xfb, GLenum pname, GLint *param)
+{
+struct gl_transform_feedback_object *obj;
+GET_CURRENT_CONTEXT(ctx);
+
+obj = _mesa_lookup_transform_feedback_object_err(ctx, xfb,
+
glGetTransformFeedbackiv);
+if(!obj) {
+   return;
+}
+
+switch(pname) {
+case GL_TRANSFORM_FEEDBACK_PAUSED:
+   *param = obj-Paused;
+   break;
+case GL_TRANSFORM_FEEDBACK_ACTIVE:
+   *param = obj-Active;
+   break;
+default:
+   _mesa_error(ctx, GL_INVALID_ENUM,
+   glGetTransformFeedbackiv(pname=%i), pname);
+}
+}
diff --git a/src/mesa/main/transformfeedback.h 
b/src/mesa/main/transformfeedback.h
index 1178169..4bc7c0c 100644
--- a/src/mesa/main/transformfeedback.h
+++ b/src/mesa/main/transformfeedback.h
@@ -166,4 +166,7 @@ extern void GLAPIENTRY
 _mesa_TransformFeedbackBufferRange(GLuint xfb, GLuint index, GLuint buffer,
GLintptr offset, GLsizeiptr size);
 
+extern void GLAPIENTRY
+_mesa_GetTransformFeedbackiv(GLuint xfb, GLenum pname, GLint *param);
+
 #endif /* TRANSFORM_FEEDBACK_H */
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/6] main: Added entry point for glTransformFeedbackBufferBase

2015-01-29 Thread Martin Peres
v2: Review from Laura Ekstrand
- give more helpful error messages
- factor the lookup code for the xfb and objBuf
- replace some already-existing tabs with spaces
- add comments to explain the cases where xfb == 0 or buffer == 0
- fix the condition for binding the transform buffer or not

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mapi/glapi/gen/ARB_direct_state_access.xml |   6 ++
 src/mesa/main/bufferobj.c  |   9 +-
 src/mesa/main/tests/dispatch_sanity.cpp|   1 +
 src/mesa/main/transformfeedback.c  | 136 +++--
 src/mesa/main/transformfeedback.h  |  19 +++-
 5 files changed, 135 insertions(+), 36 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 15b00c2..35d6906 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -14,6 +14,12 @@
   param name=ids type=GLuint * /
/function
 
+   function name=TransformFeedbackBufferBase offset=assign
+  param name=xfb type=GLuint /
+  param name=index type=GLuint /
+  param name=buffer type=GLuint /
+   /function
+
!-- Texture object functions --
 
function name=CreateTextures offset=assign
diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 0c1ce98..86532ea 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -3546,8 +3546,9 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
 
switch (target) {
case GL_TRANSFORM_FEEDBACK_BUFFER:
-  _mesa_bind_buffer_range_transform_feedback(ctx, index, bufObj,
-offset, size);
+  _mesa_bind_buffer_range_transform_feedback(ctx,
+ 
ctx-TransformFeedback.CurrentObject,
+ index, bufObj, offset, size);
   return;
case GL_UNIFORM_BUFFER:
   bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
@@ -3611,7 +3612,9 @@ _mesa_BindBufferBase(GLenum target, GLuint index, GLuint 
buffer)
 
switch (target) {
case GL_TRANSFORM_FEEDBACK_BUFFER:
-  _mesa_bind_buffer_base_transform_feedback(ctx, index, bufObj);
+  _mesa_bind_buffer_base_transform_feedback(ctx,
+
ctx-TransformFeedback.CurrentObject,
+index, bufObj, false);
   return;
case GL_UNIFORM_BUFFER:
   bind_buffer_base_uniform_buffer(ctx, index, bufObj);
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 5e7f7ae..166401a 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -956,6 +956,7 @@ const struct function gl_core_functions_possible[] = {
 
/* GL_ARB_direct_state_access */
{ glCreateTransformFeedbacks, 45, -1 },
+   { glTransformFeedbackBufferBase, 45, -1 },
{ glCreateTextures, 45, -1 },
{ glTextureStorage1D, 45, -1 },
{ glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index b0fca48..6583974 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -514,22 +514,24 @@ _mesa_EndTransformFeedback(void)
  * Helper used by BindBufferRange() and BindBufferBase().
  */
 static void
-bind_buffer_range(struct gl_context *ctx, GLuint index,
+bind_buffer_range(struct gl_context *ctx,
+  struct gl_transform_feedback_object *obj,
+  GLuint index,
   struct gl_buffer_object *bufObj,
-  GLintptr offset, GLsizeiptr size)
+  GLintptr offset, GLsizeiptr size,
+  bool dsa)
 {
-   struct gl_transform_feedback_object *obj =
-  ctx-TransformFeedback.CurrentObject;
-
/* Note: no need to FLUSH_VERTICES or flag NewTransformFeedback, because
 * transform feedback buffers can't be changed while transform feedback is
 * active.
 */
 
-   /* The general binding point */
-   _mesa_reference_buffer_object(ctx,
- ctx-TransformFeedback.CurrentBuffer,
- bufObj);
+   if (!dsa) {
+  /* The general binding point */
+  _mesa_reference_buffer_object(ctx,
+ctx-TransformFeedback.CurrentBuffer,
+bufObj);
+   }
 
/* The per-attribute binding point */
_mesa_set_transform_feedback_binding(ctx, obj, index, bufObj, offset, size);
@@ -543,15 +545,12 @@ bind_buffer_range(struct gl_context *ctx, GLuint index,
  */
 void
 _mesa_bind_buffer_range_transform_feedback(struct gl_context *ctx,
-  GLuint index,
-  struct gl_buffer_object *bufObj,
-  

[Mesa-dev] [PATCH 0/6] Add 6 transform-feedback-related entry points for direct_state_access

2015-01-29 Thread Martin Peres
The corresponding piglit tests have been submited to the piglit ML.

Warning: those tests will only work when forcing mesa to expose the DSA 
extension.

Martin Peres (6):
  main: Added entry point for glCreateTransformFeedbacks
  main: Added entry point for glTransformFeedbackBufferBase
  main: Added entry point for glTransformFeedbackBufferRange
  main: Added entry point for glGetTransformFeedbackiv
  main: Added entry point for glGetTransformFeedbacki_v
  main: Added entry point for glGetTransformFeedbacki64_v

 src/mapi/glapi/gen/ARB_direct_state_access.xml |  41 +++
 src/mesa/main/bufferobj.c  |  10 +-
 src/mesa/main/tests/dispatch_sanity.cpp|   6 +
 src/mesa/main/transformfeedback.c  | 357 +
 src/mesa/main/transformfeedback.h  |  42 ++-
 5 files changed, 399 insertions(+), 57 deletions(-)

-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/6] main: Added entry point for glCreateTransformFeedbacks

2015-01-29 Thread Martin Peres
v2: Review from Laura Ekstrand
- generate the name of the gl method once
- shorten some lines to stay in the 78 chars limit

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 +++
 src/mesa/main/tests/dispatch_sanity.cpp|  1 +
 src/mesa/main/transformfeedback.c  | 63 --
 src/mesa/main/transformfeedback.h  |  6 +++
 4 files changed, 62 insertions(+), 15 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 2fe1638..15b00c2 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -7,6 +7,13 @@
enum name=QUERY_TARGETvalue=0x82EA/
enum name=TEXTURE_BINDING value=0x82EB/
 
+   !-- Transform Feedback object functions --
+
+  function name=CreateTransformFeedbacks offset=assign
+  param name=n type=GLsizei /
+  param name=ids type=GLuint * /
+   /function
+
!-- Texture object functions --
 
function name=CreateTextures offset=assign
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index ee4db45..5e7f7ae 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -955,6 +955,7 @@ const struct function gl_core_functions_possible[] = {
{ glClipControl, 45, -1 },
 
/* GL_ARB_direct_state_access */
+   { glCreateTransformFeedbacks, 45, -1 },
{ glCreateTextures, 45, -1 },
{ glTextureStorage1D, 45, -1 },
{ glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index a737463..b0fca48 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -825,25 +825,24 @@ _mesa_lookup_transform_feedback_object(struct gl_context 
*ctx, GLuint name)
  _mesa_HashLookup(ctx-TransformFeedback.Objects, name);
 }
 
-
-/**
- * Create new transform feedback objects.   Transform feedback objects
- * encapsulate the state related to transform feedback to allow quickly
- * switching state (and drawing the results, below).
- * Part of GL_ARB_transform_feedback2.
- */
-void GLAPIENTRY
-_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
+void
+_mesa_create_transform_feedbacks(struct gl_context *ctx, GLsizei n,
+ GLuint *ids, bool dsa)
 {
GLuint first;
-   GET_CURRENT_CONTEXT(ctx);
+   const char* gl_mthd_name;
+
+   if (dsa)
+  gl_mthd_name = glCreateTransformFeedbacks;
+   else
+  gl_mthd_name = glGenTransformFeedbacks;
 
if (n  0) {
-  _mesa_error(ctx, GL_INVALID_VALUE, glGenTransformFeedbacks(n  0));
+  _mesa_error(ctx, GL_INVALID_VALUE, %s(n  0), gl_mthd_name);
   return;
}
 
-   if (!names)
+   if (!ids)
   return;
 
/* we don't need contiguous IDs, but this might be faster */
@@ -854,18 +853,52 @@ _mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
  struct gl_transform_feedback_object *obj
 = ctx-Driver.NewTransformFeedback(ctx, first + i);
  if (!obj) {
-_mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
+_mesa_error(ctx, GL_OUT_OF_MEMORY, %s, gl_mthd_name);
 return;
  }
- names[i] = first + i;
+ ids[i] = first + i;
  _mesa_HashInsert(ctx-TransformFeedback.Objects, first + i, obj);
   }
}
else {
-  _mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, %s, gl_mthd_name);
}
 }
 
+/**
+ * Create new transform feedback objects.   Transform feedback objects
+ * encapsulate the state related to transform feedback to allow quickly
+ * switching state (and drawing the results, below).
+ * Part of GL_ARB_transform_feedback2.
+ */
+void GLAPIENTRY
+_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   /* GenTransformFeedbacks should just reserve the object names that a
+* subsequent call to BindTransformFeedback should actively create. For
+* the sake of simplicity, we reserve the names and create the objects
+* straight away.
+*/
+
+   _mesa_create_transform_feedbacks(ctx, n, names, false);
+}
+
+/**
+ * Create new transform feedback objects.   Transform feedback objects
+ * encapsulate the state related to transform feedback to allow quickly
+ * switching state (and drawing the results, below).
+ * Part of GL_ARB_direct_state_access.
+ */
+void GLAPIENTRY
+_mesa_CreateTransformFeedbacks(GLsizei n, GLuint *names)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   _mesa_create_transform_feedbacks(ctx, n, names, true);
+}
+
 
 /**
  * Is the given ID a transform feedback object?
diff --git a/src/mesa/main/transformfeedback.h 
b/src/mesa/main/transformfeedback.h
index 87f4080..d598533 100644
--- a/src/mesa/main/transformfeedback.h
+++ b/src/mesa/main/transformfeedback.h
@@ -99,9 

[Mesa-dev] [PATCH 3/6] main: Added entry point for glTransformFeedbackBufferRange

2015-01-29 Thread Martin Peres
v2: review from Laura Ekstrand
- use the refactored code to lookup the objects
- improve some error messages
- factor out the gl method name computation
- better handle the spec differences between the DSA and non-DSA cases
- quote the spec a little more

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mapi/glapi/gen/ARB_direct_state_access.xml |  8 +++
 src/mesa/main/bufferobj.c  |  3 +-
 src/mesa/main/tests/dispatch_sanity.cpp|  1 +
 src/mesa/main/transformfeedback.c  | 94 +-
 src/mesa/main/transformfeedback.h  |  6 +-
 5 files changed, 94 insertions(+), 18 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 35d6906..b3c090f 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -20,6 +20,14 @@
   param name=buffer type=GLuint /
/function
 
+   function name=TransformFeedbackBufferRange offset=assign
+  param name=xfb type=GLuint /
+  param name=index type=GLuint /
+  param name=buffer type=GLuint /
+  param name=offset type=GLintptr /
+  param name=size type=GLsizeiptr /
+   /function
+
!-- Texture object functions --
 
function name=CreateTextures offset=assign
diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
index 86532ea..7558e17 100644
--- a/src/mesa/main/bufferobj.c
+++ b/src/mesa/main/bufferobj.c
@@ -3548,7 +3548,8 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
case GL_TRANSFORM_FEEDBACK_BUFFER:
   _mesa_bind_buffer_range_transform_feedback(ctx,
  
ctx-TransformFeedback.CurrentObject,
- index, bufObj, offset, size);
+ index, bufObj, offset, size,
+ false);
   return;
case GL_UNIFORM_BUFFER:
   bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 166401a..3b34f7b 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -957,6 +957,7 @@ const struct function gl_core_functions_possible[] = {
/* GL_ARB_direct_state_access */
{ glCreateTransformFeedbacks, 45, -1 },
{ glTransformFeedbackBufferBase, 45, -1 },
+   { glTransformFeedbackBufferRange, 45, -1 },
{ glCreateTextures, 45, -1 },
{ glTextureStorage1D, 45, -1 },
{ glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 6583974..7564b6f 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -541,7 +541,8 @@ bind_buffer_range(struct gl_context *ctx,
 /**
  * Specify a buffer object to receive transform feedback results.  Plus,
  * specify the starting offset to place the results, and max size.
- * Called from the glBindBufferRange() function.
+ * Called from the glBindBufferRange() and glTransformFeedbackBufferRange
+ * functions.
  */
 void
 _mesa_bind_buffer_range_transform_feedback(struct gl_context *ctx,
@@ -549,35 +550,72 @@ _mesa_bind_buffer_range_transform_feedback(struct 
gl_context *ctx,
GLuint index,
struct gl_buffer_object *bufObj,
GLintptr offset,
-   GLsizeiptr size)
+   GLsizeiptr size,
+   bool dsa)
 {
+   const char *gl_methd_name;
+   if (dsa)
+  gl_methd_name = glTransformFeedbackBufferRange;
+   else
+  gl_methd_name = glBindBufferRange;
+
+
if (obj-Active) {
-  _mesa_error(ctx, GL_INVALID_OPERATION,
-  glBindBufferRange(transform feedback active));
+  _mesa_error(ctx, GL_INVALID_OPERATION, %s(transform feedback active),
+  gl_methd_name);
   return;
}
 
if (index = ctx-Const.MaxTransformFeedbackBuffers) {
-  _mesa_error(ctx, GL_INVALID_VALUE, glBindBufferRange(index=%d 
-  out of bounds), index);
+  /* An INVALID_VALUE error is generated if index is greater than or equal
+   * to the number of binding points for transform feedback, as described
+   * in section 6.7.1.
+   */
+  _mesa_error(ctx, GL_INVALID_VALUE, %s(index=%d out of bounds),
+  gl_methd_name, index);
   return;
}
 
if (size  0x3) {
-  /* must a multiple of four */
-  _mesa_error(ctx, GL_INVALID_VALUE, glBindBufferRange(size=%d),
-  (int) size);
+  /* must a be multiple of four */
+  _mesa_error(ctx, GL_INVALID_VALUE, %s(size=%d must be a multiple of 
+  four), 

[Mesa-dev] [PATCH 6/6] main: Added entry point for glGetTransformFeedbacki64_v

2015-01-29 Thread Martin Peres
v2: Review from Laura Ekstrand
- use the transform feedback object lookup wrapper

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 ++
 src/mesa/main/tests/dispatch_sanity.cpp|  1 +
 src/mesa/main/transformfeedback.c  | 32 ++
 src/mesa/main/transformfeedback.h  |  4 
 4 files changed, 44 insertions(+)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 935e088..340dbba 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -41,6 +41,13 @@
   param name=param type=GLint * /
/function
 
+   function name=GetTransformFeedbacki64_v offset=assign
+  param name=xfb type=GLuint /
+  param name=pname type=GLenum /
+  param name=index type=GLuint /
+  param name=param type=GLint64 * /
+   /function
+
!-- Texture object functions --
 
function name=CreateTextures offset=assign
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 4b92699..f485262 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -960,6 +960,7 @@ const struct function gl_core_functions_possible[] = {
{ glTransformFeedbackBufferRange, 45, -1 },
{ glGetTransformFeedbackiv, 45, -1 },
{ glGetTransformFeedbacki_v, 45, -1 },
+   { glGetTransformFeedbacki64_v, 45, -1 },
{ glCreateTextures, 45, -1 },
{ glTextureStorage1D, 45, -1 },
{ glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index 1ffdc16..cf5608d 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -1249,3 +1249,35 @@ _mesa_GetTransformFeedbacki_v(GLuint xfb, GLenum pname, 
GLuint index,
   glGetTransformFeedbacki_v(pname=%i), pname);
}
 }
+
+extern void GLAPIENTRY
+_mesa_GetTransformFeedbacki64_v(GLuint xfb, GLenum pname, GLuint index,
+  GLint64 *param)
+{
+   struct gl_transform_feedback_object *obj;
+   GET_CURRENT_CONTEXT(ctx);
+
+   obj = _mesa_lookup_transform_feedback_object_err(ctx, xfb,
+ 
glGetTransformFeedbacki64_v);
+   if(!obj) {
+  return;
+   }
+
+   if (index = ctx-Const.MaxTransformFeedbackBuffers) {
+  _mesa_error(ctx, GL_INVALID_VALUE,
+  glGetTransformFeedbacki64_v(index=%i), index);
+  return;
+   }
+
+   switch(pname) {
+   case GL_TRANSFORM_FEEDBACK_BUFFER_START:
+  *param = obj-Offset[index];
+  break;
+   case GL_TRANSFORM_FEEDBACK_BUFFER_SIZE:
+  *param = obj-RequestedSize[index];
+  break;
+   default:
+  _mesa_error(ctx, GL_INVALID_ENUM,
+  glGetTransformFeedbacki64_v(pname=%i), pname);
+   }
+}
diff --git a/src/mesa/main/transformfeedback.h 
b/src/mesa/main/transformfeedback.h
index 2c31b22..1e7ac7f 100644
--- a/src/mesa/main/transformfeedback.h
+++ b/src/mesa/main/transformfeedback.h
@@ -173,4 +173,8 @@ extern void GLAPIENTRY
 _mesa_GetTransformFeedbacki_v(GLuint xfb, GLenum pname, GLuint index,
   GLint *param);
 
+extern void GLAPIENTRY
+_mesa_GetTransformFeedbacki64_v(GLuint xfb, GLenum pname, GLuint index,
+  GLint64 *param);
+
 #endif /* TRANSFORM_FEEDBACK_H */
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] doc: break down ARB_direct_state_access in GL3.txt

2015-01-29 Thread Laura Ekstrand
Good idea.

Reviewed-by: Laura Ekstrand la...@jlekstrand.net

On Thu, Jan 29, 2015 at 6:54 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 A student was wondering what was going on + I started working on it too.

 CC: Laura Ekstrand la...@jlekstrand.net
 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  docs/GL3.txt | 11 ++-
  1 file changed, 10 insertions(+), 1 deletion(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index 648f5ac..2dc1404 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -191,7 +191,16 @@ GL 4.5, GLSL 4.50:
GL_ARB_conditional_render_inverted   DONE (i965, nv50,
 nvc0, llvmpipe, softpipe)
GL_ARB_cull_distance not started
GL_ARB_derivative_controlDONE (i965, nv50,
 nvc0, r600)
 -  GL_ARB_direct_state_access   started (Laura
 Ekstrand)
 +  GL_ARB_direct_state_access   started
 +  - Transform Feedback object  started (Martin
 Peres)
 +  - Buffer object  started (Laura
 Ekstrand)
 +  - Framebuffer object started (Laura
 Ekstrand)
 +  - Renderbuffer objectnot started
 +  - Texture object started (Laura
 Ekstrand)
 +  - Vertex array objectnot started
 +  - Sampler object not started
 +  - Program Pipeline objectnot started
 +  - Query object   started (Martin
 Peres)
GL_ARB_get_texture_sub_image started (Brian
 Paul)
GL_ARB_shader_texture_image_samples  not started
GL_ARB_texture_barrier   DONE (nv50, nvc0,
 r300, r600, radeonsi)
 --
 2.2.2


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/5] meta/blit: Write depth only when asked for

2015-01-29 Thread Matt Turner
On Thu, Jan 29, 2015 at 6:21 AM, Topi Pohjolainen
topi.pohjolai...@intel.com wrote:
 diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
 index ec7564a..45035b1 100644
 --- a/src/mesa/drivers/common/meta.c
 +++ b/src/mesa/drivers/common/meta.c
 @@ -293,13 +293,14 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
  void main()\n
  {\n
 gl_FragColor = %s(texSampler, %s);\n
 -   gl_FragDepth = gl_FragColor.x;\n
 +%s
  }\n,
  fs_preprocess, shader-type, fs_input,
 -shader-func, shader-texcoords);
 +shader-func, shader-texcoords,
 +do_depth ? gl_FragDepth = gl_FragColor.x;\n : );

 _mesa_meta_compile_and_link_program(ctx, vs_source, fs_source,
 -   ralloc_asprintf(mem_ctx, %s blit,
 +   ralloc_asprintf(mem_ctx,%s blit,

Unintended change.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/6] main: Added entry point for glTransformFeedbackBufferBase

2015-01-29 Thread Laura Ekstrand
On Thu, Jan 29, 2015 at 4:55 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 v2: Review from Laura Ekstrand
 - give more helpful error messages
 - factor the lookup code for the xfb and objBuf
 - replace some already-existing tabs with spaces
 - add comments to explain the cases where xfb == 0 or buffer == 0
 - fix the condition for binding the transform buffer or not

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |   6 ++
  src/mesa/main/bufferobj.c  |   9 +-
  src/mesa/main/tests/dispatch_sanity.cpp|   1 +
  src/mesa/main/transformfeedback.c  | 136
 +++--
  src/mesa/main/transformfeedback.h  |  19 +++-
  5 files changed, 135 insertions(+), 36 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 15b00c2..35d6906 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -14,6 +14,12 @@
param name=ids type=GLuint * /
 /function

 +   function name=TransformFeedbackBufferBase offset=assign
 +  param name=xfb type=GLuint /
 +  param name=index type=GLuint /
 +  param name=buffer type=GLuint /
 +   /function
 +
 !-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 0c1ce98..86532ea 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -3546,8 +3546,9 @@ _mesa_BindBufferRange(GLenum target, GLuint index,

 switch (target) {
 case GL_TRANSFORM_FEEDBACK_BUFFER:
 -  _mesa_bind_buffer_range_transform_feedback(ctx, index, bufObj,
 -offset, size);
 +  _mesa_bind_buffer_range_transform_feedback(ctx,
 +
  ctx-TransformFeedback.CurrentObject,
 + index, bufObj, offset,
 size);
return;
 case GL_UNIFORM_BUFFER:
bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
 @@ -3611,7 +3612,9 @@ _mesa_BindBufferBase(GLenum target, GLuint index,
 GLuint buffer)

 switch (target) {
 case GL_TRANSFORM_FEEDBACK_BUFFER:
 -  _mesa_bind_buffer_base_transform_feedback(ctx, index, bufObj);
 +  _mesa_bind_buffer_base_transform_feedback(ctx,
 +
 ctx-TransformFeedback.CurrentObject,
 +index, bufObj, false);
return;
 case GL_UNIFORM_BUFFER:
bind_buffer_base_uniform_buffer(ctx, index, bufObj);
 diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
 b/src/mesa/main/tests/dispatch_sanity.cpp
 index 5e7f7ae..166401a 100644
 --- a/src/mesa/main/tests/dispatch_sanity.cpp
 +++ b/src/mesa/main/tests/dispatch_sanity.cpp
 @@ -956,6 +956,7 @@ const struct function gl_core_functions_possible[] = {

 /* GL_ARB_direct_state_access */
 { glCreateTransformFeedbacks, 45, -1 },
 +   { glTransformFeedbackBufferBase, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 },
 diff --git a/src/mesa/main/transformfeedback.c
 b/src/mesa/main/transformfeedback.c
 index b0fca48..6583974 100644
 --- a/src/mesa/main/transformfeedback.c
 +++ b/src/mesa/main/transformfeedback.c
 @@ -514,22 +514,24 @@ _mesa_EndTransformFeedback(void)
   * Helper used by BindBufferRange() and BindBufferBase().
   */
  static void
 -bind_buffer_range(struct gl_context *ctx, GLuint index,
 +bind_buffer_range(struct gl_context *ctx,
 +  struct gl_transform_feedback_object *obj,
 +  GLuint index,
struct gl_buffer_object *bufObj,
 -  GLintptr offset, GLsizeiptr size)
 +  GLintptr offset, GLsizeiptr size,
 +  bool dsa)
  {
 -   struct gl_transform_feedback_object *obj =
 -  ctx-TransformFeedback.CurrentObject;
 -
 /* Note: no need to FLUSH_VERTICES or flag NewTransformFeedback,
 because
  * transform feedback buffers can't be changed while transform
 feedback is
  * active.
  */

 -   /* The general binding point */
 -   _mesa_reference_buffer_object(ctx,
 - ctx-TransformFeedback.CurrentBuffer,
 - bufObj);
 +   if (!dsa) {
 +  /* The general binding point */
 +  _mesa_reference_buffer_object(ctx,
 +ctx-TransformFeedback.CurrentBuffer,
 +bufObj);
 +   }

 /* The per-attribute binding point */
 _mesa_set_transform_feedback_binding(ctx, obj, index, bufObj, offset,
 size);
 @@ -543,15 +545,12 @@ bind_buffer_range(struct gl_context *ctx, GLuint
 index,
   */
  void
  _mesa_bind_buffer_range_transform_feedback(struct gl_context *ctx,
 -  GLuint index,
 -   

Re: [Mesa-dev] [PATCH 3/6] main: Added entry point for glTransformFeedbackBufferRange

2015-01-29 Thread Laura Ekstrand
On Thu, Jan 29, 2015 at 4:55 AM, Martin Peres martin.pe...@linux.intel.com
wrote:

 v2: review from Laura Ekstrand
 - use the refactored code to lookup the objects
 - improve some error messages
 - factor out the gl method name computation
 - better handle the spec differences between the DSA and non-DSA cases
 - quote the spec a little more

 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  8 +++
  src/mesa/main/bufferobj.c  |  3 +-
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  src/mesa/main/transformfeedback.c  | 94
 +-
  src/mesa/main/transformfeedback.h  |  6 +-
  5 files changed, 94 insertions(+), 18 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 35d6906..b3c090f 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -20,6 +20,14 @@
param name=buffer type=GLuint /
 /function

 +   function name=TransformFeedbackBufferRange offset=assign
 +  param name=xfb type=GLuint /
 +  param name=index type=GLuint /
 +  param name=buffer type=GLuint /
 +  param name=offset type=GLintptr /
 +  param name=size type=GLsizeiptr /
 +   /function
 +
 !-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 86532ea..7558e17 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -3548,7 +3548,8 @@ _mesa_BindBufferRange(GLenum target, GLuint index,
 case GL_TRANSFORM_FEEDBACK_BUFFER:
_mesa_bind_buffer_range_transform_feedback(ctx,

 ctx-TransformFeedback.CurrentObject,
 - index, bufObj, offset,
 size);
 + index, bufObj, offset,
 size,
 + false);
return;
 case GL_UNIFORM_BUFFER:
bind_buffer_range_uniform_buffer(ctx, index, bufObj, offset, size);
 diff --git a/src/mesa/main/tests/dispatch_sanity.cpp
 b/src/mesa/main/tests/dispatch_sanity.cpp
 index 166401a..3b34f7b 100644
 --- a/src/mesa/main/tests/dispatch_sanity.cpp
 +++ b/src/mesa/main/tests/dispatch_sanity.cpp
 @@ -957,6 +957,7 @@ const struct function gl_core_functions_possible[] = {
 /* GL_ARB_direct_state_access */
 { glCreateTransformFeedbacks, 45, -1 },
 { glTransformFeedbackBufferBase, 45, -1 },
 +   { glTransformFeedbackBufferRange, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 },
 diff --git a/src/mesa/main/transformfeedback.c
 b/src/mesa/main/transformfeedback.c
 index 6583974..7564b6f 100644
 --- a/src/mesa/main/transformfeedback.c
 +++ b/src/mesa/main/transformfeedback.c
 @@ -541,7 +541,8 @@ bind_buffer_range(struct gl_context *ctx,
  /**
   * Specify a buffer object to receive transform feedback results.  Plus,
   * specify the starting offset to place the results, and max size.
 - * Called from the glBindBufferRange() function.
 + * Called from the glBindBufferRange() and glTransformFeedbackBufferRange
 + * functions.
   */
  void
  _mesa_bind_buffer_range_transform_feedback(struct gl_context *ctx,
 @@ -549,35 +550,72 @@ _mesa_bind_buffer_range_transform_feedback(struct
 gl_context *ctx,
 GLuint index,
 struct gl_buffer_object
 *bufObj,
 GLintptr offset,
 -   GLsizeiptr size)
 +   GLsizeiptr size,
 +   bool dsa)
  {
 +   const char *gl_methd_name;
 +   if (dsa)
 +  gl_methd_name = glTransformFeedbackBufferRange;
 +   else
 +  gl_methd_name = glBindBufferRange;
 +
 +
 if (obj-Active) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION,
 -  glBindBufferRange(transform feedback active));
 +  _mesa_error(ctx, GL_INVALID_OPERATION, %s(transform feedback
 active),
 +  gl_methd_name);
return;
 }

 if (index = ctx-Const.MaxTransformFeedbackBuffers) {
 -  _mesa_error(ctx, GL_INVALID_VALUE, glBindBufferRange(index=%d 
 -  out of bounds), index);
 +  /* An INVALID_VALUE error is generated if index is greater than or
 equal
 +   * to the number of binding points for transform feedback, as
 described
 +   * in section 6.7.1.
 +   */
 +  _mesa_error(ctx, GL_INVALID_VALUE, %s(index=%d out of bounds),
 +  gl_methd_name, index);
return;
 }

 if (size  0x3) {
 -  /* must a multiple of four */
 -  _mesa_error(ctx, GL_INVALID_VALUE, glBindBufferRange(size=%d),
 -  (int) size);
 +   

[Mesa-dev] [PATCH 2/5] meta/blit: Write depth only when asked for

2015-01-29 Thread Topi Pohjolainen
Implementing an idea from Ken, on i965 the shader program for 2D
blits becomes significantly simpler.

Before:

pln(8)   g61Fg40,1,0Fg28,8,1F  { align1 1Q compacted };
pln(8)   g71Fg4.40,1,0F  g28,8,1F  { align1 1Q compacted };
send(8)  g21UW   g68,8,1F
 sampler (1, 0, 0, 1) mlen 2 rlen 4  { align1 1Q };
mov(8)   g1231F  g28,8,1F{ align1 1Q compacted };
mov(8)   g1241F  g38,8,1F{ align1 1Q compacted };
mov(8)   g1251F  g48,8,1F{ align1 1Q compacted };
mov(8)   g1261F  g58,8,1F{ align1 1Q compacted };
mov(8)   g1271F  g28,8,1F{ align1 1Q compacted };
nop ;
sendc(8) nullg1238,8,1F
render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT };

After:

pln(8)   g61F g40,1,0Fg28,8,1F   { align1 1Q compacted };
pln(8)   g71F g4.40,1,0F  g28,8,1F   { align1 1Q compacted };
send(8)  g1241UW  g68,8,1F
 sampler (1, 0, 0, 1) mlen 2 rlen 4{ align1 1Q };
sendc(8) nullg1248,8,1F
   render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT };

Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
CC: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/common/meta.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index ec7564a..45035b1 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -293,13 +293,14 @@ _mesa_meta_setup_blit_shader(struct gl_context *ctx,
 void main()\n
 {\n
gl_FragColor = %s(texSampler, %s);\n
-   gl_FragDepth = gl_FragColor.x;\n
+%s
 }\n,
 fs_preprocess, shader-type, fs_input,
-shader-func, shader-texcoords);
+shader-func, shader-texcoords,
+do_depth ? gl_FragDepth = gl_FragColor.x;\n : );
 
_mesa_meta_compile_and_link_program(ctx, vs_source, fs_source,
-   ralloc_asprintf(mem_ctx, %s blit,
+   ralloc_asprintf(mem_ctx,%s blit,
shader-type),
shader-shader_prog);
ralloc_free(mem_ctx);
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/5] meta/blit: Add plumbing for shaders without depth

2015-01-29 Thread Topi Pohjolainen
Currently all blit programs are unconditionally compiled with
gl_FragDepth.

Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/common/meta.c | 3 ++-
 src/mesa/drivers/common/meta.h | 1 +
 src/mesa/drivers/common/meta_blit.c| 2 +-
 src/mesa/drivers/common/meta_generate_mipmap.c | 2 +-
 4 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 23c05eb..ec7564a 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -243,6 +243,7 @@ _mesa_meta_compile_and_link_program(struct gl_context *ctx,
 void
 _mesa_meta_setup_blit_shader(struct gl_context *ctx,
  GLenum target,
+ bool do_depth,
  struct blit_shader_table *table)
 {
char *vs_source, *fs_source;
@@ -3035,7 +3036,7 @@ decompress_texture_image(struct gl_context *ctx,
   _mesa_meta_setup_vertex_objects(decompress-VAO, decompress-VBO, true,
   2, 4, 0);
 
-  _mesa_meta_setup_blit_shader(ctx, target, decompress-shaders);
+  _mesa_meta_setup_blit_shader(ctx, target, true, decompress-shaders);
} else {
   _mesa_meta_setup_ff_tnl_for_blit(decompress-VAO, decompress-VBO, 3);
}
diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index 20d30f7..de3dc6a 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -632,6 +632,7 @@ _mesa_meta_setup_copypix_texture(struct gl_context *ctx,
 void
 _mesa_meta_setup_blit_shader(struct gl_context *ctx,
  GLenum target,
+ bool do_depth,
  struct blit_shader_table *table);
 
 void
diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index 079b8e1..02e90e0 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -531,7 +531,7 @@ setup_glsl_blit_framebuffer(struct gl_context *ctx,
} else if (is_target_multisample) {
   setup_glsl_msaa_blit_shader(ctx, blit, src_rb, target);
} else {
-  _mesa_meta_setup_blit_shader(ctx, target, blit-shaders);
+  _mesa_meta_setup_blit_shader(ctx, target, true, blit-shaders);
}
 }
 
diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
b/src/mesa/drivers/common/meta_generate_mipmap.c
index eedbb8e..c38b46b 100644
--- a/src/mesa/drivers/common/meta_generate_mipmap.c
+++ b/src/mesa/drivers/common/meta_generate_mipmap.c
@@ -193,7 +193,7 @@ _mesa_meta_GenerateMipmap(struct gl_context *ctx, GLenum 
target,
if (use_glsl_version) {
   _mesa_meta_setup_vertex_objects(mipmap-VAO, mipmap-VBO, true,
   2, 4, 0);
-  _mesa_meta_setup_blit_shader(ctx, target, mipmap-shaders);
+  _mesa_meta_setup_blit_shader(ctx, target, true, mipmap-shaders);
} else {
   _mesa_meta_setup_ff_tnl_for_blit(mipmap-VAO, mipmap-VBO, 3);
   _mesa_set_enable(ctx, target, GL_TRUE);
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Add new fast mtx_t mutex type for basic use cases

2015-01-29 Thread Emil Velikov
On 28/01/15 05:08, Kristian Høgsberg wrote:
 While modern pthread mutexes are very fast, they still incur a call to an
 external DSO and overhead of the generality and features of pthread mutexes.
 Most mutexes in mesa only needs lock/unlock, and the idea here is that we can
 inline the atomic operation and make the fast case just two intructions.
 Mutexes are subtle and finicky to implement, so we carefully copy the
 implementation from Ulrich Dreppers well-written and well-reviewed paper:
 
   Futexes Are Tricky
   http://www.akkadia.org/drepper/futex.pdf
 
 We implement mutex3, which gives us a mutex that has no syscalls on
 uncontended lock or unlock.  Further, the uncontended case boils down to a
 cmpxchg and an untaken branch and the uncontended unlock is just a locked decr
 and an untaken branch.  We use __builtin_expect() to indicate that contention
 is unlikely so that gcc will put the contention code out of the main code
 flow.
 
 A fast mutex only supports lock/unlock, can't be recursive or used with
 condition variables.  We keep the pthread mutex implementation around as
 full_mtx_t for the few places where we use condition variables or recursive
 locking.  For platforms or compilers where futex and atomics aren't available,
 mtx_t falls back to the pthread mutex.
 
 The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
 applications.  Most CPU bound cases are helped and some of our internal
 bind_buffer_object heavy benchmarks gain up to 10%.
 
Hi Kristian,

Can I humbly ask that you split this into two patches - one that
introduces the new functions/struct and another one that uses them ?
This way it'll be easier if/when things go crazy.

Also the patch seems to wonder between posix and win32
+ typedef full_mtx_t mtx_t;

and
+ typedef mtx_t fast_mtx_t;

Looks like a left over from the should I rename XX variables to fast*
or just one to full* moment :)


Thanks
Emil

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] meta/blit: Write gl_FragDepth only when really needed

2015-01-29 Thread Topi Pohjolainen
Currently meta blit programs unconditionally write both color and
depth values. This series introduces two flavours - with and without
gl_FragDepth - allowing simpler programs for color only blits. Ken
had observed that i965 behaves badly emitting five unnecessary moves
when the depth is included in the shader.
While this didn't improve any performance benches I tried, it doesn't
seem to introduce any regressions either in the benches or with
piglit.

Topi Pohjolainen (5):
  meta/blit: Add plumbing for shaders without depth
  meta/blit: Write depth only when asked for
  meta/blit: Compile programs with and without depth
  meta: Don't write depth when generating miptrees
  meta: Don't write depth when decompressing tex-images

 src/mesa/drivers/common/meta.c | 10 ++
 src/mesa/drivers/common/meta.h |  4 +++-
 src/mesa/drivers/common/meta_blit.c| 13 +
 src/mesa/drivers/common/meta_generate_mipmap.c |  2 +-
 4 files changed, 19 insertions(+), 10 deletions(-)

-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/5] meta/blit: Compile programs with and without depth

2015-01-29 Thread Topi Pohjolainen
When color buffers alone are concerned the depth is not needed.

No regression on BDW where meta blit is used instead of blorp. I
also disabled blorp temporarily for fbo-blits on IVB and saw no
regressions there either.
I also compared several graphics benchmarks on BDW and saw neither
regressions or improvements.

Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/common/meta.h  |  3 ++-
 src/mesa/drivers/common/meta_blit.c | 13 +
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/common/meta.h b/src/mesa/drivers/common/meta.h
index de3dc6a..3b9e05e 100644
--- a/src/mesa/drivers/common/meta.h
+++ b/src/mesa/drivers/common/meta.h
@@ -298,7 +298,8 @@ struct blit_state
 {
GLuint VAO;
GLuint VBO;
-   struct blit_shader_table shaders;
+   struct blit_shader_table shaders_with_depth;
+   struct blit_shader_table shaders_without_depth;
GLuint msaa_shaders[BLIT_MSAA_SHADER_COUNT];
struct temp_texture depthTex;
bool no_ctsi_fallback;
diff --git a/src/mesa/drivers/common/meta_blit.c 
b/src/mesa/drivers/common/meta_blit.c
index 02e90e0..4212d94 100644
--- a/src/mesa/drivers/common/meta_blit.c
+++ b/src/mesa/drivers/common/meta_blit.c
@@ -510,7 +510,8 @@ setup_glsl_blit_framebuffer(struct gl_context *ctx,
 struct blit_state *blit,
 struct gl_renderbuffer *src_rb,
 GLenum target, GLenum filter,
-bool is_scaled_blit)
+bool is_scaled_blit,
+bool do_depth)
 {
unsigned texcoord_size;
bool is_target_multisample = target == GL_TEXTURE_2D_MULTISAMPLE ||
@@ -531,7 +532,9 @@ setup_glsl_blit_framebuffer(struct gl_context *ctx,
} else if (is_target_multisample) {
   setup_glsl_msaa_blit_shader(ctx, blit, src_rb, target);
} else {
-  _mesa_meta_setup_blit_shader(ctx, target, true, blit-shaders);
+  _mesa_meta_setup_blit_shader(ctx, target, do_depth,
+   do_depth ? blit-shaders_with_depth
+: blit-shaders_without_depth);
}
 }
 
@@ -642,7 +645,8 @@ blitframebuffer_texture(struct gl_context *ctx,
scaled_blit = dstW != srcW || dstH != srcH;
 
if (glsl_version) {
-  setup_glsl_blit_framebuffer(ctx, blit, rb, target, filter, scaled_blit);
+  setup_glsl_blit_framebuffer(ctx, blit, rb, target, filter, scaled_blit,
+  do_depth);
}
else {
   _mesa_meta_setup_ff_tnl_for_blit(ctx-Meta-Blit.VAO,
@@ -962,7 +966,8 @@ _mesa_meta_glsl_blit_cleanup(struct blit_state *blit)
   blit-VBO = 0;
}
 
-   _mesa_meta_blit_shader_table_cleanup(blit-shaders);
+   _mesa_meta_blit_shader_table_cleanup(blit-shaders_with_depth);
+   _mesa_meta_blit_shader_table_cleanup(blit-shaders_without_depth);
 
_mesa_DeleteTextures(1, blit-depthTex.TexObj);
blit-depthTex.TexObj = 0;
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/5] meta: Don't write depth when decompressing tex-images

2015-01-29 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/common/meta.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/common/meta.c b/src/mesa/drivers/common/meta.c
index 45035b1..6ddec73 100644
--- a/src/mesa/drivers/common/meta.c
+++ b/src/mesa/drivers/common/meta.c
@@ -3037,7 +3037,7 @@ decompress_texture_image(struct gl_context *ctx,
   _mesa_meta_setup_vertex_objects(decompress-VAO, decompress-VBO, true,
   2, 4, 0);
 
-  _mesa_meta_setup_blit_shader(ctx, target, true, decompress-shaders);
+  _mesa_meta_setup_blit_shader(ctx, target, false, decompress-shaders);
} else {
   _mesa_meta_setup_ff_tnl_for_blit(decompress-VAO, decompress-VBO, 3);
}
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/5] meta: Don't write depth when generating miptrees

2015-01-29 Thread Topi Pohjolainen
Signed-off-by: Topi Pohjolainen topi.pohjolai...@intel.com
---
 src/mesa/drivers/common/meta_generate_mipmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/common/meta_generate_mipmap.c 
b/src/mesa/drivers/common/meta_generate_mipmap.c
index c38b46b..c1b6d3c 100644
--- a/src/mesa/drivers/common/meta_generate_mipmap.c
+++ b/src/mesa/drivers/common/meta_generate_mipmap.c
@@ -193,7 +193,7 @@ _mesa_meta_GenerateMipmap(struct gl_context *ctx, GLenum 
target,
if (use_glsl_version) {
   _mesa_meta_setup_vertex_objects(mipmap-VAO, mipmap-VBO, true,
   2, 4, 0);
-  _mesa_meta_setup_blit_shader(ctx, target, true, mipmap-shaders);
+  _mesa_meta_setup_blit_shader(ctx, target, false, mipmap-shaders);
} else {
   _mesa_meta_setup_ff_tnl_for_blit(mipmap-VAO, mipmap-VBO, 3);
   _mesa_set_enable(ctx, target, GL_TRUE);
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/6] main: Added entry point for glCreateTransformFeedbacks

2015-01-29 Thread Martin Peres


On 29/01/15 15:37, Fredrik Höglund wrote:

On Thursday 29 January 2015, Martin Peres wrote:

v2: Review from Laura Ekstrand
- generate the name of the gl method once
- shorten some lines to stay in the 78 chars limit

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  7 +++
  src/mesa/main/tests/dispatch_sanity.cpp|  1 +
  src/mesa/main/transformfeedback.c  | 63 --
  src/mesa/main/transformfeedback.h  |  6 +++
  4 files changed, 62 insertions(+), 15 deletions(-)

diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml 
b/src/mapi/glapi/gen/ARB_direct_state_access.xml
index 2fe1638..15b00c2 100644
--- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
+++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
@@ -7,6 +7,13 @@
 enum name=QUERY_TARGETvalue=0x82EA/
 enum name=TEXTURE_BINDING value=0x82EB/
  
+   !-- Transform Feedback object functions --

+
+  function name=CreateTransformFeedbacks offset=assign
+  param name=n type=GLsizei /
+  param name=ids type=GLuint * /
+   /function
+
 !-- Texture object functions --
  
 function name=CreateTextures offset=assign

diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index ee4db45..5e7f7ae 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -955,6 +955,7 @@ const struct function gl_core_functions_possible[] = {
 { glClipControl, 45, -1 },
  
 /* GL_ARB_direct_state_access */

+   { glCreateTransformFeedbacks, 45, -1 },
 { glCreateTextures, 45, -1 },
 { glTextureStorage1D, 45, -1 },
 { glTextureStorage2D, 45, -1 },
diff --git a/src/mesa/main/transformfeedback.c 
b/src/mesa/main/transformfeedback.c
index a737463..b0fca48 100644
--- a/src/mesa/main/transformfeedback.c
+++ b/src/mesa/main/transformfeedback.c
@@ -825,25 +825,24 @@ _mesa_lookup_transform_feedback_object(struct gl_context 
*ctx, GLuint name)
   _mesa_HashLookup(ctx-TransformFeedback.Objects, name);
  }
  
-

-/**
- * Create new transform feedback objects.   Transform feedback objects
- * encapsulate the state related to transform feedback to allow quickly
- * switching state (and drawing the results, below).
- * Part of GL_ARB_transform_feedback2.
- */
-void GLAPIENTRY
-_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
+void
+_mesa_create_transform_feedbacks(struct gl_context *ctx, GLsizei n,
+ GLuint *ids, bool dsa)
  {
 GLuint first;
-   GET_CURRENT_CONTEXT(ctx);
+   const char* gl_mthd_name;

I would prefer it if you used a simpler name for this variable, such
as func.


Func seems like a good name. Thanks.



+   if (dsa)
+  gl_mthd_name = glCreateTransformFeedbacks;
+   else
+  gl_mthd_name = glGenTransformFeedbacks;
  
 if (n  0) {

-  _mesa_error(ctx, GL_INVALID_VALUE, glGenTransformFeedbacks(n  0));
+  _mesa_error(ctx, GL_INVALID_VALUE, %s(n  0), gl_mthd_name);
return;
 }
  
-   if (!names)

+   if (!ids)
return;
  
 /* we don't need contiguous IDs, but this might be faster */

@@ -854,18 +853,52 @@ _mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
   struct gl_transform_feedback_object *obj
  = ctx-Driver.NewTransformFeedback(ctx, first + i);
   if (!obj) {
-_mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
+_mesa_error(ctx, GL_OUT_OF_MEMORY, %s, gl_mthd_name);
  return;
   }
- names[i] = first + i;
+ ids[i] = first + i;
   _mesa_HashInsert(ctx-TransformFeedback.Objects, first + i, obj);

You have to set EverBound to true when dsa is true. Otherwise
glIsTransformFeedback() will return false when it's passed an ID that
was created by glCreateTransformFeedbacks().


Good catch! I'm going to modify my piglit test to check for this 
behaviour. Thanks.



}
 }
 else {
-  _mesa_error(ctx, GL_OUT_OF_MEMORY, glGenTransformFeedbacks);
+  _mesa_error(ctx, GL_OUT_OF_MEMORY, %s, gl_mthd_name);
 }
  }
  
+/**

+ * Create new transform feedback objects.   Transform feedback objects
+ * encapsulate the state related to transform feedback to allow quickly
+ * switching state (and drawing the results, below).
+ * Part of GL_ARB_transform_feedback2.
+ */
+void GLAPIENTRY
+_mesa_GenTransformFeedbacks(GLsizei n, GLuint *names)
+{
+   GET_CURRENT_CONTEXT(ctx);
+
+   /* GenTransformFeedbacks should just reserve the object names that a
+* subsequent call to BindTransformFeedback should actively create. For
+* the sake of simplicity, we reserve the names and create the objects
+* straight away.
+*/
+   _mesa_create_transform_feedbacks(ctx, n, names, false);
+}
+
+/**
+ * Create new transform feedback objects.   Transform feedback objects
+ * encapsulate the state related to transform feedback to allow quickly
+ * 

[Mesa-dev] [PATCH] dir-locals.el: Don't set variables for non-programming modes

2015-01-29 Thread Neil Roberts
This limits the style changes to modes inherited from prog-mode. The
main reason to do this is to avoid setting fill-column for people
using Emacs to edit commit messages because 78 characters is too many
to make it wrap properly in git log. Note that makefile-mode also
inherits from prog-mode so the fill column should continue to apply
there.
---
 .dir-locals.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.dir-locals.el b/.dir-locals.el
index 04a1a2c..d95eb48 100644
--- a/.dir-locals.el
+++ b/.dir-locals.el
@@ -1,4 +1,4 @@
-((nil
+((prog-mode
   (indent-tabs-mode . nil)
   (tab-width . 8)
   (c-basic-offset . 3)
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v4 2/3] i965/fs: Use NIR's scalarizing abilities and stop handling vectors

2015-01-29 Thread Jason Ekstrand
Matt wanted shader-db numbers and here they are:

total instructions in shared programs: 5998287 - 5974070 (-0.40%)
instructions in affected programs: 732041 - 707824 (-3.31%)
helped:3135
HURT:  193
GAINED:18
LOST:  0


On Thu, Jan 29, 2015 at 3:41 AM, Kenneth Graunke kenn...@whitecape.org
wrote:

 On Wednesday, January 28, 2015 04:27:23 PM Jason Ekstrand wrote:
  On Wed, Jan 28, 2015 at 4:02 PM, Matt Turner matts...@gmail.com wrote:
 
   On Tue, Jan 27, 2015 at 5:31 PM, Jason Ekstrand ja...@jlekstrand.net
   wrote:
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
   b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index de0d780..217fe09 100644
@@ -689,9 +689,9 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
   
   struct brw_reg acc = retype(brw_acc_reg(dispatch_width),
   result.type);
   
-  emit_percomp(MUL(acc, op[0], op[1]), instr-dest.write_mask);
-  emit_percomp(MACH(reg_null_d, op[0], op[1]),
   instr-dest.write_mask);
-  emit_percomp(MOV(result, fs_reg(acc)),
 instr-dest.write_mask);
  
   Bah. How the hell did the old code pass piglit?
  
 
  I think channel expressions saved it.
 
 
  
+  emit(MUL(acc, op[0], op[1]));
+  emit(MACH(reg_null_d, op[0], op[1]));
+  emit(MOV(result, fs_reg(acc)));
   break;
}
   
@@ -773,72 +767,38 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
case nir_op_ball_fequal3:
case nir_op_ball_iequal3:
case nir_op_ball_fequal4:
-   case nir_op_ball_iequal4: {
-  unsigned num_components =
 nir_op_infos[instr-op].input_sizes[0];
-  fs_reg temp = vgrf(num_components);
-  emit_percomp(CMP(temp, op[0], op[1], BRW_CONDITIONAL_Z),
-   (1  num_components) - 1);
-  emit_reduction(BRW_OPCODE_AND, result, temp, num_components);
-  break;
-   }
-
+   case nir_op_ball_iequal4:
  
   We can save it for later, but it might be interesting to let the fs
   backend get the vector comparisons directly, if that's possible.
  
   See https://bugs.freedesktop.org/show_bug.cgi?id=77456
  
 
  We can.  We just have to add a flag to turn that off in the
  nir_lower_alu_to_scalar pass.
 
 
  
@@ -867,83 +827,67 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
   unreachable(not reached: should be handled by
 ldexp_to_arith());
   
case nir_op_fsqrt:
-  emit_math_percomp(SHADER_OPCODE_SQRT, result, op[0],
-instr-dest.write_mask,
 instr-dest.saturate);
+  emit_math(SHADER_OPCODE_SQRT, result, op[0])
+   -saturate = instr-dest.saturate;
  
   I don't really like the -saturate = ... as a single statement. I know
   I'm guilty of using it in a unit test recently.
  
 
  Yeah, I'm not sure what to do there.  I really don't like having to
 assign
  it to a fs_inst variable and then do the saturate.  I guess we could.  Or
  we could add something to emit_math.  Thoughts?

 In the brw_eu_emit.c code I worked around this by creating a brw_last_inst
 macro that always refers to the most recently emitted instruction.

 Still not 100% crazy about it, but it's been pretty handy.  *shrug*.
 It's an idea, anyway.

 I'm not too opposed to just creating fs_inst * temporaries.

 --Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] meta/blit: Write gl_FragDepth only when really needed

2015-01-29 Thread Matt Turner
I haven't worked in meta a lot, but I didn't see anything that didn't
look correct. The series is

Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] meta/blit: Write gl_FragDepth only when really needed

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 04:20:59 PM Topi Pohjolainen wrote:
 Currently meta blit programs unconditionally write both color and
 depth values. This series introduces two flavours - with and without
 gl_FragDepth - allowing simpler programs for color only blits. Ken
 had observed that i965 behaves badly emitting five unnecessary moves
 when the depth is included in the shader.
 While this didn't improve any performance benches I tried, it doesn't
 seem to introduce any regressions either in the benches or with
 piglit.
 
 Topi Pohjolainen (5):
   meta/blit: Add plumbing for shaders without depth
   meta/blit: Write depth only when asked for
   meta/blit: Compile programs with and without depth
   meta: Don't write depth when generating miptrees
   meta: Don't write depth when decompressing tex-images

Series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

This should also help with future Cherryview optimizations.  Thanks!

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] gallium/docs: fix docs wrt ARL/ARR/FLR

2015-01-29 Thread sroland
From: Roland Scheidegger srol...@vmware.com

since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.
---
 src/gallium/docs/source/tgsi.rst | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index ff322e8..84b0ed6 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -48,13 +48,13 @@ used.
 
 .. math::
 
-  dst.x = \lfloor src.x\rfloor
+  dst.x = (int) \lfloor src.x\rfloor
 
-  dst.y = \lfloor src.y\rfloor
+  dst.y = (int) \lfloor src.y\rfloor
 
-  dst.z = \lfloor src.z\rfloor
+  dst.z = (int) \lfloor src.z\rfloor
 
-  dst.w = \lfloor src.w\rfloor
+  dst.w = (int) \lfloor src.w\rfloor
 
 
 .. opcode:: MOV - Move
@@ -313,8 +313,6 @@ This instruction replicates its result.
 
 .. opcode:: FLR - Floor
 
-This is identical to :opcode:`ARL`.
-
 .. math::
 
   dst.x = \lfloor src.x\rfloor
@@ -637,13 +635,13 @@ This instruction replicates its result.
 
 .. math::
 
-  dst.x = round(src.x)
+  dst.x = (int) round(src.x)
 
-  dst.y = round(src.y)
+  dst.y = (int) round(src.y)
 
-  dst.z = round(src.z)
+  dst.z = (int) round(src.z)
 
-  dst.w = round(src.w)
+  dst.w = (int) round(src.w)
 
 
 .. opcode:: SSG - Set Sign
-- 
1.9.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/docs: fix docs wrt ARL/ARR/FLR

2015-01-29 Thread Alex Deucher
On Thu, Jan 29, 2015 at 2:40 PM,  srol...@vmware.com wrote:
 From: Roland Scheidegger srol...@vmware.com

 since the address reg holds integer values, ARL/ARR do an implicit 
 float-to-int
 conversion, so clarify that. Thus it is also incorrect to say that FLR really
 does the same as ARL.

Reviewed-by: Alex Deucher alexander.deuc...@amd.com

 ---
  src/gallium/docs/source/tgsi.rst | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

 diff --git a/src/gallium/docs/source/tgsi.rst 
 b/src/gallium/docs/source/tgsi.rst
 index ff322e8..84b0ed6 100644
 --- a/src/gallium/docs/source/tgsi.rst
 +++ b/src/gallium/docs/source/tgsi.rst
 @@ -48,13 +48,13 @@ used.

  .. math::

 -  dst.x = \lfloor src.x\rfloor
 +  dst.x = (int) \lfloor src.x\rfloor

 -  dst.y = \lfloor src.y\rfloor
 +  dst.y = (int) \lfloor src.y\rfloor

 -  dst.z = \lfloor src.z\rfloor
 +  dst.z = (int) \lfloor src.z\rfloor

 -  dst.w = \lfloor src.w\rfloor
 +  dst.w = (int) \lfloor src.w\rfloor


  .. opcode:: MOV - Move
 @@ -313,8 +313,6 @@ This instruction replicates its result.

  .. opcode:: FLR - Floor

 -This is identical to :opcode:`ARL`.
 -
  .. math::

dst.x = \lfloor src.x\rfloor
 @@ -637,13 +635,13 @@ This instruction replicates its result.

  .. math::

 -  dst.x = round(src.x)
 +  dst.x = (int) round(src.x)

 -  dst.y = round(src.y)
 +  dst.y = (int) round(src.y)

 -  dst.z = round(src.z)
 +  dst.z = (int) round(src.z)

 -  dst.w = round(src.w)
 +  dst.w = (int) round(src.w)


  .. opcode:: SSG - Set Sign
 --
 1.9.1

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] gallium/docs: fix docs wrt ARL/ARR/FLR

2015-01-29 Thread Jose Fonseca

On 29/01/15 19:40, srol...@vmware.com wrote:

From: Roland Scheidegger srol...@vmware.com

since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.
---
  src/gallium/docs/source/tgsi.rst | 18 --
  1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index ff322e8..84b0ed6 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -48,13 +48,13 @@ used.

  .. math::

-  dst.x = \lfloor src.x\rfloor
+  dst.x = (int) \lfloor src.x\rfloor

-  dst.y = \lfloor src.y\rfloor
+  dst.y = (int) \lfloor src.y\rfloor

-  dst.z = \lfloor src.z\rfloor
+  dst.z = (int) \lfloor src.z\rfloor

-  dst.w = \lfloor src.w\rfloor
+  dst.w = (int) \lfloor src.w\rfloor


  .. opcode:: MOV - Move
@@ -313,8 +313,6 @@ This instruction replicates its result.

  .. opcode:: FLR - Floor

-This is identical to :opcode:`ARL`.
-
  .. math::

dst.x = \lfloor src.x\rfloor
@@ -637,13 +635,13 @@ This instruction replicates its result.

  .. math::

-  dst.x = round(src.x)
+  dst.x = (int) round(src.x)

-  dst.y = round(src.y)
+  dst.y = (int) round(src.y)

-  dst.z = round(src.z)
+  dst.z = (int) round(src.z)

-  dst.w = round(src.w)
+  dst.w = (int) round(src.w)


  .. opcode:: SSG - Set Sign



Looks good.

Jose
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-29 Thread Axel Davy

Hi,

for Gallium Nine we use ARR and UARL.

Do all cards supporting integers support indexing with an integer register ?
If yes we could remove our UARL usage.

As for ARR, if all cards are ok with normal registers, we could use a 
normal one
too, as long as we can get the same rounding behaviour (either to a 
integer directly

if card supports it, or to float).

Axel Davy

On 29/01/2015 22:20, Roland Scheidegger wrote :

Hi,

the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
that indirections can only be done through the address reg and has quite
some extra code to deal with this. Even though hardware and apis which
worked like that are definitely old by now.
Thus, I'm proposing the address reg be nuked. I am however not quite
sure what the implications for drivers are, other than I'm certain
llvmpipe can handle that already.
For that reason, I suspect at least initially a new cap bit would be
required so glsl-to-tgsi would skip the extra code. I tend to think
longer term it would be great if it could be nuked completely, I am
however not sure if that is easily done with drivers for old hw (such as
r300) - I guess if necessary we could keep operations such as ARL (or
even ARR though clearly not UARL!) and just define them to be usable
with temp regs.

Opinions?

Roland
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-29 Thread Rob Clark
On Thu, Jan 29, 2015 at 4:20 PM, Roland Scheidegger srol...@vmware.com wrote:
 Hi,

 the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
 that indirections can only be done through the address reg and has quite
 some extra code to deal with this. Even though hardware and apis which
 worked like that are definitely old by now.
 Thus, I'm proposing the address reg be nuked. I am however not quite
 sure what the implications for drivers are, other than I'm certain
 llvmpipe can handle that already.
 For that reason, I suspect at least initially a new cap bit would be
 required so glsl-to-tgsi would skip the extra code. I tend to think
 longer term it would be great if it could be nuked completely, I am
 however not sure if that is easily done with drivers for old hw (such as
 r300) - I guess if necessary we could keep operations such as ARL (or
 even ARR though clearly not UARL!) and just define them to be usable
 with temp regs.

 Opinions?

Not sure about other hw (although I would assume radeon is similar)..
but adreno has real actual address register, which need to be used for
relative addressing of the register (or const) file.  Otherwise we'd
need to do something like arrays in memory and load/store
instructions.

Possibly w/ more smarts in the compiler, we could infer address
register usage.  It would require some smarts that we don't have yet
to eliminate redundant writes to the address register generated for
each instruction with indirect reference.

(Although eventually I'd like to use tgsi-nir.. and haven't yet
looked at how nir handles indirects so maybe eventually I need to
implement those smarts in my compiler anyway.)

BR,
-R


 Roland
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/7] nir/algebraic: Fail to compile of a variable is used in a replace but not the search

2015-01-29 Thread Jason Ekstrand
---
 src/glsl/nir/nir_algebraic.py | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
index f9b246d..75436f4 100644
--- a/src/glsl/nir/nir_algebraic.py
+++ b/src/glsl/nir/nir_algebraic.py
@@ -34,13 +34,18 @@ class VarSet(object):
def __init__(self):
   self.names = {}
   self.ids = itertools.count()
+  self.immutable = False;
 
def __getitem__(self, name):
   if name not in self.names:
+ assert not self.immutable, Unknown replacement variable:  + name
  self.names[name] = self.ids.next()
 
   return self.names[name]
 
+   def lock(self):
+  self.immutable = True
+
 class Value(object):
@staticmethod
def create(val, name_base, varset):
@@ -138,6 +143,8 @@ class SearchAndReplace(object):
   else:
  self.search = Expression(search, search{0}.format(self.id), varset)
 
+  varset.lock()
+
   if isinstance(replace, Value):
  self.replace = replace
   else:
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/7] nir/search: Allow for matching variables based on types

2015-01-29 Thread Jason Ekstrand
This allows you to match on an unknown value but only if it is of a given
type.  90% of the uses of this are for matching only booleans, but adding
the generality of arbitrary types is no more complex.

nir_algebraic.py doesn't handle this yet but that's ok because the C
language will ensure that the default type on all variables is void.
---
 src/glsl/nir/nir_search.c | 11 +++
 src/glsl/nir/nir_search.h | 12 
 2 files changed, 23 insertions(+)

diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index ec89817..35eb7e2 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -82,6 +82,17 @@ match_value(const nir_search_value *value, nir_alu_instr 
*instr, unsigned src,
  instr-src[src].src.ssa-parent_instr-type != 
nir_instr_type_load_const)
 return false;
 
+ if (var-type != nir_type_void) {
+if (instr-src[src].src.ssa-parent_instr-type != 
nir_instr_type_alu)
+   return false;
+
+nir_alu_instr *src_alu =
+   nir_instr_as_alu(instr-src[src].src.ssa-parent_instr);
+
+if (nir_op_infos[src_alu-op].output_type != var-type)
+   return false;
+ }
+
  state-variables_seen |= (1  var-variable);
  state-variables[var-variable].src = instr-src[src].src;
  state-variables[var-variable].abs = false;
diff --git a/src/glsl/nir/nir_search.h b/src/glsl/nir/nir_search.h
index 18aa28d..7d47792 100644
--- a/src/glsl/nir/nir_search.h
+++ b/src/glsl/nir/nir_search.h
@@ -54,6 +54,18 @@ typedef struct {
 * given variable is only allowed to match constant values.
 */
bool is_constant;
+
+   /** Indicates that the given variable must have a certain type
+*
+* This is only allowed in search expressions and indicates that the
+* given variable is only allowed to match values that come from an ALU
+* instruction with the given output type.  A type of nir_type_void
+* means it can match any type.
+*
+* Note: A variable that is both constant and has a non-void type will
+* never match anything.
+*/
+   nir_alu_type type;
 } nir_search_variable;
 
 typedef struct {
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/7] nir/opt_algebraic: Add some constant bcsel reductions

2015-01-29 Thread Jason Ekstrand
total instructions in shared programs: 8026469 - 8026343 (-0.00%)
instructions in affected programs: 28541 - 28415 (-0.44%)
helped:101
HURT:  76
---
 src/glsl/nir/nir_opt_algebraic.py | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index e3b309c..4e1beb4 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -122,9 +122,35 @@ optimizations = [
(('ieq', 'a@bool', 0), ('inot', 'a')),
(('bcsel', 'a@bool', True, False), 'a'),
(('bcsel', 'a@bool', False, True), ('inot', 'a')),
+   (('bcsel', True, b, c), b),
+   (('bcsel', False, b, c), c),
+   # The result of this should be hit by constant propagation and, in the
+   # next round of opt_algebraic, get picked up by one of the above two.
+   (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
 
 # This one may not be exact
(('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
 ]
 
+# Add optimizations to handle the case where the result of a ternary is
+# compared to a constant.  This way we can take things like
+#
+# (a ? 0 : 1)  0
+#
+# and turn it into
+#
+# a ? (0  0) : (1  0)
+#
+# which constant folding will eat for lunch.  The resulting ternary will
+# further get cleaned up by the boolean reductions above ane we will be
+# left with just the original variable a.
+for op in ['flt', 'fge', 'feq', 'fne',
+   'ilt', 'ige', 'ieq', 'ine', 'ult', 'uge']:
+   optimizations += [
+  ((op, ('bcsel', 'a', '#b', '#c'), '#d'),
+   ('bcsel', 'a', (op, 'b', 'd'), (op, 'c', 'd'))),
+  ((op, '#d', ('bcsel', a, '#b', '#c')),
+   ('bcsel', 'a', (op, 'd', 'b'), (op, 'd', 'c'))),
+   ]
+
 print nir_algebraic.AlgebraicPass(nir_opt_algebraic, optimizations).render()
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/7] nir/opt_algebraic: Add some boolean simplifications

2015-01-29 Thread Jason Ekstrand
total instructions in shared programs: 8026509 - 8026469 (-0.00%)
instructions in affected programs: 6338 - 6298 (-0.63%)
helped:11
---
 src/glsl/nir/nir_opt_algebraic.py | 5 +
 1 file changed, 5 insertions(+)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index 1dea42a..e3b309c 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -117,6 +117,11 @@ optimizations = [
(('frcp', ('frcp', a)), a),
(('frcp', ('fsqrt', a)), ('frsq', a)),
(('frcp', ('frsq', a)), ('fsqrt', a)),
+   # Boolean simplifications
+   (('ine', 'a@bool', 0), 'a'),
+   (('ieq', 'a@bool', 0), ('inot', 'a')),
+   (('bcsel', 'a@bool', True, False), 'a'),
+   (('bcsel', 'a@bool', False, True), ('inot', 'a')),
 
 # This one may not be exact
(('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 5/7] nir/algebraic: Support specifying variable as constant or by type

2015-01-29 Thread Jason Ekstrand
---
 src/glsl/nir/nir_algebraic.py | 20 +---
 src/glsl/nir/nir_opt_algebraic.py | 12 +---
 2 files changed, 26 insertions(+), 6 deletions(-)

diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
index 75436f4..6e7973d 100644
--- a/src/glsl/nir/nir_algebraic.py
+++ b/src/glsl/nir/nir_algebraic.py
@@ -28,6 +28,7 @@ import itertools
 import struct
 import sys
 import mako.template
+import re
 
 # Represents a set of variables, each with a unique id
 class VarSet(object):
@@ -65,6 +66,8 @@ static const ${val.c_type} ${val.name} = {
{ ${hex(val)} /* ${val.value} */ },
 % elif isinstance(val, Variable):
${val.index}, /* ${val.var_name} */
+   ${'true' if val.is_constant else 'false'},
+   nir_type_${'void' if val.required_type is None else val.required_type},
 % elif isinstance(val, Expression):
nir_op_${val.opcode},
{ ${', '.join(src.c_ptr for src in val.sources)} },
@@ -111,12 +114,23 @@ class Constant(Value):
   else:
  assert False
 
+_var_name_re = re.compile(r(?Pconst#)?(?Pname\w+)(?:@(?Ptype\w+))?)
+
 class Variable(Value):
def __init__(self, val, name, varset):
   Value.__init__(self, name, variable)
-  self.var_name = val
-  self.index = varset[val]
-  self.name = name
+
+  m = _var_name_re.match(val)
+  assert m and m.group('name') is not None
+
+  self.var_name = m.group('name')
+  self.is_constant = m.group('const') is not None
+  self.required_type = m.group('type')
+
+  if self.required_type is not None:
+ assert self.required_type in ('float', 'bool', 'int', 'unsigned')
+
+  self.index = varset[self.var_name]
 
 class Expression(Value):
def __init__(self, expr, name_base, varset):
diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index 9c62b28..1dea42a 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -36,9 +36,15 @@ d = 'd'
 # and replace is either an expression or a value.  An expression is
 # defined as a tuple of the form (op, src0, src1, src2, src3)
 # where each source is either an expression or a value.  A value can be
-# either a numeric constant or a string representing a variable name.  For
-# constants, you have to be careful to make sure that it is the right type
-# because python is unaware of the source and destination types of the
+# either a numeric constant or a string representing a variable name.
+#
+# Variable names are specified as [#]name[@type] where # inicates that
+# the given variable will only match constants and thpe type indicates that
+# the given variable will only match values from ALU instructions with the
+# given output type.
+#
+# For constants, you have to be careful to make sure that it is the right
+# type because python is unaware of the source and destination types of the
 # opcodes.
 
 optimizations = [
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/7] nir: Add a void type

2015-01-29 Thread Jason Ekstrand
This allows us to indicate a concept of an invalid type.
---
 src/glsl/nir/nir.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 980fdd0..f2050de 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -576,6 +576,7 @@ void nir_alu_dest_copy(nir_alu_dest *dest, const 
nir_alu_dest *src,
void *mem_ctx);
 
 typedef enum {
+   nir_type_void = 0, /* Not a valid type */
nir_type_float,
nir_type_int,
nir_type_unsigned,
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/7] nir/search: Add support for matching unknown constants

2015-01-29 Thread Jason Ekstrand
There are some algebraic transformations that we want to do but only if
certain things are constants.  For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.

nir_algebraic.py doesn't handle this yet, but that's ok because the C
language will make sure that false is the default for now.
---
 src/glsl/nir/nir_search.c | 6 ++
 src/glsl/nir/nir_search.h | 7 +++
 2 files changed, 13 insertions(+)

diff --git a/src/glsl/nir/nir_search.c b/src/glsl/nir/nir_search.c
index 18e0330..ec89817 100644
--- a/src/glsl/nir/nir_search.c
+++ b/src/glsl/nir/nir_search.c
@@ -78,6 +78,10 @@ match_value(const nir_search_value *value, nir_alu_instr 
*instr, unsigned src,
 
  return true;
   } else {
+ if (var-is_constant 
+ instr-src[src].src.ssa-parent_instr-type != 
nir_instr_type_load_const)
+return false;
+
  state-variables_seen |= (1  var-variable);
  state-variables[var-variable].src = instr-src[src].src;
  state-variables[var-variable].abs = false;
@@ -236,6 +240,8 @@ construct_value(const nir_search_value *value, nir_alu_type 
type,
   nir_alu_src val;
   nir_alu_src_copy(val, state-variables[var-variable], mem_ctx);
 
+  assert(!var-is_constant);
+
   return val;
}
 
diff --git a/src/glsl/nir/nir_search.h b/src/glsl/nir/nir_search.h
index 8ec58b0..18aa28d 100644
--- a/src/glsl/nir/nir_search.h
+++ b/src/glsl/nir/nir_search.h
@@ -47,6 +47,13 @@ typedef struct {
 
/** The variable index;  Must be less than NIR_SEARCH_MAX_VARIABLES */
unsigned variable;
+
+   /** Indicates that the given variable must be a constant
+*
+* This is only alloed in search expressions and indicates that the
+* given variable is only allowed to match constant values.
+*/
+   bool is_constant;
 } nir_search_variable;
 
 typedef struct {
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/2] main: make the intel driver obey drirc's force_glsl_version

2015-01-29 Thread Martin Peres
Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/mesa/drivers/dri/i915/intel_screen.c | 1 +
 src/mesa/drivers/dri/i965/brw_context.c  | 3 +++
 src/mesa/drivers/dri/i965/intel_screen.c | 1 +
 3 files changed, 5 insertions(+)

diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
b/src/mesa/drivers/dri/i915/intel_screen.c
index 748eee7..59ebabd 100644
--- a/src/mesa/drivers/dri/i915/intel_screen.c
+++ b/src/mesa/drivers/dri/i915/intel_screen.c
@@ -71,6 +71,7 @@ DRI_CONF_BEGIN
   DRI_CONF_ALWAYS_FLUSH_BATCH(false)
   DRI_CONF_ALWAYS_FLUSH_CACHE(false)
   DRI_CONF_DISABLE_THROTTLING(false)
+  DRI_CONF_FORCE_GLSL_VERSION(0)
   DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN(false)
   DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS(false)
   DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED(false)
diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
b/src/mesa/drivers/dri/i965/brw_context.c
index 6fa54a3..09a5dc1 100644
--- a/src/mesa/drivers/dri/i965/brw_context.c
+++ b/src/mesa/drivers/dri/i965/brw_context.c
@@ -641,6 +641,9 @@ brw_process_driconf_options(struct brw_context *brw)
 
brw-precompile = driQueryOptionb(brw-optionCache, shader_precompile);
 
+   ctx-Const.ForceGLSLVersion =
+  driQueryOptioni(options, force_glsl_version);
+
ctx-Const.ForceGLSLExtensionsWarn =
   driQueryOptionb(options, force_glsl_extensions_warn);
 
diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 013222f..7bd8b44 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -77,6 +77,7 @@ DRI_CONF_BEGIN
   DRI_CONF_ALWAYS_FLUSH_BATCH(false)
   DRI_CONF_ALWAYS_FLUSH_CACHE(false)
   DRI_CONF_DISABLE_THROTTLING(false)
+  DRI_CONF_FORCE_GLSL_VERSION(0)
   DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN(false)
   DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS(false)
   DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED(false)
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] mesa: add a workaround for unigine Tropics

2015-01-29 Thread Martin Peres
While trying to understand a GLSL pass, curro and I tried running Unigine
Tropics and the GLSL compilers would not compile the shaders.

The reason is due to the fact that the shaders do not specify the needed GLSL
version but also use in the same shader keywords that could never co-exist
because one got deleted when the other one was added (for instance,
gl_TexCoord and gl_InstanceID). The current solution was to use the
force_glsl_extensions_warn workaround but it broke when GL_ARB_gpu_shader5
got introduced as this workaround also enabled this extension which reserved
the name sample which is then used by most of Unigine Tropics' shaders.

To fix this, the easiest solution seem to introduce a workaround to
disable GL_ARB_gpu_shader5 in the GLSL compiler. This is what this patch
does along with modifying drirc to enable this workaround on tropics.

This patch has been tested on Haswell. It should also work on Gallium-based
drivers but I did not test it.

I would like to thank curro for helping me understand the whole issue and
directed me to the right place in the code.

Signed-off-by: Martin Peres martin.pe...@linux.intel.com
---
 src/gallium/include/state_tracker/st_api.h  | 1 +
 src/gallium/state_trackers/dri/dri_screen.c | 3 +++
 src/glsl/glsl_parser_extras.cpp | 3 +++
 src/mesa/drivers/dri/common/drirc   | 1 +
 src/mesa/drivers/dri/common/xmlpool/t_options.h | 4 
 src/mesa/drivers/dri/i915/intel_screen.c| 1 +
 src/mesa/drivers/dri/i965/brw_context.c | 3 +++
 src/mesa/drivers/dri/i965/intel_screen.c| 1 +
 src/mesa/main/mtypes.h  | 7 +++
 src/mesa/state_tracker/st_extensions.c  | 3 +++
 10 files changed, 27 insertions(+)

diff --git a/src/gallium/include/state_tracker/st_api.h 
b/src/gallium/include/state_tracker/st_api.h
index 86fdc69..01bc98d 100644
--- a/src/gallium/include/state_tracker/st_api.h
+++ b/src/gallium/include/state_tracker/st_api.h
@@ -246,6 +246,7 @@ struct st_config_options
unsigned force_glsl_version;
boolean force_s3tc_enable;
boolean allow_glsl_extension_directive_midshader;
+   boolean disable_glsl_extension_gpu_shader5;
 };
 
 /**
diff --git a/src/gallium/state_trackers/dri/dri_screen.c 
b/src/gallium/state_trackers/dri/dri_screen.c
index 9cdebf8..091aecf 100644
--- a/src/gallium/state_trackers/dri/dri_screen.c
+++ b/src/gallium/state_trackers/dri/dri_screen.c
@@ -70,6 +70,7 @@ const __DRIconfigOptionsExtension gallium_config_options = {
  DRI_CONF_DISABLE_SHADER_BIT_ENCODING(false)
  DRI_CONF_FORCE_GLSL_VERSION(0)
  DRI_CONF_ALLOW_GLSL_EXTENSION_DIRECTIVE_MIDSHADER(false)
+ DRI_CONF_DISABLE_GLSL_EXTENSION_GPU_SHADER5(false)
   DRI_CONF_SECTION_END
 
   DRI_CONF_SECTION_MISCELLANEOUS
@@ -98,6 +99,8 @@ dri_fill_st_options(struct st_config_options *options,
   driQueryOptionb(optionCache, force_s3tc_enable);
options-allow_glsl_extension_directive_midshader =
   driQueryOptionb(optionCache, allow_glsl_extension_directive_midshader);
+   options-disable_glsl_extension_gpu_shader5 =
+  driQueryOptionb(optionCache, disable_glsl_extension_gpu_shader5);
 }
 
 static const __DRIconfig **
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index ccdf031..fff04b8 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -217,6 +217,9 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
   sizeof(this-atomic_counter_offsets));
this-allow_extension_directive_midshader =
   ctx-Const.AllowGLSLExtensionDirectiveMidShader;
+
+   if (ctx-Const.DisableGLSLExtensionGpuShader5)
+  this-ARB_gpu_shader5_enable = false;
 }
 
 /**
diff --git a/src/mesa/drivers/dri/common/drirc 
b/src/mesa/drivers/dri/common/drirc
index cecd6a9..3199cfc 100644
--- a/src/mesa/drivers/dri/common/drirc
+++ b/src/mesa/drivers/dri/common/drirc
@@ -41,6 +41,7 @@ TODO: document the other workarounds.
 
 application name=Unigine Tropics executable=Tropics
 option name=force_glsl_extensions_warn value=true /
+option name=disable_glsl_extension_gpu_shader5 value=true /
 option name=disable_blend_func_extended value=true /
/application
 
diff --git a/src/mesa/drivers/dri/common/xmlpool/t_options.h 
b/src/mesa/drivers/dri/common/xmlpool/t_options.h
index 4e5a721..d1869d6 100644
--- a/src/mesa/drivers/dri/common/xmlpool/t_options.h
+++ b/src/mesa/drivers/dri/common/xmlpool/t_options.h
@@ -110,6 +110,10 @@ 
DRI_CONF_OPT_BEGIN_B(allow_glsl_extension_directive_midshader, def) \
 DRI_CONF_DESC(en,gettext(Allow GLSL #extension directives in the 
middle of shaders)) \
 DRI_CONF_OPT_END
 
+#define DRI_CONF_DISABLE_GLSL_EXTENSION_GPU_SHADER5(def) \
+DRI_CONF_OPT_BEGIN_B(disable_glsl_extension_gpu_shader5, def) \
+DRI_CONF_DESC(en,gettext(Disable GLSL support for the 
GL_ARB_gpu_shader5)) \
+DRI_CONF_OPT_END
 
 
 

Re: [Mesa-dev] [PATCH 6/7] nir/opt_algebraic: Add some boolean simplifications

2015-01-29 Thread Jason Ekstrand
On Thu, Jan 29, 2015 at 12:50 PM, Jason Ekstrand ja...@jlekstrand.net
wrote:

 total instructions in shared programs: 8026509 - 8026469 (-0.00%)
 instructions in affected programs: 6338 - 6298 (-0.63%)
 helped:11


Ignore those numbers. they're bogus.  Here are the real ones:

total instructions in shared programs: 5998321 - 5998287 (-0.00%)
instructions in affected programs: 4520 - 4486 (-0.75%)
helped:8



 ---
  src/glsl/nir/nir_opt_algebraic.py | 5 +
  1 file changed, 5 insertions(+)

 diff --git a/src/glsl/nir/nir_opt_algebraic.py
 b/src/glsl/nir/nir_opt_algebraic.py
 index 1dea42a..e3b309c 100644
 --- a/src/glsl/nir/nir_opt_algebraic.py
 +++ b/src/glsl/nir/nir_opt_algebraic.py
 @@ -117,6 +117,11 @@ optimizations = [
 (('frcp', ('frcp', a)), a),
 (('frcp', ('fsqrt', a)), ('frsq', a)),
 (('frcp', ('frsq', a)), ('fsqrt', a)),
 +   # Boolean simplifications
 +   (('ine', 'a@bool', 0), 'a'),
 +   (('ieq', 'a@bool', 0), ('inot', 'a')),
 +   (('bcsel', 'a@bool', True, False), 'a'),
 +   (('bcsel', 'a@bool', False, True), ('inot', 'a')),

  # This one may not be exact
 (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
 --
 2.2.2


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] mesa: add a workaround for unigine Tropics

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 11:14:26 PM Martin Peres wrote:
 While trying to understand a GLSL pass, curro and I tried running Unigine
 Tropics and the GLSL compilers would not compile the shaders.
 
 The reason is due to the fact that the shaders do not specify the needed GLSL
 version but also use in the same shader keywords that could never co-exist
 because one got deleted when the other one was added (for instance,
 gl_TexCoord and gl_InstanceID). The current solution was to use the
 force_glsl_extensions_warn workaround but it broke when GL_ARB_gpu_shader5
 got introduced as this workaround also enabled this extension which reserved
 the name sample which is then used by most of Unigine Tropics' shaders.
 
 To fix this, the easiest solution seem to introduce a workaround to
 disable GL_ARB_gpu_shader5 in the GLSL compiler. This is what this patch
 does along with modifying drirc to enable this workaround on tropics.
 
 This patch has been tested on Haswell. It should also work on Gallium-based
 drivers but I did not test it.
 
 I would like to thank curro for helping me understand the whole issue and
 directed me to the right place in the code.
 
 Signed-off-by: Martin Peres martin.pe...@linux.intel.com

Hi Martin!

Thanks for fixing this.  One small comment...

[snip]
 diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
 b/src/mesa/drivers/dri/i915/intel_screen.c
 index 00d8580..748eee7 100644
 --- a/src/mesa/drivers/dri/i915/intel_screen.c
 +++ b/src/mesa/drivers/dri/i915/intel_screen.c
 @@ -74,6 +74,7 @@ DRI_CONF_BEGIN
DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN(false)
DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS(false)
DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED(false)
 +  DRI_CONF_DISABLE_GLSL_EXTENSION_GPU_SHADER5(false)
  
DRI_CONF_OPT_BEGIN_B(shader_precompile, true)
DRI_CONF_DESC(en, Perform code generation at shader link time.)

You can probably drop the i915 change - it doesn't support ARB_gpu_shader5
anyway.  With that removed, you can add:

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82897
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] main: make the intel driver obey drirc's force_glsl_version

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 11:14:27 PM Martin Peres wrote:
 Signed-off-by: Martin Peres martin.pe...@linux.intel.com
 ---
  src/mesa/drivers/dri/i915/intel_screen.c | 1 +
  src/mesa/drivers/dri/i965/brw_context.c  | 3 +++
  src/mesa/drivers/dri/i965/intel_screen.c | 1 +
  3 files changed, 5 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i915/intel_screen.c 
 b/src/mesa/drivers/dri/i915/intel_screen.c
 index 748eee7..59ebabd 100644
 --- a/src/mesa/drivers/dri/i915/intel_screen.c
 +++ b/src/mesa/drivers/dri/i915/intel_screen.c
 @@ -71,6 +71,7 @@ DRI_CONF_BEGIN
DRI_CONF_ALWAYS_FLUSH_BATCH(false)
DRI_CONF_ALWAYS_FLUSH_CACHE(false)
DRI_CONF_DISABLE_THROTTLING(false)
 +  DRI_CONF_FORCE_GLSL_VERSION(0)
DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN(false)
DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS(false)
DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED(false)
 diff --git a/src/mesa/drivers/dri/i965/brw_context.c 
 b/src/mesa/drivers/dri/i965/brw_context.c
 index 6fa54a3..09a5dc1 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.c
 +++ b/src/mesa/drivers/dri/i965/brw_context.c
 @@ -641,6 +641,9 @@ brw_process_driconf_options(struct brw_context *brw)
  
 brw-precompile = driQueryOptionb(brw-optionCache, shader_precompile);
  
 +   ctx-Const.ForceGLSLVersion =
 +  driQueryOptioni(options, force_glsl_version);
 +
 ctx-Const.ForceGLSLExtensionsWarn =
driQueryOptionb(options, force_glsl_extensions_warn);
  
 diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
 b/src/mesa/drivers/dri/i965/intel_screen.c
 index 013222f..7bd8b44 100644
 --- a/src/mesa/drivers/dri/i965/intel_screen.c
 +++ b/src/mesa/drivers/dri/i965/intel_screen.c
 @@ -77,6 +77,7 @@ DRI_CONF_BEGIN
DRI_CONF_ALWAYS_FLUSH_BATCH(false)
DRI_CONF_ALWAYS_FLUSH_CACHE(false)
DRI_CONF_DISABLE_THROTTLING(false)
 +  DRI_CONF_FORCE_GLSL_VERSION(0)
DRI_CONF_FORCE_GLSL_EXTENSIONS_WARN(false)
DRI_CONF_DISABLE_GLSL_LINE_CONTINUATIONS(false)
DRI_CONF_DISABLE_BLEND_FUNC_EXTENDED(false)

I've been meaning to do this for a while :) Thanks!

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 7/7] nir/opt_algebraic: Add some constant bcsel reductions

2015-01-29 Thread Jason Ekstrand
On Thu, Jan 29, 2015 at 12:50 PM, Jason Ekstrand ja...@jlekstrand.net
wrote:

 total instructions in shared programs: 8026469 - 8026343 (-0.00%)
 instructions in affected programs: 28541 - 28415 (-0.44%)
 helped:101
 HURT:  76


These numbers are bogus too.  The real ones look good for NIR but not for
FS.  I'm looking into that.
--Jason


 ---
  src/glsl/nir/nir_opt_algebraic.py | 26 ++
  1 file changed, 26 insertions(+)

 diff --git a/src/glsl/nir/nir_opt_algebraic.py
 b/src/glsl/nir/nir_opt_algebraic.py
 index e3b309c..4e1beb4 100644
 --- a/src/glsl/nir/nir_opt_algebraic.py
 +++ b/src/glsl/nir/nir_opt_algebraic.py
 @@ -122,9 +122,35 @@ optimizations = [
 (('ieq', 'a@bool', 0), ('inot', 'a')),
 (('bcsel', 'a@bool', True, False), 'a'),
 (('bcsel', 'a@bool', False, True), ('inot', 'a')),
 +   (('bcsel', True, b, c), b),
 +   (('bcsel', False, b, c), c),
 +   # The result of this should be hit by constant propagation and, in the
 +   # next round of opt_algebraic, get picked up by one of the above two.
 +   (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),

  # This one may not be exact
 (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
  ]

 +# Add optimizations to handle the case where the result of a ternary is
 +# compared to a constant.  This way we can take things like
 +#
 +# (a ? 0 : 1)  0
 +#
 +# and turn it into
 +#
 +# a ? (0  0) : (1  0)
 +#
 +# which constant folding will eat for lunch.  The resulting ternary will
 +# further get cleaned up by the boolean reductions above ane we will be
 +# left with just the original variable a.
 +for op in ['flt', 'fge', 'feq', 'fne',
 +   'ilt', 'ige', 'ieq', 'ine', 'ult', 'uge']:
 +   optimizations += [
 +  ((op, ('bcsel', 'a', '#b', '#c'), '#d'),
 +   ('bcsel', 'a', (op, 'b', 'd'), (op, 'c', 'd'))),
 +  ((op, '#d', ('bcsel', a, '#b', '#c')),
 +   ('bcsel', 'a', (op, 'd', 'b'), (op, 'd', 'c'))),
 +   ]
 +
  print nir_algebraic.AlgebraicPass(nir_opt_algebraic,
 optimizations).render()
 --
 2.2.2


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: Add new fast mtx_t mutex type for basic use cases

2015-01-29 Thread Emil Velikov
On 29/01/15 17:14, Kristian Høgsberg wrote:
 On Thu, Jan 29, 2015 at 6:36 AM, Emil Velikov emil.l.veli...@gmail.com 
 wrote:
 On 28/01/15 05:08, Kristian Høgsberg wrote:
 While modern pthread mutexes are very fast, they still incur a call to an
 external DSO and overhead of the generality and features of pthread mutexes.
 Most mutexes in mesa only needs lock/unlock, and the idea here is that we 
 can
 inline the atomic operation and make the fast case just two intructions.
 Mutexes are subtle and finicky to implement, so we carefully copy the
 implementation from Ulrich Dreppers well-written and well-reviewed paper:

   Futexes Are Tricky
   http://www.akkadia.org/drepper/futex.pdf

 We implement mutex3, which gives us a mutex that has no syscalls on
 uncontended lock or unlock.  Further, the uncontended case boils down to a
 cmpxchg and an untaken branch and the uncontended unlock is just a locked 
 decr
 and an untaken branch.  We use __builtin_expect() to indicate that 
 contention
 is unlikely so that gcc will put the contention code out of the main code
 flow.

 A fast mutex only supports lock/unlock, can't be recursive or used with
 condition variables.  We keep the pthread mutex implementation around as
 full_mtx_t for the few places where we use condition variables or recursive
 locking.  For platforms or compilers where futex and atomics aren't 
 available,
 mtx_t falls back to the pthread mutex.

 The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
 applications.  Most CPU bound cases are helped and some of our internal
 bind_buffer_object heavy benchmarks gain up to 10%.

 Hi Kristian,

 Can I humbly ask that you split this into two patches - one that
 introduces the new functions/struct and another one that uses them ?
 This way it'll be easier if/when things go crazy.

 Also the patch seems to wonder between posix and win32
 + typedef full_mtx_t mtx_t;

 and
 + typedef mtx_t fast_mtx_t;

 Looks like a left over from the should I rename XX variables to fast*
 or just one to full* moment :)
 
 Yeah, that's how it progressed :)  At first I called it fast_mtx_t and
 planned on replacing simple uses of mtx_t one by one. Jordan suggested
 that it'd be easier to make the regular mutex fast and then rename the
 couple of places that use more feature than we provide.  I just
 botched the windows side when I did that.  However, given that the
 patch now reimplements mtx_t, there's really no good way to split up
 introducing the fast mutex and making use of it in multiple patches.
 I guess we can add full_mtx_t in a separate patch, use it in a second
 patch and then finally reimplement mtx_t as a fast mutex.  How does
 that sound?
 
That sounds great. Thank you.

-Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 7/7] nir/opt_algebraic: Add some constant bcsel reductions

2015-01-29 Thread Jason Ekstrand
total instructions in shared programs: 5998190 - 5997603 (-0.01%)
instructions in affected programs: 54276 - 53689 (-1.08%)
helped:293
---
 src/glsl/nir/nir_opt_algebraic.py | 30 --
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/src/glsl/nir/nir_opt_algebraic.py 
b/src/glsl/nir/nir_opt_algebraic.py
index e3b309c..3223119 100644
--- a/src/glsl/nir/nir_opt_algebraic.py
+++ b/src/glsl/nir/nir_opt_algebraic.py
@@ -120,11 +120,37 @@ optimizations = [
# Boolean simplifications
(('ine', 'a@bool', 0), 'a'),
(('ieq', 'a@bool', 0), ('inot', 'a')),
-   (('bcsel', 'a@bool', True, False), 'a'),
-   (('bcsel', 'a@bool', False, True), ('inot', 'a')),
+   (('bcsel', a, True, False), ('ine', a, 0)),
+   (('bcsel', a, False, True), ('ieq', a, 0)),
+   (('bcsel', True, b, c), b),
+   (('bcsel', False, b, c), c),
+   # The result of this should be hit by constant propagation and, in the
+   # next round of opt_algebraic, get picked up by one of the above two.
+   (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
 
 # This one may not be exact
(('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
 ]
 
+# Add optimizations to handle the case where the result of a ternary is
+# compared to a constant.  This way we can take things like
+#
+# (a ? 0 : 1)  0
+#
+# and turn it into
+#
+# a ? (0  0) : (1  0)
+#
+# which constant folding will eat for lunch.  The resulting ternary will
+# further get cleaned up by the boolean reductions above ane we will be
+# left with just the original variable a.
+for op in ['flt', 'fge', 'feq', 'fne',
+   'ilt', 'ige', 'ieq', 'ine', 'ult', 'uge']:
+   optimizations += [
+  ((op, ('bcsel', 'a', '#b', '#c'), '#d'),
+   ('bcsel', 'a', (op, 'b', 'd'), (op, 'c', 'd'))),
+  ((op, '#d', ('bcsel', a, '#b', '#c')),
+   ('bcsel', 'a', (op, 'd', 'b'), (op, 'd', 'c'))),
+   ]
+
 print nir_algebraic.AlgebraicPass(nir_opt_algebraic, optimizations).render()
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/skl: Force a BINDING_TABLE_POINTER_* command after push constant command

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 05:21:03 PM Neil Roberts wrote:
 According to the SkyLake bspec the 3DSTATE_CONSTANT_* commands only take
 effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_* command. This
 patch just makes it set the BRW_NEW_SURFACES state when uploading the push
 constants to ensure the binding tables will be updated.
 
 This fixes the fbo-blending-formats Piglit test and possibly others.
 
 Reviewed-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/gen7_vs_state.c | 7 +++
  1 file changed, 7 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/gen7_vs_state.c 
 b/src/mesa/drivers/dri/i965/gen7_vs_state.c
 index 404dd20..c55d6ca 100644
 --- a/src/mesa/drivers/dri/i965/gen7_vs_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_vs_state.c
 @@ -60,6 +60,13 @@ gen7_upload_constant_state(struct brw_context *brw,
 }
  
 ADVANCE_BATCH();
 +
 +  /* On SkyLake+ the new constants don't take effect until the next
 +   * corresponding 3DSTATE_BINDING_TABLE_POINTER_* command is parsed so we
 +   * need to ensure that is sent
 +   */
 +   if (brw-gen = 9)
 +  brw-state.dirty.brw |= BRW_NEW_SURFACES;
  }

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-29 Thread Eric Anholt
Roland Scheidegger srol...@vmware.com writes:

 Hi,

 the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
 that indirections can only be done through the address reg and has quite
 some extra code to deal with this. Even though hardware and apis which
 worked like that are definitely old by now.
 Thus, I'm proposing the address reg be nuked. I am however not quite
 sure what the implications for drivers are, other than I'm certain
 llvmpipe can handle that already.
 For that reason, I suspect at least initially a new cap bit would be
 required so glsl-to-tgsi would skip the extra code. I tend to think
 longer term it would be great if it could be nuked completely, I am
 however not sure if that is easily done with drivers for old hw (such as
 r300) - I guess if necessary we could keep operations such as ARL (or
 even ARR though clearly not UARL!) and just define them to be usable
 with temp regs.

Yes, please, let's kill ARL.


signature.asc
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] nir: Add variants of some of the comparison simplifications.

2015-01-29 Thread Jason Ekstrand
On Wed, Jan 28, 2015 at 11:20 PM, Eric Anholt e...@anholt.net wrote:

 Jason Ekstrand ja...@jlekstrand.net writes:

  I've got a patch I cooked up today that generalizes this a bit.  I
 haven't
  sent it yet because it lacks shader-db statistics.  That said, I'm fine
  with merging this in the mean time.

 Not sure what generalization you're working on, but something that let
 me have any sort of bool as an expression node, and that and(flt(a,
 b), flt(c, d)) was recognized as a bool would be nice for some of these
 patterns.  I was wondering if band/bor operations that were only defined
 on bool arguments would be a good idea (even though they'd codegen just
 like normal iand/ior), but then I got to thinking that this sounds like
 just a specific case of generally wanting possible-value categorization
 for SSA nodes.


Yeah, that's what it does.  It's an extension to the current system to
allow you to specify that a variable must match a certain type.  I just
sent the series.
--Jason
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC] gallium/docs: add some freedreno compiler docs

2015-01-29 Thread Rob Clark
From: Rob Clark robcl...@freedesktop.org

Enable the 'sphinx.ext.graphviz' extension, and add in a section for
driver specific docs, with freedreno compiler docs beneath.  The
goal is for more complete compiler docs, and hopefully some docs about
other parts of the driver (such as how tiling works, etc).

Note that there is also a Distribution - Drivers section.  Although
that appears to be simply just a list of drivers.  Not sure if that
should move under the 'Drivers' section or left alone.  I did add a
one-line section for freedreno in the existing Distribution - Drivers
section.

Also, I'm not sure if there is a good way to link back to .rst source
files that live under the src/gallium/drivers/foo directory, or if
that is even desired.

At least adding this under the existing gallium docs, versus my initial
approach which setup a new sphinx build docs tree under the freedreno
driver directory, results in a much smaller patch due to not having an
extra Makefile, conf.py, etc.
---
 src/gallium/docs/source/conf.py|   2 +-
 src/gallium/docs/source/distro.rst |   5 +
 src/gallium/docs/source/drivers.rst|   9 +
 src/gallium/docs/source/drivers/freedreno.rst  |   9 +
 .../docs/source/drivers/freedreno/ir3-notes.rst| 386 +
 src/gallium/docs/source/index.rst  |   1 +
 6 files changed, 411 insertions(+), 1 deletion(-)
 create mode 100644 src/gallium/docs/source/drivers.rst
 create mode 100644 src/gallium/docs/source/drivers/freedreno.rst
 create mode 100644 src/gallium/docs/source/drivers/freedreno/ir3-notes.rst

diff --git a/src/gallium/docs/source/conf.py b/src/gallium/docs/source/conf.py
index 1288666..5e8173d 100644
--- a/src/gallium/docs/source/conf.py
+++ b/src/gallium/docs/source/conf.py
@@ -22,7 +22,7 @@ sys.path.append(os.path.abspath('exts'))
 
 # Add any Sphinx extension module names here, as strings. They can be 
extensions
 # coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.pngmath', 'formatting']
+extensions = ['sphinx.ext.pngmath', 'sphinx.ext.graphviz', 'formatting']
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ['_templates']
diff --git a/src/gallium/docs/source/distro.rst 
b/src/gallium/docs/source/distro.rst
index d69c186..a841b34 100644
--- a/src/gallium/docs/source/distro.rst
+++ b/src/gallium/docs/source/distro.rst
@@ -60,6 +60,11 @@ AMD radeonsi
 
 Driver for the AMD Southern Islands family of GPUs.
 
+freedreno
+^
+
+Driver for Qualcomm Adreno a2xx, a3xx, and a4xx series of GPUs.
+
 .. _softpipe:
 
 Softpipe
diff --git a/src/gallium/docs/source/drivers.rst 
b/src/gallium/docs/source/drivers.rst
new file mode 100644
index 000..469197c
--- /dev/null
+++ b/src/gallium/docs/source/drivers.rst
@@ -0,0 +1,9 @@
+Drivers
+===
+
+Driver specific documentation.
+
+.. toctree::
+   :glob:
+
+   drivers/*
diff --git a/src/gallium/docs/source/drivers/freedreno.rst 
b/src/gallium/docs/source/drivers/freedreno.rst
new file mode 100644
index 000..723ffdd
--- /dev/null
+++ b/src/gallium/docs/source/drivers/freedreno.rst
@@ -0,0 +1,9 @@
+Freedreno
+=
+
+Freedreno driver specific docs.
+
+.. toctree::
+   :glob:
+
+   freedreno/*
diff --git a/src/gallium/docs/source/drivers/freedreno/ir3-notes.rst 
b/src/gallium/docs/source/drivers/freedreno/ir3-notes.rst
new file mode 100644
index 000..dd28700
--- /dev/null
+++ b/src/gallium/docs/source/drivers/freedreno/ir3-notes.rst
@@ -0,0 +1,386 @@
+IR3 NOTES
+=
+
+Some notes about ir3, the compiler and machine-specific IR for the shader ISA 
introduced with adreno a3xx.  The same shader ISA is present, with some small 
differences, with adreno a4xx.
+
+Compared to the previous generation a2xx ISA (ir2), the a3xx ISA is a simple 
scalar instruction set.  However, the compiler is responsible, in most cases, 
to schedule the instructions.  The hardware does not try to hide the shader 
core pipeline stages.  For a common example, a common (cat2) ALU instruction 
takes four cycles, so a subsequent cat2 instruction which uses the result must 
have three intervening instructions (or nops).  When operating on vec4's, 
typically the corresponding scalar instructions for operating on the remaining 
three components could typically fit.  Although that results in a lot of edge 
cases where things fall over, like:
+
+::
+
+  ADD TEMP[0], TEMP[1], TEMP[2]
+  MUL TEMP[0], TEMP[1], TEMP[0].wzyx
+
+Here, the second instruction needs the output of the first group of scalar 
instructions in the wrong order, resulting in not enough instruction spots 
between the ``add r0.w, r1.w, r2.w`` and ``mul r0.x, r1.x, r0.w``.  Which is 
why the original/old compiler which merely translated nearly literally from 
TGSI to ir3, had a strong tendency to fall over.
+
+So the current compiler instead, in the frontend, generates a 
directed-acyclic-graph of instructions and basic 

[Mesa-dev] [PATCH v5 2/3] i965/fs: Use NIR's scalarizing abilities and stop handling vectors

2015-01-29 Thread Jason Ekstrand
Now that we can scalarize with NIR, there's no need for all this code
anymore.  Let's get rid of it and just do scalar operations.

v2: run copy prop before lowering phi nodes

v3: Get rid of the emit(...)-saturate = foo pattern

total instructions in shared programs: 5998321 - 5974070 (-0.40%)
instructions in affected programs: 732075 - 707824 (-3.31%)
helped:3137
HURT:  191
GAINED:18
LOST:  0
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  15 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 476 ++-
 2 files changed, 153 insertions(+), 338 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 84e0b9e..25197cd 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -592,21 +592,6 @@ public:
fs_reg get_nir_alu_src(nir_alu_instr *instr, unsigned src);
fs_reg get_nir_dest(nir_dest dest);
void emit_percomp(fs_inst *inst, unsigned wr_mask);
-   void emit_percomp(enum opcode op, fs_reg dest, fs_reg src0,
- unsigned wr_mask, bool saturate = false,
- enum brw_predicate predicate = BRW_PREDICATE_NONE,
- enum brw_conditional_mod mod = BRW_CONDITIONAL_NONE);
-   void emit_percomp(enum opcode op, fs_reg dest, fs_reg src0, fs_reg src1,
- unsigned wr_mask, bool saturate = false,
- enum brw_predicate predicate = BRW_PREDICATE_NONE,
- enum brw_conditional_mod mod = BRW_CONDITIONAL_NONE);
-   void emit_math_percomp(enum opcode op, fs_reg dest, fs_reg src0,
-  unsigned wr_mask, bool saturate = false);
-   void emit_math_percomp(enum opcode op, fs_reg dest, fs_reg src0,
-  fs_reg src1, unsigned wr_mask,
-  bool saturate = false);
-   void emit_reduction(enum opcode op, fs_reg dest, fs_reg src,
-   unsigned num_components);
 
int setup_color_payload(fs_reg *dst, fs_reg color, unsigned components);
void emit_alpha_test();
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index de0d780..37ccd24 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -36,6 +36,10 @@ nir_optimize(nir_shader *nir)
   nir_validate_shader(nir);
   progress |= nir_copy_prop(nir);
   nir_validate_shader(nir);
+  nir_lower_phis_to_scalar(nir);
+  nir_validate_shader(nir);
+  progress |= nir_copy_prop(nir);
+  nir_validate_shader(nir);
   progress |= nir_opt_dce(nir);
   nir_validate_shader(nir);
   progress |= nir_opt_cse(nir);
@@ -85,6 +89,9 @@ fs_visitor::emit_nir_code()
nir_split_var_copies(nir);
nir_validate_shader(nir);
 
+   nir_lower_alu_to_scalar(nir);
+   nir_validate_shader(nir);
+
nir_optimize(nir);
 
/* Lower a bunch of stuff */
@@ -532,6 +539,7 @@ void
 fs_visitor::nir_emit_alu(nir_alu_instr *instr)
 {
struct brw_wm_prog_key *fs_key = (struct brw_wm_prog_key *) this-key;
+   fs_inst *inst;
 
fs_reg op[3];
fs_reg result = get_nir_dest(instr-dest.dest);
@@ -540,20 +548,30 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
for (unsigned i = 0; i  nir_op_infos[instr-op].num_inputs; i++)
   op[i] = get_nir_alu_src(instr, i);
 
+   if (nir_op_infos[instr-op].output_size == 0) {
+  /* We've already scalarized, so we know that we only have one
+   * channel.  The only question is which channel.
+   */
+  assert(_mesa_bitcount(instr-dest.write_mask) == 1);
+  unsigned off = ffs(instr-dest.write_mask) - 1;
+  result = offset(result, off);
+
+  for (unsigned i = 0; i  nir_op_infos[instr-op].num_inputs; i++)
+ op[i] = offset(op[i], off);
+   }
+
switch (instr-op) {
case nir_op_fmov:
case nir_op_i2f:
-   case nir_op_u2f: {
-  fs_inst *inst = MOV(result, op[0]);
+   case nir_op_u2f:
+  inst = emit(MOV(result, op[0]));
   inst-saturate = instr-dest.saturate;
-  emit_percomp(inst, instr-dest.write_mask);
-   }
   break;
 
case nir_op_imov:
case nir_op_f2i:
case nir_op_f2u:
-  emit_percomp(MOV(result, op[0]), instr-dest.write_mask);
+  emit(MOV(result, op[0]));
   break;
 
case nir_op_fsign: {
@@ -562,55 +580,46 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
  * Predicated OR ORs 1.0 (0x3f80) with the sign bit if val is not
  * zero.
  */
-  emit_percomp(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_NZ),
-   instr-dest.write_mask);
+  emit(CMP(reg_null_f, op[0], fs_reg(0.0f), BRW_CONDITIONAL_NZ));
 
   fs_reg result_int = retype(result, BRW_REGISTER_TYPE_UD);
   op[0].type = BRW_REGISTER_TYPE_UD;
   result.type = BRW_REGISTER_TYPE_UD;
-  

[Mesa-dev] [PATCH v5 1/3] nir: Add a pass to lower vector phi nodes to scalar phi nodes

2015-01-29 Thread Jason Ekstrand
v2 Jason Ekstrand jason.ekstr...@intel.com:
 - Add better comments
 - Use nir_ssa_dest_init and nir_src_for_ssa more places
 - Fix some void * casts

v3 Jason Ekstrand jason.ekstr...@intel.com:
 - Rework the way we determine whether or not to sccalarize a phi node to
   make the recursion non-bogus
 - Treat load_const instructions as scalarizable

v4 Jason Ekstrand jason.ekstr...@intel.com:
 - Allow uniform and input loads to be scalarizable
---
 src/glsl/Makefile.sources   |   1 +
 src/glsl/nir/nir.h  |   2 +
 src/glsl/nir/nir_lower_phis_to_scalar.c | 278 
 3 files changed, 281 insertions(+)
 create mode 100644 src/glsl/nir/nir_lower_phis_to_scalar.c

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index face22e..bf6b70b 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -31,6 +31,7 @@ NIR_FILES = \
nir/nir_lower_global_vars_to_local.c \
nir/nir_lower_locals_to_regs.c \
nir/nir_lower_io.c \
+   nir/nir_lower_phis_to_scalar.c \
nir/nir_lower_samplers.cpp \
nir/nir_lower_system_values.c \
nir/nir_lower_to_source_mods.c \
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 980fdd0..4f58eee 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -1526,6 +1526,8 @@ void nir_remove_dead_variables(nir_shader *shader);
 void nir_lower_vec_to_movs(nir_shader *shader);
 void nir_lower_alu_to_scalar(nir_shader *shader);
 
+void nir_lower_phis_to_scalar(nir_shader *shader);
+
 void nir_lower_samplers(nir_shader *shader,
 struct gl_shader_program *shader_program,
 struct gl_program *prog);
diff --git a/src/glsl/nir/nir_lower_phis_to_scalar.c 
b/src/glsl/nir/nir_lower_phis_to_scalar.c
new file mode 100644
index 000..bf65f5a
--- /dev/null
+++ b/src/glsl/nir/nir_lower_phis_to_scalar.c
@@ -0,0 +1,278 @@
+/*
+ * Copyright © 2015 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *Jason Ekstrand (ja...@jlekstrand.net)
+ *
+ */
+
+#include nir.h
+
+/*
+ * Implements a pass that lowers vector phi nodes to scalar phi nodes when
+ * we don't think it will hurt anything.
+ */
+
+struct lower_phis_to_scalar_state {
+   void *mem_ctx;
+   void *dead_ctx;
+
+   /* Hash table marking which phi nodes are scalarizable.  The key is
+* pointers to phi instructions and the entry is either NULL for not
+* scalarizable or non-null for scalarizable.
+*/
+   struct hash_table *phi_table;
+};
+
+static bool
+should_lower_phi(nir_phi_instr *phi, struct lower_phis_to_scalar_state *state);
+
+static bool
+is_phi_src_scalarizable(nir_phi_src *src,
+struct lower_phis_to_scalar_state *state)
+{
+   /* Don't know what to do with non-ssa sources */
+   if (!src-src.is_ssa)
+  return false;
+
+   nir_instr *src_instr = src-src.ssa-parent_instr;
+   switch (src_instr-type) {
+   case nir_instr_type_alu: {
+  nir_alu_instr *src_alu = nir_instr_as_alu(src_instr);
+
+  /* ALU operations with output_size == 0 should be scalarized.  We
+   * will also see a bunch of vecN operations from scalarizing ALU
+   * operations and, since they can easily be copy-propagated, they
+   * are ok too.
+   */
+  return nir_op_infos[src_alu-op].output_size == 0 ||
+ src_alu-op != nir_op_vec2 ||
+ src_alu-op != nir_op_vec3 ||
+ src_alu-op != nir_op_vec4;
+   }
+
+   case nir_instr_type_phi:
+  /* A phi is scalarizable if we're going to lower it */
+  return should_lower_phi(nir_instr_as_phi(src_instr), state);
+
+   case nir_instr_type_load_const:
+  /* These are trivially scalarizable */
+  return true;
+
+   case nir_instr_type_intrinsic: {
+  nir_intrinsic_instr *src_intrin = nir_instr_as_intrinsic(src_instr);
+
+  switch 

[Mesa-dev] [PATCH v5 3/3] i965/fs_nir: Get rid of get_alu_src

2015-01-29 Thread Jason Ekstrand
Originally, get_alu_src was supposed to handle resolving swizzles and
things like that.  However, now that basically every instruction we have
only takes scalar sources, we don't really need it anymore.  The only case
where it's still marginally useful is for the mov and vecN operations that
are left over from SSA form.  We can handle those cases as a special case
easily enough.  As a side-effect, we don't need the vec_to_movs pass
anymore.
---
 src/mesa/drivers/dri/i965/brw_fs.h   |   1 -
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 153 +++
 2 files changed, 95 insertions(+), 59 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h 
b/src/mesa/drivers/dri/i965/brw_fs.h
index 25197cd..b95e2c0 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -589,7 +589,6 @@ public:
void nir_emit_texture(nir_tex_instr *instr);
void nir_emit_jump(nir_jump_instr *instr);
fs_reg get_nir_src(nir_src src);
-   fs_reg get_nir_alu_src(nir_alu_instr *instr, unsigned src);
fs_reg get_nir_dest(nir_dest dest);
void emit_percomp(fs_inst *inst, unsigned wr_mask);
 
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 37ccd24..fbb1622 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -140,8 +140,6 @@ fs_visitor::emit_nir_code()
 
nir_convert_from_ssa(nir);
nir_validate_shader(nir);
-   nir_lower_vec_to_movs(nir);
-   nir_validate_shader(nir);
 
/* emit the arrays used for inputs and outputs - load/store intrinsics will
 * be converted to reads/writes of these arrays
@@ -419,6 +417,7 @@ fs_visitor::nir_emit_impl(nir_function_impl *impl)
 void
 fs_visitor::nir_emit_cf_list(exec_list *list)
 {
+   exec_list_validate(list);
foreach_list_typed(nir_cf_node, node, node, list) {
   switch (node-type) {
   case nir_cf_node_if:
@@ -541,34 +540,117 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
struct brw_wm_prog_key *fs_key = (struct brw_wm_prog_key *) this-key;
fs_inst *inst;
 
-   fs_reg op[3];
fs_reg result = get_nir_dest(instr-dest.dest);
result.type = brw_type_for_nir_type(nir_op_infos[instr-op].output_type);
 
-   for (unsigned i = 0; i  nir_op_infos[instr-op].num_inputs; i++)
-  op[i] = get_nir_alu_src(instr, i);
+   fs_reg op[4];
+   for (unsigned i = 0; i  nir_op_infos[instr-op].num_inputs; i++) {
+  op[i] = get_nir_src(instr-src[i].src);
+  op[i].type = 
brw_type_for_nir_type(nir_op_infos[instr-op].input_types[i]);
+  op[i].abs = instr-src[i].abs;
+  op[i].negate = instr-src[i].negate;
+   }
+
+   /* We get a bunch of mov's out of the from_ssa pass and they may still
+* be vectorized.  We'll handle them as a special-case.  We'll also
+* handle vecN here because it's basically the same thing.
+*/
+   bool need_extra_copy = false;
+   switch (instr-op) {
+   case nir_op_vec4:
+  if (!instr-src[3].src.is_ssa 
+  instr-dest.dest.reg.reg == instr-src[3].src.reg.reg)
+ need_extra_copy = true;
+  /* fall through */
+   case nir_op_vec3:
+  if (!instr-src[2].src.is_ssa 
+  instr-dest.dest.reg.reg == instr-src[2].src.reg.reg)
+ need_extra_copy = true;
+  /* fall through */
+   case nir_op_vec2:
+  if (!instr-src[1].src.is_ssa 
+  instr-dest.dest.reg.reg == instr-src[1].src.reg.reg)
+ need_extra_copy = true;
+  /* fall through */
+   case nir_op_imov:
+   case nir_op_fmov: {
+  if (!instr-src[0].src.is_ssa 
+  instr-dest.dest.reg.reg == instr-src[0].src.reg.reg)
+ need_extra_copy = true;
+
+  fs_reg temp;
+  if (need_extra_copy) {
+ temp = retype(vgrf(4), result.type);
+  } else {
+ temp = result;
+  }
+
+  if (instr-op == nir_op_imov || instr-op == nir_op_fmov) {
+ for (unsigned i = 0; i  4; i++) {
+if (!(instr-dest.write_mask  (1  i)))
+   continue;
+
+inst = emit(MOV(offset(temp, i),
+offset(op[0], instr-src[0].swizzle[i])));
+inst-saturate = instr-dest.saturate;
+ }
+  } else {
+ for (unsigned i = 0; i  4; i++) {
+if (!(instr-dest.write_mask  (1  i)))
+   continue;
+
+inst = emit(MOV(offset(temp, i),
+offset(op[i], instr-src[i].swizzle[0])));
+inst-saturate = instr-dest.saturate;
+ }
+  }
+
+  /* In this case the source and destination registers were the same,
+   * so we need to insert an extra set of moves in order to deal with
+   * any swizzling.
+   */
+  if (need_extra_copy) {
+ for (unsigned i = 0; i  4; i++) {
+if (!(instr-dest.write_mask  (1  i)))
+   continue;
+
+emit(MOV(offset(result, i), offset(temp, i)));
+ }
+  }
+  return;
+   }
+   default:
+  break;
+   }
 

Re: [Mesa-dev] [PATCH 2/2] egl/dri: Defer display destruction until the display is not in use.

2015-01-29 Thread Chad Versace
On 01/28/2015 01:45 PM, jim.br...@linux.intel.com wrote:
 From: Jim Bride jim.br...@intel.com
 
 In situations where a DRI2 driver is in use and a call to eglTerminate()
 is followed by a call to eglMakeCurrent() the mandatory call to glFlush()
 inside of dri2_make_current() would cause a segmentation fault because the
 DRI2 driver was torn down and glFlush() expects certain data structures
 associated through the driver to be present.  In order to accommodate
 this the tear-down of the driver itself has been moved from dri2_terminate()
 into a new function dri2_cleanup_display().  Finally, both dri2_terminate()
 and dri2_make_current() call dri2_cleanup_display() if there are no
 resources associated with the display.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69622
 Signed-off-by: Jim Bride jim.br...@intel.com
 ---
  src/egl/drivers/dri2/egl_dri2.c | 84 
 +
  1 file changed, 68 insertions(+), 16 deletions(-)
 
 diff --git a/src/egl/drivers/dri2/egl_dri2.c b/src/egl/drivers/dri2/egl_dri2.c
 index 9d5a1cf..bea5ccc 100644
 --- a/src/egl/drivers/dri2/egl_dri2.c
 +++ b/src/egl/drivers/dri2/egl_dri2.c
 @@ -665,20 +665,14 @@ dri2_initialize(_EGLDriver *drv, _EGLDisplay *disp)
  }
  
  /**
 - * Called via eglTerminate(), drv-API.Terminate().
 + * Cleanup display elements that are not in use.
   */
 -static EGLBoolean
 -dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
 +static void
 +dri2_cleanup_display(_EGLDisplay *disp)
  {
 struct dri2_egl_display *dri2_dpy = dri2_egl_display(disp);
 unsigned i;
  
 -   /* Release any resources that are not currently bound to
 -* the display.  We do not want to release bound resources
 -* because doing so could interfere with proper cleanup via
 -* a call to eglMakeCurrent() to unbind said resources.
 -*/
 -   _eglReleaseDisplayResources(drv, disp, EGL_TRUE);
 _eglCleanupDisplay(disp);
  
 if (dri2_dpy-own_dri_screen)
 @@ -728,7 +722,26 @@ dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
 }
 free(dri2_dpy);
 disp-DriverData = NULL;
 +}
 +
 +/**
 + * Called via eglTerminate(), drv-API.Terminate().
 + */
 +static EGLBoolean
 +dri2_terminate(_EGLDriver *drv, _EGLDisplay *disp)
 +{
 +   /* Release any resources that are not currently bound to
 +* the display.  We do not want to release bound resources
 +* because doing so could interfere with proper cleanup via
 +* a call to eglMakeCurrent() to unbind said resources.
 +*/
 +   _eglReleaseDisplayResources(drv, disp, EGL_TRUE);
  
 +   /* If the display has not resources (and all are tied to contexts) then
^^^
no

 +* go ahead and free up the display itself.
 +*/
 +   if (!disp-ResourceLists[_EGL_RESOURCE_CONTEXT])
 +  dri2_cleanup_display(disp);
 return EGL_TRUE;
  }
  
 @@ -992,14 +1005,13 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, 
 _EGLSurface *dsurf,
 __DRIdrawable *ddraw, *rdraw;
 __DRIcontext *cctx;
  
 +   /* flush before context switch */
 +   dri2_drv-glFlush();
 +
 /* make new bindings */
 if (!_eglBindContext(ctx, dsurf, rsurf, old_ctx, old_dsurf, old_rsurf))
return EGL_FALSE;
  
 -   /* flush before context switch */
 -   if (old_ctx  dri2_drv-glFlush)
 -  dri2_drv-glFlush();
 -
 ddraw = (dri2_dsurf) ? dri2_dsurf-dri_drawable : NULL;
 rdraw = (dri2_rsurf) ? dri2_rsurf-dri_drawable : NULL;
 cctx = (dri2_ctx) ? dri2_ctx-dri_context : NULL;
 @@ -1011,13 +1023,53 @@ dri2_make_current(_EGLDriver *drv, _EGLDisplay *disp, 
 _EGLSurface *dsurf,
  
 if ((cctx == NULL  ddraw == NULL  rdraw == NULL) ||
 dri2_dpy-core-bindContext(cctx, ddraw, rdraw)) {
 -  if (old_dsurf)
 +  _EGLDisplay *old_dpy = NULL, *ref = NULL;
 +
 +  if (old_dsurf) {
 + ref = old_dsurf-Resource.Display;
 + old_dpy = (ref != disp) ? ref : NULL;
 +
   drv-API.DestroySurface(drv, disp, old_dsurf);
 -  if (old_rsurf)
 +  }
 +
 +  if (old_rsurf) {
 + if (!old_dpy) {
 +ref = old_rsurf-Resource.Display;
 +old_dpy = (ref != disp) ? ref : NULL;
 + }

Nit pick. Ignore if you wish.

I think the second-level if statements are much easier to read if rewritten
with the pattern below:

if (old_rsurf) {
   old_dpy = old_rsurf-Resource.Display;
   drv-API.DestroySurface(drv, disp, old_rsurf);
}

It eliminates a nested 'if' statement, eliminates the weird `ref` variable,
and eliminates the ternary 'if' operator. And it
is functionally equivalent to the patch as-is because, if any
of the old objects exist, the old display is the same for all of them.

To make the simplification though, you would have to modify the
'if' condition farther below:
- if (old_dpy  !old_dpy-Initialized)
+ if (old_dpy  old_dpy != disp  !old_dpy-Initialized)

   drv-API.DestroySurface(drv, disp, old_rsurf);
 -  if (old_ctx)
 +  }
 +
 +  if 

Re: [Mesa-dev] [PATCH 1/7] nir: Add a void type

2015-01-29 Thread Jason Ekstrand
On Thu, Jan 29, 2015 at 4:13 PM, Kenneth Graunke kenn...@whitecape.org
wrote:

 On Thursday, January 29, 2015 12:50:16 PM Jason Ekstrand wrote:
  This allows us to indicate a concept of an invalid type.
  ---
   src/glsl/nir/nir.h | 1 +
   1 file changed, 1 insertion(+)
 
  diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
  index 980fdd0..f2050de 100644
  --- a/src/glsl/nir/nir.h
  +++ b/src/glsl/nir/nir.h
  @@ -576,6 +576,7 @@ void nir_alu_dest_copy(nir_alu_dest *dest, const
 nir_alu_dest *src,
  void *mem_ctx);
 
   typedef enum {
  +   nir_type_void = 0, /* Not a valid type */
  nir_type_float,
  nir_type_int,
  nir_type_unsigned,

 I think it would be clearer to call this nir_type_invalid, since void is
 a
 real type in source languages.


Sure.  I don't care too much.  Changed locally.
--Jason



 --Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH v2 7/7] nir/opt_algebraic: Add some constant bcsel reductions

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 02:34:18 PM Jason Ekstrand wrote:
 total instructions in shared programs: 5998190 - 5997603 (-0.01%)
 instructions in affected programs: 54276 - 53689 (-1.08%)
 helped:293
 ---
  src/glsl/nir/nir_opt_algebraic.py | 30 --
  1 file changed, 28 insertions(+), 2 deletions(-)
 
 diff --git a/src/glsl/nir/nir_opt_algebraic.py 
 b/src/glsl/nir/nir_opt_algebraic.py
 index e3b309c..3223119 100644
 --- a/src/glsl/nir/nir_opt_algebraic.py
 +++ b/src/glsl/nir/nir_opt_algebraic.py
 @@ -120,11 +120,37 @@ optimizations = [
 # Boolean simplifications
 (('ine', 'a@bool', 0), 'a'),
 (('ieq', 'a@bool', 0), ('inot', 'a')),
 -   (('bcsel', 'a@bool', True, False), 'a'),
 -   (('bcsel', 'a@bool', False, True), ('inot', 'a')),
 +   (('bcsel', a, True, False), ('ine', a, 0)),
 +   (('bcsel', a, False, True), ('ieq', a, 0)),
 +   (('bcsel', True, b, c), b),
 +   (('bcsel', False, b, c), c),
 +   # The result of this should be hit by constant propagation and, in the
 +   # next round of opt_algebraic, get picked up by one of the above two.
 +   (('bcsel', '#a', b, c), ('bcsel', ('ine', 'a', 0), b, c)),
  
  # This one may not be exact
 (('feq', ('fadd', a, b), 0.0), ('feq', a, ('fneg', b))),
  ]
  
 +# Add optimizations to handle the case where the result of a ternary is
 +# compared to a constant.  This way we can take things like
 +#
 +# (a ? 0 : 1)  0
 +#
 +# and turn it into
 +#
 +# a ? (0  0) : (1  0)
 +#
 +# which constant folding will eat for lunch.  The resulting ternary will
 +# further get cleaned up by the boolean reductions above ane we will be

 typo (and) ^^^

Series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

 +# left with just the original variable a.
 +for op in ['flt', 'fge', 'feq', 'fne',
 +   'ilt', 'ige', 'ieq', 'ine', 'ult', 'uge']:
 +   optimizations += [
 +  ((op, ('bcsel', 'a', '#b', '#c'), '#d'),
 +   ('bcsel', 'a', (op, 'b', 'd'), (op, 'c', 'd'))),
 +  ((op, '#d', ('bcsel', a, '#b', '#c')),
 +   ('bcsel', 'a', (op, 'd', 'b'), (op, 'd', 'c'))),
 +   ]
 +
  print nir_algebraic.AlgebraicPass(nir_opt_algebraic, 
 optimizations).render()

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-29 Thread Rob Clark
On Thu, Jan 29, 2015 at 5:31 PM, Rob Clark robdcl...@gmail.com wrote:
 On Thu, Jan 29, 2015 at 4:20 PM, Roland Scheidegger srol...@vmware.com 
 wrote:
 Hi,

 the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
 that indirections can only be done through the address reg and has quite
 some extra code to deal with this. Even though hardware and apis which
 worked like that are definitely old by now.
 Thus, I'm proposing the address reg be nuked. I am however not quite
 sure what the implications for drivers are, other than I'm certain
 llvmpipe can handle that already.
 For that reason, I suspect at least initially a new cap bit would be
 required so glsl-to-tgsi would skip the extra code. I tend to think
 longer term it would be great if it could be nuked completely, I am
 however not sure if that is easily done with drivers for old hw (such as
 r300) - I guess if necessary we could keep operations such as ARL (or
 even ARR though clearly not UARL!) and just define them to be usable
 with temp regs.

 Opinions?

 Not sure about other hw (although I would assume radeon is similar)..
 but adreno has real actual address register, which need to be used for
 relative addressing of the register (or const) file.  Otherwise we'd
 need to do something like arrays in memory and load/store
 instructions.

 Possibly w/ more smarts in the compiler, we could infer address
 register usage.  It would require some smarts that we don't have yet
 to eliminate redundant writes to the address register generated for
 each instruction with indirect reference.

actually, I think I'm lying.. since it will end up SSA form (in either
tgsi-ir3 or tgsi-nir-ir3), we should have a single address register
value, so maybe it isn't such a big problem..

BR,
-R

 (Although eventually I'd like to use tgsi-nir.. and haven't yet
 looked at how nir handles indirects so maybe eventually I need to
 implement those smarts in my compiler anyway.)

 BR,
 -R


 Roland
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir/algebraic: Support specifying variable as constant or by type

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 12:50:20 PM Jason Ekstrand wrote:
 ---
  src/glsl/nir/nir_algebraic.py | 20 +---
  src/glsl/nir/nir_opt_algebraic.py | 12 +---
  2 files changed, 26 insertions(+), 6 deletions(-)
 
 diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
 index 75436f4..6e7973d 100644
 --- a/src/glsl/nir/nir_algebraic.py
 +++ b/src/glsl/nir/nir_algebraic.py
 @@ -28,6 +28,7 @@ import itertools
  import struct
  import sys
  import mako.template
 +import re
  
  # Represents a set of variables, each with a unique id
  class VarSet(object):
 @@ -65,6 +66,8 @@ static const ${val.c_type} ${val.name} = {
 { ${hex(val)} /* ${val.value} */ },
  % elif isinstance(val, Variable):
 ${val.index}, /* ${val.var_name} */
 +   ${'true' if val.is_constant else 'false'},
 +   nir_type_${'void' if val.required_type is None else val.required_type},

   nir_type_${val.required_type or 'void'}

  % elif isinstance(val, Expression):
 nir_op_${val.opcode},
 { ${', '.join(src.c_ptr for src in val.sources)} },
 @@ -111,12 +114,23 @@ class Constant(Value):
else:
   assert False
  
 +_var_name_re = re.compile(r(?Pconst#)?(?Pname\w+)(?:@(?Ptype\w+))?)
 +
  class Variable(Value):
 def __init__(self, val, name, varset):
Value.__init__(self, name, variable)
 -  self.var_name = val
 -  self.index = varset[val]
 -  self.name = name
 +
 +  m = _var_name_re.match(val)
 +  assert m and m.group('name') is not None
 +
 +  self.var_name = m.group('name')
 +  self.is_constant = m.group('const') is not None
 +  self.required_type = m.group('type')
 +
 +  if self.required_type is not None:
 + assert self.required_type in ('float', 'bool', 'int', 'unsigned')
 +
 +  self.index = varset[self.var_name]
  
  class Expression(Value):
 def __init__(self, expr, name_base, varset):
 diff --git a/src/glsl/nir/nir_opt_algebraic.py 
 b/src/glsl/nir/nir_opt_algebraic.py
 index 9c62b28..1dea42a 100644
 --- a/src/glsl/nir/nir_opt_algebraic.py
 +++ b/src/glsl/nir/nir_opt_algebraic.py
 @@ -36,9 +36,15 @@ d = 'd'
  # and replace is either an expression or a value.  An expression is
  # defined as a tuple of the form (op, src0, src1, src2, src3)
  # where each source is either an expression or a value.  A value can be
 -# either a numeric constant or a string representing a variable name.  For
 -# constants, you have to be careful to make sure that it is the right type
 -# because python is unaware of the source and destination types of the
 +# either a numeric constant or a string representing a variable name.
 +#
 +# Variable names are specified as [#]name[@type] where # inicates that
 +# the given variable will only match constants and thpe type indicates that

 typo 

 +# the given variable will only match values from ALU instructions with the
 +# given output type.
 +#
 +# For constants, you have to be careful to make sure that it is the right
 +# type because python is unaware of the source and destination types of the
  # opcodes.
  
  optimizations = [

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-29 Thread Brian Paul

On 01/29/2015 02:20 PM, Roland Scheidegger wrote:

Hi,

the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
that indirections can only be done through the address reg and has quite
some extra code to deal with this. Even though hardware and apis which
worked like that are definitely old by now.
Thus, I'm proposing the address reg be nuked. I am however not quite
sure what the implications for drivers are, other than I'm certain
llvmpipe can handle that already.
For that reason, I suspect at least initially a new cap bit would be
required so glsl-to-tgsi would skip the extra code. I tend to think
longer term it would be great if it could be nuked completely, I am
however not sure if that is easily done with drivers for old hw (such as
r300) - I guess if necessary we could keep operations such as ARL (or
even ARR though clearly not UARL!) and just define them to be usable
with temp regs.

Opinions?


Sounds good, but as you know, the vmware/svga driver uses the address 
register.


Currently, there's some handy code in st_glsl_to_tgsi.cpp which looks 
for instructions where more than one operand uses indirect addressing 
and reduces it down to one indirection per instruction.


For example, if we had something like this:

uniform vec4 a[10], b[10], c[10];
x = a[i] * b[j] + c[k];

We get TGSI code something like this:

ARL ADDR.x, i;
MOV TEMP[0], CONST[ADDR.x];
ARL ADDR.x, j;
MOV TEMP[1], CONST[ADDR.x];
ARL ADDR.x, k;
MAD TEMP[2], TEMP[0], TEMP[1], CONST[ADDR.x];

(and yes, I can see the desire to get rid of that entirely and emit 
something like:

MAD TEMP[2], CONST[TEMP_i], CONST[TEMP_j], CONST[TEMP_k];
)


If we added a new cap along the lines of 
PIPE_CAP_MAX_INDIRECTIONS_PER_INSTRUCTION we could keep that logic in 
the state tracker.  This would make life easier in the driver(s).


Otherwise, if we discard ADDR, ARL and don't have the new cap, the 
driver(s) will have to count the indirections in each instruction and 
load some of those operands into temp registers.


It's do-able, but more work.  It would take me a while to implement that 
in our driver.  Of course, some of it could go into a TGSI lowering utility.


In any case, the driver would still have to detect patterns like this:

OPCODE DST, SRC[TEMP], ...

and convert it to

ARL ADDR.x, TEMP;
OPCODE DST, SRC[ADDR.x], ...

But that's not too hard.

-Brian

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] killing off the address reg in tgsi

2015-01-29 Thread Rob Clark
On Thu, Jan 29, 2015 at 6:06 PM, Brian Paul bri...@vmware.com wrote:
 On 01/29/2015 02:20 PM, Roland Scheidegger wrote:

 Hi,

 the address reg in tgsi is quite a nuisance. glsl-to-tgsi code assumes
 that indirections can only be done through the address reg and has quite
 some extra code to deal with this. Even though hardware and apis which
 worked like that are definitely old by now.
 Thus, I'm proposing the address reg be nuked. I am however not quite
 sure what the implications for drivers are, other than I'm certain
 llvmpipe can handle that already.
 For that reason, I suspect at least initially a new cap bit would be
 required so glsl-to-tgsi would skip the extra code. I tend to think
 longer term it would be great if it could be nuked completely, I am
 however not sure if that is easily done with drivers for old hw (such as
 r300) - I guess if necessary we could keep operations such as ARL (or
 even ARR though clearly not UARL!) and just define them to be usable
 with temp regs.

 Opinions?


 Sounds good, but as you know, the vmware/svga driver uses the address
 register.

 Currently, there's some handy code in st_glsl_to_tgsi.cpp which looks for
 instructions where more than one operand uses indirect addressing and
 reduces it down to one indirection per instruction.

 For example, if we had something like this:

 uniform vec4 a[10], b[10], c[10];
 x = a[i] * b[j] + c[k];

 We get TGSI code something like this:

 ARL ADDR.x, i;
 MOV TEMP[0], CONST[ADDR.x];
 ARL ADDR.x, j;
 MOV TEMP[1], CONST[ADDR.x];
 ARL ADDR.x, k;
 MAD TEMP[2], TEMP[0], TEMP[1], CONST[ADDR.x];

 (and yes, I can see the desire to get rid of that entirely and emit
 something like:
 MAD TEMP[2], CONST[TEMP_i], CONST[TEMP_j], CONST[TEMP_k];
 )


 If we added a new cap along the lines of
 PIPE_CAP_MAX_INDIRECTIONS_PER_INSTRUCTION we could keep that logic in the
 state tracker.  This would make life easier in the driver(s).


that would make it easier for me at least.. :-)

otherwise I end up lowering the second form back into the first

BR,
-R


 Otherwise, if we discard ADDR, ARL and don't have the new cap, the driver(s)
 will have to count the indirections in each instruction and load some of
 those operands into temp registers.

 It's do-able, but more work.  It would take me a while to implement that in
 our driver.  Of course, some of it could go into a TGSI lowering utility.

 In any case, the driver would still have to detect patterns like this:

 OPCODE DST, SRC[TEMP], ...

 and convert it to

 ARL ADDR.x, TEMP;
 OPCODE DST, SRC[ADDR.x], ...

 But that's not too hard.

 -Brian


 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] GL: Update glext.h to fix ARB_dsa function prototypes.

2015-01-29 Thread Jason Ekstrand
It might be good to say in the commit message update glext.h to match
revision X from Khronos  and then put the comment about DSA functions
as a second line in the commit message.
--Jason

On Fri, Jan 23, 2015 at 2:23 PM, Laura Ekstrand la...@jlekstrand.net
wrote:

 I checked, and all of the currently implemented DSA functions build with
 this file.

 Laura

 On Fri, Jan 23, 2015 at 2:20 PM, Laura Ekstrand la...@jlekstrand.net
 wrote:

 ---
  include/GL/glext.h | 48 ++--
  1 file changed, 26 insertions(+), 22 deletions(-)

 diff --git a/include/GL/glext.h b/include/GL/glext.h
 index d3cfbb5..0ca89ca 100644
 --- a/include/GL/glext.h
 +++ b/include/GL/glext.h
 @@ -33,7 +33,7 @@ extern C {
  ** used to make the header, and the header can be found at
  **   http://www.opengl.org/registry/
  **
 -** Khronos $Revision: 28986 $ on $Date: 2014-11-18 18:43:15 -0800 (Tue,
 18 Nov 2014) $
 +** Khronos $Revision: 29537 $ on $Date: 2015-01-22 02:32:35 -0800 (Thu,
 22 Jan 2015) $
  */

  #if defined(_WIN32)  !defined(APIENTRY)  !defined(__CYGWIN__) 
 !defined(__SCITECH_SNAP__)
 @@ -53,7 +53,7 @@ extern C {
  #define GLAPI extern
  #endif

 -#define GL_GLEXT_VERSION 20141118
 +#define GL_GLEXT_VERSION 20150122

  /* Generated C header for:
   * API: gl
 @@ -2607,25 +2607,25 @@ GLAPI void APIENTRY glBindVertexBuffers (GLuint
 first, GLsizei count, const GLui
  typedef void (APIENTRYP PFNGLCLIPCONTROLPROC) (GLenum origin, GLenum
 depth);
  typedef void (APIENTRYP PFNGLCREATETRANSFORMFEEDBACKSPROC) (GLsizei n,
 GLuint *ids);
  typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERBASEPROC) (GLuint
 xfb, GLuint index, GLuint buffer);
 -typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint
 xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizei size);
 +typedef void (APIENTRYP PFNGLTRANSFORMFEEDBACKBUFFERRANGEPROC) (GLuint
 xfb, GLuint index, GLuint buffer, GLintptr offset, GLsizeiptr size);
  typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKIVPROC) (GLuint xfb,
 GLenum pname, GLint *param);
  typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI_VPROC) (GLuint xfb,
 GLenum pname, GLuint index, GLint *param);
  typedef void (APIENTRYP PFNGLGETTRANSFORMFEEDBACKI64_VPROC) (GLuint xfb,
 GLenum pname, GLuint index, GLint64 *param);
  typedef void (APIENTRYP PFNGLCREATEBUFFERSPROC) (GLsizei n, GLuint
 *buffers);
 -typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer,
 GLsizei size, const void *data, GLbitfield flags);
 -typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer,
 GLsizei size, const void *data, GLenum usage);
 -typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer,
 GLintptr offset, GLsizei size, const void *data);
 -typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint
 readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset,
 GLsizei size);
 +typedef void (APIENTRYP PFNGLNAMEDBUFFERSTORAGEPROC) (GLuint buffer,
 GLsizeiptr size, const void *data, GLbitfield flags);
 +typedef void (APIENTRYP PFNGLNAMEDBUFFERDATAPROC) (GLuint buffer,
 GLsizeiptr size, const void *data, GLenum usage);
 +typedef void (APIENTRYP PFNGLNAMEDBUFFERSUBDATAPROC) (GLuint buffer,
 GLintptr offset, GLsizeiptr size, const void *data);
 +typedef void (APIENTRYP PFNGLCOPYNAMEDBUFFERSUBDATAPROC) (GLuint
 readBuffer, GLuint writeBuffer, GLintptr readOffset, GLintptr writeOffset,
 GLsizeiptr size);
  typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERDATAPROC) (GLuint buffer,
 GLenum internalformat, GLenum format, GLenum type, const void *data);
 -typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint
 buffer, GLenum internalformat, GLintptr offset, GLsizei size, GLenum
 format, GLenum type, const void *data);
 +typedef void (APIENTRYP PFNGLCLEARNAMEDBUFFERSUBDATAPROC) (GLuint
 buffer, GLenum internalformat, GLintptr offset, GLsizeiptr size, GLenum
 format, GLenum type, const void *data);
  typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERPROC) (GLuint buffer, GLenum
 access);
 -typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer,
 GLintptr offset, GLsizei length, GLbitfield access);
 +typedef void *(APIENTRYP PFNGLMAPNAMEDBUFFERRANGEPROC) (GLuint buffer,
 GLintptr offset, GLsizeiptr length, GLbitfield access);
  typedef GLboolean (APIENTRYP PFNGLUNMAPNAMEDBUFFERPROC) (GLuint buffer);
 -typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint
 buffer, GLintptr offset, GLsizei length);
 +typedef void (APIENTRYP PFNGLFLUSHMAPPEDNAMEDBUFFERRANGEPROC) (GLuint
 buffer, GLintptr offset, GLsizeiptr length);
  typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERIVPROC) (GLuint
 buffer, GLenum pname, GLint *params);
  typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPARAMETERI64VPROC) (GLuint
 buffer, GLenum pname, GLint64 *params);
  typedef void (APIENTRYP PFNGLGETNAMEDBUFFERPOINTERVPROC) (GLuint buffer,
 GLenum pname, void **params);
 -typedef void (APIENTRYP PFNGLGETNAMEDBUFFERSUBDATAPROC) (GLuint 

Re: [Mesa-dev] Mesa 10.5.0 release plan

2015-01-29 Thread Emil Velikov
On 29/01/15 04:05, Michel Dänzer wrote:
 On 29.01.2015 00:37, Emil Velikov wrote:

 February 20th 2015 - Release candidate 3
 February 20th 2015 - Release candidate 4
 
 I assume you meant February 27th for rc4. :)
 
Sigh... will I be able to write a statement without ever making silly
typos :)

That's correct Michel - I meant the 27th.

Cheers,
Emil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir/algebraic: Support specifying variable as constant or by type

2015-01-29 Thread Dylan Baker
On Thursday, January 29, 2015 17:00:08 Kenneth Graunke wrote:
 On Thursday, January 29, 2015 12:50:20 PM Jason Ekstrand wrote:
  ---
   src/glsl/nir/nir_algebraic.py | 20 +---
   src/glsl/nir/nir_opt_algebraic.py | 12 +---
   2 files changed, 26 insertions(+), 6 deletions(-)
  
  diff --git a/src/glsl/nir/nir_algebraic.py b/src/glsl/nir/nir_algebraic.py
  index 75436f4..6e7973d 100644
  --- a/src/glsl/nir/nir_algebraic.py
  +++ b/src/glsl/nir/nir_algebraic.py
  @@ -28,6 +28,7 @@ import itertools
   import struct
   import sys
   import mako.template
  +import re
   
   # Represents a set of variables, each with a unique id
   class VarSet(object):
  @@ -65,6 +66,8 @@ static const ${val.c_type} ${val.name} = {
  { ${hex(val)} /* ${val.value} */ },
   % elif isinstance(val, Variable):
  ${val.index}, /* ${val.var_name} */
  +   ${'true' if val.is_constant else 'false'},
  +   nir_type_${'void' if val.required_type is None else val.required_type},
 
nir_type_${val.required_type or 'void'}

This is equivalent except in a few cases, namely 'void' will be picked
over any falsy value, not just None. Reasonable falsy values might be 0
or '' (empty string).

 
   % elif isinstance(val, Expression):
  nir_op_${val.opcode},
  { ${', '.join(src.c_ptr for src in val.sources)} },
  @@ -111,12 +114,23 @@ class Constant(Value):
 else:
assert False
   
  +_var_name_re = re.compile(r(?Pconst#)?(?Pname\w+)(?:@(?Ptype\w+))?)
  +
   class Variable(Value):
  def __init__(self, val, name, varset):
 Value.__init__(self, name, variable)
  -  self.var_name = val
  -  self.index = varset[val]
  -  self.name = name
  +
  +  m = _var_name_re.match(val)
  +  assert m and m.group('name') is not None

Can m.group('name') actually be None? I don't think it can.

  +
  +  self.var_name = m.group('name')
  +  self.is_constant = m.group('const') is not None
  +  self.required_type = m.group('type')
  +
  +  if self.required_type is not None:
  + assert self.required_type in ('float', 'bool', 'int', 'unsigned')
  +
  +  self.index = varset[self.var_name]
   
   class Expression(Value):
  def __init__(self, expr, name_base, varset):
  diff --git a/src/glsl/nir/nir_opt_algebraic.py 
  b/src/glsl/nir/nir_opt_algebraic.py
  index 9c62b28..1dea42a 100644
  --- a/src/glsl/nir/nir_opt_algebraic.py
  +++ b/src/glsl/nir/nir_opt_algebraic.py
  @@ -36,9 +36,15 @@ d = 'd'
   # and replace is either an expression or a value.  An expression is
   # defined as a tuple of the form (op, src0, src1, src2, src3)
   # where each source is either an expression or a value.  A value can be
  -# either a numeric constant or a string representing a variable name.  For
  -# constants, you have to be careful to make sure that it is the right type
  -# because python is unaware of the source and destination types of the
  +# either a numeric constant or a string representing a variable name.
  +#
  +# Variable names are specified as [#]name[@type] where # inicates that
  +# the given variable will only match constants and thpe type indicates that
 
  typo 
 
  +# the given variable will only match values from ALU instructions with the
  +# given output type.
  +#
  +# For constants, you have to be careful to make sure that it is the right
  +# type because python is unaware of the source and destination types of the
   # opcodes.
   
   optimizations = [


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 5/7] nir/algebraic: Support specifying variable as constant or by type

2015-01-29 Thread Jason Ekstrand
On Thu, Jan 29, 2015 at 5:00 PM, Kenneth Graunke kenn...@whitecape.org
wrote:

 On Thursday, January 29, 2015 12:50:20 PM Jason Ekstrand wrote:
  ---
   src/glsl/nir/nir_algebraic.py | 20 +---
   src/glsl/nir/nir_opt_algebraic.py | 12 +---
   2 files changed, 26 insertions(+), 6 deletions(-)
 
  diff --git a/src/glsl/nir/nir_algebraic.py
 b/src/glsl/nir/nir_algebraic.py
  index 75436f4..6e7973d 100644
  --- a/src/glsl/nir/nir_algebraic.py
  +++ b/src/glsl/nir/nir_algebraic.py
  @@ -28,6 +28,7 @@ import itertools
   import struct
   import sys
   import mako.template
  +import re
 
   # Represents a set of variables, each with a unique id
   class VarSet(object):
  @@ -65,6 +66,8 @@ static const ${val.c_type} ${val.name} = {
  { ${hex(val)} /* ${val.value} */ },
   % elif isinstance(val, Variable):
  ${val.index}, /* ${val.var_name} */
  +   ${'true' if val.is_constant else 'false'},
  +   nir_type_${'void' if val.required_type is None else
 val.required_type},

nir_type_${val.required_type or 'void'}


done.



   % elif isinstance(val, Expression):
  nir_op_${val.opcode},
  { ${', '.join(src.c_ptr for src in val.sources)} },
  @@ -111,12 +114,23 @@ class Constant(Value):
 else:
assert False
 
  +_var_name_re =
 re.compile(r(?Pconst#)?(?Pname\w+)(?:@(?Ptype\w+))?)
  +
   class Variable(Value):
  def __init__(self, val, name, varset):
 Value.__init__(self, name, variable)
  -  self.var_name = val
  -  self.index = varset[val]
  -  self.name = name
  +
  +  m = _var_name_re.match(val)
  +  assert m and m.group('name') is not None
  +
  +  self.var_name = m.group('name')
  +  self.is_constant = m.group('const') is not None
  +  self.required_type = m.group('type')
  +
  +  if self.required_type is not None:
  + assert self.required_type in ('float', 'bool', 'int',
 'unsigned')
  +
  +  self.index = varset[self.var_name]
 
   class Expression(Value):
  def __init__(self, expr, name_base, varset):
  diff --git a/src/glsl/nir/nir_opt_algebraic.py
 b/src/glsl/nir/nir_opt_algebraic.py
  index 9c62b28..1dea42a 100644
  --- a/src/glsl/nir/nir_opt_algebraic.py
  +++ b/src/glsl/nir/nir_opt_algebraic.py
  @@ -36,9 +36,15 @@ d = 'd'
   # and replace is either an expression or a value.  An expression is
   # defined as a tuple of the form (op, src0, src1, src2, src3)
   # where each source is either an expression or a value.  A value can be
  -# either a numeric constant or a string representing a variable name.
 For
  -# constants, you have to be careful to make sure that it is the right
 type
  -# because python is unaware of the source and destination types of the
  +# either a numeric constant or a string representing a variable name.
  +#
  +# Variable names are specified as [#]name[@type] where # inicates
 that
  +# the given variable will only match constants and thpe type indicates
 that

  typo 


fixed.



  +# the given variable will only match values from ALU instructions with
 the
  +# given output type.
  +#
  +# For constants, you have to be careful to make sure that it is the
 right
  +# type because python is unaware of the source and destination types of
 the
   # opcodes.
 
   optimizations = [

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Add a better PRM citation for the IMS dimension mangling.

2015-01-29 Thread Chad Versace
On 01/29/2015 12:00 AM, Kenneth Graunke wrote:
 Paul originally had to reverse engineer these formulas based on the
 description about how the sampler works.  The description here is not
 the easiest to follow - especially given that it's from the Sandybridge
 era, when the hardware only did 4x multisampling.
 
 Jordan and I recently found another part of the documentation where they
 simply state that IMS dimensions must be adjusted by a set of formulas.
 Quoting this section provides an easy to follow explanation for the
 code, including 2x/4x/8x/16x.
 
 Signed-off-by: Kenneth Graunke kenn...@whitecape.org

Reviewed-by: Chad Versace chad.vers...@intel.com




signature.asc
Description: OpenPGP digital signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/7] nir: Add a void type

2015-01-29 Thread Kenneth Graunke
On Thursday, January 29, 2015 12:50:16 PM Jason Ekstrand wrote:
 This allows us to indicate a concept of an invalid type.
 ---
  src/glsl/nir/nir.h | 1 +
  1 file changed, 1 insertion(+)
 
 diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
 index 980fdd0..f2050de 100644
 --- a/src/glsl/nir/nir.h
 +++ b/src/glsl/nir/nir.h
 @@ -576,6 +576,7 @@ void nir_alu_dest_copy(nir_alu_dest *dest, const 
 nir_alu_dest *src,
 void *mem_ctx);
  
  typedef enum {
 +   nir_type_void = 0, /* Not a valid type */
 nir_type_float,
 nir_type_int,
 nir_type_unsigned,

I think it would be clearer to call this nir_type_invalid, since void is a
real type in source languages.

--Ken

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] egl: Add unbound_only parameter to _eglReleaseDisplayResources()

2015-01-29 Thread Chad Versace
On 01/28/2015 01:45 PM, jim.br...@linux.intel.com wrote:
 From: Jim Bride jim.br...@intel.com
 
 Add a EGLBoolean parameter unbound_only to _eglReleaseDisplayResources() to
 allow that function to skip releasing contexts or surfaces that are currently
 bound to a display.  Setting unbound_only to EGL_FALSE preserves the original
 behavior of this function.
 
 Signed-off-by: Jim Bride jim.br...@intel.com
 ---
  src/egl/drivers/dri2/egl_dri2.c |  7 ++-
  src/egl/main/egldisplay.c   | 17 +
  src/egl/main/egldisplay.h   |  3 ++-
  src/gallium/state_trackers/egl/common/egl_g3d.c |  9 -
  4 files changed, 29 insertions(+), 7 deletions(-)

Reviewed-by: Chad Versace chad.vers...@intel.com

I'm waiting on fixes to patch 2 before committing patch 1, because
they don't make sense to commit in independently.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 0/2] Fast mutex v2

2015-01-29 Thread Kristian Høgsberg
Here's take two of the fast mutex path.  This version has a check for
linux/futex.h and it split into two patches.  First patch introduces
full_mtx_t and updates callers.  Second patch adds the inline mtx_t.

Kristian Høgsberg (2):
  mesa: Split full featured mutex into its own type
  mesa: Add new fast mtx_t mutex type for basic use cases

 configure.ac |   3 +
 include/c11/threads_posix.h  | 149 ---
 include/c11/threads_win32.h  |  48 ---
 src/gallium/auxiliary/os/os_thread.h |  10 +--
 src/mesa/main/mtypes.h   |   2 +-
 src/mesa/main/shared.c   |   4 +-
 src/mesa/main/texobj.c   |   4 +-
 src/mesa/main/texobj.h   |   4 +-
 8 files changed, 191 insertions(+), 33 deletions(-)

-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH v2 1/2] mesa: Split full featured mutex into its own type

2015-01-29 Thread Kristian Høgsberg
Most of the mtx_t uses in mesa only use lock/unlock, but a couple of
places we use a condition variable and a recursive mutex.  This patch
introduces a new full_mtx_t for the cases where we need a more featureful
mutex and changes mtx_t to only support lock/unlock.

Signed-off-by: Kristian Høgsberg k...@bitplanet.net
---
 include/c11/threads_posix.h  | 51 
 include/c11/threads_win32.h  | 48 ++---
 src/gallium/auxiliary/os/os_thread.h | 10 +++
 src/mesa/main/mtypes.h   |  2 +-
 src/mesa/main/shared.c   |  4 +--
 src/mesa/main/texobj.c   |  4 +--
 src/mesa/main/texobj.h   |  4 +--
 7 files changed, 91 insertions(+), 32 deletions(-)

diff --git a/include/c11/threads_posix.h b/include/c11/threads_posix.h
index f9c165d..5119773 100644
--- a/include/c11/threads_posix.h
+++ b/include/c11/threads_posix.h
@@ -59,13 +59,14 @@ Configuration macro:
 #endif
 
 // FIXME: temporary non-standard hack to ease transition
+#define _FULL_MTX_INITIALIZER_NP PTHREAD_MUTEX_INITIALIZER
 #define _MTX_INITIALIZER_NP PTHREAD_MUTEX_INITIALIZER
 
 /* types */
 typedef pthread_cond_t  cnd_t;
 typedef pthread_t   thrd_t;
 typedef pthread_key_t   tss_t;
-typedef pthread_mutex_t mtx_t;
+typedef pthread_mutex_t full_mtx_t;
 typedef pthread_once_t  once_flag;
 
 
@@ -135,7 +136,7 @@ cnd_signal(cnd_t *cond)
 
 // 7.25.3.5
 static inline int
-cnd_timedwait(cnd_t *cond, mtx_t *mtx, const xtime *xt)
+cnd_timedwait(cnd_t *cond, full_mtx_t *mtx, const xtime *xt)
 {
 struct timespec abs_time;
 int rt;
@@ -148,7 +149,7 @@ cnd_timedwait(cnd_t *cond, mtx_t *mtx, const xtime *xt)
 
 // 7.25.3.6
 static inline int
-cnd_wait(cnd_t *cond, mtx_t *mtx)
+cnd_wait(cnd_t *cond, full_mtx_t *mtx)
 {
 if (!cond || !mtx) return thrd_error;
 pthread_cond_wait(cond, mtx);
@@ -159,7 +160,7 @@ cnd_wait(cnd_t *cond, mtx_t *mtx)
 /* 7.25.4 Mutex functions */
 // 7.25.4.1
 static inline void
-mtx_destroy(mtx_t *mtx)
+full_mtx_destroy(full_mtx_t *mtx)
 {
 assert(mtx);
 pthread_mutex_destroy(mtx);
@@ -167,7 +168,7 @@ mtx_destroy(mtx_t *mtx)
 
 // 7.25.4.2
 static inline int
-mtx_init(mtx_t *mtx, int type)
+full_mtx_init(full_mtx_t *mtx, int type)
 {
 pthread_mutexattr_t attr;
 if (!mtx) return thrd_error;
@@ -191,7 +192,7 @@ mtx_init(mtx_t *mtx, int type)
 
 // 7.25.4.3
 static inline int
-mtx_lock(mtx_t *mtx)
+full_mtx_lock(full_mtx_t *mtx)
 {
 if (!mtx) return thrd_error;
 pthread_mutex_lock(mtx);
@@ -199,14 +200,14 @@ mtx_lock(mtx_t *mtx)
 }
 
 static inline int
-mtx_trylock(mtx_t *mtx);
+full_mtx_trylock(full_mtx_t *mtx);
 
 static inline void
 thrd_yield(void);
 
 // 7.25.4.4
 static inline int
-mtx_timedlock(mtx_t *mtx, const xtime *xt)
+full_mtx_timedlock(full_mtx_t *mtx, const xtime *xt)
 {
 if (!mtx || !xt) return thrd_error;
 {
@@ -222,7 +223,7 @@ mtx_timedlock(mtx_t *mtx, const xtime *xt)
 #else
 time_t expire = time(NULL);
 expire += xt-sec;
-while (mtx_trylock(mtx) != thrd_success) {
+while (full_mtx_trylock(mtx) != thrd_success) {
 time_t now = time(NULL);
 if (expire  now)
 return thrd_busy;
@@ -236,7 +237,7 @@ mtx_timedlock(mtx_t *mtx, const xtime *xt)
 
 // 7.25.4.5
 static inline int
-mtx_trylock(mtx_t *mtx)
+full_mtx_trylock(full_mtx_t *mtx)
 {
 if (!mtx) return thrd_error;
 return (pthread_mutex_trylock(mtx) == 0) ? thrd_success : thrd_busy;
@@ -244,13 +245,41 @@ mtx_trylock(mtx_t *mtx)
 
 // 7.25.4.6
 static inline int
-mtx_unlock(mtx_t *mtx)
+full_mtx_unlock(full_mtx_t *mtx)
 {
 if (!mtx) return thrd_error;
 pthread_mutex_unlock(mtx);
 return thrd_success;
 }
 
+typedef full_mtx_t mtx_t;
+
+static inline void
+mtx_init(mtx_t *mtx, int type)
+{
+   assert(type == mtx_plain);
+
+   full_mtx_init(mtx, mtx_plain);
+}
+
+static inline void
+mtx_destroy(mtx_t *mtx)
+{
+   full_mtx_destroy(mtx);
+}
+
+static inline void
+mtx_lock(mtx_t *mtx)
+{
+   full_mtx_lock(mtx);
+}
+
+static inline void
+mtx_unlock(mtx_t *mtx)
+{
+   full_mtx_unlock(mtx);
+}
+
 
 /*--- 7.25.5 Thread functions ---*/
 // 7.25.5.1
diff --git a/include/c11/threads_win32.h b/include/c11/threads_win32.h
index d017c31..6968ce4 100644
--- a/include/c11/threads_win32.h
+++ b/include/c11/threads_win32.h
@@ -85,6 +85,7 @@ Configuration macro:
 #define TSS_DTOR_ITERATIONS 1
 
 // FIXME: temporary non-standard hack to ease transition
+#define _FULL_MTX_INITIALIZER_NP {(PCRITICAL_SECTION_DEBUG)-1, -1, 0, 0, 0, 0}
 #define _MTX_INITIALIZER_NP {(PCRITICAL_SECTION_DEBUG)-1, -1, 0, 0, 0, 0}
 
 /* types */
@@ -206,7 +207,7 @@ static void impl_cond_do_signal(cnd_t *cond, int broadcast)
 ReleaseSemaphore(cond-sem_queue, nsignal, NULL);
 }
 
-static int 

[Mesa-dev] [PATCH v2 2/2] mesa: Add new fast mtx_t mutex type for basic use cases

2015-01-29 Thread Kristian Høgsberg
While modern pthread mutexes are very fast, they still incur a call to an
external DSO and overhead of the generality and features of pthread mutexes.
Most mutexes in mesa only needs lock/unlock, and the idea here is that we can
inline the atomic operation and make the fast case just two intructions.
Mutexes are subtle and finicky to implement, so we carefully copy the
implementation from Ulrich Dreppers well-written and well-reviewed paper:

  Futexes Are Tricky
  http://www.akkadia.org/drepper/futex.pdf

We implement mutex3, which gives us a mutex that has no syscalls on
uncontended lock or unlock.  Further, the uncontended case boils down to a
cmpxchg and an untaken branch and the uncontended unlock is just a locked decr
and an untaken branch.  We use __builtin_expect() to indicate that contention
is unlikely so that gcc will put the contention code out of the main code
flow.

A fast mutex only supports lock/unlock, can't be recursive or used with
condition variables.  We keep the pthread mutex implementation around as
full_mtx_t for the few places where we use condition variables or recursive
locking.  For platforms or compilers where futex and atomics aren't available,
mtx_t falls back to the pthread mutex.

The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound
applications.  Most CPU bound cases are helped and some of our internal
bind_buffer_object heavy benchmarks gain up to 10%.

Signed-off-by: Kristian Høgsberg k...@bitplanet.net
---
 configure.ac|  3 ++
 include/c11/threads_posix.h | 98 -
 2 files changed, 100 insertions(+), 1 deletion(-)

diff --git a/configure.ac b/configure.ac
index c2d775e..c4b26ee 100644
--- a/configure.ac
+++ b/configure.ac
@@ -639,6 +639,9 @@ dnl to -pthread, which causes problems if we need -lpthread 
to appear in
 dnl pkgconfig files.
 test -z $PTHREAD_LIBS  PTHREAD_LIBS=-lpthread
 
+dnl Check for futex for fast inline mtx_t.
+AC_CHECK_HEADER([linux/futex.h], [DEFINES=$DEFINES -DHAVE_FUTEX])
+
 dnl SELinux awareness.
 AC_ARG_ENABLE([selinux],
 [AS_HELP_STRING([--enable-selinux],
diff --git a/include/c11/threads_posix.h b/include/c11/threads_posix.h
index 5119773..a48b15f 100644
--- a/include/c11/threads_posix.h
+++ b/include/c11/threads_posix.h
@@ -60,7 +60,6 @@ Configuration macro:
 
 // FIXME: temporary non-standard hack to ease transition
 #define _FULL_MTX_INITIALIZER_NP PTHREAD_MUTEX_INITIALIZER
-#define _MTX_INITIALIZER_NP PTHREAD_MUTEX_INITIALIZER
 
 /* types */
 typedef pthread_cond_t  cnd_t;
@@ -252,8 +251,104 @@ full_mtx_unlock(full_mtx_t *mtx)
 return thrd_success;
 }
 
+#if defined(__GNUC__)  defined(HAVE_FUTEX)
+
+/* mtx_t - Fast, simple mutex
+ *
+ * While modern pthread mutexes are very fast (implemented using futex), they
+ * still incur a call to an external DSO and overhead of the generality and
+ * features of pthread mutexes.  Most mutexes in mesa only needs lock/unlock,
+ * and the idea here is that we can inline the atomic operation and make the
+ * fast case just two intructions.  Mutexes are subtle and finicky to
+ * implement, so we carefully copy the implementation from Ulrich Dreppers
+ * well-written and well-reviewed paper:
+ *
+ *   Futexes Are Tricky
+ *   http://www.akkadia.org/drepper/futex.pdf
+ *
+ * We implement mutex3, which gives us a mutex that has no syscalls on
+ * uncontended lock or unlock.  Further, the uncontended case boils down to a
+ * locked cmpxchg and an untaken branch, the uncontended unlock is just a
+ * locked decr and an untaken branch.  We use __builtin_expect() to indicate
+ * that contention is unlikely so that gcc will put the contention code out of
+ * the main code flow.
+ *
+ * A fast mutex only supports lock/unlock, can't be recursive or used with
+ * condition variables.
+ */
+
+typedef struct {
+   uint32_t val;
+} mtx_t;
+
+#define _MTX_INITIALIZER_NP { 0 }
+
+#include stdint.h
+#include values.h
+#include linux/futex.h
+#include sys/time.h
+#include sys/syscall.h
+
+static inline long sys_futex(void *addr1, int op, int val1, struct timespec 
*timeout, void *addr2, int val3)
+{
+   return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3);
+}
+
+static inline int futex_wake(uint32_t *addr) {
+   return sys_futex(addr, FUTEX_WAKE, 1, NULL, NULL, 0);
+}
+
+static inline int futex_wait(uint32_t *addr, int32_t value) {
+   return sys_futex(addr, FUTEX_WAIT, value, NULL, NULL, 0);
+}
+
+static inline void
+mtx_init(mtx_t *mtx, int type)
+{
+   assert(type == mtx_plain);
+
+   mtx-val = 0;
+}
+
+static inline void
+mtx_destroy(mtx_t *mtx)
+{
+}
+
+static inline void
+mtx_lock(mtx_t *mtx)
+{
+   uint32_t c;
+
+   c = __sync_val_compare_and_swap(mtx-val, 0, 1);
+   if (__builtin_expect(c != 0, 0)) {
+  if (c != 2)
+ c = __sync_lock_test_and_set(mtx-val, 2);
+  while (c != 0) {
+ futex_wait(mtx-val, 2);
+ c = 

Re: [Mesa-dev] [PATCH 1/2] egl: Add unbound_only parameter to _eglReleaseDisplayResources()

2015-01-29 Thread Matt Turner
On Thu, Jan 29, 2015 at 6:03 PM, Chad Versace chad.vers...@intel.com wrote:
 On 01/28/2015 01:45 PM, jim.br...@linux.intel.com wrote:
 From: Jim Bride jim.br...@intel.com

 Add a EGLBoolean parameter unbound_only to _eglReleaseDisplayResources() to
 allow that function to skip releasing contexts or surfaces that are currently
 bound to a display.  Setting unbound_only to EGL_FALSE preserves the original
 behavior of this function.

 Signed-off-by: Jim Bride jim.br...@intel.com
 ---
  src/egl/drivers/dri2/egl_dri2.c |  7 ++-
  src/egl/main/egldisplay.c   | 17 +
  src/egl/main/egldisplay.h   |  3 ++-
  src/gallium/state_trackers/egl/common/egl_g3d.c |  9 -
  4 files changed, 29 insertions(+), 7 deletions(-)

 Reviewed-by: Chad Versace chad.vers...@intel.com

 I'm waiting on fixes to patch 2 before committing patch 1, because
 they don't make sense to commit in independently.

It might be nice as to wait until the patches actually show up on the
mailing list as well.

I suspect Jim isn't subscribed so his patches are stuck in the
moderation queue, but you got them since they were sent directly to
you.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Pick ast_conditional branch regardless of op1/2 being constant.

2015-01-29 Thread Kenneth Graunke
If the ?: operator's condition is a constant value, and both branches
were pure expressions, we can just make the resulting value one or the
other.

Previously, we only did this if op[1] and op[2] were also constant
values - but there's no actual reason for that restriction.

No changes in shader-db, probably because we usually optimize this later
anyway.  But it does make us generate less stupid code up front.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/ast_to_hir.cpp | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 1ba29f7..4d28069 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -1597,13 +1597,11 @@ ast_expression::do_hir(exec_list *instructions,
   }
 
   ir_constant *cond_val = op[0]-constant_expression_value();
-  ir_constant *then_val = op[1]-constant_expression_value();
-  ir_constant *else_val = op[2]-constant_expression_value();
 
   if (then_instructions.is_empty()
else_instructions.is_empty()
-   (cond_val != NULL)  (then_val != NULL)  (else_val != NULL)) {
- result = (cond_val-value.b[0]) ? then_val : else_val;
+   cond_val != NULL) {
+ result = cond_val-value.b[0] ? op[1] : op[2];
   } else {
  ir_variable *const tmp =
 new(ctx) ir_variable(type, conditional_tmp, ir_var_temporary);
-- 
2.2.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/21] main: Added entry points for ClearNamedBuffer[Sub]Data.

2015-01-29 Thread Jason Ekstrand
On Wed, Jan 21, 2015 at 5:40 PM, Laura Ekstrand la...@jlekstrand.net
wrote:

 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  18 +++
  src/mesa/main/bufferobj.c  | 156
 ++---
  src/mesa/main/bufferobj.h  |  34 --
  src/mesa/main/tests/dispatch_sanity.cpp|   2 +
  src/mesa/state_tracker/st_cb_bufferobjects.c   |   4 +-
  5 files changed, 135 insertions(+), 79 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 8c80dfb..4b29e00 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -43,6 +43,24 @@
param name=size type=GLsizeiptr /
 /function

 +   function name=ClearNamedBufferData offset=assign
 +  param name=buffer type=GLuint /
 +  param name=internalformat type=GLenum /
 +  param name=format type=GLenum /
 +  param name=type type=GLenum /
 +  param name=data type=const GLvoid * /
 +   /function
 +
 +   function name=ClearNamedBufferSubData offset=assign
 +  param name=buffer type=GLuint /
 +  param name=internalformat type=GLenum /
 +  param name=offset type=GLintptr /
 +  param name=size type=GLsizeiptr /
 +  param name=format type=GLenum /
 +  param name=type type=GLenum /
 +  param name=data type=const GLvoid * /
 +   /function
 +
!-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 31ae006..1ec1681 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -660,11 +660,11 @@ _mesa_buffer_get_subdata( struct gl_context *ctx,
 GLintptrARB offset,
   * dd_function_table::ClearBufferSubData.
   */
  void
 -_mesa_buffer_clear_subdata(struct gl_context *ctx,
 -   GLintptr offset, GLsizeiptr size,
 -   const GLvoid *clearValue,
 -   GLsizeiptr clearValueSize,
 -   struct gl_buffer_object *bufObj)
 +_mesa_ClearBufferSubData_sw(struct gl_context *ctx,
 +GLintptr offset, GLsizeiptr size,
 +const GLvoid *clearValue,
 +GLsizeiptr clearValueSize,
 +struct gl_buffer_object *bufObj)
  {
 GLsizeiptr i;
 GLubyte *dest;
 @@ -1113,7 +1113,7 @@ _mesa_init_buffer_object_functions(struct
 dd_function_table *driver)
 driver-UnmapBuffer = _mesa_buffer_unmap;

 /* GL_ARB_clear_buffer_object */
 -   driver-ClearBufferSubData = _mesa_buffer_clear_subdata;
 +   driver-ClearBufferSubData = _mesa_ClearBufferSubData_sw;

 /* GL_ARB_map_buffer_range */
 driver-MapBufferRange = _mesa_buffer_map_range;
 @@ -1666,57 +1666,93 @@ _mesa_GetBufferSubData(GLenum target, GLintptr
 offset,
 ctx-Driver.GetBufferSubData( ctx, offset, size, data, bufObj );
  }

 -
 -void GLAPIENTRY
 -_mesa_ClearBufferData(GLenum target, GLenum internalformat, GLenum format,
 -  GLenum type, const GLvoid* data)
 +/**
 + * \param subdata   true if caller is *SubData, false if *Data
 + */
 +void
 +_mesa_clear_buffer_sub_data(struct gl_context *ctx,
 +struct gl_buffer_object *bufObj,
 +GLenum internalformat,
 +GLintptr offset, GLsizeiptr size,
 +GLenum format, GLenum type,
 +const GLvoid *data,
 +const char *func, bool subdata)


Doesn't matter much, but it might be nice if the refactor that pulls the
guts out of ClearBufferData and ClearBufferSubData into one function were
it's own commit and the DSA stuff were added later.  It wasn't obvious what
was going on in my initial scan-through.  Not a big deal though.


  {
 -   GET_CURRENT_CONTEXT(ctx);
 -   struct gl_buffer_object* bufObj;
 mesa_format mesaFormat;
 GLubyte clearValue[MAX_PIXEL_BYTES];
 GLsizeiptr clearValueSize;

 -   bufObj = get_buffer(ctx, glClearBufferData, target,
 GL_INVALID_VALUE);
 -   if (!bufObj) {
 -  return;
 -   }
 -
 -   if (_mesa_check_disallowed_mapping(bufObj)) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION,
 -  glClearBufferData(buffer currently mapped));
 +   /* This checks for disallowed mappings. */
 +   if (!buffer_object_subdata_range_good(ctx, bufObj, offset, size,
 + subdata, func)) {
return;
 }

 mesaFormat = validate_clear_buffer_format(ctx, internalformat,
 - format, type,
 - glClearBufferData);
 + format, type, func);
 +
 if (mesaFormat == MESA_FORMAT_NONE) {
return;
 }

 clearValueSize = 

Re: [Mesa-dev] [PATCH] dir-locals.el: Don't set variables for non-programming modes

2015-01-29 Thread Michel Dänzer
On 30.01.2015 03:52, Neil Roberts wrote:
 This limits the style changes to modes inherited from prog-mode. The
 main reason to do this is to avoid setting fill-column for people
 using Emacs to edit commit messages because 78 characters is too many
 to make it wrap properly in git log. Note that makefile-mode also
 inherits from prog-mode so the fill column should continue to apply
 there.

Sounds good, but can you please make the same change to the other
.dir-locals.el files in the tree as well?


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast | Mesa and X developer
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [RFC] nir/register: Add a parent_instr field

2015-01-29 Thread Jason Ekstrand
This adds a parent_instr field similar to the one for ssa_def.  The
difference here is that the parent_instr field on a nir_register can be
NULL if the register does not have a unique definition or if that
definition does not dominate all its uses.  We set this field in the
out-of-SSA pass so that backends can get SSA-like information even after
they have gone out of SSA.
---

Note: This has not been compile-tested

 src/glsl/nir/nir.c  |  1 +
 src/glsl/nir/nir.h  | 11 ++-
 src/glsl/nir/nir_from_ssa.c |  7 +++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/glsl/nir/nir.c b/src/glsl/nir/nir.c
index 25cdac0..b3b825b 100644
--- a/src/glsl/nir/nir.c
+++ b/src/glsl/nir/nir.c
@@ -61,6 +61,7 @@ reg_create(void *mem_ctx, struct exec_list *list)
 {
nir_register *reg = ralloc(mem_ctx, nir_register);
 
+   reg-parent_instr = NULL;
reg-uses = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
 _mesa_key_pointer_equal);
reg-defs = _mesa_set_create(mem_ctx, _mesa_hash_pointer,
diff --git a/src/glsl/nir/nir.h b/src/glsl/nir/nir.h
index 980fdd0..2c7ffdc 100644
--- a/src/glsl/nir/nir.h
+++ b/src/glsl/nir/nir.h
@@ -64,6 +64,7 @@ name(const in_type *parent)  \
 struct nir_function_overload;
 struct nir_function;
 struct nir_shader;
+struct nir_instr;
 
 
 /**
@@ -384,6 +385,14 @@ typedef struct {
 */
bool is_packed;
 
+   /**
+* If this pointer is non-NULL then this register has exactly one
+* definition and that definition dominates all of its uses.  This is
+* set by the out-of-SSA pass so that backends can get SSA-like
+* information even once they have gone out of SSA.
+*/
+   struct nir_instr *parent_instr;
+
/** set of nir_instr's where this register is used (read from) */
struct set *uses;
 
@@ -406,7 +415,7 @@ typedef enum {
nir_instr_type_parallel_copy,
 } nir_instr_type;
 
-typedef struct {
+typedef struct nir_instr {
struct exec_node node;
nir_instr_type type;
struct nir_block *block;
diff --git a/src/glsl/nir/nir_from_ssa.c b/src/glsl/nir/nir_from_ssa.c
index 2e7add3..f0105fb 100644
--- a/src/glsl/nir/nir_from_ssa.c
+++ b/src/glsl/nir/nir_from_ssa.c
@@ -513,6 +513,13 @@ get_register_for_ssa_def(nir_ssa_def *def, struct 
from_ssa_state *state)
   reg-num_components = def-num_components;
   reg-num_array_elems = 0;
 
+  /* This register comes from an SSA definition that was not part of a
+   * phi-web.  Therefore, we know it has a single unique definition
+   * that dominates all of its uses.  Therefore, we can copy the
+   * parent_instr from the SSA def safely.
+   */
+  reg-parent_instr = def-parent_instr;
+
   _mesa_hash_table_insert(state-ssa_table, def, reg);
   return reg;
}
-- 
2.1.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 14/21] main: Add entry points for MapNamedBuffer[Range].

2015-01-29 Thread Jason Ekstrand
Again, I think this would be more readable if the refactor was in its own
commit with a little more explanation what's going on.  Then add DSA.

On Wed, Jan 21, 2015 at 5:40 PM, Laura Ekstrand la...@jlekstrand.net
wrote:

 ---
  src/mapi/glapi/gen/ARB_direct_state_access.xml |  14 ++
  src/mesa/main/bufferobj.c  | 327
 -
  src/mesa/main/bufferobj.h  |  20 +-
  src/mesa/main/tests/dispatch_sanity.cpp|   2 +
  4 files changed, 189 insertions(+), 174 deletions(-)

 diff --git a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 index 4b29e00..7727f0d 100644
 --- a/src/mapi/glapi/gen/ARB_direct_state_access.xml
 +++ b/src/mapi/glapi/gen/ARB_direct_state_access.xml
 @@ -61,6 +61,20 @@
param name=data type=const GLvoid * /
 /function

 +   function name=MapNamedBuffer offset=assign
 +  return type=GLvoid * /
 +  param name=buffer type=GLuint /
 +  param name=access type=GLenum /
 +   /function
 +
 +   function name=MapNamedBufferRange offset=assign
 +  return type=GLvoid * /
 +  param name=buffer type=GLuint /
 +  param name=offset type=GLintptr /
 +  param name=length type=GLsizeiptr /
 +  param name=access type=GLbitfield /
 +   /function
 +
!-- Texture object functions --

 function name=CreateTextures offset=assign
 diff --git a/src/mesa/main/bufferobj.c b/src/mesa/main/bufferobj.c
 index 1ec1681..16c0faa 100644
 --- a/src/mesa/main/bufferobj.c
 +++ b/src/mesa/main/bufferobj.c
 @@ -701,7 +701,7 @@ _mesa_ClearBufferSubData_sw(struct gl_context *ctx,
   * Called via glMapBufferRange().
   */
  static void *
 -_mesa_buffer_map_range( struct gl_context *ctx, GLintptr offset,
 +_mesa_MapBufferRange_sw(struct gl_context *ctx, GLintptr offset,
  GLsizeiptr length, GLbitfield access,
  struct gl_buffer_object *bufObj,
  gl_map_buffer_index index)
 @@ -1116,7 +1116,7 @@ _mesa_init_buffer_object_functions(struct
 dd_function_table *driver)
 driver-ClearBufferSubData = _mesa_ClearBufferSubData_sw;

 /* GL_ARB_map_buffer_range */
 -   driver-MapBufferRange = _mesa_buffer_map_range;
 +   driver-MapBufferRange = _mesa_MapBufferRange_sw;
 driver-FlushMappedBufferRange = _mesa_buffer_flush_mapped_range;

 /* GL_ARB_copy_buffer */
 @@ -1794,116 +1794,6 @@ _mesa_ClearNamedBufferSubData(GLuint buffer,
 GLenum internalformat,
  }


 -void * GLAPIENTRY
 -_mesa_MapBuffer(GLenum target, GLenum access)
 -{
 -   GET_CURRENT_CONTEXT(ctx);
 -   struct gl_buffer_object * bufObj;
 -   GLbitfield accessFlags;
 -   void *map;
 -   bool valid_access;
 -
 -   ASSERT_OUTSIDE_BEGIN_END_WITH_RETVAL(ctx, NULL);
 -
 -   switch (access) {
 -   case GL_READ_ONLY_ARB:
 -  accessFlags = GL_MAP_READ_BIT;
 -  valid_access = _mesa_is_desktop_gl(ctx);
 -  break;
 -   case GL_WRITE_ONLY_ARB:
 -  accessFlags = GL_MAP_WRITE_BIT;
 -  valid_access = true;
 -  break;
 -   case GL_READ_WRITE_ARB:
 -  accessFlags = GL_MAP_READ_BIT | GL_MAP_WRITE_BIT;
 -  valid_access = _mesa_is_desktop_gl(ctx);
 -  break;
 -   default:
 -  valid_access = false;
 -  break;
 -   }
 -
 -   if (!valid_access) {
 -  _mesa_error(ctx, GL_INVALID_ENUM, glMapBufferARB(access));
 -  return NULL;
 -   }
 -
 -   bufObj = get_buffer(ctx, glMapBufferARB, target,
 GL_INVALID_OPERATION);
 -   if (!bufObj)
 -  return NULL;
 -
 -   if (accessFlags  GL_MAP_READ_BIT 
 -   !(bufObj-StorageFlags  GL_MAP_READ_BIT)) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION,
 -  glMapBuffer(invalid read flag));
 -  return NULL;
 -   }
 -
 -   if (accessFlags  GL_MAP_WRITE_BIT 
 -   !(bufObj-StorageFlags  GL_MAP_WRITE_BIT)) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION,
 -  glMapBuffer(invalid write flag));
 -  return NULL;
 -   }
 -
 -   if (_mesa_bufferobj_mapped(bufObj, MAP_USER)) {
 -  _mesa_error(ctx, GL_INVALID_OPERATION, glMapBufferARB(already
 mapped));
 -  return NULL;
 -   }
 -
 -   if (!bufObj-Size) {
 -  _mesa_error(ctx, GL_OUT_OF_MEMORY,
 -  glMapBuffer(buffer size = 0));
 -  return NULL;
 -   }
 -
 -   ASSERT(ctx-Driver.MapBufferRange);
 -   map = ctx-Driver.MapBufferRange(ctx, 0, bufObj-Size, accessFlags,
 bufObj,
 -MAP_USER);
 -   if (!map) {
 -  _mesa_error(ctx, GL_OUT_OF_MEMORY, glMapBufferARB(map failed));
 -  return NULL;
 -   }
 -   else {
 -  /* The driver callback should have set these fields.
 -   * This is important because other modules (like VBO) might call
 -   * the driver function directly.
 -   */
 -  ASSERT(bufObj-Mappings[MAP_USER].Pointer == map);
 -  ASSERT(bufObj-Mappings[MAP_USER].Length == bufObj-Size);
 -  ASSERT(bufObj-Mappings[MAP_USER].Offset == 0);
 -  bufObj-Mappings[MAP_USER].AccessFlags = accessFlags;
 

Re: [Mesa-dev] [PATCH] glsl: Pick ast_conditional branch regardless of op1/2 being constant.

2015-01-29 Thread Matt Turner
Reviewed-by: Matt Turner matts...@gmail.com
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] mesa: fix display list 8-byte alignment issue

2015-01-29 Thread Brian Paul

Ping.

I'll handle the cherry-pick to 10.4 since a small code change is needed.

-Brian

On 01/27/2015 08:06 PM, Brian Paul wrote:

The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment.  On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.

The solution is to add a new  _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.

The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).

The gears demo and others hit this bug.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: 10.4 mesa-sta...@lists.freedesktop.org
---
  src/mesa/main/dlist.c   | 72 +
  src/mesa/main/dlist.h   |  3 ++
  src/mesa/vbo/vbo_save_api.c |  5 +++-
  3 files changed, 73 insertions(+), 7 deletions(-)

diff --git a/src/mesa/main/dlist.c b/src/mesa/main/dlist.c
index 138d360..dc6070b 100644
--- a/src/mesa/main/dlist.c
+++ b/src/mesa/main/dlist.c
@@ -487,6 +487,7 @@ typedef enum
 /* The following three are meta instructions */
 OPCODE_ERROR,/* raise compiled-in error */
 OPCODE_CONTINUE,
+   OPCODE_NOP,  /* No-op (used for 8-byte alignment */
 OPCODE_END_OF_LIST,
 OPCODE_EXT_0
  } OpCode;
@@ -1012,13 +1013,16 @@ memdup(const void *src, GLsizei bytes)
   * Allocate space for a display list instruction (opcode + payload space).
   * \param opcode  the instruction opcode (OPCODE_* value)
   * \param bytes   instruction payload size (not counting opcode)
- * \return pointer to allocated memory (the opcode space)
+ * \param align8  does the payload need to be 8-byte aligned?
+ *This is only relevant in 64-bit environments.
+ * \return pointer to allocated memory (the payload will be at pointer+1)
   */
  static Node *
-dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes)
+dlist_alloc(struct gl_context *ctx, OpCode opcode, GLuint bytes, bool align8)
  {
 const GLuint numNodes = 1 + (bytes + sizeof(Node) - 1) / sizeof(Node);
 const GLuint contNodes = 1 + POINTER_DWORDS;  /* size of continue info */
+   GLuint nopNode;
 Node *n;

 if (opcode  OPCODE_EXT_0) {
@@ -1032,7 +1036,19 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, 
GLuint bytes)
}
 }

-   if (ctx-ListState.CurrentPos + numNodes + contNodes  BLOCK_SIZE) {
+   if (sizeof(void *) == 8  align8  ctx-ListState.CurrentPos % 2 == 0) {
+  /* The opcode would get placed at node[0] and the payload would start
+   * at node[1].  But the payload needs to be at an even offset (8-byte
+   * multiple).
+   */
+  nopNode = 1;
+   }
+   else {
+  nopNode = 0;
+   }
+
+   if (ctx-ListState.CurrentPos + nopNode + numNodes + contNodes
+BLOCK_SIZE) {
/* This block is full.  Allocate a new block and chain to it */
Node *newblock;
n = ctx-ListState.CurrentBlock + ctx-ListState.CurrentPos;
@@ -1042,13 +1058,34 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, 
GLuint bytes)
   _mesa_error(ctx, GL_OUT_OF_MEMORY, Building display list);
   return NULL;
}
+
+  /* a fresh block should be 8-byte aligned on 64-bit systems */
+  assert(((GLintptr) newblock) % sizeof(void *) == 0);
+
save_pointer(n[1], newblock);
ctx-ListState.CurrentBlock = newblock;
ctx-ListState.CurrentPos = 0;
+
+  /* Display list nodes are always 4 bytes.  If we need 8-byte alignment
+   * we have to insert a NOP so that the payload of the real opcode lands
+   * on an even location:
+   *   node[0] = OPCODE_NOP
+   *   node[1] = OPCODE_x;
+   *   node[2] = start of payload
+   */
+  nopNode = sizeof(void *) == 8  align8;
 }

 n = ctx-ListState.CurrentBlock + ctx-ListState.CurrentPos;
-   ctx-ListState.CurrentPos += numNodes;
+   if (nopNode) {
+  assert(ctx-ListState.CurrentPos % 2 == 0); /* even value */
+  n[0].opcode = OPCODE_NOP;
+  n++;
+  /* The real opcode will now be at an odd location and the payload
+   * will be at an even location.
+   */
+   }
+   ctx-ListState.CurrentPos += nopNode + numNodes;

 n[0].opcode = opcode;

@@ -1069,7 +1106,22 @@ dlist_alloc(struct gl_context *ctx, OpCode opcode, 
GLuint bytes)
  void *
  _mesa_dlist_alloc(struct gl_context *ctx, GLuint opcode, GLuint bytes)
  {
-   Node *n = dlist_alloc(ctx, (OpCode) opcode, bytes);
+   Node *n = dlist_alloc(ctx, (OpCode) opcode, bytes, false);
+   if (n)
+  return n + 1;  /* return pointer to payload area, after opcode */
+   else
+  return NULL;
+}
+
+
+/**
+ * Same as _mesa_dlist_alloc(), but return a pointer which is 8-byte
+ *