[Mesa-dev] [PATCH] i965/fs: Gen4: Zero out extra coordinates when using shadow compare

2013-08-27 Thread Chris Forbes
Fixes broken rendering if these MRFs contained anything other than zero.

NOTE: This is a candidate for stable branches.
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 964ad40..c3eecfa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -868,7 +868,12 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, 
fs_reg coordinate,
 emit(MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate));
 coordinate.reg_offset++;
   }
-  /* gen4's SIMD8 sampler always has the slots for u,v,r present. */
+
+  /* gen4's SIMD8 sampler always has the slots for u,v,r present.
+   * the unused slots must be zeroed.*/
+  for (int i = ir-coordinate-type-vector_elements; i3; i++) {
+ emit(MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f)));
+  }
   mlen += 3;
 
   if (ir-op == ir_tex) {
-- 
1.8.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Gen4: Zero out extra coordinates when using shadow compare

2013-08-27 Thread Chris Forbes
Sorry, saw the bogus comment style just after I sent this out.

-- Chris

On Tue, Aug 27, 2013 at 9:45 PM, Chris Forbes chr...@ijw.co.nz wrote:
 Fixes broken rendering if these MRFs contained anything other than zero.

 NOTE: This is a candidate for stable branches.
 Signed-off-by: Chris Forbes chr...@ijw.co.nz
 ---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 964ad40..c3eecfa 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -868,7 +868,12 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg 
 dst, fs_reg coordinate,
  emit(MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate));
  coordinate.reg_offset++;
}
 -  /* gen4's SIMD8 sampler always has the slots for u,v,r present. */
 +
 +  /* gen4's SIMD8 sampler always has the slots for u,v,r present.
 +   * the unused slots must be zeroed.*/
 +  for (int i = ir-coordinate-type-vector_elements; i3; i++) {
 + emit(MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f)));
 +  }
mlen += 3;

if (ir-op == ir_tex) {
 --
 1.8.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] tgsi dump and parsing

2013-08-27 Thread Jose Fonseca


- Original Message -
 On 08/26/2013 02:38 AM, Dave Airlie wrote:
  Hi TGSI guys mostly :-)
 
  So I'm wondering how circular and perfect tgsi-text-tgsi roundabouts
  should be,
 
 Ideally, they should be consistent.
 
 
  currently the TGSI dump code uses .4f in one place, which makes things
  like 1e6 not make it across the divide, I was thinking of dumping
  immediates in 32-bit hex format so we know for definite what happens
  on the other side,
 
 Yeah, I think we'd have to dump floats as hex to always preserve their
 value.
 
 However, I'd like to maintain the readability of float immediates in
 dumps.  Maybe they could be displayed as a comment.  Something like:
 
 IMM[0] FLT32 { 0x, 0x, 0x, 0x }  # 1.0, 3.0, 2.0, 4.0

If you use %.9g instead of %.4f then floating point numbers will be 
preserved without loss of precision.

Jose

 
 
  I've been thinking of maybe adding a debug option to softpipe to dump
  to text and read it back, to see what other regression lie in wait.
 
 Sounds OK to me but it could be a TGSI debug option that would test
 dumping + reassembling a shader whenever tgsi_dump() is called.
 
 -Brian
 
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 68262] [build error with wayland 1.0.5] wayland-drm.c:110:3: error: implicit declaration of function 'wl_resource_create'

2013-08-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68262

--- Comment #2 from Fabio Pedretti fabio@libero.it ---
Confirming the same error with 1.1.0.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 04/15] mesa: Implement glPushDebugGroup and glPopDebugGroup

2013-08-27 Thread Brian Paul

On 08/26/2013 07:17 PM, Timothy Arceri wrote:

V2: fixed spelling typo in comment

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
  src/mesa/main/errors.c | 275 -
  src/mesa/main/errors.h |   5 +
  src/mesa/main/mtypes.h |   5 +-
  3 files changed, 214 insertions(+), 71 deletions(-)


The updated patches look fine.  Thanks.

Reviewed-by: Brian Paul bri...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 03/15] mesa: Add a clone function to mesa hash

2013-08-27 Thread Brian Paul

On 08/26/2013 07:10 PM, Timothy Arceri wrote:

On 27/08/13 00:51, Brian Paul wrote:

On 08/26/2013 04:43 AM, Timothy Arceri wrote:


Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
  src/mesa/main/hash.c |   26 ++
  src/mesa/main/hash.h |3 +++
  2 files changed, 29 insertions(+)

diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c
index 6591af9..8dde8b1 100644
--- a/src/mesa/main/hash.c
+++ b/src/mesa/main/hash.c
@@ -302,6 +302,32 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table,


  /**
+ * Clone all entries in a hash table, into a new table.
+ *
+ * \param table  the hash table to clone
+ */
+struct _mesa_HashTable *
+_mesa_HashClone(struct _mesa_HashTable *table)


Can that be const qualified?



The cloned tables are to be edited after we clone them. Basically we
just want a copy of whats on the top of the stack when we do a push but
then the we want to be able to be able to add more ids to this copy if
we want to. I have a piglit test for push/popDebugGroup that shows what
I mean.
I will try to polish this up and submit it later today.

Or I'm I miss understanding what you are suggesting?


I'm simply asking if the 'table' function parameter can be 
const-qualified.  Normally, when you clone an object you don't modify 
the original object, so I'd assume that it could be const.


-Brian


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 68262] [build error with wayland 1.2.0] wayland-drm.c:110:3: error: implicit declaration of function 'wl_resource_create'

2013-08-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68262

Fabio Pedretti fabio@libero.it changed:

   What|Removed |Added

Summary|[build error with wayland   |[build error with wayland 
   |1.0.5] wayland-drm.c:110:3: |1.2.0] wayland-drm.c:110:3:
   |error: implicit declaration |error: implicit declaration
   |of function |of function
   |'wl_resource_create'|'wl_resource_create'

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600g/compute: Fix bug in compute memory pool

2013-08-27 Thread Tom Stellard
From: Tom Stellard thomas.stell...@amd.com

When adding a new buffer to the beginning of the memory pool, we were
accidentally deleting the buffer that was first in the buffer list.
This was caused by a bug in the memory pool's linked list
implementation.
---
 src/gallium/drivers/r600/compute_memory_pool.c | 9 ++---
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/r600/compute_memory_pool.c 
b/src/gallium/drivers/r600/compute_memory_pool.c
index 454af90..4846bfe 100644
--- a/src/gallium/drivers/r600/compute_memory_pool.c
+++ b/src/gallium/drivers/r600/compute_memory_pool.c
@@ -337,14 +337,9 @@ void compute_memory_finalize_pending(struct 
compute_memory_pool* pool,
}
} else {
/* Add item to the front of the list */
-   item-next = pool-item_list-next;
-   if (pool-item_list-next) {
-   pool-item_list-next-prev = item;
-   }
+   item-next = pool-item_list;
item-prev = pool-item_list-prev;
-   if (pool-item_list-prev) {
-   pool-item_list-prev-next = item;
-   }
+   pool-item_list-prev = item;
pool-item_list = item;
}
}
-- 
1.7.11.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls

2013-08-27 Thread Ian Romanick

On 08/21/2013 05:57 PM, Dominik Behr wrote:

Fixes a bug where if an uniform array is passed to a function the accesses
to the array are not propagated so later all but the first vector of the
uniform array are removed in parcel_out_uniform_storage resulting in
broken shaders and out of bounds access to arrays in
brw::vec4_visitor::pack_uniform_registers.

Signed-off-by: Dominik Behr db...@chromium.org
---
  src/glsl/link_functions.cpp | 26 ++
  1 file changed, 26 insertions(+)

diff --git a/src/glsl/link_functions.cpp b/src/glsl/link_functions.cpp
index 6b3e154..32e2012 100644
--- a/src/glsl/link_functions.cpp
+++ b/src/glsl/link_functions.cpp
@@ -173,6 +173,32 @@ public:
return visit_continue;
 }

+   virtual ir_visitor_status visit_leave(ir_call *ir)
+   {
+  /* traverse list of function parameters, and for array parameters
+ propagate max_array_access, do it when leaving the node
+ so the childen would propagate their array accesses first */

   children

I also think this comment should be expanded.  On the first read of this 
code, it wasn't obvious to me why this was necessary.  It's worth 
mentioning that the important case is when the reference to the array at 
the call site does not reference a specific element.  In that case the 
access range of the from the callee is propagated back to the array 
passed in.



+  exec_list_iterator sig_param_iter = ir-callee-parameters.iterator();
+  exec_list_iterator param_iter = ir-actual_parameters.iterator();


While iterators are a fine design pattern, the implementation of them in 
the compiler front-end is garbage (and I made that implementation).  We 
stopped writing new code to use the iterators years ago.  There is other 
code that does a similar double-walk of the signature parameters and 
actual parameters lists.


I think the best example is in 
lower_clip_distance_visitor::visit_leave(ir_call *ir) in 
lower_clip_distance.cpp.



+  while (param_iter.has_next()) {
+ ir_variable *sig_param = (ir_variable *) sig_param_iter.get();
+ ir_rvalue *param = (ir_rvalue *) param_iter.get();
+ assert(sig_param);
+ assert(param);
+ if (sig_param-type-is_array()) {
+ir_dereference_variable *deref = param-as_dereference_variable();
+if (deref  deref-var  deref-var-type-is_array()) {
+   if (sig_param-max_array_access  deref-var-max_array_access) 
{
+   deref-var-max_array_access = sig_param-max_array_access;
+   }


This should end up looking more similar to the code in 
call_link_visitor::visit(ir_dereference_variable *ir) around line 200 
(without your patch).


   var-max_array_access =
   MAX2(var-max_array_access, ir-var-max_array_access);


+}
+ }
+ sig_param_iter.next();
+ param_iter.next();
+  }
+  return visit_continue;
+   }
+
 virtual ir_visitor_status visit(ir_dereference_variable *ir)
 {
if (hash_table_find(locals, ir-var) == NULL) {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Gen4: Zero out extra coordinates when using shadow compare

2013-08-27 Thread Kenneth Graunke

On 08/27/2013 02:47 AM, Chris Forbes wrote:

Sorry, saw the bogus comment style just after I sent this out.

-- Chris

On Tue, Aug 27, 2013 at 9:45 PM, Chris Forbes chr...@ijw.co.nz wrote:

Fixes broken rendering if these MRFs contained anything other than zero.

NOTE: This is a candidate for stable branches.
Signed-off-by: Chris Forbes chr...@ijw.co.nz
---
  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 7 ++-
  1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 964ad40..c3eecfa 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -868,7 +868,12 @@ fs_visitor::emit_texture_gen4(ir_texture *ir, fs_reg dst, 
fs_reg coordinate,
  emit(MOV(fs_reg(MRF, base_mrf + mlen + i), coordinate));
  coordinate.reg_offset++;
}
-  /* gen4's SIMD8 sampler always has the slots for u,v,r present. */
+
+  /* gen4's SIMD8 sampler always has the slots for u,v,r present.
+   * the unused slots must be zeroed.*/
+  for (int i = ir-coordinate-type-vector_elements; i3; i++) {


Another instance of forgetting to do this? :(

i  3 please (with spaces).  Either way,

Reviewed-by: Kenneth Graunke kenn...@whitecape.org


+ emit(MOV(fs_reg(MRF, base_mrf + mlen + i), fs_reg(0.0f)));
+  }
mlen += 3;

if (ir-op == ir_tex) {
--
1.8.4


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] update wayland requirement

2013-08-27 Thread Chad Versace

On 08/12/2013 04:48 AM, Fabio Pedretti wrote:

Since 8d29b52 wayland 1.2.0 is required.


Thanks. Committed to master.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 68262] [build error with wayland 1.2.0] wayland-drm.c:110:3: error: implicit declaration of function 'wl_resource_create'

2013-08-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=68262

Fabio Pedretti fabio@libero.it changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Fabio Pedretti fabio@libero.it ---
1.2.0 is now required:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=aa3905423e398e1ba36502ae91339d1303acf77f

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glsl: Allow precision qualifiers for sampler types

2013-08-27 Thread Anuj Phogat
GLSL 1.30 doesn't allow precision qualifiers on sampler types,
but in GLSL ES, sampler types are also allowed. This seems like
an oversight (since the intention of including these in GLSL 1.30
is to allow compatibility with ES shaders).

Currently, Mesa allows default precision qualifiers to be set for
sampler types in GLSL (commit d5948f2). This patch makes it follow
GLSL ES rules and also allow declaring sampler variables with a
precision qualifier in GLSL.

This fixes a shader compilation error in Khronos OpenGL conformance
test depth_texture_mipmap.

Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
 src/glsl/ast_to_hir.cpp | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 192130a..b3d6d8c 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3131,8 +3131,8 @@ ast_declarator_list::hir(exec_list *instructions,
  state-check_precision_qualifiers_allowed(loc);
   }
 
-
-  /* Precision qualifiers only apply to floating point and integer types.
+  /* Precision qualifiers apply to floating point, integer and sampler
+   * types.
*
* From section 4.5.2 of the GLSL 1.30 spec:
*Any floating point or any integer declaration can have the type
@@ -3144,20 +3144,24 @@ ast_declarator_list::hir(exec_list *instructions,
*
* From page 87 of the GLSL ES spec:
*RESOLUTION: Allow sampler types to take a precision qualifier.
+   *
+   * GLSL 1.30 doesn't allow precision qualifiers on sampler types, but
+   * this seems like an oversight (since the intention of including these
+   * in GLSL 1.30 is to allow compatibility with ES shaders). So we allow
+   * int, float, and all sampler types regardless of GLSL version.
*/
   if (this-type-qualifier.precision != ast_precision_none
!var-type-is_float()
!var-type-is_integer()
!var-type-is_record()
-   !(var-type-is_sampler()  state-es_shader)
+   !(var-type-is_sampler())
!(var-type-is_array()
 (var-type-fields.array-is_float()
|| var-type-fields.array-is_integer( {
 
  _mesa_glsl_error(loc, state,
   precision qualifiers apply only to floating point
-  %s types, state-es_shader ? , integer, and 
sampler
-  : and integer);
+  , integer and sampler types);
   }
 
   /* From page 17 (page 23 of the PDF) of the GLSL 1.20 spec:
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Vadim Girlin
We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Tested on evergreen with multiple combinations of backends - no regressions,
fixes some tests:

  default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
  default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
  llvm   - fixes about 25 tests related to depth/stencil
  llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
   regressions cased by reordering of exports in sb)

With this patch, there are no regressions with default+sb vs default.
There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H 
instructions are not placed in the same TEX clause with corresponding SAMPLE_G.

 src/gallium/drivers/r600/r600_shader.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 300b5c4..f7eab76 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
unsigned opcode;
int i, j, k, r = 0;
int next_pos_base = 60, next_param_base = 0;
+   int max_color_exports = MAX2(key.nr_cbufs, 1);
/* Declarations used by llvm code */
bool use_llvm = false;
bool indirect_gprs;
@@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
radeon_llvm_ctx.face_gpr = ctx.face_gpr;
radeon_llvm_ctx.r600_inputs = ctx.shader-input;
radeon_llvm_ctx.r600_outputs = ctx.shader-output;
-   radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1);
+   radeon_llvm_ctx.color_buffer_count = max_color_exports;
radeon_llvm_ctx.chip_class = ctx.bc-chip_class;
radeon_llvm_ctx.fs_color_all = shader-fs_write_all  
(rscreen-chip_class = EVERGREEN);
radeon_llvm_ctx.stream_outputs = so;
@@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
case TGSI_PROCESSOR_FRAGMENT:
if (shader-output[i].name == TGSI_SEMANTIC_COLOR) {
/* never export more colors than the number of 
CBs */
-   if (shader-output[i].sid = key.nr_cbufs) {
+   if (shader-output[i].sid = max_color_exports) 
{
/* skip export */
j--;
continue;
@@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
output[j].type = 
V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
shader-nr_ps_color_exports++;
if (shader-fs_write_all  
(rscreen-chip_class = EVERGREEN)) {
-   for (k = 1; k  key.nr_cbufs; k++) {
+   for (k = 1; k  max_color_exports; k++) 
{
j++;
memset(output[j], 0, 
sizeof(struct r600_bytecode_output));
output[j].gpr = 
shader-output[i].gpr;
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: Allow precision qualifiers for sampler types

2013-08-27 Thread Ian Romanick

On 08/27/2013 10:45 AM, Anuj Phogat wrote:

GLSL 1.30 doesn't allow precision qualifiers on sampler types,
but in GLSL ES, sampler types are also allowed. This seems like
an oversight (since the intention of including these in GLSL 1.30
is to allow compatibility with ES shaders).

Currently, Mesa allows default precision qualifiers to be set for
sampler types in GLSL (commit d5948f2). This patch makes it follow
GLSL ES rules and also allow declaring sampler variables with a
precision qualifier in GLSL.


I think our current behavior is incorrect even in the ES case.  GLSL ES 
3.30 and desktop GLSL 4.40 say the following in section 4.5.3 (Precision 
Qualifiers):


Any floating point or any integer declaration can have the type
preceded by one of these precision qualifiers...

The also both say the following in section 4.5.4 (Default Precision 
Qualifiers):


The precision statement...can be used to establish a default
precision qualifier. The type field can be either int or float
or any of the sampler types...

So I believe

precision mediump sampler2D;

should be legal in all versions, but

uniform mediump sampler2D s;

should not.

Which syntax is the test using?


This fixes a shader compilation error in Khronos OpenGL conformance
test depth_texture_mipmap.

Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
---
  src/glsl/ast_to_hir.cpp | 14 +-
  1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 192130a..b3d6d8c 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -3131,8 +3131,8 @@ ast_declarator_list::hir(exec_list *instructions,
   state-check_precision_qualifiers_allowed(loc);
}

-
-  /* Precision qualifiers only apply to floating point and integer types.
+  /* Precision qualifiers apply to floating point, integer and sampler
+   * types.
 *
 * From section 4.5.2 of the GLSL 1.30 spec:
 *Any floating point or any integer declaration can have the type
@@ -3144,20 +3144,24 @@ ast_declarator_list::hir(exec_list *instructions,
 *
 * From page 87 of the GLSL ES spec:
 *RESOLUTION: Allow sampler types to take a precision qualifier.
+   *
+   * GLSL 1.30 doesn't allow precision qualifiers on sampler types, but
+   * this seems like an oversight (since the intention of including these
+   * in GLSL 1.30 is to allow compatibility with ES shaders). So we allow
+   * int, float, and all sampler types regardless of GLSL version.
 */
if (this-type-qualifier.precision != ast_precision_none
 !var-type-is_float()
 !var-type-is_integer()
 !var-type-is_record()
-   !(var-type-is_sampler()  state-es_shader)
+   !(var-type-is_sampler())
 !(var-type-is_array()
  (var-type-fields.array-is_float()
 || var-type-fields.array-is_integer( {

   _mesa_glsl_error(loc, state,
precision qualifiers apply only to floating point
-  %s types, state-es_shader ? , integer, and 
sampler
-  : and integer);
+  , integer and sampler types);
}

/* From page 17 (page 23 of the PDF) of the GLSL 1.20 spec:



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Roland Scheidegger
Not that I'm qualified to review r600 code, but couldn't you create
different shader variants depending on whether you need alpha test? At
least I would assume shader exports aren't free.

Roland

Am 27.08.2013 19:56, schrieb Vadim Girlin:
 We need to export at least one color if the shader writes it,
 even when nr_cbufs==0.
 
 Signed-off-by: Vadim Girlin vadimgir...@gmail.com
 ---
 
 Tested on evergreen with multiple combinations of backends - no regressions,
 fixes some tests:
 
   default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   llvm   - fixes about 25 tests related to depth/stencil
   llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
regressions cased by reordering of exports in sb)
 
 With this patch, there are no regressions with default+sb vs default.
 There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
 AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H 
 instructions are not placed in the same TEX clause with corresponding 
 SAMPLE_G.
 
  src/gallium/drivers/r600/r600_shader.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)
 
 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 300b5c4..f7eab76 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   unsigned opcode;
   int i, j, k, r = 0;
   int next_pos_base = 60, next_param_base = 0;
 + int max_color_exports = MAX2(key.nr_cbufs, 1);
   /* Declarations used by llvm code */
   bool use_llvm = false;
   bool indirect_gprs;
 @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   radeon_llvm_ctx.face_gpr = ctx.face_gpr;
   radeon_llvm_ctx.r600_inputs = ctx.shader-input;
   radeon_llvm_ctx.r600_outputs = ctx.shader-output;
 - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1);
 + radeon_llvm_ctx.color_buffer_count = max_color_exports;
   radeon_llvm_ctx.chip_class = ctx.bc-chip_class;
   radeon_llvm_ctx.fs_color_all = shader-fs_write_all  
 (rscreen-chip_class = EVERGREEN);
   radeon_llvm_ctx.stream_outputs = so;
 @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   case TGSI_PROCESSOR_FRAGMENT:
   if (shader-output[i].name == TGSI_SEMANTIC_COLOR) {
   /* never export more colors than the number of 
 CBs */
 - if (shader-output[i].sid = key.nr_cbufs) {
 + if (shader-output[i].sid = max_color_exports) 
 {
   /* skip export */
   j--;
   continue;
 @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   output[j].type = 
 V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
   shader-nr_ps_color_exports++;
   if (shader-fs_write_all  
 (rscreen-chip_class = EVERGREEN)) {
 - for (k = 1; k  key.nr_cbufs; k++) {
 + for (k = 1; k  max_color_exports; k++) 
 {
   j++;
   memset(output[j], 0, 
 sizeof(struct r600_bytecode_output));
   output[j].gpr = 
 shader-output[i].gpr;
 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: add flexible shader descriptor management and use it for sampler views

2013-08-27 Thread Marek Olšák
On Mon, Aug 26, 2013 at 8:59 PM, Marko Ristola
marko.rist...@kolumbus.fi wrote:

 Hi


 15.08.2013 13:54, Marek Olšák wrote:

 On Thu, Aug 15, 2013 at 10:27 AM, Christian König
 deathsim...@vodafone.de wrote:

 Am 15.08.2013 05:25, schrieb Marek Olšák:

 (This should be applied before MSAA, which will need to be rebased.)

 It moves all sampler view descriptors to a buffer.
 It supports partial resource updates and it can also unbind resources
 (required for FMASK texturing).

 The buffer contains all sampler view descriptors for one shader stage,
 represented as an array. On top of that, there are N arrays in the
 buffer,
 which are used to emulate context registers as implemented by the
 previous
 ASICs (each array is a context).

 This uses the RCU synchronization approach to avoid read-after-write
 hazards
 as discussed in the thread:
 radeonsi: add FMASK texture binding slots and resource setup

 CP DMA is used to clear the descriptors at context initialization and to
 copy
 the descriptors from one context to the next.

 IMPORTANT:
 128 resource contexts are needed, 64 doesn't work. If I set
 SH_KCACHE_ACTION_ENA before every draw call, only 2 contexts are
 needed.
 I don't have an explanation for this.
 ---



 The idea itself looks really good to me, but we should probably also move
 the all resources and samplers to the new model and then rip out the code
 that stores them directly into the IB.


 I'd like MSAA to land first, but yes, the plan is to eventually move
 all resources and samplers to the new model.



 +/* Emit a CP DMA packet to do a copy from one buffer to another.
 + * The size must fit in bits [20:0]. Notes:
 + *
 + * 1) Set sync to true if you want the 3D engine to wait until CP DMA
 is
 done.
 + *
 + * 2) Set raw_hazard_wait to true if the source data was used as a
 destination
 + *in a previous CP DMA packet. It's for preventing a
 read-after-write
 hazard
 + *between two CP DMA packets.
 + */
 +static void si_emit_cp_dma_copy_buffer(struct r600_context *rctx,
 +  uint64_t dst_va, uint64_t src_va,
 +  unsigned size,
 +  bool sync, bool raw_hazard_wait)
 +{
 +   struct radeon_winsys_cs *cs = rctx-cs;
 +   uint32_t sync_flag = sync ? PKT3_CP_DMA_CP_SYNC : 0;
 +   uint32_t raw_wait = raw_hazard_wait ? PKT3_CP_DMA_CMD_RAW_WAIT :
 0;
 +
 +   assert(size);
 +   assert((size  ((121)-1)) == size);
 +
 +   cs-buf[cs-cdw++] = PKT3(PKT3_CP_DMA, 4, 0);
 +   cs-buf[cs-cdw++] = src_va;/* SRC_ADDR_LO
 [31:0] */
 +   cs-buf[cs-cdw++] = sync_flag | ((src_va  32)  0xff); /*
 CP_SYNC [31] | SRC_ADDR_HI [7:0] */
 +   cs-buf[cs-cdw++] = dst_va;/* DST_ADDR_LO
 [31:0] */
 +   cs-buf[cs-cdw++] = (dst_va  32)  0xff; /* DST_ADDR_HI
 [7:0] */
 +   cs-buf[cs-cdw++] = size | raw_wait;   /* COMMAND
 [29:22]
 | BYTE_COUNT [20:0] */
 +}
 +
 +/* Emit a CP DMA packet to clear a buffer. The size must fit in bits
 [20:0]. */
 +static void si_emit_cp_dma_clear_buffer(struct r600_context *rctx,
 +   uint64_t dst_va, unsigned size,
 +   uint32_t clear_value,
 +   bool sync, bool raw_hazard_wait)
 +{
 +   struct radeon_winsys_cs *cs = rctx-cs;
 +   uint32_t sync_flag = sync ? PKT3_CP_DMA_CP_SYNC : 0;
 +   uint32_t raw_wait = raw_hazard_wait ? PKT3_CP_DMA_CMD_RAW_WAIT :
 0;
 +
 +   assert(size);
 +   assert((size  ((121)-1)) == size);
 +
 +   cs-buf[cs-cdw++] = PKT3(PKT3_CP_DMA, 4, 0);
 +   cs-buf[cs-cdw++] = clear_value;   /* DATA [31:0]
 */
 +   cs-buf[cs-cdw++] = sync_flag | PKT3_CP_DMA_SRC_SEL(2); /*
 CP_SYNC [31] | SRC_SEL[30:29] */
 +   cs-buf[cs-cdw++] = dst_va;/* DST_ADDR_LO
 [31:0] */
 +   cs-buf[cs-cdw++] = (dst_va  32)  0xff; /* DST_ADDR_HI
 [7:0] */
 +   cs-buf[cs-cdw++] = size | raw_wait;   /* COMMAND
 [29:22]
 | BYTE_COUNT [20:0] */
 +}



 Can we use some kind of macro or inline function instead of
 cs-buf[cs-cdw++]  ? That should help of we need to port that over to
 a
 different CS mechanism.


 How about this?

 static INLINE void
 r600_write_value(struct radeon_winsys_cs *cs, unsigned value)
 {
  cs-buf[cs-cdw++] = value;
 }



 And IIRC the CP DMA is identical on all chipset generation (maybe
 excluding
 early R6xx, but I'm not 100% sure of that), so it might be a good idea to
 start sharing code again by putting this under
 src/gallium/drivers/radeon/radeon_cp_dma.c. Not necessary now, but more
 as
 a general idea. What do you think?


 I agree.

 CP DMA is indeed identical on all chipsets. The copying is supported
 since R600 and the clearing is supported since Evergreen.


 Maybe you already thought: One way to emulate clearing is to
 copy with CP DMA from a constant cleared 

Re: [Mesa-dev] [PATCH] glsl: Allow precision qualifiers for sampler types

2013-08-27 Thread Anuj Phogat
On Tue, Aug 27, 2013 at 11:53 AM, Ian Romanick i...@freedesktop.org wrote:
 On 08/27/2013 10:45 AM, Anuj Phogat wrote:

 GLSL 1.30 doesn't allow precision qualifiers on sampler types,
 but in GLSL ES, sampler types are also allowed. This seems like
 an oversight (since the intention of including these in GLSL 1.30
 is to allow compatibility with ES shaders).

 Currently, Mesa allows default precision qualifiers to be set for
 sampler types in GLSL (commit d5948f2). This patch makes it follow
 GLSL ES rules and also allow declaring sampler variables with a
 precision qualifier in GLSL.


 I think our current behavior is incorrect even in the ES case.  GLSL ES 3.30
You mean to say GLSL ES 3.00?
 and desktop GLSL 4.40 say the following in section 4.5.3 (Precision
 Qualifiers):


 Any floating point or any integer declaration can have the type
 preceded by one of these precision qualifiers...

Yes, samplers are now allowed in GLSL 4.4. They were not in GLSL 4.3.

 The also both say the following in section 4.5.4 (Default Precision
 Qualifiers):

 The precision statement...can be used to establish a default
 precision qualifier. The type field can be either int or float
 or any of the sampler types...

 So I believe

 precision mediump sampler2D;

 should be legal in all versions, but

 uniform mediump sampler2D s;

 should not.

Yes, there is no clear statement in GLSL spec which allows:
uniform mediump sampler2D s;

 Which syntax is the test using?

test uses:
uniform mediump sampler2D s;

I haven't yet tested if it is accepted by NVIDIA.

 This fixes a shader compilation error in Khronos OpenGL conformance
 test depth_texture_mipmap.

 Signed-off-by: Anuj Phogat anuj.pho...@gmail.com
 ---
   src/glsl/ast_to_hir.cpp | 14 +-
   1 file changed, 9 insertions(+), 5 deletions(-)

 diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
 index 192130a..b3d6d8c 100644
 --- a/src/glsl/ast_to_hir.cpp
 +++ b/src/glsl/ast_to_hir.cpp
 @@ -3131,8 +3131,8 @@ ast_declarator_list::hir(exec_list *instructions,
state-check_precision_qualifiers_allowed(loc);
 }

 -
 -  /* Precision qualifiers only apply to floating point and integer
 types.
 +  /* Precision qualifiers apply to floating point, integer and
 sampler
 +   * types.
  *
  * From section 4.5.2 of the GLSL 1.30 spec:
  *Any floating point or any integer declaration can have the
 type
 @@ -3144,20 +3144,24 @@ ast_declarator_list::hir(exec_list *instructions,
  *
  * From page 87 of the GLSL ES spec:
  *RESOLUTION: Allow sampler types to take a precision
 qualifier.
 +   *
 +   * GLSL 1.30 doesn't allow precision qualifiers on sampler types,
 but
 +   * this seems like an oversight (since the intention of including
 these
 +   * in GLSL 1.30 is to allow compatibility with ES shaders). So we
 allow
 +   * int, float, and all sampler types regardless of GLSL version.
  */
 if (this-type-qualifier.precision != ast_precision_none
  !var-type-is_float()
  !var-type-is_integer()
  !var-type-is_record()
 -   !(var-type-is_sampler()  state-es_shader)
 +   !(var-type-is_sampler())
  !(var-type-is_array()
   (var-type-fields.array-is_float()
  || var-type-fields.array-is_integer( {

_mesa_glsl_error(loc, state,
 precision qualifiers apply only to floating
 point
 -  %s types, state-es_shader ? , integer, and
 sampler
 -  : and integer);
 +  , integer and sampler types);
 }

 /* From page 17 (page 23 of the PDF) of the GLSL 1.20 spec:


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Marek Olšák
Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
with a Mesa driver that likes to compile shader variants on first use?
It's HORRIBLE.

What the patch does is probably the right solution. At least
alpha-test state changes don't cause shader recompilation and
re-binding, which also negatively affects performance. Ideally we
shouldn't depend on the framebuffer state at all, but we need to
emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
should always be fine with key.nr_cbufs forced to 8 for any shader
without that property. I expect app developers to do the right thing
and not write outputs they don't need.

Marek

On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote:
 Not that I'm qualified to review r600 code, but couldn't you create
 different shader variants depending on whether you need alpha test? At
 least I would assume shader exports aren't free.

 Roland

 Am 27.08.2013 19:56, schrieb Vadim Girlin:
 We need to export at least one color if the shader writes it,
 even when nr_cbufs==0.

 Signed-off-by: Vadim Girlin vadimgir...@gmail.com
 ---

 Tested on evergreen with multiple combinations of backends - no regressions,
 fixes some tests:

   default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   llvm   - fixes about 25 tests related to depth/stencil
   llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
regressions cased by reordering of exports in sb)

 With this patch, there are no regressions with default+sb vs default.
 There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
 AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H
 instructions are not placed in the same TEX clause with corresponding 
 SAMPLE_G.

  src/gallium/drivers/r600/r600_shader.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 300b5c4..f7eab76 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   unsigned opcode;
   int i, j, k, r = 0;
   int next_pos_base = 60, next_param_base = 0;
 + int max_color_exports = MAX2(key.nr_cbufs, 1);
   /* Declarations used by llvm code */
   bool use_llvm = false;
   bool indirect_gprs;
 @@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   radeon_llvm_ctx.face_gpr = ctx.face_gpr;
   radeon_llvm_ctx.r600_inputs = ctx.shader-input;
   radeon_llvm_ctx.r600_outputs = ctx.shader-output;
 - radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1);
 + radeon_llvm_ctx.color_buffer_count = max_color_exports;
   radeon_llvm_ctx.chip_class = ctx.bc-chip_class;
   radeon_llvm_ctx.fs_color_all = shader-fs_write_all  
 (rscreen-chip_class = EVERGREEN);
   radeon_llvm_ctx.stream_outputs = so;
 @@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   case TGSI_PROCESSOR_FRAGMENT:
   if (shader-output[i].name == TGSI_SEMANTIC_COLOR) {
   /* never export more colors than the number of 
 CBs */
 - if (shader-output[i].sid = key.nr_cbufs) {
 + if (shader-output[i].sid = 
 max_color_exports) {
   /* skip export */
   j--;
   continue;
 @@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
 *rscreen,
   output[j].type = 
 V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
   shader-nr_ps_color_exports++;
   if (shader-fs_write_all  
 (rscreen-chip_class = EVERGREEN)) {
 - for (k = 1; k  key.nr_cbufs; k++) {
 + for (k = 1; k  max_color_exports; 
 k++) {
   j++;
   memset(output[j], 0, 
 sizeof(struct r600_bytecode_output));
   output[j].gpr = 
 shader-output[i].gpr;

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g/compute: Don't flush the cs in pipe_context::launch_grid()

2013-08-27 Thread Marek Olšák
Reviewed-by: Marek Olšák marek.ol...@amd.com

Marek

On Tue, Aug 27, 2013 at 3:05 AM, Tom Stellard t...@stellard.net wrote:
 From: Tom Stellard thomas.stell...@amd.com

 This is the state tracker's responsibility.
 ---
  src/gallium/drivers/r600/evergreen_compute.c | 11 +--
  1 file changed, 1 insertion(+), 10 deletions(-)

 diff --git a/src/gallium/drivers/r600/evergreen_compute.c 
 b/src/gallium/drivers/r600/evergreen_compute.c
 index 9b2bae3..4fa3d2f 100644
 --- a/src/gallium/drivers/r600/evergreen_compute.c
 +++ b/src/gallium/drivers/r600/evergreen_compute.c
 @@ -473,6 +473,7 @@ static void compute_emit_cs(struct r600_context *ctx, 
 const uint *block_layout,
   R600_CONTEXT_INV_VERTEX_CACHE |
   R600_CONTEXT_INV_TEX_CACHE;
 r600_flush_emit(ctx);
 +   ctx-flags = 0;

  #if 0
 COMPUTE_DBG(ctx-screen, cdw: %i\n, cs-cdw);
 @@ -481,16 +482,6 @@ static void compute_emit_cs(struct r600_context *ctx, 
 const uint *block_layout,
 }
  #endif

 -   flush_flags = RADEON_FLUSH_ASYNC | RADEON_FLUSH_COMPUTE;
 -   if (ctx-keep_tiling_flags) {
 -   flush_flags |= RADEON_FLUSH_KEEP_TILING_FLAGS;
 -   }
 -
 -   ctx-ws-cs_flush(ctx-rings.gfx.cs, flush_flags, 
 ctx-screen-cs_count++);
 -
 -   ctx-flags = 0;
 -
 -   COMPUTE_DBG(ctx-screen, shader started\n);
  }


 --
 1.7.11.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 50482] git mesa fails to build

2013-08-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=50482

Henry henry_rosario_gonza...@hotmail.com changed:

   What|Removed |Added

 Resolution|FIXED   |WORKSFORME

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls

2013-08-27 Thread Dominik Behr
Hi,
There was a lot of code everywhere that used iterators so it is not obvious
for a new person that the iterators are evil. And they do abstract the
implementation of the storage.
Also, I'd rather use ir_instruction and as_*() calls instead of exec_node
and explicit cast.
I am not sure about MAX2, is it guaranteed to be converted to branchless
conditional assignment?



On Tue, Aug 27, 2013 at 8:26 AM, Ian Romanick i...@freedesktop.org wrote:

 On 08/21/2013 05:57 PM, Dominik Behr wrote:

 Fixes a bug where if an uniform array is passed to a function the accesses
 to the array are not propagated so later all but the first vector of the
 uniform array are removed in parcel_out_uniform_storage resulting in
 broken shaders and out of bounds access to arrays in
 brw::vec4_visitor::pack_**uniform_registers.

 Signed-off-by: Dominik Behr db...@chromium.org
 ---
   src/glsl/link_functions.cpp | 26 ++
   1 file changed, 26 insertions(+)

 diff --git a/src/glsl/link_functions.cpp b/src/glsl/link_functions.cpp
 index 6b3e154..32e2012 100644
 --- a/src/glsl/link_functions.cpp
 +++ b/src/glsl/link_functions.cpp
 @@ -173,6 +173,32 @@ public:
 return visit_continue;
  }

 +   virtual ir_visitor_status visit_leave(ir_call *ir)
 +   {
 +  /* traverse list of function parameters, and for array parameters
 + propagate max_array_access, do it when leaving the node
 + so the childen would propagate their array accesses first */

children

 I also think this comment should be expanded.  On the first read of this
 code, it wasn't obvious to me why this was necessary.  It's worth
 mentioning that the important case is when the reference to the array at
 the call site does not reference a specific element.  In that case the
 access range of the from the callee is propagated back to the array passed
 in.


  +  exec_list_iterator sig_param_iter = ir-callee-parameters.**
 iterator();
 +  exec_list_iterator param_iter = ir-actual_parameters.**
 iterator();


 While iterators are a fine design pattern, the implementation of them in
 the compiler front-end is garbage (and I made that implementation).  We
 stopped writing new code to use the iterators years ago.  There is other
 code that does a similar double-walk of the signature parameters and actual
 parameters lists.

 I think the best example is in 
 lower_clip_distance_visitor::**visit_leave(ir_call
 *ir) in lower_clip_distance.cpp.


  +  while (param_iter.has_next()) {
 + ir_variable *sig_param = (ir_variable *) sig_param_iter.get();
 + ir_rvalue *param = (ir_rvalue *) param_iter.get();
 + assert(sig_param);
 + assert(param);
 + if (sig_param-type-is_array()) {
 +ir_dereference_variable *deref = param-as_dereference_**
 variable();
 +if (deref  deref-var  deref-var-type-is_array()) {
 +   if (sig_param-max_array_access 
 deref-var-max_array_access) {
 +   deref-var-max_array_access =
 sig_param-max_array_access;
 +   }


 This should end up looking more similar to the code in
 call_link_visitor::visit(ir_**dereference_variable *ir) around line 200
 (without your patch).

var-max_array_access =
MAX2(var-max_array_access, ir-var-max_array_access);


  +}
 + }
 + sig_param_iter.next();
 + param_iter.next();
 +  }
 +  return visit_continue;
 +   }
 +
  virtual ir_visitor_status visit(ir_dereference_variable *ir)
  {
 if (hash_table_find(locals, ir-var) == NULL) {



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-27 Thread Paul Berry
On 26 August 2013 17:49, Ian Romanick i...@freedesktop.org wrote:

 On 08/23/2013 02:02 PM, Paul Berry wrote:

 On 23 August 2013 13:19, Matt Turner matts...@gmail.com
 mailto:matts...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com
 mailto:stereotype441@gmail.**com stereotype...@gmail.com wrote:
   On 22 August 2013 16:08, Matt Turner matts...@gmail.com
 mailto:matts...@gmail.com wrote:
  
   ---
src/glsl/builtins/ir/frexp.ir http://frexp.ir
| 25
   +
src/glsl/builtins/ir/ldexp.ir http://ldexp.ir

| 25
   +
src/glsl/builtins/profiles/**ARB_gpu_shader5.glsl | 10
 ++
3 files changed, 60 insertions(+)
create mode 100644 src/glsl/builtins/ir/frexp.ir 
 http://frexp.ir
create mode 100644 src/glsl/builtins/ir/ldexp.ir 
 http://ldexp.ir
  
   diff --git a/src/glsl/builtins/ir/frexp.**ir http://frexp.ir 
 http://frexp.ir
 b/src/glsl/builtins/ir/frexp.**ir http://frexp.ir http://frexp.ir

   new file mode 100644
   index 000..a514994
   --- /dev/null
   +++ b/src/glsl/builtins/ir/frexp.**ir http://frexp.ir 
 http://frexp.ir

   @@ -0,0 +1,25 @@
   +((function frexp
   +   (signature float
   + (parameters
   +   (declare (in) float x)
   +   (declare (out) int exp))
   + ((return (expression float frexp (var_ref x) (var_ref
 exp)
  
  
   Having an ir_expression that writes to one of its parameters is
 going to
   break assumptions in a lot of our optimization passes.

 I'm concerned that that may be a problem we have to solve anyway.

 While our hardware doesn't support an frexp instruction (like e.g.,
 AMD does) and we could probably do what you suggest, we do have
 instructions that correspond directly to the uaddCarry() and
 usubBorrow() built-ins in this same extension. They return a value and
 also have an out parameter.

 genUType uaddCarry(genUType x, genUType y, out genUType carry);
 genUType usubBorrow(genUType x, genUType y, out genUType borrow);

 We could probably avoid the problem you describe by lowering them, but
 it's feeling increasingly distasteful.

 Your code would make a good piglit test. I'll do some experiments.


 Hmm, interesting.

 The way LLVM solves this problem, as I understand it, is through
 so-called intrinsic functions
 (http://llvm.org/docs/LangRef.**html#intrinsic-functionshttp://llvm.org/docs/LangRef.html#intrinsic-functions).
  I wonder if we
 should start doing that in Mesa.

 Briefly, here is what it would look like, using uaddCarry as an example:

 1. First we do an inefficient implementation of uaddCarry in terms of
 existing GLSL functions, much like you did for frexp in your
 frexp_to_arith lowering pass, except that we do it in
 src/glsl/builtins/glsl/**uaddCarry.glsl, so it's a little easier to
 review
 :).  Optimization passes already deal with function out parameters
 properly, and function inlining automatically splices in the proper code
 during linking.

 2. For back-ends that don't have an efficient native way to do
 uaddCarry, we're done.  The uaddCarry function works as is.

 3. For back-ends that do have an efficient way to do uaddCarry, we add a
 mechanism to allow the back-end to tell the linker: don't inline the
 definition of this built-in.  Just leave it as an ir_call because I have
 my own special implementation of it*.


 I had thought about solving this in a slightly different way, but there
 are a couple potential tricky bits.

 Provide an implementation of the built-in function in the GLSL library.

 float frexp(float x, out int exponent)
 {
 return __intrinsic_frexp(x, exponent);
 }

 Provide a default implementation of the intrinsic elsewhere.

 Allow drivers to supply an alternate library with custom versions of the
 intrinsics.

 Since the GLSL library's frexp is the same in either case, the problem
 Paul identifies below should be avoided.

 The tricky bit, and the problem we always come to when talking about
 intrinsics is dealing with constant expressions.  That doesn't (shouldn't?)
 apply to this case because of the out parameter, but it may apply to other
 cases.


Yeah, good point about constant expressions.  With my proposal, that could
be addressed by having the constant expression evaluator always recurse
into the GLSL implementation, regardless of whether the function is an
intrinsic (this should be fine, since the only reason for the intrinsic
version of the function to be used is to take advantage of efficient
instructions in the GPU).

I confess that I don't understand the rest of your proposal as well as I
would like.  Maybe the three of us should discuss it in person next time
we're in the office.



 Right now an application could do:

 float foo[packUnorm2x16(vec2(1,0))];

 If 

Re: [Mesa-dev] [PATCH 03/15] i965/fs: Add support for translating ir_triop_fma into MAD.

2013-08-27 Thread Paul Berry
On 26 August 2013 22:00, Matt Turner matts...@gmail.com wrote:

 Thanks Paul.

 I've placed this patch as 2.5/15 in the series.

 i965/fs: Assert that ir_expressions are usable by 3-src instructions

 MAD will be generated directly from ir_triop_fma, so this assertion
 checks that all ir_expressions are usable.


Reviewed-by: Paul Berry stereotype...@gmail.com


 ---
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 index 27887d6..a02d92d 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
 @@ -336,6 +336,7 @@ fs_visitor::visit(ir_expression *ir)
  ir-operands[operand]-print();
   printf(\n);
}
 +  assert(this-result.is_valid_3src());


I spent a while trying to think of a commet that could go above the assert,
to help someone who stumbles across it in the future and wonders why the
assertion is here rather than in the switch statement below, but I couldn't
really think of anything :(


op[operand] = this-result;

/* Matrix expression operands should have been broken down to vector

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Vadim Girlin

On 08/27/2013 11:00 PM, Roland Scheidegger wrote:

Not that I'm qualified to review r600 code, but couldn't you create
different shader variants depending on whether you need alpha test? At
least I would assume shader exports aren't free.


I thought about performance, but my main concern now is to avoid serious 
regressions after enabling sb, we can try to improve it later.


Even if we won't emit this color export, we'll have fake export (with 
all color components masked) instead, and I'm not sure whether it's 
cheaper. Possibly hardware can see that there is no actual memory write, 
but benchmarks are needed to prove it.


Also there is another possible improvement for exports - sometimes we 
need to export depth/stencil but no colors, probably we can get rid of 
fake color export as well in such cases. Anyway, this also needs 
additional testing/benchmarking.


Vadim



Roland

Am 27.08.2013 19:56, schrieb Vadim Girlin:

We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Tested on evergreen with multiple combinations of backends - no regressions,
fixes some tests:

   default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   llvm   - fixes about 25 tests related to depth/stencil
   llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
regressions cased by reordering of exports in sb)

With this patch, there are no regressions with default+sb vs default.
There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H
instructions are not placed in the same TEX clause with corresponding SAMPLE_G.

  src/gallium/drivers/r600/r600_shader.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 300b5c4..f7eab76 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
unsigned opcode;
int i, j, k, r = 0;
int next_pos_base = 60, next_param_base = 0;
+   int max_color_exports = MAX2(key.nr_cbufs, 1);
/* Declarations used by llvm code */
bool use_llvm = false;
bool indirect_gprs;
@@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
radeon_llvm_ctx.face_gpr = ctx.face_gpr;
radeon_llvm_ctx.r600_inputs = ctx.shader-input;
radeon_llvm_ctx.r600_outputs = ctx.shader-output;
-   radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1);
+   radeon_llvm_ctx.color_buffer_count = max_color_exports;
radeon_llvm_ctx.chip_class = ctx.bc-chip_class;
radeon_llvm_ctx.fs_color_all = shader-fs_write_all  
(rscreen-chip_class = EVERGREEN);
radeon_llvm_ctx.stream_outputs = so;
@@ -1440,7 +1441,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
case TGSI_PROCESSOR_FRAGMENT:
if (shader-output[i].name == TGSI_SEMANTIC_COLOR) {
/* never export more colors than the number of 
CBs */
-   if (shader-output[i].sid = key.nr_cbufs) {
+   if (shader-output[i].sid = max_color_exports) 
{
/* skip export */
j--;
continue;
@@ -1450,7 +1451,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
output[j].type = 
V_SQ_CF_ALLOC_EXPORT_WORD0_SQ_EXPORT_PIXEL;
shader-nr_ps_color_exports++;
if (shader-fs_write_all  (rscreen-chip_class 
= EVERGREEN)) {
-   for (k = 1; k  key.nr_cbufs; k++) {
+   for (k = 1; k  max_color_exports; k++) 
{
j++;
memset(output[j], 0, 
sizeof(struct r600_bytecode_output));
output[j].gpr = 
shader-output[i].gpr;


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Vadim Girlin

On 08/28/2013 12:43 AM, Marek Olšák wrote:

Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
with a Mesa driver that likes to compile shader variants on first use?
It's HORRIBLE.


I don't think that shader variants are bad, but it's definitely bad when 
we are compiling variants that are never used. Currently glxgears 
compiles 18 ps/vs shaders. In my branch with initial GS support [1] I 
switched handling of the shaders to deferred compilation, that is, 
shaders are compiled only before the actual draw. I found later that 
it's not really required for GS, but IIRC this change results in only 5 
shaders being compiled for glxgears instead of 18. It seems most of the 
useless variants are results of state changes between creation of the 
shader state (initial compilation) and actual draw call.


I had some concerns about increased overhead with those changes, and 
it's actually noticeable with drawoverhead demo, but I didn't see any 
regressions with a few real apps that I tested, e.g. glxgears even 
showed slightly better performance with these changes. Probably I also 
implemented it in a not very optimal way (I was mostly concentrated on 
GS support) and the overhead can be reduced.


One more thing is duplicate shaders, I've analyzed shader dumps from 
Unigine Heaven 3.0 some time ago and found that from about 320 compiled 
shaders, only about 180 (50%) were unique, others were duplicates 
(detected by comparing the bytecode dumps for them in an automated way), 
maybe they had different shader keys (which still resulted in the same 
bytecode), but I suspect duplicate pipe shaders were also involved. 
Unfortunately I didn't have a time to investigate it more thoroughly 
since then.


So my point is that we don't really need to eliminate shader variants, 
first we need to eliminate compilation of unused variants and duplicate 
shaders. Also we might want to consider offloading of the compilation to 
separate thread(s) and caching of shader binaries between runs.


Vadim

 [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders



What the patch does is probably the right solution. At least
alpha-test state changes don't cause shader recompilation and
re-binding, which also negatively affects performance. Ideally we
shouldn't depend on the framebuffer state at all, but we need to
emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
should always be fine with key.nr_cbufs forced to 8 for any shader
without that property. I expect app developers to do the right thing
and not write outputs they don't need.

Marek

On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com wrote:

Not that I'm qualified to review r600 code, but couldn't you create
different shader variants depending on whether you need alpha test? At
least I would assume shader exports aren't free.

Roland

Am 27.08.2013 19:56, schrieb Vadim Girlin:

We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Tested on evergreen with multiple combinations of backends - no regressions,
fixes some tests:

   default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
   llvm   - fixes about 25 tests related to depth/stencil
   llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
regressions cased by reordering of exports in sb)

With this patch, there are no regressions with default+sb vs default.
There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
AFAICS it's a problem with llvm backend uncovered by sb - SET_GRADIENTS_V/H
instructions are not placed in the same TEX clause with corresponding SAMPLE_G.

  src/gallium/drivers/r600/r600_shader.c | 7 ---
  1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 300b5c4..f7eab76 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
   unsigned opcode;
   int i, j, k, r = 0;
   int next_pos_base = 60, next_param_base = 0;
+ int max_color_exports = MAX2(key.nr_cbufs, 1);
   /* Declarations used by llvm code */
   bool use_llvm = false;
   bool indirect_gprs;
@@ -1130,7 +1131,7 @@ static int r600_shader_from_tgsi(struct r600_screen 
*rscreen,
   radeon_llvm_ctx.face_gpr = ctx.face_gpr;
   radeon_llvm_ctx.r600_inputs = ctx.shader-input;
   radeon_llvm_ctx.r600_outputs = ctx.shader-output;
- radeon_llvm_ctx.color_buffer_count = MAX2(key.nr_cbufs , 1);
+ radeon_llvm_ctx.color_buffer_count = max_color_exports;
   radeon_llvm_ctx.chip_class = ctx.bc-chip_class;
   radeon_llvm_ctx.fs_color_all = 

Re: [Mesa-dev] [PATCH 05/15] glsl: Add new {fr, ld}exp built-ins IR and prototypes.

2013-08-27 Thread Matt Turner
On Mon, Aug 26, 2013 at 5:49 PM, Ian Romanick i...@freedesktop.org wrote:
 On 08/23/2013 02:02 PM, Paul Berry wrote:

 On 23 August 2013 13:19, Matt Turner matts...@gmail.com
 mailto:matts...@gmail.com wrote:

 On Fri, Aug 23, 2013 at 8:55 AM, Paul Berry stereotype...@gmail.com
 mailto:stereotype...@gmail.com wrote:
   On 22 August 2013 16:08, Matt Turner matts...@gmail.com
 mailto:matts...@gmail.com wrote:
  
   ---
src/glsl/builtins/ir/frexp.ir http://frexp.ir
| 25
   +
src/glsl/builtins/ir/ldexp.ir http://ldexp.ir

| 25
   +
src/glsl/builtins/profiles/ARB_gpu_shader5.glsl | 10 ++
3 files changed, 60 insertions(+)
create mode 100644 src/glsl/builtins/ir/frexp.ir
 http://frexp.ir
create mode 100644 src/glsl/builtins/ir/ldexp.ir
 http://ldexp.ir
  
   diff --git a/src/glsl/builtins/ir/frexp.ir http://frexp.ir
 b/src/glsl/builtins/ir/frexp.ir http://frexp.ir

   new file mode 100644
   index 000..a514994
   --- /dev/null
   +++ b/src/glsl/builtins/ir/frexp.ir http://frexp.ir

   @@ -0,0 +1,25 @@
   +((function frexp
   +   (signature float
   + (parameters
   +   (declare (in) float x)
   +   (declare (out) int exp))
   + ((return (expression float frexp (var_ref x) (var_ref
 exp)
  
  
   Having an ir_expression that writes to one of its parameters is
 going to
   break assumptions in a lot of our optimization passes.

 I'm concerned that that may be a problem we have to solve anyway.

 While our hardware doesn't support an frexp instruction (like e.g.,
 AMD does) and we could probably do what you suggest, we do have
 instructions that correspond directly to the uaddCarry() and
 usubBorrow() built-ins in this same extension. They return a value and
 also have an out parameter.

 genUType uaddCarry(genUType x, genUType y, out genUType carry);
 genUType usubBorrow(genUType x, genUType y, out genUType borrow);

 We could probably avoid the problem you describe by lowering them, but
 it's feeling increasingly distasteful.

 Your code would make a good piglit test. I'll do some experiments.


 Hmm, interesting.

 The way LLVM solves this problem, as I understand it, is through
 so-called intrinsic functions
 (http://llvm.org/docs/LangRef.html#intrinsic-functions).  I wonder if we
 should start doing that in Mesa.

 Briefly, here is what it would look like, using uaddCarry as an example:

 1. First we do an inefficient implementation of uaddCarry in terms of
 existing GLSL functions, much like you did for frexp in your
 frexp_to_arith lowering pass, except that we do it in
 src/glsl/builtins/glsl/uaddCarry.glsl, so it's a little easier to review
 :).  Optimization passes already deal with function out parameters
 properly, and function inlining automatically splices in the proper code
 during linking.

 2. For back-ends that don't have an efficient native way to do
 uaddCarry, we're done.  The uaddCarry function works as is.

 3. For back-ends that do have an efficient way to do uaddCarry, we add a
 mechanism to allow the back-end to tell the linker: don't inline the
 definition of this built-in.  Just leave it as an ir_call because I have
 my own special implementation of it*.


 I had thought about solving this in a slightly different way, but there are
 a couple potential tricky bits.

 Provide an implementation of the built-in function in the GLSL library.

 float frexp(float x, out int exponent)
 {
 return __intrinsic_frexp(x, exponent);
 }

 Provide a default implementation of the intrinsic elsewhere.

 Allow drivers to supply an alternate library with custom versions of the
 intrinsics.

 Since the GLSL library's frexp is the same in either case, the problem Paul
 identifies below should be avoided.

 The tricky bit, and the problem we always come to when talking about
 intrinsics is dealing with constant expressions.  That doesn't (shouldn't?)
 apply to this case because of the out parameter, but it may apply to other
 cases.

Maybe this is a problem in the general case, but I think the only
thing we'd want to use intrinsics for at the moment are exactly the
things you can't consider to be constant expressions -- because of the
multiple outputs.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [RFC PATCH] i965: Allow C++ type safety in the use of enum brw_urb_write_flags.

2013-08-27 Thread Eric Anholt
Francisco Jerez curroje...@riseup.net writes:

 Ian Romanick i...@freedesktop.org writes:

 On 08/26/2013 01:10 PM, Francisco Jerez wrote:
[...]
 The thing is that defining bitwise operators separately for each enum
 type, as this patch and the macro solution do, doesn't stop the compiler
  From using the corresponding built-in integer operators when it doesn't
 find a match among the user-defined ones.  That means that if we have
 two bitfield enumerants from two different disjoint types
 e.g. SEASON_OF_THE_YEAR_SUMMER and a CPU_ARCHITECTURE_I386, the
 compiler is still going to accept expressions like
 SEASON_OF_THE_YEAR_SUMMER | CPU_ARCHITECTURE_I386, which might not be
 what had been expected if the BRW_CXX_ENUM_OPS macro was used in an
 attempt to improve the code's type safety.

 This sounds insane.

 If there are no operator overloads, the compiler rejects:

 enum foo f(enum foo a, enum foo b)
 {
 return a | b;
 }


 Here both a and b would be implicitly converted to integers which are
 then or'ed together using the built-in '|' operator yielding an integer.
 Using the result as return value of your enum foo function fails
 because in C++ integers are not implicitly convertible to enum types.

 Then we add operloads:

 enum foo operator|(enum foo, enum foo);
 enum bar operator|(enum bar, enum bar);

 And now the compiler will accept:

 unsigned f(enum foo a, enum bar b)
 {
 return a | b;
 }

 That's accepted because both a and b are converted to integers as
 before, giving an integer as result, which can be implicitly converted
 to your unsigned return type just fine.  It's easy to avoid that though
 by defining a general overload:

 templatetypename T, typename S T operator|(T, S);

 for any bitfield types T and S that includes a static assertion
 within its definition to make sure that both types are indeed the same.

 In your example above the compiler would find an exact match during
 overload resolution (our operator|foo, bar(foo, bar);) so it
 wouldn't have to bother with implicit argument conversions.  The
 instantiation of that template would fail thanks to the static assertion
 requiring both T and S to be equal.

 I hope this makes the idea a bit clearer.

 That can't be right.  Am I missing something?  Or am I reinforcing my
 point about not being ready for this level of C++ ninjitsu...

This is way overcomplicated.  I think we're doing just fine with the
status quo of #defines for our bitfields with namespaced names.


pgptvgKWfB3KF.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] i965: Silece usused variable warning in release build

2013-08-27 Thread Chad Versace
Use `(void) success;` to silence this warning:

  i965/brw_vs.c:481:12:
  warning: unused variable 'success' [-Wunused-variable]
 bool success = do_vs_prog(brw, ctx-Shader.CurrentVertexProgram,

Signed-off-by: Chad Versace chad.vers...@linux.intel.com
---
 src/mesa/drivers/dri/i965/brw_vs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index c4e016b..e0adc1d 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -480,7 +480,7 @@ static void brw_upload_vs_prog(struct brw_context *brw)
 brw-vs.prog_offset, brw-vs.prog_data)) {
   bool success = do_vs_prog(brw, ctx-Shader.CurrentVertexProgram,
vp, key);
-
+  (void) success;
   assert(success);
}
if (memcmp(brw-vs.prog_data-base.vue_map, brw-vue_map_geom_out,
-- 
1.8.3.1

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls

2013-08-27 Thread Matt Turner
On Tue, Aug 27, 2013 at 2:03 PM, Dominik Behr db...@chromium.org wrote:
 I am not sure about MAX2, is it guaranteed to be converted to branchless
 conditional assignment?

I don't think there's a guarantee, no. Using MAX2 just makes the code
easier to immediately understand.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 3/3] mesa/main: Check for 0 size draws after validation.

2013-08-27 Thread Paul Berry
On 18 July 2013 10:30, Matt Turner matts...@gmail.com wrote:

 On Sat, May 25, 2013 at 4:33 AM, Fabian Bieler fabianbie...@fastmail.fm
 wrote:
  When validating draw parameters move check for 0 draw count last
  (drawing with count 0 is not an error), so that other parameters (e.g.:
 the
  primitive type) are validated and the correct errors (if applicable) are
  generated.
 
  From the OpenGL 3.3 spec page 33 (page 48 of the PDF):
  [Regarding DrawArraysOneInstance, in terms of which other draw
 operations
  are defined:]
  If count is negative, an INVALID_VALUE error is generated.
 
  This patch also changes the bahavior of MultiDrawElements to perform the
 draw
  operation if some primitive's index counts are zero.
 
  Signed-off-by: Fabian Bieler fabianbie...@fastmail.fm
  ---

 Was this committed? I don't see it.
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev


I've just committed it.  Thanks for the reminder, Matt!
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 2/8] intel: Add a batch flush between front-buffer downsample and X protocol.

2013-08-27 Thread Eric Anholt
This was already happening because blorp happens to flush at the end of
every call, but we have been talking about removing that at some point,
and this would surely get overlooked.

v2 (Kenneth Graunke): Rebase on latest master.  Note that we did remove
   the other flush, and this change actually did get overlooked!

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_context.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/mesa/drivers/dri/i965/intel_context.c 
b/src/mesa/drivers/dri/i965/intel_context.c
index 9a089bf..0f1639f 100644
--- a/src/mesa/drivers/dri/i965/intel_context.c
+++ b/src/mesa/drivers/dri/i965/intel_context.c
@@ -144,6 +144,7 @@ intel_flush_front(struct gl_context *ctx)
   * performance.
   */
  intel_resolve_for_dri2_flush(brw, driDrawable);
+ intel_batchbuffer_flush(brw);
 
  screen-dri2.loader-flushFrontBuffer(driDrawable,
driDrawable-loaderPrivate);
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 4/8] intel: Reuse intel_glFlush().

2013-08-27 Thread Eric Anholt
v2 (Kenneth Graunke): Rebase on latest master.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_context.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_context.c 
b/src/mesa/drivers/dri/i965/intel_context.c
index 0f1639f..4f96989 100644
--- a/src/mesa/drivers/dri/i965/intel_context.c
+++ b/src/mesa/drivers/dri/i965/intel_context.c
@@ -366,8 +366,7 @@ intelFinish(struct gl_context * ctx)
 {
struct brw_context *brw = brw_context(ctx);
 
-   intel_batchbuffer_flush(brw);
-   intel_flush_front(ctx);
+   intel_glFlush(ctx);
 
if (brw-batch.last_bo)
   drm_intel_bo_wait_rendering(brw-batch.last_bo);
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/8] i965: Directly call intel_batchbuffer_flush() after i915 split.

2013-08-27 Thread Eric Anholt
intel_flush() now did nothing except call through (and
intel_batchbuffer_flush() does the no-op check, too!)
---
 src/mesa/drivers/dri/i965/gen6_blorp.cpp |  4 +---
 src/mesa/drivers/dri/i965/intel_buffer_objects.c |  2 +-
 src/mesa/drivers/dri/i965/intel_context.c| 17 -
 src/mesa/drivers/dri/i965/intel_context.h|  3 ---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  3 +--
 src/mesa/drivers/dri/i965/intel_pixel_copy.c |  3 ++-
 src/mesa/drivers/dri/i965/intel_syncobj.c|  2 +-
 7 files changed, 10 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp 
b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index 1c85921..da523e5 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -49,12 +49,10 @@ void
 gen6_blorp_emit_batch_head(struct brw_context *brw,
const brw_blorp_params *params)
 {
-   struct gl_context *ctx = brw-ctx;
-
/* To ensure that the batch contains only the resolve, flush the batch
 * before beginning and after finishing emitting the resolve packets.
 */
-   intel_flush(ctx);
+   intel_batchbuffer_flush(brw);
 }
 
 
diff --git a/src/mesa/drivers/dri/i965/intel_buffer_objects.c 
b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
index 663cc29..21d3727 100644
--- a/src/mesa/drivers/dri/i965/intel_buffer_objects.c
+++ b/src/mesa/drivers/dri/i965/intel_buffer_objects.c
@@ -291,7 +291,7 @@ intel_bufferobj_map_range(struct gl_context * ctx,
 } else {
 perf_debug(Stalling on the GPU for mapping a busy buffer 
object\n);
-   intel_flush(ctx);
+   intel_batchbuffer_flush(brw);
 }
   } else if (drm_intel_bo_busy(intel_obj-buffer) 
 (access  GL_MAP_INVALIDATE_BUFFER_BIT)) {
diff --git a/src/mesa/drivers/dri/i965/intel_context.c 
b/src/mesa/drivers/dri/i965/intel_context.c
index 37c1770..9a089bf 100644
--- a/src/mesa/drivers/dri/i965/intel_context.c
+++ b/src/mesa/drivers/dri/i965/intel_context.c
@@ -349,21 +349,12 @@ intelInvalidateState(struct gl_context * ctx, GLuint 
new_state)
brw-NewGLState |= new_state;
 }
 
-void
-_intel_flush(struct gl_context *ctx, const char *file, int line)
-{
-   struct brw_context *brw = brw_context(ctx);
-
-   if (brw-batch.used)
-  _intel_batchbuffer_flush(brw, file, line);
-}
-
 static void
 intel_glFlush(struct gl_context *ctx)
 {
struct brw_context *brw = brw_context(ctx);
 
-   intel_flush(ctx);
+   intel_batchbuffer_flush(brw);
intel_flush_front(ctx);
if (brw-is_front_buffer_rendering)
   brw-need_throttle = true;
@@ -374,7 +365,7 @@ intelFinish(struct gl_context * ctx)
 {
struct brw_context *brw = brw_context(ctx);
 
-   intel_flush(ctx);
+   intel_batchbuffer_flush(brw);
intel_flush_front(ctx);
 
if (brw-batch.last_bo)
@@ -816,7 +807,7 @@ intel_query_dri2_buffers(struct brw_context *brw,
* query, we need to make sure all the pending drawing has landed in the
* real front buffer.
*/
-  intel_flush(brw-ctx);
+  intel_batchbuffer_flush(brw);
   intel_flush_front(brw-ctx);
 
   attachments[i++] = __DRI_BUFFER_FRONT_LEFT;
@@ -828,7 +819,7 @@ intel_query_dri2_buffers(struct brw_context *brw,
* So before doing the query, make sure all the pending drawing has
* landed in the real front buffer.
*/
-  intel_flush(brw-ctx);
+  intel_batchbuffer_flush(brw);
   intel_flush_front(brw-ctx);
}
 
diff --git a/src/mesa/drivers/dri/i965/intel_context.h 
b/src/mesa/drivers/dri/i965/intel_context.h
index 734c57c..f35dafa 100644
--- a/src/mesa/drivers/dri/i965/intel_context.h
+++ b/src/mesa/drivers/dri/i965/intel_context.h
@@ -250,9 +250,6 @@ extern bool intelInitContext(struct brw_context *brw,
  unsigned *dri_ctx_error);
 
 extern void intelFinish(struct gl_context * ctx);
-extern void _intel_flush(struct gl_context * ctx, const char *file, int line);
-
-#define intel_flush(ctx) _intel_flush(ctx, __FILE__, __LINE__)
 
 extern void intelInitDriverFunctions(struct dd_function_table *functions);
 
diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 950ef57..f8cf96f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1626,7 +1626,6 @@ intel_miptree_upsample(struct brw_context *brw,
 void *
 intel_miptree_map_raw(struct brw_context *brw, struct intel_mipmap_tree *mt)
 {
-   struct gl_context *ctx = brw-ctx;
/* CPU accesses to color buffers don't understand fast color clears, so
 * resolve any pending fast color clears before we map.
 */
@@ -1640,7 +1639,7 @@ intel_miptree_map_raw(struct brw_context *brw, struct 
intel_mipmap_tree *mt)
   }
}
 
-   intel_flush(ctx);
+   intel_batchbuffer_flush(brw);
 
if (mt-region-tiling != I915_TILING_NONE)
   

[Mesa-dev] [PATCH 3/8] intel: Add support for the new flush_with_flags extension.

2013-08-27 Thread Eric Anholt
This gives us more information about why we're flushing that we can
use for handling our throttling.

v2 (Kenneth Graunke): Rebase on latest master, add missing
   FLUSH_VERTICES and FLUSH_CURRENT, which fixes a regression in Glean's
   polygonOffset test.
v3 (anholt): Drop FLUSH_CURRENT -- FLUSH_VERTICES is what we need, which
   is get any queued prims out of VBO and into the driver, not update
   ctx-Current so we can read it with the CPU.  Also drop batch-used
   check, which intel_batchbuffer_flush() does anyway.

Signed-off-by: Kenneth Graunke kenn...@whitecape.org
---
 src/mesa/drivers/dri/i965/intel_screen.c | 46 +---
 1 file changed, 36 insertions(+), 10 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/intel_screen.c 
b/src/mesa/drivers/dri/i965/intel_screen.c
index 9b3c31a..0580d6f 100644
--- a/src/mesa/drivers/dri/i965/intel_screen.c
+++ b/src/mesa/drivers/dri/i965/intel_screen.c
@@ -151,29 +151,55 @@ static const __DRItexBufferExtension 
intelTexBufferExtension = {
 };
 
 static void
-intelDRI2Flush(__DRIdrawable *drawable)
+intel_dri2_flush_with_flags(__DRIcontext *cPriv,
+__DRIdrawable *dPriv,
+unsigned flags,
+enum __DRI2throttleReason reason)
 {
-   GET_CURRENT_CONTEXT(ctx);
-   struct brw_context *brw = brw_context(ctx);
-   if (brw == NULL)
+   struct brw_context *brw = cPriv-driverPrivate;
+
+   if (!brw)
   return;
 
-   intel_resolve_for_dri2_flush(brw, drawable);
-   brw-need_throttle = true;
+   struct gl_context *ctx = brw-ctx;
+
+   FLUSH_VERTICES(ctx, 0);
 
-   if (brw-batch.used)
-  intel_batchbuffer_flush(brw);
+   if (flags  __DRI2_FLUSH_DRAWABLE)
+  intel_resolve_for_dri2_flush(brw, dPriv);
+
+   if (reason == __DRI2_THROTTLE_SWAPBUFFER ||
+   reason == __DRI2_THROTTLE_FLUSHFRONT) {
+  brw-need_throttle = true;
+   }
+
+   intel_batchbuffer_flush(brw);
 
if (INTEL_DEBUG  DEBUG_AUB) {
   aub_dump_bmp(ctx);
}
 }
 
+/**
+ * Provides compatibility with loaders that only support the older (version
+ * 1-3) flush interface.
+ *
+ * That includes libGL up to Mesa 9.0, and the X Server at least up to 1.13.
+ */
+static void
+intel_dri2_flush(__DRIdrawable *drawable)
+{
+   intel_dri2_flush_with_flags(drawable-driContextPriv, drawable,
+   __DRI2_FLUSH_DRAWABLE,
+   __DRI2_THROTTLE_SWAPBUFFER);
+}
+
 static const struct __DRI2flushExtensionRec intelFlushExtension = {
-.base = { __DRI2_FLUSH, 3 },
+.base = { __DRI2_FLUSH, 4 },
 
-.flush  = intelDRI2Flush,
+.flush  = intel_dri2_flush,
 .invalidate = dri2InvalidateDrawable,
+.flush_with_flags   = intel_dri2_flush_with_flags,
 };
 
 static struct intel_image_format intel_image_formats[] = {
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 7/8] i965: Add missing state reset at the end of blorp.

2013-08-27 Thread Eric Anholt
These are things that happen to be occurring because of the batch flush at
the start of the blorp op (which exists to prevent batch space or aperture
space overflow), but the intention was for this sequence of state resets at
the end of blorp to be everything necessary for the next draw call.

Found when debugging the next commit, by comparing brw_new_batch() and
intel_batchbuffer_reset() to brw_blorp_exec().
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp   |  2 ++
 src/mesa/drivers/dri/i965/intel_batchbuffer.c | 10 +-
 src/mesa/drivers/dri/i965/intel_batchbuffer.h |  1 +
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 20ea09e..1576ff2 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -214,6 +214,8 @@ brw_blorp_exec(struct brw_context *brw, const 
brw_blorp_params *params)
brw-state.dirty.cache = ~0;
brw-state_batch_count = 0;
brw-batch.need_workaround_flush = true;
+   brw-ib.type = -1;
+   intel_batchbuffer_clear_cache(brw);
 
/* Flush the sampler cache so any texturing from the destination is
 * coherent.
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.c 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
index 5604829..0aa2551 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.c
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.c
@@ -41,8 +41,8 @@ struct cached_batch_item {
uint16_t size;
 };
 
-static void
-clear_cache(struct brw_context *brw)
+void
+intel_batchbuffer_clear_cache(struct brw_context *brw)
 {
struct cached_batch_item *item = brw-batch.cached_items;
 
@@ -85,7 +85,7 @@ intel_batchbuffer_reset(struct brw_context *brw)
}
brw-batch.last_bo = brw-batch.bo;
 
-   clear_cache(brw);
+   intel_batchbuffer_clear_cache(brw);
 
brw-batch.bo = drm_intel_bo_alloc(brw-bufmgr, batchbuffer,
BATCH_SZ, 4096);
@@ -118,7 +118,7 @@ intel_batchbuffer_reset_to_saved(struct brw_context *brw)
/* Cached batch state is dead, since we just cleared some unknown part of 
the
 * batchbuffer.  Assume that the caller resets any other state necessary.
 */
-   clear_cache(brw);
+   intel_batchbuffer_clear_cache(brw);
 }
 
 void
@@ -128,7 +128,7 @@ intel_batchbuffer_free(struct brw_context *brw)
drm_intel_bo_unreference(brw-batch.last_bo);
drm_intel_bo_unreference(brw-batch.bo);
drm_intel_bo_unreference(brw-batch.workaround_bo);
-   clear_cache(brw);
+   intel_batchbuffer_clear_cache(brw);
 }
 
 static void
diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h 
b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
index 15a9ca1..d46f48e 100644
--- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
+++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
@@ -28,6 +28,7 @@ void intel_batchbuffer_init(struct brw_context *brw);
 void intel_batchbuffer_free(struct brw_context *brw);
 void intel_batchbuffer_save_state(struct brw_context *brw);
 void intel_batchbuffer_reset_to_saved(struct brw_context *brw);
+void intel_batchbuffer_clear_cache(struct brw_context *brw);
 
 int _intel_batchbuffer_flush(struct brw_context *brw,
 const char *file, int line);
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 6/8] i965: Drop extra flush when calling intel_miptree_map_raw().

2013-08-27 Thread Eric Anholt
The code that got replaced with map_raw didn't do the flush, but now
map_raw() is responsible for it and we don't have to worry about it.
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index 5f8041f..2f5e04f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1735,7 +1735,6 @@ intel_miptree_map_blit(struct brw_context *brw,
   goto fail;
}
 
-   intel_batchbuffer_flush(brw);
map-ptr = intel_miptree_map_raw(brw, map-mt);
 
DBG(%s: %d,%d %dx%d from mt %p (%s) %d,%d = %p/%d\n, __FUNCTION__,
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 8/8] i965: Avoid flushing the batch for every blorp op.

2013-08-27 Thread Eric Anholt
This brings over the batch-wrap-prevention and aperture space checking
code from the normal brw_draw.c path, so that we don't need to flush the
batch every time.

There's a risk here if the intel_emit_post_sync_nonzero_flush() call isn't
high enough up in the state emit sequences -- before, we implicitly had
one at the batch flush before any state was emitted, so Mesa's workaround
emits didn't really matter.

Improves cairo-gl performance by 13.7733% +/- 1.74876% (n=30/32)
Improves minecraft apitrace performance by 1.03183% +/- 0.482297% (n=90).
Reduces low-resolution GLB 2.7 performance by 1.17553% +/- 0.432263% (n=88)
Reduces Lightsmark performance by 3.70246% +/- 0.322432% (n=126)
No statistically significant performance difference on unigine tropics
(n=10)
No statistically significant performance difference on openarena (n=755)

The two apps that are hurt happen to include stalls on busy buffer
objects, so I think this is an effect of missing out on an opportune
flush.
---
 src/mesa/drivers/dri/i965/brw_blorp.cpp  | 50 
 src/mesa/drivers/dri/i965/brw_blorp.h|  4 ---
 src/mesa/drivers/dri/i965/gen6_blorp.cpp | 12 
 src/mesa/drivers/dri/i965/gen7_blorp.cpp |  1 -
 4 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_blorp.cpp 
b/src/mesa/drivers/dri/i965/brw_blorp.cpp
index 1576ff2..c566d1d 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/brw_blorp.cpp
@@ -21,6 +21,7 @@
  * IN THE SOFTWARE.
  */
 
+#include errno.h
 #include intel_batchbuffer.h
 #include intel_fbo.h
 
@@ -191,6 +192,26 @@ intel_hiz_exec(struct brw_context *brw, struct 
intel_mipmap_tree *mt,
 void
 brw_blorp_exec(struct brw_context *brw, const brw_blorp_params *params)
 {
+   struct gl_context *ctx = brw-ctx;
+   uint32_t estimated_max_batch_usage = 1500;
+   bool check_aperture_failed_once = false;
+
+   /* Flush the sampler and render caches.  We definitely need to flush the
+* sampler cache so that we get updated contents from the render cache for
+* the glBlitFramebuffer() source.  Also, we are sometimes warned in the
+* docs to flush the cache between reinterpretations of the same surface
+* data with different formats, which blorp does for stencil and depth
+* data.
+*/
+   intel_batchbuffer_emit_mi_flush(brw);
+
+retry:
+   intel_batchbuffer_require_space(brw, estimated_max_batch_usage, false);
+   intel_batchbuffer_save_state(brw);
+   drm_intel_bo *saved_bo = brw-batch.bo;
+   uint32_t saved_used = brw-batch.used;
+   uint32_t saved_state_batch_offset = brw-batch.state_batch_offset;
+
switch (brw-gen) {
case 6:
   gen6_blorp_exec(brw, params);
@@ -204,6 +225,35 @@ brw_blorp_exec(struct brw_context *brw, const 
brw_blorp_params *params)
   break;
}
 
+   /* Make sure we didn't wrap the batch unintentionally, and make sure we
+* reserved enough space that a wrap will never happen.
+*/
+   assert(brw-batch.bo == saved_bo);
+   assert((brw-batch.used - saved_used) * 4 +
+  (saved_state_batch_offset - brw-batch.state_batch_offset) 
+  estimated_max_batch_usage);
+   /* Shut up compiler warnings on release build */
+   (void)saved_bo;
+   (void)saved_used;
+   (void)saved_state_batch_offset;
+
+   /* Check if the blorp op we just did would make our batch likely to fail to
+* map all the BOs into the GPU at batch exec time later.  If so, flush the
+* batch and try again with nothing else in the batch.
+*/
+   if (dri_bufmgr_check_aperture_space(brw-batch.bo, 1)) {
+  if (!check_aperture_failed_once) {
+ check_aperture_failed_once = true;
+ intel_batchbuffer_reset_to_saved(brw);
+ intel_batchbuffer_flush(brw);
+ goto retry;
+  } else {
+ int ret = intel_batchbuffer_flush(brw);
+ WARN_ONCE(ret == -ENOSPC,
+   i965: blorp emit exceeded available aperture space\n);
+  }
+   }
+
if (unlikely(brw-always_flush_batch))
   intel_batchbuffer_flush(brw);
 
diff --git a/src/mesa/drivers/dri/i965/brw_blorp.h 
b/src/mesa/drivers/dri/i965/brw_blorp.h
index dceb4fc..e03e27f 100644
--- a/src/mesa/drivers/dri/i965/brw_blorp.h
+++ b/src/mesa/drivers/dri/i965/brw_blorp.h
@@ -370,10 +370,6 @@ void
 gen6_blorp_init(struct brw_context *brw);
 
 void
-gen6_blorp_emit_batch_head(struct brw_context *brw,
-   const brw_blorp_params *params);
-
-void
 gen6_blorp_emit_state_base_address(struct brw_context *brw,
const brw_blorp_params *params);
 
diff --git a/src/mesa/drivers/dri/i965/gen6_blorp.cpp 
b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
index da523e5..87b1d2b 100644
--- a/src/mesa/drivers/dri/i965/gen6_blorp.cpp
+++ b/src/mesa/drivers/dri/i965/gen6_blorp.cpp
@@ -45,17 +45,6 @@
  * sizeof(float))
 /** \} */
 
-void
-gen6_blorp_emit_batch_head(struct brw_context *brw,
- 

[Mesa-dev] [PATCH 5/8] i965: Make a slight distinction in perf debug for BOs versus miptrees.

2013-08-27 Thread Eric Anholt
---
 src/mesa/drivers/dri/i965/intel_mipmap_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c 
b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
index f8cf96f..5f8041f 100644
--- a/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
+++ b/src/mesa/drivers/dri/i965/intel_mipmap_tree.c
@@ -1635,7 +1635,7 @@ intel_miptree_map_raw(struct brw_context *brw, struct 
intel_mipmap_tree *mt)
 
if (unlikely(INTEL_DEBUG  DEBUG_PERF)) {
   if (drm_intel_bo_busy(bo)) {
- perf_debug(Mapping a busy BO, causing a stall on the GPU.\n);
+ perf_debug(Mapping a busy miptree, causing a stall on the GPU.\n);
   }
}
 
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 10/15] mesa: Implement KHR_debug ObjectLabel functions

2013-08-27 Thread Fredrik Höglund
On Tuesday 27 August 2013, Timothy Arceri wrote:
 V2: fixed indentation of comment
 
 Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
 ---
  src/mesa/main/objectlabel.c | 277 
 
  src/mesa/main/objectlabel.h |  61 ++
  2 files changed, 338 insertions(+)
  create mode 100644 src/mesa/main/objectlabel.c
  create mode 100644 src/mesa/main/objectlabel.h
 
 diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c
 new file mode 100644
 index 000..78f9b33
 --- /dev/null
 +++ b/src/mesa/main/objectlabel.c
 @@ -0,0 +1,277 @@
 +/*
 + * Mesa 3-D graphics library
 + *
 + * Copyright (C) 2013  Timothy Arceri   All Rights Reserved.
 + *
 + * Permission is hereby granted, free of charge, to any person obtaining a
 + * copy of this software and associated documentation files (the Software),
 + * to deal in the Software without restriction, including without limitation
 + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 + * and/or sell copies of the Software, and to permit persons to whom the
 + * Software is furnished to do so, subject to the following conditions:
 + *
 + * The above copyright notice and this permission notice shall be included
 + * in all copies or substantial portions of the Software.
 + *
 + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS
 + * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 
 MERCHANTABILITY,
 + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
 + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
 + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
 + * OTHER DEALINGS IN THE SOFTWARE.
 + */
 +
 +
 +#include arrayobj.h
 +#include bufferobj.h
 +#include context.h
 +#include dlist.h
 +#include enums.h
 +#include fbobject.h
 +#include objectlabel.h
 +#include queryobj.h
 +#include samplerobj.h
 +#include shaderobj.h
 +#include syncobj.h
 +#include texobj.h
 +#include transformfeedback.h
 +
 +
 +/**
 + * Helper for _mesa_ObjectLabel() and _mesa_ObjectPtrLabel().
 + */
 +static void
 +set_label(struct gl_context *ctx, char **labelPtr, const char *label,
 +  int length, const char *caller)
 +{
 +   if (*labelPtr) {
 +  /* free old label string */
 +  free(*labelPtr);

You should set *labelPtr to NULL here, since the code below is not guaranteed
to assign it.

 +   }
 +
 +   if (label) {
 +  /* set new label string */
 +
 +  if (length = 0) {
 + /* explicit length */
 + *labelPtr = (char *) malloc(length);
 + if (*labelPtr) {
 +memcpy(*labelPtr, label, length);
 + }

The length given by the client is not required to include a terminating
null-character. 

 +  }
 +  else {
 + /* null-terminated string */
 + int len = strlen(label);
 + if (len = MAX_LABEL_LENGTH) {
 +/* An INVALID_VALUE error is generated if the number of 
 characters
 + * in label, excluding the null terminator when length is
 + * negative, is not less than the value of MAX_LABEL_LENGTH.
 + */
 +_mesa_error(ctx, GL_INVALID_VALUE,
 +%s(length=%d, which is not less than 
 +GL_MAX_LABEL_LENGTH=%d), caller, length,
 +MAX_LABEL_LENGTH);

This error should also be generated when the client specifies an explicit
length that exceeds MAX_LABEL_LENGTH.

Fredrik

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Roland Scheidegger
Am 27.08.2013 23:52, schrieb Vadim Girlin:
 On 08/28/2013 12:43 AM, Marek Olšák wrote:
 Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
 with a Mesa driver that likes to compile shader variants on first use?
 It's HORRIBLE.
 
 I don't think that shader variants are bad, but it's definitely bad when
 we are compiling variants that are never used. Currently glxgears
 compiles 18 ps/vs shaders. In my branch with initial GS support [1] I
 switched handling of the shaders to deferred compilation, that is,
 shaders are compiled only before the actual draw. I found later that
 it's not really required for GS, but IIRC this change results in only 5
 shaders being compiled for glxgears instead of 18. It seems most of the
 useless variants are results of state changes between creation of the
 shader state (initial compilation) and actual draw call.
 
 I had some concerns about increased overhead with those changes, and
 it's actually noticeable with drawoverhead demo, but I didn't see any
 regressions with a few real apps that I tested, e.g. glxgears even
 showed slightly better performance with these changes. Probably I also
 implemented it in a not very optimal way (I was mostly concentrated on
 GS support) and the overhead can be reduced.
 
 One more thing is duplicate shaders, I've analyzed shader dumps from
 Unigine Heaven 3.0 some time ago and found that from about 320 compiled
 shaders, only about 180 (50%) were unique, others were duplicates
 (detected by comparing the bytecode dumps for them in an automated way),
 maybe they had different shader keys (which still resulted in the same
 bytecode), but I suspect duplicate pipe shaders were also involved.
 Unfortunately I didn't have a time to investigate it more thoroughly
 since then.
 
 So my point is that we don't really need to eliminate shader variants,
 first we need to eliminate compilation of unused variants and duplicate
 shaders. Also we might want to consider offloading of the compilation to
 separate thread(s) and caching of shader binaries between runs.

Hmm ok that seems a way more complicated problem than I thought :-).
Compile early and you might compile variants you will never use, compile
late and the delay might be noticeable.
I just thought it might be unlikely you'd actually need two variants -
e.g. some depth exporting shader is probably unlikely to use alpha test.
But ok I guess it shouldn't write color in this case, so even then it
might never be worth bothering. Was just a random idea ;-).

Roland


 
 Vadim
 
  [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders
 

 What the patch does is probably the right solution. At least
 alpha-test state changes don't cause shader recompilation and
 re-binding, which also negatively affects performance. Ideally we
 shouldn't depend on the framebuffer state at all, but we need to
 emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
 should always be fine with key.nr_cbufs forced to 8 for any shader
 without that property. I expect app developers to do the right thing
 and not write outputs they don't need.

 Marek

 On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger
 srol...@vmware.com wrote:
 Not that I'm qualified to review r600 code, but couldn't you create
 different shader variants depending on whether you need alpha test? At
 least I would assume shader exports aren't free.

 Roland

 Am 27.08.2013 19:56, schrieb Vadim Girlin:
 We need to export at least one color if the shader writes it,
 even when nr_cbufs==0.

 Signed-off-by: Vadim Girlin vadimgir...@gmail.com
 ---

 Tested on evergreen with multiple combinations of backends - no
 regressions,
 fixes some tests:

default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
llvm   - fixes about 25 tests related to depth/stencil
llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
 regressions cased by reordering of exports in sb)

 With this patch, there are no regressions with default+sb vs default.
 There is one regression with llvm+sb vs llvm -
 fs-texturegrad-miplevels,
 AFAICS it's a problem with llvm backend uncovered by sb -
 SET_GRADIENTS_V/H
 instructions are not placed in the same TEX clause with
 corresponding SAMPLE_G.

   src/gallium/drivers/r600/r600_shader.c | 7 ---
   1 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/drivers/r600/r600_shader.c
 b/src/gallium/drivers/r600/r600_shader.c
 index 300b5c4..f7eab76 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct
 r600_screen *rscreen,
unsigned opcode;
int i, j, k, r = 0;
int next_pos_base = 60, next_param_base = 0;
 + int max_color_exports = MAX2(key.nr_cbufs, 1);
/* Declarations used by llvm code */
bool use_llvm = false;

[Mesa-dev] [Bug 60737] In GLSL ES, a missing FS precision qualifier does not generate an error

2013-08-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=60737

Ian Romanick i...@freedesktop.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Ian Romanick i...@freedesktop.org ---
This should be fixed by:

commit cabd45773b58d6aa48202da1cdd8cf1a6b856c53
Author: Ian Romanick ian.d.roman...@intel.com
Date:   Fri Aug 9 15:17:18 2013 -0700

glsl: Track existence of default float precision in GLSL ES fragment
shaders

This is required by the spec, and it's a bit tricky because the default
precision is scoped.  As a result, I'm slightly abusing the symbol
table.

Fixes piglit no-default-float-precision.frag tests and the piglit
default-precision-nested-scope-0[1234].frag tests that are currently on
the piglit mailing list for review.

On IRC I got confirmation from cwabbot that ARM (Mali T6xx and T400)
enforces this requirement and from kusma that NVIDIA (Tegra2) enforces
this requirement.  We should be safe from regressing shipping
applications.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
Cc: 9.2 mesa-sta...@lists.freedesktop.org

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Marek Olšák
First, you won't really see any significant continual difference in
frame rate no matter how many shader variants you have unless you are
very CPU-bound. The problem is shader compilation on the first use,
that's where you get a big hiccup. Try Skyrim for example: You have to
first look around and see every object that's around you and get
unpleasant stuttering before you can actually go on and play the game.
Yes, this also Wine's fault that it compiles shaders on the first use
too, but we don't have to be as bad as Wine, do we? Valve also
reported shader recompilations on the first use being a serious issue
with open source drivers.

Marek

On Tue, Aug 27, 2013 at 11:52 PM, Vadim Girlin vadimgir...@gmail.com wrote:
 On 08/28/2013 12:43 AM, Marek Olšák wrote:

 Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
 with a Mesa driver that likes to compile shader variants on first use?
 It's HORRIBLE.


 I don't think that shader variants are bad, but it's definitely bad when we
 are compiling variants that are never used. Currently glxgears compiles 18
 ps/vs shaders. In my branch with initial GS support [1] I switched handling
 of the shaders to deferred compilation, that is, shaders are compiled only
 before the actual draw. I found later that it's not really required for GS,
 but IIRC this change results in only 5 shaders being compiled for glxgears
 instead of 18. It seems most of the useless variants are results of state
 changes between creation of the shader state (initial compilation) and
 actual draw call.

 I had some concerns about increased overhead with those changes, and it's
 actually noticeable with drawoverhead demo, but I didn't see any regressions
 with a few real apps that I tested, e.g. glxgears even showed slightly
 better performance with these changes. Probably I also implemented it in a
 not very optimal way (I was mostly concentrated on GS support) and the
 overhead can be reduced.

 One more thing is duplicate shaders, I've analyzed shader dumps from Unigine
 Heaven 3.0 some time ago and found that from about 320 compiled shaders,
 only about 180 (50%) were unique, others were duplicates (detected by
 comparing the bytecode dumps for them in an automated way), maybe they had
 different shader keys (which still resulted in the same bytecode), but I
 suspect duplicate pipe shaders were also involved. Unfortunately I didn't
 have a time to investigate it more thoroughly since then.

 So my point is that we don't really need to eliminate shader variants, first
 we need to eliminate compilation of unused variants and duplicate shaders.
 Also we might want to consider offloading of the compilation to separate
 thread(s) and caching of shader binaries between runs.

 Vadim

  [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders



 What the patch does is probably the right solution. At least
 alpha-test state changes don't cause shader recompilation and
 re-binding, which also negatively affects performance. Ideally we
 shouldn't depend on the framebuffer state at all, but we need to
 emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
 should always be fine with key.nr_cbufs forced to 8 for any shader
 without that property. I expect app developers to do the right thing
 and not write outputs they don't need.

 Marek

 On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com
 wrote:

 Not that I'm qualified to review r600 code, but couldn't you create
 different shader variants depending on whether you need alpha test? At
 least I would assume shader exports aren't free.

 Roland

 Am 27.08.2013 19:56, schrieb Vadim Girlin:

 We need to export at least one color if the shader writes it,
 even when nr_cbufs==0.

 Signed-off-by: Vadim Girlin vadimgir...@gmail.com
 ---

 Tested on evergreen with multiple combinations of backends - no
 regressions,
 fixes some tests:

default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
llvm   - fixes about 25 tests related to depth/stencil
llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
 regressions cased by reordering of exports in sb)

 With this patch, there are no regressions with default+sb vs default.
 There is one regression with llvm+sb vs llvm - fs-texturegrad-miplevels,
 AFAICS it's a problem with llvm backend uncovered by sb -
 SET_GRADIENTS_V/H
 instructions are not placed in the same TEX clause with corresponding
 SAMPLE_G.

   src/gallium/drivers/r600/r600_shader.c | 7 ---
   1 file changed, 4 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/drivers/r600/r600_shader.c
 b/src/gallium/drivers/r600/r600_shader.c
 index 300b5c4..f7eab76 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -918,6 +918,7 @@ static int r600_shader_from_tgsi(struct r600_screen
 *rscreen,
unsigned 

[Mesa-dev] [PATCH] radeonsi: flush TC at the end of CS

2013-08-27 Thread Marek Olšák
TC can be used for writing and therefore should be flushed.
---
 src/gallium/drivers/radeonsi/r600_hw_context.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
b/src/gallium/drivers/radeonsi/r600_hw_context.c
index 5631bdb..1fe6814 100644
--- a/src/gallium/drivers/radeonsi/r600_hw_context.c
+++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
@@ -167,7 +167,7 @@ void si_need_cs_space(struct r600_context *ctx, unsigned 
num_dw,
}
 }
 
-static void r600_flush_framebuffer(struct r600_context *ctx)
+static void r600_flush_caches(struct r600_context *ctx)
 {
struct si_pm4_state *pm4;
 
@@ -188,7 +188,9 @@ static void r600_flush_framebuffer(struct r600_context *ctx)
S_0085F0_CB6_DEST_BASE_ENA(1) |
S_0085F0_CB7_DEST_BASE_ENA(1) |
S_0085F0_DB_ACTION_ENA(1) |
-   S_0085F0_DB_DEST_BASE_ENA(1));
+   S_0085F0_DB_DEST_BASE_ENA(1) |
+   S_0085F0_TC_ACTION_ENA(1) |
+   S_0085F0_TCL1_ACTION_ENA(1));
si_cmd_flush_and_inv_cb_meta(pm4);
 
si_pm4_emit(ctx, pm4);
@@ -223,7 +225,7 @@ void si_context_flush(struct r600_context *ctx, unsigned 
flags)
}
 #endif
 
-   r600_flush_framebuffer(ctx);
+   r600_flush_caches(ctx);
 
/* partial flush is needed to avoid lockups on some chips with user 
fences */
cs-buf[cs-cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
-- 
1.8.1.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeonsi: flush TC at the end of CS

2013-08-27 Thread Marek Olšák
Please disregard this patch. I didn't mean to send this one.

Marek

On Wed, Aug 28, 2013 at 1:00 AM, Marek Olšák mar...@gmail.com wrote:
 TC can be used for writing and therefore should be flushed.
 ---
  src/gallium/drivers/radeonsi/r600_hw_context.c | 8 +---
  1 file changed, 5 insertions(+), 3 deletions(-)

 diff --git a/src/gallium/drivers/radeonsi/r600_hw_context.c 
 b/src/gallium/drivers/radeonsi/r600_hw_context.c
 index 5631bdb..1fe6814 100644
 --- a/src/gallium/drivers/radeonsi/r600_hw_context.c
 +++ b/src/gallium/drivers/radeonsi/r600_hw_context.c
 @@ -167,7 +167,7 @@ void si_need_cs_space(struct r600_context *ctx, unsigned 
 num_dw,
 }
  }

 -static void r600_flush_framebuffer(struct r600_context *ctx)
 +static void r600_flush_caches(struct r600_context *ctx)
  {
 struct si_pm4_state *pm4;

 @@ -188,7 +188,9 @@ static void r600_flush_framebuffer(struct r600_context 
 *ctx)
 S_0085F0_CB6_DEST_BASE_ENA(1) |
 S_0085F0_CB7_DEST_BASE_ENA(1) |
 S_0085F0_DB_ACTION_ENA(1) |
 -   S_0085F0_DB_DEST_BASE_ENA(1));
 +   S_0085F0_DB_DEST_BASE_ENA(1) |
 +   S_0085F0_TC_ACTION_ENA(1) |
 +   S_0085F0_TCL1_ACTION_ENA(1));
 si_cmd_flush_and_inv_cb_meta(pm4);

 si_pm4_emit(ctx, pm4);
 @@ -223,7 +225,7 @@ void si_context_flush(struct r600_context *ctx, unsigned 
 flags)
 }
  #endif

 -   r600_flush_framebuffer(ctx);
 +   r600_flush_caches(ctx);

 /* partial flush is needed to avoid lockups on some chips with user 
 fences */
 cs-buf[cs-cdw++] = PKT3(PKT3_EVENT_WRITE, 0, 0);
 --
 1.8.1.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] draw: fix segfaults with aaline and aapoint stages disabled

2013-08-27 Thread Marek Olšák
There are drivers not using these optional stages.

Broken by a3ae5dc7dd5c2f8893f86a920247e690e550ebd4.

Cc: mesa-sta...@lists.freedesktop.org
---
 src/gallium/auxiliary/draw/draw_context.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gallium/auxiliary/draw/draw_context.c 
b/src/gallium/auxiliary/draw/draw_context.c
index d1fac0c..641dd82 100644
--- a/src/gallium/auxiliary/draw/draw_context.c
+++ b/src/gallium/auxiliary/draw/draw_context.c
@@ -564,8 +564,10 @@ draw_prepare_shader_outputs(struct draw_context *draw)
draw_remove_extra_vertex_attribs(draw);
draw_prim_assembler_prepare_outputs(draw-ia);
draw_unfilled_prepare_outputs(draw, draw-pipeline.unfilled);
-   draw_aapoint_prepare_outputs(draw, draw-pipeline.aapoint);
-   draw_aaline_prepare_outputs(draw, draw-pipeline.aaline);
+   if (draw-pipeline.aapoint)
+  draw_aapoint_prepare_outputs(draw, draw-pipeline.aapoint);
+   if (draw-pipeline.aaline)
+  draw_aaline_prepare_outputs(draw, draw-pipeline.aaline);
 }
 
 /**
-- 
1.8.1.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Vadim Girlin

On 08/28/2013 02:28 AM, Roland Scheidegger wrote:

Am 27.08.2013 23:52, schrieb Vadim Girlin:

On 08/28/2013 12:43 AM, Marek Olšák wrote:

Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
with a Mesa driver that likes to compile shader variants on first use?
It's HORRIBLE.


I don't think that shader variants are bad, but it's definitely bad when
we are compiling variants that are never used. Currently glxgears
compiles 18 ps/vs shaders. In my branch with initial GS support [1] I
switched handling of the shaders to deferred compilation, that is,
shaders are compiled only before the actual draw. I found later that
it's not really required for GS, but IIRC this change results in only 5
shaders being compiled for glxgears instead of 18. It seems most of the
useless variants are results of state changes between creation of the
shader state (initial compilation) and actual draw call.

I had some concerns about increased overhead with those changes, and
it's actually noticeable with drawoverhead demo, but I didn't see any
regressions with a few real apps that I tested, e.g. glxgears even
showed slightly better performance with these changes. Probably I also
implemented it in a not very optimal way (I was mostly concentrated on
GS support) and the overhead can be reduced.

One more thing is duplicate shaders, I've analyzed shader dumps from
Unigine Heaven 3.0 some time ago and found that from about 320 compiled
shaders, only about 180 (50%) were unique, others were duplicates
(detected by comparing the bytecode dumps for them in an automated way),
maybe they had different shader keys (which still resulted in the same
bytecode), but I suspect duplicate pipe shaders were also involved.
Unfortunately I didn't have a time to investigate it more thoroughly
since then.

So my point is that we don't really need to eliminate shader variants,
first we need to eliminate compilation of unused variants and duplicate
shaders. Also we might want to consider offloading of the compilation to
separate thread(s) and caching of shader binaries between runs.


Hmm ok that seems a way more complicated problem than I thought :-).
Compile early and you might compile variants you will never use, compile
late and the delay might be noticeable.


Compilation of unused variants is not bad if they are compiled at the 
game/level loading time, I think many apps are trying to compile shaders 
early to avoid freezes during gameplay. But trying to compile early in 
the driver doesn't make sense currently because it's already too late 
anyway, if I'm not missing something, it's deferred in mesa or state 
tracker.


Otherwise probably it would be preferable for the driver to precompile 
variants that are likely to be used (but only if we really can do it 
early, at the loading time when shaders are created by the app).



I just thought it might be unlikely you'd actually need two variants -
e.g. some depth exporting shader is probably unlikely to use alpha test.
But ok I guess it shouldn't write color in this case, so even then it
might never be worth bothering. Was just a random idea ;-).


I think it's a good idea that just needs some benchmarking to make sure 
that it can provide any real benefits. I only wanted to say that it can 
be done separately from this fix.


Vadim




Roland




Vadim

  [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders



What the patch does is probably the right solution. At least
alpha-test state changes don't cause shader recompilation and
re-binding, which also negatively affects performance. Ideally we
shouldn't depend on the framebuffer state at all, but we need to
emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
should always be fine with key.nr_cbufs forced to 8 for any shader
without that property. I expect app developers to do the right thing
and not write outputs they don't need.

Marek

On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger
srol...@vmware.com wrote:

Not that I'm qualified to review r600 code, but couldn't you create
different shader variants depending on whether you need alpha test? At
least I would assume shader exports aren't free.

Roland

Am 27.08.2013 19:56, schrieb Vadim Girlin:

We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Tested on evergreen with multiple combinations of backends - no
regressions,
fixes some tests:

default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
llvm   - fixes about 25 tests related to depth/stencil
llvm+sb- fixes about 300 tests (llvm's depth/stencil issues and
 regressions cased by reordering of exports in sb)

With this patch, there are no regressions with default+sb vs default.
There is one regression with llvm+sb vs llvm -
fs-texturegrad-miplevels,
AFAICS it's a problem with llvm 

[Mesa-dev] Mesa 9.2 released

2013-08-27 Thread Ian Romanick
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Mesa 9.2 has been released! Mesa 9.2 is a feature release that includes
many updates and enhancements. The full list is available in the release
notes file in docs/relnotes/9.2.html.

The tag in the GIT repository for Mesa 9.2 is 'mesa-9.2'.

Mesa 9.2 is available for download at
ftp://freedesktop.org/pub/mesa/9.2/

md5sums:

4f93c6475ec656fc1f7b93aeffc9b6c4  MesaLib-9.2.0.tar.gz
4185b6aae890bc62a964f4b24cc1aca8  MesaLib-9.2.0.tar.bz2
3bc5339bc98b9c3ffd14e3a8eca4  MesaLib-9.2.0.zip

I have verified building from the .tar.bz2 file by doing:

tar -xjf MesaLib-9.2.tar.bz2
cd Mesa-9.2
./configure
make -j6

I have also verified that I pushed the tag. REALLY!
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.13 (GNU/Linux)

iEYEARECAAYFAlIdN4cACgkQX1gOwKyEAw9DYgCfa1ovOpuqnN/wefXyz5t5YVUD
6lwAnjPsrnTkm+lzpO3P/KUutLihS5lP
=J6kN
-END PGP SIGNATURE-
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Initialize inout_offset parameter to brw_search_cache().

2013-08-27 Thread Damien Lespiau
On Wed, Jul 24, 2013 at 10:02:22AM -0700, Kenneth Graunke wrote:
 On 07/24/2013 09:33 AM, Paul Berry wrote:
 Two callers of brw_search_cache() weren't initializing that function's
 inout_offset parameter: brw_blorp_const_color_params::get_wm_prog()
 and brw_blorp_const_color_params::get_wm_prog().
 
 That's a benign problem, since the only effect of not initializing
 inout_offset prior to calling brw_search_cache() is that the bit
 corresponding to cache_id in brw-state.dirty.cache may not be set
 reliably.  This is ok, since the cache_id's used by
 brw_blorp_const_color_params::get_wm_prog() and
 brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and
 BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are
 not used.
 
 However, failing to initialize this parameter causes valgrind to
 complain.  So let's go ahead and fix it to reduce valgrind noise.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779

Can we have this commit in the 9.2 branch? It was added in the release
blocker bug but never cherry-picked for 9.2.

Thanks!

-- 
Damien
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/8] i965: Directly call intel_batchbuffer_flush() after i915 split.

2013-08-27 Thread Kenneth Graunke

On 08/27/2013 03:21 PM, Eric Anholt wrote:

intel_flush() now did nothing except call through (and
intel_batchbuffer_flush() does the no-op check, too!)
---
  src/mesa/drivers/dri/i965/gen6_blorp.cpp |  4 +---
  src/mesa/drivers/dri/i965/intel_buffer_objects.c |  2 +-
  src/mesa/drivers/dri/i965/intel_context.c| 17 -
  src/mesa/drivers/dri/i965/intel_context.h|  3 ---
  src/mesa/drivers/dri/i965/intel_mipmap_tree.c|  3 +--
  src/mesa/drivers/dri/i965/intel_pixel_copy.c |  3 ++-
  src/mesa/drivers/dri/i965/intel_syncobj.c|  2 +-
  7 files changed, 10 insertions(+), 24 deletions(-)


Series is:
Reviewed-by: Kenneth Graunke kenn...@whitecape.org

When running glxgears with INTEL_DEBUG=bat (and batch decoding turned 
off) on master, I see this:


intel_context.c:366: Batchbuffer flush with 5388b (pkt) + 3424b (state) 
= 8812b (26.9%)
intel_screen.c:165: Batchbuffer flush with  748b (pkt) +  448b (state) = 
1196b (3.6%)
gen6_blorp.cpp:57: Batchbuffer flush with   16b (pkt) +0b (state) = 
  16b (0.0%)
gen6_blorp.cpp:57: Batchbuffer flush with  740b (pkt) +  160b (state) = 
 900b (2.7%)

..repeating..

Which is pretty ridiculous...a 16 byte batch?  Really?  And we have to 
switch to the kernel to execbuffer that.  The others aren't much better.


With this series, we get a single batch per frame:

intel_screen.c:176: Batchbuffer flush with 6940b (pkt) + 4096b (state) = 
11036b (33.7%)

..repeating..

Clearly much better.  Every other application I've looked at sees 
similar benefits - much better batch utilization, much fewer tiny batches.


--Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: fix color exports when we have no CBs

2013-08-27 Thread Vadim Girlin

On 08/28/2013 02:59 AM, Marek Olšák wrote:

First, you won't really see any significant continual difference in
frame rate no matter how many shader variants you have unless you are
very CPU-bound. The problem is shader compilation on the first use,
that's where you get a big hiccup. Try Skyrim for example: You have to
first look around and see every object that's around you and get
unpleasant stuttering before you can actually go on and play the game.
Yes, this also Wine's fault that it compiles shaders on the first use
too, but we don't have to be as bad as Wine, do we? Valve also
reported shader recompilations on the first use being a serious issue
with open source drivers.


I perfectly understand that deferred compilation is exactly the problem 
that makes the games freeze due to shader compilation on first use when 
something new appears on the screen, but I don't think we can solve this 
problem in the *driver* by trying to compile early, because AFAICS 
currently the shaders are passed to the driver too late anyway, and this 
happens not only with wine. E.g. when I run Heaven in a window with 
MESA_GLSL=dump R600_DEBUG=ps,vs, so that I can see Heaven's window and 
console output at the same time, what I see is that most of GL dumps 
happen while Heaven shows splash screen with loading progress, but most 
of the driver's dumps appear on the first frame and few more times 
during benchmark. It looks like compilation is deferred somewhere in the 
stack before the driver, or am I missing something?


Vadim




Marek

On Tue, Aug 27, 2013 at 11:52 PM, Vadim Girlin vadimgir...@gmail.com wrote:

On 08/28/2013 12:43 AM, Marek Olšák wrote:


Shader variants are BAD, BAD, BAD. Have you ever played an AAA game
with a Mesa driver that likes to compile shader variants on first use?
It's HORRIBLE.



I don't think that shader variants are bad, but it's definitely bad when we
are compiling variants that are never used. Currently glxgears compiles 18
ps/vs shaders. In my branch with initial GS support [1] I switched handling
of the shaders to deferred compilation, that is, shaders are compiled only
before the actual draw. I found later that it's not really required for GS,
but IIRC this change results in only 5 shaders being compiled for glxgears
instead of 18. It seems most of the useless variants are results of state
changes between creation of the shader state (initial compilation) and
actual draw call.

I had some concerns about increased overhead with those changes, and it's
actually noticeable with drawoverhead demo, but I didn't see any regressions
with a few real apps that I tested, e.g. glxgears even showed slightly
better performance with these changes. Probably I also implemented it in a
not very optimal way (I was mostly concentrated on GS support) and the
overhead can be reduced.

One more thing is duplicate shaders, I've analyzed shader dumps from Unigine
Heaven 3.0 some time ago and found that from about 320 compiled shaders,
only about 180 (50%) were unique, others were duplicates (detected by
comparing the bytecode dumps for them in an automated way), maybe they had
different shader keys (which still resulted in the same bytecode), but I
suspect duplicate pipe shaders were also involved. Unfortunately I didn't
have a time to investigate it more thoroughly since then.

So my point is that we don't really need to eliminate shader variants, first
we need to eliminate compilation of unused variants and duplicate shaders.
Also we might want to consider offloading of the compilation to separate
thread(s) and caching of shader binaries between runs.

Vadim

  [1] http://cgit.freedesktop.org/~vadimg/mesa/log/?h=r600-geom-shaders




What the patch does is probably the right solution. At least
alpha-test state changes don't cause shader recompilation and
re-binding, which also negatively affects performance. Ideally we
shouldn't depend on the framebuffer state at all, but we need to
emulate the TGSI property FS_COLOR0_WRITES_ALL_CBUFS. I think we
should always be fine with key.nr_cbufs forced to 8 for any shader
without that property. I expect app developers to do the right thing
and not write outputs they don't need.

Marek

On Tue, Aug 27, 2013 at 9:00 PM, Roland Scheidegger srol...@vmware.com
wrote:


Not that I'm qualified to review r600 code, but couldn't you create
different shader variants depending on whether you need alpha test? At
least I would assume shader exports aren't free.

Roland

Am 27.08.2013 19:56, schrieb Vadim Girlin:


We need to export at least one color if the shader writes it,
even when nr_cbufs==0.

Signed-off-by: Vadim Girlin vadimgir...@gmail.com
---

Tested on evergreen with multiple combinations of backends - no
regressions,
fixes some tests:

default- fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
default+sb - fixes fb-alphatest-nocolor and fb_alphatest-nocolor-ff
llvm   - fixes about 25 tests related to depth/stencil
llvm+sb- fixes 

Re: [Mesa-dev] [PATCH] i965: Silece usused variable warning in release build

2013-08-27 Thread Kenneth Graunke

On 08/27/2013 03:11 PM, Chad Versace wrote:

Use `(void) success;` to silence this warning:

   i965/brw_vs.c:481:12:
   warning: unused variable 'success' [-Wunused-variable]
  bool success = do_vs_prog(brw, ctx-Shader.CurrentVertexProgram,

Signed-off-by: Chad Versace chad.vers...@linux.intel.com


Typo in commit title (Silece - Silence), otherwise, this is at least 
better than it was.


Reviewed-by: Kenneth Graunke kenn...@whitecape.org

We might want to come up with a better strategy for failing to compile, 
but...someday.


--Ken

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Initialize inout_offset parameter to brw_search_cache().

2013-08-27 Thread Paul Berry
On 27 August 2013 16:51, Damien Lespiau damien.lesp...@intel.com wrote:

 On Wed, Jul 24, 2013 at 10:02:22AM -0700, Kenneth Graunke wrote:
  On 07/24/2013 09:33 AM, Paul Berry wrote:
  Two callers of brw_search_cache() weren't initializing that function's
  inout_offset parameter: brw_blorp_const_color_params::get_wm_prog()
  and brw_blorp_const_color_params::get_wm_prog().
  
  That's a benign problem, since the only effect of not initializing
  inout_offset prior to calling brw_search_cache() is that the bit
  corresponding to cache_id in brw-state.dirty.cache may not be set
  reliably.  This is ok, since the cache_id's used by
  brw_blorp_const_color_params::get_wm_prog() and
  brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and
  BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are
  not used.
  
  However, failing to initialize this parameter causes valgrind to
  complain.  So let's go ahead and fix it to reduce valgrind noise.
  
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779

 Can we have this commit in the 9.2 branch? It was added in the release
 blocker bug but never cherry-picked for 9.2.

 Thanks!

 --
 Damien


Is there a reason for wanting this cherry-picked to 9.2 other than to
follow procedure?  As the commit message notes it's a benign problem--all
it does is cause false positives from valgrind.  I'd rather not mess with
9.2 if there's not going to be any benefit to users.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 50482] git mesa fails to build

2013-08-27 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=50482

Jos van Wolput wol...@onsneteindhoven.nl changed:

   What|Removed |Added

 OS|Solaris |Linux (All)

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH V2 10/15] mesa: Implement KHR_debug ObjectLabel functions

2013-08-27 Thread Timothy Arceri

On 28/08/13 08:25, Fredrik Höglund wrote:

On Tuesday 27 August 2013, Timothy Arceri wrote:

V2: fixed indentation of comment

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
  src/mesa/main/objectlabel.c | 277 
  src/mesa/main/objectlabel.h |  61 ++
  2 files changed, 338 insertions(+)
  create mode 100644 src/mesa/main/objectlabel.c
  create mode 100644 src/mesa/main/objectlabel.h

diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c
new file mode 100644
index 000..78f9b33
--- /dev/null
+++ b/src/mesa/main/objectlabel.c
@@ -0,0 +1,277 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright (C) 2013  Timothy Arceri   All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include arrayobj.h
+#include bufferobj.h
+#include context.h
+#include dlist.h
+#include enums.h
+#include fbobject.h
+#include objectlabel.h
+#include queryobj.h
+#include samplerobj.h
+#include shaderobj.h
+#include syncobj.h
+#include texobj.h
+#include transformfeedback.h
+
+
+/**
+ * Helper for _mesa_ObjectLabel() and _mesa_ObjectPtrLabel().
+ */
+static void
+set_label(struct gl_context *ctx, char **labelPtr, const char *label,
+  int length, const char *caller)
+{
+   if (*labelPtr) {
+  /* free old label string */
+  free(*labelPtr);

You should set *labelPtr to NULL here, since the code below is not guaranteed
to assign it.


Ok I'll create an updated patch with this.




+   }
+
+   if (label) {
+  /* set new label string */
+
+  if (length = 0) {
+ /* explicit length */
+ *labelPtr = (char *) malloc(length);
+ if (*labelPtr) {
+memcpy(*labelPtr, label, length);
+ }

The length given by the client is not required to include a terminating
null-character.


Correct. There is no requirement for a terminating null-charater in this 
code so I'm not sure what your trying to say.


The null-charater is applied in the GetObjectLabel call where the client 
is required to provide bufsize which includes the null-terminator.





+  }
+  else {
+ /* null-terminated string */
+ int len = strlen(label);
+ if (len = MAX_LABEL_LENGTH) {
+/* An INVALID_VALUE error is generated if the number of characters
+ * in label, excluding the null terminator when length is
+ * negative, is not less than the value of MAX_LABEL_LENGTH.
+ */
+_mesa_error(ctx, GL_INVALID_VALUE,
+%s(length=%d, which is not less than 
+GL_MAX_LABEL_LENGTH=%d), caller, length,
+MAX_LABEL_LENGTH);

This error should also be generated when the client specifies an explicit
length that exceeds MAX_LABEL_LENGTH.


The comment in this code is from the spec. The spec only ever says to 
generate this error when length is negative.




Fredrik



___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V2 03/15] mesa: Add a clone function to mesa hash

2013-08-27 Thread Timothy Arceri
V2: const qualify table parameter

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 src/mesa/main/hash.c |   28 
 src/mesa/main/hash.h |3 +++
 2 files changed, 31 insertions(+)

diff --git a/src/mesa/main/hash.c b/src/mesa/main/hash.c
index 6591af9..b31fd48 100644
--- a/src/mesa/main/hash.c
+++ b/src/mesa/main/hash.c
@@ -302,6 +302,34 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table,
 
 
 /**
+ * Clone all entries in a hash table, into a new table.
+ *
+ * \param table  the hash table to clone
+ */
+struct _mesa_HashTable *
+_mesa_HashClone(const struct _mesa_HashTable *table)
+{
+   /* cast-away const */
+   struct _mesa_HashTable *table2 = (struct _mesa_HashTable *) table;
+   struct hash_entry *entry;
+   struct _mesa_HashTable *clonetable;
+
+   ASSERT(table);
+   _glthread_LOCK_MUTEX(table2-Mutex);
+
+   clonetable = _mesa_NewHashTable();
+   assert(clonetable);
+   hash_table_foreach(table-ht, entry) {
+  _mesa_HashInsert(clonetable, (GLint)(uintptr_t)entry-key, entry-data);
+   }
+
+   _glthread_UNLOCK_MUTEX(table2-Mutex);
+
+   return clonetable;
+}
+
+
+/**
  * Walk over all entries in a hash table, calling callback function for each.
  * Note: we use a separate mutex in this function to avoid a recursive
  * locking deadlock (in case the callback calls _mesa_HashRemove()) and to
diff --git a/src/mesa/main/hash.h b/src/mesa/main/hash.h
index 142d284..b34f328 100644
--- a/src/mesa/main/hash.h
+++ b/src/mesa/main/hash.h
@@ -50,6 +50,9 @@ _mesa_HashDeleteAll(struct _mesa_HashTable *table,
 void (*callback)(GLuint key, void *data, void *userData),
 void *userData);
 
+extern struct _mesa_HashTable *
+_mesa_HashClone(const struct _mesa_HashTable *table);
+
 extern void
 _mesa_HashWalk(const struct _mesa_HashTable *table,
void (*callback)(GLuint key, void *data, void *userData),
-- 
1.7.9.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Initialize inout_offset parameter to brw_search_cache().

2013-08-27 Thread Kenneth Graunke

On 08/27/2013 06:21 PM, Paul Berry wrote:

On 27 August 2013 16:51, Damien Lespiau damien.lesp...@intel.com wrote:

On Wed, Jul 24, 2013 at 10:02:22AM -0700, Kenneth Graunke wrote:
  On 07/24/2013 09:33 AM, Paul Berry wrote:
  Two callers of brw_search_cache() weren't initializing that
function's
  inout_offset parameter: brw_blorp_const_color_params::get_wm_prog()
  and brw_blorp_const_color_params::get_wm_prog().
  
  That's a benign problem, since the only effect of not initializing
  inout_offset prior to calling brw_search_cache() is that the bit
  corresponding to cache_id in brw-state.dirty.cache may not be set
  reliably.  This is ok, since the cache_id's used by
  brw_blorp_const_color_params::get_wm_prog() and
  brw_blorp_blit_params::get_wm_prog() (BRW_BLORP_CONST_COLOR_PROG and
  BRW_BLORP_BLIT_PROG, respectively) correspond to dirty bits that are
  not used.
  
  However, failing to initialize this parameter causes valgrind to
  complain.  So let's go ahead and fix it to reduce valgrind noise.
  
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66779

Can we have this commit in the 9.2 branch? It was added in the release
blocker bug but never cherry-picked for 9.2.

Thanks!

--
Damien


Is there a reason for wanting this cherry-picked to 9.2 other than to
follow procedure?  As the commit message notes it's a benign
problem--all it does is cause false positives from valgrind.  I'd rather
not mess with 9.2 if there's not going to be any benefit to users.


People working on GL programs want to be able to use valgrind to debug 
their own code.  They don't want to see driver issues - not only is it 
confusing, it also means extra things to skip over every time they run 
valgrind on their program.


So, I'm strongly in favor of cherry-picking it.

--Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glx: make the interval of LIBGL_SHOW_FPS adjustable

2013-08-27 Thread Chia-I Wu
LIBGL_SHOW_FPS=1 makes GLX print FPS every second while other values do
nothing.  Extend it so that LIBGL_SHOW_FPS=N will print the FPS every N
seconds.
---
 src/glx/dri2_glx.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/glx/dri2_glx.c b/src/glx/dri2_glx.c
index c54edac..54fc21c 100644
--- a/src/glx/dri2_glx.c
+++ b/src/glx/dri2_glx.c
@@ -95,7 +95,7 @@ struct dri2_screen {
void *driver;
int fd;
 
-   Bool show_fps;
+   int show_fps_interval;
 };
 
 struct dri2_context
@@ -764,6 +764,8 @@ unsigned dri2GetSwapEventType(Display* dpy, XID drawable)
 
 static void show_fps(struct dri2_drawable *draw)
 {
+   const int interval =
+  ((struct dri2_screen *) draw-base.psc)-show_fps_interval;
struct timeval tv;
uint64_t current_time;
 
@@ -772,7 +774,7 @@ static void show_fps(struct dri2_drawable *draw)
 
draw-frames++;
 
-   if (draw-previous_time + 100 = current_time) {
+   if (draw-previous_time + interval * 100 = current_time) {
   if (draw-previous_time) {
  fprintf(stderr, libGL: FPS = %.1f\n,
  ((uint64_t)draw-frames * 100) /
@@ -859,7 +861,7 @@ dri2SwapBuffers(__GLXDRIdrawable *pdraw, int64_t 
target_msc, int64_t divisor,
 target_msc, divisor, remainder);
 }
 
-if (psc-show_fps) {
+if (psc-show_fps_interval) {
show_fps(priv);
 }
 
@@ -1283,7 +1285,9 @@ dri2CreateScreen(int screen, struct glx_display * priv)
free(deviceName);
 
tmp = getenv(LIBGL_SHOW_FPS);
-   psc-show_fps = tmp  strcmp(tmp, 1) == 0;
+   psc-show_fps_interval = (tmp) ? atoi(tmp) : 0;
+   if (psc-show_fps_interval  0)
+  psc-show_fps_interval = 0;
 
return psc-base;
 
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glsl: propagate max_array_access through function calls

2013-08-27 Thread Dominik Behr
Hi,

I am looking at the code in lower_clip_distance.cpp
ir_visitor_status
lower_clip_distance_visitor::visit_leave(ir_call *ir)
{
   void *ctx = ralloc_parent(ir);

   const exec_node *formal_param_node = ir-callee-parameters.head;
   const exec_node *actual_param_node = ir-actual_parameters.head;
   while (!actual_param_node-is_tail_sentinel()) {
  ir_variable *formal_param = (ir_variable *) formal_param_node;
  ir_rvalue *actual_param = (ir_rvalue *) actual_param_node;

  /* Advance formal_param_node and actual_param_node now so that we can
   * safely replace actual_param with another node, if necessary, below.
   */
  formal_param_node = formal_param_node-next;
  actual_param_node = actual_param_node-next;



is_tail_sentinel() is true when next == NULL. this means this loop will not
be executed for the last formal param in the list. Is this by design? In my
case I need to visit all function parameters.
I think it should by simply while (actual_param_node) {
Also, what is the point of exec_node::get_head() and exec_node::get_next()
accessor methods if they are not used?


On Tue, Aug 27, 2013 at 3:15 PM, Matt Turner matts...@gmail.com wrote:

 On Tue, Aug 27, 2013 at 2:03 PM, Dominik Behr db...@chromium.org wrote:
  I am not sure about MAX2, is it guaranteed to be converted to branchless
  conditional assignment?

 I don't think there's a guarantee, no. Using MAX2 just makes the code
 easier to immediately understand.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH V3 10/15] mesa: Implement KHR_debug ObjectLabel functions

2013-08-27 Thread Timothy Arceri
V3: make sure to add null terminator when setting label,
generate error when the client specifies an explicit
length that exceeds MAX_LABEL_LENGTH, set label pointer
to NULL when freed, and output correct length in
MAX_LABEL_LENGTH error message.

V2: fixed indentation of comment

Signed-off-by: Timothy Arceri t_arc...@yahoo.com.au
---
 src/mesa/main/objectlabel.c |  282 +++
 src/mesa/main/objectlabel.h |   61 ++
 2 files changed, 343 insertions(+)
 create mode 100644 src/mesa/main/objectlabel.c
 create mode 100644 src/mesa/main/objectlabel.h

diff --git a/src/mesa/main/objectlabel.c b/src/mesa/main/objectlabel.c
new file mode 100644
index 000..0edb728
--- /dev/null
+++ b/src/mesa/main/objectlabel.c
@@ -0,0 +1,282 @@
+/*
+ * Mesa 3-D graphics library
+ *
+ * Copyright (C) 2013  Timothy Arceri   All Rights Reserved.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included
+ * in all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ * OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+
+
+#include arrayobj.h
+#include bufferobj.h
+#include context.h
+#include dlist.h
+#include enums.h
+#include fbobject.h
+#include objectlabel.h
+#include queryobj.h
+#include samplerobj.h
+#include shaderobj.h
+#include syncobj.h
+#include texobj.h
+#include transformfeedback.h
+
+
+/**
+ * Helper for _mesa_ObjectLabel() and _mesa_ObjectPtrLabel().
+ */
+static void
+set_label(struct gl_context *ctx, char **labelPtr, const char *label,
+  int length, const char *caller)
+{
+   if (*labelPtr) {
+  /* free old label string */
+  free(*labelPtr);
+  *labelPtr = NULL;
+   }
+
+   /* set new label string */
+   if (label) {
+  if (length = 0) {
+ if (length = MAX_LABEL_LENGTH)
+_mesa_error(ctx, GL_INVALID_VALUE,
+%s(length=%d, which is not less than 
+GL_MAX_LABEL_LENGTH=%d), caller, length,
+MAX_LABEL_LENGTH);
+
+ /* explicit length */
+ *labelPtr = (char *) malloc(length+1);
+ if (*labelPtr) {
+memcpy(*labelPtr, label, length);
+/* length is not required to include the null terminator so
+ * add one just in case
+ */
+(*labelPtr)[length] = '\0';
+ }
+  }
+  else {
+ int len = strlen(label);
+ if (len = MAX_LABEL_LENGTH)
+_mesa_error(ctx, GL_INVALID_VALUE,
+%s(label length=%d, which is not less than 
+GL_MAX_LABEL_LENGTH=%d), caller, len,
+MAX_LABEL_LENGTH);
+
+ /* null-terminated string */
+ *labelPtr = _mesa_strdup(label);
+  }
+   }
+}
+
+/**
+ * Helper for _mesa_GetObjectLabel() and _mesa_GetObjectPtrLabel().
+ */
+static void
+copy_label(char **labelPtr, char *label, int *length, int bufSize)
+{
+   int labelLen = 0;
+
+   if (*labelPtr)
+  labelLen = strlen(*labelPtr);
+
+   if (label) {
+  if (bufSize = labelLen)
+ labelLen =  bufSize-1;
+
+  memcpy(label, *labelPtr, labelLen);
+  label[labelLen] = '\0';
+   }
+
+   if (length)
+  *length = labelLen;
+}
+
+/**
+ * Helper for _mesa_ObjectLabel() and _mesa_GetObjectLabel().
+ */
+static char **
+get_label_pointer(struct gl_context *ctx, GLenum identifier, GLuint name,
+  const char *caller)
+{
+   char **labelPtr = NULL;
+
+   switch (identifier) {
+   case GL_BUFFER:
+  {
+ struct gl_buffer_object *bufObj = _mesa_lookup_bufferobj(ctx, name);
+ if (bufObj)
+labelPtr = bufObj-Label;
+  }
+  break;
+   case GL_SHADER:
+  {
+ struct gl_shader *shader = _mesa_lookup_shader(ctx, name);
+ if (shader)
+labelPtr = shader-Label;
+  }
+  break;
+   case GL_PROGRAM:
+  {
+ struct gl_shader_program *program =
+_mesa_lookup_shader_program(ctx, name);
+ if (program)
+labelPtr = program-Label;
+  }
+  break;
+   case GL_VERTEX_ARRAY:
+  {
+

Re: [Mesa-dev] [PATCH V2 10/15] mesa: Implement KHR_debug ObjectLabel functions

2013-08-27 Thread Timothy Arceri





+   }
+
+   if (label) {
+  /* set new label string */
+
+  if (length = 0) {
+ /* explicit length */
+ *labelPtr = (char *) malloc(length);
+ if (*labelPtr) {
+memcpy(*labelPtr, label, length);
+ }

The length given by the client is not required to include a terminating
null-character.


Ok I see what you mean now. Fixed in V3 of the patch I just sent out.


+  }
+  else {
+ /* null-terminated string */
+ int len = strlen(label);
+ if (len = MAX_LABEL_LENGTH) {
+/* An INVALID_VALUE error is generated if the number of characters
+ * in label, excluding the null terminator when length is
+ * negative, is not less than the value of MAX_LABEL_LENGTH.
+ */
+_mesa_error(ctx, GL_INVALID_VALUE,
+%s(length=%d, which is not less than 
+GL_MAX_LABEL_LENGTH=%d), caller, length,
+MAX_LABEL_LENGTH);

This error should also be generated when the client specifies an explicit
length that exceeds MAX_LABEL_LENGTH.


Yes you are right. The wording in the spec was confusing me a little I had to 
read it a couple of times
before it made sense. Again fixed in V3 of the patch I just sent out.


Fredrik

Thanks for reviewing Fredrik

Tim
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] tgsi dump and parsing

2013-08-27 Thread Dave Airlie
 IMM[0] FLT32 { 0x, 0x, 0x, 0x }  # 1.0, 3.0, 2.0, 4.0

 If you use %.9g instead of %.4f then floating point numbers will be 
 preserved without loss of precision.


I see a -nan in my tests that doesn't get reparsed so I expect hex is
still better.


oops to list as well this time, sorry.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] tgsi dump and parsing

2013-08-27 Thread Dave Airlie
On Wed, Aug 28, 2013 at 3:32 PM, Dave Airlie airl...@gmail.com wrote:
 IMM[0] FLT32 { 0x, 0x, 0x, 0x }  # 1.0, 3.0, 2.0, 4.0

 If you use %.9g instead of %.4f then floating point numbers will be 
 preserved without loss of precision.


 I see a -nan in my tests that doesn't get reparsed so I expect hex is
 still better.


 oops to list as well this time, sorry.

Just in case you are wondering its
tests/shaders/glsl-const-builtin-inversesqrt.shader_test and
tests/shaders/glsl-const-builtin-normalize.shader_test
that throw up the -nan in the dumps.

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev