[Mesa-dev] [PATCH v3] glsl: handle a switch where default is in the middle of cases

2014-07-14 Thread Tapani Pälli
This fixes following tests in es3conform:

   shaders.switch.default_not_last_dynamic_vertex
   shaders.switch.default_not_last_dynamic_fragment

and makes following tests in Piglit pass:

   glsl-1.30/execution/switch/fs-default-notlast-fallthrough
   glsl-1.30/execution/switch/fs-default_notlast

No Piglit regressions.

v2: take away unnecessary ir_if, just use conditional assignment
v3: use foreach_in_list instead of foreach_list

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
Reviewed-by: Roland Scheidegger srol...@vmware.com (v2)
---
 src/glsl/ast_to_hir.cpp   | 83 +--
 src/glsl/glsl_parser_extras.h |  3 ++
 2 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 885bee5..0e0013e 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -4513,6 +4513,12 @@ ast_switch_statement::hir(exec_list *instructions,
instructions-push_tail(new(ctx) ir_assignment(deref_is_break_var,
   is_break_val));
 
+   state-switch_state.run_default =
+  new(ctx) ir_variable(glsl_type::bool_type,
+ run_default_tmp,
+ ir_var_temporary);
+   instructions-push_tail(state-switch_state.run_default);
+
/* Cache test expression.
 */
test_to_hir(instructions, state);
@@ -4567,8 +4573,71 @@ ir_rvalue *
 ast_case_statement_list::hir(exec_list *instructions,
  struct _mesa_glsl_parse_state *state)
 {
-   foreach_list_typed (ast_case_statement, case_stmt, link,  this-cases)
-  case_stmt-hir(instructions, state);
+   exec_list default_case, after_default, tmp;
+
+   foreach_list_typed (ast_case_statement, case_stmt, link,  this-cases) {
+  case_stmt-hir(tmp, state);
+
+  /* Default case. */
+  if (state-switch_state.previous_default  default_case.is_empty()) {
+ default_case.append_list(tmp);
+ continue;
+  }
+
+  /* If default case found, append 'after_default' list. */
+  if (!default_case.is_empty())
+ after_default.append_list(tmp);
+  else
+ instructions-append_list(tmp);
+   }
+
+   /* Handle the default case. This is done here because default might not be
+* the last case. We need to add checks against following cases first to see
+* if default should be chosen or not.
+*/
+   if (!default_case.is_empty()) {
+
+  /* Default case was the last one, no checks required. */
+  if (after_default.is_empty()) {
+ instructions-append_list(default_case);
+ return NULL;
+  }
+
+  ir_rvalue *const true_val = new (state) ir_constant(true);
+  ir_dereference_variable *deref_run_default_var =
+ new(state) ir_dereference_variable(state-switch_state.run_default);
+
+  /* Choose to run default case initially, following conditional
+   * assignments might change this.
+   */
+  ir_assignment *const init_var =
+ new(state) ir_assignment(deref_run_default_var, true_val);
+  instructions-push_tail(init_var);
+
+  foreach_in_list(ir_instruction, ir, after_default) {
+ ir_assignment *assign = ir-as_assignment();
+
+ if (!assign)
+continue;
+
+ /* Clone the check between case label and init expression. */
+ ir_expression *exp = (ir_expression*) assign-condition;
+ ir_expression *clone = exp-clone(state, NULL);
+
+ ir_dereference_variable *deref_var =
+new(state) 
ir_dereference_variable(state-switch_state.run_default);
+ ir_rvalue *const false_val = new (state) ir_constant(false);
+
+ ir_assignment *const set_false =
+new(state) ir_assignment(deref_var, false_val, clone);
+
+ instructions-push_tail(set_false);
+  }
+
+  /* Append default case and all cases after it. */
+  instructions-append_list(default_case);
+  instructions-append_list(after_default);
+   }
 
/* Case statements do not have r-values. */
return NULL;
@@ -4728,9 +4797,17 @@ ast_case_label::hir(exec_list *instructions,
   }
   state-switch_state.previous_default = this;
 
+  /* Set fallthru condition on 'run_default' bool. */
+  ir_dereference_variable *deref_run_default =
+ new(ctx) ir_dereference_variable(state-switch_state.run_default);
+  ir_rvalue *const cond_true = new(ctx) ir_constant(true);
+  ir_expression *test_cond = new(ctx) ir_expression(ir_binop_all_equal,
+cond_true,
+deref_run_default);
+
   /* Set falltrhu state. */
   ir_assignment *set_fallthru =
- new(ctx) ir_assignment(deref_fallthru_var, true_val);
+ new(ctx) ir_assignment(deref_fallthru_var, true_val, test_cond);
 
   instructions-push_tail(set_fallthru);
}
diff --git 

[Mesa-dev] [Bug 78716] Fix Mesa bugs for running Unreal Engine 4.1 Cave effects demo compiled for Linux

2014-07-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78716

--- Comment #10 from Tapani Pälli lem...@gmail.com ---
Created attachment 102771
  -- https://bugs.freedesktop.org/attachment.cgi?id=102771action=edit
patch that adds 4.0 missing blend functions api

attached patch makes some of these new demos work for me on Intel Haswell
machine

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 78716] Fix Mesa bugs for running Unreal Engine 4.1 Cave effects demo compiled for Linux

2014-07-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=78716

--- Comment #11 from Kenneth Graunke kenn...@whitecape.org ---
(In reply to comment #10)
 Created attachment 102771 [details] [review]
 patch that adds 4.0 missing blend functions api
 
 attached patch makes some of these new demos work for me on Intel Haswell
 machine

Ah, that problem :)  Tapani, would you mind sending that patch to the mailing
list and Cc'ing Ian?

We support ARB_draw_buffers_blend, which introduces these functions but with an
ARB suffix.  But, if you read ARB_blend_func_extended, it talks abbout removing
references to BlendFunci (not BlendFunciARB!) if you don't support
ARB_draw_buffers_blend...which makes it sound like ARB_draw_buffers_blend
should have introduced them.

https://www.opengl.org/registry/specs/ARB/draw_buffers_blend.txt
https://www.opengl.org/registry/specs/ARB/blend_func_extended.txt

Since it's just accepting the core names in addition to the ARB names, which
we're going to do eventually anyway, I'm in favor of taking your patch.  I've
seen a few of Intel's GL tests hit this bug as well.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Add an option to not generate the SIMD8 fragment shader

2014-07-14 Thread Kenneth Graunke
On Friday, July 11, 2014 11:26:30 AM Kristian Høgsberg wrote:
 For now, this can only be triggered with a new 'no8' INTEL_DEBUG option
 
 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_context.h   |  2 ++
  src/mesa/drivers/dri/i965/brw_fs.cpp  | 14 --
  src/mesa/drivers/dri/i965/gen7_wm_state.c |  4 ++--
  src/mesa/drivers/dri/i965/gen8_ps_state.c |  4 ++--
  src/mesa/drivers/dri/i965/intel_debug.c   |  1 +
  src/mesa/drivers/dri/i965/intel_debug.h   |  1 +
  6 files changed, 20 insertions(+), 6 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_context.h 
b/src/mesa/drivers/dri/i965/brw_context.h
 index 2943a20..11cea04 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -340,6 +340,7 @@ struct brw_wm_prog_data {
/** @} */
 } binding_table;
  
 +   bool no_8;
 bool dual_src_blend;
 bool uses_pos_offset;
 bool uses_omask;
 @@ -1039,6 +1040,7 @@ struct brw_context
 bool has_compr4;
 bool has_negative_rhw_bug;
 bool has_pln;
 +   bool no_simd8;
  
 /**
  * Some versions of Gen hardware don't do centroid interpolation 
correctly
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp 
b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a3ad375..f018d94 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3258,15 +3258,25 @@ brw_wm_fs_emit(struct brw_context *brw,
}
 }
  
 +   exec_list *simd8_instructions;
 +   int no_simd8 = (INTEL_DEBUG  DEBUG_NO8) || brw-no_simd8;
 +   if (no_simd8  simd16_instructions) {
 +  simd8_instructions = NULL;
 +  prog_data-no_8 = true;
 +   } else {
 +  simd8_instructions = v.instructions;
 +  prog_data-no_8 = false;
 +   }
 +
 const unsigned *assembly = NULL;
 if (brw-gen = 8) {
gen8_fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, 
v.do_dual_src);
 -  assembly = g.generate_assembly(v.instructions, simd16_instructions,
 +  assembly = g.generate_assembly(simd8_instructions, 
simd16_instructions,
   final_assembly_size);
 } else {
fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src,
   v.runtime_check_aads_emit, INTEL_DEBUG  DEBUG_WM);
 -  assembly = g.generate_assembly(v.instructions, simd16_instructions,
 +  assembly = g.generate_assembly(simd8_instructions, 
simd16_instructions,
   final_assembly_size);
 }
  
 diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c 
b/src/mesa/drivers/dri/i965/gen7_wm_state.c
 index c03d19d..c3e9316 100644
 --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
 @@ -223,9 +223,9 @@ upload_ps_state(struct brw_context *brw)
_mesa_get_min_invocations_per_fragment(ctx, brw-fragment_program, 
false);
 assert(min_inv_per_frag = 1);
  
 -   if (brw-wm.prog_data-prog_offset_16) {
 +   if (brw-wm.prog_data-prog_offset_16 || brw-wm.prog_data-no_8) {
dw4 |= GEN7_PS_16_DISPATCH_ENABLE;
 -  if (min_inv_per_frag == 1) {
 +  if (!brw-wm.prog_data-no_8  min_inv_per_frag == 1) {
   dw4 |= GEN7_PS_8_DISPATCH_ENABLE;
   dw5 |= (brw-wm.prog_data-base.dispatch_grf_start_reg 
   GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c 
b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 index f58d49c..49d4fe0 100644
 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 @@ -195,9 +195,9 @@ upload_ps_state(struct brw_context *brw)
_mesa_get_min_invocations_per_fragment(ctx, brw-fragment_program, 
false);
 assert(min_invocations_per_fragment = 1);
  
 -   if (brw-wm.prog_data-prog_offset_16) {
 +   if (brw-wm.prog_data-prog_offset_16 || brw-wm.prog_data-no_8) {
dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
 -  if (min_invocations_per_fragment == 1) {
 +  if (!brw-wm.prog_data-no_8  min_invocations_per_fragment == 1) {
   dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
   dw7 |= (brw-wm.prog_data-base.dispatch_grf_start_reg 
   GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 diff --git a/src/mesa/drivers/dri/i965/intel_debug.c 
b/src/mesa/drivers/dri/i965/intel_debug.c
 index c72fce2..1fb8b6c 100644
 --- a/src/mesa/drivers/dri/i965/intel_debug.c
 +++ b/src/mesa/drivers/dri/i965/intel_debug.c
 @@ -66,6 +66,7 @@ static const struct dri_debug_control debug_control[] = {
 { nodualobj, DEBUG_NO_DUAL_OBJECT_GS },
 { optimizer, DEBUG_OPTIMIZER },
 { noann, DEBUG_NO_ANNOTATION },
 +   { no8,  DEBUG_NO8 },
 { NULL,0 }
  };
  
 diff --git a/src/mesa/drivers/dri/i965/intel_debug.h 
b/src/mesa/drivers/dri/i965/intel_debug.h
 index 37dc34a..8e1c299 100644
 --- a/src/mesa/drivers/dri/i965/intel_debug.h
 +++ b/src/mesa/drivers/dri/i965/intel_debug.h
 @@ -62,6 +62,7 @@ extern uint64_t 

Re: [Mesa-dev] [PATCH] glsl: Fix aggregates with dynamic initializers.

2014-07-14 Thread Kenneth Graunke
On Thursday, July 10, 2014 09:55:31 AM Cody Northrop wrote:
 Vectors are falling in to the ir_dereference_array() path.
 
 Without this change, the following glsl aborts the debug driver,
 or gets the wrong answer in release:
 
 mat2x2 a = mat2( vec2( 1.0, vertex.x ), vec2( 0.0, 1.0 ) );
 
 Also submitting piglit tests, will reference in bug.
 
 v2: Rebase on Mesa master.
 
 v3: Remove unneeded check for arrays, which are covered by
 process_array_constructor(), recommended by Timothy Arceri.

Thanks for the feedback, Timothy, and for the updated patch, Cody.

Pushed.

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add a mechanism to allow #extension directives in the middle of shaders

2014-07-14 Thread Kenneth Graunke
On Tuesday, July 08, 2014 08:29:44 PM Marek Olšák wrote:
 From: Marek Olšák marek.ol...@amd.com
 
 This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work
 with sample shading.
 
 Also, if this is disabled, the error message at least makes sense now.
 ---
  src/glsl/glsl_parser.yy | 8 
  src/glsl/glsl_parser_extras.cpp | 2 ++
  src/glsl/glsl_parser_extras.h   | 2 ++
  src/mesa/main/mtypes.h  | 5 +
  4 files changed, 17 insertions(+)
 
 diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
 index b989749..4c87163 100644
 --- a/src/glsl/glsl_parser.yy
 +++ b/src/glsl/glsl_parser.yy
 @@ -376,6 +376,14 @@ external_declaration_list:
if ($2 != NULL)
   state-translation_unit.push_tail( $2-link);
 }
 +   | external_declaration_list extension_statement {
 +  if (!state-allow_extension_directive_midshader) {
 + _mesa_glsl_error( @2, state,
 +  #extension directive is not allowed 
 +  in the middle of a shader);
 + YYERROR;
 +  }
 +   }
 ;
  
  variable_identifier:
 diff --git a/src/glsl/glsl_parser_extras.cpp 
b/src/glsl/glsl_parser_extras.cpp
 index b327c2b..890123a 100644
 --- a/src/glsl/glsl_parser_extras.cpp
 +++ b/src/glsl/glsl_parser_extras.cpp
 @@ -210,6 +210,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
gl_context *_ctx,
 this-early_fragment_tests = false;
 memset(this-atomic_counter_offsets, 0,
sizeof(this-atomic_counter_offsets));
 +   this-allow_extension_directive_midshader =
 +  ctx-Const.AllowGLSLExtensionDirectiveMidShader;
  }
  
  /**
 diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
 index 1791816..f720d6a 100644
 --- a/src/glsl/glsl_parser_extras.h
 +++ b/src/glsl/glsl_parser_extras.h
 @@ -490,6 +490,8 @@ struct _mesa_glsl_parse_state {
  
 /** Atomic counter offsets by binding */
 unsigned atomic_counter_offsets[MAX_COMBINED_ATOMIC_BUFFERS];
 +
 +   bool allow_extension_directive_midshader;
  };
  
  # define YYLLOC_DEFAULT(Current, Rhs, N) \
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index a7126fd..b42a175 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3349,6 +3349,11 @@ struct gl_constants
 GLuint ForceGLSLVersion;
  
 /**
 +* Allow GLSL #extension directives in the middle of shaders.
 +*/
 +   GLboolean AllowGLSLExtensionDirectiveMidShader;
 +
 +   /**
  * Does the driver support real 32-bit integers?  (Otherwise, integers 
are
  * simulated via floats.)
  */
 

Looks great to me, Marek - thanks for doing this.

I'd written a similar patch, but introduced shift/reduce conflicts while doing 
it.  Looks like you avoided that - nice :)

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] glapi: Use GetProcAddress instead of dlsym on Windows.

2014-07-14 Thread Brian Paul

On 07/11/2014 12:52 PM, Vinson Lee wrote:

This patch fixes this MinGW build error.

glapi_gentable.c: In function '_glapi_create_table_from_handle':
glapi_gentable.c:123:9: error: implicit declaration of function 'dlsym' 
[-Werror=implicit-function-declaration]
  *procp = dlsym(handle, symboln);
  ^

Signed-off-by: Vinson Lee v...@freedesktop.org
---
  src/mapi/glapi/gen/gl_gentable.py | 4 
  1 file changed, 4 insertions(+)

diff --git a/src/mapi/glapi/gen/gl_gentable.py 
b/src/mapi/glapi/gen/gl_gentable.py
index d45a5e0..9db6a77 100644
--- a/src/mapi/glapi/gen/gl_gentable.py
+++ b/src/mapi/glapi/gen/gl_gentable.py
@@ -134,7 +134,11 @@ body_template = 
  if(!disp-%(name)s) {
  void ** procp = (void **) disp-%(name)s;
  snprintf(symboln, sizeof(symboln), %%s%(entry_point)s, 
symbol_prefix);
+#ifdef _WIN32
+*procp = GetProcAddress(handle, symboln);
+#else
  *procp = dlsym(handle, symboln);
+#endif
  }
  




Hmm, I've been building on Mingw as-is for a long-time without ever 
seeing this problem.  That said, this patch doesn't seem to break 
anything for me.


Acked-by: Brian Paul bri...@vmware.com

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Add an option to not generate the SIMD8 fragment shader

2014-07-14 Thread Kristian Høgsberg
On Mon, Jul 14, 2014 at 9:17 AM, Kenneth Graunke kenn...@whitecape.org wrote:
 On Friday, July 11, 2014 11:26:30 AM Kristian Høgsberg wrote:
 For now, this can only be triggered with a new 'no8' INTEL_DEBUG option

 Signed-off-by: Kristian Høgsberg k...@bitplanet.net
 ---
  src/mesa/drivers/dri/i965/brw_context.h   |  2 ++
  src/mesa/drivers/dri/i965/brw_fs.cpp  | 14 --
  src/mesa/drivers/dri/i965/gen7_wm_state.c |  4 ++--
  src/mesa/drivers/dri/i965/gen8_ps_state.c |  4 ++--
  src/mesa/drivers/dri/i965/intel_debug.c   |  1 +
  src/mesa/drivers/dri/i965/intel_debug.h   |  1 +
  6 files changed, 20 insertions(+), 6 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_context.h
 b/src/mesa/drivers/dri/i965/brw_context.h
 index 2943a20..11cea04 100644
 --- a/src/mesa/drivers/dri/i965/brw_context.h
 +++ b/src/mesa/drivers/dri/i965/brw_context.h
 @@ -340,6 +340,7 @@ struct brw_wm_prog_data {
/** @} */
 } binding_table;

 +   bool no_8;
 bool dual_src_blend;
 bool uses_pos_offset;
 bool uses_omask;
 @@ -1039,6 +1040,7 @@ struct brw_context
 bool has_compr4;
 bool has_negative_rhw_bug;
 bool has_pln;
 +   bool no_simd8;

 /**
  * Some versions of Gen hardware don't do centroid interpolation
 correctly
 diff --git a/src/mesa/drivers/dri/i965/brw_fs.cpp
 b/src/mesa/drivers/dri/i965/brw_fs.cpp
 index a3ad375..f018d94 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs.cpp
 @@ -3258,15 +3258,25 @@ brw_wm_fs_emit(struct brw_context *brw,
}
 }

 +   exec_list *simd8_instructions;
 +   int no_simd8 = (INTEL_DEBUG  DEBUG_NO8) || brw-no_simd8;
 +   if (no_simd8  simd16_instructions) {
 +  simd8_instructions = NULL;
 +  prog_data-no_8 = true;
 +   } else {
 +  simd8_instructions = v.instructions;
 +  prog_data-no_8 = false;
 +   }
 +
 const unsigned *assembly = NULL;
 if (brw-gen = 8) {
gen8_fs_generator g(brw, mem_ctx, key, prog_data, prog, fp,
 v.do_dual_src);
 -  assembly = g.generate_assembly(v.instructions, simd16_instructions,
 +  assembly = g.generate_assembly(simd8_instructions,
 simd16_instructions,
   final_assembly_size);
 } else {
fs_generator g(brw, mem_ctx, key, prog_data, prog, fp, v.do_dual_src,
   v.runtime_check_aads_emit, INTEL_DEBUG  DEBUG_WM);
 -  assembly = g.generate_assembly(v.instructions, simd16_instructions,
 +  assembly = g.generate_assembly(simd8_instructions,
 simd16_instructions,
   final_assembly_size);
 }

 diff --git a/src/mesa/drivers/dri/i965/gen7_wm_state.c
 b/src/mesa/drivers/dri/i965/gen7_wm_state.c
 index c03d19d..c3e9316 100644
 --- a/src/mesa/drivers/dri/i965/gen7_wm_state.c
 +++ b/src/mesa/drivers/dri/i965/gen7_wm_state.c
 @@ -223,9 +223,9 @@ upload_ps_state(struct brw_context *brw)
_mesa_get_min_invocations_per_fragment(ctx, brw-fragment_program,
 false);
 assert(min_inv_per_frag = 1);

 -   if (brw-wm.prog_data-prog_offset_16) {
 +   if (brw-wm.prog_data-prog_offset_16 || brw-wm.prog_data-no_8) {
dw4 |= GEN7_PS_16_DISPATCH_ENABLE;
 -  if (min_inv_per_frag == 1) {
 +  if (!brw-wm.prog_data-no_8  min_inv_per_frag == 1) {
   dw4 |= GEN7_PS_8_DISPATCH_ENABLE;
   dw5 |= (brw-wm.prog_data-base.dispatch_grf_start_reg 
   GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 diff --git a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 index f58d49c..49d4fe0 100644
 --- a/src/mesa/drivers/dri/i965/gen8_ps_state.c
 +++ b/src/mesa/drivers/dri/i965/gen8_ps_state.c
 @@ -195,9 +195,9 @@ upload_ps_state(struct brw_context *brw)
_mesa_get_min_invocations_per_fragment(ctx, brw-fragment_program,
 false);
 assert(min_invocations_per_fragment = 1);

 -   if (brw-wm.prog_data-prog_offset_16) {
 +   if (brw-wm.prog_data-prog_offset_16 || brw-wm.prog_data-no_8) {
dw6 |= GEN7_PS_16_DISPATCH_ENABLE;
 -  if (min_invocations_per_fragment == 1) {
 +  if (!brw-wm.prog_data-no_8  min_invocations_per_fragment == 1) {
   dw6 |= GEN7_PS_8_DISPATCH_ENABLE;
   dw7 |= (brw-wm.prog_data-base.dispatch_grf_start_reg 
   GEN7_PS_DISPATCH_START_GRF_SHIFT_0);
 diff --git a/src/mesa/drivers/dri/i965/intel_debug.c
 b/src/mesa/drivers/dri/i965/intel_debug.c
 index c72fce2..1fb8b6c 100644
 --- a/src/mesa/drivers/dri/i965/intel_debug.c
 +++ b/src/mesa/drivers/dri/i965/intel_debug.c
 @@ -66,6 +66,7 @@ static const struct dri_debug_control debug_control[] = {
 { nodualobj, DEBUG_NO_DUAL_OBJECT_GS },
 { optimizer, DEBUG_OPTIMIZER },
 { noann, DEBUG_NO_ANNOTATION },
 +   { no8,  DEBUG_NO8 },
 { NULL,0 }
  };

 diff --git a/src/mesa/drivers/dri/i965/intel_debug.h
 b/src/mesa/drivers/dri/i965/intel_debug.h
 index 37dc34a..8e1c299 100644
 --- a/src/mesa/drivers/dri/i965/intel_debug.h
 +++ 

[Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Glenn Kennard
Only supported on evergreen and later. Limited to
single component textures as the hardware GATHER4
instruction ignores texture swizzles.

Piglit quick run passes on radeon 6670 with all
applicable textureGather tests, no regressions.

Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
---
 docs/GL3.txt   |  2 +-
 docs/relnotes/10.3.html|  2 +-
 src/gallium/drivers/r600/r600_pipe.c   |  5 ++--
 src/gallium/drivers/r600/r600_shader.c | 47 +-
 4 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/docs/GL3.txt b/docs/GL3.txt
index a2f438b..20e57b0 100644
--- a/docs/GL3.txt
+++ b/docs/GL3.txt
@@ -118,7 +118,7 @@ GL 4.0:
   GL_ARB_tessellation_shader   started (Fabian)
   GL_ARB_texture_buffer_object_rgb32   DONE (i965, nvc0, r600, 
radeonsi, softpipe)
   GL_ARB_texture_cube_map_arrayDONE (i965, nv50, nvc0, 
r600, radeonsi, softpipe)
-  GL_ARB_texture_gatherDONE (i965, nv50, nvc0, 
radeonsi)
+  GL_ARB_texture_gatherDONE (i965, nv50, nvc0, 
radeonsi, r600)
   GL_ARB_texture_query_lod DONE (i965, nv50, nvc0, 
radeonsi)
   GL_ARB_transform_feedback2   DONE (i965, nv50, nvc0, 
r600, radeonsi)
   GL_ARB_transform_feedback3   DONE (i965, nv50, nvc0, 
r600, radeonsi)
diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
index 2e718fc..1c0fab6 100644
--- a/docs/relnotes/10.3.html
+++ b/docs/relnotes/10.3.html
@@ -49,7 +49,7 @@ Note: some of the new features are only available with 
certain drivers.
 liGL_ARB_sample_shading on radeonsi/li
 liGL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi/li
 liGL_ARB_texture_cube_map_array on radeonsi/li
-liGL_ARB_texture_gather on radeonsi/li
+liGL_ARB_texture_gather on radeonsi, r600/li
 liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, 
softpipe/li
 liGL_ARB_texture_query_lod on radeonsi/li
 liGL_ARB_viewport_array on nvc0/li
diff --git a/src/gallium/drivers/r600/r600_pipe.c 
b/src/gallium/drivers/r600/r600_pipe.c
index ca6399f..d967f0f 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -303,6 +303,9 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
case PIPE_CAP_CUBE_MAP_ARRAY:
case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
+   case PIPE_CAP_TEXTURE_GATHER_SM5:
+   return family = CHIP_CEDAR ? 1 : 0;
+   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
return family = CHIP_CEDAR ? 1 : 0;
 
/* Unsupported features. */
@@ -312,8 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, enum 
pipe_cap param)
case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
case PIPE_CAP_VERTEX_COLOR_CLAMPED:
case PIPE_CAP_USER_VERTEX_BUFFERS:
-   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
-   case PIPE_CAP_TEXTURE_GATHER_SM5:
case PIPE_CAP_TEXTURE_QUERY_LOD:
case PIPE_CAP_SAMPLE_SHADING:
case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
diff --git a/src/gallium/drivers/r600/r600_shader.c 
b/src/gallium/drivers/r600/r600_shader.c
index 6952e3c..db928f3 100644
--- a/src/gallium/drivers/r600/r600_shader.c
+++ b/src/gallium/drivers/r600/r600_shader.c
@@ -4477,7 +4477,8 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 
if (inst-Instruction.Opcode == TGSI_OPCODE_TEX2 ||
inst-Instruction.Opcode == TGSI_OPCODE_TXB2 ||
-   inst-Instruction.Opcode == TGSI_OPCODE_TXL2)
+   inst-Instruction.Opcode == TGSI_OPCODE_TXL2 ||
+   inst-Instruction.Opcode == TGSI_OPCODE_TG4)
sampler_src_reg = 2;
 
src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
@@ -5079,6 +5080,13 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
case FETCH_OP_SAMPLE_G:
opcode = FETCH_OP_SAMPLE_C_G;
break;
+   /* Texture gather variants */
+   case FETCH_OP_GATHER4:
+   tex.op = FETCH_OP_GATHER4_C;
+   break;
+   case FETCH_OP_GATHER4_O:
+   tex.op = FETCH_OP_GATHER4_C_O;
+   break;
}
}
 
@@ -5089,9 +5097,21 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
tex.resource_id = tex.sampler_id + R600_MAX_CONST_BUFFERS;
tex.src_gpr = src_gpr;
tex.dst_gpr = ctx-file_offset[inst-Dst[0].Register.File] + 
inst-Dst[0].Register.Index;
-   tex.dst_sel_x = (inst-Dst[0].Register.WriteMask  1) ? 0 : 7;
-   tex.dst_sel_y = (inst-Dst[0].Register.WriteMask  2) ? 1 : 7;
-   tex.dst_sel_z = (inst-Dst[0].Register.WriteMask  4) ? 2 : 7;
+
+   if (inst-Instruction.Opcode == TGSI_OPCODE_TG4) {
+   int8_t 

Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Ilia Mirkin
On Mon, Jul 14, 2014 at 2:20 PM, Glenn Kennard glenn.kenn...@gmail.com wrote:
 Only supported on evergreen and later. Limited to
 single component textures as the hardware GATHER4
 instruction ignores texture swizzles.

 Piglit quick run passes on radeon 6670 with all
 applicable textureGather tests, no regressions.

 Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
 ---
  docs/GL3.txt   |  2 +-
  docs/relnotes/10.3.html|  2 +-
  src/gallium/drivers/r600/r600_pipe.c   |  5 ++--
  src/gallium/drivers/r600/r600_shader.c | 47 
 +-
  4 files changed, 46 insertions(+), 10 deletions(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index a2f438b..20e57b0 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -118,7 +118,7 @@ GL 4.0:
GL_ARB_tessellation_shader   started (Fabian)
GL_ARB_texture_buffer_object_rgb32   DONE (i965, nvc0, 
 r600, radeonsi, softpipe)
GL_ARB_texture_cube_map_arrayDONE (i965, nv50, 
 nvc0, r600, radeonsi, softpipe)
 -  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi)
 +  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi, r600)
GL_ARB_texture_query_lod DONE (i965, nv50, 
 nvc0, radeonsi)
GL_ARB_transform_feedback2   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
GL_ARB_transform_feedback3   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
 diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
 index 2e718fc..1c0fab6 100644
 --- a/docs/relnotes/10.3.html
 +++ b/docs/relnotes/10.3.html
 @@ -49,7 +49,7 @@ Note: some of the new features are only available with 
 certain drivers.
  liGL_ARB_sample_shading on radeonsi/li
  liGL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi/li
  liGL_ARB_texture_cube_map_array on radeonsi/li
 -liGL_ARB_texture_gather on radeonsi/li
 +liGL_ARB_texture_gather on radeonsi, r600/li
  liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, 
 softpipe/li
  liGL_ARB_texture_query_lod on radeonsi/li
  liGL_ARB_viewport_array on nvc0/li
 diff --git a/src/gallium/drivers/r600/r600_pipe.c 
 b/src/gallium/drivers/r600/r600_pipe.c
 index ca6399f..d967f0f 100644
 --- a/src/gallium/drivers/r600/r600_pipe.c
 +++ b/src/gallium/drivers/r600/r600_pipe.c
 @@ -303,6 +303,9 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
 case PIPE_CAP_CUBE_MAP_ARRAY:
 case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
 +   case PIPE_CAP_TEXTURE_GATHER_SM5:
 +   return family = CHIP_CEDAR ? 1 : 0;

No clue what the hardware supports, and this CAP is actually not used
by the state tracker. However I believe it is meant to imply that you
can support the ARB_gs5 texture gather, including shadow comparisons
and non-constant offsets. This is not yet turned on by core mesa, but
do the textureGather tests still pass if you add
MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5? (You can do a piglit-run
with -t gather.) I'm not 100% sure, but I _think_ it also implies that
you can deal with selecting any component from a 4-component
texture... the ARB_gs5 spec says:

Since this extension requires support for gathering from multi-component
textures, the minimum value of MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB
is increased to 4.

My understanding is that the hardware is meant to be able to support
GL4 one way or aonther, but sounds like more work needs to be done in
order to claim the SM5 compatibility cap...

Also, have you looked at Dave Airlie's impl? Not sure what's been
going on there...
http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-texture-gather
. He seemed to enable 4 components for = CEDAR.

 +   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 return family = CHIP_CEDAR ? 1 : 0;

 /* Unsupported features. */
 @@ -312,8 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
 case PIPE_CAP_VERTEX_COLOR_CLAMPED:
 case PIPE_CAP_USER_VERTEX_BUFFERS:
 -   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 -   case PIPE_CAP_TEXTURE_GATHER_SM5:
 case PIPE_CAP_TEXTURE_QUERY_LOD:
 case PIPE_CAP_SAMPLE_SHADING:
 case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 6952e3c..db928f3 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -4477,7 +4477,8 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)

 if (inst-Instruction.Opcode == TGSI_OPCODE_TEX2 ||
 inst-Instruction.Opcode == TGSI_OPCODE_TXB2 ||
 -   inst-Instruction.Opcode == TGSI_OPCODE_TXL2)
 + 

[Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
configure to change them for me.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 configure.ac | 14 --
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index 4646212..04abcc0 100644
--- a/configure.ac
+++ b/configure.ac
@@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
 if test x$enable_debug = xyes; then
 DEFINES=$DEFINES -DDEBUG
 if test x$GCC = xyes; then
-CFLAGS=$CFLAGS -g -O0
+if ! echo $CFLAGS | grep -q -e '-g'; then
+CFLAGS=$CFLAGS -g
+fi
+if ! echo $CFLAGS | grep -q -e '-O'; then
+CFLAGS=$CFLAGS -O0
+fi
 fi
 if test x$GXX = xyes; then
-CXXFLAGS=$CXXFLAGS -g -O0
+if ! echo $CXXFLAGS | grep -q -e '-g'; then
+CXXFLAGS=$CXXFLAGS -g
+fi
+if ! echo $CXXFLAGS | grep -q -e '-O'; then
+CXXFLAGS=$CXXFLAGS -O0
+fi
 fi
 fi
 
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Kenneth Graunke
On Monday, July 14, 2014 11:49:57 AM Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
 configure to change them for me.
 
 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 ---
  configure.ac | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)
 
 diff --git a/configure.ac b/configure.ac
 index 4646212..04abcc0 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
  if test x$enable_debug = xyes; then
  DEFINES=$DEFINES -DDEBUG
  if test x$GCC = xyes; then
 -CFLAGS=$CFLAGS -g -O0
 +if ! echo $CFLAGS | grep -q -e '-g'; then
 +CFLAGS=$CFLAGS -g
 +fi
 +if ! echo $CFLAGS | grep -q -e '-O'; then
 +CFLAGS=$CFLAGS -O0
 +fi
  fi
  if test x$GXX = xyes; then
 -CXXFLAGS=$CXXFLAGS -g -O0
 +if ! echo $CXXFLAGS | grep -q -e '-g'; then
 +CXXFLAGS=$CXXFLAGS -g
 +fi
 +if ! echo $CXXFLAGS | grep -q -e '-O'; then
 +CXXFLAGS=$CXXFLAGS -O0
 +fi
  fi
  fi
  
 

Seems reasonable to me.

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Marek Olšák
Reviewed-by: Marek Olšák marek.ol...@amd.com

Yes, Dave Airlie's branch seems to support multi-component textures as
well as texture offsets. However, his solution triggers shader
recompilation when the format and/or swizzle is changed even if the
shader doesn't use GATHER4. The recompilation should only be done for
shaders and samplers which use that opcode.

Marek

On Mon, Jul 14, 2014 at 8:20 PM, Glenn Kennard glenn.kenn...@gmail.com wrote:
 Only supported on evergreen and later. Limited to
 single component textures as the hardware GATHER4
 instruction ignores texture swizzles.

 Piglit quick run passes on radeon 6670 with all
 applicable textureGather tests, no regressions.

 Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
 ---
  docs/GL3.txt   |  2 +-
  docs/relnotes/10.3.html|  2 +-
  src/gallium/drivers/r600/r600_pipe.c   |  5 ++--
  src/gallium/drivers/r600/r600_shader.c | 47 
 +-
  4 files changed, 46 insertions(+), 10 deletions(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index a2f438b..20e57b0 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -118,7 +118,7 @@ GL 4.0:
GL_ARB_tessellation_shader   started (Fabian)
GL_ARB_texture_buffer_object_rgb32   DONE (i965, nvc0, 
 r600, radeonsi, softpipe)
GL_ARB_texture_cube_map_arrayDONE (i965, nv50, 
 nvc0, r600, radeonsi, softpipe)
 -  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi)
 +  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi, r600)
GL_ARB_texture_query_lod DONE (i965, nv50, 
 nvc0, radeonsi)
GL_ARB_transform_feedback2   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
GL_ARB_transform_feedback3   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
 diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
 index 2e718fc..1c0fab6 100644
 --- a/docs/relnotes/10.3.html
 +++ b/docs/relnotes/10.3.html
 @@ -49,7 +49,7 @@ Note: some of the new features are only available with 
 certain drivers.
  liGL_ARB_sample_shading on radeonsi/li
  liGL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi/li
  liGL_ARB_texture_cube_map_array on radeonsi/li
 -liGL_ARB_texture_gather on radeonsi/li
 +liGL_ARB_texture_gather on radeonsi, r600/li
  liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, 
 softpipe/li
  liGL_ARB_texture_query_lod on radeonsi/li
  liGL_ARB_viewport_array on nvc0/li
 diff --git a/src/gallium/drivers/r600/r600_pipe.c 
 b/src/gallium/drivers/r600/r600_pipe.c
 index ca6399f..d967f0f 100644
 --- a/src/gallium/drivers/r600/r600_pipe.c
 +++ b/src/gallium/drivers/r600/r600_pipe.c
 @@ -303,6 +303,9 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
 case PIPE_CAP_CUBE_MAP_ARRAY:
 case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
 +   case PIPE_CAP_TEXTURE_GATHER_SM5:
 +   return family = CHIP_CEDAR ? 1 : 0;
 +   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 return family = CHIP_CEDAR ? 1 : 0;

 /* Unsupported features. */
 @@ -312,8 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
 case PIPE_CAP_VERTEX_COLOR_CLAMPED:
 case PIPE_CAP_USER_VERTEX_BUFFERS:
 -   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 -   case PIPE_CAP_TEXTURE_GATHER_SM5:
 case PIPE_CAP_TEXTURE_QUERY_LOD:
 case PIPE_CAP_SAMPLE_SHADING:
 case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 6952e3c..db928f3 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -4477,7 +4477,8 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)

 if (inst-Instruction.Opcode == TGSI_OPCODE_TEX2 ||
 inst-Instruction.Opcode == TGSI_OPCODE_TXB2 ||
 -   inst-Instruction.Opcode == TGSI_OPCODE_TXL2)
 +   inst-Instruction.Opcode == TGSI_OPCODE_TXL2 ||
 +   inst-Instruction.Opcode == TGSI_OPCODE_TG4)
 sampler_src_reg = 2;

 src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
 @@ -5079,6 +5080,13 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 case FETCH_OP_SAMPLE_G:
 opcode = FETCH_OP_SAMPLE_C_G;
 break;
 +   /* Texture gather variants */
 +   case FETCH_OP_GATHER4:
 +   tex.op = FETCH_OP_GATHER4_C;
 +   break;
 +   case FETCH_OP_GATHER4_O:
 +   tex.op = FETCH_OP_GATHER4_C_O;
 +   break;
 }
 }

 @@ -5089,9 +5097,21 @@ static 

Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Ilia Mirkin
On Mon, Jul 14, 2014 at 3:06 PM, Marek Olšák mar...@gmail.com wrote:
 Reviewed-by: Marek Olšák marek.ol...@amd.com

 Yes, Dave Airlie's branch seems to support multi-component textures as
 well as texture offsets. However, his solution triggers shader
 recompilation when the format and/or swizzle is changed even if the
 shader doesn't use GATHER4. The recompilation should only be done for
 shaders and samplers which use that opcode.

My objection was not to the implementation but to the enabling of
PIPE_CAP_TEXTURE_GATHER_SM5 which seems like it can't be true given
that the max number of components in the texture is only allowed to be
1, and the various other things are likely not actually supported
(although I'm not sure).

  -ilia


 Marek

 On Mon, Jul 14, 2014 at 8:20 PM, Glenn Kennard glenn.kenn...@gmail.com 
 wrote:
 Only supported on evergreen and later. Limited to
 single component textures as the hardware GATHER4
 instruction ignores texture swizzles.

 Piglit quick run passes on radeon 6670 with all
 applicable textureGather tests, no regressions.

 Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
 ---
  docs/GL3.txt   |  2 +-
  docs/relnotes/10.3.html|  2 +-
  src/gallium/drivers/r600/r600_pipe.c   |  5 ++--
  src/gallium/drivers/r600/r600_shader.c | 47 
 +-
  4 files changed, 46 insertions(+), 10 deletions(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index a2f438b..20e57b0 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -118,7 +118,7 @@ GL 4.0:
GL_ARB_tessellation_shader   started (Fabian)
GL_ARB_texture_buffer_object_rgb32   DONE (i965, nvc0, 
 r600, radeonsi, softpipe)
GL_ARB_texture_cube_map_arrayDONE (i965, nv50, 
 nvc0, r600, radeonsi, softpipe)
 -  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi)
 +  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi, r600)
GL_ARB_texture_query_lod DONE (i965, nv50, 
 nvc0, radeonsi)
GL_ARB_transform_feedback2   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
GL_ARB_transform_feedback3   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
 diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
 index 2e718fc..1c0fab6 100644
 --- a/docs/relnotes/10.3.html
 +++ b/docs/relnotes/10.3.html
 @@ -49,7 +49,7 @@ Note: some of the new features are only available with 
 certain drivers.
  liGL_ARB_sample_shading on radeonsi/li
  liGL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi/li
  liGL_ARB_texture_cube_map_array on radeonsi/li
 -liGL_ARB_texture_gather on radeonsi/li
 +liGL_ARB_texture_gather on radeonsi, r600/li
  liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, 
 softpipe/li
  liGL_ARB_texture_query_lod on radeonsi/li
  liGL_ARB_viewport_array on nvc0/li
 diff --git a/src/gallium/drivers/r600/r600_pipe.c 
 b/src/gallium/drivers/r600/r600_pipe.c
 index ca6399f..d967f0f 100644
 --- a/src/gallium/drivers/r600/r600_pipe.c
 +++ b/src/gallium/drivers/r600/r600_pipe.c
 @@ -303,6 +303,9 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
 case PIPE_CAP_CUBE_MAP_ARRAY:
 case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
 +   case PIPE_CAP_TEXTURE_GATHER_SM5:
 +   return family = CHIP_CEDAR ? 1 : 0;
 +   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 return family = CHIP_CEDAR ? 1 : 0;

 /* Unsupported features. */
 @@ -312,8 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
 case PIPE_CAP_VERTEX_COLOR_CLAMPED:
 case PIPE_CAP_USER_VERTEX_BUFFERS:
 -   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 -   case PIPE_CAP_TEXTURE_GATHER_SM5:
 case PIPE_CAP_TEXTURE_QUERY_LOD:
 case PIPE_CAP_SAMPLE_SHADING:
 case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 6952e3c..db928f3 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -4477,7 +4477,8 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)

 if (inst-Instruction.Opcode == TGSI_OPCODE_TEX2 ||
 inst-Instruction.Opcode == TGSI_OPCODE_TXB2 ||
 -   inst-Instruction.Opcode == TGSI_OPCODE_TXL2)
 +   inst-Instruction.Opcode == TGSI_OPCODE_TXL2 ||
 +   inst-Instruction.Opcode == TGSI_OPCODE_TG4)
 sampler_src_reg = 2;

 src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
 @@ -5079,6 +5080,13 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)
 case FETCH_OP_SAMPLE_G:
 opcode = FETCH_OP_SAMPLE_C_G;
   

Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Marek Olšák
Hm, indeed. PIPE_CAP_TEXTURE_GATHER_SM5 should be 0.

Marek

On Mon, Jul 14, 2014 at 9:10 PM, Ilia Mirkin imir...@alum.mit.edu wrote:
 On Mon, Jul 14, 2014 at 3:06 PM, Marek Olšák mar...@gmail.com wrote:
 Reviewed-by: Marek Olšák marek.ol...@amd.com

 Yes, Dave Airlie's branch seems to support multi-component textures as
 well as texture offsets. However, his solution triggers shader
 recompilation when the format and/or swizzle is changed even if the
 shader doesn't use GATHER4. The recompilation should only be done for
 shaders and samplers which use that opcode.

 My objection was not to the implementation but to the enabling of
 PIPE_CAP_TEXTURE_GATHER_SM5 which seems like it can't be true given
 that the max number of components in the texture is only allowed to be
 1, and the various other things are likely not actually supported
 (although I'm not sure).

   -ilia


 Marek

 On Mon, Jul 14, 2014 at 8:20 PM, Glenn Kennard glenn.kenn...@gmail.com 
 wrote:
 Only supported on evergreen and later. Limited to
 single component textures as the hardware GATHER4
 instruction ignores texture swizzles.

 Piglit quick run passes on radeon 6670 with all
 applicable textureGather tests, no regressions.

 Signed-off-by: Glenn Kennard glenn.kenn...@gmail.com
 ---
  docs/GL3.txt   |  2 +-
  docs/relnotes/10.3.html|  2 +-
  src/gallium/drivers/r600/r600_pipe.c   |  5 ++--
  src/gallium/drivers/r600/r600_shader.c | 47 
 +-
  4 files changed, 46 insertions(+), 10 deletions(-)

 diff --git a/docs/GL3.txt b/docs/GL3.txt
 index a2f438b..20e57b0 100644
 --- a/docs/GL3.txt
 +++ b/docs/GL3.txt
 @@ -118,7 +118,7 @@ GL 4.0:
GL_ARB_tessellation_shader   started (Fabian)
GL_ARB_texture_buffer_object_rgb32   DONE (i965, nvc0, 
 r600, radeonsi, softpipe)
GL_ARB_texture_cube_map_arrayDONE (i965, nv50, 
 nvc0, r600, radeonsi, softpipe)
 -  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi)
 +  GL_ARB_texture_gatherDONE (i965, nv50, 
 nvc0, radeonsi, r600)
GL_ARB_texture_query_lod DONE (i965, nv50, 
 nvc0, radeonsi)
GL_ARB_transform_feedback2   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
GL_ARB_transform_feedback3   DONE (i965, nv50, 
 nvc0, r600, radeonsi)
 diff --git a/docs/relnotes/10.3.html b/docs/relnotes/10.3.html
 index 2e718fc..1c0fab6 100644
 --- a/docs/relnotes/10.3.html
 +++ b/docs/relnotes/10.3.html
 @@ -49,7 +49,7 @@ Note: some of the new features are only available with 
 certain drivers.
  liGL_ARB_sample_shading on radeonsi/li
  liGL_ARB_stencil_texturing on nv50, nvc0, r600, and radeonsi/li
  liGL_ARB_texture_cube_map_array on radeonsi/li
 -liGL_ARB_texture_gather on radeonsi/li
 +liGL_ARB_texture_gather on radeonsi, r600/li
  liGL_ARB_texture_query_levels on nv50, nvc0, llvmpipe, r600, radeonsi, 
 softpipe/li
  liGL_ARB_texture_query_lod on radeonsi/li
  liGL_ARB_viewport_array on nvc0/li
 diff --git a/src/gallium/drivers/r600/r600_pipe.c 
 b/src/gallium/drivers/r600/r600_pipe.c
 index ca6399f..d967f0f 100644
 --- a/src/gallium/drivers/r600/r600_pipe.c
 +++ b/src/gallium/drivers/r600/r600_pipe.c
 @@ -303,6 +303,9 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
 case PIPE_CAP_CUBE_MAP_ARRAY:
 case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
 +   case PIPE_CAP_TEXTURE_GATHER_SM5:
 +   return family = CHIP_CEDAR ? 1 : 0;
 +   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 return family = CHIP_CEDAR ? 1 : 0;

 /* Unsupported features. */
 @@ -312,8 +315,6 @@ static int r600_get_param(struct pipe_screen* pscreen, 
 enum pipe_cap param)
 case PIPE_CAP_FRAGMENT_COLOR_CLAMPED:
 case PIPE_CAP_VERTEX_COLOR_CLAMPED:
 case PIPE_CAP_USER_VERTEX_BUFFERS:
 -   case PIPE_CAP_MAX_TEXTURE_GATHER_COMPONENTS:
 -   case PIPE_CAP_TEXTURE_GATHER_SM5:
 case PIPE_CAP_TEXTURE_QUERY_LOD:
 case PIPE_CAP_SAMPLE_SHADING:
 case PIPE_CAP_TEXTURE_GATHER_OFFSETS:
 diff --git a/src/gallium/drivers/r600/r600_shader.c 
 b/src/gallium/drivers/r600/r600_shader.c
 index 6952e3c..db928f3 100644
 --- a/src/gallium/drivers/r600/r600_shader.c
 +++ b/src/gallium/drivers/r600/r600_shader.c
 @@ -4477,7 +4477,8 @@ static int tgsi_tex(struct r600_shader_ctx *ctx)

 if (inst-Instruction.Opcode == TGSI_OPCODE_TEX2 ||
 inst-Instruction.Opcode == TGSI_OPCODE_TXB2 ||
 -   inst-Instruction.Opcode == TGSI_OPCODE_TXL2)
 +   inst-Instruction.Opcode == TGSI_OPCODE_TXL2 ||
 +   inst-Instruction.Opcode == TGSI_OPCODE_TG4)
 sampler_src_reg = 2;

 src_gpr = tgsi_tex_get_src_gpr(ctx, 0);
 @@ -5079,6 +5080,13 @@ static int 

Re: [Mesa-dev] [PATCH 1/2] radeonsi: switch descriptors to i32 vectors

2014-07-14 Thread Marek Olšák
Sorry Tom, the patch does break the driver with LLVM 3.4.2. It looks
like LLVM wasn't installed properly.

I have reverted it.

Marek

On Fri, Jul 11, 2014 at 4:20 PM, Tom Stellard t...@stellard.net wrote:
 On Fri, Jul 11, 2014 at 01:00:34AM +0200, Marek Olšák wrote:
 I have just tested it and it works with LLVM 3.4.2.


 Ok, thanks.

 Both patches are:

 Reviewed-by: Tom Stellard thomas.stell...@amd.com

 Marek

 On Thu, Jul 10, 2014 at 5:11 PM, Tom Stellard t...@stellard.net wrote:
  On Tue, Jul 08, 2014 at 01:37:02AM +0200, Marek Olšák wrote:
  From: Marek Olšák marek.ol...@amd.com
 
  This is a follow-up to the commit which adds texture fetches with offsets.
 
  Will this still work with LLVM 3.4.2 ?
 
  -Tom
 
  ---
   src/gallium/drivers/radeonsi/si_shader.c | 29 
  -
   1 file changed, 16 insertions(+), 13 deletions(-)
 
  diff --git a/src/gallium/drivers/radeonsi/si_shader.c 
  b/src/gallium/drivers/radeonsi/si_shader.c
  index 3dd6ad1..a28d682 100644
  --- a/src/gallium/drivers/radeonsi/si_shader.c
  +++ b/src/gallium/drivers/radeonsi/si_shader.c
  @@ -1574,7 +1574,7 @@ static void tex_fetch_args(
LLVMTypeRef i8 = LLVMInt8TypeInContext(gallivm-context);
LLVMTypeRef v16i8 = LLVMVectorType(i8, 16);
 
  - /* Truncate v32i8 to v16i8. */
  + /* Bitcast and truncate v8i32 to v16i8. */
LLVMValueRef res = si_shader_ctx-resources[sampler_index];
res = LLVMBuildBitCast(gallivm-builder, res, v2i128, );
res = LLVMBuildExtractElement(gallivm-builder, res, 
  bld_base-uint_bld.zero, );
  @@ -2305,12 +2305,18 @@ static void create_meta_data(struct 
  si_shader_context *si_shader_ctx)
si_shader_ctx-const_md = LLVMMDNodeInContext(gallivm-context, 
  args, 3);
   }
 
  +static LLVMTypeRef const_array(LLVMTypeRef elem_type, int num_elements)
  +{
  + return LLVMPointerType(LLVMArrayType(elem_type, num_elements),
  +CONST_ADDR_SPACE);
  +}
  +
   static void create_function(struct si_shader_context *si_shader_ctx)
   {
struct lp_build_tgsi_context *bld_base = 
  si_shader_ctx-radeon_bld.soa.bld_base;
struct gallivm_state *gallivm = bld_base-base.gallivm;
struct si_pipe_shader *shader = si_shader_ctx-shader;
  - LLVMTypeRef params[SI_NUM_PARAMS], f32, i8, i32, v2i32, v3i32;
  + LLVMTypeRef params[SI_NUM_PARAMS], f32, i8, i32, v2i32, v3i32, 
  v16i8, v4i32, v8i32;
unsigned i, last_sgpr, num_params;
 
i8 = LLVMInt8TypeInContext(gallivm-context);
  @@ -2318,21 +2324,18 @@ static void create_function(struct 
  si_shader_context *si_shader_ctx)
f32 = LLVMFloatTypeInContext(gallivm-context);
v2i32 = LLVMVectorType(i32, 2);
v3i32 = LLVMVectorType(i32, 3);
  + v4i32 = LLVMVectorType(i32, 4);
  + v8i32 = LLVMVectorType(i32, 8);
  + v16i8 = LLVMVectorType(i8, 16);
 
  - params[SI_PARAM_CONST] = LLVMPointerType(
  - LLVMArrayType(LLVMVectorType(i8, 16), NUM_CONST_BUFFERS), 
  CONST_ADDR_SPACE);
  - params[SI_PARAM_RW_BUFFERS] = params[SI_PARAM_CONST];
  -
  - /* We assume at most 16 textures per program at the moment.
  -  * This need probably need to be changed to support bindless 
  textures */
  - params[SI_PARAM_SAMPLER] = LLVMPointerType(
  - LLVMArrayType(LLVMVectorType(i8, 16), NUM_SAMPLER_STATES), 
  CONST_ADDR_SPACE);
  - params[SI_PARAM_RESOURCE] = LLVMPointerType(
  - LLVMArrayType(LLVMVectorType(i8, 32), NUM_SAMPLER_VIEWS), 
  CONST_ADDR_SPACE);
  + params[SI_PARAM_CONST] = const_array(v16i8, NUM_CONST_BUFFERS);
  + params[SI_PARAM_RW_BUFFERS] = const_array(v16i8, 6); /* XXX 
  hardcoded */
  + params[SI_PARAM_SAMPLER] = const_array(v4i32, NUM_SAMPLER_STATES);
  + params[SI_PARAM_RESOURCE] = const_array(v8i32, NUM_SAMPLER_VIEWS);
 
switch (si_shader_ctx-type) {
case TGSI_PROCESSOR_VERTEX:
  - params[SI_PARAM_VERTEX_BUFFER] = params[SI_PARAM_CONST];
  + params[SI_PARAM_VERTEX_BUFFER] = const_array(v16i8, 16); /* 
  XXX hardcoded */
params[SI_PARAM_START_INSTANCE] = i32;
num_params = SI_PARAM_START_INSTANCE+1;
if (shader-key.vs.as_es) {
  --
  1.9.1
 
  ___
  mesa-dev mailing list
  mesa-dev@lists.freedesktop.org
  http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Glenn Kennard
On Mon, 14 Jul 2014 20:33:08 +0200, Ilia Mirkin imir...@alum.mit.edu  
wrote:


On Mon, Jul 14, 2014 at 2:20 PM, Glenn Kennard glenn.kenn...@gmail.com  
wrote:
diff --git a/src/gallium/drivers/r600/r600_pipe.c  
b/src/gallium/drivers/r600/r600_pipe.c

index ca6399f..d967f0f 100644
--- a/src/gallium/drivers/r600/r600_pipe.c
+++ b/src/gallium/drivers/r600/r600_pipe.c
@@ -303,6 +303,9 @@ static int r600_get_param(struct pipe_screen*  
pscreen, enum pipe_cap param)

case PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE:
case PIPE_CAP_CUBE_MAP_ARRAY:
case PIPE_CAP_TGSI_VS_LAYER_VIEWPORT:
+   case PIPE_CAP_TEXTURE_GATHER_SM5:
+   return family = CHIP_CEDAR ? 1 : 0;


No clue what the hardware supports, and this CAP is actually not used
by the state tracker. However I believe it is meant to imply that you
can support the ARB_gs5 texture gather, including shadow comparisons
and non-constant offsets. This is not yet turned on by core mesa, but
do the textureGather tests still pass if you add
MESA_EXTENSION_OVERRIDE=GL_ARB_gpu_shader5? (You can do a piglit-run
with -t gather.) I'm not 100% sure, but I _think_ it also implies that
you can deal with selecting any component from a 4-component
texture... the ARB_gs5 spec says:

Since this extension requires support for gathering from  
multi-component
textures, the minimum value of  
MAX_PROGRAM_TEXTURE_GATHER_COMPONENTS_ARB

is increased to 4.

My understanding is that the hardware is meant to be able to support
GL4 one way or aonther, but sounds like more work needs to be done in
order to claim the SM5 compatibility cap...


Good catch, i missed that. It should return 0 for  
PIPE_CAP_TEXTURE_GATHER_SM5, and 1 for number
of components, which lets it expose the ARB_texture_gather extension for  
now, leaving SM5 support for
a later patch - the hardware apparently doesn't support texture sampler  
swizzling natively for the GATHER4
instruction so some logic is required to compile shader variants whenever  
the swizzle changes.




Also, have you looked at Dave Airlie's impl? Not sure what's been
going on there...
http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-texture-gather
. He seemed to enable 4 components for = CEDAR.



No, didn't pop up when i searched the archives for any prior work on this.  
Interesting! It looks like the GL_ARB_texture_gather portion in that  
branch is almost exactly equivalent to my patch.


It also has some of the additional parts needed for GL_ARB_gpu_shader5  
such as non-constant offsets, but i don't see where it would trigger  
shader recompiles if texture sampler swizzle changes? Does it pass piglit?



David, any opinions on how to move forward with this feature?
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Matt Turner
On Mon, Jul 14, 2014 at 11:49 AM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
 configure to change them for me.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 ---
  configure.ac | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 4646212..04abcc0 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
  if test x$enable_debug = xyes; then
  DEFINES=$DEFINES -DDEBUG
  if test x$GCC = xyes; then
 -CFLAGS=$CFLAGS -g -O0
 +if ! echo $CFLAGS | grep -q -e '-g'; then
 +CFLAGS=$CFLAGS -g
 +fi
 +if ! echo $CFLAGS | grep -q -e '-O'; then
 +CFLAGS=$CFLAGS -O0
 +fi
  fi
  if test x$GXX = xyes; then
 -CXXFLAGS=$CXXFLAGS -g -O0
 +if ! echo $CXXFLAGS | grep -q -e '-g'; then
 +CXXFLAGS=$CXXFLAGS -g
 +fi
 +if ! echo $CXXFLAGS | grep -q -e '-O'; then
 +CXXFLAGS=$CXXFLAGS -O0
 +fi
  fi
  fi

 --
 1.8.1.4

This looks like strictly an improvement, so my feedback shouldn't hold
this up, but maybe it's time to fight about whether --enable-debug
should add -O0 at all. :)

Reviewed-by: Matt Turner matts...@gmail.com

Autotools defaults are -g -O2. Some people want debug builds to use
-g, others -ggdb*. Some people want -O2 (optimize, but with assertions
enabled), others -O0, others -Og. It'd be nice to not have
--enable-debug turn on one combination of these.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] radeon/llvm: Fix LLVM diagnostic error reporting

2014-07-14 Thread Tom Stellard
We were trying to print the error message after disposing the
message object.
---
 src/gallium/drivers/radeon/radeon_llvm_emit.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
b/src/gallium/drivers/radeon/radeon_llvm_emit.c
index 6a394b2..1b17dd4 100644
--- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
+++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
@@ -83,16 +83,13 @@ static LLVMTargetRef get_r600_target() {
 #if HAVE_LLVM = 0x0305
 
 static void radeonDiagnosticHandler(LLVMDiagnosticInfoRef di, void *context) {
-   unsigned int *diagnosticflag;
-   char *diaginfo_message;
-
-   diaginfo_message = LLVMGetDiagInfoDescription(di);
-   LLVMDisposeMessage(diaginfo_message);
-
-   diagnosticflag = (unsigned int *)context;
if (LLVMGetDiagInfoSeverity(di) == LLVMDSError) {
+   unsigned int *diagnosticflag = (unsigned int *)context;
+   char *diaginfo_message = LLVMGetDiagInfoDescription(di);
+
*diagnosticflag = 1;
fprintf(stderr,LLVM triggered Diagnostic Handler: %s\n, 
diaginfo_message);
+   LLVMDisposeMessage(diaginfo_message);
}
 }
 
-- 
1.8.1.5

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Ilia Mirkin
On Mon, Jul 14, 2014 at 4:41 PM, Matt Turner matts...@gmail.com wrote:
 On Mon, Jul 14, 2014 at 11:49 AM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
 configure to change them for me.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 ---
  configure.ac | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 4646212..04abcc0 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
  if test x$enable_debug = xyes; then
  DEFINES=$DEFINES -DDEBUG
  if test x$GCC = xyes; then
 -CFLAGS=$CFLAGS -g -O0
 +if ! echo $CFLAGS | grep -q -e '-g'; then
 +CFLAGS=$CFLAGS -g
 +fi
 +if ! echo $CFLAGS | grep -q -e '-O'; then
 +CFLAGS=$CFLAGS -O0
 +fi
  fi
  if test x$GXX = xyes; then
 -CXXFLAGS=$CXXFLAGS -g -O0
 +if ! echo $CXXFLAGS | grep -q -e '-g'; then
 +CXXFLAGS=$CXXFLAGS -g
 +fi
 +if ! echo $CXXFLAGS | grep -q -e '-O'; then
 +CXXFLAGS=$CXXFLAGS -O0
 +fi
  fi
  fi

 --
 1.8.1.4

 This looks like strictly an improvement, so my feedback shouldn't hold
 this up, but maybe it's time to fight about whether --enable-debug
 should add -O0 at all. :)

 Reviewed-by: Matt Turner matts...@gmail.com

 Autotools defaults are -g -O2. Some people want debug builds to use
 -g, others -ggdb*. Some people want -O2 (optimize, but with assertions
 enabled), others -O0, others -Og. It'd be nice to not have
 --enable-debug turn on one combination of these.

IIRC I've only seen negative comments on this rather than
defend/praise. Does anyone actually _like_ this feature? Personally, I
find it incredibly counter-intuitive.

  -ilia
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Marek Olšák
You can also just add -DDEBUG by yourself since you override the flags
anyway (me too).

Marek

On Mon, Jul 14, 2014 at 8:49 PM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
 configure to change them for me.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 ---
  configure.ac | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 4646212..04abcc0 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
  if test x$enable_debug = xyes; then
  DEFINES=$DEFINES -DDEBUG
  if test x$GCC = xyes; then
 -CFLAGS=$CFLAGS -g -O0
 +if ! echo $CFLAGS | grep -q -e '-g'; then
 +CFLAGS=$CFLAGS -g
 +fi
 +if ! echo $CFLAGS | grep -q -e '-O'; then
 +CFLAGS=$CFLAGS -O0
 +fi
  fi
  if test x$GXX = xyes; then
 -CXXFLAGS=$CXXFLAGS -g -O0
 +if ! echo $CXXFLAGS | grep -q -e '-g'; then
 +CXXFLAGS=$CXXFLAGS -g
 +fi
 +if ! echo $CXXFLAGS | grep -q -e '-O'; then
 +CXXFLAGS=$CXXFLAGS -O0
 +fi
  fi
  fi

 --
 1.8.1.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Dave Airlie
 Also, have you looked at Dave Airlie's impl? Not sure what's been
 going on there...
 http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-texture-gather
 . He seemed to enable 4 components for = CEDAR.


 No, didn't pop up when i searched the archives for any prior work on this.
 Interesting! It looks like the GL_ARB_texture_gather portion in that branch
 is almost exactly equivalent to my patch.

 It also has some of the additional parts needed for GL_ARB_gpu_shader5 such
 as non-constant offsets, but i don't see where it would trigger shader
 recompiles if texture sampler swizzle changes? Does it pass piglit?


 David, any opinions on how to move forward with this feature?

I didn't like my implementation sufficiently to push it forward, but I
think the main thing
was the extra key changes, I meant to get Marek's input on that code,
but then it fell
off my radar. So get input from Marek, repost and merge :)

Dave.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] radeon/llvm: Fix LLVM diagnostic error reporting

2014-07-14 Thread Aaron Watry
Tested-by and Reviewed-by: Aaron Watry awa...@gmail.com

On Mon, Jul 14, 2014 at 3:51 PM, Tom Stellard thomas.stell...@amd.com wrote:
 We were trying to print the error message after disposing the
 message object.
 ---
  src/gallium/drivers/radeon/radeon_llvm_emit.c | 11 ---
  1 file changed, 4 insertions(+), 7 deletions(-)

 diff --git a/src/gallium/drivers/radeon/radeon_llvm_emit.c 
 b/src/gallium/drivers/radeon/radeon_llvm_emit.c
 index 6a394b2..1b17dd4 100644
 --- a/src/gallium/drivers/radeon/radeon_llvm_emit.c
 +++ b/src/gallium/drivers/radeon/radeon_llvm_emit.c
 @@ -83,16 +83,13 @@ static LLVMTargetRef get_r600_target() {
  #if HAVE_LLVM = 0x0305

  static void radeonDiagnosticHandler(LLVMDiagnosticInfoRef di, void *context) 
 {
 -   unsigned int *diagnosticflag;
 -   char *diaginfo_message;
 -
 -   diaginfo_message = LLVMGetDiagInfoDescription(di);
 -   LLVMDisposeMessage(diaginfo_message);
 -
 -   diagnosticflag = (unsigned int *)context;
 if (LLVMGetDiagInfoSeverity(di) == LLVMDSError) {
 +   unsigned int *diagnosticflag = (unsigned int *)context;
 +   char *diaginfo_message = LLVMGetDiagInfoDescription(di);
 +
 *diagnosticflag = 1;
 fprintf(stderr,LLVM triggered Diagnostic Handler: %s\n, 
 diaginfo_message);
 +   LLVMDisposeMessage(diaginfo_message);
 }
  }

 --
 1.8.1.5

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965: Initialize new chunks of realloc'd memory.

2014-07-14 Thread Kenneth Graunke
On Wednesday, July 09, 2014 12:41:25 PM Matt Turner wrote:
 Otherwise we'd compare uninitialized pointers with NULL and dereference,
 leading to crashes.
 ---
  src/mesa/drivers/dri/i965/intel_asm_annotation.c | 4 
  1 file changed, 4 insertions(+)
 
 diff --git a/src/mesa/drivers/dri/i965/intel_asm_annotation.c 
b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
 index 4717baf..6a51d89 100644
 --- a/src/mesa/drivers/dri/i965/intel_asm_annotation.c
 +++ b/src/mesa/drivers/dri/i965/intel_asm_annotation.c
 @@ -96,11 +96,15 @@ void annotate(struct brw_context *brw,
struct backend_instruction *inst, unsigned offset)
  {
 if (annotation-ann_size = annotation-ann_count) {
 +  int old_size = annotation-ann_size;
annotation-ann_size = MAX2(1024, annotation-ann_size * 2);
annotation-ann = reralloc(annotation-mem_ctx, annotation-ann,
   struct annotation, annotation-ann_size);
if (!annotation-ann)
   return;
 +
 +  memset(annotation-ann + old_size, 0,
 + (annotation-ann_size - old_size) * sizeof(struct 
annotation));
 }
  
 struct annotation *ann = annotation-ann[annotation-ann_count++];
 

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Emil Velikov
On 14/07/14 19:49, Ian Romanick wrote:
 From: Ian Romanick ian.d.roman...@intel.com
 
 I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
 configure to change them for me.
 
The topic is a rather messy and whatever we do there will always be a case
where this breaks someone's setup/workflow.

In order to minimize the endless bikeshed can we just drop the switch
altogether, and leave it to the builder to set these ? This way people will be
able to set -O, -g and/or -DDEBUG if they are unhappy with their distro's
defaults.

Thanks,
Emil

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 ---
  configure.ac | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)
 
 diff --git a/configure.ac b/configure.ac
 index 4646212..04abcc0 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
  if test x$enable_debug = xyes; then
  DEFINES=$DEFINES -DDEBUG
  if test x$GCC = xyes; then
 -CFLAGS=$CFLAGS -g -O0
 +if ! echo $CFLAGS | grep -q -e '-g'; then
 +CFLAGS=$CFLAGS -g
 +fi
 +if ! echo $CFLAGS | grep -q -e '-O'; then
 +CFLAGS=$CFLAGS -O0
 +fi
  fi
  if test x$GXX = xyes; then
 -CXXFLAGS=$CXXFLAGS -g -O0
 +if ! echo $CXXFLAGS | grep -q -e '-g'; then
 +CXXFLAGS=$CXXFLAGS -g
 +fi
 +if ! echo $CXXFLAGS | grep -q -e '-O'; then
 +CXXFLAGS=$CXXFLAGS -O0
 +fi
  fi
  fi
  
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] r600g: Implement GL_ARB_texture_gather

2014-07-14 Thread Marek Olšák
My advice is: The shader being compiled should be analyzed before
compilation and an array of bool flags (or one uint16_t for all flags)
should be produced saying which sampler units use TG4. Then, you only
need to set the swizzle in the key for the i-th sampler if the i-th
sampler is really used by TG4.

The shader analysis can be done in tgsi_scan_shader, which should be
pretty trivial.

However, this won't work with GL4 sampler arrays and indirect
addressing of those arrays, so well, if I were you, I would
probably ignore the ZERO and ONE swizzle cases and just put up with
the fact some piglit tests will always fail. That's the only use of
the tg4_swizzle array as far as I can see, so it can be removed.

Marek

On Mon, Jul 14, 2014 at 11:27 PM, Dave Airlie airl...@gmail.com wrote:
 Also, have you looked at Dave Airlie's impl? Not sure what's been
 going on there...
 http://cgit.freedesktop.org/~airlied/mesa/log/?h=r600g-texture-gather
 . He seemed to enable 4 components for = CEDAR.


 No, didn't pop up when i searched the archives for any prior work on this.
 Interesting! It looks like the GL_ARB_texture_gather portion in that branch
 is almost exactly equivalent to my patch.

 It also has some of the additional parts needed for GL_ARB_gpu_shader5 such
 as non-constant offsets, but i don't see where it would trigger shader
 recompiles if texture sampler swizzle changes? Does it pass piglit?


 David, any opinions on how to move forward with this feature?

 I didn't like my implementation sufficiently to push it forward, but I
 think the main thing
 was the extra key changes, I meant to get Marek's input on that code,
 but then it fell
 off my radar. So get input from Marek, repost and merge :)

 Dave.
 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] i965/fs: Perform CSE on sends-from-GRF rather than textures.

2014-07-14 Thread Kenneth Graunke
On Thursday, July 10, 2014 11:59:40 AM Matt Turner wrote:
 Should potentially allow a few more cases, while avoiding doing CSE on
 texture operations on Gen = 6 with the MRF.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80211
 ---
  src/mesa/drivers/dri/i965/brw_fs_cse.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 index 5727801..7226aff 100644
 --- a/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_fs_cse.cpp
 @@ -103,7 +103,7 @@ is_expression(const fs_inst *const inst)
 case SHADER_OPCODE_LOAD_PAYLOAD:
return !is_copy_payload(inst);
 default:
 -  return inst-is_tex();
 +  return inst-is_send_from_grf();
 }
  }
  
 

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/10] glsl: Optimize (A || B) A == A

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

And it's cousins

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 748fd05..230fb1b 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -559,6 +559,16 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
  ir-operands[0]-equals(op_expr[1]-operands[0])) {
  /* A  !A == 0 */
  return ir_constant::zero(mem_ctx, ir-type);
+  } else if (op_expr[0]  op_expr[0]-operation == ir_binop_logic_or 
+  (op_expr[0]-operands[0]-equals(ir-operands[1]) ||
+   op_expr[0]-operands[1]-equals(ir-operands[1]))) {
+ /* (A || B)  A == A or (B || A)  A == A */
+ return ir-operands[1];
+  } else if (op_expr[1]  op_expr[1]-operation == ir_binop_logic_or 
+  (op_expr[1]-operands[0]-equals(ir-operands[0]) ||
+   op_expr[1]-operands[1]-equals(ir-operands[0]))) {
+ /* A  (A || B) == A or A  (B || A) == A */
+ return ir-operands[1];
   }
   break;
 
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/10] glsl: Optimize log(x) + log(y) == log(x*y)

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

And the log2() equivalent.

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index deaa49e..e2abf5f 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -380,6 +380,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 return expr;
   }
 
+  /* log(x) + log(y) == log(x*y) */
+  if (op_expr[0]  op_expr[0]-operation == ir_unop_log  
+  op_expr[1]  op_expr[1]-operation == ir_unop_log)
+ return new(mem_ctx) ir_expression(
+ir_unop_log, mul(op_expr[0]-operands[0], op_expr[1]-operands[0]) 
);
+
+  /* log2(x) + log2(y) == log2(x*y) */
+  if (op_expr[0]  op_expr[0]-operation == ir_unop_log2  
+  op_expr[1]  op_expr[1]-operation == ir_unop_log2)
+ return new(mem_ctx) ir_expression(
+ir_unop_log2, mul(op_expr[0]-operands[0], 
op_expr[1]-operands[0]) );
+
   /* Replace (-x + y) * a + x and commutative variations with lrp(x, y, a).
*
* (-x + y) * a + x
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 01/10] glsl: Optimize X - X - 0

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

Silly optimization indeed, but who knows.

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index ac7514a..c179d53 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -439,6 +439,9 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 return neg(ir-operands[1]);
   if (is_vec_zero(op_const[1]))
 return ir-operands[0];
+  /* X - X == 0 */
+  if (ir-operands[0]-equals(ir-operands[1]))
+ return ir_constant::zero(ir, ir-type);
   break;
 
case ir_binop_mul:
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/10] glsl: Optimize some more pow() special cases

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

Specifically x^-1 = rcp(x)
.0^x  = 0
.x^0  = 1

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index fba9de6..1382067 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -662,6 +662,21 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
   if (is_vec_one(op_const[1]))
  return ir-operands[0];
 
+  /* x^-1 == rcp(x) */
+  if (is_vec_negative_one(op_const[1]))
+ return new(mem_ctx) ir_expression(ir_unop_rcp,
+  ir-operands[0]-type,
+  ir-operands[0],
+  NULL);
+
+  /* 0^x == 0 */
+  if (is_vec_zero(op_const[0]))
+ return ir_constant::zero(ir, ir-type);
+
+  /* x^0 == 1 */
+  if (is_vec_zero(op_const[1]))
+ return new(mem_ctx) ir_constant(1.0f, 1);
+
   /* pow(2,x) == exp2(x) */
   if (is_vec_two(op_const[0]))
  return expr(ir_unop_exp2, ir-operands[1]);
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/10] glsl: Optimize !A || A == 1

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index c179d53..385f0a2 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -591,6 +591,18 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
   } else if (ir-operands[0]-equals(ir-operands[1])) {
  /* (a || a) == a */
  return ir-operands[0];
+  } else if ( (op_expr[0]  op_expr[0]-operation == ir_unop_logic_not 
+  ir-operands[1]-equals(op_expr[0]-operands[0])) || 
+  /* !A || A == 1 */
+  (op_expr[1]  op_expr[1]-operation == ir_unop_logic_not 
+ ir-operands[0]-equals(op_expr[1]-operands[0])) ) {
+  /* A || !A == 1 */
+ ir_constant_data data;
+
+for (unsigned i = 0; i  16; i++)
+   data.b[i] = true;
+
+return new(mem_ctx) ir_constant(ir-type, data);
   }
   break;
 
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/10] glsl: Optimize exp(x)*exp(y) == exp(x+y)

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

And it's exp2() equivalent.

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index e2abf5f..2361d0f 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -470,6 +470,17 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
   if (is_vec_negative_one(op_const[1]))
  return neg(ir-operands[0]);
 
+  if (op_expr[0]  op_expr[1]) {
+ /* exp(x) * exp(y) == exp(x+y) */
+ if (op_expr[0]-operation == ir_unop_exp 
+ op_expr[1]-operation == ir_unop_exp)
+return exp(add(op_expr[0]-operands[0], op_expr[1]-operands[0]));
+ /* exp2(x) * exp2(y) == exp2(x+y) */
+ if (op_expr[0]-operation == ir_unop_exp2 
+ op_expr[1]-operation == ir_unop_exp2)
+return new (mem_ctx) ir_expression(ir_unop_exp2,
+   add(op_expr[0]-operands[0], op_expr[1]-operands[0]));
+  }
 
   /* Reassociate multiplication of constants so that we can do
* constant folding.
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/10] glsl: Optimize min(-8, sin(x)) == -8 and similar

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

This patch is a bit sketchy.
It only handles float-constants, otherwise it bails.
Also, ir_unop_saturate should be added here when
support for it lands in the ir.

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 37 +
 1 file changed, 37 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 230fb1b..90b6cae 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -647,6 +647,43 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
 
   break;
 
+   case ir_binop_min: {
+  ir_rvalue *tempExpr;
+  ir_rvalue *tempConst;
+
+  if (op_expr[0]  op_const[1]) {
+ tempExpr = ir-operands[0];
+ tempConst = ir-operands[1];
+  } else if (op_expr[1]  op_const[0]) {
+ tempExpr = ir-operands[1];
+ tempConst = ir-operands[0];
+  } else {
+ break;
+  }
+  if(tempConst-type-is_float()) {
+ float constValue = *(tempConst-constant_expression_value()-value.f);
+ switch(tempExpr-as_expression()-operation) {
+ case ir_unop_cos:
+ case ir_unop_sin:
+ case ir_unop_sign:
+/* If we are comparing an expression bound between -1 and 1
+ * with a const = 1 we know that const will be the largest
+ */
+if(constValue = 1.0f)
+   return tempExpr;
+/* If we are comparing an expression bound between -1 and 1
+ * with a const = -1 we know that const will be the smallest
+ */
+if(constValue = -1.0f)
+   return tempConst;
+break;
+ default:
+break;
+ }
+  }
+  break;
+   }
+
case ir_unop_rcp:
   if (op_expr[0]  op_expr[0]-operation == ir_unop_rcp)
 return op_expr[0]-operands[0];
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/10] glsl: Optimize A - neg(B) == A + B

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

...and  neg(A) - B == neg(A + B)

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 2361d0f..fba9de6 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -454,6 +454,12 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
   /* X - X == 0 */
   if (ir-operands[0]-equals(ir-operands[1]))
  return ir_constant::zero(ir, ir-type);
+  /* A - neg(B) = A + B */
+  if (op_expr[1]  op_expr[1]-operation == ir_unop_neg)
+ return add(ir-operands[0], op_expr[1]-operands[0]);
+  /* neg(A) - B = neg(A + B) */
+  if (op_expr[0]  op_expr[0]-operation == ir_unop_neg)
+ return neg(add(op_expr[0]-operands[0], ir-operands[1]));
   break;
 
case ir_binop_mul:
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 00/10] [RFC] Probably useless algebraic optimizations

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

When writing that A || (A  B) patch some
days ago I also wrote some other patches
that have no impact on my collection of shaders.
(shader-db + Some TF2 and Portal-shaders).
No reduction in instruction count, and no
significant increase in compilation time.

I decided to put them up here anyway, as
with your collection of shaders maybe YMMV.

These are mostly RFC-quality, and not all are
as complete and nicely formatted as they could be.
Possibly some are also implemented incorrectly.
(I'm still trying to get a good understanding of
the buildup of the ir, the visitors, etc)

Feel free to do with these patches as you please;
Ignore, test, review, flame, make cookies...

Thomas Helland (10):
  glsl: Optimize X - X - 0
  glsl: Optimize !A || A == 1
  glsl: Optimize !A  A == 0
  glsl: Optimize (A || B)  A == A
  glsl: Optimize min(-8, sin(x)) == -8 and similar
  glsl: Optimize max(8, sin(x)) == 8 and similar
  glsl: Optimize log(x) + log(y) == log(x*y)
  glsl: Optimize exp(x)*exp(y) == exp(x+y)
  glsl: Optimize A - neg(B) == A + B  and  neg(A) - B == neg(A + B)
  glsl: Optimize some more pow() special cases

 src/glsl/opt_algebraic.cpp | 152 +
 1 file changed, 152 insertions(+)

-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/10] glsl: Optimize max(8, sin(x)) == 8 and similar

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

This also only handles floats.
ir_unop_saturate should also be added when the
support for this lands in the ir.

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 38 ++
 1 file changed, 38 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 90b6cae..deaa49e 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -684,6 +684,44 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
   break;
}
 
+   case ir_binop_max: {
+  ir_rvalue *tempExpr;
+  ir_rvalue *tempConst;
+
+  if (op_expr[0]  op_const[1]) {
+ tempExpr = ir-operands[0];
+ tempConst = ir-operands[1];
+  } else if (op_expr[1]  op_const[0]) {
+ tempExpr = ir-operands[1];
+ tempConst = ir-operands[0];
+  } else {
+ break;
+  }
+
+  if(tempConst-type-is_float()) {
+ float constValue = *(tempConst-constant_expression_value()-value.f);
+ switch(tempExpr-as_expression()-operation) {
+ case ir_unop_cos:
+ case ir_unop_sin:
+ case ir_unop_sign:
+/* If we are comparing an expression bound between -1 and 1
+ * with a const = 1 we know that const will be the largest
+ */
+if(constValue = 1.0f)
+   return tempConst;
+/* If we are comparing an expression bound between -1 and 1
+ * with a const = -1 we know that const will be the smallest
+ */
+if(constValue = -1.0f)
+   return tempExpr;
+break;
+ default:
+break;
+ }
+  }
+  break;
+   }
+
case ir_unop_rcp:
   if (op_expr[0]  op_expr[0]-operation == ir_unop_rcp)
 return op_expr[0]-operands[0];
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 03/10] glsl: Optimize !A A == 0

2014-07-14 Thread thomashelland90
From: Thomas Helland thomashellan...@gmail.com

Signed-off-by: Thomas Helland thomashellan...@gmail.com
---
 src/glsl/opt_algebraic.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/glsl/opt_algebraic.cpp b/src/glsl/opt_algebraic.cpp
index 385f0a2..748fd05 100644
--- a/src/glsl/opt_algebraic.cpp
+++ b/src/glsl/opt_algebraic.cpp
@@ -551,6 +551,14 @@ ir_algebraic_visitor::handle_expression(ir_expression *ir)
   } else if (ir-operands[0]-equals(ir-operands[1])) {
  /* (a  a) == a */
  return ir-operands[0];
+  } else if (op_expr[0]  op_expr[0]-operation == ir_unop_logic_not 
+ ir-operands[1]-equals(op_expr[0]-operands[0])) {
+ /* !A  A == 0 */
+ return ir_constant::zero(mem_ctx, ir-type);
+  } else if (op_expr[1]  op_expr[1]-operation == ir_unop_logic_not 
+ ir-operands[0]-equals(op_expr[1]-operands[0])) {
+ /* A  !A == 0 */
+ return ir_constant::zero(mem_ctx, ir-type);
   }
   break;
 
-- 
2.0.0

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/2] glsl: Update expression types after rebalancing the tree.

2014-07-14 Thread Kenneth Graunke
On Thursday, July 10, 2014 11:26:52 AM Matt Turner wrote:
 If we saw a tree that looked like
 
 vec3
/   \
  vec3 float
 /   \
   vec3 float
  /   \
vec3 float
 
 We would see that all of the expression types were vec3, and then
 rebalance to
 
vec3
 /\
   vec3   vec3 -- should be float
  /   \  /\
vec3 float float float
 
 This patch adds code to visit the rebalanced tree and update the
 expression types from the bottom up.
 
 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80880
 ---
  src/glsl/opt_rebalance_tree.cpp | 17 +
  1 file changed, 17 insertions(+)
 
 diff --git a/src/glsl/opt_rebalance_tree.cpp 
b/src/glsl/opt_rebalance_tree.cpp
 index 773aab3..daabdc9 100644
 --- a/src/glsl/opt_rebalance_tree.cpp
 +++ b/src/glsl/opt_rebalance_tree.cpp
 @@ -60,6 +60,7 @@
  #include ir_visitor.h
  #include ir_rvalue_visitor.h
  #include ir_optimization.h
 +#include main/macros.h /* for MAX2 */
  
  /* The DSW algorithm generates a degenerate tree (really, a linked list) in
   * tree_to_vine(). We'd rather not leave a binary expression with only one
 @@ -261,6 +262,20 @@ handle_expression(ir_expression *expr)
 return expr;
  }
  
 +static void
 +update_types(ir_instruction *ir, void *)
 +{
 +   ir_expression *expr = ir-as_expression();
 +   if (!expr)
 +  return;
 +
 +   expr-type =
 +  glsl_type::get_instance(expr-type-base_type,
 +  MAX2(expr-operands[0]-type-components(),
 +   expr-operands[1]-type-components()),
 +  1);
 +}
 +
  void
  ir_rebalance_visitor::handle_rvalue(ir_rvalue **rvalue)
  {
 @@ -285,6 +300,8 @@ ir_rebalance_visitor::handle_rvalue(ir_rvalue **rvalue)
 if (new_rvalue == *rvalue)
return;
  
 +   visit_tree(new_rvalue, NULL, NULL, update_types);
 +
 *rvalue = new_rvalue;
 this-progress = true;
  }
 

Thanks for fixing this!

Reviewed-by: Kenneth Graunke kenn...@whitecape.org

signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 07/26] glsl: Use accessors for ir_variable::warn_extension

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The payoff for this will come in the next patch.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/builtin_variables.cpp |  4 ++--
 src/glsl/ir.cpp| 11 +++
 src/glsl/ir.h  | 22 +-
 3 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 3497302..6fc3319 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -902,14 +902,14 @@ builtin_variable_generator::generate_fs_special_vars()
   ir_variable *const var =
  add_output(FRAG_RESULT_STENCIL, int_t, gl_FragStencilRefARB);
   if (state-ARB_shader_stencil_export_warn)
- var-warn_extension = GL_ARB_shader_stencil_export;
+ var-enable_extension_warning(GL_ARB_shader_stencil_export);
}
 
if (state-AMD_shader_stencil_export_enable) {
   ir_variable *const var =
  add_output(FRAG_RESULT_STENCIL, int_t, gl_FragStencilRefAMD);
   if (state-AMD_shader_stencil_export_warn)
- var-warn_extension = GL_AMD_shader_stencil_export;
+ var-enable_extension_warning(GL_AMD_shader_stencil_export);
}
 
if (state-ARB_sample_shading_enable) {
diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 6a4fc1b..8575ac3 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1607,6 +1607,17 @@ ir_variable::determine_interpolation_mode(bool 
flat_shade)
   return INTERP_QUALIFIER_SMOOTH;
 }
 
+void
+ir_variable::enable_extension_warning(const char *extension)
+{
+   this-warn_extension = extension;
+}
+
+const char *
+ir_variable::get_extension_warning() const
+{
+   return this-warn_extension;
+}
 
 ir_function_signature::ir_function_signature(const glsl_type *return_type,
  builtin_available_predicate b)
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 7036c1e..bdd9a3c 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -527,6 +527,18 @@ public:
}
 
/**
+* Enable emitting extension warnings for this variable
+*/
+   void enable_extension_warning(const char *extension);
+
+   /**
+* Get the extension warning string for this variable
+*
+* If warnings are not enabled, \c NULL is returned.
+*/
+   const char *get_extension_warning() const;
+
+   /**
 * Declared type of the variable
 */
const struct glsl_type *type;
@@ -778,11 +790,6 @@ public:
/*@}*/
 
/**
-* Emit a warning if this variable is accessed.
-*/
-   const char *warn_extension;
-
-   /**
 * Value assigned in the initializer of a variable declared const
 */
ir_constant *constant_value;
@@ -805,6 +812,11 @@ private:
 * \sa ir_variable::location
 */
const glsl_type *interface_type;
+
+   /**
+* Emit a warning if this variable is accessed.
+*/
+   const char *warn_extension;
 };
 
 /**
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 15/26] glsl: Never put ir_var_temporary variables in the symbol table

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Later patches will give every ir_var_temporary the same name in release
builds.  Adding a bunch of variables named compiler_temp to the symbol
table can only cause problems.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_to_hir.cpp| 4 ++--
 src/glsl/glsl_parser_extras.cpp| 8 ++--
 src/glsl/glsl_symbol_table.cpp | 2 ++
 src/glsl/linker.cpp| 4 +++-
 src/glsl/lower_packed_varyings.cpp | 1 +
 5 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 82ac29e..0a4b3a6 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -912,7 +912,6 @@ get_lvalue_copy(exec_list *instructions, ir_rvalue *lvalue)
var = new(ctx) ir_variable(lvalue-type, _post_incdec_tmp,
  ir_var_temporary);
instructions-push_tail(var);
-   var-data.mode = ir_var_auto;
 
instructions-push_tail(new(ctx) ir_assignment(new(ctx) 
ir_dereference_variable(var),
  lvalue));
@@ -2499,6 +2498,7 @@ apply_type_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
/* If there is no qualifier that changes the mode of the variable, leave
 * the setting alone.
 */
+   assert(var-data.mode != ir_var_temporary);
if (qual-flags.q.in  qual-flags.q.out)
   var-data.mode = ir_var_function_inout;
else if (qual-flags.q.in)
@@ -4944,7 +4944,7 @@ ast_type_specifier::hir(exec_list *instructions,
   */
  ir_variable *const junk =
 new(state) ir_variable(type, #default precision,
-   ir_var_temporary);
+   ir_var_auto);
 
  state-symbols-add_variable(junk);
   }
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 93f5e75..0555c57 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1544,9 +1544,13 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
  shader-symbols-add_function(decl);
  break;
   }
-  case ir_type_variable:
- shader-symbols-add_variable((ir_variable *) ir);
+  case ir_type_variable: {
+ ir_variable *const var = (ir_variable *) ir;
+
+ if (var-data.mode != ir_var_temporary)
+shader-symbols-add_variable(var);
  break;
+  }
   default:
  break;
   }
diff --git a/src/glsl/glsl_symbol_table.cpp b/src/glsl/glsl_symbol_table.cpp
index a052362..2294dda 100644
--- a/src/glsl/glsl_symbol_table.cpp
+++ b/src/glsl/glsl_symbol_table.cpp
@@ -124,6 +124,8 @@ bool glsl_symbol_table::name_declared_this_scope(const char 
*name)
 
 bool glsl_symbol_table::add_variable(ir_variable *v)
 {
+   assert(v-data.mode != ir_var_temporary);
+
if (this-separate_function_namespace) {
   /* In 1.10, functions and variables have separate namespaces. */
   symbol_table_entry *existing = get_entry(v-name);
diff --git a/src/glsl/linker.cpp b/src/glsl/linker.cpp
index 2119943..842678f 100644
--- a/src/glsl/linker.cpp
+++ b/src/glsl/linker.cpp
@@ -969,7 +969,8 @@ populate_symbol_table(gl_shader *sh)
   if ((func = inst-as_function()) != NULL) {
 sh-symbols-add_function(func);
   } else if ((var = inst-as_variable()) != NULL) {
-sh-symbols-add_variable(var);
+ if (var-data.mode != ir_var_temporary)
+sh-symbols-add_variable(var);
   }
}
 }
@@ -2166,6 +2167,7 @@ demote_shader_inputs_and_outputs(gl_shader *sh, enum 
ir_variable_mode mode)
* to have a location assigned.
*/
   if (var-data.is_unmatched_generic_inout) {
+ assert(var-data.mode != ir_var_temporary);
 var-data.mode = ir_var_auto;
   }
}
diff --git a/src/glsl/lower_packed_varyings.cpp 
b/src/glsl/lower_packed_varyings.cpp
index c72b80a..175f805 100644
--- a/src/glsl/lower_packed_varyings.cpp
+++ b/src/glsl/lower_packed_varyings.cpp
@@ -261,6 +261,7 @@ lower_packed_varyings_visitor::run(exec_list *instructions)
  !var-type-contains_integer());
 
   /* Change the old varying into an ordinary global. */
+  assert(var-data.mode != ir_var_temporary);
   var-data.mode = ir_var_auto;
 
   /* Create a reference to the old varying. */
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 09/26] glsl: Store ir_variable::depth_layout using 3 bits

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

warn_extension_index was moved to improve packing.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 73 40,580,476,304   68,488,400   62,796,151 
5,692,2490
After  (32-bit): 73 40,575,751,558   68,116,528   62,618,607 
5,497,9210

Before (64-bit): 71 37,124,890,613   95,889,584   88,089,008 
7,800,5760
After  (64-bit): 62 37,123,578,526   95,150,784   87,711,304 
7,439,4800

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_to_hir.cpp |  4 ++--
 src/glsl/ir.h   | 19 +--
 2 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 542a869..82ac29e 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2923,8 +2923,8 @@ get_variable_being_redeclared(ir_variable *var, YYLTYPE 
loc,
  gl_FragDepth: depth layout is declared here 
  as '%s, but it was previously declared as 
  '%s',
- depth_layout_string(var-data.depth_layout),
- depth_layout_string(earlier-data.depth_layout));
+ 
depth_layout_string(ir_depth_layout(var-data.depth_layout)),
+ 
depth_layout_string(ir_depth_layout(earlier-data.depth_layout)));
   }
 
   earlier-data.depth_layout = var-data.depth_layout;
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index d4010fa..d33d153 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -704,6 +704,13 @@ public:
*/
   unsigned index:1;
 
+  /**
+   * \brief Layout qualifier for gl_FragDepth.
+   *
+   * This is not equal to \c ir_depth_layout_none if and only if this
+   * variable is \c gl_FragDepth and a layout qualifier is specified.
+   */
+  unsigned depth_layout:3;
 
   /**
* ARB_shader_image_load_store qualifiers.
@@ -714,9 +721,6 @@ public:
   unsigned image_volatile:1;
   unsigned image_restrict:1;
 
-  /** Image internal format if specified explicitly, otherwise GL_NONE. */
-  uint16_t image_format;
-
   /**
* Emit a warning if this variable is accessed.
*/
@@ -724,13 +728,8 @@ public:
   uint8_t warn_extension_index;
 
public:
-  /**
-   * \brief Layout qualifier for gl_FragDepth.
-   *
-   * This is not equal to \c ir_depth_layout_none if and only if this
-   * variable is \c gl_FragDepth and a layout qualifier is specified.
-   */
-  ir_depth_layout depth_layout;
+  /** Image internal format if specified explicitly, otherwise GL_NONE. */
+  uint16_t image_format;
 
   /**
* Storage location of the base of this variable
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 02/26] glsl: Rebuild the symbol table without unreachable symbols

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Previously we had to keep unreachable global symbols in the symbol table
because the symbol table is used during linking.  Having the symbol
table retain pointers to freed memory... what could possibly go wrong?
At the same time, this meant that we kept live references to tons of
memory that was no longer needed.

New strategy:  destroy the old symbol table, and make a new one from the
reachable symbols.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 59 40,642,425,451   76,337,968   69,720,886 
6,617,0820
After  (32-bit): 46 40,661,487,174   75,116,800   68,854,065 
6,262,7350

Before (64-bit): 79 37,179,441,771  106,986,512   98,112,095 
8,874,4170
After  (64-bit): 64 37,200,329,700  104,872,672   96,514,546 
8,358,1260

A real savings of 846KiB on 32-bit and 1.5MiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Cc: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/glsl_parser_extras.cpp | 37 -
 1 file changed, 36 insertions(+), 1 deletion(-)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index b327c2b..2962407 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1485,7 +1485,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
if (shader-InfoLog)
   ralloc_free(shader-InfoLog);
 
-   shader-symbols = state-symbols;
+   shader-symbols = new(shader-ir) glsl_symbol_table;
shader-CompileStatus = !state-error;
shader-InfoLog = state-info_log;
shader-Version = state-language_version;
@@ -1498,6 +1498,41 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
/* Retain any live IR, but trash the rest. */
reparent_ir(shader-ir, shader-ir);
 
+   /* Destroy the symbol table.  Create a new symbol table that contains only
+* the variables and functions that still exist in the IR.  The symbol
+* table will be used later during linking.
+*
+* There must NOT be any freed objects still referenced by the symbol
+* table.  That could cause the linker to dereference freed memory.
+*
+* We don't have to worry about types or interface-types here because those
+* are fly-weights that are looked up by glsl_type.
+*/
+   foreach_in_list (ir_instruction, ir, shader-ir) {
+  switch (ir-ir_type) {
+  case ir_type_function: {
+ /* If the function is a built-in that is partially overridden in the
+  * shader, the ir_function stored in the symbol table may not be the
+  * same as the one that appears in the shader.  The one in the shader
+  * will only include definitions from inside the shader.  We need the
+  * one from the symbol table because it will include built-in
+  * defintions and definitions from the shader.
+  */
+ ir_function *const def = (ir_function *) ir;
+ ir_function *const decl = state-symbols-get_function(def-name);
+
+ shader-symbols-add_function(decl);
+ break;
+  }
+  case ir_type_variable:
+ shader-symbols-add_variable((ir_variable *) ir);
+ break;
+  default:
+ break;
+  }
+   }
+
+   delete state-symbols;
ralloc_free(state);
 }
 
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 18/26] glsl: Don't allocate a name for ir_var_temporary variables

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 74 40,578,719,715   67,762,208   62,263,404 
5,498,8040
After  (32-bit): 52 40,565,579,466   66,359,800   61,187,818 
5,171,9820

Before (64-bit): 74 37,129,541,061   95,195,160   87,369,671 
7,825,4890
After  (64-bit): 76 37,134,691,404   93,271,352   85,900,223 
7,371,1290

A real savings of 1.0MiB on 32-bit and 1.4MiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/glsl_parser_extras.cpp |  3 +++
 src/glsl/ir.cpp |  5 +
 src/glsl/ir.h   | 19 +++
 src/glsl/test_optpass.cpp   |  1 +
 4 files changed, 28 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 0555c57..1fe38f3 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1437,6 +1437,9 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
   new(shader) _mesa_glsl_parse_state(ctx, shader-Stage, shader);
const char *source = shader-Source;
 
+   if (ctx-Const.GenerateTemporaryNames)
+  ir_variable::temporaries_allocate_names = true;
+
state-error = glcpp_preprocess(state, source, state-info_log,
  ctx-Extensions, ctx);
 
diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 0045c2d..bc008a4 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1533,6 +1533,8 @@ ir_swizzle::variable_referenced() const
 }
 
 
+bool ir_variable::temporaries_allocate_names = false;
+
 const char ir_variable::tmp_name[] = compiler_temp;
 
 ir_variable::ir_variable(const struct glsl_type *type, const char *name,
@@ -1541,6 +1543,9 @@ ir_variable::ir_variable(const struct glsl_type *type, 
const char *name,
 {
this-type = type;
 
+   if (mode == ir_var_temporary  !ir_variable::temporaries_allocate_names)
+  name = NULL;
+
/* The ir_variable clone method may call this constructor with name set to
 * tmp_name.
 */
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 1aeb1ff..42ccb00 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -891,6 +891,25 @@ private:
 * Name used for anonymous compiler temporaries
 */
static const char tmp_name[];
+
+public:
+   /**
+* Should the construct keep names for ir_var_temporary variables?
+*
+* When this global is false, names passed to the constructor for
+* \c ir_var_temporary variables will be dropped.  Instead, the variable 
will
+* be named compiler_temp.  This name will be in static storage.
+*
+* \warning
+* \b NEVER change the mode of an \c ir_var_temporary.
+*
+* \warning
+* This variable is \b not thread-safe.  It is global, \b not
+* per-context. It begins life false.  A context can, at some point, make
+* it true.  From that point on, it will be true forever.  This should be
+* okay since it will only be set true while debugging.
+*/
+   static bool temporaries_allocate_names;
 };
 
 /**
diff --git a/src/glsl/test_optpass.cpp b/src/glsl/test_optpass.cpp
index e4878bf..1c29705 100644
--- a/src/glsl/test_optpass.cpp
+++ b/src/glsl/test_optpass.cpp
@@ -200,6 +200,7 @@ int test_optpass(int argc, char **argv)
initialize_context_to_defaults(ctx, API_OPENGL_COMPAT);
 
ctx-Driver.NewShader = _mesa_new_shader;
+   ir_variable::temporaries_allocate_names = true;
 
struct gl_shader *shader = rzalloc(NULL, struct gl_shader);
shader-Type = shader_type;
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 12/26] glsl: Squish ir_variable::max_ifc_array_access and ::state_slots together

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

At least one of these pointers must be NULL, and we can determine which
will be NULL by looking at other fields.  Use this information to store
both pointers in the same location.

If anyone can think of a better name for the union than u, I'm all
ears.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 63 40,574,239,515   68,117,280   62,618,607 
5,498,6730
After  (32-bit): 44 40,577,049,140   68,118,608   62,441,063 
5,677,5450

Before (64-bit): 53 37,126,451,468   95,150,256   87,711,304 
7,438,9520
After  (64-bit): 63 37,122,829,194   95,153,008   87,333,600 
7,819,4080

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp   |  5 +++-
 src/glsl/ir.h | 75 ---
 src/glsl/ir_clone.cpp |  4 +--
 3 files changed, 48 insertions(+), 36 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index cfdc7f0..40dbb6b 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1535,10 +1535,13 @@ ir_swizzle::variable_referenced() const
 
 ir_variable::ir_variable(const struct glsl_type *type, const char *name,
 ir_variable_mode mode)
-   : ir_instruction(ir_type_variable), max_ifc_array_access(NULL)
+   : ir_instruction(ir_type_variable)
 {
this-type = type;
this-name = ralloc_strdup(this, name);
+
+   this-u.max_ifc_array_access = NULL;
+
this-data.explicit_location = false;
this-data.has_initializer = false;
this-data.location = -1;
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index c47a1fb..e73b113 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -475,7 +475,7 @@ public:
   assert(this-interface_type == NULL);
   this-interface_type = type;
   if (this-is_interface_instance()) {
- this-max_ifc_array_access =
+ this-u.max_ifc_array_access =
 rzalloc_array(this, unsigned, type-length);
   }
}
@@ -487,7 +487,7 @@ public:
 */
void change_interface_type(const struct glsl_type *type)
{
-  if (this-max_ifc_array_access != NULL) {
+  if (this-u.max_ifc_array_access != NULL) {
  /* max_ifc_array_access has already been allocated, so make sure the
   * new interface has the same number of fields as the old one.
   */
@@ -504,7 +504,7 @@ public:
 */
void reinit_interface_type(const struct glsl_type *type)
{
-  if (this-max_ifc_array_access != NULL) {
+  if (this-u.max_ifc_array_access != NULL) {
 #ifndef NDEBUG
  /* Redeclaring gl_PerVertex is only allowed if none of the built-ins
   * it defines have been accessed yet; so it's safe to throw away the
@@ -512,10 +512,10 @@ public:
   * zero.
   */
  for (unsigned i = 0; i  this-interface_type-length; i++)
-assert(this-max_ifc_array_access[i] == 0);
+assert(this-u.max_ifc_array_access[i] == 0);
 #endif
- ralloc_free(this-max_ifc_array_access);
- this-max_ifc_array_access = NULL;
+ ralloc_free(this-u.max_ifc_array_access);
+ this-u.max_ifc_array_access = NULL;
   }
   this-interface_type = NULL;
   init_interface_type(type);
@@ -534,38 +534,45 @@ public:
 */
inline unsigned *get_max_ifc_array_access()
{
-  return this-max_ifc_array_access;
+  assert(this-data._num_state_slots == 0);
+  return this-u.max_ifc_array_access;
}
 
inline unsigned get_num_state_slots() const
{
+  assert(!this-is_interface_instance()
+ || this-data._num_state_slots == 0);
   return this-data._num_state_slots;
}
 
inline void set_num_state_slots(unsigned n)
{
+  assert(!this-is_interface_instance()
+ || n == 0);
   this-data._num_state_slots = n;
}
 
inline ir_state_slot *get_state_slots()
{
-  return this-state_slots;
+  return this-is_interface_instance() ? NULL : this-u.state_slots;
}
 
inline const ir_state_slot *get_state_slots() const
{
-  return this-state_slots;
+  return this-is_interface_instance() ? NULL : this-u.state_slots;
}
 
inline ir_state_slot *allocate_state_slots(unsigned n)
{
-  this-state_slots = ralloc_array(this, ir_state_slot, n);
+  assert(!this-is_interface_instance());
+
+  this-u.state_slots = ralloc_array(this, ir_state_slot, n);
   this-data._num_state_slots = 0;
 
-  if (this-state_slots != NULL)
+  if (this-u.state_slots != NULL)
  this-data._num_state_slots = n;
 
-  return this-state_slots;
+  return this-u.state_slots;
}
 
/**
@@ -834,28 +841,30 @@ public:
 private:
static const char *const warn_extension_table[];
 
-   /**

[Mesa-dev] [PATCH 01/26] glsl: Validate that built-in uniforms have backing state

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

All built-in uniforms are supposed to be backed by some GL state.  The
state_slots field describes this backing state.

This helped me track down a bug in a later patch.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Acked-by: Anuj Phogat anuj.pho...@gmail.com
---
 src/glsl/ir_validate.cpp | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 42142da..ef660e8 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -696,6 +696,14 @@ ir_validate::visit(ir_variable *ir)
   abort();
}
 
+   if (ir-data.mode == ir_var_uniform
+strncmp(ir-name, gl_, 3) == 0
+ir-state_slots == NULL) {
+  printf(built-in uniform has no state\n);
+  ir-print();
+  abort();
+   }
+
return visit_continue;
 }
 
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 17/26] glsl: Use ir_var_temporary for compiler generated temporaries

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

These few places were using ir_var_auto for seemingly no reason.  The
names were not added to the symbol table.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/lower_offset_array.cpp   | 3 ++-
 src/glsl/lower_texture_projection.cpp | 2 +-
 src/glsl/opt_cse.cpp  | 2 +-
 3 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/src/glsl/lower_offset_array.cpp b/src/glsl/lower_offset_array.cpp
index 0c235ed..5b48526 100644
--- a/src/glsl/lower_offset_array.cpp
+++ b/src/glsl/lower_offset_array.cpp
@@ -63,7 +63,8 @@ brw_lower_offset_array_visitor::handle_rvalue(ir_rvalue **rv)
 
void *mem_ctx = ralloc_parent(ir);
 
-   ir_variable *var = new (mem_ctx) ir_variable(ir-type, result, 
ir_var_auto);
+   ir_variable *var =
+  new (mem_ctx) ir_variable(ir-type, result, ir_var_temporary);
base_ir-insert_before(var);
 
for (int i = 0; i  4; i++) {
diff --git a/src/glsl/lower_texture_projection.cpp 
b/src/glsl/lower_texture_projection.cpp
index 16d6376..95df106 100644
--- a/src/glsl/lower_texture_projection.cpp
+++ b/src/glsl/lower_texture_projection.cpp
@@ -62,7 +62,7 @@ lower_texture_projection_visitor::visit_leave(ir_texture *ir)
void *mem_ctx = ralloc_parent(ir);
 
ir_variable *var = new(mem_ctx) ir_variable(ir-projector-type,
-  projector, ir_var_auto);
+  projector, ir_var_temporary);
base_ir-insert_before(var);
ir_dereference *deref = new(mem_ctx) ir_dereference_variable(var);
ir_expression *expr = new(mem_ctx) ir_expression(ir_unop_rcp,
diff --git a/src/glsl/opt_cse.cpp b/src/glsl/opt_cse.cpp
index 0e720cc..9c96835 100644
--- a/src/glsl/opt_cse.cpp
+++ b/src/glsl/opt_cse.cpp
@@ -276,7 +276,7 @@ cse_visitor::try_cse(ir_rvalue *rvalue)
 
  ir_variable *var = new(rvalue) ir_variable(rvalue-type,
 cse,
-ir_var_auto);
+ir_var_temporary);
 
  /* Write the previous expression result into a new variable. */
  base_ir-insert_before(var);
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 16/26] glsl: Add context-level controls for whether temporaries have real names

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/main.cpp |  1 +
 src/mesa/main/context.c   |  6 ++
 src/mesa/main/mtypes.h| 13 +
 src/mesa/main/shaderapi.c |  3 +++
 4 files changed, 23 insertions(+)

diff --git a/src/glsl/main.cpp b/src/glsl/main.cpp
index a4452e0..feed100 100644
--- a/src/glsl/main.cpp
+++ b/src/glsl/main.cpp
@@ -210,6 +210,7 @@ initialize_context(struct gl_context *ctx, gl_api api)
   break;
}
 
+   ctx-Const.GenerateTemporaryNames = true;
ctx-Driver.NewShader = _mesa_new_shader;
 }
 
diff --git a/src/mesa/main/context.c b/src/mesa/main/context.c
index 50aae8b..562abf7 100644
--- a/src/mesa/main/context.c
+++ b/src/mesa/main/context.c
@@ -650,6 +650,12 @@ _mesa_init_constants(struct gl_context *ctx)
   ctx-Const.GLSLVersion = 0; /* GLSL not supported */
}
 
+#ifdef DEBUG
+   ctx-Const.GenerateTemporaryNames = true;
+#else
+   ctx-Const.GenerateTemporaryNames = false;
+#endif
+
/* GL_ARB_framebuffer_object */
ctx-Const.MaxSamples = 0;
 
diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
index b699021..2025fd0 100644
--- a/src/mesa/main/mtypes.h
+++ b/src/mesa/main/mtypes.h
@@ -3455,6 +3455,19 @@ struct gl_constants
 */
GLboolean DisableVaryingPacking;
 
+   /**
+* Should meaningful names be generated for compiler temporary variables?
+*
+* Generally, it is not useful to have the compiler generate meaningful
+* names for temporary variables that it creates.  This can, however, be a
+* useful debugging aid.  In Mesa debug builds or release builds when
+* MESA_GLSL is set at run-time, meaningful names will be generated.
+* Drivers can also force names to be generated by setting this field.
+* For example, the i965 driver may set it when INTEL_DEBUG=vs (to dump
+* vertex shader assembly) is set at run-time.
+*/
+   bool GenerateTemporaryNames;
+
/*
 * Maximum value supported for an index in DrawElements and friends.
 *
diff --git a/src/mesa/main/shaderapi.c b/src/mesa/main/shaderapi.c
index 2bbef35..6c74583 100644
--- a/src/mesa/main/shaderapi.c
+++ b/src/mesa/main/shaderapi.c
@@ -123,6 +123,9 @@ _mesa_init_shader_state(struct gl_context *ctx)
 
ctx-Shader.Flags = _mesa_get_shader_flags();
 
+   if (ctx-Shader.Flags != 0)
+  ctx-Const.GenerateTemporaryNames = true;
+
/* Extended for ARB_separate_shader_objects */
ctx-Shader.RefCount = 1;
mtx_init(ctx-Shader.Mutex, mtx_plain);
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 13/26] glsl: Store ir_variable_data::_num_state_slots and ::binding in 16-bits each

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 44 40,577,049,140   68,118,608   62,441,063 
5,677,5450
After  (32-bit): 71 40,583,408,411   67,761,528   62,263,519 
5,498,0090

Before (64-bit): 63 37,122,829,194   95,153,008   87,333,600 
7,819,4080
After  (64-bit): 67 37,123,303,706   95,150,544   87,333,600 
7,816,9440

A real savings of 173KiB on 32-bit and no change on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.h | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index e73b113..1071858 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -768,10 +768,25 @@ public:
   uint16_t image_format;
 
private:
-  unsigned _num_state_slots;/** Number of state slots used */
+  /**
+   * Number of state slots used
+   *
+   * \note
+   * This could be stored in as few as 7-bits, if necessary.  If it is made
+   * smaller, add an assertion to \c ir_variable::allocate_state_slots to
+   * be safe.
+   */
+  uint16_t _num_state_slots;
 
public:
   /**
+   * Initial binding point for a sampler, atomic, or UBO.
+   *
+   * For array types, this represents the binding point for the first 
element.
+   */
+  int16_t binding;
+
+  /**
* Storage location of the base of this variable
*
* The precise meaning of this field depends on the nature of the 
variable.
@@ -797,13 +812,6 @@ public:
   unsigned stream;
 
   /**
-   * Initial binding point for a sampler, atomic, or UBO.
-   *
-   * For array types, this represents the binding point for the first 
element.
-   */
-  int binding;
-
-  /**
* Location an atomic counter is stored at.
*/
   struct {
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 22/26] glsl: Store ir_variable::ir_type in 8 bits instead of 32

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index ee42857..5c565ff 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -88,12 +88,13 @@ enum ir_node_type {
  */
 class ir_instruction : public exec_node {
 private:
-   enum ir_node_type ir_type;
+   uint8_t ir_type;
 
 public:
inline enum ir_node_type get_ir_type() const
{
-  return this-ir_type;
+  STATIC_ASSERT(ir_type_max  256);
+  return (enum ir_node_type) this-ir_type;
}
 
/**
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 08/26] glsl: Replace ir_variable::warn_extension pointer with an 8-bit index

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Also move the new warn_extension_index into ir_variable::data.  This
enables slightly better packing.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 82 40,580,040,531   68,488,992   62,973,695 
5,515,2970
After  (32-bit): 73 40,580,476,304   68,488,400   62,796,151 
5,692,2490

Before (64-bit): 65 37,124,013,542   95,892,768   88,466,712 
7,426,0560
After  (64-bit): 71 37,124,890,613   95,889,584   88,089,008 
7,800,5760

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp   | 21 ++---
 src/glsl/ir.h | 18 +-
 src/glsl/ir_clone.cpp |  2 --
 3 files changed, 31 insertions(+), 10 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 8575ac3..cfdc7f0 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1544,7 +1544,7 @@ ir_variable::ir_variable(const struct glsl_type *type, 
const char *name,
this-data.location = -1;
this-data.location_frac = 0;
this-data.binding = 0;
-   this-warn_extension = NULL;
+   this-data.warn_extension_index = 0;
this-constant_value = NULL;
this-constant_initializer = NULL;
this-data.origin_upper_left = false;
@@ -1607,16 +1607,31 @@ ir_variable::determine_interpolation_mode(bool 
flat_shade)
   return INTERP_QUALIFIER_SMOOTH;
 }
 
+const char *const ir_variable::warn_extension_table[] = {
+   ,
+   GL_ARB_shader_stencil_export,
+   GL_AMD_shader_stencil_export,
+};
+
 void
 ir_variable::enable_extension_warning(const char *extension)
 {
-   this-warn_extension = extension;
+   for (unsigned i = 0; i  Elements(warn_extension_table); i++) {
+  if (strcmp(warn_extension_table[i], extension) == 0) {
+ this-data.warn_extension_index = i;
+ return;
+  }
+   }
+
+   assert(!Should not get here.);
+   this-data.warn_extension_index = 0;
 }
 
 const char *
 ir_variable::get_extension_warning() const
 {
-   return this-warn_extension;
+   return this-data.warn_extension_index == 0
+  ? NULL : warn_extension_table[this-data.warn_extension_index];
 }
 
 ir_function_signature::ir_function_signature(const glsl_type *return_type,
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index bdd9a3c..d4010fa 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -718,6 +718,13 @@ public:
   uint16_t image_format;
 
   /**
+   * Emit a warning if this variable is accessed.
+   */
+   private:
+  uint8_t warn_extension_index;
+
+   public:
+  /**
* \brief Layout qualifier for gl_FragDepth.
*
* This is not equal to \c ir_depth_layout_none if and only if this
@@ -771,6 +778,10 @@ public:
*/
   unsigned max_array_access;
 
+  /**
+   * Allow (only) ir_variable direct access private members.
+   */
+  friend class ir_variable;
} data;
 
/**
@@ -805,6 +816,8 @@ public:
ir_constant *constant_initializer;
 
 private:
+   static const char *const warn_extension_table[];
+
/**
 * For variables that are in an interface block or are an instance of an
 * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.
@@ -812,11 +825,6 @@ private:
 * \sa ir_variable::location
 */
const glsl_type *interface_type;
-
-   /**
-* Emit a warning if this variable is accessed.
-*/
-   const char *warn_extension;
 };
 
 /**
diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
index 4b444d4..86c0e31 100644
--- a/src/glsl/ir_clone.cpp
+++ b/src/glsl/ir_clone.cpp
@@ -53,8 +53,6 @@ ir_variable::clone(void *mem_ctx, struct hash_table *ht) const
 
memcpy(var-data, this-data, sizeof(var-data));
 
-   var-warn_extension = this-warn_extension;
-
var-num_state_slots = this-num_state_slots;
if (this-state_slots) {
   /* FINISHME: This really wants to use something like talloc_reference, 
but
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 06/26] glsl: Use bit-flags image attributes and uint16_t for the image format

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

All of the GL image enums fit in 16-bits.

Also move the fields from the anonymous image structucture to the next
higher structure.  This will enable packing the bits with the other
bitfield.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 76 40,572,916,873   68,831,248   63,328,783 
5,502,4650
After  (32-bit): 70 40,577,421,777   68,487,584   62,973,695 
5,513,8890

Before (64-bit): 60 36,822,640,058   96,526,824   88,735,296 
7,791,5280
After  (64-bit): 74 37,124,603,758   95,891,808   88,466,712 
7,425,0960

A real savings of 346KiB on 32-bit and 262KiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_function.cpp  | 10 +-
 src/glsl/ast_to_hir.cpp| 14 +++---
 src/glsl/builtin_functions.cpp | 10 +-
 src/glsl/ir.cpp| 20 ++--
 src/glsl/ir.h  | 27 +--
 src/glsl/link_uniforms.cpp |  4 ++--
 6 files changed, 42 insertions(+), 43 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 4981fe1..dbf4e66 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -103,35 +103,35 @@ verify_image_parameter(YYLTYPE *loc, 
_mesa_glsl_parse_state *state,
 *  qualifiers. [...] It is legal to have additional qualifiers
 *  on a formal parameter, but not to have fewer.
 */
-   if (actual-data.image.coherent  !formal-data.image.coherent) {
+   if (actual-data.image_coherent  !formal-data.image_coherent) {
   _mesa_glsl_error(loc, state,
function call parameter `%s' drops 
`coherent' qualifier, formal-name);
   return false;
}
 
-   if (actual-data.image._volatile  !formal-data.image._volatile) {
+   if (actual-data.image_volatile  !formal-data.image_volatile) {
   _mesa_glsl_error(loc, state,
function call parameter `%s' drops 
`volatile' qualifier, formal-name);
   return false;
}
 
-   if (actual-data.image.restrict_flag  !formal-data.image.restrict_flag) {
+   if (actual-data.image_restrict  !formal-data.image_restrict) {
   _mesa_glsl_error(loc, state,
function call parameter `%s' drops 
`restrict' qualifier, formal-name);
   return false;
}
 
-   if (actual-data.image.read_only  !formal-data.image.read_only) {
+   if (actual-data.image_read_only  !formal-data.image_read_only) {
   _mesa_glsl_error(loc, state,
function call parameter `%s' drops 
`readonly' qualifier, formal-name);
   return false;
}
 
-   if (actual-data.image.write_only  !formal-data.image.write_only) {
+   if (actual-data.image_write_only  !formal-data.image_write_only) {
   _mesa_glsl_error(loc, state,
function call parameter `%s' drops 
`writeonly' qualifier, formal-name);
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 885bee5..542a869 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -2360,11 +2360,11 @@ apply_image_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
   global variables);
   }
 
-  var-data.image.read_only |= qual-flags.q.read_only;
-  var-data.image.write_only |= qual-flags.q.write_only;
-  var-data.image.coherent |= qual-flags.q.coherent;
-  var-data.image._volatile |= qual-flags.q._volatile;
-  var-data.image.restrict_flag |= qual-flags.q.restrict_flag;
+  var-data.image_read_only |= qual-flags.q.read_only;
+  var-data.image_write_only |= qual-flags.q.write_only;
+  var-data.image_coherent |= qual-flags.q.coherent;
+  var-data.image_volatile |= qual-flags.q._volatile;
+  var-data.image_restrict |= qual-flags.q.restrict_flag;
   var-data.read_only = true;
 
   if (qual-flags.q.explicit_image_format) {
@@ -2378,7 +2378,7 @@ apply_image_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
  base data type of the image);
  }
 
- var-data.image.format = qual-image_format;
+ var-data.image_format = qual-image_format;
   } else {
  if (var-data.mode == ir_var_uniform  !qual-flags.q.write_only) {
 _mesa_glsl_error(loc, state, uniforms not qualified with 
@@ -2386,7 +2386,7 @@ apply_image_qualifier_to_variable(const struct 
ast_type_qualifier *qual,
  qualifier);
  }
 
- var-data.image.format = GL_NONE;
+ var-data.image_format = GL_NONE;
   }
}
 }
diff --git a/src/glsl/builtin_functions.cpp 

[Mesa-dev] [PATCH 25/26] glsl: Keep common variable names in static storage

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 52 40,521,071,734   66,157,928   61,054,870 
5,103,0580
After  (32-bit): 79 40,523,892,028   65,999,288   60,937,396 
5,061,8920

Before (64-bit): 48 37,089,379,412   92,630,712   85,454,539 
7,176,1730
After  (64-bit): 71 37,091,183,117   92,433,072   85,309,100 
7,123,9720

A real savings of 114KiB on 32-bit and 142KiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/.gitignore   |   1 +
 src/glsl/Makefile.am  |  10 ++
 src/glsl/common_variable_names.py | 220 ++
 src/glsl/ir.cpp   |  32 --
 src/glsl/ir.h |   9 +-
 5 files changed, 256 insertions(+), 16 deletions(-)
 create mode 100644 src/glsl/common_variable_names.py

diff --git a/src/glsl/.gitignore b/src/glsl/.gitignore
index 43720f6..16e7e81 100644
--- a/src/glsl/.gitignore
+++ b/src/glsl/.gitignore
@@ -1,3 +1,4 @@
+get_static_name.h
 glsl_compiler
 glsl_lexer.cpp
 glsl_parser.cpp
diff --git a/src/glsl/Makefile.am b/src/glsl/Makefile.am
index 00261fd..c9de9f8 100644
--- a/src/glsl/Makefile.am
+++ b/src/glsl/Makefile.am
@@ -149,6 +149,15 @@ glsl_test_SOURCES = \
 
 glsl_test_LDADD = libglsl.la
 
+# I don't understand why automake recognizes that glsl_parser_extras.lo
+# depends on glsl_parser.h, but it cannot recognize that ir.lo depends on
+# get_static_name.h.
+
+ir.lo: get_static_name.h
+
+get_static_name.h: common_variable_names.py
+   $(PYTHON2) $(GLSL_SRCDIR)/common_variable_names.py  
$(GLSL_BUILDDIR)/get_static_name.h
+
 # We write our own rules for yacc and lex below. We'd rather use automake,
 # but automake makes it especially difficult for a number of reasons:
 #
@@ -205,6 +214,7 @@ BUILT_SOURCES = 
\
glcpp/glcpp-parse.c \
glcpp/glcpp-lex.c
 CLEANFILES =   \
+   get_static_name.h   \
glcpp/glcpp-parse.h \
glsl_parser.h   \
$(BUILT_SOURCES)
diff --git a/src/glsl/common_variable_names.py 
b/src/glsl/common_variable_names.py
new file mode 100644
index 000..7435a12
--- /dev/null
+++ b/src/glsl/common_variable_names.py
@@ -0,0 +1,220 @@
+# Copyright (c) 2014 Intel Corporation
+#
+# Permission is hereby granted, free of charge, to any person obtaining a
+# copy of this software and associated documentation files (the Software),
+# to deal in the Software without restriction, including without limitation
+# the rights to use, copy, modify, merge, publish, distribute, sublicense,
+# and/or sell copies of the Software, and to permit persons to whom the
+# Software is furnished to do so, subject to the following conditions:
+#
+# The above copyright notice and this permission notice (including the next
+# paragraph) shall be included in all copies or substantial portions of the
+# Software.
+#
+# THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+# DEALINGS IN THE SOFTWARE.
+
+common_names = [
+atomic_temp_var,
+color,
+diffuse,
+factor,
+fucxadd,
+fucxmul,
+gl_ClipVertex,
+gl_Color,
+gl_FogFragCoord,
+gl_FragColor,
+gl_FragData,
+gl_FragCoord,
+gl_FrontColor,
+gl_FrontFacing,
+gl_FrontSecondaryColor,
+gl_in_TexCoord0,
+gl_ModelViewProjectionMatrixTranspose,
+gl_MultiTexCoord0,
+gl_out_FragData0,
+gl_out_TexCoord0,
+gl_Position,
+gl_TexCoord,
+gl_Vertex,
+io_vLink0,
+oT0,
+oT1,
+oT2,
+oT3,
+oT4,
+oT5,
+oT6,
+oT7,
+pos,
+r10,
+r11,
+r12,
+sampler0,
+sampler1,
+sampler2,
+sampler3,
+sampler4,
+sampler5,
+sampler6,
+sampler7,
+sampler8,
+va_r,
+vcbones,
+vcscreen,
+vLit,
+vTempPos,
+v_vColor,
+v_vNormal,
+v_vPositionView,
+v_vTexcoordLight
+]
+
+template = static const char static_names[] =
+   compiler_temp\\0
+% for name in common_names:
+   ${name}\\0
+% endfor
+   ;
+
+% for s in sized_lists:
+% if len(sized_lists[s]) != 1:
+static const ${index_type} table_${s}[] = {
+% for name in sized_lists[s]:
+   ${location_dict[name]},
+ 

[Mesa-dev] [PATCH 00/26] GLSL memory diet

2014-07-14 Thread Ian Romanick
Most of these patches have been sent to the list already in one form or
another.  There are a few changes, removals, and additions.  The series
has also been re-ordered.

- The extra memory accounting code has been removed.  This was suggested
  by Ken.  Instead, all memory usage data is from Valgrind massiv.

- The store short names where padding would be code is still there,
  but many of the patches to shorten the names of temporaries has been
  removed.  Those patches became unnecessary because...

- In release builds (when MESA_GLSL is also unset), all temporaries get
  the name compiler_temp that is in static storage.

- A small set of names that frequently occur in shader-db are kept in
  static storage.  When a variable is created with one of these names,
  no additional storage is allocated.  We could probably give local
  variables and post-linking globals the same treatment as temporaries,
  but a lot of the frequently occuring names (that don't fit in padding)
  are uniforms or shader inputs.  These need to keep their names.

I suspect the last six patches will be contentious.  I'd like to get the
first 20 (or at least the first two!) reviewed and landed sooner... then
we can debate the last few more leisurely.

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 04/26] glsl: Eliminate ir_variable::data.atomic.buffer_index

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Just use ir_variable::data.binding... because that's the where the
binding is stored for everything else that can use layout(binding=).

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 50 40,564,927,443   69,185,408   63,683,871 
5,501,5370
After  (32-bit): 74 40,580,119,657   69,186,544   63,506,327 
5,680,2170

Before (64-bit): 59 36,822,048,449   96,526,888   89,113,000 
7,413,8880
After  (64-bit): 89 36,822,971,897   96,526,616   88,735,296 
7,791,3200

A real savings of 173KiB on 32-bit and 368KiB on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/ir.cpp| 2 +-
 src/glsl/ir.h  | 3 +--
 src/glsl/link_atomics.cpp  | 4 +++-
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 2 +-
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 2 +-
 5 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 28fd94b..4376a5c 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1543,6 +1543,7 @@ ir_variable::ir_variable(const struct glsl_type *type, 
const char *name,
this-data.has_initializer = false;
this-data.location = -1;
this-data.location_frac = 0;
+   this-data.binding = 0;
this-warn_extension = NULL;
this-constant_value = NULL;
this-constant_initializer = NULL;
@@ -1558,7 +1559,6 @@ ir_variable::ir_variable(const struct glsl_type *type, 
const char *name,
this-data.mode = mode;
this-data.interpolation = INTERP_QUALIFIER_NONE;
this-data.max_array_access = 0;
-   this-data.atomic.buffer_index = 0;
this-data.atomic.offset = 0;
this-data.image.read_only = false;
this-data.image.write_only = false;
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index ea19924..9dbcb7a 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -722,7 +722,7 @@ public:
   int index;
 
   /**
-   * Initial binding point for a sampler or UBO.
+   * Initial binding point for a sampler, atomic, or UBO.
*
* For array types, this represents the binding point for the first 
element.
*/
@@ -732,7 +732,6 @@ public:
* Location an atomic counter is stored at.
*/
   struct {
- unsigned buffer_index;
  unsigned offset;
   } atomic;
 
diff --git a/src/glsl/link_atomics.cpp b/src/glsl/link_atomics.cpp
index 75699fd..603873a 100644
--- a/src/glsl/link_atomics.cpp
+++ b/src/glsl/link_atomics.cpp
@@ -201,7 +201,9 @@ link_assign_atomic_counter_resources(struct gl_context *ctx,
  gl_uniform_storage *const storage = prog-UniformStorage[id];
 
  mab.Uniforms[j] = id;
- var-data.atomic.buffer_index = i;
+ if (!var-data.explicit_binding)
+var-data.binding = i;
+
  storage-atomic_buffer_index = i;
  storage-offset = var-data.atomic.offset;
  storage-array_stride = (var-type-is_array() ?
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 90bf3fa..4511f46 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -2447,7 +2447,7 @@ fs_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
   ir-actual_parameters.get_head());
ir_variable *location = deref-variable_referenced();
unsigned surf_index = (prog_data-base.binding_table.abo_start +
-  location-data.atomic.buffer_index);
+  location-data.binding);
 
/* Calculate the surface offset */
fs_reg offset(this, glsl_type::uint_type);
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index e16e93c..481b885 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -2244,7 +2244,7 @@ vec4_visitor::visit_atomic_counter_intrinsic(ir_call *ir)
   ir-actual_parameters.get_head());
ir_variable *location = deref-variable_referenced();
unsigned surf_index = (prog_data-base.binding_table.abo_start +
-  location-data.atomic.buffer_index);
+  location-data.binding);
 
/* Calculate the surface offset */
src_reg offset(this, glsl_type::uint_type);
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 20/26] i965/fs: Don't make a name for a vector splitting temporary

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

If the name is just going to get dropped, don't bother making it.  If
the name is made, release it sooner (rather than later).

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
index 422d801..03e5fdb 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_vector_splitting.cpp
@@ -363,12 +363,17 @@ brw_do_vector_splitting(exec_list *instructions)
   entry-mem_ctx = ralloc_parent(entry-var);
 
   for (unsigned int i = 0; i  entry-var-type-vector_elements; i++) {
-const char *name = ralloc_asprintf(mem_ctx, %s_%c,
-   entry-var-name,
-   xyzw[i]);
+ char *const name = ir_variable::temporaries_allocate_names
+? ralloc_asprintf(mem_ctx, %s_%c,
+  entry-var-name,
+  xyzw[i])
+: NULL;
 
 entry-components[i] = new(entry-mem_ctx) ir_variable(type, name,

ir_var_temporary);
+
+ ralloc_free(name);
+
 entry-var-insert_before(entry-components[i]);
   }
 
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 23/26] glsl: Make compiler-added padding ir_instruction usable

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

No change Valgrind massif results for a trimmed apitrace of dota2.

v2: Fix a couple spelling errors in the comment.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.h | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 5c565ff..ead0863 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -91,6 +91,20 @@ private:
uint8_t ir_type;
 
 public:
+   /**
+* Alignment padding that would be added by the compiler
+*
+* Putting a field here makes what would otherwise be dead space usable.
+* Subclasses of ir_instruction can store data here.  Care must be taken
+* for two reasons:
+*
+* 1. Direct descendants in the class hierarchy (e.g., \c ir_dereference
+*and \c ir_dereference_array) must not try to use this space.
+*
+* 2. The size of the padding depends on the architecture.
+*/
+   uint8_t padding[sizeof(intptr_t) - 1];
+
inline enum ir_node_type get_ir_type() const
{
   STATIC_ASSERT(ir_type_max  256);
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 10/26] glsl: Make ir_variable::max_ifc_array_access private

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The payoff for this will come in a few more patches.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_array_index.cpp | 10 --
 src/glsl/ir.h| 37 -
 src/glsl/ir_validate.cpp |  9 +++--
 src/glsl/link_functions.cpp  | 14 +++---
 src/glsl/linker.cpp  |  5 +++--
 5 files changed, 53 insertions(+), 22 deletions(-)

diff --git a/src/glsl/ast_array_index.cpp b/src/glsl/ast_array_index.cpp
index f3b060e..ecc4200 100644
--- a/src/glsl/ast_array_index.cpp
+++ b/src/glsl/ast_array_index.cpp
@@ -88,8 +88,14 @@ update_max_array_access(ir_rvalue *ir, unsigned idx, YYLTYPE 
*loc,
 unsigned field_index =
deref_record-record-type-field_index(deref_record-field);
 assert(field_index  interface_type-length);
-if (idx  deref_var-var-max_ifc_array_access[field_index]) {
-   deref_var-var-max_ifc_array_access[field_index] = idx;
+
+unsigned *const max_ifc_array_access =
+   deref_var-var-get_max_ifc_array_access();
+
+assert(max_ifc_array_access != NULL);
+
+if (idx  max_ifc_array_access[field_index]) {
+   max_ifc_array_access[field_index] = idx;
 
/* Check whether this access will, as a side effect, implicitly
 * cause the size of a built-in array to be too large.
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index d33d153..1924098 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -527,6 +527,17 @@ public:
}
 
/**
+* Get the max_ifc_array_access pointer
+*
+* A set function is not needed because the array is dynmically allocated
+* as necessary.
+*/
+   inline unsigned *get_max_ifc_array_access()
+   {
+  return this-max_ifc_array_access;
+   }
+
+   /**
 * Enable emitting extension warnings for this variable
 */
void enable_extension_warning(const char *extension);
@@ -548,19 +559,6 @@ public:
 */
const char *name;
 
-   /**
-* For variables which satisfy the is_interface_instance() predicate, this
-* points to an array of integers such that if the ith member of the
-* interface block is an array, max_ifc_array_access[i] is the maximum
-* array element of that member that has been accessed.  If the ith member
-* of the interface block is not an array, max_ifc_array_access[i] is
-* unused.
-*
-* For variables whose type is not an interface block, this pointer is
-* NULL.
-*/
-   unsigned *max_ifc_array_access;
-
struct ir_variable_data {
 
   /**
@@ -818,6 +816,19 @@ private:
static const char *const warn_extension_table[];
 
/**
+* For variables which satisfy the is_interface_instance() predicate, this
+* points to an array of integers such that if the ith member of the
+* interface block is an array, max_ifc_array_access[i] is the maximum
+* array element of that member that has been accessed.  If the ith member
+* of the interface block is not an array, max_ifc_array_access[i] is
+* unused.
+*
+* For variables whose type is not an interface block, this pointer is
+* NULL.
+*/
+   unsigned *max_ifc_array_access;
+
+   /**
 * For variables that are in an interface block or are an instance of an
 * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.
 *
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index ef660e8..b7afd08 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -678,10 +678,15 @@ ir_validate::visit(ir_variable *ir)
  ir-get_interface_type()-fields.structure;
   for (unsigned i = 0; i  ir-get_interface_type()-length; i++) {
  if (fields[i].type-array_size()  0) {
-if (ir-max_ifc_array_access[i] = fields[i].type-length) {
+const unsigned *const max_ifc_array_access =
+   ir-get_max_ifc_array_access();
+
+assert(max_ifc_array_access != NULL);
+
+if (max_ifc_array_access[i] = fields[i].type-length) {
printf(ir_variable has maximum access out of bounds for 
   field %s (%d vs %d)\n, fields[i].name,
-  ir-max_ifc_array_access[i], fields[i].type-length);
+  max_ifc_array_access[i], fields[i].type-length);
ir-print();
abort();
 }
diff --git a/src/glsl/link_functions.cpp b/src/glsl/link_functions.cpp
index 2ce9609..c06ffa4 100644
--- a/src/glsl/link_functions.cpp
+++ b/src/glsl/link_functions.cpp
@@ -243,11 +243,19 @@ public:
/* Similarly, we need implicit sizes of arrays within interface
 * blocks to be sized by the maximal access in *any* shader.
 */
+   unsigned *const 

[Mesa-dev] [PATCH 03/26] glsl: Eliminate unused built-in variables after compilation

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

After compilation (and before linking) we can eliminate quite a few
built-in variables.  Basically, any uniform or constant (e.g.,
gl_MaxVertexTextureImageUnits) that isn't used (with one exception) can
be eliminated.  System values, vertex shader inputs (with one
exception), and fragment shader outputs that are not used and not
re-declared in the shader text can also be removed.

gl_ModelViewProjectMatrix and gl_Vertex are used by the built-in
function ftransform.  There are some complications with eliminating
these variables (see the comment in the patch), so they are not
eliminated.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 46 40,661,487,174   75,116,800   68,854,065 
6,262,7350
After  (32-bit): 50 40,564,927,443   69,185,408   63,683,871 
5,501,5370

Before (64-bit): 64 37,200,329,700  104,872,672   96,514,546 
8,358,1260
After  (64-bit): 59 36,822,048,449   96,526,888   89,113,000 
7,413,8880

A real savings of 4.9MiB on 32-bit and 7.0MiB on 64-bit.

v2: Don't remove any built-in with Transpose in the name.

v3: Fix comment typo noticed by Anuj.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Suggested-by: Eric Anholt e...@anholt.net
Acked-by: Anuj Phogat anuj.pho...@gmail.com
Cc: Eric Anholt e...@anholt.net
---
 src/glsl/Makefile.sources   |  1 +
 src/glsl/glsl_parser_extras.cpp | 20 
 src/glsl/ir_optimization.h  |  2 +
 src/glsl/opt_dead_builtin_variables.cpp | 81 +
 4 files changed, 104 insertions(+)
 create mode 100644 src/glsl/opt_dead_builtin_variables.cpp

diff --git a/src/glsl/Makefile.sources b/src/glsl/Makefile.sources
index b54eae7..5915bf8 100644
--- a/src/glsl/Makefile.sources
+++ b/src/glsl/Makefile.sources
@@ -87,6 +87,7 @@ LIBGLSL_FILES = \
$(GLSL_SRCDIR)/opt_copy_propagation.cpp \
$(GLSL_SRCDIR)/opt_copy_propagation_elements.cpp \
$(GLSL_SRCDIR)/opt_cse.cpp \
+   $(GLSL_SRCDIR)/opt_dead_builtin_variables.cpp \
$(GLSL_SRCDIR)/opt_dead_builtin_varyings.cpp \
$(GLSL_SRCDIR)/opt_dead_code.cpp \
$(GLSL_SRCDIR)/opt_dead_code_local.cpp \
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 2962407..93f5e75 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1480,6 +1480,26 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
  ;
 
   validate_ir_tree(shader-ir);
+
+  enum ir_variable_mode other;
+  switch (shader-Stage) {
+  case MESA_SHADER_VERTEX:
+ other = ir_var_shader_in;
+ break;
+  case MESA_SHADER_FRAGMENT:
+ other = ir_var_shader_out;
+ break;
+  default:
+ /* Something invalid to ensure optimize_dead_builtin_uniforms
+  * doesn't remove anything other than uniforms or constants.
+  */
+ other = ir_var_mode_count;
+ break;
+  }
+
+  optimize_dead_builtin_variables(shader-ir, other);
+
+  validate_ir_tree(shader-ir);
}
 
if (shader-InfoLog)
diff --git a/src/glsl/ir_optimization.h b/src/glsl/ir_optimization.h
index b83c225..57e04c7 100644
--- a/src/glsl/ir_optimization.h
+++ b/src/glsl/ir_optimization.h
@@ -124,6 +124,8 @@ void lower_named_interface_blocks(void *mem_ctx, gl_shader 
*shader);
 bool optimize_redundant_jumps(exec_list *instructions);
 bool optimize_split_arrays(exec_list *instructions, bool linked);
 bool lower_offset_arrays(exec_list *instructions);
+void optimize_dead_builtin_variables(exec_list *instructions,
+ enum ir_variable_mode other);
 
 ir_rvalue *
 compare_index_block(exec_list *instructions, ir_variable *index,
diff --git a/src/glsl/opt_dead_builtin_variables.cpp 
b/src/glsl/opt_dead_builtin_variables.cpp
new file mode 100644
index 000..85c75d6
--- /dev/null
+++ b/src/glsl/opt_dead_builtin_variables.cpp
@@ -0,0 +1,81 @@
+/*
+ * Copyright © 2014 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the Software),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE 

[Mesa-dev] [PATCH 26/26] glsl: Add Bloom filter to get_static_name

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The use of the Bloom filter rejects the vast majority of strings that
are not in the table before expensive comparisons are performed.  This
Bloom filter uses two hashes.  One is explicit (calculate_hash in the
code), and the other is implicit.  The implicit hash is just the length
of the name, and the filter is the switch-statement in
get_static_name_do_not_call that rejects name with lengths that are not
in the table.

No change Valgrind massif results for a trimmed apitrace of dota2.

NOTE: This patch could probably get squashed into the previous patch.  I
kept it separate because I thought it would make things easier to
review.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/common_variable_names.py | 113 +-
 1 file changed, 112 insertions(+), 1 deletion(-)

diff --git a/src/glsl/common_variable_names.py 
b/src/glsl/common_variable_names.py
index 7435a12..ee5571c 100644
--- a/src/glsl/common_variable_names.py
+++ b/src/glsl/common_variable_names.py
@@ -119,6 +119,46 @@ name_really_is_not_in_static_names(const char *name)
 }
 #endif /* DEBUG */
 
+static uint32_t
+calculate_hash(const char *str)
+{
+uint32_t hash = 5381;
+
+while (*str != '\\0') {
+hash = (hash * 33) + *str;
+str++;
+}
+
+return hash;
+}
+
+/**
+ * Check the Bloom filter for the name
+ */
+static bool
+name_is_in_Bloom_filter(const char *name)
+{
+   static const uint32_t filter[${len(filter)}] = {
+% for i in range(0,len(filter), 4):
+  ${{:#010x}.format(filter[i+0])}, ${{:#010x}.format(filter[i+1])}, 
${{:#010x}.format(filter[i+2])}, ${{:#010x}.format(filter[i+3])},
+% endfor
+   };
+
+   const uint32_t h = calculate_hash(name);
+
+   if (h  ${min(all_hash)})
+  return false;
+
+   if (h  ${max(all_hash)})
+  return false;
+
+   const unsigned bit = (h - ${min(all_hash)}) % ${len(filter) * 32};
+   const unsigned i = bit / 32;
+   const unsigned j = bit % 32;
+
+   return (filter[i]  (1U  j)) != 0;
+}
+
 /**
  * Search the static name table for the specified name
  *
@@ -129,6 +169,11 @@ name_really_is_not_in_static_names(const char *name)
 static const char *
 get_static_name_do_not_call(const char *name)
 {
+   /* Do the trivial rejection test before anything.
+*/
+   if (!name_is_in_Bloom_filter(name))
+  return NULL;
+
const unsigned len = strlen(name);
const ${index_type} *table = NULL;
unsigned table_size = 0;
@@ -188,6 +233,14 @@ get_static_name(const char *name)
 }
 
 
+def calculate_hash(str):
+hash = 5381
+
+for c in str:
+hash = ((hash * 33) + ord(c))  0x0
+
+return hash
+
 common_names.sort()
 
 # Partiation the list of names into sublists of names with the same length
@@ -213,8 +266,66 @@ elif base  (1  16):
 else:
index_type = uint32_t
 
+all_hash = []
+for name in common_names:
+   all_hash.append(calculate_hash(name))
+
+# Experimentally determined that 8,192 bits is sufficient for this dataset.
+# This was determined by:
+#
+# 1. Modify the generated code to log a message on entry to get_static_name.
+#
+# 2. Modify the generated code to log a message when name_is_in_Bloom_filter
+# returns true.
+#
+# 3. Modifiy the generated code to log a message each time a name is rejected
+# due to its length.  This is the 'default' case in the switch-statement.
+# This is an implicit hash that is part of the Bloom filter.
+#
+# 4. Modify the generated code to log a message each time a name has actually
+# been tested (using strcmp) and was not in the table.  This means logging at
+# the end of get_static_name_do_not_call and inside the switch-statement where
+# lonely names are handled.
+#
+# 5. Run a ton of shaders through (e.g., shader-db) and collect the output.
+# Count the number of times each message occurred.
+#
+# Two hash functions were tried.  The current one and Murmur2.  Exhaustive
+# testing over shader-db produced the following results.  There were a total
+# of 6,749,837 calls to get_static_name in each run.  The # in the column
+# header refers messages mentioned in the list above.
+#
+#Bits  Current hash   Murmur2
+##2 / #3 /  #4   #2 / #3 / #4
+# 128   5249223 / 262597 / 4826172   763090 / 285531 / 317105
+# 256   4955982 / 162087 / 4633441   508512 / 152491 / 195567
+# 512334736 / 52 /   63130   381691 /  98277 / 122960
+#1024236521 /  43809 /   32258   326657 /  69346 /  96857
+#2048174275 /   1533 /   12288   258885 /  49537 /  48894
+#4096171153 /341 /   10358   185632 /682 /  24496
+#8192161649 /264 / 931   163035 /142 /   2439
+#   16384160782 /  0 / 328   162764 / 15 /   2295
+#
+# Past 512 bits, the current hash was clearly superior to Murmur2.  8,192 bits
+# seems to be a sweet spot of a very small table size 

[Mesa-dev] [PATCH 14/26] glsl: Add the possibility for ir_variable to have a non-ralloced name

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Specifically, ir_var_temporary variables constructed with a NULL name
will all have the name compiler_temp in static storage.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp  | 20 +++-
 src/glsl/ir.h| 10 ++
 src/glsl/ir_validate.cpp |  2 +-
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index 40dbb6b..0045c2d 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1533,12 +1533,30 @@ ir_swizzle::variable_referenced() const
 }
 
 
+const char ir_variable::tmp_name[] = compiler_temp;
+
 ir_variable::ir_variable(const struct glsl_type *type, const char *name,
 ir_variable_mode mode)
: ir_instruction(ir_type_variable)
 {
this-type = type;
-   this-name = ralloc_strdup(this, name);
+
+   /* The ir_variable clone method may call this constructor with name set to
+* tmp_name.
+*/
+   assert(name != NULL
+  || mode == ir_var_temporary
+  || mode == ir_var_function_in
+  || mode == ir_var_function_out
+  || mode == ir_var_function_inout);
+   assert(name != ir_variable::tmp_name
+  || mode == ir_var_temporary);
+   if (mode == ir_var_temporary
+(name == NULL || name == ir_variable::tmp_name)) {
+  this-name = ir_variable::tmp_name;
+   } else {
+  this-name = ralloc_strdup(this, name);
+   }
 
this-u.max_ifc_array_access = NULL;
 
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 1071858..1aeb1ff 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -575,6 +575,11 @@ public:
   return this-u.state_slots;
}
 
+   inline bool is_name_ralloced() const
+   {
+  return this-name != ir_variable::tmp_name;
+   }
+
/**
 * Enable emitting extension warnings for this variable
 */
@@ -881,6 +886,11 @@ private:
 * \sa ir_variable::location
 */
const glsl_type *interface_type;
+
+   /**
+* Name used for anonymous compiler temporaries
+*/
+   static const char tmp_name[];
 };
 
 /**
diff --git a/src/glsl/ir_validate.cpp b/src/glsl/ir_validate.cpp
index 12b8874..23f55fc 100644
--- a/src/glsl/ir_validate.cpp
+++ b/src/glsl/ir_validate.cpp
@@ -650,7 +650,7 @@ ir_validate::visit(ir_variable *ir)
 * in the ir_dereference_variable handler to ensure that a variable is
 * declared before it is dereferenced.
 */
-   if (ir-name)
+   if (ir-name  ir-is_name_ralloced())
   assert(ralloc_parent(ir-name) == ir);
 
hash_table_insert(ht, ir, ir);
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 21/26] glsl: Make ir_instruction::ir_type private

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

In the next patch, the type of ir_type is going to change from enum to
uint8_t.  Since the type won't be an enum, we won't get compiler
warnings about, for example, switch statements that don't have cases for
all the enum values.  Using a getter that returns the enum type will
enable us to continue getting those warnings.

Also, ir_type should never be changed after an object is created.
Having it set in the constructor and no setter effectively makes it
write-once.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_function.cpp  | 2 +-
 src/glsl/ast_to_hir.cpp| 2 +-
 src/glsl/glsl_parser_extras.cpp| 2 +-
 src/glsl/ir.h  | 8 +++-
 src/glsl/ir_constant_expression.cpp| 4 ++--
 src/glsl/ir_print_visitor.cpp  | 2 +-
 src/glsl/ir_validate.cpp   | 6 +++---
 src/glsl/loop_analysis.cpp | 2 +-
 src/glsl/loop_controls.cpp | 2 +-
 src/glsl/loop_unroll.cpp   | 2 +-
 src/glsl/lower_clip_distance.cpp   | 4 ++--
 src/glsl/lower_if_to_cond_assign.cpp   | 4 ++--
 src/glsl/lower_jumps.cpp   | 6 +++---
 src/glsl/lower_offset_array.cpp| 2 +-
 src/glsl/lower_ubo_reference.cpp   | 4 ++--
 src/glsl/lower_vector.cpp  | 4 ++--
 src/glsl/lower_vector_insert.cpp   | 2 +-
 src/glsl/opt_constant_folding.cpp  | 2 +-
 src/glsl/opt_cse.cpp   | 2 +-
 src/glsl/opt_dead_builtin_variables.cpp| 2 +-
 src/glsl/opt_rebalance_tree.cpp| 4 ++--
 src/glsl/opt_redundant_jumps.cpp   | 6 +++---
 src/glsl/opt_structure_splitting.cpp   | 2 +-
 src/glsl/opt_vectorize.cpp | 2 +-
 src/mesa/program/ir_to_mesa.cpp| 2 +-
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 2 +-
 26 files changed, 44 insertions(+), 38 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index 2402e79..92c5bf1 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -171,7 +171,7 @@ verify_parameter_modes(_mesa_glsl_parse_state *state,
 
   /* Verify that 'const_in' parameters are ir_constants. */
   if (formal-data.mode == ir_var_const_in 
- actual-ir_type != ir_type_constant) {
+ actual-get_ir_type() != ir_type_constant) {
 _mesa_glsl_error(loc, state,
  parameter `in %s' must be a constant expression,
  formal-name);
diff --git a/src/glsl/ast_to_hir.cpp b/src/glsl/ast_to_hir.cpp
index 0a4b3a6..97b2033 100644
--- a/src/glsl/ast_to_hir.cpp
+++ b/src/glsl/ast_to_hir.cpp
@@ -767,7 +767,7 @@ do_assignment(exec_list *instructions, struct 
_mesa_glsl_parse_state *state,
/* If the assignment LHS comes back as an ir_binop_vector_extract
 * expression, move it to the RHS as an ir_triop_vector_insert.
 */
-   if (lhs-ir_type == ir_type_expression) {
+   if (lhs-get_ir_type() == ir_type_expression) {
   ir_expression *const lhs_expr = lhs-as_expression();
 
   if (unlikely(lhs_expr-operation == ir_binop_vector_extract)) {
diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index 1fe38f3..ebfc031 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -1532,7 +1532,7 @@ _mesa_glsl_compile_shader(struct gl_context *ctx, struct 
gl_shader *shader,
 * are fly-weights that are looked up by glsl_type.
 */
foreach_in_list (ir_instruction, ir, shader-ir) {
-  switch (ir-ir_type) {
+  switch (ir-get_ir_type()) {
   case ir_type_function: {
  /* If the function is a built-in that is partially overridden in the
   * shader, the ir_function stored in the symbol table may not be the
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 42ccb00..ee42857 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -87,9 +87,15 @@ enum ir_node_type {
  * Base class of all IR instructions
  */
 class ir_instruction : public exec_node {
-public:
+private:
enum ir_node_type ir_type;
 
+public:
+   inline enum ir_node_type get_ir_type() const
+   {
+  return this-ir_type;
+   }
+
/**
 * GCC 4.7+ and clang warn when deleting an ir_instruction unless
 * there's a virtual destructor present.  Because we almost
diff --git a/src/glsl/ir_constant_expression.cpp 
b/src/glsl/ir_constant_expression.cpp
index 5570ed4..e277314 100644
--- a/src/glsl/ir_constant_expression.cpp
+++ b/src/glsl/ir_constant_expression.cpp
@@ -402,7 +402,7 @@ constant_referenced(const ir_dereference *deref,
if (variable_context == NULL)
   return false;
 
-   switch (deref-ir_type) {
+   switch (deref-get_ir_type()) {
case ir_type_dereference_array: {
   const ir_dereference_array *const da =
  (const ir_dereference_array *) deref;
@@ 

[Mesa-dev] [PATCH 24/26] glsl: Store short variable names inside ir_variable

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Most of the overhead of the name allocation is the ralloc tracking,
especially on 64-bit.  The allocation of the variable name i is 2
bytes for the name and 40 bytes for the ralloc tracking.

I collected data from shader-db.  At the end of link_shaders I added a
visitor to print the name of every ir_variable in every shader.  Out of
the 188,786 names logged, 47,636 were two characters or less (the size
that will fit in the padding on 32-bit builds) and 89,610 were six
characters or less (the size that will fit in the padding on 64-bit
builds).

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 56 40,527,975,617   66,347,992   61,187,818 
5,160,1740
After  (32-bit): 52 40,521,071,734   66,157,928   61,054,870 
5,103,0580

Before (64-bit): 62 37,089,989,777   93,205,752   85,900,367 
7,305,3850
After  (64-bit): 48 37,089,379,412   92,630,712   85,454,539 
7,176,1730

A real savings of 129KiB on 32-bit and 435KiB on 64-bit.

The savings is much higher on 64-bit because ir_instruction::padding is
7 bytes (instead of 3 bytes on 32-bit).  This allows a lot more names to
fit in the padding.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ir.cpp | 5 +
 src/glsl/ir.h   | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/src/glsl/ir.cpp b/src/glsl/ir.cpp
index bc008a4..4b22439 100644
--- a/src/glsl/ir.cpp
+++ b/src/glsl/ir.cpp
@@ -1559,6 +1559,11 @@ ir_variable::ir_variable(const struct glsl_type *type, 
const char *name,
if (mode == ir_var_temporary
 (name == NULL || name == ir_variable::tmp_name)) {
   this-name = ir_variable::tmp_name;
+   } else if (name == NULL) {
+  this-padding[0] = 0;
+  this-name = (char *) this-padding;
+   } else if (strlen(name)  sizeof(this-padding)) {
+  this-name = strcpy((char *) this-padding, name);
} else {
   this-name = ralloc_strdup(this, name);
}
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index ead0863..770fe60 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -598,7 +598,8 @@ public:
 
inline bool is_name_ralloced() const
{
-  return this-name != ir_variable::tmp_name;
+  return this-name != ir_variable::tmp_name
+  this-name != (char *) this-padding;
}
 
/**
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 11/26] glsl: Make ir_variable::num_state_slots and ir_variable::state_slots private

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

Also move num_state_slots inside ir_variable_data for better packing.

The payoff for this will come in a few more patches.

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/builtin_variables.cpp |  5 +--
 src/glsl/ir.h  | 61 +++---
 src/glsl/ir_clone.cpp  | 13 ++
 src/glsl/ir_validate.cpp   |  2 +-
 src/glsl/linker.cpp|  7 +--
 src/mesa/drivers/dri/i965/brw_fs.cpp   |  6 +--
 src/mesa/drivers/dri/i965/brw_shader.cpp   |  6 +--
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp |  6 +--
 src/mesa/program/ir_to_mesa.cpp| 14 +++---
 src/mesa/state_tracker/st_glsl_to_tgsi.cpp | 14 +++---
 10 files changed, 78 insertions(+), 56 deletions(-)

diff --git a/src/glsl/builtin_variables.cpp b/src/glsl/builtin_variables.cpp
index 6fc3319..9e31fb8 100644
--- a/src/glsl/builtin_variables.cpp
+++ b/src/glsl/builtin_variables.cpp
@@ -478,12 +478,9 @@ builtin_variable_generator::add_uniform(const glsl_type 
*type,
   _mesa_builtin_uniform_desc[i];
 
const unsigned array_count = type-is_array() ? type-length : 1;
-   uni-num_state_slots = array_count * statevar-num_elements;
 
ir_state_slot *slots =
-  ralloc_array(uni, ir_state_slot, uni-num_state_slots);
-
-   uni-state_slots = slots;
+  uni-allocate_state_slots(array_count * statevar-num_elements);
 
for (unsigned a = 0; a  array_count; a++) {
   for (unsigned j = 0; j  statevar-num_elements; j++) {
diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 1924098..c47a1fb 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -537,6 +537,37 @@ public:
   return this-max_ifc_array_access;
}
 
+   inline unsigned get_num_state_slots() const
+   {
+  return this-data._num_state_slots;
+   }
+
+   inline void set_num_state_slots(unsigned n)
+   {
+  this-data._num_state_slots = n;
+   }
+
+   inline ir_state_slot *get_state_slots()
+   {
+  return this-state_slots;
+   }
+
+   inline const ir_state_slot *get_state_slots() const
+   {
+  return this-state_slots;
+   }
+
+   inline ir_state_slot *allocate_state_slots(unsigned n)
+   {
+  this-state_slots = ralloc_array(this, ir_state_slot, n);
+  this-data._num_state_slots = 0;
+
+  if (this-state_slots != NULL)
+ this-data._num_state_slots = n;
+
+  return this-state_slots;
+   }
+
/**
 * Enable emitting extension warnings for this variable
 */
@@ -729,6 +760,10 @@ public:
   /** Image internal format if specified explicitly, otherwise GL_NONE. */
   uint16_t image_format;
 
+   private:
+  unsigned _num_state_slots;/** Number of state slots used */
+
+   public:
   /**
* Storage location of the base of this variable
*
@@ -782,22 +817,6 @@ public:
} data;
 
/**
-* Built-in state that backs this uniform
-*
-* Once set at variable creation, \c state_slots must remain invariant.
-* This is because, ideally, this array would be shared by all clones of
-* this variable in the IR tree.  In other words, we'd really like for it
-* to be a fly-weight.
-*
-* If the variable is not a uniform, \c num_state_slots will be zero and
-* \c state_slots will be \c NULL.
-*/
-   /*@{*/
-   unsigned num_state_slots;/** Number of state slots used */
-   ir_state_slot *state_slots;  /** State descriptors. */
-   /*@}*/
-
-   /**
 * Value assigned in the initializer of a variable declared const
 */
ir_constant *constant_value;
@@ -829,6 +848,16 @@ private:
unsigned *max_ifc_array_access;
 
/**
+* Built-in state that backs this uniform
+*
+* Once set at variable creation, \c state_slots must remain invariant.
+*
+* If the variable is not a uniform, \c _num_state_slots will be zero and
+* \c state_slots will be \c NULL.
+*/
+   ir_state_slot *state_slots;
+
+   /**
 * For variables that are in an interface block or are an instance of an
 * interface block, this is the \c GLSL_TYPE_INTERFACE type for that block.
 *
diff --git a/src/glsl/ir_clone.cpp b/src/glsl/ir_clone.cpp
index 86c0e31..64019fe 100644
--- a/src/glsl/ir_clone.cpp
+++ b/src/glsl/ir_clone.cpp
@@ -53,15 +53,10 @@ ir_variable::clone(void *mem_ctx, struct hash_table *ht) 
const
 
memcpy(var-data, this-data, sizeof(var-data));
 
-   var-num_state_slots = this-num_state_slots;
-   if (this-state_slots) {
-  /* FINISHME: This really wants to use something like talloc_reference, 
but
-   * FINISHME: ralloc doesn't have any similar function.
-   */
-  var-state_slots = ralloc_array(var, ir_state_slot,
- this-num_state_slots);
-  memcpy(var-state_slots, this-state_slots,
-sizeof(this-state_slots[0]) * 

[Mesa-dev] [PATCH 19/26] glsl: Don't make a name for the function return variable

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

If the name is just going to get dropped, don't bother making it.  If
the name is made, release it sooner (rather than later).

No change Valgrind massif results for a trimmed apitrace of dota2.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
---
 src/glsl/ast_function.cpp | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/src/glsl/ast_function.cpp b/src/glsl/ast_function.cpp
index dbf4e66..2402e79 100644
--- a/src/glsl/ast_function.cpp
+++ b/src/glsl/ast_function.cpp
@@ -408,14 +408,17 @@ generate_call(exec_list *instructions, 
ir_function_signature *sig,
ir_dereference_variable *deref = NULL;
if (!sig-return_type-is_void()) {
   /* Create a new temporary to hold the return value. */
+  char *const name = ir_variable::temporaries_allocate_names
+ ? ralloc_asprintf(ctx, %s_retval, sig-function_name())
+ : NULL;
+
   ir_variable *var;
 
-  var = new(ctx) ir_variable(sig-return_type,
-ralloc_asprintf(ctx, %s_retval,
-sig-function_name()),
-ir_var_temporary);
+  var = new(ctx) ir_variable(sig-return_type, name, ir_var_temporary);
   instructions-push_tail(var);
 
+  ralloc_free(name);
+
   deref = new(ctx) ir_dereference_variable(var);
}
ir_call *call = new(ctx) ir_call(sig, deref, actual_parameters);
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 05/26] glsl: Use a single bit for the dual-source blend index

2014-07-14 Thread Ian Romanick
From: Ian Romanick ian.d.roman...@intel.com

The only values allowed are 0 and 1, and the value is checked before
assigning.

Valgrind massif results for a trimmed apitrace of dota2:

  ntime(i) total(B)   useful-heap(B) 
extra-heap(B)stacks(B)
Before (32-bit): 74 40,580,119,657   69,186,544   63,506,327 
5,680,2170
After  (32-bit): 76 40,572,916,873   68,831,248   63,328,783 
5,502,4650

Before (64-bit): 89 36,822,971,897   96,526,616   88,735,296 
7,791,3200
After  (64-bit): 60 36,822,640,058   96,526,824   88,735,296 
7,791,5280

A real savings of 173KiB on 32-bit and no change on 64-bit.

Signed-off-by: Ian Romanick ian.d.roman...@intel.com
Reviewed-by: Kenneth Graunke kenn...@whitecape.org
---
 src/glsl/ir.h | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/src/glsl/ir.h b/src/glsl/ir.h
index 9dbcb7a..147d337 100644
--- a/src/glsl/ir.h
+++ b/src/glsl/ir.h
@@ -684,6 +684,15 @@ public:
   unsigned must_be_shader_input:1;
 
   /**
+   * Output index for dual source blending.
+   *
+   * \note
+   * The GLSL spec only allows the values 0 or 1 for the index in \b dual
+   * source blending.
+   */
+  unsigned index:1;
+
+  /**
* \brief Layout qualifier for gl_FragDepth.
*
* This is not equal to \c ir_depth_layout_none if and only if this
@@ -717,11 +726,6 @@ public:
   unsigned stream;
 
   /**
-   * output index for dual source blending.
-   */
-  int index;
-
-  /**
* Initial binding point for a sampler, atomic, or UBO.
*
* For array types, this represents the binding point for the first 
element.
-- 
1.8.1.4

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] glsl: add a mechanism to allow #extension directives in the middle of shaders

2014-07-14 Thread Ian Romanick
Looks good to me.

Reviewed-by: Ian Romanick ian.d.roman...@intel.com

On 07/08/2014 11:29 AM, Marek Olšák wrote:
 From: Marek Olšák marek.ol...@amd.com
 
 This is needed to make Unigine Heaven 4.0 and Unigine Valley 1.0 work
 with sample shading.
 
 Also, if this is disabled, the error message at least makes sense now.
 ---
  src/glsl/glsl_parser.yy | 8 
  src/glsl/glsl_parser_extras.cpp | 2 ++
  src/glsl/glsl_parser_extras.h   | 2 ++
  src/mesa/main/mtypes.h  | 5 +
  4 files changed, 17 insertions(+)
 
 diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
 index b989749..4c87163 100644
 --- a/src/glsl/glsl_parser.yy
 +++ b/src/glsl/glsl_parser.yy
 @@ -376,6 +376,14 @@ external_declaration_list:
if ($2 != NULL)
   state-translation_unit.push_tail( $2-link);
 }
 +   | external_declaration_list extension_statement {
 +  if (!state-allow_extension_directive_midshader) {
 + _mesa_glsl_error( @2, state,
 +  #extension directive is not allowed 
 +  in the middle of a shader);
 + YYERROR;
 +  }
 +   }
 ;
  
  variable_identifier:
 diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
 index b327c2b..890123a 100644
 --- a/src/glsl/glsl_parser_extras.cpp
 +++ b/src/glsl/glsl_parser_extras.cpp
 @@ -210,6 +210,8 @@ _mesa_glsl_parse_state::_mesa_glsl_parse_state(struct 
 gl_context *_ctx,
 this-early_fragment_tests = false;
 memset(this-atomic_counter_offsets, 0,
sizeof(this-atomic_counter_offsets));
 +   this-allow_extension_directive_midshader =
 +  ctx-Const.AllowGLSLExtensionDirectiveMidShader;
  }
  
  /**
 diff --git a/src/glsl/glsl_parser_extras.h b/src/glsl/glsl_parser_extras.h
 index 1791816..f720d6a 100644
 --- a/src/glsl/glsl_parser_extras.h
 +++ b/src/glsl/glsl_parser_extras.h
 @@ -490,6 +490,8 @@ struct _mesa_glsl_parse_state {
  
 /** Atomic counter offsets by binding */
 unsigned atomic_counter_offsets[MAX_COMBINED_ATOMIC_BUFFERS];
 +
 +   bool allow_extension_directive_midshader;
  };
  
  # define YYLLOC_DEFAULT(Current, Rhs, N) \
 diff --git a/src/mesa/main/mtypes.h b/src/mesa/main/mtypes.h
 index a7126fd..b42a175 100644
 --- a/src/mesa/main/mtypes.h
 +++ b/src/mesa/main/mtypes.h
 @@ -3349,6 +3349,11 @@ struct gl_constants
 GLuint ForceGLSLVersion;
  
 /**
 +* Allow GLSL #extension directives in the middle of shaders.
 +*/
 +   GLboolean AllowGLSLExtensionDirectiveMidShader;
 +
 +   /**
  * Does the driver support real 32-bit integers?  (Otherwise, integers are
  * simulated via floats.)
  */
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH] configure: Don't override user -g or -O options for debug builds

2014-07-14 Thread Ian Romanick
On 07/14/2014 01:41 PM, Matt Turner wrote:
 On Mon, Jul 14, 2014 at 11:49 AM, Ian Romanick i...@freedesktop.org wrote:
 From: Ian Romanick ian.d.roman...@intel.com

 I already pass '-ggdb3 -O1' or '-ggdb3 -Og' for CFLAGS, and I don't want
 configure to change them for me.

 Signed-off-by: Ian Romanick ian.d.roman...@intel.com
 ---
  configure.ac | 14 --
  1 file changed, 12 insertions(+), 2 deletions(-)

 diff --git a/configure.ac b/configure.ac
 index 4646212..04abcc0 100644
 --- a/configure.ac
 +++ b/configure.ac
 @@ -308,10 +308,20 @@ AC_ARG_ENABLE([debug],
  if test x$enable_debug = xyes; then
  DEFINES=$DEFINES -DDEBUG
  if test x$GCC = xyes; then
 -CFLAGS=$CFLAGS -g -O0
 +if ! echo $CFLAGS | grep -q -e '-g'; then
 +CFLAGS=$CFLAGS -g
 +fi
 +if ! echo $CFLAGS | grep -q -e '-O'; then
 +CFLAGS=$CFLAGS -O0
 +fi
  fi
  if test x$GXX = xyes; then
 -CXXFLAGS=$CXXFLAGS -g -O0
 +if ! echo $CXXFLAGS | grep -q -e '-g'; then
 +CXXFLAGS=$CXXFLAGS -g
 +fi
 +if ! echo $CXXFLAGS | grep -q -e '-O'; then
 +CXXFLAGS=$CXXFLAGS -O0
 +fi
  fi
  fi

 --
 1.8.1.4
 
 This looks like strictly an improvement, so my feedback shouldn't hold
 this up, but maybe it's time to fight about whether --enable-debug
 should add -O0 at all. :)
 
 Reviewed-by: Matt Turner matts...@gmail.com
 
 Autotools defaults are -g -O2. Some people want debug builds to use
 -g, others -ggdb*. Some people want -O2 (optimize, but with assertions
 enabled), others -O0, others -Og. It'd be nice to not have
 --enable-debug turn on one combination of these.

I thought about that when I wrote this patch, but I decided against it.
 Principle of least surprise:  --enable-debug should enable debugging.
Ages ago, Mesa's build system only added -g in dri-debug builds (yay for
the static Makefiles).  If you forgot to change it (or wrap the build
with custom scripts), you would often be disappointed when trying to gdb
Mesa bugs.  New developers, that may not yet have custom scripts, will
have this same issue.

I think we should enable experienced developers to do what they want,
and make things easier for new developers.

That said, if the previaling opinion is to just remove it, I'm okay with
that too...

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 81174] Gallium: GL_LINE_LOOP broken with more than 512 points

2014-07-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=81174

--- Comment #7 from Roland Scheidegger srol...@vmware.com ---
Hmm interesting. I thought more than 512 vertices would fit before it uses a
new buffer (the buffer size is 64kB though I don't know what the vertex size
ends up with). But if the bug is due to that I guess indeed changing the prim
type (probably in vbo_exec_wrap_buffers()) and fixing up the copied vertices
stuff (if it doesn't do the right thing already) would do the trick. But indeed
closing the loop finally would be tricky (well I guess would need to keep the
vertex around when doing the buffer wrap, then can simply emit it as another
vertex last). I have no idea of the effects this has on things like line
stippling but I guess it couldn't be worse than it is now...
Also, things like GL_POLYGON with fill mode line probably has the same problem
and can't be fixed that way really.
Note that the draw module has its own decomposition of primitives into smaller
chunks too which could also be buggy (vsplit stuff).

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 81174] Gallium: GL_LINE_LOOP broken with more than 512 points

2014-07-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=81174

--- Comment #8 from Marek Olšák mar...@gmail.com ---
Line loops aren't so critical, since they are rarely used. But what about
triangle strips... every split kills 2 triangles...

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 81174] Gallium: GL_LINE_LOOP broken with more than 512 points

2014-07-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=81174

--- Comment #9 from Roland Scheidegger srol...@vmware.com ---
(In reply to comment #8)
 Line loops aren't so critical, since they are rarely used. But what about
 triangle strips... every split kills 2 triangles...

That shouldn't happen, the code is supposed to copy over the necessary two
tris.

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH] glapi: add indexed blend functions (GL 4.0)

2014-07-14 Thread Tapani Pälli
This makes some of the UE4 engine demos (Stylized, Mobile Temple)
render correctly, tested on Intel Haswell machine.

Signed-off-by: Tapani Pälli tapani.pa...@intel.com
---
 src/mapi/glapi/gen/GL4x.xml | 26 ++
 src/mesa/main/tests/dispatch_sanity.cpp | 10 +-
 2 files changed, 31 insertions(+), 5 deletions(-)

diff --git a/src/mapi/glapi/gen/GL4x.xml b/src/mapi/glapi/gen/GL4x.xml
index 8efef0b..848316e 100644
--- a/src/mapi/glapi/gen/GL4x.xml
+++ b/src/mapi/glapi/gen/GL4x.xml
@@ -12,6 +12,32 @@
   function name=MinSampleShading offset=assign
 param name=value type=GLfloat/
   /function
+
+  function name=BlendFunci static_dispatch=false alias=BlendFunciARB
+param name=buf type=GLuint/
+param name=sfactor type=GLenum/
+param name=dfactor type=GLenum/
+  /function
+
+  function name=BlendFuncSeparatei static_dispatch=false 
alias=BlendFuncSeparateiARB
+param name=buf type=GLuint/
+param name=sfactorRGB type=GLenum/
+param name=dfactorRGB type=GLenum/
+param name=sfactorAlpha type=GLenum/
+param name=dfactorAlpha type=GLenum/
+  /function
+
+  function name=BlendEquationi static_dispatch=false 
alias=BlendEquationiARB
+param name=buf type=GLuint/
+param name=mode type=GLenum/
+  /function
+
+  function name=BlendEquationSeparatei static_dispatch=false 
alias=BlendEquationSeparateiARB 
+param name=buf type=GLuint/
+param name=modeRGB type=GLenum/
+param name=modeA type=GLenum/
+  /function
+
 /category
 
 category name=4.3
diff --git a/src/mesa/main/tests/dispatch_sanity.cpp 
b/src/mesa/main/tests/dispatch_sanity.cpp
index 0e57653..1a2c4d0 100644
--- a/src/mesa/main/tests/dispatch_sanity.cpp
+++ b/src/mesa/main/tests/dispatch_sanity.cpp
@@ -542,11 +542,11 @@ const struct function gl_core_functions_possible[] = {
{ glVertexAttribDivisor, 33, -1 },
 
/* GL 4.0 */
-   { glMinSampleShading, 40, -1 },// XXX: Add to xml
-// { glBlendEquationi, 40, -1 },  // XXX: Add to xml
-// { glBlendEquationSeparatei, 40, -1 },  // XXX: Add to xml
-// { glBlendFunci, 40, -1 },  // XXX: Add to xml
-// { glBlendFuncSeparatei, 40, -1 },  // XXX: Add to xml
+   { glMinSampleShading, 40, -1 },
+   { glBlendEquationi, 40, -1 },
+   { glBlendEquationSeparatei, 40, -1 },
+   { glBlendFunci, 40, -1 },
+   { glBlendFuncSeparatei, 40, -1 },
 
/* GL 4.3 */
{ glIsRenderbuffer, 43, -1 },
-- 
1.9.3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] Release-candidate branch for upcoming 10.2.4

2014-07-14 Thread Carl Worth
Hi folks,

I've pushed out an update to the 10.2 branch and I need some specific
testing in the next three days.

I've tested the branch on Intel (Haswell) as well as both swrast and
Gallium softpipe and found no piglit regressions compared to the 10.2.3
release.

The branch includes a few patches to nouveau and radeonsi which I have
not been able to test. If someone will test one of these drivers with
piglit and let me know that all looks good, I'll be happy to include the
patches in the release. Otherwise, I'll drop any untested patches before
making the final release on Friday.

Also, there's still time in the next three days for someone to nominate
further driver-specific changes. I'll just need positive piglit test
results for any such patches, (on top of the branch as it stands now),
before I'll accept them.

Thanks, all.

-Carl

-- 
carl.d.wo...@intel.com


pgpn1miAvjfae.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [Bug 81174] Gallium: GL_LINE_LOOP broken with more than 512 points

2014-07-14 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=81174

--- Comment #10 from Florian Link florianl...@gmail.com ---
What I don't understand yet is why the effect is not stable (if my example is
repainted several times, the wrap appears at different positions over time).

-- 
You are receiving this mail because:
You are the assignee for the bug.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev