[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2014-02-27 Thread Petri Latvala
v2: Don't add function parameters, pass the required size in
prog_data-nr_params.

v3:
- Use the name uniform_array_size instead of uniform_param_count.
- Round up when dividing param_count by 4.
- Use MAX2() instead of taking the maximum by hand.
- Don't crash if prog_data passed to vec4_visitor constructor is NULL

v4: Rebase for current master

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254

Signed-off-by: Petri Latvala petri.latv...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_gs.c|  5 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++
 src/mesa/drivers/dri/i965/brw_vs.c |  8 
 4 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index fb5c0a6..d97c103 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -387,8 +387,9 @@ public:
 */
dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int uniform_size[MAX_UNIFORMS];
-   int uniform_vector_size[MAX_UNIFORMS];
+   int *uniform_size;
+   int *uniform_vector_size;
+   int uniform_array_size; /* Size of uniform_[vector_]size arrays */
int uniforms;
 
src_reg shader_start_time;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index 85fb979..8a9625b 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -68,6 +68,11 @@ do_gs_prog(struct brw_context *brw,
   rzalloc_array(NULL, const float *, param_count);
c.prog_data.base.base.pull_param =
   rzalloc_array(NULL, const float *, param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   c.prog_data.base.base.nr_params = ALIGN(param_count, 4) / 4 + 
gs-num_samplers;
 
if (gp-program.OutputType == GL_POINTS) {
   /* When the output type is points, the geometry shader may output data
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index aedab93..cc92a8a 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3401,6 +3401,17 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
 
this-uniforms = 0;
+
+   /* Initialize uniform_array_size to at least 1 because pre-gen6 VS requires
+* at least one. See setup_uniforms() in brw_vec4.cpp.
+*/
+   this-uniform_array_size = 1;
+   if (prog_data) {
+  this-uniform_array_size = MAX2(stage_prog_data-nr_params, 1);
+   }
+
+   this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_array_size);
+   this-uniform_vector_size = rzalloc_array(mem_ctx, int, 
this-uniform_array_size);
 }
 
 vec4_visitor::~vec4_visitor()
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index d3dbc8e..ef8dcf1 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -235,6 +235,14 @@ do_vs_prog(struct brw_context *brw,
 
stage_prog_data-param = rzalloc_array(NULL, const float *, param_count);
stage_prog_data-pull_param = rzalloc_array(NULL, const float *, 
param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   stage_prog_data-nr_params = ALIGN(param_count, 4) / 4;
+   if (vs) {
+  stage_prog_data-nr_params += vs-num_samplers;
+   }
 
GLbitfield64 outputs_written = vp-program.Base.OutputsWritten;
prog_data.inputs_read = vp-program.Base.InputsRead;
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-12-30 Thread Petri Latvala

On 12/20/2013 09:26 PM, Kenneth Graunke wrote:

On 11/27/2013 05:28 AM, Petri Latvala wrote:

v2: Don't add function parameters, pass the required size in
prog_data-nr_params.

Signed-off-by: Petri Latvala petri.latv...@intel.com
---
  src/mesa/drivers/dri/i965/brw_vec4.h   | 5 +++--
  src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 +
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++
  src/mesa/drivers/dri/i965/brw_vs.c | 8 
  4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 5cec9f9..5f5f5cd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -325,8 +325,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int uniform_size[MAX_UNIFORMS];
-   int uniform_vector_size[MAX_UNIFORMS];
+   int *uniform_size;
+   int *uniform_vector_size;
+   int uniform_param_count; /* Size of uniform_[vector_]size arrays */

I'm not crazy about this variable name.  Between the params arrays,
uniform_* arrays, nr_params count, and uniforms count...we already have
a lot of distinct things that sound alike.

How about:

int uniform_array_size; /** Size of uniform_[vector_]size arrays. */

That seems clearer to me.  Especially seeing that the value here is
really the size of the array, which is an overestimate/upper bound on
the number of uniforms, not the actual number of elements in the
uniforms or params arrays.


Sounds good. Changing the name.



 int uniforms;
  
 src_reg shader_start_time;

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index 018b0b6..7cf9bac 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw,
  
 c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count);

 c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers;

Hmm.  You're counting the number of vec4s, but...don't samplers take up
a single entry, since they're just integers?  This seems odd to me.


No, that value is the number of float-size components. param_count, 
calculated above that code, multiplies the number of components by 4 to 
have room elsewhere for scalar params that eat up a vec4. I'm just 
dividing the number back.



You might also consider doing ALIGN(param_count, 4) / 4 so that you
round up rather than truncating on the division.


Ok, changing that. Here and in vs.


I also would really like to keep nr_params in consistent units, i.e.
always uniform float-size components or always vec4s.


I would really like that too, but unfortunately the inconsistency was 
already present.


The best option would be to have another variable for uniform memory 
use, but I didn't want to make too invasive changes. nr_param gets used 
here for uniform float-size counts, and later on for something else. 
This results in some overuse of memory when uniforms are smaller than vec4s.



 if (gp-program.OutputType == GL_POINTS) {
/* When the output type is points, the geometry shader may output data
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index a13eafb..b9226dc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
   fail_msg(NULL),
   first_non_payload_grf(0),
   need_all_constants_in_pull_buffer(false),
+ /* Initialize uniform_param_count to at least 1 because gen6 VS requires 
at
+  * least one. See setup_uniforms() in brw_vec4.cpp.
+  */

I think you mean Gen4-5 requires at least one push constant, not gen6
VS.  At least, that's what setup_uniforms() is doing.


Argh, yes. Fixing that comment.


+ uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1),

I think this would be clearer as:

MAX2(prog_data-nr_params, 1)


Yes, changing.


--
Petri Latvala

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-12-30 Thread Petri Latvala

On 12/20/2013 08:54 PM, Ian Romanick wrote:

This patch breaks the test_vec4_register_coalesce unit test.  Did you
run 'make check'?




I thought I did, but turns out I didn't. Just ran piglit tests.

Fix coming up.


--
Petri Latvala

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-12-30 Thread Petri Latvala
v2: Don't add function parameters, pass the required size in
prog_data-nr_params.

v3:
- Use the name uniform_array_size instead of uniform_param_count.
- Round up when dividing param_count by 4.
- Use MAX2() instead of taking the maximum by hand.
- Don't crash if prog_data passed to vec4_visitor constructor is NULL

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254

Signed-off-by: Petri Latvala petri.latv...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_gs.c|  5 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++
 src/mesa/drivers/dri/i965/brw_vs.c |  8 
 4 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index d4029d8..798d8bd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -325,8 +325,9 @@ public:
 */
dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int uniform_size[MAX_UNIFORMS];
-   int uniform_vector_size[MAX_UNIFORMS];
+   int *uniform_size;
+   int *uniform_vector_size;
+   int uniform_array_size; /* Size of uniform_[vector_]size arrays */
int uniforms;
 
src_reg shader_start_time;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index 018b0b6..c749eb1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw,
 
c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   c.prog_data.base.nr_params = ALIGN(param_count, 4) / 4 + gs-num_samplers;
 
if (gp-program.OutputType == GL_POINTS) {
   /* When the output type is points, the geometry shader may output data
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 3b8cef6..a93fdc5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3321,6 +3321,17 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
 
this-uniforms = 0;
+
+   /* Initialize uniform_array_size to at least 1 because pre-gen6 VS requires
+* at least one. See setup_uniforms() in brw_vec4.cpp.
+*/
+   this-uniform_array_size = 1;
+   if (prog_data) {
+  this-uniform_array_size = MAX2(prog_data-nr_params, 1);
+   }
+
+   this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_array_size);
+   this-uniform_vector_size = rzalloc_array(mem_ctx, int, 
this-uniform_array_size);
 }
 
 vec4_visitor::~vec4_visitor()
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index 5d50c3c..609a575 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw,
 
prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   prog_data.base.nr_params = ALIGN(param_count, 4) / 4;
+   if (vs) {
+  prog_data.base.nr_params += vs-num_samplers;
+   }
 
GLbitfield64 outputs_written = vp-program.Base.OutputsWritten;
prog_data.inputs_read = vp-program.Base.InputsRead;
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-12-20 Thread Ian Romanick
This patch breaks the test_vec4_register_coalesce unit test.  Did you
run 'make check'?

On 11/27/2013 05:28 AM, Petri Latvala wrote:
 v2: Don't add function parameters, pass the required size in
 prog_data-nr_params.
 
 Signed-off-by: Petri Latvala petri.latv...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_vec4.h   | 5 +++--
  src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 +
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++
  src/mesa/drivers/dri/i965/brw_vs.c | 8 
  4 files changed, 23 insertions(+), 2 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 5cec9f9..5f5f5cd 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -325,8 +325,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
 -   int uniform_size[MAX_UNIFORMS];
 -   int uniform_vector_size[MAX_UNIFORMS];
 +   int *uniform_size;
 +   int *uniform_vector_size;
 +   int uniform_param_count; /* Size of uniform_[vector_]size arrays */
 int uniforms;
  
 src_reg shader_start_time;
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
 b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 index 018b0b6..7cf9bac 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw,
  
 c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
 c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
 param_count);
 +   /* Setting nr_params here NOT to the size of the param and pull_param
 +* arrays, but to the number of uniform components vec4_visitor
 +* needs. vec4_visitor::setup_uniforms() will set it back to a proper 
 value.
 +*/
 +   c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers;
  
 if (gp-program.OutputType == GL_POINTS) {
/* When the output type is points, the geometry shader may output data
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 index a13eafb..b9226dc 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 @@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
   fail_msg(NULL),
   first_non_payload_grf(0),
   need_all_constants_in_pull_buffer(false),
 + /* Initialize uniform_param_count to at least 1 because gen6 VS 
 requires at
 +  * least one. See setup_uniforms() in brw_vec4.cpp.
 +  */
 + uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1),
   debug_flag(debug_flag),
   no_spills(no_spills)
  {
 @@ -3290,6 +3294,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
 this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
  
 this-uniforms = 0;
 +
 +   this-uniform_size = rzalloc_array(mem_ctx, int, 
 this-uniform_param_count);
 +   this-uniform_vector_size = rzalloc_array(mem_ctx, int, 
 this-uniform_param_count);
  }
  
  vec4_visitor::~vec4_visitor()
 diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
 b/src/mesa/drivers/dri/i965/brw_vs.c
 index b5c8b63..8d0933d 100644
 --- a/src/mesa/drivers/dri/i965/brw_vs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vs.c
 @@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw,
  
 prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
 prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
 param_count);
 +   /* Setting nr_params here NOT to the size of the param and pull_param
 +* arrays, but to the number of uniform components vec4_visitor
 +* needs. vec4_visitor::setup_uniforms() will set it back to a proper 
 value.
 +*/
 +   prog_data.base.nr_params = param_count / 4;
 +   if (vs) {
 +  prog_data.base.nr_params += vs-num_samplers;
 +   }
  
 GLbitfield64 outputs_written = vp-program.Base.OutputsWritten;
 prog_data.inputs_read = vp-program.Base.InputsRead;
 

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-12-20 Thread Kenneth Graunke
On 11/27/2013 05:28 AM, Petri Latvala wrote:
 v2: Don't add function parameters, pass the required size in
 prog_data-nr_params.
 
 Signed-off-by: Petri Latvala petri.latv...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_vec4.h   | 5 +++--
  src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 +
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++
  src/mesa/drivers/dri/i965/brw_vs.c | 8 
  4 files changed, 23 insertions(+), 2 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 5cec9f9..5f5f5cd 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -325,8 +325,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
 -   int uniform_size[MAX_UNIFORMS];
 -   int uniform_vector_size[MAX_UNIFORMS];
 +   int *uniform_size;
 +   int *uniform_vector_size;
 +   int uniform_param_count; /* Size of uniform_[vector_]size arrays */

I'm not crazy about this variable name.  Between the params arrays,
uniform_* arrays, nr_params count, and uniforms count...we already have
a lot of distinct things that sound alike.

How about:

int uniform_array_size; /** Size of uniform_[vector_]size arrays. */

That seems clearer to me.  Especially seeing that the value here is
really the size of the array, which is an overestimate/upper bound on
the number of uniforms, not the actual number of elements in the
uniforms or params arrays.

 int uniforms;
  
 src_reg shader_start_time;
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
 b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 index 018b0b6..7cf9bac 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw,
  
 c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
 c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
 param_count);
 +   /* Setting nr_params here NOT to the size of the param and pull_param
 +* arrays, but to the number of uniform components vec4_visitor
 +* needs. vec4_visitor::setup_uniforms() will set it back to a proper 
 value.
 +*/
 +   c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers;

Hmm.  You're counting the number of vec4s, but...don't samplers take up
a single entry, since they're just integers?  This seems odd to me.

You might also consider doing ALIGN(param_count, 4) / 4 so that you
round up rather than truncating on the division.

I also would really like to keep nr_params in consistent units, i.e.
always uniform float-size components or always vec4s.

 if (gp-program.OutputType == GL_POINTS) {
/* When the output type is points, the geometry shader may output data
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 index a13eafb..b9226dc 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
 @@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
   fail_msg(NULL),
   first_non_payload_grf(0),
   need_all_constants_in_pull_buffer(false),
 + /* Initialize uniform_param_count to at least 1 because gen6 VS 
 requires at
 +  * least one. See setup_uniforms() in brw_vec4.cpp.
 +  */

I think you mean Gen4-5 requires at least one push constant, not gen6
VS.  At least, that's what setup_uniforms() is doing.

 + uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1),

I think this would be clearer as:

   MAX2(prog_data-nr_params, 1)

   debug_flag(debug_flag),
   no_spills(no_spills)
  {
 @@ -3290,6 +3294,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
 this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
  
 this-uniforms = 0;
 +
 +   this-uniform_size = rzalloc_array(mem_ctx, int, 
 this-uniform_param_count);
 +   this-uniform_vector_size = rzalloc_array(mem_ctx, int, 
 this-uniform_param_count);
  }
  
  vec4_visitor::~vec4_visitor()
 diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
 b/src/mesa/drivers/dri/i965/brw_vs.c
 index b5c8b63..8d0933d 100644
 --- a/src/mesa/drivers/dri/i965/brw_vs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vs.c
 @@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw,
  
 prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
 prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
 param_count);
 +   /* Setting nr_params here NOT to the size of the param and pull_param
 +* arrays, but to the number of uniform components vec4_visitor
 +* needs. vec4_visitor::setup_uniforms() will set it back to a proper 
 value.
 +*/
 +   prog_data.base.nr_params = param_count / 4;
 +   if (vs) {
 +  prog_data.base.nr_params += vs-num_samplers;
 +   }
  
 GLbitfield64 outputs_written 

[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-27 Thread Petri Latvala
v2: Don't add function parameters, pass the required size in
prog_data-nr_params.

Signed-off-by: Petri Latvala petri.latv...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.h   | 5 +++--
 src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 +
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++
 src/mesa/drivers/dri/i965/brw_vs.c | 8 
 4 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 5cec9f9..5f5f5cd 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -325,8 +325,9 @@ public:
 */
dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int uniform_size[MAX_UNIFORMS];
-   int uniform_vector_size[MAX_UNIFORMS];
+   int *uniform_size;
+   int *uniform_vector_size;
+   int uniform_param_count; /* Size of uniform_[vector_]size arrays */
int uniforms;
 
src_reg shader_start_time;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index 018b0b6..7cf9bac 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw,
 
c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, 
param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers;
 
if (gp-program.OutputType == GL_POINTS) {
   /* When the output type is points, the geometry shader may output data
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index a13eafb..b9226dc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
  fail_msg(NULL),
  first_non_payload_grf(0),
  need_all_constants_in_pull_buffer(false),
+ /* Initialize uniform_param_count to at least 1 because gen6 VS requires 
at
+  * least one. See setup_uniforms() in brw_vec4.cpp.
+  */
+ uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1),
  debug_flag(debug_flag),
  no_spills(no_spills)
 {
@@ -3290,6 +3294,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw,
this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF;
 
this-uniforms = 0;
+
+   this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_param_count);
+   this-uniform_vector_size = rzalloc_array(mem_ctx, int, 
this-uniform_param_count);
 }
 
 vec4_visitor::~vec4_visitor()
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c 
b/src/mesa/drivers/dri/i965/brw_vs.c
index b5c8b63..8d0933d 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw,
 
prog_data.base.param = rzalloc_array(NULL, const float *, param_count);
prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count);
+   /* Setting nr_params here NOT to the size of the param and pull_param
+* arrays, but to the number of uniform components vec4_visitor
+* needs. vec4_visitor::setup_uniforms() will set it back to a proper value.
+*/
+   prog_data.base.nr_params = param_count / 4;
+   if (vs) {
+  prog_data.base.nr_params += vs-num_samplers;
+   }
 
GLbitfield64 outputs_written = vp-program.Base.OutputsWritten;
prog_data.inputs_read = vp-program.Base.InputsRead;
-- 
1.8.4.rc3

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-25 Thread Petri Latvala

On 11/22/2013 08:05 PM, Francisco Jerez wrote:

Petri Latvalapetri.latv...@intel.com  writes:


[...]
@@ -325,8 +326,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int uniform_size[MAX_UNIFORMS];
-   int uniform_vector_size[MAX_UNIFORMS];
+   unsigned uniform_param_count;
+   int* uniform_size;
+   int* uniform_vector_size;
 int uniforms;


All the code around you uses the 'type *identifier' convention, please
keep it consistent.


*slap on forehead* note to self, disable muscle memory. Fixing that.


 src_reg shader_start_time;
[...]
@@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw,
 if (likely(!(INTEL_DEBUG  DEBUG_NO_DUAL_OBJECT_GS))) {
c-prog_data.dual_instanced_dispatch = false;

-  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */);
+  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, 
param_count);

Another possibility would be to set 'c.prog_data.base.nr_params =
param_count' here and in brw_vs_emit(), so you'd have the same value
available from the constructor of vec4_visitor without all the parameter
changes.


When trying this change, I began to wonder if I'm using the right values 
at all. What is that nr_params supposed to be for code elsewhere? 
vec4_visitor::setup_uniforms() sets it to this-uniforms * 4, but 
param_count that I passed from do_vs_prog() to brw_vs_emit() is 
num_uniform_components * 4, yielding (close to) this-uniforms * 4 * 4. 
Close to means that my test program gives param_count being 92, and 
this-uniforms * 4 in setup_uniforms() is 24.


Is vec4_visitor::setup_uniforms() supposed to be the authoritative 
source of the uniform counts or is that nr_params assignment just for 
the pre-gen6 hack in the same function? Is nr_params supposed to be the 
number of uniform components or uniforms? To my understanding 
param_count in do_vs_prog() is the number of uniform components required 
to hold all uniforms with a vec4 for each one (float or otherwise), are 
the floats packed to vec4s anywhere?



--
Petri Latvala

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-25 Thread Francisco Jerez
Petri Latvala petri.latv...@intel.com writes:

 On 11/22/2013 08:05 PM, Francisco Jerez wrote:
 Petri Latvalapetri.latv...@intel.com  writes:

 [...]
 @@ -325,8 +326,9 @@ public:
   */
  dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
  const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
 -   int uniform_size[MAX_UNIFORMS];
 -   int uniform_vector_size[MAX_UNIFORMS];
 +   unsigned uniform_param_count;
 +   int* uniform_size;
 +   int* uniform_vector_size;
  int uniforms;

 All the code around you uses the 'type *identifier' convention, please
 keep it consistent.

 *slap on forehead* note to self, disable muscle memory. Fixing that.

  src_reg shader_start_time;
 [...]
 @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw,
  if (likely(!(INTEL_DEBUG  DEBUG_NO_DUAL_OBJECT_GS))) {
 c-prog_data.dual_instanced_dispatch = false;

 -  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills 
 */);
 +  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills 
 */, param_count);
 Another possibility would be to set 'c.prog_data.base.nr_params =
 param_count' here and in brw_vs_emit(), so you'd have the same value
 available from the constructor of vec4_visitor without all the parameter
 changes.

 When trying this change, I began to wonder if I'm using the right values 
 at all. What is that nr_params supposed to be for code elsewhere? 
 vec4_visitor::setup_uniforms() sets it to this-uniforms * 4, but 
 param_count that I passed from do_vs_prog() to brw_vs_emit() is 
 num_uniform_components * 4, yielding (close to) this-uniforms * 4 * 4. 

Oh...  Apparently do_vs_prog() is overallocating memory.  And so is
do_gs_prog()...  That seems like a bug.

 Close to means that my test program gives param_count being 92, and 
 this-uniforms * 4 in setup_uniforms() is 24.

 Is vec4_visitor::setup_uniforms() supposed to be the authoritative 
 source of the uniform counts or is that nr_params assignment just for 
 the pre-gen6 hack in the same function? Is nr_params supposed to be the 
 number of uniform components or uniforms?

It should be the number of uniform components (in units of 32-bit
floats).

 To my understanding param_count in do_vs_prog() is the number of
 uniform components required to hold all uniforms with a vec4 for each
 one (float or otherwise), are the floats packed to vec4s anywhere?

Yes, see vec4_visitor::pack_uniform_registers().


 -- 
 Petri Latvala


pgpH9bUzFzmPZ.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Petri Latvala
Signed-off-by: Petri Latvala petri.latv...@intel.com
---
 src/mesa/drivers/dri/i965/brw_vec4.cpp|  5 +++--
 src/mesa/drivers/dri/i965/brw_vec4.h  | 14 +++---
 src/mesa/drivers/dri/i965/brw_vec4_gs.c   |  2 +-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 12 +++-
 src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |  6 --
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  7 ++-
 src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 --
 src/mesa/drivers/dri/i965/brw_vs.c|  2 +-
 src/mesa/drivers/dri/i965/brw_vs.h|  6 --
 9 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4.cpp
index 73f91a0..3da1b4d 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
@@ -1558,7 +1558,8 @@ brw_vs_emit(struct brw_context *brw,
 struct brw_vs_compile *c,
 struct brw_vs_prog_data *prog_data,
 void *mem_ctx,
-unsigned *final_assembly_size)
+unsigned *final_assembly_size,
+unsigned param_count)
 {
bool start_busy = false;
float start_time = 0;
@@ -1585,7 +1586,7 @@ brw_vs_emit(struct brw_context *brw,
   }
}
 
-   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx);
+   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx, param_count);
if (!v.run()) {
   if (prog) {
  prog-LinkStatus = false;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index 5cec9f9..552ca55 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -231,7 +231,8 @@ public:
struct brw_shader *shader,
void *mem_ctx,
 bool debug_flag,
-bool no_spills);
+bool no_spills,
+unsigned param_count);
~vec4_visitor();
 
dst_reg dst_null_f()
@@ -325,8 +326,9 @@ public:
 */
dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
-   int uniform_size[MAX_UNIFORMS];
-   int uniform_vector_size[MAX_UNIFORMS];
+   unsigned uniform_param_count;
+   int* uniform_size;
+   int* uniform_vector_size;
int uniforms;
 
src_reg shader_start_time;
@@ -546,6 +548,12 @@ private:
 * If true, then register allocation should fail instead of spilling.
 */
const bool no_spills;
+
+   /*
+* Make noncopyable.
+*/
+   vec4_visitor(const vec4_visitor);
+   vec4_visitor operator=(const vec4_visitor);
 };
 
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
index b52d646..b0b77d1 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
@@ -213,7 +213,7 @@ do_gs_prog(struct brw_context *brw,
void *mem_ctx = ralloc_context(NULL);
unsigned program_size;
const unsigned *program =
-  brw_gs_emit(brw, prog, c, mem_ctx, program_size);
+  brw_gs_emit(brw, prog, c, mem_ctx, program_size, param_count);
if (program == NULL) {
   ralloc_free(mem_ctx);
   return false;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
index adbb1cf..16c05f5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
@@ -38,10 +38,11 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw,
  struct gl_shader_program *prog,
  struct brw_shader *shader,
  void *mem_ctx,
- bool no_spills)
+ bool no_spills,
+ unsigned param_count)
: vec4_visitor(brw, c-base, c-gp-program.Base, c-key.base,
   c-prog_data.base, prog, shader, mem_ctx,
-  INTEL_DEBUG  DEBUG_GS, no_spills),
+  INTEL_DEBUG  DEBUG_GS, no_spills, param_count),
  c(c)
 {
 }
@@ -539,7 +540,8 @@ brw_gs_emit(struct brw_context *brw,
 struct gl_shader_program *prog,
 struct brw_gs_compile *c,
 void *mem_ctx,
-unsigned *final_assembly_size)
+unsigned *final_assembly_size,
+unsigned param_count)
 {
struct brw_shader *shader =
   (brw_shader *) prog-_LinkedShaders[MESA_SHADER_GEOMETRY];
@@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw,
if (likely(!(INTEL_DEBUG  DEBUG_NO_DUAL_OBJECT_GS))) {
   c-prog_data.dual_instanced_dispatch = false;
 
-  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */);
+  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, 
param_count);
   if (v.run()) {
  vec4_generator g(brw, prog, 

Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Francisco Jerez
Petri Latvala petri.latv...@intel.com writes:

[...]
 @@ -325,8 +326,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
 -   int uniform_size[MAX_UNIFORMS];
 -   int uniform_vector_size[MAX_UNIFORMS];
 +   unsigned uniform_param_count;
 +   int* uniform_size;
 +   int* uniform_vector_size;
 int uniforms;
  
All the code around you uses the 'type *identifier' convention, please
keep it consistent.

 src_reg shader_start_time;
[...]
 @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw,
 if (likely(!(INTEL_DEBUG  DEBUG_NO_DUAL_OBJECT_GS))) {
c-prog_data.dual_instanced_dispatch = false;
  
 -  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */);
 +  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, 
 param_count);

Another possibility would be to set 'c.prog_data.base.nr_params =
param_count' here and in brw_vs_emit(), so you'd have the same value
available from the constructor of vec4_visitor without all the parameter
changes.

It would be possible to do something similar in the fs_visitor code, but
it seems that it would be slightly more difficult because the fs_visitor
(ab-)uses nr_params to keep track of the most recently allocated uniform
index, you'd need to add a private counter variable to the fs_visitor
similar to what vec4_visitor does.

With the upcoming ARB_shader_image_load_store changes we might get more
than 4 * MAX_UNIFORM uniform allocations in some cases because image
uniforms can take up several slots for the image meta-data -- I think it
would make sense to extend this change to the FS back-end too.

Thanks.

if (v.run()) {
   vec4_generator g(brw, prog, c-gp-program.Base, 
 c-prog_data.base,
mem_ctx, INTEL_DEBUG  DEBUG_GS);
[...]


pgpt3d5HdM9K6.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Ian Romanick
On 11/22/2013 12:09 AM, Petri Latvala wrote:
 Signed-off-by: Petri Latvala petri.latv...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_vec4.cpp|  5 +++--
  src/mesa/drivers/dri/i965/brw_vec4.h  | 14 +++---
  src/mesa/drivers/dri/i965/brw_vec4_gs.c   |  2 +-
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 12 +++-
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |  6 --
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  7 ++-
  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 --
  src/mesa/drivers/dri/i965/brw_vs.c|  2 +-
  src/mesa/drivers/dri/i965/brw_vs.h|  6 --
  9 files changed, 41 insertions(+), 19 deletions(-)
 
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 index 73f91a0..3da1b4d 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 @@ -1558,7 +1558,8 @@ brw_vs_emit(struct brw_context *brw,
  struct brw_vs_compile *c,
  struct brw_vs_prog_data *prog_data,
  void *mem_ctx,
 -unsigned *final_assembly_size)
 +unsigned *final_assembly_size,
 +unsigned param_count)
  {
 bool start_busy = false;
 float start_time = 0;
 @@ -1585,7 +1586,7 @@ brw_vs_emit(struct brw_context *brw,
}
 }
  
 -   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx);
 +   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx, param_count);
 if (!v.run()) {
if (prog) {
   prog-LinkStatus = false;
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 5cec9f9..552ca55 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -231,7 +231,8 @@ public:
   struct brw_shader *shader,
   void *mem_ctx,
  bool debug_flag,
 -bool no_spills);
 +bool no_spills,
 +unsigned param_count);
 ~vec4_visitor();
  
 dst_reg dst_null_f()
 @@ -325,8 +326,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
 -   int uniform_size[MAX_UNIFORMS];
 -   int uniform_vector_size[MAX_UNIFORMS];
 +   unsigned uniform_param_count;
 +   int* uniform_size;
 +   int* uniform_vector_size;
 int uniforms;
  
 src_reg shader_start_time;
 @@ -546,6 +548,12 @@ private:
  * If true, then register allocation should fail instead of spilling.
  */
 const bool no_spills;
 +
 +   /*
 +* Make noncopyable.
 +*/
 +   vec4_visitor(const vec4_visitor);
 +   vec4_visitor operator=(const vec4_visitor);

I think this should be a separate patch with it's own justification.
I'm also not sure it's strictly necessary.  I'm not a C++ expert (and
most people on the project aren't either), so bear with me a bit.  The
usual reason to make a class non-copyable is because the default
copy-constructor will only make a shallow copy of pointer fields.  This
can lead to either double-frees or dereferencing freed memory.  However,
since we're using ralloc, and the first construction of a vec4_visitor
object will out-live any possible copies, neither of these situations
can occur.  Is that a fair assessment?

Now... we probably should do this anyway... and there are a lot of
classes in the i965 back-end where this should occur.  I don't know if
we want to make a macro to generate this boiler-plate code or if we just
want to manually add it to every class.  Perhaps Ken, Paul, or Curro
will have an opinion...

  };
  
  
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c 
 b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 index b52d646..b0b77d1 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c
 @@ -213,7 +213,7 @@ do_gs_prog(struct brw_context *brw,
 void *mem_ctx = ralloc_context(NULL);
 unsigned program_size;
 const unsigned *program =
 -  brw_gs_emit(brw, prog, c, mem_ctx, program_size);
 +  brw_gs_emit(brw, prog, c, mem_ctx, program_size, param_count);
 if (program == NULL) {
ralloc_free(mem_ctx);
return false;
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 index adbb1cf..16c05f5 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp
 @@ -38,10 +38,11 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw,
   struct gl_shader_program *prog,
   struct brw_shader *shader,
   void *mem_ctx,
 - bool no_spills)
 + bool no_spills,
 + unsigned param_count)
 : 

Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Ian Romanick
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 11/22/2013 10:05 AM, Francisco Jerez wrote:
 Petri Latvala petri.latv...@intel.com writes:
 
 [...] @@ -325,8 +326,9 @@ public: */ dst_reg
 output_reg[BRW_VARYING_SLOT_COUNT]; const char
 *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; -   int
 uniform_size[MAX_UNIFORMS]; -   int
 uniform_vector_size[MAX_UNIFORMS]; +   unsigned
 uniform_param_count; +   int* uniform_size; +   int*
 uniform_vector_size; int uniforms;
 
 All the code around you uses the 'type *identifier' convention,
 please keep it consistent.
 
 src_reg shader_start_time; [...] @@ -556,7 +558,7 @@
 brw_gs_emit(struct brw_context *brw, if (likely(!(INTEL_DEBUG 
 DEBUG_NO_DUAL_OBJECT_GS))) { c-prog_data.dual_instanced_dispatch
 = false;
 
 -  vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /*
 no_spills */); +  vec4_gs_visitor v(brw, c, prog, shader,
 mem_ctx, true /* no_spills */, param_count);
 
 Another possibility would be to set 'c.prog_data.base.nr_params = 
 param_count' here and in brw_vs_emit(), so you'd have the same
 value available from the constructor of vec4_visitor without all
 the parameter changes.
 
 It would be possible to do something similar in the fs_visitor
 code, but it seems that it would be slightly more difficult because
 the fs_visitor (ab-)uses nr_params to keep track of the most
 recently allocated uniform index, you'd need to add a private
 counter variable to the fs_visitor similar to what vec4_visitor
 does.
 
 With the upcoming ARB_shader_image_load_store changes we might get
 more than 4 * MAX_UNIFORM uniform allocations in some cases because
 image uniforms can take up several slots for the image meta-data --
 I think it would make sense to extend this change to the FS
 back-end too.

Do you think that should get done now or as a follow-on patch?  This
fixes an existing problem, and I think we want a fix that can get
picked to the 10.0 branch and possibly earlier stable branches.

 Thanks.
 
 if (v.run()) { vec4_generator g(brw, prog, c-gp-program.Base,
 c-prog_data.base, mem_ctx, INTEL_DEBUG  DEBUG_GS); [...] 
 ___ mesa-dev mailing
 list mesa-dev@lists.freedesktop.org 
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.14 (GNU/Linux)

iEYEARECAAYFAlKPr+0ACgkQX1gOwKyEAw9vzQCdE0DAxuZF85y7wFtw1Ypg+ejo
Ft4An3msa4mkI4HU/39n9vlXoY7LrEJ2
=sUh4
-END PGP SIGNATURE-
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Francisco Jerez
Ian Romanick i...@freedesktop.org writes:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 11/22/2013 10:05 AM, Francisco Jerez wrote:
[...]
 With the upcoming ARB_shader_image_load_store changes we might get
 more than 4 * MAX_UNIFORM uniform allocations in some cases because
 image uniforms can take up several slots for the image meta-data --
 I think it would make sense to extend this change to the FS
 back-end too.

 Do you think that should get done now or as a follow-on patch?  This
 fixes an existing problem, and I think we want a fix that can get
 picked to the 10.0 branch and possibly earlier stable branches.


I'm fine either way.  If you'd like to keep the bugfix as simple as
possible for the stable branch I can take care of the fs_visitor change
myself as a separate patch.


pgpzvRoVmbRBv.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Kenneth Graunke
On 11/22/2013 11:27 AM, Ian Romanick wrote:
 On 11/22/2013 12:09 AM, Petri Latvala wrote:
 Signed-off-by: Petri Latvala petri.latv...@intel.com
 ---
  src/mesa/drivers/dri/i965/brw_vec4.cpp|  5 +++--
  src/mesa/drivers/dri/i965/brw_vec4.h  | 14 +++---
  src/mesa/drivers/dri/i965/brw_vec4_gs.c   |  2 +-
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 12 +++-
  src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h   |  6 --
  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp|  7 ++-
  src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp |  6 --
  src/mesa/drivers/dri/i965/brw_vs.c|  2 +-
  src/mesa/drivers/dri/i965/brw_vs.h|  6 --
  9 files changed, 41 insertions(+), 19 deletions(-)

 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp 
 b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 index 73f91a0..3da1b4d 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp
 @@ -1558,7 +1558,8 @@ brw_vs_emit(struct brw_context *brw,
  struct brw_vs_compile *c,
  struct brw_vs_prog_data *prog_data,
  void *mem_ctx,
 -unsigned *final_assembly_size)
 +unsigned *final_assembly_size,
 +unsigned param_count)
  {
 bool start_busy = false;
 float start_time = 0;
 @@ -1585,7 +1586,7 @@ brw_vs_emit(struct brw_context *brw,
}
 }
  
 -   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx);
 +   vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx, param_count);
 if (!v.run()) {
if (prog) {
   prog-LinkStatus = false;
 diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
 b/src/mesa/drivers/dri/i965/brw_vec4.h
 index 5cec9f9..552ca55 100644
 --- a/src/mesa/drivers/dri/i965/brw_vec4.h
 +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
 @@ -231,7 +231,8 @@ public:
  struct brw_shader *shader,
  void *mem_ctx,
  bool debug_flag,
 -bool no_spills);
 +bool no_spills,
 +unsigned param_count);
 ~vec4_visitor();
  
 dst_reg dst_null_f()
 @@ -325,8 +326,9 @@ public:
  */
 dst_reg output_reg[BRW_VARYING_SLOT_COUNT];
 const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT];
 -   int uniform_size[MAX_UNIFORMS];
 -   int uniform_vector_size[MAX_UNIFORMS];
 +   unsigned uniform_param_count;
 +   int* uniform_size;
 +   int* uniform_vector_size;
 int uniforms;
  
 src_reg shader_start_time;
 @@ -546,6 +548,12 @@ private:
  * If true, then register allocation should fail instead of spilling.
  */
 const bool no_spills;
 +
 +   /*
 +* Make noncopyable.
 +*/
 +   vec4_visitor(const vec4_visitor);
 +   vec4_visitor operator=(const vec4_visitor);
 
 I think this should be a separate patch with it's own justification.
 I'm also not sure it's strictly necessary.  I'm not a C++ expert (and
 most people on the project aren't either), so bear with me a bit.  The
 usual reason to make a class non-copyable is because the default
 copy-constructor will only make a shallow copy of pointer fields.  This
 can lead to either double-frees or dereferencing freed memory.  However,
 since we're using ralloc, and the first construction of a vec4_visitor
 object will out-live any possible copies, neither of these situations
 can occur.  Is that a fair assessment?
 
 Now... we probably should do this anyway... and there are a lot of
 classes in the i965 back-end where this should occur.  I don't know if
 we want to make a macro to generate this boiler-plate code or if we just
 want to manually add it to every class.  Perhaps Ken, Paul, or Curro
 will have an opinion...

Or you could just be sensible and just not make copies of things where
it makes no sense to do so...

I would drop this hunk...it's unrelated and unnecessary.

--Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Francisco Jerez
Ian Romanick i...@freedesktop.org writes:

 On 11/22/2013 12:09 AM, Petri Latvala wrote:
[...]
 @@ -546,6 +548,12 @@ private:
  * If true, then register allocation should fail instead of spilling.
  */
 const bool no_spills;
 +
 +   /*
 +* Make noncopyable.
 +*/
 +   vec4_visitor(const vec4_visitor);
 +   vec4_visitor operator=(const vec4_visitor);

 I think this should be a separate patch with it's own justification.
 I'm also not sure it's strictly necessary.  I'm not a C++ expert (and
 most people on the project aren't either), so bear with me a bit.  The
 usual reason to make a class non-copyable is because the default
 copy-constructor will only make a shallow copy of pointer fields.  This
 can lead to either double-frees or dereferencing freed memory.  However,
 since we're using ralloc, and the first construction of a vec4_visitor
 object will out-live any possible copies, neither of these situations
 can occur.  Is that a fair assessment?

 Now... we probably should do this anyway... and there are a lot of
 classes in the i965 back-end where this should occur.  I don't know if
 we want to make a macro to generate this boiler-plate code or if we just
 want to manually add it to every class.  Perhaps Ken, Paul, or Curro
 will have an opinion...


This hunk looks perfectly fine to me (well, aside from the -placement).
All singleton classes should have its copy-constructor declared private
to avoid the situations you have described, and adding a member variable
that needs special handling on copy construction seems like the right
time to do so.  I'm OK with using the private copy-constructor and
assignment operator idiom explicitly as in Petri's code or with having a
macro for the same purpose to save typing -- A third possibility that I
find slightly more elegant would be to define a noncopyable class all
singleton objects could inherit from, like:

| class noncopyable {
|private:
|   noncopyable(noncopyable );
|   noncopyable operator=(const noncopyable );
| };

Because macros that declare class members and need to use access
specifiers can have unexpected side-effects.

Thanks.


pgpU793roSig1.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Francisco Jerez
Kenneth Graunke kenn...@whitecape.org writes:

 On 11/22/2013 11:27 AM, Ian Romanick wrote:
 On 11/22/2013 12:09 AM, Petri Latvala wrote:
[...]
 @@ -546,6 +548,12 @@ private:
  * If true, then register allocation should fail instead of spilling.
  */
 const bool no_spills;
 +
 +   /*
 +* Make noncopyable.
 +*/
 +   vec4_visitor(const vec4_visitor);
 +   vec4_visitor operator=(const vec4_visitor);
[...]
 Or you could just be sensible and just not make copies of things where
 it makes no sense to do so...

 I would drop this hunk...it's unrelated and unnecessary.


It makes it less likely to make mistakes, doesn't it?  E.g. in cases
where you pass an argument by value where you really meant to pass it by
reference.  I think our code would benefit from doing this consistently,
not only because it tells the compiler which classes it makes sense to
copy and which ones don't, but also because it makes the semantics of
the class clearer to developers.


pgplF1DNM67jH.pgp
Description: PGP signature
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.

2013-11-22 Thread Kenneth Graunke
On 11/22/2013 12:23 PM, Francisco Jerez wrote:
 Kenneth Graunke kenn...@whitecape.org writes:
 
 On 11/22/2013 11:27 AM, Ian Romanick wrote:
 On 11/22/2013 12:09 AM, Petri Latvala wrote:
 [...]
 @@ -546,6 +548,12 @@ private:
  * If true, then register allocation should fail instead of spilling.
  */
 const bool no_spills;
 +
 +   /*
 +* Make noncopyable.
 +*/
 +   vec4_visitor(const vec4_visitor);
 +   vec4_visitor operator=(const vec4_visitor);
 [...]
 Or you could just be sensible and just not make copies of things where
 it makes no sense to do so...

 I would drop this hunk...it's unrelated and unnecessary.

 
 It makes it less likely to make mistakes, doesn't it?  E.g. in cases
 where you pass an argument by value where you really meant to pass it by
 reference.  I think our code would benefit from doing this consistently,
 not only because it tells the compiler which classes it makes sense to
 copy and which ones don't, but also because it makes the semantics of
 the class clearer to developers.

Ah, I see...the point is that you might accidentally do:

void foo(vec4_visitor v, int bar);

and end up copying the world, when you meant to pass by reference.  I
guess that's useful.  I probably wouldn't NAK that.

I was thinking you were trying to prevent things like:

vec4_visitor v2(v);
vec4_visitor v3 = v;

which suggests a fundamental misunderstanding of the code which a
compiler error is unlikely to help you overcome.

At any rate, it's really a separate change, so it should be a separate
patch (as Ian mentioned).  Each patch should ideally only make one
specific change.  Smaller patches are almost always better.

--Ken
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev