[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
v2: Don't add function parameters, pass the required size in prog_data-nr_params. v3: - Use the name uniform_array_size instead of uniform_param_count. - Round up when dividing param_count by 4. - Use MAX2() instead of taking the maximum by hand. - Don't crash if prog_data passed to vec4_visitor constructor is NULL v4: Rebase for current master Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254 Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++ src/mesa/drivers/dri/i965/brw_vs.c | 8 4 files changed, 27 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index fb5c0a6..d97c103 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -387,8 +387,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + int *uniform_size; + int *uniform_vector_size; + int uniform_array_size; /* Size of uniform_[vector_]size arrays */ int uniforms; src_reg shader_start_time; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 85fb979..8a9625b 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -68,6 +68,11 @@ do_gs_prog(struct brw_context *brw, rzalloc_array(NULL, const float *, param_count); c.prog_data.base.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + c.prog_data.base.base.nr_params = ALIGN(param_count, 4) / 4 + gs-num_samplers; if (gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index aedab93..cc92a8a 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -3401,6 +3401,17 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF; this-uniforms = 0; + + /* Initialize uniform_array_size to at least 1 because pre-gen6 VS requires +* at least one. See setup_uniforms() in brw_vec4.cpp. +*/ + this-uniform_array_size = 1; + if (prog_data) { + this-uniform_array_size = MAX2(stage_prog_data-nr_params, 1); + } + + this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_array_size); + this-uniform_vector_size = rzalloc_array(mem_ctx, int, this-uniform_array_size); } vec4_visitor::~vec4_visitor() diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index d3dbc8e..ef8dcf1 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -235,6 +235,14 @@ do_vs_prog(struct brw_context *brw, stage_prog_data-param = rzalloc_array(NULL, const float *, param_count); stage_prog_data-pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + stage_prog_data-nr_params = ALIGN(param_count, 4) / 4; + if (vs) { + stage_prog_data-nr_params += vs-num_samplers; + } GLbitfield64 outputs_written = vp-program.Base.OutputsWritten; prog_data.inputs_read = vp-program.Base.InputsRead; -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 12/20/2013 09:26 PM, Kenneth Graunke wrote: On 11/27/2013 05:28 AM, Petri Latvala wrote: v2: Don't add function parameters, pass the required size in prog_data-nr_params. Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++ src/mesa/drivers/dri/i965/brw_vs.c | 8 4 files changed, 23 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..5f5f5cd 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -325,8 +325,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + int *uniform_size; + int *uniform_vector_size; + int uniform_param_count; /* Size of uniform_[vector_]size arrays */ I'm not crazy about this variable name. Between the params arrays, uniform_* arrays, nr_params count, and uniforms count...we already have a lot of distinct things that sound alike. How about: int uniform_array_size; /** Size of uniform_[vector_]size arrays. */ That seems clearer to me. Especially seeing that the value here is really the size of the array, which is an overestimate/upper bound on the number of uniforms, not the actual number of elements in the uniforms or params arrays. Sounds good. Changing the name. int uniforms; src_reg shader_start_time; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 018b0b6..7cf9bac 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw, c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count); c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers; Hmm. You're counting the number of vec4s, but...don't samplers take up a single entry, since they're just integers? This seems odd to me. No, that value is the number of float-size components. param_count, calculated above that code, multiplies the number of components by 4 to have room elsewhere for scalar params that eat up a vec4. I'm just dividing the number back. You might also consider doing ALIGN(param_count, 4) / 4 so that you round up rather than truncating on the division. Ok, changing that. Here and in vs. I also would really like to keep nr_params in consistent units, i.e. always uniform float-size components or always vec4s. I would really like that too, but unfortunately the inconsistency was already present. The best option would be to have another variable for uniform memory use, but I didn't want to make too invasive changes. nr_param gets used here for uniform float-size counts, and later on for something else. This results in some overuse of memory when uniforms are smaller than vec4s. if (gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index a13eafb..b9226dc 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, fail_msg(NULL), first_non_payload_grf(0), need_all_constants_in_pull_buffer(false), + /* Initialize uniform_param_count to at least 1 because gen6 VS requires at + * least one. See setup_uniforms() in brw_vec4.cpp. + */ I think you mean Gen4-5 requires at least one push constant, not gen6 VS. At least, that's what setup_uniforms() is doing. Argh, yes. Fixing that comment. + uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1), I think this would be clearer as: MAX2(prog_data-nr_params, 1) Yes, changing. -- Petri Latvala ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 12/20/2013 08:54 PM, Ian Romanick wrote: This patch breaks the test_vec4_register_coalesce unit test. Did you run 'make check'? I thought I did, but turns out I didn't. Just ran piglit tests. Fix coming up. -- Petri Latvala ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
v2: Don't add function parameters, pass the required size in prog_data-nr_params. v3: - Use the name uniform_array_size instead of uniform_param_count. - Round up when dividing param_count by 4. - Use MAX2() instead of taking the maximum by hand. - Don't crash if prog_data passed to vec4_visitor constructor is NULL Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71254 Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++ src/mesa/drivers/dri/i965/brw_vs.c | 8 4 files changed, 27 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index d4029d8..798d8bd 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -325,8 +325,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + int *uniform_size; + int *uniform_vector_size; + int uniform_array_size; /* Size of uniform_[vector_]size arrays */ int uniforms; src_reg shader_start_time; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 018b0b6..c749eb1 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw, c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count); c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + c.prog_data.base.nr_params = ALIGN(param_count, 4) / 4 + gs-num_samplers; if (gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index 3b8cef6..a93fdc5 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -3321,6 +3321,17 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF; this-uniforms = 0; + + /* Initialize uniform_array_size to at least 1 because pre-gen6 VS requires +* at least one. See setup_uniforms() in brw_vec4.cpp. +*/ + this-uniform_array_size = 1; + if (prog_data) { + this-uniform_array_size = MAX2(prog_data-nr_params, 1); + } + + this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_array_size); + this-uniform_vector_size = rzalloc_array(mem_ctx, int, this-uniform_array_size); } vec4_visitor::~vec4_visitor() diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index 5d50c3c..609a575 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw, prog_data.base.param = rzalloc_array(NULL, const float *, param_count); prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + prog_data.base.nr_params = ALIGN(param_count, 4) / 4; + if (vs) { + prog_data.base.nr_params += vs-num_samplers; + } GLbitfield64 outputs_written = vp-program.Base.OutputsWritten; prog_data.inputs_read = vp-program.Base.InputsRead; -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
This patch breaks the test_vec4_register_coalesce unit test. Did you run 'make check'? On 11/27/2013 05:28 AM, Petri Latvala wrote: v2: Don't add function parameters, pass the required size in prog_data-nr_params. Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++ src/mesa/drivers/dri/i965/brw_vs.c | 8 4 files changed, 23 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..5f5f5cd 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -325,8 +325,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + int *uniform_size; + int *uniform_vector_size; + int uniform_param_count; /* Size of uniform_[vector_]size arrays */ int uniforms; src_reg shader_start_time; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 018b0b6..7cf9bac 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw, c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count); c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers; if (gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index a13eafb..b9226dc 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, fail_msg(NULL), first_non_payload_grf(0), need_all_constants_in_pull_buffer(false), + /* Initialize uniform_param_count to at least 1 because gen6 VS requires at + * least one. See setup_uniforms() in brw_vec4.cpp. + */ + uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1), debug_flag(debug_flag), no_spills(no_spills) { @@ -3290,6 +3294,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF; this-uniforms = 0; + + this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_param_count); + this-uniform_vector_size = rzalloc_array(mem_ctx, int, this-uniform_param_count); } vec4_visitor::~vec4_visitor() diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index b5c8b63..8d0933d 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw, prog_data.base.param = rzalloc_array(NULL, const float *, param_count); prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + prog_data.base.nr_params = param_count / 4; + if (vs) { + prog_data.base.nr_params += vs-num_samplers; + } GLbitfield64 outputs_written = vp-program.Base.OutputsWritten; prog_data.inputs_read = vp-program.Base.InputsRead; ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 11/27/2013 05:28 AM, Petri Latvala wrote: v2: Don't add function parameters, pass the required size in prog_data-nr_params. Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++ src/mesa/drivers/dri/i965/brw_vs.c | 8 4 files changed, 23 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..5f5f5cd 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -325,8 +325,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + int *uniform_size; + int *uniform_vector_size; + int uniform_param_count; /* Size of uniform_[vector_]size arrays */ I'm not crazy about this variable name. Between the params arrays, uniform_* arrays, nr_params count, and uniforms count...we already have a lot of distinct things that sound alike. How about: int uniform_array_size; /** Size of uniform_[vector_]size arrays. */ That seems clearer to me. Especially seeing that the value here is really the size of the array, which is an overestimate/upper bound on the number of uniforms, not the actual number of elements in the uniforms or params arrays. int uniforms; src_reg shader_start_time; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 018b0b6..7cf9bac 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw, c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count); c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers; Hmm. You're counting the number of vec4s, but...don't samplers take up a single entry, since they're just integers? This seems odd to me. You might also consider doing ALIGN(param_count, 4) / 4 so that you round up rather than truncating on the division. I also would really like to keep nr_params in consistent units, i.e. always uniform float-size components or always vec4s. if (gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index a13eafb..b9226dc 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, fail_msg(NULL), first_non_payload_grf(0), need_all_constants_in_pull_buffer(false), + /* Initialize uniform_param_count to at least 1 because gen6 VS requires at + * least one. See setup_uniforms() in brw_vec4.cpp. + */ I think you mean Gen4-5 requires at least one push constant, not gen6 VS. At least, that's what setup_uniforms() is doing. + uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1), I think this would be clearer as: MAX2(prog_data-nr_params, 1) debug_flag(debug_flag), no_spills(no_spills) { @@ -3290,6 +3294,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF; this-uniforms = 0; + + this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_param_count); + this-uniform_vector_size = rzalloc_array(mem_ctx, int, this-uniform_param_count); } vec4_visitor::~vec4_visitor() diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index b5c8b63..8d0933d 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw, prog_data.base.param = rzalloc_array(NULL, const float *, param_count); prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + prog_data.base.nr_params = param_count / 4; + if (vs) { + prog_data.base.nr_params += vs-num_samplers; + } GLbitfield64 outputs_written
[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
v2: Don't add function parameters, pass the required size in prog_data-nr_params. Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.h | 5 +++-- src/mesa/drivers/dri/i965/brw_vec4_gs.c| 5 + src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 7 +++ src/mesa/drivers/dri/i965/brw_vs.c | 8 4 files changed, 23 insertions(+), 2 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..5f5f5cd 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -325,8 +325,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + int *uniform_size; + int *uniform_vector_size; + int uniform_param_count; /* Size of uniform_[vector_]size arrays */ int uniforms; src_reg shader_start_time; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index 018b0b6..7cf9bac 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -64,6 +64,11 @@ do_gs_prog(struct brw_context *brw, c.prog_data.base.param = rzalloc_array(NULL, const float *, param_count); c.prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + c.prog_data.base.nr_params = param_count / 4 + gs-num_samplers; if (gp-program.OutputType == GL_POINTS) { /* When the output type is points, the geometry shader may output data diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp index a13eafb..b9226dc 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp @@ -3253,6 +3253,10 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, fail_msg(NULL), first_non_payload_grf(0), need_all_constants_in_pull_buffer(false), + /* Initialize uniform_param_count to at least 1 because gen6 VS requires at + * least one. See setup_uniforms() in brw_vec4.cpp. + */ + uniform_param_count(prog_data-nr_params ? prog_data-nr_params : 1), debug_flag(debug_flag), no_spills(no_spills) { @@ -3290,6 +3294,9 @@ vec4_visitor::vec4_visitor(struct brw_context *brw, this-max_grf = brw-gen = 7 ? GEN7_MRF_HACK_START : BRW_MAX_GRF; this-uniforms = 0; + + this-uniform_size = rzalloc_array(mem_ctx, int, this-uniform_param_count); + this-uniform_vector_size = rzalloc_array(mem_ctx, int, this-uniform_param_count); } vec4_visitor::~vec4_visitor() diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c index b5c8b63..8d0933d 100644 --- a/src/mesa/drivers/dri/i965/brw_vs.c +++ b/src/mesa/drivers/dri/i965/brw_vs.c @@ -242,6 +242,14 @@ do_vs_prog(struct brw_context *brw, prog_data.base.param = rzalloc_array(NULL, const float *, param_count); prog_data.base.pull_param = rzalloc_array(NULL, const float *, param_count); + /* Setting nr_params here NOT to the size of the param and pull_param +* arrays, but to the number of uniform components vec4_visitor +* needs. vec4_visitor::setup_uniforms() will set it back to a proper value. +*/ + prog_data.base.nr_params = param_count / 4; + if (vs) { + prog_data.base.nr_params += vs-num_samplers; + } GLbitfield64 outputs_written = vp-program.Base.OutputsWritten; prog_data.inputs_read = vp-program.Base.InputsRead; -- 1.8.4.rc3 ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 11/22/2013 08:05 PM, Francisco Jerez wrote: Petri Latvalapetri.latv...@intel.com writes: [...] @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; All the code around you uses the 'type *identifier' convention, please keep it consistent. *slap on forehead* note to self, disable muscle memory. Fixing that. src_reg shader_start_time; [...] @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw, if (likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { c-prog_data.dual_instanced_dispatch = false; - vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */); + vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, param_count); Another possibility would be to set 'c.prog_data.base.nr_params = param_count' here and in brw_vs_emit(), so you'd have the same value available from the constructor of vec4_visitor without all the parameter changes. When trying this change, I began to wonder if I'm using the right values at all. What is that nr_params supposed to be for code elsewhere? vec4_visitor::setup_uniforms() sets it to this-uniforms * 4, but param_count that I passed from do_vs_prog() to brw_vs_emit() is num_uniform_components * 4, yielding (close to) this-uniforms * 4 * 4. Close to means that my test program gives param_count being 92, and this-uniforms * 4 in setup_uniforms() is 24. Is vec4_visitor::setup_uniforms() supposed to be the authoritative source of the uniform counts or is that nr_params assignment just for the pre-gen6 hack in the same function? Is nr_params supposed to be the number of uniform components or uniforms? To my understanding param_count in do_vs_prog() is the number of uniform components required to hold all uniforms with a vec4 for each one (float or otherwise), are the floats packed to vec4s anywhere? -- Petri Latvala ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
Petri Latvala petri.latv...@intel.com writes: On 11/22/2013 08:05 PM, Francisco Jerez wrote: Petri Latvalapetri.latv...@intel.com writes: [...] @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; All the code around you uses the 'type *identifier' convention, please keep it consistent. *slap on forehead* note to self, disable muscle memory. Fixing that. src_reg shader_start_time; [...] @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw, if (likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { c-prog_data.dual_instanced_dispatch = false; - vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */); + vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, param_count); Another possibility would be to set 'c.prog_data.base.nr_params = param_count' here and in brw_vs_emit(), so you'd have the same value available from the constructor of vec4_visitor without all the parameter changes. When trying this change, I began to wonder if I'm using the right values at all. What is that nr_params supposed to be for code elsewhere? vec4_visitor::setup_uniforms() sets it to this-uniforms * 4, but param_count that I passed from do_vs_prog() to brw_vs_emit() is num_uniform_components * 4, yielding (close to) this-uniforms * 4 * 4. Oh... Apparently do_vs_prog() is overallocating memory. And so is do_gs_prog()... That seems like a bug. Close to means that my test program gives param_count being 92, and this-uniforms * 4 in setup_uniforms() is 24. Is vec4_visitor::setup_uniforms() supposed to be the authoritative source of the uniform counts or is that nr_params assignment just for the pre-gen6 hack in the same function? Is nr_params supposed to be the number of uniform components or uniforms? It should be the number of uniform components (in units of 32-bit floats). To my understanding param_count in do_vs_prog() is the number of uniform components required to hold all uniforms with a vec4 for each one (float or otherwise), are the floats packed to vec4s anywhere? Yes, see vec4_visitor::pack_uniform_registers(). -- Petri Latvala pgpH9bUzFzmPZ.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.cpp| 5 +++-- src/mesa/drivers/dri/i965/brw_vec4.h | 14 +++--- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 12 +++- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 6 -- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 7 ++- src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 -- src/mesa/drivers/dri/i965/brw_vs.c| 2 +- src/mesa/drivers/dri/i965/brw_vs.h| 6 -- 9 files changed, 41 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 73f91a0..3da1b4d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1558,7 +1558,8 @@ brw_vs_emit(struct brw_context *brw, struct brw_vs_compile *c, struct brw_vs_prog_data *prog_data, void *mem_ctx, -unsigned *final_assembly_size) +unsigned *final_assembly_size, +unsigned param_count) { bool start_busy = false; float start_time = 0; @@ -1585,7 +1586,7 @@ brw_vs_emit(struct brw_context *brw, } } - vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx); + vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx, param_count); if (!v.run()) { if (prog) { prog-LinkStatus = false; diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..552ca55 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -231,7 +231,8 @@ public: struct brw_shader *shader, void *mem_ctx, bool debug_flag, -bool no_spills); +bool no_spills, +unsigned param_count); ~vec4_visitor(); dst_reg dst_null_f() @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; src_reg shader_start_time; @@ -546,6 +548,12 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + /* +* Make noncopyable. +*/ + vec4_visitor(const vec4_visitor); + vec4_visitor operator=(const vec4_visitor); }; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index b52d646..b0b77d1 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -213,7 +213,7 @@ do_gs_prog(struct brw_context *brw, void *mem_ctx = ralloc_context(NULL); unsigned program_size; const unsigned *program = - brw_gs_emit(brw, prog, c, mem_ctx, program_size); + brw_gs_emit(brw, prog, c, mem_ctx, program_size, param_count); if (program == NULL) { ralloc_free(mem_ctx); return false; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index adbb1cf..16c05f5 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -38,10 +38,11 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw, struct gl_shader_program *prog, struct brw_shader *shader, void *mem_ctx, - bool no_spills) + bool no_spills, + unsigned param_count) : vec4_visitor(brw, c-base, c-gp-program.Base, c-key.base, c-prog_data.base, prog, shader, mem_ctx, - INTEL_DEBUG DEBUG_GS, no_spills), + INTEL_DEBUG DEBUG_GS, no_spills, param_count), c(c) { } @@ -539,7 +540,8 @@ brw_gs_emit(struct brw_context *brw, struct gl_shader_program *prog, struct brw_gs_compile *c, void *mem_ctx, -unsigned *final_assembly_size) +unsigned *final_assembly_size, +unsigned param_count) { struct brw_shader *shader = (brw_shader *) prog-_LinkedShaders[MESA_SHADER_GEOMETRY]; @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw, if (likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { c-prog_data.dual_instanced_dispatch = false; - vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */); + vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, param_count); if (v.run()) { vec4_generator g(brw, prog,
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
Petri Latvala petri.latv...@intel.com writes: [...] @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; All the code around you uses the 'type *identifier' convention, please keep it consistent. src_reg shader_start_time; [...] @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw, if (likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { c-prog_data.dual_instanced_dispatch = false; - vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */); + vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, param_count); Another possibility would be to set 'c.prog_data.base.nr_params = param_count' here and in brw_vs_emit(), so you'd have the same value available from the constructor of vec4_visitor without all the parameter changes. It would be possible to do something similar in the fs_visitor code, but it seems that it would be slightly more difficult because the fs_visitor (ab-)uses nr_params to keep track of the most recently allocated uniform index, you'd need to add a private counter variable to the fs_visitor similar to what vec4_visitor does. With the upcoming ARB_shader_image_load_store changes we might get more than 4 * MAX_UNIFORM uniform allocations in some cases because image uniforms can take up several slots for the image meta-data -- I think it would make sense to extend this change to the FS back-end too. Thanks. if (v.run()) { vec4_generator g(brw, prog, c-gp-program.Base, c-prog_data.base, mem_ctx, INTEL_DEBUG DEBUG_GS); [...] pgpt3d5HdM9K6.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 11/22/2013 12:09 AM, Petri Latvala wrote: Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.cpp| 5 +++-- src/mesa/drivers/dri/i965/brw_vec4.h | 14 +++--- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 12 +++- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 6 -- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 7 ++- src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 -- src/mesa/drivers/dri/i965/brw_vs.c| 2 +- src/mesa/drivers/dri/i965/brw_vs.h| 6 -- 9 files changed, 41 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 73f91a0..3da1b4d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1558,7 +1558,8 @@ brw_vs_emit(struct brw_context *brw, struct brw_vs_compile *c, struct brw_vs_prog_data *prog_data, void *mem_ctx, -unsigned *final_assembly_size) +unsigned *final_assembly_size, +unsigned param_count) { bool start_busy = false; float start_time = 0; @@ -1585,7 +1586,7 @@ brw_vs_emit(struct brw_context *brw, } } - vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx); + vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx, param_count); if (!v.run()) { if (prog) { prog-LinkStatus = false; diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..552ca55 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -231,7 +231,8 @@ public: struct brw_shader *shader, void *mem_ctx, bool debug_flag, -bool no_spills); +bool no_spills, +unsigned param_count); ~vec4_visitor(); dst_reg dst_null_f() @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; src_reg shader_start_time; @@ -546,6 +548,12 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + /* +* Make noncopyable. +*/ + vec4_visitor(const vec4_visitor); + vec4_visitor operator=(const vec4_visitor); I think this should be a separate patch with it's own justification. I'm also not sure it's strictly necessary. I'm not a C++ expert (and most people on the project aren't either), so bear with me a bit. The usual reason to make a class non-copyable is because the default copy-constructor will only make a shallow copy of pointer fields. This can lead to either double-frees or dereferencing freed memory. However, since we're using ralloc, and the first construction of a vec4_visitor object will out-live any possible copies, neither of these situations can occur. Is that a fair assessment? Now... we probably should do this anyway... and there are a lot of classes in the i965 back-end where this should occur. I don't know if we want to make a macro to generate this boiler-plate code or if we just want to manually add it to every class. Perhaps Ken, Paul, or Curro will have an opinion... }; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs.c b/src/mesa/drivers/dri/i965/brw_vec4_gs.c index b52d646..b0b77d1 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs.c +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs.c @@ -213,7 +213,7 @@ do_gs_prog(struct brw_context *brw, void *mem_ctx = ralloc_context(NULL); unsigned program_size; const unsigned *program = - brw_gs_emit(brw, prog, c, mem_ctx, program_size); + brw_gs_emit(brw, prog, c, mem_ctx, program_size, param_count); if (program == NULL) { ralloc_free(mem_ctx); return false; diff --git a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp index adbb1cf..16c05f5 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp @@ -38,10 +38,11 @@ vec4_gs_visitor::vec4_gs_visitor(struct brw_context *brw, struct gl_shader_program *prog, struct brw_shader *shader, void *mem_ctx, - bool no_spills) + bool no_spills, + unsigned param_count) :
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 11/22/2013 10:05 AM, Francisco Jerez wrote: Petri Latvala petri.latv...@intel.com writes: [...] @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; All the code around you uses the 'type *identifier' convention, please keep it consistent. src_reg shader_start_time; [...] @@ -556,7 +558,7 @@ brw_gs_emit(struct brw_context *brw, if (likely(!(INTEL_DEBUG DEBUG_NO_DUAL_OBJECT_GS))) { c-prog_data.dual_instanced_dispatch = false; - vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */); + vec4_gs_visitor v(brw, c, prog, shader, mem_ctx, true /* no_spills */, param_count); Another possibility would be to set 'c.prog_data.base.nr_params = param_count' here and in brw_vs_emit(), so you'd have the same value available from the constructor of vec4_visitor without all the parameter changes. It would be possible to do something similar in the fs_visitor code, but it seems that it would be slightly more difficult because the fs_visitor (ab-)uses nr_params to keep track of the most recently allocated uniform index, you'd need to add a private counter variable to the fs_visitor similar to what vec4_visitor does. With the upcoming ARB_shader_image_load_store changes we might get more than 4 * MAX_UNIFORM uniform allocations in some cases because image uniforms can take up several slots for the image meta-data -- I think it would make sense to extend this change to the FS back-end too. Do you think that should get done now or as a follow-on patch? This fixes an existing problem, and I think we want a fix that can get picked to the 10.0 branch and possibly earlier stable branches. Thanks. if (v.run()) { vec4_generator g(brw, prog, c-gp-program.Base, c-prog_data.base, mem_ctx, INTEL_DEBUG DEBUG_GS); [...] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.14 (GNU/Linux) iEYEARECAAYFAlKPr+0ACgkQX1gOwKyEAw9vzQCdE0DAxuZF85y7wFtw1Ypg+ejo Ft4An3msa4mkI4HU/39n9vlXoY7LrEJ2 =sUh4 -END PGP SIGNATURE- ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
Ian Romanick i...@freedesktop.org writes: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 11/22/2013 10:05 AM, Francisco Jerez wrote: [...] With the upcoming ARB_shader_image_load_store changes we might get more than 4 * MAX_UNIFORM uniform allocations in some cases because image uniforms can take up several slots for the image meta-data -- I think it would make sense to extend this change to the FS back-end too. Do you think that should get done now or as a follow-on patch? This fixes an existing problem, and I think we want a fix that can get picked to the 10.0 branch and possibly earlier stable branches. I'm fine either way. If you'd like to keep the bugfix as simple as possible for the stable branch I can take care of the fs_visitor change myself as a separate patch. pgpzvRoVmbRBv.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 11/22/2013 11:27 AM, Ian Romanick wrote: On 11/22/2013 12:09 AM, Petri Latvala wrote: Signed-off-by: Petri Latvala petri.latv...@intel.com --- src/mesa/drivers/dri/i965/brw_vec4.cpp| 5 +++-- src/mesa/drivers/dri/i965/brw_vec4.h | 14 +++--- src/mesa/drivers/dri/i965/brw_vec4_gs.c | 2 +- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.cpp | 12 +++- src/mesa/drivers/dri/i965/brw_vec4_gs_visitor.h | 6 -- src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp| 7 ++- src/mesa/drivers/dri/i965/brw_vec4_vs_visitor.cpp | 6 -- src/mesa/drivers/dri/i965/brw_vs.c| 2 +- src/mesa/drivers/dri/i965/brw_vs.h| 6 -- 9 files changed, 41 insertions(+), 19 deletions(-) diff --git a/src/mesa/drivers/dri/i965/brw_vec4.cpp b/src/mesa/drivers/dri/i965/brw_vec4.cpp index 73f91a0..3da1b4d 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.cpp +++ b/src/mesa/drivers/dri/i965/brw_vec4.cpp @@ -1558,7 +1558,8 @@ brw_vs_emit(struct brw_context *brw, struct brw_vs_compile *c, struct brw_vs_prog_data *prog_data, void *mem_ctx, -unsigned *final_assembly_size) +unsigned *final_assembly_size, +unsigned param_count) { bool start_busy = false; float start_time = 0; @@ -1585,7 +1586,7 @@ brw_vs_emit(struct brw_context *brw, } } - vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx); + vec4_vs_visitor v(brw, c, prog_data, prog, shader, mem_ctx, param_count); if (!v.run()) { if (prog) { prog-LinkStatus = false; diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h b/src/mesa/drivers/dri/i965/brw_vec4.h index 5cec9f9..552ca55 100644 --- a/src/mesa/drivers/dri/i965/brw_vec4.h +++ b/src/mesa/drivers/dri/i965/brw_vec4.h @@ -231,7 +231,8 @@ public: struct brw_shader *shader, void *mem_ctx, bool debug_flag, -bool no_spills); +bool no_spills, +unsigned param_count); ~vec4_visitor(); dst_reg dst_null_f() @@ -325,8 +326,9 @@ public: */ dst_reg output_reg[BRW_VARYING_SLOT_COUNT]; const char *output_reg_annotation[BRW_VARYING_SLOT_COUNT]; - int uniform_size[MAX_UNIFORMS]; - int uniform_vector_size[MAX_UNIFORMS]; + unsigned uniform_param_count; + int* uniform_size; + int* uniform_vector_size; int uniforms; src_reg shader_start_time; @@ -546,6 +548,12 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + /* +* Make noncopyable. +*/ + vec4_visitor(const vec4_visitor); + vec4_visitor operator=(const vec4_visitor); I think this should be a separate patch with it's own justification. I'm also not sure it's strictly necessary. I'm not a C++ expert (and most people on the project aren't either), so bear with me a bit. The usual reason to make a class non-copyable is because the default copy-constructor will only make a shallow copy of pointer fields. This can lead to either double-frees or dereferencing freed memory. However, since we're using ralloc, and the first construction of a vec4_visitor object will out-live any possible copies, neither of these situations can occur. Is that a fair assessment? Now... we probably should do this anyway... and there are a lot of classes in the i965 back-end where this should occur. I don't know if we want to make a macro to generate this boiler-plate code or if we just want to manually add it to every class. Perhaps Ken, Paul, or Curro will have an opinion... Or you could just be sensible and just not make copies of things where it makes no sense to do so... I would drop this hunk...it's unrelated and unnecessary. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
Ian Romanick i...@freedesktop.org writes: On 11/22/2013 12:09 AM, Petri Latvala wrote: [...] @@ -546,6 +548,12 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + /* +* Make noncopyable. +*/ + vec4_visitor(const vec4_visitor); + vec4_visitor operator=(const vec4_visitor); I think this should be a separate patch with it's own justification. I'm also not sure it's strictly necessary. I'm not a C++ expert (and most people on the project aren't either), so bear with me a bit. The usual reason to make a class non-copyable is because the default copy-constructor will only make a shallow copy of pointer fields. This can lead to either double-frees or dereferencing freed memory. However, since we're using ralloc, and the first construction of a vec4_visitor object will out-live any possible copies, neither of these situations can occur. Is that a fair assessment? Now... we probably should do this anyway... and there are a lot of classes in the i965 back-end where this should occur. I don't know if we want to make a macro to generate this boiler-plate code or if we just want to manually add it to every class. Perhaps Ken, Paul, or Curro will have an opinion... This hunk looks perfectly fine to me (well, aside from the -placement). All singleton classes should have its copy-constructor declared private to avoid the situations you have described, and adding a member variable that needs special handling on copy construction seems like the right time to do so. I'm OK with using the private copy-constructor and assignment operator idiom explicitly as in Petri's code or with having a macro for the same purpose to save typing -- A third possibility that I find slightly more elegant would be to define a noncopyable class all singleton objects could inherit from, like: | class noncopyable { |private: | noncopyable(noncopyable ); | noncopyable operator=(const noncopyable ); | }; Because macros that declare class members and need to use access specifiers can have unexpected side-effects. Thanks. pgpU793roSig1.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
Kenneth Graunke kenn...@whitecape.org writes: On 11/22/2013 11:27 AM, Ian Romanick wrote: On 11/22/2013 12:09 AM, Petri Latvala wrote: [...] @@ -546,6 +548,12 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + /* +* Make noncopyable. +*/ + vec4_visitor(const vec4_visitor); + vec4_visitor operator=(const vec4_visitor); [...] Or you could just be sensible and just not make copies of things where it makes no sense to do so... I would drop this hunk...it's unrelated and unnecessary. It makes it less likely to make mistakes, doesn't it? E.g. in cases where you pass an argument by value where you really meant to pass it by reference. I think our code would benefit from doing this consistently, not only because it tells the compiler which classes it makes sense to copy and which ones don't, but also because it makes the semantics of the class clearer to developers. pgplF1DNM67jH.pgp Description: PGP signature ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH 1/2] i965: Allocate vec4_visitor's uniform_size and uniform_vector_size arrays dynamically.
On 11/22/2013 12:23 PM, Francisco Jerez wrote: Kenneth Graunke kenn...@whitecape.org writes: On 11/22/2013 11:27 AM, Ian Romanick wrote: On 11/22/2013 12:09 AM, Petri Latvala wrote: [...] @@ -546,6 +548,12 @@ private: * If true, then register allocation should fail instead of spilling. */ const bool no_spills; + + /* +* Make noncopyable. +*/ + vec4_visitor(const vec4_visitor); + vec4_visitor operator=(const vec4_visitor); [...] Or you could just be sensible and just not make copies of things where it makes no sense to do so... I would drop this hunk...it's unrelated and unnecessary. It makes it less likely to make mistakes, doesn't it? E.g. in cases where you pass an argument by value where you really meant to pass it by reference. I think our code would benefit from doing this consistently, not only because it tells the compiler which classes it makes sense to copy and which ones don't, but also because it makes the semantics of the class clearer to developers. Ah, I see...the point is that you might accidentally do: void foo(vec4_visitor v, int bar); and end up copying the world, when you meant to pass by reference. I guess that's useful. I probably wouldn't NAK that. I was thinking you were trying to prevent things like: vec4_visitor v2(v); vec4_visitor v3 = v; which suggests a fundamental misunderstanding of the code which a compiler error is unlikely to help you overcome. At any rate, it's really a separate change, so it should be a separate patch (as Ian mentioned). Each patch should ideally only make one specific change. Smaller patches are almost always better. --Ken ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev