Re: [Mesa-dev] [PATCH 1/3] glsl: Bail on parsing if the #version directive is bogus.

2013-06-07 Thread Matt Turner
On Fri, Jun 7, 2013 at 10:42 PM, Kenneth Graunke  wrote:
> If we didn't successfully parse the #version line, there's no point in
> continuing with parsing and compiling: it's already failed.
>
> Furthermore, it can actually be harmful: right after handling #version,
> we call _mesa_glsl_initialize_types(), which checks state->es_shader and
> language_version.  If it isn't valid, it hits an assertion failure.
>
> Fixes Piglit's "invalid-version-es."  When processing "#version 110 es",
> our code set state->es_shader and state->language_version = 110.  It
> then properly determined that this was invalid and flagged an error.
> Since we continued anyway, we hit the assertion mentioned above.
>
> NOTE: This is a candidate for the 9.1 branch.
>
> Signed-off-by: Kenneth Graunke 
> ---
>  src/glsl/glsl_parser.yy | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
> index 6e92c26..56367f8 100644
> --- a/src/glsl/glsl_parser.yy
> +++ b/src/glsl/glsl_parser.yy
> @@ -267,10 +267,16 @@ version_statement:
> | VERSION_TOK INTCONSTANT EOL
> {
> state->process_version_directive(&@2, $2, NULL);
> +  if (state->error) {
> + YYERROR;
> +  }
> }
>  | VERSION_TOK INTCONSTANT any_identifier EOL
>  {
> state->process_version_directive(&@2, $2, $3);
> +  if (state->error) {
> + YYERROR;
> +  }
>  }
> ;
>
> --
> 1.8.3

Series is
Reviewed-by: Matt Turner 
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


[Mesa-dev] [PATCH 3/3] glcpp: Automatically #define GL_core_profile 1 on GLSL 1.50+.

2013-06-07 Thread Kenneth Graunke
Page 17 of the GLSL 1.50.11 specification states:
"There is a built-in macro definition for each profile the
 implementation supports.  All implementations provide the following
 macro:

     #define GL_core_profile 1"
Signed-off-by: Kenneth Graunke 
---
 src/glsl/glcpp/glcpp-parse.y | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/glsl/glcpp/glcpp-parse.y b/src/glsl/glcpp/glcpp-parse.y
index 81ba04b..fe36c12 100644
--- a/src/glsl/glcpp/glcpp-parse.y
+++ b/src/glsl/glcpp/glcpp-parse.y
@@ -2064,6 +2064,9 @@ _glcpp_parser_handle_version_declaration(glcpp_parser_t 
*parser, intmax_t versio
add_builtin_define (parser, "GL_ES", 1);
}
 
+   if (version >= 150)
+   add_builtin_define(parser, "GL_core_profile", 1);
+
/* Currently, all ES2/ES3 implementations support highp in the
 * fragment shader, so we always define this macro in ES2/ES3.
 * If we ever get a driver that doesn't support highp, we'll
-- 
1.8.3
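
For readers skimming the archive, the effect of this patch can be sketched
in plain C as follows.  This is only an illustration: the struct and the
helper names (builtin_defines, add_define, handle_version, has_define) are
hypothetical stand-ins, not glcpp's real data structures.  Only the
"version >= 150" check and the two macro names come from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the preprocessor's table of builtin macros. */
struct builtin_defines {
   const char *names[4];
   int count;
};

static void add_define(struct builtin_defines *d, const char *name)
{
   assert(d->count < 4);   /* the sketch only has room for four names */
   d->names[d->count++] = name;
}

/* Same shape as the patched handler: GL_ES for ES shaders, and
 * GL_core_profile whenever the version number is 150 or higher. */
static void handle_version(struct builtin_defines *d, int version, bool es)
{
   d->count = 0;
   if (es)
      add_define(d, "GL_ES");
   if (version >= 150)
      add_define(d, "GL_core_profile");
}

static bool has_define(const struct builtin_defines *d, const char *name)
{
   for (int i = 0; i < d->count; i++) {
      if (strcmp(d->names[i], name) == 0)
         return true;
   }
   return false;
}
```

With this sketch, handle_version(&d, 150, false) leaves GL_core_profile in
the table, while handle_version(&d, 130, false) does not.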



[Mesa-dev] [PATCH 2/3] glsl: Parse "#version 150 core" directives.

2013-06-07 Thread Kenneth Graunke
Previously we only supported "#version 150".  This patch also accepts
"#version 150 core", and recognizes "compatibility" so that it can give
the user a more descriptive error message.

Fixes Piglit's version-150-core-profile test.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/glsl_parser_extras.cpp | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/src/glsl/glsl_parser_extras.cpp b/src/glsl/glsl_parser_extras.cpp
index c0dd713..85b2165 100644
--- a/src/glsl/glsl_parser_extras.cpp
+++ b/src/glsl/glsl_parser_extras.cpp
@@ -226,6 +226,19 @@ _mesa_glsl_parse_state::process_version_directive(YYLTYPE 
*locp, int version,
if (ident) {
   if (strcmp(ident, "es") == 0) {
  es_token_present = true;
+  } else if (version >= 150) {
+ if (strcmp(ident, "core") == 0) {
+/* Accept the token.  There's no need to record that this is
+ * a core profile shader since that's the only profile we support.
+ */
+ } else if (strcmp(ident, "compatibility") == 0) {
+_mesa_glsl_error(locp, this,
+ "The compatibility profile is not supported.\n");
+ } else {
+_mesa_glsl_error(locp, this,
+ "\"%s\" is not a valid shading language profile; "
+ "if present, it must be \"core\".\n", ident);
+ }
   } else {
  _mesa_glsl_error(locp, this,
   "Illegal text following version number\n");
-- 
1.8.3
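
The decision tree in the hunk above can be restated as a small standalone
C function.  This is a sketch under assumptions: the enum and the function
name classify_version_token are invented for illustration, and the real
code reports errors through _mesa_glsl_error rather than returning a code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes, one per branch of the patched code. */
enum version_token_result {
   TOKEN_NONE,                /* no identifier after the version number */
   TOKEN_ES,                  /* "es" */
   TOKEN_CORE,                /* "core" on 1.50+: accepted */
   TOKEN_COMPAT_UNSUPPORTED,  /* "compatibility" on 1.50+: specific error */
   TOKEN_BAD_PROFILE,         /* any other identifier on 1.50+ */
   TOKEN_ILLEGAL_TEXT         /* any identifier other than "es" before 1.50 */
};

static enum version_token_result
classify_version_token(int version, const char *ident)
{
   assert(version > 0);

   if (ident == NULL)
      return TOKEN_NONE;
   if (strcmp(ident, "es") == 0)
      return TOKEN_ES;

   /* Profile names are only meaningful from GLSL 1.50 onwards. */
   if (version >= 150) {
      if (strcmp(ident, "core") == 0)
         return TOKEN_CORE;
      if (strcmp(ident, "compatibility") == 0)
         return TOKEN_COMPAT_UNSUPPORTED;
      return TOKEN_BAD_PROFILE;
   }

   return TOKEN_ILLEGAL_TEXT;
}
```

For example, classify_version_token(150, "core") yields TOKEN_CORE, while
the same identifier with version 130 falls through to TOKEN_ILLEGAL_TEXT.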



[Mesa-dev] [PATCH 1/3] glsl: Bail on parsing if the #version directive is bogus.

2013-06-07 Thread Kenneth Graunke
If we didn't successfully parse the #version line, there's no point in
continuing with parsing and compiling: it's already failed.

Furthermore, it can actually be harmful: right after handling #version,
we call _mesa_glsl_initialize_types(), which checks state->es_shader and
language_version.  If it isn't valid, it hits an assertion failure.

Fixes Piglit's "invalid-version-es."  When processing "#version 110 es",
our code set state->es_shader and state->language_version = 110.  It
then properly determined that this was invalid and flagged an error.
Since we continued anyway, we hit the assertion mentioned above.

NOTE: This is a candidate for the 9.1 branch.

Signed-off-by: Kenneth Graunke 
---
 src/glsl/glsl_parser.yy | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/src/glsl/glsl_parser.yy b/src/glsl/glsl_parser.yy
index 6e92c26..56367f8 100644
--- a/src/glsl/glsl_parser.yy
+++ b/src/glsl/glsl_parser.yy
@@ -267,10 +267,16 @@ version_statement:
| VERSION_TOK INTCONSTANT EOL
{
state->process_version_directive(&@2, $2, NULL);
+  if (state->error) {
+ YYERROR;
+  }
}
 | VERSION_TOK INTCONSTANT any_identifier EOL
 {
state->process_version_directive(&@2, $2, $3);
+  if (state->error) {
+ YYERROR;
+  }
 }
;
 
-- 
1.8.3
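
The control flow described in the commit message can be sketched outside
the Bison grammar like this.  Everything here is a simplified model: the
parse_state struct, the function names, and the rule that "es" is only
valid with version 300 are assumptions made for the illustration, not
Mesa's actual validation logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced parse state. */
struct parse_state {
   bool error;
   bool es_shader;
   int language_version;
   bool types_initialized;
};

/* Records what the directive asked for, then flags invalid combinations.
 * Assumed rule for this sketch: "es" is only valid with version 300. */
static void process_version(struct parse_state *st, int version,
                            const char *ident)
{
   st->es_shader = (ident != NULL && strcmp(ident, "es") == 0);
   st->language_version = version;
   if (st->es_shader && version != 300)
      st->error = true;
}

/* Stands in for the type initialization step, which asserts that the
 * version/ES combination is one it knows about. */
static void initialize_types(struct parse_state *st)
{
   assert(!st->es_shader || st->language_version == 300);
   st->types_initialized = true;
}

/* Returns false as soon as the version directive fails, which plays the
 * role of YYERROR in the patch, so initialize_types() is never reached
 * with a bogus state. */
static bool parse_shader(struct parse_state *st, int version,
                         const char *ident)
{
   memset(st, 0, sizeof *st);
   process_version(st, version, ident);
   if (st->error)
      return false;
   initialize_types(st);
   return true;
}
```

Without the early return, parse_shader(&st, 110, "es") would reach the
assertion in initialize_types(), which is the failure mode the patch
avoids.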



Re: [Mesa-dev] [PATCH 1/3] i965/vs: Use the MAD instruction when possible.

2013-06-07 Thread Matt Turner
On Fri, Jun 7, 2013 at 6:55 PM, Eric Anholt  wrote:
> This is different from how we do it in the FS - we are using MAD even when
> some of the args are constants, because with the relatively unrestrained
> ability to schedule a MOV to prepare a temporary with that data, we can
> get lower latency for the sequence of instructions.
>
> No significant performance difference on GLB2.7 trex (n=33/34), though it
> doesn't have that many MADs.  I noticed MAD opportunities while reading
> the code for the DOTA2 bug.
> ---
>  src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
>  .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |  1 +
>  src/mesa/drivers/dri/i965/brw_vec4_emit.cpp|  4 +++
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 37 
> ++
>  4 files changed, 43 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
> b/src/mesa/drivers/dri/i965/brw_vec4.h
> index e6e59bc..a72d694 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4.h
> +++ b/src/mesa/drivers/dri/i965/brw_vec4.h
> @@ -468,6 +468,7 @@ public:
> int base_offset);
>
> bool try_emit_sat(ir_expression *ir);
> +   bool try_emit_mad(ir_expression *ir, int mul_arg);
> void resolve_ud_negate(src_reg *reg);
>
> src_reg get_timestamp();
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> index 39eef4b..1a667eb 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> @@ -216,6 +216,7 @@ vec4_visitor::try_copy_propagation(struct intel_context 
> *intel,
>return false;
>
> bool is_3src_inst = (inst->opcode == BRW_OPCODE_LRP ||
> +inst->opcode == BRW_OPCODE_MAD ||
>  inst->opcode == BRW_OPCODE_BFE ||
>  inst->opcode == BRW_OPCODE_BFI2);
> if (is_3src_inst && value.file == UNIFORM)
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
> index 91101f2..fbb93db 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
> @@ -772,6 +772,10 @@ vec4_generator::generate_code(exec_list *instructions)
>  brw_set_acc_write_control(p, 0);
>  break;
>
> +  case BRW_OPCODE_MAD:
> + brw_MAD(p, dst, src[0], src[1], src[2]);
> + break;
> +
>case BRW_OPCODE_FRC:
>  brw_FRC(p, dst, src[0]);
>  break;
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 33c1b24..451f7d5 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -1250,6 +1250,38 @@ vec4_visitor::try_emit_sat(ir_expression *ir)
> return true;
>  }
>
> +bool
> +vec4_visitor::try_emit_mad(ir_expression *ir, int mul_arg)
> +{
> +   /* 3-src instructions were introduced in gen6. */
> +   if (intel->gen < 6)
> +  return false;
> +
> +   /* MAD can only handle floating-point data. */
> +   if (ir->type->base_type != GLSL_TYPE_FLOAT)
> +  return false;
> +
> +   ir_rvalue *nonmul = ir->operands[1 - mul_arg];
> +   ir_expression *mul = ir->operands[mul_arg]->as_expression();
> +
> +   if (!mul || mul->operation != ir_binop_mul)
> +  return false;
> +
> +   nonmul->accept(this);
> +   src_reg src0 = fix_3src_operand(this->result);
> +
> +   mul->operands[0]->accept(this);
> +   src_reg src1 = fix_3src_operand(this->result);
> +
> +   mul->operands[1]->accept(this);
> +   src_reg src2 = fix_3src_operand(this->result);
> +
> +   this->result = src_reg(this, ir->type);
> +   emit(BRW_OPCODE_MAD, dst_reg(this->result), src0, src1, src2);
> +
> +   return true;
> +}
> +
>  void
>  vec4_visitor::emit_bool_comparison(unsigned int op,
>  dst_reg dst, src_reg src0, src_reg src1)
> @@ -1293,6 +1325,11 @@ vec4_visitor::visit(ir_expression *ir)
> if (try_emit_sat(ir))
>return;
>
> +   if (ir->operation == ir_binop_add) {
> +  if (try_emit_mad(ir, 0) || try_emit_mad(ir, 1))
> +return;
> +   }
> +
> for (operand = 0; operand < ir->get_num_operands(); operand++) {
>this->result.file = BAD_FILE;
>ir->operands[operand]->accept(this);
> --
> 1.8.3.rc0

Nice. I thought I'd poked some holes a few times in reviewing these,
but it all turned out to make sense.

Series is
Reviewed-by: Matt Turner 


Re: [Mesa-dev] [PATCH 3/3] i965/vs: Avoid the MUL/MACH/MOV sequence for small integer multiplies.

2013-06-07 Thread Kenneth Graunke

On 06/07/2013 06:55 PM, Eric Anholt wrote:
> We do a lot of multiplies by 3 or 4 for skinning shaders, and we can avoid
> the sequence if we just move them into the right argument of the MUL.
>
> On SNB, this means reliably putting a constant in a position where it
> can't be constant folded, but that's still better than MUL/MACH/MOV.
>
> Improves GLB 2.7 trex performance by 0.788648% +/- 0.23865% (n=29/30)
> ---
>   src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 50 +++---
>   1 file changed, 37 insertions(+), 13 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index 451f7d5..3c453eb 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -1313,6 +1313,20 @@ vec4_visitor::emit_minmax(uint32_t conditionalmod,
> dst_reg dst,
>  }
>   }
>
> +static bool
> +is_16bit_constant(ir_rvalue *rvalue)
> +{
> +   ir_constant *constant = rvalue->as_constant();
> +   if (!constant)
> +  return false;
> +
> +   if (constant->type != glsl_type::int_type &&
> +   constant->type != glsl_type::uint_type)
> +  return false;
> +
> +   return constant->value.u[0] < (1 << 16);
> +}
> +
>   void
>   vec4_visitor::visit(ir_expression *ir)
>   {
> @@ -1472,19 +1486,29 @@ vec4_visitor::visit(ir_expression *ir)
>
>  case ir_binop_mul:
> if (ir->type->is_integer()) {
> -/* For integer multiplication, the MUL uses the low 16 bits
> - * of one of the operands (src0 on gen6, src1 on gen7).  The
> - * MACH accumulates in the contribution of the upper 16 bits
> - * of that operand.
> - *
> - * FINISHME: Emit just the MUL if we know an operand is small
> - * enough.
> - */
> -struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_D);
> -
> -emit(MUL(acc, op[0], op[1]));
> -emit(MACH(dst_null_d(), op[0], op[1]));
> -emit(MOV(result_dst, src_reg(acc)));
> +/* For integer multiplication, the MUL uses the low 16 bits of one of
> + * the operands (src0 on gen6, src1 on gen7).  The MACH accumulates
> + * in the contribution of the upper 16 bits of that operand.  If we
> + * can determine that one of the args is in the low 16 bits, though,
> + * we can just emit a single MUL.
> +  */
> + if (is_16bit_constant(ir->operands[0])) {
> +if (intel->gen == 6)
> +   emit(MUL(result_dst, op[0], op[1]));
> +else
> +   emit(MUL(result_dst, op[1], op[0]));

This will take the IVB path on Gen4-5, which is wrong.  Gen4-6 use the
low 16-bits of src0, while Gen7 uses src1.

> + } else if (is_16bit_constant(ir->operands[1])) {
> +if (intel->gen == 6)
> +   emit(MUL(result_dst, op[1], op[0]));
> +else
> +   emit(MUL(result_dst, op[0], op[1]));

Ditto.  Assuming you fix that (and the comment and commit message), this is:
Reviewed-by: Kenneth Graunke

(and hey, it looks like we both wrote the VS MAD code in the same week!
I just hadn't sent it out since it didn't seem to help the apps I was
looking at.  ah well...)

> + } else {
> +struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_D);
> +
> +emit(MUL(acc, op[0], op[1]));
> +emit(MACH(dst_null_d(), op[0], op[1]));
> +emit(MOV(result_dst, src_reg(acc)));
> + }
> } else {
>  emit(MUL(result_dst, op[0], op[1]));
> }
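
The operand-ordering rule from the review comment above (Gen4-6 read the
low 16 bits of src0, Gen7 reads them from src1) can be sketched as a
standalone C helper.  The struct and the function name order_mul_operands
are hypothetical, and the sketch says nothing about generations newer
than Gen7, which this thread does not cover.

```c
#include <assert.h>

/* Which original operand (0 or 1) ends up in each MUL source slot. */
struct mul_operands {
   int src0;
   int src1;
};

/* narrow_op is the index of the operand known to fit in 16 bits.
 * Before Gen7 it has to sit in src0; on Gen7 it has to sit in src1. */
static struct mul_operands order_mul_operands(int gen, int narrow_op)
{
   struct mul_operands m;
   int wide_op;

   assert(narrow_op == 0 || narrow_op == 1);
   wide_op = 1 - narrow_op;

   if (gen < 7) {
      m.src0 = narrow_op;
      m.src1 = wide_op;
   } else {
      m.src0 = wide_op;
      m.src1 = narrow_op;
   }
   return m;
}
```

Testing "gen < 7" instead of "gen == 6" is what keeps Gen4-5 off the Gen7
path.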



[Mesa-dev] [PATCH 1/3] i965/vs: Use the MAD instruction when possible.

2013-06-07 Thread Eric Anholt
This is different from how we do it in the FS - we are using MAD even when
some of the args are constants, because with the relatively unrestrained
ability to schedule a MOV to prepare a temporary with that data, we can
get lower latency for the sequence of instructions.

No significant performance difference on GLB2.7 trex (n=33/34), though it
doesn't have that many MADs.  I noticed MAD opportunities while reading
the code for the DOTA2 bug.
---
 src/mesa/drivers/dri/i965/brw_vec4.h   |  1 +
 .../drivers/dri/i965/brw_vec4_copy_propagation.cpp |  1 +
 src/mesa/drivers/dri/i965/brw_vec4_emit.cpp|  4 +++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 37 ++
 4 files changed, 43 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4.h 
b/src/mesa/drivers/dri/i965/brw_vec4.h
index e6e59bc..a72d694 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4.h
+++ b/src/mesa/drivers/dri/i965/brw_vec4.h
@@ -468,6 +468,7 @@ public:
int base_offset);
 
bool try_emit_sat(ir_expression *ir);
+   bool try_emit_mad(ir_expression *ir, int mul_arg);
void resolve_ud_negate(src_reg *reg);
 
src_reg get_timestamp();
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 39eef4b..1a667eb 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -216,6 +216,7 @@ vec4_visitor::try_copy_propagation(struct intel_context 
*intel,
   return false;
 
bool is_3src_inst = (inst->opcode == BRW_OPCODE_LRP ||
+inst->opcode == BRW_OPCODE_MAD ||
 inst->opcode == BRW_OPCODE_BFE ||
 inst->opcode == BRW_OPCODE_BFI2);
if (is_3src_inst && value.file == UNIFORM)
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
index 91101f2..fbb93db 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_emit.cpp
@@ -772,6 +772,10 @@ vec4_generator::generate_code(exec_list *instructions)
 brw_set_acc_write_control(p, 0);
 break;
 
+  case BRW_OPCODE_MAD:
+ brw_MAD(p, dst, src[0], src[1], src[2]);
+ break;
+
   case BRW_OPCODE_FRC:
 brw_FRC(p, dst, src[0]);
 break;
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 33c1b24..451f7d5 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1250,6 +1250,38 @@ vec4_visitor::try_emit_sat(ir_expression *ir)
return true;
 }
 
+bool
+vec4_visitor::try_emit_mad(ir_expression *ir, int mul_arg)
+{
+   /* 3-src instructions were introduced in gen6. */
+   if (intel->gen < 6)
+  return false;
+
+   /* MAD can only handle floating-point data. */
+   if (ir->type->base_type != GLSL_TYPE_FLOAT)
+  return false;
+
+   ir_rvalue *nonmul = ir->operands[1 - mul_arg];
+   ir_expression *mul = ir->operands[mul_arg]->as_expression();
+
+   if (!mul || mul->operation != ir_binop_mul)
+  return false;
+
+   nonmul->accept(this);
+   src_reg src0 = fix_3src_operand(this->result);
+
+   mul->operands[0]->accept(this);
+   src_reg src1 = fix_3src_operand(this->result);
+
+   mul->operands[1]->accept(this);
+   src_reg src2 = fix_3src_operand(this->result);
+
+   this->result = src_reg(this, ir->type);
+   emit(BRW_OPCODE_MAD, dst_reg(this->result), src0, src1, src2);
+
+   return true;
+}
+
 void
 vec4_visitor::emit_bool_comparison(unsigned int op,
 dst_reg dst, src_reg src0, src_reg src1)
@@ -1293,6 +1325,11 @@ vec4_visitor::visit(ir_expression *ir)
if (try_emit_sat(ir))
   return;
 
+   if (ir->operation == ir_binop_add) {
+  if (try_emit_mad(ir, 0) || try_emit_mad(ir, 1))
+return;
+   }
+
for (operand = 0; operand < ir->get_num_operands(); operand++) {
   this->result.file = BAD_FILE;
   ir->operands[operand]->accept(this);
-- 
1.8.3.rc0
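
The pattern match performed by try_emit_mad() can be illustrated on a toy
expression tree in plain C.  This is a sketch only: the expr and mad_match
types and the function try_match_mad are invented for the example, and the
real code emits an instruction instead of filling in a struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum expr_op { EXPR_LEAF, EXPR_ADD, EXPR_MUL };

/* Hypothetical miniature IR node. */
struct expr {
   enum expr_op op;
   bool is_float;
   const struct expr *operands[2];
};

/* Operands of the fused multiply-add: addend + mul0 * mul1. */
struct mad_match {
   const struct expr *addend;
   const struct expr *mul0;
   const struct expr *mul1;
};

/* Checks whether operand mul_arg of an add is itself a multiply, under
 * the same preconditions as the patch: gen6 or later, float data only. */
static bool try_match_mad(const struct expr *ir, int mul_arg, int gen,
                          struct mad_match *out)
{
   const struct expr *mul;

   assert(mul_arg == 0 || mul_arg == 1);

   if (gen < 6 || !ir->is_float || ir->op != EXPR_ADD)
      return false;

   mul = ir->operands[mul_arg];
   if (mul == NULL || mul->op != EXPR_MUL)
      return false;

   out->addend = ir->operands[1 - mul_arg];
   out->mul0 = mul->operands[0];
   out->mul1 = mul->operands[1];
   return true;
}
```

Calling it with mul_arg 0 and then 1, as the visitor does, covers both
"a * b + c" and "c + a * b".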



[Mesa-dev] [PATCH 3/3] i965/vs: Avoid the MUL/MACH/MOV sequence for small integer multiplies.

2013-06-07 Thread Eric Anholt
We do a lot of multiplies by 3 or 4 for skinning shaders, and we can avoid
the sequence if we just move them into the right argument of the MUL.

On SNB, this means reliably putting a constant in a position where it
can't be constant folded, but that's still better than MUL/MACH/MOV.

Improves GLB 2.7 trex performance by 0.788648% +/- 0.23865% (n=29/30)
---
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 50 +++---
 1 file changed, 37 insertions(+), 13 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index 451f7d5..3c453eb 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1313,6 +1313,20 @@ vec4_visitor::emit_minmax(uint32_t conditionalmod, 
dst_reg dst,
}
 }
 
+static bool
+is_16bit_constant(ir_rvalue *rvalue)
+{
+   ir_constant *constant = rvalue->as_constant();
+   if (!constant)
+  return false;
+
+   if (constant->type != glsl_type::int_type &&
+   constant->type != glsl_type::uint_type)
+  return false;
+
+   return constant->value.u[0] < (1 << 16);
+}
+
 void
 vec4_visitor::visit(ir_expression *ir)
 {
@@ -1472,19 +1486,29 @@ vec4_visitor::visit(ir_expression *ir)
 
case ir_binop_mul:
   if (ir->type->is_integer()) {
-/* For integer multiplication, the MUL uses the low 16 bits
- * of one of the operands (src0 on gen6, src1 on gen7).  The
- * MACH accumulates in the contribution of the upper 16 bits
- * of that operand.
- *
- * FINISHME: Emit just the MUL if we know an operand is small
- * enough.
- */
-struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_D);
-
-emit(MUL(acc, op[0], op[1]));
-emit(MACH(dst_null_d(), op[0], op[1]));
-emit(MOV(result_dst, src_reg(acc)));
+/* For integer multiplication, the MUL uses the low 16 bits of one of
+ * the operands (src0 on gen6, src1 on gen7).  The MACH accumulates
+ * in the contribution of the upper 16 bits of that operand.  If we
+ * can determine that one of the args is in the low 16 bits, though,
+ * we can just emit a single MUL.
+  */
+ if (is_16bit_constant(ir->operands[0])) {
+if (intel->gen == 6)
+   emit(MUL(result_dst, op[0], op[1]));
+else
+   emit(MUL(result_dst, op[1], op[0]));
+ } else if (is_16bit_constant(ir->operands[1])) {
+if (intel->gen == 6)
+   emit(MUL(result_dst, op[1], op[0]));
+else
+   emit(MUL(result_dst, op[0], op[1]));
+ } else {
+struct brw_reg acc = retype(brw_acc_reg(), BRW_REGISTER_TYPE_D);
+
+emit(MUL(acc, op[0], op[1]));
+emit(MACH(dst_null_d(), op[0], op[1]));
+emit(MOV(result_dst, src_reg(acc)));
+ }
   } else {
 emit(MUL(result_dst, op[0], op[1]));
   }
-- 
1.8.3.rc0
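
The arithmetic behind this shortcut can be checked with a few lines of
plain C.  This models only the math (a 32-bit product built from two
partial products against the 16-bit halves of one operand), not the
hardware's accumulator behavior, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same test as is_16bit_constant(), on a bare value. */
static bool fits_in_16_bits(uint32_t v)
{
   return v < (1u << 16);
}

/* Low 32 bits of a * b, assembled from the low and high halves of b.
 * The second term is the part a MUL-only sequence would be missing. */
static uint32_t mul_via_halves(uint32_t a, uint32_t b)
{
   uint32_t low_part = a * (b & 0xffffu);
   uint32_t high_part = a * (b >> 16);

   return low_part + (high_part << 16);
}

/* When b fits in 16 bits its high half is zero, so the first partial
 * product alone is already the full result. */
static uint32_t mul_single(uint32_t a, uint32_t b)
{
   assert(fits_in_16_bits(b));
   return a * (b & 0xffffu);
}
```

For a multiplier such as 3 or 4, the high half contributes nothing, which
is why the single-MUL form is sufficient in that case.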



[Mesa-dev] [PATCH 2/3] i965/vs: Allow copy propagation into MUL/MACH.

2013-06-07 Thread Eric Anholt
This is a trivial port of 1d6ead38042cc0d1e667d8ff55937c1e32d108b1 from
the FS.

No significant performance difference on trex (misplaced the data, but it
was about n=20).
---
 src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
index 1a667eb..64f6ccc 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
@@ -95,6 +95,7 @@ try_constant_propagation(vec4_instruction *inst, int arg, 
src_reg *values[4])
   inst->src[arg] = value;
   return true;
 
+   case BRW_OPCODE_MACH:
case BRW_OPCODE_MUL:
case BRW_OPCODE_ADD:
   if (arg == 1) {
@@ -102,9 +103,10 @@ try_constant_propagation(vec4_instruction *inst, int arg, 
src_reg *values[4])
 return true;
   } else if (arg == 0 && inst->src[1].file != IMM) {
 /* Fit this constant in by commuting the operands.  Exception: we
- * can't do this for 32-bit integer MUL because it's asymmetric.
+ * can't do this for 32-bit integer MUL/MACH because it's asymmetric.
  */
-if (inst->opcode == BRW_OPCODE_MUL &&
+if ((inst->opcode == BRW_OPCODE_MUL ||
+  inst->opcode == BRW_OPCODE_MACH) &&
 (inst->src[1].type == BRW_REGISTER_TYPE_D ||
  inst->src[1].type == BRW_REGISTER_TYPE_UD))
break;
-- 
1.8.3.rc0
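
The commuting rule in the second hunk can be restated as a predicate in
plain C.  This is a sketch with hypothetical enums and a hypothetical
function name; it only captures the "may we swap the operands to make room
for an immediate" decision, not the rest of constant propagation.

```c
#include <assert.h>
#include <stdbool.h>

enum opcode { OPCODE_ADD, OPCODE_MUL, OPCODE_MACH };
enum reg_type { TYPE_F, TYPE_D, TYPE_UD };

/* An immediate headed for source 0 can only be placed by swapping the
 * two sources.  That is fine for commutative operations, but 32-bit
 * integer MUL/MACH treat their two sources differently, so the swap is
 * refused for them. */
static bool can_commute_for_immediate(enum opcode op,
                                      enum reg_type src1_type,
                                      bool src1_is_immediate)
{
   bool is_int32;

   assert(op == OPCODE_ADD || op == OPCODE_MUL || op == OPCODE_MACH);

   /* Source 1 already holds an immediate: nowhere to swap to. */
   if (src1_is_immediate)
      return false;

   is_int32 = (src1_type == TYPE_D || src1_type == TYPE_UD);
   if ((op == OPCODE_MUL || op == OPCODE_MACH) && is_int32)
      return false;

   return true;
}
```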



[Mesa-dev] [Bug 65525] New: [llvmpipe] lp_scene.h:210:lp_scene_alloc: Assertion `size <= (64 * 1024)' failed.

2013-06-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65525

  Priority: medium
Bug ID: 65525
  Keywords: have-backtrace, regression
CC: e...@anholt.net
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: [llvmpipe] lp_scene.h:210:lp_scene_alloc: Assertion
`size <= (64 * 1024)' failed.
  Severity: critical
Classification: Unclassified
OS: Linux (All)
  Reporter: v...@freedesktop.org
  Hardware: x86-64 (AMD64)
Status: NEW
   Version: git
 Component: Other
   Product: Mesa

mesa: 3c21a7d3c9b626a10a38987d77b9e77d70bd67d7

Run piglit arb_uniform_buffer_object-maxuniformblocksize fsexceed on llvmpipe.

$ ./bin/arb_uniform_buffer_object-maxuniformblocksize fsexceed -auto
Max uniform block size: 518144
Testing FS with uniform block vec4 v[32385]
src/gallium/drivers/llvmpipe/lp_scene.h:210:lp_scene_alloc: Assertion `size <=
(64 * 1024)' failed.
Trace/breakpoint trap (core dumped)
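
The numbers in the output above are consistent with each other, which a
short C check makes visible.  This is only arithmetic on the values in the
report, assuming a vec4 is four 4-byte floats; the helper names are
hypothetical and nothing here is llvmpipe code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The limit from the failed assertion: size <= (64 * 1024). */
#define SCENE_ALLOC_LIMIT ((size_t)64 * 1024)

/* Bytes occupied by an array of vec4, assuming 4-byte floats. */
static size_t vec4_array_bytes(size_t count)
{
   assert(sizeof(float) == 4);
   return count * 4 * sizeof(float);
}

static bool fits_scene_alloc(size_t size)
{
   return size <= SCENE_ALLOC_LIMIT;
}
```

vec4_array_bytes(32385) is 518160, the size passed to lp_scene_alloc in
frame #1 of the backtrace, and it is one vec4 (16 bytes) above the
reported maximum block size of 518144.  Either value is far above the
65536-byte limit in the assertion.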

(gdb) bt
#0  0x7fbfc2ed2b9a in _debug_assert_fail (expr=0x7fbfc39abba8 "size <= (64
* 1024)", file=0x7fbfc39abb80 "src/gallium/drivers/llvmpipe/lp_scene.h", 
line=210, function=0x7fbfc39ac1b7 "lp_scene_alloc") at
src/gallium/auxiliary/util/u_debug.c:278
#1  0x7fbfc2b79c75 in lp_scene_alloc (scene=0x7fbfc5ca3010, size=518160) at
src/gallium/drivers/llvmpipe/lp_scene.h:210
#2  0x7fbfc2b7c3d9 in try_update_scene_state (setup=0x1291b90) at
src/gallium/drivers/llvmpipe/lp_setup.c:942
#3  0x7fbfc2b7a825 in begin_binning (setup=0x1291b90) at
src/gallium/drivers/llvmpipe/lp_setup.c:195
#4  0x7fbfc2b7abf0 in set_scene_state (setup=0x1291b90,
new_state=SETUP_ACTIVE, reason=0x7fbfc39ac160 "lp_setup_update_state")
at src/gallium/drivers/llvmpipe/lp_setup.c:304
#5  0x7fbfc2b7c9dd in lp_setup_update_state (setup=0x1291b90,
update_scene=1 '\001') at src/gallium/drivers/llvmpipe/lp_setup.c:1072
#6  0x7fbfc2b84cc3 in lp_setup_draw_arrays (vbr=0x1291b90, start=0, nr=4)
at src/gallium/drivers/llvmpipe/lp_setup_vbuf.c:344
#7  0x7fbfc2f80a96 in draw_pt_emit_linear (emit=0x128e980,
vert_info=0x7fffbf28b020, prim_info=0x7fffbf28b100)
at src/gallium/auxiliary/draw/draw_pt_emit.c:268
#8  0x7fbfc2f7841e in emit (emit=0x128e980, vert_info=0x7fffbf28b020,
prim_info=0x7fffbf28b100)
at src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:304
#9  0x7fbfc2f788df in llvm_pipeline_generic (middle=0x128e830,
fetch_info=0x0, in_prim_info=0x7fffbf28b100)
at src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:432
#10 0x7fbfc2f78a26 in llvm_middle_end_linear_run (middle=0x128e830,
start=0, count=4, prim_flags=0)
at src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:494
#11 0x7fbfc2ea4fcb in vsplit_segment_simple_linear (vsplit=0x128b9e0,
flags=0, istart=0, icount=4)
at src/gallium/auxiliary/draw/draw_pt_vsplit_tmp.h:234
#12 0x7fbfc2ea52d0 in vsplit_run_linear (frontend=0x128b9e0, start=0,
count=4) at src/gallium/auxiliary/draw/draw_split_tmp.h:60
#13 0x7fbfc2e997a5 in draw_pt_arrays (draw=0x127c840, prim=7, start=0,
count=4) at src/gallium/auxiliary/draw/draw_pt.c:149
#14 0x7fbfc2e9a518 in draw_vbo (draw=0x127c840, info=0x7fffbf28b270) at
src/gallium/auxiliary/draw/draw_pt.c:532
#15 0x7fbfc2b6a9ec in llvmpipe_draw_vbo (pipe=0x127ac60,
info=0x7fffbf28b390) at src/gallium/drivers/llvmpipe/lp_draw_arrays.c:126
#16 0x7fbfc2e829e0 in cso_draw_vbo (cso=0x1357730, info=0x7fffbf28b390) at
src/gallium/auxiliary/cso_cache/cso_context.c:1406
#17 0x7fbfc2ceade9 in st_draw_vbo (ctx=0x12eaac0, prims=0x7fffbf28b460,
nr_prims=1, ib=0x0, index_bounds_valid=1 '\001', min_index=0, max_index=3, 
tfb_vertcount=0x0) at src/mesa/state_tracker/st_draw.c:286
#18 0x7fbfc2daa204 in vbo_draw_arrays (ctx=0x12eaac0, mode=7, start=0,
count=4, numInstances=1, baseInstance=0) at src/mesa/vbo/vbo_exec_array.c:624
#19 0x7fbfc2daabcb in vbo_exec_DrawArrays (mode=7, start=0, count=4) at
src/mesa/vbo/vbo_exec_array.c:776
#20 0x7fbfc59486d0 in stub_glDrawArrays (mode=7, first=0, count=4) at
piglit/tests/util/generated_dispatch.c:5673
#21 0x7fbfc59a75d9 in piglit_draw_rect (x=-1, y=-1, w=2, h=2) at
piglit/tests/util/piglit-util-gl.c:870
#22 0x0040168e in piglit_display () at
piglit/tests/spec/arb_uniform_buffer_object/maxuniformblocksize.c:200
#23 0x7fbfc593c030 in display () at
piglit/tests/util/piglit-framework-gl/piglit_glut_framework.c:60
#24 0x7fbfc50e9137 in fghRedrawWindow (window=0x1277090) at
freeglut_main.c:210
#25 fghcbDisplayWindow (window=0x1277090, enumerator=0x7fffbf28b6b0) at
freeglut_main.c:227
#26 0x7fbfc50ec889 in fgEnumWindows (enumCallback=0x7fbfc50e90d0
, enumerator=0x7fffbf28b6b0) at freeglut_structure.c:394
#27 0x7fbfc50e95fa in fghDisplayAll () at freeglut_main.c:249
#28 glutMainLoopEvent () at freeglut_main.c:1450
#29 0x7fbfc50e9f05 in glutMainLoop () at freegl

Re: [Mesa-dev] R600/SI: Intrinsics for derivatives

2013-06-07 Thread Tom Stellard
On Fri, Jun 07, 2013 at 05:24:42PM +0200, Michel Dänzer wrote:
> 
> The most important difference to the previous version of these is that
> whole quad mode is now enabled and M0 initialized appropriately for the
> LDS instructions, which now allows all of the relevant piglit tests to
> pass.
>

Hi Michel,

After I gave this series my r-b, I was reviewing your Mesa patches, and
I suddenly had an idea for a better way to implement this.  See my
comments below:

> From bb5adcd52cc5cadc308e85f635675199f5c02f35 Mon Sep 17 00:00:00 2001
> From: Michel Dänzer
> Date: Thu, 21 Feb 2013 17:56:22 +0100
> Subject: [PATCH 3/3] R600/SI: Support AMDGPU.ddx/y intrinsics
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
> 
> Use LDS for calculating the deltas between neighbouring pixels.
> 
> Signed-off-by: Michel Dänzer 
> ---
>  lib/Target/R600/SIISelLowering.cpp | 77 
> +-
>  lib/Target/R600/SIISelLowering.h   |  6 +++
>  lib/Target/R600/SIInstructions.td  | 42 -
>  3 files changed, 121 insertions(+), 4 deletions(-)
> 
> diff --git a/lib/Target/R600/SIISelLowering.cpp 
> b/lib/Target/R600/SIISelLowering.cpp
> index ac6a4c3..7ea226a 100644
> --- a/lib/Target/R600/SIISelLowering.cpp
> +++ b/lib/Target/R600/SIISelLowering.cpp
> @@ -249,7 +249,7 @@ SDValue SITargetLowering::LowerFormalArguments(
>  
>  MachineBasicBlock * SITargetLowering::EmitInstrWithCustomInserter(
>  MachineInstr * MI, MachineBasicBlock * BB) const {
> -
> +  MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
>MachineBasicBlock::iterator I = *MI;
>  
>switch (MI->getOpcode()) {
> @@ -257,7 +257,6 @@ MachineBasicBlock * 
> SITargetLowering::EmitInstrWithCustomInserter(
>  return AMDGPUTargetLowering::EmitInstrWithCustomInserter(MI, BB);
>case AMDGPU::BRANCH: return BB;
>case AMDGPU::SI_ADDR64_RSRC: {
> -MachineRegisterInfo &MRI = BB->getParent()->getRegInfo();
>  unsigned SuperReg = MI->getOperand(0).getReg();
>  unsigned SubRegLo = MRI.createVirtualRegister(&AMDGPU::SReg_64RegClass);
>  unsigned SubRegHi = MRI.createVirtualRegister(&AMDGPU::SReg_64RegClass);
> @@ -282,10 +281,84 @@ MachineBasicBlock * 
> SITargetLowering::EmitInstrWithCustomInserter(
>  MI->eraseFromParent();
>  break;
>}
> +  case AMDGPU::SI_DD:
> +LowerSI_DD(MI, *BB, I, MRI);
> +break;
> +  case AMDGPU::SI_TID:
> +LowerSI_TID(MI, *BB, I, MRI);
> +break;
>}
>return BB;
>  }
>  
> +void SITargetLowering::LowerSI_DD(MachineInstr *MI, MachineBasicBlock &BB,
> +MachineBasicBlock::iterator I, MachineRegisterInfo & MRI) const {
> +  unsigned coord0 = MRI.createVirtualRegister(&AMDGPU::VReg_32RegClass);
> +  unsigned coord1 = MRI.createVirtualRegister(&AMDGPU::VReg_32RegClass);
> +  MachineOperand dst = MI->getOperand(0);
> +  MachineOperand coord = MI->getOperand(1);
> +  MachineOperand ldsaddr = MI->getOperand(2);
> +  MachineOperand ldsaddr0 = MI->getOperand(3);
> +  MachineOperand ldsdelta = MI->getOperand(4);
> +
> +  // Write this thread's coordinate to LDS
> +  BuildMI(BB, I, BB.findDebugLoc(I), TII->get(AMDGPU::DS_WRITE_B32))
> +  .addOperand(coord)
> +  .addImm(0) // LDS
> +  .addOperand(ldsaddr)
> +  .addOperand(coord)
> +  .addOperand(coord)
> +  .addImm(0)
> +  .addImm(0);
> +
> +  // Read top right / bottom left thread's coordinate from LDS
> +  BuildMI(BB, I, BB.findDebugLoc(I), TII->get(AMDGPU::DS_READ_B32), coord0)
> +  .addImm(0) // LDS
> +  .addOperand(ldsaddr0)
> +  .addOperand(ldsaddr0)
> +  .addOperand(ldsaddr0)
> +  .addOperand(ldsdelta)
> +  .addImm(0);
> +
> +  // Read top left thread's coordinate from LDS
> +  BuildMI(BB, I, BB.findDebugLoc(I), TII->get(AMDGPU::DS_READ_B32), coord1)
> +  .addImm(0) // LDS
> +  .addOperand(ldsaddr0)
> +  .addOperand(ldsaddr0)
> +  .addOperand(ldsaddr0)
> +  .addImm(0)
> +  .addImm(0);
> +
> +  // Subtract top left coordinate from top right / bottom left
> +  BuildMI(BB, I, BB.findDebugLoc(I), TII->get(AMDGPU::V_SUB_F32_e32))
> +  .addOperand(dst)
> +  .addReg(coord0)
> +  .addReg(coord1);
> +
> +  MI->eraseFromParent();
> +}
> +
> +void SITargetLowering::LowerSI_TID(MachineInstr *MI, MachineBasicBlock &BB,
> +MachineBasicBlock::iterator I, MachineRegisterInfo & MRI) const {
> +  unsigned mbcnt_lo = MRI.createVirtualRegister(&AMDGPU::VReg_32RegClass);
> +  MachineOperand dst = MI->getOperand(0);
> +
> +  // Get this thread's ID
> +  BuildMI(BB, I, BB.findDebugLoc(I), 
> TII->get(AMDGPU::V_MBCNT_LO_U32_B32_e64), mbcnt_lo)
> +  .addImm(0x)
> +  .addImm(0)
> +  .addImm(0)
> +  .addImm(0)
> +  .addImm(0)
> +  .addImm(0);
> +  BuildMI(BB, I, BB.findDebugLoc(I), 
> TII->get(AMDGPU::V_MBCNT_HI_U32_B32_e32)

Re: [Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Grigori Goronzy

On 08.06.2013 00:40, Marek Olšák wrote:

Also the fast clear
shouldn't be used for array, cube, and 3D textures unless all layers
are cleared together.



OK. I hadn't really thought about these.


> One more thing. If you don't use piglit, I recommend using it before
> sending patches to the mailing list. If you send a patch, I always
> assume there are no piglit regressions on your hardware+driver
> combination.



I'll try to set up piglit for regression testing later.

Best regards
Grigori

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] R600/SI: Intrinsics for derivatives

2013-06-07 Thread Tom Stellard
On Fri, Jun 07, 2013 at 05:24:42PM +0200, Michel Dänzer wrote:
> 
> The most important difference to the previous version of these is that
> whole quad mode is now enabled and M0 initialized appropriately for the
> LDS instructions, which now allows all of the relevant piglit tests to
> pass.
> 
>

For the series:

Reviewed-by: Tom Stellard 


Re: [Mesa-dev] [PATCH] Fixed bug in unclamped float to ubyte conversion.

2013-06-07 Thread Stéphane Marchesin
Ping, does anyone else want to review this patch?

Stéphane


On Fri, May 10, 2013 at 3:56 PM, Manfred Ernst  wrote:
> Problem: The IEEE float optimized version of UNCLAMPED_FLOAT_TO_UBYTE in
> macros.h computed incorrect results for inputs in the range 0x3f7f0000
> (=0.99609375) to 0x3f7f7f80 (=0.99803924560546875) inclusive.  0x3f7f7f80
> is the IEEE float value that results in 254.5 when multiplied by 255.  With
> rounding mode "round to closest even integer", this is the largest float in
> the range 0.0-1.0 that is converted to 254 by the generic implementation of
> UNCLAMPED_FLOAT_TO_UBYTE.  The IEEE float optimized version incorrectly
> defined the cut-off for mapping to 255 as 0x3f7f0000 (=255.0/256.0). The
> same bug was present in the function float_to_ubyte in u_math.h.
>
> Fix: The proposed fix replaces the incorrect cut-off value by 0x3f800000,
> which is the IEEE float representation of 1.0f. 0x3f7f7f81 (or any value in
> between) would also work, but 1.0f is probably cleaner.
>
> The patch does not regress piglit on llvmpipe and on i965 on sandy bridge.
> Tested-by Stéphane Marchesin 
> Reviewed-by Stéphane Marchesin 
> ---
>  src/gallium/auxiliary/util/u_math.h | 3 +--
>  src/mesa/main/macros.h  | 3 +--
>  2 files changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_math.h b/src/gallium/auxiliary/util/u_math.h
> index 607fbec..64d16cb 100644
> --- a/src/gallium/auxiliary/util/u_math.h
> +++ b/src/gallium/auxiliary/util/u_math.h
> @@ -540,14 +540,13 @@ ubyte_to_float(ubyte ub)
>  static INLINE ubyte
>  float_to_ubyte(float f)
>  {
> -   const int ieee_0996 = 0x3f7f0000;   /* 0.996 or so */
> union fi tmp;
>
> tmp.f = f;
> if (tmp.i < 0) {
>return (ubyte) 0;
> }
> -   else if (tmp.i >= ieee_0996) {
> +   else if (tmp.i >= 0x3f800000 /* 1.0f */) {
>return (ubyte) 255;
> }
> else {
> diff --git a/src/mesa/main/macros.h b/src/mesa/main/macros.h
> index ac24672..8f5b5ae 100644
> --- a/src/mesa/main/macros.h
> +++ b/src/mesa/main/macros.h
> @@ -142,7 +142,6 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256];
>   *** CLAMPED_FLOAT_TO_UBYTE: map float known to be in [0,1] to ubyte in [0,255]
>   ***/
>  #if defined(USE_IEEE) && !defined(DEBUG)
> -#define IEEE_0996 0x3f7f0000   /* 0.996 or so */
>  /* This function/macro is sensitive to precision.  Test very carefully
>   * if you change it!
>   */
> @@ -152,7 +151,7 @@ extern GLfloat _mesa_ubyte_to_float_color_tab[256];
> __tmp.f = (F);  \
> if (__tmp.i < 0)\
>UB = (GLubyte) 0;  \
> -   else if (__tmp.i >= IEEE_0996)  \
> +   else if (__tmp.i >= IEEE_ONE)   \
>UB = (GLubyte) 255;  \
> else {  \
>__tmp.f = __tmp.f * (255.0F/256.0F) + 32768.0F;  \
> --
> 1.8.2.1
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
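
The cut-off problem described in the quoted patch can be reproduced with a small stand-alone sketch. Everything here is illustrative rather than Mesa's code: the helper names (`float_bits`, `bits_float`, `ref_float_to_ubyte`, `fast_float_to_ubyte`) are hypothetical, the cut-off is passed as a parameter so the old and proposed values can be compared side by side, and the reference path rounds by adding 0.5 and truncating, which is an assumption that differs from round-to-nearest-even only at exact ties.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a signed 32-bit integer. */
static int32_t float_bits(float f)
{
   int32_t i;
   memcpy(&i, &f, sizeof i);
   return i;
}

/* Build a float from a raw 32-bit pattern. */
static float bits_float(int32_t i)
{
   float f;
   memcpy(&f, &i, sizeof f);
   return f;
}

/* Reference conversion: clamp to [0,1], scale by 255, round half up
 * (assumption: simplified rounding, see the note above). */
static uint8_t ref_float_to_ubyte(float f)
{
   if (!(f > 0.0f))
      return 0;
   if (f >= 1.0f)
      return 255;
   return (uint8_t) (f * 255.0f + 0.5f);
}

/* Bit-trick conversion in the style of the patched code.  Adding 32768.0f
 * pushes the scaled value into the low mantissa bits, so the low byte of
 * the result's bit pattern is the rounded 0..255 value.  Inputs whose bit
 * pattern is >= cutoff_bits are mapped straight to 255. */
static uint8_t fast_float_to_ubyte(float f, int32_t cutoff_bits)
{
   int32_t i = float_bits(f);

   if (i < 0)
      return 0;
   if (i >= cutoff_bits)
      return 255;
   f = f * (255.0f / 256.0f) + 32768.0f;
   return (uint8_t) float_bits(f);
}
```

With the old cut-off, the input 0x3f7f0000 is forced to 255 even though the reference path gives 254; passing 0x3f800000 as the cut-off makes the two paths agree for that input.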


[Mesa-dev] [Bug 65423] Remove gl_config::haveDepthBuffer, haveAccumBuffer, haveStencilBuffer fields

2013-06-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65423

--- Comment #2 from Brian Paul  ---
(In reply to comment #1)
> struct gl_config doesn't have member accumBits, but it has accumR, accumG
> and accumB members. Maybe I should change `if(visual->haveAccumBuffer)` to
> `if((visual->accumG + visual->accumB + visual->accumR) > 0)`?

Sounds good.
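
A minimal sketch of the agreed replacement test, for illustration only. The struct below is a hypothetical stand-in whose field names follow the bug comment (`accumR`, `accumG`, `accumB`); the real `struct gl_config` members may be named differently, and `has_accum_buffer` is not an existing Mesa function.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct gl_config. */
struct visual_sketch {
   int accumR, accumG, accumB;   /* accumulation bits per channel */
};

/* True when any accumulation channel has a nonzero bit count, which is
 * what the removed haveAccumBuffer flag used to record. */
static bool has_accum_buffer(const struct visual_sketch *v)
{
   return (v->accumR + v->accumG + v->accumB) > 0;
}
```

Deriving the answer from the bit counts avoids keeping a separate flag in sync with them.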

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Marek Olšák
The idea is good. Now if only it supported multiple colorbuffers and
all colorbuffer formats. It shouldn't be hard. Also the fast clear
shouldn't be used for array, cube, and 3D textures unless all layers
are cleared together.

One more thing. If you don't use piglit, I recommend using it before
sending patches to the mailing list. If you send a patch, I always
assume there are no piglit regressions on your hardware+driver
combination.

Marek
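
The conditions discussed in this review could be gathered into one eligibility check along the following lines. This is only a sketch under stated assumptions: `struct clear_sketch` and `can_fast_clear` are hypothetical, the first four fields mirror the tests in the quoted patch, and the layer fields are one possible way to express the "all layers cleared together" rule suggested above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the state the check looks at. */
struct clear_sketch {
   bool evergreen_or_newer;   /* chip_class >= EVERGREEN in the patch */
   bool clear_color;          /* PIPE_CLEAR_COLOR was requested */
   unsigned nr_cbufs;         /* number of bound color buffers */
   unsigned cmask_size;       /* nonzero when the texture has a CMASK */
   unsigned tex_layers;       /* layers in the texture (1 for plain 2D) */
   unsigned cleared_layers;   /* layers covered by this clear */
};

static bool can_fast_clear(const struct clear_sketch *s)
{
   if (!s->evergreen_or_newer || !s->clear_color)
      return false;
   /* The patch as posted handles a single color buffer with a CMASK. */
   if (s->nr_cbufs != 1 || s->cmask_size == 0)
      return false;
   /* Review suggestion: array, cube and 3D textures qualify only when
    * every layer is cleared together. */
   return s->cleared_layers == s->tex_layers;
}
```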

On Fri, Jun 7, 2013 at 9:44 PM, Grigori Goronzy  wrote:
> Allows MSAA colorbuffers, which have a CMASK automatically and don't
> need any further special handling, to be fast cleared. Instead
> of clearing the buffer, set the clear color and the CMASK to the
> cleared state.
> ---
>  src/gallium/drivers/r600/evergreen_state.c |  8 +++-
>  src/gallium/drivers/r600/r600_blit.c   | 29 +
>  src/gallium/drivers/r600/r600_resource.h   |  3 +++
>  3 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/src/gallium/drivers/r600/evergreen_state.c 
> b/src/gallium/drivers/r600/evergreen_state.c
> index 3ebb157..072a365 100644
> --- a/src/gallium/drivers/r600/evergreen_state.c
> +++ b/src/gallium/drivers/r600/evergreen_state.c
> @@ -1584,6 +1584,8 @@ void evergreen_init_color_surface(struct r600_context 
> *rctx,
> surf->cb_color_fmask_slice = 
> S_028C88_TILE_MAX(rtex->fmask_slice_tile_max);
> surf->cb_color_cmask_slice = 
> S_028C80_TILE_MAX(rtex->cmask_slice_tile_max);
>
> +   surf->cb_color_clear_value = rtex->color_clear_value;
> +
> surf->color_initialized = true;
>  }
>
> @@ -2178,7 +2180,7 @@ static void evergreen_emit_framebuffer_state(struct 
> r600_context *rctx, struct r
>(struct 
> r600_resource*)cb->base.texture,
>
> RADEON_USAGE_READWRITE);
>
> -   r600_write_context_reg_seq(cs, R_028C60_CB_COLOR0_BASE + i * 
> 0x3C, 11);
> +   r600_write_context_reg_seq(cs, R_028C60_CB_COLOR0_BASE + i * 
> 0x3C, 15);
> r600_write_value(cs, cb->cb_color_base);/* 
> R_028C60_CB_COLOR0_BASE */
> r600_write_value(cs, cb->cb_color_pitch);   /* 
> R_028C64_CB_COLOR0_PITCH */
> r600_write_value(cs, cb->cb_color_slice);   /* 
> R_028C68_CB_COLOR0_SLICE */
> @@ -2190,6 +2192,10 @@ static void evergreen_emit_framebuffer_state(struct 
> r600_context *rctx, struct r
> r600_write_value(cs, cb->cb_color_cmask_slice); /* 
> R_028C80_CB_COLOR0_CMASK_SLICE */
> r600_write_value(cs, cb->cb_color_fmask);   /* 
> R_028C84_CB_COLOR0_FMASK */
> r600_write_value(cs, cb->cb_color_fmask_slice); /* 
> R_028C88_CB_COLOR0_FMASK_SLICE */
> +   r600_write_value(cs, cb->cb_color_clear_value); /* 
> R_028C8C_CB_COLOR0_CLEAR_WORD0 */
> +   r600_write_value(cs, 0);/* 
> R_028C90_CB_COLOR0_CLEAR_WORD1 */
> +   r600_write_value(cs, 0);/* 
> R_028C94_CB_COLOR0_CLEAR_WORD2 */
> +   r600_write_value(cs, 0);/* 
> R_028C98_CB_COLOR0_CLEAR_WORD3 */
>
> r600_write_value(cs, PKT3(PKT3_NOP, 0, 0)); /* 
> R_028C60_CB_COLOR0_BASE */
> r600_write_value(cs, reloc);
> diff --git a/src/gallium/drivers/r600/r600_blit.c 
> b/src/gallium/drivers/r600/r600_blit.c
> index 058bf81..2b1b49a 100644
> --- a/src/gallium/drivers/r600/r600_blit.c
> +++ b/src/gallium/drivers/r600/r600_blit.c
> @@ -412,6 +412,23 @@ static boolean is_simple_msaa_resolve(const struct 
> pipe_blit_info *info)
> dst_tile_mode >= RADEON_SURF_MODE_1D;
>  }
>
> +static void r600_clear_buffer(struct pipe_context *ctx, struct pipe_resource 
> *dst,
> + unsigned offset, unsigned size, unsigned char 
> value);
> +
> +static void eg_set_clear_color(struct pipe_context *ctx,
> +  const union pipe_color_union *color)
> +{
> +   struct r600_context *rctx = (struct r600_context *)ctx;
> +   struct pipe_framebuffer_state *fb = &rctx->framebuffer.state;
> +   union util_color uc;
> +
> +   memset(&uc, 0, sizeof(uc));
> +   util_pack_color(color->f, fb->cbufs[0]->format, &uc);
> +
> +   /* TODO: color formats with more than 32bpp */
> +   ((struct r600_texture *)fb->cbufs[0]->texture)->color_clear_value = 
> uc.ui;
> +}
> +
>  static void r600_clear(struct pipe_context *ctx, unsigned buffers,
>const union pipe_color_union *color,
>double depth, unsigned stencil)
> @@ -419,6 +436,18 @@ static void r600_clear(struct pipe_context *ctx, 
> unsigned buffers,
> struct r600_context *rctx = (struct r600_context *)ctx;
> struct pipe_framebuffer_state *fb = &rctx->framebuffer.state;
>
> +   /* fast color clear on AA framebuffers (EG+) */
> +   /* TOD

Re: [Mesa-dev] [PATCH 2/3] llvmpipe: add support for layered rendering

2013-06-07 Thread Jose Fonseca


- Original Message -
> Am 07.06.2013 16:55, schrieb Jose Fonseca:
> > 
> > 
> > - Original Message -
> >> Am 06.06.2013 03:15, schrieb Brian Paul:
> >>> Reviewed-by: Brian Paul 
> >>>
> >>> Just two minor nits below.
> >>>
> >>>
> >>> On 06/05/2013 05:44 PM, srol...@vmware.com wrote:
>  From: Roland Scheidegger 
> 
>  Mostly just make sure the layer parameter gets passed through to the
>  right
>  places (and get clamped, can do this at setup time), fix up clears to
>  clear all layers and disable opaque optimization. Luckily don't need to
>  touch the jitted code.
>  (Clears invoked via pipe's clear_render_target method will not work
>  however
>  since the pipe_util_clear function used for it doesn't handle clearing
>  multiple layers yet.)
>  ---
>    src/gallium/drivers/llvmpipe/lp_context.h   |3 +
>    src/gallium/drivers/llvmpipe/lp_jit.h   |2 +-
>    src/gallium/drivers/llvmpipe/lp_rast.c  |  195
>  ---
>    src/gallium/drivers/llvmpipe/lp_rast.h  |2 +-
>    src/gallium/drivers/llvmpipe/lp_rast_priv.h |   20 ++-
>    src/gallium/drivers/llvmpipe/lp_scene.c |   12 +-
>    src/gallium/drivers/llvmpipe/lp_scene.h |7 +-
>    src/gallium/drivers/llvmpipe/lp_setup.c |1 +
>    src/gallium/drivers/llvmpipe/lp_setup_context.h |1 +
>    src/gallium/drivers/llvmpipe/lp_setup_line.c|6 +
>    src/gallium/drivers/llvmpipe/lp_setup_point.c   |7 +
>    src/gallium/drivers/llvmpipe/lp_setup_tri.c |   17 +-
>    src/gallium/drivers/llvmpipe/lp_state_derived.c |   13 +-
>    src/gallium/drivers/llvmpipe/lp_texture.c   |3 -
>    src/gallium/drivers/llvmpipe/lp_texture.h   |   10 ++
>    15 files changed, 190 insertions(+), 109 deletions(-)
> 
>  diff --git a/src/gallium/drivers/llvmpipe/lp_context.h
>  b/src/gallium/drivers/llvmpipe/lp_context.h
>  index 54f3830..abfe852 100644
>  --- a/src/gallium/drivers/llvmpipe/lp_context.h
>  +++ b/src/gallium/drivers/llvmpipe/lp_context.h
>  @@ -119,6 +119,9 @@ struct llvmpipe_context {
>   /** Which vertex shader output slot contains viewport index */
>   int viewport_index_slot;
> 
>  +   /** Which geometry shader output slot contains layer */
>  +   int layer_slot;
>  +
>   /**< minimum resolvable depth value, for polygon offset */
>   double mrd;
> 
>  diff --git a/src/gallium/drivers/llvmpipe/lp_jit.h
>  b/src/gallium/drivers/llvmpipe/lp_jit.h
>  index 4e9ca76..2ecfde7 100644
>  --- a/src/gallium/drivers/llvmpipe/lp_jit.h
>  +++ b/src/gallium/drivers/llvmpipe/lp_jit.h
>  @@ -204,7 +204,7 @@ typedef void
>    const void *dadx,
>    const void *dady,
>    uint8_t **color,
>  -void *depth,
>  +uint8_t *depth,
>    uint32_t mask,
>    struct lp_jit_thread_data *thread_data,
>    unsigned *stride,
>  diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c
>  b/src/gallium/drivers/llvmpipe/lp_rast.c
>  index 981dd71..aa5224e 100644
>  --- a/src/gallium/drivers/llvmpipe/lp_rast.c
>  +++ b/src/gallium/drivers/llvmpipe/lp_rast.c
>  @@ -134,6 +134,8 @@ lp_rast_clear_color(struct lp_rasterizer_task *task,
> 
> for (i = 0; i < scene->fb.nr_cbufs; i++) {
>    enum pipe_format format = scene->fb.cbufs[i]->format;
>  +unsigned layer;
>  +uint8_t *map_layer = scene->cbufs[i].map;
> 
>    if (util_format_is_pure_sint(format)) {
>   util_format_write_4i(format, arg.clear_color.i, 0,
>  &uc, 0, 0, 0, 1, 1);
>  @@ -143,14 +145,17 @@ lp_rast_clear_color(struct lp_rasterizer_task
>  *task,
>   util_format_write_4ui(format, arg.clear_color.ui, 0,
>  &uc, 0, 0, 0, 1, 1);
>    }
> 
>  -util_fill_rect(scene->cbufs[i].map,
>  -   scene->fb.cbufs[i]->format,
>  -   scene->cbufs[i].stride,
>  -   task->x,
>  -   task->y,
>  -   task->width,
>  -   task->height,
>  -   &uc);
>  +for (layer = 0; layer <= scene->fb_max_layer; layer++) {
>  +   util_fill_rect(map_layer,
>  +  scene->fb.cbufs[i]->format,
>  +  scene->cbufs[i].stride,
>  +  task->x,
>  +  task->y,
>  +  task->width,
> 

[Mesa-dev] hw_gl_select branch status

2013-06-07 Thread Jerry Gamache

I've rebased the patch and fixed up some build failures.
If you want to play with it, here is the updated branch:
http://cgit.freedesktop.org/~ab/mesa/log/?h=hw_gl_select2  



I have a test where I draw 3 overlapped triangles in the
XY plane with varying Z depths, and the SW code correctly
returns 3 hits while the HW code only sees the first
triangle that was drawn. The depth values returned
were also 256 instead of the actual Z depth of the triangle
as a GLuint.

Tests were done with the softpipe swrast compiled into an
OSMesa driver. The same driver works correctly when the
MESA_HW_SELECT environment variable is not set.

Not sure exactly what I am doing wrong, but I can not seem
to get correct results with the hw_gl_select2 branch.

Jerry.



Re: [Mesa-dev] Random results in piglit spec/!OpenGL 1.1/read-front on r600g

2013-06-07 Thread Martin Andersson
On Fri, Jun 7, 2013 at 8:58 AM, Martin Andersson  wrote:
> On Fri, Jun 7, 2013 at 12:37 AM, Marek Olšák  wrote:
>> There are bugs in both piglit and DRI2. I haven't looked into the
>> issue, but Paul Berry seems to be working on it.
>>
>> See:
>> http://lists.freedesktop.org/archives/piglit/2013-May/005880.html
>> http://lists.freedesktop.org/archives/mesa-dev/2013-May/039985.html
>>
>> Marek
>>
>
> ok, it is good to know that it is being worked on, thanks
>
> //Martin
>
>> On Fri, Jun 7, 2013 at 12:04 AM, Martin Andersson  wrote:
>>> I get random results when I run the spec/!OpenGL 1.1/read-front test.
>>> Sometimes it passes and sometimes it fails, it mostly fails though.
>>> When it fails the observed values are random. I have an AMD 6950,
>>> running mesa git ce67fb4715e0c2fab01de33da475ef4705622020 and kernel
>>> 3.10-rc4.
>>>
>>> If I insert a delay before the piglit_swap_buffers call in the test
>>> (read-front.c) it always passes. It does not matter where I put the
>>> delay in the piglit_display function, as long as it is before the
>>> piglit_swap_buffers call. So it seems to be a race between the
>>> swap_buffers_call and some earlier call (some piglit initialization
>>> perhaps?)
>>>
>>> Does anyone know what could be wrong or how I could debug it further?
>>>
>>> //Martin
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev

This issue was fixed by
http://cgit.freedesktop.org/piglit/commit/?id=4c1d83cb4cf202dd7375593c830fd78db04ff14b

//Martin


[Mesa-dev] [Bug 65423] Remove gl_config::haveDepthBuffer, haveAccumBuffer, haveStencilBuffer fields

2013-06-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65423

--- Comment #1 from Arnas Milaševičius  ---
struct gl_config doesn't have member accumBits, but it has accumR, accumG and
accumB members. Maybe I should change `if(visual->haveAccumBuffer)` to
`if((visual->accumG + visual->accumB + visual->accumR) > 0)`?

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 1/7] mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().

2013-06-07 Thread Brian Paul

On 06/07/2013 12:13 PM, Eric Anholt wrote:
> Brian Paul writes:
>
>> On 06/05/2013 10:14 AM, Eric Anholt wrote:
>>> -   /* 1D array textures need special treatment.
>>> -* Blit rows from the source to layers in the destination. */
>>> -   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
>>> -  int y, layer;
>>> -
>>> -  for (y = srcY0, layer = 0; layer < height; y += yStep, layer++) {
>>> - blit.src.box.y = y;
>>> - blit.src.box.height = 1;
>>> - blit.dst.box.y = 0;
>>> - blit.dst.box.height = 1;
>>> - blit.dst.box.z = destY + layer;
>>> -
>>> - pipe->blit(pipe, &blit);
>>> -  }
>>> -   }
>>> -   else {
>>> -  /* All the other texture targets. */
>>> -  pipe->blit(pipe, &blit);
>>> -   }
>>> +   pipe->blit(pipe, &blit);
>>>   return;
>>>
>>>    fallback:
>>>   /* software fallback */
>>>   fallback_copy_texsubimage(ctx,
>>> strb, stImage, texImage->_BaseFormat,
>>> - destX, destY, destZ,
>>> + destX, destY, slice,
>>> srcX, srcY, width, height);
>>>    }
>>
>> Thanks for updating the state tracker code.  You removed the code above
>> on the premise that height will always be 1 if we're copying to a 1D
>> array texture, right?  Maybe we should assert that just to be safe.
>
> I think instead of putting an assert like that in the places we think of
> it at the middle levels, we should do so in our drivers at the bottom
> level of mapping or blitting a particular slice, so that we catch
> mistakes with slices or clipping wherever they may happen.


But in this case, one assert here would be easier than an assert in all 
the various drivers.


I could add that later.

-Brian



Re: [Mesa-dev] [PATCH 5/7] intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.

2013-06-07 Thread Kenneth Graunke

On 06/07/2013 12:58 PM, Eric Anholt wrote:
> Kenneth Graunke writes:
>
>> On 06/05/2013 10:14 AM, Eric Anholt wrote:
>>> This gets us support for blitting to attachment types other than
>>> textures.
>>> +  /* Blit to all active draw buffers.  We don't do any pre-checking,
>>> +   * because we assume that copying to MRTs is rare, and failure midway
>>> +   * through copying is even more rare.  Given that feedback loops in
>>> +   * glFramebufferBlit() are undefined, we can safely fail out after
>>> +   * having partially completed our copies.
>>
>> The first part of this comment makes sense: yes, it's obviously rare.
>> The second part I feel could use a little more explanation:
>>
>> If we fail midway through a blit, we return without clearing
>> BUFFER_BITS_COLOR from the mask.  The caller will fall back to the next
>> method (i.e. meta_BlitFramebuffer) and use that to blit /all/ RTs - even
>> ones we already successfully blit.  Given that feedback loops in
>> glFramebufferBlit() are undefined, we can safely fail out after having
>> partially completed our copies.
>>
>> Unless there's some kind of blending, it doesn't even seem
>> undefined...meta should just overwrite the colors again, and everything
>> will work out.
>
> Better?
>
>    /* Blit to all active draw buffers.  We don't do any pre-checking,
>     * because we assume that copying to MRTs is rare, and failure midway
>     * through copying is even more rare.  Even if it was to occur, it's
>     * safe to let meta start the copy over from scratch, because
>     * glBlitFramebuffer completely overwrites the destination pixels, and
>     * results are undefined if any destination pixels have a dependency on
>     * source pixels.
>     */


Yes, I like this better.  I'm good with this series now.  Thanks!

--Ken
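
The fallback behaviour discussed in this thread can be sketched as follows. This is a simulation rather than driver code: `fast_blit`, `fallback_blit`, `SKETCH_COLOR_BIT` and the `done` counters are all hypothetical, with `fail_at` standing in for a draw buffer the blitter cannot handle.

```c
#include <assert.h>

#define SKETCH_COLOR_BIT 0x1u   /* hypothetical stand-in for the color mask bit */

/* Fast path: blit each draw buffer in turn.  `fail_at` is the index of a
 * buffer that fails (-1 for none).  The color bit is cleared from the mask
 * only if every buffer succeeded. */
static unsigned fast_blit(int num_bufs, int fail_at, unsigned mask, int *done)
{
   int i;

   for (i = 0; i < num_bufs; i++) {
      if (i == fail_at)
         return mask;           /* partial work done, mask left untouched */
      done[i]++;
   }
   return mask & ~SKETCH_COLOR_BIT;
}

/* Fallback path: redo every buffer if the color bit is still set. */
static unsigned fallback_blit(int num_bufs, unsigned mask, int *done)
{
   int i;

   if (mask & SKETCH_COLOR_BIT) {
      for (i = 0; i < num_bufs; i++)
         done[i]++;
   }
   return mask & ~SKETCH_COLOR_BIT;
}
```

A buffer that the fast path already handled is simply written a second time by the fallback; because a blit overwrites its destination, the repeated write leaves the same result.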



[Mesa-dev] [PATCH] intel: Make batch macros for doing BCS_SWCTRL setup.

2013-06-07 Thread Eric Anholt
We're going to add more BCS_SWCTRL setup instances soon, and you have to
be careful to have the set and restore atomic with the rendering that's
done, so that our state doesn't leak out to other rendering processes.

v2: Rewrite the patch to have batch begin/advance macros so that magic
numbers don't get sprinkled around (and so you don't mix up your
do-I-need-to-reset vs what-do-I-reset-to logic, which I nearly did in
the next patch when first writing it)
---

Yeah, things were kinda fragile.  How about this version?  The next
patch got the obvious update, too
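
As a reading aid for the diff that follows, here is a small sketch of the two pieces of arithmetic the macros rely on. The bit positions in `SKETCH_SWCTRL_SRC_Y` and `SKETCH_SWCTRL_DST_Y` are assumptions for illustration, and `swctrl_value` and `tiled_batch_dwords` are hypothetical helpers, not functions in the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define SKETCH_SWCTRL_SRC_Y (1u << 0)
#define SKETCH_SWCTRL_DST_Y (1u << 1)

/* Masked register write: the high 16 bits select which of the low bits
 * are updated, so both tiling bits are always written explicitly. */
static uint32_t swctrl_value(bool dst_y_tiled, bool src_y_tiled)
{
   uint32_t mask = SKETCH_SWCTRL_DST_Y | SKETCH_SWCTRL_SRC_Y;

   return (mask << 16) |
          (dst_y_tiled ? SKETCH_SWCTRL_DST_Y : 0) |
          (src_y_tiled ? SKETCH_SWCTRL_SRC_Y : 0);
}

/* Dwords to reserve: the blit itself plus, when Y tiling is involved, two
 * 7-dword sequences (4 for the flush, 3 for the register load), one to set
 * the tiling state and one to restore it. */
static int tiled_batch_dwords(int n, bool dst_y_tiled, bool src_y_tiled)
{
   return n + ((dst_y_tiled || src_y_tiled) ? 14 : 0);
}
```

Reserving the set and restore sequences in the same batch allocation as the blit is what keeps the three steps from being split by a flush.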

 src/mesa/drivers/dri/intel/intel_blit.c | 84 ++---
 1 file changed, 47 insertions(+), 37 deletions(-)

diff --git a/src/mesa/drivers/dri/intel/intel_blit.c b/src/mesa/drivers/dri/intel/intel_blit.c
index 1f6ad09..37f6937 100644
--- a/src/mesa/drivers/dri/intel/intel_blit.c
+++ b/src/mesa/drivers/dri/intel/intel_blit.c
@@ -86,6 +86,47 @@ br13_for_cpp(int cpp)
 }
 
 /**
+ * Emits the packet for switching the blitter from X to Y tiled or back.
+ *
+ * This has to be called in a single BEGIN_BATCH_BLT_TILED() /
+ * ADVANCE_BATCH_TILED().  This is because BCS_SWCTRL is saved and restored as
+ * part of the power context, not a render context, and if the batchbuffer was
+ * to get flushed between setting and blitting, or blitting and restoring, our
+ * tiling state would leak into other unsuspecting applications (like the X
+ * server).
+ */
+static void
+set_blitter_tiling(struct intel_context *intel,
+   bool dst_y_tiled, bool src_y_tiled)
+{
+   assert(intel->gen >= 6);
+
+   /* Idle the blitter before we update how tiling is interpreted. */
+   OUT_BATCH(MI_FLUSH_DW);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+   OUT_BATCH(0);
+
+   OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
+   OUT_BATCH(BCS_SWCTRL);
+   OUT_BATCH((BCS_SWCTRL_DST_Y | BCS_SWCTRL_SRC_Y) << 16 |
+ (dst_y_tiled ? BCS_SWCTRL_DST_Y : 0) |
+ (src_y_tiled ? BCS_SWCTRL_SRC_Y : 0));
+}
+
+#define BEGIN_BATCH_BLT_TILED(n, dst_y_tiled, src_y_tiled) do { \
+  BEGIN_BATCH_BLT(n + ((dst_y_tiled || src_y_tiled) ? 14 : 0)); \
+  if (dst_y_tiled || src_y_tiled)   \
+ set_blitter_tiling(intel, dst_y_tiled, src_y_tiled);   \
+   } while (0)
+
+#define ADVANCE_BATCH_TILED(dst_y_tiled, src_y_tiled) do {  \
+  if (dst_y_tiled || src_y_tiled)   \
+ set_blitter_tiling(intel, false, false);   \
+  ADVANCE_BATCH();  \
+   } while (0)
+
+/**
  * Implements a rectangular block transfer (blit) of pixels between two
  * miptrees.
  *
@@ -204,21 +245,20 @@ intelEmitCopyBlit(struct intel_context *intel,
int dst_y2 = dst_y + h;
int dst_x2 = dst_x + w;
drm_intel_bo *aper_array[3];
-   uint32_t bcs_swctrl = 0;
+   bool dst_y_tiled = dst_tiling == I915_TILING_Y;
+   bool src_y_tiled = src_tiling == I915_TILING_Y;
BATCH_LOCALS;
 
if (dst_tiling != I915_TILING_NONE) {
   if (dst_offset & 4095)
 return false;
-  if (dst_tiling == I915_TILING_Y && intel->gen < 6)
-return false;
}
if (src_tiling != I915_TILING_NONE) {
   if (src_offset & 4095)
 return false;
-  if (src_tiling == I915_TILING_Y && intel->gen < 6)
-return false;
}
+   if ((dst_y_tiled || src_y_tiled) && intel->gen < 6)
+  return false;
 
/* do space check before going any further */
do {
@@ -284,16 +324,10 @@ intelEmitCopyBlit(struct intel_context *intel,
if (dst_tiling != I915_TILING_NONE) {
   CMD |= XY_DST_TILED;
   dst_pitch /= 4;
-
-  if (dst_tiling == I915_TILING_Y)
- bcs_swctrl |= BCS_SWCTRL_DST_Y;
}
if (src_tiling != I915_TILING_NONE) {
   CMD |= XY_SRC_TILED;
   src_pitch /= 4;
-
-  if (src_tiling == I915_TILING_Y)
- bcs_swctrl |= BCS_SWCTRL_SRC_Y;
}
 #endif
 
@@ -304,20 +338,7 @@ intelEmitCopyBlit(struct intel_context *intel,
assert(dst_x < dst_x2);
assert(dst_y < dst_y2);
 
-   BEGIN_BATCH_BLT(8 + ((bcs_swctrl != 0) ? 14 : 0));
-
-   if (bcs_swctrl != 0) {
-  /* Idle the blitter before we update how tiling is interpreted. */
-  OUT_BATCH(MI_FLUSH_DW);
-  OUT_BATCH(0);
-  OUT_BATCH(0);
-  OUT_BATCH(0);
-
-  OUT_BATCH(MI_LOAD_REGISTER_IMM | (3 - 2));
-  OUT_BATCH(BCS_SWCTRL);
-  OUT_BATCH((BCS_SWCTRL_DST_Y | BCS_SWCTRL_SRC_Y) << 16 |
-bcs_swctrl);
-   }
+   BEGIN_BATCH_BLT_TILED(8, dst_y_tiled, src_y_tiled);
 
OUT_BATCH(CMD | (8 - 2));
OUT_BATCH(BR13 | (uint16_t)dst_pitch);
@@ -332,18 +353,7 @@ intelEmitCopyBlit(struct intel_context *intel,
I915_GEM_DOMAIN_RENDER, 0,
src_offset);
 
-   if (bcs_swctrl != 0) {
-  OUT_BATCH(MI_FLUSH_DW);
-  OUT_BATCH(0);
-  OUT_BATCH(0);
-  OUT_BATCH(0);
-
-  OUT_BATCH(MI_LOAD_REGISTE

Re: [Mesa-dev] [PATCH 5/7] intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.

2013-06-07 Thread Eric Anholt
Kenneth Graunke  writes:

> On 06/05/2013 10:14 AM, Eric Anholt wrote:
>> This gets us support for blitting to attachment types other than
>> textures.
>> +  /* Blit to all active draw buffers.  We don't do any pre-checking,
>> +   * because we assume that copying to MRTs is rare, and failure midway
>> +   * through copying is even more rare.  Given that feedback loops in
>> +   * glFramebufferBlit() are undefined, we can safely fail out after
>> +   * having partially completed our copies.
>
> The first part of this comment makes sense: yes, it's obviously rare.
> The second part I feel could use a little more explanation:
>
> If we fail midway through a blit, we return without clearing 
> BUFFER_BITS_COLOR from the mask.  The caller will fall back to the next 
> method (i.e. meta_BlitFramebuffer) and use that to blit /all/ RTs - even 
> ones we already successfully blit.  Given that feedback loops in 
> glFramebufferBlit() are undefined, we can safely fail out after having 
> partially completed our copies.
>
> Unless there's some kind of blending, it doesn't even seem 
> undefined...meta should just overwrite the colors again, and everything 
> will work out.

Better?

  /* Blit to all active draw buffers.  We don't do any pre-checking,
   * because we assume that copying to MRTs is rare, and failure midway
   * through copying is even more rare.  Even if it was to occur, it's
   * safe to let meta start the copy over from scratch, because
   * glBlitFramebuffer completely overwrites the destination pixels, and
   * results are undefined if any destination pixels have a dependency on
   * source pixels.
   */




[Mesa-dev] [PATCH] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Grigori Goronzy
Allows MSAA colorbuffers, which have a CMASK automatically and don't
need any further special handling, to be fast cleared. Instead
of clearing the buffer, set the clear color and the CMASK to the
cleared state.
---
 src/gallium/drivers/r600/evergreen_state.c |  8 +++-
 src/gallium/drivers/r600/r600_blit.c   | 29 +
 src/gallium/drivers/r600/r600_resource.h   |  3 +++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/src/gallium/drivers/r600/evergreen_state.c b/src/gallium/drivers/r600/evergreen_state.c
index 3ebb157..072a365 100644
--- a/src/gallium/drivers/r600/evergreen_state.c
+++ b/src/gallium/drivers/r600/evergreen_state.c
@@ -1584,6 +1584,8 @@ void evergreen_init_color_surface(struct r600_context 
*rctx,
surf->cb_color_fmask_slice = 
S_028C88_TILE_MAX(rtex->fmask_slice_tile_max);
surf->cb_color_cmask_slice = 
S_028C80_TILE_MAX(rtex->cmask_slice_tile_max);
 
+   surf->cb_color_clear_value = rtex->color_clear_value;
+
surf->color_initialized = true;
 }
 
@@ -2178,7 +2180,7 @@ static void evergreen_emit_framebuffer_state(struct 
r600_context *rctx, struct r
   (struct 
r600_resource*)cb->base.texture,
   RADEON_USAGE_READWRITE);
 
-   r600_write_context_reg_seq(cs, R_028C60_CB_COLOR0_BASE + i * 
0x3C, 11);
+   r600_write_context_reg_seq(cs, R_028C60_CB_COLOR0_BASE + i * 
0x3C, 15);
r600_write_value(cs, cb->cb_color_base);/* 
R_028C60_CB_COLOR0_BASE */
r600_write_value(cs, cb->cb_color_pitch);   /* 
R_028C64_CB_COLOR0_PITCH */
r600_write_value(cs, cb->cb_color_slice);   /* 
R_028C68_CB_COLOR0_SLICE */
@@ -2190,6 +2192,10 @@ static void evergreen_emit_framebuffer_state(struct 
r600_context *rctx, struct r
r600_write_value(cs, cb->cb_color_cmask_slice); /* 
R_028C80_CB_COLOR0_CMASK_SLICE */
r600_write_value(cs, cb->cb_color_fmask);   /* 
R_028C84_CB_COLOR0_FMASK */
r600_write_value(cs, cb->cb_color_fmask_slice); /* 
R_028C88_CB_COLOR0_FMASK_SLICE */
+   r600_write_value(cs, cb->cb_color_clear_value); /* 
R_028C8C_CB_COLOR0_CLEAR_WORD0 */
+   r600_write_value(cs, 0);/* 
R_028C90_CB_COLOR0_CLEAR_WORD1 */
+   r600_write_value(cs, 0);/* 
R_028C94_CB_COLOR0_CLEAR_WORD2 */
+   r600_write_value(cs, 0);/* 
R_028C98_CB_COLOR0_CLEAR_WORD3 */
 
r600_write_value(cs, PKT3(PKT3_NOP, 0, 0)); /* 
R_028C60_CB_COLOR0_BASE */
r600_write_value(cs, reloc);
diff --git a/src/gallium/drivers/r600/r600_blit.c b/src/gallium/drivers/r600/r600_blit.c
index 058bf81..2b1b49a 100644
--- a/src/gallium/drivers/r600/r600_blit.c
+++ b/src/gallium/drivers/r600/r600_blit.c
@@ -412,6 +412,23 @@ static boolean is_simple_msaa_resolve(const struct 
pipe_blit_info *info)
dst_tile_mode >= RADEON_SURF_MODE_1D;
 }
 
+static void r600_clear_buffer(struct pipe_context *ctx, struct pipe_resource 
*dst,
+ unsigned offset, unsigned size, unsigned char 
value);
+
+static void eg_set_clear_color(struct pipe_context *ctx,
+  const union pipe_color_union *color)
+{
+   struct r600_context *rctx = (struct r600_context *)ctx;
+   struct pipe_framebuffer_state *fb = &rctx->framebuffer.state;
+   union util_color uc;
+
+   memset(&uc, 0, sizeof(uc));
+   util_pack_color(color->f, fb->cbufs[0]->format, &uc);
+
+   /* TODO: color formats with more than 32bpp */
+   ((struct r600_texture *)fb->cbufs[0]->texture)->color_clear_value = 
uc.ui;
+}
+
 static void r600_clear(struct pipe_context *ctx, unsigned buffers,
   const union pipe_color_union *color,
   double depth, unsigned stencil)
@@ -419,6 +436,18 @@ static void r600_clear(struct pipe_context *ctx, unsigned 
buffers,
struct r600_context *rctx = (struct r600_context *)ctx;
struct pipe_framebuffer_state *fb = &rctx->framebuffer.state;
 
+   /* fast color clear on AA framebuffers (EG+) */
+   /* TODO: multiple color buffers */
+   if (rctx->chip_class >= EVERGREEN &&
+   (buffers & PIPE_CLEAR_COLOR) && fb->nr_cbufs == 1 &&
+   ((struct r600_texture 
*)fb->cbufs[0]->texture)->cmask_size) {
+   struct r600_texture *tex = (struct r600_texture 
*)fb->cbufs[0]->texture;
+   eg_set_clear_color(ctx, color);
+   r600_clear_buffer(ctx, fb->cbufs[0]->texture,
+   tex->cmask_offset, tex->cmask_size, 0);
+   buffers &= ~PIPE_CLEAR_COLOR;
+   }
+
/* if hyperz enabled just clear hyperz */
if (fb->zsbuf && (buffers & PIPE_CLEAR_DEPTH)) {
struct r6

[Mesa-dev] [RFC] r600g: implement fast color clears on evergreen+

2013-06-07 Thread Grigori Goronzy
This is my first try to contribute anything useful to Mesa, so please
bear with me. This is not finished, but I'd like feedback to make sure
the code's quality and style is in line with what is expected in Mesa.



Re: [Mesa-dev] [PATCH 1/7] mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().

2013-06-07 Thread Eric Anholt
Paul Berry  writes:

> On 5 June 2013 10:14, Eric Anholt  wrote:
>
>> Intel had brokenness here, and I'd like to continue moving Mesa toward
>> hiding 1D_ARRAY's ridiculousness inside of the core, like we did with
>> MapTextureImage.  Fixes copyteximage 1D_ARRAY on intel.
>>
>> There's still an impedance mismatch in meta when falling back to read and
>> texsubimage, since texsubimage expects coordinates into 1D_ARRAY as
>> (width, slice, 0) instead of (width, 0, slice).

>> diff --git a/src/mesa/main/dd.h b/src/mesa/main/dd.h
>> index 13c7a83..846d5d0 100644
>> --- a/src/mesa/main/dd.h
>> +++ b/src/mesa/main/dd.h
>> @@ -249,10 +249,15 @@ struct dd_function_table {
>>
>> /**
>>  * Called by glCopyTex[Sub]Image[123]D().
>> +*
>> +* In the case of 1D array textures, the driver will be called to copy each
>> +* appropriate scanline from the rb to each destination slice.  For 3D or
>> +* other array textures, only one slice may be copied, but @slice may be
>> +* nonzero.
>>
>
> I'm having trouble following this comment, especially the second sentence.
> Would this be clearer?
>
> "This function should copy a rectangular region in the rb to a single
> destination slice, specified by @slice.  In the case of 1D array textures
> (where one GL call can potentially affect multiple destination slices),
> core mesa takes care of calling this function multiple times, once for each
> scanline to be copied."

When I first wrote the comment, I was thinking there was going to be
multi-slice code for 2d arrays/3d/cubemaps, then I didn't rewrite from
scratch when I discovered things were simpler.  Your text is much
better.

>> +static void
>> +copytexsubimage_by_slice(struct gl_context *ctx,
>> + struct gl_texture_image *texImage,
>> + GLuint dims,
>> + GLint xoffset, GLint yoffset, GLint zoffset,
>> + struct gl_renderbuffer *rb,
>> + GLint x, GLint y,
>> + GLsizei width, GLsizei height)
>> +{
>> +   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
>> +  int slice;
>> +
>> +  /* For 1D arrays, we copy each scanline of the source rectangle into the
>> +   * next array slice.
>> +   */
>> +  assert(zoffset == 0);
>> +
>> +  for (slice = 0; slice < height; slice++) {
>> + if (yoffset + slice >= texImage->Height)
>> +break;
>>
>
> Shouldn't the error check in error_check_subtexture_dimensions() prevent
> this from ever occurring?  If so, I think this should be an assertion.

I failed to find the error case when I was writing the code.  Thanks!




Re: [Mesa-dev] [PATCH 1/7] mesa: Hide weirdness of 1D_ARRAY textures from Driver.CopyTexSubImage().

2013-06-07 Thread Eric Anholt
Brian Paul  writes:

> On 06/05/2013 10:14 AM, Eric Anholt wrote:
>> Intel had brokenness here, and I'd like to continue moving Mesa toward
>> hiding 1D_ARRAY's ridiculousness inside of the core, like we did with
>> MapTextureImage.  Fixes copyteximage 1D_ARRAY on intel.
>>
>> There's still an impedance mismatch in meta when falling back to read and
>> texsubimage, since texsubimage expects coordinates into 1D_ARRAY as
>> (width, slice, 0) instead of (width, 0, slice).

>>
>> +static void
>> +copytexsubimage_by_slice(struct gl_context *ctx,
>> + struct gl_texture_image *texImage,
>> + GLuint dims,
>> + GLint xoffset, GLint yoffset, GLint zoffset,
>> + struct gl_renderbuffer *rb,
>> + GLint x, GLint y,
>> + GLsizei width, GLsizei height)
>> +{
>> +   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
>> +  int slice;
>> +
>> +  /* For 1D arrays, we copy each scanline of the source rectangle into the
>> +   * next array slice.
>> +   */
>> +  assert(zoffset == 0);
>> +
>> +  for (slice = 0; slice < height; slice++) {
>> + if (yoffset + slice >= texImage->Height)
>> +break;
>> +
>> + ctx->Driver.CopyTexSubImage(ctx, 2, texImage,
>> + xoffset, 0, yoffset + slice,
>> + rb, x, y, width, 1);
>
> Should that be 'y + slice'?  Otherwise I think we're always copying from 
> the same Y position.

Good catch!  Fixed.
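
With that fix, the 1D_ARRAY path boils down to something like the following.
This is only a minimal sketch, not the code from the patch:
copy_row_to_slice() is a hypothetical stand-in for the
ctx->Driver.CopyTexSubImage() call, and the bounds check is written as an
assertion on the assumption that earlier error checking already guarantees it.

```c
#include <assert.h>

/* Hypothetical stand-in for ctx->Driver.CopyTexSubImage(): it only records
 * which source row was copied into which destination slice. */
#define MAX_CALLS 16

struct copy_call {
   int src_y;
   int dst_slice;
};

static struct copy_call calls[MAX_CALLS];
static int num_calls;

static void
copy_row_to_slice(int src_y, int dst_slice)
{
   assert(num_calls < MAX_CALLS);
   calls[num_calls].src_y = src_y;
   calls[num_calls].dst_slice = dst_slice;
   num_calls++;
}

/* For a 1D array texture, each scanline of the source rectangle goes into
 * the next array slice.  Both the source row and the destination slice
 * advance with the loop counter. */
static void
copy_1d_array_by_slice(int yoffset, int y, int height, int tex_height)
{
   int slice;

   /* Assumed to be guaranteed by the caller's error checking. */
   assert(yoffset + height <= tex_height);

   for (slice = 0; slice < height; slice++)
      copy_row_to_slice(y + slice, yoffset + slice);
}
```

Copying 3 rows starting at source row 10 into slices starting at 2 issues
three one-row copies: rows 10, 11, 12 into slices 2, 3, 4.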

>> -   /* 1D array textures need special treatment.
>> -* Blit rows from the source to layers in the destination. */
>> -   if (texImage->TexObject->Target == GL_TEXTURE_1D_ARRAY) {
>> -  int y, layer;
>> -
>> -  for (y = srcY0, layer = 0; layer < height; y += yStep, layer++) {
>> - blit.src.box.y = y;
>> - blit.src.box.height = 1;
>> - blit.dst.box.y = 0;
>> - blit.dst.box.height = 1;
>> - blit.dst.box.z = destY + layer;
>> -
>> - pipe->blit(pipe, &blit);
>> -  }
>> -   }
>> -   else {
>> -  /* All the other texture targets. */
>> -  pipe->blit(pipe, &blit);
>> -   }
>> +   pipe->blit(pipe, &blit);
>>  return;
>>
>>   fallback:
>>  /* software fallback */
>>  fallback_copy_texsubimage(ctx,
>>strb, stImage, texImage->_BaseFormat,
>> - destX, destY, destZ,
>> + destX, destY, slice,
>>srcX, srcY, width, height);
>>   }
>
> Thanks for updating the state tracker code.  You removed the code above 
> on the premise that height will always be 1 if we're copying to a 1D 
> array texture, right?  Maybe we should assert that just to be safe.

I think instead of putting an assert like that in the places we think of
it at the middle levels, we should do so in our drivers at the bottom
level of mapping or blitting a particular slice, so that we catch
mistakes with slices or clipping wherever they may happen.

Turns out I don't have some of the asserts I thought I did in the Intel
driver, though.  I should go fix that.




Re: [Mesa-dev] [PATCH] i965: Shrink Gen5 VUE map layout to be the same as Gen4.

2013-06-07 Thread Kenneth Graunke

On 06/07/2013 11:30 AM, Paul Berry wrote:

On 7 June 2013 03:17, Chris Forbes <chr...@ijw.co.nz> wrote:

The PRM suggests a larger layout, mostly to support having
gl_ClipDistance[] somewhere predictable for the fixed-function clipper
-- but it didn't actually arrive in Gen5.

Just use the same layout for both Gen4 and Gen5.

No Piglit regressions.

Improves performance in CS:S Video Stress Test by ~3%.


Fantastic!


Signed-off-by: Chris Forbes <chr...@ijw.co.nz>
---
  src/mesa/drivers/dri/i965/brw_sf_state.c |  5 +
  src/mesa/drivers/dri/i965/brw_vs.c   | 23 +++
  2 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c
b/src/mesa/drivers/dri/i965/brw_sf_state.c
index 7c29ba2..e9b7e66 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
@@ -131,10 +131,7 @@ const struct brw_tracked_state brw_sf_vp = {
  int
  brw_sf_compute_urb_entry_read_offset(struct intel_context *intel)
  {
-   if (intel->gen == 5)
-  return 3;
-   else
-  return 1;
+   return 1;
  }


How about just turning this into #define BRW_SF_URB_ENTRY_READ_OFFSET 1
in a header somewhere?  It seems silly to have a function whose only job
is to return a constant.


In the Gen6+ code, we just have:
   int urb_entry_read_offset = 1;

in both halves (which is kind of lame since you need to synchronize 
them), but...that's what's there.


I'd be okay with either.  I agree that the function should go away, 
either in this patch or a quick follow-up.
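
To illustrate why a constant is enough once both generations share a layout,
here is a rough sketch. The slot names are hypothetical placeholders rather
than the driver's enum, and it assumes the read offset counts pairs of vec4
slots, which is not stated anywhere in this thread.

```c
#include <assert.h>

/* Hypothetical slot names for illustration; not the driver's real enum. */
enum slot {
   SLOT_PSIZ,
   SLOT_NDC,
   SLOT_POS_DUPLICATE,
   SLOT_CLIP_DIST0,
   SLOT_CLIP_DIST1,
   SLOT_PAD,
   SLOT_POS
};

/* The constant suggested above, replacing the function. */
#define BRW_SF_URB_ENTRY_READ_OFFSET 1

/* Layout shared by Gen4 and Gen5 after the patch. */
static const enum slot gen4_layout[] = {
   SLOT_PSIZ, SLOT_NDC, SLOT_POS
};

/* Layout Gen5 used before the patch. */
static const enum slot old_gen5_layout[] = {
   SLOT_PSIZ, SLOT_NDC, SLOT_POS_DUPLICATE,
   SLOT_CLIP_DIST0, SLOT_CLIP_DIST1, SLOT_PAD, SLOT_POS
};

/* Return the index of the wanted slot, or -1 if it is not in the layout. */
static int
slot_index(const enum slot *layout, int count, enum slot wanted)
{
   int i;

   for (i = 0; i < count; i++) {
      if (layout[i] == wanted)
         return i;
   }
   return -1;
}

/* Assumption: the offset is expressed in pairs of slots, so the first slot
 * read is the position at index 2 * offset. */
static int
read_offset_for(const enum slot *layout, int count)
{
   return slot_index(layout, count, SLOT_POS) / 2;
}
```

Under that assumption the shared layout gives an offset of 1 and the old
Gen5 layout gives 3, matching the two values the removed branch returned.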


Reviewed-by: Kenneth Graunke 




  static void upload_sf_unit( struct brw_context *brw )
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c
b/src/mesa/drivers/dri/i965/brw_vs.c
index 720325d..d173d2e 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -85,34 +85,17 @@ brw_compute_vue_map(struct brw_context *brw,
struct brw_vue_map *vue_map,
  */
 switch (intel->gen) {
 case 4:
+   case 5:
/* There are 8 dwords in VUE header pre-Ironlake:
 * dword 0-3 is indices, point width, clip flags.
 * dword 4-7 is ndc position
 * dword 8-11 is the first vertex data.
-   */
-  assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
-  assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
-  assign_vue_slot(vue_map, VARYING_SLOT_POS);
-  break;
-   case 5:
-  /* There are 20 DWs (D0-D19) in VUE header on Ironlake:
-   * dword 0-3 of the header is indices, point width, clip flags.
-   * dword 4-7 is the ndc position
-   * dword 8-11 of the vertex header is the 4D space position
-   * dword 12-19 of the vertex header is the user clip distance.
-   * dword 20-23 is a pad so that the vertex element data is
aligned
-   * dword 24-27 is the first vertex data we fill.
 *
-   * Note: future pipeline stages expect 4D space position to be
-   * contiguous with the other varyings, so we make dword 24-27 a
-   * duplicate copy of the 4D space position.
+   * On Ironlake the VUE header is nominally 20 dwords, but the hardware
+   * will accept the same header layout as Gen4 [and should be a bit faster]
 */
assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
-  assign_vue_slot(vue_map, BRW_VARYING_SLOT_POS_DUPLICATE);


This was the last use of BRW_VARYING_SLOT_POS_DUPLICATE.  We ought to be
able to remove that from the enum now (and from the switch statement in
vec4_visitor::emit_urb_slot()).

With those changes, this patch is:

Reviewed-by: Paul Berry <stereotype...@gmail.com>

-  assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST0);
-  assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST1);
-  assign_vue_slot(vue_map, BRW_VARYING_SLOT_PAD);
assign_vue_slot(vue_map, VARYING_SLOT_POS);
break;
 case 6:
--
1.8.3



Re: [Mesa-dev] [PATCH 2/3] llvmpipe: add support for layered rendering

2013-06-07 Thread Roland Scheidegger
On 07.06.2013 16:55, Jose Fonseca wrote:
> 
> 
> - Original Message -
>> On 06.06.2013 03:15, Brian Paul wrote:
>>> Reviewed-by: Brian Paul 
>>>
>>> Just two minor nits below.
>>>
>>>
>>> On 06/05/2013 05:44 PM, srol...@vmware.com wrote:
 From: Roland Scheidegger 

 Mostly just make sure the layer parameter gets passed through to the
 right
 places (and get clamped, can do this at setup time), fix up clears to
 clear all layers and disable opaque optimization. Luckily don't need to
 touch the jitted code.
 (Clears invoked via pipe's clear_render_target method will not work
 however
 since the pipe_util_clear function used for it doesn't handle clearing
 multiple layers yet.)
 ---
   src/gallium/drivers/llvmpipe/lp_context.h   |3 +
   src/gallium/drivers/llvmpipe/lp_jit.h   |2 +-
   src/gallium/drivers/llvmpipe/lp_rast.c  |  195
 ---
   src/gallium/drivers/llvmpipe/lp_rast.h  |2 +-
   src/gallium/drivers/llvmpipe/lp_rast_priv.h |   20 ++-
   src/gallium/drivers/llvmpipe/lp_scene.c |   12 +-
   src/gallium/drivers/llvmpipe/lp_scene.h |7 +-
   src/gallium/drivers/llvmpipe/lp_setup.c |1 +
   src/gallium/drivers/llvmpipe/lp_setup_context.h |1 +
   src/gallium/drivers/llvmpipe/lp_setup_line.c|6 +
   src/gallium/drivers/llvmpipe/lp_setup_point.c   |7 +
   src/gallium/drivers/llvmpipe/lp_setup_tri.c |   17 +-
   src/gallium/drivers/llvmpipe/lp_state_derived.c |   13 +-
   src/gallium/drivers/llvmpipe/lp_texture.c   |3 -
   src/gallium/drivers/llvmpipe/lp_texture.h   |   10 ++
   15 files changed, 190 insertions(+), 109 deletions(-)

 diff --git a/src/gallium/drivers/llvmpipe/lp_context.h
 b/src/gallium/drivers/llvmpipe/lp_context.h
 index 54f3830..abfe852 100644
 --- a/src/gallium/drivers/llvmpipe/lp_context.h
 +++ b/src/gallium/drivers/llvmpipe/lp_context.h
 @@ -119,6 +119,9 @@ struct llvmpipe_context {
  /** Which vertex shader output slot contains viewport index */
  int viewport_index_slot;

 +   /** Which geometry shader output slot contains layer */
 +   int layer_slot;
 +
  /**< minimum resolvable depth value, for polygon offset */
  double mrd;

 diff --git a/src/gallium/drivers/llvmpipe/lp_jit.h
 b/src/gallium/drivers/llvmpipe/lp_jit.h
 index 4e9ca76..2ecfde7 100644
 --- a/src/gallium/drivers/llvmpipe/lp_jit.h
 +++ b/src/gallium/drivers/llvmpipe/lp_jit.h
 @@ -204,7 +204,7 @@ typedef void
   const void *dadx,
   const void *dady,
   uint8_t **color,
 -void *depth,
 +uint8_t *depth,
   uint32_t mask,
   struct lp_jit_thread_data *thread_data,
   unsigned *stride,
 diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c
 b/src/gallium/drivers/llvmpipe/lp_rast.c
 index 981dd71..aa5224e 100644
 --- a/src/gallium/drivers/llvmpipe/lp_rast.c
 +++ b/src/gallium/drivers/llvmpipe/lp_rast.c
 @@ -134,6 +134,8 @@ lp_rast_clear_color(struct lp_rasterizer_task *task,

for (i = 0; i < scene->fb.nr_cbufs; i++) {
   enum pipe_format format = scene->fb.cbufs[i]->format;
 +unsigned layer;
 +uint8_t *map_layer = scene->cbufs[i].map;

   if (util_format_is_pure_sint(format)) {
  util_format_write_4i(format, arg.clear_color.i, 0,
 &uc, 0, 0, 0, 1, 1);
 @@ -143,14 +145,17 @@ lp_rast_clear_color(struct lp_rasterizer_task
 *task,
  util_format_write_4ui(format, arg.clear_color.ui, 0,
 &uc, 0, 0, 0, 1, 1);
   }

 -util_fill_rect(scene->cbufs[i].map,
 -   scene->fb.cbufs[i]->format,
 -   scene->cbufs[i].stride,
 -   task->x,
 -   task->y,
 -   task->width,
 -   task->height,
 -   &uc);
 +for (layer = 0; layer <= scene->fb_max_layer; layer++) {
 +   util_fill_rect(map_layer,
 +  scene->fb.cbufs[i]->format,
 +  scene->cbufs[i].stride,
 +  task->x,
 +  task->y,
 +  task->width,
 +  task->height,
 +  &uc);
 +   map_layer += scene->cbufs[i].layer_stride;
 +}
}
 }
 else {
>>>
>>> So, just

[Mesa-dev] [PATCH 1/2] i965/fs: Dump IR when fatally not compiling due to bad register spilling.

2013-06-07 Thread Eric Anholt
It should never happen, but it does, and at this point, you're going to
_mesa_problem() and abort() (unless it's just in precompile).  Give the
developer something to look at.
---
 src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
index acd9846..cf74bf4 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_reg_allocate.cpp
@@ -459,7 +459,8 @@ fs_visitor::assign_regs()
   int reg = choose_spill_reg(g);
 
   if (reg == -1) {
-fail("no register to spill\n");
+ fail("no register to spill:\n");
+ dump_instructions();
   } else if (dispatch_width == 16) {
 fail("Failure to register allocate.  Reduce number of live scalar "
   "values to avoid this.");
-- 
1.8.3.rc0



[Mesa-dev] [PATCH 2/2] ra: Fix register spilling.

2013-06-07 Thread Eric Anholt
Commit 551c991606e543c3a264a762026f11348b37947e tried to avoid spilling
registers that were trivially colorable.  But since we do optimistic
coloring, the top of the stack also contains nodes that are not trivially
colorable, so we need to consider them for spilling (since they are some
of our best candidates).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58384
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63674
NOTE: This is a candidate for the 9.1 branch.
---
 src/mesa/program/register_allocate.c | 44 
 1 file changed, 39 insertions(+), 5 deletions(-)

diff --git a/src/mesa/program/register_allocate.c 
b/src/mesa/program/register_allocate.c
index 16739fd..8e3a9da 100644
--- a/src/mesa/program/register_allocate.c
+++ b/src/mesa/program/register_allocate.c
@@ -157,6 +157,16 @@ struct ra_graph {
 
unsigned int *stack;
unsigned int stack_count;
+
+   /**
+* Tracks the start of the set of optimistically-colored registers in the
+* stack.
+*
+* Along with any registers not in the stack (if one called ra_simplify()
+* and didn't do optimistic coloring), these need to be considered for
+* spilling.
+*/
+   unsigned int stack_optimistic_start;
 };
 
 /**
@@ -509,6 +519,7 @@ ra_optimistic_color(struct ra_graph *g)
 {
unsigned int i;
 
+   g->stack_optimistic_start = g->stack_count;
for (i = 0; i < g->count; i++) {
   if (g->nodes[i].in_stack || g->nodes[i].reg != NO_REG)
 continue;
@@ -587,8 +598,16 @@ ra_get_best_spill_node(struct ra_graph *g)
 {
unsigned int best_node = -1;
float best_benefit = 0.0;
-   unsigned int n;
+   unsigned int n, i;
 
+   /* For any registers not in the stack to be colored, consider them for
+* spilling.  This will mostly collect nodes that were being optimistically
+* colored as part of ra_allocate_no_spills() if we didn't successfully
+* optimistically color.
+*
+* It also includes nodes not trivially colorable by ra_simplify() if it
+* was used directly instead of as part of ra_allocate_no_spills().
+*/
for (n = 0; n < g->count; n++) {
   float cost = g->nodes[n].spill_cost;
   float benefit;
@@ -596,10 +615,6 @@ ra_get_best_spill_node(struct ra_graph *g)
   if (cost <= 0.0)
 continue;
 
-  /* Only consider registers for spilling if they are still in the
-   * interference graph (those on the stack have already been proven to be
-   * allocatable without spilling).
-   */
   if (g->nodes[n].in_stack)
  continue;
 
@@ -611,6 +626,25 @@ ra_get_best_spill_node(struct ra_graph *g)
   }
}
 
+   /* Also consider spilling any nodes that were set up to be optimistically
+* colored that we couldn't manage to color in ra_select().
+*/
+   for (i = g->stack_optimistic_start; i < g->stack_count; i++) {
+  n = g->stack[i];
+  float cost = g->nodes[n].spill_cost;
+  float benefit;
+
+  if (cost <= 0.0)
+ continue;
+
+  benefit = ra_get_spill_benefit(g, n);
+
+  if (benefit / cost > best_benefit) {
+best_benefit = benefit / cost;
+best_node = n;
+  }
+   }
+
return best_node;
 }
 
-- 
1.8.3.rc0
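
For readers skimming the diff, the selection logic after this change can be
condensed as below. It is a simplified sketch rather than the real
register_allocate.c code: the structs are cut down to the fields that matter,
and spill_benefit is a hypothetical precomputed field standing in for
ra_get_spill_benefit().

```c
#include <assert.h>

/* Cut-down node: spill_benefit is a hypothetical precomputed value. */
struct node {
   float spill_cost;
   float spill_benefit;
   int in_stack;
};

struct graph {
   struct node *nodes;
   unsigned count;
   unsigned *stack;
   unsigned stack_count;
   /* Index of the first optimistically-colored entry in the stack. */
   unsigned stack_optimistic_start;
};

/* Update the best candidate if node n has a better benefit/cost ratio. */
static void
consider(const struct graph *g, unsigned n, int *best_node, float *best_ratio)
{
   float cost = g->nodes[n].spill_cost;
   float ratio;

   if (cost <= 0.0f)
      return;

   ratio = g->nodes[n].spill_benefit / cost;
   if (ratio > *best_ratio) {
      *best_ratio = ratio;
      *best_node = (int)n;
   }
}

static int
best_spill_node(const struct graph *g)
{
   int best_node = -1;
   float best_ratio = 0.0f;
   unsigned i, n;

   /* Nodes that never made it onto the stack. */
   for (n = 0; n < g->count; n++) {
      if (!g->nodes[n].in_stack)
         consider(g, n, &best_node, &best_ratio);
   }

   /* Nodes pushed optimistically: on the stack, but not proven colorable. */
   for (i = g->stack_optimistic_start; i < g->stack_count; i++)
      consider(g, g->stack[i], &best_node, &best_ratio);

   return best_node;
}
```

Entries below stack_optimistic_start are the trivially colorable ones and
stay excluded, which keeps the intent of the earlier commit while letting the
optimistic entries compete as spill candidates.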



Re: [Mesa-dev] [PATCH] Rename api_validate.[ch] to draw_validate.[ch]

2013-06-07 Thread Kenneth Graunke

On 06/07/2013 08:49 AM, Brian Paul wrote:

Kenneth, it turns out that git blame handles file renaming just fine.
You'll see the line-by-line change information, along with the old
filename when you do git blame.

Did you, or anyone else, have any other objections?

-Brian


Huh.  I could have sworn it didn't, but you're right.  I don't really 
mind one way or another then.  Thanks!


--Ken


Re: [Mesa-dev] [PATCH 0/2] Fix gl_ClipVertex support on pre-Gen6 i965

2013-06-07 Thread Paul Berry
On 7 June 2013 02:25, Chris Forbes  wrote:

> Hi Paul
>
> Thanks for that suggestion -- you're right, the hardware does seem
> quite happy with the Gen4 layout. I'm doing a full piglit run to be
> safe.
>
> As far as performance goes, it's good for about a 3% speedup on the
> CS:S video stress test.
>
> -- Chris
>

Wow, that's a bigger improvement than I expected.  Glad to hear it!


Re: [Mesa-dev] [PATCH] i965: Shrink Gen5 VUE map layout to be the same as Gen4.

2013-06-07 Thread Paul Berry
On 7 June 2013 03:17, Chris Forbes  wrote:

> The PRM suggests a larger layout, mostly to support having
> gl_ClipDistance[] somewhere predictable for the fixed-function clipper
> -- but it didn't actually arrive in Gen5.
>
> Just use the same layout for both Gen4 and Gen5.
>
> No Piglit regressions.
>
> Improves performance in CS:S Video Stress Test by ~3%.
>
> Signed-off-by: Chris Forbes 
> ---
>  src/mesa/drivers/dri/i965/brw_sf_state.c |  5 +
>  src/mesa/drivers/dri/i965/brw_vs.c   | 23 +++
>  2 files changed, 4 insertions(+), 24 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c
> b/src/mesa/drivers/dri/i965/brw_sf_state.c
> index 7c29ba2..e9b7e66 100644
> --- a/src/mesa/drivers/dri/i965/brw_sf_state.c
> +++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
> @@ -131,10 +131,7 @@ const struct brw_tracked_state brw_sf_vp = {
>  int
>  brw_sf_compute_urb_entry_read_offset(struct intel_context *intel)
>  {
> -   if (intel->gen == 5)
> -  return 3;
> -   else
> -  return 1;
> +   return 1;
>  }
>

How about just turning this into #define BRW_SF_URB_ENTRY_READ_OFFSET 1 in
a header somewhere?  It seems silly to have a function whose only job is to
return a constant.


>
>  static void upload_sf_unit( struct brw_context *brw )
> diff --git a/src/mesa/drivers/dri/i965/brw_vs.c
> b/src/mesa/drivers/dri/i965/brw_vs.c
> index 720325d..d173d2e 100644
> --- a/src/mesa/drivers/dri/i965/brw_vs.c
> +++ b/src/mesa/drivers/dri/i965/brw_vs.c
> @@ -85,34 +85,17 @@ brw_compute_vue_map(struct brw_context *brw, struct
> brw_vue_map *vue_map,
>  */
> switch (intel->gen) {
> case 4:
> +   case 5:
>/* There are 8 dwords in VUE header pre-Ironlake:
> * dword 0-3 is indices, point width, clip flags.
> * dword 4-7 is ndc position
> * dword 8-11 is the first vertex data.
> -   */
> -  assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
> -  assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
> -  assign_vue_slot(vue_map, VARYING_SLOT_POS);
> -  break;
> -   case 5:
> -  /* There are 20 DWs (D0-D19) in VUE header on Ironlake:
> -   * dword 0-3 of the header is indices, point width, clip flags.
> -   * dword 4-7 is the ndc position
> -   * dword 8-11 of the vertex header is the 4D space position
> -   * dword 12-19 of the vertex header is the user clip distance.
> -   * dword 20-23 is a pad so that the vertex element data is aligned
> -   * dword 24-27 is the first vertex data we fill.
> *
> -   * Note: future pipeline stages expect 4D space position to be
> -   * contiguous with the other varyings, so we make dword 24-27 a
> -   * duplicate copy of the 4D space position.
> +   * On Ironlake the VUE header is nominally 20 dwords, but the hardware
> +   * will accept the same header layout as Gen4 [and should be a bit faster]
> */
>assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
>assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
> -  assign_vue_slot(vue_map, BRW_VARYING_SLOT_POS_DUPLICATE);
>

This was the last use of BRW_VARYING_SLOT_POS_DUPLICATE.  We ought to be
able to remove that from the enum now (and from the switch statement in
vec4_visitor::emit_urb_slot()).

With those changes, this patch is:

Reviewed-by: Paul Berry 


> -  assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST0);
> -  assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST1);
> -  assign_vue_slot(vue_map, BRW_VARYING_SLOT_PAD);
>assign_vue_slot(vue_map, VARYING_SLOT_POS);
>break;
> case 6:
> --
> 1.8.3
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>


Re: [Mesa-dev] [PATCH 5/7] intel: Directly implement blit glBlitFramebuffer instead of awkward reuse.

2013-06-07 Thread Paul Berry
On 5 June 2013 10:14, Eric Anholt  wrote:

> This gets us support for blitting to attachment types other than
> textures.
>

I don't follow everything in this patch, but I trust Ken's review, so
consider it

Acked-by: Paul Berry 

I already made comments on patches 1 and 2.  Patches 3, 4, 6, and 7 are:

Reviewed-by: Paul Berry 


> ---
>  src/mesa/drivers/dri/intel/intel_fbo.c  | 129
> +++-
>  src/mesa/drivers/dri/intel/intel_tex.h  |   7 --
>  src/mesa/drivers/dri/intel/intel_tex_copy.c |   2 +-
>  3 files changed, 69 insertions(+), 69 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/intel/intel_fbo.c
> b/src/mesa/drivers/dri/intel/intel_fbo.c
> index 9f892a9..9a24a55 100644
> --- a/src/mesa/drivers/dri/intel/intel_fbo.c
> +++ b/src/mesa/drivers/dri/intel/intel_fbo.c
> @@ -737,76 +737,83 @@ intel_validate_framebuffer(struct gl_context *ctx,
> struct gl_framebuffer *fb)
>   * normal path.
>   */
>  static GLbitfield
> -intel_blit_framebuffer_copy_tex_sub_image(struct gl_context *ctx,
> -  GLint srcX0, GLint srcY0,
> -  GLint srcX1, GLint srcY1,
> -  GLint dstX0, GLint dstY0,
> -  GLint dstX1, GLint dstY1,
> -  GLbitfield mask, GLenum filter)
> +intel_blit_framebuffer_with_blitter(struct gl_context *ctx,
> +GLint srcX0, GLint srcY0,
> +GLint srcX1, GLint srcY1,
> +GLint dstX0, GLint dstY0,
> +GLint dstX1, GLint dstY1,
> +GLbitfield mask, GLenum filter)
>  {
> +   struct intel_context *intel = intel_context(ctx);
> +
> if (mask & GL_COLOR_BUFFER_BIT) {
>GLint i;
>const struct gl_framebuffer *drawFb = ctx->DrawBuffer;
>const struct gl_framebuffer *readFb = ctx->ReadBuffer;
> -  const struct gl_renderbuffer_attachment *drawAtt;
> -  struct intel_renderbuffer *srcRb =
> - intel_renderbuffer(readFb->_ColorReadBuffer);
> +  struct gl_renderbuffer *src_rb = readFb->_ColorReadBuffer;
> +  struct intel_renderbuffer *src_irb = intel_renderbuffer(src_rb);
> +
> +  if (!src_irb) {
> + perf_debug("glBlitFramebuffer(): missing src renderbuffer.  "
> +"Falling back to software rendering.\n");
> + return mask;
> +  }
>
>/* If the source and destination are the same size with no
> mirroring,
> * the rectangles are within the size of the texture and there is no
> -   * scissor then we can use glCopyTexSubimage2D to implement the
> blit.
> -   * This will end up as a fast hardware blit on some drivers.
> +   * scissor, then we can probably use the blit engine.
> */
> -  const GLboolean use_intel_copy_texsubimage =
> - srcX0 - srcX1 == dstX0 - dstX1 &&
> - srcY0 - srcY1 == dstY0 - dstY1 &&
> - srcX1 >= srcX0 &&
> - srcY1 >= srcY0 &&
> - srcX0 >= 0 && srcX1 <= readFb->Width &&
> - srcY0 >= 0 && srcY1 <= readFb->Height &&
> - dstX0 >= 0 && dstX1 <= drawFb->Width &&
> - dstY0 >= 0 && dstY1 <= drawFb->Height &&
> - !ctx->Scissor.Enabled;
> -
> -  /* Verify that all the draw buffers can be blitted using
> -   * intel_copy_texsubimage().
> +  if (!(srcX0 - srcX1 == dstX0 - dstX1 &&
> +srcY0 - srcY1 == dstY0 - dstY1 &&
> +srcX1 >= srcX0 &&
> +srcY1 >= srcY0 &&
> +srcX0 >= 0 && srcX1 <= readFb->Width &&
> +srcY0 >= 0 && srcY1 <= readFb->Height &&
> +dstX0 >= 0 && dstX1 <= drawFb->Width &&
> +dstY0 >= 0 && dstY1 <= drawFb->Height &&
> +!ctx->Scissor.Enabled)) {
> + perf_debug("glBlitFramebuffer(): non-1:1 blit.  "
> +"Falling back to software rendering.\n");
> + return mask;
> +  }
> +
> +  /* Blit to all active draw buffers.  We don't do any pre-checking,
> +   * because we assume that copying to MRTs is rare, and failure
> midway
> +   * through copying is even more rare.  Given that feedback loops in
> +   * glBlitFramebuffer() are undefined, we can safely fail out after
> +   * having partially completed our copies.
> */
>for (i = 0; i < ctx->DrawBuffer->_NumColorDrawBuffers; i++) {
> - int idx = ctx->DrawBuffer->_ColorDrawBufferIndexes[i];
> - if (idx == -1)
> -continue;
> - drawAtt = &drawFb->Attachment[idx];
> -
> - if (srcRb && drawAtt && drawAtt->Texture &&
> - use_intel_copy_texsubimage)
> -continue;
> - else
> + struct gl_renderbuffer *dst_rb =
> ctx->DrawBuffer->_ColorDrawBuffers[i];
> + struct intel_renderbuffer *dst_irb = intel_rende

Re: [Mesa-dev] [PATCH] u_vbuf: fix index buffer leak

2013-06-07 Thread Alex Deucher
On Fri, Jun 7, 2013 at 1:36 PM, Chia-I Wu  wrote:
> On Fri, Jun 7, 2013 at 9:25 PM, Alex Deucher  wrote:
>> Candidate for the stable branches?
> Ah, I already committed it.  I will let it settle in master for a few
> days and cherry-pick it to 9.1, unless this is against some policy
> about stable branches.

Sounds good.  Thanks!

>
>> On Fri, Jun 7, 2013 at 5:58 AM, Marek Olšák  wrote:
>>> Reviewed-by: Marek Olšák 
>>>
>>> Marek
>>>
>>> On Fri, Jun 7, 2013 at 6:25 AM, Chia-I Wu  wrote:

 Signed-off-by: Chia-I Wu 
 ---
  src/gallium/auxiliary/util/u_vbuf.c |3 +++
  1 file changed, 3 insertions(+)

 diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
 b/src/gallium/auxiliary/util/u_vbuf.c
 index 244b04d..5936f74 100644
 --- a/src/gallium/auxiliary/util/u_vbuf.c
 +++ b/src/gallium/auxiliary/util/u_vbuf.c
 @@ -307,6 +307,9 @@ void u_vbuf_destroy(struct u_vbuf *mgr)
 unsigned num_vb = screen->get_shader_param(screen, PIPE_SHADER_VERTEX,
PIPE_SHADER_CAP_MAX_INPUTS);

 +   mgr->pipe->set_index_buffer(mgr->pipe, NULL);
 +   pipe_resource_reference(&mgr->index_buffer.buffer, NULL);
 +
 mgr->pipe->set_vertex_buffers(mgr->pipe, 0, num_vb, NULL);

 for (i = 0; i < PIPE_MAX_ATTRIBS; i++) {
 --
 1.7.10.4

 ___
 mesa-dev mailing list
 mesa-dev@lists.freedesktop.org
 http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
>
>
> --
> o...@lunarg.com


Re: [Mesa-dev] [PATCH] u_vbuf: fix index buffer leak

2013-06-07 Thread Chia-I Wu
On Fri, Jun 7, 2013 at 9:25 PM, Alex Deucher  wrote:
> Candidate for the stable branches?
Ah, I already committed it.  I will let it settle in master for a few
days and cherry-pick it to 9.1, unless this is against some policy
about stable branches.

> On Fri, Jun 7, 2013 at 5:58 AM, Marek Olšák  wrote:
>> Reviewed-by: Marek Olšák 
>>
>> Marek
>>
>> On Fri, Jun 7, 2013 at 6:25 AM, Chia-I Wu  wrote:
>>>
>>> Signed-off-by: Chia-I Wu 
>>> ---
>>>  src/gallium/auxiliary/util/u_vbuf.c |3 +++
>>>  1 file changed, 3 insertions(+)
>>>
>>> diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
>>> b/src/gallium/auxiliary/util/u_vbuf.c
>>> index 244b04d..5936f74 100644
>>> --- a/src/gallium/auxiliary/util/u_vbuf.c
>>> +++ b/src/gallium/auxiliary/util/u_vbuf.c
>>> @@ -307,6 +307,9 @@ void u_vbuf_destroy(struct u_vbuf *mgr)
>>> unsigned num_vb = screen->get_shader_param(screen, PIPE_SHADER_VERTEX,
>>>PIPE_SHADER_CAP_MAX_INPUTS);
>>>
>>> +   mgr->pipe->set_index_buffer(mgr->pipe, NULL);
>>> +   pipe_resource_reference(&mgr->index_buffer.buffer, NULL);
>>> +
>>> mgr->pipe->set_vertex_buffers(mgr->pipe, 0, num_vb, NULL);
>>>
>>> for (i = 0; i < PIPE_MAX_ATTRIBS; i++) {
>>> --
>>> 1.7.10.4
>>>
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>> ___
>> mesa-dev mailing list
>> mesa-dev@lists.freedesktop.org
>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev



--
o...@lunarg.com


Re: [Mesa-dev] [PATCH 2/2] util: fix util_clear_render_target and util_clear_depth_stencil layer handling

2013-06-07 Thread Marek Olšák
I understand, though the current hackish approach in the radeon (and
also nouveau) drivers has been working really well for the last 3
years and it doesn't look like anybody would like to change that. Just
saying. Not that it's important.

Marek

On Fri, Jun 7, 2013 at 4:57 PM, Roland Scheidegger  wrote:
> On 07.06.2013 12:14, Marek Olšák wrote:
>>> diff --git a/src/gallium/auxiliary/util/u_transfer.c 
>>> b/src/gallium/auxiliary/util/u_transfer.c
>>> index 56e059b..7804f2a 100644
>>> --- a/src/gallium/auxiliary/util/u_transfer.c
>>> +++ b/src/gallium/auxiliary/util/u_transfer.c
>>> @@ -25,6 +25,7 @@ void u_default_transfer_inline_write( struct pipe_context *pipe,
>>> usage |= PIPE_TRANSFER_WRITE;
>>>
>>> /* transfer_inline_write implicitly discards the rewritten buffer range */
>>> +   /* XXX this looks very broken for non-buffer resources having more than one dim. */
>>> if (box->x == 0 && box->width == resource->width0) {
>>
>> Indeed, however radeon drivers ignore the discard flags for textures
>> and if the transfer is write-only, they behave as if DISCARD_RANGE was
>> set no matter what the flags are. It's a response to state trackers not
>> having used the discard flags when they should (that is: always). I
>> propose to standardize this behavior, i.e. if PIPE_TRANSFER_READ is
>> not set for texture transfers, PIPE_TRANSFER_DISCARD_RANGE is implied.
>
> I think that's a bit awkward to make it the state tracker's
> responsibility to set PIPE_TRANSFER_READ if it doesn't want to write the
> whole range, setting TRANSFER_DISCARD_RANGE if it does want to write
> everything sounds more natural and safe to me.
> If state trackers don't do this they should just be fixed.
>
> Roland
>
>>
>> Marek
>>
>>>usage |= PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE;
>>> } else {
>>> --
>>> 1.7.9.5
>>> ___
>>> mesa-dev mailing list
>>> mesa-dev@lists.freedesktop.org
>>> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
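
The two positions quoted above can be put side by side in a few lines. This
is just a sketch: the XFER_* flags are hypothetical stand-ins for the
PIPE_TRANSFER_* flags, and their values are made up for illustration.

```c
#include <assert.h>

/* Hypothetical flags; the values are illustrative, not Gallium's. */
enum {
   XFER_READ          = 1 << 0,
   XFER_WRITE         = 1 << 1,
   XFER_DISCARD_RANGE = 1 << 2
};

/* Proposal: for texture transfers, a missing READ flag implies
 * DISCARD_RANGE, whatever the state tracker passed in. */
static unsigned
effective_usage_implied(unsigned usage, int is_texture)
{
   if (is_texture && !(usage & XFER_READ))
      usage |= XFER_DISCARD_RANGE;
   return usage;
}

/* Counter-position: flags are taken as given, and a state tracker that
 * wants a discard has to ask for it explicitly. */
static unsigned
effective_usage_explicit(unsigned usage)
{
   return usage;
}

/* A driver has to keep the old contents unless the range is discarded. */
static int
must_preserve_old_contents(unsigned usage)
{
   return !(usage & XFER_DISCARD_RANGE);
}
```

The difference only shows up for write-only texture transfers without an
explicit discard flag: the first variant drops the old contents, the second
has to preserve them.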


[Mesa-dev] [PATCH 2/2] radeonsi: Handle TGSI_OPCODE_DDX/Y

2013-06-07 Thread Michel Dänzer
From: Michel Dänzer 

16 more little piglits.

Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 35 ++
 src/gallium/drivers/radeonsi/radeonsi_shader.h |  1 +
 src/gallium/drivers/radeonsi/si_state_draw.c   |  1 +
 3 files changed, 37 insertions(+)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index fc14f3c..d220a97 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -42,6 +42,7 @@
 #include "tgsi/tgsi_scan.h"
 #include "tgsi/tgsi_util.h"
 #include "tgsi/tgsi_dump.h"
+#include "util/u_memory.h"
 
 #include "radeonsi_pipe.h"
 #include "radeonsi_shader.h"
@@ -1065,6 +1066,33 @@ static void txq_fetch_args(
4);
 }
 
+static void si_llvm_emit_ddxy(
+   const struct lp_build_tgsi_action * action,
+   struct lp_build_tgsi_context * bld_base,
+   struct lp_build_emit_data * emit_data)
+{
+   struct gallivm_state * gallivm = bld_base->base.gallivm;
+   LLVMValueRef args[6];
+   unsigned c, sampler_src;
+
+   assert(emit_data->arg_count + 2 <= Elements(args));
+
+   for (c = 0; c < emit_data->arg_count; ++c)
+   args[c] = emit_data->args[c];
+
+   sampler_src = emit_data->inst->Instruction.NumSrcRegs-1;
+
+   args[c] = lp_build_const_int32(gallivm,
+  
emit_data->inst->Src[sampler_src].Register.Index);
+   args[c + 1] = args[c];
+   args[c + 2] = lp_build_const_int32(gallivm, 
emit_data->inst->Texture.Texture);
+
+   emit_data->output[0] = build_intrinsic(gallivm->builder,
+  action->intr_name,
+  emit_data->dst_type, args, c + 3,
+  LLVMReadNoneAttribute);
+}
+
 static const struct lp_build_tgsi_action tex_action = {
.fetch_args = tex_fetch_args,
.emit = build_tex_intrinsic,
@@ -1331,6 +1359,10 @@ int si_pipe_shader_create(
bld_base = &si_shader_ctx.radeon_bld.soa.bld_base;
 
tgsi_scan_shader(sel->tokens, &shader_info);
+
+   shader->shader.uses_derivs =
+   shader_info.opcode_count[TGSI_OPCODE_DDX] > 0 ||
+   shader_info.opcode_count[TGSI_OPCODE_DDY] > 0;
shader->shader.uses_kill = shader_info.uses_kill;
shader->shader.uses_instanceid = shader_info.uses_instanceid;
bld_base->info = &shader_info;
@@ -1345,6 +1377,9 @@ int si_pipe_shader_create(
bld_base->op_actions[TGSI_OPCODE_TXP] = tex_action;
bld_base->op_actions[TGSI_OPCODE_TXQ] = txq_action;
 
+   bld_base->op_actions[TGSI_OPCODE_DDX].emit = si_llvm_emit_ddxy;
+   bld_base->op_actions[TGSI_OPCODE_DDY].emit = si_llvm_emit_ddxy;
+
si_shader_ctx.radeon_bld.load_input = declare_input;
si_shader_ctx.radeon_bld.load_system_value = declare_system_value;
si_shader_ctx.tokens = sel->tokens;
diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.h 
b/src/gallium/drivers/radeonsi/radeonsi_shader.h
index 33e81c7..1a6c1c9 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.h
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.h
@@ -107,6 +107,7 @@ struct si_shader {
struct si_shader_io output[40];
 
unsigned ninterp;
+   bool uses_derivs;
bool uses_kill;
bool uses_instanceid;
bool fs_write_all;
diff --git a/src/gallium/drivers/radeonsi/si_state_draw.c 
b/src/gallium/drivers/radeonsi/si_state_draw.c
index 09c741f..6e6450d 100644
--- a/src/gallium/drivers/radeonsi/si_state_draw.c
+++ b/src/gallium/drivers/radeonsi/si_state_draw.c
@@ -232,6 +232,7 @@ static void si_pipe_shader_ps(struct pipe_context *ctx, 
struct si_pipe_shader *s
   S_00B028_VGPRS((shader->num_vgprs - 1) / 4) |
   S_00B028_SGPRS((num_sgprs - 1) / 8));
si_pm4_set_reg(pm4, R_00B02C_SPI_SHADER_PGM_RSRC2_PS,
+  S_00B02C_EXTRA_LDS_SIZE(shader->shader.uses_derivs ? 1 : 
0) |
   S_00B02C_USER_SGPR(num_user_sgprs));
 
si_pm4_set_reg(pm4, R_02880C_DB_SHADER_CONTROL, db_shader_control);
-- 
1.8.3



[Mesa-dev] [PATCH 1/2] radeonsi: Handle TGSI_OPCODE_TXD

2013-06-07 Thread Michel Dänzer
From: Michel Dänzer 

One more little piglit.

Signed-off-by: Michel Dänzer 
---
 src/gallium/drivers/radeonsi/radeonsi_shader.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/radeonsi/radeonsi_shader.c 
b/src/gallium/drivers/radeonsi/radeonsi_shader.c
index f6fdfae..fc14f3c 100644
--- a/src/gallium/drivers/radeonsi/radeonsi_shader.c
+++ b/src/gallium/drivers/radeonsi/radeonsi_shader.c
@@ -875,6 +875,7 @@ static void tex_fetch_args(
const struct tgsi_full_instruction * inst = emit_data->inst;
unsigned opcode = inst->Instruction.Opcode;
unsigned target = inst->Texture.Texture;
+   unsigned sampler_src;
LLVMValueRef coords[4];
LLVMValueRef address[16];
int ref_pos;
@@ -920,6 +921,15 @@ static void tex_fetch_args(
address[count++] = lp_build_emit_fetch(bld_base, inst, 1, 0);
}
 
+   /* Pack user derivatives */
+   if (opcode == TGSI_OPCODE_TXD) {
+   for (chan = 0; chan < 2; chan++) {
+   address[count++] = lp_build_emit_fetch(bld_base, inst, 
1, chan);
+   if (num_coords > 1)
+   address[count++] = 
lp_build_emit_fetch(bld_base, inst, 2, chan);
+   }
+   }
+
/* Pack texture coordinates */
address[count++] = coords[0];
if (num_coords > 1)
@@ -961,8 +971,10 @@ static void tex_fetch_args(
 "");
}
 
+   sampler_src = emit_data->inst->Instruction.NumSrcRegs - 1;
+
/* Resource */
-   emit_data->args[1] = 
si_shader_ctx->resources[emit_data->inst->Src[1].Register.Index];
+   emit_data->args[1] = 
si_shader_ctx->resources[emit_data->inst->Src[sampler_src].Register.Index];
 
if (opcode == TGSI_OPCODE_TXF) {
/* add tex offsets */
@@ -993,7 +1005,7 @@ static void tex_fetch_args(
emit_data->arg_count = 3;
} else {
/* Sampler */
-   emit_data->args[2] = 
si_shader_ctx->samplers[emit_data->inst->Src[1].Register.Index];
+   emit_data->args[2] = 
si_shader_ctx->samplers[emit_data->inst->Src[sampler_src].Register.Index];
 
emit_data->dst_type = LLVMVectorType(
LLVMFloatTypeInContext(bld_base->base.gallivm->context),
@@ -1065,6 +1077,12 @@ static const struct lp_build_tgsi_action txb_action = {
.intr_name = "llvm.SI.sampleb."
 };
 
+static const struct lp_build_tgsi_action txd_action = {
+   .fetch_args = tex_fetch_args,
+   .emit = build_tex_intrinsic,
+   .intr_name = "llvm.SI.sampled."
+};
+
 static const struct lp_build_tgsi_action txf_action = {
.fetch_args = tex_fetch_args,
.emit = build_tex_intrinsic,
@@ -1321,6 +1339,7 @@ int si_pipe_shader_create(
 
bld_base->op_actions[TGSI_OPCODE_TEX] = tex_action;
bld_base->op_actions[TGSI_OPCODE_TXB] = txb_action;
+   bld_base->op_actions[TGSI_OPCODE_TXD] = txd_action;
bld_base->op_actions[TGSI_OPCODE_TXF] = txf_action;
bld_base->op_actions[TGSI_OPCODE_TXL] = txl_action;
bld_base->op_actions[TGSI_OPCODE_TXP] = tex_action;
-- 
1.8.3
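
For readers following the sampler_src hunks: the patch stops assuming the sampler is source operand 1 and takes the last source register instead. That matters for TXD, where the two derivative operands sit between the coordinate and the sampler. A minimal sketch of the indexing, using a hypothetical stand-in struct (ex_instruction) rather than the real tgsi_full_instruction:

```c
#include <assert.h>

/* Hypothetical stand-in for the few fields used here; the real
 * struct tgsi_full_instruction has many more members. */
struct ex_instruction {
   unsigned num_src_regs;   /* mirrors Instruction.NumSrcRegs */
   int src_index[4];        /* mirrors Src[i].Register.Index */
};

/* The sampler is taken to be the last source operand. */
static unsigned
ex_sampler_src(const struct ex_instruction *inst)
{
   assert(inst->num_src_regs >= 1);
   return inst->num_src_regs - 1;
}

/* Index used to look up the resource and sampler descriptors. */
static int
ex_sampler_index(const struct ex_instruction *inst)
{
   return inst->src_index[ex_sampler_src(inst)];
}
```

With two sources (coordinate, sampler) this still yields operand 1, so the existing TEX path is unchanged; with four sources (coordinate, ddx, ddy, sampler) it yields operand 3.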



[Mesa-dev] [PATCH 0/2] radeonsi: Derivatives support

2013-06-07 Thread Michel Dänzer
I wonder how we should deal with LLVM 3.3 for these: Might we be able to
get the intrinsics into an LLVM 3.3.y release, or do we need to only enable
this stuff as of LLVM 3.4?

[PATCH 1/2] radeonsi: Handle TGSI_OPCODE_TXD
[PATCH 2/2] radeonsi: Handle TGSI_OPCODE_DDX/Y


Re: [Mesa-dev] [PATCH] Rename api_validate.[ch] to draw_validate.[ch]

2013-06-07 Thread Brian Paul

On 06/06/2013 01:57 AM, Arnas Milaševičius wrote:

On Thu, Jun 6, 2013 at 3:23 AM, Brian Paul <bri...@vmware.com> wrote:

On 06/05/2013 03:25 PM, Kenneth Graunke wrote:

On 06/05/2013 12:09 PM, Arnas Milasevicius wrote:

---
   src/mesa/Makefile.sources |   2 +-
   src/mesa/SConscript   |   2 +-
   src/mesa/main/draw_validate.c | 745 ++
   3 files changed, 747 insertions(+), 2 deletions(-)
   create mode 100644 src/mesa/main/draw_validate.c


It looks like this patch leaves the old api_validate.c file in
place, so
we would have two copies of everything.  The proper way to do
this is:
$ cd src/mesa/main
$ git mv api_validate.c draw_validate.c
$ 
$ git commit -a

That said...Brian, was this one of your ideas?  I don't see much
point
to renaming this file, and renaming files makes it harder to go
back in
history with git blame and such.  So unless there's a good
reason, I'd
rather leave it be.


Yes, it's from my personal Mesa to-do list.  Your point about
git-blame is well taken so if you'd rather not have the file renamed
we can leave it as-is.  It's just another one of those little things
that I've always found annoying.



> So, should I resend it with `git mv`, or will we leave this file's name
> as it is?

Kenneth, it turns out that git blame handles file renaming just fine. 
You'll see the line-by-line change information, along with the old 
filename when you do git blame.


Did you, or anyone else, have any other objections?

-Brian



[Mesa-dev] [Bug 65513] New: In TGSI module, replace string arrays with functions

2013-06-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65513

  Priority: medium
Bug ID: 65513
  Assignee: mesa-dev@lists.freedesktop.org
   Summary: In TGSI module, replace string arrays with functions
  Severity: trivial
Classification: Unclassified
OS: All
  Reporter: bri...@vmware.com
  Hardware: Other
Status: NEW
   Version: git
 Component: Mesa core
   Product: Mesa

This is another relatively simple code clean-up project.

In the tgsi_string.[ch] files we have arrays such as tgsi_semantic_names[] and
tgsi_texture_names[] which are used to map TGSI enums to strings.  In the .c
file we have static assertions to check that the number of strings in the array
matches the TGSI_x_COUNT values.  But the assertions are useless since the
arrays are explicitly dimensioned.  The point of the assertions is to make sure
that when we add a new TGSI enum that we also update the array of strings used
for TGSI parsing/printing.

In commit 14541dacab218cbe82310d999d44130ebc3f6526 we replaced the
tgsi_file_names[] array with a new tgsi_file_name() function.  The static
assertion now works properly, and it's probably a better solution anyway.  This
task is to do the same transformation for the other string arrays.

Please do one patch for each array->function transformation.
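
A minimal sketch of the kind of transformation described above. It uses a made-up three-value enum (ex_texture) and a hypothetical accessor name, and is not the code from the commit cited above. The point is that the table is left without an explicit dimension, so a compile-time check on its length actually means something:

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical enum standing in for one of the TGSI enums. */
enum ex_texture {
   EX_TEXTURE_1D,
   EX_TEXTURE_2D,
   EX_TEXTURE_3D,
   EX_TEXTURE_COUNT
};

#define EX_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))

/* No explicit dimension: the array is exactly as long as its initializer. */
static const char *const ex_texture_names[] = {
   "1D",
   "2D",
   "3D"
};

/* Fails to compile if an enum value is added without a matching string. */
static_assert(EX_ELEMENTS(ex_texture_names) == EX_TEXTURE_COUNT,
              "ex_texture_names[] is out of sync with enum ex_texture");

/* Accessor replacing direct indexing of a public array. */
const char *
ex_texture_name(unsigned target)
{
   if (target < EX_ELEMENTS(ex_texture_names))
      return ex_texture_names[target];
   return "invalid";
}
```

Keeping the table private to the .c file also gives the accessor one place to handle out-of-range values, which callers indexing a public array would each have to do themselves.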



Re: [Mesa-dev] [PATCH] llvmpipe: Use saturating add/sub for UNORM formats

2013-06-07 Thread Jose Fonseca
Looks good to me.

Jose

- Original Message -
> lp_build_add and lp_build_sub have fallback code for cases
> that cannot be handled by known intrinsics.  For UNORM formats,
> this code was using modulo rather than saturating arithmetic.
> 
> This fixes some rendering issues for a gnome session on System z.
> It also fixes various piglit tests on z, such as
> spec/ARB_color_buffer_float/GL_RGBA8-render.
> 
> The patch deliberately doesn't tackle the more complicated
> SNORM case.
> 
> Tested against piglit on x86_64 and System z with no regressions.
> 
> Signed-off-by: Richard Sandiford 
> ---
>  src/gallium/auxiliary/gallivm/lp_bld_arit.c | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> index 3291ec4..08aec79 100644
> --- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> +++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
> @@ -386,6 +386,10 @@ lp_build_add(struct lp_build_context *bld,
>   return lp_build_intrinsic_binary(builder, intrinsic,
>   lp_build_vec_type(bld->gallivm, bld->type), a, b);
> }
>  
> +   /* TODO: handle signed case */
> +   if(type.norm && !type.floating && !type.fixed && !type.sign)
> +  a = lp_build_min_simple(bld, a, lp_build_comp(bld, b));
> +
> if(LLVMIsConstant(a) && LLVMIsConstant(b))
>if (type.floating)
>   res = LLVMConstFAdd(a, b);
> @@ -663,6 +667,10 @@ lp_build_sub(struct lp_build_context *bld,
>   return lp_build_intrinsic_binary(builder, intrinsic,
>   lp_build_vec_type(bld->gallivm, bld->type), a, b);
> }
>  
> +   /* TODO: handle signed case */
> +   if(type.norm && !type.floating && !type.fixed && !type.sign)
> +  a = lp_build_max_simple(bld, a, b);
> +
> if(LLVMIsConstant(a) && LLVMIsConstant(b))
>if (type.floating)
>   res = LLVMConstFSub(a, b);
> --
> 1.7.11.7
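
The clamp-then-operate trick in the two hunks can be seen on plain scalars. This is a sketch for 8-bit UNORM only, assuming lp_build_comp(b) yields the normalized complement 1 - b (255 - b for 8 bits); the function names are made up and nothing here is gallivm API:

```c
#include <assert.h>
#include <stdint.h>

/* a + b clamped to 255: first limit a to (255 - b) so the add cannot wrap. */
static uint8_t
ex_unorm8_add_sat(uint8_t a, uint8_t b)
{
   uint8_t comp_b = (uint8_t)~b;      /* 255 - b */
   if (a > comp_b)
      a = comp_b;                     /* a = min(a, comp(b)) */
   assert((unsigned)a + b <= 255);
   return (uint8_t)(a + b);
}

/* a - b clamped to 0: first raise a to at least b so the sub cannot wrap. */
static uint8_t
ex_unorm8_sub_sat(uint8_t a, uint8_t b)
{
   if (a < b)
      a = b;                          /* a = max(a, b) */
   assert(a >= b);
   return (uint8_t)(a - b);
}
```

For example, 200 + 100 becomes 155 + 100 = 255 instead of wrapping to 44, and 10 - 20 becomes 20 - 20 = 0 instead of wrapping to 246.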


Re: [Mesa-dev] [PATCH v5] mesa: Remove gallium draw_arrays() and draw_arrays_instanced() functions

2013-06-07 Thread Brian Paul

On 06/06/2013 03:21 PM, Arnas Milasevicius wrote:


Moved draw_arrays() to st_draw_feedback.c and removed draw_arrays_instanced()


I updated the comments a bit and pushed to master.  Thanks.

-Brian




Re: [Mesa-dev] [PATCH v5] mesa: Remove gallium draw_arrays() and draw_arrays_instanced() functions

2013-06-07 Thread Arnas Milaševičius
Someone, please, check this patch.

On Fri, Jun 7, 2013 at 1:21 AM, Arnas Milasevicius  wrote:
>
> Moved draw_arrays() to st_draw_feedback.c and removed draw_arrays_instanced()
> ---
>  v5: combined patches together
>  src/gallium/auxiliary/draw/draw_context.h | 11 -
>  src/gallium/auxiliary/draw/draw_pt.c  | 40 
> ---
>  src/mesa/state_tracker/st_draw_feedback.c | 24 +++
>  3 files changed, 24 insertions(+), 51 deletions(-)
>
> diff --git a/src/gallium/auxiliary/draw/draw_context.h 
> b/src/gallium/auxiliary/draw/draw_context.h
> index 2d843b7..4a1b27e 100644
> --- a/src/gallium/auxiliary/draw/draw_context.h
> +++ b/src/gallium/auxiliary/draw/draw_context.h
> @@ -241,17 +241,6 @@ draw_set_mapped_so_targets(struct draw_context *draw,
>  void draw_vbo(struct draw_context *draw,
>const struct pipe_draw_info *info);
>
> -void draw_arrays(struct draw_context *draw, unsigned prim,
> -unsigned start, unsigned count);
> -
> -void
> -draw_arrays_instanced(struct draw_context *draw,
> -  unsigned mode,
> -  unsigned start,
> -  unsigned count,
> -  unsigned startInstance,
> -  unsigned instanceCount);
> -
>
>  
> /***
>   * Driver backend interface
> diff --git a/src/gallium/auxiliary/draw/draw_pt.c 
> b/src/gallium/auxiliary/draw/draw_pt.c
> index ce36ed0..131bd13 100644
> --- a/src/gallium/auxiliary/draw/draw_pt.c
> +++ b/src/gallium/auxiliary/draw/draw_pt.c
> @@ -413,46 +413,6 @@ draw_pt_arrays_restart(struct draw_context *draw,
>  }
>
>
> -
> -/**
> - * Non-instanced drawing.
> - * \sa draw_arrays_instanced
> - */
> -void
> -draw_arrays(struct draw_context *draw, unsigned prim,
> -unsigned start, unsigned count)
> -{
> -   draw_arrays_instanced(draw, prim, start, count, 0, 1);
> -}
> -
> -
> -/**
> - * Instanced drawing.
> - * \sa draw_vbo
> - */
> -void
> -draw_arrays_instanced(struct draw_context *draw,
> -  unsigned mode,
> -  unsigned start,
> -  unsigned count,
> -  unsigned startInstance,
> -  unsigned instanceCount)
> -{
> -   struct pipe_draw_info info;
> -
> -   util_draw_init_info(&info);
> -
> -   info.mode = mode;
> -   info.start = start;
> -   info.count = count;
> -   info.start_instance = startInstance;
> -   info.instance_count = instanceCount;
> -   info.min_index = start;
> -   info.max_index = start + count - 1;
> -
> -   draw_vbo(draw, &info);
> -}
> -
>  /**
>   * Resolve true values within pipe_draw_info.
>   * If we're rendering from transform feedback/stream output
> diff --git a/src/mesa/state_tracker/st_draw_feedback.c 
> b/src/mesa/state_tracker/st_draw_feedback.c
> index b19d913..bfd7403 100644
> --- a/src/mesa/state_tracker/st_draw_feedback.c
> +++ b/src/mesa/state_tracker/st_draw_feedback.c
> @@ -40,6 +40,7 @@
>  #include "pipe/p_context.h"
>  #include "pipe/p_defines.h"
>  #include "util/u_inlines.h"
> +#include "util/u_draw.h"
>
>  #include "draw/draw_private.h"
>  #include "draw/draw_context.h"
> @@ -81,6 +82,29 @@ set_feedback_vertex_format(struct gl_context *ctx)
>
>
>  /**
> + * Non-instanced drawing.
> + * \sa draw_vbo
> + */
> +static void
> +draw_arrays(struct draw_context *draw,
> +  unsigned mode,
> +  unsigned start,
> +  unsigned count)
> +{
> +   struct pipe_draw_info info;
> +
> +   util_draw_init_info(&info);
> +
> +   info.mode = mode;
> +   info.start = start;
> +   info.count = count;
> +   info.min_index = start;
> +   info.max_index = start + count - 1;
> +
> +   draw_vbo(draw, &info);
> +}
> +
> +/**
>   * Called by VBO to draw arrays when in selection or feedback mode and
>   * to implement glRasterPos.
>   * This is very much like the normal draw_vbo() function above.
> --
> 1.8.3
>


[Mesa-dev] R600/SI: Intrinsics for derivatives

2013-06-07 Thread Michel Dänzer

The most important difference to the previous version of these is that
whole quad mode is now enabled and M0 initialized appropriately for the
LDS instructions, which now allows all of the relevant piglit tests to
pass.


-- 
Earthling Michel Dänzer   |   http://www.amd.com
Libre software enthusiast |  Debian, X and DRI developer
From db07ab94113be5810fd6d1035b3d394ed53d27ca Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= 
Date: Thu, 21 Feb 2013 16:12:45 +0100
Subject: [PATCH 1/3] R600/SI: Add intrinsics for texture sampling with user
 derivatives
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Michel Dänzer 
---
 lib/Target/R600/SIInstructions.td | 7 ++-
 lib/Target/R600/SIIntrinsics.td   | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/lib/Target/R600/SIInstructions.td b/lib/Target/R600/SIInstructions.td
index b6db815..73f87ca 100644
--- a/lib/Target/R600/SIInstructions.td
+++ b/lib/Target/R600/SIInstructions.td
@@ -535,7 +535,7 @@ def IMAGE_SAMPLE_B : MIMG_Sampler_Helper <0x0025, "IMAGE_SAMPLE_B">;
 //def IMAGE_SAMPLE_LZ : MIMG_NoPattern_ <"IMAGE_SAMPLE_LZ", 0x0027>;
 def IMAGE_SAMPLE_C : MIMG_Sampler_Helper <0x0028, "IMAGE_SAMPLE_C">;
 //def IMAGE_SAMPLE_C_CL : MIMG_NoPattern_ <"IMAGE_SAMPLE_C_CL", 0x0029>;
-//def IMAGE_SAMPLE_C_D : MIMG_NoPattern_ <"IMAGE_SAMPLE_C_D", 0x002a>;
+def IMAGE_SAMPLE_C_D : MIMG_Sampler_Helper <0x002a, "IMAGE_SAMPLE_C_D">;
 //def IMAGE_SAMPLE_C_D_CL : MIMG_NoPattern_ <"IMAGE_SAMPLE_C_D_CL", 0x002b>;
 def IMAGE_SAMPLE_C_L : MIMG_Sampler_Helper <0x002c, "IMAGE_SAMPLE_C_L">;
 def IMAGE_SAMPLE_C_B : MIMG_Sampler_Helper <0x002d, "IMAGE_SAMPLE_C_B">;
@@ -1296,6 +1296,11 @@ multiclass SamplePatterns {
   def : SampleArrayPattern ;
   def : SampleShadowPattern ;
   def : SampleShadowArrayPattern ;
+
+  def : SamplePattern ;
+  def : SampleArrayPattern ;
+  def : SampleShadowPattern ;
+  def : SampleShadowArrayPattern ;
 }
 
 defm : SamplePatterns;
diff --git a/lib/Target/R600/SIIntrinsics.td b/lib/Target/R600/SIIntrinsics.td
index 224cd2f..d2643e0 100644
--- a/lib/Target/R600/SIIntrinsics.td
+++ b/lib/Target/R600/SIIntrinsics.td
@@ -23,6 +23,7 @@ let TargetPrefix = "SI", isTarget = 1 in {
 
   def int_SI_sample : Sample;
   def int_SI_sampleb : Sample;
+  def int_SI_sampled : Sample;
   def int_SI_samplel : Sample;
 
   def int_SI_imageload : Intrinsic <[llvm_v4i32_ty], [llvm_anyvector_ty, llvm_v32i8_ty, llvm_i32_ty], [IntrNoMem]>;
-- 
1.8.3

From 466936a680993dec58e1e537f3b489cd82b5176c Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michel=20D=C3=A4nzer?= 
Date: Thu, 21 Feb 2013 18:51:38 +0100
Subject: [PATCH 2/3] R600/SI: Initial support for LDS/GDS instructions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Signed-off-by: Michel Dänzer 
---
 lib/Target/R600/SIInstrFormats.td  | 24 
 lib/Target/R600/SIInstrInfo.td | 23 +++
 lib/Target/R600/SIInstructions.td  |  3 +++
 lib/Target/R600/SILowerControlFlow.cpp | 16 
 4 files changed, 66 insertions(+)

diff --git a/lib/Target/R600/SIInstrFormats.td b/lib/Target/R600/SIInstrFormats.td
index 51f323d..434aa7e 100644
--- a/lib/Target/R600/SIInstrFormats.td
+++ b/lib/Target/R600/SIInstrFormats.td
@@ -281,6 +281,30 @@ class VINTRP  op, dag outs, dag ins, string asm, list pattern> :
 
 let Uses = [EXEC] in {
 
+class DS  op, dag outs, dag ins, string asm, list pattern> :
+Enc64  {
+
+  bits<8> vdst;
+  bits<1> gds;
+  bits<8> addr;
+  bits<8> data0;
+  bits<8> data1;
+  bits<8> offset0;
+  bits<8> offset1;
+
+  let Inst{7-0} = offset0;
+  let Inst{15-8} = offset1;
+  let Inst{17} = gds;
+  let Inst{25-18} = op;
+  let Inst{31-26} = 0x36; //encoding
+  let Inst{39-32} = addr;
+  let Inst{47-40} = data0;
+  let Inst{55-48} = data1;
+  let Inst{63-56} = vdst;
+
+  let LGKM_CNT = 1;
+}
+
 class MUBUF  op, dag outs, dag ins, string asm, list pattern> :
 Enc64 {
 
diff --git a/lib/Target/R600/SIInstrInfo.td b/lib/Target/R600/SIInstrInfo.td
index 42fa95f..47a64f7 100644
--- a/lib/Target/R600/SIInstrInfo.td
+++ b/lib/Target/R600/SIInstrInfo.td
@@ -286,6 +286,29 @@ class VOP3_64  op, string opName, list pattern> : VOP3 <
 // Vector I/O classes
 //===--===//
 
+class DS_Load_Helper  op, string asm, RegisterClass regClass> : DS <
+  op,
+  (outs regClass:$vdst),
+  (ins i1imm:$gds, VReg_32:$addr, VReg_32:$data0, VReg_32:$data1,
+   i8imm:$offset0, i8imm:$offset1),
+  asm#" $vdst, $gds, $addr, $data0, $data1, $offset0, $offset1, [M0]",
+  []> {
+  let mayLoad = 1;
+  let mayStore = 0;
+}
+
+class DS_Store_Helper  op, string asm, RegisterClass regClass> : DS <
+  op,
+  (outs),
+  (ins regClass:$vdata, i1imm:$gds, VReg_32:$addr, VReg_32:$data0, VReg_32:$data1,
+   i8imm:$offset0, i8imm:$of
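
The bit layout in the DS class from PATCH 2/3 can be read as a plain 64-bit packing. The following is only a sketch of that layout as written in the "let Inst{...}" lines; the struct and function names are hypothetical and are not part of LLVM:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the operand fields of the DS class. */
struct ex_ds_fields {
   uint8_t offset0, offset1;
   uint8_t gds;               /* single bit */
   uint8_t op;
   uint8_t addr, data0, data1, vdst;
};

/* Pack the fields at the bit positions given in the DS class. */
static uint64_t
ex_encode_ds(const struct ex_ds_fields *f)
{
   uint64_t inst = 0;

   assert(f->gds <= 1);
   inst |= (uint64_t)f->offset0;          /* Inst{7-0}   */
   inst |= (uint64_t)f->offset1 << 8;     /* Inst{15-8}  */
   inst |= (uint64_t)f->gds     << 17;    /* Inst{17}    */
   inst |= (uint64_t)f->op      << 18;    /* Inst{25-18} */
   inst |= (uint64_t)0x36       << 26;    /* Inst{31-26}, encoding */
   inst |= (uint64_t)f->addr    << 32;    /* Inst{39-32} */
   inst |= (uint64_t)f->data0   << 40;    /* Inst{47-40} */
   inst |= (uint64_t)f->data1   << 48;    /* Inst{55-48} */
   inst |= (uint64_t)f->vdst    << 56;    /* Inst{63-56} */
   return inst;
}
```

Bit 16 is not assigned by any of the lines quoted above, so the sketch leaves it zero.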

Re: [Mesa-dev] [PATCH] glsl: Fix null check in read_dereference.

2013-06-07 Thread Brian Paul

On 06/06/2013 11:11 PM, Vinson Lee wrote:

Fixes "Logically dead code" defect reported by Coverity.

Signed-off-by: Vinson Lee 
---
  src/glsl/ir_reader.cpp | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/glsl/ir_reader.cpp b/src/glsl/ir_reader.cpp
index b366712..51534ca 100644
--- a/src/glsl/ir_reader.cpp
+++ b/src/glsl/ir_reader.cpp
@@ -886,7 +886,7 @@ ir_reader::read_dereference(s_expression *expr)
}

ir_rvalue *idx = read_rvalue(s_index);
-  if (subject == NULL) {
+  if (idx == NULL) {
 ir_read_error(NULL, "when reading the index of an array_ref");
 return NULL;
}



Candidate for stable branches?

Reviewed-by: Brian Paul 


Re: [Mesa-dev] [PATCH 1/2] util: add util_resource_is_array_texture()

2013-06-07 Thread Brian Paul

On 06/07/2013 12:44 AM, Chia-I Wu wrote:

Checking if array_size is greater than 1 is not enough for single-layered
array textures.

Signed-off-by: Chia-I Wu 
---
  src/gallium/auxiliary/util/u_resource.h |   20 +++-
  1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_resource.h 
b/src/gallium/auxiliary/util/u_resource.h
index 977e013..a5e091f 100644
--- a/src/gallium/auxiliary/util/u_resource.h
+++ b/src/gallium/auxiliary/util/u_resource.h
@@ -26,9 +26,27 @@
  #ifndef U_RESOURCE_H
  #define U_RESOURCE_H

-struct pipe_resource;
+#include "pipe/p_state.h"

  unsigned
  util_resource_size(const struct pipe_resource *res);

+/**
+ * Return true if the resource is an array texture.
+ *
+ * Note that this function returns true for single-layered array textures.
+ */
+static INLINE boolean
+util_resource_is_array_texture(const struct pipe_resource *res)
+{
+   switch (res->target) {
+   case PIPE_TEXTURE_1D_ARRAY:
+   case PIPE_TEXTURE_2D_ARRAY:
+   case PIPE_TEXTURE_CUBE_ARRAY:
+  return TRUE;
+   default:
+  return FALSE;
+   }
+}
+
  #endif



Reviewed-by: Brian Paul 
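
To illustrate the commit message's point about single-layered array textures, here is a small sketch with made-up types (ex_target, ex_resource) standing in for the gallium ones. It contrasts a size-based check with the target-based check the patch adds:

```c
#include <assert.h>

/* Hypothetical stand-ins for the pipe_texture_target values used here. */
enum ex_target {
   EX_TEXTURE_2D,
   EX_TEXTURE_1D_ARRAY,
   EX_TEXTURE_2D_ARRAY,
   EX_TEXTURE_CUBE_ARRAY
};

struct ex_resource {
   enum ex_target target;
   unsigned array_size;
};

/* The check the commit message calls insufficient. */
static int
ex_is_array_by_size(const struct ex_resource *res)
{
   assert(res);
   return res->array_size > 1;
}

/* Same shape as the new helper: decide from the target alone. */
static int
ex_is_array_by_target(const struct ex_resource *res)
{
   assert(res);
   switch (res->target) {
   case EX_TEXTURE_1D_ARRAY:
   case EX_TEXTURE_2D_ARRAY:
   case EX_TEXTURE_CUBE_ARRAY:
      return 1;
   default:
      return 0;
   }
}
```

A 2D array texture with a single layer has array_size 1, so the size-based check reports it as a non-array while the target-based check does not.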


Re: [Mesa-dev] [PATCH 2/2] util: fix util_clear_render_target and util_clear_depth_stencil layer handling

2013-06-07 Thread Jose Fonseca
- Original Message -
> Am 07.06.2013 12:14, schrieb Marek Olšák:
> >> diff --git a/src/gallium/auxiliary/util/u_transfer.c
> >> b/src/gallium/auxiliary/util/u_transfer.c
> >> index 56e059b..7804f2a 100644
> >> --- a/src/gallium/auxiliary/util/u_transfer.c
> >> +++ b/src/gallium/auxiliary/util/u_transfer.c
> >> @@ -25,6 +25,7 @@ void u_default_transfer_inline_write( struct
> >> pipe_context *pipe,
> >> usage |= PIPE_TRANSFER_WRITE;
> >>
> >> /* transfer_inline_write implicitly discards the rewritten buffer
> >> range */
> >> +   /* XXX this looks very broken for non-buffer resources having more
> >> than one dim. */
> >> if (box->x == 0 && box->width == resource->width0) {
> > 
> > Indeed, however radeon drivers ignore the discard flags for textures
> > and if the transfer is write-only, they behave as if DISCARD_RANGE was
> > set no matter what the flags are. It's a response to state trackers not
> > having used the discard flags when they should (that is: always). I
> > propose to standardize this behavior, i.e. if PIPE_TRANSFER_READ is
> > not set for texture transfers, PIPE_TRANSFER_DISCARD_RANGE is implied.
> 
> I think that's a bit awkward to make it the state tracker's
> responsibility to set PIPE_TRANSFER_READ if it doesn't want to write the
> whole range, setting TRANSFER_DISCARD_RANGE if it does want to write
> everything sounds more natural and safe to me.
> If state trackers don't do this they should just be fixed.

I agree with Roland. There's nothing radically different between a buffer and a
texture to warrant special treatment.  Let's be explicit about these flags,
instead of relying on implied flags and/or guessing intentions.


Jose
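
As an illustration of what "explicit" could look like for the quoted u_default_transfer_inline_write() check, here is a sketch that only claims a whole-resource discard when the box covers every dimension. The flag values, struct names and function are hypothetical stand-ins, it ignores mip levels and array layers, and it is not a proposed patch:

```c
#include <assert.h>

/* Hypothetical stand-ins; the real flags are the PIPE_TRANSFER_* values. */
enum {
   EX_TRANSFER_WRITE                  = 1 << 0,
   EX_TRANSFER_DISCARD_RANGE          = 1 << 1,
   EX_TRANSFER_DISCARD_WHOLE_RESOURCE = 1 << 2
};

struct ex_box      { int x, y, z, width, height, depth; };
struct ex_resource { int width0, height0, depth0; };

/* Pick the discard flag from the full extent of the box, not just x/width. */
static unsigned
ex_inline_write_usage(const struct ex_resource *res, const struct ex_box *box)
{
   unsigned usage = EX_TRANSFER_WRITE;
   int whole;

   assert(res && box);
   whole = box->x == 0 && box->y == 0 && box->z == 0 &&
           box->width  == res->width0 &&
           box->height == res->height0 &&
           box->depth  == res->depth0;

   usage |= whole ? EX_TRANSFER_DISCARD_WHOLE_RESOURCE
                  : EX_TRANSFER_DISCARD_RANGE;
   return usage;
}
```

For a 4x4 texture, a write to the top two rows has x == 0 and width == width0, so a check on x and width alone would treat it as a whole-resource discard; the sketch reports a range discard instead.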


Re: [Mesa-dev] [PATCH 2/2] util: fix util_clear_render_target and util_clear_depth_stencil layer handling

2013-06-07 Thread Roland Scheidegger
Am 07.06.2013 12:14, schrieb Marek Olšák:
>> diff --git a/src/gallium/auxiliary/util/u_transfer.c 
>> b/src/gallium/auxiliary/util/u_transfer.c
>> index 56e059b..7804f2a 100644
>> --- a/src/gallium/auxiliary/util/u_transfer.c
>> +++ b/src/gallium/auxiliary/util/u_transfer.c
>> @@ -25,6 +25,7 @@ void u_default_transfer_inline_write( struct pipe_context 
>> *pipe,
>> usage |= PIPE_TRANSFER_WRITE;
>>
>> /* transfer_inline_write implicitly discards the rewritten buffer range 
>> */
>> +   /* XXX this looks very broken for non-buffer resources having more than 
>> one dim. */
>> if (box->x == 0 && box->width == resource->width0) {
> 
> Indeed, however radeon drivers ignore the discard flags for textures
> and if the transfer is write-only, they behave as if DISCARD_RANGE was
> set no matter what the flags are. It's a response to state trackers not
> having used the discard flags when they should (that is: always). I
> propose to standardize this behavior, i.e. if PIPE_TRANSFER_READ is
> not set for texture transfers, PIPE_TRANSFER_DISCARD_RANGE is implied.

I think that's a bit awkward to make it the state tracker's
responsibility to set PIPE_TRANSFER_READ if it doesn't want to write the
whole range, setting TRANSFER_DISCARD_RANGE if it does want to write
everything sounds more natural and safe to me.
If state trackers don't do this they should just be fixed.

Roland

> 
> Marek
> 
>>usage |= PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE;
>> } else {
>> --
>> 1.7.9.5


Re: [Mesa-dev] [PATCH 3/3] llvmpipe: bump 3d and cube map limits to 2048 and 8192 respectively

2013-06-07 Thread Jose Fonseca


- Original Message -
> From: Roland Scheidegger 
> 
> These should just work (?), required by d3d10. Too large resources will
> get thrown out separately anyway.
> ---
>  src/gallium/drivers/llvmpipe/lp_limits.h |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/gallium/drivers/llvmpipe/lp_limits.h
> b/src/gallium/drivers/llvmpipe/lp_limits.h
> index c7905b8..af31b35 100644
> --- a/src/gallium/drivers/llvmpipe/lp_limits.h
> +++ b/src/gallium/drivers/llvmpipe/lp_limits.h
> @@ -45,8 +45,8 @@
>   */
>  #define LP_MAX_TEXTURE_SIZE (1 * 1024 * 1024 * 1024ULL)  /* 1GB for now */
>  #define LP_MAX_TEXTURE_2D_LEVELS 14  /* 8K x 8K for now */
> -#define LP_MAX_TEXTURE_3D_LEVELS 11  /* 1K x 1K x 1K for now */
> -#define LP_MAX_TEXTURE_CUBE_LEVELS 13  /* 4K x 4K for now */
> +#define LP_MAX_TEXTURE_3D_LEVELS 12  /* 2K x 2K x 2K for now */
> +#define LP_MAX_TEXTURE_CUBE_LEVELS 14  /* 8K x 8K for now */
>  #define LP_MAX_TEXTURE_ARRAY_LAYERS 512 /* 8K x 512 / 8K x 8K x 512 */
>  


Reviewed-by: Jose Fonseca 
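
The comments next to the LP_MAX_TEXTURE_*_LEVELS defines follow from the usual mip chain arithmetic: a full chain down to 1 texel has log2(size) + 1 levels, so the largest edge length is 1 << (levels - 1). A tiny sketch (the helper name is made up):

```c
#include <assert.h>

/* Largest edge length whose full mip chain fits in the given level count. */
static unsigned
ex_max_size_for_levels(unsigned levels)
{
   assert(levels >= 1 && levels <= 32);
   return 1u << (levels - 1);
}
```

So 11 levels gives 1024, 12 gives 2048, 13 gives 4096 and 14 gives 8192, matching the 1K/2K/4K/8K comments in the hunk.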


Re: [Mesa-dev] [PATCH 1/3] gallium/tgsi: add missing string for layer semantic

2013-06-07 Thread Jose Fonseca


- Original Message -
> From: Roland Scheidegger 
> 
> Also report if a shader writes the layer semantic
> ---
>  src/gallium/auxiliary/draw/draw_context.c |2 +-
>  src/gallium/auxiliary/tgsi/tgsi_scan.c|5 +
>  src/gallium/auxiliary/tgsi/tgsi_scan.h|1 +
>  src/gallium/auxiliary/tgsi/tgsi_strings.c |1 +
>  4 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/src/gallium/auxiliary/draw/draw_context.c
> b/src/gallium/auxiliary/draw/draw_context.c
> index 58ce270..35063b9 100644
> --- a/src/gallium/auxiliary/draw/draw_context.c
> +++ b/src/gallium/auxiliary/draw/draw_context.c
> @@ -548,7 +548,7 @@ draw_get_shader_info(const struct draw_context *draw)
>   * function to find those attributes.
>   *
>   * -1 is returned if the attribute is not found since this is
> - * an undefined situtation. Note, that zero is valid and can
> + * an undefined situation. Note, that zero is valid and can
>   * be used by any of the attributes, because position is not
>   * required to be attribute 0 or even at all present.
>   */
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.c
> b/src/gallium/auxiliary/tgsi/tgsi_scan.c
> index 0230267..d331257 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.c
> @@ -217,6 +217,11 @@ tgsi_scan_shader(const struct tgsi_token *tokens,
>TGSI_SEMANTIC_VIEWPORT_INDEX) {
>   info->writes_viewport_index = TRUE;
>}
> +  if (procType == TGSI_PROCESSOR_GEOMETRY &&
> +  fulldecl->Semantic.Name ==
> +  TGSI_SEMANTIC_LAYER) {
> + info->writes_layer = TRUE;
> +  }
> }
>  
>   }
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_scan.h
> b/src/gallium/auxiliary/tgsi/tgsi_scan.h
> index 676abf0..a5b7024 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_scan.h
> +++ b/src/gallium/auxiliary/tgsi/tgsi_scan.h
> @@ -76,6 +76,7 @@ struct tgsi_shader_info
> boolean pixel_center_integer;
> boolean color0_writes_all_cbufs;
> boolean writes_viewport_index;
> +   boolean writes_layer;
>  
> unsigned num_written_clipdistance;
> /**
> diff --git a/src/gallium/auxiliary/tgsi/tgsi_strings.c
> b/src/gallium/auxiliary/tgsi/tgsi_strings.c
> index 6abf927..625107c 100644
> --- a/src/gallium/auxiliary/tgsi/tgsi_strings.c
> +++ b/src/gallium/auxiliary/tgsi/tgsi_strings.c
> @@ -80,6 +80,7 @@ const char *tgsi_semantic_names[TGSI_SEMANTIC_COUNT] =
> "TEXCOORD",
> "PCOORD",
> -   "VIEWPORT_INDEX"
> +   "VIEWPORT_INDEX",
> +   "LAYER"
>  };
>  
>  const char *tgsi_texture_names[TGSI_TEXTURE_COUNT] =
> --
> 1.7.9.5
> 

Reviewed-by: Jose Fonseca 
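
Appending an entry to a table like tgsi_semantic_names[] also means giving the previous last entry a trailing comma, because C joins adjacent string literals into one. A minimal sketch of what happens otherwise, using a shortened three-entry stand-in table and a made-up helper:

```c
#include <assert.h>
#include <stddef.h>

enum { EX_SEMANTIC_COUNT = 3 };

/* Missing comma: "VIEWPORT_INDEX" "LAYER" is one literal, so only two
 * entries are initialized and the third is a null pointer. */
static const char *ex_names_broken[EX_SEMANTIC_COUNT] = {
   "POSITION",
   "VIEWPORT_INDEX"
   "LAYER"
};

static const char *ex_names_fixed[EX_SEMANTIC_COUNT] = {
   "POSITION",
   "VIEWPORT_INDEX",
   "LAYER"
};

/* Count the entries that actually received a string. */
static unsigned
ex_count_named(const char *const *names, size_t n)
{
   unsigned count = 0;
   size_t i;

   assert(names);
   for (i = 0; i < n; i++)
      if (names[i] != NULL)
         count++;
   return count;
}
```

Both arrays have the same sizeof because the dimension is explicit, so a size-based compile-time check cannot tell them apart; only a runtime look at the entries shows that the broken table holds two strings.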


Re: [Mesa-dev] [PATCH 2/3] llvmpipe: add support for layered rendering

2013-06-07 Thread Jose Fonseca


- Original Message -
> On 06.06.2013 03:15, Brian Paul wrote:
> > Reviewed-by: Brian Paul 
> > 
> > Just two minor nits below.
> > 
> > 
> > On 06/05/2013 05:44 PM, srol...@vmware.com wrote:
> >> From: Roland Scheidegger 
> >>
> >> Mostly just make sure the layer parameter gets passed through to the
> >> right
> >> places (and get clamped, can do this at setup time), fix up clears to
> >> clear all layers and disable opaque optimization. Luckily don't need to
> >> touch the jitted code.
> >> (Clears invoked via pipe's clear_render_target method will not work
> >> however
> >> since the pipe_util_clear function used for it doesn't handle clearing
> >> multiple layers yet.)
> >> ---
> >>   src/gallium/drivers/llvmpipe/lp_context.h   |3 +
> >>   src/gallium/drivers/llvmpipe/lp_jit.h   |2 +-
> >>   src/gallium/drivers/llvmpipe/lp_rast.c  |  195
> >> ---
> >>   src/gallium/drivers/llvmpipe/lp_rast.h  |2 +-
> >>   src/gallium/drivers/llvmpipe/lp_rast_priv.h |   20 ++-
> >>   src/gallium/drivers/llvmpipe/lp_scene.c |   12 +-
> >>   src/gallium/drivers/llvmpipe/lp_scene.h |7 +-
> >>   src/gallium/drivers/llvmpipe/lp_setup.c |1 +
> >>   src/gallium/drivers/llvmpipe/lp_setup_context.h |1 +
> >>   src/gallium/drivers/llvmpipe/lp_setup_line.c|6 +
> >>   src/gallium/drivers/llvmpipe/lp_setup_point.c   |7 +
> >>   src/gallium/drivers/llvmpipe/lp_setup_tri.c |   17 +-
> >>   src/gallium/drivers/llvmpipe/lp_state_derived.c |   13 +-
> >>   src/gallium/drivers/llvmpipe/lp_texture.c   |3 -
> >>   src/gallium/drivers/llvmpipe/lp_texture.h   |   10 ++
> >>   15 files changed, 190 insertions(+), 109 deletions(-)
> >>
> >> diff --git a/src/gallium/drivers/llvmpipe/lp_context.h
> >> b/src/gallium/drivers/llvmpipe/lp_context.h
> >> index 54f3830..abfe852 100644
> >> --- a/src/gallium/drivers/llvmpipe/lp_context.h
> >> +++ b/src/gallium/drivers/llvmpipe/lp_context.h
> >> @@ -119,6 +119,9 @@ struct llvmpipe_context {
> >>  /** Which vertex shader output slot contains viewport index */
> >>  int viewport_index_slot;
> >>
> >> +   /** Which geometry shader output slot contains layer */
> >> +   int layer_slot;
> >> +
> >>  /**< minimum resolvable depth value, for polygon offset */
> >>  double mrd;
> >>
> >> diff --git a/src/gallium/drivers/llvmpipe/lp_jit.h
> >> b/src/gallium/drivers/llvmpipe/lp_jit.h
> >> index 4e9ca76..2ecfde7 100644
> >> --- a/src/gallium/drivers/llvmpipe/lp_jit.h
> >> +++ b/src/gallium/drivers/llvmpipe/lp_jit.h
> >> @@ -204,7 +204,7 @@ typedef void
> >>   const void *dadx,
> >>   const void *dady,
> >>   uint8_t **color,
> >> -void *depth,
> >> +uint8_t *depth,
> >>   uint32_t mask,
> >>   struct lp_jit_thread_data *thread_data,
> >>   unsigned *stride,
> >> diff --git a/src/gallium/drivers/llvmpipe/lp_rast.c
> >> b/src/gallium/drivers/llvmpipe/lp_rast.c
> >> index 981dd71..aa5224e 100644
> >> --- a/src/gallium/drivers/llvmpipe/lp_rast.c
> >> +++ b/src/gallium/drivers/llvmpipe/lp_rast.c
> >> @@ -134,6 +134,8 @@ lp_rast_clear_color(struct lp_rasterizer_task *task,
> >>
> >>for (i = 0; i < scene->fb.nr_cbufs; i++) {
> >>   enum pipe_format format = scene->fb.cbufs[i]->format;
> >> +unsigned layer;
> >> +uint8_t *map_layer = scene->cbufs[i].map;
> >>
> >>   if (util_format_is_pure_sint(format)) {
> >>  util_format_write_4i(format, arg.clear_color.i, 0,
> >> &uc, 0, 0, 0, 1, 1);
> >> @@ -143,14 +145,17 @@ lp_rast_clear_color(struct lp_rasterizer_task
> >> *task,
> >>  util_format_write_4ui(format, arg.clear_color.ui, 0,
> >> &uc, 0, 0, 0, 1, 1);
> >>   }
> >>
> >> -util_fill_rect(scene->cbufs[i].map,
> >> -   scene->fb.cbufs[i]->format,
> >> -   scene->cbufs[i].stride,
> >> -   task->x,
> >> -   task->y,
> >> -   task->width,
> >> -   task->height,
> >> -   &uc);
> >> +for (layer = 0; layer <= scene->fb_max_layer; layer++) {
> >> +   util_fill_rect(map_layer,
> >> +  scene->fb.cbufs[i]->format,
> >> +  scene->cbufs[i].stride,
> >> +  task->x,
> >> +  task->y,
> >> +  task->width,
> >> +  task->height,
> >> +  &uc);
> >> +   map_layer += scene->cbufs[i].layer_stride;
> >> +}
> >>}
> >> }
> >> else {
> > 
> > So, just to be clear (no pun intended), glClear() and 

Re: [Mesa-dev] [PATCH] u_vbuf: fix index buffer leak

2013-06-07 Thread Alex Deucher
Candidate for the stable branches?

On Fri, Jun 7, 2013 at 5:58 AM, Marek Olšák  wrote:
> Reviewed-by: Marek Olšák 
>
> Marek
>
> On Fri, Jun 7, 2013 at 6:25 AM, Chia-I Wu  wrote:
>>
>> Signed-off-by: Chia-I Wu 
>> ---
>>  src/gallium/auxiliary/util/u_vbuf.c |3 +++
>>  1 file changed, 3 insertions(+)
>>
>> diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
>> b/src/gallium/auxiliary/util/u_vbuf.c
>> index 244b04d..5936f74 100644
>> --- a/src/gallium/auxiliary/util/u_vbuf.c
>> +++ b/src/gallium/auxiliary/util/u_vbuf.c
>> @@ -307,6 +307,9 @@ void u_vbuf_destroy(struct u_vbuf *mgr)
>> unsigned num_vb = screen->get_shader_param(screen, PIPE_SHADER_VERTEX,
>>PIPE_SHADER_CAP_MAX_INPUTS);
>>
>> +   mgr->pipe->set_index_buffer(mgr->pipe, NULL);
>> +   pipe_resource_reference(&mgr->index_buffer.buffer, NULL);
>> +
>> mgr->pipe->set_vertex_buffers(mgr->pipe, 0, num_vb, NULL);
>>
>> for (i = 0; i < PIPE_MAX_ATTRIBS; i++) {
>> --
>> 1.7.10.4
>>


[Mesa-dev] [PATCH] llvmpipe: Use saturating add/sub for UNORM formats

2013-06-07 Thread Richard Sandiford
lp_build_add and lp_build_sub have fallback code for cases
that cannot be handled by known intrinsics.  For UNORM formats,
this code was using modulo rather than saturating arithmetic.

This fixes some rendering issues for a gnome session on System z.
It also fixes various piglit tests on z, such as
spec/ARB_color_buffer_float/GL_RGBA8-render.

The patch deliberately doesn't tackle the more complicated
SNORM case.

Tested against piglit on x86_64 and System z with no regressions.

Signed-off-by: Richard Sandiford 
---
 src/gallium/auxiliary/gallivm/lp_bld_arit.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/src/gallium/auxiliary/gallivm/lp_bld_arit.c b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
index 3291ec4..08aec79 100644
--- a/src/gallium/auxiliary/gallivm/lp_bld_arit.c
+++ b/src/gallium/auxiliary/gallivm/lp_bld_arit.c
@@ -386,6 +386,10 @@ lp_build_add(struct lp_build_context *bld,
  return lp_build_intrinsic_binary(builder, intrinsic, lp_build_vec_type(bld->gallivm, bld->type), a, b);
}
 
+   /* TODO: handle signed case */
+   if(type.norm && !type.floating && !type.fixed && !type.sign)
+  a = lp_build_min_simple(bld, a, lp_build_comp(bld, b));
+
if(LLVMIsConstant(a) && LLVMIsConstant(b))
   if (type.floating)
  res = LLVMConstFAdd(a, b);
@@ -663,6 +667,10 @@ lp_build_sub(struct lp_build_context *bld,
  return lp_build_intrinsic_binary(builder, intrinsic, lp_build_vec_type(bld->gallivm, bld->type), a, b);
}
 
+   /* TODO: handle signed case */
+   if(type.norm && !type.floating && !type.fixed && !type.sign)
+  a = lp_build_max_simple(bld, a, b);
+
if(LLVMIsConstant(a) && LLVMIsConstant(b))
   if (type.floating)
  res = LLVMConstFSub(a, b);
-- 
1.7.11.7
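
For anyone following the arithmetic: the patched fallback clamps one operand
before the plain add or subtract, so the result cannot wrap. Below is a
minimal scalar sketch for 8-bit UNORM. It assumes lp_build_comp(b) yields the
complement (255 - b) for this type; the function names are illustrative only
and are not part of gallivm.

```c
#include <stdint.h>

/* Scalar model of the patched lp_build_add fallback for 8-bit UNORM:
 * clamp a to (255 - b) first, so a + b can never exceed 255. */
static uint8_t unorm8_add_sat(uint8_t a, uint8_t b)
{
   uint8_t comp_b = (uint8_t)(255 - b);   /* assumed meaning of lp_build_comp */
   if (a > comp_b)
      a = comp_b;                          /* a = min(a, comp(b)) */
   return (uint8_t)(a + b);
}

/* Scalar model of the patched lp_build_sub fallback:
 * raise a to at least b first, so a - b can never drop below 0. */
static uint8_t unorm8_sub_sat(uint8_t a, uint8_t b)
{
   if (a < b)
      a = b;                               /* a = max(a, b) */
   return (uint8_t)(a - b);
}
```

With modulo arithmetic 200 + 100 would wrap to 44; with the clamp it stays at
255, which is the behaviour an UNORM blend expects.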



[Mesa-dev] [PATCH] i965: Shrink Gen5 VUE map layout to be the same as Gen4.

2013-06-07 Thread Chris Forbes
The PRM suggests a larger layout, mostly to support having
gl_ClipDistance[] somewhere predictable for the fixed-function clipper
-- but it didn't actually arrive in Gen5.

Just use the same layout for both Gen4 and Gen5.

No Piglit regressions.

Improves performance in CS:S Video Stress Test by ~3%.

Signed-off-by: Chris Forbes 
---
 src/mesa/drivers/dri/i965/brw_sf_state.c |  5 +
 src/mesa/drivers/dri/i965/brw_vs.c   | 23 +++
 2 files changed, 4 insertions(+), 24 deletions(-)

diff --git a/src/mesa/drivers/dri/i965/brw_sf_state.c b/src/mesa/drivers/dri/i965/brw_sf_state.c
index 7c29ba2..e9b7e66 100644
--- a/src/mesa/drivers/dri/i965/brw_sf_state.c
+++ b/src/mesa/drivers/dri/i965/brw_sf_state.c
@@ -131,10 +131,7 @@ const struct brw_tracked_state brw_sf_vp = {
 int
 brw_sf_compute_urb_entry_read_offset(struct intel_context *intel)
 {
-   if (intel->gen == 5)
-  return 3;
-   else
-  return 1;
+   return 1;
 }
 
 static void upload_sf_unit( struct brw_context *brw )
diff --git a/src/mesa/drivers/dri/i965/brw_vs.c b/src/mesa/drivers/dri/i965/brw_vs.c
index 720325d..d173d2e 100644
--- a/src/mesa/drivers/dri/i965/brw_vs.c
+++ b/src/mesa/drivers/dri/i965/brw_vs.c
@@ -85,34 +85,17 @@ brw_compute_vue_map(struct brw_context *brw, struct brw_vue_map *vue_map,
 */
switch (intel->gen) {
case 4:
+   case 5:
   /* There are 8 dwords in VUE header pre-Ironlake:
* dword 0-3 is indices, point width, clip flags.
* dword 4-7 is ndc position
* dword 8-11 is the first vertex data.
-   */
-  assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
-  assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
-  assign_vue_slot(vue_map, VARYING_SLOT_POS);
-  break;
-   case 5:
-  /* There are 20 DWs (D0-D19) in VUE header on Ironlake:
-   * dword 0-3 of the header is indices, point width, clip flags.
-   * dword 4-7 is the ndc position
-   * dword 8-11 of the vertex header is the 4D space position
-   * dword 12-19 of the vertex header is the user clip distance.
-   * dword 20-23 is a pad so that the vertex element data is aligned
-   * dword 24-27 is the first vertex data we fill.
*
-   * Note: future pipeline stages expect 4D space position to be
-   * contiguous with the other varyings, so we make dword 24-27 a
-   * duplicate copy of the 4D space position.
+   * On Ironlake the VUE header is nominally 20 dwords, but the hardware
+   * will accept the same header layout as Gen4 [and should be a bit faster]
*/
   assign_vue_slot(vue_map, VARYING_SLOT_PSIZ);
   assign_vue_slot(vue_map, BRW_VARYING_SLOT_NDC);
-  assign_vue_slot(vue_map, BRW_VARYING_SLOT_POS_DUPLICATE);
-  assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST0);
-  assign_vue_slot(vue_map, VARYING_SLOT_CLIP_DIST1);
-  assign_vue_slot(vue_map, BRW_VARYING_SLOT_PAD);
   assign_vue_slot(vue_map, VARYING_SLOT_POS);
   break;
case 6:
-- 
1.8.3
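
A rough way to see why the SF read offset drops from 3 to 1: count the vec4
slots that precede the position in each layout. The sketch below is only a
model. The slot names are hypothetical stand-ins for the ones in the patch,
and it assumes the URB entry read offset is counted in pairs of vec4 slots,
which is consistent with the two values returned by the old code.

```c
#include <assert.h>

/* Hypothetical slot names mirroring the ones used in the patch. */
enum slot { SLOT_PSIZ, SLOT_NDC, SLOT_POS_DUPLICATE, SLOT_CLIP_DIST0,
            SLOT_CLIP_DIST1, SLOT_PAD, SLOT_POS };

/* Return the index of SLOT_POS in a layout, i.e. how many vec4 header
 * slots come before the first slot the SF stage needs to read. */
static int header_slots(const enum slot *layout, int count)
{
   for (int i = 0; i < count; i++)
      if (layout[i] == SLOT_POS)
         return i;
   assert(!"layout has no position slot");
   return -1;
}

/* Assumption: the read offset counts pairs of vec4 slots. */
static int urb_entry_read_offset(const enum slot *layout, int count)
{
   return header_slots(layout, count) / 2;
}

/* Gen4 layout, now shared with Gen5. */
static const enum slot gen4_layout[] = { SLOT_PSIZ, SLOT_NDC, SLOT_POS };

/* Layout that the patch removes for Gen5. */
static const enum slot old_gen5_layout[] = {
   SLOT_PSIZ, SLOT_NDC, SLOT_POS_DUPLICATE, SLOT_CLIP_DIST0,
   SLOT_CLIP_DIST1, SLOT_PAD, SLOT_POS
};
```

Under that assumption the Gen4 layout gives an offset of 1 and the removed
Gen5 layout gives 3, so each vertex carries four fewer vec4 header slots.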



[Mesa-dev] [Bug 65426] openGL glDeleteBuffers does not delete buffers created using glGenBuffers

2013-06-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65426

--- Comment #12 from José Fonseca  ---
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > > Is this really a problem? Mesa might not always reuse buffer names, but that
> > > does not mean the buffer wasn't properly deleted. As far as I can see,
> > > OpenGL does not require name reuse.
> > 
> > It's standard compliant, but it sounds like a symptom of a leak.
> 
> It's actually not a leak.  glGenTextures/Buffers/Framebuffers(), etc call
> the _mesa_HashFindFreeKeyBlock() function.  For speed, it simply returns the
> next previously unused integer.  
> If we'd ever hit 0x (or whatever
> the new hash table's limit is) we'd resort to searching the hash table for a
> lower, unused ID.  

Sounds good then.
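
To illustrate the behaviour described in the quoted comment, here is a toy
model of such a name allocator. It is a sketch under assumptions, not the
code behind _mesa_HashFindFreeKeyBlock(): the table type, the function names
and the tiny MAX_NAME limit are all made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NAME 7u   /* tiny limit so the wraparound path is easy to reach */

struct name_table {
   bool used[MAX_NAME + 1];   /* index 0 is never handed out */
   uint32_t max_seen;         /* highest name ever allocated */
};

/* Return an unused name, or 0 if the table is full. */
static uint32_t name_table_gen(struct name_table *t)
{
   if (t->max_seen < MAX_NAME) {
      /* Fast path: hand out the next never-used integer, no search. */
      t->max_seen++;
      t->used[t->max_seen] = true;
      return t->max_seen;
   }
   /* Slow path: scan for a lower name that has been deleted. */
   for (uint32_t n = 1; n <= MAX_NAME; n++) {
      if (!t->used[n]) {
         t->used[n] = true;
         return n;
      }
   }
   return 0;
}

static void name_table_delete(struct name_table *t, uint32_t name)
{
   assert(name >= 1 && name <= MAX_NAME);
   t->used[name] = false;
}
```

In this model a deleted name is genuinely free, it just is not handed out
again until the fast path runs out of fresh integers. That matches what the
reporter observed without implying a leak.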

-- 
You are receiving this mail because:
You are the assignee for the bug.


Re: [Mesa-dev] [PATCH 2/2] util: fix util_clear_render_target and util_clear_depth_stencil layer handling

2013-06-07 Thread Marek Olšák
On Fri, Jun 7, 2013 at 2:32 AM,   wrote:
> From: Roland Scheidegger 
>
> These functions must clear all bound layers, not just the first.
> ---
>  src/gallium/auxiliary/util/u_surface.c  |  190 
> +--
>  src/gallium/auxiliary/util/u_transfer.c |1 +
>  2 files changed, 104 insertions(+), 87 deletions(-)
>
> diff --git a/src/gallium/auxiliary/util/u_surface.c 
> b/src/gallium/auxiliary/util/u_surface.c
> index 5c3a655..77d04ba 100644
> --- a/src/gallium/auxiliary/util/u_surface.c
> +++ b/src/gallium/auxiliary/util/u_surface.c
> @@ -307,6 +307,7 @@ no_src_map:
>   * cpp > 4 looks like a gross hack at best...
>   * Plus can't use these transfer fallbacks when clearing
>   * multisampled surfaces for instance.
> + * Clears all bound layers.
>   */
>  void
>  util_clear_render_target(struct pipe_context *pipe,
> @@ -316,8 +317,9 @@ util_clear_render_target(struct pipe_context *pipe,
>   unsigned width, unsigned height)
>  {
> struct pipe_transfer *dst_trans;
> -   void *dst_map;
> +   ubyte *dst_map;
> union util_color uc;
> +   unsigned max_layer, layer;
>
> assert(dst->texture);
> if (!dst->texture)
> @@ -332,6 +334,7 @@ util_clear_render_target(struct pipe_context *pipe,
>unsigned pixstride = util_format_get_blocksize(dst->format);
>dx = (dst->u.buf.first_element + dstx) * pixstride;
>w = width * pixstride;
> +  max_layer = 0;
>dst_map = pipe_transfer_map(pipe,
>dst->texture,
>0, 0,
> @@ -340,14 +343,13 @@ util_clear_render_target(struct pipe_context *pipe,
>&dst_trans);
> }
> else {
> -  /* XXX: should handle multiple layers */
> -  dst_map = pipe_transfer_map(pipe,
> -  dst->texture,
> -  dst->u.tex.level,
> -  dst->u.tex.first_layer,
> -  PIPE_TRANSFER_WRITE,
> -  dstx, dsty, width, height, &dst_trans);
> -
> +  max_layer = dst->u.tex.last_layer - dst->u.tex.first_layer;
> +  dst_map = pipe_transfer_map_3d(pipe,
> + dst->texture,
> + dst->u.tex.level,
> + PIPE_TRANSFER_WRITE,
> + dstx, dsty, dst->u.tex.first_layer,
> + width, height, max_layer + 1, &dst_trans);
> }
>
> assert(dst_map);
> @@ -373,9 +375,13 @@ util_clear_render_target(struct pipe_context *pipe,
>else {
>   util_pack_color(color->f, dst->format, &uc);
>}
> -  util_fill_rect(dst_map, dst->format,
> - dst_trans->stride,
> - 0, 0, width, height, &uc);
> +
> +  for (layer = 0; layer <= max_layer; layer++) {
> + util_fill_rect(dst_map, dst->format,
> +dst_trans->stride,
> +0, 0, width, height, &uc);
> + dst_map += dst_trans->layer_stride;
> +  }
>
>pipe->transfer_unmap(pipe, dst_trans);
> }
> @@ -386,6 +392,7 @@ util_clear_render_target(struct pipe_context *pipe,
>   * sw fallback doesn't look terribly useful here.
>   * Plus can't use these transfer fallbacks when clearing
>   * multisampled surfaces for instance.
> + * Clears all bound layers.
>   */
>  void
>  util_clear_depth_stencil(struct pipe_context *pipe,
> @@ -400,6 +407,7 @@ util_clear_depth_stencil(struct pipe_context *pipe,
> struct pipe_transfer *dst_trans;
> ubyte *dst_map;
> boolean need_rmw = FALSE;
> +   unsigned max_layer, layer;
>
> if ((clear_flags & PIPE_CLEAR_DEPTHSTENCIL) &&
> ((clear_flags & PIPE_CLEAR_DEPTHSTENCIL) != PIPE_CLEAR_DEPTHSTENCIL) 
> &&
> @@ -409,102 +417,110 @@ util_clear_depth_stencil(struct pipe_context *pipe,
> assert(dst->texture);
> if (!dst->texture)
>return;
> -   dst_map = pipe_transfer_map(pipe,
> -   dst->texture,
> -   dst->u.tex.level,
> -   dst->u.tex.first_layer,
> -   (need_rmw ? PIPE_TRANSFER_READ_WRITE :
> -   PIPE_TRANSFER_WRITE),
> -   dstx, dsty, width, height, &dst_trans);
> +
> +   max_layer = dst->u.tex.last_layer - dst->u.tex.first_layer;
> +   dst_map = pipe_transfer_map_3d(pipe,
> +  dst->texture,
> +  dst->u.tex.level,
> +  (need_rmw ? PIPE_TRANSFER_READ_WRITE :
> +  PIPE_TRANSFER_WRITE),
> +  dstx, dsty, dst->u.tex.first_layer,
> +  width, height, max_layer + 1, &dst_trans);
> assert(dst_ma
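
The per-layer loop in the quoted patch can be modelled on a plain byte
buffer. This is a simplified sketch, not the gallium code: clear_layers() and
its parameters are hypothetical, and a byte-wise memset stands in for
util_fill_rect(). Since max_layer is last_layer - first_layer, the loop
visits max_layer + 1 layers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill a width x height rectangle of bytes in every layer of a mapped
 * region.  'stride' is the distance between rows and 'layer_stride' the
 * distance between layers, both in bytes. */
static void clear_layers(uint8_t *map, size_t stride, size_t layer_stride,
                         unsigned width, unsigned height,
                         unsigned max_layer, uint8_t value)
{
   assert(map);
   for (unsigned layer = 0; layer <= max_layer; layer++) {
      for (unsigned y = 0; y < height; y++)
         memset(map + y * stride, value, width);
      map += layer_stride;   /* same rectangle in the next layer */
   }
}
```

The mapped region therefore has to cover max_layer + 1 layers, otherwise the
last iterations write outside the mapping.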

Re: [Mesa-dev] [PATCH] u_vbuf: fix index buffer leak

2013-06-07 Thread Marek Olšák
Reviewed-by: Marek Olšák 

Marek

On Fri, Jun 7, 2013 at 6:25 AM, Chia-I Wu  wrote:
>
> Signed-off-by: Chia-I Wu 
> ---
>  src/gallium/auxiliary/util/u_vbuf.c |3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/src/gallium/auxiliary/util/u_vbuf.c 
> b/src/gallium/auxiliary/util/u_vbuf.c
> index 244b04d..5936f74 100644
> --- a/src/gallium/auxiliary/util/u_vbuf.c
> +++ b/src/gallium/auxiliary/util/u_vbuf.c
> @@ -307,6 +307,9 @@ void u_vbuf_destroy(struct u_vbuf *mgr)
> unsigned num_vb = screen->get_shader_param(screen, PIPE_SHADER_VERTEX,
>PIPE_SHADER_CAP_MAX_INPUTS);
>
> +   mgr->pipe->set_index_buffer(mgr->pipe, NULL);
> +   pipe_resource_reference(&mgr->index_buffer.buffer, NULL);
> +
> mgr->pipe->set_vertex_buffers(mgr->pipe, 0, num_vb, NULL);
>
> for (i = 0; i < PIPE_MAX_ATTRIBS; i++) {
> --
> 1.7.10.4
>


[Mesa-dev] [Bug 65426] openGL glDeleteBuffers does not delete buffers created using glGenBuffers

2013-06-07 Thread bugzilla-daemon
https://bugs.freedesktop.org/show_bug.cgi?id=65426

--- Comment #11 from Michel Dänzer  ---
(In reply to comment #7)
> If we'd ever hit 0x (or whatever the new hash table's limit is) we'd
> resort to searching the hash table for a lower, unused ID.  I doubt that any
> app/test has ever exercised that case though...

If there's no test for that path, that probably means it's broken? :)



Re: [Mesa-dev] [PATCH 0/2] Fix gl_ClipVertex support on pre-Gen6 i965

2013-06-07 Thread Chris Forbes
Hi Paul

Thanks for that suggestion -- you're right, the hardware does seem
quite happy with the Gen4 layout. I'm doing a full piglit run to be
safe.

As far as performance goes, it's good for about a 3% speedup on the
CS:S video stress test.

-- Chris


Re: [Mesa-dev] [PATCH] gallium/docs: fix up transfer description for 1d arrays, add cube map arrays

2013-06-07 Thread Eric Anholt
srol...@vmware.com writes:
> -For PIPE_TEXTURE_2D_ARRAY, the box::z and box::depth fields refer to the
> -array dimension of the texture.
> +For PIPE_TEXTURE_1D_ARRAY nad PIPE_TEXTURE_2D_ARRAY, the box::z and 
> box::depth

"and"





[Mesa-dev] [PATCH 2/2] ilo: fix textureSize() for single-layered array textures

2013-06-07 Thread Chia-I Wu
We returned 0 instead of 1 for the number of layers when the array texture is
single-layered.  This fixes it on GEN7+.

Signed-off-by: Chia-I Wu 
---
 src/gallium/drivers/ilo/ilo_gpe_gen7.c |   20 ++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/ilo/ilo_gpe_gen7.c b/src/gallium/drivers/ilo/ilo_gpe_gen7.c
index f9533ff..c3211b5 100644
--- a/src/gallium/drivers/ilo/ilo_gpe_gen7.c
+++ b/src/gallium/drivers/ilo/ilo_gpe_gen7.c
@@ -25,6 +25,7 @@
  *Chia-I Wu 
  */
 
+#include "util/u_resource.h"
 #include "brw_defines.h"
 #include "intel_reg.h"
 
@@ -1591,8 +1592,23 @@ ilo_gpe_init_view_surface_for_texture_gen7(const struct ilo_dev_info *dev,
surface_format << BRW_SURFACE_FORMAT_SHIFT |
ilo_gpe_gen6_translate_winsys_tiling(tex->tiling) << 13;
 
-   if (surface_type != BRW_SURFACE_3D && depth > 1)
-  dw[0] |= GEN7_SURFACE_IS_ARRAY;
+   /*
+* From the Ivy Bridge PRM, volume 4 part 1, page 63:
+*
+* "If this field (Surface Array) is enabled, the Surface Type must be
+*  SURFTYPE_1D, SURFTYPE_2D, or SURFTYPE_CUBE. If this field is
+*  disabled and Surface Type is SURFTYPE_1D, SURFTYPE_2D, or
+*  SURFTYPE_CUBE, the Depth field must be set to zero."
+*
+* For non-3D sampler surfaces, resinfo (the sampler message) always
+* returns zero for the number of layers when this field is not set.
+*/
+   if (surface_type != BRW_SURFACE_3D) {
+  if (util_resource_is_array_texture(&tex->base))
+ dw[0] |= GEN7_SURFACE_IS_ARRAY;
+  else
+ assert(depth == 1);
+   }
 
if (tex->valign_4)
   dw[0] |= GEN7_SURFACE_VALIGN_4;
-- 
1.7.10.4



[Mesa-dev] [PATCH 1/2] util: add util_resource_is_array_texture()

2013-06-07 Thread Chia-I Wu
Checking if array_size is greater than 1 is not enough for single-layered
array textures.

Signed-off-by: Chia-I Wu 
---
 src/gallium/auxiliary/util/u_resource.h |   20 +++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/src/gallium/auxiliary/util/u_resource.h b/src/gallium/auxiliary/util/u_resource.h
index 977e013..a5e091f 100644
--- a/src/gallium/auxiliary/util/u_resource.h
+++ b/src/gallium/auxiliary/util/u_resource.h
@@ -26,9 +26,27 @@
 #ifndef U_RESOURCE_H
 #define U_RESOURCE_H
 
-struct pipe_resource;
+#include "pipe/p_state.h"
 
 unsigned
 util_resource_size(const struct pipe_resource *res);
 
+/**
+ * Return true if the resource is an array texture.
+ *
+ * Note that this function returns true for single-layered array textures.
+ */
+static INLINE boolean
+util_resource_is_array_texture(const struct pipe_resource *res)
+{
+   switch (res->target) {
+   case PIPE_TEXTURE_1D_ARRAY:
+   case PIPE_TEXTURE_2D_ARRAY:
+   case PIPE_TEXTURE_CUBE_ARRAY:
+  return TRUE;
+   default:
+  return FALSE;
+   }
+}
+
 #endif
-- 
1.7.10.4
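
A small standalone illustration of why the size check is not enough. The
types below are hypothetical stand-ins for pipe_resource and the gallium
target enum, kept minimal so the example compiles without Mesa headers.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the gallium target enum and resource. */
enum tex_target { TEX_2D, TEX_2D_ARRAY };

struct resource {
   enum tex_target target;
   unsigned array_size;
};

/* Size-based check: misses array textures with exactly one layer. */
static bool is_array_by_size(const struct resource *res)
{
   return res->array_size > 1;
}

/* Check in the spirit of the patch: decide by target only. */
static bool is_array_by_target(const struct resource *res)
{
   return res->target == TEX_2D_ARRAY;
}
```

For a 2D array texture with array_size == 1 the first check says "not an
array" while the second correctly says "array", which is the case the helper
in this patch is meant to cover.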
