[Mesa-dev] [PATCH 12/15] i965: Add support for ir_triop_cond_sel.

2013-08-22 Thread Matt Turner
---
 src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp | 1 +
 src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 ++
 src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 6 ++
 3 files changed, 13 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
index 6ee6d01..34dbc90 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
@@ -362,6 +362,7 @@ ir_channel_expressions_visitor::visit_leave(ir_assignment 
*ir)
 
case ir_triop_fma:
case ir_triop_lrp:
+   case ir_triop_cond_sel:
case ir_triop_bitfield_extract:
   for (i = 0; i < vector_elements; i++) {
 ir_rvalue *op0 = get_element(op_var[0], i);
diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
index 4b54bee..27887d6 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
@@ -735,6 +735,12 @@ fs_visitor::visit(ir_expression *ir)
case ir_triop_lrp:
   emit_lrp(this->result, op[0], op[1], op[2]);
   break;
+
+   case ir_triop_cond_sel:
+  emit(CMP(reg_null_d, op[0], fs_reg(0), BRW_CONDITIONAL_NZ));
+  inst = emit(BRW_OPCODE_SEL, this->result, op[1], op[2]);
+  inst->predicate = BRW_PREDICATE_NORMAL;
+  break;
}
 }
 
diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp 
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
index fbdf73d..0615309 100644
--- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
+++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
@@ -1731,6 +1731,12 @@ vec4_visitor::visit(ir_expression *ir)
   emit(LRP(result_dst, op[2], op[1], op[0]));
   break;
 
+   case ir_triop_cond_sel:
+  emit(CMP(dst_null_d(), op[0], src_reg(0), BRW_CONDITIONAL_NZ));
+  inst = emit(BRW_OPCODE_SEL, result_dst, op[1], op[2]);
+  inst->predicate = BRW_PREDICATE_NORMAL;
+  break;
+
case ir_triop_bfi:
   op[0] = fix_3src_operand(op[0]);
   op[1] = fix_3src_operand(op[1]);
-- 
1.8.3.2

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/15] i965: Add support for ir_triop_cond_sel.

2013-08-23 Thread Paul Berry
On 22 August 2013 16:08, Matt Turner  wrote:

> ---
>  src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp | 1 +
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 ++
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 6 ++
>  3 files changed, 13 insertions(+)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
> index 6ee6d01..34dbc90 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
> @@ -362,6 +362,7 @@
> ir_channel_expressions_visitor::visit_leave(ir_assignment *ir)
>
> case ir_triop_fma:
> case ir_triop_lrp:
> +   case ir_triop_cond_sel:
> case ir_triop_bitfield_extract:
>for (i = 0; i < vector_elements; i++) {
>  ir_rvalue *op0 = get_element(op_var[0], i);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index 4b54bee..27887d6 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -735,6 +735,12 @@ fs_visitor::visit(ir_expression *ir)
> case ir_triop_lrp:
>emit_lrp(this->result, op[0], op[1], op[2]);
>break;
> +
> +   case ir_triop_cond_sel:
> +  emit(CMP(reg_null_d, op[0], fs_reg(0), BRW_CONDITIONAL_NZ));
> +  inst = emit(BRW_OPCODE_SEL, this->result, op[1], op[2]);
> +  inst->predicate = BRW_PREDICATE_NORMAL;
> +  break;
> }
>  }
>

For the uses of ir_triop_cond_sel we have currently (the lowering passes in
patch 14), I believe this will generate efficient code.  But if we adapt
the mix() functions to use it, then there are probably going to be a lot of
uses like this:

x = mix(y, z, a < b);

Which will compile down to this silly assembly (please excuse the
pseudocode--I can't remember the assembly syntax exactly):

CMP.lt tmp a b
CMP.nz null tmp 0
SEL.f0 x y z

What if we modify the loop that calls ir->operands[operand]->accept(this)
on each operand (near the top of the visitor) so that it skips operand 0
when the expression is ir_triop_cond_sel.  Then, in the switch statement,
we can do something like this:

emit_bool_to_cond_code(ir->operands[0]);
inst = emit(BRW_OPCODE_SEL, this->result, op[1], op[2]);
inst->predicate = BRW_PREDICATE_NORMAL;

That should produce the assembly we want, which is:

CMP.lt null a b
SEL.fo x y z


>
>
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index fbdf73d..0615309 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -1731,6 +1731,12 @@ vec4_visitor::visit(ir_expression *ir)
>emit(LRP(result_dst, op[2], op[1], op[0]));
>break;
>
> +   case ir_triop_cond_sel:
> +  emit(CMP(dst_null_d(), op[0], src_reg(0), BRW_CONDITIONAL_NZ));
> +  inst = emit(BRW_OPCODE_SEL, result_dst, op[1], op[2]);
> +  inst->predicate = BRW_PREDICATE_NORMAL;
> +  break;
> +
>

A nearly identical optimization ought to be possible here.


> case ir_triop_bfi:
>op[0] = fix_3src_operand(op[0]);
>op[1] = fix_3src_operand(op[1]);
> --
> 1.8.3.2
>
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
>
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 12/15] i965: Add support for ir_triop_cond_sel.

2013-08-29 Thread Matt Turner
On Fri, Aug 23, 2013 at 9:24 AM, Paul Berry  wrote:
> On 22 August 2013 16:08, Matt Turner  wrote:
>>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp | 1 +
>>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 6 ++
>>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp   | 6 ++
>>  3 files changed, 13 insertions(+)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
>> index 6ee6d01..34dbc90 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_channel_expressions.cpp
>> @@ -362,6 +362,7 @@
>> ir_channel_expressions_visitor::visit_leave(ir_assignment *ir)
>>
>> case ir_triop_fma:
>> case ir_triop_lrp:
>> +   case ir_triop_cond_sel:
>> case ir_triop_bitfield_extract:
>>for (i = 0; i < vector_elements; i++) {
>>  ir_rvalue *op0 = get_element(op_var[0], i);
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> index 4b54bee..27887d6 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
>> @@ -735,6 +735,12 @@ fs_visitor::visit(ir_expression *ir)
>> case ir_triop_lrp:
>>emit_lrp(this->result, op[0], op[1], op[2]);
>>break;
>> +
>> +   case ir_triop_cond_sel:
>> +  emit(CMP(reg_null_d, op[0], fs_reg(0), BRW_CONDITIONAL_NZ));
>> +  inst = emit(BRW_OPCODE_SEL, this->result, op[1], op[2]);
>> +  inst->predicate = BRW_PREDICATE_NORMAL;
>> +  break;
>> }
>>  }
>
>
> For the uses of ir_triop_cond_sel we have currently (the lowering passes in
> patch 14), I believe this will generate efficient code.  But if we adapt the
> mix() functions to use it, then there are probably going to be a lot of uses
> like this:
>
> x = mix(y, z, a < b);
>
> Which will compile down to this silly assembly (please excuse the
> pseudocode--I can't remember the assembly syntax exactly):
>
> CMP.lt tmp a b
> CMP.nz null tmp 0
> SEL.f0 x y z
>
> What if we modify the loop that calls ir->operands[operand]->accept(this) on
> each operand (near the top of the visitor) so that it skips operand 0 when
> the expression is ir_triop_cond_sel.  Then, in the switch statement, we can
> do something like this:
>
> emit_bool_to_cond_code(ir->operands[0]);
>
> inst = emit(BRW_OPCODE_SEL, this->result, op[1], op[2]);
> inst->predicate = BRW_PREDICATE_NORMAL;
>
> That should produce the assembly we want, which is:
>
> CMP.lt null a b
> SEL.fo x y z

I agree. I tried to do this and kept running into problems that would
be solved by considering the flags registers better in various
optimization passes.

Using the code from fs-mix-bvec4-infnan.shader_test for example
(gl_FragColor = mix(arg0, arg1, selector);). We generate something
like the following with some local changes:

cmp.e.f0(8)  null   g2.4<0,1,0>D 0D
(+f0) sel(8) m1<1>F g2<0,1,0>F   g4<8,8,1>F
cmp.e.f0(8)  null   g2.5<0,1,0>D 0D
(+f0) sel(8) m2<1>F g2.1<0,1,0>F g7<8,8,1>F
cmp.e.f0(8)  null   g2.6<0,1,0>D 0D
(+f0) sel(8) m3<1>F g4<8,8,1>F   g2<0,1,0>F
cmp.e.f0(8)  null   g2.7<0,1,0>D 0D
(+f0) sel(8) m4<1>F g7<8,8,1>F   g2.1<0,1,0>F

which would benefit from using the multiple flag registers available
on IVB and newer (or even the subregisters, not sure how hardware
dependency tracking works for them).

And from the code generated for frexp is stupid:

cmp.ne.f0(8)g5<1>D  (abs)g2<0,1,0>F 0F
...
cmp.e.f0(8) nullg5<8,8,1>D  0D
(-f0) sel(8)g9<1>D  g8<8,8,1>D  -126D
...
cmp.e.f0(8) nullg5<8,8,1>D  0D
(-f0) sel(8)g12<1>UDg11<8,8,1>UD0x3f00UD

where we could just evaluate the abs(x) != 0.0f and use that flag
register for the two sels.

Basically, I think what I'm saying is yes, the generated code is
silly, but that I think our backend really needs to understand about
flags register to make it significantly better.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev