Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-21 Thread Eric Anholt
Matt Turner matts...@gmail.com writes:

 On Fri, Feb 20, 2015 at 1:41 PM, Eric Anholt e...@anholt.net wrote:
 Or maybe I'm just wrong and some bit is guaranteed to be set?

 A "This negation looks like it's safe in practice, because bits 0:4 will
 surely be TRIANGLES" comment seems fine with me.

 Thanks, will do. R-b?

 I realized I was looking at Vol4/Part1, which describes the render
 target write message header. It has much the same description, but it
 is not the actual incoming thread dispatch payload. Vol2/Part1 has the
 info I want.

Yep.  And,

Reviewed-by: Eric Anholt e...@anholt.net


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Eric Anholt
Matt Turner matts...@gmail.com writes:

 On Fri, Feb 20, 2015 at 11:54 AM, Eric Anholt e...@anholt.net wrote:
 I wanted patch #1 to land, so I took a look at this one :)

 Thanks! :)

 Matt Turner matts...@gmail.com writes:
 +   if (brw->gen >= 6) {
 +      /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
 +      fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
 +
 +      /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
 +       *
 +       *    or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
 +       *    and(8) dst<1>D    tmp<8,8,1>D   0xbf800000D
 +       *
 +       * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
 +       */
 +
 +      if (value1->f[0] == -1.0f) {
 +         g0.negate = true;
 +      }

 Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
 still going to get 0 after negation, right?

 That's a good question. I'm not sure. The bits below it are

 13: Source Depth Present to Render Target.
 12: oMask to Render Target
 11: Source0 Alpha Present to RenderTarget.
 8:6: Starting Sample Pair Index

 BDW adds some additional fields as well.

 The others are ignored. It's not clear to me that at least one of
 the defined bits is guaranteed to be non-zero. It's no guarantee or
 anything, but FWIW without realizing it we were depending on some bit
 being non-zero for the frontfacing optimizations that used ASR as well
 (commits d1c43ed4, 19c6617a) and haven't seen any problems from it. So
 if there's a problem... we're not making it worse in this commit...

 The simulator seems to set some bits in the ignored fields, but I
 don't have any explanation for that, nor is that evidence that we can
 rely on the hardware to do the same.

 Or maybe I'm just wrong and some bit is guaranteed to be set?

A "This negation looks like it's safe in practice, because bits 0:4 will
surely be TRIANGLES" comment seems fine with me.




Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 1:41 PM, Eric Anholt e...@anholt.net wrote:
 Or maybe I'm just wrong and some bit is guaranteed to be set?

 A "This negation looks like it's safe in practice, because bits 0:4 will
 surely be TRIANGLES" comment seems fine with me.

Thanks, will do. R-b?

I realized I was looking at Vol4/Part1, which describes the render
target write message header. It has much the same description, but it
is not the actual incoming thread dispatch payload. Vol2/Part1 has the
info I want.


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Eric Anholt
I wanted patch #1 to land, so I took a look at this one :)

Matt Turner matts...@gmail.com writes:
 +   if (brw->gen >= 6) {
 +      /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
 +      fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
 +
 +      /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
 +       *
 +       *    or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
 +       *    and(8) dst<1>D    tmp<8,8,1>D   0xbf800000D
 +       *
 +       * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
 +       */
 +
 +      if (value1->f[0] == -1.0f) {
 +         g0.negate = true;
 +      }

Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
still going to get 0 after negation, right?

(I suppose your primitive type is probably going to be triangles if
you're doing facing, so the low bits will be non-zero)
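
To make the concern concrete, here is a minimal sketch of the edge case, assuming the source negate on a W-typed operand behaves as a 16-bit two's-complement negation (an assumption for illustration, not something taken from the hardware docs). The helper names bit15 and negate_w are hypothetical and do not exist in Mesa.

```cpp
#include <cassert>
#include <cstdint>

// Bit 15 of the low word of g0.0; per the patch comment, 0 means front facing.
constexpr bool bit15(uint16_t w) { return (w & 0x8000u) != 0; }

// Hypothetical model of the source negate on a W-typed operand:
// plain two's-complement negation, truncated to 16 bits.
constexpr uint16_t negate_w(uint16_t w) { return static_cast<uint16_t>(0u - w); }

// Front facing (bit 15 clear) with some low bit set: negation sets bit 15.
static_assert(bit15(negate_w(0x0001)), "0x0001 negates to 0xffff");

// Back facing (bit 15 set) with some low bit set: negation clears bit 15.
static_assert(!bit15(negate_w(0x8001)), "0x8001 negates to 0x7fff");

// The two problem inputs: with all low 15 bits zero, negation leaves
// bit 15 unchanged, so the facing test would not be inverted.
static_assert(negate_w(0x0000) == 0x0000, "zero stays zero");
static_assert(negate_w(0x8000) == 0x8000, "0x8000 stays 0x8000");
```

So under this model the negate only inverts bit 15 when at least one of the lower 15 bits is set, which is why a non-zero primitive type field in the low bits would be enough.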




Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-20 Thread Matt Turner
On Fri, Feb 20, 2015 at 11:54 AM, Eric Anholt e...@anholt.net wrote:
 I wanted patch #1 to land, so I took a look at this one :)

Thanks! :)

 Matt Turner matts...@gmail.com writes:
 +   if (brw->gen >= 6) {
 +      /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
 +      fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
 +
 +      /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
 +       *
 +       *    or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
 +       *    and(8) dst<1>D    tmp<8,8,1>D   0xbf800000D
 +       *
 +       * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
 +       */
 +
 +      if (value1->f[0] == -1.0f) {
 +         g0.negate = true;
 +      }

 Does this do what you want?  If g0.0 happened to be *all* zeroes, you're
 still going to get 0 after negation, right?

That's a good question. I'm not sure. The bits below it are

13: Source Depth Present to Render Target.
12: oMask to Render Target
11: Source0 Alpha Present to RenderTarget.
8:6: Starting Sample Pair Index

BDW adds some additional fields as well.

The others are ignored. It's not clear to me that at least one of
the defined bits is guaranteed to be non-zero. It's no guarantee or
anything, but FWIW without realizing it we were depending on some bit
being non-zero for the frontfacing optimizations that used ASR as well
(commits d1c43ed4, 19c6617a) and haven't seen any problems from it. So
if there's a problem... we're not making it worse in this commit...

The simulator seems to set some bits in the ignored fields, but I
don't have any explanation for that, nor is that evidence that we can
rely on the hardware to do the same.

Or maybe I'm just wrong and some bit is guaranteed to be set?

 (I suppose your primitive type is probably going to be triangles if
 you're doing facing, so the low bits will be non-zero)


Re: [Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-19 Thread Matt Turner
On Tue, Feb 17, 2015 at 11:46 AM, Matt Turner matts...@gmail.com wrote:
 total instructions in shared programs: 8013221 -> 8010869 (-0.03%)
 instructions in affected programs: 475925 -> 473573 (-0.49%)
 helped: 2350
 ---

Patches 1 and 3 have been reviewed, but this one hasn't.

Neither has the equivalent change to brw_fs_visitor.cpp [0]. The only
response to that patch was concerns about adding new optimizations
only to brw_fs_visitor.cpp.

The current numbers for this patch are

total instructions in shared programs: 7756214 -> 7753873 (-0.03%)
instructions in affected programs: 455452 -> 453111 (-0.51%)
helped: 2333

and the current numbers for the analogous brw_fs_visitor.cpp change are

total instructions in shared programs: 5695356 -> 5689775 (-0.10%)
instructions in affected programs: 486231 -> 480650 (-1.15%)
helped: 2604
LOST: 1

They're both available in the sent branch of my tree:

   git://people.freedesktop.org/~mattst88/mesa sent

[0] [PATCH 3/4] i965/fs: Optimize (gl_FrontFacing ? x : y) where x and
y are ±1.0.
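
For anyone checking the percentages quoted above, they are the relative change from the "before" count to the "after" count. A minimal sketch follows; percent_change is a hypothetical helper name, not part of shader-db.

```cpp
#include <cassert>
#include <cmath>

// Relative change from `before` to `after`, in percent.
// Negative values mean fewer instructions after the patch.
inline double percent_change(long before, long after)
{
   return 100.0 * static_cast<double>(after - before) /
          static_cast<double>(before);
}
```

For example, percent_change(7756214, 7753873) is roughly -0.03 and percent_change(486231, 480650) is roughly -1.15, which matches the figures in this mail after rounding to two decimals.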


[Mesa-dev] [PATCH 2/3] i965/fs/nir: Optimize (gl_FrontFacing ? x : y) where x and y are ±1.0.

2015-02-17 Thread Matt Turner
total instructions in shared programs: 8013221 -> 8010869 (-0.03%)
instructions in affected programs: 475925 -> 473573 (-0.49%)
helped: 2350
---
 src/mesa/drivers/dri/i965/brw_fs.h   |  3 ++
 src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 81 
 2 files changed, 84 insertions(+)

diff --git a/src/mesa/drivers/dri/i965/brw_fs.h b/src/mesa/drivers/dri/i965/brw_fs.h
index bce9f7a..99a759f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs.h
+++ b/src/mesa/drivers/dri/i965/brw_fs.h
@@ -363,6 +363,9 @@ public:
    fs_reg get_nir_dest(nir_dest dest);
    void emit_percomp(fs_inst *inst, unsigned wr_mask);
 
+   bool optimize_frontfacing_ternary(nir_alu_instr *instr,
+                                     const fs_reg result);
+
    int setup_color_payload(fs_reg *dst, fs_reg color, unsigned components);
    void emit_alpha_test();
    fs_inst *emit_single_fb_write(fs_reg color1, fs_reg color2,
diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
index 850f132..266249f 100644
--- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
+++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
@@ -532,6 +532,84 @@ brw_type_for_nir_type(nir_alu_type type)
    return BRW_REGISTER_TYPE_F;
 }
 
+bool
+fs_visitor::optimize_frontfacing_ternary(nir_alu_instr *instr,
+                                         const fs_reg result)
+{
+   if (instr->src[0].src.is_ssa ||
+       !instr->src[0].src.reg.reg ||
+       !instr->src[0].src.reg.reg->parent_instr)
+      return false;
+
+   if (instr->src[0].src.reg.reg->parent_instr->type !=
+       nir_instr_type_intrinsic)
+      return false;
+
+   nir_intrinsic_instr *src0 =
+      nir_instr_as_intrinsic(instr->src[0].src.reg.reg->parent_instr);
+
+   if (src0->intrinsic != nir_intrinsic_load_front_face)
+      return false;
+
+   nir_const_value *value1 = nir_src_as_const_value(instr->src[1].src);
+   if (!value1 || fabsf(value1->f[0]) != 1.0f)
+      return false;
+
+   nir_const_value *value2 = nir_src_as_const_value(instr->src[2].src);
+   if (!value2 || fabsf(value2->f[0]) != 1.0f)
+      return false;
+
+   fs_reg tmp = vgrf(glsl_type::int_type);
+
+   if (brw->gen >= 6) {
+      /* Bit 15 of g0.0 is 0 if the polygon is front facing. */
+      fs_reg g0 = fs_reg(retype(brw_vec1_grf(0, 0), BRW_REGISTER_TYPE_W));
+
+      /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
+       *
+       *    or(8)  tmp.1<2>W  g0.0<0,1,0>W  0x3f80W
+       *    and(8) dst<1>D    tmp<8,8,1>D   0xbf800000D
+       *
+       * and negate g0.0<0,1,0>W for (gl_FrontFacing ? -1.0 : 1.0).
+       */
+
+      if (value1->f[0] == -1.0f) {
+         g0.negate = true;
+      }
+
+      tmp.type = BRW_REGISTER_TYPE_W;
+      tmp.subreg_offset = 2;
+      tmp.stride = 2;
+
+      fs_inst *or_inst = emit(OR(tmp, g0, fs_reg(0x3f80)));
+      or_inst->src[1].type = BRW_REGISTER_TYPE_UW;
+
+      tmp.type = BRW_REGISTER_TYPE_D;
+      tmp.subreg_offset = 0;
+      tmp.stride = 1;
+   } else {
+      /* Bit 31 of g1.6 is 0 if the polygon is front facing. */
+      fs_reg g1_6 = fs_reg(retype(brw_vec1_grf(1, 6), BRW_REGISTER_TYPE_D));
+
+      /* For (gl_FrontFacing ? 1.0 : -1.0), emit:
+       *
+       *    or(8)  tmp<1>D  g1.6<0,1,0>D  0x3f800000D
+       *    and(8) dst<1>D  tmp<8,8,1>D   0xbf800000D
+       *
+       * and negate g1.6<0,1,0>D for (gl_FrontFacing ? -1.0 : 1.0).
+       */
+
+      if (value1->f[0] == -1.0f) {
+         g1_6.negate = true;
+      }
+
+      emit(OR(tmp, g1_6, fs_reg(0x3f800000)));
+   }
+   emit(AND(retype(result, BRW_REGISTER_TYPE_D), tmp, fs_reg(0xbf800000)));
+
+   return true;
+}
+
 void
 fs_visitor::nir_emit_alu(nir_alu_instr *instr)
 {
@@ -1057,6 +1135,9 @@ fs_visitor::nir_emit_alu(nir_alu_instr *instr)
       break;
 
    case nir_op_bcsel:
+      if (optimize_frontfacing_ternary(instr, result))
+         return;
+
       emit(CMP(reg_null_d, op[0], fs_reg(0), BRW_CONDITIONAL_NZ));
       inst = emit(SEL(result, op[1], op[2]));
       inst->predicate = BRW_PREDICATE_NORMAL;
-- 
2.0.5
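
As an illustration of what the emitted OR/AND pair computes, the following is a rough CPU-side sketch, not driver code. It assumes IEEE 754 single precision (1.0f is 0x3f800000, -1.0f is 0xbf800000) and models the source negate as two's-complement negation. The function names are hypothetical, and the low word of tmp, which the Gen6+ path never writes, is simply zeroed here since the AND mask discards it anyway.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reinterpret a 32-bit pattern as a float without aliasing problems.
inline float bits_to_float(uint32_t bits)
{
   float f;
   std::memcpy(&f, &bits, sizeof f);
   return f;
}

// Gen6+ path: g0_w is the low word of g0.0, facing bit is bit 15.
// The OR result lands in the high word of tmp (tmp.1<2>W).
inline float frontfacing_sign_gen6(uint16_t g0_w, bool negate)
{
   uint16_t src = negate ? static_cast<uint16_t>(0u - g0_w) : g0_w;
   uint32_t tmp = static_cast<uint32_t>(static_cast<uint16_t>(src | 0x3f80u)) << 16;
   // Keep only the sign bit and the exponent bits of 1.0f.
   return bits_to_float(tmp & 0xbf800000u);
}

// Pre-Gen6 path: g1_6 is the dword g1.6, facing bit is bit 31.
inline float frontfacing_sign_gen4(uint32_t g1_6, bool negate)
{
   uint32_t src = negate ? (0u - g1_6) : g1_6;
   return bits_to_float((src | 0x3f800000u) & 0xbf800000u);
}
```

With a front-facing payload such as 0x0003, frontfacing_sign_gen6 yields 1.0f without the negate and -1.0f with it; a back-facing payload such as 0x8003 gives the opposite signs.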
