Re: [Mesa-dev] [PATCH 8/8 v2] i965: Enable EXT_shader_samples_identical

2015-11-18 Thread Jason Ekstrand
On Nov 18, 2015 6:38 PM, "Ian Romanick"  wrote:
>
> From: Ian Romanick 
>
> v2: Handle immediate value for MCS smarter.  Rebase on changes to
> nir_texop_sampels_identical (missing second parameter).  Suggested by
> Jason.  This still doesn't handle the 16x MSAA case.

We could always just enable it on IVB-BDW for now.

The last three are

Reviewed-by: Jason Ekstrand 

> Signed-off-by: Ian Romanick 
> ---
>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   |  4 +++-
>  src/mesa/drivers/dri/i965/brw_fs_visitor.cpp   | 16 
>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp |  4 +++-
>  src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++
>  src/mesa/drivers/dri/i965/intel_extensions.c   |  1 +
>  5 files changed, 34 insertions(+), 2 deletions(-)
>
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> index 80315fe..c85c9fc 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
> @@ -2624,6 +2624,7 @@ fs_visitor::nir_emit_texture(const fs_builder &bld,
nir_tex_instr *instr)
>   switch (instr->op) {
>   case nir_texop_txf:
>   case nir_texop_txf_ms:
> + case nir_texop_samples_identical:
>  coordinate = retype(src, BRW_REGISTER_TYPE_D);
>  break;
>   default:
> @@ -2686,7 +2687,8 @@ fs_visitor::nir_emit_texture(const fs_builder &bld,
nir_tex_instr *instr)
>}
> }
>
> -   if (instr->op == nir_texop_txf_ms) {
> +   if (instr->op == nir_texop_txf_ms ||
> +   instr->op == nir_texop_samples_identical) {
>if (devinfo->gen >= 7 &&
>key_tex->compressed_multisample_layout_mask & (1 << sampler)) {
>   mcs = emit_mcs_fetch(coordinate, instr->coord_components,
sampler_reg);
> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> index a7bd9ce..66cbbd2 100644
> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp
> @@ -259,6 +259,22 @@ fs_visitor::emit_texture(ir_texture_opcode op,
>lod = fs_reg(0u);
> }
>
> +   if (op == ir_samples_identical) {
> +  fs_reg dst = vgrf(glsl_type::get_instance(dest_type->base_type, 1,
1));
> +
> +  /* If mcs is an immediate value, it means there is no MCS.  In
that case
> +   * just return false.
> +   */
> +  if (mcs.file == BRW_IMMEDIATE_VALUE) {
> + bld.MOV(dst, fs_reg(0));
> +  } else {
> + bld.CMP(dst, mcs, src_reg(0u), BRW_CONDITIONAL_EQ);
> +  }
> +
> +  this->result = dst;
> +  return;
> +   }
> +
> if (coordinate.file != BAD_FILE) {
>/* FINISHME: Texture coordinate rescaling doesn't work with
non-constant
> * samplers.  This should only be a problem with GL_CLAMP on Gen7.
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> index c93227c..98a4d3b 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp
> @@ -1615,6 +1615,7 @@ vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>   switch (instr->op) {
>   case nir_texop_txf:
>   case nir_texop_txf_ms:
> + case nir_texop_samples_identical:
>  coordinate = get_nir_src(instr->src[i].src,
BRW_REGISTER_TYPE_D,
>   src_size);
>  coord_type = glsl_type::ivec(src_size);
> @@ -1695,7 +1696,8 @@ vec4_visitor::nir_emit_texture(nir_tex_instr *instr)
>}
> }
>
> -   if (instr->op == nir_texop_txf_ms) {
> +   if (instr->op == nir_texop_txf_ms ||
> +   instr->op == nir_texop_samples_identical) {
>assert(coord_type != NULL);
>if (devinfo->gen >= 7 &&
>key_tex->compressed_multisample_layout_mask & (1 << sampler)) {
> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> index fda3d7c..d8a0f22 100644
> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp
> @@ -909,6 +909,17 @@ vec4_visitor::emit_texture(ir_texture_opcode op,
>unreachable("TXB is not valid for vertex shaders.");
> case ir_lod:
>unreachable("LOD is not valid for vertex shaders.");
> +   case ir_samples_identical: {
> +  /* If mcs is an immediate value, it means there is no MCS.  In
that case
> +   * just return false.
> +   */
> +  if (mcs.file == BRW_IMMEDIATE_VALUE) {
> + emit(MOV(dest, src_reg(0u)));
> +  } else {
> + emit(CMP(dest, mcs, src_reg(0u), BRW_CONDITIONAL_EQ));
> +  }
> +  return;
> +   }
> default:
>unreachable("Unrecognized tex op");
> }
> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c
b/src/mesa/drivers/dri/i965/intel_extensions.c
> index 386b63c..2e2459c 100644
> --- a/src/mesa/drivers/dri/i965/intel_extensions.c
> +++ b/src/m

Re: [Mesa-dev] [PATCH 8/8 v2] i965: Enable EXT_shader_samples_identical

2015-11-19 Thread Neil Roberts
Jason Ekstrand  writes:

> On Nov 18, 2015 6:38 PM, "Ian Romanick"  wrote:
>>
>> From: Ian Romanick 
>>
>> v2: Handle immediate value for MCS smarter.  Rebase on changes to
>> nir_texop_sampels_identical (missing second parameter).  Suggested by
>> Jason.  This still doesn't handle the 16x MSAA case.

16x MSAA has a field in the program sampler key because it needs a new
instruction to fetch from the texture. For 16x MSAA it just uses two of
the registers that are returned by the MCS fetch instruction instead of
just one. Maybe you could do something like this:

  if (mcs.file == BRW_IMMEDIATE_VALUE) {
 bld.MOV(dst, fs_reg(0));
  } else if ((key_tex->msaa_16 & (1 << sampler))) {
 fs_reg tmp = vgrf(glsl_type::uint_type);
 bld.OR(tmp, mcs, offset(mcs, bld, 1));
 bld.CMP(dst, tmp, src_reg(0u), BRW_CONDITIONAL_EQ);
  } else {
 bld.CMP(dst, mcs, src_reg(0u), BRW_CONDITIONAL_EQ);
  }

I tested this on my SKL and it passes all of your Piglit tests. The
optimiser quite neatly removes the second instruction and the temporary
register in the 16x case and makes the OR directly update the flag
registers, so it is possibly the same cost as the 8x case in practice.

I can confirm that I've already pushed the 16x MSAA patches to master.

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8 v2] i965: Enable EXT_shader_samples_identical

2015-11-19 Thread Ian Romanick
On 11/19/2015 09:34 AM, Neil Roberts wrote:
> Jason Ekstrand  writes:
> 
>> On Nov 18, 2015 6:38 PM, "Ian Romanick"  wrote:
>>>
>>> From: Ian Romanick 
>>>
>>> v2: Handle immediate value for MCS smarter.  Rebase on changes to
>>> nir_texop_sampels_identical (missing second parameter).  Suggested by
>>> Jason.  This still doesn't handle the 16x MSAA case.
> 
> 16x MSAA has a field in the program sampler key because it needs a new
> instruction to fetch from the texture. For 16x MSAA it just uses two of
> the registers that are returned by the MCS fetch instruction instead of
> just one. Maybe you could do something like this:
> 
>   if (mcs.file == BRW_IMMEDIATE_VALUE) {
>  bld.MOV(dst, fs_reg(0));
>   } else if ((key_tex->msaa_16 & (1 << sampler))) {
>  fs_reg tmp = vgrf(glsl_type::uint_type);
>  bld.OR(tmp, mcs, offset(mcs, bld, 1));
>  bld.CMP(dst, tmp, src_reg(0u), BRW_CONDITIONAL_EQ);
>   } else {
>  bld.CMP(dst, mcs, src_reg(0u), BRW_CONDITIONAL_EQ);
>   }
> 
> I tested this on my SKL and it passes all of your Piglit tests. The
> optimiser quite neatly removes the second instruction and the temporary
> register in the 16x case and makes the OR directly update the flag
> registers, so it is possibly the same cost as the 8x case in practice.

I was looking at the msaa_16 flags last night after I sent out v2.  I
was thinking of something along these lines.  Am I correct that nothing
special is needed in the vec4 backend?  It seems like mcs should know
the register size, and the CMP with 0 should just do the right thing.

I'll add your code to the FS and your S-o-b to the commit message... and
hopefully this can land soon. :)

> I can confirm that I've already pushed the 16x MSAA patches to master.
> 
> Regards,
> - Neil


___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8 v2] i965: Enable EXT_shader_samples_identical

2015-11-19 Thread Neil Roberts
Ian Romanick  writes:

> Am I correct that nothing special is needed in the vec4 backend? It
> seems like mcs should know the register size, and the CMP with 0
> should just do the right thing.

I think you probably will have to do something for the vec4 backend.
Currently it generates an instruction like this:

cmp.z.f0(8) g8<1>.xUD   g9<4,4,1>.xUD   0xUD

g9 contains the MCS data. If I understand correctly for 16x MSAA the
second half of the MCS data is in the y component which is currently
ignored by this instruction. Maybe something like this would work:

  if (mcs.file == BRW_IMMEDIATE_VALUE) {
 emit(MOV(dest, src_reg(0u)));
  } else if ((key_tex->msaa_16 & (1 << sampler))) {
 dst_reg tmp(this, glsl_type::uint_type);
 dst_reg swizzled_mcs = mcs;
 swizzled_mcs.swizzle = BRW_SWIZZLE_;
 emit(OR(tmp, mcs, swizzled_mcs));
 emit(CMP(dest, tmp, src_reg(0u), BRW_CONDITIONAL_EQ));
  } else {
 emit(CMP(dest, mcs, src_reg(0u), BRW_CONDITIONAL_EQ));
  }

Sadly the optimiser doesn't optimise out the extra instruction this
time. There's probably a better way to do it. On the other hand I can't
think why anyone would be using this function in the vec4 backend so
it's probably not worth worrying about.

I haven't tested this at all. I guess to test it you would have to write
a GS version of the Piglit test? I think that is the only place that
uses the vec4 backend on SKL.

If we don't want to risk this we could always just make it return false
when msaa_16 is set for the sampler. On the other hand if there is
currently no test for the 8x version either then we should probably
assume that neither of them work… maybe it wouldn't be so bad to just
always return false in the vec4 backend until we've tested it?

Regards,
- Neil
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8 v2] i965: Enable EXT_shader_samples_identical

2015-11-19 Thread Ian Romanick
On 11/19/2015 02:13 PM, Neil Roberts wrote:
> Ian Romanick  writes:
> 
>> Am I correct that nothing special is needed in the vec4 backend? It
>> seems like mcs should know the register size, and the CMP with 0
>> should just do the right thing.
> 
> I think you probably will have to do something for the vec4 backend.
> Currently it generates an instruction like this:
> 
> cmp.z.f0(8) g8<1>.xUD   g9<4,4,1>.xUD   0xUD
> 
> g9 contains the MCS data. If I understand correctly for 16x MSAA the
> second half of the MCS data is in the y component which is currently
> ignored by this instruction. Maybe something like this would work:
> 
>   if (mcs.file == BRW_IMMEDIATE_VALUE) {
>  emit(MOV(dest, src_reg(0u)));
>   } else if ((key_tex->msaa_16 & (1 << sampler))) {
>  dst_reg tmp(this, glsl_type::uint_type);
>  dst_reg swizzled_mcs = mcs;
>  swizzled_mcs.swizzle = BRW_SWIZZLE_;
>  emit(OR(tmp, mcs, swizzled_mcs));
>  emit(CMP(dest, tmp, src_reg(0u), BRW_CONDITIONAL_EQ));
>   } else {
>  emit(CMP(dest, mcs, src_reg(0u), BRW_CONDITIONAL_EQ));
>   }
> 
> Sadly the optimiser doesn't optimise out the extra instruction this
> time. There's probably a better way to do it. On the other hand I can't
> think why anyone would be using this function in the vec4 backend so
> it's probably not worth worrying about.
> 
> I haven't tested this at all. I guess to test it you would have to write
> a GS version of the Piglit test? I think that is the only place that
> uses the vec4 backend on SKL.
> 
> If we don't want to risk this we could always just make it return false
> when msaa_16 is set for the sampler. On the other hand if there is

As far as I could tell, there is no msaa_16 for vec4.  It looks like it
always takes the equivalent of the FS msaa_16 path when gen >= 9.

> currently no test for the 8x version either then we should probably
> assume that neither of them work… maybe it wouldn't be so bad to just
> always return false in the vec4 backend until we've tested it?

Hmm... yeah, I'll do that.  It's an optimization, and I don't think
anyone will miss it in vec4 on any platform.  That way I don't have to
worry about applying a bunch of fixes to the 11.1 branch after I create
tests.  We can either implement it ("fix") in 11.1 or 11.2.

> Regards,
> - Neil

___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [PATCH 8/8 v2] i965: Enable EXT_shader_samples_identical

2015-11-21 Thread Kenneth Graunke
On Thursday, November 19, 2015 11:13:15 PM Neil Roberts wrote:
> Ian Romanick  writes:
> 
> > Am I correct that nothing special is needed in the vec4 backend? It
> > seems like mcs should know the register size, and the CMP with 0
> > should just do the right thing.
> 
> I think you probably will have to do something for the vec4 backend.
> Currently it generates an instruction like this:
> 
> cmp.z.f0(8) g8<1>.xUD   g9<4,4,1>.xUD   0xUD
> 
> g9 contains the MCS data. If I understand correctly for 16x MSAA the
> second half of the MCS data is in the y component which is currently
> ignored by this instruction. Maybe something like this would work:
> 
>   if (mcs.file == BRW_IMMEDIATE_VALUE) {
>  emit(MOV(dest, src_reg(0u)));
>   } else if ((key_tex->msaa_16 & (1 << sampler))) {
>  dst_reg tmp(this, glsl_type::uint_type);
>  dst_reg swizzled_mcs = mcs;
>  swizzled_mcs.swizzle = BRW_SWIZZLE_;
>  emit(OR(tmp, mcs, swizzled_mcs));
>  emit(CMP(dest, tmp, src_reg(0u), BRW_CONDITIONAL_EQ));
>   } else {
>  emit(CMP(dest, mcs, src_reg(0u), BRW_CONDITIONAL_EQ));
>   }
> 
> Sadly the optimiser doesn't optimise out the extra instruction this
> time. There's probably a better way to do it. On the other hand I can't
> think why anyone would be using this function in the vec4 backend so
> it's probably not worth worrying about.
> 
> I haven't tested this at all. I guess to test it you would have to write
> a GS version of the Piglit test? I think that is the only place that
> uses the vec4 backend on SKL.
> 
> If we don't want to risk this we could always just make it return false
> when msaa_16 is set for the sampler. On the other hand if there is
> currently no test for the 8x version either then we should probably
> assume that neither of them work… maybe it wouldn't be so bad to just
> always return false in the vec4 backend until we've tested it?
> 
> Regards,
> - Neil

FWIW, you can still test vec4 vertex shader code on Gen8+ with:

INTEL_DEBUG=vec4 

--Ken


signature.asc
Description: This is a digitally signed message part.
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev