On Wed, Nov 18, 2015 at 9:29 PM, Jason Ekstrand <ja...@jlekstrand.net> wrote: > > On Nov 18, 2015 5:02 PM, "Jason Ekstrand" <ja...@jlekstrand.net> wrote: >> >> On Wed, Nov 18, 2015 at 4:06 PM, Kenneth Graunke <kenn...@whitecape.org> >> wrote: >> > On Wednesday, November 18, 2015 03:46:54 PM Ian Romanick wrote: >> >> From: Ian Romanick <ian.d.roman...@intel.com> >> >> >> >> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com> >> >> --- >> >> src/mesa/drivers/dri/i965/brw_fs_nir.cpp | 1 + >> >> src/mesa/drivers/dri/i965/brw_fs_visitor.cpp | 16 ++++++++++++++++ >> >> src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 1 + >> >> src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp | 11 +++++++++++ >> >> src/mesa/drivers/dri/i965/intel_extensions.c | 1 + >> >> 5 files changed, 30 insertions(+) >> >> >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> index 1f71f66..4af1234 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp >> >> @@ -2550,6 +2550,7 @@ fs_visitor::nir_emit_texture(const fs_builder >> >> &bld, nir_tex_instr *instr) >> >> switch (instr->op) { >> >> case nir_texop_txf: >> >> case nir_texop_txf_ms: >> >> + case nir_texop_samples_identical: >> >> coordinate = retype(src, BRW_REGISTER_TYPE_D); >> >> break; >> >> default: >> >> diff --git a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> >> b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> >> index a7bd9ce..6688f6a 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> >> +++ b/src/mesa/drivers/dri/i965/brw_fs_visitor.cpp >> >> @@ -259,6 +259,22 @@ fs_visitor::emit_texture(ir_texture_opcode op, >> >> lod = fs_reg(0u); >> >> } >> >> >> >> + if (op == ir_samples_identical) { >> >> + fs_reg dst = vgrf(glsl_type::get_instance(dest_type->base_type, >> >> 1, 1)); >> >> + >> >> + if (mcs.file == BRW_IMMEDIATE_VALUE) { >> >> + fs_reg tmp = vgrf(glsl_type::uint_type); >> >> + >> >> + bld.MOV(tmp, mcs); >> >> + bld.CMP(dst, tmp, src_reg(0u), BRW_CONDITIONAL_EQ); >> > >> > Seems a little strange to emit assembly to do the comparison when >> > you've already determined that the value is a compile time constant. >> > >> > Why not just: >> > >> > bld.MOV(dst, fs_reg(mcs.ud == 0u ? ~0u : 0u)); >> >> Actually, getting an immediate here means we don't have an MCS and we >> have no idea of the samples are identical, so we should return false >> always. >> >> >> + } else { >> >> + bld.CMP(dst, mcs, src_reg(0u), BRW_CONDITIONAL_EQ); >> >> We should also consider handling the clear color case. In this case, >> we'll get 0xff for 2x and 0xffffffff for 4x or 8x. Do we know the >> number of samples in the shader? We should be able to get that from >> the sampler or something but then we would have to pass that through >> the key and that would get gross. > > First off, I realized that the numbers I have there are wrong. It's 0xff for > 2x and 4x and 0xffffffff for 8x. However, I also just realized that the 8 > 8-bit values you get for 8x MSAA range from 0 to 7 but take up 4 bits each. > This means that no valid 8x MSAA MCS value can have 0xff as its bottom 8 > bits unless it's the clear color. This means that a simple and with 0xff > will get us a clear color check on all but 16x. Unfortunately, 16x has a > 64-bit MCS value and, unless the hardware provides is with some extra > guarantees, 0xff would be valid in the bottom 8 bits. > > Going off into the world of speculation just a bit, can we make some > assumptions about the hardware? Suppose for a moment that the used a fairly > greedy algorithm for determining which plane to store a value in: > > 1) If all samples are affected, store in slice zero > 2) If not, store in the first available empty or completely overwritten > slice. > > Such an algorithm would make sense and have the nice property of tending to > pack the samples in the earlier slices this decreasing the possibility of > ever touching slice 15. This is good for cache locality. It also has > another property that would be very useful for us, namely that it only > touches slice 15 if all 16 samples have different colors. In particular, it > would mean that you can never have two samples that both lie in slice 15 > and, more specifically, 0xff would also be invalid for 16x. > > Unfortunately, that's entirely based on my speculation as to how the > hardware works. Given that we don't actually know, it's not documented, and > that we're not liable to ever find anyone willing to give us those kinds of > details, we're not likely to find out without a very clever experiment.
So, chad has convinced my that my speculation is quite possibly bogus and *very* hard to actually test, so just checking the bottom 8 bits probably won't work. Something else that came out of that conversation is that, for 2x MSAA, we may get bogus data in all but the bottom 4 bits. In other words, just blindly checking for zero is probably a bad idea. It'll work because the extension spec lets us return false negatives, but it isn't a good idea in general. If we really want the implementation to be solid, we need to mask off all but the bottom n * log2(n) bits where n = number of samples. > OK, enough hardware speculation for one night... > --Jason > >> One other thought, 16x MSAA will break all this because it gives you a >> ivec4 value from the MCS (if I remember correctly). Not sure if we've >> landed 16x MSAA yet though. >> --Jason >> >> >> + } >> >> + >> >> + this->result = dst; >> >> + return; >> >> + } >> >> + >> >> if (coordinate.file != BAD_FILE) { >> >> /* FINISHME: Texture coordinate rescaling doesn't work with >> >> non-constant >> >> * samplers. This should only be a problem with GL_CLAMP on >> >> Gen7. >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> index 3c2674d..41c3c10 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_nir.cpp >> >> @@ -1615,6 +1615,7 @@ vec4_visitor::nir_emit_texture(nir_tex_instr >> >> *instr) >> >> switch (instr->op) { >> >> case nir_texop_txf: >> >> case nir_texop_txf_ms: >> >> + case nir_texop_samples_identical: >> >> coordinate = get_nir_src(instr->src[i].src, >> >> BRW_REGISTER_TYPE_D, >> >> src_size); >> >> coord_type = glsl_type::ivec(src_size); >> >> diff --git a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> >> b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> >> index fda3d7c..2190a86 100644 >> >> --- a/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> >> +++ b/src/mesa/drivers/dri/i965/brw_vec4_visitor.cpp >> >> @@ -909,6 +909,17 @@ vec4_visitor::emit_texture(ir_texture_opcode op, >> >> unreachable("TXB is not valid for vertex shaders."); >> >> case ir_lod: >> >> unreachable("LOD is not valid for vertex shaders."); >> >> + case ir_samples_identical: { >> >> + if (mcs.file == BRW_IMMEDIATE_VALUE) { >> >> + const src_reg temp = src_reg(this, glsl_type::uint_type); >> >> + >> >> + emit(MOV(dst_reg(temp), mcs)); >> >> + emit(CMP(dest, temp, src_reg(0u), BRW_CONDITIONAL_EQ)); >> > >> > Ditto. >> > >> > bld.MOV(dst, src_reg(mcs.ud == 0u ? ~0u : 0u)); >> > >> >> + } else { >> >> + emit(CMP(dest, mcs, src_reg(0u), BRW_CONDITIONAL_EQ)); >> >> + } >> >> + return; >> >> + } >> >> default: >> >> unreachable("Unrecognized tex op"); >> >> } >> >> diff --git a/src/mesa/drivers/dri/i965/intel_extensions.c >> >> b/src/mesa/drivers/dri/i965/intel_extensions.c >> >> index 386b63c..2e2459c 100644 >> >> --- a/src/mesa/drivers/dri/i965/intel_extensions.c >> >> +++ b/src/mesa/drivers/dri/i965/intel_extensions.c >> >> @@ -333,6 +333,7 @@ intelInitExtensions(struct gl_context *ctx) >> >> ctx->Extensions.ARB_texture_compression_bptc = true; >> >> ctx->Extensions.ARB_texture_view = true; >> >> ctx->Extensions.ARB_shader_storage_buffer_object = true; >> >> + ctx->Extensions.EXT_shader_samples_identical = true; >> >> >> >> if (can_do_pipelined_register_writes(brw)) { >> >> ctx->Extensions.ARB_draw_indirect = true; >> >> >> > >> > _______________________________________________ >> > mesa-dev mailing list >> > mesa-dev@lists.freedesktop.org >> > http://lists.freedesktop.org/mailman/listinfo/mesa-dev >> > _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev