On 06/30/2016 03:20 PM, Matt Turner wrote:
> On Wed, Jun 29, 2016 at 2:04 PM, Ian Romanick <i...@freedesktop.org> wrote:
>> From: Ian Romanick <ian.d.roman...@intel.com>
>>
>> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 50 ++++++++++++++++++++++------
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 52 +++++++++++++++++++++++------
>>  2 files changed, 81 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index f15bf3e..f8db28a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -623,8 +623,32 @@ fs_visitor::nir_emit_find_msb_using_lzd(const fs_builder &bld,
>>                                          bool is_signed)
>>  {
>>     fs_inst *inst;
>> +   fs_reg temp = src;
>>
>> -   bld.LZD(retype(result, BRW_REGISTER_TYPE_UD), src);
>> +   if (is_signed) {
>> +      /* LZD of an absolute value source almost always does the right
>> +       * thing. There are two problem values:
>> +       *
>> +       * * 0x80000000. Since abs(0x80000000) == 0x80000000, LZD returns
>> +       *   0. However, findMSB(int(0x80000000)) == 30.
>> +       *
>> +       * * 0xffffffff. Since abs(0xffffffff) == 1, LZD returns
>> +       *   31. Section 8.8 (Integer Functions) of the GLSL 4.50 spec says:
>> +       *
>> +       *    For a value of zero or negative one, -1 will be returned.
>> +       *
>> +       * For all negative number cases, including 0x80000000 and
>> +       * 0xffffffff, the correct value is obtained from LZD if instead of
>
> Interesting, both the G45 and IVB docs (I didn't check others, I
> suspect they say the same) say
>
> "If the source is signed, the abs source modifier must be used to
> convert any negative source value to a positive value."
>
> I suppose that means you're supposed to retype the src to UD below?

Yes, probably. The above instructions make sure that it is positive, but
we could still have a value of 0x80000000. I can add the retype() below.

>> +       * negating the (already negative) value the logical-not is used. A
>> +       * conditional logical-not can be achieved in two instructions.
>> +       */
>> +      temp = vgrf(glsl_type::int_type);
>> +
>> +      bld.ASR(temp, src, brw_imm_d(31));
>> +      bld.XOR(temp, temp, src);
>> +   }
>> +
>> +   bld.LZD(retype(result, BRW_REGISTER_TYPE_UD), temp);
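
For anyone following along, here is a rough CPU-side sketch of what the ASR/XOR/LZD sequence in the patch computes. It is an illustration only, not driver code: lzd() and find_msb_signed() are hypothetical helper names, and the final "31 - count" conversion is an assumption about what the rest of the function does, since that part is not in the quoted hunk.

```cpp
#include <cstdint>

// Hypothetical stand-in for the hardware LZD instruction: counts the
// leading zero bits of a 32-bit value and yields 32 for an input of 0.
static uint32_t lzd(uint32_t v)
{
   uint32_t n = 0;
   for (uint32_t mask = 0x80000000u; mask != 0 && (v & mask) == 0; mask >>= 1)
      n++;
   return n;
}

// Hypothetical model of signed findMSB built from the patch's sequence.
static int32_t find_msb_signed(int32_t src)
{
   // Models ASR(temp, src, 31): all ones for a negative source, else zero.
   // Written as a select so the sketch does not rely on how the compiler
   // implements right shifts of negative values.
   const uint32_t sign = src < 0 ? 0xffffffffu : 0u;

   // Models XOR(temp, temp, src): a logical-not applied only to negative
   // sources, so non-negative values pass through unchanged.
   const uint32_t temp = sign ^ static_cast<uint32_t>(src);

   // Assumed conversion from an MSB-side count to an LSB-side bit index.
   // An LZD result of 32 (temp == 0) becomes -1.
   return 31 - static_cast<int32_t>(lzd(temp));
}
```

With this model, both problem values from the comment come out as expected: INT32_MIN gives 30, and both 0 and -1 give -1.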