On 06/30/2016 03:20 PM, Matt Turner wrote:
> On Wed, Jun 29, 2016 at 2:04 PM, Ian Romanick <i...@freedesktop.org> wrote:
>> From: Ian Romanick <ian.d.roman...@intel.com>
>>
>> Signed-off-by: Ian Romanick <ian.d.roman...@intel.com>
>> ---
>>  src/mesa/drivers/dri/i965/brw_fs_nir.cpp   | 50 ++++++++++++++++++++++------
>>  src/mesa/drivers/dri/i965/brw_vec4_nir.cpp | 52 +++++++++++++++++++++++-------
>>  2 files changed, 81 insertions(+), 21 deletions(-)
>>
>> diff --git a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> index f15bf3e..f8db28a 100644
>> --- a/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> +++ b/src/mesa/drivers/dri/i965/brw_fs_nir.cpp
>> @@ -623,8 +623,32 @@ fs_visitor::nir_emit_find_msb_using_lzd(const fs_builder &bld,
>>                                          bool is_signed)
>>  {
>>     fs_inst *inst;
>> +   fs_reg temp = src;
>>
>> -   bld.LZD(retype(result, BRW_REGISTER_TYPE_UD), src);
>> +   if (is_signed) {
>> +      /* LZD of an absolute value source almost always does the right
>> +       * thing.  There are two problem values:
>> +       *
>> +       * * 0x80000000.  Since abs(0x80000000) == 0x80000000, LZD returns
>> +       *   0.  However, findMSB(int(0x80000000)) == 30.
>> +       *
>> +       * * 0xffffffff.  Since abs(0xffffffff) == 1, LZD returns
>> +       *   31.  Section 8.8 (Integer Functions) of the GLSL 4.50 spec says:
>> +       *
>> +       *    For a value of zero or negative one, -1 will be returned.
>> +       *
>> +       * For all negative number cases, including 0x80000000 and
>> +       * 0xffffffff, the correct value is obtained from LZD if instead of
> 
> Interesting, both the G45 and IVB docs (I didn't check the others, but
> I suspect they say the same) say
> 
> "If the source is signed, the abs source modifier must be used to
> convert any negative source value to a positive value."
> 
> I suppose that means you're supposed to retype the src to UD below?

Yes, probably.  The above instructions make sure that it is positive,
but we could still have a value of 0x80000000.  I can add the retype()
below.

>> +       * negating the (already negative) value the logical-not is used.  A
>> +       * conditional logical-not can be achieved in two instructions.
>> +       */
>> +      temp = vgrf(glsl_type::int_type);
>> +
>> +      bld.ASR(temp, src, brw_imm_d(31));
>> +      bld.XOR(temp, temp, src);
>> +   }
>> +
>> +   bld.LZD(retype(result, BRW_REGISTER_TYPE_UD), temp);
> 
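
For anyone following along, here is a minimal standalone C sketch of the
trick described in the patch comment. It is not the driver code: lzd(),
find_msb_abs() and find_msb_signed() are hypothetical helper names, and
the final "31 - count" conversion is an assumption about what the rest of
the function does, since it is not part of the hunk quoted above. lzd()
is assumed to return 32 for a zero input.

```c
#include <stdint.h>

/* Software stand-in for the LZD instruction: number of leading zero bits.
 * Assumption: a zero input yields 32.
 */
static uint32_t lzd(uint32_t v)
{
    uint32_t n = 0;

    if (v == 0)
        return 32;

    while (!(v & 0x80000000u)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Naive variant: LZD of the absolute value.  The negation is done in
 * unsigned arithmetic so that 0x80000000 wraps to itself instead of
 * invoking undefined behavior.
 */
static int32_t find_msb_abs(int32_t src)
{
    uint32_t u = (uint32_t)src;
    uint32_t magnitude = (src < 0) ? (0u - u) : u;

    return 31 - (int32_t)lzd(magnitude);
}

/* Variant from the patch: conditional logical-not instead of negation.
 * The mask is what ASR by 31 produces (all ones for negative inputs,
 * zero otherwise); it is built with a compare here because right-shifting
 * a negative value is implementation-defined in C.
 */
static int32_t find_msb_signed(int32_t src)
{
    uint32_t mask = (src < 0) ? 0xffffffffu : 0u;  /* ASR temp, src, 31 */
    uint32_t temp = mask ^ (uint32_t)src;          /* XOR temp, temp, src */

    /* LZD counts from the MSB side; 31 - 32 gives -1 for 0 and -1 inputs. */
    return 31 - (int32_t)lzd(temp);
}
```

With these helpers, find_msb_abs() gets the two problem values wrong
(0x80000000 and 0xffffffff), while find_msb_signed() returns 30 and -1
for them, matching the values given in the comment.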
