Hello,

I've been giving some thought to catching up with core mesa on ARB_gs5
support. Among the things that ARB_gs5 introduces are these new operations:

      genType frexp(genType x, out genIType exp);
      genType ldexp(genType x, in genIType exp);

      genIType bitfieldExtract(genIType value, int offset, int bits);
      genUType bitfieldExtract(genUType value, int offset, int bits);

      genIType bitfieldInsert(genIType base, genIType insert, int offset,
                              int bits);
      genUType bitfieldInsert(genUType base, genUType insert, int offset,
                              int bits);

      genIType bitfieldReverse(genIType value);
      genUType bitfieldReverse(genUType value);

      genIType bitCount(genIType value);
      genIType bitCount(genUType value);

      genIType findLSB(genIType value);
      genIType findLSB(genUType value);

      genIType findMSB(genIType value);
      genIType findMSB(genUType value);

      genUType uaddCarry(genUType x, genUType y, out genUType carry);
      genUType usubBorrow(genUType x, genUType y, out genUType borrow);

      void umulExtended(genUType x, genUType y, out genUType msb,
                        out genUType lsb);
      void imulExtended(genIType x, genIType y, out genIType msb,
                        out genIType lsb);

(I've skipped the packing stuff, since that seems to already be
supported/lowered elsewhere; i2f/f2i, which is already handled; and
the texture gather stuff, for which support already exists. I've also
skipped the interpolateAt* stuff, which isn't supported by core mesa
yet and which, when it is, will require a very different kind of
handling than the above.)

I guess the only drivers one really needs to worry about here are
r600/radeonsi and nouveau. svga is largely a passthrough afaik, and
llvmpipe/softpipe are software and can thus implement it however they
want.

Looking at the nvc0+ shader ISA, there are instructions to directly
handle all the bitfield stuff (bitfieldExtract, bitfieldInsert,
bitfieldReverse, bitCount, findLSB, findMSB). There is also a "mul
high", which is what the *mulExtended stuff gets translated into.
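
To make the "mul high" mapping concrete, here is a rough C sketch of how
*mulExtended splits into a plain multiply plus a "mul high". The helper
names (umul_hi, imul_hi, umul_extended) are made up for illustration and
are not existing mesa or ISA names; this only models the arithmetic.

```c
#include <stdint.h>

/* High 32 bits of an unsigned 32x32 -> 64 bit product. */
static uint32_t umul_hi(uint32_t x, uint32_t y)
{
    return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* High 32 bits of a signed 32x32 -> 64 bit product.
 * The shift is done on the unsigned bit pattern; the final cast back
 * to int32_t assumes a two's complement target. */
static int32_t imul_hi(int32_t x, int32_t y)
{
    int64_t p = (int64_t)x * (int64_t)y;
    return (int32_t)(uint32_t)((uint64_t)p >> 32);
}

/* umulExtended: lsb comes from an ordinary (wrapping) multiply,
 * msb from the "mul high". */
static void umul_extended(uint32_t x, uint32_t y,
                          uint32_t *msb, uint32_t *lsb)
{
    *lsb = x * y;
    *msb = umul_hi(x, y);
}
```

So each *mulExtended call would become two instructions: the regular
multiply we already have, and the new high-half multiply.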

There are no instructions to handle frexp/ldexp, or the add carry/sub
borrow stuff. (Looking at the code the blob generates, it just does
all that "by hand", even though there is a "set cc" flag on the
add/sub instructions which one might assume holds the carry. The blob
didn't use it.)
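
For the carry/borrow pair, doing it "by hand" could look roughly like the
C sketch below. This is my own guess at a reasonable lowering using only
an add/sub and an unsigned compare, not a transcription of what the blob
emits; the function names are hypothetical.

```c
#include <stdint.h>

/* uaddCarry: the sum wraps modulo 2^32; a carry happened exactly when
 * the wrapped sum is smaller than one of the operands. */
static uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;
    *carry = (sum < x) ? 1u : 0u;
    return sum;
}

/* usubBorrow: the difference wraps modulo 2^32; a borrow is needed
 * exactly when x < y. */
static uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x < y) ? 1u : 0u;
    return x - y;
}
```

Each one costs an extra unsigned compare on top of the add/sub, so the
lowering is cheap even without access to a carry flag.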

So I was thinking that we could just take the relevant SM5
instructions and lower the rest. Specifically, these would be the new
opcodes:

IBFE
UBFE
BFI
BREV (not BFREV since most instructions appear to be 3/4 letters)
POPC (shorter than "countbits")
LSB
UMSB
IMSB
IMULHI
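
To pin down what I have in mind for the bitfield opcodes above, here is a
C reference sketch. It assumes each opcode simply mirrors the
corresponding GLSL built-in on a single 32-bit channel, and that
0 <= offset < 32 and offset + bits <= 32 (GLSL leaves other cases
undefined). The lowercase function names are just the opcode names and
are illustrative only.

```c
#include <stdint.h>

/* Mask with the low `bits` bits set; avoids an undefined 32-bit shift. */
static uint32_t low_mask(int bits)
{
    return bits >= 32 ? 0xFFFFFFFFu : (1u << bits) - 1u;
}

/* UBFE: extract `bits` bits starting at `offset`, zero-extended. */
static uint32_t ubfe(uint32_t value, int offset, int bits)
{
    return (value >> offset) & low_mask(bits);
}

/* IBFE: same extraction, sign-extended from the top extracted bit.
 * The cast to int32_t assumes a two's complement target. */
static int32_t ibfe(int32_t value, int offset, int bits)
{
    uint32_t r = ubfe((uint32_t)value, offset, bits);
    if (bits > 0 && bits < 32 && ((r >> (bits - 1)) & 1u))
        r |= 0xFFFFFFFFu << bits;
    return (int32_t)r;
}

/* BFI: replace `bits` bits of `base` at `offset` with the low bits
 * of `insert`. */
static uint32_t bfi(uint32_t base, uint32_t insert, int offset, int bits)
{
    uint32_t mask = low_mask(bits) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

/* BREV: reverse the order of all 32 bits. */
static uint32_t brev(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (v & 1u);
        v >>= 1;
    }
    return r;
}

/* POPC: number of set bits. */
static int32_t popc(uint32_t v)
{
    int32_t n = 0;
    for (; v; v &= v - 1)   /* clears the lowest set bit each pass */
        n++;
    return n;
}

/* LSB: index of the lowest set bit, or -1 if none. */
static int32_t lsb(uint32_t v)
{
    int32_t n = 0;
    if (v == 0)
        return -1;
    while (!(v & 1u)) {
        v >>= 1;
        n++;
    }
    return n;
}

/* UMSB: index of the highest set bit, or -1 if none. */
static int32_t umsb(uint32_t v)
{
    int32_t n = 0;
    if (v == 0)
        return -1;
    while (v >>= 1)
        n++;
    return n;
}

/* IMSB: for negative inputs, index of the highest 0 bit instead;
 * returns -1 for both 0 and -1. */
static int32_t imsb(int32_t v)
{
    if (v < 0)
        v = ~v;
    return umsb((uint32_t)v);
}
```

The IBFE/UBFE and IMSB/UMSB splits exist because the signed and unsigned
variants give different results for the same bit pattern, while BFI,
BREV, POPC and LSB are sign-agnostic.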

I just took a look at the Radeon SI ISA, and it does seem like it has
ldexp/frexp instructions, as well as setting the carry flag for
addc/subb. However, since TGSI doesn't have flags or multiple
destinations, I'm not sure how the latter 2 could be easily encoded in
the glsl->tgsi translation.

Thoughts/opinions before I go and implement the above? Is someone else
already working on this?

  -ilia
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev