----- Original Message -----
> From: Roland Scheidegger <srol...@vmware.com>
>
> A lot of them were missing. Others were moved from the Compute ISA
> to a new Integer ISA section as that seemed more appropriate.
> ---
>  src/gallium/docs/source/tgsi.rst |  362
>  ++++++++++++++++++++++++++++++--------
>  1 file changed, 289 insertions(+), 73 deletions(-)
>
> diff --git a/src/gallium/docs/source/tgsi.rst
> b/src/gallium/docs/source/tgsi.rst
> index a528fd2..b7caf63 100644
> --- a/src/gallium/docs/source/tgsi.rst
> +++ b/src/gallium/docs/source/tgsi.rst
> @@ -872,6 +872,16 @@ This instruction replicates its result.
>    as an integer register.
>
>
> +.. opcode:: CONT - Continue
> +
> +  TBD
> +
> +.. note::
> +
> +   Support for CONT is determined by a special capability bit,
> +   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
> +
> +
>  .. opcode:: IF - Float If
>
>    Start an IF ... ELSE .. ENDIF block. Condition evaluates to true if
> @@ -977,6 +987,7 @@ These opcodes are primarily provided for special-use
> computational shaders.
>  Support for these opcodes indicated by a special pipe capability bit (TBD).
>
>  XXX so let's discuss it, yeah?
> +XXX doesn't look like most of the opcodes really belong here.
>
>  .. opcode:: CEIL - Ceiling
>
> @@ -991,7 +1002,89 @@ XXX so let's discuss it, yeah?
>    dst.w = \lceil src.w\rceil
>
>
> -.. opcode:: I2F - Integer To Float
> +.. opcode:: TRUNC - Truncate
> +
> +.. math::
> +
> +  dst.x = trunc(src.x)
> +
> +  dst.y = trunc(src.y)
> +
> +  dst.z = trunc(src.z)
> +
> +  dst.w = trunc(src.w)
> +
> +
> +.. opcode:: MOD - Modulus
> +
> +.. math::
> +
> +  dst.x = src0.x \bmod src1.x
> +
> +  dst.y = src0.y \bmod src1.y
> +
> +  dst.z = src0.z \bmod src1.z
> +
> +  dst.w = src0.w \bmod src1.w
> +
> +
> +.. opcode:: UARL - Integer Address Register Load
> +
> +  Moves the contents of the source register, assumed to be an integer, into
> the
> +  destination register, which is assumed to be an address (ADDR) register.
> +
> +
> +.. opcode:: SAD - Sum Of Absolute Differences
> +
> +.. math::
> +
> +  dst.x = |src0.x - src1.x| + src2.x
> +
> +  dst.y = |src0.y - src1.y| + src2.y
> +
> +  dst.z = |src0.z - src1.z| + src2.z
> +
> +  dst.w = |src0.w - src1.w| + src2.w
> +
> +
> +.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single
> texel
> +                  from a specified texture image. The source sampler may
> +                  not be a CUBE or SHADOW.
> +                  src 0 is a four-component signed integer vector used to
> +                  identify the single texel accessed. 3 components + level.
> +                  src 1 is a 3 component constant signed integer vector,
> +                  with each component only have a range of
> +                  -8..+8 (hw only seems to deal with this range, interface
> +                  allows for up to unsigned int).
> +                  TXF(uint_vec coord, int_vec offset).
> +
> +
> +.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
> +                  retrieve the dimensions of the texture
> +                  depending on the target. For 1D (width), 2D/RECT/CUBE
> +                  (width, height), 3D (width, height, depth),
> +                  1D array (width, layers), 2D array (width, height, layers)
> +
> +.. math::
> +
> +  lod = src0

Shouldn't this be src0.x?

Otherwise looks good. Thanks for taking the time to clean these up.

Jose

> +
> +  dst.x = texture_width(unit, lod)
> +
> +  dst.y = texture_height(unit, lod)
> +
> +  dst.z = texture_depth(unit, lod)
> +
> +
> +Integer ISA
> +^^^^^^^^^^^^^^^^^^^^^^^^
> +These opcodes are used for integer operations.
> +Support for these opcodes indicated by PIPE_SHADER_CAP_INTEGERS (all of
> them?)
> +
> +
> +.. opcode:: I2F - Signed Integer To Float
> +
> +  Rounding is unspecified (round to nearest even suggested).
>
>  .. math::
>
> @@ -1004,56 +1097,157 @@ XXX so let's discuss it, yeah?
>    dst.w = (float) src.w
>
>
> -.. opcode:: NOT - Bitwise Not
> +.. opcode:: U2F - Unsigned Integer To Float
> +
> +  Rounding is unspecified (round to nearest even suggested).
>
>  .. math::
>
> -  dst.x = ~src.x
> +  dst.x = (float) src.x
>
> -  dst.y = ~src.y
> +  dst.y = (float) src.y
>
> -  dst.z = ~src.z
> +  dst.z = (float) src.z
>
> -  dst.w = ~src.w
> +  dst.w = (float) src.w
>
>
> -.. opcode:: TRUNC - Truncate
> +.. opcode:: F2I - Float to Signed Integer
> +
> +  Rounding is towards zero (truncate).
> +  Values outside signed range (including NaNs) produce undefined results.
>
>  .. math::
>
> -  dst.x = trunc(src.x)
> +  dst.x = (int) src.x
>
> -  dst.y = trunc(src.y)
> +  dst.y = (int) src.y
>
> -  dst.z = trunc(src.z)
> +  dst.z = (int) src.z
>
> -  dst.w = trunc(src.w)
> +  dst.w = (int) src.w
>
>
> -.. opcode:: SHL - Shift Left
> +.. opcode:: F2U - Float to Unsigned Integer
> +
> +  Rounding is towards zero (truncate).
> +  Values outside unsigned range (including NaNs) produce undefined results.
>
>  .. math::
>
> -  dst.x = src0.x << src1.x
> +  dst.x = (unsigned) src.x
>
> -  dst.y = src0.y << src1.x
> +  dst.y = (unsigned) src.y
>
> -  dst.z = src0.z << src1.x
> +  dst.z = (unsigned) src.z
>
> -  dst.w = src0.w << src1.x
> +  dst.w = (unsigned) src.w
>
>
> -.. opcode:: SHR - Shift Right
> +.. opcode:: UADD - Integer Add
> +
> +  This instruction works the same for signed and unsigned integers.
> +  The low 32bit of the result is returned.
>
>  .. math::
>
> -  dst.x = src0.x >> src1.x
> +  dst.x = src0.x + src1.x
>
> -  dst.y = src0.y >> src1.x
> +  dst.y = src0.y + src1.y
>
> -  dst.z = src0.z >> src1.x
> +  dst.z = src0.z + src1.z
>
> -  dst.w = src0.w >> src1.x
> +  dst.w = src0.w + src1.w
> +
> +
> +.. opcode:: UMAD - Integer Multiply And Add
> +
> +  This instruction works the same for signed and unsigned integers.
> +  The multiplication returns the low 32bit (as does the result itself).
> +
> +.. math::
> +
> +  dst.x = src0.x \times src1.x + src2.x
> +
> +  dst.y = src0.y \times src1.y + src2.y
> +
> +  dst.z = src0.z \times src1.z + src2.z
> +
> +  dst.w = src0.w \times src1.w + src2.w
> +
> +
> +.. opcode:: UMUL - Integer Multiply
> +
> +  This instruction works the same for signed and unsigned integers.
> +  The low 32bit of the result is returned.
> +
> +.. math::
> +
> +  dst.x = src0.x \times src1.x
> +
> +  dst.y = src0.y \times src1.y
> +
> +  dst.z = src0.z \times src1.z
> +
> +  dst.w = src0.w \times src1.w
> +
> +
> +.. opcode:: IDIV - Signed Integer Division
> +
> +  TBD: behavior for division by zero.
> +
> +.. math::
> +
> +  dst.x = src0.x \ src1.x
> +
> +  dst.y = src0.y \ src1.y
> +
> +  dst.z = src0.z \ src1.z
> +
> +  dst.w = src0.w \ src1.w
> +
> +
> +.. opcode:: UDIV - Unsigned Integer Division
> +
> +  For division by zero, 0xffffffff is returned.
> +
> +.. math::
> +
> +  dst.x = src0.x \ src1.x
> +
> +  dst.y = src0.y \ src1.y
> +
> +  dst.z = src0.z \ src1.z
> +
> +  dst.w = src0.w \ src1.w
> +
> +
> +.. opcode:: UMOD - Unsigned Integer Remainder
> +
> +  If second arg is zero, 0xffffffff is returned.
> +
> +.. math::
> +
> +  dst.x = src0.x \ src1.x
> +
> +  dst.y = src0.y \ src1.y
> +
> +  dst.z = src0.z \ src1.z
> +
> +  dst.w = src0.w \ src1.w
> +
> +
> +.. opcode:: NOT - Bitwise Not
> +
> +.. math::
> +
> +  dst.x = ~src.x
> +
> +  dst.y = ~src.y
> +
> +  dst.z = ~src.z
> +
> +  dst.w = ~src.w
>
>
>  .. opcode:: AND - Bitwise And
> @@ -1082,114 +1276,136 @@ XXX so let's discuss it, yeah?
>    dst.w = src0.w | src1.w
>
>
> -.. opcode:: MOD - Modulus
> +.. opcode:: XOR - Bitwise Xor
>
>  .. math::
>
> -  dst.x = src0.x \bmod src1.x
> +  dst.x = src0.x \oplus src1.x
>
> -  dst.y = src0.y \bmod src1.y
> +  dst.y = src0.y \oplus src1.y
>
> -  dst.z = src0.z \bmod src1.z
> +  dst.z = src0.z \oplus src1.z
>
> -  dst.w = src0.w \bmod src1.w
> +  dst.w = src0.w \oplus src1.w
>
>
> -.. opcode:: XOR - Bitwise Xor
> +.. opcode:: IMAX - Maximum of Signed Integers
>
>  .. math::
>
> -  dst.x = src0.x \oplus src1.x
> +  dst.x = max(src0.x, src1.x)
>
> -  dst.y = src0.y \oplus src1.y
> +  dst.y = max(src0.y, src1.y)
>
> -  dst.z = src0.z \oplus src1.z
> +  dst.z = max(src0.z, src1.z)
>
> -  dst.w = src0.w \oplus src1.w
> +  dst.w = max(src0.w, src1.w)
>
>
> -.. opcode:: UCMP - Integer Conditional Move
> +.. opcode:: UMAX - Maximum of Unsigned Integers
>
>  .. math::
>
> -  dst.x = src0.x ? src1.x : src2.x
> +  dst.x = max(src0.x, src1.x)
>
> -  dst.y = src0.y ? src1.y : src2.y
> +  dst.y = max(src0.y, src1.y)
>
> -  dst.z = src0.z ? src1.z : src2.z
> +  dst.z = max(src0.z, src1.z)
>
> -  dst.w = src0.w ? src1.w : src2.w
> +  dst.w = max(src0.w, src1.w)
>
>
> -.. opcode:: UARL - Integer Address Register Load
> +.. opcode:: IMIN - Minimum of Signed Integers
>
> -  Moves the contents of the source register, assumed to be an integer, into
> the
> -  destination register, which is assumed to be an address (ADDR) register.
> +.. math::
>
> +  dst.x = min(src0.x, src1.x)
>
> -.. opcode:: IABS - Integer Absolute Value
> +  dst.y = min(src0.y, src1.y)
> +
> +  dst.z = min(src0.z, src1.z)
> +
> +  dst.w = min(src0.w, src1.w)
> +
> +
> +.. opcode:: UMIN - Minimum of Unsigned Integers
>
>  .. math::
>
> -  dst.x = |src.x|
> +  dst.x = min(src0.x, src1.x)
>
> -  dst.y = |src.y|
> +  dst.y = min(src0.y, src1.y)
>
> -  dst.z = |src.z|
> +  dst.z = min(src0.z, src1.z)
>
> -  dst.w = |src.w|
> +  dst.w = min(src0.w, src1.w)
>
>
> -.. opcode:: SAD - Sum Of Absolute Differences
> +.. opcode:: SHL - Shift Left
>
>  .. math::
>
> -  dst.x = |src0.x - src1.x| + src2.x
> +  dst.x = src0.x << src1.x
>
> -  dst.y = |src0.y - src1.y| + src2.y
> +  dst.y = src0.y << src1.x
>
> -  dst.z = |src0.z - src1.z| + src2.z
> +  dst.z = src0.z << src1.x
>
> -  dst.w = |src0.w - src1.w| + src2.w
> +  dst.w = src0.w << src1.x
>
>
> -.. opcode:: TXF - Texel Fetch (as per NV_gpu_shader4), extract a single
> texel
> -                  from a specified texture image. The source sampler may
> -                  not be a CUBE or SHADOW.
> -                  src 0 is a four-component signed integer vector used to
> -                  identify the single texel accessed. 3 components + level.
> -                  src 1 is a 3 component constant signed integer vector,
> -                  with each component only have a range of
> -                  -8..+8 (hw only seems to deal with this range, interface
> -                  allows for up to unsigned int).
> -                  TXF(uint_vec coord, int_vec offset).
> +.. opcode:: ISHR - Arithmetic Shift Right (of Signed Integer)
>
> +.. math::
>
> -.. opcode:: TXQ - Texture Size Query (as per NV_gpu_program4)
> -                  retrieve the dimensions of the texture
> -                  depending on the target. For 1D (width), 2D/RECT/CUBE
> -                  (width, height), 3D (width, height, depth),
> -                  1D array (width, layers), 2D array (width, height, layers)
> +  dst.x = src0.x >> src1.x
> +
> +  dst.y = src0.y >> src1.x
> +
> +  dst.z = src0.z >> src1.x
> +
> +  dst.w = src0.w >> src1.x
> +
> +
> +.. opcode:: USHR - Logical Shift Right
>
>  .. math::
>
> -  lod = src0
> +  dst.x = src0.x >> (unsigned) src1.x
>
> -  dst.x = texture_width(unit, lod)
> +  dst.y = src0.y >> (unsigned) src1.x
>
> -  dst.y = texture_height(unit, lod)
> +  dst.z = src0.z >> (unsigned) src1.x
>
> -  dst.z = texture_depth(unit, lod)
> +  dst.w = src0.w >> (unsigned) src1.x
>
>
> -.. opcode:: CONT - Continue
>
> -  TBD
>
> -.. note::
> +.. opcode:: UCMP - Integer Conditional Move
>
> -   Support for CONT is determined by a special capability bit,
> -   ``TGSI_CONT_SUPPORTED``. See :ref:`Screen` for more information.
> +.. math::
> +
> +  dst.x = src0.x ? src1.x : src2.x
> +
> +  dst.y = src0.y ? src1.y : src2.y
> +
> +  dst.z = src0.z ? src1.z : src2.z
> +
> +  dst.w = src0.w ? src1.w : src2.w
> +
> +
> +.. opcode:: IABS - Integer Absolute Value
> +
> +.. math::
> +
> +  dst.x = |src.x|
> +
> +  dst.y = |src.y|
> +
> +  dst.z = |src.z|
> +
> +  dst.w = |src.w|
>
>
>  Geometry ISA
> --
> 1.7.9.5
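
For anyone reading along, here is a rough C sketch of the scalar, per-component behaviour the patch text describes for a few of the integer opcodes (UDIV, UMOD, UMAD, USHR, ISHR, F2I). The ref_* names are made up for illustration and are not Mesa functions. Masking the shift count to 5 bits is an assumption made only to keep the C well defined, since the patch does not say what happens for counts of 32 or more.

```c
#include <stdint.h>

/* Hypothetical reference helpers; each models one component of one opcode. */

/* UDIV: the patch states that division by zero returns 0xffffffff. */
static uint32_t ref_udiv(uint32_t a, uint32_t b)
{
    return b ? a / b : 0xffffffffu;
}

/* UMOD: the patch states that a zero second argument returns 0xffffffff. */
static uint32_t ref_umod(uint32_t a, uint32_t b)
{
    return b ? a % b : 0xffffffffu;
}

/* UMAD: low 32 bits of a * b + c, identical for signed and unsigned inputs. */
static uint32_t ref_umad(uint32_t a, uint32_t b, uint32_t c)
{
    return (uint32_t)((uint64_t)a * b + c);
}

/* USHR: logical shift right. The "& 31" mask is an assumption (see above). */
static uint32_t ref_ushr(uint32_t a, uint32_t count)
{
    return a >> (count & 31u);
}

/* ISHR: arithmetic shift right on the raw bit pattern, written so it does
 * not depend on how the C compiler shifts negative signed values. */
static uint32_t ref_ishr(uint32_t a, uint32_t count)
{
    uint32_t n = count & 31u;              /* same assumed mask as USHR */
    uint32_t shifted = a >> n;
    if ((a & 0x80000000u) && n)
        shifted |= ~(0xffffffffu >> n);    /* replicate the sign bit */
    return shifted;
}

/* F2I: truncation towards zero. Only meaningful for in-range inputs; the
 * patch leaves out-of-range values and NaNs undefined, as does C. */
static int32_t ref_f2i(float f)
{
    return (int32_t)f;
}
```

The interesting pair is ISHR versus USHR: for the bit pattern 0xfffffff0 shifted by 2, the logical shift gives 0x3ffffffc while the arithmetic shift keeps the sign and gives 0xfffffffc.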