On 12/13/2011 10:48 PM, Jose Fonseca wrote:
> ----- Original Message -----
>> On 12/13/2011 03:25 PM, Jose Fonseca wrote:
>>>
>>> ----- Original Message -----
>>>> On 12/13/2011 03:09 PM, Jose Fonseca wrote:
>>>>> ----- Original Message -----
>>>>>> On 12/13/2011 12:26 PM, Bryan Cain wrote:
>>>>>>> On 12/13/2011 02:11 PM, Jose Fonseca wrote:
>>>>>>>> ----- Original Message -----
>>>>>>>>> This is an updated version of the patch set I sent to the
>>>>>>>>> list a few hours ago.
>>>>>>>>> There is now a TGSI property called
>>>>>>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES that drivers can use to
>>>>>>>>> determine how many of the 8 available clip distances are
>>>>>>>>> actually used by a shader.
>>>>>>>> Can't the info in TGSI_PROPERTY_NUM_CLIP_DISTANCES be easily
>>>>>>>> derived from the shader, and queried through
>>>>>>>> src/gallium/auxiliary/tgsi/tgsi_scan.h ?
>>>>>>> No. The clip distances can be indirectly addressed (there are
>>>>>>> up to 2 of them in vec4 form for a total of 8 floats), which
>>>>>>> makes it impossible to determine which ones are used by
>>>>>>> analyzing the shader.
>>>>>> The description is almost complete. :) The issue is that the
>>>>>> shader may declare
>>>>>>
>>>>>>   out float gl_ClipDistance[4];
>>>>>>
>>>>>> and then use non-constant addressing of the array. The compiler
>>>>>> knows that gl_ClipDistance has at most 4 elements, but post-hoc
>>>>>> analysis would not be able to determine that. Often the
>>>>>> fixed-function hardware (see below) needs to know which clip
>>>>>> distance values are actually written.
>>>>> But don't all the clip distances written by the shader need to be
>>>>> declared?
>>>>>
>>>>> E.g.:
>>>>>
>>>>>   DCL OUT[0], CLIPDIST[0]
>>>>>   DCL OUT[1], CLIPDIST[1]
>>>>>   DCL OUT[2], CLIPDIST[2]
>>>>>   DCL OUT[3], CLIPDIST[3]
>>>>>
>>>>> Wouldn't a trivial analysis of the declarations therefore convey
>>>>> that?
>>>> No. Clip distance is an array of up to 8 floats in GLSL, but it's
>>>> represented in the hardware as 2 vec4s. You can tell by analyzing
>>>> the declarations whether there are more than 4 clip distances in
>>>> use, but not which components the shader writes to.
>>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES is the number of components in
>>>> use, not the number of full vectors.
>>> Let's imagine
>>>
>>>   out float gl_ClipDistance[6];
>>>
>>> Each clip distance is a scalar float.
>>>
>>> Either all hardware represents the 8 clip distances as two
>>> 4-vectors, and we do:
>>>
>>>   DCL OUT[0].xyzw, CLIPDIST[0]
>>>   DCL OUT[1].xy, CLIPDIST[1]
>>>
>>> using the full range of struct tgsi_declaration::UsageMask [1], or
>>> we represent them as scalars:
>>>
>>>   DCL OUT[0].x, CLIPDIST[0]
>>>   DCL OUT[1].x, CLIPDIST[1]
>>>   DCL OUT[2].x, CLIPDIST[2]
>>>   DCL OUT[3].x, CLIPDIST[3]
>>>   DCL OUT[4].x, CLIPDIST[4]
>>>   DCL OUT[5].x, CLIPDIST[5]
>>>
>>> If indirect addressing is allowed, as I read before, then maybe the
>>> latter is better.
>>>
>>> I confess my ignorance about clipping and maybe I'm being dense,
>>> but I still don't see the need for this
>>> TGSI_PROPERTY_NUM_CLIP_DISTANCES. Could you please draft an
>>> example TGSI shader showing this property (or just paste one
>>> generated with your change)? I think that would help a lot.
>>>
>>>
>>> Jose
>>>
>>>
>>> [1] I don't know if tgsi_dump pays much attention to
>>> tgsi_declaration::UsageMask, but it does exist.
>>
>> UsageMask might work, but before that can be considered a viable
>> solution, someone will need to make it possible to actually declare
>> it from ureg. As it is, ureg is hardcoded to set UsageMask to xyzw
>> no matter what on all declared inputs and outputs.
>
> ureg automatically fills the UsageMask from the destination register masks,
> since it is easy to determine from the opcodes.
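As an illustration of the UsageMask idea quoted above, here is a minimal sketch of how the component count could be derived from the declarations alone. It is Python rather than Mesa code, and the `(semantic_index, mask)` tuples and the `count_clip_distances` helper are hypothetical stand-ins for walking tgsi_declaration::UsageMask; nothing here is an existing tgsi_scan function.

```python
# Hypothetical model of CLIPDIST output declarations: (semantic_index, usage_mask).
# The mask is written as in the TGSI text form, e.g. "xyzw" or "xy".

def count_clip_distances(declarations):
    """Sum the number of components enabled across all CLIPDIST declarations."""
    total = 0
    for _index, mask in declarations:
        # Accept only a non-empty set of distinct x/y/z/w components.
        if not mask or len(set(mask)) != len(mask) or not set(mask) <= set("xyzw"):
            raise ValueError(f"bad usage mask: {mask!r}")
        total += len(mask)
    return total

# Packed form: DCL OUT[0].xyzw, CLIPDIST[0] and DCL OUT[1].xy, CLIPDIST[1]
packed = [(0, "xyzw"), (1, "xy")]
# Scalar form: DCL OUT[n].x, CLIPDIST[n] for n = 0..5
scalar = [(n, "x") for n in range(6)]

print(count_clip_distances(packed), count_clip_distances(scalar))
```

Both layouts yield the same count for gl_ClipDistance[6]; the sketch says nothing about which layout indirect addressing can actually work with.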
>
> Which leads me to my second point: if indirect addressing of CLIPDIST is
> allowed, then we can't really pack the clip distances as 4-element vectors
> in TGSI. Not only would the syntax be very weird, but it would wreak havoc
> on all TGSI-translating code that makes decisions based on indirect
> addressing of registers.
>
> That is,
>
>   float gl_ClipDistance[6];
>
>   gl_ClipDistance[i] = foo;
>
> would become
>
>   DCL OUT[0].x, CLIPDIST[0]
>   DCL OUT[1].x, CLIPDIST[1]
>   DCL OUT[2].x, CLIPDIST[2]
>   DCL OUT[3].x, CLIPDIST[3]
>   DCL OUT[4].x, CLIPDIST[4]
>   DCL OUT[5].x, CLIPDIST[5]
>   MOV OUT[ADDR[0].x].x, foo
>
This cannot work properly yet. For instance, the clip distance slots in my
hardware's output memory space are packed, i.e. they consume
NUM_CLIP_DISTANCES * 4 bytes. (This cannot be changed, except by spilling
outputs to high-latency memory and moving them all at the end, which is very
undesirable.)

The MOV OUT[ADDR[0].x].x, however, has no way to know whether to scale
ADDR[0].x by 4 or by 16 bytes (as it would for arrays of vec4s), since it is
not clear which declaration/output range the instruction accesses.

The plan is to remedy this, at some point, by augmenting indirect accesses
with an extra "declaration index" and letting a declaration constitute an
array. This would also make TGSI_FILE_*_ARRAY superfluous.

So it would be:

  DCL OUT[0..7].x, CLIPDIST[0..7]

making use of tgsi_declaration_range.

> and the info from TGSI_PROPERTY_NUM_CLIP_DISTANCES can be obtained by walking
> the declaration (which can/should be done only once in tgsi_scan).
>
> But this just doesn't look like it would ever work:
>
>   DCL OUT[0].xyzw, CLIPDIST[0]
>   DCL OUT[1].xy  , CLIPDIST[1]
>   MOV OUT[ADDR[0].x].????, foo
>
> Jose
> _______________________________________________
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/mesa-dev
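The scaling ambiguity described above can be made concrete with a small sketch. It is Python rather than driver code, and the `byte_offset` helper, the `packed` flag and the base offset of 0 are assumptions made purely for illustration; only the 4-byte and 16-byte strides come from the message.

```python
FLOAT_BYTES = 4    # stride of one packed scalar clip distance
VEC4_BYTES = 16    # stride of one full vec4 output slot

def byte_offset(base, addr, packed):
    """Byte offset of OUT[ADDR[0].x] under a given layout (hypothetical helper)."""
    stride = FLOAT_BYTES if packed else VEC4_BYTES
    return base + addr * stride

# The same runtime index lands at different locations depending on which
# declaration the indirect access belongs to, which the MOV alone does not say.
for addr in range(3):
    print(addr, byte_offset(0, addr, packed=True), byte_offset(0, addr, packed=False))
```

If an indirect access carried a declaration index, a translator could look up the layout for that declaration instead of guessing the stride.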