On Thursday 07 January 2010 06:50:36 José Fonseca wrote:
> I wonder if storage size of registers is such a big issue. Knowing the
> storage size of a register matters mostly for indexable temps. For
> regular assignments and intermediate computations, everything gets
> transformed into SSA form, and the register size can be determined
> from the instructions where it is generated/used, and there is no need
> for consistency.
> 
> For example, imagine a shader that has:
> 
>    TEX TEMP[0], SAMP[0], IN[0]  // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT --> use 4x32bit float registers
>    MAX ??
>    ...
>    TEX TEMP[0], SAMP[1], IN[0]  // SAMP[1] is a PIPE_FORMAT_R64G64B64A64_FLOAT --> use 4x64bit double registers
>    DMAX ????, TEMP[0], ???

That's not an issue because such a format doesn't exist. There's no 256bit 
sampling in any API. It's one of the self-inflicted wounds that we have. R64G64 
is the most you'll get right now.

>    TEX TEMP[0], SAMP[2], IN[0]  // texture 0 and rendertarget are both PIPE_FORMAT_R8G8B8A8_UNORM --> use 4x8bit unorm registers
>    MOV OUT[0], TEMP[0]
> 
> etc.
> 
> There is actually programmable 3d hardware out there that has special
> 4x8bit registers, and for performance the compiler has to deduce where
> to use those 4x8bit. llvmpipe will need to do a similar thing, as the
> smaller the bit-width the higher the throughput. And at least current
> gallium statetrackers will reuse temps with no attempt to maintain
> consistency in use.
> 
> So if the compilers already need to deal with this, I wonder if this
> notion that registers are 128bits is really necessary, and will prevail
> in the long term.

In a way this is the core issue: TGSI is untyped. Anything other than 
"register size is constant" implies "TGSI is typed, but the actual types have 
to be deduced by the drivers", which goes against what Gallium was about (it 
puts the complexity in the driver). 

8bit vs 32bit and 64bit vs 32bit are really different questions. The first 
one is about optimization: it will work perfectly well if 128bit registers 
are used. The second one is about correctness: it will not work if 128bit 
registers are used for doubles, and it will not work if 256bit registers are 
used for floats. Also, we don't have any 4x8bit instructions, they're all 
4x32bit instructions (float, unsigned ints, signed ints), so doubles will be 
the first differently sized instructions. That in turn means one of two 
things. Either TGSI will have to be statically typed without declaring its 
types, i.e. D_ADD will only be able to take two 256bit registers as inputs 
and has to throw an error if anything else is passed, which is especially 
difficult because those registers have no declared size, so it would have to 
be inferred from previous instructions. Or we'd have to allow mixing sizes of 
all inputs, e.g. D_ADD can operate on both 4x32 and 4x64, which simply moves 
the problem from above into the driver.
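
To make the first option concrete, here is a rough sketch of what "infer the 
size from previous instructions" would involve. It is only an illustration: 
apart from D_ADD, the opcode names, the width table and the infer_widths 
helper are made up for this example and are not TGSI definitions.

```python
from typing import Dict, List, Tuple

# Hypothetical opcode table: only D_ADD is named in this thread, the other
# entries and all widths are assumptions made for the illustration.
# name -> (bits written to the destination, bits required of every source)
OPCODES: Dict[str, Tuple[int, int]] = {
    "ADD": (128, 128),    # 4x32bit
    "D_ADD": (256, 256),  # 4x64bit
    "F2D": (256, 128),    # made-up float -> double conversion
}

Instruction = Tuple[str, str, List[str]]  # (opcode, destination, sources)


def infer_widths(program: List[Instruction],
                 inputs: Dict[str, int]) -> Dict[str, int]:
    """Walk the program in order, tracking the width of each register.

    Registers carry no declared size, so a register's width is whatever
    the most recent instruction that wrote it produced.
    """
    widths = dict(inputs)
    for index, (opcode, dst, sources) in enumerate(program):
        result_bits, source_bits = OPCODES[opcode]
        for src in sources:
            if src not in widths:
                raise ValueError(f"instruction {index}: {src} read before write")
            if widths[src] != source_bits:
                raise TypeError(
                    f"instruction {index}: {opcode} needs {source_bits}bit "
                    f"sources, but {src} currently holds {widths[src]} bits"
                )
        # A reused temp simply takes the width of its latest writer.
        widths[dst] = result_bits
    return widths
```

Feeding it a program where TEMP[0] is written by ADD and then read by D_ADD 
raises the error described above, and every driver or validator would need 
this kind of pass just to know how big a temp is.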

Really, unless we say "the entire pipeline can run in 4x64" like we did for 
floats, I don't see an easier way of dealing with this than the xy, zw 
swizzle form.
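
For reference, the xy, zw form means a 128bit register stays 4x32bit and a 
double occupies a pair of channels, so one register holds two doubles. A 
small sketch of that layout follows. The helper names are invented for the 
example, and putting the low 32 bits in the first channel of each pair is an 
assumption, not something settled in this thread.

```python
import struct
from typing import Tuple

Register = Tuple[int, int, int, int]  # x, y, z, w as 32bit unsigned words


def pack_doubles(xy: float, zw: float) -> Register:
    """Store two doubles in one 4x32bit register: one in .xy, one in .zw.

    Assumption: little-endian layout, low word in the first channel.
    """
    return struct.unpack("<4I", struct.pack("<2d", xy, zw))


def unpack_doubles(reg: Register) -> Tuple[float, float]:
    """Read the .xy and .zw channel pairs back as two doubles."""
    return struct.unpack("<2d", struct.pack("<4I", *reg))


def d_add(a: Register, b: Register) -> Register:
    """Hypothetical D_ADD on packed registers: adds .xy and .zw pairwise."""
    a_xy, a_zw = unpack_doubles(a)
    b_xy, b_zw = unpack_doubles(b)
    return pack_doubles(a_xy + b_xy, a_zw + b_zw)
```

With this layout every register is still 128 bits, so nothing has to infer 
sizes. The cost is that a double instruction only processes two values per 
register instead of four.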

z

_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev