On Thursday 07 January 2010 06:50:36 José Fonseca wrote:
> I wonder if storage size of registers is such a big issue. Knowing the
> storage size of a register matters mostly for indexable temps. For
> regular assignments and intermediate computations storage everything
> gets transformed in SSA form, and the register size can be determined
> from the instructions where it is generated/used and there is no need
> for consistency.
>
> For example, imagine a shader that has:
>
> TEX TEMP[0], SAMP[0], IN[0] // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT
> --> use 4x32bit float registers
> MAX ??
> ...
> TEX TEMP[0], SAMP[1], IN[0] // SAMP[1] is a
> PIPE_FORMAT_R64G64B64A64_FLOAT --> use 4x64bit double registers
> DMAX ????, TEMP[0], ???

That's not an issue because such a format doesn't exist. There's no
256bit sampling in any API. It's one of the self-inflicted wounds that
we have. R64G64 is the most you'll get right now.

> TEX TEMP[0], SAMP[2], IN[0] // texture 0 and rendertarget are both
> PIPE_FORMAT_R8G8B8A8_UNORM --> use 4x8bit unorm registers
> MOV OUT[0], TEMP[0]
>
> etc.
>
> There is actually programmable 3d hardware out there that has special
> 4x8bit registers, and for performance the compiler has to deduct where
> to use those 4xbit. llvmpipe will need to do similar thing, as the
> smaller the bit-width the higher the throughput. And at least current
> gallium statetrackers will reuse temps with no attempt to maintain
> consistency in use.
>
> So if the compilers already need to deal with this, if this notion that
> registers are 128bits is really necessary, and will prevail in the long
> term.

This is really the core issue: TGSI is untyped, and anything other than
"register size is constant" implies "TGSI is typed, but the actual types
have to be deduced by the drivers", which goes against what Gallium was
about (it would put the complexity in the driver).

8bit vs 32bit and 64bit vs 32bit are really different questions. The
first is about optimization: it will work perfectly well if 128bit
registers are used. The second is about correctness: it will not work if
128bit registers are used for doubles, and it will not work if 256bit
registers are used for floats.

Also, we don't have any 4x8bit instructions; they're all 4x32bit
instructions (float, unsigned int, signed int), so doubles will be the
first differently sized instructions. That in turn means one of two
things. Either TGSI has to be statically typed but without declared
types, i.e. D_ADD can only take two 256bit registers as inputs and has
to throw an error if anything else is passed (which is especially
difficult because those registers have no declared size, so it has to be
inferred from previous instructions). Or we have to allow mixing sizes
for all inputs, e.g. D_ADD can operate on both 4x32 and 4x64, which
simply moves the problem above into the driver.

Really, unless we say "the entire pipeline can run in 4x64", like we did
for floats, I don't see an easier way of dealing with this than the
xy, zw swizzle form.

z
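
For illustration only, the xy, zw form mentioned above could look
roughly like the following minimal sketch in C. Everything named here
(`reg128`, `reg_store_double`, `reg_load_double`, `dadd`) is
hypothetical and not part of TGSI or Gallium; the sketch only assumes
that a register stays 4x32bit and that a double occupies one channel
pair, so a register holds two doubles instead of four floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption: doubles are 64-bit, i.e. exactly two 32-bit channels. */
_Static_assert(sizeof(double) == 2 * sizeof(uint32_t),
               "double must span two 32-bit channels");

/* Hypothetical model of one constant-size 128bit register (x, y, z, w). */
typedef struct {
    uint32_t chan[4];
} reg128;

/* Store a double into channel pair 0 (xy) or pair 1 (zw). */
static void reg_store_double(reg128 *r, int pair, double v)
{
    assert(pair == 0 || pair == 1);
    memcpy(&r->chan[pair * 2], &v, sizeof v);
}

/* Load a double back from channel pair 0 (xy) or pair 1 (zw). */
static double reg_load_double(const reg128 *r, int pair)
{
    double v;
    assert(pair == 0 || pair == 1);
    memcpy(&v, &r->chan[pair * 2], sizeof v);
    return v;
}

/* Hypothetical double add: dst.xy = a.xy + b.xy, dst.zw = a.zw + b.zw. */
static reg128 dadd(reg128 a, reg128 b)
{
    reg128 dst;
    for (int pair = 0; pair < 2; pair++) {
        double sum = reg_load_double(&a, pair) + reg_load_double(&b, pair);
        reg_store_double(&dst, pair, sum);
    }
    return dst;
}
```

With this layout the register size never changes; only the
interpretation of the channel pairs does, which is why no register size
has to be declared or inferred.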