On Wed, 2010-01-06 at 15:56 -0800, Zack Rusin wrote:
> On Wednesday 06 January 2010 14:56:35 Igor Oliveira wrote:
> > Hi,
> > 
> > the patches add support for double opcodes in gallium/tgsi.
> > They just implement some opcodes; I'd like to know if anyone has
> > suggestions about the patches.
> 
> Hi Igor, first of all this should probably go into a feature branch because 
> it'll be a bit of work before it's usable. 
> The patches that you've proposed are unlikely to be what we'll want for
> doubles.
> Keith, Michal and I discussed this on the phone a few days back and the 
> biggest issue with doubles is that, unlike the switch between integers and 
> floats, they actually need bigger registers to accommodate them. Given that
> the registers in TGSI are untyped and it's up to instructions to define the
> type, it becomes hard for drivers to figure out the size of the registers
> beforehand.
> The solution that I personally like, and what seems to be becoming the de
> facto standard when dealing with double support, is having double precision
> values represented by a pair of registers. Outputs are written
> either to the pair yx or to the pair wz, where the msb is stored in y/w. For 
> example:
> Idata 3.0 => (0x4008000000000000) in register r looks like:
> r.w =    0x40080000 ;high dword
> r.z =     0x00000000 ;low dword
> Or:
> r.y =    0x40080000 ;high dword
> r.x =    0x00000000 ;low dword
> All source double inputs must be in xy (after swizzle operations). For 
> example:
> d_add r1.xy, r2.xy, r2.xy
> Or
> d_add r1.zw, r2.xy, r2.xy
> Each computes twice the value in r2.xy, and places the result in either xy or 
> zw. 
> This ensures that the register size stays constant. Of course the instruction 
> semantics are different from the typical 4-component wide TGSI instructions, 
> but that, I think, is a lot less of an issue.
> 
> z
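
For reference, the high/low dword layout in the quoted example can be
sketched in C as below. This is only an illustration, assuming IEEE 754
binary64 doubles; struct dword_pair, split_double() and join_double() are
hypothetical names, not part of TGSI or Mesa.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type: the two 32-bit channels that hold one double.
 * Per the quoted proposal, the high dword lives in y (or w) and the low
 * dword in x (or z). */
struct dword_pair {
   uint32_t lo;   /* would go to r.x or r.z */
   uint32_t hi;   /* would go to r.y or r.w */
};

/* Split a double into its high and low dwords. */
static struct dword_pair
split_double(double d)
{
   uint64_t bits;
   struct dword_pair p;

   assert(sizeof(d) == sizeof(bits));
   memcpy(&bits, &d, sizeof(bits));   /* bit copy, avoids aliasing problems */
   p.lo = (uint32_t)(bits & 0xffffffffu);
   p.hi = (uint32_t)(bits >> 32);
   return p;
}

/* Reassemble the double from the two dwords. */
static double
join_double(struct dword_pair p)
{
   uint64_t bits = ((uint64_t)p.hi << 32) | p.lo;
   double d;

   memcpy(&d, &bits, sizeof(d));
   return d;
}
```

With this, split_double(3.0) gives hi = 0x40080000 and lo = 0x00000000,
matching the r.y/r.x (or r.w/r.z) values above.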

I wonder if the storage size of registers is such a big issue. Knowing the
storage size of a register matters mostly for indexable temps. For
regular assignments and intermediate computations, everything
gets transformed into SSA form, and the register size can be determined
from the instructions where it is generated/used, so there is no need
for consistency. 

For example, imagine a shader that has:

   TEX TEMP[0], SAMP[0], IN[0]  // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT
                                // --> use 4x32bit float registers
   MAX ??
   ...
   TEX TEMP[0], SAMP[1], IN[0]  // SAMP[1] is a PIPE_FORMAT_R64G64B64A64_FLOAT
                                // --> use 4x64bit double registers
   DMAX ????, TEMP[0], ???
   ...
   TEX TEMP[0], SAMP[2], IN[0]  // texture 0 and rendertarget are both
                                // PIPE_FORMAT_R8G8B8A8_UNORM
                                // --> use 4x8bit unorm registers
   MOV OUT[0], TEMP[0]
   MOV OUT[0], TEMP[0]

etc.
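
To make the point concrete, here is a rough sketch in C of how a compiler
could give each write to TEMP[0] its own SSA value and element width. It is
a simplified model under my own assumptions; struct write, struct ssa_value,
MAX_TEMPS and rename_writes() are made-up names, not Mesa or TGSI API.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TEMPS 16   /* arbitrary limit for this sketch */

/* One instruction that writes a temp, with the element width implied by
 * the instruction (e.g. by the sampler format of a TEX). */
struct write {
   unsigned temp_index;   /* which TEMP[] is written */
   unsigned bits;         /* element width in bits: 8, 32, 64, ... */
};

/* One SSA value: a (temp, version) pair with its own width. */
struct ssa_value {
   unsigned temp_index;
   unsigned version;
   unsigned bits;
};

/* Give every write a fresh SSA version. The width is taken from the
 * producing instruction only, so reusing a temp with another width is
 * not a conflict. Returns the number of SSA values written to out. */
static size_t
rename_writes(const struct write *writes, size_t count,
              struct ssa_value *out)
{
   unsigned next_version[MAX_TEMPS] = {0};
   size_t i;

   for (i = 0; i < count; i++) {
      unsigned t = writes[i].temp_index;

      assert(t < MAX_TEMPS);
      out[i].temp_index = t;
      out[i].version = next_version[t]++;
      out[i].bits = writes[i].bits;
   }
   return count;
}
```

Feeding it the three TEX writes above (32, 64 and 8 bits, all to TEMP[0])
yields three independent values, versions 0, 1 and 2, each with its own
width.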

There is actually programmable 3d hardware out there that has special
4x8bit registers, and for performance the compiler has to deduce where
to use those 4x8bit registers. llvmpipe will need to do a similar thing, as
the smaller the bit-width the higher the throughput. And at least the current
gallium state trackers will reuse temps with no attempt to maintain
consistency in use.
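
As a back-of-the-envelope illustration of the throughput point, assuming a
fixed-width SIMD vector (the 128-bit figure in the note below is my
assumption, not something specific to llvmpipe), the number of elements
processed per vector operation scales inversely with the element width.
lanes_per_vector() is a hypothetical helper:

```c
#include <assert.h>

/* Number of elements of elem_bits each that fit in one SIMD vector of
 * vector_bits. Hypothetical helper for illustration only. */
static unsigned
lanes_per_vector(unsigned vector_bits, unsigned elem_bits)
{
   assert(elem_bits != 0 && vector_bits % elem_bits == 0);
   return vector_bits / elem_bits;
}
```

For a 128-bit vector that is 2 lanes of 64-bit doubles, 4 lanes of 32-bit
floats, or 16 lanes of 8-bit unorms.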

So if the compilers already need to deal with this, I wonder if this notion
that registers are 128 bits is really necessary, and whether it will prevail
in the long term.

Jose


_______________________________________________
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
