Re: [Mesa3d-dev] [RFC] add support to double opcodes

José Fonseca Thu, 07 Jan 2010 09:10:23 -0800

On Thu, 2010-01-07 at 08:42 -0800, Christoph Bumiller wrote:
> On 01/07/2010 12:50 PM, José Fonseca wrote:
> > On Wed, 2010-01-06 at 15:56 -0800, Zack Rusin wrote:
> >> On Wednesday 06 January 2010 14:56:35 Igor Oliveira wrote:
> >>> Hi,
> >>>
> >>> the patches add support for double opcodes in gallium/tgsi.
> >>> They implement only some of the opcodes; I would like to know if
> >>> anyone has suggestions about the patches.
> >>
> >> Hi Igor, first of all this should probably go into a feature branch
> >> because it'll be a bit of work before it's usable.
> >> The patches that you've proposed are unlikely to be what we'll want for
> >> doubles.
> >> Keith, Michal and I discussed this on the phone a few days back and the
> >> biggest issue with doubles is that, unlike the switch between integers
> >> and floats, they actually need bigger registers to accommodate them.
> >> Given that the registers in TGSI are untyped and it's up to instructions
> >> to define the type, it becomes hard for drivers to figure out the size of
> >> the registers beforehand.
> >> The solution that I personally like, and what seems to be becoming the
> >> de facto standard when dealing with double support, is having double
> >> precision values represented by a pair of registers. Outputs go either
> >> to the pair yx or to the pair wz, where the msb is stored in y/w. For
> >> example:
> >> Idata 3.0 => (0x4008000000000000) in register r looks like:
> >> r.w = 0x40080000 ; high dword
> >> r.z = 0x00000000 ; low dword
> >> Or:
> >> r.y = 0x40080000 ; high dword
> >> r.x = 0x00000000 ; low dword
> >> All source double inputs must be in xy (after swizzle operations). For
> >> example:
> >> d_add r1.xy, r2.xy, r2.xy
> >> Or
> >> d_add r1.zw, r2.xy, r2.xy
> >> Each computes twice the value in r2.xy, and places the result in either
> >> xy or zw.
> >> This ensures that the register size stays constant. Of course the
> >> instruction semantics are different from the typical 4-component wide
> >> TGSI instructions, but that, I think, is a lot less of an issue.
> >>
> >> z
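
To make the pair layout above concrete, here is a minimal C sketch. It
assumes IEEE 754 doubles and a hypothetical 4x32-bit register struct; the
names reg4, store_double, load_double and d_add are made up for
illustration and are not TGSI or Mesa API.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-component register, 32 bits per channel (x, y, z, w). */
struct reg4 { uint32_t c[4]; };
enum { X, Y, Z, W };

/* Store a double in the channel pair (lo, hi); the high dword goes in hi.
 * Assumes double is a 64-bit IEEE 754 value. */
static void store_double(struct reg4 *r, int lo, int hi, double v)
{
    uint64_t bits;
    assert(lo >= 0 && lo < 4 && hi >= 0 && hi < 4);
    memcpy(&bits, &v, sizeof bits);
    r->c[lo] = (uint32_t)(bits & 0xffffffffu);
    r->c[hi] = (uint32_t)(bits >> 32);
}

/* Reassemble a double from the channel pair (lo, hi). */
static double load_double(const struct reg4 *r, int lo, int hi)
{
    uint64_t bits = ((uint64_t)r->c[hi] << 32) | r->c[lo];
    double v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Sketch of "d_add dst.<lo,hi>, a.xy, b.xy": sources are always read from
 * xy, the result is written to either xy or zw. */
static void d_add(struct reg4 *dst, int lo, int hi,
                  const struct reg4 *a, const struct reg4 *b)
{
    double sum = load_double(a, X, Y) + load_double(b, X, Y);
    store_double(dst, lo, hi, sum);
}
```

Storing 3.0 with store_double(&r2, X, Y, 3.0) leaves 0x40080000 in r2.y and
0x00000000 in r2.x, matching the example above; d_add(&r1, Z, W, &r2, &r2)
then writes 6.0 into r1.zw.
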
> > 
> > I wonder if the storage size of registers is such a big issue. Knowing
> > the storage size of a register matters mostly for indexable temps. For
> > regular assignments and intermediate computations, everything
> > gets transformed into SSA form, and the register size can be determined
> > from the instructions where it is generated/used, so there is no need
> > for consistency.
> > 
> > For example, imagine a shader that has:
> > 
> >    TEX TEMP[0], SAMP[0], IN[0]  // SAMP[0] is a PIPE_FORMAT_R32G32B32_FLOAT --> use 4x32bit float registers
> >    MAX ??
> >    ...
> >    TEX TEMP[0], SAMP[1], IN[0]  // SAMP[1] is a PIPE_FORMAT_R64G64B64A64_FLOAT --> use 4x64bit double registers
> >    DMAX ????, TEMP[0], ???
> >    ...
> >    TEX TEMP[0], SAMP[2], IN[0]  // texture 0 and rendertarget are both PIPE_FORMAT_R8G8B8A8_UNORM --> use 4x8bit unorm registers
> >    MOV OUT[0], TEMP[0]
> > 
> > etc.
> > 
> TEX will output floats, independently of the bound texture, so this code
> makes no sense to me.
> I do *not* (want to have to) know what will be bound to the sampler
> in advance, or produce a new shader for every different format.
> 
> If TEX is to output doubles, make it D_TEX.


Yes, that's fair enough. My point here is that one could avoid the
double conversion as an optimization. 

That may not be the case for nvidia hardware, but there is 3d hardware out
there where this sort of analysis will happen, e.g. for 4x8bit
formats. It's not 3d hardware, but this will certainly be the case for
llvmpipe: we'll want to check if the sampler formats and rendertarget
formats are rgba8 and produce specialized shaders using 16x8bit SIMD
instructions for those.
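
As a rough illustration of that check (a sketch only, with hypothetical
names; this is not llvmpipe's actual code, and the fmt enum merely stands in
for Gallium's pipe_format), a driver could pick a shader variant from the
bound formats like this:

```c
/* Hypothetical format tags standing in for Gallium's pipe_format values. */
enum fmt { FMT_R8G8B8A8_UNORM, FMT_R32G32B32A32_FLOAT, FMT_OTHER };

/* Hypothetical shader variants a driver might generate. */
enum variant { VARIANT_GENERIC_FLOAT, VARIANT_RGBA8_SIMD };

#define MAX_SAMPLERS 4

/* Hypothetical state key: the formats the shader variant depends on. */
struct variant_key {
    enum fmt sampler_format[MAX_SAMPLERS];
    unsigned num_samplers;
    enum fmt rendertarget_format;
};

/* Choose the 8-bit SIMD variant only when every sampler and the
 * rendertarget are rgba8; otherwise fall back to the float path. */
static enum variant choose_variant(const struct variant_key *key)
{
    unsigned i;

    if (key->rendertarget_format != FMT_R8G8B8A8_UNORM)
        return VARIANT_GENERIC_FLOAT;

    for (i = 0; i < key->num_samplers && i < MAX_SAMPLERS; i++) {
        if (key->sampler_format[i] != FMT_R8G8B8A8_UNORM)
            return VARIANT_GENERIC_FLOAT;
    }
    return VARIANT_RGBA8_SIMD;
}
```

The TGSI shader itself stays the same; only the generated variant depends on
the key, so nothing about the bound formats has to be known when the shader
is written.
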

Jose


