Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-03 Thread Luca Barbieri
I tested this on Windows, using nVidia driver 195 on nv40, and it
seems we are all partially wrong.

SM3 does indeed allow semantics unrelated to hardware resources.
However, the semantic indices for any semantic type must be in the
range 0-15, or D3DX will report a compiler error during shader
compilation:
error X2000: syntax error : unexpected token 'dcl_texcoord16'

This is further confirmed by the following lines in d3d9types.h:
#define MAXD3DDECLUSAGE         D3DDECLUSAGE_SAMPLE
#define MAXD3DDECLUSAGEINDEX    15

I would guess that these two 4-bit values are combined into an 8-bit
value that is then passed directly to hardware like r600 which
supports 8-bit semantic indices in hardware.
Is this the case on Radeon?

Is the 8-bit semantic table a feature of r300 too or only of r600+?

In light of this, it may make sense to do some range limitation ourselves too.
In particular, a good plan could be limiting all semantic indices to
0-15, except GENERIC, which could support a 0-127 range.
This would allow us both to take direct advantage of Radeon hardware
and to let drivers that need to remap in software do so with a direct
lookup in a small array.
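
To make the "direct lookup in a small array" idea concrete, here is a
minimal sketch, assuming the limits proposed above (0-15 for every name
except GENERIC, 0-127 for GENERIC). The enum, the table layout and all
function names are hypothetical and are not taken from Gallium.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical semantic names for illustration; GENERIC is placed last
 * so that every other name gets a fixed block of 16 table entries. */
enum sem_name {
    SEM_POSITION,
    SEM_COLOR,
    SEM_BCOLOR,
    SEM_PSIZE,
    SEM_GENERIC      /* must stay last */
};

#define SEM_MAX_INDEX    15u    /* proposed limit for non-GENERIC names */
#define SEM_MAX_GENERIC  127u   /* proposed limit for GENERIC */
#define SEM_SLOT_UNUSED  0xffu  /* marks a semantic with no hardware slot */
#define SEM_TABLE_SIZE   (SEM_GENERIC * 16 + 128)

struct sem_remap {
    uint8_t slot[SEM_TABLE_SIZE];
};

/* Flatten (name, index) into a table position, or -1 if out of range. */
int sem_key(enum sem_name name, unsigned index)
{
    unsigned limit = (name == SEM_GENERIC) ? SEM_MAX_GENERIC : SEM_MAX_INDEX;
    if (index > limit)
        return -1;
    return (int)name * 16 + (int)index;
}

/* Mark every semantic as unmapped. */
void sem_remap_init(struct sem_remap *r)
{
    memset(r->slot, SEM_SLOT_UNUSED, sizeof r->slot);
}

/* Record the hardware slot for a semantic; returns -1 if out of range. */
int sem_remap_set(struct sem_remap *r, enum sem_name name,
                  unsigned index, uint8_t hw_slot)
{
    int key = sem_key(name, index);
    assert(hw_slot != SEM_SLOT_UNUSED);
    if (key < 0)
        return -1;
    r->slot[key] = hw_slot;
    return 0;
}

/* Constant-time lookup; returns -1 for unmapped or out-of-range input. */
int sem_remap_get(const struct sem_remap *r, enum sem_name name,
                  unsigned index)
{
    int key = sem_key(name, index);
    if (key < 0 || r->slot[key] == SEM_SLOT_UNUSED)
        return -1;
    return r->slot[key];
}
```

With these limits the whole table is under 200 bytes, so a driver could
keep one per shader without needing a hash or a search.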

--
The Planet: dedicated and managed hosting, cloud storage, colocation
Stay online with enterprise data centers and the best network in the business
Choose flexible plans and management services without long-term contracts
Personal 24x7 support from experience hosting pros just a phone call away.
http://p.sf.net/sfu/theplanet-com
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-03 Thread Keith Whitwell
On Wed, 2010-02-03 at 01:42 -0800, Luca Barbieri wrote:
 I tested this on Windows, using nVidia driver 195 on nv40, and it
 seems we are all partially wrong.
 
 SM3 does indeed allow semantics unrelated to hardware resources.
 However, the semantic indices for any semantic type must be in the
 range 0-15, or D3DX will report a compiler error during shader
 compilation:
 error X2000: syntax error : unexpected token 'dcl_texcoord16'
 
 This is further confirmed by the following lines in d3d9types.h:
 #define MAXD3DDECLUSAGE         D3DDECLUSAGE_SAMPLE
 #define MAXD3DDECLUSAGEINDEX    15

 I would guess that these two 4-bit values are combined into an 8-bit
 value that is then passed directly to hardware like r600 which
 supports 8-bit semantic indices in hardware.


Further down that file they define the binary shader tokens for DX9,
which match your guess: 

// For dcl info tokens requiring a semantic (usage + index)
#define D3DSP_DCL_USAGE_SHIFT 0
#define D3DSP_DCL_USAGE_MASK  0x000f

#define D3DSP_DCL_USAGEINDEX_SHIFT 16
#define D3DSP_DCL_USAGEINDEX_MASK  0x000f0000

Not for the first time, hardware capabilities directly match what was
required to implement the DX version of the era.
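
As a rough illustration of how the two 4-bit fields could be pulled out
of a dcl token and folded into one 8-bit value, here is a small sketch.
It uses local copies of the shift/mask constants and assumes the index
mask covers bits 16-19 (0x000f0000), which is what a 4-bit field with a
shift of 16 implies. The function names and the nibble order of the
packed id are hypothetical; nothing here describes how any real
hardware lays out its ids.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the dcl token layout; the index mask value is an
 * assumption (4 bits starting at bit 16). */
#define DCL_USAGE_SHIFT        0
#define DCL_USAGE_MASK         0x0000000fu
#define DCL_USAGEINDEX_SHIFT   16
#define DCL_USAGEINDEX_MASK    0x000f0000u

/* Extract the 4-bit usage (semantic name) from a dcl token. */
unsigned dcl_usage(uint32_t token)
{
    return (token & DCL_USAGE_MASK) >> DCL_USAGE_SHIFT;
}

/* Extract the 4-bit usage index (semantic index) from a dcl token. */
unsigned dcl_usage_index(uint32_t token)
{
    return (token & DCL_USAGEINDEX_MASK) >> DCL_USAGEINDEX_SHIFT;
}

/* Combine both fields into one 8-bit id, usage in the high nibble.
 * The nibble order is an arbitrary choice made for this sketch. */
uint8_t semantic_id8(unsigned usage, unsigned index)
{
    assert(usage <= 15 && index <= 15);
    return (uint8_t)((usage << 4) | index);
}
```

For example, a token with usage 5 and index 3 would decode to (5, 3)
and pack to 0x53 under this layout.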

 Is this the case on Radeon?
 
 Is the 8-bit semantic table a feature of r300 too or only of r600+?

At some point this would have been introduced to hardware to remove the
headache from software of dealing with the DX9 semantic scheme.  I don't
know at what point in the hardware/driver evolution it became important
enough to devote silicon to.


 In light of this, it may make sense to do some range limitation ourselves too.
 In particular, a good plan could be limiting all semantic indices to
 0-15, except GENERIC, which could support a 0-127 range.
 This would allow to both directly take advantage of Radeon hardware,
 and let drivers that need to remap in software do so with direct
 lookup in a small array.

This level of restriction is fine with me.  It seems to allow us to
capture all the important APIs - GL and DX9 clearly, and DX10 seems to
match outputs to inputs by position, without needing to examine
semantics.

Also, we've been proliferating semantic names, one each for various
system values.  It sounds like we might want to consolidate them down
to designated indices within a single name.

Thanks for looking into this Luca,

Keith







Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-03 Thread Alex Deucher
On Wed, Feb 3, 2010 at 4:42 AM, Luca Barbieri l...@luca-barbieri.com wrote:
 I tested this on Windows, using nVidia driver 195 on nv40, and it
 seems we are all partially wrong.

 SM3 does indeed allow semantics unrelated to hardware resources.
 However, the semantic indices for any semantic type must be in the
 range 0-15, or D3DX will report a compiler error during shader
 compilation:
 error X2000: syntax error : unexpected token 'dcl_texcoord16'

 This is further confirmed by the following lines in d3d9types.h:
 #define MAXD3DDECLUSAGE         D3DDECLUSAGE_SAMPLE
 #define MAXD3DDECLUSAGEINDEX    15

 I would guess that these two 4-bit values are combined into an 8-bit
 value that is then passed directly to hardware like r600 which
 supports 8-bit semantic indices in hardware.
 Is this the case on Radeon?

 Is the 8-bit semantic table a feature of r300 too or only of r600+?

Only r600+.  r3xx-r5xx is more basic.  You basically set up a table
based on the inputs and outputs.  Order doesn't matter as long as the
table is correct for the vs and ps you are using.  See pages 258-261
for the vertex fetch setup and pages 197-199 for the vs to ps routing
of the r5xx accel guide:
http://www.x.org/docs/AMD/R5xx_Acceleration_v1.4.pdf

Alex


 In light of this, it may make sense to do some range limitation ourselves too.
 In particular, a good plan could be limiting all semantic indices to
 0-15, except GENERIC, which could support a 0-127 range.
 This would allow to both directly take advantage of Radeon hardware,
 and let drivers that need to remap in software do so with direct
 lookup in a small array.




Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread michal
Luca Barbieri wrote on 2010-02-01 21:42:

 1. All the semantic indices in OpenGL are limited, according to the
 ARB specification
 2. All the semantic indices in DirectX 9/10 are limited, according to
 http://msdn.microsoft.com/en-us/library/ee418355%28VS.85%29.aspx

At least for SM3.0, one can specify a vertex shader output semantic like 
COLOR15 and have it running as long as one has also a pixel shader with 
a matching input semantic. Though I agree with you we don't really want 
to go this route and have something more sensible.

We could, for example, limit COLOR and BCOLOR indices to [0, 1], remove 
FOG and NORMAL names, and have a well-defined limit on GENERIC index 
value. After all, we only need non-generic semantics to communicate with 
the fixed-function part of the pipeline, that is rasteriser.

name         index range

POSITION     no limit?
COLOR        0..1, explicit clamp?
BCOLOR       0..1, explicit clamp?
FOG          remove?
PSIZE        0
GENERIC      0..max generics
NORMAL       remove
FACE         0
EDGEFLAG     0
PRIMID       0
INSTANCEID   0


As for the routing table thing, I am not really convinced. The GLSL 
mechanism to link shaders based on varying names is GL-specific and thus 
should stay inside Mesa state tracker. In fact, D3D10 runtime is doing 
exactly the same thing and generating shader variants on the fly as they 
are mixed and matched by the application.




Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread Keith Whitwell
On Tue, Feb 2, 2010 at 3:54 PM, michal mic...@vmware.com wrote:
 Luca Barbieri wrote on 2010-02-01 21:42:

 1. All the semantic indices in OpenGL are limited, according to the
 ARB specification
 2. All the semantic indices in DirectX 9/10 are limited, according to
 http://msdn.microsoft.com/en-us/library/ee418355%28VS.85%29.aspx

 At least for SM3.0, one can specify a vertex shader output semantic like
 COLOR15 and have it running as long as one has also a pixel shader with
 a matching input semantic. Though I agree with you we don't really want
 to go this route and have something more sensible.

So translating COLOR15 away in a DX9 state tracker would mean that it
would have to examine pairs of vertex and fragment shaders together
and re-translate to generate variants that use the same set of
remapped semantics, right?   That sounds like extra work a DX9 state
tracker could avoid with the current rules.

I'm not opposed to doing more work in the state-trackers, but as I
keep saying, if we're going to do this type of hand-holding in the
state trackers, we should make sure we do enough to fix the
re-translation problem in all drivers, not just a couple.

Keith



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread Luca Barbieri
 At least for SM3.0, one can specify a vertex shader output semantic like
 COLOR15 and have it running as long as one has also a pixel shader with a
 matching input semantic. Though I agree with you we don't really want to go
 this route and have something more sensible.

Do you know of any official Microsoft documentation that clearly
indicates that COLOR (and other) semantic indices are not limited?

The documentation I found seems to support the opposite statement, as
the following line:

n is an optional integer between 0 and the number of resources
supported. For example, POSITION0, TEXCOOR1, etc.

in Semantics (DirectX HLSL) on MSDN seems to indicate that if only 2
COLORs are supported, they are denoted by COLOR0 and COLOR1, and that
COLOR15 being valid would imply support for simultaneously using at
least 16 COLOR semantics.

As I understand it, the difference between SM2 and SM3 is that SM2
programs essentially directly use the semantics in instructions,
because they have c## registers for colors, t## registers for
texcoords, etc.
SM3 programs instead use generic i## or o## input/output registers,
which are associated to semantics with a declaration.

Note that this difference is orthogonal to the issue of whether
semantic indices are limited or not.


 As for the routing table thing, I am not really convinced. The GLSL
 mechanism to link shaders based on varying names is GL-specific and thus
 should stay inside Mesa state tracker

Surely.
However, if we want to support compiling the shaders separately,
variable foo may have been assigned output #2 in the vertex shader,
but input #1 in the fragment shader.
Thus, we need a way for Mesa to tell Gallium to map output #2 to input #1.
Of course, deciding to map #2 to #1 by consulting the GLSL shader
compiler symbol tables should be the state tracker's job.

Otherwise, we will need to recompile either of the shaders at link
time, so that foo is assigned the same slot in both shaders, which
is what we do now in GLSL linking, but is somewhat inefficient and in
particular can lead to compilation time growing quadratically in the
number of shaders, and slower shader switching.
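
A minimal sketch of what such a mapping could look like, assuming the
state tracker has a list of (name, slot) pairs for each separately
compiled shader. The struct and function names are hypothetical; this
is not an existing Gallium or Mesa interface.

```c
#include <assert.h>
#include <string.h>

#define MAX_VARYINGS 32
#define ROUTE_NONE   (-1)

/* Hypothetical per-shader symbol entry: a varying and its slot. */
struct varying {
    const char *name;
    int slot;
};

/* For each fragment shader input slot, the vertex shader output slot
 * that feeds it, or ROUTE_NONE if the input slot is unused. */
struct routing_table {
    int vs_output_for_fs_input[MAX_VARYINGS];
};

/* Match FS inputs to VS outputs by name, as a state tracker could do
 * from its symbol tables. Returns 0 on success, or -1 if some FS input
 * has no VS output of the same name. */
int build_routing(const struct varying *vs_out, int n_out,
                  const struct varying *fs_in, int n_in,
                  struct routing_table *rt)
{
    for (int i = 0; i < MAX_VARYINGS; i++)
        rt->vs_output_for_fs_input[i] = ROUTE_NONE;

    for (int i = 0; i < n_in; i++) {
        int found = ROUTE_NONE;
        assert(fs_in[i].slot >= 0 && fs_in[i].slot < MAX_VARYINGS);
        for (int o = 0; o < n_out; o++) {
            if (strcmp(fs_in[i].name, vs_out[o].name) == 0) {
                found = vs_out[o].slot;
                break;
            }
        }
        if (found == ROUTE_NONE)
            return -1;
        rt->vs_output_for_fs_input[fs_in[i].slot] = found;
    }
    return 0;
}
```

With foo at output #2 in the vertex shader and input #1 in the fragment
shader, the table ends up with entry 1 set to 2, and neither shader has
to be recompiled to agree on a slot.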



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread Luca Barbieri
  Personally I'm
 going to take a break from this thread, spend a couple of days looking
 at i965, etc, to see what can be done to improve things there, and
 maybe come back with an alternate proposal.

Yes, I think that the most important step is to determine precisely
how the hardware (especially the newer cards you mentioned) works
and how the shader APIs (especially DirectX) are defined.

Once the workings of both are known and agreed upon, the best solution
should hopefully be clear.

In addition to looking at hardware such as i965, it would be awesome
to find some clear and unambiguous documentation on DirectX shader
semantics, and agree on its interpretation.



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread Olivier Galibert
On Tue, Feb 02, 2010 at 07:09:12PM +0100, Luca Barbieri wrote:
 Otherwise, we will need to recompile either of the shaders at link
 time, so that foo is assigned the same slot in both shaders, which
 is what we do now in GLSL linking, but is somewhat inefficient and in
 particular can lead to compilation time growing quadratically in the
 number of shaders, and slower shader switching.

Slower shader switching is what caches are for.  And if you have n
VS and m FS, and a large subset of the n*m combinations (that's where
your quadratic comes from, right?) are actually used, then it's rather
obvious that inter-shader constant propagation and dead code removal
is going to be a must.  Incidentally, you can multiply by the number
of geometry shaders while you're at it.

As for link-by-name, it's pretty obvious it's going to become the norm
and not the exception.  Numbers are opaque, names aren't, and shaders
are a bitch to write and debug.  In addition, color and texture
coords are way too specific and are pretty sure to morph into int and
float, or even float only, given HDR and how much easier it is,
hardware-wise and shader-compiler-wise, to just have large-n parallel
float interpolation units.  Add link-time shared types to that.  So
you'd better ensure your approach is ready for a more dynamic world
where you can't decide a lot of things until link time.

  OG.




Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread Alex Deucher
On Tue, Feb 2, 2010 at 1:16 PM, Luca Barbieri l...@luca-barbieri.com wrote:
  Personally I'm
 going to take a break from this thread, spend a couple of days looking
 at i965, etc, to see what can be done to improve things there, and
 maybe come back with an alternate proposal.

 Yes, I think that the most important step is to precisely determine
 how both hardware (and especially the newer cards you mentioned) works
 and how shader APIs (especially DirectX) are defined.


On AMD r6xx and newer asics, the hardware provides 8 bit semantic ids
that are used for vertex fetches and shader to shader routing.  See:
http://www.x.org/docs/AMD/R6xx_R7xx_3D.pdf
pages 10-11, 16-17
The driver can define the ids to whatever it wants and then data will
be routed based on those ids.  E.g.,

#define POSITION 1
#define COLOR0   2
#define TEXCOORD0 3
etc.

then in your fetch shader:
vfetch POSITION
vfetch TEXCOORD0
vfetch COLOR0

and in your vs output:
export COLOR0
export TEXCOORD0
export POSITION

and in your ps inputs:
input TEXCOORD0
input COLOR0

etc.

The ordering doesn't matter; all routing is done by semantic id.
There's no need to recompile your vertex shaders or pixel shaders; you
just adjust the fetch shader and semantic export/import state
accordingly.
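
To illustrate the order-independence, here is a small software model
of matching by id, written in C. It is only a sketch of the behaviour
described above, not of the actual register programming; the id values
mirror the example defines, and the ID_ prefix and the function name
are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Driver-chosen 8-bit ids, mirroring the example defines above. */
enum {
    ID_POSITION  = 1,
    ID_COLOR0    = 2,
    ID_TEXCOORD0 = 3
};

/* For each pixel shader input, find the position of the vertex shader
 * export carrying the same semantic id. Returns 0 on success, or -1 if
 * an input has no matching export. */
int route_by_semantic_id(const uint8_t *vs_exports, int n_exports,
                         const uint8_t *ps_inputs, int n_inputs,
                         int *export_for_input)
{
    assert(n_exports >= 0 && n_inputs >= 0);
    for (int i = 0; i < n_inputs; i++) {
        export_for_input[i] = -1;
        for (int e = 0; e < n_exports; e++) {
            if (vs_exports[e] == ps_inputs[i]) {
                export_for_input[i] = e;
                break;
            }
        }
        if (export_for_input[i] < 0)
            return -1;
    }
    return 0;
}
```

Using the export order COLOR0, TEXCOORD0, POSITION and the input order
TEXCOORD0, COLOR0 from the example, input 0 resolves to export 1 and
input 1 to export 0, whatever order either shader declared them in.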

The current r600 classic mesa driver just uses a hardcoded mapping
right now, but it would make sense to use semantic ids in the gallium
driver.

Alex

 Once the workings of both are known and agreed upon, the best solution
 should be hopefully be clear.

 In addition to looking at hardware such as i965, it would be awesome
 to find some clear and unambiguous documentation on DirectX shader
 semantics. and agree on its interpretation.





Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread Luca Barbieri
On Tue, Feb 2, 2010 at 7:38 PM, Olivier Galibert galib...@pobox.com wrote:
 On Tue, Feb 02, 2010 at 07:09:12PM +0100, Luca Barbieri wrote:
 Otherwise, we will need to recompile either of the shaders at link
 time, so that foo is assigned the same slot in both shaders, which
 is what we do now in GLSL linking, but is somewhat inefficient and in
 particular can lead to compilation time growing quadratically in the
 number of shaders, and slower shader switching.

 Slower shader switching is what caches are for.  And if you have n
 VS and m FS, and a large subset of the n*m combinations (that's where
 your quadratic comes from, right?) are actually used, then it's rather
 obvious that inter-shader constant propagation and dead code removal
 is going to be a must.  Incidentally, you can multiply by the number
 of geometry shaders while you're at it.

 As for link-by-name, it's pretty obvious it's going to become to norm
 and not the exception.
Exactly, and that's why we should be able to support it efficiently.
The current Gallium architecture doesn't do that because semantic
indices are integers, and there is no way to specify linking without
creating different shaders.

Of course we could also decide we don't care about separate shaders,
and just have a Gallium CSO for a whole complete
vertex+geometry+fragment pipeline.
This makes things much simpler, but I'm afraid that some applications
could suffer catastrophic performance degradation.

A good place to find inspiration for this choice could be the nVidia
and ATI proprietary drivers.
Since app/game developers test with those, if they always recompile
for each (fs, vs) pair, then we can safely do so too.

On the nVidia front, they on one hand patented inter-shader
optimization (patent 7426724, filed in 2004), but on the other hand
wrote GL_EXT_separate_shader_objects (written in 2009), which
doesn't provide significant benefits if the driver always does
inter-shader optimization.

Thus it seems they may be using a mix of the two techniques, possibly
depending on driver version and on the shader API being used.

What do you think?
Is separate compilation important enough that we should try to fully
support it, or can we just drop it?

In the first case, we would want a Gallium interface to support
specifying routing separately from shaders.
In the second case, we would be better off dropping the separate
VS/GS/FS shader CSOs and just having a CSO for the whole shader
pipeline.



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-02 Thread John Bauman
SM3 usages are arbitrary. For example, you could have some data with a 
blendweight 5 semantic, where there is no hardware to support it and no 
meaningful limit to the number.

From: Luca Barbieri [l...@luca-barbieri.com]
Sent: Tuesday, February 02, 2010 10:09 AM
To: Michal Krol
Cc: mesa3d-dev@lists.sourceforge.net
Subject: Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in   
texcoord slots



The documentation I found seems to support the opposite statement, as
the following line:

n is an optional integer between 0 and the number of resources
supported. For example, POSITION0, TEXCOOR1, etc.

in Semantics (DirectX HLSL) on MSDN seems to indicate that if only 2
COLORs are supported, they are denoted by COLOR0 and COLOR1, and that
COLOR15 being valid would imply support for simultaneously using at
least 16 COLOR semantics.

As I understand it, the difference between SM2 and SM3 is that SM2
programs essentially directly use the semantics in instructions,
because they have c## registers for colors, t## registers for
texcoords, etc.
SM3 programs instead use generic i## or o## input/output registers,
which are associated to semantics with a declaration.

Note that this difference is orthogonal to the issue of whether
semantic indices are limited or not.




[Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Keith Whitwell
Christoph,  Luca,

Two-sided lighting is a bit of a special-case GL-ism.  On a lot of hardware 
we end up implementing it by passing both front and back colors to the fragment 
shader and selecting between them using the FACE variable.  If we removed the 
implicit fixed-function support for two-side lighting in the rasterizer, it 
would solve the issue of how this is represented in any routing table.  
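
As a rough model of the selection described above, the per-fragment
choice amounts to the following. It is written in C purely for
illustration; a real driver would emit the equivalent instructions in
the fragment shader, and the function name and the boolean standing in
for the FACE value are assumptions of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Pick the front or back color for a fragment, given both colors as
 * passed down from the vertex shader and the facing of the primitive. */
void select_twoside_color(const float front[4], const float back[4],
                          bool front_facing, float out[4])
{
    assert(front && back && out);
    memcpy(out, front_facing ? front : back, 4 * sizeof(float));
}
```

Done this way, the rasterizer needs no knowledge of two-sided lighting:
both colors travel as ordinary varyings and the choice is made in the
shader.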

How does that sit with your drivers?

Keith




From: luca.barbi...@gmail.com [luca.barbi...@gmail.com] On Behalf Of Luca 
Barbieri [l...@luca-barbieri.com]
Sent: Monday, February 01, 2010 7:29 AM
To: Christoph Bumiller
Cc: Keith Whitwell; mesa3d-dev@lists.sourceforge.net
Subject: Re: [Mesa3d-dev] [PATCH] glsl: put varyings in texcoord slots

 I can't really use a routing table state to produce a cso, because the hw
 routing table I generate depends on rasterizer state, e.g. I must not
 put in back face colour (we have a 2 to 1 mapping here) if twoside
 is disabled.

 Also, I'm routing based on the scalar *components* the FP reads,
 not whole TGSI pseudo vec4 registers (NUM_INTERPOLATORS will
 thus be inaccurate) - set_routing_table will have to pass me the
 respective programs too.
 Well, I can still use the cso and insert it into the rest of the routing
 table that still need to be assembled on the fly, I did that before the
 1:1 mapping between FP and VP regs was removed.

You are right, the routing table CSO needs to contain the fragment and
vertex shader handles, and ideally light_twoside should be moved to
the vertex-fragment routing table since it is really an attribute of
that and not polygon rasterization/setup.

You can then just look at your internal data structure and construct a
scalar routing table from the vec4 one provided by Gallium.

We could also, as a further extension, support scalar routing tables
directly in Gallium.
Note however that radeon hardware presumably only supports vector
ones, so we would need all 3 options with caps.
A further intermediate step could be vector routing tables with swizzling.



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Luca Barbieri
On Mon, Feb 1, 2010 at 5:31 PM, Keith Whitwell kei...@vmware.com wrote:
 Christoph,  Luca,

 Two-sided lighting is a bit of a special-case GL-ism.  On a lot of hardware 
 we end up implementing it by passing both front and back colors to the 
 fragment shader and selecting between them using the FACE variable.  If we 
 removed the implicit fixed-function support for two-side lighting in the 
 rasterizer, it would solve the issue of how this is represented in any 
 routing table.

 How does that sit with your drivers?

nv40 (and perhaps r300 too?) appears to have 2 hardware back color
registers in the vertex shader that are automatically routed, so it
would probably be best to leave it that way.

Of course, a generic face-dependent routing table could be yet another
optional feature.
Does any API expose such a thing, perhaps in the form of unlimited
rather than 2 front/back colors? (other than by using FACE)



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Keith Whitwell
DX9 semantic indexes are apparently unlimited, and you can definitely specify 
COLOR 0..3; I haven't tried to go further.

Keith

From: luca.barbi...@gmail.com [luca.barbi...@gmail.com] On Behalf Of Luca 
Barbieri [l...@luca-barbieri.com]
Sent: Monday, February 01, 2010 8:44 AM
To: Keith Whitwell
Cc: Christoph Bumiller; mesa3d-dev@lists.sourceforge.net
Subject: Re: light_twoside RE: [Mesa3d-dev] [PATCH] glsl: put varyings in   
texcoord slots

On Mon, Feb 1, 2010 at 5:31 PM, Keith Whitwell kei...@vmware.com wrote:
 Christoph,  Luca,

 Two-sided lighting is a bit of a special-case GL-ism.  On a lot of hardware 
 we end up implementing it by passing both front and back colors to the 
 fragment shader and selecting between them using the FACE variable.  If we 
 removed the implicit fixed-function support for two-side lighting in the 
 rasterizer, it would solve the issue of how this is represented in any 
 routing table.

 How does that sit with your drivers?

nv40 (and perhaps r300 too?) appears to have 2 hardware back color
registers in the vertex shader that are automatically routed, so it
would probably be best to leave it that way.

Of course, a generic face-dependent routing table could be yet another
optional feature.
Does any API expose such a thing, perhaps in the form of unlimited
rather than 2 front/back colors? (other than by using FACE)



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Luca Barbieri
 DX9 semantic indexes are apparently unlimited

According to http://msdn.microsoft.com/en-us/library/ee418355%28VS.85%29.aspx,
this is not the case.

Here is the relevant text:

These semantics have meaning when attached to a vertex-shader
parameters. These semantics are supported in both Direct3D 9 and
Direct3D 10.
[...]
n is an optional integer between 0 and the number of resources
supported. For example, POSITION0, TEXCOOR1, etc.
[...]
These semantics have meaning when attached to a pixel-shader input
parameter. These semantics are supported in both Direct3D 9 and
Direct3D 10.
[...]
n is an optional integer between 0 and the number of resources
supported. For example, PSIZE0, COLOR1, etc.


Thus, neither DX9 nor DX10 needs arbitrary indices.
OpenGL doesn't either, as fragment.texcoord[i] has i < GL_MAX_TEXTURE_COORDS_ARB.

It seems to make sense to follow those APIs in the design of Gallium semantics.



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Keith Whitwell
Luca,

I haven't tried to probe crazy high numbers, but within reason, my experience 
is that the numbers are unconstrained.   Certainly, within the range you're 
suggesting for gallium, there is no constraint in DX9.   No doubt where there 
is a system-interpreted meaning attached to a semantic, that meaning will 
impose an interpretation on the index and that will imply a limit on the 
semantic index.  For instance, in pixel shader outputs, COLOR[n] means a 
particular output is destined to be written to colorbuffer n.  Nobody is saying 
there isn't a limit on the number of bound colorbuffers.   By implication, the 
same limit already exists in gallium.

Now, your particular hardware has an additional limitation which is fairly 
unique, and you're pushing a change to gallium which would mimic the 
restrictions of your hardware.  I'm not actually interested in adjusting 
gallium to the constraints of one particular driver, but *am* quite interested 
in finding a way to improve linkage issues across the hardware we support.  

If you take a look at i965, I think you'll see that the change you're 
suggesting does nothing to avoid retranslating vertex shaders on that platform. 
 Likewise the software rasterizers and any driver relying on the draw module 
are currently jumping through hoops to emulate a routing table, which wouldn't 
be improved by your change.  But your change does dramatically alter the 
meaning of one part of gallium and introduces a new raft of hardware 
capabilities we'd have to be checking and respecting in every state tracker.

If we are going to adjust gallium, let's figure out a way to improve linkage 
generally.  Adding a per-driver, per-semantic maximum index query just for the 
benefit of one driver doesn't strike me as a good trade-off.  

Keith









Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Christoph Bumiller
On 01.02.2010 17:31, Keith Whitwell wrote:
 Christoph,  Luca,

 Twoside lighting is a bit of a special case GL-ism.  On a lot of hardware 
 we end up implementing it by passing both front and back colors to the 
 fragment shader and selecting between them using the FACE variable.  If we 
 removed the implicit fixed-function support for two-side lighting in the 
 rasterizer, it would solve the issue of how this is represented in any 
 routing table.  

 How does that sit with your drivers?

 Keith


   
It would work, if the COLOR semantic is completely ignored, i.e.
I would appreciate the insertion of clamping instructions on the
st side (I suspect earlier cards will not have 4 front color registers
so clamping will go away for their back colors too ...).

I can only select 2 x 8 consecutive scalar values in the routing table
to be clamped, and only 1 x 8 will get through to the fragment shader.

I'll not be happy to insert clamping manually, but I can do so if
not having the st do it turns out to be the best solution.

It's a bit of a waste not to use that hw cap though ... otoh not many
apps will use two sided lighting nowadays I suppose.
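
The approach discussed above (pass both colors to the fragment stage, pick one using FACE, and clamp in the shader rather than in fixed-function hardware) can be sketched roughly as follows. This is only an illustration of the idea, not code from any driver or state tracker; the names `clamp01` and `select_color` are hypothetical.

```python
def clamp01(x):
    """Clamp a scalar to [0, 1], as fixed-function color clamping would."""
    return max(0.0, min(1.0, x))


def select_color(front, back, is_front_face, twoside_enabled, clamp=True):
    """Return the color a fragment shader would use.

    front, back:     4-component colors written by the vertex shader.
    is_front_face:   stand-in for the FACE fragment input.
    twoside_enabled: stand-in for the light_twoside rasterizer flag.
    """
    # With two-sided lighting off, the back color is never selected.
    chosen = front if (is_front_face or not twoside_enabled) else back
    if clamp:
        # Clamping inserted by the state tracker instead of the hardware.
        chosen = tuple(clamp01(c) for c in chosen)
    return chosen
```

For a back-facing fragment with two-sided lighting enabled, the back color is selected and each component is clamped; with it disabled, the front color is always used.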
 
 From: luca.barbi...@gmail.com [luca.barbi...@gmail.com] On Behalf Of Luca 
 Barbieri [l...@luca-barbieri.com]
 Sent: Monday, February 01, 2010 7:29 AM
 To: Christoph Bumiller
 Cc: Keith Whitwell; mesa3d-dev@lists.sourceforge.net
 Subject: Re: [Mesa3d-dev] [PATCH] glsl: put varyings in texcoord slots

   
 I can't really use a routing table state to produce a cso, because the hw
 routing table I generate depends on rasterizer state, e.g. I must not
 put in back face colour (we have a 2 to 1 mapping here) if twoside
 is disabled.

 Also, I'm routing based on the scalar *components* the FP reads,
 not whole TGSI pseudo vec4 registers (NUM_INTERPOLATORS will
 thus be inaccurate) - set_routing_table will have to pass me the
 respective programs too.
 Well, I can still use the cso and insert it into the rest of the routing
 table that still need to be assembled on the fly, I did that before the
 1:1 mapping between FP and VP regs was removed.
 
 You are right, the routing table CSO needs to contain the fragment and
 vertex shader handles, and ideally light_twoside should be moved to
 the vertex-fragment routing table since it is really an attribute of
 that and not polygon rasterization/setup.

 You can then just look at your internal data structure and construct a
 scalar routing table from the vec4 one provided by Gallium.

 We could also, as a further extension, support scalar routing tables
 directly in Gallium.
 Note however that radeon hardware presumably only supports vector
 ones, so we would need all 3 options with caps.
 A further intermediate step could be vector routing tables with swizzling.
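
As a rough illustration of building a scalar routing table from a vec4 one, the sketch below expands each vec4 route into per-component routes, keeping only the components the fragment program reads. The data layout and the name `build_scalar_routing` are assumptions made for the example, not an existing Gallium interface.

```python
def build_scalar_routing(vec4_routes, fp_read_masks):
    """Expand a vec4 routing table into per-component routes.

    vec4_routes:   dict mapping fragment input register -> vertex output register.
    fp_read_masks: dict mapping fragment input register -> string holding the
                   components ("x", "y", "z", "w") the fragment program reads.
    Returns a list of (vp_reg, component, hw_slot) tuples, with hardware
    slots handed out consecutively starting from 0.
    """
    scalar_routes = []
    slot = 0
    for fp_reg in sorted(vec4_routes):
        vp_reg = vec4_routes[fp_reg]
        for comp in "xyzw":
            # Components the fragment program never reads get no slot.
            if comp in fp_read_masks.get(fp_reg, ""):
                scalar_routes.append((vp_reg, comp, slot))
                slot += 1
    return scalar_routes
```

Because unread components are skipped, the number of slots used can be smaller than four times the number of vec4 registers, which is why a vec4-based interpolator count would be inaccurate for such hardware.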
   




Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Luca Barbieri
 I haven't tried to probe crazy high numbers, but within reason, my experience 
 is that the numbers are unconstrained.

No, according to that document if you use TEXCOORD[n] then n < NUM_TEXCOORDS.


TEXCOORD[n] Texture coordinates float4
[...]
n is an optional integer between 0 and the number of resources
supported. For example, POSITION0, TEXCOOR1, etc.


Also look at the spec for ARB_fragment_program:

<fragAttribItem>   ::= "color" <optColorType>
                     | "texcoord" <optTexCoordNum>
[...]
<optTexCoordNum>   ::= /* empty */
                     | "[" <texCoordNum> "]"
<optColorType>     ::= /* empty */
                     | "." "primary"
                     | "." "secondary"

<texCoordNum>      ::= <integer> from 0 to MAX_TEXTURE_COORDS_ARB-1

fragment.texcoord has the index limited by MAX_TEXTURE_COORDS_ARB.

It seems to me pretty clear from the above references that *all* 3D
APIs (i.e. DX9, DX10 and GL) have semantic indices in the range
0...N-1 where N is the limit appropriate for the specific semantic.
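
To make the 0...N-1 rule concrete, here is a minimal sketch of a per-semantic range check. The limits in `MAX_INDEX` are made-up example values, not numbers taken from any API or driver, and `check_semantic` is a hypothetical helper.

```python
# Hypothetical per-semantic limits (N for each semantic); real values
# would come from the API or the driver.
MAX_INDEX = {"COLOR": 2, "FOG": 1, "PSIZE": 1, "TEXCOORD": 8}


def check_semantic(name, index, limits=MAX_INDEX):
    """Return True if `index` lies in 0..N-1 for the given semantic."""
    if name not in limits:
        raise KeyError("unknown semantic: " + name)
    return 0 <= index < limits[name]
```

With these example limits, TEXCOORD[7] is accepted and TEXCOORD[8] is rejected.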

I think these references contradict your hypothesis that there is no
constraint in DX9.

Am I misunderstanding something completely?

Do you disagree with the fact that those references clearly show that
semantic indices are limited by hardware resources?


 Now, your particular hardware has an additional limitation which is fairly 
 unique, and you're pushing a change to gallium which would mimic the 
 restrictions of your hardware.  I'm not actually interested in adjusting 
 gallium to the constraints of one particular driver, but *am* quite 
 interested in finding a way to improve linkage issues across the hardware we 
 support.

No, it is a limitation that any hardware that does a direct
implementation of OpenGL has.
For instance, I'd guess that the VMWare driver works around that
problem somewhere since it ultimately uses the host OpenGL
implementation.
With my proposal, it could just convert GENERIC[0] into
fragment.texcoord[0] and likewise for others.
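
The difference between the two situations can be sketched as follows: with bounded indices the conversion is an identity mapping, while unbounded indices force a software remap into consecutive slots. This is an illustration under assumed names (`translate_generics` is hypothetical) and does not describe how any existing driver actually works.

```python
def translate_generics(used_indices, max_texture_coords):
    """Map non-negative GENERIC indices to ARB texcoord names.

    Returns (mapping, remapped), where `remapped` tells whether a software
    remap was needed because some index fell outside 0..max-1.
    """
    used = sorted(set(used_indices))
    if len(used) > max_texture_coords:
        raise ValueError("more varyings than texcoord slots")
    if all(i < max_texture_coords for i in used):
        # Bounded indices: identity conversion, no lookup table required.
        return {i: "fragment.texcoord[%d]" % i for i in used}, False
    # Unbounded indices: compact them into consecutive slots.
    return {i: "fragment.texcoord[%d]" % slot for slot, i in enumerate(used)}, True
```

In the second branch the vertex and fragment shaders must agree on the compacted slots, so the remap has to be computed with knowledge of both shaders.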


 But your change does dramatically alter the meaning of one part of gallium 
 and introduces a new raft of hardware capabilities we'd have to be checking 
 and respecting in every state tracker.

Not at all.
All code except the GLSL linker will work optimally as is, since it
uses indices sequentially starting from 0 (or implements an API that
does).
I provided a patch to fix the GLSL linker.


 If we are going to adjust gallium, let's figure out a way to improve linkage 
 generally.  Adding a per-driver, per-semantic maximum index query just for 
 the benefit of one driver doesn't strike me as a good trade-off.

Such a query is already necessary to implement glGet of
GL_MAX_TEXTURE_COORDS_ARB, for the TEXCOORD capability.



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Keith Whitwell
Luca,

Where the semantic indicates some relationship to actual system resources, I 
agree that the number is constrained by the number of those system resources.  
In the case of the gallium GENERIC semantic, there is explicitly no system 
resource that semantic is referring to and hence no limit on the index.

I feel like we're going in circles here.  We agree that we want to improve 
linkage, you have a patch that helps your driver, but please accept that it 
doesn't solve the wider problem. 

Keith



Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Luca Barbieri
 Where the semantic indicates some relationship to actual system resources, I 
 agree that the number is constrained by the number of those system resources. 
  In the case of the gallium GENERIC semantic, there is explicitly no system 
 resource that semantic is referring to and hence no limit on the index.

GENERIC[i] refers to a slot in the output register file of the vertex
shader or a slot in the input register file of the fragment shader.
It also refers to the interpolator unit that interpolates data between
those two registers.

Since interpolators are usually available in a finite number and
register files also are usually physically limited, there should be a
limit on the index.

 I feel like we're going in circles here.  We agree that we want to improve 
 linkage, you have a patch that helps your driver, but please accept that it 
 doesn't solve the wider problem.

Yes.
In the following I try to write out my reasoning step by step, in the
hope of making it clearer and making it easier to both establish which
facts we agree are true and pinpoint what we may disagree on.

This is a list of steps that lead me to the conclusion that it is
best to change the Gallium rules so that semantic indices must be in
the range 0..N-1, where N is the maximum number of simultaneously
available registers with that semantic, apply my GLSL patch to fix
GLSL, and after doing that, consider extending Gallium by letting the
user specify a routing table to link these limited index semantics
with something other than an identity mapping.

Please tell me which points you find incorrect, or why any
deduction does not follow from the antecedents.

1. All the semantic indices in OpenGL are limited, according to the
ARB specification
2. All the semantic indices in DirectX 9/10 are limited, according to
http://msdn.microsoft.com/en-us/library/ee418355%28VS.85%29.aspx
3. In the OpenGL/DirectX 9/10 model, there are a fixed number of
interpolators, numbered from 0 to N - 1. Interpolator K reads from
vertex shader output register K, interpolates and writes to fragment
shader input register K.
4. Some cards (e.g. r300), but not all, allow configuring the vertex
shader output register and fragment shader input register that
interpolator K reads and writes.
5. Such register inputs are usually offsets in a physical register
file, and thus are limited to the physical size of that register file
6. No API exposes the functionality in point 4 and all expose the more
rigid model in (3.)
7. Gallium GENERIC is equivalent to OpenGL texcoord and DirectX 9/10
TEXCOORD semantics
8. texcoord is called this way because of historical reasons, since
fixed pipelines could use the values only for texture sampling.
GENERIC is called GENERIC instead of TEXCOORD because Gallium was
designed with a programmable pipeline in mind.
9. The current Mesa implementation of ARB_fp/vp translates texcoord[i]
to GENERIC[i]
10. fragment.texcoord[K] has K limited by GL_MAX_TEXTURE_COORDS_ARB
11. Because of (9.) and (10.), the current Mesa implementation of
ARB_vp/fp uses GENERIC indices limited by GL_MAX_TEXTURE_COORDS_ARB
(perhaps plus a very small constant)
12. Because of (2.), a straightforward Gallium DirectX state tracker
would also use GENERIC indices limited by the number of interpolators
13. If GLSL did not reserve semantic indices for unused gl_TexCoord[]
varyings, but allocated varyings sequentially, then it would use
semantic indices sequentially starting from 0
14. My patch implements (13.)
15. The xorg, vega and g3dvl state trackers use GENERIC indices
starting from 0 up to 1, 1 and 7 respectively
16. Because of (11.), (12.), (13.), (14.) and (15.), after applying my
patch, limiting the value of GENERIC semantic indices to the number of
interpolators would not adversely affect Mesa/Gallium functionality in
any way, probably including the VMware DirectX state tracker
17. Driver code would be simplified by not having to worry about any
register semantic remapping. It will be possible to separately compile
fragment and vertex shaders on all hardware. The CPU usage of all
drivers will be reduced, especially when switching shaders (a fast
path!)
18. Thus, (16.) is a net gain for Gallium, and should go forward

[Note: my current nv40 tree does exactly what (16.) describes, and this
does not seem to be a source of any problem]
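
The sequential allocation described in point (13.) can be sketched as below. This is only an illustration of the idea and is not the patch itself; `allocate_varyings` and its input format are hypothetical.

```python
def allocate_varyings(varyings):
    """Assign GENERIC indices 0, 1, 2, ... to the varyings that are used.

    varyings: iterable of (name, is_used) pairs in declaration order.
    Unused varyings (for example unused gl_TexCoord[] entries) reserve
    no index, so the indices handed out are always consecutive.
    """
    slots = {}
    for name, is_used in varyings:
        if is_used and name not in slots:
            slots[name] = len(slots)
    return slots
```

With this scheme the highest index in use is always one less than the number of used varyings, so it stays within the interpolator count whenever the shader fits the hardware at all.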

Points that lead me to propose a routing table CSO *IN ADDITION* to
applying my GLSL patch:

19. Some current 3D APIs (ARB_fp/vp, DX PS,
EXT_separate_shader_objects) link vertex and fragment shaders by
matching physical register file offset, limited to index N - 1 where N
is the maximum number of usage variables (see (3.))
20. Other 3D APIs (GLSL) link by matching variable name. This forces
unextended GLSL to require that both the fragment and vertex shaders
be provided at once in the link step
21. No API links by matching abstract unlimited variable number,
except some Gallium driver interfaces such as r300
22. It would be 

Re: [Mesa3d-dev] light_twoside RE: [PATCH] glsl: put varyings in texcoord slots

2010-02-01 Thread Luca Barbieri
A possible limitation of this scheme is that it doesn't readily map to
hardware that can dynamically configure its own interpolators to behave
as GENERIC, COLOR, or some other semantic.

However, it seems to me that at least ARB_fragment_program only
requires and supports 2 COLOR registers (primary and secondary), 1 FOG
register and 1 PSIZE register.

I'm not sure if any API can support more than 2 COLOR, 1 FOG and 1
PSIZE register as vertex shader outputs/fragment shader inputs (note
that this is totally different from COLOR as a fragment shader output,
where each COLOR semantic maps to a different render target), and thus
I'm not sure if such functionality is useful.

A driver with that functionality wishing to let an application use
more than 2 COLORs is however free to do remapping in the driver even
under my proposal. It just doesn't _have_ to do it.
