Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-17 Thread Christoph Bumiller
On 17.12.2010 17:54, Marek Olšák wrote:
> On Fri, Dec 17, 2010 at 4:32 PM, Brian Paul wrote:
>
> Christoph,
>
> I don't see a patch for the st/mesa program translation code to check
> that we don't exceed the limit.  Were you going to take care of
> that too?
>
I didn't plan to for now, at least nothing beyond making the state
tracker return an error if possible, and removing/modifying a certain
comment mentioned below.

> I guess we're assuming that the max number of generic inputs == max
> number of generic outputs.  I think that's OK until a counter case
> appears.
>
>
> The way I understand it is that the max number of generic outputs is
> equal to the max number of generic inputs in the next shader stage
> (the same logic applies to some other shader caps too). I guess we
> need to use get_param to determine which shader stages are supported
> by the driver to know which one is next. The name
The problem is that (apart from the linked GL program case) you cannot
know which stage is next until validation time.
You have the same problem with the existing
PIPE_SHADER_CAP_MAX_INPUTS/OUTPUTS - nv50's vertex shaders can output
more variables to geometry shaders than they can to fragment shaders.
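
As a rough illustration of why the unknown next stage matters, here is a
minimal sketch of how a state tracker could pick a conservative output limit
when it translates a shader without knowing its consumer. The stage enum, the
stage_limits structure and conservative_output_limit() are hypothetical names
made up for this example; they are not Gallium API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stage list, ordered from first to last in the pipeline. */
enum stage { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT, STAGE_COUNT };

/* Hypothetical per-stage limits a driver might report. */
struct stage_limits {
   unsigned max_generic_in[STAGE_COUNT];
   unsigned max_generic_out[STAGE_COUNT];
};

static unsigned
min_u(unsigned a, unsigned b)
{
   return a < b ? a : b;
}

/* Conservative output limit for `producer` when the consumer is unknown:
 * its own output limit, clamped by the input limit of every later stage
 * that could end up bound next. */
static unsigned
conservative_output_limit(const struct stage_limits *lim, enum stage producer)
{
   assert(lim != NULL);

   unsigned limit = lim->max_generic_out[producer];
   for (int s = (int)producer + 1; s < STAGE_COUNT; s++)
      limit = min_u(limit, lim->max_generic_in[s]);
   return limit;
}
```

The cost of this approach is that a vertex shader feeding a geometry shader is
held to the tighter fragment shader limit even when it would not need to be.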

Maybe MAX_GENERIC_INDEX should be a non-shader specific cap - for nvc0
the value is the same everywhere, and for hardware that only has VP and
FP as well.

> *PIPE_SHADER_CAP_MAX_GENERIC_INPUT_INDEX* would be less ambiguous
> (still not perfect though).
>
I thought about using something even more verbose, like
PIPE_SHADER_CAP_MAX_GENERIC_INPUT_SEMANTIC_INDEX.

> However I don't believe in usefulness of this new cap, at least not
> without some serious state tracker work. I don't consider failing to
> translate a shader if some CAP is too low particularly useful.
>
The use of the cap is to prevent state tracker writers from thinking
they're free to use GENERIC[0,96,8911] or whatever random numbers they
like and rely on pipe drivers to ensure at all costs that linkage will
be correct.
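
To make the intent concrete, a state tracker could check its declarations
against the driver-reported limit before handing the shader over. This is only
a sketch: sem_name, sem_decl and generic_indices_within_cap() are made-up
stand-ins, not the real TGSI declaration structures, and max_generic_index is
whatever value the proposed cap would return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical semantic names; not the real TGSI enum. */
enum sem_name { SEM_POSITION, SEM_COLOR, SEM_GENERIC };

/* Hypothetical stand-in for one shader input/output declaration. */
struct sem_decl {
   enum sem_name name;
   unsigned index;
};

/* Returns true if every GENERIC declaration uses an index that does not
 * exceed max_generic_index. Non-GENERIC semantics are ignored. */
static bool
generic_indices_within_cap(const struct sem_decl *decls, size_t count,
                           unsigned max_generic_index)
{
   assert(decls != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      if (decls[i].name == SEM_GENERIC && decls[i].index > max_generic_index)
         return false;
   }
   return true;
}
```

With a limit of 31, declarations using GENERIC[0] and GENERIC[31] pass, while
a set containing GENERIC[96] or GENERIC[8911] is rejected.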

In mesa/st I see the comment
/* Actually, let's try and zero-base this just for
 * readability of the generated TGSI.
 */
So I guess someone thought it would be OK to start at some unspecified
high index. Such random behaviour makes it really hard to support
ARB_separate_shader_objects features sanely (something Gallium assumed
pipe drivers would be able to do from the start anyway).

---

So, maybe we can do without this cap. Maybe it would be better to just
mandate that the GENERIC index be less than
PIPE_SHADER_CAP_MAX_INPUTS/OUTPUTS after all.

Christoph

> (posting to mesa-dev as well)
>
> Marek
>
>
> -Brian
>
>
> On 12/17/2010 05:28 AM, Keith Whitwell wrote:
> > Christoph,
> >
> > This looks good.  Thanks for bringing this back to life.
> >
> > Keith
> >
> > On Thu, 2010-12-16 at 07:47 -0800, Christoph Bumiller wrote:
> >> On 12/14/2010 12:36 PM, Keith Whitwell wrote:
> >>> On Mon, 2010-12-13 at 12:01 -0800, Christoph Bumiller wrote:
> >>>> I want to warm this up again adding nvc0 and
> >>>> GL_ARB_separate_shader_objects to the picture.
> >>>>
> >>>> The latter extends GL_EXT_separate_shader_objects to support user
> >>>> defined varyings and guarantees well defined behaviour only if
> >>>> - varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
> >>>> the blocks match exactly in name, type, qualification, and (most
> >>>> significantly) declaration order.
> >>>> - varyings are assigned matching location qualifiers,
> >>>> like: layout(location = 3) in vec4 normal
> >>>> "The number of input locations available to a shader is limited."
> >>>>
> >>>> So, I propose to (loosely) identify GENERIC semantic indices with these
> >>>> location qualifiers and let the pipe driver set a limit on the allowed
> >>>> maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
> >>>> for at least 219 of them (nvc0 offers 0x200 bytes for generic
> >>>> inputs/outputs).
> >>>
> >>> This sounds fine actually.  We kicked this around before & I was
> >>> basically ok with the last iteration of the proposal, but this seems ok
> >>> too.
> >>>
> >>> As far as I can tell from a gallium perspective you're really just
> >>> proposing a new pipe cap _MAX_INPUTS (actually _MAX_GENERIC_INDEX would
> >>> be clearer), which the state tracker thereafter has to respect?
> >>>
> >>> That would be fine with me.
> >> First attempt at a patch introducing such a cap attached.
> >>
> >>>
> >>>> My motivation is mostly that the hardware routing table for shader
> >>>> varyings that was present on nv50 has been removed with nvc0 (Fermi).
> >>>> And I'm glad, because filling 4 routing tables (since we have 5 shader
> >>>> types now) is somewhat annoying. And so applying relocatio

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-17 Thread Marek Olšák
On Fri, Dec 17, 2010 at 4:32 PM, Brian Paul wrote:

> Christoph,
>
> I don't see a patch for the st/mesa program translation code to check
> that we don't exceed the limit.  Were you going to take care of that too?
>
> I guess we're assuming that the max number of generic inputs == max
> number of generic outputs.  I think that's OK until a counter case
> appears.
>

The way I understand it is that the max number of generic outputs is equal
to the max number of generic inputs in the next shader stage (the same logic
applies to some other shader caps too). I guess we need to use get_param to
determine which shader stages are supported by the driver to know which one
is next. The name *PIPE_SHADER_CAP_MAX_GENERIC_INPUT_INDEX* would be less
ambiguous (still not perfect though).

However I don't believe in usefulness of this new cap, at least not without
some serious state tracker work. I don't consider failing to translate a
shader if some CAP is too low particularly useful.

(posting to mesa-dev as well)

Marek


> -Brian
>
>
> On 12/17/2010 05:28 AM, Keith Whitwell wrote:
> > Christoph,
> >
> > This looks good.  Thanks for bringing this back to life.
> >
> > Keith
> >
> > On Thu, 2010-12-16 at 07:47 -0800, Christoph Bumiller wrote:
> >> On 12/14/2010 12:36 PM, Keith Whitwell wrote:
> >>> On Mon, 2010-12-13 at 12:01 -0800, Christoph Bumiller wrote:
> >>>> I want to warm this up again adding nvc0 and
> >>>> GL_ARB_separate_shader_objects to the picture.
> >>>>
> >>>> The latter extends GL_EXT_separate_shader_objects to support user
> >>>> defined varyings and guarantees well defined behaviour only if
> >>>> - varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
> >>>> the blocks match exactly in name, type, qualification, and (most
> >>>> significantly) declaration order.
> >>>> - varyings are assigned matching location qualifiers,
> >>>> like: layout(location = 3) in vec4 normal
> >>>> "The number of input locations available to a shader is limited."
> >>>>
> >>>> So, I propose to (loosely) identify GENERIC semantic indices with these
> >>>> location qualifiers and let the pipe driver set a limit on the allowed
> >>>> maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
> >>>> for at least 219 of them (nvc0 offers 0x200 bytes for generic
> >>>> inputs/outputs).
> >>>
> >>> This sounds fine actually.  We kicked this around before & I was
> >>> basically ok with the last iteration of the proposal, but this seems ok
> >>> too.
> >>>
> >>> As far as I can tell from a gallium perspective you're really just
> >>> proposing a new pipe cap _MAX_INPUTS (actually _MAX_GENERIC_INDEX would
> >>> be clearer), which the state tracker thereafter has to respect?
> >>>
> >>> That would be fine with me.
> >> First attempt at a patch introducing such a cap attached.
> >>
> >>>
> >>>> My motivation is mostly that the hardware routing table for shader
> >>>> varyings that was present on nv50 has been removed with nvc0 (Fermi).
> >>>> And I'm glad, because filling 4 routing tables (since we have 5 shader
> >>>> types now) is somewhat annoying. So is applying relocations to shaders:
> >>>> it can be done, it's probably not too time consuming, but it's just
> >>>> plain *unnecessary* (and thus stupid) for OpenGL.
> >>>>
> >>>> Now about d3d9 ...
> >>>> 1. don't care, I don't see a d3d9 state tracker
> >>>> 2. http://msdn.microsoft.com/en-us/library/bb509647%28v=VS.85%29.aspx
> >>>> says "n is an optional integer between 0 and the number of resources
> >>>> supported" - what "supported" means here isn't clear to me, but, I
> >>>> didn't find any example where someone used something OpenGL doesn't have
> >>>> (like COLOR2).
> >>>> 3.
> >>>> http://msdn.microsoft.com/en-us/library/bb944006%28v=vs.85%29.aspx#Varying_Shader_Inputs_and_Semantics
> >>>> says "Input semantics are similar to the values in the D3DDECLUSAGE."
> >>>> and
> >>>> DECLUSAGE sounds like you're limited to sane values.
> >>>
> >>> I think you're on the right track with (1)...  It's fairly pointless
> >>> trying to discuss code here which isn't public & I don't think people
> >>> need to be worrying about what may or may not be important for code they
> >>> can't see.
> >>>
> >>> I know this idea previously got tied up with speculation about what a
> >>> DX9 state tracker might or might not require, but in retrospect I wish
> >>> I'd been able to steer conversation away from that.
> >>>
> >>> The work on closed components may drive a lot of the feature development
> >>> and new interfaces, but there's usually enough flexibility that this
> >>> sort of cleanup isn't a big deal.
> >>>
> >>>
> >>> Keith
> >>>
> >>>> Not sure if anyone wants to think about this issue at this time (since
> >>>> implementation of ARB_separate_shader_objects is probably far in the GL4
> >>>> future), but I'd be happy about any comments.
> >>>>
> >>>> Regards,
> >>>> Christoph
> >>>>
> >>>> On 04/13/2010 12:55 PM, Luca Barbieri wrote:
> >>>>> This patc

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-17 Thread Brian Paul
Christoph,

I don't see a patch for the st/mesa program translation code to check 
that we don't exceed the limit.  Were you going to take care of that too?

I guess we're assuming that the max number of generic inputs == max 
number of generic outputs.  I think that's OK until a counter case 
appears.

-Brian


On 12/17/2010 05:28 AM, Keith Whitwell wrote:
> Christoph,
>
> This looks good.  Thanks for bringing this back to life.
>
> Keith
>
> On Thu, 2010-12-16 at 07:47 -0800, Christoph Bumiller wrote:
>> On 12/14/2010 12:36 PM, Keith Whitwell wrote:
>>> On Mon, 2010-12-13 at 12:01 -0800, Christoph Bumiller wrote:
>>>> I want to warm this up again adding nvc0 and
>>>> GL_ARB_separate_shader_objects to the picture.
>>>>
>>>> The latter extends GL_EXT_separate_shader_objects to support user
>>>> defined varyings and guarantees well defined behaviour only if
>>>> - varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
>>>> the blocks match exactly in name, type, qualification, and (most
>>>> significantly) declaration order.
>>>> - varyings are assigned matching location qualifiers,
>>>> like: layout(location = 3) in vec4 normal
>>>> "The number of input locations available to a shader is limited."
>>>>
>>>> So, I propose to (loosely) identify GENERIC semantic indices with these
>>>> location qualifiers and let the pipe driver set a limit on the allowed
>>>> maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
>>>> for at least 219 of them (nvc0 offers 0x200 bytes for generic
>>>> inputs/outputs).
>>>
>>> This sounds fine actually.  We kicked this around before & I was
>>> basically ok with the last iteration of the proposal, but this seems ok
>>> too.
>>>
>>> As far as I can tell from a gallium perspective you're really just
>>> proposing a new pipe cap _MAX_INPUTS (actually _MAX_GENERIC_INDEX would
>>> be clearer), which the state tracker thereafter has to respect?
>>>
>>> That would be fine with me.
>> First attempt at a patch introducing such a cap attached.
>>
>>>
>>>> My motivation is mostly that the hardware routing table for shader
>>>> varyings that was present on nv50 has been removed with nvc0 (Fermi).
>>>> And I'm glad, because filling 4 routing tables (since we have 5 shader
>>>> types now) is somewhat annoying. So is applying relocations to shaders:
>>>> it can be done, it's probably not too time consuming, but it's just
>>>> plain *unnecessary* (and thus stupid) for OpenGL.
>>>>
>>>> Now about d3d9 ...
>>>> 1. don't care, I don't see a d3d9 state tracker
>>>> 2. http://msdn.microsoft.com/en-us/library/bb509647%28v=VS.85%29.aspx
>>>> says "n is an optional integer between 0 and the number of resources
>>>> supported" - what "supported" means here isn't clear to me, but, I
>>>> didn't find any example where someone used something OpenGL doesn't have
>>>> (like COLOR2).
>>>> 3.
>>>> http://msdn.microsoft.com/en-us/library/bb944006%28v=vs.85%29.aspx#Varying_Shader_Inputs_and_Semantics
>>>> says "Input semantics are similar to the values in the D3DDECLUSAGE."
>>>> and
>>>> DECLUSAGE sounds like you're limited to sane values.
>>>
>>> I think you're on the right track with (1)...  It's fairly pointless
>>> trying to discuss code here which isn't public & I don't think people
>>> need to be worrying about what may or may not be important for code they
>>> can't see.
>>>
>>> I know this idea previously got tied up with speculation about what a
>>> DX9 state tracker might or might not require, but in retrospect I wish
>>> I'd been able to steer conversation away from that.
>>>
>>> The work on closed components may drive a lot of the feature development
>>> and new interfaces, but there's usually enough flexibility that this
>>> sort of cleanup isn't a big deal.
>>>
>>>
>>> Keith
>>>
>>>> Not sure if anyone wants to think about this issue at this time (since
>>>> implementation of ARB_separate_shader_objects is probably far in the GL4
>>>> future), but I'd be happy about any comments.
>>>>
>>>> Regards,
>>>> Christoph
>>>>
>>>> On 04/13/2010 12:55 PM, Luca Barbieri wrote:
>>>>> This patch series is intended to resolve the issue of semantic-based
>>>>> shader linkage in Gallium.
>>>>> It can also be found in the RFC-gallium-semantics branch.
>>>>>
>>>>> It does not change the current Gallium design, but rather formalizes some
>>>>> limitations to it, and provides infrastructure to implement this model
>>>>> more easily in drivers, along with a full nv30/nv40 implementation.
>>>>>
>>>>> These limitations are added to allow an efficient implementation for both
>>>>> hardware lacking special support and hardware having support but also
>>>>> special constraints.
>>>>>
>>>>> Note that this does NOT resolve all issues, and there is quite a bit
>>>>> left for future refinement.
>>>>>
>>>>> In particular, the following issues are still open:
>>>>> 1. COLOR clamping (and floating point framebuffers)
>>>>> 2. A linkage table CSO allowing to specify non-identity linkage
>>>>> 3. BCOLOR/F

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-17 Thread Keith Whitwell
Christoph,

This looks good.  Thanks for bringing this back to life.

Keith

On Thu, 2010-12-16 at 07:47 -0800, Christoph Bumiller wrote:
> On 12/14/2010 12:36 PM, Keith Whitwell wrote:
> > On Mon, 2010-12-13 at 12:01 -0800, Christoph Bumiller wrote:
> >> I want to warm this up again adding nvc0 and
> >> GL_ARB_separate_shader_objects to the picture.
> >>
> >> The latter extends GL_EXT_separate_shader_objects to support user
> >> defined varyings and guarantees well defined behaviour only if
> >> - varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
> >> the blocks match exactly in name, type, qualification, and (most
> >> significantly) declaration order.
> >> - varyings are assigned matching location qualifiers,
> >> like: layout(location = 3) in vec4 normal
> >> "The number of input locations available to a shader is limited."
> >>
> >> So, I propose to (loosely) identify GENERIC semantic indices with these
> >> location qualifiers and let the pipe driver set a limit on the allowed
> >> maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
> >> for at least 219 of them (nvc0 offers 0x200 bytes for generic
> >> inputs/outputs).
> > 
> > This sounds fine actually.  We kicked this around before & I was
> > basically ok with the last iteration of the proposal, but this seems ok
> > too.
> > 
> > As far as I can tell from a gallium perspective you're really just
> > proposing a new pipe cap _MAX_INPUTS (actually _MAX_GENERIC_INDEX would
> > be clearer), which the state tracker thereafter has to respect?
> > 
> > That would be fine with me.
> First attempt at a patch introducing such a cap attached.
> 
> > 
> >> My motivation is mostly that the hardware routing table for shader
> >> varyings that was present on nv50 has been removed with nvc0 (Fermi).
> >> And I'm glad, because filling 4 routing tables (since we have 5 shader
> >> types now) is somewhat annoying. So is applying relocations to shaders:
> >> it can be done, it's probably not too time consuming, but it's just
> >> plain *unnecessary* (and thus stupid) for OpenGL.
> >>
> >> Now about d3d9 ...
> >> 1. don't care, I don't see a d3d9 state tracker
> >> 2. http://msdn.microsoft.com/en-us/library/bb509647%28v=VS.85%29.aspx
> >> says "n is an optional integer between 0 and the number of resources
> >> supported" - what "supported" means here isn't clear to me, but, I
> >> didn't find any example where someone used something OpenGL doesn't have
> >> (like COLOR2).
> >> 3.
> >> http://msdn.microsoft.com/en-us/library/bb944006%28v=vs.85%29.aspx#Varying_Shader_Inputs_and_Semantics
> >> says "Input semantics are similar to the values in the D3DDECLUSAGE."
> >> and
> >> DECLUSAGE sounds like you're limited to sane values.
> > 
> > I think you're on the right track with (1)...  It's fairly pointless
> > trying to discuss code here which isn't public & I don't think people
> > need to be worrying about what may or may not be important for code they
> > can't see.
> > 
> > I know this idea previously got tied up with speculation about what a
> > DX9 state tracker might or might not require, but in retrospect I wish
> > I'd been able to steer conversation away from that.
> > 
> > The work on closed components may drive a lot of the feature development
> > and new interfaces, but there's usually enough flexibility that this
> > sort of cleanup isn't a big deal.
> > 
> > 
> > Keith
> > 
> >> Not sure if anyone wants to think about this issue at this time (since
> >> implementation of ARB_separate_shader_objects is probably far in the GL4
> >> future), but I'd be happy about any comments.
> >>
> >> Regards,
> >> Christoph
> >>
> >> On 04/13/2010 12:55 PM, Luca Barbieri wrote:
> >>> This patch series is intended to resolve the issue of semantic-based 
> >>> shader linkage in Gallium.
> >>> It can also be found in the RFC-gallium-semantics branch.
> >>>
> >>> It does not change the current Gallium design, but rather formalizes some 
> >>> limitations to it, and provides infrastructure to implement this model 
> >>> more easily in drivers, along with a full nv30/nv40 implementation.
> >>>
> >>> These limitations are added to allow an efficient implementation for both 
> >>> hardware lacking special support and hardware having support but also 
> >>> special constraints.
> >>>
> >>> Note that this does NOT resolve all issues, and there is quite a bit
> >>> left for future refinement.
> >>>
> >>> In particular, the following issues are still open:
> >>> 1. COLOR clamping (and floating point framebuffers)
> >>> 2. A linkage table CSO allowing to specify non-identity linkage
> >>> 3. BCOLOR/FACE-related issues
> >>> 4. Adding a cap to inform the state tracker that more than 219 generic 
> >>> indices are provided
> >>>
> >>> This topic was already very extensively discussed.
> >>> See 
> >>> http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg10865.html
> >>>  for some early inconclusive discussion around an early implementation 
> >>> 

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-16 Thread Christoph Bumiller
On 12/14/2010 12:36 PM, Keith Whitwell wrote:
> On Mon, 2010-12-13 at 12:01 -0800, Christoph Bumiller wrote:
>> I want to warm this up again adding nvc0 and
>> GL_ARB_separate_shader_objects to the picture.
>>
>> The latter extends GL_EXT_separate_shader_objects to support user
>> defined varyings and guarantees well defined behaviour only if
>> - varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
>> the blocks match exactly in name, type, qualification, and (most
>> significantly) declaration order.
>> - varyings are assigned matching location qualifiers,
>> like: layout(location = 3) in vec4 normal
>> "The number of input locations available to a shader is limited."
>>
>> So, I propose to (loosely) identify GENERIC semantic indices with these
>> location qualifiers and let the pipe driver set a limit on the allowed
>> maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
>> for at least 219 of them (nvc0 offers 0x200 bytes for generic
>> inputs/outputs).
> 
> This sounds fine actually.  We kicked this around before & I was
> basically ok with the last iteration of the proposal, but this seems ok
> too.
> 
> As far as I can tell from a gallium perspective you're really just
> proposing a new pipe cap _MAX_INPUTS (actually _MAX_GENERIC_INDEX would
> be clearer), which the state tracker thereafter has to respect?
> 
> That would be fine with me.
First attempt at a patch introducing such a cap attached.

> 
>> My motivation is mostly that the hardware routing table for shader
>> varyings that was present on nv50 has been removed with nvc0 (Fermi).
>> And I'm glad, because filling 4 routing tables (since we have 5 shader
>> types now) is somewhat annoying. So is applying relocations to shaders:
>> it can be done, it's probably not too time consuming, but it's just
>> plain *unnecessary* (and thus stupid) for OpenGL.
>>
>> Now about d3d9 ...
>> 1. don't care, I don't see a d3d9 state tracker
>> 2. http://msdn.microsoft.com/en-us/library/bb509647%28v=VS.85%29.aspx
>> says "n is an optional integer between 0 and the number of resources
>> supported" - what "supported" means here isn't clear to me, but, I
>> didn't find any example where someone used something OpenGL doesn't have
>> (like COLOR2).
>> 3.
>> http://msdn.microsoft.com/en-us/library/bb944006%28v=vs.85%29.aspx#Varying_Shader_Inputs_and_Semantics
>> says "Input semantics are similar to the values in the D3DDECLUSAGE."
>> and
>> DECLUSAGE sounds like you're limited to sane values.
> 
> I think you're on the right track with (1)...  It's fairly pointless
> trying to discuss code here which isn't public & I don't think people
> need to be worrying about what may or may not be important for code they
> can't see.
> 
> I know this idea previously got tied up with speculation about what a
> DX9 state tracker might or might not require, but in retrospect I wish
> I'd been able to steer conversation away from that.
> 
> The work on closed components may drive a lot of the feature development
> and new interfaces, but there's usually enough flexibility that this
> sort of cleanup isn't a big deal.
> 
> 
> Keith
> 
>> Not sure if anyone wants to think about this issue at this time (since
>> implementation of ARB_separate_shader_objects is probably far in the GL4
>> future), but I'd be happy about any comments.
>>
>> Regards,
>> Christoph
>>
>> On 04/13/2010 12:55 PM, Luca Barbieri wrote:
>>> This patch series is intended to resolve the issue of semantic-based shader 
>>> linkage in Gallium.
>>> It can also be found in the RFC-gallium-semantics branch.
>>>
>>> It does not change the current Gallium design, but rather formalizes some 
>>> limitations to it, and provides infrastructure to implement this model more 
>>> easily in drivers, along with a full nv30/nv40 implementation.
>>>
>>> These limitations are added to allow an efficient implementation for both 
>>> hardware lacking special support and hardware having support but also 
>>> special constraints.
>>>
>>> Note that this does NOT resolve all issues, and there is quite a bit left
>>> for future refinement.
>>>
>>> In particular, the following issues are still open:
>>> 1. COLOR clamping (and floating point framebuffers)
>>> 2. A linkage table CSO allowing to specify non-identity linkage
>>> 3. BCOLOR/FACE-related issues
>>> 4. Adding a cap to inform the state tracker that more than 219 generic 
>>> indices are provided
>>>
>>> This topic was already very extensively discussed.
>>> See 
>>> http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg10865.html 
>>> for some early inconclusive discussion around an early implementation that 
>>> modified the GLSL linker (which is NOT being proposed here)
>>> See 
>>> http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12016.html 
>>> for some more discussion that seemed to mostly reach a consensus over the 
>>> approach proposed here.
>>> See in particular 
>>> http://www.mail-archive.com/mesa3d-dev@lists.sourcefor

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-14 Thread Keith Whitwell
On Mon, 2010-12-13 at 12:01 -0800, Christoph Bumiller wrote:
> I want to warm this up again adding nvc0 and
> GL_ARB_separate_shader_objects to the picture.
> 
> The latter extends GL_EXT_separate_shader_objects to support user
> defined varyings and guarantees well defined behaviour only if
> - varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
> the blocks match exactly in name, type, qualification, and (most
> significantly) declaration order.
> - varyings are assigned matching location qualifiers,
> like: layout(location = 3) in vec4 normal
> "The number of input locations available to a shader is limited."
> 
> So, I propose to (loosely) identify GENERIC semantic indices with these
> location qualifiers and let the pipe driver set a limit on the allowed
> maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
> for at least 219 of them (nvc0 offers 0x200 bytes for generic
> inputs/outputs).

This sounds fine actually.  We kicked this around before & I was
basically ok with the last iteration of the proposal, but this seems ok
too.

As far as I can tell from a gallium perspective you're really just
proposing a new pipe cap _MAX_INPUTS (actually _MAX_GENERIC_INDEX would
be clearer), which the state tracker thereafter has to respect?

That would be fine with me.

> My motivation is mostly that the hardware routing table for shader
> varyings that was present on nv50 has been removed with nvc0 (Fermi).
> And I'm glad, because filling 4 routing tables (since we have 5 shader
> types now) is somewhat annoying. So is applying relocations to shaders:
> it can be done, it's probably not too time consuming, but it's just
> plain *unnecessary* (and thus stupid) for OpenGL.
> 
> Now about d3d9 ...
> 1. don't care, I don't see a d3d9 state tracker
> 2. http://msdn.microsoft.com/en-us/library/bb509647%28v=VS.85%29.aspx
> says "n is an optional integer between 0 and the number of resources
> supported" - what "supported" means here isn't clear to me, but, I
> didn't find any example where someone used something OpenGL doesn't have
> (like COLOR2).
> 3.
> http://msdn.microsoft.com/en-us/library/bb944006%28v=vs.85%29.aspx#Varying_Shader_Inputs_and_Semantics
> says "Input semantics are similar to the values in the D3DDECLUSAGE."
> and
> DECLUSAGE sounds like you're limited to sane values.

I think you're on the right track with (1)...  It's fairly pointless
trying to discuss code here which isn't public & I don't think people
need to be worrying about what may or may not be important for code they
can't see.

I know this idea previously got tied up with speculation about what a
DX9 state tracker might or might not require, but in retrospect I wish
I'd been able to steer conversation away from that.

The work on closed components may drive a lot of the feature development
and new interfaces, but there's usually enough flexibility that this
sort of cleanup isn't a big deal.


Keith

> Not sure if anyone wants to think about this issue at this time (since
> implementation of ARB_separate_shader_objects is probably far in the GL4
> future), but I'd be happy about any comments.
> 
> Regards,
> Christoph
> 
> On 04/13/2010 12:55 PM, Luca Barbieri wrote:
> > This patch series is intended to resolve the issue of semantic-based shader 
> > linkage in Gallium.
> > It can also be found in the RFC-gallium-semantics branch.
> >
> > It does not change the current Gallium design, but rather formalizes some 
> > limitations to it, and provides infrastructure to implement this model more 
> > easily in drivers, along with a full nv30/nv40 implementation.
> >
> > These limitations are added to allow an efficient implementation for both 
> > hardware lacking special support and hardware having support but also 
> > special constraints.
> >
> > Note that this does NOT resolve all issues, and there is quite a bit left
> > for future refinement.
> >
> > In particular, the following issues are still open:
> > 1. COLOR clamping (and floating point framebuffers)
> > 2. A linkage table CSO allowing to specify non-identity linkage
> > 3. BCOLOR/FACE-related issues
> > 4. Adding a cap to inform the state tracker that more than 219 generic 
> > indices are provided
> >
> > This topic was already very extensively discussed.
> > See 
> > http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg10865.html 
> > for some early inconclusive discussion around an early implementation that 
> > modified the GLSL linker (which is NOT being proposed here)
> > See 
> > http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12016.html 
> > for some more discussion that seemed to mostly reach a consensus over the 
> > approach proposed here.
> > See in particular 
> > http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12041.html .
> >
> > That said, I'm going to try to repeat all information here, partially by 
> > copy&pasting from earlier messages.
> > This message should probably be adapted into gallium

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-12-13 Thread Christoph Bumiller
I want to warm this up again adding nvc0 and
GL_ARB_separate_shader_objects to the picture.

The latter extends GL_EXT_separate_shader_objects to support user
defined varyings and guarantees well defined behaviour only if
- varyings are declared inside the gl_PerVertex/gl_PerFragment block, and
the blocks match exactly in name, type, qualification, and (most
significantly) declaration order.
- varyings are assigned matching location qualifiers,
like: layout(location = 3) in vec4 normal
"The number of input locations available to a shader is limited."

So, I propose to (loosely) identify GENERIC semantic indices with these
location qualifiers and let the pipe driver set a limit on the allowed
maximum (e.g. PIPE_SHADER_CAP_MAX_INPUTS), rather than demanding support
for at least 219 of them (nvc0 offers 0x200 bytes for generic
inputs/outputs).
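
To put a number on the 0x200 bytes: assuming each location holds one vec4 of
32-bit floats (16 bytes), and assuming GENERIC index N is simply placed at
slot N, the arithmetic can be sketched as below. BYTES_PER_LOCATION and both
helper functions are hypothetical names for illustration only; this is not
how the nvc0 driver is known to lay things out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: one location is a vec4 of 32-bit floats, i.e. 4 * 4 bytes. */
#define BYTES_PER_LOCATION 16u

/* Number of whole locations that fit into the generic input/output area. */
static unsigned
generic_location_count(unsigned generic_area_bytes)
{
   return generic_area_bytes / BYTES_PER_LOCATION;
}

/* Stores the byte offset of `location` in *offset under an identity
 * mapping. Returns false if the location does not fit into the area. */
static bool
generic_location_offset(unsigned generic_area_bytes, unsigned location,
                        unsigned *offset)
{
   assert(offset != NULL);

   if (location >= generic_location_count(generic_area_bytes))
      return false;
   *offset = location * BYTES_PER_LOCATION;
   return true;
}
```

Under these assumptions 0x200 bytes give 32 locations, far fewer than 219, and
layout(location = 3) would land at byte offset 48.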

My motivation is mostly that the hardware routing table for shader
varyings that was present on nv50 has been removed with nvc0 (Fermi).
And I'm glad, because filling 4 routing tables (since we have 5 shader
types now) is somewhat annoying. So is applying relocations to shaders:
it can be done, it's probably not too time consuming, but it's just
plain *unnecessary* (and thus stupid) for OpenGL.
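
For readers who have not dealt with this kind of linkage, the sort of table
meant here can be sketched roughly as follows. Everything in it (struct
semantic, ROUTE_UNMATCHED, build_routing_table) is hypothetical and only
illustrates matching a consumer's inputs to a producer's outputs by
(name, index); it is not the nv50 driver code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical semantic: a name (POSITION, GENERIC, ...) plus an index. */
struct semantic {
   unsigned name;
   unsigned index;
};

#define ROUTE_UNMATCHED (-1)

/* For each consumer input, record the slot of the producer output with
 * the same (name, index) pair, or ROUTE_UNMATCHED if there is none. */
static void
build_routing_table(const struct semantic *outputs, size_t num_outputs,
                    const struct semantic *inputs, size_t num_inputs,
                    int *route)
{
   assert(route != NULL || num_inputs == 0);

   for (size_t i = 0; i < num_inputs; i++) {
      route[i] = ROUTE_UNMATCHED;
      for (size_t o = 0; o < num_outputs; o++) {
         if (outputs[o].name == inputs[i].name &&
             outputs[o].index == inputs[i].index) {
            route[i] = (int)o;
            break;
         }
      }
   }
}
```

A table like this has to be rebuilt for every pair of adjacent stages whenever
either shader changes. If GENERIC indices map directly to fixed locations, the
idea is that this lookup is no longer needed.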

Now about d3d9 ...
1. don't care, I don't see a d3d9 state tracker
2. http://msdn.microsoft.com/en-us/library/bb509647%28v=VS.85%29.aspx
says "n is an optional integer between 0 and the number of resources
supported" - what "supported" means here isn't clear to me, but, I
didn't find any example where someone used something OpenGL doesn't have
(like COLOR2).
3.
http://msdn.microsoft.com/en-us/library/bb944006%28v=vs.85%29.aspx#Varying_Shader_Inputs_and_Semantics
says "Input semantics are similar to the values in the D3DDECLUSAGE."
and
DECLUSAGE sounds like you're limited to sane values.

Not sure if anyone wants to think about this issue at this time (since
implementation of ARB_separate_shader_objects is probably far in the GL4
future), but I'd be happy about any comments.

Regards,
Christoph

On 04/13/2010 12:55 PM, Luca Barbieri wrote:
> This patch series is intended to resolve the issue of semantic-based shader 
> linkage in Gallium.
> It can also be found in the RFC-gallium-semantics branch.
> 
> It does not change the current Gallium design, but rather formalizes some 
> limitations to it, and provides infrastructure to implement this model more 
> easily in drivers, along with a full nv30/nv40 implementation.
> 
> These limitations are added to allow an efficient implementation for both 
> hardware lacking special support and hardware having support but also special 
> constraints.
> 
> Note that this does NOT resolve all issues, and quite a bit is left to 
> future refinement.
> 
> In particular, the following issues are still open:
> 1. COLOR clamping (and floating point framebuffers)
> 2. A linkage table CSO allowing to specify non-identity linkage
> 3. BCOLOR/FACE-related issues
> 4. Adding a cap to inform the state tracker that more than 219 generic 
> indices are provided
> 
> This topic was already very extensively discussed.
> See 
> http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg10865.html 
> for some early inconclusive discussion around an early implementation that 
> modified the GLSL linker (which is NOT being proposed here)
> See 
> http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12016.html 
> for some more discussion that seemed to mostly reach a consensus over the 
> approach proposed here.
> See in particular 
> http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12041.html .
> 
> That said, I'm going to try to repeat all information here, partially by 
> copy&pasting from earlier messages.
> This message should probably be adapted into gallium/docs if/when this is 
> accepted.
> 
> Here is the short summary; the long rationale follows after it.
> 
> The proposal here is to add the following limitations to Gallium, for the 
> intermediate semantics:
> 1. TGSI_SEMANTIC_NORMAL is removed, using a commit by Michal Krol that was 
> never merged
> 2. Every semantic except GENERIC, COLOR and BCOLOR can only be used with 
> semantic index 0
> 3. COLOR and BCOLOR can only be used with semantic index 0-1 (note that this 
> doesn't apply to fragment outputs)
> 4. GENERIC can be used with semantic indices 0-218 on any driver, if BCOLOR 
> is not used
> 5. GENERIC can be used with semantic indices 0-216 on any driver, if BCOLOR 
> IS used
> 6. GENERIC can be used with semantic indices 0-255 on almost all drivers 
> (those that don't need the 0-218 limitation)
> 7. Some drivers may also choose to support GENERIC with arbitrary indices, 
> but that should generally not happen
> 
> The reason for this, in short, is that this maps directly to DirectX 9 SM3, 
> which is the most problematic interface of all.
> 
> The peculiar problem we have here is that we have two competing constraints 
> that force us into choosing the exact SM3 

Re: [Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-04-13 Thread Keith Whitwell
On Tue, 2010-04-13 at 03:55 -0700, Luca Barbieri wrote:
> Personally I think the simplest idea for now could be to have all
> drivers support 256 indices or, in the case of r600 and svga, the
> maximum value supported by the hardware, and expose that as a cap (as
> well as another cap for the number of different semantic values
> supported at once).
> The minimum guaranteed value is set to the lowest hardware constraint,
> which would be svga with 219 indices (assuming no bcolor is used).
> If some new constraints pop up, we just lower it and change SM3 state
> trackers to check for it and fallback otherwise. 

Luca,

Thanks for your patience and efforts in compiling this - I really
appreciate the effort you've put into this and the persistence to keep
coming back to it.

The patchset looks good to me at first reading, I'll dig in more deeply.

Keith




[Mesa3d-dev] [PATCH 0/6] [RFC] Formalization of the Gallium shader semantics linkage model

2010-04-13 Thread Luca Barbieri
This patch series is intended to resolve the issue of semantic-based shader 
linkage in Gallium.
It can also be found in the RFC-gallium-semantics branch.

It does not change the current Gallium design, but rather formalizes some 
limitations to it, and provides infrastructure to implement this model more 
easily in drivers, along with a full nv30/nv40 implementation.

These limitations are added to allow an efficient implementation for both 
hardware lacking special support and hardware having support but also special 
constraints.

Note that this does NOT resolve all issues, and quite a bit is left to 
future refinement.

In particular, the following issues are still open:
1. COLOR clamping (and floating point framebuffers)
2. A linkage table CSO allowing non-identity linkage to be specified
3. BCOLOR/FACE-related issues
4. Adding a cap to inform the state tracker that more than 219 generic indices 
are provided

This topic was already very extensively discussed.
See http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg10865.html 
for some early inconclusive discussion around an early implementation that 
modified the GLSL linker (which is NOT being proposed here)
See http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12016.html 
for some more discussion that seemed to mostly reach a consensus over the 
approach proposed here.
See in particular 
http://www.mail-archive.com/mesa3d-dev@lists.sourceforge.net/msg12041.html .

That said, I'm going to try to repeat all information here, partially by 
copy&pasting from earlier messages.
This message should probably be adapted into gallium/docs if/when this is 
accepted.

Here is the short summary; the long rationale follows after it.

The proposal here is to add the following limitations to Gallium, for the 
intermediate semantics:
1. TGSI_SEMANTIC_NORMAL is removed, using a commit by Michal Krol that was 
never merged
2. Every semantic except GENERIC, COLOR and BCOLOR can only be used with 
semantic index 0
3. COLOR and BCOLOR can only be used with semantic index 0-1 (note that this 
doesn't apply to fragment outputs)
4. GENERIC can be used with semantic indices 0-218 on any driver, if BCOLOR is 
not used
5. GENERIC can be used with semantic indices 0-216 on any driver, if BCOLOR IS 
used
6. GENERIC can be used with semantic indices 0-255 on almost all drivers (those 
that don't need the 0-218 limitation)
7. Some drivers may also choose to support GENERIC with arbitrary indices, but 
that should generally not happen
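
Rules 2 to 5 above can be sketched as a small check that a state tracker
might run before emitting a declaration. The enum and function names below
are stand-ins invented for this example (they are not the real TGSI
definitions), and the fragment output exemption from rule 3 is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the TGSI semantic names; not the real enum. */
enum sem_name {
    SEM_POSITION,
    SEM_COLOR,
    SEM_BCOLOR,
    SEM_FOG,
    SEM_PSIZE,
    SEM_GENERIC,
    SEM_FACE,
    SEM_COUNT
};

/* Highest GENERIC index that every driver accepts under rules 4 and 5. */
static unsigned max_portable_generic_index(bool shader_uses_bcolor)
{
    return shader_uses_bcolor ? 216u : 218u;
}

/* Returns true if (name, index) stays within the limits that every driver
 * is required to honour under the proposal. */
static bool semantic_is_portable(enum sem_name name, unsigned index,
                                 bool shader_uses_bcolor)
{
    assert((unsigned)name < SEM_COUNT);
    switch (name) {
    case SEM_GENERIC:
        return index <= max_portable_generic_index(shader_uses_bcolor);
    case SEM_COLOR:
    case SEM_BCOLOR:
        return index <= 1;  /* rule 3 */
    default:
        return index == 0;  /* rule 2 */
    }
}
```

A driver covered by rule 6 or 7 would accept more than this check allows; the
check only describes what is safe everywhere.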

The reason for this, in short, is that this maps directly to DirectX 9 SM3, 
which is the most problematic interface of all.

The peculiar problem we have here is that we have two competing constraints 
that force us into choosing the exact SM3 value:
1. The VMware SVGA driver must deal with an SM3 host interface and would 
ideally want to directly feed the Gallium semantics to the host
2. A hypothetical DirectX 9 state tracker needs to support SM3 and would 
ideally want to directly feed the SM3 semantics to Gallium

Note that this is not a reference to the VMware DirectX 9 state tracker, since 
its authors haven't provided details about its handling of shader semantics.

SM3 ends up supporting 219 generic indices: 16 indices in 14 classes, minus 
POSITION0, PSIZE0, COLOR0, COLOR1 and FOG0 which are the only ones that 
wouldn't be mapped to GENERIC.
However, Gallium drivers that don't benefit from specific constraints (such 
as those svga and r600 have) are supposed to support 256 indices, and my 
nv30/nv40 work does that.
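
The count of 219 can be checked mechanically. The constant names below are
made up for this example; the numbers are the ones given above.

```c
#include <assert.h>

enum {
    SM3_USAGE_CLASSES     = 14,
    SM3_INDICES_PER_CLASS = 16,
    /* POSITION0, PSIZE0, COLOR0, COLOR1 and FOG0 keep their own semantics */
    SM3_NOT_GENERIC       = 5,
    SM3_GENERIC_INDICES   = SM3_USAGE_CLASSES * SM3_INDICES_PER_CLASS
                            - SM3_NOT_GENERIC
};

/* 14 * 16 = 224, and 224 - 5 = 219, so the highest index is 218. */
static_assert(SM3_GENERIC_INDICES == 219, "SM3 yields 219 generic indices");
static_assert(SM3_GENERIC_INDICES - 1 == 218, "highest generic index is 218");
```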

The expected implementation, if no hardware support exists, is to build a list 
of relocations to apply to either the fragment or the vertex shader, and patch 
one of them at validation time to match the other.
Data structures are provided in gallium/auxiliary to ease this, and try to 
minimize the number of times where this needs to be performed.
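
To make the relocation idea concrete, here is a minimal sketch that patches
the fragment shader to match the vertex shader. It does not reproduce the
data structures in gallium/auxiliary: the struct, the function name and the
instruction encoding (slot number in the low 8 bits of a code word) are all
invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_NONE   0xffu /* GENERIC index the vertex shader does not write */
#define NUM_GENERIC 256

/* One relocation: which code word reads a GENERIC input, and which index. */
struct input_reloc {
    size_t  word;          /* offset of the instruction word to patch */
    uint8_t generic_index; /* GENERIC semantic index the word refers to */
};

/* Patch the fragment shader so that each GENERIC input reads the hardware
 * slot the vertex shader wrote it to. vs_slot_of[i] is that slot for
 * GENERIC[i], or SLOT_NONE. Returns the number of inputs the vertex shader
 * does not provide; those words are left untouched. */
static unsigned patch_fs_inputs(uint32_t *fs_code,
                                const struct input_reloc *relocs, size_t count,
                                const uint8_t vs_slot_of[NUM_GENERIC])
{
    unsigned missing = 0;

    assert(fs_code && relocs && vs_slot_of);
    for (size_t i = 0; i < count; i++) {
        uint8_t slot = vs_slot_of[relocs[i].generic_index];

        if (slot == SLOT_NONE) {
            missing++;
            continue;
        }
        /* Made-up encoding: replace the low 8 bits with the slot number. */
        fs_code[relocs[i].word] = (fs_code[relocs[i].word] & ~0xffu) | slot;
    }
    return missing;
}
```

A driver would run something like this at validation time, once both shaders
are known, and could cache the patched code per vertex/fragment shader pair
to avoid repeating the work.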

Let's now proceed to the discussion and detailed rationale, mostly constructed 
by copy&pasting older messages.

===
Michal Krol's proposal
===

First of all, see Michal Krol's proposal at 
http://www.opensource-archive.org/showthread.php?t=148573, and in particular:
<<
name        index range

POSITION    no limit?
COLOR       0..1, explicit clamp?
BCOLOR      0..1, explicit clamp?
FOG         remove?
PSIZE       0
GENERIC     0..
NORMAL      remove
FACE        0
EDGEFLAG    0
PRIMID      0
INSTANCEID  0
>>

My proposal follows this, except for limiting POSITION to 0 too.
Not sure why Michal thought "no limit" could make sense: the POSITION is 
fundamentally a singleton, since it is the input to the rasterizer unit.


==
An overview of hardware support
==

Hardware with no capabilities.
- nv30 does not support any mapping. However, we already need to patch
fragment programs to insert constants, so we can patch input register
numbers as well. The current driver only supports 0-7 generic indices,
but I already implemented support for 0