Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Roland Scheidegger
On 03.03.2010 20:23, Luca Barbieri wrote:
>> And never will...  It does not export PIPE_CAP_GLSL, and does not have
>> the shader opcodes to ever do so.
> 
> Any Gallium driver should be able to support the GLSL subset without
> control flow.
> 
> And if we had a proper optimization infrastructure capable of inlining
> functions, converting conditionals to multiplications and unrolling
> loops (e.g. look at what the nVidia Cg compiler does), then
> essentially all GLSL could be supported on any driver, with only
> limitations on the maximum number of loop iterations.
> 
> Isn't it worth supporting that?
> 
> BTW, proprietary drivers do this: for instance nVidia supports GLSL on
> nv30, which can't do control flow in fragment shaders and doesn't
> support SM3.

I think the i915 is a lot closer to r300 in that regard (which is quite
a bit more limited than nv30), and it's true that ATI also supported
glsl on that. As far as I know though it was quite easy to bump into
shaders which wouldn't compile. There's only so much you can do if you
have 4 blocks of (max) 16 instructions to run without any control flow
if you need to unroll loops, not to mention lacking instructions for
derivatives, or the fact things like sin/cos will take quite a few
instructions...
nv30, while processing fragment shaders slowly, had a LOT higher
instruction count, IIRC supported derivatives and predication and had no
dependent texturing limit. So that makes it a lot better suited for glsl
hacks.
So, I'm not sure it really makes a whole lot of sense to support glsl on
i915. It'll really only ever work for very simple things (granted there
are apps out there which indeed will only use glsl shaders which are
known to compile fine on r300...)
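
To put rough numbers on the limits above, here is a minimal sketch. It assumes the budget is simply 4 blocks times 16 instructions and that a fully unrolled loop costs its body length times its trip count. The function name and the cost model are illustrative only and ignore other hardware limits.

```python
# Illustrative cost model only: real hardware also limits texture
# indirections, temporaries and so on, which this sketch ignores.
BLOCKS = 4
INSTRUCTIONS_PER_BLOCK = 16
BUDGET = BLOCKS * INSTRUCTIONS_PER_BLOCK  # 64 instructions in total


def fits_after_unrolling(setup_cost, body_cost, iterations):
    """Return True if a fully unrolled loop fits in the instruction budget."""
    total = setup_cost + body_cost * iterations
    return total <= BUDGET
```

With a 6-instruction body and 4 instructions of setup, 10 iterations just fit and 11 do not, which is why shaders with even modest loops are easy to push over the limit.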

Roland

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev


Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Keith Whitwell
On Wed, 2010-03-03 at 06:58 -0800, Luca Barbieri wrote:
> BTW, i915 is also limited to 0-7 generic indices, and thus doesn't
> work with GLSL at all right now.
> 
> This should be relatively easy to fix since it should be enough to
> store the generic indices in the "texCoords" arrays, and then pass
> them to draw_find_shader_output.

Luca,

If you want to go ahead and send a patch, I don't have a problem with
it.  Like you say, it should be an easy change.

Keith




Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Keith Whitwell
On Wed, 2010-03-03 at 11:23 -0800, Luca Barbieri wrote:
> > And never will...  It does not export PIPE_CAP_GLSL, and does not have
> > the shader opcodes to ever do so.
> 
> Any Gallium driver should be able to support the GLSL subset without
> control flow.
> 
> And if we had a proper optimization infrastructure capable of inlining
> functions, converting conditionals to multiplications and unrolling
> loops (e.g. look at what the nVidia Cg compiler does), then
> essentially all GLSL could be supported on any driver, with only
> limitations on the maximum number of loop iterations.
> 
> Isn't it worth supporting that?
> 
> BTW, proprietary drivers do this: for instance nVidia supports GLSL on
> nv30, which can't do control flow in fragment shaders and doesn't
> support SM3.

OK, maybe never is too strong...  But it's certainly a long way off...


Keith




Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Luca Barbieri
> And never will...  It does not export PIPE_CAP_GLSL, and does not have
> the shader opcodes to ever do so.

Any Gallium driver should be able to support the GLSL subset without
control flow.

And if we had a proper optimization infrastructure capable of inlining
functions, converting conditionals to multiplications and unrolling
loops (e.g. look at what the nVidia Cg compiler does), then
essentially all GLSL could be supported on any driver, with only
limitations on the maximum number of loop iterations.
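
As a rough illustration of the two transformations mentioned above (conditionals turned into multiplications, and loops unrolled up to a fixed maximum), here is a minimal sketch in plain Python. The function names and the fixed bound of 4 iterations are made up for the example; this is not taken from any existing compiler.

```python
def select_branching(cond, a, b):
    # Original form: needs control flow.
    if cond:
        return a
    return b


def select_flattened(cond, a, b):
    # If-converted form: cond is 0.0 or 1.0, both sides are always
    # evaluated and blended arithmetically, so no branch is needed.
    return cond * a + (1.0 - cond) * b


def sum_loop(values, count):
    # Original form: a loop with a run-time trip count.
    total = 0.0
    for i in range(count):
        total += values[i]
    return total


def sum_unrolled_4(values, count):
    # Unrolled to a fixed maximum of 4 iterations. Each step is masked by
    # whether the original loop would still be running at that point.
    # Requires len(values) >= 4 and count <= 4.
    total = 0.0
    total += values[0] * float(count > 0)
    total += values[1] * float(count > 1)
    total += values[2] * float(count > 2)
    total += values[3] * float(count > 3)
    return total
```

Note that masking by multiplication does not hide NaN or infinity produced on the unused side, so a real implementation would more likely use a select-style instruction where the hardware has one.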

Isn't it worth supporting that?

BTW, proprietary drivers do this: for instance nVidia supports GLSL on
nv30, which can't do control flow in fragment shaders and doesn't
support SM3.



Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Keith Whitwell
On Wed, 2010-03-03 at 06:58 -0800, Luca Barbieri wrote:
> BTW, i915 is also limited to 0-7 generic indices, and thus doesn't
> work with GLSL at all right now.

And never will...  It does not export PIPE_CAP_GLSL, and does not have
the shader opcodes to ever do so.

Keith




Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Luca Barbieri
BTW, i915 is also limited to 0-7 generic indices, and thus doesn't
work with GLSL at all right now.

This should be relatively easy to fix since it should be enough to
store the generic indices in the "texCoords" arrays, and then pass
them to draw_find_shader_output.
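
The idea above could look roughly like the following. This is a Python model, not driver code: find_shader_output is a hypothetical stand-in for draw_find_shader_output (its not-found convention here is an assumption), and the slot count of 8 is assumed from the current 0-7 limit.

```python
GENERIC = "GENERIC"        # stands in for TGSI_SEMANTIC_GENERIC
NUM_TEXCOORD_SLOTS = 8     # assumed number of hardware texcoord slots


def find_shader_output(vs_outputs, semantic_name, semantic_index):
    """Hypothetical stand-in for draw_find_shader_output: return the
    position of the matching vertex shader output, or None."""
    for slot, (name, index) in enumerate(vs_outputs):
        if name == semantic_name and index == semantic_index:
            return slot
    return None


def assign_texcoord_slots(fs_generic_inputs):
    """Record, per hardware slot, which generic index the fragment
    shader reads: tex_coords[slot] = generic index."""
    if len(fs_generic_inputs) > NUM_TEXCOORD_SLOTS:
        raise ValueError("too many generic inputs for the hardware")
    return list(fs_generic_inputs)


def link(vs_outputs, tex_coords):
    """For each hardware slot, find the vertex output that feeds it."""
    return [find_shader_output(vs_outputs, GENERIC, index)
            for index in tex_coords]
```

The generic index itself is then no longer used as a slot number, so its value can be arbitrary as long as no more than 8 are in use at once.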



Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Olivier Galibert
On Tue, Mar 02, 2010 at 09:43:51PM +0100, Luca Barbieri wrote:
> - Not sure about i965

On i965 interpolators are not a dedicated piece of hardware, they're
programs like the other shaders.  So the problem is entirely
different, and more at the level of space allocation in the
thread-to-thread communication packets in the pipeline vs. register
allocation in the shaders (there's a semi-direct mapping).

  OG.




Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-03 Thread Keith Whitwell
On Tue, 2010-03-02 at 12:43 -0800, Luca Barbieri wrote:
> The difference between an easier and harder life for (some) drivers is
> whether the limit is tied to hardware interpolators or not.
> Once we decide to not tie it, whether the limit is 128 or 256 is of
> course quite inconsequential.
> Allowing arbitrary 32-bit values would, however, require a binary
> search or a hash table.
> 
> I think you or someone else from the Mesa team should decide how to
> proceed, and most drivers would need to be fixed.
> 
> As I understand, the constraints are the following:
> 
> Hardware with no capabilities.
> - nv30 does not support any mapping. However, we already need to patch
> fragment programs to insert constants, so we can patch input register
> numbers as well. The current driver only supports 0-7 generic indices,
> but I already implemented support for 0-255 indices with in-driver
> linkage and patching. Note that nv30 lacks control flow in fragment
> programs.
> - nv40 is like nv30, but supports fp control flow, and may have some
> configurable mapping support, with unknown behavior
> 
> Hardware with capabilities that must be configured for each fp/vp pair.
> - nv40 might have this but the nVidia OpenGL driver does not use it
> - nv50 has configurable vp->gp and gp->fp mappings with 64 entries.
> The current driver seems to support arbitrary 0-2^32 indices.
> - r300 appears to have a configurable vp->fp mapping. The current
> driver only supports 0-15 generic indices, but redefining
> ATTR_GENERIC_COUNT could be enough to have it support larger numbers.
> 
> Hardware with automatic linkage when semantics match:
> - VMware svga appears to support 14 * 16 semantics, but the current
> driver only supports 0-15 generic indices. This could be fixed by
> mapping GENERIC into all non-special SM3 semantics.
> 
> Hardware that can do both configurable mappings and automatic linkage:
> - r600 supports linkage in hardware between matching semantic ids,
> which are apparently byte-sized
>
> Other hardware:
> - i915 has no hardware vertex shading
> - Not sure about i965
> 
> Software:
> 1. SM3 wants to use 14 * 16 indices overall. This is apparently only
> supported by the VMware closed source state tracker.
> 2. SM2 and non-GLSL OpenGL just want to use as many indices as the
> hardware interpolator count
> 3. GLSL currently wants to use at most about 10 indices more
> than the hardware interpolator count. This can be fixed since we see
> both the fragment and vertex shaders during linkage (the patch I sent
> did that)
> 4. GLSL with EXT_separate_shader_objects does not add requirements
> because only gl_TexCoord and other builtin varyings are supported.
> User-defined varyings are not supported
> 5. A hypothetical version of EXT_separate_shader_objects extended to
> support user-defined varyings would either want arbitrary 32-bit
> generic indices (by interning strings to generate the indices) or the
> ability to specify a custom mapping between shader indices
> 6. A hypothetical "no-op" implementation of the GLSL linker would have
> the same requirement
> 
> Also note that non-GENERIC indices have peculiar properties.
> 
> For COLOR and BCOLOR:
> 1. SM3 and OpenGL with glClampColor appropriately set want it to
> _not_ be clamped to [0, 1]
> 2. SM2 and normal OpenGL apparently want it to be clamped to [0, 1]
> (sometimes for fixed point targets only) and may also allow using
> U8_UNORM precision for it instead of FP32
> 3. OpenGL allows enabling two-sided lighting, in which case COLOR in
> the fragment shader is automagically set to BCOLOR for back faces
> 4. Older hardware (e.g. nv30) tends to support BCOLOR but not FACING.
> Some hardware (e.g. nv40) supports both FACING and BCOLOR in hardware.
> The latest hardware probably supports FACING only.
> 
> Any API that requires special semantics for COLOR and BCOLOR (i.e.
> non-SM3) seems to only want 0-1 indices.
> 
> Note that SM3 does *not* include BCOLOR, so basically the limits for
> generic indices would need to be conditional on BCOLOR being present
> or not (e.g. if it is present, we must reserve two semantic slots in
> svga for it).
> 
> POSITION0 is obviously special.
> PSIZE0 is also special for points.
> 
> FOG0 seems right now to just be a GENERIC with a single component.
> Gallium could be extended to support fixed function fog, which most
> DX9 hardware supports (nv30/nv40 and r300). This is mostly orthogonal
> to the semantic issue.
> 
> TGSI_SEMANTIC_NORMAL is essentially unused and should probably be removed
> 
> The options are the ones you outlined, plus:
> (e) Allow arbitrary 32-bit indices. This requires slightly more
> complicated data structures in some cases, and will require svga and
> r600 to fall back to software linkage if numbers are too high.
> (f) Limit semantic indices to hardware interpolators _and_ introduce
> an interface to let the user specify an
> 
> Personally I think the simplest idea for now could be to have all
> drivers s

Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Marek Olšák
On Tue, Mar 2, 2010 at 10:20 PM, Luca Barbieri wrote:

> On Tue, Mar 2, 2010 at 10:00 PM, Corbin Simpson
>  wrote:
> > FYI r300 only supports 24 interpolators: 16 linear and 8 perspective.
> > (IIRC; not in front of the docs right now.) r600 supports 256 fully
> > configurable interpolators.
>
> Yes, but if you raised ATTR_GENERIC_COUNT, the current driver would
> support higher semantic indices right? (of course, with a limit of
> 8/24 different semantic indices used at once).
>
That's right.


Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Luca Barbieri
On Tue, Mar 2, 2010 at 10:00 PM, Corbin Simpson
 wrote:
> FYI r300 only supports 24 interpolators: 16 linear and 8 perspective.
> (IIRC; not in front of the docs right now.) r600 supports 256 fully
> configurable interpolators.

Yes, but if you raised ATTR_GENERIC_COUNT, the current driver would
support higher semantic indices right? (of course, with a limit of
8/24 different semantic indices used at once).
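
The distinction between the range of index values and the number of indices in use at once can be sketched as below. The constants are assumptions for illustration: 255 as a ceiling on the index value, and 24 taken from the interpolator figure quoted above. The function is hypothetical and does not reflect how the r300 driver allocates anything.

```python
MAX_GENERIC_INDEX = 255   # assumed ceiling on the index value itself
MAX_INTERPOLATORS = 24    # interpolator figure quoted above (from memory)


def allocate_interpolators(used_indices):
    """Assign each distinct generic index in use to one hardware slot."""
    slots = {}
    for index in used_indices:
        if not 0 <= index <= MAX_GENERIC_INDEX:
            raise ValueError("generic index out of range: %d" % index)
        if index not in slots:
            if len(slots) == MAX_INTERPOLATORS:
                raise ValueError("more distinct indices than interpolators")
            slots[index] = len(slots)
    return slots
```

So a shader pair using indices 200, 3 and 17 is fine, while one using 25 distinct indices fails no matter how small the index values are.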



Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Corbin Simpson
FYI r300 only supports 24 interpolators: 16 linear and 8 perspective.
(IIRC; not in front of the docs right now.) r600 supports 256 fully
configurable interpolators.

-- 
Only fools are easily impressed by what is only
barely beyond their reach. ~ Unknown

Corbin Simpson




Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Luca Barbieri
The difference between an easier and harder life for (some) drivers is
whether the limit is tied to hardware interpolators or not.
Once we decide to not tie it, whether the limit is 128 or 256 is of
course quite inconsequential.
Allowing arbitrary 32-bit values would, however, require a binary
search or a hash table.
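
To illustrate the difference in data structures, here is a minimal sketch. With a small bound such as 255, a driver can index a fixed-size table directly; with arbitrary 32-bit values it needs a sorted array with binary search, or a hash table (a plain dict in Python). All names here are made up for the example.

```python
import bisect


def build_dense_table(links, max_index=255):
    """Bounded indices: table[index] gives the slot in constant time."""
    table = [None] * (max_index + 1)
    for index, slot in links:
        table[index] = slot
    return table


def build_sorted_table(links):
    """Arbitrary indices: keep (index, slot) pairs sorted by index."""
    return sorted(links)


def lookup_sorted(table, index):
    """Binary search for index in a table built by build_sorted_table."""
    pos = bisect.bisect_left(table, (index,))
    if pos < len(table) and table[pos][0] == index:
        return table[pos][1]
    return None


def build_hash_table(links):
    """Arbitrary indices: hash table from index to slot."""
    return dict(links)
```

The dense table costs memory proportional to the index bound, which is why it only makes sense when the bound is small.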

I think you or someone else from the Mesa team should decide how to
proceed, and most drivers would need to be fixed.

As I understand, the constraints are the following:

Hardware with no capabilities.
- nv30 does not support any mapping. However, we already need to patch
fragment programs to insert constants, so we can patch input register
numbers as well. The current driver only supports 0-7 generic indices,
but I already implemented support for 0-255 indices with in-driver
linkage and patching. Note that nv30 lacks control flow in fragment
programs.
- nv40 is like nv30, but supports fp control flow, and may have some
configurable mapping support, with unknown behavior

Hardware with capabilities that must be configured for each fp/vp pair.
- nv40 might have this but the nVidia OpenGL driver does not use it
- nv50 has configurable vp->gp and gp->fp mappings with 64 entries.
The current driver seems to support arbitrary 0-2^32 indices.
- r300 appears to have a configurable vp->fp mapping. The current
driver only supports 0-15 generic indices, but redefining
ATTR_GENERIC_COUNT could be enough to have it support larger numbers.

Hardware with automatic linkage when semantics match:
- VMware svga appears to support 14 * 16 semantics, but the current
driver only supports 0-15 generic indices. This could be fixed by
mapping GENERIC into all non-special SM3 semantics.

Hardware that can do both configurable mappings and automatic linkage:
- r600 supports linkage in hardware between matching semantic ids,
which are apparently byte-sized

Other hardware:
- i915 has no hardware vertex shading
- Not sure about i965

Software:
1. SM3 wants to use 14 * 16 indices overall. This is apparently only
supported by the VMware closed source state tracker.
2. SM2 and non-GLSL OpenGL just want to use as many indices as the
hardware interpolator count
3. GLSL currently wants to use at most about 10 indices more
than the hardware interpolator count. This can be fixed since we see
both the fragment and vertex shaders during linkage (the patch I sent
did that)
4. GLSL with EXT_separate_shader_objects does not add requirements
because only gl_TexCoord and other builtin varyings are supported.
User-defined varyings are not supported
5. A hypothetical version of EXT_separate_shader_objects extended to
support user-defined varyings would either want arbitrary 32-bit
generic indices (by interning strings to generate the indices) or the
ability to specify a custom mapping between shader indices
6. A hypothetical "no-op" implementation of the GLSL linker would have
the same requirement

Also note that non-GENERIC indices have peculiar properties.

For COLOR and BCOLOR:
1. SM3 and OpenGL with glClampColor appropriately set want it to
_not_ be clamped to [0, 1]
2. SM2 and normal OpenGL apparently want it to be clamped to [0, 1]
(sometimes for fixed point targets only) and may also allow using
U8_UNORM precision for it instead of FP32
3. OpenGL allows enabling two-sided lighting, in which case COLOR in
the fragment shader is automagically set to BCOLOR for back faces
4. Older hardware (e.g. nv30) tends to support BCOLOR but not FACING.
Some hardware (e.g. nv40) supports both FACING and BCOLOR in hardware.
The latest hardware probably supports FACING only.
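
The COLOR/BCOLOR behaviour listed above can be modelled as follows. This is a sketch of the semantics only, assuming colours are RGBA tuples of floats; the function names and flags are invented for the example and do not correspond to Gallium state.

```python
def clamp01(rgba):
    """Clamp every component to [0, 1], as SM2 and default OpenGL expect."""
    return tuple(min(1.0, max(0.0, c)) for c in rgba)


def fragment_color(color, bcolor, front_facing, two_sided, clamp):
    """Return the value the fragment shader reads as COLOR.

    With two-sided lighting enabled, back faces read BCOLOR instead.
    On hardware that has FACING but no BCOLOR, the same selection could
    be done inside the fragment shader using the FACING input.
    """
    chosen = bcolor if (two_sided and not front_facing) else color
    return clamp01(chosen) if clamp else chosen
```

For SM3, which has no BCOLOR and no clamping, this reduces to passing COLOR through unchanged.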

Any API that requires special semantics for COLOR and BCOLOR (i.e.
non-SM3) seems to only want 0-1 indices.

Note that SM3 does *not* include BCOLOR, so basically the limits for
generic indices would need to be conditional on BCOLOR being present
or not (e.g. if it is present, we must reserve two semantic slots in
svga for it).

POSITION0 is obviously special.
PSIZE0 is also special for points.

FOG0 seems right now to just be a GENERIC with a single component.
Gallium could be extended to support fixed function fog, which most
DX9 hardware supports (nv30/nv40 and r300). This is mostly orthogonal
to the semantic issue.

TGSI_SEMANTIC_NORMAL is essentially unused and should probably be removed

The options are the ones you outlined, plus:
(e) Allow arbitrary 32-bit indices. This requires slightly more
complicated data structures in some cases, and will require svga and
r600 to fall back to software linkage if numbers are too high.
(f) Limit semantic indices to hardware interpolators _and_ introduce
an interface to let the user specify an

Personally I think the simplest idea for now could be to have all
drivers support 256 indices or, in the case of r600 and svga, the
maximum value supported by the hardware, and expose that as a cap (as
well as another cap for the number of different semantic values
supported at once).
The minimum guaranteed value is set to the lowest h

Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Keith Whitwell
On Tue, 2010-03-02 at 04:36 -0800, Luca Barbieri wrote:


> >> The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
> >> - 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
> >> TGSI semantics which they need to be mapped to/from.
> >
> > Agree, though I'd opt for 255 as a round number.
> 
> The problem with this is that you only have 14 SM3 semantics with 16
> indices each, so you can't map 256 generic indices into the VMware
> interface, or directly into an SM3 shader.
> You only have 14 * 16 minus the ones used for non-GENERIC semantics
> (the ones mentioned above).
> And of course, if you choose a smaller number, you can't map SM3
> _into_ Gallium, so you need to choose the exact number required for
> SM3.
> 
> Tying Gallium in this way to SM3 is surely a bit ugly, but it's just a
> constant, and I don't see any other way to implement SM3 without doing
> linkage in software in the r600 and svga drivers and/or in SM3 state
> trackers.

I accept that it can be viewed as an arbitrary constant, but maybe it's
a step too far.  If another API or piece of hardware came along that we
decided was important, the calculations might change and we'd be stuck.

I see our options as:

a) Picking a lower number like 128, that an SM3 state tracker would
usually be able to directly translate incoming semantics into, but which
would force it to renumber under rare circumstances.  This would make
life easier for the open drivers at the expense of the closed code.

b) Picking 256 to make life easier for some closed-source SM3 state
tracker, but harder for open drivers.

c) Picking 219 (or some other magic number) that happens to work with
the current set of constraints, but makes gallium fragile in the face of
new constraints.

d) Abandoning the current gallium linkage rules and coming up with
something new, for instance forcing the state trackers to renumber
always and making life trivial for the drivers...

I suspect we'll end up reworking the whole GENERIC idea at some stage,
ie (d).  But for now, I don't think that some piece of closed code
should be dictating the gallium interface direction, so I'd suggest
something that makes the open drivers' lives easier -- ie (a) or (c).

To be honest, I'd suggest keeping a modicum of independence in gallium
plus making the open code simpler, and going with 128...

But if you feel strongly, I don't mind the bigger number (219) either,
except on aesthetic grounds...

Keith






Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Luca Barbieri
> I don't think anybody has tried hooking it up - so far the primary
> purpose of the svga gallium driver has been GL support, but thinking
> about it you're probably right.

I'm a bit confused about this: I was under the impression that VMware
Tools for Windows used your DirectX state tracker and a WGL version of
Mesa, talking to the svga Gallium driver.
How does it actually work?
What do you normally use the DirectX 9 state tracker with?

> The details of the closed code aren't terribly important as they could
> always be changed.
Sure, but it currently is the only Gallium user that supports the SM3
model and thus the only one that really needs arbitrary semantic
indices, and puts constraints on them.

>> The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
>> - 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
>> TGSI semantics which they need to be mapped to/from.
>
> Agree, though I'd opt for 255 as a round number.

The problem with this is that you only have 14 SM3 semantics with 16
indices each, so you can't map 256 generic indices into the VMware
interface, or directly into an SM3 shader.
You only have 14 * 16 minus the ones used for non-GENERIC semantics
(the ones mentioned above).
And of course, if you choose a smaller number, you can't map SM3
_into_ Gallium, so you need to choose the exact number required for
SM3.

Tying Gallium in this way to SM3 is surely a bit ugly, but it's just a
constant, and I don't see any other way to implement SM3 without doing
linkage in software in the r600 and svga drivers and/or in SM3 state
trackers.



Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Keith Whitwell
On Tue, 2010-03-02 at 03:26 -0800, Luca Barbieri wrote:
> I've been looking at shader semantics some more, and I'm a bit
> surprised by how the svga driver works.
> It seems that an obvious implementation of a DirectX 9 state tracker
> just won't work with the svga driver.

I don't think anybody has tried hooking it up - so far the primary
purpose of the svga gallium driver has been GL support, but thinking
about it you're probably right.

> In SM3, vertex/fragment semantics can be arbitrary (independent of
> hardware resources), but indices are limited to a 0-15 range.
> 
> A DirectX 9 state tracker must convert those to TGSI_SEMANTIC_GENERIC.
> How does the VMware one do that?
> Assuming that it maps them directly, this means that the driver must
> support GENERIC semantic indices up to a number that varies between
> about 200 and 255.

The details of the closed code aren't terribly important as they could
always be changed.

But you're right that any DX9 state tracker would have to try to pack
all or almost all the DX9 semantics into the TGSI GENERIC range.  The
simplest implementation would end up using 0..255.

> The problem is that the vmware svga driver, as far as I can see,
> doesn't support indices greater than 15.
> This is caused by the fact that it maps all GENERIC semantics to
> SVGA3D_DECLUSAGE_TEXCOORD, and the index bitfield in the svga virtual
> interface only supports 4 bits.

Indeed, good point.  That's basically a shortcoming of the current svga
gallium driver which needs to be addressed somehow.

> In other words, SM3 under VMware with arbitrary semantics (allowed by
> SM3 and other drivers) really seems broken, for a straightforward
> DirectX 9 state tracker implementation.
> 
> The only way it can work now is if the DirectX 9 state tracker looks
> at both the vertex and pixel shaders, links them, and outputs
> sequential semantic indices.
> 
> It seems to me that the svga driver should be fixed to map GENERIC to
> *all* SM3 semantic types, ideally in a way that reverses the SM3 ->
> GENERIC transformation done by the DX9 state tracker.

Agree, though I don't think reversibility is necessary.

> Doing this requires specifying a maximum index for
> TGSI_SEMANTIC_GENERIC which is very carefully chosen to allow 1:1
> mapping with SM3, so that DirectX 9 state trackers have enough indices
> to represent all SM3, and the svga driver can fit all indices in the
> SM3-like semantics of the VMware virtual GPU interface.
> 
> The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
> - 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
> TGSI semantics which they need to be mapped to/from.

Agree, though I'd opt for 255 as a round number.

> I'm looking at this because this seems the strictest constraint on
> choosing a limit for TGSI_SEMANTIC_GENERIC indices.
> The other constraint is due to r600 supporting only byte-sized
> semantic/index combinations, which is less strict than SM3.
> 
> BTW, glsl also looks artificially limited on svga, as only 6 varyings
> will be supported, due to it starting from 10.

Agree, well spotted.  I'll take a look at that.

Keith




[Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?

2010-03-02 Thread Luca Barbieri
I've been looking at shader semantics some more, and I'm a bit
surprised by how the svga driver works.
It seems that an obvious implementation of a DirectX 9 state tracker
just won't work with the svga driver.

In SM3, vertex/fragment semantics can be arbitrary (independent of
hardware resources), but indices are limited to a 0-15 range.

A DirectX 9 state tracker must convert those to TGSI_SEMANTIC_GENERIC.
How does the VMware one do that?
Assuming that it maps them directly, this means that the driver must
support GENERIC semantic indices up to a number that varies between
about 200 and 255.

The problem is that the vmware svga driver, as far as I can see,
doesn't support indices greater than 15.
This is caused by the fact that it maps all GENERIC semantics to
SVGA3D_DECLUSAGE_TEXCOORD, and the index bitfield in the svga virtual
interface only supports 4 bits.
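
To make the 4-bit limit concrete, here is a minimal sketch. It only models the width of the index field; the tuple returned is a placeholder for a declaration and says nothing about the actual svga encoding.

```python
INDEX_BITS = 4
MAX_INDEX = (1 << INDEX_BITS) - 1  # largest value a 4-bit field can hold


def encode_generic_as_texcoord(generic_index):
    """Model mapping a GENERIC index onto a TEXCOORD usage with a 4-bit
    index field; anything above MAX_INDEX cannot be represented."""
    if not 0 <= generic_index <= MAX_INDEX:
        raise ValueError("index %d does not fit in %d bits"
                         % (generic_index, INDEX_BITS))
    return ("TEXCOORD", generic_index)
```

If the index were silently truncated instead of rejected, GENERIC 16 would alias GENERIC 0, since 16 & 15 is 0.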

In other words, SM3 under VMware with arbitrary semantics (allowed by
SM3 and other drivers) really seems broken, for a straightforward
DirectX 9 state tracker implementation.

The only way it can work now is if the DirectX 9 state tracker looks
at both the vertex and pixel shaders, links them, and outputs
sequential semantic indices.
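
That workaround could be sketched like this, assuming SM3 semantics are represented as (usage, index) pairs. The function is hypothetical; it only shows the renumbering, not shader translation.

```python
def link_sequential(vs_outputs, ps_inputs):
    """Give every semantic used by either shader a small sequential
    GENERIC index. Both shaders must be translated with the same
    mapping so that matching semantics end up with matching indices."""
    mapping = {}
    # Number what the pixel shader reads first, then any remaining
    # vertex shader outputs.
    for semantic in list(ps_inputs) + list(vs_outputs):
        if semantic not in mapping:
            mapping[semantic] = len(mapping)
    return mapping
```

The cost is that a shader can no longer be translated on its own: the same vertex shader may need a different translation for each pixel shader it is paired with.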

It seems to me that the svga driver should be fixed to map GENERIC to
*all* SM3 semantic types, ideally in a way that reverses the SM3 ->
GENERIC transformation done by the DX9 state tracker.

Doing this requires specifying a maximum index for
TGSI_SEMANTIC_GENERIC which is very carefully chosen to allow 1:1
mapping with SM3, so that DirectX 9 state trackers have enough indices
to represent all SM3, and the svga driver can fit all indices in the
SM3-like semantics of the VMware virtual GPU interface.

The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
- 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
TGSI semantics which they need to be mapped to/from.
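
The arithmetic and one possible 1:1 mapping can be sketched as below. The usage names follow the 14 D3D9 declaration usages; the packing order (usage-major, skipping the five special semantics) is an assumption made for the example, not what any state tracker is known to do.

```python
SM3_USAGES = [
    "POSITION", "BLENDWEIGHT", "BLENDINDICES", "NORMAL", "PSIZE",
    "TEXCOORD", "TANGENT", "BINORMAL", "TESSFACTOR", "POSITIONT",
    "COLOR", "FOG", "DEPTH", "SAMPLE",
]
INDICES_PER_USAGE = 16

# Semantics that have their own TGSI semantic and so are not GENERIC.
SPECIAL = {("POSITION", 0), ("PSIZE", 0), ("COLOR", 0), ("COLOR", 1),
           ("FOG", 0)}

# Every remaining (usage, index) pair, in a fixed order.
GENERIC_SEMANTICS = [
    (usage, index)
    for usage in SM3_USAGES
    for index in range(INDICES_PER_USAGE)
    if (usage, index) not in SPECIAL
]

# Forward map (state tracker side) and its inverse (svga driver side).
TO_GENERIC = {sem: n for n, sem in enumerate(GENERIC_SEMANTICS)}
FROM_GENERIC = dict(enumerate(GENERIC_SEMANTICS))
```

Because the list has exactly 14 * 16 - 5 = 219 entries, any smaller limit loses some SM3 semantics and any larger one produces GENERIC indices with no SM3 counterpart.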

I'm looking at this because this seems the strictest constraint on
choosing a limit for TGSI_SEMANTIC_GENERIC indices.
The other constraint is due to r600 supporting only byte-sized
semantic/index combinations, which is less strict than SM3.

BTW, glsl also looks artificially limited on svga, as only 6 varyings
will be supported, due to it starting from 10.
