Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On 03.03.2010 20:23, Luca Barbieri wrote:
>> And never will... It does not export PIPE_CAP_GLSL, and does not have
>> the shader opcodes to ever do so.
>
> Any Gallium driver should be able to support the GLSL subset without
> control flow.
>
> And if we had a proper optimization infrastructure capable of inlining
> functions, converting conditionals to multiplications and unrolling
> loops (e.g. look at what the nVidia Cg compiler does), then
> essentially all GLSL could be supported on any driver, with only
> limitations on the maximum number of loop iterations.
>
> Isn't it worth supporting that?
>
> BTW, proprietary drivers do this: for instance nVidia supports GLSL on
> nv30, which can't do control flow in fragment shaders and doesn't
> support SM3.

I think the i915 is a lot closer to r300 in that regard (which is quite
a bit more limited than nv30), and it's true that ATI also supported
glsl on that. As far as I know though it was quite easy to bump into
shaders which wouldn't compile. There's only so much you can do if you
have 4 blocks of (max) 16 instructions to run without any control flow
if you need to unroll loops, not to mention lacking instructions for
derivatives, or the fact things like sin/cos will take quite a few
instructions...

nv30, while processing fragment shaders slowly, had a LOT higher
instruction count, IIRC supported derivatives and predication and had
no dependent texturing limit. So that makes it a lot better suited for
glsl hacks.

So, I'm not sure it really makes a whole lot of sense to support glsl
on i915. It'll really only ever work for very simple things (granted
there are apps out there which indeed will only use glsl shaders which
are known to compile fine on r300...)

Roland

--
Download Intel® Parallel Studio Eval
Try the new software tools for yourself. Speed compiling, find bugs
proactively, and fine-tune applications for parallel performance.
See why Intel Parallel Studio got high marks during beta.
http://p.sf.net/sfu/intel-sw-dev
___
Mesa3d-dev mailing list
Mesa3d-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mesa3d-dev
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Wed, 2010-03-03 at 06:58 -0800, Luca Barbieri wrote:
> BTW, i915 is also limited to 0-7 generic indices, and thus doesn't
> work with GLSL at all right now.
>
> This should be relatively easy to fix since it should be enough to
> store the generic indices in the "texCoords" arrays, and then pass
> them to draw_find_shader_output.

Luca,

If you want to go ahead and send a patch, I don't have a problem with
it. Like you say, it should be an easy change.

Keith
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Wed, 2010-03-03 at 11:23 -0800, Luca Barbieri wrote:
> > And never will... It does not export PIPE_CAP_GLSL, and does not have
> > the shader opcodes to ever do so.
>
> Any Gallium driver should be able to support the GLSL subset without
> control flow.
>
> And if we had a proper optimization infrastructure capable of inlining
> functions, converting conditionals to multiplications and unrolling
> loops (e.g. look at what the nVidia Cg compiler does), then
> essentially all GLSL could be supported on any driver, with only
> limitations on the maximum number of loop iterations.
>
> Isn't it worth supporting that?
>
> BTW, proprietary drivers do this: for instance nVidia supports GLSL on
> nv30, which can't do control flow in fragment shaders and doesn't
> support SM3.

OK, maybe never is too strong... But it's certainly a long way off...

Keith
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
> And never will... It does not export PIPE_CAP_GLSL, and does not have
> the shader opcodes to ever do so.

Any Gallium driver should be able to support the GLSL subset without
control flow.

And if we had a proper optimization infrastructure capable of inlining
functions, converting conditionals to multiplications and unrolling
loops (e.g. look at what the nVidia Cg compiler does), then
essentially all GLSL could be supported on any driver, with only
limitations on the maximum number of loop iterations.

Isn't it worth supporting that?

BTW, proprietary drivers do this: for instance nVidia supports GLSL on
nv30, which can't do control flow in fragment shaders and doesn't
support SM3.
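
A rough illustration of the two transformations mentioned above,
written as plain C rather than TGSI. This is only a sketch and not Mesa
code: the function names are made up, and the comparison producing 1.0
or 0.0 merely stands in for a set-on-compare instruction.

```c
#include <assert.h>

/* Hypothetical illustration only; none of these names exist in Mesa. */

/* Branchy form: what an "if" in the shader source asks for. */
static float select_branchy(float x, float a, float b)
{
    if (x >= 0.0f)
        return a;
    return b;
}

/* Flattened form: both sides are evaluated, the comparison yields
 * 1.0 or 0.0, and the result is a multiply-and-add blend. */
static float select_flat(float x, float a, float b)
{
    float cond = (x >= 0.0f) ? 1.0f : 0.0f; /* stands in for a compare op */
    return cond * a + (1.0f - cond) * b;
}

/* A loop with a bound known at compile time... */
static float sum4_loop(const float v[4])
{
    float s = 0.0f;
    for (int i = 0; i < 4; i++)
        s += v[i];
    return s;
}

/* ...and the same loop fully unrolled, with no control flow left. */
static float sum4_unrolled(const float v[4])
{
    float s = 0.0f;
    s += v[0];
    s += v[1];
    s += v[2];
    s += v[3];
    return s;
}
```

The flattened select only matches the branch when the unselected side
is finite, since 0 * Inf gives NaN; and unrolling multiplies the
instruction count by the trip count, which is where hardware
instruction limits start to matter.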
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Wed, 2010-03-03 at 06:58 -0800, Luca Barbieri wrote:
> BTW, i915 is also limited to 0-7 generic indices, and thus doesn't
> work with GLSL at all right now.

And never will... It does not export PIPE_CAP_GLSL, and does not have
the shader opcodes to ever do so.

Keith
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
BTW, i915 is also limited to 0-7 generic indices, and thus doesn't
work with GLSL at all right now.

This should be relatively easy to fix since it should be enough to
store the generic indices in the "texCoords" arrays, and then pass
them to draw_find_shader_output.
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Tue, Mar 02, 2010 at 09:43:51PM +0100, Luca Barbieri wrote:
> - Not sure about i965

On i965 interpolators are not a dedicated piece of hardware, they're
programs like the other shaders. So the problem is entirely
different, and more at the level of space allocation in the
thread-to-thread communication packets in the pipeline vs. register
allocation in the shaders (there's a semi-direct mapping).

OG.
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Tue, 2010-03-02 at 12:43 -0800, Luca Barbieri wrote:
> The difference between an easier and harder life for (some) drivers is
> whether the limit is tied to hardware interpolators or not.
> Once we decide to not tie it, whether the limit is 128 or 256 is of
> course quite inconsequential.
> Allowing arbitrary 32-bit values would however require use of binary
> search or a hash table.
>
> I think you or someone else from the Mesa team should decide how to
> proceed, and most drivers would need to be fixed.
>
> As I understand, the constraints are the following:
>
> Hardware with no capabilities:
> - nv30 does not support any mapping. However, we already need to patch
> fragment programs to insert constants, so we can patch input register
> numbers as well. The current driver only supports 0-7 generic indices,
> but I already implemented support for 0-255 indices with in-driver
> linkage and patching. Note that nv30 lacks control flow in fragment
> programs.
> - nv40 is like nv30, but supports fp control flow, and may have some
> configurable mapping support, with unknown behavior
>
> Hardware with capabilities that must be configured for each fp/vp pair:
> - nv40 might have this but the nVidia OpenGL driver does not use it
> - nv50 has configurable vp->gp and gp->fp mappings with 64 entries.
> The current driver seems to support arbitrary 0-2^32 indices.
> - r300 appears to have a configurable vp->fp mapping. The current
> driver only supports 0-15 generic indices, but redefining
> ATTR_GENERIC_COUNT could be enough to have it support larger numbers.
>
> Hardware with automatic linkage when semantics match:
> - VMware svga appears to support 14 * 16 semantics, but the current
> driver only supports 0-15 generic indices. This could be fixed by
> mapping GENERIC into all non-special SM3 semantics.
>
> Hardware that can do both configurable mappings and automatic linkage:
> - r600 supports linkage in hardware between matching apparently
> byte-sized semantic ids
>
> Other hardware:
> - i915 has no hardware vertex shading
> - Not sure about i965
>
> Software:
> 1. SM3 wants to use 14 * 16 indices overall. This is apparently only
> supported by the VMware closed source state tracker.
> 2. SM2 and non-GLSL OpenGL just want to use as many indices as the
> hardware interpolator count
> 3. GLSL currently wants to use at most about 10 indices more
> than the hardware interpolator count. This can be fixed since we see
> both the fragment and vertex shaders during linkage (the patch I sent
> did that)
> 4. GLSL with EXT_separate_shader_objects does not add requirements
> because only gl_TexCoord and other builtin varyings are supported.
> User-defined varyings are not supported
> 5. A hypothetical version of EXT_separate_shader_objects extended to
> support user-defined varyings would either want arbitrary 32-bit
> generic indices (by interning strings to generate the indices) or the
> ability to specify a custom mapping between shader indices
> 6. A hypothetical "no-op" implementation of the GLSL linker would have
> the same requirement
>
> Also note that non-GENERIC indices have peculiar properties.
>
> For COLOR and BCOLOR:
> 1. SM3 and OpenGL with glClampColor appropriately set want it to
> _not_ be clamped to [0, 1]
> 2. SM2 and normal OpenGL apparently want it to be clamped to [0, 1]
> (sometimes for fixed point targets only) and may also allow using
> U8_UNORM precision for it instead of FP32
> 3. OpenGL allows enabling two-sided lighting, in which case COLOR in
> the fragment shader is automagically set to BCOLOR for back faces
> 4. Older hardware (e.g. nv30) tends to support BCOLOR but not FACING.
> Some hardware (e.g. nv40) supports both FACING and BCOLOR in hardware.
> The latest hardware probably supports FACING only.
>
> Any API that requires special semantics for COLOR and BCOLOR (i.e.
> non-SM3) seems to only want 0-1 indices.
>
> Note that SM3 does *not* include BCOLOR, so basically the limits for
> generic indices would need to be conditional on BCOLOR being present
> or not (e.g. if it is present, we must reserve two semantic slots in
> svga for it).
>
> POSITION0 is obviously special.
> PSIZE0 is also special for points.
>
> FOG0 seems right now to just be a GENERIC with a single component.
> Gallium could be extended to support fixed function fog, which most
> DX9 hardware supports (nv30/nv40 and r300). This is mostly orthogonal
> to the semantic issue.
>
> TGSI_SEMANTIC_NORMAL is essentially unused and should probably be removed
>
> The options are the ones you outlined, plus:
> (e) Allow arbitrary 32-bit indices. This requires slightly more
> complicated data structures in some cases, and will require svga and
> r600 to fall back to software linkage if numbers are too high.
> (f) Limit semantic indices to hardware interpolators _and_ introduce
> an interface to let the user specify an
>
> Personally I think the simplest idea for now could be to have all
> drivers s
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Tue, Mar 2, 2010 at 10:20 PM, Luca Barbieri wrote:
> On Tue, Mar 2, 2010 at 10:00 PM, Corbin Simpson wrote:
> > FYI r300 only supports 24 interpolators: 16 linear and 8 perspective.
> > (IIRC; not in front of the docs right now.) r600 supports 256 fully
> > configurable interpolators.
>
> Yes, but if you raised ATTR_GENERIC_COUNT, the current driver would
> support higher semantic indices right? (of course, with a limit of
> 8/24 different semantic indices used at once).

That's right.
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Tue, Mar 2, 2010 at 10:00 PM, Corbin Simpson wrote:
> FYI r300 only supports 24 interpolators: 16 linear and 8 perspective.
> (IIRC; not in front of the docs right now.) r600 supports 256 fully
> configurable interpolators.

Yes, but if you raised ATTR_GENERIC_COUNT, the current driver would
support higher semantic indices right? (of course, with a limit of
8/24 different semantic indices used at once).
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
FYI r300 only supports 24 interpolators: 16 linear and 8 perspective.
(IIRC; not in front of the docs right now.) r600 supports 256 fully
configurable interpolators.

--
Only fools are easily impressed by what is only barely beyond their
reach. ~ Unknown

Corbin Simpson
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
The difference between an easier and harder life for (some) drivers is
whether the limit is tied to hardware interpolators or not.
Once we decide to not tie it, whether the limit is 128 or 256 is of
course quite inconsequential.
Allowing arbitrary 32-bit values would however require use of binary
search or a hash table.

I think you or someone else from the Mesa team should decide how to
proceed, and most drivers would need to be fixed.

As I understand, the constraints are the following:

Hardware with no capabilities:
- nv30 does not support any mapping. However, we already need to patch
fragment programs to insert constants, so we can patch input register
numbers as well. The current driver only supports 0-7 generic indices,
but I already implemented support for 0-255 indices with in-driver
linkage and patching. Note that nv30 lacks control flow in fragment
programs.
- nv40 is like nv30, but supports fp control flow, and may have some
configurable mapping support, with unknown behavior

Hardware with capabilities that must be configured for each fp/vp pair:
- nv40 might have this but the nVidia OpenGL driver does not use it
- nv50 has configurable vp->gp and gp->fp mappings with 64 entries.
The current driver seems to support arbitrary 0-2^32 indices.
- r300 appears to have a configurable vp->fp mapping. The current
driver only supports 0-15 generic indices, but redefining
ATTR_GENERIC_COUNT could be enough to have it support larger numbers.

Hardware with automatic linkage when semantics match:
- VMware svga appears to support 14 * 16 semantics, but the current
driver only supports 0-15 generic indices. This could be fixed by
mapping GENERIC into all non-special SM3 semantics.

Hardware that can do both configurable mappings and automatic linkage:
- r600 supports linkage in hardware between matching apparently
byte-sized semantic ids

Other hardware:
- i915 has no hardware vertex shading
- Not sure about i965

Software:
1. SM3 wants to use 14 * 16 indices overall. This is apparently only
supported by the VMware closed source state tracker.
2. SM2 and non-GLSL OpenGL just want to use as many indices as the
hardware interpolator count
3. GLSL currently wants to use at most about 10 indices more
than the hardware interpolator count. This can be fixed since we see
both the fragment and vertex shaders during linkage (the patch I sent
did that)
4. GLSL with EXT_separate_shader_objects does not add requirements
because only gl_TexCoord and other builtin varyings are supported.
User-defined varyings are not supported
5. A hypothetical version of EXT_separate_shader_objects extended to
support user-defined varyings would either want arbitrary 32-bit
generic indices (by interning strings to generate the indices) or the
ability to specify a custom mapping between shader indices
6. A hypothetical "no-op" implementation of the GLSL linker would have
the same requirement

Also note that non-GENERIC indices have peculiar properties.

For COLOR and BCOLOR:
1. SM3 and OpenGL with glClampColor appropriately set want it to
_not_ be clamped to [0, 1]
2. SM2 and normal OpenGL apparently want it to be clamped to [0, 1]
(sometimes for fixed point targets only) and may also allow using
U8_UNORM precision for it instead of FP32
3. OpenGL allows enabling two-sided lighting, in which case COLOR in
the fragment shader is automagically set to BCOLOR for back faces
4. Older hardware (e.g. nv30) tends to support BCOLOR but not FACING.
Some hardware (e.g. nv40) supports both FACING and BCOLOR in hardware.
The latest hardware probably supports FACING only.

Any API that requires special semantics for COLOR and BCOLOR (i.e.
non-SM3) seems to only want 0-1 indices.

Note that SM3 does *not* include BCOLOR, so basically the limits for
generic indices would need to be conditional on BCOLOR being present
or not (e.g. if it is present, we must reserve two semantic slots in
svga for it).

POSITION0 is obviously special.
PSIZE0 is also special for points.

FOG0 seems right now to just be a GENERIC with a single component.
Gallium could be extended to support fixed function fog, which most
DX9 hardware supports (nv30/nv40 and r300). This is mostly orthogonal
to the semantic issue.

TGSI_SEMANTIC_NORMAL is essentially unused and should probably be removed

The options are the ones you outlined, plus:
(e) Allow arbitrary 32-bit indices. This requires slightly more
complicated data structures in some cases, and will require svga and
r600 to fall back to software linkage if numbers are too high.
(f) Limit semantic indices to hardware interpolators _and_ introduce
an interface to let the user specify an

Personally I think the simplest idea for now could be to have all
drivers support 256 indices or, in the case of r600 and svga, the
maximum value supported by the hardware, and expose that as a cap (as
well as another cap for the number of different semantic values
supported at once). The minimum guaranteed value is set to the lowest h
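
For reference, a minimal C sketch of what in-driver linkage with
arbitrary 32-bit generic indices and binary search, as raised in this
message, could look like. It is an assumption-laden illustration and
not code from any Gallium driver: struct link_entry, build_link_table
and find_slot are hypothetical names, and a real driver would also
have to respect its interpolator limit.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types; not the Gallium or draw-module API. */
struct link_entry {
    uint32_t generic_index; /* GENERIC index written by the vertex shader */
    int hw_slot;            /* interpolator slot assigned at link time */
};

/* Order entries by generic index without risking integer overflow. */
static int cmp_entry(const void *a, const void *b)
{
    const struct link_entry *ea = a, *eb = b;
    return (ea->generic_index > eb->generic_index) -
           (ea->generic_index < eb->generic_index);
}

/* Sort the vertex shader outputs and number the slots 0..n-1. */
static void build_link_table(struct link_entry *tab, size_t n)
{
    qsort(tab, n, sizeof tab[0], cmp_entry);
    for (size_t i = 0; i < n; i++)
        tab[i].hw_slot = (int)i;
}

/* Look up the slot feeding a fragment shader input.
 * Returns -1 when the vertex shader does not write that index. */
static int find_slot(const struct link_entry *tab, size_t n, uint32_t index)
{
    struct link_entry key = { index, 0 };
    const struct link_entry *e =
        bsearch(&key, tab, n, sizeof tab[0], cmp_entry);
    return e ? e->hw_slot : -1;
}
```

The table is rebuilt per vertex/fragment shader pair, so the cost is
paid at link time and the indices themselves can be as sparse as the
state tracker likes.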
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Tue, 2010-03-02 at 04:36 -0800, Luca Barbieri wrote:
> >> The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
> >> - 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
> >> TGSI semantics which they need to be mapped to/from.
> >
> > Agree, though I'd opt for 255 as a round number.
>
> The problem with this is that you only have 14 SM3 semantics with 16
> indices each, so you can't map 256 generic indices into the VMware
> interface, or directly into an SM3 shader.
> You only have 14 * 16 minus the ones used for non-GENERIC semantics
> (the ones mentioned above).
> And of course, if you choose a smaller number, you can't map SM3
> _into_ Gallium, so you need to choose the exact number required for
> SM3.
>
> Tying Gallium in this way to SM3 is surely a bit ugly, but it's just a
> constant, and I don't see any other way to implement SM3 without doing
> linkage in software in the r600 and svga drivers and/or in SM3 state
> trackers.

I accept that it can be viewed as an arbitrary constant, but maybe it's
a step too far. If another API or piece of hardware came along that we
decided was important, the calculations might change and we'd be stuck.

I see our options as:

a) Picking a lower number like 128, that an SM3 state tracker could
usually be able to directly translate incoming semantics into, but
which would force it to renumber under rare circumstances. This would
make life easier for the open drivers at the expense of the closed
code.

b) Picking 256 to make life easier for some closed-source SM3 state
tracker, but harder for open drivers.

c) Picking 219 (or some other magic number) that happens to work with
the current set of constraints, but makes gallium fragile in the face
of new constraints.

d) Abandoning the current gallium linkage rules and coming up with
something new, for instance forcing the state trackers to renumber
always and making life trivial for the drivers...

I suspect we'll end up reworking the whole GENERIC idea at some stage,
ie (d). But for now, I don't think that some piece of closed code
should be dictating the gallium interface direction, so I'd suggest
something that makes the open drivers' lives easier -- ie (a) or (c).

To be honest, I'd suggest keeping a modicum of independence in gallium
plus making the open code simpler, and going with 128... But if you
feel strongly, I don't mind the bigger number (219) either, except on
aesthetic grounds...

Keith
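
The renumbering that options (a) and (d) would push onto the state
tracker can be sketched roughly as follows. This is a hypothetical
illustration in C, not code from any state tracker: struct renumber
and its functions are invented names, and the 14 usages with 16
indices each are simply the figures quoted in this thread.

```c
#include <assert.h>
#include <string.h>

#define SM3_USAGES  14 /* semantic usages, figure taken from the thread */
#define SM3_INDICES 16 /* indices 0..15 per usage */

/* Hypothetical per-link state: which GENERIC index each SM3
 * (usage, index) pair has been given so far. */
struct renumber {
    short slot[SM3_USAGES][SM3_INDICES]; /* -1 means not seen yet */
    int next;                            /* next free GENERIC index */
};

static void renumber_init(struct renumber *r)
{
    /* All-ones bytes make every short read back as -1. */
    memset(r->slot, -1, sizeof r->slot);
    r->next = 0;
}

/* Return the GENERIC index for an SM3 pair, handing out sequential
 * numbers in order of first use. The same object must be used for the
 * vertex and the pixel shader so both sides agree. */
static int renumber_get(struct renumber *r, unsigned usage, unsigned index)
{
    assert(usage < SM3_USAGES && index < SM3_INDICES);
    if (r->slot[usage][index] < 0)
        r->slot[usage][index] = (short)r->next++;
    return r->slot[usage][index];
}
```

Because the numbers depend on what both shaders use, this only works
when the state tracker sees the vertex and pixel shader together,
which is the cost Keith attributes to the renumbering options.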
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
> I don't think anybody has tried hooking it up - so far the primary
> purpose of the svga gallium driver has been GL support, but thinking
> about it you're probably right.

I'm a bit confused about this: I was under the impression that VMware
Tools for Windows used your DirectX state tracker and a WGL version of
Mesa, talking to the svga Gallium driver.
How does it actually work?
What do you normally use the DirectX 9 state tracker with?

> The details of the closed code aren't terribly important as they could
> always be changed.

Sure, but it currently is the only Gallium user that supports the SM3
model and thus the only one that really needs arbitrary semantic
indices, and puts constraints on them.

>> The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
>> - 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
>> TGSI semantics which they need to be mapped to/from.
>
> Agree, though I'd opt for 255 as a round number.

The problem with this is that you only have 14 SM3 semantics with 16
indices each, so you can't map 256 generic indices into the VMware
interface, or directly into an SM3 shader.
You only have 14 * 16 minus the ones used for non-GENERIC semantics
(the ones mentioned above).
And of course, if you choose a smaller number, you can't map SM3
_into_ Gallium, so you need to choose the exact number required for
SM3.

Tying Gallium in this way to SM3 is surely a bit ugly, but it's just a
constant, and I don't see any other way to implement SM3 without doing
linkage in software in the r600 and svga drivers and/or in SM3 state
trackers.
Re: [Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
On Tue, 2010-03-02 at 03:26 -0800, Luca Barbieri wrote:
> I've been looking at shader semantics some more, and I'm a bit
> surprised by how the svga driver works.
> It seems that an obvious implementation of a DirectX 9 state tracker
> just won't work with the svga driver.

I don't think anybody has tried hooking it up - so far the primary
purpose of the svga gallium driver has been GL support, but thinking
about it you're probably right.

> In SM3, vertex/fragment semantics can be arbitrary (independent of
> hardware resources), but indices are limited to a 0-15 range.
>
> A DirectX 9 state tracker must convert those to TGSI_SEMANTIC_GENERIC.
> How does the VMware one do that?
> Assuming that it maps them directly, this means that the driver must
> support GENERIC semantic indices up to a number that varies between
> about 200 and 255.

The details of the closed code aren't terribly important as they could
always be changed. But you're right that any DX9 state tracker would
have to try to pack all or almost all the DX9 semantics into the TGSI
GENERIC range. The simplest implementation would end up using 0..255.

> The problem is that the vmware svga driver, as far as I can see,
> doesn't support indices greater than 15.
> This is caused by the fact that it maps all GENERIC semantics to
> SVGA3D_DECLUSAGE_TEXCOORD, and the index bitfield in the svga virtual
> interface only supports 4 bits.

Indeed, good point. That's basically a shortcoming of the current svga
gallium driver which needs to be addressed somehow.

> In other words, SM3 under VMware with arbitrary semantics (allowed by
> SM3 and other drivers) really seems broken, for a straightforward
> DirectX9 state tracker implementation.
>
> The only way it can work now is if the DirectX 9 state tracker looks
> at both the vertex and pixel shaders, links them, and outputs
> sequential semantic indices.
>
> It seems to me that the svga driver should be fixed to map GENERIC to
> *all* SM3 semantic types, ideally in a way that reverses the SM3 ->
> GENERIC transformation done by the DX9 state tracker.

Agree, though I don't think reversibility is necessary.

> Doing this requires specifying a maximum index for
> TGSI_SEMANTIC_GENERIC which is very carefully chosen to allow 1:1
> mapping with SM3, so that DirectX 9 state trackers have enough indices
> to represent all SM3, and the svga driver can fit all indices in the
> SM3-like semantics of the VMware virtual GPU interface.
>
> The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
> - 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
> TGSI semantics which they need to be mapped to/from.

Agree, though I'd opt for 255 as a round number.

> I'm looking at this because this seems the strictest constraint on
> choosing a limit for TGSI_SEMANTIC_GENERIC indices.
> The other constraint is due to r600 supporting only byte-sized
> semantic/index combinations, which is less strict than SM3.
>
> BTW, glsl also looks artificially limited on svga, as only 6 varyings
> will be supported, due to it starting from 10.

Agree, well spotted. I'll take a look at that.

Keith
[Mesa3d-dev] Does DX9 SM3 -> VMware svga with arbitrary semantics work? How?
I've been looking at shader semantics some more, and I'm a bit
surprised by how the svga driver works.
It seems that an obvious implementation of a DirectX 9 state tracker
just won't work with the svga driver.

In SM3, vertex/fragment semantics can be arbitrary (independent of
hardware resources), but indices are limited to a 0-15 range.

A DirectX 9 state tracker must convert those to TGSI_SEMANTIC_GENERIC.
How does the VMware one do that?
Assuming that it maps them directly, this means that the driver must
support GENERIC semantic indices up to a number that varies between
about 200 and 255.

The problem is that the vmware svga driver, as far as I can see,
doesn't support indices greater than 15.
This is caused by the fact that it maps all GENERIC semantics to
SVGA3D_DECLUSAGE_TEXCOORD, and the index bitfield in the svga virtual
interface only supports 4 bits.

In other words, SM3 under VMware with arbitrary semantics (allowed by
SM3 and other drivers) really seems broken, for a straightforward
DirectX9 state tracker implementation.

The only way it can work now is if the DirectX 9 state tracker looks
at both the vertex and pixel shaders, links them, and outputs
sequential semantic indices.

It seems to me that the svga driver should be fixed to map GENERIC to
*all* SM3 semantic types, ideally in a way that reverses the SM3 ->
GENERIC transformation done by the DX9 state tracker.

Doing this requires specifying a maximum index for
TGSI_SEMANTIC_GENERIC which is very carefully chosen to allow 1:1
mapping with SM3, so that DirectX 9 state trackers have enough indices
to represent all SM3, and the svga driver can fit all indices in the
SM3-like semantics of the VMware virtual GPU interface.

The correct value in this case seems to be 219 = 14 * 16 SM3 semantics
- 5 for COLOR0, COLOR1, PSIZE0, POSITION0, FOG0 which have specific
TGSI semantics which they need to be mapped to/from.

I'm looking at this because this seems the strictest constraint on
choosing a limit for TGSI_SEMANTIC_GENERIC indices.
The other constraint is due to r600 supporting only byte-sized
semantic/index combinations, which is less strict than SM3.

BTW, glsl also looks artificially limited on svga, as only 6 varyings
will be supported, due to it starting from 10.
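
To make the 219 figure and the reversible mapping concrete, here is a
small C sketch under stated assumptions. It is not svga driver code:
the function names are hypothetical, and the usage numbers for
POSITION, PSIZE, COLOR and FOG are assumed values modelled on the
D3D9 declaration usages, chosen only so the example is self-contained.

```c
#include <assert.h>

#define SM3_USAGES  14u
#define SM3_INDICES 16u

/* Assumed usage numbers, for illustration only. */
#define U_POSITION 0u
#define U_PSIZE    4u
#define U_COLOR    10u
#define U_FOG      11u

/* The five pairs that have their own TGSI semantic:
 * POSITION0, PSIZE0, FOG0, COLOR0 and COLOR1. */
static int is_special(unsigned usage, unsigned index)
{
    return (usage == U_POSITION && index == 0) ||
           (usage == U_PSIZE && index == 0) ||
           (usage == U_FOG && index == 0) ||
           (usage == U_COLOR && index <= 1);
}

/* Count the (usage, index) pairs left over for GENERIC. */
static unsigned generic_count(void)
{
    unsigned n = 0;
    for (unsigned u = 0; u < SM3_USAGES; u++)
        for (unsigned i = 0; i < SM3_INDICES; i++)
            if (!is_special(u, i))
                n++;
    return n;
}

/* SM3 pair -> GENERIC index, or -1 for the special pairs. */
static int sm3_to_generic(unsigned usage, unsigned index)
{
    int g = 0;
    assert(usage < SM3_USAGES && index < SM3_INDICES);
    if (is_special(usage, index))
        return -1;
    for (unsigned u = 0; u < SM3_USAGES; u++)
        for (unsigned i = 0; i < SM3_INDICES; i++) {
            if (u == usage && i == index)
                return g;
            if (!is_special(u, i))
                g++;
        }
    return -1; /* not reached for valid input */
}

/* GENERIC index -> SM3 pair. Returns 1 on success, 0 if g is too big. */
static int generic_to_sm3(unsigned g, unsigned *usage, unsigned *index)
{
    for (unsigned u = 0; u < SM3_USAGES; u++)
        for (unsigned i = 0; i < SM3_INDICES; i++) {
            if (is_special(u, i))
                continue;
            if (g == 0) {
                *usage = u;
                *index = i;
                return 1;
            }
            g--;
        }
    return 0;
}
```

The linear scans keep the sketch short; a driver would more likely
precompute both directions as lookup tables.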