Actually, we don't even worry about the rasterizer's routing table until a pair of shaders is bound and we start drawing. Right before the draw call, we regenerate, among other things, the routing tables for the vertex shader and the rasterizer.
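To make that draw-time step concrete, here is a minimal sketch in C of matching fragment shader inputs to vertex shader outputs by semantic. It is not the actual r300 driver code: every name in it (`shader_info`, `build_routing_table`, `MAX_SLOTS`, `ROUTE_UNUSED`) is made up for illustration, and the slot count of 16 just reuses the rough figure quoted further down.

```c
#include <assert.h>

/* Hypothetical names throughout; this is a sketch, not driver code. */
#define MAX_SLOTS    16   /* assumed limit, for illustration only */
#define ROUTE_UNUSED (-1)

enum semantic { SEM_COLOR, SEM_TEXCOORD };

struct shader_io {
    enum semantic name;   /* what the value means */
    unsigned index;       /* e.g. 3 for TEXCOORD[3] */
};

struct shader_info {
    unsigned count;                  /* number of valid entries in io[] */
    struct shader_io io[MAX_SLOTS];
};

/*
 * For each fragment shader input, look up the vertex shader output that
 * carries the same semantic and record its slot in route[]. Inputs with
 * no matching output stay ROUTE_UNUSED. Returns how many inputs were
 * routed. Neither shader is modified, so nothing has to be recompiled.
 */
static unsigned
build_routing_table(const struct shader_info *vs_out,
                    const struct shader_info *fs_in,
                    int route[MAX_SLOTS])
{
    unsigned routed = 0;

    assert(vs_out->count <= MAX_SLOTS && fs_in->count <= MAX_SLOTS);

    for (unsigned i = 0; i < MAX_SLOTS; i++)
        route[i] = ROUTE_UNUSED;

    for (unsigned i = 0; i < fs_in->count; i++) {
        for (unsigned o = 0; o < vs_out->count; o++) {
            if (vs_out->io[o].name == fs_in->io[i].name &&
                vs_out->io[o].index == fs_in->io[i].index) {
                route[i] = (int)o;
                routed++;
                break;
            }
        }
    }
    return routed;
}
```

Calling something like this right before each draw with the currently bound pair is the whole idea: the compiled shaders stay as they are, and only the small table changes when the pairing changes.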
This is *incredibly* powerful, because it means we only have to compile the shaders once, and load the rasterizer tables based on those shaders. I even baked up a CSO to cache the tables, but it turned out to be an overall slowdown.

If you get this patch in, then you'll still have to fight with every other state tracker that doesn't prettify their TGSI. It would be a much better approach to attempt to RE the routing tables.

Also FYI the r300-r500 rasterizer can only handle, off the top of my head, 16 sets of vectors total (8 colors, 8 texcoords) so you're not the only ones with this kind of limitation. The situation gets better for r600 and nv50.

~ C.

On Mon, Jan 18, 2010 at 8:27 AM, Luca Barbieri <l...@luca-barbieri.com> wrote:
> So, basically, you allocate the rasterizer units according to the
> vertex shader, and when the fragment shader comes up, you say "write
> rasterizer output 4 to fragment input 1000000"?
>
> The current nouveau drivers can't do this.
> There are "routing" registers in hardware, but I think the nVidia
> proprietary driver (at least without GLSL) leaves them unaltered after
> initialization and I don't think we really know how they would work.
> They are also very likely limited to at most 256 values (maybe even
> less, such as 16), even if they can actually be made to work.
>
> The way the current pre-nv50 driver works is that there are 8 slots,
> each of which has an interpolator and a fixed associated vertex shader
> output and fixed fragment input. This seems a rather obvious way to
> design hardware, and so shouldn't be uncommon.
>
> Thus, the inputs/outputs can't be packed, because that will break if
> the fragment shader doesn't use a vertex output.
> And there is no way to correct that when the fragment program comes
> up, other than recompiling the vertex shader, which would be very
> desirable to avoid having to do.
>
> Non-GLSL programs can only use the 8 texcoords, so there is no problem
> there since hardware supports 8 slots.
>
> Thus, I think my proposed solution is the simplest and most efficient
> approach.
> Any other solution would require much more, and slower, code in the
> Gallium drivers for nv30, nv40, and maybe Intel too.

--
Only fools are easily impressed by what is only barely beyond their reach. ~ Unknown

Corbin Simpson
<mostawesomed...@gmail.com>
_______________________________________________
Nouveau mailing list
Nouveau@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/nouveau