On Fri, Apr 04, 2003 at 05:14:54PM -0700, Brian Paul wrote:
> José Fonseca wrote:
> >The thing I found most interesting is the issue of applying the TCL
> >operations to all the vertices at once, or to one vertex at a time. From
> >previous discussions on this list it seems that nowadays most
> >of CPU performance is dictated by the cache, so it really seems the latter
> >option is more efficient, but Mesa implements the former (they are even
> >called "pipeline stages") and to change would mean a big overhaul of the
> >TnL module.
>
> On a historical note, the earliest versions of Mesa processed a single
> vertex at a time, instead of operating on arrays of vertices, stage by
> stage. Going to the latter was a big speed up at the time.

Yes, and the use of SIMD instructions also favors that approach. Actually,
in that article they chose to process 4 vertices at a time and not just
one, surely because that's the number that fits in the MM registers.

I think the fact that CPUs got so much faster while buses didn't keep
pace changed the picture, making non-cached memory access look awfully
slow compared with everything else.

> Since the T&L code is a module, one could implement the single-vertex
> scheme as an alternate module. It would be an interesting experiment.

Indeed.

José Fonseca

_______________________________________________
Dri-devel mailing list
https://lists.sourceforge.net/lists/listinfo/dri-devel
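
To make the two schemes discussed above concrete, here is a minimal C sketch. It is not Mesa's TnL code: the Vec4 type, the two stages (a matrix transform and a perspective divide), and the names run_staged and run_batched are all hypothetical, chosen only for illustration. BATCH is set to 4 to mirror the 4-vertices-at-a-time case mentioned above; the loops are plain scalar C, not SIMD.

```c
#include <assert.h>
#include <stddef.h>

#define BATCH 4  /* assumed group size, mirroring the 4-wide case */

typedef struct { float x, y, z, w; } Vec4;  /* hypothetical vertex type */

/* Hypothetical stage 1: multiply each vertex by a row-major 4x4 matrix.
 * The vertex is copied first, so in and out may point to the same array. */
static void stage_transform(const float m[16], const Vec4 *in, Vec4 *out,
                            size_t n)
{
    for (size_t i = 0; i < n; i++) {
        Vec4 v = in[i];
        out[i].x = m[0]  * v.x + m[1]  * v.y + m[2]  * v.z + m[3]  * v.w;
        out[i].y = m[4]  * v.x + m[5]  * v.y + m[6]  * v.z + m[7]  * v.w;
        out[i].z = m[8]  * v.x + m[9]  * v.y + m[10] * v.z + m[11] * v.w;
        out[i].w = m[12] * v.x + m[13] * v.y + m[14] * v.z + m[15] * v.w;
    }
}

/* Hypothetical stage 2: perspective divide, in place (assumes w != 0). */
static void stage_project(Vec4 *v, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        float inv = 1.0f / v[i].w;
        v[i].x *= inv;
        v[i].y *= inv;
        v[i].z *= inv;
        v[i].w = 1.0f;
    }
}

/* Scheme A: stage by stage. Each stage sweeps the whole vertex array
 * before the next one starts, so a large array is walked once per stage. */
void run_staged(const float m[16], const Vec4 *in, Vec4 *out, size_t n)
{
    assert(in != NULL && out != NULL);
    stage_transform(m, in, out, n);
    stage_project(out, n);
}

/* Scheme B: a small batch goes through every stage before the next batch
 * is touched, so the batch is likely still in cache for the second stage. */
void run_batched(const float m[16], const Vec4 *in, Vec4 *out, size_t n)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i += BATCH) {
        size_t len = (n - i < BATCH) ? (n - i) : BATCH;  /* last batch may be short */
        stage_transform(m, in + i, out + i, len);
        stage_project(out + i, len);
    }
}
```

Both functions produce the same results; only the order in which memory is visited differs. Which one is faster on a given machine is exactly the experiment suggested above, and the sketch makes no claim about the outcome.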