Hi, Paul,
As Tim pointed out earlier, the more limiting resource is likely to be lock contention in the NVidia driver instead of bus contention.
Note that the 500 MB/sec figure is per PCI-e 2.0 lane, so Bb is going to be 4000 MB/sec (500 MB/sec * 8) per Quadroplex (PCI-e 2.0 8x) on the NIST machine. On my system I have an X58 chipset, so it's even wider (500 MB/sec * 16) per card. Also, the NIST machine's FSB is limited to 1.6 GT/sec, which is less than the 5 GT/sec that PCI-e provides.
Given that the data is static (we're not transferring any vertex data over the bus, just making glDrawElements calls to VBOs in STATIC_DRAW mode), I doubt that the OpenGL command stream is going to be getting anywhere near the actual bus bandwidth.
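As a rough sanity check, the numbers above can be plugged into the formula from Paul's message (quoted below). This is only a sketch: Sd = 64 and O = 2048 are Paul's example guesses, 1 MB is assumed to be 2^20 bytes (which reproduces his FPS = 371 result), all four windows are assumed to share a single link, and the function names are made up for illustration.

```python
MB = 2**20  # assumption: 1 MB = 2^20 bytes; this reproduces the FPS = 371 example


def fps_bound(bb, nw, nd, sd, o):
    """Frame rate if the bus were the only limit: Bb / (Nw * Nd * Sd + O)."""
    return bb / (nw * nd * sd + o)


def solve_sd(bb, nw, nd, o, fps):
    """Invert the same formula for Sd, given a frame rate."""
    return (bb / fps - o) / (nw * nd)


# Example values from the quoted message; Sd and O are rough guesses.
NW, ND, SD, O = 4, 5500, 64, 2048

single_lane = fps_bound(500 * MB, NW, ND, SD, O)   # Paul's original Bb
x8_link = fps_bound(4000 * MB, NW, ND, SD, O)      # PCI-e 2.0 8x, per Quadroplex
x16_link = fps_bound(8000 * MB, NW, ND, SD, O)     # PCI-e 2.0 16x (X58 system)

print(f"500 MB/sec:  {single_lane:.0f} FPS")
print(f"4000 MB/sec: {x8_link:.0f} FPS")
print(f"8000 MB/sec: {x16_link:.0f} FPS")

# Round trip: feeding a computed FPS back in recovers the Sd guess.
print(f"Sd recovered: {solve_sd(500 * MB, NW, ND, O, single_lane):.1f}")
```

With these guesses the bound scales linearly with Bb, so the 8x link gives eight times the single-lane figure. If the measured frame rate is far below that, it would suggest the bus is not the limiting resource, unless Sd is much larger than guessed.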
On 01/27/2012 09:17 AM, Paul Martz wrote:
Hi Jason -- I agree that the system bus is a likely source of the scaling issue. It is certainly a single resource that has to be shared by all the GPUs in the system.

To make a case for bus bandwidth limitation, you might work through the math.

                   Bb
    FPS = ------------------
           Nw * Nd * Sd + O

Where Bb is the bus bandwidth in bytes per second, Nw is the number of windows you're rendering to, Nd is the number of drawing commands, Sd is the size of the drawing commands in bytes, and O is the per-frame OpenGL command overhead in bytes.

The knowns:

    Bb = 500 MB/sec (PCIe 2.0)
    Nw = 4
    Nd = 5500

Sd is a little harder to compute. It'll depend on the draw technology you're using (buffer objects or display lists) and the underlying OpenGL implementation. You could make a very rough guess here by figuring a fullword per OpenGL command, and a fullword per OpenGL command parameter. Just for the sake of example, let's say Sd = 64 (16 fullwords to draw a single osg::Geometry).

O encompasses all the per-frame OpenGL commands that OSG emits: glClear, glClearColor, glClearMask, glDepthMask, matrices, light sources, swap, etc. You could plug in a rough guess like you would for Sd. Again, just as an example, let's use O = 2048.

Plugging all that into a calculator, I get FPS = 371. But if Nw dropped to a single window, then FPS would jump to over 1400 -- or, more likely, you'd become limited by some other single resource in the system.

The nice thing about algebra is that you can solve for the unknown, so of course you have the FPS, and if you have a pretty good guess for O (which can actually be pretty sloppy anyhow), then you ought to be able to solve for Sd and ask yourself if that result makes sense.

I hope this helps.
-Paul

_______________________________________________
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org