Re: [osg-users] Serialization of draw dispatch on multi GPU systems
Hi David Callu,

I'm sorry I have to ask this. My project builds well but it does not launch. I have tried the code of several examples but it still fails. Could you please give me a step-by-step guide of how you set up your OpenSceneGraph project in Eclipse?

Thanks,
Felix

On 8/24/07, David Callu [EMAIL PROTECTED] wrote:

Hi,

I use Eclipse under Linux. First install the CMakeEd plugin ( http://www.cthing.com/CMakeEd.asp ) to write your CMakeLists.txt. Then execute cmake ./path/of/your/project. Then Menu -> Project -> C/C++ Make Project, set your path, target, etc. Then Ctrl+B and build!

Cheers,
David Callu

___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org
Re: [osg-users] Serialization of draw dispatch on multi GPU systems
Hi all,

Has anybody used Eclipse with OSG? I tried using CMake, but Eclipse being an IDE and not a compiler, I failed to get a breakthrough. I would be very grateful for guidance on this.

Thanks,
Felix

On 8/23/07, Robert Osfield [EMAIL PROTECTED] wrote:

Hi All,

Yesterday I did testing on my new Intel quad core + dual 7800GT system and found that performance in single-threaded mode sometimes exceeded the multi-threaded models, which is seriously screwy. So I investigated.

Looking at the draw stats it was clear that the draw dispatch by two threads/cores to two graphics contexts wasn't scaling well at all, with the single-threaded draw dispatch total time being lower than a single draw dispatch when run multi-threaded. The per-draw-dispatch stats for each camera showed that running multi-threaded was more than twice as slow. Clearly something in the system between the CPU cores and the GPUs is scaling very, very poorly. Is it the CPU front side bus? Is the OpenGL driver not properly managing multiple graphics contexts? Is the chipset not properly dispatching data in parallel to two cards? I don't know the answer.

As a test this morning I stuck a static Mutex into the osgViewer::Renderer's draw dispatch code to prevent the draw dispatch for the cameras from running in parallel - the cull can still run in parallel, but not the draw dispatch. The result was startling - an overall performance (i.e. fps) boost of 50-77% on the test models I've thrown at it, with CullThreadPerCameraDrawThreadPerContext benefiting the most. This also allows the multi-threaded models to outperform single-threaded, as one would expect. I haven't tried out these changes on my Athlon dual core system, and can't do so right away as I've taken the gfx cards out of it for this system, but perhaps others can do similar tests.
My expectation is that different systems will exhibit different performance characteristics when running multi-threaded - a well-balanced system should work best without serialization of the draw dispatch, but how many of our modern systems are well balanced?

I have checked in my changes to osgViewer::Renderer and osg::DisplaySettings to support the new serializer, so an svn update will get these. Since the performance difference is so colossal on my system, and I expect my system is close to the common set-up of modern multi-GPU systems, I've made the default for the serializer true. To toggle the serializer:

  osgviewer mymodel.ive --serialize-draw ON
  osgviewer mymodel.ive --serialize-draw OFF

Or:

  export OSG_SERIALIZE_DRAW_DISPATCH=ON
  export OSG_SERIALIZE_DRAW_DISPATCH=OFF
  osgviewer cow.osg

Replace export with setenv or set according to what your native platform is.

It would be interesting to hear from others with multi-CPU, multi-GPU systems to see how they fare. I am also curious about systems like AMD's 4x4, which has two CPU sockets, each with a chipset connecting to the gfx slots. Does anyone have one? It could be that such a system scales better than my Intel quad core system. On this system I do actually have two chipsets too - they provide 2 x 16x PCI Express + 1 x 8x PCI Express bandwidth to the three gfx slots - but it kinda looks like this isn't working properly, or perhaps it's the OpenGL driver that sucks at multi-thread, multi-GPU...

Robert.
Re: [osg-users] Serialization of draw dispatch on multi GPU systems
Hi,

I use Eclipse under Linux. First install the CMakeEd plugin ( http://www.cthing.com/CMakeEd.asp ) to write your CMakeLists.txt. Then execute cmake ./path/of/your/project. Then Menu -> Project -> C/C++ Make Project, set your path, target, etc. Then Ctrl+B and build!

Cheers,
David Callu
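For reference, a minimal CMakeLists.txt of the sort CMakeEd helps you edit might look like the sketch below. The project and source file names are hypothetical placeholders; the find_package module and the OPENSCENEGRAPH_* variables are the ones CMake's stock FindOpenSceneGraph module provides.

```cmake
cmake_minimum_required(VERSION 2.6)
project(MyOsgApp)

# Locate the installed OpenSceneGraph libraries this example assumes.
find_package(OpenSceneGraph REQUIRED COMPONENTS osgDB osgUtil osgViewer)

include_directories(${OPENSCENEGRAPH_INCLUDE_DIRS})
add_executable(myosgapp main.cpp)  # main.cpp is a placeholder name
target_link_libraries(myosgapp ${OPENSCENEGRAPH_LIBRARIES})
```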
Re: [osg-users] Serialization of draw dispatch on multi GPU systems
Hello Robert,

> It would be interesting to hear from others with multi-CPU, multi-GPU systems to see how they fare.

I tested these out; here are my findings. Please bear in mind that I'm testing with cow.osg, so this is probably not representative of actual performance - hence my request earlier for some good test models you might have and could make public...

Specs:
AMD Athlon 64 X2 4200+ dual core
2GB DDR2-533 RAM
GeForce 7900GTX
Windows XP SP2
Compiled from SVN HEAD as of this afternoon
Compiled with Visual C++ Express 2005
VSync off

osgviewer --SingleThreaded cow.osg                          2490-2590 fps
osgviewer --DrawThreadPerContext cow.osg                    2420-2520 fps
osgviewer --CullDrawThreadPerContext cow.osg                2430-2470 fps
osgviewer --CullThreadPerCameraDrawThreadPerContext cow.osg 2600-2660 fps

As you can see, pretty ridiculous values. I got the readings from Fraps, because using the built-in stats lowered the FPS and I didn't want that to affect the values. The good news is that all four modes work on Windows.

I don't even know if any conclusions drawn from these results would be valid... It would seem that CullDrawThreadPerContext is more stable (less difference between min and max) than the previous two modes, and CullThreadPerCameraDrawThreadPerContext is faster than all the others. Will that be the case with a more reasonable model? Who knows?

I hope this was not as futile as I think it was... And as I said, if you have a model that would be a better test of performance and which you can make public, I'd be glad to test with that.

J-S
--
Jean-Sebastien Guay [EMAIL PROTECTED]
http://whitestar02.webhop.org/

This message was sent using IMP, the Internet Messaging Program.
Re: [osg-users] Serialization of draw dispatch on multi GPU systems
Hello again,

Out of curiosity, I ran my own app (an osgViewer::Viewer-based app which currently displays a scene with a few medium-poly-count objects, an HDR skybox and a few shaders) with the various threading modes. The same specs as before still apply:

AMD Athlon 64 X2 4200+ dual core
2GB DDR2-533 RAM
GeForce 7900GTX
Windows XP SP2
Compiled from SVN HEAD as of this afternoon
Compiled with Visual C++ Express 2005
VSync off

--SingleThreaded                          ~1200 fps
--DrawThreadPerContext                    ~1300 fps
--CullDrawThreadPerContext                ~1200 fps
--CullThreadPerCameraDrawThreadPerContext ~1300 fps

Again, no significant difference for my scene. Were you expecting a big difference in performance on multi-core machines, or only on multi-GPU ones?

Hope this helps,

J-S
--
Jean-Sebastien Guay [EMAIL PROTECTED]
http://whitestar02.webhop.org/