Hi,

I am currently evaluating the performance of uniform buffer objects (UBOs) vs. regular uniforms. My demo application uses hardware instancing to render thousands of quads that represent grass. The geometry is static. I tried three different approaches to store the model matrix for each instance: a uniform array, a texture, and a uniform buffer object. I expected UBOs to be faster than regular uniforms, since they are bigger (I can store 4 times as many matrices) and the data only has to be uploaded once. In fact they are much slower: 60 FPS with regular uniforms vs. 16 FPS with UBOs, at 65,536 instances. Here is the code that creates my UBO and binds it:
Code:
// create uniform buffer object for all matrices
osg::FloatArray* matrixArray = new osg::FloatArray(maxUBOMatrices*16);
for (unsigned int i = start, j = 0; i < end; ++i, ++j)
{
    for (unsigned int k = 0; k < 16; ++k)
    {
        (*matrixArray)[j*16+k] = m_matrices[i].ptr()[k];
    }
}

osg::ref_ptr<osg::UniformBufferObject> ubo = new osg::UniformBufferObject;
ubo->setUsage(GL_STATIC_DRAW_ARB);
ubo->setDataVariance(osg::Object::STATIC);
matrixArray->setBufferObject(ubo);

// create uniform buffer binding and add it to the stateset
osg::ref_ptr<osg::UniformBufferBinding> ubb = new osg::UniformBufferBinding(0, ubo, 0, maxUBOMatrices*16*sizeof(GLfloat));
geode->getOrCreateStateSet()->setAttributeAndModes(ubb, osg::StateAttribute::ON);

// set uniform block location
program->addBindUniformBlock("instanceData", 0);

And here is how I access it in the vertex shader:

Code:
#version 150 compatibility
#extension GL_ARB_uniform_buffer_object : enable
#define MAX_INSTANCES {will be set by c++ code}

layout(std140) uniform instanceData
{
    mat4 instanceModelMatrix[MAX_INSTANCES];
};

smooth out vec2 texCoord;
smooth out vec3 normal;
smooth out vec3 lightDir;

void main()
{
    mat4 _instanceModelMatrix = instanceModelMatrix[gl_InstanceID];
    gl_Position = gl_ModelViewProjectionMatrix * _instanceModelMatrix * gl_Vertex;
    texCoord = gl_MultiTexCoord0.xy;

    mat3 normalMatrix = mat3(_instanceModelMatrix[0][0], _instanceModelMatrix[0][1], _instanceModelMatrix[0][2],
                             _instanceModelMatrix[1][0], _instanceModelMatrix[1][1], _instanceModelMatrix[1][2],
                             _instanceModelMatrix[2][0], _instanceModelMatrix[2][1], _instanceModelMatrix[2][2]);

    normal = gl_NormalMatrix * normalMatrix * gl_Normal;
    lightDir = gl_LightSource[0].position.xyz;
}

Am I doing something wrong that makes the performance of UBOs so bad? Or are they simply not meant to be used the way I use them?

Another question is about the software design of osg::BufferObject.
When I set the buffer object of an array, the array gets added to the internal BufferDataList of that BufferObject. But the BufferDataList uses regular pointers instead of ref_ptrs. Is there a reason BufferObject was designed this way? My problem right now is that I have to store the pointer to my matrix array somewhere else in my program purely for memory management reasons, even though I don't need this pointer anymore. osg::Geometry, on the other hand, uses ref_ptrs to store vertex arrays and other vertex attribute arrays. So I can just add an array to a geometry, and the array is destroyed automatically when the geometry is destroyed (provided there is no other reference to it). Wouldn't it be better if osg::BufferObject worked the same way? Or am I missing something important here?

Thank you!

Cheers,
Marcel
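
P.S. To make the second question more concrete, here is a minimal stand-alone sketch of the two ownership models I am comparing. It deliberately does not use OSG: MatrixArray, ObservingBuffer and OwningBuffer are hypothetical stand-ins made up for illustration, and std::shared_ptr stands in for osg::ref_ptr (which is intrusive, so the analogy is only approximate). It is not meant to describe how osg::BufferObject is actually implemented, only the behaviour I see from the outside.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for an array attached to a buffer object.
// It counts live instances so the effect of each model is observable.
struct MatrixArray {
    static int alive;
    std::vector<float> data;
    explicit MatrixArray(std::size_t n) : data(n, 0.0f) { ++alive; }
    ~MatrixArray() { --alive; }
};
int MatrixArray::alive = 0;

// Model A (assumption: mirrors what I observe with BufferObject):
// the buffer only observes its arrays through raw pointers.
struct ObservingBuffer {
    std::vector<MatrixArray*> arrays;
    void add(MatrixArray* a) { arrays.push_back(a); }
};

// Model B (assumption: mirrors what I observe with Geometry):
// the container shares ownership of its arrays.
struct OwningBuffer {
    std::vector<std::shared_ptr<MatrixArray>> arrays;
    void add(std::shared_ptr<MatrixArray> a) { arrays.push_back(std::move(a)); }
};

// Number of live arrays once the caller has dropped its own handle
// while an observing buffer still exists.
int liveArraysAfterReleaseObserving() {
    ObservingBuffer buffer;
    {
        auto array = std::make_shared<MatrixArray>(16);
        buffer.add(array.get());
    }  // last owner gone: the array is destroyed, buffer.arrays[0] dangles
    return MatrixArray::alive;
}

// Same scenario, but with a buffer that shares ownership.
int liveArraysAfterReleaseOwning() {
    OwningBuffer buffer;
    {
        auto array = std::make_shared<MatrixArray>(16);
        buffer.add(array);
    }  // caller's handle is gone, but the buffer keeps the array alive
    assert(buffer.arrays.size() == 1);
    return MatrixArray::alive;
}
```

With model A the array is already gone while the buffer still points at it (the first function returns 0), which is why I currently have to keep my own reference around. With model B the array lives exactly as long as the buffer (the second function returns 1), which is the behaviour I would have expected.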