Re: [osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-30 Thread Roman Grigoriev
Hi, 
Nice examples.
To extend the functionality, you could add hardware culling like this:
http://rastergrid.com/blog/2010/06/instance-cloud-reduction-reloaded/
Some tests are also needed to see how performance holds up under dynamic
matrix modification.
I think the best performance will be with divisors.
To remove the noisy messages when running on OSG GL3.0, you can comment out
these lines in main.cpp:
stateSet->setAttributeAndModes(new osg::AlphaFunc(osg::AlphaFunc::GEQUAL, 0.8f), osg::StateAttribute::ON);
and
switchNode->addChild(lightSource);
Thank you!
Cheers,
Roman

--
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=56031#56031





___
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org


Re: [osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-30 Thread Aurelien Albert
Hi,

If you change the data often, you can declare the vertex attribute array usage
as STREAM_DRAW.

To pass 4x4 matrices, you can do this:

- declare 4 vertex attribute arrays in OSG
- fill them with the 4 columns of your matrices
- bind them to attribute units 4, 5, 6, 7 (for example)
- in your shader, read a mat4 attribute on unit 4 (a mat4 attribute occupies
  four consecutive units)
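
The four steps above can be sketched in plain C++ without any OSG types. This
is only an illustration: it assumes each matrix is stored as 16 floats laid out
column by column, and Vec4, ColumnArrays and splitIntoColumns are hypothetical
names that do not exist in OSG. Each of the four resulting arrays would then be
attached as one per-instance vertex attribute array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the per-instance data; these are not OSG types.
using Vec4 = std::array<float, 4>;
using ColumnArrays = std::array<std::vector<Vec4>, 4>;

// Splits a flat array of 4x4 matrices (16 floats each, column by column)
// into four arrays, one per matrix column, each holding one Vec4 per instance.
inline ColumnArrays splitIntoColumns(const std::vector<float>& matrices)
{
    assert(matrices.size() % 16 == 0);
    const std::size_t count = matrices.size() / 16;

    ColumnArrays columns;
    for (auto& column : columns)
        column.resize(count);

    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t c = 0; c < 4; ++c)
            for (std::size_t r = 0; r < 4; ++r)
                columns[c][i][r] = matrices[i * 16 + c * 4 + r];

    return columns;
}
```

Array c then feeds attribute unit 4 + c, so the shader sees the four columns
as one mat4 input.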

But sometimes passing a full mat4 to the shader can be slower than passing less
data (for example, only a vec3 position) and computing the matrix in the shader.
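
To put rough numbers on that trade-off, here is a small sketch of the
arithmetic. It assumes 32-bit floats, tightly packed data and the 65,536
instances mentioned in the original post; instanceDataBytes and
translationMatrix are hypothetical helpers, and the latter only shows what a
shader would have to rebuild when it receives a position alone.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

static_assert(sizeof(float) == 4, "this sketch assumes 32-bit floats");

// Bytes of per-instance data for `instances` instances that each carry
// `floatsPerInstance` tightly packed floats.
constexpr std::size_t instanceDataBytes(std::size_t instances,
                                        std::size_t floatsPerInstance)
{
    return instances * floatsPerInstance * sizeof(float);
}

// Column-major 4x4 translation matrix built from a position only.
inline std::array<float, 16> translationMatrix(float x, float y, float z)
{
    return {1.0f, 0.0f, 0.0f, 0.0f,
            0.0f, 1.0f, 0.0f, 0.0f,
            0.0f, 0.0f, 1.0f, 0.0f,
            x,    y,    z,    1.0f};
}

// Self-check of the sizes quoted in the text below.
inline void checkInstanceDataSizes()
{
    assert(instanceDataBytes(65536, 16) == 4u * 1024u * 1024u);  // mat4: 4 MiB
    assert(instanceDataBytes(65536, 3) == 768u * 1024u);         // vec3: 768 KiB
}
```

Under these assumptions a full mat4 per instance amounts to 4 MiB for 65,536
instances, while a vec3 position amounts to 768 KiB. Whether the smaller
upload outweighs the extra shader work depends on the hardware and has to be
measured.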

Cheers,
Aurelien

--
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=56038#56038







Re: [osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-29 Thread Aurelien Albert
Hi,

UBOs are not designed for random access from a shader, but for block access.
For hardware instancing, you should get the best performance using a vertex
attribute divisor.
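
As a rough model of what the divisor does, written as plain C++ rather than GL
calls: with a divisor of 0 an attribute advances once per vertex, and with a
divisor of N > 0 it advances once every N instances, so every vertex of an
instance reads the same element. fetchIndex is a hypothetical helper for
illustration only, and it ignores base-vertex and base-instance offsets.

```cpp
#include <cassert>
#include <cstddef>

// Index of the array element that a vertex attribute reads.
// divisor == 0: the attribute advances once per vertex.
// divisor  > 0: the attribute advances once every `divisor` instances.
// Hypothetical helper; base-vertex and base-instance offsets are ignored.
inline std::size_t fetchIndex(std::size_t vertexIndex,
                              std::size_t instanceIndex,
                              std::size_t divisor)
{
    return divisor == 0 ? vertexIndex : instanceIndex / divisor;
}

// Self-check: with divisor 1, all four vertices of instance 7 read element 7.
inline void checkFetchIndex()
{
    for (std::size_t vertex = 0; vertex < 4; ++vertex)
        assert(fetchIndex(vertex, 7, 1) == 7u);
    assert(fetchIndex(3, 7, 0) == 3u);  // ordinary per-vertex attribute
    assert(fetchIndex(0, 7, 2) == 3u);  // two instances share one element
}
```

With a divisor of 1 on the matrix attributes, each instance gets its own
matrix without the shader indexing into an array by gl_InstanceID.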

Cheers,
Aurelien

--
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=56020#56020







Re: [osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-29 Thread Marcel Pursche
Hi Aurelien,

Thank you for your answer. I suspected something like this, but I was not sure
if I was right. It looks like I am using the cache for the uniform buffers in a
very bad way, because I change the data so often.
As you suggested, I'm currently trying out storing the matrices in a vertex
attribute and using glVertexAttribDivisor. So far it's very promising. It's
super fast, even a bit faster than using a texture to store the matrices. The
only advantage of a texture is that it is compatible with older hardware.

Thank you!

Cheers,
Marcel

--
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=56023#56023







[osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-28 Thread Marcel Pursche
Hi,

I am currently evaluating the performance of uniform buffer objects vs. regular
uniforms. My demo application uses hardware instancing to render thousands of
quads that represent grass. The geometry is static in my demo.
I tried out different approaches to store the model matrix for each instance:
a uniform array, a texture and a uniform buffer object. I thought UBOs should
be faster than regular uniforms, as they are bigger (I can store 4 times more
matrices) and the data will only be uploaded once. But in fact they are way
slower: 60 FPS (regular uniforms) vs. 16 FPS (UBO) with 65,536 instances.
Here is the code that creates my UBO and binds it:

Code:

// create uniform buffer object for all matrices
osg::FloatArray* matrixArray = new osg::FloatArray(maxUBOMatrices*16);
for (unsigned int i = start, j = 0; i < end; ++i, ++j)
{
    for (unsigned int k = 0; k < 16; ++k)
    {
        (*matrixArray)[j*16+k] = m_matrices[i].ptr()[k];
    }
}
osg::ref_ptr<osg::UniformBufferObject> ubo = new osg::UniformBufferObject;
ubo->setUsage(GL_STATIC_DRAW_ARB);
ubo->setDataVariance(osg::Object::STATIC);
matrixArray->setBufferObject(ubo);

// create uniform buffer binding and add it to the stateset
osg::ref_ptr<osg::UniformBufferBinding> ubb = new osg::UniformBufferBinding(0, ubo, 0, maxUBOMatrices*16*sizeof(GLfloat));
geode->getOrCreateStateSet()->setAttributeAndModes(ubb, osg::StateAttribute::ON);

// set uniform block location
program->addBindUniformBlock("instanceData", 0);
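
For reference, the size passed to the binding above determines how many
matrices fit in one block, and the loop bounds start and end suggest that the
matrices are uploaded in batches. A small sketch of that arithmetic, assuming
std140 layout (where each mat4 array element occupies 64 bytes) and a block
size limit obtained elsewhere, for example by querying
GL_MAX_UNIFORM_BLOCK_SIZE. The helper names are hypothetical, and 16,384 bytes
is only the minimum the OpenGL specification guarantees, not the value reported
by any particular driver.

```cpp
#include <cassert>
#include <cstddef>

// Size of one mat4 array element under std140: four vec4 columns of 16 bytes.
constexpr std::size_t kStd140Mat4Bytes = 64;

// Number of mat4 elements that fit into a uniform block of the given size.
inline std::size_t maxMatricesPerBlock(std::size_t maxBlockBytes)
{
    return maxBlockBytes / kStd140Mat4Bytes;
}

// Number of batches needed to cover all instances (rounded up).
inline std::size_t batchesNeeded(std::size_t instances,
                                 std::size_t matricesPerBlock)
{
    assert(matricesPerBlock > 0);
    return (instances + matricesPerBlock - 1) / matricesPerBlock;
}
```

With the guaranteed minimum of 16,384 bytes this gives 256 matrices per block
and 256 batches for 65,536 instances; a limit of 65,536 bytes would give 1,024
matrices per block and 64 batches.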




And here is how I access it in the vertex shader:

Code:

#version 150 compatibility
#extension GL_ARB_uniform_buffer_object : enable
#define MAX_INSTANCES {will be set by c++ code}
layout(std140) uniform instanceData
{
    mat4 instanceModelMatrix[MAX_INSTANCES];
};

smooth out vec2 texCoord;
smooth out vec3 normal;
smooth out vec3 lightDir;

void main()
{
    mat4 _instanceModelMatrix = instanceModelMatrix[gl_InstanceID];
    gl_Position = gl_ModelViewProjectionMatrix * _instanceModelMatrix * gl_Vertex;
    texCoord = gl_MultiTexCoord0.xy;

    mat3 normalMatrix = mat3(_instanceModelMatrix[0][0], _instanceModelMatrix[0][1], _instanceModelMatrix[0][2],
                             _instanceModelMatrix[1][0], _instanceModelMatrix[1][1], _instanceModelMatrix[1][2],
                             _instanceModelMatrix[2][0], _instanceModelMatrix[2][1], _instanceModelMatrix[2][2]);

    normal = gl_NormalMatrix * normalMatrix * gl_Normal;
    lightDir = gl_LightSource[0].position.xyz;
}




Am I doing something wrong that makes the performance of UBOs so bad? Or are
they just not meant to be used the way I use them?

Another question is about the software design of osg::BufferObject. When I set
the buffer object of an array, the array gets added to the internal
BufferDataList of this BufferObject. But the BufferDataList uses regular
pointers instead of ref_ptrs. Is there a reason the BufferObject was designed
this way? My problem right now is that I need to store the pointer to my matrix
array somewhere else in my program for memory management reasons, even though I
don't need this pointer anymore.
In osg::Geometry, on the other hand, ref_ptrs are used to store vertex arrays
and other vertex attribute arrays. So I can just add an array to a geometry and
the array will automatically be destroyed when the geometry is destroyed (as
long as there is no other reference to this array). Wouldn't it be better if
osg::BufferObject worked the same way? Or am I missing something important
here?
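
One possible reason for the raw pointers, offered here as an assumption rather
than as a statement about the OSG design: if the array already holds a counted
reference to its buffer object, a counted reference in the other direction
would form a cycle, and neither object would ever be freed. The sketch below
shows that effect with std::shared_ptr and with hypothetical Buffer and Data
types standing in for the OSG classes; it does not use OSG.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

static int liveObjects = 0;  // constructed-but-not-yet-destroyed objects

struct Buffer;

// Stand-in for the array: holds an owning reference to its buffer.
struct Data {
    std::shared_ptr<Buffer> buffer;
    Data()  { ++liveObjects; }
    ~Data() { --liveObjects; }
};

// Stand-in for the buffer object, offering both kinds of back-reference.
struct Buffer {
    std::vector<Data*> entries;                // non-owning back-pointers
    std::vector<std::shared_ptr<Data>> owned;  // owning back-references
    Buffer()  { ++liveObjects; }
    ~Buffer() { --liveObjects; }
};

// Creates one Buffer and one Data, drops all external references, and
// returns how many of the two objects are still alive afterwards.
inline int objectsLeftAlive(bool ownBackReference)
{
    liveObjects = 0;
    Buffer* raw = nullptr;
    {
        auto buffer = std::make_shared<Buffer>();
        auto data = std::make_shared<Data>();
        data->buffer = buffer;
        if (ownBackReference)
            buffer->owned.push_back(data);          // cycle: Buffer <-> Data
        else
            buffer->entries.push_back(data.get());  // no cycle
        raw = buffer.get();
        assert(liveObjects == 2);
    }
    const int alive = liveObjects;
    if (alive > 0) {
        // Break the cycle by hand so the demonstration itself does not leak.
        auto released = std::move(raw->owned);
    }
    return alive;
}
```

In this model objectsLeftAlive(false) returns 0 and objectsLeftAlive(true)
returns 2. The price of the non-owning variant is exactly the one described
above: something else has to keep the array alive.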

Thank you!

Cheers,
Marcel

--
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=56009#56009







Re: [osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-28 Thread Christian Buchner
A lot of this performance issue may depend on the specific implementation
in your driver. Are you using Intel, nVidia or AMD (ATI) graphics?

Christian





Re: [osg-users] Performance of Uniform-Buffer-Objects and question to the software design of osg::BufferObject

2013-08-28 Thread Marcel Pursche
Hi,

I tested it on an NVIDIA GTX 670, driver version 314.22, on Windows 7 x64.

Thank you!

Cheers,
Marcel

--
Read this topic online here:
http://forum.openscenegraph.org/viewtopic.php?p=56011#56011




