Hi Stuart,

On Sunday, December 11, 2011 23:04:02 you wrote:
> I've had a look, and I think I can change the code to create a single
> PrimitiveSet for each cloud fairly easily.
I think you can try.

As an answer to the previous mail: point sprites may help here too. You will 
get the billboard effect for free.

We have a queryable limit on the maximum supported point size, and nobody 
guarantees that it is really high. But in reality point sprites can get up to 
render buffer size on almost any GPU I know. The open source radeon driver 
does glClear by drawing a screen sized point sprite...

So, I am not 100% sure that just switching to point sprites is a good idea, 
but I think this could be reasonable. Maybe about 99% sure ...
Thoughts? ... anybody listening?

> On thinking about this a bit more, one thing that I don't quite understand
> is why the behaviour for clouds should differ so much from our random
> vegetation.
> 
> The random vegetation code we have is very similar - a small number
> of geometries being used again and again. Yet, the performance is far,
> far better, even with much higher numbers of objects.
Hmm, is there really a higher number of objects?
I would guess that the number of trees that is actually drawn on each frame is 
lower?
Can you verify this? Maybe a simple counter temporarily hacked into the 
vegetation and cloud code could provide harder numbers?
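
A minimal sketch of such a temporary counter. The struct and its member 
names are hypothetical, nothing like this exists in the FlightGear or osg 
sources, and it assumes everything is incremented from the single draw 
thread:

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical debugging helper: counts how many objects of each kind
// were actually drawn during one frame.
struct DrawCounter {
    std::size_t trees = 0;
    std::size_t cloudSprites = 0;

    void reset() {
        trees = 0;
        cloudSprites = 0;
    }

    // Print the totals for this frame to stderr, then start over.
    void report(unsigned frame) {
        std::fprintf(stderr, "frame %u: trees=%zu cloud sprites=%zu\n",
                     frame, trees, cloudSprites);
        reset();
    }
};
```

The idea would be to bump trees and cloudSprites at the places where the 
vegetation and the cloud code issue their draws, and to call report() once 
per frame.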

But yes, if this is the same, then we should find out. In the end this is also 
driver dependent. But what I see here on my setup is, with very high 
probability, just draw limited.

If I understand the clouds right, there is only one cloud drawable that 
accounts for all the quad sprites in the scene. Then you draw a separate quad 
for each sprite. True?
Since I assume that there is one drawable issuing several thousands of single 
quad draws, this will not show up in osg's depth sorting at all. All osg has 
to sort is when this single drawable needs to be drawn with respect to itself. 
Sorting a single element is relatively cheap :)

If this is the case, transparency on or off should not show up in the CPU time. 
I agree that transparency costs a little more on the GPU. But still, today's 
GPUs should really do that fast enough. Think of the particle systems and how 
many particles you can do before you see a measurable reaction from the GPU.
There are visualization techniques out there to draw geometry with several 
10^x point sprites. So, the GPU is really designed to do that.

If we have many cloud drawables, putting them into the depth sorted render bin 
will increase the cull times. But again, having multiple ones, as long as 
there are only a few, will not show up significantly in sorting.

You can also try to play with osg's frame statistics. I guess you know that 
you can switch that on from the debug(?) menu.
I expect transparency to show up on the orange GPU bar. Being CPU and draw 
limited means the yellow bar is long. And the blue one grows when cull happens 
to be a problem.
... just a rule of thumb for our problem.

For comparison, here I see about the same length for the yellow and orange bar 
with traditional clouds. Switching on 3d clouds leaves the orange bar mostly 
untouched and raises the yellow bar by about the factor I see in the frame 
rate reduction. This is on my notebook with a medium fast GPU.
So, I conclude that the GPU does not care at all about the clouds. It is the 
CPU that needs to do so much to make that geometry happen on the GPU.
How does this look on your machine?

> I had thought that the main difference was the use of transparency,
> where the clouds are larger and generally more transparent than
> the trees.
Hmm, see above. Do you see a long orange bar with the clouds? Much longer than 
without?
I am sure the fill rate needs to be high with the clouds. My feeling is that 
transparency on or off only makes this worse by say a factor of two?!
But I see a frame rate drop with 3d clouds by a factor of 10 or more.
You can experiment with switching blending on and off in the clouds. Since you 
still draw them back to front you should still occupy the same fill rate on the 
GPU. But the read modify write cycle needed for blending is then gone in 
favour of a cheaper "just produce a color and write it if the depth test 
passes", and that test should pass almost every time in the clouds because of 
drawing back to front.
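
To make the difference concrete, here is a CPU-side sketch of the two write 
paths for one pixel. It is only an illustration: the Color type and both 
function names are made up, and it assumes the usual source-alpha / 
one-minus-source-alpha blend function, which may not be what the cloud code 
actually sets:

```cpp
struct Color {
    float r, g, b;
};

// Blending off: the framebuffer value is simply overwritten.
// The old destination color is never read.
inline void writeOpaque(Color& dst, const Color& src) {
    dst = src;
}

// Blending on (assumed: src * alpha + dst * (1 - alpha)):
// the old destination color has to be read, modified and written back.
inline void writeBlended(Color& dst, const Color& src, float alpha) {
    dst.r = src.r * alpha + dst.r * (1.0f - alpha);
    dst.g = src.g * alpha + dst.g * (1.0f - alpha);
    dst.b = src.b * alpha + dst.b * (1.0f - alpha);
}
```

Both paths touch the same number of pixels, so the fill rate stays the same. 
Only the extra read of dst goes away when blending is off.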

> If so, and the alpha blending of the textures has the most impact on
> framerate, will changing the geometry help significantly? Or is it the
> case that the transparency _within_ a geometry is much more effectively
> handled by OSG than the transparency between different geometries?
Well, there is nothing to handle for transparency within a geometry. The 
geometries are atomic for osg's transparency. If you implement something non 
atomic in the draw routine, like you do for the clouds, it's your CPU time. But 
the only thing that osg does is to sort drawables that are in the depth sorted 
render bin so that they are drawn back to front.
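
Conceptually, that sorting amounts to something like the sketch below. This 
is not osg's actual code, and the Leaf type and function name are invented 
for illustration. Each element stands for one whole drawable, however many 
quads it draws internally:

```cpp
#include <algorithm>
#include <vector>

// One entry per drawable in the bin (hypothetical type).
struct Leaf {
    int id;       // which drawable this is
    float depth;  // distance from the eye point
};

// Order the drawables so that the farthest one is drawn first.
inline void sortBackToFront(std::vector<Leaf>& leaves) {
    std::sort(leaves.begin(), leaves.end(),
              [](const Leaf& a, const Leaf& b) { return a.depth > b.depth; });
}
```

With a single cloud drawable the vector has exactly one element, so the sort 
costs next to nothing regardless of how many sprites that drawable contains.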

Also, textures are just handed over to OpenGL. It is just that plenty of osg 
loaders tend to put geometry that is textured with a potentially translucent 
texture (= having alpha) in the depth sorted render bin. Having many of them 
makes cull and depth sorting slow. But still, Drawables are sorted as a whole. 
And since this is a loader issue, this does not apply to the clouds.
Apart from this kind of transparency issue, the GPU's texel fetch does not 
care much about fetching a transparent or non transparent texture. It 
occupies a little more bandwidth on the GPU memory, but this is only about 30% 
more than it needs anyway for fetching RGB. And again, very often we are not 
texel fetch limited. Another bad effect of transparent textures on the GPU 
would be the GPU memory usage. So, I can store about 30% more textures on the 
GPU when they do not have a needless alpha channel. But today's GPUs really 
offer a lot of memory.
That does not mean that we should just waste GPU resources. I still think a 
texture with all alpha == 1 should just be RGB. This helps in a lot of 
different places, and in total this is required. But I do not think that 
having a transparent texture for the clouds accounts for the factor of 10 I 
can see here. And what is the alternative? Are clouds somehow translucent? 
Yes! So we need to do something translucent? Sure!
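
The "about 30%" for RGB versus RGBA can be checked with a bit of arithmetic. 
The helper below is hypothetical and assumes 8 bits per channel; it ignores 
mipmaps, texture compression and whatever padding a driver may apply 
internally:

```cpp
#include <cstddef>

// Raw size of an uncompressed texture with one byte per channel.
// Assumption: no mipmaps, no compression, no driver-side padding.
constexpr std::size_t textureBytes(std::size_t width, std::size_t height,
                                   std::size_t channels) {
    return width * height * channels;
}

// A 1024x1024 texture: 3 MiB as RGB, 4 MiB as RGBA.
constexpr std::size_t rgbBytes = textureBytes(1024, 1024, 3);
constexpr std::size_t rgbaBytes = textureBytes(1024, 1024, 4);

// RGBA needs exactly 4/3 of the RGB size, i.e. one third more.
static_assert(rgbaBytes * 3 == rgbBytes * 4, "RGBA is 4/3 the size of RGB");
```

Under these assumptions the same memory holds four RGB textures for every 
three RGBA ones.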

Maybe you can omit the texture completely. We can already do clouds only with 
shaders. True? Then we can also do some simple procedural texture in the 
fragment shader. That would optimize away the texel fetches. If you think this 
is the limiting factor, try this.
Sure, this does not optimize away blending being on. So, we still need to have 
a higher fill rate and we need the more expensive color buffer read for 
blending. 
But still, we cannot get around that.
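
As a rough idea of what "simple procedural" could mean, here is a CPU-side 
stand-in for the per-fragment computation, written in C++ rather than shader 
code. The function name and the falloff shape are my own assumptions, not 
what the cloud shaders do today:

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical procedural replacement for the sprite texture's alpha:
// fully opaque in the middle of the quad, fading to transparent at the
// rim. u and v are the quad's texture coordinates in [0, 1].
inline float spriteAlpha(float u, float v) {
    const float du = u - 0.5f;
    const float dv = v - 0.5f;
    const float dist = std::sqrt(du * du + dv * dv);  // 0 at the center
    const float t = std::max(0.0f, 1.0f - 2.0f * dist);
    return t * t;  // quadratic falloff, 0 at distance 0.5 and beyond
}
```

The same few arithmetic operations would go into the fragment shader in place 
of the texture lookup, so no texel has to be fetched at all.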

Another thing to think about would be to use geometry shaders for the clouds.

Also, did you do any profiling on the CPU? Sometimes this helps.
Are you running Linux?

Greetings

Mathias

_______________________________________________
Flightgear-devel mailing list
Flightgear-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/flightgear-devel
