Re: [osg-users] Question about views, contexts and threading
Robert, I tried your suggestion, but it didn't have any effect. It's probably a driver issue then (nvidia 180.06 beta). I should receive a dual GTX260 system any day now; I'll try and see if that works better.

--
Regards,

Ferdi Smit
INS3 Visualization and 3D Interfaces
CWI Amsterdam, The Netherlands
Re: [osg-users] Question about views, contexts and threading
I tried the osgviewer changes; not much difference on 4 systems, except that the texture on the cow now appears in the copies (the Optimizer previously bleached the cow). My traces for system curly look the same. On the other systems I just looked at behavior. -Don

Hi All,

On Thu, Nov 20, 2008 at 4:01 PM, Robert Osfield <[EMAIL PROTECTED]> wrote: I think the best lead would be that perhaps the texture object/display list buffer_value containers aren't being resized to fit the new number of contexts when the app is running single threaded. In theory addView() should be stopping all threads and then issuing Node::resizeGLObjectBuffers() on the scene graph to handle this situation, but perhaps this isn't happening.

I've looked into the CompositeViewer::addView()/View::setSceneData()/Viewer::setSceneData() methods and only Viewer::setSceneData() has a call to resize the GL objects. The actual code looks like:

    void Viewer::setSceneData(osg::Node* node)
    {
        setReferenceTime(0.0);

        View::setSceneData(node);

        if (_threadingModel!=SingleThreaded && getSceneData())
        {
            // make sure that existing scene graph objects are allocated with thread safe ref/unref
            getSceneData()->setThreadSafeRefUnref(true);

            // update the scene graph so that it has enough GL object buffer memory for the graphics contexts that will be using it.
            getSceneData()->resizeGLObjectBuffers(osg::DisplaySettings::instance()->getMaxNumberOfGraphicsContexts());
        }
    }

My guess is that we need to move the resize/setThreadSafeRefUnref() calls up into the View::setSceneData() method. Viewer::setSceneData() is on the viewer, so it has access to ViewerBase members that View doesn't have. Another issue is whether we are setting the View up prior to any call to stopThreading(), as the resize isn't thread safe.

As we don't know whether this is the cause of the problem yet, I've modified J-S's osgviewer.cpp to do the resize. Could users who've seen problems try this version out? If it works then we have a workaround that end users can apply to existing apps, and we can figure out a solution to fix it permanently in svn/trunk.

Robert.
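For existing apps, the same workaround can be applied from user code right after assigning the scene data, before the viewer is realized. A minimal sketch, assuming the standard osgViewer API (the model file and window setup are placeholders; the explicit resize and ref/unref calls are the only addition):

    #include <osg/DisplaySettings>
    #include <osgDB/ReadFile>
    #include <osgViewer/CompositeViewer>
    #include <osgViewer/View>

    int main()
    {
        // Application-side workaround sketch: after assigning scene data to a
        // view, resize the scene graph's per-context GL object buffers ourselves
        // so texture objects/display lists have a slot for every context.
        osg::ref_ptr<osg::Node> scene = osgDB::readNodeFile("cow.osg");

        osgViewer::CompositeViewer viewer;
        osg::ref_ptr<osgViewer::View> view = new osgViewer::View;
        view->setSceneData(scene.get());
        viewer.addView(view.get());

        // Make existing objects safe for ref/unref from multiple threads, then
        // resize the GL object buffers for the maximum number of contexts.
        scene->setThreadSafeRefUnref(true);
        scene->resizeGLObjectBuffers(
            osg::DisplaySettings::instance()->getMaxNumberOfGraphicsContexts());

        return viewer.run();
    }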
Re: [osg-users] Question about views, contexts and threading
Hi Ferdi,

Could you try the same tests but with the following env var set:

    set OSG_SERIALIZE_DRAW_DISPATCH=OFF

This will disable the mutex that serializes the draw dispatch. Have a search through the archives, as I've written lots about this topic and about the fact that serializing draw dispatch curiously improves performance on the systems that I've tested on. I still haven't had feedback from the community on this, as it's likely to be something affected by hardware/drivers and OS.

Robert.
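If it's more convenient to flip this switch from code rather than from the environment, osg::DisplaySettings appears to expose a matching accessor (a sketch; verify the accessor exists in your OSG version, and set it before the viewer starts its threads):

    #include <osg/DisplaySettings>
    #include <osgViewer/Viewer>

    int main()
    {
        // Equivalent of OSG_SERIALIZE_DRAW_DISPATCH=OFF, done programmatically.
        // Disables the mutex that serializes draw dispatch across the
        // per-context graphics threads; must happen before realize()/run().
        osg::DisplaySettings::instance()->setSerializeDrawDispatch(false);

        osgViewer::Viewer viewer;
        // ... scene and context setup as usual ...
        return viewer.run();
    }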
Re: [osg-users] Question about views, contexts and threading
Thank you, that at least explains some of the drawing times I've been seeing.

I ran more tests on our dual-GPU system, summarized below. Not strictly OSG related, but they may be interesting nonetheless...

- Scene of 25 copies of a 1-million-polygon model, all visible. Culling etc. negligible.
- Stand-alone refers to one rendering context only; normal, non-parallel rendering.
- Frame rates in FPS.

CPU affinity on different cores
OSG_THREADING=SingleThreaded
(1 core shows heavy use, 2nd core shows moderate use, 2 cores idle)

                                    Quadro 5600    8800GTX
    Single-GPU / Stand-alone        16             15
    Single-GPU / Multi-Threaded     7.5            7.5
    Single-GPU / Multi-Processing   7.5            7.5
    Multi-GPU / Multi-Threaded      6.5            6.5
    Multi-GPU / Multi-Processing    16             15

OSG_THREADING=ThreadPerContext
(CPU affinity is set but appears to be ignored: 1 core shows heavy use, others idle)

                                    Quadro 5600    8800GTX
    Single-GPU / Stand-alone        16             15
    Single-GPU / Multi-Threaded     7.5            7.5
    Single-GPU / Multi-Processing   7.5            7.5
    Multi-GPU / Multi-Threaded      3.5            11
    Multi-GPU / Multi-Processing    11             14

Baseline:
                                    Quadro 5600    8800GTX
    Multi-GPU / Multi-Threaded      6.5            6.5

Speeding up one card by rendering an empty scene (*), effect on the other card:

    Multi-GPU / Multi-Threaded      6000*          15
    Multi-GPU / Multi-Threaded      7              14*

All results are reasonable, except:

    Single-GPU / Multi-Processing   7.5            7.5
    Multi-GPU / Multi-Threaded      6.5            6.5
    Multi-GPU / Multi-Processing    16             15

Which is very strange; using two distinct GPUs simultaneously in a threaded way in the same address space is slower than sharing a single GPU. I can only conclude that OpenGL drivers cannot handle multi-threading with different contexts on different devices. It also seems that the Quadro is the culprit, locking the driver or something. If you let the Quadro render fast, the 8800 also renders fast. However, if you allow the 8800 to render fast, both will remain slow.
Re: [osg-users] Question about views, contexts and threading
Hi Ferdi,

To understand what is happening with draw in the two instances, you need to understand how OpenGL operates. For each graphics context OpenGL maintains a FIFO that is filled by the application's graphics thread for that context, and is drained by the driver, which batches the commands/data in the FIFO up into a form that can be pushed to the graphics card.

Now if this FIFO has plenty of room then the application can keep filling the FIFO without OpenGL ever blocking the application's graphics thread - in this case the draw dispatch times (the OSG side) are relatively low. If however you fill the FIFO then OpenGL will block the application's graphics thread until enough room has been made by the GPU consuming commands/data at the other end. When you get to this point you'll often find draw dispatch times suddenly jump up, and it's not because the thread is suddenly doing more work - in fact the app's graphics thread is just sitting there idle, waiting for the graphics driver/GPU to do its stuff.

Now drivers may have different sized FIFOs, and different GPUs will work at different speeds and possibly have other features that affect the FIFO filling/emptying. One would expect slower GPUs to empty the FIFO more slowly and so be more likely to block, but the driver can also have an effect. The architecture of the overall hardware, what other threads are running, how contended the various parts of the hardware are, etc. can all have an effect. The fact that one GPU's draw dispatch is far longer than another's might simply be that it's pushed just hard enough to fill the FIFO; it might still hit frame rate just fine, but the draw times will be drastically higher because of the blocking due to the filled FIFO, while a slightly lower load could lead to the FIFO not blocking and a huge drop in draw dispatch times. It's very non-linear: small differences can result in large observed differences, but often the long draw time might not be anything to worry about - it's just an early warning sign; you might still hit your target frame rate just fine.

Robert.
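One way to make this blocking visible is to force the dispatch thread to wait for the FIFO to drain and compare timings. A diagnostic sketch (the callback type and its use here are our own illustration, not from the OSG examples): with glFinish() installed as a final draw callback, the HUD's Draw time includes the full FIFO drain, so if Draw jumps dramatically with the callback installed, the time is going into waiting on the driver/GPU rather than into CPU-side dispatch.

    #include <osg/Camera>
    #include <osg/GL>
    #include <osgViewer/Viewer>

    // Hypothetical diagnostic: make the draw thread wait for the GPU at the
    // end of the frame, so reported draw time includes the FIFO drain.
    struct FinishCallback : public osg::Camera::DrawCallback
    {
        virtual void operator()(osg::RenderInfo& /*renderInfo*/) const
        {
            glFinish(); // blocks until the driver/GPU has consumed the FIFO
        }
    };

    // usage:
    //   viewer.getCamera()->setFinalDrawCallback(new FinishCallback);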
Re: [osg-users] Question about views, contexts and threading
Hi Robert,

I ran some more tests with a realistic scene of ~25M polygons (25 times the same 1M model). Stand-alone this is rendered at ~15 FPS on one GPU (8800GTX or Quadro FX5600 + Intel Quad Core). Multi-_processing_ with two contexts on two GPUs, both rendering this scene, the 8800 stays at 15 but the Quadro drops to 12. Multi-_threading_ with two contexts on two GPUs, the 8800 drops to 9.5 and the Quadro to 4.5 FPS. This is weird. Also, the 8800 reports (in the OSG performance HUD) that GPU=65 and Draw=10. Draw is always much lower than GPU. But the Quadro in multi-threading goes to GPU=210 and Draw=210; GPU and Draw are suddenly equal now. What does this Draw statistic represent? Is it time spent in driver draw calls?

I suspect buggy Quadro drivers, but I'm not sure. It's the only system I can test on. I'm sorry if this diverts from a pure OSG discussion; perhaps I should take it to an nvidia forum.
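For reference, the performance HUD quoted above is osgViewer's stats handler. To our understanding, Draw is the CPU-side time of the draw traversal (dispatch), while GPU is measured with GL timer queries where the driver supports them - consistent with Robert's FIFO explanation: when dispatch blocks on a full FIFO, Draw climbs until it meets GPU. Enabling the HUD is one line ('s' cycles the stats pages):

    #include <osgDB/ReadFile>
    #include <osgViewer/Viewer>
    #include <osgViewer/ViewerEventHandlers>

    int main()
    {
        osgViewer::Viewer viewer;
        viewer.setSceneData(osgDB::readNodeFile("cow.osg")); // any scene
        // On-screen stats HUD reporting the Draw and GPU timings discussed above.
        viewer.addEventHandler(new osgViewer::StatsHandler);
        return viewer.run();
    }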
Re: [osg-users] Question about views, contexts and threading
Hi Ferdi,

W.r.t. performance and stability of multi-threading the graphics: as long as you have two GPUs, the most efficient way to drive them should be multi-threaded. There is a caveat though - hardware and drivers aren't always up to scratch, and even where they should be able to manage the multiple threads and multiple GPUs seamlessly, they fail to.

I'm poised to build a new machine based on the new Intel Core i7 and X58 motherboard; it'll be interesting to see how well it scales.

W.r.t. PBO readback - it's very, very sensitive to the pixel formats you use. See the osgscreencapture example.

Robert.
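The pixel-format sensitivity usually comes down to whether the requested format matches the framebuffer's native layout, so the driver can copy rather than convert per pixel. A small sketch of a readback image set up that way (the helper is hypothetical, and GL_BGRA as the fast path is an assumption that tends to hold on NVIDIA hardware; profile on your own system):

    #include <osg/Camera>
    #include <osg/Image>

    #ifndef GL_BGRA
    #define GL_BGRA 0x80E1 // GL 1.2 token; may be missing from old gl.h headers
    #endif

    // Hypothetical helper: allocate a readback image in the framebuffer's
    // native layout so glReadPixels (and PBO transfers) can avoid conversion.
    osg::ref_ptr<osg::Image> makeReadbackImage(int width, int height)
    {
        osg::ref_ptr<osg::Image> image = new osg::Image;
        image->allocateImage(width, height, 1, GL_BGRA, GL_UNSIGNED_BYTE);
        return image;
    }

    // usage: attaching the image makes OSG read the color buffer back into it
    // at the end of that camera's draw:
    //   camera->attach(osg::Camera::COLOR_BUFFER, makeReadbackImage(1024, 768).get());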
Re: [osg-users] Question about views, contexts and threading
Thanks Robert. I did a quick test with two viewers from two threads and it appears to be working. Btw, from my experience, PBO doesn't seem to be any faster than glReadPixels for downloading textures to the host (and on some hardware it is much slower), while for uploads it is almost consistently faster. Anyway, that should not be a problem, even if I code it manually.

One question about the OpenGL driver: are you by any chance aware of any threading issues? Is it completely re-entrant from two different contexts and threads? With this two-thread setup, I see some occasional erratic fluctuation in drawing time in the OSG performance HUD for a completely still scene. The GPU performance is very stable, regardless of the load on the other card, but the drawing time (software) sometimes goes from something like 0.4 to 2.6 or 1.5 for a couple of frames. I do not notice this, or not as much, when using two separate processes instead of two threads. The only difference I can think of here is that the OpenGL driver part is in the same address space and maybe internally locks occasionally? Or is this nonsense?

Anyway, the osg part seems to be fairly straightforward and simple like this. Thanks.
Re: [osg-users] Question about views, contexts and threading
Hi Ferdi,

osgViewer::CompositeViewer runs all of the views synchronously - one frame() call dispatches update, event, cull and draw traversals for all the views. So for your case, where you want them to run async, this isn't supported. Supporting it within CompositeViewer would really complicate the API, so it's not something I've gone for.

What you will be able to do is use two separate Viewers, and you are likely to want to run two threads, one for each viewer's frame loop, as well. To get the render-to-image result to the second viewer, all you need to do is assign the same osg::Image to the first viewer's Camera for it to copy to, and then attach the same osg::Image to a texture in the scene of the second viewer. The OSG should automatically do the glReadPixels to the image data, dirty the Image, and then the texture will automatically update in the second viewer. You could potentially optimize things by using a PBO, but the off-the-shelf osg::PixelBufferObject isn't suitable for reads in this way, so you'll need to roll your own support for this.

It's worth noting that I've never written an app like the above, so you are rather working on the bleeding edge. I "think" it should work, or at least I can't spot any major problems that might appear.

Robert.
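A minimal sketch of the arrangement Robert describes, under the assumption that it works as outlined (model file, window sizes and screen numbers are placeholders; error handling omitted): one osg::Image is attached to the first viewer's camera for readback and used as the texture image in the second viewer's scene, with each viewer's frame loop driven from its own thread.

    #include <osg/Camera>
    #include <osg/Geode>
    #include <osg/Geometry>
    #include <osg/Image>
    #include <osg/Texture2D>
    #include <osgDB/ReadFile>
    #include <osgViewer/Viewer>
    #include <OpenThreads/Thread>

    // Runs one viewer's frame loop so the two viewers proceed at independent rates.
    class ViewerThread : public OpenThreads::Thread
    {
    public:
        ViewerThread(osgViewer::Viewer* v) : _viewer(v) {}
        virtual void run() { _viewer->run(); }
    private:
        osg::ref_ptr<osgViewer::Viewer> _viewer;
    };

    int main()
    {
        // Shared image: written by viewer1's readback, read by viewer2's texture.
        osg::ref_ptr<osg::Image> image = new osg::Image;
        image->allocateImage(1024, 768, 1, GL_RGBA, GL_UNSIGNED_BYTE);

        // Viewer 1 (slow scene, e.g. on screen :0.0): attaching the image makes
        // OSG glReadPixels into it and dirty() it at the end of each draw.
        osg::ref_ptr<osgViewer::Viewer> viewer1 = new osgViewer::Viewer;
        viewer1->setUpViewInWindow(50, 50, 1024, 768, /*screenNum=*/0);
        viewer1->setSceneData(osgDB::readNodeFile("cow.osg")); // stand-in scene
        viewer1->getCamera()->attach(osg::Camera::COLOR_BUFFER, image.get());

        // Viewer 2 (fast scene, e.g. on screen :0.1): a quad textured with the
        // shared image; the dirtied image triggers a texture re-upload here.
        osg::ref_ptr<osg::Texture2D> texture = new osg::Texture2D(image.get());
        osg::ref_ptr<osg::Geode> geode = new osg::Geode;
        geode->addDrawable(osg::createTexturedQuadGeometry(
            osg::Vec3(0,0,0), osg::Vec3(1,0,0), osg::Vec3(0,0,1)));
        geode->getOrCreateStateSet()->setTextureAttributeAndModes(0, texture.get());

        osg::ref_ptr<osgViewer::Viewer> viewer2 = new osgViewer::Viewer;
        viewer2->setUpViewInWindow(50, 50, 1024, 768, /*screenNum=*/1);
        viewer2->setSceneData(geode.get());

        // Independent frame loops: one thread per viewer.
        ViewerThread t1(viewer1.get()), t2(viewer2.get());
        t1.start(); t2.start();
        t1.join();  t2.join();
        return 0;
    }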
[osg-users] Question about views, contexts and threading
I'm looking to do the following in OSG, and I wonder if I'm on the right track (before wasting time needlessly): have two render processes run in parallel on two different GPUs, have one render a scene to texture, and let this texture be read by the other process and mapped to an object in a different scene. Problem: the rendering of the first scene to texture is very slow and the rendering of the second scene is very fast.

I intend to solve it in the following way, in pseudo-code (see the context-setup sketch after this list):

- new CompositeViewer
- Add two Views
- Construct two contexts, one on localhost:0.0, one on localhost:0.1
- Attach contexts to cameras of corresponding Views
- Set composite viewer threading mode to thread-per-context

--- First process
- Set view camera mode to FBO and pre-render
- Add post-draw callback and render textures
- Download texture to host memory in post-draw callback
- (possibly add post-render camera to render textured screen quad as output)

--- Second process
- Add update callback and regular texture
- Upload host memory to texture in update callback (if available, non-blocking)

The downloading and uploading of textures uses multiple slots and regular threaded locking, to ensure we never read and write the same memory at the same time. The second process doesn't block if no new texture is available; it just continues using the old one.

Some questions. Will the two processes now run at independent frame rates, or will the composite viewer synchronize them? I need them to run independently. I read that OSG does not support multi-threaded updating of the scene graph. However, if I use two distinct scene graphs with two contexts, I can _pull_ updates in an update callback from another thread, right? What I can not do is push updates at arbitrary times; that would make sense. How do I make the TrackballManipulator work for only the first process? It seems that as soon as I set that camera to FBO it just doesn't respond to events (or maybe something else is wrong... I added another orthogonal camera to the view1->getCamera() that renders the screen quad in post-render mode). Also, the second process camera is affected when I move the mouse in the first process window. Is it sufficient to call view2->getCamera()->setAllowEventFocus(false); to disable this behavior? Finally, can I do this the same way with a shared context on a single GPU (i.e. both on :0.0), sharing texture data directly on the GPU in different textures? Ignoring the slow context switching issues for the time being.

Am I on the right track here, or should this be done differently? I know all this is possible because I have the manual OpenGL code for it working, both using shared contexts and with up/downloading of texture data.

--
Regards,

Ferdi Smit
INS3 Visualization and 3D Interfaces
CWI Amsterdam, The Netherlands
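For the context-construction steps in the list above, a sketch of how the two contexts might be created and attached (a minimal sketch assuming the standard osg::GraphicsContext::Traits fields; the helper, window sizes and screen numbers are placeholders - and note Robert's reply above: CompositeViewer will still run both views synchronously per frame() call):

    #include <osg/GraphicsContext>
    #include <osgViewer/CompositeViewer>
    #include <osgViewer/View>

    // Illustrative helper: create a context on a given screen of display :0,
    // i.e. localhost:0.<screenNum>.
    osg::ref_ptr<osg::GraphicsContext> createContext(unsigned int screenNum)
    {
        osg::ref_ptr<osg::GraphicsContext::Traits> traits =
            new osg::GraphicsContext::Traits;
        traits->displayNum = 0;
        traits->screenNum  = screenNum;
        traits->x = 0; traits->y = 0;
        traits->width = 1024; traits->height = 768;
        traits->windowDecoration = true;
        traits->doubleBuffer = true;
        return osg::GraphicsContext::createGraphicsContext(traits.get());
    }

    int main()
    {
        osgViewer::CompositeViewer viewer;
        viewer.setThreadingModel(osgViewer::ViewerBase::ThreadPerContext);

        // One View per context: screen 0 -> :0.0, screen 1 -> :0.1.
        for (unsigned int screen = 0; screen < 2; ++screen)
        {
            osg::ref_ptr<osgViewer::View> view = new osgViewer::View;
            osg::ref_ptr<osg::GraphicsContext> gc = createContext(screen);
            view->getCamera()->setGraphicsContext(gc.get());
            view->getCamera()->setViewport(0, 0, 1024, 768);
            // view->setSceneData(...); // each view gets its own scene graph
            viewer.addView(view.get());
        }

        return viewer.run();
    }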