Hi Serge,

Thanks for the explanation, this really does help. Your suggested change to DatabasePager.cpp, using ReadResult rather than osgDB::readNodeFile(), is an interesting one. While hitting the exact timing needed to reproduce the crash seems pretty rare, it is certainly possible, and any little opening for a multi-threading crash can come out and bite, even if it seems really unlikely.
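To make the hand-off concrete, here is a minimal, self-contained sketch of the hazard. It does not use the OSG API: std::shared_ptr stands in for osg::ref_ptr, and Node, Cache, readRaw and readRef are hypothetical names used only for illustration. The cache clear is done on the same thread so the outcome is deterministic; in the real case it would come from another thread.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Stand-in for a scene graph object; counts live instances so that
// object lifetime can be observed without touching a dangling pointer.
struct Node {
    static int alive;
    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Hypothetical object cache (an assumption, not the osgDB::Registry API).
class Cache {
public:
    void add(const std::string& name) {
        std::lock_guard<std::mutex> lock(_mutex);
        _objects[name] = std::make_shared<Node>();
    }

    // Raw-pointer style: the caller gets no ownership, so the object
    // only lives as long as the cache entry does.
    Node* readRaw(const std::string& name) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _objects.find(name);
        return it != _objects.end() ? it->second.get() : nullptr;
    }

    // Owning style: the extra reference is taken while the lock is held,
    // so a later clear() cannot delete the object under the caller.
    std::shared_ptr<Node> readRef(const std::string& name) {
        std::lock_guard<std::mutex> lock(_mutex);
        auto it = _objects.find(name);
        return it != _objects.end() ? it->second : nullptr;
    }

    void clear() {
        std::lock_guard<std::mutex> lock(_mutex);
        _objects.clear();
    }

private:
    std::mutex _mutex;
    std::map<std::string, std::shared_ptr<Node>> _objects;
};

// Number of live nodes after the cache is cleared while the caller
// holds only a raw pointer. The pointer is never dereferenced here.
int liveNodesAfterClearWithRawPointer() {
    Cache cache;
    cache.add("model.ive");
    Node* raw = cache.readRaw("model.ive");
    assert(raw != nullptr);
    (void)raw;
    cache.clear();       // stands in for the other thread clearing the cache
    return Node::alive;  // the object is gone, 'raw' now dangles
}

// Same sequence, but the caller holds an owning handle.
int liveNodesAfterClearWithOwningHandle() {
    Cache cache;
    cache.add("model.ive");
    std::shared_ptr<Node> ref = cache.readRef("model.ive");
    assert(ref != nullptr);
    cache.clear();
    return Node::alive;  // 'ref' still keeps the object alive
}
```

With the raw pointer no nodes are left alive after the clear, whereas the owning handle keeps exactly one alive until the caller lets go of it. That is the gap a ref_ptr-returning read would close.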
My current thought is that the weakness you've highlighted is that osgDB::readNodeFile() doesn't return a ref_ptr<>, opening the door to the object being unref'd at the same time as the C pointer is being passed back from readNodeFile. Changing readNodeFile to pass back a ref_ptr<> is a possibility, but it'd break a lot of user code. This might be the right thing to do in terms of writing robust C++, but it is a heavy hit to force on users that don't code stuff in ways that would be sensitive to this issue. Perhaps a readRefNodeFile that passes back a ref_ptr<> would be a workaround. I guess one might be able to write a little class to work as an adapter too, so one could pass this object back and have it work for C* as well as ref_ptr<>, but this would be a pretty obscure use of C++.

Thoughts?
Robert.

On Dec 12, 2007 9:28 AM, Serge Lages <[EMAIL PROTECTED]> wrote:
> Hi Robert,
>
> On Dec 11, 2007 8:59 PM, Robert Osfield <[EMAIL PROTECTED]> wrote:
> > Hi Serge,
> >
> > How reliably can you recreate the crash?
> >
> > Does it happen with osgviewer or other OSG examples?
> >
> > Could you explain what's happening in your app before the crash?
>
> Here is the context of the crash:
>
> DatabasePager configuration:
> pager->setExpiryDelay(5);
> pager->setUnrefImageDataAfterApplyPolicy(true, true);
> pager->setDeleteRemovedSubgraphsInDatabaseThread(false);
> pager->setMaximumNumOfRemovedChildPagedLODs(50);
> pager->setMinimumNumOfInactivePagedLODs(0);
>
> Registry configuration:
> osgDB::Registry::instance()->getOptions()->setObjectCacheHint(osgDB::ReaderWriter::Options::CACHE_ALL);
>
> To make it crash, I have a custom manipulator which moves randomly through a very large paged database (many gigabytes). The database contains nodes with simple geometries but also texts. Sometimes it crashes after hours of random moves, sometimes after just 1 or 2 minutes.
> The crash happens in two locations:
>
> 1 - in osgText::Text::setFont (Text.cpp line 119), because the variable "font" is a corrupted pointer.
> 2 - in databasepager.cpp (line 604), during the initialization of databaseRequest->_loadedModel, because the pointer returned by readNodeFile is corrupted.
>
> These crashes happen in the pager thread, and the main thread is always in the removeExpiredSubgraphs method at that moment. In both cases, there is a very short lapse of time where a C pointer is used for an object instead of a ref_ptr.
>
> For crash number 2, it seems that replacing readNodeFile with readNode fixes it. But I think the better solution would be to prevent any cache clear during the readNodeFile call in databasepager.cpp. I will investigate further this morning and see if I can fix it.
>
> > If we can work out the conditions under which the crash happens, we'll have a much better chance of isolating it. It could be, as you describe, an object being deleted just when another thread is about to take a ref to it. I would have thought this would be a pretty rare condition though.
> >
> > Robert.
> >
> > On Dec 11, 2007 3:55 PM, Serge Lages <[EMAIL PROTECTED]> wrote:
> > > I have another theory about the crash. Let's say that:
> > >
> > > osg::Object* object = osgDB::readObjectFile(foundFile, userOptions ? userOptions : localOptions.get());
> > >
> > > reads the object from the cache, and between this moment and the one where the font object is assigned to a ref_ptr in setFont, the cache is cleared (in another thread). In this scenario the pointer points to a deleted object. Or is there anything to prevent it?
> > >
> > > In my case this scenario is possible, as I load my IVE with the DatabasePager and remove the expired objects in the main thread.
> > >
> > > --
> > > Serge Lages
> > > http://www.tharsis-software.com

_______________________________________________
osg-users mailing list
osg-users@lists.openscenegraph.org
http://lists.openscenegraph.org/listinfo.cgi/osg-users-openscenegraph.org