I used the snippet from this thread to check for open objects, but I may have used it incorrectly:
http://hdf-forum.184993.n3.nabble.com/Repeated-H5Dwrite-calls-increase-memory-usage-td4026367.html#a4026375

- Jorj

On Fri, 14 Aug 2015 at 18:52 Miller, Mark C. <[email protected]> wrote:

> Hmm. I just wanted to ask THG guys a quick follow-up question here.
>
> I didn't follow this whole thread but was this growth due to the C++
> interface failing to close or dec-ref some objects?
>
> If so, why didn't H5Fget_obj_count help to deduce that? My understanding
> is that Jorj tried that but it yielded no indication of an object handle
> leak. Is there a bug there?
>
> Mark
>
> From: Hdf-forum <[email protected]> on behalf of
> Binh-Minh Ribler <[email protected]>
> Reply-To: HDF Users Discussion List <[email protected]>
> Date: Friday, August 14, 2015 10:03 AM
> To: HDF Users Discussion List <[email protected]>
> Subject: Re: [Hdf-forum] Growing memory usage in small HDF program
>
> That's good. Thank you for applying the files and letting us know, George!
>
> Binh-Minh
>
> ------------------------------
> *From:* Hdf-forum <[email protected]> on behalf of
> Jorj Pimm <[email protected]>
> *Sent:* Friday, August 14, 2015 4:47 AM
> *To:* HDF Users Discussion List
> *Subject:* Re: [Hdf-forum] Growing memory usage in small HDF program
>
> I've applied Binh-Minh's files to my local copy of HDF and this does
> significantly reduce memory usage for my example.
>
> It now uses under 100MB of memory, which seems pretty reasonable, I'll
> continue testing against these changes.
>
> Thanks all for the help,
> - George
>
> On Fri, 14 Aug 2015 at 09:10 Jason Newton <[email protected]> wrote:
>
>> Hmm, I did make a mistake in how p_setID works (since it decref's but
>> does not incref its new reference) - and I figured setId wasn't defined
>> provided the naming.
>>
>> I've never seen shared_ptr or OpenCL's C++ wrapper (which is almost a
>> mirror image in terms of the library complexity they map) foul up or leak
>> references in any cases with the same fundamental operations. It's unclear
>> to me why this library can't do the same. The amount of code dedicated to
>> those purposes in those libraries is much less than what's going on here
>> too...
>>
>> -Jason
>>
>> On Fri, Aug 14, 2015 at 12:59 AM, Binh-Minh Ribler <[email protected]>
>> wrote:
>>
>>> Hello Jason,
>>>
>>> ------------------------------
>>> *From:* Hdf-forum <[email protected]> on behalf of
>>> Jason Newton <[email protected]>
>>> *Sent:* Thursday, August 13, 2015 10:39 PM
>>> *To:* HDF Users Discussion List
>>> *Subject:* Re: [Hdf-forum] Growing memory usage in small HDF program
>>>
>>> Bug found (in C++ api as usual)
>>>
>>> Thank you for your efforts in tracking down the problem and your
>>> suggestions.
>>>
>>> The C++ API *should* take care of inc/dec ref appropriately although
>>> they do this in each object class (may be higher in some class hierarchies
>>> like datatypes) but something of a leaf otherwise, rather than through
>>> inheritance of IdComponent. That strategy while working has left a few bugs
>>> I've found / encountered both as leaks and dec'reffing references not
>>> incref'd. As of 1.8.15, all that I was aware of though but this concern
>>> should be warranted all the time based on past-burnings (this would be the
>>> third time noticing something a shared_ptr like class/wrapper around HDF
>>> resources (IdComponent...?) would completely eliminate).
>>>
>>> dataset.getSpace() leaks a reference:
>>>
>>>     //create dataspace object using the existing id then return the object
>>>     DataSpace data_space;  <-- default constructor makes a valid hdf dataspace for H5S_SCALAR
>>>     f_DataSpace_setId(&data_space, dataspace_id);  <-- evil line, why didn't we just use the ctor that takes the id parameter?
>>>     return( data_space );
>>>
>>> In 1.8.14, this block of code is like this, before it was changed into
>>> using the friend function in 1.8.15.
>>>
>>>     //create dataspace object using the existing id then return the object
>>>     DataSpace data_space(dataspace_id);
>>>     return(data_space);
>>>
>>> As you can see in the comments you included below, the friend function
>>> was a work-around of a problem reported by some other users. In that
>>> problem, the id was prematurely closed, due to the behind-the-scene
>>> copy-constructor/destructor when an object was returned from a function.
>>> In order to fix that problem, the copy constructor and the constructor that
>>> takes an existing id of those classes that associate with an HDF5 id needs
>>> to increment the ref counter.
>>>
>>> However, incrementing ref count left some objects opened at the end of
>>> the program, perhaps, due to some compiler's optimization when returning an
>>> object to the caller. In these situations, a destructor for a temporary
>>> object didn't seem to be invoked, so the id ref of the temporary object was
>>> never closed. I could never figure out why. Hence, the work-around was to
>>> use p_setId instead, which required the use of the friend function. If
>>> anyone has a different suggestion, please let us know.
>>>
>>>     //--------------------------------------------------------------------------
>>>     // Function:   f_DataSpace_setId - friend
>>>     // Purpose:    This function is friend to class H5::DataSpace so that it
>>>     //             can set DataSpace::id in order to work around a problem
>>>     //             described in the JIRA issue HDFFV-7947.
>>>     //             Applications shouldn't need to use it.
>>>     // param       dspace - IN/OUT: DataSpace object to be changed
>>>     // param       new_id - IN: New id to set
>>>     // Programmer  Binh-Minh Ribler - 2015
>>>     //--------------------------------------------------------------------------
>>>     void f_DataSpace_setId(DataSpace* dspace, hid_t new_id)  <-- evil function that shouldn't exist (as a friend no-less!)
>>>     {
>>>         dspace->id = new_id;  <-- why not dspace->p_setId(new_id);? Just make it public already as "reset" and get rid of the friend. Follow shared_ptr semantics.. and bring all this stuff inside IdComponent.
>>>     }
>>>
>>> The difference between the public "setId" and the private p_setId is
>>> that "setId" also increments the ref count and is intended for applications
>>> to use on the C++ object id. The private p_setId doesn't increment the id
>>> ref count and is not intended for application use. The difference is
>>> explained in the function's header.
>>>
>>> Thank you,
>>> Binh-Minh
>>>
>>> -Jason
>>>
>>> On Thu, Aug 13, 2015 at 9:37 AM, Miller, Mark C. <[email protected]>
>>> wrote:
>>>
>>>> Hmm. Well I have no experience with HDF5's C++ interface.
>>>>
>>>> My first thought when reading your description was. . . I've seen that
>>>> before. It happens when I forgot to H5Xclose() all the objects I H5Xopened
>>>> (groups, datasets, types, dataspaces, etc.).
>>>>
>>>> However, with C++, I presume the interface is designed to close objects
>>>> when they fall out of scope (e.g. destructor is called). So, in looking
>>>> at your code, even though I don't see any explicit calls to close objects
>>>> previously opened, I assume that *should* be happening when the objects
>>>> fall out of scope. But, are you *certain* that *is* happening? Just before
>>>> exiting main, you might wanna make a call to H5Fget_obj_count() to get some
>>>> idea how many objects HDF5 library thinks are still open in the file. If
>>>> you get a large number, then that would suggest the problem is that the C++
>>>> interface isn't somehow closing objects as they fall out of scope.
>>>>
>>>> That's all I can think of. Sorry if no help.
>>>>
>>>> Mark
>>>>
>>>> From: Hdf-forum <[email protected]> on behalf of
>>>> Jorj Pimm <[email protected]>
>>>> Reply-To: HDF Users Discussion List <[email protected]>
>>>> Date: Thursday, August 13, 2015 9:21 AM
>>>> To: "[email protected]" <[email protected]>
>>>> Subject: [Hdf-forum] Growing memory usage in small HDF program
>>>>
>>>> Hello,
>>>>
>>>> I am writing an application which writes large data sets to HDF5 files,
>>>> in fixed size blocks, using the HDF C++ API (version 1.8.15, patch 1, built
>>>> in msvc 2013 x64)
>>>>
>>>> My application seems to quickly consume all the available memory on
>>>> my system (win32 - around 5.9GB), and then crash whenever the system
>>>> becomes stressed (windows kills it as it has no memory)
>>>>
>>>> I have also tested the application on a linux machine, where I saw
>>>> similar results.
>>>>
>>>> I was under the impression that by using HDF5, the file would be
>>>> brought in and out of memory in such a way that the library would only use
>>>> a small working set - is this not true?
>>>>
>>>> I have experimented with HDF features such as flushing to disk,
>>>> regularly closing and re opening, garbage collection and tuning chunking
>>>> and caching settings and haven't managed to get a stable working set.
>>>>
>>>> I've attached a minimal example, can anyone point out my mistake?
>>>>
>>>> Thanks,
>>>> - Jorj
>>>>
>>>> _______________________________________________
>>>> Hdf-forum is for HDF software users discussion.
>>>> [email protected]
>>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>>> Twitter: https://twitter.com/hdf5
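
For anyone following the quoted discussion of getSpace(), here is a minimal sketch of why overwriting the id of a default-constructed wrapper can leave a handle open. It does not use HDF5 at all: `Registry`, `Space`, `raw_set_id` and `reset` are hypothetical stand-ins I made up to mimic the bookkeeping described in the thread (a default constructor that allocates a valid id, a raw `id = new_id` assignment, and a p_setId-like call that releases the old id first). The real library may well differ in the details.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Hypothetical stand-in for the library's id table: id -> reference count.
// Nothing here is HDF5 API; it only mimics incref/decref bookkeeping.
class Registry {
public:
    int create() { int id = next_++; counts_[id] = 1; return id; }
    void decref(int id) {
        auto it = counts_.find(id);
        assert(it != counts_.end() && it->second > 0);
        if (--it->second == 0) counts_.erase(it);  // last reference: id is closed
    }
    std::size_t open_count() const { return counts_.size(); }
private:
    int next_ = 1;
    std::map<int, int> counts_;
};

// Hypothetical wrapper whose "default" constructor allocates a valid id,
// as the thread says DataSpace does for H5S_SCALAR.
class Space {
public:
    explicit Space(Registry& r) : reg_(&r), id_(r.create()) {}
    Space(Registry& r, int existing) : reg_(&r), id_(existing) {}  // adopts the id
    Space(const Space&) = delete;
    Space& operator=(const Space&) = delete;
    ~Space() { reg_->decref(id_); }

    // Like `dspace->id = new_id`: the previously held id is never released.
    void raw_set_id(int new_id) { id_ = new_id; }
    // Releases the previously held id before taking the new one.
    void reset(int new_id) { reg_->decref(id_); id_ = new_id; }
private:
    Registry* reg_;
    int id_;
};

// Default-construct, then overwrite the id: the default id stays open.
std::size_t leaked_after_raw_set() {
    Registry reg;
    {
        int fetched = reg.create();  // id handed back by the "C layer"
        Space s(reg);
        s.raw_set_id(fetched);
    }
    return reg.open_count();
}

// Construct directly from the fetched id: nothing else is allocated.
std::size_t leaked_after_adopt() {
    Registry reg;
    {
        int fetched = reg.create();
        Space s(reg, fetched);
    }
    return reg.open_count();
}

// Default-construct, then reset: the default id is released first.
std::size_t leaked_after_reset() {
    Registry reg;
    {
        int fetched = reg.create();
        Space s(reg);
        s.reset(fetched);
    }
    return reg.open_count();
}
```

In this model only the raw assignment leaves an id open; constructing from the fetched id, or releasing the old id before taking the new one, leaves none. One leaked id per call would add up quickly in a loop that fetches the dataspace for every block written.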
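
On the "follow shared_ptr semantics" suggestion in the quoted thread, a rough sketch of what that could look like is below. This is only an illustration under my own assumptions, not how the HDF5 C++ API is implemented: `Handle`, `make_handle`, `mock_close` and `g_close_calls` are hypothetical names, and `mock_close` merely counts calls where a real wrapper would call the matching close function. The point is that copies made when an object is returned from a function share one owner, so the id is closed exactly once and never while a copy is still alive.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a C-style close call; it only counts invocations.
int g_close_calls = 0;
void mock_close(int /*id*/) { ++g_close_calls; }

// Hypothetical handle wrapper: std::shared_ptr does the reference counting,
// and the custom deleter closes the id when the last copy goes away.
class Handle {
public:
    explicit Handle(int id)
        : id_(new int(id), [](int* p) { mock_close(*p); delete p; }) {}
    int id() const { return *id_; }
    long use_count() const { return id_.use_count(); }
private:
    std::shared_ptr<int> id_;
};

// Returning by value is safe: a copy or move only adjusts the shared count.
Handle make_handle(int id) { return Handle(id); }

// Three copies of one handle lead to exactly one close.
int closes_after_copies() {
    g_close_calls = 0;
    {
        Handle a = make_handle(7);
        Handle b = a;
        Handle c = b;
        assert(a.use_count() == 3);
    }
    return g_close_calls;
}

// Destroying a temporary copy does not close the id while another copy lives.
int closes_while_copy_alive() {
    g_close_calls = 0;
    Handle kept = make_handle(9);
    {
        Handle temp = kept;
    }
    return g_close_calls;  // evaluated before `kept` is destroyed
}
```

With this shape there is no need for separate setId / p_setId variants or a friend function, since the wrapper never touches the count by hand. Whether that fits the existing IdComponent hierarchy is a separate question that I can't answer from the outside.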
