I've applied Binh-Minh's files to my local copy of HDF and this does
significantly reduce memory usage for my example.

It now uses under 100MB of memory, which seems pretty reasonable. I'll
continue testing against these changes.

Thanks all for the help,
- George

On Fri, 14 Aug 2015 at 09:10 Jason Newton <[email protected]> wrote:

> Hmm, I did make a mistake in how p_setId works (since it decrefs but does
> not incref its new reference) - and I figured setId wasn't defined,
> given the naming.
>
> I've never seen shared_ptr or OpenCL's C++ wrapper (which is almost a
> mirror image in terms of the library complexity they map) foul up or leak
> references in any cases with the same fundamental operations. It's unclear
> to me why this library can't do the same.  The amount of code dedicated to
> those purposes in those libraries is much less than what's going on here
> too...
>
> -Jason
>
> On Fri, Aug 14, 2015 at 12:59 AM, Binh-Minh Ribler <[email protected]>
> wrote:
>
>>
>> Hello Jason,
>>
>>
>> ------------------------------
>>
>> *From:* Hdf-forum <[email protected]> on behalf of
>> Jason Newton <[email protected]>
>> *Sent:* Thursday, August 13, 2015 10:39 PM
>> *To:* HDF Users Discussion List
>> *Subject:* Re: [Hdf-forum] Growing memory usage in small HDF program
>>
>> Bug found (in C++ api as usual)
>>
>> Thank you for your efforts in tracking down the problem and your
>> suggestions.
>>
>>
>> The C++ API *should* take care of incrementing and decrementing references
>> appropriately, but it does so in each object class (perhaps higher up in
>> some class hierarchies, such as datatypes, but otherwise near the leaves)
>> rather than through inheritance from IdComponent. That strategy, while it
>> works, has left a few bugs that I've found or encountered, both as leaks
>> and as decref'ing references that were never incref'd.  As of 1.8.15 that
>> was all that I was aware of, but this concern is always warranted given
>> past burns: this would be the third time I've noticed something that a
>> shared_ptr-like class/wrapper around HDF resources (IdComponent...?) would
>> completely eliminate.
>>
>>
>> dataset.getSpace() leaks a reference:
>>
>>    //create dataspace object using the existing id then return the object
>>    DataSpace data_space; // <-- default constructor makes a valid HDF dataspace for H5S_SCALAR
>>    f_DataSpace_setId(&data_space, dataspace_id); // <-- evil line, why didn't we just use the ctor that takes the id parameter?
>>    return( data_space );
>>
>> In 1.8.14, this block of code looked like this, before it was changed to
>> use the friend function in 1.8.15:
>>
>> //create dataspace object using the existing id then return the object
>> DataSpace data_space(dataspace_id);
>> return(data_space);
>>
>> As you can see in the comments you included below, the friend function
>> was a work-around of a problem reported by some other users.  In that
>> problem, the id was prematurely closed, due to the behind-the-scene
>> copy-constructor/destructor when an object was returned from a function.
>> In order to fix that problem, the copy constructor and the constructor that
>> takes an existing id, in those classes that are associated with an HDF5 id,
>> need to increment the ref counter.
>>
>> However, incrementing the ref count left some objects open at the end of
>> the program, perhaps due to some compilers' optimizations when returning an
>> object to the caller.  In these situations, a destructor for a temporary
>> object didn't seem to be invoked, so the id ref of the temporary object was
>> never closed.  I could never figure out why.  Hence, the work-around was to
>> use p_setId instead, which required the use of the friend function.  If
>> anyone has a different suggestion, please let us know.
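
One possible direction, along the lines of the shared_ptr semantics suggested elsewhere in this thread, is to let a single shared owner decide when the id is closed, so that copies, moves, and elided temporaries cannot change how many times the close happens. The sketch below is only a toy: it does not use HDF5 at all, `SharedHandle`, `make_handle`, and `closes_after_copies` are hypothetical names, and the "close" is just a counter increment standing in for a call such as H5Sclose.

```cpp
#include <cassert>
#include <memory>

// Toy handle wrapper (hypothetical, not part of the HDF5 C++ API).
// Ownership of the id lives in a std::shared_ptr with a custom deleter,
// so the "close" runs exactly once, when the last copy goes away.
class SharedHandle {
public:
    SharedHandle(int id, int* close_counter)
        : id_(new int(id), [close_counter](int* p) {
              ++*close_counter;  // stands in for closing the underlying id
              delete p;
          }) {}

    int get() const { return *id_; }
    long use_count() const { return id_.use_count(); }

private:
    std::shared_ptr<int> id_;
};

// Returning by value may copy, move, or elide the temporary;
// none of those paths changes how often the deleter runs.
SharedHandle make_handle(int id, int* close_counter) {
    SharedHandle h(id, close_counter);
    return h;
}

// Returns how many times the toy "close" ran for one id shared by three copies.
int closes_after_copies() {
    int closes = 0;
    {
        SharedHandle a = make_handle(42, &closes);
        SharedHandle b = a;  // shares ownership with a
        SharedHandle c = b;  // shares ownership with a and b
        assert(c.get() == 42);
        assert(c.use_count() == 3);
        assert(closes == 0);  // nothing closed while any copy is alive
    }
    return closes;
}
```

Calling `closes_after_copies()` returns 1 in this toy model. Whether such a design fits the existing IdComponent hierarchy, which already relies on the library's own id reference counts, is a separate question that this sketch does not answer.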
>>
>>
>> //--------------------------------------------------------------------------
>> // Function:    f_DataSpace_setId - friend
>> // Purpose:    This function is friend to class H5::DataSpace so that it
>> //        can set DataSpace::id in order to work around a problem
>> //        described in the JIRA issue HDFFV-7947.
>> //        Applications shouldn't need to use it.
>> // param    dspace   - IN/OUT: DataSpace object to be changed
>> // param    new_id - IN: New id to set
>> // Programmer    Binh-Minh Ribler - 2015
>> //--------------------------------------------------------------------------
>> // <-- evil function that shouldn't exist (as a friend no less!)
>> void f_DataSpace_setId(DataSpace* dspace, hid_t new_id)
>> {
>>     // <-- why not dspace->p_setId(new_id);?  Just make it public already as
>>     // "reset" and get rid of the friend.  Follow shared_ptr semantics... and
>>     // bring all this stuff inside IdComponent.
>>     dspace->id = new_id;
>> }
>>
>> The difference between the public "setId" and the private p_setId is that
>> "setId" also increments the ref count and is intended for applications to
>> use on the C++ object id.  The private p_setId doesn't increment the id ref
>> count and is not intended for application use.  The difference is
>> explained in the function's header.
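
To make the three behaviours discussed in this thread concrete, here is a minimal toy model. It does not use HDF5; `ToyRegistry`, `ToySpace`, `raw_assign`, `p_set_id`, `set_id`, and `open_ids_after` are hypothetical names, and the semantics are assumed from the descriptions in this thread: raw assignment overwrites the id without touching any counts, the p_setId-like call releases the old id and adopts the new one without incrementing it, and the setId-like call does the same and then increments the new id.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Toy stand-in for the library's id table (id -> reference count).
// Hypothetical: this is not how HDF5 itself is implemented.
struct ToyRegistry {
    std::map<int, int> refs;
    int next_id = 1;

    int create() { int id = next_id++; refs[id] = 1; return id; }
    void incref(int id) { ++refs.at(id); }
    void decref(int id) { if (--refs.at(id) == 0) refs.erase(id); }
    std::size_t open_count() const { return refs.size(); }
};

// Toy wrapper playing the role of DataSpace.
class ToySpace {
public:
    // Like the default constructor described above: opens an id of its own.
    explicit ToySpace(ToyRegistry& reg) : reg_(reg), id_(reg.create()) {}
    ToySpace(const ToySpace&) = delete;
    ToySpace& operator=(const ToySpace&) = delete;
    ~ToySpace() { reg_.decref(id_); }

    // Overwrites the id; the previously held id is never released.
    void raw_assign(int new_id) { id_ = new_id; }
    // Releases the old id, then adopts new_id without incrementing it.
    void p_set_id(int new_id) { reg_.decref(id_); id_ = new_id; }
    // Same as p_set_id, but also takes a reference of its own on new_id.
    void set_id(int new_id) { p_set_id(new_id); reg_.incref(new_id); }

private:
    ToyRegistry& reg_;
    int id_;
};

enum class How { RawAssign, PSetId, SetId };

// Returns how many ids are still open after the wrapper has been destroyed.
std::size_t open_ids_after(How how) {
    ToyRegistry reg;
    int opened = reg.create();   // id returned by the "C library"
    {
        ToySpace space(reg);     // default construction opens a second id
        switch (how) {
            case How::RawAssign: space.raw_assign(opened); break;
            case How::PSetId:    space.p_set_id(opened);   break;
            case How::SetId:     space.set_id(opened);     break;
        }
    }                            // destructor releases whatever id is held now
    return reg.open_count();
}
```

In this model `open_ids_after(How::RawAssign)` returns 1, because the id opened by the default constructor is never released. `open_ids_after(How::PSetId)` returns 0, and `open_ids_after(How::SetId)` returns 1, where the remaining id is the caller's own reference that the caller is still expected to close.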
>> Thank you,
>> Binh-Minh
>>
>>
>>  -Jason
>>
>> On Thu, Aug 13, 2015 at 9:37 AM, Miller, Mark C. <[email protected]>
>> wrote:
>>
>>> Hmm. Well I have no experience with HDF5's C++ interface.
>>>
>>> My first thought when reading your description was. . . I've seen that
>>> before. It happens when I forget to H5Xclose() all the objects I H5Xopened
>>> (groups, datasets, types, dataspaces, etc.).
>>>
>>> However, with C++, I presume the interface is designed to close objects
>>> when they fall out of scope (e.g. the destructor is called). So, in looking
>>> at your code, even though I don't see any explicit calls to close objects
>>> previously opened, I assume that *should* be happening when the objects
>>> fall out of scope. But, are you *certain* that *is* happening? Just before
>>> exiting main, you might want to make a call to H5Fget_obj_count() to get some
>>> idea how many objects the HDF5 library thinks are still open in the file. If
>>> you get a large number, then that would suggest the problem is that the C++
>>> interface somehow isn't closing objects as they fall out of scope.
>>>
>>> That's all I can think of. Sorry if no help.
>>>
>>> Mark
>>>
>>>
>>> From: Hdf-forum <[email protected]> on behalf of
>>> Jorj Pimm <[email protected]>
>>> Reply-To: HDF Users Discussion List <[email protected]>
>>> Date: Thursday, August 13, 2015 9:21 AM
>>> To: "[email protected]" <[email protected]>
>>> Subject: [Hdf-forum] Growing memory usage in small HDF program
>>>
>>> Hello,
>>>
>>> I am writing an application which writes large data sets to HDF5 files,
>>> in fixed size blocks, using the HDF C++ API (version 1.8.15, patch 1, built
>>> in msvc 2013 x64).
>>>
>>> My application seems to quickly consume all the available memory on my
>>> system (win32 - around 5.9GB), and then crashes whenever the system becomes
>>> stressed (Windows kills it as it has no memory).
>>>
>>> I have also tested the application on a linux machine, where I saw
>>> similar results.
>>>
>>> I was under the impression that by using HDF5, the file would be brought
>>> in and out of memory in such a way that the library would only use a small
>>> working set - is this not true?
>>>
>>> I have experimented with HDF features such as flushing to disk,
>>> regularly closing and reopening, garbage collection, and tuning chunking
>>> and caching settings, and haven't managed to get a stable working set.
>>>
>>> I've attached a minimal example, can anyone point out my mistake?
>>>
>>> Thanks,
>>> - Jorj
>>>
>>>
>>> _______________________________________________
>>> Hdf-forum is for HDF software users discussion.
>>> [email protected]
>>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>>> Twitter: https://twitter.com/hdf5
>>>
>>
>>
>>
>
