> I think it would be great if it would be possible to contribute these test
> codes back to The HDF Group for them to be included in the regular
> performance testing.
Ditto. We would be more than happy to accept.

Elena

On Apr 26, 2017, at 4:32 PM, Miller, Mark C. <[email protected]> wrote:

"Hdf-forum on behalf of Castro Rojo, Rodrigo" wrote:

This will help *only* for closed objects. Garbage collect will have no effect on objects left open. Hence, one of your use cases of keeping all datasets open will not benefit from a call to H5garbage_collect().

> I am testing H5garbage_collect() after everything is closed (including the file), and it does not help. We have tried to use this primitive many times with no success.

Ok, well, closing the file implies the file has also been "garbage collected". So I would not expect H5garbage_collect() *after* H5Fclose to have much effect. It might have some, maybe. But H5garbage_collect() is, IMHO, the *best* you can do in terms of forcing HDF5 to free up memory, just short of actually closing the file.

Also, be aware that H5Fclose will NOT NECESSARILY actually close your file. If you have any objects in the file left open, H5Fclose will silently *ignore* your request to close the file. You need to have opened the file with the H5F_CLOSE_SEMI close degree (set with H5Pset_fclose_degree)... see https://support.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetFcloseDegree

> 2- Building a version of the library with "--enable-using-memchecker" has no effect on memory consumption in my example.
>
> 3- H5Pset_evict_on_close has no effect.

Again, this should/will work *only* for closed objects.

> Yes. But also no, it doesn't work. Memory is not freed.

What tool(s) are you using to measure available memory? valgrind / massif?

> 5- After writing data in all datasets, H5Fflush(fid, H5F_SCOPE_GLOBAL) takes very different times depending on the open/close dataset strategy. For 40K datasets and 300 records to write per dataset:
>
> a) Open and close every dataset when its data is written: 7 seconds (including flush time), which is a latency very similar to what I got with the previous release.
>
> b) Keep all datasets open, write to all datasets, and then flush with H5Fflush(fid, H5F_SCOPE_GLOBAL): 95 seconds.

I am not surprised by these results. I think the best time and space performance of HDF5 is likely when you close objects as soon as practical.

> Ok. Then we are going to stick with that approach. Anyway, the issue of freeing memory after closing everything (including the file) is very important for us, so if you have any clue, please don't hesitate to suggest something for us to try. Just a note: the huge latency in case b) was not present in hdf5-1.10.0-alpha1.

I know there was a lot of work done on the metadata cache recently and I wonder if that work could have led to this performance regression?

As an aside, I wonder how things would be if you attempted some adjustments to the metadata cache algorithm via a call to H5Pset_mdc_config()...
https://support.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetMdcConfig

    set_initial_size   = 1
    initial_size       = 16384
    min_size           = 8192
    epoch_length       = 3000
    lower_hr_threshold = 1e-5

Possibly adjust initial_size and min_size upwards to something that represents 2-5x whatever size '300 records' is. Again, I suspect this will help *only* for the open-and-close case.

> Yes. I agree with your bet. Thank you very much for your quick answer and support. I have 4 test files to cover this case with very simple programs. What would be the best way to share them with the forum?

I think it would be great if it would be possible to contribute these test codes back to The HDF Group for them to be included in the regular performance testing.
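For reference, a minimal sketch of the close-degree and garbage-collect pattern discussed above. The file and dataset names are placeholders; the point is that H5F_CLOSE_SEMI makes H5Fclose() report, rather than silently defer, a close attempted while objects are still open, and that H5garbage_collect() can only release memory tied to objects that are already closed.

#include "hdf5.h"
#include <stdio.h>

int main(void)
{
    /* File access property list: with H5F_CLOSE_SEMI, H5Fclose() fails
     * (instead of silently deferring the close) while objects in the
     * file are still open, so leaked handles become visible at once. */
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);
    H5Pset_fclose_degree(fapl, H5F_CLOSE_SEMI);

    hid_t fid = H5Fcreate("example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    hsize_t dims[1] = {300};
    hid_t   space   = H5Screate_simple(1, dims, NULL);
    hid_t   dset    = H5Dcreate2(fid, "records", H5T_NATIVE_DOUBLE, space,
                                 H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

    /* ... write data here ... */

    H5Dclose(dset);
    H5Sclose(space);
    H5Pclose(fapl);

    if (H5Fclose(fid) < 0)
        fprintf(stderr, "H5Fclose failed: some object is still open\n");

    /* Ask the library to release free-list memory it is still holding.
     * This only affects memory tied to already-closed objects. */
    H5garbage_collect();

    return 0;
}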
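The evict-on-close property mentioned in the thread (the function is H5Pset_evict_on_close(), available from HDF5 1.10.1) is set on the file access property list before the file is opened; as Mark notes, it only affects metadata for objects that have actually been closed. A hedged sketch, with a placeholder file name:

#include "hdf5.h"

int main(void)
{
#if H5_VERSION_GE(1, 10, 1)
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Evict an object's metadata from the cache as soon as that object
     * is closed, rather than keeping it cached for possible reuse. */
    H5Pset_evict_on_close(fapl, 1);

    hid_t fid = H5Fcreate("evict_example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... create, write, and close datasets here; each H5Dclose() now
     * also evicts that dataset's cached metadata ... */

    H5Fclose(fid);
    H5Pclose(fapl);
#endif
    return 0;
}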
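The 40K-dataset timing comparison contrasts two write strategies. A sketch of strategy a), opening, writing, and closing each dataset in turn so that at most one dataset is open at a time; the names and sizes are illustrative, chosen to match the 40K datasets and 300 records mentioned above.

#include "hdf5.h"
#include <stdio.h>

#define NDSETS 40000   /* illustrative, matching the 40K datasets in the thread */

int main(void)
{
    hid_t   fid     = H5Fcreate("strategy_a.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hsize_t dims[1] = {300};
    hid_t   space   = H5Screate_simple(1, dims, NULL);
    double  buf[300] = {0};

    for (int i = 0; i < NDSETS; i++) {
        char name[32];
        snprintf(name, sizeof(name), "dset%06d", i);

        /* Strategy a): create, write, and close each dataset in turn,
         * so the library never has to track 40K open objects at once. */
        hid_t dset = H5Dcreate2(fid, name, H5T_NATIVE_DOUBLE, space,
                                H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
        H5Dwrite(dset, H5T_NATIVE_DOUBLE, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
        H5Dclose(dset);
    }

    H5Fflush(fid, H5F_SCOPE_GLOBAL);
    H5Sclose(space);
    H5Fclose(fid);
    return 0;
}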
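The metadata cache settings Mark lists could be applied through H5Pset_mdc_config() roughly as follows. The usual pattern is to read the current configuration first and override only the fields of interest; the file name is a placeholder and the values are the ones suggested above.

#include "hdf5.h"

int main(void)
{
    hid_t fapl = H5Pcreate(H5P_FILE_ACCESS);

    /* Start from the current (default) metadata cache configuration,
     * then override only selected fields. */
    H5AC_cache_config_t config;
    config.version = H5AC__CURR_CACHE_CONFIG_VERSION;
    H5Pget_mdc_config(fapl, &config);

    config.set_initial_size   = 1;        /* honour initial_size below      */
    config.initial_size       = 16384;    /* bytes                          */
    config.min_size           = 8192;     /* bytes                          */
    config.epoch_length       = 3000;     /* cache accesses per epoch       */
    config.lower_hr_threshold = 1e-5;     /* hit-rate floor before resizing */

    H5Pset_mdc_config(fapl, &config);

    /* Open the file with the tuned cache configuration. */
    hid_t fid = H5Fcreate("mdc_example.h5", H5F_ACC_TRUNC, H5P_DEFAULT, fapl);

    /* ... create datasets and write records here ... */

    H5Fclose(fid);
    H5Pclose(fapl);
    return 0;
}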
[email protected]<mailto:[email protected]> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org Twitter: https://twitter.com/hdf5
