On Wednesday 16 March 2011 09:52:10, Yngve Inntjore Levinsen wrote:
> Dear hierarchical people,
>
> I have recently converted a piece of code from writing its output in a
> simple ASCII format to using HDF5. At every iteration the code dumps
> some information about particle energy/trajectory/position to the file
> (this is a particle tracking code).
>
> Initially I did the same thing with the HDF5 library: a 2D array with
> an unlimited row dimension, using h5extend_f to extend it by one
> element each time and writing a hyperslab of one row to the file. As
> some (perhaps most) of you might have guessed or know already, this
> was a rather bad idea. The file (without compression) was about the
> same size as the ASCII file (but obviously with higher precision), and
> reading the file in subsequent analysis was at least an order of
> magnitude slower.
>
> I then realized that I probably needed to write less frequently and
> instead keep a semi-large hyperslab in memory. I chose a hyperslab of
> 1000 rows but otherwise used the same procedure. This seems to be fast
> and, with compression, produces a considerably smaller file. I tried
> even larger slabs, but did not see any speed improvement in my initial
> testing.
>
> My question is really just whether there are recommended ways to do
> this. I imagine I am not the first who wants to use HDF5 in this way,
> dumping some data at every iteration of a given simulation without
> having to keep it all in memory until the end?
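A minimal, untested sketch of the buffered approach described above -- an
extendable, chunked, compressed 2D dataset plus a 1000-row in-memory slab
flushed as a single hyperslab -- could look like the following. The file
name, dataset name, six-column layout, and deflate level are illustrative
assumptions, not taken from the original code; h5dset_extent_f is, I
believe, the modern spelling of the extend call mentioned in the question.

program buffered_append
  use hdf5
  implicit none

  integer, parameter :: ncols = 6      ! illustrative: e.g. x,y,z,px,py,pz
  integer, parameter :: nbuf  = 1000   ! rows held in memory between flushes

  integer(hid_t)   :: file_id, dset_id, fspace, mspace, dcpl
  integer(hsize_t) :: dims(2), maxdims(2), chunk(2), offset(2), count(2)
  integer(hsize_t) :: nrows            ! rows already flushed to the file
  real(kind=8)     :: buffer(ncols, nbuf)
  integer          :: err, step, fill

  call h5open_f(err)
  call h5fcreate_f('particles.h5', H5F_ACC_TRUNC_F, file_id, err)

  ! Start with zero rows; unlimited along the row dimension.
  dims    = (/ int(ncols, hsize_t), 0_hsize_t /)
  maxdims = (/ int(ncols, hsize_t), H5S_UNLIMITED_F /)
  call h5screate_simple_f(2, dims, fspace, err, maxdims)

  ! Chunked, compressed layout: one chunk per 1000-row flush
  ! (1000 rows x 6 doubles ~ 47 KB uncompressed).
  chunk = (/ int(ncols, hsize_t), int(nbuf, hsize_t) /)
  call h5pcreate_f(H5P_DATASET_CREATE_F, dcpl, err)
  call h5pset_chunk_f(dcpl, 2, chunk, err)
  call h5pset_deflate_f(dcpl, 6, err)
  call h5dcreate_f(file_id, 'track', H5T_NATIVE_DOUBLE, fspace, &
                   dset_id, err, dcpl)
  call h5sclose_f(fspace, err)

  nrows = 0
  fill  = 0
  do step = 1, 100000                  ! the tracking loop
     fill = fill + 1
     buffer(:, fill) = real(step, 8)   ! stand-in for the per-step record
     if (fill == nbuf) then
        call flush_slab()              ! write a full 1000-row slab
        fill = 0
     end if
  end do
  if (fill > 0) call flush_slab()      ! write the partial tail

  call h5pclose_f(dcpl, err)
  call h5dclose_f(dset_id, err)
  call h5fclose_f(file_id, err)
  call h5close_f(err)

contains

  subroutine flush_slab()
    ! Grow the dataset by 'fill' rows, then write the buffer as one
    ! hyperslab starting at the old end of the dataset.
    dims = (/ int(ncols, hsize_t), nrows + int(fill, hsize_t) /)
    call h5dset_extent_f(dset_id, dims, err)
    call h5dget_space_f(dset_id, fspace, err)
    offset = (/ 0_hsize_t, nrows /)
    count  = (/ int(ncols, hsize_t), int(fill, hsize_t) /)
    call h5sselect_hyperslab_f(fspace, H5S_SELECT_SET_F, offset, count, err)
    call h5screate_simple_f(2, count, mspace, err)
    call h5dwrite_f(dset_id, H5T_NATIVE_DOUBLE, buffer(:, 1:fill), count, &
                    err, mspace, fspace)
    call h5sclose_f(mspace, err)
    call h5sclose_f(fspace, err)
    nrows = nrows + int(fill, hsize_t)
  end subroutine flush_slab

end program buffered_append

With this layout every full flush writes exactly one chunk, so the deflate
filter compresses each 1000-row slab in one go and later sequential reads
touch whole chunks at a time.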
For getting good performance, the chunk size is very important. Typical figures for serial I/O are between 32 KB and 1 MB, depending on the final size of the dataset. Which one are you using?
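To put that 32 KB - 1 MB figure in numbers for the case above: a row of
six doubles is 48 bytes, so a 1000-row chunk is roughly 47 KB, already
inside the suggested band. A quick back-of-the-envelope check (the 64 KB
target and the six-column row are assumptions, not from the thread):

program chunk_size_check
  implicit none
  integer, parameter :: ncols = 6, bytes_per_elem = 8   ! six doubles per row
  integer, parameter :: target_bytes = 65536            ! assumed 64 KB target
  integer :: rows

  ! Rows per chunk so that one uncompressed chunk is ~target_bytes.
  rows = max(1, target_bytes / (ncols * bytes_per_elem))
  print '(a,i0,a,i0,a)', 'rows per chunk: ', rows, ' (~', &
        rows * ncols * bytes_per_elem, ' bytes)'
  ! -> 1365 rows (~64 KB); the 1000-row slab above (~47 KB) is in band.
end program chunk_size_check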
--
Francesc Alted