Hi David,
 
Thanks for the quick reply!
 
Tweaking the chunk size does not do the trick. Setting

hsize_t chunkDims[1] = {1000000};

increases the file size, but not the speed: the write time stays almost the same, while the file grows from 15 MB to roughly 150 MB (presumably because each chunk is allocated in full).
 
Regarding the data type in my struct: each element is a variable-length string.
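For what it's worth, the only workaround I can think of is to buffer the records on my side and flush them in larger batches instead of issuing one write per item. A rough, untested sketch of what I mean (BATCH is an arbitrary number I picked; characteristic_t, memspace, and dims are as in my code below):

#include <vector>
#include "H5Cpp.h"
using namespace H5;

std::vector<characteristic_t> buffer;
const hsize_t BATCH = 1000;  // arbitrary batch size

// Flushes all buffered records with a single extend + write.
void flushBuffer(DataSet &charData, const CompType &memspace, hsize_t *dims)
{
    if (buffer.empty()) return;

    hsize_t count[1] = { buffer.size() };
    hsize_t start[1] = { dims[0] };

    // Extend the dataset once per batch instead of once per record.
    dims[0] += buffer.size();
    charData.extend(dims);

    DataSpace filespace = charData.getSpace();
    filespace.selectHyperslab(H5S_SELECT_SET, count, start);

    DataSpace memSel(1, count);
    charData.write(buffer.data(), memspace, memSel, filespace);
    buffer.clear();
}

// Per item: buffer.push_back(s1[0]); once buffer.size() == BATCH, call flushBuffer().

Would batching like this be the recommended approach, or is there a cheaper way to do the per-item hyperslab selection?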
 
Best regards,
 
Daniel


 
> Date: Mon, 20 Jul 2015 09:13:41 -0700
> From: [email protected]
> To: [email protected]
> Subject: Re: [Hdf-forum] Incremental writing of compound dataset slow
> 
> Hi Daniel,
> 
> It looks like you are writing chunks of size 100, where each struct is 
> maybe 40 bytes? I'm not sure what all the types in the struct are, but 
> if that is the case, each chunk is only about 4 KB. It is my 
> understanding that each chunk equates to one system write to disk, and 
> those writes are expensive. A good rule of thumb is to target chunks of 
> about 1 MB.
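> 
> For example, a rough sketch of what I mean (sizeof() here is only an 
> estimate of the record size; with variable-length strings the chunk 
> itself mostly holds references, so the real on-disk size will differ):
> 
> // Aim for roughly 1 MB per chunk instead of a fixed count of 100.
> const hsize_t targetBytes = 1024 * 1024;               // ~1 MB
> const hsize_t elemBytes   = sizeof(characteristic_t);  // rough record size
> hsize_t chunkDims[1]      = { targetBytes / elemBytes };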
> 
> best,
> 
> David
> software engineer at SLAC
> 
> On 07/19/15 06:26, Daniel Rimmelspacher wrote:
> > Dear hdf-forum,
> >
> > I am trying to write compound data to an extendible HDF5 dataset. In 
> > the code snippet below, I write ~30,000 compound items one by one, 
> > resulting in an approximately 15 MB .h5 file.
> >
> > Dumping this amount of data takes the HDF5 library roughly 15 
> > seconds, which seems rather long to me. My guess is that selecting 
> > the proper hyperslab for each new item wastes most of the time.
> >
> > However, I have not managed to verify this.
> >
> > I'd appreciate it if someone could have a quick look at the code 
> > below and give me a hint.
> >
> > Thanks and regards,
> >
> > Daniel
> >
> > ////////////////////////////////////////////////////////////////////
> > // Header: definition of struct type characteristic_t
> > ////////////////////////////////////////////////////////////////////
> > ...
> >
> > ////////////////////////////////////////////////////////////////////
> > // This section initializes the dataset (once) for incremental writing
> > ////////////////////////////////////////////////////////////////////
> > // Initialize the variable-length string type
> > const StrType vlst(PredType::C_S1, H5T_VARIABLE);
> >
> > // Create the memory type for the compound datatype
> > memspace = CompType(sizeof(characteristic_t));
> > H5Tinsert(memspace.getId(), "Name",           HOFFSET(characteristic_t, name),       vlst.getId());
> > H5Tinsert(memspace.getId(), "LongIdentifier", HOFFSET(characteristic_t, longId),     vlst.getId());
> > H5Tinsert(memspace.getId(), "Type",           HOFFSET(characteristic_t, type),       vlst.getId());
> > H5Tinsert(memspace.getId(), "Address",        HOFFSET(characteristic_t, address),    vlst.getId());
> > H5Tinsert(memspace.getId(), "Deposit",        HOFFSET(characteristic_t, deposit),    vlst.getId());
> > H5Tinsert(memspace.getId(), "MaxDiff",        HOFFSET(characteristic_t, maxDiff),    vlst.getId());
> > H5Tinsert(memspace.getId(), "Conversion",     HOFFSET(characteristic_t, conversion), vlst.getId());
> > H5Tinsert(memspace.getId(), "LowerLimit",     HOFFSET(characteristic_t, lowerLimit), vlst.getId());
> > H5Tinsert(memspace.getId(), "UpperLimit",     HOFFSET(characteristic_t, upperLimit), vlst.getId());
> >
> > // Prepare the dataset
> > dims[0]              = 0;                // initial size
> > int rank             = 1;                // data will be aligned in array style
> > hsize_t maxDims[1]   = {H5S_UNLIMITED};  // dataset will be extendible
> > hsize_t chunkDims[1] = {100};            // some random chunk size
> > DataSpace *dataspace = new DataSpace(rank, dims, maxDims); // dataspace for the dataset
> >
> > // Modify the dataset creation property list to enable chunking
> > DSetCreatPropList prop;
> > prop.setChunk(rank, chunkDims);
> >
> > // Create the chunked dataset. Note the use of the pointer.
> > charData = file.createDataSet("Characteristic", memspace, *dataspace, prop);
> >
> > // Init helpers
> > hsize_t chunk[1] = {1};
> > chunkSpace = DataSpace(1, chunk, NULL);
> > filespace  = DataSpace(charData.getSpace());
> >
> >
> > ////////////////////////////////////////////////////////////////////
> > // This section is called repeatedly to write the compound items
> > // one by one
> > ////////////////////////////////////////////////////////////////////
> > // Create the new item.
> > characteristic_t s1[1];
> > s1[0].name       = name;
> > s1[0].longId     = Id;
> > s1[0].type       = type;
> > s1[0].address    = address;
> > s1[0].deposit    = deposit;
> > s1[0].maxDiff    = maxDiff;
> > s1[0].conversion = conversion;
> > s1[0].lowerLimit = lowerLimit;
> > s1[0].upperLimit = upperLimit;
> >
> > // Extend the dataset by one record
> > dims[0]++;
> > charData.extend(dims);
> >
> > // Compute the offset of the new record
> > hsize_t chunk[1] = {1};
> > hsize_t start[1] = {dims[0] - 1};
> >
> > // Select a hyperslab in the extended portion of the dataset.
> > filespace = charData.getSpace();
> > filespace.selectHyperslab(H5S_SELECT_SET, chunk, start);
> >
> > // Write data to the extended portion of the dataset.
> > charData.write(s1, memspace, chunkSpace, filespace);
> >
> >
> >
> >
                                          
_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
