Hi Elena,
A simple code demonstrating this issue is attached. Please try modifying
the variables "NGroup, LibVerLow, LibVerHigh". NGroup sets the number of
groups for a fixed number of datasets (NDataset), and the other two
variables specify the file-format version bounds. The size of each dataset is ~2 KB.
I tried four different cases, combining NGroup=1 or 128 with
LibVerLow=H5F_LIBVER_EARLIEST or H5F_LIBVER_18. For NGroup=1, the I/O
bandwidth drops dramatically when the file size exceeds ~3.4 GB. For
NGroup=128, the bandwidth stays reasonable. The results are similar for
different LibVerLow (actually, the results are a bit worse for H5F_LIBVER_18
and H5F_LIBVER_LATEST than for H5F_LIBVER_EARLIEST).
Some system specs:
HDF5 version: 1.8.16
CPU: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz
File system: GPFS
OS: CentOS release 6.7
Sincerely,
Justin
2016-02-19 17:41 GMT-06:00 Elena Pourmal <[email protected]>:
> Justin,
>
> Will it be possible for you to provide a program that illustrates the
> problem? Which version of the library are you using? On which system are
> you running your application?
>
> Thank you!
>
> Elena
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> Elena Pourmal The HDF Group http://hdfgroup.org
> 1800 So. Oak St., Suite 203, Champaign IL 61820
> 217.531.6112
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
>
>
>
> On Feb 19, 2016, at 4:03 PM, Hsi-Yu Schive <[email protected]> wrote:
>
> Thanks for the suggestion. The performance I reported was measured using
> the earliest file format (i.e., H5F_LIBVER_EARLIEST). I just tried to use
> H5F_LIBVER_18, but it leads to even worse performance. The bandwidth
> starts to drop when N > ~0.5 million. Using H5F_LIBVER_LATEST does not
> help either.
>
> Justin
>
> 2016-02-19 8:26 GMT-06:00 Gerd Heber <[email protected]>:
>
>> Are you using the latest version of the file format? In other words, are
>> you using H5P_DEFAULT (-> earliest)
>>
>> as your file access property list, or have you created one which sets the
>> library version bounds to H5F_LIBVER_18?
>>
>>
>>
>> See
>> https://www.hdfgroup.org/HDF5/doc/RM/RM_H5P.html#Property-SetLibverBounds
>>
>>
>>
>> In the newer version, groups with large numbers of links and attributes
>> are managed more efficiently.
>>
>>
>>
>> Does that solve your problem?
>>
>>
>>
>> Best, G.
>>
>>
>>
>>
>>
>> *From:* Hdf-forum [mailto:[email protected]] *On
>> Behalf Of *Hsi-Yu Schive
>> *Sent:* Thursday, February 18, 2016 2:36 PM
>> *To:* [email protected]
>> *Subject:* [Hdf-forum] I/O bandwidth drops dramatically and
>> discontinuously for a large number of small datasets
>>
>>
>>
>> I encounter a sudden drop of I/O bandwidth when the number of datasets in
>> a single group exceeds around 1.7 million. In the following I describe the
>> issue in more detail.
>>
>>
>>
>> I'm converting adaptive mesh refinement data to HDF5 format. Each
>> dataset contains a small 4-D array with a size of ~ 10 KB in the compact
>> format. All datasets are stored in the same group. When the total number of
>> datasets (N) is smaller than ~ 1.7 million, I get an I/O bandwidth of ~100
>> MB/s, which is acceptable. However, when N exceeds ~ 1.7 million, the
>> bandwidth suddenly drops by at least one to two orders of magnitude.
>>
>>
>>
>> This issue seems to relate to the **number of datasets per group**
>> instead of total data size. For example, if I reduce the size of each
>> dataset by a factor of 5 (so ~2 KB per dataset), the I/O bandwidth still
>> drops when N > ~ 1.7 million, even though the total data size is reduced by
>> a factor of 5.
>>
>>
>>
>> So I was wondering what causes this issue, and if there is any simple
>> solution to that. Since the data stored in different datasets are
>> independent of each other, I prefer not to combine them into a larger
>> dataset. My current solution is to further create several HDF5 sub-groups
>> under the main group, and then distribute all datasets evenly in these
>> sub-groups (so that the number of datasets per group becomes smaller). By
>> doing so the I/O bandwidth becomes stable even when N > 1.7 million.
>>
>>
>>
>> If necessary, I can post a simplified code to reproduce this issue.
>>
>>
>>
>> Hsi-Yu
>>
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>> Twitter: https://twitter.com/hdf5
>>
>
#include <hdf5.h>
#include <sys/time.h>
#include <cstdio>

int main()
{
   /* Input parameters. */
   const int NDataset         = 128*128*128;     // total number of datasets
   const int NGroup           = 128;             // total number of groups
   const int NDatasetPerGroup = NDataset/NGroup; // number of datasets per group
   const int N                = 8;               // V*N^3*sizeof(float) = size of each dataset
   const int V                = 1;
   const char FileName[]      = "Data.h5";       // output filename

   /* HDF5 file format (low and high bounds). */
   const H5F_libver_t LibVerLow  = H5F_LIBVER_EARLIEST;
// const H5F_libver_t LibVerLow  = H5F_LIBVER_18;
// const H5F_libver_t LibVerLow  = H5F_LIBVER_LATEST;
// const H5F_libver_t LibVerHigh = H5F_LIBVER_18;
   const H5F_libver_t LibVerHigh = H5F_LIBVER_LATEST;

   hid_t      file_id, group_id, dataset_id, dataspace_id, fapl;
   H5G_info_t ginfo;
   hsize_t    dims[4];
   herr_t     status;
   timeval    tv1, tv2;
   char       SetName[100], GroupName[100];
   float      (*dset_data)[N][N][N] = new float [V][N][N][N];
   float      Time, SizeMB;

   /* Initialize the dataset. */
   for (int v=0; v<V; v++)
   for (int k=0; k<N; k++)
   for (int j=0; j<N; j++)
   for (int i=0; i<N; i++)
      dset_data[v][k][j][i] = (((float)v*N+k)*N+j)*N+i;

   printf( "Data[First] = %14.7e\n", dset_data[ 0 ][ 0 ][ 0 ][ 0 ] );
   printf( "Data[Last ] = %14.7e\n", dset_data[V-1][N-1][N-1][N-1] );
   printf( "\n" );
   fflush( stdout );

   SizeMB = (float)NDataset*V*N*N*N*sizeof(float)/1024.0/1024.0;
   printf( "NDataset         = %10d\n", NDataset );
   printf( "NGroup           = %10d\n", NGroup );
   printf( "NDatasetPerGroup = %10d\n", NDatasetPerGroup );
   printf( "Data size        = %13.7e MB\n", SizeMB );
   fflush( stdout );

   gettimeofday( &tv1, NULL );

   /* Create the file with the specified format. */
   fapl    = H5Pcreate( H5P_FILE_ACCESS );
   status  = H5Pset_libver_bounds( fapl, LibVerLow, LibVerHigh );
   file_id = H5Fcreate( FileName, H5F_ACC_TRUNC, H5P_DEFAULT, fapl );
   status  = H5Pclose( fapl );

   /* Create the data space shared by all datasets. */
   dims[0] = V;
   dims[1] = N;
   dims[2] = N;
   dims[3] = N;
   dataspace_id = H5Screate_simple( 4, dims, NULL );

   for (int g=0; g<NGroup; g++)
   {
      /* Set the group name (zero-padded to 9 digits). */
      sprintf( GroupName, "/group_%09d", g );

      /* Create a group. */
      group_id = H5Gcreate2( file_id, GroupName, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT );

      for (int t=0; t<NDatasetPerGroup; t++)
      {
         /* Set the dataset name (zero-padded to 9 digits). */
         sprintf( SetName, "%s/dset_%09d", GroupName, t );

         /* Create a dataset. */
         dataset_id = H5Dcreate2( file_id, SetName, H5T_NATIVE_FLOAT, dataspace_id,
                                  H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT );

         /* Write the dataset. */
         status = H5Dwrite( dataset_id, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, dset_data );

         /* Close the dataset. */
         status = H5Dclose( dataset_id );
      } // for (int t=0; t<NDatasetPerGroup; t++)

      /* Obtain the group info and print the storage type of the last group. */
      if ( g == NGroup-1 )
      {
         status = H5Gget_info( group_id, &ginfo );
         printf( "\nGroup storage type is: " );
         switch ( ginfo.storage_type )
         {
            case H5G_STORAGE_TYPE_COMPACT:      printf( "H5G_STORAGE_TYPE_COMPACT\n" );      break;
            case H5G_STORAGE_TYPE_DENSE:        printf( "H5G_STORAGE_TYPE_DENSE\n" );        break;
            case H5G_STORAGE_TYPE_SYMBOL_TABLE: printf( "H5G_STORAGE_TYPE_SYMBOL_TABLE\n" ); break;
            default:                            printf( "unknown\n" );                       break;
         }
         printf( "\n" );
      }

      /* Close the group. */
      status = H5Gclose( group_id );
   } // for (int g=0; g<NGroup; g++)

   /* Close the data space and the file. */
   status = H5Sclose( dataspace_id );
   status = H5Fclose( file_id );

   gettimeofday( &tv2, NULL );

   /* Subtract seconds before scaling to avoid integer overflow. */
   Time = (tv2.tv_sec - tv1.tv_sec) + (tv2.tv_usec - tv1.tv_usec)*1.0e-6;
   printf( "Time      = %13.7e sec\n", Time );
   printf( "Bandwidth = %13.7e MB/sec\n", SizeMB/Time );

   delete [] dset_data;
   return 0;
}