Quincey, Mohamad

The new H5Dgather looks good, but I am concerned about one issue: the fact that 
we assume a common datatype between buffers. I am accustomed to ignoring 
datatype issues when working with hdf5; I simply get on with float/double etc. 
and can happily read double into float or vice versa.
One project coming along will be using BlueGene and requires interoperability 
with another cluster attached which will be x86. Here we will have big/little 
endian issues and it'd be nice if some of the internals of hdf5 which handle 
this could be leveraged.

Is there any way we can use the gather, scatter and iteration routines that 
you have proposed in the last few messages in conjunction with datatype changes 
such as float/double, long/int or big/little endian conversion?
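
To make the question concrete, here is a minimal sketch in plain C of the kind of per-element conversion I mean: big-endian 32-bit floats arriving from one machine, widened to native doubles on the other. The function names are made up for illustration and none of this is HDF5 API; it also assumes the host float is IEEE 754 binary32. HDF5 already has an in-memory conversion routine (H5Tconvert), and whether the proposed routines could be combined with it is exactly what I am asking.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one big-endian IEEE 754 binary32 value from 4 raw bytes.
 * Assumption: the host 'float' is also IEEE 754 binary32. */
float be32_to_float(const unsigned char *p)
{
    uint32_t bits = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                    ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
    float f;
    memcpy(&f, &bits, sizeof f);   /* reinterpret the bits without aliasing issues */
    return f;
}

/* Hypothetical helper: convert nelem big-endian floats in src
 * into native doubles in dst (byte order and width change together). */
void convert_be_float_to_double(const unsigned char *src, double *dst, size_t nelem)
{
    assert(sizeof(float) == 4);
    for (size_t i = 0; i < nelem; i++)
        dst[i] = (double)be32_to_float(src + 4 * i);
}
```

Because the shifts operate on values rather than on memory layout, the same code behaves identically on a big-endian and a little-endian host.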

JB

From: h5vol-boun...@hdfgroup.org [mailto:h5vol-boun...@hdfgroup.org] On Behalf 
Of Quincey Koziol
Sent: 22 August 2012 20:27
To: hdf5...@hdfgroup.org
Cc: HDF5 Virtual Object Layer (VOL) Discussions; hdf-forum@hdfgroup.org
Subject: Re: [H5vol] [Hdf5lib] RFC for new Dataspace routines

Hi John,
            Mohamad and I kicked around another pair of routines that should 
meet your goals for the iterative scatter/gather routines, and I've tried to 
describe them below.  Let me know what you think.

            Quincey

=========================================================================

herr_t H5Dgather(hid_t src_space_id, const void *src_buf, hid_t type, size_t 
dst_buf_size, void *dst_buf, H5D_gather_func_t op, void *op_data);

typedef herr_t (*H5D_gather_func_t)(const void *dst_buf, size_t 
dst_buf_bytes_used, void *op_data);

The H5Dgather routine would gather [at most] dst_buf_size bytes from the source 
buffer, according to the selection in the source dataspace and [common] 
datatype, into the destination buffer, then call the application's op 
callback (passing op_data through to it), giving the application a chance to 
"drain" the destination buffer.  If more than dst_buf_size bytes worth of data 
are available in the source selection, H5Dgather would repeatedly call the 
application's callback routine.
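
A rough sketch of the gather-and-drain behaviour described above, in plain C with no HDF5 calls. It is only an illustration under stated assumptions: a simple stride stands in for the dataspace selection, and gather_strided, gather_cb_t, struct sink and drain_to_sink are hypothetical names, not part of the proposal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type, modelled on the proposed H5D_gather_func_t. */
typedef int (*gather_cb_t)(const void *dst_buf, size_t bytes_used, void *op_data);

/* Gather every stride-th element of src (a stand-in for a selection) into
 * dst_buf, calling op to drain dst_buf each time it cannot hold another element. */
int gather_strided(const void *src, size_t nelem, size_t stride, size_t elem_size,
                   size_t dst_buf_size, void *dst_buf, gather_cb_t op, void *op_data)
{
    assert(stride > 0 && elem_size > 0 && dst_buf_size >= elem_size);
    const unsigned char *in = src;
    unsigned char *out = dst_buf;
    size_t used = 0;

    for (size_t i = 0; i < nelem; i += stride) {
        if (used + elem_size > dst_buf_size) {      /* buffer full: drain it */
            if (op(dst_buf, used, op_data) < 0)
                return -1;
            used = 0;
        }
        memcpy(out + used, in + i * elem_size, elem_size);
        used += elem_size;
    }
    /* Final, possibly partial, buffer. */
    if (used > 0 && op(dst_buf, used, op_data) < 0)
        return -1;
    return 0;
}

/* Example drain callback: append each drained buffer to a fixed-size sink. */
struct sink { unsigned char data[256]; size_t len; int calls; };

int drain_to_sink(const void *buf, size_t bytes, void *op_data)
{
    struct sink *s = op_data;
    if (s->len + bytes > sizeof s->data)
        return -1;
    memcpy(s->data + s->len, buf, bytes);
    s->len += bytes;
    s->calls++;
    return 0;
}
```

In a real VOL plugin the drain callback would presumably perform a put/send of the staged bytes rather than a memcpy.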


herr_t H5Dscatter(size_t src_buf_size, void *src_buf, hid_t type, hid_t 
dst_space_id, void *dst_buf, H5D_scatter_func_t op, void *op_data);

typedef herr_t (*H5D_scatter_func_t)(void *src_buf, size_t *src_buf_bytes_used, 
void *op_data);

Similar to the H5Dgather routine, the H5Dscatter routine would call the 
application's op callback (passing op_data through to it) to fill up the 
source buffer with data (returning the number of bytes used in the source 
buffer through the src_buf_bytes_used parameter), and scatter those values 
into the destination buffer, according to the destination selection and the 
[common] datatype.  Repeated calls to the application callback will be made if 
more than src_buf_size bytes worth of data is needed to fill the destination 
selection.
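
The scatter direction can be sketched the same way, again in plain C and only as an illustration. A stride stands in for the destination selection, and scatter_strided, scatter_cb_t, struct source and fill_from_source are hypothetical names. As in the proposed callback signature, the callback is not told the buffer capacity, so the application's own state has to carry it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type, modelled on the proposed H5D_scatter_func_t. */
typedef int (*scatter_cb_t)(void *src_buf, size_t *src_buf_bytes_used, void *op_data);

/* Scatter elements into every stride-th slot of dst (a stand-in for a
 * selection), calling op to refill src_buf whenever it has been consumed. */
int scatter_strided(size_t src_buf_size, void *src_buf, size_t elem_size,
                    size_t nelem, size_t stride, void *dst,
                    scatter_cb_t op, void *op_data)
{
    assert(stride > 0 && elem_size > 0 && src_buf_size >= elem_size);
    const unsigned char *in = src_buf;
    unsigned char *out = dst;
    size_t avail = 0, pos = 0;

    for (size_t i = 0; i < nelem; i += stride) {
        if (pos == avail) {                         /* staging buffer consumed */
            avail = 0;
            if (op(src_buf, &avail, op_data) < 0)
                return -1;
            /* Reject empty, oversized or partial-element refills. */
            if (avail < elem_size || avail > src_buf_size || avail % elem_size)
                return -1;
            pos = 0;
        }
        memcpy(out + i * elem_size, in + pos, elem_size);
        pos += elem_size;
    }
    return 0;
}

/* Example fill callback: hand out values in chunks of chunk_elems ints. */
struct source { const int *values; size_t remaining; size_t chunk_elems; };

int fill_from_source(void *buf, size_t *bytes_used, void *op_data)
{
    struct source *s = op_data;
    size_t n = s->remaining < s->chunk_elems ? s->remaining : s->chunk_elems;
    memcpy(buf, s->values, n * sizeof(int));
    s->values += n;
    s->remaining -= n;
    *bytes_used = n * sizeof(int);
    return 0;
}
```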

On Aug 22, 2012, at 8:56 AM, Biddiscombe, John A. wrote:


Mohamad

herr_t H5Dtransfer (hid_t src_space_id, const void *src_buf, H5T_t type, hid_t 
dst_space_id, /*out*/void *dst_buf);
I'm not certain that this will be any use on its own. If a selection is 
gigabytes and the VOL layer only has a small buffer available, then it needs to 
be able to make these transfers in pieces, performing puts/sends or whatever as 
appropriate. The function would be more useful if it was re-entrant and took a 
buffer size (size_t) argument, so that the selection could be copied by the 
first N, then next N, then next N until exhausted.

herr_t H5Sselect_iterate (hid_t dataspace_id, H5S_select_iterator_t op, void 
*op_data);
This function would be great and should be higher priority than the first 
because you can do the first using this second one - and don't have the problem 
of the limited buffer size. The user can iterate over the dataspace and copy as 
much or as little on each entry to the callback function as desired and 
maintain their own bookkeeping of where they left off. If the selection has 
huge contiguous chunks, the user callback can break these into pieces and 
perform the appropriate copies as substeps. If the selection is very sparse, 
then an internal buffer can be filled and acted upon as the iterations progress.

the callback function
herr_t (*H5S_select_iterate_t)(hsize_t *offset_coords, hsize_t length, void * 
op_data);
could be improved by adding a user callback void *pointer so that when you call 
iterate - you pass a pointer to the function - and also a user pointer to a 
data structure of the user's choice, which is passed to the callback as a user 
parameter. This way we can track intermediate transfer objects (like if we only 
partially transferred data or are filling an internal buffer) and in the case 
of multiple threads acting on these iterations, we can make sure each thread 
has its own data pointer and avoid static/global objects which will not be safe.
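
To illustrate the iteration idea, here is a minimal plain-C sketch (no HDF5 calls) of a callback that receives one contiguous run at a time, breaks large runs into bounded pieces, and keeps all of its bookkeeping in a caller-owned struct passed through op_data. The names iterate_runs, run_cb_t, struct run, struct xfer_state and copy_run_in_pieces are hypothetical, and a linear element offset is used in place of the offset_coords array to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: one contiguous run of selected elements,
 * given as a linear element offset and a length, plus the caller's state. */
typedef int (*run_cb_t)(size_t offset, size_t length, void *op_data);

/* Hypothetical stand-in for a selection: a list of contiguous runs. */
struct run { size_t offset, length; };

int iterate_runs(const struct run *runs, size_t nruns, run_cb_t op, void *op_data)
{
    for (size_t i = 0; i < nruns; i++) {
        int rc = op(runs[i].offset, runs[i].length, op_data);
        if (rc != 0)
            return rc;              /* non-zero return stops the iteration */
    }
    return 0;
}

/* Per-caller bookkeeping; each thread would own one of these, so no
 * static or global state is needed. */
struct xfer_state {
    const int *src;
    int       *dst;
    size_t     dst_pos;     /* elements written so far */
    size_t     max_piece;   /* largest single copy, in elements */
    size_t     pieces;      /* number of copies performed */
};

/* Copy one run into dst, at most max_piece elements per step. */
int copy_run_in_pieces(size_t offset, size_t length, void *op_data)
{
    struct xfer_state *st = op_data;
    assert(st->max_piece > 0);
    while (length > 0) {
        size_t n = length < st->max_piece ? length : st->max_piece;
        memcpy(st->dst + st->dst_pos, st->src + offset, n * sizeof *st->src);
        st->dst_pos += n;
        st->pieces++;
        offset += n;
        length -= n;
    }
    return 0;
}
```

Each memcpy here marks the point where a put/send of one piece would happen in a real transfer.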

I just wrote this off the top of my head, so criticism welcome.

JB



-----Original Message-----
From: h5vol-boun...@hdfgroup.org [mailto:h5vol-boun...@hdfgroup.org] On Behalf 
Of Mohamad Chaarawi
Sent: 21 August 2012 23:45
To: HDF5 Virtual Object Layer (VOL) Discussions; hdf-forum@hdfgroup.org; 
hdf5...@hdfgroup.org
Subject: [H5vol] RFC for new Dataspace routines

Hi All,

Please find attached an RFC that describes a couple of dataspace routines that 
we plan to add to the HDF5 API in the near future. If you have the time, please 
give it a read and feel free to send us comments.
We would like to hear from you if you see that you could benefit from those 
routines but would like to change something or would like us to consider adding 
other routines.  It would be great, in either case, if you could include your 
use case.

Thank you,
Mohamad

_______________________________________________
Hdf5lib mailing list
hdf5...@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf5lib_hdfgroup.org

_______________________________________________
Hdf-forum is for HDF software users discussion.
Hdf-forum@hdfgroup.org
http://mail.hdfgroup.org/mailman/listinfo/hdf-forum_hdfgroup.org
