Hi,
Maybe I misunderstood the requirements, but if you want to just copy a
dataset to another file, why not just use H5Ocopy? It allows you to use a
different file as the destination. Could be a lot faster and simpler than
loading the data into memory.

https://www.hdfgroup.org/HDF5/doc/RM/RM_H5O.html#Object-Copy

Cheers,
Martijn

On 4 Sep 2016 14:50, "Landon Clipp" <[email protected]> wrote:

Hello,

Thank you everyone for your help. I figured out the problem; I was just
misunderstanding how the functions worked. I was able to successfully read
the dataset into a buffer. I didn't realize that a 1D array was
sufficient: for some reason I was thinking it had to be a contiguous
multidimensional array, but it turns out the functions know how to interpret
the buffer if you give them the rank and the size of each dimension.

It turns out I have another problem, however. I am now trying to write this
buffer into a new file. The error happens when I try to create a new
dataset. When I ran my code, I got errors such as: "H5D.c line 194 in
H5Dcreate2(): unable to create dataset." From what I found online, there is
a size limit on contiguous datasets and mine most certainly exceeds it, so
the solution is to create a dataset creation property list and set it to
chunked. But even after setting a reasonable chunk size, I still get the
same errors. I will attach my code and the errors I am receiving; the
relevant code starts at line 122. Thank you SO MUCH for your help, I'm still
trying to learn all of this.

Landon

On Sat, Sep 3, 2016 at 11:45 AM, Michael Jackson <
[email protected]> wrote:

> You can take a look at the following source files. They are meant for C++
> and use templates, but assuming you know a bit of C++ you can convert them
> back to pure "C" without any issues. The template parameter is the POD type.
>
> https://github.com/BlueQuartzSoftware/SIMPL/tree/develop/Source/H5Support
>
> Take a look at H5Lite.h and H5Lite.cpp. There are functions in there such
> as readPointerDataset(), writePointerDataset() and getDatasetInfo().
>
> The basic flow would be the following (using some pure "C").
>
> // Open the file and get the "Location ID"
> hid_t fileId = ...
>
> char* datasetName = ....
> //Since you know it is a 5D array:
> hsize_t dims[5];
> H5T_class_t classType;
> size_t type_size;
> H5Lite::getDatasetInfo(fileId, datasetName, dims, classType, type_size);
>
> // Now multiply all the dims[] values together to compute the total
> // number of elements we need to allocate; let's assume they are
> // 32-bit signed ints
> size_t totalElements = dims[0] * dims[1] * dims[2] * dims[3] * dims[4];
> // Allocate the data
> signed int* dataPtr = malloc(totalElements * sizeof(signed int));
>
> herr_t err = H5Lite::readPointerDataset(fileId, datasetName, dataPtr);
> // Check error
> if (err < 0) { ..... }
>
> // Open New file for writing
> hid_t outFileId = ...
> signed int rank = 5;
> err = H5Lite::writePointerDataset(outFileId, datasetName, rank, dims,
> dataPtr);
> // Check error
> if (err < 0) { ..... }
>
> This assumes that you take the code from GitHub and convert the necessary
> functions into pure "C" which should be straight forward to do.
>
> The code referenced above is BSD licensed.
>
> --
> Michael A. Jackson
> BlueQuartz Software, LLC
> [e]: [email protected]
>
>
> Nelson, Jarom wrote:
>
>> You might look at h5copy as a reference, or just use that tool to do the
>> work for you.
>>
>> Jarom
>>
>> *From:*Hdf-forum [mailto:[email protected]] *On
>> Behalf Of *Landon Clipp
>> *Sent:* Friday, September 02, 2016 11:56 AM
>> *To:* [email protected]
>> *Subject:* [Hdf-forum] Best way to repackage a dataset? (C program)
>>
>> Hello everyone,
>>
>> I am working with an HDF5 file that has a 5D dataset. What I'm wanting
>> to do is to create a C program that reads this dataset into memory and
>> then outputs it into a newly created file with only that dataset in it
>> (perhaps at the root directory of the file tree). What I don't
>> understand is how to read this entire 5D array using H5Dread into a 5D
>> buffer that has been previously allocated on the heap (note I cannot use
>> an array allocated on the stack; it would be too large and would cause
>> seg faults).
>>
>> What is the general process I need to employ to do such a thing, and is
>> there maybe a more elegant solution than reading the entire dataset into
>> memory? This process seems easy to me for a 1D or 2D array, but I am lost
>> with higher-dimensional arrays. Thanks.
>>
>> Regards,
>>
>> Landon
>>
>> _______________________________________________
>> Hdf-forum is for HDF software users discussion.
>> [email protected]
>> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
>> Twitter: https://twitter.com/hdf5
>>
>


