Hi Markus,
        Sounds like there is more to investigate here.  :-/  Unfortunately, I’m 
very time-constrained right now and can’t spend more hours in this direction.  
I spoke with Elena and she’s going to see about getting some HDF Group staff to 
look into the issue.

        Quincey


> On Sep 18, 2017, at 11:41 PM, Krug, Markus <[email protected]> wrote:
> 
> Dear Quincey,
>  
> Yes, the file gets corrupted if you add the 4th object. However, the problem
> I’m observing is not related to the number of objects you add to the file or
> the number that are already in the file. It arises because the HDF file spec
> does not specify the location of the different blocks within the file. The
> entire structure is a linked list that has its origin in the superblock. The
> superblock itself is the only block that has rules about its location. So, in
> my understanding, all software that handles HDF files in any way should first
> explore the linked-list structure and afterwards identify the locations that
> are not yet used and can therefore be used for adding content to the HDF file
> if requested. From what I observe, the HDFlib implementation behaves
> differently. It has its own algorithm for deciding where to place the
> different blocks, and this algorithm does not consider whether these
> locations are already occupied. As long as you use only the HDFlib
> implementation, this behavior will not lead to any problems because it is
> consistent with itself. The problem shows up when you generate HDF files
> with one tool and afterwards modify them with a tool that is based on HDFlib.
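
The bookkeeping described in the message above (collect the occupied regions first, then write only into unused space) might look roughly like the sketch below. It assumes the occupied regions have already been gathered by walking the file from the superblock. The `(offset, size)` representation, the function names, and the sample numbers are hypothetical and are not taken from the HDF5 library or the file format specification.

```python
from typing import List, Tuple

# Hypothetical representation of one occupied region: (start offset, size in bytes).
Extent = Tuple[int, int]


def merge_extents(occupied: List[Extent]) -> List[Extent]:
    """Sort extents and merge any that touch or overlap."""
    merged: List[Extent] = []
    for start, size in sorted(occupied):
        end = start + size
        if merged and start <= merged[-1][0] + merged[-1][1]:
            prev_start, prev_size = merged[-1]
            merged[-1] = (prev_start, max(prev_start + prev_size, end) - prev_start)
        else:
            merged.append((start, size))
    return merged


def free_extents(occupied: List[Extent], file_size: int) -> List[Extent]:
    """Return the unused gaps between occupied extents, up to file_size."""
    gaps: List[Extent] = []
    cursor = 0
    for start, size in merge_extents(occupied):
        if start > cursor:
            gaps.append((cursor, start - cursor))
        cursor = max(cursor, start + size)
    if cursor < file_size:
        gaps.append((cursor, file_size - cursor))
    return gaps


def collides(occupied: List[Extent], addr: int, size: int) -> bool:
    """True if writing `size` bytes at `addr` would overwrite an occupied extent."""
    return any(addr < start + length and start < addr + size
               for start, length in occupied)


# Made-up layout: three blocks, with an 8-byte gap before the last one.
blocks = [(0, 96), (96, 40), (144, 56)]
print(free_extents(blocks, 200))   # [(136, 8)]
print(collides(blocks, 128, 32))   # True: overlaps the block at offset 96
print(collides(blocks, 200, 32))   # False: appending at end of file is safe
```

A writer that picks addresses from its own fixed scheme without a check like `collides` would produce the kind of overwrite reported in this thread. Whether that is what the HDF5 library actually does is not established here.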
>  
> Actually, I’m quite surprised that this behavior hasn’t been observed before.
> I guess the reason is that not many projects use HDF files in embedded
> systems (small 16- or 32-bit microcontrollers with significantly less than
> 1 MB of program memory, and no or only a small real-time operating system).
> Additionally, even in applications where computing power and memory are not a
> concern, people use the HDFlib code or binary to save the time it would take
> to rewrite it. Nevertheless, I’m almost sure I found a ‘hole’ in the
> specification that needs to be fixed, either in the file specification or in
> the HDFlib implementation.
>  
> I did not use h5check or h5debug. Do I need to compile the corresponding
> code before I can use them? I’m also not sure whether that will give me new
> results, because the file I’m generating is accepted by HDFview with no
> problem at all. Do you think HDFview will accept files that do not follow the
> HDF standard?
>  
> Best Regards
> Markus
> From: Hdf-forum [mailto:[email protected]] On Behalf Of Quincey Koziol
> Sent: Monday, September 18, 2017 17:34
> To: HDF Users Discussion List <[email protected]>
> Subject: Re: [Hdf-forum] HDF lib incompatible with HDF file spec?
>  
> Hi Markus,
>             I’ve looked at the files you’ve produced and it seems like the 
> first object is getting corrupted when you add the 4th object.  Can you see 
> if that’s the case?  Also, have you been using the h5debug tool for looking 
> at your files?  (in the tools directory)  Or h5check?
>  
>             Regards,
>                         Quincey
>  
> On Sep 18, 2017, at 5:03 AM, Krug, Markus <[email protected]> wrote:
>  
> Dear all,
>  
> I just want to come back to my question about the incompatibility between
> HDFlib and the HDF file spec concerning the actual physical layout of an HDF
> file. Can anyone confirm my observation that this can lead to corrupt files
> if they are first generated in a ‘non HDFlib based’ application that complies
> with the HDF file spec and are then altered in an ‘HDFlib based’ application
> like HDFview?
>  
> Best Regards
> Markus
> From: Krug, Markus
> Sent: Wednesday, September 6, 2017 17:56
> To: 'HDF Users Discussion List' <[email protected]>
> Subject: RE: [Hdf-forum] HDF lib incompatible with HDF file spec?
>  
> Dear Mark,
>  
> Completely correct. I wrote some routines that generate HDF files. However,
> only a small subset of the functionality is used. More or less only
> compressed compound data types, at most 5 of them, will be in the files. Very
> likely not more than two groups. I follow this paper
> (http://www.ep.liu.se/ecp/076/050/ecp12076050.pdf) concerning the HDF file
> layout because I need to write ‘time series’ in my embedded application.
>  
> You are right. The HDF file spec is highly complex. Even my reduced
> functional set is taking me significantly more time to understand than I had
> planned. In the meantime, I think I understand what I need for my purpose.
> However, I’m not saying that the files I can generate so far are 100% correct
> in the sense of the HDF file spec. But at least HDFview can read them with no
> problems, so they cannot be that wrong.
>  
> Best Regards
> Markus
>  
> From: Hdf-forum [mailto:[email protected]] On Behalf Of Miller, Mark C.
> Sent: Tuesday, September 5, 2017 19:22
> To: HDF Users Discussion List <[email protected]>
> Subject: Re: [Hdf-forum] HDF lib incompatible with HDF file spec?
>  
> Hmm. If I understand you, you have written code that you believe produces an
> HDF5 file according to the 3.0 file version specification,
> https://support.hdfgroup.org/HDF5/doc/H5.format.html, but nevertheless does
> NOT use the HDF5 library to do it. Furthermore, where 'extended padding' is
> concerned, your implementation does business differently than the HDF5
> implementation.
>  
> You can prove HDF5 tools will *read* the file ok. But, in a read-modify-write
> scenario, the file is getting corrupted by the HDF5 library due to the
> difference in how the two implementations handle the extended padding -- a
> feature that you explain is '...not defined at all -- not even recommended'.
>  
> Is that about right?
>  
> If so, it does indeed sound like a potential issue in the file format 
> specification for HDF5.
>  
> Your scenario sounds like a super useful test case...does a wholly 
> independent implementation produce a file the HDF5 library can "handle"?
>  
> I wonder if there are settings in the HDF5 library you may need to set (such
> as alignment or block-size or something) such that read-modify-write will
> indeed work ok? I wonder if there is some metadata missing from your file
> that would inform the HDF5 library what specific settings it must use to
> properly read and write to the file? I wonder if there is some boot-block
> information you have neglected to include so that the HDF5 library is not
> aware of all the parameters affecting the file's layout.
>  
> The only reason for calling into question many possibilities of your 
> implementation is that the HDF5 file format is fairly complex. I don't think 
> it is easily duplicated without using the library itself. So, I think it's 
> highly likely you may be overlooking some important features of the format 
> necessary for the HDF5 library to fully handle it.
>  
> All that said, I commend your courage for attempting it and hope others can 
> chime in with more detailed thoughts on what to do about it.
>  
> Mark
>  
>  
>  
> "Hdf-forum on behalf of Krug, Markus" wrote:
>  
> Dear all,
>  
> I just came across an interesting issue.
> I implemented the writing of HDF files on an embedded system. The amount of
> functionality I implemented is significantly less than the HDF lib offers, so
> it is tailored to my needs. I implemented everything on the basis of the HDF
> 3.0 file spec. One point of my tailoring was to optimize the file size.
> Therefore, I write every internal block in the HDF files directly after the
> previous one, byte by byte, or padded to the address alignment if the HDF
> file specification requests it. The HDF files generated by HDFview or Matlab
> have plenty of space in between the internal blocks, sometimes a few hundred
> bytes. As far as I read from the HDF file specification, this ‘extended
> padding’ is not defined at all – not even recommended.
> However, this ‘extended padding’ that is performed by the HDF lib leads to
> behavior that I would consider an incompatibility with itself. To
> demonstrate this, I attached two HDF files to this email. The first
> (sizeoptimized.h5) is generated by my embedded software and is optimized
> for file size. It contains three compounds, each of which has 2
> elements. You should be able to open that file in HDFview or similar tools
> and read all its contents.
> The second file (sizeoptimizedextended.h5) was generated by HDFview by adding
> a fourth compound after sizeoptimized.h5 was opened in HDFview. You
> can see that the file is partly corrupted. The reason for this is that
> HDFview (and therefore the HDF lib, I guess) does not really take care of
> the position of the internal blocks of a file that it is writing to. It seems
> to me it has some internal mapping of those blocks. This mapping gets applied
> even if it collides with, and therefore corrupts, the existing blocks.
> If my observation is correct, I think the HDF lib will need a bugfix or the
> HDF file spec will need a description of how the internal blocks are allowed
> to be positioned within an HDF file.
> I forgot to mention that I tried to take the HDF lib sources and compile them
> for my system. However, I quit after a couple of days because the way the
> sources are written is not at all suitable for adapting them to an embedded
> system that runs a simplified file system and a real-time operating system –
> and all of it has to fit into a few hundred kilobytes.
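
The size-optimized layout described in this message (each block placed directly after the previous one, padded only when an address alignment is required) could be sketched as follows. The block names, sizes, and alignment values are invented for illustration and do not come from the HDF5 file format specification or from the attached files.

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (alignment >= 1)."""
    return (addr + alignment - 1) // alignment * alignment


def pack_blocks(blocks, start=0):
    """Place blocks back to back, padding only to each block's own alignment.

    `blocks` is a list of hypothetical (name, size, alignment) tuples.
    Returns a list of (name, offset, size) and the resulting end-of-file address.
    """
    layout = []
    cursor = start
    for name, size, alignment in blocks:
        offset = align_up(cursor, alignment)  # pad only if the block requires it
        layout.append((name, offset, size))
        cursor = offset + size
    return layout, cursor


# Made-up blocks: alignment 1 means "no alignment requirement".
demo = [
    ("superblock", 96, 1),
    ("object header", 41, 8),
    ("heap", 30, 8),
    ("raw data", 100, 1),
]
layout, eof = pack_blocks(demo)
for name, offset, size in layout:
    print(f"{name:14s} offset={offset:4d} size={size:4d}")
print("end of file:", eof)  # end of file: 274
```

In this example only the heap needs padding (from 137 up to 144), so the file ends at byte 274 with 7 bytes of padding in total.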
>  
> Can anyone comment on my observation?
>  
>  
> Best Regards
> Markus
> _______________________________________________
> Hdf-forum is for HDF software users discussion.
> [email protected]
> http://lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
> Twitter: https://twitter.com/hdf5
>  
