Thank you Gerd and Dave.

Solution #1 is fine for my current task. Ultimately, though, for the
performance of my app I would like to visit only those areas of the sparse
dataset where data actually exists. From your answers and the documentation I
gather that this information is available at chunk granularity in the B-tree
but is apparently not exposed in the API.

________________________________
From: Hdf-forum <[email protected]> on behalf of Dave 
Allured - NOAA Affiliate <[email protected]>
Sent: Thursday, April 20, 2017 10:16 AM
To: [email protected]
Subject: Re: [Hdf-forum] [**EXTERNAL**] Re: first non-fill-value in the sparse 
chunked dataset

Efim,

Can you simply add a scalar integer attribute that keeps track of the lower 
bound index value of the slower dimension?  Just update this attribute every 
time you write to the data set, or at least every time the lower bound goes 
lower.  This would be an application level solution, rather than something 
provided by the library.

This resembles a minimal version of Gerd's suggestion #1.
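
A minimal sketch of this bookkeeping in Python, for illustration only. The attribute name "lower_bound" and the helper update_lower_bound are hypothetical, and a plain dict stands in for the attribute store so the snippet runs without HDF5. With h5py, the dict-like dataset.attrs object could presumably be passed in its place, though that is an assumption and not something tested here.

```python
from collections.abc import MutableMapping

LOWER_BOUND_ATTR = "lower_bound"  # hypothetical attribute name


def update_lower_bound(attrs: MutableMapping, first_row: int) -> int:
    """Store first_row if it is below the recorded lower bound; return the bound."""
    if LOWER_BOUND_ATTR not in attrs or first_row < int(attrs[LOWER_BOUND_ATTR]):
        attrs[LOWER_BOUND_ATTR] = first_row
    return int(attrs[LOWER_BOUND_ATTR])


# A plain dict stands in for the dataset's attribute store (assumption).
attrs = {}
update_lower_bound(attrs, 5000)          # first write lands in the "middle"
update_lower_bound(attrs, 7000)          # growth to the right: bound unchanged
bound = update_lower_bound(attrs, 4200)  # growth to the left: bound is lowered
```

The helper would be called with the smallest slow-dimension index of each write, just before or after the write itself, so the stored value never exceeds the true lower bound of the populated region.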

--Dave


On Thu, Apr 20, 2017 at 8:39 AM, Efim Dyadkin <[email protected]> wrote:

Sorry, I should have specified what "first" means. I have a 2D dataset whose
slower dimension is sparse and unlimited and whose fast dimension is
non-sparse and of fixed length. Typically, my data is first written somewhere
in the "middle" of the slower dimension and then grows incrementally in either
direction (to the left and to the right). I need to keep track of the current
bounding box so that I access only the populated part of the dataset.

The upper boundary of the slower dimension is basically the extent of the
dataset, so I do not need to store it myself. As for the lower boundary, I
hoped I could find it by getting access to the first available chunk, the one
with the smallest index along the slower dimension.

I think exposing at least a boolean grid of existing chunks could be helpful 
for sparse data handling.
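
To illustrate the kind of grid I mean, here is a rough application-level sketch in Python. It does not read anything from HDF5: the class ChunkOccupancy and its methods are hypothetical names, and the record only reflects writes that the application reports to it. Only the slower dimension is tracked, since the fast dimension is dense.

```python
class ChunkOccupancy:
    """Application-side record of which chunk rows have been written.

    Tracks only writes reported via mark_written(); it is not derived from
    the HDF5 file itself.
    """

    def __init__(self, chunk_rows: int):
        self.chunk_rows = chunk_rows  # chunk length along the slower dimension
        self.written = set()          # chunk indices touched so far

    def mark_written(self, row_start: int, row_stop: int) -> None:
        """Mark every chunk overlapping rows [row_start, row_stop) as populated."""
        if row_stop <= row_start:
            return  # empty range, nothing to record
        first = row_start // self.chunk_rows
        last = (row_stop - 1) // self.chunk_rows
        self.written.update(range(first, last + 1))

    def populated_row_ranges(self):
        """Return sorted (start, stop) row ranges, merging adjacent chunks."""
        ranges = []
        for idx in sorted(self.written):
            start, stop = idx * self.chunk_rows, (idx + 1) * self.chunk_rows
            if ranges and ranges[-1][1] == start:
                ranges[-1] = (ranges[-1][0], stop)  # extend the previous range
            else:
                ranges.append((start, stop))
        return ranges


occ = ChunkOccupancy(chunk_rows=100)
occ.mark_written(5000, 5150)  # touches chunks 50 and 51
occ.mark_written(4990, 5000)  # touches chunk 49
occ.mark_written(9000, 9001)  # touches the isolated chunk 90
ranges = occ.populated_row_ranges()
```

A reader could then loop over the returned ranges and skip everything else. The record would still have to be persisted alongside the dataset, for example as an auxiliary dataset or attribute.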

Thanks,

Efim


From: Hdf-forum <[email protected]> on behalf of Gerd Heber <[email protected]>
Sent: Thursday, April 20, 2017 7:20 AM
To: HDF Users Discussion List
Subject: [**EXTERNAL**] Re: [Hdf-forum] first non-fill-value in the sparse 
chunked dataset


The “first non-fill-value” in which order? (chronological, C-order, …)



Short answer: No chance.



Slightly longer: (Apart from H5DOwrite_chunk…) There is currently no API that
gives you direct control over/introspection into chunks. You can control certain
aspects of chunk allocation time and policy (via dataset creation properties),
but the rest is pretty opaque and a side-effect of H5D[read,write].

I think you have at least two options:



1. Create an auxiliary structure where you maintain that type of log
   information.
   (This is dangerous/illusionary because you’ll be making assumptions about
   how the HDF5 library writes/updates chunks, and what happens in the
   underlying storage.)

2. Create a proper sparse structure and don’t use chunking to mimic one.
   (You might still struggle with the definition of ‘first.’)
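
One possible reading of option 2, sketched in Python under stated assumptions: populated rows are kept densely packed, and a separate sorted index records which slow-dimension position each stored row belongs to. The class SparseRows and its fields are hypothetical names, and plain lists stand in for what might be a 1-D index dataset and a 2-D values dataset in an HDF5 file.

```python
import bisect


class SparseRows:
    """Densely packed populated rows plus a sorted index of their positions."""

    def __init__(self):
        self.row_index = []  # sorted slow-dimension indices of populated rows
        self.values = []     # values[i] is the row stored at row_index[i]

    def put(self, row: int, data: list) -> None:
        """Insert or overwrite the data for one slow-dimension index."""
        pos = bisect.bisect_left(self.row_index, row)
        if pos < len(self.row_index) and self.row_index[pos] == row:
            self.values[pos] = data          # row already present: overwrite
        else:
            self.row_index.insert(pos, row)  # keep the index sorted
            self.values.insert(pos, data)

    def first_row(self):
        """Smallest populated slow-dimension index, or None if nothing is stored."""
        return self.row_index[0] if self.row_index else None


store = SparseRows()
store.put(5000, [1.0, 2.0])
store.put(4200, [3.0, 4.0])
store.put(7000, [5.0, 6.0])
first = store.first_row()
```

Here ‘first’ is taken to mean the smallest index along the slower dimension; a chronological definition would need a different index.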



G.



From: Hdf-forum [mailto:[email protected]] On Behalf Of Efim Dyadkin
Sent: Wednesday, April 19, 2017 5:04 PM
To: [email protected]
Subject: [Hdf-forum] first non-fill-value in the sparse chunked dataset



Hi,



I am using a sparse chunked dataset with a certain fill value. I’d like to find
the first non-fill-value element in the dataset. Can I narrow down my search to
the first available chunk? How can I do it?



Thank you,

Efim Dyadkin

