A few thoughts come to mind, though I don't know if they'll be useful.

First, I assume you are doing this for a read-mostly (or maybe read-only) 
scenario. That is, the datasets are already in the file and now you want to 
find mins/maxs.

If I/O is your bottleneck (i.e., the cause of the slowness), then I can't 
imagine anything going faster than a single H5Dread (C interface) followed by 
a scan of all the data values in one fell swoop. And I cannot imagine the H5 
Java API would do any better.

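Is the dataset compressed and/or chunked in the file? If not, maybe you can 
adjust the data-producer to ensure that it is. When I/O is the bottleneck, 
compressed data is often read faster because fewer bytes come off disk, though 
you pay some CPU time for decompression.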

If I/O is NOT the bottleneck, then it's just the compute time spent finding the 
min/max. This would be a very simple operation to multi-thread (or GPU-ize), 
though. Is that an option?
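
To make the multi-threading suggestion concrete, here is a minimal Java 
sketch. It assumes the values have already been read into a double[] by a 
single read call; the class and method names (ParallelMinMax, minMax) are made 
up for illustration and are not part of any HDF5 API. It only uses the standard 
java.util.stream classes, so treat it as a starting point rather than a tuned 
implementation.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;

// Hypothetical helper, not part of the HDF5 Java API.
final class ParallelMinMax {

    /** Returns {min, max} of the values, computed with a parallel stream. */
    static double[] minMax(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty dataset");
        }
        // The parallel stream splits the array across the common fork/join
        // pool; summaryStatistics() tracks min and max in the same pass.
        DoubleSummaryStatistics stats =
                Arrays.stream(values).parallel().summaryStatistics();
        return new double[] { stats.getMin(), stats.getMax() };
    }
}
```

Whether this beats a plain loop depends on the array size and core count, so 
it is worth timing both on your data.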

If you have control over how the data is initially written, why not compute the 
min/max when the dataset is written, using a filter much like the checksum 
filter? It can scan all values on write, compute min/max and then store them as 
metadata with the dataset. Then, finding them later during read is of course 
trivial. And it would avoid doing the min/max scan repeatedly for different 
readers, etc.
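
As a rough illustration of the compute-on-write idea, the sketch below keeps a 
running min/max on the writer side. It is not an HDF5 filter; it is a simpler 
stand-in that assumes the producer can pass each buffer through an accumulator 
just before writing it. The class name RunningMinMax is hypothetical, and 
storing the two results as metadata on the dataset is left out because that 
part depends on the HDF5 API version in use.

```java
// Hypothetical writer-side accumulator, not an HDF5 filter.
final class RunningMinMax {
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private long count = 0;

    /** Call once per buffer, just before that buffer is written to the file. */
    void accept(double[] buffer) {
        for (double v : buffer) {
            // NaN fails both comparisons, so NaN values are ignored here.
            if (v < min) min = v;
            if (v > max) max = v;
        }
        count += buffer.length;
    }

    /** True if no values have been seen, in which case min/max are meaningless. */
    boolean isEmpty() { return count == 0; }

    double min() { return min; }
    double max() { return max; }
}
```

After the last buffer is written, the producer would store min() and max() 
alongside the dataset, so readers can fetch two small values instead of 
rescanning everything.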

Don't know if any of that might be useful but maybe it triggers some ideas for 
better strategies.

Mark





From: <EXI-Manoharan>, Dhilipan <[email protected]>
Reply-To: HDF Users Discussion List <[email protected]>
Date: Friday, October 3, 2014 5:27 PM
To: "[email protected]" <[email protected]>
Subject: [Hdf-forum] Help Needed For Finding Minimum/Maximum

All,

I need help finding the minimum/maximum of the values stored in a dataset.
I currently do it by reading the entire dataset and scanning for the 
minimum/maximum, which is really slow when there are many data values (say, 
more than 2,000,000).
Is there any way to find them using the H5 Java API?
I appreciate your help with this.

Thanks,
Dhilipan M
Boeing FTCS
Desk:2066626488
Mob:2066694758

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5
