Hi Martin, 

Have you set the chunk cache sufficiently large? Otherwise it will
reread the same chunks again and again. Although the system file cache
might hold all those data, I think it's better to size the chunk cache
correctly because of the lookups HDF5 is doing.
E.g. in the case of (*,y,*,*) you'll need a cache of 601*8*61*1501
floats (about 1.64 GB). I assume you have sufficient memory; otherwise
you could adjust the chunk size, especially in z and w.
Your chunks are not particularly large (16384 bytes), which leads to a
lot of iops and a large B-tree to index the chunks. On the other hand,
when enlarging the chunks, you'll need more memory for the chunk cache.
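For example, with h5py the chunk cache can be sized when opening the
file (the file name, dataset name, and slot count below are just
placeholders; a rough sketch, not a drop-in script):

    import h5py

    # Cache big enough to hold every chunk touched by one (*,y,*,*) read:
    # 601*8*61*1501 four-byte floats, i.e. about 1.64 GiB.
    cache_bytes = 601 * 8 * 61 * 1501 * 4

    with h5py.File("data.h5", "r",
                   rdcc_nbytes=cache_bytes,  # chunk cache size in bytes
                   rdcc_nslots=1000003,      # prime, >> number of cached chunks
                   rdcc_w0=0.75) as f:
        dset = f["volume"]                   # placeholder dataset name
        # Reading the y-slice in pieces revisits chunks; with a large
        # enough cache each chunk is read from disk only once.
        for x in range(dset.shape[0]):
            row = dset[x, 0, :, :]           # part of the (*,y,*,*) slice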

What is the pattern when accessing the data as (*,*,z,w)? First w, and
thereafter all z? You'll need a much smaller cache when accessing it
like this:
    for w in 0:nw/ncw      (nw is the length of the w-axis; ncw is the chunk size in w)
      for z in 0:nz/ncz
        for w1 in 0:ncw
          for z1 in 0:ncz
In this way you handle a full z,w chunk before moving on to the next
one, so your cache only needs to hold 601*482*8*8 floats.
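In h5py that chunk-aligned sweep could look roughly like this (file and
dataset names are placeholders; a sketch of the loop above, not a tuned
program):

    import h5py

    with h5py.File("data.h5", "r") as f:
        dset = f["volume"]            # shape (601, 482, 61, 1501), chunks (8, 8, 8, 8)
        nz, nw = dset.shape[2], dset.shape[3]
        cz, cw = dset.chunks[2], dset.chunks[3]
        for w0 in range(0, nw, cw):
            for z0 in range(0, nz, cz):
                # One full x,y extent of a single z,w chunk:
                # 601*482*8*8 floats, about 74 MB, read only once.
                block = dset[:, :, z0:z0 + cz, w0:w0 + cw]
                for w1 in range(block.shape[3]):
                    for z1 in range(block.shape[2]):
                        plane = block[:, :, z1, w1]   # one (*,*,z,w) slice
                        # ... process plane here ...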

I have a program that tests 3D data sets of arbitrary size and chunk
size, using a cache size that depends on the chunk size and access
pattern. If you like, I can send it.

Cheers, 
Ger

>>> Matthieu Brucher <[email protected]> 6/12/2014 10:56 PM >>>
Hi,

Unfortunately, this is indeed the worst case you can have. It's
completely normal that you get the worst performance when slicing in
these dimensions. Even with a parallel filesystem, you would need to
read EVERYTHING from the dataset, and the library would then pick out
the pieces you need.
One solution would be to agglomerate several z,w values together (for
instance as extra dimensions 5 and 6), so that you still get some
performance, but it will remain worse than directions 1 or even 2.
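
To put rough numbers on the contiguous case for Martin's dimensions
(assuming 4-byte floats, as the 16384-byte chunks suggest):

    # In a contiguous C-order layout, neighbouring elements of a (*,*,z,w)
    # plane are nz*nw floats apart on disk, so the plane's elements are
    # spread evenly across the whole file and the span touched per plane
    # equals the full dataset size.
    nx, ny, nz, nw = 601, 482, 61, 1501
    stride_bytes = nz * nw * 4              # ~366 KB between plane elements
    plane_elements = nx * ny                # 289,682 elements in one plane
    touched_bytes = plane_elements * stride_bytes
    print(stride_bytes, plane_elements, touched_bytes)   # ~106 GB skimmed per plane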

Cheers,

Matthieu


2014-06-12 20:43 GMT+01:00 Martin Sarajærvi <[email protected]>:
> Hi all,
>
> I'm working with floating point data building up a very large dataset,
> typically >100 GB, with four dimensions (x, y, z, w).
> The dimensions are (x, y, z, w) = (601, 482, 61, 1501) in my example.
>
> The aim is to slice (READING ONLY) this dataset in orthogonal directions:
> 1) (x, *, *, *)
> 2) (*, y, *, *)
> 3) (*, *, z, w)
>
> When using a contiguous layout I naturally get good performance for
> directions (1) and (2); however, it is very poor for (3).
> Using a chunked layout of (8,8,8,8) seems to give the best balance so
> far for reasonable access times in all directions, but still not as
> fast as I was hoping for. My tests also show that compression slightly
> improves the read performance.
>
> I'm looking for advice on possible optimization techniques for this
> problem, other than what has already been mentioned.
> Otherwise, is my only option to move to some (expensive?) parallel
> solution?
>
> Thanks!
>
> Regards,
> Martin
>



--
Information System Engineer, Ph.D.
Blog: http://matt.eifelle.com
LinkedIn: http://www.linkedin.com/in/matthieubrucher
Music band: http://liliejay.com/

_______________________________________________
Hdf-forum is for HDF software users discussion.
[email protected]
http://mail.lists.hdfgroup.org/mailman/listinfo/hdf-forum_lists.hdfgroup.org
Twitter: https://twitter.com/hdf5 