[HDF Forum] [HDF5] Thread-parallel compression filters? - Feature request

2023-01-05 Thread Kittisopikulm


The question is how people are obtaining h5py. Most people obtain it from 
either pip or conda, and the hdf5plugin filter package is available from both:

PyPI (via pip)
https://pypi.org/project/hdf5plugin/

Conda-Forge (conda)
https://anaconda.org/conda-forge/hdf5plugin

The source repository for those packages is maintained by the synchrotron 
community via the [silx project](https://www.silx.org/). 
https://github.com/silx-kit/hdf5plugin

Additionally, zstandard is an open source project maintained by Meta (formerly 
Facebook):
http://facebook.github.io/zstd/

These third-party filters are generally open source, free, and registered:
https://portal.hdfgroup.org/display/support/Registered+Filter+Plugins

The HDF Group maintains a repository that collects the source code of these 
filters:
https://github.com/HDFGroup/hdf5_plugins

There is an ongoing conversation along these lines in another thread:
https://forum.hdfgroup.org/t/what-do-you-want-to-see-in-hdf5-2-0/10003/50





---
[Visit 
Topic](https://forum.hdfgroup.org/t/thread-parallel-compression-filters-feature-request/10656/4)
 or reply to this email to respond.

You are receiving this because you enabled mailing list mode.

To unsubscribe from these emails, [click 
here](https://forum.hdfgroup.org/email/unsubscribe/e17c08cb3308853251ed787f93475aed4df76b19a74b689e0456e7f0be64fdab).


[HDF Forum] [HDF5] Thread-parallel compression filters? - Feature request

2023-01-01 Thread Kittisopikulm


A few applications have now implemented thread-parallel compression and 
decompression:

https://github.com/imaris/ImarisWriter
https://www.blosc.org/posts/blosc2-pytables-perf/

One trick is to use 
[`H5Dread_chunk`](https://docs.hdfgroup.org/hdf5/v1_14/group___h5_d.html#title30) 
or 
[`H5Dwrite_chunk`](https://docs.hdfgroup.org/hdf5/v1_14/group___h5_d.html#title38). 
These let you read or write a chunk directly in its compressed form, bypassing 
the filter pipeline. You can then set up the thread-parallel compression or 
decompression yourself.
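
As a rough illustration of that trick from Python, here is a minimal sketch 
that uses h5py's low-level `write_direct_chunk`, which wraps `H5Dwrite_chunk`. 
The function names, the dataset name `"data"`, and the 1-D layout are 
assumptions made for the example. It also assumes gzip is the only filter on 
the dataset and that the array length is a multiple of the chunk length.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_chunks(raw_chunks, level=4, workers=4):
    """Deflate each chunk in its own task.

    CPython's zlib releases the GIL while compressing larger buffers,
    so plain threads can run these calls concurrently.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda b: zlib.compress(b, level), raw_chunks))


def write_parallel(path, data, chunk_len, level=4, workers=4):
    """Write 1-D NumPy array `data` as a gzip-compressed dataset "data".

    Hypothetical helper: assumes len(data) is a multiple of chunk_len
    and that gzip (deflate) is the only filter in the pipeline.
    """
    import h5py  # third-party; imported here so compress_chunks stays stdlib-only

    # Serialize every chunk to raw bytes, then compress them all in parallel.
    raw = [data[i:i + chunk_len].tobytes() for i in range(0, len(data), chunk_len)]
    packed = compress_chunks(raw, level, workers)

    with h5py.File(path, "w") as f:
        dset = f.create_dataset("data", shape=data.shape, dtype=data.dtype,
                                chunks=(chunk_len,), compression="gzip",
                                compression_opts=level)
        for n, blob in enumerate(packed):
            # The offset is the element coordinate of the chunk's first element.
            # The bytes are stored as-is; HDF5 does not run the filter again.
            dset.id.write_direct_chunk((n * chunk_len,), blob)
```

Because the chunks are ordinary deflate streams, the resulting file should be 
readable by any HDF5 build with the standard gzip filter, so no custom filter 
has to be distributed. For example, 
`write_parallel("out.h5", np.arange(1_000_000, dtype="f8"), 100_000)` would 
write ten chunks.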

Another approach is using 
[`H5Dget_chunk_info`](https://docs.hdfgroup.org/hdf5/v1_14/group___h5_d.html#gaccff213d3e0765b86f66d08dd9959807) 
to query the location of a chunk within the file. You can then read the 
compressed bytes from that location and decompress them in your own threads. 
[`H5Dchunk_iter`](https://docs.hdfgroup.org/hdf5/v1_14/group___h5_d.html#title6) 
provides a faster way to do this, particularly if you want to get this 
information for all the chunks, but this is a relatively new API function.
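
A minimal sketch of this second approach, again from Python: h5py exposes 
`H5Dget_chunk_info` as `get_chunk_info` on the low-level dataset id. The helper 
names `chunk_locations` and `inflate_at` are made up for this example, and the 
sketch assumes every chunk was written with gzip as the only filter (a real 
reader should also check each chunk's `filter_mask`).

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def chunk_locations(path, name):
    """Return [(chunk_offset, byte_offset, size), ...] for every stored chunk."""
    import h5py  # third-party; only needed for the metadata lookup

    with h5py.File(path, "r") as f:
        dsid = f[name].id
        infos = [dsid.get_chunk_info(i) for i in range(dsid.get_num_chunks())]
    # byte_offset and size describe where the compressed chunk sits in the file.
    return [(info.chunk_offset, info.byte_offset, info.size) for info in infos]


def inflate_at(path, spans, workers=4):
    """Read each (byte_offset, size) span from the file and inflate it in a thread."""
    def one(span):
        byte_offset, size = span
        # One file handle per task keeps the seeks independent of each other.
        with open(path, "rb") as fh:
            fh.seek(byte_offset)
            return zlib.decompress(fh.read(size))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one, spans))
```

After the metadata lookup, the HDF5 library is no longer involved: the threads 
read plain byte ranges, so the reads and the decompression are not serialized 
by the library. The returned `chunk_offset` tells you where each decompressed 
block belongs in the dataset.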

The source for many of the filters is located in the following repository:
https://github.com/HDFGroup/hdf5_plugins





---
[Visit 
Topic](https://forum.hdfgroup.org/t/thread-parallel-compression-filters-feature-request/10656/2)
 or reply to this email to respond.



[HDF Forum] [HDF5] Thread-parallel compression filters? - Feature request

2022-12-31 Thread Matthieu


Hi all,

I have a large production application that uses MPI + threads. The I/O pattern 
is either pure parallel HDF5 or one file per rank (in a similar way to the VFD 
interface but done by hand). I make heavy use of compression filters.

As my application is MPI + threads, only one thread per MPI rank actually does 
all the I/O work. In an uncompressed scenario, that is fine, as I can then 
saturate the InfiniBand cards with my 2 (or 4) MPI ranks per node. However, 
when compressing, this gets a lot slower. Only one thread does the compression, 
and this is now the clear bottleneck of the whole operation. 
It seems a bit silly, as I have many idling threads which could all be drafted 
in for parallel compression.

Would it hence be possible to see the use of `pigz` over plain `gzip` in future 
releases?

Note that I'd like to avoid creating my own custom filter, as I'd then have to 
distribute it to collaborators using the simulation results. That is 
unfortunately too high a barrier to access for many.





---
[Visit 
Topic](https://forum.hdfgroup.org/t/thread-parallel-compression-filters-feature-request/10656/1)
 or reply to this email to respond.
