Announcing python-blosc 1.0.2
A Python wrapper for the Blosc compression library
What is it?
===
Blosc (http://blosc.pytables.org) is a high performance compressor
optimized for binary data. It has been designed to transmit data to
the processor cache faster than the traditional, non-compressed,
direct memory fetch approach via a memcpy() call.
Blosc works well for compressing numerical arrays that contain data
with relatively low entropy, like sparse data, time series, grids with
regular-spaced values, etc.
python-blosc is a Python package that wraps it.
What is new?
===
Updated to Blosc 1.1.2. Fixes some bugs when dealing with very small
buffers (typically smaller than specified typesizes). Closes #1.
Basic Usage
===
[Using IPython shell and a 2-core machine below]
# Create a binary string made of int (32-bit) elements
import array
a = array.array('i', range(10*1000*1000))
bytes_array = a.tostring()
# Compress it
import blosc
bpacked = blosc.compress(bytes_array, typesize=a.itemsize)
len(bytes_array) / len(bpacked)
110 # 110x compression ratio. Not bad!
# Compression speed?
timeit blosc.compress(bytes_array, typesize=a.itemsize)
100 loops, best of 3: 12.8 ms per loop
len(bytes_array) / 0.0128 / (1024*1024*1024)
2.9103830456733704 # wow, compressing at ~ 3 GB/s, that's fast!
# Decompress it
bytes_array2 = blosc.decompress(bpacked)
# Check whether our data have had a good trip
bytes_array == bytes_array2
True  # yup, it seems so
# Decompression speed?
timeit blosc.decompress(bpacked)
10 loops, best of 3: 21.3 ms per loop
len(bytes_array) / 0.0213 / (1024*1024*1024)
1.7489625814375185 # decompressing at ~ 1.7 GB/s is pretty good too!
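The throughput figures above follow directly from the buffer size and the
timeit results. The sketch below only reproduces that arithmetic and does not
call Blosc itself. It assumes a 4-byte item size for the 'i' typecode (true on
most platforms, but not guaranteed), and the helper name
throughput_gib_per_s is hypothetical, introduced here for illustration. Note
that dividing by 1024*1024*1024 gives GiB/s, strictly speaking.

```python
GIB = 1024 * 1024 * 1024  # bytes per GiB, the divisor used in the session above


def throughput_gib_per_s(nbytes, seconds):
    """Return the rate in GiB/s for nbytes processed in the given time."""
    return nbytes / seconds / GIB


# Buffer from the session: 10 million ints.
# Assumption: each 'i' item occupies 4 bytes (check array.array('i').itemsize).
n_items = 10 * 1000 * 1000
itemsize = 4
nbytes = n_items * itemsize  # 40,000,000 bytes

# Timings quoted in the session: 12.8 ms to compress, 21.3 ms to decompress.
compress_rate = throughput_gib_per_s(nbytes, 0.0128)
decompress_rate = throughput_gib_per_s(nbytes, 0.0213)

print(round(compress_rate, 2), round(decompress_rate, 2))
```

On Python 3, array.tostring() is no longer available; array.tobytes() is the
equivalent call for building bytes_array. Also, the ratio
len(bytes_array) / len(bpacked) prints as the integer 110 in the session
because "/" truncates for integers on Python 2.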
More examples showing other features (and using NumPy arrays) are
available on the python-blosc wiki page:
http://github.com/FrancescAlted/python-blosc/wiki
Documentation
===
Please refer to the docstrings. Start with the main package:
import blosc
help(blosc)
and ask for more docstrings in the referenced functions.
Download sources
===
Go to:
http://github.com/FrancescAlted/python-blosc
and download the most recent release from there.
Blosc is distributed using the MIT license, see LICENSES/BLOSC.txt for
details.
Mailing list
===
There is an official mailing list for Blosc at:
bl...@googlegroups.com
http://groups.google.es/group/blosc
**Enjoy data!**
--
Francesc Alted