Thanks for both your responses. I was indeed talking about developing a
codec utility as the Hadoop application itself.
In particular, thanks to Bertrand for the lengthy response. I'm actually
learning Hadoop at the moment, so I've been trying to find a suitable (very
modestly sized) application f
Your question could be interpreted in another way: should I use Hadoop to
perform massive compression/decompression using my own (possibly
proprietary) utility?
So yes, Hadoop can be used to parallelize the work. But the real answer
will depend on your context, as always.
How many f
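One way to sketch the parallelization Bertrand describes is with Hadoop
Streaming: feed the job a text file listing the files to compress (one HDFS
path per line), run with zero reducers, and have each map task pipe its
files through a stock utility such as gzip. The paths and file names below
are hypothetical, and the exact streaming jar location varies by release:

```shell
# Sketch (hypothetical paths): each map task reads HDFS paths on stdin,
# streams each file through gzip, and writes the result back to HDFS.
hadoop jar $HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-*.jar \
  -D mapreduce.job.reduces=0 \
  -input /user/robert/file-list.txt \
  -output /user/robert/compress-log \
  -mapper 'while read f; do hadoop fs -cat "$f" | gzip | hadoop fs -put - "$f.gz"; done'
```

The framework then splits the file list across map tasks, which is where
the parallelism comes from; whether that beats running gzip locally depends
on how many files you have and how large they are.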
Dear Robert,
SequenceFiles support record compression, block compression, or no
compression, and you can configure which codec (gzip, bzip2, etc.) is
used. Have a look at
http://hadoop.apache.org/docs/current/api/org/apache/hadoop/io/compress/CompressionCodec.html
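As a minimal sketch of that configuration (property names assume the
Hadoop 2.x naming in mapred-site.xml; older releases used the shorter
mapred.output.compress names), enabling block-compressed output with the
bzip2 codec might look like:

```xml
<!-- Sketch: enable compressed job output; names assume Hadoop 2.x -->
<property>
  <name>mapreduce.output.fileoutputformat.compress</name>
  <value>true</value>
</property>
<property>
  <!-- RECORD, BLOCK, or NONE; only applies to SequenceFile output -->
  <name>mapreduce.output.fileoutputformat.compress.type</name>
  <value>BLOCK</value>
</property>
<property>
  <name>mapreduce.output.fileoutputformat.compress.codec</name>
  <value>org.apache.hadoop.io.compress.BZip2Codec</value>
</property>
```

The same settings can also be passed per job on the command line with -D
instead of being fixed in the site configuration.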
Best regards,
Jens
Hi,
I was wondering if anyone could comment on the suitability of using Hadoop
to run a custom file compression/decompression utility (with functionality
similar to zip, gzip, bzip2, etc.).