Is there an easy way to get the uncompressed size of a sequence file that
is block compressed?  I am using the Snappy compressor.

I realize I could just decompress them to temporary files to get the
size, but I assume there is an easier way.  Perhaps an existing tool that
my search did not turn up?

If not, I will have to run an MR job to load each compressed block and read
the Snappy header to get the size.  I need to do this for a large number of
files, so I'd prefer a simple CLI tool (sort of like 'hadoop fs -du').
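
For context, the kind of fallback I have in mind would stream each file
through SequenceFile.Reader and sum the serialized record lengths rather
than parse the Snappy block headers directly; it avoids temp files but
still pays the decompression cost.  Rough, untested sketch (class name,
argument handling, and the "sum of serialized record lengths" measure are
just placeholders/assumptions on my part):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.util.ReflectionUtils;

public class SeqFileUncompressedSize {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Path path = new Path(args[0]);
    FileSystem fs = path.getFileSystem(conf);

    // Reader hands back records already decompressed, whatever codec was used.
    SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
    try {
      Writable key = (Writable) ReflectionUtils.newInstance(reader.getKeyClass(), conf);
      Writable val = (Writable) ReflectionUtils.newInstance(reader.getValueClass(), conf);
      DataOutputBuffer buf = new DataOutputBuffer();
      long total = 0;
      // Sum serialized key + value lengths as an approximation of the
      // uncompressed payload size (ignores sync marks and record framing).
      while (reader.next(key, val)) {
        buf.reset();
        key.write(buf);
        val.write(buf);
        total += buf.getLength();
      }
      System.out.println(path + "\t" + total);
    } finally {
      reader.close();
    }
  }
}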

- Robert
