Is there an easy way to get the uncompressed size of a sequence file that is block compressed? I am using the Snappy compressor.
I realize I could just decompress them to temporary files to get the size, but I would assume there is an easier way. Perhaps an existing tool that my search did not turn up? If not, I will have to run an MR job that loads each compressed block and reads the Snappy header to get the size. I need to do this for a large number of files, so I'd prefer a simple CLI tool (sort of like 'hadoop fs -du'). - Robert
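For illustration, the per-block scan such a job would perform can be sketched as below. This is a minimal, self-contained sketch assuming a simplified hypothetical block layout in which each block is prefixed by a 4-byte big-endian uncompressed length followed by a 4-byte compressed length; the actual on-disk layout of a block-compressed SequenceFile (sync markers, VInt-encoded key/value block lengths) is more involved, so verify against your Hadoop version before relying on this. The point is that summing the length prefixes avoids decompressing any data.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

public class UncompressedSizeEstimator {
    // Sum the uncompressed-length prefixes of each block, skipping the
    // compressed payloads so nothing is actually decompressed.
    // Assumed (hypothetical) layout per block:
    //   [4-byte uncompressed length][4-byte compressed length][compressed bytes]
    static long uncompressedSize(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        long total = 0;
        while (in.available() > 0) {
            int uncompressedLen = in.readInt();  // size after decompression
            int compressedLen = in.readInt();    // size of the stored payload
            total += uncompressedLen;
            if (in.skipBytes(compressedLen) != compressedLen) {
                throw new IOException("truncated block");
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Two fake blocks: 100 and 50 uncompressed bytes, payloads of 3 and 2 bytes.
        byte[] data = {
            0, 0, 0, 100, 0, 0, 0, 3, 1, 2, 3,
            0, 0, 0, 50,  0, 0, 0, 2, 9, 9,
        };
        System.out.println(uncompressedSize(new ByteArrayInputStream(data)));
    }
}
```

Wrapped in a map-only job, each mapper would emit its file's total and a `hadoop fs -du`-style listing could be assembled from the output.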