Hi guys,
I would like to compress the files on HDFS to save some storage.

As far as I can see, bzip2 is the only format which is splittable (and slow).

The actual files are Avro.

So in my driver class I have:

job.setInputFormatClass(AvroKeyInputFormat.class);

I have a number of jobs running that process Avro files, so I would like to keep the code changes to a minimum.

Is it possible to compress these Avro files with bzip2 and keep the code of the MR jobs the same (or with little change)? If it is, please give me some hints, as so far I have not found any good resources on the Internet.


Georgi
