Hi guys,
I would like to compress the files on HDFS to save some storage.
As far as I can see, bzip2 is the only splittable format (and it is slow).
The actual files are Avro.
So in my driver class I have:
job.setInputFormatClass(AvroKeyInputFormat.class);
I have a number of jobs running that process Avro files, so I would like to
keep the code changes to a minimum.
Is it possible to compress these Avro files with bzip2 and keep the code
of the MR jobs the same (or with little change)?
If it is, please give me some hints, as so far I have not been able to find
any good resources on the Internet.
Georgi