Elephas MapReduce bzip2 compressed input support

Azhar Jassal Wed, 04 Mar 2015 01:50:46 -0800

Hi

I have begun using jena-elephas.


Has there been any thought on how to deal with compressed (particularly bzip2)
input files? Note that bzip2 is splittable.

For illustration, the DBpedia "persondata_en.nq" (release 3.9) is 80 MB
compressed (bzip2) and 1.5 GB uncompressed. At the moment the jena-elephas
record reader picks the serialization from the filename extension (using RIOT
Langs), so .bz2 files hit an unknown serialization error.
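
As a rough sketch of the kind of handling I have in mind (not how
jena-elephas actually works): strip a known compression suffix first, then
detect the serialization from the remaining extension. The class
CompressedInputNames, its methods and both lookup tables are hypothetical
and only illustrative. The labels are plain strings, not RIOT Lang objects,
and the decompression itself would presumably be left to Hadoop's
compression codecs.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration only; not part of jena-elephas or RIOT.
final class CompressedInputNames {
    // Assumed subset of compression suffixes worth recognising.
    private static final String[] COMPRESSION_SUFFIXES = { ".bz2", ".gz" };

    // Assumed subset of RDF extensions, mapped to plain-string labels.
    private static final Map<String, String> RDF_EXTENSIONS = Map.of(
            "nt", "N-Triples",
            "nq", "N-Quads",
            "ttl", "Turtle",
            "trig", "TriG",
            "rdf", "RDF/XML");

    // Returns the filename without a trailing compression suffix, if it has one.
    static String stripCompressionSuffix(String filename) {
        String lower = filename.toLowerCase(Locale.ROOT);
        for (String suffix : COMPRESSION_SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return filename.substring(0, filename.length() - suffix.length());
            }
        }
        return filename;
    }

    // Returns a label for the serialization, or null if the extension is unknown.
    static String guessSerialization(String filename) {
        String name = stripCompressionSuffix(filename);
        int dot = name.lastIndexOf('.');
        if (dot < 0 || dot == name.length() - 1) {
            return null; // no extension left to inspect
        }
        String extension = name.substring(dot + 1).toLowerCase(Locale.ROOT);
        return RDF_EXTENSIONS.get(extension);
    }

    public static void main(String[] args) {
        // Compressed and uncompressed names should resolve to the same label.
        System.out.println(guessSerialization("persondata_en.nq.bz2"));
        System.out.println(guessSerialization("persondata_en.nq"));
    }
}
```

With something along these lines, "persondata_en.nq.bz2" and
"persondata_en.nq" would both be treated as N-Quads, and a bare "data.bz2"
would still be rejected as unknown.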

Any thoughts on reading bzip2 compressed input files?

Az
