Hi, I have begun using jena-elephas.
Has there been any thought on how to handle compressed input files, particularly bzip2, which is splittable? For illustration, the DBpedia "persondata_en.nq" file (release 3.9) is 80 MB compressed with bzip2 and 1.5 GB uncompressed. At the moment the jena-elephas record reader selects the input format from the filename extension (using RIOT Langs), so .bz2 files fail with an unknown serialization error. Any thoughts on reading bzip2-compressed input files?
Az
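P.S. To make the idea concrete, here is a rough sketch of what I have in mind: strip a known compression suffix from the filename first, then detect the RDF serialization from what remains. This is plain Java and not jena-elephas code. The class name, method names, and the list of suffixes are my own assumptions for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of jena-elephas or RIOT.
class CompressedName {
    // Assumed set of compression suffixes to look through.
    private static final List<String> COMPRESSION_EXTS =
            Arrays.asList(".bz2", ".gz", ".deflate");

    // Returns the filename with one trailing compression suffix removed,
    // or the filename unchanged if it has none.
    static String stripCompression(String filename) {
        String lower = filename.toLowerCase(Locale.ROOT);
        for (String ext : COMPRESSION_EXTS) {
            if (lower.endsWith(ext)) {
                return filename.substring(0, filename.length() - ext.length());
            }
        }
        return filename;
    }

    // Returns the extension that should drive serialization detection,
    // e.g. "nq" for "persondata_en.nq.bz2"; empty string if there is none.
    static String rdfExtension(String filename) {
        String base = stripCompression(filename);
        int dot = base.lastIndexOf('.');
        return dot < 0 ? "" : base.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

With this, CompressedName.rdfExtension("persondata_en.nq.bz2") gives "nq", which could then be passed to the existing extension-based language lookup, while the actual decompression would still need to be handled by the reader.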
