Harsh J created PIG-2746:
----------------------------

             Summary: Pig doesn't detect all forms of compression extensions properly
                 Key: PIG-2746
                 URL: https://issues.apache.org/jira/browse/PIG-2746
             Project: Pig
          Issue Type: Bug
            Reporter: Harsh J

PigStorage has the following snippet:

{code}
private void setCompression(Path path, Job job) {
    String location = path.getName();
    if (location.endsWith(".bz2") || location.endsWith(".bz")) {
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, BZip2Codec.class);
    } else if (location.endsWith(".gz")) {
        FileOutputFormat.setCompressOutput(job, true);
        FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
    } else {
        FileOutputFormat.setCompressOutput(job, false);
    }
}
{code}

This limits automatic compression to STORE filenames such as 'output.gz' or 'output.bz2' (paths ending in .gz, .bz2, or .bz). For every other codec (like LZO), one has to specify the codec and enable compression manually.

Ideally, Pig would rely on Hadoop's extension-to-codec detector instead of this if/else ladder.
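
To illustrate the suggestion, here is a minimal sketch of a table-driven extension lookup replacing the hard-coded ladder. It is not Pig or Hadoop code: the class `CodecByExtension`, the method `codecFor`, and the table contents are hypothetical, and the LZO class name is an assumption, since that codec ships separately from Hadoop. In Hadoop itself the detector referred to is presumably `CompressionCodecFactory.getCodec(Path)`, which builds its table from the configured codecs. The sketch avoids calling Hadoop so that it runs standalone.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical stand-in for an extension-to-codec lookup; not Pig or Hadoop code. */
final class CodecByExtension {
    // Filename suffix -> codec class name. Entries are illustrative only.
    private static final Map<String, String> CODECS = new LinkedHashMap<>();
    static {
        CODECS.put(".bz2", "org.apache.hadoop.io.compress.BZip2Codec");
        CODECS.put(".bz", "org.apache.hadoop.io.compress.BZip2Codec");
        CODECS.put(".gz", "org.apache.hadoop.io.compress.GzipCodec");
        // Assumption: class name and suffix as used by the separate hadoop-lzo project.
        CODECS.put(".lzo", "com.hadoop.compression.lzo.LzopCodec");
    }

    private CodecByExtension() {
    }

    /** Returns the codec class name registered for the file's suffix, if any. */
    static Optional<String> codecFor(String fileName) {
        for (Map.Entry<String, String> entry : CODECS.entrySet()) {
            if (fileName.endsWith(entry.getKey())) {
                return Optional.of(entry.getValue());
            }
        }
        // Unknown suffix: the caller would leave compression disabled.
        return Optional.empty();
    }
}
```

With a lookup like this, supporting a new codec means adding a table entry (or, in Hadoop's case, a configuration entry) rather than another `else if` branch. A caller would enable output compression only when `codecFor` returns a value.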