[ https://issues.apache.org/jira/browse/PIG-2746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13393504#comment-13393504 ]
Harsh J commented on PIG-2746:
------------------------------

Daniel,

I'm not sure I understood your question correctly: this code will support whatever is listed in Hadoop's compression codecs list (in which LZO has to be registered). In 1.x the property is called io.compression.codecs.

It removes the need for per-codec/extension code, and relies on each registered codec to recognize its own extension and supply itself.

> Pig doesn't detect all forms of compression extensions properly
> ---------------------------------------------------------------
>
>                 Key: PIG-2746
>                 URL: https://issues.apache.org/jira/browse/PIG-2746
>             Project: Pig
>          Issue Type: Bug
>   Affects Versions: 0.8.1
>            Reporter: Harsh J
>            Assignee: Harsh J
>         Attachments: PIG-2746.patch, PIG-2746.patch
>
>
> PigStorage has the following snippet.
> {code}
> private void setCompression(Path path, Job job) {
>     String location = path.getName();
>     if (location.endsWith(".bz2") || location.endsWith(".bz")) {
>         FileOutputFormat.setCompressOutput(job, true);
>         FileOutputFormat.setOutputCompressorClass(job, BZip2Codec.class);
>     } else if (location.endsWith(".gz")) {
>         FileOutputFormat.setCompressOutput(job, true);
>         FileOutputFormat.setOutputCompressorClass(job, GzipCodec.class);
>     } else {
>         FileOutputFormat.setCompressOutput(job, false);
>     }
> }
> {code}
> This limits it to STORE filenames given as 'output.gz' or 'output.bz2'; for
> the rest (like LZO) one has to specify the codec and enable compression
> manually.
> Ideally Pig can rely on Hadoop's extension-to-codec detector instead of
> having this ladder.
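
To illustrate the idea of replacing the if/else ladder with a lookup driven by whatever codecs are registered, here is a minimal, self-contained sketch. It does not call Hadoop and is not the attached patch: the class ExtensionCodecLookup, its methods, and the codec labels (including "LzoCodec" for ".lzo") are hypothetical stand-ins. In Hadoop itself, the extension-to-codec detector the issue mentions is presumably CompressionCodecFactory.getCodec(Path), which consults the configured codec list.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/**
 * Hypothetical stand-in for an extension-to-codec registry.
 * Each registered codec contributes its own file suffix, so the caller
 * needs no per-codec branches.
 */
public class ExtensionCodecLookup {
    // Maps a file suffix such as ".gz" to a codec label (illustrative strings only).
    private final Map<String, String> codecsBySuffix = new LinkedHashMap<>();

    /** Registers a codec label under the suffix it handles. */
    public void register(String suffix, String codecName) {
        codecsBySuffix.put(suffix, codecName);
    }

    /** Returns the codec whose suffix matches the file name, preferring the longest suffix. */
    public Optional<String> codecFor(String fileName) {
        String best = null;
        for (String suffix : codecsBySuffix.keySet()) {
            boolean longer = best == null || suffix.length() > best.length();
            if (fileName.endsWith(suffix) && longer) {
                best = suffix;
            }
        }
        return best == null ? Optional.empty() : Optional.of(codecsBySuffix.get(best));
    }

    public static void main(String[] args) {
        ExtensionCodecLookup lookup = new ExtensionCodecLookup();
        lookup.register(".gz", "GzipCodec");
        lookup.register(".bz2", "BZip2Codec");
        lookup.register(".lzo", "LzoCodec"); // assumed label for a registered LZO codec

        for (String name : new String[] {"output.gz", "output.bz2", "output.lzo", "output.txt"}) {
            // An empty result means "leave compression disabled", like the final else branch.
            System.out.println(name + " -> " + lookup.codecFor(name).orElse("no compression"));
        }
    }
}
```

With a lookup like this, supporting a new codec means registering it rather than adding another else-if branch, which is the property the comment above describes.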