BELUGA BEHR created HADOOP-14668:
------------------------------------
Summary: Remove Configurable Default Sequence File Compression Type
Key: HADOOP-14668
URL: https://issues.apache.org/jira/browse/HADOOP-14668
Project: Hadoop Common
Issue Type: Improvement
Components: io
Affects Versions: 3.0.0-alpha3
Reporter: BELUGA BEHR
Priority: Trivial
Fix For: 2.8.1
It is confusing to have two different ways to set the Sequence File compression
type.
In a basic configuration, I can set either
_mapreduce.output.fileoutputformat.compress.type_ or
_io.seqfile.compression.type_. To set a default value, I should set
_mapreduce.output.fileoutputformat.compress.type_ in the cluster environment's
mapred-site.xml file.
Please remove references to the magic string _io.seqfile.compression.type_,
remove the {{setDefaultCompressionType}} method, and have
{{getDefaultCompressionType}} return the hard-coded value
{{CompressionType.RECORD}}. This will make administration easier, as I will
only have to interrogate one configuration property.
{code:title=org.apache.hadoop.io.SequenceFile}
  /**
   * Get the compression type for the reduce outputs
   * @param job the job config to look in
   * @return the kind of compression to use
   */
  static public CompressionType getDefaultCompressionType(Configuration job) {
    String name = job.get("io.seqfile.compression.type");
    return name == null ? CompressionType.RECORD :
      CompressionType.valueOf(name);
  }

  /**
   * Set the default compression type for sequence files.
   * @param job the configuration to modify
   * @param val the new compression type (none, block, record)
   */
  static public void setDefaultCompressionType(Configuration job,
                                               CompressionType val) {
    job.set("io.seqfile.compression.type", val.toString());
  }
{code}
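For illustration only, here is a minimal sketch of how the behavior would differ after the requested change. It is not a patch: the class name {{SequenceFileDefaults}}, the methods {{currentDefault}} and {{proposedDefault}}, and the use of a plain {{Map}} in place of Hadoop's {{Configuration}} are all assumptions made so the example runs without Hadoop on the classpath. The enum is a stand-in for {{SequenceFile.CompressionType}}.

```java
import java.util.HashMap;
import java.util.Map;

public class SequenceFileDefaults {

    /** Stand-in for org.apache.hadoop.io.SequenceFile.CompressionType. */
    public enum CompressionType { NONE, RECORD, BLOCK }

    /** The magic string this issue asks to remove. */
    public static final String LEGACY_KEY = "io.seqfile.compression.type";

    /**
     * Mirrors the current getDefaultCompressionType logic quoted above,
     * with a Map standing in for Configuration (an assumption).
     */
    public static CompressionType currentDefault(Map<String, String> job) {
        String name = job.get(LEGACY_KEY);
        return name == null ? CompressionType.RECORD
                            : CompressionType.valueOf(name);
    }

    /**
     * Hypothetical shape of the method after the change: the legacy key
     * is no longer consulted and RECORD is always returned.
     */
    public static CompressionType proposedDefault(Map<String, String> job) {
        return CompressionType.RECORD;
    }

    public static void main(String[] args) {
        Map<String, String> job = new HashMap<>();
        job.put(LEGACY_KEY, "BLOCK");

        // Today the legacy key silently overrides the default.
        System.out.println(currentDefault(job));   // prints BLOCK
        // After the change the legacy key would have no effect.
        System.out.println(proposedDefault(job));  // prints RECORD
    }
}
```

With this shape, the only property an administrator would need to inspect is _mapreduce.output.fileoutputformat.compress.type_.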