'compressed' keyword in DDL syntax misleading and does not compress
-------------------------------------------------------------------
Key: HADOOP-4169
URL: https://issues.apache.org/jira/browse/HADOOP-4169
Project: Hadoop Core
Issue Type: Bug
Components: contrib/hive
Reporter: Joydeep Sen Sarma
Hive supports two types of data files: flat files and sequencefiles. The DDL
syntax should reflect this. Currently the 'compressed' keyword is used to choose
the sequencefile format, but it does not actually compress the files. This is
misleading. In addition, flat files can also be compressed.
The proposal is to replace 'compressed' with 'sequencefile'. Compression should
instead be controlled the standard hadoop way of specifying whether output
should be compressed ('mapred.output.compress'), i.e. through session options
(session options will also define the codec etc.). The default file format and
compression options can be specified in the conf file.
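
A rough sketch of how the proposal might look in use. The keyword placement
('STORED AS SEQUENCEFILE') is an assumption rather than settled syntax, and the
table and column names are hypothetical. 'mapred.output.compress' is the option
named above; 'mapred.output.compression.codec' is the standard hadoop property
for choosing the codec.

```sql
-- Hypothetical table: 'sequencefile' only selects the file format,
-- it says nothing about compression.
CREATE TABLE page_views (user_id STRING, url STRING)
STORED AS SEQUENCEFILE;

-- Compression is requested separately, as a session option.
SET mapred.output.compress=true;
-- The codec is also a session option (GzipCodec ships with hadoop).
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
```

With this split, the same session options could apply to flat-file tables as
well, since compression no longer depends on the file format keyword.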