[ https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-4221:
------------------------------
Attachment: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
sxyuan requested code review of "HIVE-4221 [jira] Stripe-level merge for ORC files".
Reviewers: kevinwilfong, omalley
As with RC files, we would like to be able to merge ORC files efficiently by
reading/writing stripes without deserializing each row. Most of the logic is
unchanged from merging for RC files, so the original code has been refactored
for reuse.
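To illustrate the idea, here is a minimal sketch of a stripe-level merge on toy in-memory files. It is not the code in this patch: StripeMergeSketch, StripeInfo, ToyFile and merge() are hypothetical names invented for the example, and a real ORC file also has a header, per-stripe footers and a compressed file footer that the sketch ignores. It only shows that stripe bytes can be copied verbatim while the stripe positions are recomputed for the new file.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: these classes are NOT Hive's ORC reader/writer API.
final class StripeMergeSketch {

    // One entry of a file's stripe directory: position, size and row count.
    static final class StripeInfo {
        final long offset;
        final long length;
        final long rows;

        StripeInfo(long offset, long length, long rows) {
            this.offset = offset;
            this.length = length;
            this.rows = rows;
        }
    }

    // Toy in-memory "file": raw bytes plus the stripe directory a footer would carry.
    static final class ToyFile {
        final byte[] data;
        final List<StripeInfo> stripes;

        ToyFile(byte[] data, List<StripeInfo> stripes) {
            this.data = data;
            this.stripes = stripes;
        }
    }

    // Concatenates the stripes of all inputs and rebuilds the stripe directory.
    static ToyFile merge(List<ToyFile> inputs) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<StripeInfo> merged = new ArrayList<>();
        for (ToyFile in : inputs) {
            for (StripeInfo s : in.stripes) {
                // The stripe starts wherever the output currently ends.
                long newOffset = out.size();
                // Copy the stripe bytes as-is: no decompression, no row decoding.
                out.write(in.data, (int) s.offset, (int) s.length);
                // Length and row count are unchanged; only the offset moves.
                merged.add(new StripeInfo(newOffset, s.length, s.rows));
            }
        }
        return new ToyFile(out.toByteArray(), merged);
    }
}
```

In this sketch the only per-stripe metadata that changes is the offset, which is why the merge can avoid touching row data at all.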
TEST PLAN
Copied the RC file merge tests and modified them to use the ORC file format.
Added a test case to TestOrcFile to verify that file-level column stats are
merged correctly.
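As a rough illustration of what merging file-level column stats involves, the sketch below combines statistics for a single integer column. IntColumnStatsSketch and its methods are hypothetical and are not the classes touched by this patch; the set of fields (value count, minimum, maximum, sum) is an assumption chosen for the example, and sum overflow is not handled.

```java
// Hypothetical sketch only: NOT Hive's ORC column statistics classes.
final class IntColumnStatsSketch {
    final long count;
    final long min;
    final long max;
    final long sum;

    IntColumnStatsSketch(long count, long min, long max, long sum) {
        this.count = count;
        this.min = min;
        this.max = max;
        this.sum = sum;
    }

    // Stats for a column with no values; the sentinels make merge() need no special case.
    static IntColumnStatsSketch empty() {
        return new IntColumnStatsSketch(0, Long.MAX_VALUE, Long.MIN_VALUE, 0);
    }

    // Builds stats by scanning values, as a writer would when producing one file.
    static IntColumnStatsSketch of(long... values) {
        IntColumnStatsSketch stats = empty();
        for (long v : values) {
            stats = stats.merge(new IntColumnStatsSketch(1, v, v, v));
        }
        return stats;
    }

    // Combines the stats of two files without looking at any row data.
    IntColumnStatsSketch merge(IntColumnStatsSketch other) {
        return new IntColumnStatsSketch(
            count + other.count,
            Math.min(min, other.min),
            Math.max(max, other.max),
            sum + other.sum); // assumption: overflow is ignored in this sketch
    }
}
```

A test in this style would check that merging the stats of two inputs gives the same result as computing stats over all of their values at once.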
REVISION DETAIL
https://reviews.facebook.net/D9759
AFFECTED FILES
data/files/smbbucket_1.orc
data/files/smbbucket_3.orc
data/files/smbbucket_2.orc
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
ql/src/test/results/clientpositive/orc_createas1.q.out
ql/src/test/results/clientpositive/orcfile_merge3.q.out
ql/src/test/results/clientpositive/orcfile_merge2.q.out
ql/src/test/results/clientpositive/alter_merge_orc2.q.out
ql/src/test/results/clientpositive/alter_merge_orc.q.out
ql/src/test/results/clientpositive/orcfile_merge1.q.out
ql/src/test/results/clientpositive/orcfile_merge4.q.out
ql/src/test/results/clientpositive/alter_merge_orc_stats.q.out
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
ql/src/test/queries/clientpositive/orcfile_merge2.q
ql/src/test/queries/clientpositive/orcfile_merge3.q
ql/src/test/queries/clientpositive/alter_merge_orc.q
ql/src/test/queries/clientpositive/orcfile_merge4.q
ql/src/test/queries/clientpositive/alter_merge_orc_stats.q
ql/src/test/queries/clientpositive/orcfile_merge1.q
ql/src/test/queries/clientpositive/alter_merge_orc2.q
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java
ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java
ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTablePartMergeFilesDesc.java
ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeOutputFormat.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcBlockMergeRecordReader.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcBlockMergeInputFormat.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcMergeMapper.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/StripeReader.java
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge
ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeOutputFormat.java
ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeTask.java
To: kevinwilfong, omalley, sxyuan
Cc: JIRA
> Stripe-level merge for ORC files
> --------------------------------
>
> Key: HIVE-4221
> URL: https://issues.apache.org/jira/browse/HIVE-4221
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Samuel Yuan
> Assignee: Samuel Yuan
> Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
>
>
> As with RC files, we would like to be able to merge ORC files efficiently by
> reading/writing stripes without decompressing/recompressing them. This will
> be similar to the RC file merge, except that footers will have to be updated
> with the stripe positions in the new file.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira