[ https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4221:
------------------------------

    Attachment: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch

sxyuan requested code review of "HIVE-4221 [jira] Stripe-level merge for ORC files".

Reviewers: kevinwilfong, omalley

As with RC files, we would like to be able to merge ORC files efficiently by 
reading/writing stripes without deserializing each row. Most of the logic is 
unchanged from merging for RC files, so the original code has been refactored 
for reuse.
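
The issue description notes that footers have to be updated with the stripe positions in the new file. As a rough illustration of that idea only, the sketch below copies stripe bytes verbatim and records a new offset for each one. StripeMergeSketch, StripeInfo, InputFile, Merged and the startOffset parameter are all hypothetical stand-ins, not classes or signatures from this patch, and the in-memory byte arrays are an assumption made to keep the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are the classes touched by the patch.
final class StripeMergeSketch {

    // Stand-in for one stripe entry in a file footer.
    static final class StripeInfo {
        final long offset;   // absolute position of the stripe in its file
        final long length;   // size of the stripe in bytes
        final long rows;     // number of rows stored in the stripe

        StripeInfo(long offset, long length, long rows) {
            this.offset = offset;
            this.length = length;
            this.rows = rows;
        }
    }

    // Stand-in for an input file: raw bytes plus the stripe entries from its footer.
    static final class InputFile {
        final byte[] bytes;
        final List<StripeInfo> stripes;

        InputFile(byte[] bytes, List<StripeInfo> stripes) {
            this.bytes = bytes;
            this.stripes = stripes;
        }
    }

    // Result: the concatenated stripe bytes and the rewritten footer entries.
    static final class Merged {
        final byte[] body;
        final List<StripeInfo> stripes;

        Merged(byte[] body, List<StripeInfo> stripes) {
            this.body = body;
            this.stripes = stripes;
        }
    }

    // Copies every stripe as opaque bytes (no row is decoded) and assigns each
    // one the offset it will have in the output. startOffset is the assumed
    // position of the first stripe in the new file.
    static Merged merge(List<InputFile> inputs, long startOffset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        List<StripeInfo> merged = new ArrayList<>();
        long position = startOffset;
        for (InputFile in : inputs) {
            for (StripeInfo s : in.stripes) {
                out.write(in.bytes, (int) s.offset, (int) s.length);
                merged.add(new StripeInfo(position, s.length, s.rows));
                position += s.length;
            }
        }
        return new Merged(out.toByteArray(), merged);
    }
}
```

Only the offsets change; lengths and row counts carry over unchanged, which is what lets the stripe contents be copied without being decoded.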

TEST PLAN
  Copied the RC file merge tests and modified them to use the ORC file format. 
Added a test case to TestOrcFile to verify that file-level column statistics 
are merged properly.
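
To illustrate what merging file-level column stats can mean when rows are not re-read, here is a minimal sketch for a single integer column. IntColumnStatsSketch and its fields are hypothetical and are not the statistics classes in the patch; the sketch also assumes the sum does not overflow.

```java
// Hypothetical stand-in for the statistics of one integer column.
final class IntColumnStatsSketch {
    final long count; // number of values
    final long min;
    final long max;
    final long sum;

    IntColumnStatsSketch(long count, long min, long max, long sum) {
        this.count = count;
        this.min = min;
        this.max = max;
        this.sum = sum;
    }

    // Combines the statistics of two files without looking at any row.
    IntColumnStatsSketch merge(IntColumnStatsSketch other) {
        // An empty side contributes nothing, so its min/max must be ignored.
        if (count == 0) {
            return other;
        }
        if (other.count == 0) {
            return this;
        }
        return new IntColumnStatsSketch(
            count + other.count,
            Math.min(min, other.min),
            Math.max(max, other.max),
            sum + other.sum); // assumption: no overflow handling in this sketch
    }
}
```

A test along the lines described above would write two files, merge them, and compare the merged statistics against values computed this way.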

REVISION DETAIL
  https://reviews.facebook.net/D9759

AFFECTED FILES
  data/files/smbbucket_1.orc
  data/files/smbbucket_3.orc
  data/files/smbbucket_2.orc
  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
  ql/src/test/results/clientpositive/orc_createas1.q.out
  ql/src/test/results/clientpositive/orcfile_merge3.q.out
  ql/src/test/results/clientpositive/orcfile_merge2.q.out
  ql/src/test/results/clientpositive/alter_merge_orc2.q.out
  ql/src/test/results/clientpositive/alter_merge_orc.q.out
  ql/src/test/results/clientpositive/orcfile_merge1.q.out
  ql/src/test/results/clientpositive/orcfile_merge4.q.out
  ql/src/test/results/clientpositive/alter_merge_orc_stats.q.out
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java
  ql/src/test/queries/clientpositive/orcfile_merge2.q
  ql/src/test/queries/clientpositive/orcfile_merge3.q
  ql/src/test/queries/clientpositive/alter_merge_orc.q
  ql/src/test/queries/clientpositive/orcfile_merge4.q
  ql/src/test/queries/clientpositive/alter_merge_orc_stats.q
  ql/src/test/queries/clientpositive/orcfile_merge1.q
  ql/src/test/queries/clientpositive/alter_merge_orc2.q
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMRFileSink1.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/AlterTablePartMergeFilesDesc.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
  ql/src/java/org/apache/hadoop/hive/ql/io/RCFile.java
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/MergeWork.java
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeOutputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/BlockMergeTask.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcBlockMergeRecordReader.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcBlockMergeInputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcMergeMapper.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFile.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/StripeReader.java
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java
  ql/src/java/org/apache/hadoop/hive/ql/io/merge
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeOutputFormat.java
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeTask.java

MANAGE HERALD RULES
  https://reviews.facebook.net/herald/view/differential/

WHY DID I GET THIS EMAIL?
  https://reviews.facebook.net/herald/transcript/23295/

To: kevinwilfong, omalley, sxyuan
Cc: JIRA

                
> Stripe-level merge for ORC files
> --------------------------------
>
>                 Key: HIVE-4221
>                 URL: https://issues.apache.org/jira/browse/HIVE-4221
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Samuel Yuan
>            Assignee: Samuel Yuan
>         Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
>
>
> As with RC files, we would like to be able to merge ORC files efficiently by 
> reading/writing stripes without decompressing/recompressing them. This will 
> be similar to the RC file merge, except that footers will have to be updated 
> with the stripe positions in the new file.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
