[
https://issues.apache.org/jira/browse/HIVE-4221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13641209#comment-13641209
]
Phabricator commented on HIVE-4221:
-----------------------------------
kevinwilfong has commented on the revision "HIVE-4221 [jira] Stripe-level merge
for ORC files".
For the testcases orcfile_merge1-4.q, it's not clear to me what you're testing,
and I have doubts as to whether multiple files are being generated and merged.
Can you add comments and confirm?
Can you also add negative testcases where you create a table with certain
parameters, write a file to it, alter one or more of those parameters, add
another file to the table (insert into, not insert overwrite), and try to merge
it? The parameters should include:
orc.compress
orc.compress.size
orc.row.index.stride
orc.create.index
This should cause the merge to fail.
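Something along these lines is what I have in mind, as a rough sketch only. The
table name orc_merge_neg and the ZLIB/SNAPPY values are placeholders I made up,
and I'm assuming the merge is triggered with ALTER TABLE ... CONCATENATE the
same way it is for RC files; adjust to whatever the patch actually uses.

```sql
-- Hypothetical negative test; table name and property values are illustrative.
-- Goal: two ORC files written with different orc.compress settings
-- should not be merged at the stripe level.
CREATE TABLE orc_merge_neg (key STRING, value STRING)
STORED AS ORC
TBLPROPERTIES ("orc.compress"="ZLIB");

-- First file, written with the original setting.
INSERT INTO TABLE orc_merge_neg SELECT key, value FROM src;

-- Alter one of the parameters listed above.
ALTER TABLE orc_merge_neg SET TBLPROPERTIES ("orc.compress"="SNAPPY");

-- Second file, appended (INSERT INTO, not INSERT OVERWRITE).
INSERT INTO TABLE orc_merge_neg SELECT key, value FROM src;

-- Assumed merge trigger; this is the statement the test expects to fail.
ALTER TABLE orc_merge_neg CONCATENATE;
```

The same pattern would be repeated with orc.compress.size, orc.row.index.stride
and orc.create.index in place of orc.compress.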
INLINE COMMENTS
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:481-485 Could you
add the new configs to conf/hive-default.xml.template?
ql/src/java/org/apache/hadoop/hive/ql/io/merge/BlockMergeOutputFormat.java:33
Could you make this string a constant? It seems fragile to have to change it in
two places.
ql/src/test/queries/clientpositive/orcfile_merge1.q:1 What's the purpose of
this test?
src is a very small table, so I don't think you'll be able to get multiple
splits from it, which means the merge probably won't do anything.
I could be wrong, though. Did you confirm that multiple files were being
generated and that they were being merged?
ql/src/test/queries/clientpositive/orcfile_merge2.q:1 Again, it's not clear to
me what the purpose of this test is.
Could you put a comment in it describing what the goal is?
REVISION DETAIL
https://reviews.facebook.net/D9759
To: kevinwilfong, omalley, sxyuan
Cc: JIRA
> Stripe-level merge for ORC files
> --------------------------------
>
> Key: HIVE-4221
> URL: https://issues.apache.org/jira/browse/HIVE-4221
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Samuel Yuan
> Assignee: Samuel Yuan
> Attachments: HIVE-4221.HIVE-4221.HIVE-4221.HIVE-4221.D9759.1.patch
>
>
> As with RC files, we would like to be able to merge ORC files efficiently by
> reading/writing stripes without decompressing/recompressing them. This will
> be similar to the RC file merge, except that footers will have to be updated
> with the stripe positions in the new file.