[ https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900473#action_12900473 ]
Ning Zhang commented on HIVE-1307:
----------------------------------
I will address the code refactoring and update the patch.
Regarding hadoopVersion, the patch does include changes that add it to all
tests (in ql/build.xml). checkPlan is changed to compare different log files
according to hadoopVersion.
> More generic and efficient merge method
> ---------------------------------------
>
> Key: HIVE-1307
> URL: https://issues.apache.org/jira/browse/HIVE-1307
> Project: Hadoop Hive
> Issue Type: New Feature
> Affects Versions: 0.6.0
> Reporter: Ning Zhang
> Assignee: Ning Zhang
> Fix For: 0.7.0
>
> Attachments: HIVE-1307.0.patch, HIVE-1307.2.patch, HIVE-1307.3.patch,
> HIVE-1307.3_java.patch, HIVE-1307.patch, HIVE-1307_java_only.patch
>
>
> Currently, if hive.merge.mapfiles/mapredfiles=true, a new MapReduce job is
> created to read the input files and output to one reducer for merging. This
> MR job is created at compile time, with one MR job per partition. In the
> dynamic partition case, multiple partitions could be created at execution
> time, so generating the merging MR job at compile time is impossible.
> We should generalize the merge framework to allow multiple partitions. Most
> of the time a map-only job should be sufficient if we use
> CombineHiveInputFormat.
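As a rough illustration of the idea in the description (not the code in the attached patches, and not Hive's actual implementation), the sketch below groups small files by their partition directory and packs them into size-capped splits, so that each split could be handled by a single map task with no reducer. The function name plan_merge_splits, the max_split_bytes parameter, and the example paths are all hypothetical.

```python
from collections import defaultdict
import posixpath


def plan_merge_splits(files, max_split_bytes):
    """Plan map-only merge work for small files across many partitions.

    files: iterable of (path, size_in_bytes) pairs.
    Returns a list of (partition_dir, [paths]) splits. Each split stays
    inside one partition directory, so merged output can be written back
    to the same partition.
    """
    # Group files by their parent directory, treated here as the partition.
    by_partition = defaultdict(list)
    for path, size in files:
        by_partition[posixpath.dirname(path)].append((path, size))

    splits = []
    for partition in sorted(by_partition):
        current, current_bytes = [], 0
        for path, size in sorted(by_partition[partition]):
            # Close the current split if this file would exceed the cap.
            if current and current_bytes + size > max_split_bytes:
                splits.append((partition, current))
                current, current_bytes = [], 0
            current.append(path)
            current_bytes += size
        if current:
            splits.append((partition, current))
    return splits


if __name__ == "__main__":
    # Hypothetical files produced by a dynamic-partition insert.
    example = [
        ("/warehouse/t/ds=1/a", 10),
        ("/warehouse/t/ds=1/b", 10),
        ("/warehouse/t/ds=1/c", 90),
        ("/warehouse/t/ds=2/a", 5),
    ]
    for partition, paths in plan_merge_splits(example, max_split_bytes=100):
        print(partition, paths)
```

Because the plan is computed from whatever files exist when it runs, it does not need to know the set of partitions ahead of time, which is the property the dynamic partition case requires.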