[jira] Commented: (HIVE-1307) More generic and efficient merge method

Ning Zhang (JIRA) Thu, 19 Aug 2010 14:32:57 -0700

    [ https://issues.apache.org/jira/browse/HIVE-1307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12900473#action_12900473 ]


Ning Zhang commented on HIVE-1307:
----------------------------------

I will address the code refactoring and update the patch.

Regarding hadoopVersion, the patch does include changes that add it to all 
tests (in ql/build.xml). checkPlan is changed to compare against different 
log files according to hadoopVersion.

> More generic and efficient merge method
> ---------------------------------------
>
>                 Key: HIVE-1307
>                 URL: https://issues.apache.org/jira/browse/HIVE-1307
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.6.0
>            Reporter: Ning Zhang
>            Assignee: Ning Zhang
>             Fix For: 0.7.0
>
>         Attachments: HIVE-1307.0.patch, HIVE-1307.2.patch, HIVE-1307.3.patch, 
> HIVE-1307.3_java.patch, HIVE-1307.patch, HIVE-1307_java_only.patch
>
>
> Currently, if hive.merge.mapfiles/mapredfiles=true, a new mapreduce job is 
> created to read the input files and output to one reducer for merging. This 
> MR job is created at compile time, with one MR job per partition. In the 
> dynamic partition case, multiple partitions could be created at execution 
> time, so generating the merging MR job at compile time is impossible. 
> We should generalize the merge framework to allow multiple partitions; 
> most of the time a map-only job should be sufficient if we use 
> CombineHiveInputFormat. 
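
For readers unfamiliar with the settings named in the description, the scenario 
can be sketched as a Hive session like the one below. This is only an 
illustration, not part of the patch: the tables `src` and `dst` and their 
columns are hypothetical, and the dynamic-partition and input-format settings 
are assumptions about how such a session would typically be configured.

```sql
-- Enable merging of small output files (the two properties named in the description).
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;

-- Assumption: dynamic partitioning is enabled for this session.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Assumption: CombineHiveInputFormat is selected as the input format.
SET hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;

-- Hypothetical tables: the partitions of dst (by ds) are only known at
-- execution time, so a per-partition merge job cannot be planned at compile time.
INSERT OVERWRITE TABLE dst PARTITION (ds)
SELECT key, value, ds FROM src;
```

With an insert like this, the number of output partitions depends on the 
distinct `ds` values in the data, which is the case the description says a 
compile-time merge job cannot handle.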

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
