[jira] [Assigned] (HIVE-3733) Improve Hive's logic for conditional merge

2012-12-30 Thread Zhenxiao Luo (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhenxiao Luo reassigned HIVE-3733:
--

Assignee: Zhenxiao Luo  (was: Pradeep Kamath)

 Improve Hive's logic for conditional merge
 --

 Key: HIVE-3733
 URL: https://issues.apache.org/jira/browse/HIVE-3733
 Project: Hive
  Issue Type: Improvement
Reporter: Pradeep Kamath
Assignee: Zhenxiao Luo
 Attachments: HIVE-3733.1.patch.txt, HIVE-3733.3.patch.txt, 
 HIVE-3733.4.patch.txt, HIVE-3733.5.patch.txt, HIVE-3733.optimizer.patch.txt


 If hive.merge.mapfiles is set to true and hive.merge.mapredfiles is set to 
 false, then when Hive encounters a FileSinkOperator while generating 
 map-reduce tasks, it looks at the entire job to see whether it has a reducer; 
 if it does, Hive does not merge. Instead, it should check whether the 
 FileSinkOperator is a child of the reducer. That way, outputs generated in 
 the mapper are merged and outputs generated in the reducer are not, which is 
 the intended effect of setting those configs.
 Simple repro:
 set hive.merge.mapfiles=true;
 set hive.merge.mapredfiles=false;
 EXPLAIN
 FROM input_table
 INSERT OVERWRITE TABLE output_table1 SELECT key, COUNT(*) group by key
 INSERT OVERWRITE TABLE output_table2 SELECT *;
 With the fix, the EXPLAIN output should contain a Conditional Operator, 
 Mapred stages, and Move tasks.
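
 The difference between the two decision rules can be illustrated with a small 
 model. This is only a sketch and not Hive's code: the Sink class, the 
 under_reducer flag, and both function names are hypothetical, and the model 
 assumes a job has a reducer exactly when at least one of its sinks sits under 
 one. It applies the two configs from the repro to the two INSERT branches.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sink:
    """Hypothetical stand-in for a FileSinkOperator (not Hive's real class)."""
    table: str
    under_reducer: bool  # True if the sink is a child of the job's reducer


def merge_per_job(sinks, merge_mapfiles, merge_mapredfiles):
    """Reported behaviour: one decision for the whole job."""
    # Simplifying assumption: the job has a reducer if any sink is under one.
    job_has_reducer = any(s.under_reducer for s in sinks)
    decision = merge_mapredfiles if job_has_reducer else merge_mapfiles
    return {s.table: decision for s in sinks}


def merge_per_sink(sinks, merge_mapfiles, merge_mapredfiles):
    """Proposed behaviour: decide separately for each sink."""
    return {
        s.table: (merge_mapredfiles if s.under_reducer else merge_mapfiles)
        for s in sinks
    }


# The repro: the GROUP BY branch writes from the reducer, SELECT * from the mapper.
sinks = [Sink("output_table1", True), Sink("output_table2", False)]
print(merge_per_job(sinks, merge_mapfiles=True, merge_mapredfiles=False))
print(merge_per_sink(sinks, merge_mapfiles=True, merge_mapredfiles=False))
```

 Under these assumptions the per-job rule merges neither output, while the 
 per-sink rule merges only output_table2, the map-side output.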

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HIVE-3733) Improve Hive's logic for conditional merge

2012-11-21 Thread Namit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-3733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Namit Jain reassigned HIVE-3733:


Assignee: Pradeep Kamath

 Improve Hive's logic for conditional merge
 --

 Key: HIVE-3733
 URL: https://issues.apache.org/jira/browse/HIVE-3733
 Project: Hive
  Issue Type: Improvement
Reporter: Pradeep Kamath
Assignee: Pradeep Kamath

