Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-09-11 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/#review53087
---

Ship it!


Ship It!

- Vikram Dixit Kumaraswamy


On Sept. 9, 2014, 7:32 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24627/
> ---
> 
> (Updated Sept. 9, 2014, 7:32 a.m.)
> 
> 
> Review request for hive and Gunther Hagleitner.
> 
> 
> Bugs: HIVE-7704
> https://issues.apache.org/jira/browse/HIVE-7704
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently Tez falls back to an MR task for merge file tasks. It will be
> beneficial to convert the merge file tasks to Tez tasks to make use of the
> performance gains from Tez.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 31aeba9 
>   itests/src/test/resources/testconfiguration.properties 99049ca 
>   ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java 6f23575
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e076683 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8946221 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/RCFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5bbf3f6 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 4ff568d1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 994721f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java 831e6a5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 
> 4651920 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java 6c691b1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java 
> a3ce699 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c30476b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java 
> 13ec642 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 195d60e 
>   ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
>   ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java dee6b1c
>   ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
>   ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 11a9419 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/FileMergeDesc.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/OrcFileMergeDesc.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/plan/RCFileMergeDesc.java 
> PRE-CREATION 
>   ql/src/test/queries/clientpositive/list_bucket_dml_8.q 9e81b8d 
>   ql/src/test/queries/clientpositive/orc_merge1.q ee65b98 
>   ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
>   ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
>   ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out 11c7578 
>   ql/src/test/results/clientpositive/list_bucket_dml_10.q.out 8de452f 
>   ql/src/test/results/clientpositive/list_bucket_dml_4.q.out b1c060e 
>   ql/src/test/results/clientpositive/list_bucket_dml_6.q.out 3450d63 
>   ql/src/test/results/clientpositive/list_bucket_dml_7.q.out f6a4cb5 
>   ql/src/test/results/clientpositive/list_bucket_dml_9.q.out 796c7af 
>   ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 0899648 
>   ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-09-09 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

(Updated Sept. 9, 2014, 7:32 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Addressed Vikram's review comments.


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently Tez falls back to an MR task for merge file tasks. It will be
beneficial to convert the merge file tasks to Tez tasks to make use of the
performance gains from Tez.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 31aeba9 
  itests/src/test/resources/testconfiguration.properties 99049ca
  ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java 6f23575
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e076683 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8946221 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/RCFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 5bbf3f6 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 4ff568d1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 994721f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java 831e6a5 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java 6c691b1 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c30476b 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java 13ec642
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 195d60e
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 
dee6b1c 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 11a9419 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileMergeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OrcFileMergeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/RCFileMergeDesc.java PRE-CREATION 
  ql/src/test/queries/clientpositive/list_bucket_dml_8.q 9e81b8d 
  ql/src/test/queries/clientpositive/orc_merge1.q ee65b98 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out 11c7578 
  ql/src/test/results/clientpositive/list_bucket_dml_10.q.out 8de452f 
  ql/src/test/results/clientpositive/list_bucket_dml_4.q.out b1c060e 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.out 3450d63 
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out f6a4cb5 
  ql/src/test/results/clientpositive/list_bucket_dml_9.q.out 796c7af 
  ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 0899648 
  ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out 0653469 
  ql/src/test/results/clientpositive/orc_createas1.q.out 993c853 
  ql/src/test/results/clientpositive/orc_merge1.q.out 7f88125 
  ql/src/test/results/clientpositive/orc_merge3.q.out 258f538 
  ql/src/test/results/clientpositive/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge7.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/rcfile_createas1.q.out cdfa036 
  

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-09-08 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/#review52679
---



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


All of this was existing code taken from MergeMapper.java. Anyway, I
rewrote the comment in the new patch. I also merged the fixTmpPath and
fixTmpPathConcatenate methods into a single method in the new patch.



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


Updated in new patch.



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


Updated comment in new patch.



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


This is all gone in new patch.



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


That's Eclipse. I used IntelliJ. Perhaps the two do the opposite. :)



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


Fixed it.



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


If I use the Map interface here, then I need to cast it to LinkedHashMap when I
call setAliasToWork(). To avoid the cast, I am using LinkedHashMap on the LHS.
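
A minimal sketch of the trade-off described above. WorkSketch and its setter are hypothetical stand-ins (the real work class and its value types are not shown in this thread); the point is only that a setter declared to take LinkedHashMap forces a downcast when the variable is declared as Map.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AliasToWorkSketch {
    // Hypothetical stand-in for a work class whose setter takes a LinkedHashMap.
    static class WorkSketch {
        private LinkedHashMap<String, String> aliasToWork;

        void setAliasToWork(LinkedHashMap<String, String> aliasToWork) {
            this.aliasToWork = aliasToWork;
        }

        LinkedHashMap<String, String> getAliasToWork() {
            return aliasToWork;
        }
    }

    // Concrete type on the left-hand side: the setter accepts it directly.
    static WorkSketch withConcreteType() {
        LinkedHashMap<String, String> aliasToWork = new LinkedHashMap<>();
        aliasToWork.put("alias", "operator");
        WorkSketch work = new WorkSketch();
        work.setAliasToWork(aliasToWork);
        return work;
    }

    // Interface type on the left-hand side: the same call needs a downcast.
    static WorkSketch withInterfaceType() {
        Map<String, String> aliasToWork = new LinkedHashMap<>();
        aliasToWork.put("alias", "operator");
        WorkSketch work = new WorkSketch();
        work.setAliasToWork((LinkedHashMap<String, String>) aliasToWork);
        return work;
    }

    public static void main(String[] args) {
        System.out.println(withConcreteType().getAliasToWork());
        System.out.println(withInterfaceType().getAliasToWork());
    }
}
```

Both variants end up with the same map; the difference is only whether the cast appears at the call site.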



ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java


Fixed it.



ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java


This is a dummy output format which is set in DagUtils/MergeFileTask. I am
using it only to make sure the operator pipeline for fast file merge is
initialized properly. If the operator pipeline is wrongly initialized with, say,
TS -> FS, then FS will request a record writer from this output format, which
throws a RuntimeException. If the OFM or RFM operators are initialized, then
this is never called.

The RCFile and ORC file writers handle opening and closing the output file
themselves. They do not use the standard record writer interfaces for writing
the output. Both RCFile and ORC use custom interfaces for block-level and
stripe-level writing, respectively.
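
A rough sketch of the guard idea described above, not the code from the patch. The interfaces below are simplified, hypothetical stand-ins rather than the Hadoop OutputFormat/RecordWriter API; the sketch only shows an output format whose writer request always fails, so that a wrongly built pipeline is detected early.

```java
public class DummyOutputFormatSketch {
    // Hypothetical, simplified stand-ins; these are NOT the Hadoop interfaces.
    interface RecordWriterSketch {
        void write(Object row);
    }

    interface OutputFormatSketch {
        RecordWriterSketch getRecordWriter();
    }

    // Placeholder output format: asking it for a writer signals a wrongly
    // initialized pipeline, because the merge operators write files themselves.
    static class MergeGuardOutputFormat implements OutputFormatSketch {
        @Override
        public RecordWriterSketch getRecordWriter() {
            throw new RuntimeException(
                "Record writer requested; file merge pipeline was not initialized correctly");
        }
    }

    // Returns true when requesting a writer fails, which is what the guard intends.
    static boolean writerRequestFails(OutputFormatSketch format) {
        try {
            format.getRecordWriter();
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(writerRequestFails(new MergeGuardOutputFormat()));
    }
}
```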


- Prasanth_J


On Sept. 6, 2014, 2:03 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24627/
> ---
> 
> (Updated Sept. 6, 2014, 2:03 a.m.)
> 
> 
> Review request for hive and Gunther Hagleitner.
> 
> 
> Bugs: HIVE-7704
> https://issues.apache.org/jira/browse/HIVE-7704
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently Tez falls back to an MR task for merge file tasks. It will be
> beneficial to convert the merge file tasks to Tez tasks to make use of the
> performance gains from Tez.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 54e2b18 
>   itests/src/test/resources/testconfiguration.properties 99049ca 
>   ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java 6f23575
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e076683 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8946221 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/RCFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2d9b9c3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 4ff568d1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 994721f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java 831e6a5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-09-08 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/#review52632
---



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


inputDepth or tmpPath?



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


inputDepth perhaps?



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


nit updated.



ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java


nit: has slight difference w.r.t



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


Usually done in the reverse way. The java imports come before the 
hadoop/hive imports.



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


Don't really need it here. Can be moved down to line 582.



ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java


HashMap only on the LHS. Don't use LinkedHashMap.



ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java


Nice work! This section usually comes above the hadoop imports, though.



ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java


nit: comma after merge makes it easier to read.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java


Nice work!



ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java


What purpose does this class serve? Doesn't look like it does anything. 
Comments please.


- Vikram Dixit Kumaraswamy


On Sept. 6, 2014, 2:03 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24627/
> ---
> 
> (Updated Sept. 6, 2014, 2:03 a.m.)
> 
> 
> Review request for hive and Gunther Hagleitner.
> 
> 
> Bugs: HIVE-7704
> https://issues.apache.org/jira/browse/HIVE-7704
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently Tez falls back to an MR task for merge file tasks. It will be
> beneficial to convert the merge file tasks to Tez tasks to make use of the
> performance gains from Tez.
> 
> 
> Diffs
> -
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 54e2b18 
>   itests/src/test/resources/testconfiguration.properties 99049ca 
>   ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java 6f23575
>   ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e076683 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8946221 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/RCFileMergeOperator.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2d9b9c3 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 4ff568d1 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 994721f 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java 831e6a5 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 
> PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 
> 4651920 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java 6c691b1 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java 
> a3ce699 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c30476b 
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWor

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-09-05 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

(Updated Sept. 6, 2014, 2:03 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Fixed test failures


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently Tez falls back to an MR task for merge file tasks. It will be
beneficial to convert the merge file tasks to Tez tasks to make use of the
performance gains from Tez.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 54e2b18 
  itests/src/test/resources/testconfiguration.properties 99049ca
  ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java 6f23575
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e076683 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8946221 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/RCFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 2d9b9c3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 4ff568d1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 994721f 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java 831e6a5 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java 6c691b1 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c30476b 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java 13ec642
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 195d60e
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 
dee6b1c 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java 11a9419 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileMergeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OrcFileMergeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/RCFileMergeDesc.java PRE-CREATION 
  ql/src/test/queries/clientpositive/list_bucket_dml_8.q 9e81b8d 
  ql/src/test/queries/clientpositive/orc_merge1.q ee65b98 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out ea37c36 
  ql/src/test/results/clientpositive/list_bucket_dml_10.q.out e9367ac 
  ql/src/test/results/clientpositive/list_bucket_dml_4.q.out 99496d5 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.out d5deadb 
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 4aea4db 
  ql/src/test/results/clientpositive/list_bucket_dml_9.q.out f94a3cc 
  ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 0899648 
  ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out 0653469 
  ql/src/test/results/clientpositive/orc_createas1.q.out b0c58dd 
  ql/src/test/results/clientpositive/orc_merge1.q.out fc3e206 
  ql/src/test/results/clientpositive/orc_merge3.q.out 258f538 
  ql/src/test/results/clientpositive/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge7.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/rcfile_createas1.q.out cdfa036 
  ql/src/test/resu

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-09-05 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

(Updated Sept. 5, 2014, 7:51 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Addressed Gunther's review comments in this patch.


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently Tez falls back to an MR task for merge file tasks. It will be
beneficial to convert the merge file tasks to Tez tasks to make use of the
performance gains from Tez.


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a0a5f54 
  itests/src/test/resources/testconfiguration.properties 4ad72a3
  ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java aa094ee
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java e076683 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java 7477199 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 2bcb481 
  ql/src/java/org/apache/hadoop/hive/ql/exec/OrcFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/RCFileMergeOperator.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 44b6a43 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java 82aa6ba 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileRecordProcessor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java c2ba782 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java 6c691b1 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c30476b 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java 13ec642
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 195d60e
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java 
dee6b1c 
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 8513f99 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java d58c59d 
  ql/src/java/org/apache/hadoop/hive/ql/plan/FileMergeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/OrcFileMergeDesc.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/plan/RCFileMergeDesc.java PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge1.q ee65b98 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out ea37c36 
  ql/src/test/results/clientpositive/list_bucket_dml_10.q.out e9367ac 
  ql/src/test/results/clientpositive/list_bucket_dml_4.q.out 99496d5 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.out d5deadb 
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 4aea4db 
  ql/src/test/results/clientpositive/list_bucket_dml_9.q.out f94a3cc 
  ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 0899648 
  ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out 0653469 
  ql/src/test/results/clientpositive/orc_createas1.q.out b0c58dd 
  ql/src/test/results/clientpositive/orc_merge1.q.out fc3e206 
  ql/src/test/results/clientpositive/orc_merge3.q.out 258f538 
  ql/src/test/results/clientpositive/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge7.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/rcfile_createas1.q.out cdfa036 
  ql/src/test/results/clientpositive/rcfile_merge1.q.ou

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-08-25 Thread Gunther Hagleitner

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/#review51416
---



ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java


why do you need a map operator at all then? Can't you just write a net new 
processor that doesn't init map op?



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java


conf setup should happen in initVertexConf.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java


if i read this right the only diff is the processor. can you use a var for
this and keep a single call to "new Vertex"?

String procClassName;

if ... {
  procClassName = ...
}
...
new Vertex(...procClassName)

if you move all the conf setup into the initVertexConf method this should 
be more clear.
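
The suggestion above could look roughly like the following. This is an illustrative sketch only: the Vertex class here is a hypothetical stand-in, not the Tez class, and the isMergeWork flag and createVertex helper are assumptions. The processor names are taken from the file list in this review and used as plain strings.

```java
public class VertexSketch {
    // Hypothetical stand-in for a vertex; it only records its name and processor.
    static class Vertex {
        final String name;
        final String processorClassName;

        Vertex(String name, String processorClassName) {
            this.name = name;
            this.processorClassName = processorClassName;
        }
    }

    // Pick the processor class name first, then construct the vertex once.
    static Vertex createVertex(String name, boolean isMergeWork) {
        String procClassName;
        if (isMergeWork) {
            procClassName = "MergeFileTezProcessor";
        } else {
            procClassName = "MapTezProcessor";
        }
        return new Vertex(name, procClassName);
    }

    public static void main(String[] args) {
        System.out.println(createVertex("File Merge", true).processorClassName);
        System.out.println(createVertex("Map 1", false).processorClassName);
    }
}
```

Keeping a single construction site means any later change to how the vertex is built is made in one place, with only the processor choice left in the branch.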



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java


indentation seems broken



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java


that means you're setting a path as the alias?



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileMapRecordProcessor.java


I'm assuming that Merge* and ORCMerge* contain a lot of copied code? (from 
the MR path). If that's the case can you factor that out?



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java


this should be a different class. not every processor will need these 
things.



ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java


don't call it jobClose if it only applies to merge work.



ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java


you seem to be fighting that merge work is only partly a map work. why not 
create a dummy op? that way everything is the same. you could even create a 
real op and move your merge logic into it.


- Gunther Hagleitner


On Aug. 15, 2014, 5:27 a.m., Prasanth_J wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/24627/
> ---
> 
> (Updated Aug. 15, 2014, 5:27 a.m.)
> 
> 
> Review request for hive and Gunther Hagleitner.
> 
> 
> Bugs: HIVE-7704
> https://issues.apache.org/jira/browse/HIVE-7704
> 
> 
> Repository: hive-git
> 
> 
> Description
> ---
> 
> Currently Tez falls back to an MR task for merge file tasks. It will be
> beneficial to convert the merge file tasks to Tez tasks to make use of the
> performance gains from Tez.
> 
> 
> Diffs
> -
> 
>   itests/src/test/resources/testconfiguration.properties b801678 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java d5de58e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java a2975cb 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 4e0fd79 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java e116426 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 
> 8513e33 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapTezProcessor.java 31f3bcd 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileMapRecordProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileMapRecordProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileTezProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RCFileMergeFileMapRecordProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 1577827 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java c2ba782 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 951e918 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/RCFileMergeFileTezProcessor.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java bf44548
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java PRE-CREATION
>   ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java PRE-CREATION
>   ql/src/java/org/

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-08-14 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

(Updated Aug. 15, 2014, 5:27 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Fixed failing tests.


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently tez falls back to MR task for merge file task. It will be beneficial to 
convert the merge file tasks to tez task to make use of the performance gains 
from tez.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties b801678 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java d5de58e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java a2975cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 4e0fd79 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java e116426 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 8513e33
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapTezProcessor.java 31f3bcd
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RCFileMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 1577827 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java c2ba782 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 951e918
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/RCFileMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java bf44548
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java beb4f7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c437dd0 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java b36152a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 76b4d03
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java dee6b1c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 8513f99 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java d58c59d 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out ea37c36 
  ql/src/test/results/clientpositive/list_bucket_dml_10.q.out e9367ac 
  ql/src/test/results/clientpositive/list_bucket_dml_4.q.out 99496d5 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.out d5deadb 
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 4aea4db 
  ql/src/test/results/clientpositive/list_bucket_dml_9.q.out f94a3cc 
  ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 0899648 
  ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out 0653469 
  ql/src/test/results/clientpositive/orc_createas1.q.out a104480 
  ql/src/test/results/clientpositive/orc_merge3.q.out 258f538 
  ql/src/test/results/clientpositive/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge7.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/rcfile_createas1.q.out c8d65c9 
  ql/s

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-08-14 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

(Updated Aug. 14, 2014, 8:53 p.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Fixed test failures. Mostly golden file diffs because of renaming of MergeWork 
to MergeFileWork. Also previous patch missed TestCliDriver diffs for newly 
added q file tests.


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently tez falls back to MR task for merge file task. It will be beneficial to 
convert the merge file tasks to tez task to make use of the performance gains 
from tez.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties b801678 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java d5de58e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java a2975cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 3d74459 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 4e0fd79 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java e116426 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 8513e33
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapTezProcessor.java 31f3bcd
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RCFileMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 1577827 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java c2ba782 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 951e918
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/RCFileMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java bf44548
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java beb4f7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c437dd0 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java b36152a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 76b4d03
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java dee6b1c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 8513f99 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java d58c59d 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/infer_bucket_sort_dyn_part.q.out bb6145c 
  ql/src/test/results/clientpositive/list_bucket_dml_10.q.out e45ab04 
  ql/src/test/results/clientpositive/list_bucket_dml_4.q.out 9c4ff6b 
  ql/src/test/results/clientpositive/list_bucket_dml_6.q.out d1cde40 
  ql/src/test/results/clientpositive/list_bucket_dml_7.q.out 19000dc 
  ql/src/test/results/clientpositive/list_bucket_dml_9.q.out 692bc10 
  ql/src/test/results/clientpositive/merge_dynamic_partition4.q.out 0f57a21 
  ql/src/test/results/clientpositive/merge_dynamic_partition5.q.out 65f195e 
  ql/src/test/results/clientpositive/orc_createas1.q.out 7e74d49 
  ql/src/test/results/clientpositive/orc_merge3.q.out 93edc38 
  ql/src/test/results/clientpositive/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/orc_merge6.q.out PR

Re: Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-08-12 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

(Updated Aug. 13, 2014, 2:17 a.m.)


Review request for hive and Gunther Hagleitner.


Changes
---

Refreshed patch against trunk.


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently tez falls back to MR task for merge file task. It will be beneficial to 
convert the merge file tasks to tez task to make use of the performance gains 
from tez.


Diffs (updated)
-

  itests/src/test/resources/testconfiguration.properties 62aa9a3 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java d5de58e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java a2975cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 4e0fd79 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java e116426 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 8513e33
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapTezProcessor.java 31f3bcd
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RCFileMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 1577827 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java c2ba782 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 951e918
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/RCFileMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java bf44548
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java beb4f7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c437dd0 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java b36152a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 76b4d03
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java dee6b1c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 8513f99 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java d58c59d 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/tez/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/orc_merge6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/orc_merge7.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/24627/diff/


Testing
---


Thanks,

Prasanth_J



Review Request 24627: HIVE-7704: Create tez task for fast file merging

2014-08-12 Thread j . prasanth . j

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/24627/
---

Review request for hive and Gunther Hagleitner.


Bugs: HIVE-7704
https://issues.apache.org/jira/browse/HIVE-7704


Repository: hive-git


Description
---

Currently tez falls back to MR task for merge file task. It will be beneficial to 
convert the merge file tasks to tez task to make use of the performance gains 
from tez.


Diffs
-

  itests/qtest/testconfiguration.properties 385397d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java cd017d8 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapOperator.java d5de58e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MoveTask.java a2975cb 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TaskFactory.java 24dfed1 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java 1d6a93a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecMapper.java 4e0fd79 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java e116426 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapRecordProcessor.java 8513e33
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MapTezProcessor.java 31f3bcd
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/MergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/OrcMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RCFileMergeFileMapRecordProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/RecordProcessor.java 1577827 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezProcessor.java c2ba782 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/TezTask.java 951e918
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/tools/RCFileMergeFileTezProcessor.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java bf44548
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileInputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileMapper.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileOutputFormat.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileTask.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeFileWork.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeInputFormat.java 4651920 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeMapper.java beb4f7d 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeOutputFormat.java a3ce699 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeTask.java c437dd0 
  ql/src/java/org/apache/hadoop/hive/ql/io/merge/MergeWork.java 9efee3c 
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileMergeMapper.java b36152a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcFileStripeMergeInputFormat.java a6c92fb
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java c391b0e
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 76b4d03
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileBlockMergeInputFormat.java 6809c79
  ql/src/java/org/apache/hadoop/hive/ql/io/rcfile/merge/RCFileMergeMapper.java dee6b1c
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 7129ed8 
  ql/src/java/org/apache/hadoop/hive/ql/parse/GenTezProcContext.java 8513f99 
  ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java d58c59d 
  ql/src/test/queries/clientpositive/orc_merge5.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge6.q PRE-CREATION 
  ql/src/test/queries/clientpositive/orc_merge7.q PRE-CREATION 
  ql/src/test/results/clientpositive/tez/orc_merge5.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/orc_merge6.q.out PRE-CREATION 
  ql/src/test/results/clientpositive/tez/orc_merge7.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/24627/diff/


Testing
---


Thanks,

Prasanth_J