Re: Review Request 16728: Implement non-staged MapJoin

2014-01-27 Thread Vikram Dixit Kumaraswamy

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/#review32885
---



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java
https://reviews.apache.org/r/16728/#comment61876

This would break tez tests.



itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java
https://reviews.apache.org/r/16728/#comment61875

This would eliminate tez unit tests. Was this intentional?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
https://reviews.apache.org/r/16728/#comment61880

Could you raise a JIRA for this?



ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java
https://reviews.apache.org/r/16728/#comment61882

Can it be only these two operators? Maybe the common join operator can be used?


- Vikram Dixit Kumaraswamy



Re: Review Request 16728: Implement non-staged MapJoin

2014-01-27 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
---

(Updated Jan. 28, 2014, 1:37 a.m.)


Review request for hive.


Bugs: HIVE-6144
https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
---

For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapRedLocalTask. For aliases that have no filter or
projection, this staging step seems unnecessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
HashTable Sink Operator
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  Position of Big Table: 1

  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
File Output Operator
  Local Work:
Map Reduce Local Work
  Stage: Stage-0
Fetch Operator
{noformat}

Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
patch, the plan can look like this:
{noformat}
  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
  File Output Operator
  Local Work:
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
  Has Any Stage Alias: false
  Stage: Stage-0
Fetch Operator
{noformat}
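
To illustrate the idea behind the second plan, here is a minimal sketch in
plain Java. It is not Hive code: the class and method names
(NonStagedMapJoinSketch, buildHashTable, probe) are hypothetical, and it
assumes rows are string arrays with the join key in column 0. The small
alias is hashed in memory inside the same task that streams the big alias,
so no intermediate temporary file is written.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative only; hypothetical names, not part of Hive.
public class NonStagedMapJoinSketch {

    // Hash the small alias in memory: join key -> all rows carrying that key.
    public static Map<String, List<String[]>> buildHashTable(List<String[]> smallRows) {
        Map<String, List<String[]>> table = new HashMap<>();
        for (String[] row : smallRows) {
            if (row[0] == null) {
                continue; // a NULL key never matches in an inner join
            }
            table.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row);
        }
        return table;
    }

    // Stream the big alias and probe the table; emit the small side's
    // columns, mirroring "select a.*" in the example query.
    public static List<String> probe(Map<String, List<String[]>> table,
                                     List<String[]> bigRows) {
        List<String> out = new ArrayList<>();
        for (String[] big : bigRows) {
            List<String[]> matches = big[0] == null ? null : table.get(big[0]);
            if (matches == null) {
                continue; // no match: an inner join drops the row
            }
            for (String[] small : matches) {
                out.add(small[0] + "\t" + small[1]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Self-join of a tiny stand-in for table src (key, value).
        List<String[]> src = Arrays.asList(
            new String[] {"1", "val_1"},
            new String[] {"2", "val_2"},
            new String[] {"2", "val_2b"});
        Map<String, List<String[]>> table = buildHashTable(src);
        for (String line : probe(table, src)) {
            System.out.println(line);
        }
    }
}
```

Running main on the three sample rows prints five joined rows, because key 2
appears twice on each side of the self-join.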


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 84ee78f 
  conf/hive-default.xml.template 66d22f9 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 9ad5986
  itests/util/src/main/java/org/apache/hadoop/hive/ql/hooks/MapJoinCounterHook.java 1b0d57e
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java fc08b28 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 56676df 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 22e5777 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java 58484af 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 2d2508d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 2df8ab9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 83b8d6e
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverCommonJoin.java ebccb14
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java c30da56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 709c50e 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 0595cd6 
  ql/src/test/results/clientnegative/deletejar.q.out b873e34 
  ql/src/test/results/clientnegative/file_with_header_footer_negative.q.out fa261b3
  ql/src/test/results/clientnegative/sortmerge_mapjoin_mismatch_1.q.out bca069a 
  ql/src/test/results/clientpositive/auto_join1.q.out b93c10f 
  

Re: Review Request 16728: Implement non-staged MapJoin

2014-01-27 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
---

(Updated Jan. 28, 2014, 1:38 a.m.)


Review request for hive.


Bugs: HIVE-6144
https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
---

For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapRedLocalTask. For aliases that have no filter or
projection, this staging step seems unnecessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
HashTable Sink Operator
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  Position of Big Table: 1

  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
File Output Operator
  Local Work:
Map Reduce Local Work
  Stage: Stage-0
Fetch Operator
{noformat}

Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
patch, the plan can look like this:
{noformat}
  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
  File Output Operator
  Local Work:
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
  Has Any Stage Alias: false
  Stage: Stage-0
Fetch Operator
{noformat}


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 84ee78f 
  conf/hive-default.xml.template 66d22f9
  itests/util/src/main/java/org/apache/hadoop/hive/ql/hooks/MapJoinCounterHook.java 1b0d57e
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java fc08b28 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 56676df 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 22e5777 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java 58484af 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 2d2508d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 2df8ab9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 83b8d6e
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverCommonJoin.java ebccb14
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java c30da56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 709c50e 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 0595cd6 
  ql/src/test/results/clientnegative/file_with_header_footer_negative.q.out fa261b3
  ql/src/test/results/clientnegative/sortmerge_mapjoin_mismatch_1.q.out bca069a 
  ql/src/test/results/clientpositive/auto_join1.q.out b93c10f 
  ql/src/test/results/clientpositive/auto_join15.q.out 6bfcfc7 
  ql/src/test/results/clientpositive/auto_join17.q.out 698270e 
  

Re: Review Request 16728: Implement non-staged MapJoin

2014-01-27 Thread Navis Ryu


 On Jan. 27, 2014, 9:52 p.m., Vikram Dixit Kumaraswamy wrote:
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 90
  https://reviews.apache.org/r/16728/diff/4/?file=431458#file431458line90
 
  This would break tez tests.

Included by mistake (described in HIVE-6241). I'll remove this.


 On Jan. 27, 2014, 9:52 p.m., Vikram Dixit Kumaraswamy wrote:
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java, line 369
  https://reviews.apache.org/r/16728/diff/4/?file=431458#file431458line369
 
  This would eliminate tez unit tests. Was this intentional?

Same as the one above.


 On Jan. 27, 2014, 9:52 p.m., Vikram Dixit Kumaraswamy wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java,
   line 142
  https://reviews.apache.org/r/16728/diff/4/?file=431471#file431471line142
 
  Could you raise a JIRA for this?

OK, sure.


 On Jan. 27, 2014, 9:52 p.m., Vikram Dixit Kumaraswamy wrote:
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java,
   line 165
  https://reviews.apache.org/r/16728/diff/4/?file=431471#file431471line165
 
  Can it be only these two operators? Maybe the common join operator can be
  used?

Logically it might be possible, but in practice it does not seem possible to
have a CommonJoinOperator as a parent, since that operator would run in the
reducer task. Tez could exploit that, but I don't know Tez well.


- Navis


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/#review32885
---



Re: Review Request 16728: Implement non-staged MapJoin

2014-01-19 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
---

(Updated Jan. 20, 2014, 5 a.m.)


Review request for hive.


Changes
---

Addressed comment


Bugs: HIVE-6144
https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
---

For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapRedLocalTask. For aliases that have no filter or
projection, this staging step seems unnecessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
HashTable Sink Operator
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  Position of Big Table: 1

  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
File Output Operator
  Local Work:
Map Reduce Local Work
  Stage: Stage-0
Fetch Operator
{noformat}

Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
patch, the plan can look like this:
{noformat}
  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
  File Output Operator
  Local Work:
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
  Has Any Stage Alias: false
  Stage: Stage-0
Fetch Operator
{noformat}


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java a78b72f 
  conf/hive-default.xml.template 7cd8a1f 
  itests/util/src/main/java/org/apache/hadoop/hive/ql/QTestUtil.java 9ad5986
  itests/util/src/main/java/org/apache/hadoop/hive/ql/hooks/MapJoinCounterHook.java 1b0d57e
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java fc08b28 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java 56676df 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 22e5777 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java 58484af 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 2d2508d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 2df8ab9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 83b8d6e
  ql/src/java/org/apache/hadoop/hive/ql/plan/ConditionalResolverCommonJoin.java ebccb14
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java c30da56 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 709c50e 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientnegative/bucket_mapjoin_mismatch1.q.out 0595cd6 
  ql/src/test/results/clientnegative/deletejar.q.out b873e34 
  ql/src/test/results/clientnegative/file_with_header_footer_negative.q.out fa261b3
  ql/src/test/results/clientnegative/sortmerge_mapjoin_mismatch_1.q.out bca069a 
  

Re: Review Request 16728: Implement non-staged MapJoin

2014-01-12 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
---

(Updated Jan. 13, 2014, 4:43 a.m.)


Review request for hive.


Changes
---

Rebased to trunk and added entry for hive-default.xml.template


Bugs: HIVE-6144
https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
---

For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapRedLocalTask. For aliases that have no filter or
projection, this staging step seems unnecessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
HashTable Sink Operator
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  Position of Big Table: 1

  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
File Output Operator
  Local Work:
Map Reduce Local Work
  Stage: Stage-0
Fetch Operator
{noformat}

Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
patch, the plan can look like this:
{noformat}
  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
  File Output Operator
  Local Work:
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
  Has Any Stage Alias: false
  Stage: Stage-0
Fetch Operator
{noformat}


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6 
  conf/hive-default.xml.template d188f2a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java aa8f19c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java efe5710 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 0cc90d0
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 010ac54
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 14fced7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 83a778d 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join_without_localtask.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16728/diff/


Testing
---


Thanks,

Navis Ryu



Re: Review Request 16728: Implement non-staged MapJoin

2014-01-12 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
---

(Updated Jan. 13, 2014, 7:10 a.m.)


Review request for hive.


Changes
---

Missed a file


Bugs: HIVE-6144
https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
---

For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapRedLocalTask. For aliases that have no filter or
projection, this staging step seems unnecessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
HashTable Sink Operator
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  Position of Big Table: 1

  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
File Output Operator
  Local Work:
Map Reduce Local Work
  Stage: Stage-0
Fetch Operator
{noformat}

Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
patch, the plan can look like this:
{noformat}
  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
  File Output Operator
  Local Work:
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
  Has Any Stage Alias: false
  Stage: Stage-0
Fetch Operator
{noformat}


Diffs (updated)
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 16d54c6 
  conf/hive-default.xml.template d188f2a 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java aa8f19c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 5511bca 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java efe5710 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 0cc90d0 
  ql/src/java/org/apache/hadoop/hive/ql/exec/tez/HashTableLoader.java 2df8ab9
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 010ac54
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 14fced7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 83a778d 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join_without_localtask.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16728/diff/


Testing
---


Thanks,

Navis Ryu



Review Request 16728: Implement non-staged MapJoin

2014-01-07 Thread Navis Ryu

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/16728/
---

Review request for hive.


Bugs: HIVE-6144
https://issues.apache.org/jira/browse/HIVE-6144


Repository: hive-git


Description
---

For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapRedLocalTask. For aliases that have no filter or
projection, this staging step seems unnecessary. For example,

{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}

produces a plan like this:
{noformat}
STAGE PLANS:
  Stage: Stage-4
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
HashTable Sink Operator
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  Position of Big Table: 1

  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
File Output Operator
  Local Work:
Map Reduce Local Work
  Stage: Stage-0
Fetch Operator
{noformat}

Table src (alias a) is fetched and stored as-is in MRLocalTask. With this
patch, the plan can look like this:
{noformat}
  Stage: Stage-3
Map Reduce
  Alias - Map Operator Tree:
b 
  TableScan
alias: b
Map Join Operator
  condition map:
   Inner Join 0 to 1
  condition expressions:
0 {key} {value}
1 
  handleSkewJoin: false
  keys:
0 [Column[key]]
1 [Column[key]]
  outputColumnNames: _col0, _col1
  Position of Big Table: 1
  Select Operator
  File Output Operator
  Local Work:
Map Reduce Local Work
  Alias - Map Local Tables:
a 
  Fetch Operator
limit: -1
  Alias - Map Local Operator Tree:
a 
  TableScan
alias: a
  Has Any Stage Alias: false
  Stage: Stage-0
Fetch Operator
{noformat}


Diffs
-

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3bfd539 
  ql/src/java/org/apache/hadoop/hive/ql/exec/AbstractMapJoinOperator.java d8f4eb4
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableLoader.java a080fcc 
  ql/src/java/org/apache/hadoop/hive/ql/exec/HashTableSinkOperator.java aa8f19c 
  ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 1e0314d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java bdc85b9 
  ql/src/java/org/apache/hadoop/hive/ql/exec/TemporaryHashSinkOperator.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java 42d764d 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HashTableLoader.java efe5710 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/MapredLocalTask.java 0cc90d0
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/LocalMapJoinProcFactory.java 5a53e15
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/MapJoinResolver.java 010ac54
  ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableSinkDesc.java 14fced7 
  ql/src/java/org/apache/hadoop/hive/ql/plan/MapredLocalWork.java 83a778d 
  ql/src/test/queries/clientpositive/auto_join_without_localtask.q PRE-CREATION 
  ql/src/test/results/clientpositive/auto_join_without_localtask.q.out PRE-CREATION

Diff: https://reviews.apache.org/r/16728/diff/


Testing
---


Thanks,

Navis Ryu