-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/7126/
-----------------------------------------------------------
Review request for hive.
Description
-------
This optimizer exploits intra-query correlations and merges multiple correlated
MapReduce jobs into one job. I opened a new review request because I have been
working on hive-git.
This addresses bug HIVE-2206.
https://issues.apache.org/jira/browse/HIVE-2206
Diffs
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 5efae89
ql/src/java/org/apache/hadoop/hive/ql/exec/BaseReduceSinkOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationCompositeOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationLocalSimulativeReduceSinkOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/CorrelationReducerDispatchOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/ExecReducer.java 283d0b6
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java e3ed13a
ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java f0c35e7
ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 0c22141
ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java a2caeed
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java 1a40630
ql/src/java/org/apache/hadoop/hive/ql/exec/TableScanOperator.java dffdd7b
ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizer.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/CorrelationOptimizerUtils.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/GenMapRedUtils.java 6bc5fe4
ql/src/java/org/apache/hadoop/hive/ql/optimizer/Optimizer.java 67d3a99
ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java 8bacd3d
ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java a65b0e4
ql/src/java/org/apache/hadoop/hive/ql/plan/BaseReduceSinkDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationCompositeDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationLocalSimulativeReduceSinkDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/CorrelationReducerDispatchDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/MapredWork.java 5f38bf2
ql/src/java/org/apache/hadoop/hive/ql/plan/ReduceSinkDesc.java 16eb125
ql/src/java/org/apache/hadoop/hive/ql/plan/TableScanDesc.java 9a95efd
ql/src/test/org/apache/hadoop/hive/ql/exec/TestExecDriver.java 142f040
ql/src/test/results/compiler/plan/groupby1.q.xml 4382252
ql/src/test/results/compiler/plan/groupby2.q.xml eef669c
ql/src/test/results/compiler/plan/groupby3.q.xml 9743480
ql/src/test/results/compiler/plan/groupby5.q.xml 8e07860
Diff: https://reviews.apache.org/r/7126/diff/
Testing
-------
I could not test TestHBaseMinimrCliDriver, TestHBaseCliDriver,
TestHBaseNegativeCliDriver, testSynchronized in TestEmbeddedHiveMetaStore,
testSynchronized in TestRemoteHiveMetaStore, testSynchronized in
TestSetUGIOnBothClientServer, testSynchronized in TestSetUGIOnOnlyClient,
testSynchronized in TestSetUGIOnOnlyServer, or
testNegativeCliDriver_local_mapred_error_cache in TestNegativeCliDriver. This
patch should pass all other tests.
When the optimizer is enabled (it is currently disabled by default), several
test cases fail. One is a case that the optimizer optimizes, one is not
suitable for this correlation optimizer, and two are due to potential bugs in
trunk. The other failures are parsing cases (XML plans); they are caused by my
minor changes in SemanticAnalyzer, because several redundant operators are
generated for the correlation optimizer. Overall, these failures are not very
relevant to the patch. Please see
https://issues.apache.org/jira/browse/HIVE-2206?focusedCommentId=13456171&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13456171
for details.
Thanks,
Yin Huai