-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/65174/
-----------------------------------------------------------
(Updated May 31, 2018, 2:47 a.m.)
Review request for hive.
Changes
-------
I rebased the patch on the latest master branch.
Bugs: HIVE-17896
https://issues.apache.org/jira/browse/HIVE-17896
Repository: hive-git
Description
-------
For TPC-DS Query27, the TopN operation is delayed by the group-by: the
group-by operator buffers up all the rows before 99% of them are discarded
by the TopN Hash within the ReduceSink Operator.
The RS TopN operator is very restrictive, as it only supports filtering on
the shuffle keys. It is better to do this filtering before the vectors are
broken into rows and lose their isRepeating properties.
Adding a TopN Key operator in the physical operator tree allows the following
to happen.
GBY->RS(Top=1)
can become
TNK(1)->GBY->RS(Top=1)
This way, the TopNKey operator can remove rows before they are buffered into
the GBY and consume memory.
Here's the equivalent implementation in Presto:
https://github.com/prestodb/presto/blob/master/presto-main/src/main/java/com/facebook/presto/operator/TopNOperator.java#L35
Adding this as a sub-feature of GroupBy would prevent further optimizations
when the GBY is on keys "a,b,c" and the TopNKey is on just "a".
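The filtering idea described above can be illustrated with a minimal sketch.
This is not the code from the patch: the class name TopNKeyFilter, the method
canForward, and the use of a TreeSet are all assumptions made for
illustration. The sketch tracks the N smallest distinct keys seen so far and
tells the caller whether a row with a given key could still belong to the
final top N, so that other rows can be dropped before they reach the GBY.

```java
import java.util.Comparator;
import java.util.TreeSet;

/**
 * Hypothetical illustration of a TopN key filter (not the patch's code).
 * It remembers the N smallest distinct keys seen so far, according to the
 * supplied comparator, and reports whether a row may still be needed.
 */
final class TopNKeyFilter<K> {
    private final int topN;
    private final Comparator<? super K> comparator;
    private final TreeSet<K> keys;

    TopNKeyFilter(int topN, Comparator<? super K> comparator) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be positive");
        }
        this.topN = topN;
        this.comparator = comparator;
        this.keys = new TreeSet<>(comparator);
    }

    /** Returns true if a row with this key should be passed downstream. */
    boolean canForward(K key) {
        // A key that is already tracked is always forwarded.
        if (keys.contains(key)) {
            return true;
        }
        // Fewer than N distinct keys seen so far: track it and forward.
        if (keys.size() < topN) {
            keys.add(key);
            return true;
        }
        // The set is full: forward only if the key beats the current worst.
        K worst = keys.last();
        if (comparator.compare(key, worst) < 0) {
            keys.pollLast();
            keys.add(key);
            return true;
        }
        // The key cannot be among the top N, so the row can be dropped.
        return false;
    }
}
```

Rows forwarded early may later be displaced by smaller keys, so in this
sketch the filter only reduces the input; the exact TopN is still expected to
be applied downstream, as in the TNK(1)->GBY->RS(Top=1) plan above.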
Diffs (updated)
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 3295d1dbc5
itests/src/test/resources/testconfiguration.properties 6a70a4a6bd
ql/src/gen/thrift/gen-javabean/org/apache/hadoop/hive/ql/plan/api/OperatorType.java
a002348013
ql/src/java/org/apache/hadoop/hive/ql/exec/KeyWrapperFactory.java 3c7f0b78c2
ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 7bb6590d5e
ql/src/java/org/apache/hadoop/hive/ql/exec/TopNKeyOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorTopNKeyOperator.java
PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/TopNKeyProcessor.java
PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/TopNKeyPushdown.java
PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java
394f826508
ql/src/java/org/apache/hadoop/hive/ql/parse/TezCompiler.java dfd790853b
ql/src/java/org/apache/hadoop/hive/ql/plan/TopNKeyDesc.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/plan/VectorTopNKeyDesc.java
PRE-CREATION
ql/src/test/queries/clientpositive/topnkey.q PRE-CREATION
ql/src/test/queries/clientpositive/vector_topnkey.q PRE-CREATION
ql/src/test/results/clientpositive/llap/topnkey.q.out PRE-CREATION
ql/src/test/results/clientpositive/llap/vector_topnkey.q.out PRE-CREATION
ql/src/test/results/clientpositive/tez/topnkey.q.out PRE-CREATION
ql/src/test/results/clientpositive/tez/vector_topnkey.q.out PRE-CREATION
ql/src/test/results/clientpositive/topnkey.q.out PRE-CREATION
ql/src/test/results/clientpositive/vector_topnkey.q.out PRE-CREATION
serde/src/java/org/apache/hadoop/hive/serde2/objectinspector/ObjectInspectorUtils.java
a442cb1228
Diff: https://reviews.apache.org/r/65174/diff/2/
Changes: https://reviews.apache.org/r/65174/diff/1-2/
Testing
-------
Thanks,
Teddy Choi