-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13059/
-----------------------------------------------------------
(Updated Oct. 3, 2013, 2:17 p.m.)
Review request for hive, Eric Hanson and Jitendra Pandey.
Bugs: HIVE-4850
https://issues.apache.org/jira/browse/HIVE-4850
Repository: hive-git
Description
-------
This is not the final iteration, but I thought it would be easier to discuss
it with a review.
This implementation works and handles multiple aliases and multiple values per
key. It uses the existing hash tables saved by the local task for the map
join, which are row-mode hash tables (they have row-mode keys and store
row-mode writable object values). Going forward we should avoid the
size-of-big-table conversions of big table keys to row mode and the conversion
of small table values to vector data. This would require either converting the
hash tables on the fly to vector-friendly ones (when loaded) or changing the
local task hashtable sink to create a vectorization-friendly hash table. The
first approach may have memory consumption problems: potentially two hash
tables end up in memory, so we would have to stream the transformation or
transform while reading from the serialized format... nasty.
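To make the conversion cost concrete, here is a minimal sketch of the probe
pattern described above. It is not the code in this patch: the class and
method names (RowModeProbeSketch, JoinedBatch, probe) are hypothetical, a plain
java.util.HashMap stands in for the row-mode hash table, and only a single
alias with long keys and String values is shown.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; none of these names are Hive classes. */
final class RowModeProbeSketch {

    /** Joined output as parallel column arrays, one entry per joined row. */
    static final class JoinedBatch {
        final long[] keys;
        final String[] values;
        final int size;

        JoinedBatch(long[] keys, String[] values, int size) {
            this.keys = keys;
            this.values = values;
            this.size = size;
        }
    }

    /**
     * Probes a row-mode hash table with a batch of primitive big-table keys.
     * Assumption: inner join semantics, so unmatched keys are dropped.
     */
    static JoinedBatch probe(long[] bigTableKeys, int batchSize,
                             Map<Long, List<String>> smallTable) {
        List<Long> outKeys = new ArrayList<>();
        List<String> outValues = new ArrayList<>();
        for (int i = 0; i < batchSize; i++) {
            // Conversion 1: every big-table key is turned into a row-mode
            // (boxed) key before the lookup. This runs once per big-table row.
            Long rowModeKey = Long.valueOf(bigTableKeys[i]);
            List<String> matches = smallTable.get(rowModeKey);
            if (matches == null) {
                continue;
            }
            // Conversion 2: each matching row-mode value is copied back into
            // column form. A key may carry several values.
            for (String value : matches) {
                outKeys.add(rowModeKey);
                outValues.add(value);
            }
        }
        int n = outKeys.size();
        long[] keyColumn = new long[n];
        String[] valueColumn = new String[n];
        for (int i = 0; i < n; i++) {
            keyColumn[i] = outKeys.get(i);
            valueColumn[i] = outValues.get(i);
        }
        return new JoinedBatch(keyColumn, valueColumn, n);
    }

    public static void main(String[] args) {
        Map<Long, List<String>> smallTable = new HashMap<>();
        smallTable.put(1L, Arrays.asList("a", "b"));
        smallTable.put(3L, Arrays.asList("c"));
        JoinedBatch out = probe(new long[] {1L, 2L, 3L}, 3, smallTable);
        for (int i = 0; i < out.size; i++) {
            System.out.println(out.keys[i] + " -> " + out.values[i]);
        }
    }
}
```

Both conversions sit inside the per-row loop, which is the overhead a
vector-friendly hash table would be meant to remove.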
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java d320b47
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinOperator.java 86db044
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 153b8ea
ql/src/java/org/apache/hadoop/hive/ql/exec/OperatorFactory.java 8ab5395
ql/src/java/org/apache/hadoop/hive/ql/exec/SMBMapJoinOperator.java cde1a59
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 8b4c615
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorHashKeyWrapperBatch.java 9955d09
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorReduceSinkOperator.java 6df3551
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java 02ebe14
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java ff13f89
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java 9e189c9
ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java df1c5a6
ql/src/java/org/apache/hadoop/hive/ql/plan/MapWork.java a72ec8b
Diff: https://reviews.apache.org/r/13059/diff/
Testing
-------
Manually ran some join queries on the alltypes_orc table.
Thanks,
Remus Rusanu