-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/13059/
-----------------------------------------------------------
Review request for hive, Eric Hanson and Jitendra Pandey.
Bugs: HIVE-4850
https://issues.apache.org/jira/browse/HIVE-4850
Repository: hive-git
Description
-------
This is not the final iteration, but I thought it would be easier to discuss it
with a review.
This implementation works, handles multiple aliases and multiple values per
key. The implementation uses the existing hash tables saved by the local task
for the map join, which are row mode hash tables (have row mode keys and store
row mode writable object values). Going forward we should avoid the
size-of-big-table conversions of big table keys to row-mode and conversion of
small table values to vector data. This would require either converting
on-the-fly the hash tables to vector-friendly ones (when loaded) or changing
the local task hashtable sink to create a vectorization-friendly hash table.
The first approach may have memory consumption problems (potentially two hash
tables end up in memory; we would have to stream the transformation or
transform while reading from the serialized format... nasty).
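To make the per-row conversion cost described above concrete, here is a minimal
sketch of probing a row-mode hash table from a vectorized batch. It is not the
code in this patch: the class RowModeProbeSketch, the Batch type and the
probe() method are hypothetical stand-ins, a plain java.util.HashMap plays the
role of the hash table saved by the local task, and only a single long key
column and a single long value column are modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RowModeProbeSketch {

    /** Hypothetical stand-in for a vectorized batch with one long key column. */
    static final class Batch {
        final long[] keys;
        final int size;

        Batch(long[] keys) {
            this.keys = keys;
            this.size = keys.length;
        }
    }

    /**
     * Inner-join probe of a row-mode hash table, one big-table row at a time.
     * Returns two output columns: out[0] holds the keys, out[1] the matched
     * small-table values.
     */
    static long[][] probe(Batch batch, Map<Long, List<Object[]>> smallTable) {
        List<Long> outKeys = new ArrayList<>();
        List<Long> outVals = new ArrayList<>();
        for (int i = 0; i < batch.size; i++) {
            // Conversion 1: the vector key is boxed into a row-mode key
            // object for every big-table row.
            Long rowKey = Long.valueOf(batch.keys[i]);
            List<Object[]> matches = smallTable.get(rowKey);
            if (matches == null) {
                continue; // no match: an inner join drops the row
            }
            // A key may map to several small-table rows.
            for (Object[] row : matches) {
                outKeys.add(rowKey);
                // Conversion 2: the row-mode value object is unpacked into
                // column (vector) data.
                outVals.add((Long) row[0]);
            }
        }
        long[][] out = new long[2][outKeys.size()];
        for (int j = 0; j < outKeys.size(); j++) {
            out[0][j] = outKeys.get(j);
            out[1][j] = outVals.get(j);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<Long, List<Object[]>> smallTable = new HashMap<>();
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] {10L});
        rows.add(new Object[] {11L});
        smallTable.put(1L, rows);

        long[][] out = probe(new Batch(new long[] {1L, 2L, 1L}), smallTable);
        System.out.println(Arrays.toString(out[0])); // [1, 1, 1, 1]
        System.out.println(Arrays.toString(out[1])); // [10, 11, 10, 11]
    }
}
```

Both conversions sit inside the per-row loop, so their cost grows with the size
of the big table. A vectorization-friendly hash table would presumably let the
probe work on the primitive columns directly, which is the direction suggested
above.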
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/CommonJoinOperator.java 82d4b93
ql/src/java/org/apache/hadoop/hive/ql/exec/JoinUtil.java 31dbf41
ql/src/java/org/apache/hadoop/hive/ql/exec/MapJoinOperator.java 4da1be8
ql/src/java/org/apache/hadoop/hive/ql/exec/Operator.java 29de38d
ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java e579c00
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinDoubleKeys.java
d774226
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectKey.java
791bb3f
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinObjectValue.java
58a9dc0
ql/src/java/org/apache/hadoop/hive/ql/exec/persistence/MapJoinSingleKey.java
4bff936
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/ColumnVector.java 8b4c615
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssign.java
PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java
PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorExecMapper.java
083b9b9
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapJoinOperator.java
PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorMapOperator.java
41d2001
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizationContext.java
9c90230
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java
ff13f89
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/VectorExpressionWriterFactory.java
9e189c9
ql/src/java/org/apache/hadoop/hive/ql/plan/HashTableDummyDesc.java f15ce48
Diff: https://reviews.apache.org/r/13059/diff/
Testing
-------
Manually ran some join queries on the alltypes_orc table.
Thanks,
Remus Rusanu