-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17899/
-----------------------------------------------------------
Review request for hive, Brock Noland, Eric Hanson, and Jitendra Pandey.
Bugs: HIVE-5998
https://issues.apache.org/jira/browse/HIVE-5998
Repository: hive-git
Description
-------
The implementation is straightforward and very simple, but offers all the
benefits of vectorization possible with a 'shallow' vectorized reader (i.e. one
that does not go into parquet-mr project changes). The only complication arose
because of discrepancies between the object inspector seen by the input format
and the actual output provided by the Parquet readers (e.g. the OI declares
'byte' primitives but the Parquet reader outputs IntWritable). I had to create
a just-in-time VectorColumnAssigner collection based on whatever writers the
Parquet record reader provides. It is assumed the reader does not change its
output during the iteration.
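To illustrate the just-in-time idea, here is a minimal sketch and not the code
in the diff. It assumes plain Java boxed types as stand-ins for Hadoop
Writables and long[]/double[] arrays as stand-ins for Hive column vectors; the
names JitAssignSketch, ColumnAssigner, Batch, buildAssigners and fill are all
hypothetical. The assigners are chosen from the runtime class of the first
row's values rather than from the declared schema, then reused for every later
row, which relies on the assumption above that the reader does not change its
output types.

```java
// Hypothetical sketch: boxed Java types stand in for Hadoop Writables,
// and long[]/double[] arrays stand in for Hive column vectors.
final class JitAssignSketch {

    /** Copies one value into row `row` of a single output column. */
    interface ColumnAssigner {
        void assign(int row, Object value);
    }

    /** A batch of `size` rows with one long[] and one double[] slot per column. */
    static final class Batch {
        final long[][] longCols;
        final double[][] doubleCols;

        Batch(int columns, int size) {
            longCols = new long[columns][size];
            doubleCols = new double[columns][size];
        }
    }

    /**
     * Picks an assigner per column from the runtime class of the first row's
     * values, ignoring whatever type the schema declares for that column.
     */
    static ColumnAssigner[] buildAssigners(Batch batch, Object[] firstRow) {
        ColumnAssigner[] assigners = new ColumnAssigner[firstRow.length];
        for (int i = 0; i < firstRow.length; i++) {
            final int col = i;
            Object v = firstRow[i];
            if (v instanceof Byte || v instanceof Short
                    || v instanceof Integer || v instanceof Long) {
                // Integral values all widen into the long column.
                assigners[i] = (row, value) ->
                        batch.longCols[col][row] = ((Number) value).longValue();
            } else if (v instanceof Float || v instanceof Double) {
                assigners[i] = (row, value) ->
                        batch.doubleCols[col][row] = ((Number) value).doubleValue();
            } else {
                // This sketch does not handle nulls or other types.
                throw new IllegalArgumentException(
                        "Unsupported value in column " + i + ": " + v);
            }
        }
        return assigners;
    }

    /** Fills a batch from a non-empty set of rows, building assigners lazily. */
    static Batch fill(Object[][] rows) {
        Batch batch = new Batch(rows[0].length, rows.length);
        ColumnAssigner[] assigners = null;
        for (int r = 0; r < rows.length; r++) {
            if (assigners == null) {
                // Built once, on the first row, then reused for the whole batch.
                assigners = buildAssigners(batch, rows[r]);
            }
            for (int c = 0; c < assigners.length; c++) {
                assigners[c].assign(r, rows[r][c]);
            }
        }
        return batch;
    }
}
```

In this sketch a column declared as 'byte' but delivered as Integer still lands
in the long column, because the choice is driven by what the reader actually
returns. A null in the first row is rejected here; real code would need a
fallback for that case.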
Diffs
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java
d1a75df
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch.java
0b504de
ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatchCtx.java
f513188
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java
d3412df
ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java
PRE-CREATION
Diff: https://reviews.apache.org/r/17899/diff/
Testing
-------
Manually tested. I will add a .q query test, but I need to get home (to my Mac)
where I can actually run tests and create the expected output(s).
Thanks,
Remus Rusanu