-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14162/
-----------------------------------------------------------
(Updated Sept. 16, 2013, 10:10 p.m.)
Review request for hive, Ashutosh Chauhan and Owen O'Malley.
Changes
-------
Added the UNION case to the ORC writer's raw data size computation.
Bugs: HIVE-4340
https://issues.apache.org/jira/browse/HIVE-4340
Repository: hive-git
Description
-------
ORC's SerDe currently does nothing, and hence does not calculate a raw data
size. WriterImpl, however, has enough information to provide one.
WriterImpl should compute a raw data size for each row, aggregate the sizes
per stripe, and record the total in the stripe information, as RC currently
does in its key header. It should also give the FileSinkOperator access to the
per-row size.
FileSinkOperator should be able to get the raw data size from either the SerDe
or the RecordWriter when the RecordWriter can provide it.
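As a rough illustration of the per-row and per-stripe accounting described
above, here is a minimal sketch. It is not the code in this patch: the class
name RawDataSizeSketch, the per-type size estimates, and the fixed
rows-per-stripe boundary are all assumptions made for the example (the real
writer decides stripe boundaries differently and has its own size rules).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: the class name, the size estimates, and the
// row-count stripe boundary are assumptions, not the code in this patch.
class RawDataSizeSketch {
    private final int rowsPerStripe;
    private int rowsInStripe = 0;
    private long currentStripeSize = 0;
    private final List<Long> stripeSizes = new ArrayList<>();

    RawDataSizeSketch(int rowsPerStripe) {
        this.rowsPerStripe = rowsPerStripe;
    }

    // Estimate the raw (uncompressed, unencoded) size of one value in bytes.
    static long rawSize(Object value) {
        if (value == null) {
            return 0; // assumption: nulls contribute nothing
        } else if (value instanceof Long || value instanceof Double) {
            return 8;
        } else if (value instanceof Integer) {
            return 4;
        } else if (value instanceof String) {
            return ((String) value).length(); // assumption: one byte per char
        } else if (value instanceof byte[]) {
            return ((byte[]) value).length;
        } else if (value instanceof List) {
            // Compound values (e.g. a struct or list) sum their children.
            long total = 0;
            for (Object child : (List<?>) value) {
                total += rawSize(child);
            }
            return total;
        }
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    // Add a row, fold its size into the current stripe, and return the
    // per-row size so a caller (e.g. a file sink) can collect it.
    long addRow(List<?> row) {
        long size = rawSize(row);
        currentStripeSize += size;
        rowsInStripe++;
        if (rowsInStripe == rowsPerStripe) {
            flushStripe();
        }
        return size;
    }

    // Record the aggregated size for the stripe being closed.
    void flushStripe() {
        if (rowsInStripe > 0) {
            stripeSizes.add(currentStripeSize);
            rowsInStripe = 0;
            currentStripeSize = 0;
        }
    }

    List<Long> stripeSizes() {
        return stripeSizes;
    }
}
```

In this sketch the caller gets the per-row size back from addRow, and the
per-stripe totals are available once each stripe is flushed.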
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java bcee201
ql/src/java/org/apache/hadoop/hive/ql/io/orc/BinaryColumnStatistics.java PRE-CREATION
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java 6268617
ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c80fb02
ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java 90260fd
ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java c454f32
ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringColumnStatistics.java 72e779a
ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java 8e74b91
ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 44961ce
ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto edbf822
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java e6569f4
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java b93db84
ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcSerDeStats.java PRE-CREATION
ql/src/test/resources/orc-file-dump-dictionary-threshold.out 003c132
ql/src/test/resources/orc-file-dump.out fac5326
serde/src/java/org/apache/hadoop/hive/serde2/SerDeStats.java 1c09dc3
Diff: https://reviews.apache.org/r/14162/diff/
Testing
-------
All unit tests and q file tests related to ORC are passing.
Thanks,
Prasanth_J