-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14162/
-----------------------------------------------------------
Review request for hive, Ashutosh Chauhan and Owen O'Malley.


Bugs: HIVE-4340
    https://issues.apache.org/jira/browse/HIVE-4340


Repository: hive-git


Description
-------

ORC's SerDe currently does nothing, and hence does not calculate a raw data size. WriterImpl, however, has enough information to provide one.

WriterImpl should compute a raw data size for each row, aggregate the sizes per stripe, and record the total in the stripe information, as RC currently does in its key header. It should also give the FileSinkOperator access to the per-row size.

FileSinkOperator should be able to get the raw data size from the RecordWriter when the RecordWriter can provide it, and from the SerDe otherwise.


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/FileSinkOperator.java bcee201
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/BinaryColumnStatistics.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ColumnStatisticsImpl.java 6268617
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/OrcOutputFormat.java c80fb02
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Reader.java 90260fd
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/ReaderImpl.java c454f32
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/StringColumnStatistics.java 72e779a
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/Writer.java 8e74b91
  ql/src/java/org/apache/hadoop/hive/ql/io/orc/WriterImpl.java 44961ce
  ql/src/protobuf/org/apache/hadoop/hive/ql/io/orc/orc_proto.proto edbf822
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcFile.java e6569f4
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcNullOptimization.java b93db84
  ql/src/test/org/apache/hadoop/hive/ql/io/orc/TestOrcSerDeStats.java PRE-CREATION
  ql/src/test/resources/orc-file-dump-dictionary-threshold.out 003c132
  ql/src/test/resources/orc-file-dump.out fac5326
  serde/src/java/org/apache/hadoop/hive/serde2/SerDeStats.java 1c09dc3

Diff: https://reviews.apache.org/r/14162/diff/


Testing
-------

All unit tests and q file tests related to ORC are passing.


Thanks,

Prasanth_J
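

Illustration
-------

For readers who have not opened the diff, the accounting described in the Description can be sketched roughly as below. This is not the code in the patch. Every name here (RawSizeProvider, RawDataSizeTracker, RawSizeSource) is hypothetical, and the per-type size rules are assumptions chosen only to keep the example small. The sketch shows a writer that sizes each row, sums the sizes per stripe, and exposes the last row's size, plus a caller that prefers the writer's value and falls back to the SerDe's.

```java
// Hypothetical sketch: none of these names come from the patch or from Hive.

/** Assumed marker for a writer that can report the raw size of its last row. */
interface RawSizeProvider {
    long getLastRowSize();
}

/** Tracks raw data size per row, per open stripe, and per file. */
final class RawDataSizeTracker implements RawSizeProvider {
    private long lastRowSize; // raw size of the most recently added row
    private long stripeSize;  // running total for the stripe being written
    private long fileSize;    // total over all stripes flushed so far

    /** Assumed sizing rules, for illustration only. */
    static long estimate(Object value) {
        if (value == null) return 0L;
        if (value instanceof Long || value instanceof Double) return 8L;
        if (value instanceof Integer || value instanceof Float) return 4L;
        if (value instanceof Boolean) return 1L;
        // Assumption: a string counts one unit per character.
        if (value instanceof String) return ((String) value).length();
        throw new IllegalArgumentException("unsupported type: " + value.getClass());
    }

    /** Sizes one row and adds it to the open stripe's total. */
    void addRow(Object... fields) {
        long size = 0L;
        for (Object field : fields) {
            size += estimate(field);
        }
        lastRowSize = size;
        stripeSize += size;
    }

    @Override
    public long getLastRowSize() {
        return lastRowSize;
    }

    /** Closes the open stripe and returns the total to record for it. */
    long flushStripe() {
        long flushed = stripeSize;
        fileSize += flushed;
        stripeSize = 0L;
        return flushed;
    }

    long getFileSize() {
        return fileSize;
    }
}

/** Picks the writer's raw size when it is available, else the SerDe's. */
final class RawSizeSource {
    static long rawSizeFor(Object recordWriter, long serDeRawSize) {
        if (recordWriter instanceof RawSizeProvider) {
            return ((RawSizeProvider) recordWriter).getLastRowSize();
        }
        return serDeRawSize;
    }
}
```

Under these assumed rules, a row of (1, "abc", null) sizes to 4 + 3 + 0 = 7, and flushing a stripe returns the sum of its rows and resets the running total. The instanceof check stands in for whatever mechanism the operator actually uses to detect a stats-capable RecordWriter.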