Quanlong Huang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/18359 )

Change subject: IMPALA-11204: template implementation for OrcStringColumnReader
......................................................................

IMPALA-11204: template implementation for OrcStringColumnReader

Some checks in OrcStringColumnReader::ReadValue() have results that can
be determined outside the scope of the method. They are worth
optimizing, since this is a critical method that is executed for each
row (and for each string column). With these checks, the method is too
complex for the compiler to inline into
OrcBatchedReader::ReadValueBatch().

This patch templates OrcStringColumnReader with two parameters: one for
whether the column is dictionary encoded, the other for the target slot
type (i.e. STRING/CHAR/VARCHAR). The compiler is able to inline
OrcStringColumnReader::ReadValue() after this patch.

The encoding of a column can differ between ORC stripes, so we have to
re-create the column readers for each stripe. Note that we already do
so for orc::RowReader. This patch therefore changes the life-cycle of
OrcColumnReaders to match the processing of each stripe. They are now
managed by std::unique_ptr. This requires OrcStructReader to be defined
earlier than HdfsOrcScanner, so we include orc-column-readers.h in
hdfs-orc-scanner.h and move all code in orc-column-readers.h that
depends on the scanner implementation to the source file.

Ran a single node perf test on TPCH(30) on my dev box using 3 impalad
instances. There are some improvements and no significant regressions:
+----------+--------+-------------+------------+
| Query    | Avg(s) | Base Avg(s) | Delta(Avg) |
+----------+--------+-------------+------------+
| TPCH-Q19 | 5.42   | 5.78        | I -6.21%   |
| TPCH-Q4  | 3.43   | 3.69        | I -7.25%   |
| TPCH-Q6  | 2.25   | 2.45        | I -8.18%   |
| TPCH-Q12 | 3.95   | 4.54        | I -13.04%  |
+----------+--------+-------------+------------+
File Format: orc/snap/block

Tests:
 - Ran CORE tests.

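For readers unfamiliar with the technique, here is a minimal standalone
sketch of the idea described above. It is not the code in this patch:
Batch, StringReaderBase, StringReader, CreateReader and the CHAR/VARCHAR
handling are all hypothetical, simplified stand-ins. It only illustrates
how moving the "dictionary encoded?" and "slot type?" decisions into
template parameters leaves a branch-free per-row ReadValue() that the
compiler can inline into the batch loop.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not Impala's real types.
enum class SlotType { STRING, CHAR, VARCHAR };

struct Batch {
  std::vector<std::string> dict;          // dictionary entries (dictionary encoding)
  std::vector<std::size_t> dict_indices;  // per-row index into dict
  std::vector<std::string> values;        // per-row values (direct encoding)
  std::size_t num_rows = 0;
};

class StringReaderBase {
 public:
  virtual ~StringReaderBase() = default;
  // The virtual call happens once per batch, not once per row.
  virtual void ReadValueBatch(const Batch& batch,
                              std::vector<std::string>* out) const = 0;
};

template <bool DICT_ENCODED, SlotType SLOT>
class StringReader final : public StringReaderBase {
 public:
  explicit StringReader(std::size_t max_len) : max_len_(max_len) {}

  void ReadValueBatch(const Batch& batch,
                      std::vector<std::string>* out) const override {
    for (std::size_t row = 0; row < batch.num_rows; ++row) {
      out->push_back(ReadValue(batch, row));
    }
  }

 private:
  // Both conditions below are compile-time constants, so each instantiation
  // keeps a single path and the per-row body stays small.
  inline std::string ReadValue(const Batch& batch, std::size_t row) const {
    std::string v;
    if (DICT_ENCODED) {
      assert(batch.dict_indices[row] < batch.dict.size());
      v = batch.dict[batch.dict_indices[row]];
    } else {
      v = batch.values[row];
    }
    if (SLOT == SlotType::CHAR) {
      v.resize(max_len_, ' ');  // truncate or space-pad to the fixed length
    } else if (SLOT == SlotType::VARCHAR) {
      if (v.size() > max_len_) v.resize(max_len_);  // truncate only
    }
    return v;
  }

  std::size_t max_len_;
};

// Runtime values are turned into template arguments exactly once, when the
// reader is created. max_len is ignored for STRING slots.
template <bool DICT_ENCODED>
std::unique_ptr<StringReaderBase> CreateForSlot(SlotType slot, std::size_t max_len) {
  switch (slot) {
    case SlotType::STRING:
      return std::make_unique<StringReader<DICT_ENCODED, SlotType::STRING>>(max_len);
    case SlotType::CHAR:
      return std::make_unique<StringReader<DICT_ENCODED, SlotType::CHAR>>(max_len);
    case SlotType::VARCHAR:
      return std::make_unique<StringReader<DICT_ENCODED, SlotType::VARCHAR>>(max_len);
  }
  return nullptr;
}

std::unique_ptr<StringReaderBase> CreateReader(bool dict_encoded, SlotType slot,
                                               std::size_t max_len) {
  return dict_encoded ? CreateForSlot<true>(slot, max_len)
                      : CreateForSlot<false>(slot, max_len);
}
```

In this sketch, a caller would invoke CreateReader() again at the start
of each stripe, because the encoding (and therefore the right template
instantiation) may differ from one stripe to the next. Holding the
result in a std::unique_ptr means the previous stripe's reader is
released automatically when it is replaced.
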
Change-Id: I166b8ad3a959e97a3911da968b8e76bc337e5fa4
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/orc-column-readers.h
4 files changed, 240 insertions(+), 195 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/18359/1
--
To view, visit http://gerrit.cloudera.org:8080/18359
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I166b8ad3a959e97a3911da968b8e76bc337e5fa4
Gerrit-Change-Number: 18359
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang <huangquanl...@gmail.com>