Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2789: More compact mem layout with null bits at the end.
......................................................................


IMPALA-2789: More compact mem layout with null bits at the end.

There are two motivations for this change:
1. Reduce memory consumption.
2. Pave the way for full memory layout compatibility between
   Impala and Kudu to eventually enable zero-copy scans. This
   patch is only a first step towards that goal.

New Memory Layout:
Slots are placed in descending order by size with trailing bytes to
store null flags. Null flags are omitted for non-nullable slots. There
is no padding between tuples when stored back-to-back in a row batch.

Example: select bool_col, int_col, string_col, smallint_col
         from functional.alltypes
Slots:   string_col|int_col|smallint_col|bool_col|null_byte
Offsets: 0          16      20           22       23
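
The offsets in the example can be reproduced with a short sketch. This is
an illustration only, not the code from this patch: the Slot type, the
compute_layout() helper, and the order in which null bits are assigned
are hypothetical, and the slot sizes (16/4/2/1 bytes) are inferred from
the offsets shown above.

```python
from typing import NamedTuple


class Slot(NamedTuple):
    """Hypothetical slot description used only for this sketch."""
    name: str
    byte_size: int
    nullable: bool = True


def compute_layout(slots):
    """Return (offsets, null_indicators, tuple_byte_size).

    Slots are placed in descending order by size with no padding.
    Null indicator bytes follow the last slot, with one bit per
    nullable slot. Non-nullable slots get no null bit.
    """
    # sorted() is stable, so slots of equal size keep their input order.
    ordered = sorted(slots, key=lambda s: s.byte_size, reverse=True)

    offsets = {}
    offset = 0
    for slot in ordered:
        offsets[slot.name] = offset
        offset += slot.byte_size

    # Null bytes start right after the last slot.
    null_bytes_offset = offset
    null_indicators = {}  # name -> (byte offset, bit mask)
    bit = 0
    for slot in ordered:
        if not slot.nullable:
            continue
        # Assumption: bits are handed out in slot order, lowest bit first.
        null_indicators[slot.name] = (null_bytes_offset + bit // 8,
                                      1 << (bit % 8))
        bit += 1

    num_null_bytes = (bit + 7) // 8  # round up to whole bytes
    return offsets, null_indicators, null_bytes_offset + num_null_bytes


# The slots from the example query, in select-list order.
EXAMPLE_SLOTS = [
    Slot("bool_col", 1),
    Slot("int_col", 4),
    Slot("string_col", 16),
    Slot("smallint_col", 2),
]
```

Calling compute_layout(EXAMPLE_SLOTS) places string_col at 0, int_col at
16, smallint_col at 20, bool_col at 22, and a single null byte at 23, for
a tuple size of 24 bytes.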

The main change is to move the null indicators to the end of tuples.
The new memory layout is fully packed with no padding in between
slots or tuples.
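
As a rough sketch of what "fully packed" means for a reader of a row
batch, the snippet below addresses tuples stored back-to-back and tests
a trailing null bit. It reuses the 24-byte tuple from the example above.
The helper names, the little-endian encoding, and the bit mask chosen
for int_col are assumptions made for illustration, not Impala's API.

```python
import struct

TUPLE_SIZE = 24           # from the example: slots end at 23, plus 1 null byte
INT_COL_OFFSET = 16       # from the example
NULL_BYTE_OFFSET = 23     # from the example
INT_COL_NULL_MASK = 0x02  # hypothetical bit assignment for int_col


def tuple_start(row_idx, tuple_size=TUPLE_SIZE):
    """Byte offset of a tuple in a batch with no padding between tuples."""
    return row_idx * tuple_size


def set_null(buf, row_idx, null_byte_offset, bit_mask):
    """Set a null indicator bit in the trailing null byte of a tuple."""
    buf[tuple_start(row_idx) + null_byte_offset] |= bit_mask


def is_null(buf, row_idx, null_byte_offset, bit_mask):
    """Check a null indicator bit in the trailing null byte of a tuple."""
    return (buf[tuple_start(row_idx) + null_byte_offset] & bit_mask) != 0


def write_int(buf, row_idx, value):
    """Store a 4-byte int (little-endian assumed) in the int_col slot."""
    struct.pack_into("<i", buf, tuple_start(row_idx) + INT_COL_OFFSET, value)


def read_int(buf, row_idx):
    """Return int_col for a tuple, or None if its null bit is set."""
    if is_null(buf, row_idx, NULL_BYTE_OFFSET, INT_COL_NULL_MASK):
        return None
    return struct.unpack_from(
        "<i", buf, tuple_start(row_idx) + INT_COL_OFFSET)[0]


def make_example_batch():
    """Build a 3-tuple batch: row 1 holds 42, row 2 has a NULL int_col."""
    buf = bytearray(3 * TUPLE_SIZE)
    write_int(buf, 1, 42)
    set_null(buf, 2, NULL_BYTE_OFFSET, INT_COL_NULL_MASK)
    return buf
```

Because tuple i always starts at i * TUPLE_SIZE, a batch of N tuples
occupies exactly N * TUPLE_SIZE bytes.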

Performance:
Our standard cluster perf tests showed no significant difference in
query response times or consumed cycles, and a slight reduction in
peak memory consumption.

Testing:
An exhaustive test run passed. Also ran a few selected tests, such as
TPC-H and TPC-DS, locally with ASAN.

These follow-on changes are planned:
1. Planner needs to mark slots non-nullable if they correspond
   to a non-nullable Kudu column.
2. Update Kudu scan node to copy tuples with memcpy.
3. Kudu client needs to support transferring ownership of the
   tuple memory (maybe do direct and indirect buffers separately).
4. Update Kudu scan node to use memory transfer instead of copy.

Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Reviewed-on: http://gerrit.cloudera.org:8080/4673
Reviewed-by: Alex Behm <alex.b...@cloudera.com>
Tested-by: Internal Jenkins
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M be/src/exec/row-batch-list-test.cc
M be/src/exec/text-converter.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/collection-value-builder-test.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch-test.cc
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/desc-tbl-builder.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
26 files changed, 401 insertions(+), 311 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4673

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
