Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2789: More compact mem layout with null bits at the end.
......................................................................

IMPALA-2789: More compact mem layout with null bits at the end.

There are two motivations for this change:
1. Reduce memory consumption.
2. Pave the way for full memory layout compatibility between Impala
   and Kudu to eventually enable zero-copy scans. This patch is only
   a first step towards that goal.

New Memory Layout
Slots are placed in descending order by size, followed by trailing
bytes that store the null flags. Null flags are omitted for
non-nullable slots. There is no padding between tuples when they are
stored back-to-back in a row batch.

Example:
select bool_col, int_col, string_col, smallint_col
from functional.alltypes

Slots:   string_col|int_col|smallint_col|bool_col|null_byte
Offsets: 0          16      20           22       23

The main change is to move the null indicators to the end of tuples.
The new memory layout is fully packed, with no padding between slots
or tuples.

Performance:
Our standard cluster perf tests showed no significant difference in
query response times or consumed cycles, and a slight reduction in
peak memory consumption.

Testing:
An exhaustive test run passed. Ran a few select tests like TPC-H/DS
with ASAN locally.

These follow-on changes are planned:
1. Planner needs to mark slots non-nullable if they correspond to
   a non-nullable Kudu column.
2. Update Kudu scan node to copy tuples with memcpy.
3. Kudu client needs to support transferring ownership of the tuple
   memory (maybe do direct and indirect buffers separately).
4. Update Kudu scan node to use memory transfer instead of copy.

Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Reviewed-on: http://gerrit.cloudera.org:8080/4673
Reviewed-by: Alex Behm <alex.b...@cloudera.com>
Tested-by: Internal Jenkins
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-scanner.h
M be/src/exec/row-batch-list-test.cc
M be/src/exec/text-converter.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/collection-value-builder-test.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch-test.cc
M be/src/runtime/tuple.cc
M be/src/runtime/tuple.h
M be/src/service/frontend.cc
M be/src/service/frontend.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/desc-tbl-builder.h
M common/thrift/Frontend.thrift
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzerTest.java
26 files changed, 401 insertions(+), 311 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/4673

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib6510c75d841bddafa6638f1bd2ac6731a7053f6
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
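
Illustrative sketch of the "New Memory Layout" rule from the commit message (slots in descending order by size, null flags in trailing bytes, no padding). This is not the code from the patch: the Slot and TupleLayout structs and the ComputeLayout function are hypothetical names made up for illustration. It also assumes one null bit per nullable slot, rounded up to whole bytes, and that equal-sized slots keep their input order; the commit message does not spell out either detail.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical slot description (not Impala's SlotDescriptor).
struct Slot {
  std::string name;
  std::size_t byte_size;  // fixed-length size of the slot inside the tuple
  bool nullable;
  std::size_t offset;     // filled in by ComputeLayout()
};

// Hypothetical result type describing one tuple's layout.
struct TupleLayout {
  std::vector<Slot> slots;        // slots in memory order
  std::size_t null_bytes_offset;  // where the trailing null flags begin
  std::size_t num_null_bytes;     // assumed: ceil(#nullable slots / 8)
  std::size_t byte_size;          // total tuple size, no trailing padding
};

TupleLayout ComputeLayout(std::vector<Slot> slots) {
  // Largest slots first. stable_sort keeps the input order for ties,
  // which is an assumption made for this sketch.
  std::stable_sort(slots.begin(), slots.end(),
                   [](const Slot& a, const Slot& b) {
                     return a.byte_size > b.byte_size;
                   });

  std::size_t offset = 0;
  std::size_t num_nullable = 0;
  for (Slot& slot : slots) {
    assert(slot.byte_size > 0);
    slot.offset = offset;      // fully packed: no padding between slots
    offset += slot.byte_size;
    if (slot.nullable) ++num_nullable;
  }

  // Null flags go after the last slot; non-nullable slots get none.
  const std::size_t num_null_bytes = (num_nullable + 7) / 8;
  return TupleLayout{std::move(slots), offset, num_null_bytes,
                     offset + num_null_bytes};
}
```

Feeding in the four columns of the example query, with a 16-byte string slot (a size inferred from int_col starting at offset 16) and all slots treated as nullable, yields string_col at 0, int_col at 16, smallint_col at 20, bool_col at 22 and the null byte at 23, for a 24-byte tuple.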