Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/24089

to look at the new patch set (#15).

Change subject: WIP IMPALA-2744: Codegen for tuple DeepCopy - part1
......................................................................

WIP IMPALA-2744: Codegen for tuple DeepCopy - part1

Created a codegen'd version of BufferedTupleStream::DeepCopy.
The codegen'd function is currently used by PartitionedHashJoinBuilder.

TODO: Use it in other systems that use BufferedTupleStream:
  -AnalyticEvalNode
  -GroupingAggregator
  -SpillableRowBatchQueue/BufferedPlanRootSink

Reusing Tuple's TryDeepCopy* functions for BufferedTupleStream was
considered, but the stream keeps its own DeepCopy because the two
differ:
  -BufferedTupleStream doesn't copy tuples serially: it first
   copies the "fixed len" parts of all tuples, then all
   "string data" of all tuples, then all "collection data" of
   all tuples.
  -BufferedTupleStream's DeepCopy doesn't set the pointers of
   copied strings. This also applies when copying a string from
   a collection.
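
The grouped layout described above can be illustrated with a toy
sketch. It is not Impala's code or the code in this patch: ToyTuple,
FixedPart, CopyGrouped and ReadString are hypothetical names, collection
data is left out, and the on-page format is an assumption made only to
show the copy order and why no string pointer is written.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for a tuple: one fixed-length slot and one string.
struct ToyTuple {
  int32_t id;
  std::string str;
};

// Assumed fixed-length image written to the buffer. It stores only the
// string length; no pointer is written during the copy.
struct FixedPart {
  int32_t id;
  uint32_t str_len;
};

// Copies in two passes instead of tuple by tuple:
// pass 1 writes every fixed-length part, pass 2 writes every string payload.
std::vector<uint8_t> CopyGrouped(const std::vector<ToyTuple>& tuples) {
  size_t total = tuples.size() * sizeof(FixedPart);
  for (const ToyTuple& t : tuples) total += t.str.size();
  std::vector<uint8_t> out(total);
  uint8_t* pos = out.data();
  for (const ToyTuple& t : tuples) {  // pass 1: fixed-length parts
    FixedPart f{t.id, static_cast<uint32_t>(t.str.size())};
    std::memcpy(pos, &f, sizeof(f));
    pos += sizeof(f);
  }
  for (const ToyTuple& t : tuples) {  // pass 2: string data
    std::memcpy(pos, t.str.data(), t.str.size());
    pos += t.str.size();
  }
  return out;
}

// Recovers the string of tuple `index` at read time by summing the lengths
// of the preceding strings, which is why the copy can skip the pointers.
std::string ReadString(const std::vector<uint8_t>& buf, size_t num_tuples,
                       size_t index) {
  size_t offset = num_tuples * sizeof(FixedPart);
  FixedPart f{};
  for (size_t i = 0; i <= index; ++i) {
    std::memcpy(&f, buf.data() + i * sizeof(FixedPart), sizeof(FixedPart));
    if (i < index) offset += f.str_len;
  }
  return std::string(reinterpret_cast<const char*>(buf.data()) + offset,
                     f.str_len);
}
```

In this sketch the reader derives each string's location from the
lengths stored in the fixed-length parts, so the writer never has to
patch a pointer after copying the payload.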

Measurements:
  Measured with the following query:
    select straight_join l_orderkey, o_custkey, o_orderkey, l_partkey
      from tpch30.orders left join tpch30.lineitem on o_orderkey = l_orderkey
                         where o_totalprice<0

  Where tpch30 is generated by:
    bin/load-data.py -s 30 -f --workloads tpch
      --table_formats text/none,parquet/snap

  Before:
    BuildRowsPartitionTime: 2s530ms
  After:
    BuildRowsPartitionTime: 1s866ms

Testing:
  Added tests to buffered-tuple-stream-test.cc that compare the results
  of codegen'd and basic DeepCopy variations of BufferedTupleStream
  with different data types.

Change-Id: I63e32babdbaf56095478c6c66afb9cb91189f946
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
A be/src/exec/partitioned-hash-join-builder.inline.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/buffered-tuple-stream-ir.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/spillable-row-batch-queue.h
A be/src/runtime/tuple-row-ir.cc
M be/src/runtime/tuple-row.h
15 files changed, 736 insertions(+), 176 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/24089/15
--
To view, visit http://gerrit.cloudera.org:8080/24089
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I63e32babdbaf56095478c6c66afb9cb91189f946
Gerrit-Change-Number: 24089
Gerrit-PatchSet: 15
Gerrit-Owner: Balazs Hevele <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
