Omid Shahidi has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
......................................................................

IMPALA-6684: Fix untracked memory in KRPC

During serialization of a row batch header, a tuple_data_ buffer is
created to hold the compressed tuple data for an outbound row batch.
This tuple data should be tracked, as it accounts for a significant
portion of the untracked memory in the KRPC data stream sender. By using
a free pool, we can allocate the tuple data and the compression scratch
and account for both in the memory tracker of the KrpcDataStreamSender.
This change adds an RAII class responsible for the memory allocation and
changes the existing code to use a char buffer pointed to by a
char* tuple_data_ instead of the previously used std::string
tuple_data_. The thrift implementation is left unchanged and the
protobuf implementation is separated.
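
The RAII idea described above can be illustrated with a minimal,
standalone sketch. It is not the code in this patch: SimpleMemTracker
and TrackedBuffer are hypothetical names, and plain malloc/free stands
in for the free pool. It only shows how tying the allocation and its
accounting to one object's lifetime keeps the tracker in sync.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for a memory tracker: counts the bytes in use.
class SimpleMemTracker {
 public:
  void Consume(int64_t bytes) { consumption_ += bytes; }
  void Release(int64_t bytes) { consumption_ -= bytes; }
  int64_t consumption() const { return consumption_; }

 private:
  int64_t consumption_ = 0;
};

// Hypothetical RAII owner of a char buffer whose size is reported to a
// tracker. malloc/free is an assumption standing in for a free pool.
class TrackedBuffer {
 public:
  explicit TrackedBuffer(SimpleMemTracker* tracker) : tracker_(tracker) {}
  ~TrackedBuffer() { Reset(); }

  // Non-copyable: exactly one owner releases the memory.
  TrackedBuffer(const TrackedBuffer&) = delete;
  TrackedBuffer& operator=(const TrackedBuffer&) = delete;

  // Frees any previous buffer, then allocates 'size' bytes and charges
  // them to the tracker. Returns false if the allocation fails.
  bool Allocate(int64_t size) {
    Reset();
    if (size <= 0) return false;
    char* p = static_cast<char*>(std::malloc(static_cast<size_t>(size)));
    if (p == nullptr) return false;
    data_ = p;
    size_ = size;
    tracker_->Consume(size);
    return true;
  }

  // Releases the buffer and its accounting; safe to call repeatedly.
  void Reset() {
    if (data_ == nullptr) return;
    tracker_->Release(size_);
    std::free(data_);
    data_ = nullptr;
    size_ = 0;
  }

  char* data() { return data_; }
  int64_t size() const { return size_; }

 private:
  SimpleMemTracker* tracker_;
  char* data_ = nullptr;
  int64_t size_ = 0;
};

// Usage: the tracker returns to zero once the buffer leaves scope.
inline void TrackedBufferDemo() {
  SimpleMemTracker tracker;
  {
    TrackedBuffer tuple_data(&tracker);
    bool ok = tuple_data.Allocate(1024);
    assert(ok);
    (void)ok;
    std::memset(tuple_data.data(), 0, 1024);
    assert(tracker.consumption() == 1024);
  }
  assert(tracker.consumption() == 0);
}
```

Because the destructor performs the release, an early return on an error
path cannot leave bytes charged to the tracker.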

Testing:
 - Passed core tests.
 - Ran a single node benchmark which shows no regression.
 - Updated row-batch-serialize-test and row-batch-serialize-benchmark to
   test the row-batch serialization used by KRPC.
 - Manually collected query profiles, heap growth, and memory usage logs
   showing that untracked memory decreased by half.
 - Added an end-to-end test to verify the new counters in the runtime
   profile.

serialize:
Func                             10%   50%   90%     10%     50%     90% ile
                                                   (rel)   (rel)   (rel)
----------------------------------------------------------------------------
ser_no_dups_baseline            8.36   8.6   8.7      1X      1X      1X
ser_no_dups                     6.73  6.85  6.93  0.804X  0.796X  0.796X
ser_no_dups_full                5.28  5.38  5.55  0.631X  0.625X  0.637X

ser_adjacent_dups_baseline      12.9  13.2  13.4      1X      1X      1X
ser_adjacent_dups               23.2  23.7  24.1    1.8X    1.8X    1.8X
ser_adjacent_dups_full          19.9  20.3  20.7   1.54X   1.54X   1.55X

ser_dups_baseline               9.17  9.54  9.72      1X      1X      1X
ser_dups                        7.45  7.69  7.86  0.812X  0.806X  0.809X
ser_dups_full                   14.6    15  15.3    1.6X   1.57X   1.57X

deserialize:
Func                             10%   50%   90%     10%     50%     90% ile
                                                   (rel)   (rel)   (rel)
----------------------------------------------------------------------------
deser_no_dups_baseline          32.6  33.5    34      1X      1X      1X
deser_no_dups                   32.5  33.1  33.7  0.999X   0.99X  0.992X

deser_adjacent_dups_baseline    53.1    54  54.7      1X      1X      1X
deser_adjacent_dups             80.3  81.6  82.5   1.51X   1.51X   1.51X

deser_dups_baseline             52.4    54  54.7      1X      1X      1X
deser_dups                      86.8  88.4  89.7   1.66X   1.64X   1.64X

Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
A be/src/runtime/row-batch.inline.h
A testdata/workloads/tpch/queries/datastream-sender.test
A tests/query_test/test_datastream_sender.py
9 files changed, 656 insertions(+), 214 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/18798/10
--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 10
Gerrit-Owner: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
