Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18798 )

Change subject: IMPALA-6684: Fix untracked memory in KRPC
......................................................................


Patch Set 15:

(5 comments)

I agree that we should clean up the Thrift RPC related code at this point.

I had a discussion with Kurt. He has an alternative idea: use 
std::basic_string with a custom tracked allocator. This is potentially lighter 
than using BufferPool. An example of such an allocator is MemTrackerAllocator 
from be/src/kudu/util/mem_tracker.h. I'm currently investigating this solution.
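
To make the idea concrete, here is a minimal sketch of a string whose heap 
usage is reported to a counter through its allocator. It is not 
MemTrackerAllocator and not the patch's code: ByteCounter, TrackedAllocator and 
TrackedString are hypothetical names, and a real version would presumably 
report to a MemTracker rather than a bare atomic counter.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a memory tracker: a thread-safe byte counter.
struct ByteCounter {
  std::atomic<int64_t> bytes{0};
};

// Minimal allocator that reports every allocation and deallocation to a
// shared ByteCounter. The name and design are illustrative assumptions.
template <typename T>
class TrackedAllocator {
 public:
  using value_type = T;

  explicit TrackedAllocator(std::shared_ptr<ByteCounter> counter)
    : counter_(std::move(counter)) {
    assert(counter_ != nullptr);
  }

  // Converting constructor so containers can rebind the allocator.
  template <typename U>
  TrackedAllocator(const TrackedAllocator<U>& other) : counter_(other.counter()) {}

  T* allocate(std::size_t n) {
    // Allocate first so the counter stays correct if allocation throws.
    T* p = std::allocator<T>().allocate(n);
    counter_->bytes += static_cast<int64_t>(n * sizeof(T));
    return p;
  }

  void deallocate(T* p, std::size_t n) {
    std::allocator<T>().deallocate(p, n);
    counter_->bytes -= static_cast<int64_t>(n * sizeof(T));
  }

  const std::shared_ptr<ByteCounter>& counter() const { return counter_; }

 private:
  std::shared_ptr<ByteCounter> counter_;
};

// Two allocators are interchangeable when they report to the same counter.
template <typename T, typename U>
bool operator==(const TrackedAllocator<T>& a, const TrackedAllocator<U>& b) {
  return a.counter() == b.counter();
}

template <typename T, typename U>
bool operator!=(const TrackedAllocator<T>& a, const TrackedAllocator<U>& b) {
  return !(a == b);
}

// A string type whose heap buffer is charged to the counter.
using TrackedString =
    std::basic_string<char, std::char_traits<char>, TrackedAllocator<char>>;
```

A TrackedString built from a TrackedAllocator<char> charges the counter when it 
grows past the small-string buffer and releases the bytes when it is destroyed, 
with no separate buffer pool involved.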

In the meantime, I'm leaving some notes on things to fix.

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@780
PS15, Line 780:   krpc_tuple_data_bytes_ =
              :       ADD_SUMMARY_STATS_COUNTER(profile(), "TupleDataBytes", TUnit::BYTES);
              :   krpc_compression_scratch_bytes_ =
              :       ADD_SUMMARY_STATS_COUNTER(profile(), "CompressionScratchBytes", TUnit::BYTES);
I suggested adding these counters for research purposes. I wonder if we 
should drop them now. TupleDataBytes seems to overlap with 
UncompressedRowBatchSize.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@786
PS15, Line 786: "Memory tracker for OutBoundRowBatch serialization", parent_mem_tracker
We should nest this MemTracker under the KrpcDataStreamSender's own tracker 
(mem_tracker_.get()) here. We can also use a shorter MemTracker name.
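
As a rough illustration of why the nesting matters: with a child tracker, bytes 
charged for serialization also show up under the sender's tracker. The sketch 
below uses a hypothetical, simplified SimpleTracker class, not Impala's 
MemTracker, and the labels are made up.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical, simplified tracker used only to illustrate parent/child
// accounting. It is not Impala's MemTracker and is not thread-safe.
class SimpleTracker {
 public:
  explicit SimpleTracker(std::string label, SimpleTracker* parent = nullptr)
    : label_(std::move(label)), parent_(parent) {}

  // Charges 'bytes' to this tracker and to every ancestor.
  void Consume(int64_t bytes) {
    assert(bytes >= 0);
    Add(bytes);
  }

  // Releases 'bytes' from this tracker and from every ancestor.
  void Release(int64_t bytes) {
    assert(bytes >= 0);
    Add(-bytes);
  }

  int64_t consumption() const { return consumption_; }
  const std::string& label() const { return label_; }

 private:
  // Applies a signed delta up the chain of parents.
  void Add(int64_t delta) {
    for (SimpleTracker* t = this; t != nullptr; t = t->parent_) {
      t->consumption_ += delta;
    }
  }

  std::string label_;
  SimpleTracker* parent_;
  int64_t consumption_ = 0;
};
```

If the child were instead attached to the sender's parent, the sender's own 
tracker would not reflect the serialization buffers at all.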


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@1116
PS15, Line 1116: dest->SetMemAllocator(outbound_rb_free_pool_);
This can be set in the Prepare and Init methods rather than here.


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@321
PS15, Line 321: RETURN_IF_ERROR(output_batch->AllocateTraceableBuffer(size, false));
In the original code, we return a TErrorCode::ROW_BATCH_TOO_LARGE error if size 
is larger than numeric_limits<int32_t>::max().
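
For reference, the shape of that guard is roughly the following. This is a 
sketch only: AllocResult and CheckSerializedSize are hypothetical stand-ins for 
Impala's Status / TErrorCode plumbing, which is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical result codes standing in for Status / TErrorCode.
enum class AllocResult { OK, ROW_BATCH_TOO_LARGE };

// Rejects serialized sizes that do not fit in a signed 32-bit integer
// before any buffer is allocated.
inline AllocResult CheckSerializedSize(int64_t size) {
  assert(size >= 0);
  if (size > std::numeric_limits<int32_t>::max()) {
    return AllocResult::ROW_BATCH_TOO_LARGE;
  }
  return AllocResult::OK;
}
```

Running a check like this before the allocation call would keep the error 
behavior the same as before the patch.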


http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@365
PS15, Line 365: RETURN_IF_ERROR(output_batch->AllocateTraceableBuffer(compressed_size, true));
In the original code, we do not resize here if the current length is already 
longer than the expected compressed_size.
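
The grow-only behavior described above can be sketched like this. EnsureLength 
is a hypothetical helper written for illustration; the real code operates on 
the batch's own buffer rather than a plain std::string.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Grows 'buf' to at least 'needed' bytes but never shrinks it, so an
// already large scratch buffer is reused as is.
inline void EnsureLength(std::string* buf, std::size_t needed) {
  assert(buf != nullptr);
  if (buf->size() < needed) buf->resize(needed);
}
```

Skipping the resize when the buffer is already large enough avoids shrinking 
and regrowing the same buffer across batches.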



--
To view, visit http://gerrit.cloudera.org:8080/18798

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 15
Gerrit-Owner: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Oct 2022 23:51:01 +0000
Gerrit-HasComments: Yes
