Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/18798 )
Change subject: IMPALA-6684: Fix untracked memory in KRPC
......................................................................

Patch Set 15:

(5 comments)

Agree that we should clean up Thrift RPC related code at this point.

I had a discussion with Kurt. He has an alternative idea: use std::basic_string with a custom tracked allocator. This is potentially lighter than using BufferPool. An example of such an allocator is MemTrackerAllocator from be/src/kudu/util/mem_tracker.h. I'm currently investigating this solution. In the meantime, I'm leaving some notes on things to fix.

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc
File be/src/runtime/krpc-data-stream-sender.cc:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@780
PS15, Line 780:   krpc_tuple_data_bytes_ =
              :       ADD_SUMMARY_STATS_COUNTER(profile(), "TupleDataBytes", TUnit::BYTES);
              :   krpc_compression_scratch_bytes_ =
              :       ADD_SUMMARY_STATS_COUNTER(profile(), "CompressionScratchBytes", TUnit::BYTES);
I suggested adding these counters for research purposes. I wonder if we should drop them now. TupleDataBytes seems to overlap with UncompressedRowBatchSize.

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@786
PS15, Line 786: "Memory tracker for OutBoundRowBatch serialization", parent_mem_tracker
We should nest this MemTracker under the KrpcDataStreamSender's mem_tracker_.get() here. We can also use a shorter MemTracker name.

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/krpc-data-stream-sender.cc@1116
PS15, Line 1116: dest->SetMemAllocator(outbound_rb_free_pool_);
This can be set during the Prepare and Init methods rather than here.

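To make the std::basic_string + tracked allocator idea from the top of this message concrete, here is a minimal sketch. It is not Kudu's MemTrackerAllocator and not what the patch does: SimpleTracker, TrackedAllocator, TrackedString and ConsumptionWhileAlive are hypothetical names used only for illustration, and the tracker is a bare byte counter with no limits or parent hierarchy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a memory tracker: only counts bytes in use.
class SimpleTracker {
 public:
  void Consume(int64_t bytes) { consumption_.fetch_add(bytes); }
  void Release(int64_t bytes) {
    int64_t prev = consumption_.fetch_sub(bytes);
    assert(prev >= bytes);  // Releasing more than was consumed is a bug.
    (void)prev;
  }
  int64_t consumption() const { return consumption_.load(); }

 private:
  std::atomic<int64_t> consumption_{0};
};

// Allocator that reports every allocation and deallocation to a tracker.
template <typename T>
class TrackedAllocator {
 public:
  using value_type = T;

  explicit TrackedAllocator(std::shared_ptr<SimpleTracker> tracker)
    : tracker_(std::move(tracker)) {}

  // Converting constructor, needed when a container rebinds the allocator.
  template <typename U>
  TrackedAllocator(const TrackedAllocator<U>& other) : tracker_(other.tracker()) {}

  T* allocate(std::size_t n) {
    // Allocate first so a failed allocation is never counted.
    T* p = std::allocator<T>().allocate(n);
    tracker_->Consume(static_cast<int64_t>(n * sizeof(T)));
    return p;
  }

  void deallocate(T* p, std::size_t n) {
    std::allocator<T>().deallocate(p, n);
    tracker_->Release(static_cast<int64_t>(n * sizeof(T)));
  }

  const std::shared_ptr<SimpleTracker>& tracker() const { return tracker_; }

 private:
  std::shared_ptr<SimpleTracker> tracker_;
};

// Two allocators are interchangeable only if they share a tracker.
template <typename T, typename U>
bool operator==(const TrackedAllocator<T>& a, const TrackedAllocator<U>& b) {
  return a.tracker() == b.tracker();
}
template <typename T, typename U>
bool operator!=(const TrackedAllocator<T>& a, const TrackedAllocator<U>& b) {
  return !(a == b);
}

// A string whose heap buffer is accounted against a SimpleTracker.
using TrackedString =
    std::basic_string<char, std::char_traits<char>, TrackedAllocator<char>>;

// Returns the tracker's consumption while a 1 KiB tracked string is alive.
int64_t ConsumptionWhileAlive(const std::shared_ptr<SimpleTracker>& tracker) {
  TrackedAllocator<char> alloc(tracker);
  TrackedString s(alloc);
  s.resize(1024);  // Large enough to force a heap allocation.
  return tracker->consumption();
}
```

With this shape the buffer's accounting follows the string's lifetime, so no separate release call is needed when the string is destroyed or reallocated. A real version would also have to decide what happens when the tracker's limit is exceeded, which this sketch does not model.
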
http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc
File be/src/runtime/row-batch.cc:

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@321
PS15, Line 321: RETURN_IF_ERROR(output_batch->AllocateTraceableBuffer(size, false));
In the original code, we throw TErrorCode::ROW_BATCH_TOO_LARGE if size is larger than numeric_limits<int32_t>::max().

http://gerrit.cloudera.org:8080/#/c/18798/15/be/src/runtime/row-batch.cc@365
PS15, Line 365: RETURN_IF_ERROR(output_batch->AllocateTraceableBuffer(compressed_size, true));
In the original code, we do not resize here if the current length is longer than the expected compressed_size.

--
To view, visit http://gerrit.cloudera.org:8080/18798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 15
Gerrit-Owner: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Omid Shahidi <omid.shahidi.2...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 06 Oct 2022 23:51:01 +0000
Gerrit-HasComments: Yes