Joe McDonnell has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13772 )
Change subject: IMPALA-8732: Use a serialized descriptor table in TQueryCtx
......................................................................

IMPALA-8732: Use a serialized descriptor table in TQueryCtx

In IMPALA-8732, there is contention in tcmalloc when sending the
ExecQueryFInstances messages for a query referencing a large number
of partitions. This is because each thread in the
ExecEnv::exec_rpc_thread_pool_ is making a copy of the TQueryCtx,
which contains the TDescriptorTable and a large map of THdfsPartition
objects. Every thread in the exec_rpc_thread_pool_ is doing this
simultaneously.

The threads are copying this structure, but the TQueryCtx and its
corresponding TDescriptorTable are the same across all messages for
this query. Copying a large map of THdfsPartition objects is wasteful,
especially considering that the coordinator does not need to access
any of the information in TDescriptorTable before sending it out to
executors.

In the future, the entire TQueryCtx can be serialized once and embedded
in its own sidecar. This change is limited to TDescriptorTable to
allow easier backports to older versions, as this codepath has been
converted from Thrift to KRPC and a large amount of code has changed.

This changes TQueryCtx to contain a TDescriptorTableSerialized, which
is a binary blob containing the serialized form of TDescriptorTable.
This is serialized in the frontend and passed directly through to
executors. The old unserialized TDescriptorTable form is maintained to
enable frontend planner tests (which use incomplete structures lacking
some required fields and cannot be serialized).

Testing:
 - Core and exhaustive tests pass

Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Reviewed-on: http://gerrit.cloudera.org:8080/13772
Reviewed-by: Joe McDonnell <joemcdonn...@cloudera.com>
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
---
M be/src/runtime/coordinator.cc
M be/src/runtime/data-stream-test.cc
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/query-state.cc
M be/src/testutil/desc-tbl-builder.cc
M common/thrift/Descriptors.thrift
M common/thrift/ImpalaInternalService.thrift
M fe/src/main/java/org/apache/impala/analysis/DescriptorTable.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
11 files changed, 112 insertions(+), 26 deletions(-)

Approvals:
  Joe McDonnell: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/13772
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I458aa62dd4d1e4e4a7b1869a604623a69f3b2d9a
Gerrit-Change-Number: 13772
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
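
The serialize-once idea in the commit message can be illustrated with a small sketch. This is not the Impala code: Partition, DescriptorTable, QueryCtx, SerializeDescTbl and DeserializeDescTbl are hypothetical stand-ins for THdfsPartition, TDescriptorTable, TQueryCtx and the Thrift serializer, and the length-prefixed text encoding is a toy chosen only to keep the example self-contained. It shows the table being flattened once into a blob that the sender carries opaquely and only the receiver decodes.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for THdfsPartition (not the real Thrift type).
struct Partition {
  std::string location;
};

// Hypothetical stand-in for TDescriptorTable: one map node per partition.
struct DescriptorTable {
  std::map<int64_t, Partition> partitions;
};

// Hypothetical stand-in for TQueryCtx: the table travels only as an opaque blob.
struct QueryCtx {
  std::string desc_tbl_serialized;
};

// "Frontend" side: flatten the table once. Toy encoding, NOT Thrift:
// each entry is written as <id>:<length>:<location bytes>.
std::string SerializeDescTbl(const DescriptorTable& tbl) {
  std::string out;
  for (const auto& entry : tbl.partitions) {
    const std::string& loc = entry.second.location;
    out += std::to_string(entry.first) + ":" + std::to_string(loc.size()) + ":" + loc;
  }
  return out;
}

// "Executor" side: rebuild the table from the blob. The sender never calls this.
DescriptorTable DeserializeDescTbl(const std::string& blob) {
  DescriptorTable tbl;
  std::size_t pos = 0;
  while (pos < blob.size()) {
    const std::size_t c1 = blob.find(':', pos);
    if (c1 == std::string::npos) throw std::runtime_error("missing id separator");
    const std::size_t c2 = blob.find(':', c1 + 1);
    if (c2 == std::string::npos) throw std::runtime_error("missing length separator");
    const int64_t id = std::stoll(blob.substr(pos, c1 - pos));
    const std::size_t len = std::stoull(blob.substr(c1 + 1, c2 - c1 - 1));
    if (c2 + 1 + len > blob.size()) throw std::runtime_error("truncated entry");
    tbl.partitions[id] = Partition{blob.substr(c2 + 1, len)};
    pos = c2 + 1 + len;
  }
  return tbl;
}
```

In this sketch, copying a QueryCtx copies a single contiguous string, whereas copying a DescriptorTable allocates a map node and a location string for every partition. That difference in allocation count is the kind of allocator pressure the commit message attributes to the per-thread TQueryCtx copies.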