Qifan Chen has uploaded a new patch set (#19). ( http://gerrit.cloudera.org:8080/18178 )
Change subject: IMPALA-10992 Planner changes for estimate peak memory
......................................................................

IMPALA-10992 Planner changes for estimate peak memory

This patch provides replan support for multiple executor group sets.
Each executor group set is associated with a distinct number of nodes
and a threshold for estimated per-host memory in bytes, denoted as
[<group_name_prefix>:<#nodes>, <threshold>].

With this patch, a query of type EXPLAIN, QUERY or DML can be compiled
more than once. In each attempt, per-host memory is estimated and
compared with the threshold of an executor group set. If the estimated
memory is no more than the threshold, the iteration terminates and the
final plan is determined. The executor group set with that threshold is
selected to run the query.

A new query option 'enable_replan', which defaults to 1 (enabled), is
added. It can be set to 0 to disable this patch and generate the
distributed plan for the default executor group.

To avoid long compilation time, the following optimizations are
enabled. Note that 1) and 2) can be disabled when a relevant metadata
change is detected.

1. Authorization is performed only for the 1st compilation;
2. The needed metadata is fetched into a StmtTableCache in the 1st
   compilation and reused in subsequent compilations;
3. openTransaction() is called for transactional queries in the 1st
   compilation and the saved transactional info is used in subsequent
   compilations. Similar logic is applied to Kudu transactional
   queries.

To facilitate testing, the patch imposes an artificial two executor
group setup in FE as follows.

1. [regular:<#nodes>, 64MB]
2. [large:<#nodes>, 8PB]

This setup is enabled when a new query option 'test_replan' is set to 1
in backend tests, or RuntimeEnv.INSTANCE.isTestEnv() is true as in most
frontend tests. This query option is set to 0 by default.
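The iteration described above can be sketched roughly as follows. This is only an illustration, not the patch's Java code: the names ExecutorGroupSet, select_group_set and estimate_per_host_mem are hypothetical, and both the ascending-threshold visiting order and the fallback to the largest group set when nothing fits are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

MB = 1024 ** 2
PB = 1024 ** 5


@dataclass(frozen=True)
class ExecutorGroupSet:
    """Hypothetical model of [<group_name_prefix>:<#nodes>, <threshold>]."""
    name_prefix: str
    num_nodes: int
    mem_threshold_bytes: int


def select_group_set(
    group_sets: List[ExecutorGroupSet],
    estimate_per_host_mem: Callable[[ExecutorGroupSet], int],
) -> Tuple[ExecutorGroupSet, int]:
    """Return the chosen group set and the number of compilations used.

    estimate_per_host_mem stands in for one full compilation that yields
    the estimated per-host memory (bytes) for the given group set.
    """
    if not group_sets:
        raise ValueError("at least one executor group set is required")
    # Assumption: candidates are tried from smallest to largest threshold.
    ordered = sorted(group_sets, key=lambda g: g.mem_threshold_bytes)
    attempts = 0
    for candidate in ordered:
        attempts += 1
        estimate = estimate_per_host_mem(candidate)
        # Stop as soon as the estimate is no more than the threshold.
        if estimate <= candidate.mem_threshold_bytes:
            return candidate, attempts
    # Assumption: if no threshold is large enough, use the largest group set.
    return ordered[-1], attempts


if __name__ == "__main__":
    regular = ExecutorGroupSet("regular", 3, 64 * MB)
    large = ExecutorGroupSet("large", 3, 8 * PB)
    chosen, attempts = select_group_set([regular, large], lambda g: 408 * MB)
    print(chosen.name_prefix, attempts)
```

With the artificial two-group test setup, a query estimated at 408MB per host does not fit under the 64MB threshold, so it is compiled a second time and assigned to the 'large' group set.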
Compilation time increases when a query is compiled in several
iterations, as shown below for several TPC-DS queries. The increase is
mostly due to redundant work in either single-node plan creation or the
phase that recomputes the value transfer graph. For small queries, the
increase can be avoided if they can be compiled in a single iteration
by properly setting the smallest threshold among all executor group
sets. For example, for the set of queries listed below, the smallest
threshold can be set to 320MB to catch both q15 and q21 in one
compilation.

                            Compilation time (ms)
Queries  Estimated Memory  2-iterations  1-iteration  Percentage of increase
q1       408MB                    18.32        13.01                  40.81%
q11      1.37GB                  186.17        86.28                 115.77%
q10a     519MB                   108.27        53.58                 102.07%
q13      339MB                   118.03        82.43                  43.19%
q14a     3.56GB                  628.27       307.24                 104.49%
q14b     2.20GB                  518.79       239.05                 117.02%
q15      314MB                    13.12         4.51                 190.91%
q21      275MB                    11.04         6.34                  74.13%
q23a     1.34GB                  458.7        227.62                 101.52%
q23b     1.50GB                  471.29       224.75                 109.70%
q4       2.60GB                  206.34        98.64                 109.18%
q67      5.16GB                  691.45       336.31                 105.60%

Testing:
1. Almost all FE and BE tests are now run in the artificial two
   executor group setup except a few where a specific cluster
   configuration is desirable;
2. Ran core tests successfully;
3. Added a new observability test and a new query assignment test.
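As a quick check on the compilation-time figures reported above, the short sketch below recomputes the percentage of increase for a few rows and lists which of them would compile in a single iteration for a given smallest threshold. The helper names are hypothetical, the rows are copied from this commit message, and only a subset is included.

```python
from typing import Dict, List, Tuple

# (estimated memory in MB, 2-iteration time in ms, 1-iteration time in ms)
TIMINGS: Dict[str, Tuple[float, float, float]] = {
    "q1": (408, 18.32, 13.01),
    "q13": (339, 118.03, 82.43),
    "q15": (314, 13.12, 4.51),
    "q21": (275, 11.04, 6.34),
}


def percent_increase(two_iter_ms: float, one_iter_ms: float) -> float:
    """Relative growth of compilation time when a second iteration is needed."""
    return (two_iter_ms - one_iter_ms) / one_iter_ms * 100.0


def single_iteration_queries(
    timings: Dict[str, Tuple[float, float, float]],
    smallest_threshold_mb: float,
) -> List[str]:
    """Queries whose estimated memory is no more than the smallest threshold."""
    return sorted(
        name for name, (mem_mb, _, _) in timings.items()
        if mem_mb <= smallest_threshold_mb
    )


if __name__ == "__main__":
    for name, (_, two_ms, one_ms) in TIMINGS.items():
        print(f"{name}: {percent_increase(two_ms, one_ms):.2f}%")
    print(single_iteration_queries(TIMINGS, 320))
```

Raising the smallest threshold to 320MB covers q15 and q21 here, while q13 (339MB) and q1 (408MB) would still need a second compilation.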
Change-Id: I75cf17290be2c64fd4b732a5505bdac31869712a
---
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/Frontend.thrift
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/ResourceProfileBuilder.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/ClassUtil.java
M fe/src/main/java/org/apache/impala/util/ExecutorMembershipSnapshot.java
M fe/src/test/java/org/apache/impala/planner/ClusterSizeTest.java
A fe/src/test/resources/fair-scheduler-2-groups.xml
A fe/src/test/resources/llama-site-2-groups.xml
M tests/common/test_dimensions.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_coordinators.py
M tests/custom_cluster/test_executor_groups.py
M tests/query_test/test_observability.py
22 files changed, 644 insertions(+), 69 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/18178/19
--
To view, visit http://gerrit.cloudera.org:8080/18178
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I75cf17290be2c64fd4b732a5505bdac31869712a
Gerrit-Change-Number: 18178
Gerrit-PatchSet: 19
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Kurt Deschler <kdesc...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>