Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/17872 )
Change subject: IMPALA-10811 RPC to submit query getting stuck for AWS NLB forever ...................................................................... Patch Set 21: (1 comment) http://gerrit.cloudera.org:8080/#/c/17872/21/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/17872/21/be/src/service/client-request-state.cc@713 PS21, Line 713: // Transition the exec state to RUNNING for any non-CTAS DDLs. For the later, the : // state is set from PENDING to RUNNING inside ExecQueryOrDmlRequest(). : if (catalog_op_type() != TCatalogOpType::DDL : || ddl_type() != TDdlType::CREATE_TABLE_AS_SELECT) { : UpdateNonErrorExecState(ExecState::RUNNING); : } > The difference with the state in main thread for CTAS before overwritten by When a statement is in the INITIALIZED state, its runtime profile is unavailable. This limitation is because we currently don't handle generating a runtime profile when the query is in planning, but this restriction doesn't apply after the planning is over. This is why it is useful to transition out of the INITIALIZED state as soon as planning is over. You can search the code for ExecState::INITIALIZED to see these locations. So, my reason for wanting to get out of INITIALIZED state is so that these DDLs have runtime profiles while they are doing the catalog ops. PENDING corresponds to admission control, and the profile contains extra information while in this state (e.g. the queued reason and other state). CTAS could spend time in admission control after the create table completes, so we want that admission control information to be accessible. Choice 1 for CTAS is to stay in INITIALIZED during the create table, then transition to PENDING when in admission control. Choice 2 for CTAS is to go directly to PENDING before the create table, then stay in PENDING when in admission control. I prefer Choice 2, because it means the profile is available during create table. Other DDLs don't go through admission control, so PENDING doesn't provide anything extra. RUNNING fits for those statements. As far as which thread should set it, we should avoid race conditions when we can. If we set the state in the async thread, then Exec() could return INITIALIZED or RUNNING. If we set it prior to spawning the async thread, then there is no race condition. This is what ExecAsyncQueryOrDmlRequest() does today. If there was a meaningful difference in the time when the state would transition, then we would put it directly in that location, but for the transition to RUNNING here, we were doing it immediately after spawn, so there is no reason not to do it in the main thread. -- To view, visit http://gerrit.cloudera.org:8080/17872 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e Gerrit-Change-Number: 17872 Gerrit-PatchSet: 21 Gerrit-Owner: Qifan Chen <qc...@cloudera.com> Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com> Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com> Gerrit-Comment-Date: Mon, 18 Oct 2021 17:37:31 +0000 Gerrit-HasComments: Yes