Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17872 )

Change subject: IMPALA-10811 RPC to submit query getting stuck for AWS NLB 
forever
......................................................................


Patch Set 21:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17872/21/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/17872/21/be/src/service/client-request-state.cc@713
PS21, Line 713:   // Transition the exec state to RUNNING for any non-CTAS 
DDLs. For the later, the
              :   // state is set from PENDING to RUNNING inside 
ExecQueryOrDmlRequest().
              :   if (catalog_op_type() != TCatalogOpType::DDL
              :       || ddl_type() != TDdlType::CREATE_TABLE_AS_SELECT) {
              :     UpdateNonErrorExecState(ExecState::RUNNING);
              :   }
> The difference with the state in main thread for CTAS before overwritten by
When a statement is in the INITIALIZED state, its runtime profile is 
unavailable. This limitation is because we currently don't handle generating a 
runtime profile when the query is in planning, but this restriction doesn't 
apply after the planning is over. This is why it is useful to transition out of 
the INITIALIZED state as soon as planning is over. You can search the code for 
ExecState::INITIALIZED to see these locations. So, my reason for wanting to get 
out of INITIALIZED state is so that these DDLs have runtime profiles while they 
are doing the catalog ops.

PENDING corresponds to admission control, and the profile contains extra 
information while in this state (e.g. the queued reason and other state). CTAS 
could spend time in admission control after the create table completes, so we 
want that admission control information to be accessible. Choice 1 for CTAS is 
to stay in INITIALIZED during the create table, then transition to PENDING when 
in admission control. Choice 2 for CTAS is to go directly to PENDING before the 
create table, then stay in PENDING when in admission control. I prefer Choice 
2, because it means the profile is available during create table.

Other DDLs don't go through admission control, so PENDING doesn't provide 
anything extra. RUNNING fits for those statements.

As far as which thread should set it, we should avoid race conditions when we 
can. If we set the state in the async thread, then Exec() could return 
INITIALIZED or RUNNING. If we set it prior to spawning the async thread, then 
there is no race condition. This is what ExecAsyncQueryOrDmlRequest() does 
today.

If there was a meaningful difference in the time when the state would 
transition, then we would put it directly in that location, but for the 
transition to RUNNING here, we were doing it immediately after spawn, so there 
is no reason not to do it in the main thread.



--
To view, visit http://gerrit.cloudera.org:8080/17872
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib57e86926a233ef13d27a9ec8d9c36d33a88a44e
Gerrit-Change-Number: 17872
Gerrit-PatchSet: 21
Gerrit-Owner: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Amogh Margoor <amarg...@gmail.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Comment-Date: Mon, 18 Oct 2021 17:37:31 +0000
Gerrit-HasComments: Yes

Reply via email to