Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
......................................................................

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QueryExecState::Cancel() holds QueryExecState::lock_ and waits
  for SessionState::lock_

It's not clear how likely the above scenario is, but it's hard to rule
it out.
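
The cycle above can be checked mechanically. The following is a minimal
sketch, not code from this patch: it models the four "holds -> waits for"
relationships listed above as a directed graph and runs a depth-first
search for a cycle. The helper names (LockGraph, HasLockOrderCycle,
GraphBeforeFix, GraphWithoutCancelEdge) are hypothetical, and the second
graph assumes, purely for illustration, that the cancel path no longer
waits for SessionState::lock_ while holding QueryExecState::lock_; the
actual change may restore the lock order differently.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Directed "holds -> waits for" graph keyed by lock name (hypothetical type).
using LockGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search: returns true if a cycle is reachable from 'node'.
// 'on_stack' holds the locks on the current DFS path, 'done' the locks
// whose outgoing edges have been fully explored without finding a cycle.
static bool HasCycleFrom(const LockGraph& g, const std::string& node,
    std::set<std::string>* on_stack, std::set<std::string>* done) {
  if (on_stack->count(node)) return true;
  if (done->count(node)) return false;
  on_stack->insert(node);
  auto it = g.find(node);
  if (it != g.end()) {
    for (const std::string& next : it->second) {
      if (HasCycleFrom(g, next, on_stack, done)) return true;
    }
  }
  on_stack->erase(node);
  done->insert(node);
  return false;
}

// Returns true if any lock can (transitively) end up waiting for itself.
bool HasLockOrderCycle(const LockGraph& g) {
  std::set<std::string> on_stack, done;
  for (const auto& entry : g) {
    if (HasCycleFrom(g, entry.first, &on_stack, &done)) return true;
  }
  return false;
}

// Edges taken directly from the four bullets in the commit message.
LockGraph GraphBeforeFix() {
  return {
    {"SessionState::lock_", {"query_expiration_lock_"}},
    {"query_expiration_lock_", {"query_exec_state_map_lock_"}},
    {"query_exec_state_map_lock_", {"QueryExecState::lock_"}},
    {"QueryExecState::lock_", {"SessionState::lock_"}},
  };
}

// Assumption for illustration only: the cancel path no longer waits for
// SessionState::lock_ while holding QueryExecState::lock_.
LockGraph GraphWithoutCancelEdge() {
  LockGraph g = GraphBeforeFix();
  g.erase("QueryExecState::lock_");
  return g;
}
```

HasLockOrderCycle(GraphBeforeFix()) returns true, while
HasLockOrderCycle(GraphWithoutCancelEdge()) returns false. Dropping any
one of the four edges breaks this particular cycle, so which edge is
removed depends on the lock order the code is meant to follow.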

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
2 files changed, 13 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
