[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 6: Verified+1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Internal Jenkins has submitted this change and it was merged. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Reviewed-on: http://gerrit.cloudera.org:8080/4896 Reviewed-by: Tim ArmstrongTested-by: Internal Jenkins --- M be/src/runtime/coordinator.h M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h 4 files changed, 66 insertions(+), 32 deletions(-) Approvals: Internal Jenkins: Verified Tim Armstrong: Looks good to me, approved -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: merged Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong Gerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 6: Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/393/ -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Internal Jenkins has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 6: Verified-1 Build failed: http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/391/ -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Internal Jenkins Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 6: Code-Review+2 rebase -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 5: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: (1 comment) Augmented the comment. Also explained the relationship to the Coordinator locks. http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: Line 215: /// holding this lock. > what about this? Updated. The condition was kind-of trivially true since it's only acquired in a couple of short functions. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Hello Marcel Kornacker, Matthew Jacobs, Sailesh Mukil, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4896 to look at the new patch set (#5). Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/runtime/coordinator.h M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h 4 files changed, 66 insertions(+), 32 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/5 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 5 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: (1 comment) if you could augment the comment to include the locks that aren't part of the hierarchy yet, that would be helpful. http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h File be/src/service/query-exec-state.h: Line 215: /// holding this lock. what about this? -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: this needs a +2 from an owner. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Marcel Kornacker has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: Code-Review-1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Marcel Kornacker Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Hello Matthew Jacobs, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/4896 to look at the new patch set (#4). Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/runtime/coordinator.h M be/src/service/impala-server.h M be/src/service/query-exec-state.cc M be/src/service/query-exec-state.h 4 files changed, 50 insertions(+), 27 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/4 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: Code-Review+2 Thanks for adding the comment. Will be very helpful in the future. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 4: Code-Review+1 Carry +1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 4 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h File be/src/service/impala-server.h: PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or : /// QueryExecState::lock_ if they will be held simulataneously. > Instead of just having this above this lock, it would be better to spell ou Done -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Sailesh Mukil has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 3: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h File be/src/service/impala-server.h: PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or : /// QueryExecState::lock_ if they will be held simulataneously. Instead of just having this above this lock, it would be better to spell out the lock ordering of the main locks in question at the top of this header file, based on the DAG you posted in the JIRA. I have a feeling there might be other places where we mess up the lock ordering too that we don't know of yet. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Henry Robinson Gerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/1/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: PS1, Line 833: if (check_inflight) { : DCHECK(session_lock.owns_lock()); : if (session_->inflight_queries.find(query_id()) == : session_->inflight_queries.end()) { : return Status("Query not yet running"); : } > Are you aware of any reason we can't just do this check before taking lock_ It doesn't look like it would protect against anything - I did a pass through the code. Based on the history of the code, 'check_inflight' is meant to guard against the query being cancelled early when it is in a state with partially set-up data structures, so the race where it is removed from 'inflight_queries' during cancellation should be benign. -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-Reviewer: Tim Armstrong Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has uploaded a new patch set (#2). Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Testing: Ran local stress test on 3-node minicluster with TPC-H 20 and 50% of queries being cancelled. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/service/impala-server.h M be/src/service/query-exec-state.cc 2 files changed, 13 insertions(+), 9 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/2 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Matthew Jacobs has posted comments on this change. Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/4896/1/be/src/service/query-exec-state.cc File be/src/service/query-exec-state.cc: PS1, Line 833: if (check_inflight) { : DCHECK(session_lock.owns_lock()); : if (session_->inflight_queries.find(query_id()) == : session_->inflight_queries.end()) { : return Status("Query not yet running"); : } Are you aware of any reason we can't just do this check before taking lock_, i.e. doing this check at the beginning of the method? -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Matthew Jacobs Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()
Tim Armstrong has uploaded a new change for review. http://gerrit.cloudera.org:8080/4896 Change subject: IMPALA-4409: respect lock order in QueryExecState::CancelInternal() .. IMPALA-4409: respect lock order in QueryExecState::CancelInternal() The code previously violated the (partially documented) lock order in ImpalaServer. An example of a possible cycle in the dependency graph is: * SetQueryInFlight() holds SessionState::lock_ and waits for 'query_expiration_lock_' * ExpireQueries() holds 'query_expiration_lock_' and waits for 'query_exec_state_map_lock_' * GetQueryExecState() holds 'query_exec_state_map_lock_' and waits for QueryExecState::lock_ * QES::Cancel() holds QueryExecState::lock_ and waits for SessionState::lock It's not clear how likely the above scenario is, but it's hard to rule it out. We have not seen this hang in the wild but have seen similar ones. Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 --- M be/src/service/impala-server.h M be/src/service/query-exec-state.cc 2 files changed, 9 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/1 -- To view, visit http://gerrit.cloudera.org:8080/4896 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newchange Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tim Armstrong