[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-03 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 6: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-03 Thread Internal Jenkins (Code Review)
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Reviewed-on: http://gerrit.cloudera.org:8080/4896
Reviewed-by: Tim Armstrong 
Tested-by: Internal Jenkins
---
M be/src/runtime/coordinator.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
4 files changed, 66 insertions(+), 32 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Tim Armstrong: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-02 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 6:

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/393/

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-02 Thread Internal Jenkins (Code Review)
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 6: Verified-1

Build failed: 
http://sandbox.jenkins.cloudera.com/job/impala-external-gerrit-verify-merge-ASF/391/

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 6: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4:

(1 comment)

Augmented the comment. Also explained the relationship to the Coordinator locks.

http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 215:   /// holding this lock.
> what about this?
Updated. The condition was kind-of trivially true since it's only acquired in a 
couple of short functions.


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Hello Marcel Kornacker, Matthew Jacobs, Sailesh Mukil,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4896

to look at the new patch set (#5).

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/runtime/coordinator.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
4 files changed, 66 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4:

(1 comment)

if you could augment the comment to include the locks that aren't part of the 
hierarchy yet, that would be helpful.

http://gerrit.cloudera.org:8080/#/c/4896/4/be/src/service/query-exec-state.h
File be/src/service/query-exec-state.h:

Line 215:   /// holding this lock.
what about this?


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4:

this needs a +2 from an owner.

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4: Code-Review-1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Hello Matthew Jacobs,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/4896

to look at the new patch set (#4).

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/runtime/coordinator.h
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
M be/src/service/query-exec-state.h
4 files changed, 50 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4: Code-Review+2

Thanks for adding the comment. Will be very helpful in the future.

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 4: Code-Review+1

Carry +1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h
File be/src/service/impala-server.h:

PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or
 : /// QueryExecState::lock_ if they will be held 
simulataneously.
> Instead of just having this above this lock, it would be better to spell ou
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Sailesh Mukil (Code Review)
Sailesh Mukil has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/3/be/src/service/impala-server.h
File be/src/service/impala-server.h:

PS3, Line 675: The lock must be acquired before 'query_exec_state_map_lock_' or
 : /// QueryExecState::lock_ if they will be held 
simulataneously.
Instead of just having this above this lock, it would be better to spell out 
the lock ordering of the main locks in question at the top of this header file, 
based on the DAG you posted in the JIRA.

I have a feeling there might be other places where we mess up the lock ordering 
too that we don't know of yet.


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 3: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/1/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

PS1, Line 833: if (check_inflight) {
 :   DCHECK(session_lock.owns_lock());
 :   if (session_->inflight_queries.find(query_id()) ==
 :   session_->inflight_queries.end()) {
 : return Status("Query not yet running");
 :   }
> Are you aware of any reason we can't just do this check before taking lock_
It doesn't look like it would protect against anything - I did a pass through 
the code.

Based on the history of the code,  'check_inflight' is meant to guard against 
the query being cancelled early when it is in a state with partially set-up 
data structures, so the race where it is removed from 'inflight_queries' during 
cancellation should be benign.


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-11-01 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2).

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Testing:
Ran local stress test on 3-node minicluster with TPC-H 20 and 50%
of queries being cancelled.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
2 files changed, 13 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-10-31 Thread Matthew Jacobs (Code Review)
Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4896/1/be/src/service/query-exec-state.cc
File be/src/service/query-exec-state.cc:

PS1, Line 833: if (check_inflight) {
 :   DCHECK(session_lock.owns_lock());
 :   if (session_->inflight_queries.find(query_id()) ==
 :   session_->inflight_queries.end()) {
 : return Status("Query not yet running");
 :   }
Are you aware of any reason we can't just do this check before taking lock_, 
i.e. doing this check at the beginning of the method?


-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

2016-10-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4896

Change subject: IMPALA-4409: respect lock order in 
QueryExecState::CancelInternal()
..

IMPALA-4409: respect lock order in QueryExecState::CancelInternal()

The code previously violated the (partially documented) lock order
in ImpalaServer. An example of a possible cycle in the dependency
graph is:

* SetQueryInFlight() holds SessionState::lock_ and waits for
  'query_expiration_lock_'
* ExpireQueries() holds 'query_expiration_lock_' and waits for
  'query_exec_state_map_lock_'
* GetQueryExecState() holds 'query_exec_state_map_lock_' and
  waits for QueryExecState::lock_
* QES::Cancel() holds QueryExecState::lock_
  and waits for SessionState::lock

It's not clear how likely the above scenario is, but it's hard to rule
it out.

We have not seen this hang in the wild but have seen similar ones.

Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
---
M be/src/service/impala-server.h
M be/src/service/query-exec-state.cc
2 files changed, 9 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/4896/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I785fea0163a90d0633fb6ed77ec7c6882ab5c110
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong