Hello Michael Ho, Joe McDonnell, Bikramjeet Vig, Dan Hecht, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10813

to look at the new patch set (#10).

Change subject: IMPALA-7163: Implement a state machine for the QueryState class
......................................................................

IMPALA-7163: Implement a state machine for the QueryState class

This patch adds a state machine for the QueryState class. The motivation is to make the query lifecycle, from the point of view of an executor, much easier to reason about. The patch is also key to a follow-on patch for IMPALA-2990, where status reporting will be per-query rather than per-fragment-instance. For now the state machine serves no other purpose; it will mostly be used by IMPALA-2990.

We introduce 5 possible states for the QueryState: 3 terminal states (FINISHED, CANCELLED and ERROR) and 2 non-terminal states (PREPARING and EXECUTING). The transition from one state to the next is always handled by a single thread, which is also the QueryState thread. After IMPALA-4063 this thread will also be responsible for sending periodic updates, which is the primary reason for having only this thread modify the state of the query.

Counting barriers are introduced to keep count of how many fragment instances have finished Preparing and Executing. These barriers also block until all the fragment instances have finished the respective state. A fragment instance that hits an error updates the query-wide status and, if the query is in the EXECUTING state, unblocks the barrier. In the PREPARING state the barrier blocks until all the fragment instances have completed, successfully or unsuccessfully, regardless of whether any of them hit an error. This maintains the invariant that fragment instances cannot be cancelled until the entire QueryState has finished PREPARING.

The status reporting protocol has not been changed and remains exactly as it was.
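For readers who want a concrete picture of the barrier behaviour described in the commit message, here is a minimal, self-contained sketch. It is not the code in this patch: CountingBarrierSketch, Transition, RunQuerySketch and the query_ok flag are hypothetical names made up for illustration, and the CANCELLED path is not exercised. It only shows the two rules from the description: the PREPARING barrier waits for every instance even on error, while an error during EXECUTING unblocks the barrier early.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// All names below are hypothetical; this is not the Impala implementation.
enum class ExecState { PREPARING, EXECUTING, FINISHED, CANCELLED, ERROR };

inline bool IsTerminal(ExecState s) {
  return s == ExecState::FINISHED || s == ExecState::CANCELLED ||
         s == ExecState::ERROR;
}

// Called only by the single "QueryState thread", so the state needs no lock.
inline void Transition(ExecState* cur, ExecState next) {
  assert(!IsTerminal(*cur));  // a terminal state is never left
  *cur = next;
}

// Blocks Wait() until `count` Notify() calls arrive or Unblock() is called.
class CountingBarrierSketch {
 public:
  explicit CountingBarrierSketch(int count) : remaining_(count) {}
  void Notify() {
    std::lock_guard<std::mutex> l(lock_);
    if (remaining_ > 0 && --remaining_ == 0) cv_.notify_all();
  }
  void Unblock() {  // release waiters without waiting for the full count
    std::lock_guard<std::mutex> l(lock_);
    remaining_ = 0;
    cv_.notify_all();
  }
  void Wait() {
    std::unique_lock<std::mutex> l(lock_);
    cv_.wait(l, [this] { return remaining_ == 0; });
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  int remaining_;
};

// Simulates one query with `n` fragment instances. Instance 0 optionally
// fails while preparing or while executing.
inline ExecState RunQuerySketch(int n, bool prepare_fails, bool exec_fails) {
  CountingBarrierSketch prepared(n), executed(n);
  std::atomic<bool> query_ok{true};  // stand-in for the query-wide status
  std::vector<std::thread> instances;
  for (int i = 0; i < n; ++i) {
    instances.emplace_back([&, i] {
      const bool failing = (i == 0);
      if (failing && prepare_fails) query_ok = false;
      prepared.Notify();  // always notify, whether Prepare succeeded or not
      if (failing && prepare_fails) return;
      if (failing && exec_fails) {
        query_ok = false;
        executed.Unblock();  // an error during EXECUTING releases the barrier
      } else {
        executed.Notify();
      }
    });
  }
  // The "QueryState thread": the only thread that changes the state.
  ExecState state = ExecState::PREPARING;
  prepared.Wait();  // waits for all instances, even if one already failed
  if (!query_ok) {
    Transition(&state, ExecState::ERROR);
  } else {
    Transition(&state, ExecState::EXECUTING);
    executed.Wait();
    Transition(&state, query_ok ? ExecState::FINISHED : ExecState::ERROR);
  }
  for (auto& t : instances) t.join();
  return state;
}
```

Calling RunQuerySketch(4, false, false) ends in FINISHED, while injecting a failure in either phase ends in ERROR. Keeping all transitions on one thread is what lets the state itself go unsynchronised in this sketch; only the barriers and the status flag are shared.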
Testing:
- Added 2 failure points in the query lifecycle using debug actions and added tests to validate the same (extension of IMPALA-7376).
- Ran 'core' and 'exhaustive' tests.

Future related work:
1) IMPALA-2990: Make status reporting per-query.
2) Try to logically align the FIS states with the QueryState states.
3) Consider mirroring the QueryState state machine to CoordinatorBackendState

Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
---
M be/src/runtime/coordinator.cc
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M tests/failure/test_failpoints.py
6 files changed, 272 insertions(+), 67 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/10813/10

--
To view, visit http://gerrit.cloudera.org:8080/10813
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec5670a7db83ecae4656d7bb2ea372d3767ba7fe
Gerrit-Change-Number: 10813
Gerrit-PatchSet: 10
Gerrit-Owner: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>