-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51876/
-----------------------------------------------------------

Review request for Aurora, Joshua Cohen, Maxim Khutornenko, and Zameer Manji.


Bugs: AURORA-1225
    https://issues.apache.org/jira/browse/AURORA-1225


Repository: aurora


Description
-------

Modify executor state transition logic to rely on health checks (if enabled).

[Summary]
The executor needs to start executing user content in STARTING and transition to 
RUNNING once the required number of successful health checks is reached.

This review contains a series of executor changes that implement the health 
check driven updates. It gives a more complete context of the design of this 
feature.

[Background]
Please see this epic: https://issues.apache.org/jira/browse/AURORA-1225
and the design doc: 
https://docs.google.com/document/d/1ZdgW8S4xMhvKW7iQUX99xZm10NXSxEWR0a-21FP5d94/edit#
for more details and background.

[Description]
If health checking is enabled on the vCurrent executor, the health checker sends 
a "TASK_RUNNING" message once the required number of successful health checks is 
reached within initial_interval_secs. Conversely, a "TASK_FAILED" message is 
sent if the task fails to satisfy the required number of health checks within 
initial_interval_secs, or if the maximum number of failed health checks is 
reached after initial_interval_secs.

If health checking is disabled on the vCurrent executor, it sends a 
"TASK_RUNNING" message to the scheduler, so the behavior is the same as the 
vPrev executor.
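
To make the rules above concrete, here is a minimal sketch of the decision 
logic. It is not the code in this diff: the function name and the parameters 
required_successes, max_failures, and running_sent are hypothetical names chosen 
for illustration, and treating "limit is reached" as a >= comparison is an 
assumption.

```python
TASK_RUNNING = "TASK_RUNNING"
TASK_FAILED = "TASK_FAILED"


def next_status_message(health_check_enabled, running_sent, elapsed_secs,
                        successes, failures, initial_interval_secs,
                        required_successes, max_failures):
    """Return the message to send to the scheduler, or None if nothing is due.

    Hypothetical helper: all names except initial_interval_secs are
    illustrative and do not come from the Aurora code base.
    """
    if not running_sent:
        if not health_check_enabled:
            # Health checking disabled: report RUNNING, as vPrev does.
            return TASK_RUNNING
        if elapsed_secs <= initial_interval_secs:
            # Still inside the initial interval: RUNNING only once enough
            # health checks have succeeded, otherwise keep waiting.
            return TASK_RUNNING if successes >= required_successes else None
        # Initial interval expired without enough successful checks.
        return TASK_FAILED
    # RUNNING was already reported: only the failure limit applies, and only
    # after the initial interval (assumption: ">=" means "limit reached").
    if (health_check_enabled and elapsed_secs > initial_interval_secs
            and failures >= max_failures):
        return TASK_FAILED
    return None
```

For example, with initial_interval_secs=15 and required_successes=2, two 
successful checks at 5 seconds yield "TASK_RUNNING", while a single successful 
check by 16 seconds yields "TASK_FAILED".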

[Change List]
The current change set includes:
1. Removed the status memoization in StatusManager.
2. Modified the StatusManager to be edge triggered.
3. Changed the Aurora Executor callback function.
4. Modified the Health Checker and redefined the meaning of initial_interval_secs.
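
As a rough illustration of item 2, the sketch below shows one common reading of 
"edge triggered": the callback fires only when the observed status changes, not 
on every poll while a status is set. The class EdgeTriggeredWatcher and its 
methods are hypothetical and are not the StatusManager API in this diff.

```python
class EdgeTriggeredWatcher(object):
    """Hypothetical poller that reports status changes (edges) only."""

    def __init__(self, get_status, on_change):
        self._get_status = get_status  # callable returning a status or None
        self._on_change = on_change    # callable invoked once per change
        self._last = None              # last status that was reported

    def poll_once(self):
        """Poll once; return True if a change was reported."""
        current = self._get_status()
        if current is not None and current != self._last:
            self._last = current
            self._on_change(current)
            return True
        return False


# Usage: a repeated "TASK_RUNNING" reading triggers the callback only once.
readings = iter([None, "TASK_RUNNING", "TASK_RUNNING", "TASK_FAILED"])
seen = []
watcher = EdgeTriggeredWatcher(lambda: next(readings), seen.append)
results = [watcher.poll_once() for _ in range(4)]
```

Under this reading, a checker can report STARTING-to-RUNNING and later a 
terminal status through the same polling loop, with each transition delivered 
exactly once.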

[TODOs]
I have fixed all tests broken by my changes. However, more tests need to be 
added to accommodate the executor change. I will send a follow-up review update 
when I cover more edge cases. Any feedback on the implementation is highly 
appreciated.


Diffs
-----

  src/main/python/apache/aurora/executor/aurora_executor.py 
ce5ef680f01831cd89fced8969ae3246c7f60cfd 
  src/main/python/apache/aurora/executor/common/health_checker.py 
5fc845eceac6f0c048d7489fdc4c672b0c609ea0 
  src/main/python/apache/aurora/executor/common/status_checker.py 
795dae2d6b661fc528d952c2315196d94127961f 
  src/main/python/apache/aurora/executor/status_manager.py 
228a99a05f339e21cd7e769a42b9b2276e7bc3fc 
  src/test/python/apache/aurora/executor/common/test_health_checker.py 
bb6ea69dd94298c5b8cf4d5f06d06eea7790d66e 
  src/test/python/apache/aurora/executor/common/test_status_checker.py 
5be1981c8c8e88258456adb21aa3ca7c0aa472a7 
  src/test/python/apache/aurora/executor/test_status_manager.py 
ce4679ba1aa7b42cf0115c943d84663030182d23 
  src/test/python/apache/aurora/executor/test_thermos_executor.py 
0bfe9e931f873c9f804f2ba4012e050e1f9fd24e 

Diff: https://reviews.apache.org/r/51876/diff/


Testing
-------

./build-support/jenkins/build.sh

./pants test.pytest src/test/python/apache/aurora/executor::


Thanks,

Kai Huang
