-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/56723/
-----------------------------------------------------------

(Updated Feb. 15, 2017, 6 p.m.)


Review request for Aurora, David McLaughlin and Santhosh Kumar Shanmugham.


Changes
-------

Style fixes.


Bugs: AURORA-1890
    https://issues.apache.org/jira/browse/AURORA-1890


Repository: aurora


Description
-------

Currently the scheduler forces all coordinated ("pulsed") updates into
ROLL_FORWARD_AWAITING_PULSE or ROLL_BACK_AWAITING_PULSE on scheduler
startup/recovery. This is because the last pulse timestamp is not durably
stored, so on startup it is reset to 0L (i.e. no pulse yet).

In cases where the pulse timeout is large and failovers are fast or frequent,
this causes many updates to transition unnecessarily into a pulse-related
state, where they remain until the next pulse arrives.

It is possible to avoid these unnecessary transitions by traversing the job
update events and finding the last PULSE -> * state transition. The timestamp
of the * event indicates that a pulse was received at that point in time and
can be used to initialize the pulse state on startup.
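
A minimal sketch of the traversal described above, not the code in this diff.
The Status and Event types are simplified, hypothetical stand-ins for the
scheduler's job update status and event structures, and the events are assumed
to be ordered oldest first. It also assumes that a move from one awaiting-pulse
state to the other does not count as a pulse.

```java
import java.util.Arrays;
import java.util.List;

public class PulseRecovery {
  // Hypothetical, reduced set of update statuses used only for illustration.
  enum Status {
    ROLLING_FORWARD,
    ROLLING_BACK,
    ROLL_FORWARD_PAUSED,
    ROLL_FORWARD_AWAITING_PULSE,
    ROLL_BACK_AWAITING_PULSE
  }

  // Hypothetical stand-in for a job update event: a status plus its timestamp.
  static final class Event {
    final Status status;
    final long timestampMs;

    Event(Status status, long timestampMs) {
      this.status = status;
      this.timestampMs = timestampMs;
    }
  }

  // Value used when no pulse has been observed (0L, as in the description).
  static final long NO_PULSE = 0L;

  static boolean isAwaitingPulse(Status status) {
    return status == Status.ROLL_FORWARD_AWAITING_PULSE
        || status == Status.ROLL_BACK_AWAITING_PULSE;
  }

  // Walks the events from newest to oldest and returns the timestamp of the
  // most recent event that directly follows an awaiting-pulse event and is
  // not itself awaiting a pulse. Returns NO_PULSE if there is no such event.
  static long lastPulseTimestamp(List<Event> events) {
    for (int i = events.size() - 1; i > 0; i--) {
      Status previous = events.get(i - 1).status;
      Status current = events.get(i).status;
      if (isAwaitingPulse(previous) && !isAwaitingPulse(current)) {
        return events.get(i).timestampMs;
      }
    }
    return NO_PULSE;
  }

  public static void main(String[] args) {
    List<Event> events = Arrays.asList(
        new Event(Status.ROLL_FORWARD_AWAITING_PULSE, 100L),
        new Event(Status.ROLLING_FORWARD, 250L),
        new Event(Status.ROLL_FORWARD_PAUSED, 400L));
    System.out.println(lastPulseTimestamp(events)); // prints 250
  }
}
```

With a value recovered this way, an update whose last pulse is still within
the pulse timeout would not need to be blocked after a failover.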


Diffs (updated)
-----

  api/src/main/thrift/org/apache/aurora/gen/api.thrift efd4e534c4ad90862d7a9fae437ed724da3a34dc
  src/main/java/org/apache/aurora/scheduler/base/Jobs.java 49e5b2cfc0b84bb0e0c95cca375cd0503f9dcdb5
  src/main/java/org/apache/aurora/scheduler/updater/JobUpdateControllerImpl.java 729c1234a2e27f1e756ddfd6a4e5a04fa20bbd7f
  src/test/java/org/apache/aurora/scheduler/updater/JobUpdaterIT.java ea0b89a232c2fc10f2183218b750bb0478d51a58

Diff: https://reviews.apache.org/r/56723/diff/


Testing
-------


Thanks,

Zameer Manji
