-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58053/
-----------------------------------------------------------

(Updated March 30, 2017, 11:37 a.m.)


Review request for Aurora and Stephan Erb.


Changes
-------

Feedback.


Bugs: AURORA-1911
    https://issues.apache.org/jira/browse/AURORA-1911


Repository: aurora


Description
-------

As noted in AURORA-1911, the `V1Mesos` driver doesn't retry `SUBSCRIBE` calls if they fail. This means that after a leader subscribes and disconnects, it may never resubscribe if the Mesos Master is unhealthy.

To fix this, I have moved the subscription into the dedicated `SchedulerExecutor`, where it continues to attempt to subscribe using truncated binary backoff. It stops only if we are disconnected or if we successfully connect.


Diffs (updated)
-----

  src/jmh/java/org/apache/aurora/benchmark/StatusUpdateBenchmark.java 206b11458da2b0f938f0fcab5e5d3259a88ac9ee
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 5bf1e4e8c46044cb69b266cd203b5ec2f8b9ab61
  src/main/java/org/apache/aurora/scheduler/mesos/SchedulerDriverModule.java 10d4f1b515b91d85b283cb7c655275c22fb133f9
  src/main/java/org/apache/aurora/scheduler/mesos/VersionedMesosSchedulerImpl.java 67d356ab66c926a3b56860b906a453d57d6b694d
  src/test/java/org/apache/aurora/scheduler/mesos/VersionedMesosSchedulerImplTest.java 756d0d9e30a447f9fba75c1c60f2f2f3c610399b


Diff: https://reviews.apache.org/r/58053/diff/2/

Changes: https://reviews.apache.org/r/58053/diff/1-2/


Testing
-------


Thanks,

Zameer Manji
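
For readers skimming the Description, the retry behaviour it outlines can be sketched roughly as below. This is not the code from the diff: the class name `SubscribeRetrySketch`, the `sendSubscribe` and `sleeper` hooks, and the callback names are all hypothetical, and the real change may be structured quite differently. The sketch only illustrates a loop that resends a subscribe call with a doubling, capped delay and stops on disconnect or on success.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongConsumer;

/** Hypothetical illustration only; names and structure are assumptions, not Aurora code. */
final class SubscribeRetrySketch {
  private final AtomicBoolean connected = new AtomicBoolean(false);
  private final AtomicBoolean subscribed = new AtomicBoolean(false);
  private final Runnable sendSubscribe;   // assumed hook that sends one SUBSCRIBE call
  private final LongConsumer sleeper;     // assumed hook that waits the given milliseconds
  private final long initialBackoffMs;
  private final long maxBackoffMs;

  SubscribeRetrySketch(
      Runnable sendSubscribe, LongConsumer sleeper, long initialBackoffMs, long maxBackoffMs) {
    this.sendSubscribe = sendSubscribe;
    this.sleeper = sleeper;
    this.initialBackoffMs = initialBackoffMs;
    this.maxBackoffMs = maxBackoffMs;
  }

  // Hypothetical callbacks, invoked when the connection state changes.
  void onConnected() { connected.set(true); }
  void onSubscribed() { subscribed.set(true); }
  void onDisconnected() { connected.set(false); subscribed.set(false); }

  /**
   * Resends the subscribe call until we are either disconnected or subscribed.
   * The delay doubles after every attempt and is truncated at maxBackoffMs.
   * Returns the number of subscribe calls sent.
   */
  int subscribeWithBackoff() {
    int attempts = 0;
    long backoffMs = initialBackoffMs;
    while (connected.get() && !subscribed.get()) {
      sendSubscribe.run();
      attempts++;
      sleeper.accept(backoffMs);
      backoffMs = Math.min(maxBackoffMs, backoffMs * 2);
    }
    return attempts;
  }
}
```

In a setup like the one described, `subscribeWithBackoff` would presumably be submitted to a dedicated single-threaded executor so that the waiting does not block the thread delivering callbacks. Injecting the sleeper is a choice made here purely to keep the sketch testable without real delays.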