Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-15 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 15, 2018, 10:15 a.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Changes
---

- Dropping trigger-poll mechanism and simplifying the implementation.


Repository: aurora


Description
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
ffc07443fae9e5216a5333ae305f75aa9b452a0c 
  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
  
src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
 4073229b74d0e0e7fd31552bd96894ceb8a0971a 
  src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
3b4df55a05873e79aae206b117cbc753fa3abb94 
  src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
25ed474289f369e74c24e999ad97ed6810c9fd5e 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  
src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 
9733ffe74b107f336858657550156ddb1f1dd215 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
e88cad6cf12312512e6840329db7ca7134ceaae6 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/admin/admin_util.py 
8240e8093160623b4c30dd212a88b8e122fd9856 
  src/main/python/apache/aurora/admin/host_maintenance.py 
83fc2b6ece40d3436cc7de7a034f95224235fcfd 
  src/main/python/apache/aurora/admin/maintenance.py 
942a237f47a6e0416bbaf244278685477e0f407d 
  src/main/python/apache/aurora/client/api/__init__.py 
f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
63c338e5bbdf60de0fba8d68c6613904abb93fa8 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  
src/test/java/org/apache/aurora/scheduler/configuration/ConfigurationManagerTest.java
 749ffeac6cb851f32bba7606390203d7a046a0e6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
c6163bbabc7e7748f167b679893a93f58e4ef1ac 
  src/test/java/org/apache/aurora/scheduler/sla/SlaManagerTest.java 
PRE-CREATION 
  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review203049
---



Master (805a53f) is green with this patch.
  ./build-support/jenkins/build.sh

However, it appears that it might lack test coverage.

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 14, 2018, 4:52 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 14, 2018, 4:52 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
>  4073229b74d0e0e7fd31552bd96894ceb8a0971a 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java
>  9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> e88cad6cf12312512e6840329db7ca7134ceaae6 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 8240e8093160623b4c30dd212a88b8e122fd9856 
>   src/main/python/apache/aurora/admin/host_maintenance.py 
> 83fc2b6ece40d3436cc7de7a034f95224235fcfd 
>   src/main/python/apache/aurora/admin/maintenance.py 
> 942a237f47a6e0416bbaf244278685477e0f407d 
>   src/main/python/apache/aurora/client/api/__init__.py 
> f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-14 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review203047
---



@ReviewBot retry

- Santhosh Kumar Shanmugham


On May 14, 2018, 9:52 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 14, 2018, 9:52 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
>  4073229b74d0e0e7fd31552bd96894ceb8a0971a 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   
> src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java
>  9733ffe74b107f336858657550156ddb1f1dd215 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
> e88cad6cf12312512e6840329db7ca7134ceaae6 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/admin/admin_util.py 
> 8240e8093160623b4c30dd212a88b8e122fd9856 
>   src/main/python/apache/aurora/admin/host_maintenance.py 
> 83fc2b6ece40d3436cc7de7a034f95224235fcfd 
>   src/main/python/apache/aurora/admin/maintenance.py 
> 942a237f47a6e0416bbaf244278685477e0f407d 
>   src/main/python/apache/aurora/client/api/__init__.py 
> f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 63c338e5bbdf60de0fba8d68c6613904abb93fa8 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-14 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review203045
---



Master (805a53f) is red with this patch.
  ./build-support/jenkins/build.sh

 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_exist[restart]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 46%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_exist[start_cronjob]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 53%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_exist[start_job_update]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 60%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[add_instances]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 66%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[create_job]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 73%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[kill_job]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 80%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[restart]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 86%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[start_cronjob]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [ 93%]
 
src/test/python/apache/aurora/client/hooks/test_hooked_api.py::test_api_methods_params[start_job_update]
 <- 
.pants.d/pyprep/sources/4bed8bc8576734b7a2876724a12f2715efdd749a/apache/aurora/client/hooks/test_hooked_api.py
 PASSED [100%]
 
  generated xml file: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/.pants.d/test/pytest/src.test.python.apache.aurora.client.hooks.hooks/junitxml/TEST-src.test.python.apache.aurora.client.hooks.hooks.xml
 
 === 15 passed in 0.33 seconds 
 
   src.test.python.apache.aurora.admin.admin
   .   SUCCESS
   src.test.python.apache.aurora.client.client  
   .   SUCCESS
   src.test.python.apache.aurora.client.api.api 
   .   SUCCESS
   src.test.python.apache.aurora.client.cli.cli 
   .   SUCCESS
   src.test.python.apache.aurora.client.docker.docker   
   .   SUCCESS
   src.test.python.apache.aurora.client.hooks.hooks 
   .   SUCCESS
   src.test.python.apache.aurora.common.common  
   .   SUCCESS
   
src.test.python.apache.aurora.common.health_check.health_check  
.   SUCCESS
   src.test.python.apache.aurora.config.config  
   .   SUCCESS
   src.test.python.apache.aurora.executor.executor  
   .   SUCCESS
   src.test.python.apache.aurora.executor.bin.bin   
   .   SUCCESS
   src.test.python.apache.aurora.executor.common.common 
   .   SUCCESS
   src.test.python.apache.aurora.tools.tools
   .   SUCCESS
   src.test.python.apache.thermos.cli.cli   
   .   SUCCESS
   src.test.python.apache.thermos.cli.commands.commands 
   .   SUCCESS
   src.test.python.apache.thermos.common.common 
   .   SUCCESS
   src.test.python.apache.thermos.config.config 
   .   SUCCESS
   src.test.python.apache.thermos.core.core 
   

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-14 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 14, 2018, 9:52 a.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Changes
---

- Improves test-coverage.
- UI changes to display the SlaPolicy in the TaskConfigSummary
- New sla_host_drain in admin client
- Introduces the trigger-then-poll mechanism for coordinator drains.

Pending:
- Persist coordinator trigger state and continue to drain tasks, even if the 
maintenance request gets cancelled prematurely
- Re-name some config variables in the CoordinatorSlaPolicy, namely 
`trigger_response` -> `trigger_status_key` and `poll_response` -> 
`poll_status_key`


Repository: aurora


Description
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Note to reviewers:
- Test coverage is minimal at this point. Expect more coverage soon in the next 
diff.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
ffc07443fae9e5216a5333ae305f75aa9b452a0c 
  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
  
src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
 4073229b74d0e0e7fd31552bd96894ceb8a0971a 
  src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
3b4df55a05873e79aae206b117cbc753fa3abb94 
  src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
25ed474289f369e74c24e999ad97ed6810c9fd5e 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  
src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 
9733ffe74b107f336858657550156ddb1f1dd215 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
e88cad6cf12312512e6840329db7ca7134ceaae6 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/admin/admin_util.py 
8240e8093160623b4c30dd212a88b8e122fd9856 
  src/main/python/apache/aurora/admin/host_maintenance.py 
83fc2b6ece40d3436cc7de7a034f95224235fcfd 
  src/main/python/apache/aurora/admin/maintenance.py 
942a237f47a6e0416bbaf244278685477e0f407d 
  src/main/python/apache/aurora/client/api/__init__.py 
f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
63c338e5bbdf60de0fba8d68c6613904abb93fa8 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-14 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 14, 2018, 9:52 a.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Repository: aurora


Description (updated)
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Diffs
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
ffc07443fae9e5216a5333ae305f75aa9b452a0c 
  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
  
src/main/java/org/apache/aurora/scheduler/configuration/ConfigurationManager.java
 4073229b74d0e0e7fd31552bd96894ceb8a0971a 
  src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
3b4df55a05873e79aae206b117cbc753fa3abb94 
  src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
25ed474289f369e74c24e999ad97ed6810c9fd5e 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  
src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotDeduplicator.java 
9733ffe74b107f336858657550156ddb1f1dd215 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  src/main/java/org/apache/aurora/scheduler/thrift/ReadOnlySchedulerImpl.java 
e88cad6cf12312512e6840329db7ca7134ceaae6 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/admin/admin_util.py 
8240e8093160623b4c30dd212a88b8e122fd9856 
  src/main/python/apache/aurora/admin/host_maintenance.py 
83fc2b6ece40d3436cc7de7a034f95224235fcfd 
  src/main/python/apache/aurora/admin/maintenance.py 
942a237f47a6e0416bbaf244278685477e0f407d 
  src/main/python/apache/aurora/client/api/__init__.py 
f6fd1dd6d7c2bdd5bca3037f501b36badab78c75 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
63c338e5bbdf60de0fba8d68c6613904abb93fa8 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  
src/test/java/org/apache/aurora/scheduler/configuration/ConfigurationManagerTest.java
 749ffeac6cb851f32bba7606390203d7a046a0e6 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
c6163bbabc7e7748f167b679893a93f58e4ef1ac 
  src/test/java/org/apache/aurora/scheduler/sla/SlaManagerTest.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
d37e7a07e9258bc8c0758bf50aece5b79025126b 
  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-10 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202850
---



Looking really good! I am easily able to use this for my SLA-aware update work.

Mostly small style nits at this point. Sorry in advance for being pedantic :)


src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 78 (patched)


I would rename this as it collides with `AsyncModule.AsyncExecutor` and it 
might confuse people in a quick look.



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 186 (patched)


nit: I would format this as:

```
final Set running = store.getTaskStore().fetchTasks(
Query.jobScoped(task.getAssignedTask().getTask().getJob())
.byStatus(ScheduleStatus.RUNNING))
.stream()
.filter(t -> !Tasks.id(t).equals(Tasks.id(task))) // exclude the 
task to be updated
.filter(t -> meetsSLADuration(t, slaDuration)) // task is running 
for sla duration
.collect(Collectors.toSet());
```



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 194 (patched)


seems extraneous, maybe remove and call `running.size()` directly below



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 218-219 (patched)


nit:
```
  Storage.MutateWork work) {
// newline
...
```



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 222-223 (patched)


This will generate a pretty hard to parse string like so:

```
devcluster/IInstanceKey{jobKey=IJobKey{role=vagrant, envir
onment=test, name=http_example}, instanceId=1}
```

Stats fail to create a counter with this name, and the query string sent to 
the coordinator looks pretty weird.



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 226 (patched)


qq: what is the behavior if you max parallel coordinators? does it just 
block until a lock is freed (at most one minute)?



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 299 (patched)


Switch this to ISlaPolicy (and probably other references to policies within 
this class as well e.g. CoordinatorSlaPolicy -> ICoordinatorSlaPolicy)



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 301-303 (patched)


nit:

```
  boolean force) throws E {
  //newline
if (force) {
```



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Line 139 (original), 210 (patched)


What does this comment mean?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 275 (patched)


nit: newline below



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 300-302 (patched)


What is the behavior of DRAINING an already DRAINED host? Will the host 
change state to DRAINING temporarily or will it stay in DRAINED?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 415-419 (patched)


If a host is in `DRAINING` without a maintenance request, will it 
constantly fail on this step? It is probably fine since we a metric is exposed 
so people can monitor it.



src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
Lines 256 (patched)


nit: newline below



src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
Lines 44 (patched)


nit: newline below



src/main/python/apache/aurora/config/schema/base.py
Line 170 (original), 171 (patched)


nit: newline



src/main/python/apache/aurora/executor/executor_vars.py
Line 65 (original), 65 (patched)


where does this come from


- Jordan Ly


On May 9, 2018, 5:49 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-09 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202790
---


Ship it!




Master (805a53f) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 9, 2018, 10:49 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 9, 2018, 10:49 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 63c338e5bbdf60de0fba8d68c6613904abb93fa8 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
> c6163bbabc7e7748f167b679893a93f58e4ef1ac 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
> d37e7a07e9258bc8c0758bf50aece5b79025126b 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-09 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 9, 2018, 10:49 a.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Changes
---

Addressing David's comments. Some more refactoring.


Repository: aurora


Description
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Note to reviewers:
- Test coverage is minimal at this point. Expect more coverage soon in the next 
diff.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
ffc07443fae9e5216a5333ae305f75aa9b452a0c 
  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
  src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
3b4df55a05873e79aae206b117cbc753fa3abb94 
  src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
25ed474289f369e74c24e999ad97ed6810c9fd5e 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
63c338e5bbdf60de0fba8d68c6613904abb93fa8 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
c6163bbabc7e7748f167b679893a93f58e4ef1ac 
  src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
d37e7a07e9258bc8c0758bf50aece5b79025126b 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
  src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
 3dd9ce4039b223cb6156462d089f7062a1cde772 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
 27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
  src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
be07361a27afefa21cc2ba76ce82531a418d9814 
  
src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 
d59118be13342da9003b0bcb97e12e477d9edf8f 
  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread David McLaughlin


> On May 8, 2018, 4:30 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 110 (patched)
> > 
> >
> > I'm wondering if defaultSlaPolicy needs to be supplied by the client, 
> > or configured on the server? Is there a situation where client would ever 
> > supply different values depending on the situation?
> 
> Santhosh Kumar Shanmugham wrote:
> As of today dedicated clusters can supply their own SLA requirements. 
> Setting a global SLA requirement (which can be less stricter) can be a 
> surprise for operators. 
> 
> But my initial approach was to set a server default as well.

That sounds reasonable.


> On May 8, 2018, 4:30 p.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 407 (patched)
> > 
> >
> > I'm struggling to understand how maintenace request presence is related 
> > to the SLA. 
> > 
> > Isn't there a race condition where any hosts in DRAINING when the 
> > Scheduler first starts up (after this patch) would have no maintenance 
> > requests? Also, doesn't the maintenance request stay present *until* the 
> > SLA is satisfied?
> > 
> > This may be another argument for having a default policy in the server.
> 
> Santhosh Kumar Shanmugham wrote:
> After this patch has been rolled out, this should never happen. 
> 
> Added this to log (we can change the message) and export metric to 
> account for exactly the situation you mentioned while rolling out the patch 
> (for debugging). It is safe to ingore drains like this after the scheduler 
> startup, since all drains are driven from clients today. So the clients will 
> retry the drain -> which will create the maintenance request object -> which 
> will unblock the drains. 
> 
> Considering the `dedicated` owners user-case (mentioned above), I dropped 
> the scheduler level config. We can bring it back if you feel strongly about 
> it.

That makes sense. Please call this out in the RELEASE-NOTES (semi-related: we 
should make sure to update all other documentaton that would be affected by 
this change).


- David


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202652
---


On May 8, 2018, 3:49 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 3:49 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java
> > Lines 35 (patched)
> > 
> >
> > Missing javadoc.

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
> > Lines 64 (patched)
> > 
> >
> > Missing javadoc.

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
> > Lines 149 (patched)
> > 
> >
> > Delete blank line.

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
> > Lines 194-195 (patched)
> > 
> >
> > Would be good to add comments (either here or in the javadoc) 
> > explaining the use of locks per coordinator-URL.

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
> > Lines 206 (patched)
> > 
> >
> > To avoid having to dive into logs to see who is blocking host 
> > maintenance, should we export metrics for the job keys that are causing the 
> > errors too? Here and in the catch block below.

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java
> > Lines 141 (patched)
> > 
> >
> > You probably want multiple threads with a 1 minute (default) 
> > coordinator timeout and explicit locking, right?

Yes, that makes sense. I will reuse the number of parallel coordinator 
transaction as the thread pool limit.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 54-59 (original), 78-83 (patched)
> > 
> >
> > This (or the Impl) needs updated with a high-level overview of the new 
> > logic (the async nature of maintenance, etc.)

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 110 (patched)
> > 
> >
> > I'm wondering if defaultSlaPolicy needs to be supplied by the client, 
> > or configured on the server? Is there a situation where client would ever 
> > supply different values depending on the situation?

As of today dedicated clusters can supply their own SLA requirements. Setting a 
global SLA requirement (which can be less stricter) can be a surprise for 
operators. 

But my initial approach was to set a server default as well.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Line 131 (original), 192 (patched)
> > 
> >
> > Revert to single-line.

Done.


> On May 8, 2018, 9:30 a.m., David McLaughlin wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 407 (patched)
> > 
> >
> > I'm struggling to understand how maintenace request presence is related 
> > to the SLA. 
> > 
> > Isn't there a race condition where any hosts in DRAINING when the 
> > Scheduler first starts up (after this patch) would have no maintenance 
> > requests? Also, doesn't the maintenance request stay present *until* the 
> > SLA is satisfied?
> > 
> > This may be another argument for having a default policy in the server.

After this patch has been rolled out, this should never happen. 

Added this to log (we can change the message) and export metric to account for 
exactly the situation you mentioned while rolling out the patch (for 
debugging). It is safe to ingore drains like this after the scheduler startup, 
since all drains are driven from clients today. So the clients will retry the 
drain -> which will create the maintenance request object -> which will unblock 
the drains. 

Considering the `dedicated` owners user-case (mentioned above), I dropped the 
scheduler level config. We can bring it back if you feel strongly about it.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202652

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202665
---


Ship it!




Master (805a53f) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 8, 2018, 8:49 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 8:49 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 63c338e5bbdf60de0fba8d68c6613904abb93fa8 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
> c6163bbabc7e7748f167b679893a93f58e4ef1ac 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
> d37e7a07e9258bc8c0758bf50aece5b79025126b 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread David McLaughlin

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202652
---



Looks good, mostly minor feedback.


src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java
Lines 35 (patched)


Missing javadoc.



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 64 (patched)


Missing javadoc.



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 149 (patched)


Delete blank line.



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 194-195 (patched)


Would be good to add comments (either here or in the javadoc) explaining 
the use of locks per coordinator-URL.



src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java
Lines 206 (patched)


To avoid having to dive into logs to see who is blocking host maintenance, 
should we export metrics for the job keys that are causing the errors too? Here 
and in the catch block below.



src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java
Lines 141 (patched)


You probably want multiple threads with a 1 minute (default) coordinator 
timeout and explicit locking, right?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 54-59 (original), 78-83 (patched)


This (or the Impl) needs updated with a high-level overview of the new 
logic (the async nature of maintenance, etc.)



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 110 (patched)


I'm wondering if defaultSlaPolicy needs to be supplied by the client, or 
configured on the server? Is there a situation where client would ever supply 
different values depending on the situation?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Line 131 (original), 192 (patched)


Revert to single-line.



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 407 (patched)


I'm struggling to understand how maintenace request presence is related to 
the SLA. 

Isn't there a race condition where any hosts in DRAINING when the Scheduler 
first starts up (after this patch) would have no maintenance requests? Also, 
doesn't the maintenance request stay present *until* the SLA is satisfied?

This may be another argument for having a default policy in the server.


- David McLaughlin


On May 8, 2018, 3:49 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 3:49 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202657
---



@ReviewBot retry

- Santhosh Kumar Shanmugham


On May 8, 2018, 8:49 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 8:49 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 63c338e5bbdf60de0fba8d68c6613904abb93fa8 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
> c6163bbabc7e7748f167b679893a93f58e4ef1ac 
>   src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
> d37e7a07e9258bc8c0758bf50aece5b79025126b 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
>  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202651
---



Master (805a53f) is red with this patch.
  ./build-support/jenkins/build.sh

:startScripts
:distTar
:distZip
:assemble
:compileTestJavaNote: Some input files use or override a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
Note: Some input files use unchecked or unsafe operations.
Note: Recompile with -Xlint:unchecked for details.

:processTestResources
:testClasses
:compileJmhJavaNote: 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/jmh/java/org/apache/aurora/benchmark/fakes/FakeSchedulerDriver.java
 uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.

:processJmhResources NO-SOURCE
:jmhClasses
:checkstyleJmh
:checkstyleMain[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/config/CliOptions.java:42:
 Wrong order for 'org.apache.aurora.scheduler.maintenance.MaintenanceModule' 
import. Order should be: java, javax, scala, com, net, org. Each group 
should be separated by a single blank line. [ImportOrder]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java:1:
 Line does not match expected header line of '^\/\*\*$'. [RegexpHeader]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java:45:
 Wrong order for 
'org.apache.aurora.scheduler.maintenance.MaintenanceController' import. Order 
should be: java, javax, scala, com, net, org. Each group should be 
separated by a single blank line. [ImportOrder]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java:18:8:
 Unused import - java.util.Collection. [UnusedImports]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java:163:
 '.' have incorrect indentation level 6, expected level should be 8. 
[Indentation]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java:164:
 '.' have incorrect indentation level 6, expected level should be 8. 
[Indentation]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java:165:
 '.' have incorrect indentation level 6, expected level should be 8. 
[Indentation]
[ant:checkstyle] [ERROR] 
/home/jenkins/jenkins-slave/workspace/AuroraBot/src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java:84:
 Wrong order for 
'org.apache.aurora.scheduler.maintenance.MaintenanceController' import. Order 
should be: java, javax, scala, com, net, org. Each group should be 
separated by a single blank line. [ImportOrder]
 FAILED

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':checkstyleMain'.
> Checkstyle rule violations were found. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/checkstyle/main.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 5m 20s
32 actionable tasks: 26 executed, 6 up-to-date


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 8, 2018, 8:49 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 8:49 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 8, 2018, 8:49 a.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Changes
---

Style fixes.


Repository: aurora


Description
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Note to reviewers:
- Test coverage is minimal at this point. Expect more coverage soon in the next 
diff.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
ffc07443fae9e5216a5333ae305f75aa9b452a0c 
  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
  src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
3b4df55a05873e79aae206b117cbc753fa3abb94 
  src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
25ed474289f369e74c24e999ad97ed6810c9fd5e 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
63c338e5bbdf60de0fba8d68c6613904abb93fa8 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
c6163bbabc7e7748f167b679893a93f58e4ef1ac 
  src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
d37e7a07e9258bc8c0758bf50aece5b79025126b 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
  src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
 3dd9ce4039b223cb6156462d089f7062a1cde772 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
 27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
  src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
be07361a27afefa21cc2ba76ce82531a418d9814 
  
src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 
d59118be13342da9003b0bcb97e12e477d9edf8f 
  
src/test/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterfaceTest.java
 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham


> On May 3, 2018, 10:43 a.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 144-150 (patched)
> > 
> >
> > If bound in a private module, you don't need to have a `@Named` 
> > annotation (I don't think it is used anywhere else).
> > 
> > Additionally, maybe use `@Qualifier` instead (e.g. 
> > https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L60-L62).
> >  Not entirely sure what the difference is myself but this is used more 
> > throughout the project.

I would prefer namespacing the argument even if they are inside a private 
module. I will use `@Qualifier` that looks cleaner.


> On May 3, 2018, 10:43 a.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Line 157 (original), 203 (patched)
> > 
> >
> > This name confused me a bit. For me, it implies there will be a 
> > long-running method continually doing something but this is a one-time 
> > thing.
> > 
> > Maybe rename to `checkForDrainingTasks` or something?

Done


> On May 3, 2018, 10:43 a.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
> > Lines 153-155 (patched)
> > 
> >
> > Do we need to immediately return after this?
> > 
> > If I am going to force the work to be done, do I need to even call this 
> > method?

I want to invoke the required in one-place so it is easier to follow.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202363
---


On May 8, 2018, 8:28 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 8:28 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham


> On May 7, 2018, 2:12 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
> > Lines 124-130 (patched)
> > 
> >
> > I believe that you will want to only look at the lastest task event 
> > only if it is RUNNING.
> > 
> > This will take into account tasks that are KILLING/PREEMPTING which 
> > would break SLA.

Changed it so that we only look for RUNNING tasks when computing running time 
SLA. This will be more conservative than how we are doing SLA calculations in 
the admin client.


> On May 7, 2018, 2:12 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
> > Lines 138 (patched)
> > 
> >
> > Can you elaborate more on what this line does?

This should change to exclude the task that is currently under consideration to 
simulate the SLA without that task. Updated.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202594
---


On May 8, 2018, 8:28 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 8:28 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham


> On May 3, 2018, 3:26 p.m., Reza Motamedi wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift
> > Lines 248 (patched)
> > 
> >
> > Should the comment be updated to match the struct in resolution?
> > 
> > Ms is milli secs, right?

Done.


> On May 3, 2018, 3:26 p.m., Reza Motamedi wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift
> > Lines 1241 (patched)
> > 
> >
> > s/of the host/of the hosts

Done.


- Santhosh Kumar


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202386
---


On May 8, 2018, 8:28 a.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 8, 2018, 8:28 a.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
> ffc07443fae9e5216a5333ae305f75aa9b452a0c 
>   src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
> a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
>   
> src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
> 3b4df55a05873e79aae206b117cbc753fa3abb94 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
> 25ed474289f369e74c24e999ad97ed6810c9fd5e 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
> 63c338e5bbdf60de0fba8d68c6613904abb93fa8 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-08 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 8, 2018, 8:28 a.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Changes
---

Refactor to separate into different modules and update more interface 
requirements for SLAManager.


Repository: aurora


Description
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Note to reviewers:
- Test coverage is minimal at this point. Expect more coverage soon in the next 
diff.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/app/AppModule.java 
ffc07443fae9e5216a5333ae305f75aa9b452a0c 
  src/main/java/org/apache/aurora/scheduler/config/CliOptions.java 
a2fb0393ba47e876c4c8c63e3ed27ebe42cb6ca3 
  src/main/java/org/apache/aurora/scheduler/maintenance/MaintenanceModule.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandler.java 
3b4df55a05873e79aae206b117cbc753fa3abb94 
  src/main/java/org/apache/aurora/scheduler/sla/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/sla/SlaModule.java 
25ed474289f369e74c24e999ad97ed6810c9fd5e 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/app/SchedulerIT.java 
63c338e5bbdf60de0fba8d68c6613904abb93fa8 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  src/test/java/org/apache/aurora/scheduler/mesos/MesosCallbackHandlerTest.java 
c6163bbabc7e7748f167b679893a93f58e4ef1ac 
  src/test/java/org/apache/aurora/scheduler/sla/SlaModuleTest.java 
d37e7a07e9258bc8c0758bf50aece5b79025126b 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
  src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
 3dd9ce4039b223cb6156462d089f7062a1cde772 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
 27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
  src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
be07361a27afefa21cc2ba76ce82531a418d9814 
  
src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 
d59118be13342da9003b0bcb97e12e477d9edf8f 
  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-07 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202594
---




src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 117 (patched)


reverse running and getCount



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 124-130 (patched)


I believe that you will want to only look at the lastest task event only if 
it is RUNNING.

This will take into account tasks that are KILLING/PREEMPTING which would 
break SLA.



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 138 (patched)


Can you elaborate more on what this line does?



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 142-143 (patched)


Remove references to draining

The last argument to be substituted needs to be converted into a list or 
joined into a string. This will print out an object reference.


- Jordan Ly


On May 1, 2018, 9:19 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 1, 2018, 9:19 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
>   
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-03 Thread Reza Motamedi

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202386
---




api/src/main/thrift/org/apache/aurora/gen/api.thrift
Lines 248 (patched)


Should the comment be updated to match the struct in resolution?

Ms is milli secs, right?



api/src/main/thrift/org/apache/aurora/gen/api.thrift
Lines 1241 (patched)


s/of the host/of the hosts


- Reza Motamedi


On May 1, 2018, 9:19 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 1, 2018, 9:19 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
>  3dd9ce4039b223cb6156462d089f7062a1cde772 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
>  27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
> be07361a27afefa21cc2ba76ce82531a418d9814 
>   
> src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java
>  d59118be13342da9003b0bcb97e12e477d9edf8f 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-03 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202363
---



The interface for SlaManager looks good to me! Thanks for the refactors :)


src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 144-150 (patched)


If bound in a private module, you don't need to have a `@Named` annotation 
(I don't think it is used anywhere else).

Additionally, maybe use `@Qualifier` instead (e.g. 
https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/async/AsyncModule.java#L60-L62).
 Not entirely sure what the difference is myself but this is used more 
throughout the project.



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Line 157 (original), 203 (patched)


This name confused me a bit. For me, it implies there will be a 
long-running method continually doing something but this is a one-time thing.

Maybe rename to `checkForDrainingTasks` or something?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 259 (patched)


nit: newline after



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 60-65 (patched)


Maybe use a private binding for this, or a `@Qualifier` (see my comments on 
`StateModule` for more context.



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 152 (patched)


nit: newline after



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 153-155 (patched)


Do we need to immediately return after this?

If I am going to force the work to be done, do I need to even call this 
method?



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 186 (patched)


nit: newline after



src/main/java/org/apache/aurora/scheduler/state/StateModule.java
Lines 97-107 (patched)


We may want this as a private binding since `WebhookModule` also uses this 
class.



src/main/java/org/apache/aurora/scheduler/state/StateModule.java
Lines 115-116 (patched)


I think this (+ its dependencies and options) may be better suited for the 
`SlaModule`.



src/main/java/org/apache/aurora/scheduler/state/StateModule.java
Lines 131-140 (patched)


I would create a `MaintenanceModule` and move this.


- Jordan Ly


On May 1, 2018, 9:19 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 1, 2018, 9:19 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-01 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202250
---



Master (a3d596e) is red with this patch.
  ./build-support/jenkins/build.sh


I0502 01:29:14.031 [Time-limited test-SendThread(localhost:48165), ClientCnxn] 
Opening socket connection to server localhost/127.0.0.1:48165. Will not attempt 
to authenticate using SASL (unknown error) 
W0502 01:29:14.032 [Time-limited test-SendThread(localhost:48165), ClientCnxn] 
Session 0x0 for server null, unexpected error, closing socket connection and 
attempting reconnect java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)

I0502 01:29:14.133 [Time-limited test-SendThread(localhost:48165), ClientCnxn] 
Opening socket connection to server localhost/0:0:0:0:0:0:0:1:48165. Will not 
attempt to authenticate using SASL (unknown error) 
W0502 01:29:14.133 [Time-limited test-SendThread(localhost:48165), ClientCnxn] 
Session 0x0 for server null, unexpected error, closing socket connection and 
attempting reconnect java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at 
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
at 
org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)



org.apache.aurora.scheduler.events.WebhookTest > 
testTaskChangedWithOldStateError FAILED
java.lang.AssertionError at WebhookTest.java:251
I0502 01:29:44.772 [ShutdownHook, SchedulerMain] Stopping scheduler services. 

1084 tests completed, 1 failed, 1 skipped
:test FAILED
:jacocoTestReport
Coverage report generated: 
file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/jacoco/test/html/index.html
:jacocoTestCoverageVerification

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> There were failing tests. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/tests/test/index.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 7m 13s
45 actionable tasks: 36 executed, 9 up-to-date


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 1, 2018, 9:19 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 1, 2018, 9:19 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-01 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202246
---



@ReviewBot retry

- Santhosh Kumar Shanmugham


On May 1, 2018, 2:19 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 1, 2018, 2:19 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
>  3dd9ce4039b223cb6156462d089f7062a1cde772 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
>  27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
> be07361a27afefa21cc2ba76ce82531a418d9814 
>   
> src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java
>  d59118be13342da9003b0bcb97e12e477d9edf8f 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
> d412090c292691305f01bccd1596fb0f6bb003ad 
>   src/test/python/apache/aurora/api_util.py 
> 3fc9b478cc9aada0503e8ed8698a37b4ed926cdd 
>   src/test/python/apache/aurora/client/api/test_scheduler_client.py 
> f2a2eae1539f7f6dff6855e4122cc41c6cbb0f7b 
>   src/test/python/apache/aurora/client/cli/test_inspect.py 
> e4f43d0573c7862adc9bc679f4cea40cc76eac38 
>   

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-01 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review202229
---



Master (a3d596e) is red with this patch.
  ./build-support/jenkins/build.sh

Pass 2: Analyzing classes (323 / 336) - 96% complete
Pass 2: Analyzing classes (324 / 336) - 96% complete
Pass 2: Analyzing classes (325 / 336) - 96% complete
Pass 2: Analyzing classes (326 / 336) - 97% complete
Pass 2: Analyzing classes (327 / 336) - 97% complete
Pass 2: Analyzing classes (328 / 336) - 97% complete
Pass 2: Analyzing classes (329 / 336) - 97% complete
Pass 2: Analyzing classes (330 / 336) - 98% complete
Pass 2: Analyzing classes (331 / 336) - 98% complete
Pass 2: Analyzing classes (332 / 336) - 98% complete
Pass 2: Analyzing classes (333 / 336) - 99% complete
Pass 2: Analyzing classes (334 / 336) - 99% complete
Pass 2: Analyzing classes (335 / 336) - 99% complete
Pass 2: Analyzing classes (336 / 336) - 100% complete
Done with analysis
:test

org.apache.aurora.scheduler.events.WebhookTest > 
testTaskChangedWithOldStateError FAILED
java.lang.AssertionError at WebhookTest.java:251
I0501 21:52:48.436 [ShutdownHook, SchedulerMain] Stopping scheduler services. 

1084 tests completed, 1 failed, 1 skipped
:test FAILED
:jacocoTestReport
Coverage report generated: 
file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/jacoco/test/html/index.html
:jacocoTestCoverageVerification

FAILURE: Build failed with an exception.

* What went wrong:
Execution failed for task ':test'.
> There were failing tests. See the report at: 
> file:///home/jenkins/jenkins-slave/workspace/AuroraBot/dist/reports/tests/test/index.html

* Try:
Run with --stacktrace option to get the stack trace. Run with --info or --debug 
option to get more log output.

* Get more help at https://help.gradle.org

BUILD FAILED in 7m 20s
45 actionable tasks: 36 executed, 9 up-to-date


I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On May 1, 2018, 9:19 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated May 1, 2018, 9:19 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-01 Thread Santhosh Kumar Shanmugham


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > api/src/main/thrift/org/apache/aurora/gen/api.thrift
> > Lines 263 (patched)
> > 
> >
> > Is this per request? Will end users be able to do partial ACKs?

Turned out to be not very useful when using locks. We can revisit if this is 
needed. Dropping this now.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Line 131 (original), 148-149 (patched)
> > 
> >
> > nit: make a single line (looks like it could fit, but ignore if it 
> > actually can't)

Done.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 213-217 (original), 221-240 (patched)
> > 
> >
> > Does it make sense to/is it possible to combine these? E.g. force 
> > drains would call with a defaultSlaPolicy of none or something

I was really trying to avoid touching the old code. But have since changed my 
mind and refactored it to unify logic.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 214-216 (original), 222-224 (patched)
> > 
> >
> > Should this also create `HostMaintenanceRequest` for consistency?

I was trying not to touch the current flow at all. However the refactor 
addressed this.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 231 (patched)
> > 
> >
> > What is the behavior if there is an existing host maintenance request? 
> > Do we deny the new one or overwrite the old one?

There can be only one maintenance request per host at any time. So we will 
simply overwrite it. If the current one is actively handled, the new one will 
just be a no-op.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 318-326 (patched)
> > 
> >
> > Can we make this `@Timed` as well?

Done.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 325 (patched)
> > 
> >
> > Why is `checkSla` always `true`?

I was trying to differentiate `drain` vs `slaDrain`. Refactored out the old 
logic to use `HostMaintenanceRequest` as well.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 328-331 (patched)
> > 
> >
> > Should this be customizable? I am pretty sure it is fine if it is not, 
> > but wanted to call it out.

Sure. Done.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 337-340 (patched)
> > 
> >
> > Seems odd to have `if not checkSla, call slaManager to drain the task`
> > 
> > maybe slaManager should behave like a consumer, executing the 
> > statechange only if SLA check passes?

Refactored SlaManager with the interface agreed upon below, and now we have a 
single `checkSlaThenAct` as the API into `SlaManager`.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
> > Lines 342-345 (patched)
> > 
> >
> > This should never happen if `checkSla` is `true`, correct?
> > 
> > Could we add a metric here as well?

Done.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
> > Lines 60-61 (patched)
> > 
> >
> > I think that this makes more sense in `MaintenanceController`

Agree. Done.


> On April 20, 2018, 3:31 p.m., Jordan Ly wrote:
> > src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
> > Lines 100 (patched)
> > 
> >
> > mostly for my curiosity but why 100?

This should be the max number of coordinators that can exist in a cluster. 
Should be parametrized.


> On April 20, 2018, 3:31 p.m., Jordan Ly 

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-05-01 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

(Updated May 1, 2018, 2:19 p.m.)


Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Changes
---

Feedback and refactor the SlaManager interface.


Repository: aurora


Description (updated)
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Note to reviewers:
- Test coverage is minimal at this point. Expect more coverage soon in the next 
diff.


Diffs (updated)
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
  src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
 3dd9ce4039b223cb6156462d089f7062a1cde772 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
 27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
  src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
be07361a27afefa21cc2ba76ce82531a418d9814 
  
src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 
d59118be13342da9003b0bcb97e12e477d9edf8f 
  src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
d412090c292691305f01bccd1596fb0f6bb003ad 
  src/test/python/apache/aurora/api_util.py 
3fc9b478cc9aada0503e8ed8698a37b4ed926cdd 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
f2a2eae1539f7f6dff6855e4122cc41c6cbb0f7b 
  src/test/python/apache/aurora/client/cli/test_inspect.py 
e4f43d0573c7862adc9bc679f4cea40cc76eac38 
  src/test/python/apache/aurora/config/test_thrift.py 
8e1d0e177959af12b97bdd1cd47845b72bc12fe1 
  
src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/removeHostMaintenanceRequest
 PRE-CREATION 
  
src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveCronJob
 88e1c36a1aa2d192b95963f7aa36e243a447e4af 
  
src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveHostMaintenanceRequest
 PRE-CREATION 
  

Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-04-20 Thread Jordan Ly

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review201642
---



Thanks for starting the work on this feature! Very exciting.

Most of my concerns lie around future usecases (selfishly like SLA-aware 
updates) :)


api/src/main/thrift/org/apache/aurora/gen/api.thrift
Lines 263 (patched)


Is this per request? Will end users be able to do partial ACKs?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Line 131 (original), 148-149 (patched)


nit: make a single line (looks like it could fit, but ignore if it actually 
can't)



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 213-217 (original), 221-240 (patched)


Does it make sense to/is it possible to combine these? E.g. force drains 
would call with a defaultSlaPolicy of none or something



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 214-216 (original), 222-224 (patched)


Should this also create `HostMaintenanceRequest` for consistency?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 231 (patched)


What is the behavior if there is an existing host maintenance request? Do 
we deny the new one or overwrite the old one?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 318-326 (patched)


Can we make this `@Timed` as well?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 321 (patched)


Do we need the write lock for each iteration? Is it possible to minimize 
the scope to only when we go to change state for a task to DRAINING?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 325 (patched)


Why is `checkSla` always `true`?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 328-331 (patched)


Should this be customizable? I am pretty sure it is fine if it is not, but 
wanted to call it out.



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 333-334 (patched)


nit: style

```
private void maybeDrain(
boolean checkSla,
IScheduledTask task,
String host,
Storage.MutableStoreProvider store) {
[NEWLINE]
...
```



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 337-340 (patched)


Seems odd to have `if not checkSla, call slaManager to drain the task`

maybe slaManager should behave like a consumer, executing the statechange 
only if SLA check passes?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 342-345 (patched)


This should never happen if `checkSla` is `true`, correct?

Could we add a metric here as well?



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 348 (patched)


Use `TimeAmount` to do comparisons instead of `expireMs / 1000`



src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java
Lines 349 (patched)


Add metric here as well, maybe tracking the tasks that cause this.



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 60-61 (patched)


I think that this makes more sense in `MaintenanceController`



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 90 (patched)


nit: newline after



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 100 (patched)


mostly for my curiosity but why 100?



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 110 (patched)


would `IllegalArgumentException` here be better?



src/main/java/org/apache/aurora/scheduler/state/SlaManager.java
Lines 121 (patched)


Maybe use `IllegalArgumentException`




Re: Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-04-19 Thread Aurora ReviewBot

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/#review201545
---


Ship it!




Master (ad0bc5f) is green with this patch.
  ./build-support/jenkins/build.sh

I will refresh this build result if you post a review containing "@ReviewBot 
retry"

- Aurora ReviewBot


On April 19, 2018, 5:30 p.m., Santhosh Kumar Shanmugham wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66716/
> ---
> 
> (Updated April 19, 2018, 5:30 p.m.)
> 
> 
> Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.
> 
> 
> Repository: aurora
> 
> 
> Description
> ---
> 
> `Tasks` can specify custom SLA requirements as part of
> their `TaskConfig`. One of the new features is the ability
> to specify an external coordinator that can ACK/NACK
> maintenance requests for tasks. This will be hugely
> beneficial for onboarding services that cannot satisfactorily
> specify SLA in terms of running instances.
> 
> Maintenance requests are driven from the Scheduler to
> improve management of nodes in the cluster.
> 
> 
> Note to reviewers:
> - Test coverage is minimal at this point. Expect more coverage soon in the 
> next diff.
> - I have received some last minute comments from customers, which might add 
> some extra features to the `CoordinatorSlaPolicy`.
> 
> 
> Diffs
> -
> 
>   api/src/main/thrift/org/apache/aurora/gen/api.thrift 
> ef754e32172e7490a47a13e7b526f243ffa3efeb 
>   api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
> b79e2045ccda05d5058565f81988dfe33feea8f1 
>   src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
> f58c66aaebe8d31913d67a05add0f3d6054e88d1 
>   src/main/java/org/apache/aurora/scheduler/state/SlaManager.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
> 0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
>   src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
> PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
> da5534f886e032ca5a182f3704aa335ff680b258 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
>  f1fdc275d3958a36bbe79110d70dfeba640a948a 
>   src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
> 10864f122eff5027c88d835baae6de483d960218 
>   
> src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java
>  8d70cae35289a9e36142bab288cf0c9398ebd2d4 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
> b30de881eafa3226fdc32383b4e9bfd33ca912a5 
>   src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
> 4b52be02001e704f4b1a5f447226ac8c2386e3fd 
>   
> src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
>  PRE-CREATION 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
> 9f324b010db7e351e98b257d8fc8fecfeac81268 
>   src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
> edcea09b4d206cfddb642074237b031ad71cff13 
>   
> src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java
>  9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
>   src/main/python/apache/aurora/config/schema/base.py 
> a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
>   src/main/python/apache/aurora/config/thrift.py 
> 6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
>   src/main/python/apache/aurora/executor/executor_vars.py 
> 561f9452aedda4cc695c84a2a850bdd7e1d65dec 
>   src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
> 778148a7c033cba9004954cabc33a2b1d003dccf 
>   src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
> e66ec116112df164106598d9ff0bc9e8f465e44f 
>   
> src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
>  770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
>   src/test/java/org/apache/aurora/scheduler/state/SlaManagerTest.java 
> PRE-CREATION 
>   src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
> ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
>  3dd9ce4039b223cb6156462d089f7062a1cde772 
>   
> src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
>  27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
>   
> src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
> be07361a27afefa21cc2ba76ce82531a418d9814 
>   
> src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java
>  d59118be13342da9003b0bcb97e12e477d9edf8f 
>   
> src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
> 

Review Request 66716: [WIP] Enable `Tasks` to specify their own custom maintenance SLA.

2018-04-19 Thread Santhosh Kumar Shanmugham

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66716/
---

Review request for Aurora, David McLaughlin, Jordan Ly, and Stephan Erb.


Repository: aurora


Description
---

`Tasks` can specify custom SLA requirements as part of
their `TaskConfig`. One of the new features is the ability
to specify an external coordinator that can ACK/NACK
maintenance requests for tasks. This will be hugely
beneficial for onboarding services that cannot satisfactorily
specify SLA in terms of running instances.

Maintenance requests are driven from the Scheduler to
improve management of nodes in the cluster.


Note to reviewers:
- Test coverage is minimal at this point. Expect more coverage soon in the next 
diff.
- I have received some last minute comments from customers, which might add 
some extra features to the `CoordinatorSlaPolicy`.


Diffs
-

  api/src/main/thrift/org/apache/aurora/gen/api.thrift 
ef754e32172e7490a47a13e7b526f243ffa3efeb 
  api/src/main/thrift/org/apache/aurora/gen/storage.thrift 
b79e2045ccda05d5058565f81988dfe33feea8f1 
  src/main/java/org/apache/aurora/scheduler/state/MaintenanceController.java 
f58c66aaebe8d31913d67a05add0f3d6054e88d1 
  src/main/java/org/apache/aurora/scheduler/state/SlaManager.java PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/state/StateModule.java 
0e0f90b670bbbcd6cb3aa302ce4a9abfe70ea979 
  src/main/java/org/apache/aurora/scheduler/storage/HostMaintenanceStore.java 
PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/Storage.java 
da5534f886e032ca5a182f3704aa335ff680b258 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/DurableStorage.java
 f1fdc275d3958a36bbe79110d70dfeba640a948a 
  src/main/java/org/apache/aurora/scheduler/storage/durability/Loader.java 
10864f122eff5027c88d835baae6de483d960218 
  
src/main/java/org/apache/aurora/scheduler/storage/durability/WriteRecorder.java 
8d70cae35289a9e36142bab288cf0c9398ebd2d4 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotService.java 
b30de881eafa3226fdc32383b4e9bfd33ca912a5 
  src/main/java/org/apache/aurora/scheduler/storage/log/SnapshotterImpl.java 
4b52be02001e704f4b1a5f447226ac8c2386e3fd 
  
src/main/java/org/apache/aurora/scheduler/storage/mem/MemHostMaintenanceStore.java
 PRE-CREATION 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorage.java 
9f324b010db7e351e98b257d8fc8fecfeac81268 
  src/main/java/org/apache/aurora/scheduler/storage/mem/MemStorageModule.java 
edcea09b4d206cfddb642074237b031ad71cff13 
  
src/main/java/org/apache/aurora/scheduler/thrift/SchedulerThriftInterface.java 
9fc0416086dd3eb2e2f4e8f659da59fcdea2b22b 
  src/main/python/apache/aurora/config/schema/base.py 
a629bcd1261e5959da0a8458a55545d4e2c2a7a5 
  src/main/python/apache/aurora/config/thrift.py 
6d2dde6e964daa68bf6f0e5bbbffecc5bd8c0431 
  src/main/python/apache/aurora/executor/executor_vars.py 
561f9452aedda4cc695c84a2a850bdd7e1d65dec 
  src/test/java/org/apache/aurora/scheduler/base/TaskTestUtil.java 
778148a7c033cba9004954cabc33a2b1d003dccf 
  src/test/java/org/apache/aurora/scheduler/config/CommandLineTest.java 
e66ec116112df164106598d9ff0bc9e8f465e44f 
  
src/test/java/org/apache/aurora/scheduler/state/MaintenanceControllerImplTest.java
 770846e84e9980ea3dbf9e1c46b0d45c5488c5b3 
  src/test/java/org/apache/aurora/scheduler/state/SlaManagerTest.java 
PRE-CREATION 
  src/test/java/org/apache/aurora/scheduler/storage/backup/RecoveryTest.java 
ba03ff94bb5fee2b09a6660a9ad759cece7449f1 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/DurableStorageTest.java
 3dd9ce4039b223cb6156462d089f7062a1cde772 
  
src/test/java/org/apache/aurora/scheduler/storage/durability/WriteRecorderTest.java
 27c8c829cd1e417dd5e60a8e9415331ca4a7c918 
  src/test/java/org/apache/aurora/scheduler/storage/log/SnapshotterImplIT.java 
be07361a27afefa21cc2ba76ce82531a418d9814 
  
src/test/java/org/apache/aurora/scheduler/storage/testing/StorageTestUtil.java 
d59118be13342da9003b0bcb97e12e477d9edf8f 
  src/test/java/org/apache/aurora/scheduler/thrift/aop/MockDecoratedThrift.java 
d412090c292691305f01bccd1596fb0f6bb003ad 
  src/test/python/apache/aurora/api_util.py 
3fc9b478cc9aada0503e8ed8698a37b4ed926cdd 
  src/test/python/apache/aurora/client/api/test_scheduler_client.py 
f2a2eae1539f7f6dff6855e4122cc41c6cbb0f7b 
  src/test/python/apache/aurora/client/cli/test_inspect.py 
e4f43d0573c7862adc9bc679f4cea40cc76eac38 
  src/test/python/apache/aurora/config/test_thrift.py 
8e1d0e177959af12b97bdd1cd47845b72bc12fe1 
  
src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/removeHostMaintenanceRequest
 PRE-CREATION 
  
src/test/resources/org/apache/aurora/scheduler/storage/durability/goldens/current/saveCronJob
 88e1c36a1aa2d192b95963f7aa36e243a447e4af